Main Questions
- We are looking for pure segments
- Dataset has many attributes
- Which is the right attribute for pure segmentation?
- Can we start with any attribute?
- Which attribute should we start from? The best separating attribute
- Customer age, gender, place, and demographics can all impact sales. How do we identify the best attribute and the best split?
The Splitting Criterion
- The best split is the one that does the best job of separating the data into groups
- In each resulting group, a single class (either 0 or 1) predominates
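
To make the criterion concrete, here is a minimal Python sketch that scores a candidate attribute by how strongly a single class predominates in each group it creates. The function name `split_purity`, the column names, and the eight customer rows are all made up for illustration; they are not from a real sales dataset. The score is simply the share of rows that belong to the majority class of their own group, which is a simplification: real decision tree algorithms use more refined purity measures.

```python
from collections import Counter, defaultdict

def split_purity(rows, attribute, target="bought"):
    """Share of rows that fall in the majority class of their group (1.0 = pure groups)."""
    # Group the target labels by the value of the candidate attribute.
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    # For each group, count how many rows belong to its most common class.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in groups.values()
    )
    return majority_total / len(rows)

# Hypothetical toy data: 1 = bought, 0 = did not buy.
customers = [
    {"age_group": "young", "gender": "F", "bought": 1},
    {"age_group": "young", "gender": "M", "bought": 1},
    {"age_group": "young", "gender": "F", "bought": 1},
    {"age_group": "young", "gender": "M", "bought": 0},
    {"age_group": "old", "gender": "F", "bought": 0},
    {"age_group": "old", "gender": "M", "bought": 0},
    {"age_group": "old", "gender": "F", "bought": 0},
    {"age_group": "old", "gender": "M", "bought": 1},
]

# Score each candidate attribute and pick the best separating one.
candidates = ["age_group", "gender"]
scores = {attr: split_purity(customers, attr) for attr in candidates}
best = max(scores, key=scores.get)
print(scores, best)
```

On this toy data, splitting on age group gives groups where one class clearly predominates, while splitting on gender leaves both groups evenly mixed, so age group would be the attribute to start from.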
Example Sales Segmentation Based on Age
Example Sales Segmentation Based on Gender
The next post is about how decision tree splits work.
Link to the next post: https://statinfer.com/204-3-3-how-decision-tree-splits-works/