Information Gain
In the previous section, we studied how to calculate entropy for a decision tree split.
- Information Gain = entropy before split – entropy after split
- An easy way to understand it: Information Gain = (overall entropy at the parent node) – (sum of the weighted entropies at each child node); see the code sketch after this list
- The attribute with the maximum information gain is the best split attribute
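As an illustration, here is a minimal Python sketch of this calculation; the helper names `entropy` and `information_gain` are only for this example, not part of any particular library.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) for a list of class counts, e.g. [28, 39]."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts_list):
    """Entropy before the split minus the weighted entropy after the split."""
    total = sum(parent_counts)
    weighted_child_entropy = sum(
        (sum(child) / total) * entropy(child) for child in child_counts_list
    )
    return entropy(parent_counts) - weighted_child_entropy
```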
Information Gain - Calculation
- Entropy Overall = 100% (Impurity)
- Entropy Young Segment = 99%
- Entropy Old Segment = 99%
- Information Gain for Age = 100 – (0.6×99 + 0.4×99) = 1
- Entropy Overall = 100% (Impurity)
- Entropy Male Segment = 72%
- Entropy Female Segment = 29%
- Information Gain for Gender = 100 – (0.6×72 + 0.4×29) = 45.2
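The arithmetic above can be checked directly in Python; this snippet works in percentage units and uses the 0.6 / 0.4 segment weights from the calculation.

```python
parent_entropy = 100  # overall entropy at the parent node (maximum impurity)

# Split on Age: both segments remain highly impure (99% entropy each).
gain_age = parent_entropy - (0.6 * 99 + 0.4 * 99)     # = 1.0

# Split on Gender: the segments are much purer (72% and 29% entropy).
gain_gender = parent_entropy - (0.6 * 72 + 0.4 * 29)  # = 45.2

print(gain_age, gain_gender)  # Gender has the larger gain, so it is the better split
```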
LAB: Information Gain
Calculate the information gain for this example, based on each variable split.
Output-Information Gain
Split With Respect to ‘Owning a car’
- Entropy([28+,39-]) Overall = \(-\frac{28}{67}\log_2\frac{28}{67} - \frac{39}{67}\log_2\frac{39}{67}\) = 98% (Impurity)
- Entropy([25+,4-]) Owning a car = 57%
- Entropy([3+,35-]) No car = 40%
- Information Gain for Owning a car = 98 – ((29/67)×57 + (38/67)×40) = 50.6
Split With Respect to ‘Gender’
- Entropy([19+,21-]) Male = 99%
- Entropy([9+,18-]) Female = 91%
- Information Gain for Gender = 98 – ((40/67)×99 + (27/67)×91) = 2.2
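To reproduce the lab numbers end to end, the class counts from the output above can be fed into a small self-contained script (it repeats the entropy helper so it runs on its own; the exact results differ slightly from the figures above because those round the segment entropies to whole percentages before combining them).

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Entropy of the parent node minus the weighted entropies of the child nodes."""
    total = sum(parent)
    return entropy(parent) - sum(sum(c) / total * entropy(c) for c in children)

overall = [28, 39]  # 67 records in total: 28 positive, 39 negative

gain_car = information_gain(overall, [[25, 4], [3, 35]])      # Owning a car vs. No car
gain_gender = information_gain(overall, [[19, 21], [9, 18]])  # Male vs. Female

print(round(gain_car * 100, 1))     # ~50.4, close to the 50.6 computed above
print(round(gain_gender * 100, 1))  # ~1.4, close to the 2.2 computed above
```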
Other Purity (Diversity) Measures
- Chi-square measure of association
- Gini Index: Gini(T) = \(1 - \sum_j p_j^2\) (see the sketch after this list)
- Information Gain Ratio
- Misclassification error
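As a small illustration of the Gini index bullet above, here is a sketch of its computation; the function name `gini` is just for this example.

```python
def gini(counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

print(gini([28, 39]))  # ~0.487 for the overall [28+, 39-] node from the lab above
```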
The next post is about The Decision Tree Algorithm.