
203.3.5 Information Gain in Decision Tree Split

Information Gain

In the previous section, we studied how to calculate entropy for a decision tree split.

  • Information Gain = entropy before split – entropy after split
  • An easy way to understand it: Information Gain = (overall entropy at the parent node) – (sum of weighted entropies at the child nodes); see the sketch below
  • The attribute with the maximum information gain is the best split attribute
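A minimal sketch of this calculation in Python (the entropy and information_gain helpers and the example proportions below are illustrative, not taken from the lab data):

```python
import math

def entropy(proportions):
    """Shannon entropy (in bits) of a list of class proportions."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

def information_gain(parent_proportions, children):
    """children: list of (weight, class_proportions) pairs, where
    weight is the fraction of the parent's records in that child node."""
    entropy_before = entropy(parent_proportions)
    entropy_after = sum(w * entropy(props) for w, props in children)
    return entropy_before - entropy_after

# Illustrative numbers only: a 50/50 parent split into two child nodes
parent = [0.5, 0.5]
children = [(0.6, [0.7, 0.3]), (0.4, [0.9, 0.1])]
print(information_gain(parent, children))   # ≈ 0.28 bits
```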

Information Gain – Calculation

  • Entropy Overall = 100% (impurity)
  • Entropy Young segment = 99%
  • Entropy Old segment = 99%
  • Information Gain for Age = 100 – (0.6×99 + 0.4×99) = 1

  • Entropy Overall = 100% (impurity)
  • Entropy Male segment = 72%
  • Entropy Female segment = 29%
  • Information Gain for Gender = 100 – (0.6×72 + 0.4×29) = 45.2, as verified in the sketch after this list
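Both results above can be reproduced with simple weighted arithmetic; the segment weights of 0.6 and 0.4 used below are implied by the numbers in the slides rather than stated explicitly:

```python
# Entropies are expressed in percent, as in the example above.
# Segment weights of 0.6 and 0.4 are implied by the arithmetic, not stated explicitly.
parent_entropy = 100

# Age split: both segments remain almost as impure as the parent
ig_age = parent_entropy - (0.6 * 99 + 0.4 * 99)
print(ig_age)       # ≈ 1

# Gender split: both segments are much purer than the parent
ig_gender = parent_entropy - (0.6 * 72 + 0.4 * 29)
print(ig_gender)    # ≈ 45.2, so Gender is the better split attribute
```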

LAB: Information Gain

Calculate the information gain in this example, based on each variable split.

Output: Information Gain

Split With Respect to ‘Owning a car’

  • Entropy([28+,39-]) Overall = \(-\frac{28}{67}\log_2\frac{28}{67} - \frac{39}{67}\log_2\frac{39}{67}\) = 98% (impurity)
  • Entropy([25+,4-]) Owning a car = 57%
  • Entropy([3+,35-]) No car = 40%
  • Information Gain for Owning a car = 98 – ((29/67)×57 + (38/67)×40) = 50.6 (recomputed in the sketch below)
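A short Python sketch that recomputes these figures from the raw [positive, negative] counts (exact values differ slightly from the rounded percentages above):

```python
import math

def ent(pos, neg):
    """Entropy of a two-class node, expressed in percent."""
    n = pos + neg
    return -100 * sum(p * math.log2(p) for p in (pos / n, neg / n) if p > 0)

e_overall = ent(28, 39)     # ≈ 98%
e_owns    = ent(25, 4)      # ≈ 58%
e_no_car  = ent(3, 35)      # ≈ 40%

# Weight each child node by its share of the 67 records
ig_car = e_overall - (29 / 67 * e_owns + 38 / 67 * e_no_car)
print(round(ig_car, 1))     # ≈ 50.4; the 50.6 above uses entropies rounded to whole percent
```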

Split With Respect to ‘Gender’

  • Entropy([19+,21-]) Male = 99%
  • Entropy([9+,18-]) Female = 91%
  • Information Gain for Gender = 98 – ((40/67)×99 + (27/67)×91) = 2.2
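The same computation for the Gender split, with the entropy helper repeated so the sketch is self-contained; the small difference from the 2.2 quoted above comes from rounding the entropies to whole percentages:

```python
import math

def ent(pos, neg):
    """Entropy of a two-class node, expressed in percent."""
    n = pos + neg
    return -100 * sum(p * math.log2(p) for p in (pos / n, neg / n) if p > 0)

e_overall = ent(28, 39)     # ≈ 98%
e_male    = ent(19, 21)     # ≈ 99.8%
e_female  = ent(9, 18)      # ≈ 91.8%

ig_gender = e_overall - (40 / 67 * e_male + 27 / 67 * e_female)
print(round(ig_gender, 1))  # ≈ 1.4; the 2.2 above uses entropies rounded to whole percent

# Roughly 50 for 'Owning a car' versus 1–2 for 'Gender':
# owning a car is clearly the better split attribute here.
```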

Other Purity (Diversity) Measures

  • Chi-square measure of association
  • Gini Index: \(Gini(T) = 1 - \sum_j p_j^2\) (see the sketch below)
  • Information Gain Ratio
  • Misclassification error
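As an illustration, the Gini index from the formula above can be computed for the same two-class nodes used in the lab (counts reused from the 'Owning a car' split):

```python
def gini(counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

print(gini([28, 39]))   # overall node:      ≈ 0.49
print(gini([25, 4]))    # 'owns a car' node: ≈ 0.24 (much purer)
print(gini([3, 35]))    # 'no car' node:     ≈ 0.15
```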


The next post is about The Decision Tree Algorithm.
