
# 204.7.6 Practice : Random Forest

##### Building a Random Forest model using Python.

Link to the previous post : https://statinfer.com/204-7-5-the-random-forest/
Let’s put the Random Forest concept into practice using Python.

### Practice : Random Forest

• Dataset: /Car Accidents IOT/Train.csv
• Goal: predict whether an accident is fatal (the 'Fatal' column).
• Build a decision tree model on the training data.
• On the test data, calculate the decision tree's classification error and accuracy.
• Build a random forest model on the training data.
• On the test data, calculate the random forest's classification error and accuracy.
• How much does the random forest improve on the single tree?
In :
#Importing dataset
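The import cell is empty in this copy of the post, yet the later cells rely on two DataFrames named car_train and car_test. A minimal sketch of how they might be loaded with pandas is shown here. The helper name load_car_data and the test file name Test.csv are assumptions; the exercise itself only names Train.csv.

```python
import pandas as pd


def load_car_data(train_path, test_path):
    """Read the training and test CSV files into two DataFrames.

    Hypothetical helper: the original post does not show its loading code.
    """
    car_train = pd.read_csv(train_path)
    car_test = pd.read_csv(test_path)
    return car_train, car_test


# Hypothetical usage; adjust both paths to where the files live on your machine.
# The test file name is an assumption, only Train.csv is named in the exercise.
# car_train, car_test = load_car_data("Car Accidents IOT/Train.csv",
#                                     "Car Accidents IOT/Test.csv")
```

Both files need the same predictor columns and a 'Fatal' column for the remaining cells to work.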

In :
from sklearn import tree

# Predictors: the columns at positions 1 to 21 (zero-based); target: 'Fatal'
var=list(car_train.columns[1:22])
c=car_train[var]
d=car_train['Fatal']

# Building a decision tree on the training data
clf = tree.DecisionTreeClassifier()
clf.fit(c,d)

Out:
DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,
max_features=None, max_leaf_nodes=None, min_samples_leaf=1,
min_samples_split=2, min_weight_fraction_leaf=0.0,
presort=False, random_state=None, splitter='best')
In :
# Predicting on the test data
tree_predict=clf.predict(car_test[var])

In :
from sklearn.metrics import confusion_matrix  # for the confusion matrix
# Rows are the actual classes, columns are the predicted classes
cm1 = confusion_matrix(car_test[['Fatal']],tree_predict)
print(cm1)

[[3244  648]
 [ 695 4478]]

In :
# Calculate accuracy from the confusion matrix
total1=sum(sum(cm1))
accuracy_tree=(cm1[0,0]+cm1[1,1])/total1
accuracy_tree

Out:
0.85184776613348046
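The exercise also asks for the classification error, which is simply one minus the accuracy. As a small sketch, the values can be recomputed from the confusion matrix printed above; the helper name accuracy_and_error is an assumption and is not part of the original notebook.

```python
import numpy as np

# Decision tree confusion matrix as printed above
# (rows: actual class, columns: predicted class)
cm_tree = np.array([[3244, 648],
                    [695, 4478]])


def accuracy_and_error(cm):
    """Return (accuracy, classification error) for a square confusion matrix."""
    correct = np.trace(cm)   # correctly classified cases sit on the diagonal
    total = cm.sum()         # all test cases
    accuracy = correct / total
    return accuracy, 1.0 - accuracy


tree_accuracy, tree_error = accuracy_and_error(cm_tree)
print(tree_accuracy, tree_error)
```

For the tree this gives an error of about 0.148, that is, 1,343 of the 9,065 test cases are misclassified.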
In :
# accuracy_score() gives the same result as the confusion-matrix calculation
from sklearn.metrics import accuracy_score
accuracy_score(car_test[['Fatal']],tree_predict, normalize=True, sample_weight=None)

Out:
0.85184776613348046
In :
# Building a random forest classifier on the training data
from sklearn.ensemble import RandomForestClassifier
# max_features='sqrt' matches the older 'auto' setting for classifiers;
# 'auto' is no longer accepted by recent scikit-learn releases
forest=RandomForestClassifier(n_estimators=10, criterion='gini', max_depth=None,
                              min_samples_split=2, min_samples_leaf=1,
                              min_weight_fraction_leaf=0.0, max_features='sqrt',
                              max_leaf_nodes=None, bootstrap=True, oob_score=False,
                              n_jobs=1, random_state=None, verbose=0,
                              warm_start=False, class_weight=None)

forest.fit(c,d)

Out:
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
max_depth=None, max_features='auto', max_leaf_nodes=None,
min_samples_leaf=1, min_samples_split=2,
min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
oob_score=False, random_state=None, verbose=0,
warm_start=False)
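Because random_state=None, every run of the fit above can grow a slightly different forest, so the reported accuracy may shift a little between runs. The sketch below illustrates fixing random_state for reproducibility. It uses synthetic data from make_classification as a stand-in, not the car accident dataset, and the sample size, feature counts, and seed are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with 21 predictors (assumed sizes, not the real dataset)
X, y = make_classification(n_samples=2000, n_features=21, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# One single tree and two forests built with the same seed
single_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
forest_a = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_train, y_train)
forest_b = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_train, y_train)

tree_acc = accuracy_score(y_test, single_tree.predict(X_test))
forest_acc = accuracy_score(y_test, forest_a.predict(X_test))

# With identical seeds the two forests make identical predictions
same_predictions = bool((forest_a.predict(X_test) == forest_b.predict(X_test)).all())
print(tree_acc, forest_acc, same_predictions)
```

The accuracies printed here describe the synthetic data only and say nothing about the car accident results.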
In :
###predicting on test data with RF model
forestpredict_test=forest.predict(car_test[var])
e=car_test['Fatal']

In :
# Check the accuracy on the test data
from sklearn.metrics import confusion_matrix  # for the confusion matrix
cm2 = confusion_matrix(car_test[['Fatal']],forestpredict_test)
print(cm2)
total2=sum(sum(cm2))
# Calculate accuracy from the confusion matrix
accuracy_forest=(cm2[0,0]+cm2[1,1])/total2
accuracy_forest

[[3383  509]
 [ 471 4702]]

Out:
0.89189189189189189
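• The random forest raises test accuracy from about 85.2% (single tree) to about 89.2%, a gain of roughly 4 percentage points. Equivalently, the classification error falls from about 14.8% to about 10.8%.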
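To put a number on the improvement, the two confusion matrices printed in this post can be compared directly. This is a small sketch of that arithmetic; the variable and function names are chosen here for illustration.

```python
import numpy as np

# Confusion matrices as printed in this post
# (rows: actual class, columns: predicted class)
cm_tree = np.array([[3244, 648],
                    [695, 4478]])
cm_forest = np.array([[3383, 509],
                      [471, 4702]])


def error_rate(cm):
    """Share of test cases that fall off the diagonal (misclassified)."""
    return 1.0 - np.trace(cm) / cm.sum()


tree_error = error_rate(cm_tree)
forest_error = error_rate(cm_forest)

# Absolute gain in accuracy (same as the drop in error)
accuracy_gain = tree_error - forest_error
# Share of the tree's mistakes that the forest avoids
relative_error_reduction = accuracy_gain / tree_error

print(tree_error, forest_error, accuracy_gain, relative_error_reduction)
```

On these numbers the forest misclassifies 980 test cases against the tree's 1,343, which is roughly 27% fewer errors.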

The next post is about boosting.
Link to the next post : https://statinfer.com/204-7-7-boosting/
