Link to the previous post: https://statinfer.com/204-4-10-cross-validation/
Solution
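# Import the tree module (needed if it is not already loaded from the previous post)
from sklearn import tree
# Define the model parameters
tree_KF = tree.DecisionTreeClassifier(criterion='gini',
                                      splitter='best',
                                      max_depth=30,
                                      min_samples_split=30,
                                      min_samples_leaf=30,
                                      max_leaf_nodes=60)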
# Import KFold and cross_val_score from model_selection
# (the old sklearn.cross_validation module was removed in scikit-learn 0.20)
from sklearn.model_selection import KFold, cross_val_score
# Simple K-Fold cross validation. 10 folds.
kfold = KFold(n_splits=10)
# Check the accuracy of the model on each of the 10 folds
score10 = cross_val_score(tree_KF, X, y, cv=kfold)
score10
# Mean accuracy over the 10 folds
score10.mean()
# Simple K-Fold cross validation. 20 folds.
kfold = KFold(n_splits=20)
# Accuracy of the model on each of the 20 folds
score20 = cross_val_score(tree_KF, X, y, cv=kfold)
score20
# Mean accuracy over the 20 folds
score20.mean()
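To see what the `cross_val_score` call does for each fold, the sketch below repeats the 10-fold run with an explicit loop. It is only an illustration under stated assumptions: the data comes from `make_classification` as a synthetic stand-in for `Fiber_df`, the names `X_demo`, `y_demo`, `model`, `manual_scores` and `helper_scores` are hypothetical, and a fixed `random_state` is added so the runs are repeatable.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; the post itself uses Fiber_df)
X_demo, y_demo = make_classification(n_samples=1000, n_features=8, random_state=0)

# Same tree settings as tree_KF, plus a fixed seed for repeatability
model = DecisionTreeClassifier(criterion='gini', splitter='best', max_depth=30,
                               min_samples_split=30, min_samples_leaf=30,
                               max_leaf_nodes=60, random_state=0)

kfold = KFold(n_splits=10)

# Manual loop: fit on nine folds, measure accuracy on the held-out fold
manual_scores = []
for train_idx, test_idx in kfold.split(X_demo):
    fold_model = clone(model)  # fresh, unfitted copy for every fold
    fold_model.fit(X_demo[train_idx], y_demo[train_idx])
    manual_scores.append(fold_model.score(X_demo[test_idx], y_demo[test_idx]))
manual_scores = np.array(manual_scores)

# The helper should report the same per-fold accuracies
helper_scores = cross_val_score(model, X_demo, y_demo, cv=kfold)
print(manual_scores.mean(), helper_scores.mean())
```

Each observation is used for testing exactly once, so the mean of the per-fold accuracies summarises performance over the whole data set.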
With 10-fold cross validation we can expect an accuracy of 76.29%.
With 20-fold cross validation we can expect an accuracy of 77.98%.
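The two means differ by less than two percentage points, so it can help to look at the spread of the per-fold accuracies as well as their mean. The following is a rough sketch, not the post's own code: it runs on synthetic data from `make_classification` (an assumption, since `Fiber_df` is not reproduced here), and `X_demo`, `y_demo`, `model` and `summary` are hypothetical names. The numbers it prints will not match the figures above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; replace with X, y from Fiber_df)
X_demo, y_demo = make_classification(n_samples=1000, n_features=8, random_state=0)

# Tree settings copied from tree_KF, with a fixed seed for repeatability
model = DecisionTreeClassifier(criterion='gini', max_depth=30,
                               min_samples_split=30, min_samples_leaf=30,
                               max_leaf_nodes=60, random_state=0)

# Collect the mean and standard deviation of fold accuracies for each k
summary = {}
for k in (10, 20):
    scores = cross_val_score(model, X_demo, y_demo, cv=KFold(n_splits=k))
    summary[k] = (scores.mean(), scores.std())
    print(f"{k}-fold: mean accuracy = {scores.mean():.4f}, std = {scores.std():.4f}")
```

With more folds each test fold is smaller, so the individual fold accuracies usually vary more even when the means stay close.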
The next post is about bootstrap cross validation.
Link to the next post: https://statinfer.com/204-4-12-bootstrap-cross-validation/