# 204.7.9 Boosting Conclusion

##### When does boosting not work?
Link to the previous post: https://statinfer.com/204-7-8-practice-boosting/

## When Doesn’t an Ensemble Work?

• The models have to be independent; we can’t build the same model multiple times and expect the error to decrease.
• We may have to introduce independence by choosing subsets of the data, or subsets of the features, while building the individual models.
• An ensemble may backfire if we use dependent models that are already inaccurate. The final ensemble might turn out to be an even worse model.
• Yes, there is a small disclaimer in the “Wisdom of the Crowd” theory: we need good, independent individuals. If we collate dependent individuals with poor knowledge, we might end up with an even worse ensemble.
• For example, suppose we build three models, where model-1 and model-2 are bad and model-3 is good. With majority voting, whenever model-1 and model-2 agree they outvote model-3, so most of the time the ensemble simply returns their combined output.
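
The three-model example above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes a binary classification problem, models that make their errors independently, and made-up accuracies (0.7, 0.4, 0.9) that do not come from any real dataset. The function name `majority_vote_accuracy` is hypothetical.

```python
from itertools import product

def majority_vote_accuracy(accuracies):
    """Exact accuracy of a majority vote over independent binary classifiers.

    accuracies: probability that each model is correct (use an odd count).
    """
    n = len(accuracies)
    total = 0.0
    # Enumerate every pattern of correct (True) / wrong (False) models.
    for pattern in product([True, False], repeat=n):
        prob = 1.0
        for correct, acc in zip(pattern, accuracies):
            prob *= acc if correct else 1.0 - acc
        # In a binary problem all wrong models give the same wrong label,
        # so the vote is correct exactly when most models are correct.
        if sum(pattern) > n / 2:
            total += prob
    return total

# Assumed accuracies, chosen only for illustration.
good_trio = majority_vote_accuracy([0.7, 0.7, 0.7])
mixed_trio = majority_vote_accuracy([0.4, 0.4, 0.9])
print(round(good_trio, 3))   # 0.784
print(round(mixed_trio, 3))  # 0.592
```

Under these assumptions, three independent models that are each 70% accurate vote their way to about 78%, while two 40% models combined with one 90% model reach only about 59%, well below the best single model. If model-1 and model-2 were fully dependent (identical copies), they would always agree and the vote would simply reproduce their 40% accuracy.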

## Conclusion

• Ensemble methods are among the most widely used methods these days. With modern machines, it’s not really a huge task to build multiple models.
• Bagging and boosting reduce error in different ways: bagging mainly reduces variance, while boosting mainly reduces bias.
• Random forests are relatively fast to train: each tree considers only a random subset of features at each split, and the trees are independent of one another, so they can be built in parallel without putting a lot of pressure on the machine.
• Random forests can also report variable importance. We need to be careful with categorical features, because random forests tend to give higher importance to variables with a larger number of levels.
• In boosting algorithms we may have to restrict the number of iterations to avoid overfitting.
• Ensemble models are often the final effort of a data scientist when building the most suitable predictive model for the data.
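
One way to see why averaging independent models helps, and why dependent models help much less, is the standard variance formula for an average. As a sketch, assume $B$ models $f_1, \dots, f_B$ whose predictions at a point $x$ each have variance $\sigma^2$ and pairwise correlation $\rho$ (these symbols are introduced here for illustration and are not from the original post):

$$
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} f_b(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{B}\,\sigma^2
$$

As $B$ grows, the second term shrinks towards zero, but the first term stays. With $\rho = 1$ (the same model built many times) averaging gives no reduction at all, which is why bagging and random forests work to lower $\rho$ by sampling the data and the features.
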
18th March 2017
