In previous section, we studied about Random Forests and Boosting : Wisdom of Crowd
In this post we will discuss a bit about Ensemble Models and why they work.
Ensemble Models
- Obtaining a better predictions using multiple models on the same dataset.
- Not every time it is possible to find single best fit model for our data, ensemble model combines multiple models to come up with one consolidated model.
- Ensemble models work on the principle that multiple moderately accurate models can give us a highly accurate model.
- Understandably, the Building and Evaluating the ensemble models is computationally expensive.
- Build one really good model is the usual statistical approach. Build many models and average the results is the philosophy of Ensemble learning.
Why Ensemble technique works?
- Imagine three models
- M1 with an error rate of 10%
- M2 with an error rate of 10%
- M3 with an error rate of 10%
- The three models have to be independent, we can’t build the same model three times and expect the error to reduce. Any changes to the modeling technique in model -1 should not impact model-2.
- In this scenario, the worst ensemble model will have 10% error rate.
- The best ensemble model will have an error rate of 2.8%
- 2 out of 3 models predicted wrong + all models predicted wrong
- (3C2)*(0.1)(0.1)(0.9) + (0.1)(0.1)(0.1)
- 2.8% The best ensemble model will have an error rate of 2.8%
The next post is about Types of Ensemble Models.