The models have to be independent. We can't build the same model multiple times and expect the error to go down.
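To make the independence point concrete, here is a minimal sketch that assumes a binary task, an odd number of models, and models that are each correct with the same probability p. The function names are illustrative and not from any library.

```python
from math import comb

def majority_vote_accuracy(p, n):
    """Probability that a majority of n independent models is correct,
    when each model is correct with probability p (n is assumed odd)."""
    k_min = n // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def identical_copies_accuracy(p, n):
    """Identical copies always vote together, so voting changes nothing."""
    return p

independent = majority_vote_accuracy(0.6, 3)   # three independent models
copies = identical_copies_accuracy(0.6, 3)     # the same model built three times
print(round(independent, 3), copies)
```

Under these assumptions, three independent models at 60% accuracy vote their way to about 64.8%, while three copies of one model stay at 60%. The same formula gives a result below p when p is under 0.5, so voting only helps when the individual models are better than chance.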
We may have to bring in the independence by choosing subsets of the data, or subsets of the features, while building the individual models.
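One common way to do this is to draw a bootstrap sample of rows and a random subset of features for each model. The sketch below only shows the sampling step. The feature names, row count, and helper names are hypothetical.

```python
import random

def bootstrap_rows(n_rows, rng):
    """Sample row indices with replacement, same size as the original data."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def random_feature_subset(feature_names, k, rng):
    """Pick k distinct features without replacement."""
    return rng.sample(feature_names, k)

rng = random.Random(42)                            # fixed seed for repeatability
features = ["age", "income", "tenure", "region"]  # hypothetical feature names
n_rows = 10                                        # hypothetical data size

# Each model gets its own rows and its own features to train on.
model_inputs = []
for _ in range(3):
    rows = bootstrap_rows(n_rows, rng)
    cols = random_feature_subset(features, 2, rng)
    model_inputs.append((rows, cols))

for rows, cols in model_inputs:
    print(cols, sorted(rows))
```

Because every model sees a different slice of the data, their errors are less likely to line up, which is what the voting step relies on.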
An ensemble may backfire if we combine dependent models that are already inaccurate. The final ensemble might turn out to be an even worse model.
Yes, there is a small disclaimer in the “Wisdom of Crowds” theory: we need good, independent individuals. If we pool dependent individuals with poor knowledge, we might end up with an even worse ensemble.
For example, suppose we build three models, where model-1 and model-2 are bad and model-3 is good. Most of the time, voting will make the ensemble return the combined output of model-1 and model-2.
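The three-model case can be sketched with a few made-up predictions. All labels below are invented for illustration, and model_1 and model_2 are deliberately given nearly the same mistakes.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label that most of the models predicted."""
    return Counter(votes).most_common(1)[0][0]

def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

# Made-up labels for six records.
truth   = [1, 0, 1, 1, 0, 1]
model_1 = [0, 1, 0, 1, 1, 0]  # bad
model_2 = [0, 1, 0, 0, 1, 0]  # bad, and wrong in mostly the same places
model_3 = [1, 0, 1, 1, 0, 1]  # good

# Vote record by record across the three models.
ensemble = [majority_vote(votes) for votes in zip(model_1, model_2, model_3)]

print(ensemble)
print(accuracy(model_3, truth), accuracy(ensemble, truth))
```

In this toy data the two bad models agree on almost every record, so they outvote model_3 and the ensemble ends up far less accurate than model_3 alone.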
Ensemble methods are among the most widely used methods these days. With modern hardware, it's not really a huge task to build multiple models.
Both bagging and boosting do a good job of reducing error: bagging mainly reduces variance, while boosting mainly reduces bias.
Random forests are relatively fast. Each tree is built on a sample of the data, with only a subset of the features considered at each split, so building many of them doesn't put a lot of pressure on the machine.
Random forests can also report variable importance. We need to be careful with categorical features, though: random forests tend to give higher importance to variables with a larger number of levels.
In boosting algorithms, we may have to restrict the number of iterations to avoid overfitting.
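One simple way to pick that limit is to track the error on a held-out validation set after each boosting round and stop once it no longer improves. The sketch below assumes those per-round errors have already been collected. The helper name, the patience value, and the error values are all hypothetical.

```python
def best_num_iterations(validation_errors, patience=2):
    """Return the 1-based round with the lowest validation error seen
    before the error fails to improve for `patience` rounds in a row."""
    best_error = float("inf")
    best_round = 0
    for round_number, error in enumerate(validation_errors, start=1):
        if error < best_error:
            best_error = error
            best_round = round_number
        elif round_number - best_round >= patience:
            break  # validation error has stalled, stop adding rounds
    return best_round

# Hypothetical validation errors after each boosting round.
val_errors = [0.30, 0.24, 0.21, 0.20, 0.22, 0.25, 0.27]
print(best_num_iterations(val_errors))
```

Here the validation error bottoms out at round 4 and then climbs, which is the usual sign that further rounds are fitting noise in the training data.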
Ensemble models are often the final effort of a data scientist when building the most suitable predictive model for the data.