6.2.1.6. Ensemble Methods#
A single model - however well-tuned - is limited by its assumptions and the particular random choices made during training. Ensemble methods combine multiple models so that their errors partially cancel out, producing a stronger predictor than any individual learner.
The statistical argument is simple: if \(B\) models each make independent errors with mean zero and variance \(\sigma^2\), averaging their predictions reduces the variance to \(\sigma^2 / B\). In practice, model errors are correlated, so the gain is smaller - but it is still substantial.
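The variance argument can be checked numerically. The sketch below is a small simulation, not part of the original notebook: the values of `B`, `sigma`, `rho` and the shared-plus-private construction of correlated errors are illustrative assumptions. It compares the variance of an average of independent errors with \(\sigma^2 / B\), and the variance of an average of equally correlated errors with the standard result \(\rho\sigma^2 + (1-\rho)\sigma^2 / B\).

```python
import numpy as np

rng = np.random.default_rng(0)
B, sigma, n_trials = 25, 2.0, 100_000  # illustrative values (assumptions)

# Case 1: B independent zero-mean errors per trial.
errors = rng.normal(0.0, sigma, size=(n_trials, B))
var_single = errors[:, 0].var()      # close to sigma**2
var_avg = errors.mean(axis=1).var()  # close to sigma**2 / B

# Case 2: errors with pairwise correlation rho, built from one shared
# component plus a private component per model.
rho = 0.5
shared = rng.normal(0.0, sigma, size=(n_trials, 1))
private = rng.normal(0.0, sigma, size=(n_trials, B))
corr_errors = np.sqrt(rho) * shared + np.sqrt(1 - rho) * private
var_corr_avg = corr_errors.mean(axis=1).var()
expected_corr = rho * sigma**2 + (1 - rho) * sigma**2 / B

print(f"single model:         {var_single:.3f}")
print(f"independent average:  {var_avg:.3f}  (theory {sigma**2 / B:.3f})")
print(f"correlated average:   {var_corr_avg:.3f}  (theory {expected_corr:.3f})")
```

With correlated errors the variance cannot drop below \(\rho\sigma^2\) no matter how many models are averaged, which is why diversity between models matters as much as their number.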
There are three complementary strategies covered on this page:
| Strategy | Models trained | How predictions are combined |
|---|---|---|
| Voting | Different model types, in parallel | Average (or weighted average) of all predictions |
| Bagging | Same model type, on different bootstrap samples, in parallel | Average of all predictions |
| Stacking | Different base models + a meta-model | Meta-model learns from base-model outputs |
Boosting - the fourth major ensemble strategy - is treated separately on the Boosting page because its sequential training logic is fundamentally different.
1. Voting (Averaging)#
Intuition#
The simplest ensemble: train several diverse models independently and average their predictions. Diversity is the key - models that make different kinds of errors benefit most from averaging. Using a mix of model families (linear, tree-based, kernel-based) is a common strategy.
In scikit-learn#
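from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

voting = VotingRegressor(estimators=[
    ('ridge', Ridge(alpha=1.0)),
    ('dt', DecisionTreeRegressor(max_depth=5)),
    ('svr', Pipeline([('sc', StandardScaler()), ('svr', SVR(C=50))])),
])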
Weights can be assigned via the weights parameter to give stronger models more influence.
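As a minimal sketch of what the weights do, the example below fits a weighted `VotingRegressor` and recomputes its prediction by hand as a weighted average of the fitted base models. The synthetic dataset from `make_regression` and the weights `[3, 1]` are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, used only to have something to fit (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

weights = [3, 1]  # illustrative: Ridge counts three times as much as the tree
voting_w = VotingRegressor(
    estimators=[
        ("ridge", Ridge(alpha=1.0)),
        ("dt", DecisionTreeRegressor(max_depth=5, random_state=0)),
    ],
    weights=weights,
)
voting_w.fit(X, y)

# Recompute the ensemble prediction from the fitted base models.
base_preds = np.column_stack([est.predict(X) for est in voting_w.estimators_])
manual = np.average(base_preds, axis=1, weights=weights)

print(np.allclose(manual, voting_w.predict(X)))
```

The weights are fixed by the user and are not learned from data; learning them is what Stacking adds.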
2. Bagging (Bootstrap Aggregating)#
Intuition#
Bagging trains the same model type on many different bootstrap samples of the training data. Each bootstrap sample draws \(n\) points with replacement, so each model sees a slightly different view of the data.
High-variance models like deep decision trees make quite different predictions on different bootstrap samples - averaging smooths out those idiosyncratic errors. Bagging primarily reduces variance while leaving bias roughly unchanged.
Note
If sampling is done without replacement instead of with replacement, the technique is called Pasting. The key difference: with Pasting, each sub-sample contains only unique points, so the sub-samples must be smaller than the training set to differ at all, and the individual models tend to be more similar to one another. Bagging (with replacement) often generalises slightly better because the duplicated and omitted points in each bootstrap sample add variation between models, which makes their errors less correlated.
The Math#
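The bagged prediction is the average of the \(B\) individual models:
\[\hat{f}_{\text{bag}}(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}_b(x)\]
Each \(\hat{f}_b\) is trained on a bootstrap sample \(\mathcal{D}_b \sim \mathcal{D}\) (sampled with replacement). When each bootstrap sample contains \(n\) draws, a given point is missed with probability \((1 - 1/n)^n \approx 1/e\), so roughly 37% of the original points are left out of each bootstrap sample - this out-of-bag (OOB) set can be used as a free validation set.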
In scikit-learn#
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

bag = BaggingRegressor(
    estimator=DecisionTreeRegressor(max_depth=8),
    n_estimators=50,
    max_samples=0.8,  # each bootstrap sample draws 0.8 * n training points
    random_state=42
)
3. Stacking#
Intuition#
Stacking is a two-level ensemble. Level-0 (base) models are trained first; their predictions are then used as input features for a level-1 (meta) model that learns the best way to combine them.
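The critical design challenge is preventing leakage: if the meta-model is trained on predictions that the base models make for the very samples they were fitted on, those predictions look far better than they would on new data. The meta-model then learns to over-trust whichever base model memorised the training set most closely, and the stack overfits badly.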
The two standard solutions are:
Hold-out split - Split the training data into two parts. Train base models on the first part, then generate predictions on the second (held-out) part. Feed those held-out predictions as features to the meta-model. This is simple and fast, but wastes training data.
Out-of-fold (OOF) cross-validation (recommended) - Use \(k\)-fold CV: for each fold, train base models on the other \(k-1\) folds and predict on the held-out fold. This produces a full set of leak-free predictions for every training sample, which the meta-model then learns from. No data is wasted.
scikit-learn’s StackingRegressor implements the OOF approach automatically via the cv parameter.
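The meta-model exploits the fact that different base models have different strengths. A linear meta-model such as Ridge learns one global weight per base model; a non-linear meta-model (optionally given the original features via passthrough=True) can go further and learn which base model to trust in which region of the input space.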
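The out-of-fold procedure described above can be sketched by hand with `cross_val_predict`. This is a simplified illustration of the idea on an assumed synthetic dataset, not a reproduction of how `StackingRegressor` is implemented internally; the function `stacked_predict` is a hypothetical helper. The sketch also compares in-sample and out-of-fold scores of the tree to show how optimistic leaked predictions would be.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=15.0, random_state=0)

base_models = [Ridge(alpha=1.0), DecisionTreeRegressor(max_depth=5, random_state=0)]
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Level 0: one column of out-of-fold predictions per base model.
# Every row is predicted by a model that never saw that row.
oof = np.column_stack([cross_val_predict(m, X, y, cv=cv) for m in base_models])

# Level 1: the meta-model is trained on the leak-free predictions.
meta = Ridge(alpha=1.0).fit(oof, y)

# For inference, refit each base model on all of the training data.
fitted = [m.fit(X, y) for m in base_models]


def stacked_predict(X_new):
    """Predict with the base models, then combine with the meta-model."""
    feats = np.column_stack([m.predict(X_new) for m in fitted])
    return meta.predict(feats)


# Leakage check: in-sample tree predictions look better than OOF ones.
in_sample_r2 = r2_score(y, fitted[1].predict(X))
oof_r2 = r2_score(y, oof[:, 1])
print(f"tree R^2 in-sample: {in_sample_r2:.3f}, out-of-fold: {oof_r2:.3f}")
```

Training the meta-model on the in-sample column instead of the out-of-fold column would teach it that the tree is more reliable than it really is.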
In scikit-learn#
from sklearn.ensemble import BaggingRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.tree import DecisionTreeRegressor

stacking = StackingRegressor(
    estimators=[                          # base models
        ('ridge', Ridge(alpha=1.0)),
        ('dt', DecisionTreeRegressor(max_depth=5)),
        ('bag', BaggingRegressor(n_estimators=20)),
    ],
    final_estimator=Ridge(alpha=1.0),     # meta-model
    cv=5                                  # OOF folds
)
Example#
import pandas as pd

voting = VotingRegressor(estimators=[
    ('ridge', Ridge(alpha=1.0)),
    ('dt', DecisionTreeRegressor(max_depth=5, random_state=42)),
    ('svr', Pipeline([('sc', StandardScaler()), ('svr', SVR(C=50))])),
])

bagging = BaggingRegressor(
    estimator=DecisionTreeRegressor(max_depth=8),
    n_estimators=50, max_samples=0.8,
    random_state=42, n_jobs=-1
)

stacking = StackingRegressor(
    estimators=[
        ('ridge', Ridge(alpha=1.0)),
        ('dt', DecisionTreeRegressor(max_depth=5, random_state=42)),
        ('bag', BaggingRegressor(n_estimators=20, random_state=42)),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5
)

rows = []
models_fitted = {}
# fit_eval is a helper assumed to be defined elsewhere in the notebook (not shown here);
# from its use below, it returns a metrics row and the fitted model.
for name, model in [("Voting", voting), ("Bagging (50 trees)", bagging),
                    ("Stacking", stacking)]:
    row, m = fit_eval(name, model)
    rows.append(row)
    models_fitted[name] = m

results_df = pd.DataFrame(rows)
results_df
| | Model | Train R² | Test R² | Test RMSE |
|---|---|---|---|---|
| 0 | Voting | 0.960 | 0.840 | 71.0 |
| 1 | Bagging (50 trees) | 0.949 | 0.765 | 86.2 |
| 2 | Stacking | 0.979 | 0.977 | 27.0 |
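On the test set, Voting achieves \(R^2\) = 0.840, Bagging 0.765, and Stacking 0.977. Voting and Bagging show a sizeable gap between train and test \(R^2\) (0.960 vs 0.840 and 0.949 vs 0.765), whereas Stacking's test score almost matches its train score. Each improves over the single decision tree (see Decision Tree Regression).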
Comparing Against a Single Decision Tree Baseline#
The single decision tree baseline gives \(R^2\) = 0.445. All three ensemble strategies outperform it. Here Stacking provides the largest gain, since its meta-model learns how much weight to give each base model instead of averaging them equally.
How Many Bagging Estimators Are Enough?#
Performance improves rapidly up to ~20–50 estimators and then plateaus. Beyond that point, adding more models costs computation without meaningful gains.
Strengths and Weaknesses#
| Strategy | Best for | Watch out for |
|---|---|---|
| Voting | Quick ensemble of diverse models | All base models must be independently strong |
| Bagging | Reducing variance of high-variance models | Slow to train with many estimators; doesn’t help low-variance models much |
| Stacking | Squeezing out maximum performance | Risk of leakage if OOF predictions are not used correctly; slow; harder to interpret |
Tip
Stacking is often the most powerful but also the most expensive. Use it when you already have well-tuned base models and want a final performance boost. For everyday use, Random Forest - essentially bagged decision trees with an extra decorrelation step, in which only a random subset of features is considered at each split - is usually the better default.