Tree Family in Machine Learning and XGBoost

For research reasons I'm back in my old field and need to use auto-sklearn and XGBoost, so this time I'm determined to explain it properly to myself.

Some links

https://towardsdatascience.com/beyond-grid-search-hypercharge-hyperparameter-tuning-for-xgboost-7c78f7a2929d

https://issueexplorer.com/issue/automl/auto-sklearn/1297

https://www.kaggle.com/residentmario/automated-feature-selection-with-sklearn

https://stackoverflow.com/questions/54035645/features-and-feature-importance-in-auto-sklearn-with-one-hot-encoded-features

https://github.com/automl/auto-sklearn/issues/524

https://towardsdatascience.com/feature-preprocessor-in-automated-machine-learning-c3af6f22f015

https://scikit-learn.org/stable/modules/feature_selection.html

https://automl.github.io/auto-sklearn/master/examples/40_advanced/example_inspect_predictions.html

POC: the baiyan one

What is a tree

Tree or neural network

Bagging and Boosting (ensemble)

Hyperparameter selection: Bayesian optimization

Here I'll also take the chance to finish the piece I left half-written earlier.

How auto-sklearn works and how ‘auto’ it is

Hands-on use and integration with the original model

XGBoost

Bagging and Boosting

What they have in common: both are built from weak classifiers. A weak classifier is a classifier that on its own performs poorly, only slightly better than chance.

Bagging: train multiple weak classifiers, then combine them by weighted voting; the weak classifiers usually overfit (high variance), and averaging them reduces that variance.

Boosting: train multiple weak classifiers sequentially, then combine them by weighted voting; the weak classifiers usually underfit (high bias), and the sequential training reduces that bias.
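A minimal sketch of this contrast with scikit-learn, assuming a version ≥ 1.2 (where the base learner argument is named `estimator`) and a synthetic dataset: bagging uses deep trees that overfit on their own, while boosting uses stumps that underfit on their own. The depths and estimator counts here are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data purely for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging: deep trees (high variance, each one overfits) trained on
# bootstrap samples in parallel; voting averages away the variance.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=None),
    n_estimators=100,
    random_state=0,
)

# Boosting: stumps (high bias, each one underfits) trained one after
# another, each focusing on the previous ones' mistakes; the sequence
# drives the bias down.
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100,
    random_state=0,
)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```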

Boosted trees

A boosted tree is trained on residuals.

The final prediction is not produced by a single model: after one model is trained, its residuals become the target values for the next model (call them y2), and this repeats in a loop.

Final prediction = prediction of model 1 + prediction of model 2 + prediction of model 3 + ...

There is no voting; every model contributes to the final result.
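A minimal sketch of this residual loop, assuming squared-error loss (so each round's target is the plain residual y minus the running prediction) and scikit-learn decision trees as the weak learners; the number of rounds and tree depth are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)

models = []
residual = y.copy()  # round 1's target is the original y
for _ in range(50):
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, residual)                  # fit this round's model to the residuals
    models.append(tree)
    residual = residual - tree.predict(X)  # y2, y3, ...: what is still unexplained

# No voting: the final prediction is the sum of every model's prediction.
prediction = np.sum([m.predict(X) for m in models], axis=0)
print("training MSE:", np.mean((y - prediction) ** 2))
```

Real gradient boosting libraries add a learning rate (shrinkage) in front of each model's contribution and generalize the residual to the gradient of an arbitrary loss, but the additive structure is exactly this.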

Model principles and optimization