Feature selection
# Resources
- https://en.wikipedia.org/wiki/Feature_selection
- http://machinelearningmastery.com/an-introduction-to-feature-selection/
- http://scikit-learn.org/stable/modules/feature_selection.html
    - Removing features with low variance
    - Univariate feature selection
    - Recursive feature elimination (all three methods are sketched in the code below)
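A minimal scikit-learn sketch of the three approaches above; the dataset, threshold, scorer, and estimator are arbitrary examples:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE, SelectKBest, VarianceThreshold, f_classif
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# 1. Removing features with low variance
X_var = VarianceThreshold(threshold=0.1).fit_transform(X)

# 2. Univariate feature selection: keep the 10 features with the highest ANOVA F-score
X_uni = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)

# 3. Recursive feature elimination: repeatedly fit a model and drop the weakest features
rfe = RFE(estimator=LogisticRegression(max_iter=5000), n_features_to_select=10)
X_rfe = rfe.fit_transform(X, y)

print(X.shape, X_var.shape, X_uni.shape, X_rfe.shape)
```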
# Regularization
See AI/Supervised Learning/Regularized regression
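As a quick illustration of how regularization performs feature selection, a sketch using scikit-learn's `SelectFromModel` with an L1-penalized model (dataset and penalty strength are arbitrary examples): the L1 penalty pushes some coefficients to exactly zero, and only features with surviving coefficients are kept.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)  # L1 penalties are scale-sensitive

# Fit an L1-regularized model; the penalty drives some coefficients to exactly zero
lasso = Lasso(alpha=0.05).fit(X, y)

# Keep only the features whose coefficients survived the penalty
selector = SelectFromModel(lasso, prefit=True)
X_selected = selector.transform(X)
print(X.shape, "->", X_selected.shape)
```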
# Tree-based methods
- https://scikit-learn.org/stable/modules/feature_selection.html#tree-based-feature-selection
- Random forests and extra trees: feature importances from forests of trees
- XGBoost: feature importance and why it matters
    - http://datawhatnow.com/feature-importance/
    - http://machinelearningmastery.com/feature-importance-and-feature-selection-with-xgboost-in-python/
    - For a single decision tree, importance is calculated as the amount by which each attribute's split points improve the performance measure, weighted by the number of observations the node is responsible for. The performance measure may be the purity (Gini index) used to select the split points, or another, more specific error function. The feature importances are then averaged across all of the decision trees in the model.
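A short sketch of reading impurity-based importances from a fitted forest and using them for selection; the dataset, model, and threshold are arbitrary examples (XGBoost exposes the same idea through `feature_importances_` and `plot_importance`):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

data = load_breast_cancer()
X, y = data.data, data.target

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances: the Gini improvement at each split, weighted by the
# number of samples reaching that node, averaged over all trees in the forest
importances = forest.feature_importances_
for idx in np.argsort(importances)[::-1][:10]:
    print(f"{data.feature_names[idx]:<25} {importances[idx]:.3f}")

# The same importances can drive feature selection directly
X_reduced = SelectFromModel(forest, prefit=True, threshold="median").transform(X)
print(X.shape, "->", X_reduced.shape)
```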
# Books
# Code
- #CODE Scikit-feature
- #CODE Feature-selector - A tool for dimensionality reduction of machine learning datasets
    - Methods: Missing Values, Single Unique Values, Collinear Features, Zero Importance Features, Low Importance Features (the first three are sketched in pandas at the end of this section)
    - https://github.com/WillKoehrsen/feature-selector/blob/master/Feature%20Selector%20Usage.ipynb
    - https://towardsdatascience.com/a-feature-selection-tool-for-machine-learning-in-python-b64dd23710f0
- #CODE ITMO_FS - Feature selection library in Python
    - https://itmo-fs.readthedocs.io/en/latest/
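Not the Feature-selector API itself, just a minimal pandas sketch of the first three checks listed for it above (missing values, single unique values, collinear features); the function name and thresholds are made up for illustration:

```python
import numpy as np
import pandas as pd

def basic_feature_checks(df: pd.DataFrame,
                         missing_threshold: float = 0.6,
                         correlation_threshold: float = 0.98) -> dict:
    """Flag columns with too many missing values, a single unique value,
    or high pairwise correlation (one column of each correlated pair is flagged)."""
    missing = df.columns[df.isnull().mean() > missing_threshold].tolist()
    single_unique = df.columns[df.nunique() <= 1].tolist()

    # Absolute correlations, upper triangle only, so each pair is checked once
    corr = df.select_dtypes(include=np.number).corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    collinear = [col for col in upper.columns
                 if (upper[col] > correlation_threshold).any()]

    return {"missing": missing, "single_unique": single_unique,
            "collinear": collinear}

# Toy example
df = pd.DataFrame({"a": [1, 2, 3, 4],
                   "b": [1, 2, 3, 4],                 # perfectly collinear with "a"
                   "c": [np.nan, np.nan, np.nan, 1],  # mostly missing
                   "d": [7, 7, 7, 7]})                # single unique value
print(basic_feature_checks(df))
```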