How To Perform Feature Selection for Regression Problems
In this article I explain what feature selection is and how to perform it before training a regression model in Python.
1. Introduction
What is feature selection ?
Feature selection is the procedure of selecting a subset (some out of all available) of the input variables that are most relevant to the target variable (that we wish to predict).
Target variable here refers to the variable that we wish to predict.
For this article we will assume that we only have numerical input variables and a numerical target for regression predictive modeling. Assuming that, we can easily estimate the relationship between each input variable and the target variable. This relationship can be established by calculating a metric such as the correlation value for example.
- NEW: After a great deal of hard work and staying behind the scenes for quite a while, we’re excited to now offer our expertise through a platform, the “Data Science Hub” on Patreon (https://www.patreon.com/TheDataScienceHub). This hub is our way of…