from sklearn.linear_model import LinearRegression  # Import LinearRegression class
from sklearn.preprocessing import PolynomialFeatures  # Import PolynomialFeatures class

poly_reg = PolynomialFeatures(degree=2)  # Create a PolynomialFeatures object with degree 2
X_poly = poly_reg.fit_transform(X)  # Transform the original feature matrix X into polynomial features

lin_reg_2 = LinearRegression()  # Create a LinearRegression object
lin_reg_2.fit(X_poly, y)  # Fit the linear regression model to the polynomial features and target y

In machine learning, we often work with regression models to predict continuous outcomes. One common regression model is linear regression, which assumes a linear relationship between the input features (independent variables) and the output (dependent variable).

However, not all relationships in data are linear. To capture more complex patterns, we can use polynomial regression, which extends linear regression by adding polynomial terms to the model. This allows us to fit non-linear relationships.

Here’s a breakdown of the concepts and steps involved:

  1. Matrix of Features (Design Matrix): This matrix contains all the independent variables (features) for each observation in the dataset. Each column represents a different feature, and each row represents a different observation.
  2. Polynomial Features: In polynomial regression, we create new features by raising the original features to a power. For instance, if we have a feature x, we create new features like x^2, x^3, etc. This helps capture the non-linear relationships in the data.
  3. Transforming Features: Using the PolynomialFeatures class from Scikit-Learn, we can transform our original feature matrix X into a new feature matrix X_poly that includes polynomial terms. The degree of the polynomial determines the highest power of the original features included.
  4. Fitting the Model: After transforming the features, we can fit a linear regression model to these polynomial features. This model will learn the coefficients for the polynomial terms, enabling it to capture non-linear relationships.
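As a small sanity check of the transformation step, the sketch below (with a made-up two-observation feature matrix) shows exactly which columns PolynomialFeatures(degree=2) produces. Note the first column is the bias term x^0 = 1, which scikit-learn includes by default:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0], [3.0]])         # tiny example: one feature, two observations
poly = PolynomialFeatures(degree=2)  # include powers up to x^2
X_poly = poly.fit_transform(X)       # columns: x^0 (bias), x^1, x^2

print(X_poly)
# [[1. 2. 4.]
#  [1. 3. 9.]]
```

Passing include_bias=False drops the constant column, which is often preferable when LinearRegression is already fitting an intercept.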

How Linear and Polynomial Regression Are Related

Multiple linear regression models the output as a weighted sum of the input features:

$$
y = b_0 + b_1x_1 + b_2x_2 + \ldots + b_nx_n
$$

Polynomial regression of degree 2 has exactly the same form, with the powers of a single feature x taking the place of separate features:

$$
y = b_0 + b_1x + b_2x^2
$$
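This equivalence can be checked directly. In the sketch below, the data is generated from a known quadratic with illustrative coefficients b0=1, b1=2, b2=3; fitting plain LinearRegression on the transformed features recovers them:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.linspace(-3, 3, 50).reshape(-1, 1)    # single feature, 50 observations
y = 1 + 2 * x.ravel() + 3 * x.ravel() ** 2   # exact quadratic: b0=1, b1=2, b2=3

X_poly = PolynomialFeatures(degree=2).fit_transform(x)
model = LinearRegression().fit(X_poly, y)

# The constant bias column x^0 is absorbed into the intercept,
# so its coefficient comes out as ~0.
print(model.intercept_)  # ~1.0
print(model.coef_)       # ~[0.0, 2.0, 3.0]
```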

By transforming the features into polynomial terms, we can use the linear regression algorithm to fit a polynomial model to the data. This approach combines the simplicity of linear regression with the flexibility of polynomial regression to capture more complex patterns in the data.
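In practice, the transform-then-fit steps above are often bundled into a single estimator. One common pattern, sketched here with scikit-learn's Pipeline and an illustrative degree of 2, is:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

x = np.linspace(0, 4, 30).reshape(-1, 1)
y = 0.5 + 1.5 * x.ravel() - 2.0 * x.ravel() ** 2  # illustrative quadratic data

# Chain the transform and the regressor into one model.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)                 # transform and fit in one call
pred = model.predict([[2.0]])   # predict also applies the transform automatically
```

The advantage of the pipeline is that the polynomial transform is applied consistently at both fit and predict time, so new data never needs to be transformed by hand.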