
Polynomial Regression — Fitting Nonlinear Relationships




Fitting Nonlinear Relationships With Linear Methods

Polynomial regression captures curved relationships by adding powers of X as predictors while keeping the model linear in its coefficients. It bridges the gap between simple linear models and complex nonlinear patterns.

  • Pharmacology — Model dose-response curves with diminishing returns

  • Environmental Science — Capture temperature effects on species populations

  • Manufacturing — Relate process parameters to quality with nonlinear response surfaces

Adding polynomial terms lets straight lines bend to follow the data's true shape.


For a single predictor X and a polynomial of degree d, the model is:

Polynomial Regression Model

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \cdots + \beta_d X^d + \varepsilon$$

Here,

  • Y = Response variable
  • X = Predictor variable
  • β_j = Coefficient for X^j
  • d = Degree of the polynomial
  • ε = Error term
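
Because the model is linear in the coefficients, fitting it amounts to ordinary least squares on a design matrix whose columns are powers of X. The following is a minimal NumPy sketch of that idea; the quadratic coefficients, noise level, and sample size are made-up values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data from a known quadratic (assumed values, not from the lesson)
X = np.linspace(-2, 2, 50)
y = 1.0 + 2.0 * X - 0.5 * X**2 + rng.normal(0, 0.1, X.size)

d = 2
# Design matrix with columns X^0, X^1, ..., X^d, so that y ≈ A @ beta
A = np.vander(X, N=d + 1, increasing=True)

# Ordinary least squares on the expanded design matrix
beta, *_ = np.linalg.lstsq(A, y, rcond=None)

# np.polyfit solves the same problem but returns the highest power first
beta_polyfit = np.polyfit(X, y, deg=d)[::-1]

print("beta (lstsq):  ", beta)
print("beta (polyfit):", beta_polyfit)
```

The two estimates should agree up to floating-point error, which shows that "polynomial" describes the features, not the fitting procedure.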

import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import KFold, cross_val_score
import warnings
warnings.filterwarnings('ignore')

# Simulate a cubic relationship with Gaussian noise
np.random.seed(42)
n = 80
X = np.linspace(-3, 3, n)
y = 0.5*X**3 - X**2 + 2*X + np.random.normal(0, 1.5, n)

# scikit-learn expects a 2-D feature array; X_plot is a dense grid for the curves
X_2d = X.reshape(-1, 1)
X_plot = np.linspace(-3.2, 3.2, 300).reshape(-1, 1)

# X is sorted, so shuffle the folds; contiguous folds would score extrapolation
cv = KFold(n_splits=5, shuffle=True, random_state=42)

fig, axes = plt.subplots(2, 3, figsize=(15, 8))
degrees = [1, 2, 3, 5, 10, 20]
colors = ['blue', 'green', 'red', 'orange', 'purple', 'brown']

cv_scores = {}
for ax, deg, col in zip(axes.flat, degrees, colors):
    model = Pipeline([('poly', PolynomialFeatures(deg)),
                      ('lin',  LinearRegression())])
    model.fit(X_2d, y)
    y_pred = model.predict(X_plot)

    # Cross-validated R² (out of sample) versus training R² (in sample)
    cv_r2 = cross_val_score(model, X_2d, y, cv=cv, scoring='r2').mean()
    train_r2 = model.score(X_2d, y)
    cv_scores[deg] = cv_r2

    ax.scatter(X, y, alpha=0.4, s=20, color='gray')
    ax.plot(X_plot, y_pred, color=col, linewidth=2)
    ax.set_ylim(-25, 25)
    # Degree 3 matches the cubic used to simulate the data
    label = ' <- CORRECT' if deg == 3 else ''
    ax.set_title(f'Degree {deg}{label}\nTrain R²={train_r2:.3f}, CV R²={cv_r2:.3f}')

plt.suptitle('Polynomial Regression: Underfitting -> Overfitting', fontsize=14)
plt.tight_layout()
plt.savefig('polynomial_regression.png', dpi=150)
plt.show()

# Text bar chart of the cross-validated scores
print("Cross-Validated R² by Degree:")
for deg, cv_r2 in cv_scores.items():
    bar = '#' * max(0, int(cv_r2*20))
    print(f"  Degree {deg:2d}: {cv_r2:.4f} {bar}")
print("Peak CV R² indicates optimal degree")

Overfitting Warning

Higher-degree polynomials are more flexible but risk overfitting: they begin to fit noise in the training data, so training R² keeps rising while out-of-sample performance falls. Use cross-validation to select the degree.
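
One way to put this advice into practice is to score a range of candidate degrees with k-fold cross-validation and keep the one with the highest mean score. The sketch below assumes the same cubic signal as the lesson's simulation (with a different noise draw), a candidate range of 1 to 10, and shuffled 5-fold splits; all of these are illustrative choices rather than recommendations.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)
x = np.linspace(-3, 3, 80)
# Cubic signal plus Gaussian noise (assumed setup for illustration)
y = 0.5 * x**3 - x**2 + 2 * x + rng.normal(0, 1.5, x.size)
X = x.reshape(-1, 1)

# x is sorted, so shuffle the folds; otherwise each fold tests extrapolation
cv = KFold(n_splits=5, shuffle=True, random_state=0)

candidate_degrees = range(1, 11)
cv_r2 = {}
for deg in candidate_degrees:
    model = make_pipeline(PolynomialFeatures(deg), LinearRegression())
    cv_r2[deg] = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()

# Degree with the highest mean cross-validated R²
best_degree = max(cv_r2, key=cv_r2.get)
print(f"Best degree by CV R²: {best_degree}")
```

Neighbouring degrees often score almost identically, so when the differences are within fold-to-fold noise it is reasonable to prefer the lower degree.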


Key Takeaways

Summary: Polynomial Regression

  • Polynomial regression is still a linear model: it is linear in the parameters β

  • Higher degree = more flexible but risks overfitting

  • Use cross-validation to select the optimal polynomial degree

  • Center and scale X before computing powers to reduce numerical instability

  • Splines are usually better than high-degree polynomials for flexible fitting
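
The point about centering and scaling can be checked by comparing the condition number of the design matrix before and after standardizing X. This is a small sketch with a hypothetical predictor (calendar years from 2000 to 2020) and an arbitrary degree of 4; larger condition numbers mean the least-squares solution is more sensitive to rounding error.

```python
import numpy as np

# Hypothetical predictor that sits far from zero, e.g. calendar years
x = np.linspace(2000, 2020, 50)
degree = 4

# Raw powers: columns span many orders of magnitude and are nearly collinear
A_raw = np.vander(x, N=degree + 1, increasing=True)

# Center and scale first, then take powers of the standardized variable
z = (x - x.mean()) / x.std()
A_scaled = np.vander(z, N=degree + 1, increasing=True)

cond_raw = np.linalg.cond(A_raw)
cond_scaled = np.linalg.cond(A_scaled)
print(f"condition number, raw powers:    {cond_raw:.2e}")
print(f"condition number, scaled powers: {cond_scaled:.2e}")
```

The fitted curve is the same in exact arithmetic either way; standardizing only changes how reliably the coefficients can be computed.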
