The Foundation of Predictive Modeling
Linear regression is the workhorse of statistics and the gateway to machine learning. The name comes from Sir Francis Galton's 1886 study of how children's heights 'regressed toward mediocrity' compared to their parents', though the least squares machinery behind it was developed decades earlier by Legendre and Gauss. Today, linear regression underlies everything from economic forecasting to A/B testing to medical research. If you understand one statistical model, it should be this one.
The Least Squares Method
The idea is elegant: find the line that minimizes the sum of squared vertical distances (the residuals) between each data point and the line. Why squared? Because squaring penalizes large errors more than small ones, produces a unique solution with a closed-form formula, and connects beautifully to calculus and linear algebra. For data points (xᵢ, yᵢ), the resulting formulas are β₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² for the slope and β₀ = ȳ − β₁x̄ for the intercept, computable by hand or by any computer in microseconds.
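Here is a minimal sketch of those closed-form formulas in Python (NumPy assumed; fit_line and the sample data are illustrative, not part of the simulation):

import numpy as np

def fit_line(x, y):
    # Ordinary least squares for y ≈ b0 + b1*x, via the closed-form formulas.
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)  # slope
    b0 = y_bar - b1 * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, 50)  # noisy points around y = 2 + 3x
b0, b1 = fit_line(x, y)
print(f"intercept ≈ {b0:.2f}, slope ≈ {b1:.2f}")  # should land near 2 and 3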
Reading the Regression Output
R² tells you how well the line fits: an R² of 0.82 means the linear relationship accounts for 82% of the variance in y. RMSE (root mean squared error) tells you the typical prediction error in the same units as your data. The slope tells you the rate of change: for every 1-unit increase in x, y changes by β₁ units on average. Together these numbers summarize the fit, but they are no substitute for plotting the data, since very different datasets can share nearly identical regression statistics (Anscombe's quartet is the classic demonstration).
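Both summaries fall directly out of the residuals. A sketch continuing the fit_line example above (regression_summary is an illustrative name):

def regression_summary(x, y, b0, b1):
    # Evaluate a fitted line y_hat = b0 + b1*x against the observed y.
    y_hat = b0 + b1 * x
    residuals = y - y_hat
    ss_res = np.sum(residuals ** 2)          # variation the line fails to explain
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation in y
    r_squared = 1.0 - ss_res / ss_tot        # share of variation explained
    rmse = np.sqrt(np.mean(residuals ** 2))  # typical error, in units of y
    return r_squared, rmse

# Usage with the earlier fit: r2, rmse = regression_summary(x, y, *fit_line(x, y))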
Beyond Simple Regression
Simple linear regression with one predictor is just the beginning. Multiple regression adds more predictors (y = β₀ + β₁x₁ + β₂x₂ + ...). Polynomial regression fits curves by adding squared and cubed terms; since the model stays linear in the coefficients, the same solution applies unchanged. Regularized regression (Ridge, Lasso) reduces overfitting by penalizing large coefficients. All of these extensions build on the same least squares foundation you can explore in the simulation above.
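To see that shared foundation concretely, here is a sketch of multiple regression as a single matrix least squares solve, assuming NumPy; the data and the λ value are illustrative:

import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, 100)
x2 = rng.uniform(0, 5, 100)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0.0, 1.0, 100)  # synthetic data

# Multiple regression: stack an intercept column and the predictors into a
# design matrix X, then solve for all coefficients in one least squares step.
X = np.column_stack([np.ones_like(x1), x1, x2])
betas, *_ = np.linalg.lstsq(X, y, rcond=None)
print(betas)  # ≈ [1.0, 2.0, -0.5]

# Polynomial regression is the same solve with powers of x as extra columns.
# Ridge regression adds a penalty λ to the normal equations; in practice the
# intercept column is usually left unpenalized.
lam = 1.0
ridge_betas = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)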