Regression Formula (Least Squares)
Learn the least squares regression formula for finding the line of best fit.
Includes slope and intercept derivation with examples.
The Formula
ŷ = a + bx
b = [n × Σ(xy) - Σx × Σy] / [n × Σ(x²) - (Σx)²]
a = (Σy - b × Σx) / n
The least squares method minimizes the sum of squared differences between observed values and predicted values. Provided the x values are not all identical, it produces a unique line of best fit.
Variables
| Symbol | Meaning |
|---|---|
| ŷ | Predicted value of the dependent variable |
| a | Y-intercept of the regression line |
| b | Slope of the regression line |
| n | Number of data points |
| Σ(xy) | Sum of the products of each x and y pair |
| Σx, Σy | Sum of all x values, sum of all y values |
| Σ(x²) | Sum of each x value squared |
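The formula above translates directly into code. As a minimal sketch in Python (the function name `least_squares` is illustrative, not from any particular library):

```python
def least_squares(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σ(xy)
    sum_x2 = sum(x * x for x in xs)               # Σ(x²)
    # Slope: b = [nΣ(xy) - ΣxΣy] / [nΣ(x²) - (Σx)²]
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = (Σy - bΣx) / n
    a = (sum_y - b * sum_x) / n
    return a, b
```

Each intermediate sum mirrors one symbol in the variables table, so the function can be checked term by term against the formula.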
Example 1
Data: (1, 2), (2, 3), (3, 5), (4, 4), (5, 6). Find the regression line.
n = 5, Σx = 15, Σy = 20, Σ(xy) = 1×2 + 2×3 + 3×5 + 4×4 + 5×6 = 69
Σ(x²) = 1 + 4 + 9 + 16 + 25 = 55
b = (5×69 - 15×20) / (5×55 - 15²) = (345 - 300) / (275 - 225) = 45 / 50 = 0.9
a = (20 - 0.9×15) / 5 = (20 - 13.5) / 5 = 6.5 / 5 = 1.3
ŷ = 1.3 + 0.9x (for each unit increase in x, y increases by 0.9)
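The worked calculation above can be reproduced step by step in Python, making each intermediate sum explicit:

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

n = len(xs)                                   # 5
sum_x, sum_y = sum(xs), sum(ys)               # 15, 20
sum_xy = sum(x * y for x, y in zip(xs, ys))   # 69
sum_x2 = sum(x * x for x in xs)               # 55

b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # 45 / 50 = 0.9
a = (sum_y - b * sum_x) / n                                   # 6.5 / 5 = 1.3
```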
Example 2
Using the regression line ŷ = 1.3 + 0.9x, predict y when x = 7.
ŷ = 1.3 + 0.9 × 7
ŷ = 1.3 + 6.3
ŷ = 7.6
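Once a and b are known, prediction is a single substitution. A sketch of the step above (the helper `predict` is illustrative):

```python
a, b = 1.3, 0.9  # intercept and slope from Example 1

def predict(x):
    """Evaluate the regression line ŷ = a + b*x at x."""
    return a + b * x
```

Calling `predict(7)` reproduces the result ŷ = 7.6 (up to floating-point rounding).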
When to Use It
Use the least squares regression formula when:
- Finding a linear trend in a data set
- Predicting future values based on historical data
- Quantifying the relationship between two variables
- Analyzing experimental data in science and business research