Chi-Squared Test Formula
Reference for χ² = Σ(O-E)²/E for goodness-of-fit and independence tests.
Covers degrees of freedom, critical values, p-values, and contingency tables.
The Formula
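The chi-squared test compares observed frequencies to expected frequencies: χ² = Σ(O − E)²/E, summed over all categories. A large χ² value, relative to the critical value for the test's degrees of freedom, indicates that the observed counts differ significantly from what was expected.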
Variables
| Symbol | Meaning |
|---|---|
| χ² | Chi-squared test statistic |
| O | Observed frequency (actual count) |
| E | Expected frequency (the count predicted under the null hypothesis) |
| Σ | Sum across all categories |
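As a minimal sketch, the statistic can be computed directly from the definitions in the table above. The function name `chi_squared` and the sample counts are illustrative assumptions, not part of any library.

```python
def chi_squared(observed, expected):
    """Return the chi-squared statistic: the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: three categories with 20 expected in each.
stat = chi_squared([18, 25, 17], [20, 20, 20])
print(f"chi2 = {stat:.2f}")
```

The function only computes the statistic; interpreting it still requires the degrees of freedom and a critical value or p-value.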
Example 1
A die is rolled 60 times. Expected: 10 per face. Observed: 8, 12, 10, 11, 7, 12.
χ² = (8-10)²/10 + (12-10)²/10 + (10-10)²/10 + (11-10)²/10 + (7-10)²/10 + (12-10)²/10
χ² = 0.4 + 0.4 + 0 + 0.1 + 0.9 + 0.4
χ² = 2.2 (with 5 degrees of freedom, p ≈ 0.82, well above 0.05; there is no evidence that the die is unfair)
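The p-value for this example can be checked without statistical tables. The sketch below uses only Python's standard library. `chi2_sf` is a hypothetical helper name, and it assumes integer degrees of freedom so that the recurrence Q(a + 1, x) = Q(a, x) + x^a e^(−x) / Γ(a + 1) for the regularized upper incomplete gamma function can be started from a known closed form.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability P(X >= stat) for a chi-squared variable with integer df >= 1."""
    if stat <= 0:
        return 1.0
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)                 # Q(1, x) = e^(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))      # Q(1/2, x) = erfc(sqrt(x))
    # Step a upward by 1 until it reaches df / 2, adding one term per step.
    while a < df / 2.0:
        q += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return q


observed = [8, 12, 10, 11, 7, 12]
expected = [10] * 6
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                           # categories - 1
p_value = chi2_sf(stat, df)
print(f"chi2 = {stat:.1f}, df = {df}, p = {p_value:.2f}")
```

The resulting p-value is far above 0.05, so the die data give no reason to reject fairness.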
Example 2
Survey: 200 people chose colors. Expected 50 each. Observed: Red=70, Blue=55, Green=40, Yellow=35
χ² = (70-50)²/50 + (55-50)²/50 + (40-50)²/50 + (35-50)²/50
χ² = 8.0 + 0.5 + 2.0 + 4.5
χ² = 15.0 (with 3 df, p ≈ 0.002, so p < 0.01; strong evidence that the colors are not equally preferred)
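It can help to see which categories drive a large statistic. The following sketch recomputes the survey example term by term; the dictionary layout and variable names are illustrative choices.

```python
observed = {"Red": 70, "Blue": 55, "Green": 40, "Yellow": 35}
expected_each = sum(observed.values()) / len(observed)   # 200 / 4 = 50

# Contribution of each category: (O - E)^2 / E.
contributions = {
    color: (count - expected_each) ** 2 / expected_each
    for color, count in observed.items()
}
stat = sum(contributions.values())

# List the categories from largest to smallest contribution.
for color, value in sorted(contributions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{color}: {value:.1f}")
print(f"chi2 = {stat:.1f}")
```

Red accounts for more than half of the total, so the excess of Red choices is the main source of the departure from equal preference.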
When to Use It
Use the chi-squared test when:
- Testing if a die, coin, or random process is fair
- Determining if two categorical variables are independent
- Comparing survey responses across different groups
- Checking if observed genetic ratios match expected Mendelian ratios
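For the Mendelian case the expected counts are not equal, so they have to be derived from the hypothesized ratio first. A minimal sketch, assuming a 9:3:3:1 ratio and made-up offspring counts:

```python
ratio = [9, 3, 3, 1]             # hypothesized 9:3:3:1 phenotype ratio
observed = [95, 28, 27, 10]      # hypothetical counts from 160 offspring

total = sum(observed)
# Scale the ratio to the sample size: [90.0, 30.0, 30.0, 10.0].
expected = [total * r / sum(ratio) for r in ratio]

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1           # goodness-of-fit: categories - 1

critical_value = 7.815           # standard table value for df = 3 at the 0.05 level
reject = stat > critical_value
print(f"chi2 = {stat:.3f}, df = {df}, reject H0: {reject}")
```

With these illustrative counts the statistic is well below the critical value, so the data are consistent with the 9:3:3:1 ratio.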
Key Notes
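- Formula: χ² = Σ(O − E)² / E, where O is the observed count in each cell and E is the expected count if the null hypothesis were true (no association for an independence test, or the hypothesized distribution for a goodness-of-fit test). Larger χ² values indicate stronger evidence against the null hypothesis; because χ² grows with sample size, it does not by itself measure the strength of an association.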
- Degrees of freedom: For a contingency table, df = (rows − 1) × (columns − 1). For a goodness-of-fit test, df = (number of categories − 1). Use df to look up the critical value.
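- Minimum expected frequency rule: As a common rule of thumb, each expected cell count should be at least 5, because the χ² approximation becomes unreliable when expected counts are small. If not, combine categories or use Fisher's Exact Test instead, which remains valid for small samples.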
- Tests association, not causation: A significant χ² is evidence that two variables are associated; it cannot show that one causes the other.
- One-tailed vs two-tailed: The chi-squared test is always right-tailed (upper-tailed), because any departure of O from E, in either direction, increases χ². You reject H₀ when χ² exceeds the critical value for your chosen significance level (typically 0.05).
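The notes on degrees of freedom and expected counts can be tied together with a small test of independence. The sketch below uses a hypothetical 2 × 3 contingency table; the counts and variable names are assumptions for illustration. The closed-form p-value in the code is valid only because this table has exactly 2 degrees of freedom.

```python
import math

# Hypothetical 2 x 3 table: rows are two groups, columns are three answer options.
table = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)      # (rows - 1) * (columns - 1)

# Rule-of-thumb check: every expected count should be at least 5.
min_expected_ok = all(e >= 5 for row in expected for e in row)

# For df = 2 only, the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-stat / 2)

print(f"chi2 = {stat:.3f}, df = {df}, p = {p_value:.3f}, expected counts ok: {min_expected_ok}")
```

For these made-up counts the statistic falls below 5.991, the standard critical value for df = 2 at the 0.05 level, so independence would not be rejected.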