
Bayes' Theorem

Reference for Bayes' Theorem: P(A|B) = P(B|A) × P(A) / P(B).
Update probabilities given new evidence.
Foundation of Bayesian inference, machine learning, and medical testing.

The Formula

P(A | B) = P(B | A) × P(A) / P(B)

Bayes' theorem describes how to update the probability of a hypothesis A given new evidence B. The posterior probability P(A | B) is computed from the likelihood P(B | A), the prior probability P(A), and the total probability of the evidence P(B).
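The formula follows in two steps from the definition of conditional probability. A short derivation sketch, assuming P(A) > 0 and P(B) > 0 so that both conditional probabilities are defined:

```latex
\begin{align*}
P(A \cap B) &= P(A \mid B)\,P(B) = P(B \mid A)\,P(A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{align*}
```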

Expanded Form

P(A | B) = P(B | A) × P(A) / [P(B | A) × P(A) + P(B | ¬A) × P(¬A)]

The denominator uses the law of total probability to express P(B) in terms of the hypothesis A and its complement. This expanded form is more useful in practice because P(B) is usually not given directly.
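The expanded form translates almost directly into code. The following is a minimal sketch, not a reference implementation; the function name `bayes_posterior` and its parameter names are illustrative choices, not part of any library.

```python
def bayes_posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) using the expanded form of Bayes' theorem.

    prior            -- P(A)
    p_b_given_a      -- P(B | A), the likelihood
    p_b_given_not_a  -- P(B | not A)
    """
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability in [0, 1]")
    # Numerator: P(B | A) * P(A)
    numerator = p_b_given_a * prior
    # Denominator: P(B) via the law of total probability
    evidence = numerator + p_b_given_not_a * (1.0 - prior)
    if evidence == 0.0:
        raise ValueError("P(B) is zero, so the posterior is undefined")
    return numerator / evidence
```

For example, `bayes_posterior(0.5, 0.8, 0.2)` evaluates 0.4 / (0.4 + 0.1), which is approximately 0.8.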

Variables

Symbol      Meaning
P(A)        Prior: probability of A before seeing evidence
P(B | A)    Likelihood: probability of evidence B given that A is true
P(B)        Marginal probability of evidence B (any cause)
P(A | B)    Posterior: updated probability of A after observing B
¬A          The complement of A (A is false)

Classic Example — Medical Test

A disease affects 1% of the population. A test is 99% sensitive (true positive rate) and 95% specific (true negative rate). If someone tests positive, what is the probability they actually have the disease?

P(Disease) = 0.01, P(No Disease) = 0.99

P(Positive | Disease) = 0.99

P(Positive | No Disease) = 1 − 0.95 = 0.05 (false positive rate)

P(Positive) = 0.99 × 0.01 + 0.05 × 0.99 = 0.0099 + 0.0495 = 0.0594

P(Disease | Positive) = (0.99 × 0.01) / 0.0594

P(Disease | Positive) = 0.167 or about 17%

This is the counterintuitive result that surprises most people: when the underlying condition is rare, a positive result from a "99% accurate" test is still a false alarm about 83% of the time. The reason is the large pool of false positives drawn from the much larger healthy population.
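Counting people instead of multiplying probabilities often makes the result easier to see. The sketch below applies the example's rates to a population of 10,000; that population size and the function name `natural_frequencies` are assumptions chosen for illustration.

```python
def natural_frequencies(population, prevalence, sensitivity, specificity):
    """Split a population into true and false positives and return the
    share of positive results that are genuine."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity               # sick and test positive
    false_positives = healthy * (1.0 - specificity)   # healthy but test positive
    share_genuine = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_genuine


tp, fp, share = natural_frequencies(10_000, 0.01, 0.99, 0.95)
print(f"true positives:  {tp:.0f}")    # 99
print(f"false positives: {fp:.0f}")    # 495
print(f"P(Disease | Positive) = {share:.3f}")  # 0.167
```

Of the 594 people who test positive, only 99 are actually sick, which is the same 17% obtained from the formula.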

Example — Spam Email Detection

Suppose 20% of emails are spam. The word "lottery" appears in 60% of spam and 1% of legitimate emails. If an email contains "lottery", what is the probability it is spam?

P(Spam) = 0.20, P(Not Spam) = 0.80

P("lottery" | Spam) = 0.60

P("lottery" | Not Spam) = 0.01

P("lottery") = 0.60 × 0.20 + 0.01 × 0.80 = 0.12 + 0.008 = 0.128

P(Spam | "lottery") = (0.60 × 0.20) / 0.128

P(Spam | "lottery") = 0.9375 or about 94%

This is the core mechanism of a naive Bayes spam classifier — combining many such word-level probabilities to score the overall message.
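One way to combine several word-level probabilities is to add log likelihood ratios to the prior log-odds. The sketch below is a simplified illustration, not a production classifier: it assumes words are conditionally independent given the class (the "naive" assumption) and scores only the words that are present. The `"lottery"` rates come from the example above; the `"meeting"` rates are hypothetical values invented for the demonstration.

```python
import math


def spam_probability(words, prior_spam, word_rates):
    """Return P(Spam | words) under the naive independence assumption.

    word_rates maps word -> (P(word | Spam), P(word | Not Spam)).
    Words missing from word_rates are ignored.
    """
    # Start from the prior odds, expressed in log space.
    log_odds = math.log(prior_spam / (1.0 - prior_spam))
    for word in words:
        if word in word_rates:
            p_spam, p_ham = word_rates[word]
            # Each word contributes its log likelihood ratio.
            log_odds += math.log(p_spam / p_ham)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


WORD_RATES = {
    "lottery": (0.60, 0.01),  # rates from the example above
    "meeting": (0.02, 0.10),  # hypothetical rates, for illustration only
}

print(spam_probability(["lottery"], 0.20, WORD_RATES))
print(spam_probability(["lottery", "meeting"], 0.20, WORD_RATES))
```

With "lottery" alone the score matches the hand calculation (about 0.9375); adding the hypothetical "meeting" evidence pulls it down to about 0.75. A real filter would also smooth the rates so that no word has a probability of exactly zero.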

Intuition — Why Priors Matter

If a hypothesis is rare to begin with, even strong evidence may not lift it past 50%. If a hypothesis is common to begin with, even weak evidence can leave it highly probable. The prior P(A) is not optional: it sets the baseline from which the evidence moves the posterior. The table below holds the evidence fixed and varies only the prior.

Prior P(A)    P(B|A)    P(B|¬A)    Posterior P(A|B)
0.001         0.99      0.05       0.0194
0.01          0.99      0.05       0.167
0.10          0.99      0.05       0.688
0.50          0.99      0.05       0.952
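The rows above can be recomputed with a short loop, and the same algebra gives the prior at which this evidence lifts the posterior to exactly 50%. This is a sketch; the names `posterior` and `break_even_prior` are illustrative.

```python
def posterior(prior, p_b_given_a=0.99, p_b_given_not_a=0.05):
    """P(A | B) for the likelihoods used in the table."""
    numerator = p_b_given_a * prior
    return numerator / (numerator + p_b_given_not_a * (1.0 - prior))


def break_even_prior(p_b_given_a=0.99, p_b_given_not_a=0.05):
    """Prior at which the posterior equals 0.5.

    Setting P(B|A) * p = P(B|not A) * (1 - p) and solving for p.
    """
    return p_b_given_not_a / (p_b_given_a + p_b_given_not_a)


for prior in (0.001, 0.01, 0.10, 0.50):
    print(f"prior = {prior:<6} posterior = {posterior(prior):.4f}")

print(f"break-even prior = {break_even_prior():.4f}")
```

For these likelihoods the break-even prior is about 0.048, so any hypothesis rarer than roughly 1 in 21 stays below 50% even after the evidence is observed.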

When to Use It

  • Medical diagnosis and screening test interpretation
  • Spam filtering and document classification (naive Bayes)
  • A/B test analysis and Bayesian inference
  • Genetic risk assessment given family history
  • Fault diagnosis in engineering systems given observed symptoms
  • Legal reasoning about probability of guilt given physical evidence
  • Search algorithms (Bayesian search for lost objects)

Sequential Updating

Bayes' theorem chains: today's posterior becomes tomorrow's prior. P(A | B₁, B₂) is computed by applying Bayes with P(A | B₁) as the new prior and B₂ as the new evidence. Strictly, the likelihood for the second step is P(B₂ | A, B₁), which reduces to P(B₂ | A) when the observations are conditionally independent given A. This makes Bayesian updating naturally suited to streaming evidence: each observation refines the estimate without restarting from scratch.
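A minimal sketch of chained updating, reusing the medical test numbers from the classic example. It assumes the two test results are conditionally independent given the patient's true status, which is an assumption made for illustration (repeat tests on the same person are often correlated in practice). The function names are hypothetical.

```python
def update(prior, p_b_given_a, p_b_given_not_a):
    """One application of Bayes' theorem in its expanded form."""
    numerator = p_b_given_a * prior
    return numerator / (numerator + p_b_given_not_a * (1.0 - prior))


def sequential_update(prior, observations):
    """Apply Bayes once per observation, feeding each posterior back in
    as the next prior. Each observation is (P(B | A), P(B | not A)).
    Assumes observations are conditionally independent given A."""
    belief = prior
    history = [belief]
    for p_b_given_a, p_b_given_not_a in observations:
        belief = update(belief, p_b_given_a, p_b_given_not_a)
        history.append(belief)
    return history


# Two positive results from a test with 99% sensitivity, 5% false positive rate.
history = sequential_update(0.01, [(0.99, 0.05), (0.99, 0.05)])
for step, belief in enumerate(history):
    print(f"after {step} positive test(s): {belief:.3f}")
```

Under that independence assumption, the belief moves from 0.010 to about 0.167 after one positive result and to about 0.798 after two.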

The Common Mistake — Base Rate Neglect

The most common error in informal probabilistic reasoning is ignoring the prior. People presented with the test scenario above often answer "99% chance of disease" because they focus on the test's accuracy and overlook the disease's rarity. Bayes' theorem is the mathematical antidote to this — it forces you to combine the test result with the base rate.
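A simulation is another way to check the intuition. The sketch below draws a synthetic population using the rates from the test scenario and measures how many positive results are genuine. The population size, the seed, and the function name are arbitrary choices for illustration.

```python
import random


def simulate_share_genuine(n_people, prevalence, sensitivity, specificity, seed=0):
    """Estimate P(Disease | Positive) by simulating n_people test results."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_pos = 0
    false_pos = 0
    for _ in range(n_people):
        has_disease = rng.random() < prevalence
        if has_disease:
            tests_positive = rng.random() < sensitivity
        else:
            tests_positive = rng.random() < (1.0 - specificity)
        if tests_positive and has_disease:
            true_pos += 1
        elif tests_positive:
            false_pos += 1
    # Fraction of all positive results that belong to people with the disease.
    return true_pos / (true_pos + false_pos)


estimate = simulate_share_genuine(200_000, 0.01, 0.99, 0.95)
print(f"simulated P(Disease | Positive) = {estimate:.3f}")
```

The estimate should land near the 17% computed by hand rather than the 99% that base rate neglect suggests, with some run-to-run variation if the seed is changed.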

