Bayes' theorem tells you how to update a probability after seeing new evidence. If P(B) > 0, then

P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)}

It answers a very specific question: after event B has happened, how likely is event A now? The idea matters in medical testing, spam filtering, and any situation where evidence can be misleading unless you also account for how common the event was to begin with.

Bayes' theorem formula in plain language

Bayes' theorem combines three ingredients:

  • start with what you believed before the evidence, P(A)
  • ask how compatible the evidence is with that event, P(B \mid A)
  • scale by how common the evidence is overall, P(B)

The result, P(A \mid B), is called the posterior probability.

What each part of the formula means

In

P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)}

P(A) is the prior. It is your starting probability for A before you use the new evidence.

P(B \mid A) is the likelihood. It tells you how likely the evidence B is if A is true.

P(B) is the probability of the evidence overall. This term matters because some evidence is common even when A is false.

P(A \mid B) is the posterior. It is the updated probability of A after learning that B happened.
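Put together, the update is one line of arithmetic. A minimal Python sketch (the function name and example numbers are illustrative, not from the article):

```python
def bayes_posterior(prior, likelihood, evidence):
    """Posterior P(A|B) from the prior P(A), likelihood P(B|A), and evidence P(B)."""
    if evidence <= 0:
        raise ValueError("Bayes' theorem requires P(B) > 0")
    return likelihood * prior / evidence

# Illustrative numbers: P(A) = 0.3, P(B|A) = 0.8, P(B) = 0.5
print(bayes_posterior(0.3, 0.8, 0.5))  # ≈ 0.48
```

The guard clause mirrors the condition stated at the top of the article: the formula is only defined when P(B) > 0.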

Why the denominator changes the answer

Bayes' theorem does not just reward evidence that fits your hypothesis. It also asks whether that same evidence happens a lot anyway.

That is why the denominator P(B) matters. If the evidence is common across many cases, seeing it should not change your belief very much. If the evidence is rare except when A is true, it can shift your belief a lot.

Short proof from conditional probability

Assume P(B) > 0 and P(A) > 0 where needed. By the definition of conditional probability,

P(A \mid B) = \frac{P(A \cap B)}{P(B)}

and

P(B \mid A) = \frac{P(A \cap B)}{P(A)}

From the second equation,

P(A \cap B) = P(B \mid A)P(A)

Substitute that into the first equation:

P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B)}

That is Bayes' theorem.

Worked Bayes' theorem example: a positive medical test

Suppose a disease affects 1% of a population. A test is 99% sensitive and has a 5% false-positive rate.

Let

  • D = the person has the disease
  • + = the test is positive

Then

P(D) = 0.01, \quad P(+ \mid D) = 0.99, \quad P(+ \mid D^c) = 0.05

We want P(D \mid +), the probability that a person actually has the disease given a positive test.

First find the overall probability of a positive result. A positive test can happen in two ways: the person has the disease and tests positive, or the person does not have the disease and still tests positive.

P(+) = P(+ \mid D)P(D) + P(+ \mid D^c)P(D^c)

P(+) = (0.99)(0.01) + (0.05)(0.99) = 0.0594

Now apply Bayes' theorem:

P(D \mid +) = \frac{P(+ \mid D)P(D)}{P(+)} = \frac{(0.99)(0.01)}{0.0594} = \frac{0.0099}{0.0594} = \frac{1}{6} \approx 0.167

So the chance of actually having the disease after one positive test is about 16.7%, not 99%. The test is strong, but the disease is rare, so most positive results still come from the much larger group without the disease.

This is the main lesson many people miss: even a strong test can produce a modest posterior probability when the condition is rare to begin with.
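The arithmetic above is easy to check in a few lines of Python (a sketch using the article's numbers; the variable names are mine):

```python
# Reproduce the worked example: 1% prevalence, 99% sensitivity, 5% false positives.
prior = 0.01           # P(D)
sensitivity = 0.99     # P(+ | D)
false_positive = 0.05  # P(+ | not D)

# Total probability of a positive test (law of total probability).
p_positive = sensitivity * prior + false_positive * (1 - prior)

# Bayes' theorem.
posterior = sensitivity * prior / p_positive

print(f"P(+)   = {p_positive:.4f}")  # 0.0594
print(f"P(D|+) = {posterior:.4f}")   # 0.1667
```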

A useful two-case version of Bayes' theorem

If the evidence can come from two complementary cases, A and A^c, then

P(B) = P(B \mid A)P(A) + P(B \mid A^c)P(A^c)

Using that in Bayes' theorem gives

P(A \mid B) = \frac{P(B \mid A)P(A)}{P(B \mid A)P(A) + P(B \mid A^c)P(A^c)}

This form is often the most practical one in two-case problems.
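This two-case form drops straight into code. A small Python sketch (the function and argument names are mine, not standard):

```python
def two_case_posterior(prior, like_if_true, like_if_false):
    """P(A|B) when the evidence B splits over A and its complement A^c."""
    evidence = like_if_true * prior + like_if_false * (1 - prior)
    return like_if_true * prior / evidence

# The medical-test example: prior 1%, sensitivity 99%, false positives 5%.
print(round(two_case_posterior(0.01, 0.99, 0.05), 4))  # ≈ 0.1667
```

Note that the denominator is computed from the same two quantities as the numerator, so you never need P(B) as a separate input.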

Common Bayes' theorem mistakes

Mixing up P(A \mid B) and P(B \mid A)

These probabilities usually are not equal. A positive test can be very likely when a disease is present, while the disease can still be fairly unlikely after a positive test.

Ignoring the base rate

The prior P(A) matters. If A is very rare, then even strong evidence may not push the posterior as high as intuition expects.

Computing P(B) too narrowly

The denominator is not just a leftover term. It is the total probability of the evidence and often requires adding contributions from multiple cases.

Using the formula when P(B) = 0

Bayes' theorem in this form requires P(B) > 0. If the evidence has probability 0, the conditional probability P(A \mid B) is not defined by the basic formula.

When Bayes' theorem is used

Bayes' theorem appears in medical testing, spam filtering, reliability analysis, machine learning, and scientific inference. In each case, the same idea appears: update a belief when new information arrives.

It is especially useful when people tend to overreact to evidence without asking how common the event was in the first place.

Try a similar Bayes' theorem problem

Keep the same medical test, but change the disease rate from 1% to 10%. The sensitivity and false-positive rate stay the same, but the posterior changes a lot. Working that version once is a fast way to feel why the prior matters.
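If you want to verify your answer afterward, the same two-case arithmetic in Python (a sketch; try the hand calculation first):

```python
prior = 0.10                                    # disease rate raised from 1% to 10%
p_positive = 0.99 * prior + 0.05 * (1 - prior)  # total probability of a positive test
posterior = 0.99 * prior / p_positive           # P(D | +)
print(round(posterior, 4))                      # noticeably higher than the 1% case
```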
