Bayes’ theorem

Statement

  • P(A) : Probablities of A
  • P(B) : Probablities of B
  • P(A|B) : Conditional probablities of A given B
  • P(B|A) : Conditional probablities of B given A

P(A|B) = \frac{P(B|A)\, P(A)}{P(B)}\cdot

Example

  • W : The event that the conversation was held with a woman
  • L : The event that the conversation was held with a long-haired person
  • P(W) = 0.5
  • P(L|W) = 0.75 (Suppose it is known that 75% of women have long hair)
  • P(L|M) = 0.15 (Suppose it is known that 15% of men have long hair)

P(W|L) &= \frac{P(L|W)\, P(W)}{P(L)} \\ &= \frac{P(L|W)\, P(W)}{P(L|W)\, P(W) + P(L|M)\, P(M)} \\ &= \frac{0.75\cdot0.50}{0.75\cdot0.50 + 0.15\cdot0.50} \\ &= \frac56 \\ &\approx 0.83

Interpretations

Bayesian interpretation

  • Probability measures a degree of belief.
  • For proposition A and evidence B,
    • P(A), the prior, is the initial degree of belief in A.
    • P(A|B), the posterior, is the degree of belief having accounted for B.
    • the quotient P(B|A)/P(B) represents the support B provides for A.

Frequentist interpretation

  • Probability measures a proportion of outcomes.
  • For example, suppose an experiment is performed many times.
    • P(A), the proportion of outcomes with property A.
    • P(B), that with property B.
    • P(B|A), the proportion of outcomes with property B out of outcomes with property A.
    • P(A|B), the proportion of those with A out of those with B.

Forms

Events

Simple form

P(A|B) = \frac{P(B | A)\, P(A)}{P(B)}\cdot

P(A|B) \propto P(A) \cdot P(B|A).

P(A|B) = c \cdot P(A) \cdot P(B|A) \ and \ P(\neg A|B) = c \cdot P(\neg A) \cdot P(B|\neg A)\cdot

c = \frac{1}{P(A) \cdot P(B|A) + P(\neg A) \cdot P(B|\neg A) }.

Extended form

P(B) = {\sum_j P(B|A_j) P(A_j)}, \implies P(A_i|B) = \frac{P(B|A_i)\,P(A_i)}{\sum\limits_j P(B|A_j)\,P(A_j)}\cdot

P(A|B) = \frac{P(B|A)\,P(A)}{ P(B|A) P(A) + P(B|\neg A) P(\neg A)}\cdot

Random variables

Simple form

f_X(x|Y=y) = \frac{P(Y=y|X=x)\,f_X(x)}{P(Y=y)}.

P(X=x|Y=y) = \frac{f_Y(y|X=x)\,P(X=x)}{f_Y(y)}.

f_X(x|Y=y) = \frac{f_Y(y|X=x)\,f_X(x)}{f_Y(y)}.

Extended form

f_Y(y) = \int_{-\infty}^\infty f_Y(y|X=\xi )\,f_X(\xi)\,d\xi .

Bayes’ rule

O(A_1:A_2|B) = O(A_1:A_2) \cdot \Lambda(A_1:A_2|B)

\Lambda(A_1:A_2|B) = \frac{P(B|A_1)}{P(B|A_2)}

O(A_1:A_2) = \frac{P(A_1)}{P(A_2)}

O(A_1:A_2|B) = \frac{P(A_1|B)}{P(A_2|B)}

Derivation

For events

P(A|B)=\frac{P(A \cap B)}{P(B)}, \text{ if } P(B) \neq 0, \!

P(B|A) = \frac{P(A \cap B)}{P(A)}, \text{ if } P(A) \neq 0, \!

\implies P(A \cap B) = P(A|B)\, P(B) = P(B|A)\, P(A), \!

\implies P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}, \text{ if } P(B) \neq 0.

For random variables

f_X(x|Y=y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}

f_Y(y|X=x) = \frac{f_{X,Y}(x,y)}{f_X(x)}

\implies f_X(x|Y=y) = \frac{f_Y(y|X=x)\,f_X(x)}{f_Y(y)}.

Examples

Frequentist example

P(\text{Rare}|\text{Pattern}) &= \frac{P(\text{Pattern}|\text{Rare})P(\text{Rare})} {P(\text{Pattern}|\text{Rare})P(\text{Rare}) \, + \, P(\text{Pattern}|\text{Common})P(\text{Common})} \\ &= \frac{0.98 \times 0.001} {0.98 \times 0.001 + 0.05 \times 0.999} \\ &\approx 1.9\%.

Coin flip example

P(\text{Biased coin}) &= \frac{1}{3} \\ P(\text{Fair coin}) &= \frac{2}{3} \\ P(\text{H}|\text{Fair coin}) &= \frac{1}{2} \\ P(\text{HHH}|\text{Fair coin}) &= \frac{1}{8} \\ P(\text{HHH}|\text{Biased coin}) &= 1 \\ P(\text{Biased coin}|\text{HHH}) &= \frac{P(\text{HHH}|\text{Biased coin})P(\text{Biased coin})}{P(\text{HHH}|\text{Biased coin})P(\text{Biased coin}) + P(\text{HHH}|\text{Fair coin})P(\text{Fair coin})} \\ &= \frac{1 \times \frac{1}{3}}{1 \times \frac{1}{3} + \frac{1}{8} \times \frac{2}{3}} \quad = \quad \frac{\frac{1}{3}}{\frac{10}{24}} \quad = \quad \frac{4}{5}

Drug testing

P(\text{User}|\text{+}) &= \frac{P(\text{+}|\text{User}) P(\text{User})}{P(\text{+}|\text{User}) P(\text{User}) + P(\text{+}|\text{Non-user}) P(\text{Non-user})} \\ &= \frac{0.99 \times 0.005}{0.99 \times 0.005 + 0.01 \times 0.995} \\ &\approx 33.2\%