Grokking Bayes' Theorem

This guide walks through a practical example that illustrates Bayes' theorem using a college major scenario. Let's explore how our intuition can sometimes lead us astray and how Bayes' theorem helps us reason correctly about probabilities.

The Problem Statement

Consider the following description of a person named Ahmed:

Ahmed is outgoing and confident.

Question 1: Based solely on this description, which seems more likely?

Ahmed is a communications major
Ahmed is a STEM major

Take a moment to think about your intuitive answer before proceeding.

Examining Our Intuition

Many people intuitively choose "communications major" because the description seems to match stereotypical traits we associate with students in communications programs. This is what psychologists call the representativeness heuristic: judging probability by how well something matches our mental prototype.

Question 2: What critical information are we missing when we judge based on the description alone?

We're missing the base rates - how common communications majors and STEM majors are in the college population. This is crucial information for making an accurate probability assessment.

Adding Base Rates

For this example, assume a simplified campus where every student is either a communications major or a STEM major, and STEM majors significantly outnumber communications majors:

Approximately 15% of students are communications majors
Approximately 85% of students are STEM majors

This means STEM majors are nearly six times as common as communications majors (85 / 15 ≈ 5.7) in the overall student population.

Question 3: How should this information affect our probability estimate?

This drastically changes the calculation. Even if the description matches communications majors better, we need to account for the fact that we're much more likely to randomly select a STEM major than a communications major from the student population.

Visualizing the Problem

Let's represent our student population as a grid where:

Communications majors are represented in orange (a smaller portion on the left)
STEM majors are represented in blue (the much larger remaining portion)

Question 4: If we were to randomly select a student from this population, what is the probability they would be a communications major?

15%. In our simplified visualization, 30 of the 200 students are communications majors (30 communications + 170 STEM), and 30 / 200 = 0.15.

Accounting for the Description

Now let's consider how well the description matches each major:

Suppose about 70% of communications majors are outgoing and confident (21 out of our 30 communications majors)
Suppose only about 30% of STEM majors are outgoing and confident (51 out of our 170 STEM majors)

We can visualize this by highlighting the portion of each group that matches the description, as shown in the middle section of our visualization.

Question 5: Even though a higher percentage of communications majors match the description, why might there still be more STEM majors who match it?

Because STEM majors greatly outnumber communications majors in the total population. Even though only 30% of STEM majors match the description compared to 70% of communications majors, more STEM majors match it in absolute terms: 51 STEM majors vs. 21 communications majors.

Calculating with Bayes' Theorem

Let's use the numbers from our visualization:

We have 30 communications majors in our population
We have 170 STEM majors
70% of communications majors (21 people) match the description
30% of STEM majors (51 people) match the description

Question 6: Among students matching the description, how many are communications majors and how many are STEM majors?

21 communications majors and 51 STEM majors match the description.

Question 7: What is the probability that someone matching the description is a communications major?

$$P(\text{Comm} | \text{Description}) = \frac{21}{21 + 51} = \frac{21}{72} \approx 29.2\%$$

Even though the description matches communications majors at a higher rate, the probability that Ahmed is a communications major is still only about 29%. As the visualization shows, many more STEM majors match the description simply because STEM majors are so much more common.
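
The counting argument can be checked in a few lines of Python. This is a minimal sketch using the guide's illustrative numbers; the variable names are ours, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# Illustrative population from the guide (assumed numbers, not real enrollment data).
comm_total = 30
stem_total = 170

# Share of each group assumed to match "outgoing and confident".
comm_match_rate = Fraction(7, 10)
stem_match_rate = Fraction(3, 10)

# Head counts of students who match the description in each group.
comm_match = comm_total * comm_match_rate   # 21
stem_match = stem_total * stem_match_rate   # 51

# Among everyone who matches, the fraction who are communications majors.
p_comm_given_desc = comm_match / (comm_match + stem_match)

print(comm_match, stem_match)             # 21 51
print(p_comm_given_desc)                  # 7/24
print(f"{float(p_comm_given_desc):.1%}")  # 29.2%
```

Restricting attention to the 72 students who match the description is all that "conditioning on the evidence" means here.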

Formalizing with Bayes' Theorem

Bayes' theorem provides the mathematical framework for this kind of reasoning:

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$

Where:

$$P(A|B)$$ is the probability of A given B has occurred (posterior)
$$P(B|A)$$ is the probability of B given A (likelihood)
$$P(A)$$ is the probability of A (prior)
$$P(B)$$ is the probability of B (evidence)

For our problem:

$$P(\text{Comm} | \text{Description}) = \frac{P(\text{Description} | \text{Comm}) \times P(\text{Comm})}{P(\text{Description})}$$

$$P(\text{Comm} | \text{Description}) = \frac{0.7 \times 0.15}{(0.7 \times 0.15) + (0.3 \times 0.85)} = \frac{0.105}{0.36} \approx 29.2\%$$
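
The denominator in the second line comes from the law of total probability. Assuming, as this example does, that every student is either a communications major or a STEM major, the evidence term expands as:

$$P(\text{Description}) = P(\text{Description} | \text{Comm}) \times P(\text{Comm}) + P(\text{Description} | \text{STEM}) \times P(\text{STEM})$$

$$P(\text{Description}) = (0.7 \times 0.15) + (0.3 \times 0.85) = 0.105 + 0.255 = 0.36$$

So 36% of all students match the description, which agrees with the count of 72 matching students out of 200 in the visualization.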

Question 8: If the description matched 90% of communications majors but still only 30% of STEM majors, how would that change our result?

$$P(\text{Comm} | \text{Description}) = \frac{0.9 \times 0.15}{(0.9 \times 0.15) + (0.3 \times 0.85)} = \frac{0.135}{0.39} \approx 34.6\%$$

The probability increases, but it's still more likely (about 65.4%) that Ahmed is a STEM major.
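
To experiment with other values, the formula can be wrapped in a small function. This is a sketch, not part of the original guide: the name `posterior` and its parameters are hypothetical, and it assumes exactly two mutually exclusive hypotheses (communications or STEM).

```python
def posterior(prior: float, likelihood: float, likelihood_alt: float) -> float:
    """Return P(H | E) for a two-hypothesis problem via Bayes' theorem.

    prior:          P(H), e.g. the share of communications majors
    likelihood:     P(E | H), share of H that matches the description
    likelihood_alt: P(E | not H), share of the other group that matches
    """
    evidence = likelihood * prior + likelihood_alt * (1.0 - prior)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    return likelihood * prior / evidence


# The guide's original numbers, then the Question 8 variant.
print(f"{posterior(0.15, 0.7, 0.3):.1%}")  # 29.2%
print(f"{posterior(0.15, 0.9, 0.3):.1%}")  # 34.6%
# Even a description that fits every communications major leaves STEM ahead.
print(f"{posterior(0.15, 1.0, 0.3):.1%}")  # 37.0%
```

With a 15% base rate and a 30% match rate among STEM majors, no value of the communications match rate can push the posterior past 50%.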

The Heart of Bayes' Theorem

The fundamental insight of Bayesian reasoning is that we need to consider both:

How likely each hypothesis is to begin with (prior probabilities)
How well the evidence fits each hypothesis (likelihoods)

As shown in the visualization:

The left panel shows all possibilities with their original distribution
The middle panel shows how the evidence restricts the space of possibilities
The right panel shows how we calculate the final probability within that restricted space
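
The three panels can also be mimicked with a simulation: draw many students, keep only those who match the description, and count the communications majors among them. This is a rough sketch assuming the guide's illustrative rates; the function name `simulate` and the fixed seed are our choices, made for reproducibility.

```python
import random


def simulate(n_students: int = 100_000, seed: int = 0) -> float:
    """Estimate P(Comm | Description) by sampling a synthetic population."""
    rng = random.Random(seed)
    matching = 0        # students who match the description (middle panel)
    matching_comm = 0   # of those, communications majors (right panel)

    for _ in range(n_students):
        # Left panel: assign a major using the assumed 15% / 85% base rates.
        is_comm = rng.random() < 0.15
        # The chance of matching the description depends on the major.
        match_rate = 0.7 if is_comm else 0.3
        if rng.random() < match_rate:
            matching += 1
            matching_comm += is_comm

    return matching_comm / matching


estimate = simulate()
print(f"Simulated P(Comm | Description) = {estimate:.3f}")
```

With a sample this large, the estimate should land close to the exact value of 21/72 ≈ 0.292; smaller samples will wander further from it.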

Question 9: Why is it incorrect to only consider how well the description matches each major?

Because we'd be ignoring the base rates, that is, how common each major is in the student population. This is the base rate fallacy: overweighting how well the evidence matches a hypothesis and underweighting the prior probabilities.
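
One compact way to see both ingredients at once is the odds form of Bayes' theorem, which is equivalent to the formula given earlier. The posterior odds are the prior odds multiplied by the likelihood ratio:

$$\frac{P(\text{Comm} | \text{Description})}{P(\text{STEM} | \text{Description})} = \frac{P(\text{Comm})}{P(\text{STEM})} \times \frac{P(\text{Description} | \text{Comm})}{P(\text{Description} | \text{STEM})} = \frac{0.15}{0.85} \times \frac{0.7}{0.3} = \frac{0.105}{0.255} \approx 0.41$$

The description favors communications by a factor of about 2.3, but the base rates favor STEM by a factor of about 5.7, so the odds still come out below 1. Odds of roughly 0.41 to 1 correspond to the same 29.2% probability, since 0.41 / (1 + 0.41) ≈ 0.29.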

Real-World Applications

The same reasoning applies in many real-world scenarios.

Question 10: How might this type of reasoning be relevant when interpreting personality test results?

When a personality test suggests someone has traits often associated with a certain profession or personality type, we should consider not just how well the traits match the stereotype, but also how common that profession or personality type is in the general population.

Question 11: How could Bayesian reasoning help with making predictions about student success in different programs?

When trying to predict which program a student might succeed in based on their traits, we should consider both how well their traits match successful students in each program AND how many students succeed in each program overall. A program with higher overall success rates might be a better bet even if another program seems to match their traits slightly better.

Conclusion

The communications/STEM major example highlights a common error in probabilistic reasoning: ignoring base rates. Bayes' theorem provides a formal framework for combining prior knowledge with new evidence to reach more accurate conclusions.

Remember:

Intuition often focuses on how well evidence matches our hypotheses
Proper Bayesian reasoning requires also considering how common each hypothesis is to begin with
When base rates are extreme, they can outweigh even strong evidence to the contrary
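
The last point can be made concrete by holding the evidence fixed and varying only the base rate. This is a short sketch assuming the guide's 70% / 30% match rates; the helper name `posterior` is hypothetical.

```python
def posterior(prior: float, likelihood: float = 0.7, likelihood_alt: float = 0.3) -> float:
    """P(Comm | Description) for a given base rate of communications majors."""
    numerator = likelihood * prior
    return numerator / (numerator + likelihood_alt * (1.0 - prior))


# Same evidence, different base rates.
for prior in (0.05, 0.15, 0.30, 0.50):
    print(f"prior = {prior:.0%}  ->  posterior = {posterior(prior):.1%}")
```

Under these assumptions the posterior only reaches 50% once communications majors make up 30% of the population; below that, the base rate outweighs the description.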

Final Question: Can you think of a situation in your own life where you might have fallen prey to the base rate fallacy? How could you apply Bayesian reasoning to avoid this error in the future?

Acknowledgment:

Images are inspired by 3Blue1Brown