Grokking Bayes' Theorem

This guide walks through a practical example that illustrates Bayes' theorem using a college major scenario. Let's explore how our intuition can sometimes lead us astray and how Bayes' theorem helps us reason correctly about probabilities.

The Problem Statement

Consider the following description of a person named Ahmed:

Ahmed is outgoing and confident.

Question 1: Based solely on this description, which seems more likely: that Ahmed is a communications major or a STEM major?

Take a moment to think about your intuitive answer before proceeding.

Examining Our Intuition

Many people intuitively choose "communications major" because the description seems to match stereotypical traits we associate with students in communications programs. This relies on what psychologists call the representativeness heuristic: judging probability by how well something matches our mental prototype.

Question 2: What critical information are we missing when we make this judgment, information we would have to rely on if we disregarded his description entirely?

We're missing the base rates - how common communications majors and STEM majors are in the college population. This is crucial information for making an accurate probability assessment.

Adding Base Rates

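In many colleges and universities, STEM majors significantly outnumber communications majors. The example in this guide assumes a population in which 15% of students are communications majors and 85% are STEM majors.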

This means STEM majors are roughly 5-6 times more common than communications majors in the overall student population.

Question 3: How should this information affect our probability estimate?

This drastically changes the calculation. Even if the description matches communications majors better, we need to account for the fact that we're much more likely to randomly select a STEM major than a communications major from the student population.

Visualizing the Problem

Let's represent our student population as a grid of 200 students: 30 communications majors and 170 STEM majors.

[Figure: The Communications vs. STEM Major Problem. Top panel: base rates in the college population, 30 communications (15%) and 170 STEM (85%). Middle panel: students matching the "outgoing and confident" description, 70% of communications majors (21) and 30% of STEM majors (51). Bottom panel: students who match the description, 21 communications and 51 STEM, so P(Comm given description) = 21 / (21 + 51) ≈ 29.2%. Caption: Even though a higher percentage of communications majors match the description (70% vs. 30%), there are still more STEM majors who match because STEM majors are much more common.]

Question 4: If we were to randomly select a student from this population, what is the probability they would be a communications major?

15%. In our simplified visualization, we're showing 30 communications majors out of a total of 200 students (30 communications + 170 STEM), and 30 / 200 = 0.15.

Accounting for the Description

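Now let's consider how well the description matches each major. In our example, 70% of communications majors match the description, compared with 30% of STEM majors.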

We can visualize this by highlighting the portion of each group that matches the description, as shown in the middle section of our visualization.

Question 5: Even though a higher percentage of communications majors match the description, why might there still be more STEM majors who match it?

Because STEM majors greatly outnumber communications majors in the total population. In this case, even though only 30% of STEM majors match the description compared to 70% of communications majors, the absolute number is still larger: 51 STEM majors vs. 21 communications majors.
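
The counts above can be reproduced with a few lines of arithmetic. The following is a minimal Python sketch; the variable names are illustrative, and the numbers are the ones from this guide's simplified population of 200 students.

```python
# Population from the simplified visualization (200 students in total).
population = {"communications": 30, "stem": 170}

# Share of each major that matches the "outgoing and confident" description.
match_rate = {"communications": 0.70, "stem": 0.30}

# Number of matching students in each group (rounded to whole students).
matching = {
    major: round(count * match_rate[major])
    for major, count in population.items()
}

print(matching)  # {'communications': 21, 'stem': 51}
```

A smaller group with a high match rate can still contribute fewer matching students than a larger group with a low match rate.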

Calculating with Bayes' Theorem

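Let's use the numbers from our visualization: 30 communications majors, 70% of whom match the description, and 170 STEM majors, 30% of whom match it.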

Question 6: Among students matching the description, how many are communications majors and how many are STEM majors?

21 communications majors and 51 STEM majors match the description.

Question 7: What is the probability that someone matching the description is a communications major?

$$P(\text{Comm} | \text{Description}) = \frac{21}{21 + 51} = \frac{21}{72} \approx 29.2\%$$

Even though the description matches communications majors at a higher rate, the probability is still only about 29% that Ahmed is a communications major.

As shown in the visualization, even though the description is more representative of communications majors, there are still many more STEM majors who match it simply because STEM majors are so much more common.
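
One way to check this result is by simulation: repeatedly pick a random student, keep only those who match the description, and see what share of them are communications majors. The sketch below assumes the 15% base rate and the 70% and 30% match rates used in this guide; the function name `simulate`, the trial count, and the seed are arbitrary choices made for illustration.

```python
import random


def simulate(n_trials=100_000, seed=0):
    """Estimate P(Comm | Description) by sampling random students."""
    rng = random.Random(seed)
    matches = 0       # students who match the description
    comm_matches = 0  # matching students who are communications majors
    for _ in range(n_trials):
        # Draw the student's major according to the base rates.
        is_comm = rng.random() < 0.15
        # Draw whether this student matches the description.
        p_match = 0.70 if is_comm else 0.30
        if rng.random() < p_match:
            matches += 1
            if is_comm:
                comm_matches += 1
    return comm_matches / matches


estimate = simulate()
print(f"Estimated P(Comm | Description): {estimate:.3f}")
```

With enough trials, the estimate should land close to 21 / 72, or about 0.29.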

Formalizing with Bayes' Theorem

Bayes' theorem provides the mathematical framework for this kind of reasoning:

$$P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}$$

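Where P(A|B) is the probability of A given that B is true (the posterior), P(B|A) is the probability of B given that A is true (the likelihood), P(A) is the probability of A before seeing any evidence (the prior), and P(B) is the overall probability of B (the evidence).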

For our problem:

$$P(\text{Comm} | \text{Description}) = \frac{P(\text{Description} | \text{Comm}) \times P(\text{Comm})}{P(\text{Description})}$$

$$P(\text{Comm} | \text{Description}) = \frac{0.7 \times 0.15}{(0.7 \times 0.15) + (0.3 \times 0.85)} = \frac{0.105}{0.36} \approx 29.2\%$$
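
The same calculation can be written as a small function. This is a sketch for the two-hypothesis case only (communications versus STEM); the function and parameter names are illustrative and not part of any library.

```python
def posterior(prior, likelihood, likelihood_alt):
    """Return P(H | E) for a hypothesis H against a single alternative.

    prior          -- P(H), e.g. P(Comm)
    likelihood     -- P(E | H), e.g. P(Description | Comm)
    likelihood_alt -- P(E | not H), e.g. P(Description | STEM)
    """
    # Total probability of the evidence across both hypotheses.
    evidence = likelihood * prior + likelihood_alt * (1 - prior)
    return likelihood * prior / evidence


p_comm = posterior(prior=0.15, likelihood=0.70, likelihood_alt=0.30)
print(f"{p_comm:.3f}")  # 0.292
```

The denominator is P(Description) expanded over both majors, which is why the function needs the match rate for STEM majors as well.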

Question 8: If the description matched 90% of communications majors but still only 30% of STEM majors, how would that change our result?

$$P(\text{Comm} | \text{Description}) = \frac{0.9 \times 0.15}{(0.9 \times 0.15) + (0.3 \times 0.85)} = \frac{0.135}{0.39} \approx 34.6\%$$

The probability increases, but it's still more likely that Ahmed is a STEM major.
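
To see how far the likelihood alone can move the answer, the sketch below recomputes the posterior for several values of P(Description | Comm) while keeping the 15% prior and the 30% match rate for STEM majors fixed. The helper `posterior` is an illustrative function defined here, not a library call.

```python
def posterior(prior, likelihood, likelihood_alt):
    """Return P(H | E) for a hypothesis H against a single alternative."""
    evidence = likelihood * prior + likelihood_alt * (1 - prior)
    return likelihood * prior / evidence


# Posterior for each assumed value of P(Description | Comm).
results = {
    likelihood: posterior(0.15, likelihood, 0.30)
    for likelihood in (0.7, 0.9, 1.0)
}

for likelihood, p in results.items():
    print(f"P(Description | Comm) = {likelihood:.1f} -> "
          f"P(Comm | Description) = {p:.3f}")
```

Under these assumptions, even a description that fits every communications major leaves the posterior at roughly 37%, because the low base rate caps how much the evidence can help.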

The Heart of Bayes' Theorem

The fundamental insight of Bayesian reasoning is that we need to consider both:

  1. How likely each hypothesis is to begin with (prior probabilities)
  2. How well the evidence fits each hypothesis (likelihoods)

[Figure: The Heart of Bayes' Theorem. Top: all possibilities, communications (🎤) and STEM (🔬). Bottom: all possibilities fitting the description. P(Comm given description) = 🎤 / (🎤 + 🔬), counted among the possibilities fitting the description.]

If this line of reasoning, where seeing new evidence restricts the space of possibilities, makes sense to you, then congratulations! You understand the heart of Bayes' theorem.

Question 9: Why is it incorrect to only consider how well the description matches each major?

Because we'd be ignoring the base rates - how common each major is in the student population. This leads to the base rate fallacy, where we overemphasize the matching characteristics and undervalue the prior probabilities.
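
One way to see both ingredients side by side is the odds form of Bayes' theorem, an equivalent restatement that this guide does not otherwise use. The posterior odds equal the prior odds multiplied by the likelihood ratio:

$$\frac{P(\text{Comm} | \text{Description})}{P(\text{STEM} | \text{Description})} = \frac{P(\text{Comm})}{P(\text{STEM})} \times \frac{P(\text{Description} | \text{Comm})}{P(\text{Description} | \text{STEM})} = \frac{0.15}{0.85} \times \frac{0.7}{0.3} = \frac{7}{17}$$

Odds of 7 to 17 correspond to a probability of 7 / (7 + 17) = 7/24 ≈ 29.2%. Considering only how well the description matches amounts to keeping the likelihood ratio of 7/3 and dropping the prior odds of 3/17.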

Real-World Applications

Bayes' theorem is crucial in many real-world scenarios. The two questions below look at personality tests and predictions of student success.

Question 10: How might this type of reasoning be relevant when interpreting personality test results?

When a personality test suggests someone has traits often associated with a certain profession or personality type, we should consider not just how well the traits match the stereotype, but also how common that profession or personality type is in the general population.

Question 11: How could Bayesian reasoning help with making predictions about student success in different programs?

When trying to predict which program a student might succeed in based on their traits, we should consider both how well their traits match successful students in each program AND how many students succeed in each program overall. A program with higher overall success rates might be a better bet even if another program seems to match their traits slightly better.

Conclusion

The communications/STEM major example highlights a common error in probabilistic reasoning - ignoring base rates. Bayes' theorem provides a formal framework for combining prior knowledge with new evidence to reach more accurate conclusions.

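Remember to weigh both how likely each hypothesis is to begin with (the base rates) and how well the evidence fits each hypothesis (the likelihoods).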

Final Question: Can you think of a situation in your own life where you might have fallen prey to the base rate fallacy? How could you apply Bayesian reasoning to avoid this error in the future?

Acknowledgment:
Images are inspired by 3Blue1Brown.