CNN vs. The Onion Facilitator Guide

Authors and Affiliations

  • Laurie Baker, Bates College
  • Jim Scott, Colby College
  • Mine Dogucu, UC Irvine

Published: June 21, 2025

Modified: July 22, 2025

Overview

This guide accompanies the CNN vs. The Onion - Beta Binomial activity, which helps students explore Bayesian updating using data from classifying real vs. fake news headlines. Students should enter the activity with a basic understanding of probability. The lesson introduces how to construct and update Beta distributions as models of belief about a proportion.

Student Background and Course Placement

This activity is designed for students who have a foundational understanding of:

  • Basic probability concepts (e.g., probability of success/failure) in a Binomial distribution

  • What a probability distribution represents

  • The concept of conditional probability

Students do not need prior exposure to Bayesian statistics, but they should be comfortable with interpreting plots and understanding how beliefs or estimates can change with new information.

We expect that this activity would fit well in an Introduction to Probability and Statistics course, particularly one that includes elements of:

  • Statistical Inference
  • Bayesian concepts
  • Applied reasoning with real-world data

It could be introduced after students have learned about conditional probability and Bayes’ Rule.

Learning Objectives

By the end of this activity, students will be able to:

  • Define and distinguish between prior and posterior distributions.
  • Use the Beta distribution to construct prior beliefs about a probability.
  • Apply Bayesian updating to revise beliefs with observed data.
  • Interpret summary statistics (mean, mode, standard deviation) of Beta distributions.
  • Reflect on how prior information and data interact to shape posterior conclusions.

Activity Sequence

1. Introduce the Context

  • Present the quiz task: students will identify whether a headline came from CNN or The Onion.
  • Explain that students will make a prior prediction, then update it using quiz results.

2. Construct Priors

  • Students choose a prior model for the proportion of correct answers they expect people to get on the quiz and translate it into a Beta distribution.

    • Optimistic: Beta(8, 3)
    • Undecided: Beta(1, 1)
    • Pessimistic: Beta(5, 10)
    • Very Optimistic: Beta(140, 10)
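
To help facilitators anticipate what each reference prior implies, the sketch below computes the mean, mode, and standard deviation of the four priors from the standard Beta formulas. It is an illustrative Python sketch, not part of the activity materials (which use R); the helper name `beta_summary` is hypothetical.

```python
import math

def beta_summary(alpha, beta):
    """Return the mean, mode, and standard deviation of a Beta(alpha, beta) model."""
    total = alpha + beta
    mean = alpha / total
    # The mode formula applies only when both shape parameters exceed 1;
    # Beta(1, 1) is flat, so it has no single mode.
    mode = (alpha - 1) / (total - 2) if alpha > 1 and beta > 1 else None
    sd = math.sqrt(alpha * beta / (total ** 2 * (total + 1)))
    return {"mean": mean, "mode": mode, "sd": sd}

# The four reference priors listed in the activity.
reference_priors = {
    "Optimistic": (8, 3),
    "Undecided": (1, 1),
    "Pessimistic": (5, 10),
    "Very Optimistic": (140, 10),
}

prior_summaries = {name: beta_summary(a, b) for name, (a, b) in reference_priors.items()}

for name, s in prior_summaries.items():
    mode_text = "none" if s["mode"] is None else f"{s['mode']:.3f}"
    print(f"{name:>15}: mean={s['mean']:.3f}  mode={mode_text}  sd={s['sd']:.3f}")
```

A smaller standard deviation corresponds to a more informative prior, which is why Beta(140, 10) is far more concentrated than Beta(8, 3) even though both lean optimistic.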

3. Visualize Priors

  • Use plot_beta() to examine shapes of priors.
  • Students compare their prior to reference priors and discuss informativeness.

4. Take the Quiz

  • Students complete the quiz and count their correct responses.

5. Update with Data

  • Students update their prior using their own score: plot_beta_binomial() and summarize_beta_binomial().
  • Students repeat the update process, adding classmates’ scores (n = 30, then n = 45) and tallying the correct responses each time.
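
The arithmetic behind these updates is the standard Beta-Binomial conjugate rule: a Beta(α, β) prior combined with y correct answers out of n questions gives a Beta(α + y, β + n − y) posterior. The Python sketch below is a hedged illustration of that rule, not the code used in the activity; the scores are made up, the helper `update_beta` is hypothetical, and the 15 questions per quiz is inferred from the n values above.

```python
def update_beta(alpha, beta, y, n):
    """Conjugate Beta-Binomial update: y correct answers out of n questions."""
    return alpha + y, beta + (n - y)

QUESTIONS_PER_QUIZ = 15            # inferred from n = 30 and n = 45 in the activity
prior = (8, 3)                     # the Optimistic reference prior
hypothetical_scores = [11, 9, 13]  # made-up scores: own quiz, then two classmates

# Sequential updating: each posterior serves as the prior for the next quiz.
alpha, beta = prior
for score in hypothetical_scores:
    alpha, beta = update_beta(alpha, beta, score, QUESTIONS_PER_QUIZ)
sequential_posterior = (alpha, beta)

# Batch updating: pool all 45 questions and update the original prior once.
total_correct = sum(hypothetical_scores)
total_questions = QUESTIONS_PER_QUIZ * len(hypothetical_scores)
batch_posterior = update_beta(*prior, total_correct, total_questions)

print("sequential:", sequential_posterior)
print("batch:     ", batch_posterior)
```

Both routes end at the same posterior, so students can either update step by step or tally the cumulative totals and update the original prior once.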

6. Update with the Class Data or the Provided Data

  • Use the provided data or ask each student to share their score, then add the scores together. Because each quiz has 15 questions, n becomes 15 × the number of students.
  • Students update their prior using the class score: plot_beta_binomial() and summarize_beta_binomial().

7. Compare Posteriors

  • Visual comparison: look at posterior distributions from different priors side by side.
  • Numeric comparison: interpret changes in mean, mode, and standard deviation.
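
For the numeric comparison, it can help to see how the four reference priors respond to one shared data set. The sketch below is an illustrative Python calculation using the standard Beta formulas; it is not the bayesrules implementation, the data (30 correct out of 45) are hypothetical, and `posterior_summary` is a made-up helper name.

```python
import math

def posterior_summary(alpha, beta, y, n):
    """Summarize the Beta(alpha + y, beta + n - y) posterior."""
    a, b = alpha + y, beta + n - y
    total = a + b
    # The mode formula requires both posterior shape parameters to exceed 1.
    mode = (a - 1) / (total - 2) if a > 1 and b > 1 else None
    return {
        "posterior": (a, b),
        "mean": a / total,
        "mode": mode,
        "sd": math.sqrt(a * b / (total ** 2 * (total + 1))),
    }

reference_priors = {
    "Optimistic": (8, 3),
    "Undecided": (1, 1),
    "Pessimistic": (5, 10),
    "Very Optimistic": (140, 10),
}

y, n = 30, 45  # hypothetical data: 30 correct answers out of 45 questions

comparison = {name: posterior_summary(a, b, y, n) for name, (a, b) in reference_priors.items()}

for name, s in comparison.items():
    print(f"{name:>15}: Beta{s['posterior']}  mean={s['mean']:.3f}  sd={s['sd']:.3f}")
```

With the Undecided prior Beta(1, 1), the posterior mode equals the observed proportion y/n exactly, which gives students a concrete anchor when comparing the other posteriors.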

Reflection Guidance

Prior Informativeness
  • Prompt: Is your prior informative or vague?

  • Example Responses:

    • “My prior was vague because I used Beta(1,1).”
    • “Beta(10, 3) was informative since I was confident in my guess.”
Posterior Updates
  • Prompt: How did the posterior change after seeing data?

  • Example Responses:

    • “My posterior mean increased because people did better than expected on the quiz.”
    • “Even though my prior was optimistic, the data pulled the posterior lower because people did not do as well as expected.”
Adding More Data
  • Prompt: What happens as you add more data?

  • Example Responses:

    • “The distribution became more concentrated.”
    • “The standard deviation decreased.”
    • “The posterior gets closer to the proportion (number correct/total questions) from the data.”
Comparing Prior Models
  • Prompt: How did different priors respond to the same data?

  • Example Responses:

    • “The optimistic prior resulted in a higher posterior mean than the undecided one.”
    • “The undecided prior was the most influenced by the data.”
Big Picture
  • Prompt: What does this activity show about the role of prior models?

  • Example Responses:

    • “Priors matter when the data is limited.”
    • “The more data we have, the more it outweighs our prior.”
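
These two responses can be made precise with a short derivation. Assuming a Beta(\(\alpha\), \(\beta\)) prior and \(y\) correct answers out of \(n\) questions (notation introduced here for illustration), the posterior mean is a weighted average of the prior mean and the observed proportion:

\[
E(\pi \mid y) = \frac{\alpha + y}{\alpha + \beta + n}
= \frac{\alpha + \beta}{\alpha + \beta + n} \cdot \frac{\alpha}{\alpha + \beta}
+ \frac{n}{\alpha + \beta + n} \cdot \frac{y}{n}
\]

As \(n\) grows, the weight \(n / (\alpha + \beta + n)\) on the observed proportion approaches 1, so the data increasingly outweigh the prior. When \(n\) is small relative to \(\alpha + \beta\), the prior mean dominates.
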
Further Discussion Questions
  • What would a prior look like that represents being totally uncertain?
  • In what scenarios might you want an informative prior?
  • If two people see the same data but have different priors, should they agree?
Tips
  • Check that students understand: the Beta distribution is a model for belief about a probability.
  • Reinforce that the prior is not “wrong” even if it differs from the data; it reflects what we believed before seeing the data.
  • Encourage collaborative discussion when comparing prior/posterior plots.

Common questions that might arise:

Why is the Uniform Distribution Beta(1, 1)?
  • Beta(1,1) is the uniform distribution over the interval [0,1]. This means that before seeing any data, we assign equal plausibility to all possible values of \(\pi\), the probability that someone guesses a headline correctly.
  • We’re saying we have no reason to favor any particular value of \(\pi\) between 0 and 1. In contrast, Beta(8,8) assumes some structure: it implies we already believe the success rate is centered around 0.5 with some confidence.
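
If students ask for the algebra, the flat shape follows directly from the general Beta density, written here in the standard form with shape parameters \(\alpha\) and \(\beta\):

\[
f(\pi) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \, \pi^{\alpha - 1} (1 - \pi)^{\beta - 1}, \qquad 0 \le \pi \le 1
\]

Setting \(\alpha = \beta = 1\) gives

\[
f(\pi) = \frac{\Gamma(2)}{\Gamma(1)\,\Gamma(1)} \, \pi^{0} (1 - \pi)^{0} = 1,
\]

a constant density on [0, 1], which is the uniform distribution.
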
How to Interpret a Density Plot
  • Start by looking at the x-axis, which shows the possible values of the parameter (here, the probability \(\pi\) of guessing correctly), and the y-axis, which shows the plausibility or density of those values.
  • The peak of the curve indicates the most plausible value (the mode), and the width reflects uncertainty: narrow curves suggest high certainty, while wide curves indicate more uncertainty.
  • The area under the curve between two points represents how probable it is that the parameter falls in that range.
  • When comparing prior and posterior distributions, look at how the curve shifts and narrows to see how beliefs change with data.
  • Always describe what you see: Is the distribution centered high or low? Is it narrow or wide? Did the data change the belief significantly?
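
To make the "area under the curve" point concrete, the sketch below approximates the probability that \(\pi\) falls in an interval by numerically integrating a Beta density. It is a minimal Python illustration using a midpoint rule, not a function from the activity; the names `beta_pdf` and `interval_probability` and the chosen intervals are assumptions for the example.

```python
import math

def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x, for 0 < x < 1."""
    # Work on the log scale so large shape parameters do not overflow.
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return math.exp(log_norm + (alpha - 1) * math.log(x) + (beta - 1) * math.log(1 - x))

def interval_probability(lower, upper, alpha, beta, steps=10_000):
    """Approximate P(lower < pi < upper) with the midpoint rule."""
    width = (upper - lower) / steps
    # Midpoints lie strictly inside (lower, upper), so the logs above are defined.
    midpoints = (lower + (i + 0.5) * width for i in range(steps))
    return width * sum(beta_pdf(x, alpha, beta) for x in midpoints)

# Area between 0.6 and 0.9 under the Optimistic prior Beta(8, 3).
prob_optimistic = interval_probability(0.6, 0.9, 8, 3)

# Under the uniform prior Beta(1, 1), the area equals the interval width.
prob_uniform = interval_probability(0.2, 0.5, 1, 1)

print(f"Beta(8, 3): P(0.6 < pi < 0.9) is approximately {prob_optimistic:.3f}")
print(f"Beta(1, 1): P(0.2 < pi < 0.5) is approximately {prob_uniform:.3f}")
```

The uniform case is a useful sanity check for students: with a flat density of height 1, the probability of an interval is simply its length.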

References and Acknowledgements

  • This work was supported by the National Science Foundation under Grant Nos. 2215879, 2215920, and 2215709.
  • Johnson, A.A., Ott, M.Q., & Doğucu, M. (2022). Bayes Rules!: An Introduction to Applied Bayesian Modeling (1st ed.). Chapman and Hall/CRC. https://doi.org/10.1201/9780429288340
  • Doğucu, M., Johnson, A., & Ott, M. (2021). bayesrules: Datasets and Supplemental Functions from Bayes Rules! Book. R package version 0.0.2.9000.