AHL 4.13 · Bayes' Theorem · HL Only · Groups of 3–4 · ~50–55 min
Phase 1
Reading the Evidence
Priors · Likelihoods · Conditional probability from a table
Scenario
A school's integrity committee has reviewed 1 000 essays using an AI-detection tool. The table below summarises the results from last semester.
| | AI-assisted | Human-written | Total |
|---|---|---|---|
| Flagged by detector | 153 | 82 | 235 |
| Not flagged | 27 | 738 | 765 |
| Total | 180 | 820 | 1000 |
Task 1a — Prior probabilities
Find P(AI-assisted) and P(Human-written) from the table.
These are the prior probabilities — our belief before seeing the detector result.
Task 1b — Likelihoods
Find all four conditional probabilities: P(Flagged | AI), P(Flagged | Human), P(Not flagged | AI), P(Not flagged | Human).
These are likelihoods — how probable each detector outcome is given the essay's true origin.
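For checking Tasks 1a and 1b after attempting them by hand, the short Python sketch below recomputes the priors and likelihoods directly from the table counts. The dictionary layout and variable names are illustrative choices, not part of the worksheet.

```python
# Counts copied from the scenario table (key: detector result, true origin).
counts = {
    ("flagged", "ai"): 153,
    ("flagged", "human"): 82,
    ("not_flagged", "ai"): 27,
    ("not_flagged", "human"): 738,
}

total = sum(counts.values())  # 1000 essays

# Column totals give the number of essays of each origin.
origin_totals = {
    origin: sum(n for (_, o), n in counts.items() if o == origin)
    for origin in ("ai", "human")
}

# Priors: share of all essays with each origin.
priors = {origin: n / total for origin, n in origin_totals.items()}

# Likelihoods: P(result | origin) = cell count / column total.
likelihoods = {
    (result, origin): n / origin_totals[origin]
    for (result, origin), n in counts.items()
}

print(priors)
for key, value in likelihoods.items():
    print(key, round(value, 4))
```

Note that each likelihood divides by a column total (180 or 820), not by the row total or the grand total. Dividing by the wrong total is the most common slip in this task.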
Task 1c — First instinct
The detector flags an essay. Estimate P(AI | Flagged) as a group and write it on the board. You'll revisit it after Phase 2.
Phase 2
The Bayes Flip
Draw a tree from scratch · Read the posterior · Meet the formula
Task 2a — Draw a tree from scratch
Using the Phase 1 values, construct a fully labelled probability tree from scratch. First branches: origin of essay. Second branches: detector result. Calculate all four end-node products.
Task 2b — Read the posterior from your tree
An essay is flagged. Find P(AI | Flagged) from your tree: the AI-and-flagged path ÷ the sum of both flagged paths. Compare with your Task 1c estimate. Are you surprised?
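As a check on the board work, the tree calculation can be sketched in a few lines of Python. The variable names are illustrative; the three input values are the Phase 1 results from the table (0.18, 153/180 and 82/820).

```python
# Phase 1 values read from the table.
p_ai = 0.18                # P(AI) = 180/1000
p_flag_given_ai = 0.85     # P(Flagged | AI) = 153/180
p_flag_given_human = 0.10  # P(Flagged | Human) = 82/820

# Four end-node products: first-branch probability x second-branch probability.
end_nodes = {
    ("ai", "flagged"): p_ai * p_flag_given_ai,
    ("ai", "not_flagged"): p_ai * (1 - p_flag_given_ai),
    ("human", "flagged"): (1 - p_ai) * p_flag_given_human,
    ("human", "not_flagged"): (1 - p_ai) * (1 - p_flag_given_human),
}

# Posterior: AI-flagged path divided by the sum of both flagged paths.
flagged_total = end_nodes[("ai", "flagged")] + end_nodes[("human", "flagged")]
p_ai_given_flagged = end_nodes[("ai", "flagged")] / flagged_total

for path, prob in end_nodes.items():
    print(path, round(prob, 4))
print("P(AI | Flagged) =", round(p_ai_given_flagged, 4))  # 0.6511
```

The four end-node products sum to 1, which is a quick sanity check on any tree. The posterior matches 153/235 read straight from the "Flagged by detector" row of the table.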
Reference tree below — only look after you've tried it on the board.
Phase 3
Posterior becomes prior · Draw a new tree · Vocabulary diversity signal
New evidence
The essay also scores low on vocabulary diversity. P(Low vocab | AI) = 0.76 · P(Low vocab | Human) = 0.24. Your Phase 2 posterior P(AI | Flagged) ≈ 0.6511 is now the new prior. Erase the old root — replace it.
Task 3a — Draw a new tree on your board
Same two-branch structure as Phase 2, but new root probabilities (0.6511 / 0.3489) and vocabulary diversity as the second-level branches. Calculate all four end-node products.
Task 3b — Read the updated posterior
Find P(AI | Flagged ∩ Low vocab): AI low-vocab path ÷ sum of all low-vocab paths. Confirm with the Bayes formula from Task 2d.
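The "posterior becomes prior" idea can be sketched as a loop that applies the same tree calculation once per signal. This is a minimal illustration, not part of the worksheet: the function name `sequential_update` is hypothetical, and the code assumes the detector flag and the vocabulary score are conditionally independent given the essay's origin, which the chained-tree method relies on implicitly.

```python
def sequential_update(prior_ai, signals):
    """Apply Bayes' theorem once per signal; each posterior becomes the next prior.

    signals: list of (P(signal | AI), P(signal | Human)) pairs.
    Assumes the signals are conditionally independent given the essay's origin.
    """
    belief = prior_ai
    history = [belief]
    for like_ai, like_human in signals:
        ai_path = belief * like_ai             # AI branch end-node product
        human_path = (1 - belief) * like_human  # Human branch end-node product
        belief = ai_path / (ai_path + human_path)
        history.append(belief)
    return history

# Detector flag first (0.85 vs 0.10), then low vocabulary diversity (0.76 vs 0.24).
history = sequential_update(0.18, [(0.85, 0.10), (0.76, 0.24)])
print([round(b, 3) for b in history])  # [0.18, 0.651, 0.855]
```

Carrying the unrounded posterior into the second tree gives about 0.8553. Using the rounded root 0.6511 on the board gives a value that differs only in the fifth decimal place, so both agree with ≈ 85.5%.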
Reference tree below — only look after you've tried it on the board.
Phase 3 reference tree — updated prior (check after attempting)
Compare the Phase 3 posterior (≈ 85.5%) with the Phase 2 posterior (≈ 65.1%). How much did the vocabulary signal move the needle? What does this mean for the committee?
Phase 4
The Threshold
Two chained trees · Critical prior · GDC required
Committee rule
The committee acts only if P(AI | Flagged ∩ Low vocab) > 0.90. Let p = P(AI-assisted) be the prior. With p = 0.18 the posterior was ≈ 85.5%, just below the threshold. Find the minimum p (to 2 d.p.) that pushes the posterior above 90%.
Task 4a — Two-tree chain structure
Draw two chained trees on your board with p as the unknown root. Tree 1 (detector signal) gives intermediate posterior r(p). Tree 2 (vocab signal) uses r(p) as its root. Write the general expressions before substituting numbers.
\( r(p) = \dfrac{0.85p}{0.85p + 0.10(1-p)} \quad \rightarrow \quad \text{use } r(p) \text{ as root of Tree 2} \)
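Tree 2 follows the same pattern with the Phase 3 likelihoods (0.76 and 0.24). One way to write it is shown below, where the label q(p) for the final posterior is introduced here for convenience and is not the worksheet's notation:

\( q(p) = \dfrac{0.76\,r(p)}{0.76\,r(p) + 0.24\,\bigl(1 - r(p)\bigr)} \)

Substituting r(p) and multiplying the numerator and denominator by 0.85p + 0.10(1 - p) collapses the chain into a single expression:

\( q(p) = \dfrac{0.76 \times 0.85\,p}{0.76 \times 0.85\,p + 0.24 \times 0.10\,(1-p)} = \dfrac{0.646p}{0.646p + 0.024(1-p)} \)

As a check, p = 0.18 gives q ≈ 0.855, matching the Phase 3 posterior.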
Chain structure — substitute p = 0.25 then p = 0.26 to bracket the answer
p = 0.25 → r ≈ 0.7391 → final posterior ≈ 89.97% ✗
p = 0.26 → r ≈ 0.7492 → final posterior ≈ 90.44% ✓
Answer: p = 0.26
Task 4b — Test p = 0.25 and p = 0.26
Substitute each value into your two-tree chain and run the full calculation. Find the minimum p (to 2 d.p.) such that the final posterior exceeds 90%.
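If a GDC is not to hand, the same bracketing can be sketched in Python. The function name `final_posterior` is illustrative, and the search simply steps through two-decimal values of p. It is a numerical check, not a replacement for the board work.

```python
def final_posterior(p):
    """Two chained trees: detector signal first, then vocabulary signal."""
    r = 0.85 * p / (0.85 * p + 0.10 * (1 - p))     # Tree 1: P(AI | Flagged)
    return 0.76 * r / (0.76 * r + 0.24 * (1 - r))  # Tree 2: P(AI | Flagged and Low vocab)

# Step through p = 0.01, 0.02, ..., 0.99 using integer hundredths to avoid float drift.
min_p = next(k / 100 for k in range(1, 100) if final_posterior(k / 100) > 0.90)

print(round(final_posterior(0.25), 4))  # 0.8997
print(round(final_posterior(0.26), 4))  # 0.9044
print(min_p)                            # 0.26
```

Because the final posterior increases with p, the first two-decimal value that clears 0.90 is the minimum, so stopping at the first hit is enough.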
Task 4c — Reflect
Currently p = 0.18. How much must it increase before the committee can act? What does this tell you about the role of the prior in a Bayesian system?