
| x (hrs/wk) | 4 | 5 | 6 | 6 | 7 | 8 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| y (m/s) | 6.1 | 6.4 | 6.3 | 6.8 | 7.0 | 7.2 | 6.9 | 7.5 | 7.4 | 7.8 | 7.9 | 8.1 |
| Means | \(\bar{x} = 8.25\) hrs/wk · \(\bar{y} = 7.117\) m/s | | | | | | | | | | | |

| Match | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Churros (x) | 210 | 185 | 340 | 290 | 155 | 410 | 375 | 230 | 460 | 310 |
| Goals (y) | 1 | 0 | 3 | 2 | 1 | 4 | 3 | 1 | 4 | 2 |

| Match | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|
| Churros (x) | 270 | 195 | 385 | 440 | 160 | 325 | 280 | 500 | 215 | 355 |
| Goals (y) | 2 | 1 | 3 | 4 | 0 | 3 | 2 | 5 | 1 | 3 |
| Means | \(\bar{x} = 304.5\) churros · \(\bar{y} = 2.25\) goals | | | | | | | | | |

"A football scout has been collecting data on youth academy players. Before you analyse any numbers, she needs you to look at some charts and tell her what you see. Your job today: figure out how to measure relationships in data, build the tool mathematicians invented for exactly this, and then use it to call out a very suspicious claim."
Give code SCATTER once all groups have ranked all six plots and answered Task 1c.
Expected descriptions and rankings (accept reasonable variation in ranking of middle plots):
Target properties students should identify:
The relationship is clearly curved: transfer value rises through a player's twenties and falls sharply after ~27–30. A linear measuring number would give a value near zero, making it appear there is no relationship, which is obviously false.
A linear measuring number is misleading whenever the underlying relationship is non-linear. Call back to this in Phase 2: "This is exactly why Pearson's r is only valid for linear relationships."
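The near-zero claim can be checked numerically. The sketch below uses made-up ages and a made-up, perfectly symmetric inverted-U curve (not the transfer-value data from the activity), and `pearson_r` is an illustrative helper rather than anything the activity provides. It only shows that a fully predictable curved relationship can still produce an r of essentially zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed from the sums Sxy, Sxx and Syy."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical data: ages 20..34 and an inverted-U "value" peaking at age 27.
ages = list(range(20, 35))
values = [100 - (age - 27) ** 2 for age in ages]

r_curved = pearson_r(ages, values)
print(f"r for the curved data: {r_curved:.3f}")
```

Real transfer-value data would not be perfectly symmetric, so its r would be small rather than exactly zero, but the conclusion is the same: r measures linear association only.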
Give code PEARSON once all groups have found the regression line and made their first prediction.
The activity introduces the formula directly on screen, so no teacher introduction is needed. Key hand-calculation results:
| x | y | \(x-\bar{x}\) | \(y-\bar{y}\) | \((x-\bar{x})^2\) | \((y-\bar{y})^2\) | \((x-\bar{x})(y-\bar{y})\) |
|---|---|---|---|---|---|---|
| 4 | 6.1 | −4.25 | −1.017 | 18.06 | 1.034 | 4.322 |
| 5 | 6.4 | −3.25 | −0.717 | 10.56 | 0.514 | 2.330 |
| 6 | 6.3 | −2.25 | −0.817 | 5.06 | 0.667 | 1.838 |
| 6 | 6.8 | −2.25 | −0.317 | 5.06 | 0.100 | 0.713 |
| 7 | 7.0 | −1.25 | −0.117 | 1.56 | 0.014 | 0.146 |
| 8 | 7.2 | −0.25 | 0.083 | 0.063 | 0.007 | −0.021 |
| 8 | 6.9 | −0.25 | −0.217 | 0.063 | 0.047 | 0.054 |
| 9 | 7.5 | 0.75 | 0.383 | 0.563 | 0.147 | 0.287 |
| 10 | 7.4 | 1.75 | 0.283 | 3.063 | 0.080 | 0.495 |
| 11 | 7.8 | 2.75 | 0.683 | 7.563 | 0.467 | 1.879 |
| 12 | 7.9 | 3.75 | 0.783 | 14.063 | 0.614 | 2.936 |
| 13 | 8.1 | 4.75 | 0.983 | 22.563 | 0.967 | 4.671 |
| Sums | | | | 88.25 | 4.657 | 19.65 |
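As a check on the hand calculation, the three sums and the resulting r can be reproduced in a few lines. This is a minimal sketch using the training-hours and sprint-speed data from the table; the variable names are illustrative, and students are still expected to work through the table by hand.

```python
from math import sqrt

hours = [4, 5, 6, 6, 7, 8, 8, 9, 10, 11, 12, 13]                        # x, hrs/wk
speed = [6.1, 6.4, 6.3, 6.8, 7.0, 7.2, 6.9, 7.5, 7.4, 7.8, 7.9, 8.1]  # y, m/s

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(speed) / n

# Column sums from the table: squared deviations and cross products.
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in speed)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, speed))

# Pearson's r is the cross-product sum scaled by both spreads.
r = s_xy / sqrt(s_xx * s_yy)
print(f"Sxx = {s_xx:.2f}, Syy = {s_yy:.3f}, Sxy = {s_xy:.2f}, r = {r:.3f}")
```

The result is an r of about 0.97, which is what justifies the phrase "very strong positive" in the model interpretation.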
Model interpretation sentence: There is a very strong positive linear relationship between weekly training hours and sprint speed: players who train more hours per week tend to have significantly higher sprint speeds.
x = 10 lies within the data range (4–13), so this is an interpolation.
Give code PREDICT once all groups have found the x on y line and completed Task 3c.
| x value | In range? (4–13 hrs) | Predicted ŷ | Verdict |
|---|---|---|---|
| 9 hrs/wk | ✓ Interpolation | 0.223(9) + 5.28 = 7.29 m/s | Reliable |
| 12 hrs/wk | ✓ Interpolation (edge) | 0.223(12) + 5.28 = 7.96 m/s | Reliable, mild caution |
| 25 hrs/wk | ✗ Extrapolation | 0.223(25) + 5.28 = 10.86 m/s | Unreliable |
10.86 m/s exceeds the average speed of the men's 100 m world record (100 m in 9.58 s, about 10.44 m/s), a clear sign the linear model does not hold at extreme values.
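The interpolation check in the table can be sketched as a small helper. It assumes the rounded coefficients shown in the table (0.223 and 5.28) and the observed range of 4 to 13 hours; `predict_speed` is an illustrative name, not part of the activity.

```python
# Coefficients as rounded in the table above: y-hat = 0.223x + 5.28
SLOPE = 0.223
INTERCEPT = 5.28
X_MIN, X_MAX = 4, 13  # range of weekly training hours in the data

def predict_speed(hours):
    """Return (predicted speed in m/s, label) for a weekly training load."""
    y_hat = SLOPE * hours + INTERCEPT
    label = "interpolation" if X_MIN <= hours <= X_MAX else "extrapolation"
    return y_hat, label

for h in (9, 12, 25):
    y_hat, label = predict_speed(h)
    print(f"{h} hrs/wk -> {y_hat:.3f} m/s ({label})")
```

The arithmetic is identical for all three inputs. Only the range check separates a trustworthy prediction from an untrustworthy one, which is the point students should take from the 25 hrs/wk row.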
Step 1 (rearranging the y-on-x line):
Step 2 β Regression line of x on y (from GDC):
The answers are close (because r is very high) but not identical. For a scout making practical decisions, this difference is small. But the principle matters.
The gap between the two lines depends on r. When r is close to ±1, the lines are nearly identical. When the correlation is weak (e.g. r = 0.5), the lines diverge dramatically, and using the wrong one gives badly wrong answers.
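The comparison between the two lines can be illustrated numerically. The sketch below rebuilds both regression lines from the rounded sums and means in the hand-calculation table, then estimates the training hours for a target speed of 7.5 m/s. That target is a hypothetical value chosen for illustration, not the one used in the activity, and the GDC output may differ in the last decimal place because of rounding.

```python
# Rounded sums from the hand-calculation table, and the two means
S_XX, S_YY, S_XY = 88.25, 4.657, 19.65
X_BAR, Y_BAR = 8.25, 7.117

# y on x: y = a + b*x (least squares on the vertical residuals)
b = S_XY / S_XX
a = Y_BAR - b * X_BAR

# x on y: x = c + d*y (least squares on the horizontal residuals)
d = S_XY / S_YY
c = X_BAR - d * Y_BAR

r_squared = S_XY ** 2 / (S_XX * S_YY)

target_speed = 7.5  # hypothetical target in m/s, not taken from the activity
hours_rearranged = (target_speed - a) / b  # rearranging the y-on-x line
hours_x_on_y = c + d * target_speed        # using the x-on-y line

print(f"y on x: y = {b:.3f}x + {a:.2f}")
print(f"x on y: x = {d:.2f}y + {c:.2f}")
print(f"rearranged y-on-x: {hours_rearranged:.2f} hrs/wk")
print(f"x-on-y:            {hours_x_on_y:.2f} hrs/wk")
print(f"b * d = {b * d:.4f}, r^2 = {r_squared:.4f}")
```

The product of the two slopes, b × d, equals r². The two lines therefore coincide only when r² = 1, and they spread further apart as r² falls.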
Give code CHURROS once all groups have written their verdict.
Students run the full analysis before drawing conclusions. Do not pre-empt the tension.
The maths is correct: r = 0.972 accurately describes the co-variation. What is wrong is the causal interpretation: assuming that because two variables move together, one causes the other.
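The figure r = 0.972 can be reproduced from the 20-match churros table at the start of these notes. This is a minimal sketch with illustrative variable names; it confirms the arithmetic only and says nothing about cause.

```python
from math import sqrt

# Churros sold (x) and goals scored (y) for matches 1-20
churros = [210, 185, 340, 290, 155, 410, 375, 230, 460, 310,
           270, 195, 385, 440, 160, 325, 280, 500, 215, 355]
goals = [1, 0, 3, 2, 1, 4, 3, 1, 4, 2,
         2, 1, 3, 4, 0, 3, 2, 5, 1, 3]

n = len(churros)
x_bar = sum(churros) / n
y_bar = sum(goals) / n

s_xx = sum((x - x_bar) ** 2 for x in churros)
s_yy = sum((y - y_bar) ** 2 for y in goals)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(churros, goals))

r_churros = s_xy / sqrt(s_xx * s_yy)
print(f"mean churros = {x_bar}, mean goals = {y_bar}, r = {r_churros:.3f}")
```

Nothing in this calculation distinguishes "churros cause goals" from any other explanation of why the two columns rise together, which is why the verdict has to come from reasoning about the context rather than from r.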