Every time you look at a set of exam results, the average score only tells half the story. If the average is 50%, did everyone in the class score exactly 50%, or did half the class get 0% and the other half get 100%? Understanding the spread of data reveals what is actually happening behind the averages.
When analysing the spread of an ordered dataset, we divide it into four equal parts using quartiles. The Lower Quartile (Q₁) is the value 25% of the way through the data, and the Upper Quartile (Q₃) is the value 75% of the way through.
The difference between these two values is called the Inter-quartile Range (IQR). Because the IQR only measures the middle 50% of the data, it is a highly reliable measure of spread that is not affected by outliers (extreme values).
To find the quartiles from a discrete list, first order the data, then use the formula (n + 1)/4 to find the position of Q₁ and 3(n + 1)/4 to find the position of Q₃, where n is the total number of values. If the position is a decimal (for example, a position ending in .5), take the midpoint of the two surrounding values.
Worked Example: Calculate the IQR for the following data set: .
Step 1: Count the values to find n.
Step 2: Find the Lower Quartile (Q₁).
Step 3: Find the Upper Quartile (Q₃).
Step 4: Calculate the IQR.
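The quartile method above can be sketched in Python. This is a minimal illustration, assuming the position formulas (n + 1)/4 for the lower quartile and 3(n + 1)/4 for the upper quartile, with the midpoint rule for positions that are not whole numbers. The function names and the `scores` list are hypothetical; they are not the data from the worked example.

```python
import math


def value_at_position(ordered, position):
    """Return the value at a 1-based position in an ordered list.

    For a position that is not a whole number, return the midpoint
    of the two surrounding values.
    """
    below = math.floor(position)
    above = math.ceil(position)
    return (ordered[below - 1] + ordered[above - 1]) / 2


def quartiles(data):
    """Return (Q1, Q3) using the (n + 1)/4 and 3(n + 1)/4 positions."""
    ordered = sorted(data)  # always order the data first
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch assumes at least 3 values")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


# Hypothetical data set, chosen only for illustration.
scores = [9, 3, 7, 12, 5, 15, 8]
q1, q3 = quartiles(scores)
iqr = q3 - q1
print(q1, q3, iqr)  # 5.0 12.0 7.0
```

With seven values the positions are 2 and 6, so no midpoint is needed; with ten values the positions would be 2.75 and 8.25, and the sketch takes the midpoint of the values on either side.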
An outlier is a data point that is significantly different from the rest of the dataset. For AQA Higher Tier, you must calculate exactly where the boundary for an outlier begins using the 1.5 × IQR rule: the lower bound is Q₁ − 1.5 × IQR and the upper bound is Q₃ + 1.5 × IQR.
Any value in the dataset that falls below the lower bound or above the upper bound is officially classified as an outlier. On a graph, outliers are plotted with a cross (X) or a plus (+).
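As a rough sketch, the outlier check can be written in Python, assuming the 1.5 × IQR rule (lower bound Q₁ − 1.5 × IQR, upper bound Q₃ + 1.5 × IQR). The function names, the quartile values, and the `times` list are hypothetical.

```python
def outlier_bounds(q1, q3):
    """Return (lower_bound, upper_bound) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(data, q1, q3):
    """Return the values that fall below the lower bound or above the upper bound."""
    lower, upper = outlier_bounds(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical values, chosen only for illustration.
times = [4, 18, 22, 25, 31, 45, 50]
print(outlier_bounds(20, 30))        # (5.0, 45.0)
print(find_outliers(times, 20, 30))  # [4, 50]
```

A value that sits exactly on a bound (45 here) is not counted as an outlier, because it is neither below the lower bound nor above the upper bound.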
A Box Plot (Box-and-Whisker Diagram) is a graphical representation used to show the spread and central tendency of a dataset. To draw one, you need a Five-Number Summary: the Minimum value, Q₁, Median (Q₂), Q₃, and Maximum value.
To construct a box plot, follow these steps strictly:
If your dataset contains an outlier, do NOT extend the whisker to the outlier. Instead, mark the outlier with a cross, and end the whisker at the next highest (or lowest) value that is not an outlier.
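The whisker rule above can be illustrated with a short Python sketch. It assumes the outlier bounds have already been calculated; the function name and the example values are hypothetical.

```python
def whisker_ends(data, lower_bound, upper_bound):
    """Return (low_end, high_end): the smallest and largest values
    that are not outliers, which is where the whiskers stop."""
    inside = [x for x in data if lower_bound <= x <= upper_bound]
    return min(inside), max(inside)


# Hypothetical values; 4 and 50 lie outside the bounds 5 and 45.
times = [4, 18, 22, 25, 31, 45, 50]
print(whisker_ends(times, 5, 45))  # (18, 45)
```

Here the outliers 4 and 50 would each be marked with a cross, while the whiskers end at 18 and 45.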
Reading and Interpreting a Box Plot: To extract the five-number summary and other specific values from an existing box plot scale, follow these steps:
If there are crosses (X) or pluses (+) plotted beyond the whiskers, track them down to the axis to read the values of the outliers. You can then calculate the IQR of the distribution by subtracting the Q₁ value from the Q₃ value.
When asked to compare two data distributions, you must always provide two distinct comparisons: one for central tendency (usually the median) and one for spread (the IQR).
Each comparison requires two sentences. The first sentence must state the numerical difference, and the second must explain what this means in context. A smaller IQR means the data is more stable, which is described as having higher consistency.
| Statistical Measure | Numerical Comparison (Sentence 1) | Contextual Meaning (Sentence 2) |
|---|---|---|
| Central Tendency (Median) | The median time for Group A ( seconds) is lower than Group B ( seconds). | Group A was faster on average. |
| Spread (IQR) | Group A has a smaller IQR ( seconds) than Group B ( seconds). | Group A's times are more consistent (less varied). |
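To show the structure of the first (numerical) sentence in each comparison, here is a small Python sketch. The function name, group names, and figures are hypothetical, and the second, contextual sentence still has to be written by hand for the situation in the question.

```python
def compare_groups(name_a, median_a, iqr_a, name_b, median_b, iqr_b, unit):
    """Return two numerical comparison sentences: one for the medians
    and one for the IQRs."""

    def sentence(measure, a, b):
        if a == b:
            return f"{name_a} and {name_b} have the same {measure} ({a} {unit})."
        relation = "lower" if a < b else "higher"
        return (f"The {measure} for {name_a} ({a} {unit}) is {relation} "
                f"than for {name_b} ({b} {unit}).")

    return sentence("median", median_a, median_b), sentence("IQR", iqr_a, iqr_b)


# Hypothetical figures, chosen only for illustration.
for line in compare_groups("Group A", 12, 3, "Group B", 15, 6, "seconds"):
    print(line)
# The median for Group A (12 seconds) is lower than for Group B (15 seconds).
# The IQR for Group A (3 seconds) is lower than for Group B (6 seconds).
```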
You can also comment on the shape of the box plot. If the median line is closer to Q₁, the data is positively skewed. If the median line is closer to Q₃, the data is negatively skewed.
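The skew check can be expressed as a comparison of two distances inside the box. The sketch below is illustrative only: the function name and the numbers are hypothetical, and the "roughly symmetrical" label for equal distances is an added assumption rather than something stated in this guide.

```python
def describe_skew(q1, median, q3):
    """Classify skew by comparing the median's distance to each quartile."""
    to_lower = median - q1
    to_upper = q3 - median
    if to_lower < to_upper:
        return "positively skewed"   # median closer to the lower quartile
    if to_lower > to_upper:
        return "negatively skewed"   # median closer to the upper quartile
    return "roughly symmetrical"     # assumption: equal distances


print(describe_skew(10, 12, 20))  # positively skewed
print(describe_skew(10, 18, 20))  # negatively skewed
```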
Students often forget to order the data before finding quartiles, or they miscalculate the midpoint when a position is a decimal (for example, a position ending in .5).
In 6-mark 'compare' questions, examiners expect exactly two pairs of comments: one comparing the medians and one comparing the IQRs, with both explicitly linked to the real-life context.
When reading from a box plot scale, always check the labelled intervals carefully; a common error is assuming one small square equals 1 unit when it actually equals 2 or 5.
In contexts where a lower value is better, such as races or timed events, remember that a lower median represents a better or faster performance.
If a question asks for the 'total range', you must include any outliers in your calculation. If it asks for the range of the 'main distribution', use the values at the ends of the whiskers.
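The difference between the two kinds of range can be seen in a short Python sketch. It assumes the outlier bounds are already known; the function names and the values are hypothetical.

```python
def total_range(data):
    """Range of every value, outliers included."""
    return max(data) - min(data)


def main_distribution_range(data, lower_bound, upper_bound):
    """Range between the whisker ends, so outliers are excluded."""
    inside = [x for x in data if lower_bound <= x <= upper_bound]
    return max(inside) - min(inside)


# Hypothetical values; 4 and 50 lie outside the bounds 5 and 45.
times = [4, 18, 22, 25, 31, 45, 50]
print(total_range(times))                     # 46
print(main_distribution_range(times, 5, 45))  # 27
```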
Lower Quartile (Q₁)
The value that is 25% (one quarter) of the way through an ordered data set.
Upper Quartile (Q₃)
The value that is 75% (three quarters) of the way through an ordered data set.
Inter-quartile Range (IQR)
A measure of spread focusing on the middle 50% of the data, calculated as the difference between the upper quartile and the lower quartile (IQR = Q₃ − Q₁).
Outliers
Data points that are more than 1.5 × IQR above the upper quartile or below the lower quartile.
Box Plot (Box-and-Whisker Diagram)
A graphical representation of the five-number summary showing the spread and central tendency of a distribution.
Five-Number Summary
The five key values needed to draw a box plot: Minimum, Lower Quartile, Median, Upper Quartile, and Maximum.
Median (Q₂)
The middle value of an ordered dataset, representing the 50% mark.
Whiskers
Horizontal lines on a box plot extending from the central box to the minimum and maximum values (excluding outliers).
Central tendency
A statistical measure, such as the median, that identifies a single typical or middle value for a dataset.
Consistency
A term used to describe spread; a smaller inter-quartile range indicates the data is more consistent or less varied.
Positively skewed
A distribution where the median is closer to the lower quartile inside the box plot.
Negatively skewed
A distribution where the median is closer to the upper quartile inside the box plot.