Two classes can have the exact same median test score of 50%, but in one class everyone scored between 45% and 55%, while in the other scores ranged from 10% to 90%. Measures of spread describe this "dispersion" or "consistency" of a data set. A smaller spread means the data points are more consistent. Both the range and the interquartile range have the same units as the original data.
An outlier is an extreme value that does not fit the general pattern of the data. The range is highly sensitive to outliers because it relies exclusively on the highest and lowest values.
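When a data set contains extreme outliers, the range "distorts" the view of the data: a single extreme value stretches the distance between the highest and lowest values without reflecting how the rest of the set is spread. The IQR is much more robust because it measures only the middle 50% of the data and ignores the top and bottom 25%, where outliers lie.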
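To see this numerically, here is a minimal Python sketch. The scores are made up for illustration, and `spread` is a hypothetical helper name. It relies on the standard library's `statistics.quantiles`, whose default method is only one of several quartile conventions and may not match the method expected in an exam.

```python
from statistics import quantiles

def spread(values):
    """Return (range, IQR) for a list of numbers."""
    # quantiles(..., n=4) returns the three cut points [Q1, median, Q3].
    q1, _median, q3 = quantiles(values, n=4)
    return max(values) - min(values), q3 - q1

# Made-up scores: the second list swaps the top score for an extreme one.
typical = [45, 47, 48, 50, 51, 53, 55]
with_outlier = [45, 47, 48, 50, 51, 53, 95]

print(spread(typical))       # (10, 6.0)
print(spread(with_outlier))  # (50, 6.0)
```

One extreme score multiplies the range by five, while the IQR does not change.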
Quartiles divide an ordered data set into four equal parts. The Lower Quartile (Q₁) is the value one-quarter of the way through the data, and the Upper Quartile (Q₃) is three-quarters of the way through.
For discrete lists of data, Edexcel accepts finding the median of the lower and upper halves of the data, or using the position formulas:
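A commonly taught form of these position formulas, assuming a list of n values written in ascending order, is sketched below. Treat it as an assumption and check the exact version against your own course materials.

```latex
\text{Position of } Q_1 = \frac{n+1}{4}, \qquad \text{Position of } Q_3 = \frac{3(n+1)}{4}
```

For example, with n = 11 values these give the 3rd and 9th values in the ordered list.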
Find the range, Q₁, Q₃, and the IQR for the following data set: .
Step 1: Calculate the Range.
Step 2: Find the median to split the data into halves.
Step 3: Find Q₁ and Q₃ using the lower and upper halves.
Step 4: Calculate the IQR.
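The four steps can be sketched in Python as follows. The data set here is made up for illustration, and `range_and_iqr` is a hypothetical helper name. It uses the median-of-halves method, leaving the middle value out of both halves when the number of values is odd.

```python
from statistics import median

def range_and_iqr(values):
    """Follow the four steps using the median-of-halves method."""
    data = sorted(values)
    n = len(data)

    # Step 1: range = highest value - lowest value.
    data_range = data[-1] - data[0]

    # Step 2: split at the median. For an odd number of values the
    # middle value itself is left out of both halves.
    half = n // 2
    lower_half = data[:half]
    upper_half = data[n - half:]

    # Step 3: Q1 and Q3 are the medians of the two halves.
    q1 = median(lower_half)
    q3 = median(upper_half)

    # Step 4: IQR = Q3 - Q1.
    return {"range": data_range, "Q1": q1, "Q3": q3, "IQR": q3 - q1}

# Made-up data set with 9 values; the median is the 5th value, 9.
result = range_and_iqr([3, 5, 7, 8, 9, 11, 13, 15, 20])
print(result)  # {'range': 17, 'Q1': 6.0, 'Q3': 14.0, 'IQR': 8.0}
```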
When data is grouped into classes, the exact individual values are lost. We can only calculate an Estimated Range by finding the difference between the upper boundary of the highest class and the lower boundary of the lowest class.
Find the estimated range for the following masses:
| Mass (kg) | Frequency |
|---|---|
| | 5 |
| | 12 |
| | 3 |
Step 1: Identify the lowest and highest boundaries.
Step 2: Calculate the Estimated Range.
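As a rough Python sketch of the same two steps: the class boundaries below are assumed purely for illustration, while the frequencies 5, 12 and 3 come from the table. `estimated_range` is a hypothetical helper name.

```python
def estimated_range(classes):
    """Estimated range for grouped data.

    `classes` is a list of (lower_boundary, upper_boundary, frequency).
    """
    # Step 1: lowest boundary of the lowest class, highest of the highest.
    lowest = min(lower for lower, _upper, _freq in classes)
    highest = max(upper for _lower, upper, _freq in classes)
    # Step 2: estimated range = highest boundary - lowest boundary.
    return highest - lowest

# Assumed class boundaries in kg (illustrative only).
masses = [(0, 10, 5), (10, 20, 12), (20, 30, 3)]
print(estimated_range(masses))  # 30
```

The result is only an estimate because the actual lightest and heaviest masses could lie anywhere inside their classes.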
To estimate quartiles for grouped data, you must use a Cumulative Frequency Graph.
A cumulative frequency graph shows the test marks of 80 students. Estimate the IQR if the curve crosses the cumulative frequency line of 20 at 32 marks and the cumulative frequency line of 60 at 58 marks.
Step 1: Identify the positions on the y-axis.
Step 2: Read the corresponding values from the x-axis.
Step 3: Calculate the IQR.
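Written out with the numbers from the question, the three steps come to the following. This is a sketch that assumes the quartile positions n/4 and 3n/4, which are the usual choice for cumulative frequency graphs.

```latex
\begin{aligned}
\text{Position of } Q_1 &= \tfrac{1}{4} \times 80 = 20, &\quad \text{Position of } Q_3 &= \tfrac{3}{4} \times 80 = 60 \\
Q_1 &= 32 \text{ marks}, &\quad Q_3 &= 58 \text{ marks} \\
\text{IQR} &= Q_3 - Q_1 = 58 - 32 = 26 \text{ marks}
\end{aligned}
```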
At Higher Tier, you must use a specific mathematical rule to provide an objective boundary, determining exactly which values are outliers.
An extreme value is an outlier if it falls below the Lower Boundary, Q₁ - (1.5 x IQR), or above the Upper Boundary, Q₃ + (1.5 x IQR).
If asked to draw a box plot, any calculated outliers must be plotted as an 'x' or an asterisk (*). The whiskers should only extend to the smallest and largest values that are not outliers.
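The sketch below shows one way to work out where the whiskers end and which values are plotted separately. The data and quartiles are made up for illustration, and `box_plot_parts` is a hypothetical helper name that takes Q₁ and Q₃ as already calculated.

```python
def box_plot_parts(values, q1, q3):
    """Return the whisker ends and the outliers for a box plot."""
    iqr = q3 - q1
    lower_boundary = q1 - 1.5 * iqr
    upper_boundary = q3 + 1.5 * iqr

    # Values inside the boundaries decide where the whiskers stop.
    inside = [v for v in values if lower_boundary <= v <= upper_boundary]
    # Values outside the boundaries are plotted individually.
    outliers = sorted(v for v in values if v < lower_boundary or v > upper_boundary)

    return {
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }

# Made-up data; Q1 = 14.5 and Q3 = 22 are assumed to be already found.
data = [2, 14, 15, 17, 18, 20, 21, 23, 40]
parts = box_plot_parts(data, q1=14.5, q3=22)
print(parts)  # {'whisker_low': 14, 'whisker_high': 23, 'outliers': [2, 40]}
```

Here the boundaries are 3.25 and 33.25, so 2 and 40 are marked as outliers and the whiskers stop at 14 and 23.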
A data set has a known lower quartile Q₁ and upper quartile Q₃. Determine mathematically whether the maximum value in the data set is an outlier.
Step 1: Calculate the IQR.
Step 2: Calculate the upper boundary.
Step 3: Compare the value to the boundary.
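As an illustration of the layout expected, here are the three steps with assumed values that are not taken from the question: Q₁ = 20, Q₃ = 30 and a maximum value of 48.

```latex
\begin{aligned}
\text{IQR} &= Q_3 - Q_1 = 30 - 20 = 10 \\
\text{Upper Boundary} &= Q_3 + (1.5 \times \text{IQR}) = 30 + 15 = 45 \\
48 &> 45, \text{ so } 48 \text{ is an outlier}
\end{aligned}
```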
Students often identify the position of the quartile (e.g., the 5th term) but forget to go back to the data set to find the actual value at that position.
When asked to compare two data sets in an exam, you must explicitly comment on one measure of average AND one measure of spread, making sure to relate your answer back to the real-life context of the question.
Use phrases like 'more consistent' or 'less spread out' to describe the data set that has the lower IQR or range.
In cumulative frequency graphs, always double-check the scale on the x and y axes before reading your quartiles. A common error is assuming one small square equals 1 unit when it actually equals 2.
If an exam question asks you to justify if a value is an outlier, you must explicitly show the Q₃ + (1.5 x IQR) calculation. Stating it is an outlier 'by inspection' will score zero method marks.
Range
The difference between the highest (maximum) value and the lowest (minimum) value in a data set.
Interquartile Range (IQR)
The difference between the upper quartile and the lower quartile (Q₃ - Q₁), representing the spread of the middle 50% of the data.
Lower Quartile (Q₁)
The value one-quarter (25%) of the way through an ordered data set.
Upper Quartile (Q₃)
The value three-quarters (75%) of the way through an ordered data set.
Outlier
An extreme value that does not fit the general pattern of the data.
Cumulative Frequency Graph
A running total graph used to estimate medians and quartiles for grouped continuous data.
Estimated Range
The difference between the upper boundary of the highest class and the lower boundary of the lowest class in grouped data.
Lower Boundary
The threshold calculated as Q₁ - (1.5 x IQR), below which a data value is considered a small outlier.
Upper Boundary
The threshold calculated as Q₃ + (1.5 x IQR), above which a data value is considered a large outlier.