Every time you compare your gaming scores with a friend, you are naturally using averages to see who performed better overall. To summarise a list of ungrouped data, we calculate a measure of central tendency to find the "typical" value, and a measure of spread to see how dispersed the numbers are.
There are three main measures of central tendency: the mean, the median, and the mode.
The range is the standard measure of spread, calculated by subtracting the minimum value from the maximum value.
To calculate the mean, we use the sum of all values ($\sum x$) and the total number of values ($n$):

$$\text{mean} = \frac{\sum x}{n}$$
To find the median's position in an ordered list of $n$ values, we use:

$$\text{median position} = \frac{n + 1}{2}$$
Five students score the following marks in a spelling test: 12, 18, 15, 12, 23. Calculate the mean and the range.
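Step 1: Calculate the mean by summing the values and dividing by the number of values ($n = 5$).

$$\text{mean} = \frac{12 + 18 + 15 + 12 + 23}{5} = \frac{80}{5} = 16$$

Step 2: Calculate the range by subtracting the lowest score from the highest score.

$$\text{range} = 23 - 12 = 11$$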
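The two steps above can be checked with a short Python sketch. The marks are those given in the question; the variable names are only illustrative.

```python
# Marks from the spelling test in the worked example.
marks = [12, 18, 15, 12, 23]

# Mean: sum of all values divided by the number of values.
mean = sum(marks) / len(marks)

# Range: highest value minus lowest value.
data_range = max(marks) - min(marks)

print(mean)        # 16.0
print(data_range)  # 11
```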
Find the median and mode of this data set: 7, 3, 9, 3, 4, 11.
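Step 1: Order the data from smallest to largest: 3, 3, 4, 7, 9, 11.
Step 2: Identify the mode by finding the most frequent number. The value 3 appears twice and every other value appears once, so the mode is 3.
Step 3: Find the median. Since $n = 6$ (an even number), the position is $\frac{6 + 1}{2} = 3.5$, which is halfway between the 3rd and 4th values. These are 4 and 7, so the median is $\frac{4 + 7}{2} = 5.5$.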
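As a quick check of this example, the sketch below uses Python's standard `statistics` module. It assumes the position rule $\frac{n+1}{2}$ for the median; the variable names are illustrative.

```python
import statistics

# Data set from the worked example.
data = [7, 3, 9, 3, 4, 11]

# Step 1: order the data before looking for the median.
ordered = sorted(data)                # [3, 3, 4, 7, 9, 11]

# Step 2: multimode returns every value that shares the highest frequency.
modes = statistics.multimode(data)    # [3]

# Step 3: median position (assumed rule: (n + 1) / 2).
n = len(ordered)
position = (n + 1) / 2                # 3.5, between the 3rd and 4th values

# For an even count, the median is the mean of the two middle values.
median = statistics.median(data)      # 5.5

print(ordered, modes, position, median)
```

`multimode` is used rather than `mode` so that a bimodal data set would return both modes instead of only the first one found.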
When collecting heights of hundreds of plants, recording every exact millimetre takes too long, so scientists group the data into intervals. Once data is placed into a frequency table, the exact original values are hidden. Because we do not know the raw data, we can only calculate an estimated mean, and identify the modal class and median class interval.
To estimate the mean, we must assume that every data point in a class interval is equal to its exact middle, called the midpoint ($x$). We multiply each midpoint by its frequency to create a product column ($fx$), then divide the total of this column ($\sum fx$) by the total frequency ($\sum f$).
The modal class is simply the interval with the highest frequency. To find the median class interval, we look for the interval containing the $\frac{n+1}{2}$th value by keeping a running total of the frequencies, known as the cumulative frequency.
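The midpoint method described above can be sketched in Python. The class intervals and frequencies below are invented purely for illustration (they are not taken from any table in these notes), and the function name `estimated_mean` is hypothetical.

```python
def estimated_mean(classes):
    """Estimate the mean of grouped data.

    classes: list of (lower, upper, frequency) tuples, one per class interval.
    """
    total_f = 0    # running total of the frequency column
    total_fx = 0   # running total of the product column
    for lower, upper, f in classes:
        x = (lower + upper) / 2   # midpoint of the class interval
        total_fx += f * x
        total_f += f
    # Divide by the total frequency, not by the number of rows.
    return total_fx / total_f


# Hypothetical plant heights in cm: (lower, upper, frequency).
heights = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]

# Midpoints 5, 15, 25 give products 20, 150, 150; 320 / 20 = 16.0
print(estimated_mean(heights))  # 16.0
```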
The table below shows the time taken ($t$, in minutes) for 30 students to travel to school. Estimate the mean time, identify the modal class, and find the median class interval.
| Time ($t$) | Frequency ($f$) |
|---|---|
| | 6 |
| | 14 |
| | 10 |
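Step 1: Add a column for the midpoint ($x$) and multiply it by the frequency ($f$) to fill in the product column ($fx$).
Step 2: Sum the frequency column ($\sum f = 6 + 14 + 10 = 30$) and the product column ($\sum fx$).
Step 3: Calculate the estimated mean using $\frac{\sum fx}{\sum f}$.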
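Step 4: Identify the modal class (the interval with the highest frequency). The highest frequency is 14, so the modal class is the interval in that row.
Step 5: Find the median class interval. The median position is $\frac{30 + 1}{2} = 15.5$. The cumulative frequencies are 6, 20 and 30, so the 15.5th value lies in the second interval (the one with frequency 14).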
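Steps 4 and 5 can be sketched in Python using the frequencies from the table (6, 14, 10). The interval labels below are placeholders, and the position rule $\frac{n+1}{2}$ is an assumption carried over from the ungrouped case.

```python
from itertools import accumulate

# Frequencies from the travel-time table, in row order.
frequencies = [6, 14, 10]
# Placeholder labels standing in for the class intervals.
labels = ["first interval", "second interval", "third interval"]

n = sum(frequencies)                          # 30
cumulative = list(accumulate(frequencies))    # [6, 20, 30]

# Modal class: the row with the highest frequency.
modal_index = frequencies.index(max(frequencies))

# Median class: the first row whose cumulative frequency reaches
# the median position (assumed rule: (n + 1) / 2).
median_position = (n + 1) / 2                 # 15.5
median_index = next(
    i for i, cf in enumerate(cumulative) if cf >= median_position
)

print(labels[modal_index])   # second interval
print(labels[median_index])  # second interval
```

Using $\frac{n}{2} = 15$ as the position instead would select the same row here, since both 15 and 15.5 lie between the cumulative totals 6 and 20.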
You can quickly approximate the spread in a grouped survey by calculating an estimated range. Because the true highest and lowest values in the data set are unknown, you must use the extremes of the table.
The standard method for GCSE is the class boundaries method. You subtract the lower bound of the very first interval in the table from the upper bound of the final interval. Because it is highly unlikely the true minimum and maximum fall exactly on these boundaries, this method is usually an overestimate of the true spread.
Using the travel time data above, calculate the estimated range using the class boundaries method.
Step 1: Identify the upper bound of the highest class and the lower bound of the lowest class.
Step 2: Subtract the lower bound from the upper bound.
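The class boundaries method can be sketched as below. The boundaries used are hypothetical values chosen only to show the calculation, and `estimated_range` is an illustrative name.

```python
def estimated_range(classes):
    """Estimated range of grouped data by the class boundaries method.

    classes: list of (lower_bound, upper_bound) tuples, one per class interval.
    """
    lowest_lower_bound = min(lower for lower, _ in classes)
    highest_upper_bound = max(upper for _, upper in classes)
    return highest_upper_bound - lowest_lower_bound


# Hypothetical class boundaries in minutes.
classes = [(0, 10), (10, 20), (20, 30)]

print(estimated_range(classes))  # 30
```

Because the true minimum and maximum are unlikely to sit exactly on 0 and 30, the true range of such data would usually be smaller than this estimate.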
When finding the median of ungrouped data, forgetting to order the numbers from smallest to largest first is a guaranteed way to lose marks.
Students often identify the highest numerical value in a data set as the mode, instead of looking for the value with the highest frequency.
In a grouped frequency table, always divide the total of your product column ($\sum fx$) by the total frequency ($\sum f$), never by the number of rows in the table.
OCR examiners accept mean values as fractions or decimals; do not arbitrarily round an exact decimal answer to a whole number unless the question specifically asks you to.
When asked for the modal class or median class, ensure you write down the class interval itself and not the frequency of that class.
Ungrouped data
A collection of individual data values that have not been categorised into intervals.
Measure of central tendency
A single value that attempts to describe a set of data by identifying the central position within that set (Mean, Median, or Mode).
Measure of spread
A statistic that describes how dispersed or scattered the values in a data set are (e.g., Range).
Arithmetic mean
A measure of central tendency calculated by dividing the sum of all values by the total number of values.
Outliers
Extreme values that are significantly higher or lower than the majority of the data set, heavily affecting the mean.
Median
The middle value of a data set when the values are arranged in ascending or descending order.
Mode
The value that occurs most frequently in a data set.
Modal value
The value or category that appears most often in a data set; another name for the mode.
Bimodal
A data set that contains exactly two modes (two values that share the highest frequency).
Range
The difference between the highest and lowest values in a data set, used to measure spread.
Frequency table
A table used to organise data by showing the number of times each value or class interval occurs.
Estimated mean
An approximation of the mean for grouped data, calculated using midpoints because exact values are unknown.
Modal class
The class interval in a grouped frequency table that has the highest frequency.
Median class interval
The specific class interval in a grouped frequency table that contains the middle value (the median) of the data set.
Class interval
A range of values into which data is grouped in a frequency table.
Midpoint
The value exactly halfway between the upper and lower limits of a class interval, used to estimate the mean for grouped data.
Product column
The column in a grouped frequency table representing the frequency multiplied by the midpoint ($fx$).
Cumulative frequency
A running total of frequencies used to help locate the median class interval.
Estimated range
An approximation of the spread in grouped data, typically the difference between the upper bound of the last class and the lower bound of the first class.
Lower bound
The smallest possible value that can belong to a specific class interval or data set.
Upper bound
The largest possible value that can belong to a specific class interval or data set.