When someone asks for the "average" salary, they might actually be talking about three completely different mathematical concepts. A measure of central tendency is a single value that represents the center of a data set. For discrete data (data that can only take specific values), we use three main measures: the mean, median, and mode.
To find the mean, we use the following formula:
$$\bar{x} = \frac{\sum x}{n}$$
(Where $\bar{x}$ is the mean, $\sum x$ is the sum of all values, and $n$ is the number of values)
To find the median's position in an ordered list, we use:
$$\text{position} = \frac{n + 1}{2}$$
Find the mean of the following values: 3, 7, 8, 12.
Step 1: Sum the values to find $\sum x$: $3 + 7 + 8 + 12 = 30$.
Step 2: Divide by the number of values ($n = 4$): $\bar{x} = \frac{30}{4} = 7.5$.
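The two steps above can be checked with a short Python sketch. The function name `arithmetic_mean` is an illustrative choice and not part of the original notes.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    total = sum(values)   # Step 1: sum of all values
    n = len(values)       # number of values
    return total / n      # Step 2: divide the sum by n


values = [3, 7, 8, 12]
print(sum(values))              # 30
print(arithmetic_mean(values))  # 7.5
```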
Find the median of the following values: 15, 2, 9, 11, 4.
Step 1: Arrange the data in numerical order: 2, 4, 9, 11, 15.
Step 2: Find the position of the median ($\frac{5 + 1}{2} = 3$).
Step 3: Identify the 3rd value in your ordered list: the median is 9.
Note for sets with an even $n$: If your list had six values ($n = 6$), the position would be $\frac{6 + 1}{2} = 3.5$. The median is the arithmetic mean of the 3rd and 4th values: add them together and divide by 2.
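The median steps, including the even-sized case, can be sketched in Python as follows. This is a minimal illustration: the function name `median` is chosen here for convenience, and the six-value list is hypothetical, made up only to show the even case.

```python
def median(values):
    """Median found by ordering the data and locating the middle position."""
    ordered = sorted(values)          # Step 1: order the data
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2       # whole-number position (1-based)
        return ordered[position - 1]  # Python lists are 0-based
    # Even n: the position (n + 1) / 2 falls halfway between two values,
    # so average the values at positions n/2 and n/2 + 1.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median([15, 2, 9, 11, 4]))     # 9
# Hypothetical six-value list, made up for illustration only.
print(median([1, 3, 5, 7, 9, 11]))   # 6.0
```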
Find the mode of the following values: 4, 7, 2, 7, 9, 4, 7.
Step 1: Count the frequency of each value: 2 appears once, 4 appears twice, 7 appears three times, and 9 appears once.
Step 2: Identify the value with the highest frequency: the mode is 7.
Note: If two different values are tied for the highest frequency (for example, both appear twice and no other value appears more often), the set is bimodal and both values are modes.
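A frequency count like the one above can be sketched with Python's `collections.Counter`. The helper name `modes` and the second list are assumptions made for illustration. The helper returns a list so that a bimodal set reports both modes.

```python
from collections import Counter


def modes(values):
    """Return every value that shares the highest frequency, sorted."""
    counts = Counter(values)          # Step 1: frequency of each value
    highest = max(counts.values())    # Step 2: the highest frequency
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([4, 7, 2, 7, 9, 4, 7]))   # [7]
# Hypothetical list, made up for illustration: 3 and 5 both appear twice.
print(modes([3, 3, 5, 5, 8]))         # [3, 5]
```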
Imagine trying to calculate the average age of 1,000 people by adding up a massive list of individual numbers — it would take forever. Instead, large datasets are organised into grouped frequency tables. When data is grouped into class intervals, we can only calculate an estimated mean because the exact values within each interval are unknown.
To estimate the mean, we must assume every value in a class interval is equal to the midpoint of that interval. We multiply each midpoint ($x$) by its frequency ($f$) to find the total for that row ($fx$). Finally, we divide the sum of these products ($\sum fx$) by the total frequency ($\sum f$):
$$\text{estimated mean} = \frac{\sum fx}{\sum f}$$
Calculate an estimate for the mean from the grouped data below.
Step 1: Find the midpoint ($x$) for each class interval.
Step 2: Calculate $fx$ for each row.
Step 3: Calculate the total frequency ($\sum f$) and total products ($\sum fx$).
Step 4: Divide $\sum fx$ by $\sum f$.
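The four steps can be sketched in Python as below. The grouped table here is hypothetical (its intervals and frequencies are made up for illustration and are not the data from the worked example), and the name `estimated_mean` is likewise an assumption.

```python
# Hypothetical grouped table: (lower bound, upper bound, frequency).
# These numbers are made up for illustration; they are not from the notes.
table = [
    (0, 10, 4),
    (10, 20, 6),
    (20, 30, 10),
]


def estimated_mean(rows):
    total_f = 0    # running total of frequencies (sum of f)
    total_fx = 0   # running total of products (sum of fx)
    for lower, upper, f in rows:
        x = (lower + upper) / 2   # Step 1: midpoint of the class
        total_fx += f * x         # Step 2: fx for this row
        total_f += f              # Step 3: accumulate both totals
    return total_fx / total_f     # Step 4: divide sum of fx by sum of f


print(estimated_mean(table))   # 18.0
```

For this made-up table the midpoints are 5, 15 and 25, the products are 20, 90 and 250, and the estimate is 360 divided by 20. Dividing by 3 (the number of rows) instead of 20 would give a very different, incorrect answer.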
How do you find the most popular group when data is bundled into intervals? Instead of finding a single numerical mode, you identify the modal class. This is the specific class interval that contains the highest frequency in a grouped table.
Identifying the modal class requires no calculation; it is a direct observation of the frequency column. Be extremely careful: you must state the class interval itself as your final answer, not the frequency number.
Identify the modal class for the following weights:
Step 1: Locate the highest number in the frequency column.
Step 2: State the corresponding class interval.
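As a rough sketch, the same observation can be expressed in Python. The table below is hypothetical: the variable `w`, the intervals, and the frequencies are all made up for illustration. Note that the function returns the interval, not the frequency.

```python
# Hypothetical grouped table of weights: (class interval, frequency).
# The intervals and frequencies are made up for illustration only.
weights = [
    ("40 < w <= 50", 3),
    ("50 < w <= 60", 9),
    ("60 < w <= 70", 5),
]


def modal_class(rows):
    """Return the class interval (not the frequency) with the highest frequency."""
    interval, _frequency = max(rows, key=lambda row: row[1])
    return interval


print(modal_class(weights))   # 50 < w <= 60
```

If two classes tie for the highest frequency, `max` returns only the first one, so a tie would need separate handling.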
One extreme value can completely distort an average, making it highly unrepresentative of the overall group. An outlier is an extreme value that does not fit the general pattern of the rest of the data.
The arithmetic mean is highly sensitive to outliers because it incorporates every value; extreme numbers pull the mean heavily toward them. In contrast, the median is robust and resistant to outliers because it only relies on the middle position of the ordered data.
Compare the mean and median for Set A and Set B to evaluate the most appropriate measure for Set B.
Step 1: Observe the measures for Set A.
Step 2: Observe the measures for Set B (which contains the outlier 100).
Step 3: Evaluate the most appropriate measure for Set B.
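The effect of an outlier can be seen with a small Python sketch using the standard `statistics` module. The two lists are hypothetical, made up for illustration. They are not the Set A and Set B referred to in the example.

```python
from statistics import mean, median

# Hypothetical data, made up for illustration; these are not the
# Set A and Set B from the worked example.
without_outlier = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 100]   # same list, but 6 is replaced by 100

for label, data in [("no outlier", without_outlier), ("outlier", with_outlier)]:
    print(label, "mean:", mean(data), "median:", median(data))
```

In this made-up case the mean rises from 4 to 22.8 when the outlier is introduced, while the median stays at 4, which is why the median is the more representative measure for the second list.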
When calculating the mean from a frequency table, students often divide the total by the number of rows (classes). You must always divide by the total frequency ($\sum f$).
Always remember to order your data list from smallest to largest before finding the median — this is the most common reason for dropping marks on simple list questions.
Never give the frequency as your answer for the modal class; you must write down the actual class interval, copying the inequality signs exactly.
In Edexcel exams, if you are asked to choose the best average for data with extreme values, write the exact phrase: 'The median is not affected by extreme values/outliers' to guarantee the reasoning mark.
When calculating an estimated mean, do not round your midpoints (e.g., use 17.5, not 18), as this will make your final estimate inaccurate.
Measure of central tendency
A single value representing the center of a data set, such as the mean, median, or mode.
Discrete data
Data that can only take specific, distinct values and cannot be meaningfully divided into smaller fractions (e.g., shoe sizes or number of people).
Arithmetic mean
The mathematical average, calculated by finding the sum of all values and dividing by the total number of values.
Median
The middle value of a set of data when it is arranged in numerical order.
Mode
The value that appears most frequently in a data set.
Bimodal
A data set that has two modes (two values that appear with the highest frequency).
Midpoint
The middle value of a class interval, calculated by adding the upper and lower limits and dividing by two.
Total frequency
The sum of the frequency column in a table, representing the total number of data points collected.
Modal class
The class interval that contains the highest frequency in a grouped frequency table.
Outlier
An extreme value that does not fit the general pattern of the rest of the data.