If you measure the depth of a river at ten different points, how do you summarize the whole river with just one number? Geographers use measures of central tendency to find the "typical" or middle value in large datasets, helping to spot trends in factors like population density or rainfall.
The mean is the mathematical average, calculated by adding all values together and dividing by the total number of items ($\bar{x} = \frac{\sum x}{n}$). The median is the middle value when all numbers are arranged in order. The mode is the most frequently occurring value in a dataset.
When dealing with grouped data in a frequency table, we cannot find an exact mode; instead, we identify the modal class, which is the category or interval with the highest frequency.
Worked Example: Mean
Step 2: Substitute the values:
Step 3: Calculate the final answer:
Worked Example: Median
Data: River pebble sizes in mm: 5, 8, 12, 16.
Step 1: Order the data (already done) and locate the middle. Since there is an even number of values, take the mean of the two middle values.
Step 2: Substitute the values: $\text{Median} = \frac{8 + 12}{2}$
Step 3: Calculate the final answer: $\text{Median} = 10\text{ mm}$
Worked Example: Mode
Data: Number of flood events per year over a decade: 1, 0, 2, 1, 1, 3, 0, 1, 2, 1.
Step 1: Count the frequency of each value. The number 1 appears five times, 0 appears twice, 2 appears twice, and 3 appears once.
Step 2: Identify the most frequently occurring value.
Step 3: Calculate the final answer: $\text{Mode} = 1$ flood event per year.
Worked Example: Modal Class
Data: Elevation categories: 0–10m (Frequency: 5), 11–20m (Frequency: 12), 21–30m (Frequency: 7).
Step 1: Look for the highest frequency. The highest frequency is 12.
Step 2: Identify the corresponding class. The modal class is 11–20m.
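The worked examples above can be checked with a few lines of Python. This is a minimal sketch using the standard `statistics` module; the variable names are illustrative, and the mean is computed on the flood-event data purely as an extra check.

```python
from statistics import mean, median, mode

# Data taken from the worked examples above.
pebble_sizes_mm = [5, 8, 12, 16]
flood_events = [1, 0, 2, 1, 1, 3, 0, 1, 2, 1]
elevation_classes = {"0–10m": 5, "11–20m": 12, "21–30m": 7}

# Even number of values: median() averages the two middle values.
median_pebble = median(pebble_sizes_mm)

# Mean = sum of values / number of values.
mean_floods = mean(flood_events)

# Mode = most frequently occurring value.
modal_floods = mode(flood_events)

# Modal class = the interval with the highest frequency.
modal_class = max(elevation_classes, key=elevation_classes.get)

print(median_pebble, mean_floods, modal_floods, modal_class)
```

Note that the modal class is found by comparing frequencies, not the interval labels themselves.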
Two cities might share the same average temperature, yet one stays within a narrow band all year while the other swings between extremes. A measure of dispersion describes how spread out a set of data is, which helps geographers compare variability between different sites.
The range is the simplest measure of spread, calculated as the difference between the maximum and minimum values. However, the range includes extreme outliers, which can distort your understanding of the data.
The interquartile range (IQR) is a much more reliable measure because it focuses only on the middle 50% of the data, removing the influence of extreme outliers. It is the difference between the upper quartile (the 75th percentile) and the lower quartile (the 25th percentile): $\text{IQR} = \text{UQ} - \text{LQ}$.
Worked Example: Range
Data: Weekly high temperatures: , , , , .
Step 1: Write the formula: $\text{Range} = \text{Maximum value} - \text{Minimum value}$
Step 2: Identify max and min: Max is , Min is .
Step 3: Calculate:
Worked Example: Interquartile Range (IQR)
Data: River velocities in m/s (ordered): 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9.
Step 1: Find the median (middle value): $0.5\text{ m/s}$.
Step 2: Find the lower quartile (middle of the lower half): $0.3\text{ m/s}$.
Step 3: Find the upper quartile (middle of the upper half): $0.8\text{ m/s}$.
Step 4: Calculate: $\text{IQR} = 0.8 - 0.3 = 0.5\text{ m/s}$
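The same calculation can be sketched in Python using the river velocities above. The helper `quartiles_by_halves` is a hypothetical name, and it assumes the "median of each half" method described in the steps (for an odd number of values, the middle value is left out of both halves). Statistical software often uses other quartile conventions and may give slightly different answers.

```python
from statistics import median

def quartiles_by_halves(values):
    """Return (lower quartile, median, upper quartile) via the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # With an odd count, the middle value belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(ordered), median(upper_half)

velocities = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9]  # m/s, from the worked example
lq, med, uq = quartiles_by_halves(velocities)

# Rounded to 2 d.p. to avoid floating-point noise in the subtraction.
iqr = round(uq - lq, 2)
data_range = round(max(velocities) - min(velocities), 2)

print(lq, med, uq, iqr, data_range)
```

The subtraction uses the data values at the quartile positions, never the position numbers themselves.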
Understanding exactly where the middle 50% of your data lies is crucial when analyzing thousands of pebble sizes on a beach. Cumulative frequency is a running total of frequencies up to the end of a specific class interval, allowing you to estimate medians and quartiles for grouped data.
When plotted, this data forms a stretched-S shape called an ogive. The curve must always start at the lowest value in the first interval with a cumulative frequency of 0.
A steeper section of the curve indicates a higher frequency of data clustered in that range. Because we do not know the exact raw data values within the groups, the values we extract from the graph are only estimates.
Interpreting a Cumulative Frequency Curve (Pebble Sizes)
Table Data: 0-20mm (10), 20-40mm (25), 40-60mm (30), 60-80mm (15).
Step 1: Calculate the running totals: 10, 35, 65, 80. The total frequency ($n$) is 80.
Step 2: Plot the points. Crucially, you must plot at the upper class boundary for the x-coordinate: (20, 10), (40, 35), (60, 65), (80, 80).
Step 3: To find the median, calculate the position: $\frac{n}{2} = \frac{80}{2} = 40$.
Step 4: Draw a clear construction line from 40 on the y-axis across to the curve, then down to the x-axis. The estimate is approximately $43\text{ mm}$.
Step 5: To find the quartiles, use $\frac{n}{4}$ for the lower quartile (position 20) and $\frac{3n}{4}$ for the upper quartile (position 60), reading from the graph in exactly the same way.
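The graph-reading steps above can be approximated in code. The sketch below assumes the plotted points are joined by straight lines (a hand-drawn smooth curve will give slightly different readings), and `estimate_from_ogive` is a hypothetical helper name rather than a standard function.

```python
from itertools import accumulate

def estimate_from_ogive(boundaries, frequencies, position):
    """Estimate the x-value at a cumulative-frequency position by linear interpolation.

    boundaries: class boundaries in order (one more entry than frequencies).
    """
    # Running totals, starting from 0 at the lowest class boundary.
    cumulative = [0] + list(accumulate(frequencies))
    if not 0 <= position <= cumulative[-1]:
        raise ValueError("position must lie between 0 and the total frequency")
    for i in range(1, len(cumulative)):
        c0, c1 = cumulative[i - 1], cumulative[i]
        if c1 > c0 and position <= c1:
            x0, x1 = boundaries[i - 1], boundaries[i]
            # Move along the straight line between the two plotted points.
            return x0 + (position - c0) / (c1 - c0) * (x1 - x0)
    return boundaries[-1]

boundaries_mm = [0, 20, 40, 60, 80]   # pebble size class boundaries
frequencies = [10, 25, 30, 15]
n = sum(frequencies)                  # total frequency = 80

median_est = estimate_from_ogive(boundaries_mm, frequencies, n / 2)
lq_est = estimate_from_ogive(boundaries_mm, frequencies, n / 4)
uq_est = estimate_from_ogive(boundaries_mm, frequencies, 3 * n / 4)

print(round(median_est, 1), round(lq_est, 1), round(uq_est, 1))
```

As with readings taken from the graph, these values are only estimates, because the raw values inside each class are unknown.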
How do we fairly compare population growth between a tiny village and a massive megacity? Geographers use relative change to compare the magnitude of shifts across different scales, usually expressing it as a percentage.
Percentage change shows the relative difference between an old and new value, expressed as a fraction of the original value multiplied by 100. A positive result indicates an increase, while a negative result indicates a decrease.
Proportion describes the relationship of a specific "part" to the "whole" dataset, usually expressed as a percentage ($\frac{\text{part}}{\text{whole}} \times 100$).
Worked Example: Percentage Increase
Data: In 2001, UK GDP was £1.4 trillion. In 2015, it reached £1.64 trillion.
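Step 1: Write the formula: $\text{Percentage change} = \frac{\text{New} - \text{Original}}{\text{Original}} \times 100$
Step 2: Calculate the difference (New - Original): $1.64 - 1.4 = 0.24$.
Step 3: Divide by the original value: $\frac{0.24}{1.4} \approx 0.1714$
Step 4: Multiply by 100: $0.1714 \times 100 \approx 17.1\%$ increase.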
Worked Example: Percentage Decrease
Data: A village population falls from 850 to 720.
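Step 1: Calculate the difference: $720 - 850 = -130$.
Step 2: Divide by the original starting value: $\frac{-130}{850} \approx -0.153$
Step 3: Multiply by 100: $-0.153 \times 100 = -15.3\%$, a decrease of about 15.3%.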
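Both percentage change examples follow the same formula, which can be sketched as a small Python function. The name `percentage_change` is illustrative; the key point is that the divisor is always the original value.

```python
def percentage_change(original, new):
    """Return the change from original to new as a percentage of the original."""
    if original == 0:
        raise ValueError("percentage change is undefined when the original value is 0")
    return (new - original) / original * 100

# UK GDP in £ trillion, 2001 to 2015 (from the worked example).
gdp_change = percentage_change(1.4, 1.64)

# Village population falling from 850 to 720.
village_change = percentage_change(850, 720)

print(round(gdp_change, 1), round(village_change, 1))
```

A positive result indicates an increase and a negative result a decrease, matching the sign convention described earlier.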
Being in the 90th percentile for a maths test is great, but being in the 90th percentile for river pollution is disastrous. A percentile is a measure indicating the value below which a given percentage of observations fall.
Percentiles divide a dataset into 100 equal parts to show a data point's relative position compared to the rest of the distribution. For example, a country in the 10th percentile for Gross National Income (GNI) is among the poorest 10% globally.
Quartiles are simply specific benchmark percentiles. The lower quartile is the 25th percentile, the median is the 50th percentile, and the upper quartile is the 75th percentile. Hydrologists also use percentiles to predict hazards, such as using the Q95 (the river flow level equalled or exceeded 95% of the time) to identify drought conditions.
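To illustrate relative position, the sketch below works out what percentage of observations fall below a chosen value, reusing the river velocity data from the interquartile range example. The function name `percentile_rank` and the "strictly below" rule are assumptions made for illustration; some textbooks and software also count values equal to the chosen one.

```python
def percentile_rank(values, value):
    """Return the percentage of observations that fall strictly below `value`."""
    below = sum(1 for v in values if v < value)
    return below / len(values) * 100

velocities = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9]  # m/s

rank_fast = percentile_rank(velocities, 0.8)    # 5 of 7 values are lower
rank_slow = percentile_rank(velocities, 0.2)    # no values are lower

print(round(rank_fast, 1), round(rank_slow, 1))
```

With only seven observations the ranks are coarse; percentiles become more meaningful on large datasets.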
When calculating percentage change, students often divide the difference by the new value instead of the original (starting) value. Always divide by the earliest year's data.
In 'Assess the usefulness' questions, examiners expect you to point out that the mean is easily distorted by extreme outliers, making the median a much more representative measure for skewed data.
When plotting a cumulative frequency graph, you MUST plot your points at the upper class boundary (the end of the interval) for the x-coordinate, never the midpoint.
Examiners require you to draw 'clear construction lines' (dotted lines from axes to the curve) when estimating medians or quartiles from a cumulative frequency graph to award full marks.
When calculating the Interquartile Range from a list, do not just subtract the position numbers (e.g., 7th minus 2nd); you must subtract the actual data values located at those positions.
Mean
The mathematical average, calculated by adding all values together and dividing by the total number of items.
Median
The middle value in a dataset when all numbers are arranged in ascending or descending order.
Mode
The most frequently occurring value in a dataset.
Modal class
The category or interval with the highest frequency in a grouped dataset.
Measure of dispersion
A statistical term describing how spread out a set of data is.
Range
The difference between the highest and lowest values in a dataset.
Interquartile range (IQR)
A measure of spread focusing on the middle 50% of the data, calculated as the upper quartile minus the lower quartile.
Upper quartile
The value at the 75th percentile of an ordered dataset, separating the highest 25% of data from the rest.
Lower quartile
The value at the 25th percentile of an ordered dataset, separating the lowest 25% of data from the rest.
Cumulative frequency
The running total of frequencies up to the end of a specific class interval.
Relative change
A term used to compare the magnitude of change across different scales, usually between an old and new value.
Percentage change
The relative change between an old and new value, expressed as a fraction of the original value multiplied by 100.
Proportion
A part, share, or number considered in comparative relation to a whole dataset.
Percentile
A measure indicating the value below which a given percentage of observations fall.
Relative position
The standing of a data point compared to the rest of the distribution, expressed as a percentage.