Broadcast networks decide whether to renew or cancel a television show by tracking the viewing habits of just a few thousand homes, rather than the whole country.
A census provides 100% accurate results but is often impractical due to high costs, excessive time requirements, or situations where testing destroys the item (such as measuring the lifespan of batteries). Because of this, we use samples. For a sample to be reliable, it must be a representative sample, meaning it shares the exact characteristics and proportions of the population. Larger samples are generally more reliable; OCR mark schemes generally consider samples of fewer than 30 items to be unreliable due to natural variation.
If a teacher asks for volunteers to answer a question, the eager students who raise their hands do not represent the whole class's level of understanding.
To carry out a simple random sample, you must use a rigorous methodology:
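One way to picture a simple random sample is the short Python sketch below. It is an illustration under stated assumptions rather than an exam-board procedure: the sampling frame is a made-up list of 120 students, and the names `simple_random_sample`, `sampling_frame`, and `students` are hypothetical. The sketch numbers every member of the sampling frame, generates random numbers, and ignores any duplicate numbers until the sample is full.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct members from a numbered sampling frame.

    Members are treated as numbered 1, 2, ..., N in list order.
    """
    population_size = len(sampling_frame)
    if sample_size > population_size:
        raise ValueError("sample size cannot exceed the population size")

    rng = random.Random(seed)  # a seed is optional; it makes a run repeatable
    chosen_numbers = []
    while len(chosen_numbers) < sample_size:
        # randint includes both end points, so every member can be picked
        number = rng.randint(1, population_size)
        if number in chosen_numbers:
            continue  # ignore any duplicate numbers
        chosen_numbers.append(number)

    # Convert the chosen numbers back into members of the sampling frame
    return [sampling_frame[n - 1] for n in chosen_numbers]


# Hypothetical sampling frame: a numbered list of 120 students
students = [f"Student {i}" for i in range(1, 121)]
sample = simple_random_sample(students, 30, seed=1)
print(len(sample))  # 30
```

Python's built-in `random.sample` would give distinct members in one call; the loop is written out here only to make the "ignore duplicates" step visible.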
A restaurant asking lunchtime diners to rate their new evening menu will quickly find that their feedback is heavily skewed and not useful.
When evaluating or criticising a data collection method, look for these common sources of bias:
Knowing that 6 out of 30 surveyed teenagers prefer a specific clothing brand allows retailers to predict how many shirts to ship to a city of a million people.
In a random sample of 60 gym members, 18 said they use the swimming pool. The gym has 1,500 members in total. Estimate the total number of members who use the swimming pool.
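Step 1: Identify the sample frequency, sample size, and total population. Here the sample frequency is 18, the sample size is 60, and the total population is 1,500.
Step 2: Substitute the values into the scaling formula: estimated total = (sample frequency ÷ sample size) × total population = (18 ÷ 60) × 1,500.
Step 3: Calculate the final answer: 0.3 × 1,500 = 450, so an estimated 450 members use the swimming pool.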
(Note: If your final estimated frequency is a decimal, you must round it to the nearest whole number, as you cannot have a fraction of a person or item.)
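The scaling calculation can also be checked with a few lines of Python. This is a minimal sketch, assuming the estimate is found by multiplying the sample proportion by the total population; the function name `estimate_population_frequency` and its parameter names are hypothetical. The gym figures come from the worked example, and the second call uses the teenagers figure (6 out of 30) scaled to a city of a million people.

```python
import math


def estimate_population_frequency(sample_frequency, sample_size, population_size):
    """Scale a sample frequency up to an estimate for the whole population."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")

    # Same as (sample_frequency / sample_size) * population_size,
    # but multiplying first avoids small floating-point errors
    estimate = sample_frequency * population_size / sample_size

    # Round halves up to the nearest whole number, since you cannot have
    # a fraction of a person. (Python's built-in round() rounds halves to
    # the nearest even number, so it is not used here.)
    return math.floor(estimate + 0.5)


# Gym example: 18 of 60 sampled members use the pool, 1,500 members in total
print(estimate_population_frequency(18, 60, 1500))  # 450

# Clothing brand example: 6 of 30 teenagers, scaled to 1,000,000 people
print(estimate_population_frequency(6, 30, 1_000_000))  # 200000
```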
When describing how to take a simple random sample, students often forget to write 'ignore any duplicate numbers' — this is frequently a required marking point in OCR exams.
When asked to 'criticise a sampling method', do not just say 'it is biased'. You must state specifically WHY it is biased (e.g., 'it was only conducted at 10 AM, which misses people who work').
In scaling calculations estimating population totals, always round your final answer to the nearest whole number if it represents discrete items like people, animals, or cars.
Examiners frequently look for the exact term 'sampling frame' when you are asked how to set up a sample from a large population — use it instead of just saying 'a list'.
Population
The entire group of people, objects, or items that are being studied or from which data could be collected.
Sample
A selection of items taken from the population that is used to represent the whole.
Census
A survey or investigation that includes every single member of the population.
Representative sample
A sample that shares the exact characteristics and proportions of the population, allowing for accurate inferences.
Simple random sampling
A sampling method where every member of the population has an equal probability of being chosen.
Selection bias
A systematic error where the person picking the sample subconsciously or purposefully chooses certain types of members.
Sampling frame
A complete, numbered list of every member or item in the population from which a sample is selected.
Random number generator
A tool, such as a calculator function or computer program, used to produce numbers entirely by chance to ensure unbiased selection.
Sampling bias
A systematic error where the sampling method results in certain members of a population being more or less likely to be selected than others.
Location bias
A bias occurring when data is collected from only one specific place, excluding parts of the population who do not visit that location.
Time bias
A bias occurring when data is collected at a specific time, excluding members of the population who are unavailable at that time.
Self-selection bias
A bias that occurs when participants volunteer themselves for a study, often resulting in a sample with over-represented extreme opinions.
Convenience bias
A bias where the sample is chosen based on ease of access rather than random selection.
Statistical inference
The process of using data collected from a representative sample to make estimates or predictions about the whole population.
Scaling factor
A multiplier used to scale up the proportions found in a sample to estimate totals for the entire population.