Sample Means and the Population Mean

Any sample mean is only an estimate of the true population mean. If you took a second sample, and then a third, you might find a different mean each time. Imagine that you took one hundred samples and calculated the mean value for each. You would have 100 mean values. Some might be the same as each other, but many would differ. Now imagine you could take every possible sample from a population and calculate their means.
You have seen how a set of measurements forms a distribution, and the same is true of these sample means. They form what is known as the sampling distribution of sample means.
This distribution has some very useful properties, as you will see in the next steps of this tutorial. The important things to remember now are:
The sampling distribution is a theoretical construct. You do not actually have to collect multiple samples - one will do;
The mean of any single sample is just one value of the many possible values in the sampling distribution of sample means;
Sampling distributions are at the heart of a number of parametric statistical techniques. If you can understand them, you will have cracked the theory of all these techniques.
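Although you never need to build a sampling distribution in practice, simulating one can make the idea concrete. The sketch below is an illustration only: the population is invented (10,000 values drawn around 50), and the function name sampling_distribution, the sample size of 30 and the 1,000 repeats are all assumptions chosen for the example, not part of this tutorial's data.

```python
import random
import statistics


def sampling_distribution(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


# Hypothetical population: 10,000 values scattered around 50.
rng = random.Random(1)
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Take 1,000 samples of 30 values each and keep every sample mean.
means = sampling_distribution(population, sample_size=30, n_samples=1000)

print("population mean:", round(statistics.mean(population), 2))
print("mean of the sample means:", round(statistics.mean(means), 2))
print("smallest sample mean:", round(min(means), 2))
print("largest sample mean:", round(max(means), 2))
```

Each value in means is the mean of one possible sample. The individual sample means differ from one another, but they cluster around the population mean.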
One thing to remember at this point is that larger samples tend to have a mean that is closer to the population mean than smaller samples do. You can explore this in the game below.
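The effect of sample size can also be checked with a short simulation. This is a minimal sketch under assumed conditions: the population is made up, the helper spread_of_sample_means is a hypothetical name, and the sample sizes 5, 50 and 500 are arbitrary. It measures how widely the sample means are scattered for each sample size.

```python
import random
import statistics


def spread_of_sample_means(population, sample_size, n_samples=500, seed=0):
    """Return the standard deviation of the means of n_samples random samples."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)


# Hypothetical population: 10,000 values scattered around 50.
rng = random.Random(2)
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Smaller spread means the sample means sit closer to the population mean.
for n in (5, 50, 500):
    spread = spread_of_sample_means(population, n)
    print("sample size", n, "spread of sample means", round(spread, 2))
```

The spread shrinks as the sample size grows, which is another way of saying that the mean of a large sample is likely to land nearer the population mean.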
You will learn more about the relationship between samples and populations in the section called samples and populations.
Imagine a car company knows the length of every car it has ever made. Obviously, it could calculate the mean length by adding the lengths of every car it has made and dividing that value by the number of cars it has made. That would be the population mean as it covers every car made.
Now imagine that you want to know what that average car length is, but the company won't tell you. You can't measure every car in the world, but you can walk around a big car park and take a sample.
If you just measured the first two cars you came to, you could easily be unlucky and get two very large cars. Your sample mean would be much bigger than the population mean. However, as you measure more cars, your sample mean would get closer to the population mean.
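The car park walk can be imitated in a few lines of code. The sketch below assumes invented car lengths (in metres, scattered around 4.5) and a hypothetical helper called running_means. It recalculates the sample mean after every new car is measured.

```python
import random


def running_means(values):
    """Return the sample mean after each successive measurement."""
    total = 0.0
    means = []
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


# Hypothetical car lengths in metres; the figures are made up for illustration.
rng = random.Random(3)
car_lengths = [rng.gauss(4.5, 0.5) for _ in range(1000)]

means = running_means(car_lengths)

# Show the sample mean after 2, 10, 100 and 1000 cars.
for n in (2, 10, 100, 1000):
    print("cars measured:", n, "sample mean:", round(means[n - 1], 3))
```

With only two cars the sample mean can sit some way from 4.5, but it usually settles close to that value as more cars are added.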
This game lets you explore how this works. The population mean is shown at the top of the table below, with your sample mean below it. The number of cars you have measured (your sample size) is also shown. Click 'Measure' to measure a new car and see how the sample mean changes each time. To start again, click 'Reset'. To speed things up, you can click 'Measure 100' to add 100 measurements at a time, but first look at how the mean changes when you add one measurement at a time.
As you measure more sample cars, does the sample mean move closer to or further from the population mean?
Use the 'Reset' button a few times to start a new sample. Is the sample mean the same each time?
The sample mean changes each time a measurement is added. Does a single measurement affect the mean of large samples or small samples more?
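Once you have tried the last question in the game, you can check your answer with a little arithmetic. The sketch below uses a hypothetical function, mean_shift, and invented numbers: a 6.0 m car joins samples of different sizes that all have a mean of 4.5 m.

```python
def mean_shift(current_mean, n, new_value):
    """Return how far the mean moves when new_value joins a sample of size n."""
    new_mean = (current_mean * n + new_value) / (n + 1)
    return new_mean - current_mean


# The same unusually long car is added to samples of 2, 20 and 200 cars.
for n in (2, 20, 200):
    shift = mean_shift(4.5, n, 6.0)
    print("sample size", n, "mean moves by", round(shift, 4))
```

The shift equals the distance between the new value and the current mean, divided by the new sample size, so the same measurement moves the mean less and less as the sample grows.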
The topic on confidence intervals in this tutorial explains how you can use what you have learned here to work out how far from the population mean your sample mean is likely to be.
To check that you understand what we have done so far, here are a couple of test questions about your data.
Does the data you have collected represent a sample of a larger population, or have you collected a measurement from every possible patient there is?
The mean of age is 52.1. Do you think that the population mean has the same value?
If you had to guess at the population mean of age for every patient in the whole population, what would you say?
Your sample contains 291 observations. How might you improve the quality of the sample mean as an estimate of the population mean?