Sampling Distribution of the Sample Means
We have already introduced the concept of sample means. The idea is that if you took many samples (rather than just the one you got from your study), you would get a different mean each time. These means form a distribution known as the sampling distribution of the sample means.
Now, we introduce an interesting fact:
Provided the samples are big enough, no matter how the values in the population are distributed, the distribution of these sample means will be approximately normal!
This might take a moment to think about, but it is true. Suppose you were to do the following:
- Take a sample of (say) 20 measurements from a population using simple random sampling;
- Calculate the mean of that sample;
- Record that mean and repeat the process from step one lots of times;
- When you have many samples, take the list of means (one from each sample) and plot a frequency histogram for them.
The shape of that histogram will be approximately normal. This fact is explained by the Central Limit Theorem, which you can read about in the extra topic below.
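The sampling steps listed above can be tried out with a short simulation. The following is only a minimal sketch in Python using the standard library: the population (an exponential distribution with mean 1, chosen because it is strongly skewed), the number of repetitions (5000), and the helper names `simulate_sample_means` and `text_histogram` are all assumptions made for illustration, not part of the original material.

```python
import random
import statistics


def simulate_sample_means(draw, sample_size=20, n_samples=5000, seed=0):
    """Repeat the sampling steps and return the list of sample means.

    draw: function that takes a random.Random and returns one population value.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        # Take one simple random sample of `sample_size` measurements.
        sample = [draw(rng) for _ in range(sample_size)]
        # Calculate and record the mean of that sample.
        means.append(statistics.fmean(sample))
    return means


def text_histogram(values, bins=15, width=50):
    """Return a rough text frequency histogram (one line per bin)."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / step), bins - 1)  # put the maximum in the last bin
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, count in enumerate(counts):
        bar = "#" * round(width * count / peak)
        lines.append(f"{lo + i * step:6.2f} | {bar}")  # label is the bin's left edge
    return "\n".join(lines)


# Assumed population for illustration: exponential with mean 1 (right-skewed).
means = simulate_sample_means(lambda rng: rng.expovariate(1.0))
print(text_histogram(means))
```

Although the assumed population is far from normal, the printed histogram of the 5000 sample means should look roughly bell-shaped and be centred near the population mean of 1. Swapping in a different `draw` function lets you try other population shapes.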
Remember that all these different samples are taken from the same population, so each will have a distribution that is similar to that of the population and a mean that is close to the population mean. Whatever shape the population distribution has, the distribution of the sample means will be close to normal, provided the samples are big enough.
This fact forms the basis of many statistical techniques and is the reason why your sample doesn't have to be normally distributed for them to work (however, with small samples, the population distribution needs to be closer to normal than with large samples). It is also worth remembering that you only need to take one sample for the techniques to work. The theory requires you to imagine multiple samples to understand it, but in practice, one sample is sufficient. Understanding the normal distribution helps you understand a lot of statistical techniques.
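To see why sample size matters, one rough check is to compare how lopsided the distribution of sample means is for small and large samples drawn from the same skewed population. The sketch below is an illustration under stated assumptions: the population is again an exponential distribution with mean 1, skewness (0 for a perfectly symmetric distribution) is used as a simple stand-in for "how far from normal", and the helper names and sample sizes (3 and 50) are hypothetical choices.

```python
import random
import statistics


def skewness(values):
    """Skewness: the average cubed standardised deviation (0 if symmetric)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def sample_mean_skewness(sample_size, n_samples=4000, seed=1):
    """Skewness of the sample means for repeated samples of a given size."""
    rng = random.Random(seed)
    means = [
        # Assumed population for illustration: exponential with mean 1.
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]
    return skewness(means)


skew_small = sample_mean_skewness(sample_size=3)
skew_large = sample_mean_skewness(sample_size=50)
print(f"n=3:  skewness of sample means = {skew_small:.2f}")
print(f"n=50: skewness of sample means = {skew_large:.2f}")
```

With samples of 3, the sample means should still be noticeably skewed, like the population itself; with samples of 50, the skewness should be much closer to 0, which is what you would expect from a distribution that is approximately normal.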