## The Normal Distribution


### Explanation

#### Sampling Distribution of the Sample Means

We have already introduced the concept of sampling means: if you took many samples (rather than just the one you got from your study), you would get a different mean each time. These means form a distribution known as the distribution of sampling means (more formally, the sampling distribution of the mean).

Now, we introduce an interesting fact:

Provided the samples are big enough, no matter how the values in the population are distributed, the distribution of these sample means will be approximately normal!

This might take a moment to think about, but it is true. If you were to do the following:

1. Take a sample of (say) 20 measurements from a population using simple random sampling;
2. Calculate the mean of that sample;
3. Record that mean and repeat the process from step one lots of times;
4. When you have many samples, take the list of means (one from each sample) and plot a frequency histogram for them.

The shape of that histogram will be approximately normal. This fact is explained by the Central Limit Theorem, which you can read about in the extra topic below.
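
The four steps above can be tried out in a few lines of code. What follows is only a minimal sketch, not part of the original exercise: the population (an exponential distribution with mean 1, chosen because it is strongly skewed), the number of repetitions (5000), and the helper names `sample_means` and `text_histogram` are all assumptions made for illustration.

```python
import random
import statistics


def sample_means(population_draw, sample_size, n_samples, rng):
    """Steps 1-3: draw a sample, calculate its mean, record it, and repeat."""
    means = []
    for _ in range(n_samples):
        sample = [population_draw(rng) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


def text_histogram(values, n_bins=12, width=50):
    """Step 4: a rough frequency histogram drawn with rows of '#'."""
    lo, hi = min(values), max(values)
    bin_width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values match
    counts = [0] * n_bins
    for v in values:
        idx = min(int((v - lo) / bin_width), n_bins - 1)
        counts[idx] += 1
    scale = width / max(counts)
    lines = [
        f"{lo + i * bin_width:6.2f} | {'#' * round(c * scale)}"
        for i, c in enumerate(counts)
    ]
    return counts, lines


rng = random.Random(42)  # fixed seed so the run is repeatable


# Assumed population: exponential with mean 1 (far from normal).
def draw(r):
    return r.expovariate(1.0)


means = sample_means(draw, sample_size=20, n_samples=5000, rng=rng)
counts, lines = text_histogram(means)
print("\n".join(lines))
print("mean of the sample means:", round(statistics.fmean(means), 3))
```

Even though the individual values come from a lopsided distribution, the printed histogram of the 5000 means should look roughly bell-shaped and be centred near the population mean of 1.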

Remember that all these different samples are taken from the same population. Each one will consequently have a distribution that is similar to that of the population and a mean that is close to the population mean. Whatever shape the population distribution is, the distribution of sampling means will be close to normal, provided the samples are large enough.

This fact forms the basis of many statistical techniques and is the reason why your sample doesn't have to be normally distributed for them to work (however, with small samples, the distribution needs to be closer to normal than with large samples). It is also worth remembering that you only need to take one sample for the techniques to work. The theory requires you to imagine multiple samples to understand it, but in practice, one sample is sufficient. Understanding the normal distribution helps you understand a lot of statistical techniques.

### Exploration

Use this game to explore how generating random numbers can lead to a normal distribution of sample means.

When you click the Sample button, the program will pick a sample of random numbers from 0 to 8 and then calculate their mean. How many numbers it picks is up to you - choose a number from 1 to 50 in the box provided.

The frequency histogram of the sample means will build up as you take more and more samples. To start again with a different sample size, click Clear.

You will need to click the Sample button repeatedly to build up a population of sample means. Fast repeated clicks will get you there sooner!

• Start with a sample size of 1. That will simply plot the distribution of the random numbers. It should be pretty flat after a while because the random numbers are being picked with equal probability - like rolling dice.
• Then increase the sample size to 5, click Clear and start clicking Sample again. What happens to the distribution of means?
• Now change the sample size to 30. Now what happens to the distribution of means?


Can you see that, although the data have a flat distribution, the sample means are approximately normally distributed? Remember, this takes LOTS of clicks on the Sample button.
What is the relationship between the width of the distribution and the size of the samples?
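
If the applet is not available, the same experiment can be approximated in code, which also gives one way to check your answer to the question above. This is a sketch under assumptions: it treats the random numbers as whole numbers from 0 to 8, each equally likely, uses 4000 "clicks" per sample size, and the function name `applet_means` is invented for illustration. It compares the observed spread of the means with the standard result that the standard deviation of sample means is the population standard deviation divided by the square root of the sample size.

```python
import math
import random
import statistics


def applet_means(sample_size, clicks, rng):
    """One 'click' draws sample_size integers from 0..8 and records their mean."""
    means = []
    for _ in range(clicks):
        sample = [rng.randint(0, 8) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means


rng = random.Random(0)  # fixed seed so the run is repeatable

# Standard deviation of nine equally likely integers 0..8 (assumed population).
population_sd = math.sqrt((9 ** 2 - 1) / 12)

spreads = {}
for n in (1, 5, 30):
    means = applet_means(n, clicks=4000, rng=rng)
    spreads[n] = statistics.pstdev(means)
    predicted = population_sd / math.sqrt(n)
    print(f"sample size {n:2d}: sd of means = {spreads[n]:.3f}, "
          f"population sd / sqrt(n) = {predicted:.3f}")
```

The spread of the means shrinks as the sample size grows, so the histogram becomes narrower as well as more bell-shaped.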

### Application

Now think about your own data: what would happen if you collected the same amount of data again by measuring a different group drawn from the same population?