The Ponzo Illusion

Samples and Populations

You are currently on Samples and Populations at level 3.

Explanation

Sampling Means
We have said that a sample is a set of measurements picked with equal probability from a larger population and that a consequence of this fact is that two samples are unlikely to be the same. We also saw that statistics allows us to infer things about the population from a single sample.

Level 1 of this topic introduced you to simple random sampling, sampling error and the idea of inferential statistics. At this level, we will expand on those topics a little.

Simple Random Sampling
When we say that samples from a population are picked at random, we mean that we have tried to ensure that every member of the population has an equal chance of making it into the sample. Think about rolling a die. If you didn't have an equal chance of rolling each number, you would claim that the die was not properly random. We do not mean that the values we get are random or collected in a haphazard kind of way. In fact, taking a random sample often requires careful planning.
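
To make the idea of an equal chance concrete, here is a minimal sketch in Python. The population of 1,000 numbered members, the sample size of 20 and the seed are all assumptions made up for illustration; they are not part of your study.

```python
import random

# Hypothetical population: 1,000 numbered members (made up for this sketch).
population = list(range(1000))

# A fixed seed is used only so that the sketch gives the same result each run.
rng = random.Random(42)

# random.sample draws without replacement, and every member of the
# population is equally likely to end up in the sample.
sample = rng.sample(population, k=20)

print(sorted(sample))
```

Changing the seed gives a different sample, but the procedure that produced it is the same. That procedure, not the values themselves, is what makes the sample random.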

Sampling Error
If two samples from the same population are unlikely to contain the same values, then they are unlikely to have the same mean (or any other descriptive statistic). It follows that they are also unlikely to have the same mean as the population from which they were taken (they can't both be right!). Any difference between a sample mean and the population mean is known as the sampling error of the mean. More generally, any difference between a sample statistic and the corresponding population statistic is known as sampling error.
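
The sampling error of the mean can be illustrated with a short Python sketch. The population here is invented (10,000 values drawn around 50), and the helper name `sampling_error_of_mean` is hypothetical. In a real study the population mean is usually unknown, which is why this error cannot normally be calculated directly.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 measurements (made-up values).
population = [rng.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)


def sampling_error_of_mean(sample, population_mean):
    """Difference between a sample mean and the population mean."""
    return statistics.mean(sample) - population_mean


# Two simple random samples of 25 values from the same population.
sample_a = rng.sample(population, 25)
sample_b = rng.sample(population, 25)

error_a = sampling_error_of_mean(sample_a, population_mean)
error_b = sampling_error_of_mean(sample_b, population_mean)

print("sampling error of sample A:", round(error_a, 2))
print("sampling error of sample B:", round(error_b, 2))
```

The two samples come from the same population, yet their means differ from each other and from the population mean.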

Inferential Statistics
When we report the mean or standard deviation of a sample (or any other statistic about it) we are stating a verifiable fact about the sample. Such facts are known as descriptive statistics as they describe something about the sample. If we want to use a sample to make a statement about the population, we cannot be as confident. We must use inferential statistics to infer what is most likely to be true about the population. You will see in the section on confidence intervals how we report a range into which we are confident that the population mean falls, rather than just reporting a value for a population mean.

Exploration

In this game, you will see how drawing a sample from a population gives a glimpse of the structure of the population and how larger samples reflect that structure better. Look at the coloured square on the right-hand side of the game. This represents the population: you can see that it has a distribution of colours such that blue is the most common and red is the least common. When you click the [Sample] button, the game will pick a number of points at random from the population and put them in the sample box. You can choose the number of points sampled. Click again to see a different random sample. For each sample, the game also tells you the percentage of dots of each colour that were chosen, along with the percentages for the population as a whole. How large does your sample need to be before it reliably shows the same distribution as the population?
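
If you cannot run the game, the same experiment can be roughly imitated in Python. This is only a sketch under stated assumptions: the colour proportions below are invented and are not the values used by the game, the function names are hypothetical, and points are drawn with replacement, which treats the population as very large.

```python
import random
from collections import Counter

# Hypothetical colour proportions (NOT the game's actual values).
POPULATION = {"blue": 0.5, "green": 0.3, "red": 0.2}


def sample_percentages(n, rng):
    """Draw n points at random and return the percentage of each colour."""
    colours = list(POPULATION)
    weights = list(POPULATION.values())
    draws = rng.choices(colours, weights=weights, k=n)
    counts = Counter(draws)
    return {colour: 100 * counts[colour] / n for colour in colours}


def largest_gap(n, rng):
    """Largest difference, in percentage points, between sample and population."""
    percentages = sample_percentages(n, rng)
    return max(abs(percentages[c] - 100 * POPULATION[c]) for c in POPULATION)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for n in (10, 100, 1000, 10000):
    print(n, "points: largest gap =", round(largest_gap(n, rng), 1))
```

Running the loop several times with different seeds should show the gap tending to shrink as the sample size grows, although any single small sample can still land close to the population by chance.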


Imagine you were to sample from the population above following these steps twenty times:

  1. Pick a random point from the population.
  2. Record its colour.

How would you describe the probability of any single point being picked each time in step 1?
35% of the population is red. If 25% of a certain sample is red, which of these statements best describes the sampling error of the proportion of reds in the sample?

Application

Now we can look at your data.
If we calculate the mean from the sample, is the result a descriptive or an inferential statistic?
The true mean is probably not the same as our sample mean. What does this statement describe?