## Measures of Central Tendency


### Explanation

Introducing Central Tendency
The measure of central tendency of your data is the single value that best represents all of the data. It is the value that you would pick if you had to guess which of your data points somebody had chosen at random. This value often (but certainly not always) lies in the 'middle' of the data, in the sense that it has as many values above it as it has below.

There are three main measures of central tendency:

• The mean is the result of adding all the values in your data together and dividing the total by the number of data points you have. The mean is the measure that people often refer to as the average. For example, the mean height is 180 cm;
• The median is the result of arranging the values in order and finding the middle value in the resulting list. For example, the median age is 45;
• The mode is the most commonly occurring value in your data. This corresponds to the highest bar in the frequency histogram. For example, the most common number of children in a family is 2.
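As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The `children` list is made-up illustrative data, not data from this tutorial.

```python
from statistics import mean, median, mode

# Hypothetical data: number of children in nine families (illustration only)
children = [0, 1, 2, 2, 2, 3, 3, 4, 10]

# Mean: add all the values together and divide by the number of data points
children_mean = mean(children)

# Median: arrange the values in order and take the middle one
children_median = median(children)

# Mode: the most commonly occurring value
children_mode = mode(children)

print("mean:", children_mean)
print("median:", children_median)
print("mode:", children_mode)
```

In this made-up sample the single large family pulls the mean above both the median and the mode.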
The most appropriate measure for a given data set depends on the data itself.
• Continuous values such as height are suitable for using the mean, for example 'The mean height is 32.5 cm'. The mode is not a good measure to use with continuous values measured to high accuracy as such data may not contain any repeated values. For example, if you measured the height of ten people to the nearest millimetre, you might get ten different values;
• Discrete values such as number of children are better suited to the mode or median, which avoids statements such as 'The average is 2.3 children';
• Categorical data such as colour of cars sold should use the mode, for example, 'Red is the most common car colour'.
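The guidance above can be illustrated with a short Python sketch. Both data sets are hypothetical, and `multimode` (available in Python 3.8 and later) is used here only to show what happens when no value repeats.

```python
from statistics import mean, mode, multimode

# Hypothetical heights of ten people in cm, measured to the nearest millimetre
heights_cm = [162.3, 170.1, 158.7, 181.4, 175.0,
              167.9, 172.6, 169.2, 177.8, 164.5]

# No height is repeated, so every value ties for "most common"
# and the mode tells us nothing useful
height_modes = multimode(heights_cm)

# The mean still gives a sensible single summary of continuous data
mean_height = mean(heights_cm)

# Hypothetical categorical data: colours of cars sold
colours = ["red", "blue", "red", "silver", "black", "red", "blue"]

# A mean or median of colours is meaningless, but the mode is well defined
commonest_colour = mode(colours)

print(len(height_modes), "values tie for the mode of the heights")
print("mean height:", round(mean_height, 2), "cm")
print("most common colour:", commonest_colour)
```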
These three terms are described in more detail in the help section below.

### Exploration

Use this game to experiment with calculating the average of up to 9 different values. Type in any values you want, and click 'Go' to see the three different measures of the average.

Which measure is used if you enter words instead of numbers (try repeating a word in more than one box)?
What happens to the mode if two different values are equally the most common?
Explore the effect of extreme values. Enter 8 low numbers and one very high one. Which measure is affected by the high value?
Some extra challenges:
• Using numbers from 1 to 10 only, how far apart can you get the three different measures?
• Can you enter a set of numbers such that the mode is the highest of the three measures and the median the lowest?
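One way to check your answers to the extreme-value and tied-mode questions is a short Python sketch like the one below. The numbers are arbitrary examples, and `summarise` is a hypothetical helper written for this illustration, not part of the game.

```python
from statistics import mean, median, multimode

def summarise(values):
    """Return the three measures for a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),  # a list, so ties are all reported
    }

# Eight low numbers (arbitrary example values)
low_values = [1, 2, 2, 3, 3, 3, 4, 4]

# The same eight numbers plus one very high value
with_outlier = low_values + [100]

before = summarise(low_values)
after = summarise(with_outlier)

# Two values that are equally the most common
tied = multimode([1, 1, 2, 2, 3])

print("without extreme value:", before)
print("with extreme value:   ", after)
print("tied modes:", tied)
```

Comparing the two printed summaries shows which of the three measures moves when the extreme value is added.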

### Application

Your experiment generated data describing two variables.
The independent variable, Recall Interval, separates your experimental samples into 100 msecs and 1200 msecs.
The dependent variable, Items Correctly Recalled, takes discrete numeric values.

You can view the data from your study here.

The Mean
Remembering that the mean is for numbers only, which variable(s) would it be possible to calculate a mean for?
As we measured items correctly recalled in two samples (the 100 msecs sample and the 1200 msecs sample), we can calculate the mean of each. This is the first step in finding out whether one set of values is different from the other. The mean for each sample is listed below:
• The mean of items correctly recalled when recall interval is 100 msecs is 13.9.
• The mean of items correctly recalled when recall interval is 1200 msecs is 3.
Looking at the means, would you say that one sample contained higher measurements of items correctly recalled than the other? If so, which sample tends to have the higher number of items correctly recalled?
The Mode
As with the mean, we can calculate the mode for both samples:
• There is a single most frequent value for items correctly recalled when recall interval is 100 msecs. It is 14.
• There is a single most frequent value for items correctly recalled when recall interval is 1200 msecs. It is 3.

In some sets of data, there is a single value that appears most often, and that is the mode. In other sets of data, there is no clear winner because several values appear with equal frequency.
Which of these best describes items correctly recalled when recall interval is 100 msecs?
Which of these best describes items correctly recalled when recall interval is 1200 msecs?
The Median
For the 100 msecs sample, the median of items correctly recalled is 14. What does that signify?
For the 1200 msecs sample, the median of items correctly recalled is 3. What does that signify?
