STATISTICS

A web-based unit for Integrated Algebra

Central Tendency

A measure of central tendency is a single number that best describes a set of data values. There are three measures of central tendency: mean, median, and mode. We also use the term range to describe a set of values. While we will be discussing the range in this mini-lecture, it should be noted that range is NOT a measure of central tendency.



Mean

The mean of a data set is often referred to as the average, and describes the middle of a set of data. To find the mean, add up all the values in the data set and then divide that sum by the number of values in the set.

Ex.  11, 16, 10, 16, 12, 31

11 + 16 + 10 + 16 + 12 + 31 = 96

96 / 6 = 16

So, the mean of this set of data is 16. 

The mean is a good measure of central tendency when there are no outliers in the data set. Outliers are data values that are considerably larger or smaller than all other values in the set. In the example above, 31 would be considered an outlier because it is significantly larger than the other data values.

Let's see what happens to the mean when we remove the outlier.

Ex. 11, 16, 10, 16, 12

11 + 16 + 10 + 16 + 12 = 65

65 / 5 = 13

So, the mean of the data set without the outlier is 13, which is closer to the middle of the data set than 16.
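
The two calculations above can also be checked with a short Python sketch. The function name `mean` and the variable names are our own choices for illustration, not part of the lesson.

```python
def mean(values):
    """Add up the values, then divide by how many values there are."""
    return sum(values) / len(values)

# The example data set from this section.
data = [11, 16, 10, 16, 12, 31]

# The same data set with the outlier (31) removed.
without_outlier = [x for x in data if x != 31]

print(mean(data))             # 16.0
print(mean(without_outlier))  # 13.0
```

Removing the single value 31 moves the mean from 16 down to 13, which shows how strongly one outlier can pull the mean.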



Median

When a data set has outliers, the median is a better measure of central tendency to use. The median is the data value that lies in the middle of the data set when the numbers are ordered from least to greatest. It is a good idea to order your data set from least to greatest before finding any measure of central tendency so that you do not forget to do this when finding the median.

If your data set contains an odd number of values, the median is simply the value that lies in the middle.  If there is an even number of values, you need to find the mean of the two numbers that lie in the middle.

Ex.  11, 16, 10, 16, 12, 31

From least to greatest:  10, 11, 12, 16, 16, 31

There is an even number of values in the set, so we need to take the mean of the two middle numbers, 12 and 16.

12 + 16 = 28

28 / 2 = 14

So, the median of the data set is 14. Notice that even with the outlier, this is a better representation of the middle of the data set than the first mean we calculated (16).
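
The steps for finding the median (order the data, then take the middle value or the mean of the two middle values) can be sketched in Python as follows. The function name `median` is our own choice for illustration.

```python
def median(values):
    """Return the middle value of the data once it is ordered."""
    ordered = sorted(values)   # least to greatest
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the single middle value.
        return ordered[middle]
    # Even number of values: take the mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2

data = [11, 16, 10, 16, 12, 31]

print(sorted(data))                   # [10, 11, 12, 16, 16, 31]
print(median(data))                   # 14.0
print(median([11, 16, 10, 16, 12]))   # 12
```

The last line uses the data set without the outlier. It has an odd number of values, so the median is simply the middle value, 12.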



Mode

The mode of a data set is the value that appears most often. Unlike the mean and median, it is possible to have more than one mode for a set of data. It is also possible to have no mode (if no piece of data appears more than once). When there is no mode for a data set it is important to write, "No mode," or, "None." If you write 0, your response will be marked incorrect, as this means the number zero is the mode.

Ex.  11, 16, 10, 16, 12, 31

The number 16 appears twice in the data set, which is more than any other value in the data set.  Thus, the mode is 16.

A special feature of the mode is that it is the only measure of central tendency that can be used when your data are not numbers. For example, in a survey of students' favorite pizza topping your data set would consist of words (such as cheese, pepperoni, anchovies, etc.). We cannot determine a mean or median for words, but we can determine which topping(s) appeared the most.
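
Here is a minimal Python sketch of finding the mode. The function name `modes` and the pizza survey responses are made up for illustration. The function returns a list because a data set can have one mode, several modes, or no mode at all (shown here as an empty list).

```python
from collections import Counter

def modes(values):
    """Return every value that appears most often, or [] if there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # No value appears more than once, so there is no mode.
        return []
    return [value for value, count in counts.items() if count == highest]

data = [11, 16, 10, 16, 12, 31]
print(modes(data))       # [16]

# Hypothetical survey responses: the mode also works for words.
toppings = ["cheese", "pepperoni", "cheese", "anchovies"]
print(modes(toppings))   # ['cheese']

# No value repeats, so there is no mode (not a mode of 0).
print(modes([1, 2, 3]))  # []
```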
 


Range 

Remember, the range is not a measure of central tendency, but it is a value which can be used to describe a set of data. The range is the difference between the greatest and smallest values in a data set.

Ex.  11, 16, 10, 16, 12, 31

Greatest value: 31

Smallest value: 10

31 - 10 = 21

The range of the data set is 21. 
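
The same subtraction can be written as a short Python sketch. The function name `data_range` is our own choice for illustration (it avoids clashing with Python's built-in `range`).

```python
def data_range(values):
    """Return the greatest value minus the smallest value."""
    return max(values) - min(values)

data = [11, 16, 10, 16, 12, 31]

print(max(data))         # 31
print(min(data))         # 10
print(data_range(data))  # 21
```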




Now that you are finished, take the quiz to test your understanding of central tendency.