
Calculating the mean

The mean of a numeric variable is calculated by adding the values of all observations in a data set and then dividing that sum by the number of observations in the set. This provides the average value of all the data.

Mean = sum of all the observation values ÷ number of observations

There are two types of numeric variables: discrete and continuous. Discrete variables take only separate, distinct values and cannot be subdivided. For example, a hockey player can score 1 or 2 goals, but never 1 and a half goals. Continuous variables, however, can be divided into smaller units. A student's age can be 11 years, 7 months and 3 days, as opposed to just 11 or 12 years.

It is important that you understand the difference between these two types of variables, so that you can properly calculate the mean in any given situation. The first three examples below use discrete variables to calculate the mean; the fourth uses grouped data, which may be continuous or discrete.

Example 1 – Soccer tournament at Mount Rival I (discrete variables)
Example 2 – Traffic fatalities (discrete variables)
Example 3 – Soccer tournament at Mount Rival II (frequency tables – discrete variables)
Example 4 – Height of 50 Grade 10 girls (grouped variables – continuous or discrete variables)
Summary

Example 1 – Soccer tournament at Mount Rival I

Mount Rival hosts a soccer tournament each year. This season, in 10 games, the lead scorer for the home team scored 7, 5, 0, 7, 8, 5, 5, 4, 1 and 5 goals. What was the mean score?

Mean = sum of all the observed values ÷ number of observations
           = (7 + 5 + 0 + 7 + 8 + 5 + 5 + 4 + 1 + 5) ÷ 10
           = 47 ÷ 10
           = 4.7

Therefore, in the 10-game tournament, the player scored an average of 4.7 goals per game. The average of 4.7 is not a whole number so it only has meaning in a statistical sense. In reality, it is impossible to score 4.7 goals, even if you are a top scorer.
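The calculation above can be sketched in a few lines of Python. This is only an illustration: the names goals, mean and mean_goals are chosen here and do not come from the original example.

```python
# Goals scored by the lead scorer in each of the 10 games (from Example 1).
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]


def mean(values):
    """Return the sum of the values divided by the number of values."""
    return sum(values) / len(values)


mean_goals = mean(goals)
print(mean_goals)  # 4.7
```

The function follows the formula directly: sum(values) is the sum of all the observed values and len(values) is the number of observations.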

The mathematical notation to calculate the mean for a discrete variable is as follows:

x̄ = Σx ÷ n

where x stands for an observed value,

n stands for the number of observations in the data set,

Σx stands for the sum of all observed x values, and

x̄ stands for the mean value of x.

Example 2 – Traffic fatalities

The following table lists the number of people killed in traffic accidents over a 10-year period. During this time period, what was the average number of people killed per year? How many people died each day on average in traffic accidents during this time period?

Table 1. Number of fatalities in traffic accidents
Year    Fatalities
1       959
2       1,037
3       960
4       797
5       663
6       652
7       560
8       619
9       623
10      583

Using the formula to calculate the mean for discrete variables, you can see that:

Mean = Σx ÷ n
= (959 + 1,037 + 960 + 797 + 663 + 652 + 560 + 619 + 623 + 583) ÷ 10
= 7,453 ÷ 10
= 745.3

The average number of people killed per year is 745.3.

To calculate the average number of deaths per day from traffic accidents, the average number of deaths per year is divided by the number of days in a year (leap years are ignored).

Daily mean = 745.3 ÷ 365
≈ 2.0

Therefore, on average, about 2 people died each day in traffic accidents.
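The two steps in this example, the yearly mean and then the daily mean, can be checked with a short Python sketch. The variable names are illustrative, and the constant of 365 days reflects the example's choice to ignore leap years.

```python
# Fatalities for years 1 to 10, taken from Table 1.
fatalities = [959, 1037, 960, 797, 663, 652, 560, 619, 623, 583]

# Leap years are ignored, as in the example.
DAYS_PER_YEAR = 365

# Mean number of fatalities per year: sum of the values / number of years.
yearly_mean = sum(fatalities) / len(fatalities)

# Mean number of fatalities per day: yearly mean / days in a year.
daily_mean = yearly_mean / DAYS_PER_YEAR

print(round(yearly_mean, 1))  # 745.3
print(round(daily_mean, 1))   # 2.0
```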

Frequency tables

A frequency table lists the number of observations for each value, or group of values, in a data set. It can be used with grouped or ungrouped variables.

For example, to provide a frequency table of the age of people in a data set, you can produce a table using the exact age (ungrouped), or you can group the ages (grouped).

An ungrouped variable can be regarded as a special type of grouped variable in which each group contains a single value. You can calculate the mean of a discrete variable using a frequency table. For an ungrouped variable, this gives the exact mean; for a grouped variable, it provides an approximation of the true mean. How accurate the approximation is depends on how evenly the observed values are spread within each group.
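To see how grouping affects the result, the following Python sketch compares the exact mean of the goals from Example 1 with the mean obtained after grouping them. The three groups (0 to 2, 3 to 5 and 6 to 8 goals) are an assumption made purely for illustration and do not appear in the original example. Each group is represented by its midpoint, the approach used for grouped data later in this section.

```python
# Goals scored in each of the 10 games (from Example 1).
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]

# Hypothetical groups, chosen only for this illustration: (lowest, highest).
groups = [(0, 2), (3, 5), (6, 8)]

# Exact mean from the individual observations.
exact_mean = sum(goals) / len(goals)

# Approximate mean: treat every observation as if it sat at its group's midpoint.
sum_xf = 0.0
sum_f = 0
for low, high in groups:
    midpoint = (low + high) / 2
    frequency = sum(1 for g in goals if low <= g <= high)
    sum_xf += midpoint * frequency
    sum_f += frequency

grouped_mean = sum_xf / sum_f

print(exact_mean)    # 4.7
print(grouped_mean)  # 4.3
```

With these groups, the approximation differs from the exact mean because the observations are not spread evenly around the midpoints; for instance, four of the five observations in the 3 to 5 group equal 5 rather than the midpoint of 4.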

Example 3 – Soccer tournament at Mount Rival II

Grouping observations in tables is useful when dealing with large amounts of data. The goal-scoring figures from the soccer tournament example can be displayed in a frequency table.

Table 2. Mount Rival soccer tournament, frequency of goals for lead scorer
Number of goals (x)    Frequency (f)    Total number of goals (xf)
0                      1                0
1                      1                1
4                      1                4
5                      4                20
7                      2                14
8                      1                8
Total (Σ)              10               47

Because the observations are summarized by frequency, the mathematical notation changes slightly.

For a discrete variable in a frequency table, the mean is calculated as follows:

x̄ = Σxf ÷ Σf

where x stands for an observed value,

xf stands for the product of an observed value multiplied by its frequency,

Σxf stands for the total of all xf values,

Σf stands for the total of all frequencies, and

x̄ stands for the mean value of x.

The calculation for the mean of the player's goals is:

Mean = Σxf ÷ Σf
= (0 + 1 + 4 + 20 + 14 + 8) ÷ (1 + 1 + 1 + 4 + 2 + 1)
= 47 ÷ 10
= 4.7

Since the variable is ungrouped, this is the exact mean. The next example shows what happens when working with grouped variables.
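As a rough sketch, the frequency-table calculation can be reproduced in Python. Here the table is built from the raw goals with collections.Counter from the standard library; the variable names are illustrative and not part of the original example.

```python
from collections import Counter

# Goals scored in each of the 10 games (from Example 1).
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]

# Frequency table: maps each observed value x to its frequency f.
frequency = Counter(goals)

# Print one row per observed value: x, f and the product xf.
for x in sorted(frequency):
    print(x, frequency[x], x * frequency[x])

# Sum of xf over all rows, and sum of f over all rows.
sum_xf = sum(x * f for x, f in frequency.items())
sum_f = sum(frequency.values())

mean_goals = sum_xf / sum_f
print(sum_xf, sum_f, mean_goals)  # 47 10 4.7
```

Because each row holds a single value of x, the result matches the mean calculated directly from the 10 individual observations.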

Example 4 – Height of 50 Grade 10 girls

The following table shows the heights of 50 randomly selected Grade 10 girls. What is the mean height of the girls?

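Determine the midpoint of each class interval before calculating the mean from a grouped frequency table. The midpoint is halfway between the lower and upper bounds of the interval. For example, the interval 150 –< 155 (150 cm to less than 155 cm) has a midpoint of 152.5 cm.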

Table 3. Mean height of 50 Grade 10 girls
Height (cm)    Midpoint (x)    Frequency (f)    Midpoint × frequency (xf)
150 –< 155     152.5           4                610.0
155 –< 160     157.5           7                1,102.5
160 –< 165     162.5           18               2,925.0
165 –< 170     167.5           11               1,842.5
170 –< 175     172.5           6                1,035.0
175 –< 180     177.5           4                710.0
Total (Σ)      -               50               8,225.0

The calculation is the same as that used in the soccer tournament example above, except that the xf is now the product of the midpoint of the interval multiplied by the frequency of the same interval. This approximation is required because we do not know the exact height of each girl.

As a result, we must treat all of the heights as if they were at the midpoint of their interval. For example, because there are four girls in the interval of 150 –< 155 cm, we will treat each of the four girls as measuring 152.5 cm. As noted in the discussion of frequency tables, the accuracy of the approximation of the mean will depend on how close each girl's height is to the midpoint of her interval.

Thus,

Mean = Σxf ÷ Σf
= (610.0 + 1,102.5 + 2,925.0 + 1,842.5 + 1,035.0 + 710.0) ÷ (4 + 7 + 18 + 11 + 6 + 4)
= 8,225.0 ÷ 50
= 164.5 cm

Therefore, the mean height of the 50 girls in Grade 10 is 164.5 cm.
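The grouped calculation can be sketched in Python as follows. Each class is stored as a hypothetical (lower bound, upper bound, frequency) tuple taken from Table 3, and the midpoint is calculated rather than typed in; the names used are assumptions made for this illustration.

```python
# Class intervals from Table 3: (lower bound in cm, upper bound in cm, frequency).
height_classes = [
    (150, 155, 4),
    (155, 160, 7),
    (160, 165, 18),
    (165, 170, 11),
    (170, 175, 6),
    (175, 180, 4),
]

sum_xf = 0.0  # running total of midpoint * frequency
sum_f = 0     # running total of frequencies

for lower, upper, f in height_classes:
    midpoint = (lower + upper) / 2  # x for this class
    sum_xf += midpoint * f
    sum_f += f

mean_height = sum_xf / sum_f
print(sum_xf, sum_f, mean_height)  # 8225.0 50 164.5
```

The result is an approximation of the true mean, since every girl in a class is treated as having the midpoint height of that class.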

Summary

The mean is used in computing other statistics (such as the variance). It cannot be calculated for open-ended grouped frequency distributions, because an open-ended class has no midpoint. It is often not the most appropriate measure for skewed (unbalanced) distributions such as salary information. (See Measures of spread for more information on variance.)
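The effect of a skewed distribution on the mean can be illustrated with a small Python sketch. The salary figures below are hypothetical, invented only for this illustration, and are not taken from any data set.

```python
# Hypothetical annual salaries, invented for illustration only.
salaries = [30_000, 32_000, 35_000, 38_000, 40_000, 250_000]

# Mean of all six salaries.
mean_salary = sum(salaries) / len(salaries)

# Number of salaries that fall below the mean.
below_mean = sum(1 for s in salaries if s < mean_salary)

# Mean of the five smallest salaries, leaving out the largest one.
mean_without_largest = sum(salaries[:-1]) / len(salaries[:-1])

print(round(mean_salary, 2))  # 70833.33
print(below_mean)             # 5
print(mean_without_largest)   # 35000.0
```

In this made-up case, a single very large salary roughly doubles the mean, so five of the six salaries fall below it. This is why the mean may not describe a typical value in a skewed distribution.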

 
