Calculating the mean
The mean of a numeric variable is calculated by adding the values of all observations in a data set and then dividing that sum by the number of observations in the set. This provides the average value of all the data.
Mean = sum of all the observation values ÷ number of observations
There are two types of numeric variables: discrete and continuous. Discrete variables take only separate, countable values that cannot be subdivided. For example, a hockey player can score 1 or 2 goals, but never 1 and a half goals. Continuous variables, however, can be divided into ever smaller units. A student's age can be 11 years, 7 months and 3 days, as opposed to just 11 or 12 years.
It is important that you understand the difference between these two types of variables, so that you can properly calculate the mean in any given situation. The following examples show how to calculate the mean, starting with discrete variables.
Example 1 – Soccer tournament at Mount Rival I (discrete variables)
Example 2 – Traffic fatalities (discrete variables)
Example 3 – Soccer tournament at Mount Rival II (frequency tables – discrete variables)
Example 4 – Height of 50 Grade 10 girls (grouped variables – continuous or discrete variables)
Summary
Example 1 – Soccer tournament at Mount Rival I
Mount Rival hosts a soccer tournament each year. This season, in 10 games, the lead scorer for the home team scored 7, 5, 0, 7, 8, 5, 5, 4, 1 and 5 goals. What was the mean score?
Mean = sum of all the observed values ÷ number of observations
= (7 + 5 + 0 + 7 + 8 + 5 + 5 + 4 + 5 + 1) ÷ 10
= 47 ÷ 10
= 4.7
Therefore, in the 10-game tournament, the player scored an average of 4.7 goals per game. The average of 4.7 is not a whole number so it only has meaning in a statistical sense. In reality, it is impossible to score 4.7 goals, even if you are a top scorer.
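The calculation above can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for this example, and `statistics.mean` from the standard library is shown as a cross-check rather than as the method the text prescribes.

```python
from statistics import mean

# Goals scored by the lead scorer in each of the 10 games (Example 1).
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]

# Mean = sum of all observed values ÷ number of observations.
total_goals = sum(goals)        # 47
number_of_games = len(goals)    # 10
mean_by_hand = total_goals / number_of_games

# The standard library computes the same quantity.
mean_stdlib = mean(goals)

print(mean_by_hand)  # 4.7
```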
The mathematical notation to calculate the mean for a discrete variable is as follows:
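$$\bar{x} = \frac{\sum x}{n}$$
or
Mean = $\sum x \div n$
where $x$ stands for an observed value, $n$ stands for the number of observations in the data set, $\sum x$ stands for the sum of all observed values, and $\bar{x}$ stands for the mean.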
Example 2 – Traffic fatalities
The following table lists the number of people killed in traffic accidents over a 10-year period. During this period, what was the average number of people killed per year? How many people died each day, on average, in traffic accidents during this period?
Year | Fatalities |
---|---|
1 | 959 |
2 | 1,037 |
3 | 960 |
4 | 797 |
5 | 663 |
6 | 652 |
7 | 560 |
8 | 619 |
9 | 623 |
10 | 583 |
Using the formula to calculate the mean for discrete variables, you can see that:
Mean = $\sum x \div n$
= (959 + 1,037 + 960 + 797 + 663 + 652 + 560 + 619 + 623 + 583) ÷ 10
= 7,453 ÷ 10
= 745.3
The average number of people killed per year is 745.3.
To calculate the daily death rate from traffic accidents, the average yearly death rate is divided by the number of days in a year (leap years are ignored).
= 745.3 ÷ 365
= 2.0
Therefore, on average, 2 people died each day in traffic accidents.
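As a rough sketch, the two steps of this example (yearly mean, then daily mean) might look like this in Python. The variable names are illustrative, and the 365-day year follows the text's choice to ignore leap years.

```python
# Traffic fatalities for years 1 to 10 (Example 2).
fatalities = [959, 1037, 960, 797, 663, 652, 560, 619, 623, 583]

# Step 1: average number of fatalities per year.
yearly_mean = sum(fatalities) / len(fatalities)

# Step 2: average per day, assuming 365 days per year (leap years ignored).
DAYS_PER_YEAR = 365
daily_mean = yearly_mean / DAYS_PER_YEAR

print(f"{yearly_mean:.1f}")  # 745.3
print(f"{daily_mean:.1f}")   # 2.0
```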
Frequency tables
A frequency table lists the number of observations that fall at each value, or within each group of values, in a data set. It can be used with grouped or ungrouped variables.
For example, to provide a frequency table of the age of people in a data set, you can produce a table using the exact age (ungrouped), or you can group the ages (grouped).
An ungrouped variable can be regarded as a special type of grouped variable in which each group contains a single value. You can calculate the mean of a discrete variable using a frequency table. For an ungrouped variable, this method gives the exact mean; for a grouped variable, it gives an approximation of the true mean. How accurate the approximation is depends on how evenly the observed values are spread within each group.
Example 3 – Soccer tournament at Mount Rival II
Grouping observations in tables is useful when dealing with large amounts of data. The goal-scoring figures from the soccer tournament example can be displayed in a frequency table.
Number of goals (x) | Frequency (f) | Total number of goals (xf) |
---|---|---|
0 | 1 | 0 |
1 | 1 | 1 |
4 | 1 | 4 |
5 | 4 | 20 |
7 | 2 | 14 |
8 | 1 | 8 |
Total | 10 | 47 |
Because the observations are grouped, the mathematical notation changes slightly.
For a discrete variable in a frequency table, the mean is calculated as follows:
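$$\bar{x} = \frac{\sum xf}{\sum f}$$
or
Mean = $\sum xf \div \sum f$
where $x$ stands for an observed value, $f$ stands for the frequency of that value (the number of times it occurs), $\sum xf$ stands for the sum of the products of each value and its frequency, and $\sum f$ stands for the total number of observations.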
The calculation for the mean of the player's goals is:
Mean = $\sum xf \div \sum f$
= (0 + 1 + 4 + 20 + 14 + 8) ÷ (1 + 1 + 1 + 4 + 2 + 1)
= 47 ÷ 10
= 4.7
Since the variable is ungrouped, this is the exact mean. The next example shows what happens when working with grouped variables.
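The frequency-table calculation for the soccer example can be sketched in Python as follows. The dictionary layout (value mapped to frequency) and the variable names are assumptions made for this illustration, not something the text specifies.

```python
# Frequency table from Example 3: number of goals (x) -> frequency (f).
goal_frequencies = {0: 1, 1: 1, 4: 1, 5: 4, 7: 2, 8: 1}

# Sum of xf: each value multiplied by how often it occurred.
total_xf = sum(x * f for x, f in goal_frequencies.items())

# Sum of f: the total number of observations (games).
total_f = sum(goal_frequencies.values())

mean_goals = total_xf / total_f

print(total_xf, total_f, mean_goals)  # 47 10 4.7
```

Because each row of the table holds a single value rather than a range, the result matches the mean computed directly from the 10 individual scores.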
Example 4 – Height of 50 Grade 10 girls
The following table shows the heights of 50 randomly selected Grade 10 girls. What is the mean height of the girls?
Determine the midpoint of each class interval for a variable before calculating the mean from a frequency table.
Height (cm) | Midpoint (x) | Frequency (f) | Midpoint × frequency (xf) |
---|---|---|---|
150 –< 155 | 152.5 | 4 | 610.0 |
155 –< 160 | 157.5 | 7 | 1,102.5 |
160 –< 165 | 162.5 | 18 | 2,925.0 |
165 –< 170 | 167.5 | 11 | 1,842.5 |
170 –< 175 | 172.5 | 6 | 1,035.0 |
175 –< 180 | 177.5 | 4 | 710.0 |
Total | - | 50 | 8,225.0 |
The calculation is the same as that used in the soccer tournament example above, except that xf is now the midpoint of the interval multiplied by the frequency of that interval. This approximation is required because we do not know the exact height of each girl.
As a result, we must treat all of the heights as if they were at the midpoint of their interval. For example, because there are four girls in the interval of 150 –< 155 cm, we will treat each of the four girls as measuring 152.5 cm. As mentioned in the section on frequency tables, the accuracy of the approximation of the mean depends on how evenly the girls' heights are spread within each interval.
Thus,
Mean = $\sum xf \div \sum f$
= (610.0 + 1,102.5 + 2,925.0 + 1,842.5 + 1,035.0 + 710.0) ÷ (4 + 7 + 18 + 11 + 6 + 4)
= 8,225.0 ÷ 50
= 164.5 cm
Therefore, the mean height of the 50 girls in Grade 10 is 164.5 cm.
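The midpoint method used in this example can be sketched in Python as below. The function name `grouped_mean` and the tuple layout (lower bound, upper bound, frequency) are hypothetical choices for this illustration. The result is an approximation of the true mean, since every height is treated as the midpoint of its interval.

```python
# Class intervals from Example 4 as (lower bound, upper bound, frequency).
# Each interval includes its lower bound and excludes its upper bound.
height_classes = [
    (150, 155, 4),
    (155, 160, 7),
    (160, 165, 18),
    (165, 170, 11),
    (170, 175, 6),
    (175, 180, 4),
]


def grouped_mean(classes):
    """Approximate the mean of grouped data using interval midpoints."""
    total_xf = 0.0  # running sum of midpoint * frequency
    total_f = 0     # running sum of frequencies
    for lower, upper, frequency in classes:
        midpoint = (lower + upper) / 2
        total_xf += midpoint * frequency
        total_f += frequency
    return total_xf / total_f


mean_height = grouped_mean(height_classes)
print(mean_height)  # 164.5
```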
Summary
The mean is used in computing other statistics (such as the variance). It cannot be calculated for open-ended grouped frequency distributions, because an open-ended class has no midpoint. It is often not the most appropriate measure for skewed (unbalanced) distributions such as salary information. (See Measures of spread for more information on variance.)