Government of Canada | Gouvernement du Canada

Appendix D: Determining Appropriate Sample Size


What sample size should be chosen for a particular survey? It depends mainly on tolerable error, population size, the importance of particular subgroups, anticipated level of non-response, and how much money is available.

"Tolerable error" refers to the margin of error for the survey. Whenever the results of polls are reported in the news, the margin of error ­ e.g., plus or minus 3%, 19 times in 20 ­ is included. The margin of error tells the reader how accurate the poll's findings are. It is based on the "standard error," the measure of how much the sample mean differs from the population mean.

The margin of error scales the standard error to give a "confidence interval" for the population value, that is, the range within which the population value is expected to fall at a chosen level of confidence. Traditionally, a 95% confidence interval is used (i.e., 19 times in 20). The tolerable margin of error is usually between 3% and 5%; much lower and the costs of the survey begin to rise dramatically.

The traditional formula for large population sizes is $n = 1.96^2\,p(1-p)/SE^2$, where n is the sample size to be calculated, SE is the tolerable error (the margin of error expressed as a proportion, e.g., .03 for ± 3%), p is the proportion having the characteristic being measured, and (1-p) is the proportion who lack it (e.g., if 48% said yes, 52% must have said no). The 1.96 figure reflects the choice of a 95% confidence interval (in a normal distribution, 95% of the area under the curve is within 1.96 standard deviations of the mean). For example,[11] if a margin of error of ± 3%, 19 times in 20, was tolerable, the following sample size would be required:

$$n = \frac{1.96^2\,(.5 \times .5)}{.03^2} \approx 1{,}068$$
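
The calculation above can be reproduced with a short Python sketch. The function name and its parameters are illustrative choices, not part of the original guide; the default p = 0.5 follows the conservative convention noted in the footnote, and the result is rounded up to a whole respondent.

```python
import math

def sample_size_large_population(margin_of_error, p=0.5, z=1.96):
    """Sample size for a large population: n = z^2 * p * (1 - p) / e^2.

    margin_of_error is a proportion (0.03 for +/- 3%); z = 1.96 corresponds
    to a 95% confidence level. The name and signature are hypothetical.
    """
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    n = z ** 2 * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)  # round up to a whole respondent

print(sample_size_large_population(0.03))  # 1068
print(sample_size_large_population(0.05))  # 385
```

Loosening the tolerable error from ± 3% to ± 5% cuts the required sample by almost two thirds, which illustrates why tighter margins become expensive quickly.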

Population size is a consideration only when it falls below 100,000 or so. Below that, something called the "finite population correction factor" must be used to determine sample size. The correction factor = $\sqrt{(N - n)/(N - 1)}$, where N is the population size and n is the sample size.

Substituting this factor into the sample size equation and solving for n yields:

$$n = \frac{1.96^2\,p(1-p)\,N}{1.96^2\,p(1-p) + (N-1)\,SE^2}$$
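
The algebra behind this result can be sketched as follows, assuming the correction factor multiplies the large-population error term (here SE denotes the tolerable error, as in the large-population formula):

```latex
\begin{align*}
SE &= 1.96\sqrt{\frac{p(1-p)}{n}}\,\sqrt{\frac{N-n}{N-1}} \\
SE^2 &= \frac{1.96^2\,p(1-p)\,(N-n)}{n\,(N-1)} \\
n\,(N-1)\,SE^2 &= 1.96^2\,p(1-p)\,N - 1.96^2\,p(1-p)\,n \\
n\left[1.96^2\,p(1-p) + (N-1)\,SE^2\right] &= 1.96^2\,p(1-p)\,N \\
n &= \frac{1.96^2\,p(1-p)\,N}{1.96^2\,p(1-p) + (N-1)\,SE^2}
\end{align*}
```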

For example, if an evaluator wanted to learn how many EI clients to survey from a sampling frame of 2,500, with a margin of error of ± 3% at the 95% level of confidence:

$$n = \frac{1.96^2 \times .25 \times 2500}{1.96^2 \times .25 + 2499\,(.0009)} \approx 749$$
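
A minimal Python sketch of the finite-population calculation is shown below. The function name and parameters are hypothetical; the arithmetic follows the corrected formula above, with the result rounded up.

```python
import math

def sample_size_finite_population(population, margin_of_error, p=0.5, z=1.96):
    """Sample size with the finite population correction applied.

    n = z^2 p(1-p) N / (z^2 p(1-p) + (N - 1) e^2), rounded up.
    The name and signature are illustrative, not from the original guide.
    """
    if population < 1:
        raise ValueError("population must be at least 1")
    k = z ** 2 * p * (1 - p)
    n = k * population / (k + (population - 1) * margin_of_error ** 2)
    return math.ceil(n)

print(sample_size_finite_population(2500, 0.03))  # 749
```

For very large populations the correction becomes negligible and the result approaches the large-population figure.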

The sampling error associated with subgroups will be higher than that for the whole sample, because each subgroup contains fewer cases. A rule of thumb is that there should be a minimum of 100 individuals in any major subgroup that will be analyzed separately. This keeps the margin of error for each major stratum within about ± 10%, the maximum tolerable (Rea and Parker, 1992).
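
The ± 10% figure can be checked by inverting the large-population formula to get the margin of error for a given sample size. This is a sketch under the same assumptions as before (p = 0.5, 95% confidence, no finite population correction); the function name is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error (as a proportion) for a sample of size n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1 - p) / n)

# A subgroup of 100 gives roughly +/- 9.8%, just inside the 10% limit.
print(round(margin_of_error(100), 3))  # 0.098
```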

Note that when choosing a sample size, there will always be some people in the sample who can't be located, or who will refuse to cooperate. Allowances must be made for anticipated non-response. For a final sample size of 1,000, with a 50% response rate, the initial sample size must be 2,000. Given a fixed budget, there is always a tradeoff between the initial sample size and the effort to reduce non-response. Too often a large initial sample is chosen and too little effort is expended in reducing non-response, with consequent effects on total error.
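
The non-response allowance is a simple division of the target number of completed responses by the anticipated response rate. A short sketch, with a hypothetical function name:

```python
import math

def initial_sample_size(final_sample, response_rate):
    """Number of people to contact to end up with final_sample respondents.

    response_rate is the anticipated proportion who respond (0.5 for 50%).
    """
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(final_sample / response_rate)

print(initial_sample_size(1000, 0.5))  # 2000
```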


Footnotes

[11] By convention, p and 1-p are each set to the most conservative value, .5.

