Government of Canada | Gouvernement du Canada

Appendix C: Dealing With Selection Bias


Undoubtedly the most popular approach to dealing with selection bias is the Heckman (1979) two-stage approach. The first stage involves modeling selection into the program. Usually this takes the form of a single equation explaining program participation or non-participation [10]:

P = βX + U

where P is a dummy variable (1 for participants and 0 for non-participants), X is a set of all observed factors that may account for participation in the program (e.g., age, sex), and U is a random error term, assumed to be normally distributed, that captures unobserved factors influencing participation in the program. From this equation, the inverse Mills ratio is computed and then inserted into a second-stage outcome equation to estimate program impact (usually via ordinary least squares):

Y = βX + aP + dM + U

where Y is the outcome of interest, X is a vector of observed variables, P is the participation dummy, and M is the inverse Mills ratio. The measure of program impact is a, the estimated coefficient on the indicator variable for participation/non-participation in the program; d is the coefficient on the selection bias correction term M. If this equation were estimated by ordinary least squares without the correction term, the estimates would potentially be biased. If the assumptions underlying the model are correct and the model is properly specified, including the correction term removes this potential bias, thereby producing an unbiased estimate of program impact (a).
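As an illustration of the first-stage output, the sketch below computes the correction term M for each person from a fitted probit index. The text above does not spell out the formula, so the sketch assumes the form commonly used when the outcome equation contains a participation dummy: φ(index)/Φ(index) for participants and −φ(index)/(1 − Φ(index)) for non-participants, where φ and Φ are the standard normal density and distribution functions. The function name and the sample index values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def selection_correction_term(index, participated):
    """Return the correction term M for one person.

    index        -- fitted value of the first-stage probit index (beta * X)
    participated -- 1 for a participant, 0 for a non-participant

    Assumed form (not given in the text): phi/Phi for participants,
    -phi/(1 - Phi) for non-participants.
    """
    density = STD_NORMAL.pdf(index)
    probability = STD_NORMAL.cdf(index)
    if participated:
        return density / probability
    return -density / (1.0 - probability)


# Hypothetical (index, participation) pairs, for illustration only.
people = [(0.8, 1), (-0.3, 1), (0.1, 0), (-1.2, 0)]
correction_terms = [selection_correction_term(ix, p) for ix, p in people]
```

The resulting values would enter the second-stage regression as the variable M alongside X and P. Under this assumed form the term is positive for participants and negative for non-participants.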

Another powerful means of controlling for differences between groups is called the differences-in-differences method. Longitudinal data are collected for key outcome measures (e.g., earnings, social assistance use). To account for the differences between the participant and non-participant samples, a longitudinal estimator of program impact is employed; such estimators take account of the level of the outcome variable both before and after the program, in contrast to cross-sectional estimators, which use data on post-program outcomes alone. This estimator uses the pre- vs. post-program change in the outcome variable for non-participants as an estimate of the change that would have occurred for participants in the absence of the program. The estimated average program impact is then the pre- vs. post-program change in the outcome variable for participants minus the pre- vs. post-program change in the outcome variable for non-participants. This permits a determination of the incremental impact of the program by controlling for biases caused by unobserved individual differences. A multivariate analysis can then show how the size of the differences-in-differences estimate of program impact varies according to various individual and program characteristics.

In equation form (Moffitt, 1991):

Y = E(Y**_{i,t} - Y*_{i,t-1} | d_i = 1) - E(Y*_{i,t} - Y*_{i,t-1} | d_i = 0)

where t = the post-treatment point, t-1 = the pre-treatment point, d_i = 1 if individual i received the treatment and 0 otherwise, and

Y*_{i,t} - Y*_{i,t-1} = change in the outcome from t-1 to t if treatment not received

Y**_{i,t} - Y*_{i,t-1} = change in the outcome from t-1 to t if treatment received
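The differences-in-differences calculation can be illustrated with a few lines of Python. This is a minimal sketch: the function name and the earnings figures are hypothetical, and it computes only the simple average impact, without the multivariate analysis mentioned above.

```python
from statistics import mean


def diff_in_diff(participants, non_participants):
    """Differences-in-differences estimate of average program impact.

    Each argument is a list of (pre_program, post_program) outcome pairs,
    one pair per individual.
    """
    # Average pre- vs. post-program change within each group.
    change_participants = mean(post - pre for pre, post in participants)
    change_non_participants = mean(post - pre for pre, post in non_participants)
    # Impact = participants' change minus non-participants' change.
    return change_participants - change_non_participants


# Hypothetical annual earnings (pre, post), for illustration only.
participants = [(20000, 24000), (18000, 23000), (22000, 25000)]
non_participants = [(21000, 22000), (19000, 21000), (23000, 24500)]

impact = diff_in_diff(participants, non_participants)
print(impact)  # 2500
```

In this made-up sample, participants' earnings rise by 4,000 on average and non-participants' by 1,500, so the estimated impact is 2,500.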

Instrumental variable (IV) methods are widely used in situations in which ordinary least squares (OLS) estimates may be biased due to a correlation between one or more of the explanatory variables and the random error term in the model. In the evaluation/selection bias context, such potential bias arises because of the possible correlation between the participation/non-participation variable and the random error term in the outcome equation. This potential bias can be removed if one or more "instrumental variables" are available and included in the model.

The basic model of program outcome or impact is:

Y = βX + aP + U  (1)

This can be written as:

Y = WC + U  (2)

where C = (β a)' and W = (X P) using matrix notation.

The least squares estimator of (2) is given by:

c = (W'W)^(-1) W'Y

where (W'W)^(-1) is the inverse of the matrix (W'W). In general this estimator is biased because of the correlation between W and U (i.e., E{W'U} does not equal zero, where E{ } represents the expectations operator).

The IV estimator of (2) is given by:

c* = (Z'W)^(-1) Z'Y

where Z is the matrix of instrumental variables. This estimator is in general consistent (its bias vanishes as the sample grows) because Z and U are uncorrelated, i.e., E{Z'U} equals zero if Z is an appropriate instrument.
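The two estimators can be compared on a small constructed data set. The sketch below is an illustration under simplifying assumptions: X contains only a constant, there is a single instrument, and the data are invented so that the unobserved term U is correlated with participation P but exactly uncorrelated with the instrument. The function and variable names are hypothetical. The true values used to generate Y are an intercept of 1 and a program impact of 2.

```python
def estimate(z_rows, w_rows, y):
    """Return (Z'W)^(-1) Z'Y for two-column Z and W, given as lists of rows.

    Passing W as the instrument matrix (Z = W) gives the least squares
    estimator (W'W)^(-1) W'Y as a special case.
    """
    # Build the 2x2 matrix Z'W and the 2-vector Z'Y.
    zw = [[sum(z[r] * w[c] for z, w in zip(z_rows, w_rows)) for c in range(2)]
          for r in range(2)]
    zy = [sum(z[r] * yi for z, yi in zip(z_rows, y)) for r in range(2)]
    # Solve the 2x2 system by Cramer's rule.
    det = zw[0][0] * zw[1][1] - zw[0][1] * zw[1][0]
    return [(zy[0] * zw[1][1] - zw[0][1] * zy[1]) / det,
            (zw[0][0] * zy[1] - zw[1][0] * zy[0]) / det]


# Invented data: instrument, unobserved term, participation, outcome.
instrument = [0, 0, 0, 0, 1, 1, 1, 1]
u = [-1, 1, -1, 1, -1, 1, -1, 1]      # uncorrelated with the instrument
p = [1 if zi + ui >= 0 else 0 for zi, ui in zip(instrument, u)]
y = [1 + 2 * pi + ui for pi, ui in zip(p, u)]  # true impact a = 2

w_rows = [[1, pi] for pi in p]            # W = (constant, P)
z_rows = [[1, zi] for zi in instrument]   # Z = (constant, instrument)

ols = estimate(w_rows, w_rows, y)  # least squares: Z = W
iv = estimate(z_rows, w_rows, y)   # instrumental variables
print(ols[1], iv[1])
```

Because people with a high U are more likely to participate in this constructed sample, least squares overstates the impact (about 3.33), while the IV estimator recovers the value of 2 used to generate the data.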


Footnotes

10 In most cases, the equation is estimated as a "probit model", which is appropriate if the random term in this equation is normally distributed. (A probit is a measurement of probability based on deviations from the mean of a normal frequency distribution. It is analogous to multiple regression but with a dichotomous dependent variable.) However, two-step procedures are available (though not yet widely used) for situations in which the assumption of normally distributed random terms is unlikely to hold.

