
Residualized Regression

Instrumental Variables

Linear instrumental variables consists of the following two-step process. We first regress the treatment variable $D_i$ on the controls $X_i$ and the instrument $Z_i$. We then regress the outcome variable $Y_i$ on the predicted treatment $\hat{D}_i$ and the controls.

$$D_i = \gamma_1 X_i + \gamma_2 Z_i + v_i$$

$$Y_i = \beta_1 \hat{D}_i + \beta_2 X_i + \varepsilon_i$$

As emphasized in class, I prefer to interpret the coefficients in a linear model via a residualized approach. For one, it makes the source of the variation clear. We can interpret the coefficient $\beta_1$ in the second equation above via the residualized regression below, where $\bar{\hat{D}}_i$ is the predicted predicted value. Not a typo! We first predict treatment given the controls and the instrument. We then predict this predicted value given only the controls. By the law of iterated expectations, this is equivalent to the predicted treatment given the controls alone.

$$Y_i = \beta_1 \left( \hat{D}_i - \bar{\hat{D}}_i \right) + u_i$$
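The equivalence between the two-step procedure and the residualized regression can be checked numerically. The following is a minimal sketch on simulated data, assuming plain NumPy; the sample size, the coefficients, and the helper `fitted` are illustrative assumptions and do not come from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Simulated data (all numbers are made up for illustration).
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # controls, including a constant
Z = rng.normal(size=n)                                  # instrument
u = rng.normal(size=n)                                  # unobserved confounder
D = X @ np.array([0.5, 1.0]) + 0.8 * Z + u + rng.normal(size=n)
Y = 2.0 * D + X @ np.array([1.0, -1.0]) + u + rng.normal(size=n)


def fitted(y, W):
    """Fitted values from an OLS regression of y on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    return W @ coef


# First stage: predict D from the controls and the instrument.
D_hat = fitted(D, np.column_stack([X, Z]))

# Second stage: regress Y on D_hat and the controls; keep the coefficient on D_hat.
beta_2sls = np.linalg.lstsq(np.column_stack([D_hat, X]), Y, rcond=None)[0][0]

# Residualized version: predict D_hat from the controls only,
# then regress Y on the difference (no controls needed).
D_hat_bar = fitted(D_hat, X)
resid = D_hat - D_hat_bar
beta_resid = (resid @ Y) / (resid @ resid)

# Linear analogue of the law of iterated expectations:
# the "predicted predicted value" equals the fit of D on the controls alone.
D_bar = fitted(D, X)

print(beta_2sls, beta_resid)
```

In this sketch the two coefficients agree up to floating-point error, and `D_hat_bar` coincides with `D_bar`, because the controls are a subset of the first-stage regressors.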

As I also emphasize in class, I tend to think of linear models as approximations to the underlying conditional expectation function. Therefore, we can rewrite the above regression in its nonparametric form as follows:

$$Y_i = \beta_1 \left( E[D_i \mid X_i, Z_i] - E[D_i \mid X_i] \right) + u_i$$
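To make the nonparametric form concrete, here is a rough sketch that assumes a binary control and a binary instrument, so that both conditional expectations can be estimated by simple cell means. The data-generating process, the true effect of 2, and the helper `cell_means` are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Simulated data with a binary control and a binary instrument (illustrative values).
X = rng.integers(0, 2, size=n)
Z = rng.integers(0, 2, size=n)
u = rng.normal(size=n)  # unobserved confounder affecting both D and Y
D = 0.5 * X + Z + 0.5 * X * Z + u + rng.normal(size=n)
Y = 2.0 * D + np.sin(3.0 * X) + u + rng.normal(size=n)  # true effect of D is 2


def cell_means(values, cells):
    """Sample mean of `values` within each cell, returned for every observation."""
    sums = np.bincount(cells, weights=values)
    counts = np.bincount(cells)
    return (sums / counts)[cells]


E_D_given_XZ = cell_means(D, 2 * X + Z)  # estimate of E[D | X, Z]
E_D_given_X = cell_means(D, X)           # estimate of E[D | X]

# Regress Y on the difference of the two conditional expectations (no intercept needed,
# since the difference averages to zero within each value of X).
R = E_D_given_XZ - E_D_given_X
beta_np = (R @ Y) / (R @ R)

print(beta_np)
```

With discrete variables, cell means are the natural estimator; with continuous controls one would have to substitute some other flexible estimator of the conditional expectations, which this sketch does not attempt.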

When I see a linear IV model in a paper, I try to interpret it as an approximation to the above regression. IV keeps only the local variation in the treatment that is due to the instrument. In that sense, IV is a local correction of the treatment variable.