Thomas F. Crossley
Peter Levell
Stavros Poupakis

Regression with an imputed dependent variable (replication data)

Researchers are often interested in the relationship between two variables, with no single data set containing both. A common strategy is to use proxies for the dependent variable that are common to two surveys to impute the dependent variable into the data set containing the independent variable. We show that commonly employed regression or matching-based imputation procedures lead to inconsistent estimates. We offer a consistent and easily implemented two-step estimator, “rescaled regression prediction.” We derive the correct asymptotic standard errors for this estimator and demonstrate its relationship to alternative approaches. We illustrate with empirical examples using data from the US Consumer Expenditure Survey (CE) and the Panel Study of Income Dynamics (PSID).
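
The attenuation described in the abstract can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not taken from the replication files: a single proxy `z` that is a noisy linear function of the dependent variable `y`, a linear model of `y` on `x`, and a rescaling step that divides the second-step slope by the first-step R-squared. The function name `simulate_rrp`, the parameter values, and the data-generating process are all hypothetical.

```python
import numpy as np


def simulate_rrp(n=200_000, beta=2.0, gamma=0.5, seed=0):
    """Toy two-sample imputation exercise (hypothetical data-generating process).

    Donor sample: observes y and the proxy z.
    Recipient sample: observes x and the proxy z, but not y.
    Returns (naive_slope, rescaled_slope, first_step_r2).
    """
    rng = np.random.default_rng(seed)

    # Donor sample: y depends on x, z is a noisy proxy for y.
    x_d = rng.normal(size=n)
    y_d = beta * x_d + rng.normal(size=n)
    z_d = gamma * y_d + rng.normal(size=n)

    # Recipient sample: same process, but y is treated as unobserved.
    x_r = rng.normal(size=n)
    y_r = beta * x_r + rng.normal(size=n)
    z_r = gamma * y_r + rng.normal(size=n)

    # Step 1: regress y on the proxy in the donor sample and record R-squared.
    Z_d = np.column_stack([np.ones(n), z_d])
    coef, *_ = np.linalg.lstsq(Z_d, y_d, rcond=None)
    fitted = Z_d @ coef
    r2 = 1.0 - np.sum((y_d - fitted) ** 2) / np.sum((y_d - y_d.mean()) ** 2)

    # Impute y in the recipient sample from its proxy.
    y_hat = np.column_stack([np.ones(n), z_r]) @ coef

    # Step 2: regress the imputed y on x in the recipient sample.
    X_r = np.column_stack([np.ones(n), x_r])
    b, *_ = np.linalg.lstsq(X_r, y_hat, rcond=None)
    naive_slope = b[1]

    # Assumed rescaling: divide the second-step slope by the first-step R-squared.
    rescaled_slope = naive_slope / r2
    return naive_slope, rescaled_slope, r2


if __name__ == "__main__":
    naive, rescaled, r2 = simulate_rrp()
    print(f"first-step R2: {r2:.3f}")
    print(f"naive slope:   {naive:.3f}")
    print(f"rescaled:      {rescaled:.3f}")
```

In this toy setup the slope from regressing the imputed variable on `x` is shrunk toward zero by a factor equal to the first-step R-squared, and dividing by that R-squared recovers a value close to the true `beta`. The standard errors derived in the paper are not reproduced here.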

Suggested Citation

Crossley, Thomas F.; Levell, Peter; Poupakis, Stavros (2022): Regression with an imputed dependent variable (replication data). Version: 1. Journal of Applied Econometrics. Dataset.