Cecilia Machado, "Unobserved Selection Heterogeneity and the Gender Wage Gap", Journal of Applied Econometrics, Vol. 32, No. 7, 2017, pp. 1348-1366.

The data used in the article come from several sources:

1) the Annual Demographic Files of the Current Population Survey, years 1976-2005, hereafter "cps";
2) the National Longitudinal Survey of Youth 1979 and the Young Women Cohort, hereafter "nlsy79" and "youngwomen";
3) the Marriage and Fertility Supplement conducted in June of selected years of the Current Population Survey, hereafter "june"; and
4) the one-year panel of the above Annual Demographic Files of the Current Population Survey, hereafter "mar_match".

Each dataset is described in the Appendix, including sample delimitation and the variables used in the analysis.

The replication files include the analysis datasets in CSV format:

  cps.csv
  nlsy79.csv
  youngwomen.csv
  june.csv
  mar_match.csv

They also include a file that explains the variables in each dataset:

  cm-variables.txt

and two files with the Stata programs that generate the figures and tables in the paper and appendix:

  main.do
  appendix.do

These eight files are DOS files in ASCII format. They are zipped in the file cm-files.zip. Unix/Linux users should use "unzip -a".

Please address any questions to:

  Cecilia Machado
  FGV-EPGE
  cecilia.machado [AT] fgv.br