Background

Practice Exercise [Based on PS by Prof. Evans]

To obtain a consistent estimate of the impact of kids on labor supply, some authors have suggested using whether a mother had twins on her first birth as an instrument for the number of children in the household.

Twins are in many respects random, and by definition the arrival of twins increases the number of children in the household.

Using data from the 1980 Public Use Micro Sample 5% Census data files, Evans constructed a sample of women aged 21-40 with at least one kid. The 1980 PUMS identifies a person’s age at the time of survey and their quarter of birth. Hence, we can infer that any two kids in the household with the same age and quarter of birth are twins.
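
The matching rule just described can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not Evans's actual code: the function name `twin_groups` and the `(age, quarter_of_birth)` tuple layout are assumptions made for the example, and the sketch does not check whether the matched children are the mother's first birth.

```python
from collections import Counter

def twin_groups(children):
    """Return the (age, quarter_of_birth) keys shared by two or more children.

    `children` is a list of (age, quarter_of_birth) tuples for one household.
    This layout is hypothetical; the real PUMS records are organized differently.
    """
    counts = Counter(children)
    return {key for key, n in counts.items() if n >= 2}

# Hypothetical household: two 7-year-olds born in quarter 2 and one 4-year-old.
household = [(7, 2), (7, 2), (4, 1)]
print(twin_groups(household))  # {(7, 2)}
```

To flag twins on the first birth, one would additionally require that the matched key belongs to the oldest child in the household; that step is left out of the sketch.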

Roughly 6,000 first births in the data are twin births.

The original data set has over 800,000 observations, so to make the problem manageable, Evans selects a random sample of about 6,500 non-twin births, for a total of about 12,500 observations.

The Stata data file is called twins1st.dta; detailed descriptions of the variables are given below.

  1. What fraction of women work? What is the average number of weeks worked among women who work? What are median labor earnings among women who worked?

  2. Construct an indicator that equals 1 for women who have a second child. Call this variable second. What fraction of women had a second child?

  3. Consider a simple bivariate regression where weeks of work (Y) is regressed on second (X). What is the estimate of $\beta_1$ in this regression, and how do you interpret it?

$$ Y_i = \beta_0 + \beta_1X_i + \epsilon_i $$

  4. Because of the concern that X and $\epsilon$ are correlated, use twins on 1st birth (Z) as an instrument for X in an instrumental variables model. What are the first-stage and reduced-form estimates for this model? Interpret these coefficients, that is, what do these coefficients measure? What is the indirect least squares estimate for $\beta_1$?
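
As a reminder of the mechanics behind the last question, the indirect least squares estimate is the reduced-form slope (Y on Z) divided by the first-stage slope (X on Z). With a binary instrument this ratio is the Wald estimator. The sketch below illustrates the calculation on simulated data only. It does not use twins1st.dta, and every number in the data-generating process (the true effect of -5, the 50% twin share, the unobserved `taste` variable) is an assumption made up for the example, so the output says nothing about the answers to the questions above.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(a, b):
    """Covariance with divisor n; the divisor cancels in every ratio below."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)

random.seed(0)
z, x, y = [], [], []
for _ in range(20000):
    taste = random.gauss(0, 1)  # unobserved taste for work (made up)
    twins = 1 if random.random() < 0.5 else 0
    # Twins guarantee a second child; otherwise it is likelier when taste is low.
    second = 1 if twins or taste + random.gauss(0, 1) < 0 else 0
    # Simulated weeks worked; the assumed true effect of a second child is -5.
    weeks = 30 - 5 * second + 8 * taste + random.gauss(0, 5)
    z.append(twins)
    x.append(second)
    y.append(weeks)

ols = ols_slope(x, y)             # biased: second is correlated with taste
first_stage = ols_slope(z, x)     # effect of twins on having a second child
reduced_form = ols_slope(z, y)    # effect of twins on weeks worked
ils = reduced_form / first_stage  # indirect least squares (Wald) estimate

print(f"OLS {ols:.2f}, first stage {first_stage:.2f}, "
      f"reduced form {reduced_form:.2f}, ILS {ils:.2f}")
```

In this simulation the OLS slope overstates the negative effect because women with a low taste for work are more likely both to have a second child and to work fewer weeks, while the ILS ratio lands close to the assumed value of -5. In Stata the same three regressions can be run directly on the real variables.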