Problem Set 7 Econ 120B Spring 2021 · Xinwei Ma
Department of Economics, UCSD
The first three questions refer to the table below of estimated regressions, computed using data for 2015 from the Current
Population Survey. The data set consists of information on 7178 full-time, full-year workers. The highest educational
achievement for each worker was either a high school diploma or a bachelor’s degree. The workers’ ages ranged from 25 to
34 years. The dataset also contains information on the region of the country where the person lived, marital status, and
number of children. For the purposes of these exercises, let
AHE = average hourly earnings
College = binary variable (1 if college, 0 if high school)
Female = binary variable (1 if female, 0 if male)
Age = age (in years)
Northeast = binary variable (1 if Region = Northeast, 0 otherwise)
Midwest = binary variable (1 if Region = Midwest, 0 otherwise)
South = binary variable (1 if Region = South, 0 otherwise)
West = binary variable (1 if Region = West, 0 otherwise)
1. Using the regression results in column (1):
(a) Do workers with college degrees earn more, on average, than workers with only high school diplomas? How much
more?
(b) Do men earn more than women, on average? How much more?
2. Using the regression results in column (2):
(a) Is age an important determinant of earnings? Explain.
(b) Sally is a 29-year-old female college graduate. Betsy is a 34-year-old female college graduate. Predict Sally’s and
Betsy’s earnings.
3. Using the regression results in column (3):
(a) Do there appear to be important regional differences?
(b) Why is the regressor West omitted from the regression? What would happen if it were included?
(c) Juanita is a 28-year-old female college graduate from the South. Jennifer is a 28-year-old female college graduate
from the Midwest. Calculate the expected difference in earnings between Juanita and Jennifer.
4. Data were collected from a random sample of 200 home sales from a community in 2013. Let Price denote the selling
price (in thousands of dollars), BDR denote the number of bedrooms, Bath denote the number of bathrooms, Hsize
denote the size of the house (in square feet), Lsize denote the lot size (in square feet), Age denote the age of the house
(in years) and Poor denote a binary variable that is equal to 1 if the condition of the house is reported as “poor.” An
estimated regression yields
$\widehat{\text{Price}} = 109.7 + 0.567\,\text{BDR} + 26.9\,\text{Bath} + 0.239\,\text{Hsize} + 0.005\,\text{Lsize} + 0.1\,\text{Age} - 56.9\,\text{Poor}$
(a) Suppose that a homeowner converts part of an existing family room in her house into a new bathroom. What is
the expected increase in the value of the house?
(b) Suppose that a homeowner adds a new bathroom to her house, which increases the size of the house by 80 square
feet. What is the expected increase in the value of the house?
(c) What is the loss in value if a homeowner lets his house run down so that its condition becomes “poor”?
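As a rough illustration of how a fitted equation like the one in Question 4 is read, the sketch below evaluates the predicted price for a hypothetical house and for the same house with one regressor changed. Only the coefficients come from the estimated regression; the house characteristics and the helper name `predict_price` are assumptions made up for this example.

```python
# Coefficients copied from the estimated regression in Question 4
# (Price is measured in thousands of dollars).
COEFS = {
    "Intercept": 109.7,
    "BDR": 0.567,
    "Bath": 26.9,
    "Hsize": 0.239,
    "Lsize": 0.005,
    "Age": 0.1,
    "Poor": -56.9,
}


def predict_price(house):
    """Return the fitted price (in $1000s) for a dict of regressor values."""
    return COEFS["Intercept"] + sum(COEFS[name] * value for name, value in house.items())


# Hypothetical house: these values are assumptions for illustration only.
base = {"BDR": 3, "Bath": 2, "Hsize": 1800, "Lsize": 6000, "Age": 20, "Poor": 0}
# Same house with one more bedroom, every other regressor held fixed.
extra_bedroom = dict(base, BDR=base["BDR"] + 1)

change = predict_price(extra_bedroom) - predict_price(base)
print(f"Predicted change from one extra bedroom: {change:.3f} thousand dollars")
```

Because the equation is linear, the predicted change depends only on which regressors change and by how much, not on the starting values chosen for the hypothetical house.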
5. A school district undertakes an experiment to estimate the effect of class size on test scores in second-grade classes.
The district assigns 50% of its previous year’s first graders to small second-grade classes (18 students per classroom)
and 50% to regular-size classes (21 students per classroom). Students new to the district are handled differently: 20%
are randomly assigned to small classes and 80% to regular-size classes. At the end of the second-grade school year,
each student is given a standardized exam. Let Y denote the exam score, X denote a binary variable that equals 1 if
a student is assigned to a small class, and W denote a binary variable that equals 1 if a student is newly enrolled. Let
$\beta_1$ denote the causal effect on test scores of reducing class size from regular to small.
Consider the regression $Y = \beta_0 + \beta_1 X + u$. Do you think that $E[u \mid X] = 0$? Is the OLS estimator of $\beta_1$ unbiased and
consistent? Explain.
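One way to build intuition for Question 5 is a small simulation. The sketch below uses the assignment rates from the problem (50% for returning students, 20% for new students) but adds assumptions the problem does not state: a 30% share of new students, a causal effect of 5 points, and a 10-point lower average score for new students. It is an illustration under those made-up values, not a substitute for the argument the question asks for.

```python
import random

random.seed(1)
n = 100_000
beta1 = 5.0    # assumed causal effect of a small class (hypothetical value)
gamma = -10.0  # assumed score gap for newly enrolled students (hypothetical value)

small_scores, regular_scores = [], []
for _ in range(n):
    w = 1 if random.random() < 0.3 else 0   # assumed 30% share of new students
    p_small = 0.2 if w == 1 else 0.5        # assignment rates given in the problem
    x = 1 if random.random() < p_small else 0
    y = 60.0 + beta1 * x + gamma * w + random.gauss(0, 5)
    (small_scores if x == 1 else regular_scores).append(y)

# With a single binary regressor, the OLS slope equals the difference in group means.
slope_hat = (sum(small_scores) / len(small_scores)
             - sum(regular_scores) / len(regular_scores))
print(f"true beta1 = {beta1}, OLS slope from regressing Y on X = {slope_hat:.2f}")
```

Changing `gamma` to 0 in this sketch shows how the comparison depends on whether W has its own effect on scores.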
6. Suppose we have a sample $(X_{i1}, X_{i2}, Y_i)$, $i = 1, \dots, n$, and we estimate the model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$. Let
$\bar{X}_2 = \frac{1}{n}\sum_{i=1}^{n} X_{i2}$ and let $\bar{\hat{u}} = \frac{1}{n}\sum_{i=1}^{n} \hat{u}_i$. Explain why the sample covariance between $X_{i2}$ and $\hat{u}_i$
is zero. That is, show
$$\frac{1}{n}\sum_{i=1}^{n} (X_{i2} - \bar{X}_2)(\hat{u}_i - \bar{\hat{u}}) = 0.$$
What is the sample covariance between $X_{i1}$ and $\hat{u}_i$?
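The identity stated in Question 6 can be checked numerically before proving it. The sketch below fits the two-regressor model by solving the normal equations on simulated data, then computes the sample covariance between $X_{i2}$ and the residuals. The data-generating process and the helper `solve_linear_system` are assumptions for illustration; a numerical check of this kind does not replace the algebraic argument.

```python
import random


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


random.seed(0)
n = 500
# Simulated data: the coefficients and distributions below are assumptions.
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [0.5 * a + random.gauss(0, 1) for a in x1]
y = [1.0 + 2.0 * a - 1.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Normal equations (X'X) beta = X'y, where each row of X is (1, X_i1, X_i2).
rows = [(1.0, a, b) for a, b in zip(x1, x2)]
XtX = [[sum(r[j] * r[k] for r in rows) for k in range(3)] for j in range(3)]
Xty = [sum(r[j] * yi for r, yi in zip(rows, y)) for j in range(3)]
beta_hat = solve_linear_system(XtX, Xty)

residuals = [yi - sum(bj * rj for bj, rj in zip(beta_hat, r)) for r, yi in zip(rows, y)]
u_bar = sum(residuals) / n
x2_bar = sum(x2) / n
sample_cov = sum((b - x2_bar) * (u - u_bar) for b, u in zip(x2, residuals)) / n
print(f"sample covariance between X2 and the residuals: {sample_cov:.2e}")
```

The printed value differs from zero only by floating-point rounding, whatever seed or coefficients are used.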
7. We have a sample $(X_{i1}, X_{i2}, Y_i)$, $i = 1, \dots, n$, and we estimate the model $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + u$.
(a) Suppose that in this particular sample, $\hat{\beta}_1 = 0$. In this case, compute the other slope estimate, $\hat{\beta}_2$, and the intercept
$\hat{\beta}_0$.
(b) When do you expect the $R^2$ of the regression to be 0?
(c) When do you expect the $R^2$ of the regression to be 1?
8. In a department at a hospital, there are two types of surgeons: excellent surgeons and average surgeons. The binary
variable excellentSurgeon equals 1 if a patient receives an excellent surgeon, and 0 otherwise. For patients that have
an operation, we use length to denote the operation length, which is measured in hours. Suppose patients with high
blood pressure are more likely to develop complications which can lead to prolonged surgeries. In anticipation of this
issue, patients with high blood pressure tend to be assigned to excellent surgeons.
An administrator would like to determine the effect of a patient receiving an operation from an excellent surgeon. She
considers the following two regression models:
Short regression: $\text{length} = \beta_0 + \beta_1\,\text{excellentSurgeon} + u_{\text{short}}$
Long regression: $\text{length} = \gamma_0 + \beta_1\,\text{excellentSurgeon} + \beta_2\,\text{BP} + u_{\text{long}}$,
where BP stands for blood pressure, which is measured in mmHg.
(a) In regressions, we always assume the error term has zero mean. That is, $E[u_{\text{short}}] = 0$ and $E[u_{\text{long}}] = 0$. Explain
why this assumption is innocuous.
(b) What is the relationship between the two intercepts, $\beta_0$ and $\gamma_0$?
(c) Do you expect the zero conditional mean assumption, $E[u_{\text{short}} \mid \text{excellentSurgeon}] = 0$, to hold in the short
regression?
(d) Assume the zero conditional mean assumption holds in the long regression: $E[u_{\text{long}} \mid \text{excellentSurgeon}, \text{BP}] = 0$.
Let $\hat{\beta}_{1,\text{short}}$ and $\hat{\beta}_{1,\text{long}}$ be the two estimates of $\beta_1$ from the short and the long regression, respectively. Is $\hat{\beta}_{1,\text{short}}$
consistent for $\beta_1$? If not, what is the direction and magnitude of the bias? Is $\hat{\beta}_{1,\text{long}}$ consistent for $\beta_1$?
