Does education improve health? A reexamination of the evidence from compulsory schooling laws.
Mazumder, Bhashkar
Introduction and summary
Improving the long-term health of the population is clearly an
important goal for policymakers. It is also likely to become even more
so in the coming years with the aging of the baby boomers and the
anticipated health-related costs that will accompany this demographic
change. Therefore, understanding which policy levers might improve
health is of interest. In a provocatively titled front page article,
"A surprising secret to a long life: Stay in school," the New
York Times recently suggested that many researchers now believe that
education is the key factor in promoting health. (1) While social
scientists have long known that there is a strong positive correlation
between education and longevity, many researchers have speculated that
this association was not truly causal, meaning one didn't
necessarily lead to the other. Rather, the link was thought to reflect
either the fact that for a variety of other reasons (for example,
parental income and personal attitudes), people who tend to acquire more
schooling also tend to be in better health, or that healthier children
stayed in school longer. Of course, in the absence of evidence of a
causal link, there is no reason to expect that policies aimed at
increasing educational attainment will result in improvements in health.
The New York Times article was based upon the results of a recent
study by economist Adriana Lleras-Muney (2005) that provides perhaps the
strongest evidence to date that education has a causal effect on health.
By implementing an instrumental variables (IV) strategy, this research
analyzes changes in compulsory schooling and child labor laws across
different states early in the twentieth century and uses this
information to infer the effects of education on mortality. The idea
behind this strategy is that if differences in these laws induced people
born in different states in different years to obtain different levels
of schooling for reasons that are unrelated to any other determinants of
health, then one can estimate a true causal effect that is not
confounded by the other factors. Lleras-Muney finds that increased
schooling due to these laws led to dramatic reductions in mortality
rates during the 1960s and 1970s. In fact, the results imply that one
more year of schooling would lower the mortality rate over a ten-year
period by nearly 60 percent--a result that is perhaps implausibly large.
If it is true that more education leads to improved health, such a
finding also raises a second important question--namely, how, exactly,
does education affect health? Economists have proposed a variety of
theories including: that more education leads to better jobs and more
financial resources; that education improves knowledge and
decision-making ability, which improves health; and that education
influences other kinds of behavioral responses that, in turn, lead to
better health outcomes. So far, however, there is little convincing
empirical evidence on how to evaluate the importance of these factors.
In this article, I reexamine the use of these compulsory schooling
laws as a way of identifying the causal effects of education on health
through the IV approach. Given the fundamental importance of the
question of whether more education is causally linked to better health,
it is worth investigating the robustness of the relationship. I estimate
the same types of models used in the earlier research, using a much
larger sample and improved measures of compulsory schooling laws. I also
present alternative specifications of the statistical model that may
better account for other reforms that were going on during the same
period. For example, during the early period of the twentieth century,
there were fairly dramatic improvements in public health measures that
led to large declines in concurrent mortality (Cutler and Miller, 2005).
For school-age children specifically, new nutrition and vaccination
programs may have resulted in improved long-term health, independent of
any effects of increased education.
In addition, if compulsory schooling laws can be used to identify a
causal relationship, then they also ought to be useful in identifying
how education improves health. This can be analyzed by using data on
very specific health conditions for which existing theories might favor
one explanation versus another. For example, if processing information
and decision-making ability are the critical channels by which education
affects health, then we might expect lower incidences of chronic
diseases, such as arthritis, cancer, diabetes, lung disease, and heart
disease. These are conditions that might respond better to more
sophisticated management plans or behavioral changes. If the key factor
is increased access to high-quality health care due to greater financial
resources, then we might expect that a broad range of health outcomes
would be improved. Therefore, it makes sense to apply the same
methodology to other outcomes besides mortality.
A careful analysis of how education affects health using the IV
approach also serves as a credibility check on the methodology. If, for
example, all of the health effects appeared to be related to the
long-term effects of poor nutrition, then a plausible alternative
hypothesis would be that changes in compulsory schooling laws are really
just picking up the long-term health effects of improved nutrition in
schools. In that case, the assumption that these laws represent
exogenous sources of schooling differences would be invalid, and the
estimates would not represent a causal relationship between education
and health.
In order to address these issues, I first reexamine the effects of
education on mortality from Lleras-Muney (2005, 2006) by replicating the
results and extending them by adding significantly more data and
employing a variety of robustness checks. I find that the effect of
education on mortality is not robust to the inclusion of state-specific
time trends, casting doubt on whether there is a true causal effect. At
a minimum, my results show that the point estimates are much smaller
than those previously found in the literature. Moreover, the results
appear to be driven by the earliest cohorts (born in 1901-12) during the
1960-70 period.
Second, I use individual-level data on health outcomes from the
U.S. Census Bureau's Survey of Income and Program Participation
(SIPP) to further investigate the causal pathways between compulsory
schooling and health. In contrast to the U.S. Census data, which
require the use of a cohort grouping strategy to infer mortality, the
SIPP provides data on the health status of each individual so that we
can be sure that those who were affected by the compulsory schooling
laws are indeed the same individuals registering the change in health.
Using the SIPP with the same IV strategy, I find large and statistically
significant effects of education on general health status that are
robust to the inclusion of state-specific time trends. This suggests
that the SIPP micro data are able to overcome the limitations of the
U.S. Census data.
However, when I turn to the results that identify which specific
health conditions were affected by education improvements induced by
compulsory schooling laws, the results do not point to a coherent story
of how education affects health. For example, only a small fraction of
health conditions are affected by education, and several of those
affected are conditions, such as sight and hearing, where economic
theories don't appear to be relevant. What is also striking is the
absence of effects among many chronic diseases where decision-making
ability is believed pivotal. A limitation of the data, however, is that
specific conditions are only identified for a subset of the sample that
report having some health limitations. Nevertheless, this pattern of
results suggests that the use of compulsory schooling laws as an
instrument may be suspect. I also note that in a recent working paper,
Clark and Royer (2007) use an even more sophisticated approach to
analyze the effects of compulsory schooling law changes in the United
Kingdom on mortality. Their findings also cast doubt on whether there is
a strong causal connection between education and health.
Background and previous literature
Kitagawa and Hauser (1973) were the first to document the sharp
differences in health in the United States by socioeconomic status. A
large number of studies have since replicated this basic finding of a
"gradient" in health by education or income, and this pattern
has also been found in other countries. (2) For policymakers, a critical
question is whether this gradient reflects a causal relationship that
can be exploited to improve the long-term health of the population. For
example, in a document soliciting research proposals on the pathways
linking education to health, the National Institutes of Health (2003)
cautioned that: "The association or pathway between formal
education and either important health behaviors or diseases may not be
causal. Instead it may reflect the influence of confounding or
co-existing determinants or may be bi-directional."
A review of the literature on whether the education gradient in
health is causal may be found in Grossman (2005). While these studies
typically find an effect of more education leading to better health, in
most cases it is questionable whether the instruments are truly
exogenous. For example, Dhir and Leigh (1997) use parent schooling,
parent income, and state of residence as instruments, all of which could
plausibly affect long-term health independently of their effects through
schooling. The innovation by Lleras-Muney (2005) to use changes in
compulsory schooling laws early in the twentieth century appears to be
more compelling, since it is more plausibly exogenous than instruments
used in prior work. Nevertheless, other changes in public policy that
coincided with changes in compulsory schooling laws might have led to
long-run improvements in health. Cutler and Miller (2005) find that the
introduction of clean water technologies during this period could
explain as much as half of the concurrent decline in mortality.
Similarly, many states introduced food programs in schools, recognizing
that compulsory schooling was pointless if children were malnourished.
Near the beginning of the twentieth century, Robert Hunter (1904) wrote
in the book Poverty: "There must be thousands--very likely sixty or
seventy thousand children--in New York City alone who often arrive at
school hungry and unfitted to do well the work required. It is utter
folly, from the point of view of learning, to have a compulsory school
law which compels children, in that weak physical and mental state which
results from poverty, to drag themselves to school and to sit at their
desks, day in and day out, for several years, learning little or
nothing." In response to this situation, Philadelphia, Boston,
Milwaukee, New York, Cleveland, Cincinnati, and St. Louis all began
large-scale programs to provide food in public schools during the 1900s
and 1910s (Gunderson, 1971). Mazumder (2007) also provides suggestive
evidence that the mechanism by which compulsory schooling laws might
have improved long-term health was through school requirements for
vaccination against smallpox. If improvements in nutrition and
vaccination programs were coincident with changes in compulsory
schooling laws, then these might explain some or all of the long-term
health improvements that were associated with changes in these laws.
Supposing that it is true that more education leads to improved
health, this finding raises an interesting question--namely, how,
exactly, does education affect health? As Richard Suzman of the National
Institute on Aging recently stated, "Education ... is a
particularly powerful factor in both life expectancy and health
expectancy, though truthfully, we're not quite sure why." (3)
Economists have proposed a variety of explanations. These theories
typically emphasize the role of education in affecting various proximate
determinants of health, including financial resources, knowledge and
decision-making ability, and other behavioral characteristics that could
lead to better health outcomes.
Financial resources come into play because better educated
individuals may obtain higher paying and more stable jobs and thereby
may be able to afford better quality health care and health insurance.
With greater economic resources, they may also choose safer and more
secure living and work environments. If financial
resources are the key factor behind the link between education and
health, then we should expect to see virtually all forms of health
conditions affected by exogenous sources of increased education.
The second explanation is that higher levels of schooling may lead
to greater knowledge and an improved ability to process information and
make better choices or take better advantage of technological
improvements. In one widely cited paper, Goldman and Smith (2002) note
that better educated patients may manage chronic conditions better.
Those with more schooling adhere more closely to treatment regimens for
human immunodeficiency virus (HIV) infection and diabetes, which can be
fairly complex. For such conditions, the ability to form independent
judgments and comprehend treatments is important, and apparently is
fostered by schooling. Accordingly, Goldman and Smith (2002, p. 10934)
state that "self-maintenance is an important reason for the very
steep SES [socioeconomic status] gradient in health outcomes."
Glied and Lleras-Muney (2003) argue that "the most educated make
the best initial use of new information about different aspects of
health," permitting them to respond more adeptly to evolving
medical technologies.
Finally, it could be that education induces other kinds of
behavioral changes. For example, the better educated may place more
weight on the future relative to the present than those with less education,
and therefore, the better educated may take better care of their health
(Becker and Mulligan, 1997). Others have argued that education improves
one's perception of one's relative status in society and that
improved social standing is associated with better health (Marmot,
1994).
Mortality analysis: Methodology and data
The first part of the analysis estimates the effects of education
on mortality, using the approach developed by Lleras-Muney (2005). In
the absence of a large sample of data on individuals containing both
education and lifespan, I use group-level data from successive U.S.
Decennial Censuses to estimate mortality rates. Specifically, I use
population estimates for groups defined by state of birth, gender, and
year of birth to estimate the mortality rate across ten-year periods.
The mortality rate at time t for birth cohort c of gender g born in
state s, $M_{cgst}$, is simply measured as the percentage decline in
the population count $N_{cgst}$ within these cells over the
subsequent ten years:
1) $M_{cgst} = (N_{cgst} - N_{cgs,t+1}) / N_{cgst}$.
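As a rough illustration of equation 1, the following sketch computes the ten-year mortality rate for a single cell. The function name and the population counts are hypothetical and are not taken from the Census samples used here.

```python
def ten_year_mortality_rate(count_start, count_end):
    """Proportional decline in a cell's population count between two censuses."""
    if count_start <= 0:
        raise ValueError("starting population count must be positive")
    return (count_start - count_end) / count_start


# Hypothetical weighted counts for one (cohort, gender, state of birth) cell.
cell_counts = {1960: 52_000, 1970: 47_320}
rate_1960_70 = ten_year_mortality_rate(cell_counts[1960], cell_counts[1970])
print(round(rate_1960_70, 3))
```

Because the counts in practice are estimates from samples, nothing in this formula prevents a cell from showing a negative rate when the later count happens to exceed the earlier one.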
I then model the mortality rate for each cell as follows:
2) $M_{cgst} = a + E_{cgst}\pi + W_{cs}\delta + \gamma_c + \alpha_s + \theta_{cr} + \text{fem} + \tau_t + \epsilon_{cgst}$,
where $E_{cgst}$ is the average education level for that cell at
time t and $W_{cs}$ measures a set of cohort and state-specific
controls measured at age 14 intended to capture differences in other
potential early life determinants of mortality (for example,
manufacturing share of employment and doctors per capita). The model
also includes a set of cohort dummies $\gamma_c$, state of birth dummies $\alpha_s$,
interactions between cohort and region of birth $\theta_{cr}$, a
female dummy (fem), and year dummies $\tau_t$.
One straightforward way to estimate $\pi$ in equation 2 would be
through weighted least squares (WLS), with the weights corresponding to
the population represented by each cell. However, this would produce a
biased estimate because of omitted variables. Any number of factors
could plausibly be associated with both higher education and lower
mortality even at the group level. Therefore, I use two-stage least
squares, where in the first stage, education is instrumented with the
set of compulsory schooling laws, $CL_{cs}$, in place for each cohort
and state of birth:
3) $E_{cgst} = b + CL_{cs}\rho + X_{cgst}\beta + W_{cs}\delta + \gamma_c + \alpha_s + \theta_{cr} + \text{fem} + \tau_t + u_{cgst}$.
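The contrast between WLS and two-stage least squares can be sketched on simulated data. Everything in the example below is an assumption made for illustration: the single continuous "law" instrument, the unobserved confounder, the cell sizes, and the true effect of -0.05. It is not the specification or the data used in this article.

```python
import numpy as np


def weighted_least_squares(y, X, w):
    """Minimize sum_i w_i * (y_i - X_i b)^2 by rescaling rows with sqrt(w)."""
    root_w = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * root_w[:, None], y * root_w, rcond=None)
    return coef


def weighted_2sls(y, endog, instruments, exog, w):
    """Return the weighted 2SLS coefficient on one endogenous regressor."""
    first_stage_X = np.column_stack([instruments, exog])
    endog_hat = first_stage_X @ weighted_least_squares(endog, first_stage_X, w)
    second_stage_X = np.column_stack([endog_hat, exog])
    return weighted_least_squares(y, second_stage_X, w)[0]


rng = np.random.default_rng(0)
n = 5000
law = rng.normal(size=n)                    # stand-in instrument (assumed)
confounder = rng.normal(size=n)             # unobserved; drives both variables
education = 10 + law + confounder + rng.normal(size=n)
mortality = (1.0 - 0.05 * education - 0.10 * confounder
             + rng.normal(scale=0.1, size=n))
cell_size = rng.uniform(1.0, 3.0, size=n)   # hypothetical cell populations
const = np.ones((n, 1))

wls_est = weighted_least_squares(
    mortality, np.column_stack([education, const]), cell_size)[0]
iv_est = weighted_2sls(mortality, education, law, const, cell_size)
print(f"WLS: {wls_est:.3f}  IV: {iv_est:.3f}  (true effect: -0.050)")
```

In this simulated setting the WLS coefficient absorbs part of the confounder's effect, while the IV coefficient stays close to the assumed true value because the instrument is unrelated to the confounder by construction. Whether the actual laws satisfy that condition is the question examined in the rest of the article.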
In Lleras-Muney (2005), the instruments for the compulsory
schooling laws were constructed in the following way. The variable
childcom measured the minimum required age for work minus the maximum
age before a child is required to enter school, by state of birth and by
the year the cohort is age 14. This variable takes on one of eight
values. A set of indicator variables for these values was then used as instruments. In
addition, an indicator for whether school continuation laws were in
place in that state was also used. These laws required workers of school
age to continue school part time. However, it probably makes more sense
to match individuals to the laws concerning the maximum age for school
entry around the age at which students start school, rather than to the
laws in place when they were age 14. Therefore, I use a different set of
data independently collected by Goldin and Katz (2003). (4) Goldin and
Katz carefully compared their series with other codings of the
compulsory schooling laws (for example, Lleras-Muney, 2005; and Acemoglu
and Angrist, 2001) and resolved differences wherever possible. Since the
Goldin and Katz data go back further in time, it is possible to match
all of the cohorts to the school entry age laws in effect when the
cohorts were younger than 14. I use these data to measure the required
age for school entry when the cohorts were at age 8 instead of 14. In
principle, incorporating these data should provide a better measure of
the total years of compulsory schooling.
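The construction of the instrument set described above can be sketched as follows. The law ages and the list of eight categories are hypothetical placeholders; the text states only that the variable takes on eight values, not which ones.

```python
def childcom(min_work_age, max_entry_age):
    """Implied years of compulsory schooling: work age minus school entry age."""
    return min_work_age - max_entry_age


def indicator_set(value, categories):
    """One 0/1 indicator per category, omitting the first as the reference."""
    return {f"childcom_{c}": int(value == c) for c in categories[1:]}


# Hypothetical laws for one (state of birth, cohort) pair.
years_required = childcom(min_work_age=14, max_entry_age=8)

# Hypothetical list of the eight possible values (an assumption).
categories = [0, 4, 5, 6, 7, 8, 9, 10]
instruments = indicator_set(years_required, categories)
print(years_required, instruments)
```

Matching the entry-age law to the cohort at age 8 rather than age 14 changes only which year's law is looked up for `max_entry_age`; the arithmetic is the same.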
Several estimation samples are constructed for this part of the
analysis. Initially, I produce a sample combining data from the 1
percent Integrated Public Use Microdata Series (IPUMS) from the 1960,
1970, and 1980 U.S. Censuses in order to replicate the basic results in
Lleras-Muney (2005, 2006). (5) I then expand the analysis in stages.
First, I replace the 1 percent samples in 1970 and 1980 with a 2 percent
sample for 1970 and a 5 percent sample for 1980. Second, I also expand
the periods by adding 5 percent samples for 1990 and 2000. Following the
literature, I restrict the analysis to cohorts born between 1901 and
1925, top-code years of education at 18 starting in 1980, and exclude
immigrants and blacks. (6) For the expanded samples, I also exclude
cases where age, state of birth, and education are imputed by the U.S.
Census Bureau. The descriptive statistics for the replication sample and
the expanded sample are shown in table 1.
It is worth noting that the death rate for the 1970-80 period is
quite a bit larger with the expanded sample but that the standard
deviation is about 20 percent lower. There are now also five additional
cells that had missing data when using just the 1 percent samples. The
death rates for the 1980-90 and 1990-2000 periods are much higher
because I follow these same cohorts when they are much older. Figure 1
plots the death rates by age for each U.S. Census year. This highlights
the importance of controlling for age in the specifications, which is
done by adding polynomials in age to the models.
Health analysis: Methodology and data
The methodological approach changes only slightly when I turn to
using individual-level data from the SIPP. Many of the outcomes in the
SIPP are indicator variables that take on the value of 1 if a particular
health problem is present and 0 otherwise. Therefore, I now use
two-stage conditional maximum likelihood, or 2SCML (Rivers and Vuong,
1988), rather than IV. (7) Rivers and Vuong show that 2SCML has
desirable statistical properties, is easy to implement, and produces a
simple test for exogeneity. I continue to use IV for the few continuous
dependent variables. Also, all of the analysis is now done using
individual-level data. The statistical model is similar to equation 2,
only now I use the latent variable framework:
4) $y^*_{it} = a + E_i\pi + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s + \text{trend}_s + \tau_t + \text{fem} + \epsilon_{it}$,
5) $y_{it} = 1$ if $y^*_{it} > 0$, $y_{it} = 0$ if $y^*_{it} \le 0$.
In the first stage, I run a similar regression as before:
6) $E_i = b + CL_{cs}\rho + X_i\beta + W_{cs}\delta + \gamma_c + \alpha_s + \text{trend}_s + \tau_t + d + \epsilon_{it}$.
To implement 2SCML, I use the predicted residuals from equation 6,
$\hat{\epsilon}_{it}$, and I include them as an additional right-hand side
variable (along with the actual value of $E_i$) when running the
second stage probit. For comparability, I use the same sample
restrictions and covariates as in the U.S. Census results, with only a
few exceptions. I include a quadratic in age and use state-specific
cohort trends to address concerns that region of birth interacted with
cohort may not adequately control for state-specific factors that are
smoothly changing over time. (8)
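The two-step procedure just described can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the estimation code behind the tables: the probit is fit with a hand-written Fisher scoring loop, the instrument is a single continuous variable, and the true coefficients (-0.3 on education, 0.5 on the first-stage error) are invented for the example.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)


def normal_cdf(x):
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))


def normal_pdf(x):
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)


def fit_probit(y, X, max_iter=100, tol=1e-8):
    """Probit maximum likelihood via Fisher scoring, starting from zero."""
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        index = X @ beta
        p = np.clip(normal_cdf(index), 1e-10, 1 - 1e-10)
        dens = normal_pdf(index)
        score = X.T @ (dens * (y - p) / (p * (1 - p)))
        info = (X * (dens ** 2 / (p * (1 - p)))[:, None]).T @ X
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def two_stage_cml(y, endog, instruments, exog):
    """First-stage OLS residual enters the probit next to the actual regressor."""
    Z = np.column_stack([instruments, exog])
    gamma, *_ = np.linalg.lstsq(Z, endog, rcond=None)
    resid = endog - Z @ gamma
    beta = fit_probit(y, np.column_stack([endog, resid, exog]))
    return beta[0], beta[1]   # coefficients on education and on the residual


rng = np.random.default_rng(1)
n = 5000
law = rng.normal(size=n)                 # stand-in instrument (assumed)
v = rng.normal(size=n)                   # first-stage error
education = law + v
latent = -0.3 * education + 0.5 * v + rng.normal(size=n)
poor_health = (latent > 0).astype(float)
const = np.ones((n, 1))

coef_2scml, coef_resid = two_stage_cml(poor_health, education, law, const)
coef_naive = fit_probit(poor_health, np.column_stack([education, const]))[0]
print(f"naive probit: {coef_naive:.3f}  2SCML: {coef_2scml:.3f}  "
      f"residual: {coef_resid:.3f}")
```

A coefficient on the residual that differs from zero is what signals endogeneity in this framework. In the simulation the remaining latent error has unit variance, so the second-stage coefficients need no rescaling; that will not hold in general.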
[FIGURE 1 OMITTED]
The sample is constructed by pooling individuals from the 1984,
1986-88, 1990-93, and 1996 SIPP panels. Each SIPP panel surveys
approximately 20,000 to 40,000 households, and most panels are
representative of the noninstitutionalized population. (9) Because
participation in many programs is closely related to an
individual's health and disability status, the SIPP routinely
collects information on health and medical conditions. The SIPP is also
ideally suited for this analysis because it contains the state of birth
of all sample members, which allows me to implement the IV strategy of
using compulsory schooling laws during childhood.
One especially useful outcome is self-reported health (SRH). The
SRH is on a 1-5 scale, where 1 is "excellent," 2 is "very
good," 3 is "good," 4 is "fair," and 5 is
"poor." The SRH has been found to be an excellent predictor of
mortality and changes in functional abilities among the elderly (Case,
Lubotsky, and Paxson, 2002). I experiment with this measure in a few
ways. First, I use it as a continuous variable. Second, I use indicators
for being in poor health or in fair or poor health. Finally, I use the
health utility scale that measures the differences between the
categories in a health model using the National Health Interview Survey
(conducted by the U.S. Department of Health and Human Services, Centers
for Disease Control and Prevention, National Center for Health
Statistics). (10)
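The first two codings of self-reported health described above can be sketched as a small helper. The function and key names are hypothetical. The health utility scale is left out because its category values are not given in this passage.

```python
SRH_LABELS = {1: "excellent", 2: "very good", 3: "good", 4: "fair", 5: "poor"}


def srh_outcomes(srh):
    """Return the continuous and indicator codings of a 1-5 SRH response."""
    if srh not in SRH_LABELS:
        raise ValueError(f"SRH must be an integer from 1 to 5, got {srh!r}")
    return {
        "srh_continuous": float(srh),   # higher values mean worse health
        "poor": int(srh == 5),
        "fair_or_poor": int(srh >= 4),
    }


print(srh_outcomes(4))
```

Because higher values of the scale denote worse health, a negative coefficient on education in the continuous specification corresponds to better reported health.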
I also examine some other general outcomes. These are whether the
individual was hospitalized during the past year, the number of times
she was hospitalized, the total number of nights spent in the hospital,
and the number of days spent in bed in the past four months.
There are also questions dealing with functional activities,
activities of daily living, and instrumental activities of daily living
that are derived from the International Classification of Impairments,
Disabilities, and Handicaps (ICIDH). I assembled a common set of
questions that were consistently asked across surveys. These are whether
the individual has "difficulty" with seeing, hearing,
speaking, lifting, walking, and climbing stairs, as well as whether the
person can perform any of these activities "at all." In
addition, there is information on whether individuals have difficulty
getting around inside the house, going outside of the house, or getting
in or out of bed, as well as whether they need the assistance of others
for these activities.
For a subset of individuals who report limited abilities in certain
tasks or who have been classified as having a work disability
("health limitation"), detailed information is collected on a
number of very specific health conditions including: arthritis or
rheumatism; back or spine problems; blindness or vision problems;
cancer; deafness or serious trouble hearing; diabetes; heart trouble;
hernia; high blood pressure (hypertension); kidney stones or chronic
kidney trouble; mental illness; missing limbs; lung problems; paralysis;
senility/dementia/Alzheimer's disease; stiffness or deformity of
limbs; stomach trouble; stroke; thyroid trouble or goiter; tumors (cyst
or growth); or other. (11) Since the specific health ailments are only
asked of specific subsamples, they probably only pick up on the most
severe cases. Even though many of the sample individuals are not
actually asked about these specific health conditions, I still include
them in the estimation sample so that the sample is not a selected
sample of only those in poor health. The summary statistics for these
data are shown in table 2.
Mortality results
I begin by trying to match the estimates of the effect of education
on ten-year mortality rates shown in Lleras-Muney (2006). (12) Using
WLS, Lleras-Muney's estimate is -0.036, and using IV, her estimate
is -0.063. These estimates imply huge effects. For example, the IV
estimate implies that one additional year of education would reduce the
ten-year mortality rate by about 60 percent. (13) In table 3, I show the
results of the replication exercise, as well as the effects of expanding
the sample and employing additional robustness checks. In the first row
of panel A of table 3, I match the WLS estimate of -0.036 exactly,
although my IV estimate of -0.072 is slightly larger. It is also worth
pointing out that the partial F statistic on the first stage regression
is reasonable at 7.5. (14) The second row of panel A uses the 1960 (1
percent) sample, as well as the larger samples for 1970 (2 percent) and
1980 (5 percent), and utilizes the Goldin and Katz (2003) data for
constructing the instruments. I find that the WLS estimate rises in magnitude to
-0.045 and that the IV estimate drops considerably in magnitude to -0.045. Had I used
the Lleras-Muney data for constructing the instruments, the estimate
would be exactly the same at -0.045. However, the standard error would
have declined by about 25 percent relative to the first row, suggesting
that expanding the sample provides considerably more precision. In the
third and fourth rows of panel A, I control for age and find that this
lowers the WLS estimates a little and increases the IV estimates a
little. In the fifth row, I drop the region of birth interactions with
cohort and instead use state-specific linear (cohort) trends. This
raises the WLS estimate to -0.048, but I now find that the IV
coefficient is sharply lower at -0.016 and is no longer statistically
significant. However, the fact that the standard error does not rise
suggests that the precision is the same when including the
state-specific trends.
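The percentage interpretation of these coefficients comes from dividing the coefficient by a baseline ten-year mortality rate. The baseline is not restated in this passage, so the sketch below backs it out from the two numbers that are reported (an IV estimate of -0.063 and a reduction of about 60 percent). The resulting value of roughly 0.105 is therefore an inference, not a figure taken from the tables, and the function names are hypothetical.

```python
def implied_baseline(coefficient, relative_reduction):
    """Baseline rate at which the coefficient equals the stated reduction."""
    return abs(coefficient) / relative_reduction


def relative_reduction(coefficient, baseline_rate):
    """Share of the baseline mortality rate removed by one more year of school."""
    return abs(coefficient) / baseline_rate


# Inferred from the reported -0.063 and "about 60 percent" (an assumption).
baseline = implied_baseline(-0.063, 0.60)

# Under the same inferred baseline, the replicated IV estimate of -0.072.
replicated_share = relative_reduction(-0.072, baseline)
print(round(baseline, 3), round(replicated_share, 2))
```

The same conversion applied to the expanded samples would need their own baseline, since the mean death rate is higher once later Census years are included.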
In panel B of table 3, I add data from the 5 percent samples of the
1990 and 2000 U.S. Censuses. With this larger data set, I construct
death rates over four ten-year periods and therefore follow cohorts over
a longer period with a considerably larger sample. Given that the sample
also tracks the cohorts later in life when mortality rates are much
higher, the age controls are essential. I use a cubic in age, although I
find that the results are not very sensitive to the choice of the
polynomial. Since medical technology and other health-related factors
might change over time, I have also interacted the cubic in age with the
U.S. Census year. In this specification (the first row of panel B), I
now find that the WLS estimate is about -0.034 and that the IV estimate
is -0.026. Both of these estimates are a bit more plausible than the
ones mentioned previously. The IV estimate is now significant at the 10
percent level, but not at the 5 percent level. With this larger sample,
the inclusion of state-specific cohort trends again results in a point
estimate that is much smaller in magnitude (-0.012) and not
statistically distinguishable from zero (the second row of panel B),
despite a similar degree of precision.
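The two sets of controls discussed in this paragraph, a cubic in age interacted with Census year and state-specific linear cohort trends, amount to extra columns in the design matrix. The sketch below shows one way such columns might be built. The function names and the four toy observations are assumptions for illustration only.

```python
import numpy as np


def age_cubic_by_year(age, census_year):
    """Columns age, age^2, age^3, each switched on only in its own Census year."""
    powers = np.column_stack([age, age ** 2, age ** 3])
    blocks = [powers * (census_year == yr)[:, None]
              for yr in np.unique(census_year)]
    return np.hstack(blocks)


def state_linear_trends(state, birth_year):
    """One column per state holding a linear cohort trend, zero elsewhere."""
    trend = birth_year - birth_year.min()
    return np.column_stack([trend * (state == s) for s in np.unique(state)])


# Toy cells: two states, two cohorts, two Census years (hypothetical).
state = np.array(["IL", "IL", "OH", "OH"])
birth_year = np.array([1901, 1925, 1901, 1925])
census_year = np.array([1960, 1970, 1960, 1970])
age = (census_year - birth_year).astype(float)

age_cols = age_cubic_by_year(age, census_year)
trend_cols = state_linear_trends(state, birth_year)
print(age_cols.shape, trend_cols.tolist())
```

Each state receives its own slope on the cohort trend, so any smooth state-level change across cohorts is absorbed by these columns rather than attributed to the schooling laws.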
In the remaining panels of table 3, I examine how the effects vary
by year, age, and cohort. In panel C, I separately estimate the
education coefficient for each U.S. Census year. Since the specification
includes a full set of cohort dummies, these are equivalent to age
controls when using a single U.S. Census year. Although the WLS
estimates are significant in all years, they peak in 1970-80 at -0.061
and drop to only -0.012 by 1990-2000. The IV estimates have large
standard errors and are therefore imprecise.
Nonetheless, the point estimate is large only for 1960-70 and is
actually positive for 1980-90 and 1990-2000. In panel D, I stratify the
sample by three age ranges: 35-54, 55-64, and 65-89. Here I observe
different patterns between the WLS and IV specifications. The WLS
estimates suggest that the largest effect may be for those aged 55-64,
while the IV estimates are largest for those aged 35-54. Given the
imprecision of the estimates, I cannot draw any meaningful inferences
regarding the age pattern.
Panel E of table 3, however, provides a striking result when using
the IV specification. It appears that the entire effect of education on
mortality arising from compulsory schooling laws is due to cohorts born
in 1901-12, who constitute just over 40 percent of the sample. In fact,
for those born in 1913-25, the point estimate is actually positive.
Interpreting the mortality results
I interpret the results in the fifth row of panel A and the second
row of panel B of table 3 as suggesting that I cannot reject the null
hypothesis that the effect of education on mortality is zero. In other
words, education has no causal effect on mortality once I adequately
control for state time trends. An alternative view might be that once
one includes state time trends, the coefficient is smaller but still
negative, and that the standard errors are simply too large to estimate
the effect precisely, and therefore, I cannot rule out a causal effect.
One might be concerned, for example, that the instruments are highly
collinear with the time trends. However, as I have shown, the standard
errors do not rise when including the time trends. In any case, this
alternative interpretation of the results would implicitly start with
the hypothesis that there is a causal effect and that the results here
do not offer sufficient evidence to reject that hypothesis--a strong
assumption given that the literature has yet to successfully identify a
causal effect.
If one takes seriously the point estimates shown in the fifth row
of panel A and the second row of panel B of table 3 (despite their
statistical insignificance), then this implies that the causal effects
of education on mortality are much smaller than previously thought. A
more reasonable estimate then is that an additional year of schooling
lowers mortality risk over a ten-year period by about 10 percent. This
is still a large effect that might reflect the true causal effect.
Still, it bears repeating that using the current research design, I am
unable to reject the hypothesis that the true effect is actually zero.
My analysis also suggests that, upon closer inspection, the results
are driven by cohorts born very early in the century and their mortality
experience during the 1960-70 period. One possible explanation could be
that the effect of education stayed roughly constant but that compulsory
schooling laws had their biggest effect on those born earlier in the
century. However, I have run the first-stage regressions by these cohort
groupings and found that the partial F statistics on the instruments are
actually much higher for the 1913-25 cohorts. This suggests that the
schooling laws may actually have been more binding for the later
cohorts, casting doubt on this alternative explanation.
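The comparison of first-stage strength across cohort groups can be
sketched as follows. This is a minimal illustration on simulated data,
not the article's data or code: it computes a conventional
(homoskedastic) partial F statistic for the excluded instruments,
whereas the article clusters its standard errors. The function names,
the simulated coefficients, and the group labels are all hypothetical.

```python
import numpy as np


def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def partial_f(y, controls, instruments):
    """F statistic for the joint significance of the instruments
    in a first-stage regression that also includes the controls."""
    n = len(y)
    X_restricted = np.column_stack([np.ones(n), controls])
    X_full = np.column_stack([X_restricted, instruments])
    q = instruments.shape[1]            # number of excluded instruments
    dof = n - X_full.shape[1]           # residual degrees of freedom
    rss_r, rss_u = rss(X_restricted, y), rss(X_full, y)
    return ((rss_r - rss_u) / q) / (rss_u / dof)


def simulate_first_stage(rng, n, strength):
    """Simulated schooling that responds to two instruments with a given strength."""
    controls = rng.normal(size=(n, 2))
    instruments = rng.normal(size=(n, 2))
    schooling = (10 + controls @ np.array([0.5, -0.3])
                 + instruments @ np.array([strength, strength])
                 + rng.normal(size=n))
    return schooling, controls, instruments


rng = np.random.default_rng(0)
# Hypothetical strengths: laws bind weakly for one group, strongly for the other.
f_weak = partial_f(*simulate_first_stage(rng, 2000, 0.05))
f_strong = partial_f(*simulate_first_stage(rng, 2000, 0.30))
print(f"partial F, weakly binding group:   {f_weak:.1f}")
print(f"partial F, strongly binding group: {f_strong:.1f}")
```

A larger partial F for one group indicates that the instruments
explain more of the variation in schooling for that group, which is
the sense in which the laws are described as more binding.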
Health outcome results
Table 4 presents the results for health outcomes based on microdata
from the SIPP. The first column shows the effects of education using a
simple probit (or ordinary least squares, or OLS), which does not
account for endogeneity. The second column presents the 2SCML (or IV)
estimates using the compulsory schooling laws as instruments. Given the
possible effects of education on mortality and the fact that outcomes in
the SIPP are not observed until at least 1984, one might not expect any
remaining health effects to be apparent. As it turns out, I do find
significant effects using the instruments for several broad health
outcomes. The first row of panel A shows that self-reported health
measured as a continuous variable is affected by education. The IV
estimate of -0.23 is more than twice the OLS estimate of -0.09. In the
fourth column using a Hausman test of exogeneity, I can reject that the
OLS and IV coefficients are the same at the 7 percent level (shown as
0.074 in the table). Translating the SRH into a health index on a 1-100
scale following Johnson and Schoeni's (2007) approach, the IV
estimate implies that an increase in schooling by one year improves the
health index by 4.5 points, or about 7 percent evaluated at the mean
(third column). I also estimate that the probability of being in fair or
poor health is reduced by 8.2 percentage points with an additional year
of schooling, a fairly large effect that is statistically different from
the naive probit at the 18 percent significance level. I do not find,
however, that any of the measures of hospitalization or days spent in
bed are significant when accounting for endogeneity.
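The logic of comparing naive and instrumented estimates can be
sketched with a linear control-function regression, which is a linear
analogue of the two-stage approach and not the article's 2SCML
implementation. The first stage regresses schooling on the
instruments; the second stage adds the first-stage residual to the
outcome equation. The coefficient on schooling then equals the linear
IV estimate, and the t statistic on the residual serves as a
regression-based exogeneity test. The data are simulated, and all
names and parameter values are hypothetical. The standard errors here
are conventional, not clustered.

```python
import numpy as np


def ols(X, y):
    """OLS coefficients, residuals, and conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta, resid, np.sqrt(np.diag(cov))


def control_function_test(y, schooling, instruments):
    """Return the IV-equivalent schooling coefficient and the
    t statistic on the first-stage residual (exogeneity test)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), instruments])
    _, v_hat, _ = ols(Z, schooling)                  # first-stage residuals
    X_aug = np.column_stack([np.ones(n), schooling, v_hat])
    beta, _, se = ols(X_aug, y)
    return beta[1], beta[2] / se[2]


rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 2))                          # simulated instruments
u = rng.normal(size=n)                               # unobserved confounder
schooling = 10 + 0.5 * z[:, 0] + 0.3 * z[:, 1] + u + rng.normal(size=n)
# Outcome scored so that higher means worse health; true effect is -0.2.
health = 3.0 - 0.2 * schooling + 0.5 * u + rng.normal(size=n)

naive = ols(np.column_stack([np.ones(n), schooling]), health)[0][1]
iv_coef, t_resid = control_function_test(health, schooling, z)
print(f"naive OLS coefficient: {naive:.3f}")
print(f"IV coefficient:        {iv_coef:.3f}")
print(f"t statistic on first-stage residual: {t_resid:.1f}")
```

In this simulation the confounder pushes the naive estimate toward
zero, so the instrumented estimate is more negative, and the large t
statistic on the residual signals that schooling is endogenous.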
Looking across a variety of measures of physical function, I find
that, while all of the naive probit estimates are significant and of the
expected sign, the two-stage estimates are typically not significant.
Those who have an additional year of schooling because of compulsory
schooling laws are no less likely to have trouble lifting, walking,
climbing stairs, getting around outside the house, getting around inside
the house, or getting into or out of bed. In fact, for many of these
outcomes, the coefficients are actually positive, suggesting they have a
greater propensity for worse health. On the other hand, those with
greater schooling associated with compulsory schooling laws are
dramatically less likely to experience problems with seeing, hearing, or
speaking. In almost all of these cases, the differences between the
simple probit and the 2SCML estimates are very large and statistically
different at about the 10 percent level. For example, the 2SCML
estimates imply that an additional year of schooling reduces the
probability of having trouble "seeing" by 5.6 percentage
points. In this sample, the mean rate of this health outcome is 13.6
percent. These results might suggest that the channel by which general
health is compromised for those with less schooling may be related to
sensory functions.
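To put these magnitudes in perspective, the sketch below divides the
IV/2SCML coefficients from table 4 by the corresponding sample means
from table 2. That the "effect size" column of table 4 is computed
this way is an inference from the reported numbers, not something
stated in this section; small differences from the published column
would reflect rounding of the means. The dictionary and variable names
are hypothetical.

```python
# (IV/2SCML coefficient from table 4, sample mean from table 2)
rows = {
    "Health index": (4.5345, 67.992),
    "Fair or poor health": (-0.0824, 0.357),
    "Trouble seeing": (-0.0559, 0.136),
    "Trouble hearing": (-0.0499, 0.152),
    "Trouble speaking": (-0.0192, 0.021),
}

# Proportional change in the outcome per additional year of schooling.
effect_sizes = {name: coef / mean for name, (coef, mean) in rows.items()}

for name, size in effect_sizes.items():
    print(f"{name}: {size:+.3f}")
```

For example, a 5.6 percentage point reduction in trouble seeing,
against a mean rate of 13.6 percent, corresponds to a proportional
reduction of roughly 41 percent.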
Next, I estimate results based on the incidence of specific health
conditions. Recall that these conditions are only identified for subsets
of individuals and that the screening criteria changed across SIPP
survey years. Also recall that all individuals are included regardless
of whether they were screened for this question, so as to avoid using a
sample of only those in poor health. Generally, the underlying health
conditions were only asked of individuals who reported particular kinds
of activity limitations, reported having a work disability, or reported
being in fair or poor health. This is captured by the variable
"health limitation," which, not surprisingly, is significant
under both probit and 2SCML. When I turn to the estimated likelihood of
having one of the underlying health conditions, the probit estimates
once again are significant in every case. The 2SCML estimates, however,
are only negative and significant for four outcomes: back or spine
problems; stiffness or deformity of a limb; diabetes; and
senility/dementia/Alzheimer's disease. It is important to point out
that "trouble seeing," "trouble hearing," and
"trouble speaking" were never used as screening criteria for
asking about an underlying health condition. This likely explains why
blindness and deafness are not significant within the subsamples.
Surprisingly, both kidney problems and hypertension appear to be
positively associated with more schooling. This is especially notable
because these are two outcomes for which self-management and recent
technological advances appear to be especially important. According to
appendix table B of Glied and Lleras-Muney (2003), treatment of kidney
infections experienced substantial innovation. Among the 56 causes of
death, kidney disease experienced the fastest decline in age-adjusted
mortality from 1986 to 1995--falling more than 9 percent per year (Glied
and Lleras-Muney, 2003, p. 8, appendix table B). Accordingly, a steep
(negative) gradient between education and kidney disease would
presumably be expected. It is therefore of note that the 2SCML
specification finds an increase in the incidence of kidney problems
among those with high education. Treatment of diabetes is "often
considered the prototype for chronic disease management" (Goldman
and Smith, 2002). My findings, which analyze a broad range of health
conditions and chronic diseases, would suggest that, insofar as
formal schooling is concerned, diabetes appears to be an exception. In
the SIPP data, diabetes enters in the expected direction; that is,
increases in schooling appear to reduce the incidence of severe cases of
diabetes.
On the one hand, since diabetes is also associated with loss of
limbs and poor vision, the diabetes result could be a plausible
explanation for those findings. On the other hand, kidney problems and
hypertension, which are also commonly associated with diabetes, go in
the wrong direction. Further, there is no well-established connection
between diabetes and speech, hearing, and back problems. An alternative
explanation for the diabetes result could be that states that had higher
compulsory schooling levels also promoted nutritional policies that
might have reduced adult onset of diabetes. Overall, however, one
conclusion that may be drawn from this table is that there is little
support for the "decision-making" hypothesis.
I would also note that explanations for the link between education
and health that focus on better health care access due to more financial
resources (for example, from higher income and a better paying
occupation) or unobserved time preferences do not appear to be
consistent with these results. These explanations would likely imply
that many outcomes ought to be affected, not just a few.
There are two important limitations to this analysis. First, I
observe individuals only if they have survived into the 1980s and 1990s
when they are anywhere between the ages of 59 and 83. This sample is
almost certainly positively selected on education and health, so it is
unclear to what extent the results may be generalized. I suspect that because
of this selection, my results are biased against finding any effects of
education on improving health, making it still surprising that there are
very large negative coefficients on the incidence of several negative
health outcomes. Second, because specific health conditions are only
asked of those who report an activity limitation or being in fair or
poor health, some individuals with a particular condition may not be
captured in the analysis. Nonetheless, it may be even more meaningful to
identify the effects of education on specific conditions that were
severe enough to cause an activity limitation.
Conclusion
In this article, I expand upon the growing literature that attempts
to identify whether there is a causal effect of education on health. I
closely examine the effects of education induced by compulsory schooling
laws early in the twentieth century on long-term health, using several
approaches. First, I revisit the results in Lleras-Muney (2005, 2006) by
expanding the U.S. Census sample and employing a variety of robustness
checks. The main finding is that the effects of education on mortality
induced by changes in compulsory schooling laws are not robust to
including state-specific time trends, suggesting that a causal
interpretation is unwarranted.
Second, I use the SIPP to identify not only general health effects
but also specific health outcomes that were induced by changes in state
compulsory schooling laws to see if these outcomes correspond to our
existing theories of how education affects health. The results suggest
that there is a large effect of education on general health status
arising from compulsory schooling laws that is robust to state time
trends. However, I find that, with the important exception of diabetes,
none of the other specific health conditions that are associated with
education (for example, vision, hearing, speaking ability, back
problems, deformities, and senility) correspond to the leading theories
of how education improves health (for example, technological
improvements, better decision-making, lower discount rates, higher
income). This suggests that either our theories are incorrect or that
the compulsory schooling laws are suspect instruments. An important
caveat, however, is that the SIPP analysis uses a sample of older
individuals who are almost surely positively selected on education and
health. While this likely makes it more difficult to detect effects of
education on improved health, it also raises questions as to how far one
can generalize these results.
A few other studies have begun to implement strategies to better
identify the causal effects of education on health with mixed findings.
In a working paper, Clark and Royer (2007) use differences in compulsory
schooling laws affecting very narrowly defined birth cohorts in the
United Kingdom, combined with individual-level mortality data, and find
very small effects of education on mortality, which are consistent with
the results here. In another working paper, Deschenes (2007) uses
plausibly exogenous variation based on cohort size in the U.S. and
estimates a statistically significant and large effect of education on
mortality using a grouped estimator. Deschenes' estimates suggest
that an additional year of schooling adds an additional year to life
expectancy. Because we are still only in the early stages of our
understanding of this important issue, it is important to conduct
replication and extension exercises on the small number of studies that
have used more credible research strategies.
REFERENCES
Acemoglu, D., and J. Angrist, 2001, "How large are human
capital externalities? Evidence from compulsory schooling laws," in
NBER Macroeconomics Annual 2000, B. S. Bernanke and K. S. Rogoff (eds.),
Cambridge, MA: MIT Press, pp. 9-59.
Becker, G. S., and C. B. Mulligan, 1997, "The endogenous
determination of time preference," Quarterly Journal of Economics,
Vol. 112, No. 3, August, pp. 729-758.
Case, A., D. Lubotsky, and C. Paxson, 2002, "Economic status
and health in childhood: The origins of the gradient," American
Economic Review, Vol. 92, No. 5, December, pp. 1308-1334.
Clark, D., and H. Royer, 2007, "The effect of education on
longevity: Evidence from the United Kingdom," Case Western Reserve
University, working paper.
Cutler, D., and G. Miller, 2005, "The role of public health
improvements in health advances: The twentieth-century United
States," Demography, Vol. 42, No. 1, February, pp. 1-22.
Deaton, A., and C. Paxson, 2004, "Mortality, income, and
income inequality over time in Britain and the United States," in
Perspectives on the Economics of Aging, D. A. Wise (ed.), Chicago:
University of Chicago Press, pp. 247-280.
Deschenes, O., 2007, "The effect of education on adult
mortality: Evidence from the baby boom generation," University of
California, Santa Barbara, working paper.
Dhir, R., and J. P. Leigh, 1997, "Schooling and frailty among
seniors," Economics of Education Review, Vol. 16, No. 1, February,
pp. 45-57.
Glied, S., and A. Lleras-Muney, 2003, "Health inequality,
education, and medical innovation," National Bureau of Economic
Research, working paper, No. 9738, June.
Goldin, C., and L. F. Katz, 2003, "Mass secondary schooling
and the state," National Bureau of Economic Research, working paper,
No. 10075, November.
Goldman, D. P., and J. P. Smith, 2002, "Can patient
self-management help explain the SES health gradient?," Proceedings
of the National Academy of Sciences, Vol. 99, No. 16, August 6, pp.
10929-10934.
Grossman, M., 2005, "Education and nonmarket outcomes,"
National Bureau of Economic Research, working paper, No. 11582, August.
Gunderson, G. W., 1971, "The National School Lunch Program:
Background and development," U.S. Department of Agriculture, Food
and Nutrition Service, report, available at
www.fns.usda.gov/cnd/Lunch/AboutLunch/ProgramHistory.htm.
Hunter, R., 1904, Poverty, New York: Macmillan.
Johnson, R. C., and R. F. Schoeni, 2007, "The influence of
early-life events on human capital, health status, and labor market
outcomes over the life course," University of California, Berkeley,
Institute for Research on Labor and Employment, working paper, No.
iirwps-140-07, January 2.
Kitagawa, E. M., and P. M. Hauser, 1973, Differential Mortality in
the United States: A Study in Socioeconomic Epidemiology, Cambridge, MA:
Harvard University Press.
Kolata, G., 2007, "A surprising secret to a long life: Stay in
school," New York Times, January 3, available at
www.nytimes.com/2007/01/03/health/03aging.html.
Lleras-Muney, A., 2006, "Erratum: The relationship between
education and adult mortality in the United States," Review of
Economic Studies, Vol. 73, No. 3, p. 847.
--, 2005, "The relationship between education and adult
mortality in the United States," Review of Economic Studies, Vol.
72, No. 1, pp. 189-221.
--, 2002, "Were compulsory attendance and child labor laws
effective? An analysis from 1915 to 1939," Journal of Law and
Economics, Vol. 45, No. 2, part 1, October, pp. 401-435.
Lyman, R., 2006, "Census report foresees no crisis over aging
generation's health," New York Times, March 10, available at
www.nytimes.com/2006/03/10/national/10aging.html.
Mazumder, B., 2007, "How did schooling laws improve long-term
health and lower mortality?," Federal Reserve Bank of Chicago,
working paper, No. WP-2006-23, revised January 24, 2007.
Marmot, M. G., 1994, "Social differences in health within and
between populations," Daedalus, Vol. 123, No. 4, pp. 197-216.
National Institutes of Health, 2003, "Pathways linking
education to health," report, Bethesda, MD, No. RFA OB-03-001,
January 8, available at
http://grants1.nih.gov/grants/guide/rfa-files/RFA-OB-03-001.html.
Rivers, D., and Q. H. Vuong, 1988, "Limited information
estimators and exogeneity tests for simultaneous probit models,"
Journal of Econometrics, Vol. 39, No. 3, November, pp. 347-366.
NOTES
(1) Kolata (2007)
(2) For example, Deaton and Paxson (2004) document that there is a
strong association between education and health in the United Kingdom.
(3) See Lyman (2006). The National Institute on Aging is part of
the National Institutes of Health.
(4) The results from using the Lleras-Muney (2005) instruments
instead of the Goldin and Katz (2003) instruments are not very
different and are reported in an earlier version of this article, Mazumder
(2007).
(5) The IPUMS are from the University of Minnesota, Minnesota
Population Center.
(6) Lleras-Muney (2002) found no effect of compulsory schooling
laws on the education levels of blacks.
(7) I thank Jay Bhattacharya for this suggestion. In a previous
version of the article, I found very similar results using two-stage
least squares for the dichotomous outcomes.
(8) I generally found that the IV results were larger and more
significant when using the state trends than when using region of birth
interacted with cohort. The ordinary least squares results were
virtually identical under either specification.
(9) The 1990 and 1996 panels include an oversample of poorer
households. The restriction to the noninstitutionalized population means
that those living in nursing homes are not included in the survey.
However, more than 90 percent of the disabled and more than 80 percent
of those requiring long-term care live outside of institutions; for
further details, see http://aspe.hhs.gov/daltcp/reports/rnl1.htm
(10) See Johnson and Schoeni (2007) and the citations therein for a
discussion of this approach.
(11) I pool responses from the 1984, 1990-93, and 1996 SIPPs in
order to maximize sample size. Unfortunately, different criteria were
used across the SIPP survey years to select the subsamples for which
specific health conditions were asked. For example, in 1996 the health
conditions were asked of those who reported being in fair or poor health.
I found that it was important to combine all of the subsamples in all of
the years in order to have enough power to identify effects. There is
also an additional set of ten outcomes that are not used because they
were not available in the 1984 SIPP. Experimentation with a smaller
sample suggests that the conclusions are not altered by dropping these
other outcomes.
(12) Note that these are estimates from errata that correct the
previous estimates in Lleras-Muney (2005). See Mazumder (2007) for more
details.
(13) The mean ten-year mortality rate in Lleras-Muney (2005) is
10.6 percent, so a reduction of 6.3 percentage points implies a 59
percent reduction in mortality.
(14) The partial F statistic rises to 9.07 when using the expanded
sample.
Bhashkar Mazumder is a senior economist in the Economic Research
Department and the executive director of the Chicago Research Data
Center at the Federal Reserve Bank of Chicago. The author thanks Douglas
Almond, Claudia Goldin, Adriana Lleras-Muney, Anna Paulson, and Diane
Schanzenbach.
TABLE 1
Summary statistics for Integrated Public Use Microdata Series samples
1960 1%, 1970 1% and
1980 1% samples
Standard Number of
Variables Mean deviation observations
Ten year death rates
Overall 0.108 0.136 4,792
1960-70 0.110 0.119 2,395
1970-80 0.105 0.152 2,397
1980-90 -- -- --
1990-2000 -- -- --
Individual characteristics
Education 10.548 0.990 4,795
1960 dummy 0.471 0.499 4,795
1970 dummy -- -- --
1990 dummy -- -- --
Female 0.517 0.500 4,795
Age 50.366 8.482 4,795
Born in 1905 0.031 0.174 4,795
Born in 1910 0.038 0.191 4,795
Born in 1915 0.044 0.205 4,795
Born in 1920 0.048 0.213 4,795
Born in 1925 0.050 0.217 4,795
State of birth characteristics
Percentage urban 53.523 21.279 4,795
Percentage foreign-born 11.737 8.523 4,795
Percentage black 8.983 11.901 4,795
Percentage employed
in manufacturing 0.067 0.038 4,795
Annual manufacturing wage ($) 7,171.39 1,343.09 4,795
Value of farm per acre ($) 540.05 276.35 4,795
Per capita number of doctors 0.001 0.000 4,795
Per capita education
expenditures ($) 97.01 42.05 4,795
Number of school buildings
per square mile 0.174 0.09 4,795
1960 1%, 1970 2%, 1980 5%,
1990 5%, and 2000 5% samples
Standard Number of
Variables Mean deviation observations
Ten year death rates
Overall 0.213 0.173 8,636
1960-70 0.113 0.105 2,397
1970-80 0.154 0.125 2,400
1980-90 0.287 0.170 2,399
1990-2000 0.433 0.122 1,440
Individual characteristics
Education 10.729 1.002 8,636
1960 dummy 0.325 0.469 8,636
1970 dummy 0.289 0.453 8,636
1990 dummy 0.142 0.349 8,636
Female 0.532 0.499 8,636
Age 56.811 11.287 8,636
Born in 1905 0.025 0.157 8,636
Born in 1910 0.031 0.174 8,636
Born in 1915 0.047 0.211 8,636
Born in 1920 0.052 0.222 8,636
Born in 1925 0.057 0.232 8,636
State of birth characteristics
Percentage urban 53.778 21.153 8,636
Percentage foreign-born 11.562 8.430 8,636
Percentage black 8.945 11.787 8,636
Percentage employed
in manufacturing 0.066 0.037 8,636
Annual manufacturing wage ($) 7,206.15 1,353.57 8,636
Value of farm per acre ($) 535.18 272.57 8,636
Per capita number of doctors 0.001 0.000 8,636
Per capita education
expenditures ($) 99.78 41.71 8,636
Number of school buildings
per square mile 0.172 0.09 8,636
Notes: Summary statistics are for state of birth, cohort, and
gender cells. All means and standard deviations use sample
weights where the weights are the population estimates for the
cell in the base period.
Source: Author's calculations based on data from the University
of Minnesota, Minnesota Population Center. Integrated Public Use
Microdata Series.
TABLE 2
Summary statistics for Survey of Income
and Program Participation Sample
Standard Number of
Variables Mean deviation observations
Outcomes
Self-reported health
(1 is excellent, 5 is poor) 3.084 1.138 26,030
Poor health 0.119 0.324 26,030
Fair or poor health 0.357 0.479 26,030
Health index (1-100 scale) 67.992 24.842 26,030
Hospitalized in last year 0.180 0.384 26,484
Days in bed, last four months 3.937 17.030 25,223
Number of times hospitalized 0.282 1.029 22,229
Number of nights in hospital 1.908 7.898 26,274
Trouble seeing 0.136 0.342 20,853
Trouble hearing 0.152 0.359 20,845
Trouble speaking 0.021 0.144 20,834
Trouble lifting 0.237 0.425 20,837
Trouble walking 0.289 0.453 20,799
Trouble with stairs 0.276 0.447 20,820
Trouble getting around outside
the home 0.129 0.335 17,401
Trouble getting around inside
the home 0.059 0.235 17,643
Trouble getting in/out of bed 0.079 0.270 17,636
Trouble seeing at all 0.023 0.149 20,811
Trouble hearing at all 0.013 0.114 20,819
Trouble speaking at all 0.003 0.052 15,138
Trouble lifting at all 0.115 0.319 20,789
Trouble walking at all 0.154 0.361 20,723
Trouble with stairs at all 0.116 0.321 20,775
Needs help getting around outside 0.088 0.283 13,610
Needs help getting around inside 0.024 0.154 13,893
Needs help getting in/out of bed 0.025 0.156 13,868
Work limitation due to health
conditions 0.423 0.494 19,073
Arthritis 0.129 0.335 19,073
Back 0.062 0.242 19,073
Blind 0.026 0.159 19,073
Cancer 0.016 0.125 19,073
Deaf 0.023 0.149 19,073
Deformity 0.027 0.162 19,073
Diabetes 0.030 0.170 19,073
Heart 0.090 0.287 19,073
Hernia 0.006 0.080 19,073
Hypertension 0.036 0.185 19,073
Kidney 0.005 0.067 19,073
Lung 0.043 0.203 19,073
Mental illness 0.005 0.067 19,073
Missing limb 0.003 0.056 19,073
Paralysis 0.006 0.075 19,073
Senility 0.007 0.084 19,073
Stomach 0.010 0.099 19,073
Stroke 0.021 0.144 19,073
Thyroid 0.003 0.056 19,073
Other 0.066 0.247 19,073
Individual characteristics
Education 11.432 3.208 26,030
Female 0.580 0.494 4,795
Age 72.079 5.606 4,795
Source: Author's calculations based on data from the U.S.
Census Bureau, Survey of Income and Program Participation.
TABLE 3
New estimates of effects of education on mortality
                                                              Number of
Sample and specification                     WLS       IV     observations
A. 1960-80
1960-1980 1%:
  No age controls, region x cohort          -0.036    -0.072    4,792
                                           (0.004)   (0.025)
1960 1%, 1970 2%, and 1980 5%:
  No age controls, region x cohort          -0.045    -0.045    4,797
                                           (0.004)   (0.024)
  With age cubic, region x cohort           -0.039    -0.047    4,797
                                           (0.004)   (0.024)
  With age cubic x Census year,             -0.040    -0.047    4,797
    region x cohort                        (0.004)   (0.024)
  With age cubic x Census year,             -0.048    -0.016    4,797
    state x cohort trend                   (0.004)   (0.024)
B. 1960-2000
1960 1%, 1970 2%, and 1980-2000 5%:
  With age cubic x Census year              -0.034    -0.026    8,636
                                           (0.003)   (0.015)
  With age cubic x Census year,             -0.036    -0.012    8,636
    state x cohort trend                   (0.003)   (0.016)
C. 1960-2000, by Census year
1960 1%, 1970 2%, and 1980-2000 5% with age cubic x Census year:
  Estimated effect for 1960-70              -0.025    -0.081    2,397
                                           (0.006)   (0.052)
  Estimated effect for 1970-80              -0.061    -0.023    2,400
                                           (0.005)   (0.033)
  Estimated effect for 1980-90              -0.043     0.023    2,399
                                           (0.004)   (0.029)
  Estimated effect for 1990-2000            -0.012     0.027    1,440
                                           (0.005)   (0.039)
D. 1960-2000, by age
1960 1%, 1970 2%, and 1980-2000 5% with age cubic x Census year:
  35-54 year olds                           -0.017    -0.067    2,879
                                           (0.005)   (0.036)
  55-64 year olds                           -0.039     0.063    2,398
                                           (0.005)   (0.053)
  65-89 year olds                           -0.030    -0.047    3,071
                                           (0.003)   (0.023)
E. 1960-2000, by cohort
1960 1%, 1970 2%, and 1980-2000 5% with age cubic x Census year:
  Cohorts born in 1901-12                   -0.019    -0.203    3,644
                                           (0.004)   (0.125)
  Cohorts born in 1913-25                   -0.017     0.025    4,992
                                           (0.004)   (0.023)
Notes: WLS means weighted least squares. IV means instrumental
variables. The dependent variable is the ten-year mortality rate;
table entries are the coefficient on education. All
specifications include year dummies, cohort dummies, state of
birth dummies, region of birth interacted with cohort, and an
intercept (except for panel A, fifth row, and panel B, second
row). Estimates are weighted using the number of observations in
the cell in the base year. Standard errors, shown in parentheses,
are clustered at the state of birth and cohort level.
TABLE 4
Estimates of effects of education on health outcomes
IV/2SCML
Dependent variable OLS/probit IV/2SCML effect size
A. General health outcomes
Self-reported health -0.0941 -0.2289 -0.074
(1 is excellent, 5 is poor) (0.0023) (0.0745)
Health index (1-100 scale) 1.9674 4.5345 0.067
(0.0511) (1.6738)
Fair or poor health -0.0359 -0.0824 -0.230
(0.0010) (0.0343)
Poor health -0.0141 -0.0269 -0.226
(0.0006) (0.0206)
Hospitalized in last year -0.0049 -0.0268 -0.149
(0.0008) (0.0241)
Days in bed, last four months -0.3310 2.1526 0.547
(0.0364) (1.4848)
Number of times hospitalized -0.0101 -0.0944 -0.335
(0.0024) (0.0884)
Number of nights in hospital -0.0730 -1.0828 -0.567
(0.0186) (0.7668)
B. Functional limitations/activities of daily
living/instrumental activities of daily living
Trouble seeing -0.0122 -0.0559 -0.412
(0.0007) (0.0254)
Trouble hearing -0.0103 -0.0499 -0.329
(0.0007) (0.0247)
Trouble speaking -0.0019 -0.0192 -0.909
(0.0002) (0.0079)
Trouble lifting -0.0198 -0.0055 -0.023
(0.0009) (0.0330)
Trouble walking -0.0251 0.0130 0.045
(0.0011) (0.0325)
Trouble with stairs -0.0250 -0.0066 -0.024
(0.0010) (0.0324)
Trouble getting around -0.0120 -0.0146 -0.114
outside the home (0.0008) (0.0257)
Trouble getting around -0.0048 0.0051 0.087
inside the home (0.0005) (0.0208)
Trouble getting in/ -0.0056 0.0013 0.016
out of bed (0.0006) (0.0230)
Trouble seeing at all -0.0020 -0.0078 -0.343
(0.0002) (0.0084)
Trouble hearing at all -0.0008 -0.0100 -0.758
(0.0001) (0.0045)
Trouble speaking at all 0.0000 -0.0008 -0.284
(0.0001) (0.0001)
Trouble lifting at all -0.0100 -0.0029 -0.025
(0.0007) (0.0250)
Trouble walking at all -0.0148 0.0107 0.069
(0.0008) (0.0260)
Trouble with stairs at all -0.0114 0.0071 0.061
(0.0006) (0.0202)
Needs help getting -0.0066 0.0044 0.050
around outside (0.0007) (0.0153)
Needs help getting -0.0010 0.0108 0.446
around inside (0.0002) (0.0078)
Needs help getting -0.0011 0.0092 0.372
in/out of bed (0.0003) (0.0080)
C. Specific health conditions
Health limitation -0.0250 -0.0743 -0.175
(0.0013) (0.0348)
Arthritis -0.0088 -0.0043 -0.034
(0.0008) (0.0217)
Back -0.0028 -0.0349 -0.561
(0.0005) (0.0167)
Blind -0.0014 0.0145 0.557
(0.0003) (0.0084)
Cancer -0.0007 0.0025 0.161
(0.0002) (0.0078)
Deaf -0.0003 -0.0041 -0.179
(0.0002) (0.0064)
Deformity -0.0006 -0.0159 -0.591
(0.0002) (0.0066)
Diabetes -0.0023 -0.0258 -0.868
(0.0003) (0.0082)
Heart -0.0062 -0.0014 -0.016
(0.0006) (0.0194)
Hernia -0.0003 0.0023 0.362
(0.0001) (0.0037)
Hypertension -0.0031 0.0376 1.053
(0.0004) (0.0124)
Kidney -0.0001 0.0042 0.938
(0.0001) (0.0027)
Lung -0.0037 0.0203 0.472
(0.0005) (0.0152)
Mental illness -0.00009 -0.0002 -0.045
(0.00008) (0.0424)
Missing limb -0.00007 -0.0019 -0.580
(0.00005) (0.0016)
Paralysis -0.00011 0.0016 0.287
(0.00006) (0.0020)
Senility -0.00005 -0.0015 -0.214
(0.00002) (0.0006)
Stomach -0.0006 0.0069 0.695
(0.0002) (0.0060)
Stroke -0.0008 0.0084 0.397
(0.0003) (0.0090)
Thyroid -0.0000001 0.000001 0.000
(0.000000) (0.000000)
Other -0.0023 -0.0013 -0.019
(0.0005) (0.0152)
Exogeneity
test Number of
Dependent variable p value observations
A. General health outcomes
Self-reported health 0.074 26,030
(1 is excellent, 5 is poor)
Health index (1-100 scale) 0.131 26,030
Fair or poor health 0.176 26,030
Poor health 0.533 26,030
Hospitalized in last year 0.364 26,484
Days in bed, last four months 0.074 25,223
Number of times hospitalized 0.329 22,229
Number of nights in hospital 0.185 26,289
B. Functional limitations/activities of daily
living/instrumental activities of daily living
Trouble seeing 0.085 20,853
Trouble hearing 0.109 20,845
Trouble speaking 0.039 20,573
Trouble lifting 0.667 20,837
Trouble walking 0.242 20,797
Trouble with stairs 0.993 20,820
Trouble getting around 0.918 17,401
outside the home
Trouble getting around 0.635 17,463
inside the home
Trouble getting in/ 0.764 17,621
out of bed
Trouble seeing at all 0.490 20,589
Trouble hearing at all 0.060 20,256
Trouble speaking at all 0.000 7,516
Trouble lifting at all 0.775 20,789
Trouble walking at all 0.328 20,723
Trouble with stairs at all 0.359 20,775
Needs help getting 0.470 13,598
around outside
Needs help getting 0.125 13,757
around inside
Needs help getting 0.191 13,794
in/out of bed
C. Specific health conditions
Health limitation 0.157 19,073
Arthritis 0.836 19,012
Back 0.061 18,924
Blind 0.060 18,454
Cancer 0.677 18,569
Deaf 0.568 18,422
Deformity 0.018 18,821
Diabetes 0.007 18,688
Heart 0.804 19,025
Hernia 0.454 17,179
Hypertension 0.000 18,683
Kidney 0.072 16,593
Lung 0.106 19,060
Mental illness 0.932 15,794
Missing limb 0.155 14,565
Paralysis 0.348 17,301
Senility 0.070 17,993
Stomach 0.195 17,701
Stroke 0.295 18,918
Thyroid 0.000 14,559
Other 0.947 19,060
Notes: OLS means ordinary least squares. IV means instrumental
variables. 2SCML means two-stage conditional maximum likelihood.
Standard errors, shown in parentheses, are clustered at the
state of birth and cohort level.