The Black-White Test Score Gap Through Third Grade∗

Roland G. Fryer, Jr. Harvard University Society of Fellows and NBER

Steven D. Levitt University of Chicago and ABF

September 2004 (preliminary and incomplete)
This paper describes basic facts regarding the black-white test score gap over the first four years of school. A number of stylized facts emerge. Black children enter school substantially behind their white counterparts in reading and math, but including a small number of covariates erases the gap. Over the first four years of school, however, Blacks lose substantial ground relative to other races, averaging .10 standard deviations per school year. By the end of third grade there is a large Black-White test score gap that cannot be explained by observable characteristics. Blacks are falling behind in virtually all categories of skills tested, except the most basic. None of the explanations we examine, including systematic differences in school quality across races, convincingly explains the divergent academic trajectory of Black students.

∗ Correspondence can be addressed to Fryer at Department of Economics, Harvard University, 1875 Cambridge Street, Cambridge MA, 02138 (e-mail: [email protected]); or Levitt at Department of Economics, University of Chicago, 1126 E. 59th Street, Chicago IL, 60637 (e-mail: [email protected]).

Decades after the landmark Supreme Court decision in Brown v. Board of Education, racial gaps in educational achievement remain substantial. Prior research shows that black children enter kindergarten lagging their white counterparts, and that these differences grow throughout the school years (Campbell, Hombo, and Mazzeo 2000, Carneiro and Heckman 2002, Coleman et al. 1966, Neal 2004, Phillips, Crouse, and Ralph 1998). On every subject at each grade level there are substantial differences between Blacks and Whites (Campbell, Hombo, and Mazzeo 2000, Neal 2004). The typical Black seventeen-year-old reads at the proficiency level of the typical White thirteen-year-old (Campbell, Hombo, and Mazzeo 2000). Black college-bound students score, on average, more than one standard deviation below white college goers; Blacks are the lowest performing minority group (Roach 2001). Even in affluent neighborhoods, achievement gaps are startling (Ferguson 2001, 2002; Ogbu 2003). Even after including myriad controls, the test score gap remains essentially unchanged (Jencks and Phillips 1998). While the Brown decision provided unprecedented hope for a future of educational equality, that hope has yet to be realized.
Despite these disturbing differences, a recent analysis of a newly available data set, the Early Childhood Longitudinal Study (ECLS), provides two reasons for optimism (Fryer and Levitt 2004). First, the raw test score differences for the recent cohort covered by ECLS are substantially below those found in earlier studies, suggesting the possibility of real gains by Blacks in recent cohorts. Second, in stark contrast to previous studies, Fryer and Levitt (2004) are able to eliminate the black-white test score gap for incoming kindergartners with the inclusion of just a parsimonious set of controls. Any optimism, however, is tempered by the fact that by the end of first grade (the last data used in Fryer and Levitt 2004), Black students have already lost substantial ground (the equivalent of almost three months of schooling) relative to Whites. If this trend were to continue, by the tenth grade blacks would be one standard deviation behind whites, a number consistent with prior research (Jones, Burton, and Davenport 1982, Phillips et al. 1998b; Phillips 2000).
Fryer and Levitt (2004) were largely unsuccessful in pinpointing the mechanisms driving the divergent trajectories of blacks and whites. A number of leading hypotheses (the importance of parental and environmental contributions grows over time, black students suffer worse summer setbacks, standardized tests are poor measures, interactions between black students and schools interfere with learning) fail to explain why Blacks lost ground. The only hypothesis that received any empirical support was systematically lower quality schools for Blacks relative to Whites. The primary evidence in favor of this hypothesis emerged from comparisons of test score trajectories within versus across schools. Including school fixed effects eliminates two-thirds of the difference in the learning trajectory of blacks and whites over the first two years of school. In other words, a White student attending the same school as a Black student loses two-thirds as much ground against the typical White student as does the Black student. Nonetheless, the evidence on school quality as the driving force in the racial gaps in Fryer and Levitt (2004) was largely circumstantial and subject to numerous important caveats.1
1 There are at least three weaknesses in the argument that school quality is the mechanism behind black underachievement. First, Hispanics also attend worse schools than whites, yet their test scores converge. Second, because the assignment of kids to schools depends in large part on residential location, school fixed effects are in many ways equivalent to neighborhood fixed effects. Third, including the school inputs available in the ECLS does little to lessen the gap.
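The within-versus-across-school comparison can be illustrated with a minimal sketch. It is not the authors' code: the toy data, the variable names, and the helper functions `slope` and `demean_within` are hypothetical, and sampling weights and other covariates are omitted. Demeaning both the race indicator and the outcome within each school is one standard way to obtain the coefficient from a regression with school fixed effects.

```python
from collections import defaultdict

def slope(d, y):
    """OLS slope of y on a single regressor d (intercept included)."""
    n = len(y)
    d_bar, y_bar = sum(d) / n, sum(y) / n
    sxy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    sxx = sum((di - d_bar) ** 2 for di in d)
    return sxy / sxx

def demean_within(values, school_ids):
    """Subtract each school's mean from the values of its students."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, s in zip(values, school_ids):
        totals[s] += v
        counts[s] += 1
    return [v - totals[s] / counts[s] for v, s in zip(values, school_ids)]

# Hypothetical toy data (not ECLS values): school "A" has larger test score
# gains and mostly white students; school "B" has smaller gains and mostly
# Black students. Within each school the Black-white difference is 0.1.
school = ["A", "A", "A", "A", "B", "B", "B", "B"]
black = [0, 0, 0, 1, 0, 1, 1, 1]
gain = [0.3, 0.3, 0.3, 0.2, -0.1, -0.2, -0.2, -0.2]

# Across-school comparison: simple difference in mean gains by race.
raw_gap = slope(black, gain)

# Within-school comparison: the same regression after removing school means,
# which is equivalent to including a fixed effect for every school.
within_gap = slope(demean_within(black, school), demean_within(gain, school))
```

The toy data are constructed so that the within-school gap is one-third of the raw gap, mirroring the case in which school fixed effects remove two-thirds of the difference; the remaining two-thirds of the raw gap reflects which schools students attend.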

In this paper, we extend the analysis offered in Fryer and Levitt (2004) in three directions. First, data from ECLS through the third grade have recently become available, allowing us to extend the analysis from first grade to third grade. Second, we have obtained the restricted-use version of the data, which contains detailed information on additional geographic indicators down to the zip code level. Third, we investigate an additional explanation for the emerging Black-White test score gap, namely, that the set of skills tested in the third grade differs systematically from the set tested in kindergarten, and that Blacks perform worse on the skills emphasized in the later years.
A number of stylized facts emerge in this paper. We find that Blacks continue to lose ground relative to Whites in second and third grade at a pace consistent with the losses observed between kindergarten and first grade. On average, blacks are losing .10 standard deviations per year relative to whites in the first four years of school. In contrast to Fryer and Levitt (2004), however, systematic differences in school quality appear much less important in explaining the differences in test-score trajectories by race once the data are extended through third grade; Blacks lose substantial ground relative to whites within the same school and even in the same classrooms. That is, including school or teacher fixed effects does little to explain the divergent trajectories of black and white students between kindergarten and third grade. Hispanics continue to make up for their initial deficit relative to whites, while Asians continue to make gains.
By the end of third grade, even after controlling for observables, the black-white test score gap is evident in every skill tested in reading and math except for the most basic tasks such as counting and letter recognition, which virtually all students have mastered. The largest racial gaps in third grade are in the skills most crucial to future academic and labor market success: multiplication and division in math, and inference, extrapolation, and evaluation in reading.
The remainder of the paper is structured as follows. Section II describes the data used in the analysis. Section III presents the basic facts and patterns in test scores in the first four years of school using these data. Section IV investigates the extent to which alternative hypotheses can account for the fact that Blacks are steadily losing ground. Section V concludes.
II. The Data

The Early Childhood Longitudinal Study Kindergarten Cohort (ECLS-K) is a nationally representative sample of over 20,000 children entering kindergarten in 1998. Thus far, information on these children has been gathered at five separate points in time. The full sample was interviewed in the fall and spring of kindergarten, spring of first grade, and spring of third grade. A random sample of one-fourth of the respondents was also interviewed in the fall of first grade. The sample will ultimately be followed through fifth grade.2 Roughly 1,000 schools are included in the sample, with an average of more than twenty children per school in the study. As a consequence, it is possible to conduct within-school analyses.
A wide range of data is gathered on the children in the study, which is described in detail at the ECLS website. We utilize just a small subset of the available information in our baseline specifications (although Fryer and Levitt (2004) show that similar results are obtained in a much more fully specified model). Students who are missing data on test scores, race, or age are dropped from our sample.
2 In addition, there is an ECLS birth cohort that tracks a nationally representative sample of over 15,000 children born in 2001 through the first grade.
Summary statistics for the variables we use in our core specifications are displayed by race in Table 1, with White referring solely to non-Hispanic Whites.3 Our primary outcome variables are math and reading standardized test scores.4 Standardized tests were administered to the full sample in the fall and spring of kindergarten and first grade and the spring of third grade.5 The reading test includes questions designed to measure basic skills (print familiarity, letter recognition, beginning and ending sounds, rhyming sounds, and word recognition), vocabulary and comprehension, listening and reading comprehension, knowledge of the alphabet, phonetics, and so on. The math test evaluates number recognition, counting, comparing and ordering numbers, solving word problems, interpreting picture graphs, addition and subtraction, multiplying and dividing, place value, and rate and measurement. The values reported in the table are item response theory (IRT) scores provided in ECLS-K, which we have transformed to have mean zero and a standard deviation of one for the overall sample on each of the tests and time periods.6 In all instances sample weights provided in ECLS-K are used.7 White students on average score .307 standard deviations above the mean on the math exam in the fall of kindergarten, whereas Black students perform .356 standard deviations below the mean on that test, yielding a Black-White gap of .663 standard deviations. By the spring of third grade, that gap has increased to .882 standard deviations. The initial Black-White gap on reading is smaller (.400 standard deviations). As with math, however, the reading gap widens substantially to .771 standard deviations by the end of third grade.
3 There are also a small number of children in the data whose racial status is classified as “other.” These include Hawaiian, mixed race, and Native American students. Such students are included in our regressions, but not shown in the summary statistics table.
4 These tests were developed especially for the ECLS, but are based on existing instruments including Children’s Cognitive Battery (CCB); Peabody Individual Achievement Test-Revised (PIAT-R); Peabody Picture Vocabulary Test-3 (PPVT-3); Primary Test of Cognitive Skills (PTCS); and Woodcock-Johnson Psycho-Educational Battery-Revised (WJ-R). Students are administered the test questions orally, as it is not assumed that they know how to read. A “general knowledge” exam was also administered. The general knowledge test is designed to capture “children’s knowledge and understanding of the social, physical, and natural world and their ability to draw inferences and comprehend implications.” No further information is available on the precise content of the general knowledge exam questions or skills tested. We limit the analysis to math and reading scores, primarily because of the comparability of these test scores to past research in the area. In addition, there appear to be some peculiarities in the results of the general knowledge exam. For instance, Asians score well above other groups on math and reading, but do extremely poorly on the general knowledge exam. Also, Black students do extremely poorly on the general knowledge exam, even though teachers rate them only slightly behind Whites in this area on the subjective teacher evaluations. Most of our results also appear in the general knowledge scores, and we note the instances where differences arise.
5 The tests were also given in the spring of kindergarten, but we limit our focus to the endpoints of the available data. The kindergarten spring test results are in all cases consistent with the results presented in the paper.
6 Because children were asked different questions depending on the answers they provided to the initial questions on the test, IRT-adjusted scores are preferable to simple test-score measures reflecting the number of correct answers a child provided. For more detail on the process used to generate the IRT scores, see chapter 3 of the ECLS-K Users Guide. Our results are not sensitive to normalizing the IRT scores to have a zero mean and standard deviation equal to one.
7 Because of the complex manner in which the ECLS-K sample is drawn, different weights are suggested by the providers of the data depending upon the set of variables used (BYPW0). We utilize the weights recommended for making longitudinal comparisons. None of our findings are sensitive to other choices of weights, or not weighting at all.
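The normalization and gap calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the toy scores and weights are hypothetical (they are not ECLS-K values), the function names are introduced here, and the weighted population variance is assumed as the scaling convention. The last line simply reproduces the fall kindergarten math arithmetic reported in the text.

```python
import math

def weighted_mean(values, weights):
    """Mean of values using sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def standardize(scores, weights):
    """Rescale scores to weighted mean 0 and weighted standard deviation 1."""
    mu = weighted_mean(scores, weights)
    var = weighted_mean([(s - mu) ** 2 for s in scores], weights)
    sd = math.sqrt(var)
    return [(s - mu) / sd for s in scores]

def group_gap(z, weights, groups, group, reference="white"):
    """Weighted mean of `group` minus weighted mean of `reference`."""
    def mean_of(label):
        idx = [i for i, g in enumerate(groups) if g == label]
        return weighted_mean([z[i] for i in idx], [weights[i] for i in idx])
    return mean_of(group) - mean_of(reference)

# Hypothetical toy IRT scores, sampling weights, and race labels.
scores = [22.0, 30.0, 26.0, 18.0, 20.0, 16.0]
weights = [1.0, 2.0, 1.0, 1.0, 1.0, 2.0]
groups = ["white", "white", "white", "black", "black", "black"]

z = standardize(scores, weights)
toy_gap = group_gap(z, weights, groups, "black")  # negative: Black below white

# Arithmetic from the text: White mean (.307) minus Black mean (-.356).
reported_gap = 0.307 - (-0.356)
```

Because the scores are standardized on the full sample at each test date, a gap computed this way is expressed in standard deviation units and can be compared across grades.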
The remainder of Table 1 presents summary statistics for the other variables used in the analysis. In contrast to the test score variables, for which we have observations at multiple points in time, many of the control variables are either not time varying (e.g., birth weight), collected only once, or exhibit little variation over time for individual students. The most important of these covariates is a composite measure of socio-economic status constructed by the researchers conducting the ECLS survey. The components used in the SES measure are parental education, parental occupational status, and household income. Other variables included as controls are gender, child’s age at the time of enrollment in kindergarten, WIC participation (a nutrition program aimed at relatively low income mothers and children), mother’s age at first birth, birth weight, and the number of children’s books in the home.8 There are substantial differences across races on many of these variables. Black children in the sample are growing up under circumstances likely to be less conducive to academic achievement than White children: lower socio-economic status, fewer children’s books in the home, etc. Hispanics are also worse off than Whites on average. For Asians, the patterns are more mixed. The set of covariates we include matches those used in Fryer and Levitt (2004). While this particular set of covariates might seem odd, the results we obtain with this small set of variables mirror the findings when we include an exhaustive set of over 100 controls. In light of past research that has had great difficulty making the Black-White test score gap disappear, we focus on the results from these very parsimonious regressions to highlight the fact that the sharp differences between our results and earlier studies are not primarily a consequence of the availability of different covariates in the ECLS. It is important to stress that a causal interpretation of the coefficients on the covariates is likely to be inappropriate; we view these particular variables as proxies for a broader set of environmental and behavioral factors.
8 A more detailed description of each of the variables used is provided in the appendix.

III. Basic Facts about Racial Differences in Early Achievement

Table 2 presents a series of estimates of the racial test score gap in math for the tests taken over the first four years of school. The specifications estimated are of the form:

$$ y_{it} = \rho_i \gamma + x_{it} \beta + \varepsilon_{it} $$

where $y_{it}$ denotes individual $i$’s test score in grade $t$ and $x_{it}$ represents an array of student-level social and economic variables describing each student’s environment. The variable $\rho_i$ is a full set of race dummies included in the regression, with White as the omitted category. Consequently, the coefficients on race capture the gap between the named racial category and Whites. Our primary emphasis is on the Black-White test score gap. In all instances, the estimation is done using weighted least squares, with weights corresponding to the sampling weights provided in the data set. When there are multiple observations of social and economic variables (SES, number of books in the home, and so on), for all specifications, we only include the value recorded in the fall kindergarten survey.9
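A minimal sketch of this kind of weighted least squares regression is given below. It is not the authors' code: the toy data, the single SES covariate, and the helper names `solve`, `wls`, and `design_row` are assumptions made for illustration, and the normal equations are solved directly in plain Python rather than with a statistics package. With race dummies only, the coefficient on each dummy equals the weighted difference in means relative to Whites; adding covariates yields an adjusted gap.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def wls(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - x_i'b)^2."""
    k = len(X[0])
    xtwx = [[sum(wi * row[p] * row[q] for row, wi in zip(X, w)) for q in range(k)]
            for p in range(k)]
    xtwy = [sum(wi * row[p] * yi for row, yi, wi in zip(X, y, w)) for p in range(k)]
    return solve(xtwx, xtwy)

RACES = ["black", "hispanic", "asian"]  # White is the omitted category

def design_row(race, covariates):
    """Intercept, race dummies, then any student-level covariates."""
    return [1.0] + [1.0 if race == r else 0.0 for r in RACES] + list(covariates)

# Hypothetical toy data: standardized scores, sampling weights, SES.
races = ["white", "white", "black", "black",
         "hispanic", "hispanic", "asian", "asian"]
y = [0.4, 0.2, -0.5, -0.3, -0.2, -0.4, 0.3, 0.5]
w = [1.0, 3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0]
ses = [0.5, 0.1, -0.6, -0.2, -0.3, -0.5, 0.2, 0.6]

# Race dummies only: coefficients are raw weighted gaps relative to Whites.
raw = wls([design_row(r, []) for r in races], y, w)

# Race dummies plus SES: coefficients are gaps conditional on the covariate.
adj = wls([design_row(r, [s]) for r, s in zip(races, ses)], y, w)
```

In both coefficient lists, index 1 holds the Black-White gap. In the toy data the adjusted gap is smaller in magnitude than the raw gap because SES differs across groups and is positively related to scores.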
The odd numbered columns of Table 2 present the differences in means, not including any covariates. These results simply reflect the raw test score gaps reported in Table 1. The even numbered columns mirror the main specification in Fryer and Levitt (2004). Controls include: the composite indicator of socio-economic status constructed by the ECLS survey administrators, number of children’s books in the home and that variable squared, gender, age, birth weight, indicator variables for having a mother whose first birth came when she was a teenager or over 30 (the omitted category is having a first birth in one’s twenties), and WIC participation. These covariates generally enter with the expected sign. Older children, those with higher birth weights, and those with older mothers at the time of first birth all score better, although the benefit of entering school at a later age decreases steadily over time. Children on WIC do worse on the tests, suggesting that this variable is not capturing any real benefits the program might provide, but rather, the fact that eligibility for WIC is a proxy for growing up poor that the SES variable is not adequately capturing. Socio-economic status and the number of children’s books in the home are important predictors of test scores at each grade level. A one-standard-deviation increase in the SES variable is associated with a .30 increase in fall kindergarten math scores and a .29 increase in spring first grade math scores. The number of books is also strongly positively associated with higher kindergarten test scores in math.10 Evaluated at the mean, a one-standard-deviation increase in the number of books (from 72 to 137) is associated with an increase of .143 in math and .115 in reading. This variable seems to serve as a useful proxy for capturing the conduciveness of the home environment to academic success. The other variables tend to enter with the expected sign and have magnitudes that are similar to those reported in Fryer and Levitt (2004).
9 Including all the values of these variables from each survey or only those in the relevant years does not alter the results.
10 The marginal benefit associated with one additional book decreases as more books are added. Beyond roughly 150 books, the marginal impact turns negative. Only 16 percent of the sample lies above this cutoff point.
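The pattern in footnote 10 is what a specification with books and books squared implies. As a hedged illustration, write the two book terms as $\beta_1 B + \beta_2 B^2$, where $B$ is the number of children's books; the labels $\beta_1$ and $\beta_2$ are introduced here and their estimated values are not reported in this passage. Then:

$$
\frac{\partial y_{it}}{\partial B} = \beta_1 + 2 \beta_2 B,
\qquad
\frac{\partial y_{it}}{\partial B} = 0 \iff B^{*} = -\frac{\beta_1}{2 \beta_2}.
$$

With $\beta_1 > 0$ and $\beta_2 < 0$, the marginal effect declines as $B$ rises and turns negative beyond $B^{*}$. A turning point near 150 books would correspond to $\beta_1 \approx -300\,\beta_2$.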