No Child Left Behind—these words convey an inspiring idea, an admirable goal for reform, and a shared value of universal access to high-quality education. As a national education policy adopted in 2002, No Child Left Behind (NCLB) aims at societal transformation by freeing historically underserved students from what supporters often call “the mediocrity of low expectations,” demanding that public schools educate all students to meet standards. Accomplishment of this aim might finally resolve an argument that animated the Continental Congress that produced the Declaration of Independence—whether a government of the people, by the people, and for the people can be sustained in a population not fully literate, not fully educated—by educating everyone.
NCLB has inspired more controversy than any previous educational policy in the United States. To grasp the arguments and to contextualize them historically, three questions will be explored:
- Is 40 years of federal education policy, culminating in NCLB, more accurately described as assisting and improving public schooling in much-needed ways or as hostile and harmful incursions into the Constitution’s reservation to the states of the provision of public education?
- Are unintended consequences of NCLB-mandated testing outweighing planned or actual benefits?
- Is test-driven accountability rescuing schools and students from intolerably low expectations or abandoning them and the enterprise of public education?
The explosion of test-related legislation and litigation and some invisible but inevitable problems of educational measurement will thread this discussion of assessment-related aspects of NCLB.
Federal Assistance or Constitutional Incursion?
Despite the Constitution’s reservation of public education to the states, federal interest is nonetheless clear. In a democracy where it is the right and responsibility of all members to participate in decision making, education is fundamental to ensuring the electorate’s capacity for comprehending complicated issues, seeking salient information and judging its accuracy, and comparing the campaign promises of elected representatives with their performance. Problems associated with state-level failure to make satisfactory education available to all were illustrated a half-century ago in the class action suit, Brown v. Board of Education of Topeka, Kansas, 1954. The persistence of efforts to include evolution in, or exclude it from, science curricula suggests continuing problems at the local level.
All three branches of the federal government have, in fact, played a role in public education. Brown and later cases demonstrate the role of the federal judiciary. The executive and legislative branches became active a decade after Brown with the Elementary and Secondary Education Act (ESEA) of 1965. Part of President Lyndon Johnson’s “war on poverty,” ESEA’s centerpiece was Title I funding to assist disadvantaged students. In 1988, as the numbers and categories of these students grew, testing was introduced to monitor their progress and assure that federal funds were well spent. The 1994 reauthorization of ESEA, called the Improving America’s Schools Act or Goals 2000, called for development of national standards and standards-aligned assessments for all students, tasks set for two bodies created by the same legislation—the National Council on Education Standards and Testing and the National Assessment Governing Board.
Eight years later, in 2002, as ESEA was again reauthorized, this time as NCLB, Goals 2000’s six national goals remained unmet or unmeasurable—(1) all children ready to learn; (2) 90 percent high school graduation rate; (3) all children competent in core subjects (including arts and foreign languages); (4) United States ranked first in the world in math and science; (5) every adult literate and competitive in the work force; and (6) safe, disciplined, drug-free schools. Only 19 states had reached full compliance with the law; no national system of assessments had been created; the effort to calibrate state tests so that their scores could be compared had been abandoned; and neither national content standards nor opportunity-to-learn standards had been developed. Performance standards on the National Assessment of Educational Progress (NAEP) had been set, but the standard-setting process was found “fatally flawed” in multiple evaluations. Most states had adopted content standards, and, ultimately, to comply with NCLB, every state except Iowa had implemented standards-based assessments.
NCLB mandated reading and math proficiency by all students in grades three through eight within 12 years, to be measured by states’ standards-based tests and confirmed by the state’s NAEP scores; state participation in NAEP, previously voluntary, was required. State test scores were to be disaggregated for each of several subgroups of students identified by major racial and ethnic backgrounds, by disability, and by limited English proficiency, and each group was required to meet state proficiency standards. State improvement plans, including Adequate Yearly Progress (AYP) targets leading to 100 percent proficiency in 2014, were required to be submitted for U.S. Department of Education approval, the first time in January 2003.
The federal government could not compel states to attempt to meet NCLB requirements. Rather, the financial reward of choosing to participate in NCLB—the proverbial carrot—was continuation of the federal funding, an estimated 10 percent of most state education budgets, to which their struggling students had been declared “entitled” 40 years earlier by ESEA. The corresponding stick was a graduated series of penalties increasing in number and severity over time. Schools failing to meet student achievement goals have been identified as needing improvement and are required to do the following:
- In year two following NCLB implementation, develop an improvement plan, accept technical assistance, and make school choice available to any student in a school “in need of improvement” with the district paying transportation to the chosen school.
- In year three, all of the above plus provide approved supplemental services.
- In year four, all of the above plus take corrective action, such as hiring new staff or adopting a new curriculum.
- In year five, restructure or face takeover by the state or a contractor.
Some saw these provisions as holding schools appropriately accountable for the educational outcomes of vulnerable students, giving the law teeth. Others saw them as threats to compel top-down change, which researchers had repeatedly found to be a hopeless reform trajectory.
A variety of reactions followed enactment of NCLB. Some reactions suggested NCLB was oppressive. Initially, Vermont declared that it would forgo federal funding and accountability but later reversed this decision. In subsequent years, other states also considered and rejected opting out, but some did entertain legislation that allowed districts and schools to do so. The numbers of districts and schools that opted out grew.
A variety of reactions was also apparent in the first-round state improvement plans submitted in 2003 for federal approval. More differentiated than the federal government had expected, these plans prompted the first in a series of adjustments to requirements.
With such a multiplicity of responses, perhaps NCLB might be considered in terms of a multiple-choice item. For example, NCLB is best described as:
- representing President George W. Bush’s attempt to achieve ESEA’s original hope of equal outcomes for all students.
- furthering preexisting state trends toward testing and accountability and expanding on a national scale the Texas system instituted when Bush was governor, which produced (select one) the Texas miracle / the myth of the Texas miracle.
- a violation of the U.S. Constitution.
- imposing on states, districts, and schools a test-driven educational accountability system with severe penalties for compliance failures.
- ensuring the failure of public education by setting unrealistic requirements for reading and math proficiency by all students in grades three through eight.
Or perhaps, since the right answer might depend on who was scoring this item, an essay question might be better. For example,
Is NCLB more about entitling students and improving their schools or threatening and punishing both?
Such a question can be taken as a frame for understanding the educational testing and accountability debates intensified by NCLB. The history of educational measurement has been dubbed a history of unintended consequences, to which some consider that Bush has contributed with NCLB. Bush himself described the policy as “real reform having real results” at a press conference on December 20, 2004. At issue is whether NCLB’s results are positive or negative and, if positive, whether the gain is worth the pain.
Good Policy, Bad Effects?
Conflicting opinions exist not only among people but also within people. For example, in a small school district in January 2005, an assistant superintendent described NCLB to the author as “the moon shot in education. We might as well go for a big goal. We’re going to be better for having tried.” By the end of the school year, however, she was saying:
I think [the law] has been both positive and vexing—positive in that it really has forced us to look carefully at what students are learning, especially those who are not yet learning to any standard this state considers appropriate. At the same time, it’s been really vexing because, although we have put into place things that really help students and teachers, we don’t know how we can keep on. . . . I don’t know if people are ultimately going to love me or curse me for this, but I’ve said I think it’s time to take Title 1 dollars off the table. We did refuse it for two years, but we need the money, frankly.
Even the positive aspects can be vexing, as discussion of the following points will show: (1) Rising scores come with significant caveats, and evidence that they provide of school improvement does not always survive scrutiny. (2) While states develop their own improvement plans, the federal approval process limits their real options. (3) NCLB’s wording protects districts and schools from new expenditures, but complaints and lawsuits charge that it is nonetheless an unfunded mandate.
Test Score Increases
There have been reports that scores on state reading and math achievement tests for students in grades three through eight are rising across the country. To NCLB proponents, this indicates that the bottom-line objectives are being attained, that reform is working. Although achievement gaps persist, test scores also offer evidence that some of the students who have been left behind in the past are beginning to catch up. NCLB promoted the credibility of the score increases by blocking some possibilities for gaming the system and thereby distorting test results, problems that had been reported with state tests:
- By requiring testing every student in grades three through eight every year, NCLB facilitated individual student progress monitoring. No longer was it necessary to compare fourth-graders in year one with fourth-graders in year two, a completely different student population.
- By requiring 95 percent student participation rates, NCLB diminished the possibility of skewing results by such manipulations as scheduling field trips for low-scoring students on test days, calling them to the school office, or suggesting their parents keep them home.
- By requiring comparison of each state’s test scores to its scores on NAEP, NCLB checked any “dumbing down” of state tests that might raise scores by lowering expectations.
Even if tests remain imperfect, as test developers themselves acknowledge, few critics would argue that scores reveal nothing. Love tests or hate them, trust scores or harbor lingering doubts about their accuracy and meaning, not even the most acerbic critic can regret evidence from scores of improving educational outcomes for children and youth.
Across time, student socioeconomic status has been so strongly and consistently correlated with test scores that it has been argued that zip codes, identifying the affluence of neighborhoods, tell as much about student performance as large-scale assessments do. It has even been joked that a high school senior’s college performance can be better predicted by the number of cavities in his or her teeth than by the student’s score on a college entrance examination, a whimsical allusion to the dental care available to the affluent. The correlation suggests that accountability based on test scores amounts, in effect, to holding students accountable for their parents’ income and holding public schools, which must accept all the students in their catchment areas, accountable for factors beyond their control.
While state scores are rising, the NCLB requirement to compare each state’s test scores to its NAEP scores is often failing to confirm reported state gains. For example, according to a 2006 New York Times investigation, NAEP math scores between 2002 and 2005 improved for poor, African American, and white fourth-graders, but NAEP reading scores declined slightly for African American and white eighth-graders and for eighth-graders eligible for free or reduced-price lunches. Mississippi declared 89 percent of its fourth-graders proficient readers, the highest percentage in the nation, but only 18 percent proved proficient in reading on NAEP.
The discrepancy signals invalidity. Because NAEP scores are less easily corrupted than scores on state tests, the discrepancy suggests that the invalidity is located in state testing, and not for the first time. In the 1980s, the credibility of state test scores was called into question when a West Virginia physician, John J. Cannell, discovered that every state was claiming above-average scores, ushering into the measurement vernacular the “Lake Wobegon phenomenon,” a reference to the fictitious community imagined by Garrison Keillor, where “all the children are above average.” In the 1990s, Tom Haladyna and his colleagues found tests susceptible to a variety of types of “score pollution.” In 2000, Bob Linn, codirector of the National Center for Research on Evaluation, Standards, and Student Testing, showed that scores typically rise after a new test is implemented and typically return to original levels when a second test takes its place, suggesting that it is common for rising scores to exaggerate rising achievement. And in 2009, an investigation found that school officials in Georgia were routinely correcting students’ answers on standardized tests so that their schools would not be penalized for poor performance and their districts would not lose federal funds.
That the achievement gains suggested by rising scores may be an illusion is underscored by the fact that NAEP scores have remained fairly stable for 40 years. Flat NAEP scores also suggest that NCLB requirements are unrealistic. Many doubt that all third- through eighth-graders could achieve proficiency in literacy and numeracy. Assuming they could, Stanford University measurement expert Ed Haertel has calculated that the states with the best NAEP progress records would nevertheless need more than 100 years to reach 100 percent proficiency, not NCLB’s 12 years.
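The arithmetic behind a projection of this kind can be sketched in a few lines. This is only an illustration of straight-line extrapolation, not Haertel’s actual method or data: the starting proficiency rate (30 percent) and annual gain (0.5 percentage points) are hypothetical values chosen for the example, and the function names are invented here.

```python
import math


def years_to_full_proficiency(current_percent: float, annual_gain: float) -> int:
    """Years needed to reach 100 percent proficient at a constant annual gain.

    Assumes a straight-line trajectory, which is a simplification.
    """
    if annual_gain <= 0:
        raise ValueError("annual_gain must be positive to ever reach 100 percent")
    return math.ceil((100.0 - current_percent) / annual_gain)


def required_annual_gain(current_percent: float, years_allowed: int) -> float:
    """Constant annual gain (in percentage points) needed to hit 100 percent on time."""
    return (100.0 - current_percent) / years_allowed


# Hypothetical state: 30 percent proficient, gaining half a point per year.
start, gain = 30.0, 0.5
print(years_to_full_proficiency(start, gain))   # years at the historical pace
print(round(required_annual_gain(start, 12), 2))  # pace a 12-year deadline demands
```

Under these assumed numbers, the pace a 12-year deadline demands is more than ten times the historical pace, which is the shape of the mismatch the paragraph above describes.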
Some of the cautions applicable to interpreting satisfactory progress toward NCLB requirements are technical. For example, while NCLB has stimulated high student participation in test-taking, it has also allowed each state to determine how many students in each subgroup (e.g., English language learners, special education students) must be enrolled for their scores to count in determining whether AYP targets have been met. States have set different minimum numbers of students in subgroups, which has proven problematic whether the minima are high or low. With a “low n,” aggregated scores will be inconsistent (unreliable) from year to year, showing too much variability to tell whether progress is being made. With a “high n,” the scores of students in the subgroup will not count toward AYP targets, making it impossible to tell whether these students are making progress or whether they are being left behind. There is no consensus about the right minimum number.
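The “low n” problem can be made concrete with the standard error of a proportion. The sketch below assumes a simple binomial sampling model, which is a textbook simplification and not something NCLB or any state plan prescribes; the 50 percent proficiency rate, the subgroup sizes, and the function names are hypothetical.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of an observed proficiency rate p based on n students."""
    return math.sqrt(p * (1.0 - p) / n)


def subgroup_counts_toward_ayp(enrolled: int, minimum_n: int) -> bool:
    """A subgroup's scores count only if enrollment reaches the state's minimum n."""
    return enrolled >= minimum_n


true_rate = 0.50  # hypothetical: half the subgroup is truly proficient
for n in (10, 30, 100):
    margin = 1.96 * proportion_standard_error(true_rate, n) * 100
    print(f"n={n:>3}: observed rate can swing about +/-{margin:.1f} points by chance")

# With a hypothetical minimum n of 30, a subgroup of 25 students is not counted at all.
print(subgroup_counts_toward_ayp(25, 30))
```

The two functions show both sides of the trade-off: a small minimum n admits subgroups whose year-to-year rates swing widely by chance, while a large minimum n excludes small subgroups from the AYP determination altogether.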
The status models, which have been federally approved for most states, base accountability on whether students reach a prescribed target or proficiency status, operationally defined as a cut-score on a test. Status models create uneven playing fields where students performing at lower levels than their age-mates—and the schools that serve them—have to make more progress to reach the required status. Alternatively, growth models have been developed, which compare a student’s test performance in one year with his or her scores in succeeding years. Growth models permit measuring a student’s improvement across time and the impact of the schooling the student has experienced. Yet, the U.S. Department of Education has approved only a few state requests to use growth models and only on a trial basis. As a result, in most states, poor students and their schools could be making greater progress than the more affluent, yet be judged as needing improvement in comparison with wealthier counterparts whom they have outperformed in terms of growth. President Barack Obama’s Race to the Top program, in which states compete for federal monies based on a variety of criteria (including “turning around the lowest-achieving schools” and having “great teachers and leaders”), is meant to address some of NCLB’s shortcomings.
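The difference between the two kinds of model can be illustrated with a small sketch. Everything in it is hypothetical: the cut-score of 400, the two schools, and the paired (prior-year, current-year) scores are invented for the example, and real growth models are considerably more elaborate than a mean gain.

```python
from statistics import mean

CUT_SCORE = 400  # hypothetical proficiency cut-score

Scores = list[tuple[int, int]]  # (prior-year score, current-year score) per student


def percent_proficient(scores: Scores, cut: int = CUT_SCORE) -> float:
    """Status model: share of students at or above the cut-score this year."""
    return 100.0 * sum(current >= cut for _, current in scores) / len(scores)


def mean_growth(scores: Scores) -> float:
    """Simplified growth model: average gain from last year to this year."""
    return mean(current - prior for prior, current in scores)


# Hypothetical schools: A starts high and barely moves; B starts low and gains a lot.
school_a = [(410, 412), (430, 433), (395, 401), (450, 451)]
school_b = [(340, 372), (360, 395), (385, 410), (330, 361)]

for name, scores in (("A", school_a), ("B", school_b)):
    print(name, percent_proficient(scores), mean_growth(scores))
```

In this invented data, School A clears a status target easily while School B does not, even though School B’s students gained roughly ten times as much. That is the situation the paragraph above describes for poor students and their schools.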
In 2001, the Education Trust, a Washington, DC, policy group, listed 1,320 “high-flying schools” across the country in which half of the students were poor and half minority, yet scoring in their states’ top thirds. Hope rose even faster than the reported score increases until researchers investigated the claims, which were based on gains in a single subject, at a single grade level, in a single year. When Douglas Harris at Florida State University checked to see how many had made gains in two subjects, at two grade levels, over two years, all but 23 of the high-flying schools were grounded.
Researchers with the Civil Rights Project at Harvard University determined that the states reporting the biggest gains were those with the lowest expectations, that experienced teachers were transferring out of low-income minority schools, and that no significant headway had been made in closing the gap for minority and poor children since the 1970s–1980s with civil rights and antipoverty initiatives. Moreover, project researchers found that schools identified as needing improvement tended to enroll low-income students, to be racially segregated, and to fail to make the needed improvements. In a Hispanic P–5 school in Texas in 2005, Jennifer Booher-Jennings documented “educational triage,” in which the overwhelming majority of instructional resources were concentrated on the “accountable kids,” whose scores would figure in AYP determinations, and the “bubble kids,” whose scores were near passing, at the expense of all others.
NCLB’s assumption that educational accountability systems based on standards and standards-based tests will improve education has been tested independently. In 1996, the Pew Charitable Trusts awarded four-year grants to seven urban school districts to assist them in implementing standards-based reform. Five years later, as NCLB was nearing enactment, the Trusts reported that high-stakes accountability motivates educators to avoid penalties by raising test scores through less ambitious teaching, especially for low-performing students, and that emphasis on testing comes at the expense of curriculum, instruction, and professional development and prevents real improvement in student achievement. This comparison suggests that NCLB’s logic was flawed from the start.
Saving or Abandoning Students and Public Education?
States may refuse NCLB requirements and federal funding, although most, like Vermont, found they could not afford to do so. And NCLB came with a local price tag. Connecticut found itself facing $8 million in increased expenditures by 2008 related to NCLB and filed suit citing NCLB’s own prohibition against unfunded mandates, as did a Michigan school district and the National Education Association. Despite increased federal funding related to NCLB, such as grants to develop smaller learning communities in large high schools, NCLB has drained state and district coffers, whether through increased data collection, analysis, and reporting or through legal expenses.
The special education impact has been problematic in two ways. One was exemplified by the lawsuit filed by two Illinois districts on the basis of conflicts between NCLB and the federal Individuals with Disabilities Education Act. The other has been the difficulty in raising the scores of students eligible for special services, a task so daunting that it became common knowledge in Washington State that every district in the state not meeting the first AYP targets had failed on the basis of its special education subgroup. While some adjustments to the law have been made, this difficulty was predictable by educators, but they were left behind in the policy-making process.
Also left behind, according to an analysis of the policy-making process for one part of NCLB, the Reading First provisions, were professional education and educational measurement organizations, which have worked to provide policy analyses, conduct research on policy impact, and issue public statements. Commercial interests were not left behind, as witnessed by the recent furor over the U.S. Department of Education’s narrowing of approved options regarding curricula and supplementary materials and also by this parody: No Psychometrician Left Unemployed.
Like the 1990s’ shift toward state legislation focused on tests rather than on curriculum and professional development, NCLB has operated on an implicit theory of action that educators and students, the least powerful players in the system, would work harder if coerced. The results to date suggest that coercion may raise scores without raising achievement and may not close achievement gaps.
Evidence suggests that schools categorized as needing improvement cannot bring all students to proficiency and are closed, that experienced teachers are leaving these schools behind, and that promising students who want to transfer out of them outnumber the seats available in “high-flying schools.” With the 2007 reauthorization of ESEA, consideration should have been given to the logical conclusion of these trends: NCLB might well leave public education behind. Nevertheless, the process of reforming it has already begun, in the guise of Race to the Top and other measures. To date, at least 27 states have adopted new national standards in English and math based, in part, on Race to the Top recommendations (Lewin 2010).
- Abernathy, Scott Franklin, No Child Left Behind and the Public Schools. Ann Arbor: University of Michigan Press, 2007.
- Bamberger, Michael J., Jim Rugh, and Linda Mabry, Real World Evaluation: Working under Budget, Time, Data, and Political Constraints. Thousand Oaks, CA: Sage, 2006.
- Booher-Jennings, J., “Below the Bubble: ‘Educational Triage’ and the Texas Accountability System.” American Educational Research Journal 42 (2005): 231–268.
- Hayes, William, No Child Left Behind: Past, Present, and Future. Lanham, MD: Rowman & Littlefield, 2008.
- Lewin, Tamar, “Many States Adopt National Standards for Their Schools.” New York Times (July 21, 2010).
- Linn, R. L., “Assessments and Accountability.” Educational Researcher 29 (2000): 4–16.
- Linn, R. L., E. L. Baker, and D. W. Betebenner, “Accountability Systems: Implications of Requirements of the No Child Left Behind Act of 2001.” Educational Researcher 31 (2002): 3–16.
- Ravitch, Diane, The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education. New York: Basic Books, 2010.
- Rebell, Michael A., and Jessica R. Wolff, eds., NCLB at the Crossroads: Reexamining the Federal Effort To Close the Achievement Gap. New York: Teachers College Press, 2009.
- Reese, William J., America’s Public Schools: From the Common School to “No Child Left Behind.” Baltimore: Johns Hopkins University Press, 2005.
- Sadovnik, Alan R., et al., eds., No Child Left Behind and the Reduction of the Achievement Gap: Sociological Perspectives on Federal Educational Policy. New York: Routledge, 2008.
- Vinovskis, Maris, From a Nation at Risk to No Child Left Behind: National Education Goals and the Creation of Federal Education Policy. New York: Teachers College Press, 2009.