No Child Left Behind (NCLB) put schools under
the microscope by requiring that they
report, annually, the test-score performance of students in grades 3
through 8, and, again, for grade 10. As President Bush said shortly before
he signed the bill into law, “We need to know whether a curriculum is
working. We need to know whether the teachers, the methodology that
teachers use is working. We need to know whether or not people are
learning. And if they are, there will be hallelujahs all over the place.
But if not, we intend to do something about it.”
Five years later, it has become clear that the
microscope NCLB uses to get the information the president said he wanted
contains a lens that distorts. Many good schools—both charter schools
and inner-city public schools serving the disadvantaged—are not
recognized as such, while many poorly performing schools are given a pass.
If NCLB is to fulfill its mission, Congress needs to make some major
repairs or risk seeing those opposed to all forms of school accountability
assume control of the political battleground.
I do not join those opponents of accountability. I do
not raise any principled objections to holding schools accountable or
testing students annually. On the contrary, the evidence suggests that
accountability has had some benefits for American education. The
reading and math scores of 4th and 8th graders on the National Assessment
of Educational Progress (NAEP) have risen steadily since NCLB was put into
place. If it cannot be proved that those gains are due to improved school
accountability, it is heartening to know that Margaret Raymond and Eric
Hanushek found, in more precise estimates of accountability impacts,
somewhat larger gains on the NAEP in those states that were the first to
put accountability systems into place (see “High-Stakes
Research,” features, Summer 2003).
Still, the current federally mandated accountability
system falls well short of what is needed. The gains made by 4th and 8th
graders do not translate into higher levels of performance once students
reach the age of 17. Instead, high school achievement has remained as
stagnant as ever, and high school graduation rates continue to hover around
the industrial world average.
Two things are needed to get the most out of
accountability. First, the lens used to look at schools must be reground so
that distortions are minimized. Once that repair has been completed,
accountability’s bright light needs to shine on the performance of
individuals, that is, on students, teachers, and administrators, not just
on schools.
A Distorted Prism
Most people would agree that a good school is a place
where students are learning, and a poor school is one where that is not
happening. But NCLB’s way of measuring school performance does not
look directly at how much individual students are learning from one year to
the next. Instead, a school is evaluated according to whether or not its
students are making Adequate Yearly Progress (AYP) toward full proficiency
by 2014. In that year, every tested student must be achieving at the
state-determined proficiency level. By next year, the midpoint between 2002
and 2014, the percentage of students proficient at a school is expected to
have increased by roughly half the distance from where it was when the law
was enacted. Various subgroups of students, defined by ethnicity, gender,
economic disadvantage, and need for special education, must be making
comparable progress. While some exceptions to those requirements are
allowed, schools are said to be A-OK only if the percentage of students
scoring at the proficient level is moving forward “adequately.”
Schools where that is not happening are identified as “not making
AYP” or, after two years, “in need of improvement.” In
common parlance, they have “failed.”
Evaluating schools by the AYP measuring stick is
typically justified on the grounds that it ensures that “no child
shall be left behind.” While this sounds both noble and egalitarian,
it in fact expects those schools that had a lower percentage of students
scoring at the proficient level in 2002 to make more rapid progress over
the ensuing years than those with higher-performing students. Both are
expected to arrive at the same point by 2014 as well as to make steady
progress each year. It is not unlike a race between the turtle and the
rabbit, in which the turtle is asked to complete a two-mile run while the
rabbit need only traverse 200 meters in the same stretch of time.
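The arithmetic behind the race can be made concrete. The sketch below is a simplification: it assumes targets rise in a straight line from a school's 2002 baseline to 100 percent proficiency in 2014, whereas actual state trajectories vary. The function names and the two baseline figures are hypothetical, chosen only for illustration.

```python
def required_annual_gain(baseline_pct, start_year=2002, end_year=2014, target_pct=100.0):
    """Percentage points of proficiency a school must add each year,
    assuming (as a simplification) a straight-line path to the target."""
    return (target_pct - baseline_pct) / (end_year - start_year)


def midpoint_target(baseline_pct, target_pct=100.0):
    """Percent proficient expected halfway between the start and end years."""
    return baseline_pct + (target_pct - baseline_pct) / 2


# Hypothetical baselines, not drawn from any actual school.
for label, baseline in [("low-baseline school", 28.0), ("high-baseline school", 88.0)]:
    print(label, required_annual_gain(baseline), midpoint_target(baseline))
```

Under these assumptions the first school must add six percentage points a year and the second only one, which is the turtle's predicament in numbers.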
The consequences of this peculiar accountability
system are very different, but equally damaging, for two contrasting types
of schools. For those schools blessed with high-performing students (as a
result of learning either at home or in earlier grades), the proficiency
standard to which they are held accountable is often much too low. If the
country is to have a better-educated citizenry, the schools serving
higher-performing students need to lift their performance well above levels
of mere “proficiency.” As for those schools whose
responsibility is the education of large numbers of low-performing students
(as a result of either minimal education at home or bad instruction earlier
in life), it is not reasonable to expect that every child will reach the
state proficiency standard by the end of 3rd grade and every grade
thereafter. Applying that criterion puts schools serving the disadvantaged
at risk of being said to “fail,” even if they are doing a fine
job of enhancing the skills of their students.
One can get a pretty good fix on how much students are
learning by tracking individual student test scores from one year to the
next. When students at a particular school are
outpacing the typical student in the rest of the state, most people would
agree the rate of learning is, at least, better than average. When student
gains lag significantly behind average statewide gains, most people might
agree that the situation deserves attention by school boards and
administrators, if only to make sure that below-average performance in any
given year is nothing more than a statistical aberration.
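A growth comparison of this kind is simple to express. The sketch below assumes each student has a score from last year and a score from this year on a comparable scale; the function names and every number are hypothetical and stand in for real state testing records.

```python
from statistics import mean


def mean_gain(records):
    """Average year-over-year score change for (last_year, this_year) pairs."""
    return mean(this_year - last_year for last_year, this_year in records)


def gain_relative_to_state(school_records, state_records):
    """Positive when the school's students are outpacing the statewide average gain."""
    return mean_gain(school_records) - mean_gain(state_records)


# Hypothetical scores, invented for illustration only.
school = [(300, 318), (280, 301), (310, 325)]
state = [(300, 312), (280, 290), (310, 324), (350, 362), (295, 307)]

print(gain_relative_to_state(school, state))
```

A positive result means students at the school gained more than the typical student statewide. A negative one flags a school for a closer look, not an automatic verdict, since a single year may be a statistical aberration.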
Any well-designed measuring stick should provide that
kind of basic information, especially if it purports to identify schools
that are or are not making Adequate Yearly Progress. Martin West and I
(“Is Your Child’s School Effective?” check the facts, Fall 2006) discovered
just how inadequate the AYP measuring tool is when we tracked student
progress in Florida. We compared pairs of schools, checking to see whether
students were learning more at the one said to be making AYP than at the
one said to be failing. Thirty percent of the time the opposite proved to
be the case. Any measuring stick that gets something wrong 30 percent of
the time is itself a failure.
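One way to compute such an error rate is sketched below. How the schools were actually paired in the Florida analysis is not described here, so the sketch makes an assumption: it forms every possible pair of one school that made AYP and one that did not, then counts the pairs in which the "failing" school posted the larger average student gain. The function name and data are hypothetical.

```python
from itertools import product


def share_of_pairs_misranked(schools):
    """Fraction of (AYP school, non-AYP school) pairs in which the school
    labeled as failing posted the larger average student gain.

    `schools` is a list of (made_ayp, mean_gain) tuples.
    """
    passing = [gain for made_ayp, gain in schools if made_ayp]
    failing = [gain for made_ayp, gain in schools if not made_ayp]
    pairs = list(product(passing, failing))
    if not pairs:
        raise ValueError("need at least one school in each category")
    misranked = sum(1 for pass_gain, fail_gain in pairs if fail_gain > pass_gain)
    return misranked / len(pairs)


# Hypothetical schools, invented for illustration only.
example_schools = [
    (True, 10.0),
    (True, 14.0),
    (False, 12.0),
    (False, 8.0),
    (False, 9.0),
]

print(round(share_of_pairs_misranked(example_schools), 3))
```

In this invented example one pair in six is misranked. A figure of 30 percent would mean the label points the wrong way in nearly a third of such comparisons.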
The errors are systematic. If a school is blessed with
initially high-performing students, it is likely to be given a pass by
NCLB, even if students are not learning much from one year to the next.
Schools serving the poor, the disabled, and the educationally disadvantaged
face a greater challenge, as they must make rapid progress from one year to
the next to escape the “failed school” designation. As a
result, they are often found to be “failing,” even when the
gains made by their students exceed those in “passing” schools.
Simplistic Dichotomy
The imperfections of the NCLB measuring stick are
magnified by the fact that it divides all schools into just two categories,
pass or fail (“making AYP” or not). The practice borrows from a
more common propensity that has unfortunately crept into American education
in the name of helping the challenged. Even elite universities, such as my
own, allow professors to give students “pass-fail” grades. I
have learned from bitter experience that such a grading system both gives
students license to do nothing and, ultimately, provides less information
to those who rely on grades as a way of ascertaining whether students have
learned something. (Generally speaking, the “behind-the-scenes”
rule is to treat a pass as a fail, causing further distortions.)
In days gone by, and even now in traditional schools,
teachers graded students on a five-point scale ranging from A to F.
NCLB needs to rediscover that ancient practice. States have already shown
the benefits of using a multiplicity of cut points. Florida employs an A to
F scale, providing a much more intuitive way of telling families and
citizens about the quality of their schools. New York grades schools on a
four-point scale, which, if not as good as the traditional grading system,
would be satisfactory were it not for the fact that, in New York, 4 is
good, 1 is bad. (Can you hear the chants? “We’re number
four!”)
Regrinding the Lens
A five-point A to F scale that focused strictly on
student growth at a school would greatly enhance the transparency of the
accountability system. Admittedly, such scrutiny was not possible when NCLB
was originally enacted into law, simply because at the time the legislation
was passed there was no way in most states of tracking student progress
over time. Since 2002, however, several states (North Carolina, Texas, and
Florida, for example) have put into place at least the beginnings of
systems that allow for tracking of student performance from one year to
another. At the time NCLB is reauthorized, Congress needs to mandate such
tracking systems in all states, and then ask states to use the systems as a
way of identifying which schools are effective, and which are not.
To be sure, not every state could implement such a
system immediately, so Congress would need to allow for a period of
transition from the current policies to the new ones. Introducing a growth
approach via the “safe harbor” provisions of the law may be the
politically feasible way to begin. States with high standards and quality
information on individuals’ performances over time could be given a
second way of showing that schools are making AYP. If given this option,
they will have every incentive to migrate to the new system as quickly as
possible, as the distortions of the existing approach intensify.
Distortions across States
Thus far we have focused on how the NCLB
accountability lens provides misleading information about school quality
within states. Equally disconcerting, the accountability measuring stick
provides grossly misleading results when states are compared to one
another. The cause of the distortion: allowing each state to establish its
own standards and its own definition of proficiency, thereby generating 50
different definitions of the same concept.
The meaning of the word “standard” has its
origins in the flag or emblem carried high in battle in order to marshal a
fighting force toward a fixed objective. If standard-bearers head off in
inconsistent directions, they direct portions of the battle force toward
divergent objectives, opening the army to flank attacks. Accordingly,
“standard” came to mean something that was fixed, such as the
specific weight for an official coin or the unchanging value of a precious
metal against which the value of paper currency could be compared.
It is thus an oxymoron to hold students accountable to
more than one standard, as is the case under NCLB, which allows each state
to establish its own standard, no matter how widely it diverges from some
national definition. Frederick Hess and I show that a very few
states—only Massachusetts, Maine, and South Carolina—have as
high a definition of proficiency as the one originally set nationally by
those who administer the NAEP. (For more, see “Keeping an Eye on
State Standards,” features, Summer 2006.) Standards in most states fall far short of
that national mark, North Carolina, Oklahoma, and Tennessee being the most
extreme laggards. So, by official definitions, Johnny may be deemed a
proficient reader in North Carolina but not if he should move to South
Carolina.
So varied are state standards that the relative rigor
of a state’s proficiency definition is a better predictor of the
percentage of schools said to be “failing” (not making AYP)
than the overall quality of student performance in the state, as estimated
by average NAEP scores. The correlation between the proficiency standard
and the percentage of schools failing to make AYP is 0.44. The
correlation between the actual level of student proficiency on the NAEP in
the state and the percentage of schools identified as “failing”
is only negative 0.31.
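The comparison rests on the size of each correlation, not its sign. The sketch below shows how a Pearson correlation is computed and how two coefficients are compared by absolute value. It uses only the two figures reported above; no state-level data are reproduced, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return covariance / (spread_x * spread_y)


def stronger_predictor(r_first, r_second):
    """Return which coefficient has the larger magnitude, ignoring sign."""
    return "first" if abs(r_first) > abs(r_second) else "second"


# The two coefficients reported in the text: rigor of the state standard,
# then average NAEP performance, each against percent of schools failing.
print(stronger_predictor(0.44, -0.31))
```

Because 0.44 exceeds 0.31 in magnitude, the rigor of a state's standard tracks its share of "failing" schools more closely than student performance does.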
Clearly, AYP is giving information that is at least as
much political as it is substantive. In Massachusetts, for example, 43
percent of the schools failed to make AYP, despite the fact that the state
has the highest-performing students in the country. Why? Because
Massachusetts has one of the highest standards in the country, a standard
as high as the one NAEP uses. Conversely, only 7 percent of the schools in
Tennessee are failing, though the state ranks near the bottom in terms of
school performance. Why? Because Tennessee has one of the lowest
operational definitions of proficiency in the country. The pattern
nationwide is laid out in Table 1.
We are not necessarily proposing the NAEP or
Massachusetts standard of “proficiency” as the correct one. If
one is going to expect every child to reach that level by the year 2014,
one can be quite certain it won’t happen. Even world leaders in
education do not come close to reaching that goal. In 8th-grade math, for
instance, only 73 percent of the students in Singapore are proficient by
the NAEP definition of the word, despite the fact that Singapore has the
highest-performing math students (see Figure 1).
That fact helps to clarify a basic dilemma that NCLB
confronts as long as it continues to use the 2014 goal of full proficiency
as its benchmark. Either the word “proficiency” will have to be
dumbed down to mean little more than “basic” understanding of
the given material, or a new way of measuring school performance must be
introduced.
The simplest solution: use a high standard, such as
the one employed by the NAEP, when holding students accountable for
reaching full proficiency, if they are to receive an academic diploma, but
hold schools accountable for achieving a high but realistic rate of student
growth from one year to the next.
Accountability
Who should be held accountable? That question brings
me to NCLB’s final distortion: exactly who or what is being held
responsible. In ordinary language, only individuals, not entities such as
schools, can be held accountable. We hold drivers, not cars, responsible
for accidents. Or, if cars are faulty, we hold responsible those who made
them. But under NCLB, only entities (schools, school districts, states),
not students, teachers, or administrators, are held responsible for what is
happening. To fix the NCLB accountability system, we need to find ways of
holding accountable the individuals, that is, the students, teachers, and administrators,
who are involved in the education process.
At one time, student promotion to the next grade was
conditional on performance, and graduation from high school depended on
learning a specific body of material. Gradually, it has become standard
practice to promote virtually all students from one grade to the next,
regardless of whether they have learned the material. Such practices are
justified on the grounds that holding a child back for poor performance
only undermines self-esteem and aggravates
learning problems. Minimal high-school graduation requirements are
similarly justified on the grounds that having a diploma is better than not
receiving one, regardless of what is learned.
Recently, some cities and states have introduced
policies that return to more traditional practices. The results have been
surprisingly promising. In Florida, the performance of 3rd graders jumped
the first year they were expected to pass a test if they were to move on to
4th grade (see “Getting Ahead by Staying Behind,” research, Spring 2006). Those
held back benefit from being required to repeat the 3rd grade. In
Massachusetts, the expectation that students pass a 10th-grade test if they
are to graduate from high school boosted student performance the first year
the law was introduced, with continuing gains in subsequent years.
Internationally, Ludger Woessmann has shown that students score higher in
countries that require students to perform well on comprehensive
examinations than in countries that, like the United States, have no such
expectations (see “Crowd Control,” research, Summer 2003).
Teachers and administrators should be held accountable
as well. Once the other elements of a well-designed accountability system
have been put in place, it is reasonable to hold teachers accountable for
student learning. As Thomas Kane and his colleagues have shown (see
“Photo Finish,” research, Winter 2007), the best predictor of teacher quality in any
given year is how much students learned from that same teacher the
preceding year. The research simply confirms what every schoolchild knows:
certain teachers are consistently effective, while others are not.
Once the information is available to track student
progress from one year to the next, one can identify the classrooms in
which the most, and least, learning is taking place. That information can
be used to reward the high performers and to counsel the low performers,
who should be dismissed if they remain consistently ineffective classroom
teachers. Of course, any teacher can have a bad year, and any
accountability system may make an error, so all personnel decisions must be
made by administrators who are fully informed of particular circumstances.
But until teachers are held responsible for the performance of their
students, it is unlikely that accountability systems will prove effective.
Finally, an effective accountability system requires
strong administrative leaders, who should be held responsible for the
learning gains realized at their school.
The Political Problem
It is rumored that influential interest groups in
Washington and key members of Congress are considering many of the changes
I have proposed in this essay. Let us hope so, but one should not be
optimistic about the outcome of the legislative process in the absence of
strong, sustained public support for reforms along these lines. The defects
in NCLB, as originally written, are not accidental. The law took the form
that it did because Congress navigated among powerful political
interests—those of unions, suburbanites, state and local education
officials, and other interested parties. Had teachers been held
accountable, union opposition would have blocked the law’s enactment.
Had states not been given the option to set any standard they wanted, many
state and local officials would have balked at excessive federal control.
Had states been required to put in place a data collection system that
tracked student performance over time, privacy fanatics would have insisted
that every child had a right not to be known, even to those responsible for
the child’s education. Had every school been measured for growth in
student performance, many a suburban district (as well as its board and
superintendent) would have been found wanting. Had students been held
accountable, groups of students and parents would have raised strenuous
objections.
In 2001, lawmakers displayed sheer political genius
when they came up with a law that could be sold to the public as an
egalitarian policy that would leave no child behind. By insisting that
every child reach a minimum level of performance in 2014 (well beyond the
political lifetime of many of the key decisionmakers), the law left most
parts of the educational system untouched and limited the rigors of
accountability to the schools that served the most challenging populations.
If that was politically shrewd, it is educationally problematic. The best
and the brightest were given a pass. Meanwhile, excellent schools serving
the most challenged, whether charter schools or high-quality inner-city
public schools, were placed at the greatest risk of being called failures,
even when they were successes.
None of this can be altered without regrinding and
polishing the lens through which NCLB’s accountability light shines
on America’s schools. If we cannot soon come to believe what we are
shown, the whole microscope will be tossed into history’s dustbin.
Paul E. Peterson is professor of government at Harvard
University and a senior fellow at the Hoover Institution. He serves as
editor-in-chief of Education Next.