Why NAEP isn't really 'the nation's report card'

By Richard Rothstein, The Washington Post

Education policy in both the Bush and Obama administrations has suffered from a failure to acknowledge a critical principle of performance evaluation in all fields, public and private: if an institution has multiple goals but is held accountable only for some, its agents, acting rationally, will increase attention paid to goals for which they are evaluated, and diminish attention to those, perhaps equally important, for which they are not evaluated.

When law and policy hold schools accountable primarily for their students’ math and reading test scores, educators inevitably, and rationally, devote fewer instructional resources to history, the sciences, the arts and music, citizenship, physical and emotional health, social skills, a work ethic and other curricular areas.

Over the last decade, racial minority and socio-economically disadvantaged students have suffered the most from this curricular narrowing. Because these students have the lowest math and reading scores, their teachers and schools are under the most pressure to devote more time to test prep, and less to the other subjects of a balanced instructional program.

One way the federal government promotes this distortion is through its National Assessment of Educational Progress (NAEP), an assessment administered biennially in every state, but only in math and reading. Government officials spend considerable effort publicizing the results. They call NAEP “the nation’s report card,” but no parent would be satisfied with so partial and limited a report card for his or her child.

Twenty-five years ago, Congress created the National Assessment Governing Board (NAGB) to set NAEP policy. At NAGB’s conference this week celebrating its silver anniversary, Rebecca Jacobsen and I described (in a presentation drawn from our book with Tamara Wilder, Grading Education: Getting Accountability Right) how NAGB’s disproportionate attention to math and reading was not intended when NAEP was first administered in the early 1970s.

In those early years, NAEP attempted to assess any goal area to which schools devote, in the words of NAEP’s designers, “15-20% of their time…, [the] less tangible areas, as well as the customary areas, in a fashion the public can grasp and understand.”

For example, to see whether students were learning to cooperate, NAEP sent trained observers to present a game to 9-year-olds in sampled schools. In teams of four, the 9-year-olds were offered a prize to guess what was hidden in a box. Teams competed to see which, by asking questions, could identify the toy first. Team members had to agree on which questions to ask, and the role of posing questions was rotated. Trained NAEP observers rated the 9-year-olds on their skills in cooperative problem-solving, and NAEP then reported on the percentage who were capable of it.

NAEP assessors also evaluated cooperative skills of 13- and 17-year-olds. Assessors presented groups of eight students with a list of issues about which teenagers typically had strong opinions. Students were asked to reach consensus on the five most important and then write recommendations on how to resolve two of them. The list included, for 13-year-olds, such issues as whether they should have a curfew for going to bed, and for 17-year-olds, eligibility minimums for voting, drinking, and smoking. NAEP observers rated skills such as whether students gave reasons for their points of view and defended a group member’s right to hold a contrary viewpoint.

To assess commitment to free-speech principles, NAEP interviewers in the early 1970s asked 13- and 17-year-olds if they thought someone should be permitted to say on television that “Russia is better than the United States,” that “Some races of people are better than others,” or that “It is not necessary to believe in God.” NAEP reported that only 3 percent of 13-year-olds and 17 percent of 17-year-olds thought all three statements should be permitted. Have we improved since then in students’ understanding of and commitment to First Amendment rights? We have no way to know. In the 1970s, NAEP ceased observing such behavioral outcomes and, with very rare exceptions, NAEP became exclusively a pencil-and-paper test.

This early NAEP history has become a quaint curiosity. Few officials in the U.S. Department of Education are even aware of it. But knowledge of NAEP’s experiences during its first decade should be revived, and NAGB should consider whether to return to this early approach. It illustrates how assessment could be used as part of a balanced accountability system for education, a system upon which the public could rely to learn whether schools truly perform satisfactorily and where attention to improvement should be directed.