This week, the expanding protest of teachers at Seattle’s Garfield High School has become a topic of national debate. It is difficult to parse the dispute because the Garfield teachers’ tactics are relatively narrow but resonate with all sorts of larger issues around assessment.
Here is what the teachers at Garfield say they are protesting: the extensive use of locally chosen tests, above and beyond what the state of Washington requires. Seattle’s administrators chose the Measures of Academic Progress (MAP) as an additional layer of assessment beyond Washington’s annual state tests. MAP is produced by the Northwest Evaluation Association (NWEA), an assessment-focused nonprofit headquartered in Portland. From what I understand, NWEA has focused on computer-adaptive testing, and districts that purchase the MAP package commonly administer it as a “several-times-a-year” assessment.
MAP is a pretty good representative of the new face of norm-referenced testing. It is sold to school-district administrators as a way to monitor school performance, with a bunch of bells and whistles about being able to fine-tune instruction based on results, spit out Lexile scores, and a whole bunch of other things. But fundamentally it’s sold to district administrators who want to be able to hit the panic button mid-year at a school: if you make kids take MAP several times a year, you will be able to know which principals to pressure in December and January. That is the fundamental sales pitch to good and bad district administrators alike.
Here is what MAP is not:
- MAP is not independently documented as an intervention or monitoring system package. An Institute of Education Sciences review headed by David Cordray and released in December 2012 had the following as its bottom line: “The results of the study indicate that the MAP program was implemented with moderate fidelity but that MAP teachers were not more likely than control group teachers to have applied differentiated instructional practices in their classes. Overall, the MAP program did not have a statistically significant impact on students’ reading achievement in either grade 4 or grade 5” (p. xii).
- MAP is marketed as norm-referenced testing, not as aligned to content. When a large minority of school districts purchased MAP over several years, in different states and before the rollout of the Common Core standards, you know that MAP is a generic, off-the-shelf test.
- MAP is not the type of formative assessment solidly backed by the research literature: the short, every-week quizzes that can help teachers fine-tune instruction. There is a big difference both in the research and in the lives of schools between “Here’s your five-minute Friday quiz” and “Let’s dissolve the school structure for a week so everyone can do MAP.”
- As far as I am aware, there is absolutely no independent research to suggest that MAP has external validity as a way to evaluate teachers, which apparently is part of the mix in Seattle.
Before I rip into Seattle administrators and the marketing of MAP and its cousins more generally, let me explain the best possible argument for the more-intensive, thrice-yearly attempt at formative assessment: it is hard to get weekly formative assessment into the routine of a school. When you try to change school culture, you don’t go immediately for the ideal, as if you could fine-tune everything that happens in a classroom. Instead, you go for the low-hanging fruit, and that is doubly so when you are buying the services of a company that has its own budget to consider.
So that’s the theory, and from one perspective it’s understandable. But it doesn’t fit the reality of MAP. MAP’s marketing strategy to district administrators is very different from how the Florida Center for Reading Research approached changing reading instruction in Florida’s schools: incremental change across the supported parts of the state every year. Instead of asking for short tests every week, the center’s founders explained, you ask teachers to test kids a few times a year. You slowly give teachers experience with the benefits of formative assessment and good instruction, and that pairing is more likely to stick with them for their careers.
In contrast, the design and marketing of MAP are focused on that mid-year panic button for district administrators. Think for a second about how these packages have to sustain themselves: you have to sell both to good superintendents and also to really awful, out-of-control district administrators (and your sales force has no clue from a first meeting which is which). If you design an assessment tool to be used only for instructional fine-tuning, you may get some sales, but the market for that is fairly limited. So you design an assessment package as an administrative multitool for any size district in any state. That means you have to run the operation as a norm-referenced test, with claims about curriculum standards jerry-built on top of the norm-referenced structure. You also have to provide the information that administrators say they need, because you make no sales unless you meet that need. In other words, you go where the market is.
And I have some really awful news for my fellow fans of effective formative assessment: the market is not rewarding systems of instructionally-useful, frequent, brief probes of student achievement. The market moved many years ago to giving administrators a mid-year panic button. When I wrote The Political Dilemmas of Formative Assessment (behind a paywall), I hadn’t thought much about the thrice-yearly systems school districts had begun to implement in the 2000s. But what we see in Seattle and elsewhere is consistent with my rather depressing account in print: “Data-driven decision making is often discussed as if it is an activity of upper-level administrators in any system. In addition to the previously described general concerns, one may reasonably worry that the discourse of data analysis will distance teachers from the process, undercut the legitimacy of formative assessment, and reinforce the tendency to see all data as high-stakes and something to game” (p. 332).
That discourse of administrative control is clearly at the heart of the Seattle protest over MAP. The twist here is the attempted implementation of MAP in a high school. At least at first glance, I think MAP has penetrated the “progress monitoring” market more at elementary ages than at the high school level, so Seattle’s use of MAP in high school (I think in ninth grade) is at the margins of the MAP market. The fact that the protest is in a high school is important in a few ways. Can Seattle’s superintendent make a case that this is research-based? I’d give it a snowball’s chance in Frostproof, Florida. For a variety of reasons, the technical research has historically concentrated far below ninth grade, and that includes the practical stuff about use. The last time I looked closely, there was a giant desert of research on formative assessment in high school, and I don’t think things have improved.
More significantly for this week’s news, high school students are going to be more independent-minded than elementary students, and in this case teachers made a very shrewd tactical choice that encourages student buy-in. Because teachers framed the protest around the local assessment rather than the state tests that carry high stakes for students, students had, and still have, a clear path to protesting what they may not like about testing.
Update (2/13): See Paul Bruno’s response for a thoughtful explanation of why teachers find it difficult to sustain structured formative assessment.