Grades for coverage of NYC Teacher Data Report release

Before the New York City Department of Education released the name-and-number spreadsheet on the now-defunct Teacher Data Reports, I wrote and released grading criteria for news coverage. Now, with more than a week of coverage from a number of outlets behind us, here are the grades (limited to major outlets where I read a critical mass of coverage).1


Gotham Schools: A

Gotham Schools climbed into the A-C range by declining to publish specific teacher names, and its stories on the TDR dump earned the A.

At the same time, GS kept up coverage on other school issues in the last week, despite having fewer resources than the metro dailies. That is not to say that GS coverage was perfect, but I do not require perfection in A work. Some reporters (including but not restricted to GS staff) could have explored how the TDR agreement went off the rails, from a closely-held pilot to a Klein-provoked court case.

New York Times: D

The New York Times earned a D because it published teacher names and scores. It avoided an F by including some indication of margin of error, though not very smartly: the point estimate had two nonzero digits, which is inconsistent with the brontosaur-wide error bands. But if I am not going to require perfection for an A, why should I quibble too much with work that is no higher than a D? NYT reporters added plenty of context, sometimes smartly and sometimes foolishly (see the superficial-analysis article linked above, including the GS response). My bottom-line judgment: so much effort, not enough value added.2

New York Daily News: D

New York 1: D

Wall Street Journal: D

Publishers, producers, and editors of the Daily News, NY1, and the Wall Street Journal made up the rest of the herd, publishing teacher names and scores but avoiding the worst of tabloid coverage and displays. The WSJ has taken a different tack from others in displaying the labels (high, above-average, etc.) rather than quantities. The central problem of such reports is that when the data are this untrustworthy, there are no good options for displaying them with integrity. Quantify, and you falsely imply precision (e.g., the too-many-significant-digits issue). Use categories, and you imply the labels are applied with precision.

New York Post: F

What can you say about the New York Murdoch tabloid? For several days the NYP displayed only point estimates with no margins of error, and in several cases the paper wrote an entire story about a single low-rated teacher, which qualifies as sensationalizing. Either of those alone would have earned the paper a failing grade.

Observations on the TDR race to the bottom

I am struggling with explaining why the publishers and producers of major news outlets in New York took the bait Joel Klein offered more than a year ago, despite clear evidence that a broad range of people in journalism and education thought publishing individual teacher ratings was a lousy idea. The major dailies and NY1 continued down that path through the long court case and after Klein’s departure, even after editors knew or should have known that TDR was a pilot, fragile, and based on tests revealed to be problematic since the original NYCDOE-UFT agreement creating the pilot. The easy answer is that the TDR coverage was the equivalent of nightly-news “if it bleeds, it leads” practice. But I think it’s more complicated, since several of the outlets had plenty of time to have extensive internal discussions before the release, and there were plenty of options between “publish everything with names” and “publish nothing.”

It is important to keep pushing on this story, because unlike the Los Angeles Times VAM publishing project, we have several news organizations and a longish period of time in which decisions could have been made, reversed, and so forth. The Columbia Journalism Review article from March/April 2011 should not be the last word on this, because this is likely to be a continuing ethical dilemma with large data sets and journalism. I have seen at least one comparison between the NYT coverage of TDR and the Judy Miller Iraq-WMD scandal, but it feels to me like a different challenge to the Times’ integrity, even if it is the same publisher (“Pinch” Sulzberger) who made the misjudgments. On the other hand, in both cases some form of heady rush (pre-war coverage or data on public employees) trumped caution on ethics and professionalism. Sadly, it looks like bad data are publishers’ and editors’ crack cocaine.

I have seen a few efforts nationally to spin the coverage in NYC in different ways, from “well, it got a conversation started” to “Gates’ op-ed in the NYT was an Overton Window thing” (paraphrases, not direct quotations). Folks: please stop this effort to spin. Just. Stop. When people with widely varying views of value-added measures all think publishers and editors at the Times et al. jumped the shark with their coverage, no spin avoids coming across as tenuous reasoning. The coverage mostly stank, it bodes poorly for how news organization managers respond to data dumps in general, and that is about all there is to say for now.3


  1. Thanks to Leo Casey of UFT for forwarding voluminous news clips, which provided a reality check on my separate online reading. Any errors in judgment are my own. []
  2. Pun fully intended. []
  3. … until well-researched post-mortems. []

3 responses to “Grades for coverage of NYC Teacher Data Report release”

  1. Susan Ohanian

    Excellent! And much appreciated. Here’s the right Columbia Journalism Review url

  2. Lisa Fleisher

    “The WSJ has taken a different tack from others in displaying the labels (high, above-average, etc.) rather than quantities.”

    WSJ did not publish labels rather than quantities; WSJ published both labels AND quantities, including clearly marked margins of error.