July 13, 2009, 2:53 p.m.

In the news cycle, memes spread more like a heartbeat than a virus

The New York Times reports today: “For the most part, the traditional news outlets lead and the blogs follow, typically by 2.5 hours, according to a new computer analysis of news articles and commentary on the Web during the last three months of the 2008 presidential campaign.” By that measure, I’m past due in responding, but here’s why the Times has it wrong.

The study in question demonstrates a fascinating technique, borrowed from genetics research, for tracking memes in media coverage, and produces some surprising results that I’ll get to below. But part of the paper is based on a flawed methodology that totally discredits the findings highlighted by the Times. Here’s the illustration of that two-and-a-half-hour gap between peak coverage of memes — in this case, phrases from the 2008 presidential election — in the mainstream media and on blogs:

In order to determine whether a news source belongs on the red curve or the green one, the authors look to whether it’s indexed by Google News. If so, the source is labeled as “mainstream media.” But Google News indexes loads and loads of political blogs, from conservative Hot Air to liberal Talking Points Memo. It includes Daily Kos, Power Line, AMERICAblog, and the celebrity news site Just Jared. Even the Nieman Journalism Lab is in there, so by the study’s reckoning, you are currently consuming mainstream media.

So it may be true, as the Times reports on the front of its business section, that “the traditional news outlets lead and the blogs follow,” but the study doesn’t support that conclusion. What it finds is a surprisingly narrow gap between 20,000 prominent news sources, of all stripes, that are indexed in Google News and 1.6 million websites that don’t make the cut. Or, as Jon Kleinberg, an author of the study, told me: “This shows how important it is to look at blogs and news media as one single organism.”

From that perspective, the paper has quite a bit to add. First, it’s fascinating that memes in political reporting can be tracked with methods drawn from bioinformatics and genetic sequence analysis. As Bill Wasik explains in his new book on viral culture, And Then There’s This, the term meme was coined by the British biologist Richard Dawkins, who wrote in The Selfish Gene:

I think that a new kind of replicator has recently emerged on this very planet. It is staring us in the face. It is still in its infancy, still drifting clumsily about in its primeval soup…The new soup is the soup of human culture. We need a name for the new replicator, a noun that conveys the idea of a unit of cultural transmission, or a unit of imitation. ‘Mimeme’ comes from a suitable Greek root, but I want a monosyllable that sounds a bit like gene. I hope my classicist friends will forgive me if I abbreviate mimeme to meme.

Perhaps owing to those biological origins, we have come to describe particularly fast-moving Internet memes as viral, which evokes the image of passing through a population (and doesn’t, for what it’s worth, have anything to do with genetics). In New York magazine’s discussion of Wasik’s book, Times culture critic Virginia Heffernan questions whether the virus metaphor is “misleading — and ripe for retirement.” The meme-tracking study provides an alternative analogy: the heartbeat.

Back to Kleinberg’s idea of the web as a “single organism.” The study found that, yes, memes peak on prominent websites — that is, those indexed by Google News — before less prominent ones, but both of those peaks are generally preceded by a blip on the less prominent sites. Here’s how it’s illustrated in the paper; we’re looking at the percentage of each meme’s mentions that occur on sites not indexed by Google News. Zero hour represents the meme’s overall peak:

The authors describe this as a “‘heartbeat’-like dynamic” or a series of handoffs between blogs and mainstream media. My hunch is that the heartbeat would be even more visible — healthier, you might say — if the study had made a more-precise distinction between mainstream media and blogs. But in any event, the finding explodes that worn-out notion, furthered by today’s Times piece, of parasitic blogs merely reacting to the work of professional journalists.
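
For readers who want to see how a chart like that could be computed, here is a rough sketch in Python. It is not the authors' code, and the data is invented: each mention of a single hypothetical meme is recorded as an hour plus a flag for whether the source is indexed by Google News. The function names are made up for illustration. The sketch finds the peak hour for each group of sources, measures the lag between the two peaks, and computes the share of mentions coming from non-indexed sites for each hour relative to the overall peak.

```python
from collections import Counter

# Invented toy data for one hypothetical meme: (hour, indexed_by_google_news).
mentions = [
    (0, False), (1, False), (1, True),
    (2, True), (2, True), (2, True), (2, False),
    (3, True), (3, False),
    (4, False), (4, False), (4, False),
    (5, False),
]


def hourly_counts(mentions, indexed):
    """Count mentions per hour for one group of sources."""
    return Counter(hour for hour, is_indexed in mentions if is_indexed == indexed)


def peak_hour(counts):
    """Return the hour with the most mentions; ties go to the earliest hour."""
    return min(counts, key=lambda h: (-counts[h], h))


def non_indexed_share(mentions):
    """Share of mentions from non-indexed sites, keyed by hours from the overall peak."""
    total = Counter(hour for hour, _ in mentions)
    outside = hourly_counts(mentions, indexed=False)
    zero_hour = peak_hour(total)
    return {h - zero_hour: outside[h] / total[h] for h in sorted(total)}


indexed_peak = peak_hour(hourly_counts(mentions, indexed=True))
other_peak = peak_hour(hourly_counts(mentions, indexed=False))
lag = other_peak - indexed_peak  # hours by which non-indexed sites trail
share = non_indexed_share(mentions)

print("lag in hours:", lag)
for offset, fraction in share.items():
    print(offset, round(fraction, 2))
```

In this toy data the non-indexed share is high before the overall peak, dips at zero hour, and climbs again afterward, which is the general shape the heartbeat description refers to. Real data would need far more mentions per hour before the shares meant anything.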

It also complicates Wasik’s description of viral culture: “This telltale spike, this ascent to sudden heights followed by a decline nearly as precipitous.” He attributes that fleeting attention span to the Internet, but the meme-tracking study finds that, if anything, obscure blogs dwell much longer on a meme, in that heartbeat-like fashion, than the more-prominent news sources indexed by Google News.

There are a few other findings that I’ll highlight tomorrow because, despite some flaws in the methodology, the study is totally fascinating — and worth exploring on its dedicated web site. When I talked to Kleinberg two weeks ago, he acknowledged that Google News was an imprecise measure of mainstream media but argued, reasonably, that they had to draw a line somewhere to produce any meaningful results. He also noted that the paper was, in part, intended to demonstrate the meme-tracking technique they’ve developed, in which case, the distinction isn’t important.

And while I was writing this post, Scott Rosenberg weighed in with many of the same criticisms, while concluding:

Nonetheless, I fully expect to see it taken as conventional wisdom from this point forward that “news starts with the traditional media and then moves into the blogosphere.” Perhaps the Memetracker folks can follow the phrase “2.5 hours” and show us exactly how that happens.

PART OF A SERIES     Lab Book Club: Viral culture
