Nieman Journalism Lab
Pushing to the future of journalism — A project of the Nieman Foundation at Harvard

How does Wikipedia deal with a mass shooting? A frenzied start gives way to a few core editors

A researcher into editing patterns on Wikipedia compares the Connecticut school shooting to other acts of mass violence to see how coverage of breaking news gets done on the site.

If you follow me on Twitter, you’re probably already well acquainted with my views on what should happen in the wake of the shooting spree that massacred 20 children and 6 educators at a suburban elementary school in Newtown, Connecticut. This post, however, will build on my previous analysis of the Wikipedia article about the Aurora shootings, as well as my dissertation examining Wikipedia’s coverage of breaking news events, to compare the evolution of the article for the Sandy Hook Elementary School shooting to other Wikipedia articles about recent mass shootings.

In particular, this post compares the behavior of editors during the first 48 hours of each article’s history. The fact that there are 43 English Wikipedia articles about shooting sprees in the United States since 2007 should lend some weight to this much ballyhooed “national conversation” we are supposedly going to have, but I chose to compare just six articles to the Sandy Hook shooting article, selected for their recency and severity and including an international example.

Wikipedia articles certainly do not break the news of the events themselves, but the first edits to these articles happen within two to three hours of the event itself unfolding. Once created, however, these articles attract many editors and changes and grow extremely rapidly.

Figure 1: Number of changes made over time.

The Virginia Tech article, far and away, attracted more revisions than the other shootings in the same span of time and ultimately enough revisions in the first 48 hours (5,025) to put it within striking distance of the top 1,000 most-edited articles in all of Wikipedia’s history. Conversely, the Oak Creek and Binghamton shootings, despite having 21 fatalities between them, attracted substantially less attention from Wikipedians and the news media in general, likely because these massacres had fewer victims and the victims were predominantly non-white.

A similar pattern (Virginia Tech as an exceptional case, shootings involving immigrants and minorities attracting less attention, and the other shootings behaving largely alike) also appears in the number of unique users editing an article over time:

Figure 2: Number of unique users over time.

These editors and the revisions they make cause articles to rapidly increase in size. Keep in mind, the average Wikipedia article’s length (albeit highly skewed by many short articles about things like minor towns, bands, and species) is around 3,000 bytes and articles above 50,000 bytes can raise concerns about length. Despite the constant back-and-forth of users adding and copyediting content, the Newtown article reached 50kB within 24 hours of its creation. However, in the absence of substantive information about the event, much of this early content is often related to national and international reactions and expressions of support. As more background and context come to light, this list of reactions is typically removed, which shows up as a sudden contraction in article size: Utøya around 22 hours, and Newtown and Virginia Tech around 36 hours. As before, the articles about the shootings at Oak Creek and Binghamton are significantly shorter.

Figure 3: Article size over time.

However, not every editor does the same amount of work. The Gini coefficient captures the concentration of effort (in this case, number of revisions made) across all editors contributing to the article. A Gini coefficient of 1 indicates that all the activity is concentrated in a single editor while a coefficient of 0 indicates that every editor does exactly the same amount of work.
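To make the measure concrete, here is a minimal sketch of one common way to compute a Gini coefficient from a list of per-editor revision counts. The function name and the toy numbers are illustrative assumptions, not the code or data behind the figures in this post. Note that with a finite number of editors this formula tops out at (n - 1) / n rather than exactly 1.

```python
def gini(revision_counts):
    """Gini coefficient of a list of non-negative per-editor revision counts.

    Uses the standard rank-weighted formula on the sorted counts.
    Illustrative sketch only; not the code used to produce the figures.
    """
    values = sorted(revision_counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Weight each count by its 1-based rank in ascending order.
    weighted = sum(rank * value for rank, value in enumerate(values, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical toy inputs: four editors with one edit each,
# then four editors where one made all ten edits.
evenly_spread = gini([1, 1, 1, 1])
one_dominant = gini([0, 0, 0, 10])
```

Computing this value over the revisions made up to each hour of an article's history would give a curve of the kind described here.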

Figure 4: Gini coefficient of editors’ revision counts over time.

Across all the articles, the edits over the first few hours are evenly distributed: editors make a single contribution and others immediately jump in to make single contributions of their own. However, around hour 3 or 4, one or more dedicated editors show up and begin to take a vested interest in the article, which is manifest in the rapid centralization of the article. This centralization increases slightly over time across all articles, suggesting these dedicated editors continue to edit after other editors move on.

Another way to capture the intensity of activity on these articles is to examine the amount of time elapsed between consecutive edits. Intensely edited articles may have only seconds between successive revisions while less intensely edited articles can go minutes or hours. This data is highly noisy and bursty, so the plot below is smoothed over a rolling average of about 3 hours.
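As a rough sketch of this kind of measurement, the snippet below computes the gap between consecutive edits and then averages the gaps that fall inside a trailing time window. The function names, the trailing (rather than centered) window, and the three-hour default are assumptions made for illustration; the smoothing actually used for the plot may differ.

```python
from datetime import datetime, timedelta


def waiting_times(timestamps):
    """Return (time_of_edit, seconds_since_previous_edit) for each edit after the first."""
    ordered = sorted(timestamps)
    return [
        (later, (later - earlier).total_seconds())
        for earlier, later in zip(ordered, ordered[1:])
    ]


def rolling_mean(gaps, window=timedelta(hours=3)):
    """Average each gap with the earlier gaps that fall inside a trailing window."""
    smoothed = []
    start = 0
    running = 0.0
    for end, (when, gap) in enumerate(gaps):
        running += gap
        # Drop gaps that are older than the window relative to the current edit.
        while gaps[start][0] <= when - window:
            running -= gaps[start][1]
            start += 1
        smoothed.append((when, running / (end - start + 1)))
    return smoothed


# Hypothetical timestamps: edits at 0, 60, and 180 seconds after creation.
created = datetime(2012, 12, 14, 17, 0, 0)
edits = [created, created + timedelta(seconds=60), created + timedelta(seconds=180)]
gaps = waiting_times(edits)
smoothed = rolling_mean(gaps)
```

Plotting the smoothed values on a log-scaled axis would make both very short and very long gaps visible in the same chart.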

Figure 5: Waiting times between edits (y-axis is log-scaled).

What’s remarkable is the sustained level of intensity over a two-day period. The Virginia Tech article was still being edited several times every minute even 36 hours after the event, while other articles were seeing updates every five minutes more than a day after the event. This means that even at 3 am, all these articles are still being updated every few minutes by someone somewhere. There’s a general upward trend, reflecting intense activity immediately after the article is created followed by increasing time lags as the article stabilizes, but there’s also a diurnal cycle, with edits slowing between 18 and 24 hours after the event before quickening again. This slowing and quickening is seen around 20 hours as well as around 44 hours, suggesting information being released and incorporated in cycles as the investigation proceeds.

Finally, who is doing the work across all these articles? The structure of who contributes to which articles is also revealing. It appears that much of the editing is done by users who have never contributed to the other articles examined here, but there are a few editors who contributed to each of these articles within 4 hours of their creation.

Figure 6: Collaboration network of articles (red) and the editors who contribute to them (grey) within the first four hours of their existence. Editors who’ve made more revisions to an article have thicker and darker lines.

Users like BabbaQ, Ser Amantio di Nicolao, and Art LaPella were among the first responders to edit several of these articles, including Sandy Hook. However, their revisions are relatively minor copyedits and reference formatting, reflecting the prolific work they do patrolling recent changes. Much of the substantive content of the article is from editors who have edited none of the other articles about shootings examined here and likely no other articles about other shootings. In all likelihood, readers of these breaking news articles are mostly consuming the work of editors who have never previously worked on this kind of event. In other words, some of the earliest and most widely read information about breaking news events is written by people with fewer journalistic qualifications than Medill freshmen.

What does the collaboration network look like after 48 hours?

Figure 7: Collaboration network after 48 hours.

In total, 3,262 unique users edited one or more of these seven articles, 222 edited two or more, 60 edited three or more, and a single user, WWGB, edited all seven within the first 48 hours of their creation. These editors are at the center of Figure 7 where they connect to many of the articles on the periphery. The stars surrounding each of the articles are the editors who contributed to that article and that article alone (in this corpus). WWGB is an editor who appears to specialize not only in editing articles about current events, but also in participating in a community of editors engaged in the newswork on Wikipedia. These editors are not the first to respond (as above), but their work involves documenting administrative pages enumerating current events and mediating discussions across disparate current events articles. The ability of these collaborations to unfold as smoothly as they do appears to rest on the ability of Wikipedia editors with newswork experience to either supplant or complement the work done by amateurs who first arrive on the page.
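Counts like these can be derived from a plain list of (editor, article) revision records. The sketch below is one possible way to do it, assuming hypothetical function names and made-up placeholder editors and articles; it is not the pipeline used for this analysis. The first function gives the edge weights of the editor-article network (more revisions, thicker line) and the second counts how many editors touched at least k articles.

```python
from collections import Counter, defaultdict


def edge_weights(revisions):
    """Number of revisions each editor made to each article (one weight per edge)."""
    return Counter(revisions)


def editors_with_at_least(revisions):
    """Map k to the number of editors who edited k or more distinct articles."""
    articles_per_editor = defaultdict(set)
    for editor, article in revisions:
        articles_per_editor[editor].add(article)
    breadth = [len(articles) for articles in articles_per_editor.values()]
    most = max(breadth, default=0)
    return {k: sum(1 for b in breadth if b >= k) for k in range(1, most + 1)}


# Hypothetical revision log: placeholder editors and articles, not real data.
revisions = [
    ("editor_a", "Article X"),
    ("editor_a", "Article X"),
    ("editor_a", "Article Y"),
    ("editor_a", "Article Z"),
    ("editor_b", "Article X"),
    ("editor_c", "Article Y"),
]
weights = edge_weights(revisions)
overlap = editors_with_at_least(revisions)
```

In this toy log, three editors touched at least one article and only one touched two or more, which mirrors the long-tailed shape of the real counts above.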

Of course, this just scratches the surface of the types of analyses that could be done on this data. One might look at changes in the structure and pageview activity of each article’s local hyperlink neighborhood to see what related articles are attracting attention, examine the content of the article for changes in sentiment, the patterns of introducing and removing controversial content and unsubstantiated rumors, or broaden the analysis to the other shooting articles. Needless to say, one hopes the cases for future analyses become increasingly scarce.

Brian Keegan is a post-doctoral research fellow in computational social science at Northeastern University. He earned his Ph.D. in the Media, Technology, and Society program at Northwestern University’s School of Communication, where his dissertation examined the dynamic networks and novel roles which support Wikipedia’s rapid coverage of breaking news events like natural disasters, technological catastrophes, and political upheaval. This article originally appeared on his website.

  • Gregory Kohs

    Brian, based on what you know of prolific Wikipedia editors and administrators, do you think Adam Lanza was a frequent editor of Wikipedia?  He seems to fit the profile in a number of ways:  Member of the Technology Club at school. Then home schooled. Windowless basement at home, with computer, bed, and bathroom.  Withdrawn. Socially awkward. Possibly Asperger’s.

  • Tod Robbins

    Fabulous piece Brian! I’m curious how the community deals with disambiguation of current events in the beginning hours and days. I would assume there are multiple articles authored before one predominates and replaces the others. Any insight on this? 

  • Howell Clark

    Brian, thank you for pointing out why many folks simply do not trust everything in Wikipedia. I loved your concise explanation of how the editorial side works. Knowledge is our only protection in the long run. Wikipedia is a great public service instrument but it is a reader beware for the serious knowledge addict. I look at it as simply a primer for ideas on where to look for in-depth research on a subject if I’m so inclined.

  • Sterling Ericsson

    Kohs, can you ever maybe not insult people in your comments? I know it’s difficult for you, but you should at least try.

  • Gregory Kohs

    I’m sorry that you have not familiarized yourself with the academic literature, Mr. Ericsson (for those not aware, this is Sterling Ericsson’s “fursona” — ).  However, the report “Personality Characteristics of Wikipedia Members” by Yair Amichai–Hamburger, et al, concluded the following:

    Variance analysis revealed significant differences between Wikipedia members and non-Wikipedia members in agreeableness, openness, and conscientiousness, which were lower for the Wikipedia members.

    So, while you may find the scientific evidence insulting for some reason, that would appear to be your own problem to work out, not mine.

  • Sterling Ericsson

    What does that result from the study have anything to do with the insults you made up above? You went from a survey of less than 70 presumed Wikipedia editors that merely stated there were differences in “agreeableness, openness, and conscientiousness” to claiming they have Asperger’s?

    And where does that leave you then? Wouldn’t you also qualify as a “frequent editor of Wikipedia”, for all of your making of alternate accounts after getting banned?

  • Gregory Kohs

    Oh, Mr. Ericsson, you’re again ignorant of your own project’s demographics.  Here is a list that your friends in the Wikimedia projects maintain, to salute the many Wiki editors with Asperger’s:   If you want to count me among “frequent editors of Wikipedia”, perhaps — I’m editing about 10 times per month, almost always as a part of contracted work that I must perform on behalf of clients.  If I didn’t have these clients paying me, I wouldn’t be editing Wikipedia, trust me.  So, where does that leave me, then?

  • alice smith

    One of the most important points the article made is that at least some part of the famed Wikipedia community often behaves as a frenzied mob.

  • Sterling Ericsson

    I already knew about such a list and I also know that it’s incredibly short, since it covers all the Wikimedia projects. It isn’t indicative of anything. If you mean just to insult that group of people, then go ahead, but there seems to be no real purpose to it beyond general dickishness. 

    And you had a fairly decent amount of edits from when your account wasn’t blocked: 1,853 on Thekohser, 414 on MyWikiBiz. Add in all of your other sockpuppet accounts and it must be a fairly substantial amount of edits. 

  • Sterling Ericsson

    An interesting article overall, Brian. I especially liked how you mapped out the editors that were involved in the various current events shooting articles. I’ve mostly only dabbled in current event articles, usually sticking to the talk page for the most part and posting references that could be added, since directly editing a current event article is a chaos of edit conflicts when someone else edits at the same time as you.

    It’ll be interesting to see how this data develops in the future when more high profile shootings happen (as we all know they will). I hope you will continue to bring this data forward or possibly combine it with other data sets that you run for other topics. It would be nice to see a continued evolution of this sort of thing.

  • Robofish

     What a pointlessly insulting comment.

    By the same logic, one could ‘link’ Adam Lanza to any number of groups that share some of those characteristics – video gamers for example, or bloggers. But doing so would rather miss the point. Lanza wasn’t representative of Wikipedia editors (if he even was one) or any of those other groups, for the simple reason that 99% of them don’t murder people.

    He may have had Aspergers, but so what? No serious psychiatrist thinks that was a factor in his turning violent. I don’t have Aspergers myself, but I know enough about it to know it’s not linked to violence, and wasn’t an important element in this tragedy.

    The only obvious purpose of your comment seems to have been to try to slur Wikipedia by drawing extremely dubious associations with a mass murderer. I hope you’re proud of yourself.

  • Gregory Kohs

    Because you call it “pointlessly insulting” does not make it so.  Let’s look at what Wikipedia says about mass murderer Anders Behring Breivik: 
     In one section of the manifesto entitled “Battlefield Wikipedia” Breivik explains the importance of using Wikipedia as a venue for disseminating views and information to the general public,[166] although the Norwegian professor Arnulf Hagen claims that this was a document that he had copied from another author and that Breivik was unlikely to be a contributor to Wikipedia.[167] According to the leader of the Norwegian chapter of the Wikimedia Foundation an account has been identified which they believe Breivik used.[168] In the second day of his trial Breivik cited Wikipedia as the main source for his worldview.[169] The blogger Fjordman claims that a large part of his manifesto quoted Wikipedia and that it “probably shaped his strange and imprecise political vocabulary”.[170]

    So, I’m able to see a possible pattern forming here, and I am willing to discuss that pattern with the credibility of using my own real name, while you’re playing armchair psychiatrist to defend Wikipedia from behind a pseudonym “Robofish”.  I hope you’re proud of yourself.

  • Robofish

     Fair enough, but one example doesn’t make a trend. If it turns out that Lanza was also a Wikipedia editor, you’ll have more of a case, but it’s hard to make an extrapolation from a single Wikipedia editor who became a murderer to all Wikipedia editors in general.

    As for pseudonymity – fair point, I’ll identify myself. I tried to log in via Facebook but that didn’t work, but for what it’s worth here’s my page: I’m just a student and don’t pretend to have any special knowledge regarding mental disorders or Wikipedia users.

  • Gregory Kohs

    Well, thank you for engaging with accountability, Alasdair.  I know that I only provided one example, and I know that one data point doesn’t make a trend.  However, there are more examples out there, even whole studies, that indicate that a disproportional number of heavy Wikipedia editors have antisocial characteristics.

  • Gregory Kohs

    And… it turns out that Lanza edited Wikipedia, just like I predicted:

  • Sterling Ericsson

    Kohs, it’s been a heavenly few months where I haven’t had to deal with you or think about you at all. Can you please just go and do something else, rather than obsessively digging up year old discussions?

  • Gregory Kohs

    Mr. Keegan, is it of any interest at all to your research that Adam Lanza was himself an editor of Wikipedia?