Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment

By Michael Andersen  /  June 23, 2009  /  7 a.m.

Okay, question time: Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets.

How do you catch up?

If you’re the Guardian of London, you wait for the associated public-records dump, shovel it all on your Web site next to a simple feedback interface and enlist more than 20,000 volunteers to help you find the needles in the haystack.

Your cost for the operation? One full week from a software developer, a few days’ help from others in his department, and £50 to rent temporary servers.

Journalism has been crowdsourced before, but it’s the scale of the Guardian’s project — 170,000 documents reviewed in the first 80 hours, thanks to a visitor participation rate of 56 percent — that’s breathtaking. We wanted the details, so I rang up the developer, Simon Willison, for his tips about deadline-driven software, the future of public records requests, and how a well-placed mugshot can make a blacked-out PDF feel like a detective story.

He offered four big lessons:

Your workers are unpaid, so make it fun. Willison started coding one week before the Thursday launch date, teamed with a designer on Tuesday, a system administrator on Wednesday and leaned on everyone in his 15-person department for ad-hoc help on Thursday. But the bulk of the labor would come from Guardian readers.

How to lure them?

By making it feel like a game, said Willison, 28. The Guardian’s four-panel interface (“interesting,” “not interesting,” “interesting but known,” and “investigate this!”) made categorization easy. And the progress bar on the project’s front page immediately gave the community a goal to share.

But a video game needs more than an interface and a score. It needs a narrative — and this project offered that, too.

That was what Willison discovered when, on a whim, he added the Guardian’s mugshots of each MP to their pages in the database. Participation shot up, he said.

“There’s that wonderfully personal element, because everybody in the U.K. has an MP,” Willison said. “You’ve got this big smiling face looking at you while you’re digging through their expenses.”

On Monday, to add a competitive edge, Willison posted lists of the top-performing volunteers. By that point, the project had drawn 36,000 unique visitors and 20,440 participants.

“Any time that you’re trying to get people to give you stuff, to do stuff for you, the most important thing is that people know that what they’re doing is having an effect,” Willison said. “It’s kind of a fundamental tenet of social software. … If you’re not giving people the ‘I rock’ vibe, you’re not getting people to stick around.”
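For readers who want to see those mechanics in concrete terms, here is a minimal sketch of the bookkeeping the game elements imply: four labels, a shared progress figure and a list of top volunteers. It is not the Guardian’s code. The function names, the sample volunteers and the ten-page total are all hypothetical; only the four labels and the Monday visitor and participant counts come from the reporting above. Those Monday counts work out to roughly 57 percent, in line with the participation rate cited earlier.

```python
from collections import Counter

# The four button labels quoted in the article.
LABELS = (
    "interesting",
    "not interesting",
    "interesting but known",
    "investigate this!",
)


def record_review(reviews, volunteer, page_id, label):
    """Store one volunteer judgment; reject labels outside the four buttons."""
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    reviews.append((volunteer, page_id, label))


def progress(reviews, total_pages):
    """Fraction of pages reviewed at least once (what a progress bar could show)."""
    reviewed = {page_id for _, page_id, _ in reviews}
    return len(reviewed) / total_pages


def leaderboard(reviews, top_n=3):
    """Volunteers ranked by the number of reviews they submitted."""
    return Counter(volunteer for volunteer, _, _ in reviews).most_common(top_n)


def participation_rate(participants, visitors):
    """Share of unique visitors who reviewed at least one page."""
    return participants / visitors


# Hypothetical sample data: two volunteers, three of ten pages touched.
reviews = []
record_review(reviews, "alice", 1, "interesting")
record_review(reviews, "bob", 1, "not interesting")
record_review(reviews, "alice", 2, "investigate this!")
record_review(reviews, "alice", 3, "not interesting")

print(f"{progress(reviews, total_pages=10):.0%} of pages reviewed")
print(leaderboard(reviews))
# Monday figures from the article: 20,440 participants, 36,000 unique visitors.
print(f"{participation_rate(20_440, 36_000):.1%} of visitors took part")
```

Counting distinct pages rather than raw clicks keeps the progress figure honest when several volunteers review the same page.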

Public attention is fickle, so launch immediately. Before Parliament released its records Thursday, Willison’s team thought they might be able to postpone their launch to Friday if necessary. When they saw Thursday’s news broadcasts, they realized they’d been wrong. The country’s imagination was caught.

“It became quickly clear on Thursday that it was a huge story, and if we failed to get it out on Thursday, we’d lose a lot of momentum,” Willison said.

The result: No time to load-test the program, perfect the interface, or even set up a system for Guardian reporters to view the vast amount of data that started pouring into their servers. (The first overview wasn’t ready for publication until Monday.)

Some programmers would be uncomfortable in those circumstances. Welcome to journalism, folks.

“We kind of load-tested it with our real audience, which guarantees that it’s going to work eventually,” Willison said impishly. “It’s a very realistic way of debugging the application.”

Speed is mandatory, so use a framework. Willison’s project was built on Django, the custom Web framework “for perfectionists with deadlines” that he and Adrian Holovaty created for the Lawrence Journal-World. In the world of database programming, a framework is like an offset press: hard to build — Django 1.0 required three years of open-source development — but once it’s set up, there’s no faster way to churn out content. Hand-coding an application like the Guardian’s would have been like publishing a daily newspaper with movable type.

Other frameworks and languages would have worked, too. “You absolutely could build this in Ruby on Rails or in PHP,” Willison said, but “as far as I’m concerned, this is absolutely Django’s sweet spot. This is absolutely what Django is designed to do. … Once I had a designer and a client-side engineer working on the project, I could really just hand it over to them and I didn’t have to worry about the front-end code any more.”

Participation will come in one big burst, so have servers ready. As well as the Guardian’s first Django joint, this was its first project with EC2, the Amazon contract-hosting service beloved by startups for its low capital costs.

Willison’s team knew they would get a huge burst of attention followed by a long, fading tail, so it wouldn’t make sense to prepare the Guardian’s own servers for the task. In any case, there wasn’t time.

“The Guardian has lead time of several weeks to get new hardware bought and so forth,” Willison said. “The project was only approved to go ahead less than a week before it launched.”

With EC2, the Guardian could order server time as needed, rapidly scaling it up for the launch date and down again afterward. Thanks to EC2, Willison guessed the Guardian’s full out-of-pocket cost for the whole project would be around £50.

As for the software, it was all open-source, freely available to the Guardian — and to anyone else who might want to imitate them. Willison hopes to organize his work in the next few weeks.

“There’s a lot of stuff in there that’s potentially reusable,” Willison said.

Photo of Willison by Matt Patterson used under a Creative Commons license.

This entry was written by Michael Andersen and posted on June 23, 2009 at 7:00 am.


123 comments:

  1. andrew at 11:20 am, June 23, 2009

    ouch. “tenet” not “tenant”. dude. [Fixed. —Josh]

     
  2. David at 12:00 pm, June 23, 2009

    Great post, I’ve just expanded on this with a focus around FMCG & Online companies and how they are using Crowd Sourcing.

     
  3. eas at 1:25 pm, June 23, 2009

    Thanks for the post. I’m hoping that other news organizations will copy and improve on this model now that its potential has been shown.

    This sort of thing is broadly useful. The “sausage business” relies on the obscurity of information. In the US, the full text of bills is often not available in time for serious review by anyone. The “final” text of the huge US recovery bill was available for less than a day before it was signed into law, and it was in the form of scanned documents with handwritten annotations.

    Newspapers can prove their relevance in the future by being ready to enlist readers in stories like these. They can do the hard and sometimes expensive work of securing access to important documents and making them available, and they can provide the systems readers can use to help transcribe it and flag anything that looks fishy. In exchange for their investment in securing the documents, and providing the systems for reviewing them, their professional journalists can then get first crack at the aggregate work of their volunteers.

     
  4. Andy Mabbett at 5:26 pm, June 23, 2009

    eas’ point that “the full text of bills is often not available in time for serious review by anyone” is addressed, in the UK at least, by the Free Our Bills campaign.

     
  5. gaston monescu at 8:56 pm, June 23, 2009

    great article. crowdsourcing is evolutionary.

     
  6. pbhj at 9:55 pm, June 23, 2009

    I did some data entry for the Guardian – it would have been easier if they’d given some more guidelines: what is significant, what is known (other than duckhouses). Also date entry: not all line items have dates – expenses cover a range of dates, so should the date given be the submission date, start date or end date?

    These small things may have reduced participation but I think they wouldn’t substantially and may even have increased participation as uncertainty as to if you’re doing it right tends to put one off. Also the date issue would have given them better data.

    Data entry could have been speeded up with more standardised options – nearly all MPs I viewed use “Banner” (initially I thought that was advertising) which receipts show is a stationery supplier. Perhaps matching line entries repeated X times could have been used as type-in suggestions.

    Impressed they put it all together so quick though. Well done Guardian.

     
  7. Michael Andersen at 12:38 am, June 24, 2009

    Thanks for the comments, folks.

    @eas – I totally agree. As a working reporter who sometimes chases database projects of his own, I can even tell you that almost every substantial public records request includes the question, “but how long will this take for me to read?” I think projects like this are the beginning of a huge shift.

    @David – Nice post. Left a comment there for you.

    @pbhj – Good point. I was about to say that any further complexity would have cut participation, but I think you’re right about clarity: better definitions for the buttons would have made me (as a participant) more certain that I was being helpful. I was haunted by the fear that I might be messing up.

    Simon mentioned another thing I couldn’t figure out how to fit into the post: the buttons are essentially votes, because many of the documents have been viewed multiple times. If people disagree on a document, his system double-flags it.

    Finally, to clarify my final paragraphs above: Willison says (on his Twitter feed) that though the software he used is open source (and can be used by imitators, etc.), the software he wrote is not.

     
  8. Pete at 4:57 am, June 24, 2009

    Lesson 5: make sure you double check your ‘crowdsourced’ facts.

    http://foiblesblog.wordpress.com/2009/06/21/guardian-mp-paler-than-we-might-have-led-you-to-believe/

     
  9. Aron Pilhofer at 8:08 am, June 24, 2009

    Right on, and good show to the Guardian and Simon for a job well done. If I can add one thing, news organizations should pay special attention to Simon’s last point.

    Even more than a framework (we use Ruby on Rails and a bit of Django), Amazon EC2 has allowed us to work miracles online. We’ve been able to go from a standing start to a fully deployed application in a matter of hours.

    It takes technology largely (but not completely) out of the equation for these news apps, and allows us to focus on the important stuff.

     
  10. Duncan Smart at 8:51 am, June 24, 2009

    “Guardian of London” – Manchester actually

     
  11. Andrew Ingram at 9:23 am, June 24, 2009

    I also felt that I might be doing it wrong. I wasn’t sure how to categorise expenses data that looked fine, if I’m categorising cover sheets as ‘not interesting’ should I really be categorising small expenses in the same way?

    Same difficulty with expenses for durations, like hotel stays. Another difficulty was that landscape pages were a pain to work with so I tended to skip them.

    I generally felt after doing a few pages that there was a real chance I wasn’t doing it right, and that stopped me doing more.

     
  12. Michael Andersen at 3:43 pm, June 24, 2009

    @ Pete – Ouch. Good call.

    @ Duncan — originally, yep, but they relocated to the big city in the 60s. They’re definitely a Londocentric outlet now.

     
  13. Andrew at 5:01 pm, June 24, 2009

    Willison says “There’s a lot of stuff in there that’s potentially reusable.”

    But the genius of working this way, especially in frameworks like Django or Ruby on Rails, is that there’s no need to worry about stuff being reusable, since it’s so easy to build out bespoke applications tailored to specific projects. The reusable parts are already extracted out into the Django framework itself.

     
  14. 墨尔本 at 10:54 am, June 26, 2009

    That visitor review idea is really brilliant.

     
  15. Pablo Lara at 3:01 pm, June 26, 2009

    It is a sample of how the Internet is enabling the democracy of the knowledge.

     
  16. Claire Halley at 6:16 am, June 29, 2009

    Bit of a damp squib in the end, don’t you think? The processing has dried up less than half way through the project and all they’ve got from it, as far as I can see, is the need to apologise for publishing some claims without checking them properly. If there are any Guardian people reading, could you give us an idea of what you’ve learned from this?

     
  17. Phillip Smith at 11:26 am, July 6, 2009

    “Hand-coding an application like the Guardian’s would have been like publishing a daily newspaper with movable type.”

    Why the dig at Movable Type? Last I heard, the Guardian was using Movable Type to power its massively-successful “Comment is free” community.

    No disagreement about using an application framework vs. using a blogging tool — but I don’t see the need to single out a single application unnecessarily.

    Phillip.

     
  18. Joshua Benton at 11:31 am, July 6, 2009

    Phillip, for the record, we weren’t knocking Movable Type, the blogging software. We were referring to movable type, the stuff Gutenberg invented for printing, which while useful for illuminated Bibles way back when, would make printing a daily newspaper today awfully painful.

     
  19. Michael Andersen at 2:38 pm, July 6, 2009

    Great point, Claire. I emailed it to Willison last week (along with some of the other critiques above), and he said the same.

    He said he’d offer a longer response as soon as he’s finished working on version 2.0 of the crowdsourcing site. Stay tuned!

     
  20. Phillip Smith at 9:33 am, July 8, 2009

    @Joshua Benton

    Phillip, for the record, we weren’t knocking Movable Type, the blogging software. We were referring to movable type, the stuff Gutenberg invented for printing, which while useful for illuminated Bibles way back when, would make printing a daily newspaper today awfully painful.

    Doh. My mistake. Many apologies, and thanks for the clarification. I guess I should have followed the link. :-)

    Phillip.

     
  21. Hax0r at 12:54 am, August 17, 2009

    Lies, Lies and Damn Lies.

    (1) Tracing requests to mps-expenses.guardian.co.uk/ reveals that the site (or its content) is NOT hosted on EC2.
    (2) 200k worth of documents would consume the bandwidth equivalent of 500gb of space. I wonder how $50 can pay for that kind of bandwidth consumption. Without going into details, I reckon it should take at least 10-15 instances of EC2 to support the kind of traffic suggested by the article. http://aws.amazon.com/ec2/#pricing
    (3) 1 week of django – really ?

    I’ve seen tech PR spins before – but this one pwns all.

     
  22. Rose Holley at 1:16 am, September 15, 2009

    The lessons learnt about how to effectively crowdsource are really relevant for libraries and archives. I wish more libraries would take them up with the specific aim of enhancing existing content. Another good example of effective crowdsourcing is the Australian Newspapers Digitisation Program http://newspapers.nla.gov.au where the National Library of Australia encourages the public to correct the OCR text of historic newspapers which improves the data quality and therefore search results for everyone. So far 5 million lines of text have been corrected by the public. Read more about it at: http://www.nla.gov.au/ndp/project_details/documents/ANDP_ManyHands.pdf “Many Hands Make Light Work”.

     
  23. Rudy Turinay at 10:48 am, September 24, 2009

    ” Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. ”

    No… I’m a community manager !

     

Trackbacks:

  1. Quatro lições sobre “crowdsourcing” : Ponto Media at 9:59 am, June 23, 2009

    [...] STILL ON the Guardian’s “crowdsourcing” experiment, this is required reading: Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment. [...]

     
  2. Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman Journalism Lab « Netcrema - creme de la social news via digg + delicious + stumpleupon + reddit at 10:15 am, June 23, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman…niemanlab.org [...]

     
  3. Nieman Journalism Lab: Four crowdsourcing lessons from the Guardian’s expenses experiment | Journalism.co.uk Editors' Blog at 10:30 am, June 23, 2009

    [...] Full post at this link… [...]

     
  4. Crowdsourcing works? « The Lost Press Marketing Blog at 11:57 am, June 23, 2009

    [...] was able to sort through the 2,000,000 pages of documents surrounding the recent UK expense scandal using crowdsourcing. While this project was made easy for the Guardian with a circulation of around 340,000 to call for [...]

     
  5. links for 2009-06-23 / read write play at 12:00 pm, June 23, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment (tags: crowdsourcing media research social) [...]

     
  6. transparantezaken's status on Tuesday, 23-Jun-09 16:08:30 UTC - Identi.ca at 12:08 pm, June 23, 2009

    [...] Four lessons from the Guardian’s crowdsourcing investigation into British MPs’ expenses http://ur1.ca/65ia [...]

     
  7. Lessons in the Guardian’s crowdsourcing site at 1:30 pm, June 23, 2009

    [...] Jay Rosen learned with Assignment Zero, crowdsourcing isn’t easy. That’s why comments by the Guardian’s developer on the project, Simon Willison, are so interesting. In an interview [...]

     
  8. popurls.com // popular today at 2:00 pm, June 23, 2009

    popurls.com // popular today…

    story has entered the popular today section on popurls.com…

     
  9. 5 Successful News Crowdsourcing Experiments | Business Pundit at 3:51 pm, June 23, 2009

    [...] 1. The UK Guardian’s Expense Scandal Coverage [...]

     
  10. buzz at 6:35 pm, June 23, 2009

    Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment (NJL)…

    “Journalism has been crowdsourced before, but it’s the scale of the Guardian’s project — 170,000 documents reviewed in the first 80 hours, thanks to a visitor participation rate of 56 percent — that’s breathtaking”…

     
  11. Brian Boyer — Hacker Journalist : Kick Ass News Apps! — projects to inspire journos at 7:19 pm, June 23, 2009

    [...] the Guardian investigate the data even if the public’s participation is minimal or inaccurate. This Nieman Labs article provides some good lessons learned from Simon Willison, the application [...]

     
  12. PR, Public Relations & communications news and features at 8:39 pm, June 23, 2009

    [...] LEARN: How the Guardian worked their social mojo on the scandal ‘experiment’ [Guardian] [...]

     
  13. Investigative journalism on 24 June 09 « The Centre for Investigative Journalism News Blog at 5:16 am, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… [...]

     
  14. Some tech and decision details on the cr… « Paul M. Watson at 6:05 am, June 24, 2009

    [...] 10:05 am on June 24, 2009 Permalink | Reply Tags: dev (184), media (42) Some tech and decision details on the crowd-sourcing MPs Expenses project from the Guardian and Simon Willison. Django and Amazon [...]

     
  15. Upstream Connections – SEO » Guardians Crowdsourcing Experiment at 7:09 am, June 24, 2009

    [...] http://www.niemanlab.org/2009/06/four-crowdsourcing-lessons-from-the-guardians-spectacular-expenses-... [...]

     
  16. Gathering expense data – Joe Public does the leg work « Mel Poluck at 7:19 am, June 24, 2009

    [...] Harvard’s Nieman Journalism Lab blog has a good article on this particular experiment and details what the author Michael Anderson thinks are the four main [...]

     
  17. Hire nobody, hire everybody. | Taylor Davidson at 8:09 am, June 24, 2009

    [...] what the Guardian newspaper in the UK recently did to crowdsource their investigation into the recent scandal over MPs’ expenses (link via Ethan [...]

     
  18. Technology Links at 8:26 am, June 24, 2009

    [...] on The Guardian’s crowdsourcing venture. Amazing, the simplicity of the [...]

     
  19. Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment | dv8-designs at 9:44 am, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment. Michael Andersen from the Nieman Journalism Lab interviewed me about the MP expenses crowdsourcing site. [...]

     
  20. Four Crowdsourcing Lessons From The Guardian’s (Spectacular) Expenses-Scandal Experiment | Michael Andersen | Voices | AllThingsD at 10:24 am, June 24, 2009

    [...] Read the rest of this post on the original site Tagged: Internet, Voices, digital, innovation, media, newspaper, politics, software, Michael Andersen, newspaper, Nieman Journalism Lab, scandal, secrets | permalink [...]

     
  21. Four Crowdsourcing Lessons From The Guardian’s (Spectacular) Expenses-Scandal Experiment [Voices] | UpOff.com at 11:14 am, June 24, 2009

    [...] Okay, question time: Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets. Read the rest of this post on the original site [...]

     
  22. Crowdsourcing Lessons | Superposition Kitty at 12:40 pm, June 24, 2009

    [...] Crowdsourcing with The Guardian. [...]

     
  23. Crowd-sourcing Lessons From The Guardian’s Expenses Scandal Experiment | Sharpe's Opinion at 1:21 pm, June 24, 2009

    [...] Crowd-sourcing Lessons From The Guardian’s Expenses Scandal Experiment [...]

     
  24. Pigsaw Blog » Blog Archive » Bookmarks for 24 Jun 2009 at 2:02 pm, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment »… "Okay, question time: Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets. [...]

     
  25. Brits Investigate Politicians « Changing Way at 2:20 pm, June 24, 2009

    [...] to classify each document. Michael Andersen at Harvard’s Nieman Journalism Lab presented four crowdsourcing lessons, based on an interview with Simon Willison, who developed the web [...]

     
  26. A Guardian crowdsourcing update » Nieman Journalism Lab at 4:43 pm, June 24, 2009

    [...] quick thoughts I want to pull up from the comments of Michael’s post on the Guardian’s success crowdsourcing the analysis of documents in the MP expenses [...]

     
  27. Dario Salvelli’s Blog » Blog Archive » Feedmastering #154 at 4:49 pm, June 24, 2009

    [...] newspapers rarely tend to give external links, and not only the Italian ones. Perhaps on newspaper crowdsourcing, at least, we ought to learn from the [...]

     
  28. Interesting Reading #305 – The Blogs at HowStuffWorks at 5:02 pm, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment – “Okay, question time: Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets. How do you catch up?” [...]

     
  29. Crowdsourcing investigation · rinsing the colander at 6:23 pm, June 24, 2009

    [...] Willison talks about the process at Nieman Journalism Lab: “We kind of load-tested it with our real audience, which guarantees that it’s going to work [...]

     
  30. links for 2009-06-24 - magnum blog at 7:05 pm, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… If you’re the Guardian of London, you wait for the associated public-records dump, shovel it all on your Web site next to a simple feedback interface and enlist more than 20,000 volunteers to help you find the needles in the haystack. (tags: crowdsourcing journalism guardian media socialmedia django internet Politics socialnetworking opensource social) [...]

     
  31. Footprints (24.06.09) | Chris Deary at 8:33 pm, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment [...]

     
  32. Mine seneste bookmarks (18.06.09 – 24.06.09) | Morten Gade at 9:01 pm, June 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment »…: (medier avis2.0) [...]

     
  33. pligg.com at 9:39 pm, June 24, 2009

    Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman Journalism Lab…

    Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment:
    — Your workers are unpaid, so make it fun.
    — Public attention is fickle, so launch immediately.
    — Speed is mandatory, so use a framework.
    — Participatio…

     
  34. links for 2009-06-25 | burningCat at 4:07 am, June 25, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment Okay, question time: Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets. [...]

     
  35. Crowdsourcing: The Smartest way to get an answer? « Daniel Nisbet at 9:18 am, June 25, 2009

    [...] but lot’s of noise. Facebook: Limited to your friends, so depends on how smart they are. The Guardian: They crowd sourced their reviewing of MPs expense documents. Was is successful, should they [...]

     
  36. links for 2009-06-25 « Amy G. Dala at 10:01 am, June 25, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… django, ec2 (tags: journalism python technology politics crowdsourcing) [...]

     
  37. links for 2009-06-25 – Hartmut Ulrich - Randbetrachtungen at 11:08 am, June 25, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… The methods of journalistic work are changing from the ground up. One of the ideas is to let your readers do the work, or at least part of it. After all, they are better informed anyway, and faster than the newsroom can be. (tags: socialmedia Community media crowdsourcing) [...]

     
  38. BlogLESS : Four Design Trends: June 25, 2009 at 12:28 pm, June 25, 2009

    [...] a nice piece from Harvard’s Nieman Journalism Lab about how the UK’s Guardian newspaper used [...]

     
  39. links for 2009-06-25 at a convenient truth at 1:03 pm, June 25, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… Fascinating article on how to do crowdsourcing (tags: python opensource crowdsourcing journalism) « links for 2009-06-23 [...]

     
  40. - First Drafts - The Prospect magazine blog at 2:07 pm, June 25, 2009

    [...] the Browser, the Nieman Journalism Lab on the Guardian’s crowdsourcing [...]

     
  41. Four short links: 24 June 2009 | Design Website at 2:57 pm, June 25, 2009

    [...] Four Crowdsoucing Lessons from the Guardian’s Spectacular Expenses Scandal Experiment — Your workers are unpaid, so make it fun. How to lure them? By making it feel like a game. “Any time that you’re trying to get people to give you stuff, to do stuff for you, the most important thing is that people know that what they’re doing is having an effect,” Willison said. “It’s kind of a fundamental tenet of social software. … If you’re not giving people the ‘I rock’ vibe, you’re not getting people to stick around.” (via migurski on delicious) [...]

     
  42. Michael Nielsen » Biweekly links for 06/26/2009 at 6:53 am, June 26, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… [...]

     
  43. Jessica Chapel / Railbird v2 - Links for 2009-06-26 at 11:29 am, June 26, 2009

    [...] Four lessons from the Guardian’s expenses-scandal experiment Make it fun, launch immediately, use a framework, have servers ready. Question for visitors: What in racing would benefit from this sort of intensive crowdsourcing? [...]

     
  44. Kaobanga.com Blog’as » Blog Archive » Lygtis su 457,153 nežinomaisiais at 1:42 pm, June 26, 2009

    [...] is this possible? If this question is nagging at you too, I will offer a few excerpts from an interview with Simon Willison, the programmer who led this [...]

     
  45. Journalism 3.0: Crowd Sourced, Mashup, and Mobile Journalism « Compassion in Politics: Christian Social Entrepreneurship, Education Innovation, & Base of the Pyramid/BOP Solutions at 9:27 pm, June 26, 2009

    [...] investigative journalism of the government (here are some suggestions if you want to follow the Guardian crowdsourced journalism model). Of course, New Assignment.net, Now Public, Digg, and other crowd sourced journalism experiments [...]

     
  46. Sunday Links for 2009-06-28 | MarkSimon.de at 2:18 am, June 28, 2009

    [...] Four crowdsourcing lessons from the Guardian’s expenses-scandal experiment [...]

     
  47. ciberesfera » Blog Archive » links for 2009-06-28 at 4:02 pm, June 28, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… (tags: crowdsourcing journalism socialmedia citizenjournalism) Share [...]

     
  48. Four short links: 24 June 2009 | ★ Technology News | Tech Crown at 8:14 am, June 29, 2009

    [...] Four Crowdsoucing Lessons from the Guardian’s Spectacular Expenses Scandal Experiment — Your workers are unpaid, so make it fun. How to lure them? By making it feel like a game. “Any time that you’re trying to get people to give you stuff, to do stuff for you, the most important thing is that people know that what they’re doing is having an effect,” Willison said. “It’s kind of a fundamental tenet of social software. … If you’re not giving people the ‘I rock’ vibe, you’re not getting people to stick around.” (via migurski on delicious) [...]

     
  49. inhr » Blog Archive » Oude wijze mannen of slimme jonge honden? at 8:22 am, June 29, 2009

    [...] news about topics that have my attention. Via Twitter I was alerted to a fine example of contemporary investigative journalism. The English quality newspaper The Guardian has [...]

     
  50. The Third Bit » Blog Archive » Four Crowdsourcing Lessons at 9:58 am, June 29, 2009

    [...] has been the way people like Simon Willison have leveraged the web to dig up the truth.  This recent post has a nice summary of lessons [...]

     
  51. links for 2009-06-29 - NOWUSEIT.COM at 8:33 pm, June 29, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… Journalism has been crowdsourced before, but it’s the scale of the Guardian’s project — 170,000 documents reviewed in the first 80 hours, thanks to a visitor participation rate of 56 percent — that’s breathtaking. (tags: crowdsourcing journalism guardian socialmedia media politics opensource internet socialsquare) [...]

     
  52. Tortuous | UAEBlogging.com at 6:30 am, June 30, 2009

    [...] here. The site, Harvard’s Nieman Journalism Lab, is a must-add for your RSS feed if you have any [...]

     
  53. crankycoder.com » Links for July 1st at 2:01 am, July 1, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment – [...]

     
  54. pligg.com at 6:21 pm, July 1, 2009

    Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment…

    “With EC2, the Guardian could order server time as needed… the Guardian’s full out-of-pocket cost for the whole project will be around £50.”…

     
  55. Bookmark: Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment - Ipertesti di Paolo Sordi: il blog at 5:01 am, July 2, 2009

    [...] News 2.0 gets programmed. Like an application. URI: http://www.niemanlab.org/2009/06/four-crowdsourcing-lessons-from-the-guardians-spectacular-expenses-... [...]

     
  56. Karine Sabatier » Blog Archive » Links w#26 at 8:14 am, July 2, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment When the whole is greater than the sum of its parts. Communities ! [...]

     
  57. How the Guardian parlayed a rival’s scoop into its own journalistic triumph : contrarian at 11:04 pm, July 2, 2009

    [...] game-saving exercise in crowdsourcing. The Nieman Foundation’s Journalism Lab picks up the remarkable story, complete with four crucial pointers for would-be imitators. Journalism has been crowdsourced [...]

     
  58. Palmas para o The Guardian « Run, Motherfucker, run at 7:56 pm, July 3, 2009

    [...] scarce attention is The Guardian. I was poking around Russel Davies’s Delicious and came across a Nieman Lab post about one of the most badass things I have ever seen in [...]

     
  59. A Photograph, 10,000 Eyes and the Future of Journalism « Reinventing the Newsroom at 12:22 pm, July 9, 2009

    [...] could be poring over the expenses of Members of Parliament — witness the Guardian’s exercise in crowdsourced investigative journalism, which is also a model of successfully using social media to drive reader participation and [...]

     
  60. A lire ailleurs du 22 juin au 9 juillet | traffic-internet.net at 3:49 am, July 10, 2009

    [...] . 4 lessons on the future of journalism: The Nieman Journalism Lab looks back at the experiment the Guardian launched around the British MPs’ expenses scandal, which managed to engage 20,000 readers in deciphering the 457,000 documents obtained. Among the 4 lessons to remember: make the process enjoyable, be fast, have robust servers… [...]

     
  61. The Digital Wing » Blog Archive » The Guradian’s crowdsourcing experiment at 8:35 pm, July 16, 2009

    [...] it’s worth reading this article from the Nieman Journalism Lab on why it worked – as it’s not just a matter of whacking up the documents and letting [...]

     
  62. John Brookmyre's Blog : Outsourcing the information worker at 2:26 pm, July 19, 2009

    [...] the law firm Pinsent who outsource litigation work to South Africa, however I was very impressed by this article which Matt Harris passed on to me – essentially the Telegraph have had a fantastic couple [...]

     
  63. News and Updates on Crowdsourcing at 12:15 am, July 22, 2009

    [...] 20,000 readers. The response, participation and speed of this effort are simply astounding. In this article, the developer of the crowdsourcing environment offers his perspective on how the event was [...]

     
  64. Crowdsourcing at its boldest « New Conversation at 11:12 pm, July 22, 2009

    [...] at its boldest This experiment of the Guardian’s and Nieman Lab’s succeeding review of it are incredible.   It used to be almost unthinkable for newspapers to be this outwardly [...]

     
  65. Interesting stuff I saw online, Jun. 23 to Jul. 24 | STL Social Media Guy at 4:03 pm, July 24, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment Niema… – "Journalism has been crowdsourced before, but it’s the scale of the Guardian’s project — 170,000 documents reviewed in the first 80 hours, thanks to a visitor participation rate of 56 percent — that’s breathtaking." [...]

     
  66. links for 2009-06-25 at 5:46 pm, August 8, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… Free software on outsourced servers, developed in days, tested while deployed. New journalism is here. (tags: design programming journalism code tools) @zota for the week of 2009-06-25 links for 2009-06-24 [...]

     
  67. Life of Alan » links for 2009-08-18 at 12:01 am, August 19, 2009

    [...] "Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment"… Okay, question time: Imagine you’re a major national newspaper whose crosstown archrival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets. [...]

     
  68. Graverende om gravende journalistikk at 2:56 pm, August 30, 2009

    [...] Guardian.co.uk does online journalism on a different level from all Norwegian online newspapers. Read, for example, about their crowdsourcing experiment at Niemanlab [...]

     
  69. Offentlighetsprincipen 2.0 - Sydsvenskan.se at 9:16 am, September 5, 2009

    [...] is available here until 18 August (skip to 06.50). • The Guardian’s collaborative receipt check, and Nieman’s commentary. • Wired interviews the minister who wants to open up the USA’s archives. • The democracy site Riket.se will [...]

     
  70. hangingtogether.org » Blog Archive » Crowdsourcing Lessons at 11:59 pm, September 14, 2009

    [...] and creating an effective crowdsourcing environment are two very different things. And this article, from the Nieman Journalism Lab, describes lessons from the Guardian newspaper in the UK that [...]

     
  71. The web as a social landscape « Netcultures at 1:19 pm, September 17, 2009

    [...] political scrutiny, as in the british MPs expenses scandal (1, 2) [...]

     
  72. The Third Bit » Blog Archive » Applications and Data Sets at 9:06 pm, September 23, 2009

    [...] http://mps-expenses.guardian.co.uk/: The Guardian broke one of the biggest stories of the year in the UK by crowdsourcing the reading of MPs’ expense reports. They had an enormous public dataset of scanned documents, and they essentially made a game out of it to involve the community. As a result, a really important piece of investigative journalism was fielded out, and everyone benefited. As a cool aside, the software was developed by one of the co-creators of Django, Simon Willison, over the course of a single week (http://www.niemanlab.org/2009/06/four-crowdsourcing-lessons-from-the-guardians-spectacular-expenses-…). [...]

     
  73. The Chapter: News Media on Twitter « PC’s Pixels and Posts at 7:57 pm, September 24, 2009

    [...] do you read 170,000 documents in 80 hours? Outsourcing or rather: Crowdsourcing. Uncovering a series of scandals about the expenses of [...]

     
  74. Crowdsourcing My Crowdsourcing Blog Post? « The Ministry of Charles at 1:11 am, October 7, 2009

    [...] – but I can’t help myself from jumping a bit ahead in the syllabus and talking about the crowdsourcing technique used by The Guardian. I always found it reassuring that [...]

     
  75. Journalism: Not Dead After All! « MediaChao at 11:18 am, October 7, 2009

    [...] reading examined different models of journalism in the digital age.   The Rosenberg, Michel, and Andersen pieces really changed my ideas about public interest in information.  I used to think that most [...]

     
  76. The Orange Chair » A textbook example of how to crowdsource, provided by guardian.co.uk at 1:47 pm, October 9, 2009

    [...] Nieman Journalism Lab has provided a great example of how to crowdsource, courtesy of the Guardian.  [...]

     
  77. Truth, Taste and Talent Online « Ida Noa at 2:33 pm, October 12, 2009

    [...] OffTheBus project and their hyperactive ‘pro-am journalism’ for the 2008 US election, or the Guardian’s swift move to get their audience to help them sift through the extensive amounts of documents related to the [...]

     
  78. let the readers do the job « Pimtim’s Blog at 3:47 am, October 15, 2009

    [...] here is the full article: niemanlab.org [...]

     
  79. A look back at Guardian crowdsourcing project | The Evolving Newsroom at 8:01 pm, October 26, 2009

    [...] Labs wrote a great post about ‘four lessons’ learned from the experiment, which was turned around very quickly.  They talked to developer Simon [...]

     
  80. eBriefings.ca » Lessons from the Guardian’s Crowdsourcing Experiment at 8:02 pm, October 27, 2009

    [...] Labs recently wrote about four crowdsourcing lessons learned from the experiment. In conversation with the developer, Simon Willison, here are the four big lessons from the [...]

     
  81. Mein Social Circle und ich | lab at 3:16 pm, October 28, 2009

    [...] enters into dialogue. This qualification and steady trust make possible, for example, the Guardian’s crowdsourcing projects. Relationships of trust are a fragile asset, and cycles on the net are short (take a [...]

     
  82. links for 2009-11-09 « Sameer Padania at 2:03 am, November 10, 2009

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment » Nieman… Shows how constraints – resources, time, technology – can serve a mind-focusing function, and create an elegant way of facilitating genuine participation in journalism. (tags: journalism Parliament crowdsourcing) [...]

     
  83. Netflix’s smart crowdsourcing initiative at 4:32 pm, November 19, 2009

    [...] might not have been achieved, but it’s still a great example of crowdsourcing, and there are other examples of so-called “Open” business models (Innocentive jumps to [...]

     
  84. Time.com and Crowdsourcing | Melissa Apter at 7:08 pm, November 22, 2009

    [...] is aware of the phenomenon known as “crowdsourcing” because they have several blog posts on their site that make reference to crowdsourcing (I [...]

     
  85. Ten Minutes with Carl Esposti – Industry Expert « blog.devongroup.com at 8:06 am, November 30, 2009

    [...] applications around the world of creativity. That breaks down to things like photography, journalism, travel, and graphic design. There are also social applications of crowdsourcing where people are [...]

     
  86. Will collaborative, user-driven journalism reshape reporting? « Explorations in New Media from the Schieffer School of Journalism at TCU at 5:08 pm, December 8, 2009

    [...] they posted them online and asked users to weigh in with what they found. The Guardian of London wrote a computer program to handle a similar situation with a giant set of public documents this [...]

     
  87. 2009’s Most Influential Media About Media « John Bracken at 3:36 pm, December 24, 2009

    [...] crowd-sourced analysis of MP expense reports, and wants to know more. ”There were four lessons written up soon after it went live, but I would like to hear at least one more insider report now [...]

     
  88. 2009’s Most Successful Crowdsourcing Projects « Skillocracy: The Blog at 12:35 am, January 14, 2010

    [...] particular story is well covered in this excellent post by Michael Andersen. Check it out!  It includes the major points as explained by the project’s creator. His main [...]

     
  89. A new vision for blogging, and content-based policy crowdsourcing « Poblish Blog at 4:36 am, January 28, 2010

    [...] doing things that are just too tricky for a computer to do. The Guardian’s recent, and very successful, crowdsourced MP’s expenses exercise is a good example of this. Provide users with an [...]

     
  90. Guardian crowdsourcer politikernes middagsregninger – Nedrelid.com at 9:10 am, February 4, 2010

    [...] Today the scandals around British politicians’ expense reimbursements are being dealt with in the House of Commons. The scandal was, as is well known, uncovered by an investigative team at The Telegraph in December 2009, but within a few days was hijacked, if you will, by The Guardian, which made outstanding use of crowdsourcing to go through the enormous volume of documents. In connection with the hearings, the politicians’ dinner bills have been released, and the Guardian is once again asking for readers’ help. Nieman Labs has written a fine piece on the Guardian’s methodology. [...]

     
  91. Crowdsourcing reporting « p5 at 11:55 pm, February 6, 2010

    [...] a comment » The Nieman Journalism Lab had a fascinating blog entry a little while ago about the Guardian’s crowdsourcing their investigative reporting of the MP [...]

     
  92. A Few Goodies | Journoblog.com at 5:19 am, February 8, 2010

    [...] being carried out in the modern world. Well worth a look. Lastly, if you like things a bit techy, here is an example of one of those methods, crowdsourcing, and how The Guardian used it to compete with The Telegraph [...]

     
  93. 2009’s Most Successful Crowdsourcing Projects | Skillocracy: The Blog at 5:15 am, February 16, 2010

    [...] particular story is well covered in this excellent post by Michael Andersen. Check it out!  It includes the major points as explained by the project’s creator. His main [...]

     
  94. Franco Piccato » Periodismo y sabiduría de las multitudes at 8:29 pm, February 21, 2010

    [...] public impact. And servers that held up well under a surge in participation. More at Nieman Journalism Lab (in [...]

     
  95. Using the Guardian as inspiration « Ephemeral digest at 7:55 am, March 14, 2010

    [...] mostly from American newspapers and one of the most interesting articles I came across was by the Nieman Journalism Lab. The project is “a collaborative attempt to figure out how quality journalism can survive and [...]

     
  96. zitiert #9: Journalisten sind Stahlarbeiter? « Kommander Kaufmann at 11:51 am, March 28, 2010

    [...] But that is only the beginning. At the moment the most can probably be learned from the Guardian, which has already gone a step ahead in mass crowdsourcing. Projects like Wikileaks also go, to my mind, in this [...]

     
  97. Guardian plans to expand open data tools at 1:04 pm, May 18, 2010

    [...] June 2009, The Guardian hired a programmer for a week and built a portal to distribute more than 400,000 government documents of MPs’ expenses. The [...]

     
  98. Internet Strategy for News Organisations » Session 3: The people formerly known as the audience at 7:04 am, June 6, 2010

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment, Michael Andersen, Nieman Journalism Lab [...]

     
  99. Links for July 5th to July 6th | at 7:12 pm, July 14, 2010

    [...] Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment »… – Okay, question time: Imagine you’re a major national newspaper whose crosstown arch-rival has somehow obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets. How do you catch up? If you’re the Guardian of London, you wait for the associated public-records dump, shovel it all on your Web site next to a simple feedback interface and enlist more than 20,000 volunteers to help you find the needles in the haystack. [...]

     
  100. Paolo blog: Ramblings on Web2.0, Trust, Reputation, Recommender Systems, Social Software, Free Software, ICT4D and much more » Blog Archive » Tidbits from Wikipedia presentation at Wikysym by Andrew Lih “What Hath Wikipedia Wrought: Crow at 9:21 am, August 3, 2010

    [...] 91: An experiment by The Guardian on crowdsourcing journalism. The Guardian obtained two million pages of explosive documents that outed your country’s biggest [...]

     


Check out these related posts