Nieman Foundation at Harvard
Nov. 3, 2010, 10 a.m.

It’s people! Meet Soylent, the crowdsourced copy editor

The phrase “on-demand human computation” has a sinister tinge to it, if only because the idea of sucking the brain power out of a group of people is generally frowned upon. And yet, if you call it “crowdsourcing” everything sounds so much friendlier!

But calling Soylent “crowdsourced copy-editing” isn’t quite fair, since the system performs the type of jobs that are somewhere in the gray area between man and machine. More than a spell check, not quite the nightside copy editor versed in AP style, Soylent really is on-demand computation. It’s what all word processors need, the “Can you take a look at this?” button with a small workforce of people at your disposal.

Soylent is an add-in for Microsoft Word that uses Mechanical Turk as a distributed copy-editing system to perform tasks like proofreading and text-shortening, as well as a type of specialized edit its developers call “The Human Macro.” Currently in closed beta, Soylent was created by compsci students at MIT, Berkeley, and the University of Michigan.

For those unfamiliar, Mechanical Turk is an Amazon service that makes it easier for small tasks (and the money to pay for them) to be distributed among a group of humans called Turkers. While savvy writers could already use MTurk to edit their work, the team at Soylent believes their system can produce better and more efficient results than would a writer working alone.

“The idea of Soylent is, what if we could embed human knowledge in the word processor?” MIT’s Michael Bernstein, the lead researcher on Soylent, told me.

That sounds technical, but as Bernstein explains, we all call on friends for help when writing. Research paper, essay, email, story, or blog post — most people rely on a second pair of eyeballs for help at least some of the time. And one thing Mechanical Turk has to offer is a lot of eyeballs.

Soylent’s three current features are called Shortn, Crowdproof, and the Human Macro:

Shortn: Ever write 1,700 words and blow right past your 1,200 word count? Shortn lets writers submit passages of text to MTurk for trimming. They can determine how much they want to cut with a handy slider tool.

Crowdproof: A superpowered, sophisticated spelling, grammar and style check that provides suggestions as well as explanations of why your choices are wrong.

The Human Macro: For more complicated changes — something like “change all verbs to past tense” — the Human Macro is, as Bernstein says, programming-as-craigslist-ad. The writer describes the changes she wants (capitalization of proper names, altering verb tense, annotating references with Creative Commons photos) in a request form, which humans then act on.

Bernstein argues that Soylent’s cold, detached eye is just what some writing needs. “It’s really hard to kill your own babies in your writing,” Bernstein said. “To be honest, another motivation for me is that it’s very time consuming to go and snip words and cut things from paragraphs an hour before deadline.”

But to writers already nervous about those babies being disappeared on the copy desk, handing over their copy to the faceless masses might not sound like a solution. In their research, Bernstein and his colleagues identified “lazy” and “overeager” individual Turkers, with the lazy ones doing the minimal amount of work and the overeager making wholesale changes. Bernstein said the distributed editing process behind Soylent eliminates this problem because no one Turker is working with whole passages of a document; the work is split among many.

Some in news circles are already experimenting with Mechanical Turk; ProPublica used it to identify companies getting stimulus dollars for the Recovery Tracker project. (Here at the Lab, we use it for the long transcripts we sometimes run of video or audio interviews.) MTurk could be used for any number of tasks that call for on-demand labor. But what makes Soylent different from using MTurk directly is a programming pattern Bernstein and his colleagues created called Find-Fix-Verify, which disseminates tasks across a large group of workers. The only thing required of writers is an Amazon account to pay Turkers; Soylent sets the payment rates.

Instead of one Turker reading over an entire page or paragraph, Soylent asks a group of workers to find areas that need fixing and make corrections. Other Turkers then screen those fixes for inaccuracies, and the result is a set of recommendations or an edited graph for the writer. Depending on the job and the document, it usually took Soylent around 40 minutes to complete a task.
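For readers who think in code, here is a minimal Python sketch of how a Find-Fix-Verify pass could be structured. It is an illustration under stated assumptions, not Soylent's actual implementation: the workers are ordinary Python functions standing in for Turkers, and the function name `find_fix_verify`, the `min_agreement` threshold, and the majority-vote rule are all hypothetical choices made for this example.

```python
from collections import Counter
from typing import Callable, List

# Hypothetical stand-ins for Turkers: each worker is a plain function.
Finder = Callable[[str], List[str]]     # flags phrases that need work
Fixer = Callable[[str], str]            # proposes a rewrite of one phrase
Verifier = Callable[[str, str], bool]   # approves or rejects (old, new)


def find_fix_verify(text: str,
                    finders: List[Finder],
                    fixers: List[Fixer],
                    verifiers: List[Verifier],
                    min_agreement: int = 2) -> str:
    """Run one Find-Fix-Verify pass over text (illustrative only)."""
    # Find: count how many independent workers flagged each phrase and
    # keep only phrases that at least min_agreement of them agree on.
    # The threshold is an assumption of this sketch.
    flags = Counter()
    for finder in finders:
        for phrase in set(finder(text)):
            flags[phrase] += 1
    agreed = [p for p, n in flags.items() if n >= min_agreement]

    for phrase in agreed:
        if phrase not in text:
            continue
        # Fix: a second group proposes rewrites. A "lazy" worker who
        # returns the phrase unchanged contributes nothing.
        candidates = {fixer(phrase) for fixer in fixers}
        candidates.discard(phrase)

        # Verify: a third group votes on each rewrite. A rewrite needs a
        # strict majority, so an "overeager" wholesale change is dropped.
        best, best_votes = None, 0
        for candidate in sorted(candidates):
            votes = sum(1 for verify in verifiers if verify(phrase, candidate))
            if votes * 2 > len(verifiers) and votes > best_votes:
                best, best_votes = candidate, votes
        if best is not None:
            text = text.replace(phrase, best, 1)
    return text


# Toy run: two finders agree on "teh"; one overeager finder flags more.
finders = [lambda t: ["teh"], lambda t: ["teh"], lambda t: ["The cat"]]
fixers = [lambda p: "the",                                # careful worker
          lambda p: p,                                    # lazy worker
          lambda p: "a completely different sentence"]    # overeager worker
verifiers = [lambda old, new: abs(len(new) - len(old)) <= 2] * 3

print(find_fix_verify("The cat sat on teh mat.", finders, fixers, verifiers))
# prints "The cat sat on the mat."
```

Because each stage is handled by a different group and no single worker's output is trusted on its own, the lazy worker's non-edit and the overeager worker's rewrite both fall away before anything reaches the writer.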

To news traditionalists, Soylent may sound like the latest turn toward outsourcing in journalism that has sent copy editing jobs to places in India. It could also be akin to the automated journalism being tested by some companies or the Huffington Post’s real-time headline testing. And some day it may be. But Soylent is far from ready for the mainstream, thanks to the processing time and payment methods. Bernstein says they’re working towards having real-time edits and managing payment through Soylent, as well as adapting the program to work on photo editing. Instead of outsourcing, think of Soylent as microsourcing.

And about that name: It comes from exactly what you’re thinking. Bernstein said they were looking for something familiar but also true to the idea of what they created. Soylent is made of people. It is, indeed, people.

“The original name was Homunculus,” Bernstein said. “It didn’t have the same ring to it.”
