Nov. 22, 2011, 11 a.m.

Bull beware: Truth goggles sniff out suspicious sentences in news

A graduate student at the MIT Media Lab is writing software that can highlight false claims in articles, just like spell check.

You’re reading a wrap-up of the Sept. 22 Republican presidential debate when you land on this claim from Rep. Michele Bachmann: “President Obama has the lowest public approval ratings of any president in modern times.”

Really? You start googling for evidence. Maybe you scour the blogs or the fact-checking sites. It takes work, all that critical thinking.

That’s why Dan Schultz, a graduate student at the MIT Media Lab (and newly named Knight-Mozilla fellow for 2012), is devoting his thesis to automatic bullshit detection. Schultz is building what he calls truth goggles — not actual magical eyewear, alas, but software that flags suspicious claims in news articles and helps readers determine their truthiness. It’s possible because of a novel arrangement: Schultz struck a deal with fact-checker PolitiFact for access to its private APIs.

If you had the truth goggles installed and came across Bachmann’s debate claim, the suspicious sentence might be highlighted. You would see right away that the congresswoman’s pants were on fire. And you could explore the data to discover that Bachmann, in fact, wears some of the more flammable pants in politics.

“I’m very interested in looking at ways to trigger people’s critical abilities so they think a little bit harder about what they’re reading…before adopting it into their worldview,” Schultz told me. It’s not that the truth isn’t out there, he says — it’s that it should be easier to find. He wants to embed critical thinking into news the way we embed photos and video today: “I want to bridge the gap between the corpus of facts and the actual media consumption experience.”

Imagine the possibilities, not just for news consumers but producers. Enhanced spell check for journalists! A suspicious sentence is underlined, offering more factual alternatives. Or maybe Clippy chimes in: “It looks like you’re lying to your readers!” The software could even be extended to email clients to debunk those chain letters from your crazy uncle in Florida.

Schultz is careful to clarify: His software is not designed to distinguish lies from truth on its own. That remains primarily the province of real humans. The software is being designed to detect words and phrases that show up in PolitiFact’s database, relying on PolitiFact’s researchers for the truth-telling. “It’s not just deciding what’s bullshit. It’s deciding what has been judged,” he said. “In other words, it’s picking out things that somebody identified as being potentially dubious.”

That means the software might flag a Bachmann claim from another debate — “Our government right now — this is significant — we are spending 40 percent more than what we take in” — and mark it as true. PolitiFact had investigated that claim and the claim checked out.
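The matching step described above can be sketched roughly as follows. This is only an illustration, not Schultz's code or PolitiFact's API (which is private): the `JUDGED_CLAIMS` dictionary is a hypothetical local stand-in for a database of judged claims, and the function names are invented for the example. The two claims and their rulings are the ones mentioned in this article.

```python
import re

# Hypothetical stand-in for a database of already-judged claims.
# The claims and rulings come from the article; the structure is an assumption.
JUDGED_CLAIMS = {
    "President Obama has the lowest public approval ratings "
    "of any president in modern times": "Pants on Fire",
    "we are spending 40 percent more than what we take in": "True",
}

def normalize(text):
    """Lowercase the text and keep only words and numbers, separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def flag_claims(article_text):
    """Return (claim, ruling) pairs for judged claims that appear verbatim in the text."""
    haystack = f" {normalize(article_text)} "
    found = []
    for claim, ruling in JUDGED_CLAIMS.items():
        # Pad with spaces so a claim only matches on whole-word boundaries.
        if f" {normalize(claim)} " in haystack:
            found.append((claim, ruling))
    return found
```

Because the lookup only finds wording that someone has already judged, it can surface a ruling of "True" as readily as a false one.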

Things get trickier when a claim is not a word-for-word match. For example, a reporter might paraphrase: “Our government right now…[is] spending 40 percent more than what we take in,” Bachmann said. Or: Bachmann said government spending is 40 percent higher than revenue. It’s not easy for computers to understand the nuances of language the way we do.

(An adviser at the Center for Civic Media, Ethan Zuckerman, is wrestling with the same ideas for his more meta news-literacy project, MediaRDI, which would stick nutritional labels on the news.)

Schultz’s work explores natural language processing, in which computers learn to talk the way we do. If you’ve ever met Siri, you’ve experienced NLP. Schultz’s colleagues at the Media Lab invented Luminoso, a tool for what the Lab calls “common sense computing.” The Luminoso database is loaded with simple descriptions of things: “Millions and millions of things…Like, ‘Food is eaten’ or ‘Bananas are fruit.’ Stuff like that, where a human knows it, but a computer doesn’t. You’re taking language and turning it into mathematical space. And through that you can find associations that wouldn’t just come out of looking at words as individual items but understanding words as interconnected objects.

“Knowing that something has four legs and fur, and knowing that a dog is an animal, a dog has four legs, and a dog has fur, might help you realize that, from a word you’ve never seen before, that it is an animal. So you can build these associations from common sense. Which is how humans, arguably, come to their own conclusions about things.”
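The idea of turning common-sense statements into "mathematical space" can be illustrated with a toy example. This is not how Luminoso works internally; it is a minimal sketch that assumes each concept is a vector of yes/no statements and compares vectors by cosine similarity. The feature list, the `KNOWN` table and the function names are all invented for the example.

```python
import math

# Each position in a vector answers one common-sense statement (1 = yes, 0 = no).
FEATURES = ["has four legs", "has fur", "is eaten", "is a fruit"]

# Toy knowledge base: concept -> (feature vector, category). Purely illustrative.
KNOWN = {
    "dog":    ([1, 1, 0, 0], "animal"),
    "banana": ([0, 0, 1, 1], "food"),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def guess_category(vector):
    """Guess the category of an unseen word from its most similar known concept."""
    best = max(KNOWN, key=lambda name: cosine(vector, KNOWN[name][0]))
    return KNOWN[best][1]

# A word never seen before, known only to have four legs and fur.
print(guess_category([1, 1, 0, 0]))  # animal
```

A real system would learn its vectors from millions of statements rather than a hand-written table, but the principle is the same: words become points, and nearby points suggest shared meaning.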

Open-source versus for-profit

Schultz’s truth goggles will be made open-source once finished next year. PolitiFact, of course, is not open-source; it’s a business still trying to figure out how to monetize its data, said editor Bill Adair.

“Whether we’re included or not will be a decision we’ll make down the road,” Adair told me. “I think what he’s going to ultimately come up with is going to benefit all fact-checking news organizations, so I think we’ll be happy to be part of that. The goal is to get more accurate journalism in front of more people….My goal is not to get people to stop lying. I still believe strongly that the role of the journalists is to inform democracy and let people make decisions about their leaders.”

But even the strongest declaration of truth or falsehood can still spark dissent. It’s beyond the scope of the current project, but Schultz’s truth goggles would be stronger if they could draw from multiple sources. There could be specialty fact-checking sources for physics, or psychology. Or maybe Snopes.com could open up its data with an API.

More sources “would help people break away from their filter bubble. They would be exposed to opinions they hadn’t seen before,” Schultz said. “The ultimate goal is to enable intelligent conversations about contentious issues.”
