Nov. 22, 2011, 11 a.m.

Bull beware: Truth goggles sniff out suspicious sentences in news

A graduate student at the MIT Media Lab is writing software that can highlight false claims in articles, just like spell check.

You’re reading a wrap-up of the Sept. 22 Republican presidential debate when you land on this claim from Rep. Michele Bachmann: “President Obama has the lowest public approval ratings of any president in modern times.”

Really? You start googling for evidence. Maybe you scour the blogs or the fact-checking sites. It takes work, all that critical thinking.

That’s why Dan Schultz, a graduate student at the MIT Media Lab (and newly named Knight-Mozilla fellow for 2012), is devoting his thesis to automatic bullshit detection. Schultz is building what he calls truth goggles — not actual magical eyewear, alas, but software that flags suspicious claims in news articles and helps readers determine their truthiness. It’s possible because of a novel arrangement: Schultz struck a deal with fact-checker PolitiFact for access to its private APIs.

If you had the truth goggles installed and came across Bachmann’s debate claim, the suspicious sentence might be highlighted. You would see right away that the congresswoman’s pants were on fire. And you could explore the data to discover that Bachmann, in fact, wears some of the more flammable pants in politics.

“I’m very interested in looking at ways to trigger people’s critical abilities so they think a little bit harder about what they’re reading…before adopting it into their worldview,” Schultz told me. It’s not that the truth isn’t out there, he says — it’s that it should be easier to find. He wants to embed critical thinking into news the way we embed photos and video today: “I want to bridge the gap between the corpus of facts and the actual media consumption experience.”

Imagine the possibilities, not just for news consumers but producers. Enhanced spell check for journalists! A suspicious sentence is underlined, offering more factual alternatives. Or maybe Clippy chimes in: “It looks like you’re lying to your readers!” The software could even be extended to email clients to debunk those chain letters from your crazy uncle in Florida.

Schultz is careful to clarify: His software is not designed to distinguish lies from truth on its own. That remains primarily the province of real humans. The software is being designed to detect words and phrases that show up in PolitiFact’s database, relying on PolitiFact’s researchers for the truth-telling. “It’s not just deciding what’s bullshit. It’s deciding what has been judged,” he said. “In other words, it’s picking out things that somebody identified as being potentially dubious.”

That means the software might flag a Bachmann claim from another debate — “Our government right now — this is significant — we are spending 40 percent more than what we take in” — and mark it as true. PolitiFact had investigated that claim and the claim checked out.

Things get trickier when a claim is not a word-for-word match. For example, the reporter paraphrases: “Our government right now…[is] spending 40 percent more than what we take in,” Bachmann said. Or: Bachmann said government spending is 40 percent higher than revenue. It’s not easy for computers to understand the nuances of language the way we do.
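To see why paraphrases are hard, here is a minimal sketch of phrase matching against a list of checked claims. It is not Schultz’s implementation and does not use PolitiFact’s API: the `FACT_CHECKS` list, the word-overlap score, and the 0.5 threshold are all assumptions made for illustration. The two claims and verdicts are the ones mentioned in this article.

```python
import re

# Hypothetical stand-in for a fact-check database. The claims and verdicts
# come from this article; the structure is invented for illustration.
FACT_CHECKS = [
    ("President Obama has the lowest public approval ratings of any "
     "president in modern times.", "Pants on Fire"),
    ("Our government right now, this is significant, we are spending "
     "40 percent more than what we take in", "True"),
]


def tokens(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a, b):
    """Jaccard overlap between the word sets of two sentences (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def flag(sentence, threshold=0.5):
    """Return (verdict, score) for the best-matching checked claim, or None.

    The threshold is an arbitrary assumption, not a tuned value.
    """
    best = max(FACT_CHECKS, key=lambda fc: similarity(sentence, fc[0]))
    score = similarity(sentence, best[0])
    return (best[1], score) if score >= threshold else None


close = "Our government right now is spending 40 percent more than what we take in"
loose = "Bachmann said government spending is 40 percent higher than revenue."

print(flag(close))  # the near-verbatim quote clears the threshold
print(flag(loose))  # the looser paraphrase does not, so nothing is flagged
```

Under these assumptions the near-verbatim quote is matched to the checked claim, while the looser paraphrase shares too few words to be flagged. That gap is the problem the language-processing work described next is meant to narrow.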

(An adviser at the Center for Civic Media, Ethan Zuckerman, is wrestling with the same ideas for his more meta-news literacy project, MediaRDI, which would stick nutritional labels on the news.)

Schultz’s work explores natural language processing, in which computers learn to talk the way we do. If you’ve ever met Siri, you’ve experienced NLP. Schultz’s colleagues at the Media Lab invented Luminoso, a tool for what the Lab calls “common sense computing.” The Luminoso database is loaded with simple descriptions of things: “Millions and millions of things…Like, ‘Food is eaten’ or ‘Bananas are fruit.’ Stuff like that, where a human knows it, but a computer doesn’t. You’re taking language and turning it into mathematical space. And through that you can find associations that wouldn’t just come out of looking at words as individual items but understanding words as interconnected objects.

“Knowing that something has four legs and fur, and knowing that a dog is an animal, a dog has four legs, and a dog has fur, might help you realize that, from a word you’ve never seen before, that it is an animal. So you can build these associations from common sense. Which is how humans, arguably, come to their own conclusions about things.”
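The idea in that quote can be sketched in a few lines. This is a toy illustration, not how Luminoso works: the `KNOWN` assertions, the choice of cosine similarity over binary features, and the `guess_category` helper are all hypothetical, built only from the examples quoted above.

```python
import math

# Hypothetical common-sense assertions, taken from the examples in the quote.
KNOWN = {
    "dog": {"has four legs", "has fur", "is an animal"},
    "banana": {"is eaten", "is a fruit"},
}

CATEGORIES = ("is an animal", "is a fruit")


def cosine(a, b):
    """Cosine similarity between two sets treated as binary feature vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def guess_category(features):
    """Guess a category for an unseen word from the features known about it.

    Each category is scored by how similar the unseen word's features are
    to the non-category features of known concepts in that category.
    """
    scores = {c: 0.0 for c in CATEGORIES}
    for facts in KNOWN.values():
        sim = cosine(features, facts - set(CATEGORIES))
        for c in CATEGORIES:
            if c in facts:
                scores[c] += sim
    return max(scores, key=scores.get)


# A word never seen before, described only by two features.
print(guess_category({"has four legs", "has fur"}))  # is an animal
```

The unseen word lands next to “dog” in this small feature space, so the sketch guesses it is an animal, which is the kind of association the quote describes.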

Open-source versus for-profit

Schultz’s truth goggles will be made open-source once finished next year. PolitiFact, of course, is not open-source; it’s a business still trying to figure out how to monetize its data, said editor Bill Adair.

“Whether we’re included or not will be a decision we’ll make down the road,” Adair told me. “I think what he’s going to ultimately come up with is going to benefit all fact-checking news organizations, so I think we’ll be happy to be part of that. The goal is to get more accurate journalism in front of more people….My goal is not to get people to stop lying. I still believe strongly that the role of the journalists is to inform democracy and let people make decisions about their leaders.”

But even the strongest declaration of truth or falsehood can still spark dissent. It’s beyond the scope of the current project, but Schultz’s truth goggles would be stronger if they could draw from multiple sources. There could be specialty fact-checking sources for physics, or psychology. Or maybe Snopes.com could open up its data with an API.

More sources “would help people break away from their filter bubble. They would be exposed to opinions they hadn’t seen before,” Schultz said. “The ultimate goal is to enable intelligent conversations about contentious issues.”
