Nieman Foundation at Harvard
Nov. 6, 2017, 10:23 a.m.
Reporting & Production

Even automating just parts of journalists’ fact-checking efforts can speed up their work. Is this the right tool?

A Greek university student is trying to build an algorithm to help journalists, fact-checkers, and regular news consumers assess the trustworthiness of an article before it spreads widely online.

Valentine Tzekas began developing FightHoax in late 2016, as anxieties around online misinformation took hold in the months after the November U.S. elections. He was particularly concerned that fake news might impact people’s voting preferences.

“People today need to know how to identify whether what they read or what they listen to is real or fake,” he said, suggesting boldly that “FightHoax aims at taking that fact-checking process into automation.”

FightHoax is still in private beta and so far English-only. Tzekas said he has now grown it to a team of three and is currently working on a closed B2B sales model, with freemium features for individual users forthcoming. The tool has apparently turned a few heads in the region: According to Tzekas, he’s in talks with organizations such as Greece’s Athens-Macedonian News Agency to test out the tool for news verification within its English-speaking department; a few other major European newsrooms have reached out as well. But how does it actually work, and how well?

FightHoax classifies ingested links to articles as either “Trusted,” “Hoax,” or “Mixed” based on seven different criteria, ranging from language quality to author’s publication history. When a link is parsed, the algorithm runs a Google search to scour for similar stories in content published elsewhere online. It then scrapes and compares the text of the article in question and of similar published articles, using its core criteria:

— Quality of the writing: five different functions look for grammar and syntax errors, vocabulary variety, and the overall formality of the language used
— Whether the title matches the story and whether it uses clickbaity language (e.g., “you will never believe what happened”)
— Polarized/hyperpartisan language
— Length: The algorithm treats longer stories, of around 800 to 2,000 words, as more trustworthy. It also assesses ‘quality in quantity’ (such as whether an article contains quotes and links out to reputable sources)
— Whether it’s a source that’s been previously named for publishing false or misleading stories
— Political leaning: It uses classifications drawn partly from, as well as a custom-built model for sites not included in that list
— The article author’s publication history
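To make the idea of combining these criteria concrete, here is a minimal sketch in Python. It is not FightHoax's code: the signal names, weights, thresholds, and the clickbait phrase list are all assumptions invented for illustration, since the tool's actual scoring is not public. Political leaning is left out because the description above does not say how it feeds into the final label.

```python
from dataclasses import dataclass

# Hypothetical phrase list; the article gives only one example phrase.
CLICKBAIT_PHRASES = ("you will never believe",)


@dataclass
class ArticleSignals:
    """Illustrative per-article signals (all names are assumptions)."""
    writing_quality: float       # 0.0 (poor) to 1.0 (clean and formal)
    title_matches_body: bool
    title: str
    hyperpartisan: float         # 0.0 (neutral) to 1.0 (highly polarized)
    word_count: int
    has_quotes_and_links: bool   # the 'quality in quantity' check
    source_flagged: bool         # source previously named for false stories
    author_track_record: float   # 0.0 (unknown or poor) to 1.0 (established)


def is_clickbait(title: str) -> bool:
    """Naive phrase match; a real system would use a trained model."""
    lowered = title.lower()
    return any(phrase in lowered for phrase in CLICKBAIT_PHRASES)


def trust_score(s: ArticleSignals) -> float:
    """Weighted sum in [0, 1]. The weights are made up and sum to 1.0."""
    title_ok = s.title_matches_body and not is_clickbait(s.title)
    score = 0.25 * s.writing_quality
    score += 0.15 * (1.0 if title_ok else 0.0)
    score += 0.15 * (1.0 - s.hyperpartisan)
    score += 0.10 * (1.0 if 800 <= s.word_count <= 2000 else 0.0)
    score += 0.05 * (1.0 if s.has_quotes_and_links else 0.0)
    score += 0.15 * (0.0 if s.source_flagged else 1.0)
    score += 0.15 * s.author_track_record
    return score


def classify(s: ArticleSignals, hoax_below: float = 0.4,
             trusted_above: float = 0.7) -> str:
    """Map the score to a label using hypothetical thresholds."""
    score = trust_score(s)
    if score >= trusted_above:
        return "Trusted"
    if score < hoax_below:
        return "Hoax"
    return "Mixed"
```

A weighted sum is only one of many ways to merge signals. It is used here because it keeps each criterion's contribution visible, which is closer to the "present the data, don't just issue a verdict" approach that fact-checkers tend to prefer.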

The algorithm also detects whether recognized fact-checking outlets like Snopes have already debunked a story. Finally, FightHoax looks out for opinion pieces by recognizing keywords and phrases (for instance, if a URL slug contains “column”) and won’t classify such pieces. (For those interested: FightHoax is built on PHP and Python. For text and author extraction, the algorithm uses IBM Watson APIs; for similarity comparison, the Dandelion API. For clickbait title identification, it uses machine learning models in Python. Also check out the winners of the Fake News Challenge, which set out to “explore how artificial intelligence technologies, particularly machine learning and natural language processing, might be leveraged to combat the fake news problem.”)
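The opinion-piece check could look something like the following Python sketch. Only the keyword "column" comes from Tzekas's description; the rest of the keyword set and the function name are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Only "column" is mentioned in the article; the other keywords are guesses.
OPINION_KEYWORDS = {"column", "columns", "opinion", "opinions",
                    "editorial", "commentary"}


def looks_like_opinion(url: str) -> bool:
    """Return True if any token in the URL path is an opinion keyword."""
    path = urlparse(url).path.lower()
    # Split the path on anything that is not a letter or digit, so that
    # "weekly-column-on-taxes" yields the token "column".
    tokens = set(re.split(r"[^a-z0-9]+", path))
    return not tokens.isdisjoint(OPINION_KEYWORDS)
```

Matching whole tokens rather than substrings avoids flagging a slug such as "columbus-council-vote" just because it begins with the same letters as "column". It also means that opinion pieces published without a telltale slug slip through, which is consistent with the misses described later in this story.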

I received access earlier this year to test the first version of FightHoax for myself, and my (very mixed) results are included at the end of this story. Tzekas said he will soon release a beta 2.0 version of FightHoax, with improvements and additional features such as reading time estimates, social shares of news topics, charts to analyze the emotional valence of the parsed article, and more.

FightHoax attempts to perform text analyses of news articles. What it offers is detection, not the actual fact-checking of claims that its promotional materials advertise. It can’t refute or contextualize specific statements by drawing from a comprehensive database of existing fact-checks, which is what newer automated efforts like Full Fact or Factmata are working toward. The Argentinian fact-checking outlet Chequeado is also working on a beta version of automated fact-checking software that will evaluate Spanish-language claims by comparing them to similar, previously made claims already available in its large database.

“With the technologies we have right now, it’s counterproductive to concentrate on a ‘silver bullet’ of fact-checking everything from start to finish. We can benefit a lot more from looking at what specific steps of the fact-checking process could be automated,” Alexios Mantzarlis, director of Poynter’s International Fact-Checking Network, told me earlier this year. There are, he said, a number of promising efforts in the prototype stage. “A full story is hard to analyze — there’s nuance to what is going on with a story. It’s hard to do. It’d be more helpful [for startups] to present the data they have collected [about a story] following a set of rules, and not make a sweeping and automated conclusion about it.”

I also asked Rui Miguel Forte, a data science consultant and chief technology officer at Vidpulse, a Greek technology startup, for an assessment of how far natural language processing technologies have progressed. He wrote:

Our brain uses a lot of other [than syntax, grammar, word definitions] knowledge about the world — the context in which the speaker spoke [a] sentence and the speaker themselves — all of which is incredibly hard to codify, store and make use of in an algorithm. Though we are making great strides and are farther down the line than just a few years ago, we have a lot to do still.

NLP systems are still to a very large extent being developed to perform well on a specific task, language and context. A real world application might involve putting together a number of different NLP systems to process text, each of which will have its own individual accuracy that impacts the overall accuracy of the system. The more complex the task the lower the accuracy.
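Forte's point about chained systems can be illustrated with a bit of arithmetic. The Python sketch below assumes that the stages make errors independently and that the final result is only right when every stage is right; both are simplifications, and the accuracy figures are invented, not measurements of any real system.

```python
from math import prod


def pipeline_accuracy(stage_accuracies):
    """Upper-level accuracy of a chain where every stage must be correct.

    Assumes independent errors, so the accuracies simply multiply.
    """
    return prod(stage_accuracies)


# Invented figures for three chained stages, e.g. text extraction,
# similarity matching, and clickbait detection.
stages = [0.95, 0.90, 0.90]
print(f"{pipeline_accuracy(stages):.2%}")
```

Under these assumptions, three stages that are each at least 90 percent accurate combine to a system that is right only about 77 percent of the time, and every additional stage pushes the figure lower.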

Tzekas emphasized that with FightHoax, he’s not looking to reinvent the wheel — he’s trying to incorporate existing technologies into his detection algorithm, to help automate at least some of fact-checkers’ tasks. FightHoax “might not be 100% perfect” right now, he said, but he’s still making tweaks every day.

The FightHoax homepage shared a spreadsheet that shows the algorithm has 89 percent accuracy when tested on a list of 172 articles, taken from what’s already been debunked or fact-checked by the outlet Snopes.
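For reference, 89 percent of 172 articles works out to roughly 153 correct classifications. A single accuracy figure hides which kinds of mistakes were made, so a test like this is more informative when the errors are tallied as well. The Python sketch below shows one way to do that; the labels in the usage example are toy data invented for illustration, not FightHoax output.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Share of items whose predicted label equals the known label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one labeled item")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def error_breakdown(predicted, actual):
    """Count each (actual, predicted) pair where the labels disagree."""
    return Counter((a, p) for p, a in zip(predicted, actual) if p != a)


# Toy data, invented for illustration only.
actual = ["Hoax", "Trusted", "Trusted", "Hoax"]
predicted = ["Hoax", "Hoax", "Trusted", "Mixed"]
print(accuracy(predicted, actual))
print(error_breakdown(predicted, actual))
```

In the toy data, one of the errors is a true story labeled "Hoax", the kind of mistake that matters most to a newsroom and that a headline accuracy number would not reveal.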

I got access to a FightHoax beta earlier this year and ran a quick test to check the algorithm’s performance. For that test, I used two lists of news items: one from BuzzFeed’s Craig Silverman, with items that had already been flagged as fake or accurate, and one that I compiled myself, with items published on the day of my test. (My methodology and overall results are available on GitHub here.)

Overall, FightHoax performed with decent accuracy on news items that had been proved fake, but it was inconsistent on news published on the day of the test. It tagged some true stories as “Hoax,” tagged the source of a story as “Trusted” but the story itself as “Hoax,” and failed to recognize a few opinion pieces, processing them as news items instead. To be fair, I found some articles difficult to classify manually myself. (See also: the nuance of the ratings fact-checkers give to their fact-checks.)

Tzekas told me he’s already aware of all these issues, and suggested they might be due to Watson’s API extracting reader comments because of poor website architecture on the publisher end; finding too few similar articles published elsewhere online for the algorithm to compare against; and processing shorter, copy-and-pasted versions of long stories that might have omitted key sections.

Photo illustration based on a Stuart Rankin vector of an illustration from an 1894 issue of Puck.
