“We reached the boundaries of automation faster than expected”

LINK: www.cjr.org | Posted by: Laura Hazard Owen | July 5, 2017

A new report out Wednesday from Columbia’s Tow Center for Digital Journalism looks at how automated journalism worked during the 2016 U.S. presidential election. The report was written by Andreas Graefe, who also spearheaded the Tow Center’s Guide to Automated Journalism in 2016.

The project aimed to study the creation of automated news for forecasts of the 2016 U.S. presidential election, based on data from the forecasting platform pollyvote.com. In addition, the resulting texts provided the stimulus material for studying the consumption of automated news for a high-involvement topic that involves uncertainty.

Since 2004, PollyVote has combined forecasts within and across different forecasting methods to come up with popular vote projections for the U.S. presidential elections. (It doesn’t make projections on a state level.) For this project, the Tow Center’s team worked with German software company AX Semantics to develop automated news based on PollyVote’s projections. Over the course of the project, the team published 21,928 news articles in English and German.

Here’s one example:

The yellow highlighting shows data that is simply taken from the raw data and inserted into the text: the name of the poll, the candidate’s actual polling numbers, or other statistics such as the polling period, the sample size, or the margin of error. For a list of all data fields, see the API at pollyvote.com.

The purple highlighting shows fields that are based on calculations with the raw data. For example, the algorithm derives from the data that (a) Clinton is ahead in the poll, (b) she is ahead by 10 points, and (c) this lead is statistically significant. Thus, the algorithm relies on a set of pre-defined rules. For example, the statement of whether a candidate’s lead in the poll is significant is based on whether the candidates’ poll numbers plus/minus the margin of error overlap.

The green fields highlight sample synonyms, which are used to add variety to the text. For example, instead of simply saying “Democrat” Hillary Clinton, the algorithm randomly chooses from a list of synonyms (e.g., “Democratic candidate,” “the candidate of the Democratic party,” etc.), which we formulated as a team. Similarly, instead of using the expression “will vote,” the algorithm could use other expressions such as “intend to vote” or “plan to vote.” Furthermore, the project team wrote several variants for each sentence, of which the algorithm randomly chooses one when generating the text.
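The three kinds of fields described above (raw data, calculated values, and randomly chosen synonyms) can be sketched in a few lines of Python. This is an illustration only, not the code the project team or AX Semantics used: the function names, the dictionary keys (`name`, `clinton`, `trump`, `margin_of_error`), the sentence template, and the sample poll are all assumptions made for the example. Only the synonym lists and the overlap rule for significance come from the report's description.

```python
import random

# Synonym lists quoted in the report's description.
CLINTON_LABELS = [
    "Democrat",
    "Democratic candidate",
    "the candidate of the Democratic party",
]
VOTE_VERBS = ["will vote", "intend to vote", "plan to vote"]


def lead_is_significant(leader_share, trailer_share, margin_of_error):
    """Return True when the two shares, each plus/minus the margin of
    error, do not overlap (the rule described in the report)."""
    return leader_share - margin_of_error > trailer_share + margin_of_error


def describe_poll(poll, rng=random):
    """Build one sentence pair from a poll record.

    `poll` is a hypothetical dict; the real PollyVote API fields may differ.
    """
    clinton, trump = poll["clinton"], poll["trump"]
    label = rng.choice(CLINTON_LABELS)  # random synonym ("green" field)
    verb = rng.choice(VOTE_VERBS)       # random synonym ("green" field)

    # Raw values inserted directly ("yellow" fields).
    sentence = (
        f"In the {poll['name']} poll, {clinton}% of respondents {verb} for "
        f"{label} Hillary Clinton and {trump}% for Donald Trump."
    )
    if clinton == trump:
        return sentence + " The candidates are tied in this poll."

    # Values derived from the raw data ("purple" fields).
    leader = "Clinton" if clinton > trump else "Trump"
    lead = abs(clinton - trump)
    significant = lead_is_significant(
        max(clinton, trump), min(clinton, trump), poll["margin_of_error"]
    )
    qualifier = "statistically significant" if significant else "not statistically significant"
    return f"{sentence} {leader} is ahead by {lead} points, a lead that is {qualifier}."


if __name__ == "__main__":
    sample = {"name": "Example Poll", "clinton": 50, "trump": 40, "margin_of_error": 3}
    print(describe_poll(sample, random.Random(0)))
```

Passing a seeded `random.Random` makes the synonym choice reproducible, which helps when checking generated text for errors.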

Graefe lists a few lessons learned:

— There was a “high” rate of errors per article at first, most commonly “due to errors in the underlying data.” Also, “we underestimated the efforts necessary for quality control and troubleshooting.”

— The algorithm had an easy time with simpler articles, like “the plain description of poll results” above. But:

Adding additional insights often resulted in levels of complexity that were difficult to manage. Examples include the comparison of a poll’s results to results from other polls (or historical elections), or making statements about whether a candidate is trending in the polls.

— The project might have been overly ambitious. “Rather than developing a complex, ‘one-fits-all’ solution, we should have worked on one story type at a time until an acceptable level of quality had been reached.”

— Overall, “we reached the boundaries of automation faster than expected”:

When developing the algorithm’s underlying rules, we constantly faced questions, such as: How should we refer to the margin between candidates in polls? When does a candidate have a momentum? When is there a trend in the data? While such questions might be easy to answer for a human journalist, they are hard to operationalize and to put in pre-defined rules. The reason is that concepts such as lead, trend, or momentum, which are common in traditional campaign coverage, are not well defined and heavily depend on the context of the election. For instance, even for the most basic question of who is ahead in the polls, there are no clear guidelines for how to refer to the actual distance in polling numbers. When is a lead small or large? To come up with a ruleset for this question, we conducted a content analysis of how newspapers report polls, along with an expert survey. Needless to say, we did not have the resources for such a thorough approach for each decision we faced. Thus, many rules were simply formulated on the fly and based on our own judgment.
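The difficulty of turning "small" or "large" into a rule is easier to see in code. The following Python sketch uses made-up cut-offs: the report does not publish the thresholds the team settled on, so the numbers, the phrases, and the `describe_lead` function are hypothetical and serve only to show where the judgment call sits.

```python
# Hypothetical cut-offs in percentage points. These values are NOT from
# the report; they stand in for the thresholds a team would have to choose.
LEAD_RULES = [
    (2.0, "a narrow lead"),
    (6.0, "a moderate lead"),
]
DEFAULT_PHRASE = "a large lead"


def describe_lead(margin_points, rules=LEAD_RULES):
    """Map a positive polling margin to a phrase using ordered thresholds."""
    if margin_points <= 0:
        raise ValueError("margin_points must be positive")
    for upper_bound, phrase in rules:
        if margin_points < upper_bound:
            return phrase
    return DEFAULT_PHRASE


if __name__ == "__main__":
    for margin in (1.0, 4.0, 10.0):
        print(margin, "->", describe_lead(margin))
```

Every number in `LEAD_RULES` is an editorial decision. Moving a threshold by a point changes the wording of every article that crosses it, which is the kind of choice the report says was often made on the fly.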

The full report is here.
