Nieman Foundation at Harvard
Aug. 1, 2022, 2:50 p.m.

“Number soup”: Can we make it easier for readers to digest all the numbers journalists stuff into their stories?

“Numbers do not speak for themselves. All the same, many people believe that they do. An ideology we call numerism, which accords a privileged epistemic status to quantification, is widespread.”

A widely watched measure of news organizations’ median obscurant adumbration hit a year-to-date high of 11.2 in July, a new study has found, a 42% increase from June’s 7.9 measure and the biggest single-month jump since a 7.1 percentage-point change in April 2019, but still well below the anticipated range of between one in six and 32.1.

Okay — I made all that up. No one measures news organizations’ median (or average!) “obscurant adumbration,” because that’s not a thing.

But did that paragraph remind you of any ledes you’ve read (or written)? A long sentence clotted with clauses, thick with too many numbers and percentages and comparisons? One that intermingles a “percent increase” with a “percentage-point change”? Or the hard-number precision of “32.1” with the unstable do-math-in-your-head of “one in six”? Even if you somehow know what “obscurant adumbration” is, is a “year-to-date high of 11.2”…good, bad, catastrophic? And it’s 11.2 whats, exactly? Points, units, megatons, gigabytes?
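
To make the percent-versus-percentage-point distinction concrete, here is a minimal Python sketch that redoes the arithmetic on the made-up figures from the fake lede. The function names are illustrative, and treating the invented measure as something for which a "point change" is meaningful is an assumption made only for this example.

```python
def percent_change(old, new):
    """Relative change, expressed as a percentage of the starting value."""
    return (new - old) / old * 100


def point_change(old, new):
    """Absolute difference between two readings.

    If both readings are themselves percentages, this is a
    percentage-point change.
    """
    return new - old


# Invented figures from the fake lede (not real data).
june, july = 7.9, 11.2

print(f"Percent increase: {percent_change(june, july):.0f}%")
print(f"Point change: {point_change(june, july):.1f}")

# "One in six" has to be converted before it can sit next to "32.1".
print(f"One in six as a percentage: {1 / 6 * 100:.1f}%")
```

The same June-to-July move is roughly a 42% increase but only a 3.3-point change, which is why mixing the two framings in a single sentence invites confusion.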

That may be an extreme example, but we’ve all seen a sentence like this, one where even the most attentive reader is unlikely to reach the concluding period better informed than when she began. A new paper out in the journal Journalism Practice terms the overriding faith in numbers journalists sometimes fall prey to “numerism”:

An ideology we call numerism, which accords a privileged epistemic status to quantification, is widespread. Within this ideology, numbers are taken for granted. They are assumed to be objective and truthful at their core, and the conditions of their production are elided. As Porter (1995) observes:

The appeal of numbers is especially compelling to bureaucratic officials who lack the mandate of a popular election, or divine right. Arbitrariness and bias are the most usual grounds upon which such officials are criticized. A decision made by the numbers (or by explicit rules of some other sort) has at least the appearance of being fair and impersonal. Scientific objectivity thus provides an answer to a moral demand for impartiality and fairness. Quantification is a way of making decisions without seeming to decide. Objectivity lends authority to officials who have very little of their own.

“Numerism” is a pretty good name, but I prefer the term the paper’s authors picked for its title: “Number Soup: Case Studies of Quantitatively Dense News.”

Those authors (Jena Barchas-Lichtenstein, John Voiklis, Bennett Attaway, Laura Santhanam, Patti Parson, Uduak Grace Thomas, Isabella Isaacs-Thomas, Shivani Ishwar, and John Fraser) are an interesting mix. Six work at Knology, a “collective of scientists, writers, and educators dedicated to studying and untangling complex social issues,” which works with news organizations to better understand the effects journalism can have on its audiences. The other three work at PBS NewsHour, one of those partner news organizations and the flagship newscast of American public television.

They wanted to figure out what characterizes the number-dense stories that reach readers and whether there are better ways to present them. As they put it in their abstract:

Numbers don’t speak for themselves — yet taking numbers for granted (numerism) is widespread. In fact, journalists often rely heavily on numbers precisely because they are widely considered objective. As a team of journalists and social scientists, we undertook a qualitative exploration of clauses and entire news reports that are particularly quantitatively dense.

The dense clauses were often grammatically complex and assumed familiarity with sophisticated concepts. They were rarely associated with explanations of data collection methods. Meanwhile, the dense news reports were all about economy or health topics, chiefly brief updates on an ongoing event (e.g., stock market fluctuations; COVID-19 cases).

We suggest that journalists can support public understanding by:

  • Providing more detail about research methods;
  • Writing shorter, clearer sentences;
  • Providing context behind statistics;
  • Being transparent about uncertainty; and
  • Indicating where consensus lies.

We also encourage news organizations to consider structural changes like rethinking their relationship with newswires and working closely with statisticians.

The researchers gathered a corpus of 230 U.S. news stories covering four subject areas — economics, health, science, and politics — in late February 2020, meaning they caught the early days of COVID as well as that year’s presidential primaries. Most were text stories, though they did use the transcripts of some news videos. The economics, health, and science stories averaged around 650 words and 30 distinct clauses; politics stories were a bit longer (860 words, 45 clauses).

They then used the wonderfully named tool Dedoose to analyze each story, and each of its constituent clauses, for quantitative density and for the sorts of literacy required to comprehend them. Stories and clauses were coded for these qualities in Dedoose; the more codes assigned to a given clause, the more quantitatively dense it is.

(Some examples of the codes used: Proportion or Percentage; Variability, Concentration, and Variation; Risk and Probability; Magnitude and Scale; and Sampling, Representativeness, and Generalizability. Much more detail in the paper.)
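
The density measure described above (more codes on a clause means a denser clause) can be sketched in a few lines of Python. This is a hypothetical illustration, not Dedoose's interface or the authors' actual pipeline: the clause texts are taken from this article, but the code assignments attached to them are made up here for the example, with labels borrowed from terms the article mentions.

```python
# Hypothetical coded clauses. The clause texts appear in this article; the
# code assignments are illustrative assumptions, not the researchers' data.
coded_clauses = {
    "Wages at the 95th percentile grew by 4.5% last year, "
    "while the median increase was just 1%.": [
        "Proportion or Percentage",
        "Comparison",
        "Central Tendency",
        "Variability, Concentration, and Variation",
    ],
    "the death toll in Iran climbed to six": [
        "Magnitude and Scale",
    ],
    "health officials around the world battle a new virus": [],
}


def density(codes):
    """Quantitative density of a clause: the number of distinct codes on it."""
    return len(set(codes))


# Rank clauses from most to least quantitatively dense.
ranked = sorted(
    coded_clauses.items(),
    key=lambda item: density(item[1]),
    reverse=True,
)

for clause, codes in ranked:
    print(density(codes), clause)
```

Counting distinct codes, rather than raw numerals, is what lets a clause with only two figures in it still rank as dense when it leans on several quantitative concepts at once.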

So, what did they find?

Quantitative density is unevenly distributed. Stories that weren’t overstuffed with numbers often had individual clauses or paragraphs that were. Nearly half of the densest stories were about the economy, followed closely by health stories; politics and science stories had many fewer.

Ledes can be the site of the greatest number density. The inverted-pyramid ideal led some journalists to overstuff their ledes with data, especially on wire stories. Here’s one example from the AP:

SEOUL, South Korea (AP) — South Korea reported an eight-fold jump in viral infections Saturday with more than 400 cases mostly linked to a church and a hospital, while the death toll in Iran climbed to six and a dozen towns in Italy effectively went into lockdowns as health officials around the world battle a new virus that has spread from China.

An online reading level calculator put that lede at the “college graduate” level of difficulty.
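
The article does not say which calculator or formula produced that rating. As one common option, the Flesch-Kincaid grade level can be approximated in a few lines of Python. The syllable counter below is a crude vowel-group heuristic (an assumption of this sketch), so its scores will differ somewhat from those of polished tools.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of vowels, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level.

    0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


# The AP lede quoted above, minus its dateline.
lede = (
    "South Korea reported an eight-fold jump in viral infections Saturday "
    "with more than 400 cases mostly linked to a church and a hospital, "
    "while the death toll in Iran climbed to six and a dozen towns in Italy "
    "effectively went into lockdowns as health officials around the world "
    "battle a new virus that has spread from China."
)

print(round(flesch_kincaid_grade(lede), 1))
```

Because the formula rewards short sentences, a single sentence of nearly 60 words lands above grade 16 in this sketch almost regardless of vocabulary. Splitting it into three or four sentences is the quickest way to bring the score down.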

A lot of numbers can correlate to a lot of grammatical complexity. Clauses tacked on clauses tacked on clauses. An example:

In Virginia, only 16 of 122 licensed hospitals provide sexual assault forensic exams, and only about 150 of the state’s 94,000 registered nurses are credentialed forensic nurses, according to a 2019 study by the Virginia Joint Commission on Health Care.

Economics coverage is at particular risk of too much density. As I said above, economics stories were by some margin the ones most likely to be overstuffed in a potentially confusing way. You might well be able to understand this on a quick first reading:

Wages at the 95th percentile grew by 4.5% last year, while the median increase was just 1%.

But will most readers? “This clause asks readers to understand the percentage change over time (comparison) in median (central tendency) wages, as well as the variability in change over time at different parts of the wage spectrum,” the authors write. “Consider how much easier it is to understand the following version: Workers earning the highest salaries saw a 4.5% increase, while the typical worker saw an increase of just 1%.”
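
For readers unsure what the original wage sentence is actually measuring, this sketch computes both figures on a hypothetical wage distribution. All of the wage data are invented, and constructed so that the result reproduces the 4.5% and 1% figures; only Python's standard `statistics` module is used.

```python
from statistics import median, quantiles


def percentile(values, p):
    """p-th percentile, read from the 99 cut points of quantiles(n=100)."""
    return quantiles(values, n=100, method="inclusive")[p - 1]


def percent_change(old, new):
    """Relative change, expressed as a percentage of the starting value."""
    return (new - old) / old * 100


# Hypothetical wages for 100 workers, lowest to highest (invented data).
last_year = [20_000 + 1_000 * i for i in range(100)]

# Assumption for the example: the top 10 earners got 4.5% raises,
# and everyone else got 1%.
this_year = [w * (1.045 if i >= 90 else 1.01) for i, w in enumerate(last_year)]

top = percent_change(percentile(last_year, 95), percentile(this_year, 95))
typical = percent_change(median(last_year), median(this_year))

print(f"95th percentile: {top:.1f}%  median: {typical:.1f}%")
```

The sentence compares how two different points on the wage scale moved over the year: the wage that 95% of workers fall below, and the wage in the exact middle. That is the distinction the authors' plainer rewrite ("highest salaries" versus "the typical worker") is trying to preserve.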

Coverage of the economy is particularly important, given that public perception of its strength or weakness is perhaps the most clearly established influence on voting behavior, and that perception is highly influenced by both news coverage and ideology.

The authors are not advocating that complex, number-dense stories be dumbed down; they’re advocating that they be written in clearer and more accessible ways.

Quantitatively dense clauses can occur anywhere within the “inverted pyramid” story structure. When they appear very early, they likely indicate numbers that are newsworthy in and of themselves. When these sentences appear later in a story, they typically provide supporting details for the larger story.

Across story placement, they share several traits. They are often grammatically complex, with multiple clauses. Even before accounting for content, this complexity means they are relatively difficult to understand.

Many of them assume familiarity with sophisticated quantitative measures like economic indicators and epidemiological concepts. Audiences who lack this prerequisite knowledge may find this type of writing inaccessible, particularly because these sentences and the stories that contain them rarely take the time to fully explain research methods. In particular, references to official statistics typically left data collection methods unquestioned and unexplained. Without an understanding of the underlying methods (e.g., how BLS calculates unemployment), news users may not be prepared to make meaning of changes and trends, particularly at times of social disruption.

In combination, all of these traits seem to suggest these journalists are speaking to a more sophisticated target audience and may be leaving typical news users behind.

The authors lay out suggestions for improvement, as described in the abstract above. A few of my favorites:

Write shorter, clearer sentences. Breaking up quantitative information across sentences is one of the simplest ways to demystify it. Rather than expecting audiences to make sense of multiple comparisons, stick to one concept per sentence. Norris and Phillips (2003) define “reading and writing when the content is science” as “the fundamental sense of scientific literacy.” We suggest that the same is true for quantitative reasoning more broadly: fundamental literacy plays a role. Simplifying the language used and the sentence structures, then, should leave more processing ability available for quantitative reasoning.

Consider making changes in the organization’s relationship with newswire content. Some of the densest clauses and stories we read were republished directly from the Associated Press and other newswires. We urge news organizations to add or link to explanations of statistical concepts and findings, rather than reprinting this content unchanged. We also encourage news organizations to demand that newswires provide more accessible content.

Indicate where consensus lies. We often saw multiple measures or multiple models reported without a clear explanation of either the differences between them or their relative support. “Both-sides” reporting strategies are not well suited to situations where the vast majority of credible experts agree, because these strategies falsely legitimize fringe opinions. In contrast, “weight of evidence” reporting strategies allot column space or on-air time proportional to the amount of evidence supporting competing claims. Such strategies help journalists ensure that they do not undermine solid findings by focusing overly on controversy. Such strategies also help readers recognize some of the sites of disagreement within the research community.

You can find the full paper here, and a thread by author Jena Barchas-Lichtenstein here.

Image generated using Midjourney, using the prompt “one ceramic soup bowl, filled with spreadsheets and data, on a table, from above, in the style of Gustav Klimt, in the style of Thomas Hart Benton, --ar 16:9.” (For some reason, it repeatedly refused to make a more literal “number soup,” despite many attempts asking for “numbers,” “numerals,” and such. Still, it’s pretty.)

Joshua Benton is the senior writer and former director of Nieman Lab. You can reach him via email or Twitter DM (@jbenton).