Nieman Foundation at Harvard
April 29, 2021, 1:46 p.m.

Do Americans really not support “core journalism values”? It all depends on your definitions (and the questions you ask)

Debating methodology may be boring, but a poorly structured study can warp how journalists think about their audiences.

Last week, I criticized a recent report from the American Press Institute and the Associated Press-NORC Center for Public Affairs Research. That report made a number of bold claims about the American public and what it said was a widespread lack of support for core journalistic values like transparency, oversight, and giving voice to the less powerful. Among those claims:

  • Only 29% of Americans believe that “a good way to make society better is to spotlight its problems.”
  • Just 44% of Americans support the journalistic value of transparency.
  • Only half — 50% — of Americans support the journalistic value of giving voice to the less powerful.

In all, the report said, only 11% of Americans support all five of what it considers the core values of journalism. “Bad news for journalists: The public doesn’t share our values,” the headlines said.

I argued that that conclusion was false and that some bad and arbitrary methodology by the researchers had artificially lowered the support levels for journalistic values.

For example, when asked if they agreed with the statement “We need to put a spotlight on problems in society in order to solve them,” 72% agreed and only 6% disagreed. When asked if “it’s important to offer a voice to the voiceless,” 74% agreed that it was, versus 4% who disagreed. These results are very different from the depressingly low “support” numbers the study pushed out — and those lower numbers were generated by a muddled methodology. If you haven’t already, go read my piece to check out my arguments.

After my piece, API’s Tom Rosenstiel wrote a defense of their study; he posted a version of it over at CJR. I do not find it convincing, but you may! Tom makes a number of good points about the merits of complexity in research and the sense that the field has hit a sort of wall in figuring out how to fruitfully examine issues of media trust. I think that’s all correct. But from my perspective, it doesn’t really address the criticisms I raised.

If you have something better to do with your time than reading people argue about survey methodology, I totally respect that! Go check out something else on the site. But I do think it’s important to go into some depth on this. First, because journalists should do more to understand (and critique, where necessary) the ways in which the studies they report on are made. And second, because it is perilously simple for a complex set of methodological decisions to be transformed into a simple headline. That makes it critically important that the headline is right.

The idea that most people took away from this study is, roughly: “Ugh — most Americans don’t believe in the things that we journalists believe, like that the government should be transparent, or that it’s good to expose problems in society.” This study doesn’t show that, and we should get it right before that takeaway achieves a life of its own.

In the interest of transparency — a core journalism value! — I’m going to publish Tom’s statement below along with, well, why I don’t think its arguments are particularly strong. You should read it all and make up your own mind! (Especially if you’re having sleep issues and need a bunch of words about survey methodology to get those sheep a-countin’.)

If you want to read the entirety of the statement before hearing my counterarguments, click here. Otherwise, I’ll tackle it section by section below.

Tom argues that, if you are looking to evaluate people’s psychological responses to something or their core moral principles, it’s better to ask a battery of questions rather than focus in on just one.

To understand people’s reactions to complex psychological/­sociological concepts, one mistake researchers have learned to avoid is to hang too much importance on a single question, or to simply compare two questions to each other. Doing so leaves the research vulnerable to shallow understanding of what people are really thinking. It also puts too much emphasis on the vagaries of wording of a single question. To avoid this problem, a common best practice today is to ask people their reactions to a battery of both multiple positive and negative statements around the same concept and then to combine those results into a score weighting the battery or questions together. Many well known studies use this combined multiple statement and scale approach, among them Moral Foundational Theory, the UCLA Loneliness Scale, the racial resentment scale on the ANES, and the Psychological Wellbeing Scale, just to name a few. That is the method we used, in no small part so that we could compare it to Moral Foundational Theory results.

The idea behind this method is that it provides a much more nuanced, robust and reliable look at how people [view] complex psychological and sociological concepts like values or morality. It is especially valuable methodologically to examine both positive and negative items together. Few attitudes can be easily measured in a single statement.

If Tom thinks I’m opposed to the idea of using a battery of questions — or believe that a single question is always better than multiple questions — I am happy to disabuse him of that notion.

Multi-item batteries can be great! They’re particularly useful in evaluating qualities that need to be teased out of a subject — questions people aren’t likely to answer accurately or honestly if they’re just asked directly.

But a battery of questions still needs to be done well. Adding more questions to a survey instrument can make a survey instrument better, or it can make it worse. It depends on the questions, and what you do with the responses. I raised a lot of reasons in my piece why I don’t believe this particular battery of questions was well done, which Tom doesn’t address here.

Allow me to drift into the world of a hypothetical example for a bit, one that might put all this in more familiar terms.

Say it’s October 2020 and you’re a pollster for Joe Biden. You want to see what his support levels are like in, say, Michigan. So you do a poll that has these four items in it, asking likely voters if they agree or disagree with each statement.

1. I plan to vote for Joe Biden for president over Donald Trump.

2. There are some issues on which I agree more with Donald Trump than with Joe Biden.

3. There is no one alive I would rather have as our next president than Joe Biden.

4. Sometimes, I think it’s good for Republicans to win the presidency.

These are all questions that get at, in their own way, people’s opinions about Biden, Trump, and the election. Agreeing with two of these statements (1 and 3) would suggest that you’re a Biden supporter. Disagreeing with the other two (2 and 4) also suggests that you’re a Biden supporter.

Here’s an important point: Each one of these items could generate valuable information for the campaign.

If someone thinks there are literally zero humans alive better suited for the job, that tells you they’re a very dedicated Biden supporter, not someone who’s likely to waffle between now and Election Day. Same for if someone says there are literally no issues on which they prefer Trump’s position over Biden’s. “Sometimes, I think it’s good for Republicans to win the presidency” can let you know if someone is an ideologically committed Democrat or more of a swing voter. Each statement has its merits as a survey instrument.

Let’s say the results to these questions came in looking something like this:

1. I plan to vote for Joe Biden for president over Donald Trump.

51% agree, 48% disagree.

2. There are some issues on which I agree more with Donald Trump than with Joe Biden.

79% agree, 14% disagree.

3. There is no one alive I would rather have as our next president than Joe Biden.

22% agree, 71% disagree.

4. Sometimes, I think it’s good for Republicans to win the presidency.

68% agree, 23% disagree.

Again, those are all interesting items that generate useful information. But let’s say you then decided that, to be counted as a “supporter” of Joe Biden, you had to (on average) agree with 1 and 3 and disagree with 2 and 4. In other words, a real “supporter” of Joe Biden would be someone who says:

Yes, I plan to vote for Joe Biden for president over Donald Trump.

No, there is not a single issue on which I agree more with Donald Trump than with Joe Biden.

Yes, there is no one alive I’d rather have as our next president than Joe Biden.

No, it is never good for Republicans to win the presidency.

Or maybe you say a person has to answer three out of the four — a majority — “correctly” to be counted as a supporter.

Remember: Each of those four is a reasonable and useful question to ask someone. And those are the pro-Biden responses to each question. But if you set that high of a bar, you might end up with a finding that only, say, 17% of people meet that higher standard. These are Biden superfans, but you’ve decided that definition is the one you’re using to define mere “support.”

Would your key takeaway from all this, then, be: “Only 17% of Michigan likely voters support Joe Biden”?

Would you think it made sense to headline a writeup of the poll: “Bad news, Joe. Michiganders don’t share your values”?

Unless you are a terrible pollster, no. You asked people directly if they plan to vote for Biden over Trump, and 51% said they do. You also asked three other reasonable questions that gave you potentially valuable information about people’s opinions.

They’re all legitimate questions, but they do not all have the same predictive value and they do not all reflect a common definition of “support.” Someone who says “Yes, I plan to vote for Joe Biden” but “No, he’s not literally the best candidate on earth — I really liked that Mayor Pete kid” still plans to vote for Joe Biden.

It would be perfectly reasonable to use information from the other questions to inform your opinions about Biden’s support.

“These voters who support Trump on some issues — they’re probably less committed to Biden, so we should discount their support a bit and target some more messaging at people like them. They might be people who aren’t willing to tell a pollster they like Trump, but who won’t have a problem pulling his lever in the voting booth.”

“The fact that so many people think Republican presidents are sometimes good makes me worry they could be swayed by certain GOP endorsements — we have to take that into consideration.”

“These people who literally think Biden is the best candidate alive — I bet they’ll have super-high turnout levels, and we can be a little more confident about the Michigan counties where they’re concentrated.”

Any of those could be a completely reasonable conclusion to draw from the data. But if you’re just trying to gauge support for Biden, you’ll get a more accurate answer by just paying attention to Item 1 than by setting an arbitrary standard that treats all four items equally and just averages the responses together. Item 1 on its own might not tell you everything you need to know — but if your methodology produces a result that’s radically different from Item 1, you should probably have a good justification for why your method is better.
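For readers who think in code, here is a minimal Python sketch of the hypothetical above. It compares Item 1 on its own with an equal-weight composite that requires at least three of four "pro-Biden" answers. The three respondents and the function names are invented for illustration; they are not real polling data.

```python
# Hypothetical sketch: a direct question vs. an equal-weight four-item composite.
# Item order matches the four statements above; True means the respondent agreed.
PRO_BIDEN_ANSWER = (True, False, True, False)  # agree with 1 and 3, disagree with 2 and 4


def direct_support(answers):
    """Support as measured by Item 1 alone (plans to vote for Biden)."""
    return answers[0]


def composite_support(answers, required=3):
    """Support as measured by counting 'pro-Biden' answers across all four items."""
    matches = sum(given == pro for given, pro in zip(answers, PRO_BIDEN_ANSWER))
    return matches >= required


# Three invented respondents (illustrative only, not survey data).
respondents = {
    "superfan": (True, False, True, False),
    "ordinary Biden voter": (True, True, False, True),
    "Trump voter": (False, True, False, True),
}

for name, answers in respondents.items():
    print(f"{name}: direct={direct_support(answers)}, composite={composite_support(answers)}")
```

The "ordinary Biden voter" plans to vote for Biden but matches only one of the four "pro-Biden" answers, so the composite files that respondent alongside the Trump voter.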

This is what I argued happened in API/AP-NORC’s study. For the journalistic values they wanted to measure support for, they asked some questions that broached the subject directly. They also asked other questions that — while potentially informative and useful — simply aren’t equally predictive of whether someone supports transparency, or oversight, or whatever. Those questions were formed inconsistently and they elicited wildly disparate responses — and then the study treated them all the same and came out with an artificially low number, with no particular justification for why it’s a more accurate result than what the straightforward question generates.

There is certainly nothing wrong with using a battery of questions to evaluate a person’s beliefs or positions. I’m quite a fan of Stanley Feldman’s four-question authoritarianism battery, which I find useful in thinking about political polarization. (Good details on that in Marc Hetherington and Jonathan Weiler’s book Prius or Pickup?) The Donald Kinder/Lynn Sanders racial resentment scale Tom mentions is also a modern classic. (I can highly recommend Kinder’s book with Nathan Kalmoe, Neither Liberal nor Conservative: Ideological Innocence in the American Public, which makes strong arguments for the value of using multiple measures to evaluate a given variable — in their case, ideological commitments.)

But that doesn’t mean they’re above reproach. There are good batteries and bad ones, and they’re only as strong as the questions they contain. These things are imperfect! They get tested, revised, and improved based on how useful their results turn out to be. The UCLA Loneliness scale Tom mentions, for example, has been revised twice to address concerns about whether some of its questions biased responses in one direction or another.

You can find robust debates about the merits and demerits of all of the multi-question batteries he’s talking about. Which one do you like best: the ANES Racial Resentment index, the GSS Racial Resentment index, or the CCES racial resentment index? Maybe you prefer the Pew “Old-Fashioned Racism” five-category variable, or the GSS “Old-Fashioned Racism” five-category difference score? Perhaps you like the 2008-09 version of the ANES Negative White Stereotypes index, or maybe the later version, or even the other ANES version done with CCAP?

There’s one really important thing to know about the various surveys that Tom cites here. (“Moral Foundational Theory, the UCLA Loneliness Scale, the racial resentment scale on the ANES, and the Psychological Wellbeing Scale, just to name a few.”) The researchers who developed each of those batteries put in a lot of work trying to validate their merits statistically. They didn’t just brainstorm a list of items and assume that whatever comes out on the other end is gospel.

For instance, here’s a too-quick summary of the multi-year process that went into developing the first version of the UCLA Loneliness scale:

  • an initial set of 75 potential questions, drawn from existing literature (dating to 1964), narrowed first to 25 and then to 20
  • an initial test group of 239 subjects, some of them pulled into a three-week “clinic/discussion group on loneliness,” others used as controls
  • in addition to the semi-winnowed battery of tests, asking subjects to complete a series of other questionnaires and self-reports to try to measure the validity of the potential questions
  • a measure of each question tested based on its degree of correlation with alternate measures of loneliness
  • a measure of the internal consistency of the total battery
  • a measure of the correlation of the overall scale score with batteries for related states (e.g., depression, anxiety)
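The "internal consistency" step in the list above can be sketched in a few lines of Python. The summary does not say which statistic was used; Cronbach's alpha is one common choice and is assumed here. The response data is invented, coded 1 ("never") to 4 ("often") with reverse-worded items already flipped.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a battery.

    rows: one list of item scores per respondent, all of the same length (2+ items).
    """
    k = len(rows[0])
    # Variance of each item across respondents.
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented responses to a three-item battery (illustrative only).
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 4, 3],
    [4, 3, 4],
    [2, 1, 2],
]
print(round(cronbach_alpha(responses), 3))
```

Values near 1 suggest the items move together; values near 0 suggest they are measuring different things.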

And that was just the start. As I said above, the UCLA scale has been revised multiple times to address concerns of internal bias. And it has been the subject of frequent evaluation and critique, especially looking at different cultural contexts, in the years since. There have been literal decades of debate over how well it captures loneliness.

How about the API/AP-NORC study — what sort of validation of the quality of their measure did they seek? Nothing like what went into the other batteries Tom mentions.

There was at least a little external validation in picking the five core journalism values; the researchers “identified [the values] through our own experience” and then held a brainstorming session with six journalists and two professors. As I said in my earlier piece, I think the five they came up with are basically fine — they’re not what I have a problem with. But the specific statements they used, the structure of the instrument, the cut point they set for defining “support” — those were not the result of years of testing alternatives and iterative measures of external validation.

And you know what? That’s fine. Someone has to be the first to try out an instrument or a new approach, and I’m glad they did. But that also means that, if someone questions your methodological decisions, you can’t just point at a few classic survey instruments that have been put through the wringer to justify your choices.

The API/AP-NORC researchers write they were building on Jonathan Haidt and Jesse Graham’s Moral Foundations Theory and wanted to use a methodology to match it. (The API/AP-NORC study gave subjects a version of the standard MFT questionnaire in addition to their journalism-based instrument.)

Moral Foundations Theory is a controversial theory in some corners of academia. But it is an established theory, and the tools it uses have been tested and developed with care. Early versions (including alternate sets of questions) were published in 2009 before its official debut in 2011.

So what sort of effort has gone into the validation of the Moral Foundations Theory questionnaire? What did it take to get there? “Multiple rounds of item analysis using large heterogeneous samples” — more than 34,000 people tested. The first version was found to have some “low…internal consistencies,” and some of the items they thought were testing one moral value turned out to actually be testing a different one. To measure the merits of different question combinations, they compared their potential items against three different external criterion scales. They discovered that three-question subsets for each value were often more reliable than four-question subsets; they tested 15-, 30-, 40-, and 43-item surveys to compare their validities. They administered the same test twice to the same people (on average, 37 days apart) in order to measure the versions’ test-retest Pearson correlation, or how closely people’s results matched across multiple administrations of the instrument. They separately analyzed the survey’s robustness among self-identified political liberals, moderates, conservatives, and libertarians (they found “reasonable internal consistency”). They ran a confirmatory factor analysis to generate goodness-of-fit indices. For each of the five moral foundations they were examining, they compared results to multiple different external measures created by other researchers — 15 different external comparisons in all. For certain subsets of their subjects, they surveyed self-reported reactions to social groups aligned with or against the values they were testing. (For example, people whose stance on “authority” was being measured were asked for their thoughts on the military, police officers, and anarchists.) They had 10,652 people complete both their survey and the broader Schwartz Values Scale to measure how they looked at both individual values and the same grouped in aggregate. They ran another cut at their gathered survey data normalized by both country of origin and country of residence, and then ran controls for political ideology, age, gender, religious attendance, and education levels. All of that — and more steps that I’m skipping over — is how they landed at the 30-item 2011 version of the Moral Foundations Questionnaire with confidence that it’s a valid measure.
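The test-retest step mentioned above boils down to correlating each person's score from the first administration with their score from the second. Here is a minimal Python sketch of that calculation. The scores are invented and the function name is a placeholder, not anything from the MFQ researchers' own code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented scale scores for five people, taken twice about a month apart.
first_administration = [12, 18, 25, 9, 21]
second_administration = [14, 17, 24, 11, 19]
print(round(pearson_r(first_administration, second_administration), 3))
```

A value close to 1 means people who scored high the first time also scored high the second time, which is what a stable instrument should produce.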

They put in the work. (And even after all that, plenty of people still don’t like it!)

Smart people can have honest disagreements with the MFQ or the ANES racial resentment scale or any of those classic measures — but you can feel comfortable giving them at least some benefit of the doubt because they’ve been tested and tested and retested and retested, deeply and profoundly and repeatedly.

There’s nothing wrong with using a group of multiple items to measure someone’s values or opinions. But “combining answers” only “allows researchers to obtain a much more supple and accurate measure of the people’s thinking” if the questions are good. I wrote why I didn’t think those questions were good in my previous piece, and I don’t see any response to them here.

Back to Tom:

The reason this is helpful is easy to understand. While a majority of people may agree with an affirmative statement, if many of those same people also agree with a contradictory statement, that’s a strong signal that they have mixed views about the value. If, on the other hand, someone tends to agree with all the affirmative statements and feels strongly about the contradictory ones, that’s a good signal that the person’s feelings are less mixed. Combining answers allows researchers to obtain a much more supple and accurate measure of the people’s thinking. Researchers caution against trying to draw conclusions from any one individual item without considering the full set.

A really, really important word here is contradictory. One of my main complaints last week was that the paired survey items were claimed to represent a value and then “the antithesis” of that value — but they just aren’t.

Let’s look at the UCLA Loneliness survey Tom mentioned. You can find the questions for its survey instrument here. They’re framed as questions rather than statements; instead of agreeing or disagreeing, you answer whether an item describes you “never,” “rarely,” “sometimes,” or “often.” (The affirmative and negative questions are intermingled in the actual survey, but I’ve separated them here to show the distinction.)

Items where answering “never” codes as high loneliness:

How often do you feel that you are “in tune” with the people around you?

How often do you feel close to people?

How often do you feel you can find companionship when you want it?

How often do you feel that there are people who really understand you?

How often do you feel that there are people you can talk to?

How often do you feel that there are people you can turn to?

How often do you feel part of a group of friends?

How often do you feel that you have a lot in common with the people around you?

How often do you feel outgoing and friendly?

Items where answering “never” codes as low loneliness:

How often do you feel left out?

How often do you feel that your relationships with others are not meaningful?

How often do you feel that no one really knows you well?

How often do you feel isolated from others?

How often do you feel shy?

How often do you feel that people are around you but not with you?

How often do you feel that you lack companionship?

How often do you feel that there is no one you can turn to?

How often do you feel alone?

How often do you feel that you are no longer close to anyone?

How often do you feel that your interests and ideas are not shared by those around you?

What do you notice about those questions? How they’re all structured in parallel ways (“How often do you feel…”)? How tightly they stick to the concept of loneliness? Each one serves as a distinct point of entry into the same phenomenon — a different cut at the same apple. (Maybe someone taking the test happened to have lunch with someone today, so he’s not feeling particularly “isolated from others” at the moment — but he might still feel that his “relationships with others are not meaningful” or that he’s not “outgoing and friendly.”)

This is a well-aligned set of questions. While there will no doubt be variations among and within different subjects, there aren’t any serious logical problems here that could jam up the works. There’s no real confusion about which response to each question tends toward loneliness and which one doesn’t.

Compare them to the four API/AP-NORC items about oversight:

Items where agreeing codes as being pro-oversight:

The powerful need to be monitored or they will be inclined to abuse their power.

It’s vital that the public know what government leaders are doing and saying each day.

Items where agreeing codes as being anti-oversight:

It’s important to put some trust in authority figures so they can do their jobs.

Leaders need to be able to do some things behind closed doors to fulfill their duties.

Does that second pair of statements strike you as the “antithesis” of the first pair? Do they clearly represent “contradictory” values?

If you think “the powerful need to be monitored,” but also that “it’s important to put some trust in authority figures so they can do their jobs” — does that mean, as Tom says, that you “have mixed views about the value” of oversight? Or does it mean that the two statements aren’t actually antithetical to one another, they don’t represent conflicting beliefs, or that it’s perfectly consistent to think both are true at the same time?

As it happens, Americans were 70% agree/8% disagree on “the powerful need to be monitored” and 68% agree/9% disagree on “it’s important to put some trust in authority figures” — despite the fact this study considers agreeing with those two statements “contradictory.”

Which do you think is more likely: that all those people really are just that muddle-headed on these issues, or that it’s just a muddled set of questions?
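A quick arithmetic check shows how many people must have agreed with both of those "contradictory" statements. Assuming the 70% and 68% figures are shares of the same set of respondents, the two groups cannot avoid overlapping: at least 70 + 68 - 100 = 38% of respondents agreed with both. The function name below is just a label for this sketch.

```python
def min_overlap(pct_a, pct_b):
    """Smallest possible share (in percent) that agreed with both statements."""
    return max(0, pct_a + pct_b - 100)


# 70% agreed the powerful need to be monitored; 68% agreed authority needs some trust.
print(min_overlap(70, 68))  # 38
```

That 38% is a floor, not an estimate. The real overlap could be as high as 68%.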

The API/AP-NORC questions are consistently poorly formed if they’re meant to represent contradictory or antithetical values. Just as in the Biden story above, different strengths of opinion get lumped into the same basket, blanket statements paired with occasional exceptions.

Some items are just straightforward statements of general principle:

The more facts people have, the more likely it is they will get to the truth.

The powerful need to be monitored or they will be inclined to abuse their power.

A society should be judged by how it treats its least fortunate.

It’s important to offer a voice to the voiceless.

Some ask you to weigh two different values:

On balance, it’s usually better for the public to know than for things to be kept secret.

Some don’t just declare that something is good, but that it’s usually the best of all possible options:

Transparency is usually the best cure for what’s wrong in the world.

There are items that ask if there are ever any exceptions to a broader rule:

Sometimes the need to keep a secret outweighs the public’s right to know.

Too much focus on what’s wrong can make things worse.

Sometimes favoring the least fortunate doesn’t actually help them.

It’s important to put some trust in authority figures so they can do their jobs.

Leaders need to be able to do some things behind closed doors to fulfill their duties.

Some ask if something is the way to fix a problem, not just a way:

The way to make a society stronger is through criticizing what’s wrong.

The way to make a society stronger is through celebrating what’s right.

There are items that are supposed to be contradictory, but where sloppy wording makes it possible for both to be logically true:

For most things, knowing what’s true is a matter of gathering evidence and proof.

For a lot of things that matter, facts only get you so far.

There are items addressing an undefined set of circumstances (what problems, exactly?):

Most problems can be addressed without putting embarrassing facts out in the open.

That mishmash of degrees of strength is how you end up with a 20-item survey where more people agreed than disagreed with 19 of the 20. All 10 of the “pro-journalism-value” items got more agrees than disagrees — and so did 9 of the 10 “anti-journalism-value” items.

So imagine survey subject Jane Doe gets to the four items about “giving voice to the less powerful,” and here’s how she responds:

It’s important to offer a voice to the voiceless. “STRONGLY AGREE. It is just so critical to raise up those unheard voices.”

A society should be judged by how it treats its least fortunate. “STRONGLY AGREE. We all have a moral imperative to support the least among us.”

Sometimes favoring the least fortunate doesn’t actually help them. “SOMEWHAT AGREE? I mean, I guess it hasn’t always helped, every single time. Once in a while, it backfires or just doesn’t work.”

Inequalities will always exist and you can’t eliminate them. “STRONGLY AGREE. We can’t eliminate all inequalities, of course — someone will always have more than someone else — but that doesn’t mean we shouldn’t try our best to reduce inequalities whenever we see them.”

Does it strike you that Jane supports the value of “giving voice to the less powerful”? Well, for the purposes of this survey, Jane officially does not support “giving voice to the less powerful.” Yes, even though she “strongly agreed” that “it’s important to offer a voice to the voiceless.” Because the muddled questions count just as much as the direct ones.

And if she’s not a supporter of giving voice to the less powerful, she is officially — like 89% of Americans, according to API/AP-NORC — not a supporter of all five of journalism’s core values. (“Bad news for journalists: Jane doesn’t share our values”!)

You may have noticed something about the various classic batteries that Tom listed. Each of them aims to answer questions that many people either won’t or can’t answer honestly if you just asked them.

If you’re psychologically unwell, it’s entirely possible that you don’t realize it — or at least that you would struggle to say “Yes, I am psychologically unwell” in response to a direct question.

Imagine you wanted to find out how racist someone is. If you just straight up asked them — say, “Do you hate Black people?” — even most racist people would be savvy enough to know that saying so out loud will likely earn some sort of social opprobrium — whether a disappointed look, a punch to the face, or something in between. You won’t get accurate or useful answers. That’s why the ANES racial resentment scale instead asks about statements like “Irish, Italian, Jewish, and many other minorities overcame prejudice and worked their way up; Blacks should do the same without any special favors,” or “It’s really a matter of some people just not trying hard enough: if blacks would only try harder they could be just as well off as whites.”

In each of these cases, the multi-question approach is useful precisely because the direct question is hard to answer. That’s why you don’t need a complicated battery of questions to figure out if someone had breakfast today — you can just ask and be reasonably confident in someone’s ability to give a cogent answer.

I am willing to be convinced that just asking someone directly about values like journalistic oversight might not give you the full picture. I’m sure there are people who would say they support “giving voice to the less powerful” if you asked them — but who would hem and haw and retreat from that position if you ask them about specifics. But I don’t see any reason to think people are anywhere near as hesitant or unable to share their thoughts on, say, government transparency as they are their thoughts about race or the current state of their brain.

Again, I’m open to the idea that additional questions could generate more accurate outcomes when looking at these values. But there’s no evidence that these particular questions do. And the yawning gap between how people directly answer specific questions and what this methodology produces is evidence that many of these particular questions reduce accuracy rather than increase it.

Back to Tom:

We fear Josh made that mistake. He begins, for instance, by looking at the questions probing the idea of the press as a social critic. He notes that 72% of people to some degree agreed with the statement that “we need to put a spotlight on problems in society in order to solve them.” But he ignores one of the next questions, that only 43% agreed with a variation of the same concept, “the way to make society strong is through criticizing what’s wrong.” It is not part of his critique. While he does show them at the end of his report, he has also condensed the six-answer scale of the agree and disagree options, conflating the results. Even these summarized figures, however, show that 67% of Americans think “the way to make society stronger is by celebrating what’s right,” strongly balancing against the positive statement. And 50% believe “too much focus on what’s wrong can make things worse.” When all the answers to the four-question battery of this concept are combined and weighted, it shows that support for the concept is much lower than if you simply look at one question.

I sought clarification on this part of the statement: While Tom refers to “the answers to the four-question battery of this concept” being “combined and weighted,” there actually isn’t any differential weighting of the four questions in each battery. Each of the four is treated as an equally predictive measure of the value in question. (Which, you may have guessed, I think is a bad idea, given these questions.)

For each item, a subject’s response is coded on a scale of 1 to 6 — 6 being the most “pro-journalism-value” answer (e.g., “strongly agreeing” with “The more facts people have, the more likely it is they will get to the truth”) and 1 being the least (e.g., “strongly disagreeing” with that statement). With four items per value, that means a maximum score of 24 (6×4) and a minimum of 4 (1×4). To count as a supporter of a particular value in this study, you needed a score of at least 16 — the equivalent of a 4 (“slightly” pro-journalism-value) on each question.
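To make that arithmetic concrete, here is a minimal sketch of the scoring rule as described above. The function and constant names are my own, not the researchers’, and it assumes each response has already been coded from 1 to 6 in the “pro-journalism-value” direction.

```python
# Sketch of the scoring rule described in the text; all names are hypothetical.
ITEMS_PER_VALUE = 4          # four survey items per journalism value
MIN_CODE, MAX_CODE = 1, 6    # 6 = most "pro-journalism-value" answer
SUPPORT_THRESHOLD = 16       # equivalent to a 4 ("slightly" pro) on every item


def value_score(coded_responses):
    """Sum four already-coded responses into a single score from 4 to 24."""
    if len(coded_responses) != ITEMS_PER_VALUE:
        raise ValueError("expected exactly four coded responses")
    for code in coded_responses:
        if not MIN_CODE <= code <= MAX_CODE:
            raise ValueError("each coded response must be between 1 and 6")
    return sum(coded_responses)


def is_supporter(coded_responses):
    """A respondent counts as a supporter only at a score of 16 or higher."""
    return value_score(coded_responses) >= SUPPORT_THRESHOLD
```

Under this rule, someone who gives the strongest pro-value answer on two items (6 and 6) but lands at 2 and 1 on the other two scores 15 and is counted as a non-supporter.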

So when it comes to measuring whether someone supports the idea of journalistic oversight, agreeing with “we need to put a spotlight on problems in society in order to solve them” — a fairly direct summation of the value — counts exactly as much as disagreeing with “too much focus on what’s wrong can make things worse,” which simply asks if focusing too much on an issue “can” make it worse.
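Here is a rough illustration of what that equal treatment does to one hypothetical respondent. It rests on assumptions of mine: that raw answers run from 1 (strongly disagree) to 6 (strongly agree), that the two counter-statements are reverse-coded as 7 minus the raw answer, and that the four items quoted by Tom make up this battery. The respondent is invented for the example.

```python
# Hypothetical illustration; the reverse-coding formula (7 - raw) is an assumption.
# Each entry: (statement, True if agreeing counts AGAINST the journalism value).
BATTERY = [
    ("We need to put a spotlight on problems in society in order to solve them", False),
    ("The way to make society strong is through criticizing what's wrong", False),
    ("The way to make society stronger is by celebrating what's right", True),
    ("Too much focus on what's wrong can make things worse", True),
]
SUPPORT_THRESHOLD = 16


def code_response(raw_agreement, reverse):
    """Map raw agreement (1-6) onto the 1-6 pro-value scale."""
    return 7 - raw_agreement if reverse else raw_agreement


def battery_score(raw_answers):
    """Add up the four coded items, each with the same weight."""
    return sum(
        code_response(raw, reverse)
        for raw, (_statement, reverse) in zip(raw_answers, BATTERY)
    )


# An invented respondent: strongly agrees with the direct statement of the value,
# slightly agrees with the second, and agrees with both hedged counter-statements.
respondent = [6, 4, 5, 5]
score = battery_score(respondent)
print(score, score >= SUPPORT_THRESHOLD)
```

The coded answers come to 6 + 4 + 2 + 2 = 14, so this person, who strongly agrees that we need to spotlight problems in order to solve them, falls below the cutoff of 16 and is not counted as a supporter of the value.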

There’s one other thing I mentioned in my critique that Tom doesn’t respond to here, but I’ll mention it again. This study wants to figure out how big of a gap there is between how journalists think about these values and how the general public does. But they never actually asked journalists how they think about them.

The assumed position of “journalists” here is 100% agreement with the values as the researchers have defined them. But I can assure you that I and many other journalists I’ve spoken to in the past few days would not “fully support all five of the journalism values tested” if we’d taken this exact survey.

Not because we don’t like the values of journalism — just because we wouldn’t recognize the extreme version of those values being portrayed in these questions.

I would agree, to one degree or another, with all of these statements that I’m supposed to disagree with in order to “support” core journalism values. Not because I hate transparency or oversight or reliance on facts, but because the statements are structured poorly enough that they seem boringly and self-evidently true:

It’s important to put some trust in authority figures so they can do their jobs.

Leaders need to be able to do some things behind closed doors to fulfill their duties.

A lot of the time you know enough about something and more facts don’t help.

For a lot of things that matter, facts only get you so far.

Sometimes favoring the least fortunate doesn’t actually help them.

Inequalities will always exist and you can’t eliminate them.

Too much focus on what’s wrong can make things worse.

Sometimes the need to keep a secret outweighs the public’s right to know.

Most problems can be addressed without putting embarrassing facts out in the open.

Look at all those wiggle words: “some trust,” “some things,” “a lot of the time,” “for a lot of things,” “sometimes,” “can make things worse,” “sometimes.” I read disagreeing with these as meaning:

It’s important to never put any trust in authority figures so they can do their jobs.

Leaders don’t need to do anything behind closed doors to fulfill their duties.

It’s relatively rare for someone to know enough about something so that more facts wouldn’t help.

Facts are all you need for most things that matter.

Favoring the least fortunate always helps them.

All inequalities can be eliminated.

Too much focus on what’s wrong never makes things worse.

The need to keep a secret never outweighs the public’s right to know.

Most problems can’t be addressed without putting embarrassing facts out in the open.

Maybe there are some on this list that you, my fellow journalist, would answer differently. But because the researchers never asked journalists about them, we don’t know how far we stand from the general public on these values. As Tom told Margaret Sullivan, they start with the assumption that these values as they define them are just universal in the profession.

“Journalism is a tribe,” said Tom Rosenstiel, executive director of the American Press Institute. “These are our core values, and we think that everybody shares them.”

I’m sure it’s frustrating to have someone like me complaining about your methodology on a study like this. As I said last time, most of the research here I have zero issue with.

The meat of this study is taking these measures of support for journalism values and using them (a) to compare them to a set of moral values derived from the MFT and (b) to test whether experimental changes to a news story can change the reaction of readers with particular value structures. That’s all good stuff. And frankly, I wouldn’t have a real problem with the methodology the researchers used if that were all it was used for. Having this scattershot mix of statement types is fine, even desirable, if your goal is to differentiate people within a group for the purpose of something like a cluster analysis.

Think back to that Biden hypothetical from earlier. If your goal as a Biden pollster was to understand the dynamics within the set of Biden supporters — which ones are hardcore loyalists, which ones could be swayed with a gentle push, which ones might respond to hearing messages x, y, or z — you’d absolutely want to ask a whole bunch of different questions that could help you distinguish between the various clusters among his supporters and figure out their relative size. More questions mean richer data to inform your analysis. The inconsistent shifts in emphasis from question to question can help you segregate people by the intensity or quality of their support.

But if you’re just trying to figure out who “supports” Biden, asking all those scattered questions, treating them all as equally predictive, and then just calculating the average response across the board is going to give you a terrible answer.

Same thing here. If you’re going to use this methodology to decide who “supports” transparency, you just can’t do it without justifying why these questions are the right ones to ask, why this structure is the right one, why responses to these items matter more than (or less than, or the same as) these others, and why this is the correct dividing line between “supports” and “doesn’t support.” This study just doesn’t have that. And without that, this study can’t tell us anything about the public’s “support” (or lack thereof) for journalistic values.

Coda: There are two other paragraphs in Tom’s statement that I don’t think are particularly responsive to what I wrote, so I won’t bother responding to them, but I’ll include them here for reference.

Some discussion of the study in journalism circles also seems to infer something the work does not suggest — an implied criticism of the public for not embracing journalism’s values or some inference criticizing journalists for having these values. Nowhere does the study suggest either. Nor do we think it. Great journalism depends on reporting that is inclusive and understands and accurately represents everyone with a stake in the story. There is much more value to journalism when it reaches as much of the potential audience as possible.

The goal of the research was to open a new door into understanding the question of trust in the news media in the same way that moral foundational theory introduced a new way of understanding politics–around underlying values. We used the same methodological approach as moral foundational theory does so we could match the findings. That methodology is well understood and accepted in the research community. It may be new to the study of trust in journalism. But it provides a much more robust way of understanding the data.

  1. This was Biden’s actual final margin in Michigan.
Joshua Benton is the senior writer and former director of Nieman Lab. You can reach him via email or Twitter DM (@jbenton).