Nieman Foundation at Harvard
Feb. 22, 2024, 12:27 p.m.

With The New York Times suing Microsoft and OpenAI for copyright infringement (a case the Times might well win, AI writer and researcher Timothy B. Lee and Cornell professor James Grimmelmann argued this week), it’s a good time to take a look at how news sites in general are responding to tech companies’ use of their content. A report out Thursday from the Reuters Institute for the Study of Journalism finds that nearly half (48%) of the top news publishers across 10 countries were blocking OpenAI from crawling their sites as of the end of 2023.

The websites of legacy print publications (like The New York Times and Der Spiegel) were more likely to block AI crawlers than TV and radio broadcasters or digital-born news sites — 57% of them were doing so, according to Richard Fletcher’s research.

News websites were less likely to block Google’s AI crawler than OpenAI’s, with a little less than a quarter doing so, but “almost every website (97%) that decided to block Google’s AI crawler was also blocking OpenAI’s crawlers.” From the report:

The proportion of top online news websites blocking OpenAI ranged from 79% in the US, to just 20% in Mexico and Poland. For Google, the proportion blocking their AI crawler ranged from 60% in Germany to 7% in Poland and Spain. In general, outlets in the Global North were more likely to be blocking than those in the Global South. (Interestingly, the figures are aligned with attempts to index countries in terms of AI capabilities and preparedness, such as those published by Tortoise and Oxford Insights, both of which rank the US first.)

In every country apart from Germany, where the figure was 60% for both, more top news websites blocked OpenAI’s crawlers than Google’s. Moreover, almost every website that blocked Google AI also blocked OpenAI (97%). This could be because ChatGPT is more prominent and widely used than Bard/Gemini, or it could be because the OpenAI crawler was released first. But it is also possible that publishers are more cautious about blocking Google in case it affects their prominence in search results — even though there are separate crawlers for search and AI.
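
For readers wondering what "blocking" a crawler looks like in practice: the passage above does not spell out the mechanism, but the usual one is a site's robots.txt file, where each crawler is addressed by its own user-agent token. The sketch below is only an illustration under that assumption, not the report's methodology. It uses Python's standard robots.txt parser with the tokens OpenAI and Google publish for their AI crawlers (GPTBot and Google-Extended); the sample file and the helper name blocked_ai_crawlers are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens published by OpenAI and Google for their AI crawlers.
AI_CRAWLERS = {"OpenAI": "GPTBot", "Google AI": "Google-Extended"}


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {crawler name: True if robots.txt disallows it from fetching url}."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network access
    return {
        name: not parser.can_fetch(agent, url)
        for name, agent in AI_CRAWLERS.items()
    }


# Hypothetical robots.txt: blocks OpenAI's crawler but leaves Google's
# search crawler (Googlebot) and every other agent free to crawl.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Allow: /
"""

print(blocked_ai_crawlers(SAMPLE))
# {'OpenAI': True, 'Google AI': False}
```

Because Google-Extended is a separate token from Googlebot, a publisher could add a Disallow rule for it without changing the rules that apply to the search crawler, which is the distinction the report draws at the end of the passage above.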

You can read the research here.
