Nieman Foundation at Harvard
May 14, 2018, 7:01 p.m.
Business Models

European news sites are among the worst offenders when it comes to third-party cookies and content

Major news sites in seven countries averaged 81 third-party cookies per page, compared to 12 for other popular websites.

The General Data Protection Regulation, which takes effect on May 25, is pushing publishers to take a hard look at just how dependent their outlets have become on the cookies and third-party trackers they load on their own sites to collect data from their visitors.

News sites actually load more third-party content and set more third-party cookies than other top websites, according to a new study of websites across seven European countries from the Reuters Institute.

News sites in those countries averaged 40 different third-party domains per page and 81 third-party cookies per page, compared to an average of 10 and 12, respectively, for the group of top websites in those countries. (Among sites that run some kind of advertising, the study found that news sites on average load four times as many third-party domains as other top websites.)

U.K. news sites were, on average, the most bloated of the bunch:

The prevalence of cookies and third-party tracking varies across news sites that rely on different revenue sources, and thus have different incentives and advertising needs.

Public media sites, most of which depend on neither subscription revenue nor advertising, share the least data with third parties. German news sites are generally more restrained than their U.K. counterparts. Compare the news site for the popular daily Bild, for instance, to the Daily Mirror site in the U.K., and BBC News to German public broadcaster ARD/Tagesschau:

Researchers compared all the third-party requests made on a selection of news sites (chosen for their relative reach and prominence in their respective countries) with those made on the 500 most popular sites overall in Finland, France, Germany, Italy, Poland, Spain, and the U.K. They used an open-source tool called webXray, which monitors and records the third-party content that loads on a given page in Chrome. webXray can identify about 400 different types of third-party services, 270 of which showed up in the Reuters analysis.
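To make the measurement concrete, here is a minimal sketch in Python of the basic idea: given a page's address and the addresses of everything it requested, count the distinct domains that don't belong to the site itself. This is not webXray's code. The function names and the example URLs are hypothetical, and the "last two labels of the hostname" rule is a deliberate simplification that breaks on suffixes like .co.uk, where a real tool would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def site_key(url: str) -> str:
    """Naive guess at a URL's registrable domain: the last two hostname labels.

    Assumption for illustration only; this is wrong for multi-part suffixes
    such as .co.uk, which need the Public Suffix List.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def third_party_domains(page_url: str, request_urls: list[str]) -> set[str]:
    """Return the distinct domains requested by a page that differ from its own."""
    first_party = site_key(page_url)
    found = set()
    for url in request_urls:
        key = site_key(url)
        if key and key != first_party:
            found.add(key)
    return found


# Hypothetical page and request log, invented for this example.
page = "https://www.example-news.de/politik"
requests_seen = [
    "https://static.example-news.de/app.js",           # first party
    "https://www.google-analytics.com/analytics.js",   # analytics
    "https://securepubads.g.doubleclick.net/tag/js/gpt.js",  # ads
    "https://stats.g.doubleclick.net/r/collect",       # same domain as above
    "https://connect.facebook.net/en_US/sdk.js",       # social widget
]
domains = third_party_domains(page, requests_seen)
print(len(domains), sorted(domains))
```

Two requests to different doubleclick.net subdomains count once here, which is why a per-page domain count can be much lower than the raw number of third-party requests.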

Surprise, surprise: Google services are on most of the pages the researchers analyzed (followed distantly by Facebook):

GDPR takes aim at the collection of identifiable data on internet users that the users have not knowingly consented to, and levies heavy fines for non-compliance, meaning news sites should be doing due diligence on what’s loading on their pages…like, yesterday. In their study, the Reuters researchers have included a handy rundown of the types of third-party content that a site might be carrying and the purposes of each, many of which are not inherently problematic. But just to give you a taste of the range: Load images from hosting services like AWS? Run Google Analytics? Load ads via Google’s DoubleClick network? Have a Facebook “Share” widget? Include Taboola/Outbrain recommendations on your page? That’s all part of this.

So what are news organizations to do about their sites, with GDPR coming into effect in a little over a week? Researchers Timothy Libert and Rasmus Kleis Nielsen offer a helpful matrix for understanding the relative privacy risk that each type of loaded content poses to users:

News organizations should be able to make some simple improvements to users’ privacy pretty easily (see especially the “low risk,” “low effort to replace” items):

For sites focused on improving privacy especially in light of the GDPR, content which ranks ‘low’ on the effort scale could be prioritized for migration. Hosted JavaScript files, fonts, and images all have low-to-medium privacy risk, and in some cases changing a single line of code may provide immediate privacy gains.

Similarly, social media buttons frequently set cookies and may link browsing data directly to users’ profiles, representing a high privacy risk. While social media companies provide code to enable sharing, it is possible to implement widgets on a first-party basis which facilitate social sharing. Even if social media companies would prefer sharing to happen with their widgets, they have no interest in preventing sharing.
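For a sense of what first-party sharing can look like, here is a minimal sketch in Python that builds plain share links a site could render as ordinary anchor tags, so no social network script or cookie loads until a reader actually clicks. It is an illustration, not something taken from the study. The function name `share_links` is hypothetical, and the Facebook and Twitter share endpoints shown are assumptions that should be checked against each platform's current documentation.

```python
from urllib.parse import quote, urlencode


def share_links(page_url: str, title: str) -> dict[str, str]:
    """Build share URLs that can be served as plain first-party links."""
    return {
        # Assumed endpoint: Facebook's share dialog, taking the page URL as "u".
        "facebook": "https://www.facebook.com/sharer/sharer.php?"
        + urlencode({"u": page_url}),
        # Assumed endpoint: Twitter's tweet intent, taking "url" and "text".
        "twitter": "https://twitter.com/intent/tweet?"
        + urlencode({"url": page_url, "text": title}),
        # mailto links need %20 for spaces rather than "+", hence quote_via=quote.
        "email": "mailto:?"
        + urlencode({"subject": title, "body": page_url}, quote_via=quote),
    }


# Hypothetical article URL and headline, invented for this example.
links = share_links("https://example.org/story", "Cookies and news")
for name, href in links.items():
    print(name, href)
```

Because these are just links, the reader's browser contacts the social network only after a click, rather than on every page view.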

The full study is available here.

Photo of cookie crumbs by Dean Shareski used under a Creative Commons license.
