June 10, 2010, 10 a.m.

Linking by the numbers: How news organizations are using links (or not)

In my last post, I reported on the stated linking policies of a number of large news organizations. But nothing speaks like numbers, so I also trawled through the stories on the front pages of a dozen online news outlets, counting links and looking at where they went and how they were used.

I checked 262 stories in all, and to a certain degree, I found what you’d expect: Online-only publications were typically more likely to make good use of links in their stories. But I also found that use of links often varies wildly within the same publication, and that many organizations link mostly to their own topic pages, which are often of less obvious value.

My survey included several major international news organizations, some online-only outlets, and some more blog-like sites. Given the ongoing discussion about the value of external links, and the evident popularity of topic pages, I sorted links into “internal”, “external”, and “topic page” categories. I included only inline links, excluding “related articles” sections and sidebars.
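The three-way sort described above can be sketched in a few lines of Python. This is a hypothetical illustration only, not the procedure used for the survey: the function name, the example.com hosts, and the URL prefixes used to spot topic pages are all assumptions, and real outlets organize their topic pages in different ways.

```python
from urllib.parse import urlparse

def classify_link(href, article_host, topic_prefixes=("/topic/", "/topics/")):
    """Label one inline link as 'topic page', 'internal', or 'external'.

    topic_prefixes is a hypothetical heuristic; each outlet would need its own.
    """
    parsed = urlparse(href)
    host = parsed.netloc.lower()
    # Relative links have no host; subdomains count as the same site.
    same_site = (
        host == ""
        or host == article_host
        or host.endswith("." + article_host)
    )
    if not same_site:
        return "external"
    if parsed.path.startswith(topic_prefixes):
        return "topic page"
    return "internal"

print(classify_link("https://news.example.com/topics/ipad", "example.com"))
print(classify_link("/2010/06/story.html", "example.com"))
print(classify_link("https://other.example.org/report", "example.com"))
```

A rule like this only approximates a human reader's judgment, which is one reason the counts in this post are best treated as rough.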

Twelve hand-picked news outlets hardly make up an unbiased sample of the entire world of online news, nor can data from one day be counted as comprehensive. But call it a data point — or a beginning. For the truly curious, the spreadsheet contains article-level numbers and notes.

Across the dozen online news outlets surveyed, the median outlet averaged 2.6 links per article. Here’s the average number of links per article for each outlet:

Source                     Internal  External  Topic Page  Total
BBC News                   0         0         0           0
CNN                        0.3       0.2       0.7         1.2
Politico                   0.7       0.2       0.6         1.5
Reuters.com                0.1       0.2       1.4         1.7
Huffington Post            1.1       1.0       0           2.1
The Guardian               0.5       0.2       1.8         2.4
Seattlepi.com              0.9       1.9       0           2.8
Washington Post            1.0       0.3       2.0         3.3
Christian Science Monitor  2.5       1.1       0           3.6
TechCrunch                 1.8       3.6       1.2         6.6
The New York Times         1         1.2       4.6         6.8
Nieman Journalism Lab      1.4       13.1      0           14.5

The median number of internal links per article was 0.95, the median number of external links was 0.65, and the median number of topic page links was also 0.65. I had expected that online-only publications would have more links, but that’s not really what we see here. TechCrunch and our own Lab articles rank quite high, but so does The New York Times. Conversely, the BBC, Reuters, CNN, and The Huffington Post are not converting from a print mindset, so I would have expected them to be more web native — but they rank at the bottom.
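For readers who want to check the medians quoted above, here is a minimal Python sketch that recomputes them from the per-outlet averages in the table. The variable and function names are my own. The published Total column is used as-is rather than re-summed, because The Guardian's three components add up to 2.5 while its listed total is 2.4, presumably a rounding effect in the underlying spreadsheet.

```python
from statistics import median

# Per-outlet average links per article, copied from the table above:
# (internal, external, topic page, published total)
AVERAGES = {
    "BBC News": (0, 0, 0, 0),
    "CNN": (0.3, 0.2, 0.7, 1.2),
    "Politico": (0.7, 0.2, 0.6, 1.5),
    "Reuters.com": (0.1, 0.2, 1.4, 1.7),
    "Huffington Post": (1.1, 1.0, 0, 2.1),
    "The Guardian": (0.5, 0.2, 1.8, 2.4),
    "Seattlepi.com": (0.9, 1.9, 0, 2.8),
    "Washington Post": (1.0, 0.3, 2.0, 3.3),
    "Christian Science Monitor": (2.5, 1.1, 0, 3.6),
    "TechCrunch": (1.8, 3.6, 1.2, 6.6),
    "The New York Times": (1, 1.2, 4.6, 6.8),
    "Nieman Journalism Lab": (1.4, 13.1, 0, 14.5),
}

def column_median(index):
    """Median across outlets of one table column, rounded to two places."""
    return round(median(row[index] for row in AVERAGES.values()), 2)

for label, index in [("internal", 0), ("external", 1),
                     ("topic page", 2), ("total", 3)]:
    print(label, column_median(index))
# internal 0.95
# external 0.65
# topic page 0.65
# total 2.6
```

With twelve outlets, each median is the midpoint of the sixth and seventh values once a column is sorted, so a single heavy linker such as the Lab has no effect on it.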

What’s going on here? In short, we’re seeing lots of automatically generated links to topic pages. Many organizations are using topic pages as their primary linking strategy. The majority of links from The New York Times, The Washington Post, Reuters.com, CNN, and Politico — and for some of these outlets the vast majority — were to branded topic pages.

Topic pages can be a really good idea, providing much-needed context and background material for readers. But as Steve Yelvington has noted, topic pages aren’t worth much if they’re not taken seriously. He singles out “misplaced trust in automation” as a pitfall. Like many topic pages, this CNN page is nothing more than a pile of links to related stories.

It doesn’t seem very useful to spend such a high percentage of a story’s links on directing readers to such pages. I wonder about the value of heavy linking to broad topic pages in general. How much is the New York Times reader really served by having a link to the HBO topic page from every story about the cable industry, or the Washington Post reader served by links on mentions of the “GOP”?

I suspect that links to topic pages are flourishing because such links can be generated by automated tools and because topic pages can be an SEO strategy, not because topic page links add great journalistic value. My suspicion is that most of the topic page links we are seeing here are automatically or semi-automatically inserted. There’s nothing wrong with automation, but with present technology, automatically inserted links are not as relevant as hand-coded ones.

So what do we see when we exclude topic page links?

Excluding links to topic pages — counting only definitely hand-written links — the median number of links per article drops to 1.7. The implication here is that something like 30 percent of the links that one finds in online news articles across the web go to topic pages, which certainly matches my reading experience. Sorting the outlets by internal-plus-external links also shows an interesting shift in the linking leaderboard.
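The post does not spell out how the share of topic page links was estimated. One plausible reading, offered here as an assumption, is to compare the median per-outlet totals with and without topic page links. A minimal Python sketch, using the published totals from the two tables and variable names of my own:

```python
from statistics import median

# Published per-outlet totals (average links per article) from the two tables
TOTAL_WITH_TOPIC = [0, 1.2, 1.5, 1.7, 2.1, 2.4, 2.8, 3.3, 3.6, 6.6, 6.8, 14.5]
TOTAL_HAND_WRITTEN = [0, 0.3, 0.5, 0.7, 0.9, 1.3, 2.1, 2.2, 2.8, 3.6, 5.4, 14.5]

median_all = median(TOTAL_WITH_TOPIC)
median_hand = median(TOTAL_HAND_WRITTEN)

# Assumed estimate: the fraction of the median that disappears
# once topic page links are left out.
topic_share = 1 - median_hand / median_all

print(round(median_all, 2), round(median_hand, 2))  # 2.6 1.7
print(round(topic_share, 2))                        # 0.35
```

On this reading the share comes out at roughly a third, in the same range as the “something like 30 percent” quoted above.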

Source                     Internal  External  Total
BBC News                   0         0         0
Reuters.com                0.1       0.2       0.3
CNN                        0.3       0.2       0.5
The Guardian               0.5       0.2       0.7
Politico                   0.7       0.2       0.9
Washington Post            1.0       0.3       1.3
Huffington Post            1.1       1.0       2.1
The New York Times         1         1.2       2.2
Seattlepi.com              0.9       1.9       2.8
Christian Science Monitor  2.5       1.1       3.6
TechCrunch                 1.8       3.6       5.4
Nieman Journalism Lab      1.4       13.1      14.5


The Times and the Post have moved down, and online-only outlets Seattlepi.com and the Christian Science Monitor have moved up. TechCrunch still ranks high with a lot of linking any way you slice it, and the Lab is still the linkiest because we’re weird like that. (To prevent cheating, I didn’t tell anyone at the Lab, or elsewhere, that I was doing this survey.) But the BBC, CNN, and Reuters are still at the bottom.

Linking is unevenly executed, even within the same publication. The number of links per article depended on who was writing it, the topic, the section of the publication, and probably also the phase of the moon. Even obviously linkable material, such as an obscure politician’s name or a reference to comments on Sarah Palin’s Facebook page, was inconsistently linked. Meanwhile, one anomalous Reuters story linked to the iPad topic page on every single reference to “iPad” — 16 times in one story. (I’m going to have to side with the Wikipedia linking style guide here, which says link on first reference only.)

Whether or not an article contains good links seems to depend largely on the whim of the reporter at most publications. This suggests a paucity of top-down guidance on linking, which is in line with the rather boilerplate answers I got to my questions about linking policy.

Articles posted to the “blog” section of a publication generally made heavier use of links, especially external links. The average number of external links per article at The New York Times drops from 1.2 to 0.8 if the single blog post in the sample is excluded; it had ten external links! Whatever news outlets mean by the word “blog,” they are evidently producing their “blogs” differently, because the blogs have more links.

The wire services don’t link. Stories on Reuters.com (as distinguished from stories delivered on Reuters professional products) had an average of 1.7 links per article. But only 0.3 links per article were not to topic pages, and only blog posts had any external links at all. Stories read on Reuters professional products sometimes contain links to source financial documents or other Reuters stories, though it’s not clear to me whether these systems use or support ordinary URLs. The Associated Press has no hub news website of its own, so I couldn’t include it in my survey. Stories pushed to customers through its standard feed do not include inline links, though they sometimes include links in an “On the Net” section at the end of the story.

As I wrote previously, Reuters and AP told me that the reason they don’t include inline hyperlinks is that many of their customers publish on paper only and use content management systems that don’t support HTML.

What does this all mean? The link has yet to make it into the mainstream of journalistic routine. Not all stories need links, of course, but my survey showed lots of examples where links would have provided valuable backstory, context, or transparency. Several large organizations are diligent about linking to their own topic pages, probably with the assistance of automation, but are wildly inconsistent about linking to anything else. The cultural divide between “journalists” and “bloggers” is evident in the way that writers use links (or don’t use them), even within the same newsroom. The major wire services don’t yet offer integrated hypertext products for their online customers. And when automatically generated links are excluded, online-only publications tend to take links more seriously.

