How Tribune Co. plans to rid itself of SEO-killing duplicate content
Last month, I wrote about how The Associated Press plans to leverage its network of members and customers with centralized topic pages linked to content distributed by the consortium. That post has sprung at least three noteworthy legs:
- an intelligent comments thread on Wikipedia’s strength in search results
- some informed skepticism of automated pages from Reuters’ Felix Salmon
- an epic comment by Brent Payne, director of search engine optimization at Tribune Co., on the perils of duplicated content on news sites
This was all very exciting for me, because while I’m deeply interested in the mystical arts of SEO, it’s not something I knew tons about. Intrigued, I contacted Payne for more on how he’s dealing with a problem faced by all networks of news sites: multiple copies of the same article. When the Tribune Co.’s Washington bureau files a story, for instance, it may appear on lots and lots of pages across the company’s many news sites, diluting its power in search. And while Google attempts to make sense of duplicate content, publishers are still advised “to avoid over-syndicating the articles that you write.”
Payne said he’s readying a plan to rid the Tribune Co. of duplicate content. “The goal will be to always have only a single URL for a piece of content across all of our sites,” he told me in an email. For example, when The Los Angeles Times writes a story, it exists, of course, at latimes.com. But when The Chicago Tribune picks up the piece, the current system creates a duplicate of the article with a chicagotribune.com URL.

Under Payne’s plan, Tribune readers would instead visit the Times domain. Meanwhile, a cookie or URL parameter would make the page look like the Tribune’s site and serve the Tribune’s ads. One article, one URL, maximum SEO.
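To make the idea concrete, here is a minimal Python sketch of how one URL could wear two brands. It is not Tribune Co.'s implementation: the `brand` query parameter and cookie name, the `BRANDS` table, the example path, and both function names are assumptions invented for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical brand registry; keys, names, and ad zones are illustrative only.
BRANDS = {
    "latimes": {"name": "Los Angeles Times", "ad_zone": "lat"},
    "chicagotribune": {"name": "Chicago Tribune", "ad_zone": "ct"},
}


def pick_brand(url, cookies, default="latimes"):
    """Choose which site's look and ads to apply to a single-URL article.

    A `brand` URL parameter wins over a `brand` cookie; anything
    unrecognized falls back to the hosting site's own brand.
    """
    query = parse_qs(urlsplit(url).query)
    candidate = query.get("brand", [None])[0] or cookies.get("brand")
    return candidate if candidate in BRANDS else default


def canonical_url(url):
    """Drop the query string and fragment so every brand shares one address."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


# Assumed example: a Chicago Tribune reader following a link to a Times story.
story = "http://www.latimes.com/news/story.html?brand=chicagotribune"
brand = pick_brand(story, cookies={})
print(BRANDS[brand]["name"], "skin on", canonical_url(story))
```

The point of the sketch is that the branding decision lives outside the article's address, so search engines would only ever see one URL per story.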
Though I’m hardly the one to judge, it sounds like a neat trick. As news organizations of all sizes act more like networks, they’ll need to make sure that syndication doesn’t, paradoxically, cannibalize their ability to be found by readers. (Another tool for avoiding the duplicate-content blues is the canonical tag recently adopted by Google and other major search engines, but that doesn’t work across domains.)
Payne also shared an unrelated but similarly smart SEO initiative he has implemented at Tribune Co.: Soon, when reporters and editors file a story, they’ll be required to include a phrase they think is most likely to lead a Google searcher to the article.
Payne is convinced that the SEO game is won by ranking high on the search results for three- and four-word phrases. So while he’s happy that The Chicago Tribune’s topic page for Barack Obama ranks on the first page when searching for the president’s name in Google, Payne is more concerned with doing well with phrases like “obama birth certificate,” “obama’s secret service code names,” and “michelle obama and new do” — to name a few searches that lead lots of readers to the Tribune Co.
Some additional advice from a guy who says he doubled the Tribune Co.’s SEO traffic between 2007 and 2008:
When you do SEO for a huge media company with more content than you can possibly digest with a team so small it’s frustrating, it’s all about what you can automate and what you can force the masses of content creators to do via the CMS to help be more intelligent about SEO. It’s about training the masses and finding evangelists for SEO from those groups that are trained and encouraging them when there are wins.
If you like the way Payne thinks, check out some more SEO tips in the slides of a presentation he gave this year to the Society of Professional Journalists. And last week, we shared some tips for optimizing content in Google News.

Hello,
I think Payne is on the right track there. I believe automation is the way to go when it comes to execution, especially if you’re working across a large number of domains. For those of us publishers with less clout and a nimbler technical team than Tribune Co., I recommend looking at the news content management module built by my organization. To the best of my knowledge, it is the only system that builds Google News-compliant architecture out of the box, along with RSS feeds, news sitemap XML feeds, and a number of other neat things. A description can be found at the bottom of the following page:
http://www.seosamba.com/seo-technology.html
Linking to the original source is a huge win for you, Brent. It probably wasn’t an easy sell, so congratulations.
But I think the AP’s problem (and one that I deal with as well) is that most of its nationally shared content has no home. It’s created for the masses and they can’t pick just one destination URL.
AP seems to get this, at least, and is doing something about it. I’ve seen their plans for 2010 Olympic content and it involves a lot of exclusive hosting and not as much syndicating/duplicating.
I think this is a great idea. I’m no SEO expert, so I can’t really say if it will work, although it sounds promising. But what I find really important about this idea is that it’s an example of a media organization changing itself to get its stories better play in Google searches, rather than just moaning that Google’s system is unfair. That’s a big step in the right direction. Will be interesting to see how this works, and how others can use this idea.
It will be interesting to see whether the cookie or URL parameter comes across as a form of cloaking, which search engines also frown upon. Other than that, eliminating duplicates also helps search engines (and the internet) by reducing the amount of unnecessary content online.
Google is a national, even worldwide platform. It has no reason to send its readers to multiple newspaper websites that are running the same content. This is a basic problem the newspapers just refuse to understand. Search engines want original content, because their readers do.
Just an update to this: Google now supports a cross-domain canonical tag.
http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html
This should make many of the duplicate content issues moot, except in those cases where articles are illegally redistributed on other domains and don’t include the tag.
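For illustration, the tag in question is a `<link rel="canonical">` element placed in the head of the duplicate page and pointing at the original article. Below is a small Python sketch of generating one; the function name and the example URL are hypothetical, and how a given CMS would inject the tag into its templates is left open.

```python
from html import escape


def canonical_link(original_url):
    """Build the <link> element a syndicated copy would carry in its <head>.

    The URL is HTML-escaped so characters such as & or " cannot break
    the attribute.
    """
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


# Assumed example: a copy on another domain pointing back to the original.
print(canonical_link("http://www.latimes.com/news/story.html"))
```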