
How Tribune Co. plans to rid itself of SEO-killing duplicate content

Last month, I wrote about how The Associated Press plans to leverage its network of members and customers with centralized topic pages linked to content distributed by the consortium. That post has since sprouted at least three noteworthy legs:

  • an intelligent comments thread on Wikipedia’s strength in search results
  • some informed skepticism of automated pages from Reuters’ Felix Salmon
  • an epic comment by Brent Payne, director of search engine optimization at Tribune Co., on the perils of duplicated content on news sites

This was all very exciting for me, because while I’m deeply interested in the mystical arts of SEO, it’s not something I knew tons about. Intrigued, I contacted Payne for more on how he’s dealing with a problem faced by all networks of news sites: multiple copies of the same article. When the Tribune Co.’s Washington bureau files a story, for instance, it may appear on lots and lots of pages across the company’s many news sites, diluting its power in search. And while Google attempts to make sense of duplicate content, publishers are still advised “to avoid over-syndicating the articles that you write.”

Payne said he’s readying a plan to rid the Tribune Co. of duplicate content. “The goal will be to always have only a single URL for a piece of content across all of our sites,” he told me in an email. For example, when The Los Angeles Times writes a story, it exists, of course, at latimes.com. But when The Chicago Tribune picks up the piece, the current system creates a duplicate of the article with a chicagotribune.com URL.

Under Payne’s plan, Tribune readers would instead visit the Times domain. Meanwhile, a cookie or URL parameter would make the page look like the Tribune’s site and serve the Tribune’s ads. One article, one URL, maximum SEO.
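To make the mechanics concrete, here's a minimal sketch of how a "one article, one URL" setup could work. It's my own illustration, not Tribune's code: a small Python/Flask app serves every copy of a story from a single URL, and a site query parameter (or cookie) swaps only the logo and the ad slot, never the address crawlers see.

    # Hypothetical sketch of the "one article, one URL" idea, not Tribune's
    # actual system. A single Flask route serves the story; a ?site= parameter
    # or a cookie picks the branding and ad slot, but the URL never changes,
    # so search engines only ever see one copy of the article.
    from flask import Flask, request

    app = Flask(__name__)

    # Illustrative per-partner presentation settings (names are made up).
    SKINS = {
        "latimes": {"logo": "latimes-logo.png", "ad_slot": "LAT-NEWS-300x250"},
        "chicagotribune": {"logo": "tribune-logo.png", "ad_slot": "CHI-NEWS-300x250"},
    }

    def load_story(slug):
        # Stub standing in for a CMS lookup.
        return {"headline": slug.replace("-", " ").title(), "body": "Story text..."}

    @app.route("/news/<slug>")
    def article(slug):
        # Prefer an explicit ?site= parameter, then a cookie, then the default skin.
        site = request.args.get("site") or request.cookies.get("site") or "latimes"
        skin = SKINS.get(site, SKINS["latimes"])
        story = load_story(slug)
        return (
            f"<img src='/static/{skin['logo']}' alt='{site}'>"
            f"<h1>{story['headline']}</h1>"
            f"<p>{story['body']}</p>"
            f"<div class='ad' data-slot='{skin['ad_slot']}'></div>"
        )

    if __name__ == "__main__":
        app.run()

The point of the sketch is the routing decision: nothing about the content changes from site to site, only the presentation, so there is exactly one URL for crawlers to index.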

Though I’m hardly the one to judge, it sounds like a neat trick. As news organizations of all sizes act more like networks, they’ll need to make sure that syndication doesn’t, paradoxically, cannibalize their ability to be found by readers. (Another tool for avoiding the duplicate-content blues is the canonical tag recently adopted by Google and other major search engines, but that doesn’t work across domains.)
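For reference, the canonical tag itself is just a one-line hint in a page's head telling crawlers which copy of an article is authoritative. Here's a rough sketch of how a CMS template might emit it for a syndicated copy; the helper name and URL are made up for the example, and, as noted above, the hint currently only works within a single domain.

    # Rough illustration of emitting the canonical hint from a CMS template;
    # the helper name and example URL are assumptions, not any real CMS's API.
    def canonical_link(original_url: str) -> str:
        """Return the <head> element a duplicate page should carry so crawlers
        know which copy of the article is authoritative."""
        return f'<link rel="canonical" href="{original_url}" />'

    print(canonical_link("http://www.latimes.com/news/la-example-story.html"))
    # -> <link rel="canonical" href="http://www.latimes.com/news/la-example-story.html" />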

Payne also shared an unrelated but similarly smart SEO initiative he has implemented at Tribune Co.: Soon, when reporters and editors file a story, they’ll be required to include a phrase they think is most likely to lead a Google searcher to the article.

Payne is convinced that the SEO game is won by ranking high on the search results for three- and four-word phrases. So while he’s happy that The Chicago Tribune’s topic page for Barack Obama ranks on the first page when searching for the president’s name in Google, Payne is more concerned with doing well with phrases like “obama birth certificate,” “obama’s secret service code names,” and “michelle obama and new do” — to name a few searches that lead lots of readers to the Tribune Co.
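As a rough illustration of how that CMS requirement might be enforced, a simple validation at save time could refuse a story that lacks a target phrase of roughly three or four words. The field name and the word-count rule below are my assumptions, not a description of Tribune's system.

    # Hypothetical sketch of a CMS validation enforcing Payne's rule; the field
    # name and exact word-count check are assumptions made for illustration.
    def validate_story(story: dict) -> list[str]:
        """Return a list of problems that should block publication."""
        problems = []
        phrase = (story.get("target_search_phrase") or "").strip()
        if not phrase:
            problems.append("Reporters must supply a target search phrase.")
        elif not 3 <= len(phrase.split()) <= 4:
            problems.append("Target search phrase should be three or four words.")
        return problems

    print(validate_story({"headline": "...", "target_search_phrase": "obama birth certificate"}))
    # -> []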

Some additional advice from a guy who says he doubled the Tribune Co.’s SEO traffic between 2007 and 2008:

When you do SEO for a huge media company with more content than you can possibly digest with a team so small it’s frustrating, it’s all about what you can automate and what you can force the masses of content creators to do via the CMS to help be more intelligent about SEO. It’s about training the masses and finding evangelists for SEO from those groups that are trained and encouraging them when there are wins.

If you like the way Payne thinks, check out some more SEO tips in the slides of a presentation he gave this year to the Society of Professional Journalists. And last week, we shared some tips for optimizing content in Google News.

                                   
  • http://www.seosamba.com Michel Leconte

    Hello,

    I think Payne is on the right track there. I believe automation is the way to go when it comes down to execution, especially if you’re working across a large number of domains. For the rest of us publishers with less clout and leaner technical teams than Tribune Co., I recommend looking at the news content management module built by my organization. To the best of my knowledge, it is the only system that builds Google News-compliant architecture out of the box, along with RSS feeds, news sitemap XML feeds, and a number of other neat things. A description can be found at the bottom of the following page:
    http://www.seosamba.com/seo-technology.html

  • Pingback: Optimal Use Of Technology And Content | Hartford Courant Alumni Association and Refugee Camp

  • http://www.ibsys.com Andy Kruse

    Linking to the original source is a huge win for you, Brent. It probably wasn’t an easy sell, so congratulations.

    But I think the AP’s problem (and one that I deal with as well) is that most of its nationally shared content has no home. It’s created for the masses and they can’t pick just one destination URL.

    AP seems to get this, at least, and is doing something about it. I’ve seen their plans for 2010 Olympic content and it involves a lot of exclusive hosting and not as much syndicating/duplicating.

  • Pingback: SearchCap: The Day In Search, September 8, 2009

  • Pingback: 5 O’Clock Roundup: Sprint’s disappearing promo, stem cell breakthrough (sort of), search engine optimization | UpOff.com

  • Pingback: links for 2009-09-08 « Glenna DeRoy

  • Pingback: How Tribune Co. plans to rid itself of SEO-killing duplicate …- SFWEBDESIGN.com

  • http://savethemedia.com Gina Chen

    I think this is a great idea. I’m no SEO expert, so I can’t really say if it will work, although it sounds promising. But what I find really important about this idea is that it’s an example of a media organization changing itself to get its stories better play in Google searches, rather than just moaning that Google’s system is unfair. That’s a big step in the right direction. Will be interesting to see how this works, and how others can use this idea.

  • http://www.submitawebsite.com SEO Company

    It will be interesting to see whether the cookie or URL parameter ends up looking like a form of cloaking, which is also frowned upon by search engines. Other than that, eliminating duplicates also helps search engines (and the internet) by reducing the amount of unnecessary content online.

  • http://successdegrees.com Inisheer

    Google is a national, even worldwide platform. It has no reason to send its readers to multiple newspaper websites that are running the same content. This is a basic problem the newspapers just refuse to understand. Search engines want original content, because their readers do.

  • Pingback: Some Newspapers Get It, And Some Still Don’t

  • Pingback: Newspaper’s top 5 search queries are commercial brands » Nieman Journalism Lab

  • http://www.webanablog.com Andy Batten

    Just an update to this, Google now supports a cross-domain canonical tag:

    http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html

    This should make many of the duplicate content issues moot, except in those cases where articles are illegally redistributed on other domains and don’t include the tag.


  • Dominic108

    Andy Batten suggests using the canonical tag on pages that contain the same article but look like pages of different papers. This is useful if the paper to be displayed is encoded in the URL: it’s a way to tell Google that the different URLs should be collapsed into a single one to avoid duplicated content. Great, but the canonical tag is meant for different URLs with very similar content, and having different papers implies having different contact details, branding and so on. Is that sufficiently similar? I would be very interested to know the status of this design. Have we heard from Google engineers about it?

  • Dominic108

    I agree with SEO Company that making the content dependent on cookies can be a form of cloaking, especially because those cookies can be used systematically when redirecting different URLs to a canonical one. Google prefers to decide for itself whether distinct URLs should be collapsed into a single one, and cloaking to manipulate search engines can even result in a penalty. On the other hand, there is no penalty for natural, open duplication of content as long as the duplicates exist to serve readers, not to fool the search engines: Google will pick one of the duplicates to present in its results, and it wants to do that as well as possible. Hiding the differences behind a single URL is therefore counterproductive and a risky practice. The canonical tag is a better solution: it suggests to Google that the different URLs should be collapsed, and which one should be the canonical. The situation is still tricky, however, if the different URLs contain different contact info (business name, phone number and street address), which is likely to be the case here. This is discussed at http://civm.ca/.