Aggregators, curators, and indexers: There’s a difference, and it matters

By C.W. Anderson  /  June 1  /  1 p.m.

Aggregation. Curation. Indexing. They’re all the same, aren’t they? Ask any serious online journalist or new media entrepreneur, and the answer will be quick and obvious: of course not! But in the public debate over the future of journalism — especially the debate as framed by legal analysts and public officials — the words often get thrown around as if they are identical. Ordinarily, such word quibbling would seem a little sad. But in the current context, where every aspect of journalism is up for grabs and concepts like “the hot news doctrine” are discussed in serious tones, words and definitions mean a great deal. So I thought it might be worth a little time thinking about what we mean by aggregation, by curation, and by indexing. In other words: if you’re an “aggregator,” what is it, exactly, that you do?

To give a sense of how these terms are increasingly being lumped together, and some of the problems this might cause, I wanted to highlight the first few paragraphs from the written materials distributed at the Online Media Legal Network’s “Journalism’s Digital Transition,” a conference I attended at Harvard a few weeks ago. The conference, by the way, was great, and I don’t mean to pick on the OMLN. But I did think that the discussion of aggregation included in their CLE (Continuing Legal Education) materials really summed up the issues that I wanted to get at in this post. In the document “News Aggregation and Copyright Fair Use,” conference attendees read:

One of the hottest topics in copyright law these days is the rise of the news aggregator, from Google News to the Huffington Post … debate arises when third-parties get into the act [of] reselling and profiting from information generated by traditional media organizations.

Of course, building a business model around monetizing another’s website content isn’t novel, and methods for doing so have been around for almost as long as the Internet has been considered a viable commercial entity. Consider the practice of framing, or superimposing ads, onto linked websites … News aggregators, which take information from multiple websites and display it on a single page, providing a convenient one-stop resource for readers, are merely the latest flavor-of-the-week.

Though Google News may be the most well known commercial news aggregator, there are many others, such as the Huffington Post and Newser.com. Some use only headlines and links, others copy full (or nearly full) articles and photos. Nearly all receive ad revenue, many based on page views that, copyright owners allege, are being diverted from websites that originate the content.

Are Google News, Huffington Post, and Newser.com the same? How about the other online organizations traditionally tossed into the mix, such as Gawker? If you view the online news ecosystem as basically bifurcated into two categories — content originators and content reusers — then this view of the world might make sense. In the above model, the primary issue isn’t what these sites actually do all day, but the fact that they “receive ad revenue, many based on page views that, copyright owners allege, are being diverted from websites that originate the content.” And yet, as soon as you start to conceptually differentiate between Google News and the Huffington Post, it becomes clear that there’s a much more complex news ecosystem out there.

So what’s actually going on online? I thought it might be interesting to take one of our very own Lab posts, Mark Coddington’s all-around smashing This Week in Review, and parse out the ways that Mark engages in both what I’d call “aggregation” and “curation.” In essence, I think the upper sections of This Week in Review are fundamentally different from the bottom, concluding section, and the differences between the two sections point to different ways of doing online newswork.

The first dozen paragraphs of TWIR are usually broken down into three or four “hot topics” that are big in the future of journalism world that week. As Mark told me when I emailed him and asked him to explain his thinking behind This Week in Review, the upper sections

explore a discussion — a news development with commentary surrounding it, or ideas that spark responses and thus launch (or, usually, continue) a conversation. With those sections, I see myself as mapping out a discussion — explaining who’s on what side, what each person is saying and where that places them in relation to everyone else…If I see some substantive discourse coalescing around an article, that’s more likely to merit its own section because there are several connections I feel I need to explain (i.e. Person A said this, Person B responded with this, and Person C and D reminded both A and B of this and this).

Let’s take one recent TWIR as an example. The hot topics picked by Mark involved (1) the continuing controversy over Facebook, (2) a discussion of iPad apps, (3) New York Times and Wall Street Journal paywalls, and (4) finally, a good overview of recent pieces on new digital news experiments. I’d call this first, lengthiest section of the Week in Review “content aggregation and analysis.” In the old days I would have just called it “blogging.”

  • The topics Mark discusses in This Week in Review emerge from a deep immersion in the conversation about the future of journalism, and a lengthy period of active listening to what people are saying. I follow future-of-journalism news pretty closely, and I’ve almost never disagreed with Mark’s analysis about what the important topics of the week are. In short, I trust his judgment. But it’s a judgment that stems from deep, active engagement in the topic at hand.
  • The way Mark highlights the contours of the debate is through linking back to his original sources. The discussion of Facebook contains 17 links in four paragraphs.
  • Mark occasionally (but not often) weighs in on one of the debates, but he does it pretty subtly, and the bulk of This Week in Review is definitely taken up with summarizing and translating what others are saying.

The second part of TWIR — and it’s usually just a few paragraphs — is called “Reading Roundup.” I’d call this part of This Week in Review “curation,” and it strikes me as pretty different from the rest of the piece. It’s not as centered around debates, and the links tend to go to online content which is more “think-piecey.” In this section, Mark seems to be listening a little bit less, and exercising a bit more personal judgment. I hear him telling me: “Hey! You’ve followed the piece to the end, which tells me you really care about this issue. Since I think we share similar interests, you might like these pieces too!” Or as Mark put it when I quizzed him about the difference:

You’re right — there is a difference between the “reading roundup” and the rest of the weekly review posts…with the reading roundups, I’m merely pointing the reader toward an interesting link without substantively explaining its connection to the rest of the journalism-in-transition world. Essentially, the reading roundup is like me inviting you to a party, while the main sections are like me walking you through a room at that party, introducing you to people, explaining who’s who, and giving you a sense of who you might enjoy talking to.

Finally, compare both of these forms of writing to something like Google News, which uses complex algorithms to determine what the hot topics of the minute are, what counts as a spotlight story, and how to rank stories in order of originality and importance. If Google News looks like anything, it’s a phone book — or one of those yearly news indexes in the big green binders you used to encounter in libraries, just more up to date. There isn’t the same sense of “listening,” the process of judgment seems different, and most importantly, there isn’t the same kind of interstitial commentary surrounding the links. For me, what Google News and other sites do might productively be called “indexing.”

Because this blog post is already over 1,300 words, I’m not going to get into the question posed by Ken Doctor: Can’t we just call all this stuff “content arbitrage”? Maybe that’s the subject for another post, but the short answer is I don’t think you can. I think we need to begin to compare the new forms of journalistic work that exist online, not just to some imaginary ideal of “content creation” versus an evil “repurposing,” but to each other.

Ultimately, why does all this matter? Is there an ultimate upshot of all this linguistic parsing?

For me, the lesson is simple. Anytime you hear someone talk about Google News, The Huffington Post, Gawker, blogging, aggregating, curation, and indexing as if they are the same phenomenon, ignore them. And if they attach that discussion to a set of policy recommendations, without acknowledging the full complexity of what it is people actually do when they aggregate, curate, and index information — well, then you should put your fingers in your ears and run in the other direction.

This entry was written by C.W. Anderson and posted on June 1, 2010 at 1:00 pm.


26 comments:

  1. Lyn Headley at 3:07 pm, June 1, 2010

    Google News, Huff Post, etc. may not be the same phenomenon, but they share common characteristics that I am reluctant to define away. The debate about whether linking is a form of support, as Google and Jeff Jarvis argue, or whether it’s a form of parasitism, as the AP and Connie Schultz argue, is an important one. It often lacks nuance, but that’s because each side has only an initial piece of the puzzle, not because the project of finding commonalities between these “aggregators” is fundamentally misguided or morally or politically distracting.

     
  2. Bill Enator at 4:23 pm, June 1, 2010

    Regarding “parasitism”.

    How is it any different when a website uses content and attributes the AP than when the AP does the same thing?

    Is this any more or less parasitic?

    The AP routinely takes content, “hot news” etc.., without license, permission or payment from others.

    Out of one side of their mouth the AP calls it “fair use” when they do it.. but a copyright violation or “hot news” infringement when any others do.

     
  3. Josh Young at 5:57 pm, June 1, 2010

    Wait, what really is the “ultimate upshot of all this linguistic parsing”?

     
  4. Andrew at 11:45 pm, June 1, 2010

    Wow, thanks for bringing this topic/condition to light. I spend a lot of time online, and that these methods of presentation/creation are so phenomenally different seems only too obvious, even as the boundaries between them have been played with for much of the past few years (as Mr. Coddington does).

    The interesting question to me is how the legal apparatus deals with this ontology, and the OLMN quote you use is disheartening. The ‘agency’ and ‘use of content’ by the aggregator/curator/algorithm are not brought into the picture–only the relationship between the deployment of the quoted sources ‘content’ and the current legal structure seem to matter.

    I would be curious to know more about the ways actual laws are changing around these matters–these apparently ambiguous forms of journalism–partially because I assume these laws would/do effect other forms of expression (art, etc) but also because, if statements that you quote are typical of the legal discourse, the legal profession may veer journalism away from using the more experimental methods that technology enabled, bloggers began, and journalism is now grabbing hold of and utilizing on a larger scale with a qualitatively different result.

    I got to do me some more research. Thanks.

     
  5. Kerry at 11:08 am, June 2, 2010

    Should we really ignore people and run away if they don’t have a nuanced conception of what news sites do, or should we talk to them about why they think what they think, and try to educate them? If we all just ignored one another, we wouldn’t have any news, aggregation, indexing, curation, or linking. Why not ‘curate’ a collection of information (including resources like this one) for people who think there’s no difference between Google News and informed wrap-ups?

     
  6. JEANNE WOLF at 11:40 am, June 2, 2010

    THE DEFINITIONS ARE FASCINATING

    BUT WHAT CONCLUSIONS ARE MADE ABOUT “FAIR USE”?
    THAT SEEMS TO BE THE BIG QUESTION AND THE BIG DEBATE

     
  7. Briana at 3:04 pm, June 2, 2010

    I’ve had a hard time explaining the difference and this is a good way to finally differentiate the terms. I’m a fan of linking back to the original article rather than ripping it line for line, but I can respect sites who came up with the rip and profit model too, only because I didn’t come up with it and it’s obviously working for them

     
  8. jiks at 9:40 am, June 4, 2010

    What about delicous.com, Digg etc.? They are more like curators as most people post there at least after one read.

     
  9. Shelley Kaufman-Young at 8:33 am, June 5, 2010

    Much food for thought in this post. The connection between what curators, aggregators and journalists do is there but as you note the difference is in how particular parts of the unit of information are presented and used. Especially difficult is the trademark law issue. How does one value one aspect of information over another? Is that even a valid question? For instance, Google now wants me to ‘monetize’ my blog. So far, I have resisted since I quail at the idea of making a profit from private thoughts and having my site used to purvey products. But this is surely a rich field for further thought on philosophy, word use, and information. Thank you for a provoking post.

     
  10. Taariq Lewis at 8:21 pm, June 12, 2010

    An excellent piece on how journalism considers the changing role of aggregation and curation. Thank you for an interesting conversation. However, I noticed there was no mention of Twitter as part of the journalist’s aggregation and then curation toolkit. Was that intentional?

     
  11. Mark at 3:41 pm, June 15, 2010

    no mention of Techmeme?

     

Trackbacks:

  1. Aggregators, Curators, and Indexers: There’s a Difference, and it Matters | C.W. Anderson | Voices | AllThingsD at 5:20 am, June 2, 2010

    [...] Read the rest of this post on the original site Tagged: Voices, blogging, C.W. Anderson, Nieman Lab | permalink [...]

     
  2. Hot Links: The Gates of Hell The Reformed Broker at 8:01 am, June 2, 2010

    [...] On the difference between curating, aggregating and indexing in the blogosphere.  (NiemanLab) [...]

     
  3. Storyful. » TIMBP #2 at 8:06 am, June 2, 2010

    [...] Neiman Lab on the difference between indexers, curators and aggregators. [...]

     
  4. Marketing News Headlines – 06/02/2010 » Brand Central Station at 10:18 am, June 2, 2010

    [...] And Truffle-Gate (NY Observer) BPA Board Votes Down Controversial Reporting Changes (Folio:) Aggregators, Curators, And Indexers: There’s A Difference, And It Matters (Nieman Journalism L… Handicapping The Newsweek Mag Sweepstakes (NY Post / Media Ink) ALM’s The Recorder Goes to [...]

     
  5. Aggregators..New Media and the Digital Age…. at 11:47 am, June 2, 2010

    [...] Here is a rather interesting article by Nieman Journalism Lab on the different terms for various New Media ventures. Basically the article is asking us all to get our terms right! [...]

     
  6. Is there a difference in high-end, name brand dried herbs and spices and generic store brand herbs and spices? at 3:13 am, June 3, 2010

    [...] Aggregators, curators, and indexers: There's a difference, and it … [...]

     
  7. links for 2010-06-03 « David Black at 4:06 am, June 3, 2010

    [...] Aggregators, curators, and indexers: There’s a difference, and it matters – Nieman Journalis… "Aggregation. Curation. Indexing. They’re all the same, aren’t they? Ask any serious online journalist or new media entrepreneur, and the answer will be quick and obvious: of course not! But in the public debate over the future of journalism – especially the debate as framed by legal analysts and public officials – the words often get thrown around as if they are identical." (tags: internet socialmedia participatory journalism citizenmedia aggregators) [...]

     
  8. Critical Literacies Class: Week One Readings Reflection « Field Notes: Online Education at 10:25 am, June 4, 2010

    [...] space, the internet can function as an information waste land. Organization already assists us (aggretators, curators and indexers) but we still run the risk of searching for “A” and instead finding “B through [...]

     
  9. Curation is Important « Financialfreezeframe's Blog at 12:24 pm, June 16, 2010

    [...] June 16, 2010 by financialfreezeframe Leave a Comment Business Insider’s, Steve Rosenbaum wrote about Curation being King due to the fact that there was a flood of news all around from facebook, tweeter, Mainstream Newfeeds, bloggers. Almost everyone and everybody is creating content. Hence managing and sieving the bset content is now a premium.  Although I would not go so far as to say as curation is king, I would say a healthy blend of both would be important.  I have also profiled the best Curator of the Web currently here.  For more details between the different types of contents you can refer here. [...]

     
  10. Around the Web: ZURB’s Bounce App, Founder Salaries, & WP 3.0 Adice | Carsonified at 5:28 pm, June 23, 2010

    [...] Aggregators, curators, & indexers: There’s a difference & it matters (via NiemanJournalismLab) [...]

     
  11. The battle lines are being drawn over fair use: Two POVs on the Barclays v. TheFlyOnTheWall.com case » Nieman Journalism Lab at 6:24 pm, June 23, 2010

    [...] an awful precedent. And that makes sense from their point of view, since they’re both in the aggregation/curation/indexing business. But there were at least two other amicus briefs filed yesterday that make sharply [...]

     
  12. Around the Web: ZURB’s Bounce App, Founder Salaries, & WP 3.0 Advice | RefreshTheNet at 8:09 am, June 28, 2010

    [...] Aggregators, curators, & indexers: There’s a difference & it matters (via NiemanJournalismLab) [...]

     
  13. whats the best way to bring two bikes on my 98 camry? at 12:48 am, July 24, 2010

    [...] Aggregators, curators, and indexers: There's a difference, and it … [...]

     
  14. links for 2010-07-28 | Beyond the Echo Chamber at 11:02 am, July 28, 2010

    [...] Aggregators, curators, and indexers: There’s a difference, and it matters » Nieman Journalism Lab Parsing the differences bw curation, aggreation and indexing. Important when thinking of jouranlism/publishing and community strategies and what your communities want. (tags: mediaconsortium NewModels_tmc tmc_handbook aggregation journalism) // [...]

     
  15. The difference between aggregators, curators, and indexers | Brian Heys - thinking out loud at 8:43 am, August 16, 2010

    [...] Read in full at niemanlab.org. This entry was written by Brian, posted on 16 August 2010 at 1:30 pm, filed under Content curation, Curated by me. Bookmark the permalink. Follow any comments here with the RSS feed for this post. Post a comment or leave a trackback: Trackback URL. « How publishers are using curation tools to curate the world of content [...]

     
