Nieman Journalism Lab
Pushing to the future of journalism — A project of the Nieman Foundation at Harvard

Aggregators, curators, and indexers: There’s a difference, and it matters

Aggregation. Curation. Indexing. They’re all the same, aren’t they? Ask any serious online journalist or new media entrepreneur, and the answer will be quick and obvious: of course not! But in the public debate over the future of journalism — especially the debate as framed by legal analysts and public officials — the words often get thrown around as if they are identical. Ordinarily, such word quibbling would seem a little sad. But in the current context, where every aspect of journalism is up for grabs and concepts like “the hot news doctrine” are discussed in serious tones, words and definitions mean a great deal. So I thought it might be worth a little time thinking about what we mean by aggregation, by curation, and by indexing. In other words: if you’re an “aggregator,” what is it, exactly, that you do?

To show how these terms are increasingly being lumped together, and some of the problems this might cause, I wanted to highlight the first couple of paragraphs from the written materials distributed at the Online Media Legal Network’s “Journalism’s Digital Transition,” which was a conference I attended at Harvard a few weeks ago. The conference, by the way, was great, and I don’t mean to pick on the OMLN. But I did think that the discussion of aggregation included in their CLE (Continuing Legal Education) materials really summed up the issues that I wanted to get at in this post. In the document “News Aggregation and Copyright Fair Use,” conference attendees read:

One of the hottest topics in copyright law these days is the rise of the news aggregator, from Google News to the Huffington Post … debate arises when third-parties get into the act [of] reselling and profiting from information generated by traditional media organizations.

Of course, building a business model around monetizing another’s website content isn’t novel, and methods for doing so have been around for almost as long as the Internet has been considered a viable commercial entity. Consider the practice of framing, or superimposing ads, onto linked websites … News aggregators, which take information from multiple websites and display it on a single page, providing a convenient one-stop resource for readers, are merely the latest flavor-of-the-week.

Though Google News may be the most well known commercial news aggregator, there are many others, such as the Huffington Post and Newser.com. Some use only headlines and links, others copy full (or nearly full) articles and photos. Nearly all receive ad revenue, many based on page views that, copyright owners allege, are being diverted from websites that originate the content.

Are Google News, Huffington Post, and Newser.com the same? How about the other online organizations traditionally tossed into the mix, such as Gawker? If you view the online news ecosystem as basically bifurcated into two categories — content originators and content reusers — then this view of the world might make sense. In the above model, the primary issue isn’t what these sites actually do all day, but the fact that they “receive ad revenue, many based on page views that, copyright owners allege, are being diverted from websites that originate the content.” And yet, as soon as you start to conceptually differentiate between Google News and the Huffington Post, it becomes clear that there’s a much more complex news ecosystem out there.

So what’s actually going on online? I thought it might be interesting to take one of our very own Lab posts, Mark Coddington’s all-around smashing This Week in Review, and parse out the ways that Mark engages in both what I’d call “aggregation” and “curation.” In essence, I think the upper sections of This Week in Review are fundamentally different from the bottom, concluding section, and the differences between the two sections point to different ways of doing online newswork.

The first dozen paragraphs of TWIR are usually broken down into three or four “hot topics” that are big in the future of journalism world that week. As Mark told me when I emailed him and asked him to explain his thinking behind This Week in Review, the upper sections

explore a discussion — a news development with commentary surrounding it, or ideas that spark responses and thus launch (or, usually, continue) a conversation. With those sections, I see myself as mapping out a discussion — explaining who’s on what side, what each person is saying and where that places them in relation to everyone else…If I see some substantive discourse coalescing around an article, that’s more likely to merit its own section because there are several connections I feel I need to explain (i.e. Person A said this, Person B responded with this, and Person C and D reminded both A and B of this and this).

Let’s take one recent TWIR as an example. The hot topics picked by Mark involved (1) the continuing controversy over Facebook, (2) a discussion of iPad apps, (3) New York Times and Wall Street Journal paywalls, and (4) finally, a good overview of recent pieces on new digital news experiments. I’d call this first, lengthiest section of the Week in Review “content aggregation and analysis.” In the old days I would have just called it “blogging.”

  • The topics Mark discusses in This Week in Review emerge from a deep immersion in the conversation about the future of journalism, and a lengthy period of active listening to what people are saying. I follow future-of-journalism news pretty closely, and I’ve almost never disagreed with Mark’s analysis about what the important topics of the week are. In short, I trust his judgment. But it’s a judgment that stems from deep, active engagement in the topic at hand.
  • The way Mark highlights the contours of the debate is through linking back to his original sources. The discussion of Facebook contains 17 links in four paragraphs.
  • Mark occasionally (but not often) weighs in on one of the debates, but he does it pretty subtly, and the bulk of This Week in Review is definitely taken up with summarizing and translating what others are saying.

The second part of TWIR — and it’s usually just a few paragraphs — is called “Reading Roundup.” I’d call this part of This Week in Review “curation,” and it strikes me as pretty different from the rest of the piece. It’s not as centered around debates, and the links tend to go to online content which is more “think-piecey.” In this section, Mark seems to be listening a little bit less, and exercising a bit more personal judgment. I hear him telling me: “Hey! You’ve followed the piece to the end, which tells me you really care about this issue. Since I think we share similar interests, you might like these pieces too!” Or as Mark put it when I quizzed him about the difference:

You’re right — there is a difference between the “reading roundup” and the rest of the weekly review posts…with the reading roundups, I’m merely pointing the reader toward an interesting link without substantively explaining its connection to the rest of the journalism-in-transition world. Essentially, the reading roundup is like me inviting you to a party, while the main sections are like me walking you through a room at that party, introducing you to people, explaining who’s who, and giving you a sense of who you might enjoy talking to.

Finally, compare both of these forms of writing to something like Google News, which uses complex algorithms to determine what the hot topics of the minute are, what counts as a spotlight story, and how to rank stories in order of originality and importance. If Google News looks like anything, it’s a phone book — or one of those yearly news indexes in the big green binders you used to encounter in libraries, just more up to date. There isn’t the same sense of “listening,” the process of judgment seems different, and most importantly, there isn’t the same kind of interstitial commentary surrounding the links. For me, what Google News and other sites do might productively be called “indexing.”

Because this blog post is already over 1,300 words, I’m not going to get into the question posed by Ken Doctor: Can’t we just call all this stuff “content arbitrage”? Maybe that’s the subject for another post, but the short answer is I don’t think you can. I think we need to begin to compare the new forms of journalistic work that exist online, not just to some imaginary ideal of “content creation” versus an evil “repurposing,” but to each other.

Ultimately, why does all this matter? Is there an ultimate upshot of all this linguistic parsing?

For me, the lesson is simple. Anytime you hear someone talk about Google News, The Huffington Post, Gawker, blogging, aggregating, curation, and indexing as if they are the same phenomenon, ignore them. And if they attach that discussion to a set of policy recommendations, without acknowledging the full complexity of what it is people actually do when they aggregate, curate, and index information — well, then you should put your fingers in your ears and run in the other direction.

                                   
  • Lyn Headley

    Google News, HuffPost, etc. may not be the same phenomenon, but they share common characteristics that I am reluctant to define away. The debate about whether linking is a form of support, as Google and Jeff Jarvis argue, or whether it’s a form of parasitism, as the AP and Connie Schultz argue, is an important one. It often lacks nuance, but that’s because each side has only an initial piece of the puzzle, not because the project of finding commonalities between these “aggregators” is fundamentally misguided or morally or politically distracting.

  • Bill Enator

    Regarding “parasitism”.

    How is it any different when a website uses content and attributes the AP than when the AP does the same thing?

    Is this any more or less parasitic?

    The AP routinely takes content, “hot news,” etc., without license, permission or payment from others.

    Out of one side of their mouth the AP calls it “fair use” when they do it, but a copyright violation or “hot news” infringement when any others do.

  • http://networkednews.wordpress.com Josh Young

    Wait, what really is the “ultimate upshot of all this linguistic parsing”?

  • http://darushimo.com Andrew

    Wow, thanks for bringing this topic/condition to light. I spend a lot of time online, and that these methods of presentation/creation are so phenomenally different seems only too obvious, even as the boundaries between them have been played with for much of the past few years (as Mr. Coddington does).

    The interesting question to me is how the legal apparatus deals with this ontology, and the OMLN quote you use is disheartening. The ‘agency’ and ‘use of content’ by the aggregator/curator/algorithm are not brought into the picture–only the relationship between the deployment of the quoted sources’ ‘content’ and the current legal structure seems to matter.

    I would be curious to know more about the ways actual laws are changing around these matters–these apparently ambiguous forms of journalism–partially because I assume these laws would/do affect other forms of expression (art, etc) but also because, if statements that you quote are typical of the legal discourse, the legal profession may veer journalism away from using the more experimental methods that technology enabled, bloggers began, and journalism is now grabbing hold of and utilizing on a larger scale with a qualitatively different result.

    I got to do me some more research. Thanks.

  • http://www.bostonist.com Kerry

    Should we really ignore people and run away if they don’t have a nuanced conception of what news sites do, or should we talk to them about why they think what they think, and try to educate them? If we all just ignored one another, we wouldn’t have any news, aggregation, indexing, curation, or linking. Why not ‘curate’ a collection of information (including resources like this one) for people who think there’s no difference between Google News and informed wrap-ups?

  • http://JEANNEWOLF.COM JEANNE WOLF

    THE DEFINITIONS ARE FASCINATING

    BUT WHAT CONCLUSIONS ARE MADE ABOUT “FAIR USE”?
    THAT SEEMS TO BE THE BIG QUESTION AND THE BIG DEBATE

  • http://lgmassmedia.com/blog Briana

    I’ve had a hard time explaining the difference and this is a good way to finally differentiate the terms. I’m a fan of linking back to the original article rather than ripping it line for line, but I can respect sites who came up with the rip and profit model too, only because I didn’t come up with it and it’s obviously working for them

  • jiks

    What about delicious.com, Digg, etc.? They are more like curators, as most people post there only after at least one read.

  • http://skyscraps.com Shelley Kaufman-Young

    Much food for thought in this post. The connection between what curators, aggregators and journalists do is there but as you note the difference is in how particular parts of the unit of information are presented and used. Especially difficult is the trademark law issue. How does one value one aspect of information over another? Is that even a valid question? For instance, Google now wants me to ‘monetize’ my blog. So far, I have resisted since I quail at the idea of making a profit from private thoughts and having my site used to purvey products. But this is surely a rich field for further thought on philosophy, word use, and information. Thank you for a provoking post.

  • http://www.hivefire.com Taariq Lewis

    An excellent piece on how journalism considers the changing role of aggregation and curation. Thank you for an interesting conversation. However, I noticed there was no mention of Twitter as part of the journalist’s aggregation and then curation toolkit. Was that intentional?

  • http://twitter.com/markmayhew Mark

    no mention of Techmeme?
