Who Owns Your RSS?

In a case with far-reaching implications for the widespread practice of automated aggregation of headlines and ledes via RSS, GateHouse Media has, for the most part, won its case against the New York Times, which owns Boston.com, which in turn runs a handful of community web sites. Those community sites were providing added value to their readers in the form of linked headlines pointing to resources at community publications run by GateHouse. The practice of linked headline exchange is healthy for the web, useful for readers, and helpful for resource-starved community publications. However, for reasons that are still not clear (to me), GateHouse felt that the practice amounted to theft, even though the GateHouse sites were publishing the RSS feeds to begin with.

Trouble is, RSS feeds don’t come with Terms of Use. Is a publicly available feed meant purely for consumption by an individual, and not by other sites? After all, the web site you’re reading now is publicly available, but that doesn’t mean you’re free to reproduce it elsewhere. The common assumption is that a site wouldn’t publish an RSS feed if it didn’t want that feed to be re-used elsewhere. And that’s the assumption GateHouse is challenging.

Let’s be clear – this is not a scraping case (scraping is the process of writing tools to grab content from web pages automatically when an RSS feed is not available). Boston.com was simply utilizing the content GateHouse provided as a feed. I would agree that scraping is “theft-like” in a way that RSS is not, but that’s not relevant here.

In a weird footnote to all of this, GateHouse initially claimed that Boston.com was trying to work around technical measures it had put in place to prevent copying of its material. Those “technical measures” amounted to JavaScript in GateHouse’s web pages, but Boston.com was of course not scraping the site; it was merely taking advantage of the RSS feeds freely provided by GateHouse. In other words, GateHouse was putting its “technical measures” in its web pages, not in its feed distribution mechanism, missing the point entirely.

GateHouse seems primarily concerned with the distinction between automated insertion of headlines and ledes (e.g. via RSS embeds) vs. the “human effort” required to quote a few grafs in a story body. Personally, I don’t see how the two are materially different, or how one method would affect GateHouse publications more negatively or positively than the other. If anything, now that GateHouse has gotten its way, they’re sure to receive less traffic.

The result is that Boston.com has been forced to stop using GateHouse RSS feeds to automatically populate community sites with local content. If cases like this hold sway, there will soon be a burden on every site interested in embedding external RSS feeds to find out whether it’s OK with each publisher first.

PlagiarismToday sums up the case:

It was a compromise settlement, as most are, but one cannot help but feel that GateHouse just managed to bully one of the largest and most prestigious news organizations in the world.

Also:

The frustrating thing about settlements, such as this one, is that they do not become case law and have no bearing on future cases. If and when this kind of dispute arises again, we will be starting over from square one.

I’m trying to figure out who benefits from this outcome… and I honestly can’t. GateHouse loses. Boston.com loses. Community web sites with limited resources lose. And readers lose. Something’s rotten in the state of Denmark.

Hermenautic Circle

In the beginning, there was Hermenaut, an excellent ’zine out of the Boston area from the mid-90s. Hermenaut hit it pretty big, as zines go, because it was packed with excellent writing and funky topics (issues had themes like “False Authenticity” and “Vertigo”). My old Liberace piece was originally written for Hermenaut’s “camp” issue. Fast forward a decade. Some of the original Hermenaut authors, including Boston Globe writer Josh Glenn (who was one of Hermenaut’s founders), participate in a free-form (but closed) mailing list for around a hundred writers and gadflies.

Eventually, the “Hermenautic Circle” realized that many of its subscribers maintained their own blogs, which gave rise to the idea of a “planet” web site that could aggregate new posts from all of the individual blogs (without requiring writers to post in two places). Glenn signed up with Birdhouse Hosting, we registered hermenaut.org, and we went looking for a solution.

The rub was that Glenn wanted more than simple RSS aggregation. He wanted posts from scattered blogs made into actual posts on the Hermenautic Circle, so people could comment directly on the site. Somehow we needed to consume RSS feeds and produce new entries on the new blog, rather than just links. Eventually I stumbled on FeedWordPress – one of the coolest WordPress plugins I’ve tried in a while. Hand it a URL and it will discover all embedded feeds and ask which one you want to subscribe to. Each new author found in the feeds becomes a genuine author in the local WP system, and each category found in a feed becomes a genuine category as well. A nice API gives you a new set of template tags you can use to control whether commenting happens on the original author’s site or on the local site. And so on. Really nicely done (and yes, we tipped the plugin developer).
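Feed autodiscovery, the first trick described above, is a simple convention: a page advertises its feeds with &lt;link rel="alternate"&gt; tags in its head. Here’s a rough sketch of that discovery step in Python (FeedWordPress itself is PHP, and the sample page below is made up):

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collects RSS/Atom autodiscovery links from an HTML page --
    the same <link rel="alternate"> convention FeedWordPress relies on."""
    FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

    def __init__(self):
        super().__init__()
        self.feeds = []  # (title, href) pairs, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        if a.get("rel", "").lower() == "alternate" and a.get("type") in self.FEED_TYPES:
            self.feeds.append((a.get("title", ""), a.get("href", "")))

# A made-up page advertising two feeds, as a typical WordPress blog would:
page = """<html><head>
<link rel="alternate" type="application/rss+xml" title="Posts" href="/feed/">
<link rel="alternate" type="application/rss+xml" title="Comments" href="/comments/feed/">
</head><body>...</body></html>"""

finder = FeedLinkFinder()
finder.feed(page)
for title, href in finder.feeds:
    print(title, href)
```

From there, the plugin fetches whichever feed you pick and turns its entries into local posts.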

Hermenautic Circle went live today in starter mode; we’re off and running. And once again, I’m just amazed at the amount of work saved by the rich plugin landscape surrounding WordPress (I really thought I was going to have to code this by hand).

Music: Angels Of Light :: Black River Song

New J-School RSS Feeds

Built RSS feed generators for the J-School’s student and faculty stories databases today. Nice to be able to test RSS feeds directly in Safari. These will probably be lightly updated now that we’re into summer. Now that these exist, I’d like to build a custom Dashboard Widget for J-School RSS feeds in time for next semester, and load it onto all of the new incoming Macs.
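For the curious, generating RSS from a database query is mostly a matter of serializing rows into RSS 2.0’s few required elements. A minimal Python sketch (the channel name, URLs, and story fields here are placeholders, not the actual J-School schema):

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

def build_rss(channel_title, channel_link, stories):
    """Serialize story rows (hypothetical dicts) into a minimal RSS 2.0
    document: a <channel> with required title/link/description, plus
    one <item> per story."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_title
    for story in stories:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = story["title"]
        ET.SubElement(item, "link").text = story["link"]
        # pubDate must be RFC 822 format; format_datetime handles that
        ET.SubElement(item, "pubDate").text = format_datetime(story["date"])
    return ET.tostring(rss, encoding="unicode")

xml = build_rss(
    "J-School Student Stories",       # hypothetical channel name
    "https://example.edu/stories/",   # placeholder URL
    [{"title": "Commencement 2005",
      "link": "https://example.edu/stories/1",
      "date": datetime(2005, 5, 20, tzinfo=timezone.utc)}],
)
print(xml)
```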

Finally got approval and backing to undertake a massive re-creation of the J-School web site this summer, glory be. We’ll be hiring a designer; the hard work is going to be cleaning up and prepping 1500 pages of static content, taking it out of 1997 mode: design tables, font tags, and non-existent standardization on how pages are constructed. By the time I’m done, every page should be XHTML compliant and lightweight, with total separation of design and content. We’ve got a heck of a lot of custom PHP/MySQL stuff interleaved, which could make choosing and integrating a CMS tricky.
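The mechanical part of that cleanup — stripping presentational tags while leaving content alone — can at least be scripted. A hedged Python sketch, with a made-up legacy snippet (converting layout tables to CSS is the part that will still need human judgment):

```python
import re

LEGACY_TAGS = ("font", "center")  # presentational markup targeted for removal

def strip_legacy_tags(html):
    """Remove <font>/<center> tags (open and close) while keeping
    their contents -- a first pass at de-1997-ing a static page."""
    pattern = re.compile(
        r"</?(?:%s)\b[^>]*>" % "|".join(LEGACY_TAGS), re.IGNORECASE
    )
    return pattern.sub("", html)

# A typical line of 1997-era markup (invented for illustration):
legacy = '<center><font face="Arial" size="2">Welcome to the J-School</font></center>'
print(strip_legacy_tags(legacy))  # → Welcome to the J-School
```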

Now that back-to-back conferences and jury duty are over, nose to the grindstone.

Music: Magazine :: Shot by Both Sides