More on RSS and Atom
July 4th, 2006 by DeWitt Clinton

Responding to my post on RSS and Atom, Robert Scoble writes, “where’s the Atom publishing tool and aggregator that demonstrates Atom’s superiority?”

And you know what? Scoble is absolutely right to ask that. Even more so when he says that “users don’t care about specs, or arguments about formats.”

The truth is, until we create the features that touch real people, it doesn’t really matter whether or not we use RSS, Atom, RDF, or heck, CSV to syndicate our content.

RSS really is good enough for the bulk of the blogging community. And it is a hell of a lot better than what we had before.

Besides, if RSS were not sufficient for blogging, then there wouldn’t be umpteen-gazillion bloggers using it. I’m not sure I’d even recommend the average blogger think twice about formats. Just use whatever format your blogging software of choice gives to you. It works, be happy, rejoice.

But if you’re a Microsoft, Google, Yahoo, Amazon, eBay, etc., then you are probably thinking hard about how to incorporate content syndication into your applications. And your “content” is quite possibly more than just words on a page. Your content might include images, sound, movies. It might include rich metadata like geo tags, event and calendar information, address data, product details, or recipient lists.

It is in those rich data scenarios — i.e., in the stage beyond simple text blogging — that Atom will be most useful. In fact, I’m tempted to make the argument that the reason we haven’t gone there already is simply because the popular syndication formats weren’t anticipating how to deal with it.

Sure, we’ve squeezed some rich media into RSS. And we can probably squeeze a whole lot more. I’m not even a zealot against using RSS in sophisticated applications — the first version of OpenSearch was built on RSS after all, and it still works great for syndicating simple search results.

My point was more for the application developers, more for the people that work behind the scenes. Your users won’t care whether you pick RSS vs Atom. Not yet, anyway. But I’m pretty sure that you, the application developer, will care. And that ultimately will impact your users.

So with that aside, why don’t we take Scoble up on his challenge? Why don’t we go ahead and build the next generation of applications that do impact real users and do take advantage of the full richness that syndication can offer?

We should start embedding addresses, calendars, products, and contact information in our syndicated feeds. And we should start expecting our feed reader applications to notice this rich data and automatically open address books and maps and shopping carts whenever they can.

We should start syndicating our search engines. And if those search engines have access to rich data, then they should include that rich data in their syndicated search.

This will open up entirely new opportunities for the application developer. And more importantly, new opportunities for the users. Who, as Scoble rightly points out, are the only ones that really matter.

RSS helped get the idea out to the public that content can be freed.

Now, let’s go give them some more content.

23 Responses to “More on RSS and Atom”

  1. Randy Holloway Unfiltered » Should developers prefer Atom over RSS? Says:

    [...] UPDATE- Dewitt has responded to Robert’s post and writes, “RSS really is good enough for the bulk of the blogging community. And it is hell of a lot better than what we had before.” He also agrees with Robert that we should show the value to end users. I’m still not sure what those scenarios would look like, but I guess if they exist it can’t be a bad idea. [...]

  2. Alex Barnett Says:

    Interesting, if not an old debate ;-)

    I get the point you are making, so my question is this:

    You’ve seen the RSS + SSE work that’s gone on that can carry structured data via microformats (using Live Clipboard): hCard and hCalendar.

    From a technical perspective, why do you think this approach (RSS + SSE) doesn’t meet the requirement re: transmitting content that is more than just words on a page?

    thanks,

    Alex.

  3. DeWitt Clinton Says:

    Alex,

    Good question. I have seen several efforts to add structured data and explicit content types to RSS.

    But that’s just the issue. That there are several efforts underway to add rich typing to RSS. And all of those efforts are external to the core format itself.

    It will be tough for one of them to take off. But you never know.

    BTW, I’ve understood SSE to be tackling a much bigger problem than simple data typing.

    Though I am certainly curious as to what the result would have been if you guys had started work on SSE with the power of Atom and Atom Pub behind you. I know that I’ve gone and revisited a number of tricky issues I had with RSS in the past and used Atom to find elegant solutions.

  4. Nick Lothian Says:

    The GData API (http://code.google.com/apis/gdata/) - in particular the Calendar APIs - is a good example of the usefulness of some of the features of Atom compared to RSS (eg, the fact that an Atom “Entry” can be a self-contained document).

  5. DeWitt Clinton Says:

    Nick — that is a great example. Thank you!

  6. Rod Begbie Says:

    The example of Atom-in-action that makes me giddy at the prospect is the Lucene Web Service (http://dev.lucene-ws.net/wiki/API). A combination of Atom syndication, Atom publishing, and OpenSearch, over a RESTful API, for the remote indexing and searching of Lucene indices.

    If that doesn’t get your geek drool going, I don’t know what will!

  7. DeWitt Clinton Says:

    I have to agree with you there, Rod.

    Lucene Web Service has the potential to be a tremendous application — and one that could ultimately be the foundation for many new features that the end user can see.

  8. Michael Walsh Says:

    Extra features such as… including the equivalent of in your main FeedBurner feed like Scoble’s WordPress blog does. I sure would love to follow the comments in my aggregator that supports them instead of using the site. Thanks

  9. Michael Walsh Says:

    Hmm, that got filtered out. I mean something like “wfw:commentRss” in your main feed would be nice.
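
As a rough illustration of the request above, an aggregator could discover per-post comment feeds by reading the `wfw:commentRss` element from each item. The sketch below assumes Python's standard `xml.etree.ElementTree`; the feed, its URLs, and the `comment_feeds` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Well-Formed Web Comment API extension.
WFW = "http://wellformedweb.org/CommentAPI/"

# Hypothetical feed; every URL here is a placeholder.
RSS = """<rss version="2.0" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    <title>Example blog</title>
    <link>http://example.org/</link>
    <description>Sample feed</description>
    <item>
      <title>More on RSS and Atom</title>
      <link>http://example.org/more-on-rss-and-atom/</link>
      <wfw:commentRss>http://example.org/more-on-rss-and-atom/feed/</wfw:commentRss>
    </item>
    <item>
      <title>A post without a comment feed</title>
      <link>http://example.org/other/</link>
    </item>
  </channel>
</rss>"""


def comment_feeds(rss_text):
    """Map each item's link to its comment feed URL, or None if absent."""
    root = ET.fromstring(rss_text)
    feeds = {}
    for item in root.iter("item"):
        link = item.findtext("link")
        # Namespaced lookup: {namespace-uri}localname
        feeds[link] = item.findtext("{%s}commentRss" % WFW)
    return feeds


feeds = comment_feeds(RSS)
```

Items without the extension element simply map to `None`, so a reader that does not follow comments loses nothing.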

  10. Jason Warren Says:

    We’ve been using RSS at Moto as the basis of our mobile content distribution offering, and are now running into exactly the issues you’ve described. Now that we’re moving towards trying to support richer content, and extract stateful information from content, the limitations of RSS are becoming more apparent.

    Atom may be an option for filling at least part of our needs.

  11. Danny Says:

    “We should start embedding addresses, calendars, products, and contact information in our syndicated feeds. And we should start expecting our feed reader applications to notice this rich data and automatically open address books and maps and shopping carts whenever they can.”

    Absolutely.

    (btw, I found this post and your last one incredibly refreshing, thanks!)

  12. Ugo Cei Says:

    Michael Walsh: the Atom Threading extensions should probably be the answer.

  13. Mark Says:

    Scoble once again pulls out this tired yarn: “Users don’t care about specs, or arguments about formats.”

    My standard answer to that is, “Yes, but they should. And that’s why we developers have to care about them on our users’ behalf.”

  14. stephen ogrady Says:

    i’m mostly with Mark here, i think, because i think Scoble’s missing the point. i think his contention that users don’t care about specifications is specious and mostly irrelevant.

    i don’t know that i’d agree with Mark that they should care, but i’m of the opinion that developers are indeed the target market here.

    the original post being debated here, IMO, is not now, nor is it likely ever to be, highly relevant to users (and i say that w/ all due respect to DeWitt - it’s a great post).

    the point is that formats, syndication, protocols and such are only relevant to users in terms of what they enable that’s new and useful. and that’s where the focus should be.

    so there are two challenges, of which the first i think is most important:

    1. build developer awareness, so that they see the advantages
    2. use the technology to make users’ lives better

    essentially, i have no quarrel with the goals as outlined above. but i do think that “building applications to prove the value of Atom to regular users like Scoble” is the wrong mission.

    i think the Atom community would be far better served by evangelizing the format with developers, and letting them take it from there.

  15. Mark Says:

    > i don’t know that i’d agree with Mark that they should care

    When I say that users should care, I mean that it is in their own best interests to care. Data loss, data corruption, data lock-in… all of these are perils of choosing the wrong format. (I’m not going to get into whether any or all of these apply to RSS in particular, I’m just responding to Scoble’s general point.)

    I think we can all agree that the “average users” of the world (that Scoble claims to care so deeply about) do NOT actually understand how important this choice is. They don’t even understand enough about the technical details to make a rational choice in the first place, even if they’re aware of the issue at all (which they’re usually not). *This* is why I consider it a moral issue for developers (who have the resources and expertise to understand all the issues involved) to make their applications store user-generated content in open formats. Second-best is to provide full-fidelity export to open formats. From version 1.0. Anything less is a moral failing.

  16. Steven Ickman Says:

    From a search perspective I’d argue that the use of either format, RSS or Atom, is pretty much a hack. I think OpenSearch is awesome and I understand the motivators driving the format choices but it still feels like a hack to me.

    Just like you I want to see rich structured results returned for queries but both formats basically limit you to results of a single type and contain a few known fields (i.e. link, title, subject, author, date, & enclosure) that are expected to be common across all items.

    Where do we put the 100+ Outlook defined contact fields and how do we know that a result is a contact and not an appointment or auction? Vista has almost 1000 properties defined in its schema so how do we convey that much metadata in a lossless way? Embedded Microformats are a great suggestion for how to deal with richer content but it sort of feels like a hack on top of a hack to me. What’s the Microformat for an auction? Do I have to wait a year for some committee to arrive at joint agreement on what attributes define an auction before I can return structured auction results?

    And neither format seems particularly well suited for returning hierarchical result sets on a par with OLE-DBs Chaptered Rowsets. It seems like you actually want a mix of OPML & RSS/Atom for this in the flavor of what Chris is doing over at TagJag.com.

    OpenSearch, RSS, Atom, OPML, etc. are all great formats. I also think RSS/Atom based results are the right choice at this point in time for numerous reasons but they’re less than ideal in my opinion.

    -Steve

  17. Darren Chamberlain Says:

    @Steven Ickman:

    > Where do we put the 100+ Outlook defined contact fields and how do we know that a result is a contact and not an appointment or auction?

    You can always return RDF, since this is close to exactly the type of problem that RDF was designed to solve: Sharing of data using arbitrary schema. There’s even a data format that could be reused.

  18. Robert Yates Says:

    “Where do we put the 100+ Outlook defined contact fields and how do we know that a result is a contact and not an appointment or auction?”

    You could put them in the Atom element and you can use its type attribute to tell you whether it is a contact or an appointment.

    “And neither format seems particularly well suited for returning hierarchical result sets on a par with OLE-DBs Chaptered Rowsets. It seems like you actually want a mix of OPML & RSS/Atom for this in the flavor of what Chris is doing over at TagJag.com.”

    Have you looked at the Atom threading extension (http://www.ietf.org/internet-drafts/draft-snell-atompub-feed-thread-12.txt)? We’re using that to represent hierarchies.
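
A minimal sketch of both suggestions, under stated assumptions: it takes the element carrying the payload to be `atom:content` (whose `type` attribute may hold a media type), and it uses `thr:in-reply-to` with the namespace from the published threading specification (RFC 4685) to point at a parent entry. The media types, the `KINDS` table, the entry IDs, and the `classify` helper are all illustrative choices, not anything the commenters specified.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
THR = "{http://purl.org/syndication/thread/1.0}"

# Assumed mapping from media type to result kind (illustrative only).
KINDS = {"text/x-vcard": "contact", "text/calendar": "appointment"}

# Hypothetical result feed; IDs and payloads are placeholders.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:thr="http://purl.org/syndication/thread/1.0">
  <title>Search results</title>
  <id>tag:example.org,2006:results</id>
  <updated>2006-07-04T00:00:00Z</updated>
  <author><name>Example</name></author>
  <entry>
    <id>tag:example.org,2006:contact-1</id>
    <title>Jane Doe</title>
    <updated>2006-07-04T00:00:00Z</updated>
    <content type="text/x-vcard">BEGIN:VCARD
VERSION:3.0
FN:Jane Doe
END:VCARD</content>
  </entry>
  <entry>
    <id>tag:example.org,2006:meeting-1</id>
    <title>Meeting with Jane</title>
    <updated>2006-07-04T00:00:00Z</updated>
    <thr:in-reply-to ref="tag:example.org,2006:contact-1"/>
    <content type="text/calendar">BEGIN:VCALENDAR
END:VCALENDAR</content>
  </entry>
</feed>"""


def classify(feed_text):
    """Return each entry's id, its kind (from content/@type), and its parent ref."""
    root = ET.fromstring(feed_text)
    results = []
    for entry in root.findall(ATOM + "entry"):
        content = entry.find(ATOM + "content")
        # Atom treats a missing type attribute as "text".
        media_type = content.get("type", "text") if content is not None else None
        parent = entry.find(THR + "in-reply-to")
        results.append({
            "id": entry.findtext(ATOM + "id"),
            "kind": KINDS.get(media_type, "unknown"),
            "parent": parent.get("ref") if parent is not None else None,
        })
    return results


results = classify(FEED)
```

A client that does not recognize a media type still gets a title and an ID for every entry, which is the fallback behavior a "dumb" reader needs.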

  19. DeWitt Clinton Says:

    > Where do we put the 100+ Outlook defined contact fields and how do we know that a result is a contact and not an appointment or auction?

    Fun question. I actually think this is a great opportunity for a microformat. The field definitions are already there (in the spirit of paving the cow path and all); one simply needs to pass that data around in a way that is valuable both to “dumb” clients and smart ones.

  20. Kevin Marks Says:

    To answer your questions, Steve, Microformats, specifically hCard, define the fields used for contacts, and they map to Outlook well.
    There isn’t a microformat for an auction yet, but if you’d like to start one, go to http://microformats.org/wiki/process and see how to go about it (hint, start by looking at existing practices).
    If you want hierarchical results, use XOXO.

  21. Steven Ickman Says:

    DeWitt/Kevin, microformats would certainly be one possible approach and I can see how the synergy between dumb & smart clients makes them attractive. I’d suggest pushing the use of XHTML if that route is taken just so that non-browser based apps don’t have to parse both HTML and XML.

    The primary issue I have with the use of microformats is that the client has to know about a given microformat before it can consume it. For instance a *non-browser based* client would have to be programmed with explicit knowledge of an “auction” microformat before it can even render it. This sucks because a) it could take a while for the format to stabilize, and b) clients have to periodically be revved to understand new formats.

    Why can’t eBay just return items with a few extra properties that define something they call an auction? The trick is including pointers to the schema for these extra properties so that the UI can be taught how to properly render the item. This is what we’d like to ultimately support in Search Center (formerly Windows Live Search Desktop). http://findmystuff.spaces.msn.com

    Publishers will want to return structured content and there are plenty of item types worthy of a microformat. But there will also be plenty of item types that aren’t worthy of being turned into a microformat. The case I worry about is an enterprise that wants to make its internal call center records searchable by OpenSearch clients. The structure of their data is going to be very proprietary so it’s unlikely there will ever be a microformat that fully meets their needs.

    Robert, thanks for the pointer to the Atom threading extensions. I’ve only skimmed the spec but it does look like a viable approach for returning hierarchical results.

  22. Taka Says:

    >>> Steven Ickman Says: The primary issue I have with the use of microformats is that the client has to know about a given microformat before it can consume it.

    Not true :-) I posted a reply to DeWitt’s original post that talks about Awasu’s new features to extract arbitrary metadata from feeds:

    http://www.unto.net/unto/work/on-rss-and-atom/#comment-1387

    It’s all driven by config files so not only do you not need to wait for feed reader developers to add specific support for a given microformat, you can also add support for formats that would never, ever get supported e.g. your company’s own proprietary formats.
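
To make the config-driven idea concrete, here is a hedged sketch of how a client might pull structured records out of XHTML feed content using nothing but a table of path expressions. This is not Awasu's actual configuration format; the `CONFIG` layout, the `auction` class names, and the `extract` helper are hypothetical, and the sample uses the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical config: one rule per record kind, each with a path that
# matches the record and a path per field. Supporting a new or proprietary
# format means adding an entry here, not changing code.
# Note: [@class='...'] is an exact match, so multi-valued class attributes
# would need a more capable XPath engine.
CONFIG = {
    "contact": {
        "match": ".//x:div[@class='vcard']",
        "fields": {
            "name": ".//x:span[@class='fn']",
            "org": ".//x:span[@class='org']",
        },
    },
    "auction": {
        "match": ".//x:div[@class='auction']",
        "fields": {
            "item": ".//x:span[@class='item']",
            "bid": ".//x:span[@class='current-bid']",
        },
    },
}

# Hypothetical XHTML payload as it might appear inside a feed entry.
CONTENT = """<div xmlns="http://www.w3.org/1999/xhtml">
  <div class="vcard">
    <span class="fn">Jane Doe</span>, <span class="org">Example Corp</span>
  </div>
  <div class="auction">
    <span class="item">Vintage modem</span>
    <span class="current-bid">12.50</span>
  </div>
</div>"""


def extract(xhtml_text, config):
    """Apply every rule in config and return the records it finds."""
    root = ET.fromstring(xhtml_text)
    records = []
    for kind, rule in config.items():
        for node in root.findall(rule["match"], NS):
            record = {"kind": kind}
            for field, path in rule["fields"].items():
                record[field] = node.findtext(path, default=None, namespaces=NS)
            records.append(record)
    return records


records = extract(CONTENT, CONFIG)
```

Because the rules are plain data, they could in principle be shipped alongside the feed rather than installed on the client, which is the design question the rest of the thread turns on.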

  23. Steven Ickman Says:

    Taka, I remember seeing that post…

    The config file idea is essentially what I meant when I said the results should include pointers to their schema. Part of the schema would need to include information for how to display the various properties when rendering to a list view (justification, formatting, etc.) and I guess the schema could also include XPath expressions for how to extract the various columns from a microformat.

    That’s probably one viable approach; I’m sure there are many. Whatever the end solution, I’d just like to see the community come up with a solution that doesn’t require physically installing something on an end user’s machine. Any schema changes on the server end would ideally be reflected by the client on their next query.

    Also, please don’t think I’m anti-microformats because I’m not. I think microformats rock and would love to see every web page in the world adopt them. I’m just providing feedback and exploring ideas…

    -Steve