Mod Atom?
December 9th, 2006 by DeWitt Clinton

By the way, that last post was in the context of a series of ongoing experiments with the Atom Publishing Protocol.

I switched back to Java (over Python) for the back-end “Atom Store” portion. Not particularly because I wanted to write more Java, but because I really wanted to use the Abdera codebase. Their progress continues to impress me and there doesn’t seem to be any good reason to duplicate their hard work.

But after wasting another day fighting with Java tools, I am serious about needing a change of environments.

So here’s a crazy idea: What about a “mod_atom”? I.e., an Apache httpd module that supports the basics of the Atom Publishing Protocol in much the same way that mod_dav adds DAV capabilities to the Apache web server.

Advantages: it could be fast. Fast as in written-in-C fast. Fast as in linked-into-httpd fast. Fast as in libxml/expat fast. And similarly, very memory efficient, especially considering that the footprint of httpd, which is typically already running on the server anyway, is much smaller than a Tomcat instance doing the same work. Also, it could be simple (i.e., less feature-rich), as the typical Apache module development cycle tends to be slower paced. And most importantly, half of Atom Pub is really just HTTP, and nothing does HTTP better than httpd. All of the hooks are right there for the using. Admit it, APR rocks.
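To make the "half of Atom Pub is really just HTTP" point concrete, here is a minimal sketch in C of how a request's method and target might map onto Atom Publishing Protocol operations. The names `atom_op` and `atom_dispatch` are hypothetical and do not come from httpd or any existing module; a real module would read the method from httpd's request record rather than take a string.

```c
#include <string.h>

/* Hypothetical names for illustration only; not part of httpd or APR. */
typedef enum {
    ATOM_OP_LIST,        /* GET on a collection: return the feed */
    ATOM_OP_CREATE,      /* POST to a collection: add a new entry */
    ATOM_OP_RETRIEVE,    /* GET on a member: return one entry */
    ATOM_OP_UPDATE,      /* PUT on a member: replace the entry */
    ATOM_OP_DELETE,      /* DELETE on a member: remove the entry */
    ATOM_OP_NOT_ALLOWED  /* anything else: the module would answer 405 */
} atom_op;

/* Map an HTTP method and target kind to an Atom Pub operation.
 * is_collection is nonzero when the URL names a collection, zero for a member. */
atom_op atom_dispatch(const char *method, int is_collection)
{
    if (strcmp(method, "GET") == 0)
        return is_collection ? ATOM_OP_LIST : ATOM_OP_RETRIEVE;
    if (is_collection)
        return strcmp(method, "POST") == 0 ? ATOM_OP_CREATE : ATOM_OP_NOT_ALLOWED;
    if (strcmp(method, "PUT") == 0)
        return ATOM_OP_UPDATE;
    if (strcmp(method, "DELETE") == 0)
        return ATOM_OP_DELETE;
    return ATOM_OP_NOT_ALLOWED;
}
```

Everything outside this small table (content negotiation, conditional requests, authentication) is plain HTTP that the server already handles.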

One more advantage: I’m in the quiet minority I think, but the whole “open source” Java thing has left me with a pretty bad feeling. A stack consisting entirely of (L)GPL, Apache, and BSD licensed code makes me sleep better.

Disadvantages: security, insofar as one needs to walk carefully around third-party data in a world of buffer overflows. Memory must be managed, which is a pain for those of us spoiled by GC’d runtimes, though httpd’s pooled memory makes that much easier. Extensibility will be more of a challenge compared with, say, Abdera. Supporting things like GData, OpenSearch, or arbitrary extended elements will probably be outside the initial scope. (Atom Threading Extension might make the cut, though, as that’s a big part of the functionality I’d be building this to get.)

A simple first pass at this would probably have only a rudimentary way to store and retrieve entries on the filesystem. The persistence layer could be abstracted a bit; I could see wanting to persist to a bdb instead of flat files. While I wouldn’t want to link the MySQL client libs into httpd, SQLite might be a good candidate, too. Authentication, URL mappings, load balancing, proxying, and whatnot would all be handled by standard httpd modules and configurations.
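As a rough sketch of what "abstracted a bit" could mean, the persistence layer might be a small struct of function pointers with a flat-file backend behind it. Everything here is an assumption made for illustration: the names `atom_store` and `atom_file_store`, the one-file-per-entry layout, and the `.atom` suffix. It also uses plain malloc rather than httpd's pools to stay self-contained.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical interface: a backend is two function pointers plus its state.
 * A bdb or SQLite backend would supply different put/get functions. */
typedef struct atom_store {
    int   (*put)(struct atom_store *s, const char *id, const char *xml);
    char *(*get)(struct atom_store *s, const char *id); /* caller frees */
    const char *root; /* flat-file backend: directory holding the entries */
} atom_store;

/* Build "<root>/<id>.atom", rejecting ids that could escape the directory. */
static int entry_path(const atom_store *s, const char *id, char *buf, size_t len)
{
    if (id[0] == '\0' || strchr(id, '/') != NULL || strstr(id, "..") != NULL)
        return -1;
    int n = snprintf(buf, len, "%s/%s.atom", s->root, id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write one entry document to its file; returns 0 on success, -1 on error. */
static int file_put(atom_store *s, const char *id, const char *xml)
{
    char path[1024];
    if (entry_path(s, id, path, sizeof path) != 0)
        return -1;
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t len = strlen(xml);
    int ok = fwrite(xml, 1, len, f) == len;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read one entry into a NUL-terminated heap buffer, or NULL if unavailable. */
static char *file_get(atom_store *s, const char *id)
{
    char path[1024], chunk[4096];
    if (entry_path(s, id, path, sizeof path) != 0)
        return NULL;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    char *data = NULL;
    size_t used = 0, cap = 0, n;
    while ((n = fread(chunk, 1, sizeof chunk, f)) > 0) {
        if (used + n + 1 > cap) {
            cap = (used + n + 1) * 2;
            char *tmp = realloc(data, cap);
            if (tmp == NULL) { free(data); fclose(f); return NULL; }
            data = tmp;
        }
        memcpy(data + used, chunk, n);
        used += n;
    }
    fclose(f);
    if (data == NULL && (data = malloc(1)) == NULL)
        return NULL;
    data[used] = '\0';
    return data;
}

/* Construct a store backed by flat files under the given directory. */
atom_store atom_file_store(const char *root)
{
    atom_store s = { file_put, file_get, root };
    return s;
}
```

A caller would obtain a store with `atom_file_store("/some/dir")` and then go through `s.put(&s, id, xml)` and `s.get(&s, id)` only, so swapping in another backend would not touch the request-handling code. The id check is a nod to the security concern above: entry ids arrive from third parties and must not be allowed to name paths outside the store directory.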

Please speak up now if you can see a reason why creating a simple “Atom Store” as an Apache httpd module would be a bad idea.

17 Responses to “Mod Atom?”

  1. Stefan Tilkov Says:

    On the contrary, I think it’s an *excellent* idea. It would probably be nice to have pluggable back-ends; IIRC this is the way the WebDAV module and SVN work together (but I may be wrong).

  2. Nick Lothian Says:

    “One more advantage: I’m in the quiet minority I think, but the whole “open source” Java thing has left me with a pretty bad feeling. A stack consisting entirely (L)GPL, Apache, and BSD licensed code makes me sleep better.”

    Could you elaborate on that a little? Since they decided to GPL Java isn’t that an argument _for_ Java?

  3. Jilles Says:

    It’s not a bad idea, but as you said: apache abdera is shipping working code now. Probably the php crowd is going to need some level of non-java support for atom pub and that’s where this could be useful.

    As for tools, it seems you tried to go for the “let’s get everything I could possibly need” approach, which predictably results in a huge stack of software that would take you years to master. My advice: keep it simple. Just eclipse (no WTP) and a good xml editor (i.e. a schema validating one) should cover most of your needs. Also it will force you to rely less on stupid wizards and more on what you actually know about what these wizards should be doing.

    As for infrastructure and technology. It seems to me the main application of atom pub will not be to replace webdav but to be a standard interface to CMS infrastructure. It so happens that there is a bunch of nice CMS related Java technologies. It wouldn’t surprise me if there was already some working code combining Abdera and Jackrabbit (both from apache), for example. Jackrabbit is the reference implementation for JSR 170 which standardizes an API for content repositories. Also apache has the httpclient project which I can tell from experience is pretty damn good to have around when doing any advanced http stuff in Java.

    The OSS stuff has been commented on already, it seems your needs are being addressed.

  4. Garrett Rooney Says:

    One of my original goals in helping out with Abdera was to do something like this, although obviously that means getting a C or C++ port up and running, since embedding Java code in an Apache module is basically insane. Still haven’t gotten around to doing the port though, which is why the mod_atompub idea hasn’t gotten off the ground. If you are looking to do something like this, and would be willing to start with a C or C++ implementation of Abdera, we’d love to see you on the mailing lists…

  5. DeWitt Clinton Says:

    Garrett,

    I agree, a C/C++ port of Abdera could be a wonderful thing. However, wouldn’t it be difficult at this time to keep a port in sync with the rapidly changing/improving Java codebase? It might be easier once the APIs settle down, perhaps once Abdera 1.0 starts hitting release candidates.

    I lurk a bit on the Abdera list… Communication has been impressive with the project overall.

  6. Garrett Rooney Says:

    I don’t know, I imagine that the APIs will be slightly different between a C/C++ port and a Java port anyway. The big thing that I would like to see any port maintain is the mechanism for handling extensions, that’s the part of Abdera that I really think is important, and is also the part that requires the most work to get right when designing the parser implementation.

  7. DeWitt Clinton Says:

    I guess it is also a question of scope. I simply want a fast Atom store. I.e., a place that I can stick arbitrary well-formed Atom documents.

    If there was a C library version of Abdera, then writing mod_atom would be trivial.

    But for my particular needs, implementing the full model that Abdera has is probably overkill.

    That’s not to say that a C/C++ Abdera is a bad idea, because again, I think it is a great idea. Just that it is beyond the scope of what I have as a requirement.

    BTW, originally I was just going to use Blogger as my Atom store. It already exists, it is fast, and it has nearly everything I need. Except for the one thing I wanted most for the little project I have in mind: the Atom Threading Extension.

  8. DeWitt Clinton Says:

    Hmm, the more I think about it, the more a libabdera makes sense. Particularly a port of the classes under org.apache.abdera.core.*. Even just having the model and parser classes exposed would greatly benefit a number of use cases.

  9. Garrett Rooney Says:

    Yeah, just the model and parser portions of Abdera are more than enough to get you started on the road to some particularly cool applications. Now if only someone would find the time to code it… ;-)

  10. James Snell Says:

    Ooh ooh ohh! Just what I like to see in my reader first thing in the morning! I, of course, would absolutely love to see a libabdera. Whether or not it’s an exact port of the features in the Java impl, I think, is not as important as just getting started on something. I have little doubt that the APIs are going to be different. That’s perfectly fine with me.

  11. DeWitt Clinton Says:

    James, glad to hear you are so enthusiastic about it!

    It sounds like you’re still working on the final filter API, but how stable do you feel the model.* and parser.* APIs are? I wouldn’t want to call it “libabdera” unless they stay in sync.

  12. Garrett Rooney Says:

    I’d say that once you’ve got something basic up and running, post it to the Abdera dev list, and we can talk about getting it into the tree as an official port. I know a number of people who have expressed interest in helping out with a C/C++ port, and I expect that an initial implementation to start from would really kick them into high gear.

  13. James Snell Says:

    The model.* and parser.* APIs should be considered “pretty stable”. Other than some likely refactoring of the IRI interface once it gets moved out into its own project, I do not anticipate any changes to the core interfaces. The factory.* API is also pretty stable.

  14. links for 2006-12-11 at tecosystems Says:

    [...] DeWitt Clinton » Blog Archive » Mod Atom? “One more advantage: I’m in the quiet minority I think, but the whole “open source” Java thing has left me with a pretty bad feeling.” - interesting, i wonder why? be sure to check out the comments for this entry, however (tags: Java opensource Clinton Abdera APP Atom mod_atom Apache httpd) [...]

  15. Danny Says:

    Big +1 to mod_atom and/or libabdera!

    Incidentally, on the extensions issue, I wouldn’t worry too much. Pseudo-core things like threading could be built in, and arbitrary extensions should be relatively easy to support as (content) payload.

    O’Reilly’s CodeZoo thing accepts chunks of RDF/XML payload describing projects (using DOAP). But there’s no reason the subject of the descriptive RDF shouldn’t be an entity in the feed (feed, entry, person or whatever). Hmm, not sure how you’d tell a reader not to display it…ok, Plan B - use microformats :-)

  16. Evan Miller Says:

    You might think I’m crazy for suggesting this, but you should consider making a module for Nginx instead of Apache. Nginx doesn’t have much in the way of documentation in English, but it’s very fast, very memory efficient, and all the illicit, cost-conscious Russian porn/mp3 sites use it (I’m told). It’s written by the same guy who made mod_deflate for Apache 1.3. There are already several modules that come with Nginx, including versions of mod_rewrite, mod_proxy, and mod_perl (!). But the best part is that it’s honest-to-God the cleanest code I’ve ever seen, and its module API is easier (for me) to understand than Apache’s. Anyway, check it out, if you’re curious:

    http://sysoev.ru/en/

    http://nginx.net/

    It’s still in active development (two releases came out just today), and the author’s commitment to bug-fixing is admirable. I think this little web server is going places. I hope it is, at any rate.

  17. Asbjørn Ulsberg Says:

    mod_atom (aka mod_atompub aka mod_app) sounds like a marvellous idea! I’d love to see a native httpd implementation of my favorite publishing framework, because that makes it usable in everything that runs on top of httpd, like PHP, Python, etc.

    I can’t think of any reason not to do this, so go ahead — make my day! :)