Migrating From Blosxom To WordPress


I know I've spoken well in the past about Blosxom, Rael Dornfest's lightweight weblog application. There is a lot to like about Blosxom, and I stand by the reasons that I gave before for picking it in the first place. It is open source, database independent, and easy to use, install, and customize. I wrote a plugin or two and just generally enjoyed using the software.

In fact, I just now made a donation to Rael for all his hard work. (More open source software developers should have a donation button like that!)

So considering all of that, why did I move off of Blosxom?

Reluctantly, in fact. I was downloading and trying out a number of different weblog tools for a project at work, and I found a lot to like about a handful in particular. While Blosxom gets a number of things exactly right, it is quite clearly a project of organic development -- it supports plugins, but the script itself is (to be honest) not designed for extensibility or modifications. The community around Blosxom is huge considering the nature of the project, but it is not (nor was it designed to be) the most feature-rich weblog application. Blosxom is ideal for a hacker -- for someone who wants to poke around with the internals and know exactly what is going on -- and in many ways it is the absolute best tool to use if you really want to get to know how things work (like I do). In fact, I'd recommend Blosxom over all others as a first weblog application if you are serious about learning the technology behind how things work. But Blosxom wasn't designed to be the uber-blog -- and the personal syndication space is evolving faster than one small project can possibly keep up.

WordPress stood out as a first-class alternative. It met the criteria that I outlined initially, and had more features than I knew what to do with. So I downloaded a copy onto my local workstation and tried to set up a demo instance. While WordPress claims a 5-minute installation, that's only true if you already happen to have all of the dependencies correctly configured. If you need to install Apache, PHP, MySQL, etc., as well, then you are looking at more of a one- or two-hour installation, which is still impressive, all things considered.

The real challenge came in importing my old data. WordPress offers a half-dozen scripts for migrating data out of other weblog applications, but none specifically for Blosxom. One developer wrote a Blosxom to WordPress exporter that relied on custom flavors (templates) to output a format that WordPress can import, but after trying it, I realized that cleaning up the data (particularly the whitespace, newlines, and metadata) would take far longer than I could afford.

So I backed up a bit and decided that the best way to export the Blosxom data would be to write my own script that walked the directory structure and generated an RSS 2.0 feed that could later be imported by the existing RSS importer. While I was at it, I had the script strip out redundant whitespace from my blog entries, as well as validate the HTML as XHTML (something that I had been meaning to get around to for a while). If you are interested, you can download the script here:

blosxom_to_wp_import
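
For a sense of what that kind of export involves, here is a minimal Python sketch of the same idea. It is not the script linked above, just an illustration under a few assumptions: it relies on the Blosxom convention that the first line of an entry is the title and the rest is the body, it uses the file's modification time as the post date, and it skips the whitespace cleanup and XHTML validation entirely. The function names (find_entries, entry_to_item, build_feed) and the use of a name element to carry the filename are my own choices for the example.

```python
import os
import stat
from email.utils import formatdate
from xml.sax.saxutils import escape


def find_entries(data_dir):
    """Yield (path, category, slug) for each world-readable .txt entry."""
    for root, _dirs, files in os.walk(data_dir):
        for filename in sorted(files):
            if not filename.endswith(".txt"):
                continue
            path = os.path.join(root, filename)
            if not os.stat(path).st_mode & stat.S_IROTH:
                continue  # not world-readable, so treat it as unpublished
            # The directory below data_dir doubles as the category.
            category = os.path.relpath(root, data_dir).replace(os.sep, "/")
            if category == ".":
                category = ""
            yield path, category, filename[: -len(".txt")]


def entry_to_item(path, category, slug):
    """Turn one entry file into an RSS <item> string."""
    with open(path, encoding="utf-8") as handle:
        title = handle.readline().strip()  # assumed: first line is the title
        body = handle.read().strip()       # assumed: the rest is the body
    lines = [
        "<item>",
        f"<title>{escape(title)}</title>",
        # Assumption: a <name> element carries the filename for permalinks.
        f"<name>{escape(slug)}</name>",
        f"<pubDate>{formatdate(os.path.getmtime(path), usegmt=True)}</pubDate>",
    ]
    if category:
        lines.append(f"<category>{escape(category)}</category>")
    lines.append(f"<description>{escape(body)}</description>")
    lines.append("</item>")
    return "\n".join(lines) + "\n"


def build_feed(data_dir, title="Exported weblog", link="http://example.com/"):
    """Return an RSS 2.0 document covering every entry under data_dir."""
    items = "".join(entry_to_item(*entry) for entry in find_entries(data_dir))
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<rss version="2.0">\n<channel>\n'
        f"<title>{escape(title)}</title>\n"
        f"<link>{escape(link)}</link>\n"
        "<description>Blosxom export</description>\n"
        f"{items}"
        "</channel>\n</rss>\n"
    )
```

Calling build_feed with the path to a data directory returns the feed as a string, which could then be written out to a file. Again, this is only an illustration; the downloadable script above is the one I actually used.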

Just invoke the downloaded script with the path to your Blosxom data dir and it will recurse down your directory hierarchy finding world-readable entries that end in ".txt" (the ones that appear on your site), and then spit out XML that you can subsequently place in your wp-admin folder as a file called "import.rss". Then edit the import-rss.php script and make the following changes:

define('RSSFILE', 'import.rss');

And modify lines 91 and 92 to read:

preg_match('|(.*?)|is', $post, $name);
$post_name = $name[1];

That last change allows you to set your WordPress permanent links to be the same as your Blosxom text filenames. With a little Apache mod_rewrite or mod_alias magic, you should be able to keep your old links working. My redirect line (using mod_alias) looks like:

RedirectMatch permanent /unto/(.*?)\.(html|print|writeback) /unto/$1/

And my WordPress permanent link format is:

/%category%/%postname%/

(This is probably not 100% the best way to do it, but it is working for now.)
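
To see what that redirect line does to an old link, here is a small Python sketch that mimics it. This is an approximation only: Python's re module stands in for Apache's regex matching, the function name redirect_target is made up for the example, and the sample paths in the note below are hypothetical.

```python
import re

# Same pattern as the RedirectMatch line: capture everything between
# "/unto/" and one of the old Blosxom flavour extensions.
OLD_URL = re.compile(r"/unto/(.*?)\.(html|print|writeback)")


def redirect_target(url_path):
    """Return the new-style path for an old Blosxom URL, or None if no match."""
    match = OLD_URL.search(url_path)
    if match is None:
        return None  # no redirect; the request is handled as usual
    # Like RedirectMatch, build the target from the first captured group.
    return match.expand(r"/unto/\1/")
```

With this, a path such as /unto/tech/hello.html maps to /unto/tech/hello/, which lines up with the /%category%/%postname%/ permalink format, while a path that is already in the new style is left alone. Note that the pattern is not anchored at the end, so it would also match a path that merely contains one of those extensions.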

WordPress has its share of tricky bits. You shouldn't feel like you can just install it and forget about it. In particular, the little gotchas included having to delete a non-existent empty user from the database (via the mysql prompt) in order to get comments working, a fair bit of twiddling to get the .htaccess model of permalinks working, and a fairly extensive (er, total) rewrite of the templates. But overall it was tolerable -- maybe one good weekend to migrate a medium-sized personal site.

And now for the fun part -- getting to use all of the features that WordPress provides. Already I'm impressed by the autopinging on new posts, the out-of-the-box support for all major syndication formats, the anti-comment-spam plugins, the third-party-tool integration (del.icio.us, Flickr, etc.), and the whole manageability aspect of it. WordPress is clearly a mature and growing piece of software. And since it saved me the not-insignificant trouble of writing my own (which I'm still going to do just for fun, but now that my goals have changed, I can do it in x86 asm like I always wanted), I can focus on getting the most out of other people's work.

I'll write more about the design of the new unto.net as soon as I have a minute. Until then, I'd love to hear your thoughts on WordPress and the migration -- and please let me know if anything is broken for you.