On permanence and identity


I observed something interesting while migrating my servers last week. In the process of the migration I changed the domain of my blog from www.unto.net to blog.unto.net. As a consequence, the permalink of each article also changed.

To the average reader or visitor this change was transparent. First, my syndicated content feed is proxied through a permanent URL at Feedburner. The result is that subscribers saw no interruption in service, although recent posts did get displayed once again in blog readers.

Second, standards-compliant clients, such as modern browsers and most blog readers, have no trouble accessing the content via the old URLs. The server on www.unto.net now issues a 301 Moved Permanently status code and a Location header. Clients silently follow this response header to the new URL and display the content without any problem.
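To make the mechanics concrete, here is a minimal sketch in Python of the decision the old server makes for each request. It is not my actual server configuration: the function name redirect_for is made up for illustration, and it assumes that only the hostname changed while paths and query strings stayed the same.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "www.unto.net"
NEW_HOST = "blog.unto.net"


def redirect_for(url):
    """Return (status, headers) for a requested URL, or None if no redirect applies.

    Assumption: only the hostname changed; path, query, and fragment carry over.
    """
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return None
    # Rebuild the same URL on the new host and hand it back in the Location header.
    target = urlunsplit(
        (parts.scheme, NEW_HOST, parts.path, parts.query, parts.fragment)
    )
    return 301, {"Location": target}
```

A client that honors the response simply requests the URL in the Location header instead. The path in a call such as redirect_for("http://www.unto.net/some/post") is a placeholder, not a real permalink.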

But certain other applications, most obviously search and blog tracking engines, seem to essentially ignore the 301 redirect. For example, all unto.net content will effectively drop out of the Google, Yahoo, MSN, etc., search indexes during their next refresh cycles. No external sites have linked to the new URLs, thus the inbound link ranking for the "new" pages will be negligible. The search engines are unlikely to credit the targets of the 301 redirect with the rank karma of the original pages.

And engines that track identity, such as Technorati, make no (obvious) attempt to associate feeds that have changed domains or URLs.

From the Technorati FAQ:

I've moved my blog from one server to another so it has a new URL. Can I transfer or consolidate my link count and ranking from my old URL to my new URL?

We are unable transfer or combine links from different URLs at this time. Links are URL-based and are unique citations to that blog at that time.

However, if you permanently redirect the old blog URL to the new blog URL by sending a permanent redirect response (HTTP Status 301) to anyone requesting the old URL, it will help consolidate your online blog presence for all web aggregators and help have your links reestablished eventually. Apache's mod_rewrite is a popular method of handling such requests.


While the Technorati approach adheres to a literal interpretation of "permalink," it feels suboptimal in terms of what one would expect of a blog search engine/identity tracker. Sites move all the time, often for reasons completely out of the control of the authors. Isn't it in everyone's best interest to attempt to preserve that historical data?

Which raises the following question: should 301 redirects be interpreted by search engines to mean "treat the new URL as an equivalent replacement for the old one"?

The HTTP specification appears to imply it, stating that "clients with link editing capabilities ought to automatically re-link references to the Request-URI to one or more of the new references returned by the server, where possible."
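As a rough sketch of what "re-linking" could look like on the indexing side, the Python below folds inbound link counts for an old URL into whatever URL its 301 redirects eventually point at. This is purely illustrative and does not describe how Google, Technorati, or any other engine actually works. The function names, the dictionary-based data model, and the sample numbers in the usage note are all assumptions of mine.

```python
def resolve(url, permanent_redirects, max_hops=10):
    """Follow a chain of known 301 redirects to the final URL.

    permanent_redirects maps an old URL to the URL it permanently redirects to.
    Stops on a loop or after max_hops hops so a bad chain cannot spin forever.
    """
    seen = set()
    while url in permanent_redirects and url not in seen and len(seen) < max_hops:
        seen.add(url)
        url = permanent_redirects[url]
    return url


def consolidate(inbound_links, permanent_redirects):
    """Sum inbound link counts under each URL's redirect target."""
    totals = {}
    for url, count in inbound_links.items():
        canonical = resolve(url, permanent_redirects)
        totals[canonical] = totals.get(canonical, 0) + count
    return totals
```

With made-up numbers: if an old permalink has 12 inbound links, the new one has 1, and the old URL is known to redirect permanently to the new one, consolidate credits the new URL with all 13 rather than leaving the old count orphaned.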

Not that it matters much to me personally, as this site is just a hobby. But for someone running a business, the concept of identity and redirects could matter a great deal.

What do people think about this? Should HTTP redirects preserve identity?

Please answer in the comments below.