Hashing and Caching, Redux

DeWitt Clinton
February 2005

A few months ago, after SHA-0 and MD5 were demonstrated to be less cryptographically secure than designed, I wrote an article on how the integrity of hashing functions affects caching and e-commerce. I concluded that article with:

Cache::Cache currently uses SHA-1. I switched it off of MD5 a few years ago because of a bug with the Digest::MD5 implementation, but was happy to get the extra 32 bits of keyspace. However, since the chance of an accidental collision is no higher than before (i.e., astronomically low), and it's unlikely that SHA-1 has been reduced to a problem anywhere near the order of 2^48 (we'll wait and see), I see no reason to replace the current implementation at this time. There are other applications in which this will likely have an impact, but generating the cache key is not one of them.
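To put a number on "astronomically low", here is a rough sketch of the standard birthday-bound approximation for accidental collisions. It assumes the hash output is uniformly distributed, and the function name and the example key count are illustrative choices of mine, not anything from Cache::Cache.

```python
import math


def accidental_collision_probability(num_keys: int, hash_bits: int) -> float:
    """Approximate the chance that any two of num_keys distinct keys collide.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / 2^(bits+1)),
    which assumes the hash behaves like a uniform random function.
    """
    exponent = -num_keys * (num_keys - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is far below float epsilon.
    return -math.expm1(exponent)


# Illustrative workload (an assumption): one billion distinct cache keys.
keys = 10 ** 9
p_md5 = accidental_collision_probability(keys, 128)   # 128-bit digest
p_sha1 = accidental_collision_probability(keys, 160)  # 160-bit digest
print(f"MD5:   {p_md5:.3e}")
print(f"SHA-1: {p_sha1:.3e}")
```

Under this approximation, the extra 32 bits make an accidental collision roughly 2^32 times less likely for the same number of keys, and neither figure is affected by the published attacks.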

Today, as is being covered on Boing Boing and Bruce Schneier's blog, it appears that the SHA-1 hashing algorithm may be flawed as well. A Chinese research team claims to be able to find collisions in 2^69 operations, which, while not as bad as the 2^48 of SHA-0, definitely invalidates SHA-1 as a one-way hashing technique for digital signatures and secure password storage.
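For a sense of scale, the arithmetic can be sketched as follows. It assumes the usual birthday bound, under which a generic collision search on an n-bit hash takes about 2^(n/2) operations; the variable names are mine.

```python
SHA1_BITS = 160

# Generic birthday-bound collision search on a 160-bit hash: about 2^80 operations.
generic_collision_work = 2 ** (SHA1_BITS // 2)

# Work factor claimed for the reported attack.
reported_attack_work = 2 ** 69

# How many times cheaper the reported attack is than the generic search.
speedup = generic_collision_work // reported_attack_work
print(f"Reported attack is about {speedup}x cheaper than the generic search")
```

That is a factor of 2^11, which still leaves 2^69 operations of work, but it is a large enough reduction to matter for applications that depend on collision resistance.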

As for what this means for Cache::Cache, I believe it is still okay. The SHA-1 attack requires access to the hash string itself, which the Cache::Cache API does not expose directly. Without the hash itself, the chance of a random collision is still infinitesimally small. Anyone attacking code that uses Cache::Cache would need access to the cache data directory, and if they had that, the hash would almost certainly not be the weakest point in the system. Moving away from SHA-1 would also require invalidating existing caches, which would be difficult for many users. However, if you are particularly vulnerable to this attack, please let me know and I will write a custom patch for your version of the library.
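To illustrate why the digest stays internal, here is a minimal sketch of how a file-backed cache might turn a caller's key into a storage path. This is not Cache::Cache's actual code (which is Perl); the function name, the namespace argument, and the two-character fan-out directory are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def cache_path_for_key(cache_root: str, namespace: str, key: str) -> Path:
    """Map a caller-supplied cache key to a file path (hypothetical layout)."""
    # The caller only ever handles `key`; the digest is an internal detail.
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    # Fan out on the first two hex characters to keep directories small
    # (an assumed layout, chosen only for illustration).
    return Path(cache_root) / namespace / digest[:2] / digest


path = cache_path_for_key("/tmp/cache", "Default", "user:42:profile")
print(path)
```

In a design like this, the digest only shows up as a file name under the cache root, so learning it requires read access to the data directory. Changing the hash function changes every path, which is why a switch would invalidate existing caches.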