Mod_Perl
August 5th, 2004 by DeWitt Clinton

I haven’t used mod_perl in several years. At one point in time I was very involved with the project and the mailing lists, but that all went on hiatus when I joined Site59, which was built on AOLServer and Tcl (not my choice!), and then Travelocity, where I helped lead the (largely successful) effort to migrate to Java. So aside from the odd scripts here and there, I haven’t done any serious work in Perl for a few years.

At the new gig I found the occasion to get some code up and running quickly, and decided to turn to mod_perl once again. Partly because the application involves a fair bit of string parsing and partly because I’m a believer that Java and C++ are only worth writing if you have a full-on component architecture and the time to do it “correctly.” And partly because Perl code, even high-quality Perl code, is easy and fun to write. (I could replace Perl with PHP here, but I suppose it’s fair to say I’m already a first-rate Perl developer, and a bit of a neophyte with PHP.)

Installing mod_perl for the first time in four years turned out to be a breeze. Certainly easier than it was back in the day — I am very impressed with how mature it all is now. The documentation is impressive and clear, and the pieces just fit together on the first try. (Which could not have been further from the case back in 2000.)

I whipped up a test mod_perl application in about two hours using Apache 2.0.50 and mod_perl 1.99.14. I’m using GCC 3.2.3 (20030502) and Red Hat Enterprise Linux WS release 3 (Taroon Update 2) with the 2.4.21-15.0.3.ELsmp #1 SMP kernel. Everything compiled right out of the box, and this is definitely a first for me with mod_perl and Apache. Things hadn’t changed so much that it felt unfamiliar (we still use startup.pl and PerlRequire and all that…), but at the same time it felt more streamlined. I had been worried that the Apache 2.0 dependency would make things difficult, but I’ve been pleasantly surprised by the build process.

What floored me was the performance of the test app. On my development workstation (an overpowered beast: an HP box with a hyper-threaded Intel Pentium 4 running at 3.0 GHz) I ran “ab” locally for 10,000 requests with a concurrency of 20 and got 1663 requests per second. The mean time per request was 12 ms. Keep in mind that this is a) a local instance of ab, and b) a completely untuned Apache and a completely untuned mod_perl. And really, at those speeds tuning is hardly necessary.
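The two figures above are tied together by simple arithmetic: with a fixed number of requests in flight, the mean time per request is the concurrency divided by the throughput. A minimal sketch in Python (chosen only for illustration; the function names are hypothetical and nothing here is part of ab) that reproduces the 12 ms figure from the 1663 requests per second:

```python
def mean_time_per_request_ms(concurrency: int, requests_per_second: float) -> float:
    """Mean latency seen by each of the concurrent clients, in milliseconds."""
    return concurrency / requests_per_second * 1000.0


def total_run_seconds(total_requests: int, requests_per_second: float) -> float:
    """How long the whole benchmark run takes at the measured throughput."""
    return total_requests / requests_per_second


# Figures quoted in the post: 10,000 requests, concurrency 20, 1663 req/s.
latency_ms = mean_time_per_request_ms(20, 1663)
duration_s = total_run_seconds(10_000, 1663)
print(f"{latency_ms:.1f} ms per request, {duration_s:.1f} s for the full run")
```

So the whole 10,000-request run finished in roughly six seconds, and each of the 20 simulated clients waited about 12 ms per request.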

In fact, at speeds like that something else, something profound, is true: The overhead of the server and the application environment is negligible. In other words, you can remove the server architecture from the equation completely when calculating throughput. And considering I’m getting those kinds of numbers on a dirt-cheap (relatively speaking) development server, you can scale laterally to your heart’s (and budget’s) content.

If the server and interpreter overhead is negligible, that means that the only metric that you need to be concerned with is that of the application itself. I can’t express how different this is today than it was just 8 or 10 years ago. My gut tells me that this can be attributed partly to the process spawning model of Apache 2.0 and partly to various performance improvements with the Perl interpreter (though not to the same degree JIT has improved Java). But mostly it’s probably just the hyper-threaded 3.0 GHz CPU and fast RAM.
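To put a rough number on “negligible,” here is a back-of-the-envelope sketch in Python. It assumes a deliberately crude model: one box offers 1000 ms of capacity per second, the measured 1663 requests per second represents pure server and interpreter overhead, and machines scale linearly. The 50 ms of application work and the 1000 requests per second target are made-up values for illustration, not measurements from the test app.

```python
import math

BASELINE_RPS = 1663  # throughput of the near-empty test app, from the post


def overhead_ms_per_request(baseline_rps: float = BASELINE_RPS) -> float:
    """Capacity one request costs in Apache + mod_perl alone (crude model)."""
    return 1000.0 / baseline_rps


def overhead_share(app_ms: float, baseline_rps: float = BASELINE_RPS) -> float:
    """Fraction of per-request cost that is server overhead rather than app work."""
    overhead = overhead_ms_per_request(baseline_rps)
    return overhead / (overhead + app_ms)


def servers_needed(target_rps: float, app_ms: float,
                   baseline_rps: float = BASELINE_RPS) -> int:
    """Boxes required for a target throughput, assuming linear lateral scaling."""
    per_request_ms = overhead_ms_per_request(baseline_rps) + app_ms
    per_server_rps = 1000.0 / per_request_ms
    return math.ceil(target_rps / per_server_rps)


APP_MS = 50.0  # hypothetical application cost per request
print(f"overhead share: {overhead_share(APP_MS):.1%}")
print(f"servers for 1000 req/s: {servers_needed(1000, APP_MS)}")
```

Under those assumptions the server accounts for a little over one percent of each request, and removing it entirely would save only one machine out of about fifty. The application's own cost is what sets the hardware budget.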

This two-hour long experiment will probably have far-reaching and paradigm-shifting consequences on me as an architect. Time will tell, no pun intended.

As a long footnote, I think interpreted languages such as Perl and PHP can be phenomenal tools, even for large-scale, high-volume server development. The raw performance of both languages is up there with Java and C++. In fact, any of the major languages is frankly “fast enough” for most applications these days. 1 GHz CPUs were pretty much the great equalizers, as it’s all just I/O wait now. And when it’s not I/O wait, just buy the 2 GHz CPU. Or the 3 GHz CPU. Moore’s “law” has done away with performance as a criterion for most choices of programming language. The areas we need to concentrate on now are far more concerned with lateral scalability and proper algorithm design.*

An individual developer, or a small team, can be amazingly productive in Perl. And not just “script-like” Perl: developers can code quickly in interpreted languages even when the code is written with a full object model, component architecture, and test framework. And CPAN is bar none the largest and most diverse code repository, free or otherwise, for any language.

But the big reason not to choose Perl is the amount of rope it gives you. Every developer writes Perl differently; sometimes the differences are so great that you would hardly know that two bits of code did the same thing. There are few true development standards for Perl. Just read through the modules on CPAN if you want an example. Few (no?) Perl programmers have any formal training in the language. (Not that formal training in any language helps that much; experience and a dedication to simply doing a better job count far more.) No, the real reason not to choose Perl for big projects is not that the architecture or the performance doesn’t scale, but rather that the development process itself won’t scale without a more formally structured language.

I’ve worked on projects with dozens (wait — hundreds, wow) of developers all trying to code against the same libraries and the same code base. This was logistically hard enough in a compiled and rather inflexible environment such as C++ and Java. In an interpreted and unstructured environment like Perl or PHP (though PHP less so) it would be an impossibility. So Perl is great for small teams and/or rapid development, but it’s not really the right choice for rolling out to every one of your three hundred developers for a ten-year application life-span.

Anyway, in the case of the project I’m working on right now, I have the luxury of being able to code it myself, or work one-on-one with the other developers. The scope of the application in terms of volume, throughput, etc, will equal the largest projects I’ve done. But the scope of the development team will be a small fraction of that.

To be perfectly honest — I feel blessed by that.

* And as a footnote to the footnote, one of my colleagues was recently making the case that even NP-hard and other ugly problems with no known polynomial-time algorithm are not always that daunting any longer. We have the horsepower to compute a many-node instance of such a problem in a reasonable amount of time (i.e., web-request time). By proper partitioning of the problem space, modern hardware has taken computational challenges that were previously impractical to solve on the fly due to their algorithmic complexity and reduced them to mere number crunching.
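As a toy illustration of what “partitioning the problem space” can look like, here is a minimal Python sketch that brute-forces subset sum (a classic NP-hard problem) by cutting the space of candidate subsets into independent slices. The problem choice, the function names, and the sample numbers are all assumptions made for the example, not anything my colleague described. The slices are searched one after another here, but since they share no state, each could just as well be handed to a separate process or machine.

```python
def partition_ranges(total, parts):
    """Split range(total) into contiguous, non-overlapping (start, stop) slices."""
    step = -(-total // parts)  # ceiling division
    return [(start, min(start + step, total)) for start in range(0, total, step)]


def search_slice(weights, target, start, stop):
    """Brute-force one slice of the subset space; each subset is a bitmask."""
    for mask in range(start, stop):
        chosen = [w for i, w in enumerate(weights) if (mask >> i) & 1]
        if sum(chosen) == target:
            return chosen
    return None


def subset_sum(weights, target, parts=4):
    """Search all 2**n subsets, one independent slice at a time."""
    for start, stop in partition_ranges(1 << len(weights), parts):
        found = search_slice(weights, target, start, stop)
        if found is not None:
            return found
    return None


weights = [3, 34, 4, 12, 5, 2]
result = subset_sum(weights, 9)
print(result)
```

The total work is still exponential in the number of items; partitioning only spreads it across more hardware, which is exactly the “mere number crunching” trade being described.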
