
Perhaps surprisingly, the task of choosing the best programming language in which to develop a new software project remains non-trivial. One might presume that, as feature-rich high-level languages such as C++, Java, and C# have matured, one among them would have emerged as the clear winner for the vast majority of enterprise needs.
On the contrary: the relative maturity of so many viable options makes it even more difficult to choose. The upside is that, due to the explosion of software development communities (most notably the ascendancy of open source communities), it is now harder than ever to make an outright horrible choice. But a language still needs to be picked, and these notes may help with that decision.
First, one needs to define what “enterprise” even means in this context. For these purposes, enterprise software is software that is developed and deployed by a company for long-term use in-house. (Shrink-wrapped and/or ISV efforts have additional distinct development requirements.) The software’s lifetime should be indefinite, and will ideally last longer than any one development or management team. The software will almost certainly be built and maintained by more than one individual. And the software should be considered server-side (either batch, command-line, or accessed via a web/socket interface). But mostly, the thing that all enterprise software development projects have in common is that they require an investment of time and money to build, and that the business in some way depends on them.
Second, it should be stressed that there is no one “best” language. Not just in the general sense — but in the specific sense that a healthy enterprise environment will be heterogeneous and will incorporate many different programming languages, frameworks, toolkits, and architectures. Not only do individual jobs have unique requirements, but exploring multiple disparate avenues will ultimately keep an organization flexible and adaptable, avoiding the problem of the monolith (and the subsequent rebuilding of the monolith).
So what choices are out there for enterprise programming?
Java
Sun may have originally intended for Java to be the next “Operating System” (thus wresting the desktop out from under Microsoft), but the language really found its wings as a server-side alternative to C and C++. A straightforward object oriented syntax, a very rich set of libraries, a clean and capable threading model, managed memory, and a portable interpreted runtime all made for an appealing choice. Companies (and their software architects) that were hoping to get off of legacy and outdated systems loved Java simply because it was different and it would force a fresh start. The Java hype of the late 90s pumped out tens of thousands of capable engineers with the basics in hand, and just about every experienced engineer started learning the language on the side. And while Java on the client still hasn’t fully materialized (a few notable exceptions and cell phone platforms aside), Java on the server is now among the leading candidates for an enterprise implementation.
But many of the strengths of Java were also weaknesses. The relative simplicity of Java (a language that long eschewed features such as parametric polymorphism, multiple inheritance, a macro pre-processor, and even enumerated types) made for a language that was easier to learn, but harder to program extremely well in than languages such as C++. And the most significant features, such as managed memory, strong typing, and interpreted bytecode, came at a cost in terms of run-time performance. While JIT and native compilers (such as gcj) have come a very long way in bridging the performance gap for certain applications, there remains a healthy amount of padding keeping the driver from the metal. That padding may make the ride much easier for the developer 90% of the time, but it gets in the way in certain specific scenarios.
In particular, the developer’s inability to conveniently (and in the case of type-checking, ever) disable the managed aspects of Java makes the language fall down in scenarios in which resources are constrained. Now, this might initially sound like all enterprise scenarios. In practice however, it turns out that most enterprise back-end processes are actually laterally scalable. This means that nearly all tasks can be run on an arbitrarily large number of independent systems. And even if memory were a constrained resource, the ability to simply distribute the tasks across an effectively infinite amount of RAM goes a tremendously long way toward mitigating the limitations of managed memory. (And this is increasingly true as low-cost Linux/x86 continues to revolutionize enterprise computing.)
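As a rough illustration of what lateral scalability means in practice, the following minimal sketch assigns independent tasks to a pool of machines by hashing a task key. The class name, method names, and the "invoice" keys are hypothetical and chosen only for this example; a real deployment would also need a dispatcher, failure handling, and rebalancing, none of which is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: all names here are illustrative assumptions.
class LateralPartitioner {

    // Map a task key to one of nodeCount independent machines.
    static int nodeFor(String taskKey, int nodeCount) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(taskKey.hashCode(), nodeCount);
    }

    // Split a batch of task keys into one work list per node.
    static List<List<String>> partition(List<String> taskKeys, int nodeCount) {
        List<List<String>> perNode = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            perNode.add(new ArrayList<>());
        }
        for (String key : taskKeys) {
            perNode.get(nodeFor(key, nodeCount)).add(key);
        }
        return perNode;
    }

    public static void main(String[] args) {
        List<String> batch = List.of("invoice-1", "invoice-2", "invoice-3", "invoice-4");
        List<List<String>> plan = partition(batch, 3);
        for (int node = 0; node < plan.size(); node++) {
            System.out.println("node " + node + " -> " + plan.get(node));
        }
    }
}
```

Because no task depends on another, adding capacity is a matter of raising the node count, and each machine only has to hold its own share of the work in memory.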
Undoubtedly, Java underperforms compiled code for many jobs. While JIT technologies eliminate this difference for many number-crunching and tightly looped routines, the very existence of a run-time type system incurs an unavoidable overhead (as does the design of some of the native class libraries, such as the immutable String class). These were for the most part all good and smart decisions on Java’s part, as they make for significantly more stable code. And in fact, since most real-world enterprise applications are IO bound (not CPU bound), the run-time performance differences are never going to be noticed in the first place.
Thus for the types of problems that can be scaled laterally to an arbitrary degree and are largely IO bound, Java is most likely one of the best choices for an enterprise programming project. Java is still actively being improved (1.5 in particular addresses a number of concerns listed above), and has one of the richest sets of development tools, such as Eclipse, and third-party libraries such as Apache’s Jakarta project, Ant, and JUnit. Also, the servlet architecture is perhaps the best balance of performance vs. development simplicity for web-services based applications.
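To make the remark about the immutable String class concrete, here is a small sketch of the trade-off. The class and method names are hypothetical. The first method relies on repeated concatenation, which conceptually produces a new String object on every pass because an existing String can never be changed; the second appends into a single mutable StringBuilder buffer. How much the difference matters in practice depends on the compiler, the JVM, and the workload, so no timings are claimed here.

```java
// Hypothetical sketch: class and method names are illustrative assumptions.
class StringOverheadSketch {

    // Each += yields a new String, since String objects are immutable.
    static String joinWithConcat(String[] parts) {
        String out = "";
        for (String part : parts) {
            out += part;
        }
        return out;
    }

    // StringBuilder appends into one mutable buffer and copies once at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder buffer = new StringBuilder();
        for (String part : parts) {
            buffer.append(part);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"enter", "prise"};
        System.out.println(joinWithConcat(parts));
        System.out.println(joinWithBuilder(parts));

        // The safety side of the trade-off: the original value is never altered.
        String original = "ab";
        String extended = original.concat("c");
        System.out.println(original + " / " + extended);
    }
}
```

Both methods return the same text; the cost of immutability shows up only in the intermediate objects, while the benefit is that a String handed to other code cannot be modified behind its back.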
That being said, there are still tasks that ask for more than Java alone can provide.
C++
C++ is without a doubt one of the most complicated and potentially painful languages a developer can ever learn. The language not only gives you enough rope to hang yourself, but the inventor of the language seems to have taken an almost perverse glee in escorting you up the scaffolding with a handshake and a smile. From operator overloading to templates to virtual inheritance to highly flexible freestore management, the language gives you essentially everything you need to do things right, or very, very wrong. Yet strangely, no other language is quite as satisfying to write. (I suspect taming wild tigers offers similar job satisfaction.)
In the hands of a competent programmer, a C++ application can be as fast and as efficient as any code in the world. An architect that maps out lightweight and shallow object models and employs a robust component-oriented design can achieve a high-level architecture that scales to dozens, even hundreds, of developers. And yet at the same time they can fine-tune the critical sections to tease out the poor performing bits, and have precise (enough) control over the memory that is allocated.
While it is true that C++ developers tend to over-engineer and spend more time fighting the compiler and the development environments than doing actual programming, if the target architecture is well understood and the underlying platform is mature, then C++ is a legitimate option for an enterprise project. In fact, if an application cannot be scaled laterally and must fit on a fixed number of large boxes in a fixed amount of memory, and still must accommodate dozens of developers, then sometimes C++ is the only option.
The downside, of course, is that good C++ programmers are harder and harder to find. Many of them have already switched to Java or C#, and fewer and fewer young engineers are picking up the skills. The learning curve of C++ is probably three times as steep as either other language — and while the payoff is still there today, I would hesitate to bet on that holding true for another 10 years.
C#
Looking at the language itself, C# is quite likely the best enterprise programming language available. C# lives up to the promise of providing the safety and elegance of Java with the power and control of C++. It is the synthesis of the best parts of so many different languages, yet it feels organic and familiar. The supporting class libraries are good (and growing) and the C# community is already very strong. The ability to switch into unsafe code when necessary addresses the biggest performance limitations of Java. (And for what it’s worth, I briefly sat across the hall from Anders Hejlsberg as C# and the .NET runtime were being developed and I was in awe of what they were doing. For better or for worse, he moved to another part of the building before I could pledge eternal fealty. On the bright side, this probably saved my career.)
Unfortunately, C# is really only supported on Microsoft operating systems. While version 1.0 of the language itself is an ECMA standard, much of the richness of any language is in the associated libraries, and those have only been cloned to a certain degree. While Mono is one of the most impressive open source development projects in existence (after Linux and Mozilla), one would be hard pressed to bet one’s company on something that is both as yet unproven and so much at the discretion of Microsoft.
However, C# on Windows NT is a hard case to make for the enterprise. Granted, the Dev Studio tools are among the very best (though not specifically so for server-side programming). But NT itself is expensive and the management costs are prohibitive, to say the least. MS has never really demonstrated that it “gets” enterprise programming, offering one half-finished framework after another, never realizing that, at the end of the day, the sweet spot is not in making more management tools for more complicated systems, it is in a) making systems that require less management, and b) making it easier for the operators to build out solutions on their own tools. (Microsoft has missed the most important lesson that Linux has to offer, one they could learn simply by watching what people have built for themselves on top of open source platforms, not just by listening to what the CIOs are saying they’ll pay for.) It is no wonder that the Googles and Amazons and Yahoos of the world are using Linuxes and BSDs.
That said, a C#/Mono/Linux enterprise project is a real temptation, and is definitely an area to watch in upcoming years. Perhaps this path is viable on a project that is not strictly mission critical — it is something that merits more exploration. But such a project would be coming at a risk — Microsoft is under no obligation to submit further language improvements (of which there could be many) as standards, and one would have to be incredibly optimistic to take Microsoft’s future goodwill on faith alone.
C
C, the most influential programming language of all time, remains to this day an indispensable tool for the enterprise developer. Second to no other language in terms of raw performance (and in many people’s opinions, elegance), C is the language of choice when speed matters most.
Additionally, nearly every third-party library in the world offers C bindings, and the existence of the GNU C compiler and software configuration tools makes it the most portable language choice of all time. Moreover, C is probably the most “fun” (subjectively speaking) language to write, in that nothing in the language feels awkward, contrived, or without reason. Things in C just make sense.
But C code also takes a long time to write. Simple ideas that require a dozen lines to express in an OO language like C++ or Java often require 100 in C. Writing modular C code is certainly possible (witness Apache HTTPD for example), but only the best developers do that by habit. Worse, C projects seem to suffer greatly when engineered by anything other than the “benevolent dictator” model, something that doesn’t usually fly in the enterprise.
Perl
Probably the most unlikely candidate in history for a real enterprise programming language, Perl has turned its eclectic and pathological rubbish into the language of choice for a huge number of serious programmers. One can spend whole afternoons just browsing through the libraries available on CPAN. In fact, when it comes to the esoteric and obscure, no one language covers more ground in less time than Perl. In spite of all the limitations associated with an interpreted language (and in this case, one that can’t even be reliably compiled down at all), this little language, the one with an almost laughably idiosyncratic syntax, has essentially defined the Web 1.0 revolution. Everyone has brushed up against Perl in some form or another, and some people even got to be quite good at it.
In fact, somewhere along the way, a few people learned how to write Perl so well that you could actually build reasonable enterprise applications with it. Projects such as Mason and mod_perl have helped take what was once a glorified sed/awk/sh scripting language and turn it into a legitimate server-side, modular, OOP environment.
However, to this day there still hasn’t been a Perl IDE or an interactive visual debugger. (While those are looked down on by some talented engineers, they are still essential resources for a large number of enterprise-quality developers who know how to make the most of them.) And Perl’s interpreted and “type-squishy” runtime is still fragile and prone to confusing breakage. But worse, one of Perl’s greatest blessings, its malleability, is a curse when no two developers in the same company can write Perl code that anyone else would recognize.
In the end, even in spite of Perl’s rapid development cycle, it is still probably best left for the front-end of web applications and for smaller one-or-two-person projects. The language is brittle — and while very good Perl can be written by a very good Perl programmer, even that same very good Perl code can appear foreign and hard to comprehend to someone else. (This is a hard point to take for some who sometimes consider themselves among the upper echelons of Perl developers.) Also, Perl 6 is a great unknown, and could tip the scales in either direction.
Python
Python continues to surprise with its resilience and growth. A tight, clean language with an ever-growing library, what once was used exclusively for system administration scripts keeps pushing its way toward server-side applications. Much in the same way Perl exploded into a full-fledged enterprise language in the era of CGI-based programming, Python keeps pushing its way upward into bigger and better things. While it is still best left to the back-end maintenance side of things and the one-off CGI script, Python has a viable shot at taking the crown if Perl 6 misses the mark.
Other: Ruby, Lisp, Fortran, Eiffel, Smalltalk, ML, Visual Basic, Pascal, Objective C, PHP, Tcl…
Certainly, other languages have their places and bear consideration in certain circumstances. Ruby is a clean language and is carving out a niche for itself in the same way Python has. Lisp has long been the province of AI researchers, and Pascal the domain of undergraduate programming courses (even spawning Delphi). Eiffel and ML are academically rigorous and support advanced (if more theoretical, rather than strictly practical) feature sets. Objective C is alive and well on Mac OS X. PHP is a leading language for front-end web development. Fortran and Smalltalk have deep legacies and still inform many of the decisions we make today. And still others, such as VB and Tcl, have, almost embarrassingly, crept into enterprise systems.
But for all of their differences, each of these languages shares a common shortcoming that prohibits its serious consideration for an enterprise project: its relative obscurity. While each may flourish in one or two particular ecosystems, none of these comes close to the breadth of scope that C++ or Java has in modern software development shops. There is absolutely nothing wrong with an individual developer or researcher deciding that they feel like learning a bit of Ruby and deciding to implement a small project in it. But be seriously concerned if your CTO is pushing a niche language, such as ML, or a wholly inappropriate one, such as VB, as your new company standard. Pet languages of an individual are a fine thing (perhaps even something that should be encouraged), but betting your company on a language that few people know or care about is technical suicide.
Conclusions:
While there are certainly a number of viable candidates for enterprise development (and the list appears to be growing over time rather than thinning out), there still are only a few leading contenders. Fortunately, deciding between Java and C++ comes down to essentially one variable: memory. Or more particularly, can the application be split across an arbitrary number of machines, thus making memory concerns less consequential? And, in the event the application will be CPU-bound, can it benefit from fine-grained optimization of a particular critical section?
If the application is indeed that performance and memory sensitive — and keep in mind that latency in a C++ application is not necessarily better than in a Java application — then C++ is maybe the only choice. For other tasks, Java is probably going to give you more for the development investment.
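The rule of thumb above can be written down as a tiny decision function. This is only one possible reading of the argument, not a definitive procedure: the class, enum, and parameter names are hypothetical, and the choice to treat either condition on its own (no lateral scaling, or a CPU-bound job that would benefit from hand-tuning a critical section) as enough to point toward C++ is an assumption made for this sketch.

```java
// Hypothetical sketch of the Java-versus-C++ rule of thumb; all names are assumptions.
class LanguageChoiceSketch {

    enum Choice { JAVA, CPP }

    static Choice choose(boolean scalesLaterally,
                         boolean cpuBound,
                         boolean benefitsFromTunedCriticalSection) {
        // If the work cannot be spread across machines, memory is a hard limit.
        boolean memoryConstrained = !scalesLaterally;
        // CPU-bound work only argues for C++ when hand-tuning would actually help.
        boolean performanceSensitive = cpuBound && benefitsFromTunedCriticalSection;
        // Assumption: either condition alone is treated as sufficient.
        return (memoryConstrained || performanceSensitive) ? Choice.CPP : Choice.JAVA;
    }

    public static void main(String[] args) {
        // A laterally scalable, IO-bound batch job.
        System.out.println(choose(true, false, false));
        // A job pinned to a fixed number of boxes with fixed memory.
        System.out.println(choose(false, false, false));
    }
}
```

Real decisions also weigh the factors discussed earlier, such as the skills of the available developers and the maturity of the target platform, which this sketch deliberately leaves out.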
But the most important thing to remember is the point mentioned right at the beginning — that no one language is going to be best for all projects in one company. In fact, one language may not even suffice for a single project. Heterogeneous environments are healthier and more flexible and provide ample opportunities for everything from Java, to C++, to C, to Python to be used at the appropriate time. But for the new development jobs, the ones that you expect will be shared by a team of engineers and maintained for years to come, consideration should be given to the handful of best choices with an eye to making things easier on those to follow.
