A New Project, Part 10


sticky, part 10



I just wanted to cover a handful of miscellaneous details regarding the new project. Each of these has the possibility to be a bit contentious, so I'll try to outline my reasoning the best I can.

Source Control: Having a source control system implies that there will indeed be source code, so the mere fact that I'm ready to discuss it is a good sign. In fact, the discussion will be brief -- I've decided to go with Subversion. While the list of reasons for the choice is long, the major benefits of Subversion are that (a) it is free and open source, (b) it has a large, active community, (c) it bears enough similarity to CVS to be familiar to most people, (d) it is well documented, (e) it is feature rich, and (f) it is stable, having now just reached version 1.2. I actually wanted to go with BitKeeper, as the distributed repository model is pure genius, but the licensing issues scared me away. If this were a commercial endeavor I would seriously consider that route.

I've now installed a public instance of the Subversion server over at svn.unto.net. I'm using the mod_dav_svn plugin for Apache 2, and I'm thus far very impressed with how it works. This repository will be open for anyone to read, and I will put the majority of code for the new project in there. Write access will be restricted to me for now, but if there is a reason, I'll open that up as well. In the meantime I will be happy to review and apply patches from people who would like to suggest a change. In other words, I'll be opting for the benevolent dictator model, whereby I fill the role until such time as it makes sense to hand it off. Which is all very pretentious for a project that may or may not ever get off the ground. Just planning ahead.

For those who do not want to install Subversion (though it's easy, and there are pre-built clients for all major systems), I will also be providing a nightly tar.gz and zip file for your enjoyment.

Naming: I still don't know what to officially call the project. However, as soon as you start adding code you need to at least have something to refer to it by. Rather than try and pick just the right name today, I'm going to just pick a naming convention for codenames, and use those codenames for first versions of the various components (the backend, the frontend, etc.). They will probably be something boring like the moons of Jupiter, neighborhoods in Manhattan and Brooklyn, or towns in Northern California or similar. So don't freak if you are checking out the source code for "ganymede" or "redhook" -- it's only for development purposes.

Versioning: Everything will be strongly versioned. The source will be versioned and tagged, each release will be versioned and publicly archived. But more importantly, the API itself will be versioned. This is something I think Amazon Web Services has done a fantastic job with, and I intend to support versioning as much as I can.
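To make the idea of a versioned API a little more concrete, here is a rough sketch of how a request might be routed to a handler based on the version the caller asks for. It is written in Python purely for illustration (the project itself is planned in Perl), and everything in it is an assumption: the date-style version strings, the handler names, and the shape of the responses are all hypothetical.

```python
from typing import Callable, Dict

# A handler takes a user id and returns a response body (hypothetical shape).
Handler = Callable[[str], dict]


def get_user_v1(user_id: str) -> dict:
    """Original response shape: just the id."""
    return {"id": user_id}


def get_user_v2(user_id: str) -> dict:
    """Later response shape: adds a links section, leaving v1 callers untouched."""
    return {"id": user_id, "links": {"self": f"/users/{user_id}"}}


# Each published API version maps to the handler that implements it.
# The version strings here are made up for the example.
HANDLERS: Dict[str, Handler] = {
    "2005-06-01": get_user_v1,
    "2005-09-01": get_user_v2,
}


def dispatch(version: str, user_id: str) -> dict:
    """Look up the handler for the requested version and call it."""
    try:
        handler = HANDLERS[version]
    except KeyError:
        raise ValueError(f"unsupported API version: {version}")
    return handler(user_id)
```

The point of the design is that a client pinned to an older version keeps getting the older response shape, even after a newer version ships alongside it.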

Database: Since no one objected during the discussion of the data model, I am going to use a relational database as the backend for this system for the time being. I'm keeping partitioning in mind, and trust me, I'll be looking closely at more scalable and efficient ways to handle data. I'm going to go with MySQL 3.23, rather than the more advanced MySQL 4.1 or PostgreSQL. The reason for this is that I don't feel right now that I actually need the real database functionality of transactions and triggers and views, and I may as well just choose the least common denominator. That said, all data access and the actual data model will be hidden behind the API, and the backend supporting the API will try to keep the specifics of the database interaction well abstracted.
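As a sketch of what "well abstracted" could mean in practice, the example below hides the database behind a small storage interface, so that callers never see SQL. It is in Python for illustration only, not the Perl the project will use, and the names (UserStore, SqlUserStore, the users table and its columns) are hypothetical. SQLite stands in for MySQL here simply so the example runs on its own without a server.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Optional


class UserStore(ABC):
    """What the API layer is allowed to know about user storage."""

    @abstractmethod
    def add_user(self, name: str) -> int:
        """Store a user and return the new id."""

    @abstractmethod
    def get_user(self, user_id: int) -> Optional[dict]:
        """Return the user as a dict, or None if there is no such user."""


class SqlUserStore(UserStore):
    """One possible backend; all SQL is confined to this class."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_user(self, name: str) -> int:
        cur = self.conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self.conn.commit()
        return cur.lastrowid

    def get_user(self, user_id: int) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT id, name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


# Usage: the caller only touches the UserStore methods.
store = SqlUserStore(sqlite3.connect(":memory:"))
alice_id = store.add_user("alice")
```

Swapping in a different database, or a partitioned store later on, would then mean writing another UserStore implementation rather than touching the code that calls it.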

License: The content, such as these articles and the documentation, will be released under a Creative Commons license. The source code will be released under some open source license, though I don't know which one. The GPL is a good first choice, as it does do some of the things I want when other people (especially corporations) use the code in their own products. If I expected to be the only author, I'd probably choose that, as I could always re-license the code under something more relaxed later on if need be. However, if other people contribute to the project then I will not have the option of changing the license, and I need to pick better from the outset. I doubt people would be willing to assign their copyright to me when they commit (nor should they have to), so it will take a little more thinking. No matter what, it will be a recognized Open Source license -- I'm not going to make up my own license just for this project.

Language: In what is potentially the most contentious decision, I've decided to write the backend layer in Perl. It is never an easy choice; I wrote about some of these considerations at length in On Choosing An Enterprise Language, and I recommend at least reading through that article for some background. More than that, I've decided to use mod_perl rather than CGI, FastCGI, or other integration tools. I also wrote on rediscovering mod_perl last August, and since then have only been more impressed by how good the project continues to be, especially now that it has officially reached version 2. I ported the AWS OpenSearch project over to mod_perl in an hour, for example, and immediately realized huge gains.

Both PHP and Java were strong alternatives, but the balance tipped toward Perl because (a) mod_perl and Apache 2 are a very, very powerful platform, particularly when one wants to support the REST HTTP operations, (b) I already know it inside and out, (c) the unto.net infrastructure is already well suited for it, (d) I ran a company for a while that had some excellent open source enterprise Perl libraries that I am familiar with and want to reuse, (e) just about everything, from MySQL to Berkeley DB to Markdown, has mature Perl bindings, (f) XS support will allow for native compiled code and library support if need be, (g) I can write it in such a way that it can be ported to other languages, and (h) I had to pick something.

Releases: Early and often. Or continuously if I can. I will likely start with some of the behind the scenes work that you can't see (like request dispatchers and database connection management), but hope to move quickly toward getting working API calls out there. I am going to start with the user model, as that has the most general applicability, even long before support for notes is complete. I will try to work out a rich enough backend before even starting on the real client-side application. I will however write a server-side "client" application to demo the features as they come online.

Community: I will probably add a forum system to unto.net, such as bbPress, once this gets underway. I'll add a mailing list as well if people are interested. I'd like to use a good group system like Google Groups or Yahoo Groups, but frankly, I'm not sure how much I want to host on other people's servers, particularly if either of those companies might end up being a good future partner to support this project.

How you can help: I love the help I've received so far in the form of comments on the blog posts. For now, that's a great way to contribute. In the future you could help by:



As always, thanks for reading. I'll keep posting as much as I can as the development continues.