A New Project, Part 5


sticky, part 5



It is getting a little late tonight, so I will probably ask more questions than I answer here. (For those just catching up, this is the continuation of part 4 in a series of articles on the development of a new web application.)

Since this project is evolving into one whose API may be used far more widely than I had initially intended, careful attention must be paid to the responses returned by each REST (Representational State Transfer) web services call.

I am tempted to simply say that the responses will be XML, then offer little justification beyond that. But that doesn't seem fair considering the transparent nature of this project. So I will at least attempt an explanation.

XML is a natural candidate -- it is robust, it can be validated, and most importantly, it is nearly ubiquitous. Nearly every mainstream programming language has a reasonably robust XML parser available to it -- including languages that are available on the client side. Plus, there is hardly a developer out there who has not at least worked with XML before. That said, there are downsides to XML -- it takes longer to parse and uses more bytes to express a simple key/value pair than some other formats. However, since our goal is more toward making an API that is actually used (and frankly, because our first client has a built-in DOM parser), XML is certainly the leading choice.
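To put a rough number on the "more bytes" point, here is a small sketch that encodes a single key/value pair both ways and counts the bytes. The function name encoded_sizes and the sample key and value are placeholders of mine, and the comparison is only indicative since real responses carry more structure than one pair.

```python
import json
import xml.etree.ElementTree as ET


def encoded_sizes(key: str, value: str) -> dict:
    """Return the UTF-8 byte counts of one key/value pair as XML and as JSON."""
    element = ET.Element(key)
    element.text = value
    xml_text = ET.tostring(element, encoding="unicode")
    json_text = json.dumps({key: value})
    return {
        "xml": len(xml_text.encode("utf-8")),
        "json": len(json_text.encode("utf-8")),
    }


if __name__ == "__main__":
    # Placeholder data, not taken from the real service.
    print(encoded_sizes("first_name", "DeWitt"))
```

The XML form pays for the key twice, once in the opening tag and once in the closing tag, which is where most of the difference comes from.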

The good news is that the responses themselves do not strictly have to be presented in just one format. An alternative implementation of the server or a server written with a layered approach in returning data could easily convert each response into JSON (JavaScript Object Notation) or PLIST (Property Lists) depending on client preferences.
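As a rough illustration of that layered approach, the sketch below keeps the response as a plain dictionary until the last moment, then hands it to a renderer chosen by a format name. Everything here (the render function, the RENDERERS table, the "xml"/"json"/"plist" keys, and the flat response shape) is an assumption made for the example and not a decision about the real server.

```python
import json
import plistlib
import xml.etree.ElementTree as ET


def to_xml(payload: dict, root_tag: str = "response") -> str:
    """Render a flat dict of scalar values as a simple XML document."""
    root = ET.Element(root_tag)
    for key, value in payload.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_json(payload: dict) -> str:
    """Render the same dict as JSON."""
    return json.dumps(payload)


def to_plist(payload: dict) -> str:
    """Render the same dict as an XML property list."""
    return plistlib.dumps(payload).decode("utf-8")


# Hypothetical mapping from a client-supplied format name to a renderer.
RENDERERS = {"xml": to_xml, "json": to_json, "plist": to_plist}


def render(payload: dict, fmt: str = "xml") -> str:
    """Convert one response payload into the format the client asked for."""
    try:
        renderer = RENDERERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return renderer(payload)


if __name__ == "__main__":
    # Placeholder payload for illustration only.
    print(render({"first_name": "DeWitt", "status_code": 200}, "json"))
```

The point of the sketch is only that the code producing the data never needs to know which format goes over the wire.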

It seems like there is some freedom in the structure of the data being passed back in the response. My initial vision for this web service API is centered around REST, as opposed to an RPC (Remote Procedure Call) mechanism like SOAP (Simple Object Access Protocol) or XML-RPC. Thus the complexity and overhead of SOAP-like envelopes -- while terribly interesting on an academic level, and still a reasonable option for some web services -- are something I'd like to avoid altogether here. My sense is that this application should target something as close to the least common denominator as it can. So while the rigorous framework that RPC responses necessitate does add value, it would also add considerably more burden to a client written in (for example) JavaScript than I think anyone would enjoy.

With all this in mind, I will first try to focus on what data is necessary in each response, then return to the formatting and structure later. To begin, we know we need some sort of status for each response. We need to know if it succeeded or failed, and if the latter, what went wrong. The response status should be both machine readable and human readable. For this we will want both a status code and a status message.

There is a fairly substantial overlap between what HTTP 1.1 status codes can represent and what we may want to return. In fact, due to our use of REST it is even tempting to try to push all error handling up to the HTTP layer.

But I believe that would be a mistake. For one, there is no single best way of putting additional error information into an HTTP response. (See Ethan Cerami's discussion of RESTful error handling for more background.) Second, our client software will not be handling the error condition at the HTTP level -- the logic that deals with each condition will probably be implemented after the response to the HTTP GET or POST request has already been parsed. Third, HTTP error codes are not particularly extensible, and we may well find that we would like a machine-parsable code that does not exist in the 1.1 specification.

All that leads me to suggest that we include a structure that looks roughly like this with each response:

(Please forgive the intentionally ambiguous representation of types here.)


status:
   status_code : int
   status_message : string


The status code will be as similar to the three-digit HTTP status code as we can reasonably expect. That is, all successful responses will return a 200 status code, whereas a permission denied error would include a 401 status code. However, if we need to invent something specific then we will have that luxury.
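Here is one way the status structure might look in code, along with a round trip through XML. Treat it as a sketch: the element names simply mirror the field names above, and the Status class, the ok property, and the two helper functions are names I made up for the example. The actual XML layout is still undecided.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class Status:
    status_code: int      # machine readable, modeled on HTTP status codes
    status_message: str   # human readable, suitable for display

    @property
    def ok(self) -> bool:
        # Per the text above, every successful response carries a 200.
        return self.status_code == 200


def status_to_xml(status: Status) -> str:
    """Serialize a Status into a small <status> element (hypothetical layout)."""
    root = ET.Element("status")
    ET.SubElement(root, "status_code").text = str(status.status_code)
    ET.SubElement(root, "status_message").text = status.status_message
    return ET.tostring(root, encoding="unicode")


def status_from_xml(document: str) -> Status:
    """Parse a <status> element back into a Status."""
    root = ET.fromstring(document)
    return Status(
        status_code=int(root.findtext("status_code")),
        status_message=root.findtext("status_message"),
    )


if __name__ == "__main__":
    denied = Status(401, "Permission denied.")
    parsed = status_from_xml(status_to_xml(denied))
    if not parsed.ok:
        print(parsed.status_code, parsed.status_message)
```

A client would branch on status_code and show status_message to the user, and both steps happen after the HTTP exchange itself has completed.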

The status message should be suitable for displaying directly to the end user. This is always a thorny dilemma in multi-tiered servers, but it seems to me that there are things that happen on one tier that can only be adequately described there. No intermediate layer could programmatically interpret and rephrase the response string. This clearly complicates efforts to internationalize error strings -- a trade-off I'd be willing to revisit if someone had a cleaner proposal. (Note that this parallels my theory of having code fail early -- and spectacularly -- whenever anything bad happens. Come to think of it, that may explain this series of articles, too.)

Moving on to the content of the responses: the body of each GET response will usually be limited to a single layer of the hierarchy. While there is often a temptation to return deep data structures -- or at the very least, provide the option to do so -- successful application of REST (and even more so, AJAX) seems to be predicated on the ability to make multiple round-trips to retrieve deep data. It is my opinion that allowing the client to decide how deep they want to dig is better in the long run for the widespread adoption of the API.

For example, if I request /users/dewitt/, then the response will contain only the "top-level" data, such as (depending on who is making the call) a first name, email address, etc. It will not contain a list of friends, a list of notes, a list of groups, etc. And certainly not the detailed data for each friend itself. For more information the caller will need to make a subsequent, and more specific, API call.
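To make the "single layer" idea concrete, here is a toy sketch with an in-memory dictionary standing in for the server. The record contents, the get_resource function, and the /users/dewitt/friends/ sub-path are all hypothetical. Only /users/dewitt/ comes from the example above, and access control ("depending on who is making the call") is left out entirely.

```python
# Hypothetical in-memory stand-in for whatever the server actually stores.
USERS = {
    "dewitt": {
        "first_name": "DeWitt",
        "email": "dewitt@example.com",
        "friends": ["alice", "bob"],
        "notes": [{"id": 1, "text": "first note"}],
    }
}


def top_level(record: dict) -> dict:
    """Keep only scalar fields, dropping nested lists and dicts."""
    return {
        key: value
        for key, value in record.items()
        if not isinstance(value, (list, dict))
    }


def get_resource(path: str):
    """Resolve a path one layer at a time (sketch; no error status handling)."""
    parts = [part for part in path.split("/") if part]
    if len(parts) == 2 and parts[0] == "users":
        # /users/<name>/ returns the shallow view only.
        return top_level(USERS[parts[1]])
    if len(parts) == 3 and parts[0] == "users":
        # /users/<name>/<collection>/ is the more specific follow-up call.
        return USERS[parts[1]][parts[2]]
    raise KeyError(path)


if __name__ == "__main__":
    print(get_resource("/users/dewitt/"))
    print(get_resource("/users/dewitt/friends/"))
```

The first call returns just the scalar fields. The list of friends only comes back when the caller asks for it explicitly.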

Note that "top-level" does not in any way imply anything about the persistence of each of these structures. It is in no way intended to suggest a database schema or filesystem layout. (I'll actually try to address that in the next part of this series, though.) The way the data is stored should be independent of the API itself. While it would be nice if there were some consistency, the two layers have very different goals. At the web services API layer the only important criterion is the API's usability from the perspective of the client. (If the API doesn't appeal to the client, then it will not be adopted.) The persistence layer, by contrast, is concerned (almost exclusively) with performance and scalability issues, and will optimize its internal organization accordingly.

And speaking of consistency, REST APIs do encourage consistency between what is stored (via POST and PUT), and what is retrieved (via GET). Ideally almost identical structures can be used in both cases.
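A small sketch of what that symmetry could look like: the document a client PUTs is parsed into fields, and GET serializes those same fields back into the same shape. The user element, the put_user and get_user helpers, and the in-memory store are assumptions for illustration, not the planned implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory store keyed by resource path.
_STORE = {}


def parse_user(document: str) -> dict:
    """Read the child elements of a <user> document into a flat dict."""
    root = ET.fromstring(document)
    return {child.tag: (child.text or "") for child in root}


def serialize_user(fields: dict) -> str:
    """Write a flat dict back out as a <user> document."""
    root = ET.Element("user")
    for key, value in fields.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def put_user(path: str, document: str) -> None:
    """Store what the client sent (sketch; no validation)."""
    _STORE[path] = parse_user(document)


def get_user(path: str) -> str:
    """Return the stored resource in the same structure it arrived in."""
    return serialize_user(_STORE[path])


if __name__ == "__main__":
    sent = "<user><first_name>DeWitt</first_name><email>dewitt@example.com</email></user>"
    put_user("/users/dewitt/", sent)
    print(parse_user(get_user("/users/dewitt/")) == parse_user(sent))
```

Because one pair of functions handles both directions, a client can reuse the same parsing code for what it sends and what it receives.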

To wrap up the discussion on the response, we do need to circle back to exactly what type of XML is going to be returned. For all its complexity, RDF (Resource Description Framework) really is the technically right way to go here. RDF is a cornerstone of the Semantic Web -- a future that may or may not play out -- but one that this application could reasonably be a full-fledged participant in. Some serious (albeit brief) soul searching is in order before making that decision.

As always, more to follow.