Friday, August 01, 2014

Old meme still makes me me laugh

There are many variations on this particular 'quote' in the history of the Internet (I heard it long ago, somewhere in the mid 1990's). The version I remember doesn't seem to match the recorded history we can find at sites like the Way Back Machine. I think that means this was a popular meme back then and some exercised poetic license. The version I remember is:

"You couldn't get a clue if you stripped naked, smeared yourself with clue musk, and ran naked through a field full of horny clues at the height of clue mating season.”

Hoping this gave you a chuckle...


Wednesday, June 26, 2013

Sunday, June 16, 2013

Chaining Either's in Java 8

The discussion about Either in Scala and trying to implement it in Java got me to thinking. If we make the simplifying assumption that the error portion of any Either is some kind of Exception, It would seem we can make pretty good progress in using the Either pattern of handling error conditions.

I'd be interested to hear of problems and/or improvements to this idea.

Stop writing your code for the happy path

So Jessitron recently posted about avoiding cxceptions in Scala code using 'Either'. Mario Aquino followed up with an example of using Either in Ruby, and Heath responded with a post about the awkwardness of using Either in Java compared to using checked exceptions.

I agree with all of them on many levels. But the most important thing they didn't say (but which is implicit in their discussions), is the need to stop thinking that writing code is about implementing the happy path. It always amazes me when I encounter code base and discover how little thought has been put into the error handling (the "not so happy paths"). The truth is that about 10% of software engineering is implementing the happy path. The other 90% is figuring out how things can go wrong and then eliminating those possibilities (when feasible) or writing code that gracefully handles the errors.

The art of system design is in finding ways to simplify the code that handles all the errors that can occur. The happy path code can never get any shorter than the minimum required. It's all the error handling code that leaves room for good design.

Wednesday, March 21, 2012

A Checklist for Designing Software

Sometimes when we are developing software, developers can suffer from "forest and trees" syndrome (being unable to see the forest due to all the trees in the way). We get caught up in the details, or time pressure comes to bear, etc. We then fail to remember the overarching requirements for any software system. It's not like this hasn't been written about many times, but I've distilled it down into a short list of the most common requirements in order of their priority. Sometimes, a particular application will add requirements, cause requirements to be re-ordered, or have particularly heavy weight associated with one or more of the requirements. But I find this a useful touchpoint when building, modifying, enhancing, or rebuilding just about any software. It's not exhaustive, but it helps me keep things in focus.

In typical priority order:

Does it work?
Our software needs to produce correct outputs or at least flag when it suspects it is failing to produce correct outputs (it's amazing to me how often developers put so much thought into the "happy path" and almost none into detecting and handling errors).
Is it secure enough?
Sometimes this is a noop. Sometimes it's one of the most crucial considerations (a cryptographic algorithm being the poster child here). And security is much harder to add after the design is mostly complete. It's best to get on top of this one as early as possible, as security requirements that clash with a system design can cause problems in every other category.

Does it meet our scaling needs?
If our software does what we want but can't handle the required concurrent requests or total volume of requests quickly enough, it is effectively non-functional. Often back-of-the-napkin estimates are good enough to understand the basic problem here. Are you dealing with tens, thousands, millions, or billions?

Is it as simple as possible?
The smaller and simpler the code is, the easier it is to wrap your head around it and the easier it is to troubleshoot. Also, the less you might have to rewrite to meet a new or altered requirement,. This isn't an excuse for skipping error checking/handling, logging, security etc. Rather, we should implement those things as simply as possible. Sometimes, this may mean using a framework that helps solve part of the problem (e.g. an dependency injection container). Sometimes, it means not using a framework because that framework is overkill for the problem, overly difficult to use, or too opaque.

Thursday, March 10, 2011

Pardon me, but I believe there's been a mistake...

I logged into my Twitter account this morning after having not looked at it for quite some time. Imagine my surprise when I looked at the list of people following me and I see this:

While it's flattering to entertain the idea that the UK Prime Minister cares about my tweets, somehow I don't think so.

I've gotten my unintended humor for the day and the day has barely begun...

Tuesday, March 01, 2011

Jargon, REST, and Reuse

Benjamin Carlyle has a great post about jargon in REST and how it relates to media types and reuse. RESTs uniform API is great but he succinctly makes a compelling case for why we also need to focus on reusing media types whenever possible to really enable reuse across services and over time.

It's well worth your time. Go read it now.

Tuesday, June 15, 2010

Handling conflicts in media type headers

If you read my previous entry discussing versioning of RESTful services, I pointed out how we can leverage media types to support incompatible versioning of RESTful services. I think this is an excellent solution, but it can lead to an issue for service implementors

Imagine you have a service which has more than one version. Let's say it's a hotel reservation system, just to pick something familiar. We've picked media types for our versions of the service, and adopted the convention that in the absence of any media types we will use "Version 1" of the service. Assume our media types are "application/hotel-json" and "application/hotel-json-v2".

If you support operations for updating a reservation using PUT, you can run into a situation where you have both a Content-Type header (specifying the content of the entity body being used to update the reservation) and an Accept header specifying the kind or kinds of entities the client can accept in the response. This might be just fine, if the two versions of the service can be intermixed between the sent entity and the returned entity. But the two versions of the service might not mix. For example, an update sent using version one of the updating entity might not contain a value required for version two of the service.

In this situation what the client did is an error, so we clearly need to return a status code in the 400 range. But which is the most appropriate code? A quick look at the 4XX codes shows two promising candidates:

406 - Not Acceptable
415 - Unsupported Media Type

But when we read the description of 415 in RFC 2616 we see it isn't appropriate:

The server is refusing to service the request because the entity of the request is in a format not supported by the requested resource for the requested method.
The Accept header is specifying a media type the service understands for the resource, it's just the combination of provided entity media type and requested return media type that is a problem. Happily, status code 406 fits perfectly:

The resource identified by the request is only capable of generating response entities which have content characteristics not acceptable according to the accept headers sent in the request.
Interestingly, the RFC makes a comment about this very situation, and suggests that returning an entity type different than those specified in the Accept header might be preferable to a 406:

Note: HTTP/1.1 servers are allowed to return responses which are not acceptable according to the accept headers sent in the request. In some cases, this may even be preferable to sending a 406 response. User agents are encouraged to inspect the headers of an incoming response to determine if it is acceptable.
In this particular scenario (where a required value isn't provided), one way to return a result would be to "downgrade" to a version 1 response in hopes that works for the client. There are potential issues with this, if the semantics of applying a version 1 update differ from those that would have occurred with a version 2 update. And if the client can't interpret the returned version 1 entity, they may end up in an inconsistent state. But it's a good idea to keep in mind and make work whenever possible.

Versioning REST services

I read a blog entry yesterday from Ganesh musing on whether RESTful web services need versioning. In his posting, Ganesh suggests putting some kind of version number into the body of a request. He suggests that this be done even for a GET or DELETE.

While I suppose this could work, it feels awkward. Putting a version number into an entity body for a GET, which would typically not have an entity body, is awkward and a fair bit of work. Things get even more complicated in the case of a PUT or POST where the entity body is some binary format like an image format. In this case, the Content-Type can't just be image/jpeg (for example) since your putting some kind of version information before the image data.

Even in the case of textual data such as XML, putting version information into the entity body requires that our XML format support a version number somewhere (probably as an attribute of the top-level element in the entity body?). This may be feasible, but it may not be feasible if we are trying to use a data format that is already defined. We could start hacking and just throw a line at the beginning of the entity body that indicates the version with the understanding that the 'normal' payload starts after this line. But this is hackish, and makes our Content-Type delcarations wrong (we aren't adhering to the content type definition since we've thrown that version information at the front).

After reading Ganesh's entry, I thought about versioning for a while and it seemed to me that the right way to handle versioning is via media types (using, for example, the Accept and Content-Type headers). This works well because the first version of our service can be delivered when no media-type headers are specified. Then, if a client wants to take advantage of a newer, incompatible version of our service, they simply specify this using media types in the Accept and Content-Type headers.

I could spend a fair bit of time describing this in more detail, but I did some googling and found a great blog entry (written back in May of 2008) by Peter Williams where he describes things concisely and clearly. It's short and sweet and well worth a read.

Back from reading his posting? There are two issues I thought of which Peter didn't directly address. One is specific to versioning, and one is a more general issue that can arise. The versioning related issue is this:

What if your service model changes to much that the set of resources you expose has to remove one of the resources in your original version of the service?

In this case, you make operations to that URI path with an Accept header and/or Content-Type header for later versions of your service return 404 (or similar error code) since that type of resource doesn't exist with that version of the service. While Peter didn't talk specifically about this, this approach meshes well with his description and I suspect it is the same answer he would give if asked.

A related but more general problem can occur if your service receives a request with both an Accept and a Content-Type header and the values are incompatible. I'll address that in my next entry.

[Updated 2010-06-15 11:25 a.m.]

I should mention that you shouldn't create new media types unless you have to. If your service deals with ATOM or RSS, use the version number for those to distinguish different versions of your service if at all possible. In his thesis, Roy Fielding states that creating new media types is not RESTful. And that is true in the sense that custom media types reduce the likelihood that clients will know how to interpret your representations. But there is a tension here between accurately representing incompatible versions of a service and using a pre-existing media type. There's no perfect answer, just a continuum with different trade-offs for different kinds of applications.

Monday, May 31, 2010

I spoke at the Gateway Software Symposium (NFJS St. Louis) last weekend. After my RESTful Web Services with JAX-RS talk, one of the attendees asked me how to discover the proper URLs when starting to interact with a RESTful service. The impetus for the question came from my statement that many RESTful services don't have large specifications (such as the WSDL files for "Big" web services).

I explained that well-designed RESTful services are "well-connected". By this, I mean they adhere to the idea of "hypermedia as the engine of application state", or HATEOAS. If your service adheres to HATEOAS, your service is accessible and navigable given just a single URL. How can this be?

Recall that in RESTful services, we use URIs/URLs to connect various related resources. To be well-connected (I'll use this term in place of to HATEOAS mostly because the acronym looks odd to some people) and be RESTful, representations of resources should contain URIs/URLs to related resources. To make this clearer, let's look at a specific example. As we look at this example, we'll make note of the aspects of the API which couldn't be discovered by examining the results of searches or resource retrieval.

In my talks on JAX-RS, I use a simple music service. In this service, each artist has a name and some number of music albums they have published. Each album as a name and some number of tracks on that album. Here is a simple UML diagram of the objects:

We now come to the first piece of information we can't discover for ourselves. We need to know a starting URL. With the example service, it's simply a host name and a port, as in: "http://localhost:3131".

When we run the music service and hit it with a browser, we see a spartan web page:

While this certainly allows us to do some (mildly) interesting things, how does it help us discover the API for this RESTful service? The answer lies in examining the underlying HTML. When we do that, we see this:

At first, there might not seem to be much here. But let's look at the form for searching for artists:

From this we can see that we can search for artists whose name "is" (IS), "starts with" (STARTS), "contains" (CONTAINS), or "ends with" (ENDS) the text specified by 'searchString'. If you don't live and work with HTML forms and their resulting queries every day, this still might not be obvious. So we can always go ahead and do a search. Let's search for artists whose name begins with 'F':
Clicking the 'Search' button does two things for us:

1) It performs the search and gives us the results:

2) It shows us the proper format for executing queries against the service (in the browser's address text box):

Looking at this, we know that we can query for artists by specifying a searchType and a searchString if we search under /artists.

My example music service supports more than HTML for representations. It also supports an XML format and a JSON format. In order to use the XML format, we need to specify that we accept a content type of "application/music-xml" (a content type I made up for this example). This is the second piece of information the service itself did not tell us. However, we could simply query the service and get back an HTML representation of the search results so it's not strictly necessary. The XML format is a more compact representation than the HTML, but the same information is contained in the HTML version.

If we use the curl command line utility to query:

curl --header "Accept: application/music-xml" "http://localhost:3131/artists?searchType=STARTS&searchString=F"

We get XML results:

Because this service is designed with API discovery in mind, we see that it reminds us of the search that was performed:

< artists uri="/artists?" searchString="F" searchType="STARTS">

In these results we also see an example of how to build well-connected services. Notice that each search result lists not only the name of the artist found, but also the URI of the resource we would need to retrieve in order to get a representation for that artist. For example, the artist "Fish" lists a URI of /artists/id/149. If we go to the music service in the browser and query for this URI (a URL of http://localhost:3131/artists/id/149), we get information about the artist named Fish:

Looking at the HTML, we see:

And if we query for the XML representation using curl:

curl --header "Accept: application/music-xml" "http://localhost:3131/artists/id/149"

We see:

Notice that the information about each album contains a URI we can use to retrieve a representation of that album, just like the search results listed a URI for each artist that was found. These are examples of how we build a well-connected RESTful service. By making sure that each HTML page (or XML result) contains URIs for related resources, users of our services can discover the proper URIs/URLs to use when invoking our service without a large specification.

In part two, I'll explore this notion further, looking at how our self-describing service allows us to discover how to create and delete resources as well as search for and retrieve them.