Tuesday, March 02, 2010

Principles of the Web

As I noted in a previous entry, I've been reading Architecture of the World Wide Web, Volume One and am finding it a great read. For example, take this little gem:
Constraint: URIs Identify a Single Resource

Assign distinct URIs to distinct resources.

In a nutshell, the authors have made it clear that a URI should refer to a particular resource. And just a bit further on, they point out that URIs can be aliases for a single resource:
Just as one might wish to refer to a person by different names (by full name, first name only, sports nickname, romantic nickname, and so forth), Web architecture allows the association of more than one URI with a resource. URIs that identify the same resource are called URI aliases. The section on URI aliases (§2.3.1) discusses some of the potential costs of creating multiple URIs for the same resource.

They even offer thoughts on the performance consequences of aliases.
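One such consequence is easy to sketch in Java: a cache keyed by URI can't know that two aliases name the same resource, so a representation cached under one alias misses under the other (the URIs below are hypothetical, just for illustration):

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

class AliasCost {
    // Returns whether a lookup via the alias hits a cache populated via
    // the canonical URI. It won't: the cache sees two distinct keys even
    // though both URIs identify the same resource.
    static boolean cacheHit() {
        Map<URI, String> cache = new HashMap<URI, String>();
        URI canonical = URI.create("http://example.org/artists/id/313");
        URI alias = URI.create("http://example.org/artists/Jane");
        cache.put(canonical, "cached representation");
        return cache.containsKey(alias); // false: same resource, different URI
    }
}
```

The same miss happens in real HTTP caches, bookmarks, and search indexes, which is exactly the cost the authors are pointing at.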

Ya gotta love it... :-)

Monday, March 01, 2010

The Architecture of the World Wide Web - Volume 1

As part of my foray into RESTful services, I've been reading The Architecture of the World Wide Web - Volume 1 and find it refreshingly informative. For example:

The choice of syntax for global identifiers is somewhat arbitrary; it is their global scope that is important. The Uniform Resource Identifier, [URI], has been successfully deployed since the creation of the Web. There are substantial benefits to participating in the existing network of URIs, including linking, bookmarking, caching, and indexing by search engines, and there are substantial costs to creating a new identification system that has the same properties as URIs.


And:

A resource should have an associated URI if another party might reasonably want to create a hypertext link to it, make or refute assertions about it, retrieve or cache a representation of it, include all or part of it by reference into another representation, annotate it, or perform other operations on it. Software developers should expect that sharing URIs across applications will be useful, even if that utility is not initially evident.


There is so much packed into each of these brief statements, and they appear within the equivalent of the first 10 pages of the document.

I find it both amazing and sad that this document was published in 2004, yet I've found very few references to it in the six years since its publication. Maybe I just haven't been looking in the right places?

I will share additional passages that I find enlightening in the days (and weeks?) to come.

Saturday, February 27, 2010

GWT and SmartGWT

I'm doing some work with GWT 2.0 and also with SmartGWT. I like both toolkits, but I'm in the midst of a very steep learning curve (meaning I'm learning a lot quickly). While I'm making good progress in becoming proficient, I'm finding that SmartGWT suffers from the same problem that many open source products with commercial support options suffer from: weak documentation.

It's completely understandable and I don't blame the developer(s) of SmartGWT. When you are working on an open source project and also trying to provide commercial support as a means of revenue, it's hard to find the time to produce good documentation. And the truth is that if you provide a good product with documentation good enough that developers don't need your services, you have just put yourself out of business.

Such is the nature of open source projects that have commercial support as their primary financing model. There's nothing to be done about it (short of finding a philanthropist to fund the project; and philanthropists of any sort are a rare breed these days, not to mention ones interested in the obscurity of open source software).

So I'll keep climbing the learning curve with the documentation as is, and take good notes for the future when I've stopped working with these technologies and come back to them.

Oh, and if you know any philanthropists looking to fund open source projects, I've got a project or two of my own I can suggest...

Wednesday, February 10, 2010

Guice 2.0 - tasty

I'm finally getting a chance to do some work with Guice 2.0. I don't know if I just couldn't wrap my head around Guice in the past and I've finally "gotten" it, or if Guice 2.0 provides a more approachable API. Either way, I'm finding it great to use.

I've been mildly whiny about Guice in the past, noting that it isn't completely statically typed (it's possible to ask for a resource at runtime that isn't available because it was never configured). But even with that small limitation, I'm finding Guice 2.0 far better than Spring for dependency injection: the code is much smaller, far more strongly type-checked, and free of XML (a big plus).
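That runtime limitation is easy to see in a toy injector. This is a hand-rolled sketch of the binding idea, not Guice's actual API: bindings are declared in plain Java (no XML), lookups are by type, and an unconfigured binding fails only at runtime:

```java
import java.util.HashMap;
import java.util.Map;

class ToyInjector {
    // Interface type -> implementation type, declared in code, not XML.
    private final Map<Class<?>, Class<?>> bindings = new HashMap<Class<?>, Class<?>>();

    <T> void bind(Class<T> iface, Class<? extends T> impl) {
        bindings.put(iface, impl);
    }

    <T> T getInstance(Class<T> iface) {
        Class<?> impl = bindings.get(iface);
        if (impl == null) {
            // The failure mode mentioned above: nothing catches a missing
            // binding at compile time; you find out when you ask for it.
            throw new IllegalStateException("No binding for " + iface.getName());
        }
        try {
            return iface.cast(impl.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The `bind(...)` call is type-checked (the implementation must actually implement the interface), which is most of what Spring's XML configuration gives up.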

If you haven't checked out Guice, or if you tried Guice 1.0 but haven't tried out 2.0, you should give 2.0 a serious look.

Now I need to integrate Guice with Jersey (the JAX-RS reference implementation)...

Monday, February 08, 2010

JAX-WS tarpit

I'm currently doing some work with an open source framework (to be left unnamed) built upon JAX-WS (using SOAP, SAML, and HTTPS). I'm making headway with it, but talk about a tar pit. You go into the code and it's almost impossible to get out. Every time you gain a bit of understanding, some other anti-pattern slaps you until you see stars, and you're back to Google searches and digging through the code to make further progress.

Violations of the DRY principle are rampant. The generated code has magic constants sprinkled liberally throughout. Generated classes have default constructors and public getters and setters for all fields despite the fact that some fields are actually required; no hashCode; no equals; no ability to determine if one of these data objects is valid or not. Turning the WSDL into code generates vast numbers of classes (reminding me of the terrible mapping from CORBA IDL to Java). The generated code has almost no helpful comments (despite the fact that generating at least some reasonable back-references in generated code is easy precisely because you are generating code). And we've only delved a bit into the SAML aspects of things; I expect that to be another can of worms entirely.
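For contrast, here's a sketch of what a friendlier generated data class could look like: required fields enforced at construction, with equals and hashCode defined. The names are made up for illustration, not taken from the framework in question:

```java
final class ArtistRecord {
    private final String id;    // required
    private final String name;  // required

    ArtistRecord(String id, String name) {
        // A generator knows from the schema which fields are required,
        // so it could enforce that here instead of emitting bare setters.
        if (id == null || name == null) {
            throw new IllegalArgumentException("id and name are required");
        }
        this.id = id;
        this.name = name;
    }

    String getId() { return id; }
    String getName() { return name; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ArtistRecord)) return false;
        ArtistRecord other = (ArtistRecord) o;
        return id.equals(other.id) && name.equals(other.name);
    }

    @Override public int hashCode() {
        return 31 * id.hashCode() + name.hashCode();
    }
}
```

None of this is hard for a code generator to produce; that's what makes the generated mutable-bean-with-no-equals style so frustrating.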

I've heard that the SOAP/WS-* specifications were co-opted by certain large companies and made so complex it's nearly impossible to work with them except with the very big IDEs-with-god-complexes those vendors sell. Based on these experiences and previous ones working with some SOA frameworks, I can believe it. If that isn't the reason for their incredible opaqueness and anti-patterns, I'm afraid to find out who thought all of this was a good idea.

So while I'm making progress and getting things done, the entire experience makes me want to go wash my hands and update my resume.

If you know of good ways to work with this stuff short of plunking down vast quantities of money for some IDE that will sort-of/mostly/kind-of hide all the complexity (until the moment things break and you really need to understand what is going on), I'd love to hear about them...

Tuesday, February 02, 2010

Java Annotations have Become Pixie Dust

I was giving a talk about RESTful services using JAX-RS and Jersey recently and was asked why I had used Mark Volkmann's WAX for generating HTML and XML. The person asking the question pointed out that Jersey has integration with JAXB.

There were two answers to that question. The first is that I am leery of anything that automatically converts my Java objects into a serialized format (I've been bitten by Java's object serialization in the past): incompatible object changes can be difficult or impossible to reconcile in a backward-compatible manner.

But the main answer I gave got some chuckles and further questions. I explained I was trying to avoid too much "pixie dust". In the example code, I was already using the Java Persistence API (JPA) and JAX-RS and their associated annotations. If I had not been careful, there would have been annotations for Spring and JAXB as well. All of these annotations are small in the code but have very large effects. Those innocent looking annotations subject my poor unsuspecting code to some very big (and some would argue bad) frameworks. Understanding how these frameworks interact is not only hard, but those interactions change as the frameworks change (possibly resulting in the system breaking with no code changes).

I have real misgivings about the number of annotation-based technologies that should be applied to any one project. Each annotation you use represents some amount of code you don't have to write. And that is, of course, a good thing from a development perspective. But every annotation you use represents 'pixie dust', behavior which is hidden from you and generally performed at runtime. That means that when things go wrong, there isn't any code to look at. It also means that small differences between configurations can produce large changes in behavior. That's a very bad thing for testing, production deployment, production maintenance, etc.
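A toy annotation makes the point concrete. The annotation and "framework" below are invented for illustration (they're not JPA or JAX-RS): the annotated class carries no visible behavior, and whatever happens is decided by code that reads the metadata at runtime:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface PixieDust {
    String value();
}

// Looks inert: no superclass, no method calls, nothing to step through.
@PixieDust("some-framework-behavior")
class Innocent {}

class DustReader {
    // This is what a framework does under the covers: discover behavior
    // from metadata at runtime. When it misbehaves, there's no call site
    // in your code to put a breakpoint on.
    static String behaviorOf(Class<?> c) {
        PixieDust p = c.getAnnotation(PixieDust.class);
        return p == null ? "none" : p.value();
    }
}
```

Multiply this by several frameworks each scanning the same classes, and the interactions become very hard to reason about.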

I've been thinking about this issue for some time*, so I was pleasantly surprised to find Stephen Schmidt's post admonishing us to Be careful with magical code. His post is not specific to annotations (he calls out aspect-oriented programming, for example - I agree that AOP is another kind of pixie dust). And he points out some examples of the "pixie dust" phenomenon. While I don't agree with his 'magic containment' example, it's a good post. You should read it.

As a rule of thumb, I think two kinds of pixie dust are the maximum to sprinkle on a project. So think hard and choose wisely when picking which ones to use: the more kinds of pixie dust you sprinkle, the harder it will be for you and others to understand and troubleshoot things, now and especially in the future.



*Thanks to Mike Easter for planting the idea of talking about the state of annotations in Java

Friday, January 08, 2010

Taxonomy of Technical Blog Posts

I categorize technical blog postings into a taxonomy:

Type I: Describing how to use some kind of technology, your own or someone else's
Type II: Describing how to overcome some limitation, bug, or quirk of technology
Type III: Whining about failures to get one or more technologies to work (together)
Type IV: Crowing about getting one or more technologies to work (together) - often a follow-up to a Type III posting
Type V: Indulging in a post that really doesn't belong in a technical blog.

For a classic example of a Type II posting, see:

Tuesday, January 05, 2010

A RESTful web service testbed

It's time to build a more complete RESTful service example. In this web service, we'll aim to perform all the common activities of a real service. These are often referred to by the acronym CRUD, which stands for Create, Read, Update, and Delete, the common operations associated with persistent data. Since we're dealing with a RESTful web service, we'll also throw in multiple representations for a resource, and connectedness to make it easy for clients to navigate through the service's resources.

So, what should our service do? My friend and colleague, Mark Volkmann, introduced the example of a database of music information. In this example, the service contains information about several kinds of resources:
  • Music artists (Artists)
  • Albums
  • Songs
This is a nice example because it is rich enough to expose many of the problems that need to be solved by a RESTful web service without becoming so large that it's unwieldy to explore. The domain model is immediately familiar, so we can focus on the technologies and not the model.

With this service we can explore not only traditional browser-based HTML, but also XML and even an Ajax client using JSON. We can also use it as a testbed for things like caching, security, clustering/failover, and composition of services.
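The CRUD operations for the artist resource can be sketched as a minimal in-memory store. This is a hypothetical sketch, not the testbed's actual code; the comment on each method notes the HTTP request it would back:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class ArtistStore {
    private final Map<Long, String> artists = new HashMap<Long, String>();
    private final AtomicLong nextId = new AtomicLong(1);

    // Create: POST /artists -> 201/303 with the new resource's URI
    long create(String name) {
        long id = nextId.getAndIncrement();
        artists.put(id, name);
        return id;
    }

    // Read: GET /artists/id/{id} -> 200, or 404 when null
    String read(long id) {
        return artists.get(id);
    }

    // Update: PUT /artists/id/{id} -> 200, or 404 when false
    boolean update(long id, String name) {
        if (!artists.containsKey(id)) return false;
        artists.put(id, name);
        return true;
    }

    // Delete: DELETE /artists/id/{id} -> 204, or 404 when false
    boolean delete(long id) {
        return artists.remove(id) != null;
    }
}
```

Layering JAX-RS resources, multiple representations, and hyperlinks between artists, albums, and songs on top of a store like this is where the interesting RESTful questions show up.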

Monday, December 07, 2009

JAX-RS and Jersey talk materials

I recently gave a talk on JAX-RS, JSR 311, and Jersey at the St. Louis Java User's Group. You can find the presentation and the sample code on their website.

Friday, November 13, 2009

Curl has a bug with redirects?

I gave a presentation at the JUG last night about JAX-RS and Jersey. The talk went well, and it was a great crowd with a lot of participation (I'll post more once I have the materials cleaned up and they are posted to the JUG website).

But one thing that didn't go right last night was when I was using the 'curl' command line tool to create new artists in the RESTful music database service I was using as an example.

The curl man page on my MacBook Pro says:

When curl follows a redirect and the request is not a plain GET (for example POST or PUT), it will do the following request with a GET if the HTTP response was 301, 302, or 303. If the response code was any other 3xx code, curl will re-send the following request using the same unmodified method.

Yet, when I was using it, it wasn't doing that. For example when trying to create artist 'Jane':


brian@Widget: curl -v -H "Accept: text/html" -X POST -L "http://localhost:3131/artists/Jane"
* About to connect() to localhost port 3131 (#0)
* Trying ::1... Connection refused
* Trying fe80::1... Connection refused
* Trying 127.0.0.1... connected
* Connected to localhost (127.0.0.1) port 3131 (#0)
> POST /artists/Jane HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
> Host: localhost:3131
> Accept: text/html
>
< HTTP/1.1 303 See Other
< server: grizzly/1.9.10
< Location: http://localhost:3131/artists/id/313
< Content-Type: text/plain; charset=iso-8859-1
< Content-Length: 0
< Date: Fri, 13 Nov 2009 05:24:12 GMT
<
* Connection #0 to host localhost left intact
* Issue another request to this URL: 'http://localhost:3131/artists/id/313'
* Re-using existing connection! (#0) with host localhost
* Connected to localhost (127.0.0.1) port 3131 (#0)
> POST /artists/id/313 HTTP/1.1
> User-Agent: curl/7.16.3 (powerpc-apple-darwin9.0) libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
> Host: localhost:3131
> Accept: text/html
>
< HTTP/1.1 405 Method Not Allowed
< server: grizzly/1.9.10
< Allow: DELETE,OPTIONS,HEAD,GET
< Content-Type: text/plain; charset=iso-8859-1
< Content-Length: 0
< Date: Fri, 13 Nov 2009 05:24:12 GMT
<

When it followed the redirect and issued the follow-up request, it went to the right place:


...
< HTTP/1.1 303 See Other
< server: grizzly/1.9.10
< Location: http://localhost:3131/artists/id/313
...
* Connection #0 to host localhost left intact
* Issue another request to this URL: 'http://localhost:3131/artists/id/313'

But it didn't switch to GET:

> POST /artists/id/313 HTTP/1.1

As a result, I got a 405 error code because curl tried to POST to the redirected URL rather than doing a GET as documented.

I did some further experiments last evening after the talk and confirmed this behavior. I even tried switching the return code to a 302 just to see if it would make a difference, and it didn't.
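The man page's rule is simple enough to encode in a few lines (a sketch of the documented behavior, not curl's actual source):

```java
class RedirectRule {
    // Per the man page: after a 301, 302, or 303 response to a non-GET
    // request, the follow-up request should use GET; any other 3xx code
    // re-sends the original method unchanged.
    static String followUpMethod(int status, String originalMethod) {
        switch (status) {
            case 301:
            case 302:
            case 303:
                return "GET";
            default:
                return originalMethod;
        }
    }
}
```

By that rule, the 303 See Other above should have produced `GET /artists/id/313`, not the `POST` that curl actually sent.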

How does your curl behave? Does it switch from POST to GET when following a redirect using the -L option?

I'd appreciate it if anyone who knows anything about this could comment. Is curl doing the wrong thing, or is the man page wrong and curl is no longer supposed to do this?