Stateless Blog

July 2012

HAL-HTML: HTML for automatons

I’ve said recently that HTML is clunky for APIs, but I didn’t really give it much thought. Since then I have thought about it some more, which led me to create an HTML variant of HAL to test the waters and find out exactly how awkward it is.

tl;dr - it’s actually not that bad. I’m almost convinced. I wrote some jQuery helpers and a stylesheet to turn it into something human-browsable. It all works ok.

The method

  • Ambiguity in a machine interface is bad. I had to establish some clear and unambiguous constraints.
  • I didn’t want any presentational info targeted at humans to muddy the machine interface, which means that surfacing the selectors in a web browser had to be done via the CSS3 content property. I did have to resort to JS in a couple of places.
  • I wanted to produce valid HTML. I didn’t really manage this because the use of rel="self" upsets the validator. I can live with that.

How It Looks

You can see here how it renders in a browser.

Here’s the markup (minus style and script tags):

https://gist.github.com/3114393.js?file=example.hal.html

Issues

There were a couple of road-blocks I ran up against here whilst trying to stick to pure HTML and CSS:

  • can’t present input names without JS (implemented JS shim)
  • can’t represent rel/class names of links as hyperlinks to docs (implemented JS shim)
  • can’t show headers for the request to the page (TODO: fix by overriding all links and forms to use xhr + pushState)

HAL-HTML jQuery API

I put together a rudimentary jQuery API for consuming this HAL-HTML stuff. It’s a bunch of helper methods that can be called on a resource element (there are top level methods on the jQuery object that default to the root resource [i.e. body tag]).

getLink('rel');
getProperty('name');
getResource('rel');
getControl('name');
fill_in('input_name', { with: 'value' });

They should be relatively self-explanatory. I think the most interesting is the last function, ‘fill_in’, which is meant to be called on a control element. It returns the control element so you can chain it, and you end up with code that looks like this:

$.getControl('ht:make-reply').fill_in('content', { with: 'zomg awesome hypermedia client' }).submit();
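
To make the chaining concrete, here is a minimal sketch of the pattern using a plain-object stand-in for a control element. It is not jQuery and not the actual helper implementation; the Control constructor and its submit() return value are hypothetical names invented for illustration. The only point is that fill_in returns the control it was called on.

```javascript
// Hypothetical stand-in for a control (form) element. Not jQuery, and not
// the real HAL-HTML helper; it only illustrates the chaining pattern.
function Control(name, inputNames) {
  this.name = name;
  this.values = {};
  inputNames.forEach(function (input) {
    this.values[input] = '';
  }, this);
}

// Set one input's value, then return the control so calls can be chained.
Control.prototype.fill_in = function (inputName, options) {
  if (!Object.prototype.hasOwnProperty.call(this.values, inputName)) {
    throw new Error('No input named "' + inputName + '" in control "' + this.name + '"');
  }
  this.values[inputName] = options['with'];
  return this;
};

// Stand-in for submission: returns a description of what would be sent.
Control.prototype.submit = function () {
  return { control: this.name, data: this.values };
};
```

With this stand-in, new Control('ht:make-reply', ['content']).fill_in('content', { with: 'hello' }).submit() reads the same way as the jQuery chain above.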

The ‘spec’

Below are the basic rules that describe how to express HAL in HTML.

Resources

There are two types of resource in a HAL-HTML document:

  • Root Resource: the target of the request (represented as the body element)
  • Embedded Resource: a contained/in-line resource, included to save requests (see the Embedded Resources section below for how these are represented)

Properties

Selector Pattern

{resource} > .properties > input[name="{property}"]

Examples

body > .properties > input[name="created_at"]
body > .embedded > .post > .properties > input[name="content"]

Links

Selector Pattern

{resource} > .links > a[rel~="{relation}"]

Examples

body > .links > a[rel~="author"]
body > .embedded > .post > .links > a[rel~="self"]
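
The ~= operator in the link pattern matters because a rel attribute can hold several space-separated relations. As a rough sketch of the matching rule a non-browser client would need to reproduce (relMatches is a hypothetical name, not part of the jQuery API above):

```javascript
// Mirrors the CSS [rel~="value"] rule: true only when `relation` is exactly
// one of the whitespace-separated tokens in the rel attribute's value.
function relMatches(relAttribute, relation) {
  // Per the CSS rule, an empty value or one containing whitespace never matches.
  if (relation === '' || /\s/.test(relation)) {
    return false;
  }
  return relAttribute.split(/\s+/).indexOf(relation) !== -1;
}
```

So rel="self author" matches both "self" and "author", while rel="coauthor" does not match "author".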

Embedded Resources

Selector Pattern

{resource} > .embedded > .{relation}

Examples

body > .embedded > .post

Form Controls

Selector Pattern

{resource} > .controls > form[name~="{control_name}"]

Examples

body > .controls > form[name~="make-reply"]
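
Since all four patterns share the same shape, a client could build them mechanically. The following is only a sketch under that assumption; the function names are hypothetical and are not the helpers listed earlier. The resource argument is itself a selector, such as 'body' or the result of embeddedSelector.

```javascript
// Hypothetical selector builders for the four patterns above.
// `resource` is a selector for a resource element, e.g. 'body'.
// Note: names are inserted as-is; a relation containing characters that are
// special in CSS class selectors (such as ':') would need escaping first.
function propertySelector(resource, property) {
  return resource + ' > .properties > input[name="' + property + '"]';
}

function linkSelector(resource, relation) {
  return resource + ' > .links > a[rel~="' + relation + '"]';
}

function embeddedSelector(resource, relation) {
  return resource + ' > .embedded > .' + relation;
}

function controlSelector(resource, controlName) {
  return resource + ' > .controls > form[name~="' + controlName + '"]';
}
```

Because an embedded resource is just another resource, the builders compose: linkSelector(embeddedSelector('body', 'post'), 'self') gives the second Links example above.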

Conclusion

This was all relatively simple and, now that I’ve done it and thought about it some more, I’ve come to the conclusion that applying HAL’s information model to HTML results in a pretty clean generic interface for machines that you can easily build tooling around.

I’m still not sure using HTML for an API is a brilliant business decision right now, though. It’s a bit too “out there” still. ;)

Jul 15, 2012

Don't use HTML as the Media Type for your API

tl;dr

Render one hypermedia type for your human consumers (HTML) and another hypermedia type for your machine consumers (HAL). Content negotiation (conneg; see HTTP’s Accept, Content-Type and Vary headers) should be relatively easy to leverage for this with a decent development stack.
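
As a rough illustration of the server side of this, here is a sketch of choosing between the two representations from the Accept header. Everything named here is an assumption for the example: the AVAILABLE list, the function names, and the use of application/hal+json as the HAL media type. A real stack would normally provide this for you.

```javascript
// Representations this hypothetical server can render, in order of server
// preference (used to break ties between equally acceptable types).
var AVAILABLE = ['text/html', 'application/hal+json'];

// Parse an Accept header into a list of { type, q } media ranges.
function parseAccept(header) {
  return header.split(',').map(function (part) {
    var pieces = part.trim().split(';');
    var q = 1; // q defaults to 1 when not given
    pieces.slice(1).forEach(function (param) {
      var kv = param.trim().split('=');
      if (kv[0].trim() === 'q') {
        q = parseFloat(kv[1]);
      }
    });
    return { type: pieces[0].trim().toLowerCase(), q: q };
  });
}

// Quality for one candidate: the q of the most specific matching range
// (exact type beats type/*, which beats */*). Returns 0 if nothing matches.
function qualityFor(candidate, ranges) {
  var bestSpecificity = 0;
  var q = 0;
  ranges.forEach(function (range) {
    var specificity = 0;
    if (range.type === candidate) {
      specificity = 3;
    } else if (range.type === candidate.split('/')[0] + '/*') {
      specificity = 2;
    } else if (range.type === '*/*') {
      specificity = 1;
    }
    if (specificity > bestSpecificity) {
      bestSpecificity = specificity;
      q = range.q;
    }
  });
  return q;
}

// Pick the media type to send, or null if nothing is acceptable
// (which a server would typically answer with 406 Not Acceptable).
function chooseMediaType(acceptHeader) {
  if (!acceptHeader) {
    return AVAILABLE[0]; // no Accept header: anything is acceptable
  }
  var ranges = parseAccept(acceptHeader);
  var best = null;
  var bestQ = 0;
  AVAILABLE.forEach(function (candidate) {
    var q = qualityFor(candidate, ranges);
    if (q > bestQ) {
      best = candidate;
      bestQ = q;
    }
  });
  return best;
}
```

A browser-style Accept header resolves to text/html and a client asking for application/hal+json gets HAL. The response would then carry the chosen type in Content-Type and include Vary: Accept so caches keep the two representations apart.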

Why?

Clearly the hypermedia API for your application will have parallels with your HTML application, but the interface you are presenting to machines needs to account for their automated behaviour in an entirely different manner. Machines are not adaptable consumers, so the machine interface to your application needs to be carefully managed so that it exposes the minimum ‘attack surface’ against which developers can write coupled client code.

Mixing your human and machine affordances in one HTML interface results in a messy, broad attack surface and is therefore not a good strategy for sustaining the evolvability of your application.

How that might look in practice..

If you use HTML for *both* human and machine interactions, you will actually end up presenting two competing interfaces to the machine clients, e.g.:

<a href="/people/bob" rel="author">author</a>

Is it valid to select the link control in a coded client by the text ‘author’ wrapped in the anchor tag, or is it only valid to select via the @rel attribute? It’s not immediately apparent from the media type, and that’s one example of the kind of tax you place on consumers by using a muddied interface like HTML. What happens if one of your developers decides the page looks better if you change author to Author?

Machine hypermedia requires a much lighter interface than human hypermedia does. Forms sound good in theory but offer much less benefit for their cost when the run-time consumers of the application are automatons. Also, it’s not unlikely that machine clients will be written against your API that ‘cheat’ on the full processing rules for forms, which means that when you change @method from POST to GET, many of those external clients bomb out. This is more likely to happen as the complexity of the affordance goes up.
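
To show what not cheating involves, here is a sketch of a client building its request from the form’s own @method and @action instead of hard-coding POST. The form argument is a hypothetical plain-object stand-in for a parsed form element, and buildRequest is an illustrative name. Only the url-encoded case is covered.

```javascript
// `form` is a hypothetical stand-in for a parsed <form>:
// { action: '/replies', method: 'post', fields: { content: '...' } }.
// Assumes the action URL has no fragment.
function buildRequest(form) {
  // HTML forms default to GET when no method is given.
  var method = (form.method || 'GET').toUpperCase();
  var encoded = new URLSearchParams(form.fields).toString();

  if (method === 'GET') {
    // For GET, the encoded fields replace any query string on the action URL.
    var base = form.action.split('?')[0];
    return { method: 'GET', url: base + '?' + encoded, body: null };
  }

  // For POST, the encoded fields travel in the request body.
  return {
    method: method,
    url: form.action,
    body: encoded,
    contentType: 'application/x-www-form-urlencoded'
  };
}
```

A client written this way keeps working when the server flips the method; a client that always sends a POST body to the action URL is the kind that bombs out.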

a few other thoughts..

There are also a few subjective reasons HTML is considered ‘not very nice’:

HTML is not at all elegant for representing data. Querying the DOM, even with CSS selectors, feels clunky in comparison to traversing a JSON object.

I’m not sure the hypermedia affordances of HTML are that rich when it comes to machines, either: for example, there’s no clean way of representing the embeddedness/containment of another resource’s state with HTML.

Jul 10, 2012