Last night, I had a closer look at Microsoft’s OData protocol. The opening line on the introduction page reads: “The Open Data Protocol (OData) is a Web protocol for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today.”

This very much sounded like a solution for the problem we were facing with our last generation LIMS (Laboratory Information Management System) a few years back: A closed, monolithic system, accessible only through a native GUI client, which was difficult to maintain and to adapt to our perpetually changing requirements. In early 2010, we decided to tackle this data silo problem by redesigning the LIMS from scratch as a RESTful web services architecture. The project was a great success and earlier this year we released the generic components of this work as a Python package called everest.

Given this background, I was quite curious to learn what the OData folks have on offer and went on to study the data model and service specification documents. This turned out to be a highly fascinating read for two reasons:

Firstly, I realized that OData shares a lot of basic design decisions with everest: Both put uniform REST operations on the exposed data objects at the center and use Atom as the main representation content type; OData “entities” correspond to everest “member resources”, “entity sets” to “collection resources”, “properties” to “attributes”, and “navigation properties” to “links”. Initially, I was quite flattered that a project as serious and widely known as OData would make a lot of the same design decisions as we did, but then I realized that most of these shared elements are actually “forced moves” in the sense that they are just the most sensible way to build a uniform RESTful web service framework.

The second thing that fascinated me about the OData specification was not the similarities with everest, but the differences. I will quickly point out the ones I found most striking, starting with OData features that are missing in everest:

  • Complex data types. The everest data model only distinguishes between resources (which may reference other, nested resources) and “terminal” data objects (which have a simple, atomic type). While this simplifies the protocol, it forces the service provider to expose all complex data types as full blown, addressable resources, which is not always desirable;
  • Query parameters for fine-tuning the response. Specifically, the client can control which attributes of a resource should be included in the representation returned by the server (using the $select parameter) and whether they should be represented inline (using the $expand parameter) or as URLs (using $links in the resource path). In everest, this can only be done statically for each combination of resource and representation;
  • Operations on resources. I can only guess that this part of the OData specification was added to make the transition from SOAP based web services easier. everest purposefully abstains from a “hybrid” service architecture (i.e., mixing REST with RPC-style operations). In our daily practice, we have yet to encounter a situation where (admittedly sometimes creative) use of the REST operations was not sufficient to implement the required application logic;
  • Support for PATCH. This is a really nifty feature – the ability to do partial updates for large resources is enormously useful;
  • Math and grouping operators for filter operations. This is also a neat feature as it provides a substantial extension of the realm of possible queries at relatively little cost.
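To make the query options above a bit more concrete, here is a minimal sketch of how a client might assemble such a request URL. The service root, entity set and property names are made up for illustration, and build_query_url is a hypothetical helper, not part of any OData library.

```python
from urllib.parse import quote


def build_query_url(service_root, entity_set, select=None, expand=None,
                    filter_expr=None):
    """Assemble an OData-style query URL (illustrative helper only)."""
    options = []
    if select:
        # $select limits the properties included in the representation.
        options.append("$select=" + ",".join(select))
    if expand:
        # $expand asks for related entities to be represented inline.
        options.append("$expand=" + ",".join(expand))
    if filter_expr:
        # Percent-encode spaces but keep parentheses, quotes and commas readable.
        options.append("$filter=" + quote(filter_expr, safe="()',"))
    url = service_root.rstrip("/") + "/" + entity_set
    if options:
        url += "?" + "&".join(options)
    return url


# Hypothetical service: a filter using a math operator (add) and grouping.
url = build_query_url(
    "http://example.org/service.svc",
    "Products",
    select=["Name", "Price"],
    expand=["Category"],
    filter_expr="(Price add 5) gt 10",
)
print(url)
```

The filter expression combines the arithmetic operator add with parentheses for grouping, which is the kind of query the last bullet refers to.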

There are also a few things that everest offers and OData does not:

  • Decoupling of resource and entity level. In everest, the resource layer is constructed explicitly on top of the entity domain model. Only entity attributes that are exposed through a resource attribute are visible to the client. This allows you to a) expose a pre-existing entity domain model through a thin resource layer as a REST application; and b) isolate the resource layer from changes in your entity domain model. Of course, you could also perform such a mapping inside an OData application, but this would have to happen outside of the framework;
  • “in-range”, “contains” and “contained” operators for filter operations. Especially the “contained” operator is very handy in cases where you want to retrieve a whole collection of resources with one request;
  • CSV as representation content type. This is vital in our application domain (Life Sciences) where data import and export is still often manual (e.g., through Excel).
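The ideas behind these three points can be sketched in plain Python. This is not everest's actual API: the class names, the attribute mapping and the helper functions below are all invented for illustration, and the reading of “contained” as “attribute value is one of a given list of values” is my assumption.

```python
import csv
import io


class SampleEntity:
    """Hypothetical domain entity; internal_notes should stay private."""

    def __init__(self, id, label, concentration, internal_notes):
        self.id = id
        self.label = label
        self.concentration = concentration
        self.internal_notes = internal_notes


class SampleResource:
    """Thin resource layer: only mapped attributes are visible to clients."""

    # resource attribute name -> entity attribute name
    exposed = {"id": "id", "label": "label", "conc": "concentration"}

    def __init__(self, entity):
        self._entity = entity

    def to_dict(self):
        return {name: getattr(self._entity, attr)
                for name, attr in self.exposed.items()}


def contained(resources, attribute, values):
    """Keep resources whose attribute value is one of the given values."""
    wanted = set(values)
    return [r for r in resources if r.to_dict()[attribute] in wanted]


def to_csv(resources):
    """Render a collection of resources as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(SampleResource.exposed),
                            lineterminator="\n")
    writer.writeheader()
    for resource in resources:
        writer.writerow(resource.to_dict())
    return buffer.getvalue()


entities = [
    SampleEntity(1, "A", 0.5, "spilled half of it"),
    SampleEntity(2, "B", 1.0, ""),
    SampleEntity(3, "C", 2.0, "re-measure"),
]
resources = [SampleResource(entity) for entity in entities]
print(to_csv(contained(resources, "id", [1, 3])))
```

Renaming concentration in the entity class would only require touching the mapping, not the representation the client sees, which is the isolation benefit described in the first bullet.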

In the end, I came away deeply impressed with OData: The protocol specification leaves little to be desired and has been adopted by a thriving ecosystem of producers and consumers. I still think everest has a few things to offer, however: If you are already committed to OData, you could use it to reflect on OData’s design (as I just did in the other direction); if, on the other hand, you are a Python-minded web developer looking for a RESTful framework to open up a number of data silos in your organization, everest might be able to supply all the functionality you need with very little overhead.

Last night, I was looking for a background image that would visualize the main theme of this blog to make it more appealing. Since we chose “Bits and Bases” as the tagline for the blog, an image showing random sequences of bits interspersed with random sequences of nucleotide bases seemed perfectly adequate. Not unexpectedly, a web search for suitable images did not turn up any useful hits, so I started to look for a programmatic way to create the desired symbol strings, preferably in my favorite language, Python.

There are at least a dozen different packages that would allow you to do this comfortably in Python, but I ended up with a slightly unusual choice: NodeBox, an application that allows you to create stunning 2D visuals with very little effort. What was particularly captivating about NodeBox was not only the beauty of the samples shown in the gallery, but the ease with which it allows you to explore its features interactively: Just copy and paste some of the sample code into the script editor panel, press F5 and review your rendered artwork on the display panel (or, if your script contains an error, decipher the stack trace in the message panel).

The script shown below runs inside the NodeBox environment, which already has all the relevant drawing functions in the global namespace:

Pretty simple – but it is astonishing how different the results look with different symbol and background colors or with different font sizes.
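The string-generation step on its own can be sketched in plain Python, independent of NodeBox. The function name, the run lengths and the line width below are assumptions for illustration, not values from the script; inside NodeBox, each generated line would then be handed to the drawing commands.

```python
import random


def random_symbol_line(length, min_run=4, max_run=12, rng=random):
    """Return a string alternating random runs of bits and nucleotide bases."""
    alphabets = ("01", "ACGT")
    pieces = []
    total = 0
    which = rng.randrange(2)  # start with either bits or bases
    while total < length:
        run = rng.randint(min_run, max_run)
        pieces.append("".join(rng.choice(alphabets[which]) for _ in range(run)))
        total += run
        which = 1 - which  # switch alphabet for the next run
    # Trim the last run so every line has exactly the requested width.
    return "".join(pieces)[:length]


# A seeded generator makes the output reproducible between runs.
generator = random.Random(42)
lines = [random_symbol_line(80, rng=generator) for _ in range(5)]
for line in lines:
    print(line)
```

Changing min_run and max_run alters how finely the bits and bases are interleaved, which has a visible effect on the texture of the resulting image.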

Obviously, the script above only scratches the surface of what NodeBox can do. There is much more to explore within NodeBox (paths, transformations, images) and beyond (e.g., the fancy NodeBox 2 project which adds a graphical workflow layer on top of the NodeBox engine). Enjoy!