Friday, September 5, 2008
Designing API for fragile content
Just as with any other profession, programming involves quite a bit of monotonous and repetitive work. Interesting problems do come up, of course, but rarely enough that encountering one always brings a smile to my face. One of my deep interests in the programming profession is API design, so it is fair to say that I got pretty excited when I recently encountered a tricky API design puzzle.

I was tasked with building a forms-based editor in Eclipse for an XML file with a certain schema. I started out by extending the XML source editor that's part of WTP. That gave me the source view tab for my editor, and I could access the XML DOM that the source editor exposed. Any changes I made to the DOM would propagate to the source buffer. That's a pretty good start, but I did not want my forms UI working directly with the DOM. I don't know if the DOM API has any fans, but I am certainly not one of them. I didn't want my UI code getting cluttered with it.

Ok, easy enough. Just take the DOM and wrap it in an API custom-created for the schema. Many of the elements in this particular document schema are tightly typed. There are integers, class names, file paths, etc. My first cut at the API used these types in the getters and setters...

That works well enough when the content is well-formed, but this is an XML file that's edited directly by users. Handling of malformed content is very important. Let's say that the min-duration element is found, but its content cannot be parsed as an integer. The only option that this API left me was to return null. That might be acceptable in some cases, but it produces a rather poor user experience in the context of an editor. The text field bound to this property would be blank, forcing the user to either type in a new value or switch to the source view in order to fix the existing value. What I wanted to do was show the malformed value in the text field together with a problem decoration so that the user can see and fix it easily.
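To make the problem concrete, here is a minimal sketch of what a first-cut typed API of this kind might look like. It is not the code from the actual editor: the class name `FirstCutModel` is made up for illustration, and a plain string field stands in for the text of the min-duration element in the DOM.

```java
// Hypothetical first-cut wrapper: a typed getter and setter over raw element text.
// A String field stands in for the DOM element so the sketch is self-contained.
class FirstCutModel {
    // Raw text of the min-duration element; null means the element is absent.
    private String minDurationText;

    FirstCutModel(String minDurationText) {
        this.minDurationText = minDurationText;
    }

    // Returns the parsed value, or null if the element is absent OR unparsable.
    // The caller cannot tell those two cases apart.
    Integer getMinDuration() {
        if (minDurationText == null) {
            return null;
        }
        try {
            return Integer.valueOf(minDurationText.trim());
        } catch (NumberFormatException e) {
            return null; // the malformed text never reaches the caller
        }
    }

    void setMinDuration(Integer value) {
        minDurationText = (value == null) ? null : value.toString();
    }
}
```

With this shape, a document containing a typo such as `3o` yields null from `getMinDuration()`, exactly as a missing element would, so the UI has nothing to show the user.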
Ok, so let's augment the API a bit...

That's better, but min duration has a default value and only positive integers are valid. A bit more API augmentation was in order...

Now I had enough information in the API to build the UI that I needed, but the API was starting to smell a bit. That's six methods for one element in a schema that has dozens of elements. There had to be a better way to structure this API. After some head-scratching, I decided to try returning a surrogate object from the getter method instead of the actual value. The surrogate would handle parsing, default values and validation...

The getMinDuration() method would always return a non-null surrogate object. The caller then decides which aspect of the value they are interested in querying. The IntegerValue class supplies default validation logic for handling unparsable content, but additional validation can be added. For instance, in this case only integers greater than zero are valid. Since a range is a pretty common constraint, I made the IntegerValue constructor take the min and max values (in addition to the raw string value of the property and the default value). More complicated validation scenarios can be handled by subclassing the IntegerValue class.

Note that only the getter deals with the surrogate object. I wanted to keep the surrogate objects immutable so that they can be handled in a manner similar to basic value types without worrying about synchronization. When setting a value, you either have a raw value (either it can't be parsed or the code in question doesn't want to deal with parsing it) or you have a tightly typed value. An overloaded setter method takes care of both of these scenarios.

As you can imagine, it was simple at this point to extend this pattern to other types. I created a base class for all value types, which made it possible for some code to handle a variety of types without knowing what they actually are. A good example of this is text field data binding code.
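The surrogate approach described above can be sketched roughly as follows. Only `IntegerValue` and `getMinDuration()` are names taken from the description. Everything else is an assumption made for illustration: the other method names, the constructor parameter order, the `TimingModel` class, and the default value of 1. The base class for value types is left out to keep the sketch short.

```java
// Immutable surrogate for an integer-typed element. Member names are illustrative.
class IntegerValue {
    private final String raw;       // text as found in the document, null if absent
    private final int defaultValue; // reported when the element is absent
    private final int min;
    private final int max;

    IntegerValue(String raw, int defaultValue, int min, int max) {
        this.raw = raw;
        this.defaultValue = defaultValue;
        this.min = min;
        this.max = max;
    }

    String getRawValue() { return raw; }

    boolean isDefault() { return raw == null; }

    // Parsed value; the default if the element is absent; null if unparsable.
    Integer getParsedValue() {
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Returns null when valid, otherwise a problem message.
    // Subclasses can override this to add stricter rules.
    String validate() {
        Integer parsed = getParsedValue();
        if (parsed == null) {
            return "\"" + raw + "\" is not an integer";
        }
        if (parsed < min || parsed > max) {
            return parsed + " is outside the range [" + min + ", " + max + "]";
        }
        return null;
    }

    boolean isValid() { return validate() == null; }
}

// Hypothetical model class; a String field stands in for the DOM element text.
class TimingModel {
    private String minDurationText;

    // Always returns a non-null surrogate. Default of 1 is an assumed value.
    IntegerValue getMinDuration() {
        return new IntegerValue(minDurationText, 1, 1, Integer.MAX_VALUE);
    }

    // Raw setter: stores text as-is, even if it cannot be parsed.
    void setMinDuration(String raw) { minDurationText = raw; }

    // Typed setter: for callers that already hold an integer.
    void setMinDuration(Integer value) {
        minDurationText = (value == null) ? null : value.toString();
    }
}
```

One wrinkle with overloading the setter this way is that a literal `setMinDuration(null)` is ambiguous to the compiler, so a caller clearing the value has to cast the null or go through a dedicated method.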
Since any value can be retrieved and set as a string, any value can be bound to a text field. I actually ended up using the same pattern even for properties that were strings by creating a StringValue class. Even though there is no parsing involved, the benefit of having consistent access to default value handling and validation made it worth it. So what do you think?
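For anyone who wants something concrete to react to, here is a rough sketch of the base class and text field binding idea. All names in it are hypothetical apart from `StringValue`, and its constructor signature is assumed. `TextFieldStub` is a stand-in for a real UI widget, not an SWT class, and the binding is reduced to two explicit copy methods rather than listeners.

```java
// Hypothetical base class: every value can be read as text and validated.
abstract class ModelValue {
    private final String raw;         // text from the document, null if absent
    private final String defaultText; // shown when the element is absent

    ModelValue(String raw, String defaultText) {
        this.raw = raw;
        this.defaultText = defaultText;
    }

    String getText() { return raw != null ? raw : defaultText; }

    // Returns null when valid, otherwise a problem message.
    abstract String validate();
}

// No parsing needed, but default handling comes for free from the base class.
class StringValue extends ModelValue {
    StringValue(String raw, String defaultText) { super(raw, defaultText); }

    @Override
    String validate() { return null; }
}

// Illustrative typed value: only integers greater than zero are valid.
class PositiveIntValue extends ModelValue {
    PositiveIntValue(String raw, int defaultValue) {
        super(raw, String.valueOf(defaultValue));
    }

    @Override
    String validate() {
        try {
            return Integer.parseInt(getText().trim()) > 0 ? null : "must be greater than zero";
        } catch (NumberFormatException e) {
            return "not an integer";
        }
    }
}

// Stand-in for a real text widget plus its problem decoration.
class TextFieldStub {
    String text = "";
    String problem; // null means no decoration
}

// Callback the binding uses to push raw text back into the model.
interface RawTextSetter {
    void setRawText(String raw);
}

// The binding only sees ModelValue and raw text, never the concrete type.
class TextBinding {
    private final TextFieldStub field;
    private final RawTextSetter setter;

    TextBinding(TextFieldStub field, RawTextSetter setter) {
        this.field = field;
        this.setter = setter;
    }

    // Shows the value's text, malformed or not, along with any problem.
    void modelToField(ModelValue value) {
        String text = value.getText();
        field.text = (text == null) ? "" : text;
        field.problem = value.validate();
    }

    // Writes whatever the user typed back through the raw setter.
    void fieldToModel() {
        setter.setRawText(field.text);
    }
}
```

The point of the sketch is that `TextBinding` compiles against the base class alone, so the same binding code serves string, integer, and any other value type.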