Sunday, February 1, 2009

Better way to manage dependency version ranges

OSGi provides an extremely powerful and precise mechanism for controlling acceptable version ranges when specifying a dependency on bundles or packages. In theory (as described by the policies of various projects at Eclipse), the developer would take into account his plugin’s API and behavior needs, cross-reference that with version information about the bundle in question and carefully craft the version range in the dependency declaration to accurately reflect his plugin’s actual needs, while leaving the version range as open as possible to allow users maximum flexibility when composing an installation. Further, in theory, the developer should be continuously aware of the dependency version ranges specified in his product’s various plugins and how they correlate to the functionality exposed by those plugins. As development progresses, the developer is supposed to be able to spot when he has started depending on functionality that’s not available in the specified minimum version and reset the minimum version accordingly.

That’s the theory. In practice, I haven’t met a single developer with sufficient time on their hands or sufficient mental capacity to keep all of the necessary information in their head at all times in order to properly apply this policy. What I’ve seen happen most often is that the minimum version gets set based on whatever the plugin version happens to be at the time the dependency is first added. PDE helpfully inserts this information in your manifest by default. The maximum version then gets set by applying a team policy (typically by bumping up either the major or the minor version). This happens when the dependency is first introduced. As the code continues to evolve, the minimum version is typically not touched again. The maximum version is incremented when the build gets broken by the dependency bumping up its version past a certain point. The cycle repeats.

After many years of observing this situation, I am convinced that having developers manage version ranges creates a lot of overhead and does not yield satisfactory results no matter how hard people try.  To me, dependency version ranges are most useful when you have shipped your product in binary form. When taken collectively across a component (collection of bundles), they represent a statement of what your team is willing to support as a working configuration. Ideally, this information should be consistent across plugins and as accurate as possible.

Any time you talk about setting version ranges, you are considering three versions:

  1. The version that you developed and did most of your testing with. I call this the “target version”. Typically, this is what you would list as recommended configuration in your documentation.
  2. The minimum version that you are willing to support. The level of testing you can afford to allocate to this version is bound to be less than what you would allocate for the target version, so there is a certain amount of risk that an undetected issue is going to slip through. The further back you go from the target version when setting the minimum version, the greater your risk.
  3. The maximum version that you are willing to support. Since this version will typically not exist at the time of your ship date, setting this version involves an educated guess based on an understanding of what policies your dependencies use when incrementing their versions and the degree to which you are relying on undocumented (internal) code and behaviors. The spread between the target version and the maximum version is where your highest risk lies. On one hand, you’d like to assure long viability of your release in the field. On the other hand, the further out you go, the greater the risk that your product will not work and make a liar out of you in the eyes of your users.

Because getting the above version decisions right and consistent across a component is extremely important, it is not a good idea for individual developers to be making these decisions on a plugin-by-plugin basis. In an Open Source environment, this should be a component-wide decision made collectively by the committers. In a commercial environment, this decision is often made higher up in the organization based on availability of resources and target user base considerations.

When the overall decision is made, it is typically expressed in broad terms. For instance… “this version will ship on Ganymede SR1, but should work with all versions of Ganymede starting with GA”. It is then up to developers to translate that requirement into version ranges in the manifest.

That’s a ton of tedious manual work with lots of room for mistakes. In other words, a perfect candidate for automation. A few years ago, I wrote a set of two custom Ant tasks to automate this process. The first task reads an Eclipse installation and produces an inventory file that lists the id and version of every bundle found. The second task takes as input an inventory file representing the minimum platform, an inventory file representing the target platform and a policy for setting the maximum versions. For every dependency, the task looks up the version from the minimum platform inventory. That becomes the left-hand side of the version range. It then looks up the bundle version in the target platform inventory and applies the policy function to it. Here are some examples of policy functions: “x.y.z -> x+1.0.0”, “x.y.z -> x.y+1.0” or the extremely conservative “x.y.z -> x.y.z+1”. You can set different policies for different plugins or components based on what you know of their versioning conventions. The version returned by the policy function becomes the right-hand side of the version range.
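To make the range computation concrete, here is a minimal sketch in Java. It is not the actual Ant task code. The class name, the `Policy` enum, the in-memory maps standing in for inventory files and the example version numbers are all assumptions made for illustration. It also assumes that the version returned by the policy function is used as an exclusive upper bound.

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative sketch only; all names here are hypothetical.
class VersionRangeSketch {

    // The three example policy functions, each mapping a target version
    // {major, minor, micro} to the upper bound of the range.
    enum Policy implements UnaryOperator<int[]> {
        NEXT_MAJOR { // x.y.z -> x+1.0.0
            @Override public int[] apply(int[] v) { return new int[] { v[0] + 1, 0, 0 }; }
        },
        NEXT_MINOR { // x.y.z -> x.y+1.0
            @Override public int[] apply(int[] v) { return new int[] { v[0], v[1] + 1, 0 }; }
        },
        NEXT_MICRO { // x.y.z -> x.y.z+1
            @Override public int[] apply(int[] v) { return new int[] { v[0], v[1], v[2] + 1 }; }
        }
    }

    // Parses the first three numeric segments; a trailing qualifier is ignored
    // and missing segments default to 0.
    static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] v = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            v[i] = Integer.parseInt(parts[i]);
        }
        return v;
    }

    static String format(int[] v) {
        return v[0] + "." + v[1] + "." + v[2];
    }

    // Left-hand side comes from the minimum platform inventory; right-hand side
    // is the policy applied to the target platform version.
    // Assumption: the upper bound is exclusive, hence the closing ")".
    static String range(String bundleId,
                        Map<String, String> minInventory,
                        Map<String, String> targetInventory,
                        Policy policy) {
        String min = minInventory.get(bundleId);
        String target = targetInventory.get(bundleId);
        if (min == null || target == null) {
            throw new IllegalArgumentException("Bundle missing from inventory: " + bundleId);
        }
        int[] upper = policy.apply(parse(target));
        return "[" + format(parse(min)) + "," + format(upper) + ")";
    }

    public static void main(String[] args) {
        // Made-up version numbers, for illustration only.
        Map<String, String> min = Map.of("org.eclipse.core.runtime", "3.4.0");
        Map<String, String> target = Map.of("org.eclipse.core.runtime", "3.4.1");
        System.out.println(range("org.eclipse.core.runtime", min, target, Policy.NEXT_MAJOR));
    }
}
```

Choosing a different policy per plugin or component then amounts to passing a different `Policy` value for each bundle id.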

We have been using these two tasks to automate and improve the quality of our version ranges for several releases of Eclipse tooling products at BEA and now at Oracle. Developers don’t set the versions on the dependencies specified in the bundle manifests stored in the source repository. At the end of every build, a process runs that splices version ranges into the manifests just prior to packaging the bundles for distribution. The target inventory is always generated on the fly based on whatever the product is building against. The minimum platform inventory is generated once when the minimum platform decision is made. The inventory is then stored in the source repository.
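The splicing step might look roughly like the following Java sketch. This is not the actual build code. The class and method names are hypothetical, and it handles only the value of a `Require-Bundle` header. It assumes the header in source control carries no version attributes, so splitting on commas is safe. A real implementation would need a proper manifest header parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical.
class ManifestSpliceSketch {

    // Appends a bundle-version attribute to every Require-Bundle entry
    // for which a computed range is available.
    static String splice(String requireBundle, Map<String, String> ranges) {
        List<String> out = new ArrayList<>();
        // Assumption: no quoted values containing commas appear in the source header.
        for (String entry : requireBundle.split(",")) {
            String trimmed = entry.trim();
            // The bundle id is everything before the first ';'.
            String bundleId = trimmed.split(";", 2)[0].trim();
            String range = ranges.get(bundleId);
            out.add(range == null
                    ? trimmed
                    : trimmed + ";bundle-version=\"" + range + "\"");
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        // Made-up range, for illustration only.
        Map<String, String> ranges = Map.of("org.eclipse.core.runtime", "[3.4.0,4.0.0)");
        String header = "org.eclipse.core.runtime,org.eclipse.ui;visibility:=reexport";
        System.out.println(splice(header, ranges));
    }
}
```

Entries with no computed range are passed through unchanged, so a build using this approach could flag them separately rather than silently shipping an open-ended dependency.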

This has been an extremely useful process improvement for us. Not only do we have more confidence in the version ranges encoded in our product distributions, but it takes significantly less work for developers to manage all of this. The developers never have to think about dependency versions during the normal course of development, and integrating new versions of dependencies takes less work (since version ranges in the manifest don’t have to be fixed manually to get the build to work).

2 comments:

Unknown said...

This sounds like an interesting approach. But it obviously requires that providers of plugins follow some rules, like the rules defined in the Eclipse version numbering guidelines.

Since Eclipse is promoting bundle import and not package import, I guess your solution works only for bundles (which is not too bad in the context of Eclipse, although Jeff seems to suggest using package imports (see page 7)).

Konstantin Komissarchik said...

Scharf,

You definitely have to know something about how your dependencies manage their versions. If you don't know their versioning convention, then you can't effectively decide on a policy for setting upper ranges. The only thing you can do absent this information is adopt a defensive stance (a.b.c.d -> a.b.c.d policy). This is true whether you are using the automated approach that I have described or doing this manually.

The same principles that I described can be adapted to handle package import as well as bundle import. I am sure I will have to dig deeper into this at some point if package import starts making inroads at Eclipse.