Tuesday, October 21, 2008

Creating API, Lessons Learned

The subject of how to properly create and declare API is frequently debated in various communities within the Eclipse ecosystem. One of the thorniest issues is the disagreement over how to treat so-called "provisional API": API that either has not yet received sufficient feedback or has known issues that cannot be addressed prior to the release. There is a lively debate going on right now on this very subject on the mailing list of the Eclipse E4 project (the effort to build the next generation Eclipse platform - sometimes referred to as Eclipse 4.0), so I thought I should jump in and share our experience in this area at the Eclipse Web Tools Platform (WTP) project in hopes that others don't repeat the mistakes that were made.

It's important to note that opinions expressed here are mine alone. Other involved parties may not agree. Names and other identifying information are withheld to protect the guilty.


The start of the WTP project was pretty rough. Large code contributions had to be rationalized in the context of a platform that was supposed to be extensible for many external adopters. One of the challenges was the belief by many of the committers from the company that made the code contribution that the APIs were good as is because they had existed like that for a long time inside that company's commercial products and were therefore proven. Aligned against that was the growing feedback from new adopters who were saying that the APIs were insufficient and in some cases just plain wrong. Creating good APIs that are flexible enough to address a variety of adopter use cases takes a very long time, but unfortunately time was running out. Major companies involved in WTP were pressuring the project to make a release so that they could build commercial products with it. After much debate, a compromise was reached. WTP was going to make a release, but we were not going to declare any API as stable. Everything would be labeled as provisional.

Sounds good in theory, right? What went wrong is that the concept of provisional API was not concretely defined as part of the initial agreement. Everyone (from committers to adopters) ended up with their own ideas about the meaning of the concept. The first release happened and the WTP team went to work on the next release, trying to improve the APIs based on the growing feedback. That's when the fireworks really started. Certain adopters were not particularly happy that WTP was continuously breaking them despite the fact that they were leveraging code clearly placed in internal packages or otherwise marked as provisional. Granted, adopters didn't have much choice if they wanted to build products on top of WTP, but that's what you get when a project is starting out. None of that seemed to matter and eventually the WTP PMC bowed under the pressure by instituting a very restrictive code change policy. An "adopter usage scan tool" was created that WTP adopters could use to scan their code base for references to WTP code. The resulting reports were sent back to WTP, where they were collected and used as a reference for determining whether a change was allowed or not. This new policy effectively negated the original promise that was made with regard to provisional API. The new contract covered everything, including code previously designated as provisional and purely internal code. Instead of committers promoting API, anything that an adopter touched (as represented by these reports) effectively became API.

Work on improving API essentially ground to a halt. It just became too expensive to fix many of the larger problems. Technically, a committer could seek PMC approval to break code referenced in adopter scans, but exceptions were rarely granted. The argument that was frequently made was that a proposed change would affect many lines of code in adopter products, so it was "cheaper" for committers to not make the change in question or at least to make it in a way that was completely backwards compatible. I will leave it as an exercise for the reader to see the fallacy of that argument.

The end result is that WTP was left with large amounts of "in progress" API code in random internal packages that was effectively frozen because it became too expensive for committers to continue to work on this API within the imposed constraints. In many cases, providing the requisite backwards compatibility would have effectively doubled the amount of work. Some improvements that were easy to make in additive fashion continued to be made over the next few releases, but real progress essentially stopped.

Finally, last year a group of committers was convened to try to improve the situation by proposing a new API policy for WTP. The end result formally defined provisional API and started the process of phasing out the flawed adopter usage scans policy. As someone who was involved in drafting the new API policy, I can tell you that I still see many flaws in it, but it is an effort to take a step in the right direction. Only time will tell for sure.

Thoughts on API Creation

The following is a collection of my somewhat random thoughts on API creation and the related processes.

  • The ability to declare provisional API is an essential step in API creation. You cannot be sure that the API is right until you have received sufficient and diverse feedback. It is impossible to attain that level of feedback within one release cycle. Most external adopters will not start looking at a release until it's close to being finished. They will not start building products on it until even later. The best you will get early on is "yeah, that looks about right", which is not good enough.
  • Placing provisional API in an internal package (such as the internal.provisional convention sometimes used by Eclipse Platform) creates unnecessary churn for adopters and committers. Consider the case where provisional API turns out to be 90% correct. Promoting it out of the internal package still changes the package name, so every adopter reference has to be updated, including references to the 90% that did not otherwise change. The advantage of the internal.provisional approach is that you don't have to separately define expectations for provisional code (it gets treated as internal by virtue of the package name), but I would argue that it's worth taking the time to define a separate contract for provisional API since allowing provisional API in non-internal packages results in less work for both adopters and committers.
  • It's important to have a good system for determining whether API is ready to be declared as fully supported (not provisional any more). Leaving the decision completely in the hands of committers or even project leads can lead to problems since people are inherently biased towards their own code. Some things to check when deciding if API is ready to be declared are the level of documentation, unit test coverage, the presence of outstanding API issues in Bugzilla, and the level (as well as diversity) of adopter feedback. I prefer a system where a committer nominates the API for declaration and there is a process where other committers and adopters can raise objections.
  • It's important to carefully balance the needs of committers working on the API and adopters consuming the API. It's a mistake to only look at the problem from the perspective of resource expenditure. For any successful platform, there will always be far fewer resources working on the platform than consuming the platform. Trying to add too much protection for platform adopters can inhibit innovation in the platform and ultimately hurt those same adopters.
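
To make the nomination idea from the third bullet a bit more concrete, here is a minimal sketch of how the readiness checks could be recorded and evaluated. Everything in it is an assumption made for illustration: the ApiNomination class, its field names, and the coverage and adopter thresholds are hypothetical and do not come from the WTP API policy or any Eclipse tooling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one API nomination. Field names and thresholds are
// illustrative assumptions, not part of any WTP or Eclipse policy.
final class ApiNomination {
    final String apiName;
    final boolean documented;
    final double testCoverage;      // fraction of the API exercised by unit tests, 0.0 to 1.0
    final int openApiBugs;          // outstanding API issues in the bug tracker
    final int adoptersWithFeedback; // distinct adopters who have given feedback
    final List<String> objections = new ArrayList<>();

    ApiNomination(String apiName, boolean documented, double testCoverage,
                  int openApiBugs, int adoptersWithFeedback) {
        this.apiName = apiName;
        this.documented = documented;
        this.testCoverage = testCoverage;
        this.openApiBugs = openApiBugs;
        this.adoptersWithFeedback = adoptersWithFeedback;
    }

    // Returns the reasons the API cannot be declared yet; an empty list means ready.
    List<String> blockers(double minCoverage, int minAdopters) {
        List<String> reasons = new ArrayList<>();
        if (!documented) {
            reasons.add("documentation incomplete");
        }
        if (testCoverage < minCoverage) {
            reasons.add("unit test coverage below threshold");
        }
        if (openApiBugs > 0) {
            reasons.add(openApiBugs + " outstanding API issue(s)");
        }
        if (adoptersWithFeedback < minAdopters) {
            reasons.add("feedback from too few adopters");
        }
        for (String objection : objections) {
            reasons.add("objection: " + objection);
        }
        return reasons;
    }

    boolean readyToDeclare(double minCoverage, int minAdopters) {
        return blockers(minCoverage, minAdopters).isEmpty();
    }
}
```

The point of returning a list of blockers rather than a single yes/no is that the decision stays visible to everyone involved: a committer nominates, any committer or adopter can add an objection, and the API is declared only once the list is empty.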


Boris Bokowski said...

Thanks for sharing this - very interesting. I agree with most of your analysis in the "History" section but then was surprised by the "Thoughts on API Creation" section. You can either make it easy for clients and make strong promises about your interfaces (such as promising not to change the API in incompatible ways), or you can decide to not make those promises, in which case you need to be very explicit about it at the time you publish the interfaces. I don't believe you can defer that decision until after you have learned whether your interfaces are 90% useful or only 10% useful. Like I said in my email on the e4 mailing list, two possible options are (1) to make very few promises about compatibility, in which case it is probably good to warn adopters about the necessary "churn" when you decide to break compatibility, or (2) to make promises only to specific parties which are known to you, who are willing to start using the published interfaces under the terms you negotiate with them (no promises to anybody else!).

Konstantin Komissarchik said...


I definitely agree that you don't want to be ambiguous with adopters. But as long as you carefully define what is provisional and the guarantees that go along with that, my argument is that you should be able to have the provisional code in a non-internal package. You can use comments in class javadoc or even special annotations (that could be tooled by API tools) to mark provisional API. That way, you've been up front with adopters and still can change just the parts that need changing prior to declaring API as fully supported (rather than having to change everything due to package change).
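
A rough sketch of the annotation approach, to show what I have in mind. The @Provisional annotation, the ExampleModel interface, and the ProvisionalCheck helper are all hypothetical names invented for this example; none of them exist in Eclipse, WTP, or the API tools.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker annotation; not part of Eclipse or WTP.
// @Documented makes it show up in generated javadoc, and RUNTIME
// retention lets tools find it by reflection.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface Provisional {
    String since();           // release in which the API first appeared
    String note() default ""; // what adopters should expect
}

// Hypothetical API type that lives in a regular, non-internal package.
// Declaring it fully supported later means removing the annotation,
// not moving the type, so adopter imports stay valid.
@Provisional(since = "1.0", note = "May change until API freeze of the next release")
interface ExampleModel {
    String getName();
}

// Hypothetical helper showing how a tool could detect provisional types.
final class ProvisionalCheck {
    static boolean isProvisional(Class<?> type) {
        return type.isAnnotationPresent(Provisional.class);
    }
}
```

Here ProvisionalCheck.isProvisional(ExampleModel.class) returns true, while an unannotated type such as String returns false. A real tool would more likely work on source or bytecode than on loaded classes, but the marker itself would look much the same.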

BTW, provisional API doesn't have to imply no guarantees. One of the things that I think we did well in the new WTP API policy is to spell out the guarantees. In particular, provisional APIs cannot change after API freeze or in a service release.

Steve said...

This is what SWT does: Introduce new API early in the cycle and make sure there is a motivated client to exercise it, then iterate with the client over the cycle. No client? No API.


Ed Merks said...

Excellent post! Those who don't learn from the mistakes of the past are doomed to repeat them.