Sunday, March 1, 2009

Count me out from p2 fan club

I don’t make a habit of ranting about technology, but p2 has been driving me up the wall. The old update manager may not have been perfect, but at least it didn’t block installation scenarios that should work.

So we are putting finishing touches on a new version of Oracle Enterprise Pack for Eclipse (OEPE) and it’s time to test various installation scenarios. Eclipse Ganymede SR2 was also just released last week, so we are verifying compatibility with it. One of the basic installation scenarios that we are testing starts out with an all-in-one kit that includes Eclipse Ganymede GA with the previous version of OEPE. The second step in the scenario is to update all of the Eclipse components in that installation. The new version of OEPE requires at least SR1. I let Eclipse search for updates and install everything that it finds. That works. Presumably at this point, the installation should be equivalent to a fresh Eclipse Ganymede SR2 install. The final step is to install the new version of OEPE from a local update site. P2 thinks for a while and then says there are problems that will prevent the installation from working and refuses to go forward.

WTF? I know perfectly well that the plugins we are installing are compatible with Ganymede SR2. They’ve been built using SR2 as the target platform and they work just fine when simply added to the Eclipse installation. Now what? Taking a look at the problems reported by p2, I find about a hundred messages that look like the following.

Cannot find a solution where both "bundle org.eclipse.wst.validation [1.2.0,1.3.0)" and "bundle org.eclipse.wst.validation [1.1.0,1.2.0)" are satisfied.
Unsatisfied dependency: [org.eclipse.jst.ws.axis.consumption.core 1.0.204.v200708151945] requiredCapability: osgi.bundle/org.eclipse.core.resources/[3.2.0,3.4.0)
Unsatisfied dependency: [org.eclipse.jst.ws.axis.consumption.ui 1.0.204.v200801222138] requiredCapability: osgi.bundle/org.eclipse.core.resources/[3.2.0,3.4.0)
Unsatisfied dependency: [org.eclipse.wst.ws 1.0.204.v200711140435] requiredCapability: osgi.bundle/org.eclipse.core.resources/[3.2.0,3.4.0)
Unsatisfied dependency: [org.eclipse.wst.command.env.ui 1.0.203.v200709052219] requiredCapability: osgi.bundle/org.eclipse.core.resources/[3.2.0,3.4.0)

These messages don’t make a whole lot of sense. None of them reference plugins that I am trying to install and there is no hint in the messages regarding what actually caused the problem. I looked at one of them in detail to make sure I wasn’t missing something obvious. The second message says that a dependency of the org.eclipse.jst.ws.axis.consumption.core plugin cannot be found. That makes sense since it’s a very old version of the plugin that’s not compatible with Ganymede. The question is why p2 is trying to resolve that plugin at all. I look at my installation and I see versions 1.0.304.v200805140230 and 1.0.306.v200810082309. That makes sense. Those versions correspond to what shipped with Ganymede GA and Ganymede SR2. I do not find version 1.0.204.v200708151945, the one p2 is complaining about, anywhere.
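For readers unfamiliar with the notation in those messages, here is a minimal sketch of how a range such as [3.2.0,3.4.0) is read: a square bracket includes the endpoint and a parenthesis excludes it. The helper names (parse_version, parse_range, VersionRange) are hypothetical and written for illustration only; they are not p2 or OSGi APIs, and the sketch ignores version qualifiers for simplicity.

```python
import re
from typing import NamedTuple, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'major.minor.micro[.qualifier]'; the qualifier is ignored here."""
    numbers = [int(part) for part in text.split(".")[:3]]
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


class VersionRange(NamedTuple):
    low: Version
    high: Version
    low_inclusive: bool
    high_inclusive: bool

    def includes(self, version: Version) -> bool:
        """Return True if the version lies inside this range."""
        above = version >= self.low if self.low_inclusive else version > self.low
        below = version <= self.high if self.high_inclusive else version < self.high
        return above and below


def parse_range(text: str) -> VersionRange:
    """Parse interval notation such as '[3.2.0,3.4.0)'."""
    pattern = r"([\[\(])\s*([^,\s]+)\s*,\s*([^\]\)\s]+)\s*([\]\)])"
    match = re.fullmatch(pattern, text.strip())
    if match is None:
        raise ValueError(f"not a version range: {text!r}")
    open_bracket, low, high, close_bracket = match.groups()
    return VersionRange(
        parse_version(low),
        parse_version(high),
        open_bracket == "[",
        close_bracket == "]",
    )


# The range required by the old 1.0.204 plugins in the messages above.
resources_range = parse_range("[3.2.0,3.4.0)")
print(resources_range.includes(parse_version("3.3.2")))
print(resources_range.includes(parse_version("3.4.0")))
```

Assuming a Ganymede (Eclipse 3.4) installation carries a 3.4.x version of org.eclipse.core.resources, the second check shows why the old 1.0.204 plugins cannot be satisfied: the upper bound of the range is exclusive.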

At this point, I gave up trying to make sense of the problem messages and proceeded to blindly try various changes to the way the OEPE update site is constructed to see if I would get a different result. Two alternatives I tried made this scenario work: (a) removing all version constraints from plugin dependencies and (b) reverting to the old-style update site with a site.xml file and no p2 metadata. We still need to do some more testing, but we probably will go with (b) and give up on p2-enabling our update site.

I have had reservations about p2 ever since its poor rollout roughly a year ago, but after a year of fighting with it and this recent experience, I can honestly say (without an ounce of exaggeration) that p2 is the worst regression ever introduced into the Eclipse Platform. I understand the problems that p2 is supposed to solve, but there is just no excuse for destroying the most basic of core scenarios in the process. If it wasn’t ready for Ganymede, it should have stayed in incubation for a while longer.

11 comments:

David Carver said...

One of the issues with P2 is that it tried to re-invent the wheel. Instead of incrementally updating and fixing update manager and evolving it, they tossed it out and started from scratch.

There are other issues as well. Packaging and installing updates should have leveraged existing specifications instead of recreating their own. Eclipse in many ways is getting into too much of a not invented here mentality.

I too have found better success with the old site.xml format. Scenarios that fail under the p2 meta-data format usually seem to work under the site.xml format.

schriner said...

P2 is one of those rare examples where for political reasons the Eclipse process failed. There were plenty of known bugs in P2 when 3.4 shipped, but it was deemed more important to replace the old update manager than to ship a working product. Bugs included this (https://bugs.eclipse.org/bugs/show_bug.cgi?id=121201) fine performance bug, or the non-exposure of features by the UI (e.g.: with the old update manager it was possible to install into extension locations). Another issue is the locking UI (who in ***s name thought fetching many files from not-always-fast servers was a good idea).

I guess the people responsible for P2 are working really hard on their feature, and I am grateful for their work. Just why they chose to take us all hostage is beyond my understanding...

Kim Sullivan said...

I think there must be some kind of dependency calculation bug. In one particular case, I found it hard to get a working eclipse installation without a complete JDT (installing Eclipse colorer manually into dropins worked).

Or maybe the dependency calculation is correct, but there are weird dependency bugs in some plugins that p2 discovers and complains about (unfortunately, it doesn't show how it got the information it got).

Antoine Toulmé said...

I keep on hearing people complaining about the exact same problems on IRC. I keep discussing this stuff with the committers on eclipse-dev and I have opened a couple of bugs about p2 that turned into hot discussions.

In a way, I am happy to see I'm not alone being stuck with this stuff. This deserves a separate blog post from me I guess.

Pascal said...

Konstantin, I think you have been burnt by the initial package not being built properly (e.g. constructed using the dropins) and trying to update from this.
I wished you had opened bugs rather than fighting by yourself.

Anonymous said...

I think P2 is an improvement from the end user perspective, but besides other problems it's still a mess from the user point of view as well:
* you get left with old packages taking up space and there does not seem to be any sensible way of removing them. Delete manually? What can you delete and what can't you? Do you want to spend potentially hours doing that as an end user?
* the "update" or "install new features" story is still not very user-friendly: referenced sites are not consulted automatically when I add a new site, and when there are problems the error messages are AWFUL (as you noted).

Konstantin Komissarchik said...

In response to Le ScaL:

> Konstantin, I think you have been
> burnt by the initial package not
> being built properly (e.g.
> constructed using the dropins) and
> trying to update from this.

That is a possibility that we are investigating.

> I wished you had opened bugs
> rather than fighting by yourself.

We have tried reaching out for help on issues like this in the past, but it often comes down to not having an effective way of communicating a repro for these scenarios. You would need access to both released and pre-released versions of proprietary software and there are all sorts of issues that arise from that.

Konstantin Komissarchik said...

Quick update...

These problems do appear to be related to how the original all-in-one kit was packaged. I have verified that using the dropins folder instead of the old-style structure where you unzip add-ins directly on top of the base eclipse platform does make this scenario work.

Of course, this doesn't really help as both of these packaging forms were supposed to work and the all-in-one kits that we are trying to upgrade are already in the hands of users.

John Arthorne said...

I have debugged many installation failures like this over the last two years, and every one of them has turned out to be a bug in the input to p2 (the metadata). I.e., the error that it reports is a legitimate problem in your bundle and feature dependencies that make the thing you are trying to install fundamentally incompatible with the system you are installing into.

You didn't see many of these errors in UM because UM did almost no dependency analysis - it would allow you to install and build broken installations that would instead fail in unpredictable ways at runtime. While open source hackers generally preferred this (I know what I'm doing, let me break myself), it was not an acceptable situation for commercial products.
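The difference between checking up front and failing at runtime can be pictured with a minimal sketch. It assumes a simplified model in which every requirement is a half-open range [low, high) on a named bundle; find_unsatisfied and the data layout are hypothetical illustrations, not p2's actual implementation, and the installed 3.4.0 version of org.eclipse.core.resources is an assumption about a Ganymede install.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, int, int]
# (required bundle id, inclusive lower bound, exclusive upper bound)
Requirement = Tuple[str, Version, Version]


def fmt(version: Version) -> str:
    """Render a version tuple as 'major.minor.micro'."""
    return ".".join(str(number) for number in version)


def find_unsatisfied(
    installed: Dict[str, List[Version]],
    requirements: Dict[str, List[Requirement]],
) -> List[str]:
    """List every requirement that no installed bundle version can satisfy."""
    problems = []
    for bundle, required in requirements.items():
        for target, low, high in required:
            candidates = installed.get(target, [])
            if not any(low <= version < high for version in candidates):
                problems.append(
                    f"{bundle} requires {target} [{fmt(low)},{fmt(high)})"
                )
    return problems


# Assumed installation state: only a 3.4.0 core.resources bundle is present.
installed = {"org.eclipse.core.resources": [(3, 4, 0)]}
requirements = {
    "org.eclipse.wst.ws 1.0.204": [
        ("org.eclipse.core.resources", (3, 2, 0), (3, 4, 0)),
    ],
}

for problem in find_unsatisfied(installed, requirements):
    print(problem)
```

An installer that skips this kind of check would accept the bundle and leave the failure to surface when the bundle is loaded.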

I also want to reinforce Pascal's comment that you are encouraged to report these problems rather than suffer silently. From a quick bug search I see you have *never* entered a bug against p2 since its inception, from which I can only surmise that you are quite satisfied with it (bug 225410 reported by you against PDE was later marked a dup of a p2 bug). Like all successful software projects there are many bugs and enhancements open against p2, but there are also nearly 2000 resolved bugs - so reporting your problems, suggestions, etc, only helps to improve the chance of them being addressed.

A case in point is bug 200380, which is the poor explanation problem you are seeing here. A team from University of Artois, Genuitec, and IBM, have been working hard for the past year to solve this extremely complex problem of providing simple explanations for arbitrarily complex constraint satisfaction problems. Much of this work was released last weekend, and we look forward to your input on how we can further improve this solution.
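To give a feel for what a "simple explanation" means in the smallest possible case, here is a toy sketch that shrinks a set of conflicting version constraints on one bundle down to a minimal conflicting subset by dropping constraints one at a time. This is a generic deletion-based approach chosen for illustration, not a description of how p2 computes its explanations; the function names, the extra wide range, and the candidate versions are all hypothetical.

```python
from typing import List, Tuple

Version = Tuple[int, int, int]
# Half-open range: (inclusive lower bound, exclusive upper bound)
Range = Tuple[Version, Version]


def satisfiable(constraints: List[Range], candidates: List[Version]) -> bool:
    """True if a single candidate version meets every constraint at once."""
    return any(
        all(low <= version < high for low, high in constraints)
        for version in candidates
    )


def minimal_conflict(constraints: List[Range], candidates: List[Version]) -> List[Range]:
    """Drop constraints one by one, keeping only those needed for the conflict.

    Returns an empty list when the constraints are satisfiable. Assumes the
    candidate list is non-empty.
    """
    core = list(constraints)
    if satisfiable(core, candidates):
        return []
    index = 0
    while index < len(core):
        trial = core[:index] + core[index + 1:]
        if not satisfiable(trial, candidates):
            core = trial  # still conflicting without it, so it is not needed
        else:
            index += 1  # removing it resolves the conflict, so keep it
    return core


# Two ranges from the first error message in the post, plus a harmless
# hypothetical third range that should not appear in the explanation.
constraints = [
    ((1, 2, 0), (1, 3, 0)),
    ((1, 1, 0), (1, 2, 0)),
    ((1, 0, 0), (2, 0, 0)),
]
candidates = [(1, 1, 0), (1, 2, 0)]  # hypothetical available versions

print(minimal_conflict(constraints, candidates))
```

The result keeps only the two disjoint ranges and discards the wide one, which is the kind of reduction that turns a hundred messages into one readable cause.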

Konstantin Komissarchik said...

> I have debugged many installation
> failures like this over the last
> two years, and every one of them
> has turned out to be a bug in the
> input to p2 (the metadata).

I would be careful about making broad claims like this. It might be common for the issue to be with the input, but p2 is not free of bugs. In this particular case, for instance, the issue turned out to be a known bug (based on Pascal's comments) in p2 that surfaces when the base installation is put together using the traditional "just unzip it together" approach. I have since confirmed that the problem does not reproduce when using the dropins folder.

> You didn't see many of these
> errors in UM because UM did
> almost no dependency analysis
> [snip] it was not an acceptable
> situation for commercial products.

The issue is not the presence of dependency analysis. That's a great feature in principle. The issue is the presence of a large number of bugs in p2 when it was launched and failure messages that are completely incomprehensible to anyone outside of the p2 team.

> I also want to reinforce
> Pascal's comment that you are
> encouraged to report these
> problems rather than suffer
> silently. From a quick bug
> search I see you have *never*
> entered a bug against p2 since
> its inception, from which I can
> only surmise that you are quite
> satisfied with it [snip]

That's quite a leap. I open bugs all the time when I find problems with various Eclipse components. Due to the nature of the problem, opening bugs against p2 is quite challenging. In most cases, there is no effective way to gather the necessary information to open a useful bug. I don't enjoy taking the time to open a bug only to have it closed as "not repro" or "not sufficient information".

This is a problem that p2 could help with by including debugging and problem reporting tools. I have opened Bug 267275 with some ideas of what could be done to improve the situation considerably.

> A team from University of
> Artois, Genuitec, and IBM, have
> been working hard for the past
> year to solve this extremely
> complex problem of providing
> simple explanations for
> arbitrarily complex constraint
> satisfaction problems.

That's very good to hear and I look forward to improved p2 user experience on Galileo. Part of the point that I was trying to make is that this problem should have been solved before rolling p2 out. In Ganymede, p2 solved one big problem (installation safety) at the expense of creating an equally big problem (incomprehensible failure messages). Different people will weigh those two problems differently, but from our perspective, we had far fewer users complaining about broken installations prior to p2 than we have complaining about not being able to install our product (and not understanding why) with p2.