I suspect that I am about to bring the Wrath of Ed down on me, but here it goes...
There are quite a few Java code generators that can take an arbitrary XML Schema and spit out tons of code that a developer doesn't have to write. There are JAXB, Apache XMLBeans and of course EMF. I am sure there are others. While code generators save us a lot of time, the push-button approach can lead to a dumbing down of our models.
I would argue that there are relatively few core modeling patterns, but the flexibility of XML makes it easy to express these patterns in a variety of ways. The generated code then unintentionally surfaces (rather than hides) these XML serialization details in the model layer. This forces the clients of the model to deal with an inconsistent and often difficult-to-use API.
Consider the basic example of a boolean property. I have seen at least three different ways of representing that in XML:
<some-flag>true</some-flag>
<some-flag-enabled/> <!-- absence means false -->
<some-flag value="true"/> <!-- the attribute is required -->
The above three cases would generate different model code even though from the modeling perspective, they represent the same construct.
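To make that concrete, here is a rough sketch of what a hand-written model accessor could look like if it hid the serialization choice from its clients. This is only an illustration under my own assumptions, not what any of the generators above produce: the class name SomeFlagModel, the parent element name root, and the decision to treat a missing element as false are all made up for the example. A real model would bind to whichever single form its schema uses; the sketch accepts all three just to show that the client-facing API does not have to change.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical hand-written model class: one boolean property,
// regardless of which of the three XML forms carries it.
final class SomeFlagModel {
    private final Element parent;

    SomeFlagModel(Element parent) {
        this.parent = parent;
    }

    // Convenience factory for the example; "xml" is a complete document
    // whose root element contains the flag.
    static SomeFlagModel parse(String xml) {
        try {
            return new SomeFlagModel(DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // The only thing a model client ever sees.
    boolean isSomeFlag() {
        // Form 2: <some-flag-enabled/>, where presence means true.
        if (child("some-flag-enabled") != null) {
            return true;
        }
        Element flag = child("some-flag");
        if (flag == null) {
            return false; // assumption: a missing element means false
        }
        // Form 3: <some-flag value="true"/>
        if (flag.hasAttribute("value")) {
            return parseXsdBoolean(flag.getAttribute("value"));
        }
        // Form 1: <some-flag>true</some-flag>
        return parseXsdBoolean(flag.getTextContent());
    }

    // xs:boolean allows "true" and "1" as the true literals.
    private static boolean parseXsdBoolean(String text) {
        String s = text.trim();
        return "true".equals(s) || "1".equals(s);
    }

    // Returns the first direct child element with the given name, or null.
    private Element child(String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

A UI binding written against isSomeFlag() would not need to change if the schema moved from one form to another.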
A more complex pattern is the selector-details construct where selector is a type enumeration and details provide settings specific to the type. I stopped counting how many different ways I've seen that pattern represented. Here are two of the most common examples:
Example 1: An explicit type element controls which property elements are applicable.
<type>...</type> <!-- valid values are X and Y -->
<property-1>...</property-1> <!-- associated with type X -->
<property-2>...</property-2> <!-- associated with type Y -->
<property-3>...</property-3> <!-- associated with type X and type Y -->
Example 2: In this case, the elements alternative-x and alternative-y are mutually exclusive. The element names are functioning as type selectors.
<alternative-x>
  <property-1>...</property-1>
  <property-3>...</property-3>
</alternative-x>
<alternative-y>
  <property-2>...</property-2>
  <property-3>...</property-3>
</alternative-y>
I would argue that the above cases are semantically identical and therefore should have the same representation in the Java model. Of course, that doesn't happen. All of the existing code generators that I am aware of will produce drastically different code for these two alternatives.
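As a sketch of what "the same representation" could mean, here is one hand-written model class that reads either form and exposes an identical API. Everything in it is my own invention for illustration: the class name SelectorDetails, the Type enum, the wrapping root element, and the choice to return null for a property that does not apply to the current type.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical hand-written model for the selector-details pattern.
final class SelectorDetails {
    enum Type { X, Y }

    private final Type type;
    private final Element details; // element that holds the property-* children

    private SelectorDetails(Type type, Element details) {
        this.type = type;
        this.details = details;
    }

    static SelectorDetails parse(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        // Example 1: an explicit <type> element, properties are its siblings.
        Element typeElement = child(root, "type");
        if (typeElement != null) {
            return new SelectorDetails(Type.valueOf(typeElement.getTextContent().trim()), root);
        }
        // Example 2: the wrapper element name acts as the selector.
        Element x = child(root, "alternative-x");
        if (x != null) {
            return new SelectorDetails(Type.X, x);
        }
        Element y = child(root, "alternative-y");
        if (y != null) {
            return new SelectorDetails(Type.Y, y);
        }
        throw new IllegalArgumentException("no type selector found");
    }

    Type getType() { return type; }

    // Properties that do not apply to the current type read as null.
    String getProperty1() { return type == Type.X ? text("property-1") : null; }
    String getProperty2() { return type == Type.Y ? text("property-2") : null; }
    String getProperty3() { return text("property-3"); }

    private String text(String name) {
        Element e = child(details, name);
        return e == null ? null : e.getTextContent().trim();
    }

    // Returns the first direct child element with the given name, or null.
    private static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

Client code calls getType() and the property getters and never learns which of the two XML layouts was on disk.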
So why should we care? I would argue that in many cases, while we are saving time by generating model code, the savings come at the expense of complicating the model consumer code.
Recently, I took over a project at Oracle that was building form-based editors for several rather complicated WebLogic Server deployment descriptors. The schemas of these descriptors evolved over many server releases and many people had a hand in augmenting them. The result is a complete lack of consistency. You could say that perhaps the schemas should have been evolved more carefully, but I would argue that they are a rather realistic example of what real-world complex schemas look like. In any case, the first attempt at building these editors was to generate an EMF model based on the XSDs and to build UI that would bind to EMF. That worked OK for a while, but eventually the UI code started to get too complicated. Much of the UI binding code had to be hand-written. It ultimately made sense to throw away the generated model code and to hand-code the model. That allowed us to control exactly how the model surfaces XML constructs and made it possible to reduce the amount of custom UI code that was necessary by literally several orders of magnitude.
I am certainly not trying to say that generated model code is a bad idea, but the ease with which it is possible to toss an XSD into a code generator and get a bunch of model code in return plays a part in encouraging developers to pay less attention to the model layer than it really needs.