Friday, January 4, 2013

Runtime bytecode generation comes to Sapphire

Sapphire developers define the model by writing Java interfaces annotated with data semantics. A simple element definition might look like this:


public interface Person extends IModelElement {
    ModelElementType TYPE = new ModelElementType( Person.class );

    // *** Name ***
    ValueProperty PROP_NAME = new ValueProperty( TYPE, "Name" );
    Value<String> getName();
    void setName( String value );

    // *** Age ***
    @Type( base = Integer.class )
    ValueProperty PROP_AGE = new ValueProperty( TYPE, "Age" );
    Value<Integer> getAge();
    void setAge( String value );
    void setAge( Integer value );
}

To instantiate a Person object, we need a concrete class that implements this interface. For the last few years, Sapphire developers have relied on an annotation processor that is part of the Sapphire SDK and is triggered by the @GenerateImpl annotation. The annotation processor would generate an implementation class like this:

public final class PersonImpl extends ModelElement implements Person {
    public PersonImpl( final IModelParticle parent, final ModelProperty parentProperty, final Resource resource ) {
        super( TYPE, parent, parentProperty, resource );
    }

    public Value<String> getName() {
        return (Value) read( PROP_NAME );
    }

    public void setName( final String value ) {
        write( PROP_NAME, value );
    }

    public Value<Integer> getAge() {
        return (Value) read( PROP_AGE );
    }

    public void setAge( final String value ) {
        write( PROP_AGE, value );
    }

    public void setAge( final Integer value ) {
        write( PROP_AGE, value );
    }
}

The generated class is trivial as all the heavy lifting is done by the code in the ModelElement base class. Nevertheless, generating these implementation classes is important. No one wants to write any significant amount of code with a model that is accessible only via the generic read and write methods.

The annotation processor has been working well enough, but I have been wanting to see if on-demand runtime bytecode generation would be a better solution. Deferring generation of implementation classes until runtime removes the burden of incorporating Sapphire compilation into the application build.
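To make the idea of producing an implementation at runtime concrete, here is a minimal sketch. It is not Sapphire's compiler and it does not generate bytecode: it swaps in the JDK's dynamic proxy facility (java.lang.reflect.Proxy) to route typed getters and setters to a generic store. The Contact interface, the GenericStore class and the RuntimeImplSketch class are all hypothetical names invented for this illustration, and the accessors return plain values rather than Sapphire's Value wrapper.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a model interface; not part of Sapphire.
interface Contact {
    String getName();
    void setName(String value);
    Integer getAge();
    void setAge(Integer value);
}

// Hypothetical generic store that plays the role of the generic read and write methods.
final class GenericStore {
    private final Map<String, Object> values = new HashMap<>();

    Object read(String property) {
        return values.get(property);
    }

    void write(String property, Object value) {
        values.put(property, value);
    }
}

final class RuntimeImplSketch {
    // Creates an implementation of the given interface at runtime. Only getX()
    // and setX(value) are handled; any other call (including toString, equals
    // and hashCode on the proxy) throws UnsupportedOperationException.
    @SuppressWarnings("unchecked")
    static <T> T create(Class<T> type, GenericStore store) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (name.startsWith("get") && args == null) {
                return store.read(name.substring(3));
            }
            if (name.startsWith("set") && args != null && args.length == 1) {
                store.write(name.substring(3), args[0]);
                return null;
            }
            throw new UnsupportedOperationException(name);
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler);
    }

    public static void main(String[] args) {
        GenericStore store = new GenericStore();
        Contact contact = create(Contact.class, store);
        contact.setName("Ada");
        contact.setAge(36);
        System.out.println(contact.getName() + ", " + contact.getAge()); // prints "Ada, 36"
    }
}
```

A proxy dispatches every call through reflection, whereas a generated class like PersonImpl is ordinary compiled code. The sketch only shows that nothing has to be produced at build time for callers to get typed accessors.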

Let me preface the next part by saying that I know next to nothing about Java bytecode, so I have been putting off this project for a while. Bytecode generation is difficult, I thought. I would have to learn a lot of new concepts and it would take a long time to re-implement the compiler. Boy was I wrong! I started this project two days ago and today I was able to remove the old annotation processor and push the changes. I haven’t kept track of how long it took to implement the original compiler, but it wasn’t two days!

Another surprising aspect is that the new compiler is significantly simpler than the original one. Purely in numerical terms:

  • Old Compiler: 17 classes, 3219 lines of code
  • New Compiler: 3 classes, 808 lines of code

I attribute the size disparity primarily to two factors:

  1. The Java reflection API is far easier to use than the equivalent Java mirror API that you must use to build an annotation processor.
  2. Generating readable Java source code requires managing formatting and imports. Neither factors into bytecode generation.
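As a rough illustration of the first point, here is how little code it takes to inspect an interface with the reflection API. This is only a sketch, not code from the new compiler: the Account interface and the PropertyScanSketch class are hypothetical, and deriving property names from getter names is an assumption made for the example (Sapphire itself declares properties through the PROP_* fields).

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a model interface; not part of Sapphire.
interface Account {
    String getOwner();
    void setOwner(String value);
    Integer getBalance();
    void setBalance(Integer value);
}

final class PropertyScanSketch {
    // Collects property names by looking for no-argument, non-void getX() methods.
    static Set<String> propertyNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Method method : type.getMethods()) {
            String name = method.getName();
            boolean isGetter = name.startsWith("get")
                    && name.length() > 3
                    && method.getParameterCount() == 0
                    && method.getReturnType() != void.class;
            if (isGetter) {
                names.add(name.substring(3));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Account.class)); // prints [Balance, Owner]
    }
}
```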

The fast progress on the new compiler was further made possible by ASM, a Java bytecode manipulation framework. Leveraging ASM, a framework completely new to me, was made particularly easy by the Bytecode Outline plugin for Eclipse and its ASMifier mode. With the Bytecode Outline view open, you just select a method and you see either the Java bytecode or the ASM code snippet that generates it. It is an incredibly effective way to use ASM without taking the time to learn a new API.



Major kudos to those behind ASM and Bytecode Outline. Secondary kudos to the Java Decompiler Project. I used the standalone version (JD-GUI) to check the bytecode that I was generating.

The new compiler referenced here will ship as part of the upcoming Sapphire 0.7 release.
