Thursday, March 12, 2009

More Compiling Ruby to Java Types

I did another pass on compiler2, and managed to wire in signature support. So let's look at a couple of examples:
class MyRubyClass
  def helloWorld
    puts "Hello from Ruby"
  end

  def goodbyeWorld(a)
    puts a
  end

  signature :helloWorld, [] => Java::void
  signature :goodbyeWorld, [java.lang.String] => Java::void
end

In this case we have our friend MyRubyClass once again, with helloWorld and goodbyeWorld methods. You'll recall from my previous post that these two methods originally compiled as returning IRubyObject, and goodbyeWorld compiled as receiving a single IRubyObject parameter.

But with signature support, things are so much cooler! The two "signature" lines at the bottom of the class (syntax and structure are totally up for debate) associate signatures with the two methods. helloWorld receives no parameters and has a void return type. goodbyeWorld receives a single String parameter and has a void return type.
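To make the idea concrete, here is a minimal sketch of how a class-level "signature" method could simply record metadata for a compiler to read later. This is not compiler2's actual code: the SignatureSupport module and the signatures table are hypothetical names, and plain symbols stand in for Java::void and java.lang.String so it runs on any Ruby.

```ruby
# Hypothetical stand-in for the signature facility; NOT compiler2's real code.
module SignatureSupport
  # Per-class table: method name => { params: [...], returns: ... }
  def signatures
    @signatures ||= {}
  end

  # Accepts the same shape as in the post:
  #   signature :name, [param_types] => return_type
  def signature(name, mapping)
    params, return_type = mapping.first
    signatures[name.to_sym] = { params: params, returns: return_type }
  end
end

class MyRubyClass
  extend SignatureSupport

  def helloWorld
    puts "Hello from Ruby"
  end

  def goodbyeWorld(a)
    puts a
  end

  # Symbols stand in for Java types (an assumption of this sketch).
  signature :helloWorld, [] => :void
  signature :goodbyeWorld, [:"java.lang.String"] => :void
end

p MyRubyClass.signatures
```

Since the table is plain data hanging off the class, anything that can see the class at runtime can read it, and that is all a compiler would need.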

The compiler takes this new information, and produces a more normal-looking set of Java signatures:
Compiled from ""
public class MyObject extends org.jruby.RubyObject{
    static {};
    public MyObject();
    public void helloWorld();
    public void goodbyeWorld(java.lang.String);
}

Huzzah! There's almost nothing here to give away that we're actually dealing with Ruby code under the covers. And the code that consumes this is just as simple:
public class MyObjectTest {
  public static void main(String[] args) {
    MyObject obj = new MyObject();
    obj.helloWorld();
    obj.goodbyeWorld("Goodbye from Java");
  }
}

And that's literally all there is to it. Here's a more advanced example:
class MyRubyClass
  %w[boolean byte short char int long float double].each do |type|
    java_type = Java.send type
    eval "def #{type}Method(a); a; end"
    signature "#{type}Method", [java_type] => java_type
  end
end

This time we're actually *generating* the methods, looping over a list of Java primitives and eval'ing a method for each. So this is *runtime* generation of methods, like any good Rubyist loves to do. And of course, this is absolutely no problem for compiler2:
Compiled from ""
public class MyObject2 extends org.jruby.RubyObject{
    static {};
    public MyObject2();
    public double doubleMethod(double);
    public int intMethod(int);
    public char charMethod(char);
    public short shortMethod(short);
    public boolean booleanMethod(boolean);
    public float floatMethod(float);
    public long longMethod(long);
    public byte byteMethod(byte);
}

All the methods are there, just as you'd expect them! Fantastic!!! (Though the ordering is a little peculiar; I think that's because we don't have an ordered method table in our class impl. Does it matter?)
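If a stable order turns out to be wanted, one simple option is to sort by method name before emitting. The sketch below is only an illustration in plain Ruby, not how compiler2 works: the SIGNATURES constant and the declarations helper are hypothetical, and strings stand in for the Java primitive types.

```ruby
# Hypothetical sketch: generate methods at runtime, record their signatures,
# then list them sorted by name so the order is the same on every run.
class MyRubyClass2
  SIGNATURES = {}

  %w[boolean byte short char int long float double].each do |type|
    # Same runtime generation as in the post: one identity method per type.
    eval "def #{type}Method(a); a; end"
    SIGNATURES["#{type}Method"] = { params: [type], returns: type }
  end
end

# Render javap-style declarations, sorted alphabetically by method name.
def declarations(signatures)
  signatures.sort_by { |name, _sig| name }.map do |name, sig|
    "public #{sig[:returns]} #{name}(#{sig[:params].join(', ')});"
  end
end

puts declarations(MyRubyClass2::SIGNATURES)
```

Sorting costs next to nothing here and takes the hash iteration order out of the picture entirely.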

Even better, the above methods are doing the same type coercion on the way in and out that we do for any other Java-based method calling. So your integral numerics are presented to Ruby as Fixnums, floating-point numerics are Floats, and booleans come through as Ruby true or false.

There's certainly more work to be done:
  • There's no support for overloads at the moment, but I'll likely provide a method aliasing facility so you can define multiple Ruby methods and then say which one maps to which overload. And of course, you'll be able to define multiple overloads that go to the same method body if you wish.
  • I also have not wired in varargs, but it will be an easy match to Ruby's restargs. And optional arguments could automatically generate different-arity Java signatures.
  • Annotations will also be trivial to add; it's just a matter of attaching appropriate metadata and having compiler2 emit them. So you'll be able to use JavaEE 5, JUnit4, and any other frameworks that depend on having annotations present.
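
The optional-argument and varargs points can be sketched with plain Ruby reflection. This is only a guess at the shape of the logic, not anything in compiler2: java_shapes and the Greeter class are hypothetical, and it assumes a Ruby recent enough to have Method#parameters.

```ruby
# Hypothetical helper: work out which Java arities a Ruby method could map to,
# and whether a rest argument suggests a varargs signature.
def java_shapes(klass, method_name)
  params = klass.instance_method(method_name).parameters
  required = params.count { |kind, _name| kind == :req }
  optional = params.count { |kind, _name| kind == :opt }
  {
    # One Java signature per allowed number of positional arguments.
    arities: (required..required + optional).to_a,
    # A Ruby *rest parameter is the natural match for Java varargs.
    varargs: params.any? { |kind, _name| kind == :rest }
  }
end

class Greeter
  def greet(name, greeting = "Hello", punctuation = "!")
    "#{greeting}, #{name}#{punctuation}"
  end

  def log(level, *messages)
    messages.map { |m| "[#{level}] #{m}" }
  end
end

p java_shapes(Greeter, :greet)
p java_shapes(Greeter, :log)
```

For greet that yields arities 1 through 3 with no varargs; for log it yields a single arity of 1 plus the varargs flag.
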

Of course this is all checked into JRuby trunk, so feel free to give it a try. Stop by the JRuby mailing lists or IRC if you have questions. And it's all still written in Ruby; signature support bloated the compiler up to a whopping 178 lines of code, most of that for dealing with the JVM opcodes for primitive types.

This is just the beginning!


  1. I think method order is insignificant. You can observe it when iterating over the methods of a class using reflection, but at least for Java the Javadocs explicitly say that you must not assume any specific order.

    This has bitten me once, as IBM's JDK behaves differently from Sun's here.

  2. I think method order is important.

    Only because a compiler should be deterministic. For the same input the output should always be the same. Down to the last byte!

    If not, then the md5sum of your project may be different each time!

    If you are reliant on hashcode to order, then as some strings may get interned, the order can change.

    Also note that in Maven 2.0.10, many of the main uses of HashMap have been replaced with LinkedHashMap to avoid non-deterministic dependency orders.

    Sure it doesn't matter to reflection, and compilation. But why introduce non-determinism when you don't have to?

  3. Your awesomeness knows no bounds :-) Great work. Looks like a very natural way to add the metadata necessary to interface with the Java world.

  4. The "signature" method is great--no new Ruby syntax, and provides all the hinting necessary (and opens up new possibilities as a bonus).

  5. I'm impressed! I didn't think there was any way you could create a Ruby compiler which produced real JVM classes. The downside is that I don't think this could be practically applied to *every* Ruby class in a project, but since an API is generally defined by a few "outer" classes, I don't think that will be a problem.

  6. Ken: Not a bad thought. And of course patches are accepted; but that sounds like a good way to do it, and I'll probably get around to that soon. I'm really hoping more people will have a look at the code, since it's just Ruby and pretty simple to figure out. It's in tool/compiler2.rb in the JRuby repo.

  7. dvae: The reflection ordering doesn't convince me but determinism does. I'll see what I can do to get the methods generating in the same order every time (probably alphabetical).

  8. JoergWMittag: I sympathize, and that's why I've left signature specification intentionally vague. The only requirement that would be set in stone is that there be a way for the compiler to get signature data; how that signature data is attached to the class is up for debate.

    So, for example, someone could take any one of those other type-annotating schemes and tweak them for compiler2. I wouldn't mind at all. The syntax here, with the "signature" method, is just something simple to get the compiler itself working.

  9. Daniel: Yeah, I don't think there's even a need to produce a Java class for every Ruby class in the system, and really you don't need to lock yourself into Java types except where you intend to present an API. Of course, I don't see that there are any limitations to this means of compilation, so in theory you *could* annotate every API in your system. But I doubt that's desirable.

  10. Great work! Your enthusiasm spurs mine too; I can even imagine myself working with Java again, after I got bitter over its blown-uppedness.

    Of course people like me will also start to demand type-inference-assistance further down the road. But given the facility of type annotations that may very well be done by a different project.

  11. Great stuff! Am I able to add signature info later, reopening the class?

    That way, we could conditionally add this information in separate .rb files, only when running on JRuby, to build portable apps.

    I guess it could be done, as the compiler works with the "runtime version" of classes.

  12. Fabio Kung: Yes, you can add the signature info anywhere, any time in your application, so long as it's present for compiler2 to inspect and emit the Java type information. That's what makes it so much nicer than any options that required syntax changes, "special" structured classes, or offline inspection of an AST to get the compiler information.
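
    A rough sketch of that reopening pattern, under a couple of assumptions: the guard on respond_to?(:signature) is only there so the snippet runs harmlessly where compiler2's "signature" method isn't loaded, and in a real project the second part would live in its own .rb file.

```ruby
# Portable part: plain Ruby, no Java types anywhere.
class MyRubyClass
  def helloWorld
    puts "Hello from Ruby"
  end
end

# JRuby-only part: reopen the class and attach the signature info later.
# JRUBY_VERSION is only defined on JRuby; the respond_to? check is an
# assumption of this sketch so it is skipped when `signature` is unavailable.
if defined?(JRUBY_VERSION) && MyRubyClass.respond_to?(:signature)
  class MyRubyClass
    signature :helloWorld, [] => Java::void
  end
end
```

    On any other Ruby the guarded block is skipped and the class behaves as usual.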

  13. I follow you on Twitter regularly but I don't have a Twitter account ;)

  14. This work is fantastic. I've modified compiler2 a bit to load up some gems-in-jars and now I've got Ruby files in my Java web service that are being compiled by the same ant task as everything else.

    What's the work that's required for subclassing, or perhaps I should ask what the strategy is? I'm unsure how one might wrap the Ruby guts to expose a class as anything other than extends RubyObject. I'd love to fiddle around and maybe get onto this work. My dream is to write Wicket in Ruby....

  15. Jonathan: I'd love to have you collaborate on it. I think I'm going to spin this off as a separate project today so others can start to contribute to it, and we'll plan to just release it as a gem.