Sunday, April 30, 2006

More v1 Compilation Experiments

Since it's turned out to be so easy to generate bytecodes based off the Ruby AST, I tackled another trouble spot for us: literal array creation.

The current interpreter, in order to follow a generic, iterative model, has a large overhead for instantiating literal arrays. For example, to create the following array:

['this', 'is', 'an', 'array', 'of', 'strings']

There are at least seven AST nodes to process: the array node and six nodes for the elements (technically, there's more than one node for each element, but we'll call it one for simplicity). The recursive way to process these nodes would be to visit the array node, and then recurse to process each of the elements. However, we're trying to escape recursion, so a different model was necessary.

The current interpreter avoids recursion by maintaining its location in the AST on a stack. As nodes are encountered, they are pushed onto that stack, and their instructions--rather than recursing--expand the subnodes by also pushing them onto the stack. Once the expansion reaches a termination point (such as a literal value or the last element in a block), the expanded nodes are processed one at a time.
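The stack-based traversal described above can be sketched in plain Ruby. This is only an illustration: the Node struct, the :block and :str node types, and the evaluate method are hypothetical stand-ins, not JRuby's actual classes.

```ruby
# Hypothetical AST node: a type tag, an optional literal value, and child nodes.
Node = Struct.new(:type, :value, :children)

# Walk a tree without recursion by keeping pending work on an explicit stack.
# Returns the literal values in the order they are processed.
def evaluate(root)
  stack = [root]
  visited = []
  until stack.empty?
    node = stack.pop
    case node.type
    when :block
      # Expand: push children in reverse so the first child is popped first.
      node.children.reverse_each { |child| stack.push(child) }
    when :str
      # Termination point: a literal has nothing left to expand.
      visited << node.value
    end
  end
  visited
end

tree = Node.new(:block, nil, [
  Node.new(:str, 'a', []),
  Node.new(:block, nil, [Node.new(:str, 'b', []), Node.new(:str, 'c', [])]),
  Node.new(:str, 'd', [])
])
p evaluate(tree)
```

The Ruby call stack stays flat no matter how deeply the blocks nest; all of the depth lives in the stack array.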

The instructions associated with each node perform whatever JRuby operations are required to process that node. They instantiate classes, define methods, assign variables. For this reason, they actually are of the type Instruction internally, and this is where we can plug in the compiler.

Compiler Design Version One: Microcompilation

Because the interpreter simply traverses the AST, retrieving and caching instructions as each node is encountered, we can short-circuit this process by pre-populating the appropriate instruction for a parent node with a compiled instruction based on its children. When the interpreter reaches that node and sees an instruction has been prefetched, it executes that rather than continuing to traverse. As a result, we can selectively compile branches of the AST for targeted speed gains. I call this microcompilation.
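Here is a rough sketch of the idea in plain Ruby, assuming a hypothetical Node class with an instruction slot. A lambda stands in for generated bytecode, and interpret recurses only to keep the sketch short; none of these names come from JRuby.

```ruby
# Hypothetical node that can cache the instruction used to execute it.
class Node
  attr_reader :type, :value, :children
  attr_accessor :instruction

  def initialize(type, value = nil, children = [])
    @type = type
    @value = value
    @children = children
  end
end

# Interpret a node, preferring a prefetched instruction when one is present.
def interpret(node)
  # A prefetched instruction short-circuits traversal of this branch.
  return node.instruction.call if node.instruction

  case node.type
  when :str   then node.value.dup
  when :array then node.children.map { |child| interpret(child) }
  end
end

# "Microcompile" one branch: capture the children's values up front so that
# executing the parent no longer touches the child nodes at all.
def microcompile(node)
  values = node.children.map(&:value)
  node.instruction = -> { values.map(&:dup) }
end

array_node = Node.new(:array, nil, %w[this is an array].map { |s| Node.new(:str, s) })
before = interpret(array_node)   # full traversal
microcompile(array_node)
after = interpret(array_node)    # one instruction, no traversal
p before == after
```

Only the one branch changes; every other node in the tree is still interpreted as before.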

The benefit to this approach is that the compiler can be written a piece at a time, tested incrementally as those pieces come together. It also has the huge benefit of allowing compiled code to run within the existing interpreter engine without any modifications to JRuby.

There is, however, a downside to my current approach. Instead of pushing toward a green-threadable and continuation-able iterative model, this simple compiler will still deepen the stack. However, given that no applications we are currently working to support seem to require continuations or threads, I believe this is a good initial approach. It may also prove true that this simpler compiler is fastest (while perhaps not the most Ruby-compliant), and for many embedded Ruby applications it will continue to be useful into the future.

Processing The Array


So back to our six-element array. In order to handle it iteratively, rather than recursing to process each element, a complicated transformation happens. The instructions for the elements of the array are mixed with instructions for aggregating the results, and another instruction is added at the end to finally build the array. So the sequence of instructions processed, after initially processing the array node, goes something like this:

String Node: 'this'
Aggregate (push down result accumulator)
String Node: 'is'
Aggregate
String Node: 'an'
Aggregate
String Node: 'array'
Aggregate
String Node: 'of'
Aggregate
String Node: 'strings'
Build Array (deaggregates all results and constructs a ruby array)

For more complex elements, this approach also works; if one element requires a method call or a variable lookup, that branch of the AST is processed inline, with the end result placed on the top of the accumulator stack. Processing then continues with the next element.
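As a toy model of that sequence, the following Ruby runs a flattened instruction list for a three-element array. The opcode names and the run method are invented for illustration and are not the interpreter's real instruction set.

```ruby
# Hypothetical instruction stream for ['this', 'is', 'an'], flattened so it
# can be run in a loop instead of by recursing into each element.
instructions = [
  [:string, 'this'], [:aggregate],
  [:string, 'is'],   [:aggregate],
  [:string, 'an'],
  [:build_array, 3]
]

def run(instructions)
  result = nil       # result of the most recent instruction
  accumulator = []   # results set aside by :aggregate
  instructions.each do |op, arg|
    case op
    when :string
      result = arg.dup
    when :aggregate
      accumulator.push(result)
    when :build_array
      # The last element was never aggregated, so it is still in `result`.
      earlier = accumulator.pop(arg - 1)
      result = earlier + [result]
    end
  end
  result
end

p run(instructions)
```

Each element costs two trips through the loop, plus the final build step, which is where the overhead for a simple literal array comes from.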

The added overhead of this approach is not without reason. Yes, it would have been possible to modify the interpreter engine to understand arrays and handle them specially, but my goal originally was to create a generic iterative interpreter (and perhaps to prove that all nodes could be processed in this way). However, with that design mostly proven out, it is now time to address its deficiencies.

Compiling The Array

Compiling the array turns out to be a simple matter, especially considering that the only element type supported by the compiler right now is a literal string. The compiled code instantiates a new IRubyObject[] of the appropriate size, then visits the compilers for each of the elements. The String node compiler I wrote for the "foo" test is reused here, and as each element is encountered bytecodes are generated to construct ruby String objects and insert them into the array. Finally, an operation is added to construct the ruby array instance, and our work is done.
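In Ruby terms, the generated code behaves roughly like the lambda below. This is a sketch under assumptions: compile_string_array is a hypothetical helper, and a Ruby Array with preallocated slots stands in for the IRubyObject[] that the real bytecode allocates.

```ruby
# Hypothetical stand-in for what the array compiler emits. The element count
# is known at compile time, so the slots can be allocated once per call and
# filled directly, with no accumulator involved.
def compile_string_array(literals)
  size = literals.size
  lambda do
    slots = Array.new(size)         # analogous to allocating IRubyObject[size]
    literals.each_with_index do |text, i|
      slots[i] = String.new(text)   # a fresh String per element, per call
    end
    slots                           # stands in for building the Ruby array
  end
end

compiled = compile_string_array(%w[this is an array of strings])
first = compiled.call
second = compiled.call
p first
p first[0].equal?(second[0])
```

As with a literal array in Ruby source, every call produces new String objects, so mutating one result does not affect the next.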

During this round of work, it has also become apparent that although we make heavy use of JRuby internals from the compiled code, large portions of the interpreter itself can go away: for example, with a handy-dandy operand stack available in bytecode mode, we would no longer need to maintain state on a separate result accumulator. The "v1" compiler is turning out to be a fun and easy affair.

The Numbers

The results of these little microcompilations are perhaps anti-climactic...after all this noise I post a handful of numbers for you to look at. Even if the numbers are great, it's a small payoff for listening to my rambling. Sorry! That's how it is!

The test methods take the same form:

def foo_arr; ['this', 'is', 'an', 'array', 'of', 'strings']; end

The foo_arr method is our compiled version, and bar_arr is the uncompiled version. And so after another 1_000_000.times { foo_arr } and { bar_arr } we have:

foo_arr time: 32.628
bar_arr time: 93.31700000000001

This equates to a roughly 65% reduction in run time (about 2.9 times faster) for what is, again, a fairly simple operation even in the current interpreter.

Coming up in future posts: more v1 microcompilation and numbers, v2 and v3 compiler designs, and more...

Friday, April 28, 2006

In The Beginning: Early Returns on JRuby Compilation

I'll put a big fat disclaimer here stating that these results are extremely preliminary, and mean practically nothing. I'm just very tired and very excited that after seven hours of work, I've already made some progress...I sorta, kinda compiled some Ruby code to Java bytecodes. Yes, I've been up all night. Yes, I have to work tomorrow. Yes...I wish I could work on this during the day. C'est la vie!

The Contenders

In order to soften any growing excitement you may be feeling, I must first introduce the scripts in question. The script to be compiled is quite a whopper:

def foo; 'bar'; end

Hey now, everything has to have a beginning. Even the above script is not the whole story; the only code actually being compiled are the four AST nodes that make up the method's body: a Newline (to start the line), a DStr (dynamic string), another DStr, and a Str (the actual string), all nested one inside the other.

Now as anti-climactic as that may sound, this is early proof-of-concept work here...I don't intend to write a compiler for all of Ruby at the same time. This was the smallest non-trivial script I thought I could handle in a first attempt. I do not compile the nodes that define the foo method, nor do I compile any dispatches to the foo method. I just compile the body.

In order to have an uncompiled version, we have another identical method:

def bar; 'baz'; end

The bar method functions exactly as the foo method, but we will not compile this one. It will be the control.

Now since only a portion of the AST nodes are being compiled, it seemed appropriate to measure a method that does nothing, so that the cost of dispatching could be isolated, further narrowing the test. For that, we have baz:

def baz; end

By running the same tests against baz, we can get a better idea how much compilation is helping. There are also other interpreter bits and pieces that will skew the results a bit, but early results are early results; we want to know whether it will be worth the effort.
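The shape of that measurement can be sketched with Ruby's standard Benchmark module. Under a stock Ruby neither method is compiled, so the sketch only shows how the baz baseline is subtracted out; time_calls is a hypothetical helper, and the iteration count is reduced from the one million used in this post.

```ruby
require 'benchmark'

def foo; 'bar'; end
def bar; 'baz'; end
def baz; end

# Time n calls of the named method, in seconds.
def time_calls(name, n)
  Benchmark.realtime { n.times { send(name) } }
end

n = 100_000  # smaller than the post's one million, to keep the sketch quick

dispatch = time_calls(:baz, n)  # empty method: dispatch cost only
%i[foo bar].each do |name|
  total = time_calls(name, n)
  puts format('%s: %.4fs total, %.4fs excluding dispatch', name, total, total - dispatch)
end
```

Going through send adds a little overhead of its own, but it is paid equally by all three methods and so largely cancels out in the subtraction.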

The Tools

For this round, I chose to use ObjectWeb's ASM bytecode generation library. It seemed best suited to the visitor pattern we use to traverse the AST (since it follows the visitor pattern itself), and it's itty bitty.

Now for the fun part: I wrote the compiler in Ruby using JRuby's Java integration support. Some time ago, Tom added the ability to parse a string and get back a reference to the AST within a Ruby script under JRuby. Since we already have visitor interfaces for the interpreter, and since JRuby does such a bangup job of implementing interfaces, I figured I'd do the whole thing in Ruby. Naturally, if we continue down this route, we would hope to eventually compile the compiler and add "self-hosting" to our list of buzzwords. However, I digress.

The basic model was simple: parse foo, retrieve the body node from the AST, convert the body to bytecodes, and replace the body with the newly-compiled instruction. This allows our interpreter to continue doing its normal interpretation, but execute this one method's body natively. During my interpreter redesign last Fall we hoped to make this legerdemain invisible...and it seems we've succeeded thus far.

The foo method's body was transformed into a class with one method, execute, implementing the same Instruction interface we use in the interpreter. The difference, of course, is that we would now have one bytecode-based instruction rather than four.
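A loose Ruby analogy for that swap follows. All of the class names here are hypothetical, and the only thing borrowed from the description above is the single execute method; the real Instruction interface's signature is not shown in this post.

```ruby
# Hypothetical interpreter nodes: each wrapper just delegates to the node
# nested inside it, the way Newline and DStr wrap the final Str.
class StrNode
  def initialize(value); @value = value; end
  def execute; @value.dup; end
end

class WrapperNode
  def initialize(inner); @inner = inner; end
  def execute; @inner.execute; end
end

# Hypothetical compiled replacement: one object, one execute call.
class CompiledBody
  def initialize(value); @value = value; end
  def execute; @value.dup; end
end

# A method definition only needs its body to respond to execute.
MethodDef = Struct.new(:name, :body) do
  def call; body.execute; end
end

four_nodes = WrapperNode.new(WrapperNode.new(WrapperNode.new(StrNode.new('bar'))))
foo = MethodDef.new(:foo, four_nodes)
before = foo.call                    # four execute calls, one per node
foo.body = CompiledBody.new('bar')   # swap in the single compiled instruction
after = foo.call                     # one execute call
p before == after
```

Because the caller only ever sees execute, the rest of the interpreter does not need to know whether a body is interpreted or compiled.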

It's also worth pointing out that this model does not turn a Ruby class or a Ruby method into an accessible Java class or method. It turns snippets of Ruby code into tidbits of Java code...and that Java code continues to live as part of the interpreter. This is the most digestible model of compilation for Ruby, and has been followed by most other projects of merit. We may be able to take things further in the future (especially with the potential dynamifying of the JVM), but this level of compilation has massive potential right now.

The Test

Given that the scripts are so simple, the test should be equally simple:

1_000_000.times { foo }

...and equivalent tests for bar and baz. These tests are parsed, processed, and timed in the exact same way (other than the compilation step):

require 'java'
require 'jruby'
include_class "org.jruby.Ruby"

my_ruby = Ruby.default_instance

foocode = JRuby.parse("def foo; 'bar'; end")
#compilation done here
footest = JRuby.parse("1_000_000.times {foo}")

my_ruby.eval(foocode)
t = Time.now
my_ruby.eval(footest)
p Time.now - t


And again, equivalent tests for bar and baz. That's all folks! Have a nice weekend!

...

Ok, ok. You'd like to know what the times are. That's fair. I've led you this far, and you'd like some payoff for reading my technojargon gobbledygook.

Naturally I wouldn't be writing this if the results were poor.

The Results

Now did I mention these are very early, very preliminary numbers? Don't go off half-cocked talking about JRuby's new compiler, ok?

The bare method baz demonstrated that a large amount of time is spent dispatching in JRuby; perhaps an inordinate amount of time. We have plans to correct this, and have a few optimizations being tossed about.

The baz method took 3.8s to invoke one million times on this machine (Opteron 2.6GHz, Gentoo Linux 2.6.15). That may not seem bad, but of course it wasn't doing anything. Also, C Ruby does the same million dispatches in about 3 tenths of a second. An order of magnitude worse, we are.

The bar method, which represents the uncompiled foo, clocked in at 12.5s to complete a million calls. Now you start to see the real cost from traversing all those AST nodes. Adding only four nodes to the mix (and the side effects that result, of course) more than tripled the amount of time. Minus the method invocations, the body took roughly 8.7s, about 70% of the total run.

As you might expect, foo did better than bar. One million invocations of foo, our bar-with-a-compiled-body, took only 6.5s, almost 50% faster than bar. Subtract the method invocation hit and we're looking at around 2.7 seconds, a nearly 70% improvement over bar.

baz: 3.8 seconds
bar: 12.5 seconds, or 8.7 seconds excluding method invocation
foo: 6.5 seconds, or 2.7 seconds excluding method invocation
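The subtraction behind those figures is easy to reproduce. The three timings are the ones reported above; the variable names are just for illustration.

```ruby
# Timings reported above, in seconds per one million calls.
baz = 3.8   # empty method: dispatch cost only
bar = 12.5  # interpreted body
foo = 6.5   # compiled body

bar_body = (bar - baz).round(1)               # interpreted body alone
foo_body = (foo - baz).round(1)               # compiled body alone
body_share = (bar_body / bar * 100).round     # body's share of bar's run
total_gain = ((bar - foo) / bar * 100).round  # whole-call improvement

puts "bar body: #{bar_body}s (#{body_share}% of bar's run)"
puts "foo body: #{foo_body}s"
puts "foo vs bar, whole call: #{total_gain}% less time"
```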

So there you have it.

If we only get a 60% improvement I would be very pleased. However, given that Newline, DStr, and Str nodes are some of the least interpreter- and processor-intensive nodes in the AST, I'm certain we'll do far better.

Thursday, April 27, 2006

JRuby Compiler Will Happen

Things have really been clicking along on JRuby. Tom has been knocking down Rails unit tests left and right, and more and more stuff is working. I found a nice little performance booster that gives us as much as 30-40% speed increase when starting up. Other folks have been working on updated YAML parsers, ActiveRecord-to-JDBC connectors, and in general helping us test JRuby in more and more scenarios. The future looks pretty good.

However...

I am still far from satisfied with the performance of JRuby. Even with recent cleanup and minor optimizations, a basic CGI-servlet Rails request takes a second or two to process. Much, much longer than it should have to, especially considering we're not hitting a persistent store yet. With JRuby moving ever closer to its goal of C Ruby 1.8 compatibility, why not start looking at those performance issues in more depth?

First Time for Everything

I will admit, I've never written a compiler. Heck, until JRuby, I'd never worked on a language interpreter, much less written one. Of course, a lot has changed in a year, and I'm now one of the two people in the world who know the most about JRuby's internals. It has been quite a lesson in interpreter, language, and VM design...a lesson I've wanted to learn for the better part of my life.

I believe it is time for us to start exploring options for compiling JRuby to intermediate "JRuby bytecode" or to Java bytecode in many cases. Other dyn-typed languages for the JVM continue to hold their compilers up as a reason to choose something other than JRuby. I think JRuby's time has come.

I have been studying Ruby's AST, JRuby's implementation of its interpreter and core libraries, and the C Ruby code, and I am now confident we can do some level of compilation in the short term. The redesign of the core interpreter I committed last Fall has opened the door for more flexible traversal of the AST, including the ability to partially compile some scripts.

And So It Begins...

With Rails, Spring, Swing, IRB, and our other demos for JavaOne basically ready to go (ahead of schedule!), I will be spending the next week or two researching and exploring options for a preliminary Ruby compiler. I believe I can have something by JavaOne; something primitive, sure, but perhaps something that can provide numbers to go on. Stay tuned!

Wednesday, April 19, 2006

JRuby on Rails

After seeing David Heinemeier Hansson ask the "Groovy on Rails" project not to use that moniker (since it isn't Ruby, it isn't Rails, and it has a completely separate codebase), I figured I'd play it safe and ask him directly whether "JRuby on Rails" would be acceptable. We intend to run Rails unmodified, and we're trying to be as "Ruby" as possible, so it seemed to me that since JRuby on Rails is the same under the covers, the extra "J" wasn't that big a deal. David agreed:

As long as Rails remains virtually unaltered and fully functional, I suppose JRuby on Rails isn't too bad. I'm a tad protective of the term Ruby on Rails, though, and I want to make sure that there's no confusion about what it refers to. But I can't immediately see any confusion points here.

So the JRuby on Rails name should stick, and we have cautious approval from the Rails creator himself to use it. Huzzah!

Eclipse UnsatisfiedLinkError: libswt-pi-gtk-3139

I ran into this a bit ago and found no solutions online. I figured it was my duty to provide at least one posting on how I solved it, since there seem to be a lot of frustrated people out there. My eventual solution was to run with 1.5 instead of 1.4.

Upon starting Eclipse 3.1.1 or higher (or other SWT-based apps, it seems) under AMD64-based Linux, many people have reported the following error:

!ENTRY org.eclipse.osgi 2006-04-08 13:53:48.407
!MESSAGE Application error
!STACK 1
java.lang.UnsatisfiedLinkError: /usr/local/src/eclipse/configuration/org.eclipse.osgi/bundles/85/1/.cp/libswt-pi-gtk-3139.so: /usr/local/src/eclipse/configuration/org.eclipse.osgi/bundles/85/1/.cp/libswt-pi-gtk-3139.so: cannot open shared object file: No such file or directory
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1586)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1495)
at java.lang.Runtime.loadLibrary0(Runtime.java:788)

...etc

The .so in question is present, but it seems that when running under Sun's JDK 1.4.2, which does not provide an AMD64 version, this library can't be successfully loaded. I'm not saying this is a Java issue or an Eclipse issue, but I resolved it by running Eclipse under Sun's JDK 1.5 built for AMD64.

Hopefully this helps anyone else out there searching for a solution.

Tuesday, April 18, 2006

A Personal Blog

I've decided to add a personal blog, since many other tech writers do the same (and it seemed like a good idea).

Pop over to Headius@Home.

Wednesday, April 12, 2006

The Beginning of JRuby on Rails

It must be something of a debate in the blogosphere as to whether titles should be descriptive, possibly giving away a great secret hidden within an entry's text, or whether titles should only hint at the truth, enticing a curious and diligent reader to venture onward. The former perhaps produces a better tagline for newsfeeds, where a less-descriptive title may not be as attention grabbing. The latter is certainly more suspenseful, allowing a careful author to draw readers toward a mounting climax. I prefer the former, and so this entry's title once again gives away the Big Secret:

Today, Rails handled a simple request under JRuby.

Now perhaps this monumental event deserves uproarious fanfare, and perhaps it does not. I will let you be the judge.

We really WERE close!

For several weeks now, Tom and I have been saying back and forth to each other that we thought Rails was very close to working. Developer optimism, perhaps, but we've poured a lot of effort into fixing the last interpreter issues preventing Rails from running successfully. In the process, we've accomplished many peripheral goals like getting IRB running, improving RubyGems support, repairing untold interpreter and core class bugs, and getting a round education in the internals of Rails. I've personally traced through more Rails code than I ever hoped to, and committed patches upon patches to achieve success. Tom has also been cranking out fixes, and our growing community of contributors has sent some amazing enhancements our way. In short, there's real momentum behind JRuby right now, and we had a feeling Rails was almost there.

The last time I really burned the midnight oil was probably over a week ago. With SourceForge CVS down for an entire weekend, family business to attend to, and a pre-JavaOne presentation at the Minnesota Object Technology Users Group's Java Special Interest Group (whew!), there hasn't been a lot of time for late-night coding jags. That changed today.

With all other distractions behind me (the Java SIG presentation was last night), I set out today to finally accomplish our biggest goal:

I resolved that by night's end, Rails on JRuby would handle a request.

Expecting a long night, I put on some music (somafm.com is excellent...listen and donate), grabbed a few snacks and beverages, and dug in.

How did I do it?

Now I know that should probably say how did "we" do it. Tom has obviously put in a substantial amount of time on JRuby, and our contributors deserve their props. My intent here is just to describe the final steps leading up to Rails working, to allow you to better judge our success and to document steps necessary for others to replicate my work.

Step 1: Apply all outstanding patches

Over the past several weeks, a number of core bugs have been resolved. In many cases, we had not yet committed those changes to CVS, preferring to review and clean them up. I committed several such fixes this evening:

- A patch to allow Kernel#system and backquote calls to stay in-process if executing a Ruby script. Previously, this caused a new interpreter to be launched (in a new JVM), which was obviously a bad thing for a Java app to do. This fix is hopefully a temporary modification, pending a better solution in RCR 328.
- A rejiggering of the Main and CommandlineParser classes in JRuby to better allow Multi-VM support in the future. These two classes originally used System.x input and output streams and called System.exit on error. For an interpreter that's intended to run in controlled environments, these obviously had to be fixed. The new versions won't kill off a VM and will only use the streams provided to them.
- A small fix for String#split to allow for splitting on ? characters (kinda important for URLs, no?). I have an improved fix for split to be committed tomorrow.
- A new ENV implementation that takes advantage of Java5's new System.getenv method, and falls back to other hacks on earlier Java versions. Originally, without any way to retrieve env vars, JRuby had an empty ENV hash. This fix was necessary in order to run Rails in my preferred test configuration. (Can you guess why?)
- Process#times was not implemented (because it can't be supported under Java). I added it to return a Tms with all zeros.
- File#flock was not implemented. I added it, using java.nio file locking support.
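For reference, these are the standard Ruby behaviors several of those patches target, exercised here under a stock Ruby. The request path and the temp file name are made up for the example.

```ruby
require 'tempfile'

# String#split on a literal "?", as needed to separate a path from its query.
path, query = '/rails_info/properties?format=text'.split('?')
puts "path=#{path} query=#{query}"

# ENV lookups return nil for unset variables rather than raising.
puts "JRUBY_HOME set? #{!ENV['JRUBY_HOME'].nil?}"

# Process.times returns a struct of CPU times. The JRuby patch described
# above returns the same shape with every field set to zero.
times = Process.times
puts "user CPU time: #{times.utime}"

# File#flock takes a lock constant and returns 0 on success.
Tempfile.create('flock-demo') do |f|
  status = f.flock(File::LOCK_EX)
  f.flock(File::LOCK_UN)
  puts "flock returned #{status}"
end
```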

A few of these fixes came out of my playing with Rails this evening, and I stopped when Rails started working.

Step 2: Set up Rails in the most traditional way

Rails is a very CGI-style web framework. In its simplest form, it is a basic CGI script, and all hacks to improve its performance build around that idea. Given that Rails wanted to be CGI, it made sense to set it up that way...with JRuby executing it.

dispatch.cgi is the main CGI script for Rails. There's also a .fcgi version for FastCGI. Eventually, JRuby will provide a "CGI servlet" of some kind to wrap Rails requests, as well. For now, dispatch.cgi was the order of the day.


#!C:/jrubywork/jruby/bin/jruby.bat

require File.dirname(__FILE__) + "/../config/environment" unless defined?(RAILS_ROOT)

# If you're using RubyGems and mod_ruby, this require should be changed to an absolute path one, like:
# "/usr/local/lib/ruby/gems/1.8/gems/rails-0.8.0/lib/dispatcher" -- otherwise performance is severely impaired
require "dispatcher"

ADDITIONAL_LOAD_PATHS.reverse.each { |dir| $:.unshift(dir) if File.directory?(dir) } if defined?(Apache::RubyRun)
Dispatcher.dispatch



Yes, running a Java interpreter on each CGI hit is gross. No, I don't expect anyone to run like this in production. It was simply the easiest way to get this working. Now we can go forward.
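For anyone unfamiliar with the CGI model: the web server starts the script once per request, passes the request details in environment variables such as REQUEST_METHOD and PATH_INFO, and reads the response from the script's standard output. A minimal sketch, with handle_cgi as a hypothetical helper that has nothing to do with Rails' own dispatcher:

```ruby
require 'stringio'

# Hypothetical handler: read the request from an environment hash and write
# headers, a blank line, then the body to the given output stream.
def handle_cgi(env, out)
  verb = env.fetch('REQUEST_METHOD', 'GET')
  path = env.fetch('PATH_INFO', '/')
  out.print "Content-Type: text/plain\r\n\r\n"
  out.print "#{verb} #{path}\n"
end

# A real CGI script would call handle_cgi(ENV, $stdout). A StringIO and a
# plain Hash are used here so the sketch runs without a web server.
out = StringIO.new
handle_cgi({ 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/rails_info/properties' }, out)
print out.string
```

Since each request pays for a fresh process, running JRuby this way also pays JVM startup on every hit.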

Step 3: Configure Apache

I'll admit, working in application servers from dawn to dusk I don't have to tweak Apache configs much. This config probably isn't the most spectacular thing in the world, but it does what it needs to. The biggest change from a standard Rails config is the env vars provided for JRuby.


DocumentRoot "C:/rails-1.0.0/public"

<Directory "C:/rails-1.0.0/public">
SetEnv JRUBY_HOME C:\\jrubywork\\jruby
SetEnv JAVA_HOME C:\\j2sdk1.5.0_06
Options Indexes ExecCGI FollowSymLinks
AddHandler cgi-script .cgi
AllowOverride all
Allow from all
</Directory>



Step 4: Turn off what doesn't work

Ok, you knew there had to be a catch. Since there's still a lot of work to be done on Rails, there was at least one area I discovered that wasn't going to work correctly today. To save myself staying up all night, I disabled session support; with it enabled, the script got stuck in a neverending loop somewhere I didn't feel like investigating. For the sake of this test, I added the following line to rails_info_controller.rb:


module Controllers #:nodoc:
  class RailsInfoController < ApplicationController
    session :disabled => true



We'll circle around to session management and get it working, so don't fret.

Step 5: Finally, test out our very narrow, very basic request

As you've probably gathered, the request I got Rails to handle was a simple info dump. Calling /rails_info/properties on a standard Rails install just dumps some version numbers and path information. It does, however, exercise the full Rails request handling and dispatching mechanism. Having it working is a big deal...it means that Rails is actually able to handle requests and display a result.

(As an aside, during the final stages of debugging I also saw the default error page come up, fully rendered and containing stack traces and error info...so even that's pretty cool.)

On my system, /rails_info/properties outputs:

Ruby version: 1.8.2 (java)
Rails version: 1.0.0
Active Record version: 1.13.2
Action Pack version: 1.11.2
Action Web Service version: 1.0.0
Action Mailer version: 1.1.5
Active Support version: 1.2.5
Application root: C:/rails-1.0.0/public/../config/..
Environment: development
Database adapter: mysql

The (java) up there represents the first time you or anyone else has seen Rails running in a JVM. Mark this day on your calendar.

Where to now?

It may or may not be safe to say that Rails "runs" on JRuby. There are obviously a number of other subsystems to get working, and without sessions a web app would be pretty dumb. Saying that Rails works, except X and except Y, basically means it doesn't work without a whole bunch of asterisks--but handling an end-to-end request of any kind represents a major milestone.

What would be safe is to say that this represents the birth of Rails on JRuby. This is the first time a request has successfully been handled by Rails in a JVM. The next steps are obviously to get all normal use cases working, get the rest of the Action Pack functioning, and as always, speed JRuby up to be a viable deployment option.

To all those out there who have supported us and believed in us, we on the JRuby team give our thanks. This milestone would not have been possible without you.

To all those out there interested in Ruby, Rails, JRuby, or any combination of the three, I say this: You ain't seen nothing yet.