Headius: May 2008

Wednesday, May 21, 2008

JRuby Pre-RailsConf Hackfest on Thursday

Hey there JRubyists and JRubyists-to-be...if you're planning to be in Portland the evening of Thursday, May 29, there's going to be a sponsored JRuby hackfest for you!

The event is going to be at the McMenamins Brewpub/Restaurant at the Kennedy School. We'll have a room set aside for around 40ish people, so please RSVP via email or comments. There will be a taco bar and beers/beverages provided!

JRuby Hackfest!
Thursday, May 29, 6:30PM - whenever
McMenamins at the Kennedy School
5736 N.E. 33rd Ave.
Portland, OR 97211

We're pushing this event as a real hackfest, so bring your laptops and apps you are running or want to run on JRuby. At least Tom Enebo, Nick Sieger, Ola Bini and I will be there, so we'll probably be able to get you up and running. Otherwise, if you don't have an app, stop by and we'll give you a walkthrough of JRuby and maybe there will be a bug or piece of code you can help out with.

I want to thank LinkedIn for initiating this event...Tom and I and the other JRubyists are too busy to set these sorts of things up, so it's great to have community members take this initiative. LinkedIn is also sponsoring, along with Joyent and Sun Microsystems.

Remember, please RSVP in comments or by email so we can get an idea of headcount. And if you have any suggestions for things you'd like to see or hear about or work on at the 'fest, let us know!

Monday, May 19, 2008

JRuby on Rails Fighting Infectious Disease

A new JRuby on Rails venture was just publicly announced. It's a collaboration between Collaborative Software Initiative and the State of Utah:

Portland, Ore., May 19, 2008 - Collaborative Software Initiative (CSI), the company that brings like-minded organizations together to work on collaborative software at a fraction of the cost, today announced the release of the first open source, web-based infectious disease reporting and management system.

The application is basically a system for reporting, investigating, and managing outbreaks of communicable disease. So if some kid at a local school contracts bacterial meningitis, this is the sort of system that would record the event and track related cases or contacts with that kid. Seems like a great application for JRuby on Rails, and a potential to see wide use.

Technical details are still a little sparse on the announcement and on the project site, but it's JRuby on Rails based, and "friend of JRuby" Mike Herrick is quoted in the article saying that they "look forward to rolling this out and talking to other states about how to implement it and improve the health and safety of their citizens."

Mike has promised me more information, but from our discussions with him he's very happy using JRuby on Rails. I think he's going to be at RailsConf next week, so if you're interested in talking to him you might drop an email his way.

It seems like JRuby is picking up speed.

Sunday, May 18, 2008

"Ask the Experts" Session: NetBeans 6 Ruby Support

Sun is running an "Ask the Experts" session on Ruby/JRuby support in NetBeans 6(.1) with myself, Tor Norbye, and Brian Leonard. If you've had questions about NetBeans Ruby support you'd like answered, but haven't had a chance to ask...here's your opportunity.

The main page is here: Ask the Experts

Fire away! You have all this week to get your questions in.

Saturday, May 17, 2008

The Road to Babel

It's Saturday and I'm going to be waxing poetic again. This time, it's stream-of-consciousness mind-dumping. Enjoy, or not.

The Game of Life

Every so often you have to take a step back from the daily grind and consider your position in the world. I've done this a few times during my career.

After a "checking in" manager discussion at the University of Minnesota in 1997, my conclusion was that I had been grossly undervalued in my last review. The solution was to strike out on my own, finally, for a job in the private sector. So I fell into my first post-University job earning twice what I got as a full-time member of the U of M Web Team.

I think I stuck around there for about 1.5 years. It wasn't a bad gig, but for whatever reason I really wanted to move up. When people made bad decisions, I wanted to be able to veto them or force a reevaluation rather than sitting by as projects failed. At the time, it seemed like management was the only path I could take, so I took some of the requisite courses, involved myself in more meetings and discussions, and generally tried to unofficially steer various projects in directions I thought would be more successful. Finally, I expressed my interest in moving into the management chain. "We're sure you'd be entirely capable, but we don't think it would be good for the team if you were leading folks who had been here much longer." Great. Age discrimination. So what if I was 23...I was fucking right, and I could have handled the job. Time to jump ship.

Thus began my dark foray into consulting. I hooked up with a local firm, consulting at a large Manufacturing and Mining company in Minnesota. Talk about stooges. These guys didn't know which way was up. They had some gigantic CORBA-based back-end for their product "store" they wanted to wire up to an ATG Dynamo front-end (which I *still* get job opportunities for). It was slow. They couldn't figure out why. Oh, is it a problem that all our web services, which are based on one end of the city, need to call back to a CORBA server on the other end of the city for every damn field? Nah, it couldn't be that. I think we need more hardware. Probably about a month before my contract ended, I got tired of arguing the case. BAIL.

At this point, I think I just got tired of trying. I landed a comfy job as some sort of Java EE architect, and never actually did any real work. I led a team of two, provided some basic architectural guidance (which was probably wrong, I was totally disinterested now), and things moved along reasonably well. Unfortunately, instead of hooking my wagon to a gravy train, I ended up with the Donner Party. About nine months after coming on board, the company crumbled under a high-level management embezzling scam, and the remaining employees started to cannibalize one another. I was laid off.

This was a rough time for everyone. It was 2001, and the shit was starting to hit the fan. Of course "me and my friends" all saw this coming...the nonsense prices people were paying for bullshit companies, the SuperBowl spots for web sites we'd never heard of and had no intention of using...all of it added up to a big fat crash. Not a great time to fall out of a job. Luckily, I had some old friends pulling for me.

Revelation

Enter Kelly Nawrocke. Kelly and I had worked for a while at the University, on the Web Team. Kelly was never really devoted to the position...while I was desperately fiddling with the latest stupid AWT GUI they'd tasked me to write for no apparent reason, Kelly would be toying with a digital theremin, echoing sci-fi whines and whistles all the way down the hall. I sometimes hated his flippancy, but I sure wouldn't have shared an office with anyone else. Anyway, Kelly and I had been out of touch for a few years when I started my Java EE architect job. While working there, I started studying Chinese "just because" and ran into Kelly on the U of M campus. After describing the work I was doing, Kelly replied with five words that have weighed on my conscience ever since, and are probably responsible for where I am now:

"I expected better from you."

Of course he said it with a smile. He was joking around, probably poking fun at my nonchalance about that EE job. But shit, he was right. What the hell was I doing with my life? Is this what I wanted to do?

So the bottom fell out and I needed a job. Kelly came to the rescue, getting me an interview and then vouching for me at another local shop. This time, it was a spin off of another Minnesota chemical company, building Java-based (but not necessarily EE-based) storefronts for both the original company and new ones interested in the same service. And this was probably a couple months before I would have had to bail on a mortgage and start over. Kelly really came to the rescue.

The new place paid pretty well, but it was totally doomed. I got there well after the happy times, when execs would fly off to Rome for a business lunch with prospects and "hired gun" consultants would organize daily poker games while they waited for work. When I arrived, we were heads-down working on stuff, sometimes late into the evening.

I think this job really started to sculpt where I would go in the future. I started out as just another "senior software engineer" in a company full of "senior software engineers". I helped build out a few apps, did a whole crapload of catch-up learning on the latest web and enterprise frameworks, and generally found the whole experience really enlightening. Sure, it wasn't the most compelling technology, but it was a fair day's work and a great team to hang out with. So I was reasonably happy.

A Return to Creative Exploration

At some point, they decided to look into a new suite of software from TIBCO. The managers-of-the-day had decided that it was foolish to always be building our own toolchain, workflows, portals, and messaging tiers and mandated that we start using TIBCO's suite of COTS products instead. A couple folks started to play with the workflow engine. Others got to look at the messaging tier. Me, I got the portal. Yick.

The portal, at the time, was a badly stitched-together collection of damn near every open source Java project. There was Apache stuff in there, Sun stuff, arbitrary third-party projects...a whole mess of them. Several were not properly attributed...and none of them came with source. I was forced to actually unpack and decompile pieces just to figure out what version of Apache Commons it was running or which version of Velocity they had included. It was a bloody nightmare, but actually a hell of a fun puzzle to put together.

The end result was that we had the ability to run the portal on servers of our choosing, largely because I did the late-night homework to figure out what it actually used. So we largely disassembled the TIBCO suite and put it back together in a way that actually made sense for our app. It worked.

But remember I said this company was doomed. No amount of portal-wrangling was going to save it, so eventually it got absorbed back into its parent company, several million dollars poorer. Most of the staff were laid off or left, but those of us "real employees" got introduced back into the matrix. And I was in corporate stooge hell once again. Sure, I stuck around for a bit...I'm not one to jump ship at the drop of a hat. But a few quarters later, after no bonuses, no pay increases, and our numbers slowly dwindling under increasing demands...it was time to bail again.

Never Give Up Hope

Oddly enough, it was my old consulting friends that bailed me out. They had a gig with a government services consulting operation out in northern Virginia. I probably wouldn't have considered it if it were just a straight-up hourly consulting gig. But this one was contract-to-hire, with six months commuting to the DC area. I was intrigued, and signed on.

Here I found a brand new project. I arrived at perhaps the perfect time (or as perfect as it can get). They had just spent a couple *years* assembling a set of requirements documents for a gigantic new project, a piece of software intended to replace the key mainframe-based database that powered a multi-BILLION-dollar US government program. And the guv had just recently signed off on the requirements and funded three years of development. Basically a clean slate as far as the software went.

I was just hired on as an extra hand, but it quickly became apparent they needed a lot more than that. The design was flimsy at best...the folks in charge had a general idea of the technology needed to make it work, but they'd obviously never done this before. What's worse, many of them didn't seem to have the analytical skills necessary to make this project succeed. So I was presented with a choice. I could either toe the line and let those in charge make bad decisions...or I could stand up and say "you know what, you're wrong...and I'm going to prove it." I chose the latter.

Let's fast-forward about a year. My tour of duty in Virginia had long since been over, and I was now working as the on-site lead at the Fed's office in Minneapolis, Minnesota. I was back home, I was busing to work, and I was in charge of the leading edge of the app, where we were just a few months away from going to production. I had basically taken over the entire architect role by now. The original build and deployment process were irreparably broken; I rewrote them in a 36-hour, three-day epic battle with Ant. The SCM workflow was useless, focusing on tagging what was currently in a given environment and moving those tags around; I replaced it with version numbers, release schedules, and tags and branches for actual project snapshots. And the production environment was a total wreck. I worked with the highly-competent guv IT staff to produce a ground-up automated setup and deployment script; they could now take a machine from bare server to fully-operational production box in about 20 minutes, including EE server installs, EE server and Apache configuration, and application staging and deployment. All by running one command.

We went live in October of 2004. A multi-million-dollar, 750kloc, fully "Java EE" project, deployed on time, under budget, and without any production snags. Yes, my friends, it can be done. Yes, you did it wrong.

Given that success, I think it was reasonable that I stuck around for another couple years. I became the lead architect for the project, guiding additional feature work, spinning off a subset of the application into a framework for future projects, and generally acting as part of the guv's team, rather than as an outside consultant. Things were pretty good...I had the run of the show.

Never Get Comfortable

Why did I leave?

"I expected better from you."

It was too comfortable. I started to realize that I was getting soft. I stopped tracking the latest web and persistence frameworks. I stopped caring about the fate of the project, since we'd largely automated it into a self-sufficient dynamo. I got tired of vetoing decisions...tired of calling idiots idiots and dealing with the fallout. I got 5/5 on my reviews in all but one area: interpersonal relationships...I made one dude cry because he restarted the UAT environment without asking me, and I had to explain to the guv why their server was down. It wasn't fun anymore.

Enter JRuby.

Find Something You Love

I started working casually on JRuby in about 2004, shortly after that big production release. I'd arranged to make a trip out to the home office in Tyson's Corner, VA, so I could attend RubyConf 2004 in Reston. It was Kelly Nawrocke who had turned me on to Ruby, even though I'd never actually looked at the language or written anything in it. Kelly was working in New York City, and promised to join me at the conference. While there, I saw David Heinemeier Hansson present Rails publicly for the first time, watched Koichi Sasada unveil his work on YARV, and met a number of already-old-school Rubyists who thought they'd found the answer to all life's problems. But I wasn't really stricken by any of them. What struck me most was that without even knowing Ruby, I could look at their code and their slides and actually understand what was going on. I was hooked.

Naturally, being stuck on the then-crufty Java platform, I needed an in. So I started poking around to see if there might be a Ruby implementation on the JVM. It turned out JRuby had already existed for a couple years and my old U of MN Web Team coworker Tom Enebo was the current project lead.

Isn't it funny how fate works sometimes? I end up in a series of progressively more-challenging jobs because of a well-respected friend's offhand comment. I stumble upon JRuby where another respected coworker is lead developer. Weird.

Anyway...like I say, I only casually contributed to JRuby for about a year, but around the summer of 2005 I really started to take an interest. I think I'd reached a tipping point, where I understood enough of JRuby to make larger changes, and I'd become bored enough with my daily job to start working on night projects. I rejiggered the interpreter, hoping to move it toward a stackless design. And what do you know, it worked. I went to RubyConf 2005 and presented JRuby for the first time, running "fib" recursively to 100_000 with ease and showing how basic functions all worked great in JRuby. It was a lot of fun. And damn was that interpreter slow. But it really planted the seed in me.

Hell Breaks Loose

Over the next year, the shit hit the fan. Around January 2006, we got IRB working. About a month later, apps like Rake and RubyGems started to come online. Ola Bini came on the project around then and helped get them fully working. Then we heard from Tim Bray that he could spare a few Sun machines for us to work on, and if we got Rails running by JavaOne there could be bigger things in store. We did...and were hired four months later by Sun to work on JRuby full time. Since then we've had JRuby 1.0 and 1.1 releases, numerous production JRuby on Rails deployments, a new compiler and "generally better" performance than other impls. And today, we're one of the best options for deploying Rails, with ever-improving performance, native threading, and a great software stack backing us up.

Now it's time to take a step back and examine where we really are. JRuby, by most accounts, is fighting a war on two fronts.

Fighting on Two Fronts

On the JVM side, there's a renewed interest in languages. Where two years ago JRuby was really the only big language story (or at least the only one getting press), now both Groovy and Scala are talking about how they're "just as good" or "better than" Ruby. And in many cases, they're right...Groovy still integrates better with Java and Scala mops the floor with both JRuby and Groovy in the performance arena. So no matter what success we've had with JRuby, there's a constant arms race going on. We need to demonstrate features to compare with Scala (or evangelism to show that Ruby still has an edge). We'll trade performance back-and-forth with Groovy (who now have really decent performance in 1.6...many kudos to them). And Jython is coming online, already performing better than C Python with many, many enhancements yet to be made. So JRuby's got a tough fight on the JVM to remain one of the lead competitors. Or is it just a part of a big happy family?

On the Ruby side of the world, situations have changed just as drastically. In 2006, when JRuby was starting to run basic Rails apps, still deep in compatibility hell, and only beginning to look at performance, there were still really only two implementations: C Ruby (MRI) and JRuby. Now, two years later, there's 5, or 6, or maybe 8 depending on the day and how you count implementations. And as of yesterday, JRuby and C Ruby are joined by Rubinius in being able to route basic Rails requests...so the other implementations are most certainly snapping at our heels now. They're going to move fast. It may take some time, but they're going to start presenting compatibility and performance milestones comparable to JRuby. Already, Chad Fowler of the Ruby community has chosen his horse and claimed that within a year or two Rubinius will be on its way to becoming the "de facto standard" Ruby implementation. Others have put their bets on Rubinius or Gemstone's MagLev. Everyone has a favorite contender in the Ruby implementation arena. Whither JRuby among this crop of promising upstarts?

So it makes me think. Here I sit, on a Saturday night which could be a relaxing evening at home. It's a nice warm night in Minnesota. There's a few good movies on TV. I've got a couple nice beers in the fridge.

But I'm here writing a blog post and considering what next major enhancement to make to JRuby. Why am I doing this? Why do I labor day and night to improve JRuby, or push JVM languages forward, or try to show people why the JVM is such an awesome platform to target? Why suffer under the pain of a two-front battle to keep JRuby relevant?

"I expected better from you."

Kelly's words still ring in my ears. What is better? Is better making a great Ruby implementation that runs on the JVM and solves most of the scaling and "enterprisey" problems with the original? Perhaps success is bringing an off-JVM language to the JVM, and arguably making it the *best* choice for a large subset of users? Does better mean constantly chasing a dream of staying on top and fighting performance and compatibility wars and looking out for number 2 until the end of days?

I think there's something missing here.

The Ruby Side

Chad claims that Rubinius will be the de facto standard Ruby implementation in a year or two. Of course he's totally wrong...in a year or two things are going to be just as fucked up and confusing as they are now, but they'll be fucked up and confusing in altogether different ways.

JRuby will by then be convincingly the fastest way to run Ruby webapps, buoyed by some combination of continuing JRuby performance work, Rails multithreading enhancements, a new JVM version, or a crop of new web frameworks with Merb leading the way. And probably a majority of Rubyists (including several key "thought leaders" in the Ruby community) still won't care because they've always wanted Ruby to "kill Java" in some way. Poor guys...prejudiced and short-sighted.

Rubinius will be running Rails well enough to do production deployment, but without multithreading, memory reductions, or performance improvements it won't present a much more compelling story than MRI. Of course, it's probably going to receive most of those improvements during the year, and with five or six full-time folks working on it it's not infeasible for it to be better than MRI for Rails deployments by then.

IronRuby might be running Rails, might even be running it well, but will probably be getting most of its press running some proprietary Microsoft RIA or MVC framework. John Lam will still be fighting the good fight to make MS an Open Source company...and maybe even succeeding.

MacRuby will be released, probably in an official OS X release, and folks will be using it for various awesome mostly-GUI apps...but it probably won't be a substantial contender in the Ruby web arena.

Ruby 1.9.x will be more stable than today, but not enough of an improvement for most people to move off the 1.8 line; people will be waiting for the mythical Ruby 2.0 that brings all the promised features left out of 1.9.

MagLev will be making Avi Bryant's dream of Ruby on Smalltalk come true, maybe even running Rails and other web frameworks. And...well, nobody will really want to pay for it, even if it can be ten times faster and has the best persistence architecture in the world.

What will I be doing?

I don't expect I'll spend the next five years working on JRuby. I don't necessarily expect I'll spend the next year working on JRuby. There's too much else out there.

...And Beyond

With JRuby, we've shown it's possible to bring a language, a set of libraries, and popular frameworks from another world onto the JVM and make them run even better. We've shown that there's real value in pushing the multilanguage JVM meme over "One Java to Rule Them All". Never before has there been such support for polyglots. Job postings once again list three or four or five languages they'd like candidates to know. Libraries and frameworks boast support for Groovy, JRuby, Jython as part of their feature list. Real money is being spent to turn the existing OpenJDK into a dynamic language powerhouse, led by efforts like the Da Vinci Machine, JRuby, and Groovy. And new static-typed languages like Scala are showing where the Java language needs to evolve (or not evolve) into the future. Ruby is just one part of a great new adventure.

I want to be a part of that. A year ago I started up the JVM Languages Google group to bring JVM language implementers together, and it's been a great success. You can post a question about polymorphic inline caches one day and read about call frame reification or compiler strategies the next. The Da Vinci Machine project, led by John Rose, has started to incorporate all those crazy features we dynlang implementers have really wanted into OpenJDK. Projects like the Maxine VM are starting to show that self-hosting works just as well (or even better) with Java and the JVM than other languages and platforms. These are exciting times.

No, dear readers, I don't mean to say I won't be working on JRuby. JRuby is the gateway drug for me into many different arenas of software. I've learned more about languages and compilers and parsers and VMs and runtimes and libraries and unicode and so on from working on JRuby than I ever learned from any of my jobs. My work on JRuby will continue long into the future.

But it's time to look to bigger things. JRuby has been successful not because of what magic Tom or I or anyone else were able to work...it has been successful because of the JVM, that fantastic piece of engineering that enables top-notch implementations of dozens of languages already. But it's too hard to make languages run well on the JVM right now, and I'll attest to that. We need to make it easier to get languages performing on the JVM. We need to make it easier to build tools for them. In short, we need to open the JVM up to a much larger audience...an audience that might have written Java off as a dead technology. And we need your help.

The Road To Babel

So I'm publicly announcing that we at Sun are hosting a JVM Language Summit. It's long overdue in my opinion...we should have been having these events ten years ago. But it's happening now. We're calling all language and VM implementers to come talk about their projects. It doesn't have to be something running on the JVM...we want very much to hear from folks on the Rubinius project, Parrot project, LLVM project, CLR and DLR projects, and any other language and runtime you can think of. This is the chance to get together with a group of your peers to discuss topics you can usually only explore over email or IRC. It's your chance to say what you want the JVM to do for you...or else to say why your platform does it better. It's a meeting of the minds...a first step toward building more open platforms, better runtimes, and completely Free software stacks that all languages can take advantage of.

And this is just the first step. Over the next year, I'm going to be actively working with others in the JVM Languages community to build out a library of tools and frameworks we can all use to better our implementations. That process has already started...Attila Szegedi has been working on a standard MetaObject protocol for the JVM. John Rose has been working on the DVM and its set of dynlang features. I've been working on various backports of those features and similar libraries for JRuby. The Groovy guys have been working on code-generated call site optimizations. The list of independent projects goes on and on, and now we need to bring these efforts together.

I did a talk at CommunityOne this year about "Bringing the JVM Language Implementers Together", and I really meant it. It's happening right now, and it's about time. It's bigger than Ruby, bigger than Groovy, bigger than any one project for sure. It's about a real platform for real users, users that have different tastes and want different tools for the job. It's about you and those projects you might have put aside. It's about those biases you might have against anything Java-related, as though somehow any project with Java involved is an automatic FAIL. Of course you know it's foolish, but old habits die hard. It's time for you to get involved. It's time for you to cast off those prejudices and help push this platform in the right direction. And I'm going to be here to help...I really want this to happen. But it depends on you. Are you up to the challenge?

"I expected better from you."

No doubt. We should all be doing better. And this is your chance.

Wednesday, May 14, 2008

The Great JRuby Japanese Tour

Yes, friends, it's time once again for a JRuby tour. This trip, we're localizing to the islands of Japan. Do I have any Japanese readers out there?

Here's our route for this trip...it's going to be a crazy ten days:

June 19: Tsukuba, Ruby Kaigi

Tom Enebo and I will be presenting JRuby at the Ruby Kaigi 2008 in Tsukuba this year, as well as meeting up with other Ruby implementers making the trip and communing with the locals. The Kaigi was great last year...lots of fun, great company, and an excellent community. We're both really excited about it. JRuby has come a long way, so we'll be doing a whirlwind tour of performance, Rails, GUI support, and finishing off with something special. It should be a great conference again this year.

June 23: Matsue, Lecture at Shimane University, Meetup with Matz and Locals

We were invited to present JRuby at Shimane University, as part of a lecture series they're doing on Ruby. And since Matz is based in Matsue, we'll certainly meet up with him to talk a bit offline...I expect he'll be pulled many directions at the Kaigi.

June 24: Fukuoka, Ruby Business Commons

Tom did a keynote last year for the Ruby Business Commons, a group of Rubyists proactively trying to bring Ruby to the business community. I suspect we'll deliver another talk or just meet up with them and see how things have progressed in the past year. At any rate, Tom enjoyed the trip to Fukuoka, so I'm looking forward to this.

June 25-27: Tokyo, meetup with local partners, universities, Rubyists?

Whenever Tom and I travel to Japan or meet up with Japanese associates, we put ourselves entirely in the hands of Sun Japan, and specifically our excellent friend and guide Takashi Shitamichi. So far, the Tokyo leg of our trip has a few embedded question marks, but I'm sure Shitamichi-san will be able to fill our days from dawn until dusk with events. Hopefully we'll have a little time to take a breath and poke around Tokyo again, but either way it will be a great end to the tour.

Spreading the Word

I'd really like to jam in as many Ruby and JRuby meetups, discussions, and talks as possible on this trip, so feel free to reblog (and perhaps translate) this entry, or contact me, Tom, and Takashi directly...especially if you know of any good events while we're in Tokyo. And if you're press, Takashi can certainly hook you up if there's time in the tour (for emails...use firstname.lastname@sun.com).

JRuby is ready for the Japanese Ruby community, and we're coming to town to help send it off!

Tuesday, May 13, 2008

RubySpec: Bringing Ruby Test Suites Together

Hooray! The RubySpec Project, a collection of runnable specifications for Ruby 1.8.6ish behavior, has graduated into its own domain. Finally there's a lively, fast-moving, independent project to create a Ruby specification and test kit. And it's already well on its way.

Better documentation on how to pull the specs, update them, and use them for your own Ruby implementation (you do have a Ruby implementation, don't you?) is still being ironed out, but the repository is already available at the RubySpec github address, so you can pull them and start reading and running them. Also see the MSpec github for a lightweight (lighter than RSpec) tool to run the specs with.

But this post is not just about the RubySpec project...Brian Ford is putting together an official announcement for that as we speak. This post is a call to action.

JRuby currently encompasses something like 6 separate test suites:

Our old JRuby test suite using "minirunit", a small runit clone no longer in wide use (the camelCase.rb tests at that URL)
Ryan Davis and Eric Hodel's "BFTS" suite, a narrow but deep set of tests for a few core classes
Our newer set of JRuby tests using test/unit (the underscore_case.rb tests at that URL)
MRI's own set of tests, from the Ruby 1.8 repository (link is to our somewhat out-of-date copy)
A test/unit port of the Rubicon test suite, originally written by Dave Thomas while writing the Pickaxe books
The ruby_test test suite, a suite of tests created by Daniel Berger for his projects (link is to our out-of-date copy)

We don't want to run these tests forever...we would rather just run the RubySpec. So this is where we need help.

Many of these tests are already encompassed in the RubySpec specs. BFTS, for example, focuses only on a very few core classes, which have been heavily covered in RubySpec. In many cases, these test suites even overlap each other, meaning that our 3 minute test run could probably be a lot shorter. If we could just replace our test suite with the RubySpec (modulo JRuby-specific bits like Java integration), we'd be very happy.

But we can't afford to do that unless we know we're not throwing away good tests. The RubySpec is a work in progress, and there are always going to be gaps. It would be folly to throw away our tests without consideration. So that's where you come in.

We need to start at A and work our way through Z, porting over any test cases that aren't covered in the RubySpec.

I started the process tonight, adding a number of missing cases from our test_array.rb script and deleting everything I ported and everything that was already covered. It took perhaps an hour to go through, and it was of a reasonable size. Many other scripts will be much smaller, some will be larger.

The benefits extend far, far beyond JRuby of course. By adding missing test cases, we're going to ensure that all new implementations have a complete spec to go on. We're going to make sure there aren't a lot of incompatibilities you users have to deal with. And we're going to show all those other languages (who are still laughing at our lack of a spec) that we can do this in our own Ruby way.

So what are you waiting for? Contact Brian Ford and get access to the specs (perhaps after paying a one-patch toll)...have a look at the JRuby test repository...pick a file, and start comparing. Tell your friends, email your favorite Ruby list, blog and reblog this effort. The time is now to pull together all the disparate suites into one. RubySpec is ready!

Saturday, May 3, 2008

The Power of the JVM

In the past couple days, a new project release was announced that has shown once again the potential of the Java platform. Shown how the awesome JVM has not yet begun to flex its muscles and really hit its stride in this project's domain. Made clear that even projects with serious issues can correct them, harnessing much more of the JVM with only a modest amount of rework. And demonstrated there's a lot more around the corner.

That project wasn't JRuby this time. It was Groovy.

Groovy's Problem

Groovy 1.6 beta 1 was released a couple days ago. This release was focused largely on performance, rather than polishing bugs and adding features like the 1.5 series. You see, in 1.5 and earlier, Groovy had become basically feature-complete, and was starting to hit its stride. Most of the capabilities they desired were in the language and working. Their oft-touted Java integration had caught up to most Java 5 features. And Grails recently had its 1.0 release; finally there's a framework that can show Groovy at its best. But there was a problem: Groovy was still slow, one of the slowest languages on the JVM.

This doesn't really make a lot of sense, especially compared to languages like JRuby, which have a more complicated feature set to support. JRuby's performance regularly exceeded Groovy's, even though several Ruby features require us, for example, to allocate a synthetic call frame for *every* Ruby method invocation and most block invocations. And JRuby had only received serious work for about 1.5 years. The problem was not that Groovy was an inherently slow language...the problem was the huge amount of code that calls had to pass through to reach their target. Groovy's call path was fat.

A few months back I measured the number of frames between a call and the actual receiver code in Groovy and JRuby. JRuby, which has received a lot of work to shorten and simplify that call path, took only about four stack frames between calls. Groovy, on the other hand, took nearly 15. Some of these frames were due to Groovy still using Java reflection to hold "method objects", but the majority of those frames were Groovy internals. Calls had to dig through several layers of dispatch logic before they would reach a reflected method object, and then there were a few more layers before the target method was actually executed. Oh, and next time you call that method? Start over from scratch.
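If you want to get a feel for this kind of measurement yourself, here's a rough sketch in plain Java. It is not the harness I used against Groovy and JRuby; the class and method names (FrameDepthSketch, depth, direct, reflective) are made up for illustration. It just compares the stack depth seen by a target method when it's called directly versus through java.lang.reflect.Method, and the exact difference will vary by JVM version.

```java
import java.lang.reflect.Method;

// Hypothetical illustration: how many extra frames does reflection put
// between a caller and its target?
final class FrameDepthSketch {
    // Number of frames on the current thread's stack at the point of the call.
    static int depth() {
        return new Throwable().getStackTrace().length;
    }

    // Plain Java call: the target sits directly above the caller.
    static int direct() {
        return depth();
    }

    // Same target reached through reflection; the reflective plumbing adds frames.
    static int reflective() {
        try {
            Method m = FrameDepthSketch.class.getDeclaredMethod("depth");
            return (Integer) m.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("extra frames via reflection: " + (reflective() - direct()));
    }
}
```

A dynamic language's dispatch logic stacks its own frames on top of whatever reflection adds, which is how you end up with a double-digit count.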

A Standard Solution

Early in the JRuby 1.1 dev cycle, we shortened the call path in two ways:

Rather than use reflection for core Ruby classes' methods, we generate small stub methods ("method handles") that directly invoke for us. This avoids all the argument boxing and overhead of reflection entirely. It's only applicable for the core classes, but a very high percentage of any JRuby app--even one that calls Java classes--depends on core classes being fast. So it made a big difference.
When compiling Ruby code to Java bytecode, we employed what's called a call site cache, a tiny slot in the calling method where the previously looked-up method handle can be stored. If, when we return to that call site, the class associated with the method has not changed, and if we're again invoking against that class...we can skip the lookup. That drastically reduces the overhead of making dynamic calls, since most of the time we don't have to start over.
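To make the second item concrete, here's a minimal sketch of a monomorphic call site cache in plain Java. This is not JRuby's actual code. DynClass, MonoCallSite, and the serial-number invalidation scheme are hypothetical stand-ins, and a java.util.function.Function plays the role of the generated stub.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class CallSiteCacheSketch {
    // Hypothetical dynamic class: a method table plus a serial number that is
    // bumped whenever the table changes.
    static final class DynClass {
        private final Map<String, Function<Object, Object>> methods = new HashMap<>();
        private int serial = 0;
        int lookups = 0; // counts slow-path lookups, for demonstration only

        void define(String name, Function<Object, Object> body) {
            methods.put(name, body);
            serial++; // any cache that recorded the old serial is now stale
        }

        int serial() { return serial; }

        Function<Object, Object> lookup(String name) {
            lookups++;
            Function<Object, Object> m = methods.get(name);
            if (m == null) throw new IllegalStateException("undefined method: " + name);
            return m;
        }
    }

    // One of these would live at each call site in the compiled code.
    static final class MonoCallSite {
        private final String name;
        private DynClass cachedClass;
        private int cachedSerial;
        private Function<Object, Object> cachedMethod;

        MonoCallSite(String name) { this.name = name; }

        Object call(DynClass receiverClass, Object arg) {
            // Guard: same class as last time, and that class has not been modified.
            if (receiverClass != cachedClass || receiverClass.serial() != cachedSerial) {
                cachedMethod = receiverClass.lookup(name); // slow path
                cachedClass = receiverClass;
                cachedSerial = receiverClass.serial();
            }
            return cachedMethod.apply(arg); // fast path: no lookup
        }
    }

    public static void main(String[] args) {
        DynClass str = new DynClass();
        str.define("twice", x -> x.toString() + x);
        MonoCallSite site = new MonoCallSite("twice");
        for (int i = 0; i < 1000; i++) site.call(str, "ab");
        System.out.println("lookups after 1000 calls: " + str.lookups);
        str.define("twice", x -> x + "!"); // redefinition invalidates the cache
        System.out.println(site.call(str, "ab") + ", lookups: " + str.lookups);
    }
}
```

The cache holds exactly one class/method pair, which is what makes it monomorphic. A polymorphic cache would keep a small list of pairs and check each guard in turn.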

It is the call site mechanism that gave us our largest performance boost back in November (though I blogged a bit about the technique way back in June and July of 2007...boy was I naïve back then!).

It's certainly not a new technique. There are scads of papers out there (some really old) about how to build call site caches, either monomorphic (like JRuby's and Groovy's) or polymorphic (like most of the high-performance JVMs). Until we put them in place in JRuby, they weren't commonly used for languages built on top of the JVM. But that's all changing...now Groovy 1.6 has the same optimizations in place.

What's the result? A tremendous improvement in performance, similar to what we saw in JRuby last fall. According to Guillaume Laforge, Groovy project lead, the boost on the "Alioth" benchmarks can range anywhere from 150% faster to 560% faster. And the latest Benchmarks Game results prove it out: Groovy 1.6 has drastically improved, and even surpasses JRuby for most of those benchmarks. And while JRuby and Groovy will probably spend the next few months one-upping each other, we've both proven something far more important: the JVM is an *excellent* platform for dynamic languages. Don't let anyone tell you it's not.

Why It Works

The reason call site optimizations work so well for both JRuby and Groovy is twofold.

Firstly, eliminating all that extra dispatch logic whenever possible reduces overhead and speeds up method calls. That's a no-brainer, and any dynamic language can get that boost with the simplest of caches.

But it's the second reason that not only shows the benefit of running on the JVM but gives us a direction to take the JVM in the future. Call site optimizations allow the JVM to actually inline dynamic invocations into the calling method.

The JVM is basically a dynamic language runtime. Because all calls in Java are virtual (meaning subclass methods of the same name and parameters always override parent class methods), and because new code can be loaded into the system at any time, the JVM must deal with nearly-dynamic call paths all the time. In order to make this perform, the JVM always runs code through an interpreter for a short time, very much like JRuby does. While interpreting, it gathers information about the calls being made, 'try' blocks that immediately wrap throws, null checks that never fail, and so on. And when it finally decides to JIT that bytecode into native machine code, it makes a bunch of guesses based on that profiled information; methods can be inlined, throws can be turned into jumps, null checks can be eliminated (with appropriate guards elsewhere)...on and on the list of optimizations goes (and I've heard from JVM engineers that they've only started to scratch the surface).

This is where the call site optimizations get their second boost. Because JRuby's and Groovy's call sites now move the target of the invocation much closer to the site where it's being invoked, the JVM can actually inline a dynamic call right into the calling method. Or in Groovy's case, it can inline much of the reflected call path, maybe right up to the actual target. So because Groovy has now added the same call site optimization we use in JRuby, it gets a double boost from both eliminating the dispatch overhead and making it easier for the JVM to optimize.

Of course there's a catch. Even if you call a given method on type A a thousand times, somewhere down the road you may get passed an instance of type B that extends and overrides methods from A. What happens if you've already inlined A's method when B comes along? Here again the JVM shines. Because the JVM is essentially a dynamic language runtime under the covers, it remains ever-vigilant, watching for exactly these sorts of events to happen. And here's the really cool part: when situations change, the JVM can deoptimize.

This is a crucial detail. Many other runtimes can only do their optimization once. C compilers must do it all ahead of time, during the build. Some allow you to profile your application and feed that into subsequent builds, but once you've released a piece of code it's essentially as optimized as it will ever get. Other VM-like systems like the CLR do have a JIT phase, but it happens early in execution (maybe before the system even starts executing) and doesn't ever happen again. The JVM's ability to deoptimize and return to interpretation gives it room to be optimistic...room to make ambitious guesses and gracefully fall back to a safe state, to try again later.
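Here's a small Java sketch of the A-then-B scenario, under the assumption that you're running on HotSpot. The class names are just the A and B from above, wrapped in a made-up DeoptSketch class. The program can't observe deoptimization from the inside; what it shows is that the answers stay correct when B shows up at a call site that has only ever seen A. If you want to watch the JIT change its mind, running with HotSpot's -XX:+PrintCompilation flag should log the compiled method being invalidated, though the exact output depends on your JVM.

```java
import java.util.Arrays;

final class DeoptSketch {
    static class A { int value() { return 1; } }
    static class B extends A { @Override int value() { return 2; } }

    // The a.value() call below is the call site of interest. While only A has
    // been seen, the JIT is free to inline A.value() here.
    static int sum(A[] items) {
        int total = 0;
        for (A a : items) total += a.value();
        return total;
    }

    // Returns { result while only A was seen, result after B appears }.
    static int[] run() {
        A[] onlyA = new A[1000];
        Arrays.fill(onlyA, new A());
        int warm = 0;
        for (int i = 0; i < 20_000; i++) warm = sum(onlyA); // warm up: A only

        A[] mixed = onlyA.clone();
        mixed[0] = new B(); // B arrives after the call site is hot
        int after = sum(mixed);
        return new int[] { warm, after };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " with only A, " + r[1] + " once B appears");
    }
}
```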

Only The Beginning

So where do we go from here? Well, ask me or the Groovy guys about putting these optimizations in place and we'll tell you the same thing: it's hard. Maybe too hard, but I managed to do it and I don't really know anything. It took the Groovy guys quite a while too. At any rate, it's not easy enough, and because we have to wire it together by hand (meaning we can only present a finite set of call paths) we're still not giving the JVM enough opportunity to optimize. Sure, we'll all continue to improve what we have for existing JVMs, and our performance will get better and better (probably a lot better than it is now). But we're also looking to the future. And the future holds another key to making the JVM an even better dynamic language runtime: JSR-292.

JSR-292 is basically called the "invokedynamic" JSR. The original idea for 292 was that a new bytecode could be added to the JVM to allow invoking methods dynamically against a target object, without actually knowing the type of the object or signature of the target method. And though that sounds like it might be useful, it turns out to be worthless in practice. Most dynamic languages don't even use standard Java class structures to represent types, so invokedynamic against a target object wouldn't accomplish anything. The methods don't live there. And it turns out there's a political side to it too: getting a new bytecode added to the JVM is *super hard*. So we needed a better way.

John Rose is in charge of the HotSpot optimizing compiler (the "server" compiler) at the heart of Sun's JVM. HotSpot is an amazing piece of software...it does all the optimizations I listed above plus hundreds of others that may or may not make your ears bleed. It has two different JIT compilers for different needs (soon to be merged into a single three-stage optimization pipeline), probably half a dozen different garbage collectors (a few weeks ago I met a guy in charge of one generation of one collector...crazy), and probably a thousand tweakable execution and optimization flags. It can make most Java run as fast as equivalent C++, even while the HotSpot engineers recommend you "just write normal code". In short, HotSpot has balls of steel.

John took over JSR-292 about this time last year. Not much work had been done on it, and it looked like it was moving toward a dead-end; most of the dynamic language projects agreed it wouldn't help them. Around that time, it was becoming apparent that JRuby would be able to make Ruby run really well (aka "fast") on the JVM, but it was taking a lot of work to do it. Tom and I talked with John a few times about strategies, many of which we've put in place over the past year, and they were all rather tricky to implement. Largely, they moved toward making the call path as fast as possible, by both shortening it and making the number and type of parameters match the target all the way through.

In order to reduce this workload for language implementers, John has been working on several features leading up to "invokedynamic". Here's the rough overview of how it will fit together.

The first feature is already working in John's multi-language VM "Da Vinci Machine" project: anonymous classloading. JRuby first improved invocation performance by avoiding reflection and generating little wrapper classes, but those classes incur a very high cost. Each one has to be generated, classloaded, named, stored, and eventually dereferenced and garbage-collected independently. You can't do that with a single class or a single classloader, so we had a class per method, and a classloader per class. That's a crapload of memory used just to get around the JVM's bent toward plain old Java types. Anonymous classloading aims to eliminate that overhead in two ways: first, it will not require hard references or names for these tiny loaded classes, allowing them to easily garbage collect when the code is no longer in use; and second, it will allow you to generate a template class once, then create duplicates of it with only small constant pool changes. Lost? Keep up with me...it leads into the next one.
The second feature John hopes to have done real soon now: lightweight method handles. Method handles are essentially like java.lang.reflect.Method objects, except that they exactly represent the target method's parameter list and they take up far less memory...about 1/10 that of Method by John's estimate. Here's where the anonymous classloading comes in. Because all methods that have a given signature can be invoked with basically the same code, we only need to generate that handle once. So to support the broad range of classes and method names we'll want to invoke with that handle, we just patch the handle's constant pool. It's like saying "now I want a handle that invokes the same way, but against the 'bar' method in type B". Ahh, now anonymous classloading starts to make sense. We have one copy of the code with several patched instances. It makes me giddy just to think about it, because of how it would help JRuby. Because all our core classes just accept IRubyObject as arguments, we'd have to generate exactly ten primary handles instead of the thousand or more we generate now. And that means we can get even more specific.
Method handles feed into the big daddy itself: dynamic invocation. Because handles are so close to the metal, and because the JVM understands what the hell they are (rather than having to perform lots of nasty tricks to optimize reflection) we can start to feed handles straight back into the JVM's optimization logic. So once we present our dynamic types to the JVM's dynamic lookup logic, we simply have to toss it method handles. And because the JVM can now connect the caller with the callee using standard mechanisms, our call site optimizations get chucked in the bin. The JVM can now treat our dynamic call like any other virtual call. All we need to do is provide the trigger that tells the JVM that the old handle is no longer correct, and it will come back for a new one. And we get to delete half the JRuby codebase that deals with making dynamic invocation fast. WOW.
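For a rough idea of what the last two pieces look like from the language implementer's side, here's a sketch written against java.lang.invoke, the API this work eventually shipped as in Java 7. That's an assumption relative to the text above, which describes the design before any of it was released, so treat this as an illustration of the shape of the idea rather than what John's prototype looked like. The class name MethodHandleSketch and the choice of String methods as targets are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.MutableCallSite;

final class MethodHandleSketch {
    static String demo() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            // Two different targets with the same shape: (String)String.
            MethodType noArgsToString = MethodType.methodType(String.class);
            MethodHandle upper = lookup.findVirtual(String.class, "toUpperCase", noArgsToString);
            MethodHandle trim = lookup.findVirtual(String.class, "trim", noArgsToString);

            // The call site starts out bound to one handle...
            MutableCallSite site = new MutableCallSite(upper);
            MethodHandle invoker = site.dynamicInvoker();
            String first = (String) invoker.invokeExact(" hi ");

            // ...and the "trigger" is simply handing the JVM a new handle.
            site.setTarget(trim);
            String second = (String) invoker.invokeExact(" hi ");

            return first + "|" + second;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point is that the caller only ever holds the invoker. Swapping the target is the language runtime's way of saying "the old handle is no longer correct", and everything between caller and callee is something the JVM itself understands.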

Of course this is not there yet and won't be until JDK7 (fingers crossed!). We want to continue to support pre-JDK7 JVMs with languages like JRuby and Groovy, so an important component of this work will be backported libraries to do it "as well as possible" without the above features. That work will probably grow out of JRuby, Groovy, Jython, Rhino, and any other dynamic JVM languages, since we're the primary consumers right now and we're making it happen today. But I'll tell you, friends...you don't know what you've been missing on the JVM. Groovy's performance improvement from simply adding call site caches amazes me, even though we received the same boost in JRuby last year. The techniques we're both planning for our next versions will keep performance steadily increasing. And we've got invokedynamic right around the corner to really take us the last mile.

The future is definitely looking awesome for dynamic languages on the JVM. And languages like Groovy and JRuby are proving it.

Thursday, May 1, 2008

Culling the Herd

I've been on Twitter for a while, but only recently started using it in earnest. I've got 'rific, have my Growl notifications up, and run a few Tweetscans through my feed reader to rush to the rescue of JRubyists in trouble. But I try to do something it seems most tweeters don't: I try not to crapflood my followers with useless bullshit.

Don't get me wrong, I'm all for stream-of-consciousness information flow. I do it myself, either when hanging out with people (where it might form the beginning of a conversation) or when on IRC (where it often doesn't, but is easily ignored). But I dunno, it seems to me tweets ought to be something at least a little more substantial.

So I'm keeping my list of followees small and tight. Around 20-25 seems like a pretty good range.

That's meant unfollowing people that tweet nothing but their travel schedules, where in the house they happen to be sitting, what great new product their company just released, and a load of other nonsense. That's meant not following everyone that follows me, especially if their primary topics include walking the dog and taking out the trash. To me, the value of Twitter is both in keeping track of what people I respect are working on or find interesting and as a sort of micro-feed, a little forced 2-second thought break to help me step back from hard problems. Whether you buttered your toast on the bottom or found an unrecognizable lump of once-food in the refrigerator is worthless to me...so if those are the tweets you're inflicting on the world, why should I begin or continue to follow you?

Of course there are a few folks that have a few very insightful tweets sprinkled in with others I don't find interesting...not necessarily a signal-to-noise problem, but a relevant-to-irrelevant thing. It's a judgment call, so if you're not immediately followed by me don't take offense. We just have different interests.