Tuesday, November 27, 2007

Java 6 Port for OS X (Tiger and Leopard)

I just stumbled across this little gem today:

Landon Fuller's JDK 6 Port for OS X

Who Landon Fuller is I don't know. But I find it incredibly impressive that he's managed to get the base JDK 6 ported to OS X and working. Talk about showing the value of an open-source JDK...Landon Fuller FTW. Apple, are you hiring? Perhaps this guy can kick the Apple JDK process in the ass.

So naturally when I'm confronted with a final JDK 6 release for OS X one thing immediately springs to mind: performance.

We'd always suspected that the early preview version of JDK 6 on OS X was not showing us the true awesome performance we could expect from a final version.

We were right.

So I'll give you two sets of numbers, one that's specious and unreliable and the other that's a more real-world test.

fib numbers

Yes, good old fib. A constant in benchmarking. It shows practically nothing, and yet people use it to demonstrate perf. And in the case of Ruby 1.9, they've specifically optimized for integer-math-heavy benchmarks like this.
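For anyone who hasn't seen it, the benchmark looks roughly like the sketch below. This is a minimal reconstruction, not the exact script behind these numbers: the method name `fib_ruby`, the argument, and the iteration count are all assumptions chosen so it finishes quickly.

```ruby
require 'benchmark'

# Naive doubly-recursive Fibonacci. Nearly all the time goes to
# method dispatch and small-integer math, which is why it says so
# little about real applications.
def fib_ruby(n)
  if n < 2
    n
  else
    fib_ruby(n - 1) + fib_ruby(n - 2)
  end
end

# Assumed values, not the ones used for the timings in this post.
iterations = 10
fib_arg = 20

iterations.times do
  # Benchmark.measure returns a Benchmark::Tms; printing it gives the
  # "user system total (real)" columns seen in the results.
  puts Benchmark.measure { fib_ruby(fib_arg) }
end
```

Running it several times in one process matters on the JVM, since the first pass includes warm-up before the JIT kicks in.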

Ruby 1.9:
0.400000 0.000000 0.400000 ( 0.413737)
0.420000 0.010000 0.430000 ( 0.421622)
0.400000 0.000000 0.400000 ( 0.411591)
0.410000 0.000000 0.410000 ( 0.411593)
0.400000 0.000000 0.400000 ( 0.410080)
0.410000 0.000000 0.410000 ( 0.408836)
0.400000 0.000000 0.400000 ( 0.408572)
0.410000 0.000000 0.410000 ( 0.408114)
0.400000 0.000000 0.400000 ( 0.410374)
0.400000 0.000000 0.400000 ( 0.413096)

Very nice numbers, especially considering Ruby 1.8 benchmarks at about 1.7s on my system. Ruby 1.9 contains several optimizations for integer math, including the use of "tagged integers" for Fixnum values (saving object costs) and fast math opcodes in the 1.9 bytecode specification (avoiding method dispatch). JRuby does neither of these, representing Fixnums as a normal Java object containing a wrapped Long and dispatching as normal for all numeric operations.
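To make the "tagged integer" idea concrete, here is a toy model in plain Ruby. It is only an illustration of the encoding trick, assuming a single low tag bit; the helper names are hypothetical and this is not the interpreter's actual C code.

```ruby
# Model a machine word whose lowest bit is a tag.
# Tag bit 1 means the remaining bits hold the integer itself,
# so no heap object is needed for the value.
def tag_fixnum(n)
  (n << 1) | 1
end

# True when the word carries an integer rather than a pointer.
def fixnum_word?(word)
  (word & 1) == 1
end

# Arithmetic right shift drops the tag and keeps the sign.
def untag_fixnum(word)
  word >> 1
end

# Add two tagged words without untagging them first:
# (2a + 1) + (2b + 1) - 1 == 2(a + b) + 1
def tagged_add(x, y)
  x + y - 1
end

word = tag_fixnum(21)
puts fixnum_word?(word)                              # true
puts untag_fixnum(tagged_add(word, tag_fixnum(21)))  # 42
```

The contrast with a boxed representation is that every result there is a freshly allocated object, and every `+` goes through a normal method call.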

JRuby trunk:
0.783000 0.000000 0.783000 ( 0.783000)
0.510000 0.000000 0.510000 ( 0.510000)
0.510000 0.000000 0.510000 ( 0.510000)
0.506000 0.000000 0.506000 ( 0.506000)
0.505000 0.000000 0.505000 ( 0.504000)
0.507000 0.000000 0.507000 ( 0.507000)
0.510000 0.000000 0.510000 ( 0.510000)
0.507000 0.000000 0.507000 ( 0.507000)
0.508000 0.000000 0.508000 ( 0.508000)
0.510000 0.000000 0.510000 ( 0.510000)

This is improved from numbers in the 0.68s range under the Apple JDK 6 preview. Pretty damn hot, if you ask me. I love being able to sit back and do nothing while performance numbers improve. It's a nice change from 16-hour days.

Anyway, back to performance. JRuby also supports an experimental frameless execution mode that omits allocating and initializing per-call frame information. In Ruby, frames are used for such things as holding the current method visibility, the current "self", the arguments and block passed to a method, and so on. But in many cases, it's safe to omit the frame entirely. I haven't got frameless mode running 100% safely in JRuby yet, and probably won't before 1.1 final comes out...but it's on the horizon. So then...numbers.

JRuby trunk, frameless execution:
0.627000 0.000000 0.627000 ( 0.627000)
0.409000 0.000000 0.409000 ( 0.409000)
0.401000 0.000000 0.401000 ( 0.401000)
0.402000 0.000000 0.402000 ( 0.402000)
0.403000 0.000000 0.403000 ( 0.403000)
0.403000 0.000000 0.403000 ( 0.403000)
0.404000 0.000000 0.404000 ( 0.405000)
0.401000 0.000000 0.401000 ( 0.401000)
0.403000 0.000000 0.403000 ( 0.403000)
0.405000 0.000000 0.405000 ( 0.405000)

Hello hello? What do we have here? JRuby actually executing fib faster than an optimized Ruby 1.9? Can it truly be?

Pardon my snarkiness, but we never thought we'd be able to match Ruby 1.9's integer math performance without seriously stripping down Fixnum and introducing fast math operations into the compiler. I guess we were wrong.

M. Ed Borasky's MatrixBenchmark

I like Borasky's matrix benchmark because it's a non-trivial piece of code, and pulls in a Ruby standard library (matrix.rb) as well. It basically inverts a matrix of a particular size and multiplies the original by the inverse. I show here numbers for a 64x64 matrix, since it's long enough to show the true benefit of JRuby but short enough I don't get bored waiting.
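The core of that kind of test can be sketched in a few lines with matrix.rb. This is not Borasky's actual script: the helper names are hypothetical, the use of Rational entries is an assumption (exact arithmetic is what lets the identity check come out true), and the dimension is kept small so it runs quickly.

```ruby
require 'matrix'

# Hilbert matrix: entry (i, j) is 1 / (i + j + 1) with zero-based
# indices. Rational entries keep every step exact.
def hilbert_matrix(n)
  Matrix.build(n) { |i, j| Rational(1, i + j + 1) }
end

# Invert the matrix, multiply the original by the inverse, and
# check the product against the identity matrix.
def inverse_roundtrip_ok?(n)
  h = hilbert_matrix(n)
  (h * h.inverse) == Matrix.identity(n)
end

# Assumed small dimension; the numbers in this post are for 64.
dimension = 8
result = inverse_roundtrip_ok?(dimension)
puts "Hilbert matrix of dimension #{dimension} times its inverse = identity? #{result}"
```

Almost all the work here happens inside the pure-Ruby matrix library, so it exercises method dispatch, blocks, and Rational math rather than one tight recursive loop.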

Ruby 1.9:
Hilbert matrix of dimension 64 times its inverse = identity? true
21.630000 0.110000 21.740000 ( 21.879126)

JRuby trunk:
Hilbert matrix of dimension 64 times its inverse = identity? true
14.780000 0.000000 14.780000 ( 14.780000)

This is down from 16-17s under the Apple JDK 6 preview, and it takes roughly a third less time than Ruby 1.9 (14.78s versus 21.88s).

So what have we learned today?
  • Sun's JDK 6 provides frigging awesome performance
  • Apple users are crippled without a JDK 6 port. Apple, I hope you're paying attention.
  • Landon Fuller is my hero of the week. I know Landon will just point at the excellent work to port JDK 6 to FreeBSD and OpenBSD...but give yourself some credit, you did what none of the other Leopard whiners did.
  • JRuby rocks

Note: You have to be a Java Research License licensee to legally download the binary or source versions of Landon's port. That or complain to Apple about some dude making a working port before they did. Landon mentions on his blog that he plans to contribute this work to OpenJDK soon...which would quickly result in a buildable GPLed JDK for OS X. Awesome.

10 comments:

  1. Very impressive results, Charles. I have a gut feeling that JRuby will become the implementation of choice for quite a few developers.

  2. Good news for all leopard users :)

  3. Actually, it also looks like he left Apple since.

  4. I'm confused about your remark about tagged integers in 1.9: I thought Ruby 1.8.x already used tagged integers to represent Fixnums... isn't that the reason why Fixnums max out at 29 bits?

  5. He's the good fellow who provided patches to some of the MOAB (Month of Apple Bugs)...

  6. Apple users are crippled without a JDK 6 port. Apple, I hope you're paying attention.

    Not to sound like too much of a troll, but Sun could also make this happen. I realise that Apple wanted their own JVM for 'better OS integration', but they've abandoned that route now. Sun clearly has the talent and knowledge to do it....

  7. koz: I don't speak for Sun on this, but my understanding is that it's largely a resource issue. Maintaining JDK releases for Solaris, Windows, and Linux takes up a *lot* of effort.

  8. Karl said:
    "Why is it that all of these open source Mac projects are targeting Intel first and PPC later or not at all? PPC binaries run on Intel Macs due to Rosetta. Therefore logically if you're going to target one architecture, you should choose PPC so that it will run on Macs of both architectures."

    I dunno about "all these open source projects", but the reason for this port being limited to Intel Macs is obvious: Sun's JVM contains some very close-to-metal SPARC and x86 code. A nontrivial amount is assembly or assembly-generating C++, and some of the low-level code is very subtle and heavily optimized. I've read some of this code as part of a research project on Java performance, and I wouldn't want to be the one who had to port it to a different CPU architecture. =)

  9. Hi Charles,
    I met you briefly at RailsConf back in May and you quickly got me interested in JRuby. I'm consistently impressed with how much it has advanced just since then. You and the rest of the team are doing really great work. Keep it up!

  10. @david koontz:
    Hm... I guess what I meant was just this:
    (2**29).class => Fixnum
    (2**30).class => Bignum

    30 bits of data makes sense, since Ruby's tags take up 2 bits.
