Preliminary Rubinius inliner benchmarks

I've done some very preliminary benchmarking on the inliner I've been hacking into Rubinius.

For the very simple case it can handle so far (guaranteed dispatch to self, a fixed number of arguments with no splats or defaults, and no blocks), here's what we get for 10 million iterations of a simple function calling another simple function, with times in seconds:

                         user     system      total        real
uninlined-no-args   22.495877   0.000000  22.495877 ( 22.495978)
inlined-no-args     21.741561   0.000000  21.741561 ( 21.741581)
uninlined-4-args    27.742596   0.000000  27.742596 ( 27.742583)
inlined-4-args      24.593837   0.000000  24.593837 ( 24.593869)
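
The benchmark source isn't included here. As a rough sketch only, a harness producing a table of this shape could look like the following, using Ruby's standard Benchmark library. The class `Subject`, its methods, and `run_bench` are hypothetical names made up for illustration, and how inlining was switched on or off between rows is not stated, so the sketch only times the two call shapes.

```ruby
require 'benchmark'

# Hypothetical stand-in for the benchmarked code: one simple method
# calling another on self, with a fixed argument count and no blocks.
class Subject
  def inner0
    1
  end

  def outer0
    inner0                # dispatch to self, no arguments
  end

  def inner4(a, b, c, d)
    a
  end

  def outer4(a, b, c, d)
    inner4(a, b, c, d)    # dispatch to self, four fixed arguments
  end
end

# Time both call shapes for `iterations` calls each.
# Returns the array of Benchmark::Tms results from Benchmark.bm.
def run_bench(iterations)
  s = Subject.new
  Benchmark.bm(10) do |x|
    x.report('no-args') { iterations.times { s.outer0 } }
    x.report('4-args')  { iterations.times { s.outer4(1, 2, 3, 4) } }
  end
end

# 10 million iterations, matching the count quoted above.
run_bench(10_000_000) if __FILE__ == $PROGRAM_NAME
```

Running the same script once with the inliner and once without would give the inlined and uninlined rows.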

So inlining results in a 3.5% speedup on method dispatch with no arguments, and a 12.8% speedup when there are four arguments (speedup here meaning uninlined time divided by inlined time).
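
To make the arithmetic explicit, here is a small sketch that recomputes the percentages from the totals in the table. It assumes speedup means the ratio of uninlined to inlined time, which is the reading that reproduces the quoted figures. The helper names are made up for illustration, and the second helper shows the alternative "time saved" reading for comparison.

```ruby
# Total times in seconds, copied from the benchmark table.
TIMES = {
  'no-args' => { uninlined: 22.495877, inlined: 21.741561 },
  '4-args'  => { uninlined: 27.742596, inlined: 24.593837 }
}

# Speedup as a ratio of running times, expressed as a percentage.
def speedup_pct(uninlined, inlined)
  (uninlined / inlined - 1.0) * 100.0
end

# Alternative reading: percentage of the original running time saved.
def time_saved_pct(uninlined, inlined)
  (uninlined - inlined) / uninlined * 100.0
end

TIMES.each do |label, t|
  printf("%-8s speedup %.2f%%  time saved %.2f%%\n",
         label,
         speedup_pct(t[:uninlined], t[:inlined]),
         time_saved_pct(t[:uninlined], t[:inlined]))
end
```

Rounded to one decimal place, the speedup values come out to 3.5 and 12.8. The time-saved values are a little lower, so it matters which definition is being quoted.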

Of course, this is the best case for the inliner. Guaranteed dispatch to self means that I don't even add any guard code, which would definitely slow things down. But this is actually a fairly common case: it comes up whenever you call accessors on self or helper functions that don't take blocks or varargs.
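
To illustrate which call sites fit those limits, here is a hypothetical class. The class `Point` and its methods are made up, and the comments classifying each call are an assumption based only on the restrictions listed above, not output from the inliner.

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Fits the limits: accessor calls on self with no arguments.
  def norm_squared
    x * x + y * y
  end

  # Fits the limits: helper called on self with a fixed argument count.
  def scaled(factor)
    scale_pair(x, y, factor)
  end

  # Outside the limits: the callee takes a splat (varargs).
  def total
    add_all(x, y)
  end

  # Outside the limits: the call to self passes a block.
  def coords
    out = []
    visit { |c| out << c }
    out
  end

  private

  def scale_pair(a, b, factor)
    [a * factor, b * factor]
  end

  def add_all(*nums)
    nums.inject(0) { |acc, n| acc + n }
  end

  def visit
    yield x
    yield y
  end
end

p Point.new(3, 4).norm_squared
p Point.new(3, 4).scaled(2)
```

The first two methods are the kind of everyday code the simple case covers. The last two would need the inliner to handle splats and blocks.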

And the real boost from inlining, presumably, is going to come in conjunction with a JIT, since the CPU can then pipeline the heck out of everything.

