SBCL's register allocator should now be a bit more effective. The old allocator basically walked the intermediate code linearly and greedily assigned each TN (temporary name, the compiler's unit of storage allocation) to a free register, falling back to the stack when none was available. The new behaviour is still greedy, but instead of following a linear ordering it allocates the TNs inside deep loops first, giving them a better chance of ending up in a register. The utility of this optimization can easily be illustrated with a contrived benchmark. More pragmatically, many bignum operations are now much faster on x86.
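To make the difference in ordering concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not SBCL's actual packing code: each TN is reduced to a single live range plus a loop depth, and every name in it (`TN`, `greedy_pack`, `linear_order`, `loop_first_order`) is made up for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TN:
    """Hypothetical stand-in for a TN: one live range plus a loop depth."""
    name: str
    start: int       # first instruction index where the value is live
    end: int         # last instruction index where the value is live
    loop_depth: int  # nesting depth of the innermost loop using the value


def overlaps(a, b):
    """Two TNs conflict if their live ranges share any instruction."""
    return a.start <= b.end and b.start <= a.end


def greedy_pack(tns, num_registers):
    """Place TNs in the given order; each one takes the lowest-numbered
    register not used by an already placed, overlapping TN, or goes to
    the stack if every register is taken."""
    assignment = {}
    placed = []
    for tn in tns:
        taken = {assignment[o.name] for o in placed if overlaps(tn, o)}
        free = [r for r in range(num_registers) if r not in taken]
        assignment[tn.name] = free[0] if free else "stack"
        placed.append(tn)
    return assignment


def linear_order(tns):
    """Old behaviour in this model: follow the code from top to bottom."""
    return sorted(tns, key=lambda tn: tn.start)


def loop_first_order(tns):
    """New behaviour in this model: deepest loops first, then code order."""
    return sorted(tns, key=lambda tn: (-tn.loop_depth, tn.start))


# Two long-lived values outside any loop, one value used in a nested loop.
tns = [
    TN("a", start=0, end=10, loop_depth=0),
    TN("b", start=1, end=10, loop_depth=0),
    TN("i", start=3, end=8, loop_depth=2),
]

old = greedy_pack(linear_order(tns), num_registers=2)
new = greedy_pack(loop_first_order(tns), num_registers=2)
print("linear:    ", old)
print("loop-first:", new)
```

With two registers, the linear ordering hands both of them to `a` and `b` and spills the loop variable `i` to the stack. The loop-first ordering gives `i` a register and spills `b` instead, which is the cheaper choice if the loop body runs many times.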

The changes to the allocator itself were pretty straightforward, and detecting loops could be done with some ancient code written by Rob MacLachlan for CMUCL. A bit of software archaeology was required for the latter, since the code in question had probably not been compiled or executed since the 1980s. I missed some of the implementation subtleties and introduced a nasty bug, which Paul Dietz found almost immediately with his cool random tester. Fortunately Alexey Dejneka fixed the problem within hours of the report. Thanks!

Other lispy things that have happened since my last blog entry:

  • There have been a couple of Helsinki Lisp meetings.

  • I've finally come to the conclusion that writing blog entries as s-exps is just too painful. To relieve the pain, I stole the markup code from clog.