SBCL character IO has always been rather slow, but after Unicode support was added about a year ago it got even worse. To give an idea of how bad: reading and printing a 65MB (1.3 million lines) file line by line takes <1.5 seconds with Perl, 5-8 seconds with the other Lisps that I have installed, and 15 seconds with SBCL 0.9.6. A pre-Unicode SBCL takes about 7 seconds.
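For reference, a benchmark of this kind can be sketched roughly as below. This is only an illustration of "reading and printing a file line by line", not the exact code that was timed; `COPY-LINES` is a made-up name and the file path in the comment is a placeholder.

```lisp
;; Hypothetical benchmark sketch: copy a character stream to another
;; stream one line at a time. COPY-LINES is an illustrative name.
(defun copy-lines (in out)
  "Read IN line by line, write each line to OUT, and return the line count."
  (loop for line = (read-line in nil nil)
        while line
        do (write-line line out)
        count line))

;; Possible usage (the path is a placeholder):
;; (with-open-file (in "/tmp/big-file.txt")
;;   (time (copy-lines in *standard-output*)))
```

Wrapping the call in `TIME` gives a rough wall-clock figure comparable to the numbers above.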
So I went hunting for some low-hanging fruit in (fd-)streams, and found quite a lot.
There were several places where the Unicode-induced separation of (SIMPLE-ARRAY CHARACTER) had forced formerly inlined operations (stuff like REPLACE, etc.) to be replaced with generic calls due to insufficient type information.
The addition of the OUTPUT-NOTHING restart when trying to write a character into a stream with an incompatible external format was causing overhead on every iteration of some inner loops. Though I have a vague recollection that it was even worse at some point (creation of a restart on every iteration of the innermost loop) than it is now (establishing a catch tag on every iteration).
The input buffer for UTF-8 streams never received more than one character at a time.
READ-LINE was fetching data from the internal input buffer character by character, instead of looking ahead for a newline and then copying a bigger batch of characters at once.
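The look-ahead idea from the last point can be sketched as follows. This is a simplified illustration and not the actual fd-stream code: `NEXT-LINE` is a hypothetical helper, and the buffer is assumed to be a plain (SIMPLE-ARRAY CHARACTER (*)) that has already been filled.

```lisp
;; Hypothetical sketch: find the next newline in an already filled
;; character buffer and copy the whole line out in one batch, instead
;; of pulling characters out one at a time.
(defun next-line (buffer start end)
  "Return the line in BUFFER beginning at START, and the index to resume at.
The second value is NIL when no newline occurs before END."
  (declare (type (simple-array character (*)) buffer)
           (type fixnum start end))
  (let ((newline (position #\Newline buffer :start start :end end)))
    (if newline
        ;; One bulk copy of everything up to (not including) the newline.
        (values (subseq buffer start newline) (1+ newline))
        ;; No newline yet: hand back what is there; a real stream
        ;; would have to refill the buffer and continue.
        (values (subseq buffer start end) nil))))
```

The declared array type matters here too: with it the compiler has enough information to specialize POSITION and SUBSEQ rather than fall back on generic sequence code.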
After fixing all of the above and doing some additional micro-optimizations, SBCL now takes about 3.5 seconds, which isn't too bad. If you've been having IO performance troubles with SBCL, now might be a good time to test CVS SBCL.
One thing that I ran into and didn't have time to look at is that SB-SYS:*STDIN* doesn't get a CIN-BUFFER at all, and thus is still painfully slow. If this is intentional, my guess is that FD-STREAM-READ-N-BYTES doesn't play along well with line-buffering.