<rss version='2.0'><channel><title>Juho Snellman's Weblog</title><link>https://www.snellman.net/blog/</link><description>Lisp, Perl Golf</description><item><title>Numbers and tagged pointers in early Lisp implementations</title><link>https://www.snellman.net/blog/archive/2017-09-04-lisp-numbers/</link><description>
  &lt;p&gt;
    There was a bit
    of &lt;a href=&#039;https://news.ycombinator.com/item?id=15121859&#039;&gt;discussion
    on HN about data representations in dynamic languages&lt;/a&gt;, and
    specifically having values that are either pointers or immediate
    data, with the two cases being distinguished by use of tag bits in
    the pointer value:
  &lt;/p&gt;

  &lt;blockquote&gt;
    &lt;blockquote&gt;
      If there&#039;s one takeway/point of interest that I&#039;d recommend looking at, it&#039;s the novel way that Ruby shares a pointer value between actual pointers to memory and special &quot;immediate&quot; values that simply occupy the pointer value itself [1].
    &lt;/blockquote&gt;
    This is usual in Lisp (compilers/implementations) and i wouldn&#039;t be surprised if it was invented on the seventies once large (i.e. 36-bit long) registers were available.
  &lt;/blockquote&gt;

  &lt;p&gt;I was going to nitpick a bit with the following:&lt;/p&gt;

  &lt;blockquote&gt;
  &lt;p&gt;
    The core claim here is correct; embedding small immediates inside
    pointers is not a novel technique. It&#039;s a good guess that it was
    first used in Lisp systems. But it can&#039;t be the case that its
    invention is tied into large word sizes, those were in wide use
    well before Lisp existed. (The early Lisps mostly ran on 36 bit
    computers.)
  &lt;/p&gt;

  &lt;p&gt;
   It seems more likely that this was tied into the general migration
   from word-addressing to byte-addressing. Due to alignment constraints,
   byte-addressed pointers to word-sized objects will always have unused
   bits around. It&#039;s harder to arrange for that with a word-addressed
   system.
  &lt;/p&gt;
  &lt;/blockquote&gt;

  &lt;p&gt;
    But the latter part of that was speculation. Maybe I should try to
    check the facts first before being tediously pedantic?
    Good call, since that speculation was wrong. Let&#039;s take a tour
    through some early Lisp implementations, and look at how they
    represented data in general, and numbers in particular.
  &lt;/p&gt;

&lt;read-more&gt;&lt;/read-more&gt;

  &lt;h3&gt;Table of Contents&lt;/h3&gt;

  &lt;ul&gt;
  &lt;li&gt; &lt;a href=&#039;#problem&#039;&gt;The problem with integers&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#lispi&#039;&gt;LISP I&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#lisp1.5&#039;&gt;LISP 1.5&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#pdp1&#039;&gt;Basic PDP-1 LISP&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#m460&#039;&gt;M 460 LISP&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#pdp6&#039;&gt;PDP-6 LISP&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#bbn&#039;&gt;BBN LISP&lt;/a&gt;
  &lt;li&gt; &lt;a href=&#039;#conclusion&#039;&gt;Conclusion&lt;/a&gt;
  &lt;/ul&gt;

  &lt;a name=&#039;problem&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;The problem with integers&lt;/h3&gt;

  &lt;p&gt;
    Before we get started, let&#039;s state the problem that tagged pointers
    solve. In a
    dynamically typed programming language, the language
    implementation must be able to distinguish between values of
    different types. The obvious implementation is boxing; all values
    are treated as blobs of memory allocated somewhere on the heap,
    with an envelope containing metadata such as the type and (maybe)
    the size of the object.
  &lt;/p&gt;

  &lt;p&gt;But this means that integers now have tons of overhead. They use
    up heap space, need to be garbage collected, and new memory needs
    to be constantly allocated for the results of arithmetic
    operations. Since integers are so critical to almost all kinds of
    computing, it would be great to minimize the overhead.  And
    ultimately, to eliminate the overhead completely by encoding small
    integers as recognizably invalid pointers.
  &lt;/p&gt;

  &lt;a name=&#039;lispi&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;LISP I&lt;/h3&gt;

  &lt;p&gt;
    I wasn&#039;t super hopeful about finding out exactly what numbers looked like
    in the original Lisp implementation. As far as I know, the source
    code hasn&#039;t been preserved. Now, the original paper describing Lisp
    (&lt;a href=&#039;http://edge.cs.drexel.edu/regli/Classes/Lisp_papers/McCarthy-original-LISP-paper-recursive.pdf&#039;&gt;
    Recursive Functions of Symbolic Expressions and their Computation
    by Machine, Part I
    &lt;/a&gt;) isn&#039;t quite as theoretical as the title suggests. For
    example it describes the memory allocator and garbage collector on
    a reasonable systems level. But it doesn&#039;t mention numbers at all;
    this is a system for symbolic computation, so numbers might as
    well not exist.
  &lt;/p&gt;

  &lt;p&gt;
    The &lt;a href=&#039;https://kyber.io/rawvids/LISP_I_Programmers_Manual_LISP_I_Programmers_Manual.pdf&#039;&gt;
      LISP I Programmer&#039;s Manual&lt;/a&gt; from 1960 is more illuminating, though not
      entirely consistent. In one place the manual claims that LISP I
      only supports floats, and you&#039;ll need to wait until LISP II to
      use integers. But the rest of the document happily describes the
      exact memory layout of integers, so who can tell.
  &lt;/p&gt;

  &lt;p&gt;
    A floating point value looks like this:
  &lt;/p&gt;

  &lt;img src=&#039;/blog/stc/images/lisp-numbers/lisp1.0-float.png&#039; /&gt;

  &lt;p&gt;
    Let&#039;s say we have the value 1.0 in a LISP I program. This value
    is actually a pointer to a word. How do we know what the type of the
    pointed-to word is? If the upper half of that word is -1, it&#039;s a
    symbol. Otherwise it&#039;s a cons. (The use of -1.0 and 1.0 as the
    example floats in this picture is unfortunate, since it looks like
    the -1.0 and -1 are somehow related. That&#039;s not the case, -1 is
    the universal tag value for atoms, and independent of the exact
    floating point values.)
  &lt;/p&gt;

  &lt;p&gt;
    So the number 1.0 is a symbol? Technically yes, since at this stage of Lisp&#039;s
    evolution everything is either a symbol or a cons. There are no
    other atoms. We can find out if the symbol represents a number by
    following the linked list starting from the &lt;code&gt;cdr&lt;/code&gt; of
    the symbol (a pointer stored in the lower half of the word). If
    we find the symbol &lt;code&gt;NUMB&lt;/code&gt; on the list, it&#039;s some kind
    of number. If we find the symbol &lt;code&gt;FLO&lt;/code&gt;, it&#039;s a floating
    point number, and the property list will be pointing to a word
    that contains the raw floating point value that this number
    represents.
  &lt;/p&gt;

  &lt;p&gt;
    There&#039;s a detail here that&#039;s kind of amazing. Notice that 1.0 and
    -1.0 share the same list structure. The only difference is that
    -1.0 has the symbol &lt;code&gt;MINUS&lt;/code&gt; in the list, after which
    the list merges with the list of 1.0. What a fabulously
    inefficient representation! Not only do you have to do a bunch of
    pointer chasing just to find the actual value of a number, but
    then you&#039;ll get to do it again to find out the sign!
  &lt;/p&gt;

  &lt;p&gt;
    The question I can&#039;t answer just from reading this document is how
    exactly the raw floating point value is handled. Surely the
    garbage collector must know not to interpret those raw
    bits as pointer data? There is a very detailed example of the
    memory layout for an integer on pages 94-95, but even with that
    example I just don&#039;t see where the type information is stored.
    It&#039;s clearly not based on address ranges (the raw values are mixed
    in with the other words), nor the pointer value (all the pointers
    are stored as 2&#039;s complement), nor the 6 unused bits in the
    machine word.
  &lt;/p&gt;

  &lt;p&gt;
    Suggestions welcome. My best guess is that the example is
    inaccurate.
  &lt;/p&gt;

  &lt;a name=&#039;lisp1.5&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;LISP 1.5&lt;/h3&gt;

  &lt;p&gt;The LISP 1.5 Programmer&#039;s Manual from 1962 explains in a very
    concise manner how numbers worked in that implementation:
  &lt;/p&gt;

  &lt;img src=&#039;/blog/stc/images/lisp-numbers/lisp1.5-int.png&#039; /&gt;

  &lt;p&gt;Numbers are still considered to be symbols, and symbols are
    still marked with -1 as the &lt;code&gt;car&lt;/code&gt;. But the standard
    symbol property list is now gone; instead the symbol is pointing
    directly to the memory that stores the raw integer value. How
    does the program know not to follow that pointer as a list? As the
    document says, that&#039;s specified by &quot;certain bits in the tag&quot;.
  &lt;/p&gt;

  &lt;p&gt;
    The tag? What&#039;s the tag? The IBM 704 had a 36-bit word size but
    just a 15 bit address space. The words were split (on the ISA
    level) into a 3 bit &quot;prefix&quot;, 15 bit &quot;decrement&quot;, 3 bit &quot;tag&quot;, and
    15 bit &quot;address&quot;.  Since Lisp values are pointers, only the two
    15 bit regions are useful for that. One of the 3 bit regions has
    been repurposed by the Lisp implementation to mark the pointers
    to raw data.
  &lt;/p&gt;
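
  &lt;p&gt;
    As a rough illustration (not code from the manual), the four
    fields can be pulled out of a 36-bit word with Common Lisp&#039;s
    &lt;code&gt;LDB&lt;/code&gt;. This sketch assumes that the 704&#039;s field order
    from most to least significant bit is prefix, decrement, tag,
    address. The function names are made up for this example.
  &lt;/p&gt;

&lt;pre&gt;
;; Hypothetical helpers. (byte SIZE POSITION) counts POSITION from
;; the least significant bit of the word.
(defun word-prefix (word)    (ldb (byte 3 33) word))
(defun word-decrement (word) (ldb (byte 15 18) word))
(defun word-tag (word)       (ldb (byte 3 15) word))
(defun word-address (word)   (ldb (byte 15 0) word))

(defun make-word (prefix decrement tag address)
  (logior (ash prefix 33) (ash decrement 18) (ash tag 15) address))

;; Pack four fields into one word and pull them back out again.
(let ((word (make-word 0 #o12345 1 #o54321)))
  (list (word-prefix word) (word-decrement word)
        (word-tag word) (word-address word)))
&lt;/pre&gt;

  &lt;p&gt;
    The round trip should hand back the same four field values. Which
    of the two 3 bit fields LISP 1.5 actually used is not something
    this sketch tries to answer.
  &lt;/p&gt;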

  &lt;p&gt;
    This is a clear improvement over LISP I, but a number is still
    represented as an untagged pointer to a tagged pointer to the raw
    value. Why is the intermediate word there at all, why not go
    directly with a tagged pointer to the raw value? Maybe code size?
  &lt;/p&gt;

  &lt;p&gt;
    In parallel to that, the address space has now been split into
    multiple separate pieces, with the cons cells being allocated from
    a different range of addresses than plain data like numbers and
    string segments. It could well be that the tagged pointer is
    irrelevant to the GC, which just makes its decisions on what&#039;s a
    pointer based on whether the pointer is contained in the &quot;full
    word space&quot; or the &quot;free space&quot;. The tags would then be used
    just for implementing &lt;code&gt;NUMBERP&lt;/code&gt;.
  &lt;/p&gt;

  &lt;a name=&#039;pdp1&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;Basic PDP-1 LISP&lt;/h3&gt;

  &lt;p&gt;
    For a L. Peter Deutsch joint,
    &lt;a href=&#039;http://s3data.computerhistory.org/pdp-1/DEC.pdp_1.1964.102650371.pdf&#039;&gt;
      The LISP implementation for the PDP-1 Computer
    &lt;/a&gt; proves to be a surprisingly unsatisfying document. It&#039;s almost
    exclusively user documentation, with no information on the systems
    architecture. Well, except a full source code listing. Guess we&#039;ll
    have to look at that, then.
    &lt;code&gt;NUMBERP&lt;/code&gt; is the easiest starting point:
  &lt;/p&gt;

&lt;pre&gt;
/// (&quot;is a number&quot;)
/NUMBERP
nmp,    lac i 100
        and (jmp
        sad (jmp
        jmp tru
        jmp fal
&lt;/pre&gt;

  &lt;p&gt;The main thing that needs to be known from the rest of the code is
    that the interpreter stores a pointer to the Lisp value that&#039;s
    currently being operated on at address &lt;code&gt;100&lt;/code&gt; (octal).
  &lt;/p&gt;

  &lt;p&gt;First &lt;code&gt;&quot;lac i 100&quot;&lt;/code&gt; follows the pointer to read the
    first data word of the value into the accumulator. The next line
    looks bizarre; due to the way the PDP-1 macro-assembler
    works, &lt;code&gt;&quot;and (jmp&quot;&lt;/code&gt; effectively means &lt;code&gt;&quot;and
    600000&quot;&lt;/code&gt;. So this instruction is masking away all but the
    top two bits of the accumulator, and &lt;code&gt;&quot;sad
    (jmp&quot;&lt;/code&gt; is checking whether the result of the masking equals octal
    &lt;code&gt;600000&lt;/code&gt;. It appears that there is nothing special about the
    pointer to a number, but numbers are identified by having the top
    two bits set in the pointed-to value.
  &lt;/p&gt;
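
  &lt;p&gt;
    The same test written out in Common Lisp, purely as a sketch.
    &lt;code&gt;number-word-p&lt;/code&gt; is a made-up name, and the word is
    assumed to already be available as an 18-bit unsigned integer:
  &lt;/p&gt;

&lt;pre&gt;
;; Octal 600000 has only the top two bits of an 18-bit word set.
(defconstant +number-tag+ #o600000)

(defun number-word-p (word)
  ;; Mask away everything but the top two bits (the &quot;and&quot;), then
  ;; compare what is left against the mask itself (the &quot;sad&quot;).
  (= (logand word +number-tag+) +number-tag+))

(number-word-p #o612345)   ; both tag bits set
(number-word-p #o412345)   ; only the top bit set
&lt;/pre&gt;

  &lt;p&gt;The first call should return true and the second false.&lt;/p&gt;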

  &lt;p&gt;The next step in understanding the layout is the code for reading
  the raw value of a number.&lt;/p&gt;

&lt;pre&gt;
/get numeric value
vag,    lio i 100
        cla
        rcl 2s
        sas (3
        jmp qi3
        idx 100
        lac i 100
        rcl 8s
        rcl 8s
        jmp x
&lt;/pre&gt;

  &lt;p&gt;&lt;code&gt;&quot;lio i 100&quot;&lt;/code&gt; loads the current Lisp value into the IO register.
    &lt;code&gt;&quot;cla&quot;&lt;/code&gt; sets the accumulator to zero. &lt;code&gt;&quot;rcl
    2s&quot;&lt;/code&gt; then rotates the combination of the IO register and
    accumulator by 2 bits.  The accumulator now contains as its
    low bits the previous high two bits of the IO register. &lt;code&gt;&quot;sas
    (3&quot;&lt;/code&gt; compares the accumulator to 3; if they&#039;re not equal we
    jump to qi3 (the error routine for &quot;non-numeric arg for
    arith&quot;). &lt;code&gt;&quot;idx 100&quot;&lt;/code&gt; moves the pointer to the next word
    of the value, and &lt;code&gt;&quot;lac i 100&quot;&lt;/code&gt; reads that word into
    the accumulator. And finally the combination of the two registers
    is rotated by 16 bits, so that we end up with the raw 18 bit value
    in the accumulator. Written out step by step the process looks
    like this:
  &lt;/p&gt;

  &lt;pre&gt;
    . == Bit with value of 0
    ! == Bit with value of 1
    ? == Bit with unknown value
    0-9, A-H == bits of the integer value

    X                    X+1
------------------------------------------------
    [!!23456789ABCDEFGH] [................01]

    IO                   AC
------------------------------------------------
Load IO from address X
    [!!23456789ABCDEFGH] [??????????????????]
Clear AC
    [!!23456789ABCDEFGH] [..................]
Rotate left by 2
    [23456789ABCDEFGH..] [................!!]
Check AC == 3
Load AC from address X+1
    [23456789ABCDEFGH..] [................01]
Rotate left by 8
    [ABCDEFGH..........] [........0123456789]
Rotate left by 8
    [..................] [0123456789ABCDEFGH]
&lt;/pre&gt;
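
  &lt;p&gt;
    For anyone who&#039;d rather run the steps than read them, here is a
    small Common Lisp simulation of the diagram. It is only a sketch:
    &lt;code&gt;rotate-pair&lt;/code&gt; and &lt;code&gt;number-value&lt;/code&gt; are made-up
    names, and the IO and AC registers are modelled as one 36-bit
    ring with IO drawn on the left, as in the diagram.
  &lt;/p&gt;

&lt;pre&gt;
(defun rotate-pair (io ac count)
  ;; Rotate the 36-bit ring IO:AC left by COUNT bits (0 to 35).
  ;; Returns the new IO and AC as two values.
  (let* ((combined (logior (ash io 18) ac))
         (rotated (logior (ldb (byte 36 0) (ash combined count))
                          (ash combined (- count 36)))))
    (values (ldb (byte 18 18) rotated)
            (ldb (byte 18 0) rotated))))

(defun number-value (first-word second-word)
  ;; FIRST-WORD is the word at X, SECOND-WORD the word at X+1.
  (multiple-value-bind (io tag) (rotate-pair first-word 0 2)
    (unless (= tag 3)
      (error &quot;non-numeric arg for arith&quot;))
    ;; Reloading AC from X+1 overwrites the tag bits.
    (multiple-value-bind (unused value) (rotate-pair io second-word 16)
      (declare (ignore unused))
      value)))

;; Octal 523456 stored as tag plus low 16 bits, then the high 2 bits.
(number-value #o723456 #o2)
&lt;/pre&gt;

  &lt;p&gt;The final call should give back octal 523456.&lt;/p&gt;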

  &lt;p&gt;
    Clearly an integer is now represented by a pointer to two words
    that has a special tag in the high bits of the first word. This
    implementation got rid of the extra layer of indirection in LISP
    1.5; an integer is now just a pointer to tagged data. But we&#039;re
    still left with the storage of a one-word integer requiring three
    words.
  &lt;/p&gt;

  &lt;p&gt;
    Why use a layout that requires shuffling data around this much,
    instead of just having the tag in X and the raw value in X+1?  It
    seems awfully inconvenient. My best guess is that the top 1-2 bits
    of the second word are reserved for the GC, e.g. for use as mark
    bits. But understanding exactly how the GC works is maybe a project
    for another day.
  &lt;/p&gt;

  &lt;a name=&#039;m460&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;M 460 LISP&lt;/h3&gt;

  &lt;p&gt;
    Before starting research for this article, I&#039;d never heard of the
    early Lisp implementation for the Univac M 460. A description of
    the system can be found in the 1964
    collection &lt;a href=&#039;https://scholar.google.com/scholar?cluster=1071332420478270292&amp;hl=en&amp;as_sdt=0,5&amp;sciodt=0,5&#039;&gt;
    The programming language LISP: Its operation and applications
    &lt;/a&gt;.
  &lt;/p&gt;

  &lt;blockquote&gt;
    Numbers and print names are placed in free storage using the
    device that sufficiently small (i.e., less than 2^10) half-word
    quantities appear to point into the bit table area and so don&#039;t
    cause the garbage collector any trouble. A number is stored as a
    list of words (a flag-word and from 1 to 3 number words, as
    required), each number word containing in its CAR part 10
    significant bits and sign. Thus an integer whose absolute value
    is less than 2^11 will occupy the same amount of storage
    (2 words) as in 7090 LISP 1.5.
  &lt;/blockquote&gt;

  &lt;p&gt;
    This is another bit of progress! The key insight on the road to
    tagged pointers is that invalid parts of the address space can be
    used to distinguish between pointers and immediate data. Another
    important insight in this paper is that most numbers in a program
    are going to be small, so it might make sense to have variable
    representations for numbers of different magnitude. But it&#039;s not a
    full realization of the concept yet; immediate small numbers are
    not accessible directly by the user. They are internal to the
    implementation, used as a building block for boxed integers of
    various levels of inefficiency.
  &lt;/p&gt;

  &lt;p&gt;
    The paper gets even better once we get a few more pages in, since
    for characters M 460 Lisp does take that final step:
  &lt;/p&gt;

  &lt;blockquote&gt;
Each character in the character set available on the M 460
(including tab, carriage return, and others) is represented internally
by an 8-bit code (6 bits for the character (up to case),
1 bit for case, and 1 bit for color). To facilitate the manipulation
of character strings within our LISP system, we permit
such character literals to appear in list structure as if they
were atoms, i.e. pointers to property lists. These literals can,
where necessary, be distinguished from atoms since they are less
than 2^8 in magnitude and hence, viewed as pointers, don&#039;t point
into free storage (where, as in 7090 LISP, property lists are
stored). The predicate charp simply makes this magnitude test.
  &lt;/blockquote&gt;

  &lt;p&gt;
    That&#039;s about as clear a case of embedding immediate data in
    pointers as it gets. It&#039;s just that the tag is rather large (22
    highest bits, rather than the 1-4 lowest bits you&#039;d expect today).
    And it&#039;s also dealing with characters rather than numbers, so
    let&#039;s carry on with the investigation a bit longer.
  &lt;/p&gt;
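
  &lt;p&gt;
    A sketch of what such a magnitude test amounts to, in Common
    Lisp. The name &lt;code&gt;charp-sketch&lt;/code&gt; is made up, and the value
    is assumed to be a non-negative integer that has already been
    extracted from the list structure:
  &lt;/p&gt;

&lt;pre&gt;
(defun charp-sketch (value)
  ;; Anything below 2^8 has no bits left after dropping the low 8.
  (zerop (ash value -8)))

(charp-sketch #o101)      ; fits in 8 bits, so a character literal
(charp-sketch #o40000)    ; too large, so an ordinary pointer
&lt;/pre&gt;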

  &lt;a name=&#039;pdp6&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;PDP-6 LISP&lt;/h3&gt;

  &lt;p&gt;
    The June 1966 report on
    &lt;a href=&#039;https://dspace.mit.edu/bitstream/handle/1721.1/5899/AIM-098.pdf&#039;&gt;PDP-6 LISP&lt;/a&gt;
    has the following to say on integers:
  &lt;/p&gt;

  &lt;blockquote&gt;
    Fixed-point numbers &amp;gt;= 0 and &amp;lt; about 4000 are represented by
    a &quot;pointer&quot; 1 greater than their value, and no additional list structure.
    All other numbers use a pointer to full-word space as part of an atom
    header with a FIXNUM or FLONUM indicator.
  &lt;/blockquote&gt;

  &lt;p&gt;
    This is starting to get close to the modern fixnum, except for no
    facility for immediate negative numbers and a tiny range. (This is
    a machine with 36 bit words and 18 bit pointers; one would hope
    for a bit more than 12 bits for immediate integers).
  &lt;/p&gt;
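
  &lt;p&gt;
    Spelled out as a Common Lisp sketch, the scheme in that quote
    looks roughly like this. The function names are invented, and the
    cutoff of exactly 4000 is an assumption, since the report only
    says &quot;about 4000&quot;:
  &lt;/p&gt;

&lt;pre&gt;
(defparameter *small-limit* 4000)   ; assumed cutoff

(defun encode-small (n)
  ;; Returns the pointer standing in for N, or NIL when N would
  ;; have to be stored as a boxed number instead.
  (when (and (integerp n)
             (not (minusp n))
             (minusp (- n *small-limit*)))
    (1+ n)))

(defun decode-small (pointer)
  (1- pointer))

(encode-small 0)       ; pointer 1
(encode-small 41)      ; pointer 42
(encode-small -1)      ; NIL, needs a boxed number
(decode-small 42)      ; back to 41
&lt;/pre&gt;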

  &lt;a name=&#039;bbn&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;BBN LISP&lt;/h3&gt;

  &lt;p&gt;
    &lt;a href=&#039;http://www.dtic.mil/dtic/tr/fulltext/u2/647601.pdf&#039;&gt;
      Structure of a LISP system using two-level storage
    &lt;/a&gt; is a wonderful systems design paper from November 1966,
    describing BBN LISP for a PDP-1 with 16K of
    core memory, 88K of absurdly slow drum memory, and no hardware
    paging support. How do you make efficient use of the drum memory?
    By some clever data layout, software-driven paging, and a
    locality-optimizing memory allocator.
  &lt;/p&gt;

  &lt;p&gt;
    So it&#039;s actually a paper I thought was totally worth reading just
    for its own sake. But for the purposes of this post, this is the
    money quote:
  &lt;/p&gt;

  &lt;blockquote&gt;
LISP assumes that it is operating in an environment containing
128K words, that is from 0 to 400,000 octal. Only 88K actually
exist on the drum. The remaining portion of the address space
is used for representation of small integers between -32,767
and 32,767 (offset by 300,000 octal), as described below.
  &lt;/blockquote&gt;

  &lt;p&gt;
    The paper describes a machine with both an 18-bit word size
    and address space, with 16-bit signed fixnums embedded in the
    pointers. That&#039;s about as good as it gets. (Though not quite
    optimal; they&#039;re using bit 17 as the integer tag, but what
    happened to bit 18? The paper doesn&#039;t say, but odds are that
    it&#039;s again a GC mark bit).
  &lt;/p&gt;
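
  &lt;p&gt;
    The arithmetic in that quote can be sketched in Common Lisp as
    follows. The function names are made up, and the sketch assumes
    that every pointer value at or above octal 200,000 is treated as
    a small integer:
  &lt;/p&gt;

&lt;pre&gt;
(defconstant +small-integer-offset+ #o300000)

(defun encode-small-integer (n)
  (+ n +small-integer-offset+))

(defun small-integer-pointer-p (pointer)
  ;; Bit 16, counting from zero, is the octal 200000 bit.
  (logbitp 16 pointer))

(defun decode-small-integer (pointer)
  (- pointer +small-integer-offset+))

(encode-small-integer -32767)        ; octal 200001
(encode-small-integer 32767)         ; octal 377777
(small-integer-pointer-p #o177777)   ; false, an ordinary address
&lt;/pre&gt;

  &lt;p&gt;
    Both ends of the quoted range land inside the upper half of the
    128K address space, which is why a single bit test is enough.
  &lt;/p&gt;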

  &lt;p&gt;
    The particularly observant reader might have noticed that this
    machine had 104K words of physical memory, but the described
    tagging scheme only leaves 64K words addressable. What&#039;s up with
    that? On one level it&#039;s exactly what M 460 LISP and PDP-6 Lisp
    were doing: that 40K of address space stores things that can&#039;t be
    directly pointed to from another Lisp value. But those other
    implementations were just opportunistically reusing the
    parts of address space that contained native code.
  &lt;/p&gt;

  &lt;p&gt;By contrast, BBN LISP carefully arranged for there to exist as
    much of such storage as possible, and for it to be located above
    the address 200,000 (octal).&lt;/p&gt;

  &lt;img src=&#039;/blog/stc/images/lisp-numbers/bbn-lisp-layout.png&#039; /&gt;

  &lt;p&gt;The most clever example of that is the representation of
    symbols. The first implementations we saw just implemented symbols
    as a list of properties indexed by name (e.g. name, value cell,
    function cell, etc). An obvious optimization is to allocate a
    symbol as a single larger block of memory with fixed slots for the
    most common properties, and a generic property list slot to
    contain anything else.&lt;/p&gt;

  &lt;p&gt;
    What BBN Lisp does instead is allocate a symbol in multiple
    separate blocks rather than a single contiguous one. A pointer to
    the symbol will point to the block of value cells, so reading the
    value cell is trivial. What if you want to read another property,
    e.g. the function? We look at the offset of the value cell pointer
    to the start of the value cell block, and access the function cell
    block at the same offset. In modern parlance it ends up as a
    structure-of-arrays layout rather than an array-of-structures.
  &lt;/p&gt;

  &lt;p&gt;
    In addition to getting more address space for fixnums, they also
    got exactly the same kind of locality improvements that a
    structure-of-arrays layout would be used for today. So it was an
    all-around neat optimization.
  &lt;/p&gt;
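
  &lt;p&gt;
    A toy Common Lisp model of that lookup, as a sketch only. The
    block base addresses, the size of the memory array, and all of
    the names are hypothetical; the paper&#039;s real layout is in the
    figure above.
  &lt;/p&gt;

&lt;pre&gt;
(defparameter *value-block-base* #o1000)      ; hypothetical address
(defparameter *function-block-base* #o2000)   ; hypothetical address
(defparameter *memory* (make-array #o3000 :initial-element nil))

(defun value-cell (symbol-pointer)
  ;; The symbol pointer points straight at its value cell.
  (aref *memory* symbol-pointer))

(defun function-cell (symbol-pointer)
  ;; Same offset, different block.
  (let ((offset (- symbol-pointer *value-block-base*)))
    (aref *memory* (+ *function-block-base* offset))))

;; A symbol living at offset 5 of the value cell block.
(let ((symbol-pointer (+ *value-block-base* 5)))
  (setf (aref *memory* symbol-pointer) 42)
  (setf (aref *memory* (+ *function-block-base* 5)) :some-function)
  (list (value-cell symbol-pointer) (function-cell symbol-pointer)))
&lt;/pre&gt;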

  &lt;p&gt;
    There is also an &lt;a href=&#039;http://www.softwarepreservation.org/projects/LISP/bbnlisp/BBN940LispPrelimSpec_Oct1966.pdf&#039;&gt;early design document for BBN 940 LISP&lt;/a&gt; from almost the same time as the above paper. It appears
to describe the kind of elaborate tagging scheme that a modern Lisp
might use, and places the tags in the low bits where they&#039;re easier to
test for/eliminate. And they even call heap-allocated numbers &quot;boxed&quot;!
I had no idea this terminology was in use 50 years ago. The relevant
section:
  &lt;/p&gt;

  &lt;blockquote&gt;
&lt;p&gt;
There will be a maximum of 16 pointer types of
objects in the 940 LISP System. These are (numbered in octal)
&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; 00. S-expressions (nonatomic)
&lt;li&gt; 01. Identifiers (literal atoms)
&lt;li&gt; 02. Small Integers
&lt;li&gt; 03. Boxed Large Integers
&lt;li&gt; 04. Boxed Floating Point Numbers
&lt;li&gt; 05. Compiled Function - Lambda Type
&lt;li&gt; 06. Compiled Function - Lambda Type - Indef Args
&lt;li&gt; 07. Compiled Function - Mu Type - Args Paired
&lt;li&gt; 10. Compiled Function - Mu Type - List of Args
&lt;li&gt; 11. Compiled Function - Macro
&lt;li&gt; 12. Array - Pointers
&lt;li&gt; 13. Array - Integers
&lt;li&gt; 14. Array - FP #s
&lt;li&gt; 15. Strings - Packed Character Arrays
&lt;li&gt; 16.
&lt;li&gt; 17. Pushdown List Pointers
&lt;/ul&gt;

&lt;p&gt;
Each pointer will be contained in one 940 word of 24
bits. Bits 0 and 1 will be nominally empty, and may in some
cases be used by the system (e.g. bit 0 for garbage collection)
or perhaps even the user (in S-expressions). The four bits
2-5 will contain the type number for this pointer. The 18
bits 6-23 will contain an effective address (in the LISP
drum file) where the referenced information is stored.
&lt;/p&gt;

&lt;/blockquote&gt;
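
&lt;p&gt;
As a sketch, decoding such a pointer in Common Lisp could look like
this. It assumes that bit 0 is the least significant bit of the
24-bit word, which puts the type number in the low bits; if the
spec numbers bits from the other end, the positions flip. The
function names are made up.
&lt;/p&gt;

&lt;pre&gt;
;; Assumed layout: bits 0-1 spare, bits 2-5 type, bits 6-23 address.
(defun pointer-type (word)    (ldb (byte 4 2) word))
(defun pointer-address (word) (ldb (byte 18 6) word))

(defun make-pointer (type address)
  (logior (ash address 6) (ash type 2)))

;; Type 03 is &quot;Boxed Large Integers&quot; in the list above.
(let ((word (make-pointer #o03 #o123456)))
  (list (pointer-type word) (pointer-address word)))
&lt;/pre&gt;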

&lt;p&gt;It looks like they ended up not using this design
    for BBN 940 LISP, and it instead uses an extended version of the
    segmented memory scheme from the PDP-1 implementation described
    earlier in this section. But even if these particular bits
    weren&#039;t practical to use with that hardware, at this point
    just about all the ideas for tagged pointers have definitely
    been invented.&lt;/p&gt;

  &lt;a name=&#039;conclusion&#039;&gt;&lt;/a&gt;
  &lt;h3&gt;Conclusion&lt;/h3&gt;

  &lt;p&gt;
    The initial LISP I implementation in 1960 had the least efficient
    implementation of numbers this side of Church numerals, where even
    just getting the value might imply chasing half a dozen
    pointers. But new implementations optimized that layout
    aggressively. By 1964, the M 460 LISP implementation had arrived
    at the general solution of using pointers to invalid parts of
    the address space for storing immediate data, but user-accessible
    integers were still boxed; the only use for the unboxed integers
    was as an internal building block. In 1966 PDP-6 LISP applied the
    idea of tagged immediate data to tiny positive integers, and the
    PDP-1 based BBN LISP took the idea to the logical conclusion, and
    allowed immediate storage of integers of almost the full machine
    word.
  &lt;/p&gt;

  &lt;p&gt;
    I would not have guessed that these optimizations were discovered
    and applied so early and so aggressively. It&#039;s also noteworthy
    that this was independent of the machine word size, address
    space size, and addressing mode of the machine. The first fully
    fledged implementation I found was on a machine with 18 bit words, 18
    bits of address space, and word-addressing. That should have been
    just about the worst case!
  &lt;/p&gt;

  &lt;p&gt;There&#039;s an interesting tangent with how MacLISP ended up
    reversing this progress in the &#039;70s and going back to boxed
    integers, since they wanted to have just a single integer
    representation. I won&#039;t go into the details since this post
    already grew longer than intended. But for those interested in the
    subject &lt;a href=&#039;https://dspace.mit.edu/bitstream/handle/1721.1/6279/AIM-421.pdf&#039;&gt;AI Memo 421&lt;/a&gt; is a fun read.
  &lt;/p&gt;

  &lt;p&gt;
    Was the technique definitely first used in Lisp? These
    implementations are early enough that there aren&#039;t a ton of other
    possibilities. The only ones I can think of would be APL and
    Dartmouth BASIC. If anyone can find documentation on earlier
    uses of storing immediate data in tagged pointers, please
    let me know and I&#039;ll edit the article.
  &lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><category>HISTORY</category><pubDate>Mon, 04 Sep 2017 15:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2017-09-04-lisp-numbers/</guid></item><item><title>Use cases for CHANGE-CLASS in Common Lisp</title><link>https://www.snellman.net/blog/archive/2015-07-27-use-cases-for-change-class-in-common-lisp/</link><description>
&lt;p&gt;
This is a post on use cases for Common Lisp&#039;s &lt;code&gt;CHANGE-CLASS&lt;/code&gt; operation
&lt;a href=&#039;#fn0&#039; id=&#039;fnref0&#039;&gt;[0]&lt;/a&gt;.  As the name suggests, it changes
the class of an object without changing its object identity. It&#039;s an
operation that a certain class of programmers would consider totally
abhorrent.  I think it&#039;s both cool and useful.

&lt;p&gt;
As far as I can see, the class of an instance has three effects in
Common Lisp. It determines the set of slots the object has, it
determines which methods will be executed when a generic function is
called with that object as one of the arguments, and it determines how
the object interacts with the rest of the system based on the
metaclass of the class of the object.

&lt;read-more&gt;&lt;/read-more&gt;

&lt;h3&gt;Why change the class at runtime?&lt;/h3&gt;

&lt;p&gt;
Why would you change the class of an object rather than create a
new object as a replacement? Because there might be references to the
object all over the place, and updating all of those references to
point to the new object might be a lot of work or even impossible.

&lt;p&gt;
And why not just create the object with the right class in the first
place? Sometimes it&#039;s because the object is not created by the
application code but in the depths of some library, and changing that
library is not feasible. At other times it&#039;s because the appropriate
class of the object genuinely changes during execution.

&lt;p&gt;
And why not use a workaround like some kind of a delegating proxy
object instead? Both of the above reasons kind of apply there.

&lt;h3&gt;Adding new slots to a class&lt;/h3&gt;

&lt;p&gt;
I recently had to go and change something in the code that runs my
blog, for the first time in years and years. Now, this is some crufty
code. How crufty, you ask? Well... It runs a web server that was last
updated 10 years ago &lt;a href=&#039;#fn1&#039; id=&#039;fnref1&#039;&gt;[1]&lt;/a&gt;. While
spelunking through the code, I found a bit of code that looked
essentially like this:

&lt;pre&gt;
  (defclass blog-request (araneida:request)
    ((db :initarg :db :accessor db-of)
     (buffer-stream :initarg :buffer-stream :accessor buffer-stream-of)))

  (defmethod araneida:handle-request :around ((handler blog-handler) request)
    (clsql:with-database (db *db-spec* :if-exists :new)
      (let ((string (with-output-to-string (stream)
                     (change-class request &#039;blog-request
                                   :db db
                                   :buffer-stream stream)
                     (call-next-method))))
         (write-string string (araneida:request-stream request)))))
&lt;/pre&gt;

&lt;p&gt;
What&#039;s going on here? Well, we&#039;re hooking around the
&lt;code&gt;HANDLE-REQUEST&lt;/code&gt;
generic function of the web server, setting up a couple of state
objects (a database connection, a &lt;code&gt;STRING-OUTPUT-STREAM&lt;/code&gt;).
We then proceed with normal execution of the request handling with
&lt;code&gt;CALL-NEXT-METHOD&lt;/code&gt;, and write the data that&#039;s been
buffered in the &lt;code&gt;STRING-OUTPUT-STREAM&lt;/code&gt; into the normal
output stream.

&lt;p&gt;
The core problem here is that the state data needs to be threaded down
the call stack to where it&#039;s actually used. Since we&#039;re doing all of
this from the middle of third party code, changing the function
signatures is not an option. So we change the class of the web
server&#039;s request object from &lt;code&gt;REQUEST&lt;/code&gt; to
&lt;code&gt;BLOG-REQUEST&lt;/code&gt; (a subclass), and
stuff the state objects into the slots that have now appeared in the
object.
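
&lt;p&gt;
Stripped of the web server, the trick looks something like the
following sketch. The classes and accessors here are invented for
the example and are not Araneida&#039;s; only &lt;code&gt;CHANGE-CLASS&lt;/code&gt;
and its initargs are standard Common Lisp.

&lt;pre&gt;
  ;; Hypothetical stand-ins for the library class and our subclass.
  (defclass request ()
    ((url :initarg :url :reader url-of)))

  (defclass extended-request (request)
    ((db :initarg :db :accessor db-of)))

  (let ((request (make-instance &#039;request :url &quot;/blog/&quot;)))
    (change-class request &#039;extended-request :db :fake-connection)
    ;; Same object, new class, and one more slot to read from.
    (list (url-of request) (db-of request) (type-of request)))
&lt;/pre&gt;

&lt;p&gt;
The old slot keeps its value across the change, and the new slot is
filled in from the initarg passed to &lt;code&gt;CHANGE-CLASS&lt;/code&gt;.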

&lt;p&gt;
The natural way of writing this in Common Lisp would probably be to
use special variables &lt;a href=&#039;#fn2&#039; id=&#039;fnref2&#039;&gt;[2]&lt;/a&gt;.  I &lt;i&gt;think&lt;/i&gt;
the reason I didn&#039;t go that route was that way back when I was not
running each request in a separate thread, but was using
&lt;code&gt;SERVE-EVENT&lt;/code&gt;, SBCL&#039;s rather bizarre recursive event loop
which really doesn&#039;t play together well with special variables. But
it&#039;s also not always the case that the lifetime of the additional data
is determined by a particular dynamic extent.

&lt;p&gt;
Another typical solution for attaching extra data to an object would
be storing the extra information in a weak-keyed hash table with the
objects as a key, and making that hash-table accessible in all of the
places where this extra data is needed (most likely as a global
variable). As far as I&#039;m concerned, that&#039;s just gross.

&lt;p&gt;
Is there a converse situation where you&#039;d want to &lt;code&gt;CHANGE-CLASS&lt;/code&gt; to
remove some slots from an object? I can&#039;t really think of a plausible
case. It might be a side effect of changing the object to be an
instance of a class that isn&#039;t a subclass of the original class. But
it would never be the actual goal, since the amount of memory you&#039;d save from
having fewer slots would be minuscule.

&lt;h3&gt;Modifying method dispatch&lt;/h3&gt;

&lt;p&gt;
A more obvious use for &lt;code&gt;CHANGE-CLASS&lt;/code&gt; is a need to manipulate method
dispatch. An example I like for this is the intermediate data
representation of a compiler. Consider the representation of a
variable binding. The binding could be for example constant
vs. modified, or totally local lexical binding vs lexical binding
closed over by a function vs. dynamic binding.

&lt;p&gt;
The compiler is going to need to treat the binding objects of the
intermediate representation in very different ways depending on
exactly what kind of variable this is. A variable binding that&#039;s never
changed and that&#039;s known to contain an immutable value can be
trivially constant-folded. A closed over and potentially modified
variable will need some extra code to allocate some memory in which
the variable is stored. And the code that&#039;s generated for any of the
variable references (both reads and writes) will need to be different
as well.

&lt;p&gt;
Now, the funny thing is that these binding objects can change their
state multiple times during compilation. If dead code elimination ends
up removing the last read from a variable, that binding becomes
dead. A variable can bounce from non-closed over to closed over as a
closure is discovered, back to non-closed as it&#039;s proven that the
closure can&#039;t escape after all. And so on. It&#039;d just be infeasible to
generate all of the objects with the correct class up front. And these
objects are going to be referenced willy-nilly from all over the IR
tree, making replacing references truly annoying.

&lt;p&gt;
One way of representing this is to have a bunch of state flags in the
compiler&#039;s binding objects. But then you have to implement the
specializations as a bunch of conditionals in large functions, rather
than by having the specialized behavior in separate methods and
relying on method dispatch to sort things out. I know which form of
organizing code I prefer &lt;a href=&#039;#fn3&#039; id=&#039;fnref3&#039;&gt;[3]&lt;/a&gt;.
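
&lt;p&gt;
To make that a bit more concrete, here&#039;s a toy sketch of the method
dispatch version. The class names and the
&lt;code&gt;EMIT-READ&lt;/code&gt; generic function are invented for this
example; a real compiler would have many more cases.

&lt;pre&gt;
;; Hypothetical IR classes for two kinds of variable bindings.
(defclass binding ()
  ((name :initarg :name :reader binding-name)))
(defclass local-binding (binding) ())
(defclass closed-over-binding (binding) ())

(defgeneric emit-read (binding)
  (:documentation &quot;Return pseudo-code for reading the variable.&quot;))

(defmethod emit-read ((binding local-binding))
  (list :load-register (binding-name binding)))

(defmethod emit-read ((binding closed-over-binding))
  (list :load-from-heap-cell (binding-name binding)))

(defun demo-binding-dispatch ()
  (let* ((binding (make-instance &#039;local-binding :name &#039;x))
         (ir-node (list :ref binding))   ; some other node pointing at it
         (before (emit-read binding)))
    ;; A closure capturing X was discovered: reclassify in place.
    (change-class binding &#039;closed-over-binding)
    ;; The reference held by IR-NODE still points at the same object,
    ;; which now dispatches to the other method.
    (list before (emit-read (second ir-node)))))
&lt;/pre&gt;

&lt;p&gt;
No reference to the binding needed to be updated, and the
&lt;code&gt;NAME&lt;/code&gt; slot carried over since both classes have it.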

&lt;h3&gt;Switching metaclasses&lt;/h3&gt;

&lt;p&gt;
Using &lt;code&gt;CHANGE-CLASS&lt;/code&gt; to switch the class to a hierarchy that&#039;s based on
a different metaclass is where I start drawing a blank. Unlike the
other two cases I haven&#039;t ever felt the need to do that myself so it&#039;s
harder to spin a convincing story. The best I can do is go through
some typical uses of non-standard metaclasses, and think about whether
there could be any reason to change between them and normal classes.
Here&#039;s some broad categories of features you could change:

&lt;ul&gt;
&lt;li&gt; Changing the slot storage representation
&lt;li&gt; Changing slot access in some other way
&lt;li&gt; Changing the code generated for accessors
&lt;li&gt; Adding new metadata to slot definitions or class definitions
&lt;li&gt; Changing the method dispatch resolution in some fundamental way,
  for example using C3 class hierarchy linearization
&lt;/ul&gt;

&lt;p&gt;
And as for how you&#039;d achieve something useful with one of these
features:

&lt;p&gt;
Perhaps the most prototypical use of metaclasses is persistent objects
- for example an object-relational mapping library, but it could be a
real object database too. Why do you need a custom metaclass for this?

&lt;p&gt;
One reason is lazy initialization of some or all slots. When you load
an object from the database, you don&#039;t necessarily want to load all
the data up front. Some of it might trigger the loading of arbitrarily
deep graphs of other persistent objects, which is expensive. To do this
you only want to fetch the value of these slots when their value is
read the first time. This doesn&#039;t look like a compelling case for
&lt;code&gt;CHANGE-CLASS&lt;/code&gt;; why would we change our already existing fully
initialized instance into one of these lazily initialized objects?

&lt;p&gt;
Alternatively you might want to attach extra information to slot
descriptors for describing how the data is to be persisted. What&#039;s the
SQL datatype of this field? Is it part of the primary key? Are there
any foreign key constraints? It&#039;d definitely be reasonable to
&lt;code&gt;CHANGE-CLASS&lt;/code&gt; an instance of &lt;code&gt;USER&lt;/code&gt;
to &lt;code&gt;USER*&lt;/code&gt; given the following definitions:

&lt;pre&gt;
  (defclass user ()
    ((uid :accessor uid-of :initarg :uid)
     (username :accessor username-of :initarg :username)
     (password-hash :accessor password-hash-of :initarg :password-hash)))

  (defclass user* ()
    ((uid :accessor uid-of :initarg :uid :primary-key t :sql-datatype &#039;integer)
     (username :accessor username-of :initarg :username
               :unique t :sql-datatype &#039;text)
     (password-hash :accessor password-hash-of :initarg :password-hash
                    :sql-datatype &#039;text))
    (:metaclass db-object)
    (:sql-table &quot;user&quot;))
&lt;/pre&gt;

&lt;p&gt;
But... This doesn&#039;t explain why you&#039;d ever end up with a
&lt;code&gt;USER&lt;/code&gt; instead of a &lt;code&gt;USER*&lt;/code&gt; in the first
place. Persisting objects that you didn&#039;t create but that were
injected to your program by a library seems very odd.

&lt;p&gt;
Another textbook example of non-standard metaclasses is alternative
slot representations. Instead of an instance being essentially a
vector of slot values, it could be a hash-table mapping slot names to
values. The benefit here would be more space-efficient storage for
sparse objects; a class with hundreds of slots most of which never get
initialized. Could you want to swap back and forth between the normal
and the sparse representation? Maybe, but then you&#039;d just implement a
metaclass that automatically chooses the right representation and
switches between them behind the scenes. There&#039;s no point in forcing
the user to switch between the representations manually. This doesn&#039;t
feel plausible either.

&lt;p&gt;
One last try. Pascal Costanza&#039;s &lt;a href=&#039;https://common-lisp.net/project/closer/contextl.html&#039;&gt;ContextL&lt;/a&gt; library extends the support
for dynamic binding in the language, allowing not only dynamically
binding variables but also doing it for functions (including
autogenerated accessor methods) and slot values. The way this kind of
extension would be implemented in a threadsafe manner is by
indirecting the function calls and slot accesses through a special
variable. Which is to say the slot access protocol needs to be
reimplemented, and maybe the autogeneration of accessors
too. Obviously this needs a new metaclass!

&lt;p&gt;
And could you want to change a standard object to one supporting
dynamic binding? That&#039;s actually pretty plausible. A framework injects
some object whose behavior you need to customize on a very
fine-grained level. Dynamically scoped functions seem like a good
tool for that. But it&#039;s still pretty hand-wavey.

&lt;p&gt;
Does anyone have a more concrete example for using
&lt;code&gt;CHANGE-CLASS&lt;/code&gt; primarily in order to switch to a
different metaclass?

&lt;h3&gt;Footnotes&lt;/h3&gt;

&lt;p&gt;
&lt;a href=&#039;#fnref0&#039; id=&#039;fn0&#039;&gt;[0]&lt;/a&gt; Analogous operations are of course
  available in a bunch of languages. The key difference to my mind is
  that in Common Lisp, as with many other dynamic features, there&#039;s a
  well-defined protocol for customizing exactly how the feature works
  and how it&#039;s configured and extended.

&lt;p&gt;
&lt;a href=&#039;#fnref1&#039; id=&#039;fn1&#039;&gt;[1]&lt;/a&gt; Woo, Araneida for the win! Old
  code never dies.

&lt;p&gt;
&lt;a href=&#039;#fnref2&#039; id=&#039;fn2&#039;&gt;[2]&lt;/a&gt; Where &amp;quot;special variable&amp;quot; is
  actually a technical concept, perhaps one of the worst named ones in
  the world :-) Essentially a thread-local dynamically scoped global
  variable, though there&#039;s a couple of extra warts.

&lt;p&gt;
&lt;a href=&#039;#fnref3&#039; id=&#039;fn3&#039;&gt;[3]&lt;/a&gt; Fans of statically typed functional
  programming languages with pattern matching obviously have the
  opposite preference. What I find interesting is that a lot of the
  impetus for &lt;code&gt;CHANGE-CLASS&lt;/code&gt; comes from wanting to preserve object
  identity and not needing to update all the references to the
  object. The first is a non-issue in functional programming, the
  second is something you need to do anyway.


</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Mon, 27 Jul 2015 21:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2015-07-27-use-cases-for-change-class-in-common-lisp/</guid></item><item><title>A Monte Carlo simulation of Red7</title><link>https://www.snellman.net/blog/archive/2015-03-30-monte-carlo-red7/</link><description>
&lt;p&gt;
&lt;a href=&#039;https://boardgamegeek.com/boardgame/161417/red7&#039;&gt;Red7&lt;/a&gt; is
a very clever little card game, and one of my favorite 2014 releases.
But I have wondered
about the density of meaningful decisions in the game. Sometimes it
doesn&#039;t feel like you have all that much agency, and are just hanging
on in the game with a single valid move every time it&#039;s your turn.

&lt;p&gt;
So here&#039;s some automated exploration of what a game of Red7 actually
looks like from a statistical point of view. The method used here is a
pure Monte Carlo simulation, with the players choosing randomly from
the set of their valid moves.

&lt;p&gt;
Why a Monte Carlo simulation? I started trying to do a full game tree for
a given starting setup but to my surprise the game tree is actually too
large for that to be feasible; 2 weeks of computation even for a
single two player game and a lot of optimization. The branching factor
is just much bigger than it feels like when playing the game.

&lt;read-more&gt;&lt;/read-more&gt;

&lt;h2&gt;The rules&lt;/h2&gt;

&lt;p&gt;
(Skip this section if you&#039;re already familiar with the game. All you need
to know is that we&#039;re using the advanced version of the game but without
the optional special action rules.)

&lt;p&gt;
The rules of the game are very simple. There&#039;s a deck of 49 cards
(7 colors, numbers 1-7 in each color). In the middle is a discard
pile (&amp;quot;canvas&amp;quot;). The color of the topmost card of the discard pile determines
the victory condition. You must be &amp;quot;winning&amp;quot; at the end of each turn
you take, or you&#039;re out of the game.

&lt;p&gt;
There are three options to choose from on your turn. Play a card from your
hand to the table in front of you (your &amp;quot;palette&amp;quot;), discard a card
from your hand to the canvas, or first play a card and then discard a
card. If you discard a card with a number higher than the number of
cards in your palette, you get to draw a card.

&lt;p&gt;
The winning condition is determined based on the color of the canvas
(i.e. top card in discard pile):

&lt;p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td style=&#039;background-color: red&#039;&gt;Red &lt;td&gt;Highest card
&lt;tr&gt;&lt;td style=&#039;background-color: orange&#039;&gt;Orange &lt;td&gt;Most cards of the same number
&lt;tr&gt;&lt;td style=&#039;background-color: yellow&#039; &gt;Yellow &lt;td&gt;Most cards of the same color
&lt;tr&gt;&lt;td style=&#039;background-color: green&#039; &gt;Green &lt;td&gt;Most even cards
&lt;tr&gt;&lt;td style=&#039;background-color: blue; color: white&#039; &gt;Blue &lt;td&gt;Most different colors
&lt;tr&gt;&lt;td style=&#039;background-color: indigo; color: white&#039;&gt;Indigo &lt;td&gt;Longest run of sequential numbers (e.g. 4/5/6)
&lt;tr&gt;&lt;td style=&#039;background-color: violet&#039;&gt;Violet &lt;td&gt;Most cards with a number lower than 4
&lt;/table&gt;

&lt;p&gt;
If two players are tied for the winning condition (e.g. the rule is
green and both of them have three even cards in their palette), the
winner is the player who had a higher card included in their
card combination (cards that didn&#039;t contribute to the winning
condition are ignored for the tie breaker). This is primarily based
on the numeric value of the card. But if two cards have the same
value, the one closer to red in the spectrum wins the tie (e.g. green
5 &amp;gt; indigo 5 &amp;gt; green 4).

&lt;h2&gt;The implementation&lt;/h2&gt;

&lt;p&gt;
(Ignore this section if you&#039;re not interested in the programming, and
skip straight on to the results).

&lt;p&gt;
I suspect that every Common Lisp program will eventually evolve to
using a clever bit-packing of fixnums as its primary data structure.
That&#039;s the case here as well.

&lt;h3&gt;Cards&lt;/h3&gt;

&lt;p&gt;
A card is an integer between 0 and 55 (inclusive). The low 3 bits are
the color, with a 0 being a dummy color that&#039;s not used for anything, 1
for violet going all the way to 7 for red. The next 3 bits are the
card&#039;s numeric value minus one (0-6). Note that with this representation
determining the higher of two cards is simply a matter of
making an integer comparison.

&lt;pre&gt;
(deftype card () &#039;(mod 56))

(defun card-color (card)
  (ldb (byte 3 0) card))

(defun card-value (card)
  (1+ (ash card -3)))
&lt;/pre&gt;
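
&lt;p&gt;
A small worked example of the encoding. Note that
&lt;code&gt;MAKE-CARD&lt;/code&gt; is a helper I&#039;m assuming here just for
illustration; it&#039;s not part of the code above.

&lt;pre&gt;
(defun card-color (card) (ldb (byte 3 0) card))
(defun card-value (card) (1+ (ash card -3)))

;; MAKE-CARD is a hypothetical helper, not part of the original program.
(defun make-card (value color)
  (logior (ash (1- value) 3) color))

(defun demo-cards ()
  (let ((green-5 (make-card 5 4))    ; (5-1)*8 + 4 = 36
        (indigo-5 (make-card 5 2))   ; (5-1)*8 + 2 = 34
        (green-4 (make-card 4 4)))   ; (4-1)*8 + 4 = 28
    (list green-5 indigo-5 green-4
          (card-value green-5) (card-color green-5)
          ;; Same ordering as the tie breaker example in the rules.
          (&gt; green-5 indigo-5 green-4))))
&lt;/pre&gt;

&lt;p&gt;
&lt;code&gt;(demo-cards)&lt;/code&gt; should give
&lt;code&gt;(36 34 28 5 4 T)&lt;/code&gt;: the plain integer comparison
agrees with the green 5 &amp;gt; indigo 5 &amp;gt; green 4 ordering from
the rules.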

&lt;p&gt;
We&#039;ll also need a way to represent a set of cards, for a player&#039;s
hand or palette. We&#039;re going to use a 56-bit integer for that, with
bit X being 1 if the set contains card X.

&lt;pre&gt;
(deftype card-set () &#039;(unsigned-byte 56))
&lt;/pre&gt;

&lt;p&gt;
Adding and removing cards is simple. (Except how annoying is it that
SETF LOGBITP is not specified in the standard?).

&lt;pre&gt;
(defun remove-card (card card-set)
  (logandc2 card-set (ash 1 card)))

(defun add-card (card card-set)
  (logior card-set (ash 1 card)))

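;; Create a new set from a list of cards. ADD-CARD takes the card
;; first and the set second, hence :FROM-END; the empty set is 0.
(defun make-card-set (cards)
  (reduce #&#039;add-card cards :from-end t :initial-value 0))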
&lt;/pre&gt;
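
&lt;p&gt;
A quick usage sketch of the set operations, with the two primitives
repeated so that the snippet stands on its own. The card numbers are
just example values (36 is green 5 and 28 is green 4 in the encoding
above).

&lt;pre&gt;
(defun add-card (card card-set) (logior card-set (ash 1 card)))
(defun remove-card (card card-set) (logandc2 card-set (ash 1 card)))

(defun demo-card-set ()
  (let ((hand 0))                       ; 0 is the empty set
    (setf hand (add-card 36 hand))      ; green 5
    (setf hand (add-card 28 hand))      ; green 4
    (list (logbitp 36 hand)             ; membership test
          (logcount hand)               ; number of cards in the set
          (logbitp 36 (remove-card 36 hand)))))
&lt;/pre&gt;

&lt;p&gt;
That should return &lt;code&gt;(T 2 NIL)&lt;/code&gt;. Membership and set size
come for free from LOGBITP and LOGCOUNT.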

&lt;p&gt;
We&#039;ll also need to be able to iterate through all the cards in a set.
This is most easily achieved by using INTEGER-LENGTH to find the highest
bit currently set, executing the loop body, clearing out the highest bit,
and carrying on.

&lt;pre&gt;
(defmacro do-cards ((card card-set) &amp;body body)
  (let ((modified-set (gensym)))
    `(loop with ,modified-set of-type card-set = ,card-set
           until (zerop ,modified-set)
           for ,card = (1- (integer-length ,modified-set))
           do (setf ,modified-set (remove-card ,card ,modified-set))
           do ,@body)))
&lt;/pre&gt;
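
&lt;p&gt;
The same highest-bit-first walk can be sketched as a plain function
that collects the cards into a list, which makes the iteration order
easy to see. &lt;code&gt;CARDS-DESCENDING&lt;/code&gt; is a name made up for
this example.

&lt;pre&gt;
(defun cards-descending (card-set)
  &quot;Return the cards in CARD-SET as a list, highest card first.&quot;
  (loop until (zerop card-set)
        collect (let ((card (1- (integer-length card-set))))
                  ;; Clear the bit we just found, then carry on.
                  (setf card-set (logandc2 card-set (ash 1 card)))
                  card)))
&lt;/pre&gt;

&lt;p&gt;
For a set containing cards 28, 34 and 36 this should produce
&lt;code&gt;(36 34 28)&lt;/code&gt;.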

&lt;h3&gt;Scoring&lt;/h3&gt;

&lt;p&gt;
With these primitives we can then write a very fast function to
determine who is currently winning the game. We&#039;ll base this
evaluation function on scoring a combination of a palette + rule, and
comparing the score that each player gets with the current rule. This
is a much better way than trying to directly compare the palettes.
If you&#039;re caching this evaluation function, you get a much higher
cache hit rate when the cache key depends only on the state of one
player rather than a combined state of two players.
(I&#039;m also pretty sure that given this data layout, computing a score
will be faster than any kind of direct comparison).

&lt;p&gt;
Let&#039;s start off with the general structure, and fill in the details as
functions under LABELS afterwards. So given a card-set and a color,
we&#039;ll return a score for that set:

&lt;pre&gt;
(defun card-set-score (card-set type)
  (labels (...)
    (ecase type
      (7 (red))
      (6 (orange))
      (5 (yellow))
      (4 (green))
      (3 (blue))
      (2 (indigo))
      (1 (violet)))))
&lt;/pre&gt;

&lt;p&gt;
Red (highest card) is trivial. We just find the highest card in the set
with a call to INTEGER-LENGTH.

&lt;pre&gt;
           (red ()
             (integer-length card-set))
&lt;/pre&gt;

&lt;p&gt;
For other rules we can make good use of the following helper function. It
matches the set against a bitmask, and returns a score based on the
number of bits that are set both in the set and the mask (main part of
score) which we get with LOGCOUNT, as well as the highest bit set in
both (the tiebreaker). Given this definition, most of the scoring
types can be written in a very concise manner:

&lt;pre&gt;
           (score-for-mask (mask)
             (let* ((matching-cards (logand card-set mask))
                    (match-count (logcount matching-cards))
                    (best-matching-card (integer-length matching-cards)))
               (+ best-matching-card (* 64 match-count))))
&lt;/pre&gt;
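
&lt;p&gt;
To see how the count and the tiebreaker combine, here&#039;s a standalone
variant of the helper that takes the set as an explicit argument
(the real one is a local function closing over
&lt;code&gt;CARD-SET&lt;/code&gt;), applied to two made-up palettes with the
mask for even cards.

&lt;pre&gt;
;; Standalone variant of the helper; CARD-SET is an explicit argument.
(defun score-for-mask (card-set mask)
  (let ((matching (logand card-set mask)))
    (+ (integer-length matching)
       (* 64 (logcount matching)))))

(defun demo-green-scores ()
  (let ((even-mask #x00ff00ff00ff00)
        ;; green 4 (28), red 2 (15), violet 7 (49)
        (palette-a (logior (ash 1 28) (ash 1 15) (ash 1 49)))
        ;; indigo 6 (42) only
        (palette-b (ash 1 42)))
    (list (score-for-mask palette-a even-mask)     ; 29 + 2*64
          (score-for-mask palette-b even-mask))))  ; 43 + 1*64
&lt;/pre&gt;

&lt;p&gt;
The result should be &lt;code&gt;(157 107)&lt;/code&gt;. Two even cards beat a
single even card no matter how high that card is, since the
tiebreaker part of the score always stays below 64.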

&lt;p&gt;
For orange (cards of one number) we start with a bitmask that matches
all bits corresponding to a card with the value 7. We compute the score
for that mask, then shift the mask right by 8 bits such that it covers
the cards with the value 6. Repeat 7 times, and find the maximum score.
(We don&#039;t need to know which iteration produced the highest score, only
what the score was).

&lt;pre&gt;
           (orange ()
             (loop for mask = #xff000000000000 then (ash mask -8)
                   repeat 7
                   maximize (score-for-mask mask)))
&lt;/pre&gt;

&lt;p&gt;
Yellow (most cards of the same color) is very similar. We start off
with a bitmask that matches all the red cards (so bit 55, 47, 39, etc)
and compute the score. Then shift it right by one, such that the mask
matches all orange cards instead. Again repeat 7 times and maximize.

&lt;pre&gt;
           (yellow ()
             (loop for mask = #x80808080808080 then (ash mask -1)
                   repeat 7
                   maximize (score-for-mask mask)))
&lt;/pre&gt;

&lt;p&gt;
Green (most even cards) and violet (most cards under 4) are trivial;
we can just score a single mask matching the even cards for green,
all cards of value 1, 2 or 3 for violet.

&lt;pre&gt;
           (green ()
             (score-for-mask #x00ff00ff00ff00))
           (violet ()
             (score-for-mask #x00000000ffffff))
&lt;/pre&gt;

&lt;p&gt;
Blue (most cards of different colors) is where we get into unintuitive
territory. Let&#039;s start with the tiebreaker; it&#039;s obviously guaranteed
that the highest card in the palette as a whole can be included in this
winning set, so we can just use INTEGER-LENGTH on the whole set the
same way we did for the red scoring rule.

&lt;p&gt;
To get the number of different colors, we will fold the cardset multiple
times. First we&#039;ll do a bitwise OR of the high 32 bits and the low 32
bits. Then we&#039;ll OR bits 0-15 of that result with bits 16-31. And
finally one more OR of bits 0-7 with 8-15. The low 8 bits are now such
that bit 7 is set if any of the &amp;quot;red&amp;quot; bits in the original were set,
bit 6 if any of the &amp;quot;orange&amp;quot; bits, etc. We can then just use LOGCOUNT on
that byte to get the number of colors present in the palette, and combine
it together with the tiebreaker score computed above.

&lt;pre&gt;
           (blue ()
             (let* ((palette card-set)
                    (best-card (integer-length palette)))
               (setf palette (logior palette (ash palette -32)))
               (setf palette (logior palette (ash palette -16)))
               (setf palette (logior palette (ash palette -8)))
               (+ best-card
                  (* 64 (logcount (ldb (byte 8 0) palette))))))
&lt;/pre&gt;
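
&lt;p&gt;
The folding step on its own, as a sketch with a small worked example.
&lt;code&gt;COLOR-COUNT&lt;/code&gt; is a name invented here; it&#039;s just the
counting half of the function above without the tiebreaker.

&lt;pre&gt;
(defun color-count (card-set)
  &quot;Number of distinct colors among the cards in CARD-SET.&quot;
  (let ((folded card-set))
    (setf folded (logior folded (ash folded -32)))
    (setf folded (logior folded (ash folded -16)))
    (setf folded (logior folded (ash folded -8)))
    ;; Bit N of the low byte is now set if any card has color N.
    (logcount (ldb (byte 8 0) folded))))
&lt;/pre&gt;

&lt;p&gt;
With green 5 (36), green 4 (28) and red 2 (15) in the set, the low
byte ends up with only bits 4 and 7 set, so the count should be 2.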

&lt;p&gt;
Finally, there&#039;s indigo (longest straight). There does not appear
to be any clever bit manipulation trick to compute this quickly
(if you can think of one, please let me know!). We need to iterate through the
cards in order of descending value, ignore any consecutive cards with
the same number, and reset our scoring computation when the straight
gets interrupted by a missing number.

&lt;pre&gt;
           (indigo ()
             (let ((prev nil)
                   (current-run-score 0)
                   (best-score 0))
               (declare (type (unsigned-byte 16) current-run-score best-score))
               (do-cards (card card-set)
                 (cond ((not prev)
                        (setf current-run-score card)
                        (setf prev card))
                       ((= (card-value card) (card-value prev)))
                       ((= (card-value card) (1- (card-value prev)))
                        (incf current-run-score 64)
                        (setf prev card))
                       (t
                        (setf current-run-score card)
                        (setf prev card)))
                 (setf best-score (max best-score current-run-score)))
               best-score))
&lt;/pre&gt;

&lt;h3&gt;Players&lt;/h3&gt;

&lt;p&gt;
A player is defined as a normal structure, with the only oddity being
that they form a circular linked list using the NEXT slot. This tends
to be more convenient for iterating through players in turn order
than keeping them stored in an external collection of some sort.

&lt;pre&gt;
(defstruct (player)
  (id 0 :type (mod 5))
  eliminated
  (hand 0 :type card-set)
  (palette 0 :type card-set)
  (score-cache (make-array 16) :type (simple-vector 16))
  (next nil :type (or null player)))
&lt;/pre&gt;

&lt;p&gt;
The core operation of generating a list of valid moves is deciding
whether the player is winning the game after a move is
made. When doing this we&#039;ll end up repeatedly evaluating the scores for
the same palettes over and over again. To speed this up, there&#039;s a
minimal cache; for each player / rule combination we store both the
last palette we evaluated for that rule, as well as the score.

&lt;pre&gt;
(defun player-score (player rule)
  (declare (type (mod 8) rule))
  (let* ((palette (player-palette player))
         (cache (player-score-cache player))
         (cached-key (aref cache rule)))
    (if (eql cached-key palette)
        (aref cache (+ rule 8))
        (progn
          (setf (aref cache rule) palette)
          (setf (aref cache (+ rule 8))
                (card-set-score palette rule))))))
&lt;/pre&gt;

&lt;p&gt;
Given that way to score a player against a rule, we can then check whether
the current player is winning the game with the rule.

&lt;pre&gt;
(defun player-is-winning (player rule)
  (loop with orig-player = player
        with orig-score of-type fixnum = (player-score player rule)
        for player = (player-next orig-player) then (player-next player)
        until (eql player orig-player)
        do (when (&gt;= (the fixnum (player-score player rule))
                     orig-score)
             (return-from player-is-winning nil)))
  t)
&lt;/pre&gt;

&lt;p&gt;
We can then generate all valid moves by iterating through all the
PLAY, PLAY+DISCARD, and DISCARD combinations for the player&#039;s current
state, and collecting the ones that result in the player winning.

&lt;pre&gt;
(defun valid-moves (player current-rule)
  (let (valid-moves)
    (labels ((check-discard (play-card)
               (do-cards (discard-card (player-hand player))
                 (unless (or (eql play-card discard-card)
                             ;; Filter out cases where player discards a card
                             ;; without changing rule or gaining a new card.
                             (and (eql current-rule (card-color discard-card))
                                  (&gt;= (logcount (player-palette player))
                                      (card-value discard-card))))
                   (when (player-is-winning player (card-color discard-card))
                     (push (cons (cons :play play-card)
                                 (cons :discard discard-card))
                           valid-moves)))))
             (check-plays ()
               (do-cards (play-card (player-hand player))
                 (setf (player-palette player)
                       (add-card play-card (player-palette player)))
                 (when (player-is-winning player current-rule)
                   (push (cons :play play-card) valid-moves))
                 (check-discard play-card)
                 (setf (player-palette player)
                       (remove-card play-card (player-palette player))))))
      (check-plays)
      (check-discard nil))
    valid-moves))
&lt;/pre&gt;

&lt;h3&gt;Other stuff&lt;/h3&gt;

&lt;p&gt;
There&#039;s a little bit more code required to generate the scaffolding
for a game, and to actually do the random walk through the game
tree. None of that code is particularly interesting, nor are the
INLINE or TYPE declarations that you&#039;d need to sprinkle on the above
code to make it fast. The &lt;a
href=&#039;https://github.com/jsnell/red7&#039;&gt;full code&lt;/a&gt; is available on GitHub.
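
&lt;p&gt;
For the curious, the shape of a random walk is roughly the following.
This is a generic sketch and not the code from the repository:
&lt;code&gt;RANDOM-PLAYOUT&lt;/code&gt; and its two function arguments are
assumptions made for the example, and it ignores everything specific
to Red7, such as eliminations, drawing cards and passing the turn.

&lt;pre&gt;
;; RANDOM-PLAYOUT, VALID-MOVES-FN and APPLY-MOVE-FN are hypothetical names.
(defun random-playout (state valid-moves-fn apply-move-fn)
  &quot;Make uniformly random valid moves from STATE until none remain.
Returns the final state and the number of moves made.&quot;
  (loop for moves = (funcall valid-moves-fn state)
        while moves
        do (setf state (funcall apply-move-fn
                                state
                                (nth (random (length moves)) moves)))
        count t into turns
        finally (return (values state turns))))
&lt;/pre&gt;

&lt;p&gt;
Running a large number of these from the same initial state and
aggregating the outcomes is all there is to a pure Monte Carlo
simulation.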

&lt;h3&gt;Performance&lt;/h3&gt;

&lt;p&gt;
In the optimal case of trying to iterate through the whole game tree
in a 2p game, the average cost of making a move is about 500 cycles,
with my desktop doing 7 million moves per second. This is however
amortizing the cost of computing the set of valid moves across all of
those moves (since in a full search every valid move gets
executed). If you&#039;re just doing a pure random walk with no
backtracking, you&#039;d get no amortization at all. That effect makes an
order of magnitude difference.

&lt;p&gt;
But it&#039;s funny that the biggest profiler hotspot in the program is the
PLAYER-SCORE function. Which, if you remember, will simply do an array
lookup to get the previous cache key, compare it to the card-set that
should be evaluated, and either return a previous result or call out
to the real scoring function. The function does basically nothing, but
it does nothing really often. When all of the things of substance are
pretty fast as well, it&#039;s maybe not a surprise that the bottleneck ends
up in a place like that.

&lt;h2&gt;Results&lt;/h2&gt;

&lt;p&gt;
(Skip this section if you&#039;re not actually interested in the game, and
just wanted to read some Common Lisp code).

&lt;p&gt;
The following results are computed from running simulations of 10k
different initial setups, with 100k matches for each simulation with
each player making random but valid moves. (So a total of one billion
games). All plays were with 3 players, the only player count I
consider worth playing.

&lt;p&gt;
As a sanity check, I ran a smaller simulation of 1000 initial setups
where the players would not play a card + discard, if just playing that
same card was sufficient to get into the lead without a discard. The
results were very close to the large fully random simulation
(e.g. the average game length was 14.6 instead of 14.1 turns, and
the win percentage of the best turn order position was 39% rather than
40%).

&lt;p&gt;
Finally, an even smaller scale experiment had the AIs use move
selection heuristics very similar to those I personally use when
playing the game. Those results didn&#039;t differ materially from random
play either.

&lt;h3&gt;Caveats&lt;/h3&gt;

&lt;p&gt;
Unless stated otherwise, all of the numbers are from games with
players making completely random moves. It is possible that the
aggregate statistics are different when players consciously build
toward palettes that are strong in multiple scoring rules, or strong
in rules that they have a lot of cards in hand for.

&lt;p&gt;
The games are always played with the full deck. In reality the deck
slowly depletes from hand to hand as cards are moved to the
players&#039; scoring piles.

&lt;h3&gt;Starting player effect&lt;/h3&gt;

&lt;p&gt;
One thing I was curious about is whether the starting player has an
advantage, a disadvantage, or neither. It&#039;s not obvious, since there
are effects both ways.

&lt;p&gt;
The case for a disadvantage: Running out of cards means losing the
game, and all other things being equal the first player will also
run out of cards first. Due to the way in which the player order is
picked, the last player is also guaranteed to have the highest value
starting card in their palette giving them a leg up on winning future
tiebreakers.

&lt;p&gt;
The case for an advantage: The earlier in turn order a player is, the
fewer cards the opponents have in their palettes. It&#039;s much easier to
pass two players with one card each, than two players with two cards
each. And this effect continues throughout the game, so it should
accumulate over time.

&lt;p&gt;
It turns out that at least with undirected random play there&#039;s a major
disadvantage to being first. It could be that the effect is smaller
when players are making &amp;quot;good&amp;quot; moves.

&lt;p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt;Position&lt;td&gt;Win rate
&lt;tr&gt;&lt;td&gt;1st&lt;td&gt; 27.20%
&lt;tr&gt;&lt;td&gt;2nd&lt;td&gt; 32.42%
&lt;tr&gt;&lt;td&gt;3rd&lt;td&gt; 40.37%
&lt;/table&gt;

&lt;h3&gt;Number of possible moves&lt;/h3&gt;

&lt;p&gt;
As mentioned above, the branching factor in the game was higher
than I&#039;d been expecting. There are cases where players have a lot
more moves available than I would have expected.

&lt;p&gt;
The theoretical maximum number of options is 7 + 7 + 7 * 6 = 56, where
a player can get in the lead either by discarding any of their cards,
playing any of their cards, or with a combination of the two. This
situation actually happened a total of 483986 times in 14 billion
moves (0.03% of the time). A lot more common than I would have thought.

&lt;p&gt;
But of course we don&#039;t particularly care about the 0.03% case. The
more common cases are more interesting. The following graph shows how
often you have at least &lt;i&gt;X&lt;/i&gt; moves available in the game.

&lt;p&gt;
&lt;img src=&#039;/blog/stc/images/red7/cumulative.png&#039;&gt;

&lt;p&gt;
For example, you can see that about a third of the time a player had
10 or more options to choose from. It appears that the game is nowhere
near as constrained as I thought, even when playing without the special
action rules.

&lt;h3&gt;Length of game&lt;/h3&gt;

&lt;p&gt;
The average game lasted for 14.2 turns, which is perhaps less than I
expected given 2 of those 14 turns were by definition a player just
dropping out from the match.

&lt;p&gt;
There were some games that already ended on turn 4, which meant that
only two cards were played in the game. That number was a mercifully
low 0.01%. And while there were players who got eliminated before playing
a card, there at least were no games ending in turn 2 or 3 even if that&#039;s
theoretically possible. And a single game lasted all the way to turn
28.

&lt;p&gt;
The following graph shows how large a proportion of the games were still
running on a given turn.

&lt;p&gt;
&lt;img src=&#039;/blog/stc/images/red7/gamelength.png&#039;&gt;


&lt;h3&gt;Effect of player decisions&lt;/h3&gt;

&lt;p&gt;
The final question is about how strongly predetermined a single hand of
Red7 is, and how much a player can affect it.

&lt;p&gt;
We&#039;ve already established that at least with this skill level of play
there&#039;s a very large start player disadvantage, but is that an isolated
issue or does the setup matter even more than that? In these simulations
all players are by definition equally skilled. If the end result of the
game is primarily determined by player skill, you&#039;d thus expect them
to have similar win rates from game to game. So let&#039;s graph the
distribution of per-setup win rates for each starting position:

&lt;p&gt;
&lt;img src=&#039;/blog/stc/images/red7/winrates-density.png&#039;&gt;

&lt;p&gt;
Now, this graph is a little abstract since we&#039;re looking at
probabilities of probabilities. The way to read this is that across
those 10000 starting setups, the most common win percentage for
player 1 (red) across the 100000 games in a specific setup was around
15% (the peak of the red line is at around 0.15). You can see that
the later players in turn order have a graph that&#039;s shifted further
to the right, which is what you&#039;d expect when they have a
substantially higher win percentage. But you can also see that from
any starting position you might get absolutely dismal win rates
(near 0) or very high win rates (over 80%). The ridiculously high
win rates (95%) appear to be purely reserved for the player last
in turn order.

&lt;p&gt;
There were two setups where a player didn&#039;t manage to win even a single
match out of 100000 (in both cases that was player 1). In 25% of the
cases the player with the worst chance of winning a setup had a 10% win
rate or lower, in 7% of the cases a win rate of 5% or lower. It does
appear that within a single hand of Red7, luck plays a massive role.
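
&lt;p&gt;
As a rough illustration of how a statistic like that can be derived,
here is a minimal Common Lisp sketch. It is not the simulator&#039;s code:
the data layout (one list of per-player win counts per setup) and the
function names are assumptions made up for this example.

&lt;pre&gt;&lt;code&gt;;; Win rate of the player who won the fewest games in one setup.
(defun worst-win-rate (win-counts games-per-setup)
  (/ (reduce #&#039;min win-counts) games-per-setup))

;; Fraction of SETUPS in which the worst-off player wins at most
;; THRESHOLD of the games.
(defun fraction-of-setups-at-or-below (setups games-per-setup threshold)
  (/ (count-if (lambda (win-counts)
                 (&lt;= (worst-win-rate win-counts games-per-setup)
                     threshold))
               setups)
     (length setups)))

;; Toy data: 4 setups, 4 players, 100 games per setup.
(defparameter *toy-setups*
  &#039;((5 20 30 45) (25 25 25 25) (0 10 40 50) (12 28 30 30)))

(fraction-of-setups-at-or-below *toy-setups* 100 1/10) ; =&gt; 1/2
&lt;/code&gt;&lt;/pre&gt;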

&lt;p&gt;
Out of all of the questions we&#039;ve been looking at, this is of course
the one where the applicability of a purely random search strategy is
the most questionable. If we&#039;re investigating the effect of player
skill, how can results from the least skillful play imaginable be
relevant? I&#039;m sympathetic to that argument, but before buying into
it I&#039;d really like to understand the mechanism by which one player is
supposed to disproportionately benefit from the random play.

&lt;p&gt;
Also, as mentioned earlier, I tried extending the AIs to be
smarter about selecting each move. This was not based on any kind of
lookahead, but simply the kinds of heuristics I&#039;d usually use myself
when playing the game. If I can get into the lead either by playing a
card or discarding a card (without drawing a new one to replace it),
I&#039;d rather play a card since that&#039;s going to be useful on future
rounds. When choosing which of two cards to play, I&#039;d usually prefer
to play the one that adds strength to more different scoring rules.

&lt;p&gt;
Experiments with one AI player getting use of these kinds of
heuristics while the others played completely randomly did not show a
big effect: the changes in the win rate were on the order of 1-2
percentage points.

&lt;h2&gt;Future work&lt;/h2&gt;

&lt;p&gt;
I might be done with this little project, but if I pick it up again
there&#039;s a couple of obvious directions to take this. Implementing the
optional special action rules would be nice. That&#039;s my preferred form
of the game anyway.

&lt;p&gt;
The more interesting one is to extend the current system to be a full
AI using the Monte Carlo Tree Search approach. This would allow
generating statistics based on &amp;quot;good&amp;quot; play of the game, maybe provide
information on what kinds of moves are in general successful, as well
as give a more conclusive answer to the level of skill the game has.

&lt;p&gt;
The tricky bit with evolving this code to an MCTS is that the system
in the current form would allow the MCTS to exploit knowledge of
future random events and hidden information. It would need to
randomize all card draws (currently deterministic), as well as swap the
opponents&#039; hands for random cards for the duration of the evaluation
phase, and then swap the original deck and original hands back in for
the move execution. That&#039;s going to slow down each individual move
a lot, which is a problem when MCTS will intrinsically require computing
several orders of magnitude more moves than a random walk.
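
&lt;p&gt;
One way to read the hand-swapping step is as a re-deal of everything
the searching player cannot see. The sketch below is only that, a
sketch under assumptions: cards are arbitrary Lisp objects, hands and
the deck are lists, and the names &lt;code&gt;shuffled&lt;/code&gt; and
&lt;code&gt;determinize&lt;/code&gt; are invented for the example. The caller is
expected to keep the original hands and deck around and restore them
before actually executing the chosen move.

&lt;pre&gt;&lt;code&gt;;; Fisher-Yates shuffle of a fresh copy of the list CARDS.
(defun shuffled (cards)
  (let ((v (coerce (copy-list cards) &#039;vector)))
    (loop for i from (1- (length v)) downto 1
          do (rotatef (aref v i) (aref v (random (1+ i)))))
    (coerce v &#039;list)))

;; Pool every hidden card (opponent hands and deck), shuffle the pool,
;; and deal it back out with the same hand sizes. Returns two values:
;; the randomized opponent hands and the randomized deck.
(defun determinize (opponent-hands deck)
  (let ((pool (shuffled (append (apply #&#039;append opponent-hands) deck))))
    (values (loop for hand in opponent-hands
                  collect (loop repeat (length hand)
                                collect (pop pool)))
            pool)))
&lt;/code&gt;&lt;/pre&gt;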
</description><author>jsnell@iki.fi</author><category>GAMES</category><category>LISP</category><pubDate>Mon, 30 Mar 2015 14:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2015-03-30-monte-carlo-red7/</guid></item><item><title>Pretty SBCL backtraces</title><link>https://www.snellman.net/blog/archive/2007-12-19-pretty-sbcl-backtraces.html</link><description>

&lt;p&gt;Every now and then I see complaints about the stacktraces in
SBCL. They contain too little info, or too much info, or are formatted
the wrong way, etc. But the backtrace printing isn&#039;t really any dark magic,
it&#039;s just basic Lisp code. If you don&#039;t like the default format, just
write a new backtrace function that prints something prettier/less
cluttered/more informative/etc.&lt;/p&gt;

&lt;p&gt;For inspiration, below is one implementation, based on a really quick
hack I wrote in answer to a c.l.l post a few weeks ago. In addition to
cosmetic changes, it adds a couple of extra features: printing
filenames and line numbers for the frames when possible, and printing
the values of local variables when possible. Just call
&lt;code&gt;backtrace-with-extra-info&lt;/code&gt; in any condition handler where you&#039;d
normally call &lt;code&gt;sb-debug:backtrace&lt;/code&gt;, or call it from the debugger REPL
instead of using the &lt;code&gt;backtrace&lt;/code&gt; debugger command.&lt;/p&gt;

&lt;p&gt;The code assumes that you&#039;ve got Swank loaded. For best results, compile
your code with &lt;code&gt;(debug 2)&lt;/code&gt; or higher.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(defun backtrace-with-extra-info (&amp;amp;key (start 1) (end 20))
  (swank-backend::call-with-debugging-environment
   (lambda ()
     (loop for i from start to (length (swank-backend::compute-backtrace
                                        start end))
           do (ignore-errors (print-frame i))))))
(defun print-frame (i)
  (destructuring-bind (&amp;amp;key file position &amp;amp;allow-other-keys)
      (apply #&#039;append
             (remove-if #&#039;atom
                        (swank-backend:frame-source-location-for-emacs i)))
    (let* ((frame (swank-backend::nth-frame i))
           (line-number (find-line-position file position frame)))
      (format t &quot;~2@a: ~s~%~
                   ~:[~*~;~:[~2:*    At ~a (unknown line)~*~%~;~
                             ~2:*    At ~a:~a~%~]~]~
                   ~:[~*~;    Local variables:~%~{      ~a = ~s~%~}~]&quot;
              i
              (sb-debug::frame-call frame)
              file line-number
              (swank-backend::frame-locals i)
              (mapcan (lambda (x)
                        ;; Filter out local variables whose values we
                        ;; don&#039;t know
                        (unless (eql (getf x :value) :&amp;lt;not-available&amp;gt;)
                          (list (getf x :name) (getf x :value))))
                      (swank-backend::frame-locals i))))))
(defun find-line-position (file char-offset frame)
  ;; It would be nice if SBCL stored line number information in
  ;; addition to form path information by default. Since it doesn&#039;t
  ;; we need to use Swank to map the source path to a character
  ;; offset, and then map the character offset to a line number
  (ignore-errors
   (let* ((location (sb-di::frame-code-location frame))
          (debug-source (sb-di::code-location-debug-source location))
          (line (with-open-file (stream file)
                  (1+ (loop repeat char-offset
                            count (eql (read-char stream) #\Newline))))))
     (format nil &quot;~:[~a (file modified)~;~a~]&quot;
             (= (file-write-date file)
                (sb-di::debug-source-created debug-source))
             line))))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For example on the following code:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(declaim (optimize debug))
(defun foo (x)
  (let ((y (+ x 3)))
    (backtrace)
    (backtrace-with-extra-info)
    (+ x y)))
(defmethod bar ((n fixnum) (y (eql 1)))
  (foo (+ y n)))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The old backtrace would look like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
1: (FOO 4)
2: ((SB-PCL::FAST-METHOD BAR (FIXNUM (EQL 1)))
    #&amp;lt;unused argument&amp;gt;
    #&amp;lt;unused argument&amp;gt;
    3
    1)
3: (SB-INT:SIMPLE-EVAL-IN-LEXENV (BAR 3 1) #&amp;lt;NULL-LEXENV&amp;gt;)

&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;And the new backtrace like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;1: FOO
   At /tmp/test.lisp:5
   Local variables:
     X = 4
     Y = 7
2: (SB-PCL::FAST-METHOD BAR (FIXNUM (EQL 1)))
   At /tmp/test.lisp:8
   Local variables:
     N = 3
     Y = 1
3: SB-INT:SIMPLE-EVAL-IN-LEXENV
   At /scratch/src/sbcl/src/code/eval.lisp:93 (file modified)
   Local variables:
     ARG-0 = (BAR 3 1)
     ARG-1 = #&amp;lt;NULL-LEXENV&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;An improvement? That&#039;s probably in the eye of the beholder, and
depends on the codebase and the use cases. For example I can imagine
that for large functions showing the values of local variables in the
trace would make it way too spammy. But that&#039;s beside the point: if
the default stacktrace format is making debugging difficult for you,
it&#039;s not hard to customize it.&lt;/p&gt;
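
&lt;p&gt;As for calling it from a condition handler, a minimal sketch might
look like the following. The wrapper name
&lt;code&gt;call-with-backtrace-on-error&lt;/code&gt; and its &lt;code&gt;:printer&lt;/code&gt;
argument are made up for this example; the default simply names the
&lt;code&gt;backtrace-with-extra-info&lt;/code&gt; function defined above.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(defun call-with-backtrace-on-error
    (thunk &amp;key (printer &#039;backtrace-with-extra-info))
  ;; HANDLER-BIND runs the handler before the stack is unwound, so the
  ;; frames that signaled the error are still there to be printed.
  (handler-bind ((error (lambda (condition)
                          (format *error-output* &quot;~&amp;Unhandled ~a: ~a~%&quot;
                                  (type-of condition) condition)
                          (funcall printer))))
    (funcall thunk)))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With the earlier test code loaded, that would be used as
&lt;code&gt;(call-with-backtrace-on-error (lambda () (bar 3 1)))&lt;/code&gt;.
The handler returns normally, so the error still propagates to any
outer handler or to the debugger after the trace has been printed.&lt;/p&gt;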
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Thu, 20 Dec 2007 00:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-12-19-pretty-sbcl-backtraces.html</guid></item><item><title>Faster SBCL hash-tables</title><link>https://www.snellman.net/blog/archive/2007-10-01-faster-sbcl-hash-tables.html</link><description>

&lt;p&gt;Long time, no blog. I have an excuse though, since I moved to
Switzerland for a
&lt;a href=&quot;http://www.google.ch/intl/en/jobs/index.html&quot;&gt;new job&lt;/a&gt; a month ago, and
haven&#039;t had a lot of time for things like blogging or hacking
Lisp (the latter is usually a prerequisite for the former for me).&lt;/p&gt;

&lt;p&gt;Anyway, I finally finished and committed the third rewrite of my patch
for speeding up the embarrassingly slow hash-tables in SBCL. It turned
out to be a really frustrating game of whack-a-mole, with every change
uncovering either some new deficiency or another interaction between
the GC and the hash-tables that the old implementation had handled by
always inhibiting GC during a hash-table operation.&lt;/p&gt;

&lt;p&gt;The main user-visible change is that SBCL no longer does its own
locking for hash-tables (the fact that it locked the tables was always
just an implementation detail, not a part of the public interface).
This follows the usual SBCL policy of requiring applications to
take care of locking themselves when sharing data structures between threads.&lt;/p&gt;
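
&lt;p&gt;For a table that really is shared between threads, the application-side
locking could look roughly like this. It is only a sketch: the
&lt;code&gt;*cache*&lt;/code&gt; table and the &lt;code&gt;cache-get&lt;/code&gt; and
&lt;code&gt;cache-put&lt;/code&gt; wrappers are hypothetical names, while
&lt;code&gt;sb-thread:make-mutex&lt;/code&gt; and &lt;code&gt;sb-thread:with-mutex&lt;/code&gt;
are the SBCL-specific parts.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(defvar *cache* (make-hash-table :test &#039;equal))
(defvar *cache-lock* (sb-thread:make-mutex :name &quot;cache lock&quot;))

;; Every access to the shared table goes through the same mutex.
(defun cache-get (key)
  (sb-thread:with-mutex (*cache-lock*)
    (gethash key *cache*)))

(defun cache-put (key value)
  (sb-thread:with-mutex (*cache-lock*)
    (setf (gethash key *cache*) value)))
&lt;/code&gt;&lt;/pre&gt;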

&lt;p&gt;The exact details are pretty boring, so I won&#039;t repeat them here (read
the
&lt;a href=&quot;http://sbcl.boinkor.net/gitweb?p=sbcl.git;a=commitdiff;h=f318d0b1654042ed8f1b887852a9ba1f539208e4&quot;&gt;commit message&lt;/a&gt; if you really want to know). Instead I&#039;m just going
to post a pretty benchmark graph, since it&#039;s been way too long since I
last did one of these:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/stc/images/sbcl-hash.png&quot;&gt;&lt;/img&gt;&lt;/p&gt;

&lt;p&gt;Sadly those improvements don&#039;t mean that SBCL now has the fastest
hash-tables in the West, it just means they don&#039;t completely suck.
For some reason the issue of SBCL hash-table speed has come up more
often during the last couple of months than during the previous three
years combined, so it was probably time to get this sorted out.&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Mon, 01 Oct 2007 05:15:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-10-01-faster-sbcl-hash-tables.html</guid></item><item><title>ICFP 2007</title><link>https://www.snellman.net/blog/archive/2007-07-25icfp-2007.html</link><description>

&lt;p&gt;For the last five years or so it&#039;s always been my firm intent to take
part in the &lt;a href=&quot;http://www.icfpcontest.org/&quot;&gt;programming contest&lt;/a&gt; associated
with the International Conference on Functional Programming (ICFP). And
each year
something has prevented it. But this year there was no emergency at
work, no computer hardware broke, no sisters were getting married, etc.
So instead of playing poker on the net, which had been consuming all
of my free time for the last couple of weeks, I read the 22 page spec
and fired up emacs. (Just kidding, emacs was already running).&lt;/p&gt;

&lt;p&gt;The surface task was to write an interpreter for a weird
string-rewriting language. The organizers supplied a large blob of
data, which when run through the interpreter would produce as output
some image drawing operations
(for which you basically had to write some kind of a visualizer if
you wanted to achieve anything). The goal was to come up with some
prefix to the program which would make it instead produce output that
would be as close as possible to a certain target image.&lt;/p&gt;

&lt;p&gt;The intended way to achieve that goal was to notice that the drawing
operations generated by the blob would first write a clue message,
which would then be hidden in the final image by other image operations.
This seems like a really
bad decision. I luckily noticed the message since my first version
of the image conversion tool didn&#039;t support the flood fill operation.
But apparently a lot of teams never saw the message, and were left to
stumble in the dark for the whole weekend. The image that could be
drawn by using the clue would then lead to another obscure puzzle. Again,
I was lucky to figure out the solution after a while, but judging by IRC
and mailing list traffic a huge number of teams never got it, and were
basically stuck.&lt;/p&gt;

&lt;p&gt;That clue could then finally be used to produce some concrete details
on how the big blob of data was using the string-rewriting language to produce
the image. There was even a catalog of the functions that the blob contained.
But the really useful data seemed to be hidden behind yet more
puzzles. So at this point I just did a minimal hack to make a token improvement
to the produced image: the source image had a grove of apple trees, the
target had pear trees. And according to the catalog the function &lt;code&gt;apple_tree&lt;/code&gt;
was exactly as large as &lt;code&gt;pear_tree&lt;/code&gt;. So I wrote a prefix that overwrote the
former with the latter. And then I
submitted that prefix, and switched to doing something more interesting.
(I think that token improvement was still in the top 20 something like 8 hours
before the contest ended, which probably says something about how much
progress people were making).&lt;/p&gt;

&lt;p&gt;I did rather enjoy writing the interpreter and the visualization tool, and
the specifications for both were mostly very good. Unfortunately the
spec contained
only a couple of trivial test cases with the expected results, so if
your interpreter had a problem, figuring out what exactly was going wrong
just from looking at execution traces was really hard. The organizers
originally replied on the mailing list that such debugging
&quot;is exactly part of the task&quot;, but later released an example trace from
a few iterations at the start. There was a documented prefix that
would run some tests on the implementation, and generate an image from those
results, but the coverage of those tests didn&#039;t seem to be very good.
(I had several bugs that only showed up with the full image, not with
the test one).&lt;/p&gt;

&lt;p&gt;The part of the interpreter that many teams seemed to have big trouble
with was that you couldn&#039;t really use a basic string or array to represent
the program. If you did, performance would be orders of magnitude too slow
(people were reporting getting 1 iteration / second, when drawing the basic
image would require 2 million iterations) due to excessive copying of
strings. Now, this was even pointed out in the specification!
Paraphrasing: &quot;these two operations need to run faster than in linear time&quot;.
And still people tried to use strings, bawled when their stupid implementation
wasn&#039;t fast enough, and decided that the only solution would be to rewrite
their program in C++ instead of their favourite Haskell/Ocaml/CL. Good grief...&lt;/p&gt;

&lt;p&gt;For what it&#039;s worth, I used just about the stupidest imaginable
implementation strategy beyond just a naive string: represent the
program as a linked list of
variable length chunks, which will share backing storage when possible.
My first CL implementation of this ran at about 5.5k iterations / second. This
was good enough at the stage in the competition that I got to, and
would&#039;ve been easy to optimize further if I&#039;d decided to continue
(afterwards I made a 15 line change that gave an 8x speedup, so the basic
image now only takes 41 seconds to render on an Athlon x2 3800+).
And this was with a stupid data structure and a couple of minor performance
hacks. It seems obvious that practically any language could have been
used to write a sufficiently fast interpreter. It never ceases to amaze me
how programmers would rather blame their tools than think about the problem
for a couple of minutes.&lt;/p&gt;
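
&lt;p&gt;To make the chunk idea concrete, here is a small sketch of such a
representation. It is a reconstruction from the description above, not
the contest code, and all the names (&lt;code&gt;chunk&lt;/code&gt;,
&lt;code&gt;program-drop&lt;/code&gt;, &lt;code&gt;program-take&lt;/code&gt; and so on) are
invented for the example. Dropping a prefix, taking a prefix and
concatenating only allocate small chunk headers; the character data is
never copied.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;;; A chunk is a window onto a backing string: characters START..END-1.
(defstruct (chunk (:constructor make-chunk (data start end)))
  data start end)

(defun chunk-length (chunk)
  (- (chunk-end chunk) (chunk-start chunk)))

;; A program is a list of chunks.
(defun program-from-string (string)
  (list (make-chunk string 0 (length string))))

;; Drop the first N characters, sharing all backing strings.
(defun program-drop (program n)
  (loop while (and program (&gt;= n (chunk-length (first program))))
        do (decf n (chunk-length (pop program))))
  (if (and program (plusp n))
      (let ((c (first program)))
        (cons (make-chunk (chunk-data c) (+ (chunk-start c) n) (chunk-end c))
              (rest program)))
      program))

;; The first N characters, again sharing the backing strings.
(defun program-take (program n)
  (loop for c in program
        while (plusp n)
        collect (let ((len (min n (chunk-length c))))
                  (decf n len)
                  (make-chunk (chunk-data c)
                              (chunk-start c)
                              (+ (chunk-start c) len)))))

;; Concatenation is plain list append.
(defun program-append (&amp;rest programs)
  (apply #&#039;append programs))

;; Flatten to a string; only needed for output and debugging.
(defun program-string (program)
  (with-output-to-string (out)
    (dolist (c program)
      (write-string (chunk-data c) out
                    :start (chunk-start c) :end (chunk-end c)))))

(let ((p (program-from-string &quot;abcdefgh&quot;)))
  (program-string (program-append (program-take p 2)
                                  (program-drop p 6)))) ; =&gt; &quot;abgh&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The cost of each operation depends on the number of chunks rather
than the number of characters, which is the whole point, but it also
means the list slowly fragments as rewrites pile up.&lt;/p&gt;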

&lt;p&gt;Anyway, the organizers obviously put in a huge effort for this contest,
so thanks to them for that. It&#039;s just that the format really wasn&#039;t
what I was looking for in a programming contest. But at least it was
interesting enough to temporarily shake me out of playing poker into
doing some hacking again :-)
(Faster SBCL hash tables coming soon, I hope).&lt;/p&gt;

&lt;p&gt;I&#039;ve made the &lt;a href=&quot;/blog/stc/files/icfp2007-execute.lisp&quot;&gt;source code&lt;/a&gt;
for the interpreter available since several people have asked for it.
I&#039;m not sure &lt;em&gt;why&lt;/em&gt; they&#039;ve asked for it, since it&#039;s not very good code, and
probably contains no worthy insights. But if you want it, it&#039;s there.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;Addendum:&lt;/b&gt; After writing the above, I read a few messages on the mailing
list which claimed that there really wasn&#039;t much of a puzzle aspect, but
that success was mainly determined by the quality of the tools
(compilers, debuggers, disassemblers, etc) you were able to write. While
it&#039;s possible that after the initial two humps that I described above the
puzzles were irrelevant, that wasn&#039;t my impression. At the
point where I stopped, it didn&#039;t feel to me as if sufficient knowledge
was available for writing the tools, but rather was hidden behind
encrypted pages, steganography, etc. None of which I really wanted
to deal with.&lt;/p&gt;

&lt;p&gt;There was definitely enough information available to make
a start at reverse-engineering, but I don&#039;t think there was
enough time to reverse-engineer enough of the system to
figure out how to write the tools, write them, and then use the tools to
actually solve the real problem. I&#039;m sure things were different for larger
teams, but that doesn&#039;t really comfort me as a one person team :-)
My impression is that in the earlier ICFP contests the tasks were
such that it was possible for a single programmer to achieve a
decent result, even if it&#039;s unlikely that it&#039;s good enough to win. In this
case you don&#039;t get any points for the reverse-engineering or for the tools,
but just for the end result.&lt;/p&gt;

&lt;p&gt;(Having written the above, I&#039;m now sure that the eventual winner
will turn out to be a single programmer who only started working on the task
8 hours before the deadline).&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Wed, 25 Jul 2007 07:30:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-07-25icfp-2007.html</guid></item><item><title>Code coverage tool for SBCL</title><link>https://www.snellman.net/blog/archive/2007-05-03-code-coverage-tool-for-sbcl.html</link><description>

&lt;p&gt;SBCL 1.0.5.28 includes an experimental code coverage tool (sb-cover) as a new
contrib module. Basically you just need to compile your code with a special
optimize proclamation, load it, run some tests, and then run a reporting
utility. The reporting utility will produce some html files. One will contain an
aggregate coverage report of your whole system, the others will show your
source code transformed into angry fruit salad:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/blog/stc/images/coverage-cropped.png&quot;&gt;&lt;/img&gt;&lt;/p&gt;

&lt;p&gt;For a more substantial example, &lt;a href=&quot;/sbcl/cover/cl-ppcre-report-3/cover-index.html&quot;&gt;here&#039;s&lt;/a&gt;
the coverage output for the cl-ppcre test suite.&lt;/p&gt;
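
&lt;p&gt;A complete session might look roughly like the sketch below, typed at
the REPL. It follows the interface as documented in the SBCL manual
(&lt;code&gt;sb-cover:store-coverage-data&lt;/code&gt; and
&lt;code&gt;sb-cover:report&lt;/code&gt;); the file names under &lt;code&gt;/tmp&lt;/code&gt;
and the &lt;code&gt;cover-demo-sign&lt;/code&gt; function are placeholders invented
for the example.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(require :sb-cover)

;; Turn on coverage instrumentation for code compiled from here on.
(declaim (optimize sb-cover:store-coverage-data))

;; Write a tiny source file, then compile and load it.
(with-open-file (out &quot;/tmp/cover-demo.lisp&quot;
                     :direction :output :if-exists :supersede)
  (write-line &quot;(defun cover-demo-sign (x) (if (minusp x) -1 1))&quot; out))
(load (compile-file &quot;/tmp/cover-demo.lisp&quot;))

;; &quot;Run the tests&quot;: only the non-negative branch gets exercised.
(cover-demo-sign 5)

;; Produce the HTML report; cover-index.html is the aggregate page.
(sb-cover:report &quot;/tmp/cover-demo-report/&quot;)

;; Turn instrumentation off again.
(declaim (optimize (sb-cover:store-coverage-data 0)))
&lt;/code&gt;&lt;/pre&gt;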

&lt;p&gt;There are still some places where the coverage output won&#039;t be what
most people would intuitively expect. Some, like the handling of inlined
functions, would be simple to solve. It&#039;s just not yet clear to me what
the right solution would be.
For example in the case of inlined functions the right solution might be
suppressing inlining when compiling with coverage instrumentation, or it might
be to say &quot;don&#039;t do that, then&quot; to the users. Others are fundamentally
unsolvable, due to the impossibility of reliably mapping the forms that
the compiler sees back to the exact character position in the source file.
Hopefully this&#039;ll still turn out to be useful in its current state.&lt;/p&gt;

&lt;p&gt;If you have any suggestions for improvements, I&#039;d love to hear them.&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Thu, 03 May 2007 10:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-05-03-code-coverage-tool-for-sbcl.html</guid></item><item><title>ILC 2007 Summary</title><link>https://www.snellman.net/blog/archive/2007-04-11-ilc-2007-summary.html</link><description>

&lt;p&gt;I wrote several almost finished blog posts during ILC, but didn&#039;t
get around to posting them &quot;live&quot; due to the issues with wireless access
and a generic lack of time, due to being off having a jolly good time.
Then I did some more traveling after the
ILC, and didn&#039;t manage to get them posted right afterward either. And
now that I&#039;m finally back home, most of what I wrote then no longer
seems worth posting, since it&#039;s lost the immediacy.&lt;/p&gt;

&lt;p&gt;So here&#039;s a few things that come to mind now.&lt;/p&gt;

&lt;h3&gt;The good&lt;/h3&gt;

&lt;p&gt;The organization was stellar in almost all respects. A huge thanks to
Nick Levine and anyone else who was involved. Cambridge was just incredibly
pretty, and the weather ranged from great to &quot;not bad&quot;. There were some very
good talks, though disappointingly most of the best ones were from Schemers :-)
The last day of talks was particularly good. I had incredible fun
meeting old friends, most of whom I hadn&#039;t seen for a year, putting faces
to names I knew from the net, and talking
to completely new people. Special honorable mentions in the latter category
go to Jeremy Jones and Richard Brooksby, with whom I had several very
interesting and fruitful discussions.&lt;/p&gt;

&lt;p&gt;I also got lots of very valuable SBCL feedback and new ideas, for all
kinds of things from the GC to the user interface for my code coverage
tool for SBCL (work in progress). It looks as if we need to beef up the SBCL marketing
department, though. I had several discussions of the form &quot;Q: What
would it take to make SBCL do FOO? A: It&#039;s already done that for the latest
X releases.&quot;. In the worst case with the same person asking for three
different features in succession, all of which had been implemented :-)
For example no-one seems to be aware that SBCL/Slime have stepper support.
Not horribly &lt;em&gt;good&lt;/em&gt; stepper support, but support nonetheless.
Also got to talk shop with SBCL developers and Clozure/ITA people,
which is always good. And maybe even managed to offload some ideas that
I&#039;d proof-of-concepted, but have no intention of ever properly implementing
myself.&lt;/p&gt;

&lt;p&gt;Got a surprisingly large number of congratulations on graduating. And
the guys had even got me a present (a copy of the Lisp 1.5
manual that Nikodemus had found from a bookshop in Cambridge, MA).
Thanks! Conveniently the title of the programming contest for the
next ILC was pre-announced as &quot;Lisp 1.5&quot;, so the manual might even
be useful, not just cool :-)&lt;/p&gt;

&lt;p&gt;I think the Ravenbrook guys are going to try integrating
MPS with SBCL, since CMUCL
didn&#039;t work out for them. While it&#039;s unlikely to replace the current SBCL
GC for licensing reasons (it&#039;s currently under a GPLish license), it would
be very interesting for two reasons: as a benchmark for the current GC
and as a first step towards pluggable GCs. The first one would be good
since we know that the SBCL memory management is suboptimal in many ways.
It&#039;d be valuable to find out what the real cost of fixing many of those
suboptimalities is. As for pluggable GCs, Frode wrote a nice
&lt;a href=&quot;http://article.gmane.org/gmane.lisp.steel-bank.devel/8931&quot;&gt;message&lt;/a&gt; to
sbcl-devel about that. If MPS is better for someone&#039;s use case than
SBCL&#039;s gencgc and they can live with the license, it&#039;d certainly be
nice for them to be able to just switch GCs. And of course at some
point implement other alternative GCs.&lt;/p&gt;

&lt;p&gt;Compared to the ECLMs, surprisingly many people that I talked to weren&#039;t
yet using
Lisp seriously, but were just interested in it. Some might
think that this is bad, but I think it&#039;s really great that there
are people still in that stage who are interested enough to
travel to and attend a multi-day Lisp conference. And of course there
were a lot more serious Lisp users than newbies.&lt;/p&gt;

&lt;p&gt;Overall my ILC experience was very positive. I&#039;ll talk next about some
bad stuff, but that&#039;s just because I believe that you can&#039;t just sweep
that stuff under the rug.&lt;/p&gt;

&lt;h3&gt;The bad&lt;/h3&gt;

&lt;p&gt;I think that program-wise there was maybe a day of talks that could&#039;ve
been discarded with little loss. Or if not a whole day, then at least
enough to make the rest of the schedule less tight. For example the
History of Lisp presentation was total crap (not just somewhat bad, but
&quot;I&#039;d rather listen to an hour of silence&quot;-bad), and the information
theory one had no business being presented in a Lisp conference. Given
what little I heard of the review process in other cases, I don&#039;t understand
how the latter ever got accepted.&lt;/p&gt;

&lt;p&gt;I understand that people don&#039;t really go to a conference for the talks,
but that doesn&#039;t mean that anything goes. My plea to the next ILC program
committee is threefold:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;
&lt;p&gt;Please invite only speakers with something to say that&#039;s relevant to
Lisp &lt;em&gt;now&lt;/em&gt; or in the future, not in the last millennium.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;More specifically, I&#039;m sure there&#039;s a temptation to &quot;honor&quot; the 50th
birthday of Lisp by historical navel-gazing. Please don&#039;t give in to it.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;If you don&#039;t get enough good submissions, don&#039;t accept the irrelevant
ones as padding.&lt;/p&gt;
&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;My attempts at industrial espionage were mostly a failure. Both
Duane and Jans ran out of time before getting around to stuff that
would&#039;ve been worth stealing. For example Duane didn&#039;t have time
to demo their profiler, which I&#039;d heard
described as the gold standard of Lisp profilers, and of course I can&#039;t
really try it out myself due to the license. I was surprised that the
Allegro equivalent to SBCL&#039;s optimization notes didn&#039;t have any kind
of UI for mapping the notes back to the original source, making it look
mostly useless. Or at least Duane, who is probably an expert
at reading them, did get confused by the results a couple of times
despite it being a scripted demo :-)&lt;/p&gt;

&lt;p&gt;[ Which isn&#039;t to say that Franz&#039;s presentations were bad. I just didn&#039;t
get much out of them SBCL-wise. ]&lt;/p&gt;

&lt;h3&gt;The controversial&lt;/h3&gt;

&lt;p&gt;Some stuff has received a lot of airtime after the conference.&lt;/p&gt;

&lt;p&gt;Before the conference I expressed some puzzlement about there being an
invited talk about CL-HTTP, which I regarded as a choice that was
completely out of touch with the current state of the Lisp world.
Seeing the talk didn&#039;t change my opinion (oh, wow, still using the
White House information system from the Clinton administration as
the example?).
E.g. when Mallery asked who had ever used CL-HTTP, practically
no hands went up, unlike with every other similar question that was asked
during the conference. But amazingly enough, on the last day two
presenters appeared to be seriously using CL-HTTP. (IIRC
they were the RacerPro and XMLisp ones).&lt;/p&gt;

&lt;p&gt;Most of the Allegro features that Duane and Jans had time to show were things
that SBCL already does in some form. It&#039;s just that they&#039;re exporting
their internals, and in some cases the interfaces don&#039;t seem very
polished. I guess READ-LINE-INTO (?) wouldn&#039;t be a bad addition,
but e.g. MEMCPY-UP and MEMCPY-DOWN were just completely wrong.&lt;/p&gt;

&lt;p&gt;So I wasn&#039;t horribly impressed with what they talked about. But unlike
Luke, who was stirring up the debate both at ILC and after, I think that it
is a very worthwhile
goal to give Lisp users access to low level facilities,
and that we really should be supplying non-consing / resource-reusing
versions of functions where possible. STRING-TO-OCTETS and OCTETS-TO-STRING
are an obvious example where SBCL could be improved.&lt;/p&gt;
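
&lt;p&gt;To show what a resource-reusing variant means in practice, here is a
toy sketch. It is not an SBCL interface: the name
&lt;code&gt;latin-1-string-into-octets&lt;/code&gt; and its signature are
hypothetical, and it only handles characters with codes below 256. The
point is just that the caller supplies the buffer, so repeated calls
allocate nothing.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;;; Encode STRING into BUFFER starting at index START. Returns the
;; index one past the last octet written.
(defun latin-1-string-into-octets (string buffer &amp;key (start 0))
  (declare (type string string)
           (type (simple-array (unsigned-byte 8) (*)) buffer))
  (let ((end (+ start (length string))))
    (assert (&lt;= end (length buffer)) () &quot;Buffer too small&quot;)
    (loop for ch across string
          for i from start
          do (setf (aref buffer i) (char-code ch)))
    end))

;; One buffer, allocated once and reused for every call.
(defvar *octet-buffer*
  (make-array 4096 :element-type &#039;(unsigned-byte 8)))

(latin-1-string-into-octets &quot;abc&quot; *octet-buffer*) ; =&gt; 3
&lt;/code&gt;&lt;/pre&gt;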

&lt;p&gt;Yes, it&#039;d be really great to just cons indiscriminately, but no matter
what the GC scheme is, there will be programs where consing will be deadly.
And yes, it&#039;ll mean that code written for performance might be a bit ugly,
but it&#039;s still better than dropping to C from Python for performance, etc.
Of course SBCL users can use many of those low level facilities
right now, but most of them are undocumented and unexported, which sets the
bar for using them pretty high.&lt;/p&gt;

&lt;h3&gt;The end&lt;/h3&gt;

&lt;p&gt;Anyway, it was lots of fun! I hope to see all of you again next year.&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Wed, 11 Apr 2007 22:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-04-11-ilc-2007-summary.html</guid></item><item><title>ILC 2007 MPS Tutorial</title><link>https://www.snellman.net/blog/archive/2007-04-01-ilc-2007-b-mps-tutorial.html</link><description>

&lt;p&gt;Oh, man. My excitement about the CMUCL/MPS integration seems to have
been premature :-)&lt;/p&gt;

&lt;p&gt;Paraphrases from the early part of the MPS tutorial:&lt;/p&gt;

&lt;p&gt;&quot;We didn&#039;t actually get too far with the actual implementation of
MPS and CMUCL, since we were unable to bootstrap CMUCL if we made
any (even tiny) modifications.&quot; (But apparently they have all the
design issues solved).&lt;/p&gt;

&lt;p&gt;&quot;Unfortunately Dave Jones who&#039;s been doing the work on this is
ill and thus not at the conference.&quot;&lt;/p&gt;

&lt;p&gt;&quot;Used CMUCL rather than SBCL since Carl Shapiro had earlier expressed
interest in integrating MPS and CMUCL. No particular reason besides
that.&quot; (In answer to my question about why they didn&#039;t try SBCL if
bootstrapping was a problem).&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Sun, 01 Apr 2007 14:05:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-04-01-ilc-2007-b-mps-tutorial.html</guid></item><item><title>ILC 2007 pre-conference stuff</title><link>https://www.snellman.net/blog/archive/2007-04-01-ilc-2007-a-pre-conference-stuff.html</link><description>

&lt;p&gt;(Stuff from Saturday, before the actual conference starts. Sorry for
any typos, I wrote it late last night after half a bottle of wine, and
didn&#039;t have time to proofread it this morning. And am now in the
middle of a tutorial. I&#039;ll fix it up later.)&lt;/p&gt;

&lt;p&gt;Woke up at 0500. Almost missed the plane lifting off at 0745 despite
that, since taxis were nowhere to be found. Met Martti, fellow
Helsinki Lisper, at Heathrow, and was entertained by his tales of
British engineering for most of the trip from Heathrow to King&#039;s
Cross.&lt;/p&gt;

&lt;p&gt;The conference accommodation is nice, especially for the price. Except
for the British plumbing, but complaining about that is about as
original as complaining about left-side traffic. I got a room on the
top floor, which seems to be an attic that was later converted to
dorms. It looks pretty dramatic, in a good way (with the room being
horseshoe-shaped and varying in height between 4.5 meters and 0.5
meters). Unfortunately I don&#039;t have a camera.&lt;/p&gt;

&lt;p&gt;Cambridge looks really pretty. I haven&#039;t yet random walked around the
city properly, and probably won&#039;t have time to do so on this trip. I
did go to the conference tour, though. Thanks to Martin Simmons for
doing the hard work of punting on the punt that I was on. I didn&#039;t get
horribly much out of the guided walking tour part, but at least it
meant visiting various places that I would never have gone to on my
own.&lt;/p&gt;

&lt;p&gt;The sexp-formatted conference badges that Christophe designed look
sweet, though they&#039;re not a big surprise since I&#039;d seen them in the
earlier stages.&lt;/p&gt;

&lt;p&gt;We had a very nice dinner at a Turkish place that Christophe
recommended, and which surprisingly enough was able to give a table
for 12 with no warning at 1930 on a Saturday. IIRC the name of the
restaurant was Anatolia, and based on some after-the-fact backtracking
the location is off the conference-provided map, but probably on the
Bridge Street that Sidney Street turns into at the intersection
with St. Johns Street. I really liked the food. Didn&#039;t mind the wine
either, though I won&#039;t pretend that I can make any kind of judgment on
its quality.&lt;/p&gt;

&lt;p&gt;All four of tomorrow&#039;s tutorials look interesting, but since they&#039;re in
parallel I can only do two. The MPS tutorial is a must-see for
me. Choosing between industrial espionage (performance tuning in
Allegro) and cool Lisp hacks (ContextL) will be tough.&lt;/p&gt;

&lt;p&gt;It&#039;s now 00:30 (2:30 Finnish time, so I&#039;ve been up for 19+
hours). Time to get some sleep, and hope that I can get this entry
posted tomorrow. No wireless reception in the room, and I couldn&#039;t get
a wireless connection up in the Library Common Room. Some people
reportedly had more luck with it.&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>LISP</category><pubDate>Sun, 01 Apr 2007 13:45:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2007-04-01-ilc-2007-a-pre-conference-stuff.html</guid></item></channel></rss>