This is a post on use cases for Common Lisp's CHANGE-CLASS operation [0]. As the name suggests, it changes the class of an object without changing its object identity. It's an operation that a certain class of programmers would consider totally abhorrent. I think it's both cool and useful.

As far as I can see, the class of an instance has three effects in Common Lisp. It determines the set of slots the object has, it determines which methods will be executed when a generic function is called with that object as one of the arguments, and it determines how the object interacts with the rest of the system based on the metaclass of the class of the object.
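The first two effects are easy to see in a toy example. This is a minimal sketch; the POINT and LABELED-POINT classes and DESCRIBE-POINT are made up for illustration. It uses only standard CLOS, so the metaclass effect is not shown.

```lisp
;; Hypothetical classes, for illustration only.
(defclass point ()
  ((x :initarg :x :accessor x-of)
   (y :initarg :y :accessor y-of)))

(defclass labeled-point (point)
  ((label :initarg :label :accessor label-of)))

(defgeneric describe-point (p))

(defmethod describe-point ((p point))
  (format nil "(~a, ~a)" (x-of p) (y-of p)))

(defmethod describe-point ((p labeled-point))
  (format nil "~a at ~a" (label-of p) (call-next-method)))

(defparameter *p* (make-instance 'point :x 1 :y 2))
(defparameter *alias* *p*)             ; a second reference to the same object

(describe-point *p*)                   ; => "(1, 2)"

;; The object gains a LABEL slot (filled from the initarg) and keeps X and Y.
(change-class *p* 'labeled-point :label "home")

(describe-point *alias*)               ; => "home at (1, 2)"
(eq *p* *alias*)                       ; => T, identity is preserved
```

Every existing reference, here *ALIAS*, sees the new slot and the new method without being touched.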

Why change the class at runtime?

Why would you change the class of an object rather than create a new object as a replacement? Because there might be references to the object all over the place, and updating all of those references to point to the new object might be a lot of work or even impossible.

And why not just create the object with the right class in the first place? Sometimes it's because the object is not created by the application code but in the depths of some library, and changing that library is not feasible. At other times it's because the appropriate class of the object genuinely changes during execution.

And why not use a workaround like some kind of a delegating proxy object instead? Both of the above reasons kind of apply there.

Adding new slots to a class

I recently had to go and change something in the code that runs my blog, for the first time in years and years. Now, this is some crufty code. How crufty, you ask? Well... It runs a web server that was last updated 10 years ago [1]. While spelunking through the code, I found a bit of code that looked essentially like this:

  (defclass blog-request (araneida:request)
    ((db :initarg :db :accessor db-of)
     (buffer-stream :initarg :buffer-stream :accessor buffer-stream-of)))

  (defmethod araneida:handle-request :around ((handler blog-handler) request)
    (clsql:with-database (db *db-spec* :if-exists :new)
      (let ((string (with-output-to-string (stream)
                      ;; Turn the server's REQUEST into a BLOG-REQUEST in place,
                      ;; filling the two new slots from the initargs.
                      (change-class request 'blog-request
                                    :db db
                                    :buffer-stream stream)
                      (call-next-method))))
        ;; Flush the buffered output to the real output stream.
        (write-string string (araneida:request-stream request)))))

What's going on here? Well, we're hooking around the HANDLE-REQUEST generic function of the web server, setting up a couple of state objects (a database connection, a STRING-OUTPUT-STREAM). We then proceed with normal execution of the request handling with CALL-NEXT-METHOD, and write the data that's been buffered in the STRING-OUTPUT-STREAM into the normal output stream.

The core problem here is that the state data needs to be threaded down the call stack to where it's actually used. Since we're doing all of this from the middle of third party code, changing the function signatures is not an option. So we change the class of the web server's request object from REQUEST to BLOG-REQUEST (a subclass), and stuff the state objects into the slots that have now appeared in the object.

The natural way of writing this in Common Lisp would probably be to use special variables [2]. I think the reason I didn't go that route was that, way back when, I was not running each request in a separate thread, but was using SERVE-EVENT, SBCL's rather bizarre recursive event loop, which really doesn't play well with special variables. But it's also not always the case that the lifetime of the additional data is determined by a particular dynamic extent.
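For comparison, the special variable version might look roughly like this. It's a sketch only: *DB*, *BUFFER-STREAM*, RENDER-PAGE and HANDLE-WITH-SPECIALS are hypothetical names, and a string stands in for the database connection.

```lisp
(defvar *db* nil "Hypothetical per-request database handle.")
(defvar *buffer-stream* nil "Hypothetical per-request output buffer.")

(defun render-page ()
  ;; Somewhere deep in the call stack: the state arrives through the
  ;; dynamic bindings, not through extra function parameters.
  (format *buffer-stream* "rendered with ~a" *db*))

(defun handle-with-specials (db)
  (with-output-to-string (stream)
    ;; The bindings are visible only for the dynamic extent of this LET.
    (let ((*db* db)
          (*buffer-stream* stream))
      (render-page))))

(handle-with-specials "fake-db")   ; => "rendered with fake-db"
```

The state disappears as soon as the LET exits, which is exactly the limitation mentioned above when the data needs to outlive a single dynamic extent.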

Another typical solution for attaching extra data to an object would be storing the extra information in a weak-keyed hash table with the objects as a key, and making that hash-table accessible in all of the places where this extra data is needed (most likely as a global variable). As far as I'm concerned, that's just gross.
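To make the comparison concrete, here's a sketch of that side-table approach. The names *REQUEST-EXTRAS* and REQUEST-EXTRA are made up. Weak hash tables aren't part of the standard; the :WEAKNESS argument is an SBCL extension, so on other implementations this sketch falls back to an ordinary (strong) table.

```lisp
(defvar *request-extras*
  ;; Assumption: on SBCL, :WEAKNESS :KEY lets entries go away once the key
  ;; object is garbage. Elsewhere this is a plain EQ hash table.
  (make-hash-table :test 'eq #+sbcl :weakness #+sbcl :key))

(defun request-extra (request key)
  "Look up the extra datum stored under KEY for REQUEST, or NIL."
  (getf (gethash request *request-extras*) key))

(defun (setf request-extra) (value request key)
  "Store VALUE under KEY in the property list kept for REQUEST."
  (setf (getf (gethash request *request-extras*) key) value))

;; A fresh list stands in for a real request object.
(let ((request (list :fake-request)))
  (setf (request-extra request :db) "fake-db")
  (request-extra request :db))     ; => "fake-db"
```

Every piece of code that needs the extra data now has to know about the global table, which is the part that bothers me.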

Is there a converse situation where you'd want to CHANGE-CLASS to remove some slots from an object? I can't really think of a plausible case. It might happen as a side effect of changing the object to be an instance of a class that isn't a subclass of the original class. But it would never be the actual goal, since the amount of memory you'd save from having fewer slots would be minuscule.

Modifying method dispatch

A more obvious use for CHANGE-CLASS is a need to manipulate method dispatch. An example I like for this is the intermediate data representation of a compiler. Consider the representation of a variable binding. The binding could be, for example, constant vs. modified, or a totally local lexical binding vs. a lexical binding closed over by a function vs. a dynamic binding.

The compiler is going to need to treat the binding objects of the intermediate representation in very different ways depending on exactly what kind of variable this is. A variable binding that's never changed and that's known to contain an immutable value can be trivially constant-folded. A closed over and potentially modified variable will need some extra code to allocate some memory in which the variable is stored. And the code that's generated for any of the variable references (both reads and writes) will need to be different as well.

Now, the funny thing is that these binding objects can change their state multiple times during compilation. If dead code elimination ends up removing the last read from a variable, that binding becomes dead. A variable can bounce from non-closed over to closed over as a closure is discovered, back to non-closed as it's proven that the closure can't escape after all. And so on. It'd just be infeasible to generate all of the objects with the correct class up front. And these objects are going to be referenced willy-nilly from all over the IR tree, making replacing references truly annoying.

One way of representing this is to have a bunch of state flags in the compiler's binding objects. But then you have to implement the specializations as a bunch of conditionals in large functions, rather than by having the specialized behavior in separate methods and relying on method dispatch to sort things out. I know which form of organizing code I prefer [3].
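Here's a sketch of the method dispatch version. Everything in it is invented for illustration: the binding classes, the EMIT-READ generic function, and the made-up IR forms it returns. A real compiler would have many more states and operations.

```lisp
(defclass binding ()
  ((name :initarg :name :reader name-of)))

;; One class per state a binding can be in.
(defclass local-binding (binding) ())
(defclass closed-over-binding (binding) ())
(defclass dead-binding (binding) ())

(defgeneric emit-read (binding)
  (:documentation "Return a (made-up) IR form for reading BINDING."))

(defmethod emit-read ((b local-binding))
  `(register-ref ,(name-of b)))

(defmethod emit-read ((b closed-over-binding))
  `(value-cell-ref ,(name-of b)))

(defmethod emit-read ((b dead-binding))
  (error "Read of dead binding ~a" (name-of b)))

;; Analysis passes flip the class in place, so the IR nodes that already
;; point at the binding never need to be updated.
(defun note-closure-captures (binding)
  (change-class binding 'closed-over-binding))

(defun note-closure-does-not-escape (binding)
  (change-class binding 'local-binding))

(let ((b (make-instance 'local-binding :name 'x)))
  (list (emit-read b)
        (progn (note-closure-captures b) (emit-read b))
        (progn (note-closure-does-not-escape b) (emit-read b))))
;; => ((REGISTER-REF X) (VALUE-CELL-REF X) (REGISTER-REF X))
```

With state flags, each of those EMIT-READ methods would instead be a branch inside one big function, and every new state would mean touching all such functions.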

Switching metaclasses

Using CHANGE-CLASS to switch the class to a hierarchy that's based on a different metaclass is where I start drawing a blank. Unlike the other two cases, I haven't ever felt the need to do that myself, so it's harder to spin a convincing story. The best I can do is go through some typical uses of non-standard metaclasses, and think about whether there could be any reason to change between them and normal classes. Here are some broad categories of features you could change:

  • Changing the slot storage representation
  • Changing slot access in some other way
  • Changing the code generated for accessors
  • Adding new metadata to slot definitions or class definitions
  • Changing the method dispatch resolution in some fundamental way, for example using C3 class hierarchy linearization

And as for how you'd achieve something useful with one of these features:

Perhaps the most prototypical use of metaclasses is persistent objects - for example an object-relational mapping library, but it could be a real object database too. Why do you need a custom metaclass for this?

One reason is lazy initialization of some or all slots. When you load an object from the database, you don't necessarily want to load all the data up front. Some of it might trigger the loading of arbitrarily deep graphs of other persistent objects, which is expensive. To avoid this you only want to fetch the value of these slots when their value is read the first time. This doesn't look like a compelling case for CHANGE-CLASS; why would we change our already existing fully initialized instance into one of these lazily initialized objects?
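The lazy loading idea can be approximated without a metaclass at all, by specializing the standard SLOT-UNBOUND generic function. The following is only a sketch of the idea, not how any particular persistence library does it; LAZY-USER and FETCH-SLOT-FROM-DB are hypothetical, and the "database" is a function that makes up a string. A real implementation would more likely hook into the slot access protocol of the MOP.

```lisp
(defvar *fetch-count* 0 "How many times the fake database was queried.")

(defun fetch-slot-from-db (object slot-name)
  ;; Hypothetical stand-in for a real database query.
  (declare (ignore object))
  (incf *fetch-count*)
  (format nil "value of ~(~a~) from db" slot-name))

(defclass lazy-user ()
  ((uid :initarg :uid :reader uid-of)
   ;; No initarg or initform: the slot stays unbound until first read.
   (profile :reader profile-of)))

(defmethod slot-unbound (class (object lazy-user) slot-name)
  (declare (ignore class))
  ;; Fetch on first access and cache the result in the slot, so later
  ;; reads find the slot bound and never get here.
  (setf (slot-value object slot-name)
        (fetch-slot-from-db object slot-name)))
```

Reading PROFILE-OF on a fresh LAZY-USER triggers one fetch; reading it again returns the cached value.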

Alternatively you might want to attach extra information to slot descriptors for describing how the data is to be persisted. What's the SQL datatype of this field? Is it part of the primary key? Are there any foreign key constraints? It'd definitely be reasonable to CHANGE-CLASS an instance of USER to USER* given the following definitions:

  (defclass user ()
    ((uid :accessor uid-of :initarg :uid)
     (username :accessor username-of :initarg :username)
     (password-hash :accessor password-hash-of :initarg :password-hash)))

  (defclass user* ()
    ((uid :accessor uid-of :initarg :uid :primary-key t :sql-datatype 'integer)
     (username :accessor username-of :initarg :username
               :unique t :sql-datatype 'text)
     (password-hash :accessor password-hash-of :initarg :password-hash
                    :sql-datatype 'text))
    (:metaclass db-object)
    (:sql-table "user"))

But... This doesn't explain why you'd ever end up with a USER instead of a USER* in the first place. Persisting objects that you didn't create but that were injected to your program by a library seems very odd.

Another textbook example of non-standard metaclasses is alternative slot representations. Instead of an instance being essentially a vector of slot values, it could be a hash-table mapping slot names to values. The benefit here would be more space-efficient storage for sparse objects: instances of a class with hundreds of slots, most of which never get initialized. Could you want to swap back and forth between the normal and the sparse representation? Maybe, but then you'd just implement a metaclass that automatically chooses the right representation and switches between them behind the scenes. There's no point in forcing the user to switch between the representations manually. This doesn't feel plausible either.

One last try. Pascal Costanza's ContextL library extends the support for dynamic binding in the language, allowing not only dynamically binding variables but also doing it for functions (including autogenerated accessor methods) and slot values. The way this kind of extension would be implemented in a threadsafe manner is by indirecting the function calls and slot accesses through a special variable. Which is to say the slot access protocol needs to be reimplemented, and maybe the autogeneration of accessors too. Obviously this needs a new metaclass!

And could you want to change a standard object to one supporting dynamic binding? That's actually pretty plausible. A framework injects some object whose behavior you need to customize on a very fine-grained level. Dynamically scoped functions seem like a good tool for that. But it's still pretty hand-wavey.

Does anyone have a more concrete example for using CHANGE-CLASS primarily in order to switch to a different metaclass?

Footnotes

[0] Analogous operations are of course available in a bunch of languages. The key difference to my mind is that in Common Lisp, as with many other dynamic features, there's a well-defined protocol for customizing exactly how the feature works and how it's configured and extended.
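As a small illustration of that protocol: CHANGE-CLASS calls the standard generic function UPDATE-INSTANCE-FOR-DIFFERENT-CLASS with a copy of the old instance and the modified instance, and you can add methods to it. The two reading classes below are invented for the example.

```lisp
;; Hypothetical classes: same slot name, different meaning.
(defclass celsius-reading ()
  ((degrees :initarg :degrees :accessor degrees-of)))

(defclass fahrenheit-reading ()
  ((degrees :initarg :degrees :accessor degrees-of)))

;; PREVIOUS is a copy holding the old slot values; CURRENT is the original
;; object, already an instance of the new class. The DEGREES value carries
;; over by name, so the :AFTER method converts it.
(defmethod update-instance-for-different-class :after
    ((previous celsius-reading) (current fahrenheit-reading) &key)
  (setf (degrees-of current)
        (+ 32 (* 9/5 (degrees-of previous)))))

(let ((reading (make-instance 'celsius-reading :degrees 100)))
  (change-class reading 'fahrenheit-reading)
  (degrees-of reading))   ; => 212
```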

[1] Woo, Araneida for the win! Old code never dies.

[2] Where "special variable" is actually a technical concept, perhaps one of the worst named ones in the world :-) Essentially a thread-local dynamically scoped global variable, though there are a couple of extra warts.

[3] Fans of statically typed functional programming languages with pattern matching obviously have the opposite preference. What I find interesting is that a lot of the impetus for CHANGE-CLASS comes from wanting to preserve object identity and not needing to update all the references to the object. The first is a non-issue in functional programming, the second is something you need to do anyway.