<rss version='2.0'><channel><title>Juho Snellman's Weblog</title><link>https://www.snellman.net/blog/</link><description>Lisp, Perl Golf</description><item><title>I don&#039;t want no &#039;wantarray&#039;</title><link>https://www.snellman.net/blog/archive/2017-07-18-wantarray/</link><description>
  &lt;p&gt;
    A while back, I got a bug report for
    &lt;a href=https://www.snellman.net/blog/archive/2016-01-12-json-to-multicsv/&gt;json-to-multicsv&lt;/a&gt;. The user was getting the following error for
    any input file, including the one used as an example
    in the documentation:
  &lt;/p&gt;

  &lt;pre&gt;
    , or } expected while parsing object/hash, at character offset 2 (before &quot;n&quot;)&lt;/pre&gt;

  &lt;p&gt;The full facts of the matter were:&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt; The JSON parser was failing on the third character of the
      file.
    &lt;li&gt; That was also the end of the first line in the
      file. (I.e. the first line of the JSON file contained just the
      opening bracket).
    &lt;li&gt; The user was running it on Windows.
    &lt;li&gt; The same input file worked fine for me on Linux.
  &lt;/ul&gt;

  &lt;read-more&gt;&lt;/read-more&gt;

  &lt;p&gt;
    Now, there&#039;s an obvious root cause here. It&#039;s almost
    impossible not to blame this on Windows using CR-LF line endings,
    where Unix uses just LF. The pattern match is irresistible: works
    on Linux, fails on Windows, fails at the end of the first line. And I
    almost answered the email based on this assumption.

  &lt;p&gt;Except... Something feels off with that theory. What would be the
    root cause here? &quot;&lt;i&gt;Wow, I can&#039;t believe that the JSON spec
    missed specifying the CR as whitespace&lt;/i&gt;&quot;? No, that makes no
    sense, nobody would define a text-based file format that
    sloppily. [&lt;a id=&#039;fnref0&#039; href=&#039;#fn0&#039;&gt;0&lt;/a&gt;]

  &lt;p&gt;How about: &quot;&lt;i&gt;Wow, I can&#039;t believe the JSON module
    of a major programming language has a bug making it fail on all
      inputs on a major operating system, and it took a decade for anyone
      to notice&lt;/i&gt;&quot;. That doesn&#039;t seem plausible either.

  &lt;p&gt;
    So I tried to reproduce the problem, by making a file with DOS
    line endings and running it through the script on Linux. That
    worked fine. Hm. Put in some invalid garbage, and you get a parser
    error as expected. Double-hm. But the error message I got was very
    different from that in the bug report. Could it be that it&#039;s using
    a totally different JSON module altogether?

  &lt;p&gt;
    Turns out that&#039;s basically what was going on. Perl&#039;s JSON module
    doesn&#039;t actually do any parsing itself. It&#039;s mostly a
    shim layer; the actual work is done by one of several
    different parser modules. On Linux, I&#039;d been getting &lt;code&gt;JSON::XS&lt;/code&gt;
    as the backend (&lt;code&gt;XS&lt;/code&gt; is Perl-talk for &quot;native code&quot;). In cases
    where &lt;code&gt;JSON::XS&lt;/code&gt; is not available, the shim module would use
    a pure Perl fallback, e.g. &lt;code&gt;JSON::PP&lt;/code&gt;.

  &lt;p&gt;Ok, so force the JSON module to dispatch to &lt;code&gt;JSON::PP&lt;/code&gt;.
    Success! Problem reproduced. Guess it really was a buggy parser after
    all. Remove the DOS line endings, just to be sure... And it&#039;s still
    failing. WTF?
  &lt;/p&gt;

  &lt;p&gt;A bit more digging revealed that the error message was actually
    a lie. The problem wasn&#039;t with the whitespace, but with
    there being an end of file right after said whitespace. The input
    to &lt;code&gt;JSON::PP&lt;/code&gt; contained just a single line, not the whole
    file! At that point, the actual problem becomes obvious and the
    fix trivial:

  &lt;pre&gt;
-    my $json = decode_json read_file $file;
+    my $json = decode_json scalar read_file $file;&lt;/pre&gt;

  &lt;p&gt;I was using the &lt;code&gt;read_file&lt;/code&gt; function from
    &lt;code&gt;File::Slurp&lt;/code&gt; to read the contents of the file.
    Unfortunately that function behaves differently in scalar and list
    contexts. In scalar context, it returns the contents of the file
    in a single string. In list context, an array of strings. What had
    to be happening was that the context was changing based on the
    backend.
  &lt;/p&gt;

  &lt;p&gt;
    And just why would changing the parser backend change the context
    for that &lt;code&gt;read_file&lt;/code&gt; call? As it happens, the &lt;code&gt;JSON&lt;/code&gt;
    module does not actually define &lt;code&gt;decode_json&lt;/code&gt;, but
    directly aliases to the matching function in the backend. For
    example:
  &lt;/p&gt;

  &lt;pre&gt;*{&quot;JSON::decode_json&quot;} = \&amp;{&quot;JSON::XS::decode_json&quot;};&lt;/pre&gt;

  &lt;p&gt;
    &lt;code&gt;JSON::XS&lt;/code&gt; declares the function with a &lt;code&gt;$&lt;/code&gt;
    prototype forcing the argument to be evaluated in scalar context.
    &lt;code&gt;JSON::PP&lt;/code&gt; uses no prototype and thus the arguments
    default to being evaluated in list context.
  &lt;/p&gt;
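
  &lt;p&gt;
    A minimal sketch of that mechanism, for illustration only. The
    names &lt;code&gt;slurp_like&lt;/code&gt;, &lt;code&gt;with_proto&lt;/code&gt; and
    &lt;code&gt;without_proto&lt;/code&gt; are made up for this example. They stand
    in for &lt;code&gt;read_file&lt;/code&gt; and the two &lt;code&gt;decode_json&lt;/code&gt;
    variants, and are not the real implementations.
  &lt;/p&gt;

```perl
use strict;
use warnings;

# Hypothetical stand-in for File::Slurp::read_file: wantarray decides
# between one string (scalar context) and a list of lines (list context).
sub slurp_like {
    my @lines = ("{\n", "  \"n\": 1\n", "}\n");
    return wantarray ? @lines : join('', @lines);
}

# Modelled on the JSON::XS case: a ($) prototype forces scalar context
# on the argument expression.
sub with_proto ($) { return $_[0] }

# Modelled on the JSON::PP case: no prototype, so the argument is
# evaluated in list context and only the first element is looked at.
sub without_proto { return $_[0] }

my $whole = with_proto slurp_like();            # whole file
my $first = without_proto slurp_like();         # first line only
my $fixed = without_proto scalar slurp_like();  # whole file again

print length($whole), " ", length($first), " ", length($fixed), "\n";
```

  &lt;p&gt;
    This should print &lt;code&gt;13 2 13&lt;/code&gt;: without the prototype the
    callee only ever sees the two-character first line, and an explicit
    &lt;code&gt;scalar&lt;/code&gt; restores the full contents regardless of how
    the callee is declared.
  &lt;/p&gt;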

  &lt;h2&gt;The blame game&lt;/h2&gt;

   &lt;p&gt;So, that&#039;s the bug. But what was the real culprit?
     I could come up with the following suspects.&lt;/p&gt;

  &lt;ul&gt;
    &lt;li&gt; Me, for using &lt;code&gt;File::Slurp&lt;/code&gt; for this
      in the first place. &lt;i&gt;&quot;Oh, I just always pass a file-handle to
      &lt;code&gt;decode_json&lt;/code&gt;&quot;&lt;/i&gt; said one coworker when I described
      this bug. And that would indeed have side-stepped the problem, and
      &lt;code&gt;read_file&lt;/code&gt; is just saving a couple of lines of code.
      But it&#039;s exactly the couple of lines of code I don&#039;t want to be writing:
      pairing up file opens/closes, and boilerplate error handling.
    &lt;li&gt; Me, for not realizing that the code was only working by
      accident.  I knew &lt;code&gt;read_file&lt;/code&gt; works differently in
      scalar and list contexts. I also knew this case needed scalar context,
      and had no special reason to believe that &lt;code&gt;decode_json&lt;/code&gt;
      would provide it. The default assumption should have been for this
      code not to work. When it did, I should not have accepted it, but
      figured out why it worked and whether it was guaranteed to work
      in the future.
    &lt;li&gt;The &lt;code&gt;JSON&lt;/code&gt; module, for not explicitly documenting
      the inconsistent prototypes as part of the interface. I don&#039;t know
      that anyone would actually notice that in the documentation though.
      It might end up as just cover-your-ass documentation.
    &lt;li&gt; The &lt;code&gt;JSON&lt;/code&gt; module, for directly exposing the
      backend functions with aliasing, for a minimal performance gain.
      It&#039;s a shim: isn&#039;t the whole point to hide away the
      implementation differences from the user?
    &lt;li&gt;The &lt;code&gt;File::Slurp&lt;/code&gt; module, for using &lt;code&gt;wantarray&lt;/code&gt;
      to switch behavior of &lt;code&gt;read_file&lt;/code&gt; based on the context.
    &lt;li&gt;Perl for having the concept of different contexts in the first place.
    &lt;li&gt;Perl for allowing random library code to detect different contexts
      via &lt;code&gt;wantarray&lt;/code&gt;.
  &lt;/ul&gt;

  &lt;p&gt;The thing that really sticks out to me here is the overloading
     of &lt;code&gt;File::Slurp::read_file&lt;/code&gt; based on the
     context. Returning a file as a single string and returning it as
     an array of lines are very different operations. There is absolutely no
     reason for them to share a name. Separate functions would be simpler to implement, simpler to use,
     and simpler to document. It&#039;s even already in a library, so it&#039;s
     not like there would be any kind of namespace pollution by using
     different names. (Unlike for the uses of context-sensitive
     overloading in core Perl. Sure, &lt;code&gt;count&lt;/code&gt; would probably
     make more sense than &lt;code&gt;scalar grep&lt;/code&gt;. But it would be a
     new name in the global namespace).
  &lt;/p&gt;

   &lt;p&gt;What about &lt;code&gt;wantarray&lt;/code&gt;? It&#039;s what&#039;s enabling this
     bogus overloading in the first place. I&#039;ve been using Perl for 20
     years, writing some pretty hairy stuff. As far as I can remember,
     I haven&#039;t used &lt;code&gt;wantarray&lt;/code&gt; once. And what&#039;s more,
     I don&#039;t remember ever using a library that used it to good
     effect. The reason context-sensitivity works in core Perl
     is the limited set of operations. One can reasonably learn the
     entire set of context-sensitive operations, and their (sometimes
     surprising) behavior. It&#039;s a lot less reasonable to expect people to
     learn this for arbitrary amounts of user code.
   &lt;/p&gt;

   &lt;p&gt;It&#039;s a bit unfortunate that function aliasing can cause action
      at a distance like this. But at least that&#039;s a feature with
      solid use cases.&lt;/p&gt;

   &lt;p&gt;So I think that&#039;s where I fall on this. It&#039;s all because of a horrible
     and mostly unnecessary language feature, used to particularly bad effect
     in a library. It feels like avoiding this kind of problem on the consumer
     side is almost impossible; it&#039;d just require superhuman levels of attention
     to detail. Avoiding it on the producer side is really easy:
     &lt;code&gt;wantarray&lt;/code&gt;: just say no.&lt;/p&gt;

  &lt;h3&gt;Footnotes&lt;/h3&gt;
  &lt;div class=footnotes&gt;

  &lt;a id=&#039;fn0&#039;&gt;[&lt;a href=&#039;#fnref0&#039;&gt;0&lt;/a&gt;] Did you nod and agree at &lt;i&gt;&quot;that makes no sense&quot;&lt;/i&gt;? Haha. The
    original JSON spec does say that &lt;i&gt;&quot;whitespace can be inserted
    between any two tokens&quot;&lt;/i&gt;, but doesn&#039;t actually define whitespace.
   &lt;/div&gt;

</description><author>jsnell@iki.fi</author><category>PERL</category><pubDate>Tue, 18 Jul 2017 18:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2017-07-18-wantarray/</guid></item><item><title>json-to-multicsv - Convert hierarchical JSON to multiple CSV files</title><link>https://www.snellman.net/blog/archive/2016-01-12-json-to-multicsv/</link><description>
      &lt;h2&gt;Introduction&lt;/h2&gt;

      &lt;p&gt;
        &lt;a href=&#039;http://github.com/jsnell/json-to-multicsv&#039;&gt;json-to-multicsv&lt;/a&gt; is a little program to convert a JSON file to one or more CSV files in a way that preserves the hierarchical structure of nested objects and lists. It&#039;s the kind of dime a dozen data munging tool that&#039;s too trivial to talk about, but I&#039;ll write a bit anyway for a couple of reasons.

      &lt;p&gt;
        The first one is that I spent an hour looking for an existing tool that did this and didn&#039;t find one. Lots of converters to other formats, all of which seem to assume the JSON is effectively going to be a list of records, but none that supported arbitrary nesting. Did I just somehow manage to miss all the good ones? Or is this truly something that nobody has ever needed to do?

      &lt;p&gt;
        Second, this is as good an excuse as any to start talking a bit about some patterns in how command line programs get told what to do (I&#039;d use the word &amp;quot;configured&amp;quot;, except that&#039;s not quite right).
      &lt;/p&gt;

&lt;read-more&gt;&lt;/read-more&gt;

      &lt;h2&gt;What and why?&lt;/h2&gt;

      &lt;p&gt;
        I needed to produce some data for someone else to analyze, but
        the statistics package they were using could not import JSON
        files with any non-trivial structure. Instead the data needed
        to be provided as multiple CSV files that can be joined
        together by the appropriate columns.

      &lt;p&gt;
        As a simplified example, instead of this:

&lt;pre&gt;
{
  &amp;quot;item 1&amp;quot;: {
    &amp;quot;title&amp;quot;: &amp;quot;The First Item&amp;quot;,
    &amp;quot;genres&amp;quot;: [&amp;quot;sci-fi&amp;quot;, &amp;quot;adventure&amp;quot;],
    &amp;quot;rating&amp;quot;: {
      &amp;quot;mean&amp;quot;: 9.5,
      &amp;quot;votes&amp;quot;: 190
     }
  },
  &amp;quot;item 2&amp;quot;: {
    &amp;quot;title&amp;quot;: &amp;quot;The Second Item&amp;quot;,
    &amp;quot;genres&amp;quot;: [&amp;quot;history&amp;quot;, &amp;quot;economics&amp;quot;],
    &amp;quot;rating&amp;quot;: {
      &amp;quot;mean&amp;quot;: 7.4,
      &amp;quot;votes&amp;quot;: 865
   },
   &amp;quot;sales&amp;quot;: [
     { &amp;quot;count&amp;quot;: 76, &amp;quot;country&amp;quot;: &amp;quot;us&amp;quot; },
     { &amp;quot;count&amp;quot;: 13, &amp;quot;country&amp;quot;: &amp;quot;de&amp;quot; },
     { &amp;quot;count&amp;quot;: 4, &amp;quot;country&amp;quot;: &amp;quot;fi&amp;quot; }
   ]
  }
}
&lt;/pre&gt;

&lt;p&gt;
My &amp;quot;customer&amp;quot; needed this:

&lt;p&gt;
&lt;b&gt;item.csv&lt;/b&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;item._key&lt;td&gt;item.rating.mean&lt;td&gt;item.rating.votes&lt;td&gt;item.title
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 1&amp;quot;&lt;td&gt;9.5&lt;td&gt;190&lt;td&gt;&amp;quot;The First Item&amp;quot;
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 2&amp;quot;&lt;td&gt;7.4&lt;td&gt;865&lt;td&gt;&amp;quot;The Second Item&amp;quot;
&lt;/table&gt;

&lt;p&gt;
&lt;b&gt;item.genres.csv&lt;/b&gt;
&lt;table&gt;
  &lt;tr&gt;&lt;td&gt;genres&lt;td style=&#039;background-color: #eaa&#039;&gt;item._key&lt;td&gt;item.genres._key
  &lt;tr&gt;&lt;td&gt;sci-fi&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 1&amp;quot;&lt;td&gt;1
  &lt;tr&gt;&lt;td&gt;adventure&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 1&amp;quot;&lt;td&gt;2
  &lt;tr&gt;&lt;td&gt;history&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 2&amp;quot;&lt;td&gt;1
  &lt;tr&gt;&lt;td&gt;economics&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 2&amp;quot;&lt;td&gt;2
&lt;/table&gt;

&lt;p&gt;
&lt;b&gt;item.sales.csv&lt;/b&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;item._key&lt;td&gt;item.sales._key&lt;td&gt;sales.count&lt;td&gt;sales.country
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 2&amp;quot;&lt;td&gt;1&lt;td&gt;76&lt;td&gt;us
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 2&amp;quot;&lt;td&gt;2&lt;td&gt;13&lt;td&gt;de
&lt;tr&gt;&lt;td style=&#039;background-color: #eaa&#039;&gt;&amp;quot;item 2&amp;quot;&lt;td&gt;3&lt;td&gt;4&lt;td&gt;fi
&lt;/table&gt;

&lt;p&gt;
One way to do this would have been to just change the program I used
to produce the output. That would have been a bit annoying since the
CSV output codepath would have been basically completely separate from
the JSON one (which was basically just
a &lt;code&gt;JSON::encode_json&lt;/code&gt; on the natural data structure). It&#039;s
almost easier to just have a generic converter than one specific for
that one app (the documentation is as long as the program itself). The
only question is how to configure the generic mechanism for the
specific case.

&lt;h2&gt;How command line tools get run&lt;/h2&gt;

&lt;p&gt;
Could this &amp;quot;just work&amp;quot; out of the box with no settings at all?
Not really, there&#039;s
multiple ways of interpreting the data. A compound value could mean
either the addition of more columns (ratings in the example) or adding
rows to another CSV file (sales in the example). Consistently choosing
the first interpretation would not work at all, while in the latter
case you&#039;d get really awkward &lt;a href=&#039;https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model&#039;&gt;entity-attribute-value&lt;/a&gt;-style output.

&lt;p&gt;
  Ok, so some configuration is needed. What kind of options do we have for
  doing that? Command line flags tend to be the simplest to start with,
  though they&#039;ll often eventually become complex either by developing
  ordering dependencies between flags (to express different semantics)
  or by the values developing some kind of complicated internal structure.

&lt;p&gt;
  Both of those actually happen for this tool. To run it, you need to
  pass in multiple --path command line options, each containing a pair
  of a pattern and the action to take for values whose path matches
  the pattern. (Just the first matching action is taken). For the above
  example those flags were:

&lt;pre&gt;
   --path /:table:item
   --path /*/rating:column
   --path /*/sales:table:sales
   --path /*/genres:table:genres
&lt;/pre&gt;

&lt;p&gt;
Scalar values have an automatic fallback handler that just outputs the
value as a column, but for compound data fields not finding a match is
an error. In these cases the error message will print out some
suggestions on what command line arguments could be added to resolve
the error, for example:

&lt;pre&gt;
Don&#039;t know how to handle object at /*/appendix/. Suggestions:
 --path /*/appendix/:table:name
 --path /*/appendix/:column
 --path /*/appendix/:row
 --path /*/appendix/:ignore
&lt;/pre&gt;
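
&lt;p&gt;
The dispatch described above can be sketched roughly as follows. This
is not the actual code of json-to-multicsv: the function names are
hypothetical, and the matching rule (segment by segment, with
&lt;code&gt;*&lt;/code&gt; matching any single segment) is an assumption made
for the sake of the example.
&lt;/p&gt;

```perl
use strict;
use warnings;

# Parse a --path value of the form PATTERN:ACTION[:NAME].
sub parse_spec {
    my ($spec) = @_;
    my ($pattern, $action, $name) = split /:/, $spec, 3;
    return { pattern => $pattern, action => $action, name => $name };
}

# Assumed rule: compare segment by segment, "*" matches any one segment.
sub pattern_matches {
    my ($pattern, $path) = @_;
    my @want = grep { length } split m{/}, $pattern;
    my @have = grep { length } split m{/}, $path;
    return 0 unless @want == @have;
    for my $i (0 .. $#want) {
        next if $want[$i] eq '*';
        return 0 unless $want[$i] eq $have[$i];
    }
    return 1;
}

# Only the first matching action is taken. Scalars fall back to
# "column"; a compound value with no match is an error.
sub action_for {
    my ($specs, $path, $is_compound) = @_;
    for my $spec (@$specs) {
        return $spec->{action} if pattern_matches($spec->{pattern}, $path);
    }
    return 'column' unless $is_compound;
    die "Don't know how to handle value at $path\n";
}

my @specs = map { parse_spec($_) } (
    '/:table:item',
    '/*/rating:column',
    '/*/sales:table:sales',
    '/*/genres:table:genres',
);

print action_for(\@specs, '/item 1/rating', 1), "\n";  # column
print action_for(\@specs, '/item 2/sales', 1), "\n";   # table
print action_for(\@specs, '/item 1/title', 0), "\n";   # column (scalar fallback)
```

&lt;p&gt;
In this sketch an unmatched compound value such as
&lt;code&gt;/item 1/appendix&lt;/code&gt; makes &lt;code&gt;action_for&lt;/code&gt; die,
which is the point where the real tool prints its suggestions.
&lt;/p&gt;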

&lt;p&gt;
The next option would be feeding some kind of a schema file to the
tool, which would then be used to guide the process. For example if
the schema says that a type of object has a static set of fields,
those fields are probably columns. If it has an unknown set of keys,
it&#039;s probably more like tabular data.

&lt;p&gt;
The problem is that writing the schema would be a bit of a pain, and
it would be much harder for the conversion tool to guide the user
through an iterative process of getting the schema definition right.
One could maybe generate a schema file from the data file itself, and
edit any bits that the autodetection goes wrong. Schema generators do
exist, for example
&lt;a href=&#039;http://jsonschema.net/&#039;&gt;jsonschema.net&lt;/a&gt;, but at least that
one doesn&#039;t have enough knobs to tweak to even get this basic example
right. And the mistakes are such that fixing them would take a fair
bit of work. Reliable automated schema generation would make for some
pretty epic yak shaving in the context of this tiny tool.

&lt;p&gt;
Maybe if people really did write JSON schemas for everything it would
make sense to use that existing infrastructure. But I&#039;ve never seen
one of those in the wild, the spec is complicated, and JSON
schemas are not particularly well suited to this use case. (Really
you&#039;d want a custom schema format, but then it&#039;s completely guaranteed
that there&#039;s no pre-existing schema file to use).

&lt;p&gt;
And here&#039;s the thing... It&#039;s not just this specific case. It never
feels like any kind of declarative schema is the right solution. In a
couple of decades of writing data munging scripts I can remember just
a single case of basing the solution on an external description of the
data. And that single exception had several people working on the tool
full time. Sure, it&#039;s great to have a schema of some sort for your
data interchange or storage format, for use in validation, code
generation, automated generation of example data, or other things like
that. But for actually processing it? It&#039;s just an incredibly rare
pattern.

&lt;p&gt;
And finally, could this be a use case for a special purpose language?
If schemas feel like a rarity, little languages are the
opposite. Especially in classic Unix they are ubiquitous.

&lt;p&gt;
As a recovering programming language addict, I have to be deeply
suspicious every time a new language looks like the right solution. Is
it really? Or is this just an excuse to fall off the wagon again, and
implement a language? (Not a big language, man. Just a little one, to
take the edge off).

&lt;p&gt;
It&#039;s also clear that the general idea of a JSON processing language is
solid. Some already exist
(e.g. &lt;a href=&#039;https://stedolan.github.io/jq/&#039;&gt;jq&lt;/a&gt;), but there could
be room for multiple approaches. Writing sample programs to see what a
language for JSON processing and transformation might look like was a
fun way to spend a couple of hours on the boring &amp;quot;no internet&amp;quot; leg of
a train journey. (&amp;quot;It could have this awk-like structure of toplevel
pattern matching clauses, but on paths instead of rows of text, and
with a recursive main loop instead of a streaming one, and and
and...&amp;quot;).


&lt;p&gt;
If I kind of wanted to write this, the idea is good, and an initial
implementation is not an unreasonable amount of work, why not do it?
Well, even if a script written in this hypothetical language to
translate from hierarchical to tabular data would have been pretty
simple, it would still have been a program that the user of the tool
needs to write in a dodgy DSL. And since the language would have
been much more generic than a mere conversion tool, it would also
have been impossible to guide the user through a process of iteratively
building the right configuration (like is now done via the error messages).

&lt;p&gt;
In all likelihood it&#039;d mean that nobody else would ever use the tool for the
original purpose. The less powerful and less flexible version is just
going to be more useful purely due to simplicity.

&lt;p&gt;
So sanity prevailed this time. But tune in for the next post for an
earlier example of where my self control failed.

</description><author>jsnell@iki.fi</author><category>GENERAL</category><category>PERL</category><pubDate>Tue, 12 Jan 2016 14:30:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2016-01-12-json-to-multicsv/</guid></item><item><title>Command languages as game user interfaces</title><link>https://www.snellman.net/blog/archive/2014-12-08-command-languages-as-game-ui/</link><description>
&lt;p&gt;
In
the &lt;a href=&#039;https://www.snellman.net/blog/archive/2014-11-27-history-of-online-terra-mystica/&#039;&gt;previous
post&lt;/a&gt; in this series, I promised to discuss in detail some of the
positive and negative consequences of the less conventional design
choices of my &lt;a href=&#039;http://terra.snellman.net/&#039;&gt;online Terra
Mystica implementation&lt;/a&gt;. If you have no idea of what that is,
reading at least the intro of that post might be a good idea. This
post will just deal with one design choice, but it&#039;s the elephant in
the room: the command language.

&lt;p&gt;
The canonical internal representation of a game in my TM
implementation is as a sequence of rows, each describing some number
of player actions specified in
an &lt;a href=&#039;http://terra.snellman.net/usage/#sec-3.3&#039;&gt;ad hoc mini
language&lt;/a&gt;, or administrative commands that change the game setup
in some way (for example setting game options, or dropping a
player from the game partway through). This is what it might look like:

&lt;pre style=&#039;background-color: white&#039;&gt;
&lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;: action ACT4
&lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;: upgrade E6 to TE
&lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;: +FAV6
&lt;span style=&#039;background-color: #f08080&#039;&gt;giants&lt;/span&gt;: Leech 3 from &lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;
&lt;span style=&#039;background-color: #f08080&#039;&gt;giants&lt;/span&gt;: pass BON4
&lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;: Leech 2 from &lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;
&lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;: +WATER
&lt;span style=&#039;background-color: #f0c060&#039;&gt;dragonlords&lt;/span&gt;: Decline 2 from &lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;
&lt;span style=&#039;background-color: #f0c060&#039;&gt;dragonlords&lt;/span&gt;: dig 1. build G6
&lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;: send p to EARTH
&lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;: action FAV6. +AIR
&lt;span style=&#039;background-color: #f0c060&#039;&gt;dragonlords&lt;/span&gt;: pass BON7
&lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;: upgrade E7 to TE. +FAV11
&lt;span style=&#039;background-color: #f08080&#039;&gt;giants&lt;/span&gt;: Leech 3 from &lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;
&lt;span style=&#039;background-color: #f0c060&#039;&gt;dragonlords&lt;/span&gt;: Leech 2 from &lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;
&lt;span style=&#039;background-color: #b08040&#039;&gt;cultists&lt;/span&gt;: Leech 2 from &lt;span style=&#039;background-color: #e0f0ff&#039;&gt;yetis&lt;/span&gt;
&lt;/pre&gt;

&lt;p&gt;
That&#039;s a short excerpt from the middle of a random game. A full game
generally runs for about 400 rows.

&lt;read-more&gt;&lt;/read-more&gt;

&lt;p&gt;
What do I mean by this being the canonical internal representation?
Only a few parts of the game state are actually persisted separately
in the DB; these are things that might almost qualify as metadata,
such as whose turn it is to move, whether the game is still running, and
what the final rankings of a finished game were. But in general the only
way to find out the current state of the game is to evaluate the whole
sequence of commands from start to finish. This is in fact done for
&lt;i&gt;almost every operation on the site&lt;/i&gt; (viewing a game, previewing a move,
saving a move, viewing or editing the game in an admin mode, and
so on).

&lt;p&gt;
In addition to being the canonical internal representation, the
command language is also the canonical user interface; the fundamental
operation players do is enter new rows into the command
sequence. Often this is done by writing the commands manually, though
there are GUI shortcuts of one form or another available for almost
all operations.

&lt;p&gt;
This might sound like a slightly insane way of doing things, but it
does have some benefits as well. I&#039;ve made several digital board game
adaptations of varying levels of completeness over the years, used
tens of other ones, and this solution hits the closest to my personal
sweet spot.

&lt;h3&gt;A taxonomical diversion&lt;/h3&gt;

&lt;p&gt;
Before discussing the fallout of this design decision in more detail,
it&#039;s probably useful to do a quick tour of some of the main axes in
the design space. (I&#039;m of course just describing the extremes, while
in the real world most examples would fall on a continuum).

&lt;p&gt;
First, there&#039;s the question of the interaction model which might be
&lt;b&gt;abstract&lt;/b&gt;
or &lt;b&gt;&lt;a href=&#039;http://en.wikipedia.org/wiki/Skeuomorph&#039;&gt;skeuomorphic&lt;/a&gt;&lt;/b&gt;. In
a skeuomorphic design the player doing input on a computer would still
be mimicking the actions of someone playing the game with physical
pieces and no computer assistance.

&lt;p&gt;
In an abstract design the player
would only input the parts of the move that are necessary to uniquely
distinguish it from other possible moves, with any bookkeeping and
mandatory intermediate steps being carried out automatically. Likewise
in a skeuomorphic design the software provides information through the
same methods as the original physical game, while an abstract design
will automate some of the mechanical parsing of the game state. Or
even just the question of using the graphical assets of the original
game, generally optimized for sales, versus using digital-first assets
optimized for clarity.

&lt;p&gt;
As an example of this axis, in
the &lt;a href=&#039;http://en.wikipedia.org/wiki/18XX&#039;&gt;18xx&lt;/a&gt; series of
games a substantial amount of playtime is spent computing the exact
routes of a number of trains on a complex rail network. I&#039;m aware of
three solutions that are actually in use, and there is a fourth
plausible one, in order from least to most abstract:

&lt;ul&gt;
&lt;li&gt; The user manually decides on the routes, computes their values with no computer assistance, and those values are used with no validation. Examples: ps18xx, early versions of &lt;a href=&#039;http://rails.sourceforge.net/&#039;&gt;Rails&lt;/a&gt;.
&lt;li&gt; The user enters valid routes through a user interface. The software computes the values of the routes, and distributes the income from the company appropriately. Example: &lt;a href=&#039;http://www.rr18xx.com/&#039;&gt;rr18xx&lt;/a&gt;.
&lt;li&gt; In games with requirements that all routes must be optimal, the software could compute an optimal route but only for the purpose of rejecting any manually computed suboptimal ones. Examples: None. (Though it&#039;s similar to what&#039;s done in the SlothNinja implementation of &lt;a href=&#039;http://www.slothninja.com/&#039;&gt;Indonesia&lt;/a&gt;, a game that probably counts as an honorary 18xx)
&lt;li&gt; The software automatically finds an optimal set of routes and computes their values. Examples: The ancient DOS-based &lt;a href=&#039;http://www.mikkosgameblog.com/2010/08/simtex-1830/&#039;&gt;1830 from Simtex&lt;/a&gt;, recent versions of Rails.
&lt;/ul&gt;

&lt;p&gt;
My own tastes run toward maximum abstraction, I&#039;ve rarely if ever seen
a digital boardgame conversion that needed to be more skeuomorphic.
But this is not a universal view. There are definitely people who will
refuse to play a conversion that does not use the same graphics as the
physical version. Or who will strenuously argue against automatic
finding of optimal routes in 18xx, on the basis that evaluating
routes is a core skill in the game when making decisions about route
building, and that skill can only be acquired by getting sufficient
practice in manual route computation.

&lt;p&gt;
A second axis is the internal representation, which could be based on
either &lt;b&gt;log replay&lt;/b&gt; or &lt;b&gt;stored state&lt;/b&gt;. In a log replay
system the game is stored as a series of steps from the starting setup to
the current state. In a stored state system the game is stored as the
current values of all pieces of the game. How much money does every
player have, which round is it right now, what&#039;s in this exact space
on the map, and so on.
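
&lt;p&gt;
As a rough illustration of the log replay side, here is a toy sketch.
The &lt;code&gt;gain&lt;/code&gt; and &lt;code&gt;spend&lt;/code&gt; commands and the
&lt;code&gt;replay&lt;/code&gt; function are invented for this example and have
nothing to do with the real command language; only the
&amp;quot;player: command&amp;quot; row shape is borrowed from the excerpt above.

```perl
use strict;
use warnings;

# Toy command log: one "player: command" row per step.
my @log = (
    'yetis: gain 5',
    'cultists: gain 3',
    'yetis: spend 2',
);

# Log replay: the current state is not stored anywhere. It is
# recomputed by evaluating every row from the starting setup.
sub replay {
    my (@rows) = @_;
    my %coins;
    for my $row (@rows) {
        my ($player, $verb, $amount) =
            $row =~ /^(\w+): (gain|spend) (\d+)$/
            or die "Cannot parse row: $row\n";
        $coins{$player} //= 0;
        $coins{$player} += $verb eq 'gain' ? $amount : -$amount;
        die "$player cannot afford: $row\n" if 0 > $coins{$player};
    }
    return \%coins;
}

my $state = replay(@log);
for my $player (sort keys %$state) {
    print "$player: $state->{$player}\n";
}
```

&lt;p&gt;
A stored state system would instead persist something like the
&lt;code&gt;%coins&lt;/code&gt; hash directly and update it in place on every
move, without keeping the rows that led to it.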

&lt;p&gt;
A third axis is the input model. Moves could be entered either through
&lt;b&gt;direct&lt;/b&gt; or &lt;b&gt;indirect&lt;/b&gt; manipulation. In a system using
direct manipulation, the player would for example see a graphical
display of a map and be able to click or drag on a unit to enter a move
for it. In an indirect system the player observes the game state in
one place, and enters their moves using some completely unrelated
system.

&lt;p&gt;
I think most digital boardgames use a direct input model, but there
are also a fair number that have a menu-driven system of some sort.
The only examples I know of that go a bit further with indirection by
providing a command language are
my &lt;a href=&#039;https://www.snellman.net/software/pogmap/&#039;&gt;ancient Paths
of Glory mapper&lt;/a&gt; and the even
older &lt;a href=&#039;http://en.wikipedia.org/wiki/Internet_Diplomacy#Email_judges&#039;&gt;Diplomacy
PBEM judges&lt;/a&gt;. If you have other examples, I&#039;d love to hear of them.

&lt;p&gt;
Direct manipulation is often, but not always, linked to excessive
skeuomorphism in the interaction model. For example I find it almost
painful to play most Vassal modules, with their hyper-direct
interaction model of dragging and dropping counters around, manually
drawing cards from a deck or rolling dice. Digital boardgames are not
the same medium as physical boardgames, and should play to their unique
strengths. But these are in fact orthogonal concerns, and there&#039;s no
reason why a direct manipulation model couldn&#039;t also provide
useful input and computational abstractions.

&lt;p&gt;
Whew, so much for the theory. In this taxonomy Online Terra Mystica is
pretty far toward the abstract end, and is fully in the log replay
camp. While it has a half-hearted attempt at adding some direct
manipulation concepts to the UI, it started off as an indirect system
and deep inside that&#039;s what it is. It also chooses to merge the input
format and the log format into one entity. So what does this mean?

&lt;h3&gt;Feature set&lt;/h3&gt;

&lt;p&gt;
Perhaps the signature feature of the site is the &lt;b&gt;planner&lt;/b&gt;. This
tool allows the player to enter an arbitrarily long sequence of
actions - all the way to the end of the game - and see what the
effects would be. Are all the moves valid? Are there sufficient
resources available to do all of this? Oh, I don&#039;t have enough
resources? Well what if I do this on round 5, and delay that action to
round 6. In cases where the plan fundamentally depends on the
opponents doing something, it&#039;s possible for the plan to also contain
arbitrary resource adjustments.  And finally, since the command
language supports comments, these plans can be properly documented so
that when you return to them in a day or two, you can remember why
you wanted to do these particular moves.

&lt;p&gt;
I think this feature is intrinsically linked to the command language as
a user interface, and it might actually be unique. There are some
games with other kinds of interfaces that allow you to play the game
forward, and then undo / rewind / reload. But simply being able to
play the game forward is not sufficient to make this a useful
tool. It&#039;s only the ease of inserting, reordering and deleting moves
that makes it possible to use this as a matter of course, rather than
only under the most exceptional circumstances.

&lt;p&gt;
A somewhat related feature is &lt;b&gt;undo&lt;/b&gt;. Inflexibility in allowing
moves to be taken back is the bane of many forms of digital
boardgames.  When playing a game face to face, most groups will
generally allow at least some level of taking back moves. In some
cases all moves are final immediately (this has always been the
primary problem of the otherwise brilliant implementation
of &lt;a href=&#039;http://brass.orderofthehammer.com/&#039;&gt;Brass at Order of the
Hammer&lt;/a&gt;). In some other implementations there are distinct
checkpoints, for
example &lt;a href=&#039;http://www.boardgaming-online.com/&#039;&gt;BGO&#039;s Through the
Ages&lt;/a&gt; allows undoing back to the start of your full turn, but no
other rollbacks (clicking &#039;finish turn&#039; is final, as is any kind of
action during an auction or war resolution). These two are, I believe,
examples of undo being limited for design reasons. At rr18xx meanwhile
rollbacks are possible until the previous action of each player. Here
my understanding is that the overriding issue is technical, as the
rollback is essentially a full restore to a previous database
snapshot, and there are resource constraints on how many snapshots can
be kept.

&lt;p&gt;
The solution Online TM takes to this is to grant the creator of the
game arbitrary powers to edit the history at will, the &lt;b&gt;admin
mode&lt;/b&gt;. Not only can they undo the last move or couple of moves. If
there was a mistake made three moves back, they can go and fix it (and
they can fix it without forcing the intervening moves to be
redone). This feature is fully tied to a log replay mode of
operation. While more limited forms of undoing could be implemented as
a reverse log replay from the end state or through state snapshots,
this more complete form depends on the log being directly editable.
And realistically the log also needs to be the input format; it would
not be reasonable to expect the admin to be able to edit a more
formal log representation correctly (whether the log format is XML,
protocol buffers, JSON, or something else). But in the case where the
log format and the move input system match, just playing the game
has taught the game admin the necessary skills.

&lt;p&gt;
This is a very nice feature for friendly games. It does have downsides
though, more on that later in the section on the social implications.

&lt;p&gt;
There&#039;s also a potential as yet unimplemented feature
of &lt;b&gt;pre-programmed actions&lt;/b&gt;, that people frequently ask for.
&amp;quot;I know exactly what I want to do next turn, why can&#039;t I just
pre-enter my move&amp;quot;. This would be a pretty interesting thing for
speeding up games, but to my mind would not be conducive to good
play. Circumstances change, often in ways you did not anticipate at
all.  The only way this could be even remotely usable would be if the
language was extended to have some kind of conditional execution. And
that&#039;s a can of worms I&#039;m not interested in opening, and I suspect also a
bridge too far for 99% of my users.

&lt;p&gt;
It&#039;s worth noting that many of the above features are closely tied to
a game with no randomness (or at most setup randomness) and no
hidden information. As such their existence is something of an
anti-feature, preventing other additions to the game.

&lt;p&gt;
For a non-hypothetical example, I&#039;m currently thinking about how to
implement the &lt;b&gt;faction auction&lt;/b&gt; variant from the TM expansion. A
full open auction in the beginning would be painfully slow. The most
obvious, though still slightly imperfect, solution is a series
of &lt;a href=&#039;http://en.wikipedia.org/wiki/Vickrey_auction&#039;&gt;blind second
price auctions&lt;/a&gt;. But this is not a good fit for the site&#039;s existing
design. The problem is that the blind bid introduces momentary hidden
information into the game, and it&#039;s possible for that information to
leak through either the preview or admin modes. For example the admin
could wait for everyone else to bid, peek into the log and see
everyone else&#039;s bids, and then bid in such a way as to force the
winner to pay the maximum amount.
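
&lt;p&gt;
As a rough illustration of the leak, here is a sketch of a single
sealed-bid second price auction. The function name, the bid values and
the tie-breaking rule are assumptions made for the example, not part of
the site or of the expansion rules.

```python
def second_price_auction(bids: dict[str, int]) -> tuple[str, int]:
    """Return (winner, price) for a sealed-bid second-price auction.

    The highest bidder wins but pays only the second-highest bid.
    Ties go to whoever bid first (an arbitrary choice for this sketch).
    """
    if len(bids) == 0:
        raise ValueError("no bids")
    # sorted() is stable, so equal bids keep their entry order.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0
    return winner, price


# Honest case: nobody has seen the other bids.
sealed = {"alice": 7, "bob": 4}
print(second_price_auction(sealed))   # ('alice', 4)

# Leaky case: the admin reads the log, sees the top bid of 7,
# and bids just below it to push up the price the winner pays.
leaked = dict(sealed, admin=6)
print(second_price_auction(leaked))   # ('alice', 6)
```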

&lt;h3&gt;UX&lt;/h3&gt;

&lt;p&gt;
The most obvious UX consequence of using a command language is that
it tends to be &lt;b&gt;harder to learn&lt;/b&gt;. The following quote, said partly
in jest, certainly contains a kernel of truth:

&lt;blockquote&gt;
... has done a bang-up job providing a PBEM Terra Mystica experience that includes just enough extra layers of complexity via the interface and game administration tools to keep TM as confusing as ever, long after you master the actual game!
&lt;/blockquote&gt;

&lt;p&gt;
Non-natural languages are simply not a mode of human computer
interaction that most people are comfortable with in this day and age.
It actually continues to amaze me that I could get non-programmers to
play using this implementation at all. Is it possible to evaluate how
big a hurdle this has been for people? The best number I can come up
with is that around 20% of the players who joined at least one game
never finished even one game without dropping out. Note that these
are players who have already jumped through hoops such as email
validation during account registration. It&#039;s possible that there&#039;s
some other issue besides the UI that&#039;s a problem for these players,
but it does seem like the most likely candidate.

&lt;p&gt;
A smaller problem is that it essentially forces the introduction of a
&lt;b&gt;move preview&lt;/b&gt;. For those who haven&#039;t played the game, when entering
moves you need to first enter the moves, then click &#039;preview&#039;, check
that the results match what you want, and finally click &#039;save&#039; to
commit the moves. In a game that uses a direct manipulation paradigm,
a preview could be skipped. But with a more obscure UI like here, it&#039;s
absolutely essential since the move might not have had the intended
effect. Whether it&#039;s doing the entirely wrong move, picking the wrong
tile, building on the wrong location, etc. Even with a preview step
somebody will request a rollback on average once or twice a game.

&lt;p&gt;
So why do I call this a problem? Because despite my best efforts,
especially new players will frequently forget to &#039;save&#039;, leaving the
game in a limbo state where they think they&#039;ve done their move, until
some other player gets impatient. (To mitigate this a little, the
system will automatically do a &#039;preview&#039; when using the GUI tools to
generate the commands rather than type them.  Unfortunately
performance problems make it unfeasible to trigger continuous parsing
+ updates when typing).

&lt;p&gt;
A horrible mistake I made in the design of the language was the lack
of (mandatory) &lt;b&gt;turn delimiters&lt;/b&gt;. Originally my implementation treated
each row as a complete turn. This caused more confusion than any other
part of the command language. Eventually I wrote a lot of
very complicated code for automatically detecting the turn breaks in
a command stream.

&lt;p&gt;
But that wasn&#039;t actually good enough: there are valid command streams
where the splitting is ambiguous, e.g. with the tunneling ability of the dwarves, where
&lt;code&gt;transform E10. build E10&lt;/code&gt; could be read as either one move or two. I had to make an arbitrary
choice on that (basically the behavior now is greedy, as many commands
as possible are stuffed into the same move). So I had to include the
&lt;code&gt;done&lt;/code&gt; command to allow players to disambiguate in the few
cases where it&#039;s needed. This is still supremely confusing for
people. All of this could have been avoided by taking this into account
right at the start.
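
&lt;p&gt;
The greedy behavior and the role of &lt;code&gt;done&lt;/code&gt; can be sketched
with a toy splitter. The single merging rule below (a
&lt;code&gt;build&lt;/code&gt; joins a preceding &lt;code&gt;transform&lt;/code&gt; of the
same hex) and the use of a period as the command separator are
simplifying assumptions; the real detection code is far more involved.

```python
def can_extend(current: list[str], command: str) -> bool:
    """Toy rule: 'build X' may join a move that ends in 'transform X'."""
    previous = current[-1].split()
    words = command.split()
    return (
        len(previous) == 2
        and len(words) == 2
        and previous[0] == "transform"
        and words[0] == "build"
        and previous[1] == words[1]
    )


def split_moves(stream: str) -> list[list[str]]:
    """Greedily group a command stream into moves.

    As many commands as possible are stuffed into the same move;
    an explicit 'done' always ends the current move.
    """
    commands = [c.strip() for c in stream.split(".") if c.strip()]
    moves: list[list[str]] = []
    current: list[str] = []
    for command in commands:
        if command == "done":
            if current:
                moves.append(current)
            current = []
            continue
        if current and not can_extend(current, command):
            moves.append(current)
            current = []
        current.append(command)
    if current:
        moves.append(current)
    return moves


print(split_moves("transform E10. build E10"))        # one move
print(split_moves("transform E10. done. build E10"))  # two moves
```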

&lt;p&gt;
Finally, one very surprising outcome is that having a compact
vocabulary for game actions makes it much easier to display
a &lt;b&gt;useful player-readable log&lt;/b&gt; of what happened in the game. The
typical user-visible log is structured as natural language, and so
verbose as to be hard to read especially when trying to piece together
the flow of the game after the fact. It&#039;s easy to see why that design
choice is made, but it&#039;s not necessary when all players are almost by
definition going to know how to read a more compact representation.

&lt;p&gt;
Likewise this makes it really easy to display a concise summary of
what has happened in the game since the player last looked at it (done
both in the notification emails and the &#039;recent moves&#039; tab of games).

&lt;h3&gt;Social issues&lt;/h3&gt;

&lt;p&gt;
The unlimited admin access to games has a dark side. &lt;b&gt;Admin
malfeasance&lt;/b&gt; is rare but I do get about one complaint a month about
it. Sometimes these are games where the admin will change their moves
after others have already taken moves, rolling the game back by a huge
amount, taking over entirely for another player for example forcibly
passing them, applying different standards to allowing others to undo
vs. doing it themselves, and so on.

&lt;p&gt;
This is the kind of drama that I really do not want to deal with, but
the general solution is to just mark the game as unrated, and let the
players sort out between themselves whether and how the game will
continue. And it is a bit of a miracle that it hasn&#039;t yet become a
more widespread problem,
as &lt;a href=&#039;http://www.penny-arcade.com/comic/2004/03/19&#039;&gt;one might
expect to happen for the anonymity + internet combo&lt;/a&gt;. If it does
ever become intolerable, the solution will almost certainly be to
disable admin mode entirely for public games. The TM tournament has
already shown that it&#039;s at least workable, even if people do occasionally
get a little bit screwed by the &#039;no manual administration&#039; policy.

&lt;p&gt;
One consequence of a command language is that everything needs to be
named. The map needs to have a coordinate system, every component
needs an identifier of some sort, and every interaction needs a short
and snazzy name. Old school wargames will do this as a matter of
course. Of course every hex has an id! Of course the cards are both
numbered and uniquely titled! But not so much for eurogames.

&lt;p&gt;
The naming we ended up with on the site is far from optimal, and
caused yet more drama due to non-online players feeling excluded from
conversations. (If you want to know more, you can see an explanation
for &lt;a href=&#039;http://boardgamegeek.com/article/17066276#17066276&#039;&gt;where
the names came from, and why they won&#039;t change&lt;/a&gt;). That bit is
unfortunate. But at least I actually find real value in having
convenient shorthands available for everything, when discussing the
game, whether when theorycrafting or conducting some tabletalk on IRC
during a game.

&lt;h3&gt;Implementation issues&lt;/h3&gt;

&lt;p&gt;
The obvious problem for a log replay system
is &lt;b&gt;performance&lt;/b&gt;. Replaying a full game, which is done for almost
every operation, can take around 0.15 seconds in the current
implementation, with no obvious low hanging fruit to fix. On the
current traffic levels server load is not a problem, but I would start
to get worried if usage increased by a factor of 10. As discussed
above, there are features I&#039;m unwilling to implement due to CPU load
concerns. And it is actually causing real development pain for testing
(see below).

&lt;p&gt;
It&#039;s hard to say exactly how much of the CPU overload is related to
command parsing, a step that could be avoided with the use of a
more structured log format. Some crude profiling suggests that the
parsing takes only 5-10% of the runtime, certainly nowhere near enough
to warrant using a different format.

&lt;p&gt;
A rewrite in a language with higher performance implementations than Perl
would almost certainly give a factor of 10 improvement on the actual
game evaluation code, moving the bottlenecks to IO. But a full rewrite
is not in the cards.

&lt;p&gt;
Another potential implementation worry is &lt;b&gt;storage&lt;/b&gt;. The current
DB size is about 250MB. Unlike CPU usage, this is a cost that
accumulates over time. Out of that 250MB maybe 75% is used by the game
logs. The logs, stored as a sequence of commands, are not a
particularly efficient form of encoding the game data. Simple lossless
compression could easily compress them by 80-90%.  Luckily disk is
cheap (this server still has 600GB free), so this should never become
a real issue.

&lt;p&gt;
Another consequence of a log replay system is that any change in the
game evaluation might &lt;b&gt;break existing games&lt;/b&gt;. That change might be a
bugfix for a place where the effect of a move was miscomputed, it
might be extra validation to prevent illegal moves of some kind,
cheating prevention, or something else entirely. This is not a
theoretical possibility. For basically every single game evaluation change
I make, there are already multiple affected games. No matter how
elementary a rule is, somebody has already broken it.

&lt;p&gt;
Obviously in a stored state implementation changes like this don&#039;t
matter. The current state is the current state no matter what. But in
a log replay system you need to have some story on how to deal with
retroactive changes. I can think of the following strategies:

&lt;ul&gt;
&lt;li&gt; Punt: Don&#039;t make any changes at all.
&lt;li&gt; Ignore: Just make the change, and don&#039;t worry about games breaking or the results changing part way through.
&lt;li&gt; Delete: Just delete any games that would be broken.
&lt;li&gt; Fixups: Find all games where the old and new behavior differ, and
change the appropriate logs in such a way that the results with the
new log and version will be the same as the result with the original log
and old version. This change could be manual or automated.
&lt;li&gt; Versioning: Each game file carries a version number. When making
a breaking change, keep both the original and new code paths, and choose
one of the two based on the version number. Any newly created games use
the new version number and get the fixes, existing games keep their original
version number and the original behavior.
&lt;li&gt; Positive options: Conditionalize the behavior on an option. Turn that option on for new games, as well as any existing games for which the new and old versions behave the same.
&lt;li&gt; Negative options: Conditionalize the old behavior on an option. Turn that option on only for existing games where the results for old and new versions differ. Never turn the option on for newly created games.
&lt;/ul&gt;
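
&lt;p&gt;
As an illustration of the last strategy, here is a sketch of a
&#039;negative option&#039; rollout. The rule being tightened (no paying more
coins than you have), the option name &lt;code&gt;legacy-allow-debt&lt;/code&gt;
and the ten-coin starting amount are all invented for the example.

```python
class IllegalMove(Exception):
    pass


def pay(coins: int, cost: int, options: frozenset[str] = frozenset()) -> int:
    """Deduct cost from coins; the new strict rule is the default."""
    if cost > coins and "legacy-allow-debt" not in options:
        raise IllegalMove(f"cannot pay {cost} with only {coins} coins")
    return coins - cost


def evaluate(log: list[int], options: frozenset[str] = frozenset()) -> int:
    """Replay a toy log of payments, starting from 10 coins."""
    coins = 10
    for cost in log:
        coins = pay(coins, cost, options)
    return coins


def needs_legacy_option(log: list[int]) -> bool:
    """Rollout step: does this existing game break under the new rule?"""
    try:
        evaluate(log)
    except IllegalMove:
        return True
    return False


# Only games that would otherwise break get the option turned on.
existing_games = {"game1": [3, 4], "game2": [8, 5]}
game_options = {
    name: frozenset({"legacy-allow-debt"}) if needs_legacy_option(log) else frozenset()
    for name, log in existing_games.items()
}
print(game_options)
```

&lt;p&gt;
Newly created games never get the option, so the old code path only
stays reachable for the handful of games that depended on it.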

&lt;p&gt;
During the lifespan of the site I&#039;ve used most of these at one time or
another. The &#039;ignore&#039; strategy was appropriate a couple of times (for
changes where I decided that the new behavior was always
acceptable, such as situations where a player had ended up overpaying
for an action). The &#039;delete&#039; strategy would be exceptional, the only
situations where I used it were games that were aborted, and one case
of a single game being completely unsalvageable due to bug abuse by a
player. The &#039;fixup&#039; strategy has the nice benefit that it avoids
introducing a new code path, and was my default choice early on. But
at this point it&#039;d be an unacceptable amount of manual work, and it&#039;s
not readily automatable. Especially with the relatively freeform input
from the command language. My next default was &#039;positive options&#039;, but
after about 3-4 of those I switched to &#039;negative options&#039;. Positive
options had a slightly more complicated rollout procedure, and also
permanently clutter up all games, confusing people. (&amp;quot;What&#039;s this
&lt;code&gt;strict-darkling-sh&lt;/code&gt; option?&amp;quot;).

&lt;p&gt;
None of these options are good. In this instance a log replay model
does introduce some major costs either to the developer (who has to do
extra work) or the users (who have some games screwed up or completely
lost).

&lt;p&gt;
But it&#039;s not all bad! A log replay model makes &lt;b&gt;testing&lt;/b&gt; much
easier. First, it&#039;d be very easy to write test cases since there is a
very natural serialization format for games already, the command
language. I don&#039;t actually write explicit tests for TM, but for
example at work we need an absurd amount of infrastructure for making it
easy to write unit tests for TCP/IP packet handling. This kind of
design gives the test cases for free. Likewise an Age of Steam
implementation I was once doodling around with had lots of test cases,
but even with the reasonably friendly format (protocol buffers) they
were an absolute pain to write due to the boilerplate.

&lt;p&gt;
If I don&#039;t write unit tests, how do I test? Mostly by &lt;b&gt;side by side
testing&lt;/b&gt;; I have
a &lt;a href=&#039;https://github.com/jsnell/terra-mystica/blob/master/src/diffgame.pl&#039;&gt;small
script&lt;/a&gt; that runs every single game in the database against both
the new and the previous version. It munges the results a bit removing
known harmless diffs, and then displays any changes from game to
game. I can then look at those games, and decide whether it&#039;s
indicating some kind of a problem with my change, an expected result
of my change, or a problem of some sort in the game. It also acts as
a great regression test that prevents failures from creeping in, and
is the source of data for finding the games that would be broken by
a change, so that one of the fixes discussed in the previous section
can be applied.
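
&lt;p&gt;
The general shape of such a side by side run can be sketched as
follows. This is not the actual &lt;code&gt;diffgame&lt;/code&gt; script; the
function names, the toy evaluators and the &lt;code&gt;ignore&lt;/code&gt;
parameter (standing in for the &#039;known harmless diffs&#039;) are
assumptions made for the example.

```python
from typing import Callable

Evaluator = Callable[[list[str]], dict]


def side_by_side(
    games: dict[str, list[str]],
    old: Evaluator,
    new: Evaluator,
    ignore: frozenset[str] = frozenset(),
) -> dict[str, dict]:
    """Return, per game, the result fields that differ between versions."""
    diffs = {}
    for game_id, log in games.items():
        before, after = old(log), new(log)
        changed = {
            key: (before.get(key), after.get(key))
            for key in sorted(set(before) | set(after))
            if key not in ignore and before.get(key) != after.get(key)
        }
        if changed:
            diffs[game_id] = changed
    return diffs


# Two toy versions of the game evaluation: the new one stops
# negative entries from lowering the score.
def old_version(log: list[str]) -> dict:
    return {"score": sum(int(c) for c in log), "version": "old"}


def new_version(log: list[str]) -> dict:
    return {"score": sum(max(int(c), 0) for c in log), "version": "new"}


games = {"g1": ["3", "4"], "g2": ["5", "-2"]}
report = side_by_side(games, old_version, new_version, ignore=frozenset({"version"}))
print(report)   # {'g2': {'score': (3, 5)}}
```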

&lt;p&gt;
This has been one of my favorite forms of testing for a long time, and
works tremendously well in a case like Online TM where we have access
to all games ever played. Thinking specifically of digital boardgames,
it&#039;s also a model that wouldn&#039;t work well without a replayable log.
The only problem is, as alluded to above, the CPU usage. Right now a
full &lt;code&gt;diffgame&lt;/code&gt; run takes about 90 minutes of CPU time on a
rather beefy machine. Even with parallelization it&#039;s not a fast
feedback cycle. (Makes me kind of miss being able to just casually run
a sxs test on a thousand machines).

&lt;h3&gt;Conclusion&lt;/h3&gt;

&lt;p&gt;
I&#039;m afraid this ended up longer than intended, despite only covering
one design decision. It&#039;s also a design decision that I feel is
overall a win. You&#039;ll have to wait for the next post for the
embarrassing technical missteps.

</description><author>jsnell@iki.fi</author><category>GAMES</category><category>PERL</category><pubDate>Mon, 08 Dec 2014 12:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2014-12-08-command-languages-as-game-ui/</guid></item><item><title>A brief history of Online Terra Mystica</title><link>https://www.snellman.net/blog/archive/2014-11-27-history-of-online-terra-mystica/</link><description>
&lt;h3&gt;What&#039;s this Online Terra Mystica thing?&lt;/h3&gt;
&lt;p&gt;
For the last couple of years my main hobby hacking project (over a
thousand commits, and probably an order of magnitude more time spent
on it than all other non-work projects combined) has been an &lt;a
href=&#039;http://terra.snellman.net/&#039;&gt;asynchronous
multiplayer web implementation&lt;/a&gt; of the brilliant board game
&lt;a href=&#039;http://boardgamegeek.com/boardgame/120677/terra-mystica&#039;&gt;Terra Mystica&lt;/a&gt; (&lt;a href=&#039;http://www.feuerland-spiele.de/en/&#039;&gt;Feuerland Spiele&lt;/a&gt;, 2012).
At the moment it&#039;s roughly 2/3 Perl, 1/3 Javascript, and uses
Postgres as the data storage.

&lt;p&gt;
It&#039;s been a fairly successful project for something that was
originally intended as a one-off. The usage statistics at the end
of November 2014 are:

&lt;ul&gt;
&lt;li&gt; Almost 6000 registered users
&lt;li&gt; About 1200 monthly active users (as in playing at least one game; not
passive use like looking at the statistics pages).
&lt;li&gt; 14000 moves executed on a normal weekday (10000 on weekends)
&lt;li&gt; 16500 games either ongoing or finished.
&lt;li&gt; Bi-monthly &lt;a href=&#039;http://tmtour.org&#039;&gt;online TM tournament&lt;/a&gt; run by
  Daniel &amp;Aring;kerlund with 400+ players.
&lt;li&gt; &lt;a href=&#039;https://github.com/jsnell/terra-mystica&#039;&gt;1038 commits&lt;/a&gt; as of this writing.
&lt;/ul&gt;

&lt;a href=&#039;http://terra.snellman.net/game/135test&#039;&gt;&lt;img src=&#039;/blog/stc/images/tm-thumb.png&#039; style=&#039;float: right; padding: 10px;&#039;&gt;&lt;/a&gt;

&lt;p&gt;
This was not supposed to be a general use
program. It was originally a one night hack to help keep track of
a hand-moderated play-by-forum game of TM, which was obviously
headed for failure due to the massive amount of errors people were
making while describing their moves in natural language or when manually
tracking their resources in the game.

&lt;read-more&gt;&lt;/read-more&gt;

&lt;p&gt;
From there the project snowballed, slowly gathering features
including just about everything I ever marked in the TODO as being
&#039;out of scope&#039;. Since I often had only very limited amounts of time to
work on this, and my expectation was always that the interest in the
site would soon fizzle out, the project management method was to always
get the maximum short-term bang for the buck.

&lt;p&gt;A project whose direction is literally guided by &#039;what can I get
done in the next two hours&#039; is of course massively path dependent; the early
decisions made with very little consideration had outsized influence
on where the site ended up. Sometimes the expedient gambles on &#039;do the
simplest possible thing&#039; failed, and the results were just rubbish. At
other times things ended up at a slightly odd local maximum. And in
some rare cases the gamble turned out to produce wonderful
and unexpected results.

&lt;h3&gt;Timeline&lt;/h3&gt;


&lt;p&gt;Future posts will discuss the actual lessons
learned; what didn&#039;t work and what did work - both in the mechanics of
programming and in the peculiarities of online boardgames. But in
this one let&#039;s just have a look at the history of the site, how long
it took for it to get features that one might consider absolutely
necessary, and how amazingly bad user experience people are willing to
put up with when it&#039;s the only way they can play their favorite game
online.

&lt;p&gt;Feel free to skip past the bulleted list if you get bored, it&#039;s
still a bit long even if I include only changes I consider fairly
major (indeed, a lot has to get filtered out given it&#039;s 1000+ commits).

&lt;p&gt;
  &lt;b&gt;2012&lt;/b&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;b&gt;December - Early January&lt;/b&gt;: The smallest program that did anything useful related to a game. I&#039;d enter moves into a text file and run the script to produce the final game state as JSON. This JSON was rendered to HTML + Canvas by some Javascript code that was half ripped off from an old project. There was some minimal rules checking and automation, and support for only 5 out of the 14 factions in the game. Users of the current site might want to see the &lt;a href=&#039;https://www.snellman.net/tmp/tm/1/&#039;&gt;old look&lt;/a&gt;.
  &lt;/ul&gt;
  &lt;b&gt;2013&lt;/b&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;b&gt;January&lt;/b&gt;: A rudimentary dynamic web site, implemented simply as a wrapper CGI script around the JSON generator script. After that a clumsy web-based editor was added for game files (a textarea that could be used to edit specific files in a git repository, no authentication except for each game having a random 160 bit identifier as part of the URL). This allowed other people to moderate their own games, as long as I created a game for them and sent the link with the secret embedded. Players would post / email a natural language description of their move to the moderator, who would then enter the moves into the admin tool using the correct syntax. Amazingly some 20 games were run using this insane system, while by all rights the project should have died there.
      &lt;br&gt;&lt;br&gt;This version of the software had automation for resolving the effects of most game events, but did very little validation to notice completely invalid moves.
    &lt;li&gt;&lt;b&gt;February&lt;/b&gt;: Added an ability to easily rewind the game state back to any time in history, to help with post-game strategy analysis. Also added a way for players to enter their own moves (a textarea in the main game view, a preview button and a save button, and some verification to make sure they could only enter their own moves). Again there was no real authentication here, just links with an embedded faction token derived from the per-game secret key.
    &lt;li&gt;&lt;b&gt;March&lt;/b&gt;: The hackiest email integration in the world: Store the email addresses of the players in the same text file with the commands. After a player has entered a move, the software would create a mailto: link with prefilled subject, content and receivers (the other players). The player would click on the mailto: link, the email would load up in their mailer (even GMail), and they&#039;d press send.
      &lt;br&gt;&lt;br&gt;Compute and display a VP projection on the last round assuming no further moves, to give players some idea of who is really winning.
    &lt;li&gt;&lt;b&gt;April&lt;/b&gt;:
      I continued to resist adding any user management or authentication. But my friend Gareth wanted a better way to manage his ongoing games than a spreadsheet, and wrote a small App Engine site into which players entered their secret game URLs. His site then used my site&#039;s API to figure out which games the player needed to act in. And it went even a bit further, by embedding the move entry UI into the same app.
      &lt;br&gt;&lt;br&gt;After a few weeks of using Gareth&#039;s site, I had to admit that he was totally right about this being required functionality. So I finally added a DB to the project for storing user accounts and game metadata, and a &#039;your games&#039; list on the front page after login. It&#039;s also only at this point in the lifetime of the site where I added a UI for people to make new games. Until then every game was created by somebody asking for a new game via email.
      &lt;br&gt;&lt;br&gt;Finally, this month also saw the addition of a statistics page on how often each faction was winning (since balance was a hot topic on the BGG forums of Terra Mystica right from the start), and soon after a list of achieved high scores for each faction and player count.
    &lt;li&gt;&lt;b&gt;May&lt;/b&gt;: This month mostly introduced all kinds of stricter validation, as the reduced barrier to entry for playing was causing significantly more illegal moves to be entered (early on players were enthusiasts of the game and thus had good knowledge of the rules; at this point people started to learn the game through the site, which was quite scary).
      &lt;br&gt;&lt;br&gt;The main new feature of the month was the &#039;planner&#039;, an alternate text entry box which could be used to enter commands arbitrarily far into the future, and check that the moves are valid and what kind of effect they have. This is useful for example for checking that you have sufficient resources for making certain moves without manual computation. Another use is leaving &#039;notes to self&#039;, so that the player doesn&#039;t need to re-evaluate the board for every single move. (Some people were suddenly playing tens of games at a time, so this was a real problem).
    &lt;li&gt;&lt;b&gt;June-August&lt;/b&gt;: This time period saw only minor fixes and improvements from the user&#039;s point of view. There was a bit of infrastructure work behind the scenes, such as moving the actual game moves into the database, though they still remained just plaintext.
    &lt;li&gt;&lt;b&gt;October&lt;/b&gt;:
      The mini expansion for TM was released at the Spiel fair in Essen. I implemented the new features the very next morning in the lobby of my hotel at Essen, with a ChromeBook, an ssh connection to the production server, and the world&#039;s worst WiFi. After some reflection I decided not to make the change visible to the public before getting back home and a more reliable work environment :-)
    &lt;li&gt;&lt;b&gt;November&lt;/b&gt;:
      I finally made the site automatically send email notifications, rather than require players to jump through the fragile mailto: hoops to let other players know whose turn it is. Replacement of the mailto-style notification of moves also required the addition of an in-site chat feature for communication.
    &lt;li&gt;&lt;b&gt;December&lt;/b&gt;:
      Another consequence of the real email support from the previous month was that players no longer needed to expose their email addresses to other players. This finally made it possible to allow players to create &#039;public games&#039; that anyone can join, rather than only play with people with whom they&#039;ve done some kind of an out-of-band email address exchange. (At this point 1500+ games had been started, amazing how far such a kludgy system could go).
      &lt;br&gt;&lt;br&gt;At the time 25-30% of moves were being entered from
      smartphones or tablets. But the move entry interface was typing
      commands like &lt;code&gt;&#039;convert 2pw to 2c. upgrade d3 to tp&#039;&lt;/code&gt;
      into a text box. What&#039;s
      wrong with this picture? :-) This month we finally got a
      slightly friendlier UI, though the textual command representation
      still remained the canonical one.
      &lt;br&gt;&lt;br&gt;
      The site finally got a ranking system: a multi-iteration version
      of the Elo algorithm, which computed not only player strengths but
      also faction strengths, and credited good results with the weaker
      factions more than good results with the strong ones.
      &lt;br&gt;&lt;br&gt;
        Finally, in very late December I went on a big refactoring
        spree to move the game from CGI scripts to a more persistent
        application server (FCGI with Plack and CGI::PSGI, but no
        framework). Eradicating all global data and all
        modification of literal data structures was way too much work;
        those were not corners worth cutting in the first place.
      &lt;br&gt;&lt;br&gt;The new UI went live a year from starting the project
      (almost exactly; from December 22nd 2012 to December 21st 2013),
      and is the point where I&#039;d consider the site to be actually usable
      by mere mortals.
  &lt;/ul&gt;
&lt;b&gt;2014&lt;/b&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;b&gt;February&lt;/b&gt;: Support for variant maps, for testing parts of
    the upcoming Terra Mystica expansion for the designers. I also added
    a map editor that could import map definitions from
    &lt;a href=&#039;http://lodev.org/tmai/&#039;&gt;Lode&#039;s TM AI&lt;/a&gt;, which the
    design team had been using for the map. The online playtest team
    proceeded to play 100 games with different map versions
    before the expansion finally went to print.
  &lt;li&gt;&lt;b&gt;April&lt;/b&gt;: A bunch of work on the expansion, which was still
    being kept under wraps. So the support for the new final scoring types and
    four of the new factions was not visible to most users at this time.
    &lt;br&gt;&lt;br&gt;
    The main user-visible change was automatically dropping players from
    games after a week of inactivity, to support the inaugural season of
    the online Terra Mystica tournament. People&#039;s irritation about others
    playing slowly had been constant ever since the addition of public
    games (95% of my games are private with a few separate groups of
    friends, so I&#039;m pretty isolated from this myself). Unfortunately
    this change did not appear to help enough.
    &lt;br&gt;&lt;br&gt;This month also saw the addition of individual profile pages,
    showing all kinds of statistics for each player (games started,
    finished, performance with given factions, performance and play
    counts against specific opponents, etc).
  &lt;li&gt;&lt;b&gt;September&lt;/b&gt;: The next attempt at reducing the anguish caused
    by slow players was to allow setting move timers other than the
    default one week (anywhere from 12 hours to 14 days). Lots of people started
    12 hour deadline games, and moved on to complaining about so many
    people dropping out. Sometimes you just can&#039;t win.
  &lt;li&gt;&lt;b&gt;October&lt;/b&gt;: Public support for the two new expansion maps,
    as well as the new final scoring types.
  &lt;li&gt;&lt;b&gt;November&lt;/b&gt;: Public support for all six new factions from
    the expansion, as well as the variable turn order variant.
&lt;/ul&gt;

&lt;p&gt;
I find it interesting that it really did basically take a year of
real time (and maybe 2 months of hacking time) before the
implementation was in a shape where I would&#039;ve thought about
publishing it. And there&#039;s no way I&#039;d put that amount of time into a
project like this up front. Usually these projects are active for a
couple of weekends before getting abandoned; the fun parts are done but
all the hard work of making it really usable remains.

&lt;p&gt;In this case people were eager to use even the incredibly crude
early versions, so I got over that hump very quickly. And at that point
every incremental improvement to the site was affecting tens, hundreds,
or thousands of people. This is of course always more motivating than
working on polishing the perfect piece of software that nobody is
using.

&lt;p&gt;There were many architectural and design decisions made along the
way that I ended up deeply regretting, and which cost me lots of time
later on. But without all those early shortcuts there would&#039;ve been no
implementation at all. Easily the best example of
&lt;a href=&#039;http://www.jwz.org/doc/worse-is-better.html&#039;&gt;Worse is Better&lt;/a&gt;
that I&#039;ve been personally involved with.
</description><author>jsnell@iki.fi</author><category>GAMES</category><category>PERL</category><pubDate>Thu, 27 Nov 2014 21:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2014-11-27-history-of-online-terra-mystica/</guid></item><item><title>Feed moving</title><link>https://www.snellman.net/blog/archive/2006-01-04-feed-moving.html</link><description>

&lt;p&gt;I&#039;ve moved my blog away from the &lt;a href=&quot;http://www.cs.helsinki.fi/&quot;&gt;University of Helsinki Department of Computer Science&lt;/a&gt; servers, where
it&#039;s been living for almost two years. In anticipation of this moment
I originally made all the links to my RSS feeds through a
&lt;a href=&quot;http://iki.fi/&quot;&gt;forwarding service&lt;/a&gt;. When the blog moves, just flip the redirector
to point elsewhere, and the users won&#039;t notice a thing!&lt;/p&gt;

&lt;p&gt;At least that was the plan.&lt;/p&gt;

&lt;p&gt;Apparently a lot of people still managed to subscribe with the
target URL of the forwarder
(http://www.cs.helsinki.fi/u/jesnellm/blog/rss-...), instead of
with the forwarder URL (/blog/rss-...),
and thus would still be seeing the old feed after the move.&lt;/p&gt;

&lt;p&gt;So now I&#039;ve just set up an HTTP 301 (permanent) redirect on
the old location. Smart RSS aggregators are supposed to update the feed URL
when seeing a permanent redirect, but judging from the access logs
few do this in practice. Instead a 301 is treated the same as a
302 (temporary) redirect.&lt;/p&gt;
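&lt;p&gt;For aggregator authors, the intended handling can be sketched in a few lines of Python. This is only an illustration: the function name &lt;code&gt;next_feed_url&lt;/code&gt; and its arguments are made up for this example and do not come from any particular feed reader.&lt;/p&gt;

```python
def next_feed_url(stored_url, status, location):
    """Return the URL to remember for the next poll of a feed.

    stored_url: the URL currently saved in the subscription list
    status:     HTTP status code of the response to fetching stored_url
    location:   value of the Location header, or None if absent
    """
    if status == 301 and location:
        # Permanent redirect: replace the saved URL, so the old
        # address is never requested again.
        return location
    # 302 (temporary) and everything else: follow the redirect for
    # this fetch only, but keep polling the saved URL.
    return stored_url
```

&lt;p&gt;An aggregator that skips the first branch keeps hitting the old address forever, which is the behaviour visible in the logs.&lt;/p&gt;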

&lt;p&gt;&lt;b&gt;Which brings me to the actual point&lt;/b&gt;: If you want to keep
on subscribing to this feed indefinitely, please check that you
aren&#039;t using the cs.helsinki.fi URL. I&#039;m hoping to graduate this
year, which might also imply losing the 301 from the old location to the
new one.&lt;/p&gt;

&lt;p&gt;While moving servers, I also took the opportunity to redo the blog as a dynamic
application (using &lt;a href=&quot;http://www.cliki.net/araneida&quot;&gt;Araneida&lt;/a&gt;)
instead of generating static pages. I&#039;ll
see whether I can procrastinate myself into adding comment support
in the future.&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>PERL</category><category>GENERAL</category><category>LISP</category><pubDate>Wed, 04 Jan 2006 05:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2006-01-04-feed-moving.html</guid></item><item><title>Golf - Deroter</title><link>https://www.snellman.net/blog/archive/2005-07-23.html</link><description>

&lt;p&gt;Long time since the last golf. Inspired by the recent announcement of
a &lt;a href=&quot;http://terje2.perlgolf.org/~golf-info/Book.html&quot;&gt;Perl Golf book&lt;/a&gt; I
took part in a Polish golf that was announced on the mailing list.&lt;/p&gt;

&lt;p&gt;Given an input string that has been &quot;encrypted&quot; with ROT-n on &lt;code&gt;STDIN&lt;/code&gt;
and a dictionary of words (sequences of letters &lt;code&gt;A-Za-z&lt;/code&gt;, not of &lt;code&gt;\w&lt;/code&gt;)
in &lt;code&gt;@ARGV&lt;/code&gt;,
the program needs to output to &lt;code&gt;STDOUT&lt;/code&gt; the original plaintext.
(&lt;a href=&quot;http://kernelpanic.pl/perlgolf-view.mx?id=54&quot;&gt;Formal rules&lt;/a&gt;).&lt;/p&gt;
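&lt;p&gt;For reference, here is what the task looks like without any golfing, as a rough Python sketch. The names &lt;code&gt;rotate&lt;/code&gt; and &lt;code&gt;derot&lt;/code&gt; are made up for this example, and it assumes plain ASCII input and tries all 26 rotations by brute force; it is not a translation of any golf entry.&lt;/p&gt;

```python
import re
import string

def rotate(text, n):
    """Rotate ASCII letters forward by n places, preserving case."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[n:] + lower[:n] + upper[n:] + upper[:n],
    )
    return text.translate(table)

def derot(ciphertext, dictionary):
    """Return the first rotation whose words are all in the dictionary."""
    words = set(dictionary)
    for n in range(26):
        candidate = rotate(ciphertext, n)
        # A word is a maximal run of letters, as in the rules.
        if all(w in words for w in re.findall(r"[A-Za-z]+", candidate)):
            return candidate
    return None  # no rotation matched the dictionary
```

&lt;p&gt;For example, &lt;code&gt;derot(rotate(&quot;Hello, world&quot;, 3), [&quot;Hello&quot;, &quot;world&quot;])&lt;/code&gt; gives back the original string.&lt;/p&gt;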

&lt;p&gt;My best
solution was 62 characters, but I figured out about an hour before
the golf ended that it was actually broken, and didn&#039;t have time to
figure out anything better than the 65.44 below, which is currently
good for second place. The apparent winning solution of 63 doesn&#039;t seem to
work either, for unrelated reasons. So the explanation might
be for the winning entry, or it might not.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;#!perl -p0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You know the drill. &lt;code&gt;-p&lt;/code&gt; handles reading the input and printing
the output. Use &lt;code&gt;-0&lt;/code&gt; to read the input in one go, instead of
a line at a time.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;INIT{%a=map{pop,1}@ARGV}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In the &lt;code&gt;INIT&lt;/code&gt; block, pop all command line parameters to make &lt;code&gt;-p&lt;/code&gt;
read from &lt;code&gt;STDIN&lt;/code&gt;. Use the removed arguments as keys in a hash table
for detecting dictionary words. Using the symbol table with
something like &lt;code&gt;$$_=1while$_=pop&lt;/code&gt; would save a few characters, but
that&#039;s incorrect since &lt;code&gt;$ARGV&lt;/code&gt; is automatically set to &lt;code&gt;&#039;-&#039;&lt;/code&gt; on entering
the main loop.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$a{$&amp;amp;}||y/B-ZA-Gb-za/A-z/while/\pL+/g
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;At the start of the main body &lt;code&gt;$_&lt;/code&gt; contains the whole ROT-n text.&lt;/p&gt;

&lt;p&gt;On the first iteration &lt;code&gt;/\pL+/g&lt;/code&gt; will match the first word (letters
only; &lt;code&gt;\pL&lt;/code&gt; is essentially &lt;code&gt;[a-zA-Z]&lt;/code&gt;). &lt;code&gt;//g&lt;/code&gt; works differently in scalar
than in list context: it will only match once per call, but the next call
will start at the location in the string where the last match ended. If
a match was found it returns true, otherwise false.&lt;/p&gt;
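&lt;p&gt;For readers who don&#039;t speak Perl, the closest Python analogue I can think of is searching with an explicit start position. A small sketch, where &lt;code&gt;words_one_at_a_time&lt;/code&gt; is a name invented for this example and the &lt;code&gt;pos&lt;/code&gt; variable plays the role of the position that Perl remembers behind the scenes:&lt;/p&gt;

```python
import re

WORD = re.compile(r"[A-Za-z]+")

def words_one_at_a_time(text):
    """Yield words the way repeated scalar //g calls would find them."""
    pos = 0
    while True:
        match = WORD.search(text, pos)
        if match is None:
            return  # like //g returning false
        yield match.group()
        pos = match.end()  # the next search resumes where this match ended
```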

&lt;p&gt;In the body of the while we first check if the word we matched
is in the dictionary. If it isn&#039;t (i.e. &lt;code&gt;$a{$&amp;amp;}&lt;/code&gt; is untrue) &lt;code&gt;$_&lt;/code&gt;
obviously isn&#039;t plaintext yet, so we rotate it by one step with
&lt;code&gt;y///&lt;/code&gt;. This contains the only tricky bits in the program:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;
&lt;p&gt;Changing &lt;code&gt;$_&lt;/code&gt; causes the scalar &lt;code&gt;//g&lt;/code&gt; to be reset, and start
matching from the start of the string.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;
&lt;p&gt;Doing the rotation backwards (A -&gt; Z, B -&gt; A, ..., Z -&gt; Y) instead
of the more intuitive direction (A -&gt; B, B -&gt; C, ... Z -&gt; A)
allows writing the transliteration in a way that saves one
character.&lt;/p&gt;

&lt;p&gt;There are six characters (&lt;code&gt;[\]^_`&lt;/code&gt;) between &lt;code&gt;Z&lt;/code&gt; and &lt;code&gt;a&lt;/code&gt;. By
adding six extra characters into the right place on the left
side of the transliteration operation (with &lt;code&gt;-G&lt;/code&gt;) we can use the
range &lt;code&gt;A-z&lt;/code&gt; on the right side, instead of specifying separate
ranges for upper- and lowercase letters. Compare:&lt;/p&gt;
&lt;/li&gt;&lt;/ul&gt;

&lt;pre&gt;&lt;code&gt;y/A-Za-z/B-ZAb-za/
y/B-ZA-Gb-za/A-z/
&lt;/code&gt;&lt;/pre&gt;
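&lt;p&gt;The trick is easy to check mechanically. The Python sketch below expands both transliterations by hand; &lt;code&gt;expand&lt;/code&gt; and &lt;code&gt;tr_table&lt;/code&gt; are helpers written just for this check, and they only model the two features of &lt;code&gt;y///&lt;/code&gt; that matter here (character ranges, and the first mapping winning when a character is repeated on the search side).&lt;/p&gt;

```python
import re
import string

def expand(spec):
    """Expand tr-style ranges such as 'B-Z' into the characters they cover."""
    def span(m):
        return "".join(chr(c) for c in range(ord(m.group(1)), ord(m.group(2)) + 1))
    return re.sub(r"(.)-(.)", span, spec)

def tr_table(search, replace):
    """Build a y/// style mapping; the first mapping for a character wins."""
    table = {}
    for src, dst in zip(expand(search), expand(replace)):
        table.setdefault(src, dst)
    return table

forward = tr_table("A-Za-z", "B-ZAb-za")   # A to B, ..., Z to A
backward = tr_table("B-ZA-Gb-za", "A-z")   # B to A, ..., A to Z

# The six filler characters that sit between "Z" and "a" in ASCII.
gap = "".join(chr(c) for c in range(ord("Z") + 1, ord("a")))

# Rotating forward and then backward should give back every letter.
round_trip_ok = all(backward[forward[c]] == c for c in string.ascii_letters)
```

&lt;p&gt;The letters B to G appear twice on the search side of the shorter version. Their second occurrences are ignored as mappings, and only serve to soak up the six filler characters in &lt;code&gt;A-z&lt;/code&gt;.&lt;/p&gt;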

&lt;p&gt;FWIW, the 65.48 by Piotr Fusik is by far the coolest solution. Wish
I&#039;d thought of that...&lt;/p&gt;
</description><author>jsnell@iki.fi</author><category>PERL</category><pubDate>Sat, 23 Jul 2005 10:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2005-07-23.html</guid></item><item><title>Golf - Rush Hour</title><link>https://www.snellman.net/blog/archive/2004-07-01.html</link><description>&lt;p&gt;The recent &lt;a href=&#039;http://groups.google.com/groups?selm=cbuda9%24su2%241%40post.home.lunix&#039;&gt;godzillagolf &lt;/a&gt;titled &lt;a href=&#039;http://terje2.perlgolf.org/~pgas/score.pl?func=rules&amp;hole=63&amp;season=1&#039;&gt;Rush Hour&lt;/a&gt; was the first golf in a while where my solution contained anything worth
  explaining. Here&#039;s all 157 characters of the solution:&lt;/p&gt;&lt;pre&gt;#!perl -n0
sub
R{$b&amp;lt;0?reverse:$_}sub
M{/
/?s^\pL^$b=$#A**pos;push@_,&quot;$&amp; $b
&quot;;$c=8*($&amp;lt
Z);s/$&amp;/ /,s/(($&amp;)\C{$c}) /$1$2/&amp;lt;++${$_=R}or&amp;M
for~~R;pop^ge:exit
print@_}M&lt;/pre&gt;&lt;p&gt;The code uses a depth-first search, which can be roughly
  divided into the following steps (the actual code doesn&#039;t
  do things quite in this order):&lt;/p&gt;&lt;ol&gt;&lt;li&gt;If current board has already been visited, backtrack to
     step 4.&lt;/li&gt;&lt;li&gt;Mark current board as visited&lt;/li&gt;&lt;li&gt;If a car is in the target space, print the moves that
     have been accumulated and quit.&lt;/li&gt;&lt;li&gt;Move one of the cars one step in either direction&lt;ul&gt;&lt;li&gt;If no cars can be moved, backtrack&lt;/li&gt;&lt;li&gt; If backtracking to here, try another car/direction&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Go to step 1.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;There are several subproblems that need to be solved to
  implement the algorithm:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Detect whether the win condition has been reached.&lt;/li&gt;&lt;li&gt;Accumulate the moves. (Preferably without using too
     much memory; I have a 149 character solution that uses 200MB,
     which isn&#039;t really justifiable for this problem).&lt;/li&gt;&lt;li&gt;Iterate over all valid moves (car/direction pairs) for a board.&lt;/li&gt;&lt;li&gt;Given a board and a move, generate another board.&lt;/li&gt;&lt;li&gt;Ensure that the backtracking works.&lt;/li&gt;&lt;li&gt;Detect whether the board has already been visited&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The board is of course stored in the original format as a string.
There&#039;s no room for any fancy data structures... Given that, here
are my solutions to the subproblems:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Just check whether the board contains a space followed by a
     newline:&lt;pre&gt;/ \n/ ? ... : ...  # If the regexp fails, win&lt;/pre&gt;&lt;li&gt;Keep the moves stored as strings in &lt;code&gt;@_&lt;/code&gt; in the correct order:&lt;pre&gt;push@_,&quot;$&amp; $b\n&quot;; # $&amp; is the current car, $b is either -1 or 1
...   # execute the rest of the algorithm. This quits on success...
pop;  # ... so reaching this line means that we&#039;re backtracking,
      # and need to remove the move&lt;/pre&gt;&lt;li&gt;Given a board, iterate over all valid car-characters in the
     string (I used &lt;code&gt;\pL&lt;/code&gt; here instead of the obvious
     &lt;code&gt;\w&lt;/code&gt; for reasons
     that are still unclear to me). For each character, generate
     a value &lt;code&gt;$b&lt;/code&gt; as either 1 or -1, so that it&#039;s guaranteed that
     for any board both values are generated at least once for
     each car. Since each line is 9 characters long and there
     are at least 2 characters in each car, each car must have
     at least one character in an even and one in an odd position
     in the string. Hence &lt;code&gt;(-1)**pos&lt;/code&gt; generates a proper value.&lt;pre&gt;s^\pL^$b=$#A**pos; ... ^ge&lt;/pre&gt;&lt;code&gt;$#A&lt;/code&gt; is just a shorter way of writing &lt;code&gt;(-1)&lt;/code&gt;.
     Unfortunately the operator precedence of unary &lt;code&gt;-&lt;/code&gt; is
     lower than that of &lt;code&gt;**&lt;/code&gt;, so the parentheses would be required.&lt;/li&gt;&lt;li&gt;First let&#039;s solve the problem only for positive values of
     &lt;code&gt;$b&lt;/code&gt; (i.e. right or down).&lt;pre&gt;$c=8*($&amp;lt Z);  # $c = 8 if car moves up/down, 0 otherwise
s/$&amp;/ /;        # Remove the first character of the car
s/(($&amp;)\C{$c}) /$1$2/
# Find last character of car that&#039;s followed by a space exactly $c
# characters from it, and substitute the space with the character.
# For example &quot;bbcc.&quot; =&gt; &quot;bb.cc&quot;. If this substitution fails,
# the move was impossible and we should backtrack.&lt;/pre&gt;To handle negative values of &lt;code&gt;$b&lt;/code&gt;, just
     conditionally reverse &lt;code&gt;$_&lt;/code&gt; before and after it&#039;s
     modified (nobody else did this in the golf, which I found
     surprising):&lt;/li&gt;&lt;pre&gt;sub R{$b&amp;lt;0?reverse:$_}
$_=R;...;$_=R&lt;/pre&gt;&lt;/li&gt;&lt;li&gt;The backtracking can be implemented just by wrapping the code inside
     a recursive subroutine and restoring the original state if the
     recursive call returns. There are three interesting bits of state:&lt;ul&gt;&lt;li&gt;$_. Saved by binding $_ again with for:&lt;pre&gt;... for&quot;$_&quot;  # Can&#039;t use ... for$_, since that just
               # aliases the current $_ to the new $_&lt;/pre&gt;Since $_ needs to be conditionally reversed in subsolution &lt;b&gt;4&lt;/b&gt;, we can just
         use the return value of R instead.&lt;pre&gt;... for~~R   # ~~ needed to give scalar context to the reverse&lt;/pre&gt;&lt;/li&gt;&lt;li&gt;The moves that haven&#039;t been tried for this board yet. Since
         these are generated from the substitution in subsolution &lt;b&gt;3&lt;/b&gt;, nothing special needs to be done.&lt;/li&gt;&lt;li&gt;The accumulated moves. This was handled correctly in
           subsolution &lt;b&gt;2&lt;/b&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;li&gt;Mark board visited by incrementing a symbolic reference using
     &lt;code&gt;$_&lt;/code&gt;.&lt;pre&gt;++$$_&lt;/pre&gt;Since &lt;code&gt;$_&lt;/code&gt; needs to be conditionally reversed again,
     the symbolic reference can be made on the value of the assignment
     instead:&lt;pre&gt;++${$_=R}&lt;/pre&gt;The other part of this subproblem is to not recurse if the board
     has already been visited. This can be done by comparing the
     return value of the increment to the final substitution in
     subsolution &lt;b&gt;4&lt;/b&gt;:&lt;pre&gt;s/...//&amp;lt;++${$_=R}or...&lt;/pre&gt;&lt;/li&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Mash these ingredients together, and add an &lt;code&gt;exit
print@_&lt;/code&gt; to actually do something with the result, and you get the
solution shown above.&lt;/p&gt;</description><author>jsnell@iki.fi</author><category>PERL</category><pubDate>Thu, 01 Jul 2004 00:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2004-07-01.html</guid></item><item><title>Golf - Matrix</title><link>https://www.snellman.net/blog/archive/2004-04-13.html</link><description>&lt;p&gt;The &lt;a href=&#039;http://terje2.perlgolf.org/~pgas/score.pl?func=rules&amp;hole=60&amp;season=1&#039;&gt;Matrix&lt;/a&gt; golf had a rather thin field (hopefully only temporary),
    but some really cool code (especially in the post-mortem).&lt;/p&gt;&lt;p&gt;The problem statement was short enough to be quoted here in full:&lt;/p&gt;&lt;blockquote&gt;Let A be an N*N matrix of zeros and ones. A submatrix S of A is any
group of contiguous entries that forms a square or a rectangle.
Write a program that determines the number of elements of the largest submatrix
of ones in A . Largest here is measured by area.&lt;/blockquote&gt;&lt;p&gt;Before going into the details, a brief example of how the algorithm
I used works. Assume the following matrix:&lt;/p&gt;&lt;pre&gt;00011
00110
01110
10110
00010&lt;/pre&gt;&lt;p&gt;The longest string of &lt;code&gt;1&lt;/code&gt;s is the &lt;code&gt;111&lt;/code&gt; on line 3, so that&#039;s the largest submatrix of one line (with an area
of &lt;code&gt;1*3=3&lt;/code&gt;). Then transform the matrix by doing a
stringwise AND on each line and the line that follows it. The last
line will be chopped off:&lt;/p&gt;&lt;pre&gt;00010
00110
00110
00010&lt;/pre&gt;&lt;p&gt;Obviously the only way to get a &lt;code&gt;1&lt;/code&gt; on the
   transformed matrix is to have one on the corresponding position in
   two successive lines in the untransformed matrix. So the string &lt;code&gt;11&lt;/code&gt; (found on both lines 2 and 3) corresponds to an area
   of &lt;code&gt;2*2=4&lt;/code&gt; in the original matrix. Repeat the
   transform:&lt;/p&gt;&lt;pre&gt;00010
00110
00010&lt;/pre&gt;&lt;p&gt;Now any 1 is going to be the result of 3 1s on consecutive
        lines, so &lt;code&gt;11&lt;/code&gt; on line two means there was a
        submatrix of area &lt;code&gt;2*3=6&lt;/code&gt; on the original matrix.
        Repeating the whole process two more times would result in
        finding an area of 4 and one of area 5. The answer for this
        matrix would therefore be 6.&lt;/p&gt;&lt;p&gt;My solution (59 characters):&lt;/p&gt;&lt;pre&gt;#!perl -lp0
s/1*/$B[$?*length$&amp;]=$&amp;/ge,/
/,$_&amp;=$&#039;
while++$?;$_=$#B&lt;/pre&gt;&lt;p&gt;As usual, we slurp the whole input into &lt;code&gt;$_&lt;/code&gt; with
&lt;code&gt;-p0&lt;/code&gt; and take care of the trailing newline with &lt;code&gt;-l&lt;/code&gt;. In addition to &lt;code&gt;$_&lt;/code&gt;, a couple of other variables
contain some interesting state. &lt;code&gt;@B&lt;/code&gt; is used for keeping
track of the largest area that&#039;s been found (an old golf trick; we&#039;re
only interested in the size of the array, not the values stored in
it). &lt;code&gt;$?&lt;/code&gt; holds the current iteration (i.e. the multiplier
for the area calculation). &lt;code&gt;$?&lt;/code&gt; is used since it can only
contain an unsigned short (0-65535), and therefore repeatedly
incrementing it in the condition of the &lt;code&gt;while&lt;/code&gt; results in
the variable overflowing to 0 after 65535 iterations. (Another
variable with a similar behaviour is &lt;code&gt;$^C&lt;/code&gt;, which holds
signed chars. I used &lt;code&gt;$?&lt;/code&gt; instead since at some point my
program couldn&#039;t handle negative multipliers)&lt;/p&gt;&lt;p&gt;As mentioned before, the program contains a &lt;code&gt;while&lt;/code&gt;-loop whose condition is just incrementing &lt;code&gt;$?&lt;/code&gt;. The body
of the while implements most of the algorithm. Some code is executed
for each string of &lt;code&gt;1&lt;/code&gt;s with &lt;code&gt;s/1*/.../ge&lt;/code&gt;.
The code in question is &lt;code&gt;$B[$?*length$&amp;]=$&amp;&lt;/code&gt;, which just
calculates the area of the submatrix that the string of &lt;code&gt;1s&lt;/code&gt; represents (by multiplying &lt;code&gt;$?&lt;/code&gt; and the &lt;code&gt;length&lt;/code&gt; of the matched substring, i.e. $&amp;), and stores something
in that index of &lt;code&gt;@B&lt;/code&gt;. In this case, the value being
stored is &lt;code&gt;$&amp;&lt;/code&gt; since (despite using &lt;code&gt;s///&lt;/code&gt;) we
don&#039;t actually want to modify &lt;code&gt;$_&lt;/code&gt; yet. This takes care of
finding the largest area.&lt;/p&gt;&lt;p&gt;To implement the transformation described earlier, a &lt;code&gt;/\n/&lt;/code&gt; is used to find the first newline in $_. After this a
stringwise AND of &lt;code&gt;$_&lt;/code&gt; and &lt;code&gt;$&#039;&lt;/code&gt; will have done
the transformation (including chopping off the last line). Once the
loop ends, we just assign &lt;code&gt;$#B&lt;/code&gt; (the largest index of &lt;code&gt;@B&lt;/code&gt;) to &lt;code&gt;$_&lt;/code&gt;, which is then printed out thanks
to the &lt;code&gt;-p&lt;/code&gt; command line argument.&lt;/p&gt;&lt;p&gt;This was a very cool golf. My only regret is missing a
completely obvious optimization of replacing the &lt;code&gt;s/1*//ge&lt;/code&gt; with a suitably crafted map, which would&#039;ve saved two strokes. Well,
not completely obvious since I only realized that it would&#039;ve been
possible when writing this post. Perhaps I should start writing these
things earlier... ;-)&lt;/p&gt;</description><author>jsnell@iki.fi</author><category>PERL</category><pubDate>Tue, 13 Apr 2004 00:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2004-04-13.html</guid></item><item><title>Golf - Subproduct</title><link>https://www.snellman.net/blog/archive/2004-03-21.html</link><description>&lt;p&gt;For some reason I usually don&#039;t seem to have time to take part
in golfs that I&#039;ve designed. Almost happened with the badly named (couldn&#039;t
come up with anything better) &lt;a href=&#039;http://terje2.perlgolf.org/~pgas/score.pl?func=rules&amp;hole=58&#039;&gt;Subproduct&lt;/a&gt; too, but in the end I decided that I didn&#039;t
really need to write that seminar report yet...&lt;/p&gt;&lt;p&gt;The problem was simple. Given a string of digits (maximum
length 20) and a maximum substring length N (maximum of 9), find the
largest product of the digits in a substring of 1..N characters. (For
example for the string 0120340 and an N of 5 the correct answer is
3*4=12). The most complex parts about the problem are handling zeros
correctly and keeping track of the maximum value encountered (the
usual golf idiom of using the length of an array for this doesn&#039;t work,
since the maximum value of &lt;code&gt;9**9&lt;/code&gt; would require an array
that&#039;s too large).&lt;/p&gt;&lt;p&gt;Here&#039;s the code (68 characters):&lt;/p&gt;&lt;pre&gt;#!perl
$_=shift;s/./$^=1;($^*=chop)&amp;lt;$\or$\=&quot;$^
&quot;for($`.$&amp;)x&quot;@ARGV&quot;/ge;print&lt;/pre&gt;&lt;p&gt;First we get the first command line parameter into &lt;code&gt;$_&lt;/code&gt; with &lt;code&gt;shift&lt;/code&gt;, and loop over it using &lt;code&gt;s///&lt;/code&gt;. The answer will be saved into &lt;code&gt;$\&lt;/code&gt; so that we can use
just a print without any arguments to print it. Of course print
without arguments will print &lt;code&gt;$_&lt;/code&gt; too, so we need to empty
&lt;code&gt;$_&lt;/code&gt; somehow. This is the reason for using &lt;code&gt;s/.//&lt;/code&gt;,
while the shorter &lt;code&gt;s///&lt;/code&gt; would otherwise also suffice.&lt;/p&gt;&lt;p&gt;For each position in the input string we&#039;ll first initialize
&lt;code&gt;$^&lt;/code&gt; to one. After that we&#039;ll loop N times through a loop,
where &lt;code&gt;$_&lt;/code&gt; has been initialized to &lt;code&gt;&quot;$`$&amp;&quot;&lt;/code&gt; (that is, all characters up to and including the one that&#039;s currently being
processed by &lt;code&gt;s///&lt;/code&gt;). In the loop, we&#039;ll chop off digits from the end of
the newly constructed &lt;code&gt;$_&lt;/code&gt; and multiply &lt;code&gt;$^&lt;/code&gt; by them. If &lt;code&gt;$^&lt;/code&gt; is larger
than the current value of &lt;code&gt;$\&lt;/code&gt;, set &lt;code&gt;$\&lt;/code&gt; to &lt;code&gt;&quot;$^\n&quot;&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;There are a lot of variations on this theme that are equally
long. I ended up submitting one of the more obfuscated ones, only to
regret it now :-)&lt;/p&gt;</description><author>jsnell@iki.fi</author><category>PERL</category><pubDate>Sun, 21 Mar 2004 00:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2004-03-21.html</guid></item><item><title>Golf - Card Trick 2</title><link>https://www.snellman.net/blog/archive/2004-03-03.html</link><description>&lt;p&gt;After missing one minigolf, I had some time to take part in &lt;a href=&#039;http://terje2.perlgolf.org/~pgas/score.pl?func=rules&amp;hole=57&#039;&gt;Card Trick 2&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The mission was to determine the result of the &#039;trick&#039; outlined
       below, when getting as input the initial layout of cards and the
       actions of the &#039;audience&#039;.&lt;/p&gt;&lt;blockquote&gt;In my hand I have 21 cards that I deal out face up, one each to 3
spread piles, until there are 3 rows of 7. You silently chose your
card and inform me of which &#039;pile&#039; your card is in (1, 2, or 3). I
then pick up each pile making sure to put the pile with your card
between the two other piles. I deal them out as before, and again you
tell me which pile your card is in. We repeat the process a third
time, and when I again pick up the piles, placing the pile with your
card in the middle, your card will invariably be the center card (11
of 21 in this case).&lt;/blockquote&gt;&lt;p&gt;By the time I&#039;d even read the rules, there was already a mass
of people with a solution of 40-41 (and Ton Hospel at 38, but we all
know that he isn&#039;t human). Given the length of those solutions, it&#039;s
obvious that the solution is going to involve some cute mathematical
formula, instead of directly manipulating the cards to execute the
trick.&lt;/p&gt;&lt;p&gt;The way I thought of the formula was this: Given an index I into the
set of cards (for example with the 21 cards below, 0-20) and the pick
P (1-3) we need a formula for determining from I and P the index which
would get translated to I.&lt;/p&gt;&lt;pre&gt;  0  1  2
  3  4  5
  6  7  8
  9 10 11
 12 13 14
 15 16 17
 18 19 20&lt;/pre&gt;&lt;p&gt;Let&#039;s determine this by hand for the interesting elements (we&#039;re
only interested in cards that are rearranged into the middle third):&lt;/p&gt;&lt;pre&gt;
    7  8  9 10 11 12 13
 1  0  3  6  9 12 15 18
 2  1  4  7 10 13 16 19
 3  2  5  8 11 14 17 20&lt;/pre&gt;&lt;p&gt;Obviously our formula looks like I&#039; = P + 3I - 22. To generalize this
we just note that 22 is the number of cards + 1 (which happens to be @F-2).
Now, to solve the problem we just need to remember that the card that
we&#039;re interested in ends up in the middlemost position (i.e. for 21 cards
in index 10, @F/2-2), and we can just repeat the formula three times to find
out the original position:&lt;/p&gt;&lt;pre&gt;  I_0    = @F/2-2
  I_1    = P_3 + 3*I_0 - (@F-2)
         = P_3 + 3*(@F/2-2) - @F + 2
         = P_3 + @F/2 - 4
  I_2    = P_2 + 3*I_1 - (@F-2)
         = P_2 + 3*(P_3 + @F/2 - 4) - @F + 2
         = P_2 + 3*P_3 + @F/2 - 10
  I_3    = P_1 + 3*I_2 - (@F-2)
         = P_1 + 3*P_2 + 9*P_3 + @F/2 - 28&lt;/pre&gt;&lt;p&gt;So that&#039;s the theory. In practice a few Perl tricks are
needed for this problem. The cards can be accessed from &lt;code&gt;@F&lt;/code&gt; by turning on autosplitting with the &lt;code&gt;-a&lt;/code&gt; switch.
The picks could also be accessed from &lt;code&gt;@F&lt;/code&gt;, but it turns out that it&#039;s easier to access them using
regexps. By crafting a suitable regular expression, we can get the
picks into the special regexp variables (&lt;code&gt;$&amp;, $&#039;, $1, etc&lt;/code&gt;). And finally, by using a repeated substitution, we can use the return
value of the operation to count the number of cards in the input (to save one
character when compared to using &lt;code&gt;@F/2&lt;/code&gt;). My final solution
is 39 characters (and a shared second place):&lt;/p&gt;&lt;pre&gt;#!perl -lpa
$_=$F[s/ ..(.)//g-27+9*$&#039;+3*$1+$&amp;]&lt;/pre&gt;</description><author>jsnell@iki.fi</author><category>PERL</category><pubDate>Wed, 03 Mar 2004 00:00:00 GMT</pubDate><guid permaurl='true'>https://www.snellman.net/blog/archive/2004-03-03.html</guid></item></channel></rss>