Stuff I read last week

About: A collection of links and short commentary, published weekly. In theory, the inclusion criteria are that I read it last week (duh) and that it was either particularly interesting or something I might want to refer to later. Subscribe with RSS or see my main blog for long-form writing.

2020-07-13

Chat Wars
Great cat-and-mouse story about MS Messenger's interoperability with AIM. I don't know if the punchline is how AOL finally made a change MS could not emulate (by exploiting a buffer overflow in AIM to do remote code execution to craft a response packet), or how MS bungled the PR response to that.
The Windows Shutdown crapfest
Another classic description of early-2000s organizational dysfunction at MS.
Sorting out graph processing
Random access vs. radix sort.
How to Periodize History: 2020 and the Decade Dilemma
The Cardinals vs. the Ordinals, on where the boundaries between decades are. What does it say that despite reading so much nitpicking about this around 2000, I never knew that the ISO 8601 definition of a decade is the one that's ostensibly wrong?

2020-06-22

Cuckoo++ Hash Tables: High-Performance Hash Tables for Networking Applications
Avoid wasting time on fetching the secondary bucket by maintaining a bloom filter of keys that required falling back to the secondary.
/technicolor-research/cuckoopp
Source code for the above. One thing I was wondering about when reading the paper was how the bloom lookup was used in practice. Turns out the result is just branched off of. Wonder if there would have been gains in making that branchless.
Don't share, Don't lock: Large-scale Software Connection Tracking with Krononat
The system the above hash table was built for. SDN NAT with some sexy performance numbers.
Memory-Efficient Hash Joins
A linear probing hash table for batch purposes that can basically get 100% occupancy by having a dense array sorted by hash key as the primary, and a bit-array with popcount tricks to find the index into the dense array. It's not at all obvious to me why this works; it seems like for any reasonable hash code length the bitmap has to be wasting tremendous amounts of memory. Especially considering that their bitmap encoding seems wasteful (two bits per possible hash code; it seems like you could get very close to one bit without any more memory accesses).

But the benchmarks claim it works, so...
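
To make the popcount trick concrete, here's a rough sketch of how I understand the basic idea. It is heavily simplified and not the paper's actual layout: one bit per possible hash code instead of their two-bit encoding, distinct hash codes only, a Python integer standing in for the bit-array, and function names that are made up for illustration.

```python
def build(entries):
    """entries: dict mapping a small integer hash code to a payload.

    Returns (bitmap, dense): bit h of bitmap is set if hash code h is
    present, and dense holds the payloads sorted by hash code, no gaps.
    """
    bitmap = 0
    for h in entries:
        bitmap |= 1 << h
    dense = [entries[h] for h in sorted(entries)]
    return bitmap, dense


def lookup(bitmap, dense, h):
    """Return the payload stored for hash code h, or None if absent."""
    if not (bitmap >> h) & 1:
        return None
    # Rank query: the number of set bits strictly below bit h is exactly
    # the position of h's payload in the dense array.
    rank = bin(bitmap & ((1 << h) - 1)).count("1")
    return dense[rank]


bitmap, dense = build({3: "a", 17: "b", 4: "c", 60: "d"})
found = lookup(bitmap, dense, 17)
missing = lookup(bitmap, dense, 5)
```

A real implementation would presumably split the bitmap into machine words and keep a running count per word, so that the rank is one popcount plus one addition rather than a scan over the whole bitmap. That part is my assumption, not something the sketch shows.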
Flexbuffers
Schema-less flatbuffers. The HN thread turned into a delightful pissing match between protobuf implementors.
40x faster hash joiner with vectorized execution
Converting a (part of a) row-order query engine to batched column-order.
MonetDB/X100: Hyper-Pipelining Query Execution
This seems to be the patient zero for database engines optimizing for branch mispredicts by batching operations on homogeneous data?
Protobluff - Design Rationale
Protobuf DOM that works entirely in-place on the original input string.

2020-06-15

Launchpad: A Rhythm-Based Level Generator for 2-D Platformers
A level generator for 2-D platformers built on a rhythm-based model of player behavior, derived from an analysis of existing platformer games.
Patterns and Procedural Content Generation
Analyzing the Super Mario Bros. levels using a framework of repeating patterns.
OOP Is Dead, Long Live Data-oriented Design
Linkbait title, but this was a good data-oriented design walk-through for a non-gaming use case. (I'm still trying to understand what counts as DoD and what doesn't, since it sometimes feels like 90% of it is common sense.)
OOP Is Dead, Long Live Data-oriented Design
Slides for the above.
Zsh and Fish’s simple but clever trick for highlighting missing linefeeds
It's a great feature, and I would never have guessed how it works.
Vectorized VByte Decoding
SIMD-ified parsing of the Google-style MSB-bit-1 varints. Don't think it'll work for one little project I've been doodling around with, due to being optimized for decoding multiple small varints with one call.
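
For reference, a plain scalar sketch of the format in question (7 payload bits per byte, least-significant group first, MSB set on every byte except the last of each value). This is the boring byte-at-a-time loop that the vectorized version is competing against, not the SIMD algorithm itself, and the function names are my own.

```python
def encode_varint(n):
    """Encode a non-negative integer as a varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set the MSB
        else:
            out.append(byte)         # last byte: MSB clear
            return bytes(out)


def decode_varints(data):
    """Decode a byte string holding back-to-back varints.

    A truncated varint at the end of the input is silently dropped.
    """
    values = []
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7               # continuation bit set: keep going
        else:
            values.append(value)     # MSB clear: this value is complete
            value = 0
            shift = 0
    return values


numbers = [1, 300, 0, 2**32]
stream = b"".join(encode_varint(n) for n in numbers)
decoded = decode_varints(stream)
```

The data-dependent branch on every byte is the part the SIMD approach tries to get rid of.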

2018-09-03

On the difficulty of nonograms
An algorithm for computing a difficulty rating for a nonogram.
Survey of Paint-by-Number Puzzle Solvers
A comparison of a bunch of nonogram solvers and the implementation strategies.
Advanced [Nonogram] Solving Techniques
The strategies a human would use for solving nonograms.
Service provider story about tracking down TCP RSTs
A good network debugging story. Why are a couple of content providers / CDNs responding to some customers of an operator with an RST, while other customers of the same operator had no problems?
The History of a Security Hole
Archaeology into an embarrassing-looking security issue in OpenBSD (x86-only). How a series of superficially safe-looking transformations to a struct definition first created a tiny crack, and then later widened the hole.

2018-05-15

Delta Pointers: Buffer Overflow Checks Without the Checks
A tagged pointer setup for detecting buffer overflows with no branches or extra memory accesses.
Never Write Your Own Database
Actually just the opposite of the title. A lovely story on why sometimes you actually need to write software to build a product, not just snap together some lego blocks.
Why not mmap()?
Mmap is so great! Just do a single system call, and all the IO is hidden behind the scenes. Oh, wait. Turns out that sometimes that transparency isn't what you want at all. (Specifically, sometimes you want non-blocking IO. Or as it happens, I found this post while looking for stuff on the MAP_POPULATE | MAP_NONBLOCK combination that Linux once supported.)
An unfinished draft of linearly-probed Robin Hood hash tables
Robin Hood hash tables with linear probing are just sorted arrays. What a great way to think about it.
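
A small sketch to convince myself of the claim. The assumptions are all mine: integer keys, a toy hash function, no deletions, and a table padded on the right so probing never wraps around (wraparound muddies the "sorted" picture a bit). The Robin Hood swap rule turns out to be exactly an insertion into an array sorted by home slot.

```python
NUM_BUCKETS = 16


def home(key):
    """Toy stand-in for a hash function: the key's preferred slot."""
    return key % NUM_BUCKETS


def insert(table, key):
    pos = home(key)
    while table[pos] is not None:
        # Robin Hood rule: if the resident is closer to its home slot than
        # the key being inserted is to its own, the resident gives up the
        # slot and is pushed onward instead. Since both distances are
        # measured from the same pos, this is the same as comparing homes.
        if pos - home(table[pos]) < pos - home(key):
            table[pos], key = key, table[pos]
        pos += 1
    table[pos] = key


def build(keys):
    # Padding on the right means a probe can never run off the end.
    table = [None] * (NUM_BUCKETS + len(keys))
    for key in keys:
        insert(table, key)
    return table


keys = [37, 5, 21, 53, 6, 22, 90, 16, 0, 15, 31, 47]
table = build(keys)
homes_in_table_order = [home(k) for k in table if k is not None]
```

Reading the occupied slots left to right, the home slots never decrease, whatever order the keys were inserted in.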
An Experimental Comparison of Thirteen Relational Equi-Joins in Main Memory
> This makes it surprisingly hard to answer a very simple question: what is the fastest join algorithm in 2015? In this paper we will try to develop an answer. We start with an end-to-end black box comparison of the most important methods. Afterwards, we inspect the internals of these algorithms in a white box comparison. We derive improved variants of state-of-the-art join algorithms by applying optimizations like software write combine buffers, various hash table implementations, as well as NUMA-awareness in terms of data placement and scheduling.
Zobrist Hashing
> Zobrist hashing starts by randomly generating bitstrings for each possible element of a board game, i.e. for each combination of a piece and a position (in the game of chess, that's 12 pieces × 64 board positions, or 14 x 64 if a king that may still castle and a pawn that may capture en passant are treated separately). Now any board configuration can be broken up into independent piece/position components, which are mapped to the random bitstrings generated earlier. The final Zobrist hash is computed by combining those bitstrings using bitwise XOR.
The punchline, of course, is that given a board state and its hash, computing the hash of the resulting board state after a move just means XORing the original hash with two bitstrings for the moved piece: one for its square before the move, one for its square after.
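
A minimal sketch of the incremental update, under simplifying assumptions of my own: 12 piece kinds on 64 squares, a board represented as a dict from square index to piece index, 64-bit bitstrings, and only plain non-capturing moves. The names are made up for illustration.

```python
import random

PIECES = 12   # 6 piece types x 2 colours
SQUARES = 64

# One random 64-bit string per (piece, square) pair. The seed is fixed only
# so that the example is reproducible.
rng = random.Random(2018)
ZOBRIST = [[rng.getrandbits(64) for _ in range(SQUARES)] for _ in range(PIECES)]


def full_hash(board):
    """Hash a whole board, given as a dict of square index -> piece index."""
    h = 0
    for square, piece in board.items():
        h ^= ZOBRIST[piece][square]
    return h


def move_hash(h, piece, src, dst):
    """Update hash h for moving piece from src to dst (no capture)."""
    # XOR is its own inverse: the first term removes the piece from src,
    # the second adds it at dst.
    return h ^ ZOBRIST[piece][src] ^ ZOBRIST[piece][dst]


board = {0: 3, 4: 5, 60: 11}
before = full_hash(board)

after_board = dict(board)
del after_board[4]
after_board[12] = 5
incremental = move_hash(before, 5, 4, 12)
```

A capture would need one more XOR, to remove the captured piece from the destination square.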
GP-Rush: Using Genetic Programming to Evolve Solvers for the Rush Hour Puzzle
Using genetic programming to choose when to use which heuristic for directing an iterative deepening A* search. Though I have to say that the game trees this paper is talking about seem ludicrously small for 2009 (1.5M states).
Using an Algorithm Portfolio to Solve Sokoban
Running multiple unrelated algorithms for Sokoban solving in parallel, on the assumption that the different algorithms have different blind spots. You get better average and worst cases by giving each of x algorithms 1/x of the CPU than by running just one with 100%. (Assuming the selection is diverse enough, of course.)

Also had some ideas about having the different algorithms exchange information, but that seems very complicated and didn't seem to pan out.

2018-04-23

Travelling murderer problem: planning a Morrowind all-faction speedrun with simulated annealing, part 1
> However, there isn't a Morrowind speedrun category where someone tries to become the head of all factions. For all its critical acclaim and its great story, most of quests in Morrowind are basically fetch-item or kill-this-person and there aren't many quests that require anything else. But planning such a speedrun route could still be extremely interesting for many reasons.

Some really neat stuff about treating speedrunning as a search/optimization problem. I was a little bit annoyed by the parts where the story strays from that, and the author instead uses human intuition to e.g. select which set of quests to do or which skills to train. Also, part 2.
Building a Bw-Tree Takes More Than Just Buzz Words
Two things you don't see often in CS: trying to replicate a systems design paper, and publishing a negative result. It also shows just how many crucial details can get left out of a systems description, making it impossible to actually implement. And when you do implement it and don't get the hoped-for performance, what then? Obviously more and more optimizations that the original system probably didn't have.

It's kind of interesting to read the original paper's HN comments after this.
The story of ispc (part 5)
> Assuming typical game theory for the jerks, here’s what the thinking would have been: I was a jerk too, and my real goal here was not to actually solve a problem, but was to leverage SIMD either to usurp the people who led parallel programming models in the compiler group or to advance some other nefarious agenda.

A personal retrospective on the development of ispc (a compiler for a shader-programming style C dialect for x86-64). What a great story of big-company intrigue and dysfunction. I'm reloading this site daily to check for new installments.

Part 1, part 2, part 3, part 4
UPnProxy: Blackhat Proxies via NAT Injections
Proxying traffic through home Wifi routers that expose UPnP to the internet. (I'd heard of malicious proxying through home routers, but I'd thought they were compromised devices rather than just misconfigured ones).

2018-04-16

Vulnerability Modeling with Binary Ninja
The sophistication of modern reverse engineering tools is pretty amazing.
ChuckMcM on "Weirdstuff Warehouse is closed"
> When a typical Silicon Valley company decides to "sell off their assets" that generally means office chairs, white boards, and the occasional espresso machine. Not test equipment, test fixtures, extra parts, and tools.

Using the closure of what was apparently a famous electronics scrap store to reflect on how Silicon Valley changed in the last couple of decades.
Salsify: Low-Latency Network Video through Tighter Integration between a Video Codec and a Transport Protocol
Expose SACKs directly to the encoder. Always use the latest fully ACKed frame as the keyframe. (Plus other things, but that felt like the interesting insight to me.)
Go: the Good, the Bad and the Ugly
A worthy new entry in the popular "why Go sucks" genre.
DD9 Kaypro Edition
> I definitely found the answer to my question about why so few graphical Kaypro programs exist. The Kaypro’s graphics are awful – it’s a text-mode machine with graphics bolted on as a box-checking exercise. That being said, the development experience was surprisingly nice and it was a lot of fun to go through the exercise of actually making a functional game for a machine slightly older than me.
NEON is the new black: fast JPEG optimization on ARM servers
On the performance and power efficiency of Xeons vs. Qualcomm's server chips on SIMD workloads. My basic assumption on CF's tech blog posts is that they're 90% PR. But this does have hard numbers, and they're pretty surprising ones (specifically the power usage / unit of work numbers, though I wish they had raw power usage as well).
A Taxonomy of Tech Debt
> This post will focus on types of tech debt I’ve seen during my time working at Riot, and a model for discussing it that we’re starting to use internally. If you only take away one lesson from this article, I hope you remember the “contagion” metric discussed below.

2018-04-09

Implementing Primitive Datatypes for Higher-Level Languages
John Cowan linked to this in a comment on my post on tagged pointers. It's a very comprehensive look at datatype implementations in (mostly) Lisp implementations.

(Is this by the same Stan Shebs who wrote XConq back in the day?)
Understanding and Mitigating Packet Corruption in Data Center Networks
There are a bunch of different reasons why packets might get corrupted in-flight. This research identifies signals for distinguishing between those cases (+ congestion-induced packet loss) and recommends specific maintenance tasks to fix the problems.
Faster: A Concurrent Key-Value Store with In-Place Updates
A design for a key-value store for update-heavy applications.
A critical reflection on GDPR
A critique of the GDPR as a piece of legislation from somebody who a) appears to be a privacy activist, b) works as a GDPR DPO.
Interleaving small reads of multiple files – why World of Tanks 1.0 has abysmal loading times on HDDs
Another story of debugging and mitigating a problem in a closed source program.

2018-04-03

Handling Overload
> This is a long-ish entry posted after multiple discussions were had on the nature of having or not having bounded mailbox in Erlang.
Apparent use of Sandvine devices for malicious or dubious ends in two countries
Multi-use software can be used in bad ways. If you sell such software to authoritarian governments (or government-controlled companies), it'd be good to have controls on exactly what they can do. Obviously that doesn't work if the system is arbitrarily scriptable, but few systems are.

But what really offends me about this article is just what garbage the Procera traffic rewriting implementation clearly was.
Surprising Creativity: Anecdotes from Evolutionary Computation
Just what it says in the title. Stories about genetic algorithms etc. generating unexpected results.
Performance Under Load
Reading through this, I kept thinking that I'd pretty recently read about someone else using TCP congestion control for RPC queue management. And indeed I had, it was this post by Evan Jones. First time this linkblog actually did what I intended it for! ;-)
Taking down Gooligan
The inner workings of an Android OAuth-token stealing botnet. [part 2] [part 3]
Death of the sampling theorem?
Not actually the death of the sampling theorem. But an absolutely brutal takedown of some dodgy signal processing research. The punchline:

> As so often, one does have to ask: How did these dramatic claims get through peer review? Given the obvious conflict with the Sampling Theorem, weren’t some eyebrows raised in the process? Who reviewed these submissions anyway? Well, I did. For a different journal, where the manuscript ultimately got rejected.
Gron: A command line tool that makes JSON greppable
A tool to transform JSON to a line-based format, where each line is prefixed with a path. And a tool to transform from that format back to JSON. Such a clever idea.
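
A toy sketch of the forward transformation, to show how little there is to it. It's loosely modelled on gron's output but is not gron's actual implementation: it does not handle keys that aren't valid identifiers, it leaves out the reverse direction entirely, and the function name is my own.

```python
import json


def flatten(value, path="json"):
    """Yield one 'path = value;' line per node of a parsed JSON document."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Assumes every key is a plain identifier.
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Scalars are written back out as JSON literals.
        yield f"{path} = {json.dumps(value)};"


doc = json.loads('{"name": "x", "tags": ["a", 1], "ok": true}')
lines = list(flatten(doc))
```

Since every line carries its full path, a plain grep for something like `tags` returns lines that still make sense on their own.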
Procedural Worlds from Simple Tiles
Procedural map generation using (cleverly designed) Wang tiles.
"The DNS Camel", or, the rise in DNS complexity
The DNS protocol design is becoming increasingly detached from the practice, leading to increasingly complex and bug-prone features.
PubGrub: Next-Generation Version Solving
The new version resolution algorithm for Dart's package manager, with special emphasis on error messages. The contrast between this and the recent work on Go package versioning is pretty interesting.
