Mobile TCP optimization - lessons learned in production

Posted on 2015-08-25 in Networking

I did a keynote presentation at the SIGCOMM'15 HotMiddlebox workshop, "Mobile TCP optimization - Lessons Learned in Production". The title was set before I had any idea of what I'd really be talking about, just that it'd be about some of the stuff we've been working on at Teclo. So apologies if the content isn't an exact match for the title.

This post contains my slides, interleaved with my speaker's notes for that slide. It won't be an exact transcription of what I actually ended up saying, they were just written to make sure that I had at least something coherent to say re: each slide. We've got an endless supply of network horror story anecdotes, and I can't actually remember which ones I ended up using in the talk :-/

I'm particularly happy that my points on transparency of optimization got a positive reception. To us it's a key part of making optimization be a good networking citizen, and has seemingly been getting short shrift so far. Hilariously the other TCP optimization talk at the workshop brought up a transparency issue we'd never had to consider, lack of MAC transparency causing a Wifi security gateway to think connections were being spoofed.

Thanks to Teclo for letting me talk about some of this stuff publicly, and to everyone who attended HotMiddlebox. It was a lot of fun, and I got a bunch of useful information from the hallway discussions.

... Continue reading ...

Use cases for CHANGE-CLASS in Common Lisp

Posted on 2015-07-27 in Lisp

This is a post on use cases for Common Lisp's CHANGE-CLASS operation [0]. As the name suggests, it changes the class of an object without changing its object identity. It's an operation that a certain class of programmers would consider totally abhorrent. I think it's both cool and useful.

As far as I an see, the class of an instance has three effects in Common Lisp. It determines the set of slots the object has, it determines which methods will be executed when a generic function is called with that object as one of the arguments, and it determines how the object interacts with the rest of the system based on the metaclass of the class of the object.

... Continue reading ...

Detecting cheaters in an asynchronous online game

Posted on 2015-07-22 in Games

Introduction

This post is a description of some tools and data analysis I did for detecting players using multiple user accounts in an asynchronous online game. The code is available at GitHub.

A couple of months ago one of the players on my Online Terra Mystica site had some concerns that some of the players in the tournament were playing with multiple accounts. So I decided to do a bit of digging into the logs to see whether it was really happening or just paranoia.

... Continue reading ...

Unit testing a TCP stack

Posted on 2015-07-09 in Networking

Last year in an online discussion someone used in-kernel TCP stacks as a canonical example of code that you can't apply modern testing practices to. Now, that might be true but if so the operative phrase there is "in-kernel", not "TCP stack". When the TCP implementation is just a normal user-space application, there's no particular reason it can't be written in a way that's testable and amenable to a test driven development approach.

... Continue reading ...

Updated zlib benchmarks

Posted on 2015-06-05 in General

Last year I wrote a small benchmark suite to benchmark the various zlib optimization forks that were floating around. There's a couple of reasons to update those results. First, there were major optimizations added to the Cloudflare fork. And second, there's now a new entrant, zlib-ng which merges in the changes from both the Intel and Cloudflare versions but also drops support for old architectures and cleans up the code in general.

I'll write a bit less commentary this time, so that the results will be easier to update in the future without a new post. The big change compared to the 2014-08 results is that the Cloudflare version is now significantly faster particularly on high compression levels, but there are smaller improvements on all compression levels. Except for compression level 1, it seems like the preferable version now for pure speed.

... Continue reading ...

What's wrong with pcap filters?

Posted on 2015-05-18 in Networking

Introduction

I recently watched a video of a great talk on the early days of pcap by Steve McCanne. The bit on how the filtering language was designed - around the 26 minute mark but you might want to start at 20 minutes if you're unfamiliar with BPF - was one of the best stories about creating a new "little language" I've heard.

But that got me thinking a bit. This language is a tool that I use daily, that I'm generally happy with, but that also drives me absolutely crazy sometimes. This post is an attempt to look at some classes of problems that the pcap filtering language fails on, why those deficiencies exist, and why I continue using it even despite the flaws.

Just to be clear, libpcap is an amazing piece of software. It was originally written for one purpose, and it really is my fault that I end up too often using it for a different one. There's three very different use cases that I have for a packet filtering language (others may have more).

  • Small and simple filters to pick out a specific slice of traffic (single protocol, single flow, or single host). I believe it's fair to say that this is what the language was originally designed for.
  • Potentially complex filters for classifying traffic with real-time constraints and with no state, usually when using the filters for configuration rather than as an exploratory tool. This is where pcap is clumsy even when it generally works.
  • Offline analysis at higher protocol layers that'd benefit also benefit from tracking the high level protocol state between packets. You can sometimes coerce pcap to work for this use case, but it's super-awkward. It's also worth noting that features that are beneficial for this use case would not be welcome in the others. (Being able to run an arbitrary PCRE regexp on the packet payload? Great when doing offline analysis, unacceptable for real-time classification).

I try to do the third case with tools better suited for that, and only have a couple of complaints (e.g. VLAN support) on on first case. Mostly the pain comes from the middle case. So as we start the tour of annoyances, keep in mind that I'll often complain about a tool not doing a job it wasn't meant for.

... Continue reading ...

"It's like an OkCupid for voting" - the Finnish election engines

Posted on 2015-05-11 in General

Have I ever told you about the time I built an "OkCupid for elections" for the communists?

No? That's strange, I tend to get good mileage out of that story during election season. Unfortunately for the story to make any sense, you'll need a bit of absolutely fascinating background information on how elections work in Finland, and especially how websites that tell people whom to vote for became an integral part of it.

... Continue reading ...

A Monte Carlo simulation of Red7

Posted on 2015-03-30 in Games, Lisp

Red7 is a very clever little card game, and one of my favorite 2014 releases. But I have wondered about the density of meaningful decisions in the game. Sometimes it doesn't feel like you have all that much agency, and are just hanging on in the game with a single valid move every time it's your turn.

So here's some automated exploration of what a game of Red7 actually looks like from a statistical point of view. The method used here is a pure Monte Carlo simulation, with the players choosing randomly from the set of their valid moves.

Why a Monte Carlo simulation? I started trying to do a full game tree for a given starting setup but to my surprise the game tree is actually too large for that to be feasible; 2 weeks of computation even for a single two player game and a lot of optimization. The branching factor is just much bigger than it feels like when playing the game.

... Continue reading ...

Can't even throw code across the wall - on open sourcing existing code

Posted on 2015-03-19 in General

Starting a new project as open source feels like the simplest thing in the world. You just take the minimally working thing you wrote, slap on a license file, and push the repo to Github. The difficult bit is creating and maintaining a community that ensures long term continuity of the project, especially as some contributors leave and new ones enter. But getting the code out in a way that could be useful to others is easy.

Things are different for existing codebases, in ways that's hard to appreciate if you haven't tried doing it. Code releases that are made with no attempt to create a community around it and which aren't kept constantly in sync with the proprietary version are derided as "throwing the code across the wall". But getting even to that point can be non-trivial.

... Continue reading ...

Podcast on mobile TCP optimization

Posted on 2015-03-14 in Networking

I was recently a guest on Ivan Pepelnjak's (ipspace.net) Software Gone Wild podcast, talking about TCP acceleration in mobile networks, as well as whining in general about how much radio networks suck ;-) Thanks a lot to Ivan for the opportunity, it was fun!

You can listen to the podcast episode here.