It's possible for multiple houses (e.g. in different cities) to have the same street name and number. What's the closest pair of such "twins" in the UK?
I truly appreciate the dedication that went into researching this bit of trivia.
Why POSIX filesystem semantics aren't a good fit for large scale systems. (The funny thing is that it never occurred to me that anyone would use POSIX APIs in a modern distributed context. But apparently in supercomputing they do).
How do you enjoy the process of creating a new language, when you've been writing compilers for a long time? By adding artificial restrictions. Only assembly; no libraries, programming languages, or code generators.
Implement a minimal BCPL-like language in assembly. Then use that to implement a Lisp interpreter, that interpreter to create a reasonably featured VM (e.g. garbage collector, delim. continuations). Write a compiler, assembler, disassembler, linker targeting the VM. Then use these tools to write the language you originally wanted with objects, pattern matching, and non-sexp macros.
Thoughts about what features a REPL (and the language being evaluated in the REPL!) should have to be useful.
Read this as part of some archaeology into numeric representation in early Lisp systems. But it actually turned out to be a pretty neat systems paper in general. One thing that's striking is how readable this 50-year-old paper still is. The vocabulary of systems programming has changed surprisingly little (just switched from words to bytes), and even the problems being solved are at the core the same. It's all about memory hierarchies, even at the dawn of computing.
Paper describes an early version of BBN Lisp for a machine with 16K words of core memory, and 88K words of absurdly slow drum memory. Hardware has no paging support. How do you make efficient use of the drum memory, to fit in meaningful programs? So you need to somehow do paging in software, and reorganize the data layouts to minimize pointer chasing and page faults. (The latter bit is what I was really interested in, while looking at the history of tagged pointers).
Take a bitset implementation that splits the set into blocks, and adaptively uses the best data representation for each block. How do you determine which internal representations actually make sense?
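One way to make the "adaptive per-block representation" idea concrete is a pure size-based rule: store a block as a sorted array of 16-bit values while that is smaller than a dense bitmap, and switch to the bitmap otherwise. This is only a minimal sketch under assumed parameters (64K values per block, two candidate representations); the names `encode_block`, `build` and `contains` are made up for illustration and are not taken from the linked post.

```python
import bisect
from array import array

BLOCK_BITS = 16                    # assumption: 2**16 values per block
BLOCK_SIZE = 1 << BLOCK_BITS
BITMAP_BYTES = BLOCK_SIZE // 8     # a dense bitmap always costs 8192 bytes


def encode_block(lows):
    """Pick the cheaper of two representations for one block."""
    lows = sorted(set(lows))
    if 2 * len(lows) < BITMAP_BYTES:       # sorted array: 2 bytes per value
        return ("array", array("H", lows))
    bitmap = bytearray(BITMAP_BYTES)
    for v in lows:
        bitmap[v >> 3] |= 1 << (v & 7)
    return ("bitmap", bitmap)


def build(values):
    """Group values by their high bits, then encode each block separately."""
    groups = {}
    for v in values:
        groups.setdefault(v >> BLOCK_BITS, []).append(v & (BLOCK_SIZE - 1))
    return {high: encode_block(lows) for high, lows in groups.items()}


def contains(bitset, v):
    block = bitset.get(v >> BLOCK_BITS)
    if block is None:
        return False
    kind, data = block
    low = v & (BLOCK_SIZE - 1)
    if kind == "array":
        i = bisect.bisect_left(data, low)  # binary search in the sorted array
        return i < len(data) and data[i] == low
    return bool(data[low >> 3] & (1 << (low & 7)))
```

A real implementation would also have to weigh the speed of set operations on each representation, not just the storage size, which is presumably where the interesting part of the question lies.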
Slides with anecdotes on game optimization in general, but on the Jaguar CPU in particular. E.g. didn't realize you really have to use SIMD on those CPUs, or you can't even use the full cache bandwidth. Neat example of a custom spatial database near the end.
"tl;dr don't bother".
The painful step-by-step journey of implementing a seemingly trivial optimization in a production compiler. Especially the "Lessons Learned" part is great; I'm fighting the temptation to just quote all of it here.
> I switched to a 12” MacBook before I started working on my swiftc PR. It was so slow that I was only able to iterate on the code once a day, because a single compile and test run would take all night. I ended up buying a top-of-the-line 15” MacBook Pro because it was the only way to iterate on the codebase more than once a day.
> It’s really easy to break swiftc because of how complex it is. My original pull request was approved and merged in a month. Despite only having about 200 lines of changes, I received 125 comments from six reviewers. Even after that much scrutiny, it was reverted almost immediately because it introduced a memory leak that a seventh person found after running a four hour long standard library integration test.
Yes, it's a Bitcoin article. But it's also really good!
> Bitcoin neatly avoids the double-spending problem plaguing proof-of-work-as-cash schemes because it eschews puzzle solutions themselves having value.
Example of how some of the new features in the C++ standard will work together.
A chip reverse engineering story with the best digressions. It's not just about figuring out that the supposed RAM chip is actually a touch tone dialtone generator; it's also figuring out the maths on every dialtone generator on the market to exactly identify this one. And then going into some semiconductor physics for good measure.
An introduction to Futamura projections, phrased in terms of physical objects rather than partial evaluation of source code.
> In practice, a message broker is a service that transforms network errors and machine failures into filled disks. Then you add more disks.
On why you probably want either a load balancer or a database, not a pubsub system.
Slava Pestov reads through The NeWS Book: An Introduction to the Network/Extensible Window System from 1989. I never knew anything about NeWS, except from the Unix Haters Handbook X11 rant, so it was nice to fill it in with some more facts.
> Specifically, what I needed was mostly like a tree diff but I wasn’t optimizing for the same thing as other algorithms, what I wanted to optimize for was resulting file size, including indentation.
Many people don't appreciate how complicated handling configuration data is in the real world. (Pretty much every one of my jobs has at some point turned into a configuration handling nightmare). This is a good story on exactly that. There's a need for a seemingly very simple config manipulation operation, but a couple of weeks later you find yourself doing dynamic programming.
(Also, this is not just a good story, but a great example of just how to present an algorithm).
A walk through early CPU branch prediction strategies.
How to make a practical web search system using bloom filters rather than an inverted index. I especially like the notes on how the classical problems of signature-based indexes don't really matter in this domain. E.g. a modest amount of false positives is not a problem, since the full result set needs to be scored no matter what. Or how sharding the index by number-of-unique-terms was impractical in the past due to excessive disk seeks, but is no problem when the index needs to be sharded to hundreds of machines anyway.
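The basic mechanics can be sketched in a few lines: each document gets a bloom filter signature of its terms, a query matches every document whose signature has all of the query's bits set, and false positives are dropped in the later pass that has to look at every candidate anyway. This is a toy illustration, not the layout of the system described; the signature width, the number of hashes, and all function names are assumptions.

```python
import hashlib

NUM_BITS = 256    # assumption: tiny signature width, for illustration only
NUM_HASHES = 3    # assumption: hash functions per term


def term_bits(term):
    """Bit positions that a single term sets in a signature."""
    positions = set()
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(f"{i}:{term}".encode()).digest()
        positions.add(int.from_bytes(digest[:4], "big") % NUM_BITS)
    return positions


def signature(text):
    """Bloom filter over all terms of a text, stored as one big integer."""
    sig = 0
    for term in text.lower().split():
        for pos in term_bits(term):
            sig |= 1 << pos
    return sig


def candidates(signatures, query):
    """Documents whose signature covers every bit of the query (may over-match)."""
    mask = signature(query)
    return [doc for doc, sig in signatures.items() if sig & mask == mask]


def search(docs, signatures, query):
    """Filter candidates exactly; a real system would score them here instead."""
    terms = set(query.lower().split())
    return [doc for doc in candidates(signatures, query)
            if terms <= set(docs[doc].lower().split())]
```

The filter can never miss a matching document, only admit extra ones, which is why the false positive rate is a cost question rather than a correctness question.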
A deep dive into how MVCC works in Postgres, from concepts all the way down to the exact source code.
Reverse engineering the microcode in Athlons and Phenoms. Half of this work was done by mutating existing microcode update files, and probing the behavior of various instructions in a minimal operating system. The other half was done by delayering a CPU and using an electron microscope to find and read the microcode ROM.
Then write a proof-of-concept remotely triggerable trojan in microcode.
Another trip to crazytown. How Windows Vista would artificially limit network throughput if any sound was playing. (With an effect that would be magnified linearly as more NICs were added to the machine). Brought up in the HN discussion of my PS4 download speed post.
I like the idea of treating programming languages as a creative work to be reviewed critically.
A case study in how not to change defaults when evolving a program from one use case to another. (Any blog platform will inevitably try to transform into a general purpose CMS and call a dystopian hellscape of ecommerce plugins an "ecosystem"). But I can't understand how anyone would think that changing the default RSS feed item count from 10 (which sounds pretty standard) to infinite could be the right thing.
A good discussion on the problems with transparent huge pages. (I turn them off at work for our data analysis machines, due to some absolutely crippling throughput issues they cause. Really need to check whether that server is already running on a 4.6+ kernel, with the supposedly improved THP behavior mentioned in this thread.)
Why does the Linux load average include processes that are blocked on swapping? (Never realized it did; thought it used the classical definition). You know it's good software archaeology when it's dealing with something that's still relevant today, and the search bottoms out in MACRO-10 code.
> font-size is the worst.
Just how hard can it be to determine which font size should be used for an element based on the CSS? Pretty damn hard, it turns out.
> To recap, we are now at four different notions of font size being inherited: ...
Why and how to deprecate a programming language.
The thesis here is that the Linux kernel isn't a monorepo. Instead it's a monotree with multiple repositories. There are multiple repositories, e.g. the main one by Linus, subsystem specific ones, etc. Hence not a monorepo. But all of those repositories are rooted in the same tree, with changes flowing between the repos arbitrarily (so they're not polyrepos, which would generally need to be totally independent of each other). Hence the need for the new term.
Unsurprisingly, GitHub doesn't support this fairly unique workflow.
Computer science paper recommendations from Fabian Giesen, with long summaries of exactly why these papers are particularly useful/interesting.
An HN comment from 2015 explaining why the 6502 instruction set encouraged a SOA layout over AOS.
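A rough way to see the difference, modelling memory as flat byte arrays: with an array of structures, reaching a field means computing index times record size plus offset, and the 6502 has no multiply instruction. With a structure of arrays, each field lives in its own table, and the same 8-bit index register value works for every table via indexed addressing. The sketch below only models the address arithmetic; the three-field record and all names are invented for illustration, and the comment's own argument may emphasize other points.

```python
RECORD_SIZE = 3   # assumption: hypothetical record with x, y and hp byte fields


def make_aos(records):
    """Array of structures: records laid out back to back in one buffer."""
    mem = bytearray()
    for record in records:
        mem.extend(record)
    return mem


def aos_get(mem, i, field):
    # Needs a multiply (or a shift/add sequence) for every access.
    return mem[i * RECORD_SIZE + field]


def make_soa(records):
    """Structure of arrays: one table per field, all indexed the same way."""
    return [bytearray(column) for column in zip(*records)]


def soa_get(tables, i, field):
    # The table is picked by a fixed base address; i is used as-is.
    return tables[field][i]
```

With a single base address and an 8-bit index, the AoS form of this record only reaches 85 entries, while each SoA table reaches the full 256.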
> But CSS wouldn’t be introduced for five years, and wouldn’t be fully implemented for ten. This was a period of intense work and innovation which resulted in more than a few competing styling methods that just as easily could have become the standard.
A survey of the early history of HTML styling languages.
Reverse engineering all the Pokemon games, to extract the full list of Pokemon stats and graphics. The particularly interesting bit here is the evolution in how the games have been storing their data.
Mike Hearn on the hard lessons about user account authentication learned at Google. I think I disagree about the ultimate conclusion about it being futile to implement your own system and just use OAuth to piggyback on Google/FB auth. Or that the only good alternative is session-token generating email links. As a user I don't think I'd like either of those. But it's still super important to be aware of the actual issues.
On the social implications of rating systems in games. What happens when the output of a rating system stops being used as a prediction, and instead becomes a status symbol? (And an argument for keeping MMRs purely hidden, while making the public "ranks" something you can advance on by sufficient grinding).
Packing functions in memory such that caller/callee are more likely to be on the same cache line / same page. (Surprising to see the "same cache line" part actually happens 5% of the time; the ITLB improvements make a lot more intuitive sense). Do this using callgraph information collected continuously from production machines.
Then use the same mechanism for keeping the very hottest code in huge pages. Can't do this universally, due to the tiny number of hugepage I-TLB entries.
5-10% improvements across a selection of Facebook's services.
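To give a feel for call-graph-driven function ordering, here is a deliberately simplified greedy version: walk the call edges from hottest to coldest and glue the callee's cluster onto the end of the caller's cluster, so that frequently paired functions end up adjacent in the final layout. This is a sketch of the general idea only, not the algorithm from the paper; it ignores function sizes and page boundaries, and the input format and the name `layout` are assumptions.

```python
def layout(call_counts):
    """call_counts maps (caller, callee) to the number of observed calls."""
    cluster_of = {}
    for caller, callee in call_counts:
        for fn in (caller, callee):
            cluster_of.setdefault(fn, [fn])

    # Hottest edges first; ties broken by name to keep the result deterministic.
    edges = sorted(call_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for (caller, callee), _count in edges:
        a, b = cluster_of[caller], cluster_of[callee]
        if a is b:
            continue                 # already placed together
        a.extend(b)                  # put the callee's cluster after the caller's
        for fn in b:
            cluster_of[fn] = a

    # Emit every cluster once, in order of first appearance.
    seen, order = set(), []
    for cluster in cluster_of.values():
        if id(cluster) not in seen:
            seen.add(id(cluster))
            order.extend(cluster)
    return order
```

For example, with `parse -> lex` as by far the hottest edge, `parse` and `lex` are merged first and stay next to each other regardless of how the colder edges are merged afterwards.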
(Excluding FreeBSD on their CDN servers, of course). Asked in the context of Gregg being an ex-Solaris hacker.
It's very easy for people to underestimate how big the cumulative effect from 20 years of even slightly faster improvements ends up being. E.g. were there any major enhancements to the Illumos TCP stack in this decade? If there were, it's at least not obvious. Or (since I dug this post out due to a "Why would people run Linux instead of OpenBSD" discussion), anyone wanting to run a major Internet service on OpenBSD would probably need to hire 1-2 fulltime hackers to modernize the TCP implementation.
That's just the bit of operating systems I'm familiar with. But hard to believe it would somehow be a unique problem area.
An investigation of HTTP middleboxes all over the internet. How do they behave, and how do you fool them into doing things they weren't meant to do.
The Gamecube had a GPU with some programmable parts, rather than being purely fixed-function. For Dolphin to emulate that, they need to compile the Gamecube GPU programs to modern GPU shaders. But this compilation takes time, and they don't know the set of needed shaders up front (it's fully dynamic). How do you solve that?
> But what if we don't have to rely on specialized shaders? The crazy idea was born to emulate the rendering pipeline itself with an interpreter that runs directly on the GPU as a set of monstrous flexible shaders.
The great thing about Dolphin updates is that they don't just explain what a new feature is; they explain what other solutions have been tried or proposed, and why those solutions don't actually work.