Summary

This is a post in two parts. The first part discusses the current state and historical trends of HTTPS adoption in traffic carried over mobile networks. The second part explains why this change might well be beneficial to mobile operators, even though the transition from unencrypted to encrypted traffic appears potentially harmful to them.

What's the traffic share of HTTPS?

Based on measurements from a number of mobile operators around the world, a typical 3G or LTE network currently carries around 20-25% HTTPS/SSL traffic. In some networks we've seen SSL traffic shares of over 35%. The transition has been very rapid, and given the technological and social trends, it seems likely to continue. (For example, this week's announcement of Let's Encrypt, which removes any remaining barriers to entry, could matter a lot for the long tail of websites.)

When we did the first deployments of Teclo's product for mobile TCP optimization around 4 years ago, the traffic mix was consistently dominated by HTTP. At the time, a typical split would be something like 95% HTTP, 2-3% SSL, with the remaining 2-3% split among other kinds of TCP traffic and UDP. (I have heard that operators in some geographical regions are seeing much higher levels of UDP traffic than that.)

Over time, there was at best a very slow trend for SSL replacing HTTP, maybe a percentage point a year, such that 5% would have been a typical proportion of SSL traffic even 2 years later. But in 2013 something changed, and by the end of the year the average network was 10-15% SSL. The share has been increasing rapidly ever since. In the most extreme case I'm aware of, SSL traffic grew from X% to (X+10)% of the total volume in three months between us doing a trial in late 2013 and a final deployment in early 2014.

(Note: All the above numbers are specifically for downlink traffic. Uplink traffic is much more encryption-heavy. The SSL traffic share for uploads can easily be over 70%. But my experience is that in mobile networks downlink traffic volumes are 10x higher than uplink volumes.)
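To get a feel for what the note above implies for total volume, the downlink and uplink shares can be combined, weighted by volume. This is only a rough sketch: it assumes a 22.5% downlink share (the midpoint of the 20-25% range quoted earlier), a 70% uplink share, and the 10:1 volume ratio, and the function name is made up for illustration.

```python
def combined_share(down_share, up_share, down_to_up_ratio):
    """Volume-weighted encrypted share across both directions."""
    down_volume = float(down_to_up_ratio)  # downlink volume, in units of uplink volume
    up_volume = 1.0
    encrypted = down_share * down_volume + up_share * up_volume
    return encrypted / (down_volume + up_volume)


# Assumed inputs: 22.5% downlink, 70% uplink, downlink volume 10x uplink.
share = combined_share(0.225, 0.70, 10)
print(f"{share:.1%}")
```

With these inputs the combined share comes out at roughly 27%, so the heavy uplink skew moves the overall figure by only a few percentage points.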

These numbers don't exactly line up with other reports I've seen. For example the numbers from Sandvine suggest much lower SSL shares, somewhere in the 6-8% range. My guess is that in that study some SSL traffic is being categorized as specific application traffic based on IP addresses. Since these things can vary a lot between countries and networks, I'd be very interested in hearing of any other public sources for this kind of information.

HTTP handling in mobile networks

So what does the rise of SSL mean for mobile networks? Theoretically it could mean nothing; it's all just TCP in the end. But in practice many operator networks contain an amazing number of HTTP-specific or HTTP-aware nodes: caching, legal intercept / request logging, DPI, video compression or pacing, image compression, ad insertion, header enrichment, and probably many other types that I'm not familiar with. The presence of all these nodes makes perfect sense in a world where 95% of the traffic is HTTP.

To a first approximation none of these boxes are going to be useful in an encrypted world. The arguments in favor of HTTPS (example) concentrate mostly on why encrypted traffic is a win for both the end user and the content provider. You will, after all, not find many users who are happy about being tracked through header enrichment, or content providers who like having ads (or other content) inserted into their websites by a middlebox.

Now, of course, from an operator's point of view most of these network elements either made business sense at some point (for example reducing infrastructure costs through caching and compression), or they are legally required. One could therefore expect that the effective obsolescence of these boxes would be harmful to the operator. And that might be true; I certainly don't know enough to estimate how important each of these devices is to the bottom line.

But I do know a bit about TCP performance problems in mobile networks. And at least in that area I believe that many operators should be very happy to see a move away from HTTP. Because in an astoundingly high percentage of networks, the single network element that's most harmful to network performance is some kind of HTTP proxy. Below is a list of some of the problems introduced by HTTP-handling core network nodes that we've seen, many in multiple networks:

  • No support for TCP window scaling in one or both directions of traffic.
  • Retransmits of tens of kilobytes of data instead of one segment on a retransmit timeout (which commonly happens in radio networks). In one network, 1.5% of the radio capacity was wasted on the spurious retransmits from this node.
  • Using New Reno as the congestion control algorithm in networks with high levels of random packet loss.
  • An initial TCP congestion window of 2, with no configuration options to increase it.
  • No MTU clamping on the mobile side of the proxy, resulting in an effective MSS higher than 1380 bytes. This causes the GTP tunneling on the Gn interface to split each full size TCP packet into multiple UDP packets, which can cause up to 5% waste of backhaul capacity.
  • Mismatching MTU sizes on the mobile and internet facing interfaces, causing a high proportion of tiny packets mixed in with maximum size packets, due to the packet sizes on the two interfaces not being multiples of each other. In the best case this introduces just a small amount of header overhead. In the worst case it introduces massive amounts of reordering.
  • Mysterious delays of up to half a second between a GET request being received by the terminating proxy and being forwarded to the server. (Possibly due to a name resolution step that a well-done transparent proxy should not need in the first place.)
  • Stricter or buggy HTTP header parsing in the middlebox, causing a request that would be accepted by both client and server to be rejected.
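Two of the items above (missing window scaling and the small initial congestion window) lend themselves to back-of-the-envelope numbers. The sketch below is illustrative only: the 100 ms RTT and 100 kB object size are assumed values, the slow-start model is idealized (the window doubles every round trip, with no loss and no delayed ACKs), and the function names are made up for this example. Without window scaling the advertised window cannot exceed 65535 bytes, so a single connection is capped at one window per round trip, and a smaller initial window means more round trips before a short transfer completes.

```python
import math

MAX_UNSCALED_WINDOW = 65535  # largest value a 16-bit TCP window field can carry


def max_throughput_mbps(window_bytes, rtt_seconds):
    """Upper bound on single-connection throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1e6


def slow_start_round_trips(object_bytes, initial_cwnd, mss=1380):
    """Round trips needed to deliver an object in idealized slow start."""
    segments_left = math.ceil(object_bytes / mss)
    cwnd = initial_cwnd  # congestion window, in segments
    rounds = 0
    while segments_left > 0:
        segments_left -= cwnd  # send one full window this round trip
        cwnd *= 2              # idealized doubling per round trip
        rounds += 1
    return rounds


rtt = 0.1  # assumed round-trip time of 100 ms
print(f"cap without window scaling: {max_throughput_mbps(MAX_UNSCALED_WINDOW, rtt):.2f} Mbit/s")
for initial_cwnd in (2, 10):
    trips = slow_start_round_trips(100_000, initial_cwnd)
    print(f"initial cwnd {initial_cwnd}: {trips} round trips for 100 kB")
```

Under these assumptions a connection without window scaling tops out at about 5.2 Mbit/s regardless of how fast the radio link is, and a 100 kB download takes 6 round trips with an initial window of 2 compared to 4 with an initial window of 10.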

Some of these problems would have been absolutely crippling to network performance, with connections bypassing the middlebox being twice as fast as those going through it. Others would just have been wasting resources but in a way that'd be very hard to track down.