QUIC Is Not a TCP Replacement
The publication of a new, definitive specification for TCP (RFC 9293) is enough of a big deal in our world that we couldn’t resist a second post on the topic. In particular, we were intrigued by the discussion that compared QUIC to TCP, which inspired this week’s newsletter.
In our last post about the past and future of TCP, we touched on the possibility that QUIC might start to replace TCP. This week I want to argue that QUIC is actually solving a different problem than that solved by TCP, and so should be viewed as something other than a TCP replacement. It may well be that for some (or even most) applications QUIC becomes the default transport, but I believe that is because TCP has been pushed into roles for which it was not originally intended. Let’s take a step back to see why I make that claim.
Back in 1995, Larry and I were working on the first edition of Computer Networks: A Systems Approach, and we had reached the point of writing the transport protocols chapter, which we titled “End-to-end Protocols”. In those days, there were only two transport protocols of note in the Internet, UDP and TCP, so we gave each of those its own section. Since our book aims to teach networking principles rather than just the contents of RFCs, we framed the two sections as two different communication paradigms: a simple demultiplexing service (exemplified by UDP), and a reliable byte stream (TCP). But there was also a third paradigm that Larry argued we needed to cover, for which there wasn’t really a well-known example of an Internet protocol: Remote Procedure Call (RPC). The examples we used to illustrate RPC in 1995 seem quaint now: SunRPC and a home-grown example from Larry’s research at the time on the x-kernel. These days there are plenty of options for RPC implementations that run over IP, with gRPC being one of the most well-known examples.
Why did we feel the need for a whole section on RPC, when most other networking books would have just covered TCP and UDP? For one thing, RPC was one of the key research areas in the distributed systems community at that time, with the 1984 paper of Nelson and Birrell spurring a generation of RPC-related projects. And in our view, a reliable byte stream is not the right abstraction for RPC. The core of RPC is a request/reply paradigm. You send a bunch of arguments from the client to the server, the server does some computation with those arguments, and then it returns the results of the computation. Yes, a reliable byte stream might help get all the arguments and results across the network correctly, but there is more to RPC than that. Leaving aside the problem of serializing the arguments for transmission over a network (which we also covered later in the book), RPC is not really about transferring a stream of bytes, but about sending a message and getting a response to it. So it is a bit more like a datagram service (as provided by UDP or IP) but it also requires more than just unreliable datagram delivery. RPC needs to handle lost, misordered, and duplicated messages; an identifier space is required to match requests and responses; and fragmentation/reassembly of messages must be supported, to name a few requirements. Out-of-order delivery, which a reliable byte stream prevents, is also desirable for RPC. There may be a reason why so many RPC frameworks came into existence in the 1980s and 1990s–distributed systems people needed an RPC mechanism, and there wasn’t anything readily available in the standard TCP/IP protocol suite. (RFC 1045 actually does define an experimental RPC-oriented transport, but it never seems to have caught on.) It also wasn’t obvious then that TCP/IP would become as dominant as it is today. So some RPC frameworks (DCE for example) were designed to be independent of the underlying network protocols.
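To make a few of those requirements concrete, here is a minimal Python sketch of the request/reply matching that an RPC transport has to provide on top of unreliable datagrams. It is an illustration under stated assumptions rather than any real protocol: the Client and Server classes, the integer request IDs, and the in-memory "network" are all hypothetical, and fragmentation and real timers are left out. It shows an identifier space for matching replies to requests, retransmission of a lost request, and suppression of duplicates so that a request is executed at most once.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Callable, Dict, Iterator, List, Tuple

Message = Tuple[int, int]  # (request id, integer payload); a hypothetical wire format


@dataclass
class Server:
    handler: Callable[[int], int]
    replies: Dict[int, int] = field(default_factory=dict)  # request id -> cached reply
    executions: int = 0

    def receive(self, req_id: int, arg: int) -> Message:
        # A duplicated request is answered from the cache instead of being re-executed.
        if req_id not in self.replies:
            self.replies[req_id] = self.handler(arg)
            self.executions += 1
        return req_id, self.replies[req_id]


@dataclass
class Client:
    pending: Dict[int, int] = field(default_factory=dict)  # request id -> argument
    results: Dict[int, int] = field(default_factory=dict)  # request id -> result
    _ids: Iterator[int] = field(default_factory=lambda: count(1))

    def call(self, arg: int) -> Message:
        # Allocate a fresh identifier so the reply can be matched later.
        req_id = next(self._ids)
        self.pending[req_id] = arg
        return req_id, arg

    def retransmit(self) -> List[Message]:
        # Stand-in for a timeout: resend every request that has no reply yet.
        return list(self.pending.items())

    def deliver(self, req_id: int, result: int) -> None:
        # Replies for unknown or already completed requests are ignored.
        if req_id in self.pending:
            del self.pending[req_id]
            self.results[req_id] = result


server = Server(handler=lambda x: x * x)
client = Client()

first = client.call(3)   # this request is "lost" and never reaches the server
second = client.call(4)  # this request is duplicated by the network

reply_a = server.receive(*second)
reply_b = server.receive(*second)  # duplicate request, served from the cache
client.deliver(*reply_a)
client.deliver(*reply_b)           # duplicate reply, ignored by the client

# The timeout fires: only the lost request is still pending and gets resent.
for message in client.retransmit():
    client.deliver(*server.receive(*message))
```

Note that the reply to the second request is delivered before the reply to the first. Nothing in the matching logic depends on arrival order, which is the sense in which out-of-order delivery is harmless, and even useful, for RPC.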
The lack of support for RPC in the TCP/IP stack set the stage for QUIC. When HTTP came along in the early 1990s, it wasn’t trying to solve an RPC problem so much as an information sharing problem, but it did implement request/response semantics. The designers of HTTP, lacking any obviously better options, decided to run HTTP over TCP, with famously poor performance in the early versions due to use of a new connection for every “GET”. A variety of tweaks to HTTP such as pipelining, persistent connections, and the use of parallel connections were introduced to improve the performance, but TCP’s reliable byte-stream model was never the perfect fit for HTTP. With the introduction of transport layer security (TLS) causing another set of round-trip exchanges of cryptographic information, the mismatch between what HTTP needed and what TCP provides became more and more clear. This was well explained in the 2012 QUIC design document from Jim Roskind: head-of-line blocking, poor congestion response, and the additional RTT(s) introduced by TLS were all identified as problems inherent to running HTTP over TCP.
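Head-of-line blocking is easy to see in a toy model. The Python sketch below is a deliberately simplified illustration, not a model of real TCP or QUIC packet handling: the packet tuples, function names, and arrival order are all assumptions made for the example. Two responses, A and B, are interleaved on the wire, and the first packet of A is lost and arrives last after retransmission. With one global ordering, as in a single byte stream, B cannot be handed to the application until A's missing data shows up. With ordering enforced only within each stream, B completes as soon as its own packets arrive.

```python
from typing import Dict, List, Tuple

# (global sequence number, stream name, offset within that stream)
Packet = Tuple[int, str, int]


def single_stream_delivery(arrivals: List[Packet]) -> List[List[Packet]]:
    """Release packets to the application strictly in global sequence order."""
    next_seq = 0
    buffered: Dict[int, Packet] = {}
    timeline: List[List[Packet]] = []
    for packet in arrivals:
        buffered[packet[0]] = packet
        released = []
        while next_seq in buffered:
            released.append(buffered.pop(next_seq))
            next_seq += 1
        timeline.append(released)
    return timeline


def per_stream_delivery(arrivals: List[Packet]) -> List[List[Packet]]:
    """Release packets in order within each stream, independently of other streams."""
    next_offset: Dict[str, int] = {}
    buffered: Dict[Tuple[str, int], Packet] = {}
    timeline: List[List[Packet]] = []
    for packet in arrivals:
        _, stream, offset = packet
        buffered[(stream, offset)] = packet
        released = []
        expected = next_offset.get(stream, 0)
        while (stream, expected) in buffered:
            released.append(buffered.pop((stream, expected)))
            expected += 1
        next_offset[stream] = expected
        timeline.append(released)
    return timeline


def completion_step(timeline: List[List[Packet]], stream: str, n_packets: int) -> int:
    """Return the arrival step at which all packets of a stream were released."""
    seen = 0
    for step, released in enumerate(timeline):
        seen += sum(1 for _, name, _ in released if name == stream)
        if seen >= n_packets:
            return step
    return -1


# Sent as A0, B0, A1, B1; A0 is lost and its retransmission arrives last.
arrivals: List[Packet] = [(1, "B", 0), (2, "A", 1), (3, "B", 1), (0, "A", 0)]

ordered = single_stream_delivery(arrivals)
independent = per_stream_delivery(arrivals)

b_done_ordered = completion_step(ordered, "B", 2)
b_done_independent = completion_step(independent, "B", 2)
a_done_ordered = completion_step(ordered, "A", 2)
a_done_independent = completion_step(independent, "A", 2)
```

In this run, response B completes at arrival step 3 under global ordering but at step 2 under per-stream ordering, while response A, which actually suffered the loss, completes at step 3 either way. The loss delays only the stream it belongs to.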
One way to frame what happened here is this: the “narrow waist” of the Internet was originally just the Internet Protocol, intended to support a diversity of protocols above it. But somehow the “waist” began to include TCP and UDP as well. Those were the only transports available. If you only wanted a datagram service, you could use UDP. If you needed any sort of reliable delivery, TCP was the answer. If you needed something that didn’t quite map to either unreliable datagrams or reliable byte streams, you were out of luck. But it was a lot to ask of TCP to be all things to so many upper layer protocols.
QUIC is doing a lot of work: its definition spans three RFCs covering the basic protocol (RFC 9000), its use of TLS (RFC 9001), and its loss detection and congestion control mechanisms (RFC 9002). But at its heart it is an implementation of the missing third paradigm for the Internet: RPC. If what you really want is a reliable byte stream, such as when you are downloading that multi-gigabyte operating system update, then TCP really is well designed for the job. But HTTP(S) is much more like RPC than it is like a reliable byte stream, and one way to look at QUIC is that it’s finally delivering the RPC paradigm to the Internet protocol suite. That will certainly benefit applications that run over HTTP(S), including, notably, gRPC, and all those RESTful APIs that we’ve come to depend on.
When we wrote about QUIC previously, we noted that it was a good case study in how to rethink the layering of a system as the requirements become clearer. The point here is that TCP meets one set of requirements, those of a reliable byte stream, and its congestion control algorithms continue to evolve in service of those requirements. QUIC is really meeting a different set of requirements. Since HTTP is so central to the Internet today (indeed, it has been argued that it is becoming the new “narrow waist”), it could be that QUIC becomes the dominant transport protocol, not because it replaces TCP exactly, but because it meets the needs of the dominant applications above it.
In a few days our APNIC podcast will be available and you can hear Larry and Bruce discuss QUIC and Larry’s long campaign to have RPC taken seriously. Also, the forthcoming NSDI paper on the Magma open source mobile network to which Bruce contributed is now available on arXiv. There are some useful data points to show the value of taking an SDN-style approach to designing a mobile core.