The publication of a new, definitive specification for TCP (RFC 9293) is enough of a big deal in our world that we couldn’t resist a second post on the topic. In particular, we were intrigued by the discussion that compared QUIC to TCP, which inspired this week’s newsletter.
In our last post about the past and future of TCP, we touched on the possibility that QUIC might start to replace TCP. This week I want to argue that QUIC actually solves a different problem than TCP does, and so should be viewed as something other than a TCP replacement. It may well be that QUIC becomes the default transport for some (or even most) applications, but I believe that is because TCP has been pushed into roles for which it was not originally intended. Let’s take a step back to see why I make that claim.
Back in 1995, Larry and I were working on the first edition of Computer Networks: A Systems Approach, and we had reached the point of writing the transport protocols chapter, which we titled “End-to-end Protocols”. In those days, there were only two transport protocols of note in the Internet, UDP and TCP, so we gave each of those its own section. Since our book aims to teach networking principles rather than just the contents of RFCs, we framed the two sections as two different communication paradigms: a simple demultiplexing service (exemplified by UDP), and a reliable byte stream (TCP). But there was also a third paradigm that Larry argued we needed to cover, for which there wasn’t really a well-known example of an Internet protocol: Remote Procedure Call (RPC). The examples we used to illustrate RPC in 1995 seem quaint now: SunRPC and a home-grown example from Larry’s research at the time on the x-kernel. These days there are plenty of options for RPC implementations that run over IP, with gRPC being one of the best-known examples.
Why did we feel the need for a whole section on RPC, when most other networking books would have just covered TCP and UDP? For one thing, RPC was one of the key research areas in the distributed systems community at that time, with the 1984 paper of Birrell and Nelson spurring a generation of RPC-related projects. And in our view, a reliable byte stream is not the right abstraction for RPC. The core of RPC is a request/reply paradigm. You send a bunch of arguments from the client to the server, the server does some computation with those arguments, and then it returns the results of the computation. Yes, a reliable byte stream might help get all the arguments and results across the network correctly, but there is more to RPC than that. Leaving aside the problem of serializing the arguments for transmission over a network (which we also covered later in the book), RPC is not really about transferring a stream of bytes, but about sending a message and getting a response to it. So it is a bit more like a datagram service (as provided by UDP or IP) but it also requires more than just unreliable datagram delivery. RPC needs to handle lost, misordered, and duplicated messages; an identifier space is required to match requests and responses; and fragmentation/reassembly of messages must be supported, to name a few requirements. Out-of-order delivery, which a reliable byte stream prevents, is also desirable for RPC. There is a reason why so many RPC frameworks came into existence in the 1980s and 1990s: distributed systems people needed an RPC mechanism, and there wasn’t anything readily available in the standard TCP/IP protocol suite. (RFC 1045 actually does define an experimental RPC-oriented transport, but it never seems to have caught on.) It also wasn’t obvious then that TCP/IP would become as dominant as it is today. So some RPC frameworks (DCE for example) were designed to be independent of the underlying network protocols.
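To make those requirements concrete, here is a minimal sketch, in Python, of what an RPC layer has to build on top of a bare datagram socket: an identifier to match requests with replies, retransmission to cope with loss, and a reply cache so duplicated requests aren’t re-executed. The port number, header layout, and add() procedure are all invented for illustration; this is not SunRPC, gRPC, or any real framework.

```python
# A minimal sketch of the RPC paradigm built directly on a datagram socket.
# The port, header layout, and add() procedure are invented for illustration.
import socket
import struct
import threading
import time

ADDR = ("127.0.0.1", 9999)
HDR = struct.Struct("!II")   # (request id, payload length)

def server():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(ADDR)
    seen = {}                # request id -> cached reply
    while True:
        data, client = sock.recvfrom(1500)
        req_id, _ = HDR.unpack_from(data)
        if req_id in seen:   # duplicate request (client retried): resend cached reply
            sock.sendto(seen[req_id], client)
            continue
        a, b = struct.unpack("!II", data[HDR.size:])
        reply = HDR.pack(req_id, 4) + struct.pack("!I", a + b)
        seen[req_id] = reply
        sock.sendto(reply, client)

def call_add(a, b, req_id=1, retries=3):
    """Client stub: marshal arguments, match the reply by id, retransmit on loss."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(1.0)
    request = HDR.pack(req_id, 8) + struct.pack("!II", a, b)
    for _ in range(retries):
        sock.sendto(request, ADDR)
        try:
            data, _ = sock.recvfrom(1500)
        except socket.timeout:
            continue         # lost request or lost reply: just send it again
        rid, _ = HDR.unpack_from(data)
        if rid != req_id:    # stale reply to an earlier call: keep waiting
            continue
        return struct.unpack("!I", data[HDR.size:])[0]
    raise TimeoutError(f"no reply after {retries} tries")

threading.Thread(target=server, daemon=True).start()
time.sleep(0.1)              # give the server a moment to bind
print(call_add(2, 3))        # -> 5
```

Note that none of this machinery cares about byte ordering across calls; each request/reply pair stands alone, which is exactly the property a reliable byte stream takes away.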
The lack of support for RPC in the TCP/IP stack set the stage for QUIC. When HTTP came along in the early 1990s, it wasn’t trying to solve an RPC problem so much as an information sharing problem, but it did implement request/response semantics. The designers of HTTP, lacking any obviously better options, decided to run HTTP over TCP, with famously poor performance in the early versions due to the use of a new connection for every “GET”. A variety of tweaks to HTTP such as pipelining, persistent connections, and the use of parallel connections were introduced to improve performance, but TCP’s reliable byte-stream model was never a perfect fit for HTTP. When transport layer security (TLS) added yet another set of round-trip exchanges of cryptographic information, the mismatch between what HTTP needs and what TCP provides became clearer still. This was well explained in the 2012 QUIC design document from Jim Roskind: head-of-line blocking, poor congestion response, and the additional RTT(s) introduced by TLS were all identified as problems inherent to running HTTP over TCP.
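Head-of-line blocking is easy to see in a toy model. In the sketch below, the segment numbers and the loss pattern are made up for illustration: two responses (streams A and B) share one connection, and the first segment, carrying stream A’s first chunk, was dropped in transit. With TCP’s single sequence space, the hole stalls both responses; with a per-stream sequence space of the kind QUIC adopted, only the stream that actually lost data has to wait.

```python
# A toy model of head-of-line blocking; segment numbers and the loss are made up.
# Each tuple is (connection seq, stream, per-stream seq, data); seq 0 was lost.
arrived = [(1, "B", 0, "b0"), (2, "A", 1, "a1"), (3, "B", 1, "b1")]

def tcp_delivery(segments):
    """One sequence space for the whole connection: nothing past the hole gets out."""
    ready, next_seq = [], 0
    for seg in sorted(segments):
        if seg[0] != next_seq:
            break            # the hole at seq 0 stalls *both* streams
        ready.append(seg)
        next_seq += 1
    return ready

def per_stream_delivery(segments):
    """A QUIC-style sequence space per stream: only stream A waits for the loss."""
    ready, next_seq = [], {}
    for seg in sorted(segments):
        _, stream, sseq, _ = seg
        if sseq == next_seq.get(stream, 0):
            ready.append(seg)
            next_seq[stream] = sseq + 1
    return ready

print("TCP delivers:       ", tcp_delivery(arrived))        # [] (both stalled)
print("per-stream delivers:", per_stream_delivery(arrived)) # stream B gets through
```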
One way to frame what happened here is this: the “narrow waist” of the Internet was originally just the Internet Protocol, intended to support a diversity of protocols above it. But somehow the “waist” began to include TCP and UDP as well. Those were the only transports available. If you only wanted a datagram service, you could use UDP. If you needed any sort of reliable delivery, TCP was the answer. If you needed something that didn’t quite map to either unreliable datagrams or reliable byte streams, you were out of luck. But it was a lot to ask of TCP to be all things to so many upper layer protocols.
QUIC is doing a lot of work: its definition spans three RFCs covering the basic protocol (RFC 9000), its use of TLS (9001) and its congestion control mechanisms (9002). But at its heart it is an implementation of the missing third paradigm for the Internet: RPC. If what you really want is a reliable byte stream, such as when you are downloading that multi-gigabyte operating system update, then TCP really is well designed for the job. But HTTP(S) is much more like RPC than it is like a reliable byte stream, and one way to look at QUIC is that it’s finally delivering the RPC paradigm to the Internet protocol suite. That will certainly benefit applications that run over HTTP(S), including, notably, gRPC, and all those RESTful APIs that we’ve come to depend on.
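One way to see that claim is that every HTTP-based API call already has the classic RPC shape: marshal the arguments, send one request, block for the matching response, unmarshal the result. Here is a minimal sketch; the endpoint URL and the method/params schema are hypothetical, and nothing in it is specific to any real API.

```python
# The RPC shape of an HTTP API call: marshal, request, matched reply, unmarshal.
# The endpoint URL and the method/params schema are hypothetical.
import json
import urllib.request

def rpc_call(method, params):
    body = json.dumps({"method": method, "params": params}).encode()
    req = urllib.request.Request(
        "https://api.example.com/rpc",        # hypothetical endpoint
        data=body,
        headers={"Content-Type": "application/json"},
    )
    # urlopen sends one request and blocks for its matching response,
    # which is exactly the request/reply pairing RPC needs; a raw byte
    # stream by itself does not provide it. Under HTTP/3 the same
    # exchange rides on its own QUIC stream.
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["result"]

# e.g. total = rpc_call("add", [2, 3])
```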
When we wrote about QUIC previously, we noted that it was a good case study in how to rethink the layering of a system as the requirements become clearer. The point here is that TCP meets one set of requirements–those of a reliable byte stream–and its congestion control algorithms continue to evolve in service of those requirements. QUIC is really meeting a different set of requirements. Since HTTP is so central to the Internet today–indeed it has been argued (here and here) that it is becoming the new “narrow waist”–it could be that QUIC becomes the dominant transport protocol, not because it replaces TCP exactly, but because it meets the needs of the dominant applications above it.
In a few days our APNIC podcast will be available and you can hear Larry and Bruce discuss QUIC and Larry’s long campaign to have RPC taken seriously. Also, the forthcoming NSDI paper on the Magma open source mobile network to which Bruce contributed is now available on arXiv. There are some useful data points to show the value of taking an SDN-style approach to designing a mobile core.
Just a QUIC comment or two... 8-)
The narrow waist of the Internet is indeed something that lives in a magic shrink machine. I'd argue that Internet applications that need to work ubiquitously must be designed to use port 80 or 443. I remember trying really hard to convince the 3GPP folks of this when they were deliberating whether femtocells should use IPsec or an SSL-based encrypted connection back to their gateways. They chose IPsec, and that protocol has also been hit by the narrowing waist. So good luck planting your femtocell behind a corporate firewall. RTP --> WebRTC is another example of the shrinking waist. This seems to be following a pattern where web browsers are becoming *the* Internet access client where a human UI is expected - and firewall admins are setting the knobs to allow nothing else to pass. So to Chris, I'd say that the function is still needed, but yeah, RIP SCTP.
I also think that your RPC observation is spot-on. Datagrams and Byte Streams (UDP and TCP) are Transport layer functions, but not Application layer functions. The Application layer has a number of *CASEs* that have been cobbled together in most Internet applications: typically variants of ACSE and ROSE. QUIC offers an interesting way to create the CASE "layer" functions and then re-use them across applications. (Not trying to push OSI here - they have too many layers. But IMHO the IETF stack has too few. My only assertion is that the functions need to be supported in some way, and the OSI terms give me words to talk about them.)
Like QUIC itself, application communication behaviors (CASE) are, I think, being thrust into browsers, and onto port 443, as that port slides into place as the real small waist of the encrypted Internet. Clearly QUIC can support various behaviors and functions, and I like to think of it as grabbing some of the transport layer and also providing scaffolding to develop the functions of an ill-defined upper layer of the Internet - like the application identifier from the last comment. So I think that QUIC obviates the need for TCP in browsers going forward. If that means "replaces" - given the small waist - then OK, I think it does. Another way to say it is that I expect to see much less work on TCP going forward.
I also had to laugh out loud reading your RPC observation, and I applaud you both for writing about it very early on. It lines up very well with the observations that computer scientist John Day made in his book, "Patterns in Network Architecture." His fundamental claim is that networking is *all* just distributed inter-process communication. His insights have strongly influenced my own thinking about networking. I think he does a great job of describing the functions needed in networking, and also of describing how well they are or are not supported by existing protocols. I especially like how he describes the strange protocols and architecture we have had to invent to compensate for their absence in the places they ought to be.
Thanks for the article and Best Regards,
Tom
Great article on the always interesting transport protocol space. Couple of thoughts:
1. SCTP RIP. Stating the obvious, heh?
2. Wonder about QUIC implementations. User space or kernel? I am thinking the former for faster development/testing/deployment as well as kernel bypass advantages. For the latter, do we think some sort of eBPF, offload NICs and boxes, or even L4 connection-splicing middleboxes? Or am I off base because it runs over UDP? QUIC has been rolled out, and I presume web server platforms can handle it at scale.
3. Interested in seeing how streaming over QUIC can improve the live event viewing apps (such as NFL Sunday Ticket) that use HTTP live streaming (HLS). I was looking at the media-over-quic (moq) docs.
-chrisMetz