When writing our Systems Approach books, we generally try to put ourselves in the shoes of a reader who doesn’t yet understand the topic we are trying to explain. This might seem obvious, but it means we need to constantly check our assumptions about what readers can be expected to know. Back in 1995 we couldn’t even assume that a reader would have spent much time on the Internet; we recently noticed a section of our big book that was well overdue for an update because it still reflected our 1995 assumptions. In a related vein, some of our recent discussions have made us wonder how well our readers understand virtualization, especially as applied to networking, so we’re taking a look back at how we came to understand virtualization and its power.
I have a vague memory of hearing about server virtualization for the first time in the early 2000s when I was working at Cisco. Most of my operating systems knowledge had been picked up early in my career, so I had a working understanding of concepts like virtual memory, but I was quite surprised to learn just how popular virtual machines had become. One thing I wondered was why virtual machines, rather than the process isolation capabilities of a single operating system, had become the de facto means of isolating applications. I wouldn’t get a satisfactory answer to that question until I joined Nicira many years later (more on that in a moment).
Even more perplexing to a networking person was hearing that the prevalence of virtual machines in data centers was driving a push towards big, flat layer-2 networks. One of the most remarkable (to me) consequences of virtualization is virtual machine migration: because a virtual machine is completely decoupled from the physical hardware on which it runs, it can be picked up and moved to another physical host without interruption. VMware’s version of live migration, vMotion, was released in 2003 to considerable acclaim, as it allowed a VM to be moved across a data center without interrupting the applications running on it. But VM migration had an unfortunate networking-related side effect: VMs retained their IP addresses as they moved. This is what led to the push to build big, flat L2 networks in modern datacenters: they let VMs move around without finding themselves on a subnet that didn’t match their IP address.
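To make the subnet-mismatch problem concrete, here is a minimal Python sketch (the addresses and subnets are invented for illustration) of why a migrated VM that keeps its address can no longer be reached by ordinary L3 forwarding, which is exactly the problem a single flat L2 domain (or, as we'll see, an overlay) avoids:

```python
from ipaddress import ip_address, ip_network

# Hypothetical addressing: the VM keeps its address when it migrates.
vm_addr = ip_address("10.1.1.25")            # assigned while on rack 1
source_subnet = ip_network("10.1.1.0/24")    # rack 1's subnet
dest_subnet = ip_network("10.2.7.0/24")      # subnet where the VM lands after vMotion

print(vm_addr in source_subnet)  # True  -- routers deliver to it here as usual
print(vm_addr in dest_subnet)    # False -- after migration, the VM's address no
                                 # longer matches the local subnet, so normal
                                 # L3 forwarding won't find it
```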
By this time, I was convinced that scalable L2 networks were something of an oxymoron, so my first reaction was to argue that VMs should simply change their address to match the destination subnet when they moved across the DC. But that just illustrated what I didn’t know about real-world applications in datacenters. The consequences of an IP address change range from a dropped TCP connection to complete breakage of an application that has a built-in assumption of L2 adjacency to some other component. From the perspective of a datacenter operator, you simply can’t expect an application to behave correctly if its IP address changes underneath it.
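The dropped-connection case is easy to see once you remember that a TCP connection is identified by its 4-tuple of source and destination addresses and ports. A toy sketch (the addresses, ports, and connection table are hypothetical) of what happens if the VM is renumbered while a connection is open:

```python
# A TCP connection is identified by (src IP, src port, dst IP, dst port).
# This toy connection table shows why renumbering a migrated VM silently
# orphans its established connections.
connections = {
    ("10.1.1.25", 44312, "10.3.9.8", 5432): "ESTABLISHED",  # VM talking to a database
}

def lookup(src_ip, src_port, dst_ip, dst_port):
    return connections.get((src_ip, src_port, dst_ip, dst_port), "NO MATCH -> RST/drop")

print(lookup("10.1.1.25", 44312, "10.3.9.8", 5432))  # ESTABLISHED
# If migration forced the VM to take an address from the destination subnet...
print(lookup("10.2.7.25", 44312, "10.3.9.8", 5432))  # NO MATCH -> RST/drop
```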
Fortunately there are alternatives to attempting to build datacenter-scale L2 networks. One of the seminal efforts to tackle the problem is VL2, described in 2009 by Greenberg et al. While this was not the first paper to use the term network virtualization, it did (I think) introduce the term in the way that it is most widely used today. Interestingly, Albert Greenberg (a SIGCOMM award winner) led the team that developed Azure’s network virtualization system, while his co-author James Hamilton later led the networking team at AWS. VL2 gives
each service the illusion that all the servers assigned to it, and only those servers, are connected by a single non-interfering Ethernet switch—a Virtual Layer 2— and maintain this illusion even as the size of each service varies from 1 server to 10,000.
Too often when I read something about network virtualization, it turns out to just be some way of partitioning network resources among different users (also known as slicing). But the key word in the above quote is “illusion” because it gets to the heart of virtualization: creating an illusion of something that, strictly speaking, doesn’t exist. As the seminal 1974 paper on virtualization puts it, these illusions are “efficient, isolated duplicates” of the physical thing being virtualized. Virtual memory gives processes the illusion of a massive amount of address space, generally much larger than what is physically present, completely available to that process and protected from other processes. Virtual machines create the illusion of a complete set of computing resources (CPU, memory, disk, I/O) that are so fully independent of the underlying physical machine that they can be moved across a datacenter (or further). And virtual networks create the illusion of a private switched Ethernet (in the case of VL2) that can span the datacenter, even when the datacenter is a Layer 3 network built out of routers interconnecting many subnets. So while partitioning of resources is part of the story, it’s just one element in service of creating these illusions.
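In practice, systems like VL2 and its successors create that illusion with an overlay: the virtual switch in each hypervisor encapsulates the VM's Ethernet frame and tunnels it across the L3 underlay to the physical host where the destination VM currently resides. The sketch below is a simplification under assumed details (the header layout follows the VXLAN style, but the mapping table and addresses are invented, and a real implementation would send the result in a UDP packet rather than print it):

```python
import struct

# Hypothetical mapping, normally populated by the network virtualization
# controller: (virtual network ID, destination VM MAC) -> physical host IP.
LOCATION = {
    (5001, "02:00:00:00:00:0a"): "192.0.2.11",
    (5001, "02:00:00:00:00:0b"): "192.0.2.42",
}

def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN-style header: flags byte, 3 reserved bytes, 24-bit VNI, 1 reserved byte."""
    header = struct.pack("!BxxxI", 0x08, vni << 8)
    return header + inner_frame

def send_virtual_frame(vni: int, dst_mac: str, inner_frame: bytes) -> None:
    host_ip = LOCATION[(vni, dst_mac)]       # where does that VM live right now?
    packet = encapsulate(vni, inner_frame)   # inner L2 frame rides inside UDP/IP
    print(f"UDP to {host_ip}: {len(packet)} bytes (VNI {vni})")
    # In a real system the receiving hypervisor strips the header and delivers
    # the inner frame to the destination VM.

send_virtual_frame(5001, "02:00:00:00:00:0b", b"...inner ethernet frame...")
```

The point of the mapping table is that when a VM migrates, only the table entry changes; the addresses the VM itself sees never do, so the illusion survives the move.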
By the time I started working at Nicira in 2012, the team had settled on this view of network virtualization, mirroring the success of server virtualization. The idea of virtualization creating an illusion was core to the vision: just as a virtual machine creates the illusion of a physical machine so faithfully that an unmodified operating system and its applications can run on it (even as it migrates from one piece of hardware to another), so too should a virtual network perfectly replicate the features of a physical network while remaining independent of the actual underlying hardware. Just like servers, networks are complicated things with lots of moving parts, so Nicira’s product needed to do a lot more than VL2 did: not just create a virtual layer 2 switch, but virtualize every layer of the network. That meant (eventually) virtual layer 3 routing, virtual firewalls, virtual load balancers, and so on. I used to chuckle to myself about the prospect of a 50-person engineering team recreating in software all the networking capabilities that companies like F5, Check Point, and Cisco had developed over the preceding decades, but that is pretty much what eventually happened (thanks to the injection of considerable engineering resources over the following years).
You can find descriptions of network virtualization as implemented at Nicira, Google, and Azure. It’s a bit harder to get details on how AWS does it, but a VPC is a form of virtual network; here is some of the information as presented by James Hamilton. We also cover network virtualization as a key use case of SDN in our book. And it’s not limited to the datacenters of the hyperscalers: VMware claims that its network virtualization product (following on from the work at Nicira) is used in the datacenters of 91 of the Fortune 100.
In some respects, network virtualization has followed the same path as server virtualization and for similar reasons. Nicira founder Martin Casado has talked about how virtualization changes the “laws of physics”: the salient example for server virtualization is live VM migration, but there are others, such as snapshotting and cloning of VMs, made possible by recreating the illusion of a physical machine entirely in software. Not only did network virtualization bring similar capabilities to networking, but it facilitated entirely new ones such as microsegmentation, laying the groundwork for what was arguably the “killer” use case, zero-trust networking. We had a running joke at Nicira about the movie “Inception” (particularly when running virtual networks inside virtual networks). Network virtualization, with its own “laws of physics”, allowed us not only to recreate the capabilities of physical networks but to create new ones.
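For readers who haven’t encountered microsegmentation: the idea is to enforce a default-deny policy at every VM’s virtual NIC, with allow rules expressed in terms of workloads (or labels) rather than subnets or topology. A toy sketch, with rules and labels invented for illustration:

```python
# Toy microsegmentation policy: default deny, with allow rules written in terms
# of workload labels rather than IP addresses or subnets.
ALLOW_RULES = [
    {"src": "web", "dst": "app", "port": 8443},
    {"src": "app", "dst": "db",  "port": 5432},
]

def allowed(src_label: str, dst_label: str, port: int) -> bool:
    return any(
        r["src"] == src_label and r["dst"] == dst_label and r["port"] == port
        for r in ALLOW_RULES
    )

print(allowed("web", "app", 8443))  # True
print(allowed("web", "db", 5432))   # False -- the web tier can't reach the
                                    # database directly, even though both sit
                                    # on the same physical network
```

Because the enforcement point is the virtual NIC rather than a box at the edge of the network, the policy follows the workload wherever it migrates, which is part of what makes microsegmentation a natural foundation for zero-trust networking.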
We’re always interested in understanding how things go wrong with the Internet, and recent weeks have provided plenty of examples.
Cloudflare has an interesting blog about how they came to suffer an outage in part of their DNS infrastructure weeks after the root cause had been introduced. The DDoS attacks leveraging a feature of HTTP/2 have been well explained by both Cloudflare and Google: this feels like a case study for some future version of our book, and a cautionary tale about implementing RFCs to the letter of the law. And it turns out that lots of commercial BGP implementations will fail under fuzz testing.
Preview photo this week by Luca Nicoletti on Unsplash