We are back from our end-of-year break and pushing on with our various book projects (Operating an Edge Cloud, TCP Congestion Control, as well as some translations). And we’ve identified a deep vein of content related to the topic of IPUs and DPUs, which will certainly need to go into our SDN book at some point. Here is our latest thinking on this topic, which continues to evolve the closer we look at it.
Late last year I was invited to present a keynote at the Euro P4 Workshop, and I took the opportunity to revisit a topic that has held my attention for much of my career: the appropriate partitioning of functionality in networked computing systems. The talk was both an opportunity to reflect on what has (and hasn’t) changed since I was accidentally building SmartNICs in the 1990s, and to build on some of the themes in our recent post on Infrastructure Processing Units and Data Processing Units (IPUs/DPUs). After that post appeared, Guido Appenzeller, my former colleague at VMware and at the time an Intel executive, reached out to raise a point that I had missed in my first crack at the topic. That point concerns the role of IPUs in separating the guest workloads in a data center from infrastructure functions: those tasks performed by the cloud operator that are important to running the cloud but are not, strictly speaking, any guest's concern. Guido had used the analogy of hotel kitchens being off limits to guests, which reminded me of a minor synth-pop classic from the 1980s, “You’ll Always Find Me in the Kitchen at Parties”, and I took that as the title for my keynote. As a result of this interaction, along with others I’ve had in the last few weeks, I’m forming the view that IPUs/DPUs are a bigger deal than I first realized.
Whereas I had previously thought of IPUs as just an extension of the trend to move more and more functionality out of the server onto the NIC, there is another way to think about them. In the computing era before clouds, we might move some function to a NIC because it was more efficient to do it there. But in the cloud era, there are really two distinct classes of function: those that belong to guests (or users, tenants, or customers) and those that are the responsibility of the cloud service provider. And so a helpful way to think about the IPU is that it’s not just a sort of offload engine, but instead it is a way to fully separate those two classes of workload–those of guests and those of operators–into two separate types of computing system. The guest workloads run on traditional servers (x86 or ARM) while the infrastructure functions run on hardware optimized for that task: infrastructure processing units.
While I previously viewed this as a performance optimization (as have others), there is more to it than performance. A number of significant consequences follow from isolating these two sets of functions into separate hardware subsystems. For example, a spike in the processing load on the IPU will not affect the performance of the guests (unless it were to affect the data path of the IPU, on which we have more to say later). Furthermore, this separation opens up interesting options for what actually runs on the guest system. For example, the guest need not have a hypervisor at all: we could dedicate an entire host to a bare-metal operating system, which includes the possibility of running something like Kubernetes on bare metal for the guest. Or the guest can “bring their own” hypervisor, which is effectively what VMware does when running VMware Cloud on AWS. VMware’s hypervisor (and other software) runs on the server, while the AWS infrastructure (provided by Nitro) manages things like resource allocation and virtual networking. Guido also pointed out that your guest doesn’t have to be a traditional server either: it could, for example, be a bunch of GPUs.
I spoke to Kostadis Roussos (a VMware principal engineer) on this topic, and he observed that these sophisticated interfaces to the servers can be thought of as “programmable wires”: a foundational building block for composable infrastructure. For example, you can compose a server with networking and storage connected via the IPU, without being constrained by the networking or storage capabilities of any physical server. This again is more than a performance optimization: it gives us new capabilities for delivering services in a cloud.
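To make the “programmable wires” idea slightly more concrete, here is a minimal Python sketch of what composing a server through an IPU might look like from the operator's side. The `Ipu` class and its methods are hypothetical, invented purely for illustration; they don't correspond to any vendor's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Ipu:
    """Hypothetical control-plane view of an IPU: a set of 'programmable
    wires' that attach virtual devices to whatever host sits behind it."""
    host: str                          # bare-metal server, GPU chassis, etc.
    devices: list = field(default_factory=list)

    def attach_virtual_nic(self, vpc: str, bandwidth_gbps: int) -> None:
        # The IPU data path handles encap/decap and policy; the host
        # simply sees an ordinary NIC.
        self.devices.append(("vnic", vpc, bandwidth_gbps))

    def attach_remote_volume(self, volume_id: str, size_gb: int) -> None:
        # Remote storage presented to the host as if it were a local NVMe device.
        self.devices.append(("nvme", volume_id, size_gb))

# Compose a "server" whose networking and storage are defined by what the
# operator programs into the IPU, not by the capabilities of the physical box.
ipu = Ipu(host="bare-metal-host-42")
ipu.attach_virtual_nic(vpc="tenant-a-vpc", bandwidth_gbps=25)
ipu.attach_remote_volume(volume_id="vol-0123", size_gb=500)
print(ipu.devices)
```

The point of the sketch is that the composition happens entirely in the IPU's control plane, leaving the host untouched.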
As noted by both Intel and AWS, moving these infrastructure functions out of the servers and onto IPUs (or Nitro in the case of AWS) means that servers can be 100% allocated to guests. While this sounds like a good thing–stop wasting those precious x86 cycles on overhead, allocate them to real work–it’s only really a win if the new home of the infrastructure functions can do those tasks efficiently. The last thing we want is a new bottleneck, or an expensive piece of hardware doing a job that used to be done perfectly well on x86.
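As a back-of-the-envelope illustration of the efficiency argument (the numbers below are made up for the example, not measurements), offloading only pays off if the IPU does the work for less than the value of the host cycles it frees up:

```python
# Hypothetical numbers, purely illustrative: suppose infrastructure functions
# (virtual switching, storage, encryption) consume 20% of a 64-core server
# when they run on the host alongside the guests.
cores = 64
infra_fraction = 0.20

reclaimed_cores = cores * infra_fraction   # cores handed back to guest workloads
print(f"Cores returned to guests: {reclaimed_cores:.0f}")

# The offload is a win only if the IPU performs the same work at a cost (and
# power draw) below the value of those reclaimed cores, without becoming a
# bottleneck on the data path.
```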
For this reason, the architecture of an IPU must provide high performance and achieve it at reasonable cost and power consumption. And this is where the role of P4, or more generally, a highly programmable and fast data path, becomes apparent. Just as the PISA architecture for switching is a new sweet spot for networking hardware that is both programmable and high-performance, the P4-programmable data plane in an IPU such as Intel’s Mount Evans ASIC also aims to find the right tradeoff between performance and flexibility. This enables the IPU to meet the evolving requirements of infrastructure functions such as network virtualization, storage, encryption and so on, without becoming a bottleneck, and in a way that is more efficient than simply performing those functions on the x86 servers where they previously lived.
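To give a flavor of the abstraction involved, here is a toy Python model of the match-action processing that a P4-programmable data path exposes. Real IPUs implement this in hardware pipelines programmed in P4; the table contents, action names, and packet representation below are invented for illustration only.

```python
# Toy model of match-action processing: the control plane populates tables,
# and the data plane does nothing but match on header fields and apply the
# corresponding action. All names and values here are illustrative.

def encap_vxlan(pkt, vni, remote_ip):
    # Network virtualization: wrap the guest's packet in an outer header.
    return {"outer_dst": remote_ip, "vni": vni, "inner": pkt}

def drop(pkt):
    return None

# Populated by the control plane: (tenant, inner destination) -> action + params.
vnet_table = {
    ("tenant-a", "10.0.1.5"): (encap_vxlan, {"vni": 100, "remote_ip": "192.0.2.7"}),
    ("tenant-b", "10.0.9.9"): (drop, {}),
}

def process_packet(pkt):
    key = (pkt["tenant"], pkt["dst"])
    action, params = vnet_table.get(key, (drop, {}))
    return action(pkt, **params)

print(process_packet({"tenant": "tenant-a", "dst": "10.0.1.5", "payload": b"..."}))
```

The flexibility comes from the fact that the tables and actions (encapsulation today, perhaps encryption or storage protocols tomorrow) can be reprogrammed, while the matching machinery itself stays in fast hardware.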
For those of us who have been watching the evolution of SDN for over a decade, it is intriguing to see IPUs as another step along the path to making everything software-defined. Rather than seeing them as just another incremental tweak on SmartNICs, there is a case to be made that IPUs are an important strategic control point in the future of cloud architecture. By pulling infrastructure functions out of the servers and onto IPUs, operators gain a new central point of control over their infrastructure, whose benefits include security, isolation, and the chance to innovate faster in their cloud services.
In case you missed it during the holidays, our article on Magma and cloud native networking was published in The Register. We found a good podcast featuring SDN pioneers Nick McKeown and Martin Casado discussing programmable infrastructure. And having shown my sympathy for 80s music with the title of this post, I want to give a shout-out to an amazing music documentary that I watched over the holidays, The Sparks Brothers, about the most influential band you’ve probably never heard of.
So by IPU are you thinking about Pensando on top of Aruba as a programmable switch? Or are you thinking that, with Cumulus now owned by NVIDIA, we will see a full-blown networking engine on a DPU? In the same way that ESXi and NSX will run on a DPU, I think this could be an opportunity to introduce cloud services closer to the guest than the first-hop network device. My only concern is managing all these nodes efficiently with patches, updates, and configurations; this needs to be an automation-first design. Very exciting times!!
Nicely written. In the project-emco community, a few discussions have taken place on this, mainly revolving around the IPU/DPU role in edge computing. Since the IPU and CPU take different workloads (infrastructure and guest workloads, respectively), it was felt that common entities need to be shared among them. That requires automation of the IPU infrastructure when guest workloads come up. Due to isolation requirements, one would like to go with an agentless model (that is, no agent on the host). Hence the need for higher-level orchestration systems to program the IPU (with network policies, DDoS policies, service mesh, etc.) on behalf of guests' workloads. A minimal sketch of that flow follows.
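A minimal Python sketch of the agentless model described above, assuming the orchestrator talks only to the IPU's management endpoint and never to software on the guest host. The class and method names are hypothetical, not an existing API.

```python
# Agentless orchestration sketch: when a guest workload is scheduled, the
# orchestrator programs the IPU fronting that host directly. Nothing runs
# on the host itself. All names here are hypothetical.

class IpuManagementApi:
    def __init__(self, endpoint: str):
        self.endpoint = endpoint
        self.policies = []

    def apply_network_policy(self, policy: dict) -> None:
        # In a real system this would be an RPC to the IPU's control plane.
        self.policies.append(policy)

def on_guest_workload_created(workload: dict, ipu: IpuManagementApi) -> None:
    """Orchestrator callback: realize network and DDoS policy for the new
    workload on the IPU, with no agent on the guest host."""
    ipu.apply_network_policy({
        "workload": workload["name"],
        "allow_ingress_from": workload.get("allowed_peers", []),
        "ddos_rate_limit_pps": 100_000,
    })

ipu = IpuManagementApi("https://ipu-mgmt.rack7.example:8443")
on_guest_workload_created({"name": "web-frontend", "allowed_peers": ["api-tier"]}, ipu)
print(ipu.policies)
```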
Good to know what you think. Created an article based on project-emco community discussions here: https://www.linkedin.com/pulse/cpu-ipu-why-multi-cluster-orchestration-becomes-super-addepalli/
In regard to P4: I would imagine that this interface is used within the IPU, between the normal-path component on the ARM cores and the networking hardware IP. As far as programming/configuring the IPU from external orchestration is concerned, I think it will continue to be K8s custom resources realized via a K8s operator. P4 is good for portability of infrastructure software across multiple IPUs/DPUs. Portability is important, and I hope the industry keeps adding more extern features (to realize stateful packet processing, IPsec, traffic shaping, RAN DU, UPF).
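One way to picture the "custom resource realized via a K8s operator" suggestion is the sketch below, shown as the Python dict a Kubernetes client would submit. The API group, kind, and fields are invented for illustration; this is not an existing CRD.

```python
# Hypothetical custom resource that an operator could watch and realize on the
# IPU, e.g. by generating P4 table entries or invoking externs such as IPsec.
# Group, kind, and all field names are invented, not an existing CRD.

ipu_offload = {
    "apiVersion": "ipu.example.com/v1alpha1",
    "kind": "IpuNetworkFunction",
    "metadata": {"name": "tenant-a-ipsec"},
    "spec": {
        "function": "ipsec-gateway",      # realized via a P4 extern on the IPU
        "tenant": "tenant-a",
        "peers": ["198.51.100.10"],
        "rekeyIntervalSeconds": 3600,
    },
}

# An operator's reconcile loop would diff this desired state against the IPU's
# actual configuration and program the difference, keeping the host agentless.
```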