The original proposal for P4, “Programming protocol-independent packet processors,” turns eight next month. There have been P4 Workshops going back to 2015, and the most recent one was held last month, with Bruce taking the general chair role after Larry did so last year. The ONF published a short blog post from Bruce, but there are some bigger issues around the future of P4 that we wanted to cover in this week’s newsletter.
Looking back on the 2022 P4 Workshop I had two big takeaways: the first is narrowly focused on the proposed roadmap for P4, while the second looks at the broader context in which P4 is evolving. Both are connected to many of the themes we write about in these posts. (If you need a refresher on P4, see Chapter 4 of our SDN book.)
The narrow takeaway is that P4 is now securely established as the language of choice for programming packet forwarding pipelines. My evidence for this is all the discussion about features that need to be added to the language in support of new use cases. Make room on the P4 bandwagon! Much of this energy is driven by new target environments, notably SmartNICs and IPUs, which are increasingly becoming P4-programmable. These devices are (rightfully) being viewed as the ingress/egress of the network substrate, putting them squarely in the same domain as switches: software-programmable and centrally-controlled. This is a perfect match for a domain-specific language such as P4 (and P4Runtime)…setting it up for a success disaster.
There are two main challenges, both of which put P4’s value proposition at risk. The first is that P4 is designed to decouple the target hardware architecture (arch.p4) from the forwarding function (forward.p4). The community has had reasonable success defining a standard architecture for switches to enable this separation with the Portable Switch Architecture (PSA). But the NIC/IPU/DPU design space is wide open, and while there is recognition that a corresponding Portable NIC Architecture (PNA) is required, convergence on a standard architecture seems much less likely. There are just too many vendor-specific features (and too much value placed on differentiation) to expect otherwise. And without a common architectural definition, P4 programs will not be portable across platforms. That would be a major loss.
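To make the portability argument concrete, here is a minimal sketch in Python (not P4) of the separation between architecture and forwarding function. Every name in it (`Architecture`, `SwitchTarget`, `NicTarget`, `VendorNic`, `forward`) is a hypothetical stand-in invented for illustration; none of it reflects the actual PSA or PNA definitions.

```python
from typing import Protocol


class Architecture(Protocol):
    """Hypothetical stand-in for what a standard architecture (arch.p4) promises."""

    def parse(self, packet: bytes) -> dict: ...
    def send(self, port: int, packet: bytes) -> str: ...


class SwitchTarget:
    """A target that implements the agreed-upon interface."""

    def parse(self, packet: bytes) -> dict:
        return {"dst": packet[0]}  # toy header: first byte is the destination

    def send(self, port: int, packet: bytes) -> str:
        return f"switch: port {port}"


class NicTarget:
    """A second target implementing the same interface."""

    def parse(self, packet: bytes) -> dict:
        return {"dst": packet[0]}

    def send(self, port: int, packet: bytes) -> str:
        return f"nic: port {port}"


class VendorNic:
    """A target exposing only a vendor-specific interface (assumed for illustration)."""

    def vendor_parse(self, packet: bytes) -> dict:
        return {"dst": packet[0]}


def forward(arch: Architecture, table: dict, packet: bytes) -> str:
    """Stand-in for forward.p4: written once, against the interface only."""
    headers = arch.parse(packet)
    port = table.get(headers["dst"], 0)  # default to port 0 on a table miss
    return arch.send(port, packet)
```

The same `forward` function runs unchanged on `SwitchTarget` and `NicTarget`, but fails on `VendorNic`, which is the loss of portability at stake when targets do not converge on a common architecture.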
The second challenge is how to deal with pressure to add new language features. At one extreme, the history of SmartNICs teaches us there will be calls for off-loading arbitrary functionality to these devices. The cynic in me fully expects to see proposals to “add loops to P4 to enable X” in the not-too-distant future. Hopefully, efforts to turn P4 into C will not get any serious traction. More reasonably, however, we can expect proposals to add support for (a) stateful protocols, (b) event triggers, and (c) packet scheduling (for example) to the language. Doing so in a minimalist way, by adding only the necessary primitives and not baking in any policy, is the high bar that needs to be cleared. But to what end and at what cost? If you consider the ability to “off-load TCP” as your litmus test for this feature set, then it’s not clear you are putting the network/host demarcation point in the right place. The risk I see is the potential to sacrifice the verifiable correctness of the network, an objective I place far above any temporary win in performance or convenience.
That’s the narrow takeaway. The bigger picture is that all of this attention to P4 is happening in the context of a shift of focus from the datacenter to the network edge. Statements about the edge are always followed by shouts of “Which edge?”, to which the answer is simply “All of them!”. Nick McKeown and Sachin Katti framed it best in their keynote when they talked about the distributed edge, which includes everything this side of the datacenter, including Internet exchanges, co-location facilities, Telco Central Offices, and on-prem enterprise clouds. It’s that distributed set of locations that connects the physical world to the central cloud, and the challenge is to tame them in aggregate with the right mix of programming abstractions. This is a fundamentally hard problem because, unlike datacenters, which can be designed to maximize homogeneity (a necessary prerequisite to achieve operational scale), the distributed edge is fraught with heterogeneity. To paraphrase Tolstoy: All smoothly functioning enterprises are alike, but every dysfunctional enterprise is dysfunctional in its own unique way. That every enterprise is unique (and sometimes dysfunctional) is one of the lessons I learned from PlanetLab.
But setting that complication aside, the ambition of defining a unifying programming model for the edge is a laudable goal. The minimal thing it must do is hide unnecessary implementation details. It should not matter if a function is implemented in a P4 program running on a switch or an IPU, on a SmartNIC using some vendor’s customized variant of P4, on an end-host as an eBPF kernel extension, or as a server-hosted microservice—it should be possible to construct the necessary “device-to-cloud connectivity” through a combination of all of the above.
This is why the big picture has to be taken into account when considering extensions to P4: The problem space is larger than any one technology (including P4), and trying to turn one of those technologies into a universal mechanism is a questionable plan. The same is true for Kubernetes, which could be extended to make every edge function appear to be a container-based microservice connected by a CNI plugin. Instead, I believe the distributed edge calls for a new end-to-end service layer that sits “above” both P4-based forwarding pipelines and Kubernetes-based microservices (as well as other mechanisms like those listed above). For example, an approach we pursued in Aether (and before that in CORD) is to define a model-based runtime control system that effectively layers a unifying connectivity abstraction (and corresponding API) on top of a (heterogeneous) set of backend communication subsystems. In Aether, those backend subsystems are deployed as a combination of Kubernetes clusters, P4 forwarding functions, and SmartNIC programs. (You can read more about this in our Edge Cloud Operations book.)
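As a rough illustration of that layering, here is a minimal Python sketch of one connectivity abstraction fanned out to several kinds of backend. It is not Aether's actual API: `ConnectivityIntent`, `Backend`, `ConnectivityService`, and the three backend classes are all hypothetical names, and the strings they return stand in for whatever real configuration each subsystem would need.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectivityIntent:
    """Hypothetical model: connect a group of devices to an application endpoint."""

    device_group: str
    endpoint: str
    max_mbps: int


class Backend(ABC):
    """One of the heterogeneous subsystems hidden behind the abstraction."""

    @abstractmethod
    def apply(self, intent: ConnectivityIntent) -> str: ...


class P4Pipeline(Backend):
    def apply(self, intent: ConnectivityIntent) -> str:
        return f"p4: forward {intent.device_group} -> {intent.endpoint}"


class KubernetesCluster(Backend):
    def apply(self, intent: ConnectivityIntent) -> str:
        return f"k8s: expose service at {intent.endpoint}"


class SmartNic(Backend):
    def apply(self, intent: ConnectivityIntent) -> str:
        return f"nic: limit {intent.device_group} to {intent.max_mbps} Mbps"


class ConnectivityService:
    """The unifying layer: callers state intent and never name a backend."""

    def __init__(self, backends: list[Backend]) -> None:
        self.backends = list(backends)

    def connect(self, intent: ConnectivityIntent) -> list[str]:
        # Translate one model-level request into per-backend actions.
        return [backend.apply(intent) for backend in self.backends]
```

A caller would write something like `ConnectivityService([P4Pipeline(), KubernetesCluster(), SmartNic()]).connect(ConnectivityIntent("cameras", "10.0.0.5:443", 50))`. Swapping or adding a backend leaves the caller untouched.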
But that’s just the next layer of the stack, and there always seems to be another, more interesting layer. One might fairly ask “what is the right programming model for building out a function-rich edge?” One answer is that multiple application platforms will emerge: one for IoT, one for Immersive UI, one for ML/Analytics, and so on. But this answer is rooted in the conventional mind-set of the edge being a place to accelerate today’s datacenter-hosted applications. Another possibility is that the distributed edge cloud spurs the development of new edge-native applications (a term I learned from a collaboration with Mahadev Satyanarayanan). This train of thought takes us well beyond the immediate P4 todo list, so we’ll save it for another post, except to note that P4 programmers have shown a remarkable knack for injecting innovative functions into the forwarding pipeline. The emergence of edge-native applications will only lead to more opportunities for their creativity.
It’s been hard to miss the stories about sentient AI this week, and there was a timely piece in the Economist by Douglas Hofstadter making the case that even the best AI systems have a long way to go in the sentience department. We also read a great note from Scott Aaronson (our favorite quantum computing expert) on his experiences at the Solvay conference on the physics of quantum information. Scott proposes a “Law of Conservation of Weirdness” to explain the specific types of problems that might be sped up by quantum computing. As always, an enlightening read. And finally, our long-awaited book on Edge Cloud Operations (drawing extensively on the experience of building and operating the Aether edge cloud) is now available in print and ebook formats.
Larry - P4 has been around so long, and I have seen (and asked for) support for stateful processing. Why has that taken so long? At the same time, supporting L2/L3 switching along the lines of table-based OpenFlow is not very interesting IMO.
I'd rather like some form of FQ and AQM to make it cleanly into p4. https://github.com/ralfkundel/p4-codel/issues/2