Was MPLS Traffic Engineering Worthwhile?
One of the topics we keep coming back to at Systems Approach is “when should a feature be made part of a platform”. This question is at the heart of much of the work we have done over the years, and it is a reason we keep returning to the end-to-end argument in all sorts of contexts. This week we’re looking back at how the arguments played out in the context of traffic engineering.
By far the most controversial technology I have worked on in my career is MPLS traffic engineering (MPLS-TE). I can remember pretty well how the first ideas came about. I had joined Cisco a few months earlier (in 1995) and I was part of a team tasked with figuring out some way to combine the properties of ATM (asynchronous transfer mode) switching with IP routing. Even the idea that ATM and IP might coexist was somewhat controversial at the time. A solid chunk of the telco industry had lined up behind ATM as the future of packet switching and were planning to upgrade the telephone network from something that supported voice only to “Broadband ISDN” based on ATM. People in this world had a tendency to dismiss IP and Ethernet as “legacy” technologies. At the same time, the Internet community was growing rapidly as the Web took off and Internet access started to become something that consumers could get via dial-up modems or their employers. Steve Deering, an Internet pioneer, famously remarked that ATM would never get enough traction even to create a legacy (which was ultimately not far off the mark).
As discussed in a previous post, Yakov Rekhter had written out the basic ideas of MPLS (a.k.a. tag switching) in a two-page paper that set the direction for our team at Cisco. Our small team was going out on customer visits, mostly to the large ISPs who relied on Cisco for their routers, trying to determine whether the tag switching idea would interest them. At one such meeting I asked the customer (who was quite a character) whether he saw any value in having ATM switches (equipped with the tag switching software that did not yet exist) in his network. He picked up my business card (which had the words “ATM Business Unit” on it), left the room, and returned with a pair of scissors and some sticky tape. After cutting my card into strips, he put the fragments in front of me with the tape and asked me to show how I could add value by reassembling it. (This was clearly a joke about the need to break packets into small cells to send them over ATM networks.) I was taken aback by this but he certainly made his point in a memorable way.
You can get a sense of why MPLS was so controversial from that story. A person who was strongly committed to the IP architecture viewed ATM as worse than useless, and from his perspective, MPLS was just ATM in another form. From another perspective, MPLS was a distortion of the original vision of ATM, so we were not winning fans among traditional telco types either. Of course, if you take a middle position that irritates people at both ends of a spectrum, that doesn’t necessarily mean you’re doing the wrong thing. Nor does it prove you are right.
Some time later I had my first experience of talking to a trade journalist about my work. He interviewed me and some of the MPLS skeptics. Fortunately, I had someone from Cisco’s PR department on the call, because the resulting article was pretty embarrassing (for me). With a title along the lines of “MPLS: Bad for the Net?” it quoted me as follows: “MPLS is a significant deviation from the Internet architecture and some people think it should be stopped at all costs.” Calling MPLS “a significant deviation” wasn’t in any sense a Cisco position. The PR person had good notes and was able to verify what I had actually said: “Some people think MPLS is a significant deviation from the Internet architecture and should be stopped at all costs.” A correction was issued the following week (long after everyone with any interest had read the article, and tucked away deep in the magazine, as these things tend to be). I didn’t do a lot more press in my time at Cisco and that was fine with me.
I’ve already written about how valuable MPLS turned out to be for enterprise VPNs, but what about traffic engineering? Here I think the results are mixed and harder to quantify. While there was certainly a solid amount of deployment of MPLS-TE, it didn’t become an essential service of ISPs in the way MPLS VPNs did. If you want to make the case that MPLS-TE was really valuable, I think the strongest published arguments relate to its use in the massive backbones of Google and Azure. By applying SDN to the control plane of MPLS, the operators of these hyperscale networks are able to perform network-wide optimizations and squeeze out the performance gains that we hoped to achieve in the early days of MPLS-TE. But with the original, distributed MPLS-TE control plane (based on RSVP, the Resource ReSerVation Protocol) those gains were harder to achieve (because there was no global control), and equally hard to quantify. No one was going to run a substantive A/B test on their network with and without MPLS-TE. (There was additional controversy over the choice of extending RSVP rather than building a new protocol, but that’s a story for another day.)
It wasn’t hard to make a theoretical argument that MPLS-TE could improve network performance and average link utilization, by moving traffic from congested links to uncongested ones. The hard part was proving that it would actually do a better job in practice than the more traditional methods such as using link weights and multipath routing to achieve the same ends. In the end, we had two disjoint sets of customers who passionately believed in their positions: one set holding that MPLS-TE was completely useless, the other believing they could only achieve their utilization and performance objectives by deploying MPLS-TE. It was a classic case of “listen to the customer” and we did, developing an increasingly complex set of MPLS features for the second group while continuing to develop the traditional feature set for the first group. We were lucky to have the resources to do this; it was reported that the requirement to support MPLS became a barrier to entry for some router vendors in the ISP market. Conspiracy theorists might (and did) argue that this was all part of Cisco’s cunning plan: make products complex just to raise that barrier to entry for competition. I can honestly say that this was not the case with MPLS, in that we really did have customer demand for the complexity.
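To make the theoretical argument concrete, here is a minimal sketch of the kind of back-of-the-envelope comparison people would make. Everything in it is invented for illustration: the four-link topology, the capacity of 10 units per link, the demand of 12 units, and the helper names `link_loads` and `max_utilization` are all assumptions, not anything taken from a real router or a real network. It compares sending all traffic along the shortest path with splitting it across two explicitly routed paths.

```python
from collections import defaultdict

# Hypothetical numbers: every link has the same capacity (arbitrary units).
CAPACITY = 10.0

# Two candidate paths from ingress "A" to egress "D" (illustrative topology).
SHORT_PATH = ["A", "B", "D"]       # 2 hops: what shortest-path routing picks
LONG_PATH = ["A", "C", "E", "D"]   # 3 hops: idle under shortest-path routing


def link_loads(placements):
    """Sum the traffic placed on each directed link.

    placements is a list of (path, volume) pairs.
    """
    loads = defaultdict(float)
    for path, volume in placements:
        # Consecutive node pairs along the path are the links it uses.
        for link in zip(path, path[1:]):
            loads[link] += volume
    return dict(loads)


def max_utilization(placements, capacity=CAPACITY):
    """Return the utilization of the most heavily loaded link."""
    loads = link_loads(placements)
    return max(load / capacity for load in loads.values())


demand = 12.0  # offered load from A to D; more than one link can carry

# Shortest-path routing: all traffic follows the 2-hop path.
spf = [(SHORT_PATH, demand)]

# Explicitly routed paths: split the demand evenly across both paths.
te = [(SHORT_PATH, 6.0), (LONG_PATH, 6.0)]

print(f"shortest path only: max utilization {max_utilization(spf):.2f}")
print(f"explicit split:     max utilization {max_utilization(te):.2f}")
```

With these made-up numbers the shortest-path case overloads its links (utilization 1.20) while the split keeps every link at 0.60. Note that in a toy case like this, tuning link weights so that both paths have equal cost and relying on multipath routing would arrive at the same split, which is a small illustration of why the practical advantage of MPLS-TE was so hard to prove.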
All of this ancient history came back to mind recently as we’ve had a series of fascinating discussions about the end-to-end argument prompted by our recent pieces on networking support for RPC. I highly recommend occasional re-reading of the original paper, because most people take away the simple message “dumb core, smart end-hosts”, which is a massive oversimplification. The E2E paper actually makes a strong case for putting some functionality inside the network, particularly to improve performance, and offers a nuanced argument about how to decide when “enough is enough” regarding those in-network functions. In many respects my whole career was a set of efforts to decide when putting something into the network was a good idea from the end-to-end perspective. Did we get it right in the case of traffic engineering? We did what the customers wanted, giving them a tool to manage traffic that wasn’t available previously. Of course, it’s not always a good idea to listen to customers, and more isn’t always better. Jon Postel, the original RFC editor, put his finger on the issue:
“[There was] a space that we were exploring and, in the early days, we figured out this consistent path through the space: IP, TCP, and so on. What's been happening over the last few years is that the IETF is filling the rest of the space with every alternative approach, not necessarily any better.”
As one of the people who was “filling the space” at the time, I still don’t know if we made things better by doing so.
Further to our recent post on Large Language Models, Donald Michie’s matchbox-based machine learning system of the 1960s is apparently being rebuilt. And speaking of computing pioneers, the Turing Award this year goes to Ethernet inventor (and one-time Internet skeptic) Bob Metcalfe. With Ethernet progressing from 3 Mbps in 1973 to 400 Gbps today (with no end in sight), Metcalfe has certainly earned his place in networking’s hall of fame.