Happy Chinese New Year Y’all!!!
I wanted to take the time to talk about the 25/100G trend we find ourselves in. More and more platforms are being added (Cumulus Linux supports 9 of these platforms with the 3.2 release and we have plans to add more over the next few months), and more customers are making the switch to 25/100G as a way to future-proof their networks, given that 25/100G open networking switches are now economically on par with their 10/40G counterparts. It’s clear that the 25/100G trend is picking up speed. It was a long journey to get to this point, and we learned a lot on the way. I’d like to take this opportunity to look back, analyze the situation and highlight a few things we learned as an industry.
Setting the stage: A short history of the 25G and 100G rush
I don’t know if y’all remember the 100G race between vendors in 2015 to deliver the first 100G switch based on the new 28 Gbps lane signaling; everyone had to be first in the market. We even had a handful of 100G switch submissions to OCP by mid 2015. Plus, for the first time in the networking industry, 100G open networking switches were available in the same quarter (Q4’15) that traditional vendors were selling 100G switches. But if your network wasn’t quite large enough or you didn’t have the spending power to procure your 100G devices with 100G NICs and 100G pluggables, y’all had to wait at least a quarter or two for supply to replenish.
But there was more than just supply issues going on. There were plenty of compatibility and interop issues to resolve (as there always are when a new interface standard is introduced). Left unresolved, these would have been an unbearable burden for 100G and 25G adoption. Some vendors “resolved” the interoperability issue by only supporting their one cable/optic solution; others decided to not fully implement/follow the standard (such as keeping Forward Error Correction (FEC) and Auto Negotiation (AN) off). Both of these directions soon caused headaches for early adopters, who found themselves doing gymnastics when it came to plugging in any new cable/optic.
It was these issues that motivated the OCP L1 Interop Program, administered by the University of New Hampshire’s InterOperability Laboratory (UNH-IOL), to tackle them and provide peace of mind to end users using open networking technologies. It took the better part of 2016 to implement the test plan for 25G and 100G, but by the end of the year, we had two successful Plugfests solely focused on 25G (December 2016) and 100G (August 2016), with passing configurations listed on the Integrator’s List and more listings added every month.
So now that the picture has been painted with broad strokes, how ‘bout we dive into some issues that affected everyone? In the paragraphs below, I’ll detail many of the issues that were discussed at the Plugfests regarding 25G and 100G, as well as how these issues have been resolved and how the industry has learned from them.
Challenges in the software ecosystem
Today, y’all can grab a given 25/100G box, install Cumulus Linux on it and expect that everything just works. But it took some work to get there. The following sections describe the journey it took to make that possible.
ASIC SDK Maturity
In the past, system vendors would typically work through low-level programming issues on their own, leaving the ASIC suppliers and their SDKs in the dark. The open ecosystem and early focus on interoperability of OCP exposes these gaps early enough in the cycle, giving both ASIC and NOS vendors the ability to resolve these issues.
During the 100G plugfest, it was discovered that the SDKs supplied by some ASIC vendors had bugs, and were missing features, in the code that enables and reports FEC and AN settings on a given port. This was just the tip of the iceberg, as other issues arose, such as some media types being detected incorrectly (an issue that gets worse in combination with the problems described in the following sections).
These issues led to some test configurations being either inoperable or unrepeatable (not a good sign during testing). Today, most of these bugs have been completely resolved either by the ASIC vendor (SDK fix) or by the NOS vendor (work-around). But it was a bit of a journey for us to get there.
NOSes properly handling FEC and AN settings
Starting with the 100G standard (IEEE 802.3bj-2014), we saw the introduction of mandatory FEC and AN settings, and the requirement followed through to the 25G standard, IEEE 802.3by-2016, which came two years after 100G. In order to handle these changes, not only does the ASIC need to support FEC but the SDKs and NOSes also need to enable these features.
If fiber is used, RS FEC is turned on and AN is turned off. If copper is used, AN is enabled, and AN is then in charge of negotiating the FEC algorithm between both ends of the link. Easy peasy, right?
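To make that rule concrete, here’s a minimal sketch of the decision a NOS has to make for each port. The names (`LinkSettings`, `default_link_settings`) and the `"fiber"`/`"copper"` strings are made up for illustration; they aren’t taken from Cumulus Linux or from any ASIC SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkSettings:
    autoneg: bool  # is Auto Negotiation enabled on the port?
    fec: str       # "rs" for RS FEC, "negotiated" when AN picks the algorithm


def default_link_settings(media: str) -> LinkSettings:
    """Pick default AN/FEC settings from the media type reported for a port.

    Hypothetical helper: the media string stands in for whatever the SDK
    reports after inspecting the pluggable.
    """
    if media == "fiber":
        # Fiber: RS FEC on, AN off.
        return LinkSettings(autoneg=False, fec="rs")
    if media == "copper":
        # Copper: AN on, and AN negotiates the FEC algorithm with the peer.
        return LinkSettings(autoneg=True, fec="negotiated")
    raise ValueError(f"unknown media type: {media!r}")


fiber = default_link_settings("fiber")
copper = default_link_settings("copper")
```

Notice that the two results share nothing: if the reported media type is wrong, every setting programmed on the port is wrong too.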
Well, if the SDK reports a fiber module to be copper, or vice versa, some percentage of the time, the NOS will mistakenly program the wrong settings and the link fails to come up. And if there’s not a reliable method for reporting link settings, or the method is missing altogether, lots of finger pointing happens.
In some NOSes, proper and reliable reporting of link settings is mandatory, as this determines which settings will be available to the user via their CLI.
NOSes misidentifying pluggables (via EEPROM)
There is a set of specifications that define how pluggables need to identify themselves (SFF-8636, SFF-8024, et al). With 25G and 100G came two new identifiers for determining 25G and 100G parts: SFP28 and QSFP28 respectively (along with other settings and identifiers to properly say this is a 25G link for short distances, etc.). This was an awesome development: a quick and easy way to determine if a link is using the new interfaces, since they share the same physical connector as the older ones.
Heck, a 40G copper cable (DAC) is electrically the same as the 100G copper cable. Why sell the same cable with different EEPROM settings? Good point. Shortly thereafter, the definition for identifying SFP+ and QSFP+ (10G and 40G respectively) was amended to include the clause, “or later”. And since 25G and 100G are later than 10G and 40G, the same cable for 40G and 100G can be programmed the same.
Now a NOS has to not only read the pluggable identifier but also the full extended capabilities section to determine the correct speed (100G vs 40G, etc.). This issue hit us hard during the 100G plugfest but was mostly ironed out by the time the 25G plugfest occurred.
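Here’s a rough sketch of what that two-step lookup can look like for a QSFP-style module. Fair warning: the byte offsets and code values below are my reading of SFF-8636 and SFF-8024 and should be checked against the current revisions before y’all rely on them. The function name is hypothetical, and the code table is deliberately incomplete.

```python
from typing import Optional

# Identifier values (EEPROM byte 0). Assumed from SFF-8024; verify before use.
ID_QSFP_PLUS = 0x0D  # "QSFP+ or later"
ID_QSFP28 = 0x11     # "QSFP28 or later"

# Byte offsets in the SFF-8636 memory map (assumed; verify before use).
ETH_COMPLIANCE = 131      # 10/40G/100G Ethernet compliance codes
EXT_COMPLIANCE = 192      # extended specification compliance code
EXTENDED_BIT = 0x80       # bit 7 of byte 131: "look at byte 192"
MASK_40G = 0x0F           # bits 0-3 of byte 131: the 40G variants

# A few extended codes that indicate a 100G part (illustrative subset).
EXT_100G_CODES = {0x01, 0x02, 0x03, 0x04, 0x0B}


def qsfp_speed_gbps(eeprom: bytes) -> Optional[int]:
    """Guess the Ethernet speed of a QSFP-style module from its EEPROM."""
    if len(eeprom) <= EXT_COMPLIANCE:
        raise ValueError("EEPROM dump too short")
    if eeprom[0] not in (ID_QSFP_PLUS, ID_QSFP28):
        return None  # not a QSFP-style module
    eth = eeprom[ETH_COMPLIANCE]
    # The identifier alone is not enough: a "QSFP+ or later" part may be 100G.
    if eth & EXTENDED_BIT and eeprom[EXT_COMPLIANCE] in EXT_100G_CODES:
        return 100
    if eth & MASK_40G:
        return 40
    return None  # unknown; leave the decision to the caller


# A DAC that identifies as "QSFP+ or later" but advertises a 100G copper code.
dac = bytearray(256)
dac[0] = ID_QSFP_PLUS
dac[ETH_COMPLIANCE] = EXTENDED_BIT
dac[EXT_COMPLIANCE] = 0x0B
speed = qsfp_speed_gbps(bytes(dac))
```

In this example the identifier says “QSFP+ or later”, and only the extended code reveals that the cable should come up at 100G.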
The 25G/100G standards catch-up game
Strange twists in history have led to an interesting situation in the 25G/100G realm. Let me walk you down this road.
Remember how we talked about ports being misidentified, not being able to reliably get link setting information, as well as NOSes not properly handling link settings? I sure do.
Remember how the 100G standard came out before the 25G standard by two years and the same ASIC handling 100G (4 lanes of 25G) can be used in 25G switches? I sure…wait…what?
Well, the supplier quest for being first to market with 100G caused the 25G standard not to be ready at the same time as 100G (along with other issues of a political persuasion). This in turn caused some issues with some ASIC vendors.
What I’m about to describe isn’t anything new. Remember VHS vs Betamax, HD-DVD vs Blu-ray, or even the 56K modem drama of the late 90s? It’s hard to predict the future of the market and, as such, even harder to predict which standards will be ratified when.
The 25G specification came later than the initial availability of 100G capable ASICs and thus the early ASICs were unable to comply with a future standard. While supporting RS FEC at 100G (across all four 25G lanes), some vendors didn’t implement RS FEC on a single 25G lane. How could this happen? It’s simple, really. The ASIC existed before the 25G spec was out, by almost 2 years! As we all know in the tech world, a lot of things change in 2 years. So now we have some 25G switches that don’t conform to spec when using 25G optics because they can’t enable RS FEC on a single 25G lane. The good news is, if you are using 25G DACs to connect servers, you can rely on AN choosing the appropriate FEC algorithm during link up. If you want 25G optics, the silver lining is that we’ll be starting a new chip cycle this year where the ASICs fully conform to the 25G spec, making this issue a thing of the past.
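If it helps to see that gap spelled out, here’s a tiny sketch of the check a NOS could make before bringing up an optical port. `AsicCaps` and `optics_conform` are hypothetical names for illustration only; a real SDK exposes its capabilities in its own way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AsicCaps:
    rs_fec_100g: bool  # RS FEC across all four 25G lanes of a 100G port
    rs_fec_25g: bool   # RS FEC on a single 25G lane


def optics_conform(speed_gbps: int, caps: AsicCaps) -> bool:
    """Return True if a fiber port at this speed can enable RS FEC."""
    if speed_gbps == 100:
        return caps.rs_fec_100g
    if speed_gbps == 25:
        return caps.rs_fec_25g
    raise ValueError(f"unsupported speed: {speed_gbps}G")


# An early ASIC as described above: fine at 100G, but no single-lane RS FEC.
early_asic = AsicCaps(rs_fec_100g=True, rs_fec_25g=False)
ok_100g = optics_conform(100, early_asic)
ok_25g = optics_conform(25, early_asic)
```

For the early ASIC in this sketch, the 100G optical port passes the check and the 25G optical port does not, which is exactly the non-conformance described above.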
Looking back: We learned a lot
I, along with countless others in the industry and Open Compute, spent a good part of 2015 and 2016 putting in one helluva effort to get 25G and 100G to a point where users can adopt them with the ease and ubiquity of 10/40G. Most, if not all, of the issues stated above have been resolved by those participants at the OCP L1 Interop Program.
One thing is clear: Cumulus Networks, along with all participants at the plugfests including 3M, Amphenol, Edgecore, FoxConn-FIT, Finisar, Ixia, Mellanox, SANBlaze, The Siemon Company, TE Connectivity, and Teledyne LeCroy, took away all of those lessons learned to make their products and the open ecosystem better. For Cumulus Networks, we packed them into Cumulus Linux 3.2 to provide a solid and robust solution for users regardless of whether they are using 1/10/40G or 25/100G.
But don’t take my word for it. Take a gander at the Integrator’s List or, if y’all are so inclined, get a few switches with Cumulus Linux to take a look for y’all’s self.