“We need to do upgrades in the network” is one of those phrases that chills the bones of IT engineers everywhere. Upgrades don’t have to be so painful, and in this blog we’re going to discuss the upgrade process recommended by Cumulus and leave you with some example automation to make the process as efficient as possible.
Upgrades are necessary to maintain stable and secure code, but they bring the risk of new bugs and sustained outages due to unforeseen circumstances, and they’re generally not easy to perform. Anyone who has worked in network operations knows that upgrade windows can run as quickly as an hour or as long as all night (and maybe the next three nights). Even as I write this, I am remembering upgrade windows of old where things did not go according to plan. But before we get into the specifics of the upgrade process with Cumulus, it is worth discussing why network upgrades are so fraught with peril.
DISCLAIMER: Rant Incoming
The biggest impediment to network upgrades is complexity: specifically, the complexity most folks consciously add to their design when they choose Multichassis Link Aggregation (MLAG) to provide redundant network connectivity to the host. Complex software fails in complex ways, and MLAG is a complex piece of software that, regardless of your vendor, is not standards-based; there are no standards for MLAG like there are for other parts of the network. If you’re still in the design phase of your network, it is worth mentioning that decisions made here have STRONG ripple effects on what the upgrade process looks like. To take a page from JR Rivers, consider designing your Infrastructure with Purpose. Insist on not supporting legacy L2 technologies in your new environment, and work toward a network that is not everything to everybody but rather a highly focused, purposefully built infrastructure that serves the actual needs of the application. Consider deploying Routing on the Host to move to more robust L3 designs if possible.
The simplest upgrades involve staging a new image and then rebooting the network device. Just like the NASA Jet Propulsion Laboratory and its “seven minutes of terror” during the InSight landing, network engineers keep their fingers crossed that the switch comes back online after the upgrade process completes and the switch reboots.
Some proactive network engineers create a MOP (Method of Procedure) with specific steps to follow. A good MOP might even outline rollback procedures in the event something goes wrong and the upgrade needs to be backed out. MOPs are normally written based on lab testing or vendor advice regarding best practices.
Going a step further, you might even consider removing humans from the equation entirely and creating a script or an automated playbook that takes care of all those steps for you, which brings us to the real purpose of this blog.
One of our consultants, Eric Pulvino, spent a lot of time thinking through this problem and has taken the lead in creating an upgrade playbook that aims to proactively cover all scenarios of an upgrade. This playbook, with slight modification in one form or another, has been used in several large customer environments.
This playbook covers three of the most common data center technologies:
* BGP peering into the fabric
* CLAG peer links
* Host links downstream
The above three cover all forwarding paths on top-of-rack devices.
When performing your upgrade, you want to make sure you’re not working on a node that is in service and handling traffic. There are a number of techniques that can be used to take a node out of service at both Layer 2 and Layer 3.
BGP Graceful Shutdown
One such technique applies to BGP. BGP Graceful Shutdown is a feature that leverages the well-known GRACEFUL_SHUTDOWN community to reduce the amount of traffic affected during a reboot of a BGP-enabled node.
The playbook pushes the graceful shutdown functionality to all underlay peers. If it is applicable to your environment, the playbook can be enhanced to push the graceful shutdown community to any BGP peerings within VRFs as well; this is common on edge switches that use VRFs to establish external BGP peerings.
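On an FRR-based Cumulus switch, the configuration the playbook pushes can be sketched roughly like this (a hedged illustration, not the playbook's exact task; the ASN is a placeholder for your own underlay AS):

```shell
# Sketch: enable FRR's graceful-shutdown knob, which attaches the
# well-known GRACEFUL_SHUTDOWN community (65535:0) to advertised
# routes so peers deprioritize paths through this node.
# AS 65101 is a placeholder for your own underlay ASN.
sudo vtysh \
  -c 'configure terminal' \
  -c 'router bgp 65101' \
  -c 'bgp graceful-shutdown' \
  -c 'end' \
  -c 'write memory'
```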
The playbook also covers some verification steps by examining whether the graceful-shutdown community has been effectively populated after being configured:
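One such check can be sketched as follows (the prefix is a placeholder; you would query a route learned from the node being drained):

```shell
# Sketch: confirm the graceful-shutdown community is now attached
# to a route from the node being upgraded. 10.1.1.0/24 is a
# placeholder prefix for your environment.
sudo vtysh -c 'show ip bgp 10.1.1.0/24' | grep -i 'graceful-shutdown'
```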
Additionally, the playbook double checks whether traffic is flowing inbound on the links where BGP has the graceful-shutdown community enabled.
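A simple way to sanity-check that traffic has drained is to sample the kernel's receive counter for a link twice and compare. A minimal sketch (IFACE defaults to `lo` here so it runs on any Linux box; on a switch you would point it at an swp port):

```shell
# Sketch: measure inbound bytes on an interface over a short window.
# A delta near zero suggests traffic has drained off the link.
IFACE="${IFACE:-lo}"
RX1=$(cat "/sys/class/net/$IFACE/statistics/rx_bytes")
sleep 2
RX2=$(cat "/sys/class/net/$IFACE/statistics/rx_bytes")
DELTA=$((RX2 - RX1))
echo "rx_bytes delta on $IFACE over 2s: $DELTA"
```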
If your environment is using CLAG, the playbook will detect that and perform some additional preparation for your upgrade. With CLAG you want to make sure you’re not about to take the primary node out of service, as that has the potential to introduce a service disruption on your hosts. On any CLAG-enabled node pair, the lowest priority wins the primary role (just like in STP). For the purposes of an upgrade, the priority of the node being upgraded is made less preferred by increasing the value, ensuring it holds the secondary role. That way, when the peerlink is disabled there is no situation where the switch roles have to swap, and because the node is the secondary, disabling the CLAG peerlink automatically disables all of its host links as well.
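The runtime priority change can be sketched with `clagctl` (8192 is an arbitrary example assumed to be numerically higher, and therefore less preferred, than the peer's priority):

```shell
# Sketch: demote this node so it holds the CLAG secondary role.
# A higher priority value is less preferred, as with STP.
sudo clagctl priority 8192
# Confirm the role before disabling the peerlink:
sudo clagctl
```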
Disable Host Ports
The playbook leverages the automatic CLAG behavior whereby the secondary CLAG peer, upon losing the peerlink, automatically brings down all host-facing ports with a clag-id configured.
Alternatively, for more manual control, all host ports can be manually set down using a command:
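For example, a manual sketch using standard Linux tooling (the swp port list is a placeholder for your own host-facing interfaces or bonds):

```shell
# Sketch: administratively bring down a set of host-facing ports.
for port in swp1 swp2 swp3 swp4; do
  sudo ip link set "$port" down
done
```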
Once the switch is effectively out of service at Layer 3 and Layer 2, the upgrade process can begin without fear of service disruption. For this, the playbook uses the `onie-install` command to stage both an image and a Zero Touch Provisioning (ZTP) script on your switch. Prestaging an image is great: it means you don't rely on the network at all to deliver the new image, because ONIE will already have it fetched when the switch is rebooted. Since there are no dependencies on a management network to pull the image, this technique is also perfect for switches managed via in-band-only connectivity.
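The staging step can be sketched as follows (both URLs are placeholders, and the flags assume Cumulus Linux's `onie-install`, where `-a` activates the staged image on the next reboot and `-z` stages a ZTP script):

```shell
# Sketch: prestage the new image and ZTP script with onie-install.
# Both URLs are placeholders for your own image server.
desired_image="http://imageserver.example.com/cumulus-linux-3.7.x.bin"
ztp_url="http://imageserver.example.com/ztp.sh"
sudo onie-install -f -a -i "$desired_image" -z "$ztp_url"
```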
In the above code, desired_image and ztp_url are both variables that can be tuned to point wherever images are stored in your environment.
There are lots of different ways that network engineers create workflows which give them confidence in the upgrade process. Whether you’re using automation, as is the case here, or doing everything by hand at the CLI, the most important aspect of the upgrade process is that you test it ahead of time to understand the behaviors of your environment. Perhaps set up a few Cumulus VX nodes and practice taking a node out of service with a configuration that matches your own environment. Or maybe take a few of your spares off the shelf and wire them up to test the process for maximum authenticity.
Hopefully, this blog has helped you think about the moving parts as they apply to your own environment and provided some useful automation samples to jumpstart your upgrade process.