Back in October 2015, I spoke at All Things Open in Raleigh, North Carolina, an event focused on open technology and open source software. I was very excited by this event because many attendees work in or manage data centers, which means they are very familiar with Linux but have little experience with the networking stack. Cumulus Networks is the first major networking company to contribute a true Linux networking operating system for data center switches, which is highly disruptive to the industry and drives a lot of fun conversations with open-minded individuals.
The talk I gave at All Things Open, titled “Using DevOps Tools for Modern Data Centers,” focused on the new concept of NetDevOps, or DevOps for network devices. Since the network operating system is Cumulus Linux, why not use the open source, off-the-shelf automation tools that are already being leveraged in the data center to act as a controller? These tools have an extremely large user base, are vendor neutral (that is, not proprietary) and can scale easily.
So what are the benefits of using open source tools? One of the most important benefits from a networking point of view is provisioning. Imagine you have 1000 VLANs that need provisioning. The wrong way to do this would be to create a spreadsheet template that generates 1000 switch virtual interfaces (SVIs), 1000 VRRP configurations, trunk ports and so forth, then copy and paste this configuration into a telnet window and put up with the slow pace of configuring your network devices manually.
The right way to do this is to use DevOps tools like Ansible, Chef and Puppet, which are proven to solve this problem. Instead of a configuration existing as a file on someone’s laptop, a scary and hardly enterprise-grade method, it now exists within a Git repo on a redundant server or in a private GitHub account, where it is open and available to your organization.
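To make the 1000-VLAN case concrete, here is a minimal Python sketch of the underlying idea: the configuration is rendered from a template and a range of VLAN IDs rather than typed or pasted by hand. The stanza layout, the `render_svis` helper and the addressing plan are all illustrative assumptions, not the exact format any particular tool or switch expects. In practice, Ansible, Chef or Puppet would do this rendering with their own templating.

```python
from string import Template

# Hypothetical stanza layout, loosely modeled on an ifupdown-style
# interfaces file. The real format depends on your switch and tooling.
SVI_TEMPLATE = Template(
    "auto vlan$vid\n"
    "iface vlan$vid\n"
    "    address 10.$hi.$lo.1/24\n"
)


def render_svis(first_vid, count):
    """Render one interface stanza per VLAN ID, starting at first_vid."""
    stanzas = []
    for vid in range(first_vid, first_vid + count):
        # Illustrative addressing plan: spread the VLAN ID across two
        # octets so that every VLAN gets its own /24.
        hi, lo = divmod(vid, 256)
        stanzas.append(SVI_TEMPLATE.substitute(vid=vid, hi=hi, lo=lo))
    return "\n".join(stanzas)


config = render_svis(first_vid=100, count=1000)
print(config.count("iface "))  # number of stanzas generated
```

Because the output is just text produced from code, it can be committed to the Git repo and reviewed like any other change.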
Since the master configuration does not exist on the switch itself, but rather in code that can be on any device (a concept often referred to as infrastructure as code), once complex and critical activities like hot swapping a switch are now very easy to do. Git also acts as your configuration management tool. A network engineer can’t casually add a VLAN to a leaf switch and end up causing a loop, thereby destroying your network. Ansible, Chef and Puppet all enforce policy and revert the running configuration of the switch back to its pristine state.
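The enforcement idea can be sketched in a few lines of Python. This is only an illustration of the concept, not how Ansible, Chef or Puppet are implemented: the `plan_enforcement` function and the VLAN lists are hypothetical, and real tools compare much richer state than a list of VLAN IDs.

```python
def plan_enforcement(desired_vlans, running_vlans):
    """Return the changes needed to bring a switch back to the desired state."""
    desired, running = set(desired_vlans), set(running_vlans)
    return {
        "add": sorted(desired - running),     # in the repo, missing on the switch
        "remove": sorted(running - desired),  # on the switch, not in the repo
    }


desired = [10, 20, 30]       # hypothetical VLAN list kept in Git
running = [10, 20, 30, 999]  # someone added VLAN 999 by hand
plan = plan_enforcement(desired, running)
print(plan)  # {'add': [], 'remove': [999]}
```

The repo is the source of truth, so the hand-added VLAN shows up as drift to be removed on the next run.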
What else can we do?
DevOps tools make application deployment and routine maintenance tasks easy, and can even be your user administration tool. These are normal and well understood concepts within the server community. With 24 or more servers in a rack and with hundreds if not thousands of VMs to manage, running DevOps tools is crucial to deploying and managing these bare metal servers and virtual machines.
Unlike an individual server, network infrastructure is crucial to entire racks or even rows of equipment. Even with a highly available network infrastructure, a failure or misconfiguration can be detrimental to a data center. What if we could automate common network tasks?
Have a bad fan or other hardware problem? Just kick off an Ansible playbook or Puppet manifest and bring the device gracefully out of the routing fabric. Ansible, Chef and Puppet can even alert you in many ways, such as text messages, email or Slack messages. Simply message the on-site tech to replace the broken switch, then troubleshoot it offline.
No time for downtime? Cumulus Linux can route traffic gracefully around network switches that need maintenance, so you can upgrade your switch with minimal downtime. Increase the OSPF cost or prepend the BGP AS path to make the device less preferred for routing. Instead of maintenance windows taking hours, they can now take minutes.
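Here is a small Python sketch of why raising the OSPF cost drains a switch. It runs a shortest-path calculation over a toy two-leaf, two-spine topology, then sets every link touching one spine to 65535, the largest OSPF interface cost. The topology, the link costs and the `drain` helper are assumptions made up for illustration. A real fabric would also use equal-cost multipath, which this sketch ignores.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over an undirected graph given as {(a, b): cost}."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    queue = [(0, [src])]
    seen = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in seen:
                heapq.heappush(queue, (cost + link_cost, path + [neighbor]))
    return None  # dst is unreachable


def drain(links, node, high_cost=65535):
    """Return a copy of links with every link touching node set to high_cost."""
    return {
        pair: (high_cost if node in pair else cost)
        for pair, cost in links.items()
    }


# Hypothetical fabric: two leaves, two spines, every link at cost 10.
links = {
    ("leaf1", "spine1"): 10, ("leaf1", "spine2"): 10,
    ("leaf2", "spine1"): 10, ("leaf2", "spine2"): 10,
}
before = shortest_path(links, "leaf1", "leaf2")
after = shortest_path(drain(links, "spine1"), "leaf1", "leaf2")
print(before)
print(after)  # (20, ['leaf1', 'spine2', 'leaf2'])
```

The drained spine stays reachable for management, but no leaf-to-leaf path prefers it, so it can be upgraded while traffic flows through the other spine.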
Cumulus Linux can even help protect the network from certain types of attacks. Did your monitoring tool notice a lot of traffic coming from a country where you have no business? Are you experiencing a DDoS attack, with your Web server getting the “hug of death”? Why not have the monitoring tool automatically kick off an Ansible playbook that blocks the attack? An iptables rule can be pushed to hardware to block only that specific source subnet or to block the destination service (DNS, ICMP and so forth). Instead of the network administrator struggling to figure out how to recover from the attack while it is happening, the automation has already detected the problem and responded on the administrator’s behalf. By the time that admin gets back to a computer terminal, users are already up and running.
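As a rough sketch of what the automated response might generate, the Python below builds the arguments for an iptables rule that drops traffic from one source subnet, optionally limited to a single service. The `build_block_rule` function, the `SERVICES` shortcuts and the choice of the `FORWARD` chain are assumptions for illustration. How the rule actually gets installed on the switch and pushed to hardware is left out.

```python
import ipaddress

# Hypothetical service shortcuts used only in this sketch.
SERVICES = {
    "dns": ["-p", "udp", "--dport", "53"],
    "icmp": ["-p", "icmp"],
}


def build_block_rule(source_subnet, service=None, chain="FORWARD"):
    """Build iptables arguments that drop traffic from source_subnet."""
    # ip_network raises ValueError on malformed input, so a bad alert
    # cannot turn into a bad rule.
    network = ipaddress.ip_network(source_subnet, strict=True)
    if network.version != 4:
        raise ValueError("this sketch only handles IPv4 subnets")
    rule = ["iptables", "-A", chain, "-s", str(network)]
    if service is not None:
        rule += SERVICES[service]
    return rule + ["-j", "DROP"]


print(" ".join(build_block_rule("203.0.113.0/24")))
# iptables -A FORWARD -s 203.0.113.0/24 -j DROP
print(" ".join(build_block_rule("203.0.113.0/24", service="dns")))
# iptables -A FORWARD -s 203.0.113.0/24 -p udp --dport 53 -j DROP
```

Validating the subnet before building the rule matters here, since the input comes from an automated alert rather than a person.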
Let’s dive deeper into an elaborate example.
To explore network automation further, check out our knowledge base article Automation for Network Engineers. If you want to learn more about open source projects at Cumulus Networks, read the interview I did with opensource.com.