One of the key tenets of DevOps is automation, or more specifically, “Infrastructure as Code.” That means your system configuration is expressed as a series of scripts that can be executed by your configuration management software, repeatedly, across multiple machines.
Treating infrastructure as code has many benefits, including the ability to control when and how changes are applied, to apply changes quickly, and to manage your changes with version control. Most importantly, because it’s code, you can test it.
If you’ve been maintaining computer systems for any amount of time, you’ve probably accidentally broken something important when you were making a configuration change; either the change didn’t work as you expected or you typed the wrong command. What if you had been able to write your changes ahead of time and test them before you applied them to production? Infrastructure as Code enables you to do just that.
Software developers have been testing their code for a long time, and we can apply their experience and knowledge to Infrastructure as Code. Just as a range of testing tools is available to software engineers, automation engineers can draw on a collection of tools to build themselves a complete end-to-end testing framework.
There are different types of testing. Each has a distinct purpose, but all of them help ensure your infrastructure code works as you expect.
Syntax and Style Analysis
Simple static analysis can catch the lowest level of common errors: typos and syntax errors. Did you forget to close a quoted string? Is that a period instead of a comma? Did you mistype the method name? Any of these mistakes would otherwise surface as a runtime error, possibly with a long and complicated backtrace for you to analyze. Static analysis tools catch them before the code runs and report the location more precisely.
Style analysis tools also enable your team to enforce a consistent set of standards across your automation code; anyone who has had to read a script written by a co-worker will appreciate how useful a consistent coding standard is. It makes integrating and debugging other people’s code much easier, which speeds up your development cycle and allows your team to deliver changes faster.
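To make the two kinds of check concrete, here is a minimal sketch in Python using only the standard library. It assumes the configuration data is stored as JSON; the function names and the style rules (no tabs, no trailing whitespace, a maximum line length) are illustrative choices, not the behavior of any particular lint tool.

```python
import json


def check_json_syntax(text):
    """Return None if text parses as JSON, otherwise a short error location."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


def check_style(text, max_length=80):
    """Return a list of (line_number, message) pairs for style violations."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if "\t" in line:
            problems.append((number, "tab character; use spaces"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        if len(line) > max_length:
            problems.append((number, f"line longer than {max_length} characters"))
    return problems


if __name__ == "__main__":
    # A comma is missing between the two array entries on the second line.
    broken = '{\n  "ntp_servers": ["0.pool.ntp.org" "1.pool.ntp.org"]\n}'
    print(check_json_syntax(broken))
    print(check_style("name: web \n\trole: frontend"))
```

The syntax check points at the line and column of the problem without running any configuration code, which is the main advantage over waiting for a runtime failure.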
Unit Testing
Often bemoaned by developers, unit tests really come into their own as your infrastructure code grows. Unit tests verify that your code, as written, fulfills its requirements. For example, when you run a recipe to configure NTP on a client, will the correct configuration file be created? How about when you run that same recipe on an NTP server? Will the configuration files be created with the correct ownership and permissions?
Unit tests can be most useful during refactoring, to ensure you haven’t missed anything, and when you are writing code that relies on external data or variables, to ensure the correct actions are taken if the data changes.
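The NTP questions above can be sketched as unit tests. The example below is a Python illustration rather than a real recipe: `render_ntp_conf` is a hypothetical function standing in for your infrastructure code, and the subnet and `restrict` lines are placeholder values, not a recommended NTP configuration.

```python
import unittest


def render_ntp_conf(role, upstream_servers):
    """Hypothetical renderer: build ntp.conf text for a 'client' or 'server' node."""
    if role not in ("client", "server"):
        raise ValueError(f"unknown role: {role}")
    lines = ["driftfile /var/lib/ntp/ntp.drift"]
    lines += [f"server {host} iburst" for host in upstream_servers]
    lines.append("restrict default kod nomodify notrap nopeer noquery")
    lines.append("restrict 127.0.0.1")
    if role == "server":
        # Illustrative only: relax the restrictions for hosts on an example subnet.
        lines.append("restrict 10.0.0.0 mask 255.255.255.0 nomodify notrap")
    return "\n".join(lines) + "\n"


class RenderNtpConfTest(unittest.TestCase):
    def test_client_lists_upstream_servers(self):
        conf = render_ntp_conf("client", ["0.pool.ntp.org", "1.pool.ntp.org"])
        self.assertIn("server 0.pool.ntp.org iburst", conf)
        self.assertIn("server 1.pool.ntp.org iburst", conf)

    def test_client_has_no_subnet_rule(self):
        conf = render_ntp_conf("client", ["0.pool.ntp.org"])
        self.assertNotIn("restrict 10.0.0.0", conf)

    def test_server_has_subnet_rule(self):
        conf = render_ntp_conf("server", ["0.pool.ntp.org"])
        self.assertIn("restrict 10.0.0.0 mask 255.255.255.0", conf)

    def test_unknown_role_is_rejected(self):
        with self.assertRaises(ValueError):
            render_ntp_conf("router", [])


if __name__ == "__main__":
    unittest.main()
```

Because the upstream server list is passed in as data, the same tests keep protecting you when that data changes or when the rendering code is refactored.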
Acceptance Testing
Acceptance testing is possibly one of the easiest and most useful forms of testing you can apply to your infrastructure code. Most acceptance testing for infrastructure code is centered around the idea of actually running your code inside a sandbox (usually a virtual machine) and then verifying the result. Rapid write/test/fix cycles can be achieved when developers are able to run their code within a virtual machine on their own workstation. They can find bugs and fix them, and test the results until they are confident that their infrastructure code works as intended.
Acceptance testing is also an important component of test-driven development (TDD) and behavior-driven development (BDD), where you can write your tests first and then write and test your infrastructure code using those tests. Once all of your tests pass, you know that your code meets expectations.
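The apply-then-verify loop can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a temporary directory stands in for the virtual machine, `apply_ntp_config` is a hypothetical stand-in for your infrastructure code, and the permission check assumes a POSIX file system.

```python
import os
import stat
import tempfile


def apply_ntp_config(root, content):
    """Hypothetical infrastructure code: install etc/ntp.conf under root."""
    etc_dir = os.path.join(root, "etc")
    os.makedirs(etc_dir, exist_ok=True)
    path = os.path.join(etc_dir, "ntp.conf")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(content)
    os.chmod(path, 0o644)
    return path


def verify_ntp_config(root, expected_server):
    """Acceptance check: return a list of failures (empty means pass)."""
    path = os.path.join(root, "etc", "ntp.conf")
    if not os.path.isfile(path):
        return [f"{path} does not exist"]
    failures = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode != 0o644:
        failures.append(f"mode is {oct(mode)}, expected 0o644")
    with open(path, encoding="utf-8") as handle:
        if f"server {expected_server}" not in handle.read():
            failures.append(f"no server line for {expected_server}")
    return failures


def run_acceptance_test():
    """Run the check before and after applying the code in a fresh sandbox."""
    with tempfile.TemporaryDirectory() as sandbox:
        before = verify_ntp_config(sandbox, "0.pool.ntp.org")
        apply_ntp_config(sandbox, "server 0.pool.ntp.org iburst\n")
        after = verify_ntp_config(sandbox, "0.pool.ntp.org")
    return before, after


if __name__ == "__main__":
    before, after = run_acceptance_test()
    print("before applying:", before)
    print("after applying:", after)
```

Running the check before the code is applied mirrors the test-first workflow: the test fails on an empty sandbox and passes only once the infrastructure code has done its job.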
Integration Testing
Complex systems have complex interactions, and their components often interact in unexpected ways. Integration testing allows you to test the system as a whole. An operations engineer might want to check that the load balancers can connect to the Web servers, that the Web servers can connect to the database servers and that the stack as a whole can serve a Web page successfully. A network engineer might want to check that the leaf/spine topology is configured correctly, that MLAG is communicating between peers and that traffic can traverse the network.
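As a rough illustration of the simplest of these checks, the Python sketch below tests whether a TCP connection can be opened to each service in a list. The function names are illustrative, and the demo uses a throwaway listener on localhost so that it runs anywhere. In a real test you would run the check from the host that needs to make the connection (for example, from the load balancer toward the Web servers) and substitute your own hosts and ports.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_stack(links):
    """links is a list of (description, host, port); return the failed descriptions."""
    return [desc for desc, host, port in links if not can_connect(host, port)]


if __name__ == "__main__":
    # Demo only: a local listener stands in for a Web server.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    links = [("test host -> stand-in Web server", "127.0.0.1", port)]
    print("failures while listening:", check_stack(links))

    listener.close()
    print("failures after shutdown:", check_stack(links))
```

A successful TCP connection only shows that something is listening. Checks such as "the stack can serve a Web page" or "MLAG peers are communicating" need protocol-level verification on top of this.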
Testing the Network
The good news is that because Cumulus Linux supports the same configuration management tools as other Linux systems, you can use these exact same tools to test your infrastructure code and network configuration too. With tools such as Vagrant, Serverspec and Cumulus VX, you can now have the same rapid write/test/fix development cycles and the same confidence that your infrastructure code can be safely deployed, all without ever having to deploy a physical switch or re-cable a lab environment.
Good, repeatable tests give you the confidence that minor changes to your network configuration will not have major impacts. Last, but certainly not least, you’ll never need to worry about mistyping a command on a production router ever again!