Cumulus was a key part of getting OpenStack to work in the way that we wanted. Using Cumulus to go to a Layer 3-to-the-node network, we were able to create a very highly resilient network.
Government, Cloud Service Provider
Provide a local, secure, cloud service for the Australian government
Mellanox, OpenStack, Ansible
Vault is an Australian cloud infrastructure provider that offers Australian Signals Directorate (ASD) certified secure cloud services for the Australian Federal Government. Its systems have been designed to demonstrate its commitment to providing an assured, trusted cloud computing environment and have achieved ASD certification for use with UNCLASSIFIED DLM and PROTECTED data. Vault’s clouds host government tenants who manage everything from sensitive social security and health record data up to classified military and intelligence information.
The sensitive data of Australian citizens that the government holds is secured to standards set by the ASD, the Commonwealth authority on the security of information. To facilitate compliance with these standards, the ASD publishes the Information Security Manual (ISM), which outlines over one thousand controls that dictate how data should be handled to ensure the security of information within Australia.
In a recent government survey, 93% of Australians stated that they wanted to retain sovereignty over their data. As an Australian-owned business, Vault can point to the fact that its secure cloud infrastructure is situated wholly within Australian borders, ensuring that data sovereignty is maintained and that no issues arise involving cross-border jurisdictional ownership claims. Vault has worked to demonstrate that its ASD-certified cloud provides a local, assured and secure environment for the hosting of Australian Government data and services.
After investigating multiple vendor solutions, Vault found that it couldn't meet the required security standards using a proprietary cloud platform running on bare-metal hardware. With these proprietary solutions, Vault needed to make multiple diverse technology components work seamlessly together, and could not run them without substantially modifying their default settings. Having to continually add layers of extra security on top of proprietary commercial solutions complicated the cloud platform and made the model unsustainable.
Rupert Taylor-Price, CEO of Vault, explained, “We were always trying to get a commercial product to operate in a way for government that it was never designed to do, and by trying to do that we had endless network issues.”
In addition, Vault had difficulties in attaining acceptable levels of network control and performance. Many legacy government workloads were not designed natively for the cloud, and Vault found that many agencies were not yet ready to refactor applications into cloud native architectures to deliver optimal cloud performance. With the suboptimal performance of these legacy workloads, Vault realised that the specifications of their cloud infrastructure needed to improve to deliver superior performance.
Vault’s pre-existing Brocade network presented them with unsustainable upgrade paths.
“When you go through certain upgrade paths and even have switch-over of services as you upgrade things, it becomes unsustainable. Being a cloud provider,” says Rupert, “the idea of having even one and a half seconds of network downtime is just completely unacceptable.”
“Imagine if government services were modern, unified, intuitive, secure, and as reliable as some of the web services we use every day,” posits Roland Cabana, Chief Customer Experience Officer. “We knew we had to build such an environment, because that was what our users were demanding.”
With the Australian government estimating that by 2020 it would invest $1.1 billion into cloud infrastructure, Vault knew that they had an opportunity to become a leading local cloud service provider, if only they could get their technology right. Their primary objective would be to guarantee uptime availability, and to create a highly resilient network with many paths from multiple network infrastructure connections. To meet these goals and overcome legacy infrastructure challenges, Vault applied a cloud maturity model with four strategic phases: lift and shift of workloads, optimization, orchestration and automation, and going cloud native.
With the newfound mission to find a solution that could meet their network and infrastructure requirements, Vault began leveraging the open source OpenStack cloud platform. It gave them the flexibility to adapt the platform where needed so that it would comply with Australian Signals Directorate security requirements. Using OpenStack, the team were able to craft a solution that addressed their primary needs: maintaining a high level of control over the network, scaling on demand, and meeting the performance requirements of legacy government workloads.
Cumulus Networks and Mellanox were key collaborators with Vault in adapting OpenStack to meet their requirements. Vault chose to use Mellanox’s 100 Gigabit per second fabric running Cumulus Linux, decoupling the networking from the hardware layer. Combined with OpenStack’s Neutron networking service, this forms the basis of Vault’s high-speed Software Defined Network (SDN). Cumulus Networks’ Debian-based OS provides full flexibility, customization, and automation for bare-metal provisioning. Since integrating Cumulus Linux, Vault has seen extreme performance improvements with the Mellanox Spectrum switches. Migrating workloads and offloading through VXLAN has yielded further efficiency gains.
As an example, the Australian Department of Finance has created the ICON network to manage legacy workload migration. This Layer 2 fiber network seamlessly connects with Vault’s Layer 3 architecture via a security switch. The security switch runs a service that terminates the Layer 2 segment and translates it to VXLAN, going right to the host. The Layer 3 fabric in essence becomes a transit for the entire network. Vault can spin up a workload and send data from a desktop to the cloud with sub-millisecond latency. This improves the performance of technologies like VDI, creating a better customer experience, and allows Vault to host a true hybrid cloud model with the database on one side and applications on the other, essentially extending the data center.
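The Layer 2 to VXLAN translation described above can be illustrated with a minimal sketch. It is not Vault's or Cumulus Linux's implementation: the VLAN-to-VNI table and the function names are hypothetical, and only the 8-byte VXLAN header layout and UDP port 4789 come from RFC 7348. The outer UDP/IP headers that carry the packet across the Layer 3 fabric are omitted.

```python
import struct
from typing import Tuple

VXLAN_UDP_PORT = 4789          # IANA-assigned VXLAN destination port (RFC 7348)
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field is valid

# Hypothetical mapping from a terminated Layer 2 segment (VLAN ID)
# to a VXLAN Network Identifier (VNI); the values are illustrative only.
VLAN_TO_VNI = {100: 10100, 200: 10200}


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header: flags, reserved bits, 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vlan_id: int, inner_frame: bytes) -> bytes:
    """Prefix an untagged Ethernet frame with the VXLAN header for its segment."""
    return vxlan_header(VLAN_TO_VNI[vlan_id]) + inner_frame


def decapsulate(payload: bytes) -> Tuple[int, bytes]:
    """Return (VNI, inner frame) from a VXLAN payload, as the receiving host would."""
    flags_word, vni_word = struct.unpack("!II", payload[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8, payload[8:]
```

In this sketch, a frame arriving on VLAN 100 would be wrapped with VNI 10100 and carried over the routed fabric to the host, which recovers the original frame with `decapsulate`.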
In addition, Vault is able to adapt networking parameters and controls to meet the regulations and metrics needed to comply with ASD security requirements. As new security requirements are added or modified, Vault is able to adapt the OpenStack platform rather than adding further layers of security and complexity.
For network automation, Vault has provisioned its entire cloud through Ansible and OpenStack. The seamless integration between Cumulus Linux and Linux-based Ansible makes the task of automation easier, a key factor in the success of Vault’s cloud platform. With an internal Layer 3 network architecture, the Vault team can easily add hardware as needed, with a one-day turnaround to rack, stack, and provision from bare metal. The company’s gated commit process adds an extra element of reliability to its structured change process. If an engineer wants to make a change to the network fabric, the change is first applied to a test and development environment, where a verification process confirms acceptable operational behaviour. If it passes testing in this environment, the change is then committed to the production environment. This also allows Vault to scrutinise changes effectively for potential performance degradation and provides a platform to onboard engineers at a higher rate, creating a culture of flexible talent.
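The gated commit flow described above can be sketched in a few lines. This is an illustrative model only, not Vault's tooling: `Environment`, `gated_commit`, and the `mtu_supports_vxlan` check are hypothetical names, and a real pipeline would run Ansible playbooks and network validation instead of updating an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Config = Dict[str, str]
Check = Callable[[Config], bool]


@dataclass
class Environment:
    """Hypothetical stand-in for a network environment's applied configuration."""
    name: str
    config: Config = field(default_factory=dict)

    def apply(self, change: Config) -> None:
        self.config.update(change)


def mtu_supports_vxlan(config: Config) -> bool:
    """Example check (assumed): leave room for ~50 bytes of VXLAN overhead."""
    return int(config.get("mtu", "0")) >= 1550


def gated_commit(change: Config, test_env: Environment,
                 prod_env: Environment, checks: List[Check]) -> bool:
    """Apply a change to test first; promote to production only if all checks pass."""
    snapshot = dict(test_env.config)      # keep a copy so a failed change can be undone
    test_env.apply(change)
    if all(check(test_env.config) for check in checks):
        prod_env.apply(change)            # the gate is open: commit to production
        return True
    test_env.config = snapshot            # roll the test environment back
    return False
```

The design choice worth noting is that production is only ever touched after every verification check has passed against the test environment, so a failing change never reaches the live fabric.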
“Cumulus Networks and Mellanox have built a true hybrid cloud with us,” reports Roland. Rupert adds that “we've gotten to a very comfortable state with our network, which is probably the first time in my 12 years of working with these types of systems.”
By creating a true hybrid cloud environment with Cumulus Networks and Mellanox, Vault has been able to achieve full compliance with Australian Government information security standards. The hyperconverged design provides the unparalleled performance, customization, and on-demand scalability that the team needed to deliver. Finally, the Vault Engineering team now has the confidence to do any kind of maintenance on the network without fear of unforeseen issues arising. They can work on one component of the network with the peace of mind that the network is resilient enough to provide zero downtime and zero packet loss.
Some of the benefits of Vault’s technological partnership with Cumulus Networks and Mellanox include:
- Vault has achieved 100% ISM control compliance by utilizing open source OpenStack. Other cloud providers must take additional mitigation steps, and must acknowledge the potential risks of proprietary systems and disclose them.
- Cumulus Linux and Mellanox create a true hybrid cloud. This allows government agencies to seamlessly use workloads in the cloud to their advantage.
- Hyperconverged design provides unparalleled compute performance. Vault has seen improvements on ML and other workloads that had previously encountered disk latency issues.
- Reduced risk for government by using open standards. If the Australian Government needs to migrate away from Vault’s cloud, it can take its upstream code, deploy its own cloud and migrate its workloads back in-house, because Vault’s open standards approach provides full portability.
- Vault has significantly reduced the number of issues reported with its system and has improved its automated configuration process to reduce configuration errors.
“The specialized knowledge that people still have is obviously still around networking and network architecture,” says Rupert. “But it's no longer really so much around proprietary technologies and trying to work out how a particular implementation or particular way of doing something was implemented with a specific vendor. We have more configuration freedom to build the innovative things we need.”
What’s next for Vault?
With the rapid growth of container technology, and specifically Kubernetes for container orchestration, Vault is looking for ways to better track the connections of these ephemeral services.
“More and more people are adopting Kubernetes in our architecture,” says Roland Cabana. “We need to find a solution that monitors performance metrics on some of the clusters. This will be quite important. We look forward to using Cumulus NetQ to give us this Kubernetes CNI visibility.”