There has been a lot of buzz in the industry about containers and how they are streamlining organizational processes. In short, containers are a modern application sandboxing mechanism that is gaining popularity in all aspects of computing, from the home desktop to web-scale enterprises. In this post we’ll cover the basics: what is container networking, and how can it help your data center? In the future, we’ll cover how you can optimize a web-scale network using Cumulus Linux and containers.
What is a container?
A container is an isolated execution environment on a Linux host that behaves much like a full-featured Linux installation with its own users, file system, processes and network stack. Running an application inside a container isolates it from the host and from other containers: even when the applications inside a container run as root, they cannot access or modify the files, processes, users, or other resources of the host or of other containers.
Containers have become popular due to the way they simplify the process of installing and running an application on a Linux server. Applications can have a complicated web of dependencies. The newest version of an application may require a newer version of a dependency than is available for the Linux distribution, and upgrading the dependency may break another application running on the server.
However, since a container simulates a Linux environment, it becomes possible to install the dependencies in the container without causing any conflicts with the host. In fact, it’s possible to run multiple containers at the same time, all with different versions of applications and libraries! Finally, containers are portable and can be shared across platforms. Docker, a popular container engine, defines a specific image format for storing containers. This allows a developer to package a container with all of its dependencies, post it online and let users download and run the container right away.
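The dependency argument can be illustrated with a small toy model. This is only a sketch: the application names, the library name `libfoo` and the version numbers are hypothetical, and the check is a simplification of what a real package manager does.

```python
def conflicts(requirements):
    """Return libraries pinned to more than one version in a single environment."""
    pinned = {}
    clashes = set()
    for deps in requirements.values():
        for lib, version in deps.items():
            # setdefault records the first version seen for each library
            if pinned.setdefault(lib, version) != version:
                clashes.add(lib)
    return sorted(clashes)


# Hypothetical applications that need different versions of the same library.
APPS = {
    "legacy-app": {"libfoo": "1.2"},
    "new-app": {"libfoo": "2.0"},
}

# Installed side by side on one host, the two apps clash over libfoo.
shared_host = conflicts(APPS)

# With one container per app, each environment holds only its own dependencies.
per_container = {app: conflicts({app: deps}) for app, deps in APPS.items()}

print("shared host:", shared_host)
print("per container:", per_container)
```

On the shared host the model reports a clash over `libfoo`, while each per-container environment comes back clean.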
What makes a container different from a virtual machine?
On the surface, containers look and feel very similar to virtual machines and can be used in many of the same applications where a VM would be used. Containers are implemented in the Linux kernel, which means that they share their resources with their host.
For example, whenever an application is run in a container, it is possible to see the container process in the host operating system’s process table. The kernel transparently maps the “real” user and process IDs on the host to the “fake” IDs that the application sees inside the container.
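One place this mapping is visible is the `NSpid` line in `/proc/<pid>/status`, available on Linux 4.1 and later. It lists a process's ID in each nested PID namespace, outermost first. The sketch below parses that line from a sample excerpt rather than a live system; the process name and PID values are made up for illustration.

```python
def parse_nspid(status_text):
    """Return the PIDs on the NSpid line, outermost namespace first."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    raise ValueError("no NSpid line found (kernel older than 4.1?)")


# Illustrative excerpt of /proc/<pid>/status as read from the host.
# The process name and the PID values are invented for this example.
SAMPLE_STATUS = (
    "Name:\tnginx\n"
    "Pid:\t20451\n"
    "PPid:\t20430\n"
    "NSpid:\t20451\t1\n"
)

pids = parse_nspid(SAMPLE_STATUS)
host_pid, container_pid = pids[0], pids[-1]
print(f"host PID {host_pid} is PID {container_pid} inside the container")
```

On a real Linux host, the same function could be fed the contents of `/proc/<pid>/status` for a containerized process.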
Virtual machines, however, fully simulate all of the hardware of a computer including the peripherals, network adapters, hard drives, memory and CPU instructions. There is much more overhead involved in virtualization, meaning that on average, VMs are larger and slower than their container counterparts. They are also more opaque: applications that run in a VM are not visible in the host operating system’s process table — special hooks into the VM are needed to get that level of visibility.
There are a few things that can be done in a VM that can’t be done with a container:
- Encapsulate persistent data with a specific VM (e.g. in a vmdk file)
- Install a new OS from a .iso file
- Use a different filesystem than the host (ext3/btrfs)
- Use a different version of the Linux Kernel than the host (3.x/4.x)
Virtual machines are ideal for environments that run multiple operating systems (Windows, various Linux distributions, etc.), or for cloud hosting where customers want to import specific images and be isolated from the provider’s other tenants. Containers were designed with web-scale applications in mind, to replace VMs as the deployment platform for microservice architectures.
How is container networking helpful in web-scale application deployments?
Containers in enterprise environments are often used as part of a microservices architecture. This model is common for large web applications where many different applications handle individual tasks and communicate with one another via traditional IP networking.
In a simple web application, one might have:
- A front-end web GUI
- A backend REST API
- A cache
- A database (for the application; the actual data would be stored somewhere outside the container, such as on a dedicated NAS)
In the microservices model, each of these roles would be its own container. The host would create four containers, one for each purpose. External-facing containers such as the GUI and API would be exposed to the public internet via some kind of network address translation, while the others would be on a private network shared with the public services.
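The layout described above can be modeled in a few lines. This is a plain-Python sketch of the topology, not a configuration for any particular container engine; the container names, the network name `private` and the port numbers are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Container:
    name: str
    networks: Tuple[str, ...]
    published_port: Optional[int] = None  # host port forwarded via NAT, if any


# Hypothetical layout: all four containers share one private network,
# and only the GUI and the API publish a port to the outside world.
CONTAINERS = [
    Container("gui", ("private",), published_port=443),
    Container("api", ("private",), published_port=8443),
    Container("cache", ("private",)),
    Container("db", ("private",)),
]


def reachable_from_internet(containers):
    """Names of containers that publish a port through NAT."""
    return [c.name for c in containers if c.published_port is not None]


def can_talk(a, b):
    """Two containers can communicate if they share at least one network."""
    return bool(set(a.networks) & set(b.networks))


by_name = {c.name: c for c in CONTAINERS}
print(reachable_from_internet(CONTAINERS))
print(can_talk(by_name["api"], by_name["db"]))
```

In this model only `gui` and `api` are reachable from outside, while the API can still reach the database over the shared private network.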
There are several advantages to the microservices model:
Easy to deploy: Rather than having to remember or automate a complicated host configuration to handle all of the roles simultaneously, the configuration can be embedded in the container, ready to go as soon as it is deployed.
Disposable: Containers are designed to be disposable. They often start up in a matter of seconds. Rather than upgrading them live as one would do with traditional hard nodes in a data center, it’s faster and easier to simply destroy the container and replace it with the new one. If the host fails for some reason, getting the application back online is a matter of plugging in a hot spare server and starting the containers there instead.
Fault tolerant: Modern clustered applications often go hand-in-hand with containers. Creating a redundant database or web server is a matter of starting copies of the same container across multiple physical nodes to provide high availability and fault tolerance.
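The fault-tolerance point can be sketched as a simple placement exercise. The round-robin strategy, the node names and the service name below are assumptions for illustration; real schedulers use more elaborate placement rules.

```python
from collections import defaultdict


def place_replicas(service, replicas, nodes):
    """Assign replicas round-robin so that copies land on different nodes."""
    placement = defaultdict(list)
    for i in range(replicas):
        node = nodes[i % len(nodes)]
        placement[node].append(f"{service}-{i}")
    return dict(placement)


def survivors(placement, failed_node):
    """Replicas still running after one node goes down."""
    return [
        replica
        for node, replica_list in placement.items()
        if node != failed_node
        for replica in replica_list
    ]


# Hypothetical three-node cluster running three copies of a database container.
NODES = ["node-a", "node-b", "node-c"]
placement = place_replicas("db", 3, NODES)
remaining = survivors(placement, "node-b")

print(placement)
print(remaining)
```

With one copy per node, losing `node-b` still leaves `db-0` and `db-2` running, which is the property the redundant deployment is meant to provide.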
The buzz about containers is for good reason. They’re an affordable, efficient way to install and deploy software on Linux.