Technology Update with Dries Harnie: “How to improve container security with microVMs or gVisor”
CONTAINERS AND VMs
In a modern enterprise IT environment, most applications run on bare metal, in virtual machines (VMs) or in containers. Containers rely on partitioning techniques inside the host kernel (on Linux, namespaces and cgroups) to give the processes running inside them the illusion of having a machine to themselves. This brings significant advantages for management, development, and operations.
A major concern with container technology is that all containers on a machine share the same kernel. This means that code running in a container can exploit a kernel vulnerability to “break out” and affect other containers or the host system. In this article, we discuss two different approaches to solving that problem.
THE MICROVMs APPROACH
Since the security problems with containers originate from the shared kernel, a logical response is to give each container its own kernel. The machine then runs multiple (container + kernel) pairs under hardware virtualisation, an isolation mechanism with a long security track record. The virtual machines used for this isolation are not full-blown like those used on desktops or clusters. Instead, these microVMs expose only a minimal set of devices, typically one network interface and one storage device, and little else. Moreover, all communication between the code running inside the microVM and the outside world is mediated by a dedicated, security-hardened process on the host computer.
The project that spearheads the microVM approach is Firecracker. Its creators claim to be able to start about a hundred microVMs per second on a single host, with the container itself starting within a few hundred milliseconds. In contrast, typical VM start-up times are measured in seconds, if not minutes. Compared to regular containers, microVMs have more memory overhead (a few megabytes per microVM) and a small CPU cost for network and disk I/O. For some, however, the peace of mind is worth the overhead: a compromised microVM is far less likely to affect other containers on the same host or the rest of the network, because an attacker must also break through the virtualisation boundary.
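To make the “one network interface and one storage device” idea concrete, here is a minimal sketch in Python that builds a Firecracker-style configuration file. The field names follow Firecracker's JSON configuration format as we understand it and should be checked against the release you run. The file paths, the `tap0` device name, the boot arguments and the helper name `build_microvm_config` are assumptions made for illustration only.

```python
import json


def build_microvm_config(kernel_path, rootfs_path, tap_device,
                         vcpu_count=1, mem_size_mib=128):
    """Return a Firecracker-style microVM configuration as a plain dict.

    All paths and device names are supplied by the caller; nothing here
    talks to Firecracker itself.
    """
    return {
        # The guest gets its own kernel instead of sharing the host's.
        "boot-source": {
            "kernel_image_path": kernel_path,
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        # Exactly one storage device: the root filesystem image.
        "drives": [{
            "drive_id": "rootfs",
            "path_on_host": rootfs_path,
            "is_root_device": True,
            "is_read_only": False,
        }],
        # Exactly one network interface, backed by a host TAP device.
        "network-interfaces": [{
            "iface_id": "eth0",
            "host_dev_name": tap_device,
        }],
        "machine-config": {
            "vcpu_count": vcpu_count,
            "mem_size_mib": mem_size_mib,
        },
    }


if __name__ == "__main__":
    # Hypothetical paths, used only to show the resulting JSON.
    config = build_microvm_config(
        "/srv/microvm/vmlinux", "/srv/microvm/rootfs.ext4", "tap0")
    print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and, in recent Firecracker releases, passed to the `firecracker` binary through its `--config-file` option. The short device list is the point: everything that is not declared here simply does not exist inside the guest.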
THE GVISOR APPROACH
The gVisor approach attacks the root problem directly, namely that each container has direct access to the host kernel. gVisor interposes itself between the running container and the host operating system with a custom, sandboxed, user-space implementation of the Linux system call interface, so the application's system calls are handled by gVisor rather than by the host kernel. What does that buy you? Fewer moving parts (no virtualisation required) and configurable levels of isolation. The configurability allows system operators, for example, to run high-throughput network services on the host networking stack while still shielding the rest of the kernel surface from attacks originating in the container.
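The configurable isolation described above can be sketched as a container runtime entry. The Python snippet below generates the `runtimes` section of a Docker `daemon.json` that registers gVisor's `runsc` binary, optionally with the `--network=host` runtime argument so that the container uses the host networking stack. The install path `/usr/local/bin/runsc` and the helper names are assumptions; adjust them to your installation.

```python
import json


def runsc_runtime_entry(runsc_path="/usr/local/bin/runsc", host_network=False):
    """Describe the gVisor runtime for Docker's daemon.json.

    With host_network=True, network traffic goes through the host stack
    (faster, less isolated); other system calls still hit the sandbox.
    """
    entry = {"path": runsc_path}
    if host_network:
        entry["runtimeArgs"] = ["--network=host"]
    return entry


def docker_daemon_config(host_network=False):
    """Return the daemon.json fragment that registers the 'runsc' runtime."""
    return {"runtimes": {"runsc": runsc_runtime_entry(host_network=host_network)}}


if __name__ == "__main__":
    print(json.dumps(docker_daemon_config(host_network=True), indent=2))
```

Once the runtime is registered and the Docker daemon restarted, an individual container opts in with `docker run --runtime=runsc ...`. Containers started without that flag keep using the default runtime, which makes it possible to sandbox only the workloads that need it.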
Both Firecracker and gVisor can be deployed as drop-in replacements for core containerd services. Google has integrated gVisor into its managed Kubernetes offering (GKE), and several companies are building infrastructure on top of microVMs.
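On Kubernetes, a sandboxed runtime is typically selected per pod through a RuntimeClass. The sketch below builds the two manifests involved as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The class name `gvisor`, the handler name `runsc`, and the pod and image names are assumptions: the handler must match whatever name the node's containerd configuration uses, and managed offerings may pre-create the class for you.

```python
import json


def runtime_class(name="gvisor", handler="runsc"):
    """RuntimeClass that maps a cluster-wide name to a node-level handler."""
    return {
        "apiVersion": "node.k8s.io/v1",
        "kind": "RuntimeClass",
        "metadata": {"name": name},
        "handler": handler,
    }


def sandboxed_pod(pod_name, image, runtime_class_name="gvisor"):
    """Pod that asks to be run by the sandboxed runtime."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            # The only line that differs from an ordinary pod spec.
            "runtimeClassName": runtime_class_name,
            "containers": [{"name": pod_name, "image": image}],
        },
    }


if __name__ == "__main__":
    # Hypothetical pod name and image, for illustration only.
    for manifest in (runtime_class(), sandboxed_pod("untrusted-job", "example/job:1.0")):
        print(json.dumps(manifest, indent=2))
```

Because the choice of runtime is a single field on the pod, the same cluster can run trusted workloads on the default runtime and untrusted ones in a sandbox.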
Once the improved security capability becomes available, it is a matter of figuring out where best to apply it. As a CTO, it is alluring to mandate Firecracker or gVisor company-wide. However, it might be more prudent to deploy these technologies first on systems that must be highly robust in the face of untrusted inputs, untrusted containers, or both. They can also serve as an extra layer in a “defence in depth” strategy.
Firecracker and gVisor also share a feature called container snapshotting, where the memory of a running container can be saved to a file and restored later. This technique, also known as checkpointing, has been widely employed in the HPC world to reduce the loss of work from node or process crashes. Snapshotting can also provide forensic data when a container does get compromised. In addition, application architectures might restore containers from pre-made snapshots to avoid long application load or warm-up times.
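As an illustration of the snapshot workflow on the Firecracker side, the sketch below lists the API requests involved: pause the microVM, write the snapshot, then resume, and later load the snapshot into a fresh microVM. The endpoint paths and field names reflect our reading of Firecracker's snapshot API and may differ between versions, so treat them as assumptions to verify. The code only builds the request descriptions and does not send anything; the file paths and function names are hypothetical.

```python
def snapshot_requests(snapshot_path, mem_path):
    """Requests to checkpoint a running microVM, in the order they are sent.

    Each item is (HTTP method, API path, JSON body).
    """
    return [
        # The microVM must be paused so that memory is consistent.
        ("PATCH", "/vm", {"state": "Paused"}),
        ("PUT", "/snapshot/create", {
            "snapshot_type": "Full",
            "snapshot_path": snapshot_path,   # device and CPU state
            "mem_file_path": mem_path,        # guest memory contents
        }),
        ("PATCH", "/vm", {"state": "Resumed"}),
    ]


def restore_requests(snapshot_path, mem_path):
    """Request to restore a snapshot into a freshly started Firecracker process."""
    return [
        ("PUT", "/snapshot/load", {
            "snapshot_path": snapshot_path,
            "mem_backend": {"backend_type": "File", "backend_path": mem_path},
            "resume_vm": True,
        }),
    ]


if __name__ == "__main__":
    # Hypothetical file locations, for illustration only.
    for method, path, body in snapshot_requests("/srv/snap/vm.state", "/srv/snap/vm.mem"):
        print(method, path, body)
```

For the warm-up use case, the snapshot would be taken once after the application has finished loading, and every new instance would then start from `restore_requests` instead of a cold boot.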
From a pure performance point of view, both technologies add overhead compared to regular containers. However, they can significantly strengthen a company’s security posture when used correctly. At Addestino, we enjoy solving complicated technical problems. If you are interested in discussing one, let us know.