Docker vs Virtual Machines: What a Container Actually Is

Everyone repeats that VMs virtualize hardware and containers virtualize the OS. True, but it hides the real thing. A container isn't a tiny VM. It's a normal process on the host with blinders on, fenced off by two Linux kernel features.

Tech Talk News Editorial · Jun 9, 2026 · 10 min read

#docker #containers #virtual-machines #devops #linux #infrastructure

Key takeaways

A container is not a lightweight VM. It is an ordinary host process that the Linux kernel has fenced off with two features: namespaces, which control what the process can see, and cgroups, which control what it can use.
A VM boots a full guest OS with its own kernel on top of a hypervisor, which is why VM images are measured in gigabytes and take seconds to minutes to start, while containers have no kernel to boot and start in milliseconds.
The container security tradeoff is that every container shares the host kernel, so a single kernel vulnerability that lets a process escape its namespaces removes the fence entirely, whereas escaping a VM means defeating both the guest kernel and the hypervisor.
gVisor, Firecracker, and Kata Containers all solve untrusted-workload isolation the same way, by putting a real boundary back under the container, and Firecracker micro-VMs boot in around 125 milliseconds and power AWS Lambda and Fargate.
Containers took over because of reproducibility, not isolation. The image ships the entire userland, so the image that ran in CI is byte for byte the image that runs in production, which kills the "works on my machine" problem.

The sentence everyone repeats is “VMs virtualize the hardware, containers virtualize the operating system.” It's correct. It's also useless, because it leaves you picturing a container as a skinnier virtual machine. Same idea, less overhead. That mental model is wrong, and almost every confusing thing about Docker traces back to it.

A container is not a small computer running inside your computer. It has no operating system of its own. It has no kernel of its own. When you run a container, you are running an ordinary process on your host, the same kind of process as your text editor or your web server, except the kernel has been told to lie to it about what it can see. The container thinks it has the whole machine. It doesn't. It has blinders on.

Once that clicks, the rest falls out for free. Why containers start in milliseconds instead of seconds. Why a 30MB image is normal and a 30MB VM is a joke. Why a kernel bug is scarier in container-land than in VM-land. Why Docker on your Mac is quietly running a Linux VM behind your back. Let's actually go through it.

Summary

A VM boots a full guest OS, with its own kernel, on top of a hypervisor. A container is just a host process that the Linux kernel has fenced off using two features: namespaces (what it can see) and cgroups (what it can use). No guest kernel, no boot. That single difference explains the speed, the size, and the security tradeoff all at once.

What a virtual machine actually does

Start with the heavy option, because it's the honest baseline. A virtual machine is a real computer simulated in software. Underneath it sits a hypervisor, the thing that carves your physical machine into virtual ones. On a server that's usually KVM, on a laptop it might be something like VMware or the built-in macOS hypervisor framework.^[1] The hypervisor hands each VM what looks like its own CPU, its own memory, its own disk, its own network card.

On top of that virtual hardware, the VM boots a complete operating system. A full guest OS, kernel and all. If you run an Ubuntu VM on a Windows host, there is a real Linux kernel running inside that VM, scheduling its own processes, managing its own memory, talking to virtual devices that the hypervisor pretends are real. The guest has no idea it's a guest.

That's why VMs are heavy. You're booting an entire OS. The disk image is measured in gigabytes because it contains a whole filesystem. Startup takes seconds to minutes because the kernel has to come up, mount things, start init, bring up services. Each VM also reserves its own slice of RAM for its own kernel and background processes, whether or not your app needs it.

The payoff for all that weight is isolation. Real, hard isolation. The guest kernel is a separate kernel. Code running inside the VM talks to that kernel, not to the host's. For a process inside the VM to reach the host, it has to break out of the guest OS and then defeat the hypervisor, which is a small, hardened, heavily audited layer. That is a genuinely difficult thing to do. This is why VMs have been the unit of multi-tenant isolation in the cloud for two decades.

Plain English

A VM is a fake computer all the way down. It boots its own operating system and runs its own kernel. Heavy and slow to start, but the wall between it and the host is thick, because there are two kernels and a hypervisor in between.

A container is a process wearing blinders

Now the light option. When you start a container, the Linux kernel does not boot anything. It starts a process, the same way it starts any process, and then it applies two kinds of restrictions to that process. That's the whole trick. Two features, both built into the Linux kernel, both there for years before Docker made them famous.

The first is namespaces. A namespace controls what a process can see.^[2] There are several kinds, and each one virtualizes a different slice of the system:

The mount namespace gives the process its own view of the filesystem. It sees the container image as / and has no idea your real root filesystem exists.
The PID namespace gives it its own process tree. The first process in the container is PID 1, like it's the only thing running on the machine. It can't see any of the host's processes.
The network namespace gives it its own network stack, its own interfaces, its own IP, its own ports. That's how two containers can both bind port 80 without fighting.
User, UTS, and IPC namespaces do the same trick for user IDs, the hostname, and inter-process communication.
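
To make the namespace idea concrete, here is a minimal Python sketch that lists the namespaces the current process belongs to by reading the links under /proc/self/ns. It assumes a Linux host; the helper names parse_ns_link and current_namespaces are made up for illustration, and on macOS or Windows the function simply returns an empty dict.

```python
from __future__ import annotations

import os
import re
from pathlib import Path

NS_DIR = Path("/proc/self/ns")  # Linux-only; absent on macOS and Windows


def parse_ns_link(target: str) -> tuple[str, int]:
    """Split a namespace link target such as 'pid:[4026531836]' into (kind, id)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link format: {target!r}")
    return match.group(1), int(match.group(2))


def current_namespaces() -> dict[str, int]:
    """Return {entry name: namespace id} for this process, or {} off Linux."""
    if not NS_DIR.is_dir():
        return {}
    namespaces = {}
    for entry in sorted(NS_DIR.iterdir()):
        try:
            _kind, ns_id = parse_ns_link(os.readlink(entry))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or do not recognise
        namespaces[entry.name] = ns_id
    return namespaces


if __name__ == "__main__":
    for name, ns_id in current_namespaces().items():
        print(f"{name:20} {ns_id}")
```

Every Linux process already lives in a set of namespaces, not just containers. Run this once on the host and once inside a container and, under a default Docker setup, you would expect several of the ids (mnt, pid, net) to differ. That difference is the blinders.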

The second feature is cgroups, short for control groups. Where namespaces decide what a process can see, cgroups decide what it can use.^[3] This much CPU, this much memory, this much disk I/O, this many processes. Hit the memory cap and the kernel's OOM killer kills a process in the group. This is what stops one container from eating the whole box and starving everything else.
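
As a rough illustration of where that budget lives, the sketch below reads the current process's cgroup and its memory cap. It assumes cgroup v2 mounted at /sys/fs/cgroup, which is common on recent distributions but not universal; the function names are hypothetical, and it only looks at the process's own cgroup, not at limits set on ancestor groups.

```python
from __future__ import annotations

from pathlib import Path


def parse_cgroup_v2_path(proc_cgroup_text: str) -> str | None:
    """Return the cgroup v2 path from /proc/<pid>/cgroup contents, if present.

    A cgroup v2 entry has the form '0::/some/path'.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None


def parse_memory_max(raw: str) -> int | None:
    """Interpret memory.max: the literal 'max' means no cap, otherwise bytes."""
    value = raw.strip()
    return None if value == "max" else int(value)


def current_memory_limit(root: Path = Path("/sys/fs/cgroup")) -> int | None:
    """Best-effort read of this process's own cgroup memory cap, in bytes."""
    try:
        path = parse_cgroup_v2_path(Path("/proc/self/cgroup").read_text())
        if path is None:
            return None
        limit_file = root / path.lstrip("/") / "memory.max"
        return parse_memory_max(limit_file.read_text())
    except (OSError, ValueError):
        return None  # not Linux, not cgroup v2, or no memory controller here


if __name__ == "__main__":
    print("memory cap in bytes:", current_memory_limit())
```

Inside a container started with a memory flag such as docker run --memory=512m, you would expect this to report a byte count instead of None, assuming the host uses cgroup v2.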

Put those together and you have a container. A normal process, given a private view of the world through namespaces, and a hard budget through cgroups. Nothing booted. No second kernel. The process makes system calls straight to the same host kernel every other process uses. It just can't see past its own blinders.

Same app, two ways to run it

The VM stack carries a whole guest OS. The container stack doesn't.

Virtual machine

Your app: the thing you actually wanted to run
Guest OS + libraries: a full userland
Guest kernel: a second, separate kernel per VM
Hypervisor: simulates the hardware

Host kernel + hardware

Both stacks ultimately sit on the same physical machine.

Container

Your app: same app
App libraries: shipped in the image, no kernel
Namespaces + cgroups: kernel features that fence the process off

What the container does not have: no guest kernel, no hypervisor, no boot. The container process calls straight into the host kernel. That missing layer is the entire reason it is small and starts instantly.

The VM column has a guest kernel and a hypervisor the container column simply doesn't. Everything people find surprising about containers comes from that gap.

This is why the numbers are so lopsided. There's nothing to boot, so a container starts about as fast as launching any program. There's no guest OS on disk, so the image is just your app plus the libraries it needs. And because every container shares the one host kernel, you can pack hundreds onto a box that would choke on a dozen VMs.
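
For a feel of what "about as fast as launching any program" means, this small Python sketch times how long it takes to start and exit a trivial child process. It is only a baseline under stated assumptions: it measures an ordinary process launch (here, a second Python interpreter), not a container, and a real container runtime adds namespace, cgroup, and filesystem setup on top. The function name is made up for illustration.

```python
import subprocess
import sys
import time


def time_process_launch(runs: int = 5) -> float:
    """Return the fastest wall-clock time, in seconds, to start and exit a trivial child process."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Launch a child interpreter that does nothing and exits.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"fastest launch: {time_process_launch() * 1000:.1f} ms")
```

Nothing in that path boots a kernel, which is the point. Compare the number you get with how long any VM you have takes to reach a login prompt.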

Container start time: milliseconds (no OS to boot)
VM start time: seconds to minutes (full guest OS boot)
Image size, container vs VM: MBs vs GBs (no guest kernel)

Takeaway

A VM virtualizes a computer. A container virtualizes a process's point of view. One boots an OS, the other puts blinders on something the kernel was already going to run.

So what is a Docker image, then

If a container is a process, an image is the filesystem that process wakes up inside. When you write a Dockerfile, you're describing how to build that filesystem, one step at a time. Here's a small one:

Dockerfile

# Base layer: a minimal filesystem with Node.
FROM node:20-alpine
# Working directory: mostly metadata, basically free.
WORKDIR /app
# Layer: just the manifest.
COPY package.json .
# Layer: node_modules, the big one.
RUN npm install
# Layer: your source code.
COPY . .
# What PID 1 runs when the container starts.
CMD ["node", "server.js"]

Each instruction that changes the filesystem produces a layer. Layers are cached and stacked, which is why changing one source file doesn't force a fresh npm install.

The key idea is layers. Most of those instructions produce a read-only layer, a snapshot of the filesystem changes that step made. The base image is one layer. The npm install is another. Your code is another. Docker stacks them using a union filesystem, which mounts several read-only layers on top of each other so they appear as one combined filesystem.^[4] When the container runs, Docker adds a thin writable layer on top for anything the process changes at runtime.
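
The stacking behaviour can be modelled in a few lines of Python. This is a toy, not how Docker's storage drivers are implemented: each layer is a plain dict mapping a path to file contents, the layer contents are invented for illustration, and collections.ChainMap stands in for the union mount because it looks keys up in the first mapping that has them and sends all writes to the first mapping.

```python
from collections import ChainMap

# Hypothetical read-only image layers, oldest first: path -> file contents.
base_layer = {"/bin/node": "node binary", "/etc/os-release": "Alpine"}
deps_layer = {"/app/node_modules/express/index.js": "express source"}
code_layer = {"/app/server.js": "console.log('v1')"}


def start_container(*image_layers: dict) -> ChainMap:
    """Stack the image layers with a fresh, empty writable layer on top."""
    writable = {}
    # ChainMap searches left to right, so the newest layer must come first.
    return ChainMap(writable, *reversed(image_layers))


container_a = start_container(base_layer, deps_layer, code_layer)
container_b = start_container(base_layer, deps_layer, code_layer)

# A runtime write lands only in container_a's own writable layer.
container_a["/app/server.js"] = "console.log('patched')"

print(container_a["/app/server.js"])  # container_a sees its own change
print(container_b["/app/server.js"])  # container_b still sees the image file
print(code_layer["/app/server.js"])   # the image layer itself is untouched
```

The model leaves out deletions, which real union filesystems handle with whiteout markers, but it shows why two containers from one image can diverge at runtime without the image changing.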

Two things fall out of this. First, layers are cached and shared. If ten images use the same node:20-alpine base, that base is stored once and reused. Rebuild after changing one source file and Docker reuses every cached layer up to the COPY . . step, which is why a good Dockerfile copies package.json and installs deps before copying the rest of the code. Second, this is what people mean by “immutable.” The image layers never change. Every container from that image starts from the exact same filesystem.
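
Here is a simplified Python model of why instruction order matters for the cache. It assumes each step's cache key depends on the previous step's key, the instruction text, and the contents of any files the step copies in. Docker's real cache rules differ in detail, and layer_keys, STEPS, and the file contents are hypothetical names and data for this sketch.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per build step.

    steps: list of (instruction text, list of build-context paths it copies)
    files: dict mapping path -> bytes content
    """
    keys = []
    parent = ""
    for instruction, inputs in steps:
        h = hashlib.sha256()
        h.update(parent.encode())       # chain to every earlier step
        h.update(instruction.encode())  # the instruction itself
        for path in sorted(inputs):     # contents of the copied files
            h.update(path.encode())
            h.update(files[path])
        parent = h.hexdigest()
        keys.append(parent)
    return keys


STEPS = [
    ("FROM node:20-alpine", []),
    ("COPY package.json .", ["package.json"]),
    ("RUN npm install", []),
    ("COPY . .", ["package.json", "server.js"]),
]

before = layer_keys(STEPS, {"package.json": b'{"name":"app"}', "server.js": b"v1"})
after = layer_keys(STEPS, {"package.json": b'{"name":"app"}', "server.js": b"v2"})

# True means the step's key is unchanged, so its cached layer could be reused.
reused = [old == new for old, new in zip(before, after)]
print(reused)  # [True, True, True, False]
```

Editing server.js only changes the key of the final COPY . . step, so the expensive npm install layer is still a cache hit. Swap the order so that COPY . . comes before RUN npm install and every source edit would invalidate the install as well.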

Why this matters

The image carries your app's entire userland, the libraries, the files, the right versions of everything, but not a kernel. That's the sweet spot. You get a reproducible environment without paying for a whole operating system. It's also the answer to “it works on my machine,” which we'll get to.

The catch: everyone shares one kernel

Here's the tradeoff, and it's the part the cheerful Docker tutorials skip. Every container on a host shares that host's kernel. The isolation between containers is enforced by the kernel itself, through namespaces and cgroups. Which means the kernel is the wall. And a wall that's also the thing keeping score is a different kind of wall than a hypervisor.

Think about what an attacker has to do. To break out of a VM, they have to defeat the guest kernel and then the hypervisor, two separate, hard layers. To break out of a container, they have to find one bug in the host kernel that lets a process escape its namespaces. There's no second kernel behind it. A single kernel vulnerability, and the container is no longer fenced off. It's a process on your host with the blinders removed.^[5]

This isn't hypothetical and it isn't a reason to never use containers. Most workloads are fine, because most workloads aren't running hostile code. The risk gets real when you're running code you don't trust, multiple tenants' code, arbitrary user submissions, on the same kernel. That's where “they share a kernel” stops being trivia and starts being a threat model.

Heads up

Containers are an isolation boundary, not a security boundary you should bet hostile multi-tenancy on by default. The kernel is shared, and the kernel is the wall. If you're running untrusted code, plain containers are not enough on their own.

The industry's answer is interesting: put a real isolation boundary back underneath the container. Several projects do exactly this. gVisor, from Google, slips a user-space kernel between the container and the host, so container syscalls hit gVisor first instead of landing straight on the host kernel.^[6] Firecracker, from AWS, runs each container inside a stripped-down micro-VM that boots in around 125 milliseconds, which is what actually powers Lambda and Fargate.^[7] Kata Containers wraps each container in a lightweight VM while keeping the normal container workflow.^[8] The common thread is telling. When you need both container ergonomics and VM-grade isolation, you stop exposing the host kernel directly. You put a second kernel back, usually inside a small VM.

The problem containers actually solved

Step back from the mechanics, because the reason containers took over isn't isolation at all. It's “it works on my machine.” That sentence has wasted more engineering hours than almost anything else in software. Your app runs on your laptop with your Node version, your installed system libraries, your environment variables, your locale settings, and then it falls over on the server because the server is subtly different.

A container kills that problem by shipping the whole environment in the image. Not just your code, the entire userland it depends on. The exact Node version, the exact OpenSSL, the exact libc, the exact config. The image that ran in CI is the same image, byte for byte, that runs in production. “Works on my machine” becomes “the machine is in the image,” and the gap where environment drift used to hide just closes.
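
"Byte for byte" is something you can check mechanically, because images are identified by a SHA-256 content digest. The Python sketch below shows the idea on raw bytes only. The byte strings are stand-ins invented for illustration, not a real image format, and content_digest is a hypothetical helper rather than a Docker API.

```python
import hashlib


def content_digest(blob: bytes) -> str:
    """Return a content address in the 'sha256:<hex>' style used for image digests."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


# Stand-in bytes for "the image CI built" and "the image production pulled".
built_in_ci = b"node 20 + openssl + libc + app code"
pulled_in_prod = b"node 20 + openssl + libc + app code"
drifted = b"node 20 + openssl (patched) + libc + app code"

print(content_digest(built_in_ci) == content_digest(pulled_in_prod))  # True
print(content_digest(built_in_ci) == content_digest(drifted))         # False
```

If the digest that production runs matches the digest CI tested, the contents are the same. Any drift, however small, produces a different digest.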

That's the real product. The millisecond startup and the small images are nice, but the thing that changed how teams ship is that the unit you build is the unit you run. No more reproducing the server's exact state on your laptop. You ship the state.

Receipt

This is also why the “just give me a clean reproducible environment” use case exploded beyond deployment. CI runners, local dev databases, one-off tool installs, all of it moved to containers because pulling an image is faster and cleaner than installing software onto a machine you then have to keep clean.

When you still want a real VM

None of this means VMs are obsolete. They're the layer underneath, not the loser of a competition. There are jobs where a container is the wrong tool and you want a full VM:

You need a different OS or kernel. Containers share the host kernel, so a Linux host runs Linux containers, full stop. Want to run Windows, or a different kernel version, or kernel features the host doesn't have? That's a VM.
You need hard multi-tenant isolation. Running untrusted or mutually hostile workloads on shared hardware is the textbook VM job. The thick wall is the point. This is why the cloud itself is built on VMs and then runs containers inside them.
You need kernel modules or low-level hardware access. Custom kernel modules, specific drivers, anything that pokes at the kernel directly. A container can't bring its own kernel to load a module into.

And here's the detail that catches people. If you run Docker on a Mac or on Windows, there is no such thing as a native Linux container there, because Linux containers are a Linux kernel feature and those machines don't run a Linux kernel. So Docker Desktop quietly spins up a Linux VM in the background and runs all your containers inside it.^[9] The slick container experience you're using is sitting on top of a virtual machine the whole time. The two technologies aren't rivals. One is often running inside the other.

Side note

This is why Docker on a Mac can feel slower for disk-heavy work, like bind-mounting a big node_modules from your host. You're crossing the boundary between the macOS host and the Linux VM on every file access. On a native Linux host that boundary doesn't exist, which is why the same workload often flies on a Linux box.

How I'd put it

For shipping application code, containers are the default and it isn't close. You get a reproducible environment, instant startup, and density that VMs can't touch, all because you stopped booting an operating system you didn't need. The mental upgrade is to stop seeing a container as a lightweight VM and start seeing it as what it is. A host process with a private view of the world and a spending limit.

VMs didn't lose. They moved underneath. They're the isolation boundary your containers run inside, the thing the cloud uses to keep one customer's code away from another's, the answer when you need a different kernel or a real wall. When you genuinely need both worlds, the answer the industry landed on, gVisor, Firecracker, Kata, is to put a second kernel under the container, usually inside a tiny VM. Which tells you the relationship exactly. It was never containers versus VMs. It's containers on top of VMs, and knowing which layer is doing which job is most of understanding your own stack.

Primary sources

1. KVM: the Linux kernel-based virtual machine and hypervisor. linux-kvm.org
2. namespaces(7): Linux namespaces manual page. man7.org
3. cgroups(7): Linux control groups manual page. man7.org
4. Docker storage drivers: images, layers, and the union filesystem. docs.docker.com
5. Docker engine security: the shared-kernel attack surface. docs.docker.com
6. gVisor: a user-space kernel for sandboxing containers. gvisor.dev
7. Firecracker: lightweight micro-VMs behind Lambda and Fargate. firecracker-microvm.github.io
8. Kata Containers: container workflow, VM-backed isolation. katacontainers.io
9. Docker Desktop on Mac: containers run inside a Linux VM. docs.docker.com

Frequently asked questions

Is a Docker container just a lightweight virtual machine?
No. A container has no operating system and no kernel of its own. It is a normal process running on the host, the same kind of process as your text editor, except the kernel has been told to lie to it about what it can see. Namespaces give it a private view of the filesystem, process tree, and network. Cgroups cap what it can consume. Nothing boots.

Why do containers start so much faster than VMs?
Because there is nothing to boot. A VM has to bring up a full guest kernel, mount filesystems, start init, and launch services, which takes seconds to minutes. A container is just a process the kernel starts with restrictions applied, so it launches about as fast as any other program. Same reason images are megabytes instead of gigabytes: no guest OS on disk.

Are containers secure enough for untrusted code?
Not by default. Every container on a host shares that host kernel, and the kernel is the wall. One kernel bug that lets a process escape its namespaces and the container is just a process on your host with the blinders removed. Most workloads are fine because most workloads are not hostile. If you are running untrusted or multi-tenant code, use gVisor, Firecracker, or Kata Containers.

What is a Docker image made of?
An image is the filesystem the container process wakes up inside, built as a stack of read-only layers. Each Dockerfile instruction that changes the filesystem produces a layer, and Docker mounts them together with a union filesystem. Layers are cached and shared, which is why a good Dockerfile copies package.json and installs dependencies before copying the rest of the source code.

Why does Docker on a Mac run a Linux VM?
Because containers are a Linux kernel feature and macOS does not run a Linux kernel. There is no such thing as a native container on a Mac, so Docker Desktop quietly spins up a Linux VM in the background and runs everything inside it. That boundary is also why disk-heavy work, like bind-mounting a large node_modules, feels slow on a Mac and flies on a native Linux host.

When should I still use a VM instead of a container?
Three cases. When you need a different OS or kernel version, since a Linux host only runs Linux containers. When you need hard multi-tenant isolation for untrusted or mutually hostile workloads, which is why the cloud itself is built on VMs. And when you need kernel modules or low-level hardware access, because a container cannot bring its own kernel to load a module into.

Written by

Tech Talk News Editorial

Computer engineering background. Writes about software, AI, markets, and real estate, and the places where the three meet.
