Docker vs Virtual Machines: What a Container Actually Is

Everyone repeats that VMs virtualize hardware and containers virtualize the OS. True, but it hides the real thing. A container isn't a tiny VM. It's a normal process on the host with blinders on, fenced off by two Linux kernel features.

Tech Talk News Editorial, 10 min read

The sentence everyone repeats is “VMs virtualize the hardware, containers virtualize the operating system.” It's correct. It's also useless, because it leaves you picturing a container as a skinnier virtual machine. Same idea, less overhead. That mental model is wrong, and almost every confusing thing about Docker traces back to it.

A container is not a small computer running inside your computer. It has no operating system of its own. It has no kernel of its own. When you run a container, you are running an ordinary process on your host, the same kind of process as your text editor or your web server, except the kernel has been told to lie to it about what it can see. The container thinks it has the whole machine. It doesn't. It has blinders on.

Once that clicks, the rest falls out for free. Why containers start in milliseconds instead of seconds. Why a 30MB image is normal and a 30MB VM is a joke. Why a kernel bug is scarier in container-land than in VM-land. Why Docker on your Mac is quietly running a Linux VM behind your back. Let's actually go through it.

Summary

A VM boots a full guest OS, with its own kernel, on top of a hypervisor. A container is just a host process that the Linux kernel has fenced off using two features: namespaces (what it can see) and cgroups (what it can use). No guest kernel, no boot. That single difference explains the speed, the size, and the security tradeoff all at once.

What a virtual machine actually does

Start with the heavy option, because it's the honest baseline. A virtual machine is a real computer simulated in software. Underneath it sits a hypervisor, the thing that carves your physical machine into virtual ones. On a server that's usually KVM, on a laptop it might be something like VMware or the built-in macOS hypervisor framework.[1] The hypervisor hands each VM what looks like its own CPU, its own memory, its own disk, its own network card.

On top of that virtual hardware, the VM boots a complete operating system. A full guest OS, kernel and all. If you run an Ubuntu VM on a Windows host, there is a real Linux kernel running inside that VM, scheduling its own processes, managing its own memory, talking to virtual devices that the hypervisor pretends are real. The guest has no idea it's a guest.

That's why VMs are heavy. You're booting an entire OS. The disk image is measured in gigabytes because it contains a whole filesystem. Startup takes seconds to minutes because the kernel has to come up, mount things, start init, bring up services. Each VM also reserves its own slice of RAM for its own kernel and background processes, whether or not your app needs it.

The payoff for all that weight is isolation. Real, hard isolation. The guest kernel is a separate kernel. Code running inside the VM talks to that kernel, not to the host's. For a process inside the VM to reach the host, it has to break out of the guest OS and then defeat the hypervisor, which is a small, hardened, heavily audited layer. That is a genuinely difficult thing to do. This is why VMs have been the unit of multi-tenant isolation in the cloud for two decades.

Plain English

A VM is a fake computer all the way down. It boots its own operating system and runs its own kernel. Heavy and slow to start, but the wall between it and the host is thick, because there are two kernels and a hypervisor in between.

A container is a process wearing blinders

Now the light option. When you start a container, the Linux kernel does not boot anything. It starts a process, the same way it starts any process, and then it applies two kinds of restrictions to that process. That's the whole trick. Two features, both built into the Linux kernel, both there for years before Docker made them famous.

The first is namespaces. A namespace controls what a process can see.[2] There are several kinds, and each one virtualizes a different slice of the system:

  • The mount namespace gives the process its own view of the filesystem. It sees the container image as / and has no idea your real root filesystem exists.
  • The PID namespace gives it its own process tree. The first process in the container is PID 1, like it's the only thing running on the machine. It can't see any of the host's processes.
  • The network namespace gives it its own network stack, its own interfaces, its own IP, its own ports. That's how two containers can both bind port 80 without fighting.
  • User, UTS, and IPC namespaces do the same trick for user IDs, the hostname, and inter-process communication.
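
On a Linux host you can look at these namespaces directly. The following is a minimal sketch, assuming Linux with /proc mounted: each entry under /proc/<pid>/ns is a symlink whose target names the namespace kind and an ID, and two processes that show the same ID share that namespace. The helper name list_namespaces is made up for illustration.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_kind: link_target} for a process, e.g. {'pid': 'pid:[4026531836]'}.

    Returns an empty dict where /proc/<pid>/ns is missing or unreadable
    (non-Linux hosts, or a process you lack permission to inspect).
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return namespaces
    for name in entries:
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # skip entries we are not allowed to read
    return namespaces


if __name__ == "__main__":
    for kind, target in list_namespaces().items():
        print(f"{kind:20} {target}")
```

Run it once on the host and once inside a container. The IDs for mnt, pid, and net will typically differ, because the container process has been placed in its own namespaces. Nothing else about the process is special.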

The second feature is cgroups, short for control groups. Where namespaces decide what a process can see, cgroups decide what it can use.[3] This much CPU, this much memory, this much disk I/O, this many processes. Hit the memory cap and the kernel's out-of-memory killer kills a process in the group. This is what stops one container from eating the whole box and starving everything else.
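
A process can find out which cgroup it belongs to, and what its memory budget is, just by reading files. Here is a sketch, assuming a Linux host using cgroup v2 mounted at /sys/fs/cgroup (older cgroup v1 hosts lay the files out differently). All three helper names are hypothetical.

```python
from pathlib import Path


def parse_cgroup_v2_path(proc_cgroup_text):
    """Extract the cgroup path from the contents of /proc/<pid>/cgroup.

    Each line has the form 'hierarchy-ID:controller-list:path'.
    The cgroup v2 entry is the one that starts with '0::'.
    """
    for line in proc_cgroup_text.splitlines():
        hierarchy_id, controllers, path = line.split(":", 2)
        if hierarchy_id == "0" and controllers == "":
            return path
    return None


def parse_memory_max(memory_max_text):
    """Interpret a cgroup v2 memory.max value: 'max' means no limit, else bytes."""
    value = memory_max_text.strip()
    return None if value == "max" else int(value)


def current_memory_limit(cgroup_root="/sys/fs/cgroup"):
    """Best-effort lookup of this process's memory limit; None if unknown or unlimited."""
    try:
        cgroup_path = parse_cgroup_v2_path(Path("/proc/self/cgroup").read_text())
        if cgroup_path is None:
            return None
        limit_file = Path(cgroup_root) / cgroup_path.lstrip("/") / "memory.max"
        return parse_memory_max(limit_file.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("memory limit (bytes):", current_memory_limit())
```

In a container started with something like docker run --memory=256m on a cgroup v2 host, this would be expected to report 268435456. On an unconstrained host process it will usually report None.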

Put those together and you have a container. A normal process, given a private view of the world through namespaces, and a hard budget through cgroups. Nothing booted. No second kernel. The process makes system calls straight to the same host kernel every other process uses. It just can't see past its own blinders.

Same app, two ways to run it

The VM stack carries a whole guest OS. The container stack doesn't.

Virtual machine

  • Your app: the thing you actually wanted to run
  • Guest OS + libraries: a full userland
  • Guest kernel: a second, separate kernel per VM
  • Hypervisor: simulates the hardware

Container

  • Your app: same app
  • App libraries: shipped in the image, no kernel
  • Namespaces + cgroups: kernel features that fence the process off
  • What the container does not have: no guest kernel, no hypervisor, no boot. The container process calls straight into the host kernel. That missing layer is the entire reason it is small and starts instantly.

Host kernel + hardware: both stacks ultimately sit on the same physical machine.

The VM column has a guest kernel and a hypervisor the container column simply doesn't. Everything people find surprising about containers comes from that gap.

This is why the numbers are so lopsided. There's nothing to boot, so a container starts about as fast as launching any program. There's no guest OS on disk, so the image is just your app plus the libraries it needs. And because every container shares the one host kernel, you can pack hundreds onto a box that would choke on a dozen VMs.

  • Container start time: milliseconds (no OS to boot)
  • VM start time: seconds to minutes (full guest OS boot)
  • Image size, container vs VM: MBs vs GBs (no guest kernel)

Takeaway

A VM virtualizes a computer. A container virtualizes a process's point of view. One boots an OS, the other puts blinders on something the kernel was already going to run.

So what is a Docker image, then

If a container is a process, an image is the filesystem that process wakes up inside. When you write a Dockerfile, you're describing how to build that filesystem, one step at a time. Here's a small one:

Dockerfile
# base layer: a minimal filesystem with Node
FROM node:20-alpine
# metadata, basically free
WORKDIR /app
# layer: just the manifest
COPY package.json .
# layer: node_modules, the big one
RUN npm install
# layer: your source code
COPY . .
# what PID 1 runs when the container starts
CMD ["node", "server.js"]

Each instruction that changes the filesystem produces a layer. Layers are cached and stacked, which is why changing one source file doesn't force a fresh npm install.

The key idea is layers. Most of those instructions produce a read-only layer, a snapshot of the filesystem changes that step made. The base image is one layer. The npm install is another. Your code is another. Docker stacks them using a union filesystem, which mounts several read-only layers on top of each other so they appear as one combined filesystem.[4] When the container runs, Docker adds a thin writable layer on top for anything the process changes at runtime.
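
The lookup rule is easier to see in a toy model. The sketch below is plain Python, not how Docker implements it (on Linux the stacking is typically done by the kernel's overlay filesystem). The class UnionFS, the WHITEOUT marker, and the file paths are all made up for illustration.

```python
WHITEOUT = object()  # marker meaning "this path was deleted in this layer"


class UnionFS:
    """Toy model of a layered filesystem: read-only layers plus one writable layer."""

    def __init__(self, image_layers):
        # Ordered bottom (base image) to top. Shared by reference, never modified.
        self.image_layers = list(image_layers)
        self.writable = {}  # per-container layer for runtime changes

    def read(self, path):
        # Search from the top down; the first layer that mentions the path wins.
        for layer in [self.writable, *reversed(self.image_layers)]:
            if path in layer:
                if layer[path] is WHITEOUT:
                    raise FileNotFoundError(path)
                return layer[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        # Writes never touch image layers; they land in the writable layer.
        self.writable[path] = data

    def delete(self, path):
        # Hide the path without altering the read-only layers beneath.
        self.writable[path] = WHITEOUT


base = {"/bin/node": "node binary", "/etc/motd": "hello from base"}
deps = {"/app/node_modules/express/index.js": "express"}
code = {"/app/server.js": "v1", "/etc/motd": "hello from app"}

# Two containers from the same image share the same three layers.
container_a = UnionFS([base, deps, code])
container_b = UnionFS([base, deps, code])

container_a.write("/app/server.js", "patched at runtime")
container_a.delete("/bin/node")
```

After those two changes, container_a sees the patched server.js and no /bin/node, while container_b still reads "v1" and still finds the binary. The image layers themselves are untouched, which is the property the writable layer exists to protect.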

Two things fall out of this. First, layers are cached and shared. If ten images use the same node:20-alpine base, that base is stored once and reused. Rebuild after changing one source file and Docker reuses every cached layer up to the COPY . . step, which is why a good Dockerfile copies package.json and installs deps before copying the rest of the code. Second, this is what people mean by “immutable.” The image layers never change. Every container from that image starts from the exact same filesystem.
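
Why the ordering matters can be sketched with a simplified cache model. It assumes each step's cache key depends on the previous step's key, the instruction text, and the content of any files the step copies. Docker's real cache rules have more detail than this, and layer_keys and the sample files are hypothetical.

```python
import hashlib


def layer_keys(steps, files):
    """Return one cache key per build step.

    steps: list of (instruction, paths_copied) tuples, in Dockerfile order.
    files: {path: content} for the build context.
    A step's key covers its parent's key, its instruction text, and the
    content of any files it copies, so a change invalidates every later step.
    """
    keys = []
    parent = ""
    for instruction, copied_paths in steps:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(instruction.encode())
        for path in sorted(copied_paths):
            digest.update(path.encode())
            digest.update(files[path].encode())
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


steps = [
    ("FROM node:20-alpine", []),
    ("COPY package.json .", ["package.json"]),
    ("RUN npm install", []),
    ("COPY . .", ["package.json", "server.js"]),
]

before = {"package.json": '{"dependencies": {}}', "server.js": "console.log('v1')"}
after = {**before, "server.js": "console.log('v2')"}

old_keys = layer_keys(steps, before)
new_keys = layer_keys(steps, after)

# Steps whose key is unchanged can be served from cache.
reused = [step for (step, _), old, new in zip(steps, old_keys, new_keys) if old == new]
print(reused)
```

With only server.js edited, the first three steps keep their keys and only COPY . . is rebuilt. Edit package.json instead and everything from COPY package.json . onward gets a new key, which is the expensive npm install rerun the ordering is designed to avoid.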

Why this matters

The image carries your app's entire userland, the libraries, the files, the right versions of everything, but not a kernel. That's the sweet spot. You get a reproducible environment without paying for a whole operating system. It's also the answer to “it works on my machine,” which we'll get to.

The catch: everyone shares one kernel

Here's the tradeoff, and it's the part the cheerful Docker tutorials skip. Every container on a host shares that host's kernel. The isolation between containers is enforced by the kernel itself, through namespaces and cgroups. Which means the kernel is the wall. And a wall that's also the thing keeping score is a different kind of wall than a hypervisor.
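
You can see the shared kernel for yourself with a few lines of Python. This is a small sketch using only the standard library; kernel_release is a hypothetical helper name.

```python
import platform


def kernel_release():
    """Return the release string of the kernel this process is running on."""
    return platform.uname().release


if __name__ == "__main__":
    print(kernel_release())
```

On a Linux host, running this directly and then inside a container built from any base image should print the same release string, whatever distribution the image claims to be. There is only one kernel to ask.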

Think about what an attacker has to do. To break out of a VM, they have to defeat the guest kernel and then the hypervisor, two separate, hard layers. To break out of a container, they have to find one bug in the host kernel that lets a process escape its namespaces. There's no second kernel behind it. A single kernel vulnerability, and the container is no longer fenced off. It's a process on your host with the blinders removed.[5]

This isn't hypothetical and it isn't a reason to never use containers. Most workloads are fine, because most workloads aren't running hostile code. The risk gets real when you're running code you don't trust, multiple tenants' code, arbitrary user submissions, on the same kernel. That's where “they share a kernel” stops being trivia and starts being a threat model.

Heads up

Containers are an isolation boundary, not a security boundary you should bet hostile multi-tenancy on by default. The kernel is shared, and the kernel is the wall. If you're running untrusted code, plain containers are not enough on their own.

The industry's answer is interesting: put a real isolation boundary back underneath the container. Several projects do exactly this. gVisor, from Google, slips a user-space kernel between the container and the host, so container syscalls hit gVisor first instead of landing straight on the host kernel.[6] Firecracker, from AWS, runs each container inside a stripped-down micro-VM that boots in around 125 milliseconds, which is what actually powers Lambda and Fargate.[7] Kata Containers wraps each container in a lightweight VM while keeping the normal container workflow.[8] The common thread is telling. When you need both container ergonomics and VM-grade isolation, you stop sharing the kernel. You put a small VM back.

The problem containers actually solved

Step back from the mechanics, because the reason containers took over isn't isolation at all. It's “it works on my machine.” That sentence has wasted more engineering hours than almost anything else in software. Your app runs on your laptop with your Node version, your installed system libraries, your environment variables, your locale settings, and then it falls over on the server because the server is subtly different.

A container kills that problem by shipping the whole environment in the image. Not just your code, the entire userland it depends on. The exact Node version, the exact OpenSSL, the exact libc, the exact config. The image that ran in CI is the same image, byte for byte, that runs in production. “Works on my machine” becomes “the machine is in the image,” and the gap where environment drift used to hide just closes.

That's the real product. The millisecond startup and the small images are nice, but the thing that changed how teams ship is that the unit you build is the unit you run. No more reproducing the server's exact state on your laptop. You ship the state.

Receipt

This is also why the “just give me a clean reproducible environment” use case exploded beyond deployment. CI runners, local dev databases, one-off tool installs, all of it moved to containers because pulling an image is faster and cleaner than installing software onto a machine you then have to keep clean.

When you still want a real VM

None of this means VMs are obsolete. They're the layer underneath, not the loser of a competition. There are jobs where a container is the wrong tool and you want a full VM:

  • You need a different OS or kernel. Containers share the host kernel, so a Linux host runs Linux containers, full stop. Want to run Windows, or a different kernel version, or kernel features the host doesn't have? That's a VM.
  • You need hard multi-tenant isolation. Running untrusted or mutually hostile workloads on shared hardware is the textbook VM job. The thick wall is the point. This is why the cloud itself is built on VMs and then runs containers inside them.
  • You need kernel modules or low-level hardware access. Custom kernel modules, specific drivers, anything that pokes at the kernel directly. A container can't bring its own kernel to load a module into.

And here's the detail that catches people. If you run Docker on a Mac or on Windows, there is no such thing as a native Linux container there, because Linux containers are a Linux kernel feature and those machines don't boot a Linux kernel. So Docker Desktop quietly spins up a Linux VM in the background and runs all your containers inside it.[9] The slick container experience you're using is sitting on top of a virtual machine the whole time. The two technologies aren't rivals. One is often running inside the other.

Side note

This is why Docker on a Mac can feel slower for disk-heavy work, like bind-mounting a big node_modules from your host. You're crossing the boundary between the macOS host and the Linux VM on every file access. On a native Linux host that boundary doesn't exist, which is why the same workload often flies on a Linux box.

How I'd put it

For shipping application code, containers are the default and it isn't close. You get a reproducible environment, instant startup, and density that VMs can't touch, all because you stopped booting an operating system you didn't need. The mental upgrade is to stop seeing a container as a lightweight VM and start seeing it as what it is. A host process with a private view of the world and a spending limit.

VMs didn't lose. They moved underneath. They're the isolation boundary your containers run inside, the thing the cloud uses to keep one customer's code away from another's, the answer when you need a different kernel or a real wall. When you genuinely need both worlds, the answer the industry landed on, gVisor, Firecracker, Kata, is to run containers inside tiny VMs. Which tells you the relationship exactly. It was never containers versus VMs. It's containers on top of VMs, and knowing which layer is doing which job is most of understanding your own stack.

