Quick and Dirty Containers with systemd
Docker and other containerization technologies are making the rounds in the Linux community, but a lowly hero now lurks beneath nearly every major Linux distro: enter systemd-nspawn containers.
systemd has arrived to mixed reviews -- you either love it or you hate it -- but one thing is certain: it gets the job done. Regardless of how you feel about it, systemd is likely here to stay for a while, so you might as well get as much out of its features as you can.
systemd-nspawn containers are more akin to FreeBSD Jails than to Docker containers. They're basically just a fancy chroot with some handy built-in integrations with systemd. You can start, stop, enable, and disable the containers as if they were regular services.
Keep in mind, however, that by its own admission, systemd-nspawn is an experimental feature that hasn't been thoroughly tested or audited. There are no guarantees of security or stability; it's probably best to keep them out of production for the time being. That said, here at BlackieOps we've been using systemd-nspawn containers for Jenkins, Stash, and Jira for a while now without any issues.
Creating the container
We will be working on a fresh install of CentOS 7, but this process is possible on any system using systemd. The only difference will be the first package installation step. Obviously, on Debian you will not be using yum, and on Fedora your repo names and release versions will be different (and you'll be using dnf)…
The first step is to "install" a new root filesystem into a directory.
# yum -y --releasever=7 --nogpg --installroot=/var/lib/machines/cool-container \
--disablerepo='*' --enablerepo=base install systemd passwd yum centos-release
This will create a directory, /var/lib/machines/cool-container, and populate it with a new root filesystem and a couple of core packages.
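If you're curious, you can peek inside the new directory at this point; it should already look like a miniature root filesystem:
# ls /var/lib/machines/cool-container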
Second, we need to enter the container to set up some basic things, like the root password. We can use the systemd-nspawn command directly for this:
# systemd-nspawn -D /var/lib/machines/cool-container
This drops you into a shell in the container without actually "booting" anything inside of it (think of it like chroot-ing in from a recovery mode). From here, we can set the root password so we can log in later.
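There's nothing container-specific about this step; just run passwd as usual from that shell:
# passwd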
When you're done, just ^D
out as usual and you'll be
dropped back to your host machine.
Aside: kernel auditing and containers
If you ignore this section and continue trying to boot the container, you will likely get a warning before the container starts about the kernel auditing subsystem. There are supposedly odd bugs that can surface if auditing is enabled, so we're just going to disable it. If this worries you, feel free to inspect the issue further, but since this is not a production system it's probably fine.
We just need to add a flag to the kernel parameters in our bootloader. This will vary between distros, but for CentOS it's as easy as editing the /etc/sysconfig/grub file and changing the GRUB_CMDLINE_LINUX variable by appending audit=0 to the list of parameters.
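On a stock CentOS 7 install the resulting line might look something like the following -- your existing parameters will differ, and only the trailing audit=0 is the addition here:
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet audit=0"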
After editing the parameters, we'll need to regenerate our GRUB configuration (keep in mind the new parameter won't take effect until the host is rebooted):
# grub2-mkconfig -o /etc/grub2.cfg
Configuring the base system
We now have a skeleton of a container installed, but we still need to actually configure what's inside of it, and get it prepped to start automatically, or at least as a service from systemd.
Since we now have access to the root account, we can fully "boot" the container:
# systemd-nspawn -bD /var/lib/machines/cool-container
The -b flag is short for --boot and basically means systemd-nspawn will search for an init binary and execute it. You'll see the standard boot log fly by, and then be dropped at a standard PTY login prompt. Log in with the root credentials you set up previously, and now we can start installing things as if we were on a brand new machine.
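As a quick illustration (the package choice here is arbitrary), installing and enabling a service inside the container looks just like it would on a real host:
# yum install -y httpd
# systemctl enable httpd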
Once you have your container set up and everything is running and configured, you can exit by "shutting down" the container as if it were a physical machine: poweroff or halt (or whatever you usually use).
Managing the container
While the /var/lib/machines prefix may have seemed arbitrary at the beginning, it was in fact intentional -- containers in this directory will be auto-discovered by systemd, so we can enable and manage them as regular services.
To have your container start with everything else when your host boots:
# systemctl enable systemd-nspawn@cool-container
And as you can perhaps guess, we can start and stop our container just like any other service:
# systemctl start systemd-nspawn@cool-container
# systemctl stop systemd-nspawn@cool-container
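And checking on it works the same way:
# systemctl status systemd-nspawn@cool-container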
Accessing the container
Accessing a running container can be a bit tricky. One option is to install openssh in the container and have it run on a non-standard port (since containers share the host's network interfaces).
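If you go that route, the only container-specific bit is picking a port that won't collide with the host's own sshd; a minimal sketch of the relevant /etc/ssh/sshd_config line (the port number here is just an example):
Port 2222
Then enable and start sshd inside the container as you normally would, and connect to the host's address on that port.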
Alternatively, you can access the machine through machinectl.
Just running machinectl without arguments will list all running containers (and other VMs, etc.). Interestingly, the older version of machinectl on CentOS does not allow us to use the login argument (so you may want to install openssh)... If you're on Fedora (or a different, more up-to-date distro), we can use the machinectl login command:
# machinectl login cool-container
... which will drop us at that familiar PTY prompt.
Since we don't necessarily want to halt the container just to escape from this prompt, there is a panic button to disconnect: press ^] (Control-]) three times within a second (i.e., fast).
In conclusion, systemd-nspawn
is an interesting technology that
shows promise. Its ubiquity through the proliferation of systemd means
containers are quite portable, easy to set up, and well-integrated directly
into the OS's init system.
Would I use it in production? Probably not. It's a very green technology and its immaturity is worrisome enough that my sleep cycles would be lessened dramatically by its deployment. For production "containers", FreeBSD Jails still provide the best security and featureset.
For now, systemd-nspawn
is staying on my internal
infrastructure, running my Atlassian stack, Jenkins, etc.; and it is running
those internal services quite well. But until its features are more
solidified and someone has verified it is at least moderately secure, it
won't be finding its way to my production stack for a few years yet.