Using systemd in Linux (services, journal, locale, network, container, …)

systemd is a suite of basic building blocks for a Linux system. It provides a system and service manager that runs as PID 1 and starts the rest of the system.

It uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic. It supports SysV and LSB init scripts and works as a replacement for sysvinit.

Other parts include a logging daemon, utilities to control basic system configuration like the hostname, date, locale, maintain a list of logged-in users and running containers and virtual machines, system accounts, runtime directories and settings, and daemons to manage simple network configuration, network time synchronization, log forwarding, and name resolution.

systemd is under development; see changelog/news.

$ /usr/lib/systemd/systemd --version
systemd 218
$ man systemd.index

from systemd/systemd@wiki.

  1. System and service manager
  2. Journal (logging)
  3. Configuration files (hostname, locale, timedate, users, network, mounts, …)
  4. Filesystem hierarchy
  5. Containers

System and service manager

systemctl controls the systemd system and service manager, /usr/lib/systemd/systemd. When systemd runs as the first process on boot (as PID 1), it acts as the init system that brings up and maintains userspace services.

## installation (usually done by distros)
# boot runs init as first process (w/ PID 1) on boot
$ ln -s /usr/lib/systemd/systemd /sbin/init
# use grub2-mkconfig to generate init option
# (usually isn't needed if using initramfs generated by dracut with systemd)
$ cat /etc/default/grub
GRUB_CMDLINE_LINUX="init=/usr/lib/systemd/systemd"
# to diagnose boot problems via 'dmesg', boot with
systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M

## usage
systemctl [OPTIONS...] COMMAND [NAME...]
NAME is 'unit-name.?service?', 'unit-name.socket' or '/unit-name' for '.mount,.device'
template units use 'name@string.service' for 'name@.service' where '%i' is 'string'
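
The template naming rule can be tried out with plain shell parameter expansion. This is only a sketch: `split_instance` is a hypothetical helper (not a systemd tool), and it assumes the name carries a unit type suffix such as `.service`.

```shell
# Split an instance name such as 'getty@tty1.service' into the template
# unit it comes from and the instance string that '%i' expands to.
# split_instance is a hypothetical helper, not part of systemd.
split_instance() {
  unit=$1
  prefix=${unit%%@*}     # text before the first '@'  -> getty
  rest=${unit#*@}        # text after the first '@'   -> tty1.service
  instance=${rest%.*}    # drop the unit type suffix  -> tty1
  suffix=${rest##*.}     # unit type                  -> service
  printf '%s@.%s %s\n' "$prefix" "$suffix" "$instance"
}

split_instance getty@tty1.service
```

The first field is the template unit file systemd loads, the second is the value substituted for '%i'.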

# list unit files installed from '{/etc,/run,/usr/lib}/systemd/system/*'
$ systemctl list-unit-files
# list units status, LOAD (properly loaded), ACTIVE/SUB (high-level/low-level activation state)
$ systemctl list-units
# reload systemd, scanning for new or changed units
$ systemctl daemon-reload
# list dependencies
$ systemctl list-dependencies
# show unit file
$ systemctl cat
# list services that failed to activate/start
$ systemctl --state=failed

# sysvinit/service equivalents
$ systemctl start,stop,status,show,restart,reload
# reload if supported, restart otherwise
$ systemctl reload-or-restart
# restart if running, nothing otherwise; same as condrestart
$ systemctl try-restart
# reload if supported, try-restart otherwise; same as force-reload
$ systemctl reload-or-try-restart
# send a signal to one or more processes of the unit
$ systemctl kill --signal=SIGTERM unit-name

# control access to system resources; '--runtime' for non-persistent between reboots
$ systemctl set-property unit-name CPUShares=512 MemoryLimit=1G

# remote control; '-H user@host'
$ systemctl -H user@host COMMAND

# SysVinit/chkconfig equivalents
$ systemctl enable,disable,is-enabled
# prohibits all kinds of activation of the unit, including enablement and manual activation
$ systemctl mask,unmask
  • systemd.unit a unit configuration file encodes information about a service, a socket, a device, a mount point, an automount point, a swap file or partition, a start-up target, a watched file system path, a timer controlled and supervised by systemd(1), a temporary system state snapshot, a resource management slice or a group of externally created processes.
$ ls {/etc,/run,/usr/lib}/systemd/system/*
# find overridden configuration files
$ systemd-delta systemd/system
# prefer drop-in directories over '.include': put '.conf' files in '<unit-name>.<type>.d/'; they are parsed after the unit file itself
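
A drop-in can be created as sketched below. `write_dropin` is a hypothetical helper, and a temporary directory stands in for '/etc/systemd/system' so the sketch can be run without touching the system; on a real host you would follow it with 'systemctl daemon-reload'.

```shell
# write_dropin ROOT UNIT NAME: store the drop-in body read from stdin as
# ROOT/UNIT.d/NAME.conf and print the resulting path.
# Hypothetical helper for illustration only.
write_dropin() {
  root=$1 unit=$2 name=$3
  dir="$root/$unit.d"
  mkdir -p "$dir"
  cat > "$dir/$name.conf"
  printf '%s\n' "$dir/$name.conf"
}

demo_root=$(mktemp -d)   # stand-in for /etc/systemd/system
write_dropin "$demo_root" httpd.service restart <<'EOF'
[Service]
Restart=on-failure
RestartSec=5
EOF
```

Only the settings listed in the drop-in are overridden; everything else still comes from the original unit file.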

$ man systemd.directives
[Unit]
'Before=, After=' Ordering dependencies between units
'Requires=' Units to also activate. Also deactivates when others deactivate (or fail to activate)
'Wants=' Same as 'Requires=' but doesn't deactivate on failure
'Conflicts=' Configures negative requirement dependencies. Independent of 'After=,Before='
'OnFailure=' Units to activate on 'failed'
'ConditionFirstBoot=yes' Used to populate '/etc' on first boot after factory reset
'ConditionPathExists=/path' File existence condition is checked before a unit is started
'AssertXXX=' Same as 'ConditionXXX' but sets unit state to 'failed'

[Install] called exclusively on enable/disable
'RequiredBy=,WantedBy=' On enable adds 'Requires=,Wants=' to the listed units by creating symlinks in their '{.requires,.wants}/' directories
'Also=' Additional units to install/uninstall
[Service] used by .service units
'Type=simple' Expects 'ExecStart=' is the main process and doesn't exit
'Type=forking' Expects process in 'ExecStart=' forks
'Type=oneshot' Same as 'simple' but expects 'ExecStart=' to exit

'ExecStart=cmd' Commands and arguments executed when this service is started.
Multiple 'ExecStart=' are allowed but only on 'oneshot'
Use '${var}' for environment variables
See '%i' for instance name, see 'man systemd.unit' for all specifiers
Prefix with '-' to ignore exit code (non-zero is error)
'ExecStartPre=, ExecStartPost=' Extra commands executed before/after 'ExecStart='
'ExecReload=' Command executed on reload. Supports multiple 'ExecReload='
Reload command should wait for completion, so '/bin/kill -HUP $MAINPID' isn't recommended
'ExecStop=' Command executed on stop. If unset, 'SIGTERM' is sent to the control group, followed by 'SIGKILL' after 'TimeoutStopSec=', see 'systemd.kill'
'ExecStopPost=' Command executed after stop

'Restart=no,always' Never,Always restart on process exit, kill/signal or timeout
'Restart=on-success' Restart on clean exit code or clean signal, see 'SuccessExitStatus='
'Restart=on-failure' Restart on unclean exit code, unclean signals or timeout
'Restart=on-abnormal' Restart on unclean signal or timeout
'Restart=on-abort' Restart on unclean signal only
'RestartSec=,TimeoutStartSec=,TimeoutStopSec=' Time to wait before restart,start,stop
'SuccessExitStatus=' Extra exit codes/signals treated as success, in addition to the defaults '0 SIGHUP SIGINT SIGTERM SIGPIPE'
'StartLimitInterval=, StartLimitBurst=' Start rate limiting, defaults to 5 times within 10 secs
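
The 'Restart=' table above can be written down as a small decision function. This is a sketch of the rules as listed here, not systemd's implementation; `should_restart` and the four cause names are made up for the illustration, and watchdog timeouts are left out.

```shell
# should_restart POLICY CAUSE: exit status 0 if the given Restart= policy
# restarts the service for that cause, 1 otherwise.
# CAUSE is one of: clean, unclean-exit, unclean-signal, timeout
# Hypothetical helper mirroring the table above.
should_restart() {
  case "$1:$2" in
    always:*) return 0 ;;
    on-success:clean) return 0 ;;
    on-failure:unclean-exit|on-failure:unclean-signal|on-failure:timeout) return 0 ;;
    on-abnormal:unclean-signal|on-abnormal:timeout) return 0 ;;
    on-abort:unclean-signal) return 0 ;;
    *) return 1 ;;
  esac
}

# Print every policy/cause combination that leads to a restart.
for policy in no always on-success on-failure on-abnormal on-abort; do
  for cause in clean unclean-exit unclean-signal timeout; do
    if should_restart "$policy" "$cause"; then
      echo "Restart=$policy restarts on $cause"
    fi
  done
done
```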

$ systemctl cat sshd
# /usr/lib64/systemd/system/sshd.service
[Unit]
Description=OpenSSH server daemon
After=syslog.target network.target auditd.service
[Service]
ExecStartPre=/usr/bin/ssh-keygen -A
ExecStart=/usr/sbin/sshd -D -e
ExecReload=/bin/kill -HUP $MAINPID
[Install]
WantedBy=multi-user.target

from systemd for Administrators, Part III and SystemdForUpstartUsers@ubuntu

  • systemd.target/systemd.special target units do not offer any additional functionality on top of the generic functionality provided by units. They exist merely to group units via dependencies (useful as boot targets), and to establish standardized names for synchronization points used in dependencies between units (e.g.: After=network.target and WantedBy=multi-user.target).
# change runlevels/targets; halt/poweroff=0, rescue=1, multi-user=2,3,4, graphical=5, reboot=6, emergency
# ('multi-user' and 'graphical' are targets, not commands; reach them with 'isolate' or 'default')
$ systemctl halt,poweroff,rescue,default,reboot,emergency
# starts given unit and stops all others, used in .target units, similar to changing runlevel
$ systemctl isolate .target
# get/set default target
$ systemctl get-default,set-default .target
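
The runlevel numbers map onto targets as sketched below. `runlevel_target` is a hypothetical helper that only encodes the mapping; it does not talk to systemd.

```shell
# runlevel_target N: print the target that corresponds to SysV runlevel N.
# Hypothetical helper for illustration only.
runlevel_target() {
  case "$1" in
    0)     echo poweroff.target ;;
    1)     echo rescue.target ;;
    2|3|4) echo multi-user.target ;;
    5)     echo graphical.target ;;
    6)     echo reboot.target ;;
    *)     echo "unknown runlevel: $1" >&2; return 1 ;;
  esac
}

runlevel_target 3
```

On a running system the result could be passed on, e.g. 'systemctl isolate "$(runlevel_target 3)"'.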

# suspend to RAM; hibernate to disk swap; hybrid-sleep does both, so the system resumes from RAM, or from disk if the battery ran out meanwhile
$ systemctl suspend,hibernate,hybrid-sleep
[Service,Socket,Mount,Swap]
'WorkingDirectory=,RootDirectory=' Working and root directory for executed processes
'User=,Group=' User,group that the processes are executed
'Environment=,EnvironmentFile=' Environment variables, "VAR1=word1 word2" VAR2=word3, see 'systemctl show-environment'

'StandardInput=' null(default), tty from 'TTYPath=' or socket
'StandardOutput=,StandardError=' inherit, null, tty, journal, syslog, kmsg, journal+console, syslog+console, kmsg+console or socket
'TTYPath=' Terminal device, defaults to '/dev/console'
'SyslogIdentifier=' Prefix, defaults to process name
'SyslogFacility=' Syslog facility, see syslog(3), defaults to 'daemon'

'Nice=,CPUAffinity=' Nice level (scheduling priority) and CPU affinity
'LimitXXX=' Sets soft and hard limits, see setrlimit(2)
'CapabilityBoundingSet=' Capabilities kept in the bounding set, see capabilities(7), e.g.: CAP_SYS_PTRACE

'PrivateNetwork=' Process can access only loopback devices
'PrivateTmp=' '/tmp,/var/tmp' are private and isolated from the host system's
'ProtectSystem=' If true mounts '/usr' in ro, if 'full' also mounts '/etc' ro
'ReadWriteDirectories=,ReadOnlyDirectories=,InaccessibleDirectories=' Limit access to the file system hierarchy

from systemd for Administrators, Part XII

[Service,Socket,Mount,Swap,Slice,Scope]
'CPUShares=weight,StartupCPUShares=weight' Assign CPU time share weight
'CPUQuota=%' Assign CPU quota, see scheduler/sched-design-CFS.txt
'MemoryLimit=bytes' Limits max memory usage 'K,M,G,T', see cgroups/memory.txt
'BlockIOWeight=weight,StartupBlockIOWeight=weight' Default overall block IO weight (10-1000), see cgroups/blkio-controller.txt
'BlockIODeviceWeight=device weight' Per-device overall block IO weight
'BlockIOReadBandwidth=device bytes/sec, BlockIOWriteBandwidth=device bytes/sec' Per-device overall block IO bandwidth limit
'DeviceAllow=dev r|w|m' Control access to specific device nodes, see cgroups/devices.txt
'Slice=' Name of the slice unit to place the unit in.

$ systemctl set-property httpd.service CPUShares=500 MemoryLimit=500M
# or using slices
$ cat /etc/systemd/system/limits.slice
[Unit]
Description=Limited resources Slice
DefaultDependencies=no
Before=slices.target
[Slice]
CPUShares=512
MemoryLimit=2G
$ cat /etc/systemd/system/httpd.service.d/limits.conf
[Service]
Slice=limits.slice

from systemd for Administrators, Part XVIII

  • systemd.socket bind service activation to incoming socket connection, for socket-based activation. A service capable of socket activation must be able to receive its preinitialized sockets from systemd, instead of creating them internally. For most services this requires (minimal) patching.
#include <stdio.h>
#include <stdlib.h>
#include "sd-daemon.h"
...
int fd, n;
n = sd_listen_fds(0); /* returns how many file descriptors are passed */
if (n > 1) {
  fprintf(stderr, "Too many file descriptors received.\n");
  exit(1);
} else if (n == 1) {
  fd = SD_LISTEN_FDS_START + 0; /* first passed descriptor */
} else {
  /* non-socket activated env, continue as before */
}

from Socket Activation
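
The same check can be sketched without the C library, since systemd hands the sockets over through the 'LISTEN_PID' and 'LISTEN_FDS' environment variables, with descriptors starting at 3. The `listen_fds` function below is a hypothetical shell stand-in for 'sd_listen_fds()', not an existing tool.

```shell
# First passed descriptor, the shell-side counterpart of SD_LISTEN_FDS_START.
LISTEN_FDS_START=3

# Print how many descriptors were passed to this shell, or 0 when it was
# not socket-activated. LISTEN_PID must name this process ($$).
listen_fds() {
  if [ "${LISTEN_PID:-}" = "$$" ] && [ -n "${LISTEN_FDS:-}" ]; then
    echo "$LISTEN_FDS"
  else
    echo 0
  fi
}

n=$(listen_fds)
if [ "$n" -gt 1 ]; then
  echo "Too many file descriptors received." >&2
  exit 1
elif [ "$n" -eq 1 ]; then
  echo "socket-activated, listening socket is fd $LISTEN_FDS_START"
else
  echo "not socket-activated, continue as before"
fi
```

Comparing 'LISTEN_PID' with the own PID guards against variables inherited from a parent that was socket-activated itself.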

# for each .socket file, a matching .service file must exist
[Socket] # used by .socket
'ListenStream=,ListenDatagram=,ListenSequentialPacket=' Address to listen on '?ip:?port'
'Service=' Overrides service unit to activate on incoming traffic, defaults to .service with same name
'ExecStartPre=, ExecStartPost=,ExecStopPre=, ExecStopPost=' Commands executed around start and stop of listening

$ systemctl cat sshd.socket
# /usr/lib64/systemd/system/sshd.socket
[Unit]
Description=OpenSSH Server Socket
Conflicts=sshd.service
[Socket]
ListenStream=22
Accept=yes
[Install]
WantedBy=sockets.target

from DaemonSocketActivation and systemd-crontab-generator/systemd-cron

# for each .timer file, a matching .service file must exist
[Timer] # used by .timer
'OnActiveSec=,OnBootSec=,OnStartupSec=,OnUnitActiveSec=,OnUnitInactiveSec=' Monotonic timers relative to different starting points
'OnCalendar=' Realtime/wallclock timers e.g.: 'Thu,Fri 2012-*-1,5 11:12:13'
'AccuracySec=' Accuracy the timer shall elapse with, used to distribute wake-up
'Unit=' Overrides unit activated when timer elapses
'Persistent=' If true, the service is triggered immediately on activation if it would have run at least once while the timer was inactive (only with 'OnCalendar=')

$ cat /etc/systemd/system/foo.timer
[Unit]
Description=Run foo weekly (realtime/wallclock)
[Timer]
OnCalendar=weekly
# catch up immediately if the last start time was missed (comments must be on their own line)
Persistent=true
[Install]
WantedBy=timers.target

$ cat /etc/systemd/system/foo.timer
[Unit]
Description=Run foo weekly and 15mins after boot
[Timer]
OnBootSec=15min
OnUnitActiveSec=1w
[Install]
WantedBy=timers.target

$ systemctl list-timers

from systemd/timers@arch
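
The timer examples need a matching 'foo.service'. A minimal sketch of one follows; '/usr/local/bin/foo' is a hypothetical command, and a temporary directory stands in for '/etc/systemd/system' so nothing on the system is changed.

```shell
# Write the service that foo.timer activates.
# /usr/local/bin/foo is a hypothetical command used for illustration.
unit_dir=$(mktemp -d)   # stand-in for /etc/systemd/system
cat > "$unit_dir/foo.service" <<'EOF'
[Unit]
Description=Run foo once (activated by foo.timer)

[Service]
Type=oneshot
ExecStart=/usr/local/bin/foo
EOF
cat "$unit_dir/foo.service"
```

The service needs no '[Install]' section because the timer, not a target, pulls it in; only 'foo.timer' gets enabled.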

  • systemd.path bind service activation to file system changes, uses inotify(7).
# for each .path file, a matching .service file must exist
[Path] # used by .path
'PathExists=,PathExistsGlob=,PathChanged=,PathModified=,DirectoryNotEmpty=' Paths to monitor for certain changes, or existence
'Unit=' Overrides unit activated when any of the configured paths changes
'MakeDirectory=,DirectoryMode=' Create directories to watch before watching

$ cat /etc/systemd/system/foo.path
[Path]
PathExistsGlob=/var/crash/*.crash
Unit=foo.service
  • systemd can also manage services under the user’s control with a per-user systemd instance. On the first login of a user, systemd automatically launches a systemd --user instance, responsible for managing user services. User units should be placed in {~/.config,/etc,/usr/lib}/systemd/user/.
## installation (usually done by distros)
$ systemctl --user status

## example
$ cat $HOME/.config/systemd/user/mpd.service
[Unit]
Description=Music Player Daemon
[Service]
ExecStart=/usr/bin/mpd --no-daemon
[Install]
WantedBy=default.target

from systemd/user@arch

  • systemd-cgls/systemd-cgtop show control group contents (processes) and their resource usage. Control groups are a way to hierarchically group and label processes, and a way to apply resource limits to these groups.
# show which service owns which processes, same as 'systemd-cgls'
$ ps xawf -eo pid,user,cgroup,args

$ systemd-cgtop

from systemd for Administrators, Part II

# list the started unit files, sorted by time each of them took to start up
$ systemd-analyze blame
# show which units are in the critical points in the startup chain
$ systemd-analyze critical-chain

# same as bootchart
$ systemd-analyze plot > plot.svg
# more detailed version of 'systemd-analyze plot', add this to the kernel line
initcall_debug printk.time=y init=/usr/lib/systemd/systemd-bootchart

from improve boot performance@arch and systemd for Administrators, Part VII

Journal (logging)

systemd has its own logging system called the journal; therefore, running a syslog daemon is no longer required. It captures Syslog messages, Kernel log messages, initial RAM disk and early boot messages as well as messages written to STDOUT/STDERR of all services, indexes them and makes this available to the user. It can be used in parallel, or in place of a traditional syslog daemon, such as rsyslog or syslog-ng.

journalctl is used to query the contents of the systemd journal written by systemd-journald.service (running /usr/lib/systemd/systemd-journald). systemd-journal-gatewayd.service (running /usr/lib/systemd/systemd-journal-gatewayd) is an HTTP server for journal events.

## configuration
# 'systemd-journald.service' is configured in '/etc/systemd/journald.conf'
$ man journald.conf
'Storage=volatile' Stored in memory only, below '/run/log/journal'
'Storage=persistent' Stored on disk in '/var/log/journal', with fallback to memory
'Storage=auto' Same as 'persistent' but '/var/log/journal' isn't created if missing
'Storage=none' Turns off all storage
'Compress=' Enable compression before written to disk
'RateLimitInterval=, RateLimitBurst=' Defaults to 1000 messages in 30s
'SystemMaxUse=%, SystemKeepFree=%, SystemMaxFileSize=bytes' Enforce size limits on the journal files stored
'RuntimeMaxUse=, RuntimeKeepFree=, RuntimeMaxFileSize=' Same as above but for volatile/runtime storage
'MaxFileSec=,MaxRetentionSec=' Max time to store entries before rotating/deleting, time-based
'SyncIntervalSec=' Timeout before sync to disk, defaults to 5mins; CRIT, ALERT, EMERG are always synced immediately

'ForwardToSyslog=, ForwardToKMsg=, ForwardToConsole=, ForwardToWall=' Forward messages to a traditional syslog daemon, the kernel log buffer (kmsg), the system console or as wall messages
'MaxLevelStore=,MaxLevelSyslog=,MaxLevelKMsg=,MaxLevelConsole=,MaxLevelWall=' Max log level of stored/forwarded messages
'TTYPath=' Console tty to use if ForwardToConsole=yes, defaults to '/dev/console'

## usage
journalctl [OPTIONS...] [MATCHES...]
where MATCHES is FIELD=VALUE, see man systemd.journal-fields

# show the journal, starting with the oldest
$ journalctl
# to see all logs add user to 'adm' group
$ usermod -a -G adm
# journal is stored binary, but content of messages isn't modified
$ strings /var/log/journal/*/system.journal | grep -i message

# '-f,--follow' live view, same as 'tail -f'
$ journalctl -f
# '-b,--boot id' show only messages from boot id; an empty id means the current boot
$ journalctl -b
# '-p,--priority' filter by priority 0/emerg...7/debug
$ journalctl -b -p err
# '--since,--until' filter by date
$ journalctl --since=yesterday
$ journalctl --since=2012-10-15 --until="2012-10-16 23:59:59"
# '-u,--unit' filter by unit
$ journalctl -u httpd --since=00:00 --until=9:30
# filter by field match
$ journalctl /usr/sbin/vpnc /usr/sbin/dhclient
# '-o,--output' output format 'short,verbose,export,json,cat'
# '-n, --lines=' show last n lines only
$ journalctl -o verbose -n
# '-F,--field' show all possible field values
$ journalctl -F _SYSTEMD_UNIT

# retrieve events from this boot from local journal in Journal Export Format
$ systemctl enable systemd-journal-gatewayd.socket
$ curl --silent -H'Accept: application/vnd.fdo.journal' 'http://localhost:19531/entries?boot'

from systemd for Administrators, Part XVII
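
The priority names accepted by '-p' correspond to the syslog numbers 0 to 7. A sketch of that mapping, with `priority_number` as a hypothetical helper:

```shell
# priority_number NAME: print the numeric syslog level for a priority name.
# Hypothetical helper for illustration only.
priority_number() {
  case "$1" in
    emerg)   echo 0 ;;
    alert)   echo 1 ;;
    crit)    echo 2 ;;
    err)     echo 3 ;;
    warning) echo 4 ;;
    notice)  echo 5 ;;
    info)    echo 6 ;;
    debug)   echo 7 ;;
    *)       echo "unknown priority: $1" >&2; return 1 ;;
  esac
}

priority_number err
```

Since lower numbers are more severe, 'journalctl -p err' shows levels 0 through 3.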

Systemd journal (216) can be configured to forward events to a remote server. Entries are forwarded including full metadata, and are stored in normal journal files, identically to locally generated logs.

Two new daemons are added as part of the systemd package: 1) systemd-journal-remote accepts messages in the Journal Export Format and stores them locally, and 2) systemd-journal-upload is the journal client which exports journal messages and uploads them over the network.

# copy local journal events to a different journal directory
$ journalctl -o export | systemd-journal-remote -o /tmp/dir -

# retrieve events from a remote 'systemd-journal-gatewayd' instance and store in '/var/log/journal/some.host/remote-some.host.journal'
$ systemd-journal-remote --url http://some.host:19531/

Use coredumpctl to retrieve coredumps from the journal. systemd-coredump is used to store core dumps (generated when user program receives fatal signal) in an external file /var/lib/systemd/coredump or journal.

## configuration (usually done by distros)
# kernel configured to call 'systemd-coredump'
$ cat /usr/lib/sysctl.d/50-coredump.conf
kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %p %u %g %s %t %e
$ cat /proc/sys/kernel/core_pattern
|/usr/lib/systemd/systemd-coredump %p %u %g %s %t %e

# create file {/etc,/run,/usr/lib}/coredump.d/.conf, defaults to '/etc/systemd/coredump.conf'
$ cat /etc/systemd/coredump.conf
'Storage=' Where to store 'none,external,journal,both'
'ExternalSizeMax=,JournalSizeMax=,ProcessSizeMax=' Max size of core to save
'MaxUse=%, KeepFree=%' Limits disk usage of external storage

## usage
coredumpctl [OPTIONS...] {COMMAND} [PID|COMM|EXE|MATCH...]

# list cores in journal
$ coredumpctl list
# filtered by PID or program name
$ coredumpctl list foo
# show core info
$ coredumpctl info 6654
# extract/dump core
$ coredumpctl -o bar.coredump dump /usr/bin/bar

from coredump@arch

Journal collects all data logged via syslog, kernel messages logged via printk, as well as stdout/stderr of any service. It also has a native API for logging, sd_journal_print, with bindings for other languages (erlang, go, python, ruby, …)

## using syslog (it basically writes to /dev/log)
$ cat test-journal-submit.c
#include <syslog.h>
syslog(LOG_NOTICE, "Hello World!");
$ journalctl -o json-pretty
    "PRIORITY" : "5",
    "_PID" : "3068",
    "MESSAGE" : "Hello World!",
    "_SOURCE_REALTIME_TIMESTAMP" : "1351126905014938"

## using printf("<PRIORITY>MSG")
#include <stdio.h>
#define PREFIX_NOTICE "<5>"
printf(PREFIX_NOTICE "Hello World\n");

## native sd_journal_print/sd_journal_send
#include <systemd/sd-journal.h>
sd_journal_print(LOG_NOTICE, "Hello World");
sd_journal_send("MESSAGE=Hello World!",
  "MESSAGE_ID=52fb62f99e2c49d89cfbf9d6de5e3555", "PRIORITY=5", "XPTO=XXX",
  NULL);

from systemd for Developers III
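
The '<PRIORITY>' prefix convention also works from a shell script whose stdout goes to the journal. A sketch follows, with `log_line` as a hypothetical helper; it assumes the service keeps the default handling of level prefixes.

```shell
# log_line LEVEL MESSAGE...: print one line with a '<LEVEL>' prefix,
# which the journal interprets as the syslog priority of that line.
# Hypothetical helper for illustration only.
log_line() {
  level=$1
  shift
  printf '<%d>%s\n' "$level" "$*"
}

log_line 5 "Hello World"        # notice
log_line 3 "something failed"   # err
```

Run outside a service, the prefix is simply printed as part of the line.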

Configuration Files

To unify configuration across distributions, systemd introduces new configuration files as the primary source of configuration, with per-distro configuration only as a fallback. See systemd for Administrators, Part VIII

  • binfmt.d registration of additional binary formats for systems like Java, Mono and WINE. At boot, systemd-binfmt.service reads configuration files from the above directories to register in the kernel additional binary formats for executables.
# see binfmt_misc.txt
:name:type:offset:magic:mask:interpreter:flags

# start WINE on Windows executables
$ cat /etc/binfmt.d/wine.conf
:DOSWin:M::MZ::/usr/bin/wine:
  • hostnamectl is used to query and change the system hostname. It's the client for systemd-hostnamed.service. It distinguishes three different hostnames: the high-level “pretty” hostname (stored in /etc/machine-info) which might include all kinds of special characters (e.g. “Lennart’s Laptop”), the static hostname (stored in /etc/hostname) which is used to initialize the kernel hostname at boot (e.g. “lennarts-laptop”), and the transient hostname which is a default received from network configuration (not used if a valid static hostname is defined).
# show current settings
$ hostnamectl ?status?

# set hostname
$ hostnamectl set-hostname  ?--static, --transient, --pretty?
$ cat /etc/{hostname,machine-info}
  • localectl is used to query and change the system locale and keyboard layout settings. It's a client for systemd-localed.service, which uses /etc/locale.conf, and it also covers systemd-vconsole-setup.service, an early service that uses /etc/vconsole.conf to configure the virtual console (i.e. keyboard mapping and console font).
# show current settings
$ localectl ?status? ?-h,--host=remote-host?
$ cat /etc/vconsole.conf
KEYMAP=de-latin1
FONT=latarcyrheb-sun16
$ cat /etc/locale.conf
LANG=de_DE.UTF-8
LC_MESSAGES=en_US.UTF-8

# change the system locale
$ localectl set-locale LANG=
# change the virtual console keymap:
$ localectl set-keymap
# set the X11 layout
$ localectl set-x11-keymap
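  • timedatectl is used to query and change the system clock and its settings. It's the client for systemd-timedated.service.
# show current settings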
$ timedatectl ?status? ?-h,--host=remote-host?

# set system clock
$ timedatectl set-time "2012-10-30 18:17:16"
# set system timezone
$ timedatectl set-timezone "Europe/Lisbon"

# set RTC (real-time, battery-powered clock) to universal time and use to adjust system clock (--adjust-system-clock)
$ timedatectl set-local-rtc 0 ?--adjust-system-clock?
  • systemd-timesyncd system service is used to synchronize the local system clock with a remote NTP server.
# start systemd-timesyncd
$ timedatectl set-ntp true
# NTP servers from systemd-networkd's '.network' files are used in addition to those configured in
$ cat /etc/systemd/timesyncd.conf
[Time]
NTP=0.arch.pool.ntp.org 1.arch.pool.ntp.org
FallbackNTP=0.pool.ntp.org 1.pool.ntp.org

from systemd-timesyncd@arch

  • systemd-tmpfiles creates, deletes, and cleans up volatile and temporary files and directories, based on the configuration file format and location specified in tmpfiles.d.
## configuration
# systemd-tmpfiles manages paths (usually under /tmp,/run) listed in '{/etc,/run,/usr/lib}/tmpfiles.d/*.conf'
# create file {/etc,/run,/usr/lib}/tmpfiles.d/.conf
# '/var/{run,lock}' are symlinks to '/run'
Type Path Mode UID GID Age Argument
'Type' 'f/d' create file/directory, 'F/D' create or truncate
'Type' 'L,L+' create symlink, 'C' recursively copy
'Type' 'X/x' exclude from cleanup, 'r/R' remove
'Path' can use '%m' for machineid, '%b' for bootid, '%H' for hostname
'Age' used by 'd,D,x' to decide when to cleanup

## examples
$ cat /etc/tmpfiles.d/samba.conf
D /run/samba 0755 root root
$ cat /etc/tmpfiles.d/abrt.conf
d /var/tmp/abrt 0755 abrt abrt
x /var/tmp/abrt/*

# creates, deletes, and cleans up volatile and temporary files and directories, based on configuration
$ systemd-tmpfiles --create --remove
  • systemd-sysusers uses the files from sysusers.d directory to create system users and groups at package installation or boot time.
## configuration
'Type' 'u' creates user and group, 'g' create group, 'm' add user to group
'ID' UID, GID or '-' for automatic allocation

## examples
$ cat /etc/sysusers.d/mypackage.conf
# Type Name ID GECOS
u httpd 440 "HTTP User"
u authd /usr/bin/authd "Authorization user"
g input - -
m authd input
u root 0 "Superuser" /root
$ systemd-sysusers /etc/sysusers.d/mypackage.conf
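  • sysctl.d and modules-load.d configure kernel parameters and kernel modules to load at boot, applied by systemd-sysctl.service and systemd-modules-load.service.
# set kernel YP domain name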
$ cat /etc/sysctl.d/domain-name.conf
kernel.domainname=example.com

# load virtio-net.ko at boot
$ cat /etc/modules-load.d/virtio-net.conf
virtio-net

from sysctl@arch and kernel modules@arch

  • systemd-networkd is a system service that manages networks, virtual network devices and low-level device links, using systemd.network, systemd.netdev and systemd.link files respectively. It detects and configures network devices as they appear, as well as creates virtual network devices. It can run alongside your usual network management (e.g.: netctl).
## '.network' configuration
[Match]
'Name=,Host=,Virtualization=' Match against a device name, hostname or virtualization only
[Network]
'DHCP=none|v4|v6|both' Enable DHCP
'DNS=' DNS server, multiple allowed
'Domains=' Domains used for DNS resolution
'Bridge=' Bridge name to add the link to
'Address=addr/netmask' Static address, short-hand for [Address]
'Gateway=' Network gateway, short-hand for [Route]
'IPMasquerade=' (219) Packets forwarded from the network interface will appear as coming from the local host

$ cat {/etc,/run,/usr/lib}/systemd/network/.network
[Match]
Name=en*
[Network]
# either dhcp
#DHCP=v4
# or static
Address=10.4.2.111/8
Gateway=10.254.0.2
DNS=10.254.0.121
DNS=10.254.0.122
Domains=mydomain

## '.link' configuration
[Match]
'MACAddress=,Host=,Virtualization=' Match against MAC address, hostname or virtualization only
[Link]
'MACAddressPolicy=' 'persistent' or 'random' MAC address
'NamePolicy=' List of policies used to set the interface name

$ cat /usr/lib/systemd/network/99-default.link
[Link]
NamePolicy=kernel database onboard slot path
MACAddressPolicy=persistent

## '.netdev' configuration
[Match]
'Host=,Virtualization=' Match against hostname or virtualization only
[NetDev]
'Name=' Interface name, required
'Kind=' 'bridge,bond,vlan,macvlan' required

$ cat /etc/systemd/network/.netdev
[NetDev]
Name=br0
Kind=bridge
$ cat /etc/systemd/network/.network
[Match]
Name=eth*
[Network]
Bridge=br0
$ cat /etc/systemd/network/.network
[Match]
Name=br0
[Network]
# either dhcp
#DHCP=v4
# or static
DNS=192.168.1.1
Address=192.168.1.2/24
Gateway=192.168.1.1
$ brctl show

$ systemctl restart systemd-networkd

from systemd-networkd@arch

  • systemd-resolved (216) implements a caching DNS stub resolver and an LLMNR resolver and responder. It uses the DNS servers configured in resolved.conf.d, the per-link static settings in .network files, and the per-link dynamic settings received over DHCP. It also generates {/etc,/run/systemd/resolve}/resolv.conf for compatibility.
$ cat /etc/systemd/resolved.conf.d/mypackage.conf
[Resolve]
DNS=192.168.0.10 192.168.1.3 192.168.0.1
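  • systemd.mount encodes information about a file system mount point controlled and supervised by systemd. Mount units can be written by hand or are generated from /etc/fstab.
## configuration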
$ man fstab
/device /mount-point fs-type mount-options
$ man mount
mount -t fs-type -o mount-options /device /fs-mount-point

[Mount]
'What=' Device node, e.g.: tmpfs
'Where=' Mount point, must match the unit name with '/' replaced by '-'
'Type=' FS type, e.g.: ext4
'Options=' Mount options

## usage
$ ssh-keygen ; ssh-copy-id -i /root/.ssh/id_rsa.pub root@10.4.2.102

# either '/etc/fstab'
root@10.4.2.102:/tmp /mnt/tmp fuse.sshfs defaults,allow_other,_netdev 0 0
$ mount -a

# or '/etc/systemd/system/mnt-tmp.mount' (unit name must match 'Where=')
[Unit]
Description=SSHFS example
[Mount]
What=root@10.4.2.102:/tmp
Where=/mnt/tmp
Type=fuse.sshfs
Options=defaults,allow_other,_netdev
$ systemctl start /mnt/tmp

$ systemctl status /mnt/tmp

from fstab@arch
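
The rule that a mount unit is named after its mount point can be sketched in shell. `mount_unit_name` is a hypothetical, simplified helper: it assumes the path contains only letters, digits, '_' and '/', while 'systemd-escape --path' handles the general case, including the escaping of special characters.

```shell
# mount_unit_name PATH: derive the mount unit name for a mount point.
# Simplified sketch; assumes PATH has no characters that need escaping.
mount_unit_name() {
  path=${1#/}                  # drop the leading '/'
  path=${path%/}               # drop a trailing '/'
  [ -n "$path" ] || path=-     # the root directory maps to '-.mount'
  printf '%s.mount\n' "$(printf '%s' "$path" | tr '/' '-')"
}

mount_unit_name /mnt/tmp
```

A unit file saved under any other name than the one derived from 'Where=' is refused by systemd.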

Filesystem hierarchy

file-hierarchy is a minimal, modernized subset of hier. Use systemd-path to list and query system and user paths.

'/etc' System specific configuration reserved for local admin
'/tmp' For small temporary files, use '/var/tmp' for large files, flushed on boot
'/srv' Store general server payload, managed by the admin
'/run' Runtime/volatile "tmpfs" for packages to place runtime data, flushed on boot, always writable
'/usr' Vendor/package-supplied operating system resources, usually read-only
'/var' Persistent and variable data, must be writable, pre-populated with vendor, but reconstructed if necessary
'/dev,/proc,/sys' Virtual kernel fs

# Compatibility Symlinks
'/bin,/sbin,/usr/sbin' -> '/usr/bin'
'/lib' -> '/usr/lib'
'/var/run' -> '/run'

Containers

Use systemd-nspawn to spawn namespace containers for debugging, testing and building. It's like chroot on steroids. Use a tool like yum, debootstrap, or pacman to set up an OS directory tree suitable as file system hierarchy for systemd-nspawn containers.

machinectl is used to introspect and control the state of your systemd VMs and containers, via systemd-machined.service.

systemd-nspawn [OPTIONS...] [COMMAND [ARGS...] ]
'-D,--directory=' Directory to use as file system root for the container
'--template=' (219) Directory or "btrfs" subvolume to use as template for the container's root directory. The '-D' directory is created from it if it doesn't exist yet.
'-x, --ephemeral' (219) Run with a temporary "btrfs" snapshot of its root directory (as configured with --directory=), that is removed immediately when the container terminates
'-i,--image=' Disk image to mount the root directory for the container from. Alternative to '-D'
'-b,--boot' Invoke init binary instead of a shell or a user-supplied program
'-M, --machine=' Sets the machine name for this container

'--private-network' Disconnect networking of the container from the host, except loopback
'--network-interface=' Assign the specified network interface to the container
'--network-macvlan=' Create a "macvlan" interface of the specified Ethernet network interface and add it to the container. A "macvlan" interface is a virtual interface that adds a second MAC address to an existing physical Ethernet link
'--network-veth' Create a virtual Ethernet link ("veth") between host and container
'--network-bridge=' Adds the host side of the Ethernet link created with --network-veth to the specified bridge
'-p,--port=' (219) If private networking, maps IP port on host onto IP port on container. Used with 'IPMasquerade=yes' in '.network'

'--read-only' Mount the root file system read-only for the container
'--bind=, --bind-ro=' Bind mount a file or directory from the host into the container
'--tmpfs=' Mount a tmpfs file system into the container

'--volatile=yes' Volatile/ephemeral mode. Root mounted as unpopulated "tmpfs" instance, and '/usr' from the OS tree is mounted into it, read-only
'--volatile=state' OS tree is mounted read-only, but '/var' is mounted as "tmpfs" instance into it
'--volatile=no' (default) whole tree is writable
If volatile=yes or state then all changes are lost on shutdown and container must boot with only '/usr' (able to populate '/var' automatically).

## examples
# boot a minimal Fedora in a container
$ yum -y --releasever=19 --nogpg --installroot=/srv/mycontainer --disablerepo='*' --enablerepo=fedora install systemd passwd yum fedora-release vim-minimal
$ systemd-nspawn -bD /srv/mycontainer

# spawn a shell in a container of a minimal Debian
$ debootstrap --arch=amd64 unstable ~/debian-tree/
$ systemd-nspawn -D ~/debian-tree/

# boot a minimal Arch Linux in a container
$ pacstrap -c -d ~/arch-tree/ base
$ systemd-nspawn -bD ~/arch-tree/

# boot your container at your machine startup
$ ln -s /path/to/MyContainer /var/lib/container/MyContainer
$ systemctl enable systemd-nspawn@MyContainer.service
$ systemctl start systemd-nspawn@MyContainer.service
$ machinectl list

# boot unmodified Fedora cloud images (219; dissecting of MBR disk images)
$ wget http://download.fedoraproject.org/pub/fedora/linux/releases/21/Cloud/Images/x86_64/Fedora-Cloud-Base-20141203-21.x86_64.raw.xz
$ unxz Fedora-Cloud-Base-20141203-21.x86_64.raw.xz
$ systemd-nspawn -i Fedora-Cloud-Base-20141203-21.x86_64.raw -b

# spawn a container on a temporary snapshot of your host's root directory, which is removed immediately when the container exits
$ systemd-nspawn -xb -D /

# first time it will create '/var/lib/container/mycontainer' from '/var/lib/container/fedora21' and boot it; on subsequent runs the container tree will already be created
$ systemd-nspawn -b -D /var/lib/container/mycontainer --template=/var/lib/container/fedora21

# socket activated OS containers
$ cat /etc/systemd/system/mycontainer.service
[Unit]
Description=My little container
[Service]
ExecStart=/usr/bin/systemd-nspawn -jbD /srv/mycontainer 3
KillMode=process
$ cat /etc/systemd/system/mycontainer.socket
[Unit]
Description=The SSH socket of my little container
[Socket]
ListenStream=23
# teach SSH inside the container socket activation
$ cat /etc/systemd/system/sshd.socket
[Unit]
Description=SSH Socket for Per-Connection Servers
[Socket]
ListenStream=23
Accept=yes
$ cat /etc/systemd/system/sshd@.service
[Unit]
Description=SSH Per-Connection Server for %I
[Service]
ExecStart=-/usr/sbin/sshd -i
StandardInput=socket
# start unit automatically when the container boots up
$ ln -s /etc/systemd/system/sshd.socket /etc/systemd/system/sockets.target.wants/

from systemd for Administrators, Part VI, systemd for Administrators, Part XX and systemd-nspawn@arch

systemd-import (219) is used to pull and update container images from the Internet (docker). It converts them into btrfs subvolumes/snapshots and makes them available as simple directory trees in '/var/lib/container/' for booting with 'systemd-nspawn'.

# download 'mattdm/fedora', and make it available as '/var/lib/container/fedora'
$ systemd-import pull-dck mattdm/fedora
$ systemd-nspawn -M fedora

Factory Reset, Stateless Systems, Reproducible Systems & Verifiable Systems:

  • Factory reset mechanism should flush {/etc,/var} but keep /usr.
  • Stateless system (where a reboot is a factory reset) never stores {/etc,/var} on persistent storage, but always comes up with pristine vendor state.
  • Reproducible systems each have a private {/etc,/var} for receiving local configuration, and /usr is pulled in via bind mounts (in case of containers).
  • Verifiable systems are related to stateless systems: storage is cryptographically ensured and {/etc,/var} are either included in the image or unnecessary to boot.