We now track copied files in /etc/.clean. This is important, because
otherwise files that are removed from environment.etc will not
actually be removed from the file system. In particular, changing
users.extraUsers.<user>.openssh.authorizedKeys.keys to an empty list
would not cause /etc/ssh/authorized_keys.d/<user> to be removed, which
was a security issue.
It only needs to be started during boot. Starting it at other times
shouldn't hurt, except that if systemd-journald is restarting at the
same time, the latter might not have a SIGUSR1 signal handler
installed yet, so it might be killed by systemd-journal-flush. (At
least that's my theory about the dead systemd-journald instances in
the build farm...)
It's only needed during early boot (in fact, it's probably not needed
at all on NixOS). Restarting it is expensive because it does a sync()
of the root file system.
systemd escaping rules translate this into a string containing '\'
which is treated by some code paths as quoted, and by others as unquoted
causing the affected units to fail.
Restarting user@ instances is bad because it causes all user services
(such as ssh-agent.service) to be restarted. Maybe one day we can have
switch-to-configuration restart user units in a fine-grained way, but
for now we should just ignore user systemd instances.
Backport: 14.04
If /boot is a btrfs subvolume, it will be on a different device than /
but not be at the root from grub's perspective. This should be fixed in
a nicer way by #2449, but that can't go into 14.04.
This reverts commit 6eaced3582. Doesn't
work very well, e.g. if you actually have the FUSE module loaded. And
in any case it's already fixed in NixOps.
Otherwise, when switching from systemd 203 to 212, you get errors like:
Failed to stop remote-fs.target: Bad message
Failed to stop systemd-udevd-control.socket: Bad message
...
These fail to mount if you don't have the appropriate kernel support,
and this confuses NixOps' ‘check’ command. We should teach NixOps not
to complain about non-essential mount points, but in the meantime it's
better to turn them off.
This seems to have combined badly with the systemd upgrade, we'll revert
for now and revisit after the 14.04 branch.
This reverts commit ad80532881, reversing
changes made to 1c5d3c7883.
This used to work with systemd-nspawn 203, because it bind-mounted
/etc/resolv.conf (so openresolv couldn't overwrite it). Now it's just
copied, so we need some special handling.
With ‘systemd.user.units’ and ‘systemd.user.services’, you can specify
units used by per-user systemd instances. For example,
systemd.user.services.foo =
{ description = "foo";
wantedBy = [ "default.target" ];
serviceConfig.ExecStart = "${pkgs.foo}/bin/foo";
};
declares a unit ‘foo.service’ that gets started automatically when the
user systemd instance starts, and is stopped when the user systemd
instance stops.
Note that there is at most one systemd instance per user: it's created
when a user logs in and there is no systemd instance for that user
yet, and it's removed when the user fully logs out (i.e. has no
sessions anymore). So if you're simultaneously logged in via X11 and a
virtual console, you get only one copy of foo.
If you define a unit, and either systemd or a package in
systemd.packages already provides that unit, then we now generate a
file /etc/systemd/system/<unit>.d/overrides.conf. This makes it
possible to use upstream units, while allowing them to be customised
from the NixOS configuration. For instance, the module nix-daemon.nix
now uses the units provided by the Nix package. And all unit
definitions that duplicated upstream systemd units are finally gone.
This makes the baseUnit option unnecessary, so I've removed it.
This allows specifying rules for systemd-tmpfiles.
Also, enable systemd-tmpfiles-clean.timer so that stuff is cleaned up
automatically 15 minutes after boot and every day, *if* you have the
appropriate cleanup rules (which we don't have by default).
This creates static device nodes such as /dev/fuse or
/dev/snd/seq. The kernel modules for these devices will be loaded on
demand when the device node is opened.
This prevents insidious errors once systemd begins handling the unit. If
the unit is loaded at boot, any errors of this nature are logged to the
console before the journal service is running. This makes it very hard
to diagnose the issue. Therefore, this assertion helps guarantee the
mistake is not made.
Note that systemd no longer depends on dbus, so we're rid of the
cyclic dependency problem between systemd and dbus.
This commit incorporates from wkennington's systemd branch
(203dcff45002a63f6be75c65f1017021318cc839,
1f842558a95947261ece66f707bfa24faf5a9d88).
Using pkgs.lib on the spine of module evaluation is problematic
because the pkgs argument depends on the result of module
evaluation. To prevent an infinite recursion, pkgs and some of the
modules are evaluated twice, which is inefficient. Using ‘with lib’
prevents this problem.
This allows to define systemd.path(5) units, for example like this:
{
systemd = let
description = "Set Key Permissions for xyz.key";
in {
paths.set-key-perms = {
inherit description;
before = [ "network.target" ];
wantedBy = [ "multi-user.target" ];
pathConfig.PathChanged = "/run/keys/xyz.key";
};
services.set-key-perms = {
inherit description;
serviceConfig.Type = "oneshot";
script = "chown myspecialkeyuser /run/keys/xyz.key";
};
};
}
The example here is actually useful in order to set permissions for the
NixOps keys target to ensure those permisisons aren't reset whenever the
key file is reuploaded.
Signed-off-by: aszlig <aszlig@redmoonstudios.org>
You can now say:
systemd.services.foo.baseUnit = "${pkgs.foo}/.../foo.service";
This will cause NixOS' generated foo.service file to include
foo.service from the foo package. You can then apply local
customization in the usual way:
systemd.services.foo.serviceConfig.MemoryLimit = "512M";
Note however that overriding options in the original unit may not
work. For instance, you cannot override ExecStart.
It's also possible to customize instances of template units:
systemd.services."getty@tty4" =
{ baseUnit = "/etc/systemd/system/getty@.service";
serviceConfig.MemoryLimit = "512M";
};
This replaces the unit options linkTarget (which didn't allow
customisation) and extraConfig (which did allow customisation, but in
a non-standard way).
We used to have the configuration of the kernel available in a
somewhat convenient place (/run/booted-system/kernel-modules/config)
but that has disappeared. So instead just make /proc/configs.gz
available. It only eats a few kilobytes.
Without this the HTML manual and manpage is quite unreadable (newlines
are squashed so it doesn't look like a list anymore).
(Unfortunately, this makes the source unreadable.)
Security-relevant changes:
* No (salted) passphrase hash send to the yubikey, only hash of the salt (as it was in the original implementation).
* Derive $k_luks with PBKDF2 from the yubikey $response (as the PBKDF2 salt) and the passphrase $k_user
(as the PBKDF2 password), so that if two-factor authentication is enabled
(a) a USB-MITM attack on the yubikey itself is not enough to break the system
(b) the potentially low-entropy $k_user is better protected against brute-force attacks
* Instead of using uuidgen, gather the salt (previously random uuid / uuid_r) directly from /dev/random.
* Length of the new salt in byte added as the parameter "saltLength", defaults to 16 byte.
Note: Length of the challenge is 64 byte, so saltLength > 64 may have no benefit over saltLengh = 64.
* Length of $k_luks derived with PBKDF2 in byte added as the parameter "keyLength", defaults to 64 byte.
Example: For a luks device with a 512-bit key, keyLength should be 64.
* Increase of the PBKDF2 iteration count per successful authentication added as the
parameter "iterationStep", defaults to 0.
Other changes:
* Add optional grace period before trying to find the yubikey, defaults to 2 seconds.
Full overview of the yubikey authentication process:
(1) Read $salt and $iterations from unencrypted device (UD).
(2) Calculate the $challenge from the $salt with a hash function.
Chosen instantiation: SHA-512($salt).
(3) Challenge the yubikey with the $challenge and receive the $response.
(4) Repeat three times:
(a) Prompt for the passphrase $k_user.
(b) Derive the key $k_luks for the luks device with a key derivation function from $k_user and $response.
Chosen instantiation: PBKDF2(HMAC-SHA-512, $k_user, $response, $iterations, keyLength).
(c) Try to open the luks device with $k_luks and escape loop (4) only on success.
(5) Proceed only if luks device was opened successfully, fail otherwise.
(6) Gather $new_salt from a cryptographically secure pseudorandom number generator
Chosen instantiation: /dev/random
(7) Calculate the $new_challenge from the $new_salt with the same hash function as (2).
(8) Challenge the yubikey with the $new_challenge and receive the $new_response.
(9) Derive the new key $new_k_luks for the luks device in the same manner as in (4) (b),
but with more iterations as given by iterationStep.
(10) Try to change the luks device's key $k_luks to $new_k_luks.
(11) If (10) was successful, write the $new_salt and the $new_iterations to the UD.
Note: $new_iterations = $iterations + iterationStep
Known (software) attack vectors:
* A MITM attack on the keyboard can recover $k_user. This, combined with a USB-MITM
attack on the yubikey for the $response (1) or the $new_response (2) will result in
(1) $k_luks being recovered,
(2) $new_k_luks being recovered.
* Any attacker with access to the RAM state of stage-1 at mid- or post-authentication
can recover $k_user, $k_luks, and $new_k_luks
* If an attacker has recovered $response or $new_response, he can perform a brute-force
attack on $k_user with it without the Yubikey needing to be present (using cryptsetup's
"luksOpen --verify-passphrase" oracle. He could even make a copy of the luks device's
luks header and run the brute-force attack without further access to the system.
* A USB-MITM attack on the yubikey will allow an attacker to attempt to brute-force
the yubikey's internal key ("shared secret") without it needing to be present anymore.
Credits:
* Florian Klien,
for the original concept and the reference implementation over at
https://github.com/flowolf/initramfs_ykfde
* Anthony Thysse,
for the reference implementation of accessing OpenSSL's PBKDF2 over at
http://www.ict.griffith.edu.au/anthony/software/pbkdf2.c
Rationale:
* The main reason for choosing to implement the PBA in accordance
with the Yubico documentation was to prevent a MITM-USB-attack
successfully recovering the new LUKS key.
* However, a MITM-USB-attacker can read user id and password when
they were entered for PBA, which allows him to recover the new
challenge after the PBA is complete, with which he can challenge
the Yubikey, decrypt the new AES blob and recover the LUKS key.
* Additionally, since the Yubikey shared secret is stored in the
same AES blob, after such an attack not only is the LUKS device
compromised, the Yubikey is as well, since the shared secret
has also been recovered by the attacker.
* Furthermore, with this method an attacker could also bruteforce
the AES blob, if he has access to the unencrypted device, which
would again compromise the Yubikey, should he be successful.
* Finally, with this method, once the LUKS key has been recovered
once, the encryption is permanently broken, while with the previous
system, the LUKS key itself it changed at every successful boot,
so recovering it once will not necessarily result in a permanent
breakage and will also not compromise the Yubikey itself (since
its secret is never stored anywhere but on the Yubikey itself).
Summary:
The current implementation opens up up vulnerability to brute-forcing
the AES blob, while retaining the current MITM-USB attack, additionally
making the consequences of this attack permanent and extending it to
the Yubikey itself.
switch-to-configuration.pl is currently hard-coded to assume that if a
unit is in the "auto-restart" state that something has gone wrong, but
this is not strictly true. For example, I run offlineimap as a oneshot
service restarting itself every minute (on success). NixOS currently
thinks that offlineimap has failed to start as it enters the
auto-restart state, because it doesn't consider why the unit failed.
This commit changes switch-to-configuration.pl to inspect the full
status of a unit in auto-restart state, and now only considers it failed
if the ExecMainStatus is non-zero.
Currently switch-to-configuration.pl uses system() calls to interact
with DBus. This can be error prone, especially when we are parsing
output that could change. In this commit, almost all calls to the
systemctl binary have been replaced with equivalent operations via DBus.
This is achieved by having multiple lines per storage file, one for each user (if the feature is enabled); each of these
lines has the same format as would be the case for the userless authentication, except that they are prepended with a
SHA-512 of the user's id.
'YubiKey Integration for Full Disk Encryption Pre-Boot Authentication (Copyright) Yubico, 2011 Version: 1.1'.
Used binaries:
* uuidgen - for generation of random sequence numbers
* ykchalresp - for challenging a Yubikey
* ykinfo - to check if a Yubikey is plugged in at boot (fallback to passphrase authentication otherwise)
* openssl - for calculation of SHA-1, HMAC-SHA-1, as well as AES-256-CTR (de/en)cryption
Main differences to the specification mentioned above:
* No user management (yet), only one password+yubikey per LUKS device
* SHA-512 instead of CRC-16 for checksum
Main differences to the previous implementation:
* Instead of changing the key slot of the LUKS device each boot,
the actual key for the LUKS device will be encrypted itself
* Since the response for the new challenge is now calculated
locally with openssl, the MITM-USB-attack with which previously
an attacker could obtain the new response (that was used as the new
encryption key for the LUKS device) by listening to the
Yubikey has ideally become useless (as long as uuidgen can
successfuly generate new random sequence numbers).
Remarks:
* This is not downwards compatible to the previous implementation
This will allow overriding package-provided units, or overriding only a
specific instance of a unit template.
Signed-off-by: Shea Levy <shea@shealevy.com>
This required some changes to systemd unit handling:
* Add an option to specify that a unit is just a symlink
* Allow specified units to overwrite systemd-provided ones
* Have gettys.target require autovt@1.service instead of getty@1.service
Signed-off-by: Shea Levy <shea@shealevy.com>
Some programs (notably the Java Runtime Environment) expect to be able
to extract the name of the time zone from the target of the
/etc/localtime symlink. That doesn't work if /etc/localtime is a
symlink to /etc/static/localtime. So make it a direct symlink.
You can now say:
systemd.containers.foo.config =
{ services.openssh.enable = true;
services.openssh.ports = [ 2022 ];
users.extraUsers.root.openssh.authorizedKeys.keys = [ "ssh-dss ..." ];
};
which defines a NixOS instance with the given configuration running
inside a lightweight container.
You can also manage the configuration of the container independently
from the host:
systemd.containers.foo.path = "/nix/var/nix/profiles/containers/foo";
where "path" is a NixOS system profile. It can be created/updated by
doing:
$ nix-env --set -p /nix/var/nix/profiles/containers/foo \
-f '<nixos>' -A system -I nixos-config=foo.nix
The container configuration (foo.nix) should define
boot.isContainer = true;
to optimise away the building of a kernel and initrd. This is done
automatically when using the "config" route.
On the host, a lightweight container appears as the service
"container-<name>.service". The container is like a regular NixOS
(virtual) machine, except that it doesn't have its own kernel. It has
its own root file system (by default /var/lib/containers/<name>), but
shares the Nix store of the host (as a read-only bind mount). It also
has access to the network devices of the host.
Currently, if the configuration of the container changes, running
"nixos-rebuild switch" on the host will cause the container to be
rebooted. In the future we may want to send some message to the
container so that it can activate the new container configuration
without rebooting.
Containers are not perfectly isolated yet. In particular, the host's
/sys/fs/cgroup is mounted (writable!) in the guest.