bind mounting directories into the nix-store breaks nix commands.
In particular it introduces character devices that are not supported
by nix-store as valid files in the nix store. Use `/var/empty` instead
which is designated for these kind of use cases. We won't create any
files beause of the tmpfs mounted.
serviceConfig.ProtectSystem is usually a string so if set, the assert
itself would error out leaving no useable trace:
# nixos-rebuild switch --show-trace
building Nix...
building the system configuration...
error: while evaluating the attribute 'config.system.build.toplevel' at /nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/activation/top-level.nix:293:5:
while evaluating 'foldr' at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/lists.nix:52:20, called from /nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/activation/top-level.nix:128:12:
while evaluating 'fold'' at /nix/var/nix/profiles/per-user/root/channels/nixos/lib/lists.nix:55:15, called from /nix/var/nix/profiles/per-user/root/channels/nixos/lib/lists.nix:59:8:
while evaluating anonymous function at /nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/system/activation/top-level.nix:121:50, called from undefined position:
while evaluating the attribute 'assertion' at /nix/var/nix/profiles/per-user/root/channels/nixos/nixos/modules/security/systemd-confinement.nix:163:7:
value is a string while a Boolean was expected
Fix the check to give a sensible assert message instead; the attribute
should either be not set or false bool to pass.
Closes: #99000
systemd-confinement's automatic package extraction does not work correctly
if ExecStarts ExecReload etc are lists.
Add an extra flatten to make things smooth.
Fixes#96840.
Systemd ProtectSystem is incompatible with the chroot we make
for confinement. The options is redundant with what we do anyway
so warn if it had been set and advise to disable it.
Merges: https://github.com/NixOS/nixpkgs/pull/87420
So far we had MountFlags = "private", but as @Infinisil has correctly
noticed, there is a dedicated PrivateMounts option, which does exactly
that and is better integrated than providing raw mount flags.
When checking for the reason why I used MountFlags instead of
PrivateMounts, I found that at the time I wrote the initial version of
this module (Mar 12 06:15:58 2018 +0100) the PrivateMounts option didn't
exist yet and has been added to systemd in Jun 13 08:20:18 2018 +0200.
Signed-off-by: aszlig <aszlig@nix.build>
Noted by @Infinisil on IRC:
infinisil: Question regarding the confinement PR
infinisil: On line 136 you do different things depending on
RootDirectoryStartOnly
infinisil: But on line 157 you have an assertion that disallows that
option being true
infinisil: Is there a reason behind this or am I missing something
I originally left this in so that once systemd supports that, we can
just flip a switch and remove the assertion and thus support
RootDirectoryStartOnly for our confinement module.
However, this doesn't seem to be on the roadmap for systemd in the
foreseeable future, so I'll just remove this, especially because it's
very easy to add it again, once it is supported.
Signed-off-by: aszlig <aszlig@nix.build>
My implementation was relying on PrivateDevices, PrivateTmp,
PrivateUsers and others to be false by default if chroot-only mode is
used.
However there is an ongoing effort[1] to change these defaults, which
then will actually increase the attack surface in chroot-only mode,
because it is expected that there is no /dev, /sys or /proc.
If for example PrivateDevices is enabled by default, there suddenly will
be a mounted /dev in the chroot and we wouldn't detect it.
Fortunately, our tests cover that, but I'm preparing for this anyway so
that we have a smoother transition without the need to fix our
implementation again.
Thanks to @Infinisil for the heads-up.
[1]: https://github.com/NixOS/nixpkgs/issues/14645
Signed-off-by: aszlig <aszlig@nix.build>
From @edolstra at [1]:
BTW we probably should take the closure of the whole unit rather than
just the exec commands, to handle things like Environment variables.
With this commit, there is now a "fullUnit" option, which can be enabled
to include the full closure of the service unit into the chroot.
However, I did not enable this by default, because I do disagree here
and *especially* things like environment variables or environment files
shouldn't be in the closure of the chroot.
For example if you have something like:
{ pkgs, ... }:
{
systemd.services.foobar = {
serviceConfig.EnvironmentFile = ${pkgs.writeText "secrets" ''
user=admin
password=abcdefg
'';
};
}
We really do not want the *file* to end up in the chroot, but rather
just the environment variables to be exported.
Another thing is that this makes it less predictable what actually will
end up in the chroot, because we have a "globalEnvironment" option that
will get merged in as well, so users adding stuff to that option will
also make it available in confined units.
I also added a big fat warning about that in the description of the
fullUnit option.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Another thing requested by @edolstra in [1]:
We should not provide a different /bin/sh in the chroot, that's just
asking for confusion and random shell script breakage. It should be
the same shell (i.e. bash) as in a regular environment.
While I personally would even go as far to even have a very restricted
shell that is not even a shell and basically *only* allows "/bin/sh -c"
with only *very* minimal parsing of shell syntax, I do agree that people
expect /bin/sh to be bash (or the one configured by environment.binsh)
on NixOS.
So this should make both others and me happy in that I could just use
confinement.binSh = "${pkgs.dash}/bin/dash" for the services I confine.
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>
Quoting @edolstra from [1]:
I don't really like the name "chroot", something like "confine[ment]"
or "restrict" seems better. Conceptually we're not providing a
completely different filesystem tree but a restricted view of the same
tree.
I already used "confinement" as a sub-option and I do agree that
"chroot" sounds a bit too specific (especially because not *only* chroot
is involved).
So this changes the module name and its option to use "confinement"
instead of "chroot" and also renames the "chroot.confinement" to
"confinement.mode".
[1]: https://github.com/NixOS/nixpkgs/pull/57519#issuecomment-472855704
Signed-off-by: aszlig <aszlig@nix.build>