PIE causes problems with static binaries on ARM (see 76552e9). It is
enabled by default on other platforms anyway when musl is used, so we
don't need to specify it manually.
This mitigates CVE-2023-4911, crucially without a mass-rebuild.
We drop insecure environment variables explicitly, including
glibc-specific ones, since musl doesn't do this by default.
Change-Id: I591a817e6d4575243937d9ccab51c23a96bed6f9
Given that we are no longer inspecting the target of the /proc/self/exe
symlink, stop asserting that it has any properties. Remove the plumbing
for wrappersDir, which is no longer used.
Asserting that the binary is located in the specific place is no longer
necessary, because we don't rely on that location being writable only by
privileged entities (we used to rely on that when assuming that
readlink(/proc/self/exe) will continue to point at us and when assuming
that the `.real` file can be trusted).
Assertions about lack of write bits on the file were
IMO meaningless since inception: ignoring the Linux's refusal to honor
S[UG]ID bits on files-writeable-by-others, if someone could have
modified the wrapper in a way that preserved the capability or S?ID
bits, they could just remove this check.
Assertions about effective UID were IMO just harmful: if we were
executed without elevation, the caller would expect the result that
would cause in a wrapperless distro: the targets gets executed without
elevation. Due to lack of elevation, that cannot be used to abuse
privileges that the elevation would give.
This change partially fixes#98863 for S[UG]ID wrappers. The issue for
capability wrappers remains.
/proc/self/exe is a "fake" symlink. When it's opened, it always opens
the actual file that was execve()d in this process, even if the file was
deleted or renamed; if the file is no longer accessible from the current
chroot/mount namespace it will at the very worst fail and never open the
wrong file. Thus, we can make a much simpler argument that we're reading
capabilities off the correct file after this change (and that argument
doesn't rely on things such as protected_hardlinks being enabled, or no
users being able to write to /run/wrappers, or the verification that the
path readlink returns starts with /run/wrappers/).
Before this change it was crucial that nonprivileged users are unable to
create hardlinks to SUID wrappers, lest they be able to provide a
different `.real` file alongside. That was ensured by not providing a
location writable to them in the /run/wrappers tmpfs, (unless
disabled) by the fs.protected_hardlinks=1 sysctl, and by the explicit
own-path check in the wrapper. After this change, ensuring
that property is no longer important, and the check is most likely
redundant.
The simplification of expectations of the wrapper will make it
easier to remove some of the assertions in the wrapper (which currently
cause the wrapper to fail in no_new_privs environments, instead of
executing the target with non-elevated privileges).
Note that wrappers had to be copied (not symlinked) into /run/wrappers
due to the SUID/capability bits, and they couldn't be hard/softlinks of
each other due to those bits potentially differing. Thus, this change
doesn't increase the amount of memory used by /run/wrappers.
This change removes part of the test that is obsoleted by the removal of
`.real` files.
This change includes some stuff (e.g. reading of the `.real` file,
execution of the wrapper's target) that belongs to the apparmor policy
of the wrapper. This necessitates making them distinct for each wrapper.
The main reason for this change is as a preparation for making each
wrapper be a distinct binary.
Given that we are no longer inspecting the target of the /proc/self/exe
symlink, stop asserting that it has any properties. Remove the plumbing
for wrappersDir, which is no longer used.
Asserting that the binary is located in the specific place is no longer
necessary, because we don't rely on that location being writable only by
privileged entities (we used to rely on that when assuming that
readlink(/proc/self/exe) will continue to point at us and when assuming
that the `.real` file can be trusted).
Assertions about lack of write bits on the file were
IMO meaningless since inception: ignoring the Linux's refusal to honor
S[UG]ID bits on files-writeable-by-others, if someone could have
modified the wrapper in a way that preserved the capability or S?ID
bits, they could just remove this check.
Assertions about effective UID were IMO just harmful: if we were
executed without elevation, the caller would expect the result that
would cause in a wrapperless distro: the targets gets executed without
elevation. Due to lack of elevation, that cannot be used to abuse
privileges that the elevation would give.
This change partially fixes#98863 for S[UG]ID wrappers. The issue for
capability wrappers remains.
/proc/self/exe is a "fake" symlink. When it's opened, it always opens
the actual file that was execve()d in this process, even if the file was
deleted or renamed; if the file is no longer accessible from the current
chroot/mount namespace it will at the very worst fail and never open the
wrong file. Thus, we can make a much simpler argument that we're reading
capabilities off the correct file after this change (and that argument
doesn't rely on things such as protected_hardlinks being enabled, or no
users being able to write to /run/wrappers, or the verification that the
path readlink returns starts with /run/wrappers/).
Before this change it was crucial that nonprivileged users are unable to
create hardlinks to SUID wrappers, lest they be able to provide a
different `.real` file alongside. That was ensured by not providing a
location writable to them in the /run/wrappers tmpfs, (unless
disabled) by the fs.protected_hardlinks=1 sysctl, and by the explicit
own-path check in the wrapper. After this change, ensuring
that property is no longer important, and the check is most likely
redundant.
The simplification of expectations of the wrapper will make it
easier to remove some of the assertions in the wrapper (which currently
cause the wrapper to fail in no_new_privs environments, instead of
executing the target with non-elevated privileges).
Note that wrappers had to be copied (not symlinked) into /run/wrappers
due to the SUID/capability bits, and they couldn't be hard/softlinks of
each other due to those bits potentially differing. Thus, this change
doesn't increase the amount of memory used by /run/wrappers.
In user namespaces where an unprivileged user is mapped as root and root
is unmapped, setuid bits have no effect. However setuid root
executables like mount are still usable *in the namespace* as the user
already has the required privileges. This commit detects the situation
where the wrapper gained no privileges that the parent process did not
already have and in this case does less sanity checking. In short there
is no need to be picky since the parent already can execute the foo.real
executable themselves.
Details:
man 7 user_namespaces:
Set-user-ID and set-group-ID programs
When a process inside a user namespace executes a set-user-ID
(set-group-ID) program, the process's effective user (group) ID
inside the namespace is changed to whatever value is mapped for
the user (group) ID of the file. However, if either the user or
the group ID of the file has no mapping inside the namespace, the
set-user-ID (set-group-ID) bit is silently ignored: the new
program is executed, but the process's effective user (group) ID
is left unchanged. (This mirrors the semantics of executing a
set-user-ID or set-group-ID program that resides on a filesystem
that was mounted with the MS_NOSUID flag, as described in
mount(2).)
The effect of the setuid bit is that the real user id is preserved and
the effective and set user ids are changed to the owner of the wrapper.
We detect that no privilege was gained by checking that euid == suid
== ruid. In this case we stop checking that euid == owner of the
wrapper file.
As a reminder here are the values of euid, ruid, suid, stat.st_uid and
stat.st_mode & S_ISUID in various cases when running a setuid 42 executable as user 1000:
Normal case:
ruid=1000 euid=42 suid=42
setuid=2048, st_uid=42
nosuid mount:
ruid=1000 euid=1000 suid=1000
setuid=2048, st_uid=42
inside unshare -rm:
ruid=0 euid=0 suid=0
setuid=2048, st_uid=65534
inside unshare -rm, on a suid mount:
ruid=0 euid=0 suid=0
setuid=2048, st_uid=65534
Before this change, the description for
security.wrappers.<name>.capabilities made it seem like you could just
string together the names of capabilities like this:
capabilities = "CAP_SETUID,CAP_SETGID";
In reality, each item in the list must be a full-on capability clause:
capabilities = "CAP_SETUID=ep,CAP_SETGID+i";
conversions were done using https://github.com/pennae/nix-doc-munge
using (probably) rev f34e145 running
nix-doc-munge nixos/**/*.nix
nix-doc-munge --import nixos/**/*.nix
the tool ensures that only changes that could affect the generated
manual *but don't* are committed, other changes require manual review
and are discarded.
now nix-doc-munge will not introduce whitespace changes when it replaces
manpage references with the MD equivalent.
no change to the manpage, changes to the HTML manual are whitespace only.
the conversion procedure is simple:
- find all things that look like options, ie calls to either `mkOption`
or `lib.mkOption` that take an attrset. remember the attrset as the
option
- for all options, find a `description` attribute who's value is not a
call to `mdDoc` or `lib.mdDoc`
- textually convert the entire value of the attribute to MD with a few
simple regexes (the set from mdize-module.sh)
- if the change produced a change in the manual output, discard
- if the change kept the manual unchanged, add some text to the
description to make sure we've actually found an option. if the
manual changes this time, keep the converted description
this procedure converts 80% of nixos options to markdown. around 2000
options remain to be inspected, but most of those fail the "does not
change the manual output check": currently the MD conversion process
does not faithfully convert docbook tags like <code> and <package>, so
any option using such tags will not be converted at all.
A simpler implementation of 7d8b303e3f
that uses an assertion instead of a derivation.
`pathHasContext` seems a bit better than `hasPrefix storeDir` because it
avoids a string comparison, and catches nonsense like
`"foo${pkgs.hello}bar"`.
activating the configuration...
setting up /etc...
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.messagebus’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
chown: warning: '.' should be ':': ‘root.root’
reloading user units for root...
C's assert macro only works when NDEBUG is undefined. Previously
NDEBUG was undefined incorrectly which meant that the assert
macros in wrapper.c did not work.
Add a shell script that checks if the paths of all wrapped programs
actually exist to catch mistakes. This only checks for Nix store paths,
which are always expected to exist at build time.
To keep backward compatibility and have a typing would require making
all options null by default, adding a defaultText containing the actual
value, write the default value logic based on `!= null` and replacing
the nulls laters. This pretty much defeats the point of having used
a submodule type.
The security.wrappers option is morally a set of submodules but it's
actually (un)typed as a generic attribute set. This is bad for several
reasons:
1. Some of the "submodule" option are not document;
2. the default values are not documented and are chosen based on
somewhat bizarre rules (issue #23217);
3. It's not possible to override an existing wrapper due to the
dumb types.attrs.merge strategy;
4. It's easy to make mistakes that will go unnoticed, which is
really bad given the sensitivity of this module (issue #47839).
This makes the option a proper set of submodule and add strict types and
descriptions to every sub-option. Considering it's not yet clear if the
way the default values are picked is intended, this reproduces the current
behavior, but it's now documented explicitly.
With libcap 2.41 the output of cap_to_text changed, also the original
author of code hoped that this would never happen.
To counter this now the security-wrapper only relies on the syscall
ABI, which is more stable and robust than string parsing. If new
breakages occur this will be more obvious because version numbers will
be incremented.
Furthermore all errors no make execution explicitly fail instead of
hiding errors behind debug environment variables and the code style was
more consistent with no goto fail; goto fail; vulnerabilities (https://gotofail.com/)
This reverts commit fb6d63f3fd.
I really hope this finally fixes#99236: evaluation on Hydra.
This time I really did check basically the same commit on Hydra:
https://hydra.nixos.org/eval/1618011
Right now I don't have energy to find what exactly is wrong in the
commit, and it doesn't seem important in comparison to nixos-unstable
channel being stuck on a commit over one week old.
The /run/wrapper directory is a tmpfs. Unfortunately, it's mounted with
its root directory has the standard (for tmpfs) mode: 1777 (world writeable,
sticky -- the standard mode of shared temporary directories). This means that
every user can create new files and subdirectories there, but can't
move/delete/rename files that belong to other users.