During working on #150837 I discovered that `google-oslogin` test
started failing, and so did some of my development machines. Turns out
it was because nscd doesn't start by default; rather it's wanted by
NSS lookup targets, which are not always fired up.
To quote from section on systemd.special(7) on `nss-user-lookup.target`:
> All services which provide parts of the user/group database should be
> ordered before this target, and pull it in.
Following this advice and comparing our unit to official `sssd.service`
unit (which is a similar service), we now pull NSS lookup targets from
the service, while starting it with `multi-user.target`.
This removes the `services.dbus.socketActivated` and
`services.xserver.startDbusSession` options. Instead the user D-Bus
session is always socket activated.
Since systemd 243, docs were already steering users towards using
`journal`:
eedaf7f322
systemd 246 will go one step further, it shows warnings for these units
during bootup, and will [automatically convert these occurences to
`journal`](f3dc6af20f):
> [ 6.955976] systemd[1]: /nix/store/hwyfgbwg804vmr92fxc1vkmqfq2k9s17-unit-display-manager.service/display-manager.service:27: Standard output type syslog is obsolete, automatically updating to journal. Please update│······················
your unit file, and consider removing the setting altogether.
So there's no point of keeping `syslog` here, and it's probably a better
idea to just not set it, due to:
> This setting defaults to the value set with DefaultStandardOutput= in
> systemd-system.conf(5), which defaults to journal.
This effectively disables nscd's built-in hosts cache, which turns out
to be erratic in some cases.
We only use nscd these days as a more ABI-neutral NSS dispatcher
mechanism.
Local caching should still be possible with local resolvers in
/etc/resolv.conf (via the `dns` NSS module), or without local resolvers
via systemd-networkd (via the `resolve` nss module)
We don't set enable-cache to no due to
https://github.com/NixOS/nixpkgs/pull/50316#discussion_r241035226.
This prevents duplication in cross-compiled nixos machines. The
bootstrapped glibc differs from the natively compiled one, so we get
two glibc’s in the closure. To reduce closure size, just use
stdenv.cc.libc where available.
Since https://github.com/NixOS/nixpkgs/pull/61321, local-fs.target is
part of sysinit.target again, meaning units without
DefaultDependencies=no will automatically depend on it, and the manual
set dependencies can be dropped.
NixOS usually needs nscd just to have a single place where
LD_LIBRARY_PATH can be set to include all NSS modules, but nscd is also
useful if some of the NSS modules need to read files which are only
accessible by root.
For example, nixos/modules/config/ldap.nix needs this when
users.ldap.enable = true;
users.ldap.daemon.enable = false;
and users.ldap.bind.passwordFile exists. In that case, the module
creates an /etc/ldap.conf which is only readable by root, but which the
NSS module needs to read in order to find out what LDAP server to
connect to and with what credentials.
If nscd is started as root and configured with the server-user option in
nscd.conf, then it gives each NSS module the opportunity to initialize
itself before dropping privileges. The initialization happens in the
glibc-internal __nss_disable_nscd function, which pre-loads all the
configured NSS modules for passwd, group, hosts, and services (but not
netgroup for some reason?) and, for each loaded module, calls an init
function if one is defined. After that finishes, nscd's main() calls
nscd_init() which ends by calling finish_drop_privileges().
There are provisions in systemd for using DynamicUser with a service
which needs to drop privileges itself, so this patch does that.
Thanks to @arianvp for pointing out that when DynamicUser is true,
systemd defaults the value of User to be the name of the unit, which in
this case is already "nscd".
These options were being set to the same value as the defaults that are
hardcoded in nscd. Delete them so it's clear which settings are actually
important for NixOS.
One exception is `threads 1`, which is different from the built-in
default of 4. However, both values are equivalent because nscd forces
the number of threads to be at least as many as the number of kinds of
databases it supports, which is 5.
nscd doesn't create any files outside of /run/nscd unless the nscd.conf
"persistent" option is used, which we don't do by default. Therefore it
doesn't matter what UID/GID we run this service as, so long as it isn't
shared with any other running processes.
/run/nscd does need to be owned by the same UID that the service is
running as, but systemd takes care of that for us thanks to the
RuntimeDirectory directive.
If someone wants to turn on the "persistent" option, they need to
manually configure users.users.nscd and systemd.tmpfiles.rules so that
/var/db/nscd is owned by the same user that nscd runs as.
In an all-defaults boot.isContainer configuration of NixOS, this removes
the only user which did not have a pre-assigned UID.
Previously this module created both /var/db/nscd and /run/nscd using
shell commands in a preStart script. Note that both of these paths are
hard-coded in the nscd source. (Well, the latter is actually
/var/run/nscd but /var/run is a symlink to /run so it works out the
same.)
/var/db/nscd is only used if the nscd.conf "persistent" option is turned
on for one or more databases, which it is not in our default config
file. I'm not even sure persistent mode can work under systemd, since
`nscd --shutdown` is not synchronous so systemd will always
unceremoniously kill nscd without reliably giving it time to mark the
databases as unused. Nonetheless, if someone wants to use that option,
they can ensure the directory exists using systemd.tmpfiles.rules.
systemd can create /run/nscd for us with the RuntimeDirectory directive,
with the added benefit of causing systemd to delete the directory on
service stop or restart. The default value of RuntimeDirectoryMode is
755, the same as the mode which this module was using before.
I don't think the `rm -f /run/nscd/nscd.pid` was necessary after NixOS
switched to systemd and used its PIDFile directive, because systemd
deletes the specified file after the service stops, and because the file
can't persist across reboots since /run is a tmpfs. Even if the file
still exists when nscd starts, it's only a problem if the pid it
contains has been reused by another process, which is unlikely. Anyway,
this change makes that deletion even less necessary, because now systemd
deletes the entire /run/nscd directory when the service stops.
This postStart step was introduced on 2014-04-24 with the comment that
"Nscd forks into the background before it's ready to accept
connections."
However, that was fixed upstream almost two months earlier, on
2014-03-03, with the comment that "This, along with setting the nscd
service type to forking in its systemd configuration file, allows
systemd to be certain that the nscd service is ready and is accepting
connections."
The fix was released several months later in glibc 2.20, which was
merged in NixOS sometime before 15.09, so it certainly should be safe to
remove this workaround by now.
The geoclue module now lets us set application config. This should make
it more robust in environments that don't provide a geoclue agent.
Fixes#44725.
Systemd provides an option for allocating DynamicUsers
which we want to use in NixOS to harden service configuration.
However, we discovered that the user wasn't allocated properly
for services. After some digging this turned out to be, of course,
a cache inconsistency problem.
When a DynamicUser creation is performed, Systemd check beforehand
whether the requested user already exists statically. If it does,
it bails out. If it doesn't, systemd continues with allocating the
user.
However, by checking whether the user exists, nscd will store
the fact that the user does not exist in it's negative cache.
When the service tries to lookup what user is associated to its
uid (By calling whoami, for example), it will try to consult
libnss_systemd.so However this will read from the cache and tell
report that the user doesn't exist, and thus will return that
there is no user associated with the uid. It will continue
to do so for the cache duration time. If the service
doesn't immediately looks up its username, this bug is not
triggered, as the cache will be invalidated around this time.
However, if the service is quick enough, it might end up
in a situation where it's incorrectly reported that the
user doesn't exist.
Preferably, we would not be using nscd at all. But we need to
use it because glibc reads nss modules from /etc/nsswitch.conf
by looking relative to the global LD_LIBRARY_PATH. Because LD_LIBRARY_PATH
is not set globally (as that would lead to impurities and ABI issues),
glibc will fail to find any nss modules.
Instead, as a hack, we start up nscd with LD_LIBRARY_PATH set
for only that service. Glibc will forward all nss syscalls to
nscd, which will then respect the LD_LIBRARY_PATH and only
read from locations specified in the NixOS config.
we can load nss modules in a pure fashion.
However, I think by accident, we just copied over the default
settings of nscd, which actually caches user and group lookups.
We already disable this when sssd is enabled, as this interferes
with the correct working of libnss_sss.so as it already
does its own caching of LDAP requests.
(See https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/usingnscd-sssd)
Because nscd caching is now also interferring with libnss_systemd.so
and probably also with other nsss modules, lets just pre-emptively
disable caching for now for all options related to users and groups,
but keep it for caching hosts ans services lookups.
Note that we can not just put in /etc/nscd.conf:
enable-cache passwd no
As this will actually cause glibc to _not_ forward the call to nscd
at all, and thus never reach the nss modules. Instead we set
the negative and positive cache ttls to 0 seconds as a workaround.
This way, Glibc will always forward requests to nscd, but results
will never be cached.
Fixes#50273
Could also move kdc.conf, but this makes it inconvenient to use command line
utilities with heimdal, as it would require specifying --config-file with every
command.
Allow switching out kerberos server implementation.
Sharing config is probably sensible, but implementation is different enough to
be worth splitting into two files. Not sure this is the correct way to split an
implementation, but it works for now.
Uses the switch from config.krb5 to select implementation.
Several service definitions used `mkEnableOption` with text starting
with "Whether to", which produced funny option descriptions like
"Whether to enable Whether to run the rspamd daemon..".
This commit corrects this, and adds short descriptions of services
to affected service definitions.
Some modules of cloud-init can cope with a network not immediately
available (notably, the EC2 module), but some others won't retry if
network is not available (notably, the Cloudstack module).
network.target doesn't give much guarantee about the network
availability. Applications not able to start without a fully
configured network should be ordered after network-online.target.
Also see #44573 and #44524.
DBus seems to resolve user IDs directly via glibc, circumventing nscd. In more
advanced setups this leads to user's coming from LDAP or SSSD not being
resolved by the dbus system bus daemon. The effect for such users is, that all
access to the system bus (e.g. busctl or nmcli) is denied.
Adding the respective NSS modules to the service's environment solves the issue
the same way it does for nscd.
DBus daemon now loads its config from /run/current-system/dbus.
Reloading the daemon makes it re-read that file and catch the updates
after a system upgrade.
perlPackages.TextWrapI18N: init at 0.06
perlPackages.Po4a: init at 0.47
jade: init at 1.2.1
ding-libs: init at 0.6.0
Switch nscd to no-caching mode if SSSD is enabled.
abbradar: disable jade parallel building.
Closes#21150
The following changes are included:
1) install user unit files from upstream dbus
2) use absolute paths to config for --system and --session instances
3) make socket activation of user units configurable
There has been a number of PRs to address this, so this one does the
bare minimum, which is to make the functionality available and
configurable but defaults to off.
Related PRs:
- #18382
- #18222
(cherry picked from commit f7215c9b5b)
Signed-off-by: Domen Kožar <domen@dev.si>
It appears that packageOverrides no longer overrides aliases, so
aliases like
dbus_tools = self.dbus.out;
dbus_daemon = self.dbus.daemon;
now use the old, non-overriden version of dbus. That seems like a
pretty serious regression in general, but for this particular problem,
I've fixed it by replacing dbus_daemon by dbus.daemon and dbus_tools
by dbus.
The docstring for the `services.dbus.packages` configuration option only
mentioned one directory, but the implementation actually looked for DBus
config files in four separate places within the target packages. This
commit updates the docstring to reflect the actual implementation
behaviour.
This patch makes dbus launch with any user session instead of
leaving it up to the desktop environment launch script to run it.
It has been tested with KDE, which simply uses the running daemon
instead of launching its own.
This is upstream's recommended way to run dbus.
These services don't create files on disk, let alone on a network
filesystem, so they don't really need a fixed uid. And this also gets
rid of a warning coming from <= 14.12 systems.
Specifically, this fixes dnsmasq, which failed with
Apr 16 19:00:30 mandark dnsmasq[23819]: dnsmasq: DBus error: Connection ":1.260" is not allowed to own the service "uk.org.thekelleys.dnsmasq" due to security policies in the configuration file
Apr 16 19:00:30 mandark dnsmasq[23819]: DBus error: Connection ":1.260" is not allowed to own the service "uk.org.thekelleys.dnsmasq" due to security policies in the configuration file
after being enabled, due to dbus not being reloaded.
Many bus clients get hopelessly confused when dbus-daemon is
restarted. So let's not do that.
Of course, this is not ideal either, because we end up stuck with a
possibly outdated dbus-daemon. But that issue will become irrelevant
in the glorious kdbus-based future.
Hopefully this also gets rid of systemd getting stuck after
dbus-daemon is restarted:
Apr 01 15:37:50 mandark systemd[1]: Failed to register match for Disconnected message: Connection timed out
Apr 01 15:37:50 mandark systemd[1]: Looping too fast. Throttling execution a little.
Apr 01 15:37:51 mandark systemd[1]: Looping too fast. Throttling execution a little.
...
Using pkgs.lib on the spine of module evaluation is problematic
because the pkgs argument depends on the result of module
evaluation. To prevent an infinite recursion, pkgs and some of the
modules are evaluated twice, which is inefficient. Using ‘with lib’
prevents this problem.
[Bjørn Forsman <bjorn.forsman@gmail.com>:
- use types.lines instead of types.string. The former joins strings
with "\n" and the latter with "" (and is deprecated).
]