This bug is so obscure and unlikely that I was honestly not able to
properly write a test for it. What happens is that we are calling
handleModifiedUnit() with $unitsToStart=\%unitsToRestart. We do this to
make sure that the unit is stopped before it's started again which is
not possible by regular means because the stop phase is already done
when calling the activation script.
recordUnit() still gets $startListFile, however which is the wrong file.
The bug would be triggered if an activation script requests a service
restart for a service that has `stopIfChanged = true` and
switch-to-configuration is killed before the restart phase was run. If
the script is run again, but the activation script is not requesting
more restarts, the unit would be started instead of restarted.
systemd needs this so special characters (like the ones in wireguard
units that appear because they are part of base64) can be escaped using
the \x syntax.
Root of the issue is that `glob()` handles the backslash internally
which is obviously not what we want here.
Also add a test case and fix some perlcritic issues in the subroutine.
This is accomplished by comparing the hashes that the unit files
contain. By filtering for a special key `X-Reload-Triggers` in the
`[Unit]` section, we can differentiate between reloads and restarts.
Since activation scripts can request reloads of units as well, more
checking of this behaviour is implemented. If a unit is to be restarted,
it's never reloaded as well which would make no sense.
Also removes a useless subroutine and perl dependencies that are
nowadays handled by the propagated build inputs feature of
`perl.withPackages`.
The `nix.*` options, apart from options for setting up the
daemon itself, currently provide a lot of setting mappings
for the Nix daemon configuration. The scope of the mapping yields
convience, but the line where an option is considered essential
is blurry. For instance, the `extra-sandbox-paths` mapping is
provided without its primary consumer, and the corresponding
`sandbox-paths` option is also not mapped.
The current system increases the maintenance burden as maintainers have to
closely follow upstream changes. In this case, there are two state versions
of Nix which have to be maintained collectively, with different options
avaliable.
This commit aims to following the standard outlined in RFC 42[1] to
implement a structural setting pattern. The Nix configuration is encoded
at its core as key-value pairs which maps nicely to attribute sets, making
it feasible to express in the Nix language itself. Some existing options are
kept such as `buildMachines` and `registry` which present a simplified interface
to managing the respective settings. The interface is exposed as `nix.settings`.
Legacy configurations are mapped to their corresponding options under `nix.settings`
for backwards compatibility.
Various options settings in other nixos modules and relevant tests have been
updated to use structural setting for consistency.
The generation and validation of the configration file has been modified to
use `writeTextFile` instead of `runCommand` for clarity. Note that validation
is now mandatory as strict checking of options has been pushed down to the
derivation level due to freeformType consuming unmatched options. Furthermore,
validation can not occur when cross-compiling due to current limitations.
A new option `publicHostKey` was added to the `buildMachines`
submodule corresponding to the base64 encoded public host key settings
exposed in the builder syntax. The build machine generation was subsequently
rewritten to use `concatStringsSep` for better performance by grouping
concatenations.
[1] - https://github.com/NixOS/rfcs/blob/master/rfcs/0042-config-option.md
Modules that do not depend on e.g. toplevel should not have to include it just to set
things in `system.build`. As a general rule, this keeps tests simple, usage flexible
and evaluation fast. While one module is insignificant, consistency and good practices
are.
If the Nix daemon has never been enabled (nix.enable has always been
set to false), the gcroots directory won't exist. If the Nix daemon
is later enabled, the GC roots for booted-system and current-system
will be missing, and they might end up being garbage collected. Since
it's cheap to add GC roots even if the daemon will never be enabled,
let's just always add them so we're okay in the case where the daemon
is enabled later.
- Fully get rid of `parseKeyValues` and use systemctl features for that
- Add some regex modifiers recommended by perlcritic
- Get rid of a postfix if
- Sort units when showing their status
- Clean the logic for showing what failed from `elif` to `next`
- Switch from `state` to `substate` for `auto-restart` because that's
actually where the value is stored
- Show status of units with one single systemctl call and get rid of
COLUMNS in favor of --full
- Add a test for failing units
This replaces the naive K=V unit parser with a proper INI parser from a
library and adds proper support for override files. Also adds a bunch of
comments about parsing, I hope this makes it easier to understand and
maintain in the future.
There are multiple reasons to do so, the first one is just general
correctness with is nice imo. But to get to more serious reasons (I
didn't put in all that effort for nothing) is that this is the first
step torwards more clever restart/reload handling. By using a library
like Data::Compare a future PR could replace the current way of
fingerprinting units (which is to compare store paths) by comparing the
hashes. This is more precise because units won't get restarted because
the order of the options change, comments are added, some dependency of
writeText changes, .... Also this allows us to add a feature like
`X-Reload-Triggers` so the unit can either be reloaded when these change
or restarted when everything else changes, giving module authors the
ability to have their services reloaded without having to fear that
updates are not applied because the service doesn't get restarted.
Another reason why this feature is nice is that now that the unit files
are parsed correctly (and values are just extracted from one section),
potential future rewrites can just rely on some INI library without
having to implement their own weird parser that is compatible with this
script.
This also comes with a new subroutine to handle systemd booleans because
I thought the current way of handling it was just ugly. This also allows
overriding values this script reads in an override file.
Apart from making this script more compatible with the world around it,
this also fixes two issues I saw bugging exactly 0 (zero) people. First
is that this script now supports multiple override files, also ones that
are not called override.conf and the second one is that `1` and `on` are
treated as bools by systemd but were previously not parsed as such by
switch-to-configuration.
This removes `/run/nixos/activation-reload-list` (which we will need in
the future when reworking the reload logic) and makes
`/run/nixos/activation-restart-list` honor `restartIfChanged` and
`reloadIfChanged`. This way activation scripts don't have to bother with
choosing between reloading and restarting.
most modules can be evaluated for their documentation in a very
restricted environment that doesn't include all of nixpkgs. this
evaluation can then be cached and reused for subsequent builds, merging
only documentation that has changed into the cached set. since nixos
ships with a large number of modules of which only a few are used in any
given config this can save evaluation a huge percentage of nixos
options available in any given config.
in tests of this caching, despite having to copy most of nixos/, saves
about 80% of the time needed to build the system manual, or about two
second on the machine used for testing. build time for a full system
config shrank from 9.4s to 7.4s, while turning documentation off
entirely shortened the build to 7.1s.
Add `shellDryRun` to the generic stdenv and substitute it for uses of
`${stdenv.shell} -n`. The point of this layer of abstraction is to add
the flag `-O extglob`, which resolves#126344 in a more direct way.
some options have default that are best described in prose, such as
defaults that depend on the system stateVersion, defaults that are
derivations specific to the surrounding context, or those where the
expression is much longer and harder to understand than a simple text
snippet.
The first one doesn't make any sense because the directory where the
init binary resides does not contain other tools we need like
systemd-escape.
The second one doesn't make sense either because the errors are already
ignored.
This reverts commit 57961d2b83, reversing
changes made to b04f913afc.
(I.e. this reverts PR #141192.)
While well-intended, this change does unfortunately introduce very
serious regressions that are especially disruptive/noticeable on desktop
systems (e.g. users of Sway will loose their graphical session when
running "nixos-rebuild switch").
Therefore, this change has to be reverted ASAP instead of trying to fix
it in "production".
Note: An updated version should be extensively discussed, reviewed, and
tested before re-landing this change as an earlier version also had to
be reverted for the exact same issues [0].
Fix: #146727
[0]: https://github.com/NixOS/nixpkgs/pull/73871#issuecomment-559783752
By using the new extendModules function to produce the specialisations,
we avoid reimplementing the eval-config.nix logic in reverse and fix
cross compilation support for specialisations in the process.
This makes the order of operations the same in dry-activate and a "true"
activate. Also fixes the indentation I messed up and drop a useless
unlink() call (we are already unlinking that file earlier).
The previous logic failed to detect that units were socket-activated
when the socket was stopped before switch-to-configuration was run. This
commit fixes that and also starts the socket in question.
The first FIXME is removed because it doesn't make sense to use
/proc/1/exe since that points to a directory that doesn't have all tools
the activation script needs (like systemd-escape).
The second one is removed because there is already no error handling
(compare with the restart logic where the return code is checked).
This commit changes a lot more that you'd expect but it also adds a lot
of new testing code so nothing breaks in the future. The main change is
that sockets are now restarted when they change. The main reason for
the large amount of changes is the ability of activation scripts to
restart/reload units. This also works for socket-activated units now,
and honors reloadIfChanged and restartIfChanged. The two changes don't
really work without each other so they are done in the one large commit.
The test should show what works now and ensure it will continue to do so
in the future.
When cross-compiling, we can't run the runtime shell to check syntax
if it's e.g. for a different architecture. We have two options here.
We can disable syntax checking when cross compiling, but that risks
letting errors through. Or, we can do what I've done here, and change
the syntax check to use stdenv's shell instead of the runtime shell.
This requires the stdenv shell and runtime shell to be broadly
compatible, but I think that's so ingrained in Nixpkgs anyway that
it's fine. And this way we avoid conditionals that check for cross.
The primary use case is tools like sops-nix and agenix to restart units
when secrets change. There's probably other reasons to restart units as
well and a nice thing to have in general.
Since 03eaa48 added perl.withPackages, there is a canonical way to
create a perl interpreter from a list of libraries, for use in script
shebangs or generic build inputs. This method is declarative (what we
are doing is clear), produces short shebangs[1] and needs not to wrap
existing scripts.
Unfortunately there are a few exceptions that I've found:
1. Scripts that are calling perl with the -T switch. This makes perl
ignore PERL5LIB, which is what perl.withPackages is using to inform
the interpreter of the library paths.
2. Perl packages that depends on libraries in their own path. This
is not possible because perl.withPackages works at build time. The
workaround is to add `-I $out/${perl.libPrefix}` to the shebang.
In all other cases I propose to switch to perl.withPackages.
[1]: https://lwn.net/Articles/779997/
The `platform` field is pointless nesting: it's just stuff that happens
to be defined together, and that should be an implementation detail.
This instead makes `linux-kernel` and `gcc` top level fields in platform
configs. They join `rustc` there [all are optional], which was put there
and not in `platform` in anticipation of a change like this.
`linux-kernel.arch` in particular also becomes `linuxArch`, to match the
other `*Arch`es.
The next step after is this to combine the *specific* machines from
`lib.systems.platforms` with `lib.systems.examples`, keeping just the
"multiplatform" ones for defaulting.
The toplevel derivations of systems that have `networking.hostName`
set to `""` (because they want their hostname to be set by DHCP) used
to be all named
`nixos-system-unnamed-${config.system.nixos.label}`.
This makes them hard to distinguish.
A similar problem existed in NixOS tests where `vmName` is used in the
`testScript` to refer to the VM. It defaulted to the
`networking.hostName` which when set to `""` won't allow you to refer
to the machine from the `testScript`.
This commit makes the `system.name` configurable. It still defaults to:
```
if config.networking.hostName == ""
then "unnamed"
else config.networking.hostName;
```
but in case `networking.hostName` needs to be to `""` the
`system.name` can be set to a distinguishable name.
`$toplevel/system` of a system closure with `x86_64` kernel and `i686` userland should contain "x86_64-linux".
If `$toplevel/system` contains "i686-linux", the closure will be run using `qemu-system-i386`, which is able to run `x86_64` kernel on most Intel CPU, but fails on AMD.
So this fix is for a rare case of `x86_64` kernel + `i686` userland + AMD CPU
This is to facilitate units that should _only_ be manually started and
not activated when a configuration is switched to.
More specifically this is to be used by the new Nixops deploy-*
targets created in https://github.com/NixOS/nixops/pull/1245 that are
triggered by Nixops before/after switch-to-configuration is called.
This avoids a possible surprise if the user is using `nixpkgs.system`
and `nesting.children`. `nesting.children` is expected to ignore all
parent configuration so we shouldn't propagate the user-facing option
`nixpkgs.system`. To avoid doing so, we introduce a new internal
option for holding the value passed to eval-config.nix, and use that
when recursing for nesting.
The current behavior lets `system` default to
`builtins.currentSystem`. The system value specified to
`eval-config.nix` has very low precedence, so this should compose
properly.
Fixes#80806
Previously, socket units wouldn't be restarted if they were
changed. To restart the socket, the service the socket is attached
to needs to be stopped first before the socket can be restarted.
The new systemd in 19.09 gives an "Access Denied" error when doing
"systemctl daemon-reexec" on an 19.03 system. The fix is to use the
previous systemctl to signal the daemon to re-exec itself. This
ensures that users don't have to reboot when upgrading from NixOS
19.03 to 19.09.
Add support for custom device-tree files, and applying overlays to them.
This is useful for supporting non-discoverable hardware, such as sensors
attached to GPIO pins on a Raspberry Pi.
This change was only a temporary workaround and isn't required anymore,
since /etc/systemd/system/system.slice should not be present on any
recent NixOS system (which makes this change a no-op).
This reverts commit 7098b0fcdf.
The autoupgrade service defined in `system.autoUpgrade`
(`nixos/modules/installer/tools/auto-upgrade.nix`) doesn't have `su` in
its path and thus yields a warning during the `daemon-reload`.
Specifying the absolute path fixes the issue.
Fixes#47648
I think pam_lastlog is the only thing that writes to these files in
practice on a modern Linux system, so in a configuration that doesn't
use that module, we don't need to create these files.
I used tmpfiles.d instead of activation snippets to create the logs.
It's good enough for upstream and other distros; it's probably good
enough for us.
Nix 2.0 no longer uses these directories.
/run/nix/current-load was moved to /nix/var/nix/current-load in 2017
(Nix commit d7653dfc6dea076ecbe00520c6137977e0fced35). Anyway,
src/build-remote/build-remote.cc will create the current-load directory
if it doesn't exist already.
/run/nix/remote-stores seems to have been deprecated since 2014 (Nix
commit b1af336132cfe8a6e4c54912cc512f8c28d4ebf3) when the documentation
for $NIX_OTHER_STORES was removed, and support for it was dropped
entirely in 2016 (Nix commit 4494000e04122f24558e1436e66d20d89028b4bd).
or else at least the following config will fail with an evaluation error
instead of an assert
```
{
services.nixosManual.enable = false;
services.nixosManual.showManual = true;
}
```
This fixes an issue with shells like fish that are not fully POSIX
compliant. The syntax `ENV=val cmd' doesn't work properly in there.
This issue has been addressed in #45932 and #45945, however it has been
recommended to use a single shell (`stdenv.shell' which is either
`bash' or `sh') to significantly reduce the maintenance overload in the
future.
See https://github.com/NixOS/nixpkgs/issues/45897#issuecomment-417923464Fixes#45897
/cc @FRidh @xaverdh @etu
The instructions to install nixos behind a proxy were not clear. While
one could guess that setting http_proxy variables can get the install
rolling, one could end up with an installed system where the proxy
settings for the nix-daemon are not configured.
This commit updates the documentation with
1. steps to install behind a proxy
2. configure the global proxy settings so that nix-daemon can access
internet.
3. Pointers to use nesting.clone in case one has to use different proxy
settings on different networks.
When rebuilding you have to manually run `systemctl --user
daemon-reload`. It gathers all authenticated users using
`loginctl list-user` and runs `daemon-reload` for each of them.
This is a first step towards a `nixos-rebuild` which is able to reload
user units from systemd. The entire task is fairly hard, however I
consider this patch usable as it allows to restart units without running
`daemon-reload` for each authenticated user.
This allows a developer to better identify in which snippet the
failure happened. Furthermore, users seeking help will have more
information available about the failure.
Problem: Restarting (stopping) system.slice would not only stop X11 but
also most system units/services. We obviously don't want this happening
to users when they switch from 18.03 to 18.09 or nixos-unstable.
Reason: The following change in systemd:
d8e5a93382
The commit adds system.slice to the perpetual units, which means
removing the unit file and adding it to the source code. This is done so
that system.slice can't be stopped anymore but in our case it ironically
would cause this script to stop system.slice because the unit file was
removed (and an older systemd version is still running).
Related issue: https://github.com/NixOS/nixpkgs/issues/39791
Resolved the following conflicts (by carefully applying patches from the both
branches since the fork point):
pkgs/development/libraries/epoxy/default.nix
pkgs/development/libraries/gtk+/3.x.nix
pkgs/development/python-modules/asgiref/default.nix
pkgs/development/python-modules/daphne/default.nix
pkgs/os-specific/linux/systemd/default.nix
Fixes#28443
Fixed few invocations to `systemctl` to have an absolute path. Additionally add
LOCALE_ARCHIVE so that perl stops spewing warning messages.