There is a nasty race condition in the cups tests.
To understand what is going on, one must first note that
printers are installed in the vms with ensure-printers.service,
which is started as part of multi-user.target.
ensure-printers.service in turn triggers a start of
cups.service as it needs to connect to the local cups daemon.
This is what happens when the test runs:
1 the test waits for cups.socket or cups.service to start up
(subtest "Make sure that cups is up on both sides...")
2 after cups.service started
(it starts even in the "socket" case,
triggered by ensure-printers.service),
ensure-printers.service is started
3 the test tries to connect to the cups daemons via curl
(subtest "HTTP server is available too")
4 the test verifies the required printers are installed
("lpstat -a" called by subtest "LP status checks")
Usually, 3 needs some time, so ensure-printers.service
already installed all printers that are required by 4.
But if 3 is too fast, or if ensure-printers.service is too slow,
4 fails to find the printers it is looking for.
One can provoke the problem by adding
> systemd.services.ensure-printers.serviceConfig.ExecStartPre = "/run/current-system/sw/bin/sleep 10";
to the `nodes.client` configuration.
The commit at hand fixes the problem by changing 1:
Instead of waiting for cups,
it now waits for ensure-printers.service
(which in turn waits for cups.service and cups.socket).
This is also in accordance with the
subtest description in the code that promises to
"Make sure that cups is up [...] and printers are set up".
Prior to this contribution, every boot with a default configuration was
considered `ConditionFirstBoot=true` by systemd, since /etc/machine-id
was not commited to disk.
This also extends the systemd with a check for subsequent boots not
being considered first boots.
Adds explicit network configuration for etcd service
Waits for etcd to be fully healthy before running tests
Makes endpoint configuration explicit in etcdctl commands
Migrate to pkgs/by-name,
and update the test so that it passes for all versions
This version is added as EOL, since NetBox 4.1 is out,
but it might be still useful in case of an upgrade issue.
Modernize it. This allows the test to be extended, and pkgs to be
reused (later) to speed up evaluations a bit.
I believe this also makes it run on darwin hosts, but my linux-builder's
disk is too small to fit the massive closure of this test.
(cherry picked from commit 1396a03bee18a0993a4f3e97fda8938ff61c2918)
Modernize it. This allows the test to be extended, and pkgs to be
reused (later) to speed up evaluations a bit.
I believe this also makes it run on darwin hosts, but my linux-builder's
disk is too small to fit the massive closure of this test.
(cherry picked from commit 8c06d2cf667106dd440e7c140e70051dc1c321cb)
- remove dead code
- pass around a lot less redundant stuff
- add a timeout to the read so it can actually fail when characters are dropped
- run the input reader in systemd-cat so we can see the errors on console
This does not actually fix the flakiness in the tests, but it should make it
easier to find.
Before this change, the hash of the etc metadata image was included in
the mount unit that's responsible for mounting this metadata image in the
initrd.
And because this metadata image changes with every change to the etc
contents, the initrd would be rebuild every time as well.
This can lead to a lot of rebuilds (especially when revision info is
included in /etc/os-release) and all these initrd archives use up a lot of
space on the ESP.
With this change, we instead include a symlink to the metadata image in the
top-level directory, in the same way as we already do for things like init and
prepare-root, and we deduce the store path from the init= kernel parameter,
in the same way as we already do to find the path to init and prepare-root.
Doing so avoids rebuilding the initrd all the time.
The URL scheme for downloading plugins has changed a long time ago and
the used URL is dead. Gerrit only throws an error since it can't load
the plugin but it continues to boot. However, instead of maintaining
URLs to 3rdparty plugins, which end up dead anyway, just drop it. The
test should cover Gerrit and not 3rd party plugins.
Also, while on it, drop the setting `plugins.allowRemoteAdmin = true`
since it's not needed.
Signed-off-by: Felix Singer <felixsinger@posteo.net>
In order to emulate the `nixos-rebuild switch` that is called if the EC2
user data is a nix expression, run the switch-to-configuration script
for the current running config.
In order to avoid unnecessary CI errors regarding unsupported
architectures, limit the target architectures to the supported ones by
the Redmine package.
Signed-off-by: Felix Singer <felixsinger@posteo.net>
Tests were not changed according to the new prometheus firewall port
settings.
With this change we now check that the port is not accessible form the
outside, while everything still works from localhost.
When installing NixOS on a machine with Windows, the "easiest" solution
to dual-boot is re-using the existing EFI System Partition (ESP), which
allows systemd-boot to detect Windows automatically.
However, if there are multiple ESPs, maybe even on multiple disks,
systemd-boot is unable to detect the other OSes, and you either have to
use Grub and os-prober, or do a tedious manual configuration as
described in the wiki:
https://wiki.nixos.org/w/index.php?title=Dual_Booting_NixOS_and_Windows&redirect=no#EFI_with_multiple_disks
This commit automates and documents this properly so only a single line
like
boot.loader.systemd-boot.windows."10".efiDeviceHandle = "HD0c2";
is required.
In the future, we might want to try automatically detecting this
during installation, but finding the correct device handle while the
kernel is running is tricky.
- use upstream service and scripts
- switch to integrated-vtysh-config, abandon per-daemon config
- use always daemon names in options (e.g. ospf -> ospfd)
- zebra, mgmtd and staticd are always enabled
- abandon vtyListenAddress, vtyListenPort options; use
just "extraOptions" or "options" instead, respectively
- extend test to test staticd
- update release-notes
- pkgs.servers.frr: fix sbindir and remove FHS PATH
- introduce services.frr.openFilesLimit option
There's no point for the intermediate `getPath` function calling
`getLuaPath` with the "lua" argument.
There's also no other nginx test this copies code from.
We always call `getLuaPath` with "lua", so constant-propagate it in.
Also, camel-case `lualibs` to `luaLibs.`
When 757a455dde refactored the zones to go
from a list to a map, this broke the tests/common/resolver helper.
reproduction:
```
let
pkgs = import <nixpkgs> {};
testConfig = {
name = "resolver-repro";
nodes = {
acme = { nodes, ... }: {
imports = [ (pkgs.path + /nixos/tests/common/acme/server) ];
};
};
testScript = ''
'';
};
in pkgs.nixosTest testConfig
```
The test currently fails because we attempt to switch to a NixOS
configuration that is _very_ different from the one we are switching
from (e.g. the new configuration has an entirely empty /etc/fstab,
causing switch-to-configuration to want to start unmounting all
filesystems defined in the old configuration).
Explicitly waiting for influxdb2 in the test, instead of fixing the
underlying issue[1], was hiding a real bug[2]. Now that the bug has been
fixed we can remove the wait code.
[1] Commit 732d36522f ("nixos/influxdb2: wait until service is ready")
[2] https://github.com/NixOS/nixpkgs/issues/317017 ("Scrutiny tries to start before influxdb has started")
Added a decorator function to handle any
exceptions generated by test functions and
apply some retry logic with backoff.
Also wrapped the unwrapped add-a curl which
was causing some fails.
In the next release of Pebble, the certificate
subject is no longer populated with a useful domain name.
This change will refactor the fullchain validation assertions
to avoid checking the subject line.
I have no idea what this escape sequence even is, but it breaks the nix parser with cryptic errors if not used in a comment.
A friend let me know MacOS is prone to input weird spaces, not sure if that is the source.
Candidates were located and created with:
chr="$(echo -e '\xc2\xa0')"; rg -F "$chr" -l | xe sd -F "$chr" " "
There are some examples left, most being example output from `tree` in various markdown documents, some patches which we can't really touch, and `pkgs/tools/nix/nixos-render-docs/src/tests/test_commonmark.py` which I'm not sure if should be addressed
The issue was that the old test-case used `/tmp` to share data. Using
`JoinsNamespaceOf=` wasn't a real workaround since the private `/tmp` is
recreated when a service gets stopped/started which is the case here, so
the wals were still lost.
To keep the test building with `PrivateTmp=yes`, create a dedicated
directory in `/var/cache` with tmpfiles and allow the hardened
`postgresql.service` to access it via `ReadWritePaths`.
Just noticed that I apparently disabled this test while restructuring
the Nextcloud tests[1] effectively disabling the test.
This patch re-adds it and adjusts the code accordingly.
I also noticed that the old check whether the cache is actually used
(`test "[]" = "$(redis-cli --json KEYS "*")"`) was broken because the
`nextcloud.fail()` hid the fact that the `redis-cli` invocation was
failing due to a missing password. Fixed the subtest accordingly.
[1] 0b31ada92b
This only ever worked for the session, not for the greeter. Writing the information out to a file should be more consistent.
To make sure that this works, and continues working, for the greeter & session, also add a new VM test.