Commit Graph

68 Commits

Author SHA1 Message Date
K900
c8fd06c3b2 nixos/tests/acme: wait for server to run before starting the target
This is really an ordering issue in the ACME module itself,
but while we think of how to fix it, this should at least unflake
the tests.
2024-11-09 09:40:38 +03:00
K900
ee6df93fe2 nixos/tests/acme: explicitly start the targets we wait for
This should address the other source of flakiness in the test.
2024-11-09 01:57:35 +03:00
Florian Klink
3ae3a4fb69
nixos/tests/acme: Better error handling (#250260) 2024-10-03 11:41:53 +03:00
Lucas Savva
ffc9bf1882 nixos/tests/acme: Better error handling
Added a decorator function to handle any
exceptions generated by test functions and
apply some retry logic with backoff.

Also wrapped the unwrapped add-a curl which
was causing some fails.
2024-10-02 23:07:37 +01:00
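
The decorator described in the commit above (catch exceptions from test functions, retry with backoff) could look roughly like the following. This is a minimal sketch, not the code from the commit: the names `retry_with_backoff` and `flaky_check`, the parameters, and the delay values are all assumptions made for illustration.

```python
import functools
import time


def retry_with_backoff(attempts=5, initial_delay=0.1, factor=2.0):
    """Retry the wrapped function on any exception, sleeping longer each time."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    # Out of attempts: re-raise the last error unchanged.
                    if attempt == attempts:
                        raise
                    time.sleep(delay)
                    delay *= factor  # exponential backoff

        return wrapper

    return decorator


# Hypothetical flaky check, used only to exercise the decorator.
calls = {"count": 0}


@retry_with_backoff(attempts=4, initial_delay=0.01)
def flaky_check():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("server not ready yet")
    return "ok"


result = flaky_check()
```

Re-raising on the final attempt keeps the original traceback visible in the test log instead of hiding it behind a generic "retries exhausted" error.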
Lucas Savva
b2758880b3 nixos/tests/acme: Fix fullchain validation
In the next release of Pebble, the certificate
subject is no longer populated with a useful domain name.
This change will refactor the fullchain validation assertions
to avoid checking the subject line.
2024-10-02 23:02:51 +01:00
r-vdp
f09a62f122
acme: fix test after fc35704bc8 2024-08-12 14:04:09 +02:00
Oliver Schmidt
26bae04567 tests/acme: check consistent account hash for legacy settings
To allow migration from 23.11 to 24.05 without triggering re-registrations,
the account hashing behaviour of the previous release can be retained by setting
`security.acme.defaults.server` to `null`.

We had better also check for hash consistency with that setting to avoid unexpected
account hash changes again.
2024-06-20 17:22:05 +02:00
Martin Weinelt
121ba21838
Merge pull request #286999 from SuperSandro2000/acme-check-account-hash
tests/acme: check consistent account hash
2024-06-07 23:57:20 +02:00
Franz Pletz
b7d060d10d
nixos/nginx: fix reference to acme cert hostname
The change introduced in #308303 refers to the virtualHosts attrset
key which can be any string. The servername is the actual primary
hostname used for the certificate.

This fixes use cases like:

    services.nginx.virtualHosts.foobar.serverName = "my.fqdn.org";
2024-05-10 01:36:34 +02:00
Gabriella Gonzalez
b8698cd8d6
macOS support for NixOS tests (#282401)
Closes #193336
Closes #261694
Related to #108984

The goal here was to get the following flake to build and run on
`aarch64-darwin`:

```nix
{ inputs.nixpkgs.url = <this branch>;

  outputs = { nixpkgs, ... }: {
    checks.aarch64-darwin.default =
      nixpkgs.legacyPackages.aarch64-darwin.nixosTest {
        name = "test";

        nodes.machine = { };

        testScript = "";
      };
  };
}
```

… and after this change it does.  There's no longer a need for the
user to set `nodes.*.nixpkgs.pkgs` or
`nodes.*.virtualisation.host.pkgs` as the correct values are inferred
from the host system.
2024-03-02 06:33:14 +01:00
Sandro Jäckel
f97594c700
tests/acme: drop unused variables 2024-02-07 15:00:03 +01:00
Sandro Jäckel
3f20904498
tests/acme: check consistent account hash 2024-02-07 14:59:50 +01:00
Jade Lovelace
274466d1fc nixos/tests: fix acme under network-online dep fix 2024-01-18 16:28:42 -08:00
datafoo
ade414b6c7 nixos/acme: rename option credentialsFile to environmentFile 2023-09-11 16:34:20 +00:00
Oliver Schmidt
e362fe9c6d security/acme: limit concurrent certificate generations
fixes #232505

Implements the new option `security.acme.maxConcurrentRenewals` to limit
the number of certificate generation (or renewal) jobs that can run in
parallel. This avoids overloading the system resources with many
certificates or running into acme registry rate limits and network
timeouts.

Architecture considerations:
- simplicity, lightweight: Concerns have been voiced about making this
  already rather complex module even more convoluted. Additionally,
  locking solutions shall not significantly increase the runtime cost and
  footprint of individual job runs.
  To accommodate these concerns, this solution is implemented purely in
  Nix, bash, and using the light-weight `flock` util. To reduce
  complexity, jobs are already assigned their lockfile slot at system
  build time instead of dynamic locking and retrying. This comes at the
  cost of not always maxing out the permitted concurrency at runtime.
- no stale locks: Limiting concurrency via a locking mechanism is usually
  approached with semaphores. Unfortunately, both SysV and
  POSIX semaphores are *not* released when the process currently locking
  them is SIGKILLed. This poses the danger of stale locks staying around
  and certificate renewal being blocked from running altogether.
  `flock` locks though are released when the process holding the file
  descriptor of the lock file is KILLed or terminated.
- lockfile generation: Lock files could either be created at build time
  in the Nix store or at script runtime in an idempotent manner.
  While the latter would be simpler to achieve, we might exceed the number
  of permitted concurrent runs during a system switch: Already running
  jobs are still locked on the existing lock files, while jobs started
  after the system switch will acquire locks on freshly created files,
  not being blocked by the still running services.
  For this reason, locks are generated and managed at runtime in the
  shared state directory `/var/lib/locks/`.
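
The two ideas in the list above (a lock slot fixed per job ahead of time, and `flock` locks that vanish with the file descriptor) can be sketched in Python as below. This is an illustration under stated assumptions, not the module's implementation, which is written in Nix and bash: `assign_slots` and `try_lock` are hypothetical helpers, the modulo slot assignment is one possible scheme, and a non-blocking lock is used here only so the effect is observable in a single process.

```python
import fcntl
import os
import tempfile


def assign_slots(cert_names, max_concurrent):
    """Statically map each certificate to a lock slot (the 'build time' step)."""
    return {name: i % max_concurrent for i, name in enumerate(sorted(cert_names))}


def try_lock(path):
    """Open the lock file and take a non-blocking exclusive flock on it.

    Returns the file descriptor on success, or None if the slot is busy.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


slots = assign_slots(["a.example", "b.example", "c.example"], max_concurrent=2)

# A temporary directory stands in for the real lock directory.
lock_dir = tempfile.mkdtemp()
lock_path = os.path.join(lock_dir, f"{slots['a.example']}.lock")

first = try_lock(lock_path)   # slot is free, so the lock is taken
second = try_lock(lock_path)  # same slot is still held, so this reports busy
os.close(first)               # closing the descriptor releases the lock
third = try_lock(lock_path)   # the slot is free again
os.close(third)
```

With three certificates and two slots, two jobs share a slot and therefore never run at the same time even when the other slot is idle, which is the cost the commit message describes as not always reaching the permitted concurrency.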

nixos/security/acme: move locks to /run

also, move over permission and directory management to systemd-tmpfiles

nixos/security/acme: fix some linter remarks in my code

there are some remarks left for existing code, not touching that

nixos/security/acme: redesign script locking flow

- get rid of subshell
- provide function for wrapping scripts in a locked environment

nixos/acme: improve visibility of blocking on locks

nixos/acme: add smoke test for concurrency limitation

heavily inspired by m1cr0man

nixos/acme: release notes entry on new concurrency limits

nixos/acme: cleanup, clarifications
2023-09-09 20:13:18 +02:00
figsoda
202699c918 nixos/tests: fix typos 2023-05-19 22:31:04 -04:00
Ryan Lahfa
0b0726ae0b
Merge pull request #205983 from m1cr0man/acme-test-fix
nixos/acme: Increase number of retries in testing
2022-12-22 02:19:19 +01:00
figsoda
6bb0dbf91f nixos: fix typos 2022-12-17 19:31:14 -05:00
Lucas Savva
c9a5bf4a38
nixos/acme: Increase number of retries in testing
Helps to avoid failures in Hydra when the host server starts
the web server too slowly.
2022-12-17 21:12:13 +00:00
Lucas Savva
657ecbca0e nixos/acme: Make account creds check more robust
Fixes #190493

Check if an actual key file exists. This does not
completely cover the work accountHash does to ensure
that a new account is registered when account
related options are changed.
2022-10-06 10:30:24 -04:00
Lucas Savva
39796cad46 nixos/acme: Fix cert renewal with built in webserver
Fixes #191794

Lego threw a permission denied error binding to port 80.
AmbientCapabilities with CAP_NET_BIND_SERVICE was required.
Also added a test for this.
2022-10-06 10:30:24 -04:00
Robert Hensing
b7ffe44469 nixosTests.acme: Use module system based runner 2022-09-21 10:55:12 +01:00
Robert Hensing
152736d39e nixosTests.acme: Fix typechecking, avoiding type reassignment 2022-06-17 11:45:19 +02:00
Winter
b52607f43b nixos/acme: ensure web servers using certs can access them 2022-01-08 15:05:34 -05:00
Lucas Savva
46cd06eb9d
nixos/acme: Add test for caddy
This test is technically broken since reloading caddy
does not seem to load new certs. This needs to be fixed
in caddy.
2021-12-26 21:12:40 +00:00
Lucas Savva
65f1b8c6ae
nixos/acme: Add test for lego's built-in web server
In the process I also found that the CapabilityBoundingSet
was restricting the service from listening on port 80, and
the AmbientCapabilities was ineffective. Fixed appropriately.
2021-12-26 16:49:59 +00:00
Lucas Savva
41fb8d71ab
nixos/acme: Add useRoot option 2021-12-26 16:49:57 +00:00
Lucas Savva
377c6bcefc
nixos/acme: Add defaults and inheritDefaults option
Allows configuring many default settings for certificates,
all of which can still be overridden on a per-cert basis.
Some options have been moved into .defaults from security.acme,
namely email, server, validMinDays and renewInterval. These
changes will not break existing configurations thanks to
mkChangedOptionModule.

With this, it is also now possible to configure DNS-01 with
web servers whose virtualHosts utilise enableACME. The only
requirement is you set `acmeRoot = null` for each vhost.

The test suite has been revamped to cover these additions
and also to generally make it easier to maintain. Test config
for apache and nginx has been fully standardised, and it
is now much easier to add a new web server if it follows
the same configuration patterns as those two. I have also
optimised the use of switch-to-configuration which should
speed up testing.
2021-12-26 16:44:10 +00:00
Lucas Savva
a7f0001328
nixos/acme: Check for revoked certificates
Closes #129838

It is possible for the CA to revoke a cert that has not yet
expired. We must run lego to validate this before expiration,
but we must still ignore failures on unexpired certs to retain
compatibility with #85794

Also changed domainHash logic such that a renewal will only
be attempted at all if domains are unchanged, and do a full
run otherwise. Resolves #147540 but will be partially
reverted when go-acme/lego#1532 is resolved + available.
2021-12-26 16:44:09 +00:00
Lucas Savva
eba6713e8f
nixos/tests/acme: test access to files outside /var/lib/acme in postRun 2021-07-06 15:16:24 +02:00
Martin Weinelt
dc940ecdb3
Merge pull request #121750 from m1cr0man/master
nixos/acme: Ensure certs are always protected
2021-07-06 15:10:54 +02:00
Mewp
b00bcf21ab nixos/acme: Remove an incorrect assertion from tests
Commit 3a2e0c36e7 has removed
`--reuse-key` from default renew options, yet the tests still expected
keys not to change. This assertion is now removed, as they are supposed
to change on each renew/change.
2021-06-05 10:38:46 +02:00
Lucas Savva
083aba4f83 nixos/acme: Ensure certs are always protected
As per #121293, I ensured the UMask is set correctly
and removed any unnecessary chmod/chown/chgrp commands.
The test suite already partially covered permissions
checking but I added an extra check for the selfsigned
cert permissions.
2021-05-15 12:41:33 +01:00
Robert Hensing
06b070ffe7
nixosTests.acme: lint 2021-05-09 02:28:04 +02:00
Lucas Savva
2dd7973751 nixos/acme: Add permissions tests 2021-03-15 19:25:49 +00:00
Lucas Savva
920a3f5a9d nixos/acme: Fix webroot issues
With the UMask set to 0023, the webroot
created by the mkdir -p command
could end up unreadable if the web server
changes, as surfaced by the test suite in #114751.
On top of this, the following commands
to chown the webroot + subdirectories were
mostly unnecessary. I stripped it back to
only fix the deepest part of the directory,
resolving #115976, and reintroduced a
human readable error message.
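
To make the UMask arithmetic concrete: a directory requested with the default mode 0777 under a umask of 0023 ends up as 0777 & ~0023 = 0754, so users outside the owner and group get read permission but no execute (traverse) bit. The sketch below reproduces this in a temporary directory; the path components are illustrative assumptions, not the module's real webroot.

```python
import os
import stat
import tempfile

previous = os.umask(0o023)  # the UMask value named in the commit message
try:
    base = tempfile.mkdtemp()
    # Hypothetical webroot layout, chosen only for the demonstration.
    webroot = os.path.join(base, "webroot", ".well-known", "acme-challenge")
    os.makedirs(webroot)  # analogous to `mkdir -p`
    mode = stat.S_IMODE(os.stat(webroot).st_mode) & 0o777
finally:
    os.umask(previous)  # restore the caller's umask

print(oct(mode))
```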
2021-03-15 01:41:40 +00:00
Lucas Savva
bfe07e2179 nixos/acme: fix test config 2020-12-28 00:35:46 +00:00
Lucas Savva
85769a8cd8 nixos/acme: prevent mass account creation
Closes #106565
When generating multiple certificates which all
share the same server + email, lego will attempt
to create an account multiple times. By adding an
account creation target, certificates which share
an account will wait for one service (chosen at
config build time) to complete first.
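
One way to picture the selection described above: group certificates by the (server, email) pair that identifies an account, pick one member of each group deterministically, and make the rest wait on it. The following is a sketch of that idea only; the real logic lives in the Nix module, and `pick_account_leaders`, the dictionary layout, and the example domains are hypothetical.

```python
from collections import defaultdict


def pick_account_leaders(certs):
    """Group certs by (server, email) and pick one leader per group.

    Returns a dict mapping each non-leader cert name to the leader it waits on.
    """
    groups = defaultdict(list)
    for name, conf in certs.items():
        groups[(conf["server"], conf["email"])].append(name)

    waits_on = {}
    for members in groups.values():
        leader, *followers = sorted(members)  # deterministic choice
        for follower in followers:
            waits_on[follower] = leader
    return waits_on


# Illustrative data; the server URL and addresses are placeholders.
certs = {
    "a.example": {"server": "https://acme.test/dir", "email": "admin@example.test"},
    "b.example": {"server": "https://acme.test/dir", "email": "admin@example.test"},
    "c.example": {"server": "https://acme.test/dir", "email": "other@example.test"},
}
waits_on = pick_account_leaders(certs)
```

Here `b.example` waits on `a.example` because they share an account, while `c.example` uses a different email and registers on its own.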
2020-12-28 00:35:18 +00:00
Lucas Savva
1edd91ca09
nixos/acme: Fix ocspMustStaple option and add test
Some of the testing setup for OCSP checking was wrong and
has been fixed too.
2020-10-07 00:18:13 +01:00
Lucas Savva
34b5c5c1a4
nixos/acme: More features and fixes
- Allow for key reuse when domains are the only thing that
  were changed.
- Fixed systemd service failure when preliminarySelfsigned
  was set to false
2020-09-06 01:28:19 +01:00
Lucas Savva
f57824c915
nixos/acme: Update docs, use assert more effectively 2020-09-05 01:06:29 +01:00
Lucas Savva
67a5d660cb
nixos/acme: Run postRun script as root 2020-09-04 19:34:10 +01:00
Lucas Savva
1b6cfd9796
nixos/acme: Fix race condition, dont be smart with keys
Attempting to reuse keys on a basis different to the cert (AKA,
storing the key in a directory with a hashed name different to
the cert it is associated with) was ineffective since when
"lego run" is used it will ALWAYS generate a new key. This causes
issues when you revert changes since your "reused" key will not
be the one associated with the old cert. As such, I tore out the
whole keyDir implementation.

As for the race condition, checking the mtime of the cert file
was not sufficient to detect changes. In testing, selfsigned
and full certs could be generated/installed within 1 second of
each other. cmp is now used instead.
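
The race can be reproduced in miniature: two files with different contents but the same modification time look unchanged to an mtime check, while a byte-for-byte comparison tells them apart. The sketch below uses Python's `filecmp.cmp` with `shallow=False` as a stand-in for the `cmp` utility; the file names and contents are placeholders, not real certificates.

```python
import filecmp
import os
import tempfile

workdir = tempfile.mkdtemp()
old_cert = os.path.join(workdir, "old.pem")
new_cert = os.path.join(workdir, "new.pem")

with open(old_cert, "w") as f:
    f.write("selfsigned placeholder\n")
with open(new_cert, "w") as f:
    f.write("signed-cert placeholder\n")

# Force identical timestamps, as if both files were written within the
# same clock tick.
stamp = 1_600_000_000
os.utime(old_cert, (stamp, stamp))
os.utime(new_cert, (stamp, stamp))

same_mtime = os.stat(old_cert).st_mtime == os.stat(new_cert).st_mtime
# shallow=False compares file contents instead of trusting os.stat() data.
same_content = filecmp.cmp(old_cert, new_cert, shallow=False)
```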

Also, I removed the nginx/httpd reload waiters in favour of
simple retry logic for the curl-based tests
2020-09-04 01:09:43 +01:00
Lucas Savva
61dbf4bf89
nixos/acme: Add proper nginx/httpd config reload checks
Testing of certs failed randomly when the web server was still
returning old certs even after the reload was "complete". This was
because the reload commands send process signals and do not wait
for the worker processes to restart. This commit adds log watchers
which wait for the worker processes to be restarted.
2020-09-02 19:25:30 +01:00
Lucas Savva
982c5a1f0e
nixos/acme: Restructure module
- Use an acme user and group, allow group override only
- Use hashes to determine when certs actually need to regenerate
- Avoid running lego more than necessary
- Harden permissions
- Support "systemctl clean" for cert regeneration
- Support reuse of keys between some configuration changes
- Permissions fix services solves for previously root owned certs
- Add a note about multiple account creation and emails
- Migrate extraDomains to a list
- Deprecate user option
- Use minica for self-signed certs
- Rewrite all tests

I thought of a few more cases where things may go wrong,
and added tests to cover them. In particular, the web server
reload services were depending on the target - which stays alive,
meaning that the renewal timer wouldn't be triggering a reload
and old certs would stay on the web servers.

I encountered some problems ensuring that the reload took place
without accidentally triggering it as part of the test. The sync
commands I added ended up being essential and I'm not sure why,
it seems like either node.succeed ends too early or there's an
oddity of the vm's filesystem I'm not aware of.

- Fix duplicate systemd rules on reload services

Since useACMEHost is not unique to every vhost, if one cert
was reused many times it would create duplicate entries in
${server}-config-reload.service for wants, before and
ConditionPathExists
2020-09-02 19:22:43 +01:00
Arian van Putten
0952336d1d nixos/acme: Move regression test into acme.nix 2020-06-15 11:05:00 +02:00
Arian van Putten
681cc105ce nixos/acme: Make sure nginx is running before certs are requested
This fixes https://github.com/NixOS/nixpkgs/issues/81842

We should probably also fix this for Apache, which recently also learned
to use ACME.
2020-06-15 11:04:59 +02:00
Arian van Putten
61f834833b nixos/acme: turn around test probes' dependencies
Reads a bit more naturally, and now the changes to the
acme-${cert}.service actually reflect what would be needed were you to
do the same in production.

e.g.  "for dns-01, your service that needs the cert needs to pull in the
cert"
2020-06-15 11:02:30 +02:00
Emily
bfffee9364 nixos/tests/acme: set maintainers to acme team 2020-04-20 01:39:31 +01:00
Emily
695fd78ac4 nixos/tests/acme: use CAP_NET_BIND_SERVICE 2020-04-18 05:15:47 +01:00