Passing `-l$NIX_BUILD_CORES` improperly limits the overall system load.
For a build machine which is configured to run `$B` builds where each
build gets `total cores / B` cores (`$C`), passing `-l $C` to make will
improperly limit the load to `$C` instead of `$B * $C`.
This effect becomes quite pronounced on machines with 80 cores, with
40 simultaneous builds and a cores limit of 2. On a machine with this
configuration, Nix will run 40 builds and make will limit the overall
system load to approximately 2. A build machine with this many cores
can happily run with a load approaching 80.
A non-solution is to oversubscribe the machine, by picking a larger
`$C`. However, there is no way to divide the number of cores in a way
which fairly subdivides the available cores when `$B` is greater than
1.
There has been exploration of passing a jobserver in to the sandbox,
or sharing a jobserver between all the builds. This is one option, but
relatively complicated and only supports make. Lots of other software
uses its own implementation of `-j` and doesn't support either `-l` or
the Make jobserver.
For the case of an interactive user machine, the user should limit
overall system load using `$B`, `$C`, and optionally systemd's
cpu/network/io limiting features.
Making this change should significantly improve the utilization of our
build farm, and improve the throughput of Hydra.
The build was failing with the following error:
```
[18950/51180] SOLINK ./libvk_swiftshader.sotls_transport_interface/dtls_transport_interface.omputils.o[K.otch.oos.oKx/unbundle:default)fault)ault)
FAILED: libvk_swiftshader.so libvk_swiftshader.so.TOC
python3 "../../build/toolchain/gcc_solink_wrapper.py" --readelf="readelf" --nm="nm" --sofile="./libvk_swiftshader.so" --tocfile="./libvk_swiftshader.so.TOC" --output="./libvk_swiftshader.so" -- clang++ -shared -Wl,-soname="libvk_swiftshader.so" -Wl,-Bsymbolic -Wl,--version-script=../../third_party/swiftshader/src/Vulkan/vk_swiftshader.lds -fuse-ld=lld -Wl,--fatal-warnings -Wl,--build-id=sha1 -fPIC -Wl,-z,noexecstack -Wl,-z,relro -Wl,-z,now -Wl,--icf=all -Wl,--color-diagnostics -Wl,-mllvm,-instcombine-lower-dbg-declare=0 -flto=thin -Wl,--thinlto-jobs=all -Wl,--thinlto-cache-dir=thinlto-cache -Wl,--thinlto-cache-policy=cache_size=10\%:cache_size_bytes=40g:cache_size_files=100000 -Wl,-mllvm,-import-instr-limit=30 -fwhole-program-vtables -Wl,--no-call-graph-profile-sort -m64 -no-canonical-prefixes -Wl,-O2 -Wl,--gc-sections -rdynamic -Wl,-z,defs -Wl,--as-needed -nostdlib++ -Wl,--lto-O0 -fsanitize=cfi-vcall -fsanitize=cfi-icall -o "./libvk_swiftshader.so" @"./libvk_swiftshader.so.rsp"
ld.lld: error: unable to find library -l:libffi_pic.a
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
```
This turned out to be a regression from b6b51374fc. That change was
bad/undesirable in the first place and I only applied it to quickly fix
another build error caused by incompatible wayland-protocols header
files from a newer system version (Chromium bundles version 1.21 while
we already package 1.26).
The better fix for that wayland-protocols build issue is to pull in a
patch that is already used/tested by the Arch package [0] and seems to
originate from [1] (not sure if that patch was formally submitted yet).
Alternatives to that patch would be to (we should probably first try the
first approach if need be):
1) Build with wayland-protocols 1.21 from the system (by overriding the
Nixpkgs package).
2) Dynamically link against libffi by patching [2] to use the other
branch (`default_toolchain == "//build/toolchain/cros:target"`).
Some additional details can be found in the GitHub PR [3].
Huge thanks to Lorenz Brun for his great analysis that enabled me to fix
the build so that we can finally merge the update to Chromium M105
(which contains many important security fixes!).
[0]: a353833a5a
[1]: https://bugs.chromium.org/p/angleproject/issues/detail?id=7582#c1
[2]: https://source.chromium.org/chromium/chromium/src/+/refs/tags/105.0.5195.52:build/config/linux/libffi/BUILD.gn
[3]: https://github.com/NixOS/nixpkgs/pull/189033
Co-Authored-By: Lorenz Brun <lorenz@brun.one>
This "fixes" errors like these:
```
FAILED: obj/third_party/angle/angle_gpu_info_util/SystemInfo_vulkan.o
[...]
In file included from ../../third_party/wayland/src/src/wayland-client.h:40:
/nix/store/an42rhwn6ck2nix6caikrr4rvizknjhh-wayland-1.21.0-dev/include/wayland-client-protocol.h:1040:13: error: use of undeclared identifier 'wl_proxy_marshal_flags'
callback = wl_proxy_marshal_flags((struct wl_proxy *) wl_display,
^
[...]
/nix/store/an42rhwn6ck2nix6caikrr4rvizknjhh-wayland-1.21.0-dev/include/wayland-client-protocol.h:1392:87: error: use of undeclared identifier 'WL_MARSHAL_FLAG_DESTROY'
WL_SHM_POOL_DESTROY, NULL, wl_proxy_get_version((struct wl_proxy *) wl_shm_pool), WL_MARSHAL_FLAG_DESTROY);
^
[...]
fatal error: too many errors emitted, stopping now [-ferror-limit=]
```
At least for now (until Chromium updates their bundled Wayland version) it
seems best to use the bundled headers/versions to avoid version incompatibility
issues (we should hopefully be able to drop use_system_wayland_scanner though).