When specifying the `builder` attribute in `stdenv.mkDerivation`, this
will be effectively transformed into
builtins.derivation {
builder = stdenv.shell;
args = [ "-e" builder ];
}
This also means that `default-builder.sh` is never sourced and as a
result it's not guaranteed that `$NIX_ATTRS_SH_FILE` is set to a correct
location[1].
Also, we need to source `.attrs.sh` to source `$stdenv`. So, the
following is done now:
* If `$NIX_ATTRS_SH_FILE` points to a correct location, then use it.
Directly using `.attrs.sh` is problematic for `nix-shell(1)` usage
(see previous commit for more context), so prefer the environment
variable if possible.
* Otherwise, if `.attrs.sh` exists, then use it. See [1] for when this
can happen.
* If neither applies, it can be assumed that `__structuredAttrs` is
turned off and thus nothing needs to be done.
[1] It's possible that it doesn't exist at all - in case of Nix 2.3 or
it can point to a wrong location on older Nix versions with a bug in
`__structuredAttrs`.
For NVLink topology systems we need fabricmanager. Fabricmanager itself is
dependent on the datacenter driver set and not the regular x11 ones, it is also
tightly tied to the driver version. Furhtermore the current cudaPackages
defaults to version 11.8, which corresponds to the 520 datacenter drivers.
Future improvement should be to switch the main nvidia datacenter driver version
on the `config.cudaVersion` since these are well known from:
> https://docs.nvidia.com/deploy/cuda-compatibility/index.html#use-the-right-compat-package
This adds nixos configuration options `hardware.nvidia.datacenter.enable` and
`hardware.nvidia.datacenter.settings` (the settings configure fabricmanager)
Other interesting external links related to this commit are:
* Fabricmanager download site:
- https://developer.download.nvidia.com/compute/cuda/redist/fabricmanager/linux-x86_64/
* Data Center drivers:
- https://www.nvidia.com/Download/driverResults.aspx/193711/en-us/
Implementation specific details:
* Fabricmanager is added as a passthru package, similar to settings and
presistenced.
* Adds `use{Settings,Persistenced,Fabricmanager}` with defaults to preserve x11
expressions.
* Utilizes mkMerge to split the `hardware.nvidia` module into three comment
delimited sections:
1. Common
2. X11/xorg
3. Data Center
* Uses asserts to make the configurations mutualy exclusive.
Notes:
* Data Center Drivers are `x86_64` only.
* Reuses the `nvidia_x11` attribute in nixpkgs on enable, e.g. doesn't change it
to `nvidia_driver` and sets that to either `nvidia_x11` or `nvidia_dc`.
* Should have a helper function which is switched on `config.cudaVersion` like
`selectHighestVersion` but rather `selectCudaCompatibleVersion`.