The NVIDIA X driver uses a UNIX domain socket to pass information to
other driver components. If those components are unable to connect to this
socket, some driver features, such as G-Sync, may not work correctly. The
socket is bound to a file, with a name unique to the X server instance,
created in the directory specified by this option. Note that on Linux an
additional abstract socket (not associated with a file) is also created,
with the pathname socket serving as a fallback if connecting to the
abstract socket fails.
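For illustration, the option is set in the Device section of the X
configuration; in NixOS this could be done through
`services.xserver.deviceSection`, as in the sketch below (the
`/run/nvidia-xdriver` directory is a hypothetical choice, not necessarily
the path this change uses):
```
{
  # Sketch: append the NVIDIA sideband socket option to the generated
  # Device section. The directory shown here is illustrative only; it
  # must exist and be writable by the X server's user.
  services.xserver.deviceSection = ''
    Option "SidebandSocketPath" "/run/nvidia-xdriver"
  '';
}
```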
The default, which was in effect prior to this change, was `/var/run`.
The effect of not setting this option was that GDM X sessions
(and other non-root sessions) would see this warning in the log files:
```
(WW) NVIDIA: Failed to bind sideband socket to
(WW) NVIDIA: '/var/run/nvidia-xdriver-b4f69129' Permission denied
```
I don't see any security implications of setting this path universally,
since, according to the docs, an abstract socket is already created
regardless.
Documentation:
1. [NVIDIA X Config Options](https://download.nvidia.com/XFree86/Linux-x86_64/440.82/README/xconfigoptions.html#SidebandSocketPath)
Diagnosis:
1. [Arch Linux BBS post](https://bbs.archlinux.org/viewtopic.php?pid=1909115#p1909115)
Allow setting the owner, group and mode of the `/dev/sev-guest` device,
similar to what is already possible for `/dev/sev` through the
`hardware.cpu.amd.sev` options.
The `/dev/sev` device is available on AMD SEV hosts, e.g., to start an
AMD SEV-SNP guest. In contrast, the `/dev/sev-guest` device is only
available within SEV-SNP guests; the guest uses it, for example, to
request an attestation report. Linux has had in-tree support for SEV-SNP
guests since 5.19.
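As a usage sketch, the new options are meant to mirror the existing
`/dev/sev` ones; the exact `sevGuest` attribute name and the values below
are illustrative:
```
{
  # Existing options for /dev/sev on SEV hosts.
  hardware.cpu.amd.sev = {
    enable = true;
    user = "root";
    group = "sev";
    mode = "0660";
  };

  # Analogous options for /dev/sev-guest inside SEV-SNP guests
  # (attribute name shown here is illustrative).
  hardware.cpu.amd.sevGuest = {
    enable = true;
    user = "root";
    group = "sev-guest";
    mode = "0660";
  };
}
```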
For NVLink topology systems we need fabricmanager. Fabricmanager itself
depends on the datacenter driver set and not the regular X11 ones, and it is
also tightly tied to the driver version. Furthermore, the current cudaPackages
defaults to version 11.8, which corresponds to the 520 datacenter drivers.
A future improvement would be to switch the main NVIDIA datacenter driver
version based on `config.cudaVersion`, since these mappings are well known from:
> https://docs.nvidia.com/deploy/cuda-compatibility/index.html#use-the-right-compat-package
This adds the NixOS configuration options `hardware.nvidia.datacenter.enable` and
`hardware.nvidia.datacenter.settings` (the settings configure fabricmanager).
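A minimal usage sketch (the `settings` keys are examples of fabricmanager
configuration fields, not an authoritative list):
```
{
  hardware.nvidia.datacenter.enable = true;

  # Key/value pairs rendered into the fabricmanager configuration file;
  # the keys below are illustrative examples.
  hardware.nvidia.datacenter.settings = {
    LOG_LEVEL = 4;
    LOG_FILE_NAME = "/var/log/fabricmanager.log";
  };
}
```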
Other interesting external links related to this commit are:
* Fabricmanager download site:
- https://developer.download.nvidia.com/compute/cuda/redist/fabricmanager/linux-x86_64/
* Data Center drivers:
- https://www.nvidia.com/Download/driverResults.aspx/193711/en-us/
Implementation-specific details:
* Fabricmanager is added as a passthru package, similar to settings and
persistenced.
* Adds `use{Settings,Persistenced,Fabricmanager}` arguments with defaults that
preserve the existing X11 expressions.
* Utilizes mkMerge to split the `hardware.nvidia` module into three
comment-delimited sections (sketched after this list):
1. Common
2. X11/xorg
3. Data Center
* Uses asserts to make the configurations mutually exclusive.
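A rough sketch of that layout (the condition names and section bodies are
placeholders, not the literal module code):
```
{ config, lib, ... }:

let
  cfg = config.hardware.nvidia;
  # Hypothetical stand-in for "the X11 driver path is in use".
  useX11 = config.services.xserver.enable;
in
{
  config = lib.mkMerge [
    # Common: shared configuration plus the mutual-exclusion assert.
    {
      assertions = [
        {
          assertion = !(useX11 && cfg.datacenter.enable);
          message = "The X11 and Data Center drivers cannot be enabled together.";
        }
      ];
    }

    # X11/xorg: only when the X11 driver path is in use.
    (lib.mkIf useX11 {
      # ... existing X11 driver configuration ...
    })

    # Data Center: only when hardware.nvidia.datacenter.enable is set.
    (lib.mkIf cfg.datacenter.enable {
      # ... fabricmanager service and datacenter driver packages ...
    })
  ];
}
```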
Notes:
* Data Center Drivers are `x86_64` only.
* Reuses the `nvidia_x11` attribute in nixpkgs when enabled, i.e. it does not
rename it to `nvidia_driver` and set that to either `nvidia_x11` or `nvidia_dc`.
* There should be a helper function switched on `config.cudaVersion`, similar to
`selectHighestVersion` but rather a `selectCudaCompatibleVersion` (sketched below).
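A sketch of what such a helper could look like; the version table is
illustrative (11.8 -> 520 is taken from this commit message), and the
authoritative mapping would come from the CUDA compatibility page linked above:
```
let
  # Illustrative mapping from CUDA version to datacenter driver branch.
  branchForCuda = {
    "11.8" = "520";
    "12.0" = "525";
  };

  # Hypothetical helper, analogous to selectHighestVersion: given
  # config.cudaVersion and an attrset of datacenter drivers keyed by
  # branch, return the compatible driver (falling back to 520).
  selectCudaCompatibleVersion = cudaVersion: driversByBranch:
    driversByBranch.${branchForCuda.${cudaVersion} or "520"};
in
selectCudaCompatibleVersion
```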