This change creates multiple outputs for CUDA
redistributable packages.
We use a script to find out, ahead of time, the outputs each redist
package provides. From that, we are able to create multiple outputs for
supported redist packages, allowing users to specify exactly which
components they require.
Beyond the script which finds outputs ahead of time, there is some custom
code involved in making this happen. For example, the way Nixpkgs
typically handles multiple outputs involves making `dev` the default
output when available, and adding `out` to `dev`'s
`propagatedBuildInputs`.
Instead, we make each output independent of the others. If a user wants
only to include the headers found in a redist package, they can do so by
choosing the `dev` output. If they want to include dynamic libraries,
they can do so by specifying the `lib` output, or `static` for static
libraries.
To avoid breakages, we continue to provide the `out` output, which
becomes the union of all other outputs, effectively making the split
outputs opt-in.
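As a rough illustration of the opt-in behaviour described above, a consumer might select outputs as in the sketch below. The derivation is hypothetical, and `cuda_cudart` merely stands in for any redist package that has been split; whether a given package provides every one of these outputs is an assumption.

```nix
# Hypothetical consumer derivation; `cuda_cudart` stands in for any
# redist package that has been split into multiple outputs.
{ stdenv, cudaPackages }:

stdenv.mkDerivation {
  pname = "example-consumer";
  version = "0.1.0";
  src = ./.;

  buildInputs = [
    cudaPackages.cuda_cudart.dev # headers only
    cudaPackages.cuda_cudart.lib # dynamic libraries
    # cudaPackages.cuda_cudart.static # static libraries, if required
  ];
}
```

Expressions that do not opt in can keep referring to `cudaPackages.cuda_cudart.out`, which, per the description above, remains the union of the other outputs.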
The `throwIf` expression in CUDNN was evaluated eagerly and essentially prevented the use of cudaPackages without a supported version of CUDNN (even when CUDNN was not requested).
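The difference between an eager and a lazy check can be sketched in plain Nix. This is not the actual CUDNN expression: the attribute set, the `supported` flag, and the message are made up for illustration, and `throwIf` is a local stand-in for `lib.throwIf` so that the snippet evaluates without Nixpkgs.

```nix
let
  # Local stand-in for lib.throwIf.
  throwIf = cond: msg: if cond then throw msg else x: x;

  # Pretend that no CUDNN release matches the selected CUDA version.
  supported = false;

  # Eager placement: the check wraps the whole set, so evaluating any
  # attribute of the set triggers the throw.
  eager = throwIf (!supported) "unsupported CUDNN version" {
    cudnn = "cudnn";
    cudatoolkit = "cudatoolkit";
  };

  # Lazy placement: the check sits inside the one attribute that needs it.
  lazy = {
    cudnn = throwIf (!supported) "unsupported CUDNN version" "cudnn";
    cudatoolkit = "cudatoolkit";
  };
in
{
  # Evaluates to "cudatoolkit"; the CUDNN check is never forced.
  fromLazy = lazy.cudatoolkit;

  # Evaluates to false; even the unrelated attribute hits the throw.
  eagerWorks = (builtins.tryEval eager.cudatoolkit).success;
}
```

The snippet can be evaluated with `nix-instantiate --eval --strict` on a file containing it.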
This is needed for faster builds when debugging the opencv derivation,
and it is more consistent with other CUDA-enabled packages.
`-DCUDA_GENERATION` seems to expect architecture names, so we refactor
`cudaFlags` to make the configured architecture names easier to extract.
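A minimal sketch of the idea, assuming the configuration is a list of compute capabilities: the names `archNames` and `cudaCapabilities` below are illustrative and need not match the attributes that `cudaFlags` exposes after the refactor, and the table is only a small subset.

```nix
let
  # Illustrative subset of compute capability -> architecture name.
  archNames = {
    "6.1" = "Pascal";
    "7.0" = "Volta";
    "7.5" = "Turing";
    "8.6" = "Ampere";
  };

  # Hypothetical user configuration.
  cudaCapabilities = [ "7.5" "8.6" ];

  # Look up the architecture name for each configured capability.
  names = map (cap: archNames.${cap}) cudaCapabilities;
in
{
  inherit names;

  # How several names should be passed is not settled here, so this
  # sketch forwards only the first one.
  cmakeFlags = [ "-DCUDA_GENERATION=${builtins.head names}" ];
}
```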
This is a hot-fix to un-break CUDA-enabled packages (like tensorflow,
jaxlib, faiss, opencv, ...) after the gcc11->gcc12 bump. We should
probably build all downstream packages with a compatible stdenv
(such as gcc11Stdenv for cudaPackages_11), but just pointing nvcc at the
right compiler seems to do the trick.
We already used this hack for non-redist cudatoolkit. Now we use it more
consistently.
This commit also re-links cuda packages against libstdc++ from the same
"compatible" gcc, rather than the current stdenv. We have not tested
whether this is necessary, so it should be revisited in follow-up PRs.
NOTE: long-term we should make it possible to override -ccbin and use
e.g. clang.
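For a CMake-based consumer, pointing nvcc at a specific host compiler might look like the sketch below. This is not how the commit itself implements the fix. The package is hypothetical, and it relies on CMake forwarding `CMAKE_CUDA_HOST_COMPILER` to nvcc as the host compiler (`-ccbin`). Using `gcc11Stdenv` and `cuda_nvcc` here is an assumption that depends on the CUDA version in use.

```nix
# Hypothetical CMake-based package that compiles CUDA sources.
{ stdenv, cmake, cudaPackages, gcc11Stdenv }:

stdenv.mkDerivation {
  pname = "example-cuda-app";
  version = "0.1.0";
  src = ./.;

  nativeBuildInputs = [ cmake cudaPackages.cuda_nvcc ];

  # Ask CMake to hand nvcc a host compiler it accepts, instead of the
  # compiler from the default stdenv.
  cmakeFlags = [
    "-DCMAKE_CUDA_HOST_COMPILER=${gcc11Stdenv.cc}/bin/c++"
  ];
}
```

Making the host compiler a parameter of the package set, rather than a per-package flag, is what would allow overriding it with e.g. clang later.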
As cudatoolkit is currently written, 11.8 introduces a broken symlink in `include` (also named `include`) and in `lib` (named `lib64`).
This trips up some consumers, like `tensorflow-gpu`.
python27 was recently marked as insecure, breaking cudaPackages.cudatoolkit. This commit has been successfully tested against the earliest supported version, 10.0, and the latest, 11.8, on the assumption that intermediate versions work as well.