doc: clarification of dependencies related attributes

Signed-off-by: Pamplemousse <xav.maso@gmail.com>
This commit is contained in:
Pamplemousse 2021-08-31 16:43:52 -07:00
parent 3e31a8d02f
commit f48c175fb2
2 changed files with 25 additions and 12 deletions

View File

@ -158,9 +158,9 @@ One would think that `localSystem` and `crossSystem` overlap horribly with the t
### Implementation of dependencies {#ssec-cross-dependency-implementation}
The categories of dependencies developed in [](#ssec-cross-dependency-categorization) are specified as lists of derivations given to `mkDerivation`, as documented in [](#ssec-stdenv-dependencies). In short, each list of dependencies for "host → target" of "foo → bar" is called `depsFooBar`, with exceptions for backwards compatibility that `depsBuildHost` is instead called `nativeBuildInputs` and `depsHostTarget` is instead called `buildInputs`. Nixpkgs is now structured so that each `depsFooBar` is automatically taken from `pkgsFooBar`. (These `pkgsFooBar`s are quite new, so there is no special case for `nativeBuildInputs` and `buildInputs`.) For example, `pkgsBuildHost.gcc` should be used at build-time, while `pkgsHostTarget.gcc` should be used at run-time.
The categories of dependencies developed in [](#ssec-cross-dependency-categorization) are specified as lists of derivations given to `mkDerivation`, as documented in [](#ssec-stdenv-dependencies). In short, each list of dependencies for "host → target" is called `deps<host><target>` (where `host`, and `target` values are either `build`, `host`, or `target`), with exceptions for backwards compatibility that `depsBuildHost` is instead called `nativeBuildInputs` and `depsHostTarget` is instead called `buildInputs`. Nixpkgs is now structured so that each `deps<host><target>` is automatically taken from `pkgs<host><target>`. (These `pkgs<host><target>`s are quite new, so there is no special case for `nativeBuildInputs` and `buildInputs`.) For example, `pkgsBuildHost.gcc` should be used at build-time, while `pkgsHostTarget.gcc` should be used at run-time.
Now, for most of Nixpkgs's history, there were no `pkgsFooBar` attributes, and most packages have not been refactored to use it explicitly. Prior to those, there were just `buildPackages`, `pkgs`, and `targetPackages`. Those are now redefined as aliases to `pkgsBuildHost`, `pkgsHostTarget`, and `pkgsTargetTarget`. It is acceptable, even recommended, to use them for libraries to show that the host platform is irrelevant.
Now, for most of Nixpkgs's history, there were no `pkgs<host><target>` attributes, and most packages have not been refactored to use it explicitly. Prior to those, there were just `buildPackages`, `pkgs`, and `targetPackages`. Those are now redefined as aliases to `pkgsBuildHost`, `pkgsHostTarget`, and `pkgsTargetTarget`. It is acceptable, even recommended, to use them for libraries to show that the host platform is irrelevant.
But before that, there was just `pkgs`, even though both `buildInputs` and `nativeBuildInputs` existed. \[Cross barely worked, and those were implemented with some hacks on `mkDerivation` to override dependencies.\] What this means is the vast majority of packages do not use any explicit package set to populate their dependencies, just using whatever `callPackage` gives them even if they do correctly sort their dependencies into the multiple lists described above. And indeed, asking that users both sort their dependencies, _and_ take them from the right attribute set, is both too onerous and redundant, so the recommended approach (for now) is to continue just categorizing by list and not using an explicit package set.

View File

@ -116,15 +116,27 @@ On Linux, `stdenv` also includes the `patchelf` utility.
## Specifying dependencies {#ssec-stdenv-dependencies}
As described in the Nix manual, almost any `*.drv` store path in a derivations attribute set will induce a dependency on that derivation. `mkDerivation`, however, takes a few attributes intended to, between them, include all the dependencies of a package. This is done both for structure and consistency, but also so that certain other setup can take place. For example, certain dependencies need their bin directories added to the `PATH`. That is built-in, but other setup is done via a pluggable mechanism that works in conjunction with these dependency attributes. See [](#ssec-setup-hooks) for details.
As described in the Nix manual, almost any `*.drv` store path in a derivations attribute set will induce a dependency on that derivation. `mkDerivation`, however, takes a few attributes intended to include all the dependencies of a package. This is done both for structure and consistency, but also so that certain other setup can take place. For example, certain dependencies need their bin directories added to the `PATH`. That is built-in, but other setup is done via a pluggable mechanism that works in conjunction with these dependency attributes. See [](#ssec-setup-hooks) for details.
Dependencies can be broken down along three axes: their host and target platforms relative to the new derivations, and whether they are propagated. The platform distinctions are motivated by cross compilation; see [](#chap-cross) for exactly what each platform means. [^footnote-stdenv-ignored-build-platform] But even if one is not cross compiling, the platforms imply whether or not the dependency is needed at run-time or build-time, a concept that makes perfect sense outside of cross compilation. By default, the run-time/build-time distinction is just a hint for mental clarity, but with `strictDeps` set it is mostly enforced even in the native case.
The extension of `PATH` with dependencies, alluded to above, proceeds according to the relative platforms alone. The process is carried out only for dependencies whose host platform matches the new derivations build platform i.e. dependencies which run on the platform where the new derivation will be built. [^footnote-stdenv-native-dependencies-in-path] For each dependency \<dep\> of those dependencies, `dep/bin`, if present, is added to the `PATH` environment variable.
The dependency is propagated when it forces some of its other-transitive (non-immediate) downstream dependencies to also take it on as an immediate dependency. Nix itself already takes a packages transitive dependencies into account, but this propagation ensures nixpkgs-specific infrastructure like setup hooks (mentioned above) also are run as if the propagated dependency.
A dependency is said to be **propagated** when some of its other-transitive (non-immediate) downstream dependencies also need it as an immediate dependency.
[^footnote-stdenv-propagated-dependencies]
It is important to note that dependencies are not necessarily propagated as the same sort of dependency that they were before, but rather as the corresponding sort so that the platform rules still line up. The exact rules for dependency propagation can be given by assigning to each dependency two integers based one how its host and target platforms are offset from the depending derivations platforms. Those offsets are given below in the descriptions of each dependency list attribute. Algorithmically, we traverse propagated inputs, accumulating every propagated dependencys propagated dependencies and adjusting them to account for the “shift in perspective” described by the current dependencys platform offsets. This results in sort a transitive closure of the dependency relation, with the offsets being approximately summed when two dependency links are combined. We also prune transitive dependencies whose combined offsets go out-of-bounds, which can be viewed as a filter over that transitive closure removing dependencies that are blatantly absurd.
It is important to note that dependencies are not necessarily propagated as the same sort of dependency that they were before, but rather as the corresponding sort so that the platform rules still line up. To determine the exact rules for dependency propagation, we start by assigning to each dependency a couple of ternary numbers (`-1` for `build`, `0` for `host`, and `1` for `target`), representing how respectively its host and target platforms are "offset" from the depending derivations platforms. The following table summarize the different combinations that can be obtained:
| `host → target` | attribute name | offset |
| ------------------- | ------------------- | -------- |
| `build --> build` | `depsBuildBuild` | `-1, -1` |
| `build --> host` | `nativeBuildInputs` | `-1, 0` |
| `build --> target` | `depsBuildTarget` | `-1, 1` |
| `host --> host` | `depsHostHost` | `0, 0` |
| `host --> target` | `buildInputs` | `0, 1` |
| `target --> target` | `depsTargetTarget` | `1, 1` |
Algorithmically, we traverse propagated inputs, accumulating every propagated dependencys propagated dependencies and adjusting them to account for the “shift in perspective” described by the current dependencys platform offsets. This results is sort of a transitive closure of the dependency relation, with the offsets being approximately summed when two dependency links are combined. We also prune transitive dependencies whose combined offsets go out-of-bounds, which can be viewed as a filter over that transitive closure removing dependencies that are blatantly absurd.
We can define the process precisely with [Natural Deduction](https://en.wikipedia.org/wiki/Natural_deduction) using the inference rules. This probably seems a bit obtuse, but so is the bash code that actually implements it! [^footnote-stdenv-find-inputs-location] Theyre confusing in very different ways so… hopefully if something doesnt make sense in one presentation, it will in the other!
@ -179,37 +191,37 @@ Overall, the unifying theme here is that propagation shouldnt be introducing
#### `depsBuildBuild` {#var-stdenv-depsBuildBuild}
A list of dependencies whose host and target platforms are the new derivations build platform. This means a `-1` host and `-1` target offset from the new derivations platforms. These are programs and libraries used at build time that produce programs and libraries also used at build time. If the dependency doesnt care about the target platform (i.e. isnt a compiler or similar tool), put it in `nativeBuildInputs` instead. The most common use of this `buildPackages.stdenv.cc`, the default C compiler for this role. That example crops up more than one might think in old commonly used C libraries.
A list of dependencies whose host and target platforms are the new derivations build platform. These are programs and libraries used at build time that produce programs and libraries also used at build time. If the dependency doesnt care about the target platform (i.e. isnt a compiler or similar tool), put it in `nativeBuildInputs` instead. The most common use of this `buildPackages.stdenv.cc`, the default C compiler for this role. That example crops up more than one might think in old commonly used C libraries.
Since these packages are able to be run at build-time, they are always added to the `PATH`, as described above. But since these packages are only guaranteed to be able to run then, they shouldnt persist as run-time dependencies. This isnt currently enforced, but could be in the future.
#### `nativeBuildInputs` {#var-stdenv-nativeBuildInputs}
A list of dependencies whose host platform is the new derivations build platform, and target platform is the new derivations host platform. This means a `-1` host offset and `0` target offset from the new derivations platforms. These are programs and libraries used at build-time that, if they are a compiler or similar tool, produce code to run at run-time—i.e. tools used to build the new derivation. If the dependency doesnt care about the target platform (i.e. isnt a compiler or similar tool), put it here, rather than in `depsBuildBuild` or `depsBuildTarget`. This could be called `depsBuildHost` but `nativeBuildInputs` is used for historical continuity.
A list of dependencies whose host platform is the new derivations build platform, and target platform is the new derivations host platform. These are programs and libraries used at build-time that, if they are a compiler or similar tool, produce code to run at run-time—i.e. tools used to build the new derivation. If the dependency doesnt care about the target platform (i.e. isnt a compiler or similar tool), put it here, rather than in `depsBuildBuild` or `depsBuildTarget`. This could be called `depsBuildHost` but `nativeBuildInputs` is used for historical continuity.
Since these packages are able to be run at build-time, they are added to the `PATH`, as described above. But since these packages are only guaranteed to be able to run then, they shouldnt persist as run-time dependencies. This isnt currently enforced, but could be in the future.
#### `depsBuildTarget` {#var-stdenv-depsBuildTarget}
A list of dependencies whose host platform is the new derivations build platform, and target platform is the new derivations target platform. This means a `-1` host offset and `1` target offset from the new derivations platforms. These are programs used at build time that produce code to run with code produced by the depending package. Most commonly, these are tools used to build the runtime or standard library that the currently-being-built compiler will inject into any code it compiles. In many cases, the currently-being-built-compiler is itself employed for that task, but when that compiler wont run (i.e. its build and host platform differ) this is not possible. Other times, the compiler relies on some other tool, like binutils, that is always built separately so that the dependency is unconditional.
A list of dependencies whose host platform is the new derivations build platform, and target platform is the new derivations target platform. These are programs used at build time that produce code to run with code produced by the depending package. Most commonly, these are tools used to build the runtime or standard library that the currently-being-built compiler will inject into any code it compiles. In many cases, the currently-being-built-compiler is itself employed for that task, but when that compiler wont run (i.e. its build and host platform differ) this is not possible. Other times, the compiler relies on some other tool, like binutils, that is always built separately so that the dependency is unconditional.
This is a somewhat confusing concept to wrap ones head around, and for good reason. As the only dependency type where the platform offsets are not adjacent integers, it requires thinking of a bootstrapping stage *two* away from the current one. It and its use-case go hand in hand and are both considered poor form: try to not need this sort of dependency, and try to avoid building standard libraries and runtimes in the same derivation as the compiler produces code using them. Instead strive to build those like a normal library, using the newly-built compiler just as a normal library would. In short, do not use this attribute unless you are packaging a compiler and are sure it is needed.
This is a somewhat confusing concept to wrap ones head around, and for good reason. As the only dependency type where the platform offsets, `-1` and `1`, are not adjacent integers, it requires thinking of a bootstrapping stage *two* away from the current one. It and its use-case go hand in hand and are both considered poor form: try to not need this sort of dependency, and try to avoid building standard libraries and runtimes in the same derivation as the compiler produces code using them. Instead strive to build those like a normal library, using the newly-built compiler just as a normal library would. In short, do not use this attribute unless you are packaging a compiler and are sure it is needed.
Since these packages are able to run at build time, they are added to the `PATH`, as described above. But since these packages are only guaranteed to be able to run then, they shouldnt persist as run-time dependencies. This isnt currently enforced, but could be in the future.
#### `depsHostHost` {#var-stdenv-depsHostHost}
A list of dependencies whose host and target platforms match the new derivations host platform. This means a `0` host offset and `0` target offset from the new derivations host platform. These are packages used at run-time to generate code also used at run-time. In practice, this would usually be tools used by compilers for macros or a metaprogramming system, or libraries used by the macros or metaprogramming code itself. Its always preferable to use a `depsBuildBuild` dependency in the derivation being built over a `depsHostHost` on the tool doing the building for this purpose.
A list of dependencies whose host and target platforms match the new derivations host platform. In practice, this would usually be tools used by compilers for macros or a metaprogramming system, or libraries used by the macros or metaprogramming code itself. Its always preferable to use a `depsBuildBuild` dependency in the derivation being built over a `depsHostHost` on the tool doing the building for this purpose.
#### `buildInputs` {#var-stdenv-buildInputs}
A list of dependencies whose host platform and target platform match the new derivations. This means a `0` host offset and a `1` target offset from the new derivations host platform. This would be called `depsHostTarget` but for historical continuity. If the dependency doesnt care about the target platform (i.e. isnt a compiler or similar tool), put it here, rather than in `depsBuildBuild`.
A list of dependencies whose host platform and target platform match the new derivations. This would be called `depsHostTarget` but for historical continuity. If the dependency doesnt care about the target platform (i.e. isnt a compiler or similar tool), put it here, rather than in `depsBuildBuild`.
These are often programs and libraries used by the new derivation at *run*-time, but that isnt always the case. For example, the machine code in a statically-linked library is only used at run-time, but the derivation containing the library is only needed at build-time. Even in the dynamic case, the library may also be needed at build-time to appease the linker.
#### `depsTargetTarget` {#var-stdenv-depsTargetTarget}
A list of dependencies whose host platform matches the new derivations target platform. This means a `1` offset from the new derivations platforms. These are packages that run on the target platform, e.g. the standard library or run-time deps of standard library that a compiler insists on knowing about. Its poor form in almost all cases for a package to depend on another from a future stage \[future stage corresponding to positive offset\]. Do not use this attribute unless you are packaging a compiler and are sure it is needed.
A list of dependencies whose host platform matches the new derivations target platform. These are packages that run on the target platform, e.g. the standard library or run-time deps of standard library that a compiler insists on knowing about. Its poor form in almost all cases for a package to depend on another from a future stage \[future stage corresponding to positive offset\]. Do not use this attribute unless you are packaging a compiler and are sure it is needed.
#### `depsBuildBuildPropagated` {#var-stdenv-depsBuildBuildPropagated}
@ -1228,6 +1240,7 @@ If the libraries lack `-fPIE`, you will get the error `recompile with -fPIE`.
[^footnote-stdenv-ignored-build-platform]: The build platform is ignored because it is a mere implementation detail of the package satisfying the dependency: As a general programming principle, dependencies are always *specified* as interfaces, not concrete implementation.
[^footnote-stdenv-native-dependencies-in-path]: Currently, this means for native builds all dependencies are put on the `PATH`. But in the future that may not be the case for sake of matching cross: the platforms would be assumed to be unique for native and cross builds alike, so only the `depsBuild*` and `nativeBuildInputs` would be added to the `PATH`.
[^footnote-stdenv-propagated-dependencies]: Nix itself already takes a packages transitive dependencies into account, but this propagation ensures nixpkgs-specific infrastructure like setup hooks (mentioned above) also are run as if the propagated dependency.
[^footnote-stdenv-find-inputs-location]: The `findInputs` function, currently residing in `pkgs/stdenv/generic/setup.sh`, implements the propagation logic.
[^footnote-stdenv-sys-lib-search-path]: It clears the `sys_lib_*search_path` variables in the Libtool script to prevent Libtool from using libraries in `/usr/lib` and such.
[^footnote-stdenv-build-time-guessing-impurity]: Eventually these will be passed building natively as well, to improve determinism: build-time guessing, as is done today, is a risk of impurity.