When 9.2.1 was [released], I apparently was confused by the wording. The
NCG (-fasm) codegen backend for aarch64 not only works on
aarch64-darwin, but also aarch64-linux. `useLLVM` being enabled on
aarch64-linux had no adverse effect, as GHC used -fasm anyways, but it
did inflate closure size unnecessarily which we can rectify now.
[released]: https://www.haskell.org/ghc/blog/20211029-ghc-9.2.1-released.html
This was already applied to GHC 9.2.x, but was not copied to GHC 9.4.x.
I have had issues with this locally.
The same patch works for both Cabal 3.6 and 3.8, so we can just reuse it.
The `meta` set of the binary GHCs is mostly independent of the used
bindist (except for `pname` which includes `variantSuffix`). Thus we
should make sure it can be evaluated even if no bindist is available for
the platform, i.e. evaluating `outPath` may cause an evaluation failure,
but `meta.platforms` not. Use case at present is to make
`lib.meta.availableOn` work everywhere for any GHC (the normal GHCs
inherit their platforms list from their respective boot compiler, at
least for now).
To fix this we need to make sure that shallowly evaluating `passthru`
doesn't force `binDistUsed`, since `mkDerivation` needs to merge
`passthru` into the resulting derivation attribute set, thus forcing the
attribute names of `passthru`. We can easily do this by accessing what
we want to learn from `ghcBinDists` manually and using `or` to fall back
to a sensible default.
The switch to cctools-llvm made several LLVM tools the default on
Darwin, which includes llvm-ar. GHC will try to use `-L` with `ar` when
it is `llvm-ar`, but that doesn’t work currently on Darwin.
See https://gitlab.haskell.org/ghc/ghc/-/issues/23188.
This saves just enough space on aarch64-linux so that the hadrian built
GHCs are under the 3GB Hydra output limit:
| compiler | before | after | Δ |
|----------|------------|------------|------------|
| ghc962 | 3241234736 | 2810740560 | -430494176 |
| ghcHEAD | 3341288328 | 2902760872 | -438527456 |
The total output size can be calculated using (don't forget to use
aarch64-linux):
```
nix-build -A <compiler> | xargs nix path-info -s | awk '{ s += $2 }; END { print s }'
```
haskell.compiler.ghc8102BinaryMinimal: remove at 8.10.2
haskell.compiler.ghc8107BinaryMinimal: remove at 8.10.7
haskell.compiler.ghc924BinaryMinimal: remove at 9.2.4
On aarch64-linux the binary GHCs take up about 2.6GB (which compresses
pretty well on zfs as it turns out), so they are below the output limit
of Hydra. This allows us to drop the special casing of aarch platforms
in haskell-packages.nix. While we're at it, drop the minimal variants so
we don't unnecessarily build variants of the binary GHCs.
The main problem was GHC exceeding the Hydra output limit with profiling
libs on aarch64-linux which made us disable the feature. Nowadays the
limit is 3GB, the GHC output is a bit over 2GB, so easily under the
limit.
aarch64-darwin uses a different codegen backend and was never really
affected by the problem: Its output with profiling enabled is around
1.6GB.
Consequently we can enable profiling for all platforms again, as we have
no output size issues for those we build on Hydra.
Thanks to flokli for helping me track down these up to date numbers.
Hadrian (the GHC build tool) is built separately from GHC. This means
that if `haskell.compiler.ghc961` is overridden to add patches, those
patches will _only_ be applied to the GHC portion of the build, and not
the Hadrian build. For example, backporting this patch to GHC 9.6.1
failed because the changes to `hadrian/` files were not reflected in the
Nix build:
5ed77deb1b
By lifting `src` and `hadrian` from variables defined in the function
body to parameters with default values, the `hadrian/` files can be
overridden using the `haskell.compiler.ghc961.override` function. For
example:
self.haskell.compiler.ghc961.override {
# The GHC 9.6 builder in nixpkgs first builds hadrian with the
# source tree provided here and then uses the built hadrian to
# build the rest of GHC. We need to make sure our patches get
# included in this `src`, then, rather than modifying the tree in
# the `patchPhase` or `postPatch` of the outer builder.
src = self.applyPatches {
src = let
version = "9.6.1";
in
self.fetchurl {
url = "https://downloads.haskell.org/ghc/${version}/ghc-${version}-src.tar.xz";
sha256 = "fe5ac909cb8bb087e235de97fa63aff47a8ae650efaa37a2140f4780e21f34cb";
};
patches = [
# Enable response files for linker if supported
(self.fetchpatch {
url = "5ed77deb1b.patch";
hash = "sha256-dvenK+EPTZJYXnyfKPdkvLp+zeUmsY9YrWpcGCzYStM=";
})
];
};
}
Note that we do have to re-declare the `src` we want, but I'm not sure
of a good way to avoid this while also sharing one set of patches
between the GHC and Hadrian builds.
We previously thought that we only need python if we were going to run
./boot or using emscripten which implements all its wrappers in
python (and likes to reinvoke them). As it turns out, though, hadrian
likes to invoke python itself for generating certain headers of rts
using a script shipped with the GHC source. This fact was obscured
before, since (presumably) sphinx would propagate python into PATH.
Due to link time dead code elimination not working on aarch64-darwin,
some unused store path references in Paths_* modules are retained. This
causes reference cycles when a separate `bin` output is used.
To prevent this, add a patch to Cabal as shipped by GHC which infers
based on the installation layout (which is influenced by
enableSeparateBinOutput, enableSeparateDataOutput etc. in a Nix build)
which references can be retained without causing a reference cycle. This
ensures that packages that were fine with a bin output will also work on
aarch64-darwin. Packages that cause a reference cycle anyways (by
actually using references that do cause one) fail due to a missing
symbol – here we are trading the overall benefit for a more confusing
error message.
For details, refer to the explanation comment in the patch.
`*.*.rts.*.opts` is actually copied from the migration GHC blog post,
but does not, actually, parse: The format is
`<stage>.<package>.<program>.<filetype>.<setting>`, so it would need to
be `*.rts.ghc.opts`. This is already achieved by the broader rule on the
next line.
- Christmas is over!
- Upstream has changed the name of the target triplet used for the JS
backend from js-unknown-ghcjs to javascript-unknown-ghcjs, since Cabal
calls the architecture "javascript":
6636b67023
Since the triplet is made up anyways, i.e. autoconf does not support
it and Rust uses different triplets for its emscripten backends, we'll
just change it as well.
- Upstream fixed the problem with ar(1) being invoked incorrectly by stage0:
e987e345c8
Hadrian does this automatically unfortunately, but unless we correctly
set enableShared as well, mkDerivation will try building shared libs
which will inevitably fail due to missing shared core packages.
Let's stay away from fully_static which does a lot of funky stuff and
was not working before anyways for pkgsStatic.
There is a code generation bug in Cabal-3.6.3.0. For packages configured with
--enable-relocatable, Cabal would generate code that doesn't compile.
There isn't an upstream issue, but the issue is described in the commit that
fixed it:
6c796218c9
It was fixed in Cabal-3.8.*
Backport the fix to the Cabal library that ships with ghc-9.4.4
Cabal 3.8 ships with ghc-9.6, so when 9.6 is released this fix shouldn't be
necessary.
`ghc ? hadrian` can be used to check if a GHC was built using hadrian.
This is often relevant since hadrian changed the ghc libdir location, so
we need to install libs to a different location as well.
GHC ships a [modified] config.sub so that js-unknown-ghcjs is accepted
by autotools. For some platforms, we automatically update config.sub
from upstream's source in order to prevent that builds fail when we use
an outdated config.sub. In this case of course the perfectly up to date
config.sub would reject the target platform we are trying to use, so we
must disable this mechanism for now.
I have asked in the GHC IRC channel if there are any plans on
upstreaming the platform. It would be nice if were able to drop this
change in the future.
This is now possible by building a cross compiler for js-unknown-ghjs
using `pkgsCross.ghcjs.buildPackages.haskell.compiler.ghcHEAD`.
To allow this, the following things needed to be done:
* Disable dependencies that wouldn't work:
- Don't pull in ncurses for terminfo
- Don't pull in libffi
- Don't pull in libiconv
- Don't enable the LLVM backend
- Enable gmp-less native-bignum backend
* Use emscripten instead of a C compiler. The way this works is inspired
by emscriptenPackages, but avoids the following flaws:
- Instead of using a custom configurePhase, just set
`configureScript = "emconfigure ./configure";` which is much simpler.
- Create writable EM_CACHE before configuring, as configure scripts
want to compile test programs.
Additionally, we need to disable the targetCC check, as it is not
applicable with emscripten which never appears as part of stdenv.
* Use generic $configureScript in installPhase to be able to work with
our emconfigure trick.
Note that the corresponding Haskell package set does not work yet. Cabal
doesn't seem to like GHC 9.7 yet and the generic-builder is clueless
about the JS backend.
gmp is part of buildInputs _and_ depsTargetTarget, so we need to check
the host and target platform to be correct. In practice this doesn't
change much though, as gmp.meta.platforms is _quite_ liberal.
Finally building a cross compiler using hadrian is possible, but there
are some outstanding issues regarding external libraries in the package
db which causes issues with ghc-bignum.