nixpkgs/pkgs/applications/networking/browsers/chromium/recompress-tarball.nix
emilylange e15783154f
chromium: improve and move recompressTarball
Recap: We need this (arguably stupid) helper function/derivation because
the chromium tarball is big -- and is likely to grow even more in the
future. So big that we eventually exceeded hydra.nixos.org's
max-output-limit (3G). Instead of raising hydra's global limit, it was
decided that we recompress the tarball after deleting unused vendored
files from it.

I spent a lot of time on a version/prototype that does everything
(downloading, decompression, tar extraction, deleting unused files,
reproducible tar recreation and finally recompression) via stdin, but
eventually had to scrap it.

GNU tar does not allow creating a tarball just from stdin, nixpkgs'
stdenv isn't built with stdin/stdout/pipes in mind, and there were a
lot of other things I have probably already forgotten.

Nonetheless, this version improves multiple things:
- No more `mv` (there used to be multiple, not just ours, since
  fetchzip had some as well)
- No more `rm` to get rid of the extracted files before recompressing.
  Instead, we simply don't extract them in the first place (thanks to
  tar's --exclude).
- No more "no space left" errors, which happened due to
  `downloadToTemp = true;`.
- Multithreaded xz decompression, done by hand via --threads, since the
  nixpkgs commit enabling it by default is still in staging-next.

We cannot use stdenv's unpackFile() because that does not allow us to
specify the needed --exclude (and --strip-components=1 if we don't want
to rely on glob matching).
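For contrast, a hedged sketch of what the unpackFile() route would have
looked like (this is roughly what fetchzip does internally) -- shown for
illustration only, since this is exactly the limitation described above:

  postFetch = ''
    # unpackFile dispatches on the file extension and runs tar for us,
    # but provides no way to pass --exclude or --strip-components=1.
    unpackFile "$downloadedFile"
  '';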

The hash changed because we now have a static base directory ("source")
in the tarball, instead of whatever upstream provided us with (e.g.
"chromium-120.0.6099.129").
2024-01-04 01:34:15 +01:00

48 lines
1.5 KiB
Nix

{ zstd
, fetchurl
}:

{ version
, hash ? ""
, ...
} @ args:

fetchurl ({
  name = "chromium-${version}.tar.zstd";
  url = "https://commondatastorage.googleapis.com/chromium-browser-official/chromium-${version}.tar.xz";
  inherit hash;

  # chromium xz tarballs are multiple gigabytes big and are sometimes downloaded multiple
  # times for different versions as part of our update script.
  # We originally inherited fetchzip's default for downloadToTemp (true).
  # Given that the size of the /run/user tmpfs defaults to logind's RuntimeDirectorySize=,
  # which in turn defaults to 10% of the total amount of physical RAM, this often led to
  # "no space left" errors, eventually resulting in its own section in our chromium
  # README.md (for users wanting to run the update script).
  # Nowadays, we use fetchurl instead of fetchzip, which defaults to false instead of true.
  # We just want to be explicit and provide a place to document the history and reasoning
  # behind this.
  downloadToTemp = false;

  nativeBuildInputs = [ zstd ];

  postFetch = ''
    # Decompress and extract in a single pass, skipping the unused vendored
    # toolchain sources entirely and normalizing the top-level directory
    # (e.g. "chromium-120.0.6099.129") to a static "source".
    cat "$downloadedFile" \
      | xz -d --threads=$NIX_BUILD_CORES \
      | tar xf - \
        --warning=no-timestamp \
        --one-top-level=source \
        --exclude=third_party/llvm \
        --exclude=third_party/rust-src \
        --strip-components=1

    # Recreate the tarball reproducibly (stable entry order, fixed mtime,
    # normalized ownership and permissions) and recompress it with zstd.
    tar \
      --use-compress-program "zstd -T$NIX_BUILD_CORES" \
      --sort name \
      --mtime "1970-01-01" \
      --owner=root --group=root \
      --numeric-owner --mode=go=rX,u+rw,a-s \
      -cf $out source
  '';
} // removeAttrs args [ "version" ])
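
For context, a minimal sketch of a call site, assuming the usual
callPackage wiring (the let-binding, version value and lib.fakeHash
placeholder below are illustrative, not part of this commit; callPackage
and lib are assumed to be in scope):

  # Hypothetical usage; on the first build, Nix reports the real hash to
  # substitute for lib.fakeHash. The result unpacks into "source/".
  let
    recompressTarball = callPackage ./recompress-tarball.nix { };
  in
  recompressTarball {
    version = "120.0.6099.129";
    hash = lib.fakeHash;
  }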