Commit Graph

106 Commits

Author SHA1 Message Date
bjorn3
3816385b09 Merge commit '979dcf8e2f213e4f4b645cb62e7fe9f4f2c0c785' into sync_cg_clif-2025-05-25 2025-05-25 18:51:16 +00:00
bjorn3
3066da813b Share part of the global_asm!() implementation between cg_ssa and cg_clif 2025-04-14 09:38:04 +00:00
bjorn3
f5c93fa3fe Use cg_ssa's version of codegen_naked_asm in cg_clif 2025-04-14 09:38:04 +00:00
bors
e1b06f7730 Auto merge of #139453 - compiler-errors:incr, r=jieyouxu
Prepend temp files with per-invocation random string to avoid temp filename conflicts

https://github.com/rust-lang/rust/issues/139407 uncovered a very subtle unsoundness with incremental codegen, failing compilation sessions (due to assembler errors), and the "prefer hard linking over copying files" strategy we use in the compiler for file management.

Specifically, imagine we're building a single file 3 times, all with `-Csave-temps -Cincremental=...`. Let's call the object file we're building for the codegen unit for `main` "`XXX.o`" just for clarity since it's probably some gigantic hash name:

```
#[inline(never)]
#[cfg(any(rpass1, rpass3))]
fn a() -> i32 {
    0
}

#[cfg(any(cfail2))]
fn a() -> i32 {
    1
}

fn main() {
    evil::evil();
    assert_eq!(a(), 0);
}

mod evil {
    #[cfg(any(rpass1, rpass3))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("/*  */");
        }
    }

    #[cfg(any(cfail2))]
    pub fn evil() {
        unsafe {
            std::arch::asm!("missing");
        }
    }
}
```

Session 1 (`rpass1`):
* Type-check, borrow-check, etc.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o` which is spit out in the cwd.
* Hard-link[^1] `XXX.rcgu.o` to the incremental working directory `.../s-...-working/XXX.o`.
* Save-temps option means we don't delete `XXX.rgcu.o`.
* Link the binary and stuff.
* Finalize[^2] the working incremental session by renaming `.../s-...-working` to ` s-...-asjkdhsjakd` (some other finalized incr comp session dir name).

Session 2 (`cfail2`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph.
* Type-check, borrow-check, etc. since the file has changed, so most dep graph nodes are red.
* Serialize the dep graph to the incremental working directory `.../s-...-working/`.
* Codegen object file to a temp file `XXX.rcgu.o`. **HERE IS THE PROBLEM**: The hard-link is still set up to point to the inode from `XXX.o` from the first session, so this also modifies the `XXX.o` in the previous finalized session directory.
* Codegen emits an error b/c `missing` is not an instruction, so we abort before finalizing the incremental session. Specifically, this means that the *previous* session is the last finalized session.

Session 3 (`rpass3`):
* Load artifacts from the previous *finalized* incremental session, namely the dep graph. NOTE that this is from session 1.
* All the dep graph nodes are green since we are basically replaying session 1.
* codegen object file `XXX.o`, which is detected as *reused* from session 1 since dep nodes were green. That means we **reuse** `XXX.o` which had been dirtied from session 2.
* Link the binary and stuff.

This results in a binary which reuses some of the build artifacts from session 2, but thinks it's from session 1.

At this point, I hope it's clear to see that the incremental results from session 1 were dirtied from session 2, but we reuse them as if session 1 was the previous (finalized) incremental session we ran. This is at best really buggy, and at worst **unsound**.

This isn't limited to `-C save-temps`, since there are other combinations of flags that may keep around temporary files (hard linked) in the working directory (like `-C debuginfo=1 -C split-debuginfo=unpacked` on darwin, for example).

---

This PR implements a fix which is to prepend temp filenames with a random string that is generated per invocation of rustc. This string is not *deterministic*, but temporary files are transient anyways, so I don't believe this is a problem.

That means that temp files are now something like... `{crate-name}.{cgu}.{invocation_temp}.rcgu.o`, where `{invocation_temp}` is the new temporary string we generate per invocation of rustc.

Fixes https://github.com/rust-lang/rust/issues/139407

[^1]: 175dcc7773/compiler/rustc_fs_util/src/lib.rs (L60)
[^2]: 175dcc7773/compiler/rustc_incremental/src/persist/fs.rs (L1-L40)
2025-04-11 13:59:33 +00:00
John Kåre Alsaker
02f10d9bfe Remove the use of Rayon iterators 2025-04-10 22:05:06 +02:00
Michael Goulet
9c372d8940 Prepend temp files with a string per invocation of rustc 2025-04-07 20:48:40 +00:00
Michael Goulet
effef88ac7 Simplify temp path creation a bit 2025-04-07 20:48:40 +00:00
bjorn3
1111a97886 Merge commit 'ba315abda789c9f59f2100102232bddb30b0d3d3' into sync_cg_clif-2025-03-30 2025-03-30 15:43:48 +00:00
Ben Kimock
e21502cf9e Fill out links_from_incr_cache in cg_clif 2025-02-26 20:02:03 -05:00
Ben Kimock
cae7c76d50 Avoid no-op unlink+link dances in incr comp 2025-02-24 19:46:48 -05:00
Nicholas Nethercote
f86f7ad5f2 Move some Map methods onto TyCtxt.
The end goal is to eliminate `Map` altogether.

I added a `hir_` prefix to all of them, that seemed simplest. The
exceptions are `module_items` which became `hir_module_free_items` because
there was already a `hir_module_items`, and `items` which became
`hir_free_items` for consistency with `hir_module_free_items`.
2025-02-17 13:21:02 +11:00
bjorn3
1fcae03369 Rustfmt 2025-02-08 22:12:13 +00:00
bjorn3
cac271fd38 Merge commit '8332329f83d4ef34479fec67cc21b21246dca6b5' into sync_cg_clif-2025-02-07 2025-02-07 20:58:27 +00:00
Zalathar
24cdaa146a Rename tcx.ensure() to tcx.ensure_ok() 2025-02-01 12:38:54 +11:00
Oli Scherer
b24f674520 Change collect_and_partition_mono_items tuple return type to a struct 2025-01-27 09:38:12 +00:00
bjorn3
d740a3f06a Merge commit '728bc27f32c05ac8a9b5eb33fd101e479072984f' into sync_cg_clif-2025-01-20 2025-01-20 15:30:04 +00:00
bjorn3
92a664e111 Merge commit 'e39eacd2d415803ef82de3b6a314e4f2d0fbc4dc' into sync_cg_clif-2025-01-10 2025-01-10 09:02:07 +00:00
bjorn3
a94e2d513b Merge commit '918acafef682d0d0ca30b47de4768210417ff362' into sync_cg_clif-2025-01-05 2025-01-05 15:44:46 +00:00
bjorn3
7e6be13647 Make DependencyList an IndexVec 2024-12-19 15:30:32 +00:00
bjorn3
3198496385 Make dependency_formats an FxIndexMap rather than a list of tuples
It is treated as a map already. This is using FxIndexMap rather than
UnordMap because the latter doesn't provide an api to pick a single
value iff all values are equal, which each_linked_rlib depends on.
2024-12-13 11:29:15 +00:00
bjorn3
ead78fdfdf Remove jobserver from Session
It is effectively a global resource and the jobserver::Client in Session
was a clone of GLOBAL_CLIENT anyway.
2024-12-13 10:21:22 +00:00
bjorn3
b3d837afe1 Merge commit '57845a397ec15e4e6a561ed2c4bfa3dcf49144fb' into sync_cg_clif-2024-12-06 2024-12-06 12:10:30 +00:00
clubby789
71b698c0b8 Replace Symbol::intern calls with preinterned symbols 2024-11-28 15:45:27 +00:00
bjorn3
c94f759f10 Merge commit '1fa693ca4462fc1f790693464cf765ad693616af' into sync_cg_clif-2024-11-09 2024-11-09 13:48:06 +00:00
Michael Goulet
c682aa162b Reformat using the new identifier sorting from rustfmt 2024-09-22 19:11:29 -04:00
Nicholas Nethercote
84ac80f192 Reformat use declarations.
The previous commit updated `rustfmt.toml` appropriately. This commit is
the outcome of running `x fmt --all` with the new formatting options.
2024-07-29 08:26:52 +10:00
bjorn3
9ec6a02ab3 Merge commit '49cd5dd454d0115cfbe9e39102a8b3ba4616aa40' into sync_cg_clif-2024-06-30 2024-06-30 11:28:14 +00:00
bors
1689a5a531 Auto merge of #122597 - pacak:master, r=bjorn3
Show files produced by `--emit foo` in json artifact notifications

Right now it is possible to ask `rustc` to save some intermediate representation into one or more files with `--emit=foo`, but figuring out what exactly was produced is difficult. This pull request adds information about `llvm_ir` and `asm` intermediate files into notifications produced by `--json=artifacts`.

Related discussion: https://internals.rust-lang.org/t/easier-access-to-files-generated-by-emit-foo/20477

Motivation - `cargo-show-asm` parses those intermediate files and presents them in a user friendly way, but right now I have to apply some dirty hacks. Hacks make behavior confusing: https://github.com/hintron/computer-enhance/issues/35

This pull request introduces a new behavior: now `rustc` will emit a new artifact notification for every artifact type user asked to `--emit`, for example for `--emit asm` those will include all the `.s` files.

Most users won't notice this behavior, to be affected by it all of the following must hold:
- user must use `rustc` binary directly (when `cargo` invokes `rustc` - it consumes artifact notifications and doesn't emit anything)
- user must specify both `--emit xxx` and `--json artifacts`
- user must refuse to handle unknown artifact types
- user must disable incremental compilation (or deal with it better than cargo does, or use a workaround like `save-temps`) in order not to hit #88829 / #89149
2024-06-04 00:05:56 +00:00
Augie Fackler
aa91871539 rustc_codegen_llvm: add support for writing summary bitcode
Typical uses of ThinLTO don't have any use for this as a standalone
file, but distributed ThinLTO uses this to make the linker phase more
efficient. With clang you'd do something like `clang -flto=thin
-fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o
(full of bitcode) and foo.indexing.o (just the summary or index part of
the bitcode). That's then usable by a two-stage linking process that's
more friendly to distributed build systems like bazel, which is why I'm
working on this area.

I talked some to @teresajohnson about naming in this area, as things
seem to be a little confused between various blog posts and build
systems. "bitcode index" and "bitcode summary" tend to be a little too
ambiguous, and she tends to use "thin link bitcode" and "minimized
bitcode" (which matches the descriptions in LLVM). Since the clang
option is thin-link-bitcode, I went with that to try and not add a new
spelling in the world.

Per @dtolnay, you can work around the lack of this by using `lld
--thinlto-index-only` to do the indexing on regular .o files of
bitcode, but that is a bit wasteful on actions when we already have all
the information in rustc and could just write out the matching minimized
bitcode. I didn't test that at all in our infrastructure, because by the
time I learned that I already had this patch largely written.
2024-05-22 14:04:22 -04:00
bjorn3
75f8bdbca4 Merge commit '3270432f4b0583104c8b9b6f695bf97d6bbf3ac2' into sync_cg_clif-2024-05-13 2024-05-13 13:26:33 +00:00
bjorn3
3d682cfb66 Merge commit 'de5d6523738fd44a0521b6abf3e73ae1df210741' into sync_cg_clif-2024-04-23 2024-04-23 09:37:28 +00:00
Michael Baikov
c8390cdbfa Show files produced by --emit foo in json artifact notifications 2024-04-19 08:31:41 -04:00
Michael Baikov
691e953da6 Save/restore more items in cache with incremental compilation 2024-04-06 10:59:24 -04:00
bjorn3
987ed345af Merge commit '09fae60a86b848a2fc0ad219ecc4e438dc1eef86' into sync_cg_clif-2024-03-28 2024-03-28 11:43:35 +00:00
bjorn3
37018026f0 Merge commit '3e50cf65025f96854d6597e80449b0d64ad89589' into sync_cg_clif-2024-01-26 2024-01-26 18:33:45 +00:00
Nicholas Nethercote
bd4e623485 Use chaining for DiagnosticBuilder construction and emit.
To avoid the use of a mutable local variable, and because it reads more
nicely.
2024-01-08 15:45:29 +11:00
bjorn3
d1d134e464 Merge commit '6d355f6844323db03bfd608899613e363e701951' into sync_cg_clif-2023-12-31 2023-12-31 13:29:53 +00:00
Nicholas Nethercote
8a9db25459 Remove more Session methods that duplicate DiagCtxt methods. 2023-12-24 08:17:47 +11:00
Nicholas Nethercote
99472c7049 Remove Session methods that duplicate DiagCtxt methods.
Also add some `dcx` methods to types that wrap `TyCtxt`, for easier
access.
2023-12-24 08:05:28 +11:00
Nicholas Nethercote
09af8a667c Rename Session::span_diagnostic as Session::dcx. 2023-12-18 16:06:21 +11:00
bjorn3
484bc7fc88 Merge commit '93a5433f17ab5ed48cc88f1e69b0713b16183373' into sync_cg_clif-2023-10-24 2023-10-24 12:22:23 +00:00
bjorn3
e07f47b6c5 Merge commit 'c07d1e2f88cb3b1a0604ae8f18b478c1aeb7a7fa' into sync_cg_clif-2023-10-21 2023-10-21 19:54:51 +00:00
bjorn3
f0b5820fa5 Fix review comments 2023-10-09 18:39:43 +00:00
bjorn3
e9fa2ca6ad Remove cgu_reuse_tracker from Session
This removes a bit of global mutable state
2023-10-09 18:39:41 +00:00
bjorn3
6b9ee90c2c Reuse determine_cgu_reuse from cg_ssa in cg_clif 2023-10-09 18:38:50 +00:00
bjorn3
169055f2ff Merge commit '81dc066758ec150b43822d4a0c84aae20fe10f40' into sync_cg_clif-2023-10-09 2023-10-09 08:52:46 +00:00
John Kåre Alsaker
f742d88326 Remove verbose_generic_activity_with_arg 2023-09-10 17:47:16 +02:00
Vadim Petrochenkov
0b89aac08d rustc: Move crate_types from Session to GlobalCtxt
Removes a piece of mutable state.
Follow up to #114578.
2023-08-09 14:17:54 +08:00
bjorn3
36708123c1 Merge commit '1eded3619d0e55d57521a259bf27a03906fdfad0' into sync_cg_clif-2023-07-22 2023-07-22 13:32:34 +00:00
Nicholas Nethercote
b52f9eb6ca Introduce MonoItemData.
It replaces `(Linkage, Visibility)`, making the code nicer. Plus the
next commit will add another field.
2023-07-17 08:44:48 +10:00