Add cross-language LLVM CFI support to the Rust compiler
This PR adds cross-language LLVM Control Flow Integrity (CFI) support to the Rust compiler by adding the `-Zsanitizer-cfi-normalize-integers` option to be used with Clang `-fsanitize-cfi-icall-normalize-integers` for normalizing integer types (see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space). For more information about LLVM CFI and cross-language LLVM CFI support for the Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and -Zsanitizer-cfi-normalize-integers, and requires proper (i.e., non-rustc) LTO (i.e., -Clinker-plugin-lto).
Thank you again, ``@bjorn3,`` ``@nikic,`` ``@samitolvanen,`` and the Rust community for all the help!
This commit adds cross-language LLVM Control Flow Integrity (CFI)
support to the Rust compiler by adding the
`-Zsanitizer-cfi-normalize-integers` option to be used with Clang
`-fsanitize-cfi-icall-normalize-integers` for normalizing integer types
(see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust
-compiled code "mixed binaries" (i.e., for when C or C++ and Rust
-compiled code share the same virtual address space). For more
information about LLVM CFI and cross-language LLVM CFI support for the
Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and
-Zsanitizer-cfi-normalize-integers, and requires proper (i.e.,
non-rustc) LTO (i.e., -Clinker-plugin-lto).
Remove `QueryEngine` trait
This removes the `QueryEngine` trait and `Queries` from `rustc_query_impl` and replaced them with function pointers and fields in `QuerySystem`. As a side effect `OnDiskCache` is moved back into `rustc_middle` and the `OnDiskCache` trait is also removed.
This has a couple of benefits.
- `TyCtxt` is used in the query system instead of the removed `QueryCtxt` which is larger.
- Function pointers are more flexible to work with. A variant of https://github.com/rust-lang/rust/pull/107802 is included which avoids the double indirection. For https://github.com/rust-lang/rust/pull/108938 we can name entry point `__rust_end_short_backtrace` to avoid some overhead. For https://github.com/rust-lang/rust/pull/108062 it avoids the duplicate `QueryEngine` structs.
- `QueryContext` now implements `DepContext` which avoids many `dep_context()` calls in `rustc_query_system`.
- The `rustc_driver` size is reduced by 0.33%, hopefully that means some bootstrap improvements.
- This avoids the unsafe code around the `QueryEngine` trait.
r? `@cjgillot`
Report allocation errors as panics
OOM is now reported as a panic but with a custom payload type (`AllocErrorPanicPayload`) which holds the layout that was passed to `handle_alloc_error`.
This should be review one commit at a time:
- The first commit adds `AllocErrorPanicPayload` and changes allocation errors to always be reported as panics.
- The second commit removes `#[alloc_error_handler]` and the `alloc_error_hook` API.
ACP: https://github.com/rust-lang/libs-team/issues/192Closes#51540Closes#51245
Enable flatten-format-args by default.
Part of https://github.com/rust-lang/rust/issues/99012.
This enables the `flatten-format-args` feature that was added by https://github.com/rust-lang/rust/pull/106824:
> This change inlines string literals, integer literals and nested format_args!() into format_args!() during ast lowering, making all of the following pairs result in equivalent hir:
>
> ```rust
> println!("Hello, {}!", "World");
> println!("Hello, World!");
> ```
>
> ```rust
> println!("[info] {}", format_args!("error"));
> println!("[info] error");
> ```
>
> ```rust
> println!("[{}] {}", status, format_args!("error: {}", msg));
> println!("[{}] error: {}", status, msg);
> ```
>
> ```rust
> println!("{} + {} = {}", 1, 2, 1 + 2);
> println!("1 + 2 = {}", 1 + 2);
> ```
>
> And so on.
>
> This is useful for macros. E.g. a `log::info!()` macro could just pass the tokens from the user directly into a `format_args!()` that gets efficiently flattened/inlined into a `format_args!("info: {}")`.
>
> It also means that `dbg!(x)` will have its file, line, and expression name inlined:
>
> ```rust
> eprintln!("[{}:{}] {} = {:#?}", file!(), line!(), stringify!(x), x); // before
> eprintln!("[example.rs:1] x = {:#?}", x); // after
> ```
>
> Which can be nice in some cases, but also means a lot more unique static strings than before if dbg!() is used a lot.
This is mostly an optimization, except that it will be visible through [`fmt::Arguments::as_str()`](https://doc.rust-lang.org/nightly/std/fmt/struct.Arguments.html#method.as_str).
In https://github.com/rust-lang/rust/pull/106823, there was already a libs-api FCP about the documentation of `fmt::Arguments::as_str()` to allow it to give `Some` rather than `None` depending on optimizations like this. That was just a documentation update though. This PR is the one that actually makes the user visible change:
```rust
assert_eq!(format_args!("abc").as_str(), Some("abc")); // Unchanged.
assert_eq!(format_args!("ab{}", "c").as_str(), Some("abc")); // Was `None` before!
```
Add `rustc_fluent_macro` to decouple fluent from `rustc_macros`
Fluent, with all the icu4x it brings in, takes quite some time to compile. `fluent_messages!` is only needed in further downstream rustc crates, but is blocking more upstream crates like `rustc_index`. By splitting it out, we allow `rustc_macros` to be compiled earlier, which speeds up `x check compiler` by about 5 seconds (and even more after the needless dependency on `serde_json` is removed from `rustc_data_structures`).
Fluent, with all the icu4x it brings in, takes quite some time to
compile. `fluent_messages!` is only needed in further downstream rustc
crates, but is blocking more upstream crates like `rustc_index`. By
splitting it out, we allow `rustc_macros` to be compiled earlier, which
speeds up `x check compiler` by about 5 seconds (and even more after the
needless dependency on `serde_json` is removed from
`rustc_data_structures`).
Validate `ignore` and `only` compiletest directive, and add human-readable ignore reasons
This PR adds strict validation for the `ignore` and `only` compiletest directives, failing if an unknown value is provided to them. Doing so uncovered 79 tests in `tests/ui` that had invalid directives, so this PR also fixes them.
Finally, this PR adds human-readable ignore reasons when tests are ignored due to `ignore` or `only` directives, like *"only executed when the architecture is aarch64"* or *"ignored when the operative system is windows"*. This was the original reason why I started working on this PR and #108659, as we need both of them for Ferrocene.
The PR is a draft because the code is extremely inefficient: it calls `rustc --print=cfg --target $target` for every rustc target (to gather the list of allowed ignore values), which on my system takes between 4s and 5s, and performs a lot of allocations of constant values. I'll fix both of them in the coming days.
r? `@ehuss`
Avoid a few locks
We can use atomics or datastructures tuned for specific access patterns instead of locks. This may be an improvement for parallel rustc, but it's mostly a cleanup making various datastructures only usable in the way they are used right now (append data, never mutate), instead of having a general purpose lock.
`-Cdebuginfo=1` was never line tables only and
can't be due to backwards compatibility issues.
This was clarified and an option for line tables only
was added. Additionally an option for line info
directives only was added, which is well needed for
some targets. The debug info options should now
behave the same as clang's debug info options.
Add `try_canonicalize` to `rustc_fs_util` and use it over `fs::canonicalize`
This adds `try_canonicalize` which tries to call `fs::canonicalize`, but falls back to `std::path::absolute` if it fails. Existing `canonicalize` calls are replaced with it. `fs::canonicalize` is not guaranteed to work on Windows.
Add `-Z time-passes-format` to allow specifying a JSON output for `-Z time-passes`
This adds back the `-Z time` option as that is useful for [my rustc benchmark tool](https://github.com/Zoxc/rcb), reverting https://github.com/rust-lang/rust/pull/102725. It now uses nanoseconds and bytes as the units so it is renamed to `time-precise`.
Avoid unnecessary hashing
I noticed some stable hashing being done in a non-incremental build. It turns out that some of this is necessary to compute the crate hash, but some of it is not. Removing the unnecessary hashing is a perf win.
r? `@cjgillot`
This makes it easier to open the messages file while developing on features.
The commit was the result of automatted changes:
for p in compiler/rustc_*; do mv $p/locales/en-US.ftl $p/messages.ftl; rmdir $p/locales; done
for p in compiler/rustc_*; do sed -i "s#\.\./locales/en-US.ftl#../messages.ftl#" $p/src/lib.rs; done
This is fixed by simply using the currently registered target in the
current session. We need to use it because of target json that are not
by design included in the rustc list of targets.
Do not implement HashStable for HashSet (MCP 533)
This PR removes all occurrences of `HashSet` in query results, replacing it either with `FxIndexSet` or with `UnordSet`, and then removes the `HashStable` implementation of `HashSet`. This is part of implementing [MCP 533](https://github.com/rust-lang/compiler-team/issues/533), that is, removing the `HashStable` implementations of all collection types with unstable iteration order.
The changes are mostly mechanical. The only place where additional sorting is happening is in Miri's override implementation of the `exported_symbols` query.
The crate hash is needed:
- if debug assertions are enabled, or
- if incr. comp. is enabled, or
- if metadata is being generated, or
- if `-C instrumentation-coverage` is enabled.
This commit avoids computing the crate hash when these conditions are
all false, such as when doing a release build of a binary crate.
It uses `Option` to store the hashes when needed, rather than
computing them on demand, because some of them are needed in multiple
places and computing them on demand would make compilation slower.
The commit also removes `Owner::hash_without_bodies`. There is no
benefit to pre-computing that one, it can just be done in the normal
fashion.
Implement -Zlink-directives=yes/no
`-Zlink-directives=no` will ignored `#[link]` directives while compiling a crate, so nothing is emitted into the crate's metadata. The assumption is that the build system already knows about the crate's native dependencies and can provide them at link time without these directives.
This is another way to address issue # #70093, which is currently addressed by `-Zlink-native-libraries` (implemented in #70095). The latter is implemented at link time, which has the effect of ignoring `#[link]` in *every* crate. This makes it a very large hammer as it requires all native dependencies to be known to the build system to be at all usable, including those in sysroot libraries. I think this means its effectively unused, and definitely under-used.
Being able to control this on a crate-by-crate basis should make it much easier to apply when needed.
I'm not sure if we need both mechanisms, but we can decide that later.
cc `@pcwalton` `@cramertj`
`-Zlink-directives=no` will ignored `#[link]` directives while compiling a
crate, so nothing is emitted into the crate's metadata. The assumption is
that the build system already knows about the crate's native dependencies
and can provide them at link time without these directives.
This is another way to address issue # #70093, which is currently addressed
by `-Zlink-native-libraries` (implemented in #70095). The latter is
implemented at link time, which has the effect of ignoring `#[link]`
in *every* crate. This makes it a very large hammer as it requires all
native dependencies to be known to the build system to be at all usable,
including those in sysroot libraries. I think this means its effectively
unused, and definitely under-used.
Being able to control this on a crate-by-crate basis should make it much
easier to apply when needed.
I'm not sure if we need both mechanisms, but we can decide that later.
errors: generate typed identifiers in each crate
Instead of loading the Fluent resources for every crate in `rustc_error_messages`, each crate generates typed identifiers for its own diagnostics and creates a static which are pulled together in the `rustc_driver` crate and provided to the diagnostic emitter.
There are advantages and disadvantages to this change..
#### Advantages
- Changing a diagnostic now only recompiles the crate for that diagnostic and those crates that depend on it, rather than `rustc_error_messages` and all crates thereafter.
- This approach can be used to support first-party crates that want to supply translatable diagnostics (e.g. `rust-lang/thorin` in https://github.com/rust-lang/rust/pull/102612#discussion_r985372582, cc `@JhonnyBillM)`
- We can extend this a little so that tools built using rustc internals (like clippy or rustdoc) can add their own diagnostic resources (much more easily than those resources needing to be available to `rustc_error_messages`)
#### Disadvantages
- Crates can only refer to the diagnostic messages defined in the current crate (or those from dependencies), rather than all diagnostic messages.
- `rustc_driver` (or some other crate we create for this purpose) has to directly depend on *everything* that has error messages.
- It already transitively depended on all these crates.
#### Pending work
- [x] I don't know how to make `rustc_codegen_gcc`'s translated diagnostics work with this approach - because `rustc_driver` can't depend on that crate and so can't get its resources to provide to the diagnostic emission. I don't really know how the alternative codegen backends are actually wired up to the compiler at all.
- [x] Update `triagebot.toml` to track the moved FTL files.
r? `@compiler-errors`
cc #100717
Extend `CodegenBackend` trait with a function returning the translation
resources from the codegen backend, which can be added to the complete
list of resources provided to the emitter.
Signed-off-by: David Wood <david.wood@huawei.com>
Instead of loading the Fluent resources for every crate in
`rustc_error_messages`, each crate generates typed identifiers for its
own diagnostics and creates a static which are pulled together in the
`rustc_driver` crate and provided to the diagnostic emitter.
Signed-off-by: David Wood <david.wood@huawei.com>
Use a lock-free datastructure for source_span
follow up to the perf regression in https://github.com/rust-lang/rust/pull/105462
The main regression is likely the CStore, but let's evaluate the perf impact of this on its own
remove unstable `pick_stable_methods_before_any_unstable` flag
This flag was only added in #90329 in case there was any issue with the impl so that it would be easy to tell nightly users to use the flag to disable the new logic to fix their code. It's now been enabled for two years and also I can't find any issues corresponding to this new functionality? This flag made it way harder to understand how this code works so it would be nice to remove it and simplify what's going on.
cc `@nbdd0121`
r? `@oli-obk`
Add `kernel-address` sanitizer support for freestanding targets
This PR adds support for KASan (kernel address sanitizer) instrumentation in freestanding targets. I included the minimal set of `x86_64-unknown-none`, `riscv64{imac, gc}-unknown-none-elf`, and `aarch64-unknown-none` but there's likely other targets it can be added to. (`linux_kernel_base.rs`?) KASan uses the address sanitizer attributes but has the `CompileKernel` parameter set to `true` in the pass creation.
Implement partial support for non-lifetime binders
This implements support for non-lifetime binders. It's pretty useless currently, but I wanted to put this up so the implementation can be discussed.
Specifically, this piggybacks off of the late-bound lifetime collection code in `rustc_hir_typeck::collect::lifetimes`. This seems like a necessary step given the fact we don't resolve late-bound regions until this point, and binders are sometimes merged.
Q: I'm not sure if I should go along this route, or try to modify the earlier nameres code to compute the right bound var indices for type and const binders eagerly... If so, I'll need to rename all these queries to something more appropriate (I've done this for `resolve_lifetime::Region` -> `resolve_lifetime::ResolvedArg`)
cc rust-lang/types-team#81
r? `@ghost`
Most tests involving save-analysis were removed, but I kept a few where
the `-Zsave-analysis` was an add-on to the main thing being tested,
rather than the main thing being tested.
For `x.py install`, the `rust-analysis` target has been removed.
For `x.py dist`, the `rust-analysis` target has been kept in a
degenerate form: it just produces a single file `reduced.json`
indicating that save-analysis has been removed. This is necessary for
rustup to keep working.
Closes#43606.
Introduce `-Zterminal-urls` to use OSC8 for error codes
Terminals supporting the OSC8 Hyperlink Extension can support inline anchors where the text is user defineable but clicking on it opens a browser to a specified URLs, just like `<a href="URL">` does in HTML.
https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
Resolve documentation links in rustc and store the results in metadata
This PR implements MCP https://github.com/rust-lang/compiler-team/issues/584.
Doc links are now resolved in rustc and stored into metadata, so rustdoc simply retrieves them through a query (local or extern),
Code that is no longer used is removed, and some code that no longer needs to be public is privatized.
The removed code includes resolver cloning, so this PR fixes https://github.com/rust-lang/rust/issues/83761.
Terminals supporting the OSC8 Hyperlink Extension can support inline
anchors where the text is user defineable but clicking on it opens a
browser to a specified URLs, just like `<a href="URL">` does in HTML.
https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda
This is somewhat important because LLVM enables the pass based on
target architecture, but support by the target OS also matters.
For example, XRay attributes are processed by codegen for macOS
targets, but Apple linker fails to process relocations in XRay
data sections, so the feature as a whole is not supported there
for the time being.