Make simd_shuffle_indices use valtrees
This removes the second-to-last user of the `destructure_mir_constant` query. So in a follow-up we can remove the query and just move the query provider function directly into pretty printing (which is the last user).
cc `@rust-lang/clippy`: there's a small functional change, but I think it is correct?
Implement most of MCP510
This implements most of what remains to be done for MCP510:
- turns `-C link-self-contained` into a `+`/`-` list of components, like `-C link-self-contained=+linker,+crto,+libc,+unwind,+sanitizers,+mingw`. The scaffolding is present for all these expected components to be implemented and stabilized in the future on their own time. This PR only handles the `-Zgcc-ld=lld` subset of these link-self-contained components as `-Clink-self-contained=+linker`
- handles `-C link-self-contained=y|n` as-is today, for compatibility with `rustc_codegen_ssa::back::link::self_contained`'s [explicit opt-in and opt-out](9eee230cd0/compiler/rustc_codegen_ssa/src/back/link.rs (L1671-L1676)).
- therefore supports our plan to opt out of `rust-lld` (when it's enabled by default) even for current `-Clink-self-contained` users, with e.g. `-Clink-self-contained -Clink-self-contained=-linker`
- turns `add_gcc_ld_path` into its expected final form, by using the `-C link-self-contained=+linker` CLI flag, and whether the `LinkerFlavor` has the expected `Cc::Yes` and `Lld::Yes` shape (this is not yet the case in practice for any CLI linker flavor)
- makes the [new clean linker flavors](https://github.com/rust-lang/rust/pull/96827#issuecomment-1208441595) selectable in the CLI in addition to the legacy ones, in order to opt-in to using `cc` and `lld` to emulate `-Zgcc-ld=lld`
- ensure the new `-C link-self-contained` components, and `-C linker-flavor`s are unstable, and require `-Z unstable-options` to be used
The up-to-date set of flags for the future stable CLI version of `-Zgcc-ld=lld` is currently: `-Clink-self-contained=+linker -Clinker-flavor=gnu-lld-cc -Zunstable-options`.
It's possible we'll also need to do something for distros that don't ship `rust-lld`, but maybe there are already no tool search paths to be added to `cc` in this situation anyways.
r? `@petrochenkov`
loongarch: Fix ELF header flags
This patch changes the ELF header flags so that the ABI matches the floating-point features. It also updates the link to the new official documentation.
Fix unset e_flags in ELF files generated for AVR targets
Closes #106576
~~Sort-of blocked by gimli-rs/object#500~~ (merged)
I'm not sure whether the list of AVR CPU names is okay here. Maybe it could be moved out-of-line to improve the readability of the function.
Support for native WASM exceptions
### Motivation
Currently, rustc does not support native WASM exceptions. It does support JavaScript-based exceptions for the wasm32-emscripten target, but those require a round trip to JavaScript for many calls, which is very slow.
Native wasm support for exceptions is quite common: Clang+LLVM implemented them years ago, and all major browsers support them by now. They enable zero-cost exceptions, at least with regard to runtime performance. They may increase startup time and code size, though.
### Important: This PR does not change default behaviour
Exceptions usually add a lot of code in the form of unwinding blocks, increasing the binary size. Most users probably do not want that, especially with regard to web development.
Therefore, wasm exceptions play a similar role as WASM-threads: rustc should support them, like clang does, but users who want to use it have to use some command-line magic like rustflags to opt in.
### What does this PR do?
As stated above, the default behaviour is not changed. It is already possible to opt in to wasm exceptions using the command line. Unfortunately, the generated LLVM IR is invalid and the LLVM backend crashes.
```
rustc <sourcefile>
--target wasm32-unknown-unknown
-C panic=unwind
-C llvm-args=-wasm-enable-eh
-C target-feature=+exception-handling
```
As it turns out, LLVM is quite picky when it comes to IR for exception handling. If the IR does not look exactly like it should, some LLVM-assertions fail and the code generation crashes.
This PR adds the necessary modifications to the code generator to make it work. It also adds `exception-handling` as a wasm target feature.
### What this PR does not / what is missing
This PR is not a full-fledged solution; it is the first step. A few parts are still missing; however, it is already usable (see next section).
Currently missing:
* The std library has to be adapted. Currently, only [no_std] crates work
* Usually, nested exceptions abort the program (i.e. a panic during the cleanup of another panic). This is not implemented yet.
- Currently, code inside cleanup handlers does not unwind
- To fix this requires a little more work: The code generator currently maintains a single terminate block per function for this. Unfortunately, WASM requires funclet based exception handling. Therefore, we need to create a terminate block per funclet. This is probably not a big problem, but I want to keep this PR simple.
### How to use the compiler given this PR?
This PR does not add any command line flags or features. It uses those which are already there. To compile with exceptions enabled, you need
* to set the panic strategy to unwind, i.e. `-C panic=unwind`
* to enable the exception-handling target feature, i.e. `-C target-feature=+exception-handling`
* to tell LLVM about the exception handling, i.e. `-C llvm-args=-wasm-enable-eh`
Since the standard library has not been adapted, you can only use it in [no_std] crates as of now. The intrinsic `core::intrinsics::r#try` works. To throw exceptions, you need the `@llvm.wasm.throw` intrinsic.
I created a sample application which works for me: https://github.com/mirkootter/rust-wasm-demos
This example can be run at https://webassembly.sh
mir opt + codegen: handle subtyping
fixes #107205
the same issue was caused in multiple places:
- mir opts: both copy and destination propagation
- codegen: assigning operands to locals (which also propagates values)
I changed codegen to always update the type in the operands used for locals, which should guard against any new occurrences of this bug. I don't know how to make MIR optimizations more resilient here; hopefully the added tests will be enough to detect any trivially wrong optimizations going forward.
Add trustzone and virtualization target features for aarch32.
These are LLVM target features which allow the `smc` and `hvc` instructions respectively to be used in inline assembly.
For non-incremental builds on Unix, currently all the thread names look
like `opt regex.f10ba03eb5ec7975-cgu.0`. But they are truncated by
`pthread_setname` to `opt regex.f10ba`, hiding the numeric suffix that
distinguishes them. This is really annoying when using a profiler like
Samply.
This commit changes these thread names to a form like `opt cgu.0`, which
is much better.
The codegen main loop has two bools, `codegen_done` and
`codegen_aborted`. There are only three valid combinations: `(false,
false)`, `(true, false)`, `(true, true)`.
This commit replaces them with a single tri-state enum, which makes
things clearer.
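A minimal sketch of what such a tri-state enum can look like (names assumed, not necessarily the ones used in the commit):

```rust
enum CodegenState {
    /// Codegen is still producing new work items.
    Ongoing,
    /// Codegen finished normally; remaining work items still need processing.
    Completed,
    /// Codegen stopped early (e.g. due to an error); wind down without starting new work.
    Aborted,
}
```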
`Message` is an enum with multiple variants. Four of those variants map
directly onto the four variants of `WorkItemResult`. This commit reduces
those four `Message` variants to a single variant containing a
`WorkItemResult`. This requires increasing `WorkItemResult`'s visibility
to `pub(crate)` visibility, but `WorkItem` and `Message` can also have
their visibility reduced to `pub(crate)`.
This change avoids some boilerplate enum translation code, and makes
`Message` easier to understand.
`Message` is an enum with multiple variants, for messages sent to the
coordinator thread. *Except* for `Message::CodegenItem`, which is
entirely disjoint, being for messages sent from the coordinator thread
to the main thread.
This commit moves `Message::CodegenItem` into a separate type,
`CguMessage`, which makes the code much clearer.
Like for rlibs, the paths on the linker command line need to be relative
paths if the sysroot was specified by the user to be a relative path.
Dylibs put the path in /LIBPATH instead of into the file path of the
library itself, so we rehome the libpath and adjust the rehoming function
to be able to support both use cases, rlibs and dylibs.
When the `--sysroot` is specified as relative to the current working
directory, the sysroot's rlibs should also be specified as relative
paths. Otherwise, the current working directory ends up in the
absolute paths, and in the linker command line. And the entire linker
command line appears in the PDB file generated by the MSVC linker.
When adding an rlib to the linker command line, if the rlib's canonical
path is in the sysroot's canonical path, then use the current sysroot
path + filename instead of the full absolute path to the rlib. This
means that when `--sysroot=foo` is specified, the linker command line
will contain `foo/rustlib/target/lib/lib*.rlib` instead of the full
absolute path to the same.
This addresses https://github.com/rust-lang/rust/issues/112586
Because tiny CGUs make compilation less efficient *and* result in worse
generated code.
We don't do this when the number of CGUs is explicitly given, because
there are times when the requested number is very important, as
described in some comments within the commit. So the commit also
introduces a `CodegenUnits` type that distinguishes between default
values and user-specified values.
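A hedged sketch of the shape of such a type (exact names and methods assumed):

```rust
#[derive(Clone, Copy, Debug)]
enum CodegenUnits {
    /// A default value chosen by the compiler; small CGUs may be merged
    /// up to a minimum size.
    Default(usize),
    /// A value requested explicitly via `-Ccodegen-units`; respected as-is.
    User(usize),
}

impl CodegenUnits {
    fn as_usize(self) -> usize {
        match self {
            CodegenUnits::Default(n) | CodegenUnits::User(n) => n,
        }
    }
}
```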
This change has a roughly neutral effect on walltimes across the
rustc-perf benchmarks; there are some speedups and some slowdowns. But
it has significant wins for most other metrics on numerous benchmarks,
including instruction counts, cycles, binary size, and max-rss. It also
reduces parallelism, which is good for reducing jobserver competition
when multiple rustc processes are running at the same time. It's smaller
benchmarks that benefit the most; larger benchmarks already have CGUs
that are all larger than the minimum size.
Here are some example before/after CGU sizes for opt builds.
- html5ever
- CGUs: 16, mean size: 1196.1, sizes: [3908, 2992, 1706, 1652, 1572,
1136, 1045, 948, 946, 938, 579, 471, 443, 327, 286, 189]
- CGUs: 4, mean size: 4396.0, sizes: [6706, 3908, 3490, 3480]
- libc
- CGUs: 12, mean size: 35.3, sizes: [163, 93, 58, 53, 37, 8, 2 (x6)]
- CGUs: 1, mean size: 424.0, sizes: [424]
- tt-muncher
- CGUs: 5, mean size: 1819.4, sizes: [8508, 350, 198, 34, 7]
- CGUs: 1, mean size: 9075.0, sizes: [9075]
Note that CGUs of size 100,000+ aren't unusual in larger programs.
Write to stdout if `-` is given as output file
With this PR, if `-o -` or `--emit KIND=-` is provided, output will be written to stdout instead. Writing binary output (`obj`, `llvm-bc`, `link` and `metadata`) this way results in an error if stdout is a tty. Requesting multiple output types on stdout triggers an error too, as they would all be mixed together.
This implements https://github.com/rust-lang/compiler-team/issues/431
The idea behind the changes is to introduce an `OutFileName` enum that represents the output - be it a real path or stdout - and to use this enum along the code paths that handle different output types.
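A hedged sketch of that enum and how the tty check might hang off it (simplified; the real type carries more detail):

```rust
use std::io::IsTerminal;
use std::path::PathBuf;

enum OutFileName {
    Real(PathBuf),
    Stdout,
}

impl OutFileName {
    /// Binary output kinds refuse to be written when this returns true.
    fn is_tty(&self) -> bool {
        match self {
            OutFileName::Real(_) => false,
            OutFileName::Stdout => std::io::stdout().is_terminal(),
        }
    }
}
```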
Rollup of 9 pull requests
Successful merges:
- #112034 (Migrate `item_opaque_ty` to Askama)
- #112179 (Avoid passing --cpu-features when empty)
- #112309 (bootstrap: remove dependency `is-terminal`)
- #112388 (Migrate GUI colors test to original CSS color format)
- #112389 (Add a test for #105709)
- #112392 (Fix ICE for while loop with assignment condition with LHS place expr)
- #112394 (Remove accidental comment)
- #112396 (Track more diagnostics in `rustc_expand`)
- #112401 (Don't `use compile_error as print`)
r? `@ghost`
`@rustbot` modify labels: rollup
Removed use of iteration through a HashMap/HashSet in rustc_incremental and replaced with IndexMap/IndexSet
This allows the `#[allow(rustc::potential_query_instability)]` in rustc_incremental to be removed, moving towards fixing #84447 (although a LOT more modules have to be changed to fully resolve it). Only HashMaps/HashSets that are being iterated through have been modified, although many structs and traits outside of rustc_incremental had to be modified as well, as they had fields/methods that involved a HashMap/HashSet that would be iterated through.
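For illustration, a small example of the ordering difference using the `indexmap` crate (which the rustc wrapper types are based on); this is not the compiler's own code:

```rust
use indexmap::IndexMap;

fn main() {
    let mut m = IndexMap::new();
    m.insert("b", 2);
    m.insert("a", 1);

    // An IndexMap iterates in insertion order, so this is ["b", "a"] on every
    // run, which keeps incremental-compilation output deterministic.
    let keys: Vec<_> = m.keys().copied().collect();
    assert_eq!(keys, ["b", "a"]);

    // A HashMap's iteration order is unspecified and can change between runs,
    // which is exactly what the `potential_query_instability` lint guards against.
}
```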
I'm making a PR for just 1 module changed to test for performance regressions and such, for future changes I'll either edit this PR to reflect additional modules being converted, or batch multiple modules of changes together and make a PR for each group of modules.
Force all native libraries to be statically linked when linking a static binary
Previously, `#[link]` without an explicit `kind = "static"` would confuse the linker and end up producing a dynamically linked library because of the `-Bdynamic` flag. However this binary would not work correctly anyways since it was linked with startup code for a static binary.
This PR solves this by forcing all native libraries to be statically linked when the output is a static binary that cannot link to dynamic libraries anyways.
Fixes #108878
Fixes #102993
If `-o -` or `--emit KIND=-` is provided, output will be written
to stdout instead. Binary output (`obj`, `llvm-bc`, `link` and
`metadata`) being written this way will result in an error if
stdout is a tty. Multiple output types going to stdout will
trigger an error too, as they would all be mixed together.
Use `load`+`store` instead of `memcpy` for small integer arrays
I was inspired by #98892 to see whether, rather than making `mem::swap` do something smart in the library, we could update MIR assignments like `*_1 = *_2` to do something smarter than `memcpy` for sufficiently-small types that doing it inline is going to be better than a `memcpy` call in assembly anyway. After all, special code may help `mem::swap`, but if the "obvious" MIR can just result in the correct thing that helps everything -- other code like `mem::replace`, people doing it manually, and just passing around by value in general -- as well as makes MIR inlining happier since it doesn't need to deal with all the complicated library code if it just sees a couple assignments.
LLVM will turn the short, known-length `memcpy`s into direct instructions in the backend, but that's too late for it to be able to remove `alloca`s. In general, replacing `memcpy`s with typed instructions is hard in the middle-end -- even for `memcpy.inline` where it knows it won't be a function call -- [due to poison propagation issues](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/memcpy.20vs.20load-store.20for.20MIR.20assignments/near/360376712). So because we know more about the type invariants -- these are typed copies -- rustc can emit something more specific, allowing LLVM to `mem2reg` away the `alloca`s in some situations.
#52051 previously did something like this in the library for `mem::swap`, but it ended up regressing during enabling mir inlining (cbbf06b0cd), so this has been suboptimal on stable for ≈5 releases now.
The code in this PR is narrowly targeted at just integer arrays in LLVM, but works via a new method on the [`LayoutTypeMethods`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.LayoutTypeMethods.html) trait, so specific backends based on cg_ssa can enable this for more situations over time, as we find them. I don't want to try to bite off too much in this PR, though. (Transparent newtypes and simple things like the 3×usize `String` would be obvious candidates for a follow-up.)
Codegen demonstrations: <https://llvm.godbolt.org/z/fK8hT9aqv>
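For reference, a hedged sketch of the kind of Rust source behind the `swap_rgb48` demonstration below (assumed; not necessarily the exact Godbolt input):

```rust
pub fn swap_rgb48(x: &mut [u16; 3], y: &mut [u16; 3]) {
    std::mem::swap(x, y);
}
```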
Before:
```llvm
define void @swap_rgb48_old(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #1 {
%a.i = alloca [3 x i16], align 2
call void @llvm.lifetime.start.p0(i64 6, ptr nonnull %a.i)
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %a.i, ptr noundef nonnull align 2 dereferenceable(6) %x, i64 6, i1 false)
tail call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %x, ptr noundef nonnull align 2 dereferenceable(6) %y, i64 6, i1 false)
call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 2 dereferenceable(6) %y, ptr noundef nonnull align 2 dereferenceable(6) %a.i, i64 6, i1 false)
call void @llvm.lifetime.end.p0(i64 6, ptr nonnull %a.i)
ret void
}
```
Note it going to stack:
```nasm
swap_rgb48_old: # @swap_rgb48_old
movzx eax, word ptr [rdi + 4]
mov word ptr [rsp - 4], ax
mov eax, dword ptr [rdi]
mov dword ptr [rsp - 8], eax
movzx eax, word ptr [rsi + 4]
mov word ptr [rdi + 4], ax
mov eax, dword ptr [rsi]
mov dword ptr [rdi], eax
movzx eax, word ptr [rsp - 4]
mov word ptr [rsi + 4], ax
mov eax, dword ptr [rsp - 8]
mov dword ptr [rsi], eax
ret
```
Now:
```llvm
define void @swap_rgb48(ptr noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #0 {
start:
%0 = load <3 x i16>, ptr %x, align 2
%1 = load <3 x i16>, ptr %y, align 2
store <3 x i16> %1, ptr %x, align 2
store <3 x i16> %0, ptr %y, align 2
ret void
}
```
still lowers to `dword`+`word` operations, but has no stack traffic:
```nasm
swap_rgb48: # @swap_rgb48
mov eax, dword ptr [rdi]
movzx ecx, word ptr [rdi + 4]
movzx edx, word ptr [rsi + 4]
mov r8d, dword ptr [rsi]
mov dword ptr [rdi], r8d
mov word ptr [rdi + 4], dx
mov word ptr [rsi + 4], cx
mov dword ptr [rsi], eax
ret
```
And as a demonstration that this isn't just `mem::swap`, here's a `mem::replace` on a small array (`replace` hasn't used `swap` since #83022), which used to be `memcpy`s in LLVM. Its IR changes to
```llvm
define void @replace_short_array(ptr noalias nocapture noundef sret([3 x i32]) dereferenceable(12) %0, ptr noalias noundef align 4 dereferenceable(12) %r, ptr noalias nocapture noundef readonly dereferenceable(12) %v) unnamed_addr #0 {
start:
%1 = load <3 x i32>, ptr %r, align 4
store <3 x i32> %1, ptr %0, align 4
%2 = load <3 x i32>, ptr %v, align 4
store <3 x i32> %2, ptr %r, align 4
ret void
}
```
but that lowers to reasonable `dword`+`qword` instructions still
```nasm
replace_short_array: # @replace_short_array
mov rax, rdi
mov rcx, qword ptr [rsi]
mov edi, dword ptr [rsi + 8]
mov dword ptr [rax + 8], edi
mov qword ptr [rax], rcx
mov rcx, qword ptr [rdx]
mov edx, dword ptr [rdx + 8]
mov dword ptr [rsi + 8], edx
mov qword ptr [rsi], rcx
ret
```
These tend to have special handling in a bunch of places anyway, so the variant helps remember that. And I think it's easier to grok than non-Scalar Aggregates sometimes being `Immediates` (like I got wrong and caused #109992). As a minor bonus, it means we don't need to generate poison LLVM values for them to pass around in `OperandValue::Immediate`s.
linker: Report linker flavors incompatible with the current target
The linker flavor is checked for target compatibility even if linker is never used (e.g. we are producing a rlib).
If it causes trouble, we can move the check to `link.rs` so it will run if the linker (flavor) is actually used.
And also feature gate explicitly specifying linker flavors for tier 3 targets.
The next step is supporting all the internal linker flavors in user-visible interfaces (command line and json).
offset_of: don't require type to be `Sized`
Fixes#112051
~~The RFC [explicitly forbids](https://rust-lang.github.io/rfcs/3308-offset_of.html#limitations) non-`Sized` types, but it looks like only the fields being recursed into were checked. The sized check also seemed to have been completely missing for tuples~~
change `BorrowKind::Unique` to be a mutating `PlaceContext`
fixes#112056
I believe that `BorrowKind::Unique` is a footgun in general, so I added a FIXME and opened https://github.com/rust-lang/rust/issues/112072. This is a bit too involved for this PR though.
Optimize scalar and scalar pair representations loaded from ByRef in llvm
In https://github.com/rust-lang/rust/pull/105653 I noticed that we were generating suboptimal LLVM IR when a `ConstValue::ByRef` could be represented by a `ScalarPair`. Before that PR this was probably rare, but after it every slice goes down this suboptimal code path, which requires LLVM to untangle a bunch of indirections and static allocations that are only used once to read a scalar pair from.
Go through an intermediate pair of `cc` and `lld` hints instead of mapping CLI options to `LinkerFlavor` directly, and use the target's default linker flavor as a reference.
Each of `{D,Subd}iagnosticMessage::{Str,Eager}` has a comment:
```
// FIXME(davidtwco): can a `Cow<'static, str>` be used here?
```
This commit answers that question in the affirmative. It's not the most
compelling change ever, but it might be worth merging.
This requires changing the `impl<'a> From<&'a str>` impls to `impl
From<&'static str>`, which involves a bunch of knock-on changes that
require/result in call sites being a little more precise about exactly
what kind of string they use to create errors, and not just `&str`. This
will result in fewer unnecessary allocations, though this will not have
any notable perf effects given that these are error paths.
Note that I was lazy within Clippy, using `to_string` in a few places to
preserve the existing string imprecision. I could have used `impl
Into<{D,Subd}iagnosticMessage>` in various places as is done in the
compiler, but that would have required changes to *many* call sites
(mostly changing `&format("...")` to `format!("...")`) which didn't seem
worthwhile.
Add support for LLVM SafeStack
Adds support for LLVM [SafeStack] which provides backward edge control
flow protection by separating the stack into two parts: data which is
only accessed in provably safe ways is allocated on the normal stack
(the "safe stack") and all other data is placed in a separate allocation
(the "unsafe stack").
SafeStack support is enabled by passing `-Zsanitizer=safestack`.
[SafeStack]: https://clang.llvm.org/docs/SafeStack.html
cc `@rcvalle` #39699
Adds support for LLVM [SafeStack] which provides backward edge control
flow protection by separating the stack into two parts: data which is
only accessed in provably safe ways is allocated on the normal stack
(the "safe stack") and all other data is placed in a separate allocation
(the "unsafe stack").
SafeStack support is enabled by passing `-Zsanitizer=safestack`.
[SafeStack]: https://clang.llvm.org/docs/SafeStack.html
Fix linking Mac Catalyst by including LC_BUILD_VERSION in object files
Hello. My first rustc PR!
Issue #106021 prevents Rust code from being linked into Mac Catalyst applications. Apple's LD has started requiring object files to contain version information about the platform they were built for, such as:
* the "deployment target" (minimum supported OS version),
* the SDK version
* the type of the platform (macOS/iOS/catalyst/tvOS/watchOS all have a different number).
This is currently only enforced when building for Mac Catalyst.
Rust uses the `object` crate which added support for including this information starting with `0.31.0`. ~~I upgraded it along with `thorin-dwp` so that everything depends on 0.31.
Apparently 0.31 [pulls in](https://github.com/gimli-rs/object/issues/463) `ruzstd` due to a [new ELF standard](https://maskray.me/blog/2022-09-09-zstd-compressed-debug-sections) because its `compression` feature is enabled by thorin. If you find this objectionable, let me know what the best way to avoid pulling in those dependencies might be.~~
**(`object` upgraded in https://github.com/rust-lang/rust/pull/111413)**
I then added two commits:
* The first one adds very basic, hard-coded support for calling `set_macho_build_version` for `-macabi` (Catalyst) targets, where it claims a deployment target of Catalyst 14.0 and an SDK of 16.2.
* The second weaves the versioning through `rustc_target::spec::TargetOptions`, so that we can stick to specifying all target-related info in one place.
Kudos to `@ara4n` for writing [this gist](https://gist.github.com/ara4n/320a53ea768aba51afad4c9ed2168536).
Ensure Fluent messages are in alphabetical order
Fixes#111847
This adds a tidy check to ensure Fluent messages are in alphabetical order, as well as sorting all existing messages. I think the error could be worded better, would appreciate suggestions.
<details>
<summary>Script used to sort files</summary>
```py
import sys
import re
fn = sys.argv[1]
with open(fn, 'r') as f:
    data = f.read().split("\n")

chunks = []
cur = ""
for line in data:
    if re.match(r"^([a-zA-Z0-9_]+)\s*=\s*", line):
        chunks.append(cur)
        cur = ""
    cur += line + "\n"
chunks.append(cur)
chunks.sort()

with open(fn, 'w') as f:
    f.write(''.join(chunks).strip("\n\n") + "\n")
```
</details>
Support #[global_allocator] without the allocator shim
This makes it possible to use liballoc/libstd in combination with `--emit obj` if you use `#[global_allocator]`. This is what rust-for-linux uses right now and systemd may use in the future. Currently they have to depend on the exact implementation of the allocator shim to create one themself as `--emit obj` doesn't create an allocator shim.
Note that currently the allocator shim also defines the oom error handler, which is normally required too. Once `#![feature(default_alloc_error_handler)]` becomes the only option, this can be avoided. In addition when using only fallible allocator methods and either `--cfg no_global_oom_handling` for liballoc (like rust-for-linux) or `--gc-sections` no references to the oom error handler will exist.
To avoid this feature being insta-stable, you will have to define `__rust_no_alloc_shim_is_unstable` to avoid linker errors.
(Labeling this with both T-compiler and T-lang as it originally involved both an implementation detail and had an insta-stable user facing change. As noted above, the `__rust_no_alloc_shim_is_unstable` symbol requirement should prevent unintended dependence on this unstable feature.)
Use `Option::is_some_and` and `Result::is_ok_and` in the compiler
`.is_some_and(..)`/`.is_ok_and(..)` replace `.map_or(false, ..)` and `.map(..).unwrap_or(false)`, making the code more readable.
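A quick before/after sketch:

```rust
fn main() {
    let opt: Option<u32> = Some(12);

    // Before: the intent is obscured by the `false` default.
    let old = opt.map_or(false, |x| x > 10);

    // After: reads as the question being asked.
    let new = opt.is_some_and(|x| x > 10);

    assert_eq!(old, new);
}
```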
This PR is a sibling of https://github.com/rust-lang/rust/pull/111873#issuecomment-1561316515
Preprocess and cache dominator tree
Preprocessing dominators has a very strong effect for https://github.com/rust-lang/rust/pull/111344.
That pass repeatedly checks that assignments dominate their uses. Using the unprocessed dominator tree caused a quadratic runtime (number of basic blocks × depth of the dominator tree).
This PR also caches the dominator tree and the pre-processed dominators in the MIR cfg cache.
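To illustrate why preprocessing helps, here is a hedged sketch (not the compiler's implementation) of answering dominance queries by walking the immediate-dominator chain versus using precomputed pre/post-order numbers of the dominator tree:

```rust
// O(tree depth) per query: walk b's idom chain looking for a.
fn dominates_naive(idom: &[Option<usize>], a: usize, mut b: usize) -> bool {
    loop {
        if a == b {
            return true;
        }
        match idom[b] {
            Some(parent) => b = parent,
            None => return false,
        }
    }
}

// O(1) per query: a dominates b iff b lies in a's subtree of the dominator
// tree, which a DFS pre/post numbering answers with two comparisons.
fn dominates_fast(pre: &[u32], post: &[u32], a: usize, b: usize) -> bool {
    pre[a] <= pre[b] && post[b] <= post[a]
}
```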
Rebase of https://github.com/rust-lang/rust/pull/107157
cc `@tmiasko`
Don't assume that `-Bdynamic` is the default linker mode
In particular this is false when passing `-static` or `-static-pie` to the linker, which changes the default to `-Bstatic`. This PR ensures we explicitly initialize the correct mode when we first need it.
Fix local libs not included when printing native static libs
This PR fixes https://github.com/rust-lang/rust/issues/111643 by adding the local used libs to the printed `--print=native-static-libs` output.
It seems that `--print=native-static-libs` doesn't have any test, so I added one. It's very simple and doesn't even try to compile the result to a binary, as I don't know how to handle external library linking in CI. (Note that https://github.com/rust-lang/rust/blob/master/tests/run-make/staticlib-dylib-linkage/Makefile does compile to a binary)
r? `@bjorn3`
Fix dependency tracking for debugger visualizers
This PR fixes dependency tracking for debugger visualizer files by changing the `debugger_visualizers` query to an `eval_always` query that scans the AST while it is still available. This way the set of visualizer files is already available when dep-info is emitted. Since the query is turned into an `eval_always` query, dependency tracking will now reliably detect changes to the visualizer script files themselves.
TODO:
- [x] perf.rlo
- [x] Needs a bit more documentation in some places
- [x] Needs regression test for the incr. comp. case
Fixes https://github.com/rust-lang/rust/issues/111226
Fixes https://github.com/rust-lang/rust/issues/111227
Fixes https://github.com/rust-lang/rust/issues/111295
r? `@wesleywiser`
cc `@gibbyfree`
Only depend on CFG_VERSION in rustc_interface
This avoids having to rebuild the whole compiler on each commit when `omit-git-hash = false`.
cc https://github.com/rust-lang/rust/issues/76720 - this won't fix it, and I'm not suggesting we turn this on by default, but it will make it less painful for people who do have `omit-git-hash` on as a workaround.
Support RISC-V unaligned-scalar-mem target feature
This adds `unaligned-scalar-mem` as an allowed RISC-V target feature. Some RISC-V cores support unaligned access to memory without trapping. On such cores, the compiler can significantly improve code size and performance when functions like `core::ptr::read_unaligned::<u32>` are used, by emitting a single load or store instruction with an unaligned address rather than a long sequence of byte load/store/bitmanip instructions.
Enabling the `unaligned-scalar-mem` target feature allows LLVM to do this optimization.
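As a hedged sketch, code like the following is where the difference shows up; with `-C target-feature=+unaligned-scalar-mem`, LLVM can emit a single (possibly unaligned) load instead of a byte-by-byte sequence:

```rust
/// # Safety
/// `p` must be valid for reading 4 bytes; no alignment requirement.
pub unsafe fn read_u32_unaligned(p: *const u8) -> u32 {
    core::ptr::read_unaligned(p.cast::<u32>())
}
```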
Fixes #110883
In particular this is false when passing `-static` or `-static-pie` to
the linker, which changes the default to `-Bstatic`. This PR ensures we
explicitly initialize the correct mode when we first need it.
Remove misleading target feature aliases
Fixes#100752. This is a follow up to #103750. These aliases could not be completely removed until rust-lang/stdarch#1355 landed.
cc `@Amanieu`
Allow MIR debuginfo to point to a variable's address
MIR optimizations currently do not operate on borrowed locals.
When enabling #106285, many borrows will be left as-is because they are used in debuginfo. This pass allows replacing this pattern directly in MIR debuginfo:
```rust
a => _1
_1 = &raw? mut? _2
```
becomes
```rust
a => &_2
// No statement to borrow _2.
```
This pass is implemented as a drive-by in ReferencePropagation MIR pass.
This transformation allows subsequent MIR opts to treat `_2` as an unborrowed local, and optimize it as such, even in builds with debuginfo.
In codegen, when encountering `a => &..&_2`, we create a list of allocas:
```llvm
store ptr %_2.dbg.spill, ptr %a.ref0.dbg.spill
store ptr %a.ref0.dbg.spill, ptr %a.ref1.dbg.spill
...
call void @llvm.dbg.declare(metadata ptr %a.ref{n}.dbg.spill, /* ... */)
```
Caveat: this transformation loses the exact type; we do not differentiate whether `a` is an immutable reference, a mutable reference or a raw pointer. Everything is declared as `*mut` to codegen. I'm not convinced this is a blocker.
Align unsized locals
Allocate extra space for unsized locals and manually align the storage, since `alloca` doesn't support dynamic alignment.
Fixes #71416.
Fixes #71695.
Introduce `DynSend` and `DynSync` auto trait for parallel compiler
part of parallel-rustc #101566
This PR introduces the `DynSend` / `DynSync` traits and the `FromDyn` / `IntoDyn` structures in `rustc_data_structures::marker`. `FromDyn` can dynamically check data structures for thread safety when switching to parallel environments (such as calling `par_for_each_in`). This happens only when `-Z threads > 1`, so it doesn't affect the compile efficiency of single-threaded mode.
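A heavily simplified sketch of the shape of these items (nightly features required; not the actual definitions in `rustc_data_structures`):

```rust
#![feature(auto_traits)]

// Auto traits: most types are `DynSend`/`DynSync` automatically, and types
// that are only conditionally thread-safe can opt out.
pub unsafe auto trait DynSend {}
pub unsafe auto trait DynSync {}

// Wrapper used at the boundary into parallel code. Constructing it is where a
// dynamic thread-safety check can live, which only matters when `-Z threads > 1`.
pub struct FromDyn<T>(T);

impl<T: DynSend> FromDyn<T> {
    pub fn from(value: T) -> Self {
        FromDyn(value)
    }

    pub fn into_inner(self) -> T {
        self.0
    }
}
```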
r? `@cjgillot`
bump windows crate 0.46 -> 0.48
This drops the duplicated version of the crate (0.46), shrinks `rustc_driver.dll` by about 800 KB, and reduces the number of exported functions from 26k to 22k.
While here, this also adds `tidy-alphabetical` sorting to the lists in tidy's allow lists.
You will need to add the following as a replacement for the old `__rust_*`
definitions when not using the alloc shim.
#[no_mangle]
static __rust_no_alloc_shim_is_unstable: u8 = 0;
Isolate coverage FFI type layouts from their underlying LLVM C++ types
I noticed that several of the types used to send coverage information through FFI are not properly isolated from the layout of their corresponding C++ types in the LLVM API.
This PR adds more explicitly-defined FFI struct/enum types in `CoverageMappingWrapper.cpp`, so that Rust source files in `rustc_codegen_ssa` and `rustc_codegen_llvm` aren't directly exposed to LLVM C++ types.
Support linking to rust dylib with --crate-type staticlib
This allows for example dynamically linking libstd, while statically linking the user crate into an executable or C dynamic library. For this, two unstable flags (`-Z staticlib-allow-rdylib-deps` and `-Z staticlib-prefer-dynamic`) are introduced. Without the former you get an error. The latter is the equivalent of `-C prefer-dynamic` for the staticlib crate type, to indicate that dynamic linking is preferred when both options are available, as for libstd. Care must be taken to ensure that no crate ends up being merged into two distinct staticlibs that are linked together. Doing so will cause a linker error at best and undefined behavior at worst. In addition, two distinct staticlibs compiled by different rustc versions may not be combined under any circumstances, due to some rustc-private symbols not being mangled.
To successfully link a staticlib, `--print native-static-libs` can be used while compiling to ask rustc for the linker flags necessary when linking the staticlib. This is an existing flag which previously only listed native libraries. It has been extended to list rust dylibs too. Trying to locate libstd yourself to link against it is not supported and may break if for example the libstd of multiple rustc versions are put in the same directory.
For an example on how to use this see the `src/test/run-make-fulldeps/staticlib-dylib-linkage/` test.
Add GNU Property Note
Fix #103001
Generates the missing property note:
```
Displaying notes found in: .note.gnu.property
Owner Data size Description
GNU 0x00000010 NT_GNU_PROPERTY_TYPE_0 Properties: x86 feature: IBT
```
Make `(try_)subst_and_normalize_erasing_regions` take `EarlyBinder`
Changes `subst_and_normalize_erasing_regions` and `try_subst_and_normalize_erasing_regions` to take `EarlyBinder<T>` instead of `T`.
(related to #105779)
This was suggested by `@BoxyUwU` in https://github.com/rust-lang/rust/pull/107753#discussion_r1105828139. After changing `type_of` to return `EarlyBinder`, there were several places where the binder was immediately skipped to call `tcx.subst_and_normalize_erasing_regions`, only for the binder to be reconstructed inside of that method.
r? `@BoxyUwU`
Operand::extract_field: only cast llval if it's a pointer and replace bitcast w/ pointercast.
Fixes#105439.
Also cc `@erikdesjardins`, looks like another place to clean up as part of #105545
Stabilize raw-dylib, link_ordinal, import_name_type and -Cdlltool
This stabilizes the `raw-dylib` feature (#58713) for all architectures (i.e., `x86` as it is already stable for all other architectures).
Changes:
* Permit the use of the `raw-dylib` link kind for x86, the `link_ordinal` attribute and the `import_name_type` key for the `link` attribute.
* Mark the `raw_dylib` feature as stable.
* Stabilized the `-Zdlltool` argument as `-Cdlltool`.
* Note the path to `dlltool` if invoking it failed (we don't need to do this if `dlltool` returns an error since it prints its path in the error message).
* Adds tests for `-Cdlltool`.
* Adds tests for being unable to find the dlltool executable, and dlltool failing.
* Fixes a bug where we were checking the exit code of dlltool to see if it failed, but dlltool always returns 0 (indicating success), so instead we need to check if anything was written to `stderr`.
NOTE: As previously noted (https://github.com/rust-lang/rust/pull/104218#issuecomment-1315895618) using dlltool within rustc is temporary, but this is not the first time that Rust has added a temporary tool use and argument: https://github.com/rust-lang/rust/pull/104218#issuecomment-1318720482
Big thanks to `@tbu-` for the first version of this PR (#104218)
Add cross-language LLVM CFI support to the Rust compiler
This PR adds cross-language LLVM Control Flow Integrity (CFI) support to the Rust compiler by adding the `-Zsanitizer-cfi-normalize-integers` option to be used with Clang `-fsanitize-cfi-icall-normalize-integers` for normalizing integer types (see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust-compiled code "mixed binaries" (i.e., for when C or C++ and Rust-compiled code share the same virtual address space). For more information about LLVM CFI and cross-language LLVM CFI support for the Rust compiler, see the design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and -Zsanitizer-cfi-normalize-integers, and requires proper (i.e., non-rustc) LTO (i.e., -Clinker-plugin-lto).
Thank you again, `@bjorn3`, `@nikic`, `@samitolvanen`, and the Rust community for all the help!
This commit adds cross-language LLVM Control Flow Integrity (CFI)
support to the Rust compiler by adding the
`-Zsanitizer-cfi-normalize-integers` option to be used with Clang
`-fsanitize-cfi-icall-normalize-integers` for normalizing integer types
(see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust
-compiled code "mixed binaries" (i.e., for when C or C++ and Rust
-compiled code share the same virtual address space). For more
information about LLVM CFI and cross-language LLVM CFI support for the
Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and
-Zsanitizer-cfi-normalize-integers, and requires proper (i.e.,
non-rustc) LTO (i.e., -Clinker-plugin-lto).
Currently a `{D,Subd}iagnosticMessage` can be created from any type that
impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static,
str>`, which are reasonable. It also includes `&String`, which is pretty
weird, and results in many places making unnecessary allocations for
patterns like this:
```
self.fatal(&format!(...))
```
This creates a string with `format!`, takes a reference, passes the
reference to `fatal`, which does an `into()`, which clones the
reference, doing a second allocation. Two allocations for a single
string, bleh.
This commit changes the `From` impls so that you can only create a
`{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static,
str>`. This requires changing all the places that currently create one
from a `&String`. Most of these are of the `&format!(...)` form
described above; each one removes an unnecessary static `&`, plus an
allocation when executed. There are also a few places where the existing
use of `&String` was more reasonable; these now just use `clone()` at
the call site.
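A self-contained model of the effect (names simplified; not the compiler's actual types):

```rust
struct DiagnosticMessage(String); // stand-in for the real message type

impl From<String> for DiagnosticMessage {
    fn from(s: String) -> Self {
        DiagnosticMessage(s)
    }
}

impl From<&'static str> for DiagnosticMessage {
    fn from(s: &'static str) -> Self {
        DiagnosticMessage(s.to_owned())
    }
}
// Crucially, there is no `impl From<&String>`, so the old
// `self.fatal(&format!(...))` pattern no longer compiles; callers now write
// `self.fatal(format!(...))` and the String is moved in without a second copy.

fn fatal(msg: impl Into<DiagnosticMessage>) {
    let _ = msg.into();
}

fn demo(name: &str) {
    fatal(format!("unexpected token: `{name}`")); // one allocation, not two
}
```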
As well as making the code nicer and more efficient, this is a step
towards possibly using `Cow<'static, str>` in
`{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing
the `From<&'a str>` impls to `From<&'static str>`, which is doable, but
I'm not yet sure if it's worthwhile.
Fix unreadable non-UTF-8 output on localized MSVC
Fixes #35785 by converting non-UTF-8 linker output to Unicode using the OEM code page.
Before:
```text
= note: Non-UTF-8 output: LINK : fatal error LNK1181: cannot open input file \'m\x84rchenhaft.obj\'\r\n
```
After:
```text
= note: LINK : fatal error LNK1181: cannot open input file 'märchenhaft.obj'
```
The difference is more dramatic if using a non-ASCII language pack for Windows.
only error combining +whole-archive and +bundle for rlibs
Fixes https://github.com/rust-lang/rust/issues/110912
Checking `flavor == RlibFlavor::Normal` was accidentally lost in 601fc8b36b (https://github.com/rust-lang/rust/pull/105601).
That caused combining +whole-archive and +bundle link modifiers on non-rlib crates to fail with a confusing error message saying that combination is unstable for rlibs. In particular, this caused the build to fail when +whole-archive was used on staticlib crates, even though +whole-archive effectively does nothing on non-bin crates because the final linker invocation is left to an external build system.
cc `@petrochenkov`
Use MIR's `Offset` for pointer `add` too
~~Status: draft while waiting for #110822 to land, since this is built atop that.~~
~~r? `@ghost~~`
Canonical Rust code has mostly moved to `add`/`sub` on pointers, which take `usize`, instead of `offset` which takes `isize`. (And, relatedly, when `sub_ptr` was added it turned out it replaced every single in-tree use of `offset_from`, because `usize` is just so much more useful than `isize` in Rust.)
Unfortunately, `intrinsics::offset` could only accept `*const` and `isize`, so there's a *huge* amount of type conversions back and forth being done. They're identity conversions in the backend, but still end up producing quite a lot of unhelpful MIR.
This PR changes `intrinsics::offset` to accept `*const` *and* `*mut` along with `isize` *and* `usize`. Conveniently, the backends and CTFE already handle this, since MIR's `BinOp::Offset` [already supports all four combinations](adaac6b166/compiler/rustc_const_eval/src/transform/validate.rs (L523-L528)).
To demonstrate the difference, I added some `mir-opt/pre-codegen/` tests around slice indexing. Here's the difference to `[T]::get_mut`, since it uses `<*mut _>::add` internally:
```diff
@@ -79,30 +70,21 @@ fn slice_get_mut_usize(_1: &mut [u32], _2: usize) -> Option<&mut u32> {
StorageLive(_12); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageLive(_9); // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
_9 = _8 as *mut u32 (PtrToPtr); // scope 11 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_13); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _13 = _2 as isize (IntToInt); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_14); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_15); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _15 = _9 as *const u32 (Pointer(MutToConstPointer)); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _14 = Offset(move _15, _13); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_15); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _7 = move _14 as *mut u32 (PtrToPtr); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_14); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_13); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
+ _7 = Offset(_9, _2); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
StorageDead(_9); // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageDead(_12); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageDead(_11); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
```
1c1c8e442a (diff-a841b6a4538657add3f39bc895744331453d0625e7aace128b1f604f0b63c8fdR80)
Fixes https://github.com/rust-lang/rust/issues/110912
Checking `flavor == RlibFlavor::Normal` was accidentally lost in
601fc8b36b (https://github.com/rust-lang/rust/pull/105601)
That caused combining +whole-archive and +bundle link modifiers on
non-rlib crates to fail with a confusing error message saying that
combination is unstable for rlibs. In particular, this caused the
build to fail when +whole-archive was used on staticlib crates, even
though +whole-archive effectively does nothing on non-bin crates because
the final linker invocation is left to an external build system.
Nicer ICE for #67981
Provides a slightly nicer ICE for #67981, documenting the problem. A proper fix will be necessary before `#![feature(unsized_fn_params)]` can be stabilized.
The problem is that the design of the `"rust-call"` ABI is fundamentally not compatible with `unsized_fn_params`. `"rust-call"` functions need to collect their arguments into a tuple, but if the arguments are not `Sized`, said tuple is potentially not even a valid type—and if it is, it requires `alloca` to create.
`@rustbot` label +A-abi +A-codegen +F-unboxed_closures +F-unsized_fn_params
Fixes #35785 by converting non-UTF-8 linker output to Unicode using the OEM code page.
Before:
```text
= note: Non-UTF-8 output: LINK : fatal error LNK1181: cannot open input file \'m\x84rchenhaft.obj\'\r\n
```
After:
```text
= note: LINK : fatal error LNK1181: cannot open input file 'märchenhaft.obj'
```
The difference is more dramatic if using a non-ASCII language pack for Visual Studio.
This adds `unaligned-scalar-mem` as an allowed RISC-V target feature.
Some RISC-V cores support unaligned access to memory without trapping.
On such cores, the compiler could significantly improve code-size and
performance when using functions like core::ptr::read_unaligned<u32>
by emitting a single load or store instruction with an unaligned
address, rather than a long sequence of byte load/store/bitmanip
instructions.
Enabling the `unaligned-scalar-mem` target feature allows LLVM to do
this optimization.
Fixes #110883
Rollup of 10 pull requests
Successful merges:
- #108760 (Add lint to deny diagnostics composed of static strings)
- #109444 (Change tidy error message for TODOs)
- #110419 (Spelling library)
- #110550 (Suggest deref on comparison binop RHS even if type is not Copy)
- #110641 (Add new rustdoc book chapter to describe in-doc settings)
- #110798 (pass `unused_extern_crates` in `librustdoc::doctest::make_test`)
- #110819 (simplify TrustedLen impls)
- #110825 (diagnostics: add test case for already-solved issue)
- #110835 (Make some region folders a little stricter.)
- #110847 (rustdoc-json: Time serialization.)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
They're semantically the same, so this means the backends don't need to handle the intrinsic, and it means fewer MIR basic blocks in pointer arithmetic code.
Report allocation errors as panics
OOM is now reported as a panic but with a custom payload type (`AllocErrorPanicPayload`) which holds the layout that was passed to `handle_alloc_error`.
This should be reviewed one commit at a time:
- The first commit adds `AllocErrorPanicPayload` and changes allocation errors to always be reported as panics.
- The second commit removes `#[alloc_error_handler]` and the `alloc_error_hook` API.
ACP: https://github.com/rust-lang/libs-team/issues/192
Closes #51540
Closes #51245
Add offset_of! macro (RFC 3308)
Implements https://github.com/rust-lang/rfcs/pull/3308 (tracking issue #106655) by adding the built in macro `core::mem::offset_of`. Two of the future possibilities are also implemented:
* Nested field accesses (without array indexing)
* DST support (for `Sized` fields)
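A hedged usage sketch (nightly-only and feature-gated at the time of this PR), including the nested-field form mentioned above:

```rust
#![feature(offset_of)]

use core::mem::offset_of;

#[repr(C)]
struct Inner {
    x: u8,
    y: u32,
}

#[repr(C)]
struct Outer {
    a: u16,
    inner: Inner,
}

fn main() {
    // Plain field access: `y` sits after `x` plus padding.
    assert_eq!(offset_of!(Inner, y), 4);
    // Nested field access, one of the implemented future possibilities.
    assert_eq!(offset_of!(Outer, inner.y), 8);
}
```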
I wrote this a few months ago, before the RFC merged. Now that it's merged, I decided to rebase and finish it.
cc `@thomcc` (RFC author)
Support AIX-style archive type
Reading facility of AIX big archive has been supported by `object` since 0.30.0.
Writing facility of AIX big archive has already been supported by `ar_archive_writer`, but we need to bump the version to support the new archive type enum.
Add `rustc_fluent_macro` to decouple fluent from `rustc_macros`
Fluent, with all the icu4x it brings in, takes quite some time to compile. `fluent_messages!` is only needed in further downstream rustc crates, but is blocking more upstream crates like `rustc_index`. By splitting it out, we allow `rustc_macros` to be compiled earlier, which speeds up `x check compiler` by about 5 seconds (and even more after the needless dependency on `serde_json` is removed from `rustc_data_structures`).
Encode hashes as bytes, not varint
In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than try to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash.
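A hedged illustration of why varint encoding is a poor fit for hash values, whose high bits are uniformly random:

```rust
// Number of bytes LEB128 needs for a given u64.
fn leb128_len(mut v: u64) -> usize {
    let mut n = 1;
    while v >= 0x80 {
        v >>= 7;
        n += 1;
    }
    n
}

fn main() {
    // A 64-bit hash almost always has its top bits set, so LEB128 spends
    // 9-10 bytes where a fixed-width encoding spends exactly 8.
    assert_eq!(leb128_len(u64::MAX), 10);
    assert_eq!(leb128_len(1), 1);
}
```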
Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true`
Before:
```
( 1) 373418203 (53.7%, 53.7%): 1
( 2) 196240113 (28.2%, 81.9%): 3
( 3) 108157958 (15.6%, 97.5%): 2
( 4) 17213120 ( 2.5%, 99.9%): 4
( 5) 223614 ( 0.0%,100.0%): 9
( 6) 216262 ( 0.0%,100.0%): 10
( 7) 15447 ( 0.0%,100.0%): 5
( 8) 3633 ( 0.0%,100.0%): 19
( 9) 3030 ( 0.0%,100.0%): 8
( 10) 1167 ( 0.0%,100.0%): 18
( 11) 1032 ( 0.0%,100.0%): 7
( 12) 1003 ( 0.0%,100.0%): 6
( 13) 10 ( 0.0%,100.0%): 16
( 14) 10 ( 0.0%,100.0%): 17
( 15) 5 ( 0.0%,100.0%): 12
( 16) 4 ( 0.0%,100.0%): 14
```
After:
```
( 1) 372939136 (53.7%, 53.7%): 1
( 2) 196240140 (28.3%, 82.0%): 3
( 3) 108014969 (15.6%, 97.5%): 2
( 4) 17192375 ( 2.5%,100.0%): 4
( 5) 435 ( 0.0%,100.0%): 5
( 6) 83 ( 0.0%,100.0%): 18
( 7) 79 ( 0.0%,100.0%): 10
( 8) 50 ( 0.0%,100.0%): 9
( 9) 6 ( 0.0%,100.0%): 19
```
The remaining 9 or 10 and 18 or 19 are `u64` and `u128` respectively that have the high bits set. As far as I can tell these are coming primarily from `SwitchTargets`.
Fluent, with all the icu4x it brings in, takes quite some time to
compile. `fluent_messages!` is only needed in further downstream rustc
crates, but is blocking more upstream crates like `rustc_index`. By
splitting it out, we allow `rustc_macros` to be compiled earlier, which
speeds up `x check compiler` by about 5 seconds (and even more after the
needless dependency on `serde_json` is removed from
`rustc_data_structures`).
Preserve argument indexes when inlining MIR
We store argument indexes on VarDebugInfo. Unlike the previous method of relying on the variable index to know whether a variable is an argument, this survives MIR inlining.
We also no longer check if var.source_info.scope is the outermost scope. When a function gets inlined, the arguments to the inner function will no longer be in the outermost scope. What we care about though is whether they were in the outermost scope prior to inlining, which we know by whether we assigned an argument index.
Fixes #83217
I considered using `Option<NonZeroU16>` instead of `Option<u16>` to store the index. I didn't because `TypeFoldable` isn't implemented for `NonZeroU16`, and because it looks like, due to padding, it currently wouldn't make any difference. But I indexed from 1 anyway because (a) it'll make it easier if it later becomes worthwhile to use a `NonZeroU16`, and (b) the arguments were previously indexed from 1, so it made for a smaller change.
This is my first PR on rust-lang/rust, so apologies if I've gotten anything not quite right.
Initial support for loongarch64-unknown-linux-gnu
Hi, we hope to add a new port to Rust for LoongArch.
LoongArch intro
LoongArch is a RISC style ISA which is independently designed by Loongson
Technology in China. It is divided into two versions, the 32-bit version (LA32)
and the 64-bit version (LA64). LA64 applications have application-level
backward binary compatibility with LA32 applications. LoongArch is composed of
a basic part (Loongson Base) and an expanded part. The expansion part includes
Loongson Binary Translation (LBT), Loongson VirtualiZation (LVZ), Loongson SIMD
EXtension (LSX) and Loongson Advanced SIMD EXtension (LASX).
Currently the LA464 processor core supports LoongArch ISA and the Loongson
3A5000 processor integrates 4 64-bit LA464 cores. LA464 is a four-issue 64-bit
high-performance processor core. It can be used as a single core for high-end
embedded and desktop applications, or as a basic processor core to form an
on-chip multi-core system for server and high-performance machine applications.
Documentations:
ISA:
https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html
ABI:
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html
More docs can be found at:
https://loongson.github.io/LoongArch-Documentation/README-EN.html
Since last year, we have locally adapted two versions of Rust, 1.41 and 1.57, and completed testing locally.
I wasn't sure whether to submit all the patches at once, so I split them up; here's one of the commits.
Add support for RISC-V relax target feature
This adds `relax` as an allowed RISC-V target feature. The relax feature in LLVM enables [linker relaxation](https://www.sifive.com/blog/all-aboard-part-3-linker-relaxation-in-riscv-toolchain), an optimization specific to RISC-V that allows global variable accesses to be resolved by the linker by using the global pointer (`gp`) register (rather than constructing the addresses from scratch for each access). Enabling `relax` will cause LLVM to emit relocations in the object file that support this. The feature can be enabled in rustc with `-C target-feature=+relax`.
Currently this feature is disabled by default, but maybe it should be enabled by default since it is an easy performance improvement (but requires the `gp` register to be set up properly). GCC/Clang enable this feature by default (for both hosted/bare-metal targets), and include the `-mno-relax` flag to disable it (see [here](466d554dca/clang/lib/Driver/ToolChains/Arch/RISCV.cpp (L145)) for the code that enables it in Clang). I think it would make sense to enable by default, at least for all hosted targets since the `gp` register should be automatically set up by the runtime. For bare-metal targets, `gp` must be set up manually, so it is probably best to leave off by default to avoid breaking existing applications that do not set up `gp`. Leaving it disabled by default for all targets is also reasonable though.
Let me know your thoughts. Thanks!
Fixes #109426.
We store argument indexes on VarDebugInfo. Unlike the previous method of
relying on the variable index to know whether a variable is an argument,
this survives MIR inlining.
We also no longer check if var.source_info.scope is the outermost scope.
When a function gets inlined, the arguments to the inner function will
no longer be in the outermost scope. What we care about though is
whether they were in the outermost scope prior to inlining, which we
know by whether we assigned an argument index.
Fix a couple ICEs in the new `CastKind::Transmute` code
Check the sizes of the immediates, rather than the overall types, when deciding whether we can convert types without going through memory.
Fixes #110005
Fixes #109992
Fixes #110032
cc `@matthiaskrgr`