Re-implement a type-size based limit
r? lcnr
This PR reintroduces the type length limit added in #37789, which was accidentally made practically useless by the caching changes to `Ty::walk` in #72412, which caused the `walk` function to no longer walk over identical elements.
Hitting this length limit is not fatal unless we are in codegen -- so it shouldn't affect passes like the mir inliner which creates potentially very large types (which we observed, for example, when the new trait solver compiles `itertools` in `--release` mode).
This also increases the type length limit from `1048576 == 2 ** 20` to `2 ** 24`, which covers all of the code that can be reached with craterbot-check. Individual crates can increase the length limit further if desired.
Perf regression is mild and I think we should accept it -- reinstating this limit is important for the new trait solver and to make sure we don't accidentally hit more type-size related regressions in the future.
Fixes#125460
Since this codegen flag now only controls LLVM-generated comments rather than
all assembly comments, make the name more accurate (and also match Clang).
`-Z patchable-function-entry` works like `-fpatchable-function-entry`
on clang/gcc. The arguments are total nop count and function offset.
See MCP rust-lang/compiler-team#704
Deprecate no-op codegen option `-Cinline-threshold=...`
This deprecates `-Cinline-threshold` since using it has no effect. This has been the case since the new LLVM pass manager started being used, more than 2 years ago.
Recommend using `-Cllvm-args=--inline-threshold=...` instead.
Closes#89742 which is E-help-wanted.
Add `f16` inline ASM support for 32-bit ARM
Adds `f16` inline ASM support for 32-bit ARM. SIMD vector types are taken from [here](https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:`@navigationhierarchiesreturnbasetype=[float]&f:@navigationhierarchieselementbitsize=[16]&f:@navigationhierarchiesarchitectures=[A32]).`
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Add `f16` inline ASM support for RISC-V
This PR adds `f16` inline ASM support for RISC-V. A `FIXME` is left for `f128` support as LLVM does not support the required `Q` (Quad-Precision Floating-Point) extension yet.
Relevant issue: #125398
Tracking issue: #116909
`@rustbot` label +F-f16_and_f128
Honor collapse_debuginfo for statics.
fixes#126363
<!--
If this PR is related to an unstable feature or an otherwise tracked effort,
please link to the relevant tracking issue here. If you don't know of a related
tracking issue or there are none, feel free to ignore this.
This PR will get automatically assigned to a reviewer. In case you would like
a specific user to review your work, you can assign it to them by using
r? <reviewer name>
-->
Also sort `crt-static` in `--print target-features` output
I didn't find `crt-static` at first (for `x86_64-unknown-linux-gnu`), because it was put at the bottom of the large and otherwise sorted list.
Fully sort the list before we print it.
Note that `llvm_target_features` starts out and remains sorted and does not need to be sorted an extra time.
On my machine the diff is just:
```diff
$ diff -u /tmp/before2.txt /tmp/after2.txt
--- /tmp/before2.txt 2024-06-13 20:40:27.091636592 +0200
+++ /tmp/after2.txt 2024-06-13 20:39:54.584894891 +0200
``@@`` -20,6 +20,7 ``@@``
bmi1 - Support BMI instructions.
bmi2 - Support BMI2 instructions.
cmpxchg16b - 64-bit with cmpxchg16b (this is true for most x86-64 chips, but not the first AMD chips).
+ crt-static - Enables C Run-time Libraries to be statically linked.
ermsb - REP MOVS/STOS are fast.
f16c - Support 16-bit floating point conversion instructions.
fma - Enable three-operand fused multiple-add.
``@@`` -49,7 +50,6 ``@@``
xsavec - Support xsavec instructions.
xsaveopt - Support xsaveopt instructions.
xsaves - Support xsaves instructions.
- crt-static - Enables C Run-time Libraries to be statically linked.
Code-generation features supported by LLVM for this target:
16bit-mode - 16-bit mode (i8086).
```
I couldn't find a ui test that tested this output. Let's see if CI finds a regression tests.
This deprecates `-Cinline-threshold` since using it has no effect. This
has been the case since the new LLVM pass manager started being used,
more than 2 years ago.
I didn't find `crt-static` at first (for `x86_64-unknown-linux-gnu`),
because it was put at the bottom the large and otherwise sorted list.
Fully sort the list before we print it.
Note that `llvm_target_features` starts out sorted and does not need to
be sorted an extra time.
We already do this for a number of crates, e.g. `rustc_middle`,
`rustc_span`, `rustc_metadata`, `rustc_span`, `rustc_errors`.
For the ones we don't, in many cases the attributes are a mess.
- There is no consistency about order of attribute kinds (e.g.
`allow`/`deny`/`feature`).
- Within attribute kind groups (e.g. the `feature` attributes),
sometimes the order is alphabetical, and sometimes there is no
particular order.
- Sometimes the attributes of a particular kind aren't even grouped
all together, e.g. there might be a `feature`, then an `allow`, then
another `feature`.
This commit extends the existing sorting to all compiler crates,
increasing consistency. If any new attribute line is added there is now
only one place it can go -- no need for arbitrary decisions.
Exceptions:
- `rustc_log`, `rustc_next_trait_solver` and `rustc_type_ir_macros`,
because they have no crate attributes.
- `rustc_codegen_gcc`, because it's quasi-external to rustc (e.g. it's
ignored in `rustfmt.toml`).
Directly add extension instead of using `Path::with_extension`
`Path::with_extension` has a nice footgun when the original path doesn't contain an extension: Anything after the last dot gets removed.
Make repr(packed) vectors work with SIMD intrinsics
In #117116 I fixed `#[repr(packed, simd)]` by doing the expected thing and removing padding from the layout. This should be the last step in providing a solution to rust-lang/portable-simd#319
Rollup of 6 pull requests
Successful merges:
- #125263 (rust-lld: fallback to rustc's sysroot if there's no path to the linker in the target sysroot)
- #125345 (rustc_codegen_llvm: add support for writing summary bitcode)
- #125362 (Actually use TAIT instead of emulating it)
- #125412 (Don't suggest adding the unexpected cfgs to the build-script it-self)
- #125445 (Migrate `run-make/rustdoc-with-short-out-dir-option` to `rmake.rs`)
- #125452 (Cleanup check-cfg handling in core and std)
r? `@ghost`
`@rustbot` modify labels: rollup
rustc_codegen_llvm: add support for writing summary bitcode
Typical uses of ThinLTO don't have any use for this as a standalone file, but distributed ThinLTO uses this to make the linker phase more efficient. With clang you'd do something like `clang -flto=thin -fthin-link-bitcode=foo.indexing.o -c foo.c` and then get both foo.o (full of bitcode) and foo.indexing.o (just the summary or index part of the bitcode). That's then usable by a two-stage linking process that's more friendly to distributed build systems like bazel, which is why I'm working on this area.
I talked some to `@teresajohnson` about naming in this area, as things seem to be a little confused between various blog posts and build systems. "bitcode index" and "bitcode summary" tend to be a little too ambiguous, and she tends to use "thin link bitcode" and "minimized bitcode" (which matches the descriptions in LLVM). Since the clang option is thin-link-bitcode, I went with that to try and not add a new spelling in the world.
Per `@dtolnay,` you can work around the lack of this by using `lld --thinlto-index-only` to do the indexing on regular .o files of bitcode, but that is a bit wasteful on actions when we already have all the information in rustc and could just write out the matching minimized bitcode. I didn't test that at all in our infrastructure, because by the time I learned that I already had this patch largely written.
If we don't do this, some versions of LLVM (at least 17, experimentally)
will double-emit some error messages, which is how I noticed this. Given
that it seems to be costing some extra work, let's only request the
summary bitcode production if we'll actually bother writing it down,
otherwise skip it.