The resulting profile will include the crate name and will be stored in
the `--out-dir` directory.
This implementation makes it convenient to use LLVM time trace together
with cargo, in the contrast to the previous implementation which would
overwrite profiles or store them in `.cargo/registry/..`.
This commit works around a crash in LLVM when the
`-generate-arange-section` argument is passed to LLVM. An LLVM bug is
opened for this and the code in question is also set to continue passing
this flag with LLVM 14, assuming that this is fixed by the time LLVM 14
comes out. Otherwise this should work around debuginfo crashes on LLVM
13.
Initialize LLVM time trace profiler on each code generation thread
In https://reviews.llvm.org/D71059 LLVM 11, the time trace profiler was
extended to support multiple threads.
`timeTraceProfilerInitialize` creates a thread local profiler instance.
When a thread finishes `timeTraceProfilerFinishThread` moves a thread
local instance into a global collection of instances. Finally when all
codegen work is complete `timeTraceProfilerWrite` writes data from the
current thread local instance and the instances in global collection
of instances.
Previously, the profiler was intialized on a single thread only. Since
this thread performs no code generation on its own, the resulting
profile was empty.
Update LLVM codegen to initialize & finish time trace profiler on each
code generation thread.
cc `@tmandry`
r? `@wesleywiser`
In https://reviews.llvm.org/D71059 LLVM 11, the time trace profiler was
extended to support multiple threads.
`timeTraceProfilerInitialize` creates a thread local profiler instance.
When a thread finishes `timeTraceProfilerFinishThread` moves a thread
local instance into a global collection of instances. Finally when all
codegen work is complete `timeTraceProfilerWrite` writes data from the
current thread local instance and the instances in global collection
of instances.
Previously, the profiler was intialized on a single thread only. Since
this thread performs no code generation on its own, the resulting
profile was empty.
Update LLVM codegen to initialize & finish time trace profiler on each
code generation thread.
Cleanup LLVM multi-threading checks
The support for runtime multi-threading was removed from LLVM. Calls to
`LLVMStartMultithreaded` became no-ops equivalent to checking if LLVM
was compiled with support for threads http://reviews.llvm.org/D4216.
The support for runtime multi-threading was removed from LLVM. Calls to
`LLVMStartMultithreaded` became no-ops equivalent to checking if LLVM
was compiled with support for threads http://reviews.llvm.org/D4216.
[aarch64] add target feature outline-atomics
Enable outline-atomics by default as enabled in clang by the following commit
https://reviews.llvm.org/rGc5e7e649d537067dec7111f3de1430d0fc8a4d11
Performance improves by several orders of magnitude when using the LSE instructions
instead of the ARMv8.0 compatible load/store exclusive instructions.
Tested on Graviton2 aarch64-linux with
x.py build && x.py install && x.py test
Enable outline-atomics by default as enabled in clang by the following commit
https://reviews.llvm.org/rGc5e7e649d537067dec7111f3de1430d0fc8a4d11
Performance improves by several orders of magnitude when using the LSE instructions
instead of the ARMv8.0 compatible load/store exclusive instructions.
Tested on Graviton2 aarch64-linux with
x.py build && x.py install && x.py test
This fixes compiling things like the `snap` crate after
https://reviews.llvm.org/D105462. I added a test that verifies the
additional attribute gets specified, and confirmed that I can build
cargo with both LLVM 13 and 14 with this change applied.
This addresses a codegen-issue that needs to be fixed upstream in LLVM.
While we wait for the fix, we can disable it.
Verified manually that the outliner is no longer run when
`-Copt-level=z` is specified, and also that you can override this with
`-Cllvm-args=-enable-machine-outliner` if you need it anyway.
A regression test is not really feasible in this instance, given that we
do not have any minimal reproducers.
Fixes#85351
Update list of allowed aarch64 features
I recently added these features to std_detect for aarch64 linux, pending [review](https://github.com/rust-lang/stdarch/pull/1146).
I have commented any features not supported by LLVM 9, the current minimum version for Rust. Some (PAuth at least) were renamed between 9 & 12 and I've left them disabled. TME, however, is not in LLVM 9 but I've left it enabled.
See https://github.com/rust-lang/stdarch/issues/993
Since the beginning of time the `-Ctarget-feature` flag on the command
line has largely been passed unmodified to LLVM. Afterwards, though, the
`#[target_feature]` attribute was stabilized and some of the names in
this attribute do not match the corresponding LLVM name. This is because
Rust doesn't always want to stabilize the exact feature name in LLVM for
the equivalent functionality in Rust. This creates a situation, however,
where in Rust you'd write:
#[target_feature(enable = "pclmulqdq")]
unsafe fn foo() {
// ...
}
but on the command line you would write:
RUSTFLAGS="-Ctarget-feature=+pclmul" cargo build --release
This difference is somewhat odd to deal with if you're a newcomer and
the situation may be made worse with upcoming features like [WebAssembly
SIMD](https://github.com/rust-lang/rust/issues/74372) which may be more
prevalent.
This commit implements a mapping to translate requests via
`-Ctarget-feature` through the same name-mapping functionality that's
present for attributes in Rust going to LLVM. This means that
`+pclmulqdq` will work on x86 targets where as previously it did not.
I've attempted to keep this backwards-compatible where the compiler will
just opportunistically attempt to remap features found in
`-Ctarget-feature`, but if there's something it doesn't understand it
gets passed unmodified to LLVM just as it was before.
Import small cold functions
The Rust code is often written under an assumption that for generic
methods inline attribute is mostly unnecessary, since for optimized
builds using ThinLTO, a method will be code generated in at least one
CGU and available for import.
For example, deref implementations for Box, Vec, MutexGuard, and
MutexGuard are not currently marked as inline, neither is identity
implementation of From trait.
In PGO builds, when functions are determined to be cold, the default
multiplier of zero will stop the import, no matter how trivial the
implementation.
Increase slightly the default multiplier from 0 to 0.1.
r? `@ghost`
When cg_llvm encounters the `-Ctarget-cpu=native` it computes an
explciit set of features that applies to the target in order to
correctly compile code for the host CPU (because e.g. `skylake` alone is
not sufficient to tell if some of the instructions are available or
not).
However there were a couple of issues with how we did this. Firstly, the
order in which features were overriden wasn't quite right – conceptually
you'd expect `-Ctarget-cpu=native` option to override the features that
are implicitly set by the target definition. However due to how other
`-Ctarget-cpu` values are handled we must adopt the following order
of priority:
* Features from -Ctarget-cpu=*; are overriden by
* Features implied by --target; are overriden by
* Features from -Ctarget-feature; are overriden by
* function specific features.
Another problem was in that the function level `target-features`
attribute would overwrite the entire set of the globally enabled
features, rather than just the features the
`#[target_feature(enable/disable)]` specified. With something like
`-Ctarget-cpu=native` we'd end up in a situation wherein a function
without `#[target_feature(enable)]` annotation would have a broader
set of features compared to a function with one such attribute. This
turned out to be a cause of heavy run-time regressions in some code
using these function-level attributes in conjunction with
`-Ctarget-cpu=native`, for example.
With this PR rustc is more careful about specifying the entire set of
features for functions that use `#[target_feature(enable/disable)]` or
`#[instruction_set]` attributes.
Sadly testing the original reproducer for this behaviour is quite
impossible – we cannot rely on `-Ctarget-cpu=native` to be anything in
particular on developer or CI machines.
The Rust code is often written under an assumption that for generic
methods inline attribute is mostly unnecessary, since for optimized
builds using ThinLTO, a method will be generated in at least one CGU and
available for import.
For example, deref implementations for Box, Vec, MutexGuard, and
MutexGuard are not currently marked as inline, neither is identity
implementation of From trait.
In PGO builds, when functions are determined to be cold, the default
multiplier of zero will stop the import, even for completely trivial
functions.
Increase slightly the default multiplier from 0 to 0.1 to import them
regardless.
Use Option::unwrap_or instead of open-coding it
r? ```@oli-obk``` Noticed this while we were talking about the other PR just now 😆
```@rustbot``` modify labels +C-cleanup +T-compiler
Updated the list of white-listed target features for x86
This PR both adds in-source documentation on what to look out for when adding a new (X86) feature set and [adds all that are detectable at run-time in Rust stable as of 1.27.0](https://github.com/rust-lang/stdarch/blob/master/crates/std_detect/src/detect/arch/x86.rs).
This should only enable the use of the corresponding LLVM intrinsics.
Actual intrinsics need to be added separately in rust-lang/stdarch.
It also re-orders the run-time-detect test statements to be more consistent
with the actual list of intrinsics whitelisted and removes underscores not present
in the actual names (which might be mistaken as being part of the name)
The reference for LLVM's feature names used is [this file](https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/X86TargetParser.def).
This PR was motivated as the compiler end's part for allowing #67329 to be adressed over on rust-lang/stdarch
Allow making `RUSTC_BOOTSTRAP` conditional on the crate name
Motivation: This came up in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/233931-t-compiler.2Fmajor-changes/topic/Require.20users.20to.20confirm.20they.20know.20RUSTC_.E2.80.A6.20compiler-team.23350/near/208403962) for https://github.com/rust-lang/compiler-team/issues/350.
See also https://github.com/rust-lang/cargo/pull/6608#issuecomment-458546258; this implements https://github.com/rust-lang/cargo/issues/6627.
The goal is for this to eventually allow prohibiting setting `RUSTC_BOOTSTRAP` in build.rs (https://github.com/rust-lang/cargo/issues/7088).
## User-facing changes
- `RUSTC_BOOTSTRAP=1` still works; there is no current plan to remove this.
- Things like `RUSTC_BOOTSTRAP=0` no longer activate nightly features. In practice this shouldn't be a big deal, since `RUSTC_BOOTSTRAP` is the opposite of stable and everyone uses `RUSTC_BOOTSTRAP=1` anyway.
- `RUSTC_BOOTSTRAP=x` will enable nightly features only for crate `x`.
- `RUSTC_BOOTSTRAP=x,y` will enable nightly features only for crates `x` and `y`.
## Implementation changes
The main change is that `UnstableOptions::from_environment` now requires
an (optional) crate name. If the crate name is unknown (`None`), then the new feature is not available and you still have to use `RUSTC_BOOTSTRAP=1`. In practice this means the feature is only available for `--crate-name`, not for `#![crate_name]`; I'm interested in supporting the second but I'm not sure how.
Other major changes:
- Added `Session::is_nightly_build()`, which uses the `crate_name` of
the session
- Added `nightly_options::match_is_nightly_build`, a convenience method
for looking up `--crate-name` from CLI arguments.
`Session::is_nightly_build()`should be preferred where possible, since
it will take into account `#![crate_name]` (I think).
- Added `unstable_features` to `rustdoc::RenderOptions`
I'm not sure whether this counts as T-compiler or T-lang; _technically_ RUSTC_BOOTSTRAP is an implementation detail, but it's been used so much it seems like this counts as a language change too.
r? `@joshtriplett`
cc `@Mark-Simulacrum` `@hsivonen`
This commit grepped for LLVM_VERSION_GE, LLVM_VERSION_LT, get_major_version and
min-llvm-version and statically evaluated every expression possible
(and sensible) assuming that the LLVM version is >=9 now
with an eye on merging `TargetOptions` into `Target`.
`TargetOptions` as a separate structure is mostly an implementation detail of `Target` construction, all its fields logically belong to `Target` and available from `Target` through `Deref` impls.
The main change is that `UnstableOptions::from_environment` now requires
an (optional) crate name. If the crate name is unknown (`None`), then the new feature is not available and you still have to use `RUSTC_BOOTSTRAP=1`. In practice this means the feature is only available for `--crate-name`, not for `#![crate_name]`; I'm interested in supporting the second but I'm not sure how.
Other major changes:
- Added `Session::is_nightly_build()`, which uses the `crate_name` of
the session
- Added `nightly_options::match_is_nightly_build`, a convenience method
for looking up `--crate-name` from CLI arguments.
`Session::is_nightly_build()`should be preferred where possible, since
it will take into account `#![crate_name]` (I think).
- Added `unstable_features` to `rustdoc::RenderOptions`
There is a user-facing change here: things like `RUSTC_BOOTSTRAP=0` no
longer active nightly features. In practice this shouldn't be a big
deal, since `RUSTC_BOOTSTRAP` is the opposite of stable and everyone
uses `RUSTC_BOOTSTRAP=1` anyway.
- Add tests
Check against `Cheat`, not whether nightly features are allowed.
Nightly features are always allowed on the nightly channel.
- Only call `is_nightly_build()` once within a function
- Use booleans consistently for rustc_incremental
Sessions can't be passed through threads, so `read_file` couldn't take a
session. To be consistent, also take a boolean in `write_file_header`.
Updated the added documentation in llvm_util.rs to note which copies of LLVM need to be inspected.
Removed avx512bf16 and avx512vp2intersect because they are unsupported before LLVM 9 with the build with external LLVM 8 being supported
Re-introduced detection testing previously removed for un-requestable features tsc and mmx
This PR both adds in-source documentation on what to look out for
when adding a new (X86) feature set and adds all that are detectable at run-time in Rust stable
as of 1.27.0.
This should only enable the use of the corresponding LLVM intrinsics.
Actual intrinsics need to be added separately in rust-lang/stdarch.
It also re-orders the run-time-detect test statements to be more consistent
with the actual list of intrinsics whitelisted and removes underscores not present
in the actual names (which might be mistaken as being part of the name)
Preparation for a subsequent change that replaces
rustc_target::config::Config with its wrapped Target.
On its own, this commit breaks the build. I don't like making
build-breaking commits, but in this instance I believe that it
makes review easier, as the "real" changes of this PR can be
seen much more easily.
Result of running:
find compiler/ -type f -exec sed -i -e 's/target\.target\([)\.,; ]\)/target\1/g' {} \;
find compiler/ -type f -exec sed -i -e 's/target\.target$/target/g' {} \;
find compiler/ -type f -exec sed -i -e 's/target.ptr_width/target.pointer_width/g' {} \;
./x.py fmt