LLVM has built-in heuristics for adding stack canaries to functions. These
heuristics can be selected with LLVM function attributes. This patch adds a
rustc option `-Z stack-protector={none,basic,strong,all}` which controls the use
of these attributes. This gives rustc the same stack smash protection support as
clang offers through options `-fno-stack-protector`, `-fstack-protector`,
`-fstack-protector-strong`, and `-fstack-protector-all`. The protection this can
offer is demonstrated in test/ui/abi/stack-protector.rs. This fills a gap in the
current list of rustc exploit
mitigations (https://doc.rust-lang.org/rustc/exploit-mitigations.html),
originally discussed in #15179.
Stack smash protection adds runtime overhead and is therefore still off by
default, but now users have the option to trade performance for security as they
see fit. An example use case is adding Rust code in an existing C/C++ code base
compiled with stack smash protection. Without the ability to add stack smash
protection to the Rust code, the code base artifacts could be exploitable in
ways not possible if the code base remained pure C/C++.
Stack smash protection support is present in LLVM for almost all the current
tier 1/tier 2 targets: see
test/assembly/stack-protector/stack-protector-target-support.rs. The one
exception is nvptx64-nvidia-cuda. This patch follows clang's example, and adds a
warning message printed if stack smash protection is used with this target (see
test/ui/stack-protector/warn-stack-protector-unsupported.rs). Support for tier 3
targets has not been checked.
Since the heuristics are applied at the LLVM level, the heuristics are expected
to add stack smash protection to a fraction of functions comparable to C/C++.
Some experiments demonstrating how Rust code is affected by the different
heuristics can be found in
test/assembly/stack-protector/stack-protector-heuristics-effect.rs. There is
potential for better heuristics using Rust-specific safety information. For
example it might be reasonable to skip stack smash protection in functions which
transitively only use safe Rust code, or which uses only a subset of functions
the user declares safe (such as anything under `std.*`). Such alternative
heuristics could be added at a later point.
LLVM also offers a "safestack" sanitizer as an alternative way to guard against
stack smashing (see #26612). This could possibly also be included as a
stack-protection heuristic. An alternative is to add it as a sanitizer (#39699).
This is what clang does: safestack is exposed with option
`-fsanitize=safe-stack`.
The options are only supported by the LLVM backend, but as with other codegen
options it is visible in the main codegen option help menu. The heuristic names
"basic", "strong", and "all" are hopefully sufficiently generic to be usable in
other backends as well.
Reviewed-by: Nikita Popov <nikic@php.net>
Extra commits during review:
- [address-review] make the stack-protector option unstable
- [address-review] reduce detail level of stack-protector option help text
- [address-review] correct grammar in comment
- [address-review] use compiler flag to avoid merging functions in test
- [address-review] specify min LLVM version in fortanix stack-protector test
Only for Fortanix test, since this target specifically requests the
`--x86-experimental-lvi-inline-asm-hardening` flag.
- [address-review] specify required LLVM components in stack-protector tests
- move stack protector option enum closer to other similar option enums
- rustc_interface/tests: sort debug option list in tracking hash test
- add an explicit `none` stack-protector option
Revert "set LLVM requirements for all stack protector support test revisions"
This reverts commit a49b74f92a4e7d701d6f6cf63d207a8aff2e0f68.
In https://reviews.llvm.org/D71059 LLVM 11, the time trace profiler was
extended to support multiple threads.
`timeTraceProfilerInitialize` creates a thread local profiler instance.
When a thread finishes `timeTraceProfilerFinishThread` moves a thread
local instance into a global collection of instances. Finally when all
codegen work is complete `timeTraceProfilerWrite` writes data from the
current thread local instance and the instances in global collection
of instances.
Previously, the profiler was intialized on a single thread only. Since
this thread performs no code generation on its own, the resulting
profile was empty.
Update LLVM codegen to initialize & finish time trace profiler on each
code generation thread.
Add LLVM CFI support to the Rust compiler
This PR adds LLVM Control Flow Integrity (CFI) support to the Rust compiler. It initially provides forward-edge control flow protection for Rust-compiled code only by aggregating function pointers in groups identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space) will be provided in later work as part of this project by defining and using compatible type identifiers (see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e., -Clto).
Thank you, `@eddyb` and `@pcc,` for all the help!
This commit adds LLVM Control Flow Integrity (CFI) support to the Rust
compiler. It initially provides forward-edge control flow protection for
Rust-compiled code only by aggregating function pointers in groups
identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled
code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code
share the same virtual address space) will be provided in later work as
part of this project by defining and using compatible type identifiers
(see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e.,
-Clto).
Add -Z no-unique-section-names to reduce ELF header bloat.
This change adds a new compiler flag that can help reduce the size of ELF binaries that contain many functions.
By default, when enabling function sections (which is the default for most targets), the LLVM backend will generate different section names for each function. For example, a function `func` would generate a section called `.text.func`. Normally this is fine because the linker will merge all those sections into a single one in the binary. However, starting with [LLVM 12](https://github.com/llvm/llvm-project/commit/ee5d1a04), the backend will also generate unique section names for exception handling, resulting in thousands of `.gcc_except_table.*` sections ending up in the final binary because some linkers like LLD don't currently merge or strip these EH sections (see discussion [here](https://reviews.llvm.org/D83655)). This can bloat the ELF headers and string table significantly in binaries that contain many functions.
The new option is analogous to Clang's `-fno-unique-section-names`, and instructs LLVM to generate the same `.text` and `.gcc_except_table` section for each function, resulting in a smaller final binary.
The motivation to add this new option was because we have a binary that ended up with so many ELF sections (over 65,000) that it broke some existing ELF tools, which couldn't handle so many sections.
Here's our old binary:
```
$ readelf --sections old.elf | head -1
There are 71746 section headers, starting at offset 0x2a246508:
$ readelf --sections old.elf | grep shstrtab
[71742] .shstrtab STRTAB 0000000000000000 2977204c ad44bb 00 0 0 1
```
That's an 11MB+ string table. Here's the new binary using this option:
```
$ readelf --sections new.elf | head -1
There are 43 section headers, starting at offset 0x29143ca8:
$ readelf --sections new.elf | grep shstrtab
[40] .shstrtab STRTAB 0000000000000000 29143acc 0001db 00 0 0 1
```
The whole binary size went down by over 20MB, which is quite significant.
Cleanup LLVM multi-threading checks
The support for runtime multi-threading was removed from LLVM. Calls to
`LLVMStartMultithreaded` became no-ops equivalent to checking if LLVM
was compiled with support for threads http://reviews.llvm.org/D4216.
The support for runtime multi-threading was removed from LLVM. Calls to
`LLVMStartMultithreaded` became no-ops equivalent to checking if LLVM
was compiled with support for threads http://reviews.llvm.org/D4216.
This change adds a new compiler flag that can help reduce the size of
ELF binaries that contain many functions.
By default, when enabling function sections (which is the default for most
targets), the LLVM backend will generate different section names for each
function. For example, a function "func" would generate a section called
".text.func". Normally this is fine because the linker will merge all those
sections into a single one in the binary. However, starting with LLVM 12
(llvm/llvm-project@ee5d1a0), the backend will
also generate unique section names for exception handling, resulting in
thousands of ".gcc_except_table.*" sections ending up in the final binary
because some linkers don't currently merge or strip these EH sections.
This can bloat the ELF headers and string table significantly in
binaries that contain many functions.
The new option is analogous to Clang's -fno-unique-section-names, and
instructs LLVM to generate the same ".text" and ".gcc_except_table"
section for each function, resulting in smaller object files and
potentially a smaller final binary.
Implement `#[link_ordinal(n)]`
Allows the use of `#[link_ordinal(n)]` with `#[link(kind = "raw-dylib")]`, allowing Rust to link against DLLs that export symbols by ordinal rather than by name. As long as the ordinal matches, the name of the function in Rust is not required to match the name of the corresponding function in the exporting DLL.
Part of #58713.
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdoFixes#64892.
Rather than relying on `getPointerElementType()` from LLVM function
pointers, we now pass the function type explicitly when building `call`
or `invoke` instructions.
Use existing declaration of rust_eh_personality
If crate declares `rust_eh_personality`, re-use existing declaration
as otherwise attempts to set function attributes that follow the
declaration will fail (unless it happens to have exactly the same
type signature as the one predefined in the compiler).
Fixes#70117.
Fixes https://github.com/rust-lang/rust/pull/81469#issuecomment-809428126; probably.
This makes load generation compatible with opaque pointers.
The generation of nontemporal copies still accesses the pointer
element type, as fixing this requires more movement.
This does not yet support #[link_name] attributes on functions, the #[link_ordinal]
attribute, #[link(kind = "raw-dylib")] on extern blocks in bin crates, or
stdcall functions on 32-bit x86.
rustc: Store metadata-in-rlibs in object files
This commit updates how rustc compiler metadata is stored in rlibs.
Previously metadata was stored as a raw file that has the same format as
`--emit metadata`. After this commit, however, the metadata is encoded
into a small object file which has one section which is the contents of
the metadata.
The motivation for this commit is to fix a common case where #83730
arises. The problem is that when rustc crates a `dylib` crate type it
needs to include entire rlib files into the dylib, so it passes
`--whole-archive` (or the equivalent) to the linker. The problem with
this, though, is that the linker will attempt to read all files in the
archive. If the metadata file were left as-is (today) then the linker
would generate an error saying it can't read the file. The previous
solution was to alter the rlib just before linking, creating a new
archive in a temporary directory which has the metadata file removed.
This problem from before this commit is now removed if the metadata file
is stored in an object file that the linker can read. The only caveat we
have to take care of is to ensure that the linker never actually
includes the contents of the object file into the final output. We apply
similar tricks as the `.llvmbc` bytecode sections to do this.
This involved changing the metadata loading code a bit, namely updating
some of the LLVM C APIs used to use non-deprecated ones and fiddling
with the lifetimes a bit to get everything to work out. Otherwise though
this isn't intended to be a functional change really, only that metadata
is stored differently in archives now.
This should end up fixing #83730 because by default dylibs will no
longer have their rlib dependencies "altered" meaning that
split-debuginfo will continue to have valid paths pointing at the
original rlibs. (note that we still "alter" rlibs if LTO is enabled to
remove Rust object files and we also "alter" for the #[link(cfg)]
feature, but that's rarely used).
Closes#83730
This commit updates how rustc compiler metadata is stored in rlibs.
Previously metadata was stored as a raw file that has the same format as
`--emit metadata`. After this commit, however, the metadata is encoded
into a small object file which has one section which is the contents of
the metadata.
The motivation for this commit is to fix a common case where #83730
arises. The problem is that when rustc crates a `dylib` crate type it
needs to include entire rlib files into the dylib, so it passes
`--whole-archive` (or the equivalent) to the linker. The problem with
this, though, is that the linker will attempt to read all files in the
archive. If the metadata file were left as-is (today) then the linker
would generate an error saying it can't read the file. The previous
solution was to alter the rlib just before linking, creating a new
archive in a temporary directory which has the metadata file removed.
This problem from before this commit is now removed if the metadata file
is stored in an object file that the linker can read. The only caveat we
have to take care of is to ensure that the linker never actually
includes the contents of the object file into the final output. We apply
similar tricks as the `.llvmbc` bytecode sections to do this.
This involved changing the metadata loading code a bit, namely updating
some of the LLVM C APIs used to use non-deprecated ones and fiddling
with the lifetimes a bit to get everything to work out. Otherwise though
this isn't intended to be a functional change really, only that metadata
is stored differently in archives now.
This should end up fixing #83730 because by default dylibs will no
longer have their rlib dependencies "altered" meaning that
split-debuginfo will continue to have valid paths pointing at the
original rlibs. (note that we still "alter" rlibs if LTO is enabled to
remove Rust object files and we also "alter" for the #[link(cfg)]
feature, but that's rarely used).
Closes#83730
Set dso_local for more items
Related to https://github.com/rust-lang/rust/pull/83592. (cc `@nagisa)`
Noticed that on x86_64 with `relocation-model: static` `R_X86_64_GOTPCREL` relocations were still generated in some cases. (related: https://github.com/Rust-for-Linux/linux/issues/135; Rust-for-Linux needs these fixes to successfully build)
First time doing anything with LLVM so not sure whether this is correct but the following are some of the things I've tried to convince myself.
## C equivalent
Example from clang which also sets `dso_local` in these cases:
`clang-12 -fno-PIC -S -emit-llvm test.c`
```C
extern int A;
int* a() {
return &A;
}
int B;
int* b() {
return &B;
}
```
```
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
`@A` = external dso_local global i32, align 4
`@B` = dso_local global i32 0, align 4
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32* `@a()` #0 {
ret i32* `@A`
}
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32* `@b()` #0 {
ret i32* `@B`
}
attributes #0 = { noinline nounwind optnone uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 12.0.0 (https://github.com/llvm/llvm-project/ b978a93635b584db380274d7c8963c73989944a1)"}
```
`clang-12 -fno-PIC -c test.c`
`objdump test.o -r`:
```
test.o: file format elf64-x86-64
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000006 R_X86_64_64 A
0000000000000016 R_X86_64_64 B
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text
0000000000000040 R_X86_64_PC32 .text+0x0000000000000010
```
## Comparison to pre-LLVM 12 output
`rustc --emit=obj,llvm-ir --target=x86_64-unknown-none-linuxkernel --crate-type rlib test.rs`
```Rust
#![feature(no_core, lang_items)]
#![no_core]
#[lang="sized"]
trait Sized {}
#[lang="sync"]
trait Sync {}
#[lang = "drop_in_place"]
pub unsafe fn drop_in_place<T: ?Sized>(_: *mut T) {}
impl Sync for i32 {}
pub static STATIC: i32 = 32;
extern {
pub static EXT_STATIC: i32;
}
pub fn a() -> &'static i32 {
&STATIC
}
pub fn b() -> &'static i32 {
unsafe {&EXT_STATIC}
}
```
`objdump test.o -r`
nightly-2021-02-20 (rustc target is `x86_64-linux-kernel`):
```
RELOCATION RECORDS FOR [.text._ZN4test1a17h1024ba65f3424175E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S _ZN4test6STATIC17h3adc41a83746c9ffE
RELOCATION RECORDS FOR [.text._ZN4test1b17h86a6a80c1190ac8dE]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S EXT_STATIC
```
nightly-2021-05-10:
```
RELOCATION RECORDS FOR [.text._ZN4test1a17he846f03bf37b2d20E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_GOTPCREL _ZN4test6STATIC17h5a059515bf3d4968E-0x0000000000000004
RELOCATION RECORDS FOR [.text._ZN4test1b17h7e0f7f80fbd91125E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_GOTPCREL EXT_STATIC-0x0000000000000004
```
This PR:
```
RELOCATION RECORDS FOR [.text._ZN4test1a17he846f03bf37b2d20E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S _ZN4test6STATIC17h5a059515bf3d4968E
RELOCATION RECORDS FOR [.text._ZN4test1b17h7e0f7f80fbd91125E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S EXT_STATIC
```
`fast-math` implies things like functions not being able to accept as an
argument or return as a result, say, `inf` which made these functions
confusingly named or behaving incorrectly, depending on how you
interpret it. Since the time when these intrinsics have been implemented
the intrinsics user's (stdsimd) approach has changed significantly and
so now it is required that these intrinsics operate normally rather than
in "whatever" way.
Fixes#84268
Set dso_local for hidden, private and local items
This should probably have no real effect in most cases, as e.g. `hidden`
visibility already implies `dso_local` (or at least LLVM IR does not
preserve the `dso_local` setting if the item is already `hidden`), but
it should fix `-Crelocation-model=static` and improve codegen in
executables.
Note that this PR does not exhaustively port the logic in [clang], only the
portion that is necessary to fix a regression from LLVM 12 that relates to
`-Crelocation_model=static`.
Fixes#83335
[clang]: 3001d080c8/clang/lib/CodeGen/CodeGenModule.cpp (L945-L1039)
This should have no real effect in most cases, as e.g. `hidden`
visibility already implies `dso_local` (or at least LLVM IR does not
preserve the `dso_local` setting if the item is already `hidden`), but
it should fix `-Crelocation-model=static` and improve codegen in
executables.
Note that this PR does not exhaustively port the logic in [clang]. Only
the obviously correct portion and what is necessary to fix a regression
from LLVM 12 that relates to `-Crelocation_model=static`.
Fixes#83335
[clang]: 3001d080c8/clang/lib/CodeGen/CodeGenModule.cpp (L945-L1039)
- Add back `HirIdVec`, with a comment that it will soon be used.
- Add back `*_region` functions, with a comment they may soon be used.
- Remove `-Z borrowck_stats` completely. It didn't do anything.
- Remove `make_nop` completely.
- Add back `current_loc`, which is used by an out-of-tree tool.
- Fix style nits
- Remove `AtomicCell` with `cfg(parallel_compiler)` for consistency.
Found with https://github.com/est31/warnalyzer.
Dubious changes:
- Is anyone else using rustc_apfloat? I feel weird completely deleting
x87 support.
- Maybe some of the dead code in rustc_data_structures, in case someone
wants to use it in the future?
- Don't change rustc_serialize
I plan to scrap most of the json module in the near future (see
https://github.com/rust-lang/compiler-team/issues/418) and fixing the
tests needed more work than I expected.
TODO: check if any of the comments on the deleted code should be kept.
The definition of this struct changes in LLVM 12 due to the addition
of branch coverage support. To avoid future mismatches, declare our
own struct and then convert between them.
This commit adds a new ABI to be selected via `extern
"C-cmse-nonsecure-call"` on function pointers in order for the compiler to
apply the corresponding cmse_nonsecure_call callsite attribute.
For Armv8-M targets supporting TrustZone-M, this will perform a
non-secure function call by saving, clearing and calling a non-secure
function pointer using the BLXNS instruction.
See the page on the unstable book for details.
Signed-off-by: Hugues de Valon <hugues.devalon@arm.com>