rust/compiler/rustc_codegen_gcc
bors 0e7f91b75e Auto merge of #118324 - RalfJung:ctfe-read-only-pointers, r=saethlin
compile-time evaluation: detect writes through immutable pointers

This has two motivations:
- it unblocks https://github.com/rust-lang/rust/pull/116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in https://github.com/rust-lang/rust/pull/117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB

When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.

The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.

I just hope perf works out.
2023-12-07 18:11:01 +00:00
..
.github/workflows Merge commit '2e8386e9fb3506cef991d04f8b3bc78f9a0c2630' into subtree-update_cg_gcc_2023-11-17 2023-11-19 13:42:13 -05:00
build_sysroot Merge commit '2e8386e9fb3506cef991d04f8b3bc78f9a0c2630' into subtree-update_cg_gcc_2023-11-17 2023-11-19 13:42:13 -05:00
build_system Merge commit '2e8386e9fb3506cef991d04f8b3bc78f9a0c2630' into subtree-update_cg_gcc_2023-11-17 2023-11-19 13:42:13 -05:00
crate_patches Merge commit 'e8dca3e87d164d2806098c462c6ce41301341f68' into sync_from_cg_gcc 2022-06-06 22:04:37 -04:00
cross_patches Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
doc Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
example Allow internal_features in rustc_codegen_gcc examples 2023-12-07 15:26:18 +01:00
patches Merge commit '2e8386e9fb3506cef991d04f8b3bc78f9a0c2630' into subtree-update_cg_gcc_2023-11-17 2023-11-19 13:42:13 -05:00
src Auto merge of #118324 - RalfJung:ctfe-read-only-pointers, r=saethlin 2023-12-07 18:11:01 +00:00
tests Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
tools Merge commit '11a0cceab966e5ff1058ddbcab5977e8a1d6d290' into subtree-update_cg_gcc_2023-10-09 2023-10-09 15:53:34 -04:00
.gitignore Merge commit '11a0cceab966e5ff1058ddbcab5977e8a1d6d290' into subtree-update_cg_gcc_2023-10-09 2023-10-09 15:53:34 -04:00
.ignore Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
.rustfmt.toml Merge commit 'e8dca3e87d164d2806098c462c6ce41301341f68' into sync_from_cg_gcc 2022-06-06 22:04:37 -04:00
Cargo.lock Update rustc_codegen_gcc libc 2023-12-07 14:59:37 +01:00
cargo.sh Merge commit '09ce29d0591a21e1abae22eac4d41ffd32993af8' into subtree-update_cg_gcc_2023-10-25 2023-10-27 16:07:01 -04:00
Cargo.toml Disable master feature by default when building rustc_codegen_gcc 2023-11-02 21:03:27 +01:00
clean_all.sh Merge commit 'e8dca3e87d164d2806098c462c6ce41301341f68' into sync_from_cg_gcc 2022-06-06 22:04:37 -04:00
config.sh Pass --sysroot option 2023-11-02 21:03:27 +01:00
failing-lto-tests.txt Merge commit '11a0cceab966e5ff1058ddbcab5977e8a1d6d290' into subtree-update_cg_gcc_2023-10-09 2023-10-09 15:53:34 -04:00
failing-non-lto-tests.txt Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
failing-ui-tests12.txt Merge commit '2e8386e9fb3506cef991d04f8b3bc78f9a0c2630' into subtree-update_cg_gcc_2023-11-17 2023-11-19 13:42:13 -05:00
failing-ui-tests.txt Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
LICENSE-APACHE Add 'compiler/rustc_codegen_gcc/' from commit 'afae271d5d3719eeb92c18bc004bb6d1965a5f3f' 2021-08-12 21:53:49 -04:00
LICENSE-MIT Add 'compiler/rustc_codegen_gcc/' from commit 'afae271d5d3719eeb92c18bc004bb6d1965a5f3f' 2021-08-12 21:53:49 -04:00
messages.ftl Merge commit '11a0cceab966e5ff1058ddbcab5977e8a1d6d290' into subtree-update_cg_gcc_2023-10-09 2023-10-09 15:53:34 -04:00
Readme.md Merge commit 'e4fe941b11a55c5005630696e9b6d81c65f7bd04' into subtree-update_cg_gcc_2023-10-25 2023-10-26 17:42:02 -04:00
rust-toolchain Merge commit '2e8386e9fb3506cef991d04f8b3bc78f9a0c2630' into subtree-update_cg_gcc_2023-11-17 2023-11-19 13:42:13 -05:00
rustup.sh Merge commit '11a0cceab966e5ff1058ddbcab5977e8a1d6d290' into subtree-update_cg_gcc_2023-10-09 2023-10-09 15:53:34 -04:00
test.sh Merge commit '09ce29d0591a21e1abae22eac4d41ffd32993af8' into subtree-update_cg_gcc_2023-10-25 2023-10-27 16:07:01 -04:00
y.sh Merge commit '11a0cceab966e5ff1058ddbcab5977e8a1d6d290' into subtree-update_cg_gcc_2023-10-09 2023-10-09 15:53:34 -04:00

WIP libgccjit codegen backend for rust

Chat on IRC Chat on Matrix

This is a GCC codegen for rustc, which means it can be loaded by the existing rustc frontend, but benefits from GCC: more architectures are supported and GCC's optimizations are used.

Despite its name, libgccjit can be used for ahead-of-time compilation, as is used here.

Motivation

The primary goal of this project is to be able to compile Rust code on platforms unsupported by LLVM. A secondary goal is to check if using the gcc backend will provide any run-time speed improvement for the programs compiled using rustc.

Building

This requires a patched libgccjit in order to work. You need to use my fork of gcc which already includes these patches.

To build it (most of these instructions come from here, so don't hesitate to take a look there if you encounter an issue):

$ git clone https://github.com/antoyo/gcc
$ sudo apt install flex libmpfr-dev libgmp-dev libmpc3 libmpc-dev
$ mkdir gcc-build gcc-install
$ cd gcc-build
$ ../gcc/configure \
    --enable-host-shared \
    --enable-languages=jit \
    --enable-checking=release \ # it enables extra checks which allow to find bugs
    --disable-bootstrap \
    --disable-multilib \
    --prefix=$(pwd)/../gcc-install
$ make -j4 # You can replace `4` with another number depending on how many cores you have.

If you want to run libgccjit tests, you will need to also enable the C++ language in the configure:

--enable-languages=jit,c++

Then to run libgccjit tests:

$ cd gcc # from the `gcc-build` folder
$ make check-jit
# To run one specific test:
$ make check-jit RUNTESTFLAGS="-v -v -v jit.exp=jit.dg/test-asm.cc"

Put the path to your custom build of libgccjit in the file gcc_path.

$ dirname $(readlink -f `find . -name libgccjit.so`) > gcc_path

Then you can run commands like this:

$ ./y.sh prepare # download and patch sysroot src and install hyperfine for benchmarking
$ LIBRARY_PATH=$(cat gcc_path) LD_LIBRARY_PATH=$(cat gcc_path) ./y.sh build --release

To run the tests:

$ ./test.sh --release

Usage

$CG_GCCJIT_DIR is the directory you cloned this repo into in the following instructions:

export CG_GCCJIT_DIR=[the full path to rustc_codegen_gcc]

Cargo

$ CHANNEL="release" $CG_GCCJIT_DIR/cargo.sh run

If you compiled cg_gccjit in debug mode (aka you didn't pass --release to ./test.sh) you should use CHANNEL="debug" instead or omit CHANNEL="release" completely.

LTO

To use LTO, you need to set the variable FAT_LTO=1 and EMBED_LTO_BITCODE=1 in addition to setting lto = "fat" in the Cargo.toml. Don't set FAT_LTO when compiling the sysroot, though: only set EMBED_LTO_BITCODE=1.

Failing to set EMBED_LTO_BITCODE will give you the following error:

error: failed to copy bitcode to object file: No such file or directory (os error 2)

Rustc

You should prefer using the Cargo method.

$ LIBRARY_PATH=$(cat gcc_path) LD_LIBRARY_PATH=$(cat gcc_path) rustc +$(cat $CG_GCCJIT_DIR/rust-toolchain | grep 'channel' | cut -d '=' -f 2 | sed 's/"//g' | sed 's/ //g') -Cpanic=abort -Zcodegen-backend=$CG_GCCJIT_DIR/target/release/librustc_codegen_gcc.so --sysroot $CG_GCCJIT_DIR/build_sysroot/sysroot my_crate.rs

Env vars

CG_GCCJIT_INCR_CACHE_DISABLED
Don't cache object files in the incremental cache. Useful during development of cg_gccjit to make it possible to use incremental mode for all analyses performed by rustc without caching object files when their content should have been changed by a change to cg_gccjit.
CG_GCCJIT_DISPLAY_CG_TIME
Display the time it took to perform codegen for a crate
CG_RUSTFLAGS
Send additional flags to rustc. Can be used to build the sysroot without unwinding by setting `CG_RUSTFLAGS=-Cpanic=abort`.
CG_GCCJIT_DUMP_TO_FILE
Dump a C-like representation to /tmp/gccjit_dumps and enable debug info in order to debug this C-like representation.

Licensing

While this crate is licensed under a dual Apache/MIT license, it links to libgccjit which is under the GPLv3+ and thus, the resulting toolchain (rustc + GCC codegen) will need to be released under the GPL license.

However, programs compiled with rustc_codegen_gcc do not need to be released under a GPL license.

Debugging

Sometimes, libgccjit will crash and output an error like this:

during RTL pass: expand
libgccjit.so: error: in expmed_mode_index, at expmed.h:249
0x7f0da2e61a35 expmed_mode_index
	../../../gcc/gcc/expmed.h:249
0x7f0da2e61aa4 expmed_op_cost_ptr
	../../../gcc/gcc/expmed.h:271
0x7f0da2e620dc sdiv_cost_ptr
	../../../gcc/gcc/expmed.h:540
0x7f0da2e62129 sdiv_cost
	../../../gcc/gcc/expmed.h:558
0x7f0da2e73c12 expand_divmod(int, tree_code, machine_mode, rtx_def*, rtx_def*, rtx_def*, int)
	../../../gcc/gcc/expmed.c:4335
0x7f0da2ea1423 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier)
	../../../gcc/gcc/expr.c:9240
0x7f0da2cd1a1e expand_gimple_stmt_1
	../../../gcc/gcc/cfgexpand.c:3796
0x7f0da2cd1c30 expand_gimple_stmt
	../../../gcc/gcc/cfgexpand.c:3857
0x7f0da2cd90a9 expand_gimple_basic_block
	../../../gcc/gcc/cfgexpand.c:5898
0x7f0da2cdade8 execute
	../../../gcc/gcc/cfgexpand.c:6582

To see the code which causes this error, call the following function:

gcc_jit_context_dump_to_file(ctxt, "/tmp/output.c", 1 /* update_locations */)

This will create a C-like file and add the locations into the IR pointing to this C file. Then, rerun the program and it will output the location in the second line:

libgccjit.so: /tmp/something.c:61322:0: error: in expmed_mode_index, at expmed.h:249

Or add a breakpoint to add_error in gdb and print the line number using:

p loc->m_line
p loc->m_filename->m_buffer

To print a debug representation of a tree:

debug_tree(expr);

(defined in print-tree.h)

To print a debug reprensentation of a gimple struct:

debug_gimple_stmt(gimple_struct)

To get the rustc command to run in gdb, add the --verbose flag to cargo build.

To have the correct file paths in gdb instead of /usr/src/debug/gcc/libstdc++-v3/libsupc++/eh_personality.cc:

Maybe by calling the following at the beginning of gdb:

set substitute-path /usr/src/debug/gcc /path/to/gcc-repo/gcc

TODO(antoyo): but that's not what I remember I was doing.

failed to build archive error

When you get this error:

error: failed to build archive: failed to open object file: No such file or directory (os error 2)

That can be caused by the fact that you try to compile with lto = "fat", but you didn't compile the sysroot with LTO. (Not sure if that's the reason since I cannot reproduce anymore. Maybe it happened when forgetting setting FAT_LTO.)

ld: cannot find crtbegin.o

When compiling an executable with libgccijt, if setting the *LIBRARY_PATH variables to the install directory, you will get the following errors:

ld: cannot find crtbegin.o: No such file or directory
ld: cannot find -lgcc: No such file or directory
ld: cannot find -lgcc: No such file or directory
libgccjit.so: error: error invoking gcc driver

To fix this, set the variables to gcc-build/build/gcc.

How to debug GCC LTO

Run do the command with -v -save-temps and then extract the lto1 line from the output and run that under the debugger.

How to send arguments to the GCC linker

CG_RUSTFLAGS="-Clink-args=-save-temps -v" ../cargo.sh build

How to see the personality functions in the asm dump

CG_RUSTFLAGS="-Clink-arg=-save-temps -v -Clink-arg=-dA" ../cargo.sh build

How to see the LLVM IR for a sysroot crate

cargo build -v --target x86_64-unknown-linux-gnu -Zbuild-std
# Take the command from the output and add --emit=llvm-ir

To prevent the linker from unmangling symbols

Run with:

COLLECT_NO_DEMANGLE=1

How to use a custom-build rustc

  • Build the stage2 compiler (rustup toolchain link debug-current build/x86_64-unknown-linux-gnu/stage2).
  • Clean and rebuild the codegen with debug-current in the file rust-toolchain.

How to install a forked git-subtree

Using git-subtree with rustc requires a patched git to make it work. The PR that is needed is here. Use the following instructions to install it:

git clone git@github.com:tqc/git.git
cd git
git checkout tqc/subtree
make
make install
cd contrib/subtree
make
cp git-subtree ~/bin

Then, do a sync with this command:

PATH="$HOME/bin:$PATH" ~/bin/git-subtree push -P compiler/rustc_codegen_gcc/ ../rustc_codegen_gcc/ sync_branch_name
cd ../rustc_codegen_gcc
git checkout master
git pull
git checkout sync_branch_name
git merge master

To send the changes to the rust repo:

cd ../rust
git pull origin master
git checkout -b subtree-update_cg_gcc_YYYY-MM-DD
PATH="$HOME/bin:$PATH" ~/bin/git-subtree pull --prefix=compiler/rustc_codegen_gcc/ https://github.com/rust-lang/rustc_codegen_gcc.git master
git push

TODO: write a script that does the above.

https://rust-lang.zulipchat.com/#narrow/stream/301329-t-devtools/topic/subtree.20madness/near/258877725

How to use mem-trace

rustc needs to be built without jemalloc so that mem-trace can overload malloc since jemalloc is linked statically, so a LD_PRELOAD-ed library won't a chance to intercept the calls to malloc.

How to generate GIMPLE

If you need to check what gccjit is generating (GIMPLE), then take a look at how to generate it in gimple.md.

How to build a cross-compiling libgccjit

Building libgccjit

Configuring rustc_codegen_gcc

  • Run ./y.sh prepare --cross so that the sysroot is patched for the cross-compiling case.
  • Set the path to the cross-compiling libgccjit in gcc_path.
  • Make sure you have the linker for your target (for instance m68k-unknown-linux-gnu-gcc) in your $PATH. Currently, the linker name is hardcoded as being $TARGET-gcc. Specify the target when building the sysroot: ./y.sh build --target-triple m68k-unknown-linux-gnu.
  • Build your project by specifying the target: OVERWRITE_TARGET_TRIPLE=m68k-unknown-linux-gnu ../cargo.sh build --target m68k-unknown-linux-gnu.

If the target is not yet supported by the Rust compiler, create a target specification file (note that the arch specified in this file must be supported by the rust compiler). Then, you can use it the following way:

  • Add the target specification file using --target as an absolute path to build the sysroot: ./y.sh build --target-triple m68k-unknown-linux-gnu --target $(pwd)/m68k-unknown-linux-gnu.json
  • Build your project by specifying the target specification file: OVERWRITE_TARGET_TRIPLE=m68k-unknown-linux-gnu ../cargo.sh build --target path/to/m68k-unknown-linux-gnu.json.

If you get the following error:

/usr/bin/ld: unrecognised emulation mode: m68kelf

Make sure you set gcc_path to the install directory.