rust/compiler/rustc_codegen_llvm/src/back/write.rs

use crate::back::lto::ThinBuffer;
use crate::back::profiling::{
    selfprofile_after_pass_callback, selfprofile_before_pass_callback, LlvmSelfProfiler,
};
use crate::base;
use crate::common;
use crate::consts;
use crate::llvm::{self, DiagnosticInfo, PassManager, SMDiagnostic};
use crate::llvm_util;
use crate::type_::Type;
use crate::LlvmCodegenBackend;
use crate::ModuleLlvm;
use rustc_codegen_ssa::back::link::ensure_removed;
use rustc_codegen_ssa::back::write::{
    BitcodeSection, CodegenContext, EmitObj, ModuleConfig, TargetMachineFactoryConfig,
    TargetMachineFactoryFn,
};
use rustc_codegen_ssa::traits::*;
use rustc_codegen_ssa::{CompiledModule, ModuleCodegen};
use rustc_data_structures::profiling::SelfProfilerRef;
use rustc_data_structures::small_c_str::SmallCStr;
use rustc_errors::{FatalError, Handler, Level};
use rustc_fs_util::{link_or_copy, path_to_c_string};
use rustc_middle::bug;
use rustc_middle::ty::TyCtxt;
use rustc_session::config::{self, Lto, OutputType, Passes, SwitchWithOptPath};
use rustc_session::Session;
use rustc_span::symbol::sym;
use rustc_span::InnerSpan;
use rustc_target::spec::{CodeModel, RelocModel, SanitizerSet, SplitDebuginfo};
use tracing::debug;
use libc::{c_char, c_int, c_uint, c_void, size_t};
use std::ffi::CString;
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::slice;
use std::str;
use std::sync::Arc;
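
/// Reports a fatal error through the given handler, appending LLVM's most
/// recent error message if one is available.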
pub fn llvm_err(handler: &rustc_errors::Handler, msg: &str) -> FatalError {
    match llvm::last_error() {
        Some(err) => handler.fatal(&format!("{}: {}", msg, err)),
        None => handler.fatal(msg),
    }
}
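
/// Writes the given LLVM module to `output` (and, for split DWARF, optionally
/// to `dwo_output`) using the provided target machine and codegen pass
/// manager, recording the sizes of the emitted artifacts for self-profiling.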
pub fn write_output_file(
    handler: &rustc_errors::Handler,
    target: &'ll llvm::TargetMachine,
    pm: &llvm::PassManager<'ll>,
    m: &'ll llvm::Module,
    output: &Path,
    dwo_output: Option<&Path>,
    file_type: llvm::FileType,
    self_profiler_ref: &SelfProfilerRef,
) -> Result<(), FatalError> {
    unsafe {
        let output_c = path_to_c_string(output);
        let result = if let Some(dwo_output) = dwo_output {
            let dwo_output_c = path_to_c_string(dwo_output);
            llvm::LLVMRustWriteOutputFile(
                target,
                pm,
                m,
                output_c.as_ptr(),
                dwo_output_c.as_ptr(),
                file_type,
            )
        } else {
            llvm::LLVMRustWriteOutputFile(
                target,
                pm,
                m,
                output_c.as_ptr(),
                std::ptr::null(),
                file_type,
            )
        };
        // Record artifact sizes for self-profiling
        if result == llvm::LLVMRustResult::Success {
            let artifact_kind = match file_type {
                llvm::FileType::ObjectFile => "object_file",
                llvm::FileType::AssemblyFile => "assembly_file",
            };
            record_artifact_size(self_profiler_ref, artifact_kind, output);
            if let Some(dwo_file) = dwo_output {
                record_artifact_size(self_profiler_ref, "dwo_file", dwo_file);
            }
        }
        result.into_result().map_err(|()| {
            let msg = format!("could not write output to {}", output.display());
            llvm_err(handler, &msg)
        })
    }
}
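
/// Creates a target machine that is not tied to any particular codegen unit;
/// it is used to query target-level information and is built with
/// optimizations disabled.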
pub fn create_informational_target_machine(sess: &Session) -> &'static mut llvm::TargetMachine {
    let config = TargetMachineFactoryConfig { split_dwarf_file: None };
    target_machine_factory(sess, config::OptLevel::No)(config)
        .unwrap_or_else(|err| llvm_err(sess.diagnostic(), &err).raise())
}
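
/// Creates the target machine used to compile the codegen unit `mod_name`,
/// honoring the session's split-debuginfo and backend optimization settings.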
pub fn create_target_machine(tcx: TyCtxt<'_>, mod_name: &str) -> &'static mut llvm::TargetMachine {
    let split_dwarf_file = if tcx.sess.target_can_use_split_dwarf() {
        tcx.output_filenames(()).split_dwarf_path(tcx.sess.split_debuginfo(), Some(mod_name))
    } else {
        None
    };
    let config = TargetMachineFactoryConfig { split_dwarf_file };
    target_machine_factory(tcx.sess, tcx.backend_optimization_level(()))(config)
        .unwrap_or_else(|err| llvm_err(tcx.sess.diagnostic(), &err).raise())
}
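
/// Maps rustc's `-C opt-level` to the LLVM codegen optimization level and
/// size-optimization setting used when configuring a target machine.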
pub fn to_llvm_opt_settings(
    cfg: config::OptLevel,
) -> (llvm::CodeGenOptLevel, llvm::CodeGenOptSize) {
    use self::config::OptLevel::*;
    match cfg {
        No => (llvm::CodeGenOptLevel::None, llvm::CodeGenOptSizeNone),
        Less => (llvm::CodeGenOptLevel::Less, llvm::CodeGenOptSizeNone),
        Default => (llvm::CodeGenOptLevel::Default, llvm::CodeGenOptSizeNone),
        Aggressive => (llvm::CodeGenOptLevel::Aggressive, llvm::CodeGenOptSizeNone),
        Size => (llvm::CodeGenOptLevel::Default, llvm::CodeGenOptSizeDefault),
        SizeMin => (llvm::CodeGenOptLevel::Default, llvm::CodeGenOptSizeAggressive),
    }
}
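
/// Maps rustc's `-C opt-level` to the optimization level understood by LLVM's
/// pass builder (the new pass manager).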
fn to_pass_builder_opt_level(cfg: config::OptLevel) -> llvm::PassBuilderOptLevel {
    use config::OptLevel::*;
    match cfg {
        No => llvm::PassBuilderOptLevel::O0,
        Less => llvm::PassBuilderOptLevel::O1,
        Default => llvm::PassBuilderOptLevel::O2,
        Aggressive => llvm::PassBuilderOptLevel::O3,
        Size => llvm::PassBuilderOptLevel::Os,
        SizeMin => llvm::PassBuilderOptLevel::Oz,
    }
}
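
/// Translates rustc's relocation model into the corresponding LLVM relocation
/// model.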
fn to_llvm_relocation_model(relocation_model: RelocModel) -> llvm::RelocModel {
    match relocation_model {
        RelocModel::Static => llvm::RelocModel::Static,
        // LLVM doesn't have a PIE relocation model; it represents PIE as PIC
        // with an extra attribute.
        RelocModel::Pic | RelocModel::Pie => llvm::RelocModel::PIC,
        RelocModel::DynamicNoPic => llvm::RelocModel::DynamicNoPic,
        RelocModel::Ropi => llvm::RelocModel::ROPI,
        RelocModel::Rwpi => llvm::RelocModel::RWPI,
        RelocModel::RopiRwpi => llvm::RelocModel::ROPI_RWPI,
    }
}
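
/// Translates rustc's optional code model into the corresponding LLVM code
/// model, falling back to LLVM's default when none was specified.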
pub(crate) fn to_llvm_code_model(code_model: Option<CodeModel>) -> llvm::CodeModel {
    match code_model {
        Some(CodeModel::Tiny) => llvm::CodeModel::Tiny,
        Some(CodeModel::Small) => llvm::CodeModel::Small,
        Some(CodeModel::Kernel) => llvm::CodeModel::Kernel,
        Some(CodeModel::Medium) => llvm::CodeModel::Medium,
        Some(CodeModel::Large) => llvm::CodeModel::Large,
        None => llvm::CodeModel::None,
    }
}
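
/// Builds a closure that creates LLVM target machines on demand. All
/// session-dependent settings (target triple, CPU, features, relocation and
/// code models, and various debugging options) are captured up front so the
/// factory can later be invoked from codegen worker threads.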
pub fn target_machine_factory(
    sess: &Session,
    optlvl: config::OptLevel,
) -> TargetMachineFactoryFn<LlvmCodegenBackend> {
    let reloc_model = to_llvm_relocation_model(sess.relocation_model());
    let (opt_level, _) = to_llvm_opt_settings(optlvl);
    let use_softfp = sess.opts.cg.soft_float;
    let ffunction_sections =
        sess.opts.debugging_opts.function_sections.unwrap_or(sess.target.function_sections);
    let fdata_sections = ffunction_sections;
    let funique_section_names = !sess.opts.debugging_opts.no_unique_section_names;
    let code_model = to_llvm_code_model(sess.code_model());
    let mut singlethread = sess.target.singlethread;
    // On the wasm target, once the `atomics` feature is enabled we are no
    // longer single-threaded, and we don't want LLVM to lower atomic
    // operations to single-threaded operations in that case.
    if singlethread && sess.target.is_like_wasm && sess.target_features.contains(&sym::atomics) {
        singlethread = false;
    }
    let triple = SmallCStr::new(&sess.target.llvm_target);
    let cpu = SmallCStr::new(llvm_util::target_cpu(sess));
    let features = llvm_util::llvm_global_features(sess).join(",");
    let features = CString::new(features).unwrap();
    let abi = SmallCStr::new(&sess.target.llvm_abiname);
    let trap_unreachable =
        sess.opts.debugging_opts.trap_unreachable.unwrap_or(sess.target.trap_unreachable);
    let emit_stack_size_section = sess.opts.debugging_opts.emit_stack_sizes;
    let asm_comments = sess.asm_comments();
    let relax_elf_relocations =
        sess.opts.debugging_opts.relax_elf_relocations.unwrap_or(sess.target.relax_elf_relocations);
    let use_init_array =
        !sess.opts.debugging_opts.use_ctors_section.unwrap_or(sess.target.use_ctors_section);

    Arc::new(move |config: TargetMachineFactoryConfig| {
        let split_dwarf_file = config.split_dwarf_file.unwrap_or_default();
        let split_dwarf_file = CString::new(split_dwarf_file.to_str().unwrap()).unwrap();
        let tm = unsafe {
            llvm::LLVMRustCreateTargetMachine(
                triple.as_ptr(),
                cpu.as_ptr(),
                features.as_ptr(),
                abi.as_ptr(),
                code_model,
                reloc_model,
                opt_level,
                use_softfp,
                ffunction_sections,
                fdata_sections,
                funique_section_names,
                trap_unreachable,
                singlethread,
                asm_comments,
                emit_stack_size_section,
                relax_elf_relocations,
                use_init_array,
                split_dwarf_file.as_ptr(),
            )
        };
        tm.ok_or_else(|| {
            format!("Could not create LLVM TargetMachine for triple: {}", triple.to_str().unwrap())
        })
    })
}
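
/// If `-C save-temps` is enabled, writes the module's current LLVM bitcode to
/// a `*.<name>.bc` temporary file for its codegen unit; otherwise does nothing.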
pub(crate) fn save_temp_bitcode(
    cgcx: &CodegenContext<LlvmCodegenBackend>,
    module: &ModuleCodegen<ModuleLlvm>,
    name: &str,
) {
    if !cgcx.save_temps {
        return;
    }
    unsafe {
        let ext = format!("{}.bc", name);
        let cgu = Some(&module.name[..]);
        let path = cgcx.output_filenames.temp_path_ext(&ext, cgu);
        let cstr = path_to_c_string(&path);
        let llmod = module.module_llvm.llmod();
        llvm::LLVMWriteBitcodeToFile(llmod, cstr.as_ptr());
    }
}
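
/// RAII guard that installs rustc's LLVM diagnostic and inline-assembly
/// handlers on an LLVM context and restores the previous handler on drop.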
pub struct DiagnosticHandlers<'a> {
    data: *mut (&'a CodegenContext<LlvmCodegenBackend>, &'a Handler),
    llcx: &'a llvm::Context,
    old_handler: Option<&'a llvm::DiagnosticHandler>,
}
impl<'a> DiagnosticHandlers<'a> {
    pub fn new(
        cgcx: &'a CodegenContext<LlvmCodegenBackend>,
        handler: &'a Handler,
        llcx: &'a llvm::Context,
    ) -> Self {
        let remark_passes_all: bool;
        let remark_passes: Vec<CString>;
        match &cgcx.remark {
            Passes::All => {
                remark_passes_all = true;
                remark_passes = Vec::new();
            }
            Passes::Some(passes) => {
                remark_passes_all = false;
                remark_passes =
                    passes.iter().map(|name| CString::new(name.as_str()).unwrap()).collect();
            }
        };
        let remark_passes: Vec<*const c_char> =
            remark_passes.iter().map(|name: &CString| name.as_ptr()).collect();
        let data = Box::into_raw(Box::new((cgcx, handler)));
        unsafe {
            let old_handler = llvm::LLVMRustContextGetDiagnosticHandler(llcx);
            llvm::LLVMRustContextConfigureDiagnosticHandler(
                llcx,
                diagnostic_handler,
                data.cast(),
                remark_passes_all,
                remark_passes.as_ptr(),
                remark_passes.len(),
            );
            llvm::LLVMRustSetInlineAsmDiagnosticHandler(llcx, inline_asm_handler, data.cast());
            DiagnosticHandlers { data, llcx, old_handler }
        }
    }
}

impl<'a> Drop for DiagnosticHandlers<'a> {
    fn drop(&mut self) {
        use std::ptr::null_mut;
        unsafe {
            llvm::LLVMRustSetInlineAsmDiagnosticHandler(self.llcx, inline_asm_handler, null_mut());
            llvm::LLVMRustContextSetDiagnosticHandler(self.llcx, self.old_handler);
            drop(Box::from_raw(self.data));
        }
    }
}
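
/// Forwards an inline-assembly diagnostic from LLVM to rustc's shared
/// diagnostic emitter, mapping the LLVM severity to a rustc `Level`.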
fn report_inline_asm(
    cgcx: &CodegenContext<LlvmCodegenBackend>,
    msg: String,
    level: llvm::DiagnosticLevel,
    mut cookie: c_uint,
    source: Option<(String, Vec<InnerSpan>)>,
) {
    // In LTO builds we may get srcloc values from other crates, which are
    // invalid since they use a different source map. To be safe we just
    // suppress these in LTO builds.
    if matches!(cgcx.lto, Lto::Fat | Lto::Thin) {
        cookie = 0;
    }
    let level = match level {
        llvm::DiagnosticLevel::Error => Level::Error { lint: false },
        llvm::DiagnosticLevel::Warning => Level::Warning,
        llvm::DiagnosticLevel::Note | llvm::DiagnosticLevel::Remark => Level::Note,
    };
    cgcx.diag_emitter.inline_asm_error(cookie as u32, msg, level, source);
}
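
/// C callback registered with LLVM for inline-assembly (`SMDiagnostic`)
/// diagnostics; unpacks the diagnostic and hands it to `report_inline_asm`.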
unsafe extern "C" fn inline_asm_handler(diag: &SMDiagnostic, user: *const c_void, cookie: c_uint) {
    if user.is_null() {
        return;
}
let (cgcx, _) = *(user as *const (&CodegenContext<LlvmCodegenBackend>, &Handler));
let smdiag = llvm::diagnostic::SrcMgrDiagnostic::unpack(diag);
report_inline_asm(cgcx, smdiag.message, smdiag.level, cookie, smdiag.source);
}
unsafe extern "C" fn diagnostic_handler(info: &DiagnosticInfo, user: *mut c_void) {
if user.is_null() {
return;
}
let (cgcx, diag_handler) = *(user as *const (&CodegenContext<LlvmCodegenBackend>, &Handler));
match llvm::diagnostic::Diagnostic::unpack(info) {
llvm::diagnostic::InlineAsm(inline) => {
report_inline_asm(cgcx, inline.message, inline.level, inline.cookie, inline.source);
}
llvm::diagnostic::Optimization(opt) => {
let enabled = match cgcx.remark {
Passes::All => true,
Passes::Some(ref v) => v.iter().any(|s| *s == opt.pass_name),
};
if enabled {
diag_handler.note_without_error(&format!(
"{}:{}:{}: {}: {}",
opt.filename, opt.line, opt.column, opt.pass_name, opt.message,
));
}
}
llvm::diagnostic::PGO(diagnostic_ref) | llvm::diagnostic::Linker(diagnostic_ref) => {
let msg = llvm::build_string(|s| {
llvm::LLVMRustWriteDiagnosticInfoToString(diagnostic_ref, s)
})
.expect("non-UTF8 diagnostic");
diag_handler.warn(&msg);
}
llvm::diagnostic::Unsupported(diagnostic_ref) => {
let msg = llvm::build_string(|s| {
llvm::LLVMRustWriteDiagnosticInfoToString(diagnostic_ref, s)
})
.expect("non-UTF8 diagnostic");
diag_handler.err(&msg);
}
llvm::diagnostic::UnknownDiagnostic(..) => {}
}
}
fn get_pgo_gen_path(config: &ModuleConfig) -> Option<CString> {
match config.pgo_gen {
SwitchWithOptPath::Enabled(ref opt_dir_path) => {
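// LLVM's profiling runtime expands `%m` in the file name to a unique
// module signature, so at run time this typically becomes something
// like `default_<signature>.profraw`.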
let path = if let Some(dir_path) = opt_dir_path {
dir_path.join("default_%m.profraw")
} else {
PathBuf::from("default_%m.profraw")
};
Some(CString::new(format!("{}", path.display())).unwrap())
}
SwitchWithOptPath::Disabled => None,
}
}
fn get_pgo_use_path(config: &ModuleConfig) -> Option<CString> {
config
.pgo_use
.as_ref()
.map(|path_buf| CString::new(path_buf.to_string_lossy().as_bytes()).unwrap())
}
fn get_pgo_sample_use_path(config: &ModuleConfig) -> Option<CString> {
config
.pgo_sample_use
.as_ref()
.map(|path_buf| CString::new(path_buf.to_string_lossy().as_bytes()).unwrap())
}
pub(crate) fn should_use_new_llvm_pass_manager(
cgcx: &CodegenContext<LlvmCodegenBackend>,
config: &ModuleConfig,
) -> bool {
// The new pass manager is enabled by default for LLVM >= 13.
// This matches Clang, which has also enabled it since Clang 13.
// FIXME: There are some perf issues with the new pass manager
// when targeting s390x, so it is temporarily disabled for that
// arch, see https://github.com/rust-lang/rust/issues/89609
config
.new_llvm_pass_manager
.unwrap_or_else(|| cgcx.target_arch != "s390x" && llvm_util::get_version() >= (13, 0, 0))
}
pub(crate) unsafe fn optimize_with_new_llvm_pass_manager(
cgcx: &CodegenContext<LlvmCodegenBackend>,
diag_handler: &Handler,
module: &ModuleCodegen<ModuleLlvm>,
config: &ModuleConfig,
opt_level: config::OptLevel,
opt_stage: llvm::OptStage,
) -> Result<(), FatalError> {
let unroll_loops =
opt_level != config::OptLevel::Size && opt_level != config::OptLevel::SizeMin;
let using_thin_buffers = opt_stage == llvm::OptStage::PreLinkThinLTO || config.bitcode_needed();
let pgo_gen_path = get_pgo_gen_path(config);
let pgo_use_path = get_pgo_use_path(config);
let pgo_sample_use_path = get_pgo_sample_use_path(config);
let is_lto = opt_stage == llvm::OptStage::ThinLTO || opt_stage == llvm::OptStage::FatLTO;
// Sanitizer instrumentation is only inserted during the pre-link optimization stage.
let sanitizer_options = if !is_lto {
Some(llvm::SanitizerOptions {
sanitize_address: config.sanitizer.contains(SanitizerSet::ADDRESS),
sanitize_address_recover: config.sanitizer_recover.contains(SanitizerSet::ADDRESS),
sanitize_memory: config.sanitizer.contains(SanitizerSet::MEMORY),
sanitize_memory_recover: config.sanitizer_recover.contains(SanitizerSet::MEMORY),
sanitize_memory_track_origins: config.sanitizer_memory_track_origins as c_int,
sanitize_thread: config.sanitizer.contains(SanitizerSet::THREAD),
sanitize_hwaddress: config.sanitizer.contains(SanitizerSet::HWADDRESS),
sanitize_hwaddress_recover: config.sanitizer_recover.contains(SanitizerSet::HWADDRESS),
})
} else {
None
};
let mut llvm_profiler = if cgcx.prof.llvm_recording_enabled() {
Some(LlvmSelfProfiler::new(cgcx.prof.get_self_profiler().unwrap()))
} else {
None
};
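// Hand the profiler to LLVM as an opaque pointer; the self-profile
// callbacks passed to the pass manager below cast it back to an
// `LlvmSelfProfiler` around each pass.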
let llvm_selfprofiler =
llvm_profiler.as_mut().map(|s| s as *mut _ as *mut c_void).unwrap_or(std::ptr::null_mut());
let extra_passes = config.passes.join(",");
// FIXME: NewPM doesn't provide a facility to pass custom InlineParams.
// We would have to add upstream support for this first, before we can support
// config.inline_threshold and our more aggressive default thresholds.
let result = llvm::LLVMRustOptimizeWithNewPassManager(
module.module_llvm.llmod(),
&*module.module_llvm.tm,
to_pass_builder_opt_level(opt_level),
opt_stage,
config.no_prepopulate_passes,
config.verify_llvm_ir,
using_thin_buffers,
config.merge_functions,
unroll_loops,
config.vectorize_slp,
config.vectorize_loop,
config.no_builtins,
config.emit_lifetime_markers,
sanitizer_options.as_ref(),
pgo_gen_path.as_ref().map_or(std::ptr::null(), |s| s.as_ptr()),
pgo_use_path.as_ref().map_or(std::ptr::null(), |s| s.as_ptr()),
config.instrument_coverage,
config.instrument_gcov,
pgo_sample_use_path.as_ref().map_or(std::ptr::null(), |s| s.as_ptr()),
config.debug_info_for_profiling,
llvm_selfprofiler,
selfprofile_before_pass_callback,
selfprofile_after_pass_callback,
extra_passes.as_ptr().cast(),
extra_passes.len(),
);
result.into_result().map_err(|()| llvm_err(diag_handler, "failed to run LLVM passes"))
}
// Unsafe due to LLVM calls.
pub(crate) unsafe fn optimize(
cgcx: &CodegenContext<LlvmCodegenBackend>,
diag_handler: &Handler,
module: &ModuleCodegen<ModuleLlvm>,
config: &ModuleConfig,
) -> Result<(), FatalError> {
let _timer = cgcx.prof.generic_activity_with_arg("LLVM_module_optimize", &module.name[..]);
let llmod = module.module_llvm.llmod();
let llcx = &*module.module_llvm.llcx;
let tm = &*module.module_llvm.tm;
let _handlers = DiagnosticHandlers::new(cgcx, diag_handler, llcx);
let module_name = module.name.clone();
let module_name = Some(&module_name[..]);
if config.emit_no_opt_bc {
let out = cgcx.output_filenames.temp_path_ext("no-opt.bc", module_name);
let out = path_to_c_string(&out);
llvm::LLVMWriteBitcodeToFile(llmod, out.as_ptr());
}
if let Some(opt_level) = config.opt_level {
if should_use_new_llvm_pass_manager(cgcx, config) {
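// Map rustc's LTO mode onto an LLVM optimization stage. A
// `-Clinker-plugin-lto` build is treated like pre-link ThinLTO here,
// since the actual LTO step happens later in the linker.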
let opt_stage = match cgcx.lto {
Lto::Fat => llvm::OptStage::PreLinkFatLTO,
Lto::Thin | Lto::ThinLocal => llvm::OptStage::PreLinkThinLTO,
_ if cgcx.opts.cg.linker_plugin_lto.enabled() => llvm::OptStage::PreLinkThinLTO,
_ => llvm::OptStage::PreLinkNoLTO,
};
return optimize_with_new_llvm_pass_manager(
cgcx,
diag_handler,
module,
config,
opt_level,
opt_stage,
);
}
if cgcx.prof.llvm_recording_enabled() {
diag_handler
.warn("`-Z self-profile-events = llvm` requires `-Z new-llvm-pass-manager`");
}
// Create the two optimizing pass managers. These mirror what clang
// does, and are populated by LLVM's default PassManagerBuilder.
// Each manager has a different set of passes, but they also share
// some common passes.
let fpm = llvm::LLVMCreateFunctionPassManagerForModule(llmod);
let mpm = llvm::LLVMCreatePassManager();
{
let find_pass = |pass_name: &str| {
let pass_name = SmallCStr::new(pass_name);
llvm::LLVMRustFindAndCreatePass(pass_name.as_ptr())
};
if config.verify_llvm_ir {
// Verification should run as the very first pass.
llvm::LLVMRustAddPass(fpm, find_pass("verify").unwrap());
}
let mut extra_passes = Vec::new();
let mut have_name_anon_globals_pass = false;
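// Track whether `name-anon-globals` ends up in the pipeline; it has to
// run whenever ThinLTO buffers/bitcode are produced (see below).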
for pass_name in &config.passes {
if pass_name == "lint" {
// Linting should also be performed early, directly on the generated IR.
llvm::LLVMRustAddPass(fpm, find_pass("lint").unwrap());
continue;
}
if let Some(pass) = find_pass(pass_name) {
extra_passes.push(pass);
} else {
diag_handler.warn(&format!("unknown pass `{}`, ignoring", pass_name));
}
if pass_name == "name-anon-globals" {
have_name_anon_globals_pass = true;
}
}
// Instrumentation must be inserted before optimization,
// otherwise LLVM may optimize some functions away, which
// breaks llvm-cov.
//
// This mirrors what Clang does in lib/CodeGen/BackendUtil.cpp.
if config.instrument_gcov {
llvm::LLVMRustAddPass(mpm, find_pass("insert-gcov-profiling").unwrap());
}
if config.instrument_coverage {
llvm::LLVMRustAddPass(mpm, find_pass("instrprof").unwrap());
}
if config.debug_info_for_profiling {
llvm::LLVMRustAddPass(mpm, find_pass("add-discriminators").unwrap());
}
add_sanitizer_passes(config, &mut extra_passes);
// Some options cause LLVM bitcode to be emitted, which uses ThinLTOBuffers, so we need
// to make sure we run LLVM's NameAnonGlobals pass when emitting bitcode; otherwise
// we'll get errors in LLVM.
let using_thin_buffers = config.bitcode_needed();
if !config.no_prepopulate_passes {
llvm::LLVMAddAnalysisPasses(tm, fpm);
llvm::LLVMAddAnalysisPasses(tm, mpm);
let opt_level = to_llvm_opt_settings(opt_level).0;
let prepare_for_thin_lto = cgcx.lto == Lto::Thin
|| cgcx.lto == Lto::ThinLocal
|| (cgcx.lto != Lto::Fat && cgcx.opts.cg.linker_plugin_lto.enabled());
with_llvm_pmb(llmod, config, opt_level, prepare_for_thin_lto, &mut |b| {
llvm::LLVMRustAddLastExtensionPasses(
b,
extra_passes.as_ptr(),
extra_passes.len() as size_t,
);
llvm::LLVMPassManagerBuilderPopulateFunctionPassManager(b, fpm);
llvm::LLVMPassManagerBuilderPopulateModulePassManager(b, mpm);
});
have_name_anon_globals_pass = have_name_anon_globals_pass || prepare_for_thin_lto;
if using_thin_buffers && !prepare_for_thin_lto {
llvm::LLVMRustAddPass(mpm, find_pass("name-anon-globals").unwrap());
have_name_anon_globals_pass = true;
}
} else {
// If we don't use the standard pipeline, directly populate the MPM
// with the extra passes.
for pass in extra_passes {
llvm::LLVMRustAddPass(mpm, pass);
}
}
if using_thin_buffers && !have_name_anon_globals_pass {
// As described above, this will probably cause an error in LLVM
if config.no_prepopulate_passes {
diag_handler.err(
"The current compilation is going to use thin LTO buffers \
without running LLVM's NameAnonGlobals pass. \
This will likely cause errors in LLVM. Consider adding \
-C passes=name-anon-globals to the compiler command line.",
);
} else {
bug!(
"We are using thin LTO buffers without running the NameAnonGlobals pass. \
This will likely cause errors in LLVM and should never happen."
);
}
}
}
diag_handler.abort_if_errors();
// Finally, run the actual optimization passes
{
let _timer = cgcx.prof.extra_verbose_generic_activity(
"LLVM_module_optimize_function_passes",
&module.name[..],
);
llvm::LLVMRustRunFunctionPassManager(fpm, llmod);
}
{
let _timer = cgcx.prof.extra_verbose_generic_activity(
"LLVM_module_optimize_module_passes",
&module.name[..],
);
llvm::LLVMRunPassManager(mpm, llmod);
}
// Deallocate managers that we're now done with
llvm::LLVMDisposePassManager(fpm);
llvm::LLVMDisposePassManager(mpm);
}
Ok(())
}
unsafe fn add_sanitizer_passes(config: &ModuleConfig, passes: &mut Vec<&'static mut llvm::Pass>) {
if config.sanitizer.contains(SanitizerSet::ADDRESS) {
let recover = config.sanitizer_recover.contains(SanitizerSet::ADDRESS);
passes.push(llvm::LLVMRustCreateAddressSanitizerFunctionPass(recover));
passes.push(llvm::LLVMRustCreateModuleAddressSanitizerPass(recover));
}
if config.sanitizer.contains(SanitizerSet::MEMORY) {
let track_origins = config.sanitizer_memory_track_origins as c_int;
let recover = config.sanitizer_recover.contains(SanitizerSet::MEMORY);
passes.push(llvm::LLVMRustCreateMemorySanitizerPass(track_origins, recover));
}
if config.sanitizer.contains(SanitizerSet::THREAD) {
passes.push(llvm::LLVMRustCreateThreadSanitizerPass());
}
if config.sanitizer.contains(SanitizerSet::HWADDRESS) {
let recover = config.sanitizer_recover.contains(SanitizerSet::HWADDRESS);
passes.push(llvm::LLVMRustCreateHWAddressSanitizerPass(recover));
}
}
pub(crate) fn link(
cgcx: &CodegenContext<LlvmCodegenBackend>,
diag_handler: &Handler,
mut modules: Vec<ModuleCodegen<ModuleLlvm>>,
) -> Result<ModuleCodegen<ModuleLlvm>, FatalError> {
use super::lto::{Linker, ModuleBuffer};
// Sort the modules by name to ensure deterministic behavior.
modules.sort_by(|a, b| a.name.cmp(&b.name));
let (first, elements) =
modules.split_first().expect("Bug! modules must contain at least one module.");
let mut linker = Linker::new(first.module_llvm.llmod());
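// The first module acts as the link destination; each remaining module
// is serialized into a bitcode buffer and merged into it.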
for module in elements {
let _timer =
cgcx.prof.generic_activity_with_arg("LLVM_link_module", format!("{:?}", module.name));
let buffer = ModuleBuffer::new(module.module_llvm.llmod());
linker.add(buffer.data()).map_err(|()| {
let msg = format!("failed to serialize module {:?}", module.name);
llvm_err(diag_handler, &msg)
})?;
}
drop(linker);
Ok(modules.remove(0))
}
pub(crate) unsafe fn codegen(
cgcx: &CodegenContext<LlvmCodegenBackend>,
diag_handler: &Handler,
module: ModuleCodegen<ModuleLlvm>,
config: &ModuleConfig,
) -> Result<CompiledModule, FatalError> {
let _timer = cgcx.prof.generic_activity_with_arg("LLVM_module_codegen", &module.name[..]);
{
let llmod = module.module_llvm.llmod();
let llcx = &*module.module_llvm.llcx;
let tm = &*module.module_llvm.tm;
let module_name = module.name.clone();
let module_name = Some(&module_name[..]);
let handlers = DiagnosticHandlers::new(cgcx, diag_handler, llcx);
if cgcx.msvc_imps_needed {
create_msvc_imps(cgcx, llcx, llmod);
}
// A codegen-specific pass manager is used to generate object
// files for an LLVM module.
//
// Apparently each of these pass managers is a one-shot kind of
// thing, so we create a new one for each type of output. The
// pass manager passed to the closure must not escape the closure
// itself, and each manager should only be used once.
unsafe fn with_codegen<'ll, F, R>(
tm: &'ll llvm::TargetMachine,
llmod: &'ll llvm::Module,
no_builtins: bool,
f: F,
) -> R
where
F: FnOnce(&'ll mut PassManager<'ll>) -> R,
{
let cpm = llvm::LLVMCreatePassManager();
llvm::LLVMAddAnalysisPasses(tm, cpm);
llvm::LLVMRustAddLibraryInfo(cpm, llmod, no_builtins);
f(cpm)
}
// Two things to note:
// - If object files are just LLVM bitcode we write bitcode, copy it to
// the .o file, and delete the bitcode if it wasn't otherwise
// requested.
// - If we don't have the integrated assembler then we need to emit
// asm from LLVM and use `gcc` to create the object file.
let bc_out = cgcx.output_filenames.temp_path(OutputType::Bitcode, module_name);
let obj_out = cgcx.output_filenames.temp_path(OutputType::Object, module_name);
if config.bitcode_needed() {
let _timer = cgcx
.prof
.generic_activity_with_arg("LLVM_module_codegen_make_bitcode", &module.name[..]);
let thin = ThinBuffer::new(llmod);
let data = thin.data();
if let Some(bitcode_filename) = bc_out.file_name() {
cgcx.prof.artifact_size(
"llvm_bitcode",
bitcode_filename.to_string_lossy(),
data.len() as u64,
);
}
if config.emit_bc || config.emit_obj == EmitObj::Bitcode {
let _timer = cgcx.prof.generic_activity_with_arg(
"LLVM_module_codegen_emit_bitcode",
&module.name[..],
);
if let Err(e) = fs::write(&bc_out, data) {
let msg = format!("failed to write bytecode to {}: {}", bc_out.display(), e);
diag_handler.err(&msg);
}
}
if config.emit_obj == EmitObj::ObjectCode(BitcodeSection::Full) {
let _timer = cgcx.prof.generic_activity_with_arg(
"LLVM_module_codegen_embed_bitcode",
&module.name[..],
);
embed_bitcode(cgcx, llcx, llmod, &config.bc_cmdline, data);
}
}
if config.emit_ir {
let _timer = cgcx
.prof
.generic_activity_with_arg("LLVM_module_codegen_emit_ir", &module.name[..]);
let out = cgcx.output_filenames.temp_path(OutputType::LlvmAssembly, module_name);
let out_c = path_to_c_string(&out);
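// Callback used while printing the module: given a mangled symbol name,
// it writes the demangled form into the provided buffer and returns the
// number of bytes written, or 0 if the input is not a Rust symbol or the
// buffer is too small.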
extern "C" fn demangle_callback(
input_ptr: *const c_char,
input_len: size_t,
output_ptr: *mut c_char,
output_len: size_t,
) -> size_t {
let input =
unsafe { slice::from_raw_parts(input_ptr as *const u8, input_len as usize) };
let input = match str::from_utf8(input) {
Ok(s) => s,
Err(_) => return 0,
};
let output = unsafe {
slice::from_raw_parts_mut(output_ptr as *mut u8, output_len as usize)
};
let mut cursor = io::Cursor::new(output);
let demangled = match rustc_demangle::try_demangle(input) {
Ok(d) => d,
Err(_) => return 0,
};
if write!(cursor, "{:#}", demangled).is_err() {
// Possible only if provided buffer is not big enough
return 0;
}
cursor.position() as size_t
}
let result = llvm::LLVMRustPrintModule(llmod, out_c.as_ptr(), demangle_callback);
if result == llvm::LLVMRustResult::Success {
record_artifact_size(&cgcx.prof, "llvm_ir", &out);
}
result.into_result().map_err(|()| {
let msg = format!("failed to write LLVM IR to {}", out.display());
llvm_err(diag_handler, &msg)
})?;
}
if config.emit_asm {
let _timer = cgcx
.prof
.generic_activity_with_arg("LLVM_module_codegen_emit_asm", &module.name[..]);
let path = cgcx.output_filenames.temp_path(OutputType::Assembly, module_name);
// We can't use the same module for asm and object code output,
// because that triggers various errors like invalid IR or broken
// binaries. So we must clone the module to produce the asm output
// if we are also producing object code.
let llmod = if let EmitObj::ObjectCode(_) = config.emit_obj {
llvm::LLVMCloneModule(llmod)
} else {
llmod
};
with_codegen(tm, llmod, config.no_builtins, |cpm| {
write_output_file(
diag_handler,
tm,
cpm,
llmod,
&path,
None,
llvm::FileType::AssemblyFile,
&cgcx.prof,
)
})?;
}
match config.emit_obj {
EmitObj::ObjectCode(_) => {
let _timer = cgcx
.prof
.generic_activity_with_arg("LLVM_module_codegen_emit_obj", &module.name[..]);
let dwo_out = cgcx.output_filenames.temp_path_dwo(module_name);
let dwo_out = match cgcx.split_debuginfo {
// Don't change how DWARF is emitted in single mode (or when disabled).
SplitDebuginfo::Off | SplitDebuginfo::Packed => None,
// Emit (a subset of the) DWARF into a separate file in split mode.
SplitDebuginfo::Unpacked => {
if cgcx.target_can_use_split_dwarf {
Some(dwo_out.as_path())
} else {
None
}
}
};
with_codegen(tm, llmod, config.no_builtins, |cpm| {
write_output_file(
diag_handler,
tm,
cpm,
llmod,
&obj_out,
dwo_out,
llvm::FileType::ObjectFile,
&cgcx.prof,
)
})?;
}
EmitObj::Bitcode => {
debug!("copying bitcode {:?} to obj {:?}", bc_out, obj_out);
if let Err(e) = link_or_copy(&bc_out, &obj_out) {
diag_handler.err(&format!("failed to copy bitcode to object file: {}", e));
}
if !config.emit_bc {
debug!("removing_bitcode {:?}", bc_out);
ensure_removed(diag_handler, &bc_out);
}
}
EmitObj::None => {}
}
drop(handlers);
}
Ok(module.into_compiled_module(
config.emit_obj != EmitObj::None,
cgcx.target_can_use_split_dwarf && cgcx.split_debuginfo == SplitDebuginfo::Unpacked,
config.emit_bc,
&cgcx.output_filenames,
))
}
/// Embed the bitcode of an LLVM module in the LLVM module itself.
///
/// This is done primarily for iOS where it appears to be standard to compile C
/// code at least with `-fembed-bitcode` which creates two sections in the
/// executable:
///
/// * __LLVM,__bitcode
/// * __LLVM,__cmdline
///
/// It appears *both* of these sections are necessary to get the linker to
/// recognize what's going on. A suitable cmdline value is taken from the
/// target spec.
///
/// Furthermore, debug/O1 builds don't actually embed bitcode but rather just
/// embed an empty section.
///
/// Basically all of this is us attempting to follow in the footsteps of clang
/// on iOS. See #35968 for lots more info.
unsafe fn embed_bitcode(
cgcx: &CodegenContext<LlvmCodegenBackend>,
llcx: &llvm::Context,
llmod: &llvm::Module,
cmdline: &str,
bitcode: &[u8],
) {
let llconst = common::bytes_in_context(llcx, bitcode);
let llglobal = llvm::LLVMAddGlobal(
llmod,
common::val_ty(llconst),
"rustc.embedded.module\0".as_ptr().cast(),
);
llvm::LLVMSetInitializer(llglobal, llconst);
let is_apple = cgcx.opts.target_triple.triple().contains("-ios")
|| cgcx.opts.target_triple.triple().contains("-darwin")
|| cgcx.opts.target_triple.triple().contains("-tvos");
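// Note the embedded `\0` terminators in the names above and below: the
// LLVM C API expects NUL-terminated strings for symbol and section names.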
let section = if is_apple { "__LLVM,__bitcode\0" } else { ".llvmbc\0" };
llvm::LLVMSetSection(llglobal, section.as_ptr().cast());
llvm::LLVMRustSetLinkage(llglobal, llvm::Linkage::PrivateLinkage);
llvm::LLVMSetGlobalConstant(llglobal, llvm::True);
let llconst = common::bytes_in_context(llcx, cmdline.as_bytes());
let llglobal = llvm::LLVMAddGlobal(
llmod,
common::val_ty(llconst),
"rustc.embedded.cmdline\0".as_ptr().cast(),
);
llvm::LLVMSetInitializer(llglobal, llconst);
let section = if is_apple { "__LLVM,__cmdline\0" } else { ".llvmcmd\0" };
llvm::LLVMSetSection(llglobal, section.as_ptr().cast());
llvm::LLVMRustSetLinkage(llglobal, llvm::Linkage::PrivateLinkage);
// We're adding custom sections to the output object file, but we definitely
// do not want these custom sections to make their way into the final linked
// executable. The purpose of these custom sections is for tooling
// surrounding object files to work with the LLVM IR, if necessary. For
// example rustc's own LTO will look for LLVM IR inside of the object file
// in these sections by default.
//
// To handle this is a bit different depending on the object file format
// used by the backend, broken down into a few different categories:
//
// * Mach-O - this is for macOS. Inspecting the source code for the native
// linker here shows that the `.llvmbc` and `.llvmcmd` sections are
// automatically skipped by the linker. In that case there's nothing extra
// that we need to do here.
//
// * Wasm - the native LLD linker is hard-coded to skip `.llvmbc` and
// `.llvmcmd` sections, so there's nothing extra we need to do.
//
// * COFF - if we don't do anything the linker will by default copy all
// these sections to the output artifact, not what we want! To subvert
// this we want to flag the sections we inserted here as
// `IMAGE_SCN_LNK_REMOVE`. Unfortunately though LLVM has no native way to
// do this. Thankfully though we can do this with some inline assembly,
// which is easy enough to add via module-level global inline asm.
//
// * ELF - this is very similar to COFF above. One difference is that these
// sections are removed from the output linked artifact when
// `--gc-sections` is passed, which we pass by default. If that flag isn't
// passed though then these sections will show up in the final output.
// Additionally the flag that we need to set here is `SHF_EXCLUDE`.
if is_apple
|| cgcx.opts.target_triple.triple().starts_with("wasm")
|| cgcx.opts.target_triple.triple().starts_with("asmjs")
{
// nothing to do here
} else if cgcx.is_pe_coff {
let asm = "
.section .llvmbc,\"n\"
.section .llvmcmd,\"n\"
";
llvm::LLVMRustAppendModuleInlineAsm(llmod, asm.as_ptr().cast(), asm.len());
} else {
let asm = "
.section .llvmbc,\"e\"
.section .llvmcmd,\"e\"
";
llvm::LLVMRustAppendModuleInlineAsm(llmod, asm.as_ptr().cast(), asm.len());
}
}
pub unsafe fn with_llvm_pmb(
llmod: &llvm::Module,
config: &ModuleConfig,
opt_level: llvm::CodeGenOptLevel,
prepare_for_thin_lto: bool,
f: &mut dyn FnMut(&llvm::PassManagerBuilder),
) {
use std::ptr;
// Create the PassManagerBuilder for LLVM. We configure it with
// reasonable defaults and prepare it to actually populate the pass
// manager.
let builder = llvm::LLVMPassManagerBuilderCreate();
let opt_size = config.opt_size.map_or(llvm::CodeGenOptSizeNone, |x| to_llvm_opt_settings(x).1);
let inline_threshold = config.inline_threshold;
let pgo_gen_path = get_pgo_gen_path(config);
let pgo_use_path = get_pgo_use_path(config);
let pgo_sample_use_path = get_pgo_sample_use_path(config);
llvm::LLVMRustConfigurePassManagerBuilder(
builder,
opt_level,
config.merge_functions,
config.vectorize_slp,
config.vectorize_loop,
prepare_for_thin_lto,
pgo_gen_path.as_ref().map_or(ptr::null(), |s| s.as_ptr()),
pgo_use_path.as_ref().map_or(ptr::null(), |s| s.as_ptr()),
pgo_sample_use_path.as_ref().map_or(ptr::null(), |s| s.as_ptr()),
);
llvm::LLVMPassManagerBuilderSetSizeLevel(builder, opt_size as u32);
if opt_size != llvm::CodeGenOptSizeNone {
llvm::LLVMPassManagerBuilderSetDisableUnrollLoops(builder, 1);
}
llvm::LLVMRustAddBuilderLibraryInfo(builder, llmod, config.no_builtins);
// Here we match what clang does (kinda). For O0 we only inline
// always-inline functions (but don't add lifetime intrinsics), at O1 we
// inline with lifetime intrinsics, and at O2+ we add an inliner with
// thresholds copied from clang.
match (opt_level, opt_size, inline_threshold) {
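// An explicit `-C inline-threshold` takes precedence; otherwise the
// threshold is derived from the opt/size level, mirroring clang's
// defaults.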
(.., Some(t)) => {
llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, t);
}
(llvm::CodeGenOptLevel::Aggressive, ..) => {
llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, 275);
}
(_, llvm::CodeGenOptSizeDefault, _) => {
llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, 75);
}
(_, llvm::CodeGenOptSizeAggressive, _) => {
llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, 25);
}
(llvm::CodeGenOptLevel::None, ..) => {
llvm::LLVMRustAddAlwaysInlinePass(builder, config.emit_lifetime_markers);
}
(llvm::CodeGenOptLevel::Less, ..) => {
llvm::LLVMRustAddAlwaysInlinePass(builder, config.emit_lifetime_markers);
}
(llvm::CodeGenOptLevel::Default, ..) => {
llvm::LLVMPassManagerBuilderUseInlinerWithThreshold(builder, 225);
}
}
f(builder);
llvm::LLVMPassManagerBuilderDispose(builder);
}
// Create a `__imp_<symbol> = &symbol` global for every public static `symbol`.
// This is required to satisfy `dllimport` references to static data in .rlibs
// when using the MSVC linker. We do this only for data, as the linker can fix up
// code references on its own.
// See #26591, #27438
fn create_msvc_imps(
cgcx: &CodegenContext<LlvmCodegenBackend>,
llcx: &llvm::Context,
llmod: &llvm::Module,
) {
if !cgcx.msvc_imps_needed {
return;
}
// The x86 ABI seems to require that leading underscores are added to symbol
// names, so we need an extra underscore on x86. There's also a leading
// '\x01' here which disables LLVM's symbol mangling (e.g., no extra
// underscores added in front).
let prefix = if cgcx.target_arch == "x86" { "\x01__imp__" } else { "\x01__imp_" };
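// With this prefix, a static named `FOO` gets a companion global
// `__imp_FOO` (`__imp__FOO` on x86) pointing at it, matching what the
// MSVC linker expects for `dllimport` data references.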
unsafe {
let i8p_ty = Type::i8p_llcx(llcx);
let globals = base::iter_globals(llmod)
.filter(|&val| {
llvm::LLVMRustGetLinkage(val) == llvm::Linkage::ExternalLinkage
&& llvm::LLVMIsDeclaration(val) == 0
})
.filter_map(|val| {
// Exclude some symbols that we know are not Rust symbols.
let name = llvm::get_value_name(val);
if ignored(name) { None } else { Some((val, name)) }
})
.map(move |(val, name)| {
let mut imp_name = prefix.as_bytes().to_vec();
imp_name.extend(name);
let imp_name = CString::new(imp_name).unwrap();
(imp_name, val)
})
.collect::<Vec<_>>();
for (imp_name, val) in globals {
let imp = llvm::LLVMAddGlobal(llmod, i8p_ty, imp_name.as_ptr().cast());
llvm::LLVMSetInitializer(imp, consts::ptrcast(val, i8p_ty));
llvm::LLVMRustSetLinkage(imp, llvm::Linkage::ExternalLinkage);
}
}
// Use this function to exclude certain symbols from `__imp` generation.
fn ignored(symbol_name: &[u8]) -> bool {
// These are symbols generated by LLVM's profiling instrumentation
symbol_name.starts_with(b"__llvm_profile_")
}
}
fn record_artifact_size(
self_profiler_ref: &SelfProfilerRef,
artifact_kind: &'static str,
path: &Path,
) {
// Don't stat the file if we are not going to record its size.
if !self_profiler_ref.enabled() {
return;
}
if let Some(artifact_name) = path.file_name() {
let file_size = std::fs::metadata(path).map(|m| m.len()).unwrap_or(0);
self_profiler_ref.artifact_size(artifact_kind, artifact_name.to_string_lossy(), file_size);
}
}