rust/src/librustc_codegen_llvm/base.rs

203 lines
7.6 KiB
Rust
Raw Normal View History

2018-05-08 13:10:16 +00:00
//! Codegen the completed AST to the LLVM IR.
//!
2018-05-08 13:10:16 +00:00
//! Some functions here, such as codegen_block and codegen_expr, return a value --
//! the result of the codegen to LLVM -- while others, such as codegen_fn
//! and mono_item, are called only for the side effect of adding a
//! particular definition to the LLVM IR output we're producing.
//!
2018-05-08 13:10:16 +00:00
//! Hopefully useful general knowledge about codegen:
//!
2019-02-08 13:53:55 +00:00
//! * There's no way to find out the `Ty` type of a Value. Doing so
//! would be "trying to get the eggs out of an omelette" (credit:
//! pcwalton). You can, instead, find out its `llvm::Type` by calling `val_ty`,
//! but one `llvm::Type` corresponds to many `Ty`s; for instance, `tup(int, int,
//! int)` and `rec(x=int, y=int, z=int)` will have the same `llvm::Type`.
use super::ModuleLlvm;
use crate::attributes;
2019-02-17 18:58:58 +00:00
use crate::builder::Builder;
use crate::common;
use crate::context::CodegenCx;
2019-12-22 22:42:04 +00:00
use crate::llvm;
use crate::metadata;
use crate::value::Value;
2020-03-29 15:19:48 +00:00
use rustc_codegen_ssa::base::maybe_create_entry_wrapper;
use rustc_codegen_ssa::mono_item::MonoItemExt;
use rustc_codegen_ssa::traits::*;
use rustc_codegen_ssa::{ModuleCodegen, ModuleKind};
use rustc_data_structures::small_c_str::SmallCStr;
2020-03-29 14:41:09 +00:00
use rustc_middle::dep_graph;
use rustc_middle::middle::codegen_fn_attrs::{CodegenFnAttrFlags, CodegenFnAttrs};
use rustc_middle::middle::cstore::EncodedMetadata;
use rustc_middle::middle::exported_symbols;
use rustc_middle::mir::mono::{Linkage, Visibility};
use rustc_middle::ty::TyCtxt;
use rustc_session::config::DebugInfo;
use rustc_span::symbol::Symbol;
use std::ffi::CString;
use std::time::Instant;
2019-06-13 21:48:52 +00:00
pub fn write_compressed_metadata<'tcx>(
tcx: TyCtxt<'tcx>,
metadata: &EncodedMetadata,
llvm_module: &mut ModuleLlvm,
) {
use flate2::write::DeflateEncoder;
2019-12-22 22:42:04 +00:00
use flate2::Compression;
use std::io::Write;
Store metadata separately in rlib files Right now whenever an rlib file is linked against, all of the metadata from the rlib is pulled in to the final staticlib or binary. The reason for this is that the metadata is currently stored in a section of the object file. Note that this is intentional for dynamic libraries in order to distribute metadata bundled with static libraries. This commit alters the situation for rlib libraries to instead store the metadata in a separate file in the archive. In doing so, when the archive is passed to the linker, none of the metadata will get pulled into the result executable. Furthermore, the metadata file is skipped when assembling rlibs into an archive. The snag in this implementation comes with multiple output formats. When generating a dylib, the metadata needs to be in the object file, but when generating an rlib this needs to be separate. In order to accomplish this, the metadata variable is inserted into an entirely separate LLVM Module which is then codegen'd into a different location (foo.metadata.o). This is then linked into dynamic libraries and silently ignored for rlib files. While changing how metadata is inserted into archives, I have also stopped compressing metadata when inserted into rlib files. We have wanted to stop compressing metadata, but the sections it creates in object file sections are apparently too large. Thankfully if it's just an arbitrary file it doesn't matter how large it is. I have seen massive reductions in executable sizes, as well as staticlib output sizes (to confirm that this is all working).
2013-12-04 01:41:01 +00:00
let (metadata_llcx, metadata_llmod) = (&*llvm_module.llcx, llvm_module.llmod());
let mut compressed = tcx.metadata_encoding_version();
DeflateEncoder::new(&mut compressed, Compression::fast())
2019-12-22 22:42:04 +00:00
.write_all(&metadata.raw_data)
.unwrap();
2015-11-20 23:08:09 +00:00
let llmeta = common::bytes_in_context(metadata_llcx, &compressed);
let llconst = common::struct_in_context(metadata_llcx, &[llmeta], false);
let name = exported_symbols::metadata_symbol_name(tcx);
std: Implement CString-related RFCs This commit is an implementation of [RFC 592][r592] and [RFC 840][r840]. These two RFCs tweak the behavior of `CString` and add a new `CStr` unsized slice type to the module. [r592]: https://github.com/rust-lang/rfcs/blob/master/text/0592-c-str-deref.md [r840]: https://github.com/rust-lang/rfcs/blob/master/text/0840-no-panic-in-c-string.md The new `CStr` type is only constructable via two methods: 1. By `deref`'ing from a `CString` 2. Unsafely via `CStr::from_ptr` The purpose of `CStr` is to be an unsized type which is a thin pointer to a `libc::c_char` (currently it is a fat pointer slice due to implementation limitations). Strings from C can be safely represented with a `CStr` and an appropriate lifetime as well. Consumers of `&CString` should now consume `&CStr` instead to allow producers to pass in C-originating strings instead of just Rust-allocated strings. A new constructor was added to `CString`, `new`, which takes `T: IntoBytes` instead of separate `from_slice` and `from_vec` methods (both have been deprecated in favor of `new`). The `new` method returns a `Result` instead of panicking. The error variant contains the relevant information about where the error happened and bytes (if present). Conversions are provided to the `io::Error` and `old_io::IoError` types via the `FromError` trait which translate to `InvalidInput`. This is a breaking change due to the modification of existing `#[unstable]` APIs and new deprecation, and more detailed information can be found in the two RFCs. Notable breakage includes: * All construction of `CString` now needs to use `new` and handle the outgoing `Result`. * Usage of `CString` as a byte slice now explicitly needs a `.as_bytes()` call. * The `as_slice*` methods have been removed in favor of just having the `as_bytes*` methods. Closes #22469 Closes #22470 [breaking-change]
2015-02-18 06:47:40 +00:00
let buf = CString::new(name).unwrap();
2019-12-22 22:42:04 +00:00
let llglobal =
unsafe { llvm::LLVMAddGlobal(metadata_llmod, common::val_ty(llconst), buf.as_ptr()) };
unsafe {
llvm::LLVMSetInitializer(llglobal, llconst);
let section_name = metadata::metadata_section_name(&tcx.sess.target.target);
let name = SmallCStr::new(section_name);
llvm::LLVMSetSection(llglobal, name.as_ptr());
// Also generate a .section directive to force no
// flags, at least for ELF outputs, so that the
// metadata doesn't get loaded into memory.
let directive = format!(".section {}", section_name);
llvm::LLVMSetModuleInlineAsm2(metadata_llmod, directive.as_ptr().cast(), directive.len())
}
}
pub struct ValueIter<'ll> {
cur: Option<&'ll Value>,
step: unsafe extern "C" fn(&'ll Value) -> Option<&'ll Value>,
}
impl Iterator for ValueIter<'ll> {
type Item = &'ll Value;
fn next(&mut self) -> Option<&'ll Value> {
let old = self.cur;
if let Some(old) = old {
self.cur = unsafe { (self.step)(old) };
}
old
}
}
pub fn iter_globals(llmod: &'ll llvm::Module) -> ValueIter<'ll> {
2019-12-22 22:42:04 +00:00
unsafe { ValueIter { cur: llvm::LLVMGetFirstGlobal(llmod), step: llvm::LLVMGetNextGlobal } }
}
2015-01-02 04:54:03 +00:00
pub fn compile_codegen_unit(
tcx: TyCtxt<'tcx>,
cgu_name: Symbol,
) -> (ModuleCodegen<ModuleLlvm>, u64) {
let prof_timer = tcx.prof.generic_activity_with_arg("codegen_module", cgu_name.to_string());
let start_time = Instant::now();
let dep_node = tcx.codegen_unit(cgu_name).codegen_dep_node(tcx);
2019-12-22 22:42:04 +00:00
let (module, _) =
tcx.dep_graph.with_task(dep_node, tcx, cgu_name, module_codegen, dep_graph::hash_result);
2018-05-08 13:10:16 +00:00
let time_to_codegen = start_time.elapsed();
drop(prof_timer);
// We assume that the cost to run LLVM on a CGU is proportional to
2018-05-08 13:10:16 +00:00
// the time we needed for codegenning it.
2019-12-22 22:42:04 +00:00
let cost = time_to_codegen.as_secs() * 1_000_000_000 + time_to_codegen.subsec_nanos() as u64;
2019-12-22 22:42:04 +00:00
fn module_codegen(tcx: TyCtxt<'_>, cgu_name: Symbol) -> ModuleCodegen<ModuleLlvm> {
let cgu = tcx.codegen_unit(cgu_name);
2018-05-08 13:10:16 +00:00
// Instantiate monomorphizations without filling out definitions yet...
let llvm_module = ModuleLlvm::new(tcx, &cgu_name.as_str());
{
2018-09-27 13:31:20 +00:00
let cx = CodegenCx::new(tcx, cgu, &llvm_module);
2019-12-22 22:42:04 +00:00
let mono_items = cx.codegen_unit.items_in_deterministic_order(cx.tcx);
2018-05-08 13:10:16 +00:00
for &(mono_item, (linkage, visibility)) in &mono_items {
mono_item.predefine::<Builder<'_, '_, '_>>(&cx, linkage, visibility);
}
// ... and now that we have everything pre-defined, fill out those definitions.
2018-05-08 13:10:16 +00:00
for &(mono_item, _) in &mono_items {
mono_item.define::<Builder<'_, '_, '_>>(&cx);
}
// If this codegen unit contains the main function, also create the
// wrapper here
if let Some(entry) = maybe_create_entry_wrapper::<Builder<'_, '_, '_>>(&cx) {
attributes::sanitize(&cx, CodegenFnAttrFlags::empty(), entry);
}
// Run replace-all-uses-with for statics that need it
for &(old_g, new_g) in cx.statics_to_rauw().borrow().iter() {
unsafe {
let bitcast = llvm::LLVMConstPointerCast(new_g, cx.val_ty(old_g));
llvm::LLVMReplaceAllUsesWith(old_g, bitcast);
llvm::LLVMDeleteGlobal(old_g);
}
}
// Create the llvm.used variable
// This variable has type [N x i8*] and is stored in the llvm.metadata section
if !cx.used_statics().borrow().is_empty() {
cx.create_used_variable()
}
// Finalize debuginfo
if cx.sess().opts.debuginfo != DebugInfo::None {
cx.debuginfo_finalize();
}
}
ModuleCodegen {
name: cgu_name.to_string(),
module_llvm: llvm_module,
kind: ModuleKind::Regular,
}
}
(module, cost)
}
pub fn set_link_section(llval: &Value, attrs: &CodegenFnAttrs) {
let sect = match attrs.link_section {
Some(name) => name,
None => return,
rustc: Add a `#[wasm_import_module]` attribute This commit adds a new attribute to the Rust compiler specific to the wasm target (and no other targets). The `#[wasm_import_module]` attribute is used to specify the module that a name is imported from, and is used like so: #[wasm_import_module = "./foo.js"] extern { fn some_js_function(); } Here the import of the symbol `some_js_function` is tagged with the `./foo.js` module in the wasm output file. Wasm-the-format includes two fields on all imports, a module and a field. The field is the symbol name (`some_js_function` above) and the module has historically unconditionally been `"env"`. I'm not sure if this `"env"` convention has asm.js or LLVM roots, but regardless we'd like the ability to configure it! The proposed ES module integration with wasm (aka a wasm module is "just another ES module") requires that the import module of wasm imports is interpreted as an ES module import, meaning that you'll need to encode paths, NPM packages, etc. As a result, we'll need this to be something other than `"env"`! Unfortunately neither our version of LLVM nor LLD supports custom import modules (aka anything not `"env"`). My hope is that by the time LLVM 7 is released both will have support, but in the meantime this commit adds some primitive encoding/decoding of wasm files to the compiler. This way rustc postprocesses the wasm module that LLVM emits to ensure it's got all the imports we'd like to have in it. Eventually I'd ideally like to unconditionally require this attribute to be placed on all `extern { ... }` blocks. For now though it seemed prudent to add it as an unstable attribute, so for now it's not required (as that'd force usage of a feature gate). Hopefully it doesn't take too long to "stabilize" this! cc rust-lang-nursery/rust-wasm#29
2018-02-10 22:28:17 +00:00
};
unsafe {
let buf = SmallCStr::new(&sect.as_str());
llvm::LLVMSetSection(llval, buf.as_ptr());
}
}
pub fn linkage_to_llvm(linkage: Linkage) -> llvm::Linkage {
match linkage {
Linkage::External => llvm::Linkage::ExternalLinkage,
Linkage::AvailableExternally => llvm::Linkage::AvailableExternallyLinkage,
Linkage::LinkOnceAny => llvm::Linkage::LinkOnceAnyLinkage,
Linkage::LinkOnceODR => llvm::Linkage::LinkOnceODRLinkage,
Linkage::WeakAny => llvm::Linkage::WeakAnyLinkage,
Linkage::WeakODR => llvm::Linkage::WeakODRLinkage,
Linkage::Appending => llvm::Linkage::AppendingLinkage,
Linkage::Internal => llvm::Linkage::InternalLinkage,
Linkage::Private => llvm::Linkage::PrivateLinkage,
Linkage::ExternalWeak => llvm::Linkage::ExternalWeakLinkage,
Linkage::Common => llvm::Linkage::CommonLinkage,
}
}
pub fn visibility_to_llvm(linkage: Visibility) -> llvm::Visibility {
match linkage {
Visibility::Default => llvm::Visibility::Default,
Visibility::Hidden => llvm::Visibility::Hidden,
Visibility::Protected => llvm::Visibility::Protected,
2017-09-18 10:14:52 +00:00
}
}