Rollup merge of #111322 - mirkootter:master, r=davidtwco

Support for native WASM exceptions

### Motivation
Currently, rustc does not support native WASM exceptions. It does support JavaScript based exceptions for the wasm32-emscripten-target, but this requires back&forth with javascript for many calls, which is very slow.

Native wasm support for exceptions is quite common: Clang+LLVM implemented them years ago, and all major browsers support them by now. They enable zero-cost exceptions, at least with regard to runtime-performance-cost. They may increase startup-time and code size, though.

### Important: This PR does not change default behaviour
Exceptions usually add a lot of code in form of unwinding blocks, increasing the binary size. Most users probably do not want that, especially which regard to web development.

Therefore, wasm exceptions play a similar role as WASM-threads: rustc should support them, like clang does, but users who want to use it have to use some command-line magic like rustflags to opt in.

### What does this PR do?
As stated above, the default behaviour is not changed. It is already possible to opt-in into wasm exceptions using the command line. Unfortunately, the LLVM IR is invalid and the LLVM backend crashes.
```
rustc <sourcefile>
  --target wasm32-unknown-unknown
  -C panic=unwind
  -C llvm-args=-wasm-enable-eh
  -C target-feature=+exception-handling
```
As it turns out, LLVM is quite picky when it comes to IR for exception handling. If the IR does not look exactly like it should, some LLVM-assertions fail and the code generation crashes.

This PR adds the necessary modifications to the code generator to make it work. It also adds `exception-handling` as a wasm target feature.

### What this PR does not / what is missing
This PR is not a full fledges solution. It is the first step. A few parts are still missing; however, it is already useable (see next section).

Currently missing:
* The std library has to be adapted. Currently, only [no_std] crates work
* Usually, nested exceptions abort the program (i.e. a panic during the cleanup of another panic). This is currently not done yet.
  - Currently, code inside cleanup handlers does not unwind
  - To fix this requires a little more work: The code generator currently maintains a single terminate block per function for this. Unfortunately, WASM requires funclet based exception handling. Therefore, we need to create a terminate block per funclet. This is probably not a big problem, but I want to keep this PR simple.

### How to use the compiler given this PR?
This PR does not add any command line flags or features. It uses those which are already there. To compile with exceptions enabled, you need
* to set the panic strategy to unwind, i.e. `-C panic=unwind`
* to enable the exception-handling target feature, i.e. `-C target-feature=+exception-handling`
* to tell LLVM about the exception handling, i.e. `-C llvm-args=-wasm-enable-eh`

Since the standard library has not been adapted, you can only use it in [no_std] crates as of now. The intrinsic `core::intrinsics::r#try` works. To throw exceptions, you need the ```@llvm.wasm.throw``` intrinsic.

I created a sample application which works for me: https://github.com/mirkootter/rust-wasm-demos
This example can be run at https://webassembly.sh
This commit is contained in:
Matthias Krüger 2023-06-29 16:36:30 +02:00 committed by GitHub
commit 4696a92183
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
8 changed files with 140 additions and 14 deletions

View File

@ -8,7 +8,7 @@ use crate::llvm_util;
use crate::type_::Type;
use crate::value::Value;
use rustc_codegen_ssa::base::wants_msvc_seh;
use rustc_codegen_ssa::base::{wants_msvc_seh, wants_wasm_eh};
use rustc_codegen_ssa::traits::*;
use rustc_data_structures::base_n;
use rustc_data_structures::fx::FxHashMap;
@ -532,19 +532,28 @@ impl<'ll, 'tcx> MiscMethods<'tcx> for CodegenCx<'ll, 'tcx> {
if let Some(llpersonality) = self.eh_personality.get() {
return llpersonality;
}
let name = if wants_msvc_seh(self.sess()) {
Some("__CxxFrameHandler3")
} else if wants_wasm_eh(self.sess()) {
// LLVM specifically tests for the name of the personality function
// There is no need for this function to exist anywhere, it will
// not be called. However, its name has to be "__gxx_wasm_personality_v0"
// for native wasm exceptions.
Some("__gxx_wasm_personality_v0")
} else {
None
};
let tcx = self.tcx;
let llfn = match tcx.lang_items().eh_personality() {
Some(def_id) if !wants_msvc_seh(self.sess()) => self.get_fn_addr(
Some(def_id) if name.is_none() => self.get_fn_addr(
ty::Instance::resolve(tcx, ty::ParamEnv::reveal_all(), def_id, ty::List::empty())
.unwrap()
.unwrap(),
),
_ => {
let name = if wants_msvc_seh(self.sess()) {
"__CxxFrameHandler3"
} else {
"rust_eh_personality"
};
let name = name.unwrap_or("rust_eh_personality");
if let Some(llfn) = self.get_declared_value(name) {
llfn
} else {
@ -662,6 +671,10 @@ impl<'ll> CodegenCx<'ll, '_> {
let t_f32 = self.type_f32();
let t_f64 = self.type_f64();
let t_metadata = self.type_metadata();
let t_token = self.type_token();
ifn!("llvm.wasm.get.exception", fn(t_token) -> i8p);
ifn!("llvm.wasm.get.ehselector", fn(t_token) -> t_i32);
ifn!("llvm.wasm.trunc.unsigned.i32.f32", fn(t_f32) -> t_i32);
ifn!("llvm.wasm.trunc.unsigned.i32.f64", fn(t_f64) -> t_i32);

View File

@ -7,7 +7,7 @@ use crate::type_of::LayoutLlvmExt;
use crate::va_arg::emit_va_arg;
use crate::value::Value;
use rustc_codegen_ssa::base::{compare_simd_types, wants_msvc_seh};
use rustc_codegen_ssa::base::{compare_simd_types, wants_msvc_seh, wants_wasm_eh};
use rustc_codegen_ssa::common::{IntPredicate, TypeKind};
use rustc_codegen_ssa::errors::{ExpectedPointerMutability, InvalidMonomorphization};
use rustc_codegen_ssa::mir::operand::OperandRef;
@ -452,6 +452,8 @@ fn try_intrinsic<'ll>(
bx.store(bx.const_i32(0), dest, ret_align);
} else if wants_msvc_seh(bx.sess()) {
codegen_msvc_try(bx, try_func, data, catch_func, dest);
} else if wants_wasm_eh(bx.sess()) {
codegen_wasm_try(bx, try_func, data, catch_func, dest);
} else if bx.sess().target.os == "emscripten" {
codegen_emcc_try(bx, try_func, data, catch_func, dest);
} else {
@ -610,6 +612,80 @@ fn codegen_msvc_try<'ll>(
bx.store(ret, dest, i32_align);
}
// WASM's definition of the `rust_try` function.
fn codegen_wasm_try<'ll>(
bx: &mut Builder<'_, 'll, '_>,
try_func: &'ll Value,
data: &'ll Value,
catch_func: &'ll Value,
dest: &'ll Value,
) {
let (llty, llfn) = get_rust_try_fn(bx, &mut |mut bx| {
bx.set_personality_fn(bx.eh_personality());
let normal = bx.append_sibling_block("normal");
let catchswitch = bx.append_sibling_block("catchswitch");
let catchpad = bx.append_sibling_block("catchpad");
let caught = bx.append_sibling_block("caught");
let try_func = llvm::get_param(bx.llfn(), 0);
let data = llvm::get_param(bx.llfn(), 1);
let catch_func = llvm::get_param(bx.llfn(), 2);
// We're generating an IR snippet that looks like:
//
// declare i32 @rust_try(%try_func, %data, %catch_func) {
// %slot = alloca i8*
// invoke %try_func(%data) to label %normal unwind label %catchswitch
//
// normal:
// ret i32 0
//
// catchswitch:
// %cs = catchswitch within none [%catchpad] unwind to caller
//
// catchpad:
// %tok = catchpad within %cs [null]
// %ptr = call @llvm.wasm.get.exception(token %tok)
// %sel = call @llvm.wasm.get.ehselector(token %tok)
// call %catch_func(%data, %ptr)
// catchret from %tok to label %caught
//
// caught:
// ret i32 1
// }
//
let try_func_ty = bx.type_func(&[bx.type_i8p()], bx.type_void());
bx.invoke(try_func_ty, None, None, try_func, &[data], normal, catchswitch, None);
bx.switch_to_block(normal);
bx.ret(bx.const_i32(0));
bx.switch_to_block(catchswitch);
let cs = bx.catch_switch(None, None, &[catchpad]);
bx.switch_to_block(catchpad);
let null = bx.const_null(bx.type_i8p());
let funclet = bx.catch_pad(cs, &[null]);
let ptr = bx.call_intrinsic("llvm.wasm.get.exception", &[funclet.cleanuppad()]);
let _sel = bx.call_intrinsic("llvm.wasm.get.ehselector", &[funclet.cleanuppad()]);
let catch_ty = bx.type_func(&[bx.type_i8p(), bx.type_i8p()], bx.type_void());
bx.call(catch_ty, None, None, catch_func, &[data, ptr], Some(&funclet));
bx.catch_ret(&funclet, caught);
bx.switch_to_block(caught);
bx.ret(bx.const_i32(1));
});
// Note that no invoke is used here because by definition this function
// can't panic (that's what it's catching).
let ret = bx.call(llty, None, None, llfn, &[try_func, data, catch_func], None);
let i32_align = bx.tcx().data_layout.i32_align.abi;
bx.store(ret, dest, i32_align);
}
// Definition of the standard `try` function for Rust using the GNU-like model
// of exceptions (e.g., the normal semantics of LLVM's `landingpad` and `invoke`
// instructions).

View File

@ -1071,6 +1071,7 @@ extern "C" {
// Operations on other types
pub fn LLVMVoidTypeInContext(C: &Context) -> &Type;
pub fn LLVMTokenTypeInContext(C: &Context) -> &Type;
pub fn LLVMMetadataTypeInContext(C: &Context) -> &Type;
// Operations on all values

View File

@ -52,6 +52,10 @@ impl<'ll> CodegenCx<'ll, '_> {
unsafe { llvm::LLVMVoidTypeInContext(self.llcx) }
}
pub(crate) fn type_token(&self) -> &'ll Type {
unsafe { llvm::LLVMTokenTypeInContext(self.llcx) }
}
pub(crate) fn type_metadata(&self) -> &'ll Type {
unsafe { llvm::LLVMMetadataTypeInContext(self.llcx) }
}

View File

@ -357,6 +357,13 @@ pub fn cast_shift_expr_rhs<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
}
}
// Returns `true` if this session's target will use native wasm
// exceptions. This means that the VM does the unwinding for
// us
pub fn wants_wasm_eh(sess: &Session) -> bool {
sess.target.is_like_wasm && sess.target.os != "emscripten"
}
/// Returns `true` if this session's target will use SEH-based unwinding.
///
/// This is only true for MSVC targets, and even then the 64-bit MSVC target
@ -366,6 +373,13 @@ pub fn wants_msvc_seh(sess: &Session) -> bool {
sess.target.is_like_msvc
}
/// Returns `true` if this session's target requires the new exception
/// handling LLVM IR instructions (catchpad / cleanuppad / ... instead
/// of landingpad)
pub fn wants_new_eh_instructions(sess: &Session) -> bool {
wants_wasm_eh(sess) || wants_msvc_seh(sess)
}
pub fn memcpy_ty<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
bx: &mut Bx,
dst: Bx::Value,

View File

@ -79,8 +79,8 @@ impl<'a, 'tcx> TerminatorCodegenHelper<'tcx> {
lltarget = fx.landing_pad_for(target);
}
if is_cleanupret {
// MSVC cross-funclet jump - need a trampoline
debug_assert!(base::wants_msvc_seh(fx.cx.tcx().sess));
// Cross-funclet jump - need a trampoline
debug_assert!(base::wants_new_eh_instructions(fx.cx.tcx().sess));
debug!("llbb_with_cleanup: creating cleanup trampoline for {:?}", target);
let name = &format!("{:?}_cleanup_trampoline_{:?}", self.bb, target);
let trampoline_llbb = Bx::append_block(fx.cx, fx.llfn, name);
@ -177,9 +177,16 @@ impl<'a, 'tcx> TerminatorCodegenHelper<'tcx> {
mir::UnwindAction::Continue => None,
mir::UnwindAction::Unreachable => None,
mir::UnwindAction::Terminate => {
if fx.mir[self.bb].is_cleanup && base::wants_msvc_seh(fx.cx.tcx().sess) {
// SEH will abort automatically if an exception tries to
if fx.mir[self.bb].is_cleanup && base::wants_new_eh_instructions(fx.cx.tcx().sess) {
// MSVC SEH will abort automatically if an exception tries to
// propagate out from cleanup.
// FIXME(@mirkootter): For wasm, we currently do not support terminate during
// cleanup, because this requires a few more changes: The current code
// caches the `terminate_block` for each function; funclet based code - however -
// requires a different terminate_block for each funclet
// Until this is implemented, we just do not unwind inside cleanup blocks
None
} else {
Some(fx.terminate_block())
@ -1528,7 +1535,7 @@ impl<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>> FunctionCx<'a, 'tcx, Bx> {
// FIXME(eddyb) rename this to `eh_pad_for_uncached`.
fn landing_pad_for_uncached(&mut self, bb: mir::BasicBlock) -> Bx::BasicBlock {
let llbb = self.llbb(bb);
if base::wants_msvc_seh(self.cx.sess()) {
if base::wants_new_eh_instructions(self.cx.sess()) {
let cleanup_bb = Bx::append_block(self.cx, self.llfn, &format!("funclet_{:?}", bb));
let mut cleanup_bx = Bx::build(self.cx, cleanup_bb);
let funclet = cleanup_bx.cleanup_pad(None, &[]);
@ -1587,6 +1594,15 @@ impl<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>> FunctionCx<'a, 'tcx, Bx> {
// } catch (...) {
// bar();
// }
//
// which creates an IR snippet like
//
// cs_terminate:
// %cs = catchswitch within none [%cp_terminate] unwind to caller
// cp_terminate:
// %cp = catchpad within %cs [null, i32 64, null]
// ...
llbb = Bx::append_block(self.cx, self.llfn, "cs_terminate");
let cp_llbb = Bx::append_block(self.cx, self.llfn, "cp_terminate");

View File

@ -179,7 +179,8 @@ pub fn codegen_mir<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
start_bx.set_personality_fn(cx.eh_personality());
}
let cleanup_kinds = base::wants_msvc_seh(cx.tcx().sess).then(|| analyze::cleanup_kinds(&mir));
let cleanup_kinds =
base::wants_new_eh_instructions(cx.tcx().sess).then(|| analyze::cleanup_kinds(&mir));
let cached_llbbs: IndexVec<mir::BasicBlock, CachedLlbb<Bx::BasicBlock>> =
mir.basic_blocks

View File

@ -284,6 +284,7 @@ const WASM_ALLOWED_FEATURES: &[(&str, Option<Symbol>)] = &[
// tidy-alphabetical-start
("atomics", Some(sym::wasm_target_feature)),
("bulk-memory", Some(sym::wasm_target_feature)),
("exception-handling", Some(sym::wasm_target_feature)),
("multivalue", Some(sym::wasm_target_feature)),
("mutable-globals", Some(sym::wasm_target_feature)),
("nontrapping-fptoint", Some(sym::wasm_target_feature)),