Avoid `GenFuture` shim when compiling async constructs
Previously, async constructs would be lowered to "normal" generators, with an additional `from_generator` / `GenFuture` shim in between to convert from `Generator` to `Future`.
The compiler will now special-case these generators internally so that async constructs will *directly* implement `Future` without the need to go through the `from_generator` / `GenFuture` shim.
The primary motivation for this change was hiding this implementation detail in stack traces and debuginfo, but it can in theory also help the optimizer as there is less abstractions to see through.
---
Given this demo code:
```rust
pub async fn a(arg: u32) -> Backtrace {
let bt = b().await;
let _arg = arg;
bt
}
pub async fn b() -> Backtrace {
Backtrace::force_capture()
}
```
I would get the following with the latest stable compiler (on Windows):
```
4: async_codegen:🅱️:async_fn$0
at .\src\lib.rs:10
5: core::future::from_generator::impl$1::poll<enum2$<async_codegen:🅱️:async_fn_env$0> >
at /rustc/897e37553bba8b42751c67658967889d11ecd120\library\core\src\future\mod.rs:91
6: async_codegen:🅰️:async_fn$0
at .\src\lib.rs:4
7: core::future::from_generator::impl$1::poll<enum2$<async_codegen:🅰️:async_fn_env$0> >
at /rustc/897e37553bba8b42751c67658967889d11ecd120\library\core\src\future\mod.rs:91
```
whereas now I get a much cleaner stack trace:
```
3: async_codegen:🅱️:async_fn$0
at .\src\lib.rs:10
4: async_codegen:🅰️:async_fn$0
at .\src\lib.rs:4
```
Previously, async constructs would be lowered to "normal" generators,
with an additional `from_generator` / `GenFuture` shim in between to
convert from `Generator` to `Future`.
The compiler will now special-case these generators internally so that
async constructs will *directly* implement `Future` without the need
to go through the `from_generator` / `GenFuture` shim.
The primary motivation for this change was hiding this implementation
detail in stack traces and debuginfo, but it can in theory also help
the optimizer as there is less abstractions to see through.
Pass `InferCtxt` to `DropRangeVisitor` so we can resolve vars
The types that we encounter in the `TypeckResults` that we pass to the `DropRangeVisitor` are not yet fully resolved, since that only happens in writeback after type checking is complete.
Instead, pass down the whole `InferCtxt` so that we can resolve any inference vars that have been constrained since they were written into the results. This is similar to how the `MemCategorizationContext` in the `ExprUseVisitor` also needs to pass down both typeck results _and_ the inference context.
Fixes an ICE mentioned in this comment: https://github.com/rust-lang/rust/issues/104382#issuecomment-1324410781
Properly handle `Pin<&mut dyn* Trait>` receiver in codegen
This ensures we can actually await a `dyn* Future`, which seems important for async fn in dyn trait.
Also, disable `dyn*` trait upcasting. It's not exactly complete right now, and can cause strange ICEs for no reason -- nobody's using it either. I thought it was cute to implement when I did it, but I didn't think about how it interacts structurally with `CoerceUnsized` correctly.
Fixes#104794, presumably removing `dyn*` upcasting and its `CoerceUnsized` issues does the trick.
Throw error on failure in loading llvm-plugin
The following code silently ignores the error as the `LLVMRustSetLastError` only tracks one error at a time. At all other places where `LLVMRustSetLastError` is used the code immediately returns.
251831ece9/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L801-L804)
Use `as_deref` in compiler (but only where it makes sense)
This simplifies some code :3
(there are some changes that are not exacly `as_deref`, but more like "clever `Option`/`Result` method use")
Fix hang in where-clause suggestion with `predicate_can_apply`
Using `predicate_may_hold` during error reporting causes an evaluation overflow, which (because we use `evaluate_obligation_no_overflow`) then causes the predicate to need to be re-evaluated locally, which results in a hang.
... but since the "add a where clause" suggestion is best-effort, just throw any overflow errors. No need for 100% accuracy.
r? `@lcnr` who has been thinking about overflows... Let me know if you want more context about this issue, and as always, feel free to reassign.
Fixes#104225
optimize field ordering by grouping m*2^n-sized fields with equivalently aligned ones
```rust
use std::ptr::addr_of;
use std::mem;
struct Foo {
word: u32,
byte: u8,
ary: [u8; 4]
}
fn main() {
let foo: Foo = unsafe { mem::zeroed() };
println!("base: {:p}\nword: {:p}\nbyte: {:p}\nary: {:p}", &foo, addr_of!(foo.word), addr_of!(foo.byte), addr_of!(foo.ary));
}
```
prints
```
base: 0x7fffc1a8a668
word: 0x7fffc1a8a668
byte: 0x7fffc1a8a66c
ary: 0x7fffc1a8a66d
```
I.e. the `u8` in the middle causes the array to sit at an odd offset, which might prevent optimizations, especially on architectures where unaligned loads are costly.
Note that this will make field ordering niche-dependent, i.e. a `Bar<T>` with `T=char` and `T=u32` may result in different field order, this may break some code that makes invalid assumptions about `repr(Rust)` types.
Lower return type outside async block creation
This allows feeding a different output type to async blocks with a different `ImplTraitContext`. Spotted this while working on #104321
Refactor must_use lint into two parts
Before, the lint did the checking for `must_use` and pretty printing the types in a special format in one pass, causing quite complex and untranslatable code.
Now the collection and printing is split in two. That should also make it easier to translate or extract the type pretty printing in the future.
Also fixes an integer overflow in the array length pluralization
calculation.
fixes#104352