[breaking change] Validate crate name in `--extern` [MCP 650]
Reject non-ASCII-identifier crate names passed to the CLI option `--extern` (`rustc`, `rustdoc`).
Implements [MCP 650](https://github.com/rust-lang/compiler-team/issues/650) (except that we only allow ASCII identifiers not arbitrary Rust identifiers).
Fixes#113035.
[As mentioned on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/233931-t-compiler.2Fmajor-changes/topic/Disallow.20non-identifier-valid.20--extern.20cr.E2.80.A6.20compiler-team.23650/near/376826988), doing a crater run probably doesn't make sense since it wouldn't yield anything. Most users don't interact with `rustc` directly but only ever through Cargo which always passes a valid crate name to `--extern` when it invokes `rustc` and `rustdoc`. In any case, the user wouldn't be able to use such a crate name in the source code anyway.
Note that I'm not using [`rustc_session::output::validate_crate_name`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_session/output/fn.validate_crate_name.html) (used for `--crate-name` and `#![crate_name]`) since the latter doesn't reject non-ASCII crate names and ones that start with a digit.
As an aside, I've also thought about getting rid of `validate_crate_name` entirely in a separate PR (with another MCP) in favor of `is_ascii_ident` to reject more weird `--crate-name`s, `#![crate_name]`s and file names but I think that would lead to a lot of actual breakage, namely because of file names starting with a digit. In `tests/ui` 9 tests would be impacted for example.
CC `@estebank`
r? `@est31`
adjust how closure/generator types are printed
I saw `&[closure@$DIR/issue-20862.rs:2:5]` and I thought it is a slice type, because that's usually what `&[_]` is... it took me a while to realize that this is just a confusing printer and actually there's no slice. Let's use something that cannot be mistaken for a regular type.
Allow `-Z treat-err-as-bug=0`
Makes `-Z treat-err-as-bug=0` behave as if the option wasn't present instead of asking the value to be ⩾ 1. This enables a quick on/off of the option, as you only need to change one character instead of removing the whole `-Z`.
Also update some text, e.g.
```bash
$ rustc -Z help | grep treat-err-as-bug
-Z treat-err-as-bug=val -- treat error number `val` that occurs as bug
```
where the value could be interpreted as an error code instead of an ordinal.
give FutureIncompatibilityReason variants more explicit names
Also make the `reason` field mandatory when declaring a lint, to make sure this is a deliberate decision.
Move `DepKind` to `rustc_query_system` and define it as `u16`
This moves the `DepKind` type to `rustc_query_system` where it's defined with an inner `u16` field. This decouples it from `rustc_middle` and is a step towards letting other crates define dep kinds. It also allows some type parameters to be removed. The `DepKind` trait is replaced with a `Deps` trait. That's used when some operations or information about dep kinds which is unavailable in `rustc_query_system` are still needed.
r? `@cjgillot`
rustc_hir_analysis: add a helper to check function the signature mismatches
This function is now used to check `#[panic_handler]`, `start` lang item, `main`, `#[start]` and intrinsic functions.
The diagnosis produced are now closer to the ones produced by trait/impl method signature mismatch.
This is the first time I do anything with rustc_hir_analysis/rustc_hir_typeck, so comments and suggestions about things I did wrong or that could be improved will be appreciated.
Suggest desugaring to return-position `impl Future` when an `async fn` in trait fails an auto trait bound
First commit allows us to store the span of the `async` keyword in HIR.
Second commit implements a suggestion to desugar an `async fn` to a return-position `impl Future` in trait to slightly improve the `Send` situation being discussed in #115822.
This suggestion is only made when `#![feature(return_type_notation)]` is not enabled -- if it is, we should instead suggest an appropriate where-clause bound.
coverage: Don't bother renumbering expressions on the Rust side
The LLVM API that we use to encode coverage mappings already has its own code for removing unused coverage expressions and renumbering the rest.
This lets us get rid of our own complex renumbering code, making it easier to change our coverage code in other ways.
---
Now that we have tests for coverage mappings (#114843), I've been able to verify that this PR doesn't make the coverage mappings worse, thanks to an explicit simplification step.
interpret: more consistently use ImmTy in operators and casts
The diff in src/tools/miri/src/shims/x86/sse2.rs should hopefully suffice to explain why this is nicer. :)
rename mir::Constant -> mir::ConstOperand, mir::ConstKind -> mir::Const
Also, be more consistent with the `to/eval_bits` methods... we had some that take a type and some that take a size, and then sometimes the one that takes a type is called `bits_for_ty`.
Turns out that `ty::Const`/`mir::ConstKind` carry their type with them, so we don't need to even pass the type to those `eval_bits` functions at all.
However this is not properly consistent yet: in `ty` we have most of the methods on `ty::Const`, but in `mir` we have them on `mir::ConstKind`. And indeed those two types are the ones that correspond to each other. So `mir::ConstantKind` should actually be renamed to `mir::Const`. But what to do with `mir::Constant`? It carries around a span, that's really more like a constant operand that appears as a MIR operand... it's more suited for `syntax.rs` than `consts.rs`, but the bigger question is, which name should it get if we want to align the `mir` and `ty` types? `ConstOperand`? `ConstOp`? `Literal`? It's not a literal but it has a field called `literal` so it would at least be consistently wrong-ish...
``@oli-obk`` any ideas?
Prevent promotion of const fn calls in inline consts
We don't wanna make that mistake we did for statics and consts worse by letting more code use it.
r? ``@RalfJung``
cc https://github.com/rust-lang/rust/issues/76001
Improve invalid UTF-8 lint by finding the expression initializer
This PR introduce a small mechanism to walk up the HIR through bindings, if/else, consts, ... when trying lint on invalid UTF-8.
Fixes https://github.com/rust-lang/rust/issues/115208
The LLVM API that we use to encode coverage mappings already has its own code
for removing unused coverage expressions and renumbering the rest.
This lets us get rid of our own complex renumbering code, making it easier to
change our coverage code in other ways.
After coverage instrumentation and MIR transformations, we can sometimes end up
with coverage expressions that always have a value of zero. Any expression
operand that refers to an always-zero expression can be replaced with a literal
`Operand::Zero`, making the emitted coverage mapping data smaller and simpler.
This simplification step is mostly redundant with the simplifications performed
inline in `expressions_with_regions`, except that it does a slightly more
thorough job in some cases (because it checks for always-zero expressions
*after* other simplifications).
However, adding this simplification step will then let us greatly simplify that
code, without affecting the quality of the emitted coverage maps.
Fall back to the unoptimized implementation in read_binary_file if File::metadata lies
Fixes https://github.com/rust-lang/rust/issues/115458
r? `@jackh726` because you approved the previous PR
Simplify/Optimize FileEncoder
FileEncoder is basically a BufWriter except that it exposes access to the not-written-to-yet region of the buffer so that some users can write directly to the buffer. This strategy is awesome because it lets us avoid calling memcpy for small copies, but the previous strategy was based on the writer accessing a `&mut [MaybeUninit<u8>; N]` and returning a `&[u8]` which is an API which currently mandates the use of unsafe code, making that interface in general not that appealing.
So this PR cleans up the FileEncoder implementation and builds on that general idea of direct buffer access in order to prevent `memcpy` calls in a few key places when encoding the dep graph and rmeta tables. The interface used here is now 100% safe, but with the caveat that internally we need to avoid trusting the number of bytes that the provided function claims to have written.
The original primary objective of this PR was to clean up the FileEncoder implementation so that the fix for the following issues would be easy to implement. The fix for these issues is to correctly update self.buffered even when writes fail, which I think it's easy to verify manually is now done, because all the FileEncoder methods are small.
Fixes https://github.com/rust-lang/rust/issues/115298
Fixes https://github.com/rust-lang/rust/issues/114671
Fixes https://github.com/rust-lang/rust/issues/114045
Fixes https://github.com/rust-lang/rust/issues/108100
Fixes https://github.com/rust-lang/rust/issues/106787
rustc_target/loongarch: Fix passing of transparent unions with only one non-ZST member
This ensures that `MaybeUninit<T>` has the same ABI as `T` when passed through an `extern "C"` function.
Fixes https://github.com/rust-lang/rust/issues/115509
r? `@bjorn3`
adjust ConstValue::Slice to work for arbitrary slice types
valtrees have already been assuming that this works; this PR makes it a reality. Also further restrict `ConstValue::Slice` to what it is actually used for; this even shrinks `ConstValue` from 32 to 24 bytes which is a nice win. :)
The alternative to this approach is to make `ConstValue::Slice` work really only for `&str`/`&[u8]` literals, and never return it in `op_to_const`. That would make `op_to_const` very clean. We could then even remove the `meta` field; the length would always be `data.inner().len()`. We could *almost* just use a `Symbol` instead of a `ConstAllocation`, but we have to support byte strings and there doesn't seem to be an interned representation of them (or rather, `ConstAllocation` *is* their interned representation). In this world, valtrees of slice reference types would then become noticeably more expensive to turn into a `ConstValue` -- but does that matter? Specifically for `&str`/`&[u8]` we could still use the optimized representation if we wanted.
If byte strings were already interned somewhere I'd gravitate towards the alternative, but the way things stand, we need a `ConstAllocation` case anyway to support byte strings, and then we might as well support arbitrary slices. (Or we say that byte strings don't get an optimized representation at all. Such a performance cliff between `str` and byte strings is probably unexpected, though due to the lack of interning for byte strings I think there might already be a performance cliff there.)