They're semantically the same, so this means the backends don't need to handle the intrinsic and means fewer MIR basic blocks in pointer arithmetic code.
Updates `interpret`, `codegen_ssa`, and `codegen_cranelift` to consume the new cast instead of the intrinsic.
Includes `CastTransmute` for custom MIR building, to be able to test the extra UB.
I was looking into `array::IntoIter` optimization, and noticed that it wasn't annotating the loads with `noundef` for simple things like `array::IntoIter<i32, N>`.
Turned out to be a more general problem as `MaybeUninit::assume_init_read` isn't marking the load as initialized (<https://rust.godbolt.org/z/Mxd8TPTnv>), which is unfortunate since that's basically its reason to exist.
This PR lowers `ptr::read(p)` to `copy *p` in MIR, which fortuitiously also improves the IR we give to LLVM for things like `mem::replace`.