Rollup merge of #101717 - Pointerbender:unsafecell-memory-layout, r=Amanieu

Add documentation about the memory layout of `UnsafeCell<T>`

The documentation for `UnsafeCell<T>` currently does not make any promises about its memory layout. This PR adds this documentation, namely that the memory layout of `UnsafeCell<T>` is the same as the memory layout of its inner `T`.
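
The promise can be spot-checked with `size_of` and `align_of`. The sketch below is only an illustration: the helper name `same_layout` is hypothetical, and equal size and alignment are a consequence of the guarantee rather than a proof of it.

```rust
use std::cell::UnsafeCell;
use std::mem::{align_of, size_of};

/// Hypothetical helper (not part of the standard library): returns `true`
/// when `UnsafeCell<T>` and `T` agree on both size and alignment.
fn same_layout<T>() -> bool {
    size_of::<UnsafeCell<T>>() == size_of::<T>()
        && align_of::<UnsafeCell<T>>() == align_of::<T>()
}
```

With the documented guarantee, a call such as `same_layout::<u32>()` is expected to hold for every sized `T`.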

# Use case
Without this layout promise, the following cast would not be legal:

```rust
use std::cell::UnsafeCell;

fn example<T>(ptr: *mut T) -> *const UnsafeCell<T> {
    ptr as *const UnsafeCell<T>
}
```

One use case involves FFI. If Rust receives a pointer over an FFI boundary that provides shared read-write access (with some form of custom synchronization), and this pointer is managed by a Rust struct with lifetime `'a`, then creating a `&'a UnsafeCell<T>` from the raw FFI pointer `*mut T` would greatly simplify the struct's (internal) API and safety contract. Many safety checks can be done when the pointer is first received through FFI (non-nullness, alignment, initializing uninit bytes, etc.), and these properties can then be encoded into the `&UnsafeCell<T>` type. Without this documentation guarantee, this is not legal today outside of the standard library.
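
A minimal sketch of such a wrapper, under stated assumptions: the names `Shared`, `from_raw`, and `as_ptr` are hypothetical and not part of this PR, and the synchronization that the FFI side provides is left out entirely.

```rust
use std::cell::UnsafeCell;
use std::mem::align_of;

/// Hypothetical wrapper around a pointer received over FFI.
pub struct Shared<'a, T> {
    cell: &'a UnsafeCell<T>,
}

impl<'a, T> Shared<'a, T> {
    /// Checks the pointer once, then encodes the result in `&'a UnsafeCell<T>`.
    ///
    /// # Safety
    /// `ptr` must point to an initialized `T` that stays valid for `'a`, and
    /// all access to it during `'a` must follow the external synchronization
    /// protocol (assumed here, not modeled).
    pub unsafe fn from_raw(ptr: *mut T) -> Option<Self> {
        // Reject null and misaligned pointers up front.
        if ptr.is_null() || (ptr as usize) % align_of::<T>() != 0 {
            return None;
        }
        // SAFETY: `UnsafeCell<T>` has the same in-memory representation as
        // `T`, and the caller promises validity for `'a`.
        let cell = unsafe { &*(ptr as *const UnsafeCell<T>) };
        Some(Shared { cell })
    }

    /// Returns a raw pointer to the contents via `UnsafeCell::get`.
    pub fn as_ptr(&self) -> *mut T {
        self.cell.get()
    }
}
```

Storing a `&'a UnsafeCell<T>` instead of a bare `*mut T` means the null and alignment checks happen once at construction, and later accesses only need to uphold the aliasing rules.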

# Caveats
Casting in the opposite direction is still not valid, even with this documentation change:

```rust
use std::cell::UnsafeCell;

// Not valid: bypasses `UnsafeCell::get` and `UnsafeCell::raw_get`.
fn example2<T>(ptr: &UnsafeCell<T>) -> &mut T {
    let t = ptr as *const UnsafeCell<T> as *mut T;
    unsafe { &mut *t }
}
```

This is because the only legal way to obtain a mutable pointer to the contents of a shared `UnsafeCell<T>` is through [`UnsafeCell::get`](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html#method.get) or [`UnsafeCell::raw_get`](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html#method.raw_get). Although there might be a desire to make this cast legal at some point in the future, that is outside the scope of this PR. See also this relevant [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/136281-t-lang.2Fwg-unsafe-code-guidelines/topic/transmuting.20.26.20-.3E.20.26mut).
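
For contrast, a sketch of a variant that goes through `UnsafeCell::get` instead of a pointer cast. The name `example2_sound` is hypothetical, and the function is marked `unsafe` because the caller still has to rule out aliasing.

```rust
use std::cell::UnsafeCell;

/// Hypothetical counterpart to `example2` that obtains the pointer via `.get()`.
///
/// # Safety
/// No other reference to the contents of `cell` may be live while the
/// returned `&mut T` is in use.
unsafe fn example2_sound<T>(cell: &UnsafeCell<T>) -> &mut T {
    // SAFETY: the pointer comes from `UnsafeCell::get`, and the caller
    // guarantees exclusive access to the contents.
    unsafe { &mut *cell.get() }
}
```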

# Alternatives
Instead of adding a new documentation promise, it's also possible to add a new method to `UnsafeCell<T>` with signature `pub fn from_ptr_bikeshed(ptr: *mut T) -> *const UnsafeCell<T>` which indirectly only allows one-way casting to `*const UnsafeCell<T>`.
Yuki Okushi 2022-10-16 11:41:12 +09:00 committed by GitHub
commit cbc0a73c95

@@ -1816,6 +1816,50 @@ impl<T: ?Sized + fmt::Display> fmt::Display for RefMut<'_, T> {
///
/// [`.get_mut()`]: `UnsafeCell::get_mut`
///
/// `UnsafeCell<T>` has the same in-memory representation as its inner type `T`. A consequence
/// of this guarantee is that it is possible to convert between `T` and `UnsafeCell<T>`.
/// Special care has to be taken when converting a nested `T` inside of an `Outer<T>` type
/// to an `Outer<UnsafeCell<T>>` type: this is not sound when the `Outer<T>` type enables [niche]
/// optimizations. For example, the type `Option<NonNull<u8>>` is typically 8 bytes large on
/// 64-bit platforms, but the type `Option<UnsafeCell<NonNull<u8>>>` takes up 16 bytes of space.
/// Therefore this is not a valid conversion, despite `NonNull<u8>` and `UnsafeCell<NonNull<u8>>`
/// having the same memory layout. This is because `UnsafeCell` disables niche optimizations to
/// prevent its interior mutability property from spreading from `T` into the `Outer` type, which
/// can distort the type size in these cases. Furthermore, it is only valid
/// to obtain a `*mut T` pointer to the contents of a _shared_ `UnsafeCell<T>` through [`.get()`]
/// or [`.raw_get()`]. A `&mut T` reference can be obtained by either dereferencing this pointer or
/// by calling [`.get_mut()`] on an _exclusive_ `UnsafeCell<T>`, e.g.:
///
/// ```rust
/// use std::cell::UnsafeCell;
///
/// let mut x: UnsafeCell<u32> = UnsafeCell::new(5);
/// let shared: &UnsafeCell<u32> = &x;
/// // using `.get()` is okay:
/// unsafe {
///     // SAFETY: there exist no other references to the contents of `x`
///     let exclusive: &mut u32 = &mut *shared.get();
/// };
/// // using `.raw_get()` is also okay:
/// unsafe {
///     // SAFETY: there exist no other references to the contents of `x` in this scope
///     let exclusive: &mut u32 = &mut *UnsafeCell::raw_get(shared as *const _);
/// };
/// // using `.get_mut()` is always safe:
/// let exclusive: &mut u32 = x.get_mut();
///
/// // when we have exclusive access, we can convert it to a shared `&UnsafeCell`:
/// unsafe {
///     // SAFETY: `u32` has no niche, therefore it has the same layout as `UnsafeCell<u32>`
///     let shared: &UnsafeCell<u32> = &*(exclusive as *mut _ as *const UnsafeCell<u32>);
///     // SAFETY: there exist no other *active* references to the contents of `x` in this scope
///     let exclusive: &mut u32 = &mut *shared.get();
/// }
/// ```
///
/// [niche]: https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#niche
/// [`.raw_get()`]: `UnsafeCell::raw_get`
///
/// # Examples
///
/// Here is an example showcasing how to soundly mutate the contents of an `UnsafeCell<_>` despite