In MSL output, avoid undefined behavior due to unbounded loops by
adding an unpredictable, never-actually-taken `break` to the bottom of
each loop body, rather than adding an unpredictable,
never-actually-taken branch over each loop.
This will probably have more of a performance impact, because it
affects each iteration of the loop, but unlike branching over the
loop, which leaves infinite loops (and thus undefined behavior) in the
output, this actually ensures that no loop presented to Metal is
unbounded, so that there is no undefined behavior present that the
optimizer could use to make unwelcome inferences.
Fixes #6528.
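As a rough illustration of the approach described above, the sketch below builds the text of an MSL loop whose body ends with a guarded `break`. The helper name `emit_bounded_loop`, the variable name `unpredictable_break`, and the use of a `volatile` local to make the condition unpredictable are all assumptions made for this example; the code the backend actually emits may differ.

```rust
/// Hypothetical helper (not part of the backend): wraps `body` in an MSL
/// `while (true)` loop and appends a guarded `break` at the bottom of the
/// loop body, so that no loop handed to Metal is provably unbounded.
fn emit_bounded_loop(body: &str) -> String {
    let mut out = String::from("while (true) {\n");
    for line in body.lines() {
        out.push_str("    ");
        out.push_str(line);
        out.push('\n');
    }
    // Assumption: reading a `volatile` local prevents the optimizer from
    // proving that the branch is never taken. The value is always false,
    // so the `break` never actually executes.
    out.push_str(
        "    { volatile bool unpredictable_break = false; if (unpredictable_break) break; }\n",
    );
    out.push_str("}\n");
    out
}
```

Because the guard sits inside the loop body, it is evaluated on every iteration, which is where the performance cost mentioned above comes from.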
Until now we accepted a float, as is the case for non-depth textures.
This makes us compliant with the spec.
The validator is updated to expect an Sint or Uint when the ImageClass
is ImageClass::Depth. The SPIR-V frontend converts the LOD argument
from float to Sint (assuming that it is representable). Likewise, the
SPIR-V backend now converts the LOD from either Sint or Uint to
Float. The HLSL and MSL backends require no changes, as they perform
that conversion implicitly. GLSL does not support non-compare LOD
samples, so no changes are required there.
This gets the `wgpu_test::ray_tracing::as_build::out_of_order_as_build` test to pass.
This seems to be an issue even on trunk: the number of calls to `create_command_encoder` and `destroy_command_encoder` in hal is not equal. I'm not sure why the validation layers don't raise `VUID-vkDestroyDevice-device-05137`.
There is still an issue with previous command buffers being leaked, but I will fix this in a follow-up.
The `Device` should not contain any `Arc`s to resources as that creates cycles (since all resources hold strong references to the `Device`).
Note that `LifetimeTracker` internally has `Arc`s to resources.
The `Device` should not contain any `Arc`s to resources as that creates cycles (since all resources hold strong references to the `Device`).
Note that `PendingWrites` internally has `Arc`s to resources.
I think this change also makes more sense conceptually since most operations that use `PendingWrites` are on the `Queue`.
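The reference cycle described above can be reproduced with a small standalone sketch. The `Device` and `Buffer` types here are simplified stand-ins invented for this example, not the real `wgpu-core` types, and the `tracked` field plays the role of state such as `LifetimeTracker` or `PendingWrites` that holds `Arc`s to resources.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Simplified stand-in for a device that (optionally) stores strong
/// references to its own resources.
struct Device {
    tracked: Mutex<Vec<Arc<Buffer>>>,
}

/// Simplified stand-in for a resource; like all resources, it holds a
/// strong reference to its device.
struct Buffer {
    _device: Arc<Device>,
}

/// Returns `true` if the device is still alive after every external handle
/// has been dropped, i.e. if it has been leaked by a reference cycle.
fn device_leaks(store_resource_in_device: bool) -> bool {
    let device = Arc::new(Device { tracked: Mutex::new(Vec::new()) });
    let observer: Weak<Device> = Arc::downgrade(&device);
    let buffer = Arc::new(Buffer { _device: Arc::clone(&device) });

    if store_resource_in_device {
        // Device -> Buffer -> Device: neither strong count can reach zero.
        device.tracked.lock().unwrap().push(Arc::clone(&buffer));
    }

    drop(buffer);
    drop(device);
    observer.upgrade().is_some()
}
```

In this sketch, `device_leaks(true)` reports a leak while `device_leaks(false)` does not, which is the motivation for keeping such state out of the `Device`.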
`Global::device_drop` wrongly assumed that `device_poll` with `Maintain::Wait` had been called, but this is not a documented invariant and only `wgpu` upheld it.
PreHashedMap was introduced as a performance optimization. It is,
however, potentially unsound as it cannot distinguish between keys
that are not equal yet have the same hash. This patch replaces its
usage with a simple FastHashMap, which should be good enough. If this
is found to cause real life performance issues down the line then we
can figure out a better solution.
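The problem with identifying keys by hash alone can be shown with a small sketch. `HashOnlyMap` and `weak_hash` are hypothetical names written for this example and do not mirror the removed type's actual implementation; the hash is deliberately weak so that a collision is easy to trigger.

```rust
use std::collections::HashMap;

/// Deliberately weak hash, used only to make a collision easy to trigger:
/// any two strings with the same bytes in a different order collide.
fn weak_hash(key: &str) -> u64 {
    key.bytes().map(u64::from).sum()
}

/// Hypothetical map that stores entries under a precomputed hash and never
/// compares the keys themselves.
struct HashOnlyMap<V> {
    inner: HashMap<u64, V>,
}

impl<V> HashOnlyMap<V> {
    fn new() -> Self {
        Self { inner: HashMap::new() }
    }

    /// Inserts under the key's hash; a colliding key silently overwrites.
    fn insert(&mut self, key: &str, value: V) -> Option<V> {
        self.inner.insert(weak_hash(key), value)
    }

    /// Looks up by hash only, so a colliding key returns the wrong entry.
    fn get(&self, key: &str) -> Option<&V> {
        self.inner.get(&weak_hash(key))
    }
}
```

With this sketch, inserting `"ab"` and then `"ba"` leaves a single entry, and looking up `"ab"` returns the value stored for `"ba"`. An ordinary hash map keyed by the full key compares keys on collision and keeps the two entries separate.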
Implement WGSL frontend and WGSL, SPIR-V, HLSL, MSL, and GLSL
backends. WGSL and SPIR-V backends natively support the instruction.
MSL and HLSL emulate it by casting to f16 and back to f32. GLSL does
something similar but must (mis)use (un)pack2x16 to do so.
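For a scalar argument, the emulation could look roughly like the expressions produced by the sketch below. The `Target` enum and `f16_round_trip_expr` are hypothetical names, and the exact expressions are assumptions rather than the backends' actual output: the HLSL form assumes the `f32tof16`/`f16tof32` intrinsics, and the GLSL form assumes the `packHalf2x16`/`unpackHalf2x16` built-ins.

```rust
/// Hypothetical enumeration of the backends that need emulation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Target {
    Msl,
    Hlsl,
    Glsl,
}

/// Hypothetical helper: returns a shader expression that rounds the scalar
/// expression `arg` to f16 precision and widens it back to f32.
fn f16_round_trip_expr(target: Target, arg: &str) -> String {
    match target {
        // MSL has a native `half` type, so two casts are enough.
        Target::Msl => format!("float(half({}))", arg),
        // Assumption: HLSL round-trips through the half-float bit pattern.
        Target::Hlsl => format!("f16tof32(f32tof16({}))", arg),
        // GLSL packs two halves into one uint, so the scalar is splatted
        // into a vec2 and only the first component is read back.
        Target::Glsl => format!("unpackHalf2x16(packHalf2x16(vec2({}))).x", arg),
    }
}
```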