This option was only evaluated for Metal backends, and now it's required
there so the option is going away. It is still configurable for tests
via the PipelineOptions struct, deserialized from .ron files.
This also fixes some type problems with the unpack functions in
writer.rs. Metal << operator extends operand to int-sized, which then
has to be cast back down to the real size before as_type bit conversion.
The math for the snorm values is corrected, in some cases using the
metal unpack_snorm2x16_to_float function because we can't directly
cast a bit-shifted ushort value to half.
Add `wgpu_core::device::Device::last_successful_submission_index`,
which records the fence value that `Maintain::Wait` should actually
wait for. See comments for details.
Fixes#5969.
Ensure that samplers using non-`Nearest` mipmap filtering are
considered "filtering samplers" when deciding bind group layout
compatibility.
Add tests for layout `NonFiltering` validation.
Fixes#5948.
* Expose gpu allocation configuration options
This commit adds hints to control memory allocations strategies to the configuration options. These hints allow for automatic profiles such as optimizing for performance (the default, makes sense for a game), optimizing for memory usage (typically more useful for a web browser or UI library) and specifying settings manually.
The details of gpu allocation are still in flux. The goal is to switch vulkan and metal to gpu_allocator which is currently used with d3d12. gpu_allocator will also likely receive more configuration options, in particular the ability to start with smaller memory block sizes and progressively grow the block size. So the manual settings already provision for this upcoming option. Another approach could be to wait and add the manual option after the dust settles.
The reason for providing presets and defining values in the backends is that I am convinced that optimal fonigurations should take hardware capabilities into consideration. It's a deep rabbithole, though, so that will be an exercise for later.
* changelog
* Update CHANGELOG.md
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Add a comment about not entirely knowing what we are doing
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Allow unconsumed inputs in fragment shaders by removing them from vertex
outputs when generating HLSL.
Fixes https://github.com/gfx-rs/wgpu/issues/3748
* Add naga:🔙:hlsl::FragmentEntryPoint for providing information
about the fragment entry point when generating vertex entry points via
naga:🔙:hlsl::Writer::write. Vertex outputs not consumed by the
fragment entry point are omitted in the final output struct.
* Add naga snapshot test for this new feature,
* Remove Features::SHADER_UNUSED_VERTEX_OUTPUT,
StageError::InputNotConsumed, and associated validation logic.
* Make wgpu dx12 backend pass fragment shader info when generating
vertex HLSL.
* Add wgpu regression test for allowing unconsumed inputs.
* Address review
* Add note that nesting structs for the inter-stage interface can't
happen.
* Remove new TODO notes (some addressed and some transferred to an issue
https://github.com/gfx-rs/wgpu/issues/5577)
* Changed issue that regression test refers to 3748 -> 5553
* Add debug_assert that binding.is_some() in hlsl writer
* Fix typos caught in CI
Also, fix compiling snapshot test when hlsl-out feature is not enabled.
* share timestamp write struct
* Make name of set_push_constants methods consistently plural
* remove lifetime bounds of resources passed into render pass
* first render pass resource ownership test
* introduce dynrenderpass & immediately create ArcCommands and take ownership of resources passed on pass creation
* Use of dynrenderpass in deno
* Separate active occlusion & pipeline statitics query
* resolve render/compute command is now behind `replay` feature
* add vertex & index buffer to ownership test
* test for pipeline statistics query
* add occlusion query set to pass resource test
* add tests for resource ownership of render pass query timestamps
* RenderPass can now be made 'static just like ComputePass. Add respective test
* Extend encoder_operations_fail_while_pass_alive test to also check encoder locking errors with render passes
* improve changelog entry on lifetime bounds
Skip
`wgpu_test::compute_pass_ownership::compute_pass_resource_ownership`
on the GL backend on AMD Radeon Pro WX 3200, to avoid the kernel crash
described in #5800.
Add the following flags to `wgpu_types::Features`:
- `SHADER_INT64_ATOMIC_ALL_OPS` enables all atomic operations on `atomic<i64>` and
`atomic<u64>` values.
- `SHADER_INT64_ATOMIC_MIN_MAX` is a subset of the above, enabling only
`AtomicFunction::Min` and `AtomicFunction::Max` operations on `atomic<i64>` and
`atomic<u64>` values in the `Storage` address space. These are the only 64-bit
atomic operations available on Metal as of 3.1.
Add corresponding flags to `naga::valid::Capabilities`. These are supported by the
WGSL front end, and all Naga backends.
Platform support:
- On Direct3d 12, in `D3D12_FEATURE_DATA_D3D12_OPTIONS9`, if
`AtomicInt64OnTypedResourceSupported` and `AtomicInt64OnGroupSharedSupported` are
both available, then both wgpu features described above are available.
- On Metal, `SHADER_INT64_ATOMIC_MIN_MAX` is available on Apple9 hardware, and on
hardware that advertises both Apple8 and Mac2 support. This also requires Metal
Shading Language 2.4 or later. Metal does not yet support the more general
`SHADER_INT64_ATOMIC_ALL_OPS`.
- On Vulkan, if the `VK_KHR_shader_atomic_int64` extension is available with both the
`shader_buffer_int64_atomics` and `shader_shared_int64_atomics` features, then both
wgpu features described above are available.
* add new tests for checking on query set lifetime
* Fix ownership management of query sets on compute passes for write_timestamp, timestamp_writes (on desc) and pipeline statistic queries
* changelog entry
* Reintroduce computepass->encoder lifetime constraint and make it opt-out via `wgpu::ComputePass::make_static`
* improve comments based on review feedback
* use the same lifetime name for all usages of `ComputePass<'encoder>`
* comment improvement that I missed earlier
* more review based comment improvements
* use suggested zero-overhead lifetime removal
* rename make_static to forge_lifetime
* missed comma
This proves a flag in msl::PipelineOptions that attempts to write all
Metal vertex entry points to use a vertex pulling technique. It does
this by:
1) Forcing the _buffer_sizes structure to be generated for all vertex
entry points. The structure has additional buffer_size members that
contain the byte sizes of the vertex buffers.
2) Adding new args to vertex entry points for the vertex id and/or
the instance id and for the bound buffers. If there is an existing
@builtin(vertex_index) or @builtin(instance_index) param, then no
duplicate arg is created.
3) Adding code at the beginning of the function for vertex entry points
to compare the vertex id or instance id against the lengths of all the
bound buffers, and force an early-exit if the bounds are violated.
4) Extracting the raw bytes from the vertex buffer(s) and unpacking
those bytes into the bound attributes with the expected types.
5) Replacing the varyings input and instead using the unpacked
attributes to fill any structs-as-args that are rebuilt in the entry
point.
A new naga test is added which exercises this flag and demonstrates the
effect of the transform. The msl generated by this test passes
validation.
Eventually this transformation will be the default, always-on behavior
for Metal pipelines, though the flag may remain so that naga
translation tests can be run with and without the tranformation.
* lift encoder->computepass lifetime constraint and add now failing test
* compute passes now take an arc to their parent command encoder, thus removing compile time dependency to it
* Command encoder goes now into locked state while compute pass is open
* changelog entry
* share most of the code between get_encoder and lock_encoder
* basic test setup
* remove lifetime and drop resources on test - test fails now just as expected
* compute pass recording is now hub dependent (needs gfx_select)
* compute pass recording now bumps reference count of uses resources directly on recording
TODO:
* bind groups don't work because the Binder gets an id only
* wgpu level error handling is missing
* simplify compute pass state flush, compute pass execution no longer needs to lock bind_group storage
* wgpu sided error handling
* make ComputePass hal dependent, removing command cast hack. Introduce DynComputePass on wgpu side
* remove stray repr(C)
* changelog entry
* fix deno issues -> move DynComputePass into wgc
* split out resources setup from test
Use `std::panic::Location` to record the source location of each
`#[gpu_test]` test, and if it fails, include that in the error output.
This is not essential, but it should make working with failures a bit
more comfortable.
Invoke a DeviceLostClosure immediately if set on an invalid device.
To make the device invalid, this defines an explicit, test-only method
make_invalid. It also modifies calls that expect to always retrieve a
valid device.
Co-authored-by: Erich Gubler <erichdongubler@gmail.com>
Rust would have made this operation either an overflow in release mode,
or a panic in debug mode. Neither seem appropriate for this context,
where I suspect an error should be returned instead. Web browsers, for
instance, shouldn't crash simply because of an issue of this nature.
Users may, quite reasonably, have bad arguments to this in early stages
of development!
Fuzz testing in Firefox encountered crashes for calls of
`Global::command_encoder_clear_buffer` where:
* `offset` is greater than `buffer.size`, but…
* `size` is `None`.
Oops! We should _always_ check this (i.e., even when `size` is `None`),
because we have no guarantee that `offset` and the fallback value of
`size` is in bounds. 😅 So, we change validation here to unconditionally
compute `size` and run checks we previously gated behind `if let
Some(size) = size { … }`.
For convenience, the spec. link for this method:
<https://gpuweb.github.io/gpuweb/#dom-gpucommandencoder-clearbuffer>