* Expose gpu allocation configuration options
This commit adds hints to control memory allocations strategies to the configuration options. These hints allow for automatic profiles such as optimizing for performance (the default, makes sense for a game), optimizing for memory usage (typically more useful for a web browser or UI library) and specifying settings manually.
The details of gpu allocation are still in flux. The goal is to switch vulkan and metal to gpu_allocator which is currently used with d3d12. gpu_allocator will also likely receive more configuration options, in particular the ability to start with smaller memory block sizes and progressively grow the block size. So the manual settings already provision for this upcoming option. Another approach could be to wait and add the manual option after the dust settles.
The reason for providing presets and defining values in the backends is that I am convinced that optimal fonigurations should take hardware capabilities into consideration. It's a deep rabbithole, though, so that will be an exercise for later.
* changelog
* Update CHANGELOG.md
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Add a comment about not entirely knowing what we are doing
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Allow unconsumed inputs in fragment shaders by removing them from vertex
outputs when generating HLSL.
Fixes https://github.com/gfx-rs/wgpu/issues/3748
* Add naga:🔙:hlsl::FragmentEntryPoint for providing information
about the fragment entry point when generating vertex entry points via
naga:🔙:hlsl::Writer::write. Vertex outputs not consumed by the
fragment entry point are omitted in the final output struct.
* Add naga snapshot test for this new feature,
* Remove Features::SHADER_UNUSED_VERTEX_OUTPUT,
StageError::InputNotConsumed, and associated validation logic.
* Make wgpu dx12 backend pass fragment shader info when generating
vertex HLSL.
* Add wgpu regression test for allowing unconsumed inputs.
* Address review
* Add note that nesting structs for the inter-stage interface can't
happen.
* Remove new TODO notes (some addressed and some transferred to an issue
https://github.com/gfx-rs/wgpu/issues/5577)
* Changed issue that regression test refers to 3748 -> 5553
* Add debug_assert that binding.is_some() in hlsl writer
* Fix typos caught in CI
Also, fix compiling snapshot test when hlsl-out feature is not enabled.
* share timestamp write struct
* Make name of set_push_constants methods consistently plural
* remove lifetime bounds of resources passed into render pass
* first render pass resource ownership test
* introduce dynrenderpass & immediately create ArcCommands and take ownership of resources passed on pass creation
* Use of dynrenderpass in deno
* Separate active occlusion & pipeline statitics query
* resolve render/compute command is now behind `replay` feature
* add vertex & index buffer to ownership test
* test for pipeline statistics query
* add occlusion query set to pass resource test
* add tests for resource ownership of render pass query timestamps
* RenderPass can now be made 'static just like ComputePass. Add respective test
* Extend encoder_operations_fail_while_pass_alive test to also check encoder locking errors with render passes
* improve changelog entry on lifetime bounds
To support atomic instructions in the SPIR-V front end, observe which
global variables the input accesses using atomic instructions, and
adjust their types from ordinary scalars to atomic values.
See comments in `naga::front::atomic_upgrade`.
* ensure that `wgpu::Error` is `Sync`
This makes it possible to wrap the error in `anyhow::Error` and
`eyre::Report`, which require the inner error to be `Sync`.
* update changelog
Add the following flags to `wgpu_types::Features`:
- `SHADER_INT64_ATOMIC_ALL_OPS` enables all atomic operations on `atomic<i64>` and
`atomic<u64>` values.
- `SHADER_INT64_ATOMIC_MIN_MAX` is a subset of the above, enabling only
`AtomicFunction::Min` and `AtomicFunction::Max` operations on `atomic<i64>` and
`atomic<u64>` values in the `Storage` address space. These are the only 64-bit
atomic operations available on Metal as of 3.1.
Add corresponding flags to `naga::valid::Capabilities`. These are supported by the
WGSL front end, and all Naga backends.
Platform support:
- On Direct3d 12, in `D3D12_FEATURE_DATA_D3D12_OPTIONS9`, if
`AtomicInt64OnTypedResourceSupported` and `AtomicInt64OnGroupSharedSupported` are
both available, then both wgpu features described above are available.
- On Metal, `SHADER_INT64_ATOMIC_MIN_MAX` is available on Apple9 hardware, and on
hardware that advertises both Apple8 and Mac2 support. This also requires Metal
Shading Language 2.4 or later. Metal does not yet support the more general
`SHADER_INT64_ATOMIC_ALL_OPS`.
- On Vulkan, if the `VK_KHR_shader_atomic_int64` extension is available with both the
`shader_buffer_int64_atomics` and `shader_shared_int64_atomics` features, then both
wgpu features described above are available.
* add new tests for checking on query set lifetime
* Fix ownership management of query sets on compute passes for write_timestamp, timestamp_writes (on desc) and pipeline statistic queries
* changelog entry
* Reintroduce computepass->encoder lifetime constraint and make it opt-out via `wgpu::ComputePass::make_static`
* improve comments based on review feedback
* use the same lifetime name for all usages of `ComputePass<'encoder>`
* comment improvement that I missed earlier
* more review based comment improvements
* use suggested zero-overhead lifetime removal
* rename make_static to forge_lifetime
* missed comma
* lift encoder->computepass lifetime constraint and add now failing test
* compute passes now take an arc to their parent command encoder, thus removing compile time dependency to it
* Command encoder goes now into locked state while compute pass is open
* changelog entry
* share most of the code between get_encoder and lock_encoder
* Add support for pipeline-overridable constants in WebGPU
* Add utility function for setting constants map
* Panic on failure to set constants map
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* rename `command_encoder_run_*_pass` to `*_pass_end` and make it a method of compute/render pass instead of encoder
* executing a compute pass consumes it now such that it can't be executed again
* use handle_error instead of handle_error_nolabel for wgpu compute pass
* use handle_error instead of handle_error_nolabel for render_pass_end
* changelog addition
* feat: `compute_pass_set_push_constant`: move panics to error variants
Co-Authored-By: Erich Gubler <erichdongubler@gmail.com>
---------
Co-authored-by: Erich Gubler <erichdongubler@gmail.com>
* basic test setup
* remove lifetime and drop resources on test - test fails now just as expected
* compute pass recording is now hub dependent (needs gfx_select)
* compute pass recording now bumps reference count of uses resources directly on recording
TODO:
* bind groups don't work because the Binder gets an id only
* wgpu level error handling is missing
* simplify compute pass state flush, compute pass execution no longer needs to lock bind_group storage
* wgpu sided error handling
* make ComputePass hal dependent, removing command cast hack. Introduce DynComputePass on wgpu side
* remove stray repr(C)
* changelog entry
* fix deno issues -> move DynComputePass into wgc
* split out resources setup from test
* Issue SetDrawColorBuffers commands before issuing ClearColor
This is necessary for glClearBuffer calls to work correctly on some machines (e.g. AMD Renoir graphics running on Linux). Without this, glClearBuffer calls are ignored.
* Use clear_buffer_f32_slice instead of gl.clear to suppress WebGL warnings
This fixes the following WebGL warning: "WebGL warning: drawBuffers: `buffers[i]` must be NONE or COLOR_ATTACHMENTi."
When using native OpenGL, it is acceptable to call glDrawBuffers with an array of buffers where i != COLOR_ATTACHMENTi. In WebGL, this is not allowed.
* Run cargo fmt
* Add changes for PR GH-5666 to the CHANGELOG
* Avoid introducing spurious features for optional dependencies
If a feature depends on an optional dependency without using the dep:
prefix, a feature with the same name as the optional dependency is
introduced. This feature almost certainly won't have any effect when
enabled other than increasing compile times and polutes the feature list
shown by cargo add. Consistently use dep: for all optional dependencies
to avoid this problem.
* Add changelog entry
* Clean up weak references to texture views
* add change to CHANGELOG.md
* drop texture view before clean up
* cleanup weak ref to bind groups
* update changelog
* Trim weak backlinks in their holders' triage functions.
---------
Co-authored-by: Jim Blandy <jimb@red-bean.com>
Document that `wgpu_hal::CommandEncoder::discard_encoding` must not be called multiple times.
Assert in `wgpu_hal::vulkan::CommandEncoder::discard_encoding` that encoding is actually in progress.
Fixes#5255.
* Prefer OpenGL over OpenGL ES
* Fix sRGB on egl
* Check if OpenGL is supported
* Add changelog entry
* Remove expected failure for OpenGL Non-ES, add comment explaining FRAMEBUFFER_SRGB, add driver info to AdapterInfo
* Fix draw indexed
* CI host doesn't seem to support Rg8Snorm and Rgb9eUfloat clearing
* pool tracker vecs
* pool
* ci
* move pool to device
* use pool ref, cleanup and comment
* suspect all the future suspects (#5413)
* suspect all the future suspects
* changelog
* changelog
* review feedback
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
Invoke a DeviceLostClosure immediately if set on an invalid device.
To make the device invalid, this defines an explicit, test-only method
make_invalid. It also modifies calls that expect to always retrieve a
valid device.
Co-authored-by: Erich Gubler <erichdongubler@gmail.com>
Naga assumed that GLSL 410 supported layout(binding = ...) but it does not,
it only supports layout(location = ...). It is not possible to enable only
layout(location = ...) currently, so we need to predicate the feature on GLSL
420 instead.
When running wgpu with an OpenGL context on macOS that is created with a core
profile and with the forward-compatibility bit set, the MAX_VARYING_COMPONENTS
constant returns 0 when queried. The default value is 60, so we return the
default value if the query returns 0.
We also need to use `#version 140` on macOS since `#version 130` isn't accepted.
Since `#version 140` should be available from OpenGL 3.1, we use that everywhere.
That way we don't need any specific macOS flags or features.
Fuzz testing in Firefox encountered crashes for calls of
`Global::command_encoder_clear_buffer` where:
* `offset` is greater than `buffer.size`, but…
* `size` is `None`.
Oops! We should _always_ check this (i.e., even when `size` is `None`),
because we have no guarantee that `offset` and the fallback value of
`size` is in bounds. 😅 So, we change validation here to unconditionally
compute `size` and run checks we previously gated behind `if let
Some(size) = size { … }`.
For convenience, the spec. link for this method:
<https://gpuweb.github.io/gpuweb/#dom-gpucommandencoder-clearbuffer>
* split out TIMESTAMP_QUERY_INSIDE_ENCODERS from TIMESTAMP_QUERY
* changelog entry
* update changelog change number
* fix web warnings
* single line changelog
* note on followup issue
When no work is submitted for a frame, presenting the surface results
in a timeout due to no work having been submitted.
Fixes#3189.
This flag was added in #1892 with a note that it was going to be
temporary until #1688 landed.
* [wgpu-core] Add tests for minimum binding size validation.
* [wgpu-core] Compute minimum binding size correctly for arrays.
In early versions of WGSL, `storage` or `uniform` global variables had
to be either structs or runtime-sized arrays. This rule was relaxed,
and now globals can have any type; Naga automatically wraps such
variables in structs when required by the backend shading language.
Under the old rules, whenever wgpu-core saw a `storage` or `uniform`
global variable with an array type, it could assume it was a
runtime-sized array, and take the stride as the minimum binding size.
Under the new rules, wgpu-core must consider fixed-sized and
runtime-sized arrays separately.
* add GL_EXT_texture_shadow_lod feature detection
* allow more cases of cube depth texture sampling in glsl
* add test for sampling a cubemap array depth texture with lod
* add test for chosing GL_EXT_texture_shadow_lod over the grad workaround if instructed
* add changelog entry for GL_EXT_texture_shadow_lod
* fix criteria for requiring and using TEXTURE_SHADOW_LOD
* require gles 320 for textureSampling over cubeArrayShadow
* prevent false positives in TEXTURE_SHADOW_LOD in checks
* make workaround_lod_with_grad usecase selection less context dependant
* move 3d array texture error into the validator
* correct ImageSample logic errors
* Replace `Instance::any_backend_feature_enabled` with `Instance::enabled_backend_features` which reports all available backends instead of just reporting if none is available.
* add changelog entry
* update enabled_backend_features in doc
* fix not enabling any backend on android, fix related doc issues
This fixes two cases where a DeviceLostClosureC might not be consumed
before it is dropped, which is a requirement:
1) When the closure is replaced, this ensures the to-be-dropped closure
is invoked.
2) When the global is dropped, this ensures that the closure is invoked
before it is dropped.
The first of these two cases is tested in a new test,
DEVICE_LOST_REPLACED_CALLBACK. The second case has a stub,
always-skipped test, DROPPED_GLOBAL_THEN_DEVICE_LOST. The test is
always-skipped because there does not appear to be a way to drop the
global from within a test. Nor is there any other way to reach
Device.prepare_to_die without having first dropping the device.
* Add serde, serialize, deserialize features to wgpu and wgpu-core
Remove trace, replay features from wgpu-types
* Do not use trace, replay in wgpu-types anymore
* Make use of deserialize, serialize features in wgpu-core
* Make use of serialize, deserialize features in wgpu
* Run cargo fmt
* Use serde(default) for deserialize only
* Fix serial-pass feature
* Add a comment for new features
* Add CHANGELOG entry
* Run cargo fmt
* serial-pass also needs serde features for Id<T>
* Add feature documentation to lib.rs docs
* wgpu-types implicit serde feature
* wgpu-core explicit serde feature
* wgpu explicit serde feature
* Update CHANGELOG.md
* Fix compilation with default features
* Address review comments
Join all threads before returning from the test case, to ensure that
we don't return from `main` until all open `Device`s have been
dropped.
This avoids a race condition in glibc in which a thread calling
`dlclose` can unmap a shared library's code even while the main thread
is still running its finalization functions. (See #5084 for details.)
Joining all threads before returning from the test ensures that the
Vulkan loader has finished `dlclose`-ing the Vulkan validation layer
shared library before `main` returns.
Remove `skip` for this test on GL/llvmpipe. With this change, that has
not been observed to crash. Without it, the test crashes within ten
runs or so.
Fixes#5084.
Fixed#4285.