* Add an optional system for counting and reporting internal resources and events
* Count API objects in wgpu-hal
* Expose internal counters in wgpu-core and wgpu.
- Improve logging in `StatelessTracker::remove_abandoned` to show the
outcome of the call.
- Add similar logging to `BufferTracker` and `TextureTracker`.
- Let `device_create_buffer`'s log only the new buffer's label, id,
and whether it's mapped at creation. It used to show the whole
descriptor, which is too much detail.
- Have `queue_submit` log the submission id, and have `device_poll`
log what it was waiting for, and what actually got done.
- Have `Device::drop` log the destruction of the raw device when it
actually happens, so it's properly ordered with respect to logging
from other parts of the device, like `Device::command_allocator`.
Add the following flags to `wgpu_types::Features`:
- `SHADER_INT64_ATOMIC_ALL_OPS` enables all atomic operations on `atomic<i64>` and
`atomic<u64>` values.
- `SHADER_INT64_ATOMIC_MIN_MAX` is a subset of the above, enabling only
`AtomicFunction::Min` and `AtomicFunction::Max` operations on `atomic<i64>` and
`atomic<u64>` values in the `Storage` address space. These are the only 64-bit
atomic operations available on Metal as of 3.1.
Add corresponding flags to `naga::valid::Capabilities`. These are supported by the
WGSL front end, and all Naga backends.
Platform support:
- On Direct3d 12, in `D3D12_FEATURE_DATA_D3D12_OPTIONS9`, if
`AtomicInt64OnTypedResourceSupported` and `AtomicInt64OnGroupSharedSupported` are
both available, then both wgpu features described above are available.
- On Metal, `SHADER_INT64_ATOMIC_MIN_MAX` is available on Apple9 hardware, and on
hardware that advertises both Apple8 and Mac2 support. This also requires Metal
Shading Language 2.4 or later. Metal does not yet support the more general
`SHADER_INT64_ATOMIC_ALL_OPS`.
- On Vulkan, if the `VK_KHR_shader_atomic_int64` extension is available with both the
`shader_buffer_int64_atomics` and `shader_shared_int64_atomics` features, then both
wgpu features described above are available.
* add new tests for checking on query set lifetime
* Fix ownership management of query sets on compute passes for write_timestamp, timestamp_writes (on desc) and pipeline statistic queries
* changelog entry
Fix two major synchronization issues in `wgpu_val::vulkan`:
- Properly order queue command buffer submissions. Due to Mesa bugs, two semaphores are required even though the Vulkan spec says that only one should be necessary.
- Properly manage surface texture acquisition and presentation:
- Acquiring a surface texture can return while the presentation engine is still displaying the texture. Applications must wait for a semaphore to be signaled before using the acquired texture.
- Presenting a surface texture requires a semaphore to ensure that drawing is complete before presentation occurs.
Co-authored-by: Jim Blandy <jimb@red-bean.com>
This proves a flag in msl::PipelineOptions that attempts to write all
Metal vertex entry points to use a vertex pulling technique. It does
this by:
1) Forcing the _buffer_sizes structure to be generated for all vertex
entry points. The structure has additional buffer_size members that
contain the byte sizes of the vertex buffers.
2) Adding new args to vertex entry points for the vertex id and/or
the instance id and for the bound buffers. If there is an existing
@builtin(vertex_index) or @builtin(instance_index) param, then no
duplicate arg is created.
3) Adding code at the beginning of the function for vertex entry points
to compare the vertex id or instance id against the lengths of all the
bound buffers, and force an early-exit if the bounds are violated.
4) Extracting the raw bytes from the vertex buffer(s) and unpacking
those bytes into the bound attributes with the expected types.
5) Replacing the varyings input and instead using the unpacked
attributes to fill any structs-as-args that are rebuilt in the entry
point.
A new naga test is added which exercises this flag and demonstrates the
effect of the transform. The msl generated by this test passes
validation.
Eventually this transformation will be the default, always-on behavior
for Metal pipelines, though the flag may remain so that naga
translation tests can be run with and without the tranformation.
* lift encoder->computepass lifetime constraint and add now failing test
* compute passes now take an arc to their parent command encoder, thus removing compile time dependency to it
* Command encoder goes now into locked state while compute pass is open
* changelog entry
* share most of the code between get_encoder and lock_encoder
These are being deprecated in the future in favor of the associated
constants (which are already being used in some code here), so this
consistently uses the preferred forms.
* rename `command_encoder_run_*_pass` to `*_pass_end` and make it a method of compute/render pass instead of encoder
* executing a compute pass consumes it now such that it can't be executed again
* use handle_error instead of handle_error_nolabel for wgpu compute pass
* use handle_error instead of handle_error_nolabel for render_pass_end
* changelog addition
* feat: `compute_pass_set_push_constant`: move panics to error variants
Co-Authored-By: Erich Gubler <erichdongubler@gmail.com>
---------
Co-authored-by: Erich Gubler <erichdongubler@gmail.com>
This was previously added in #2230 but I don't think it was necessary. #901 already implemented the buffer <-> texture validation for those formats. It's also not a requirement in the spec.
* basic test setup
* remove lifetime and drop resources on test - test fails now just as expected
* compute pass recording is now hub dependent (needs gfx_select)
* compute pass recording now bumps reference count of uses resources directly on recording
TODO:
* bind groups don't work because the Binder gets an id only
* wgpu level error handling is missing
* simplify compute pass state flush, compute pass execution no longer needs to lock bind_group storage
* wgpu sided error handling
* make ComputePass hal dependent, removing command cast hack. Introduce DynComputePass on wgpu side
* remove stray repr(C)
* changelog entry
* fix deno issues -> move DynComputePass into wgc
* split out resources setup from test
A `for` loop is less noisy than a `drain`, which requires:
- a `mut` qualifier for a variable whose modified value we never
consult
- a method name appearing mid-line instead of a control structure name
at the front of the line
- a range which is always `..`, establishing no restriction at all
- a closure instead of a block
Structured control flow syntax has a fine pedigree, originating in,
among other places, Dijkstrsa's efforts at designing languages in a
way that made it easier to formally verify programs written in
them (see "A Discipline Of Programming"). There is nothing "more
mathematical" about a method call that takes a closure than a `for`
loop. Since `for_each` is useless unless the closure has side effects,
there's nothing "more functional" about `for_each` here, either.
Obsessive use of `for_each` suggests that the author loves Haskell
without understanding it.
Rename `LifetimeTracker::triage_resources`'s `resources_map` argument
to `suspected_resources`, since this always points to a field of
`LifetimeTracker::suspected_resources`.
In the various `triage_suspected_foo` functions, name the map
`suspected_foos`.
Check whether the resource is abandoned first, since none of the rest
of the work is necessary otherwise.
Rename `non_referenced_resources` to `last_resources`. This function
copes with various senses in which the resource might be referenced or
not. Instead, `last_resources` is the name of the `ActiveSubmission`
member this may point to, which is more specific.
Move the use of `last_resources` immediately after its production.
* Avoid introducing spurious features for optional dependencies
If a feature depends on an optional dependency without using the dep:
prefix, a feature with the same name as the optional dependency is
introduced. This feature almost certainly won't have any effect when
enabled other than increasing compile times and polutes the feature list
shown by cargo add. Consistently use dep: for all optional dependencies
to avoid this problem.
* Add changelog entry
* Clean up weak references to texture views
* add change to CHANGELOG.md
* drop texture view before clean up
* cleanup weak ref to bind groups
* update changelog
* Trim weak backlinks in their holders' triage functions.
---------
Co-authored-by: Jim Blandy <jimb@red-bean.com>
Change `Device::untrack` to properly reuse the `ResourceMap` allocated
for prior calls. The prior code tries to do this but always leaves
`Device::temp_suspected` set to a new empty `ResourceMap`, leaving the
previous value to be dropped by `ResourceMap::extend`.
Change `ResourceMap::extend` to take `other` by reference, rather than
taking it by value and dropping it.
Remove unreachable code from `Global::queue_submit` that checks
whether the resources used by the command buffer have a reference
count of one, and adds them to `Device::temp_suspected` if so.
When `queue_submit` is called, all the `Arc`s processed by this code
have a reference count of at least three, even when the user has
dropped the resource:
- `Device::trackers` holds strong references to all the device's
resources.
- `CommandBufferMutable::trackers` holds strong references to all
resources used by the command buffer.
- The `used_resources` methods of the various members of
`CommandBufferMutable::trackers` all return iterators of owned
`Arc`s.
Fortunately, since the `Global::device_drop_foo` methods all add the
`foo` being dropped to `Device::suspected_resources`, and
`LifetimeTracker::triage_suspected` does an adequate job of accounting
for the uninteresting `Arc`s and leaves the resources there until
they're actually dead, things do get cleaned up without the checks in
`Global::queue_submit`.
This allows `Device::temp_suspected` to be private to
`device::resource`, with a sole remaining use in `Device::untrack`.
Fixes#5647.
The lock analyzers in the `wgpu_core::lock` module can be a bit
simpler if they can assume that locks are acquired and released in a
stack-like order: that a guard is only dropped when it is the most
recently acquired lock guard still held. So:
- Change `Device::maintain` to take a `RwLockReadGuard` for the device's
hal fence, rather than just a reference to it.
- Adjust the order in which guards are dropped in `Device::maintain`
and `Queue::submit`.
Fixes#5610.
Rather than implementing `Drop` for all three lock guard types to
restore the lock analysis' per-thread state, let lock guards own
values of a new type, `LockStateGuard`, with the appropriate `Drop`
implementation. This is cleaner and shorter, and helps us implement
`RwLock::downgrade` in a later commit.
* Fix cts_runner command invocation in readme
* Remove assertDeviceMatch from deno_webgpu in createBindGroup
This should be done as verification in wgpu-core.
* Add device mismatched check to create_buffer_binding
* Extract common logic to create_sampler_binding
* Move common logic to create_texture_binding and add device mismatch check
Introduce two new private functions, `acquire` and `release`, to the
`lock::ranked` module, to perform validation for acquiring and
releasing locks. Change `Mutex::lock` and `MutexGuard::drop` to use
those functions, rather than writing out their contents.
* move out compute command to separate module
* introduce ArcComputeCommand
* stateless tracker now returns reference to arc upon insertion
* add insert_merge_single to buffer tracker
* compute pass execution now works internally with an ArcComputeCommand
* compute pass execution now translates Command to ArcCommand ahead of time
* don't clone commands in compute pass execution
* remove doc hiding
* use option insert
* clippy fix
* fix private doc issue
* remove unnecessary copied over doc hide
If `debug_assertions` or the `"validate-locks"` feature are enabled,
change `wgpu-core` to use a wrapper around `parking_lot::Mutex` that
checks for potential deadlocks.
At the moment, `wgpu-core` does contain deadlocks, so the ranking in
the `lock::rank` module is incomplete, in the interests of keeping it
acyclic. #5572 tracks the work needed to complete the ranking.
The derivation is only effective if the generic type parameter `A`
also implements `Default`, which `HalApi` implementations generally
don't, so this derivation never actually took place. (This is why
`ResourceMaps::new` is written out the way it is.)
Move the `Mutex` in `Device::command_allocator` inside the
`CommandAllocator` type itself, allowing it to be passed by shared
reference instead of mutable reference.
Passing `CommandAllocator` to functions like
`PendingWrites::post_submit` by mutable reference requires the caller
to acquire and hold the mutex for the entire time the callee runs, but
`CommandAllocator` is just a recycling pool, with very simple
invariants; there's no reason to hold the lock for a long time.
Flesh out the documentation for `wgpu_core`'s `CommandBuffer`,
`CommandEncoder`, and associated types.
Allow doc links to private items. `wgpu-core` isn't entirely
user-facing, so it's useful to document internal items.
Replace the `wgpu_core:🆔:Id::transmute` method, the `transmute`
private module, and the `Transmute` sealed trait with some associated
functions with obvious names.
* [wgpu-core] pass resources as Arcs when adding them to the registry (fix gfx-rs#5493)
* [wgpu-core] also add `Arc::new` to `#[cfg(dx12)]` blocks
* [wgpu-core] allow `clippy::arc_with_non_send_sync`
* pool tracker vecs
* pool
* ci
* move pool to device
* use pool ref, cleanup and comment
* suspect all the future suspects (#5413)
* suspect all the future suspects
* changelog
* changelog
* review feedback
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
Invoke a DeviceLostClosure immediately if set on an invalid device.
To make the device invalid, this defines an explicit, test-only method
make_invalid. It also modifies calls that expect to always retrieve a
valid device.
Co-authored-by: Erich Gubler <erichdongubler@gmail.com>
Rust would have made this operation either an overflow in release mode,
or a panic in debug mode. Neither seem appropriate for this context,
where I suspect an error should be returned instead. Web browsers, for
instance, shouldn't crash simply because of an issue of this nature.
Users may, quite reasonably, have bad arguments to this in early stages
of development!
Fuzz testing in Firefox encountered crashes for calls of
`Global::command_encoder_clear_buffer` where:
* `offset` is greater than `buffer.size`, but…
* `size` is `None`.
Oops! We should _always_ check this (i.e., even when `size` is `None`),
because we have no guarantee that `offset` and the fallback value of
`size` is in bounds. 😅 So, we change validation here to unconditionally
compute `size` and run checks we previously gated behind `if let
Some(size) = size { … }`.
For convenience, the spec. link for this method:
<https://gpuweb.github.io/gpuweb/#dom-gpucommandencoder-clearbuffer>
* split out TIMESTAMP_QUERY_INSIDE_ENCODERS from TIMESTAMP_QUERY
* changelog entry
* update changelog change number
* fix web warnings
* single line changelog
* note on followup issue
When no work is submitted for a frame, presenting the surface results
in a timeout due to no work having been submitted.
Fixes#3189.
This flag was added in #1892 with a note that it was going to be
temporary until #1688 landed.
* docs: sync. `wgpu/Cargo.toml` feature comments with `lib.rs`
* Revert "docs: inline `document-features` usage, remove dep."
This reverts commit 3d5bec659b9cf19f1c64274de0d11808d771cc66, with an
update to `document-features`, and preferring to keep new `feature`
content. To be clear, the only difference I have observed is the
addition of the `serde` feature.
In case it shortens anyone's search, the specific issue resolved is
[`slint-ui/document-features`#20](https://github.com/slint-ui/document-features/issues/20).
* [wgpu-core] Add tests for minimum binding size validation.
* [wgpu-core] Compute minimum binding size correctly for arrays.
In early versions of WGSL, `storage` or `uniform` global variables had
to be either structs or runtime-sized arrays. This rule was relaxed,
and now globals can have any type; Naga automatically wraps such
variables in structs when required by the backend shading language.
Under the old rules, whenever wgpu-core saw a `storage` or `uniform`
global variable with an array type, it could assume it was a
runtime-sized array, and take the stride as the minimum binding size.
Under the new rules, wgpu-core must consider fixed-sized and
runtime-sized arrays separately.
It's risky to get write access through the snatchlock from a drop implementation since the snatch lock is typically held for large scopes. This commit makes it so we deffer snatching some resources to when the device is polled and we know the snatch lock is not held.
Co-authored-by: Erich Gubler <erichdongubler@gmail.com>
This fixes two cases where a DeviceLostClosureC might not be consumed
before it is dropped, which is a requirement:
1) When the closure is replaced, this ensures the to-be-dropped closure
is invoked.
2) When the global is dropped, this ensures that the closure is invoked
before it is dropped.
The first of these two cases is tested in a new test,
DEVICE_LOST_REPLACED_CALLBACK. The second case has a stub,
always-skipped test, DROPPED_GLOBAL_THEN_DEVICE_LOST. The test is
always-skipped because there does not appear to be a way to drop the
global from within a test. Nor is there any other way to reach
Device.prepare_to_die without having first dropping the device.
* Add serde, serialize, deserialize features to wgpu and wgpu-core
Remove trace, replay features from wgpu-types
* Do not use trace, replay in wgpu-types anymore
* Make use of deserialize, serialize features in wgpu-core
* Make use of serialize, deserialize features in wgpu
* Run cargo fmt
* Use serde(default) for deserialize only
* Fix serial-pass feature
* Add a comment for new features
* Add CHANGELOG entry
* Run cargo fmt
* serial-pass also needs serde features for Id<T>
* Add feature documentation to lib.rs docs
* wgpu-types implicit serde feature
* wgpu-core explicit serde feature
* wgpu explicit serde feature
* Update CHANGELOG.md
* Fix compilation with default features
* Address review comments
* naga: glsl parser should return singular ParseError similar to wgsl
* wgpu: treat glsl the same as wgsl when creating ShaderModule
* naga: update glsl parser tests to use new ParseError type
* naga: glsl ParseError errors field should be public
* wgpu-core: add 'glsl' feature
* fix some minor bugs in glsl parse error refactor
* naga/wgpu/wgpu-core: improve spirv parse error handling
* wgpu-core: feature gate use of glsl and spv naga modules
* wgpu: enable wgpu-core glsl and spirv features when appropriate
* obey clippy
* naga: derive Clone in Type
* naga: don't feature gate Clone derivation for Type
* obey cargo fmt
* wgpu-core: use bytemuck instead of zerocopy
* wgpu-core: apply suggested edit
* wgpu-core: no need to borrow spirv code
* Update wgpu/src/backend/wgpu_core.rs
Co-authored-by: Alphyr <47725341+a1phyr@users.noreply.github.com>
---------
Co-authored-by: Alphyr <47725341+a1phyr@users.noreply.github.com>
This clarifies that the Rust and C-style callbacks/closures need to be
consumed (not called) before they are dropped. It also makes the from_c
function consume the param closure so that it can be dropped without
panicking.
It also relaxes the restriction that the callback/closure can only be
called once.
* Remove the Destroyed state from Storage
It used to be how we handled destroying buffers and textures but we moved to different approach.
* Explicit check for destroyed textures/buffers in a few entry points
This used to be checked automatically when getting the resource from the registry, but has to be done manually now that we track we track the destroyed state in the resources.
Re-implements https://github.com/gfx-rs/wgpu/pull/4886 (CC @Wumpf)
without the `document-features` crate, which has issues integrating into
Firefox builds after being `cargo vendor`ed into its repository. This
issue is being tracked against
https://github.com/slint-ui/document-features/issues/20. Once resolved,
I expect that we will want to revert this PR in its entirety, since
`document-features` is still a good addition to `wgpu`'s documentation
story.
Internally, consensus has already been achieved for this change.
Firefox's ability to build unfortunately take priority over this
particular convenience. Hopefully, we won't have to compromise shortly!
I tested this by ensuring that the HTML output of our existing
`document_features::document_features!(…)` usage was exactly the same.
There should be exactly zero regressions in the current state of
documentation for users. For maintainers, I have added a disclaimer that
one needs to keep changes in sync. with the relevant `Cargo.toml`
manifests.
* Ensure device lost closure is called exactly once before being dropped.
This requires a change to the Rust callback signature, which is now Fn
instead of FnOnce. When the Rust callback or the C closure are dropped,
they will panic if they haven't been called. `device_drop` is changed
to call the closure with a message of "Device dropped." A test is added.
* Remove some locks in BindGroup
These are only written to clear the vectors when triaging bindgroups for destruction, which is not necessary. We can let the reference counts drop when the bind group is dropped.
* Make the mem_leak test pass again
We allocate a String every time we want to get a label for logging. The string is also allocated when logging is disabled. Either way, the allocation is unnecessary. This commit replaces the String with a dyn Debug reference which does not need any allocation.
* Remove the abstractions in resource maps
ResourceMaps had a rather convoluted system for erasing types that isn't needed anywhere except in one place in the triage_resource function. Everywhere else we are always dealing with specific types so using a member of the resource maps is simpler than going through an abstraction. More importantly there was a constraint that all contents of the resource maps implement the Resource trait which got in the way of some of the ongoing buffer snatching changes.
This commit simplifies this by removing the abstraction. Each resource type has its hash map directly in ResourceMaps and it is easier to have different requirements and behaviors depending on the type of the each resource.
* Introduce `dx12` and `metal` crate features to `wgpu`
* Implement dummy `Context` to allow compilation with `--no-default-features`
* Address review
* Remove `dummy::Context` in favor of `hal::api::Empty`
* Add changelog entry
* Panic early in `Instance::new()` if no backend is enabled
Co-Authored-By: Andreas Reich <1220815+Wumpf@users.noreply.github.com>
---------
Co-authored-by: Andreas Reich <1220815+Wumpf@users.noreply.github.com>
The general idea is to register postpone reigistering the buffer until towards the end of the function so that our unique reference to it lets us easily snatch the raw buffer if an error happens.
* Downgrade resource lifetime management log level to trace.
Allow promoting it back to info via an feature flag.
* Don't filter out info and warning log in the examples.
* Changelog entry.
* Downgrade storage log level from info to trace
* Downgrade tracking log level from info to trace
* Demote present log from info to debug
* Downgrade device/life.rs log from info to debug and trace
* Downgrade more log from info to trace
* Conditionally lift API logging from trace to info level
Most of this logging used to be info level until I demoted it to trace. Unfortunately this gets in my way because:
- Most of the logging I currently need is these API entry points, there is is a fair amount of more verbose logging in wgpu at higher levels than trace
- Firefox disable all trace and debug logging for optimized builds, which means I miss the this API logging where I need most.
This patch lifts the api logging back to info level.
* Move the api logging behind the api_log macro
* Clean up the trace-level logging for devices
* Log the descriptors for create_buffer and create_texture.
* Make logged ids more concise
Logged ids go from 'Id { index: 204, epoch: 1, backend: Vulkan }' to 'Id(204,1,vk)'.
* Log errors in more places.
Let `naga::TypeInner::Matrix` hold a full `Scalar`, with a kind and
byte width, not merely a byte width, to make it possible to represent
matrices of AbstractFloats for WGSL.
* Keep the value in its storage after destroy
in #4657 the destroy implementation was made to remove the value from the storage and leave an error variant in its place.
Unfortunately this causes some issues with the tracking code which expects the ID to be unregistered after the value has been fully destroyed, even if the latter is not in storage anymore.
To work around that, this commit adds a `Destroyed` variant in storage which keeps the value so that the tracking behavior is preserved while
still making sure that most accesses to the destroyed resource lead to validation errors.
... Except for submitted command buffers that need to be consumed right away. These are replaced with `Element::Error` like before this commit.
Co-authored-by: Teodor Tanasoaia <28601907+teoxoy@users.noreply.github.com>
Introduce a new struct type, `Scalar`, combining a `ScalarKind` and a
`Bytes` width, and use this whenever such pairs of values are passed
around.
In particular, use `Scalar` in `TypeInner` variants `Scalar`, `Vector`,
`Atomic`, and `ValuePointer`.
Introduce associated `Scalar` constants `I32`, `U32`, `F32`, `BOOL`
and `F64`, for common cases.
Introduce a helper function `Scalar::float` for constructing `Float`
scalars of a given width, for dealing with `TypeInner::Matrix`, which
only supplies the scalar width of its elements, not a kind.
Introduce helper functions on `Literal` and `TypeInner`, to produce
the `Scalar` describing elements' values.
Use `Scalar` in `wgpu_core::validation::NumericType` as well.
* Better handle destroying textures and buffers
Before this commit, explicitly destroying a texture or a buffer (without dropping it)
schedules the asynchronous destruction of its raw resources but does not actually mark
it as destroyed. This can cause some incorrect behavior, for example mapping a buffer
after destroying it does not cause a validation error (and later panics due to the
map callback being dropped without being called).
This Commit adds `Storage::take_and_mark_destroyed` for use in `destroy` methods.
Since it puts the resource in the error state, other methods properly catch that
the resource is no longer usable when attempting to access it and raise validation
errors.
There are other resource types that require similar treatment and will be addressed
in followup work.
* Add a changelog entry
* More complete implementation of "lose the device".
This provides a way for wgpu-core to specify a callback on "lose the
device". It ensures this callback is called at the appropriate times:
either after device.destroy has empty queues, or on demand from
device.lose.
A test has been added to device.rs.
* Updated CHANGELOG.md.
* Fix conversion to *const c_char.
* Use an allow lint to permit trivial_casts.
* rustfmt changes.
Since the pipeline id is provided by the caller, the caller may presume
that an implicit pipeline layout id is also created, even in error
conditions such as with an invalid device. Since our registry system
will panic if asked to retrieve a pipeline layout id that has never been
registered, it's dangerous to leave that id unregistered. This ensures
that the layout ids also get error values when the pipeline creation
returns an error.
* Add `BGRA8UNORM_STORAGE` extension
* Leave a comment in the backends that don't support bgra8unorm-storage
* Pass the appropriate storage format to naga
* Check for bgra8unorm storage support in the vulkan backend
* Add a test
Co-authored-by: Jinlei Li <jinleili0@outlook.com>
Co-authored-by: Teodor Tanasoaia <28601907+teoxoy@users.noreply.github.com>
* Make buffer_unmap check for device validity, add tests.
This patch makes buffer_unmap check for a valid device, and corrects
buffer_map to return the appropriate error for an invalid device.
Tests are added for both operations.
It also adds device validity checks to device_maintain_ids and to the
functions that get and set buffer sub data.
* Update changelog and test comment.
* Run rustfmt.
* Update test device_lose_then_more to specify more buffer usages to keep Vulkan happy.
* Don't create a raw bind group layout for duplicates.
* Fully support bind group layout deduplication in Pipeline::get_bind_group_layout
---------
Co-authored-by: Connor Fitzgerald <connorwadefitzgerald@gmail.com>
* wgpu-core: Only produce StageError::InputNotConsumed on DX11/DX12
This error only exists due to an issue with naga's HLSL support:
https://github.com/gfx-rs/naga/issues/1945
The WGPU spec itself allows vertex shader outputs that are
not consumed by the fragment shader.
Until the issue is fixed, we can allow unconsumed outputs on
all platforms other than DX11/DX12.
* Add Features::SHADER_UNUSED_VERTEX_OUTPUT to allow disabling check
* Pick an unused feature id
* Remove generic parameter in compat::Manager.
The code is specific to bind group layout ids and the generic parameter gets in the way.
* Add a test.
The test actually covers wgpu's configuration where deduplication does not require the indirection rather than the new code. I used it to debug the new code with the configuration hard-coded. It's tedious to add a test to cover dedpuplication of bind group layouts for users of wgpu_core to provide their IDs, we can rely on the CTS which has test for that.
* Implement bind group layout deduplication for all configurations
Currently wgpu-core implement bind group layout deduplication only when it creates its own resource IDs. In other words it works for wgpu but not in Firefox.
This PR bridges the gap by allowing an optional indirection in bind group layouts: each BGL may store an ID referring to its "deduplicated" BGL.
When referring to a BGL the rest of the code must make sure to follow the indirection. The exception is command buffer processing which is considered hot code and where we first validate against the provided BGL ID and only follow the indirection if the initial check failed.
The main pain point with this approach is the various places where wgpu-core manually updates reference counts: we have to be careful about following the indirection to track the right BGL.
* Avoid making decisions based on the size of some generic type.