…using `cargo +1.79.0 clippy --workspace --all-features --all-targets
--fix`, plus some manual changes to (1) catch some missed cases (not run
on all platforms?) and (2) `--fix` doesn't make things compile again. 😀
Some module doc comments were using `/*! ... */` syntax and had
leading ` *` prefixes on each line. This interferes with the
tracking of `clippy::doc_lazy_continuation`, so switch those over
to `//!` style comment blocks.
This leaves `/*! ... */` blocks alone which didn't prefix each
line.
The config can be made more precise so as to not accidentally
ignore some issues due to case (in-)sensitivity and searching for
substrings with `extend-words`.
Additionally, we can check the configuration directories as
well like `.github`.
The usage of `implace_it` went away some time ago, but not all
references were removed.
This option was only evaluated for Metal backends, and now it's required
there so the option is going away. It is still configurable for tests
via the PipelineOptions struct, deserialized from .ron files.
This also fixes some type problems with the unpack functions in
writer.rs. Metal << operator extends operand to int-sized, which then
has to be cast back down to the real size before as_type bit conversion.
The math for the snorm values is corrected, in some cases using the
metal unpack_snorm2x16_to_float function because we can't directly
cast a bit-shifted ushort value to half.
Introduce the new type alias `wgpu_hal::AtomicFenceValue`, which is
the atomic version of `wgpu_hal::FenceValue`. Use this type alias in
`wgpu_core`. Remove `as` conversions made unnecessary since we're not
conflating `usize` with `u64` any more.
In `naga:🔙hlsl`:
- Generate calls to `Interlocked{op}64` when necessary. not
`Interlocked{op}`.
- Make atomic operations that do not produce a value emit their
operands properly.
In the Naga snapshot tests:
- Adapt `atomicOps-int64-min-max.wgsl` to include cases that
cover non-trivial atomic operation operand emitting.
In `wgpu_hal::vulkan::adapter`:
- When retrieving physical device features, be sure to include
the `PhysicalDeviceShaderAtomicInt64Features` extending struct
in the chain whenever the `VK_KHR_shader_atomic_int64` extension
is available.
- Request both `shader_{buffer,shared}_int64_atomics` in the
`PhysicalDeviceShaderAtomicInt64Features` extending struct when either of
`wgpu_types::Features::SHADER_INT64_ATOMIC_{ALL_OPS,MIN_MAX}` is requested.
---------
Co-authored-by: Jim Blandy <jimb@red-bean.com>
It's already documented that to unmap a buffer it has to have been mapped.
Vulkan was the only backend that was returning an OOM on missing `Buffer.block` but `Buffer.map_buffer` already returns an error in this case.
* Expose gpu allocation configuration options
This commit adds hints to control memory allocations strategies to the configuration options. These hints allow for automatic profiles such as optimizing for performance (the default, makes sense for a game), optimizing for memory usage (typically more useful for a web browser or UI library) and specifying settings manually.
The details of gpu allocation are still in flux. The goal is to switch vulkan and metal to gpu_allocator which is currently used with d3d12. gpu_allocator will also likely receive more configuration options, in particular the ability to start with smaller memory block sizes and progressively grow the block size. So the manual settings already provision for this upcoming option. Another approach could be to wait and add the manual option after the dust settles.
The reason for providing presets and defining values in the backends is that I am convinced that optimal fonigurations should take hardware capabilities into consideration. It's a deep rabbithole, though, so that will be an exercise for later.
* changelog
* Update CHANGELOG.md
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Add a comment about not entirely knowing what we are doing
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Allow unconsumed inputs in fragment shaders by removing them from vertex
outputs when generating HLSL.
Fixes https://github.com/gfx-rs/wgpu/issues/3748
* Add naga:🔙:hlsl::FragmentEntryPoint for providing information
about the fragment entry point when generating vertex entry points via
naga:🔙:hlsl::Writer::write. Vertex outputs not consumed by the
fragment entry point are omitted in the final output struct.
* Add naga snapshot test for this new feature,
* Remove Features::SHADER_UNUSED_VERTEX_OUTPUT,
StageError::InputNotConsumed, and associated validation logic.
* Make wgpu dx12 backend pass fragment shader info when generating
vertex HLSL.
* Add wgpu regression test for allowing unconsumed inputs.
* Address review
* Add note that nesting structs for the inter-stage interface can't
happen.
* Remove new TODO notes (some addressed and some transferred to an issue
https://github.com/gfx-rs/wgpu/issues/5577)
* Changed issue that regression test refers to 3748 -> 5553
* Add debug_assert that binding.is_some() in hlsl writer
* Fix typos caught in CI
Also, fix compiling snapshot test when hlsl-out feature is not enabled.
We are using `program_cache.try_lock()` when creting pipelines which is covered by a guard gotten from `context.lock()`. For the `.try_lock()` to always succeed we need to make sure that the other lock acquisitions are also covered by a `context.lock()`.
The `wgpu_examples::hello_compute::tests::multithreaded_compute` test has been failing intermittently in CI due to this.
* Add an optional system for counting and reporting internal resources and events
* Count API objects in wgpu-hal
* Expose internal counters in wgpu-core and wgpu.
A recent change by rustc, now in 1.79-stable, makes empty str constants
point to the same location: 0x01. This is an optimization of sorts, not
stable behavior. Code must not rely on constants having stable addresses
nor should it pass &"" to APIs expecting CStrs or NULL addresses.
D3DCompile will segfault if you give it such a pointer, or worse:
read random garbage addresses!
Pass the NULL pointer to D3DCompile if wgpu lacks a decent CString.
refs:
- https://learn.microsoft.com/en-us/windows/win32/api/d3dcompiler/nf-d3dcompiler-d3dcompile
Co-authored-by: Jan Hohenheim <jan@hohenheim.ch>
Co-authored-by: Brezak <bezak.adam@proton.me>
Add the following flags to `wgpu_types::Features`:
- `SHADER_INT64_ATOMIC_ALL_OPS` enables all atomic operations on `atomic<i64>` and
`atomic<u64>` values.
- `SHADER_INT64_ATOMIC_MIN_MAX` is a subset of the above, enabling only
`AtomicFunction::Min` and `AtomicFunction::Max` operations on `atomic<i64>` and
`atomic<u64>` values in the `Storage` address space. These are the only 64-bit
atomic operations available on Metal as of 3.1.
Add corresponding flags to `naga::valid::Capabilities`. These are supported by the
WGSL front end, and all Naga backends.
Platform support:
- On Direct3d 12, in `D3D12_FEATURE_DATA_D3D12_OPTIONS9`, if
`AtomicInt64OnTypedResourceSupported` and `AtomicInt64OnGroupSharedSupported` are
both available, then both wgpu features described above are available.
- On Metal, `SHADER_INT64_ATOMIC_MIN_MAX` is available on Apple9 hardware, and on
hardware that advertises both Apple8 and Mac2 support. This also requires Metal
Shading Language 2.4 or later. Metal does not yet support the more general
`SHADER_INT64_ATOMIC_ALL_OPS`.
- On Vulkan, if the `VK_KHR_shader_atomic_int64` extension is available with both the
`shader_buffer_int64_atomics` and `shader_shared_int64_atomics` features, then both
wgpu features described above are available.
Fix two major synchronization issues in `wgpu_val::vulkan`:
- Properly order queue command buffer submissions. Due to Mesa bugs, two semaphores are required even though the Vulkan spec says that only one should be necessary.
- Properly manage surface texture acquisition and presentation:
- Acquiring a surface texture can return while the presentation engine is still displaying the texture. Applications must wait for a semaphore to be signaled before using the acquired texture.
- Presenting a surface texture requires a semaphore to ensure that drawing is complete before presentation occurs.
Co-authored-by: Jim Blandy <jimb@red-bean.com>
This proves a flag in msl::PipelineOptions that attempts to write all
Metal vertex entry points to use a vertex pulling technique. It does
this by:
1) Forcing the _buffer_sizes structure to be generated for all vertex
entry points. The structure has additional buffer_size members that
contain the byte sizes of the vertex buffers.
2) Adding new args to vertex entry points for the vertex id and/or
the instance id and for the bound buffers. If there is an existing
@builtin(vertex_index) or @builtin(instance_index) param, then no
duplicate arg is created.
3) Adding code at the beginning of the function for vertex entry points
to compare the vertex id or instance id against the lengths of all the
bound buffers, and force an early-exit if the bounds are violated.
4) Extracting the raw bytes from the vertex buffer(s) and unpacking
those bytes into the bound attributes with the expected types.
5) Replacing the varyings input and instead using the unpacked
attributes to fill any structs-as-args that are rebuilt in the entry
point.
A new naga test is added which exercises this flag and demonstrates the
effect of the transform. The msl generated by this test passes
validation.
Eventually this transformation will be the default, always-on behavior
for Metal pipelines, though the flag may remain so that naga
translation tests can be run with and without the tranformation.