…using `cargo +1.79.0 clippy --workspace --all-features --all-targets
--fix`, plus some manual changes to (1) catch cases it missed (perhaps
because it was not run on all platforms) and (2) repair code that `--fix`
left in a non-compiling state. 😀
Some module doc comments were using `/*! ... */` syntax and had
leading ` *` prefixes on each line. This interferes with the
tracking of `clippy::doc_lazy_continuation`, so switch those over
to `//!` style comment blocks.
This leaves alone any `/*! ... */` blocks that didn't prefix each
line.
The config can be made more precise so as to not accidentally
ignore some issues due to case (in-)sensitivity and searching for
substrings with `extend-words`.
Additionally, we can check configuration directories such as
`.github` as well.
The usage of `inplace_it` went away some time ago, but not all
references were removed.
This option was only evaluated for Metal backends, and now it's required
there so the option is going away. It is still configurable for tests
via the PipelineOptions struct, deserialized from .ron files.
This also fixes some type problems with the unpack functions in
writer.rs. Metal's << operator promotes its operand to int size, which then
has to be cast back down to the real size before the as_type bit conversion.
The math for the snorm values is corrected, in some cases using the
metal unpack_snorm2x16_to_float function because we can't directly
cast a bit-shifted ushort value to half.
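For reference, the snorm math can be sketched in Rust as below. This follows the WGSL definition of `unpack2x16snorm` (each signed 16-bit lane maps to `max(v / 32767, -1)`, low lane first); the function name is illustrative and this is not the MSL that writer.rs emits.

```rust
/// Unpack two signed-normalized 16-bit lanes from one 32-bit word.
fn unpack_snorm2x16(packed: u32) -> [f32; 2] {
    let lane = |shift: u32| -> f32 {
        // Truncate to 16 bits first, then reinterpret those bits as signed.
        let bits = (packed >> shift) as u16;
        let value = bits as i16;
        // -32768 would map slightly below -1.0, so clamp it.
        (f32::from(value) / 32767.0).max(-1.0)
    };
    [lane(0), lane(16)]
}
```

The truncate-then-reinterpret order mirrors the cast-back-down step described above: the shifted value is wider than the lane, so it must be narrowed before its bits are reinterpreted.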
Introduce the new type alias `wgpu_hal::AtomicFenceValue`, which is
the atomic version of `wgpu_hal::FenceValue`. Use this type alias in
`wgpu_core`. Remove `as` conversions made unnecessary since we're not
conflating `usize` with `u64` any more.
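A minimal sketch of the idea, assuming `FenceValue` is `u64` and the new alias names the matching atomic type. The two aliases below are local stand-ins, and `record_completed` is a hypothetical helper, not a `wgpu_core` function.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Local stand-ins for the aliases named above (assumed definitions).
pub type FenceValue = u64;
pub type AtomicFenceValue = AtomicU64;

/// Hypothetical helper: record a newly completed submission index and
/// return the highest index seen so far.
fn record_completed(last_done: &AtomicFenceValue, value: FenceValue) -> FenceValue {
    // `fetch_max` keeps the stored value monotonic even if calls race.
    last_done.fetch_max(value, Ordering::AcqRel);
    last_done.load(Ordering::Acquire)
}
```

Because both aliases are 64 bits wide, no `as` conversion appears anywhere in the helper.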
In `naga::back::hlsl`:
- Generate calls to `Interlocked{op}64` when necessary, not
`Interlocked{op}`.
- Make atomic operations that do not produce a value emit their
operands properly.
In the Naga snapshot tests:
- Adapt `atomicOps-int64-min-max.wgsl` to include cases that
cover non-trivial atomic operation operand emitting.
In `wgpu_hal::vulkan::adapter`:
- When retrieving physical device features, be sure to include
the `PhysicalDeviceShaderAtomicInt64Features` extending struct
in the chain whenever the `VK_KHR_shader_atomic_int64` extension
is available.
- Request both `shader_{buffer,shared}_int64_atomics` in the
`PhysicalDeviceShaderAtomicInt64Features` extending struct when either of
`wgpu_types::Features::SHADER_INT64_ATOMIC_{ALL_OPS,MIN_MAX}` is requested.
---------
Co-authored-by: Jim Blandy <jimb@red-bean.com>
It's already documented that to unmap a buffer it has to have been mapped.
Vulkan was the only backend that was returning an OOM on missing `Buffer.block` but `Buffer.map_buffer` already returns an error in this case.
* Expose gpu allocation configuration options
This commit adds hints to control memory allocations strategies to the configuration options. These hints allow for automatic profiles such as optimizing for performance (the default, makes sense for a game), optimizing for memory usage (typically more useful for a web browser or UI library) and specifying settings manually.
The details of gpu allocation are still in flux. The goal is to switch vulkan and metal to gpu_allocator which is currently used with d3d12. gpu_allocator will also likely receive more configuration options, in particular the ability to start with smaller memory block sizes and progressively grow the block size. So the manual settings already provision for this upcoming option. Another approach could be to wait and add the manual option after the dust settles.
The reason for providing presets and defining values in the backends is that I am convinced that optimal configurations should take hardware capabilities into consideration. It's a deep rabbit hole, though, so that will be an exercise for later.
* changelog
* Update CHANGELOG.md
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Add a comment about not entirely knowing what we are doing
---------
Co-authored-by: Andreas Reich <r_andreas2@web.de>
* Allow unconsumed inputs in fragment shaders by removing them from vertex
outputs when generating HLSL.
Fixes https://github.com/gfx-rs/wgpu/issues/3748
* Add naga::back::hlsl::FragmentEntryPoint for providing information
about the fragment entry point when generating vertex entry points via
naga::back::hlsl::Writer::write. Vertex outputs not consumed by the
fragment entry point are omitted in the final output struct.
* Add naga snapshot test for this new feature.
* Remove Features::SHADER_UNUSED_VERTEX_OUTPUT,
StageError::InputNotConsumed, and associated validation logic.
* Make wgpu dx12 backend pass fragment shader info when generating
vertex HLSL.
* Add wgpu regression test for allowing unconsumed inputs.
* Address review
* Add note that nesting structs for the inter-stage interface can't
happen.
* Remove new TODO notes (some addressed and some transferred to an issue
https://github.com/gfx-rs/wgpu/issues/5577)
* Changed issue that regression test refers to 3748 -> 5553
* Add debug_assert that binding.is_some() in hlsl writer
* Fix typos caught in CI
Also, fix compiling snapshot test when hlsl-out feature is not enabled.
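The pruning of unconsumed vertex outputs described above can be modelled roughly as follows. `Varying` and `prune_vertex_outputs` are hypothetical names for illustration only; the real HLSL writer works on naga's IR and also has to keep built-ins such as the position, which this sketch ignores.

```rust
use std::collections::HashSet;

/// Hypothetical model of one inter-stage value.
#[derive(Debug, Clone, PartialEq)]
struct Varying {
    location: u32,
    name: String,
}

/// Keep only the vertex outputs whose location the fragment stage reads.
fn prune_vertex_outputs(outputs: &[Varying], fragment_inputs: &[Varying]) -> Vec<Varying> {
    let consumed: HashSet<u32> = fragment_inputs.iter().map(|v| v.location).collect();
    outputs
        .iter()
        .filter(|v| consumed.contains(&v.location))
        .cloned()
        .collect()
}
```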
We are using `program_cache.try_lock()` when creating pipelines, which is covered by a guard obtained from `context.lock()`. For the `.try_lock()` to always succeed we need to make sure that the other lock acquisitions are also covered by a `context.lock()`.
The `wgpu_examples::hello_compute::tests::multithreaded_compute` test has been failing intermittently in CI due to this.
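The locking discipline can be sketched with plain `std::sync::Mutex`. `Shared` and `insert_program` are hypothetical stand-ins, not the GLES backend's types; the point is only that the inner `try_lock` is safe as long as every path takes the outer lock first.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the context lock and the cache it protects.
struct Shared {
    context: Mutex<()>,
    program_cache: Mutex<Vec<String>>,
}

impl Shared {
    /// Every access to `program_cache` takes `context` first, so the
    /// inner `try_lock` never observes contention.
    fn insert_program(&self, name: &str) -> usize {
        let _ctx = self.context.lock().unwrap();
        let mut cache = self
            .program_cache
            .try_lock()
            .expect("program_cache is only locked while holding context");
        cache.push(name.to_owned());
        cache.len()
    }
}
```

If any code path locked `program_cache` without holding `context`, the `expect` above could fail intermittently under load, which matches the flaky behaviour described.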
* Add an optional system for counting and reporting internal resources and events
* Count API objects in wgpu-hal
* Expose internal counters in wgpu-core and wgpu.
A recent change by rustc, now in 1.79-stable, makes empty str constants
point to the same location: 0x01. This is an optimization of sorts, not
stable behavior. Code must not rely on constants having stable addresses
nor should it pass &"" to APIs expecting CStrs or NULL addresses.
D3DCompile will segfault if you give it such a pointer, or worse:
read random garbage addresses!
Pass the NULL pointer to D3DCompile if wgpu lacks a decent CString.
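A minimal sketch of the safe pattern, using only the standard library. `name_ptr` is a hypothetical helper, not the function used in the dx12 backend.

```rust
use std::ffi::{c_char, CStr};
use std::ptr;

/// Hypothetical helper: turn an optional C string into the pointer an FFI
/// call expects, passing NULL when there is no string to hand over.
fn name_ptr(name: Option<&CStr>) -> *const c_char {
    match name {
        // A real, NUL-terminated string: pass its address.
        Some(s) => s.as_ptr(),
        // No string: pass NULL, never the address of an empty `&str`.
        None => ptr::null(),
    }
}
```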
refs:
- https://learn.microsoft.com/en-us/windows/win32/api/d3dcompiler/nf-d3dcompiler-d3dcompile
Co-authored-by: Jan Hohenheim <jan@hohenheim.ch>
Co-authored-by: Brezak <bezak.adam@proton.me>
Add the following flags to `wgpu_types::Features`:
- `SHADER_INT64_ATOMIC_ALL_OPS` enables all atomic operations on `atomic<i64>` and
`atomic<u64>` values.
- `SHADER_INT64_ATOMIC_MIN_MAX` is a subset of the above, enabling only
`AtomicFunction::Min` and `AtomicFunction::Max` operations on `atomic<i64>` and
`atomic<u64>` values in the `Storage` address space. These are the only 64-bit
atomic operations available on Metal as of 3.1.
Add corresponding flags to `naga::valid::Capabilities`. These are supported by the
WGSL front end, and all Naga backends.
Platform support:
- On Direct3D 12, in `D3D12_FEATURE_DATA_D3D12_OPTIONS9`, if
`AtomicInt64OnTypedResourceSupported` and `AtomicInt64OnGroupSharedSupported` are
both available, then both wgpu features described above are available.
- On Metal, `SHADER_INT64_ATOMIC_MIN_MAX` is available on Apple9 hardware, and on
hardware that advertises both Apple8 and Mac2 support. This also requires Metal
Shading Language 2.4 or later. Metal does not yet support the more general
`SHADER_INT64_ATOMIC_ALL_OPS`.
- On Vulkan, if the `VK_KHR_shader_atomic_int64` extension is available with both the
`shader_buffer_int64_atomics` and `shader_shared_int64_atomics` features, then both
wgpu features described above are available.
Fix two major synchronization issues in `wgpu_hal::vulkan`:
- Properly order queue command buffer submissions. Due to Mesa bugs, two semaphores are required even though the Vulkan spec says that only one should be necessary.
- Properly manage surface texture acquisition and presentation:
- Acquiring a surface texture can return while the presentation engine is still displaying the texture. Applications must wait for a semaphore to be signaled before using the acquired texture.
- Presenting a surface texture requires a semaphore to ensure that drawing is complete before presentation occurs.
Co-authored-by: Jim Blandy <jimb@red-bean.com>
This provides a flag in msl::PipelineOptions that attempts to write all
Metal vertex entry points to use a vertex pulling technique. It does
this by:
1) Forcing the _buffer_sizes structure to be generated for all vertex
entry points. The structure has additional buffer_size members that
contain the byte sizes of the vertex buffers.
2) Adding new args to vertex entry points for the vertex id and/or
the instance id and for the bound buffers. If there is an existing
@builtin(vertex_index) or @builtin(instance_index) param, then no
duplicate arg is created.
3) Adding code at the beginning of the function for vertex entry points
to compare the vertex id or instance id against the lengths of all the
bound buffers, and force an early-exit if the bounds are violated.
4) Extracting the raw bytes from the vertex buffer(s) and unpacking
those bytes into the bound attributes with the expected types.
5) Replacing the varyings input and instead using the unpacked
attributes to fill any structs-as-args that are rebuilt in the entry
point.
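Steps 3 and 4 can be modelled on the CPU side as below. This is a Rust sketch of the idea, not the generated MSL; `pull_vec2_f32` and its parameters are hypothetical, and it handles only a single `vec2<f32>` attribute stored little-endian.

```rust
/// Hypothetical CPU-side model of pulling one `vec2<f32>` attribute.
/// Returns `None` where the generated shader would exit early.
fn pull_vec2_f32(buffer: &[u8], stride: usize, offset: usize, index: usize) -> Option<[f32; 2]> {
    // Byte range of this vertex's attribute, with overflow checks.
    let start = index.checked_mul(stride)?.checked_add(offset)?;
    let end = start.checked_add(8)?;
    // Bounds check against the byte size of the bound buffer (step 3).
    let bytes = buffer.get(start..end)?;
    // Unpack the raw bytes into the expected attribute type (step 4).
    let x = f32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let y = f32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    Some([x, y])
}
```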
A new naga test is added which exercises this flag and demonstrates the
effect of the transform. The msl generated by this test passes
validation.
Eventually this transformation will be the default, always-on behavior
for Metal pipelines, though the flag may remain so that naga
translation tests can be run with and without the transformation.
These are being deprecated in the future in favor of the associated
constants (which are already being used in some code here), so this
consistently uses the preferred forms.
Document some more safety expectations for
- resource destruction methods
- `CommandEncoder` methods
- `Queue::submit`
Document `Fence` creation a bit.
Document the `Queue` trait a bit.
Document `vulkan` shader module handling a bit.
* Issue SetDrawColorBuffers commands before issuing ClearColor
This is necessary for glClearBuffer calls to work correctly on some machines (e.g. AMD Renoir graphics running on Linux). Without this, glClearBuffer calls are ignored.
* Use clear_buffer_f32_slice instead of gl.clear to suppress WebGL warnings
This fixes the following WebGL warning: "WebGL warning: drawBuffers: `buffers[i]` must be NONE or COLOR_ATTACHMENTi."
When using native OpenGL, it is acceptable to call glDrawBuffers with an array of buffers where i != COLOR_ATTACHMENTi. In WebGL, this is not allowed.
* Run cargo fmt
* Add changes for PR GH-5666 to the CHANGELOG
Document that `wgpu_hal::CommandEncoder::discard_encoding` must not be called multiple times.
Assert in `wgpu_hal::vulkan::CommandEncoder::discard_encoding` that encoding is actually in progress.
Fixes #5255.
* Prefer OpenGL over OpenGL ES
* Fix sRGB on egl
* Check if OpenGL is supported
* Add changelog entry
* Remove expected failure for OpenGL Non-ES, add comment explaining FRAMEBUFFER_SRGB, add driver info to AdapterInfo
* Fix draw indexed
* CI host doesn't seem to support Rg8Snorm and Rgb9eUfloat clearing
Flesh out the documentation for `wgpu_core`'s `CommandBuffer`,
`CommandEncoder`, and associated types.
Allow doc links to private items. `wgpu-core` isn't entirely
user-facing, so it's useful to document internal items.
Since this struct's role is to hold all the relevant "VkFooProperties"
structs we can get about a given physical device, and "capabilities"
means something else in Vulkan (SPIR-V capabilities), it seems that
`PhysicalDeviceProperties` is a better name.
In `wgpu_hal::vulkan::InstanceShared::inspect`, handle
`PhysicalDeviceCapabilities::maintenance_3` more like the way we
handle other extension-provided physical device properties.
Specifically, use `Option::insert` to populate the `Option` and borrow
a mutable reference to its value, rather than calling
`.as_mut().unwrap()`.
This change should have no observable effect on behavior. It simply
replaces a runtime check (`unwrap`) with a statically checked
borrow (`insert`).
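The difference can be seen in a small standalone sketch. `Maintenance3` and `Properties` are hypothetical stand-ins for the Vulkan structs, not the `wgpu_hal` types.

```rust
/// Hypothetical stand-in for an extension-provided properties struct.
#[derive(Default, Debug, PartialEq)]
struct Maintenance3 {
    max_memory_allocation_size: u64,
}

#[derive(Default)]
struct Properties {
    maintenance_3: Option<Maintenance3>,
}

fn fill(props: &mut Properties, size: u64) {
    // `Option::insert` stores the value and returns `&mut` to it, so no
    // later `as_mut().unwrap()` is needed to reach the contents.
    let next = props.maintenance_3.insert(Maintenance3::default());
    next.max_memory_allocation_size = size;
}
```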
When running wgpu with an OpenGL context on macOS that is created with a core
profile and with the forward-compatibility bit set, the MAX_VARYING_COMPONENTS
constant returns 0 when queried. The default value is 60, so we return the
default value if the query returns 0.
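As a sketch, the fallback amounts to the following. The function and constant names are illustrative, not the ones used in the GLES backend.

```rust
/// Default named in the text above.
const DEFAULT_MAX_VARYING_COMPONENTS: u32 = 60;

/// Treat a queried value of 0 as "driver did not report" and fall back.
fn max_varying_components(queried: u32) -> u32 {
    if queried == 0 {
        DEFAULT_MAX_VARYING_COMPONENTS
    } else {
        queried
    }
}
```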
We also need to use `#version 140` on macOS since `#version 130` isn't accepted.
Since `#version 140` should be available from OpenGL 3.1, we use that everywhere.
That way we don't need any specific macOS flags or features.
* split out TIMESTAMP_QUERY_INSIDE_ENCODERS from TIMESTAMP_QUERY
* changelog entry
* update changelog change number
* fix web warnings
* single line changelog
* note on followup issue
* Make sure to copy all of the buffers into the resource array for dx12.
Fixes #5088. Even though we're telling DX12 that the maximum frame latency should be our non-padded value, the swap chain may request any of the buffers allocated to it.
* Up the maximum frame latency on the DX12 backend to allow a larger range.
My understanding is that we shouldn't need to (the D3D12 docs aren't very specific about that), but we have evidence that these functions sometimes leave the resource pointer set to null without returning an error.
On Pop!_OS we have versions like
`Mesa 23.3.0-1pop0~1702935939~22.04~67e417a`. This failed to parse here
since it tried to split at the `.` in the suffix.
Not sure if other distros use a suffix with a `.`, but splitting from
the left and comparing as a tuple instead of a float seems cleaner
overall.
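A minimal sketch of the left-split, tuple-compare approach. `parse_mesa_version` is a hypothetical function, and the `"Mesa "` prefix handling is an assumption based on the version string quoted above.

```rust
/// Hypothetical parser: extract `(major, minor)` from a driver string such
/// as "Mesa 23.3.0-1pop0~1702935939~22.04~67e417a".
fn parse_mesa_version(info: &str) -> Option<(u32, u32)> {
    let version = info.strip_prefix("Mesa ")?;
    // Split from the left so dots inside a distro suffix are never reached.
    let mut parts = version.splitn(3, '.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}
```

Comparing tuples also orders versions correctly where a float would not: `(21, 10) > (21, 2)`, whereas `21.10 < 21.2`.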
Co-authored-by: Connor Fitzgerald <connorwadefitzgerald@gmail.com>
* Introduce `dx12` and `metal` crate features to `wgpu`
* Implement dummy `Context` to allow compilation with `--no-default-features`
* Address review
* Remove `dummy::Context` in favor of `hal::api::Empty`
* Add changelog entry
* Panic early in `Instance::new()` if no backend is enabled
Co-Authored-By: Andreas Reich <1220815+Wumpf@users.noreply.github.com>
---------
Co-authored-by: Andreas Reich <1220815+Wumpf@users.noreply.github.com>