* Horizon: Migrate wlan and stubs latest services
This PR migrate empty wlan services, values are found by RE.
Latest firmwares added some other services which are now stubbed and up-to-date.
* Fix imports ordering
* Replace image barriers inside render pass with more generic memory barrier
* Remove forceStorage since it was creating images with storage bit for formats that are not StorageImage compatible
* Add missing flags on subpass dependency
* Don't call vkCmdSetScissor with a scissor count of 0
* One semaphore per swapchain image
* Remove compute stage from read to write barriers
* Try to improve Pipeline.Barrier nonsense
* Set PipelineStateFlags based on supported stages
Just some simple changes to the buffer conversion shaders. (stride conversion, D32S8 to D24S8)
The first change is using a device local buffer for converted vertex buffers, since they're only read/written on the GPU. These paths don't trigger on NVIDIA, but if you force them to use it demonstrates the full extent writing to host owned memory from compute absolutely destroys them. AMD GPUs are less heavily affected by this issue, but since the game in question was writing 230MB from compute, I imagine it should have some effect.
The second change is allowing the buffer conversion shaders to scale their work group count. While dividing the work between 32 invocations works OK for M1 macs, it's not so great for anything with more cores like AMD GPUs, which should be able to do a lot more parallel copies. Now, it scales by roughly 100 elements per invocation.
Some stride change cases could be improved further by either limiting vertex buffer size somehow (reading the index buffer could help, but is always risky) or only updating regions that changed, rather than invalidating the whole thing.
* Implement vertex and geometry shader conversion to compute
* Call InitializeReservedCounts for compute too
* PR feedback
* Set clip distance mask for geometry and tessellation shaders too
* Transform feedback emulation only for vertex
#5576 changed where the position was declared, but forgot to add the Invariant declaration to position when the ReducedPrecision flag was enabled. This was causing weird graphical bugs in a bunch of games, mostly to do with mismatching depth between multiple draws of the same geometry.
Maybe the attempt to add it to Position in DeclareInputOrOutput can be removed now, assuming that path is never used.
* mm: Migrate service in Horizon project
This PR migrate the `mm:u` service to the Horizon project, things were checked by some RE aswell, that's why some vars are renamed, the logic should be the same as before.
Tests are welcome.
* Lock _sessionList instead
* Fix comment
* Fix Session fields order
* Move shuffle handling out of the backend to a transform pass
* Handle subgroup sizes higher than 32
* Stop using the subgroup size control extension
* Make GenerateShuffleFunction static
* Shader cache version bump
* Vulkan: Periodically free regions of the staging buffer
There was an edge case where a game could submit tens of thousands of small copies over the course of over half a minute to unique fences. This could result in a large stutter when the staging buffer became full and it tried to check and free thousands of completed fences.
This became visible with some games and mirrors on Windows, as they don't submit any buffer data via the staging buffer, but may submit copies of the support buffer.
This change makes the Vulkan backend check for staging buffer completion on each command buffer submit, so it can't get backed up with 1000s of copies to check.
* Add comment
* pr_triage: Fix invalid workflow
* Don't assign reviewers to draft PRs
* Add team review request for developers team
* Introduce Mako to make team reviewers work
* Initial implementation of buffer mirrors
Generally slower right now, goal is to reduce render passes in games that do inline updates
Fix support buffer mirrors
Reintroduce vertex buffer mirror
Add storage buffer support
Optimisation part 1
More optimisation
Avoid useless data copies.
Remove unused cbIndex stuff
Properly set write flag for storage buffers.
Fix minor issues
Not sure why this was here.
Fix BufferRangeList
Fix some big issues
Align storage buffers rather than getting full buffer as a range
Improves mirrorability of read-only storage buffers
Increase staging buffer size, as it now contains mirrors
Fix some issues with buffers not updating
Fix buffer SetDataUnchecked offset for one of the paths when using mirrors
Fix buffer mirrors interaction with buffer textures
Fix mirror rebinding
Move GetBuffer calls on indirect draws before BeginRenderPass to avoid draws without render pass
Fix mirrors rebase
Fix rebase 2023
* Fix crash when using stale vertex buffer
Similar to `Get` with a size that's too large, just treat it as a clamp.
* Explicitly set support buffer as mirrorable
* Address feedback
* Remove unused fragment of MVK workaround
* Replace logging for staging buffer OOM
* Address format issues
* Address more format issues
* Mini cleanup
* Address more things
* Rename BufferRangeList
* Support bounding range for ClearMirrors and UploadPendingData
* Add maximum size for vertex buffer mirrors
* Enable index buffer mirrors
Enabled on all platforms for the IbStreamer.
* Feedback
* Remove mystery BufferCache change
Probably macos related?
* Fix mirrors not creating when staging buffer is empty.
* Change log level to debug
This branch changes the buffer copy fast path to notify memory tracking for all resources that aren't buffers. This fixes cases where games would copy buffer data directly into texture memory, which before would only work if the texture did not already exist. I imagine this happens when the guest driver is moving data between allocations or uploading it.
Since this only affects the fast path, cases where the source data has been modified from GPU (fast path copy destination doesn't count) will still fail to notify the texture, though I don't imagine games will do this. This should be resolved in future.
This should fix some texture issues with guest OpenGL games on switch, such as Dragon Quest Builders.
This may also be useful in future for games that move shader data around memory, if we end up using memory tracking for those.
* Move some properties out of ShaderConfig
* Stop using ShaderConfig on backends
* Replace ShaderConfig usages on Translator and passes
* Move remaining properties out of ShaderConfig and delete ShaderConfig
* Remove ResourceManager property from TranslatorContext
* Move Rewriter passes to separate transform pass files
* Fix TransformPasses.RunPass on cases where a node is removed
* Move remaining ClipDistancePrimitivesWritten and UsedFeatures updates to decode stage
* Reduce excessive parameter passing a bit by using structs more
* Remove binding parameter from ShaderProperties methods since it is redundant
* Replace decoder instruction checks with switch statement
* Put GLSL on the same plan as SPIR-V for input/output declaration
* Stop mutating TranslatorContext state when Translate is called
* Pass most of the graphics state using a struct instead of individual query methods
* Auto-format
* Auto-format
* Add backend logging interface
* Auto-format
* Remove unnecessary use of interpolated strings
* Remove more modifications of AttributeUsage after decode
* PR feedback
* gl_Layer is not supported on compute
ForceDpiAware.Windows has a side effect of forcing the application DPI to be the same as the primary monitor. This isn't good if you have multiple monitors with different DPI.
On Avalonia, I don't think there are any downsides to disabling this. When it's disabled, `ForceDpiAware.GetWindowScaleFactor` always returns 1.