Ryujinx

Author	SHA1	Message	Date
gdkchan	7f6b3d234a	Implement IMUL, PCNT and CONT shader instructions, fix FFMA32I and HFMA32I (#2972 ) * Implement IMUL shader instruction * Implement PCNT/CONT instruction and fix FFMA32I * Add HFMA232I to the table * Shader cache version bump * No Rc on Ffma32i	2022-01-10 12:08:00 -03:00
riperiperi	79adba4402	Add support for render scale to vertex stage. (#2763 ) * Add support for render scale to vertex stage. Occasionally games read off textureSize on the vertex stage to inform the fragment shader what size a texture is without querying in there. Scales were not present in the vertex shader to correct the sizes, so games were providing the raw upscaled texture size to the fragment shader, which was incorrect. One downside is that the fragment and vertex support buffer description must be identical, so the full size scales array must be defined when used. I don't think this will have an impact though. Another is that the fragment texture count must be updated when vertex shader textures are used. I'd like to correct this so that the update is folded into the update for the scales. Also cleans up a bunch of things, like it making no sense to call CommitRenderScale for each stage. Fixes render scale causing a weird offset bloom in Super Mario Party and Clubhouse Games. Clubhouse Games still has a pixelated look in a number of its games due to something else it does in the shader. * Split out support buffer update, lazy updates. * Commit support buffer before compute dispatch * Remove unnecessary qualifier. * Address Feedback	2022-01-08 14:48:48 -03:00
gdkchan	119a3a1887	Fix SUATOM and other texture shader instructions with RZ dest (#2885 ) * Fix SUATOM and other texture shader instructions with RZ dest * Shader cache version bump	2021-12-08 18:36:09 -03:00
gdkchan	650cc41c02	Implement remaining shader double-precision instructions (#2845 ) * Implement remaining shader double-precision instructions * Shader cache version bump	2021-12-08 17:54:12 -03:00
gdkchan	acc0b0f313	Fix FLO.SH shader instruction with an input of 0 (#2876 ) * Fix FLO.SH shader instruction with an input of 0 * Shader cache version bump	2021-12-05 13:25:05 +01:00
gdkchan	911ea38e93	Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins (#2817 ) * Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins * Shader cache version bump * Fix back color value on fragment shader * Disable IPA multiplication for fixed function attributes and back color selection	2021-11-08 13:18:46 -03:00
gdkchan	3dee712164	Fix bindless/global memory elimination with inverted predicates (#2826 ) * Fix bindless/global memory elimination with inverted predicates * Shader cache version bump	2021-11-08 12:57:28 -03:00
gdkchan	b7a1544e8b	Fix InvocationInfo on geometry shader and bindless default integer const (#2822 ) * Fix InvocationInfo on geometry shader and bindless default integer const * Shader cache version bump * Consistency for the default value	2021-11-08 11:39:30 -03:00
gdkchan	99445dd0a6	Add support for fragment shader interlock (#2768 ) * Support coherent images * Add support for fragment shader interlock * Change to tree based match approach * Refactor + check for branch targets and external registers * Make detection more robust * Use Intel fragment shader ordering if interlock is not available, use nothing if both are not available * Remove unused field	2021-10-28 19:53:12 -03:00
gdkchan	63f1663fa9	Fix shader 8-bit and 16-bit STS/STG (#2741 ) * Fix 8 and 16-bit STG * Fix 8 and 16-bit STS * Shader cache version bump	2021-10-18 20:24:15 -03:00
riperiperi	052deebf26	Another workaround for NVIDIA driver 496.13 shader bug (#2750 ) * Another workaround for NVIDIA driver 496.13 shader bug This might work better than the other one. Give this a test to see if it fixes/doesn't fix issues with the other one. The problem seems to be when any variable assignment happens with a negation. `temp_1 = -temp_0;` seems to trigger weird behaviour, but `temp_1 = 0.0 - temp_0;` does not. This might also extend to integer types? * Update cache version * Add disclaimer comment * Wording	2021-10-18 20:04:06 -03:00
gdkchan	d512ce122c	Initial tessellation shader support (#2534 ) * Initial tessellation shader support * Nits * Re-arrange built-in table * This is not needed anymore * PR feedback	2021-10-18 18:38:04 -03:00
gdkchan	d05573bfd1	Implement SHF (funnel shift) shader instruction (#2702 ) * Implement SHF shader instruction * Shader cache version bump * Better name	2021-10-17 17:02:20 -03:00
gdkchan	a7109c767b	Rewrite shader decoding stage (#2698 ) * Rewrite shader decoding stage * Fix P2R constant buffer encoding * Fix PSET/PSETP * PR feedback * Log unimplemented shader instructions * Implement NOP * Remove using * PR feedback	2021-10-12 22:35:31 +02:00
gdkchan	fd7567a6b5	Only make render target 2D textures layered if needed (#2646 ) * Only make render target 2D textures layered if needed * Shader cache version bump * Ensure topology is updated on channel swap	2021-09-29 01:55:12 +02:00
gdkchan	f08a280ade	Use shader subgroup extensions if shader ballot is not supported (#2627 ) * Use shader subgroup extensions if shader ballot is not supported * Shader cache version bump + cleanup * The type is still required on the table	2021-09-19 14:38:39 +02:00
riperiperi	f0b00c1ae9	Fix TXQ for 3D textures. (#2613 ) * Fix TXQ for 3D textures. Assumes the texture is 3D if the component mask contains Z. This fixes a bug in UE4 games where parts of the map had garbage pointers to lighting voxels, as the lookup 3D texture was not being initialized. Most notable game is THPS1+2. May need another PR to keep image store data alive and properly flush it in order using the AutoDeleteCache. * Get sampler type for TextureSize from bound textures.	2021-09-02 00:17:43 -03:00
riperiperi	142cededd4	Implement Shader Instructions SUATOM and SURED (#2090 ) * Initial Implementation * Further improvements (no support for float/64-bit types) * Merge atomic and reduce instructions, add missing format switch * Fix rebase issues. * Not used. * Whoops. Fixed. * Partial implementation of inc/dec, cleanup and TODOs * Remove testing path * Address Feedback	2021-08-31 02:51:57 -03:00
gdkchan	416dc8fde4	Fix out-of-bounds shader thread shuffle (#2605 ) * Fix out-of-bounds shader thread shuffle * Shader cache version bump	2021-08-30 14:02:40 -03:00
gdkchan	ee1038e542	Initial support for shader attribute indexing (#2546 ) * Initial support for shader attribute indexing * Support output indexing too, other improvements * Fix order * Address feedback	2021-08-27 01:44:47 +02:00
riperiperi	ec3e848d79	Add a Multithreading layer for the GAL, multi-thread shader compilation at runtime (#2501 ) * Initial Implementation About as fast as nvidia GL multithreading, can be improved with faster command queuing. * Struct based command list Speeds up a bit. Still a lot of time lost to resource copy. * Do shader init while the render thread is active. * Introduce circular span pool V1 Ideally should be able to use structs instead of references for storing these spans on commands. Will try that next. * Refactor SpanRef some more Use a struct to represent SpanRef, rather than a reference. * Flush buffers on background thread * Use a span for UpdateRenderScale. Much faster than copying the array. * Calculate command size using reflection * WIP parallel shaders * Some minor optimisation * Only 2 max refs per command now. The command with 3 refs is gone. 😌 * Don't cast on the GPU side * Remove redundant casts, force sync on window present * Fix Shader Cache * Fix host shader save. * Fixup to work with new renderer stuff * Make command Run static, use array of delegates as lookup Profile says this takes less time than the previous way. * Bring up to date * Add settings toggle. Fix Multithreading Off mode. * Fix warning. * Release tracking lock for flushes * Fix Conditional Render fast path with threaded gal * Make handle iteration safe when releasing the lock This is mostly temporary. * Attempt to set backend threading on driver Only really works on nvidia before launching a game. * Fix race condition with BufferModifiedRangeList, exceptions in tracking actions * Update buffer set commands * Some cleanup * Only use stutter workaround when using opengl renderer non-threaded * Add host-conditional reservation of counter events There has always been the possibility that conditional rendering could use a query object just as it is disposed by the counter queue.
This change makes it so that when the host decides to use host conditional rendering, the query object is reserved so that it cannot be deleted. Counter events can optionally start reserved, as the threaded implementation can reserve them before the backend creates them, and there would otherwise be a short amount of time where the counter queue could dispose the event before a call to reserve it could be made. * Address Feedback * Make counter flush tracked again. Hopefully does not cause any issues this time. * Wait for FlushTo on the main queue thread. Currently assumes only one thread will want to FlushTo (in this case, the GPU thread) * Add SDL2 headless integration * Add HLE macro commands. Co-authored-by: Mary <mary@mary.zone>	2021-08-27 00:31:29 +02:00
gdkchan	eb181425b1	Fix size of cached compute shaders (#2548 ) * Fix size of cached compute shaders * Missed one	2021-08-12 15:59:24 -03:00
gdkchan	3148c0c21c	Unify GpuAccessorBase and TextureDescriptorCapableGpuAccessor (#2542 ) * Unify GpuAccessorBase and TextureDescriptorCapableGpuAccessor * Shader cache version bump	2021-08-11 18:56:59 -03:00
gdkchan	c3e2646f9e	Workaround for Intel FrontFacing built-in variable bug (#2540 )	2021-08-11 23:01:06 +02:00
gdkchan	ed754af8d5	Make sure attributes used on subsequent shader stages are initialized (#2538 )	2021-08-11 22:27:00 +02:00
gdkchan	0f6ec446ea	Replace BGRA and scale uniforms with a uniform block (#2496 ) * Replace BGRA and scale uniforms with a uniform block * Setting the data again on program change is no longer needed * Optimize and resolve some warnings * Avoid redundant support buffer updates * Some optimizations to BindBuffers (now inlined) * Unify render scale arrays	2021-08-11 21:33:43 +02:00
gdkchan	d9d18439f6	Use a new approach for shader BRX targets (#2532 ) * Use a new approach for shader BRX targets * Make shader cache actually work * Improve the shader pattern matching a bit * Extend LDC search to predecessor blocks, catches more cases * Nit * Only save the amount of constant buffer data actually used. Avoids crashes on partially mapped buffers * Ignore Rd on predicate instructions, as they do not have a Rd register (catches more cases)	2021-08-11 20:59:42 +02:00
gdkchan	9b08abc644	Fix shader compilation on shaders that use rectangle textures (#2471 )	2021-07-12 16:20:33 -03:00
gdkchan	40b21cc3c4	Separate GPU engines (part 2/2) (#2440 ) * 3D engine now uses DeviceState too, plus new state modification tracking * Remove old methods code * Remove GpuState and friends * Optimize DeviceState, force inline some functions * This change was not supposed to go in * Proper channel initialization * Optimize state read/write methods even more * Fix debug build * Do not dirty state if the write is redundant * The YControl register should dirty either the viewport or front face state too, to update the host origin * Avoid redundant vertex buffer updates * Move state and get rid of the Ryujinx.Graphics.Gpu.State namespace * Comments and nits * Fix rebase * PR feedback * Move changed = false to improve codegen * PR feedback * Carry RyuJIT a bit more	2021-07-11 17:20:40 -03:00
gdkchan	59900d7f00	Unscale textureSize when resolution scaling is used (#2441 ) * Unscale textureSize when resolution scaling is used * Fix textureSize on compute * Flag texture size as needing res scale values too	2021-07-09 00:09:07 -03:00
gdkchan	8b44eb1c98	Separate GPU engines and make state follow official docs (part 1/2) (#2422 ) * Use DeviceState for compute and i2m * Migrate 2D class, more comments * Migrate DMA copy engine * Remove now unused code * Replace GpuState by GpuAccessorState on GpuAccessor, since compute no longer has a GpuState * More comments * Add logging (disabled) * Add back i2m on 3D engine	2021-07-07 20:56:06 -03:00
gdkchan	d125fce3e8	Allow shader language and target API to be specified on the shader translator (#2402 )	2021-07-06 21:20:06 +02:00
gdkchan	fbb4019ed5	Initial support for separate GPU address spaces (#2394 ) * Make GPU memory manager a member of GPU channel * Move physical memory instance to the memory manager, and the caches to the physical memory * PR feedback	2021-06-29 19:32:02 +02:00
gdkchan	493648df31	Fix default value for unwritten shader outputs (#2412 ) * Fix shader default output values * Shader cache version bump	2021-06-25 19:56:03 -03:00
gdkchan	ed2f5ede0f	Fix texture sampling with depth compare and LOD level or bias (#2404 ) * Fix texture sampling with depth compare and LOD level or bias * Shader cache version bump * nit: Sorting	2021-06-25 00:54:50 +02:00
gdkchan	c71ae9c85c	Fix shader texture LOD query (#2397 )	2021-06-23 23:31:14 +02:00
gdkchan	49edf14a3e	Pass all inputs when geometry shader passthrough is enabled (#2362 ) * Pass all inputs when geometry shader passthrough is enabled * Shader cache version bump	2021-06-23 23:04:59 +02:00
riperiperi	7ff1f9aa12	End shader decoding when reaching a block that starts with an infinite loop (after BRX) (#2367 ) * End shader decoding when reaching an infinite loop The NV shader compiler puts these at the end of shaders. * Update shader cache version	2021-06-15 02:09:59 +02:00
gdkchan	3b90adcd1d	Fix shaders with mixed PBK and SSY addresses on the stack (#2329 ) * Fix shaders with mixed PBK and SSY addresses on the stack * Address PR feedback and nits	2021-06-03 01:41:53 +02:00
gdkchan	79b3243f54	Do not attempt to normalize SNORM image buffers on shaders (#2317 ) * Do not attempt to normalize SNORM image buffers on shaders * Shader cache version bump	2021-05-31 21:59:23 +02:00
riperiperi	5271cfe70b	Fix dimensions check for scale eligibility (#2301 )	2021-05-21 01:09:18 +02:00
gdkchan	12533e5c9d	Fix buffer and texture uses not being propagated for vertex A/B shaders (#2300 ) * Fix buffer and texture uses not being propagated for vertex A/B shaders * Shader cache version bump	2021-05-20 21:43:23 +02:00
gdkchan	b34c0a47b4	Fix constant buffer array size when indexing is used and other buffer descriptor and resolution scale regressions (#2298 ) * Fix constant buffer array size when indexing is used * Change default QueryConstantBufferUse value * Fix more regressions * Ensure proper order	2021-05-20 15:12:15 -03:00
gdkchan	49745cfa37	Move shader resource descriptor creation out of the backend (#2290 ) * Move shader resource descriptor creation out of the backend * Remove now unused code, and other nits * Shader cache version bump * Nits * Set format for bindless image load/store * Fix buffer write flag	2021-05-19 23:15:26 +02:00
EmulationFanatic	b5c72b44de	Merge pull request #2177 from riperiperi/feature/parallel-shader-cache Allow parallel shader compilation when loading a shader cache	2021-05-19 11:39:19 -07:00
gdkchan	0e9823d7e6	Fix shader buffer write flag on atomic instructions (#2261 ) * Fix shader buffer write flag on atomic instructions * Shader cache version bump	2021-05-01 20:46:21 +02:00
gdkchan	4770cfa920	Only enable clip distance if written to on shader (#2217 ) * Only enable clip distance if written to on shader * Signal InstanceId use through FeatureFlags * Shader cache version bump	2021-04-20 12:33:54 +02:00
riperiperi	9e68f5026e	Fix skipping missing shaders	2021-04-18 17:34:01 +01:00
riperiperi	b1c3e01691	Nit	2021-04-18 17:34:00 +01:00
riperiperi	35eac315ab	The task isn't required for loading compute binary.	2021-04-18 17:33:59 +01:00
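The NVIDIA driver 496.13 workaround in commit 052deebf26 describes replacing a unary negation (`temp_1 = -temp_0;`) with a subtraction from zero (`temp_1 = 0.0 - temp_0;`) in the generated shader source. A minimal sketch of that idea follows, written in Python rather than the project's own language. The helper names `emit_negate` and `emit_assignment` are hypothetical and are not taken from the Ryujinx code base, and the integer case is an assumption, since the commit message only speculates that integers may be affected.

```python
def emit_negate(operand: str, is_float: bool = True) -> str:
    """Build a GLSL expression that negates `operand` without unary minus.

    Hypothetical helper: emits `0.0 - x` (or `0 - x` for integers, which is
    an assumption here) instead of `-x`, the pattern the commit message
    reports as triggering the driver bug.
    """
    zero = "0.0" if is_float else "0"
    return f"{zero} - {operand}"


def emit_assignment(dest: str, expr: str) -> str:
    """Build a GLSL assignment statement (hypothetical helper)."""
    return f"{dest} = {expr};"


if __name__ == "__main__":
    # Reproduces the form quoted in the commit message.
    print(emit_assignment("temp_1", emit_negate("temp_0")))
```

The emitted expression is not parenthesised, to match the form quoted in the commit message. A real code generator would need to add parentheses when the result is nested inside a larger expression.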

100 commits