Ryujinx

Author	SHA1	Message	Date
gdkchan	b46b63e06a	Add support for alpha to coverage dithering (#3069 ) * Add support for alpha to coverage dithering * Shader cache version bump * Fix wrong alpha register * Ensure support buffer is cleared * New shader specialization based approach	2022-07-05 19:58:36 -03:00
gdkchan	851f56b08a	Support Array/3D depth-stencil render target, and single layer clears (#3400 ) * Support Array/3D depth-stencil render target, and single layer clears * Alignment	2022-06-14 13:30:39 -03:00
gdkchan	830cbf91bb	Ignore ClipControl on draw texture fallback (#3388 )	2022-06-11 14:31:17 -03:00
Mary	7bc4971cf9	misc: Clean up of CS project after Avalonia merge (#3340 ) This reformats the Avalonia csproj file, removes unused deps and readjusts the Ryujinx csproj a bit after some other changes. Also updated OpenTK.Graphics	2022-05-15 16:02:15 +02:00
riperiperi	43b4b34376	Implement Viewport Transform Disable (#3328 ) * Initial implementation (no specialization) * Use specialization * Fix render scale, increase code gen version * Revert accidental change * Address Feedback	2022-05-12 10:47:13 -03:00
riperiperi	d1146a5af2	Don't restore Viewport 0 if it hasn't been set yet. (#3219 ) Fixes a driver crash when starting some games caused by #3217	2022-03-20 14:48:43 -03:00
riperiperi	d461d4f68b	Fix OpenGL issues with RTSS overlays and OBS Game Capture (#3217 ) OpenGL game overlays and hooks tend to make a lot of assumptions about how games present frames to the screen, since presentation in OpenGL kind of sucks and they would like to have info such as the size of the screen, or if the contents are SRGB rather than linear. There are two ways of getting this. OBS hooks swap buffers to get a frame for video capture, but it actually checks the bound framebuffer at the time. I made sure that this matches the output framebuffer (the window) so that the output matches the size. RTSS checks the viewport size by default, but this was actually set to the last used viewport by the game, causing the OSD to fly all across the screen depending on how it was used (or res scale). The viewport is now manually set to match the output framebuffer size. In the case of RTSS, it also loads its resources by destructively setting a pixel pack parameter without regard to what it was set to by the guest application. OpenGL state can be set for a long period of time and is not expected to be set before each call to a method, so randomly changing it isn't great practice. To fix this, I've added a line to set the pixel unpack alignment back to 4 after presentation, which should cover RTSS loading its incredibly ugly font. - RTSS and overlays that use it should no longer cause certain textures to load incorrectly. (mario kart 8, pokemon legends arceus) - OBS Game Capture should no longer crop the game output incorrectly, flicker randomly, or capture with incorrect gamma. This doesn't fix issues with how RTSS reports our frame timings.	2022-03-20 13:37:45 -03:00
gdkchan	3bd357045f	Do not allow render targets not explicitly written by the fragment shader to be modified (#3063 ) * Do not allow render targets not explicitly written by the fragment shader to be modified * Shader cache version bump * Remove blank lines * Avoid redundant color mask updates * HostShaderCacheEntry can be null * Avoid more redundant glColorMask calls * nit: Mask -> Masks * Fix currentComponentMask * More efficient way to update _currentComponentMasks	2022-02-16 23:15:39 +01:00
gdkchan	6e0799580f	Fix render target clear when sizes mismatch (#2994 )	2022-01-11 20:15:17 +01:00
riperiperi	79adba4402	Add support for render scale to vertex stage. (#2763 ) * Add support for render scale to vertex stage. Occasionally games read off textureSize on the vertex stage to inform the fragment shader what size a texture is without querying in there. Scales were not present in the vertex shader to correct the sizes, so games were providing the raw upscaled texture size to the fragment shader, which was incorrect. One downside is that the fragment and vertex support buffer description must be identical, so the full size scales array must be defined when used. I don't think this will have an impact though. Another is that the fragment texture count must be updated when vertex shader textures are used. I'd like to correct this so that the update is folded into the update for the scales. Also cleans up a bunch of things, like it making no sense to call CommitRenderScale for each stage. Fixes render scale causing a weird offset bloom in Super Mario Party and Clubhouse Games. Clubhouse Games still has a pixelated look in a number of its games due to something else it does in the shader. * Split out support buffer update, lazy updates. * Commit support buffer before compute dispatch * Remove unnecessary qualifier. * Address Feedback	2022-01-08 14:48:48 -03:00
gdkchan	611bec6e44	Implement DrawTexture functionality (#2747 ) * Implement DrawTexture functionality * Non-NVIDIA support * Disable some features that should not affect draw texture (slow path) * Remove space from shader source * Match 2D engine names * Fix resolution scale and add missing XML docs * Disable transform feedback for draw texture fallback	2021-11-10 15:37:49 -03:00
gdkchan	d512ce122c	Initial tessellation shader support (#2534 ) * Initial tessellation shader support * Nits * Re-arrange built-in table * This is not needed anymore * PR feedback	2021-10-18 18:38:04 -03:00
riperiperi	ec3e848d79	Add a Multithreading layer for the GAL, multi-thread shader compilation at runtime (#2501 ) * Initial Implementation About as fast as nvidia GL multithreading, can be improved with faster command queuing. * Struct based command list Speeds up a bit. Still a lot of time lost to resource copy. * Do shader init while the render thread is active. * Introduce circular span pool V1 Ideally should be able to use structs instead of references for storing these spans on commands. Will try that next. * Refactor SpanRef some more Use a struct to represent SpanRef, rather than a reference. * Flush buffers on background thread * Use a span for UpdateRenderScale. Much faster than copying the array. * Calculate command size using reflection * WIP parallel shaders * Some minor optimisation * Only 2 max refs per command now. The command with 3 refs is gone. 😌 * Don't cast on the GPU side * Remove redundant casts, force sync on window present * Fix Shader Cache * Fix host shader save. * Fixup to work with new renderer stuff * Make command Run static, use array of delegates as lookup Profile says this takes less time than the previous way. * Bring up to date * Add settings toggle. Fix Multithreading Off mode. * Fix warning. * Release tracking lock for flushes * Fix Conditional Render fast path with threaded gal * Make handle iteration safe when releasing the lock This is mostly temporary. * Attempt to set backend threading on driver Only really works on nvidia before launching a game. * Fix race condition with BufferModifiedRangeList, exceptions in tracking actions * Update buffer set commands * Some cleanup * Only use stutter workaround when using opengl renderer non-threaded * Add host-conditional reservation of counter events There has always been the possibility that conditional rendering could use a query object just as it is disposed by the counter queue. This change makes it so that when the host decides to use host conditional rendering, the query object is reserved so that it cannot be deleted. Counter events can optionally start reserved, as the threaded implementation can reserve them before the backend creates them, and there would otherwise be a short amount of time where the counter queue could dispose the event before a call to reserve it could be made. * Address Feedback * Make counter flush tracked again. Hopefully does not cause any issues this time. * Wait for FlushTo on the main queue thread. Currently assumes only one thread will want to FlushTo (in this case, the GPU thread) * Add SDL2 headless integration * Add HLE macro commands. Co-authored-by: Mary <mary@mary.zone>	2021-08-27 00:31:29 +02:00
mpnico	8e1adb95cf	Add support for HLE macros and accelerate MultiDrawElementsIndirectCount #2 (#2557 ) * Add support for HLE macros and accelerate MultiDrawElementsIndirectCount * Add missing barrier * Fix index buffer count * Add support check for each macro hle before use * Add missing xml doc Co-authored-by: gdkchan <gab.dark.100@gmail.com>	2021-08-26 23:50:28 +02:00
gdkchan	c702943af3	Swap BGR components for 16-bit BGR texture formats (#2567 )	2021-08-20 18:26:25 -03:00
gdkchan	0ba4ade8f1	Ensure render scale is initialized to 1 on the backend (#2543 )	2021-08-11 19:44:41 -03:00
gdkchan	0f6ec446ea	Replace BGRA and scale uniforms with a uniform block (#2496 ) * Replace BGRA and scale uniforms with a uniform block * Setting the data again on program change is no longer needed * Optimize and resolve some warnings * Avoid redundant support buffer updates * Some optimizations to BindBuffers (now inlined) * Unify render scale arrays	2021-08-11 21:33:43 +02:00
gdkchan	fefd4619a5	Add support for custom line widths (#2406 )	2021-06-25 20:11:54 -03:00
riperiperi	fe29aff266	Use Quads on OpenGL host when supported. (#2331 ) Improves OpenGL performance on FAST RMX and Xenoblade DE/2. Will probably only work on NVIDIA GPUs, but the emulated quads path will still be valid for other GPUs. Note: SLOW RMX gets a bit faster in handheld mode. I'd recommend checking on platforms without supported host quads to make sure a GL error is actually thrown when attempting GL.Begin(PrimitiveType.Quads)	2021-06-02 13:27:30 +02:00
Mary	7527c5b906	Avoid clearing alpha channel by handle when presenting (#2323 ) * Avoid clearing alpha channel by handle when presenting Previous code was binding then blitting while the framebuffer was bound and then clearing the alpha channel by its handle. This ended up triggering a bug since AMD driver 21.4.1, which cleared the whole framebuffer as a result. New code fixes this weird logic by applying the clear on the bound framebuffer. Close #2236. * Address rip's comments * Fix AMD being broken once again	2021-06-01 09:29:01 +02:00
riperiperi	212e472c9f	Use copy dependencies for the Intel/AMD view format workaround (#2144 ) * This might help AMD a bit * Removal of old workaround.	2021-05-16 20:43:27 +02:00
riperiperi	da283ff3c3	Flip component mask if target is BGRA. (#2087 ) * Flip component mask if target is BGRA. * Make mask selection less ugly.	2021-03-08 11:12:19 +11:00
gdkchan	caf049ed15	Avoid some redundant GL calls (#1958 )	2021-01-27 08:44:07 +11:00
gdkchan	df820a72de	Implement clear buffer (fast path) (#1902 ) * Implement clear buffer (fast path) * Remove blank line	2021-01-13 08:50:54 +11:00
riperiperi	c00d39b675	Dummy out gl queries with 0 draws, remove glFlush call (#1773 )	2020-12-03 19:42:59 +01:00
gdkchan	9435d62206	Simplify depth test state updates (#1695 )	2020-11-17 23:20:17 +01:00
gdkchan	8d168574eb	Use explicit buffer and texture bindings on shaders (#1666 ) * Use explicit buffer and texture bindings on shaders * More XML docs and other nits	2020-11-08 12:10:00 +01:00
riperiperi	e1da7df207	Support res scale on images, correctly blacklist for SUST, move logic out of backend. (#1657 ) * Support res scale on images, correctly blacklist for SUST, move logic out of backend. * Fix Typo	2020-11-02 16:53:23 -03:00
gdkchan	812e32f775	Fix transform feedback errors caused by host pause/resume and multiple uses (#1634 ) * Fix transform feedback errors caused by host pause/resume * Fix TFB being used as something else issue with copies * This is supposed to be StreamCopy	2020-10-25 17:23:42 -03:00
gdkchan	2dcc6333f8	Fix image binding format (#1625 ) * Fix image binding format * XML doc	2020-10-20 19:03:20 -03:00
riperiperi	b4d8d893a4	Memory Read/Write Tracking using Region Handles (#1272 ) * WIP Range Tracking - Texture invalidation seems to have large problems - Buffer/Pool invalidation may have problems - Mirror memory tracking puts an additional `add` in compiled code, we likely just want to make HLE access slower if this is the final solution. - Native project is in the messiest possible location. - [HACK] JIT memory access always uses native "fast" path - [HACK] Trying some things with texture invalidation and views. It works :) Still a few hacks, messy things, slow things More work in progress stuff (also move to memory project) Quite a bit faster now. - Unmapping GPU VA and CPU VA will now correctly update write tracking regions, and invalidate textures for the former. - The Virtual range list is now non-overlapping like the physical one. - Fixed some bugs where regions could leak. - Introduced a weird bug that I still need to track down (consistent invalid buffer in MK8 ribbon road) Move some stuff. I think we'll eventually just put the dll and so for this in a nuget package. Fix rebase. [WIP] MultiRegionHandle variable size ranges - Avoid reprotecting regions that change often (needs some tweaking) - There's still a bug in buffers, somehow. - Might want different api for minimum granularity Fix rebase issue Commit everything needed for software only tracking. Remove native components. Remove more native stuff. Cleanup Use a separate window for the background context, update opentk. (fixes linux) Some experimental changes Should get things working up to scratch - still need to try some things with flush/modification and res scale. Include address with the region action. Initial work to make range tracking work Still a ton of bugs Fix some issues with the new stuff. * Fix texture flush instability There's still some weird behaviour, but it's much improved without this. (textures with cpu modified data were flushing over it) * Find the destination texture for Buffer->Texture full copy Greatly improves performance for nvdec videos (with range tracking) * Further improve texture tracking * Disable Memory Tracking for view parents This is a temporary approach to better match behaviour on master (where invalidations would be soaked up by views, rather than trigger twice) The assumption is that when views are created to a texture, they will cover all of its data anyways. Of course, this can easily be improved in future. * Introduce some tracking tests. WIP * Complete base tests. * Add more tests for multiregion, fix existing test. * Cleanup Part 1 * Remove unnecessary code from memory tracking * Fix some inconsistencies with 3D texture rule. * Add dispose tests. * Use a background thread for the background context. Rather than setting and unsetting a context as current, doing the work on a dedicated thread with signals seems to be a bit faster. Also nerf the multithreading test a bit. * Copy to texture with matching alignment This extends the copy to work for some videos with unusual size, such as tutorial videos in SMO. It will only occur if the destination texture already exists at XCount size. * Track reads for buffer copies. Synchronize new buffers before copying overlaps. * Remove old texture flushing mechanisms. Range tracking all the way, baby. * Wake the background thread when disposing. Avoids a deadlock when games are closed. * Address Feedback 1 * Separate TextureCopy instance for background thread Also `BackgroundContextWorker.InBackground` for a more sensible identifier for if we're in a background thread. * Add missing XML docs. * Address Feedback * Maybe I should start drinking coffee. * Some more feedback. * Remove flush warning, Refocus window after making background context	2020-10-16 17:18:35 -03:00
gdkchan	6a51b628f9	Fix error when dual source blend is used (#1610 ) * Fix error when dual source blend is used * Ensure framebuffer	2020-10-12 21:50:41 -03:00
gdkchan	1eea35554c	Better viewport flipping and depth mode detection method (#1556 ) * Use a better viewport flipping approach * New approach to detect depth mode * nit: Sort method on the OpenGL backend * Adjust spacing on comment * Unswap near and far parameters based on ScaleZ	2020-09-19 19:46:49 -03:00
mageven	a33dc2f491	Improved Logger (#1292 ) * Logger class changes only Now compile-time checking is possible with the help of Nullable Value types. * Misc formatting * Manual optimizations PrintGuestLog PrintGuestStackTrace Surfaceflinger DequeueBuffer * Reduce SendVibrationXX log level to Debug * Add Notice log level This level is always enabled and used to print system info, etc... Also, rewrite LogColor to switch expression as colors are static * Unify unhandled exception event handlers * Print enabled LogLevels during init * Re-add App Exit disposes in proper order nit: switch case spacing * Revert PrintGuestStackTrace to Info logs due to #1407 PrintGuestStackTrace is now called in some critical error handlers so revert to old behavior as KThread isn't part of Guest. * Batch replace Logger statements	2020-08-04 01:32:53 +02:00
gdkchan	43c13057da	Implement alpha test using legacy functions (#1426 )	2020-07-28 18:30:08 -03:00
gdkchan	51fbc1fde4	Use polygon offset clamp if supported (#1429 )	2020-07-26 18:11:28 -03:00
gdkchan	8dbcae1ff8	Implement BGRA texture support (#1418 ) * Implement BGRA texture support * Missing AppendLine * Remove empty lines * Address PR feedback	2020-07-26 00:03:40 -03:00
mageven	723ae240dc	GL: Implement more Point parameters (#1399 ) * Fix GL_INVALID_VALUE on glPointSize calls * Implement more of Point primitive state * Use existing Origin enum	2020-07-20 21:59:13 -03:00
gdkchan	788ca6a411	Initial transform feedback support (#1370 ) * Initial transform feedback support * Some nits and fixes * Update ReportCounterType and Write method * Can't change shader or TFB bindings while TFB is active * Fix geometry shader input names with new naming	2020-07-15 13:01:10 +10:00
riperiperi	f224769c49	Implement Logical Operation registers and functionality (#1380 ) * Implement Logical Operation registers and functionality. * Address Feedback 1	2020-07-10 14:23:15 -03:00
riperiperi	484eb645ae	Implement Zero-Configuration Resolution Scaling (#1365 ) * Initial implementation of Render Target Scaling Works with most games I have. No GUI option right now, it is hardcoded. Missing handling for texelFetch operation. * Realtime Configuration, refactoring. * texelFetch scaling on fragment shader (WIP) * Improve Shader-Side changes. * Fix potential crash when no color/depth bound * Workaround random uses of textures in compute. This was blacklisting textures in a few games despite causing no bugs. Will eventually add full support so this doesn't break anything. * Fix scales oscillating when changing between non-native scales. * Scaled textures on compute, cleanup, lazier uniform update. * Cleanup. * Fix stupidity * Address Thog Feedback. * Cover most of GDK's feedback (two comments remain) * Fix bad rename * Move IsDepthStencil to FormatExtensions, add docs. * Fix default config, square texture detection. * Three final fixes: - Nearest copy when texture is integer format. - Texture2D -> Texture3D copy correctly blacklists the texture before trying an unscaled copy (caused driver error) - Discount small textures. * Remove scale threshold. Not needed right now - we'll see if we run into problems. * All CPU modification blacklists scale. * Fix comment.	2020-07-07 04:41:07 +02:00
gdkchan	a15b951721	Fix wrong face culling once and for all (#1277 ) * Viewport swizzle support on NV and clip origin * Initialize default viewport swizzle state, emulate viewport swizzle on shaders when not supported * Address PR feedback	2020-05-28 09:03:07 +10:00
riperiperi	5dab515c7a	Flush GL commands before inevitably waiting for a query result. (#1278 )	2020-05-27 17:51:03 +10:00
riperiperi	d941f4c070	Remember bound framebuffer to avoid glGetInteger use. (#1273 ) glGetInteger seems to sync with GPU which is less than ideal, and slowing down texture copies.	2020-05-24 15:44:12 +02:00
gdkchan	5011640b30	Spanify Graphics Abstraction Layer (#1226 ) * Spanify Graphics Abstraction Layer * Be explicit about BufferHandle size	2020-05-23 11:46:09 +02:00
riperiperi	cd48576f58	Implement Counter Queue and Partial Host Conditional Rendering (#1167 ) * Implementation of query queue and host conditional rendering * Resolve some comments. * Use overloads instead of passing object. * Wake the consumer threads when incrementing syncpoints. Also, do a busy loop when awaiting the counter for a blocking flush, rather than potentially sleeping the thread. * Ensure there's a command between begin and end query.	2020-05-04 12:24:59 +10:00
mageven	53369e79bd	Implement user-defined clipping on GL state pipeline (#1118 )	2020-05-04 12:04:49 +10:00
riperiperi	c2ac45adc5	Fix depth clamp enable bit, unit scale for polygon offset. (#1178 ) Verified with deko3d and opengl driver code.	2020-04-30 11:47:24 +10:00
gdkchan	3cb1fa0e85	Implement texture buffers (#1152 ) * Implement texture buffers * Throw NotSupportedException where appropriate	2020-04-25 23:02:18 +10:00
mageven	a728610b40	Implement Constant Color blends (#1119 ) * Implement Constant Color blends and init blend states * Address gdkchan's comments Also adds Set methods to GpuState * Fix descriptions of QueryModified	2020-04-25 23:00:43 +10:00


68 commits