* use ArrayPool, avoid 6000-7000 allocs/sec of runtime
* use ArrayPool, avoid ~7k allocs/second during game execution
* use ArrayPool, avoid ~3000 allocs/sec during game execution
* use MemoryPool, reduce 0.5 MB/sec of new allocations during game execution
* avoid over-allocation by setting List<> Capacity when known
* remove LINQ in KTimeManager.UnscheduleFutureInvocation
* KTimeManager - avoid spinning one more time when the time has arrived
* KTimeManager - let SpinWait decide when to Thread.Yield(), and don't SpinOnce() immediately after Thread.Yield()
* use MemoryPool, reduce ~175k bytes/sec allocation during game execution
* IpcService - call commands via dynamic methods instead of reflection .Invoke(). Faster to call and with fewer allocations because parameters can be passed directly instead of as an array
* Make ButtonMappingEntry a record struct to avoid allocations. Set the List<ButtonMappingEntry> capacity according to use.
* add MemoryBuffer type for working with MemoryPool<byte>
* update changes to use MemoryBuffer
* make parameter ReadOnlySpan instead of Span
* whitespace fix
* Revert "IpcService - call commands via dynamic methods instead of reflection .Invoke(). Faster to call and with fewer allocations because parameters can be passed directly instead of as an array"
This reverts commit f2c698bdf65f049e8481c9f2ec7138d9b9a8261d.
* tweak KTimeManager spin behavior
* replace MemoryBuffer with ByteMemoryPool modeled after System.Buffers.ArrayMemoryPool<T>
* make ByteMemoryPoolBuffer responsible for renting memory
* IPC refactor part 3 + 4: New server HIPC message processor with source generator based serialization
* Make types match on calls to AlignUp/AlignDown
* Formatting
* Address some PR feedback
* Move BitfieldExtensions to Ryujinx.Common.Utilities and consolidate implementations
* Rename Reader/Writer to SpanReader/SpanWriter and move to Ryujinx.Common.Memory
* Implement EventType
* Address more PR feedback
* Log request processing errors since they are not normal
* Rename waitable to multiwait and add missing lock
* PR feedback
* Ac_K PR feedback
* Generic Math Update
Updated Several functions in Ryujinx.Common/Utilities/BitUtils to use generic math
* Updated BitUtil calls
* Removed Whitespace
* Switched decrement
* Fixed changed method calls.
The method calls were originally changed on accident due to me relying too much on intellisense doing stuff for me
* Update Ryujinx.Common/Utilities/BitUtils.cs
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
Co-authored-by: gdkchan <gab.dark.100@gmail.com>
* Refactor CPU interface
* Use IExecutionContext interface on SVC handler, change how CPU interrupts invokes the handlers
* Make CpuEngine take a ITickSource rather than returning one
The previous implementation had the scenario where the CPU engine had to implement the tick source in mind, like for example, when we have a hypervisor and the game can read CNTPCT on the host directly. However given that we need to do conversion due to different frequencies anyway, it's not worth it. It's better to just let the user pass the tick source and redirect any reads to CNTPCT to the user tick source
* XML docs for the public interfaces
* PPTC invalidation due to NativeInterface function name changes
* Fix build of the CPU tests
* PR feedback
* Fix race when EventWait is called and a wait is done on the CPU
* This is useless now
* Fix EventSignal
* Ensure the signal belongs to the current fence, to avoid stale signals
It seems that certain games (Link's Awakening, Xenoblade DE) had their fences reached already when posting framebuffers, so the signal that a frame was ready would go out _before_ the frame was enqueued, and the render loop would fail to dequeue anything and "skip" a frame.
This was resulting in their performance lowering dramatically after some loading transitions, as a frame signal would be consumed and presentation would be one frame behind.
It's possible this might have eventually caused deadlocks in these games or others, if it happened twice.
* Make GPU memory manager a member of GPU channel
* Move physical memory instance to the memory manager, and the caches to the physical memory
* PR feedback
* Make all title id instances unsigned
* Replace address and size with ulong instead of signed types
Long overdue change.
Also change some logics here and there to optimize with the new memory
manager.
* Address Ac_K's comments
* Remove uneeded cast all around
* Fixes some others misalignment
* Rename CommandAttribute as CommandHIpcAttribute to prepare for 12.x changes
* Implement inital support for TIPC and adds SM command ids
* *Ipc to *ipc
* Missed a ref in last commit...
* CommandAttributeTIpc to CommandAttributeTipc
* Addresses comment and fixes some bugs around
TIPC doesn't have any padding requirements as buffer C isn't a thing
Fix for RegisterService inverting two argument only on TIPC
* Surface Flinger: Fix an oversight when closing a layer
As the title say.
I also took the liberty of changing the logic on how we select the
current layer being rendered to make it more explicit when opening and
creating layers.
NOTE: Found by Ac_k.
* check for RenderLayerId and not the dictionary size
This fix a possible race condition between the time you create a layer and set the one currently used for rendering
This PR fixes a regression introduced in #1741. The actual implementation do the assumption of fences always exist and then registering the callback.
Homebrews may not use fences, so the code crashes when it try to register the callback.
* Interrupt GPU command processing when a frame's fence is reached.
* Accumulate times rather than %s
* Accurate timer for vsync
Spin wait for the last .667ms of a frame. Avoids issues caused by signalling 16ms vsync. (periodic stutters in smo)
* Use event wait for better timing.
* Fix lazy wait
Windows doesn't seem to want to do 1ms consistently, so force a spin if we're less than 2ms.
* A bit more efficiency on frame waits.
Should now wait the remainder 0.6667 instead of 1.6667 sometimes (odd waits above 1ms are reliable, unlike 1ms waits)
* Better swap interval 0 solution
737 fps without breaking a sweat. Downside: Vsync can no longer be disabled on games that use the event heavily (link's awakening - which is ok since it breaks anyways)
* Fix comment.
* Address Comments.
* Rewrite scheduler context switch code
* Fix race in UnmapIpcRestorePermission
* Fix thread exit issue that could leave the scheduler in a invalid state
* Change context switch method to not wait on guest thread, remove spin wait, use SignalAndWait to pass control
* Remove multi-core setting (it is always on now)
* Re-enable assert
* Remove multicore from default config and schema
* Fix race in KTimeManager
* IPC refactor part 2: Use ReplyAndReceive on HLE services and remove special handling from kernel
* Fix for applet transfer memory + some nits
* Keep handles if possible to avoid server handle table exhaustion
* Fix IPC ZeroFill bug
* am: Correctly implement CreateManagedDisplayLayer and implement CreateManagedDisplaySeparableLayer
CreateManagedDisplaySeparableLayer is requires since 10.x+ when appletResourceUserId != 0
* Make it exit properly
* Make ServiceNotImplementedException show the full message again
* Allow yielding execution to avoid starving other threads
* Only wait if active
* Merge IVirtualMemoryManager and IAddressSpaceManager
* Fix Ro loading data from the wrong process
Co-authored-by: Thog <me@thog.eu>
This fix a mistake I made during my original reimplementation of SurfaceFlinger by disabling async buffer.
This fix a memory corruption on Super Mario All-Stars 3D (Super Mario Sunshine & Super
Mario Galaxy now go ingame).
Thanks to @gdkchan for tracing the memory corruption.
* Logger class changes only
Now compile-time checking is possible with the help of Nullable Value
types.
* Misc formatting
* Manual optimizations
PrintGuestLog
PrintGuestStackTrace
Surfaceflinger DequeueBuffer
* Reduce SendVibrationXX log level to Debug
* Add Notice log level
This level is always enabled and used to print system info, etc...
Also, rewrite LogColor to switch expression as colors are static
* Unify unhandled exception event handlers
* Print enabled LogLevels during init
* Re-add App Exit disposes in proper order
nit: switch case spacing
* Revert PrintGuestStackTrace to Info logs due to #1407
PrintGuestStackTrace is now called in some critical error handlers
so revert to old behavior as KThread isn't part of Guest.
* Batch replace Logger statements
* SurfaceFlinger: fix some bugs
This fixes some bugs in the current implementation and make it closer to
the real implementation.
* Fix align of some variables
* Implement a new physical memory manager and replace DeviceMemory
* Proper generic constraints
* Fix debug build
* Add memory tests
* New CPU memory manager and general code cleanup
* Remove host memory management from CPU project, use Ryujinx.Memory instead
* Fix tests
* Document exceptions on MemoryBlock
* Fix leak on unix memory allocation
* Proper disposal of some objects on tests
* Fix JitCache not being set as initialized
* GetRef without checks for 8-bits and 16-bits CAS
* Add MemoryBlock destructor
* Throw in separate method to improve codegen
* Address PR feedback
* QueryModified improvements
* Fix memory write tracking not marking all pages as modified in some cases
* Simplify MarkRegionAsModified
* Remove XML doc for ghost param
* Add back optimization to avoid useless buffer updates
* Add Ryujinx.Cpu project, move MemoryManager there and remove MemoryBlockWrapper
* Some nits
* Do not perform address translation when size is 0
* Address PR feedback and format NativeInterface class
* Remove ghost parameter description
* Update Ryujinx.Cpu to .NET Core 3.1
* Address PR feedback
* Fix build
* Return a well defined value for GetPhysicalAddress with invalid VA, and do not return unmapped ranges as modified
* Typo
* nvservice: add a lock to NvHostEvent
* Disable surface flinger release fence and readd infinite timeout
* FenceAction: Add a timeout of 1 seconds as this shouldn't wait forever anyuway
* surfaceflinger: remove leftovers from the release fence
* Don't allow infinite timeout on syncpoint while printing all timeout for better debugging
This invalidate the GraphicBuffer on the consumer side when
SetPreallocatedBuffer is called on a buffer slot.
This fix rendering issues on games with a dynamic resolution like Yoshi
Crafted World.
* Rewrite SurfaceFlinger
Reimplement accurately SurfaceFlinger (based on my 8.1.0 reversing of it)
TODO: support swap interval properly and reintroduce disabled "game vsync" support.
* Some fixes for SetBufferCount
* uncomment a test from last commit
* SurfaceFlinger: don't free the graphic buffer in SetBufferCount
* SurfaceFlinger: Implement swap interval correctly
* SurfaceFlinger: Reintegrate Game VSync toggle
* SurfaceFlinger: do not push a fence on buffer release on the consumer side
* Revert "SurfaceFlinger: do not push a fence on buffer release on the consumer side"
This reverts commit 586b52b0bfab2d11f361f4b59ab7b7141020bbad.
* Make the game vsync toggle work dynamically again
* Unregister producer's Binder object when closing layer
* Address ripinperi's comments
* Add a timeout on syncpoint wait operation
Syncpoint aren't supposed to be waited on for more than a second.
This effectively workaround issues caused by not having a channel
scheduling in place yet.
PS: Also introduce Android WaitForever warning about fence being not
signaled for 3s
* Fix a print of previous commit
* Address Ac_K's comments
* Address gdkchan's comments
* Address final comments
* Implement GPU syncpoints
This adds support for GPU syncpoints on the GPU backend & nvservices.
Everything that was implemented here is based on my researches,
hardware testing of the GM20B and reversing of nvservices (8.1.0).
Thanks to @fincs for the informations about some behaviours of the pusher
and for the initial informations about syncpoints.
* syncpoint: address gdkchan's comments
* Add some missing logic to handle SubmitGpfifo correctly
* Handle the NV event API correctly
* evnt => hostEvent
* Finish addressing gdkchan's comments
* nvservices: write the output buffer even when an error is returned
* dma pusher: Implemnet prefetch barrier
lso fix when the commands should be prefetch.
* Partially fix prefetch barrier
* Add a missing syncpoint check in QueryEvent of NvHostSyncPt
* Address Ac_K's comments and fix GetSyncpoint for ChannelResourcePolicy == Channel
* fix SyncptWait & SyncptWaitEx cmds logic
* Address ripinperi's comments
* Address gdkchan's comments
* Move user event management to the control channel
* Fix mm implementation, nvdec works again
* Address ripinperi's comments
* Address gdkchan's comments
* Implement nvhost-ctrl close accurately + make nvservices dispose channels when stopping the emulator
* Fix typo in MultiMediaOperationType
* Add detail of ZbcSetTableArguments
This is a missing part of the #800 PR that cause an assert to be
triggered in debug mode.
Also, remove Fence in SurfaceFlinger as it's a duplicate of NvFence.
* Fix critical issue in size checking of ioctl
oops