Compare commits

142 Commits

Author SHA1 Message Date
makigumo
9678d42346 audio_renderer: passthrough 6 channel data 2020-03-12 15:10:27 +01:00
bunnei
fce33adcf1 Merge pull request #3494 from ReinUsesLisp/fix-cs-pipeline
gl_shader_manager: Fix interaction between graphics and compute
2020-03-11 13:51:54 -04:00
ReinUsesLisp
8357908099 gl_shader_manager: Fix interaction between graphics and compute
After a compute shader was set to the pipeline, no graphics shader was
invoked again. To address this use glUseProgram to bind compute shaders
(without state tracking) and call glUseProgram(0) when transitioning out
of it back to the graphics pipeline.
2020-03-11 01:04:52 -03:00
bunnei
503ebe9b96 Merge pull request #3458 from FearlessTobi/voice-issues
cubeb_sink: Don't discard other channels when performing downmixing
2020-03-10 22:18:37 -04:00
Rodrigo Locatti
22e825a3bc Merge pull request #3301 from ReinUsesLisp/state-tracker
video_core: Remove gl_state and use a state tracker based on dirty flags
2020-03-09 18:34:37 -03:00
bunnei
c281173df6 Merge pull request #3486 from ReinUsesLisp/fix-anisotropy-hack
textures: Fix anisotropy hack
2020-03-08 16:28:07 -04:00
ReinUsesLisp
1aa75b1081 textures: Fix anisotropy hack
Previous code could generate an anisotropy value way higher than x16.
2020-03-08 15:59:38 -03:00
FearlessTobi
59d0d34dce cubeb_sink: Don't discard other channels when performing downmixing
Previously, when performing downmixing, we would discard all channels except the left and right one.
This implementation respects them when mixing down to Stereo.
It is taken from this document: http://www.atsc.org/wp-content/uploads/2015/03/A52-201212-17.pdf.

Fixes Luigi's Mansion 3 cutscene and Bayonetta audio.
2020-03-08 03:16:06 -04:00
bunnei
84e9f9f395 Merge pull request #3452 from Morph1984/anisotropic-filtering
frontend/Graphics: Add "Advanced" graphics tab and experimental Anisotropic Filtering support
2020-03-07 22:28:35 -05:00
bunnei
662feb8c1c Merge pull request #3481 from ReinUsesLisp/abgr5-storage
maxwell_to_vk: Remove Storage capability for A1B5G5R5U
2020-03-07 19:51:33 -05:00
ReinUsesLisp
aa6fe3f1aa maxwell_to_vk: Remove Storage capability for A1B5G5R5U 2020-03-06 18:47:27 -03:00
bunnei
49eff536d0 Merge pull request #3463 from ReinUsesLisp/vk-toctou
vk_swapchain: Silence TOCTOU race condition
2020-03-05 19:38:42 -05:00
bunnei
4a8fe67964 Merge pull request #3479 from jroweboy/dont-log-on-no-input
Minor fixes for udp input
2020-03-05 15:09:48 -05:00
bunnei
0361aa1915 Merge pull request #3451 from ReinUsesLisp/indexed-textures
vk_shader_decompiler: Implement indexed textures
2020-03-05 11:42:46 -05:00
bunnei
fa1d625eed Merge pull request #3469 from namkazt/patch-1
shader_decode: Fix LD, LDG when track constant buffer
2020-03-04 23:10:01 -05:00
bunnei
1e84d22275 Merge pull request #3478 from bunnei/a32
Refactoring to boot A32 games
2020-03-04 20:37:51 -05:00
James Rowe
002d9508a0 input/udp - Add minor error handling to prevent bad input from crashing 2020-03-03 23:46:05 -07:00
bunnei
67e7186d79 Merge pull request #3455 from ReinUsesLisp/attr-scaled
video_core: Implement more scaled attribute formats
2020-03-03 22:46:20 -05:00
James Rowe
fc205a1bc5 Frontend/SDL - Provide proper default for UDP input
When the default file is read in, the settings default value is only used
when the key is missing. Here the key existed but its value was an empty string,
so the empty string was accepted and passed into the core.
2020-03-03 20:05:42 -07:00
James Rowe
2cdda8c564 input/udp - Dont log on invalid packet received 2020-03-03 19:52:16 -07:00
bunnei
dba112e510 core: hle: Implement separate A32/A64 SVC interfaces. 2020-03-02 21:52:03 -05:00
bunnei
c083ea7d78 core: Implement separate A32/A64 ARM interfaces. 2020-03-02 21:51:57 -05:00
bunnei
6fc485a607 core: loader: Remove check for 32-bit. 2020-03-02 21:43:15 -05:00
bunnei
64facb403e core: dynarmic: Add CP15 from Citra. 2020-03-02 21:43:15 -05:00
bunnei
08c638f249 Merge pull request #3464 from FernandoS27/jit-fix
ARM_Interface: Cache the JITs instead of deleting/recreating.
2020-03-02 21:41:43 -05:00
bunnei
dfa2e336ba Merge pull request #3475 from yuzu-emu/FearlessTobi-readme
Port citra-emu/citra#5097: "Update README.md"
2020-03-01 22:41:41 -05:00
Tobias
6af8ff24c9 Update README.md 2020-03-01 18:03:32 +01:00
Nguyen Dac Nam
85a4222a8c nit: move comment to right place. 2020-02-29 13:50:10 +07:00
bunnei
ca7618684c Merge pull request #3448 from bunnei/fix-audio-interp-2
audio_core: interpolate: Improvements to fix audio crackling.
2020-02-28 16:07:10 -05:00
ReinUsesLisp
735c003a70 video_core/dirty_flags: Address feedback 2020-02-28 17:56:43 -03:00
ReinUsesLisp
ef7f6eb67d renderer_opengl: Fix edge-case where alpha testing might cull presentation 2020-02-28 17:56:43 -03:00
ReinUsesLisp
a6a350ddc3 gl_texture_cache: Remove blending disable on blits
Blending doesn't affect blits, but rasterizer discard does. Update the
comments accordingly.
2020-02-28 17:56:43 -03:00
ReinUsesLisp
887d5288ef gl_rasterizer: Don't disable blending on clears
Blending doesn't affect clears.
2020-02-28 17:56:43 -03:00
ReinUsesLisp
ac204754d4 dirty_flags: Deduplicate code between OpenGL and Vulkan 2020-02-28 17:56:43 -03:00
ReinUsesLisp
6669b359a3 vk_rasterizer: Pass Maxwell registers to dynamic updates 2020-02-28 17:56:43 -03:00
ReinUsesLisp
042256c6bb state_tracker: Remove type traits with named structures 2020-02-28 17:56:43 -03:00
ReinUsesLisp
6ac3eb4d87 vk_state_tracker: Implement dirty flags for stencil properties 2020-02-28 17:56:43 -03:00
ReinUsesLisp
f9df2c6bcd vk_state_tracker: Implement dirty flags for depth bounds 2020-02-28 17:56:43 -03:00
ReinUsesLisp
cd0e28c9ec vk_state_tracker: Implement dirty flags for blend constants 2020-02-28 17:56:43 -03:00
ReinUsesLisp
a33870996b vk_state_tracker: Implement dirty flags for depth bias 2020-02-28 17:56:43 -03:00
ReinUsesLisp
42f1874965 vk_state_tracker: Implement dirty flags for scissors 2020-02-28 17:56:43 -03:00
ReinUsesLisp
1bd95a314f vk_state_tracker: Initial implementation
Add support for render targets and viewports.
2020-02-28 17:56:43 -03:00
ReinUsesLisp
b1498d2c54 gl_rasterizer: Remove num vertex buffers magic number 2020-02-28 17:56:43 -03:00
ReinUsesLisp
62437943a7 gl_rasterizer: Only apply polygon offset clamp if enabled 2020-02-28 17:56:43 -03:00
ReinUsesLisp
2eeea90713 gl_state_tracker: Implement dirty flags for depth clamp enabling 2020-02-28 17:56:43 -03:00
ReinUsesLisp
3ce66776ec gl_rasterizer: Disable scissor 0 when scissor is not used on clear 2020-02-28 17:56:43 -03:00
ReinUsesLisp
35bb9239ca gl_rasterizer: Notify depth mask changes on clear 2020-02-28 17:56:43 -03:00
ReinUsesLisp
98c8948b23 gl_rasterizer: Minor sort changes to clearing 2020-02-28 17:56:42 -03:00
ReinUsesLisp
15cadc3948 maxwell_3d: Use two tables instead of three for dirty flags 2020-02-28 17:56:42 -03:00
ReinUsesLisp
a5bfc0d045 gl_state_tracker: Track state of index buffers 2020-02-28 17:56:42 -03:00
ReinUsesLisp
a42a6e1a2c gl_state_tracker: Implement dirty flags for clip control 2020-02-28 17:56:42 -03:00
ReinUsesLisp
4f8d152b18 gl_state_tracker: Implement dirty flags for point sizes 2020-02-28 17:56:42 -03:00
ReinUsesLisp
231601763c gl_state_tracker: Implement dirty flags for fragment color clamp 2020-02-28 17:56:42 -03:00
ReinUsesLisp
bf1a1d989f gl_state_tracker: Implement dirty flags for logic op 2020-02-28 17:56:42 -03:00
ReinUsesLisp
13afd0e5b0 gl_state_tracker: Implement dirty flags for sRGB 2020-02-28 17:56:42 -03:00
ReinUsesLisp
d8f5c45051 gl_state_tracker: Implement dirty flags for rasterize enable 2020-02-28 17:56:42 -03:00
ReinUsesLisp
b727d99441 gl_state_tracker: Implement dirty flags for multisample 2020-02-28 17:56:42 -03:00
ReinUsesLisp
3c22bd92d8 gl_state_tracker: Implement dirty flags for alpha testing 2020-02-28 17:56:42 -03:00
ReinUsesLisp
9e46953580 gl_state_tracker: Implement dirty flags for polygon offsets 2020-02-28 17:56:42 -03:00
ReinUsesLisp
46a1888e02 gl_state_tracker: Implement dirty flags for primitive restart 2020-02-28 17:56:42 -03:00
ReinUsesLisp
37536d7a49 gl_state_tracker: Implement dirty flags for stencil testing 2020-02-28 17:56:42 -03:00
ReinUsesLisp
40a2c57df5 gl_state_tracker: Implement depth dirty flags 2020-02-28 17:56:42 -03:00
ReinUsesLisp
b910a83a47 gl_state_tracker: Implement dirty flags for front face and culling 2020-02-28 17:56:42 -03:00
ReinUsesLisp
b01dd7d1c8 gl_state_tracker: Implement dirty flags for blending 2020-02-28 17:56:42 -03:00
ReinUsesLisp
f7ec078592 gl_state_tracker: Implement dirty flags for clip distances and shaders 2020-02-28 17:56:42 -03:00
ReinUsesLisp
758ad3f75d gl_state_tracker: Add dirty flags for buffers and divisors 2020-02-28 17:56:42 -03:00
ReinUsesLisp
9b08698a0c maxwell_3d: Change write dirty flags to a bitset 2020-02-28 17:56:42 -03:00
ReinUsesLisp
69ad6279e4 gl_state_tracker: Implement dirty flags for vertex formats 2020-02-28 17:56:42 -03:00
ReinUsesLisp
6530144ccb gl_state_tracker: Implement dirty flags for color masks 2020-02-28 17:56:42 -03:00
ReinUsesLisp
ba6f390448 gl_state_tracker: Implement dirty flags for scissors 2020-02-28 17:56:42 -03:00
ReinUsesLisp
7f52efdf61 gl_state_tracker: Implement dirty flags for viewports 2020-02-28 17:56:41 -03:00
ReinUsesLisp
dacf83ac02 renderer_opengl: Reintroduce dirty flags for render targets 2020-02-28 17:56:41 -03:00
ReinUsesLisp
9e74e6988b maxwell_3d: Flatten cull and front face registers 2020-02-28 17:56:41 -03:00
ReinUsesLisp
eed789d0d1 video_core: Reintroduce dirty flags infrastructure 2020-02-28 17:56:41 -03:00
ReinUsesLisp
b92dfcd7f2 gl_state: Remove completely 2020-02-28 17:56:35 -03:00
ReinUsesLisp
1c4bf9cbfa gl_state: Remove program tracking 2020-02-28 17:52:14 -03:00
ReinUsesLisp
5ccb07933a gl_state: Remove framebuffer tracking 2020-02-28 17:52:10 -03:00
ReinUsesLisp
17a7fa751b gl_state: Remove image tracking 2020-02-28 17:36:40 -03:00
ReinUsesLisp
9677db03da gl_state: Remove texture and sampler tracking 2020-02-28 17:35:58 -03:00
ReinUsesLisp
1bc0da3dea gl_state: Remove blend state tracking 2020-02-28 17:34:43 -03:00
ReinUsesLisp
7d9a5e9e30 gl_state: Remove stencil test tracking 2020-02-28 17:32:05 -03:00
ReinUsesLisp
07a954e67f gl_state: Remove clip control tracking 2020-02-28 17:31:57 -03:00
ReinUsesLisp
1eee891f6e gl_state: Remove clip distances tracking 2020-02-28 17:26:26 -03:00
ReinUsesLisp
e8125af8dd gl_state: Remove rasterizer disable tracking 2020-02-28 17:25:28 -03:00
ReinUsesLisp
d3e433a380 gl_state: Remove viewport and depth range tracking 2020-02-28 17:25:18 -03:00
ReinUsesLisp
7c16b3551b gl_state: Remove scissor test tracking 2020-02-28 17:00:23 -03:00
ReinUsesLisp
0914c70b7f gl_state: Remove color mask tracking 2020-02-28 16:59:17 -03:00
ReinUsesLisp
2392b548be gl_state: Remove clamp framebuffer color tracking
This commit doesn't reset it for screen draws because clamping doesn't
change anything there.
2020-02-28 16:58:30 -03:00
ReinUsesLisp
f92236976b gl_state: Remove multisample tracking 2020-02-28 16:57:47 -03:00
ReinUsesLisp
04d1134191 gl_state: Remove framebuffer sRGB tracking 2020-02-28 16:55:23 -03:00
ReinUsesLisp
d5ab0358b6 gl_state: Remove VAO cache and tracking 2020-02-28 16:54:37 -03:00
ReinUsesLisp
2a662fea36 gl_state: Remove depth clamp tracking 2020-02-28 16:53:35 -03:00
ReinUsesLisp
e1a16a52fa gl_state: Remove depth tracking 2020-02-28 16:52:46 -03:00
ReinUsesLisp
0f343d32c4 gl_state: Remove primitive restart tracking 2020-02-28 16:51:45 -03:00
ReinUsesLisp
42708c762e gl_state: Remove logic op tracker 2020-02-28 16:51:23 -03:00
ReinUsesLisp
915d73f3b8 gl_state: Remove blend color tracking 2020-02-28 16:50:58 -03:00
ReinUsesLisp
a0321b984f gl_state: Remove polygon offset tracking 2020-02-28 16:49:20 -03:00
ReinUsesLisp
f646321dd0 gl_state: Remove alpha test tracking 2020-02-28 16:48:57 -03:00
ReinUsesLisp
c8f5f54a44 gl_state: Remove cull mode tracking 2020-02-28 16:48:23 -03:00
ReinUsesLisp
925521da5f gl_state: Remove front face tracking 2020-02-28 16:47:59 -03:00
ReinUsesLisp
d2d5554296 gl_state: Remove point size tracking 2020-02-28 16:39:44 -03:00
ReinUsesLisp
b95f064b51 gl_rasterizer: Add oglEnablei helper 2020-02-28 16:39:44 -03:00
ReinUsesLisp
1698143a1d gl_rasterizer: Add OpenGL enable/disable helper 2020-02-28 16:39:44 -03:00
ReinUsesLisp
96ac3d518a gl_rasterizer: Remove dirty flags 2020-02-28 16:39:27 -03:00
ReinUsesLisp
e38ed26b98 common/math_util: Support float type rectangles 2020-02-28 16:22:11 -03:00
bunnei
c7db1ef565 Merge pull request #3470 from bunnei/fix-smash-srgb
renderer_opengl: Fix SRGB presentation frame tracking.
2020-02-28 01:22:00 -05:00
namkazy
1326e326f5 Merge branch 'patch-1' of https://github.com/namkazt/yuzu into patch-2 2020-02-28 13:14:49 +07:00
bunnei
5056d23d0d renderer_opengl: Fix SRGB presentation frame tracking.
- Fixes SRGB in Super Smash Bros. Ultimate.
2020-02-28 01:13:38 -05:00
Nguyen Dac Nam
6c0c2dfabc shader_decode: Fix LD, LDG when track constant buffer 2020-02-28 13:11:19 +07:00
Nguyen Dac Nam
1c385362f5 shader_decode: keep it search on all code
Fixes the LD and LDG opcodes in Pokemon Sword, which could not find the constant buffer. It is unclear whether this has any visual effect.
2020-02-28 11:59:05 +07:00
Morph
7ee6065178 Create an "Advanced" tab in the graphics configuration tab and add anisotropic filtering levels. 2020-02-27 21:34:00 -05:00
bunnei
969357af1a Merge pull request #3430 from bunnei/split-presenter
Port citra-emu/citra#4940: "Split Presentation thread from Render thread"
2020-02-27 19:51:55 -05:00
bunnei
ebbfe73557 renderer_opengl: Reduce swap chain size to 3. 2020-02-27 19:50:17 -05:00
Morph
e1efab1f51 AM/ICommonStateGetter: Stub SetLcdBacklighOffEnabled (#3454)
* Stub SetLcdBacklighOffEnabled

Used by Super Smash Bros. Ultimate
We require backlight services to be implemented to turn on/off the backlight.

* Address feedback
2020-02-27 17:49:23 +01:00
Nguyen Dac Nam
db2f547434 shader: FMUL switch to using LUT (#3441)
* shader: add FmulPostFactor LUT table

* shader: FMUL apply LUT

* Update src/video_core/engines/shader_bytecode.h

Co-Authored-By: Mat M. <mathew1800@gmail.com>

* nit: mistype

* clang-format & add missing import

* shader: remove post factor LUT.

* shader: move post factor LUT to function and fix incorrect order.

* clang-format

* shader: FMUL: add static to post factor LUT

* nit: typo

Co-authored-by: Mat M. <mathew1800@gmail.com>
2020-02-27 11:14:25 -05:00
bunnei
a17214baea renderer_opengl: Use more concise lock syntax. 2020-02-26 18:35:35 -05:00
bunnei
aef159354c renderer_opengl: Move Frame/FrameMailbox to OpenGL namespace. 2020-02-26 18:28:50 -05:00
ReinUsesLisp
0aaa69e4d7 vk_swapchain: Silence TOCTOU race condition
It's possible that the window is resized from the moment we ask for its
size to the moment a swapchain is created, causing validation issues.

To work around this Vulkan issue, request the capabilities again just
before creating the swapchain, making the race condition less likely.
2020-02-26 17:07:18 -03:00
Fernando Sahmkow
f3d4d4eaa8 ARM_Interface: Cache the JITs instead of deleting/recreating.
This bug was inherited from Citra, where it was fixed at some point.
This commit corrects it and ensures JITs are correctly
recycled.
2020-02-26 15:53:47 -04:00
bunnei
1f57f679a4 Merge pull request #3440 from namkazt/patch-6
shader: implement LOP3 fast replace for old function
2020-02-26 10:24:35 -05:00
bunnei
01a05b48b7 Merge pull request #3431 from CJBok/npad-fix
InputCommon: analog_from_button get direction implementation
2020-02-25 21:39:26 -05:00
bunnei
795893a9a5 renderer_opengl: Create gl_framebuffer_data if empty. 2020-02-25 21:23:02 -05:00
bunnei
c6f78a4a6d frontend: qt: bootmanager: Acquire a shared context in main emu thread. 2020-02-25 21:23:02 -05:00
bunnei
e25297536f frontend: qt: bootmanager: Vulkan: Restore support for VK backend. 2020-02-25 21:23:01 -05:00
bunnei
14877b8f35 frontend: qt: bootmanager: OpenGL: Implement separate presentation thread. 2020-02-25 21:23:01 -05:00
bunnei
b2a38cce4e frontend: qt: main: Various updates/refactoring for separate presentation thread. 2020-02-25 21:23:00 -05:00
bunnei
667f026c95 core: frontend: Refactor scope_acquire_window_context to scope_acquire_context. 2020-02-25 21:23:00 -05:00
bunnei
2e16c23784 frontend: sdl2: emu_window: Implement separate presentation thread. 2020-02-25 21:23:00 -05:00
bunnei
dc672ca4b3 renderer_opengl: Add texture mailbox support for presenter thread. 2020-02-25 21:22:59 -05:00
bunnei
add2c38b73 renderer_opengl: Add OGLRenderbuffer to resource/state management. 2020-02-25 21:22:58 -05:00
bunnei
0c82b00dfd core: frontend: emu_window: Add TextureMailbox class. 2020-02-25 21:22:57 -05:00
bunnei
571451bdfe core: settings: Add setting to enable vsync, which is on by default. 2020-02-25 20:57:02 -05:00
Mat M
45ac1c62c6 Merge pull request #3461 from ReinUsesLisp/r32i-rt
video_core/surface: Add R32_SINT render target format
2020-02-25 17:47:14 -05:00
Mat M
00e3eab9c1 Merge pull request #3460 from ReinUsesLisp/unused-format-getter
video_core/gpu: Remove unused functions
2020-02-25 17:46:07 -05:00
ReinUsesLisp
3c648e3e2d video_core/gpu: Remove unused functions 2020-02-25 16:53:47 -03:00
ReinUsesLisp
1e9213632a vk_shader_decompiler: Implement indexed textures
Implement accessing textures through an index. It uses the same
interface as OpenGL, the main difference is that Vulkan bindings are
forced to be arrayed (the binding index doesn't change for stacked
textures in SPIR-V).
2020-02-24 01:26:07 -03:00
ReinUsesLisp
1dda77d392 shader: Simplify indexed sampler usages 2020-02-24 01:26:07 -03:00
ReinUsesLisp
e2dd59e341 video_core: Implement more scaled attribute formats
While changing this, fix assert in vk_shader_decompiler. We now know
scaled formats are expected to be float in shader attributes.
2020-02-24 00:27:37 -03:00
bunnei
1989e1b9ac audio_core: interpolate: Improvements to fix audio crackling.
- Fixes audio crackling in Crash Team Racing Nitro-Fueled, Super Mario Odyssey, and others.
- Addresses followup issues from #3310.
2020-02-22 22:26:16 -05:00
Nguyen Dac Nam
10d8afb302 nit: add const to where it need. 2020-02-21 21:16:45 +07:00
Nguyen Dac Nam
1956a34ee5 shader: implement LOP3 fast replace for old function
ref: https://devtalk.nvidia.com/default/topic/1070081/cuda-programming-and-performance/reverse-lut-for-lop3-lut/
2020-02-21 19:08:07 +07:00
CJBok
23c4cc80e2 analog_from_button get direction implementation 2020-02-18 06:45:37 +01:00
136 changed files with 4027 additions and 2774 deletions
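
The largest group of commits above replaces gl_state with a state tracker based on dirty flags. As a rough illustration of that general pattern only (this is not the emulator's implementation; `DirtyIndex`, `StateTracker`, `Exchange` and `Backend` are hypothetical names invented for this sketch), a register write marks a flag, and the backend re-applies the corresponding state on the next draw only if the flag is set:

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical state indices for illustration; not the emulator's actual table.
enum DirtyIndex : std::size_t { Viewport, Scissor, Blend, NumDirty };

class StateTracker {
public:
    // Called when a register write may have changed the given state.
    void MarkDirty(DirtyIndex index) {
        assert(index < NumDirty);
        flags.set(index);
    }

    // Returns whether the state was dirty and clears the flag in one step.
    bool Exchange(DirtyIndex index) {
        assert(index < NumDirty);
        const bool was_dirty = flags.test(index);
        flags.reset(index);
        return was_dirty;
    }

private:
    std::bitset<NumDirty> flags;
};

// Stand-in for a rendering backend; it only counts how often the state
// would have been re-applied through the graphics API.
struct Backend {
    StateTracker tracker;
    int viewport_uploads = 0;

    void Draw() {
        if (tracker.Exchange(Viewport)) {
            ++viewport_uploads; // a real backend would issue the API call here
        }
    }
};
```

With this shape, two consecutive draws after a single viewport write cause only one upload, which is the saving the pattern is meant to provide.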

@@ -1,6 +1,6 @@
yuzu emulator
=============
[![Travis CI Build Status](https://travis-ci.org/yuzu-emu/yuzu.svg?branch=master)](https://travis-ci.org/yuzu-emu/yuzu)
[![Travis CI Build Status](https://travis-ci.com/yuzu-emu/yuzu.svg?branch=master)](https://travis-ci.com/yuzu-emu/yuzu)
[![Azure Mainline CI Build Status](https://dev.azure.com/yuzu-emu/yuzu/_apis/build/status/yuzu%20mainline?branchName=master)](https://dev.azure.com/yuzu-emu/yuzu/)
yuzu is an experimental open-source emulator for the Nintendo Switch from the creators of [Citra](https://citra-emu.org/).
@@ -21,7 +21,7 @@ For development discussion, please join us on [Discord](https://discord.gg/XQV6d
Most of the development happens on GitHub. It's also where [our central repository](https://github.com/yuzu-emu/yuzu) is hosted.
If you want to contribute please take a look at the [Contributor's Guide](CONTRIBUTING.md) and [Developer Information](https://github.com/yuzu-emu/yuzu/wiki/Developer-Information). You should as well contact any of the developers on Discord in order to know about the current state of the emulator.
If you want to contribute please take a look at the [Contributor's Guide](CONTRIBUTING.md) and [Developer Information](https://github.com/yuzu-emu/yuzu/wiki/Developer-Information). You should also contact any of the developers on Discord in order to know about the current state of the emulator.
### Building

@@ -8,13 +8,14 @@
#include <climits>
#include <cmath>
#include <vector>
#include "audio_core/algorithm/interpolate.h"
#include "common/common_types.h"
#include "common/logging/log.h"
namespace AudioCore {
constexpr std::array<s16, 512> curve_lut0 = {
constexpr std::array<s16, 512> curve_lut0{
6600, 19426, 6722, 3, 6479, 19424, 6845, 9, 6359, 19419, 6968, 15, 6239,
19412, 7093, 22, 6121, 19403, 7219, 28, 6004, 19391, 7345, 34, 5888, 19377,
7472, 41, 5773, 19361, 7600, 48, 5659, 19342, 7728, 55, 5546, 19321, 7857,
@@ -56,7 +57,7 @@ constexpr std::array<s16, 512> curve_lut0 = {
19403, 6121, 22, 7093, 19412, 6239, 15, 6968, 19419, 6359, 9, 6845, 19424,
6479, 3, 6722, 19426, 6600};
constexpr std::array<s16, 512> curve_lut1 = {
constexpr std::array<s16, 512> curve_lut1{
-68, 32639, 69, -5, -200, 32630, 212, -15, -328, 32613, 359, -26, -450,
32586, 512, -36, -568, 32551, 669, -47, -680, 32507, 832, -58, -788, 32454,
1000, -69, -891, 32393, 1174, -80, -990, 32323, 1352, -92, -1084, 32244, 1536,
@@ -98,7 +99,7 @@ constexpr std::array<s16, 512> curve_lut1 = {
32551, -568, -36, 512, 32586, -450, -26, 359, 32613, -328, -15, 212, 32630,
-200, -5, 69, 32639, -68};
constexpr std::array<s16, 512> curve_lut2 = {
constexpr std::array<s16, 512> curve_lut2{
3195, 26287, 3329, -32, 3064, 26281, 3467, -34, 2936, 26270, 3608, -38, 2811,
26253, 3751, -42, 2688, 26230, 3897, -46, 2568, 26202, 4046, -50, 2451, 26169,
4199, -54, 2338, 26130, 4354, -58, 2227, 26085, 4512, -63, 2120, 26035, 4673,
@@ -146,10 +147,10 @@ std::vector<s16> Interpolate(InterpolationState& state, std::vector<s16> input,
if (ratio <= 0) {
LOG_CRITICAL(Audio, "Nonsensical interpolation ratio {}", ratio);
ratio = 1.0;
return input;
}
const int step = static_cast<int>(ratio * 0x8000);
const s32 step{static_cast<s32>(ratio * 0x8000)};
const std::array<s16, 512>& lut = [step] {
if (step > 0xaaaa) {
return curve_lut0;
@@ -160,28 +161,37 @@ std::vector<s16> Interpolate(InterpolationState& state, std::vector<s16> input,
return curve_lut2;
}();
std::vector<s16> output(static_cast<std::size_t>(input.size() / ratio));
int in_offset = 0;
for (std::size_t out_offset = 0; out_offset < output.size(); out_offset += 2) {
const int lut_index = (state.fraction >> 8) * 4;
const std::size_t num_frames{input.size() / 2};
const int l = input[(in_offset + 0) * 2 + 0] * lut[lut_index + 0] +
input[(in_offset + 1) * 2 + 0] * lut[lut_index + 1] +
input[(in_offset + 2) * 2 + 0] * lut[lut_index + 2] +
input[(in_offset + 3) * 2 + 0] * lut[lut_index + 3];
std::vector<s16> output;
output.reserve(static_cast<std::size_t>(input.size() / ratio + InterpolationState::taps));
const int r = input[(in_offset + 0) * 2 + 1] * lut[lut_index + 0] +
input[(in_offset + 1) * 2 + 1] * lut[lut_index + 1] +
input[(in_offset + 2) * 2 + 1] * lut[lut_index + 2] +
input[(in_offset + 3) * 2 + 1] * lut[lut_index + 3];
for (std::size_t frame{}; frame < num_frames; ++frame) {
const std::size_t lut_index{(state.fraction >> 8) * InterpolationState::taps};
const int new_offset = state.fraction + step;
std::rotate(state.history.begin(), state.history.end() - 1, state.history.end());
state.history[0][0] = input[frame * 2 + 0];
state.history[0][1] = input[frame * 2 + 1];
in_offset += new_offset >> 15;
state.fraction = new_offset & 0x7fff;
while (state.position <= 1.0) {
const s32 left{state.history[0][0] * lut[lut_index + 0] +
state.history[1][0] * lut[lut_index + 1] +
state.history[2][0] * lut[lut_index + 2] +
state.history[3][0] * lut[lut_index + 3]};
const s32 right{state.history[0][1] * lut[lut_index + 0] +
state.history[1][1] * lut[lut_index + 1] +
state.history[2][1] * lut[lut_index + 2] +
state.history[3][1] * lut[lut_index + 3]};
const s32 new_offset{state.fraction + step};
output[out_offset + 0] = static_cast<s16>(std::clamp(l >> 15, SHRT_MIN, SHRT_MAX));
output[out_offset + 1] = static_cast<s16>(std::clamp(r >> 15, SHRT_MIN, SHRT_MAX));
state.fraction = new_offset & 0x7fff;
output.emplace_back(static_cast<s16>(std::clamp(left >> 15, SHRT_MIN, SHRT_MAX)));
output.emplace_back(static_cast<s16>(std::clamp(right >> 15, SHRT_MIN, SHRT_MAX)));
state.position += ratio;
}
state.position -= 1.0;
}
return output;
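
The fixed-point bookkeeping used by the loop above can be looked at in isolation. The following is a minimal sketch rather than the emulator's code: `ToStep`, `StepFraction` and `LutIndex` are hypothetical helper names, and the sketch assumes only what the diff shows, namely that the ratio is scaled by 0x8000, the fraction is masked to 15 bits, and its top 7 bits select a row of 4 coefficients in a 512-entry table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kTaps = 4; // mirrors InterpolationState::taps in the diff

// Converts a resampling ratio to a fixed-point step with 15 fractional bits.
inline std::int32_t ToStep(double ratio) {
    assert(ratio > 0.0);
    return static_cast<std::int32_t>(ratio * 0x8000);
}

// Advances the fraction by one step and wraps it back into 15 bits.
inline std::int32_t StepFraction(std::int32_t fraction, std::int32_t step) {
    return (fraction + step) & 0x7fff;
}

// The top 7 bits of the fraction pick one of 128 rows of kTaps coefficients,
// so the largest index returned is 127 * 4 = 508 and row reads stay below 512.
inline std::size_t LutIndex(std::int32_t fraction) {
    return static_cast<std::size_t>(fraction >> 8) * kTaps;
}
```

For a ratio of 0.5 the step is 0x4000, so the fraction alternates between 0x4000 and 0 and the table index alternates between 256 and 0.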

@@ -6,12 +6,17 @@
#include <array>
#include <vector>
#include "common/common_types.h"
namespace AudioCore {
struct InterpolationState {
int fraction = 0;
static constexpr std::size_t taps{4};
static constexpr std::size_t history_size{taps * 2 - 1};
std::array<std::array<s16, 2>, history_size> history{};
double position{};
s32 fraction{};
};
/// Interpolates input signal to produce output signal.

@@ -291,8 +291,9 @@ void AudioRenderer::VoiceState::RefreshBuffer(Memory::Memory& memory) {
samples[index * 2 + 1] = new_samples[index];
}
break;
case 2: {
// 2 channel is played as is
case 2:
case 6: {
// 2 and 6 channel is played as is
samples = std::move(new_samples);
break;
}

@@ -8,6 +8,7 @@
#include "audio_core/cubeb_sink.h"
#include "audio_core/stream.h"
#include "audio_core/time_stretch.h"
#include "common/assert.h"
#include "common/logging/log.h"
#include "common/ring_buffer.h"
#include "core/settings.h"
@@ -65,12 +66,25 @@ public:
void EnqueueSamples(u32 source_num_channels, const std::vector<s16>& samples) override {
if (source_num_channels > num_channels) {
// Downsample 6 channels to 2
ASSERT_MSG(source_num_channels == 6, "Channel count must be 6");
std::vector<s16> buf;
buf.reserve(samples.size() * num_channels / source_num_channels);
for (std::size_t i = 0; i < samples.size(); i += source_num_channels) {
for (std::size_t ch = 0; ch < num_channels; ch++) {
buf.push_back(samples[i + ch]);
}
// Downmixing implementation taken from the ATSC standard
const s16 left{samples[i + 0]};
const s16 right{samples[i + 1]};
const s16 center{samples[i + 2]};
const s16 surround_left{samples[i + 4]};
const s16 surround_right{samples[i + 5]};
// Not used in the ATSC reference implementation
[[maybe_unused]] const s16 low_frequency_effects { samples[i + 3] };
constexpr s32 clev{707}; // center mixing level coefficient
constexpr s32 slev{707}; // surround mixing level coefficient
buf.push_back(left + (clev * center / 1000) + (slev * surround_left / 1000));
buf.push_back(right + (clev * center / 1000) + (slev * surround_right / 1000));
}
queue.Push(buf);
return;
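
The downmix arithmetic in this hunk can be tried out as a standalone function. This is a sketch under stated assumptions, not the sink's actual code: `DownmixToStereo` is a hypothetical name, the channel order (left, right, center, LFE, surround left, surround right) and the 0.707 coefficients are taken from the diff, and the final clamp to the 16-bit range is an addition of this sketch that the diff does not perform.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Downmixes interleaved 6-channel samples to interleaved stereo.
std::vector<std::int16_t> DownmixToStereo(const std::vector<std::int16_t>& samples) {
    constexpr std::size_t kSourceChannels = 6;
    assert(samples.size() % kSourceChannels == 0);

    constexpr std::int32_t clev = 707; // center mixing level, in thousandths
    constexpr std::int32_t slev = 707; // surround mixing level, in thousandths

    // Assumption of this sketch: saturate instead of letting the sum wrap.
    const auto clamp16 = [](std::int32_t value) {
        return static_cast<std::int16_t>(
            std::clamp<std::int32_t>(value, INT16_MIN, INT16_MAX));
    };

    std::vector<std::int16_t> out;
    out.reserve(samples.size() / kSourceChannels * 2);
    for (std::size_t i = 0; i < samples.size(); i += kSourceChannels) {
        const std::int32_t left = samples[i + 0];
        const std::int32_t right = samples[i + 1];
        const std::int32_t center = samples[i + 2];
        // samples[i + 3] is the LFE channel, which is left out of the mix.
        const std::int32_t surround_left = samples[i + 4];
        const std::int32_t surround_right = samples[i + 5];

        out.push_back(clamp16(left + clev * center / 1000 + slev * surround_left / 1000));
        out.push_back(clamp16(right + clev * center / 1000 + slev * surround_right / 1000));
    }
    return out;
}
```

Working in 32-bit intermediates keeps the products such as 707 * 32767 well within range before the division by 1000.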

@@ -24,17 +24,29 @@ struct Rectangle {
: left(left), top(top), right(right), bottom(bottom) {}
T GetWidth() const {
return std::abs(static_cast<std::make_signed_t<T>>(right - left));
if constexpr (std::is_floating_point_v<T>) {
return std::abs(right - left);
} else {
return std::abs(static_cast<std::make_signed_t<T>>(right - left));
}
}
T GetHeight() const {
return std::abs(static_cast<std::make_signed_t<T>>(bottom - top));
if constexpr (std::is_floating_point_v<T>) {
return std::abs(bottom - top);
} else {
return std::abs(static_cast<std::make_signed_t<T>>(bottom - top));
}
}
Rectangle<T> TranslateX(const T x) const {
return Rectangle{left + x, top, right + x, bottom};
}
Rectangle<T> TranslateY(const T y) const {
return Rectangle{left, top + y, right, bottom + y};
}
Rectangle<T> Scale(const float s) const {
return Rectangle{left, top, static_cast<T>(left + GetWidth() * s),
static_cast<T>(top + GetHeight() * s)};
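
The reason for the `if constexpr` split in `GetWidth` and `GetHeight` is that `std::make_signed_t` is only valid for integral types, so the old one-liner could not be instantiated for `float`. The sketch below isolates that idea in a free function; `AbsDifference` is a hypothetical name used for illustration and is not part of the diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <type_traits>

// Returns |b - a| for both integer and floating-point T.
template <typename T>
T AbsDifference(T a, T b) {
    static_assert(std::is_arithmetic_v<T>, "T must be an arithmetic type");
    if constexpr (std::is_floating_point_v<T>) {
        // Floating-point subtraction is already signed.
        return std::abs(b - a);
    } else {
        // For unsigned T the subtraction wraps around, so reinterpret the
        // result as signed before taking the absolute value.
        return static_cast<T>(std::abs(static_cast<std::make_signed_t<T>>(b - a)));
    }
}
```

Because the discarded branch of `if constexpr` is never instantiated, `std::make_signed_t<float>` is not evaluated when `T` is a floating-point type.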

@@ -131,8 +131,8 @@ add_library(core STATIC
frontend/framebuffer_layout.cpp
frontend/framebuffer_layout.h
frontend/input.h
frontend/scope_acquire_window_context.cpp
frontend/scope_acquire_window_context.h
frontend/scope_acquire_context.cpp
frontend/scope_acquire_context.h
gdbstub/gdbstub.cpp
gdbstub/gdbstub.h
hardware_interrupt_manager.cpp
@@ -595,8 +595,12 @@ endif()
if (ARCHITECTURE_x86_64)
target_sources(core PRIVATE
arm/dynarmic/arm_dynarmic.cpp
arm/dynarmic/arm_dynarmic.h
arm/dynarmic/arm_dynarmic_32.cpp
arm/dynarmic/arm_dynarmic_32.h
arm/dynarmic/arm_dynarmic_64.cpp
arm/dynarmic/arm_dynarmic_64.h
arm/dynarmic/arm_dynarmic_cp15.cpp
arm/dynarmic/arm_dynarmic_cp15.h
)
target_link_libraries(core PRIVATE dynarmic)
endif()

@@ -25,7 +25,20 @@ public:
explicit ARM_Interface(System& system_) : system{system_} {}
virtual ~ARM_Interface() = default;
struct ThreadContext {
struct ThreadContext32 {
std::array<u32, 16> cpu_registers;
u32 cpsr;
std::array<u8, 4> padding;
std::array<u64, 32> fprs;
u32 fpscr;
u32 fpexc;
u32 tpidr;
};
// Internally within the kernel, it expects the AArch32 version of the
// thread context to be 344 bytes in size.
static_assert(sizeof(ThreadContext32) == 0x158);
struct ThreadContext64 {
std::array<u64, 31> cpu_registers;
u64 sp;
u64 pc;
@@ -38,7 +51,7 @@ public:
};
// Internally within the kernel, it expects the AArch64 version of the
// thread context to be 800 bytes in size.
static_assert(sizeof(ThreadContext) == 0x320);
static_assert(sizeof(ThreadContext64) == 0x320);
/// Runs the CPU until an event happens
virtual void Run() = 0;
@@ -130,17 +143,10 @@ public:
*/
virtual void SetTPIDR_EL0(u64 value) = 0;
/**
* Saves the current CPU context
* @param ctx Thread context to save
*/
virtual void SaveContext(ThreadContext& ctx) = 0;
/**
* Loads a CPU context
* @param ctx Thread context to load
*/
virtual void LoadContext(const ThreadContext& ctx) = 0;
virtual void SaveContext(ThreadContext32& ctx) = 0;
virtual void SaveContext(ThreadContext64& ctx) = 0;
virtual void LoadContext(const ThreadContext32& ctx) = 0;
virtual void LoadContext(const ThreadContext64& ctx) = 0;
/// Clears the exclusive monitor's state.
virtual void ClearExclusiveState() = 0;
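
The 344-byte (0x158) figure asserted for `ThreadContext32` follows from the field sizes plus alignment padding. The sketch below restates the layout with standard fixed-width types so the arithmetic can be checked by a compiler; `ThreadContext32Sketch` is a hypothetical name, and the offsets assume a target where `std::uint64_t` is 8-byte aligned, as on x86-64.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct ThreadContext32Sketch {
    std::array<std::uint32_t, 16> cpu_registers; // offset 0,   64 bytes
    std::uint32_t cpsr;                          // offset 64,  4 bytes
    std::array<std::uint8_t, 4> padding;         // offset 68,  4 bytes
    std::array<std::uint64_t, 32> fprs;          // offset 72,  256 bytes
    std::uint32_t fpscr;                         // offset 328, 4 bytes
    std::uint32_t fpexc;                         // offset 332, 4 bytes
    std::uint32_t tpidr;                         // offset 336, 4 bytes
    // 4 bytes of tail padding round 340 up to the next multiple of 8.
};

static_assert(offsetof(ThreadContext32Sketch, fprs) == 72);
static_assert(offsetof(ThreadContext32Sketch, tpidr) == 336);
static_assert(sizeof(ThreadContext32Sketch) == 0x158);
```

The explicit `padding` member keeps `fprs` on an 8-byte boundary without relying on the compiler to insert the gap.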

@@ -0,0 +1,208 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <cinttypes>
#include <memory>
#include <dynarmic/A32/a32.h>
#include <dynarmic/A32/config.h>
#include <dynarmic/A32/context.h>
#include "common/microprofile.h"
#include "core/arm/dynarmic/arm_dynarmic_32.h"
#include "core/arm/dynarmic/arm_dynarmic_64.h"
#include "core/arm/dynarmic/arm_dynarmic_cp15.h"
#include "core/core.h"
#include "core/core_manager.h"
#include "core/core_timing.h"
#include "core/hle/kernel/svc.h"
#include "core/memory.h"
namespace Core {
class DynarmicCallbacks32 : public Dynarmic::A32::UserCallbacks {
public:
explicit DynarmicCallbacks32(ARM_Dynarmic_32& parent) : parent(parent) {}
u8 MemoryRead8(u32 vaddr) override {
return parent.system.Memory().Read8(vaddr);
}
u16 MemoryRead16(u32 vaddr) override {
return parent.system.Memory().Read16(vaddr);
}
u32 MemoryRead32(u32 vaddr) override {
return parent.system.Memory().Read32(vaddr);
}
u64 MemoryRead64(u32 vaddr) override {
return parent.system.Memory().Read64(vaddr);
}
void MemoryWrite8(u32 vaddr, u8 value) override {
parent.system.Memory().Write8(vaddr, value);
}
void MemoryWrite16(u32 vaddr, u16 value) override {
parent.system.Memory().Write16(vaddr, value);
}
void MemoryWrite32(u32 vaddr, u32 value) override {
parent.system.Memory().Write32(vaddr, value);
}
void MemoryWrite64(u32 vaddr, u64 value) override {
parent.system.Memory().Write64(vaddr, value);
}
void InterpreterFallback(u32 pc, std::size_t num_instructions) override {
UNIMPLEMENTED();
}
void ExceptionRaised(u32 pc, Dynarmic::A32::Exception exception) override {
switch (exception) {
case Dynarmic::A32::Exception::UndefinedInstruction:
case Dynarmic::A32::Exception::UnpredictableInstruction:
break;
case Dynarmic::A32::Exception::Breakpoint:
break;
}
LOG_CRITICAL(HW_GPU, "ExceptionRaised(exception = {}, pc = {:08X}, code = {:08X})",
static_cast<std::size_t>(exception), pc, MemoryReadCode(pc));
UNIMPLEMENTED();
}
void CallSVC(u32 swi) override {
Kernel::CallSVC(parent.system, swi);
}
void AddTicks(u64 ticks) override {
// Divide the number of ticks by the amount of CPU cores. TODO(Subv): This yields only a
// rough approximation of the amount of executed ticks in the system, it may be thrown off
// if not all cores are doing a similar amount of work. Instead of doing this, we should
// devise a way so that timing is consistent across all cores without increasing the ticks 4
// times.
u64 amortized_ticks = (ticks - num_interpreted_instructions) / Core::NUM_CPU_CORES;
// Always execute at least one tick.
amortized_ticks = std::max<u64>(amortized_ticks, 1);
parent.system.CoreTiming().AddTicks(amortized_ticks);
num_interpreted_instructions = 0;
}
u64 GetTicksRemaining() override {
return std::max(parent.system.CoreTiming().GetDowncount(), {});
}
ARM_Dynarmic_32& parent;
std::size_t num_interpreted_instructions{};
u64 tpidrro_el0{};
u64 tpidr_el0{};
};
std::shared_ptr<Dynarmic::A32::Jit> ARM_Dynarmic_32::MakeJit(Common::PageTable& page_table,
std::size_t address_space_bits) const {
Dynarmic::A32::UserConfig config;
config.callbacks = cb.get();
// TODO(bunnei): Implement page table for 32-bit
// config.page_table = &page_table.pointers;
config.coprocessors[15] = std::make_shared<DynarmicCP15>((u32*)&CP15_regs[0]);
config.define_unpredictable_behaviour = true;
return std::make_unique<Dynarmic::A32::Jit>(config);
}
MICROPROFILE_DEFINE(ARM_Jit_Dynarmic_32, "ARM JIT", "Dynarmic", MP_RGB(255, 64, 64));
void ARM_Dynarmic_32::Run() {
MICROPROFILE_SCOPE(ARM_Jit_Dynarmic_32);
jit->Run();
}
void ARM_Dynarmic_32::Step() {
cb->InterpreterFallback(jit->Regs()[15], 1);
}
ARM_Dynarmic_32::ARM_Dynarmic_32(System& system, ExclusiveMonitor& exclusive_monitor,
std::size_t core_index)
: ARM_Interface{system},
cb(std::make_unique<DynarmicCallbacks32>(*this)), core_index{core_index},
exclusive_monitor{dynamic_cast<DynarmicExclusiveMonitor&>(exclusive_monitor)} {}
ARM_Dynarmic_32::~ARM_Dynarmic_32() = default;
void ARM_Dynarmic_32::SetPC(u64 pc) {
jit->Regs()[15] = static_cast<u32>(pc);
}
u64 ARM_Dynarmic_32::GetPC() const {
return jit->Regs()[15];
}
u64 ARM_Dynarmic_32::GetReg(int index) const {
return jit->Regs()[index];
}
void ARM_Dynarmic_32::SetReg(int index, u64 value) {
jit->Regs()[index] = static_cast<u32>(value);
}
u128 ARM_Dynarmic_32::GetVectorReg(int index) const {
return {};
}
void ARM_Dynarmic_32::SetVectorReg(int index, u128 value) {}
u32 ARM_Dynarmic_32::GetPSTATE() const {
return jit->Cpsr();
}
void ARM_Dynarmic_32::SetPSTATE(u32 cpsr) {
jit->SetCpsr(cpsr);
}
u64 ARM_Dynarmic_32::GetTlsAddress() const {
return CP15_regs[static_cast<std::size_t>(CP15Register::CP15_THREAD_URO)];
}
void ARM_Dynarmic_32::SetTlsAddress(VAddr address) {
CP15_regs[static_cast<std::size_t>(CP15Register::CP15_THREAD_URO)] = static_cast<u32>(address);
}
u64 ARM_Dynarmic_32::GetTPIDR_EL0() const {
return cb->tpidr_el0;
}
void ARM_Dynarmic_32::SetTPIDR_EL0(u64 value) {
cb->tpidr_el0 = value;
}
void ARM_Dynarmic_32::SaveContext(ThreadContext32& ctx) {
Dynarmic::A32::Context context;
jit->SaveContext(context);
ctx.cpu_registers = context.Regs();
ctx.cpsr = context.Cpsr();
}
void ARM_Dynarmic_32::LoadContext(const ThreadContext32& ctx) {
Dynarmic::A32::Context context;
context.Regs() = ctx.cpu_registers;
context.SetCpsr(ctx.cpsr);
jit->LoadContext(context);
}
void ARM_Dynarmic_32::PrepareReschedule() {
jit->HaltExecution();
}
void ARM_Dynarmic_32::ClearInstructionCache() {
jit->ClearCache();
}
void ARM_Dynarmic_32::ClearExclusiveState() {}
void ARM_Dynarmic_32::PageTableChanged(Common::PageTable& page_table,
std::size_t new_address_space_size_in_bits) {
auto key = std::make_pair(&page_table, new_address_space_size_in_bits);
auto iter = jit_cache.find(key);
if (iter != jit_cache.end()) {
jit = iter->second;
return;
}
jit = MakeJit(page_table, new_address_space_size_in_bits);
jit_cache.emplace(key, jit);
}
} // namespace Core
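
The `PageTableChanged` logic above reuses a JIT instance per (page table, address space width) pair instead of rebuilding it on every switch. A minimal standalone sketch of that caching pattern follows. `PageTable`, `Jit`, `PairHash`, and `JitSelector` here are hypothetical stand-ins for illustration, not the yuzu or Dynarmic types, and the hash combiner is an assumption because `Common::PairHash` is not shown in this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins; these are not the yuzu or Dynarmic types.
struct PageTable {};
struct Jit {
    std::size_t address_space_bits;
};

// Hypothetical pair hash; the diff uses Common::PairHash, whose definition is not shown.
struct PairHash {
    std::size_t operator()(const std::pair<PageTable*, std::size_t>& key) const noexcept {
        const std::size_t h1 = std::hash<PageTable*>{}(key.first);
        const std::size_t h2 = std::hash<std::size_t>{}(key.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

class JitSelector {
public:
    // Return the cached JIT for this key, or build one and remember it.
    std::shared_ptr<Jit> Select(PageTable& page_table, std::size_t bits) {
        const auto key = std::make_pair(&page_table, bits);
        const auto iter = cache.find(key);
        if (iter != cache.end()) {
            return iter->second;
        }
        auto jit = std::make_shared<Jit>(Jit{bits});
        cache.emplace(key, jit);
        ++builds;
        return jit;
    }

    std::size_t builds = 0; // Number of JIT instances actually constructed.

private:
    std::unordered_map<std::pair<PageTable*, std::size_t>, std::shared_ptr<Jit>, PairHash> cache;
};
```

Calling `Select` twice with the same page table and width returns the same instance. The shared ownership is what lets the cache and the currently active `jit` member hold the same object, which is presumably why the diff moves `MakeJit` from `std::unique_ptr` to `std::shared_ptr`.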

@@ -0,0 +1,77 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <memory>
#include <unordered_map>
#include <dynarmic/A32/a32.h>
#include <dynarmic/A64/a64.h>
#include <dynarmic/A64/exclusive_monitor.h>
#include "common/common_types.h"
#include "common/hash.h"
#include "core/arm/arm_interface.h"
#include "core/arm/exclusive_monitor.h"
namespace Memory {
class Memory;
}
namespace Core {
class DynarmicCallbacks32;
class DynarmicExclusiveMonitor;
class System;
class ARM_Dynarmic_32 final : public ARM_Interface {
public:
ARM_Dynarmic_32(System& system, ExclusiveMonitor& exclusive_monitor, std::size_t core_index);
~ARM_Dynarmic_32() override;
void SetPC(u64 pc) override;
u64 GetPC() const override;
u64 GetReg(int index) const override;
void SetReg(int index, u64 value) override;
u128 GetVectorReg(int index) const override;
void SetVectorReg(int index, u128 value) override;
u32 GetPSTATE() const override;
void SetPSTATE(u32 pstate) override;
void Run() override;
void Step() override;
VAddr GetTlsAddress() const override;
void SetTlsAddress(VAddr address) override;
void SetTPIDR_EL0(u64 value) override;
u64 GetTPIDR_EL0() const override;
void SaveContext(ThreadContext32& ctx) override;
void SaveContext(ThreadContext64& ctx) override {}
void LoadContext(const ThreadContext32& ctx) override;
void LoadContext(const ThreadContext64& ctx) override {}
void PrepareReschedule() override;
void ClearExclusiveState() override;
void ClearInstructionCache() override;
void PageTableChanged(Common::PageTable& new_page_table,
std::size_t new_address_space_size_in_bits) override;
private:
std::shared_ptr<Dynarmic::A32::Jit> MakeJit(Common::PageTable& page_table,
std::size_t address_space_bits) const;
using JitCacheKey = std::pair<Common::PageTable*, std::size_t>;
using JitCacheType =
std::unordered_map<JitCacheKey, std::shared_ptr<Dynarmic::A32::Jit>, Common::PairHash>;
friend class DynarmicCallbacks32;
std::unique_ptr<DynarmicCallbacks32> cb;
JitCacheType jit_cache;
std::shared_ptr<Dynarmic::A32::Jit> jit;
std::size_t core_index;
DynarmicExclusiveMonitor& exclusive_monitor;
std::array<u32, 84> CP15_regs{};
};
} // namespace Core

@@ -8,7 +8,7 @@
#include <dynarmic/A64/config.h>
#include "common/logging/log.h"
#include "common/microprofile.h"
#include "core/arm/dynarmic/arm_dynarmic.h"
#include "core/arm/dynarmic/arm_dynarmic_64.h"
#include "core/core.h"
#include "core/core_manager.h"
#include "core/core_timing.h"
@@ -25,9 +25,9 @@ namespace Core {
using Vector = Dynarmic::A64::Vector;
class ARM_Dynarmic_Callbacks : public Dynarmic::A64::UserCallbacks {
class DynarmicCallbacks64 : public Dynarmic::A64::UserCallbacks {
public:
explicit ARM_Dynarmic_Callbacks(ARM_Dynarmic& parent) : parent(parent) {}
explicit DynarmicCallbacks64(ARM_Dynarmic_64& parent) : parent(parent) {}
u8 MemoryRead8(u64 vaddr) override {
return parent.system.Memory().Read8(vaddr);
@@ -68,7 +68,7 @@ public:
LOG_INFO(Core_ARM, "Unicorn fallback @ 0x{:X} for {} instructions (instr = {:08X})", pc,
num_instructions, MemoryReadCode(pc));
ARM_Interface::ThreadContext ctx;
ARM_Interface::ThreadContext64 ctx;
parent.SaveContext(ctx);
parent.inner_unicorn.LoadContext(ctx);
parent.inner_unicorn.ExecuteInstructions(num_instructions);
@@ -90,7 +90,7 @@ public:
parent.jit->HaltExecution();
parent.SetPC(pc);
Kernel::Thread* const thread = parent.system.CurrentScheduler().GetCurrentThread();
parent.SaveContext(thread->GetContext());
parent.SaveContext(thread->GetContext64());
GDBStub::Break();
GDBStub::SendTrap(thread, 5);
return;
@@ -126,14 +126,14 @@ public:
return Timing::CpuCyclesToClockCycles(parent.system.CoreTiming().GetTicks());
}
ARM_Dynarmic& parent;
ARM_Dynarmic_64& parent;
std::size_t num_interpreted_instructions = 0;
u64 tpidrro_el0 = 0;
u64 tpidr_el0 = 0;
};
std::unique_ptr<Dynarmic::A64::Jit> ARM_Dynarmic::MakeJit(Common::PageTable& page_table,
std::size_t address_space_bits) const {
std::shared_ptr<Dynarmic::A64::Jit> ARM_Dynarmic_64::MakeJit(Common::PageTable& page_table,
std::size_t address_space_bits) const {
Dynarmic::A64::UserConfig config;
// Callbacks
@@ -159,79 +159,79 @@ std::unique_ptr<Dynarmic::A64::Jit> ARM_Dynarmic::MakeJit(Common::PageTable& pag
// Unpredictable instructions
config.define_unpredictable_behaviour = true;
return std::make_unique<Dynarmic::A64::Jit>(config);
return std::make_shared<Dynarmic::A64::Jit>(config);
}
MICROPROFILE_DEFINE(ARM_Jit_Dynarmic, "ARM JIT", "Dynarmic", MP_RGB(255, 64, 64));
MICROPROFILE_DEFINE(ARM_Jit_Dynarmic_64, "ARM JIT", "Dynarmic", MP_RGB(255, 64, 64));
void ARM_Dynarmic::Run() {
MICROPROFILE_SCOPE(ARM_Jit_Dynarmic);
void ARM_Dynarmic_64::Run() {
MICROPROFILE_SCOPE(ARM_Jit_Dynarmic_64);
jit->Run();
}
void ARM_Dynarmic::Step() {
void ARM_Dynarmic_64::Step() {
cb->InterpreterFallback(jit->GetPC(), 1);
}
ARM_Dynarmic::ARM_Dynarmic(System& system, ExclusiveMonitor& exclusive_monitor,
std::size_t core_index)
ARM_Dynarmic_64::ARM_Dynarmic_64(System& system, ExclusiveMonitor& exclusive_monitor,
std::size_t core_index)
: ARM_Interface{system},
cb(std::make_unique<ARM_Dynarmic_Callbacks>(*this)), inner_unicorn{system},
cb(std::make_unique<DynarmicCallbacks64>(*this)), inner_unicorn{system},
core_index{core_index}, exclusive_monitor{
dynamic_cast<DynarmicExclusiveMonitor&>(exclusive_monitor)} {}
ARM_Dynarmic::~ARM_Dynarmic() = default;
ARM_Dynarmic_64::~ARM_Dynarmic_64() = default;
void ARM_Dynarmic::SetPC(u64 pc) {
void ARM_Dynarmic_64::SetPC(u64 pc) {
jit->SetPC(pc);
}
u64 ARM_Dynarmic::GetPC() const {
u64 ARM_Dynarmic_64::GetPC() const {
return jit->GetPC();
}
u64 ARM_Dynarmic::GetReg(int index) const {
u64 ARM_Dynarmic_64::GetReg(int index) const {
return jit->GetRegister(index);
}
void ARM_Dynarmic::SetReg(int index, u64 value) {
void ARM_Dynarmic_64::SetReg(int index, u64 value) {
jit->SetRegister(index, value);
}
u128 ARM_Dynarmic::GetVectorReg(int index) const {
u128 ARM_Dynarmic_64::GetVectorReg(int index) const {
return jit->GetVector(index);
}
void ARM_Dynarmic::SetVectorReg(int index, u128 value) {
void ARM_Dynarmic_64::SetVectorReg(int index, u128 value) {
jit->SetVector(index, value);
}
u32 ARM_Dynarmic::GetPSTATE() const {
u32 ARM_Dynarmic_64::GetPSTATE() const {
return jit->GetPstate();
}
void ARM_Dynarmic::SetPSTATE(u32 pstate) {
void ARM_Dynarmic_64::SetPSTATE(u32 pstate) {
jit->SetPstate(pstate);
}
u64 ARM_Dynarmic::GetTlsAddress() const {
u64 ARM_Dynarmic_64::GetTlsAddress() const {
return cb->tpidrro_el0;
}
void ARM_Dynarmic::SetTlsAddress(VAddr address) {
void ARM_Dynarmic_64::SetTlsAddress(VAddr address) {
cb->tpidrro_el0 = address;
}
u64 ARM_Dynarmic::GetTPIDR_EL0() const {
u64 ARM_Dynarmic_64::GetTPIDR_EL0() const {
return cb->tpidr_el0;
}
void ARM_Dynarmic::SetTPIDR_EL0(u64 value) {
void ARM_Dynarmic_64::SetTPIDR_EL0(u64 value) {
cb->tpidr_el0 = value;
}
void ARM_Dynarmic::SaveContext(ThreadContext& ctx) {
void ARM_Dynarmic_64::SaveContext(ThreadContext64& ctx) {
ctx.cpu_registers = jit->GetRegisters();
ctx.sp = jit->GetSP();
ctx.pc = jit->GetPC();
@@ -242,7 +242,7 @@ void ARM_Dynarmic::SaveContext(ThreadContext& ctx) {
ctx.tpidr = cb->tpidr_el0;
}
void ARM_Dynarmic::LoadContext(const ThreadContext& ctx) {
void ARM_Dynarmic_64::LoadContext(const ThreadContext64& ctx) {
jit->SetRegisters(ctx.cpu_registers);
jit->SetSP(ctx.sp);
jit->SetPC(ctx.pc);
@@ -253,25 +253,32 @@ void ARM_Dynarmic::LoadContext(const ThreadContext& ctx) {
SetTPIDR_EL0(ctx.tpidr);
}
void ARM_Dynarmic::PrepareReschedule() {
void ARM_Dynarmic_64::PrepareReschedule() {
jit->HaltExecution();
}
void ARM_Dynarmic::ClearInstructionCache() {
void ARM_Dynarmic_64::ClearInstructionCache() {
jit->ClearCache();
}
void ARM_Dynarmic::ClearExclusiveState() {
void ARM_Dynarmic_64::ClearExclusiveState() {
jit->ClearExclusiveState();
}
void ARM_Dynarmic::PageTableChanged(Common::PageTable& page_table,
std::size_t new_address_space_size_in_bits) {
void ARM_Dynarmic_64::PageTableChanged(Common::PageTable& page_table,
std::size_t new_address_space_size_in_bits) {
auto key = std::make_pair(&page_table, new_address_space_size_in_bits);
auto iter = jit_cache.find(key);
if (iter != jit_cache.end()) {
jit = iter->second;
return;
}
jit = MakeJit(page_table, new_address_space_size_in_bits);
jit_cache.emplace(key, jit);
}
DynarmicExclusiveMonitor::DynarmicExclusiveMonitor(Memory::Memory& memory_, std::size_t core_count)
: monitor(core_count), memory{memory_} {}
DynarmicExclusiveMonitor::DynarmicExclusiveMonitor(Memory::Memory& memory, std::size_t core_count)
: monitor(core_count), memory{memory} {}
DynarmicExclusiveMonitor::~DynarmicExclusiveMonitor() = default;

@@ -5,9 +5,12 @@
#pragma once
#include <memory>
#include <unordered_map>
#include <dynarmic/A64/a64.h>
#include <dynarmic/A64/exclusive_monitor.h>
#include "common/common_types.h"
#include "common/hash.h"
#include "core/arm/arm_interface.h"
#include "core/arm/exclusive_monitor.h"
#include "core/arm/unicorn/arm_unicorn.h"
@@ -18,14 +21,14 @@ class Memory;
namespace Core {
class ARM_Dynarmic_Callbacks;
class DynarmicCallbacks64;
class DynarmicExclusiveMonitor;
class System;
class ARM_Dynarmic final : public ARM_Interface {
class ARM_Dynarmic_64 final : public ARM_Interface {
public:
ARM_Dynarmic(System& system, ExclusiveMonitor& exclusive_monitor, std::size_t core_index);
~ARM_Dynarmic() override;
ARM_Dynarmic_64(System& system, ExclusiveMonitor& exclusive_monitor, std::size_t core_index);
~ARM_Dynarmic_64() override;
void SetPC(u64 pc) override;
u64 GetPC() const override;
@@ -42,8 +45,10 @@ public:
void SetTPIDR_EL0(u64 value) override;
u64 GetTPIDR_EL0() const override;
void SaveContext(ThreadContext& ctx) override;
void LoadContext(const ThreadContext& ctx) override;
void SaveContext(ThreadContext32& ctx) override {}
void SaveContext(ThreadContext64& ctx) override;
void LoadContext(const ThreadContext32& ctx) override {}
void LoadContext(const ThreadContext64& ctx) override;
void PrepareReschedule() override;
void ClearExclusiveState() override;
@@ -53,12 +58,17 @@ public:
std::size_t new_address_space_size_in_bits) override;
private:
std::unique_ptr<Dynarmic::A64::Jit> MakeJit(Common::PageTable& page_table,
std::shared_ptr<Dynarmic::A64::Jit> MakeJit(Common::PageTable& page_table,
std::size_t address_space_bits) const;
friend class ARM_Dynarmic_Callbacks;
std::unique_ptr<ARM_Dynarmic_Callbacks> cb;
std::unique_ptr<Dynarmic::A64::Jit> jit;
using JitCacheKey = std::pair<Common::PageTable*, std::size_t>;
using JitCacheType =
std::unordered_map<JitCacheKey, std::shared_ptr<Dynarmic::A64::Jit>, Common::PairHash>;
friend class DynarmicCallbacks64;
std::unique_ptr<DynarmicCallbacks64> cb;
JitCacheType jit_cache;
std::shared_ptr<Dynarmic::A64::Jit> jit;
ARM_Unicorn inner_unicorn;
std::size_t core_index;
@@ -67,7 +77,7 @@ private:
class DynarmicExclusiveMonitor final : public ExclusiveMonitor {
public:
explicit DynarmicExclusiveMonitor(Memory::Memory& memory_, std::size_t core_count);
explicit DynarmicExclusiveMonitor(Memory::Memory& memory, std::size_t core_count);
~DynarmicExclusiveMonitor() override;
void SetExclusive(std::size_t core_index, VAddr addr) override;
@@ -80,7 +90,7 @@ public:
bool ExclusiveWrite128(std::size_t core_index, VAddr vaddr, u128 value) override;
private:
friend class ARM_Dynarmic;
friend class ARM_Dynarmic_64;
Dynarmic::A64::ExclusiveMonitor monitor;
Memory::Memory& memory;
};

@@ -0,0 +1,80 @@
// Copyright 2017 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/arm/dynarmic/arm_dynarmic_cp15.h"
using Callback = Dynarmic::A32::Coprocessor::Callback;
using CallbackOrAccessOneWord = Dynarmic::A32::Coprocessor::CallbackOrAccessOneWord;
using CallbackOrAccessTwoWords = Dynarmic::A32::Coprocessor::CallbackOrAccessTwoWords;
std::optional<Callback> DynarmicCP15::CompileInternalOperation(bool two, unsigned opc1,
CoprocReg CRd, CoprocReg CRn,
CoprocReg CRm, unsigned opc2) {
return {};
}
CallbackOrAccessOneWord DynarmicCP15::CompileSendOneWord(bool two, unsigned opc1, CoprocReg CRn,
CoprocReg CRm, unsigned opc2) {
// TODO(merry): Privileged CP15 registers
if (!two && CRn == CoprocReg::C7 && opc1 == 0 && CRm == CoprocReg::C5 && opc2 == 4) {
// This is a dummy write, we ignore the value written here.
return &CP15[static_cast<std::size_t>(CP15Register::CP15_FLUSH_PREFETCH_BUFFER)];
}
if (!two && CRn == CoprocReg::C7 && opc1 == 0 && CRm == CoprocReg::C10) {
switch (opc2) {
case 4:
// This is a dummy write, we ignore the value written here.
return &CP15[static_cast<std::size_t>(CP15Register::CP15_DATA_SYNC_BARRIER)];
case 5:
// This is a dummy write, we ignore the value written here.
return &CP15[static_cast<std::size_t>(CP15Register::CP15_DATA_MEMORY_BARRIER)];
default:
return {};
}
}
if (!two && CRn == CoprocReg::C13 && opc1 == 0 && CRm == CoprocReg::C0 && opc2 == 2) {
return &CP15[static_cast<std::size_t>(CP15Register::CP15_THREAD_UPRW)];
}
return {};
}
CallbackOrAccessTwoWords DynarmicCP15::CompileSendTwoWords(bool two, unsigned opc, CoprocReg CRm) {
return {};
}
CallbackOrAccessOneWord DynarmicCP15::CompileGetOneWord(bool two, unsigned opc1, CoprocReg CRn,
CoprocReg CRm, unsigned opc2) {
// TODO(merry): Privileged CP15 registers
if (!two && CRn == CoprocReg::C13 && opc1 == 0 && CRm == CoprocReg::C0) {
switch (opc2) {
case 2:
return &CP15[static_cast<std::size_t>(CP15Register::CP15_THREAD_UPRW)];
case 3:
return &CP15[static_cast<std::size_t>(CP15Register::CP15_THREAD_URO)];
default:
return {};
}
}
return {};
}
CallbackOrAccessTwoWords DynarmicCP15::CompileGetTwoWords(bool two, unsigned opc, CoprocReg CRm) {
return {};
}
std::optional<Callback> DynarmicCP15::CompileLoadWords(bool two, bool long_transfer, CoprocReg CRd,
std::optional<u8> option) {
return {};
}
std::optional<Callback> DynarmicCP15::CompileStoreWords(bool two, bool long_transfer, CoprocReg CRd,
std::optional<u8> option) {
return {};
}
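
The `CompileGetOneWord` function above decodes a coprocessor access by matching the instruction fields (`two`, `opc1`, `CRn`, `CRm`, `opc2`) and returns nothing for any encoding it does not handle. A minimal standalone sketch of that decode shape follows. The `Reg` enum and `DecodeRead` function are hypothetical names for illustration, and plain integers stand in for `Dynarmic::A32::CoprocReg`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical register identifiers; the diff indexes a u32 array with CP15Register instead.
enum class Reg {
    ThreadUprw, // Thread ID register, user/privileged read-write
    ThreadUro,  // Thread ID register, user read-only
};

// Decode a one-word CP15 read. Only c13, c0 with opc1 == 0 is recognised,
// mirroring the cases handled in the diff; everything else is rejected.
std::optional<Reg> DecodeRead(bool two, unsigned opc1, unsigned crn, unsigned crm,
                              unsigned opc2) {
    if (!two && crn == 13 && opc1 == 0 && crm == 0) {
        switch (opc2) {
        case 2:
            return Reg::ThreadUprw;
        case 3:
            return Reg::ThreadUro;
        default:
            return std::nullopt;
        }
    }
    return std::nullopt;
}
```

An empty result plays the same role as the `return {};` fall-through above: the encoding is left unhandled rather than mapped to a register.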

@@ -0,0 +1,152 @@
// Copyright 2017 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <memory>
#include <optional>
#include <dynarmic/A32/coprocessor.h>
#include "common/common_types.h"
enum class CP15Register {
// c0 - Information registers
CP15_MAIN_ID,
CP15_CACHE_TYPE,
CP15_TCM_STATUS,
CP15_TLB_TYPE,
CP15_CPU_ID,
CP15_PROCESSOR_FEATURE_0,
CP15_PROCESSOR_FEATURE_1,
CP15_DEBUG_FEATURE_0,
CP15_AUXILIARY_FEATURE_0,
CP15_MEMORY_MODEL_FEATURE_0,
CP15_MEMORY_MODEL_FEATURE_1,
CP15_MEMORY_MODEL_FEATURE_2,
CP15_MEMORY_MODEL_FEATURE_3,
CP15_ISA_FEATURE_0,
CP15_ISA_FEATURE_1,
CP15_ISA_FEATURE_2,
CP15_ISA_FEATURE_3,
CP15_ISA_FEATURE_4,
// c1 - Control registers
CP15_CONTROL,
CP15_AUXILIARY_CONTROL,
CP15_COPROCESSOR_ACCESS_CONTROL,
// c2 - Translation table registers
CP15_TRANSLATION_BASE_TABLE_0,
CP15_TRANSLATION_BASE_TABLE_1,
CP15_TRANSLATION_BASE_CONTROL,
CP15_DOMAIN_ACCESS_CONTROL,
CP15_RESERVED,
// c5 - Fault status registers
CP15_FAULT_STATUS,
CP15_INSTR_FAULT_STATUS,
CP15_COMBINED_DATA_FSR = CP15_FAULT_STATUS,
CP15_INST_FSR,
// c6 - Fault Address registers
CP15_FAULT_ADDRESS,
CP15_COMBINED_DATA_FAR = CP15_FAULT_ADDRESS,
CP15_WFAR,
CP15_IFAR,
// c7 - Cache operation registers
CP15_WAIT_FOR_INTERRUPT,
CP15_PHYS_ADDRESS,
CP15_INVALIDATE_INSTR_CACHE,
CP15_INVALIDATE_INSTR_CACHE_USING_MVA,
CP15_INVALIDATE_INSTR_CACHE_USING_INDEX,
CP15_FLUSH_PREFETCH_BUFFER,
CP15_FLUSH_BRANCH_TARGET_CACHE,
CP15_FLUSH_BRANCH_TARGET_CACHE_ENTRY,
CP15_INVALIDATE_DATA_CACHE,
CP15_INVALIDATE_DATA_CACHE_LINE_USING_MVA,
CP15_INVALIDATE_DATA_CACHE_LINE_USING_INDEX,
CP15_INVALIDATE_DATA_AND_INSTR_CACHE,
CP15_CLEAN_DATA_CACHE,
CP15_CLEAN_DATA_CACHE_LINE_USING_MVA,
CP15_CLEAN_DATA_CACHE_LINE_USING_INDEX,
CP15_DATA_SYNC_BARRIER,
CP15_DATA_MEMORY_BARRIER,
CP15_CLEAN_AND_INVALIDATE_DATA_CACHE,
CP15_CLEAN_AND_INVALIDATE_DATA_CACHE_LINE_USING_MVA,
CP15_CLEAN_AND_INVALIDATE_DATA_CACHE_LINE_USING_INDEX,
// c8 - TLB operations
CP15_INVALIDATE_ITLB,
CP15_INVALIDATE_ITLB_SINGLE_ENTRY,
CP15_INVALIDATE_ITLB_ENTRY_ON_ASID_MATCH,
CP15_INVALIDATE_ITLB_ENTRY_ON_MVA,
CP15_INVALIDATE_DTLB,
CP15_INVALIDATE_DTLB_SINGLE_ENTRY,
CP15_INVALIDATE_DTLB_ENTRY_ON_ASID_MATCH,
CP15_INVALIDATE_DTLB_ENTRY_ON_MVA,
CP15_INVALIDATE_UTLB,
CP15_INVALIDATE_UTLB_SINGLE_ENTRY,
CP15_INVALIDATE_UTLB_ENTRY_ON_ASID_MATCH,
CP15_INVALIDATE_UTLB_ENTRY_ON_MVA,
// c9 - Data cache lockdown register
CP15_DATA_CACHE_LOCKDOWN,
// c10 - TLB/Memory map registers
CP15_TLB_LOCKDOWN,
CP15_PRIMARY_REGION_REMAP,
CP15_NORMAL_REGION_REMAP,
// c13 - Thread related registers
CP15_PID,
CP15_CONTEXT_ID,
CP15_THREAD_UPRW, // Thread ID register - User/Privileged Read/Write
CP15_THREAD_URO, // Thread ID register - User Read Only (Privileged R/W)
CP15_THREAD_PRW, // Thread ID register - Privileged R/W only.
// c15 - Performance and TLB lockdown registers
CP15_PERFORMANCE_MONITOR_CONTROL,
CP15_CYCLE_COUNTER,
CP15_COUNT_0,
CP15_COUNT_1,
CP15_READ_MAIN_TLB_LOCKDOWN_ENTRY,
CP15_WRITE_MAIN_TLB_LOCKDOWN_ENTRY,
CP15_MAIN_TLB_LOCKDOWN_VIRT_ADDRESS,
CP15_MAIN_TLB_LOCKDOWN_PHYS_ADDRESS,
CP15_MAIN_TLB_LOCKDOWN_ATTRIBUTE,
CP15_TLB_DEBUG_CONTROL,
// Skyeye defined
CP15_TLB_FAULT_ADDR,
CP15_TLB_FAULT_STATUS,
// Not an actual register.
// All registers should be defined above this.
CP15_REGISTER_COUNT,
};
class DynarmicCP15 final : public Dynarmic::A32::Coprocessor {
public:
using CoprocReg = Dynarmic::A32::CoprocReg;
explicit DynarmicCP15(u32* cp15) : CP15(cp15){};
std::optional<Callback> CompileInternalOperation(bool two, unsigned opc1, CoprocReg CRd,
CoprocReg CRn, CoprocReg CRm,
unsigned opc2) override;
CallbackOrAccessOneWord CompileSendOneWord(bool two, unsigned opc1, CoprocReg CRn,
CoprocReg CRm, unsigned opc2) override;
CallbackOrAccessTwoWords CompileSendTwoWords(bool two, unsigned opc, CoprocReg CRm) override;
CallbackOrAccessOneWord CompileGetOneWord(bool two, unsigned opc1, CoprocReg CRn, CoprocReg CRm,
unsigned opc2) override;
CallbackOrAccessTwoWords CompileGetTwoWords(bool two, unsigned opc, CoprocReg CRm) override;
std::optional<Callback> CompileLoadWords(bool two, bool long_transfer, CoprocReg CRd,
std::optional<u8> option) override;
std::optional<Callback> CompileStoreWords(bool two, bool long_transfer, CoprocReg CRd,
std::optional<u8> option) override;
private:
u32* CP15{};
};

@@ -3,7 +3,7 @@
// Refer to the license.txt file included.
#ifdef ARCHITECTURE_x86_64
#include "core/arm/dynarmic/arm_dynarmic.h"
#include "core/arm/dynarmic/arm_dynarmic_64.h"
#endif
#include "core/arm/exclusive_monitor.h"
#include "core/memory.h"

@@ -53,7 +53,7 @@ static bool UnmappedMemoryHook(uc_engine* uc, uc_mem_type type, u64 addr, int si
void* user_data) {
auto* const system = static_cast<System*>(user_data);
ARM_Interface::ThreadContext ctx{};
ARM_Interface::ThreadContext64 ctx{};
system->CurrentArmInterface().SaveContext(ctx);
ASSERT_MSG(false, "Attempted to read from unmapped memory: 0x{:X}, pc=0x{:X}, lr=0x{:X}", addr,
ctx.pc, ctx.cpu_registers[30]);
@@ -179,7 +179,7 @@ void ARM_Unicorn::ExecuteInstructions(std::size_t num_instructions) {
}
Kernel::Thread* const thread = system.CurrentScheduler().GetCurrentThread();
SaveContext(thread->GetContext());
SaveContext(thread->GetContext64());
if (last_bkpt_hit || GDBStub::IsMemoryBreak() || GDBStub::GetCpuStepFlag()) {
last_bkpt_hit = false;
GDBStub::Break();
@@ -188,7 +188,7 @@ void ARM_Unicorn::ExecuteInstructions(std::size_t num_instructions) {
}
}
void ARM_Unicorn::SaveContext(ThreadContext& ctx) {
void ARM_Unicorn::SaveContext(ThreadContext64& ctx) {
int uregs[32];
void* tregs[32];
@@ -215,7 +215,7 @@ void ARM_Unicorn::SaveContext(ThreadContext& ctx) {
CHECKED(uc_reg_read_batch(uc, uregs, tregs, 32));
}
void ARM_Unicorn::LoadContext(const ThreadContext& ctx) {
void ARM_Unicorn::LoadContext(const ThreadContext64& ctx) {
int uregs[32];
void* tregs[32];

@@ -30,8 +30,6 @@ public:
void SetTlsAddress(VAddr address) override;
void SetTPIDR_EL0(u64 value) override;
u64 GetTPIDR_EL0() const override;
void SaveContext(ThreadContext& ctx) override;
void LoadContext(const ThreadContext& ctx) override;
void PrepareReschedule() override;
void ClearExclusiveState() override;
void ExecuteInstructions(std::size_t num_instructions);
@@ -41,6 +39,11 @@ public:
void PageTableChanged(Common::PageTable&, std::size_t) override {}
void RecordBreak(GDBStub::BreakpointAddress bkpt);
void SaveContext(ThreadContext32& ctx) override {}
void SaveContext(ThreadContext64& ctx) override;
void LoadContext(const ThreadContext32& ctx) override {}
void LoadContext(const ThreadContext64& ctx) override;
private:
static void InterruptHook(uc_engine* uc, u32 int_no, void* user_data);

@@ -24,6 +24,7 @@
#include "core/file_sys/sdmc_factory.h"
#include "core/file_sys/vfs_concat.h"
#include "core/file_sys/vfs_real.h"
#include "core/frontend/scope_acquire_context.h"
#include "core/gdbstub/gdbstub.h"
#include "core/hardware_interrupt_manager.h"
#include "core/hle/kernel/client_port.h"
@@ -173,6 +174,7 @@ struct System::Impl {
}
interrupt_manager = std::make_unique<Core::Hardware::InterruptManager>(system);
gpu_core = VideoCore::CreateGPU(system);
renderer->Rasterizer().SetupDirtyFlags();
is_powered_on = true;
exit_lock = false;
@@ -184,6 +186,8 @@ struct System::Impl {
ResultStatus Load(System& system, Frontend::EmuWindow& emu_window,
const std::string& filepath) {
Core::Frontend::ScopeAcquireContext acquire_context{emu_window};
app_loader = Loader::GetLoader(GetGameFileFromPath(virtual_filesystem, filepath));
if (!app_loader) {
LOG_CRITICAL(Core, "Failed to obtain loader for {}!", filepath);

@@ -6,9 +6,6 @@
#include <mutex>
#include "common/logging/log.h"
#ifdef ARCHITECTURE_x86_64
#include "core/arm/dynarmic/arm_dynarmic.h"
#endif
#include "core/arm/exclusive_monitor.h"
#include "core/arm/unicorn/arm_unicorn.h"
#include "core/core.h"

@@ -26,9 +26,6 @@ public:
/// Releases (dunno if this is the "right" word) the context from the caller thread
virtual void DoneCurrent() = 0;
/// Swap buffers to display the next frame
virtual void SwapBuffers() = 0;
};
/**

@@ -29,6 +29,7 @@ enum class AspectRatio {
struct FramebufferLayout {
u32 width{ScreenUndocked::Width};
u32 height{ScreenUndocked::Height};
bool is_srgb{};
Common::Rectangle<u32> screen;

@@ -0,0 +1,18 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/frontend/emu_window.h"
#include "core/frontend/scope_acquire_context.h"
namespace Core::Frontend {
ScopeAcquireContext::ScopeAcquireContext(Core::Frontend::GraphicsContext& context)
: context{context} {
context.MakeCurrent();
}
ScopeAcquireContext::~ScopeAcquireContext() {
context.DoneCurrent();
}
} // namespace Core::Frontend
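
`ScopeAcquireContext` above is an RAII guard: the constructor makes the graphics context current and the destructor releases it, so the release happens on every exit path from the scope. A minimal standalone sketch of the same idea follows. `RecordingContext`, `ScopedAcquire`, and `RunGuarded` are hypothetical names used only to make the call order observable; they are not part of yuzu.

```cpp
#include <cassert>
#include <string>

// Hypothetical context that records the order of calls made on it.
class RecordingContext {
public:
    void MakeCurrent() {
        log += "make;";
    }
    void DoneCurrent() {
        log += "done;";
    }
    std::string log;
};

// Acquires the context on construction and releases it on destruction.
class ScopedAcquire {
public:
    explicit ScopedAcquire(RecordingContext& context_) : context{context_} {
        context.MakeCurrent();
    }
    ~ScopedAcquire() {
        context.DoneCurrent();
    }
    ScopedAcquire(const ScopedAcquire&) = delete;
    ScopedAcquire& operator=(const ScopedAcquire&) = delete;

private:
    RecordingContext& context;
};

// Runs some work inside a guarded scope and returns the recorded call order.
std::string RunGuarded() {
    RecordingContext context;
    {
        ScopedAcquire guard{context};
        context.log += "work;";
    }
    return context.log;
}
```

This is the pattern `System::Impl::Load` relies on when it declares `acquire_context` at the top of the function: the context stays current for the whole load and is released when the function returns.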

@@ -8,16 +8,16 @@
namespace Core::Frontend {
class EmuWindow;
class GraphicsContext;
/// Helper class to acquire/release window context within a given scope
class ScopeAcquireWindowContext : NonCopyable {
class ScopeAcquireContext : NonCopyable {
public:
explicit ScopeAcquireWindowContext(Core::Frontend::EmuWindow& window);
~ScopeAcquireWindowContext();
explicit ScopeAcquireContext(Core::Frontend::GraphicsContext& context);
~ScopeAcquireContext();
private:
Core::Frontend::EmuWindow& emu_window;
Core::Frontend::GraphicsContext& context;
};
} // namespace Core::Frontend

@@ -1,18 +0,0 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/frontend/emu_window.h"
#include "core/frontend/scope_acquire_window_context.h"
namespace Core::Frontend {
ScopeAcquireWindowContext::ScopeAcquireWindowContext(Core::Frontend::EmuWindow& emu_window_)
: emu_window{emu_window_} {
emu_window.MakeCurrent();
}
ScopeAcquireWindowContext::~ScopeAcquireWindowContext() {
emu_window.DoneCurrent();
}
} // namespace Core::Frontend

@@ -217,7 +217,7 @@ static u64 RegRead(std::size_t id, Kernel::Thread* thread = nullptr) {
return 0;
}
const auto& thread_context = thread->GetContext();
const auto& thread_context = thread->GetContext64();
if (id < SP_REGISTER) {
return thread_context.cpu_registers[id];
@@ -239,7 +239,7 @@ static void RegWrite(std::size_t id, u64 val, Kernel::Thread* thread = nullptr)
return;
}
auto& thread_context = thread->GetContext();
auto& thread_context = thread->GetContext64();
if (id < SP_REGISTER) {
thread_context.cpu_registers[id] = val;
@@ -259,7 +259,7 @@ static u128 FpuRead(std::size_t id, Kernel::Thread* thread = nullptr) {
return u128{0};
}
auto& thread_context = thread->GetContext();
auto& thread_context = thread->GetContext64();
if (id >= UC_ARM64_REG_Q0 && id < FPCR_REGISTER) {
return thread_context.vector_registers[id - UC_ARM64_REG_Q0];
@@ -275,7 +275,7 @@ static void FpuWrite(std::size_t id, u128 val, Kernel::Thread* thread = nullptr)
return;
}
auto& thread_context = thread->GetContext();
auto& thread_context = thread->GetContext64();
if (id >= UC_ARM64_REG_Q0 && id < FPCR_REGISTER) {
thread_context.vector_registers[id - UC_ARM64_REG_Q0] = val;
@@ -916,7 +916,7 @@ static void WriteRegister() {
// Update ARM context, skipping scheduler - no running threads at this point
Core::System::GetInstance()
.ArmInterface(current_core)
.LoadContext(current_thread->GetContext());
.LoadContext(current_thread->GetContext64());
SendReply("OK");
}
@@ -947,7 +947,7 @@ static void WriteRegisters() {
// Update ARM context, skipping scheduler - no running threads at this point
Core::System::GetInstance()
.ArmInterface(current_core)
.LoadContext(current_thread->GetContext());
.LoadContext(current_thread->GetContext64());
SendReply("OK");
}
@@ -1019,7 +1019,7 @@ static void Step() {
// Update ARM context, skipping scheduler - no running threads at this point
Core::System::GetInstance()
.ArmInterface(current_core)
.LoadContext(current_thread->GetContext());
.LoadContext(current_thread->GetContext64());
}
step_loop = true;
halt_loop = true;

@@ -186,6 +186,10 @@ struct KernelCore::Impl {
return;
}
for (auto& core : cores) {
core.SetIs64Bit(process->Is64BitProcess());
}
system.Memory().SetCurrentPageTable(*process);
}

@@ -5,7 +5,8 @@
#include "common/logging/log.h"
#include "core/arm/arm_interface.h"
#ifdef ARCHITECTURE_x86_64
#include "core/arm/dynarmic/arm_dynarmic.h"
#include "core/arm/dynarmic/arm_dynarmic_32.h"
#include "core/arm/dynarmic/arm_dynarmic_64.h"
#endif
#include "core/arm/exclusive_monitor.h"
#include "core/arm/unicorn/arm_unicorn.h"
@@ -20,13 +21,17 @@ PhysicalCore::PhysicalCore(Core::System& system, std::size_t id,
Core::ExclusiveMonitor& exclusive_monitor)
: core_index{id} {
#ifdef ARCHITECTURE_x86_64
arm_interface = std::make_unique<Core::ARM_Dynarmic>(system, exclusive_monitor, core_index);
arm_interface_32 =
std::make_unique<Core::ARM_Dynarmic_32>(system, exclusive_monitor, core_index);
arm_interface_64 =
std::make_unique<Core::ARM_Dynarmic_64>(system, exclusive_monitor, core_index);
#else
arm_interface = std::make_shared<Core::ARM_Unicorn>(system);
LOG_WARNING(Core, "CPU JIT requested, but Dynarmic not available");
#endif
scheduler = std::make_unique<Kernel::Scheduler>(system, *arm_interface, core_index);
scheduler = std::make_unique<Kernel::Scheduler>(system, core_index);
}
PhysicalCore::~PhysicalCore() = default;
@@ -48,4 +53,12 @@ void PhysicalCore::Shutdown() {
scheduler->Shutdown();
}
void PhysicalCore::SetIs64Bit(bool is_64_bit) {
if (is_64_bit) {
arm_interface = arm_interface_64.get();
} else {
arm_interface = arm_interface_32.get();
}
}
} // namespace Kernel
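
`PhysicalCore::SetIs64Bit` above keeps both backends alive and switches a non-owning pointer between them. A minimal standalone sketch of that ownership layout follows. `Interface`, `Backend32`, `Backend64`, and `CoreSketch` are hypothetical stand-ins for illustration, not the yuzu classes.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins for Core::ARM_Interface and its two Dynarmic backends.
struct Interface {
    virtual ~Interface() = default;
    virtual int Width() const = 0;
};
struct Backend32 final : Interface {
    int Width() const override {
        return 32;
    }
};
struct Backend64 final : Interface {
    int Width() const override {
        return 64;
    }
};

class CoreSketch {
public:
    CoreSketch()
        : backend_32{std::make_unique<Backend32>()}, backend_64{std::make_unique<Backend64>()} {}

    // Point the active interface at the backend matching the process width.
    void SetIs64Bit(bool is_64_bit) {
        active = is_64_bit ? backend_64.get() : backend_32.get();
    }

    // Null until SetIs64Bit has been called at least once.
    Interface* Active() const {
        return active;
    }

private:
    std::unique_ptr<Interface> backend_32; // Owning
    std::unique_ptr<Interface> backend_64; // Owning
    Interface* active{};                   // Non-owning view of one of the above
};
```

One consequence of this layout is that the active pointer is null until a process has been made current, so callers that dereference it early need to account for that.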

@@ -68,10 +68,14 @@ public:
return *scheduler;
}
void SetIs64Bit(bool is_64_bit);
private:
std::size_t core_index;
std::unique_ptr<Core::ARM_Interface> arm_interface;
std::unique_ptr<Core::ARM_Interface> arm_interface_32;
std::unique_ptr<Core::ARM_Interface> arm_interface_64;
std::unique_ptr<Kernel::Scheduler> scheduler;
Core::ARM_Interface* arm_interface{};
};
} // namespace Kernel

@@ -42,7 +42,8 @@ void SetupMainThread(Process& owner_process, KernelCore& kernel, u32 priority) {
// Register 1 must be a handle to the main thread
const Handle thread_handle = owner_process.GetHandleTable().Create(thread).Unwrap();
thread->GetContext().cpu_registers[1] = thread_handle;
thread->GetContext32().cpu_registers[1] = thread_handle;
thread->GetContext64().cpu_registers[1] = thread_handle;
// Threads by default are dormant, wake up the main thread so it runs when the scheduler fires
thread->ResumeFromWait();

@@ -383,8 +383,8 @@ void GlobalScheduler::Unlock() {
// TODO(Blinkhawk): Setup the interrupts and change context on current core.
}
Scheduler::Scheduler(Core::System& system, Core::ARM_Interface& cpu_core, std::size_t core_id)
: system(system), cpu_core(cpu_core), core_id(core_id) {}
Scheduler::Scheduler(Core::System& system, std::size_t core_id)
: system{system}, core_id{core_id} {}
Scheduler::~Scheduler() = default;
@@ -422,9 +422,10 @@ void Scheduler::UnloadThread() {
// Save context for previous thread
if (previous_thread) {
cpu_core.SaveContext(previous_thread->GetContext());
system.ArmInterface(core_id).SaveContext(previous_thread->GetContext32());
system.ArmInterface(core_id).SaveContext(previous_thread->GetContext64());
// Save the TPIDR_EL0 system register in case it was modified.
previous_thread->SetTPIDR_EL0(cpu_core.GetTPIDR_EL0());
previous_thread->SetTPIDR_EL0(system.ArmInterface(core_id).GetTPIDR_EL0());
if (previous_thread->GetStatus() == ThreadStatus::Running) {
// This is only the case when a reschedule is triggered without the current thread
@@ -451,9 +452,10 @@ void Scheduler::SwitchContext() {
// Save context for previous thread
if (previous_thread) {
cpu_core.SaveContext(previous_thread->GetContext());
system.ArmInterface(core_id).SaveContext(previous_thread->GetContext32());
system.ArmInterface(core_id).SaveContext(previous_thread->GetContext64());
// Save the TPIDR_EL0 system register in case it was modified.
previous_thread->SetTPIDR_EL0(cpu_core.GetTPIDR_EL0());
previous_thread->SetTPIDR_EL0(system.ArmInterface(core_id).GetTPIDR_EL0());
if (previous_thread->GetStatus() == ThreadStatus::Running) {
// This is only the case when a reschedule is triggered without the current thread
@@ -481,9 +483,10 @@ void Scheduler::SwitchContext() {
system.Kernel().MakeCurrentProcess(thread_owner_process);
}
cpu_core.LoadContext(new_thread->GetContext());
cpu_core.SetTlsAddress(new_thread->GetTLSAddress());
cpu_core.SetTPIDR_EL0(new_thread->GetTPIDR_EL0());
system.ArmInterface(core_id).LoadContext(new_thread->GetContext32());
system.ArmInterface(core_id).LoadContext(new_thread->GetContext64());
system.ArmInterface(core_id).SetTlsAddress(new_thread->GetTLSAddress());
system.ArmInterface(core_id).SetTPIDR_EL0(new_thread->GetTPIDR_EL0());
} else {
current_thread = nullptr;
// Note: We do not reset the current process and current page table when idling because


@@ -181,7 +181,7 @@ private:
class Scheduler final {
public:
explicit Scheduler(Core::System& system, Core::ARM_Interface& cpu_core, std::size_t core_id);
explicit Scheduler(Core::System& system, std::size_t core_id);
~Scheduler();
/// Returns whether there are any threads that are ready to run.
@@ -235,7 +235,6 @@ private:
std::shared_ptr<Thread> selected_thread = nullptr;
Core::System& system;
Core::ARM_Interface& cpu_core;
u64 last_context_switch_time = 0;
u64 idle_selection_count = 0;
const std::size_t core_id;


@@ -187,6 +187,13 @@ static ResultCode SetHeapSize(Core::System& system, VAddr* heap_addr, u64 heap_s
return RESULT_SUCCESS;
}
static ResultCode SetHeapSize32(Core::System& system, u32* heap_addr, u32 heap_size) {
VAddr temp_heap_addr{};
const ResultCode result{SetHeapSize(system, &temp_heap_addr, heap_size)};
*heap_addr = static_cast<u32>(temp_heap_addr);
return result;
}
static ResultCode SetMemoryPermission(Core::System& system, VAddr addr, u64 size, u32 prot) {
LOG_TRACE(Kernel_SVC, "called, addr=0x{:X}, size=0x{:X}, prot=0x{:X}", addr, size, prot);
@@ -371,6 +378,12 @@ static ResultCode ConnectToNamedPort(Core::System& system, Handle* out_handle,
return RESULT_SUCCESS;
}
static ResultCode ConnectToNamedPort32(Core::System& system, Handle* out_handle,
u32 port_name_address) {
return ConnectToNamedPort(system, out_handle, port_name_address);
}
/// Makes a blocking IPC call to an OS service.
static ResultCode SendSyncRequest(Core::System& system, Handle handle) {
const auto& handle_table = system.Kernel().CurrentProcess()->GetHandleTable();
@@ -390,6 +403,10 @@ static ResultCode SendSyncRequest(Core::System& system, Handle handle) {
return session->SendSyncRequest(SharedFrom(thread), system.Memory());
}
static ResultCode SendSyncRequest32(Core::System& system, Handle handle) {
return SendSyncRequest(system, handle);
}
/// Get the ID for the specified thread.
static ResultCode GetThreadId(Core::System& system, u64* thread_id, Handle thread_handle) {
LOG_TRACE(Kernel_SVC, "called thread=0x{:08X}", thread_handle);
@@ -405,6 +422,17 @@ static ResultCode GetThreadId(Core::System& system, u64* thread_id, Handle threa
return RESULT_SUCCESS;
}
static ResultCode GetThreadId32(Core::System& system, u32* thread_id_low, u32* thread_id_high,
Handle thread_handle) {
u64 thread_id{};
const ResultCode result{GetThreadId(system, &thread_id, thread_handle)};
*thread_id_low = static_cast<u32>(thread_id & std::numeric_limits<u32>::max());
*thread_id_high = static_cast<u32>(thread_id >> 32);
return result;
}
/// Gets the ID of the specified process or a specified thread's owning process.
static ResultCode GetProcessId(Core::System& system, u64* process_id, Handle handle) {
LOG_DEBUG(Kernel_SVC, "called handle=0x{:08X}", handle);
@@ -479,6 +507,12 @@ static ResultCode WaitSynchronization(Core::System& system, Handle* index, VAddr
return result;
}
static ResultCode WaitSynchronization32(Core::System& system, u32 timeout_low, u32 handles_address,
s32 handle_count, u32 timeout_high, Handle* index) {
const s64 nano_seconds{(static_cast<s64>(timeout_high) << 32) | static_cast<s64>(timeout_low)};
return WaitSynchronization(system, index, handles_address, handle_count, nano_seconds);
}
/// Resumes a thread waiting on WaitSynchronization
static ResultCode CancelSynchronization(Core::System& system, Handle thread_handle) {
LOG_TRACE(Kernel_SVC, "called thread=0x{:X}", thread_handle);
@@ -917,6 +951,18 @@ static ResultCode GetInfo(Core::System& system, u64* result, u64 info_id, u64 ha
}
}
static ResultCode GetInfo32(Core::System& system, u32* result_low, u32* result_high, u32 sub_id_low,
u32 info_id, u32 handle, u32 sub_id_high) {
const u64 sub_id{static_cast<u64>(sub_id_low | (static_cast<u64>(sub_id_high) << 32))};
u64 res_value{};
const ResultCode result{GetInfo(system, &res_value, info_id, handle, sub_id)};
*result_high = static_cast<u32>(res_value >> 32);
*result_low = static_cast<u32>(res_value & std::numeric_limits<u32>::max());
return result;
}
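
The 32-bit shims above (WaitSynchronization32, GetInfo32) pass 64-bit values through pairs of 32-bit registers. The packing and unpacking can be sketched on its own as follows. JoinHalves, SplitHalves and the Halves struct are hypothetical helpers introduced only for illustration; the patch itself writes the shifts inline.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

using u32 = std::uint32_t;
using u64 = std::uint64_t;

// Hypothetical helper: rebuild a 64-bit value from two 32-bit register halves.
// Each half is widened before shifting so no bits are lost.
constexpr u64 JoinHalves(u32 low, u32 high) {
    return static_cast<u64>(low) | (static_cast<u64>(high) << 32);
}

// Hypothetical result type for the split direction.
struct Halves {
    u32 low;
    u32 high;
};

// Hypothetical helper: split a 64-bit result into the halves that would be
// written back to two 32-bit output registers.
constexpr Halves SplitHalves(u64 value) {
    return {static_cast<u32>(value & std::numeric_limits<u32>::max()),
            static_cast<u32>(value >> 32)};
}

static_assert(JoinHalves(0xDDCCBBAAu, 0x11223344u) == 0x11223344DDCCBBAAull,
              "low half occupies bits 0..31, high half bits 32..63");
static_assert(SplitHalves(0x11223344DDCCBBAAull).low == 0xDDCCBBAAu, "low half");
static_assert(SplitHalves(0x11223344DDCCBBAAull).high == 0x11223344u, "high half");
```

Keeping the two directions as a matched pair makes it easy to check that a value survives a round trip, which is the property each *_low/*_high parameter pair relies on.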
/// Maps memory at a desired address
static ResultCode MapPhysicalMemory(Core::System& system, VAddr addr, u64 size) {
LOG_DEBUG(Kernel_SVC, "called, addr=0x{:016X}, size=0x{:X}", addr, size);
@@ -1058,7 +1104,7 @@ static ResultCode GetThreadContext(Core::System& system, VAddr thread_context, H
return ERR_BUSY;
}
Core::ARM_Interface::ThreadContext ctx = thread->GetContext();
Core::ARM_Interface::ThreadContext64 ctx = thread->GetContext64();
// Mask away mode bits, interrupt bits, IL bit, and other reserved bits.
ctx.pstate &= 0xFF0FFE20;
@@ -1088,6 +1134,10 @@ static ResultCode GetThreadPriority(Core::System& system, u32* priority, Handle
return RESULT_SUCCESS;
}
static ResultCode GetThreadPriority32(Core::System& system, u32* priority, Handle handle) {
return GetThreadPriority(system, priority, handle);
}
/// Sets the priority for the specified thread
static ResultCode SetThreadPriority(Core::System& system, Handle handle, u32 priority) {
LOG_TRACE(Kernel_SVC, "called");
@@ -1259,6 +1309,11 @@ static ResultCode QueryMemory(Core::System& system, VAddr memory_info_address,
query_address);
}
static ResultCode QueryMemory32(Core::System& system, u32 memory_info_address,
u32 page_info_address, u32 query_address) {
return QueryMemory(system, memory_info_address, page_info_address, query_address);
}
static ResultCode MapProcessCodeMemory(Core::System& system, Handle process_handle, u64 dst_address,
u64 src_address, u64 size) {
LOG_DEBUG(Kernel_SVC,
@@ -1675,6 +1730,10 @@ static void SignalProcessWideKey(Core::System& system, VAddr condition_variable_
}
}
static void SignalProcessWideKey32(Core::System& system, u32 condition_variable_addr, s32 target) {
SignalProcessWideKey(system, condition_variable_addr, target);
}
// Wait for an address (via Address Arbiter)
static ResultCode WaitForAddress(Core::System& system, VAddr address, u32 type, s32 value,
s64 timeout) {
@@ -1760,6 +1819,10 @@ static ResultCode CloseHandle(Core::System& system, Handle handle) {
return handle_table.Close(handle);
}
static ResultCode CloseHandle32(Core::System& system, Handle handle) {
return CloseHandle(system, handle);
}
/// Clears the signaled state of an event or process.
static ResultCode ResetSignal(Core::System& system, Handle handle) {
LOG_DEBUG(Kernel_SVC, "called handle 0x{:08X}", handle);
@@ -2317,69 +2380,196 @@ struct FunctionDef {
};
} // namespace
static const FunctionDef SVC_Table[] = {
static const FunctionDef SVC_Table_32[] = {
{0x00, nullptr, "Unknown"},
{0x01, SvcWrap<SetHeapSize>, "SetHeapSize"},
{0x02, SvcWrap<SetMemoryPermission>, "SetMemoryPermission"},
{0x03, SvcWrap<SetMemoryAttribute>, "SetMemoryAttribute"},
{0x04, SvcWrap<MapMemory>, "MapMemory"},
{0x05, SvcWrap<UnmapMemory>, "UnmapMemory"},
{0x06, SvcWrap<QueryMemory>, "QueryMemory"},
{0x07, SvcWrap<ExitProcess>, "ExitProcess"},
{0x08, SvcWrap<CreateThread>, "CreateThread"},
{0x09, SvcWrap<StartThread>, "StartThread"},
{0x0A, SvcWrap<ExitThread>, "ExitThread"},
{0x0B, SvcWrap<SleepThread>, "SleepThread"},
{0x0C, SvcWrap<GetThreadPriority>, "GetThreadPriority"},
{0x0D, SvcWrap<SetThreadPriority>, "SetThreadPriority"},
{0x0E, SvcWrap<GetThreadCoreMask>, "GetThreadCoreMask"},
{0x0F, SvcWrap<SetThreadCoreMask>, "SetThreadCoreMask"},
{0x10, SvcWrap<GetCurrentProcessorNumber>, "GetCurrentProcessorNumber"},
{0x11, SvcWrap<SignalEvent>, "SignalEvent"},
{0x12, SvcWrap<ClearEvent>, "ClearEvent"},
{0x13, SvcWrap<MapSharedMemory>, "MapSharedMemory"},
{0x14, SvcWrap<UnmapSharedMemory>, "UnmapSharedMemory"},
{0x15, SvcWrap<CreateTransferMemory>, "CreateTransferMemory"},
{0x16, SvcWrap<CloseHandle>, "CloseHandle"},
{0x17, SvcWrap<ResetSignal>, "ResetSignal"},
{0x18, SvcWrap<WaitSynchronization>, "WaitSynchronization"},
{0x19, SvcWrap<CancelSynchronization>, "CancelSynchronization"},
{0x1A, SvcWrap<ArbitrateLock>, "ArbitrateLock"},
{0x1B, SvcWrap<ArbitrateUnlock>, "ArbitrateUnlock"},
{0x1C, SvcWrap<WaitProcessWideKeyAtomic>, "WaitProcessWideKeyAtomic"},
{0x1D, SvcWrap<SignalProcessWideKey>, "SignalProcessWideKey"},
{0x1E, SvcWrap<GetSystemTick>, "GetSystemTick"},
{0x1F, SvcWrap<ConnectToNamedPort>, "ConnectToNamedPort"},
{0x01, SvcWrap32<SetHeapSize32>, "SetHeapSize32"},
{0x02, nullptr, "Unknown"},
{0x03, nullptr, "SetMemoryAttribute32"},
{0x04, nullptr, "MapMemory32"},
{0x05, nullptr, "UnmapMemory32"},
{0x06, SvcWrap32<QueryMemory32>, "QueryMemory32"},
{0x07, nullptr, "ExitProcess32"},
{0x08, nullptr, "CreateThread32"},
{0x09, nullptr, "StartThread32"},
{0x0a, nullptr, "ExitThread32"},
{0x0b, nullptr, "SleepThread32"},
{0x0c, SvcWrap32<GetThreadPriority32>, "GetThreadPriority32"},
{0x0d, nullptr, "SetThreadPriority32"},
{0x0e, nullptr, "GetThreadCoreMask32"},
{0x0f, nullptr, "SetThreadCoreMask32"},
{0x10, nullptr, "GetCurrentProcessorNumber32"},
{0x11, nullptr, "SignalEvent32"},
{0x12, nullptr, "ClearEvent32"},
{0x13, nullptr, "MapSharedMemory32"},
{0x14, nullptr, "UnmapSharedMemory32"},
{0x15, nullptr, "CreateTransferMemory32"},
{0x16, SvcWrap32<CloseHandle32>, "CloseHandle32"},
{0x17, nullptr, "ResetSignal32"},
{0x18, SvcWrap32<WaitSynchronization32>, "WaitSynchronization32"},
{0x19, nullptr, "CancelSynchronization32"},
{0x1a, nullptr, "ArbitrateLock32"},
{0x1b, nullptr, "ArbitrateUnlock32"},
{0x1c, nullptr, "WaitProcessWideKeyAtomic32"},
{0x1d, SvcWrap32<SignalProcessWideKey32>, "SignalProcessWideKey32"},
{0x1e, nullptr, "GetSystemTick32"},
{0x1f, SvcWrap32<ConnectToNamedPort32>, "ConnectToNamedPort32"},
{0x20, nullptr, "Unknown"},
{0x21, SvcWrap32<SendSyncRequest32>, "SendSyncRequest32"},
{0x22, nullptr, "SendSyncRequestWithUserBuffer32"},
{0x23, nullptr, "Unknown"},
{0x24, nullptr, "GetProcessId32"},
{0x25, SvcWrap32<GetThreadId32>, "GetThreadId32"},
{0x26, nullptr, "Break32"},
{0x27, nullptr, "OutputDebugString32"},
{0x28, nullptr, "Unknown"},
{0x29, SvcWrap32<GetInfo32>, "GetInfo32"},
{0x2a, nullptr, "Unknown"},
{0x2b, nullptr, "Unknown"},
{0x2c, nullptr, "MapPhysicalMemory32"},
{0x2d, nullptr, "UnmapPhysicalMemory32"},
{0x2e, nullptr, "Unknown"},
{0x2f, nullptr, "Unknown"},
{0x30, nullptr, "Unknown"},
{0x31, nullptr, "Unknown"},
{0x32, nullptr, "SetThreadActivity32"},
{0x33, nullptr, "GetThreadContext32"},
{0x34, nullptr, "WaitForAddress32"},
{0x35, nullptr, "SignalToAddress32"},
{0x36, nullptr, "Unknown"},
{0x37, nullptr, "Unknown"},
{0x38, nullptr, "Unknown"},
{0x39, nullptr, "Unknown"},
{0x3a, nullptr, "Unknown"},
{0x3b, nullptr, "Unknown"},
{0x3c, nullptr, "Unknown"},
{0x3d, nullptr, "Unknown"},
{0x3e, nullptr, "Unknown"},
{0x3f, nullptr, "Unknown"},
{0x40, nullptr, "CreateSession32"},
{0x41, nullptr, "AcceptSession32"},
{0x42, nullptr, "Unknown"},
{0x43, nullptr, "ReplyAndReceive32"},
{0x44, nullptr, "Unknown"},
{0x45, nullptr, "CreateEvent32"},
{0x46, nullptr, "Unknown"},
{0x47, nullptr, "Unknown"},
{0x48, nullptr, "Unknown"},
{0x49, nullptr, "Unknown"},
{0x4a, nullptr, "Unknown"},
{0x4b, nullptr, "Unknown"},
{0x4c, nullptr, "Unknown"},
{0x4d, nullptr, "Unknown"},
{0x4e, nullptr, "Unknown"},
{0x4f, nullptr, "Unknown"},
{0x50, nullptr, "Unknown"},
{0x51, nullptr, "Unknown"},
{0x52, nullptr, "Unknown"},
{0x53, nullptr, "Unknown"},
{0x54, nullptr, "Unknown"},
{0x55, nullptr, "Unknown"},
{0x56, nullptr, "Unknown"},
{0x57, nullptr, "Unknown"},
{0x58, nullptr, "Unknown"},
{0x59, nullptr, "Unknown"},
{0x5a, nullptr, "Unknown"},
{0x5b, nullptr, "Unknown"},
{0x5c, nullptr, "Unknown"},
{0x5d, nullptr, "Unknown"},
{0x5e, nullptr, "Unknown"},
{0x5F, nullptr, "FlushProcessDataCache32"},
{0x60, nullptr, "Unknown"},
{0x61, nullptr, "Unknown"},
{0x62, nullptr, "Unknown"},
{0x63, nullptr, "Unknown"},
{0x64, nullptr, "Unknown"},
{0x65, nullptr, "GetProcessList32"},
{0x66, nullptr, "Unknown"},
{0x67, nullptr, "Unknown"},
{0x68, nullptr, "Unknown"},
{0x69, nullptr, "Unknown"},
{0x6A, nullptr, "Unknown"},
{0x6B, nullptr, "Unknown"},
{0x6C, nullptr, "Unknown"},
{0x6D, nullptr, "Unknown"},
{0x6E, nullptr, "Unknown"},
{0x6f, nullptr, "GetSystemInfo32"},
{0x70, nullptr, "CreatePort32"},
{0x71, nullptr, "ManageNamedPort32"},
{0x72, nullptr, "ConnectToPort32"},
{0x73, nullptr, "SetProcessMemoryPermission32"},
{0x74, nullptr, "Unknown"},
{0x75, nullptr, "Unknown"},
{0x76, nullptr, "Unknown"},
{0x77, nullptr, "MapProcessCodeMemory32"},
{0x78, nullptr, "UnmapProcessCodeMemory32"},
{0x79, nullptr, "Unknown"},
{0x7A, nullptr, "Unknown"},
{0x7B, nullptr, "TerminateProcess32"},
};
static const FunctionDef SVC_Table_64[] = {
{0x00, nullptr, "Unknown"},
{0x01, SvcWrap64<SetHeapSize>, "SetHeapSize"},
{0x02, SvcWrap64<SetMemoryPermission>, "SetMemoryPermission"},
{0x03, SvcWrap64<SetMemoryAttribute>, "SetMemoryAttribute"},
{0x04, SvcWrap64<MapMemory>, "MapMemory"},
{0x05, SvcWrap64<UnmapMemory>, "UnmapMemory"},
{0x06, SvcWrap64<QueryMemory>, "QueryMemory"},
{0x07, SvcWrap64<ExitProcess>, "ExitProcess"},
{0x08, SvcWrap64<CreateThread>, "CreateThread"},
{0x09, SvcWrap64<StartThread>, "StartThread"},
{0x0A, SvcWrap64<ExitThread>, "ExitThread"},
{0x0B, SvcWrap64<SleepThread>, "SleepThread"},
{0x0C, SvcWrap64<GetThreadPriority>, "GetThreadPriority"},
{0x0D, SvcWrap64<SetThreadPriority>, "SetThreadPriority"},
{0x0E, SvcWrap64<GetThreadCoreMask>, "GetThreadCoreMask"},
{0x0F, SvcWrap64<SetThreadCoreMask>, "SetThreadCoreMask"},
{0x10, SvcWrap64<GetCurrentProcessorNumber>, "GetCurrentProcessorNumber"},
{0x11, SvcWrap64<SignalEvent>, "SignalEvent"},
{0x12, SvcWrap64<ClearEvent>, "ClearEvent"},
{0x13, SvcWrap64<MapSharedMemory>, "MapSharedMemory"},
{0x14, SvcWrap64<UnmapSharedMemory>, "UnmapSharedMemory"},
{0x15, SvcWrap64<CreateTransferMemory>, "CreateTransferMemory"},
{0x16, SvcWrap64<CloseHandle>, "CloseHandle"},
{0x17, SvcWrap64<ResetSignal>, "ResetSignal"},
{0x18, SvcWrap64<WaitSynchronization>, "WaitSynchronization"},
{0x19, SvcWrap64<CancelSynchronization>, "CancelSynchronization"},
{0x1A, SvcWrap64<ArbitrateLock>, "ArbitrateLock"},
{0x1B, SvcWrap64<ArbitrateUnlock>, "ArbitrateUnlock"},
{0x1C, SvcWrap64<WaitProcessWideKeyAtomic>, "WaitProcessWideKeyAtomic"},
{0x1D, SvcWrap64<SignalProcessWideKey>, "SignalProcessWideKey"},
{0x1E, SvcWrap64<GetSystemTick>, "GetSystemTick"},
{0x1F, SvcWrap64<ConnectToNamedPort>, "ConnectToNamedPort"},
{0x20, nullptr, "SendSyncRequestLight"},
{0x21, SvcWrap<SendSyncRequest>, "SendSyncRequest"},
{0x21, SvcWrap64<SendSyncRequest>, "SendSyncRequest"},
{0x22, nullptr, "SendSyncRequestWithUserBuffer"},
{0x23, nullptr, "SendAsyncRequestWithUserBuffer"},
{0x24, SvcWrap<GetProcessId>, "GetProcessId"},
{0x25, SvcWrap<GetThreadId>, "GetThreadId"},
{0x26, SvcWrap<Break>, "Break"},
{0x27, SvcWrap<OutputDebugString>, "OutputDebugString"},
{0x24, SvcWrap64<GetProcessId>, "GetProcessId"},
{0x25, SvcWrap64<GetThreadId>, "GetThreadId"},
{0x26, SvcWrap64<Break>, "Break"},
{0x27, SvcWrap64<OutputDebugString>, "OutputDebugString"},
{0x28, nullptr, "ReturnFromException"},
{0x29, SvcWrap<GetInfo>, "GetInfo"},
{0x29, SvcWrap64<GetInfo>, "GetInfo"},
{0x2A, nullptr, "FlushEntireDataCache"},
{0x2B, nullptr, "FlushDataCache"},
{0x2C, SvcWrap<MapPhysicalMemory>, "MapPhysicalMemory"},
{0x2D, SvcWrap<UnmapPhysicalMemory>, "UnmapPhysicalMemory"},
{0x2C, SvcWrap64<MapPhysicalMemory>, "MapPhysicalMemory"},
{0x2D, SvcWrap64<UnmapPhysicalMemory>, "UnmapPhysicalMemory"},
{0x2E, nullptr, "GetFutureThreadInfo"},
{0x2F, nullptr, "GetLastThreadInfo"},
{0x30, SvcWrap<GetResourceLimitLimitValue>, "GetResourceLimitLimitValue"},
{0x31, SvcWrap<GetResourceLimitCurrentValue>, "GetResourceLimitCurrentValue"},
{0x32, SvcWrap<SetThreadActivity>, "SetThreadActivity"},
{0x33, SvcWrap<GetThreadContext>, "GetThreadContext"},
{0x34, SvcWrap<WaitForAddress>, "WaitForAddress"},
{0x35, SvcWrap<SignalToAddress>, "SignalToAddress"},
{0x30, SvcWrap64<GetResourceLimitLimitValue>, "GetResourceLimitLimitValue"},
{0x31, SvcWrap64<GetResourceLimitCurrentValue>, "GetResourceLimitCurrentValue"},
{0x32, SvcWrap64<SetThreadActivity>, "SetThreadActivity"},
{0x33, SvcWrap64<GetThreadContext>, "GetThreadContext"},
{0x34, SvcWrap64<WaitForAddress>, "WaitForAddress"},
{0x35, SvcWrap64<SignalToAddress>, "SignalToAddress"},
{0x36, nullptr, "SynchronizePreemptionState"},
{0x37, nullptr, "Unknown"},
{0x38, nullptr, "Unknown"},
{0x39, nullptr, "Unknown"},
{0x3A, nullptr, "Unknown"},
{0x3B, nullptr, "Unknown"},
{0x3C, SvcWrap<KernelDebug>, "KernelDebug"},
{0x3D, SvcWrap<ChangeKernelTraceState>, "ChangeKernelTraceState"},
{0x3C, SvcWrap64<KernelDebug>, "KernelDebug"},
{0x3D, SvcWrap64<ChangeKernelTraceState>, "ChangeKernelTraceState"},
{0x3E, nullptr, "Unknown"},
{0x3F, nullptr, "Unknown"},
{0x40, nullptr, "CreateSession"},
@@ -2387,7 +2577,7 @@ static const FunctionDef SVC_Table[] = {
{0x42, nullptr, "ReplyAndReceiveLight"},
{0x43, nullptr, "ReplyAndReceive"},
{0x44, nullptr, "ReplyAndReceiveWithUserBuffer"},
{0x45, SvcWrap<CreateEvent>, "CreateEvent"},
{0x45, SvcWrap64<CreateEvent>, "CreateEvent"},
{0x46, nullptr, "Unknown"},
{0x47, nullptr, "Unknown"},
{0x48, nullptr, "MapPhysicalMemoryUnsafe"},
@@ -2398,9 +2588,9 @@ static const FunctionDef SVC_Table[] = {
{0x4D, nullptr, "SleepSystem"},
{0x4E, nullptr, "ReadWriteRegister"},
{0x4F, nullptr, "SetProcessActivity"},
{0x50, SvcWrap<CreateSharedMemory>, "CreateSharedMemory"},
{0x51, SvcWrap<MapTransferMemory>, "MapTransferMemory"},
{0x52, SvcWrap<UnmapTransferMemory>, "UnmapTransferMemory"},
{0x50, SvcWrap64<CreateSharedMemory>, "CreateSharedMemory"},
{0x51, SvcWrap64<MapTransferMemory>, "MapTransferMemory"},
{0x52, SvcWrap64<UnmapTransferMemory>, "UnmapTransferMemory"},
{0x53, nullptr, "CreateInterruptEvent"},
{0x54, nullptr, "QueryPhysicalAddress"},
{0x55, nullptr, "QueryIoMapping"},
@@ -2419,8 +2609,8 @@ static const FunctionDef SVC_Table[] = {
{0x62, nullptr, "TerminateDebugProcess"},
{0x63, nullptr, "GetDebugEvent"},
{0x64, nullptr, "ContinueDebugEvent"},
{0x65, SvcWrap<GetProcessList>, "GetProcessList"},
{0x66, SvcWrap<GetThreadList>, "GetThreadList"},
{0x65, SvcWrap64<GetProcessList>, "GetProcessList"},
{0x66, SvcWrap64<GetThreadList>, "GetThreadList"},
{0x67, nullptr, "GetDebugThreadContext"},
{0x68, nullptr, "SetDebugThreadContext"},
{0x69, nullptr, "QueryDebugProcessMemory"},
@@ -2436,24 +2626,32 @@ static const FunctionDef SVC_Table[] = {
{0x73, nullptr, "SetProcessMemoryPermission"},
{0x74, nullptr, "MapProcessMemory"},
{0x75, nullptr, "UnmapProcessMemory"},
{0x76, SvcWrap<QueryProcessMemory>, "QueryProcessMemory"},
{0x77, SvcWrap<MapProcessCodeMemory>, "MapProcessCodeMemory"},
{0x78, SvcWrap<UnmapProcessCodeMemory>, "UnmapProcessCodeMemory"},
{0x76, SvcWrap64<QueryProcessMemory>, "QueryProcessMemory"},
{0x77, SvcWrap64<MapProcessCodeMemory>, "MapProcessCodeMemory"},
{0x78, SvcWrap64<UnmapProcessCodeMemory>, "UnmapProcessCodeMemory"},
{0x79, nullptr, "CreateProcess"},
{0x7A, nullptr, "StartProcess"},
{0x7B, nullptr, "TerminateProcess"},
{0x7C, SvcWrap<GetProcessInfo>, "GetProcessInfo"},
{0x7D, SvcWrap<CreateResourceLimit>, "CreateResourceLimit"},
{0x7E, SvcWrap<SetResourceLimitLimitValue>, "SetResourceLimitLimitValue"},
{0x7C, SvcWrap64<GetProcessInfo>, "GetProcessInfo"},
{0x7D, SvcWrap64<CreateResourceLimit>, "CreateResourceLimit"},
{0x7E, SvcWrap64<SetResourceLimitLimitValue>, "SetResourceLimitLimitValue"},
{0x7F, nullptr, "CallSecureMonitor"},
};
static const FunctionDef* GetSVCInfo(u32 func_num) {
if (func_num >= std::size(SVC_Table)) {
static const FunctionDef* GetSVCInfo32(u32 func_num) {
if (func_num >= std::size(SVC_Table_32)) {
LOG_ERROR(Kernel_SVC, "Unknown svc=0x{:02X}", func_num);
return nullptr;
}
return &SVC_Table[func_num];
return &SVC_Table_32[func_num];
}
static const FunctionDef* GetSVCInfo64(u32 func_num) {
if (func_num >= std::size(SVC_Table_64)) {
LOG_ERROR(Kernel_SVC, "Unknown svc=0x{:02X}", func_num);
return nullptr;
}
return &SVC_Table_64[func_num];
}
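
The two lookup functions above bounds-check against their own table, which matters because SVC_Table_32 and SVC_Table_64 have different lengths. A reduced sketch of that dispatch follows; Entry, kTable32, kTable64, Lookup and FindEntry are hypothetical names, and the handlers return plain integers instead of touching emulator state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified version of FunctionDef.
struct Entry {
    std::uint32_t id;
    int (*func)();
    const char* name;
};

inline int Handler32() { return 32; }
inline int Handler64() { return 64; }

// Two tables of different sizes, as in the patch.
static const Entry kTable32[] = {
    {0x00, nullptr, "Unknown"},
    {0x01, Handler32, "Example32"},
};
static const Entry kTable64[] = {
    {0x00, nullptr, "Unknown"},
    {0x01, Handler64, "Example"},
    {0x02, nullptr, "Unimplemented"},
};

// The array size is deduced per table, so the bounds check can never be
// made against the wrong table's length.
template <std::size_t N>
const Entry* Lookup(const Entry (&table)[N], std::uint32_t func_num) {
    return func_num < N ? &table[func_num] : nullptr;
}

// Selects the table by process bitness, then performs the checked lookup.
inline const Entry* FindEntry(bool is_64_bit, std::uint32_t func_num) {
    return is_64_bit ? Lookup(kTable64, func_num) : Lookup(kTable32, func_num);
}
```

A caller still has to distinguish two failure modes, as CallSVC does: a null entry (number out of range) and a valid entry whose func is null (known but unimplemented).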
MICROPROFILE_DEFINE(Kernel_SVC, "Kernel", "SVC", MP_RGB(70, 200, 70));
@@ -2464,7 +2662,8 @@ void CallSVC(Core::System& system, u32 immediate) {
// Lock the global kernel mutex when we enter the kernel HLE.
std::lock_guard lock{HLE::g_hle_lock};
const FunctionDef* info = GetSVCInfo(immediate);
const FunctionDef* info = system.CurrentProcess()->Is64BitProcess() ? GetSVCInfo64(immediate)
: GetSVCInfo32(immediate);
if (info) {
if (info->func) {
info->func(system);


@@ -15,6 +15,10 @@ static inline u64 Param(const Core::System& system, int n) {
return system.CurrentArmInterface().GetReg(n);
}
static inline u32 Param32(const Core::System& system, int n) {
return static_cast<u32>(system.CurrentArmInterface().GetReg(n));
}
/**
* HLE a function return from the current ARM userland process
* @param system System context
@@ -24,40 +28,44 @@ static inline void FuncReturn(Core::System& system, u64 result) {
system.CurrentArmInterface().SetReg(0, result);
}
static inline void FuncReturn32(Core::System& system, u32 result) {
system.CurrentArmInterface().SetReg(0, static_cast<u64>(result));
}
////////////////////////////////////////////////////////////////////////////////////////////////////
// Function wrappers that return type ResultCode
template <ResultCode func(Core::System&, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0)).raw);
}
template <ResultCode func(Core::System&, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), Param(system, 1)).raw);
}
template <ResultCode func(Core::System&, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0))).raw);
}
template <ResultCode func(Core::System&, u32, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(
system,
func(system, static_cast<u32>(Param(system, 0)), static_cast<u32>(Param(system, 1))).raw);
}
template <ResultCode func(Core::System&, u32, u64, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0)), Param(system, 1),
Param(system, 2), Param(system, 3))
.raw);
}
template <ResultCode func(Core::System&, u32*)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param = 0;
const u32 retval = func(system, &param).raw;
system.CurrentArmInterface().SetReg(1, param);
@@ -65,7 +73,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32*, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, static_cast<u32>(Param(system, 1))).raw;
system.CurrentArmInterface().SetReg(1, param_1);
@@ -73,7 +81,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32*, u32*)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
u32 param_2 = 0;
const u32 retval = func(system, &param_1, &param_2).raw;
@@ -86,7 +94,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32*, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, Param(system, 1)).raw;
system.CurrentArmInterface().SetReg(1, param_1);
@@ -94,7 +102,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32*, u64, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval =
func(system, &param_1, Param(system, 1), static_cast<u32>(Param(system, 2))).raw;
@@ -104,7 +112,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u64*, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u64 param_1 = 0;
const u32 retval = func(system, &param_1, static_cast<u32>(Param(system, 1))).raw;
@@ -113,12 +121,12 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u64, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), static_cast<u32>(Param(system, 1))).raw);
}
template <ResultCode func(Core::System&, u64*, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u64 param_1 = 0;
const u32 retval = func(system, &param_1, Param(system, 1)).raw;
@@ -127,7 +135,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u64*, u32, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u64 param_1 = 0;
const u32 retval = func(system, &param_1, static_cast<u32>(Param(system, 1)),
static_cast<u32>(Param(system, 2)))
@@ -138,19 +146,19 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0)), Param(system, 1)).raw);
}
template <ResultCode func(Core::System&, u32, u32, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0)),
static_cast<u32>(Param(system, 1)), Param(system, 2))
.raw);
}
template <ResultCode func(Core::System&, u32, u32*, u64*)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
u64 param_2 = 0;
const ResultCode retval = func(system, static_cast<u32>(Param(system, 2)), &param_1, &param_2);
@@ -161,54 +169,54 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u64, u64, u32, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), Param(system, 1),
static_cast<u32>(Param(system, 2)), static_cast<u32>(Param(system, 3)))
.raw);
}
template <ResultCode func(Core::System&, u64, u64, u32, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), Param(system, 1),
static_cast<u32>(Param(system, 2)), Param(system, 3))
.raw);
}
template <ResultCode func(Core::System&, u32, u64, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0)), Param(system, 1),
static_cast<u32>(Param(system, 2)))
.raw);
}
template <ResultCode func(Core::System&, u64, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), Param(system, 1), Param(system, 2)).raw);
}
template <ResultCode func(Core::System&, u64, u64, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(
system,
func(system, Param(system, 0), Param(system, 1), static_cast<u32>(Param(system, 2))).raw);
}
template <ResultCode func(Core::System&, u32, u64, u64, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0)), Param(system, 1),
Param(system, 2), static_cast<u32>(Param(system, 3)))
.raw);
}
template <ResultCode func(Core::System&, u32, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(
system,
func(system, static_cast<u32>(Param(system, 0)), Param(system, 1), Param(system, 2)).raw);
}
template <ResultCode func(Core::System&, u32*, u64, u64, s64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, Param(system, 1), static_cast<u32>(Param(system, 2)),
static_cast<s64>(Param(system, 3)))
@@ -219,14 +227,14 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u64, u64, u32, s64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), Param(system, 1),
static_cast<u32>(Param(system, 2)), static_cast<s64>(Param(system, 3)))
.raw);
}
template <ResultCode func(Core::System&, u64*, u64, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u64 param_1 = 0;
const u32 retval =
func(system, &param_1, Param(system, 1), Param(system, 2), Param(system, 3)).raw;
@@ -236,7 +244,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32*, u64, u64, u64, u32, s32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, Param(system, 1), Param(system, 2), Param(system, 3),
static_cast<u32>(Param(system, 4)), static_cast<s32>(Param(system, 5)))
@@ -247,7 +255,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u32*, u64, u64, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, Param(system, 1), Param(system, 2),
static_cast<u32>(Param(system, 3)))
@@ -258,7 +266,7 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, Handle*, u64, u32, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, Param(system, 1), static_cast<u32>(Param(system, 2)),
static_cast<u32>(Param(system, 3)))
@@ -269,14 +277,14 @@ void SvcWrap(Core::System& system) {
}
template <ResultCode func(Core::System&, u64, u32, s32, s64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), static_cast<u32>(Param(system, 1)),
static_cast<s32>(Param(system, 2)), static_cast<s64>(Param(system, 3)))
.raw);
}
template <ResultCode func(Core::System&, u64, u32, s32, s32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system, Param(system, 0), static_cast<u32>(Param(system, 1)),
static_cast<s32>(Param(system, 2)), static_cast<s32>(Param(system, 3)))
.raw);
@@ -286,7 +294,7 @@ void SvcWrap(Core::System& system) {
// Function wrappers that return type u32
template <u32 func(Core::System&)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system));
}
@@ -294,7 +302,7 @@ void SvcWrap(Core::System& system) {
// Function wrappers that return type u64
template <u64 func(Core::System&)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
FuncReturn(system, func(system));
}
@@ -302,44 +310,110 @@ void SvcWrap(Core::System& system) {
/// Function wrappers that return type void
template <void func(Core::System&)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system);
}
template <void func(Core::System&, u32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, static_cast<u32>(Param(system, 0)));
}
template <void func(Core::System&, u32, u64, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, static_cast<u32>(Param(system, 0)), Param(system, 1), Param(system, 2),
Param(system, 3));
}
template <void func(Core::System&, s64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, static_cast<s64>(Param(system, 0)));
}
template <void func(Core::System&, u64, s32)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, Param(system, 0), static_cast<s32>(Param(system, 1)));
}
template <void func(Core::System&, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, Param(system, 0), Param(system, 1));
}
template <void func(Core::System&, u64, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, Param(system, 0), Param(system, 1), Param(system, 2));
}
template <void func(Core::System&, u32, u64, u64)>
void SvcWrap(Core::System& system) {
void SvcWrap64(Core::System& system) {
func(system, static_cast<u32>(Param(system, 0)), Param(system, 1), Param(system, 2));
}
// Used by QueryMemory32
template <ResultCode func(Core::System&, u32, u32, u32)>
void SvcWrap32(Core::System& system) {
FuncReturn32(system,
func(system, Param32(system, 0), Param32(system, 1), Param32(system, 2)).raw);
}
// Used by GetInfo32
template <ResultCode func(Core::System&, u32*, u32*, u32, u32, u32, u32)>
void SvcWrap32(Core::System& system) {
u32 param_1 = 0;
u32 param_2 = 0;
const u32 retval = func(system, &param_1, &param_2, Param32(system, 0), Param32(system, 1),
Param32(system, 2), Param32(system, 3))
.raw;
system.CurrentArmInterface().SetReg(1, param_1);
system.CurrentArmInterface().SetReg(2, param_2);
FuncReturn(system, retval);
}
// Used by GetThreadPriority32, ConnectToNamedPort32
template <ResultCode func(Core::System&, u32*, u32)>
void SvcWrap32(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, &param_1, Param32(system, 1)).raw;
system.CurrentArmInterface().SetReg(1, param_1);
FuncReturn(system, retval);
}
// Used by GetThreadId32
template <ResultCode func(Core::System&, u32*, u32*, u32)>
void SvcWrap32(Core::System& system) {
u32 param_1 = 0;
u32 param_2 = 0;
const u32 retval = func(system, &param_1, &param_2, Param32(system, 1)).raw;
system.CurrentArmInterface().SetReg(1, param_1);
system.CurrentArmInterface().SetReg(2, param_2);
FuncReturn(system, retval);
}
// Used by SignalProcessWideKey32
template <void func(Core::System&, u32, s32)>
void SvcWrap32(Core::System& system) {
func(system, static_cast<u32>(Param(system, 0)), static_cast<s32>(Param(system, 1)));
}
// Used by SendSyncRequest32
template <ResultCode func(Core::System&, u32)>
void SvcWrap32(Core::System& system) {
FuncReturn(system, func(system, static_cast<u32>(Param(system, 0))).raw);
}
// Used by WaitSynchronization32
template <ResultCode func(Core::System&, u32, u32, s32, u32, Handle*)>
void SvcWrap32(Core::System& system) {
u32 param_1 = 0;
const u32 retval = func(system, Param32(system, 0), Param32(system, 1), Param32(system, 2),
Param32(system, 3), &param_1)
.raw;
system.CurrentArmInterface().SetReg(1, param_1);
FuncReturn(system, retval);
}
} // namespace Kernel
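
Reviewer note: the SvcWrap32/SvcWrap64 overloads above are selected purely by the signature of the handler passed as a template argument. The following is a minimal, self-contained sketch of that dispatch idea, not the emulator's code. `FakeCpu`, `SumThree` and `DoubleInto` are hypothetical names introduced for illustration, and `Param32`/`FuncReturn32` are only modelled on the helpers used in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using u32 = std::uint32_t;

// Hypothetical stand-in for the guest CPU state; the real wrappers go through Core::System.
struct FakeCpu {
    std::array<u32, 16> regs{};
};

// Reads 32-bit argument register n (modelled on Param32 from the diff).
inline u32 Param32(const FakeCpu& cpu, std::size_t n) {
    assert(n < cpu.regs.size());
    return cpu.regs[n];
}

// Stores the SVC result code in r0 (modelled on FuncReturn32).
inline void FuncReturn32(FakeCpu& cpu, u32 result) {
    cpu.regs[0] = result;
}

// Overload picked for handlers that take three plain u32 arguments.
template <u32 func(FakeCpu&, u32, u32, u32)>
void SvcWrap32(FakeCpu& cpu) {
    FuncReturn32(cpu, func(cpu, Param32(cpu, 0), Param32(cpu, 1), Param32(cpu, 2)));
}

// Overload picked for handlers with one output value, which is written back to r1.
template <u32 func(FakeCpu&, u32*, u32)>
void SvcWrap32(FakeCpu& cpu) {
    u32 out = 0;
    const u32 result = func(cpu, &out, Param32(cpu, 1));
    cpu.regs[1] = out;
    FuncReturn32(cpu, result);
}

// Two hypothetical handlers used only to exercise the wrappers.
inline u32 SumThree(FakeCpu&, u32 a, u32 b, u32 c) {
    return a + b + c;
}

inline u32 DoubleInto(FakeCpu&, u32* out, u32 value) {
    *out = value * 2;
    return 0; // 0 stands for success in this sketch
}
```

Writing `SvcWrap32<SumThree>(cpu)` instantiates only the overload whose template parameter matches the handler's signature; the other candidate is discarded during template argument deduction, which is why one wrapper is needed per distinct handler signature.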


@@ -133,15 +133,16 @@ void Thread::CancelWait() {
ResumeFromWait();
}
/**
* Resets a thread context, making it ready to be scheduled and run by the CPU
* @param context Thread context to reset
* @param stack_top Address of the top of the stack
* @param entry_point Address of entry point for execution
* @param arg User argument for thread
*/
static void ResetThreadContext(Core::ARM_Interface::ThreadContext& context, VAddr stack_top,
VAddr entry_point, u64 arg) {
static void ResetThreadContext32(Core::ARM_Interface::ThreadContext32& context, u32 stack_top,
u32 entry_point, u32 arg) {
context = {};
context.cpu_registers[0] = arg;
context.cpu_registers[15] = entry_point;
context.cpu_registers[13] = stack_top;
}
static void ResetThreadContext64(Core::ARM_Interface::ThreadContext64& context, VAddr stack_top,
VAddr entry_point, u64 arg) {
context = {};
context.cpu_registers[0] = arg;
context.pc = entry_point;
@@ -198,9 +199,9 @@ ResultVal<std::shared_ptr<Thread>> Thread::Create(KernelCore& kernel, std::strin
thread->owner_process->RegisterThread(thread.get());
// TODO(peachum): move to ScheduleThread() when scheduler is added so selected core is used
// to initialize the context
ResetThreadContext(thread->context, stack_top, entry_point, arg);
ResetThreadContext32(thread->context_32, static_cast<u32>(stack_top),
static_cast<u32>(entry_point), static_cast<u32>(arg));
ResetThreadContext64(thread->context_64, stack_top, entry_point, arg);
return MakeResult<std::shared_ptr<Thread>>(std::move(thread));
}
@@ -213,11 +214,13 @@ void Thread::SetPriority(u32 priority) {
}
void Thread::SetWaitSynchronizationResult(ResultCode result) {
context.cpu_registers[0] = result.raw;
context_32.cpu_registers[0] = result.raw;
context_64.cpu_registers[0] = result.raw;
}
void Thread::SetWaitSynchronizationOutput(s32 output) {
context.cpu_registers[1] = output;
context_32.cpu_registers[1] = output;
context_64.cpu_registers[1] = output;
}
s32 Thread::GetSynchronizationObjectIndex(std::shared_ptr<SynchronizationObject> object) const {
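
Reviewer note: the hunks above keep two contexts per thread and initialise both at creation time. A rough sketch of that arrangement follows. The struct layouts are simplified assumptions (only the fields touched in the diff are modelled), the `sp` assignment is assumed because the 64-bit hunk is cut off before it, and `ThreadSketch`/`CreateThreadSketch` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

using u32 = std::uint32_t;
using u64 = std::uint64_t;

// Simplified stand-ins for Core::ARM_Interface::ThreadContext32/64.
struct ThreadContext32 {
    std::array<u32, 16> cpu_registers{};
};

struct ThreadContext64 {
    std::array<u64, 31> cpu_registers{};
    u64 sp{};
    u64 pc{};
};

// AArch32: r0 carries the argument, r13 is the stack pointer, r15 the program counter.
inline void ResetThreadContext32(ThreadContext32& context, u32 stack_top, u32 entry_point,
                                 u32 arg) {
    context = {};
    context.cpu_registers[0] = arg;
    context.cpu_registers[15] = entry_point;
    context.cpu_registers[13] = stack_top;
}

// AArch64: x0 carries the argument; PC and SP live outside the general-purpose registers.
inline void ResetThreadContext64(ThreadContext64& context, u64 stack_top, u64 entry_point,
                                 u64 arg) {
    context = {};
    context.cpu_registers[0] = arg;
    context.pc = entry_point;
    context.sp = stack_top; // assumed: this line is outside the hunk shown above
}

// Hypothetical holder mirroring the context_32/context_64 members added to Thread.
struct ThreadSketch {
    ThreadContext32 context_32;
    ThreadContext64 context_64;
};

// Mirrors the call site in Thread::Create: both contexts are always reset, and the
// 32-bit one receives the values truncated with static_cast<u32>.
inline ThreadSketch CreateThreadSketch(u64 stack_top, u64 entry_point, u64 arg) {
    ThreadSketch thread;
    ResetThreadContext32(thread.context_32, static_cast<u32>(stack_top),
                         static_cast<u32>(entry_point), static_cast<u32>(arg));
    ResetThreadContext64(thread.context_64, stack_top, entry_point, arg);
    return thread;
}
```

Resetting both contexts unconditionally keeps thread creation independent of the process bitness; the truncated 32-bit copy is simply unused when the process is 64-bit.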


@@ -102,7 +102,8 @@ public:
using MutexWaitingThreads = std::vector<std::shared_ptr<Thread>>;
using ThreadContext = Core::ARM_Interface::ThreadContext;
using ThreadContext32 = Core::ARM_Interface::ThreadContext32;
using ThreadContext64 = Core::ARM_Interface::ThreadContext64;
using ThreadSynchronizationObjects = std::vector<std::shared_ptr<SynchronizationObject>>;
@@ -273,12 +274,20 @@ public:
return status == ThreadStatus::WaitSynch;
}
ThreadContext& GetContext() {
return context;
ThreadContext32& GetContext32() {
return context_32;
}
const ThreadContext& GetContext() const {
return context;
const ThreadContext32& GetContext32() const {
return context_32;
}
ThreadContext64& GetContext64() {
return context_64;
}
const ThreadContext64& GetContext64() const {
return context_64;
}
ThreadStatus GetStatus() const {
@@ -466,7 +475,8 @@ private:
void AdjustSchedulingOnPriority(u32 old_priority);
void AdjustSchedulingOnAffinity(u64 old_affinity_mask, s32 old_core);
Core::ARM_Interface::ThreadContext context{};
ThreadContext32 context_32{};
ThreadContext64 context_64{};
u64 thread_id = 0;


@@ -607,7 +607,7 @@ ICommonStateGetter::ICommonStateGetter(Core::System& system,
{40, nullptr, "GetCradleFwVersion"},
{50, nullptr, "IsVrModeEnabled"},
{51, nullptr, "SetVrModeEnabled"},
{52, nullptr, "SwitchLcdBacklight"},
{52, &ICommonStateGetter::SetLcdBacklighOffEnabled, "SetLcdBacklighOffEnabled"},
{53, nullptr, "BeginVrModeEx"},
{54, nullptr, "EndVrModeEx"},
{55, nullptr, "IsInControllerFirmwareUpdateSection"},
@@ -636,7 +636,6 @@ void ICommonStateGetter::GetBootMode(Kernel::HLERequestContext& ctx) {
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u8>(static_cast<u8>(Service::PM::SystemBootMode::Normal)); // Normal boot mode
}
@@ -660,6 +659,7 @@ void ICommonStateGetter::ReceiveMessage(Kernel::HLERequestContext& ctx) {
rb.PushEnum<AppletMessageQueue::AppletMessage>(message);
return;
}
rb.Push(RESULT_SUCCESS);
rb.PushEnum<AppletMessageQueue::AppletMessage>(message);
}
@@ -672,6 +672,17 @@ void ICommonStateGetter::GetCurrentFocusState(Kernel::HLERequestContext& ctx) {
rb.Push(static_cast<u8>(FocusState::InFocus));
}
void ICommonStateGetter::SetLcdBacklighOffEnabled(Kernel::HLERequestContext& ctx) {
IPC::RequestParser rp{ctx};
const auto is_lcd_backlight_off_enabled = rp.Pop<bool>();
LOG_WARNING(Service_AM, "(STUBBED) called. is_lcd_backlight_off_enabled={}",
is_lcd_backlight_off_enabled);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
}
void ICommonStateGetter::GetDefaultDisplayResolutionChangeEvent(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_AM, "called");


@@ -182,6 +182,7 @@ private:
void GetOperationMode(Kernel::HLERequestContext& ctx);
void GetPerformanceMode(Kernel::HLERequestContext& ctx);
void GetBootMode(Kernel::HLERequestContext& ctx);
void SetLcdBacklighOffEnabled(Kernel::HLERequestContext& ctx);
void GetDefaultDisplayResolution(Kernel::HLERequestContext& ctx);
void SetCpuBoostMode(Kernel::HLERequestContext& ctx);


@@ -287,13 +287,13 @@ void Controller_NPad::RequestPadStateUpdate(u32 npad_id) {
analog_state[static_cast<std::size_t>(JoystickId::Joystick_Left)]->GetAnalogDirectionStatus(
Input::AnalogDirection::DOWN));
pad_state.r_stick_up.Assign(analog_state[static_cast<std::size_t>(JoystickId::Joystick_Right)]
->GetAnalogDirectionStatus(Input::AnalogDirection::RIGHT));
pad_state.r_stick_left.Assign(analog_state[static_cast<std::size_t>(JoystickId::Joystick_Right)]
->GetAnalogDirectionStatus(Input::AnalogDirection::LEFT));
pad_state.r_stick_right.Assign(
analog_state[static_cast<std::size_t>(JoystickId::Joystick_Right)]
->GetAnalogDirectionStatus(Input::AnalogDirection::UP));
->GetAnalogDirectionStatus(Input::AnalogDirection::RIGHT));
pad_state.r_stick_left.Assign(analog_state[static_cast<std::size_t>(JoystickId::Joystick_Right)]
->GetAnalogDirectionStatus(Input::AnalogDirection::LEFT));
pad_state.r_stick_up.Assign(analog_state[static_cast<std::size_t>(JoystickId::Joystick_Right)]
->GetAnalogDirectionStatus(Input::AnalogDirection::UP));
pad_state.r_stick_down.Assign(analog_state[static_cast<std::size_t>(JoystickId::Joystick_Right)]
->GetAnalogDirectionStatus(Input::AnalogDirection::DOWN));


@@ -129,12 +129,6 @@ AppLoader_DeconstructedRomDirectory::LoadResult AppLoader_DeconstructedRomDirect
}
metadata.Print();
const FileSys::ProgramAddressSpaceType arch_bits{metadata.GetAddressSpaceType()};
if (arch_bits == FileSys::ProgramAddressSpaceType::Is32Bit ||
arch_bits == FileSys::ProgramAddressSpaceType::Is32BitNoMap) {
return {ResultStatus::Error32BitISA, {}};
}
if (process.LoadFromMetadata(metadata).IsError()) {
return {ResultStatus::ErrorUnableToParseKernelMetadata, {}};
}


@@ -111,7 +111,7 @@ json GetProcessorStateDataAuto(Core::System& system) {
const auto& vm_manager{process->VMManager()};
auto& arm{system.CurrentArmInterface()};
Core::ARM_Interface::ThreadContext context{};
Core::ARM_Interface::ThreadContext64 context{};
arm.SaveContext(context);
return GetProcessorStateData(process->Is64BitProcess() ? "AArch64" : "AArch32",


@@ -94,6 +94,7 @@ void LogSettings() {
LogSetting("Renderer_UseAccurateGpuEmulation", Settings::values.use_accurate_gpu_emulation);
LogSetting("Renderer_UseAsynchronousGpuEmulation",
Settings::values.use_asynchronous_gpu_emulation);
LogSetting("Renderer_UseVsync", Settings::values.use_vsync);
LogSetting("Audio_OutputEngine", Settings::values.sink_id);
LogSetting("Audio_EnableAudioStretching", Settings::values.enable_audio_stretching);
LogSetting("Audio_OutputDevice", Settings::values.audio_device_id);


@@ -430,11 +430,13 @@ struct Values {
float resolution_factor;
int aspect_ratio;
int max_anisotropy;
bool use_frame_limit;
u16 frame_limit;
bool use_disk_shader_cache;
bool use_accurate_gpu_emulation;
bool use_asynchronous_gpu_emulation;
bool use_vsync;
bool force_30fps_mode;
float bg_red;


@@ -188,6 +188,7 @@ void TelemetrySession::AddInitialInfo(Loader::AppLoader& app_loader) {
Settings::values.use_accurate_gpu_emulation);
AddField(field_type, "Renderer_UseAsynchronousGpuEmulation",
Settings::values.use_asynchronous_gpu_emulation);
AddField(field_type, "Renderer_UseVsync", Settings::values.use_vsync);
AddField(field_type, "System_UseDockedMode", Settings::values.use_docked_mode);
}


@@ -34,6 +34,20 @@ public:
y * coef * (x == 0 ? 1.0f : SQRT_HALF));
}
bool GetAnalogDirectionStatus(Input::AnalogDirection direction) const override {
switch (direction) {
case Input::AnalogDirection::RIGHT:
return right->GetStatus();
case Input::AnalogDirection::LEFT:
return left->GetStatus();
case Input::AnalogDirection::UP:
return up->GetStatus();
case Input::AnalogDirection::DOWN:
return down->GetStatus();
}
return false;
}
private:
Button up;
Button down;


@@ -32,8 +32,16 @@ public:
SocketCallback callback)
: callback(std::move(callback)), timer(io_service),
socket(io_service, udp::endpoint(udp::v4(), 0)), client_id(client_id),
pad_index(pad_index),
send_endpoint(udp::endpoint(boost::asio::ip::make_address_v4(host), port)) {}
pad_index(pad_index) {
boost::system::error_code ec{};
auto ipv4 = boost::asio::ip::make_address_v4(host, ec);
if (ec.failed()) {
LOG_ERROR(Input, "Invalid IPv4 address \"{}\" provided to socket", host);
ipv4 = boost::asio::ip::address_v4{};
}
send_endpoint = {udp::endpoint(ipv4, port)};
}
void Stop() {
io_service.stop();
@@ -85,17 +93,18 @@ private:
}
void HandleSend(const boost::system::error_code& error) {
boost::system::error_code _ignored{};
// Send a request for getting port info for the pad
Request::PortInfo port_info{1, {pad_index, 0, 0, 0}};
const auto port_message = Request::Create(port_info, client_id);
std::memcpy(&send_buffer1, &port_message, PORT_INFO_SIZE);
socket.send_to(boost::asio::buffer(send_buffer1), send_endpoint);
socket.send_to(boost::asio::buffer(send_buffer1), send_endpoint, {}, _ignored);
// Send a request for getting pad data for the pad
Request::PadData pad_data{Request::PadData::Flags::Id, pad_index, EMPTY_MAC_ADDRESS};
const auto pad_message = Request::Create(pad_data, client_id);
std::memcpy(send_buffer2.data(), &pad_message, PAD_DATA_SIZE);
socket.send_to(boost::asio::buffer(send_buffer2), send_endpoint);
socket.send_to(boost::asio::buffer(send_buffer2), send_endpoint, {}, _ignored);
StartSend(timer.expiry());
}


@@ -31,7 +31,6 @@ namespace Response {
*/
std::optional<Type> Validate(u8* data, std::size_t size) {
if (size < sizeof(Header)) {
LOG_DEBUG(Input, "Invalid UDP packet received");
return std::nullopt;
}
Header header{};


@@ -2,6 +2,8 @@ add_library(video_core STATIC
buffer_cache/buffer_block.h
buffer_cache/buffer_cache.h
buffer_cache/map_interval.h
dirty_flags.cpp
dirty_flags.h
dma_pusher.cpp
dma_pusher.h
engines/const_buffer_engine_interface.h
@@ -69,8 +71,8 @@ add_library(video_core STATIC
renderer_opengl/gl_shader_manager.h
renderer_opengl/gl_shader_util.cpp
renderer_opengl/gl_shader_util.h
renderer_opengl/gl_state.cpp
renderer_opengl/gl_state.h
renderer_opengl/gl_state_tracker.cpp
renderer_opengl/gl_state_tracker.h
renderer_opengl/gl_stream_buffer.cpp
renderer_opengl/gl_stream_buffer.h
renderer_opengl/gl_texture_cache.cpp
@@ -198,6 +200,8 @@ if (ENABLE_VULKAN)
renderer_vulkan/vk_shader_util.h
renderer_vulkan/vk_staging_buffer_pool.cpp
renderer_vulkan/vk_staging_buffer_pool.h
renderer_vulkan/vk_state_tracker.cpp
renderer_vulkan/vk_state_tracker.h
renderer_vulkan/vk_stream_buffer.cpp
renderer_vulkan/vk_stream_buffer.h
renderer_vulkan/vk_swapchain.cpp


@@ -0,0 +1,46 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <array>
#include <cstddef>
#include "common/common_types.h"
#include "video_core/dirty_flags.h"
#define OFF(field_name) MAXWELL3D_REG_INDEX(field_name)
#define NUM(field_name) (sizeof(::Tegra::Engines::Maxwell3D::Regs::field_name) / sizeof(u32))
namespace VideoCommon::Dirty {
using Tegra::Engines::Maxwell3D;
void SetupCommonOnWriteStores(Tegra::Engines::Maxwell3D::DirtyState::Flags& store) {
store[RenderTargets] = true;
store[ZetaBuffer] = true;
for (std::size_t i = 0; i < Maxwell3D::Regs::NumRenderTargets; ++i) {
store[ColorBuffer0 + i] = true;
}
}
void SetupDirtyRenderTargets(Tegra::Engines::Maxwell3D::DirtyState::Tables& tables) {
static constexpr std::size_t num_per_rt = NUM(rt[0]);
static constexpr std::size_t begin = OFF(rt);
static constexpr std::size_t num = num_per_rt * Maxwell3D::Regs::NumRenderTargets;
for (std::size_t rt = 0; rt < Maxwell3D::Regs::NumRenderTargets; ++rt) {
FillBlock(tables[0], begin + rt * num_per_rt, num_per_rt, ColorBuffer0 + rt);
}
FillBlock(tables[1], begin, num, RenderTargets);
static constexpr std::array zeta_flags{ZetaBuffer, RenderTargets};
for (std::size_t i = 0; i < std::size(zeta_flags); ++i) {
const u8 flag = zeta_flags[i];
auto& table = tables[i];
table[OFF(zeta_enable)] = flag;
table[OFF(zeta_width)] = flag;
table[OFF(zeta_height)] = flag;
FillBlock(table, OFF(zeta), NUM(zeta), flag);
}
}
} // namespace VideoCommon::Dirty


@@ -0,0 +1,51 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <algorithm>
#include <cstddef>
#include <iterator>
#include "common/common_types.h"
#include "video_core/engines/maxwell_3d.h"
namespace VideoCommon::Dirty {
enum : u8 {
NullEntry = 0,
RenderTargets,
ColorBuffer0,
ColorBuffer1,
ColorBuffer2,
ColorBuffer3,
ColorBuffer4,
ColorBuffer5,
ColorBuffer6,
ColorBuffer7,
ZetaBuffer,
LastCommonEntry,
};
template <typename Integer>
void FillBlock(Tegra::Engines::Maxwell3D::DirtyState::Table& table, std::size_t begin,
std::size_t num, Integer dirty_index) {
const auto it = std::begin(table) + begin;
std::fill(it, it + num, static_cast<u8>(dirty_index));
}
template <typename Integer1, typename Integer2>
void FillBlock(Tegra::Engines::Maxwell3D::DirtyState::Tables& tables, std::size_t begin,
std::size_t num, Integer1 index_a, Integer2 index_b) {
FillBlock(tables[0], begin, num, index_a);
FillBlock(tables[1], begin, num, index_b);
}
void SetupCommonOnWriteStores(Tegra::Engines::Maxwell3D::DirtyState::Flags& store);
void SetupDirtyRenderTargets(Tegra::Engines::Maxwell3D::DirtyState::Tables& tables);
} // namespace VideoCommon::Dirty
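
Reviewer note: the header above maps every register index to a dirty-flag index through small lookup tables. The sketch below shows the idea on a toy register file. The register count, the register ranges and the names `DirtyState`, `MakeDirtyState` and `OnRegisterWrite` are assumptions made for illustration; only `FillBlock` and the flag names follow the diff.

```cpp
#include <algorithm>
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

using u8 = std::uint8_t;

constexpr std::size_t NumRegs = 64; // hypothetical register count

using Table = std::array<u8, NumRegs>;
using Tables = std::array<Table, 2>;
using Flags = std::bitset<256>;

enum : u8 {
    NullEntry = 0, // untracked registers point here
    RenderTargets,
    ColorBuffer0,
    ColorBuffer1,
    ZetaBuffer,
};

// Marks registers [begin, begin + num) as belonging to dirty_index.
inline void FillBlock(Table& table, std::size_t begin, std::size_t num, u8 dirty_index) {
    const auto it = table.begin() + static_cast<std::ptrdiff_t>(begin);
    std::fill(it, it + static_cast<std::ptrdiff_t>(num), dirty_index);
}

struct DirtyState {
    Flags flags;
    Tables tables{};
};

// Toy layout: two render targets of 8 registers each, then a 4-register zeta block.
inline DirtyState MakeDirtyState() {
    DirtyState dirty;
    FillBlock(dirty.tables[0], 8, 8, ColorBuffer0);
    FillBlock(dirty.tables[0], 16, 8, ColorBuffer1);
    FillBlock(dirty.tables[1], 8, 16, RenderTargets);
    FillBlock(dirty.tables[0], 30, 4, ZetaBuffer);
    FillBlock(dirty.tables[1], 30, 4, RenderTargets);
    return dirty;
}

// A register write raises one flag per table; NullEntry absorbs untracked registers.
inline void OnRegisterWrite(DirtyState& dirty, std::size_t method) {
    for (const auto& table : dirty.tables) {
        dirty.flags[table[method]] = true;
    }
}
```

Using two tables lets a single register raise both a fine-grained flag (one color buffer) and a coarse one (all render targets) without any branching on the write path.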

View File

@@ -22,7 +22,7 @@ void DmaPusher::DispatchCalls() {
MICROPROFILE_SCOPE(DispatchCalls);
// On entering GPU code, assume all memory may be touched by the ARM core.
gpu.Maxwell3D().dirty.OnMemoryWrite();
gpu.Maxwell3D().OnMemoryWrite();
dma_pushbuffer_subindex = 0;


@@ -39,7 +39,7 @@ void KeplerCompute::CallMethod(const GPU::MethodCall& method_call) {
const bool is_last_call = method_call.IsLastCall();
upload_state.ProcessData(method_call.argument, is_last_call);
if (is_last_call) {
system.GPU().Maxwell3D().dirty.OnMemoryWrite();
system.GPU().Maxwell3D().OnMemoryWrite();
}
break;
}


@@ -34,7 +34,7 @@ void KeplerMemory::CallMethod(const GPU::MethodCall& method_call) {
const bool is_last_call = method_call.IsLastCall();
upload_state.ProcessData(method_call.argument, is_last_call);
if (is_last_call) {
system.GPU().Maxwell3D().dirty.OnMemoryWrite();
system.GPU().Maxwell3D().OnMemoryWrite();
}
break;
}


@@ -26,7 +26,8 @@ Maxwell3D::Maxwell3D(Core::System& system, VideoCore::RasterizerInterface& raste
MemoryManager& memory_manager)
: system{system}, rasterizer{rasterizer}, memory_manager{memory_manager},
macro_interpreter{*this}, upload_state{memory_manager, regs.upload} {
InitDirtySettings();
dirty.flags.flip();
InitializeRegisterDefaults();
}
@@ -75,8 +76,8 @@ void Maxwell3D::InitializeRegisterDefaults() {
regs.stencil_back_mask = 0xFFFFFFFF;
regs.depth_test_func = Regs::ComparisonOp::Always;
regs.cull.front_face = Regs::Cull::FrontFace::CounterClockWise;
regs.cull.cull_face = Regs::Cull::CullFace::Back;
regs.front_face = Regs::FrontFace::CounterClockWise;
regs.cull_face = Regs::CullFace::Back;
// TODO(Rodrigo): Most games do not set a point size. I think this is a case of a
// register carrying a default value. Assume it's OpenGL's default (1).
@@ -95,7 +96,7 @@ void Maxwell3D::InitializeRegisterDefaults() {
regs.rasterize_enable = 1;
regs.rt_separate_frag_data = 1;
regs.framebuffer_srgb = 1;
regs.cull.front_face = Maxwell3D::Regs::Cull::FrontFace::ClockWise;
regs.front_face = Maxwell3D::Regs::FrontFace::ClockWise;
mme_inline[MAXWELL3D_REG_INDEX(draw.vertex_end_gl)] = true;
mme_inline[MAXWELL3D_REG_INDEX(draw.vertex_begin_gl)] = true;
@@ -103,164 +104,6 @@ void Maxwell3D::InitializeRegisterDefaults() {
mme_inline[MAXWELL3D_REG_INDEX(index_array.count)] = true;
}
#define DIRTY_REGS_POS(field_name) static_cast<u8>(offsetof(Maxwell3D::DirtyRegs, field_name))
void Maxwell3D::InitDirtySettings() {
const auto set_block = [this](std::size_t start, std::size_t range, u8 position) {
const auto start_itr = dirty_pointers.begin() + start;
const auto end_itr = start_itr + range;
std::fill(start_itr, end_itr, position);
};
dirty.regs.fill(true);
// Init Render Targets
constexpr u32 registers_per_rt = sizeof(regs.rt[0]) / sizeof(u32);
constexpr u32 rt_start_reg = MAXWELL3D_REG_INDEX(rt);
constexpr u32 rt_end_reg = rt_start_reg + registers_per_rt * 8;
u8 rt_dirty_reg = DIRTY_REGS_POS(render_target);
for (u32 rt_reg = rt_start_reg; rt_reg < rt_end_reg; rt_reg += registers_per_rt) {
set_block(rt_reg, registers_per_rt, rt_dirty_reg);
++rt_dirty_reg;
}
constexpr u32 depth_buffer_flag = DIRTY_REGS_POS(depth_buffer);
dirty_pointers[MAXWELL3D_REG_INDEX(zeta_enable)] = depth_buffer_flag;
dirty_pointers[MAXWELL3D_REG_INDEX(zeta_width)] = depth_buffer_flag;
dirty_pointers[MAXWELL3D_REG_INDEX(zeta_height)] = depth_buffer_flag;
constexpr u32 registers_in_zeta = sizeof(regs.zeta) / sizeof(u32);
constexpr u32 zeta_reg = MAXWELL3D_REG_INDEX(zeta);
set_block(zeta_reg, registers_in_zeta, depth_buffer_flag);
// Init Vertex Arrays
constexpr u32 vertex_array_start = MAXWELL3D_REG_INDEX(vertex_array);
constexpr u32 vertex_array_size = sizeof(regs.vertex_array[0]) / sizeof(u32);
constexpr u32 vertex_array_end = vertex_array_start + vertex_array_size * Regs::NumVertexArrays;
u8 va_dirty_reg = DIRTY_REGS_POS(vertex_array);
u8 vi_dirty_reg = DIRTY_REGS_POS(vertex_instance);
for (u32 vertex_reg = vertex_array_start; vertex_reg < vertex_array_end;
vertex_reg += vertex_array_size) {
set_block(vertex_reg, 3, va_dirty_reg);
// The divisor concerns vertex array instances
dirty_pointers[static_cast<std::size_t>(vertex_reg) + 3] = vi_dirty_reg;
++va_dirty_reg;
++vi_dirty_reg;
}
constexpr u32 vertex_limit_start = MAXWELL3D_REG_INDEX(vertex_array_limit);
constexpr u32 vertex_limit_size = sizeof(regs.vertex_array_limit[0]) / sizeof(u32);
constexpr u32 vertex_limit_end = vertex_limit_start + vertex_limit_size * Regs::NumVertexArrays;
va_dirty_reg = DIRTY_REGS_POS(vertex_array);
for (u32 vertex_reg = vertex_limit_start; vertex_reg < vertex_limit_end;
vertex_reg += vertex_limit_size) {
set_block(vertex_reg, vertex_limit_size, va_dirty_reg);
va_dirty_reg++;
}
constexpr u32 vertex_instance_start = MAXWELL3D_REG_INDEX(instanced_arrays);
constexpr u32 vertex_instance_size =
sizeof(regs.instanced_arrays.is_instanced[0]) / sizeof(u32);
constexpr u32 vertex_instance_end =
vertex_instance_start + vertex_instance_size * Regs::NumVertexArrays;
vi_dirty_reg = DIRTY_REGS_POS(vertex_instance);
for (u32 vertex_reg = vertex_instance_start; vertex_reg < vertex_instance_end;
vertex_reg += vertex_instance_size) {
set_block(vertex_reg, vertex_instance_size, vi_dirty_reg);
vi_dirty_reg++;
}
set_block(MAXWELL3D_REG_INDEX(vertex_attrib_format), regs.vertex_attrib_format.size(),
DIRTY_REGS_POS(vertex_attrib_format));
// Init Shaders
constexpr u32 shader_registers_count =
sizeof(regs.shader_config[0]) * Regs::MaxShaderProgram / sizeof(u32);
set_block(MAXWELL3D_REG_INDEX(shader_config[0]), shader_registers_count,
DIRTY_REGS_POS(shaders));
// State
// Viewport
constexpr u8 viewport_dirty_reg = DIRTY_REGS_POS(viewport);
constexpr u32 viewport_start = MAXWELL3D_REG_INDEX(viewports);
constexpr u32 viewport_size = sizeof(regs.viewports) / sizeof(u32);
set_block(viewport_start, viewport_size, viewport_dirty_reg);
constexpr u32 view_volume_start = MAXWELL3D_REG_INDEX(view_volume_clip_control);
constexpr u32 view_volume_size = sizeof(regs.view_volume_clip_control) / sizeof(u32);
set_block(view_volume_start, view_volume_size, viewport_dirty_reg);
// Viewport transformation
constexpr u32 viewport_trans_start = MAXWELL3D_REG_INDEX(viewport_transform);
constexpr u32 viewport_trans_size = sizeof(regs.viewport_transform) / sizeof(u32);
set_block(viewport_trans_start, viewport_trans_size, DIRTY_REGS_POS(viewport_transform));
// Cullmode
constexpr u32 cull_mode_start = MAXWELL3D_REG_INDEX(cull);
constexpr u32 cull_mode_size = sizeof(regs.cull) / sizeof(u32);
set_block(cull_mode_start, cull_mode_size, DIRTY_REGS_POS(cull_mode));
// Screen y control
dirty_pointers[MAXWELL3D_REG_INDEX(screen_y_control)] = DIRTY_REGS_POS(screen_y_control);
// Primitive Restart
constexpr u32 primitive_restart_start = MAXWELL3D_REG_INDEX(primitive_restart);
constexpr u32 primitive_restart_size = sizeof(regs.primitive_restart) / sizeof(u32);
set_block(primitive_restart_start, primitive_restart_size, DIRTY_REGS_POS(primitive_restart));
// Depth Test
constexpr u8 depth_test_dirty_reg = DIRTY_REGS_POS(depth_test);
dirty_pointers[MAXWELL3D_REG_INDEX(depth_test_enable)] = depth_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(depth_write_enabled)] = depth_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(depth_test_func)] = depth_test_dirty_reg;
// Stencil Test
constexpr u32 stencil_test_dirty_reg = DIRTY_REGS_POS(stencil_test);
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_enable)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_func_func)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_func_ref)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_func_mask)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_op_fail)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_op_zfail)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_op_zpass)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_front_mask)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_two_side_enable)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_func_func)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_func_ref)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_func_mask)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_op_fail)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_op_zfail)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_op_zpass)] = stencil_test_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(stencil_back_mask)] = stencil_test_dirty_reg;
// Color Mask
constexpr u8 color_mask_dirty_reg = DIRTY_REGS_POS(color_mask);
dirty_pointers[MAXWELL3D_REG_INDEX(color_mask_common)] = color_mask_dirty_reg;
set_block(MAXWELL3D_REG_INDEX(color_mask), sizeof(regs.color_mask) / sizeof(u32),
color_mask_dirty_reg);
// Blend State
constexpr u8 blend_state_dirty_reg = DIRTY_REGS_POS(blend_state);
set_block(MAXWELL3D_REG_INDEX(blend_color), sizeof(regs.blend_color) / sizeof(u32),
blend_state_dirty_reg);
dirty_pointers[MAXWELL3D_REG_INDEX(independent_blend_enable)] = blend_state_dirty_reg;
set_block(MAXWELL3D_REG_INDEX(blend), sizeof(regs.blend) / sizeof(u32), blend_state_dirty_reg);
set_block(MAXWELL3D_REG_INDEX(independent_blend), sizeof(regs.independent_blend) / sizeof(u32),
blend_state_dirty_reg);
// Scissor State
constexpr u8 scissor_test_dirty_reg = DIRTY_REGS_POS(scissor_test);
set_block(MAXWELL3D_REG_INDEX(scissor_test), sizeof(regs.scissor_test) / sizeof(u32),
scissor_test_dirty_reg);
// Polygon Offset
constexpr u8 polygon_offset_dirty_reg = DIRTY_REGS_POS(polygon_offset);
dirty_pointers[MAXWELL3D_REG_INDEX(polygon_offset_fill_enable)] = polygon_offset_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(polygon_offset_line_enable)] = polygon_offset_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(polygon_offset_point_enable)] = polygon_offset_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(polygon_offset_units)] = polygon_offset_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(polygon_offset_factor)] = polygon_offset_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(polygon_offset_clamp)] = polygon_offset_dirty_reg;
// Depth bounds
constexpr u8 depth_bounds_values_dirty_reg = DIRTY_REGS_POS(depth_bounds_values);
dirty_pointers[MAXWELL3D_REG_INDEX(depth_bounds[0])] = depth_bounds_values_dirty_reg;
dirty_pointers[MAXWELL3D_REG_INDEX(depth_bounds[1])] = depth_bounds_values_dirty_reg;
}
void Maxwell3D::CallMacroMethod(u32 method, std::size_t num_parameters, const u32* parameters) {
// Reset the current macro.
executing_macro = 0;
@@ -319,19 +162,9 @@ void Maxwell3D::CallMethod(const GPU::MethodCall& method_call) {
if (regs.reg_array[method] != method_call.argument) {
regs.reg_array[method] = method_call.argument;
const std::size_t dirty_reg = dirty_pointers[method];
if (dirty_reg) {
dirty.regs[dirty_reg] = true;
if (dirty_reg >= DIRTY_REGS_POS(vertex_array) &&
dirty_reg < DIRTY_REGS_POS(vertex_array_buffers)) {
dirty.vertex_array_buffers = true;
} else if (dirty_reg >= DIRTY_REGS_POS(vertex_instance) &&
dirty_reg < DIRTY_REGS_POS(vertex_instances)) {
dirty.vertex_instances = true;
} else if (dirty_reg >= DIRTY_REGS_POS(render_target) &&
dirty_reg < DIRTY_REGS_POS(render_settings)) {
dirty.render_settings = true;
}
for (const auto& table : dirty.tables) {
dirty.flags[table[method]] = true;
}
}
@@ -419,7 +252,7 @@ void Maxwell3D::CallMethod(const GPU::MethodCall& method_call) {
const bool is_last_call = method_call.IsLastCall();
upload_state.ProcessData(method_call.argument, is_last_call);
if (is_last_call) {
dirty.OnMemoryWrite();
OnMemoryWrite();
}
break;
}
@@ -727,7 +560,7 @@ void Maxwell3D::FinishCBData() {
const u32 id = cb_data_state.id;
memory_manager.WriteBlock(address, cb_data_state.buffer[id].data(), size);
dirty.OnMemoryWrite();
OnMemoryWrite();
cb_data_state.id = null_cb_data;
cb_data_state.current = null_cb_data;


@@ -6,6 +6,7 @@
#include <array>
#include <bitset>
#include <limits>
#include <optional>
#include <type_traits>
#include <unordered_map>
@@ -431,21 +432,15 @@ public:
GeneratedPrimitives = 0x1F,
};
struct Cull {
enum class FrontFace : u32 {
ClockWise = 0x0900,
CounterClockWise = 0x0901,
};
enum class FrontFace : u32 {
ClockWise = 0x0900,
CounterClockWise = 0x0901,
};
enum class CullFace : u32 {
Front = 0x0404,
Back = 0x0405,
FrontAndBack = 0x0408,
};
u32 enabled;
FrontFace front_face;
CullFace cull_face;
enum class CullFace : u32 {
Front = 0x0404,
Back = 0x0405,
FrontAndBack = 0x0408,
};
struct Blend {
@@ -574,7 +569,7 @@ public:
f32 translate_z;
INSERT_UNION_PADDING_WORDS(2);
Common::Rectangle<s32> GetRect() const {
Common::Rectangle<f32> GetRect() const {
return {
GetX(), // left
GetY() + GetHeight(), // top
@@ -583,20 +578,20 @@ public:
};
};
s32 GetX() const {
return static_cast<s32>(std::max(0.0f, translate_x - std::fabs(scale_x)));
f32 GetX() const {
return std::max(0.0f, translate_x - std::fabs(scale_x));
}
s32 GetY() const {
return static_cast<s32>(std::max(0.0f, translate_y - std::fabs(scale_y)));
f32 GetY() const {
return std::max(0.0f, translate_y - std::fabs(scale_y));
}
s32 GetWidth() const {
return static_cast<s32>(translate_x + std::fabs(scale_x)) - GetX();
f32 GetWidth() const {
return translate_x + std::fabs(scale_x) - GetX();
}
s32 GetHeight() const {
return static_cast<s32>(translate_y + std::fabs(scale_y)) - GetY();
f32 GetHeight() const {
return translate_y + std::fabs(scale_y) - GetY();
}
};
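The hunk above switches the viewport getters from `s32` to `f32`, so fractional viewport origins and sizes are no longer truncated. A minimal sketch of the difference along the X axis only; `ViewportSketch` and the `*Truncated` helpers are hypothetical names that mirror the old and new formulas from the diff, not code from the repository:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical, X-only reduction of the viewport transform registers.
struct ViewportSketch {
    float scale_x;
    float translate_x;

    // New behaviour: keep the fractional part.
    float GetX() const {
        return std::max(0.0f, translate_x - std::fabs(scale_x));
    }
    float GetWidth() const {
        return translate_x + std::fabs(scale_x) - GetX();
    }

    // Old behaviour: each term is truncated to an integer first.
    int GetXTruncated() const {
        return static_cast<int>(GetX());
    }
    int GetWidthTruncated() const {
        return static_cast<int>(translate_x + std::fabs(scale_x)) - GetXTruncated();
    }
};
```

With `scale_x = 50.25` and `translate_x = 100.5`, the float getters keep the origin at 50.25 and the width at 100.5, while the truncating variants round both down to whole pixels.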
@@ -872,16 +867,7 @@ public:
INSERT_UNION_PADDING_WORDS(0x35);
union {
BitField<0, 1, u32> c0;
BitField<1, 1, u32> c1;
BitField<2, 1, u32> c2;
BitField<3, 1, u32> c3;
BitField<4, 1, u32> c4;
BitField<5, 1, u32> c5;
BitField<6, 1, u32> c6;
BitField<7, 1, u32> c7;
} clip_distance_enabled;
u32 clip_distance_enabled;
u32 samplecnt_enable;
@@ -1060,7 +1046,9 @@ public:
INSERT_UNION_PADDING_WORDS(1);
Cull cull;
u32 cull_test_enabled;
FrontFace front_face;
CullFace cull_face;
u32 pixel_center_integer;
@@ -1238,79 +1226,6 @@ public:
State state{};
struct DirtyRegs {
static constexpr std::size_t NUM_REGS = 256;
static_assert(NUM_REGS - 1 <= std::numeric_limits<u8>::max());
union {
struct {
bool null_dirty;
// Vertex Attributes
bool vertex_attrib_format;
// Vertex Arrays
std::array<bool, 32> vertex_array;
bool vertex_array_buffers;
// Vertex Instances
std::array<bool, 32> vertex_instance;
bool vertex_instances;
// Render Targets
std::array<bool, 8> render_target;
bool depth_buffer;
bool render_settings;
// Shaders
bool shaders;
// Rasterizer State
bool viewport;
bool clip_coefficient;
bool cull_mode;
bool primitive_restart;
bool depth_test;
bool stencil_test;
bool blend_state;
bool scissor_test;
bool transform_feedback;
bool color_mask;
bool polygon_offset;
bool depth_bounds_values;
// Complementary
bool viewport_transform;
bool screen_y_control;
bool memory_general;
};
std::array<bool, NUM_REGS> regs;
};
void ResetVertexArrays() {
vertex_array.fill(true);
vertex_array_buffers = true;
}
void ResetRenderTargets() {
depth_buffer = true;
render_target.fill(true);
render_settings = true;
}
void OnMemoryWrite() {
shaders = true;
memory_general = true;
ResetRenderTargets();
ResetVertexArrays();
}
} dirty{};
/// Reads a register value located at the input method address
u32 GetRegisterValue(u32 method) const;
@@ -1356,6 +1271,11 @@ public:
return execute_on;
}
/// Notifies that a memory write has happened.
void OnMemoryWrite() {
dirty.flags |= dirty.on_write_stores;
}
enum class MMEDrawMode : u32 {
Undefined,
Array,
@@ -1371,6 +1291,16 @@ public:
u32 gl_end_count{};
} mme_draw;
struct DirtyState {
using Flags = std::bitset<std::numeric_limits<u8>::max()>;
using Table = std::array<u8, Regs::NUM_REGS>;
using Tables = std::array<Table, 2>;
Flags flags;
Flags on_write_stores;
Tables tables{};
} dirty;
private:
void InitializeRegisterDefaults();
@@ -1417,8 +1347,6 @@ private:
/// Retrieves information about a specific TSC entry from the TSC buffer.
Texture::TSCEntry GetTSCEntry(u32 tsc_index) const;
void InitDirtySettings();
/**
* Call a macro on this engine.
* @param method Method to call
@@ -1561,7 +1489,9 @@ ASSERT_REG_POSITION(index_array, 0x5F2);
ASSERT_REG_POSITION(polygon_offset_clamp, 0x61F);
ASSERT_REG_POSITION(instanced_arrays, 0x620);
ASSERT_REG_POSITION(vp_point_size, 0x644);
ASSERT_REG_POSITION(cull, 0x646);
ASSERT_REG_POSITION(cull_test_enabled, 0x646);
ASSERT_REG_POSITION(front_face, 0x647);
ASSERT_REG_POSITION(cull_face, 0x648);
ASSERT_REG_POSITION(pixel_center_integer, 0x649);
ASSERT_REG_POSITION(viewport_transform_enabled, 0x64B);
ASSERT_REG_POSITION(view_volume_clip_control, 0x64F);
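The `DirtyState` struct in this file can be read as a per-register lookup: each table maps a register index to a flag index, a register write raises the mapped flags, and a memory write ORs in a precomputed set. A minimal, self-contained sketch of that idea, assuming a hypothetical 16-register file and made-up flag names (`Viewport`, `Shaders`); it is not the emulator's actual table setup:

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical register count and flag names, chosen only for illustration.
constexpr std::size_t kNumRegs = 16;
enum Flag : std::uint8_t { NullFlag = 0, Viewport = 1, Shaders = 2 };

struct DirtySketch {
    using Flags = std::bitset<std::numeric_limits<std::uint8_t>::max()>;
    using Table = std::array<std::uint8_t, kNumRegs>;

    Flags flags;                   // flags that are currently dirty
    Flags on_write_stores;         // flags raised by any memory write
    std::array<Table, 2> tables{}; // register index -> flag index (0 = NullFlag)

    // Raise every flag mapped to the written register, one per table.
    void OnRegisterWrite(std::size_t reg) {
        for (const Table& table : tables) {
            flags[table[reg]] = true;
        }
    }

    // Mirrors `dirty.flags |= dirty.on_write_stores` from the diff.
    void OnMemoryWrite() {
        flags |= on_write_stores;
    }
};

// Builds a sketch where register 3 maps to Viewport and memory writes dirty Shaders.
DirtySketch MakeSketch() {
    DirtySketch dirty;
    dirty.tables[0][3] = Viewport;
    dirty.on_write_stores[Shaders] = true;
    return dirty;
}
```

Unmapped registers fall through to index 0, which plays the role of a null flag that no consumer checks, so the write path needs no branches.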

View File

@@ -57,7 +57,7 @@ void MaxwellDMA::HandleCopy() {
}
// All copies here update the main memory, so mark all rasterizer states as invalid.
system.GPU().Maxwell3D().dirty.OnMemoryWrite();
system.GPU().Maxwell3D().OnMemoryWrite();
if (regs.exec.is_dst_linear && regs.exec.is_src_linear) {
// When the enable_2d bit is disabled, the copy is performed as if we were copying a 1D

View File

@@ -140,71 +140,6 @@ void GPU::FlushCommands() {
renderer.Rasterizer().FlushCommands();
}
u32 RenderTargetBytesPerPixel(RenderTargetFormat format) {
ASSERT(format != RenderTargetFormat::NONE);
switch (format) {
case RenderTargetFormat::RGBA32_FLOAT:
case RenderTargetFormat::RGBA32_UINT:
return 16;
case RenderTargetFormat::RGBA16_UINT:
case RenderTargetFormat::RGBA16_UNORM:
case RenderTargetFormat::RGBA16_FLOAT:
case RenderTargetFormat::RGBX16_FLOAT:
case RenderTargetFormat::RG32_FLOAT:
case RenderTargetFormat::RG32_UINT:
return 8;
case RenderTargetFormat::RGBA8_UNORM:
case RenderTargetFormat::RGBA8_SNORM:
case RenderTargetFormat::RGBA8_SRGB:
case RenderTargetFormat::RGBA8_UINT:
case RenderTargetFormat::RGB10_A2_UNORM:
case RenderTargetFormat::BGRA8_UNORM:
case RenderTargetFormat::BGRA8_SRGB:
case RenderTargetFormat::RG16_UNORM:
case RenderTargetFormat::RG16_SNORM:
case RenderTargetFormat::RG16_UINT:
case RenderTargetFormat::RG16_SINT:
case RenderTargetFormat::RG16_FLOAT:
case RenderTargetFormat::R32_FLOAT:
case RenderTargetFormat::R11G11B10_FLOAT:
case RenderTargetFormat::R32_UINT:
return 4;
case RenderTargetFormat::R16_UNORM:
case RenderTargetFormat::R16_SNORM:
case RenderTargetFormat::R16_UINT:
case RenderTargetFormat::R16_SINT:
case RenderTargetFormat::R16_FLOAT:
case RenderTargetFormat::RG8_UNORM:
case RenderTargetFormat::RG8_SNORM:
return 2;
case RenderTargetFormat::R8_UNORM:
case RenderTargetFormat::R8_UINT:
return 1;
default:
UNIMPLEMENTED_MSG("Unimplemented render target format {}", static_cast<u32>(format));
return 1;
}
}
u32 DepthFormatBytesPerPixel(DepthFormat format) {
switch (format) {
case DepthFormat::Z32_S8_X24_FLOAT:
return 8;
case DepthFormat::Z32_FLOAT:
case DepthFormat::S8_Z24_UNORM:
case DepthFormat::Z24_X8_UNORM:
case DepthFormat::Z24_S8_UNORM:
case DepthFormat::Z24_C8_UNORM:
return 4;
case DepthFormat::Z16_UNORM:
return 2;
default:
UNIMPLEMENTED_MSG("Unimplemented Depth format {}", static_cast<u32>(format));
return 1;
}
}
// Note that, traditionally, methods are treated as 4-byte addressable locations, and hence
// their numbers are written down multiplied by 4 in docs. Here we do not multiply by 4.
// So the values you see in docs might be multiplied by 4.

View File

@@ -83,12 +83,6 @@ enum class DepthFormat : u32 {
Z32_S8_X24_FLOAT = 0x19,
};
/// Returns the number of bytes per pixel of each rendertarget format.
u32 RenderTargetBytesPerPixel(RenderTargetFormat format);
/// Returns the number of bytes per pixel of each depth format.
u32 DepthFormatBytesPerPixel(DepthFormat format);
struct CommandListHeader;
class DebugContext;

View File

@@ -5,7 +5,7 @@
#include "common/assert.h"
#include "common/microprofile.h"
#include "core/core.h"
#include "core/frontend/scope_acquire_window_context.h"
#include "core/frontend/scope_acquire_context.h"
#include "video_core/dma_pusher.h"
#include "video_core/gpu.h"
#include "video_core/gpu_thread.h"
@@ -27,7 +27,7 @@ static void RunThread(VideoCore::RendererBase& renderer, Tegra::DmaPusher& dma_p
return;
}
Core::Frontend::ScopeAcquireWindowContext acquire_context{renderer.GetRenderWindow()};
Core::Frontend::ScopeAcquireContext acquire_context{renderer.GetRenderWindow()};
CommandDataContainer next;
while (state.is_running) {

View File

@@ -89,6 +89,9 @@ public:
virtual void LoadDiskResources(const std::atomic_bool& stop_loading = false,
const DiskResourceLoadCallback& callback = {}) {}
/// Initializes renderer dirty flags
virtual void SetupDirtyFlags() {}
/// Grant access to the Guest Driver Profile for recording/obtaining info on the guest driver.
GuestDriverProfile& AccessGuestDriverProfile() {
return guest_driver_profile;

View File

@@ -35,15 +35,19 @@ public:
explicit RendererBase(Core::Frontend::EmuWindow& window);
virtual ~RendererBase();
/// Swap buffers (render frame)
virtual void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) = 0;
/// Initialize the renderer
virtual bool Init() = 0;
/// Shutdown the renderer
virtual void ShutDown() = 0;
/// Finalize rendering the guest frame and draw into the presentation texture
virtual void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) = 0;
/// Draws the latest frame to the window, waiting timeout_ms for a frame to arrive
/// (renderer-specific implementation)
virtual void TryPresent(int timeout_ms) = 0;
// Getter/setter functions:
// ------------------------

View File

@@ -11,7 +11,6 @@
#include "common/common_types.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_opengl/gl_framebuffer_cache.h"
#include "video_core/renderer_opengl/gl_state.h"
namespace OpenGL {
@@ -36,8 +35,7 @@ OGLFramebuffer FramebufferCacheOpenGL::CreateFramebuffer(const FramebufferCacheK
framebuffer.Create();
// TODO(Rodrigo): Use DSA here after Nvidia fixes their framebuffer DSA bugs.
local_state.draw.draw_framebuffer = framebuffer.handle;
local_state.ApplyFramebufferState();
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, framebuffer.handle);
if (key.zeta) {
const bool stencil = key.zeta->GetSurfaceParams().type == SurfaceType::DepthStencil;

View File

@@ -13,7 +13,6 @@
#include "common/common_types.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_texture_cache.h"
namespace OpenGL {
@@ -63,7 +62,6 @@ public:
private:
OGLFramebuffer CreateFramebuffer(const FramebufferCacheKey& key);
OpenGLState local_state;
std::unordered_map<FramebufferCacheKey, OGLFramebuffer> cache;
};

File diff suppressed because it is too large

View File

@@ -30,7 +30,7 @@
#include "video_core/renderer_opengl/gl_shader_cache.h"
#include "video_core/renderer_opengl/gl_shader_decompiler.h"
#include "video_core/renderer_opengl/gl_shader_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_state_tracker.h"
#include "video_core/renderer_opengl/gl_texture_cache.h"
#include "video_core/renderer_opengl/utils.h"
#include "video_core/textures/texture.h"
@@ -55,7 +55,8 @@ struct DrawParameters;
class RasterizerOpenGL : public VideoCore::RasterizerAccelerated {
public:
explicit RasterizerOpenGL(Core::System& system, Core::Frontend::EmuWindow& emu_window,
ScreenInfo& info);
ScreenInfo& info, GLShader::ProgramManager& program_manager,
StateTracker& state_tracker);
~RasterizerOpenGL() override;
void Draw(bool is_indexed, bool is_instanced) override;
@@ -76,6 +77,7 @@ public:
u32 pixel_stride) override;
void LoadDiskResources(const std::atomic_bool& stop_loading,
const VideoCore::DiskResourceLoadCallback& callback) override;
void SetupDirtyFlags() override;
/// Returns true when there are commands queued to the OpenGL server.
bool AnyCommandQueued() const {
@@ -86,8 +88,7 @@ private:
/// Configures the color and depth framebuffer states.
void ConfigureFramebuffers();
void ConfigureClearFramebuffer(OpenGLState& current_state, bool using_color_fb,
bool using_depth_fb, bool using_stencil_fb);
void ConfigureClearFramebuffer(bool using_color_fb, bool using_depth_fb, bool using_stencil_fb);
/// Configures the current constbuffers to use for the draw command.
void SetupDrawConstBuffers(std::size_t stage_index, const Shader& shader);
@@ -130,11 +131,13 @@ private:
const GLShader::ImageEntry& entry);
/// Syncs the viewport and depth range to match the guest state
void SyncViewport(OpenGLState& current_state);
void SyncViewport();
/// Syncs the depth clamp state
void SyncDepthClamp();
/// Syncs the clip enabled status to match the guest state
void SyncClipEnabled(
const std::array<bool, Tegra::Engines::Maxwell3D::Regs::NumClipDistances>& clip_mask);
void SyncClipEnabled(u32 clip_mask);
/// Syncs the clip coefficients to match the guest state
void SyncClipCoef();
@@ -164,7 +167,7 @@ private:
void SyncMultiSampleState();
/// Syncs the scissor test state to match the guest state
void SyncScissorTest(OpenGLState& current_state);
void SyncScissorTest();
/// Syncs the transform feedback state to match the guest state
void SyncTransformFeedback();
@@ -173,7 +176,7 @@ private:
void SyncPointState();
/// Syncs the rasterizer enable state to match the guest state
void SyncRasterizeEnable(OpenGLState& current_state);
void SyncRasterizeEnable();
/// Syncs Color Mask
void SyncColorMask();
@@ -184,6 +187,9 @@ private:
/// Syncs the alpha test state to match the guest state
void SyncAlphaTest();
/// Syncs the framebuffer sRGB state to match the guest state
void SyncFramebufferSRGB();
/// Checks for extensions that are not strictly required but are needed for correct emulation
void CheckExtensions();
@@ -191,18 +197,17 @@ private:
std::size_t CalculateIndexBufferSize() const;
/// Updates and returns a vertex array object representing current vertex format
GLuint SetupVertexFormat();
/// Updates the current vertex format
void SetupVertexFormat();
void SetupVertexBuffer(GLuint vao);
void SetupVertexInstances(GLuint vao);
void SetupVertexBuffer();
void SetupVertexInstances();
GLintptr SetupIndexBuffer();
void SetupShaders(GLenum primitive_mode);
const Device device;
OpenGLState state;
TextureCacheOpenGL texture_cache;
ShaderCacheOpenGL shader_cache;
@@ -212,22 +217,20 @@ private:
Core::System& system;
ScreenInfo& screen_info;
std::unique_ptr<GLShader::ProgramManager> shader_program_manager;
std::map<std::array<Tegra::Engines::Maxwell3D::Regs::VertexAttribute,
Tegra::Engines::Maxwell3D::Regs::NumVertexAttributes>,
OGLVertexArray>
vertex_array_cache;
GLShader::ProgramManager& program_manager;
StateTracker& state_tracker;
static constexpr std::size_t STREAM_BUFFER_SIZE = 128 * 1024 * 1024;
OGLBufferCache buffer_cache;
VertexArrayPushBuffer vertex_array_pushbuffer;
VertexArrayPushBuffer vertex_array_pushbuffer{state_tracker};
BindBuffersRangePushBuffer bind_ubo_pushbuffer{GL_UNIFORM_BUFFER};
BindBuffersRangePushBuffer bind_ssbo_pushbuffer{GL_SHADER_STORAGE_BUFFER};
/// Number of commands queued to the OpenGL driver. Reset on flush.
std::size_t num_queued_commands = 0;
u32 last_clip_distance_mask = 0;
};
} // namespace OpenGL

View File

@@ -8,13 +8,29 @@
#include "common/microprofile.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_shader_util.h"
#include "video_core/renderer_opengl/gl_state.h"
MICROPROFILE_DEFINE(OpenGL_ResourceCreation, "OpenGL", "Resource Creation", MP_RGB(128, 128, 192));
MICROPROFILE_DEFINE(OpenGL_ResourceDeletion, "OpenGL", "Resource Deletion", MP_RGB(128, 128, 192));
namespace OpenGL {
void OGLRenderbuffer::Create() {
if (handle != 0)
return;
MICROPROFILE_SCOPE(OpenGL_ResourceCreation);
glCreateRenderbuffers(1, &handle);
}
void OGLRenderbuffer::Release() {
if (handle == 0)
return;
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteRenderbuffers(1, &handle);
handle = 0;
}
void OGLTexture::Create(GLenum target) {
if (handle != 0)
return;
@@ -29,7 +45,6 @@ void OGLTexture::Release() {
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteTextures(1, &handle);
OpenGLState::GetCurState().UnbindTexture(handle).Apply();
handle = 0;
}
@@ -47,7 +62,6 @@ void OGLTextureView::Release() {
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteTextures(1, &handle);
OpenGLState::GetCurState().UnbindTexture(handle).Apply();
handle = 0;
}
@@ -65,7 +79,6 @@ void OGLSampler::Release() {
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteSamplers(1, &handle);
OpenGLState::GetCurState().ResetSampler(handle).Apply();
handle = 0;
}
@@ -109,7 +122,6 @@ void OGLProgram::Release() {
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteProgram(handle);
OpenGLState::GetCurState().ResetProgram(handle).Apply();
handle = 0;
}
@@ -127,7 +139,6 @@ void OGLPipeline::Release() {
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteProgramPipelines(1, &handle);
OpenGLState::GetCurState().ResetPipeline(handle).Apply();
handle = 0;
}
@@ -171,24 +182,6 @@ void OGLSync::Release() {
handle = 0;
}
void OGLVertexArray::Create() {
if (handle != 0)
return;
MICROPROFILE_SCOPE(OpenGL_ResourceCreation);
glCreateVertexArrays(1, &handle);
}
void OGLVertexArray::Release() {
if (handle == 0)
return;
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteVertexArrays(1, &handle);
OpenGLState::GetCurState().ResetVertexArray(handle).Apply();
handle = 0;
}
void OGLFramebuffer::Create() {
if (handle != 0)
return;
@@ -203,7 +196,6 @@ void OGLFramebuffer::Release() {
MICROPROFILE_SCOPE(OpenGL_ResourceDeletion);
glDeleteFramebuffers(1, &handle);
OpenGLState::GetCurState().ResetFramebuffer(handle).Apply();
handle = 0;
}

View File

@@ -11,6 +11,31 @@
namespace OpenGL {
class OGLRenderbuffer : private NonCopyable {
public:
OGLRenderbuffer() = default;
OGLRenderbuffer(OGLRenderbuffer&& o) noexcept : handle(std::exchange(o.handle, 0)) {}
~OGLRenderbuffer() {
Release();
}
OGLRenderbuffer& operator=(OGLRenderbuffer&& o) noexcept {
Release();
handle = std::exchange(o.handle, 0);
return *this;
}
/// Creates a new internal OpenGL resource and stores the handle
void Create();
/// Deletes the internal OpenGL resource
void Release();
GLuint handle = 0;
};
class OGLTexture : private NonCopyable {
public:
OGLTexture() = default;
@@ -216,31 +241,6 @@ public:
GLsync handle = 0;
};
class OGLVertexArray : private NonCopyable {
public:
OGLVertexArray() = default;
OGLVertexArray(OGLVertexArray&& o) noexcept : handle(std::exchange(o.handle, 0)) {}
~OGLVertexArray() {
Release();
}
OGLVertexArray& operator=(OGLVertexArray&& o) noexcept {
Release();
handle = std::exchange(o.handle, 0);
return *this;
}
/// Creates a new internal OpenGL resource and stores the handle
void Create();
/// Deletes the internal OpenGL resource
void Release();
GLuint handle = 0;
};
class OGLFramebuffer : private NonCopyable {
public:
OGLFramebuffer() = default;
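The resource wrappers in this header all follow the same move-only pattern: `std::exchange` transfers the handle and leaves the source at 0, so `Release()` on a moved-from object is a no-op. A small sketch of that pattern under stated assumptions: `FakeCreate`, `FakeDelete`, and `g_live_handles` are hypothetical stand-ins for the `glCreate*`/`glDelete*` calls so the example runs without a GL context.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-ins for glCreate*/glDelete*; they only count live handles.
static int g_live_handles = 0;
static unsigned FakeCreate() {
    ++g_live_handles;
    return 1;
}
static void FakeDelete(unsigned) {
    --g_live_handles;
}

class HandleSketch {
public:
    HandleSketch() = default;
    HandleSketch(const HandleSketch&) = delete;
    HandleSketch& operator=(const HandleSketch&) = delete;

    // Steal the handle and leave the source empty.
    HandleSketch(HandleSketch&& o) noexcept : handle(std::exchange(o.handle, 0)) {}

    ~HandleSketch() {
        Release();
    }

    HandleSketch& operator=(HandleSketch&& o) noexcept {
        Release();
        handle = std::exchange(o.handle, 0);
        return *this;
    }

    // Creates the resource once; repeated calls are ignored.
    void Create() {
        if (handle != 0)
            return;
        handle = FakeCreate();
    }

    // Deletes the resource if one is held.
    void Release() {
        if (handle == 0)
            return;
        FakeDelete(handle);
        handle = 0;
    }

    unsigned handle = 0;
};
```

Because deletion no longer touches any global state cache, the wrapper's only job is to pair each create with exactly one delete.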

View File

@@ -38,7 +38,7 @@ OGLSampler SamplerCacheOpenGL::CreateSampler(const Tegra::Texture::TSCEntry& tsc
glSamplerParameterf(sampler_id, GL_TEXTURE_MAX_ANISOTROPY, tsc.GetMaxAnisotropy());
} else if (GLAD_GL_EXT_texture_filter_anisotropic) {
glSamplerParameterf(sampler_id, GL_TEXTURE_MAX_ANISOTROPY_EXT, tsc.GetMaxAnisotropy());
} else if (tsc.GetMaxAnisotropy() != 1) {
} else {
LOG_WARNING(Render_OpenGL, "Anisotropy not supported by host GPU driver");
}

View File

@@ -22,6 +22,7 @@
#include "video_core/renderer_opengl/gl_shader_cache.h"
#include "video_core/renderer_opengl/gl_shader_decompiler.h"
#include "video_core/renderer_opengl/gl_shader_disk_cache.h"
#include "video_core/renderer_opengl/gl_state_tracker.h"
#include "video_core/renderer_opengl/utils.h"
#include "video_core/shader/shader_ir.h"
@@ -623,7 +624,7 @@ bool ShaderCacheOpenGL::GenerateUnspecializedShaders(
}
Shader ShaderCacheOpenGL::GetStageProgram(Maxwell::ShaderProgram program) {
if (!system.GPU().Maxwell3D().dirty.shaders) {
if (!system.GPU().Maxwell3D().dirty.flags[Dirty::Shaders]) {
return last_shaders[static_cast<std::size_t>(program)];
}

View File

@@ -2547,7 +2547,10 @@ ShaderEntries GetEntries(const VideoCommon::Shader::ShaderIR& ir) {
for (const auto& image : ir.GetImages()) {
entries.images.emplace_back(image);
}
entries.clip_distances = ir.GetClipDistances();
const auto clip_distances = ir.GetClipDistances();
for (std::size_t i = 0; i < std::size(clip_distances); ++i) {
entries.clip_distances |= (clip_distances[i] ? 1U : 0U) << i;
}
entries.shader_length = ir.GetLength();
return entries;
}
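This hunk converts the per-distance `bool` array into a `u32` bitmask. A minimal sketch of such a packing loop, written as a standalone helper; `PackClipDistances` and `kNumClipDistances` are hypothetical names, and the count of 8 is assumed from the eight `c0`..`c7` bit fields removed elsewhere in this diff. Note that the sketch accumulates bits with `|=` so that every enabled distance survives the loop:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed count; the diff itself only names Maxwell::NumClipDistances.
constexpr std::size_t kNumClipDistances = 8;

// Packs one bool per clip distance into a bitmask, bit i for distance i.
std::uint32_t PackClipDistances(const std::array<bool, kNumClipDistances>& enabled) {
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < enabled.size(); ++i) {
        // OR each bit in so that earlier iterations are preserved.
        mask |= (enabled[i] ? 1U : 0U) << i;
    }
    return mask;
}
```

A mask built this way can be compared against a cached `u32` in a single operation, which is presumably what `last_clip_distance_mask` in the rasterizer is for.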

View File

@@ -74,7 +74,7 @@ struct ShaderEntries {
std::vector<GlobalMemoryEntry> global_memory_entries;
std::vector<SamplerEntry> samplers;
std::vector<ImageEntry> images;
std::array<bool, Maxwell::NumClipDistances> clip_distances{};
u32 clip_distances{};
std::size_t shader_length{};
};

View File

@@ -2,45 +2,52 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <glad/glad.h>
#include "common/common_types.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_opengl/gl_shader_manager.h"
namespace OpenGL::GLShader {
using Tegra::Engines::Maxwell3D;
ProgramManager::ProgramManager() {
pipeline.Create();
}
ProgramManager::ProgramManager() = default;
ProgramManager::~ProgramManager() = default;
void ProgramManager::ApplyTo(OpenGLState& state) {
UpdatePipeline();
state.draw.shader_program = 0;
state.draw.program_pipeline = pipeline.handle;
void ProgramManager::Create() {
graphics_pipeline.Create();
glBindProgramPipeline(graphics_pipeline.handle);
}
void ProgramManager::UpdatePipeline() {
void ProgramManager::BindGraphicsPipeline() {
if (!is_graphics_bound) {
is_graphics_bound = true;
glUseProgram(0);
}
// Avoid updating the pipeline when values have not changed
if (old_state == current_state) {
return;
}
// Workaround for AMD bug
constexpr GLenum all_used_stages{GL_VERTEX_SHADER_BIT | GL_GEOMETRY_SHADER_BIT |
GL_FRAGMENT_SHADER_BIT};
glUseProgramStages(pipeline.handle, all_used_stages, 0);
glUseProgramStages(pipeline.handle, GL_VERTEX_SHADER_BIT, current_state.vertex_shader);
glUseProgramStages(pipeline.handle, GL_GEOMETRY_SHADER_BIT, current_state.geometry_shader);
glUseProgramStages(pipeline.handle, GL_FRAGMENT_SHADER_BIT, current_state.fragment_shader);
static constexpr GLenum all_used_stages{GL_VERTEX_SHADER_BIT | GL_GEOMETRY_SHADER_BIT |
GL_FRAGMENT_SHADER_BIT};
const GLuint handle = graphics_pipeline.handle;
glUseProgramStages(handle, all_used_stages, 0);
glUseProgramStages(handle, GL_VERTEX_SHADER_BIT, current_state.vertex_shader);
glUseProgramStages(handle, GL_GEOMETRY_SHADER_BIT, current_state.geometry_shader);
glUseProgramStages(handle, GL_FRAGMENT_SHADER_BIT, current_state.fragment_shader);
old_state = current_state;
}
void MaxwellUniformData::SetFromRegs(const Maxwell3D& maxwell) {
void ProgramManager::BindComputeShader(GLuint program) {
is_graphics_bound = false;
glUseProgram(program);
}
void MaxwellUniformData::SetFromRegs(const Tegra::Engines::Maxwell3D& maxwell) {
const auto& regs = maxwell.regs;
// Y_NEGATE controls what value S2R returns for the Y_DIRECTION system value.

View File

@@ -9,7 +9,6 @@
#include <glad/glad.h>
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/maxwell_to_gl.h"
namespace OpenGL::GLShader {
@@ -32,49 +31,47 @@ public:
explicit ProgramManager();
~ProgramManager();
void ApplyTo(OpenGLState& state);
void Create();
void UseProgrammableVertexShader(GLuint program) {
/// Updates the graphics pipeline and binds it.
void BindGraphicsPipeline();
/// Binds a compute shader.
void BindComputeShader(GLuint program);
void UseVertexShader(GLuint program) {
current_state.vertex_shader = program;
}
void UseProgrammableGeometryShader(GLuint program) {
void UseGeometryShader(GLuint program) {
current_state.geometry_shader = program;
}
void UseProgrammableFragmentShader(GLuint program) {
void UseFragmentShader(GLuint program) {
current_state.fragment_shader = program;
}
void UseTrivialGeometryShader() {
current_state.geometry_shader = 0;
}
void UseTrivialFragmentShader() {
current_state.fragment_shader = 0;
}
private:
struct PipelineState {
bool operator==(const PipelineState& rhs) const {
bool operator==(const PipelineState& rhs) const noexcept {
return vertex_shader == rhs.vertex_shader && fragment_shader == rhs.fragment_shader &&
geometry_shader == rhs.geometry_shader;
}
bool operator!=(const PipelineState& rhs) const {
bool operator!=(const PipelineState& rhs) const noexcept {
return !operator==(rhs);
}
GLuint vertex_shader{};
GLuint fragment_shader{};
GLuint geometry_shader{};
GLuint vertex_shader = 0;
GLuint fragment_shader = 0;
GLuint geometry_shader = 0;
};
void UpdatePipeline();
OGLPipeline pipeline;
OGLPipeline graphics_pipeline;
OGLPipeline compute_pipeline;
PipelineState current_state;
PipelineState old_state;
bool is_graphics_bound = true;
};
} // namespace OpenGL::GLShader
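The graphics/compute hand-off in `ProgramManager` relies on an OpenGL rule: a program made current with `glUseProgram` takes precedence over the bound program pipeline, so returning to graphics needs one `glUseProgram(0)`. A sketch of just that transition logic, assuming a hypothetical `FakeGL` recorder in place of real GL calls so it can run anywhere:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the GL driver: records calls instead of issuing them.
struct FakeGL {
    std::vector<std::string> calls;
    void UseProgram(unsigned program) {
        calls.push_back("glUseProgram(" + std::to_string(program) + ")");
    }
};

class BindingSketch {
public:
    explicit BindingSketch(FakeGL& gl_) : gl{gl_} {}

    // Leaving compute: clear the current program so the bound pipeline applies again.
    void BindGraphicsPipeline() {
        if (!is_graphics_bound) {
            is_graphics_bound = true;
            gl.UseProgram(0);
        }
    }

    // Compute shaders are bound directly, without any state tracking.
    void BindComputeShader(unsigned program) {
        is_graphics_bound = false;
        gl.UseProgram(program);
    }

private:
    FakeGL& gl;
    bool is_graphics_bound = true;
};
```

The boolean keeps consecutive graphics draws free of redundant `glUseProgram(0)` calls; only the first draw after a compute dispatch pays for the switch.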

View File

@@ -1,554 +0,0 @@
// Copyright 2015 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <iterator>
#include <glad/glad.h>
#include "common/assert.h"
#include "common/logging/log.h"
#include "common/microprofile.h"
#include "video_core/renderer_opengl/gl_state.h"
MICROPROFILE_DEFINE(OpenGL_State, "OpenGL", "State Change", MP_RGB(192, 128, 128));
namespace OpenGL {
using Maxwell = Tegra::Engines::Maxwell3D::Regs;
OpenGLState OpenGLState::cur_state;
namespace {
template <typename T>
bool UpdateValue(T& current_value, const T new_value) {
const bool changed = current_value != new_value;
current_value = new_value;
return changed;
}
template <typename T1, typename T2>
bool UpdateTie(T1 current_value, const T2 new_value) {
const bool changed = current_value != new_value;
current_value = new_value;
return changed;
}
template <typename T>
std::optional<std::pair<GLuint, GLsizei>> UpdateArray(T& current_values, const T& new_values) {
std::optional<std::size_t> first;
std::size_t last;
for (std::size_t i = 0; i < std::size(current_values); ++i) {
if (!UpdateValue(current_values[i], new_values[i])) {
continue;
}
if (!first) {
first = i;
}
last = i;
}
if (!first) {
return std::nullopt;
}
return std::make_pair(static_cast<GLuint>(*first), static_cast<GLsizei>(last - *first + 1));
}
void Enable(GLenum cap, bool enable) {
if (enable) {
glEnable(cap);
} else {
glDisable(cap);
}
}
void Enable(GLenum cap, GLuint index, bool enable) {
if (enable) {
glEnablei(cap, index);
} else {
glDisablei(cap, index);
}
}
void Enable(GLenum cap, bool& current_value, bool new_value) {
if (UpdateValue(current_value, new_value)) {
Enable(cap, new_value);
}
}
void Enable(GLenum cap, GLuint index, bool& current_value, bool new_value) {
if (UpdateValue(current_value, new_value)) {
Enable(cap, index, new_value);
}
}
} // Anonymous namespace
OpenGLState::OpenGLState() = default;
void OpenGLState::SetDefaultViewports() {
viewports.fill(Viewport{});
depth_clamp.far_plane = false;
depth_clamp.near_plane = false;
}
void OpenGLState::ApplyFramebufferState() {
if (UpdateValue(cur_state.draw.read_framebuffer, draw.read_framebuffer)) {
glBindFramebuffer(GL_READ_FRAMEBUFFER, draw.read_framebuffer);
}
if (UpdateValue(cur_state.draw.draw_framebuffer, draw.draw_framebuffer)) {
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, draw.draw_framebuffer);
}
}
void OpenGLState::ApplyVertexArrayState() {
if (UpdateValue(cur_state.draw.vertex_array, draw.vertex_array)) {
glBindVertexArray(draw.vertex_array);
}
}
void OpenGLState::ApplyShaderProgram() {
if (UpdateValue(cur_state.draw.shader_program, draw.shader_program)) {
glUseProgram(draw.shader_program);
}
}
void OpenGLState::ApplyProgramPipeline() {
if (UpdateValue(cur_state.draw.program_pipeline, draw.program_pipeline)) {
glBindProgramPipeline(draw.program_pipeline);
}
}
void OpenGLState::ApplyClipDistances() {
for (std::size_t i = 0; i < clip_distance.size(); ++i) {
Enable(GL_CLIP_DISTANCE0 + static_cast<GLenum>(i), cur_state.clip_distance[i],
clip_distance[i]);
}
}
void OpenGLState::ApplyPointSize() {
Enable(GL_PROGRAM_POINT_SIZE, cur_state.point.program_control, point.program_control);
Enable(GL_POINT_SPRITE, cur_state.point.sprite, point.sprite);
if (UpdateValue(cur_state.point.size, point.size)) {
glPointSize(point.size);
}
}
void OpenGLState::ApplyFragmentColorClamp() {
if (UpdateValue(cur_state.fragment_color_clamp.enabled, fragment_color_clamp.enabled)) {
glClampColor(GL_CLAMP_FRAGMENT_COLOR_ARB,
fragment_color_clamp.enabled ? GL_TRUE : GL_FALSE);
}
}
void OpenGLState::ApplyMultisample() {
Enable(GL_SAMPLE_ALPHA_TO_COVERAGE, cur_state.multisample_control.alpha_to_coverage,
multisample_control.alpha_to_coverage);
Enable(GL_SAMPLE_ALPHA_TO_ONE, cur_state.multisample_control.alpha_to_one,
multisample_control.alpha_to_one);
}
void OpenGLState::ApplyDepthClamp() {
if (depth_clamp.far_plane == cur_state.depth_clamp.far_plane &&
depth_clamp.near_plane == cur_state.depth_clamp.near_plane) {
return;
}
cur_state.depth_clamp = depth_clamp;
UNIMPLEMENTED_IF_MSG(depth_clamp.far_plane != depth_clamp.near_plane,
"Unimplemented Depth Clamp Separation!");
Enable(GL_DEPTH_CLAMP, depth_clamp.far_plane || depth_clamp.near_plane);
}
void OpenGLState::ApplySRgb() {
if (cur_state.framebuffer_srgb.enabled == framebuffer_srgb.enabled)
return;
cur_state.framebuffer_srgb.enabled = framebuffer_srgb.enabled;
if (framebuffer_srgb.enabled) {
glEnable(GL_FRAMEBUFFER_SRGB);
} else {
glDisable(GL_FRAMEBUFFER_SRGB);
}
}
void OpenGLState::ApplyCulling() {
Enable(GL_CULL_FACE, cur_state.cull.enabled, cull.enabled);
if (UpdateValue(cur_state.cull.mode, cull.mode)) {
glCullFace(cull.mode);
}
if (UpdateValue(cur_state.cull.front_face, cull.front_face)) {
glFrontFace(cull.front_face);
}
}
void OpenGLState::ApplyRasterizerDiscard() {
Enable(GL_RASTERIZER_DISCARD, cur_state.rasterizer_discard, rasterizer_discard);
}
void OpenGLState::ApplyColorMask() {
if (!dirty.color_mask) {
return;
}
dirty.color_mask = false;
for (std::size_t i = 0; i < Maxwell::NumRenderTargets; ++i) {
const auto& updated = color_mask[i];
auto& current = cur_state.color_mask[i];
if (updated.red_enabled != current.red_enabled ||
updated.green_enabled != current.green_enabled ||
updated.blue_enabled != current.blue_enabled ||
updated.alpha_enabled != current.alpha_enabled) {
current = updated;
glColorMaski(static_cast<GLuint>(i), updated.red_enabled, updated.green_enabled,
updated.blue_enabled, updated.alpha_enabled);
}
}
}
void OpenGLState::ApplyDepth() {
Enable(GL_DEPTH_TEST, cur_state.depth.test_enabled, depth.test_enabled);
if (cur_state.depth.test_func != depth.test_func) {
cur_state.depth.test_func = depth.test_func;
glDepthFunc(depth.test_func);
}
if (cur_state.depth.write_mask != depth.write_mask) {
cur_state.depth.write_mask = depth.write_mask;
glDepthMask(depth.write_mask);
}
}
void OpenGLState::ApplyPrimitiveRestart() {
Enable(GL_PRIMITIVE_RESTART, cur_state.primitive_restart.enabled, primitive_restart.enabled);
if (cur_state.primitive_restart.index != primitive_restart.index) {
cur_state.primitive_restart.index = primitive_restart.index;
glPrimitiveRestartIndex(primitive_restart.index);
}
}
void OpenGLState::ApplyStencilTest() {
if (!dirty.stencil_state) {
return;
}
dirty.stencil_state = false;
Enable(GL_STENCIL_TEST, cur_state.stencil.test_enabled, stencil.test_enabled);
const auto ConfigStencil = [](GLenum face, const auto& config, auto& current) {
if (current.test_func != config.test_func || current.test_ref != config.test_ref ||
current.test_mask != config.test_mask) {
current.test_func = config.test_func;
current.test_ref = config.test_ref;
current.test_mask = config.test_mask;
glStencilFuncSeparate(face, config.test_func, config.test_ref, config.test_mask);
}
if (current.action_depth_fail != config.action_depth_fail ||
current.action_depth_pass != config.action_depth_pass ||
current.action_stencil_fail != config.action_stencil_fail) {
current.action_depth_fail = config.action_depth_fail;
current.action_depth_pass = config.action_depth_pass;
current.action_stencil_fail = config.action_stencil_fail;
glStencilOpSeparate(face, config.action_stencil_fail, config.action_depth_fail,
config.action_depth_pass);
}
if (current.write_mask != config.write_mask) {
current.write_mask = config.write_mask;
glStencilMaskSeparate(face, config.write_mask);
}
};
ConfigStencil(GL_FRONT, stencil.front, cur_state.stencil.front);
ConfigStencil(GL_BACK, stencil.back, cur_state.stencil.back);
}
void OpenGLState::ApplyViewport() {
for (GLuint i = 0; i < static_cast<GLuint>(Maxwell::NumViewports); ++i) {
const auto& updated = viewports[i];
auto& current = cur_state.viewports[i];
if (current.x != updated.x || current.y != updated.y || current.width != updated.width ||
current.height != updated.height) {
current.x = updated.x;
current.y = updated.y;
current.width = updated.width;
current.height = updated.height;
glViewportIndexedf(i, static_cast<GLfloat>(updated.x), static_cast<GLfloat>(updated.y),
static_cast<GLfloat>(updated.width),
static_cast<GLfloat>(updated.height));
}
if (current.depth_range_near != updated.depth_range_near ||
current.depth_range_far != updated.depth_range_far) {
current.depth_range_near = updated.depth_range_near;
current.depth_range_far = updated.depth_range_far;
glDepthRangeIndexed(i, updated.depth_range_near, updated.depth_range_far);
}
Enable(GL_SCISSOR_TEST, i, current.scissor.enabled, updated.scissor.enabled);
if (current.scissor.x != updated.scissor.x || current.scissor.y != updated.scissor.y ||
current.scissor.width != updated.scissor.width ||
current.scissor.height != updated.scissor.height) {
current.scissor.x = updated.scissor.x;
current.scissor.y = updated.scissor.y;
current.scissor.width = updated.scissor.width;
current.scissor.height = updated.scissor.height;
glScissorIndexed(i, updated.scissor.x, updated.scissor.y, updated.scissor.width,
updated.scissor.height);
}
}
}
void OpenGLState::ApplyGlobalBlending() {
const Blend& updated = blend[0];
Blend& current = cur_state.blend[0];
Enable(GL_BLEND, current.enabled, updated.enabled);
if (current.src_rgb_func != updated.src_rgb_func ||
current.dst_rgb_func != updated.dst_rgb_func || current.src_a_func != updated.src_a_func ||
current.dst_a_func != updated.dst_a_func) {
current.src_rgb_func = updated.src_rgb_func;
current.dst_rgb_func = updated.dst_rgb_func;
current.src_a_func = updated.src_a_func;
current.dst_a_func = updated.dst_a_func;
glBlendFuncSeparate(updated.src_rgb_func, updated.dst_rgb_func, updated.src_a_func,
updated.dst_a_func);
}
if (current.rgb_equation != updated.rgb_equation || current.a_equation != updated.a_equation) {
current.rgb_equation = updated.rgb_equation;
current.a_equation = updated.a_equation;
glBlendEquationSeparate(updated.rgb_equation, updated.a_equation);
}
}
void OpenGLState::ApplyTargetBlending(std::size_t target, bool force) {
const Blend& updated = blend[target];
Blend& current = cur_state.blend[target];
if (current.enabled != updated.enabled || force) {
current.enabled = updated.enabled;
Enable(GL_BLEND, static_cast<GLuint>(target), updated.enabled);
}
if (UpdateTie(std::tie(current.src_rgb_func, current.dst_rgb_func, current.src_a_func,
current.dst_a_func),
std::tie(updated.src_rgb_func, updated.dst_rgb_func, updated.src_a_func,
updated.dst_a_func))) {
glBlendFuncSeparatei(static_cast<GLuint>(target), updated.src_rgb_func,
updated.dst_rgb_func, updated.src_a_func, updated.dst_a_func);
}
if (UpdateTie(std::tie(current.rgb_equation, current.a_equation),
std::tie(updated.rgb_equation, updated.a_equation))) {
glBlendEquationSeparatei(static_cast<GLuint>(target), updated.rgb_equation,
updated.a_equation);
}
}
void OpenGLState::ApplyBlending() {
if (!dirty.blend_state) {
return;
}
dirty.blend_state = false;
if (independant_blend.enabled) {
const bool force = independant_blend.enabled != cur_state.independant_blend.enabled;
for (std::size_t target = 0; target < Maxwell::NumRenderTargets; ++target) {
ApplyTargetBlending(target, force);
}
} else {
ApplyGlobalBlending();
}
cur_state.independant_blend.enabled = independant_blend.enabled;
if (UpdateTie(
std::tie(cur_state.blend_color.red, cur_state.blend_color.green,
cur_state.blend_color.blue, cur_state.blend_color.alpha),
std::tie(blend_color.red, blend_color.green, blend_color.blue, blend_color.alpha))) {
glBlendColor(blend_color.red, blend_color.green, blend_color.blue, blend_color.alpha);
}
}
void OpenGLState::ApplyLogicOp() {
Enable(GL_COLOR_LOGIC_OP, cur_state.logic_op.enabled, logic_op.enabled);
if (UpdateValue(cur_state.logic_op.operation, logic_op.operation)) {
glLogicOp(logic_op.operation);
}
}
void OpenGLState::ApplyPolygonOffset() {
if (!dirty.polygon_offset) {
return;
}
dirty.polygon_offset = false;
Enable(GL_POLYGON_OFFSET_FILL, cur_state.polygon_offset.fill_enable,
polygon_offset.fill_enable);
Enable(GL_POLYGON_OFFSET_LINE, cur_state.polygon_offset.line_enable,
polygon_offset.line_enable);
Enable(GL_POLYGON_OFFSET_POINT, cur_state.polygon_offset.point_enable,
polygon_offset.point_enable);
if (UpdateTie(std::tie(cur_state.polygon_offset.factor, cur_state.polygon_offset.units,
cur_state.polygon_offset.clamp),
std::tie(polygon_offset.factor, polygon_offset.units, polygon_offset.clamp))) {
if (GLAD_GL_EXT_polygon_offset_clamp && polygon_offset.clamp != 0) {
glPolygonOffsetClamp(polygon_offset.factor, polygon_offset.units, polygon_offset.clamp);
} else {
UNIMPLEMENTED_IF_MSG(polygon_offset.clamp != 0,
"Unimplemented Depth polygon offset clamp.");
glPolygonOffset(polygon_offset.factor, polygon_offset.units);
}
}
}
void OpenGLState::ApplyAlphaTest() {
Enable(GL_ALPHA_TEST, cur_state.alpha_test.enabled, alpha_test.enabled);
if (UpdateTie(std::tie(cur_state.alpha_test.func, cur_state.alpha_test.ref),
std::tie(alpha_test.func, alpha_test.ref))) {
glAlphaFunc(alpha_test.func, alpha_test.ref);
}
}
void OpenGLState::ApplyClipControl() {
if (UpdateTie(std::tie(cur_state.clip_control.origin, cur_state.clip_control.depth_mode),
std::tie(clip_control.origin, clip_control.depth_mode))) {
glClipControl(clip_control.origin, clip_control.depth_mode);
}
}
void OpenGLState::ApplyTextures() {
const std::size_t size = std::size(textures);
for (std::size_t i = 0; i < size; ++i) {
if (UpdateValue(cur_state.textures[i], textures[i])) {
// glBindTextureUnit doesn't support binding null textures; skip those binds.
// TODO(Rodrigo): Stop using null textures
if (textures[i] != 0) {
glBindTextureUnit(static_cast<GLuint>(i), textures[i]);
}
}
}
}
void OpenGLState::ApplySamplers() {
const std::size_t size = std::size(samplers);
for (std::size_t i = 0; i < size; ++i) {
if (UpdateValue(cur_state.samplers[i], samplers[i])) {
glBindSampler(static_cast<GLuint>(i), samplers[i]);
}
}
}
void OpenGLState::ApplyImages() {
if (const auto update = UpdateArray(cur_state.images, images)) {
glBindImageTextures(update->first, update->second, images.data() + update->first);
}
}
void OpenGLState::Apply() {
MICROPROFILE_SCOPE(OpenGL_State);
ApplyFramebufferState();
ApplyVertexArrayState();
ApplyShaderProgram();
ApplyProgramPipeline();
ApplyClipDistances();
ApplyPointSize();
ApplyFragmentColorClamp();
ApplyMultisample();
ApplyRasterizerDiscard();
ApplyColorMask();
ApplyDepthClamp();
ApplyViewport();
ApplyStencilTest();
ApplySRgb();
ApplyCulling();
ApplyDepth();
ApplyPrimitiveRestart();
ApplyBlending();
ApplyLogicOp();
ApplyTextures();
ApplySamplers();
ApplyImages();
ApplyPolygonOffset();
ApplyAlphaTest();
ApplyClipControl();
}
void OpenGLState::EmulateViewportWithScissor() {
auto& current = viewports[0];
if (current.scissor.enabled) {
const GLint left = std::max(current.x, current.scissor.x);
const GLint right =
std::max(current.x + current.width, current.scissor.x + current.scissor.width);
const GLint bottom = std::max(current.y, current.scissor.y);
const GLint top =
std::max(current.y + current.height, current.scissor.y + current.scissor.height);
current.scissor.x = std::max(left, 0);
current.scissor.y = std::max(bottom, 0);
current.scissor.width = std::max(right - left, 0);
current.scissor.height = std::max(top - bottom, 0);
} else {
current.scissor.enabled = true;
current.scissor.x = current.x;
current.scissor.y = current.y;
current.scissor.width = current.width;
current.scissor.height = current.height;
}
}
OpenGLState& OpenGLState::UnbindTexture(GLuint handle) {
for (auto& texture : textures) {
if (texture == handle) {
texture = 0;
}
}
return *this;
}
OpenGLState& OpenGLState::ResetSampler(GLuint handle) {
for (auto& sampler : samplers) {
if (sampler == handle) {
sampler = 0;
}
}
return *this;
}
OpenGLState& OpenGLState::ResetProgram(GLuint handle) {
if (draw.shader_program == handle) {
draw.shader_program = 0;
}
return *this;
}
OpenGLState& OpenGLState::ResetPipeline(GLuint handle) {
if (draw.program_pipeline == handle) {
draw.program_pipeline = 0;
}
return *this;
}
OpenGLState& OpenGLState::ResetVertexArray(GLuint handle) {
if (draw.vertex_array == handle) {
draw.vertex_array = 0;
}
return *this;
}
OpenGLState& OpenGLState::ResetFramebuffer(GLuint handle) {
if (draw.read_framebuffer == handle) {
draw.read_framebuffer = 0;
}
if (draw.draw_framebuffer == handle) {
draw.draw_framebuffer = 0;
}
return *this;
}
} // namespace OpenGL
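
The Apply* functions in this removed file share one pattern: compare the requested value with the cached `cur_state` copy and call into OpenGL only when they differ. `UpdateValue` and `UpdateTie` are used above but defined outside this hunk, so what follows is a minimal sketch of how such helpers could look, not the project's actual definitions. `BlendFuncs` and `CountDriverCalls` are hypothetical names used only for illustration.

```cpp
#include <tuple>

// Hypothetical helper: store new_value into current and report whether it differed.
template <typename T>
bool UpdateValue(T& current, const T& new_value) {
    const bool changed = current != new_value;
    current = new_value;
    return changed;
}

// Hypothetical helper: the same idea for several fields bundled with std::tie.
// `current` is a tuple of references, so assigning to it writes through to the fields.
template <typename T1, typename T2>
bool UpdateTie(T1 current, const T2& new_value) {
    const bool changed = current != new_value;
    current = new_value;
    return changed;
}

struct BlendFuncs {
    int src;
    int dst;
};

// Counts how often a stand-in "driver call" would be issued for a request stream.
int CountDriverCalls() {
    BlendFuncs current{1, 0};
    const BlendFuncs requests[] = {{1, 0}, {2, 3}, {2, 3}, {2, 0}};
    int calls = 0;
    for (const BlendFuncs& wanted : requests) {
        if (UpdateTie(std::tie(current.src, current.dst), std::tie(wanted.src, wanted.dst))) {
            ++calls; // a real backend would issue its blend-function call here
        }
    }
    return calls;
}
```

Only the requests that change at least one field reach the driver; repeated identical requests are filtered out by the comparison.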

View File

@@ -1,247 +0,0 @@
// Copyright 2015 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <array>
#include <type_traits>
#include <glad/glad.h>
#include "video_core/engines/maxwell_3d.h"
namespace OpenGL {
class OpenGLState {
public:
struct {
bool enabled = false; // GL_FRAMEBUFFER_SRGB
} framebuffer_srgb;
struct {
bool alpha_to_coverage = false; // GL_SAMPLE_ALPHA_TO_COVERAGE
bool alpha_to_one = false; // GL_SAMPLE_ALPHA_TO_ONE
} multisample_control;
struct {
bool enabled = false; // GL_CLAMP_FRAGMENT_COLOR_ARB
} fragment_color_clamp;
struct {
bool far_plane = false;
bool near_plane = false;
} depth_clamp; // GL_DEPTH_CLAMP
struct {
bool enabled = false; // GL_CULL_FACE
GLenum mode = GL_BACK; // GL_CULL_FACE_MODE
GLenum front_face = GL_CCW; // GL_FRONT_FACE
} cull;
struct {
bool test_enabled = false; // GL_DEPTH_TEST
GLboolean write_mask = GL_TRUE; // GL_DEPTH_WRITEMASK
GLenum test_func = GL_LESS; // GL_DEPTH_FUNC
} depth;
struct {
bool enabled = false;
GLuint index = 0;
} primitive_restart; // GL_PRIMITIVE_RESTART
bool rasterizer_discard = false; // GL_RASTERIZER_DISCARD
struct ColorMask {
GLboolean red_enabled = GL_TRUE;
GLboolean green_enabled = GL_TRUE;
GLboolean blue_enabled = GL_TRUE;
GLboolean alpha_enabled = GL_TRUE;
};
std::array<ColorMask, Tegra::Engines::Maxwell3D::Regs::NumRenderTargets>
color_mask; // GL_COLOR_WRITEMASK
struct {
bool test_enabled = false; // GL_STENCIL_TEST
struct {
GLenum test_func = GL_ALWAYS; // GL_STENCIL_FUNC
GLint test_ref = 0; // GL_STENCIL_REF
GLuint test_mask = 0xFFFFFFFF; // GL_STENCIL_VALUE_MASK
GLuint write_mask = 0xFFFFFFFF; // GL_STENCIL_WRITEMASK
GLenum action_stencil_fail = GL_KEEP; // GL_STENCIL_FAIL
GLenum action_depth_fail = GL_KEEP; // GL_STENCIL_PASS_DEPTH_FAIL
GLenum action_depth_pass = GL_KEEP; // GL_STENCIL_PASS_DEPTH_PASS
} front, back;
} stencil;
struct Blend {
bool enabled = false; // GL_BLEND
GLenum rgb_equation = GL_FUNC_ADD; // GL_BLEND_EQUATION_RGB
GLenum a_equation = GL_FUNC_ADD; // GL_BLEND_EQUATION_ALPHA
GLenum src_rgb_func = GL_ONE; // GL_BLEND_SRC_RGB
GLenum dst_rgb_func = GL_ZERO; // GL_BLEND_DST_RGB
GLenum src_a_func = GL_ONE; // GL_BLEND_SRC_ALPHA
GLenum dst_a_func = GL_ZERO; // GL_BLEND_DST_ALPHA
};
std::array<Blend, Tegra::Engines::Maxwell3D::Regs::NumRenderTargets> blend;
struct {
bool enabled = false;
} independant_blend;
struct {
GLclampf red = 0.0f;
GLclampf green = 0.0f;
GLclampf blue = 0.0f;
GLclampf alpha = 0.0f;
} blend_color; // GL_BLEND_COLOR
struct {
bool enabled = false; // GL_COLOR_LOGIC_OP
GLenum operation = GL_COPY; // GL_LOGIC_OP_MODE
} logic_op;
static constexpr std::size_t NumSamplers = 32 * 5;
static constexpr std::size_t NumImages = 8 * 5;
std::array<GLuint, NumSamplers> textures = {};
std::array<GLuint, NumSamplers> samplers = {};
std::array<GLuint, NumImages> images = {};
struct {
GLuint read_framebuffer = 0; // GL_READ_FRAMEBUFFER_BINDING
GLuint draw_framebuffer = 0; // GL_DRAW_FRAMEBUFFER_BINDING
GLuint vertex_array = 0; // GL_VERTEX_ARRAY_BINDING
GLuint shader_program = 0; // GL_CURRENT_PROGRAM
GLuint program_pipeline = 0; // GL_PROGRAM_PIPELINE_BINDING
} draw;
struct Viewport {
GLint x = 0;
GLint y = 0;
GLint width = 0;
GLint height = 0;
GLfloat depth_range_near = 0.0f; // GL_DEPTH_RANGE
GLfloat depth_range_far = 1.0f; // GL_DEPTH_RANGE
struct {
bool enabled = false; // GL_SCISSOR_TEST
GLint x = 0;
GLint y = 0;
GLsizei width = 0;
GLsizei height = 0;
} scissor;
};
std::array<Viewport, Tegra::Engines::Maxwell3D::Regs::NumViewports> viewports;
struct {
bool program_control = false; // GL_PROGRAM_POINT_SIZE
bool sprite = false; // GL_POINT_SPRITE
GLfloat size = 1.0f; // GL_POINT_SIZE
} point;
struct {
bool point_enable = false;
bool line_enable = false;
bool fill_enable = false;
GLfloat units = 0.0f;
GLfloat factor = 0.0f;
GLfloat clamp = 0.0f;
} polygon_offset;
struct {
bool enabled = false; // GL_ALPHA_TEST
GLenum func = GL_ALWAYS; // GL_ALPHA_TEST_FUNC
GLfloat ref = 0.0f; // GL_ALPHA_TEST_REF
} alpha_test;
std::array<bool, 8> clip_distance = {}; // GL_CLIP_DISTANCE
struct {
GLenum origin = GL_LOWER_LEFT;
GLenum depth_mode = GL_NEGATIVE_ONE_TO_ONE;
} clip_control;
OpenGLState();
/// Get the currently active OpenGL state
static OpenGLState GetCurState() {
return cur_state;
}
void SetDefaultViewports();
/// Apply this state as the current OpenGL state
void Apply();
void ApplyFramebufferState();
void ApplyVertexArrayState();
void ApplyShaderProgram();
void ApplyProgramPipeline();
void ApplyClipDistances();
void ApplyPointSize();
void ApplyFragmentColorClamp();
void ApplyMultisample();
void ApplySRgb();
void ApplyCulling();
void ApplyRasterizerDiscard();
void ApplyColorMask();
void ApplyDepth();
void ApplyPrimitiveRestart();
void ApplyStencilTest();
void ApplyViewport();
void ApplyTargetBlending(std::size_t target, bool force);
void ApplyGlobalBlending();
void ApplyBlending();
void ApplyLogicOp();
void ApplyTextures();
void ApplySamplers();
void ApplyImages();
void ApplyDepthClamp();
void ApplyPolygonOffset();
void ApplyAlphaTest();
void ApplyClipControl();
/// Resets any references to the given resource
OpenGLState& UnbindTexture(GLuint handle);
OpenGLState& ResetSampler(GLuint handle);
OpenGLState& ResetProgram(GLuint handle);
OpenGLState& ResetPipeline(GLuint handle);
OpenGLState& ResetVertexArray(GLuint handle);
OpenGLState& ResetFramebuffer(GLuint handle);
/// Viewport does not affect glClearBuffer, so emulate the viewport using the scissor test
void EmulateViewportWithScissor();
void MarkDirtyBlendState() {
dirty.blend_state = true;
}
void MarkDirtyStencilState() {
dirty.stencil_state = true;
}
void MarkDirtyPolygonOffset() {
dirty.polygon_offset = true;
}
void MarkDirtyColorMask() {
dirty.color_mask = true;
}
void AllDirty() {
dirty.blend_state = true;
dirty.stencil_state = true;
dirty.polygon_offset = true;
dirty.color_mask = true;
}
private:
static OpenGLState cur_state;
struct {
bool blend_state;
bool stencil_state;
bool viewport_state;
bool polygon_offset;
bool color_mask;
} dirty{};
};
static_assert(std::is_trivially_copyable_v<OpenGLState>);
} // namespace OpenGL
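
The comment on `EmulateViewportWithScissor` notes that clears ignore the viewport, so the viewport has to be folded into the scissor rectangle. When a scissor is already enabled, the natural way to combine the two is a rectangle intersection. The sketch below is a generic illustration under that assumption, not the code from this file; `Rect` and `Intersect` are hypothetical names.

```cpp
#include <algorithm>

struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Intersection of two axis-aligned rectangles.
// The near edges take the maximum, the far edges take the minimum,
// and a non-overlapping pair collapses to zero width or height.
Rect Intersect(const Rect& a, const Rect& b) {
    const int left = std::max(a.x, b.x);
    const int bottom = std::max(a.y, b.y);
    const int right = std::min(a.x + a.width, b.x + b.width);
    const int top = std::min(a.y + a.height, b.y + b.height);
    return Rect{left, bottom, std::max(right - left, 0), std::max(top - bottom, 0)};
}
```

Passing the viewport as `a` and the current scissor as `b` yields the region a clear should be restricted to.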

View File

@@ -0,0 +1,238 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <array>
#include <cstddef>
#include "common/common_types.h"
#include "core/core.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/gpu.h"
#include "video_core/renderer_opengl/gl_state_tracker.h"
#define OFF(field_name) MAXWELL3D_REG_INDEX(field_name)
#define NUM(field_name) (sizeof(Maxwell3D::Regs::field_name) / sizeof(u32))
namespace OpenGL {
namespace {
using namespace Dirty;
using namespace VideoCommon::Dirty;
using Tegra::Engines::Maxwell3D;
using Regs = Maxwell3D::Regs;
using Tables = Maxwell3D::DirtyState::Tables;
using Table = Maxwell3D::DirtyState::Table;
void SetupDirtyColorMasks(Tables& tables) {
tables[0][OFF(color_mask_common)] = ColorMaskCommon;
for (std::size_t rt = 0; rt < Regs::NumRenderTargets; ++rt) {
const std::size_t offset = OFF(color_mask) + rt * NUM(color_mask[0]);
FillBlock(tables[0], offset, NUM(color_mask[0]), ColorMask0 + rt);
}
FillBlock(tables[1], OFF(color_mask), NUM(color_mask), ColorMasks);
}
void SetupDirtyVertexArrays(Tables& tables) {
static constexpr std::size_t num_array = 3;
static constexpr std::size_t instance_base_offset = 3;
for (std::size_t i = 0; i < Regs::NumVertexArrays; ++i) {
const std::size_t array_offset = OFF(vertex_array) + i * NUM(vertex_array[0]);
const std::size_t limit_offset = OFF(vertex_array_limit) + i * NUM(vertex_array_limit[0]);
FillBlock(tables, array_offset, num_array, VertexBuffer0 + i, VertexBuffers);
FillBlock(tables, limit_offset, NUM(vertex_array_limit), VertexBuffer0 + i, VertexBuffers);
const std::size_t instance_array_offset = array_offset + instance_base_offset;
tables[0][instance_array_offset] = static_cast<u8>(VertexInstance0 + i);
tables[1][instance_array_offset] = VertexInstances;
const std::size_t instance_offset = OFF(instanced_arrays) + i;
tables[0][instance_offset] = static_cast<u8>(VertexInstance0 + i);
tables[1][instance_offset] = VertexInstances;
}
}
void SetupDirtyVertexFormat(Tables& tables) {
for (std::size_t i = 0; i < Regs::NumVertexAttributes; ++i) {
const std::size_t offset = OFF(vertex_attrib_format) + i * NUM(vertex_attrib_format[0]);
FillBlock(tables[0], offset, NUM(vertex_attrib_format[0]), VertexFormat0 + i);
}
FillBlock(tables[1], OFF(vertex_attrib_format), Regs::NumVertexAttributes, VertexFormats);
}
void SetupDirtyViewports(Tables& tables) {
for (std::size_t i = 0; i < Regs::NumViewports; ++i) {
const std::size_t transf_offset = OFF(viewport_transform) + i * NUM(viewport_transform[0]);
const std::size_t viewport_offset = OFF(viewports) + i * NUM(viewports[0]);
FillBlock(tables[0], transf_offset, NUM(viewport_transform[0]), Viewport0 + i);
FillBlock(tables[0], viewport_offset, NUM(viewports[0]), Viewport0 + i);
}
FillBlock(tables[1], OFF(viewport_transform), NUM(viewport_transform), Viewports);
FillBlock(tables[1], OFF(viewports), NUM(viewports), Viewports);
tables[0][OFF(viewport_transform_enabled)] = ViewportTransform;
tables[1][OFF(viewport_transform_enabled)] = Viewports;
}
void SetupDirtyScissors(Tables& tables) {
for (std::size_t i = 0; i < Regs::NumViewports; ++i) {
const std::size_t offset = OFF(scissor_test) + i * NUM(scissor_test[0]);
FillBlock(tables[0], offset, NUM(scissor_test[0]), Scissor0 + i);
}
FillBlock(tables[1], OFF(scissor_test), NUM(scissor_test), Scissors);
}
void SetupDirtyShaders(Tables& tables) {
FillBlock(tables[0], OFF(shader_config[0]), NUM(shader_config[0]) * Regs::MaxShaderProgram,
Shaders);
}
void SetupDirtyDepthTest(Tables& tables) {
auto& table = tables[0];
table[OFF(depth_test_enable)] = DepthTest;
table[OFF(depth_write_enabled)] = DepthMask;
table[OFF(depth_test_func)] = DepthTest;
}
void SetupDirtyStencilTest(Tables& tables) {
static constexpr std::array offsets = {
OFF(stencil_enable), OFF(stencil_front_func_func), OFF(stencil_front_func_ref),
OFF(stencil_front_func_mask), OFF(stencil_front_op_fail), OFF(stencil_front_op_zfail),
OFF(stencil_front_op_zpass), OFF(stencil_front_mask), OFF(stencil_two_side_enable),
OFF(stencil_back_func_func), OFF(stencil_back_func_ref), OFF(stencil_back_func_mask),
OFF(stencil_back_op_fail), OFF(stencil_back_op_zfail), OFF(stencil_back_op_zpass),
OFF(stencil_back_mask)};
for (const auto offset : offsets) {
tables[0][offset] = StencilTest;
}
}
void SetupDirtyAlphaTest(Tables& tables) {
auto& table = tables[0];
table[OFF(alpha_test_ref)] = AlphaTest;
table[OFF(alpha_test_func)] = AlphaTest;
table[OFF(alpha_test_enabled)] = AlphaTest;
}
void SetupDirtyBlend(Tables& tables) {
FillBlock(tables[0], OFF(blend_color), NUM(blend_color), BlendColor);
tables[0][OFF(independent_blend_enable)] = BlendIndependentEnabled;
for (std::size_t i = 0; i < Regs::NumRenderTargets; ++i) {
const std::size_t offset = OFF(independent_blend) + i * NUM(independent_blend[0]);
FillBlock(tables[0], offset, NUM(independent_blend[0]), BlendState0 + i);
tables[0][OFF(blend.enable) + i] = static_cast<u8>(BlendState0 + i);
}
FillBlock(tables[1], OFF(independent_blend), NUM(independent_blend), BlendStates);
FillBlock(tables[1], OFF(blend), NUM(blend), BlendStates);
}
void SetupDirtyPrimitiveRestart(Tables& tables) {
FillBlock(tables[0], OFF(primitive_restart), NUM(primitive_restart), PrimitiveRestart);
}
void SetupDirtyPolygonOffset(Tables& tables) {
auto& table = tables[0];
table[OFF(polygon_offset_fill_enable)] = PolygonOffset;
table[OFF(polygon_offset_line_enable)] = PolygonOffset;
table[OFF(polygon_offset_point_enable)] = PolygonOffset;
table[OFF(polygon_offset_factor)] = PolygonOffset;
table[OFF(polygon_offset_units)] = PolygonOffset;
table[OFF(polygon_offset_clamp)] = PolygonOffset;
}
void SetupDirtyMultisampleControl(Tables& tables) {
FillBlock(tables[0], OFF(multisample_control), NUM(multisample_control), MultisampleControl);
}
void SetupDirtyRasterizeEnable(Tables& tables) {
tables[0][OFF(rasterize_enable)] = RasterizeEnable;
}
void SetupDirtyFramebufferSRGB(Tables& tables) {
tables[0][OFF(framebuffer_srgb)] = FramebufferSRGB;
}
void SetupDirtyLogicOp(Tables& tables) {
FillBlock(tables[0], OFF(logic_op), NUM(logic_op), LogicOp);
}
void SetupDirtyFragmentClampColor(Tables& tables) {
tables[0][OFF(frag_color_clamp)] = FragmentClampColor;
}
void SetupDirtyPointSize(Tables& tables) {
tables[0][OFF(vp_point_size)] = PointSize;
tables[0][OFF(point_size)] = PointSize;
tables[0][OFF(point_sprite_enable)] = PointSize;
}
void SetupDirtyClipControl(Tables& tables) {
auto& table = tables[0];
table[OFF(screen_y_control)] = ClipControl;
table[OFF(depth_mode)] = ClipControl;
}
void SetupDirtyDepthClampEnabled(Tables& tables) {
tables[0][OFF(view_volume_clip_control)] = DepthClampEnabled;
}
void SetupDirtyMisc(Tables& tables) {
auto& table = tables[0];
table[OFF(clip_distance_enabled)] = ClipDistances;
table[OFF(front_face)] = FrontFace;
table[OFF(cull_test_enabled)] = CullTest;
table[OFF(cull_face)] = CullTest;
}
} // Anonymous namespace
StateTracker::StateTracker(Core::System& system) : system{system} {}
void StateTracker::Initialize() {
auto& dirty = system.GPU().Maxwell3D().dirty;
auto& tables = dirty.tables;
SetupDirtyRenderTargets(tables);
SetupDirtyColorMasks(tables);
SetupDirtyViewports(tables);
SetupDirtyScissors(tables);
SetupDirtyVertexArrays(tables);
SetupDirtyVertexFormat(tables);
SetupDirtyShaders(tables);
SetupDirtyDepthTest(tables);
SetupDirtyStencilTest(tables);
SetupDirtyAlphaTest(tables);
SetupDirtyBlend(tables);
SetupDirtyPrimitiveRestart(tables);
SetupDirtyPolygonOffset(tables);
SetupDirtyMultisampleControl(tables);
SetupDirtyRasterizeEnable(tables);
SetupDirtyFramebufferSRGB(tables);
SetupDirtyLogicOp(tables);
SetupDirtyFragmentClampColor(tables);
SetupDirtyPointSize(tables);
SetupDirtyClipControl(tables);
SetupDirtyDepthClampEnabled(tables);
SetupDirtyMisc(tables);
auto& store = dirty.on_write_stores;
SetupCommonOnWriteStores(store);
store[VertexBuffers] = true;
for (std::size_t i = 0; i < Regs::NumVertexArrays; ++i) {
store[VertexBuffer0 + i] = true;
}
}
} // namespace OpenGL
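
The SetupDirty* functions above fill lookup tables that map a register offset to a dirty-flag index, with `tables[0]` holding the per-element flag and `tables[1]` the aggregate flag. The sketch below shows that idea in isolation. It is a simplified model under assumed sizes and offsets, not the Maxwell3D implementation: `NumRegs`, `NumFlags`, `DirtyState::OnWrite`, `MakeExample`, and the register numbers are all hypothetical, and flag 0 is assumed to mean "no flag".

```cpp
#include <algorithm>
#include <array>
#include <bitset>
#include <cstddef>
#include <cstdint>

constexpr std::size_t NumRegs = 64;  // hypothetical register count
constexpr std::size_t NumFlags = 16; // hypothetical flag count

using Table = std::array<std::uint8_t, NumRegs>;

enum : std::uint8_t { Null = 0, Viewports = 1, Viewport0 = 2 }; // Viewport1..3 follow

// Assign `flag` to `count` consecutive registers starting at `begin`.
void FillBlock(Table& table, std::size_t begin, std::size_t count, std::uint8_t flag) {
    std::fill_n(table.begin() + static_cast<std::ptrdiff_t>(begin), count, flag);
}

struct DirtyState {
    std::array<Table, 2> tables{}; // level 0: per element, level 1: aggregate
    std::bitset<NumFlags> flags;

    // Called on every register write: raise each flag mapped to that register.
    void OnWrite(std::size_t reg) {
        for (const Table& table : tables) {
            if (table[reg] != Null) {
                flags.set(table[reg]);
            }
        }
    }
};

// Four viewports of four registers each, starting at a made-up offset of 8.
DirtyState MakeExample() {
    constexpr std::size_t viewport_base = 8;
    constexpr std::size_t regs_per_viewport = 4;
    DirtyState state;
    for (std::size_t i = 0; i < 4; ++i) {
        FillBlock(state.tables[0], viewport_base + i * regs_per_viewport, regs_per_viewport,
                  static_cast<std::uint8_t>(Viewport0 + i));
    }
    FillBlock(state.tables[1], viewport_base, 4 * regs_per_viewport, Viewports);
    return state;
}
```

With two levels, a renderer can first test the aggregate flag and skip the per-viewport loop entirely when nothing in the group was written.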

View File

@@ -0,0 +1,204 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <limits>
#include <glad/glad.h>
#include "common/common_types.h"
#include "core/core.h"
#include "video_core/dirty_flags.h"
#include "video_core/engines/maxwell_3d.h"
namespace Core {
class System;
}
namespace OpenGL {
namespace Dirty {
enum : u8 {
First = VideoCommon::Dirty::LastCommonEntry,
VertexFormats,
VertexFormat0,
VertexFormat31 = VertexFormat0 + 31,
VertexBuffers,
VertexBuffer0,
VertexBuffer31 = VertexBuffer0 + 31,
VertexInstances,
VertexInstance0,
VertexInstance31 = VertexInstance0 + 31,
ViewportTransform,
Viewports,
Viewport0,
Viewport15 = Viewport0 + 15,
Scissors,
Scissor0,
Scissor15 = Scissor0 + 15,
ColorMaskCommon,
ColorMasks,
ColorMask0,
ColorMask7 = ColorMask0 + 7,
BlendColor,
BlendIndependentEnabled,
BlendStates,
BlendState0,
BlendState7 = BlendState0 + 7,
Shaders,
ClipDistances,
ColorMask,
FrontFace,
CullTest,
DepthMask,
DepthTest,
StencilTest,
AlphaTest,
PrimitiveRestart,
PolygonOffset,
MultisampleControl,
RasterizeEnable,
FramebufferSRGB,
LogicOp,
FragmentClampColor,
PointSize,
ClipControl,
DepthClampEnabled,
Last
};
static_assert(Last <= std::numeric_limits<u8>::max());
} // namespace Dirty
class StateTracker {
public:
explicit StateTracker(Core::System& system);
void Initialize();
void BindIndexBuffer(GLuint new_index_buffer) {
if (index_buffer == new_index_buffer) {
return;
}
index_buffer = new_index_buffer;
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, new_index_buffer);
}
void NotifyScreenDrawVertexArray() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::VertexFormats] = true;
flags[OpenGL::Dirty::VertexFormat0 + 0] = true;
flags[OpenGL::Dirty::VertexFormat0 + 1] = true;
flags[OpenGL::Dirty::VertexBuffers] = true;
flags[OpenGL::Dirty::VertexBuffer0] = true;
flags[OpenGL::Dirty::VertexInstances] = true;
flags[OpenGL::Dirty::VertexInstance0 + 0] = true;
flags[OpenGL::Dirty::VertexInstance0 + 1] = true;
}
void NotifyViewport0() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::Viewports] = true;
flags[OpenGL::Dirty::Viewport0] = true;
}
void NotifyScissor0() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::Scissors] = true;
flags[OpenGL::Dirty::Scissor0] = true;
}
void NotifyColorMask0() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::ColorMasks] = true;
flags[OpenGL::Dirty::ColorMask0] = true;
}
void NotifyBlend0() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::BlendStates] = true;
flags[OpenGL::Dirty::BlendState0] = true;
}
void NotifyFramebuffer() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[VideoCommon::Dirty::RenderTargets] = true;
}
void NotifyFrontFace() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::FrontFace] = true;
}
void NotifyCullTest() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::CullTest] = true;
}
void NotifyDepthMask() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::DepthMask] = true;
}
void NotifyDepthTest() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::DepthTest] = true;
}
void NotifyStencilTest() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::StencilTest] = true;
}
void NotifyPolygonOffset() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::PolygonOffset] = true;
}
void NotifyRasterizeEnable() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::RasterizeEnable] = true;
}
void NotifyFramebufferSRGB() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::FramebufferSRGB] = true;
}
void NotifyLogicOp() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::LogicOp] = true;
}
void NotifyClipControl() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::ClipControl] = true;
}
void NotifyAlphaTest() {
auto& flags = system.GPU().Maxwell3D().dirty.flags;
flags[OpenGL::Dirty::AlphaTest] = true;
}
private:
Core::System& system;
GLuint index_buffer = 0;
};
} // namespace OpenGL
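
The Notify* methods above exist for code paths that change OpenGL state directly, outside the normal register-driven flow: they mark the matching flag dirty so the next draw re-applies the guest's value. The sketch below models both sides of that contract. It is an illustration only; `FakeGL`, `Tracker`, `SyncFramebufferSRGB`, and `Blit` are hypothetical stand-ins, and the flag set is reduced to two entries.

```cpp
#include <bitset>
#include <cstddef>

enum Flag : std::size_t { FramebufferSRGB, Scissor0, NumFlags };

// Stand-in for the driver; counts how many state changes reach it.
struct FakeGL {
    bool srgb_enabled = false;
    int calls = 0;
    void SetSRGB(bool enabled) {
        srgb_enabled = enabled;
        ++calls;
    }
};

class Tracker {
public:
    explicit Tracker(std::bitset<NumFlags>& dirty_flags) : flags(dirty_flags) {}
    void NotifyFramebufferSRGB() {
        flags.set(FramebufferSRGB);
    }

private:
    std::bitset<NumFlags>& flags;
};

// Draw path: touch the driver only when the flag says the state may be stale.
void SyncFramebufferSRGB(std::bitset<NumFlags>& flags, FakeGL& gl, bool guest_value) {
    if (!flags.test(FramebufferSRGB)) {
        return;
    }
    flags.reset(FramebufferSRGB);
    gl.SetSRGB(guest_value);
}

// Out-of-band path (for example a blit): change driver state directly, then notify.
void Blit(Tracker& tracker, FakeGL& gl, bool srgb_conversion) {
    tracker.NotifyFramebufferSRGB();
    gl.SetSRGB(srgb_conversion);
}
```

Forgetting the notification in `Blit` would leave the flag clean, so the next draw would skip the sync and render with the blit's leftover state.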

View File

@@ -7,7 +7,6 @@
#include "common/alignment.h"
#include "common/assert.h"
#include "common/microprofile.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_stream_buffer.h"
MICROPROFILE_DEFINE(OpenGL_StreamBuffer, "OpenGL", "Stream Buffer Orphaning",

View File

@@ -10,7 +10,7 @@
#include "core/core.h"
#include "video_core/morton.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_state_tracker.h"
#include "video_core/renderer_opengl/gl_texture_cache.h"
#include "video_core/renderer_opengl/utils.h"
#include "video_core/texture_cache/surface_base.h"
@@ -397,6 +397,7 @@ CachedSurfaceView::CachedSurfaceView(CachedSurface& surface, const ViewParams& p
const bool is_proxy)
: VideoCommon::ViewBase(params), surface{surface}, is_proxy{is_proxy} {
target = GetTextureTarget(params.target);
format = GetFormatTuple(surface.GetSurfaceParams().pixel_format).internal_format;
if (!is_proxy) {
texture_view = CreateTextureView();
}
@@ -467,25 +468,20 @@ void CachedSurfaceView::ApplySwizzle(SwizzleSource x_source, SwizzleSource y_sou
}
OGLTextureView CachedSurfaceView::CreateTextureView() const {
const auto& owner_params = surface.GetSurfaceParams();
OGLTextureView texture_view;
texture_view.Create();
const GLuint handle{texture_view.handle};
const FormatTuple& tuple{GetFormatTuple(owner_params.pixel_format)};
glTextureView(handle, target, surface.texture.handle, tuple.internal_format, params.base_level,
glTextureView(texture_view.handle, target, surface.texture.handle, format, params.base_level,
params.num_levels, params.base_layer, params.num_layers);
ApplyTextureDefaults(owner_params, handle);
ApplyTextureDefaults(surface.GetSurfaceParams(), texture_view.handle);
return texture_view;
}
TextureCacheOpenGL::TextureCacheOpenGL(Core::System& system,
VideoCore::RasterizerInterface& rasterizer,
const Device& device)
: TextureCacheBase{system, rasterizer} {
const Device& device, StateTracker& state_tracker)
: TextureCacheBase{system, rasterizer}, state_tracker{state_tracker} {
src_framebuffer.Create();
dst_framebuffer.Create();
}
@@ -519,25 +515,26 @@ void TextureCacheOpenGL::ImageBlit(View& src_view, View& dst_view,
const Tegra::Engines::Fermi2D::Config& copy_config) {
const auto& src_params{src_view->GetSurfaceParams()};
const auto& dst_params{dst_view->GetSurfaceParams()};
OpenGLState prev_state{OpenGLState::GetCurState()};
SCOPE_EXIT({
prev_state.AllDirty();
prev_state.Apply();
});
OpenGLState state;
state.draw.read_framebuffer = src_framebuffer.handle;
state.draw.draw_framebuffer = dst_framebuffer.handle;
state.framebuffer_srgb.enabled = dst_params.srgb_conversion;
state.AllDirty();
state.Apply();
u32 buffers{};
UNIMPLEMENTED_IF(src_params.target == SurfaceTarget::Texture3D);
UNIMPLEMENTED_IF(dst_params.target == SurfaceTarget::Texture3D);
state_tracker.NotifyScissor0();
state_tracker.NotifyFramebuffer();
state_tracker.NotifyRasterizeEnable();
state_tracker.NotifyFramebufferSRGB();
if (dst_params.srgb_conversion) {
glEnable(GL_FRAMEBUFFER_SRGB);
} else {
glDisable(GL_FRAMEBUFFER_SRGB);
}
glDisable(GL_RASTERIZER_DISCARD);
glDisablei(GL_SCISSOR_TEST, 0);
glBindFramebuffer(GL_READ_FRAMEBUFFER, src_framebuffer.handle);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, dst_framebuffer.handle);
GLenum buffers = 0;
if (src_params.type == SurfaceType::ColorTexture) {
src_view->Attach(GL_COLOR_ATTACHMENT0, GL_READ_FRAMEBUFFER);
glFramebufferTexture2D(GL_READ_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D, 0,

View File

@@ -27,6 +27,7 @@ using VideoCommon::ViewParams;
class CachedSurfaceView;
class CachedSurface;
class TextureCacheOpenGL;
class StateTracker;
using Surface = std::shared_ptr<CachedSurface>;
using View = std::shared_ptr<CachedSurfaceView>;
@@ -96,6 +97,10 @@ public:
return texture_view.handle;
}
GLenum GetFormat() const {
return format;
}
const SurfaceParams& GetSurfaceParams() const {
return surface.GetSurfaceParams();
}
@@ -113,6 +118,7 @@ private:
CachedSurface& surface;
GLenum target{};
GLenum format{};
OGLTextureView texture_view;
u32 swizzle{};
@@ -122,7 +128,7 @@ private:
class TextureCacheOpenGL final : public TextureCacheBase {
public:
explicit TextureCacheOpenGL(Core::System& system, VideoCore::RasterizerInterface& rasterizer,
const Device& device);
const Device& device, StateTracker& state_tracker);
~TextureCacheOpenGL();
protected:
@@ -139,6 +145,8 @@ protected:
private:
GLuint FetchPBO(std::size_t buffer_size);
StateTracker& state_tracker;
OGLFramebuffer src_framebuffer;
OGLFramebuffer dst_framebuffer;
std::unordered_map<u32, OGLBuffer> copy_pbo_cache;

View File

@@ -92,8 +92,32 @@ inline GLenum VertexType(Maxwell::VertexAttribute attrib) {
}
case Maxwell::VertexAttribute::Type::UnsignedScaled:
switch (attrib.size) {
case Maxwell::VertexAttribute::Size::Size_8:
case Maxwell::VertexAttribute::Size::Size_8_8:
case Maxwell::VertexAttribute::Size::Size_8_8_8:
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return GL_UNSIGNED_BYTE;
case Maxwell::VertexAttribute::Size::Size_16:
case Maxwell::VertexAttribute::Size::Size_16_16:
case Maxwell::VertexAttribute::Size::Size_16_16_16:
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return GL_UNSIGNED_SHORT;
default:
LOG_ERROR(Render_OpenGL, "Unimplemented vertex size={}", attrib.SizeString());
return {};
}
case Maxwell::VertexAttribute::Type::SignedScaled:
switch (attrib.size) {
case Maxwell::VertexAttribute::Size::Size_8:
case Maxwell::VertexAttribute::Size::Size_8_8:
case Maxwell::VertexAttribute::Size::Size_8_8_8:
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return GL_BYTE;
case Maxwell::VertexAttribute::Size::Size_16:
case Maxwell::VertexAttribute::Size::Size_16_16:
case Maxwell::VertexAttribute::Size::Size_16_16_16:
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return GL_SHORT;
default:
LOG_ERROR(Render_OpenGL, "Unimplemented vertex size={}", attrib.SizeString());
return {};
@@ -401,24 +425,24 @@ inline GLenum StencilOp(Maxwell::StencilOp stencil) {
return GL_KEEP;
}
inline GLenum FrontFace(Maxwell::Cull::FrontFace front_face) {
inline GLenum FrontFace(Maxwell::FrontFace front_face) {
switch (front_face) {
case Maxwell::Cull::FrontFace::ClockWise:
case Maxwell::FrontFace::ClockWise:
return GL_CW;
case Maxwell::Cull::FrontFace::CounterClockWise:
case Maxwell::FrontFace::CounterClockWise:
return GL_CCW;
}
LOG_ERROR(Render_OpenGL, "Unimplemented front face cull={}", static_cast<u32>(front_face));
return GL_CCW;
}
inline GLenum CullFace(Maxwell::Cull::CullFace cull_face) {
inline GLenum CullFace(Maxwell::CullFace cull_face) {
switch (cull_face) {
case Maxwell::Cull::CullFace::Front:
case Maxwell::CullFace::Front:
return GL_FRONT;
case Maxwell::Cull::CullFace::Back:
case Maxwell::CullFace::Back:
return GL_BACK;
case Maxwell::Cull::CullFace::FrontAndBack:
case Maxwell::CullFace::FrontAndBack:
return GL_FRONT_AND_BACK;
}
LOG_ERROR(Render_OpenGL, "Unimplemented cull face={}", static_cast<u32>(cull_face));
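
Note on the scaled vertex types added above: the new `UnsignedScaled` and `SignedScaled` cases group attribute sizes by component width only, so every 8-bit layout maps to a byte type and every 16-bit layout to a short type, whatever the component count. A minimal standalone sketch of that grouping follows. `AttribSize`, `ComponentType`, and `ScaledComponentType` are hypothetical stand-ins for `Maxwell::VertexAttribute::Size`, the GL type constants, and `VertexType`; they are not names from the codebase.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-ins for Maxwell::VertexAttribute::Size and the GL type enums.
enum class AttribSize { S8, S8_8, S8_8_8, S8_8_8_8, S16, S16_16, S16_16_16, S16_16_16_16, S32 };
enum class ComponentType { Byte, UnsignedByte, Short, UnsignedShort };

// Returns the component type for a scaled attribute, or nullopt for unhandled sizes.
constexpr std::optional<ComponentType> ScaledComponentType(AttribSize size, bool is_signed) {
    switch (size) {
    case AttribSize::S8:
    case AttribSize::S8_8:
    case AttribSize::S8_8_8:
    case AttribSize::S8_8_8_8:
        return is_signed ? ComponentType::Byte : ComponentType::UnsignedByte;
    case AttribSize::S16:
    case AttribSize::S16_16:
    case AttribSize::S16_16_16:
    case AttribSize::S16_16_16_16:
        return is_signed ? ComponentType::Short : ComponentType::UnsignedShort;
    default:
        // The diff logs an error and returns a zero GLenum here.
        return std::nullopt;
    }
}

// Usage: a two-component unsigned 16-bit scaled attribute.
inline void ScaledTypeExample() {
    const auto type = ScaledComponentType(AttribSize::S16_16, false);
    assert(type == ComponentType::UnsignedShort);
}
```

The component count is carried separately by the attribute format call, which is why it does not appear in this mapping.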

View File

@@ -9,26 +9,163 @@
#include <glad/glad.h>
#include "common/assert.h"
#include "common/logging/log.h"
#include "common/microprofile.h"
#include "common/telemetry.h"
#include "core/core.h"
#include "core/core_timing.h"
#include "core/frontend/emu_window.h"
#include "core/frontend/scope_acquire_window_context.h"
#include "core/memory.h"
#include "core/perf_stats.h"
#include "core/settings.h"
#include "core/telemetry_session.h"
#include "video_core/morton.h"
#include "video_core/renderer_opengl/gl_rasterizer.h"
#include "video_core/renderer_opengl/gl_shader_manager.h"
#include "video_core/renderer_opengl/renderer_opengl.h"
namespace OpenGL {
// If the size of this is too small, it ends up creating a soft cap on FPS as the renderer will have
// to wait on available presentation frames.
constexpr std::size_t SWAP_CHAIN_SIZE = 3;
struct Frame {
u32 width{}; /// Width of the frame (to detect resize)
u32 height{}; /// Height of the frame
bool color_reloaded{}; /// Texture attachment was recreated (i.e. resized)
OpenGL::OGLRenderbuffer color{}; /// Buffer shared between the render/present FBO
OpenGL::OGLFramebuffer render{}; /// FBO created on the render thread
OpenGL::OGLFramebuffer present{}; /// FBO created on the present thread
GLsync render_fence{}; /// Fence created on the render thread
GLsync present_fence{}; /// Fence created on the presentation thread
bool is_srgb{}; /// Framebuffer is sRGB or RGB
};
/**
* For smooth Vsync rendering, we want to always present the latest frame that the core generates,
* but also make sure that rendering happens at the pace that the frontend dictates. This is a
* helper class that the renderer uses to sync frames between the render thread and the presentation
* thread
*/
class FrameMailbox {
public:
std::mutex swap_chain_lock;
std::condition_variable present_cv;
std::array<Frame, SWAP_CHAIN_SIZE> swap_chain{};
std::queue<Frame*> free_queue;
std::deque<Frame*> present_queue;
Frame* previous_frame{};
FrameMailbox() {
for (auto& frame : swap_chain) {
free_queue.push(&frame);
}
}
~FrameMailbox() {
// Lock the mutex, clear the present and free queues, and notify any blocked waiters to
// prevent a deadlock on shutdown
std::scoped_lock lock{swap_chain_lock};
std::queue<Frame*>().swap(free_queue);
present_queue.clear();
present_cv.notify_all();
}
void ReloadPresentFrame(Frame* frame, u32 width, u32 height) {
frame->present.Release();
frame->present.Create();
GLint previous_draw_fbo{};
glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, &previous_draw_fbo);
glBindFramebuffer(GL_FRAMEBUFFER, frame->present.handle);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER,
frame->color.handle);
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
LOG_CRITICAL(Render_OpenGL, "Failed to recreate present FBO!");
}
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, previous_draw_fbo);
frame->color_reloaded = false;
}
void ReloadRenderFrame(Frame* frame, u32 width, u32 height) {
// Recreate the color texture attachment
frame->color.Release();
frame->color.Create();
const GLenum internal_format = frame->is_srgb ? GL_SRGB8 : GL_RGB8;
glNamedRenderbufferStorage(frame->color.handle, internal_format, width, height);
// Recreate the FBO for the render target
frame->render.Release();
frame->render.Create();
glBindFramebuffer(GL_FRAMEBUFFER, frame->render.handle);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER,
frame->color.handle);
if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
LOG_CRITICAL(Render_OpenGL, "Failed to recreate render FBO!");
}
frame->width = width;
frame->height = height;
frame->color_reloaded = true;
}
Frame* GetRenderFrame() {
std::unique_lock lock{swap_chain_lock};
// If there are no free frames, reuse the oldest frame still waiting in the present queue
if (free_queue.empty()) {
auto frame = present_queue.back();
present_queue.pop_back();
return frame;
}
Frame* frame = free_queue.front();
free_queue.pop();
return frame;
}
void ReleaseRenderFrame(Frame* frame) {
std::unique_lock lock{swap_chain_lock};
present_queue.push_front(frame);
present_cv.notify_one();
}
Frame* TryGetPresentFrame(int timeout_ms) {
std::unique_lock lock{swap_chain_lock};
// wait for new entries in the present_queue
present_cv.wait_for(lock, std::chrono::milliseconds(timeout_ms),
[&] { return !present_queue.empty(); });
if (present_queue.empty()) {
// timed out waiting for a frame to draw so return the previous frame
return previous_frame;
}
// free the previous frame and add it back to the free queue
if (previous_frame) {
free_queue.push(previous_frame);
}
// the newest entries are pushed to the front of the queue
Frame* frame = present_queue.front();
present_queue.pop_front();
// remove all old entries from the present queue and move them back to the free_queue
for (auto f : present_queue) {
free_queue.push(f);
}
present_queue.clear();
previous_frame = frame;
return frame;
}
};
namespace {
constexpr char vertex_shader[] = R"(
constexpr char VERTEX_SHADER[] = R"(
#version 430 core
out gl_PerVertex {
vec4 gl_Position;
};
layout (location = 0) in vec2 vert_position;
layout (location = 1) in vec2 vert_tex_coord;
layout (location = 0) out vec2 frag_tex_coord;
@@ -49,7 +186,7 @@ void main() {
}
)";
constexpr char fragment_shader[] = R"(
constexpr char FRAGMENT_SHADER[] = R"(
#version 430 core
layout (location = 0) in vec2 frag_tex_coord;
@@ -58,7 +195,7 @@ layout (location = 0) out vec4 color;
layout (binding = 0) uniform sampler2D color_texture;
void main() {
color = texture(color_texture, frag_tex_coord);
color = vec4(texture(color_texture, frag_tex_coord).rgb, 1.0f);
}
)";
@@ -67,8 +204,8 @@ constexpr GLint TexCoordLocation = 1;
constexpr GLint ModelViewMatrixLocation = 0;
struct ScreenRectVertex {
constexpr ScreenRectVertex(GLfloat x, GLfloat y, GLfloat u, GLfloat v)
: position{{x, y}}, tex_coord{{u, v}} {}
constexpr ScreenRectVertex(u32 x, u32 y, GLfloat u, GLfloat v)
: position{{static_cast<GLfloat>(x), static_cast<GLfloat>(y)}}, tex_coord{{u, v}} {}
std::array<GLfloat, 2> position;
std::array<GLfloat, 2> tex_coord;
@@ -158,21 +295,81 @@ void APIENTRY DebugHandler(GLenum source, GLenum type, GLuint id, GLenum severit
} // Anonymous namespace
RendererOpenGL::RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system)
: VideoCore::RendererBase{emu_window}, emu_window{emu_window}, system{system} {}
: VideoCore::RendererBase{emu_window}, emu_window{emu_window}, system{system},
frame_mailbox{std::make_unique<FrameMailbox>()} {}
RendererOpenGL::~RendererOpenGL() = default;
void RendererOpenGL::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
// Maintain the rasterizer's state as a priority
OpenGLState prev_state = OpenGLState::GetCurState();
state.AllDirty();
state.Apply();
MICROPROFILE_DEFINE(OpenGL_RenderFrame, "OpenGL", "Render Frame", MP_RGB(128, 128, 64));
MICROPROFILE_DEFINE(OpenGL_WaitPresent, "OpenGL", "Wait For Present", MP_RGB(128, 128, 128));
void RendererOpenGL::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
render_window.PollEvents();
if (!framebuffer) {
return;
}
PrepareRendertarget(framebuffer);
RenderScreenshot();
Frame* frame;
{
MICROPROFILE_SCOPE(OpenGL_WaitPresent);
frame = frame_mailbox->GetRenderFrame();
// Clean up sync objects before drawing
// INTEL driver workaround. We can't delete the previous render sync object until we are
// sure that the presentation is done
if (frame->present_fence) {
glClientWaitSync(frame->present_fence, 0, GL_TIMEOUT_IGNORED);
}
// delete the draw fence if the frame wasn't presented
if (frame->render_fence) {
glDeleteSync(frame->render_fence);
frame->render_fence = 0;
}
// wait for the presentation to be done
if (frame->present_fence) {
glWaitSync(frame->present_fence, 0, GL_TIMEOUT_IGNORED);
glDeleteSync(frame->present_fence);
frame->present_fence = 0;
}
}
{
MICROPROFILE_SCOPE(OpenGL_RenderFrame);
const auto& layout = render_window.GetFramebufferLayout();
// Recreate the frame if the size of the window has changed
if (layout.width != frame->width || layout.height != frame->height ||
screen_info.display_srgb != frame->is_srgb) {
LOG_DEBUG(Render_OpenGL, "Reloading render frame");
frame->is_srgb = screen_info.display_srgb;
frame_mailbox->ReloadRenderFrame(frame, layout.width, layout.height);
}
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, frame->render.handle);
DrawScreen(layout);
// Create a fence for the frontend to wait on and hand this frame to the mailbox
frame->render_fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
glFlush();
frame_mailbox->ReleaseRenderFrame(frame);
m_current_frame++;
rasterizer->TickFrame();
}
}
void RendererOpenGL::PrepareRendertarget(const Tegra::FramebufferConfig* framebuffer) {
if (framebuffer) {
// If framebuffer is provided, reload it from memory to a texture
if (screen_info.texture.width != static_cast<GLsizei>(framebuffer->width) ||
screen_info.texture.height != static_cast<GLsizei>(framebuffer->height) ||
screen_info.texture.pixel_format != framebuffer->pixel_format) {
screen_info.texture.pixel_format != framebuffer->pixel_format ||
gl_framebuffer_data.empty()) {
// Reallocate texture if the framebuffer size has changed.
// This is expected to not happen very often and hence should not be a
// performance problem.
@@ -181,22 +378,7 @@ void RendererOpenGL::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
// Load the framebuffer from memory, draw it to the screen, and swap buffers
LoadFBToScreenInfo(*framebuffer);
if (renderer_settings.screenshot_requested)
CaptureScreenshot();
DrawScreen(render_window.GetFramebufferLayout());
rasterizer->TickFrame();
render_window.SwapBuffers();
}
render_window.PollEvents();
// Restore the rasterizer state
prev_state.AllDirty();
prev_state.Apply();
}
void RendererOpenGL::LoadFBToScreenInfo(const Tegra::FramebufferConfig& framebuffer) {
@@ -249,31 +431,24 @@ void RendererOpenGL::InitOpenGLObjects() {
glClearColor(Settings::values.bg_red, Settings::values.bg_green, Settings::values.bg_blue,
0.0f);
// Link shaders and get variable locations
shader.CreateFromSource(vertex_shader, nullptr, fragment_shader);
state.draw.shader_program = shader.handle;
state.AllDirty();
state.Apply();
// Create shader programs
OGLShader vertex_shader;
vertex_shader.Create(VERTEX_SHADER, GL_VERTEX_SHADER);
OGLShader fragment_shader;
fragment_shader.Create(FRAGMENT_SHADER, GL_FRAGMENT_SHADER);
vertex_program.Create(true, false, vertex_shader.handle);
fragment_program.Create(true, false, fragment_shader.handle);
// Create program pipeline
program_manager.Create();
// Generate VBO handle for drawing
vertex_buffer.Create();
// Generate VAO
vertex_array.Create();
state.draw.vertex_array = vertex_array.handle;
// Attach vertex data to VAO
glNamedBufferData(vertex_buffer.handle, sizeof(ScreenRectVertex) * 4, nullptr, GL_STREAM_DRAW);
glVertexArrayAttribFormat(vertex_array.handle, PositionLocation, 2, GL_FLOAT, GL_FALSE,
offsetof(ScreenRectVertex, position));
glVertexArrayAttribFormat(vertex_array.handle, TexCoordLocation, 2, GL_FLOAT, GL_FALSE,
offsetof(ScreenRectVertex, tex_coord));
glVertexArrayAttribBinding(vertex_array.handle, PositionLocation, 0);
glVertexArrayAttribBinding(vertex_array.handle, TexCoordLocation, 0);
glEnableVertexArrayAttrib(vertex_array.handle, PositionLocation);
glEnableVertexArrayAttrib(vertex_array.handle, TexCoordLocation);
glVertexArrayVertexBuffer(vertex_array.handle, 0, vertex_buffer.handle, 0,
sizeof(ScreenRectVertex));
// Allocate textures for the screen
screen_info.texture.resource.Create(GL_TEXTURE_2D);
@@ -306,7 +481,8 @@ void RendererOpenGL::CreateRasterizer() {
if (rasterizer) {
return;
}
rasterizer = std::make_unique<RasterizerOpenGL>(system, emu_window, screen_info);
rasterizer = std::make_unique<RasterizerOpenGL>(system, emu_window, screen_info,
program_manager, state_tracker);
}
void RendererOpenGL::ConfigureFramebufferTexture(TextureInfo& texture,
@@ -345,8 +521,19 @@ void RendererOpenGL::ConfigureFramebufferTexture(TextureInfo& texture,
glTextureStorage2D(texture.resource.handle, 1, internal_format, texture.width, texture.height);
}
void RendererOpenGL::DrawScreenTriangles(const ScreenInfo& screen_info, float x, float y, float w,
float h) {
void RendererOpenGL::DrawScreen(const Layout::FramebufferLayout& layout) {
if (renderer_settings.set_background_color) {
// Update background color before drawing
glClearColor(Settings::values.bg_red, Settings::values.bg_green, Settings::values.bg_blue,
0.0f);
}
// Set projection matrix
const std::array ortho_matrix =
MakeOrthographicMatrix(static_cast<float>(layout.width), static_cast<float>(layout.height));
glProgramUniformMatrix3x2fv(vertex_program.handle, ModelViewMatrixLocation, 1, GL_FALSE,
std::data(ortho_matrix));
const auto& texcoords = screen_info.display_texcoords;
auto left = texcoords.left;
auto right = texcoords.right;
@@ -378,60 +565,127 @@ void RendererOpenGL::DrawScreenTriangles(const ScreenInfo& screen_info, float x,
static_cast<f32>(screen_info.texture.height);
}
const auto& screen = layout.screen;
const std::array vertices = {
ScreenRectVertex(x, y, texcoords.top * scale_u, left * scale_v),
ScreenRectVertex(x + w, y, texcoords.bottom * scale_u, left * scale_v),
ScreenRectVertex(x, y + h, texcoords.top * scale_u, right * scale_v),
ScreenRectVertex(x + w, y + h, texcoords.bottom * scale_u, right * scale_v),
ScreenRectVertex(screen.left, screen.top, texcoords.top * scale_u, left * scale_v),
ScreenRectVertex(screen.right, screen.top, texcoords.bottom * scale_u, left * scale_v),
ScreenRectVertex(screen.left, screen.bottom, texcoords.top * scale_u, right * scale_v),
ScreenRectVertex(screen.right, screen.bottom, texcoords.bottom * scale_u, right * scale_v),
};
state.textures[0] = screen_info.display_texture;
state.framebuffer_srgb.enabled = screen_info.display_srgb;
state.AllDirty();
state.Apply();
glNamedBufferSubData(vertex_buffer.handle, 0, sizeof(vertices), std::data(vertices));
// TODO: Signal state tracker about these changes
state_tracker.NotifyScreenDrawVertexArray();
state_tracker.NotifyViewport0();
state_tracker.NotifyScissor0();
state_tracker.NotifyColorMask0();
state_tracker.NotifyBlend0();
state_tracker.NotifyFramebuffer();
state_tracker.NotifyFrontFace();
state_tracker.NotifyCullTest();
state_tracker.NotifyDepthTest();
state_tracker.NotifyStencilTest();
state_tracker.NotifyPolygonOffset();
state_tracker.NotifyRasterizeEnable();
state_tracker.NotifyFramebufferSRGB();
state_tracker.NotifyLogicOp();
state_tracker.NotifyClipControl();
state_tracker.NotifyAlphaTest();
program_manager.UseVertexShader(vertex_program.handle);
program_manager.UseGeometryShader(0);
program_manager.UseFragmentShader(fragment_program.handle);
program_manager.BindGraphicsPipeline();
glEnable(GL_CULL_FACE);
if (screen_info.display_srgb) {
glEnable(GL_FRAMEBUFFER_SRGB);
} else {
glDisable(GL_FRAMEBUFFER_SRGB);
}
glDisable(GL_COLOR_LOGIC_OP);
glDisable(GL_DEPTH_TEST);
glDisable(GL_STENCIL_TEST);
glDisable(GL_POLYGON_OFFSET_FILL);
glDisable(GL_RASTERIZER_DISCARD);
glDisable(GL_ALPHA_TEST);
glDisablei(GL_BLEND, 0);
glDisablei(GL_SCISSOR_TEST, 0);
glCullFace(GL_BACK);
glFrontFace(GL_CW);
glColorMaski(0, GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glClipControl(GL_LOWER_LEFT, GL_ZERO_TO_ONE);
glViewportIndexedf(0, 0.0f, 0.0f, static_cast<GLfloat>(layout.width),
static_cast<GLfloat>(layout.height));
glDepthRangeIndexed(0, 0.0, 0.0);
glEnableVertexAttribArray(PositionLocation);
glEnableVertexAttribArray(TexCoordLocation);
glVertexAttribDivisor(PositionLocation, 0);
glVertexAttribDivisor(TexCoordLocation, 0);
glVertexAttribFormat(PositionLocation, 2, GL_FLOAT, GL_FALSE,
offsetof(ScreenRectVertex, position));
glVertexAttribFormat(TexCoordLocation, 2, GL_FLOAT, GL_FALSE,
offsetof(ScreenRectVertex, tex_coord));
glVertexAttribBinding(PositionLocation, 0);
glVertexAttribBinding(TexCoordLocation, 0);
glBindVertexBuffer(0, vertex_buffer.handle, 0, sizeof(ScreenRectVertex));
glBindTextureUnit(0, screen_info.display_texture);
glBindSampler(0, 0);
glClear(GL_COLOR_BUFFER_BIT);
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
// Restore default state
state.framebuffer_srgb.enabled = false;
state.textures[0] = 0;
state.AllDirty();
state.Apply();
}
void RendererOpenGL::DrawScreen(const Layout::FramebufferLayout& layout) {
if (renderer_settings.set_background_color) {
// Update background color before drawing
glClearColor(Settings::values.bg_red, Settings::values.bg_green, Settings::values.bg_blue,
0.0f);
void RendererOpenGL::TryPresent(int timeout_ms) {
const auto& layout = render_window.GetFramebufferLayout();
auto frame = frame_mailbox->TryGetPresentFrame(timeout_ms);
if (!frame) {
LOG_DEBUG(Render_OpenGL, "TryGetPresentFrame returned no frame to present");
return;
}
const auto& screen = layout.screen;
glViewport(0, 0, layout.width, layout.height);
// Clearing before a full overwrite of an FBO can signal to drivers that they can avoid a
// readback since we won't be doing any blending
glClear(GL_COLOR_BUFFER_BIT);
// Set projection matrix
const std::array ortho_matrix =
MakeOrthographicMatrix(static_cast<float>(layout.width), static_cast<float>(layout.height));
glUniformMatrix3x2fv(ModelViewMatrixLocation, 1, GL_FALSE, ortho_matrix.data());
// Recreate the presentation FBO if the color attachment was changed
if (frame->color_reloaded) {
LOG_DEBUG(Render_OpenGL, "Reloading present frame");
frame_mailbox->ReloadPresentFrame(frame, layout.width, layout.height);
}
glWaitSync(frame->render_fence, 0, GL_TIMEOUT_IGNORED);
// INTEL workaround.
// Normally we could just delete the draw fence here, but due to driver bugs we instead delete
// it on the emulation thread, which carries little penalty
// glDeleteSync(frame.render_sync);
// frame.render_sync = 0;
DrawScreenTriangles(screen_info, static_cast<float>(screen.left),
static_cast<float>(screen.top), static_cast<float>(screen.GetWidth()),
static_cast<float>(screen.GetHeight()));
glBindFramebuffer(GL_READ_FRAMEBUFFER, frame->present.handle);
glBlitFramebuffer(0, 0, frame->width, frame->height, 0, 0, layout.width, layout.height,
GL_COLOR_BUFFER_BIT, GL_LINEAR);
m_current_frame++;
// Insert fence for the main thread to block on
frame->present_fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
glFlush();
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
}
void RendererOpenGL::UpdateFramerate() {}
void RendererOpenGL::RenderScreenshot() {
if (!renderer_settings.screenshot_requested) {
return;
}
GLint old_read_fb;
GLint old_draw_fb;
glGetIntegerv(GL_READ_FRAMEBUFFER_BINDING, &old_read_fb);
glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, &old_draw_fb);
void RendererOpenGL::CaptureScreenshot() {
// Draw the current frame to the screenshot framebuffer
screenshot_framebuffer.Create();
GLuint old_read_fb = state.draw.read_framebuffer;
GLuint old_draw_fb = state.draw.draw_framebuffer;
state.draw.read_framebuffer = state.draw.draw_framebuffer = screenshot_framebuffer.handle;
state.AllDirty();
state.Apply();
glBindFramebuffer(GL_FRAMEBUFFER, screenshot_framebuffer.handle);
Layout::FramebufferLayout layout{renderer_settings.screenshot_framebuffer_layout};
@@ -448,19 +702,16 @@ void RendererOpenGL::CaptureScreenshot() {
renderer_settings.screenshot_bits);
screenshot_framebuffer.Release();
state.draw.read_framebuffer = old_read_fb;
state.draw.draw_framebuffer = old_draw_fb;
state.AllDirty();
state.Apply();
glDeleteRenderbuffers(1, &renderbuffer);
glBindFramebuffer(GL_READ_FRAMEBUFFER, old_read_fb);
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, old_draw_fb);
renderer_settings.screenshot_complete_callback();
renderer_settings.screenshot_requested = false;
}
bool RendererOpenGL::Init() {
Core::Frontend::ScopeAcquireWindowContext acquire_context{render_window};
if (GLAD_GL_KHR_debug) {
glEnable(GL_DEBUG_OUTPUT);
glDebugMessageCallback(DebugHandler, nullptr);
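
Note on the `FrameMailbox` added in this file: the render thread always gets a frame to draw into (stealing the oldest queued one if none is free), while the present thread takes the newest queued frame, recycles anything older, and falls back to the previously shown frame on timeout. The sketch below reproduces only that queue logic, without any GL objects or fences. `SketchFrame` and `SketchMailbox` are hypothetical names, and the `assert` in `GetRenderFrame` is an added assumption (a single producer that releases each frame before requesting the next), not something the diff checks.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <queue>

// Hypothetical stand-in for Frame; the GL objects and fences are omitted.
struct SketchFrame {
    int id{};
};

class SketchMailbox {
public:
    SketchMailbox() {
        for (std::size_t i = 0; i < frames.size(); ++i) {
            frames[i].id = static_cast<int>(i);
            free_queue.push(&frames[i]);
        }
    }

    // Producer: take a free frame, or steal the oldest queued frame if none is free.
    SketchFrame* GetRenderFrame() {
        std::unique_lock lock{mutex};
        if (free_queue.empty()) {
            assert(!present_queue.empty()); // assumption: single, well-behaved producer
            SketchFrame* frame = present_queue.back();
            present_queue.pop_back();
            return frame;
        }
        SketchFrame* frame = free_queue.front();
        free_queue.pop();
        return frame;
    }

    // Producer: publish a finished frame; the newest frame goes to the front.
    void ReleaseRenderFrame(SketchFrame* frame) {
        std::unique_lock lock{mutex};
        present_queue.push_front(frame);
        present_cv.notify_one();
    }

    // Consumer: wait up to `timeout` for a new frame, otherwise repeat the previous one.
    SketchFrame* TryGetPresentFrame(std::chrono::milliseconds timeout) {
        std::unique_lock lock{mutex};
        present_cv.wait_for(lock, timeout, [this] { return !present_queue.empty(); });
        if (present_queue.empty()) {
            return previous_frame;
        }
        if (previous_frame) {
            free_queue.push(previous_frame);
        }
        SketchFrame* frame = present_queue.front();
        present_queue.pop_front();
        // Anything still queued is older than `frame`; recycle it.
        for (SketchFrame* stale : present_queue) {
            free_queue.push(stale);
        }
        present_queue.clear();
        previous_frame = frame;
        return frame;
    }

private:
    std::mutex mutex;
    std::condition_variable present_cv;
    std::array<SketchFrame, 3> frames{};
    std::queue<SketchFrame*> free_queue;
    std::deque<SketchFrame*> present_queue;
    SketchFrame* previous_frame{};
};
```

With this shape the producer never blocks on the consumer, which matches the comment on `SWAP_CHAIN_SIZE` about avoiding a soft cap on FPS; the cost is that intermediate frames can be dropped.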

View File

@@ -10,7 +10,8 @@
#include "common/math_util.h"
#include "video_core/renderer_base.h"
#include "video_core/renderer_opengl/gl_resource_manager.h"
#include "video_core/renderer_opengl/gl_state.h"
#include "video_core/renderer_opengl/gl_shader_manager.h"
#include "video_core/renderer_opengl/gl_state_tracker.h"
namespace Core {
class System;
@@ -44,19 +45,23 @@ struct ScreenInfo {
TextureInfo texture;
};
struct PresentationTexture {
u32 width = 0;
u32 height = 0;
OGLTexture texture;
};
class FrameMailbox;
class RendererOpenGL final : public VideoCore::RendererBase {
public:
explicit RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system);
~RendererOpenGL() override;
/// Swap buffers (render frame)
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
/// Initialize the renderer
bool Init() override;
/// Shutdown the renderer
void ShutDown() override;
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
void TryPresent(int timeout_ms) override;
private:
/// Initializes the OpenGL state and creates persistent objects.
@@ -72,12 +77,7 @@ private:
/// Draws the emulated screens to the emulator window.
void DrawScreen(const Layout::FramebufferLayout& layout);
void DrawScreenTriangles(const ScreenInfo& screen_info, float x, float y, float w, float h);
/// Updates the framerate.
void UpdateFramerate();
void CaptureScreenshot();
void RenderScreenshot();
/// Loads framebuffer from emulated memory into the active OpenGL texture.
void LoadFBToScreenInfo(const Tegra::FramebufferConfig& framebuffer);
@@ -87,26 +87,34 @@ private:
void LoadColorToActiveGLTexture(u8 color_r, u8 color_g, u8 color_b, u8 color_a,
const TextureInfo& texture);
void PrepareRendertarget(const Tegra::FramebufferConfig* framebuffer);
Core::Frontend::EmuWindow& emu_window;
Core::System& system;
OpenGLState state;
StateTracker state_tracker{system};
// OpenGL object IDs
OGLVertexArray vertex_array;
OGLBuffer vertex_buffer;
OGLProgram shader;
OGLProgram vertex_program;
OGLProgram fragment_program;
OGLFramebuffer screenshot_framebuffer;
/// Display information for Switch screen
ScreenInfo screen_info;
/// Global dummy shader pipeline
GLShader::ProgramManager program_manager;
/// OpenGL framebuffer data
std::vector<u8> gl_framebuffer_data;
/// Used for transforming the framebuffer orientation
Tegra::FramebufferConfig::TransformFlags framebuffer_transform_flags;
Common::Rectangle<int> framebuffer_crop_rect;
/// Frame presentation mailbox
std::unique_ptr<FrameMailbox> frame_mailbox;
};
} // namespace OpenGL
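
Note on `StateTracker`: this header swaps the `OpenGLState` member for a `StateTracker`, whose internals are not part of the hunks shown here. The `Notify*` calls in `DrawScreen` suggest a dirty-flag scheme, where code that changes GL state behind the rasterizer's back marks that state dirty so it is re-applied on the next draw. The following is a rough sketch of that idea only. Every name in it (`SketchStateTracker`, `DirtyFlag`, `Exchange`, `SketchRasterizer`) is hypothetical and does not reflect the real class.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical flag set; the real tracker covers far more state.
enum DirtyFlag : std::size_t { Viewport0, Scissor0, FrontFace, DepthTest, Count };

class SketchStateTracker {
public:
    // Everything starts dirty so the first sync applies all state.
    SketchStateTracker() {
        flags.set();
    }

    // Called by code that touched the state directly (e.g. a screen-draw pass).
    void Notify(DirtyFlag flag) {
        flags.set(flag);
    }

    // Returns whether the state must be re-applied, and clears the flag.
    bool Exchange(DirtyFlag flag) {
        const bool was_dirty = flags.test(flag);
        flags.reset(flag);
        return was_dirty;
    }

private:
    std::bitset<Count> flags;
};

// Counts how often the (pretend) driver call would be issued.
struct SketchRasterizer {
    SketchStateTracker& tracker;
    int front_face_calls = 0;

    void SyncFrontFace() {
        if (!tracker.Exchange(FrontFace)) {
            return; // cached state is still valid, skip the driver call
        }
        ++front_face_calls; // a real backend would call glFrontFace here
    }
};

// Usage: the second sync is skipped until someone notifies the tracker.
inline void StateTrackerExample() {
    SketchStateTracker tracker;
    SketchRasterizer rasterizer{tracker};
    rasterizer.SyncFrontFace();
    rasterizer.SyncFrontFace();
    assert(rasterizer.front_face_calls == 1);
}
```

Compared with the removed `state.AllDirty(); state.Apply();` pairs, a scheme like this only pays for state that actually changed.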

View File

@@ -9,6 +9,7 @@
#include <glad/glad.h>
#include "common/common_types.h"
#include "video_core/renderer_opengl/gl_state_tracker.h"
#include "video_core/renderer_opengl/utils.h"
namespace OpenGL {
@@ -20,12 +21,12 @@ struct VertexArrayPushBuffer::Entry {
GLsizei stride{};
};
VertexArrayPushBuffer::VertexArrayPushBuffer() = default;
VertexArrayPushBuffer::VertexArrayPushBuffer(StateTracker& state_tracker)
: state_tracker{state_tracker} {}
VertexArrayPushBuffer::~VertexArrayPushBuffer() = default;
void VertexArrayPushBuffer::Setup(GLuint vao_) {
vao = vao_;
void VertexArrayPushBuffer::Setup() {
index_buffer = nullptr;
vertex_buffers.clear();
}
@@ -41,13 +42,11 @@ void VertexArrayPushBuffer::SetVertexBuffer(GLuint binding_index, const GLuint*
void VertexArrayPushBuffer::Bind() {
if (index_buffer) {
glVertexArrayElementBuffer(vao, *index_buffer);
state_tracker.BindIndexBuffer(*index_buffer);
}
// TODO(Rodrigo): Find a way to ARB_multi_bind this
for (const auto& entry : vertex_buffers) {
glVertexArrayVertexBuffer(vao, entry.binding_index, *entry.buffer, entry.offset,
entry.stride);
glBindVertexBuffer(entry.binding_index, *entry.buffer, entry.offset, entry.stride);
}
}

View File

@@ -11,12 +11,14 @@
namespace OpenGL {
class StateTracker;
class VertexArrayPushBuffer final {
public:
explicit VertexArrayPushBuffer();
explicit VertexArrayPushBuffer(StateTracker& state_tracker);
~VertexArrayPushBuffer();
void Setup(GLuint vao_);
void Setup();
void SetIndexBuffer(const GLuint* buffer);
@@ -28,7 +30,8 @@ public:
private:
struct Entry;
GLuint vao{};
StateTracker& state_tracker;
const GLuint* index_buffer{};
std::vector<Entry> vertex_buffers;
};

View File

@@ -112,19 +112,18 @@ constexpr FixedPipelineState::Rasterizer GetRasterizerState(const Maxwell& regs)
const auto& clip = regs.view_volume_clip_control;
const bool depth_clamp_enabled = clip.depth_clamp_near == 1 || clip.depth_clamp_far == 1;
Maxwell::Cull::FrontFace front_face = regs.cull.front_face;
Maxwell::FrontFace front_face = regs.front_face;
if (regs.screen_y_control.triangle_rast_flip != 0 &&
regs.viewport_transform[0].scale_y > 0.0f) {
if (front_face == Maxwell::Cull::FrontFace::CounterClockWise)
front_face = Maxwell::Cull::FrontFace::ClockWise;
else if (front_face == Maxwell::Cull::FrontFace::ClockWise)
front_face = Maxwell::Cull::FrontFace::CounterClockWise;
if (front_face == Maxwell::FrontFace::CounterClockWise)
front_face = Maxwell::FrontFace::ClockWise;
else if (front_face == Maxwell::FrontFace::ClockWise)
front_face = Maxwell::FrontFace::CounterClockWise;
}
const bool gl_ndc = regs.depth_mode == Maxwell::DepthMode::MinusOneToOne;
return FixedPipelineState::Rasterizer(regs.cull.enabled, depth_bias_enabled,
depth_clamp_enabled, gl_ndc, regs.cull.cull_face,
front_face);
return FixedPipelineState::Rasterizer(regs.cull_test_enabled, depth_bias_enabled,
depth_clamp_enabled, gl_ndc, regs.cull_face, front_face);
}
} // Anonymous namespace
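
Note on the front-face handling in `GetRasterizerState`: the winding order is inverted only when `triangle_rast_flip` is set and viewport 0 has a positive Y scale; otherwise the register value passes through unchanged. A small standalone sketch of that rule, using a hypothetical `Winding` enum and `EffectiveFrontFace` helper in place of `Maxwell::FrontFace` and the inline code:

```cpp
#include <cassert>

// Hypothetical stand-in for Maxwell::FrontFace.
enum class Winding { ClockWise, CounterClockWise };

constexpr Winding Flip(Winding winding) {
    return winding == Winding::ClockWise ? Winding::CounterClockWise : Winding::ClockWise;
}

// Mirrors the condition in the diff: flip only when triangle_rast_flip is set
// and the Y scale of viewport 0 is positive.
constexpr Winding EffectiveFrontFace(Winding front_face, bool triangle_rast_flip,
                                     float viewport_scale_y) {
    if (triangle_rast_flip && viewport_scale_y > 0.0f) {
        return Flip(front_face);
    }
    return front_face;
}

// Usage: a flipped rasterization with positive Y scale inverts the winding.
inline void FrontFaceExample() {
    assert(EffectiveFrontFace(Winding::ClockWise, true, 1.0f) == Winding::CounterClockWise);
}
```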

View File

@@ -171,8 +171,8 @@ struct FixedPipelineState {
struct Rasterizer {
constexpr Rasterizer(bool cull_enable, bool depth_bias_enable, bool depth_clamp_enable,
bool ndc_minus_one_to_one, Maxwell::Cull::CullFace cull_face,
Maxwell::Cull::FrontFace front_face)
bool ndc_minus_one_to_one, Maxwell::CullFace cull_face,
Maxwell::FrontFace front_face)
: cull_enable{cull_enable}, depth_bias_enable{depth_bias_enable},
depth_clamp_enable{depth_clamp_enable}, ndc_minus_one_to_one{ndc_minus_one_to_one},
cull_face{cull_face}, front_face{front_face} {}
@@ -182,8 +182,8 @@ struct FixedPipelineState {
bool depth_bias_enable;
bool depth_clamp_enable;
bool ndc_minus_one_to_one;
Maxwell::Cull::CullFace cull_face;
Maxwell::Cull::FrontFace front_face;
Maxwell::CullFace cull_face;
Maxwell::FrontFace front_face;
std::size_t Hash() const noexcept;

View File

@@ -120,7 +120,7 @@ struct FormatTuple {
{vk::Format::eA8B8G8R8UintPack32, Attachable | Storage}, // ABGR8UI
{vk::Format::eB5G6R5UnormPack16, {}}, // B5G6R5U
{vk::Format::eA2B10G10R10UnormPack32, Attachable | Storage}, // A2B10G10R10U
{vk::Format::eA1R5G5B5UnormPack16, Attachable | Storage}, // A1B5G5R5U (flipped with swizzle)
{vk::Format::eA1R5G5B5UnormPack16, Attachable}, // A1B5G5R5U (flipped with swizzle)
{vk::Format::eR8Unorm, Attachable | Storage}, // R8U
{vk::Format::eR8Uint, Attachable | Storage}, // R8UI
{vk::Format::eR16G16B16A16Sfloat, Attachable | Storage}, // RGBA16F
@@ -371,8 +371,22 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
}
case Maxwell::VertexAttribute::Type::UnsignedScaled:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Uscaled;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Uscaled;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Uscaled;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Uscaled;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Uscaled;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Uscaled;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Uscaled;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Uscaled;
default:
break;
}
@@ -572,24 +586,24 @@ vk::BlendFactor BlendFactor(Maxwell::Blend::Factor factor) {
return {};
}
vk::FrontFace FrontFace(Maxwell::Cull::FrontFace front_face) {
vk::FrontFace FrontFace(Maxwell::FrontFace front_face) {
switch (front_face) {
case Maxwell::Cull::FrontFace::ClockWise:
case Maxwell::FrontFace::ClockWise:
return vk::FrontFace::eClockwise;
case Maxwell::Cull::FrontFace::CounterClockWise:
case Maxwell::FrontFace::CounterClockWise:
return vk::FrontFace::eCounterClockwise;
}
UNIMPLEMENTED_MSG("Unimplemented front face={}", static_cast<u32>(front_face));
return {};
}
vk::CullModeFlags CullFace(Maxwell::Cull::CullFace cull_face) {
vk::CullModeFlags CullFace(Maxwell::CullFace cull_face) {
switch (cull_face) {
case Maxwell::Cull::CullFace::Front:
case Maxwell::CullFace::Front:
return vk::CullModeFlagBits::eFront;
case Maxwell::Cull::CullFace::Back:
case Maxwell::CullFace::Back:
return vk::CullModeFlagBits::eBack;
case Maxwell::Cull::CullFace::FrontAndBack:
case Maxwell::CullFace::FrontAndBack:
return vk::CullModeFlagBits::eFrontAndBack;
}
UNIMPLEMENTED_MSG("Unimplemented cull face={}", static_cast<u32>(cull_face));

View File

@@ -54,9 +54,9 @@ vk::BlendOp BlendEquation(Maxwell::Blend::Equation equation);
vk::BlendFactor BlendFactor(Maxwell::Blend::Factor factor);
vk::FrontFace FrontFace(Maxwell::Cull::FrontFace front_face);
vk::FrontFace FrontFace(Maxwell::FrontFace front_face);
vk::CullModeFlags CullFace(Maxwell::Cull::CullFace cull_face);
vk::CullModeFlags CullFace(Maxwell::CullFace cull_face);
vk::ComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle);

View File

@@ -27,6 +27,7 @@
#include "video_core/renderer_vulkan/vk_rasterizer.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#include "video_core/renderer_vulkan/vk_swapchain.h"
namespace Vulkan {
@@ -106,8 +107,14 @@ RendererVulkan::~RendererVulkan() {
}
void RendererVulkan::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
render_window.PollEvents();
if (!framebuffer) {
return;
}
const auto& layout = render_window.GetFramebufferLayout();
if (framebuffer && layout.width > 0 && layout.height > 0 && render_window.IsShown()) {
if (layout.width > 0 && layout.height > 0 && render_window.IsShown()) {
const VAddr framebuffer_addr = framebuffer->address + framebuffer->offset;
const bool use_accelerated =
rasterizer->AccelerateDisplay(*framebuffer, framebuffer_addr, framebuffer->stride);
@@ -128,13 +135,16 @@ void RendererVulkan::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
blit_screen->Recreate();
}
render_window.SwapBuffers();
rasterizer->TickFrame();
}
render_window.PollEvents();
}
void RendererVulkan::TryPresent(int /*timeout_ms*/) {
// TODO (bunnei): ImplementMe
}
bool RendererVulkan::Init() {
PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr{};
render_window.RetrieveVulkanHandlers(&vkGetInstanceProcAddr, &instance, &surface);
@@ -168,10 +178,13 @@ bool RendererVulkan::Init() {
swapchain = std::make_unique<VKSwapchain>(surface, *device);
swapchain->Create(framebuffer.width, framebuffer.height, false);
scheduler = std::make_unique<VKScheduler>(*device, *resource_manager);
state_tracker = std::make_unique<StateTracker>(system);
scheduler = std::make_unique<VKScheduler>(*device, *resource_manager, *state_tracker);
rasterizer = std::make_unique<RasterizerVulkan>(system, render_window, screen_info, *device,
*resource_manager, *memory_manager, *scheduler);
*resource_manager, *memory_manager,
*state_tracker, *scheduler);
blit_screen = std::make_unique<VKBlitScreen>(system, render_window, *rasterizer, *device,
*resource_manager, *memory_manager, *swapchain,
@@ -262,4 +275,4 @@ void RendererVulkan::Report() const {
telemetry_session.AddField(field, "GPU_Vulkan_Extensions", extensions);
}
} // namespace Vulkan
} // namespace Vulkan

View File

@@ -4,8 +4,10 @@
#pragma once
#include <memory>
#include <optional>
#include <vector>
#include "video_core/renderer_base.h"
#include "video_core/renderer_vulkan/declarations.h"
@@ -15,6 +17,7 @@ class System;
namespace Vulkan {
class StateTracker;
class VKBlitScreen;
class VKDevice;
class VKFence;
@@ -36,14 +39,10 @@ public:
explicit RendererVulkan(Core::Frontend::EmuWindow& window, Core::System& system);
~RendererVulkan() override;
/// Swap buffers (render frame)
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
/// Initialize the renderer
bool Init() override;
/// Shutdown the renderer
void ShutDown() override;
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
void TryPresent(int timeout_ms) override;
private:
std::optional<vk::DebugUtilsMessengerEXT> CreateDebugCallback(
@@ -65,6 +64,7 @@ private:
std::unique_ptr<VKSwapchain> swapchain;
std::unique_ptr<VKMemoryManager> memory_manager;
std::unique_ptr<VKResourceManager> resource_manager;
std::unique_ptr<StateTracker> state_tracker;
std::unique_ptr<VKScheduler> scheduler;
std::unique_ptr<VKBlitScreen> blit_screen;
};

View File

@@ -73,7 +73,7 @@ UniqueDescriptorUpdateTemplate VKComputePipeline::CreateDescriptorUpdateTemplate
std::vector<vk::DescriptorUpdateTemplateEntry> template_entries;
u32 binding = 0;
u32 offset = 0;
FillDescriptorUpdateTemplateEntries(device, entries, binding, offset, template_entries);
FillDescriptorUpdateTemplateEntries(entries, binding, offset, template_entries);
if (template_entries.empty()) {
// If the shader doesn't use descriptor sets, skip template creation.
return UniqueDescriptorUpdateTemplate{};

View File

@@ -97,8 +97,7 @@ UniqueDescriptorUpdateTemplate VKGraphicsPipeline::CreateDescriptorUpdateTemplat
u32 offset = 0;
for (const auto& stage : program) {
if (stage) {
FillDescriptorUpdateTemplateEntries(device, stage->entries, binding, offset,
template_entries);
FillDescriptorUpdateTemplateEntries(stage->entries, binding, offset, template_entries);
}
}
if (template_entries.empty()) {

View File

@@ -36,6 +36,13 @@ using Tegra::Engines::ShaderType;
namespace {
// C++20's using enum
constexpr auto eUniformBuffer = vk::DescriptorType::eUniformBuffer;
constexpr auto eStorageBuffer = vk::DescriptorType::eStorageBuffer;
constexpr auto eUniformTexelBuffer = vk::DescriptorType::eUniformTexelBuffer;
constexpr auto eCombinedImageSampler = vk::DescriptorType::eCombinedImageSampler;
constexpr auto eStorageImage = vk::DescriptorType::eStorageImage;
constexpr VideoCommon::Shader::CompilerSettings compiler_settings{
VideoCommon::Shader::CompileDepth::FullDecompile};
@@ -119,23 +126,32 @@ ShaderType GetShaderType(Maxwell::ShaderProgram program) {
}
}
template <vk::DescriptorType descriptor_type, class Container>
void AddBindings(std::vector<vk::DescriptorSetLayoutBinding>& bindings, u32& binding,
vk::ShaderStageFlags stage_flags, const Container& container) {
const u32 num_entries = static_cast<u32>(std::size(container));
for (std::size_t i = 0; i < num_entries; ++i) {
u32 count = 1;
if constexpr (descriptor_type == eCombinedImageSampler) {
// Combined image samplers can be arrayed.
count = container[i].Size();
}
bindings.emplace_back(binding++, descriptor_type, count, stage_flags, nullptr);
}
}
u32 FillDescriptorLayout(const ShaderEntries& entries,
std::vector<vk::DescriptorSetLayoutBinding>& bindings,
Maxwell::ShaderProgram program_type, u32 base_binding) {
const ShaderType stage = GetStageFromProgram(program_type);
const vk::ShaderStageFlags stage_flags = MaxwellToVK::ShaderStage(stage);
const vk::ShaderStageFlags flags = MaxwellToVK::ShaderStage(stage);
u32 binding = base_binding;
const auto AddBindings = [&](vk::DescriptorType descriptor_type, std::size_t num_entries) {
for (std::size_t i = 0; i < num_entries; ++i) {
bindings.emplace_back(binding++, descriptor_type, 1, stage_flags, nullptr);
}
};
AddBindings(vk::DescriptorType::eUniformBuffer, entries.const_buffers.size());
AddBindings(vk::DescriptorType::eStorageBuffer, entries.global_buffers.size());
AddBindings(vk::DescriptorType::eUniformTexelBuffer, entries.texel_buffers.size());
AddBindings(vk::DescriptorType::eCombinedImageSampler, entries.samplers.size());
AddBindings(vk::DescriptorType::eStorageImage, entries.images.size());
AddBindings<eUniformBuffer>(bindings, binding, flags, entries.const_buffers);
AddBindings<eStorageBuffer>(bindings, binding, flags, entries.global_buffers);
AddBindings<eUniformTexelBuffer>(bindings, binding, flags, entries.texel_buffers);
AddBindings<eCombinedImageSampler>(bindings, binding, flags, entries.samplers);
AddBindings<eStorageImage>(bindings, binding, flags, entries.images);
return binding;
}
@@ -172,11 +188,6 @@ VKPipelineCache::~VKPipelineCache() = default;
std::array<Shader, Maxwell::MaxShaderProgram> VKPipelineCache::GetShaders() {
const auto& gpu = system.GPU().Maxwell3D();
auto& dirty = system.GPU().Maxwell3D().dirty.shaders;
if (!dirty) {
return last_shaders;
}
dirty = false;
std::array<Shader, Maxwell::MaxShaderProgram> shaders;
for (std::size_t index = 0; index < Maxwell::MaxShaderProgram; ++index) {
@@ -361,32 +372,45 @@ VKPipelineCache::DecompileShaders(const GraphicsPipelineCacheKey& key) {
return {std::move(program), std::move(bindings)};
}
void FillDescriptorUpdateTemplateEntries(
const VKDevice& device, const ShaderEntries& entries, u32& binding, u32& offset,
std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries) {
static constexpr auto entry_size = static_cast<u32>(sizeof(DescriptorUpdateEntry));
const auto AddEntry = [&](vk::DescriptorType descriptor_type, std::size_t count_) {
const u32 count = static_cast<u32>(count_);
if (descriptor_type == vk::DescriptorType::eUniformTexelBuffer &&
device.GetDriverID() == vk::DriverIdKHR::eNvidiaProprietary) {
// Nvidia has a bug where updating multiple uniform texels at once causes the driver to
// crash.
for (u32 i = 0; i < count; ++i) {
template_entries.emplace_back(binding + i, 0, 1, descriptor_type,
offset + i * entry_size, entry_size);
}
} else if (count != 0) {
template_entries.emplace_back(binding, 0, count, descriptor_type, offset, entry_size);
}
offset += count * entry_size;
binding += count;
};
template <vk::DescriptorType descriptor_type, class Container>
void AddEntry(std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries, u32& binding,
u32& offset, const Container& container) {
static constexpr u32 entry_size = static_cast<u32>(sizeof(DescriptorUpdateEntry));
const u32 count = static_cast<u32>(std::size(container));
AddEntry(vk::DescriptorType::eUniformBuffer, entries.const_buffers.size());
AddEntry(vk::DescriptorType::eStorageBuffer, entries.global_buffers.size());
AddEntry(vk::DescriptorType::eUniformTexelBuffer, entries.texel_buffers.size());
AddEntry(vk::DescriptorType::eCombinedImageSampler, entries.samplers.size());
AddEntry(vk::DescriptorType::eStorageImage, entries.images.size());
if constexpr (descriptor_type == eCombinedImageSampler) {
for (u32 i = 0; i < count; ++i) {
const u32 num_samplers = container[i].Size();
template_entries.emplace_back(binding, 0, num_samplers, descriptor_type, offset,
entry_size);
++binding;
offset += num_samplers * entry_size;
}
return;
}
if constexpr (descriptor_type == eUniformTexelBuffer) {
// Nvidia has a bug where updating multiple uniform texels at once causes the driver to
// crash.
for (u32 i = 0; i < count; ++i) {
template_entries.emplace_back(binding + i, 0, 1, descriptor_type,
offset + i * entry_size, entry_size);
}
} else if (count > 0) {
template_entries.emplace_back(binding, 0, count, descriptor_type, offset, entry_size);
}
offset += count * entry_size;
binding += count;
}
void FillDescriptorUpdateTemplateEntries(
const ShaderEntries& entries, u32& binding, u32& offset,
std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries) {
AddEntry<eUniformBuffer>(template_entries, offset, binding, entries.const_buffers);
AddEntry<eStorageBuffer>(template_entries, offset, binding, entries.global_buffers);
AddEntry<eUniformTexelBuffer>(template_entries, offset, binding, entries.texel_buffers);
AddEntry<eCombinedImageSampler>(template_entries, offset, binding, entries.samplers);
AddEntry<eStorageImage>(template_entries, offset, binding, entries.images);
}
} // namespace Vulkan
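
Note on the hunk above: the reworked `AddEntry` gives every combined image sampler its own template entry, with a descriptor count equal to the sampler's array size, while the other descriptor types still advance binding and offset by a plain element count. The sketch below illustrates only that bookkeeping, under stated assumptions: `TemplateEntry`, `kEntrySize`, `AddPlainEntries` and `AddSamplerEntries` are hypothetical stand-ins, not the Vulkan-Hpp types or the project's helpers, and the 32-byte entry size is arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for vk::DescriptorUpdateTemplateEntry (illustration only).
struct TemplateEntry {
    std::uint32_t binding; // first binding written by this entry
    std::uint32_t count;   // number of descriptors written
    std::uint32_t offset;  // byte offset into the update payload
    std::uint32_t stride;  // byte distance between consecutive descriptors
};

// Assumed payload element size; the diff uses sizeof(DescriptorUpdateEntry) instead.
constexpr std::uint32_t kEntrySize = 32;

// Non-arrayed descriptors: one entry covers `count` consecutive bindings.
inline void AddPlainEntries(std::vector<TemplateEntry>& out, std::uint32_t& binding,
                            std::uint32_t& offset, std::uint32_t count) {
    if (count > 0) {
        out.push_back({binding, count, offset, kEntrySize});
    }
    binding += count;
    offset += count * kEntrySize;
}

// Arrayed samplers: one entry per binding, each holding `size` descriptors.
inline void AddSamplerEntries(std::vector<TemplateEntry>& out, std::uint32_t& binding,
                              std::uint32_t& offset,
                              const std::vector<std::uint32_t>& array_sizes) {
    for (const std::uint32_t size : array_sizes) {
        assert(size > 0);
        out.push_back({binding, size, offset, kEntrySize});
        ++binding;                   // the whole array occupies a single binding
        offset += size * kEntrySize; // but `size` slots in the payload
    }
}
```

With two plain descriptors followed by samplers of array sizes 1 and 3, the binding counter ends at 4 while the payload offset ends at six entry sizes. This is why binding and offset have to be tracked separately once arrays are allowed.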

View File

@@ -194,7 +194,7 @@ private:
};
void FillDescriptorUpdateTemplateEntries(
const VKDevice& device, const ShaderEntries& entries, u32& binding, u32& offset,
const ShaderEntries& entries, u32& binding, u32& offset,
std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries);
} // namespace Vulkan

View File

@@ -36,6 +36,7 @@
#include "video_core/renderer_vulkan/vk_sampler_cache.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#include "video_core/renderer_vulkan/vk_texture_cache.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
@@ -105,17 +106,20 @@ void TransitionImages(const std::vector<ImageView>& views, vk::PipelineStageFlag
template <typename Engine, typename Entry>
Tegra::Texture::FullTextureInfo GetTextureInfo(const Engine& engine, const Entry& entry,
std::size_t stage) {
std::size_t stage, std::size_t index = 0) {
const auto stage_type = static_cast<Tegra::Engines::ShaderType>(stage);
if (entry.IsBindless()) {
const Tegra::Texture::TextureHandle tex_handle =
engine.AccessConstBuffer32(stage_type, entry.GetBuffer(), entry.GetOffset());
return engine.GetTextureInfo(tex_handle);
}
const auto& gpu_profile = engine.AccessGuestDriverProfile();
const u32 entry_offset = static_cast<u32>(index * gpu_profile.GetTextureHandlerSize());
const u32 offset = entry.GetOffset() + entry_offset;
if constexpr (std::is_same_v<Engine, Tegra::Engines::Maxwell3D>) {
return engine.GetStageTexture(stage_type, entry.GetOffset());
return engine.GetStageTexture(stage_type, offset);
} else {
return engine.GetTexture(entry.GetOffset());
return engine.GetTexture(offset);
}
}
@@ -277,10 +281,11 @@ void RasterizerVulkan::DrawParameters::Draw(vk::CommandBuffer cmdbuf,
RasterizerVulkan::RasterizerVulkan(Core::System& system, Core::Frontend::EmuWindow& renderer,
VKScreenInfo& screen_info, const VKDevice& device,
VKResourceManager& resource_manager,
VKMemoryManager& memory_manager, VKScheduler& scheduler)
VKMemoryManager& memory_manager, StateTracker& state_tracker,
VKScheduler& scheduler)
: RasterizerAccelerated{system.Memory()}, system{system}, render_window{renderer},
screen_info{screen_info}, device{device}, resource_manager{resource_manager},
memory_manager{memory_manager}, scheduler{scheduler},
memory_manager{memory_manager}, state_tracker{state_tracker}, scheduler{scheduler},
staging_pool(device, memory_manager, scheduler), descriptor_pool(device),
update_descriptor_queue(device, scheduler),
quad_array_pass(device, scheduler, descriptor_pool, staging_pool, update_descriptor_queue),
@@ -545,6 +550,10 @@ bool RasterizerVulkan::AccelerateDisplay(const Tegra::FramebufferConfig& config,
return true;
}
void RasterizerVulkan::SetupDirtyFlags() {
state_tracker.Initialize();
}
void RasterizerVulkan::FlushWork() {
static constexpr u32 DRAWS_TO_DISPATCH = 4096;
@@ -568,9 +577,9 @@ void RasterizerVulkan::FlushWork() {
RasterizerVulkan::Texceptions RasterizerVulkan::UpdateAttachments() {
MICROPROFILE_SCOPE(Vulkan_RenderTargets);
auto& dirty = system.GPU().Maxwell3D().dirty;
const bool update_rendertargets = dirty.render_settings;
dirty.render_settings = false;
auto& dirty = system.GPU().Maxwell3D().dirty.flags;
const bool update_rendertargets = dirty[VideoCommon::Dirty::RenderTargets];
dirty[VideoCommon::Dirty::RenderTargets] = false;
texture_cache.GuardRenderTargets(true);
@@ -720,13 +729,13 @@ void RasterizerVulkan::SetupImageTransitions(
}
void RasterizerVulkan::UpdateDynamicStates() {
auto& gpu = system.GPU().Maxwell3D();
UpdateViewportsState(gpu);
UpdateScissorsState(gpu);
UpdateDepthBias(gpu);
UpdateBlendConstants(gpu);
UpdateDepthBounds(gpu);
UpdateStencilFaces(gpu);
auto& regs = system.GPU().Maxwell3D().regs;
UpdateViewportsState(regs);
UpdateScissorsState(regs);
UpdateDepthBias(regs);
UpdateBlendConstants(regs);
UpdateDepthBounds(regs);
UpdateStencilFaces(regs);
}
void RasterizerVulkan::SetupVertexArrays(FixedPipelineState::VertexInput& vertex_input,
@@ -836,8 +845,10 @@ void RasterizerVulkan::SetupGraphicsTextures(const ShaderEntries& entries, std::
MICROPROFILE_SCOPE(Vulkan_Textures);
const auto& gpu = system.GPU().Maxwell3D();
for (const auto& entry : entries.samplers) {
const auto texture = GetTextureInfo(gpu, entry, stage);
SetupTexture(texture, entry);
for (std::size_t i = 0; i < entry.Size(); ++i) {
const auto texture = GetTextureInfo(gpu, entry, stage, i);
SetupTexture(texture, entry);
}
}
}
@@ -886,8 +897,10 @@ void RasterizerVulkan::SetupComputeTextures(const ShaderEntries& entries) {
MICROPROFILE_SCOPE(Vulkan_Textures);
const auto& gpu = system.GPU().KeplerCompute();
for (const auto& entry : entries.samplers) {
const auto texture = GetTextureInfo(gpu, entry, ComputeShaderIndex);
SetupTexture(texture, entry);
for (std::size_t i = 0; i < entry.Size(); ++i) {
const auto texture = GetTextureInfo(gpu, entry, ComputeShaderIndex, i);
SetupTexture(texture, entry);
}
}
}
@@ -972,12 +985,10 @@ void RasterizerVulkan::SetupImage(const Tegra::Texture::TICEntry& tic, const Ima
image_views.push_back(ImageView{std::move(view), image_layout});
}
void RasterizerVulkan::UpdateViewportsState(Tegra::Engines::Maxwell3D& gpu) {
if (!gpu.dirty.viewport_transform && scheduler.TouchViewports()) {
void RasterizerVulkan::UpdateViewportsState(Tegra::Engines::Maxwell3D::Regs& regs) {
if (!state_tracker.TouchViewports()) {
return;
}
gpu.dirty.viewport_transform = false;
const auto& regs = gpu.regs;
const std::array viewports{
GetViewportState(device, regs, 0), GetViewportState(device, regs, 1),
GetViewportState(device, regs, 2), GetViewportState(device, regs, 3),
@@ -992,12 +1003,10 @@ void RasterizerVulkan::UpdateViewportsState(Tegra::Engines::Maxwell3D& gpu) {
});
}
void RasterizerVulkan::UpdateScissorsState(Tegra::Engines::Maxwell3D& gpu) {
if (!gpu.dirty.scissor_test && scheduler.TouchScissors()) {
void RasterizerVulkan::UpdateScissorsState(Tegra::Engines::Maxwell3D::Regs& regs) {
if (!state_tracker.TouchScissors()) {
return;
}
gpu.dirty.scissor_test = false;
const auto& regs = gpu.regs;
const std::array scissors = {
GetScissorState(regs, 0), GetScissorState(regs, 1), GetScissorState(regs, 2),
GetScissorState(regs, 3), GetScissorState(regs, 4), GetScissorState(regs, 5),
@@ -1010,46 +1019,39 @@ void RasterizerVulkan::UpdateScissorsState(Tegra::Engines::Maxwell3D& gpu) {
});
}
void RasterizerVulkan::UpdateDepthBias(Tegra::Engines::Maxwell3D& gpu) {
if (!gpu.dirty.polygon_offset && scheduler.TouchDepthBias()) {
void RasterizerVulkan::UpdateDepthBias(Tegra::Engines::Maxwell3D::Regs& regs) {
if (!state_tracker.TouchDepthBias()) {
return;
}
gpu.dirty.polygon_offset = false;
const auto& regs = gpu.regs;
scheduler.Record([constant = regs.polygon_offset_units, clamp = regs.polygon_offset_clamp,
factor = regs.polygon_offset_factor](auto cmdbuf, auto& dld) {
cmdbuf.setDepthBias(constant, clamp, factor / 2.0f, dld);
});
}
void RasterizerVulkan::UpdateBlendConstants(Tegra::Engines::Maxwell3D& gpu) {
if (!gpu.dirty.blend_state && scheduler.TouchBlendConstants()) {
void RasterizerVulkan::UpdateBlendConstants(Tegra::Engines::Maxwell3D::Regs& regs) {
if (!state_tracker.TouchBlendConstants()) {
return;
}
gpu.dirty.blend_state = false;
const std::array blend_color = {gpu.regs.blend_color.r, gpu.regs.blend_color.g,
gpu.regs.blend_color.b, gpu.regs.blend_color.a};
const std::array blend_color = {regs.blend_color.r, regs.blend_color.g, regs.blend_color.b,
regs.blend_color.a};
scheduler.Record([blend_color](auto cmdbuf, auto& dld) {
cmdbuf.setBlendConstants(blend_color.data(), dld);
});
}
void RasterizerVulkan::UpdateDepthBounds(Tegra::Engines::Maxwell3D& gpu) {
if (!gpu.dirty.depth_bounds_values && scheduler.TouchDepthBounds()) {
void RasterizerVulkan::UpdateDepthBounds(Tegra::Engines::Maxwell3D::Regs& regs) {
if (!state_tracker.TouchDepthBounds()) {
return;
}
gpu.dirty.depth_bounds_values = false;
const auto& regs = gpu.regs;
scheduler.Record([min = regs.depth_bounds[0], max = regs.depth_bounds[1]](
auto cmdbuf, auto& dld) { cmdbuf.setDepthBounds(min, max, dld); });
}
void RasterizerVulkan::UpdateStencilFaces(Tegra::Engines::Maxwell3D& gpu) {
if (!gpu.dirty.stencil_test && scheduler.TouchStencilValues()) {
void RasterizerVulkan::UpdateStencilFaces(Tegra::Engines::Maxwell3D::Regs& regs) {
if (!state_tracker.TouchStencilProperties()) {
return;
}
gpu.dirty.stencil_test = false;
const auto& regs = gpu.regs;
if (regs.stencil_two_side_enable) {
// Separate values per face
scheduler.Record(

View File

@@ -96,6 +96,7 @@ struct hash<Vulkan::FramebufferCacheKey> {
namespace Vulkan {
class StateTracker;
class BufferBindings;
struct ImageView {
@@ -108,7 +109,7 @@ public:
explicit RasterizerVulkan(Core::System& system, Core::Frontend::EmuWindow& render_window,
VKScreenInfo& screen_info, const VKDevice& device,
VKResourceManager& resource_manager, VKMemoryManager& memory_manager,
VKScheduler& scheduler);
StateTracker& state_tracker, VKScheduler& scheduler);
~RasterizerVulkan() override;
void Draw(bool is_indexed, bool is_instanced) override;
@@ -127,6 +128,7 @@ public:
const Tegra::Engines::Fermi2D::Config& copy_config) override;
bool AccelerateDisplay(const Tegra::FramebufferConfig& config, VAddr framebuffer_addr,
u32 pixel_stride) override;
void SetupDirtyFlags() override;
/// Maximum supported size that a constbuffer can have in bytes.
static constexpr std::size_t MaxConstbufferSize = 0x10000;
@@ -215,12 +217,12 @@ private:
void SetupImage(const Tegra::Texture::TICEntry& tic, const ImageEntry& entry);
void UpdateViewportsState(Tegra::Engines::Maxwell3D& gpu);
void UpdateScissorsState(Tegra::Engines::Maxwell3D& gpu);
void UpdateDepthBias(Tegra::Engines::Maxwell3D& gpu);
void UpdateBlendConstants(Tegra::Engines::Maxwell3D& gpu);
void UpdateDepthBounds(Tegra::Engines::Maxwell3D& gpu);
void UpdateStencilFaces(Tegra::Engines::Maxwell3D& gpu);
void UpdateViewportsState(Tegra::Engines::Maxwell3D::Regs& regs);
void UpdateScissorsState(Tegra::Engines::Maxwell3D::Regs& regs);
void UpdateDepthBias(Tegra::Engines::Maxwell3D::Regs& regs);
void UpdateBlendConstants(Tegra::Engines::Maxwell3D::Regs& regs);
void UpdateDepthBounds(Tegra::Engines::Maxwell3D::Regs& regs);
void UpdateStencilFaces(Tegra::Engines::Maxwell3D::Regs& regs);
std::size_t CalculateGraphicsStreamBufferSize(bool is_indexed) const;
@@ -241,6 +243,7 @@ private:
const VKDevice& device;
VKResourceManager& resource_manager;
VKMemoryManager& memory_manager;
StateTracker& state_tracker;
VKScheduler& scheduler;
VKStagingBufferPool staging_pool;

View File

@@ -2,6 +2,12 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <memory>
#include <mutex>
#include <optional>
#include <thread>
#include <utility>
#include "common/assert.h"
#include "common/microprofile.h"
#include "video_core/renderer_vulkan/declarations.h"
@@ -9,6 +15,7 @@
#include "video_core/renderer_vulkan/vk_query_cache.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
namespace Vulkan {
@@ -29,9 +36,10 @@ void VKScheduler::CommandChunk::ExecuteAll(vk::CommandBuffer cmdbuf,
last = nullptr;
}
VKScheduler::VKScheduler(const VKDevice& device, VKResourceManager& resource_manager)
: device{device}, resource_manager{resource_manager}, next_fence{
&resource_manager.CommitFence()} {
VKScheduler::VKScheduler(const VKDevice& device, VKResourceManager& resource_manager,
StateTracker& state_tracker)
: device{device}, resource_manager{resource_manager}, state_tracker{state_tracker},
next_fence{&resource_manager.CommitFence()} {
AcquireNewChunk();
AllocateNewContext();
worker_thread = std::thread(&VKScheduler::WorkerThread, this);
@@ -157,12 +165,7 @@ void VKScheduler::AllocateNewContext() {
void VKScheduler::InvalidateState() {
state.graphics_pipeline = nullptr;
state.viewports = false;
state.scissors = false;
state.depth_bias = false;
state.blend_constants = false;
state.depth_bounds = false;
state.stencil_values = false;
state_tracker.InvalidateCommandBufferState();
}
void VKScheduler::EndPendingOperations() {

View File

@@ -17,6 +17,7 @@
namespace Vulkan {
class StateTracker;
class VKDevice;
class VKFence;
class VKQueryCache;
@@ -43,7 +44,8 @@ private:
/// OpenGL-like operations on Vulkan command buffers.
class VKScheduler {
public:
explicit VKScheduler(const VKDevice& device, VKResourceManager& resource_manager);
explicit VKScheduler(const VKDevice& device, VKResourceManager& resource_manager,
StateTracker& state_tracker);
~VKScheduler();
/// Sends the current execution context to the GPU.
@@ -74,36 +76,6 @@ public:
query_cache = &query_cache_;
}
/// Returns true when viewports have been set in the current command buffer.
bool TouchViewports() {
return std::exchange(state.viewports, true);
}
/// Returns true when scissors have been set in the current command buffer.
bool TouchScissors() {
return std::exchange(state.scissors, true);
}
/// Returns true when depth bias have been set in the current command buffer.
bool TouchDepthBias() {
return std::exchange(state.depth_bias, true);
}
/// Returns true when blend constants have been set in the current command buffer.
bool TouchBlendConstants() {
return std::exchange(state.blend_constants, true);
}
/// Returns true when depth bounds have been set in the current command buffer.
bool TouchDepthBounds() {
return std::exchange(state.depth_bounds, true);
}
/// Returns true when stencil values have been set in the current command buffer.
bool TouchStencilValues() {
return std::exchange(state.stencil_values, true);
}
/// Send work to a separate thread.
template <typename T>
void Record(T&& command) {
@@ -217,6 +189,8 @@ private:
const VKDevice& device;
VKResourceManager& resource_manager;
StateTracker& state_tracker;
VKQueryCache* query_cache = nullptr;
vk::CommandBuffer current_cmdbuf;
@@ -226,12 +200,6 @@ private:
struct State {
std::optional<vk::RenderPassBeginInfo> renderpass;
vk::Pipeline graphics_pipeline;
bool viewports = false;
bool scissors = false;
bool depth_bias = false;
bool blend_constants = false;
bool depth_bounds = false;
bool stencil_values = false;
} state;
std::unique_ptr<CommandChunk> chunk;

View File

@@ -69,8 +69,9 @@ struct TexelBuffer {
struct SampledImage {
Id image_type{};
Id sampled_image_type{};
Id sampler{};
Id sampler_type{};
Id sampler_pointer_type{};
Id variable{};
};
struct StorageImage {
@@ -833,16 +834,20 @@ private:
constexpr int sampled = 1;
constexpr auto format = spv::ImageFormat::Unknown;
const Id image_type = TypeImage(t_float, dim, depth, arrayed, ms, sampled, format);
const Id sampled_image_type = TypeSampledImage(image_type);
const Id pointer_type =
TypePointer(spv::StorageClass::UniformConstant, sampled_image_type);
const Id sampler_type = TypeSampledImage(image_type);
const Id sampler_pointer_type =
TypePointer(spv::StorageClass::UniformConstant, sampler_type);
const Id type = sampler.IsIndexed()
? TypeArray(sampler_type, Constant(t_uint, sampler.Size()))
: sampler_type;
const Id pointer_type = TypePointer(spv::StorageClass::UniformConstant, type);
const Id id = OpVariable(pointer_type, spv::StorageClass::UniformConstant);
AddGlobalVariable(Name(id, fmt::format("sampler_{}", sampler.GetIndex())));
Decorate(id, spv::Decoration::Binding, binding++);
Decorate(id, spv::Decoration::DescriptorSet, DESCRIPTOR_SET);
sampled_images.emplace(sampler.GetIndex(),
SampledImage{image_type, sampled_image_type, id});
sampled_images.emplace(sampler.GetIndex(), SampledImage{image_type, sampler_type,
sampler_pointer_type, id});
}
return binding;
}
@@ -1525,7 +1530,12 @@ private:
ASSERT(!meta.sampler.IsBuffer());
const auto& entry = sampled_images.at(meta.sampler.GetIndex());
return OpLoad(entry.sampled_image_type, entry.sampler);
Id sampler = entry.variable;
if (meta.sampler.IsIndexed()) {
const Id index = AsInt(Visit(meta.index));
sampler = OpAccessChain(entry.sampler_pointer_type, sampler, index);
}
return OpLoad(entry.sampler_type, sampler);
}
Id GetTextureImage(Operation operation) {
@@ -2211,16 +2221,14 @@ private:
switch (specialization.attribute_types.at(location)) {
case Maxwell::VertexAttribute::Type::SignedNorm:
case Maxwell::VertexAttribute::Type::UnsignedNorm:
case Maxwell::VertexAttribute::Type::UnsignedScaled:
case Maxwell::VertexAttribute::Type::SignedScaled:
case Maxwell::VertexAttribute::Type::Float:
return {Type::Float, t_in_float, t_in_float4};
case Maxwell::VertexAttribute::Type::SignedInt:
return {Type::Int, t_in_int, t_in_int4};
case Maxwell::VertexAttribute::Type::UnsignedInt:
return {Type::Uint, t_in_uint, t_in_uint4};
case Maxwell::VertexAttribute::Type::UnsignedScaled:
case Maxwell::VertexAttribute::Type::SignedScaled:
UNIMPLEMENTED();
return {Type::Float, t_in_float, t_in_float4};
default:
UNREACHABLE();
return {Type::Float, t_in_float, t_in_float4};

View File

@@ -0,0 +1,101 @@
// Copyright 2020 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <cstddef>
#include <iterator>
#include "common/common_types.h"
#include "core/core.h"
#include "video_core/dirty_flags.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/gpu.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#define OFF(field_name) MAXWELL3D_REG_INDEX(field_name)
#define NUM(field_name) (sizeof(Maxwell3D::Regs::field_name) / sizeof(u32))
namespace Vulkan {
namespace {
using namespace Dirty;
using namespace VideoCommon::Dirty;
using Tegra::Engines::Maxwell3D;
using Regs = Maxwell3D::Regs;
using Tables = Maxwell3D::DirtyState::Tables;
using Table = Maxwell3D::DirtyState::Table;
using Flags = Maxwell3D::DirtyState::Flags;
Flags MakeInvalidationFlags() {
Flags flags{};
flags[Viewports] = true;
flags[Scissors] = true;
flags[DepthBias] = true;
flags[BlendConstants] = true;
flags[DepthBounds] = true;
flags[StencilProperties] = true;
return flags;
}
void SetupDirtyViewports(Tables& tables) {
FillBlock(tables[0], OFF(viewport_transform), NUM(viewport_transform), Viewports);
FillBlock(tables[0], OFF(viewports), NUM(viewports), Viewports);
tables[0][OFF(viewport_transform_enabled)] = Viewports;
}
void SetupDirtyScissors(Tables& tables) {
FillBlock(tables[0], OFF(scissor_test), NUM(scissor_test), Scissors);
}
void SetupDirtyDepthBias(Tables& tables) {
auto& table = tables[0];
table[OFF(polygon_offset_units)] = DepthBias;
table[OFF(polygon_offset_clamp)] = DepthBias;
table[OFF(polygon_offset_factor)] = DepthBias;
}
void SetupDirtyBlendConstants(Tables& tables) {
FillBlock(tables[0], OFF(blend_color), NUM(blend_color), BlendConstants);
}
void SetupDirtyDepthBounds(Tables& tables) {
FillBlock(tables[0], OFF(depth_bounds), NUM(depth_bounds), DepthBounds);
}
void SetupDirtyStencilProperties(Tables& tables) {
auto& table = tables[0];
table[OFF(stencil_two_side_enable)] = StencilProperties;
table[OFF(stencil_front_func_ref)] = StencilProperties;
table[OFF(stencil_front_mask)] = StencilProperties;
table[OFF(stencil_front_func_mask)] = StencilProperties;
table[OFF(stencil_back_func_ref)] = StencilProperties;
table[OFF(stencil_back_mask)] = StencilProperties;
table[OFF(stencil_back_func_mask)] = StencilProperties;
}
} // Anonymous namespace
StateTracker::StateTracker(Core::System& system)
: system{system}, invalidation_flags{MakeInvalidationFlags()} {}
void StateTracker::Initialize() {
auto& dirty = system.GPU().Maxwell3D().dirty;
auto& tables = dirty.tables;
SetupDirtyRenderTargets(tables);
SetupDirtyViewports(tables);
SetupDirtyScissors(tables);
SetupDirtyDepthBias(tables);
SetupDirtyBlendConstants(tables);
SetupDirtyDepthBounds(tables);
SetupDirtyStencilProperties(tables);
SetupCommonOnWriteStores(dirty.on_write_stores);
}
void StateTracker::InvalidateCommandBufferState() {
system.GPU().Maxwell3D().dirty.flags |= invalidation_flags;
}
} // namespace Vulkan
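
Note on the new file above: the tracker fills a table that maps register indices to dirty flags, and `InvalidateCommandBufferState` raises every dynamic-state flag again whenever a new command buffer starts. The header that declares the `Touch*` helpers is not part of this excerpt, so the following is only a minimal sketch of how such a scheme can work. `DirtyState`, `OnRegisterWrite`, `Touch`, the three-flag enum and the 16-register file are all assumptions made for illustration, not the project's actual definitions.

```cpp
#include <array>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical flag identifiers; None marks an untracked register.
enum DirtyFlag : std::uint8_t { None = 0, Viewports, Scissors, DepthBias, NumFlags };

constexpr std::size_t kNumRegs = 16; // illustrative register file size

struct DirtyState {
    std::bitset<NumFlags> flags;                // one bit per tracked state
    std::array<std::uint8_t, kNumRegs> table{}; // register index -> flag
};

// Maps `count` consecutive registers starting at `begin` to `flag`.
inline void FillBlock(DirtyState& state, std::size_t begin, std::size_t count,
                      DirtyFlag flag) {
    assert(begin + count <= kNumRegs);
    for (std::size_t i = 0; i < count; ++i) {
        state.table[begin + i] = flag;
    }
}

// Invoked on every register write: raises the flag mapped to that register.
inline void OnRegisterWrite(DirtyState& state, std::size_t reg) {
    assert(reg < kNumRegs);
    const std::uint8_t flag = state.table[reg];
    if (flag != None) {
        state.flags[flag] = true;
    }
}

// Reports whether the state was dirty and clears the flag.
inline bool Touch(DirtyState& state, DirtyFlag flag) {
    const bool was_dirty = state.flags[flag];
    state.flags[flag] = false;
    return was_dirty;
}

// A fresh command buffer carries no dynamic state, so mark everything dirty.
inline void InvalidateCommandBufferState(DirtyState& state) {
    std::bitset<NumFlags> mask;
    mask.set(Viewports);
    mask.set(Scissors);
    mask.set(DepthBias);
    state.flags |= mask;
}
```

In this sketch a caller would write `if (!Touch(state, Viewports)) return;` before re-recording viewports, mirroring the early returns in the `Update*` functions of the rasterizer hunk. Keeping a single flag per state means a register write and a command buffer switch both funnel into the same check.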

Some files were not shown because too many files have changed in this diff