Compare commits

..

203 Commits

Author SHA1 Message Date
ReinUsesLisp
c961770900 vk_compute_pass: Implement indexed quads
Implement indexed quads (GL_QUADS used with glDrawElements*) with a
compute pass conversion.

The compute shader converts from uint8/uint16/uint32 indices to uint32.
The format is passed through push constants to avoid having different
variants of the same shader.

- Used by Fast RMX
- Used by Xenoblade Chronicles 2 (it still has graphical due to
synchronization issues on Vulkan)
2020-04-16 21:12:32 -03:00
Fernando Sahmkow
c81f256111 Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache
buffer_cache: Return handles instead of pointer to handles
2020-04-16 19:58:13 -04:00
bunnei
5a067eda84 Merge pull request #3675 from degasus/linux_shared_libraries
externals: Use shared libraries if possible
2020-04-16 18:17:18 -04:00
Markus Wick
b520978043 externals: Use shared libraries if possible
This is mostly done by pkgconfig.
I've focused on the larger and more stable libraries.
2020-04-16 17:03:17 +02:00
Markus Wick
fedf750e1b externals: Move LibreSSL linking to httplib.
Neither core nor web_services use OpenSSL nor LibreSSL.
However they need to link them as it's a requirement of httplib.
So let's declare this within httplib instead of core and web_services.
2020-04-16 16:46:33 +02:00
Markus Wick
94c2c828a5 input_common: Use the CMake target instead of the variable. 2020-04-16 16:42:59 +02:00
Rodrigo Locatti
db67e017cb Merge pull request #3659 from bunnei/time-calc-standard-user
service: time: Implement CalculateStandardUserSystemClockDifferenceByUser.
2020-04-16 02:51:57 -03:00
ReinUsesLisp
090fd3fefa buffer_cache: Return handles instead of pointer to handles
The original idea of returning pointers is that handles can be moved.
The problem is that the implementation didn't take that in mind and made
everything harder to work with. This commit drops pointer to handles and
returns the handles themselves. While it is still true that handles can
be invalidated, this way we get an old handle instead of a dangling
pointer.

This problem can be solved in the future with sparse buffers.
2020-04-16 02:33:34 -03:00
Rodrigo Locatti
a5a2ee8766 Merge pull request #3689 from lioncash/unused-var
decode/shift: Remove unused variable within Shift()
2020-04-16 02:05:54 -03:00
Rodrigo Locatti
d196ce0f71 Merge pull request #3688 from lioncash/nequal
surface_view: Add missing operator!= to ViewParams
2020-04-16 01:39:51 -03:00
Rodrigo Locatti
4209dba1f6 Merge pull request #3680 from lioncash/static
gl_device: Mark stage_swizzle as constexpr
2020-04-16 01:26:23 -03:00
Rodrigo Locatti
60e8de7c95 Merge pull request #3687 from lioncash/constness
surface_base: Make IsInside() a const member function
2020-04-16 01:22:50 -03:00
Rodrigo Locatti
612966399b Merge pull request #3685 from lioncash/copies
control_flow: Make use of std::move in TryInspectAddress()
2020-04-16 01:22:40 -03:00
Lioncash
cd2a12e78f decode/shift: Remove unused variable within Shift()
Removes a redundant variable that is already satisfied by the IsFull()
utility function.
2020-04-16 00:16:06 -04:00
Lioncash
5fbe8785d2 surface_view: Add missing operator!= to ViewParams
Provides logical symmetry to the interface.
2020-04-16 00:03:12 -04:00
Lioncash
d551c910bb surface_base: Make IsInside() a const member function
This doesn't modify internal state, so this can be made const.
2020-04-15 23:59:35 -04:00
bunnei
319df1db77 Merge pull request #3683 from lioncash/docs
video_core: Amend doxygen comment references
2020-04-15 23:54:58 -04:00
Lioncash
72a224d3fc control_flow: Make use of std::move in TryInspectAddress()
Eliminates redundant atomic reference count increments and decrements.
2020-04-15 23:31:22 -04:00
Lioncash
11837e8f13 video_core: Amend doxygen comment references
Fixes broken documentation references.
2020-04-15 22:33:29 -04:00
Lioncash
71fb156611 gl_device: Mark stage_swizzle as constexpr
Previously this was mutable even though it shouldn't be.
2020-04-15 21:59:13 -04:00
Rodrigo Locatti
65cbb122ea Merge pull request #3649 from FernandoS27/3d-fix
Texture Cache: Read current data when flushing a 3D segment.
2020-04-15 17:06:55 -03:00
Fernando Sahmkow
e33196d4e7 Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
Mat M
4398bdb4c7 Merge pull request #3670 from lioncash/reorder
CMakeLists: Make -Wreorder a compile-time error
2020-04-15 14:40:05 -04:00
Lioncash
213fff67bc CMakeLists: Make -Wreorder a compile-time error
This can result in silent logic bugs within code, and given the amount
of times these kind of warnings are caused, they should be flagged at
compile-time so no new code is submitted with them.
2020-04-15 14:14:41 -04:00
Mat M
64b5985f0a Merge pull request #3662 from ReinUsesLisp/constant-attrs
gl_rasterizer: Implement constant vertex attributes
2020-04-15 11:54:50 -04:00
Fernando Sahmkow
6789d88a9c Texture Cache: Read current data when flushing a 3D segment.
This PR corrects flushing of 3D segments when data of other segments is
mixed, this aims to preserve the data in place.
2020-04-15 11:46:17 -04:00
Mat M
9208d555b7 Merge pull request #3668 from ReinUsesLisp/vtx-format-16ui
maxwell_to_vk: Add uint16 vertex formats
2020-04-15 11:43:52 -04:00
Mat M
ab72696beb Merge pull request #3656 from ReinUsesLisp/glsl-full-decompile
gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
2020-04-15 03:17:46 -04:00
Mat M
4878d6bb49 Merge pull request #3654 from ReinUsesLisp/fix-fb-attach
gl_texture_cache: Fix layered texture attachment base level
2020-04-15 03:17:18 -04:00
Mat M
50c0a92db8 Merge pull request #3663 from ReinUsesLisp/fcmp-rc
shader/arithmetic: Add FCMP_CR variant
2020-04-15 03:16:56 -04:00
Mat M
13331a3a32 Merge pull request #3664 from ReinUsesLisp/fe3h-black-squares
Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
2020-04-15 03:14:28 -04:00
Mat M
3a759d2352 Merge pull request #3667 from ReinUsesLisp/viewport-trash
vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
2020-04-15 03:10:34 -04:00
ReinUsesLisp
3036067047 maxwell_to_vk: Add uint16 vertex formats 2020-04-15 04:06:30 -03:00
ReinUsesLisp
b4e43c64c8 maxwell_to_vk: Add missing breaks
Avoid invalid fallbacks.
2020-04-15 04:05:33 -03:00
ReinUsesLisp
0ca456830f vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
When the dynamic state is specified, pViewports and pScissors are
ignored, quoting the specification:

  pViewports is a pointer to an array of VkViewport structures, defining
  the viewport transforms. If the viewport state is dynamic, this member
  is ignored.

That said, AMD's proprietary driver itself seem to read it regardless of
what the specification says.
2020-04-15 03:30:08 -03:00
Rodrigo Locatti
0b132e8cc1 Merge pull request #3657 from ReinUsesLisp/viewport-zero
vk_rasterizer: Default to 1 viewports with a size of 0
2020-04-15 01:51:17 -03:00
Fernando Sahmkow
daddbeffd1 Texture Cache: Only do buffer copies on accurate GPU. (#3634)
This is a simple optimization as Buffer Copies are mostly used for texture recycling. They are, however, useful when games abuse undefined behavior but most 3D APIs forbid it.
2020-04-14 23:21:00 -04:00
bunnei
eb676c343a service: time: Implement CalculateStandardUserSystemClockDifferenceByUser.
- Used by Animal Crossing: New Horizons.
2020-04-14 22:28:41 -04:00
ReinUsesLisp
fd6371eba7 Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
This reverts commit 05cf270836.

Apparently the first approach using floats instead of bitfieldInert
worked better for Fire Emblem: Three Houses. Reverting to get that
behavior back.
2020-04-14 21:24:33 -03:00
ReinUsesLisp
fefe7f18f9 shader/arithmetic: Add FCMP_CR variant
Adds another variant of FCMP.
2020-04-14 19:11:04 -03:00
Zach Hilman
e366b4ee1f Merge pull request #3660 from bunnei/friend-blocked-users
service: friend: Stub IFriendService::GetBlockedUserListIds.
2020-04-14 16:59:46 -04:00
Zach Hilman
8040f6d544 Merge pull request #3661 from bunnei/patch-manager-fix
file_sys: patch_manager: Return early when there are no layers to apply.
2020-04-14 16:59:25 -04:00
ReinUsesLisp
6dfcabc800 gl_rasterizer: Implement constant vertex attributes
Credits go to gdkchan from Ryujinx for finding constant attributes are
used in retail games.
2020-04-14 17:58:53 -03:00
bunnei
fc35803f91 file_sys: patch_manager: Return early when there are no layers to apply. 2020-04-14 16:25:55 -04:00
bunnei
598740f1dd service: friend: Stub IFriendService::GetBlockedUserListIds.
- This is safe to stub, as there should be no adverse consequences from reporting no blocked users.
2020-04-14 16:20:51 -04:00
ReinUsesLisp
37e5c4fa7c vk_rasterizer: Default to 1 viewports with a size of 0
Silence validation layer errors.
2020-04-14 04:44:34 -03:00
ReinUsesLisp
453d7419d9 gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
From my testing on a Splatoon 2 shader that takes 3800ms on average to
compile changing to FullDecompile reduces it to 900ms on average.

The shader decoder will automatically fallback to a more naive method if
it can't use full decompile.
2020-04-14 01:34:20 -03:00
ReinUsesLisp
21dc842171 gl_texture_cache: Fix layered texture attachment base level
The base level is already included in the texture view. If we specify
the base level in the texture again, this will end up in the incorrect
level and potentially out of bounds.
2020-04-13 18:24:56 -03:00
Rodrigo Locatti
7e4a132a77 Merge pull request #3636 from ReinUsesLisp/drop-vk-hpp
renderer_vulkan: Drop Vulkan-Hpp
2020-04-13 17:08:04 -03:00
Mat M
fbf13d3f48 Merge pull request #3651 from ReinUsesLisp/line-widths
gl_rasterizer: Implement line widths and smooth lines
2020-04-13 10:19:59 -04:00
Mat M
08266d70ba Merge pull request #3638 from ReinUsesLisp/remove-preserve-contents
texture_cache: Remove preserve_contents
2020-04-13 10:19:01 -04:00
Mat M
c4001225f6 Merge pull request #3631 from ReinUsesLisp/more-astc
texture/astc: More small ASTC optimizations
2020-04-13 10:17:32 -04:00
Mat M
7b62212461 Merge pull request #3619 from ReinUsesLisp/i2i
shader/conversion: Implement I2I sign extension, saturation and selection
2020-04-13 10:17:07 -04:00
Mat M
3351e1e94f Merge pull request #3627 from ReinUsesLisp/layered-view
gl_texture_cache: Attach view instead of base texture for layered attchments
2020-04-13 10:16:18 -04:00
Mat M
d37d899431 Merge pull request #3646 from ReinUsesLisp/fix-glsl-turing
gl_shader_decompiler: Improve generated code in HMergeH*
2020-04-13 10:15:12 -04:00
Mat M
47036859eb Merge pull request #3633 from ReinUsesLisp/clean-texdec
shader/texture: Remove type mismatches management from shader decoder
2020-04-13 10:13:05 -04:00
ReinUsesLisp
76615b9f34 gl_rasterizer: Implement line widths and smooth lines
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
2020-04-13 01:30:34 -03:00
ReinUsesLisp
05cf270836 gl_shader_decompiler: Implement merges with bitfieldInsert
This also fixes Turing issues but it avoids doing more bitcasts. This
should improve the generated code while also avoiding more points where
compilers can flush floats.
2020-04-12 22:39:59 -03:00
bunnei
a9f866264d Merge pull request #3606 from ReinUsesLisp/nvflinger
service/vi: Partially implement BufferQueue disconnect
2020-04-12 11:44:48 -04:00
Fernando Sahmkow
3d91dbb21d Merge pull request #3578 from ReinUsesLisp/vmnmx
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
Mat M
4aec01b850 Merge pull request #3644 from ReinUsesLisp/msaa
video_core: Add MSAA registers in 3D engine and TIC
2020-04-12 09:11:44 -04:00
ReinUsesLisp
75eb953575 gl_shader_decompiler: Improve generated code in HMergeH*
Avoiding bitwise expressions, this fixes Turing issues in shaders using
half float merges that affected several games.
2020-04-12 05:06:55 -03:00
ReinUsesLisp
76f178ba6e shader/video: Partially implement VMNMX
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.

VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.

This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
2020-04-12 00:34:42 -03:00
ReinUsesLisp
a7baf6fee4 video_core: Add MSAA registers in 3D engine and TIC
This adds the registers used for multisampling. It doesn't implement
anything for now.
2020-04-12 00:21:27 -03:00
Rodrigo Locatti
75e39f7f88 Merge pull request #3635 from FernandoS27/buffer-free
Buffer queue: Correct behavior of free buffer.
2020-04-11 17:58:15 -03:00
bunnei
8938f9941c Merge pull request #3611 from FearlessTobi/port-4956
Port citra-emu/citra#4956: "Fixes to game list sorting"
2020-04-11 12:44:36 -04:00
ReinUsesLisp
94b0e2e5da texture_cache: Remove preserve_contents
preserve_contents was always true. We can't assume we don't have to
preserve clears because scissored and color masked clears exist.

This removes preserve_contents and assumes it as true at all times.
2020-04-11 01:51:02 -03:00
ReinUsesLisp
2905142f47 renderer_vulkan: Drop Vulkan-Hpp 2020-04-10 22:49:02 -03:00
bunnei
51c6688e21 Merge pull request #3594 from ReinUsesLisp/vk-instance
yuzu: Drop SDL2 and Qt frontend Vulkan requirements
2020-04-10 20:06:55 -04:00
Fernando Sahmkow
486a42c45a Buffer queue: Correct behavior of free buffer.
This corrects the behavior of free buffer after witnessing it in an
unrelated hardware test. I haven't found any games affected by it but in
name of better accuracy we'll correct such behavior.
2020-04-10 16:44:28 -04:00
bunnei
8adf66f9fd Merge pull request #3607 from FearlessTobi/input-kms
yuzu/configuration: Fix input profiles and a wrong assert
2020-04-10 00:39:48 -04:00
ReinUsesLisp
8c0ba9c6fe service/vi: Partially implement BufferQueue disconnect 2020-04-10 01:00:50 -03:00
ReinUsesLisp
a87b16da9a shader/texture: Remove type mismatches management from shader decoder
Since commit e22816a5bb we handle type mismatches from the CPU.
We don't need to hack our shader decoder due to game bugs anymore.

Removed in this commit.
2020-04-10 00:57:32 -03:00
Fernando Sahmkow
f570b129a2 Merge pull request #3623 from ReinUsesLisp/renderdoc-bind-spam
qt/bootmanager: Remove unnecessary glBindFramebuffer
2020-04-09 18:02:17 -04:00
Fernando Sahmkow
7182ef31c9 Merge pull request #3622 from ReinUsesLisp/srgb-texture-border
video_core/texture: Use a LUT to convert sRGB texture borders
2020-04-09 18:01:48 -04:00
ReinUsesLisp
6bf5d2b011 astc: Hard code bit depth changes to 8 and use fast replicate 2020-04-09 18:37:12 -03:00
Rodrigo Locatti
36f607217f Merge pull request #3610 from FernandoS27/gpu-caches
Refactor all the GPU Caches to use VAddr for cache addressing
2020-04-09 17:59:21 -03:00
ReinUsesLisp
bd2c1ab8a0 astc: Use boost's static_vector to avoid heap allocations 2020-04-09 05:27:57 -03:00
ReinUsesLisp
5de130beea astc: Implement a fast precompiled alternative for Replicate 2020-04-09 03:58:25 -03:00
ReinUsesLisp
6b4d4473be astc: Move Replicate to a constexpr LUT when possible 2020-04-09 03:35:07 -03:00
ReinUsesLisp
d22a689250 astc: Make InputBitStream constexpr 2020-04-09 02:54:05 -03:00
ReinUsesLisp
0efc230381 astc: OutputBitStream style changes and make it constexpr 2020-04-09 02:37:51 -03:00
bunnei
b96fd0bd0e Merge pull request #3601 from ReinUsesLisp/some-shader-encodings
video_core/shader: Add some instruction and S2R encodings
2020-04-09 00:17:39 -04:00
ReinUsesLisp
6c8f9f40d7 gl_texture_cache: Attach view instead of base texture for layered attachments
This way we are not ignoring the base layer of the current texture.
2020-04-08 22:20:25 -03:00
Fernando Sahmkow
7cd6daf115 VkRasterizer: Eliminate Legacy code. 2020-04-08 18:59:09 -04:00
Fernando Sahmkow
1c18dc6577 Memory: Correct GCC errors. 2020-04-08 18:09:16 -04:00
Fernando Sahmkow
913f42a3a7 Memory: Address Feedback. 2020-04-08 13:40:46 -04:00
Fernando Sahmkow
e00d992848 GPUMemoryManager: Improve safety of memory reads. 2020-04-08 12:08:06 -04:00
bunnei
449255675d Merge pull request #3624 from Kewlan/fix-sl-sr-position
configuration: Fix placement of SL and SR
2020-04-08 11:22:56 -04:00
Kewlan
848d619aec Place SL and SR in the right most column. 2020-04-08 11:34:16 +02:00
bunnei
b128beb3a9 Merge pull request #3572 from FearlessTobi/port-5127
Port citra-emu/citra#5127: "common: Port some changes from dolphin"
2020-04-07 23:48:16 -04:00
ReinUsesLisp
c6ea0d010b qt/bootmanager: Remove unnecessary glBindFramebuffer
Presentation context always has GL_DRAW_FRAMEBUFFER_BINDING as zero.
There is no need to bind the default framebuffer constantly.

According to Nsight this was using ~0.7ms per frame and it broke
renderdoc captures.
2020-04-07 20:51:56 -03:00
ReinUsesLisp
a209d464f9 video_core/textures: Move GetMaxAnisotropy to cpp file 2020-04-07 20:47:31 -03:00
ReinUsesLisp
d7db088180 video_core/texture: Use a LUT to convert sRGB texture borders
This is a reversed look up table extracted from
https://gist.github.com/rygorous/2203834#file-gistfile1-cpp-L41-L62

that is used in
04d4e9e587/source/maxwell/tsc_generate.cpp (L38)

Games usually bind 0xFD expecting a float texture border of 1.0f.
The conversion previous to this commit was multiplying the uint8 sRGB
texture border color by 255. This is close to 1.0f but when that
difference matters, some graphical glitches appear.

This look up table is manually changed in the edges, clamping towards
0.0f and 1.0f.

While we are at it, move this logic to its own translation unit.
2020-04-07 20:38:14 -03:00
Zach Hilman
26ed65495d Merge pull request #3621 from SilverBeamx/fullnamefix
Log version and about section version fix
2020-04-07 17:16:23 -04:00
SilverBeamx
863f7385dc Addressed feedback: switched to snake case and fixed clang-format errors 2020-04-07 22:59:09 +02:00
bunnei
f316911248 Merge pull request #3599 from ReinUsesLisp/revert-3499
Revert "Merge pull request #3499 from ReinUsesLisp/depth-2d-array"
2020-04-07 16:51:41 -04:00
SilverBeamx
5a66ca4697 Removed leftover test code 2020-04-07 22:45:30 +02:00
SilverBeamx
6b512d78c9 Addressed feedback: removed CMake hack in favor of building the necessary strings via the supplied title format 2020-04-07 22:41:45 +02:00
ReinUsesLisp
bf1d66b7c0 yuzu: Drop SDL2 and Qt frontend Vulkan requirements
Create Vulkan instances and surfaces from the Vulkan backend.
2020-04-07 16:32:19 -03:00
Rodrigo Locatti
487f9ba525 Merge pull request #3489 from namkazt/patch-2
shader: implement SULD.D bits32/64
2020-04-07 16:21:09 -03:00
SilverBeamx
22b5d5211e Hack BUILD_FULLNAME into GenerateSCMRev.cmake 2020-04-07 15:54:19 +02:00
Nguyen Dac Nam
935648ffa9 address nit. 2020-04-07 18:29:30 +07:00
ReinUsesLisp
bc1b4b85b0 renderer_vulkan: Query device names from the backend 2020-04-07 02:23:23 -03:00
ReinUsesLisp
7104e01bb3 common/dynamic_library: Import and adapt helper from Dolphin 2020-04-07 02:23:23 -03:00
ReinUsesLisp
da706cad25 shader/conversion: Implement I2I sign extension, saturation and selection
Reimplements I2I adding sign extension, saturation (clamp source value
to the destination), selection and destination sizes that are not 32
bits wide.

It doesn't implement CC yet.
2020-04-07 02:19:44 -03:00
Nguyen Dac Nam
bf1174c114 Apply suggestions from code review
Co-Authored-By: Rodrigo Locatti <reinuseslisp@airmail.cc>
2020-04-07 07:55:49 +07:00
enler
27f122c48c file_sys: fix LayeredFS error when loading some games made with… (#3602)
* fix LayeredFS error when loading some games made with the Unity
2020-04-07 02:03:32 +02:00
Fernando Sahmkow
f9d5718c4b Clang Format. 2020-04-06 09:23:08 -04:00
Fernando Sahmkow
ea535d9470 Shader/Pipeline Cache: Use VAddr instead of physical memory for addressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow
3dd5c07454 Query Cache: Use VAddr instead of physical memory for adressing. 2020-04-06 09:23:07 -04:00
Fernando Sahmkow
7fcd0fee6d Buffer Cache: Use vAddr instead of physical memory. 2020-04-06 09:23:06 -04:00
Fernando Sahmkow
6ee316cb8f Texture Cache: Use vAddr instead of physical memory for caching. 2020-04-06 09:23:05 -04:00
Fernando Sahmkow
9c0f40a1f5 GPU: Setup Flush/Invalidate to use VAddr instead of CacheAddr 2020-04-06 09:21:46 -04:00
Fernando Sahmkow
588a20be3f Merge pull request #3513 from ReinUsesLisp/native-astc
video_core: Use native ASTC when available
2020-04-06 09:21:11 -04:00
namkazy
2c98e14d13 shader_decode: SULD.D using std::pair instead of out parameter 2020-04-06 13:46:55 +07:00
namkazy
9efa51311f shader_decode: SULD.D avoid duplicate code block. 2020-04-06 13:34:06 +07:00
namkazy
7f5696513f shader_decode: SULD.D fix conversion error. 2020-04-06 13:26:58 +07:00
namkazy
2906372ba1 shader_decode: SULD.D implement bits64 and reverse shader ir init method to removed shader stage. 2020-04-06 13:09:19 +07:00
ReinUsesLisp
3185245845 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
ReinUsesLisp
fd0a2b5151 shader/memory: Add "using std::move" 2020-04-06 02:18:14 -03:00
ReinUsesLisp
79970c9174 shader/memory: Minor fixes in ATOM 2020-04-06 00:54:22 -03:00
FearlessTobi
8d0fb33ac4 yuzu: Fixes to game list sorting
Should fix citra-emu/citra#4593.
As the issue might not be entirely clear, I'll offer a short explanation from what I understood from it and found from experimentation.

Currently yuzu offers the user the option to change the text that's displayed in the "Name" column in the game list. Generally, it is expected that the items are sorted based on the displayed text, but yuzu would sort them by title instead.

Made it so that an access to SortRole returns the same as DisplayRole.
There shouldn't be any UI changes, only change in behaviour.

Also fixes a bug with directory sorting, where having the directories out of order would enable you to try to "move up" to the addDirectory button, which would crash the emulator.

Co-Authored-By: Vitor K <vitor-k@users.noreply.github.com>
2020-04-06 03:12:17 +02:00
Fernando Sahmkow
69277de29d Merge pull request #3592 from ReinUsesLisp/ipa
shader_decompiler: Remove FragCoord.w hack and change IPA implementation
2020-04-05 19:29:40 -04:00
Fernando Sahmkow
1633fbf99a Merge pull request #3589 from ReinUsesLisp/fix-clears
gl_rasterizer: Mark cleared textures as dirty
2020-04-05 19:29:26 -04:00
namkazy
730f9b55b3 silent warning (conversion error) 2020-04-05 16:02:07 +07:00
namkazy
9f6ebccf06 shader_decode: SULD.D -> SINT actually same as UNORM. 2020-04-05 15:18:42 +07:00
namkazy
6f2b7087c2 shader_decode: SULD.D fix decode SNORM component 2020-04-05 14:46:43 +07:00
namkazy
69657ff19c clang-format 2020-04-05 12:57:50 +07:00
namkazy
24cc64c5b3 shader_decode: get sampler descriptor from registry. 2020-04-05 12:54:48 +07:00
FearlessTobi
aa6214feb7 yuzu/configuration: Only assert that all buttons exist when we are handling the click for a button device
This fixes failed assertions that were present in yuzu master code for 18 months.
2020-04-05 07:16:09 +02:00
FearlessTobi
fb8afee077 yuzu/configure_input_simple: Fix "Docked Joycons" controller profile
This was incorrectly using PlayerIndex 1 when calling the ConfigureDialog.
2020-04-05 07:14:35 +02:00
namkazy
acd3f0ab37 tweaking. 2020-04-05 10:31:32 +07:00
Nguyen Dac Nam
8370188b3c clang-format 2020-04-05 10:31:31 +07:00
namkazy
3e3afa9be6 cleanup unuse params 2020-04-05 10:31:31 +07:00
namkazy
5cd5857000 cleanup debug code. 2020-04-05 10:31:30 +07:00
namkazy
658112783d reimplement get component type, uncomment mistaken code 2020-04-05 10:31:30 +07:00
namkazy
3ad06e9b2b remove disable optimize 2020-04-05 10:31:30 +07:00
namkazy
f24c2e1103 [wip] reimplement SULD.D 2020-04-05 10:31:29 +07:00
namkazy
58bcb86af5 add shader stage when init shader ir 2020-04-05 10:31:29 +07:00
Nguyen Dac Nam
2cefdd92bd clang-fix 2020-04-05 10:31:28 +07:00
Nguyen Dac Nam
1f3d142875 shader: image - import PredCondition 2020-04-05 10:31:27 +07:00
Nguyen Dac Nam
08db60392d shader: SULD.D bits32 implement more complexer method. 2020-04-05 10:31:27 +07:00
Nguyen Dac Nam
ed1d8beb13 shader: SULD.D import StoreType 2020-04-05 10:31:26 +07:00
Nguyen Dac Nam
6d235b8631 shader: implement SULD.D bits32 2020-04-05 10:31:26 +07:00
Zach Hilman
59e75f4372 ci: Update to Windows Server 2019 and Visual Studio 2019
This updates to the latest available toolchain for MSVC builds.
2020-04-04 16:13:57 -04:00
ReinUsesLisp
60106531b4 shader/other: Add error message for some S2R registers 2020-04-04 03:46:07 -03:00
ReinUsesLisp
8b719e9e1d shader_bytecode: Rename MOV_SYS to S2R 2020-04-04 03:37:51 -03:00
ReinUsesLisp
9d15feb892 shader_bytecode: Add encoding for BAR 2020-04-04 03:36:21 -03:00
ReinUsesLisp
16ae98dbb3 shader_ir: Add error message for EXIT.FCSM_TR 2020-04-04 03:34:08 -03:00
ReinUsesLisp
c02a2dc24a shader_bytecode: Add encoding for VOTE.VTG 2020-04-04 03:28:11 -03:00
ReinUsesLisp
80c4fee4ec Revert "Merge pull request #3499 from ReinUsesLisp/depth-2d-array"
This reverts commit 41905ee467, reversing
changes made to 35145bd529.

It causes regressions in several games.
2020-04-04 00:02:26 -03:00
bunnei
e6f02d5725 Merge pull request #3579 from Kewlan/reorder-shoulder
configuration: Reorder shoulder buttons
2020-04-03 11:28:14 -04:00
Fernando Sahmkow
9d8886b1a4 Merge pull request #3563 from bunnei/fix-ldr-memstate
services: ldr: Fix MemoryState for read/write regions of NROs.
2020-04-03 10:14:56 -04:00
bunnei
0d4ca5a8fc Merge pull request #3595 from ReinUsesLisp/c4715-silence
shader/memory: Silence no return value warning
2020-04-02 14:32:19 -04:00
ReinUsesLisp
e1bd89e1c2 shader/memory: Silence no return value warning
Silences a warning about control paths not all returning a value.
2020-04-02 03:34:27 -03:00
Rodrigo Locatti
825a6e2615 Merge pull request #3552 from jroweboy/single-context
Refactor Context management (Fixes renderdoc on opengl issues)
2020-04-02 01:38:25 -03:00
ReinUsesLisp
2339fe199f shader_decompiler: Remove FragCoord.w hack and change IPA implementation
Credits go to gdkchan and Ryujinx. The pull request used for this can
be found here: https://github.com/Ryujinx/Ryujinx/pull/1082

yuzu was already using the header for interpolation, but it was missing
the FragCoord.w multiplication described in the linked pull request.
This commit finally removes the FragCoord.w == 1.0f hack from the shader
decompiler.

While we are at it, this commit renames some enumerations to match
Nvidia's documentation (linked below) and fixes component declaration
order in the shader program header (z and w were swapped).

https://github.com/NVIDIA/open-gpu-doc/blob/master/Shader-Program-Header/Shader-Program-Header.html
2020-04-01 21:48:55 -03:00
ReinUsesLisp
dd1232755b gl_texture_cache: Fix software ASTC fallback 2020-04-01 01:44:15 -03:00
ReinUsesLisp
2f0da10dc3 vk_device: Add missing ASTC queries 2020-04-01 01:14:04 -03:00
ReinUsesLisp
b6571ca9f0 video_core: Use native ASTC when available 2020-04-01 01:14:04 -03:00
ReinUsesLisp
16270dcfe4 gl_device: Detect if ASTC is reported and expose it 2020-04-01 01:14:04 -03:00
Rodrigo Locatti
baf91c920c Merge pull request #3591 from ReinUsesLisp/vk-wrapper-part2
renderer_vulkan/wrapper: Add a Vulkan wrapper (part 2 of 2)
2020-03-31 22:14:26 -03:00
Vitor K
bd0c56c6e7 common: Port some changes from dolphin (#5127)
* IOFile: Make the move constructor and move assignment operator noexcept

Certain parts of the standard library try to determine whether or not a
transfer operation should either be a copy or a move. The prevalent notion
of move constructors/assignment operators is that they should not throw,
they simply move an already existing resource somewhere else.

This is typically done with 'std::move_if_noexcept'. Like the name says,
if a type's move constructor is noexcept, then the functions retrieves an
r-value reference (for move semantics), or an l-value (for copy semantics)
if it is not noexcept.

As IOFile deletes the copy constructor and copy assignment operators,
using IOFile with certain parts of the standard library can fail in
unexcepted ways (especially when used with various container
implementations). This prevents that.

* fix various instances of -1 being assigned to unsigned types

* do not assign in conditional statements

* File/IOFile: Check _tfopen_s properly

* common/file_util.cpp: address review comments

Co-authored-by: Lioncash <mathew1800@gmail.com>
Co-authored-by: Shawn Hoffman <godisgovernment@gmail.com>
Co-authored-by: Sepalani <sepalani@hotmail.fr>
2020-04-01 02:58:42 +02:00
ReinUsesLisp
f22f6b72c3 renderer_vulkan/wrapper: Add vkEnumerateInstanceExtensionProperties wrapper 2020-03-31 21:32:08 -03:00
ReinUsesLisp
27dd542c60 renderer_vulkan/wrapper: Add command buffer handle 2020-03-31 21:32:08 -03:00
ReinUsesLisp
5c90d060d8 renderer_vulkan/wrapper: Add physical device handle 2020-03-31 21:32:08 -03:00
ReinUsesLisp
0eb37de98f renderer_vulkan/wrapper: Add device handle 2020-03-31 21:32:08 -03:00
ReinUsesLisp
11774308d3 renderer_vulkan/wrapper: Add swapchain handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp
7fe52ef77f renderer_vulkan/wrapper: Add fence handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp
3a63ae0658 renderer_vulkan/wrapper: Add device memory handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp
397f53dea1 renderer_vulkan/wrapper: Add pool handles 2020-03-31 21:32:07 -03:00
ReinUsesLisp
affee77b70 renderer_vulkan/wrapper: Add buffer and image handles 2020-03-31 21:32:07 -03:00
ReinUsesLisp
d85ca0ab33 renderer_vulkan/wrapper: Add queue handle 2020-03-31 21:32:07 -03:00
ReinUsesLisp
151ddcf419 renderer_vulkan/wrapper: Add instance handle 2020-03-31 21:32:07 -03:00
Morph
224a75d839 capsrv: Split Capture services into individual files and stub GetAlbumContentsFileListForApplication (#3571)
* Organize capture services into individual files

* Stub GetAlbumContentsFileListForApplication

* Address feedback
2020-03-31 19:16:36 -04:00
Fernando Sahmkow
b03c0536ce Merge pull request #3561 from ReinUsesLisp/f2f-conversion
shader/conversion: Fix F2F rounding operations with different sizes
2020-03-31 14:45:02 -04:00
Fernando Sahmkow
5b95a01463 Merge pull request #3577 from ReinUsesLisp/lea
shader/lea: Fix LEA implementation
2020-03-31 14:36:07 -04:00
ReinUsesLisp
1c5e2b60a7 gl_rasterizer: Mark cleared textures as dirty
Fixes a potential edge case where cleared textures read from the CPU
were not flushed.
2020-03-31 05:51:56 -03:00
Rodrigo Locatti
c19425ed69 Merge pull request #3506 from namkazt/patch-9
shader_decode: Implement partial ATOM/ATOMS instr
2020-03-31 00:56:28 -03:00
Rodrigo Locatti
69728e8ad5 Merge pull request #3566 from ReinUsesLisp/vk-wrapper-part1
renderer_vulkan/wrapper: Add a Vulkan wrapper (part 1 of 2)
2020-03-30 21:57:36 -03:00
bunnei
4c72190a06 Merge pull request #3560 from ReinUsesLisp/fix-stencil
gl_rasterizer: Synchronize stencil testing on clears
2020-03-30 17:03:07 -04:00
James Rowe
f1da3ec584 Frontend: Don't call DoneCurrent if the context isnt already current 2020-03-30 14:57:42 -06:00
Kewlan
a8f3a13a1f Re-order the shoulder buttons both in the configuration menu, and in the code. 2020-03-29 14:37:23 +02:00
ReinUsesLisp
08470d261d shader_bytecode: Fix I2I_IMM encoding 2020-03-28 18:49:07 -03:00
ReinUsesLisp
b6c9fba81c renderer_vulkan/wrapper: Address feedback 2020-03-28 04:09:02 -03:00
ReinUsesLisp
5300a918c6 shader/lea: Simplify generated LEA code 2020-03-28 03:55:04 -03:00
ReinUsesLisp
523a709bf1 shader/lea: Fix op_a and op_b usages
They were swapped.
2020-03-27 18:37:20 -03:00
ReinUsesLisp
796b3319e6 shader/lea: Remove const and use move when possible 2020-03-27 18:36:38 -03:00
ReinUsesLisp
2694552b7f renderer_vulkan/wrapper: Add owning handles 2020-03-27 03:21:04 -03:00
ReinUsesLisp
7413b30923 renderer_vulkan/wrapper: Add pool allocations owning templated class 2020-03-27 03:21:04 -03:00
ReinUsesLisp
d8d392b39a renderer_vulkan/wrapper: Add owning handle templated class 2020-03-27 03:21:04 -03:00
ReinUsesLisp
60f351084a renderer_vulkan/wrapper: Add destroy and free overload set 2020-03-27 03:21:04 -03:00
ReinUsesLisp
a9e4528d10 renderer_vulkan/wrapper: Add dispatch table and loaders 2020-03-27 03:21:04 -03:00
ReinUsesLisp
3f0b7673f0 renderer_vulkan/wrapper: Add exception class 2020-03-27 03:21:04 -03:00
ReinUsesLisp
f5cee0e885 renderer_vulkan/wrapper: Add ToString function for VkResult 2020-03-27 03:21:03 -03:00
ReinUsesLisp
92c8d783b3 renderer_vulkan/wrapper: Add Vulakn wrapper and a span helper
The intention behind a Vulkan wrapper is to drop Vulkan-Hpp.

The issues with Vulkan-Hpp are:
- Regular breaks of the API.
- Copy constructors that do the same as the aggregates (fixed recently)
- External dynamic dispatch that is hard to remove
- Alias KHR handles with non-KHR handles making it impossible to use
smart handles on Vulkan 1.0 instances with extensions that were included
on Vulkan 1.1.
- Dynamic dispatchers silently change size depending on preprocessor
definitions. Different files will have different dispatch definitions,
generating all kinds of hard to debug memory issues.

In other words, Vulkan-Hpp is not "production ready" for our needs and
this wrapper aims to replace it without losing RAII and exception
safety.
2020-03-27 03:13:18 -03:00
bunnei
5228bd0bb9 services: ldr: Fix MemoryState for read/write regions of NROs.
- Fixes #3541, used by Final Fantasy VIII Remastered.
2020-03-26 15:52:59 -04:00
James Rowe
cf9c94d401 Address review and fix broken yuzu-tester build 2020-03-25 23:32:42 -06:00
ReinUsesLisp
46791c464a shader/conversion: Fix F2F rounding operations with different sizes
Rounding operations only matter when the conversion size of source and
destination is the same, i.e. .F16.F16, .F32.F32 and .F64.F64.

When there is a mismatch (.F16.F32), these bits are used for IEEE
rounding, we don't emulate this because GLSL and SPIR-V don't support
configuring it per operation.
2020-03-26 01:58:49 -03:00
ReinUsesLisp
7617e88fb2 gl_rasterizer: Update stencil test regardless of it being disabled 2020-03-26 01:08:14 -03:00
ReinUsesLisp
c310cef615 gl_rasterizer: Synchronize stencil testing on clears 2020-03-26 00:51:47 -03:00
James Rowe
282adfc70b Frontend/GPU: Refactor context management
Changes the GraphicsContext to be managed by the GPU core. This
eliminates the need for the frontends to fool around with tricky
MakeCurrent/DoneCurrent calls that are dependent on the settings (such
as async gpu option).

This also refactors out the need to use QWidget::fromWindowContainer as
that caused issues with focus and input handling. Now we use a regular
QWidget and just access the native windowHandle() directly.

Another change is removing the debug tool setting in FrameMailbox.
Instead of trying to block the frontend until a new frame is ready, the
core will now take over presentation and draw directly to the window if
the renderer detects that its hooked by NSight or RenderDoc

Lastly, since it was in the way, I removed ScopeAcquireWindowContext and
replaced it with a simple subclass in GraphicsContext that achieves the
same result
2020-03-24 21:03:42 -06:00
182 changed files with 8725 additions and 4495 deletions

View File

@@ -4,7 +4,7 @@ parameters:
version: ''
steps:
- script: mkdir build && cd build && cmake -G "Visual Studio 15 2017 Win64" --config Release -DYUZU_USE_BUNDLED_QT=1 -DYUZU_USE_BUNDLED_SDL2=1 -DYUZU_USE_BUNDLED_UNICORN=1 -DYUZU_USE_QT_WEB_ENGINE=ON -DENABLE_COMPATIBILITY_LIST_DOWNLOAD=ON -DYUZU_ENABLE_COMPATIBILITY_REPORTING=${COMPAT} -DUSE_DISCORD_PRESENCE=ON -DDISPLAY_VERSION=${{ parameters['version'] }} .. && cd ..
- script: mkdir build && cd build && cmake -G "Visual Studio 16 2019" -A x64 --config Release -DYUZU_USE_BUNDLED_QT=1 -DYUZU_USE_BUNDLED_SDL2=1 -DYUZU_USE_BUNDLED_UNICORN=1 -DYUZU_USE_QT_WEB_ENGINE=ON -DENABLE_COMPATIBILITY_LIST_DOWNLOAD=ON -DYUZU_ENABLE_COMPATIBILITY_REPORTING=${COMPAT} -DUSE_DISCORD_PRESENCE=ON -DDISPLAY_VERSION=${{ parameters['version'] }} .. && cd ..
displayName: 'Configure CMake'
- task: MSBuild@1
displayName: 'Build'

View File

@@ -45,7 +45,7 @@ stages:
- job: build
displayName: 'msvc'
pool:
vmImage: vs2017-win2016
vmImage: windows-2019
steps:
- template: ./templates/sync-source.yml
parameters:

View File

@@ -22,7 +22,7 @@ stages:
- job: build
displayName: 'windows-msvc'
pool:
vmImage: vs2017-win2016
vmImage: windows-2019
steps:
- template: ./templates/sync-source.yml
parameters:

View File

@@ -3,13 +3,27 @@
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${PROJECT_SOURCE_DIR}/CMakeModules)
include(DownloadExternals)
# pkgconfig -- it is used to find shared libraries without cmake modules on linux systems
find_package(PkgConfig)
if (NOT PkgConfig_FOUND)
function(pkg_check_modules)
# STUB
endfunction()
endif()
# Catch
add_library(catch-single-include INTERFACE)
target_include_directories(catch-single-include INTERFACE catch/single_include)
# libfmt
add_subdirectory(fmt)
add_library(fmt::fmt ALIAS fmt)
pkg_check_modules(FMT IMPORTED_TARGET GLOBAL fmt>=6.1.0)
if (FMT_FOUND)
add_library(fmt::fmt ALIAS PkgConfig::FMT)
else()
message(STATUS "fmt 6.1.0 or newer not found, falling back to externals")
add_subdirectory(fmt)
add_library(fmt::fmt ALIAS fmt)
endif()
# Dynarmic
if (ARCHITECTURE_x86_64)
@@ -30,9 +44,15 @@ add_subdirectory(glad)
add_subdirectory(inih)
# lz4
set(LZ4_BUNDLED_MODE ON)
add_subdirectory(lz4/contrib/cmake_unofficial EXCLUDE_FROM_ALL)
target_include_directories(lz4_static INTERFACE ./lz4/lib)
pkg_check_modules(LIBLZ4 IMPORTED_TARGET GLOBAL liblz4>=1.8.0)
if (LIBLZ4_FOUND)
add_library(lz4_static ALIAS PkgConfig::LIBLZ4)
else()
message(STATUS "liblz4 1.8.0 or newer not found, falling back to externals")
set(LZ4_BUNDLED_MODE ON)
add_subdirectory(lz4/contrib/cmake_unofficial EXCLUDE_FROM_ALL)
target_include_directories(lz4_static INTERFACE ./lz4/lib)
endif()
# mbedtls
add_subdirectory(mbedtls EXCLUDE_FROM_ALL)
@@ -47,15 +67,27 @@ add_library(unicorn-headers INTERFACE)
target_include_directories(unicorn-headers INTERFACE ./unicorn/include)
# Zstandard
add_subdirectory(zstd/build/cmake EXCLUDE_FROM_ALL)
target_include_directories(libzstd_static INTERFACE ./zstd/lib)
pkg_check_modules(LIBZSTD IMPORTED_TARGET GLOBAL libzstd>=1.3.8)
if (LIBZSTD_FOUND)
add_library(libzstd_static ALIAS PkgConfig::LIBZSTD)
else()
message(STATUS "libzstd 1.3.8 or newer not found, falling back to externals")
add_subdirectory(zstd/build/cmake EXCLUDE_FROM_ALL)
target_include_directories(libzstd_static INTERFACE ./zstd/lib)
endif()
# SoundTouch
add_subdirectory(soundtouch)
# Opus
add_subdirectory(opus)
target_include_directories(opus INTERFACE ./opus/include)
pkg_check_modules(OPUS IMPORTED_TARGET GLOBAL opus>=1.3.1)
if (OPUS_FOUND)
add_library(opus ALIAS PkgConfig::OPUS)
else()
message(STATUS "opus 1.3.1 or newer not found, falling back to externals")
add_subdirectory(opus)
target_include_directories(opus INTERFACE ./opus/include)
endif()
# Cubeb
if(ENABLE_CUBEB)
@@ -75,18 +107,35 @@ if (ENABLE_VULKAN)
endif()
# zlib
add_subdirectory(zlib EXCLUDE_FROM_ALL)
set(ZLIB_LIBRARIES z)
find_package(ZLIB 1.2.11)
if (NOT ZLIB_FOUND)
message(STATUS "zlib 1.2.11 or newer not found, falling back to externals")
add_subdirectory(zlib EXCLUDE_FROM_ALL)
set(ZLIB_LIBRARIES z)
endif()
# libzip
add_subdirectory(libzip EXCLUDE_FROM_ALL)
pkg_check_modules(LIBZIP IMPORTED_TARGET GLOBAL libzip>=1.5.3)
if (LIBZIP_FOUND)
add_library(zip ALIAS PkgConfig::LIBZIP)
else()
message(STATUS "libzip 1.5.3 or newer not found, falling back to externals")
add_subdirectory(libzip EXCLUDE_FROM_ALL)
endif()
if (ENABLE_WEB_SERVICE)
# LibreSSL
set(LIBRESSL_SKIP_INSTALL ON CACHE BOOL "")
add_subdirectory(libressl EXCLUDE_FROM_ALL)
target_include_directories(ssl INTERFACE ./libressl/include)
target_compile_definitions(ssl PRIVATE -DHAVE_INET_NTOP)
find_package(OpenSSL COMPONENTS Crypto SSL)
if (NOT OpenSSL_FOUND)
message(STATUS "OpenSSL not found, falling back to externals")
set(LIBRESSL_SKIP_INSTALL ON CACHE BOOL "")
add_subdirectory(libressl EXCLUDE_FROM_ALL)
target_include_directories(ssl INTERFACE ./libressl/include)
target_compile_definitions(ssl PRIVATE -DHAVE_INET_NTOP)
get_directory_property(OPENSSL_LIBRARIES
DIRECTORY libressl
DEFINITION OPENSSL_LIBS)
endif()
# lurlparser
add_subdirectory(lurlparser EXCLUDE_FROM_ALL)
@@ -94,6 +143,8 @@ if (ENABLE_WEB_SERVICE)
# httplib
add_library(httplib INTERFACE)
target_include_directories(httplib INTERFACE ./httplib)
target_compile_definitions(httplib INTERFACE -DCPPHTTPLIB_OPENSSL_SUPPORT)
target_link_libraries(httplib INTERFACE ${OPENSSL_LIBRARIES})
# JSON
add_library(json-headers INTERFACE)

View File

@@ -53,6 +53,7 @@ if (MSVC)
else()
add_compile_options(
-Wall
-Werror=reorder
-Wno-attributes
)

View File

@@ -106,6 +106,8 @@ add_library(common STATIC
common_funcs.h
common_paths.h
common_types.h
dynamic_library.cpp
dynamic_library.h
file_util.cpp
file_util.h
hash.h

View File

@@ -0,0 +1,106 @@
// Copyright 2019 Dolphin Emulator Project
// Licensed under GPLv2+
// Refer to the license.txt file included.
#include <cstring>
#include <string>
#include <utility>
#include <fmt/format.h>
#include "common/dynamic_library.h"
#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif
namespace Common {
DynamicLibrary::DynamicLibrary() = default;
DynamicLibrary::DynamicLibrary(const char* filename) {
Open(filename);
}
DynamicLibrary::DynamicLibrary(DynamicLibrary&& rhs) noexcept
: handle{std::exchange(rhs.handle, nullptr)} {}
DynamicLibrary& DynamicLibrary::operator=(DynamicLibrary&& rhs) noexcept {
Close();
handle = std::exchange(rhs.handle, nullptr);
return *this;
}
DynamicLibrary::~DynamicLibrary() {
Close();
}
std::string DynamicLibrary::GetUnprefixedFilename(const char* filename) {
#if defined(_WIN32)
return std::string(filename) + ".dll";
#elif defined(__APPLE__)
return std::string(filename) + ".dylib";
#else
return std::string(filename) + ".so";
#endif
}
std::string DynamicLibrary::GetVersionedFilename(const char* libname, int major, int minor) {
#if defined(_WIN32)
if (major >= 0 && minor >= 0)
return fmt::format("{}-{}-{}.dll", libname, major, minor);
else if (major >= 0)
return fmt::format("{}-{}.dll", libname, major);
else
return fmt::format("{}.dll", libname);
#elif defined(__APPLE__)
const char* prefix = std::strncmp(libname, "lib", 3) ? "lib" : "";
if (major >= 0 && minor >= 0)
return fmt::format("{}{}.{}.{}.dylib", prefix, libname, major, minor);
else if (major >= 0)
return fmt::format("{}{}.{}.dylib", prefix, libname, major);
else
return fmt::format("{}{}.dylib", prefix, libname);
#else
const char* prefix = std::strncmp(libname, "lib", 3) ? "lib" : "";
if (major >= 0 && minor >= 0)
return fmt::format("{}{}.so.{}.{}", prefix, libname, major, minor);
else if (major >= 0)
return fmt::format("{}{}.so.{}", prefix, libname, major);
else
return fmt::format("{}{}.so", prefix, libname);
#endif
}
bool DynamicLibrary::Open(const char* filename) {
#ifdef _WIN32
handle = reinterpret_cast<void*>(LoadLibraryA(filename));
#else
handle = dlopen(filename, RTLD_NOW);
#endif
return handle != nullptr;
}
void DynamicLibrary::Close() {
if (!IsOpen())
return;
#ifdef _WIN32
FreeLibrary(reinterpret_cast<HMODULE>(handle));
#else
dlclose(handle);
#endif
handle = nullptr;
}
void* DynamicLibrary::GetSymbolAddress(const char* name) const {
#ifdef _WIN32
return reinterpret_cast<void*>(GetProcAddress(reinterpret_cast<HMODULE>(handle), name));
#else
return reinterpret_cast<void*>(dlsym(handle, name));
#endif
}
} // namespace Common

View File

@@ -0,0 +1,75 @@
// Copyright 2019 Dolphin Emulator Project
// Licensed under GPLv2+
// Refer to the license.txt file included.
#pragma once
#include <string>
namespace Common {
/**
* Provides a platform-independent interface for loading a dynamic library and retrieving symbols.
* The interface maintains an internal reference count to allow one handle to be shared between
* multiple users.
*/
class DynamicLibrary final {
public:
/// Default constructor, does not load a library.
explicit DynamicLibrary();
/// Automatically loads the specified library. Call IsOpen() to check validity before use.
explicit DynamicLibrary(const char* filename);
/// Moves the library.
DynamicLibrary(DynamicLibrary&&) noexcept;
DynamicLibrary& operator=(DynamicLibrary&&) noexcept;
/// Delete copies, we can't copy a dynamic library.
DynamicLibrary(const DynamicLibrary&) = delete;
DynamicLibrary& operator=(const DynamicLibrary&) = delete;
/// Closes the library.
~DynamicLibrary();
/// Returns the specified library name with the platform-specific suffix added.
static std::string GetUnprefixedFilename(const char* filename);
/// Returns the specified library name in platform-specific format.
/// Major/minor versions will not be included if set to -1.
/// If libname already contains the "lib" prefix, it will not be added again.
/// Windows: LIBNAME-MAJOR-MINOR.dll
/// Linux: libLIBNAME.so.MAJOR.MINOR
/// Mac: libLIBNAME.MAJOR.MINOR.dylib
static std::string GetVersionedFilename(const char* libname, int major = -1, int minor = -1);
/// Returns true if a module is loaded, otherwise false.
bool IsOpen() const {
return handle != nullptr;
}
/// Loads (or replaces) the handle with the specified library file name.
/// Returns true if the library was loaded and can be used.
bool Open(const char* filename);
/// Unloads the library, any function pointers from this library are no longer valid.
void Close();
/// Returns the address of the specified symbol (function or variable) as an untyped pointer.
/// If the specified symbol does not exist in this library, nullptr is returned.
void* GetSymbolAddress(const char* name) const;
/// Obtains the address of the specified symbol, automatically casting to the correct type.
/// Returns true if the symbol was found and assigned, otherwise false.
template <typename T>
bool GetSymbol(const char* name, T* ptr) const {
*ptr = reinterpret_cast<T>(GetSymbolAddress(name));
return *ptr != nullptr;
}
private:
/// Platform-dependent data type representing a dynamic library handle.
void* handle = nullptr;
};
} // namespace Common

View File

@@ -3,6 +3,7 @@
// Refer to the license.txt file included.
#include <array>
#include <limits>
#include <memory>
#include <sstream>
#include <unordered_map>
@@ -530,11 +531,11 @@ void CopyDir(const std::string& source_path, const std::string& dest_path) {
std::optional<std::string> GetCurrentDir() {
// Get the current working directory (getcwd uses malloc)
#ifdef _WIN32
wchar_t* dir;
if (!(dir = _wgetcwd(nullptr, 0))) {
wchar_t* dir = _wgetcwd(nullptr, 0);
if (!dir) {
#else
char* dir;
if (!(dir = getcwd(nullptr, 0))) {
char* dir = getcwd(nullptr, 0);
if (!dir) {
#endif
LOG_ERROR(Common_Filesystem, "GetCurrentDirectory failed: {}", GetLastErrorMsg());
return {};
@@ -918,19 +919,22 @@ void IOFile::Swap(IOFile& other) noexcept {
bool IOFile::Open(const std::string& filename, const char openmode[], int flags) {
Close();
bool m_good;
#ifdef _WIN32
if (flags != 0) {
m_file = _wfsopen(Common::UTF8ToUTF16W(filename).c_str(),
Common::UTF8ToUTF16W(openmode).c_str(), flags);
m_good = m_file != nullptr;
} else {
_wfopen_s(&m_file, Common::UTF8ToUTF16W(filename).c_str(),
Common::UTF8ToUTF16W(openmode).c_str());
m_good = _wfopen_s(&m_file, Common::UTF8ToUTF16W(filename).c_str(),
Common::UTF8ToUTF16W(openmode).c_str()) == 0;
}
#else
m_file = fopen(filename.c_str(), openmode);
m_file = std::fopen(filename.c_str(), openmode);
m_good = m_file != nullptr;
#endif
return IsOpen();
return m_good;
}
bool IOFile::Close() {
@@ -956,7 +960,7 @@ u64 IOFile::Tell() const {
if (IsOpen())
return ftello(m_file);
return -1;
return std::numeric_limits<u64>::max();
}
bool IOFile::Flush() {

View File

@@ -28,11 +28,8 @@ namespace Common {
#ifdef _MSC_VER
// Sets the debugger-visible name of the current thread.
// Uses undocumented (actually, it is now documented) trick.
// http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vsdebug/html/vxtsksettingthreadname.asp
// This is implemented much nicer in upcoming msvc++, see:
// http://msdn.microsoft.com/en-us/library/xcb2z8hs(VS.100).aspx
// Uses trick documented in:
// https://docs.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code
void SetCurrentThreadName(const char* name) {
static const DWORD MS_VC_EXCEPTION = 0x406D1388;
@@ -47,7 +44,7 @@ void SetCurrentThreadName(const char* name) {
info.dwType = 0x1000;
info.szName = name;
info.dwThreadID = -1; // dwThreadID;
info.dwThreadID = std::numeric_limits<DWORD>::max();
info.dwFlags = 0;
__try {

View File

@@ -131,8 +131,6 @@ add_library(core STATIC
frontend/framebuffer_layout.cpp
frontend/framebuffer_layout.h
frontend/input.h
frontend/scope_acquire_context.cpp
frontend/scope_acquire_context.h
gdbstub/gdbstub.cpp
gdbstub/gdbstub.h
hardware_interrupt_manager.cpp
@@ -287,6 +285,18 @@ add_library(core STATIC
hle/service/btm/btm.h
hle/service/caps/caps.cpp
hle/service/caps/caps.h
hle/service/caps/caps_a.cpp
hle/service/caps/caps_a.h
hle/service/caps/caps_c.cpp
hle/service/caps/caps_c.h
hle/service/caps/caps_u.cpp
hle/service/caps/caps_u.h
hle/service/caps/caps_sc.cpp
hle/service/caps/caps_sc.h
hle/service/caps/caps_ss.cpp
hle/service/caps/caps_ss.h
hle/service/caps/caps_su.cpp
hle/service/caps/caps_su.h
hle/service/erpt/erpt.cpp
hle/service/erpt/erpt.h
hle/service/es/es.cpp
@@ -581,11 +591,8 @@ target_link_libraries(core PUBLIC common PRIVATE audio_core video_core)
target_link_libraries(core PUBLIC Boost::boost PRIVATE fmt json-headers mbedtls opus unicorn)
if (YUZU_ENABLE_BOXCAT)
get_directory_property(OPENSSL_LIBS
DIRECTORY ${PROJECT_SOURCE_DIR}/externals/libressl
DEFINITION OPENSSL_LIBS)
target_compile_definitions(core PRIVATE -DCPPHTTPLIB_OPENSSL_SUPPORT -DYUZU_ENABLE_BOXCAT)
target_link_libraries(core PRIVATE httplib json-headers ${OPENSSL_LIBS} zip)
target_compile_definitions(core PRIVATE -DYUZU_ENABLE_BOXCAT)
target_link_libraries(core PRIVATE httplib json-headers zip)
endif()
if (ENABLE_WEB_SERVICE)

View File

@@ -24,7 +24,6 @@
#include "core/file_sys/sdmc_factory.h"
#include "core/file_sys/vfs_concat.h"
#include "core/file_sys/vfs_real.h"
#include "core/frontend/scope_acquire_context.h"
#include "core/gdbstub/gdbstub.h"
#include "core/hardware_interrupt_manager.h"
#include "core/hle/kernel/client_port.h"
@@ -168,13 +167,12 @@ struct System::Impl {
Service::Init(service_manager, system);
GDBStub::DeferStart();
renderer = VideoCore::CreateRenderer(emu_window, system);
if (!renderer->Init()) {
interrupt_manager = std::make_unique<Core::Hardware::InterruptManager>(system);
gpu_core = VideoCore::CreateGPU(emu_window, system);
if (!gpu_core) {
return ResultStatus::ErrorVideoCore;
}
interrupt_manager = std::make_unique<Core::Hardware::InterruptManager>(system);
gpu_core = VideoCore::CreateGPU(system);
renderer->Rasterizer().SetupDirtyFlags();
gpu_core->Renderer().Rasterizer().SetupDirtyFlags();
is_powered_on = true;
exit_lock = false;
@@ -186,8 +184,6 @@ struct System::Impl {
ResultStatus Load(System& system, Frontend::EmuWindow& emu_window,
const std::string& filepath) {
Core::Frontend::ScopeAcquireContext acquire_context{emu_window};
app_loader = Loader::GetLoader(GetGameFileFromPath(virtual_filesystem, filepath));
if (!app_loader) {
LOG_CRITICAL(Core, "Failed to obtain loader for {}!", filepath);
@@ -216,10 +212,6 @@ struct System::Impl {
AddGlueRegistrationForProcess(*app_loader, *main_process);
kernel.MakeCurrentProcess(main_process.get());
// Main process has been loaded and been made current.
// Begin GPU and CPU execution.
gpu_core->Start();
// Initialize cheat engine
if (cheat_engine) {
cheat_engine->Initialize();
@@ -277,7 +269,6 @@ struct System::Impl {
}
// Shutdown emulation session
renderer.reset();
GDBStub::Shutdown();
Service::Shutdown();
service_manager.reset();
@@ -353,7 +344,6 @@ struct System::Impl {
Service::FileSystem::FileSystemController fs_controller;
/// AppLoader used to load the current executing application
std::unique_ptr<Loader::AppLoader> app_loader;
std::unique_ptr<VideoCore::RendererBase> renderer;
std::unique_ptr<Tegra::GPU> gpu_core;
std::unique_ptr<Hardware::InterruptManager> interrupt_manager;
Memory::Memory memory;
@@ -536,11 +526,11 @@ const Core::Hardware::InterruptManager& System::InterruptManager() const {
}
VideoCore::RendererBase& System::Renderer() {
return *impl->renderer;
return impl->gpu_core->Renderer();
}
const VideoCore::RendererBase& System::Renderer() const {
return *impl->renderer;
return impl->gpu_core->Renderer();
}
Kernel::KernelCore& System::Kernel() {

View File

@@ -348,6 +348,12 @@ static void ApplyLayeredFS(VirtualFile& romfs, u64 title_id, ContentRecordType t
if (ext_dir != nullptr)
layers_ext.push_back(std::move(ext_dir));
}
// When there are no layers to apply, return early as there is no need to rebuild the RomFS
if (layers.empty() && layers_ext.empty()) {
return;
}
layers.push_back(std::move(extracted));
auto layered = LayeredVfsDirectory::MakeLayeredDirectory(std::move(layers));

View File

@@ -5,6 +5,7 @@
#include <memory>
#include "common/common_types.h"
#include "common/string_util.h"
#include "common/swap.h"
#include "core/file_sys/fsmitm_romfsbuild.h"
#include "core/file_sys/romfs.h"
@@ -126,7 +127,7 @@ VirtualDir ExtractRomFS(VirtualFile file, RomFSExtractionType type) {
return out->GetSubdirectories().front();
while (out->GetSubdirectories().size() == 1 && out->GetFiles().empty()) {
if (out->GetSubdirectories().front()->GetName() == "data" &&
if (Common::ToLower(out->GetSubdirectories().front()->GetName()) == "data" &&
type == RomFSExtractionType::Truncated)
break;
out = out->GetSubdirectories().front();

View File

@@ -12,20 +12,49 @@
namespace Core::Frontend {
/// Information for the Graphics Backends signifying what type of screen pointer is in
/// WindowInformation
enum class WindowSystemType {
Headless,
Windows,
X11,
Wayland,
};
/**
* Represents a graphics context that can be used for background computation or drawing. If the
* graphics backend doesn't require the context, then the implementation of these methods can be
* stubs
* Represents a drawing context that supports graphics operations.
*/
class GraphicsContext {
public:
virtual ~GraphicsContext();
/// Inform the driver to swap the front/back buffers and present the current image
virtual void SwapBuffers() {}
/// Makes the graphics context current for the caller thread
virtual void MakeCurrent() = 0;
virtual void MakeCurrent() {}
/// Releases (dunno if this is the "right" word) the context from the caller thread
virtual void DoneCurrent() = 0;
virtual void DoneCurrent() {}
class Scoped {
public:
explicit Scoped(GraphicsContext& context_) : context(context_) {
context.MakeCurrent();
}
~Scoped() {
context.DoneCurrent();
}
private:
GraphicsContext& context;
};
/// Calls MakeCurrent on the context and calls DoneCurrent when the scope for the returned value
/// ends
Scoped Acquire() {
return Scoped{*this};
}
};
/**
@@ -46,7 +75,7 @@ public:
* - DO NOT TREAT THIS CLASS AS A GUI TOOLKIT ABSTRACTION LAYER. That's not what it is. Please
* re-read the upper points again and think about it if you don't see this.
*/
class EmuWindow : public GraphicsContext {
class EmuWindow {
public:
/// Data structure to store emuwindow configuration
struct WindowConfig {
@@ -56,29 +85,34 @@ public:
std::pair<unsigned, unsigned> min_client_area_size;
};
/// Data describing host window system information
struct WindowSystemInfo {
// Window system type. Determines which GL context or Vulkan WSI is used.
WindowSystemType type = WindowSystemType::Headless;
// Connection to a display server. This is used on X11 and Wayland platforms.
void* display_connection = nullptr;
// Render surface. This is a pointer to the native window handle, which depends
// on the platform. e.g. HWND for Windows, Window for X11. If the surface is
// set to nullptr, the video backend will run in headless mode.
void* render_surface = nullptr;
// Scale of the render surface. For hidpi systems, this will be >1.
float render_surface_scale = 1.0f;
};
/// Polls window events
virtual void PollEvents() = 0;
/**
* Returns a GraphicsContext that the frontend provides that is shared with the emu window. This
* context can be used from other threads for background graphics computation. If the frontend
* is using a graphics backend that doesn't need anything specific to run on a different thread,
* then it can use a stubbed implemenation for GraphicsContext.
*
* If the return value is null, then the core should assume that the frontend cannot provide a
* Shared Context
* Returns a GraphicsContext that the frontend provides to be used for rendering.
*/
virtual std::unique_ptr<GraphicsContext> CreateSharedContext() const {
return nullptr;
}
virtual std::unique_ptr<GraphicsContext> CreateSharedContext() const = 0;
/// Returns if window is shown (not minimized)
virtual bool IsShown() const = 0;
/// Retrieves Vulkan specific handlers from the window
virtual void RetrieveVulkanHandlers(void* get_instance_proc_addr, void* instance,
void* surface) const = 0;
/**
* Signal that a touch pressed event has occurred (e.g. mouse click pressed)
* @param framebuffer_x Framebuffer x-coordinate that was pressed
@@ -115,6 +149,13 @@ public:
config = val;
}
/**
* Returns system information about the drawing area.
*/
const WindowSystemInfo& GetWindowInfo() const {
return window_info;
}
/**
* Gets the framebuffer layout (width, height, and screen regions)
* @note This method is thread-safe
@@ -130,7 +171,7 @@ public:
void UpdateCurrentFramebufferLayout(unsigned width, unsigned height);
protected:
EmuWindow();
explicit EmuWindow();
virtual ~EmuWindow();
/**
@@ -167,6 +208,8 @@ protected:
client_area_height = size.second;
}
WindowSystemInfo window_info;
private:
/**
* Handler called when the minimal client area was requested to be changed via SetConfig.

View File

@@ -1,18 +0,0 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/frontend/emu_window.h"
#include "core/frontend/scope_acquire_context.h"
namespace Core::Frontend {
ScopeAcquireContext::ScopeAcquireContext(Core::Frontend::GraphicsContext& context)
: context{context} {
context.MakeCurrent();
}
ScopeAcquireContext::~ScopeAcquireContext() {
context.DoneCurrent();
}
} // namespace Core::Frontend

View File

@@ -1,23 +0,0 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "common/common_types.h"
namespace Core::Frontend {
class GraphicsContext;
/// Helper class to acquire/release window context within a given scope
class ScopeAcquireContext : NonCopyable {
public:
explicit ScopeAcquireContext(Core::Frontend::GraphicsContext& context);
~ScopeAcquireContext();
private:
Core::Frontend::GraphicsContext& context;
};
} // namespace Core::Frontend

View File

@@ -103,7 +103,7 @@ static void ThreadWakeupCallback(u64 thread_handle, [[maybe_unused]] s64 cycles_
struct KernelCore::Impl {
explicit Impl(Core::System& system, KernelCore& kernel)
: system{system}, global_scheduler{kernel}, synchronization{system}, time_manager{system} {}
: global_scheduler{kernel}, synchronization{system}, time_manager{system}, system{system} {}
void Initialize(KernelCore& kernel) {
Shutdown();

View File

@@ -2,168 +2,24 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <memory>
#include "core/hle/service/caps/caps.h"
#include "core/hle/service/caps/caps_a.h"
#include "core/hle/service/caps/caps_c.h"
#include "core/hle/service/caps/caps_sc.h"
#include "core/hle/service/caps/caps_ss.h"
#include "core/hle/service/caps/caps_su.h"
#include "core/hle/service/caps/caps_u.h"
#include "core/hle/service/service.h"
#include "core/hle/service/sm/sm.h"
namespace Service::Capture {
class CAPS_A final : public ServiceFramework<CAPS_A> {
public:
explicit CAPS_A() : ServiceFramework{"caps:a"} {
// clang-format off
static const FunctionInfo functions[] = {
{0, nullptr, "GetAlbumFileCount"},
{1, nullptr, "GetAlbumFileList"},
{2, nullptr, "LoadAlbumFile"},
{3, nullptr, "DeleteAlbumFile"},
{4, nullptr, "StorageCopyAlbumFile"},
{5, nullptr, "IsAlbumMounted"},
{6, nullptr, "GetAlbumUsage"},
{7, nullptr, "GetAlbumFileSize"},
{8, nullptr, "LoadAlbumFileThumbnail"},
{9, nullptr, "LoadAlbumScreenShotImage"},
{10, nullptr, "LoadAlbumScreenShotThumbnailImage"},
{11, nullptr, "GetAlbumEntryFromApplicationAlbumEntry"},
{12, nullptr, "Unknown12"},
{13, nullptr, "Unknown13"},
{14, nullptr, "Unknown14"},
{15, nullptr, "Unknown15"},
{16, nullptr, "Unknown16"},
{17, nullptr, "Unknown17"},
{18, nullptr, "Unknown18"},
{202, nullptr, "SaveEditedScreenShot"},
{301, nullptr, "GetLastThumbnail"},
{401, nullptr, "GetAutoSavingStorage"},
{501, nullptr, "GetRequiredStorageSpaceSizeToCopyAll"},
{1001, nullptr, "Unknown1001"},
{1002, nullptr, "Unknown1002"},
{1003, nullptr, "Unknown1003"},
{8001, nullptr, "ForceAlbumUnmounted"},
{8002, nullptr, "ResetAlbumMountStatus"},
{8011, nullptr, "RefreshAlbumCache"},
{8012, nullptr, "GetAlbumCache"},
{8013, nullptr, "Unknown8013"},
{8021, nullptr, "GetAlbumEntryFromApplicationAlbumEntryAruid"},
{10011, nullptr, "SetInternalErrorConversionEnabled"},
{50000, nullptr, "Unknown50000"},
{60002, nullptr, "Unknown60002"},
};
// clang-format on
RegisterHandlers(functions);
}
};
class CAPS_C final : public ServiceFramework<CAPS_C> {
public:
explicit CAPS_C() : ServiceFramework{"caps:c"} {
// clang-format off
static const FunctionInfo functions[] = {
{33, nullptr, "Unknown33"},
{2001, nullptr, "Unknown2001"},
{2002, nullptr, "Unknown2002"},
{2011, nullptr, "Unknown2011"},
{2012, nullptr, "Unknown2012"},
{2013, nullptr, "Unknown2013"},
{2014, nullptr, "Unknown2014"},
{2101, nullptr, "Unknown2101"},
{2102, nullptr, "Unknown2102"},
{2201, nullptr, "Unknown2201"},
{2301, nullptr, "Unknown2301"},
};
// clang-format on
RegisterHandlers(functions);
}
};
class CAPS_SC final : public ServiceFramework<CAPS_SC> {
public:
explicit CAPS_SC() : ServiceFramework{"caps:sc"} {
// clang-format off
static const FunctionInfo functions[] = {
{1, nullptr, "Unknown1"},
{2, nullptr, "Unknown2"},
{1001, nullptr, "Unknown3"},
{1002, nullptr, "Unknown4"},
{1003, nullptr, "Unknown5"},
{1011, nullptr, "Unknown6"},
{1012, nullptr, "Unknown7"},
{1201, nullptr, "Unknown8"},
{1202, nullptr, "Unknown9"},
{1203, nullptr, "Unknown10"},
};
// clang-format on
RegisterHandlers(functions);
}
};
class CAPS_SS final : public ServiceFramework<CAPS_SS> {
public:
explicit CAPS_SS() : ServiceFramework{"caps:ss"} {
// clang-format off
static const FunctionInfo functions[] = {
{201, nullptr, "Unknown1"},
{202, nullptr, "Unknown2"},
{203, nullptr, "Unknown3"},
{204, nullptr, "Unknown4"},
};
// clang-format on
RegisterHandlers(functions);
}
};
class CAPS_SU final : public ServiceFramework<CAPS_SU> {
public:
explicit CAPS_SU() : ServiceFramework{"caps:su"} {
// clang-format off
static const FunctionInfo functions[] = {
{201, nullptr, "SaveScreenShot"},
{203, nullptr, "SaveScreenShotEx0"},
};
// clang-format on
RegisterHandlers(functions);
}
};
class CAPS_U final : public ServiceFramework<CAPS_U> {
public:
explicit CAPS_U() : ServiceFramework{"caps:u"} {
// clang-format off
static const FunctionInfo functions[] = {
{32, nullptr, "SetShimLibraryVersion"},
{102, nullptr, "GetAlbumFileListByAruid"},
{103, nullptr, "DeleteAlbumFileByAruid"},
{104, nullptr, "GetAlbumFileSizeByAruid"},
{105, nullptr, "DeleteAlbumFileByAruidForDebug"},
{110, nullptr, "LoadAlbumScreenShotImageByAruid"},
{120, nullptr, "LoadAlbumScreenShotThumbnailImageByAruid"},
{130, nullptr, "PrecheckToCreateContentsByAruid"},
{140, nullptr, "GetAlbumFileList1AafeAruidDeprecated"},
{141, nullptr, "GetAlbumFileList2AafeUidAruidDeprecated"},
{142, nullptr, "GetAlbumFileList3AaeAruid"},
{143, nullptr, "GetAlbumFileList4AaeUidAruid"},
{60002, nullptr, "OpenAccessorSessionForApplication"},
};
// clang-format on
RegisterHandlers(functions);
}
};
void InstallInterfaces(SM::ServiceManager& sm) {
std::make_shared<CAPS_A>()->InstallAsService(sm);
std::make_shared<CAPS_C>()->InstallAsService(sm);
std::make_shared<CAPS_U>()->InstallAsService(sm);
std::make_shared<CAPS_SC>()->InstallAsService(sm);
std::make_shared<CAPS_SS>()->InstallAsService(sm);
std::make_shared<CAPS_SU>()->InstallAsService(sm);
std::make_shared<CAPS_U>()->InstallAsService(sm);
}
} // namespace Service::Capture

View File

@@ -4,12 +4,83 @@
#pragma once
#include "core/hle/service/service.h"
namespace Service::SM {
class ServiceManager;
}
namespace Service::Capture {
enum AlbumImageOrientation {
Orientation0 = 0,
Orientation1 = 1,
Orientation2 = 2,
Orientation3 = 3,
};
enum AlbumReportOption {
Disable = 0,
Enable = 1,
};
enum ContentType : u8 {
Screenshot = 0,
Movie = 1,
ExtraMovie = 3,
};
enum AlbumStorage : u8 {
NAND = 0,
SD = 1,
};
struct AlbumFileDateTime {
u16 year;
u8 month;
u8 day;
u8 hour;
u8 minute;
u8 second;
u8 uid;
};
struct AlbumEntry {
u64 size;
u64 application_id;
AlbumFileDateTime datetime;
AlbumStorage storage;
ContentType content;
u8 padding[6];
};
struct AlbumFileEntry {
u64 size;
u64 hash;
AlbumFileDateTime datetime;
AlbumStorage storage;
ContentType content;
u8 padding[5];
u8 unknown;
};
struct ApplicationAlbumEntry {
u64 size;
u64 hash;
AlbumFileDateTime datetime;
AlbumStorage storage;
ContentType content;
u8 padding[5];
u8 unknown;
};
struct ApplicationAlbumFileEntry {
ApplicationAlbumEntry entry;
AlbumFileDateTime datetime;
u64 unknown;
};
/// Registers all Capture services with the specified service manager.
void InstallInterfaces(SM::ServiceManager& sm);
} // namespace Service::Capture

View File

@@ -0,0 +1,78 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/hle/service/caps/caps_a.h"
namespace Service::Capture {
class IAlbumAccessorSession final : public ServiceFramework<IAlbumAccessorSession> {
public:
explicit IAlbumAccessorSession() : ServiceFramework{"IAlbumAccessorSession"} {
// clang-format off
static const FunctionInfo functions[] = {
{2001, nullptr, "OpenAlbumMovieReadStream"},
{2002, nullptr, "CloseAlbumMovieReadStream"},
{2003, nullptr, "GetAlbumMovieReadStreamMovieDataSize"},
{2004, nullptr, "ReadMovieDataFromAlbumMovieReadStream"},
{2005, nullptr, "GetAlbumMovieReadStreamBrokenReason"},
{2006, nullptr, "GetAlbumMovieReadStreamImageDataSize"},
{2007, nullptr, "ReadImageDataFromAlbumMovieReadStream"},
{2008, nullptr, "ReadFileAttributeFromAlbumMovieReadStream"},
};
// clang-format on
RegisterHandlers(functions);
}
};
CAPS_A::CAPS_A() : ServiceFramework("caps:a") {
// clang-format off
static const FunctionInfo functions[] = {
{0, nullptr, "GetAlbumFileCount"},
{1, nullptr, "GetAlbumFileList"},
{2, nullptr, "LoadAlbumFile"},
{3, nullptr, "DeleteAlbumFile"},
{4, nullptr, "StorageCopyAlbumFile"},
{5, nullptr, "IsAlbumMounted"},
{6, nullptr, "GetAlbumUsage"},
{7, nullptr, "GetAlbumFileSize"},
{8, nullptr, "LoadAlbumFileThumbnail"},
{9, nullptr, "LoadAlbumScreenShotImage"},
{10, nullptr, "LoadAlbumScreenShotThumbnailImage"},
{11, nullptr, "GetAlbumEntryFromApplicationAlbumEntry"},
{12, nullptr, "LoadAlbumScreenShotImageEx"},
{13, nullptr, "LoadAlbumScreenShotThumbnailImageEx"},
{14, nullptr, "LoadAlbumScreenShotImageEx0"},
{15, nullptr, "GetAlbumUsage3"},
{16, nullptr, "GetAlbumMountResult"},
{17, nullptr, "GetAlbumUsage16"},
{18, nullptr, "Unknown18"},
{100, nullptr, "GetAlbumFileCountEx0"},
{101, nullptr, "GetAlbumFileListEx0"},
{202, nullptr, "SaveEditedScreenShot"},
{301, nullptr, "GetLastThumbnail"},
{302, nullptr, "GetLastOverlayMovieThumbnail"},
{401, nullptr, "GetAutoSavingStorage"},
{501, nullptr, "GetRequiredStorageSpaceSizeToCopyAll"},
{1001, nullptr, "LoadAlbumScreenShotThumbnailImageEx0"},
{1002, nullptr, "LoadAlbumScreenShotImageEx1"},
{1003, nullptr, "LoadAlbumScreenShotThumbnailImageEx1"},
{8001, nullptr, "ForceAlbumUnmounted"},
{8002, nullptr, "ResetAlbumMountStatus"},
{8011, nullptr, "RefreshAlbumCache"},
{8012, nullptr, "GetAlbumCache"},
{8013, nullptr, "GetAlbumCacheEx"},
{8021, nullptr, "GetAlbumEntryFromApplicationAlbumEntryAruid"},
{10011, nullptr, "SetInternalErrorConversionEnabled"},
{50000, nullptr, "LoadMakerNoteInfoForDebug"},
{60002, nullptr, "OpenAccessorSession"},
};
// clang-format on
RegisterHandlers(functions);
}
CAPS_A::~CAPS_A() = default;
} // namespace Service::Capture

View File

@@ -0,0 +1,21 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Kernel {
class HLERequestContext;
}
namespace Service::Capture {
class CAPS_A final : public ServiceFramework<CAPS_A> {
public:
explicit CAPS_A();
~CAPS_A() override;
};
} // namespace Service::Capture

View File

@@ -0,0 +1,75 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/hle/service/caps/caps_c.h"
namespace Service::Capture {
class IAlbumControlSession final : public ServiceFramework<IAlbumControlSession> {
public:
explicit IAlbumControlSession() : ServiceFramework{"IAlbumControlSession"} {
// clang-format off
static const FunctionInfo functions[] = {
{2001, nullptr, "OpenAlbumMovieReadStream"},
{2002, nullptr, "CloseAlbumMovieReadStream"},
{2003, nullptr, "GetAlbumMovieReadStreamMovieDataSize"},
{2004, nullptr, "ReadMovieDataFromAlbumMovieReadStream"},
{2005, nullptr, "GetAlbumMovieReadStreamBrokenReason"},
{2006, nullptr, "GetAlbumMovieReadStreamImageDataSize"},
{2007, nullptr, "ReadImageDataFromAlbumMovieReadStream"},
{2008, nullptr, "ReadFileAttributeFromAlbumMovieReadStream"},
{2401, nullptr, "OpenAlbumMovieWriteStream"},
{2402, nullptr, "FinishAlbumMovieWriteStream"},
{2403, nullptr, "CommitAlbumMovieWriteStream"},
{2404, nullptr, "DiscardAlbumMovieWriteStream"},
{2405, nullptr, "DiscardAlbumMovieWriteStreamNoDelete"},
{2406, nullptr, "CommitAlbumMovieWriteStreamEx"},
{2411, nullptr, "StartAlbumMovieWriteStreamDataSection"},
{2412, nullptr, "EndAlbumMovieWriteStreamDataSection"},
{2413, nullptr, "StartAlbumMovieWriteStreamMetaSection"},
{2414, nullptr, "EndAlbumMovieWriteStreamMetaSection"},
{2421, nullptr, "ReadDataFromAlbumMovieWriteStream"},
{2422, nullptr, "WriteDataToAlbumMovieWriteStream"},
{2424, nullptr, "WriteMetaToAlbumMovieWriteStream"},
{2431, nullptr, "GetAlbumMovieWriteStreamBrokenReason"},
{2433, nullptr, "GetAlbumMovieWriteStreamDataSize"},
{2434, nullptr, "SetAlbumMovieWriteStreamDataSize"},
};
// clang-format on
RegisterHandlers(functions);
}
};
CAPS_C::CAPS_C() : ServiceFramework("caps:c") {
// clang-format off
static const FunctionInfo functions[] = {
{1, nullptr, "CaptureRawImage"},
{2, nullptr, "CaptureRawImageWithTimeout"},
{33, nullptr, "Unknown33"},
{1001, nullptr, "RequestTakingScreenShot"},
{1002, nullptr, "RequestTakingScreenShotWithTimeout"},
{1011, nullptr, "NotifyTakingScreenShotRefused"},
{2001, nullptr, "NotifyAlbumStorageIsAvailable"},
{2002, nullptr, "NotifyAlbumStorageIsUnavailable"},
{2011, nullptr, "RegisterAppletResourceUserId"},
{2012, nullptr, "UnregisterAppletResourceUserId"},
{2013, nullptr, "GetApplicationIdFromAruid"},
{2014, nullptr, "CheckApplicationIdRegistered"},
{2101, nullptr, "GenerateCurrentAlbumFileId"},
{2102, nullptr, "GenerateApplicationAlbumEntry"},
{2201, nullptr, "SaveAlbumScreenShotFile"},
{2202, nullptr, "SaveAlbumScreenShotFileEx"},
{2301, nullptr, "SetOverlayScreenShotThumbnailData"},
{2302, nullptr, "SetOverlayMovieThumbnailData"},
{60001, nullptr, "OpenControlSession"},
};
// clang-format on
RegisterHandlers(functions);
}
CAPS_C::~CAPS_C() = default;
} // namespace Service::Capture

View File

@@ -0,0 +1,21 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Kernel {
class HLERequestContext;
}
namespace Service::Capture {
class CAPS_C final : public ServiceFramework<CAPS_C> {
public:
explicit CAPS_C();
~CAPS_C() override;
};
} // namespace Service::Capture

View File

@@ -0,0 +1,40 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/hle/service/caps/caps_sc.h"
namespace Service::Capture {
CAPS_SC::CAPS_SC() : ServiceFramework("caps:sc") {
// clang-format off
static const FunctionInfo functions[] = {
{1, nullptr, "CaptureRawImage"},
{2, nullptr, "CaptureRawImageWithTimeout"},
{3, nullptr, "AttachSharedBuffer"},
{5, nullptr, "CaptureRawImageToAttachedSharedBuffer"},
{210, nullptr, "Unknown210"},
{1001, nullptr, "RequestTakingScreenShot"},
{1002, nullptr, "RequestTakingScreenShotWithTimeout"},
{1003, nullptr, "RequestTakingScreenShotEx"},
{1004, nullptr, "RequestTakingScreenShotEx1"},
{1009, nullptr, "CancelTakingScreenShot"},
{1010, nullptr, "SetTakingScreenShotCancelState"},
{1011, nullptr, "NotifyTakingScreenShotRefused"},
{1012, nullptr, "NotifyTakingScreenShotFailed"},
{1101, nullptr, "SetupOverlayMovieThumbnail"},
{1106, nullptr, "Unknown1106"},
{1107, nullptr, "Unknown1107"},
{1201, nullptr, "OpenRawScreenShotReadStream"},
{1202, nullptr, "CloseRawScreenShotReadStream"},
{1203, nullptr, "ReadRawScreenShotReadStream"},
{1204, nullptr, "Unknown1204"},
};
// clang-format on
RegisterHandlers(functions);
}
CAPS_SC::~CAPS_SC() = default;
} // namespace Service::Capture

View File

@@ -0,0 +1,21 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Kernel {
class HLERequestContext;
}
namespace Service::Capture {
class CAPS_SC final : public ServiceFramework<CAPS_SC> {
public:
explicit CAPS_SC();
~CAPS_SC() override;
};
} // namespace Service::Capture

View File

@@ -0,0 +1,26 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/hle/service/caps/caps_ss.h"
namespace Service::Capture {
CAPS_SS::CAPS_SS() : ServiceFramework("caps:ss") {
// clang-format off
static const FunctionInfo functions[] = {
{201, nullptr, "SaveScreenShot"},
{202, nullptr, "SaveEditedScreenShot"},
{203, nullptr, "SaveScreenShotEx0"},
{204, nullptr, "SaveEditedScreenShotEx0"},
{206, nullptr, "Unknown206"},
{208, nullptr, "SaveScreenShotOfMovieEx1"},
};
// clang-format on
RegisterHandlers(functions);
}
CAPS_SS::~CAPS_SS() = default;
} // namespace Service::Capture

View File

@@ -0,0 +1,21 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Kernel {
class HLERequestContext;
}
namespace Service::Capture {
class CAPS_SS final : public ServiceFramework<CAPS_SS> {
public:
explicit CAPS_SS();
~CAPS_SS() override;
};
} // namespace Service::Capture

View File

@@ -0,0 +1,22 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/hle/service/caps/caps_su.h"
namespace Service::Capture {
CAPS_SU::CAPS_SU() : ServiceFramework("caps:su") {
// clang-format off
static const FunctionInfo functions[] = {
{201, nullptr, "SaveScreenShot"},
{203, nullptr, "SaveScreenShotEx0"},
};
// clang-format on
RegisterHandlers(functions);
}
CAPS_SU::~CAPS_SU() = default;
} // namespace Service::Capture

View File

@@ -0,0 +1,21 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Kernel {
class HLERequestContext;
}
namespace Service::Capture {
class CAPS_SU final : public ServiceFramework<CAPS_SU> {
public:
explicit CAPS_SU();
~CAPS_SU() override;
};
} // namespace Service::Capture

View File

@@ -0,0 +1,76 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "common/logging/log.h"
#include "core/hle/ipc_helpers.h"
#include "core/hle/service/caps/caps.h"
#include "core/hle/service/caps/caps_u.h"
namespace Service::Capture {
class IAlbumAccessorApplicationSession final
: public ServiceFramework<IAlbumAccessorApplicationSession> {
public:
explicit IAlbumAccessorApplicationSession()
: ServiceFramework{"IAlbumAccessorApplicationSession"} {
// clang-format off
static const FunctionInfo functions[] = {
{2001, nullptr, "OpenAlbumMovieReadStream"},
{2002, nullptr, "CloseAlbumMovieReadStream"},
{2003, nullptr, "GetAlbumMovieReadStreamMovieDataSize"},
{2004, nullptr, "ReadMovieDataFromAlbumMovieReadStream"},
{2005, nullptr, "GetAlbumMovieReadStreamBrokenReason"},
};
// clang-format on
RegisterHandlers(functions);
}
};
CAPS_U::CAPS_U() : ServiceFramework("caps:u") {
// clang-format off
static const FunctionInfo functions[] = {
{31, nullptr, "GetShimLibraryVersion"},
{32, nullptr, "SetShimLibraryVersion"},
{102, &CAPS_U::GetAlbumContentsFileListForApplication, "GetAlbumContentsFileListForApplication"},
{103, nullptr, "DeleteAlbumContentsFileForApplication"},
{104, nullptr, "GetAlbumContentsFileSizeForApplication"},
{105, nullptr, "DeleteAlbumFileByAruidForDebug"},
{110, nullptr, "LoadAlbumContentsFileScreenShotImageForApplication"},
{120, nullptr, "LoadAlbumContentsFileThumbnailImageForApplication"},
{130, nullptr, "PrecheckToCreateContentsForApplication"},
{140, nullptr, "GetAlbumFileList1AafeAruidDeprecated"},
{141, nullptr, "GetAlbumFileList2AafeUidAruidDeprecated"},
{142, nullptr, "GetAlbumFileList3AaeAruid"},
{143, nullptr, "GetAlbumFileList4AaeUidAruid"},
{60002, nullptr, "OpenAccessorSessionForApplication"},
};
// clang-format on
RegisterHandlers(functions);
}
CAPS_U::~CAPS_U() = default;
void CAPS_U::GetAlbumContentsFileListForApplication(Kernel::HLERequestContext& ctx) {
// Takes a type-0x6 output buffer containing an array of ApplicationAlbumFileEntry, a PID, an
// u8 ContentType, two s64s, and an u64 AppletResourceUserId. Returns an output u64 for total
// output entries (which is copied to a s32 by official SW).
IPC::RequestParser rp{ctx};
[[maybe_unused]] const auto application_album_file_entries = rp.PopRaw<std::array<u8, 0x30>>();
const auto pid = rp.Pop<s32>();
const auto content_type = rp.PopRaw<ContentType>();
[[maybe_unused]] const auto start_datetime = rp.PopRaw<AlbumFileDateTime>();
[[maybe_unused]] const auto end_datetime = rp.PopRaw<AlbumFileDateTime>();
const auto applet_resource_user_id = rp.Pop<u64>();
LOG_WARNING(Service_Capture,
"(STUBBED) called. pid={}, content_type={}, applet_resource_user_id={}", pid,
content_type, applet_resource_user_id);
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<s32>(0);
}
} // namespace Service::Capture

View File

@@ -0,0 +1,24 @@
// Copyright 2020 yuzu emulator team
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include "core/hle/service/service.h"
namespace Kernel {
class HLERequestContext;
}
namespace Service::Capture {
class CAPS_U final : public ServiceFramework<CAPS_U> {
public:
explicit CAPS_U();
~CAPS_U() override;
private:
void GetAlbumContentsFileListForApplication(Kernel::HLERequestContext& ctx);
};
} // namespace Service::Capture

View File

@@ -27,7 +27,7 @@ public:
{10110, nullptr, "GetFriendProfileImage"},
{10200, nullptr, "SendFriendRequestForApplication"},
{10211, nullptr, "AddFacedFriendRequestForApplication"},
{10400, nullptr, "GetBlockedUserListIds"},
{10400, &IFriendService::GetBlockedUserListIds, "GetBlockedUserListIds"},
{10500, nullptr, "GetProfileList"},
{10600, nullptr, "DeclareOpenOnlinePlaySession"},
{10601, &IFriendService::DeclareCloseOnlinePlaySession, "DeclareCloseOnlinePlaySession"},
@@ -121,6 +121,15 @@ private:
};
static_assert(sizeof(SizedFriendFilter) == 0x10, "SizedFriendFilter is an invalid size");
void GetBlockedUserListIds(Kernel::HLERequestContext& ctx) {
// This is safe to stub, as there should be no adverse consequences from reporting no
// blocked users.
LOG_WARNING(Service_ACC, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u32>(0); // Indicates there are no blocked users
}
void DeclareCloseOnlinePlaySession(Kernel::HLERequestContext& ctx) {
// Stub used by Splatoon 2
LOG_WARNING(Service_ACC, "(STUBBED) called");

View File

@@ -342,17 +342,27 @@ public:
return;
}
ASSERT(
vm_manager
.MirrorMemory(*map_address, nro_address, nro_size, Kernel::MemoryState::ModuleCode)
.IsSuccess());
// Mark text and read-only region as ModuleCode
ASSERT(vm_manager
.MirrorMemory(*map_address, nro_address, header.text_size + header.ro_size,
Kernel::MemoryState::ModuleCode)
.IsSuccess());
// Mark read/write region as ModuleCodeData, which is necessary if this region is used for
// TransferMemory (e.g. Final Fantasy VIII Remastered does this)
ASSERT(vm_manager
.MirrorMemory(*map_address + header.rw_offset, nro_address + header.rw_offset,
header.rw_size, Kernel::MemoryState::ModuleCodeData)
.IsSuccess());
// Revoke permissions from the old memory region
ASSERT(vm_manager.ReprotectRange(nro_address, nro_size, Kernel::VMAPermission::None)
.IsSuccess());
if (bss_size > 0) {
// Mark BSS region as ModuleCodeData, which is necessary if this region is used for
// TransferMemory (e.g. Final Fantasy VIII Remastered does this)
ASSERT(vm_manager
.MirrorMemory(*map_address + nro_size, bss_address, bss_size,
Kernel::MemoryState::ModuleCode)
Kernel::MemoryState::ModuleCodeData)
.IsSuccess());
ASSERT(vm_manager.ReprotectRange(bss_address, bss_size, Kernel::VMAPermission::None)
.IsSuccess());

View File

@@ -28,6 +28,7 @@ void BufferQueue::SetPreallocatedBuffer(u32 slot, const IGBPBuffer& igbp_buffer)
buffer.slot = slot;
buffer.igbp_buffer = igbp_buffer;
buffer.status = Buffer::Status::Free;
free_buffers.push_back(slot);
queue.emplace_back(buffer);
buffer_wait_event.writable->Signal();
@@ -35,16 +36,37 @@ void BufferQueue::SetPreallocatedBuffer(u32 slot, const IGBPBuffer& igbp_buffer)
std::optional<std::pair<u32, Service::Nvidia::MultiFence*>> BufferQueue::DequeueBuffer(u32 width,
u32 height) {
auto itr = std::find_if(queue.begin(), queue.end(), [&](const Buffer& buffer) {
// Only consider free buffers. Buffers become free once again after they've been Acquired
// and Released by the compositor, see the NVFlinger::Compose method.
if (buffer.status != Buffer::Status::Free) {
return false;
}
// Make sure that the parameters match.
return buffer.igbp_buffer.width == width && buffer.igbp_buffer.height == height;
});
if (free_buffers.empty()) {
return {};
}
auto f_itr = free_buffers.begin();
auto itr = queue.end();
while (f_itr != free_buffers.end()) {
auto slot = *f_itr;
itr = std::find_if(queue.begin(), queue.end(), [&](const Buffer& buffer) {
// Only consider free buffers. Buffers become free once again after they've been
// Acquired and Released by the compositor, see the NVFlinger::Compose method.
if (buffer.status != Buffer::Status::Free) {
return false;
}
if (buffer.slot != slot) {
return false;
}
// Make sure that the parameters match.
return buffer.igbp_buffer.width == width && buffer.igbp_buffer.height == height;
});
if (itr != queue.end()) {
free_buffers.erase(f_itr);
break;
}
++f_itr;
}
if (itr == queue.end()) {
return {};
@@ -99,10 +121,18 @@ void BufferQueue::ReleaseBuffer(u32 slot) {
ASSERT(itr != queue.end());
ASSERT(itr->status == Buffer::Status::Acquired);
itr->status = Buffer::Status::Free;
free_buffers.push_back(slot);
buffer_wait_event.writable->Signal();
}
void BufferQueue::Disconnect() {
queue.clear();
queue_sequence.clear();
id = 1;
layer_id = 1;
}
u32 BufferQueue::Query(QueryType type) {
LOG_WARNING(Service, "(STUBBED) called type={}", static_cast<u32>(type));

View File

@@ -87,6 +87,7 @@ public:
Service::Nvidia::MultiFence& multi_fence);
std::optional<std::reference_wrapper<const Buffer>> AcquireBuffer();
void ReleaseBuffer(u32 slot);
void Disconnect();
u32 Query(QueryType type);
u32 GetId() const {
@@ -101,6 +102,7 @@ private:
u32 id;
u64 layer_id;
std::list<u32> free_buffers;
std::vector<Buffer> queue;
std::list<u32> queue_sequence;
Kernel::EventPair buffer_wait_event;

View File

@@ -29,7 +29,7 @@ Time::Time(std::shared_ptr<Module> module, Core::System& system, const char* nam
{300, &Time::CalculateMonotonicSystemClockBaseTimePoint, "CalculateMonotonicSystemClockBaseTimePoint"},
{400, &Time::GetClockSnapshot, "GetClockSnapshot"},
{401, &Time::GetClockSnapshotFromSystemClockContext, "GetClockSnapshotFromSystemClockContext"},
{500, nullptr, "CalculateStandardUserSystemClockDifferenceByUser"},
{500, &Time::CalculateStandardUserSystemClockDifferenceByUser, "CalculateStandardUserSystemClockDifferenceByUser"},
{501, &Time::CalculateSpanBetween, "CalculateSpanBetween"},
};
// clang-format on

View File

@@ -308,6 +308,29 @@ void Module::Interface::GetClockSnapshotFromSystemClockContext(Kernel::HLEReques
ctx.WriteBuffer(&clock_snapshot, sizeof(Clock::ClockSnapshot));
}
void Module::Interface::CalculateStandardUserSystemClockDifferenceByUser(
Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_Time, "called");
IPC::RequestParser rp{ctx};
const auto snapshot_a = rp.PopRaw<Clock::ClockSnapshot>();
const auto snapshot_b = rp.PopRaw<Clock::ClockSnapshot>();
auto time_span_type{Clock::TimeSpanType::FromSeconds(snapshot_b.user_context.offset -
snapshot_a.user_context.offset)};
if ((snapshot_b.user_context.steady_time_point.clock_source_id !=
snapshot_a.user_context.steady_time_point.clock_source_id) ||
(snapshot_b.is_automatic_correction_enabled &&
snapshot_a.is_automatic_correction_enabled)) {
time_span_type.nanoseconds = 0;
}
IPC::ResponseBuilder rb{ctx, (sizeof(s64) / 4) + 2};
rb.Push(RESULT_SUCCESS);
rb.PushRaw(time_span_type.nanoseconds);
}
void Module::Interface::CalculateSpanBetween(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_Time, "called");

View File

@@ -32,6 +32,7 @@ public:
void CalculateMonotonicSystemClockBaseTimePoint(Kernel::HLERequestContext& ctx);
void GetClockSnapshot(Kernel::HLERequestContext& ctx);
void GetClockSnapshotFromSystemClockContext(Kernel::HLERequestContext& ctx);
void CalculateStandardUserSystemClockDifferenceByUser(Kernel::HLERequestContext& ctx);
void CalculateSpanBetween(Kernel::HLERequestContext& ctx);
void GetSharedMemoryNativeHandle(Kernel::HLERequestContext& ctx);

View File

@@ -513,7 +513,8 @@ private:
auto& buffer_queue = nv_flinger->FindBufferQueue(id);
if (transaction == TransactionId::Connect) {
switch (transaction) {
case TransactionId::Connect: {
IGBPConnectRequestParcel request{ctx.ReadBuffer()};
IGBPConnectResponseParcel response{
static_cast<u32>(static_cast<u32>(DisplayResolution::UndockedWidth) *
@@ -521,14 +522,18 @@ private:
static_cast<u32>(static_cast<u32>(DisplayResolution::UndockedHeight) *
Settings::values.resolution_factor)};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::SetPreallocatedBuffer) {
break;
}
case TransactionId::SetPreallocatedBuffer: {
IGBPSetPreallocatedBufferRequestParcel request{ctx.ReadBuffer()};
buffer_queue.SetPreallocatedBuffer(request.data.slot, request.buffer);
IGBPSetPreallocatedBufferResponseParcel response{};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::DequeueBuffer) {
break;
}
case TransactionId::DequeueBuffer: {
IGBPDequeueBufferRequestParcel request{ctx.ReadBuffer()};
const u32 width{request.data.width};
const u32 height{request.data.height};
@@ -556,14 +561,18 @@ private:
},
buffer_queue.GetWritableBufferWaitEvent());
}
} else if (transaction == TransactionId::RequestBuffer) {
break;
}
case TransactionId::RequestBuffer: {
IGBPRequestBufferRequestParcel request{ctx.ReadBuffer()};
auto& buffer = buffer_queue.RequestBuffer(request.slot);
IGBPRequestBufferResponseParcel response{buffer};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::QueueBuffer) {
break;
}
case TransactionId::QueueBuffer: {
IGBPQueueBufferRequestParcel request{ctx.ReadBuffer()};
buffer_queue.QueueBuffer(request.data.slot, request.data.transform,
@@ -572,7 +581,9 @@ private:
IGBPQueueBufferResponseParcel response{1280, 720};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::Query) {
break;
}
case TransactionId::Query: {
IGBPQueryRequestParcel request{ctx.ReadBuffer()};
const u32 value =
@@ -580,15 +591,30 @@ private:
IGBPQueryResponseParcel response{value};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::CancelBuffer) {
break;
}
case TransactionId::CancelBuffer: {
LOG_CRITICAL(Service_VI, "(STUBBED) called, transaction=CancelBuffer");
} else if (transaction == TransactionId::Disconnect ||
transaction == TransactionId::DetachBuffer) {
break;
}
case TransactionId::Disconnect: {
LOG_WARNING(Service_VI, "(STUBBED) called, transaction=Disconnect");
const auto buffer = ctx.ReadBuffer();
buffer_queue.Disconnect();
IGBPEmptyResponseParcel response{};
ctx.WriteBuffer(response.Serialize());
break;
}
case TransactionId::DetachBuffer: {
const auto buffer = ctx.ReadBuffer();
IGBPEmptyResponseParcel response{};
ctx.WriteBuffer(response.Serialize());
} else {
break;
}
default:
ASSERT_MSG(false, "Unimplemented");
}

View File

@@ -242,7 +242,52 @@ struct Memory::Impl {
}
case Common::PageType::RasterizerCachedMemory: {
const u8* const host_ptr = GetPointerFromVMA(process, current_vaddr);
system.GPU().FlushRegion(ToCacheAddr(host_ptr), copy_amount);
system.GPU().FlushRegion(current_vaddr, copy_amount);
std::memcpy(dest_buffer, host_ptr, copy_amount);
break;
}
default:
UNREACHABLE();
}
page_index++;
page_offset = 0;
dest_buffer = static_cast<u8*>(dest_buffer) + copy_amount;
remaining_size -= copy_amount;
}
}
void ReadBlockUnsafe(const Kernel::Process& process, const VAddr src_addr, void* dest_buffer,
const std::size_t size) {
const auto& page_table = process.VMManager().page_table;
std::size_t remaining_size = size;
std::size_t page_index = src_addr >> PAGE_BITS;
std::size_t page_offset = src_addr & PAGE_MASK;
while (remaining_size > 0) {
const std::size_t copy_amount =
std::min(static_cast<std::size_t>(PAGE_SIZE) - page_offset, remaining_size);
const auto current_vaddr = static_cast<VAddr>((page_index << PAGE_BITS) + page_offset);
switch (page_table.attributes[page_index]) {
case Common::PageType::Unmapped: {
LOG_ERROR(HW_Memory,
"Unmapped ReadBlock @ 0x{:016X} (start address = 0x{:016X}, size = {})",
current_vaddr, src_addr, size);
std::memset(dest_buffer, 0, copy_amount);
break;
}
case Common::PageType::Memory: {
DEBUG_ASSERT(page_table.pointers[page_index]);
const u8* const src_ptr =
page_table.pointers[page_index] + page_offset + (page_index << PAGE_BITS);
std::memcpy(dest_buffer, src_ptr, copy_amount);
break;
}
case Common::PageType::RasterizerCachedMemory: {
const u8* const host_ptr = GetPointerFromVMA(process, current_vaddr);
std::memcpy(dest_buffer, host_ptr, copy_amount);
break;
}
@@ -261,6 +306,10 @@ struct Memory::Impl {
ReadBlock(*system.CurrentProcess(), src_addr, dest_buffer, size);
}
void ReadBlockUnsafe(const VAddr src_addr, void* dest_buffer, const std::size_t size) {
ReadBlockUnsafe(*system.CurrentProcess(), src_addr, dest_buffer, size);
}
void WriteBlock(const Kernel::Process& process, const VAddr dest_addr, const void* src_buffer,
const std::size_t size) {
const auto& page_table = process.VMManager().page_table;
@@ -290,7 +339,50 @@ struct Memory::Impl {
}
case Common::PageType::RasterizerCachedMemory: {
u8* const host_ptr = GetPointerFromVMA(process, current_vaddr);
system.GPU().InvalidateRegion(ToCacheAddr(host_ptr), copy_amount);
system.GPU().InvalidateRegion(current_vaddr, copy_amount);
std::memcpy(host_ptr, src_buffer, copy_amount);
break;
}
default:
UNREACHABLE();
}
page_index++;
page_offset = 0;
src_buffer = static_cast<const u8*>(src_buffer) + copy_amount;
remaining_size -= copy_amount;
}
}
void WriteBlockUnsafe(const Kernel::Process& process, const VAddr dest_addr,
const void* src_buffer, const std::size_t size) {
const auto& page_table = process.VMManager().page_table;
std::size_t remaining_size = size;
std::size_t page_index = dest_addr >> PAGE_BITS;
std::size_t page_offset = dest_addr & PAGE_MASK;
while (remaining_size > 0) {
const std::size_t copy_amount =
std::min(static_cast<std::size_t>(PAGE_SIZE) - page_offset, remaining_size);
const auto current_vaddr = static_cast<VAddr>((page_index << PAGE_BITS) + page_offset);
switch (page_table.attributes[page_index]) {
case Common::PageType::Unmapped: {
LOG_ERROR(HW_Memory,
"Unmapped WriteBlock @ 0x{:016X} (start address = 0x{:016X}, size = {})",
current_vaddr, dest_addr, size);
break;
}
case Common::PageType::Memory: {
DEBUG_ASSERT(page_table.pointers[page_index]);
u8* const dest_ptr =
page_table.pointers[page_index] + page_offset + (page_index << PAGE_BITS);
std::memcpy(dest_ptr, src_buffer, copy_amount);
break;
}
case Common::PageType::RasterizerCachedMemory: {
u8* const host_ptr = GetPointerFromVMA(process, current_vaddr);
std::memcpy(host_ptr, src_buffer, copy_amount);
break;
}
@@ -309,6 +401,10 @@ struct Memory::Impl {
WriteBlock(*system.CurrentProcess(), dest_addr, src_buffer, size);
}
void WriteBlockUnsafe(const VAddr dest_addr, const void* src_buffer, const std::size_t size) {
WriteBlockUnsafe(*system.CurrentProcess(), dest_addr, src_buffer, size);
}
void ZeroBlock(const Kernel::Process& process, const VAddr dest_addr, const std::size_t size) {
const auto& page_table = process.VMManager().page_table;
std::size_t remaining_size = size;
@@ -337,7 +433,7 @@ struct Memory::Impl {
}
case Common::PageType::RasterizerCachedMemory: {
u8* const host_ptr = GetPointerFromVMA(process, current_vaddr);
system.GPU().InvalidateRegion(ToCacheAddr(host_ptr), copy_amount);
system.GPU().InvalidateRegion(current_vaddr, copy_amount);
std::memset(host_ptr, 0, copy_amount);
break;
}
@@ -384,7 +480,7 @@ struct Memory::Impl {
}
case Common::PageType::RasterizerCachedMemory: {
const u8* const host_ptr = GetPointerFromVMA(process, current_vaddr);
system.GPU().FlushRegion(ToCacheAddr(host_ptr), copy_amount);
system.GPU().FlushRegion(current_vaddr, copy_amount);
WriteBlock(process, dest_addr, host_ptr, copy_amount);
break;
}
@@ -545,7 +641,7 @@ struct Memory::Impl {
break;
case Common::PageType::RasterizerCachedMemory: {
const u8* const host_ptr = GetPointerFromVMA(vaddr);
system.GPU().FlushRegion(ToCacheAddr(host_ptr), sizeof(T));
system.GPU().FlushRegion(vaddr, sizeof(T));
T value;
std::memcpy(&value, host_ptr, sizeof(T));
return value;
@@ -587,7 +683,7 @@ struct Memory::Impl {
break;
case Common::PageType::RasterizerCachedMemory: {
u8* const host_ptr{GetPointerFromVMA(vaddr)};
system.GPU().InvalidateRegion(ToCacheAddr(host_ptr), sizeof(T));
system.GPU().InvalidateRegion(vaddr, sizeof(T));
std::memcpy(host_ptr, &data, sizeof(T));
break;
}
@@ -696,6 +792,15 @@ void Memory::ReadBlock(const VAddr src_addr, void* dest_buffer, const std::size_
impl->ReadBlock(src_addr, dest_buffer, size);
}
void Memory::ReadBlockUnsafe(const Kernel::Process& process, const VAddr src_addr,
void* dest_buffer, const std::size_t size) {
impl->ReadBlockUnsafe(process, src_addr, dest_buffer, size);
}
void Memory::ReadBlockUnsafe(const VAddr src_addr, void* dest_buffer, const std::size_t size) {
impl->ReadBlockUnsafe(src_addr, dest_buffer, size);
}
void Memory::WriteBlock(const Kernel::Process& process, VAddr dest_addr, const void* src_buffer,
std::size_t size) {
impl->WriteBlock(process, dest_addr, src_buffer, size);
@@ -705,6 +810,16 @@ void Memory::WriteBlock(const VAddr dest_addr, const void* src_buffer, const std
impl->WriteBlock(dest_addr, src_buffer, size);
}
void Memory::WriteBlockUnsafe(const Kernel::Process& process, VAddr dest_addr,
const void* src_buffer, std::size_t size) {
impl->WriteBlockUnsafe(process, dest_addr, src_buffer, size);
}
void Memory::WriteBlockUnsafe(const VAddr dest_addr, const void* src_buffer,
const std::size_t size) {
impl->WriteBlockUnsafe(dest_addr, src_buffer, size);
}
void Memory::ZeroBlock(const Kernel::Process& process, VAddr dest_addr, std::size_t size) {
impl->ZeroBlock(process, dest_addr, size);
}

View File

@@ -294,6 +294,27 @@ public:
void ReadBlock(const Kernel::Process& process, VAddr src_addr, void* dest_buffer,
std::size_t size);
/**
* Reads a contiguous block of bytes from a specified process' address space.
* This unsafe version does not trigger GPU flushing.
*
* @param process The process to read the data from.
* @param src_addr The virtual address to begin reading from.
* @param dest_buffer The buffer to place the read bytes into.
* @param size The amount of data to read, in bytes.
*
* @note If a size of 0 is specified, then this function reads nothing and
* no attempts to access memory are made at all.
*
* @pre dest_buffer must be at least size bytes in length, otherwise a
* buffer overrun will occur.
*
* @post The range [dest_buffer, size) contains the read bytes from the
* process' address space.
*/
void ReadBlockUnsafe(const Kernel::Process& process, VAddr src_addr, void* dest_buffer,
std::size_t size);
/**
* Reads a contiguous block of bytes from the current process' address space.
*
@@ -312,6 +333,25 @@ public:
*/
void ReadBlock(VAddr src_addr, void* dest_buffer, std::size_t size);
/**
* Reads a contiguous block of bytes from the current process' address space.
* This unsafe version does not trigger GPU flushing.
*
* @param src_addr The virtual address to begin reading from.
* @param dest_buffer The buffer to place the read bytes into.
* @param size The amount of data to read, in bytes.
*
* @note If a size of 0 is specified, then this function reads nothing and
* no attempts to access memory are made at all.
*
* @pre dest_buffer must be at least size bytes in length, otherwise a
* buffer overrun will occur.
*
* @post The range [dest_buffer, size) contains the read bytes from the
* current process' address space.
*/
void ReadBlockUnsafe(VAddr src_addr, void* dest_buffer, std::size_t size);
/**
* Writes a range of bytes into a given process' address space at the specified
* virtual address.
@@ -335,6 +375,26 @@ public:
void WriteBlock(const Kernel::Process& process, VAddr dest_addr, const void* src_buffer,
std::size_t size);
/**
* Writes a range of bytes into a given process' address space at the specified
* virtual address.
* This unsafe version does not invalidate GPU Memory.
*
* @param process The process to write data into the address space of.
* @param dest_addr The destination virtual address to begin writing the data at.
* @param src_buffer The data to write into the process' address space.
* @param size The size of the data to write, in bytes.
*
* @post The address range [dest_addr, size) in the process' address space
* contains the data that was within src_buffer.
*
* @post If an attempt is made to write into an unmapped region of memory, the writes
* will be ignored and an error will be logged.
*
*/
void WriteBlockUnsafe(const Kernel::Process& process, VAddr dest_addr, const void* src_buffer,
std::size_t size);
/**
* Writes a range of bytes into the current process' address space at the specified
* virtual address.
@@ -356,6 +416,24 @@ public:
*/
void WriteBlock(VAddr dest_addr, const void* src_buffer, std::size_t size);
/**
* Writes a range of bytes into the current process' address space at the specified
* virtual address.
* This unsafe version does not invalidate GPU Memory.
*
* @param dest_addr The destination virtual address to begin writing the data at.
* @param src_buffer The data to write into the current process' address space.
* @param size The size of the data to write, in bytes.
*
* @post The address range [dest_addr, size) in the current process' address space
* contains the data that was within src_buffer.
*
* @post If an attempt is made to write into an unmapped region of memory, the writes
* will be ignored and an error will be logged.
*
*/
void WriteBlockUnsafe(VAddr dest_addr, const void* src_buffer, std::size_t size);
/**
* Fills the specified address range within a process' address space with zeroes.
*

View File

@@ -27,4 +27,4 @@ if(SDL2_FOUND)
endif()
create_target_directory_groups(input_common)
target_link_libraries(input_common PUBLIC core PRIVATE common ${Boost_LIBRARIES})
target_link_libraries(input_common PUBLIC core PRIVATE common Boost::boost)

View File

@@ -148,6 +148,7 @@ add_library(video_core STATIC
textures/convert.h
textures/decoders.cpp
textures/decoders.h
textures/texture.cpp
textures/texture.h
video_core.cpp
video_core.h
@@ -155,7 +156,6 @@ add_library(video_core STATIC
if (ENABLE_VULKAN)
target_sources(video_core PRIVATE
renderer_vulkan/declarations.h
renderer_vulkan/fixed_pipeline_state.cpp
renderer_vulkan/fixed_pipeline_state.h
renderer_vulkan/maxwell_to_vk.cpp
@@ -210,6 +210,8 @@ if (ENABLE_VULKAN)
renderer_vulkan/vk_texture_cache.h
renderer_vulkan/vk_update_descriptor.cpp
renderer_vulkan/vk_update_descriptor.h
renderer_vulkan/wrapper.cpp
renderer_vulkan/wrapper.h
)
target_include_directories(video_core PRIVATE sirit ../../externals/Vulkan-Headers/include)

View File

@@ -15,37 +15,29 @@ namespace VideoCommon {
class BufferBlock {
public:
bool Overlaps(const CacheAddr start, const CacheAddr end) const {
return (cache_addr < end) && (cache_addr_end > start);
bool Overlaps(const VAddr start, const VAddr end) const {
return (cpu_addr < end) && (cpu_addr_end > start);
}
bool IsInside(const CacheAddr other_start, const CacheAddr other_end) const {
return cache_addr <= other_start && other_end <= cache_addr_end;
bool IsInside(const VAddr other_start, const VAddr other_end) const {
return cpu_addr <= other_start && other_end <= cpu_addr_end;
}
u8* GetWritableHostPtr() const {
return FromCacheAddr(cache_addr);
std::size_t GetOffset(const VAddr in_addr) {
return static_cast<std::size_t>(in_addr - cpu_addr);
}
u8* GetWritableHostPtr(std::size_t offset) const {
return FromCacheAddr(cache_addr + offset);
VAddr GetCpuAddr() const {
return cpu_addr;
}
std::size_t GetOffset(const CacheAddr in_addr) {
return static_cast<std::size_t>(in_addr - cache_addr);
VAddr GetCpuAddrEnd() const {
return cpu_addr_end;
}
CacheAddr GetCacheAddr() const {
return cache_addr;
}
CacheAddr GetCacheAddrEnd() const {
return cache_addr_end;
}
void SetCacheAddr(const CacheAddr new_addr) {
cache_addr = new_addr;
cache_addr_end = new_addr + size;
void SetCpuAddr(const VAddr new_addr) {
cpu_addr = new_addr;
cpu_addr_end = new_addr + size;
}
std::size_t GetSize() const {
@@ -61,14 +53,14 @@ public:
}
protected:
explicit BufferBlock(CacheAddr cache_addr, const std::size_t size) : size{size} {
SetCacheAddr(cache_addr);
explicit BufferBlock(VAddr cpu_addr, const std::size_t size) : size{size} {
SetCpuAddr(cpu_addr);
}
~BufferBlock() = default;
private:
CacheAddr cache_addr{};
CacheAddr cache_addr_end{};
VAddr cpu_addr{};
VAddr cpu_addr_end{};
std::size_t size{};
u64 epoch{};
};

View File

@@ -19,6 +19,7 @@
#include "common/alignment.h"
#include "common/common_types.h"
#include "core/core.h"
#include "core/memory.h"
#include "video_core/buffer_cache/buffer_block.h"
#include "video_core/buffer_cache/map_interval.h"
#include "video_core/memory_manager.h"
@@ -28,37 +29,54 @@ namespace VideoCommon {
using MapInterval = std::shared_ptr<MapIntervalBase>;
template <typename TBuffer, typename TBufferType, typename StreamBuffer>
template <typename OwnerBuffer, typename BufferType, typename StreamBuffer>
class BufferCache {
public:
using BufferInfo = std::pair<const TBufferType*, u64>;
using BufferInfo = std::pair<BufferType, u64>;
BufferInfo UploadMemory(GPUVAddr gpu_addr, std::size_t size, std::size_t alignment = 4,
bool is_written = false, bool use_fast_cbuf = false) {
std::lock_guard lock{mutex};
auto& memory_manager = system.GPU().MemoryManager();
const auto host_ptr = memory_manager.GetPointer(gpu_addr);
if (!host_ptr) {
const std::optional<VAddr> cpu_addr_opt =
system.GPU().MemoryManager().GpuToCpuAddress(gpu_addr);
if (!cpu_addr_opt) {
return {GetEmptyBuffer(size), 0};
}
const auto cache_addr = ToCacheAddr(host_ptr);
VAddr cpu_addr = *cpu_addr_opt;
// Cache management is a big overhead, so only cache entries with a given size.
// TODO: Figure out which size is the best for given games.
constexpr std::size_t max_stream_size = 0x800;
if (use_fast_cbuf || size < max_stream_size) {
if (!is_written && !IsRegionWritten(cache_addr, cache_addr + size - 1)) {
if (!is_written && !IsRegionWritten(cpu_addr, cpu_addr + size - 1)) {
auto& memory_manager = system.GPU().MemoryManager();
if (use_fast_cbuf) {
return ConstBufferUpload(host_ptr, size);
if (memory_manager.IsGranularRange(gpu_addr, size)) {
const auto host_ptr = memory_manager.GetPointer(gpu_addr);
return ConstBufferUpload(host_ptr, size);
} else {
staging_buffer.resize(size);
memory_manager.ReadBlockUnsafe(gpu_addr, staging_buffer.data(), size);
return ConstBufferUpload(staging_buffer.data(), size);
}
} else {
return StreamBufferUpload(host_ptr, size, alignment);
if (memory_manager.IsGranularRange(gpu_addr, size)) {
const auto host_ptr = memory_manager.GetPointer(gpu_addr);
return StreamBufferUpload(host_ptr, size, alignment);
} else {
staging_buffer.resize(size);
memory_manager.ReadBlockUnsafe(gpu_addr, staging_buffer.data(), size);
return StreamBufferUpload(staging_buffer.data(), size, alignment);
}
}
}
}
auto block = GetBlock(cache_addr, size);
auto map = MapAddress(block, gpu_addr, cache_addr, size);
auto block = GetBlock(cpu_addr, size);
auto map = MapAddress(block, gpu_addr, cpu_addr, size);
if (is_written) {
map->MarkAsModified(true, GetModifiedTicks());
if (!map->IsWritten()) {
@@ -71,9 +89,7 @@ public:
}
}
const u64 offset = static_cast<u64>(block->GetOffset(cache_addr));
return {ToHandle(block), offset};
return {ToHandle(block), static_cast<u64>(block->GetOffset(cpu_addr))};
}
/// Uploads from a host memory. Returns the OpenGL buffer where it's located and its offset.
@@ -112,7 +128,7 @@ public:
}
/// Write any cached resources overlapping the specified region back to memory
void FlushRegion(CacheAddr addr, std::size_t size) {
void FlushRegion(VAddr addr, std::size_t size) {
std::lock_guard lock{mutex};
std::vector<MapInterval> objects = GetMapsInRange(addr, size);
@@ -127,7 +143,7 @@ public:
}
/// Mark the specified region as being invalidated
void InvalidateRegion(CacheAddr addr, u64 size) {
void InvalidateRegion(VAddr addr, u64 size) {
std::lock_guard lock{mutex};
std::vector<MapInterval> objects = GetMapsInRange(addr, size);
@@ -138,7 +154,7 @@ public:
}
}
virtual const TBufferType* GetEmptyBuffer(std::size_t size) = 0;
virtual BufferType GetEmptyBuffer(std::size_t size) = 0;
protected:
explicit BufferCache(VideoCore::RasterizerInterface& rasterizer, Core::System& system,
@@ -148,19 +164,19 @@ protected:
~BufferCache() = default;
virtual const TBufferType* ToHandle(const TBuffer& storage) = 0;
virtual BufferType ToHandle(const OwnerBuffer& storage) = 0;
virtual void WriteBarrier() = 0;
virtual TBuffer CreateBlock(CacheAddr cache_addr, std::size_t size) = 0;
virtual OwnerBuffer CreateBlock(VAddr cpu_addr, std::size_t size) = 0;
virtual void UploadBlockData(const TBuffer& buffer, std::size_t offset, std::size_t size,
virtual void UploadBlockData(const OwnerBuffer& buffer, std::size_t offset, std::size_t size,
const u8* data) = 0;
virtual void DownloadBlockData(const TBuffer& buffer, std::size_t offset, std::size_t size,
virtual void DownloadBlockData(const OwnerBuffer& buffer, std::size_t offset, std::size_t size,
u8* data) = 0;
virtual void CopyBlock(const TBuffer& src, const TBuffer& dst, std::size_t src_offset,
virtual void CopyBlock(const OwnerBuffer& src, const OwnerBuffer& dst, std::size_t src_offset,
std::size_t dst_offset, std::size_t size) = 0;
virtual BufferInfo ConstBufferUpload(const void* raw_pointer, std::size_t size) {
@@ -169,20 +185,17 @@ protected:
/// Register an object into the cache
void Register(const MapInterval& new_map, bool inherit_written = false) {
const CacheAddr cache_ptr = new_map->GetStart();
const std::optional<VAddr> cpu_addr =
system.GPU().MemoryManager().GpuToCpuAddress(new_map->GetGpuAddress());
if (!cache_ptr || !cpu_addr) {
const VAddr cpu_addr = new_map->GetStart();
if (!cpu_addr) {
LOG_CRITICAL(HW_GPU, "Failed to register buffer with unmapped gpu_address 0x{:016x}",
new_map->GetGpuAddress());
return;
}
const std::size_t size = new_map->GetEnd() - new_map->GetStart();
new_map->SetCpuAddress(*cpu_addr);
new_map->MarkAsRegistered(true);
const IntervalType interval{new_map->GetStart(), new_map->GetEnd()};
mapped_addresses.insert({interval, new_map});
rasterizer.UpdatePagesCachedCount(*cpu_addr, size, 1);
rasterizer.UpdatePagesCachedCount(cpu_addr, size, 1);
if (inherit_written) {
MarkRegionAsWritten(new_map->GetStart(), new_map->GetEnd() - 1);
new_map->MarkAsWritten(true);
@@ -192,7 +205,7 @@ protected:
/// Unregisters an object from the cache
void Unregister(MapInterval& map) {
const std::size_t size = map->GetEnd() - map->GetStart();
rasterizer.UpdatePagesCachedCount(map->GetCpuAddress(), size, -1);
rasterizer.UpdatePagesCachedCount(map->GetStart(), size, -1);
map->MarkAsRegistered(false);
if (map->IsWritten()) {
UnmarkRegionAsWritten(map->GetStart(), map->GetEnd() - 1);
@@ -202,32 +215,38 @@ protected:
}
private:
MapInterval CreateMap(const CacheAddr start, const CacheAddr end, const GPUVAddr gpu_addr) {
MapInterval CreateMap(const VAddr start, const VAddr end, const GPUVAddr gpu_addr) {
return std::make_shared<MapIntervalBase>(start, end, gpu_addr);
}
MapInterval MapAddress(const TBuffer& block, const GPUVAddr gpu_addr,
const CacheAddr cache_addr, const std::size_t size) {
std::vector<MapInterval> overlaps = GetMapsInRange(cache_addr, size);
MapInterval MapAddress(const OwnerBuffer& block, const GPUVAddr gpu_addr, const VAddr cpu_addr,
const std::size_t size) {
std::vector<MapInterval> overlaps = GetMapsInRange(cpu_addr, size);
if (overlaps.empty()) {
const CacheAddr cache_addr_end = cache_addr + size;
MapInterval new_map = CreateMap(cache_addr, cache_addr_end, gpu_addr);
u8* host_ptr = FromCacheAddr(cache_addr);
UploadBlockData(block, block->GetOffset(cache_addr), size, host_ptr);
auto& memory_manager = system.GPU().MemoryManager();
const VAddr cpu_addr_end = cpu_addr + size;
MapInterval new_map = CreateMap(cpu_addr, cpu_addr_end, gpu_addr);
if (memory_manager.IsGranularRange(gpu_addr, size)) {
u8* host_ptr = memory_manager.GetPointer(gpu_addr);
UploadBlockData(block, block->GetOffset(cpu_addr), size, host_ptr);
} else {
staging_buffer.resize(size);
memory_manager.ReadBlockUnsafe(gpu_addr, staging_buffer.data(), size);
UploadBlockData(block, block->GetOffset(cpu_addr), size, staging_buffer.data());
}
Register(new_map);
return new_map;
}
const CacheAddr cache_addr_end = cache_addr + size;
const VAddr cpu_addr_end = cpu_addr + size;
if (overlaps.size() == 1) {
MapInterval& current_map = overlaps[0];
if (current_map->IsInside(cache_addr, cache_addr_end)) {
if (current_map->IsInside(cpu_addr, cpu_addr_end)) {
return current_map;
}
}
CacheAddr new_start = cache_addr;
CacheAddr new_end = cache_addr_end;
VAddr new_start = cpu_addr;
VAddr new_end = cpu_addr_end;
bool write_inheritance = false;
bool modified_inheritance = false;
// Calculate new buffer parameters
@@ -237,7 +256,7 @@ private:
write_inheritance |= overlap->IsWritten();
modified_inheritance |= overlap->IsModified();
}
GPUVAddr new_gpu_addr = gpu_addr + new_start - cache_addr;
GPUVAddr new_gpu_addr = gpu_addr + new_start - cpu_addr;
for (auto& overlap : overlaps) {
Unregister(overlap);
}
@@ -250,7 +269,7 @@ private:
return new_map;
}
void UpdateBlock(const TBuffer& block, CacheAddr start, CacheAddr end,
void UpdateBlock(const OwnerBuffer& block, VAddr start, VAddr end,
std::vector<MapInterval>& overlaps) {
const IntervalType base_interval{start, end};
IntervalSet interval_set{};
@@ -262,13 +281,15 @@ private:
for (auto& interval : interval_set) {
std::size_t size = interval.upper() - interval.lower();
if (size > 0) {
u8* host_ptr = FromCacheAddr(interval.lower());
UploadBlockData(block, block->GetOffset(interval.lower()), size, host_ptr);
staging_buffer.resize(size);
system.Memory().ReadBlockUnsafe(interval.lower(), staging_buffer.data(), size);
UploadBlockData(block, block->GetOffset(interval.lower()), size,
staging_buffer.data());
}
}
}
std::vector<MapInterval> GetMapsInRange(CacheAddr addr, std::size_t size) {
std::vector<MapInterval> GetMapsInRange(VAddr addr, std::size_t size) {
if (size == 0) {
return {};
}
@@ -289,9 +310,10 @@ private:
void FlushMap(MapInterval map) {
std::size_t size = map->GetEnd() - map->GetStart();
TBuffer block = blocks[map->GetStart() >> block_page_bits];
u8* host_ptr = FromCacheAddr(map->GetStart());
DownloadBlockData(block, block->GetOffset(map->GetStart()), size, host_ptr);
OwnerBuffer block = blocks[map->GetStart() >> block_page_bits];
staging_buffer.resize(size);
DownloadBlockData(block, block->GetOffset(map->GetStart()), size, staging_buffer.data());
system.Memory().WriteBlockUnsafe(map->GetStart(), staging_buffer.data(), size);
map->MarkAsModified(false, 0);
}
@@ -303,7 +325,7 @@ private:
buffer_ptr += size;
buffer_offset += size;
return {&stream_buffer_handle, uploaded_offset};
return {stream_buffer_handle, uploaded_offset};
}
void AlignBuffer(std::size_t alignment) {
@@ -313,17 +335,17 @@ private:
buffer_offset = offset_aligned;
}
TBuffer EnlargeBlock(TBuffer buffer) {
OwnerBuffer EnlargeBlock(OwnerBuffer buffer) {
const std::size_t old_size = buffer->GetSize();
const std::size_t new_size = old_size + block_page_size;
const CacheAddr cache_addr = buffer->GetCacheAddr();
TBuffer new_buffer = CreateBlock(cache_addr, new_size);
const VAddr cpu_addr = buffer->GetCpuAddr();
OwnerBuffer new_buffer = CreateBlock(cpu_addr, new_size);
CopyBlock(buffer, new_buffer, 0, 0, old_size);
buffer->SetEpoch(epoch);
pending_destruction.push_back(buffer);
const CacheAddr cache_addr_end = cache_addr + new_size - 1;
u64 page_start = cache_addr >> block_page_bits;
const u64 page_end = cache_addr_end >> block_page_bits;
const VAddr cpu_addr_end = cpu_addr + new_size - 1;
u64 page_start = cpu_addr >> block_page_bits;
const u64 page_end = cpu_addr_end >> block_page_bits;
while (page_start <= page_end) {
blocks[page_start] = new_buffer;
++page_start;
@@ -331,23 +353,23 @@ private:
return new_buffer;
}
TBuffer MergeBlocks(TBuffer first, TBuffer second) {
OwnerBuffer MergeBlocks(OwnerBuffer first, OwnerBuffer second) {
const std::size_t size_1 = first->GetSize();
const std::size_t size_2 = second->GetSize();
const CacheAddr first_addr = first->GetCacheAddr();
const CacheAddr second_addr = second->GetCacheAddr();
const CacheAddr new_addr = std::min(first_addr, second_addr);
const VAddr first_addr = first->GetCpuAddr();
const VAddr second_addr = second->GetCpuAddr();
const VAddr new_addr = std::min(first_addr, second_addr);
const std::size_t new_size = size_1 + size_2;
TBuffer new_buffer = CreateBlock(new_addr, new_size);
OwnerBuffer new_buffer = CreateBlock(new_addr, new_size);
CopyBlock(first, new_buffer, 0, new_buffer->GetOffset(first_addr), size_1);
CopyBlock(second, new_buffer, 0, new_buffer->GetOffset(second_addr), size_2);
first->SetEpoch(epoch);
second->SetEpoch(epoch);
pending_destruction.push_back(first);
pending_destruction.push_back(second);
const CacheAddr cache_addr_end = new_addr + new_size - 1;
const VAddr cpu_addr_end = new_addr + new_size - 1;
u64 page_start = new_addr >> block_page_bits;
const u64 page_end = cache_addr_end >> block_page_bits;
const u64 page_end = cpu_addr_end >> block_page_bits;
while (page_start <= page_end) {
blocks[page_start] = new_buffer;
++page_start;
@@ -355,18 +377,18 @@ private:
return new_buffer;
}
TBuffer GetBlock(const CacheAddr cache_addr, const std::size_t size) {
TBuffer found{};
const CacheAddr cache_addr_end = cache_addr + size - 1;
u64 page_start = cache_addr >> block_page_bits;
const u64 page_end = cache_addr_end >> block_page_bits;
OwnerBuffer GetBlock(const VAddr cpu_addr, const std::size_t size) {
OwnerBuffer found;
const VAddr cpu_addr_end = cpu_addr + size - 1;
u64 page_start = cpu_addr >> block_page_bits;
const u64 page_end = cpu_addr_end >> block_page_bits;
while (page_start <= page_end) {
auto it = blocks.find(page_start);
if (it == blocks.end()) {
if (found) {
found = EnlargeBlock(found);
} else {
const CacheAddr start_addr = (page_start << block_page_bits);
const VAddr start_addr = (page_start << block_page_bits);
found = CreateBlock(start_addr, block_page_size);
blocks[page_start] = found;
}
@@ -386,7 +408,7 @@ private:
return found;
}
void MarkRegionAsWritten(const CacheAddr start, const CacheAddr end) {
void MarkRegionAsWritten(const VAddr start, const VAddr end) {
u64 page_start = start >> write_page_bit;
const u64 page_end = end >> write_page_bit;
while (page_start <= page_end) {
@@ -400,7 +422,7 @@ private:
}
}
void UnmarkRegionAsWritten(const CacheAddr start, const CacheAddr end) {
void UnmarkRegionAsWritten(const VAddr start, const VAddr end) {
u64 page_start = start >> write_page_bit;
const u64 page_end = end >> write_page_bit;
while (page_start <= page_end) {
@@ -416,7 +438,7 @@ private:
}
}
bool IsRegionWritten(const CacheAddr start, const CacheAddr end) const {
bool IsRegionWritten(const VAddr start, const VAddr end) const {
u64 page_start = start >> write_page_bit;
const u64 page_end = end >> write_page_bit;
while (page_start <= page_end) {
@@ -432,7 +454,7 @@ private:
Core::System& system;
std::unique_ptr<StreamBuffer> stream_buffer;
TBufferType stream_buffer_handle{};
BufferType stream_buffer_handle{};
bool invalidated = false;
@@ -440,8 +462,8 @@ private:
u64 buffer_offset = 0;
u64 buffer_offset_base = 0;
using IntervalSet = boost::icl::interval_set<CacheAddr>;
using IntervalCache = boost::icl::interval_map<CacheAddr, MapInterval>;
using IntervalSet = boost::icl::interval_set<VAddr>;
using IntervalCache = boost::icl::interval_map<VAddr, MapInterval>;
using IntervalType = typename IntervalCache::interval_type;
IntervalCache mapped_addresses;
@@ -450,12 +472,14 @@ private:
static constexpr u64 block_page_bits = 21;
static constexpr u64 block_page_size = 1ULL << block_page_bits;
std::unordered_map<u64, TBuffer> blocks;
std::unordered_map<u64, OwnerBuffer> blocks;
std::list<TBuffer> pending_destruction;
std::list<OwnerBuffer> pending_destruction;
u64 epoch = 0;
u64 modified_ticks = 0;
std::vector<u8> staging_buffer;
std::recursive_mutex mutex;
};

View File

@@ -11,7 +11,7 @@ namespace VideoCommon {
class MapIntervalBase {
public:
MapIntervalBase(const CacheAddr start, const CacheAddr end, const GPUVAddr gpu_addr)
MapIntervalBase(const VAddr start, const VAddr end, const GPUVAddr gpu_addr)
: start{start}, end{end}, gpu_addr{gpu_addr} {}
void SetCpuAddress(VAddr new_cpu_addr) {
@@ -26,7 +26,7 @@ public:
return gpu_addr;
}
bool IsInside(const CacheAddr other_start, const CacheAddr other_end) const {
bool IsInside(const VAddr other_start, const VAddr other_end) const {
return (start <= other_start && other_end <= end);
}
@@ -46,11 +46,11 @@ public:
return is_registered;
}
CacheAddr GetStart() const {
VAddr GetStart() const {
return start;
}
CacheAddr GetEnd() const {
VAddr GetEnd() const {
return end;
}
@@ -76,8 +76,8 @@ public:
}
private:
CacheAddr start;
CacheAddr end;
VAddr start;
VAddr end;
GPUVAddr gpu_addr;
VAddr cpu_addr{};
bool is_written{};

View File

@@ -303,6 +303,10 @@ public:
return (type == Type::SignedNorm) || (type == Type::UnsignedNorm);
}
bool IsConstant() const {
return constant;
}
bool IsValid() const {
return size != Size::Invalid;
}
@@ -312,6 +316,35 @@ public:
}
};
struct MsaaSampleLocation {
union {
BitField<0, 4, u32> x0;
BitField<4, 4, u32> y0;
BitField<8, 4, u32> x1;
BitField<12, 4, u32> y1;
BitField<16, 4, u32> x2;
BitField<20, 4, u32> y2;
BitField<24, 4, u32> x3;
BitField<28, 4, u32> y3;
};
constexpr std::pair<u32, u32> Location(int index) const {
switch (index) {
case 0:
return {x0, y0};
case 1:
return {x1, y1};
case 2:
return {x2, y2};
case 3:
return {x3, y3};
default:
UNREACHABLE();
return {0, 0};
}
}
};
enum class DepthMode : u32 {
MinusOneToOne = 0,
ZeroToOne = 1,
@@ -793,7 +826,13 @@ public:
u32 rt_separate_frag_data;
INSERT_UNION_PADDING_WORDS(0xC);
INSERT_UNION_PADDING_WORDS(0x1);
u32 multisample_raster_enable;
u32 multisample_raster_samples;
std::array<u32, 4> multisample_sample_mask;
INSERT_UNION_PADDING_WORDS(0x5);
struct {
u32 address_high;
@@ -830,7 +869,16 @@ public:
std::array<VertexAttribute, NumVertexAttributes> vertex_attrib_format;
INSERT_UNION_PADDING_WORDS(0xF);
std::array<MsaaSampleLocation, 4> multisample_sample_locations;
INSERT_UNION_PADDING_WORDS(0x2);
union {
BitField<0, 1, u32> enable;
BitField<4, 3, u32> target;
} multisample_coverage_to_color;
INSERT_UNION_PADDING_WORDS(0x8);
struct {
union {
@@ -922,7 +970,10 @@ public:
BitField<4, 1, u32> triangle_rast_flip;
} screen_y_control;
INSERT_UNION_PADDING_WORDS(0x21);
float line_width_smooth;
float line_width_aliased;
INSERT_UNION_PADDING_WORDS(0x1F);
u32 vb_element_base;
u32 vb_base_instance;
@@ -943,7 +994,7 @@ public:
CounterReset counter_reset;
INSERT_UNION_PADDING_WORDS(0x1);
u32 multisample_enable;
u32 zeta_enable;
@@ -980,7 +1031,7 @@ public:
float polygon_offset_factor;
INSERT_UNION_PADDING_WORDS(0x1);
u32 line_smooth_enable;
struct {
u32 tic_address_high;
@@ -1007,7 +1058,11 @@ public:
float polygon_offset_units;
INSERT_UNION_PADDING_WORDS(0x11);
INSERT_UNION_PADDING_WORDS(0x4);
Tegra::Texture::MsaaMode multisample_mode;
INSERT_UNION_PADDING_WORDS(0xC);
union {
BitField<2, 1, u32> coord_origin;
@@ -1507,12 +1562,17 @@ ASSERT_REG_POSITION(stencil_back_func_ref, 0x3D5);
ASSERT_REG_POSITION(stencil_back_mask, 0x3D6);
ASSERT_REG_POSITION(stencil_back_func_mask, 0x3D7);
ASSERT_REG_POSITION(color_mask_common, 0x3E4);
ASSERT_REG_POSITION(rt_separate_frag_data, 0x3EB);
ASSERT_REG_POSITION(depth_bounds, 0x3E7);
ASSERT_REG_POSITION(rt_separate_frag_data, 0x3EB);
ASSERT_REG_POSITION(multisample_raster_enable, 0x3ED);
ASSERT_REG_POSITION(multisample_raster_samples, 0x3EE);
ASSERT_REG_POSITION(multisample_sample_mask, 0x3EF);
ASSERT_REG_POSITION(zeta, 0x3F8);
ASSERT_REG_POSITION(clear_flags, 0x43E);
ASSERT_REG_POSITION(fill_rectangle, 0x44F);
ASSERT_REG_POSITION(vertex_attrib_format, 0x458);
ASSERT_REG_POSITION(multisample_sample_locations, 0x478);
ASSERT_REG_POSITION(multisample_coverage_to_color, 0x47E);
ASSERT_REG_POSITION(rt_control, 0x487);
ASSERT_REG_POSITION(zeta_width, 0x48a);
ASSERT_REG_POSITION(zeta_height, 0x48b);
@@ -1538,6 +1598,8 @@ ASSERT_REG_POSITION(stencil_front_func_mask, 0x4E6);
ASSERT_REG_POSITION(stencil_front_mask, 0x4E7);
ASSERT_REG_POSITION(frag_color_clamp, 0x4EA);
ASSERT_REG_POSITION(screen_y_control, 0x4EB);
ASSERT_REG_POSITION(line_width_smooth, 0x4EC);
ASSERT_REG_POSITION(line_width_aliased, 0x4ED);
ASSERT_REG_POSITION(vb_element_base, 0x50D);
ASSERT_REG_POSITION(vb_base_instance, 0x50E);
ASSERT_REG_POSITION(clip_distance_enabled, 0x544);
@@ -1545,11 +1607,13 @@ ASSERT_REG_POSITION(samplecnt_enable, 0x545);
ASSERT_REG_POSITION(point_size, 0x546);
ASSERT_REG_POSITION(point_sprite_enable, 0x548);
ASSERT_REG_POSITION(counter_reset, 0x54C);
ASSERT_REG_POSITION(multisample_enable, 0x54D);
ASSERT_REG_POSITION(zeta_enable, 0x54E);
ASSERT_REG_POSITION(multisample_control, 0x54F);
ASSERT_REG_POSITION(condition, 0x554);
ASSERT_REG_POSITION(tsc, 0x557);
ASSERT_REG_POSITION(polygon_offset_factor, 0x55b);
ASSERT_REG_POSITION(polygon_offset_factor, 0x55B);
ASSERT_REG_POSITION(line_smooth_enable, 0x55C);
ASSERT_REG_POSITION(tic, 0x55D);
ASSERT_REG_POSITION(stencil_two_side_enable, 0x565);
ASSERT_REG_POSITION(stencil_back_op_fail, 0x566);
@@ -1558,6 +1622,7 @@ ASSERT_REG_POSITION(stencil_back_op_zpass, 0x568);
ASSERT_REG_POSITION(stencil_back_func_func, 0x569);
ASSERT_REG_POSITION(framebuffer_srgb, 0x56E);
ASSERT_REG_POSITION(polygon_offset_units, 0x56F);
ASSERT_REG_POSITION(multisample_mode, 0x574);
ASSERT_REG_POSITION(point_coord_replace, 0x581);
ASSERT_REG_POSITION(code_address, 0x582);
ASSERT_REG_POSITION(draw, 0x585);

View File

@@ -290,6 +290,23 @@ enum class VmadShr : u64 {
Shr15 = 2,
};
enum class VmnmxType : u64 {
Bits8,
Bits16,
Bits32,
};
enum class VmnmxOperation : u64 {
Mrg_16H = 0,
Mrg_16L = 1,
Mrg_8B0 = 2,
Mrg_8B2 = 3,
Acc = 4,
Min = 5,
Max = 6,
Nop = 7,
};
enum class XmadMode : u64 {
None = 0,
CLo = 1,
@@ -988,6 +1005,12 @@ union Instruction {
BitField<46, 2, u64> cache_mode;
} stg;
union {
BitField<23, 3, AtomicOp> operation;
BitField<48, 1, u64> extended;
BitField<20, 3, GlobalAtomicType> type;
} red;
union {
BitField<52, 4, AtomicOp> operation;
BitField<49, 3, GlobalAtomicType> type;
@@ -1650,6 +1673,42 @@ union Instruction {
BitField<47, 1, u64> cc;
} vmad;
union {
BitField<54, 1, u64> is_dest_signed;
BitField<48, 1, u64> is_src_a_signed;
BitField<49, 1, u64> is_src_b_signed;
BitField<37, 2, u64> src_format_a;
BitField<29, 2, u64> src_format_b;
BitField<56, 1, u64> mx;
BitField<55, 1, u64> sat;
BitField<36, 2, u64> selector_a;
BitField<28, 2, u64> selector_b;
BitField<50, 1, u64> is_op_b_register;
BitField<51, 3, VmnmxOperation> operation;
VmnmxType SourceFormatA() const {
switch (src_format_a) {
case 0b11:
return VmnmxType::Bits32;
case 0b10:
return VmnmxType::Bits16;
default:
return VmnmxType::Bits8;
}
}
VmnmxType SourceFormatB() const {
switch (src_format_b) {
case 0b11:
return VmnmxType::Bits32;
case 0b10:
return VmnmxType::Bits16;
default:
return VmnmxType::Bits8;
}
}
} vmnmx;
union {
BitField<20, 16, u64> imm20_16;
BitField<35, 1, u64> high_b_rr; // used on RR
@@ -1712,6 +1771,7 @@ public:
BRK,
DEPBAR,
VOTE,
VOTE_VTG,
SHFL,
FSWZADD,
BFE_C,
@@ -1733,6 +1793,7 @@ public:
ST_S,
ST, // Store in generic memory
STG, // Store in global memory
RED, // Reduction operation
ATOM, // Atomic operation on global memory
ATOMS, // Atomic operation on shared memory
AL2P, // Transforms attribute memory into physical memory
@@ -1758,9 +1819,11 @@ public:
IPA,
OUT_R, // Emit vertex/primitive
ISBERD,
BAR,
MEMBAR,
VMAD,
VSETP,
VMNMX,
FFMA_IMM, // Fused Multiply and Add
FFMA_CR,
FFMA_RC,
@@ -1815,7 +1878,8 @@ public:
ICMP_R,
ICMP_CR,
ICMP_IMM,
FCMP_R,
FCMP_RR,
FCMP_RC,
MUFU, // Multi-Function Operator
RRO_C, // Range Reduction Operator
RRO_R,
@@ -1842,7 +1906,7 @@ public:
MOV_C,
MOV_R,
MOV_IMM,
MOV_SYS,
S2R,
MOV32_IMM,
SHL_C,
SHL_R,
@@ -2026,6 +2090,7 @@ private:
INST("111000110000----", Id::EXIT, Type::Flow, "EXIT"),
INST("1111000011110---", Id::DEPBAR, Type::Synch, "DEPBAR"),
INST("0101000011011---", Id::VOTE, Type::Warp, "VOTE"),
INST("0101000011100---", Id::VOTE_VTG, Type::Warp, "VOTE_VTG"),
INST("1110111100010---", Id::SHFL, Type::Warp, "SHFL"),
INST("0101000011111---", Id::FSWZADD, Type::Warp, "FSWZADD"),
INST("1110111111011---", Id::LD_A, Type::Memory, "LD_A"),
@@ -2039,6 +2104,7 @@ private:
INST("1110111101010---", Id::ST_L, Type::Memory, "ST_L"),
INST("101-------------", Id::ST, Type::Memory, "ST"),
INST("1110111011011---", Id::STG, Type::Memory, "STG"),
INST("1110101111111---", Id::RED, Type::Memory, "RED"),
INST("11101101--------", Id::ATOM, Type::Memory, "ATOM"),
INST("11101100--------", Id::ATOMS, Type::Memory, "ATOMS"),
INST("1110111110100---", Id::AL2P, Type::Memory, "AL2P"),
@@ -2063,9 +2129,11 @@ private:
INST("11100000--------", Id::IPA, Type::Trivial, "IPA"),
INST("1111101111100---", Id::OUT_R, Type::Trivial, "OUT_R"),
INST("1110111111010---", Id::ISBERD, Type::Trivial, "ISBERD"),
INST("1111000010101---", Id::BAR, Type::Trivial, "BAR"),
INST("1110111110011---", Id::MEMBAR, Type::Trivial, "MEMBAR"),
INST("01011111--------", Id::VMAD, Type::Video, "VMAD"),
INST("0101000011110---", Id::VSETP, Type::Video, "VSETP"),
INST("0011101---------", Id::VMNMX, Type::Video, "VMNMX"),
INST("0011001-1-------", Id::FFMA_IMM, Type::Ffma, "FFMA_IMM"),
INST("010010011-------", Id::FFMA_CR, Type::Ffma, "FFMA_CR"),
INST("010100011-------", Id::FFMA_RC, Type::Ffma, "FFMA_RC"),
@@ -2120,7 +2188,8 @@ private:
INST("0101110100100---", Id::HSETP2_R, Type::HalfSetPredicate, "HSETP2_R"),
INST("0111111-0-------", Id::HSETP2_IMM, Type::HalfSetPredicate, "HSETP2_IMM"),
INST("0101110100011---", Id::HSET2_R, Type::HalfSet, "HSET2_R"),
INST("010110111010----", Id::FCMP_R, Type::Arithmetic, "FCMP_R"),
INST("010110111010----", Id::FCMP_RR, Type::Arithmetic, "FCMP_RR"),
INST("010010111010----", Id::FCMP_RC, Type::Arithmetic, "FCMP_RC"),
INST("0101000010000---", Id::MUFU, Type::Arithmetic, "MUFU"),
INST("0100110010010---", Id::RRO_C, Type::Arithmetic, "RRO_C"),
INST("0101110010010---", Id::RRO_R, Type::Arithmetic, "RRO_R"),
@@ -2134,7 +2203,7 @@ private:
INST("0100110010011---", Id::MOV_C, Type::Arithmetic, "MOV_C"),
INST("0101110010011---", Id::MOV_R, Type::Arithmetic, "MOV_R"),
INST("0011100-10011---", Id::MOV_IMM, Type::Arithmetic, "MOV_IMM"),
INST("1111000011001---", Id::MOV_SYS, Type::Trivial, "MOV_SYS"),
INST("1111000011001---", Id::S2R, Type::Trivial, "S2R"),
INST("000000010000----", Id::MOV32_IMM, Type::ArithmeticImmediate, "MOV32_IMM"),
INST("0100110001100---", Id::FMNMX_C, Type::Arithmetic, "FMNMX_C"),
INST("0101110001100---", Id::FMNMX_R, Type::Arithmetic, "FMNMX_R"),
@@ -2166,7 +2235,7 @@ private:
INST("0011011-11111---", Id::SHF_LEFT_IMM, Type::Shift, "SHF_LEFT_IMM"),
INST("0100110011100---", Id::I2I_C, Type::Conversion, "I2I_C"),
INST("0101110011100---", Id::I2I_R, Type::Conversion, "I2I_R"),
INST("0011101-11100---", Id::I2I_IMM, Type::Conversion, "I2I_IMM"),
INST("0011100-11100---", Id::I2I_IMM, Type::Conversion, "I2I_IMM"),
INST("0100110010111---", Id::I2F_C, Type::Conversion, "I2F_C"),
INST("0101110010111---", Id::I2F_R, Type::Conversion, "I2F_R"),
INST("0011100-10111---", Id::I2F_IMM, Type::Conversion, "I2F_IMM"),

View File

@@ -4,6 +4,9 @@
#pragma once
#include <array>
#include <optional>
#include "common/bit_field.h"
#include "common/common_funcs.h"
#include "common/common_types.h"
@@ -16,7 +19,7 @@ enum class OutputTopology : u32 {
TriangleStrip = 7,
};
enum class AttributeUse : u8 {
enum class PixelImap : u8 {
Unused = 0,
Constant = 1,
Perspective = 2,
@@ -24,7 +27,7 @@ enum class AttributeUse : u8 {
};
// Documentation in:
// http://download.nvidia.com/open-gpu-doc/Shader-Program-Header/1/Shader-Program-Header.html#ImapTexture
// http://download.nvidia.com/open-gpu-doc/Shader-Program-Header/1/Shader-Program-Header.html
struct Header {
union {
BitField<0, 5, u32> sph_type;
@@ -59,8 +62,8 @@ struct Header {
union {
BitField<0, 12, u32> max_output_vertices;
BitField<12, 8, u32> store_req_start; // NOTE: not used by geometry shaders.
BitField<24, 4, u32> reserved;
BitField<12, 8, u32> store_req_end; // NOTE: not used by geometry shaders.
BitField<20, 4, u32> reserved;
BitField<24, 8, u32> store_req_end; // NOTE: not used by geometry shaders.
} common4{};
union {
@@ -93,17 +96,20 @@ struct Header {
struct {
INSERT_UNION_PADDING_BYTES(3); // ImapSystemValuesA
INSERT_UNION_PADDING_BYTES(1); // ImapSystemValuesB
union {
BitField<0, 2, AttributeUse> x;
BitField<2, 2, AttributeUse> y;
BitField<4, 2, AttributeUse> w;
BitField<6, 2, AttributeUse> z;
BitField<0, 2, PixelImap> x;
BitField<2, 2, PixelImap> y;
BitField<4, 2, PixelImap> z;
BitField<6, 2, PixelImap> w;
u8 raw;
} imap_generic_vector[32];
INSERT_UNION_PADDING_BYTES(2); // ImapColor
INSERT_UNION_PADDING_BYTES(2); // ImapSystemValuesC
INSERT_UNION_PADDING_BYTES(10); // ImapFixedFncTexture[10]
INSERT_UNION_PADDING_BYTES(2); // ImapReserved
struct {
u32 target;
union {
@@ -112,31 +118,30 @@ struct Header {
BitField<2, 30, u32> reserved;
};
} omap;
bool IsColorComponentOutputEnabled(u32 render_target, u32 component) const {
const u32 bit = render_target * 4 + component;
return omap.target & (1 << bit);
}
AttributeUse GetAttributeIndexUse(u32 attribute, u32 index) const {
return static_cast<AttributeUse>(
(imap_generic_vector[attribute].raw >> (index * 2)) & 0x03);
}
AttributeUse GetAttributeUse(u32 attribute) const {
AttributeUse result = AttributeUse::Unused;
for (u32 i = 0; i < 4; i++) {
const auto index = GetAttributeIndexUse(attribute, i);
if (index == AttributeUse::Unused) {
PixelImap GetPixelImap(u32 attribute) const {
const auto get_index = [this, attribute](u32 index) {
return static_cast<PixelImap>(
(imap_generic_vector[attribute].raw >> (index * 2)) & 3);
};
std::optional<PixelImap> result;
for (u32 component = 0; component < 4; ++component) {
const PixelImap index = get_index(component);
if (index == PixelImap::Unused) {
continue;
}
if (result == AttributeUse::Unused || result == index) {
result = index;
continue;
}
LOG_CRITICAL(HW_GPU, "Generic Attribute Conflict in Interpolation Mode");
if (index == AttributeUse::Perspective) {
result = index;
if (result && result != index) {
LOG_CRITICAL(HW_GPU, "Generic attribute conflict in interpolation mode");
}
result = index;
}
return result;
return result.value_or(PixelImap::Unused);
}
} ps;

View File

@@ -7,6 +7,7 @@
#include "core/core.h"
#include "core/core_timing.h"
#include "core/core_timing_util.h"
#include "core/frontend/emu_window.h"
#include "core/memory.h"
#include "video_core/engines/fermi_2d.h"
#include "video_core/engines/kepler_compute.h"
@@ -16,14 +17,15 @@
#include "video_core/gpu.h"
#include "video_core/memory_manager.h"
#include "video_core/renderer_base.h"
#include "video_core/video_core.h"
namespace Tegra {
MICROPROFILE_DEFINE(GPU_wait, "GPU", "Wait for the GPU", MP_RGB(128, 128, 192));
GPU::GPU(Core::System& system, VideoCore::RendererBase& renderer, bool is_async)
: system{system}, renderer{renderer}, is_async{is_async} {
auto& rasterizer{renderer.Rasterizer()};
GPU::GPU(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer_, bool is_async)
: system{system}, renderer{std::move(renderer_)}, is_async{is_async} {
auto& rasterizer{renderer->Rasterizer()};
memory_manager = std::make_unique<Tegra::MemoryManager>(system, rasterizer);
dma_pusher = std::make_unique<Tegra::DmaPusher>(*this);
maxwell_3d = std::make_unique<Engines::Maxwell3D>(system, rasterizer, *memory_manager);
@@ -137,7 +139,7 @@ u64 GPU::GetTicks() const {
}
void GPU::FlushCommands() {
renderer.Rasterizer().FlushCommands();
renderer->Rasterizer().FlushCommands();
}
// Note that, traditionally, methods are treated as 4-byte addressable locations, and hence

View File

@@ -25,8 +25,11 @@ inline u8* FromCacheAddr(CacheAddr cache_addr) {
}
namespace Core {
class System;
namespace Frontend {
class EmuWindow;
}
class System;
} // namespace Core
namespace VideoCore {
class RendererBase;
@@ -129,7 +132,8 @@ class MemoryManager;
class GPU {
public:
explicit GPU(Core::System& system, VideoCore::RendererBase& renderer, bool is_async);
explicit GPU(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer,
bool is_async);
virtual ~GPU();
@@ -174,6 +178,14 @@ public:
/// Returns a reference to the GPU DMA pusher.
Tegra::DmaPusher& DmaPusher();
VideoCore::RendererBase& Renderer() {
return *renderer;
}
const VideoCore::RendererBase& Renderer() const {
return *renderer;
}
// Waits for the GPU to finish working
virtual void WaitIdle() const = 0;
@@ -258,13 +270,13 @@ public:
virtual void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) = 0;
/// Notify rasterizer that any caches of the specified region should be flushed to Switch memory
virtual void FlushRegion(CacheAddr addr, u64 size) = 0;
virtual void FlushRegion(VAddr addr, u64 size) = 0;
/// Notify rasterizer that any caches of the specified region should be invalidated
virtual void InvalidateRegion(CacheAddr addr, u64 size) = 0;
virtual void InvalidateRegion(VAddr addr, u64 size) = 0;
/// Notify rasterizer that any caches of the specified region should be flushed and invalidated
virtual void FlushAndInvalidateRegion(CacheAddr addr, u64 size) = 0;
virtual void FlushAndInvalidateRegion(VAddr addr, u64 size) = 0;
protected:
virtual void TriggerCpuInterrupt(u32 syncpoint_id, u32 value) const = 0;
@@ -287,7 +299,7 @@ private:
protected:
std::unique_ptr<Tegra::DmaPusher> dma_pusher;
Core::System& system;
VideoCore::RendererBase& renderer;
std::unique_ptr<VideoCore::RendererBase> renderer;
private:
std::unique_ptr<Tegra::MemoryManager> memory_manager;

View File

@@ -10,13 +10,17 @@
namespace VideoCommon {
GPUAsynch::GPUAsynch(Core::System& system, VideoCore::RendererBase& renderer)
: GPU(system, renderer, true), gpu_thread{system} {}
GPUAsynch::GPUAsynch(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer_,
std::unique_ptr<Core::Frontend::GraphicsContext>&& context)
: GPU(system, std::move(renderer_), true), gpu_thread{system},
cpu_context(renderer->GetRenderWindow().CreateSharedContext()),
gpu_context(std::move(context)) {}
GPUAsynch::~GPUAsynch() = default;
void GPUAsynch::Start() {
gpu_thread.StartThread(renderer, *dma_pusher);
cpu_context->MakeCurrent();
gpu_thread.StartThread(*renderer, *gpu_context, *dma_pusher);
}
void GPUAsynch::PushGPUEntries(Tegra::CommandList&& entries) {
@@ -27,15 +31,15 @@ void GPUAsynch::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
gpu_thread.SwapBuffers(framebuffer);
}
void GPUAsynch::FlushRegion(CacheAddr addr, u64 size) {
void GPUAsynch::FlushRegion(VAddr addr, u64 size) {
gpu_thread.FlushRegion(addr, size);
}
void GPUAsynch::InvalidateRegion(CacheAddr addr, u64 size) {
void GPUAsynch::InvalidateRegion(VAddr addr, u64 size) {
gpu_thread.InvalidateRegion(addr, size);
}
void GPUAsynch::FlushAndInvalidateRegion(CacheAddr addr, u64 size) {
void GPUAsynch::FlushAndInvalidateRegion(VAddr addr, u64 size) {
gpu_thread.FlushAndInvalidateRegion(addr, size);
}

View File

@@ -7,6 +7,10 @@
#include "video_core/gpu.h"
#include "video_core/gpu_thread.h"
namespace Core::Frontend {
class GraphicsContext;
}
namespace VideoCore {
class RendererBase;
} // namespace VideoCore
@@ -16,15 +20,16 @@ namespace VideoCommon {
/// Implementation of GPU interface that runs the GPU asynchronously
class GPUAsynch final : public Tegra::GPU {
public:
explicit GPUAsynch(Core::System& system, VideoCore::RendererBase& renderer);
explicit GPUAsynch(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer,
std::unique_ptr<Core::Frontend::GraphicsContext>&& context);
~GPUAsynch() override;
void Start() override;
void PushGPUEntries(Tegra::CommandList&& entries) override;
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
void FlushRegion(CacheAddr addr, u64 size) override;
void InvalidateRegion(CacheAddr addr, u64 size) override;
void FlushAndInvalidateRegion(CacheAddr addr, u64 size) override;
void FlushRegion(VAddr addr, u64 size) override;
void InvalidateRegion(VAddr addr, u64 size) override;
void FlushAndInvalidateRegion(VAddr addr, u64 size) override;
void WaitIdle() const override;
protected:
@@ -32,6 +37,8 @@ protected:
private:
GPUThread::ThreadManager gpu_thread;
std::unique_ptr<Core::Frontend::GraphicsContext> cpu_context;
std::unique_ptr<Core::Frontend::GraphicsContext> gpu_context;
};
} // namespace VideoCommon

View File

@@ -7,12 +7,15 @@
namespace VideoCommon {
GPUSynch::GPUSynch(Core::System& system, VideoCore::RendererBase& renderer)
: GPU(system, renderer, false) {}
GPUSynch::GPUSynch(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer,
std::unique_ptr<Core::Frontend::GraphicsContext>&& context)
: GPU(system, std::move(renderer), false), context{std::move(context)} {}
GPUSynch::~GPUSynch() = default;
void GPUSynch::Start() {}
void GPUSynch::Start() {
context->MakeCurrent();
}
void GPUSynch::PushGPUEntries(Tegra::CommandList&& entries) {
dma_pusher->Push(std::move(entries));
@@ -20,19 +23,19 @@ void GPUSynch::PushGPUEntries(Tegra::CommandList&& entries) {
}
void GPUSynch::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
renderer.SwapBuffers(framebuffer);
renderer->SwapBuffers(framebuffer);
}
void GPUSynch::FlushRegion(CacheAddr addr, u64 size) {
renderer.Rasterizer().FlushRegion(addr, size);
void GPUSynch::FlushRegion(VAddr addr, u64 size) {
renderer->Rasterizer().FlushRegion(addr, size);
}
void GPUSynch::InvalidateRegion(CacheAddr addr, u64 size) {
renderer.Rasterizer().InvalidateRegion(addr, size);
void GPUSynch::InvalidateRegion(VAddr addr, u64 size) {
renderer->Rasterizer().InvalidateRegion(addr, size);
}
void GPUSynch::FlushAndInvalidateRegion(CacheAddr addr, u64 size) {
renderer.Rasterizer().FlushAndInvalidateRegion(addr, size);
void GPUSynch::FlushAndInvalidateRegion(VAddr addr, u64 size) {
renderer->Rasterizer().FlushAndInvalidateRegion(addr, size);
}
} // namespace VideoCommon

View File

@@ -6,6 +6,10 @@
#include "video_core/gpu.h"
namespace Core::Frontend {
class GraphicsContext;
}
namespace VideoCore {
class RendererBase;
} // namespace VideoCore
@@ -15,20 +19,24 @@ namespace VideoCommon {
/// Implementation of GPU interface that runs the GPU synchronously
class GPUSynch final : public Tegra::GPU {
public:
explicit GPUSynch(Core::System& system, VideoCore::RendererBase& renderer);
explicit GPUSynch(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer,
std::unique_ptr<Core::Frontend::GraphicsContext>&& context);
~GPUSynch() override;
void Start() override;
void PushGPUEntries(Tegra::CommandList&& entries) override;
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
void FlushRegion(CacheAddr addr, u64 size) override;
void InvalidateRegion(CacheAddr addr, u64 size) override;
void FlushAndInvalidateRegion(CacheAddr addr, u64 size) override;
void FlushRegion(VAddr addr, u64 size) override;
void InvalidateRegion(VAddr addr, u64 size) override;
void FlushAndInvalidateRegion(VAddr addr, u64 size) override;
void WaitIdle() const override {}
protected:
void TriggerCpuInterrupt([[maybe_unused]] u32 syncpoint_id,
[[maybe_unused]] u32 value) const override {}
private:
std::unique_ptr<Core::Frontend::GraphicsContext> context;
};
} // namespace VideoCommon

View File

@@ -5,7 +5,7 @@
#include "common/assert.h"
#include "common/microprofile.h"
#include "core/core.h"
#include "core/frontend/scope_acquire_context.h"
#include "core/frontend/emu_window.h"
#include "video_core/dma_pusher.h"
#include "video_core/gpu.h"
#include "video_core/gpu_thread.h"
@@ -14,8 +14,8 @@
namespace VideoCommon::GPUThread {
/// Runs the GPU thread
static void RunThread(VideoCore::RendererBase& renderer, Tegra::DmaPusher& dma_pusher,
SynchState& state) {
static void RunThread(VideoCore::RendererBase& renderer, Core::Frontend::GraphicsContext& context,
Tegra::DmaPusher& dma_pusher, SynchState& state) {
MicroProfileOnThreadCreate("GpuThread");
// Wait for first GPU command before acquiring the window context
@@ -27,7 +27,7 @@ static void RunThread(VideoCore::RendererBase& renderer, Tegra::DmaPusher& dma_p
return;
}
Core::Frontend::ScopeAcquireContext acquire_context{renderer.GetRenderWindow()};
auto current_context = context.Acquire();
CommandDataContainer next;
while (state.is_running) {
@@ -62,8 +62,11 @@ ThreadManager::~ThreadManager() {
thread.join();
}
void ThreadManager::StartThread(VideoCore::RendererBase& renderer, Tegra::DmaPusher& dma_pusher) {
thread = std::thread{RunThread, std::ref(renderer), std::ref(dma_pusher), std::ref(state)};
void ThreadManager::StartThread(VideoCore::RendererBase& renderer,
Core::Frontend::GraphicsContext& context,
Tegra::DmaPusher& dma_pusher) {
thread = std::thread{RunThread, std::ref(renderer), std::ref(context), std::ref(dma_pusher),
std::ref(state)};
}
void ThreadManager::SubmitList(Tegra::CommandList&& entries) {
@@ -74,15 +77,15 @@ void ThreadManager::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
PushCommand(SwapBuffersCommand(framebuffer ? std::make_optional(*framebuffer) : std::nullopt));
}
void ThreadManager::FlushRegion(CacheAddr addr, u64 size) {
void ThreadManager::FlushRegion(VAddr addr, u64 size) {
PushCommand(FlushRegionCommand(addr, size));
}
void ThreadManager::InvalidateRegion(CacheAddr addr, u64 size) {
void ThreadManager::InvalidateRegion(VAddr addr, u64 size) {
system.Renderer().Rasterizer().InvalidateRegion(addr, size);
}
void ThreadManager::FlushAndInvalidateRegion(CacheAddr addr, u64 size) {
void ThreadManager::FlushAndInvalidateRegion(VAddr addr, u64 size) {
// Skip flush on asynch mode, as FlushAndInvalidateRegion is not used for anything too important
InvalidateRegion(addr, size);
}

View File

@@ -10,7 +10,6 @@
#include <optional>
#include <thread>
#include <variant>
#include "common/threadsafe_queue.h"
#include "video_core/gpu.h"
@@ -20,6 +19,9 @@ class DmaPusher;
} // namespace Tegra
namespace Core {
namespace Frontend {
class GraphicsContext;
}
class System;
} // namespace Core
@@ -45,26 +47,26 @@ struct SwapBuffersCommand final {
/// Command to signal to the GPU thread to flush a region
struct FlushRegionCommand final {
explicit constexpr FlushRegionCommand(CacheAddr addr, u64 size) : addr{addr}, size{size} {}
explicit constexpr FlushRegionCommand(VAddr addr, u64 size) : addr{addr}, size{size} {}
CacheAddr addr;
VAddr addr;
u64 size;
};
/// Command to signal to the GPU thread to invalidate a region
struct InvalidateRegionCommand final {
explicit constexpr InvalidateRegionCommand(CacheAddr addr, u64 size) : addr{addr}, size{size} {}
explicit constexpr InvalidateRegionCommand(VAddr addr, u64 size) : addr{addr}, size{size} {}
CacheAddr addr;
VAddr addr;
u64 size;
};
/// Command to signal to the GPU thread to flush and invalidate a region
struct FlushAndInvalidateRegionCommand final {
explicit constexpr FlushAndInvalidateRegionCommand(CacheAddr addr, u64 size)
explicit constexpr FlushAndInvalidateRegionCommand(VAddr addr, u64 size)
: addr{addr}, size{size} {}
CacheAddr addr;
VAddr addr;
u64 size;
};
@@ -99,7 +101,8 @@ public:
~ThreadManager();
/// Creates and starts the GPU thread.
void StartThread(VideoCore::RendererBase& renderer, Tegra::DmaPusher& dma_pusher);
void StartThread(VideoCore::RendererBase& renderer, Core::Frontend::GraphicsContext& context,
Tegra::DmaPusher& dma_pusher);
/// Push GPU command entries to be processed
void SubmitList(Tegra::CommandList&& entries);
@@ -108,13 +111,13 @@ public:
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer);
/// Notify rasterizer that any caches of the specified region should be flushed to Switch memory
void FlushRegion(CacheAddr addr, u64 size);
void FlushRegion(VAddr addr, u64 size);
/// Notify rasterizer that any caches of the specified region should be invalidated
void InvalidateRegion(CacheAddr addr, u64 size);
void InvalidateRegion(VAddr addr, u64 size);
/// Notify rasterizer that any caches of the specified region should be flushed and invalidated
void FlushAndInvalidateRegion(CacheAddr addr, u64 size);
void FlushAndInvalidateRegion(VAddr addr, u64 size);
// Wait until the gpu thread is idle.
void WaitIdle() const;

View File

@@ -81,12 +81,11 @@ GPUVAddr MemoryManager::UnmapBuffer(GPUVAddr gpu_addr, u64 size) {
ASSERT((gpu_addr & page_mask) == 0);
const u64 aligned_size{Common::AlignUp(size, page_size)};
const CacheAddr cache_addr{ToCacheAddr(GetPointer(gpu_addr))};
const auto cpu_addr = GpuToCpuAddress(gpu_addr);
ASSERT(cpu_addr);
// Flush and invalidate through the GPU interface, to be asynchronous if possible.
system.GPU().FlushAndInvalidateRegion(cache_addr, aligned_size);
system.GPU().FlushAndInvalidateRegion(*cpu_addr, aligned_size);
UnmapRange(gpu_addr, aligned_size);
ASSERT(system.CurrentProcess()
@@ -140,11 +139,11 @@ T MemoryManager::Read(GPUVAddr addr) const {
return {};
}
const u8* page_pointer{page_table.pointers[addr >> page_bits]};
const u8* page_pointer{GetPointer(addr)};
if (page_pointer) {
// NOTE: Avoid adding any extra logic to this fast-path block
T value;
std::memcpy(&value, &page_pointer[addr & page_mask], sizeof(T));
std::memcpy(&value, page_pointer, sizeof(T));
return value;
}
@@ -167,10 +166,10 @@ void MemoryManager::Write(GPUVAddr addr, T data) {
return;
}
u8* page_pointer{page_table.pointers[addr >> page_bits]};
u8* page_pointer{GetPointer(addr)};
if (page_pointer) {
// NOTE: Avoid adding any extra logic to this fast-path block
std::memcpy(&page_pointer[addr & page_mask], &data, sizeof(T));
std::memcpy(page_pointer, &data, sizeof(T));
return;
}
@@ -201,9 +200,12 @@ u8* MemoryManager::GetPointer(GPUVAddr addr) {
return {};
}
u8* const page_pointer{page_table.pointers[addr >> page_bits]};
if (page_pointer != nullptr) {
return page_pointer + (addr & page_mask);
auto& memory = system.Memory();
const VAddr page_addr{page_table.backing_addr[addr >> page_bits]};
if (page_addr != 0) {
return memory.GetPointer(page_addr + (addr & page_mask));
}
LOG_ERROR(HW_GPU, "Unknown GetPointer @ 0x{:016X}", addr);
@@ -215,9 +217,12 @@ const u8* MemoryManager::GetPointer(GPUVAddr addr) const {
return {};
}
const u8* const page_pointer{page_table.pointers[addr >> page_bits]};
if (page_pointer != nullptr) {
return page_pointer + (addr & page_mask);
const auto& memory = system.Memory();
const VAddr page_addr{page_table.backing_addr[addr >> page_bits]};
if (page_addr != 0) {
return memory.GetPointer(page_addr + (addr & page_mask));
}
LOG_ERROR(HW_GPU, "Unknown GetPointer @ 0x{:016X}", addr);
@@ -238,17 +243,19 @@ void MemoryManager::ReadBlock(GPUVAddr src_addr, void* dest_buffer, const std::s
std::size_t page_index{src_addr >> page_bits};
std::size_t page_offset{src_addr & page_mask};
auto& memory = system.Memory();
while (remaining_size > 0) {
const std::size_t copy_amount{
std::min(static_cast<std::size_t>(page_size) - page_offset, remaining_size)};
switch (page_table.attributes[page_index]) {
case Common::PageType::Memory: {
const u8* src_ptr{page_table.pointers[page_index] + page_offset};
const VAddr src_addr{page_table.backing_addr[page_index] + page_offset};
// Flush must happen on the rasterizer interface, such that memory is always synchronous
// when it is read (even when in asynchronous GPU mode). Fixes Dead Cells title menu.
rasterizer.FlushRegion(ToCacheAddr(src_ptr), copy_amount);
std::memcpy(dest_buffer, src_ptr, copy_amount);
rasterizer.FlushRegion(src_addr, copy_amount);
memory.ReadBlockUnsafe(src_addr, dest_buffer, copy_amount);
break;
}
default:
@@ -268,13 +275,15 @@ void MemoryManager::ReadBlockUnsafe(GPUVAddr src_addr, void* dest_buffer,
std::size_t page_index{src_addr >> page_bits};
std::size_t page_offset{src_addr & page_mask};
auto& memory = system.Memory();
while (remaining_size > 0) {
const std::size_t copy_amount{
std::min(static_cast<std::size_t>(page_size) - page_offset, remaining_size)};
const u8* page_pointer = page_table.pointers[page_index];
if (page_pointer) {
const u8* src_ptr{page_pointer + page_offset};
std::memcpy(dest_buffer, src_ptr, copy_amount);
const VAddr src_addr{page_table.backing_addr[page_index] + page_offset};
memory.ReadBlockUnsafe(src_addr, dest_buffer, copy_amount);
} else {
std::memset(dest_buffer, 0, copy_amount);
}
@@ -290,17 +299,19 @@ void MemoryManager::WriteBlock(GPUVAddr dest_addr, const void* src_buffer, const
std::size_t page_index{dest_addr >> page_bits};
std::size_t page_offset{dest_addr & page_mask};
auto& memory = system.Memory();
while (remaining_size > 0) {
const std::size_t copy_amount{
std::min(static_cast<std::size_t>(page_size) - page_offset, remaining_size)};
switch (page_table.attributes[page_index]) {
case Common::PageType::Memory: {
u8* dest_ptr{page_table.pointers[page_index] + page_offset};
const VAddr dest_addr{page_table.backing_addr[page_index] + page_offset};
// Invalidate must happen on the rasterizer interface, such that memory is always
// synchronous when it is written (even when in asynchronous GPU mode).
rasterizer.InvalidateRegion(ToCacheAddr(dest_ptr), copy_amount);
std::memcpy(dest_ptr, src_buffer, copy_amount);
rasterizer.InvalidateRegion(dest_addr, copy_amount);
memory.WriteBlockUnsafe(dest_addr, src_buffer, copy_amount);
break;
}
default:
@@ -320,13 +331,15 @@ void MemoryManager::WriteBlockUnsafe(GPUVAddr dest_addr, const void* src_buffer,
std::size_t page_index{dest_addr >> page_bits};
std::size_t page_offset{dest_addr & page_mask};
auto& memory = system.Memory();
while (remaining_size > 0) {
const std::size_t copy_amount{
std::min(static_cast<std::size_t>(page_size) - page_offset, remaining_size)};
u8* page_pointer = page_table.pointers[page_index];
if (page_pointer) {
u8* dest_ptr{page_pointer + page_offset};
std::memcpy(dest_ptr, src_buffer, copy_amount);
const VAddr dest_addr{page_table.backing_addr[page_index] + page_offset};
memory.WriteBlockUnsafe(dest_addr, src_buffer, copy_amount);
}
page_index++;
page_offset = 0;
@@ -336,33 +349,9 @@ void MemoryManager::WriteBlockUnsafe(GPUVAddr dest_addr, const void* src_buffer,
}
void MemoryManager::CopyBlock(GPUVAddr dest_addr, GPUVAddr src_addr, const std::size_t size) {
std::size_t remaining_size{size};
std::size_t page_index{src_addr >> page_bits};
std::size_t page_offset{src_addr & page_mask};
while (remaining_size > 0) {
const std::size_t copy_amount{
std::min(static_cast<std::size_t>(page_size) - page_offset, remaining_size)};
switch (page_table.attributes[page_index]) {
case Common::PageType::Memory: {
// Flush must happen on the rasterizer interface, such that memory is always synchronous
// when it is copied (even when in asynchronous GPU mode).
const u8* src_ptr{page_table.pointers[page_index] + page_offset};
rasterizer.FlushRegion(ToCacheAddr(src_ptr), copy_amount);
WriteBlock(dest_addr, src_ptr, copy_amount);
break;
}
default:
UNREACHABLE();
}
page_index++;
page_offset = 0;
dest_addr += static_cast<VAddr>(copy_amount);
src_addr += static_cast<VAddr>(copy_amount);
remaining_size -= copy_amount;
}
std::vector<u8> tmp_buffer(size);
ReadBlock(src_addr, tmp_buffer.data(), size);
WriteBlock(dest_addr, tmp_buffer.data(), size);
}
void MemoryManager::CopyBlockUnsafe(GPUVAddr dest_addr, GPUVAddr src_addr, const std::size_t size) {
@@ -371,6 +360,12 @@ void MemoryManager::CopyBlockUnsafe(GPUVAddr dest_addr, GPUVAddr src_addr, const
WriteBlockUnsafe(dest_addr, tmp_buffer.data(), size);
}
bool MemoryManager::IsGranularRange(GPUVAddr gpu_addr, std::size_t size) {
const VAddr addr = page_table.backing_addr[gpu_addr >> page_bits];
const std::size_t page = (addr & Memory::PAGE_MASK) + size;
return page <= Memory::PAGE_SIZE;
}
void MemoryManager::MapPages(GPUVAddr base, u64 size, u8* memory, Common::PageType type,
VAddr backing_addr) {
LOG_DEBUG(HW_GPU, "Mapping {} onto {:016X}-{:016X}", fmt::ptr(memory), base * page_size,

View File

@@ -97,6 +97,11 @@ public:
void WriteBlockUnsafe(GPUVAddr dest_addr, const void* src_buffer, std::size_t size);
void CopyBlockUnsafe(GPUVAddr dest_addr, GPUVAddr src_addr, std::size_t size);
/**
* IsGranularRange checks if a gpu region can be simply read with a pointer
*/
bool IsGranularRange(GPUVAddr gpu_addr, std::size_t size);
private:
using VMAMap = std::map<GPUVAddr, VirtualMemoryArea>;
using VMAHandle = VMAMap::const_iterator;

View File

@@ -98,12 +98,12 @@ public:
static_cast<QueryCache&>(*this),
VideoCore::QueryType::SamplesPassed}}} {}
void InvalidateRegion(CacheAddr addr, std::size_t size) {
void InvalidateRegion(VAddr addr, std::size_t size) {
std::unique_lock lock{mutex};
FlushAndRemoveRegion(addr, size);
}
void FlushRegion(CacheAddr addr, std::size_t size) {
void FlushRegion(VAddr addr, std::size_t size) {
std::unique_lock lock{mutex};
FlushAndRemoveRegion(addr, size);
}
@@ -117,14 +117,16 @@ public:
void Query(GPUVAddr gpu_addr, VideoCore::QueryType type, std::optional<u64> timestamp) {
std::unique_lock lock{mutex};
auto& memory_manager = system.GPU().MemoryManager();
const auto host_ptr = memory_manager.GetPointer(gpu_addr);
const std::optional<VAddr> cpu_addr_opt = memory_manager.GpuToCpuAddress(gpu_addr);
ASSERT(cpu_addr_opt);
VAddr cpu_addr = *cpu_addr_opt;
CachedQuery* query = TryGet(ToCacheAddr(host_ptr));
CachedQuery* query = TryGet(cpu_addr);
if (!query) {
const auto cpu_addr = memory_manager.GpuToCpuAddress(gpu_addr);
ASSERT_OR_EXECUTE(cpu_addr, return;);
ASSERT_OR_EXECUTE(cpu_addr_opt, return;);
const auto host_ptr = memory_manager.GetPointer(gpu_addr);
query = Register(type, *cpu_addr, host_ptr, timestamp.has_value());
query = Register(type, cpu_addr, host_ptr, timestamp.has_value());
}
query->BindCounter(Stream(type).Current(), timestamp);
@@ -173,11 +175,11 @@ protected:
private:
/// Flushes a memory range to guest memory and removes it from the cache.
void FlushAndRemoveRegion(CacheAddr addr, std::size_t size) {
void FlushAndRemoveRegion(VAddr addr, std::size_t size) {
const u64 addr_begin = static_cast<u64>(addr);
const u64 addr_end = addr_begin + static_cast<u64>(size);
const auto in_range = [addr_begin, addr_end](CachedQuery& query) {
const u64 cache_begin = query.GetCacheAddr();
const u64 cache_begin = query.GetCpuAddr();
const u64 cache_end = cache_begin + query.SizeInBytes();
return cache_begin < addr_end && addr_begin < cache_end;
};
@@ -193,7 +195,7 @@ private:
if (!in_range(query)) {
continue;
}
rasterizer.UpdatePagesCachedCount(query.CpuAddr(), query.SizeInBytes(), -1);
rasterizer.UpdatePagesCachedCount(query.GetCpuAddr(), query.SizeInBytes(), -1);
query.Flush();
}
contents.erase(std::remove_if(std::begin(contents), std::end(contents), in_range),
@@ -204,22 +206,21 @@ private:
/// Registers the passed parameters as cached and returns a pointer to the stored cached query.
CachedQuery* Register(VideoCore::QueryType type, VAddr cpu_addr, u8* host_ptr, bool timestamp) {
rasterizer.UpdatePagesCachedCount(cpu_addr, CachedQuery::SizeInBytes(timestamp), 1);
const u64 page = static_cast<u64>(ToCacheAddr(host_ptr)) >> PAGE_SHIFT;
const u64 page = static_cast<u64>(cpu_addr) >> PAGE_SHIFT;
return &cached_queries[page].emplace_back(static_cast<QueryCache&>(*this), type, cpu_addr,
host_ptr);
}
/// Tries to a get a cached query. Returns nullptr on failure.
CachedQuery* TryGet(CacheAddr addr) {
CachedQuery* TryGet(VAddr addr) {
const u64 page = static_cast<u64>(addr) >> PAGE_SHIFT;
const auto it = cached_queries.find(page);
if (it == std::end(cached_queries)) {
return nullptr;
}
auto& contents = it->second;
const auto found =
std::find_if(std::begin(contents), std::end(contents),
[addr](auto& query) { return query.GetCacheAddr() == addr; });
const auto found = std::find_if(std::begin(contents), std::end(contents),
[addr](auto& query) { return query.GetCpuAddr() == addr; });
return found != std::end(contents) ? &*found : nullptr;
}
@@ -323,14 +324,10 @@ public:
timestamp = timestamp_;
}
VAddr CpuAddr() const noexcept {
VAddr GetCpuAddr() const noexcept {
return cpu_addr;
}
CacheAddr GetCacheAddr() const noexcept {
return ToCacheAddr(host_ptr);
}
u64 SizeInBytes() const noexcept {
return SizeInBytes(timestamp.has_value());
}

View File

@@ -18,22 +18,14 @@
class RasterizerCacheObject {
public:
explicit RasterizerCacheObject(const u8* host_ptr)
: host_ptr{host_ptr}, cache_addr{ToCacheAddr(host_ptr)} {}
explicit RasterizerCacheObject(const VAddr cpu_addr) : cpu_addr{cpu_addr} {}
virtual ~RasterizerCacheObject();
CacheAddr GetCacheAddr() const {
return cache_addr;
VAddr GetCpuAddr() const {
return cpu_addr;
}
const u8* GetHostPtr() const {
return host_ptr;
}
/// Gets the address of the shader in guest memory, required for cache management
virtual VAddr GetCpuAddr() const = 0;
/// Gets the size of the shader in guest memory, required for cache management
virtual std::size_t GetSizeInBytes() const = 0;
@@ -68,8 +60,7 @@ private:
bool is_registered{}; ///< Whether the object is currently registered with the cache
bool is_dirty{}; ///< Whether the object is dirty (out of sync with guest memory)
u64 last_modified_ticks{}; ///< When the object was last modified, used for in-order flushing
const u8* host_ptr{}; ///< Pointer to the memory backing this cached region
CacheAddr cache_addr{}; ///< Cache address memory, unique from emulated virtual address space
VAddr cpu_addr{}; ///< Cpu address memory, unique from emulated virtual address space
};
template <class T>
@@ -80,7 +71,7 @@ public:
explicit RasterizerCache(VideoCore::RasterizerInterface& rasterizer) : rasterizer{rasterizer} {}
/// Write any cached resources overlapping the specified region back to memory
void FlushRegion(CacheAddr addr, std::size_t size) {
void FlushRegion(VAddr addr, std::size_t size) {
std::lock_guard lock{mutex};
const auto& objects{GetSortedObjectsFromRegion(addr, size)};
@@ -90,7 +81,7 @@ public:
}
/// Mark the specified region as being invalidated
void InvalidateRegion(CacheAddr addr, u64 size) {
void InvalidateRegion(VAddr addr, u64 size) {
std::lock_guard lock{mutex};
const auto& objects{GetSortedObjectsFromRegion(addr, size)};
@@ -114,27 +105,20 @@ public:
protected:
/// Tries to get an object from the cache with the specified cache address
T TryGet(CacheAddr addr) const {
T TryGet(VAddr addr) const {
const auto iter = map_cache.find(addr);
if (iter != map_cache.end())
return iter->second;
return nullptr;
}
T TryGet(const void* addr) const {
const auto iter = map_cache.find(ToCacheAddr(addr));
if (iter != map_cache.end())
return iter->second;
return nullptr;
}
/// Register an object into the cache
virtual void Register(const T& object) {
std::lock_guard lock{mutex};
object->SetIsRegistered(true);
interval_cache.add({GetInterval(object), ObjectSet{object}});
map_cache.insert({object->GetCacheAddr(), object});
map_cache.insert({object->GetCpuAddr(), object});
rasterizer.UpdatePagesCachedCount(object->GetCpuAddr(), object->GetSizeInBytes(), 1);
}
@@ -144,7 +128,7 @@ protected:
object->SetIsRegistered(false);
rasterizer.UpdatePagesCachedCount(object->GetCpuAddr(), object->GetSizeInBytes(), -1);
const CacheAddr addr = object->GetCacheAddr();
const VAddr addr = object->GetCpuAddr();
interval_cache.subtract({GetInterval(object), ObjectSet{object}});
map_cache.erase(addr);
}
@@ -173,7 +157,7 @@ protected:
private:
/// Returns a list of cached objects from the specified memory region, ordered by access time
std::vector<T> GetSortedObjectsFromRegion(CacheAddr addr, u64 size) {
std::vector<T> GetSortedObjectsFromRegion(VAddr addr, u64 size) {
if (size == 0) {
return {};
}
@@ -197,13 +181,13 @@ private:
}
using ObjectSet = std::set<T>;
using ObjectCache = std::unordered_map<CacheAddr, T>;
using IntervalCache = boost::icl::interval_map<CacheAddr, ObjectSet>;
using ObjectCache = std::unordered_map<VAddr, T>;
using IntervalCache = boost::icl::interval_map<VAddr, ObjectSet>;
using ObjectInterval = typename IntervalCache::interval_type;
static auto GetInterval(const T& object) {
return ObjectInterval::right_open(object->GetCacheAddr(),
object->GetCacheAddr() + object->GetSizeInBytes());
return ObjectInterval::right_open(object->GetCpuAddr(),
object->GetCpuAddr() + object->GetSizeInBytes());
}
ObjectCache map_cache;

View File

@@ -53,14 +53,14 @@ public:
virtual void FlushAll() = 0;
/// Notify rasterizer that any caches of the specified region should be flushed to Switch memory
virtual void FlushRegion(CacheAddr addr, u64 size) = 0;
virtual void FlushRegion(VAddr addr, u64 size) = 0;
/// Notify rasterizer that any caches of the specified region should be invalidated
virtual void InvalidateRegion(CacheAddr addr, u64 size) = 0;
virtual void InvalidateRegion(VAddr addr, u64 size) = 0;
/// Notify rasterizer that any caches of the specified region should be flushed to Switch memory
/// and invalidated
virtual void FlushAndInvalidateRegion(CacheAddr addr, u64 size) = 0;
virtual void FlushAndInvalidateRegion(VAddr addr, u64 size) = 0;
/// Notify the rasterizer to send all written commands to the host GPU.
virtual void FlushCommands() = 0;

View File

@@ -46,7 +46,8 @@ public:
/// Draws the latest frame to the window waiting timeout_ms for a frame to arrive (Renderer
/// specific implementation)
virtual void TryPresent(int timeout_ms) = 0;
/// Returns true if a frame was drawn
virtual bool TryPresent(int timeout_ms) = 0;
// Getter/setter functions:
// ------------------------

View File

@@ -21,8 +21,8 @@ using Maxwell = Tegra::Engines::Maxwell3D::Regs;
MICROPROFILE_DEFINE(OpenGL_Buffer_Download, "OpenGL", "Buffer Download", MP_RGB(192, 192, 128));
CachedBufferBlock::CachedBufferBlock(CacheAddr cache_addr, const std::size_t size)
: VideoCommon::BufferBlock{cache_addr, size} {
CachedBufferBlock::CachedBufferBlock(VAddr cpu_addr, const std::size_t size)
: VideoCommon::BufferBlock{cpu_addr, size} {
gl_buffer.Create();
glNamedBufferData(gl_buffer.handle, static_cast<GLsizeiptr>(size), nullptr, GL_DYNAMIC_DRAW);
}
@@ -47,41 +47,39 @@ OGLBufferCache::~OGLBufferCache() {
glDeleteBuffers(static_cast<GLsizei>(std::size(cbufs)), std::data(cbufs));
}
Buffer OGLBufferCache::CreateBlock(CacheAddr cache_addr, std::size_t size) {
return std::make_shared<CachedBufferBlock>(cache_addr, size);
Buffer OGLBufferCache::CreateBlock(VAddr cpu_addr, std::size_t size) {
return std::make_shared<CachedBufferBlock>(cpu_addr, size);
}
void OGLBufferCache::WriteBarrier() {
glMemoryBarrier(GL_ALL_BARRIER_BITS);
}
const GLuint* OGLBufferCache::ToHandle(const Buffer& buffer) {
GLuint OGLBufferCache::ToHandle(const Buffer& buffer) {
return buffer->GetHandle();
}
const GLuint* OGLBufferCache::GetEmptyBuffer(std::size_t) {
static const GLuint null_buffer = 0;
return &null_buffer;
GLuint OGLBufferCache::GetEmptyBuffer(std::size_t) {
return 0;
}
void OGLBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) {
glNamedBufferSubData(*buffer->GetHandle(), static_cast<GLintptr>(offset),
glNamedBufferSubData(buffer->GetHandle(), static_cast<GLintptr>(offset),
static_cast<GLsizeiptr>(size), data);
}
void OGLBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
u8* data) {
MICROPROFILE_SCOPE(OpenGL_Buffer_Download);
glGetNamedBufferSubData(*buffer->GetHandle(), static_cast<GLintptr>(offset),
glGetNamedBufferSubData(buffer->GetHandle(), static_cast<GLintptr>(offset),
static_cast<GLsizeiptr>(size), data);
}
void OGLBufferCache::CopyBlock(const Buffer& src, const Buffer& dst, std::size_t src_offset,
std::size_t dst_offset, std::size_t size) {
glCopyNamedBufferSubData(*src->GetHandle(), *dst->GetHandle(),
static_cast<GLintptr>(src_offset), static_cast<GLintptr>(dst_offset),
static_cast<GLsizeiptr>(size));
glCopyNamedBufferSubData(src->GetHandle(), dst->GetHandle(), static_cast<GLintptr>(src_offset),
static_cast<GLintptr>(dst_offset), static_cast<GLsizeiptr>(size));
}
OGLBufferCache::BufferInfo OGLBufferCache::ConstBufferUpload(const void* raw_pointer,
@@ -89,7 +87,7 @@ OGLBufferCache::BufferInfo OGLBufferCache::ConstBufferUpload(const void* raw_poi
DEBUG_ASSERT(cbuf_cursor < std::size(cbufs));
const GLuint& cbuf = cbufs[cbuf_cursor++];
glNamedBufferSubData(cbuf, 0, static_cast<GLsizeiptr>(size), raw_pointer);
return {&cbuf, 0};
return {cbuf, 0};
}
} // namespace OpenGL

View File

@@ -31,15 +31,15 @@ using GenericBufferCache = VideoCommon::BufferCache<Buffer, GLuint, OGLStreamBuf
class CachedBufferBlock : public VideoCommon::BufferBlock {
public:
explicit CachedBufferBlock(CacheAddr cache_addr, const std::size_t size);
explicit CachedBufferBlock(VAddr cpu_addr, const std::size_t size);
~CachedBufferBlock();
const GLuint* GetHandle() const {
return &gl_buffer.handle;
GLuint GetHandle() const {
return gl_buffer.handle;
}
private:
OGLBuffer gl_buffer{};
OGLBuffer gl_buffer;
};
class OGLBufferCache final : public GenericBufferCache {
@@ -48,19 +48,19 @@ public:
const Device& device, std::size_t stream_size);
~OGLBufferCache();
const GLuint* GetEmptyBuffer(std::size_t) override;
GLuint GetEmptyBuffer(std::size_t) override;
void Acquire() noexcept {
cbuf_cursor = 0;
}
protected:
Buffer CreateBlock(CacheAddr cache_addr, std::size_t size) override;
Buffer CreateBlock(VAddr cpu_addr, std::size_t size) override;
GLuint ToHandle(const Buffer& buffer) override;
void WriteBarrier() override;
const GLuint* ToHandle(const Buffer& buffer) override;
void UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) override;

View File

@@ -87,7 +87,7 @@ u32 Extract(u32& base, u32& num, u32 amount, std::optional<GLenum> limit = {}) {
std::array<Device::BaseBindings, Tegra::Engines::MaxShaderTypes> BuildBaseBindings() noexcept {
std::array<Device::BaseBindings, Tegra::Engines::MaxShaderTypes> bindings;
static std::array<std::size_t, 5> stage_swizzle = {0, 1, 2, 3, 4};
static constexpr std::array<std::size_t, 5> stage_swizzle{0, 1, 2, 3, 4};
const u32 total_ubos = GetInteger<u32>(GL_MAX_UNIFORM_BUFFER_BINDINGS);
const u32 total_ssbos = GetInteger<u32>(GL_MAX_SHADER_STORAGE_BUFFER_BINDINGS);
const u32 total_samplers = GetInteger<u32>(GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS);
@@ -131,6 +131,31 @@ std::array<Device::BaseBindings, Tegra::Engines::MaxShaderTypes> BuildBaseBindin
return bindings;
}
bool IsASTCSupported() {
static constexpr std::array formats = {
GL_COMPRESSED_RGBA_ASTC_4x4_KHR, GL_COMPRESSED_RGBA_ASTC_5x4_KHR,
GL_COMPRESSED_RGBA_ASTC_5x5_KHR, GL_COMPRESSED_RGBA_ASTC_6x5_KHR,
GL_COMPRESSED_RGBA_ASTC_6x6_KHR, GL_COMPRESSED_RGBA_ASTC_8x5_KHR,
GL_COMPRESSED_RGBA_ASTC_8x6_KHR, GL_COMPRESSED_RGBA_ASTC_8x8_KHR,
GL_COMPRESSED_RGBA_ASTC_10x5_KHR, GL_COMPRESSED_RGBA_ASTC_10x6_KHR,
GL_COMPRESSED_RGBA_ASTC_10x8_KHR, GL_COMPRESSED_RGBA_ASTC_10x10_KHR,
GL_COMPRESSED_RGBA_ASTC_12x10_KHR, GL_COMPRESSED_RGBA_ASTC_12x12_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR,
GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR, GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR,
};
return std::find_if_not(formats.begin(), formats.end(), [](GLenum format) {
GLint supported;
glGetInternalformativ(GL_TEXTURE_2D, format, GL_INTERNALFORMAT_SUPPORTED, 1,
&supported);
return supported == GL_TRUE;
}) == formats.end();
}
} // Anonymous namespace
Device::Device() : base_bindings{BuildBaseBindings()} {
@@ -152,6 +177,7 @@ Device::Device() : base_bindings{BuildBaseBindings()} {
has_shader_ballot = GLAD_GL_ARB_shader_ballot;
has_vertex_viewport_layer = GLAD_GL_ARB_shader_viewport_layer_array;
has_image_load_formatted = HasExtension(extensions, "GL_EXT_shader_image_load_formatted");
has_astc = IsASTCSupported();
has_variable_aoffi = TestVariableAoffi();
has_component_indexing_bug = is_amd;
has_precise_bug = TestPreciseBug();

View File

@@ -64,6 +64,10 @@ public:
return has_image_load_formatted;
}
bool HasASTC() const {
return has_astc;
}
bool HasVariableAoffi() const {
return has_variable_aoffi;
}
@@ -97,6 +101,7 @@ private:
bool has_shader_ballot{};
bool has_vertex_viewport_layer{};
bool has_image_load_formatted{};
bool has_astc{};
bool has_variable_aoffi{};
bool has_component_indexing_bug{};
bool has_precise_bug{};

View File

@@ -140,8 +140,8 @@ void RasterizerOpenGL::SetupVertexFormat() {
const auto attrib = gpu.regs.vertex_attrib_format[index];
const auto gl_index = static_cast<GLuint>(index);
// Ignore invalid attributes.
if (!attrib.IsValid()) {
// Disable constant attributes.
if (attrib.IsConstant()) {
glDisableVertexAttribArray(gl_index);
continue;
}
@@ -188,10 +188,8 @@ void RasterizerOpenGL::SetupVertexBuffer() {
ASSERT(end > start);
const u64 size = end - start + 1;
const auto [vertex_buffer, vertex_buffer_offset] = buffer_cache.UploadMemory(start, size);
// Bind the vertex array to the buffer at the current offset.
vertex_array_pushbuffer.SetVertexBuffer(static_cast<GLuint>(index), vertex_buffer,
vertex_buffer_offset, vertex_array.stride);
glBindVertexBuffer(static_cast<GLuint>(index), vertex_buffer, vertex_buffer_offset,
vertex_array.stride);
}
}
@@ -222,7 +220,7 @@ GLintptr RasterizerOpenGL::SetupIndexBuffer() {
const auto& regs = system.GPU().Maxwell3D().regs;
const std::size_t size = CalculateIndexBufferSize();
const auto [buffer, offset] = buffer_cache.UploadMemory(regs.index_array.IndexStart(), size);
vertex_array_pushbuffer.SetIndexBuffer(buffer);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buffer);
return offset;
}
@@ -345,7 +343,7 @@ void RasterizerOpenGL::ConfigureFramebuffers() {
texture_cache.GuardRenderTargets(true);
View depth_surface = texture_cache.GetDepthBufferSurface(true);
View depth_surface = texture_cache.GetDepthBufferSurface();
const auto& regs = gpu.regs;
UNIMPLEMENTED_IF(regs.rt_separate_frag_data == 0);
@@ -354,7 +352,7 @@ void RasterizerOpenGL::ConfigureFramebuffers() {
FramebufferCacheKey key;
const auto colors_count = static_cast<std::size_t>(regs.rt_control.count);
for (std::size_t index = 0; index < colors_count; ++index) {
View color_surface{texture_cache.GetColorBufferSurface(index, true)};
View color_surface{texture_cache.GetColorBufferSurface(index)};
if (!color_surface) {
continue;
}
@@ -386,11 +384,14 @@ void RasterizerOpenGL::ConfigureClearFramebuffer(bool using_color_fb, bool using
texture_cache.GuardRenderTargets(true);
View color_surface;
if (using_color_fb) {
color_surface = texture_cache.GetColorBufferSurface(regs.clear_buffers.RT, false);
const std::size_t index = regs.clear_buffers.RT;
color_surface = texture_cache.GetColorBufferSurface(index);
texture_cache.MarkColorBufferInUse(index);
}
View depth_surface;
if (using_depth_fb || using_stencil_fb) {
depth_surface = texture_cache.GetDepthBufferSurface(false);
depth_surface = texture_cache.GetDepthBufferSurface();
texture_cache.MarkDepthBufferInUse();
}
texture_cache.GuardRenderTargets(false);
@@ -444,6 +445,7 @@ void RasterizerOpenGL::Clear() {
}
SyncRasterizeEnable();
SyncStencilTestState();
if (regs.clear_flags.scissor) {
SyncScissorTest();
@@ -492,6 +494,7 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
SyncPrimitiveRestart();
SyncScissorTest();
SyncPointState();
SyncLineState();
SyncPolygonOffset();
SyncAlphaTest();
SyncFramebufferSRGB();
@@ -519,7 +522,6 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
// Prepare vertex array format.
SetupVertexFormat();
vertex_array_pushbuffer.Setup();
// Upload vertex and index data.
SetupVertexBuffer();
@@ -529,17 +531,13 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
index_buffer_offset = SetupIndexBuffer();
}
// Prepare packed bindings.
bind_ubo_pushbuffer.Setup();
bind_ssbo_pushbuffer.Setup();
// Setup emulation uniform buffer.
GLShader::MaxwellUniformData ubo;
ubo.SetFromRegs(gpu);
const auto [buffer, offset] =
buffer_cache.UploadHostMemory(&ubo, sizeof(ubo), device.GetUniformBufferAlignment());
bind_ubo_pushbuffer.Push(EmulationUniformBlockBinding, buffer, offset,
static_cast<GLsizeiptr>(sizeof(ubo)));
glBindBufferRange(GL_UNIFORM_BUFFER, EmulationUniformBlockBinding, buffer, offset,
static_cast<GLsizeiptr>(sizeof(ubo)));
// Setup shaders and their used resources.
texture_cache.GuardSamplers(true);
@@ -552,11 +550,6 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
// Signal the buffer cache that we are not going to upload more things.
buffer_cache.Unmap();
// Now that we are no longer uploading data, we can safely bind the buffers to OpenGL.
vertex_array_pushbuffer.Bind();
bind_ubo_pushbuffer.Bind();
bind_ssbo_pushbuffer.Bind();
program_manager.BindGraphicsPipeline();
if (texture_cache.TextureBarrier()) {
@@ -625,17 +618,11 @@ void RasterizerOpenGL::DispatchCompute(GPUVAddr code_addr) {
(Maxwell::MaxConstBufferSize + device.GetUniformBufferAlignment());
buffer_cache.Map(buffer_size);
bind_ubo_pushbuffer.Setup();
bind_ssbo_pushbuffer.Setup();
SetupComputeConstBuffers(kernel);
SetupComputeGlobalMemory(kernel);
buffer_cache.Unmap();
bind_ubo_pushbuffer.Bind();
bind_ssbo_pushbuffer.Bind();
const auto& launch_desc = system.GPU().KeplerCompute().launch_description;
glDispatchCompute(launch_desc.grid_dim_x, launch_desc.grid_dim_y, launch_desc.grid_dim_z);
++num_queued_commands;
@@ -652,9 +639,9 @@ void RasterizerOpenGL::Query(GPUVAddr gpu_addr, VideoCore::QueryType type,
void RasterizerOpenGL::FlushAll() {}
void RasterizerOpenGL::FlushRegion(CacheAddr addr, u64 size) {
void RasterizerOpenGL::FlushRegion(VAddr addr, u64 size) {
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
if (!addr || !size) {
if (addr == 0 || size == 0) {
return;
}
texture_cache.FlushRegion(addr, size);
@@ -662,9 +649,9 @@ void RasterizerOpenGL::FlushRegion(CacheAddr addr, u64 size) {
query_cache.FlushRegion(addr, size);
}
void RasterizerOpenGL::InvalidateRegion(CacheAddr addr, u64 size) {
void RasterizerOpenGL::InvalidateRegion(VAddr addr, u64 size) {
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
if (!addr || !size) {
if (addr == 0 || size == 0) {
return;
}
texture_cache.InvalidateRegion(addr, size);
@@ -673,7 +660,7 @@ void RasterizerOpenGL::InvalidateRegion(CacheAddr addr, u64 size) {
query_cache.InvalidateRegion(addr, size);
}
void RasterizerOpenGL::FlushAndInvalidateRegion(CacheAddr addr, u64 size) {
void RasterizerOpenGL::FlushAndInvalidateRegion(VAddr addr, u64 size) {
if (Settings::values.use_accurate_gpu_emulation) {
FlushRegion(addr, size);
}
@@ -712,8 +699,7 @@ bool RasterizerOpenGL::AccelerateDisplay(const Tegra::FramebufferConfig& config,
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
const auto surface{
texture_cache.TryFindFramebufferSurface(system.Memory().GetPointer(framebuffer_addr))};
const auto surface{texture_cache.TryFindFramebufferSurface(framebuffer_addr)};
if (!surface) {
return {};
}
@@ -767,8 +753,8 @@ void RasterizerOpenGL::SetupConstBuffer(u32 binding, const Tegra::Engines::Const
const ConstBufferEntry& entry) {
if (!buffer.enabled) {
// Set values to zero to unbind buffers
bind_ubo_pushbuffer.Push(binding, buffer_cache.GetEmptyBuffer(sizeof(float)), 0,
sizeof(float));
glBindBufferRange(GL_UNIFORM_BUFFER, binding, buffer_cache.GetEmptyBuffer(sizeof(float)), 0,
sizeof(float));
return;
}
@@ -779,7 +765,7 @@ void RasterizerOpenGL::SetupConstBuffer(u32 binding, const Tegra::Engines::Const
const auto alignment = device.GetUniformBufferAlignment();
const auto [cbuf, offset] = buffer_cache.UploadMemory(buffer.address, size, alignment, false,
device.HasFastBufferSubData());
bind_ubo_pushbuffer.Push(binding, cbuf, offset, size);
glBindBufferRange(GL_UNIFORM_BUFFER, binding, cbuf, offset, size);
}
void RasterizerOpenGL::SetupDrawGlobalMemory(std::size_t stage_index, const Shader& shader) {
@@ -815,7 +801,8 @@ void RasterizerOpenGL::SetupGlobalMemory(u32 binding, const GlobalMemoryEntry& e
const auto alignment{device.GetShaderStorageBufferAlignment()};
const auto [ssbo, buffer_offset] =
buffer_cache.UploadMemory(gpu_addr, size, alignment, entry.IsWritten());
bind_ssbo_pushbuffer.Push(binding, ssbo, buffer_offset, static_cast<GLsizeiptr>(size));
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, binding, ssbo, buffer_offset,
static_cast<GLsizeiptr>(size));
}
void RasterizerOpenGL::SetupDrawTextures(std::size_t stage_index, const Shader& shader) {
@@ -1052,12 +1039,8 @@ void RasterizerOpenGL::SyncStencilTestState() {
flags[Dirty::StencilTest] = false;
const auto& regs = gpu.regs;
if (!regs.stencil_enable) {
glDisable(GL_STENCIL_TEST);
return;
}
oglEnable(GL_STENCIL_TEST, regs.stencil_enable);
glEnable(GL_STENCIL_TEST);
glStencilFuncSeparate(GL_FRONT, MaxwellToGL::ComparisonOp(regs.stencil_front_func_func),
regs.stencil_front_func_ref, regs.stencil_front_func_mask);
glStencilOpSeparate(GL_FRONT, MaxwellToGL::StencilOp(regs.stencil_front_op_fail),
@@ -1312,6 +1295,19 @@ void RasterizerOpenGL::SyncPointState() {
glDisable(GL_PROGRAM_POINT_SIZE);
}
void RasterizerOpenGL::SyncLineState() {
auto& gpu = system.GPU().Maxwell3D();
auto& flags = gpu.dirty.flags;
if (!flags[Dirty::LineWidth]) {
return;
}
flags[Dirty::LineWidth] = false;
const auto& regs = gpu.regs;
oglEnable(GL_LINE_SMOOTH, regs.line_smooth_enable);
glLineWidth(regs.line_smooth_enable ? regs.line_width_smooth : regs.line_width_aliased);
}
void RasterizerOpenGL::SyncPolygonOffset() {
auto& gpu = system.GPU().Maxwell3D();
auto& flags = gpu.dirty.flags;
@@ -1419,7 +1415,7 @@ void RasterizerOpenGL::EndTransformFeedback() {
const GPUVAddr gpu_addr = binding.Address();
const std::size_t size = binding.buffer_size;
const auto [dest_buffer, offset] = buffer_cache.UploadMemory(gpu_addr, size, 4, true);
glCopyNamedBufferSubData(handle, *dest_buffer, 0, offset, static_cast<GLsizeiptr>(size));
glCopyNamedBufferSubData(handle, dest_buffer, 0, offset, static_cast<GLsizeiptr>(size));
}
}

View File

@@ -65,9 +65,9 @@ public:
void ResetCounter(VideoCore::QueryType type) override;
void Query(GPUVAddr gpu_addr, VideoCore::QueryType type, std::optional<u64> timestamp) override;
void FlushAll() override;
void FlushRegion(CacheAddr addr, u64 size) override;
void InvalidateRegion(CacheAddr addr, u64 size) override;
void FlushAndInvalidateRegion(CacheAddr addr, u64 size) override;
void FlushRegion(VAddr addr, u64 size) override;
void InvalidateRegion(VAddr addr, u64 size) override;
void FlushAndInvalidateRegion(VAddr addr, u64 size) override;
void FlushCommands() override;
void TickFrame() override;
bool AccelerateSurfaceCopy(const Tegra::Engines::Fermi2D::Regs::Surface& src,
@@ -171,6 +171,9 @@ private:
/// Syncs the point state to match the guest state
void SyncPointState();
/// Syncs the line state to match the guest state
void SyncLineState();
/// Syncs the rasterizer enable state to match the guest state
void SyncRasterizeEnable();
@@ -228,9 +231,7 @@ private:
static constexpr std::size_t STREAM_BUFFER_SIZE = 128 * 1024 * 1024;
OGLBufferCache buffer_cache;
VertexArrayPushBuffer vertex_array_pushbuffer{state_tracker};
BindBuffersRangePushBuffer bind_ubo_pushbuffer{GL_UNIFORM_BUFFER};
BindBuffersRangePushBuffer bind_ssbo_pushbuffer{GL_SHADER_STORAGE_BUFFER};
GLint vertex_binding = 0;
std::array<OGLBuffer, Tegra::Engines::Maxwell3D::Regs::NumTransformFeedbackBuffers>
transform_feedback_buffers;

View File

@@ -34,6 +34,8 @@
namespace OpenGL {
using Tegra::Engines::ShaderType;
using VideoCommon::Shader::CompileDepth;
using VideoCommon::Shader::CompilerSettings;
using VideoCommon::Shader::ProgramCode;
using VideoCommon::Shader::Registry;
using VideoCommon::Shader::ShaderIR;
@@ -43,7 +45,7 @@ namespace {
constexpr u32 STAGE_MAIN_OFFSET = 10;
constexpr u32 KERNEL_MAIN_OFFSET = 0;
constexpr VideoCommon::Shader::CompilerSettings COMPILER_SETTINGS{};
constexpr CompilerSettings COMPILER_SETTINGS{CompileDepth::FullDecompile};
/// Gets the address for the specified shader stage program
GPUVAddr GetShaderAddress(Core::System& system, Maxwell::ShaderProgram program) {
@@ -214,11 +216,11 @@ std::unordered_set<GLenum> GetSupportedFormats() {
} // Anonymous namespace
CachedShader::CachedShader(const u8* host_ptr, VAddr cpu_addr, std::size_t size_in_bytes,
CachedShader::CachedShader(VAddr cpu_addr, std::size_t size_in_bytes,
std::shared_ptr<VideoCommon::Shader::Registry> registry,
ShaderEntries entries, std::shared_ptr<OGLProgram> program)
: RasterizerCacheObject{host_ptr}, registry{std::move(registry)}, entries{std::move(entries)},
cpu_addr{cpu_addr}, size_in_bytes{size_in_bytes}, program{std::move(program)} {}
: RasterizerCacheObject{cpu_addr}, registry{std::move(registry)}, entries{std::move(entries)},
size_in_bytes{size_in_bytes}, program{std::move(program)} {}
CachedShader::~CachedShader() = default;
@@ -254,9 +256,8 @@ Shader CachedShader::CreateStageFromMemory(const ShaderParameters& params,
entry.bindless_samplers = registry->GetBindlessSamplers();
params.disk_cache.SaveEntry(std::move(entry));
return std::shared_ptr<CachedShader>(new CachedShader(params.host_ptr, params.cpu_addr,
size_in_bytes, std::move(registry),
MakeEntries(ir), std::move(program)));
return std::shared_ptr<CachedShader>(new CachedShader(
params.cpu_addr, size_in_bytes, std::move(registry), MakeEntries(ir), std::move(program)));
}
Shader CachedShader::CreateKernelFromMemory(const ShaderParameters& params, ProgramCode code) {
@@ -279,17 +280,16 @@ Shader CachedShader::CreateKernelFromMemory(const ShaderParameters& params, Prog
entry.bindless_samplers = registry->GetBindlessSamplers();
params.disk_cache.SaveEntry(std::move(entry));
return std::shared_ptr<CachedShader>(new CachedShader(params.host_ptr, params.cpu_addr,
size_in_bytes, std::move(registry),
MakeEntries(ir), std::move(program)));
return std::shared_ptr<CachedShader>(new CachedShader(
params.cpu_addr, size_in_bytes, std::move(registry), MakeEntries(ir), std::move(program)));
}
Shader CachedShader::CreateFromCache(const ShaderParameters& params,
const PrecompiledShader& precompiled_shader,
std::size_t size_in_bytes) {
return std::shared_ptr<CachedShader>(new CachedShader(
params.host_ptr, params.cpu_addr, size_in_bytes, precompiled_shader.registry,
precompiled_shader.entries, precompiled_shader.program));
return std::shared_ptr<CachedShader>(
new CachedShader(params.cpu_addr, size_in_bytes, precompiled_shader.registry,
precompiled_shader.entries, precompiled_shader.program));
}
ShaderCacheOpenGL::ShaderCacheOpenGL(RasterizerOpenGL& rasterizer, Core::System& system,
@@ -327,8 +327,7 @@ void ShaderCacheOpenGL::LoadDiskCache(const std::atomic_bool& stop_loading,
const auto worker = [&](Core::Frontend::GraphicsContext* context, std::size_t begin,
std::size_t end) {
context->MakeCurrent();
SCOPE_EXIT({ return context->DoneCurrent(); });
const auto scope = context->Acquire();
for (std::size_t i = begin; i < end; ++i) {
if (stop_loading) {
@@ -450,12 +449,14 @@ Shader ShaderCacheOpenGL::GetStageProgram(Maxwell::ShaderProgram program) {
const GPUVAddr address{GetShaderAddress(system, program)};
// Look up shader in the cache based on address
const auto host_ptr{memory_manager.GetPointer(address)};
Shader shader{TryGet(host_ptr)};
const auto cpu_addr{memory_manager.GpuToCpuAddress(address)};
Shader shader{cpu_addr ? TryGet(*cpu_addr) : nullptr};
if (shader) {
return last_shaders[static_cast<std::size_t>(program)] = shader;
}
const auto host_ptr{memory_manager.GetPointer(address)};
// No shader found - create a new one
ProgramCode code{GetShaderCode(memory_manager, address, host_ptr)};
ProgramCode code_b;
@@ -466,9 +467,9 @@ Shader ShaderCacheOpenGL::GetStageProgram(Maxwell::ShaderProgram program) {
const auto unique_identifier = GetUniqueIdentifier(
GetShaderType(program), program == Maxwell::ShaderProgram::VertexA, code, code_b);
const auto cpu_addr{*memory_manager.GpuToCpuAddress(address)};
const ShaderParameters params{system, disk_cache, device,
cpu_addr, host_ptr, unique_identifier};
const ShaderParameters params{system, disk_cache, device,
*cpu_addr, host_ptr, unique_identifier};
const auto found = runtime_cache.find(unique_identifier);
if (found == runtime_cache.end()) {
@@ -485,18 +486,20 @@ Shader ShaderCacheOpenGL::GetStageProgram(Maxwell::ShaderProgram program) {
Shader ShaderCacheOpenGL::GetComputeKernel(GPUVAddr code_addr) {
auto& memory_manager{system.GPU().MemoryManager()};
const auto host_ptr{memory_manager.GetPointer(code_addr)};
auto kernel = TryGet(host_ptr);
const auto cpu_addr{memory_manager.GpuToCpuAddress(code_addr)};
auto kernel = cpu_addr ? TryGet(*cpu_addr) : nullptr;
if (kernel) {
return kernel;
}
const auto host_ptr{memory_manager.GetPointer(code_addr)};
// No kernel found, create a new one
auto code{GetShaderCode(memory_manager, code_addr, host_ptr)};
const auto unique_identifier{GetUniqueIdentifier(ShaderType::Compute, false, code)};
const auto cpu_addr{*memory_manager.GpuToCpuAddress(code_addr)};
const ShaderParameters params{system, disk_cache, device,
cpu_addr, host_ptr, unique_identifier};
const ShaderParameters params{system, disk_cache, device,
*cpu_addr, host_ptr, unique_identifier};
const auto found = runtime_cache.find(unique_identifier);
if (found == runtime_cache.end()) {

View File

@@ -65,11 +65,6 @@ public:
/// Gets the GL program handle for the shader
GLuint GetHandle() const;
/// Returns the guest CPU address of the shader
VAddr GetCpuAddr() const override {
return cpu_addr;
}
/// Returns the size in bytes of the shader
std::size_t GetSizeInBytes() const override {
return size_in_bytes;
@@ -90,13 +85,12 @@ public:
std::size_t size_in_bytes);
private:
explicit CachedShader(const u8* host_ptr, VAddr cpu_addr, std::size_t size_in_bytes,
explicit CachedShader(VAddr cpu_addr, std::size_t size_in_bytes,
std::shared_ptr<VideoCommon::Shader::Registry> registry,
ShaderEntries entries, std::shared_ptr<OGLProgram> program);
std::shared_ptr<VideoCommon::Shader::Registry> registry;
ShaderEntries entries;
VAddr cpu_addr = 0;
std::size_t size_in_bytes = 0;
std::shared_ptr<OGLProgram> program;
};

View File

@@ -31,11 +31,11 @@ namespace {
using Tegra::Engines::ShaderType;
using Tegra::Shader::Attribute;
using Tegra::Shader::AttributeUse;
using Tegra::Shader::Header;
using Tegra::Shader::IpaInterpMode;
using Tegra::Shader::IpaMode;
using Tegra::Shader::IpaSampleMode;
using Tegra::Shader::PixelImap;
using Tegra::Shader::Register;
using VideoCommon::Shader::BuildTransformFeedback;
using VideoCommon::Shader::Registry;
@@ -702,20 +702,19 @@ private:
code.AddNewLine();
}
std::string GetInputFlags(AttributeUse attribute) {
const char* GetInputFlags(PixelImap attribute) {
switch (attribute) {
case AttributeUse::Perspective:
// Default, Smooth
return {};
case AttributeUse::Constant:
return "flat ";
case AttributeUse::ScreenLinear:
return "noperspective ";
default:
case AttributeUse::Unused:
UNIMPLEMENTED_MSG("Unknown attribute usage index={}", static_cast<u32>(attribute));
return {};
case PixelImap::Perspective:
return "smooth";
case PixelImap::Constant:
return "flat";
case PixelImap::ScreenLinear:
return "noperspective";
case PixelImap::Unused:
break;
}
UNIMPLEMENTED_MSG("Unknown attribute usage index={}", static_cast<int>(attribute));
return {};
}
void DeclareInputAttributes() {
@@ -749,8 +748,8 @@ private:
std::string suffix;
if (stage == ShaderType::Fragment) {
const auto input_mode{header.ps.GetAttributeUse(location)};
if (skip_unused && input_mode == AttributeUse::Unused) {
const auto input_mode{header.ps.GetPixelImap(location)};
if (input_mode == PixelImap::Unused) {
return;
}
suffix = GetInputFlags(input_mode);
@@ -927,7 +926,7 @@ private:
const u32 address{generic_base + index * generic_stride + element * element_stride};
const bool declared = stage != ShaderType::Fragment ||
header.ps.GetAttributeUse(index) != AttributeUse::Unused;
header.ps.GetPixelImap(index) != PixelImap::Unused;
const std::string value =
declared ? ReadAttribute(attribute, element).AsFloat() : "0.0f";
code.AddLine("case 0x{:X}U: return {};", address, value);
@@ -1142,8 +1141,7 @@ private:
GetSwizzle(element)),
Type::Float};
case ShaderType::Fragment:
return {element == 3 ? "1.0f" : ("gl_FragCoord"s + GetSwizzle(element)),
Type::Float};
return {"gl_FragCoord"s + GetSwizzle(element), Type::Float};
default:
UNREACHABLE();
}
@@ -1821,15 +1819,17 @@ private:
}
Expression HMergeH0(Operation operation) {
std::string dest = VisitOperand(operation, 0).AsUint();
std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("(({} & 0x0000FFFFU) | ({} & 0xFFFF0000U))", src, dest), Type::Uint};
const std::string dest = VisitOperand(operation, 0).AsUint();
const std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("vec2(unpackHalf2x16({}).x, unpackHalf2x16({}).y)", src, dest),
Type::HalfFloat};
}
Expression HMergeH1(Operation operation) {
std::string dest = VisitOperand(operation, 0).AsUint();
std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("(({} & 0x0000FFFFU) | ({} & 0xFFFF0000U))", dest, src), Type::Uint};
const std::string dest = VisitOperand(operation, 0).AsUint();
const std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("vec2(unpackHalf2x16({}).x, unpackHalf2x16({}).y)", dest, src),
Type::HalfFloat};
}
Expression HPack2(Operation operation) {
@@ -2119,8 +2119,14 @@ private:
return {};
}
return {fmt::format("atomic{}({}, {})", opname, Visit(operation[0]).GetCode(),
Visit(operation[1]).As(type)),
type};
Visit(operation[1]).AsUint()),
Type::Uint};
}
template <const std::string_view& opname, Type type>
Expression Reduce(Operation operation) {
code.AddLine("{};", Atomic<opname, type>(operation).GetCode());
return {};
}
Expression Branch(Operation operation) {
@@ -2479,6 +2485,20 @@ private:
&GLSLDecompiler::Atomic<Func::Or, Type::Int>,
&GLSLDecompiler::Atomic<Func::Xor, Type::Int>,
&GLSLDecompiler::Reduce<Func::Add, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Min, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Max, Type::Uint>,
&GLSLDecompiler::Reduce<Func::And, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Or, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Xor, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Add, Type::Int>,
&GLSLDecompiler::Reduce<Func::Min, Type::Int>,
&GLSLDecompiler::Reduce<Func::Max, Type::Int>,
&GLSLDecompiler::Reduce<Func::And, Type::Int>,
&GLSLDecompiler::Reduce<Func::Or, Type::Int>,
&GLSLDecompiler::Reduce<Func::Xor, Type::Int>,
&GLSLDecompiler::Branch,
&GLSLDecompiler::BranchIndirect,
&GLSLDecompiler::PushFlowStack,

View File

@@ -185,6 +185,12 @@ void SetupDirtyPointSize(Tables& tables) {
tables[0][OFF(point_sprite_enable)] = PointSize;
}
void SetupDirtyLineWidth(Tables& tables) {
tables[0][OFF(line_width_smooth)] = LineWidth;
tables[0][OFF(line_width_aliased)] = LineWidth;
tables[0][OFF(line_smooth_enable)] = LineWidth;
}
void SetupDirtyClipControl(Tables& tables) {
auto& table = tables[0];
table[OFF(screen_y_control)] = ClipControl;
@@ -233,6 +239,7 @@ void StateTracker::Initialize() {
SetupDirtyLogicOp(tables);
SetupDirtyFragmentClampColor(tables);
SetupDirtyPointSize(tables);
SetupDirtyLineWidth(tables);
SetupDirtyClipControl(tables);
SetupDirtyDepthClampEnabled(tables);
SetupDirtyMisc(tables);

View File

@@ -78,6 +78,7 @@ enum : u8 {
LogicOp,
FragmentClampColor,
PointSize,
LineWidth,
ClipControl,
DepthClampEnabled,

View File

@@ -24,7 +24,6 @@ using Tegra::Texture::SwizzleSource;
using VideoCore::MortonSwizzleMode;
using VideoCore::Surface::PixelFormat;
using VideoCore::Surface::SurfaceCompression;
using VideoCore::Surface::SurfaceTarget;
using VideoCore::Surface::SurfaceType;
@@ -37,102 +36,100 @@ namespace {
struct FormatTuple {
GLint internal_format;
GLenum format;
GLenum type;
bool compressed;
GLenum format = GL_NONE;
GLenum type = GL_NONE;
};
constexpr std::array<FormatTuple, VideoCore::Surface::MaxPixelFormat> tex_format_tuples = {{
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8_REV, false}, // ABGR8U
{GL_RGBA8_SNORM, GL_RGBA, GL_BYTE, false}, // ABGR8S
{GL_RGBA8UI, GL_RGBA_INTEGER, GL_UNSIGNED_BYTE, false}, // ABGR8UI
{GL_RGB565, GL_RGB, GL_UNSIGNED_SHORT_5_6_5_REV, false}, // B5G6R5U
{GL_RGB10_A2, GL_RGBA, GL_UNSIGNED_INT_2_10_10_10_REV, false}, // A2B10G10R10U
{GL_RGB5_A1, GL_RGBA, GL_UNSIGNED_SHORT_1_5_5_5_REV, false}, // A1B5G5R5U
{GL_R8, GL_RED, GL_UNSIGNED_BYTE, false}, // R8U
{GL_R8UI, GL_RED_INTEGER, GL_UNSIGNED_BYTE, false}, // R8UI
{GL_RGBA16F, GL_RGBA, GL_HALF_FLOAT, false}, // RGBA16F
{GL_RGBA16, GL_RGBA, GL_UNSIGNED_SHORT, false}, // RGBA16U
{GL_RGBA16_SNORM, GL_RGBA, GL_SHORT, false}, // RGBA16S
{GL_RGBA16UI, GL_RGBA_INTEGER, GL_UNSIGNED_SHORT, false}, // RGBA16UI
{GL_R11F_G11F_B10F, GL_RGB, GL_UNSIGNED_INT_10F_11F_11F_REV, false}, // R11FG11FB10F
{GL_RGBA32UI, GL_RGBA_INTEGER, GL_UNSIGNED_INT, false}, // RGBA32UI
{GL_COMPRESSED_RGBA_S3TC_DXT1_EXT, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // DXT1
{GL_COMPRESSED_RGBA_S3TC_DXT3_EXT, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // DXT23
{GL_COMPRESSED_RGBA_S3TC_DXT5_EXT, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // DXT45
{GL_COMPRESSED_RED_RGTC1, GL_RED, GL_UNSIGNED_INT_8_8_8_8, true}, // DXN1
{GL_COMPRESSED_RG_RGTC2, GL_RG, GL_UNSIGNED_INT_8_8_8_8, true}, // DXN2UNORM
{GL_COMPRESSED_SIGNED_RG_RGTC2, GL_RG, GL_INT, true}, // DXN2SNORM
{GL_COMPRESSED_RGBA_BPTC_UNORM, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // BC7U
{GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT, GL_RGB, GL_UNSIGNED_INT_8_8_8_8, true}, // BC6H_UF16
{GL_COMPRESSED_RGB_BPTC_SIGNED_FLOAT, GL_RGB, GL_UNSIGNED_INT_8_8_8_8, true}, // BC6H_SF16
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_4X4
{GL_RGBA8, GL_BGRA, GL_UNSIGNED_BYTE, false}, // BGRA8
{GL_RGBA32F, GL_RGBA, GL_FLOAT, false}, // RGBA32F
{GL_RG32F, GL_RG, GL_FLOAT, false}, // RG32F
{GL_R32F, GL_RED, GL_FLOAT, false}, // R32F
{GL_R16F, GL_RED, GL_HALF_FLOAT, false}, // R16F
{GL_R16, GL_RED, GL_UNSIGNED_SHORT, false}, // R16U
{GL_R16_SNORM, GL_RED, GL_SHORT, false}, // R16S
{GL_R16UI, GL_RED_INTEGER, GL_UNSIGNED_SHORT, false}, // R16UI
{GL_R16I, GL_RED_INTEGER, GL_SHORT, false}, // R16I
{GL_RG16, GL_RG, GL_UNSIGNED_SHORT, false}, // RG16
{GL_RG16F, GL_RG, GL_HALF_FLOAT, false}, // RG16F
{GL_RG16UI, GL_RG_INTEGER, GL_UNSIGNED_SHORT, false}, // RG16UI
{GL_RG16I, GL_RG_INTEGER, GL_SHORT, false}, // RG16I
{GL_RG16_SNORM, GL_RG, GL_SHORT, false}, // RG16S
{GL_RGB32F, GL_RGB, GL_FLOAT, false}, // RGB32F
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8_REV, false}, // RGBA8_SRGB
{GL_RG8, GL_RG, GL_UNSIGNED_BYTE, false}, // RG8U
{GL_RG8_SNORM, GL_RG, GL_BYTE, false}, // RG8S
{GL_RG32UI, GL_RG_INTEGER, GL_UNSIGNED_INT, false}, // RG32UI
{GL_RGB16F, GL_RGBA, GL_HALF_FLOAT, false}, // RGBX16F
{GL_R32UI, GL_RED_INTEGER, GL_UNSIGNED_INT, false}, // R32UI
{GL_R32I, GL_RED_INTEGER, GL_INT, false}, // R32I
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_8X8
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_8X5
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_5X4
{GL_SRGB8_ALPHA8, GL_BGRA, GL_UNSIGNED_BYTE, false}, // BGRA8
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8_REV}, // ABGR8U
{GL_RGBA8_SNORM, GL_RGBA, GL_BYTE}, // ABGR8S
{GL_RGBA8UI, GL_RGBA_INTEGER, GL_UNSIGNED_BYTE}, // ABGR8UI
{GL_RGB565, GL_RGB, GL_UNSIGNED_SHORT_5_6_5_REV}, // B5G6R5U
{GL_RGB10_A2, GL_RGBA, GL_UNSIGNED_INT_2_10_10_10_REV}, // A2B10G10R10U
{GL_RGB5_A1, GL_RGBA, GL_UNSIGNED_SHORT_1_5_5_5_REV}, // A1B5G5R5U
{GL_R8, GL_RED, GL_UNSIGNED_BYTE}, // R8U
{GL_R8UI, GL_RED_INTEGER, GL_UNSIGNED_BYTE}, // R8UI
{GL_RGBA16F, GL_RGBA, GL_HALF_FLOAT}, // RGBA16F
{GL_RGBA16, GL_RGBA, GL_UNSIGNED_SHORT}, // RGBA16U
{GL_RGBA16_SNORM, GL_RGBA, GL_SHORT}, // RGBA16S
{GL_RGBA16UI, GL_RGBA_INTEGER, GL_UNSIGNED_SHORT}, // RGBA16UI
{GL_R11F_G11F_B10F, GL_RGB, GL_UNSIGNED_INT_10F_11F_11F_REV}, // R11FG11FB10F
{GL_RGBA32UI, GL_RGBA_INTEGER, GL_UNSIGNED_INT}, // RGBA32UI
{GL_COMPRESSED_RGBA_S3TC_DXT1_EXT}, // DXT1
{GL_COMPRESSED_RGBA_S3TC_DXT3_EXT}, // DXT23
{GL_COMPRESSED_RGBA_S3TC_DXT5_EXT}, // DXT45
{GL_COMPRESSED_RED_RGTC1}, // DXN1
{GL_COMPRESSED_RG_RGTC2}, // DXN2UNORM
{GL_COMPRESSED_SIGNED_RG_RGTC2}, // DXN2SNORM
{GL_COMPRESSED_RGBA_BPTC_UNORM}, // BC7U
{GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT}, // BC6H_UF16
{GL_COMPRESSED_RGB_BPTC_SIGNED_FLOAT}, // BC6H_SF16
{GL_COMPRESSED_RGBA_ASTC_4x4_KHR}, // ASTC_2D_4X4
{GL_RGBA8, GL_BGRA, GL_UNSIGNED_BYTE}, // BGRA8
{GL_RGBA32F, GL_RGBA, GL_FLOAT}, // RGBA32F
{GL_RG32F, GL_RG, GL_FLOAT}, // RG32F
{GL_R32F, GL_RED, GL_FLOAT}, // R32F
{GL_R16F, GL_RED, GL_HALF_FLOAT}, // R16F
{GL_R16, GL_RED, GL_UNSIGNED_SHORT}, // R16U
{GL_R16_SNORM, GL_RED, GL_SHORT}, // R16S
{GL_R16UI, GL_RED_INTEGER, GL_UNSIGNED_SHORT}, // R16UI
{GL_R16I, GL_RED_INTEGER, GL_SHORT}, // R16I
{GL_RG16, GL_RG, GL_UNSIGNED_SHORT}, // RG16
{GL_RG16F, GL_RG, GL_HALF_FLOAT}, // RG16F
{GL_RG16UI, GL_RG_INTEGER, GL_UNSIGNED_SHORT}, // RG16UI
{GL_RG16I, GL_RG_INTEGER, GL_SHORT}, // RG16I
{GL_RG16_SNORM, GL_RG, GL_SHORT}, // RG16S
{GL_RGB32F, GL_RGB, GL_FLOAT}, // RGB32F
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8_REV}, // RGBA8_SRGB
{GL_RG8, GL_RG, GL_UNSIGNED_BYTE}, // RG8U
{GL_RG8_SNORM, GL_RG, GL_BYTE}, // RG8S
{GL_RG32UI, GL_RG_INTEGER, GL_UNSIGNED_INT}, // RG32UI
{GL_RGB16F, GL_RGBA, GL_HALF_FLOAT}, // RGBX16F
{GL_R32UI, GL_RED_INTEGER, GL_UNSIGNED_INT}, // R32UI
{GL_R32I, GL_RED_INTEGER, GL_INT}, // R32I
{GL_COMPRESSED_RGBA_ASTC_8x8_KHR}, // ASTC_2D_8X8
{GL_COMPRESSED_RGBA_ASTC_8x5_KHR}, // ASTC_2D_8X5
{GL_COMPRESSED_RGBA_ASTC_5x4_KHR}, // ASTC_2D_5X4
{GL_SRGB8_ALPHA8, GL_BGRA, GL_UNSIGNED_BYTE}, // BGRA8
// Compressed sRGB formats
{GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // DXT1_SRGB
{GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // DXT23_SRGB
{GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // DXT45_SRGB
{GL_COMPRESSED_SRGB_ALPHA_BPTC_UNORM, GL_RGBA, GL_UNSIGNED_INT_8_8_8_8, true}, // BC7U_SRGB
{GL_RGBA4, GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4_REV, false}, // R4G4B4A4U
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_4X4_SRGB
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_8X8_SRGB
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_8X5_SRGB
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_5X4_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_5X5
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_5X5_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_10X8
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_10X8_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_6X6
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_6X6_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_10X10
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_10X10_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_12X12
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_12X12_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_8X6
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_8X6_SRGB
{GL_RGBA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_6X5
{GL_SRGB8_ALPHA8, GL_RGBA, GL_UNSIGNED_BYTE, false}, // ASTC_2D_6X5_SRGB
{GL_RGB9_E5, GL_RGB, GL_UNSIGNED_INT_5_9_9_9_REV, false}, // E5B9G9R9F
{GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT1_EXT}, // DXT1_SRGB
{GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT3_EXT}, // DXT23_SRGB
{GL_COMPRESSED_SRGB_ALPHA_S3TC_DXT5_EXT}, // DXT45_SRGB
{GL_COMPRESSED_SRGB_ALPHA_BPTC_UNORM}, // BC7U_SRGB
{GL_RGBA4, GL_RGBA, GL_UNSIGNED_SHORT_4_4_4_4_REV}, // R4G4B4A4U
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR}, // ASTC_2D_4X4_SRGB
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR}, // ASTC_2D_8X8_SRGB
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR}, // ASTC_2D_8X5_SRGB
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR}, // ASTC_2D_5X4_SRGB
{GL_COMPRESSED_RGBA_ASTC_5x5_KHR}, // ASTC_2D_5X5
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR}, // ASTC_2D_5X5_SRGB
{GL_COMPRESSED_RGBA_ASTC_10x8_KHR}, // ASTC_2D_10X8
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR}, // ASTC_2D_10X8_SRGB
{GL_COMPRESSED_RGBA_ASTC_6x6_KHR}, // ASTC_2D_6X6
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR}, // ASTC_2D_6X6_SRGB
{GL_COMPRESSED_RGBA_ASTC_10x10_KHR}, // ASTC_2D_10X10
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR}, // ASTC_2D_10X10_SRGB
{GL_COMPRESSED_RGBA_ASTC_12x12_KHR}, // ASTC_2D_12X12
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR}, // ASTC_2D_12X12_SRGB
{GL_COMPRESSED_RGBA_ASTC_8x6_KHR}, // ASTC_2D_8X6
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR}, // ASTC_2D_8X6_SRGB
{GL_COMPRESSED_RGBA_ASTC_6x5_KHR}, // ASTC_2D_6X5
{GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR}, // ASTC_2D_6X5_SRGB
{GL_RGB9_E5, GL_RGB, GL_UNSIGNED_INT_5_9_9_9_REV}, // E5B9G9R9F
// Depth formats
{GL_DEPTH_COMPONENT32F, GL_DEPTH_COMPONENT, GL_FLOAT, false}, // Z32F
{GL_DEPTH_COMPONENT16, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT, false}, // Z16
{GL_DEPTH_COMPONENT32F, GL_DEPTH_COMPONENT, GL_FLOAT}, // Z32F
{GL_DEPTH_COMPONENT16, GL_DEPTH_COMPONENT, GL_UNSIGNED_SHORT}, // Z16
// DepthStencil formats
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, false}, // Z24S8
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8, false}, // S8Z24
{GL_DEPTH32F_STENCIL8, GL_DEPTH_STENCIL, GL_FLOAT_32_UNSIGNED_INT_24_8_REV, false}, // Z32FS8
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8}, // Z24S8
{GL_DEPTH24_STENCIL8, GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8}, // S8Z24
{GL_DEPTH32F_STENCIL8, GL_DEPTH_STENCIL, GL_FLOAT_32_UNSIGNED_INT_24_8_REV}, // Z32FS8
}};
const FormatTuple& GetFormatTuple(PixelFormat pixel_format) {
ASSERT(static_cast<std::size_t>(pixel_format) < tex_format_tuples.size());
const auto& format{tex_format_tuples[static_cast<std::size_t>(pixel_format)]};
return format;
return tex_format_tuples[static_cast<std::size_t>(pixel_format)];
}
GLenum GetTextureTarget(const SurfaceTarget& target) {
@@ -242,13 +239,20 @@ OGLTexture CreateTexture(const SurfaceParams& params, GLenum target, GLenum inte
} // Anonymous namespace
CachedSurface::CachedSurface(const GPUVAddr gpu_addr, const SurfaceParams& params)
: VideoCommon::SurfaceBase<View>(gpu_addr, params) {
const auto& tuple{GetFormatTuple(params.pixel_format)};
internal_format = tuple.internal_format;
format = tuple.format;
type = tuple.type;
is_compressed = tuple.compressed;
CachedSurface::CachedSurface(const GPUVAddr gpu_addr, const SurfaceParams& params,
bool is_astc_supported)
: VideoCommon::SurfaceBase<View>(gpu_addr, params, is_astc_supported) {
if (is_converted) {
internal_format = params.srgb_conversion ? GL_SRGB8_ALPHA8 : GL_RGBA8;
format = GL_RGBA;
type = GL_UNSIGNED_BYTE;
} else {
const auto& tuple{GetFormatTuple(params.pixel_format)};
internal_format = tuple.internal_format;
format = tuple.format;
type = tuple.type;
is_compressed = params.IsCompressed();
}
target = GetTextureTarget(params.target);
texture = CreateTexture(params, target, internal_format, texture_buffer);
DecorateSurfaceName();
@@ -264,7 +268,7 @@ void CachedSurface::DownloadTexture(std::vector<u8>& staging_buffer) {
if (params.IsBuffer()) {
glGetNamedBufferSubData(texture_buffer.handle, 0,
static_cast<GLsizeiptr>(params.GetHostSizeInBytes()),
static_cast<GLsizeiptr>(params.GetHostSizeInBytes(false)),
staging_buffer.data());
return;
}
@@ -272,9 +276,10 @@ void CachedSurface::DownloadTexture(std::vector<u8>& staging_buffer) {
SCOPE_EXIT({ glPixelStorei(GL_PACK_ROW_LENGTH, 0); });
for (u32 level = 0; level < params.emulated_levels; ++level) {
glPixelStorei(GL_PACK_ALIGNMENT, std::min(8U, params.GetRowAlignment(level)));
glPixelStorei(GL_PACK_ALIGNMENT, std::min(8U, params.GetRowAlignment(level, is_converted)));
glPixelStorei(GL_PACK_ROW_LENGTH, static_cast<GLint>(params.GetMipWidth(level)));
const std::size_t mip_offset = params.GetHostMipmapLevelOffset(level);
const std::size_t mip_offset = params.GetHostMipmapLevelOffset(level, is_converted);
u8* const mip_data = staging_buffer.data() + mip_offset;
const GLsizei size = static_cast<GLsizei>(params.GetHostMipmapSize(level));
if (is_compressed) {
@@ -294,14 +299,10 @@ void CachedSurface::UploadTexture(const std::vector<u8>& staging_buffer) {
}
void CachedSurface::UploadTextureMipmap(u32 level, const std::vector<u8>& staging_buffer) {
glPixelStorei(GL_UNPACK_ALIGNMENT, std::min(8U, params.GetRowAlignment(level)));
glPixelStorei(GL_UNPACK_ALIGNMENT, std::min(8U, params.GetRowAlignment(level, is_converted)));
glPixelStorei(GL_UNPACK_ROW_LENGTH, static_cast<GLint>(params.GetMipWidth(level)));
auto compression_type = params.GetCompressionType();
const std::size_t mip_offset = compression_type == SurfaceCompression::Converted
? params.GetConvertedMipmapOffset(level)
: params.GetHostMipmapLevelOffset(level);
const std::size_t mip_offset = params.GetHostMipmapLevelOffset(level, is_converted);
const u8* buffer{staging_buffer.data() + mip_offset};
if (is_compressed) {
const auto image_size{static_cast<GLsizei>(params.GetHostMipmapSize(level))};
@@ -410,14 +411,13 @@ CachedSurfaceView::~CachedSurfaceView() = default;
void CachedSurfaceView::Attach(GLenum attachment, GLenum target) const {
ASSERT(params.num_levels == 1);
const GLuint texture = surface.GetTexture();
if (params.num_layers > 1) {
// Layered framebuffer attachments
UNIMPLEMENTED_IF(params.base_layer != 0);
switch (params.target) {
case SurfaceTarget::Texture2DArray:
glFramebufferTexture(target, attachment, texture, params.base_level);
glFramebufferTexture(target, attachment, GetTexture(), 0);
break;
default:
UNIMPLEMENTED();
@@ -426,6 +426,7 @@ void CachedSurfaceView::Attach(GLenum attachment, GLenum target) const {
}
const GLenum view_target = surface.GetTarget();
const GLuint texture = surface.GetTexture();
switch (surface.GetSurfaceParams().target) {
case SurfaceTarget::Texture1D:
glFramebufferTexture1D(target, attachment, view_target, texture, params.base_level);
@@ -482,7 +483,7 @@ OGLTextureView CachedSurfaceView::CreateTextureView() const {
TextureCacheOpenGL::TextureCacheOpenGL(Core::System& system,
VideoCore::RasterizerInterface& rasterizer,
const Device& device, StateTracker& state_tracker)
: TextureCacheBase{system, rasterizer}, state_tracker{state_tracker} {
: TextureCacheBase{system, rasterizer, device.HasASTC()}, state_tracker{state_tracker} {
src_framebuffer.Create();
dst_framebuffer.Create();
}
@@ -490,7 +491,7 @@ TextureCacheOpenGL::TextureCacheOpenGL(Core::System& system,
TextureCacheOpenGL::~TextureCacheOpenGL() = default;
Surface TextureCacheOpenGL::CreateSurface(GPUVAddr gpu_addr, const SurfaceParams& params) {
return std::make_shared<CachedSurface>(gpu_addr, params);
return std::make_shared<CachedSurface>(gpu_addr, params, is_astc_supported);
}
void TextureCacheOpenGL::ImageCopy(Surface& src_surface, Surface& dst_surface,
@@ -596,7 +597,7 @@ void TextureCacheOpenGL::BufferCopy(Surface& src_surface, Surface& dst_surface)
glBindBuffer(GL_PIXEL_PACK_BUFFER, copy_pbo_handle);
if (source_format.compressed) {
if (src_surface->IsCompressed()) {
glGetCompressedTextureImage(src_surface->GetTexture(), 0, static_cast<GLsizei>(source_size),
nullptr);
} else {
@@ -610,7 +611,7 @@ void TextureCacheOpenGL::BufferCopy(Surface& src_surface, Surface& dst_surface)
const GLsizei width = static_cast<GLsizei>(dst_params.width);
const GLsizei height = static_cast<GLsizei>(dst_params.height);
const GLsizei depth = static_cast<GLsizei>(dst_params.depth);
if (dest_format.compressed) {
if (dst_surface->IsCompressed()) {
LOG_CRITICAL(HW_GPU, "Compressed buffer copy is unimplemented!");
UNREACHABLE();
} else {

View File

@@ -37,7 +37,7 @@ class CachedSurface final : public VideoCommon::SurfaceBase<View> {
friend CachedSurfaceView;
public:
explicit CachedSurface(GPUVAddr gpu_addr, const SurfaceParams& params);
explicit CachedSurface(GPUVAddr gpu_addr, const SurfaceParams& params, bool is_astc_supported);
~CachedSurface();
void UploadTexture(const std::vector<u8>& staging_buffer) override;
@@ -51,6 +51,10 @@ public:
return texture.handle;
}
bool IsCompressed() const {
return is_compressed;
}
protected:
void DecorateSurfaceName() override;

View File

@@ -30,8 +30,6 @@ namespace OpenGL {
namespace {
// If the size of this is too small, it ends up creating a soft cap on FPS as the renderer will have
// to wait on available presentation frames.
constexpr std::size_t SWAP_CHAIN_SIZE = 3;
struct Frame {
@@ -214,7 +212,7 @@ public:
std::deque<Frame*> present_queue;
Frame* previous_frame{};
FrameMailbox() : has_debug_tool{HasDebugTool()} {
FrameMailbox() {
for (auto& frame : swap_chain) {
free_queue.push(&frame);
}
@@ -285,13 +283,9 @@ public:
std::unique_lock lock{swap_chain_lock};
present_queue.push_front(frame);
present_cv.notify_one();
DebugNotifyNextFrame();
}
Frame* TryGetPresentFrame(int timeout_ms) {
DebugWaitForNextFrame();
std::unique_lock lock{swap_chain_lock};
// wait for new entries in the present_queue
present_cv.wait_for(lock, std::chrono::milliseconds(timeout_ms),
@@ -317,38 +311,12 @@ public:
previous_frame = frame;
return frame;
}
private:
std::mutex debug_synch_mutex;
std::condition_variable debug_synch_condition;
std::atomic_int frame_for_debug{};
const bool has_debug_tool; // When true, using a GPU debugger, so keep frames in lock-step
/// Signal that a new frame is available (called from GPU thread)
void DebugNotifyNextFrame() {
if (!has_debug_tool) {
return;
}
frame_for_debug++;
std::lock_guard lock{debug_synch_mutex};
debug_synch_condition.notify_one();
}
/// Wait for a new frame to be available (called from presentation thread)
void DebugWaitForNextFrame() {
if (!has_debug_tool) {
return;
}
const int last_frame = frame_for_debug;
std::unique_lock lock{debug_synch_mutex};
debug_synch_condition.wait(lock,
[this, last_frame] { return frame_for_debug > last_frame; });
}
};
RendererOpenGL::RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system)
: VideoCore::RendererBase{emu_window}, emu_window{emu_window}, system{system},
frame_mailbox{std::make_unique<FrameMailbox>()} {}
RendererOpenGL::RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system,
Core::Frontend::GraphicsContext& context)
: RendererBase{emu_window}, emu_window{emu_window}, system{system}, context{context},
has_debug_tool{HasDebugTool()} {}
RendererOpenGL::~RendererOpenGL() = default;
@@ -356,8 +324,6 @@ MICROPROFILE_DEFINE(OpenGL_RenderFrame, "OpenGL", "Render Frame", MP_RGB(128, 12
MICROPROFILE_DEFINE(OpenGL_WaitPresent, "OpenGL", "Wait For Present", MP_RGB(128, 128, 128));
void RendererOpenGL::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
render_window.PollEvents();
if (!framebuffer) {
return;
}
@@ -413,6 +379,13 @@ void RendererOpenGL::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
m_current_frame++;
rasterizer->TickFrame();
}
render_window.PollEvents();
if (has_debug_tool) {
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
Present(0);
context.SwapBuffers();
}
}
void RendererOpenGL::PrepareRendertarget(const Tegra::FramebufferConfig* framebuffer) {
@@ -480,6 +453,8 @@ void RendererOpenGL::LoadColorToActiveGLTexture(u8 color_r, u8 color_g, u8 color
}
void RendererOpenGL::InitOpenGLObjects() {
frame_mailbox = std::make_unique<FrameMailbox>();
glClearColor(Settings::values.bg_red, Settings::values.bg_green, Settings::values.bg_blue,
0.0f);
@@ -692,12 +667,21 @@ void RendererOpenGL::DrawScreen(const Layout::FramebufferLayout& layout) {
glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);
}
void RendererOpenGL::TryPresent(int timeout_ms) {
bool RendererOpenGL::TryPresent(int timeout_ms) {
if (has_debug_tool) {
LOG_DEBUG(Render_OpenGL,
"Skipping presentation because we are presenting on the main context");
return false;
}
return Present(timeout_ms);
}
bool RendererOpenGL::Present(int timeout_ms) {
const auto& layout = render_window.GetFramebufferLayout();
auto frame = frame_mailbox->TryGetPresentFrame(timeout_ms);
if (!frame) {
LOG_DEBUG(Render_OpenGL, "TryGetPresentFrame returned no frame to present");
return;
return false;
}
// Clearing before a full overwrite of a fbo can signal to drivers that they can avoid a
@@ -725,6 +709,7 @@ void RendererOpenGL::TryPresent(int timeout_ms) {
glFlush();
glBindFramebuffer(GL_READ_FRAMEBUFFER, 0);
return true;
}
void RendererOpenGL::RenderScreenshot() {

View File

@@ -55,13 +55,14 @@ class FrameMailbox;
class RendererOpenGL final : public VideoCore::RendererBase {
public:
explicit RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system);
explicit RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system,
Core::Frontend::GraphicsContext& context);
~RendererOpenGL() override;
bool Init() override;
void ShutDown() override;
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
void TryPresent(int timeout_ms) override;
bool TryPresent(int timeout_ms) override;
private:
/// Initializes the OpenGL state and creates persistent objects.
@@ -89,8 +90,11 @@ private:
void PrepareRendertarget(const Tegra::FramebufferConfig* framebuffer);
bool Present(int timeout_ms);
Core::Frontend::EmuWindow& emu_window;
Core::System& system;
Core::Frontend::GraphicsContext& context;
StateTracker state_tracker{system};
@@ -115,6 +119,8 @@ private:
/// Frame presentation mailbox
std::unique_ptr<FrameMailbox> frame_mailbox;
bool has_debug_tool = false;
};
} // namespace OpenGL

View File

@@ -14,68 +14,6 @@
namespace OpenGL {
struct VertexArrayPushBuffer::Entry {
GLuint binding_index{};
const GLuint* buffer{};
GLintptr offset{};
GLsizei stride{};
};
VertexArrayPushBuffer::VertexArrayPushBuffer(StateTracker& state_tracker)
: state_tracker{state_tracker} {}
VertexArrayPushBuffer::~VertexArrayPushBuffer() = default;
void VertexArrayPushBuffer::Setup() {
index_buffer = nullptr;
vertex_buffers.clear();
}
void VertexArrayPushBuffer::SetIndexBuffer(const GLuint* buffer) {
index_buffer = buffer;
}
void VertexArrayPushBuffer::SetVertexBuffer(GLuint binding_index, const GLuint* buffer,
GLintptr offset, GLsizei stride) {
vertex_buffers.push_back(Entry{binding_index, buffer, offset, stride});
}
void VertexArrayPushBuffer::Bind() {
if (index_buffer) {
state_tracker.BindIndexBuffer(*index_buffer);
}
for (const auto& entry : vertex_buffers) {
glBindVertexBuffer(entry.binding_index, *entry.buffer, entry.offset, entry.stride);
}
}
struct BindBuffersRangePushBuffer::Entry {
GLuint binding;
const GLuint* buffer;
GLintptr offset;
GLsizeiptr size;
};
BindBuffersRangePushBuffer::BindBuffersRangePushBuffer(GLenum target) : target{target} {}
BindBuffersRangePushBuffer::~BindBuffersRangePushBuffer() = default;
void BindBuffersRangePushBuffer::Setup() {
entries.clear();
}
void BindBuffersRangePushBuffer::Push(GLuint binding, const GLuint* buffer, GLintptr offset,
GLsizeiptr size) {
entries.push_back(Entry{binding, buffer, offset, size});
}
void BindBuffersRangePushBuffer::Bind() {
for (const Entry& entry : entries) {
glBindBufferRange(target, entry.binding, *entry.buffer, entry.offset, entry.size);
}
}
void LabelGLObject(GLenum identifier, GLuint handle, VAddr addr, std::string_view extra_info) {
if (!GLAD_GL_KHR_debug) {
// We don't need to throw an error as this is just for debugging

View File

@@ -11,49 +11,6 @@
namespace OpenGL {
class StateTracker;
class VertexArrayPushBuffer final {
public:
explicit VertexArrayPushBuffer(StateTracker& state_tracker);
~VertexArrayPushBuffer();
void Setup();
void SetIndexBuffer(const GLuint* buffer);
void SetVertexBuffer(GLuint binding_index, const GLuint* buffer, GLintptr offset,
GLsizei stride);
void Bind();
private:
struct Entry;
StateTracker& state_tracker;
const GLuint* index_buffer{};
std::vector<Entry> vertex_buffers;
};
class BindBuffersRangePushBuffer final {
public:
explicit BindBuffersRangePushBuffer(GLenum target);
~BindBuffersRangePushBuffer();
void Setup();
void Push(GLuint binding, const GLuint* buffer, GLintptr offset, GLsizeiptr size);
void Bind();
private:
struct Entry;
GLenum target;
std::vector<Entry> entries;
};
void LabelGLObject(GLenum identifier, GLuint handle, VAddr addr, std::string_view extra_info = {});
} // namespace OpenGL

View File

@@ -1,58 +0,0 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
namespace vk {
class DispatchLoaderDynamic;
}
namespace Vulkan {
constexpr vk::DispatchLoaderDynamic* dont_use_me_dld = nullptr;
}
#define VULKAN_HPP_DEFAULT_DISPATCHER (*::Vulkan::dont_use_me_dld)
#define VULKAN_HPP_ENABLE_DYNAMIC_LOADER_TOOL 0
#define VULKAN_HPP_DISPATCH_LOADER_DYNAMIC 1
#include <vulkan/vulkan.hpp>
namespace Vulkan {
// vulkan.hpp unique handlers use DispatchLoaderStatic
template <typename T>
using UniqueHandle = vk::UniqueHandle<T, vk::DispatchLoaderDynamic>;
using UniqueAccelerationStructureNV = UniqueHandle<vk::AccelerationStructureNV>;
using UniqueBuffer = UniqueHandle<vk::Buffer>;
using UniqueBufferView = UniqueHandle<vk::BufferView>;
using UniqueCommandBuffer = UniqueHandle<vk::CommandBuffer>;
using UniqueCommandPool = UniqueHandle<vk::CommandPool>;
using UniqueDescriptorPool = UniqueHandle<vk::DescriptorPool>;
using UniqueDescriptorSet = UniqueHandle<vk::DescriptorSet>;
using UniqueDescriptorSetLayout = UniqueHandle<vk::DescriptorSetLayout>;
using UniqueDescriptorUpdateTemplate = UniqueHandle<vk::DescriptorUpdateTemplate>;
using UniqueDevice = UniqueHandle<vk::Device>;
using UniqueDeviceMemory = UniqueHandle<vk::DeviceMemory>;
using UniqueEvent = UniqueHandle<vk::Event>;
using UniqueFence = UniqueHandle<vk::Fence>;
using UniqueFramebuffer = UniqueHandle<vk::Framebuffer>;
using UniqueImage = UniqueHandle<vk::Image>;
using UniqueImageView = UniqueHandle<vk::ImageView>;
using UniqueIndirectCommandsLayoutNVX = UniqueHandle<vk::IndirectCommandsLayoutNVX>;
using UniqueObjectTableNVX = UniqueHandle<vk::ObjectTableNVX>;
using UniquePipeline = UniqueHandle<vk::Pipeline>;
using UniquePipelineCache = UniqueHandle<vk::PipelineCache>;
using UniquePipelineLayout = UniqueHandle<vk::PipelineLayout>;
using UniqueQueryPool = UniqueHandle<vk::QueryPool>;
using UniqueRenderPass = UniqueHandle<vk::RenderPass>;
using UniqueSampler = UniqueHandle<vk::Sampler>;
using UniqueSamplerYcbcrConversion = UniqueHandle<vk::SamplerYcbcrConversion>;
using UniqueSemaphore = UniqueHandle<vk::Semaphore>;
using UniqueShaderModule = UniqueHandle<vk::ShaderModule>;
using UniqueSwapchainKHR = UniqueHandle<vk::SwapchainKHR>;
using UniqueValidationCacheEXT = UniqueHandle<vk::ValidationCacheEXT>;
using UniqueDebugReportCallbackEXT = UniqueHandle<vk::DebugReportCallbackEXT>;
using UniqueDebugUtilsMessengerEXT = UniqueHandle<vk::DebugUtilsMessengerEXT>;
} // namespace Vulkan

View File

@@ -2,13 +2,15 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <iterator>
#include "common/assert.h"
#include "common/common_types.h"
#include "common/logging/log.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
namespace Vulkan::MaxwellToVK {
@@ -17,88 +19,89 @@ using Maxwell = Tegra::Engines::Maxwell3D::Regs;
namespace Sampler {
vk::Filter Filter(Tegra::Texture::TextureFilter filter) {
VkFilter Filter(Tegra::Texture::TextureFilter filter) {
switch (filter) {
case Tegra::Texture::TextureFilter::Linear:
return vk::Filter::eLinear;
return VK_FILTER_LINEAR;
case Tegra::Texture::TextureFilter::Nearest:
return vk::Filter::eNearest;
return VK_FILTER_NEAREST;
}
UNIMPLEMENTED_MSG("Unimplemented sampler filter={}", static_cast<u32>(filter));
return {};
}
vk::SamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter) {
VkSamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter) {
switch (mipmap_filter) {
case Tegra::Texture::TextureMipmapFilter::None:
// TODO(Rodrigo): None seems to be mapped to OpenGL's mag and min filters without mipmapping
// (e.g. GL_NEAREST and GL_LINEAR). Vulkan doesn't have such a thing, find out if we have to
// use an image view with a single mipmap level to emulate this.
return vk::SamplerMipmapMode::eLinear;
return VK_SAMPLER_MIPMAP_MODE_LINEAR;
;
case Tegra::Texture::TextureMipmapFilter::Linear:
return vk::SamplerMipmapMode::eLinear;
return VK_SAMPLER_MIPMAP_MODE_LINEAR;
case Tegra::Texture::TextureMipmapFilter::Nearest:
return vk::SamplerMipmapMode::eNearest;
return VK_SAMPLER_MIPMAP_MODE_NEAREST;
}
UNIMPLEMENTED_MSG("Unimplemented sampler mipmap mode={}", static_cast<u32>(mipmap_filter));
return {};
}
vk::SamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter) {
VkSamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter) {
switch (wrap_mode) {
case Tegra::Texture::WrapMode::Wrap:
return vk::SamplerAddressMode::eRepeat;
return VK_SAMPLER_ADDRESS_MODE_REPEAT;
case Tegra::Texture::WrapMode::Mirror:
return vk::SamplerAddressMode::eMirroredRepeat;
return VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT;
case Tegra::Texture::WrapMode::ClampToEdge:
return vk::SamplerAddressMode::eClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
case Tegra::Texture::WrapMode::Border:
return vk::SamplerAddressMode::eClampToBorder;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
case Tegra::Texture::WrapMode::Clamp:
if (device.GetDriverID() == vk::DriverIdKHR::eNvidiaProprietary) {
if (device.GetDriverID() == VK_DRIVER_ID_NVIDIA_PROPRIETARY_KHR) {
// Nvidia's Vulkan driver defaults to GL_CLAMP on invalid enumerations, we can hack this
// by sending an invalid enumeration.
return static_cast<vk::SamplerAddressMode>(0xcafe);
return static_cast<VkSamplerAddressMode>(0xcafe);
}
// TODO(Rodrigo): Emulate GL_CLAMP properly on other vendors
switch (filter) {
case Tegra::Texture::TextureFilter::Nearest:
return vk::SamplerAddressMode::eClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
case Tegra::Texture::TextureFilter::Linear:
return vk::SamplerAddressMode::eClampToBorder;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
}
UNREACHABLE();
return vk::SamplerAddressMode::eClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
case Tegra::Texture::WrapMode::MirrorOnceClampToEdge:
return vk::SamplerAddressMode::eMirrorClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;
case Tegra::Texture::WrapMode::MirrorOnceBorder:
UNIMPLEMENTED();
return vk::SamplerAddressMode::eMirrorClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;
default:
UNIMPLEMENTED_MSG("Unimplemented wrap mode={}", static_cast<u32>(wrap_mode));
return {};
}
}
vk::CompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func) {
VkCompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func) {
switch (depth_compare_func) {
case Tegra::Texture::DepthCompareFunc::Never:
return vk::CompareOp::eNever;
return VK_COMPARE_OP_NEVER;
case Tegra::Texture::DepthCompareFunc::Less:
return vk::CompareOp::eLess;
return VK_COMPARE_OP_LESS;
case Tegra::Texture::DepthCompareFunc::LessEqual:
return vk::CompareOp::eLessOrEqual;
return VK_COMPARE_OP_LESS_OR_EQUAL;
case Tegra::Texture::DepthCompareFunc::Equal:
return vk::CompareOp::eEqual;
return VK_COMPARE_OP_EQUAL;
case Tegra::Texture::DepthCompareFunc::NotEqual:
return vk::CompareOp::eNotEqual;
return VK_COMPARE_OP_NOT_EQUAL;
case Tegra::Texture::DepthCompareFunc::Greater:
return vk::CompareOp::eGreater;
return VK_COMPARE_OP_GREATER;
case Tegra::Texture::DepthCompareFunc::GreaterEqual:
return vk::CompareOp::eGreaterOrEqual;
return VK_COMPARE_OP_GREATER_OR_EQUAL;
case Tegra::Texture::DepthCompareFunc::Always:
return vk::CompareOp::eAlways;
return VK_COMPARE_OP_ALWAYS;
}
UNIMPLEMENTED_MSG("Unimplemented sampler depth compare function={}",
static_cast<u32>(depth_compare_func));
@@ -112,92 +115,92 @@ namespace {
enum : u32 { Attachable = 1, Storage = 2 };
struct FormatTuple {
vk::Format format; ///< Vulkan format
int usage; ///< Describes image format usage
VkFormat format; ///< Vulkan format
int usage = 0; ///< Describes image format usage
} constexpr tex_format_tuples[] = {
{vk::Format::eA8B8G8R8UnormPack32, Attachable | Storage}, // ABGR8U
{vk::Format::eA8B8G8R8SnormPack32, Attachable | Storage}, // ABGR8S
{vk::Format::eA8B8G8R8UintPack32, Attachable | Storage}, // ABGR8UI
{vk::Format::eB5G6R5UnormPack16, {}}, // B5G6R5U
{vk::Format::eA2B10G10R10UnormPack32, Attachable | Storage}, // A2B10G10R10U
{vk::Format::eA1R5G5B5UnormPack16, Attachable}, // A1B5G5R5U (flipped with swizzle)
{vk::Format::eR8Unorm, Attachable | Storage}, // R8U
{vk::Format::eR8Uint, Attachable | Storage}, // R8UI
{vk::Format::eR16G16B16A16Sfloat, Attachable | Storage}, // RGBA16F
{vk::Format::eR16G16B16A16Unorm, Attachable | Storage}, // RGBA16U
{vk::Format::eR16G16B16A16Snorm, Attachable | Storage}, // RGBA16S
{vk::Format::eR16G16B16A16Uint, Attachable | Storage}, // RGBA16UI
{vk::Format::eB10G11R11UfloatPack32, Attachable | Storage}, // R11FG11FB10F
{vk::Format::eR32G32B32A32Uint, Attachable | Storage}, // RGBA32UI
{vk::Format::eBc1RgbaUnormBlock, {}}, // DXT1
{vk::Format::eBc2UnormBlock, {}}, // DXT23
{vk::Format::eBc3UnormBlock, {}}, // DXT45
{vk::Format::eBc4UnormBlock, {}}, // DXN1
{vk::Format::eBc5UnormBlock, {}}, // DXN2UNORM
{vk::Format::eBc5SnormBlock, {}}, // DXN2SNORM
{vk::Format::eBc7UnormBlock, {}}, // BC7U
{vk::Format::eBc6HUfloatBlock, {}}, // BC6H_UF16
{vk::Format::eBc6HSfloatBlock, {}}, // BC6H_SF16
{vk::Format::eAstc4x4UnormBlock, {}}, // ASTC_2D_4X4
{vk::Format::eB8G8R8A8Unorm, {}}, // BGRA8
{vk::Format::eR32G32B32A32Sfloat, Attachable | Storage}, // RGBA32F
{vk::Format::eR32G32Sfloat, Attachable | Storage}, // RG32F
{vk::Format::eR32Sfloat, Attachable | Storage}, // R32F
{vk::Format::eR16Sfloat, Attachable | Storage}, // R16F
{vk::Format::eR16Unorm, Attachable | Storage}, // R16U
{vk::Format::eUndefined, {}}, // R16S
{vk::Format::eUndefined, {}}, // R16UI
{vk::Format::eUndefined, {}}, // R16I
{vk::Format::eR16G16Unorm, Attachable | Storage}, // RG16
{vk::Format::eR16G16Sfloat, Attachable | Storage}, // RG16F
{vk::Format::eUndefined, {}}, // RG16UI
{vk::Format::eUndefined, {}}, // RG16I
{vk::Format::eR16G16Snorm, Attachable | Storage}, // RG16S
{vk::Format::eUndefined, {}}, // RGB32F
{vk::Format::eR8G8B8A8Srgb, Attachable}, // RGBA8_SRGB
{vk::Format::eR8G8Unorm, Attachable | Storage}, // RG8U
{vk::Format::eR8G8Snorm, Attachable | Storage}, // RG8S
{vk::Format::eR32G32Uint, Attachable | Storage}, // RG32UI
{vk::Format::eUndefined, {}}, // RGBX16F
{vk::Format::eR32Uint, Attachable | Storage}, // R32UI
{vk::Format::eR32Sint, Attachable | Storage}, // R32I
{vk::Format::eAstc8x8UnormBlock, {}}, // ASTC_2D_8X8
{vk::Format::eUndefined, {}}, // ASTC_2D_8X5
{vk::Format::eUndefined, {}}, // ASTC_2D_5X4
{vk::Format::eUndefined, {}}, // BGRA8_SRGB
{vk::Format::eBc1RgbaSrgbBlock, {}}, // DXT1_SRGB
{vk::Format::eBc2SrgbBlock, {}}, // DXT23_SRGB
{vk::Format::eBc3SrgbBlock, {}}, // DXT45_SRGB
{vk::Format::eBc7SrgbBlock, {}}, // BC7U_SRGB
{vk::Format::eR4G4B4A4UnormPack16, Attachable}, // R4G4B4A4U
{vk::Format::eAstc4x4SrgbBlock, {}}, // ASTC_2D_4X4_SRGB
{vk::Format::eAstc8x8SrgbBlock, {}}, // ASTC_2D_8X8_SRGB
{vk::Format::eAstc8x5SrgbBlock, {}}, // ASTC_2D_8X5_SRGB
{vk::Format::eAstc5x4SrgbBlock, {}}, // ASTC_2D_5X4_SRGB
{vk::Format::eAstc5x5UnormBlock, {}}, // ASTC_2D_5X5
{vk::Format::eAstc5x5SrgbBlock, {}}, // ASTC_2D_5X5_SRGB
{vk::Format::eAstc10x8UnormBlock, {}}, // ASTC_2D_10X8
{vk::Format::eAstc10x8SrgbBlock, {}}, // ASTC_2D_10X8_SRGB
{vk::Format::eAstc6x6UnormBlock, {}}, // ASTC_2D_6X6
{vk::Format::eAstc6x6SrgbBlock, {}}, // ASTC_2D_6X6_SRGB
{vk::Format::eAstc10x10UnormBlock, {}}, // ASTC_2D_10X10
{vk::Format::eAstc10x10SrgbBlock, {}}, // ASTC_2D_10X10_SRGB
{vk::Format::eAstc12x12UnormBlock, {}}, // ASTC_2D_12X12
{vk::Format::eAstc12x12SrgbBlock, {}}, // ASTC_2D_12X12_SRGB
{vk::Format::eAstc8x6UnormBlock, {}}, // ASTC_2D_8X6
{vk::Format::eAstc8x6SrgbBlock, {}}, // ASTC_2D_8X6_SRGB
{vk::Format::eAstc6x5UnormBlock, {}}, // ASTC_2D_6X5
{vk::Format::eAstc6x5SrgbBlock, {}}, // ASTC_2D_6X5_SRGB
{vk::Format::eE5B9G9R9UfloatPack32, {}}, // E5B9G9R9F
{VK_FORMAT_A8B8G8R8_UNORM_PACK32, Attachable | Storage}, // ABGR8U
{VK_FORMAT_A8B8G8R8_SNORM_PACK32, Attachable | Storage}, // ABGR8S
{VK_FORMAT_A8B8G8R8_UINT_PACK32, Attachable | Storage}, // ABGR8UI
{VK_FORMAT_B5G6R5_UNORM_PACK16}, // B5G6R5U
{VK_FORMAT_A2B10G10R10_UNORM_PACK32, Attachable | Storage}, // A2B10G10R10U
{VK_FORMAT_A1R5G5B5_UNORM_PACK16, Attachable}, // A1B5G5R5U (flipped with swizzle)
{VK_FORMAT_R8_UNORM, Attachable | Storage}, // R8U
{VK_FORMAT_R8_UINT, Attachable | Storage}, // R8UI
{VK_FORMAT_R16G16B16A16_SFLOAT, Attachable | Storage}, // RGBA16F
{VK_FORMAT_R16G16B16A16_UNORM, Attachable | Storage}, // RGBA16U
{VK_FORMAT_R16G16B16A16_SNORM, Attachable | Storage}, // RGBA16S
{VK_FORMAT_R16G16B16A16_UINT, Attachable | Storage}, // RGBA16UI
{VK_FORMAT_B10G11R11_UFLOAT_PACK32, Attachable | Storage}, // R11FG11FB10F
{VK_FORMAT_R32G32B32A32_UINT, Attachable | Storage}, // RGBA32UI
{VK_FORMAT_BC1_RGBA_UNORM_BLOCK}, // DXT1
{VK_FORMAT_BC2_UNORM_BLOCK}, // DXT23
{VK_FORMAT_BC3_UNORM_BLOCK}, // DXT45
{VK_FORMAT_BC4_UNORM_BLOCK}, // DXN1
{VK_FORMAT_BC5_UNORM_BLOCK}, // DXN2UNORM
{VK_FORMAT_BC5_SNORM_BLOCK}, // DXN2SNORM
{VK_FORMAT_BC7_UNORM_BLOCK}, // BC7U
{VK_FORMAT_BC6H_UFLOAT_BLOCK}, // BC6H_UF16
{VK_FORMAT_BC6H_SFLOAT_BLOCK}, // BC6H_SF16
{VK_FORMAT_ASTC_4x4_UNORM_BLOCK}, // ASTC_2D_4X4
{VK_FORMAT_B8G8R8A8_UNORM}, // BGRA8
{VK_FORMAT_R32G32B32A32_SFLOAT, Attachable | Storage}, // RGBA32F
{VK_FORMAT_R32G32_SFLOAT, Attachable | Storage}, // RG32F
{VK_FORMAT_R32_SFLOAT, Attachable | Storage}, // R32F
{VK_FORMAT_R16_SFLOAT, Attachable | Storage}, // R16F
{VK_FORMAT_R16_UNORM, Attachable | Storage}, // R16U
{VK_FORMAT_UNDEFINED}, // R16S
{VK_FORMAT_UNDEFINED}, // R16UI
{VK_FORMAT_UNDEFINED}, // R16I
{VK_FORMAT_R16G16_UNORM, Attachable | Storage}, // RG16
{VK_FORMAT_R16G16_SFLOAT, Attachable | Storage}, // RG16F
{VK_FORMAT_UNDEFINED}, // RG16UI
{VK_FORMAT_UNDEFINED}, // RG16I
{VK_FORMAT_R16G16_SNORM, Attachable | Storage}, // RG16S
{VK_FORMAT_UNDEFINED}, // RGB32F
{VK_FORMAT_R8G8B8A8_SRGB, Attachable}, // RGBA8_SRGB
{VK_FORMAT_R8G8_UNORM, Attachable | Storage}, // RG8U
{VK_FORMAT_R8G8_SNORM, Attachable | Storage}, // RG8S
{VK_FORMAT_R32G32_UINT, Attachable | Storage}, // RG32UI
{VK_FORMAT_UNDEFINED}, // RGBX16F
{VK_FORMAT_R32_UINT, Attachable | Storage}, // R32UI
{VK_FORMAT_R32_SINT, Attachable | Storage}, // R32I
{VK_FORMAT_ASTC_8x8_UNORM_BLOCK}, // ASTC_2D_8X8
{VK_FORMAT_UNDEFINED}, // ASTC_2D_8X5
{VK_FORMAT_UNDEFINED}, // ASTC_2D_5X4
{VK_FORMAT_UNDEFINED}, // BGRA8_SRGB
{VK_FORMAT_BC1_RGBA_SRGB_BLOCK}, // DXT1_SRGB
{VK_FORMAT_BC2_SRGB_BLOCK}, // DXT23_SRGB
{VK_FORMAT_BC3_SRGB_BLOCK}, // DXT45_SRGB
{VK_FORMAT_BC7_SRGB_BLOCK}, // BC7U_SRGB
{VK_FORMAT_R4G4B4A4_UNORM_PACK16, Attachable}, // R4G4B4A4U
{VK_FORMAT_ASTC_4x4_SRGB_BLOCK}, // ASTC_2D_4X4_SRGB
{VK_FORMAT_ASTC_8x8_SRGB_BLOCK}, // ASTC_2D_8X8_SRGB
{VK_FORMAT_ASTC_8x5_SRGB_BLOCK}, // ASTC_2D_8X5_SRGB
{VK_FORMAT_ASTC_5x4_SRGB_BLOCK}, // ASTC_2D_5X4_SRGB
{VK_FORMAT_ASTC_5x5_UNORM_BLOCK}, // ASTC_2D_5X5
{VK_FORMAT_ASTC_5x5_SRGB_BLOCK}, // ASTC_2D_5X5_SRGB
{VK_FORMAT_ASTC_10x8_UNORM_BLOCK}, // ASTC_2D_10X8
{VK_FORMAT_ASTC_10x8_SRGB_BLOCK}, // ASTC_2D_10X8_SRGB
{VK_FORMAT_ASTC_6x6_UNORM_BLOCK}, // ASTC_2D_6X6
{VK_FORMAT_ASTC_6x6_SRGB_BLOCK}, // ASTC_2D_6X6_SRGB
{VK_FORMAT_ASTC_10x10_UNORM_BLOCK}, // ASTC_2D_10X10
{VK_FORMAT_ASTC_10x10_SRGB_BLOCK}, // ASTC_2D_10X10_SRGB
{VK_FORMAT_ASTC_12x12_UNORM_BLOCK}, // ASTC_2D_12X12
{VK_FORMAT_ASTC_12x12_SRGB_BLOCK}, // ASTC_2D_12X12_SRGB
{VK_FORMAT_ASTC_8x6_UNORM_BLOCK}, // ASTC_2D_8X6
{VK_FORMAT_ASTC_8x6_SRGB_BLOCK}, // ASTC_2D_8X6_SRGB
{VK_FORMAT_ASTC_6x5_UNORM_BLOCK}, // ASTC_2D_6X5
{VK_FORMAT_ASTC_6x5_SRGB_BLOCK}, // ASTC_2D_6X5_SRGB
{VK_FORMAT_E5B9G9R9_UFLOAT_PACK32}, // E5B9G9R9F
// Depth formats
{vk::Format::eD32Sfloat, Attachable}, // Z32F
{vk::Format::eD16Unorm, Attachable}, // Z16
{VK_FORMAT_D32_SFLOAT, Attachable}, // Z32F
{VK_FORMAT_D16_UNORM, Attachable}, // Z16
// DepthStencil formats
{vk::Format::eD24UnormS8Uint, Attachable}, // Z24S8
{vk::Format::eD24UnormS8Uint, Attachable}, // S8Z24 (emulated)
{vk::Format::eD32SfloatS8Uint, Attachable}, // Z32FS8
{VK_FORMAT_D24_UNORM_S8_UINT, Attachable}, // Z24S8
{VK_FORMAT_D24_UNORM_S8_UINT, Attachable}, // S8Z24 (emulated)
{VK_FORMAT_D32_SFLOAT_S8_UINT, Attachable}, // Z32FS8
};
static_assert(std::size(tex_format_tuples) == VideoCore::Surface::MaxPixelFormat);
@@ -212,106 +215,106 @@ FormatInfo SurfaceFormat(const VKDevice& device, FormatType format_type, PixelFo
ASSERT(static_cast<std::size_t>(pixel_format) < std::size(tex_format_tuples));
auto tuple = tex_format_tuples[static_cast<std::size_t>(pixel_format)];
if (tuple.format == vk::Format::eUndefined) {
if (tuple.format == VK_FORMAT_UNDEFINED) {
UNIMPLEMENTED_MSG("Unimplemented texture format with pixel format={}",
static_cast<u32>(pixel_format));
return {vk::Format::eA8B8G8R8UnormPack32, true, true};
return {VK_FORMAT_A8B8G8R8_UNORM_PACK32, true, true};
}
// Use ABGR8 on hardware that doesn't support ASTC natively
if (!device.IsOptimalAstcSupported() && VideoCore::Surface::IsPixelFormatASTC(pixel_format)) {
tuple.format = VideoCore::Surface::IsPixelFormatSRGB(pixel_format)
? vk::Format::eA8B8G8R8SrgbPack32
: vk::Format::eA8B8G8R8UnormPack32;
? VK_FORMAT_A8B8G8R8_SRGB_PACK32
: VK_FORMAT_A8B8G8R8_UNORM_PACK32;
}
const bool attachable = tuple.usage & Attachable;
const bool storage = tuple.usage & Storage;
vk::FormatFeatureFlags usage;
VkFormatFeatureFlags usage;
if (format_type == FormatType::Buffer) {
usage = vk::FormatFeatureFlagBits::eStorageTexelBuffer |
vk::FormatFeatureFlagBits::eUniformTexelBuffer;
usage =
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT | VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT;
} else {
usage = vk::FormatFeatureFlagBits::eSampledImage | vk::FormatFeatureFlagBits::eTransferDst |
vk::FormatFeatureFlagBits::eTransferSrc;
usage = VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT | VK_FORMAT_FEATURE_TRANSFER_DST_BIT |
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT;
if (attachable) {
usage |= IsZetaFormat(pixel_format) ? vk::FormatFeatureFlagBits::eDepthStencilAttachment
: vk::FormatFeatureFlagBits::eColorAttachment;
usage |= IsZetaFormat(pixel_format) ? VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT
: VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT;
}
if (storage) {
usage |= vk::FormatFeatureFlagBits::eStorageImage;
usage |= VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT;
}
}
return {device.GetSupportedFormat(tuple.format, usage, format_type), attachable, storage};
}
vk::ShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage) {
VkShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage) {
switch (stage) {
case Tegra::Engines::ShaderType::Vertex:
return vk::ShaderStageFlagBits::eVertex;
return VK_SHADER_STAGE_VERTEX_BIT;
case Tegra::Engines::ShaderType::TesselationControl:
return vk::ShaderStageFlagBits::eTessellationControl;
return VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT;
case Tegra::Engines::ShaderType::TesselationEval:
return vk::ShaderStageFlagBits::eTessellationEvaluation;
return VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT;
case Tegra::Engines::ShaderType::Geometry:
return vk::ShaderStageFlagBits::eGeometry;
return VK_SHADER_STAGE_GEOMETRY_BIT;
case Tegra::Engines::ShaderType::Fragment:
return vk::ShaderStageFlagBits::eFragment;
return VK_SHADER_STAGE_FRAGMENT_BIT;
case Tegra::Engines::ShaderType::Compute:
return vk::ShaderStageFlagBits::eCompute;
return VK_SHADER_STAGE_COMPUTE_BIT;
}
UNIMPLEMENTED_MSG("Unimplemented shader stage={}", static_cast<u32>(stage));
return {};
}
vk::PrimitiveTopology PrimitiveTopology([[maybe_unused]] const VKDevice& device,
Maxwell::PrimitiveTopology topology) {
VkPrimitiveTopology PrimitiveTopology([[maybe_unused]] const VKDevice& device,
Maxwell::PrimitiveTopology topology) {
switch (topology) {
case Maxwell::PrimitiveTopology::Points:
return vk::PrimitiveTopology::ePointList;
return VK_PRIMITIVE_TOPOLOGY_POINT_LIST;
case Maxwell::PrimitiveTopology::Lines:
return vk::PrimitiveTopology::eLineList;
return VK_PRIMITIVE_TOPOLOGY_LINE_LIST;
case Maxwell::PrimitiveTopology::LineStrip:
return vk::PrimitiveTopology::eLineStrip;
return VK_PRIMITIVE_TOPOLOGY_LINE_STRIP;
case Maxwell::PrimitiveTopology::Triangles:
return vk::PrimitiveTopology::eTriangleList;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
case Maxwell::PrimitiveTopology::TriangleStrip:
return vk::PrimitiveTopology::eTriangleStrip;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP;
case Maxwell::PrimitiveTopology::TriangleFan:
return vk::PrimitiveTopology::eTriangleFan;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN;
case Maxwell::PrimitiveTopology::Quads:
// TODO(Rodrigo): Use VK_PRIMITIVE_TOPOLOGY_QUAD_LIST_EXT whenever it releases
return vk::PrimitiveTopology::eTriangleList;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
case Maxwell::PrimitiveTopology::Patches:
return vk::PrimitiveTopology::ePatchList;
return VK_PRIMITIVE_TOPOLOGY_PATCH_LIST;
default:
UNIMPLEMENTED_MSG("Unimplemented topology={}", static_cast<u32>(topology));
return {};
}
}
vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size) {
VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size) {
switch (type) {
case Maxwell::VertexAttribute::Type::SignedNorm:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Snorm;
return VK_FORMAT_R8_SNORM;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Snorm;
return VK_FORMAT_R8G8_SNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Snorm;
return VK_FORMAT_R8G8B8_SNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Snorm;
return VK_FORMAT_R8G8B8A8_SNORM;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Snorm;
return VK_FORMAT_R16_SNORM;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Snorm;
return VK_FORMAT_R16G16_SNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Snorm;
return VK_FORMAT_R16G16B16_SNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Snorm;
return VK_FORMAT_R16G16B16A16_SNORM;
case Maxwell::VertexAttribute::Size::Size_10_10_10_2:
return vk::Format::eA2B10G10R10SnormPack32;
return VK_FORMAT_A2B10G10R10_SNORM_PACK32;
default:
break;
}
@@ -319,23 +322,23 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::UnsignedNorm:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Unorm;
return VK_FORMAT_R8_UNORM;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Unorm;
return VK_FORMAT_R8G8_UNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Unorm;
return VK_FORMAT_R8G8B8_UNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Unorm;
return VK_FORMAT_R8G8B8A8_UNORM;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Unorm;
return VK_FORMAT_R16_UNORM;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Unorm;
return VK_FORMAT_R16G16_UNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Unorm;
return VK_FORMAT_R16G16B16_UNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Unorm;
return VK_FORMAT_R16G16B16A16_UNORM;
case Maxwell::VertexAttribute::Size::Size_10_10_10_2:
return vk::Format::eA2B10G10R10UnormPack32;
return VK_FORMAT_A2B10G10R10_UNORM_PACK32;
default:
break;
}
@@ -343,59 +346,69 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::SignedInt:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Sint;
return VK_FORMAT_R16G16B16A16_SINT;
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Sint;
return VK_FORMAT_R8_SINT;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Sint;
return VK_FORMAT_R8G8_SINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Sint;
return VK_FORMAT_R8G8B8_SINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Sint;
return VK_FORMAT_R8G8B8A8_SINT;
case Maxwell::VertexAttribute::Size::Size_32:
return vk::Format::eR32Sint;
return VK_FORMAT_R32_SINT;
default:
break;
}
break;
case Maxwell::VertexAttribute::Type::UnsignedInt:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Uint;
return VK_FORMAT_R8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Uint;
return VK_FORMAT_R8G8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Uint;
return VK_FORMAT_R8G8B8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Uint;
return VK_FORMAT_R8G8B8A8_UINT;
case Maxwell::VertexAttribute::Size::Size_16:
return VK_FORMAT_R16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16:
return VK_FORMAT_R16G16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return VK_FORMAT_R16G16B16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return VK_FORMAT_R16G16B16A16_UINT;
case Maxwell::VertexAttribute::Size::Size_32:
return vk::Format::eR32Uint;
return VK_FORMAT_R32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32:
return vk::Format::eR32G32Uint;
return VK_FORMAT_R32G32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32_32:
return vk::Format::eR32G32B32Uint;
return VK_FORMAT_R32G32B32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32_32_32:
return vk::Format::eR32G32B32A32Uint;
return VK_FORMAT_R32G32B32A32_UINT;
default:
break;
}
break;
case Maxwell::VertexAttribute::Type::UnsignedScaled:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Uscaled;
return VK_FORMAT_R8_USCALED;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Uscaled;
return VK_FORMAT_R8G8_USCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Uscaled;
return VK_FORMAT_R8G8B8_USCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Uscaled;
return VK_FORMAT_R8G8B8A8_USCALED;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Uscaled;
return VK_FORMAT_R16_USCALED;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Uscaled;
return VK_FORMAT_R16G16_USCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Uscaled;
return VK_FORMAT_R16G16B16_USCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Uscaled;
return VK_FORMAT_R16G16B16A16_USCALED;
default:
break;
}
@@ -403,21 +416,21 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::SignedScaled:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Sscaled;
return VK_FORMAT_R8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Sscaled;
return VK_FORMAT_R8G8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Sscaled;
return VK_FORMAT_R8G8B8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Sscaled;
return VK_FORMAT_R8G8B8A8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Sscaled;
return VK_FORMAT_R16_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Sscaled;
return VK_FORMAT_R16G16_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Sscaled;
return VK_FORMAT_R16G16B16_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Sscaled;
return VK_FORMAT_R16G16B16A16_SSCALED;
default:
break;
}
@@ -425,21 +438,21 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::Float:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_32:
return vk::Format::eR32Sfloat;
return VK_FORMAT_R32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_32_32:
return vk::Format::eR32G32Sfloat;
return VK_FORMAT_R32G32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_32_32_32:
return vk::Format::eR32G32B32Sfloat;
return VK_FORMAT_R32G32B32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_32_32_32_32:
return vk::Format::eR32G32B32A32Sfloat;
return VK_FORMAT_R32G32B32A32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Sfloat;
return VK_FORMAT_R16_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Sfloat;
return VK_FORMAT_R16G16_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Sfloat;
return VK_FORMAT_R16G16B16_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Sfloat;
return VK_FORMAT_R16G16B16A16_SFLOAT;
default:
break;
}
@@ -450,210 +463,210 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
return {};
}
vk::CompareOp ComparisonOp(Maxwell::ComparisonOp comparison) {
VkCompareOp ComparisonOp(Maxwell::ComparisonOp comparison) {
switch (comparison) {
case Maxwell::ComparisonOp::Never:
case Maxwell::ComparisonOp::NeverOld:
return vk::CompareOp::eNever;
return VK_COMPARE_OP_NEVER;
case Maxwell::ComparisonOp::Less:
case Maxwell::ComparisonOp::LessOld:
return vk::CompareOp::eLess;
return VK_COMPARE_OP_LESS;
case Maxwell::ComparisonOp::Equal:
case Maxwell::ComparisonOp::EqualOld:
return vk::CompareOp::eEqual;
return VK_COMPARE_OP_EQUAL;
case Maxwell::ComparisonOp::LessEqual:
case Maxwell::ComparisonOp::LessEqualOld:
return vk::CompareOp::eLessOrEqual;
return VK_COMPARE_OP_LESS_OR_EQUAL;
case Maxwell::ComparisonOp::Greater:
case Maxwell::ComparisonOp::GreaterOld:
return vk::CompareOp::eGreater;
return VK_COMPARE_OP_GREATER;
case Maxwell::ComparisonOp::NotEqual:
case Maxwell::ComparisonOp::NotEqualOld:
return vk::CompareOp::eNotEqual;
return VK_COMPARE_OP_NOT_EQUAL;
case Maxwell::ComparisonOp::GreaterEqual:
case Maxwell::ComparisonOp::GreaterEqualOld:
return vk::CompareOp::eGreaterOrEqual;
return VK_COMPARE_OP_GREATER_OR_EQUAL;
case Maxwell::ComparisonOp::Always:
case Maxwell::ComparisonOp::AlwaysOld:
return vk::CompareOp::eAlways;
return VK_COMPARE_OP_ALWAYS;
}
UNIMPLEMENTED_MSG("Unimplemented comparison op={}", static_cast<u32>(comparison));
return {};
}
vk::IndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format) {
VkIndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format) {
switch (index_format) {
case Maxwell::IndexFormat::UnsignedByte:
if (!device.IsExtIndexTypeUint8Supported()) {
UNIMPLEMENTED_MSG("Native uint8 indices are not supported on this device");
return vk::IndexType::eUint16;
return VK_INDEX_TYPE_UINT16;
}
return vk::IndexType::eUint8EXT;
return VK_INDEX_TYPE_UINT8_EXT;
case Maxwell::IndexFormat::UnsignedShort:
return vk::IndexType::eUint16;
return VK_INDEX_TYPE_UINT16;
case Maxwell::IndexFormat::UnsignedInt:
return vk::IndexType::eUint32;
return VK_INDEX_TYPE_UINT32;
}
UNIMPLEMENTED_MSG("Unimplemented index_format={}", static_cast<u32>(index_format));
return {};
}
vk::StencilOp StencilOp(Maxwell::StencilOp stencil_op) {
VkStencilOp StencilOp(Maxwell::StencilOp stencil_op) {
switch (stencil_op) {
case Maxwell::StencilOp::Keep:
case Maxwell::StencilOp::KeepOGL:
return vk::StencilOp::eKeep;
return VK_STENCIL_OP_KEEP;
case Maxwell::StencilOp::Zero:
case Maxwell::StencilOp::ZeroOGL:
return vk::StencilOp::eZero;
return VK_STENCIL_OP_ZERO;
case Maxwell::StencilOp::Replace:
case Maxwell::StencilOp::ReplaceOGL:
return vk::StencilOp::eReplace;
return VK_STENCIL_OP_REPLACE;
case Maxwell::StencilOp::Incr:
case Maxwell::StencilOp::IncrOGL:
return vk::StencilOp::eIncrementAndClamp;
return VK_STENCIL_OP_INCREMENT_AND_CLAMP;
case Maxwell::StencilOp::Decr:
case Maxwell::StencilOp::DecrOGL:
return vk::StencilOp::eDecrementAndClamp;
return VK_STENCIL_OP_DECREMENT_AND_CLAMP;
case Maxwell::StencilOp::Invert:
case Maxwell::StencilOp::InvertOGL:
return vk::StencilOp::eInvert;
return VK_STENCIL_OP_INVERT;
case Maxwell::StencilOp::IncrWrap:
case Maxwell::StencilOp::IncrWrapOGL:
return vk::StencilOp::eIncrementAndWrap;
return VK_STENCIL_OP_INCREMENT_AND_WRAP;
case Maxwell::StencilOp::DecrWrap:
case Maxwell::StencilOp::DecrWrapOGL:
return vk::StencilOp::eDecrementAndWrap;
return VK_STENCIL_OP_DECREMENT_AND_WRAP;
}
UNIMPLEMENTED_MSG("Unimplemented stencil op={}", static_cast<u32>(stencil_op));
return {};
}
vk::BlendOp BlendEquation(Maxwell::Blend::Equation equation) {
VkBlendOp BlendEquation(Maxwell::Blend::Equation equation) {
switch (equation) {
case Maxwell::Blend::Equation::Add:
case Maxwell::Blend::Equation::AddGL:
return vk::BlendOp::eAdd;
return VK_BLEND_OP_ADD;
case Maxwell::Blend::Equation::Subtract:
case Maxwell::Blend::Equation::SubtractGL:
return vk::BlendOp::eSubtract;
return VK_BLEND_OP_SUBTRACT;
case Maxwell::Blend::Equation::ReverseSubtract:
case Maxwell::Blend::Equation::ReverseSubtractGL:
return vk::BlendOp::eReverseSubtract;
return VK_BLEND_OP_REVERSE_SUBTRACT;
case Maxwell::Blend::Equation::Min:
case Maxwell::Blend::Equation::MinGL:
return vk::BlendOp::eMin;
return VK_BLEND_OP_MIN;
case Maxwell::Blend::Equation::Max:
case Maxwell::Blend::Equation::MaxGL:
return vk::BlendOp::eMax;
return VK_BLEND_OP_MAX;
}
UNIMPLEMENTED_MSG("Unimplemented blend equation={}", static_cast<u32>(equation));
return {};
}
vk::BlendFactor BlendFactor(Maxwell::Blend::Factor factor) {
VkBlendFactor BlendFactor(Maxwell::Blend::Factor factor) {
switch (factor) {
case Maxwell::Blend::Factor::Zero:
case Maxwell::Blend::Factor::ZeroGL:
return vk::BlendFactor::eZero;
return VK_BLEND_FACTOR_ZERO;
case Maxwell::Blend::Factor::One:
case Maxwell::Blend::Factor::OneGL:
return vk::BlendFactor::eOne;
return VK_BLEND_FACTOR_ONE;
case Maxwell::Blend::Factor::SourceColor:
case Maxwell::Blend::Factor::SourceColorGL:
return vk::BlendFactor::eSrcColor;
return VK_BLEND_FACTOR_SRC_COLOR;
case Maxwell::Blend::Factor::OneMinusSourceColor:
case Maxwell::Blend::Factor::OneMinusSourceColorGL:
return vk::BlendFactor::eOneMinusSrcColor;
return VK_BLEND_FACTOR_ONE_MINUS_SRC_COLOR;
case Maxwell::Blend::Factor::SourceAlpha:
case Maxwell::Blend::Factor::SourceAlphaGL:
return vk::BlendFactor::eSrcAlpha;
return VK_BLEND_FACTOR_SRC_ALPHA;
case Maxwell::Blend::Factor::OneMinusSourceAlpha:
case Maxwell::Blend::Factor::OneMinusSourceAlphaGL:
return vk::BlendFactor::eOneMinusSrcAlpha;
return VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
case Maxwell::Blend::Factor::DestAlpha:
case Maxwell::Blend::Factor::DestAlphaGL:
return vk::BlendFactor::eDstAlpha;
return VK_BLEND_FACTOR_DST_ALPHA;
case Maxwell::Blend::Factor::OneMinusDestAlpha:
case Maxwell::Blend::Factor::OneMinusDestAlphaGL:
return vk::BlendFactor::eOneMinusDstAlpha;
return VK_BLEND_FACTOR_ONE_MINUS_DST_ALPHA;
case Maxwell::Blend::Factor::DestColor:
case Maxwell::Blend::Factor::DestColorGL:
return vk::BlendFactor::eDstColor;
return VK_BLEND_FACTOR_DST_COLOR;
case Maxwell::Blend::Factor::OneMinusDestColor:
case Maxwell::Blend::Factor::OneMinusDestColorGL:
return vk::BlendFactor::eOneMinusDstColor;
return VK_BLEND_FACTOR_ONE_MINUS_DST_COLOR;
case Maxwell::Blend::Factor::SourceAlphaSaturate:
case Maxwell::Blend::Factor::SourceAlphaSaturateGL:
return vk::BlendFactor::eSrcAlphaSaturate;
return VK_BLEND_FACTOR_SRC_ALPHA_SATURATE;
case Maxwell::Blend::Factor::Source1Color:
case Maxwell::Blend::Factor::Source1ColorGL:
return vk::BlendFactor::eSrc1Color;
return VK_BLEND_FACTOR_SRC1_COLOR;
case Maxwell::Blend::Factor::OneMinusSource1Color:
case Maxwell::Blend::Factor::OneMinusSource1ColorGL:
return vk::BlendFactor::eOneMinusSrc1Color;
return VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR;
case Maxwell::Blend::Factor::Source1Alpha:
case Maxwell::Blend::Factor::Source1AlphaGL:
return vk::BlendFactor::eSrc1Alpha;
return VK_BLEND_FACTOR_SRC1_ALPHA;
case Maxwell::Blend::Factor::OneMinusSource1Alpha:
case Maxwell::Blend::Factor::OneMinusSource1AlphaGL:
return vk::BlendFactor::eOneMinusSrc1Alpha;
return VK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA;
case Maxwell::Blend::Factor::ConstantColor:
case Maxwell::Blend::Factor::ConstantColorGL:
return vk::BlendFactor::eConstantColor;
return VK_BLEND_FACTOR_CONSTANT_COLOR;
case Maxwell::Blend::Factor::OneMinusConstantColor:
case Maxwell::Blend::Factor::OneMinusConstantColorGL:
return vk::BlendFactor::eOneMinusConstantColor;
return VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_COLOR;
case Maxwell::Blend::Factor::ConstantAlpha:
case Maxwell::Blend::Factor::ConstantAlphaGL:
return vk::BlendFactor::eConstantAlpha;
return VK_BLEND_FACTOR_CONSTANT_ALPHA;
case Maxwell::Blend::Factor::OneMinusConstantAlpha:
case Maxwell::Blend::Factor::OneMinusConstantAlphaGL:
return vk::BlendFactor::eOneMinusConstantAlpha;
return VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_ALPHA;
}
UNIMPLEMENTED_MSG("Unimplemented blend factor={}", static_cast<u32>(factor));
return {};
}
vk::FrontFace FrontFace(Maxwell::FrontFace front_face) {
VkFrontFace FrontFace(Maxwell::FrontFace front_face) {
switch (front_face) {
case Maxwell::FrontFace::ClockWise:
return vk::FrontFace::eClockwise;
return VK_FRONT_FACE_CLOCKWISE;
case Maxwell::FrontFace::CounterClockWise:
return vk::FrontFace::eCounterClockwise;
return VK_FRONT_FACE_COUNTER_CLOCKWISE;
}
UNIMPLEMENTED_MSG("Unimplemented front face={}", static_cast<u32>(front_face));
return {};
}
vk::CullModeFlags CullFace(Maxwell::CullFace cull_face) {
VkCullModeFlags CullFace(Maxwell::CullFace cull_face) {
switch (cull_face) {
case Maxwell::CullFace::Front:
return vk::CullModeFlagBits::eFront;
return VK_CULL_MODE_FRONT_BIT;
case Maxwell::CullFace::Back:
return vk::CullModeFlagBits::eBack;
return VK_CULL_MODE_BACK_BIT;
case Maxwell::CullFace::FrontAndBack:
return vk::CullModeFlagBits::eFrontAndBack;
return VK_CULL_MODE_FRONT_AND_BACK;
}
UNIMPLEMENTED_MSG("Unimplemented cull face={}", static_cast<u32>(cull_face));
return {};
}
vk::ComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle) {
VkComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle) {
switch (swizzle) {
case Tegra::Texture::SwizzleSource::Zero:
return vk::ComponentSwizzle::eZero;
return VK_COMPONENT_SWIZZLE_ZERO;
case Tegra::Texture::SwizzleSource::R:
return vk::ComponentSwizzle::eR;
return VK_COMPONENT_SWIZZLE_R;
case Tegra::Texture::SwizzleSource::G:
return vk::ComponentSwizzle::eG;
return VK_COMPONENT_SWIZZLE_G;
case Tegra::Texture::SwizzleSource::B:
return vk::ComponentSwizzle::eB;
return VK_COMPONENT_SWIZZLE_B;
case Tegra::Texture::SwizzleSource::A:
return vk::ComponentSwizzle::eA;
return VK_COMPONENT_SWIZZLE_A;
case Tegra::Texture::SwizzleSource::OneInt:
case Tegra::Texture::SwizzleSource::OneFloat:
return vk::ComponentSwizzle::eOne;
return VK_COMPONENT_SWIZZLE_ONE;
}
UNIMPLEMENTED_MSG("Unimplemented swizzle source={}", static_cast<u32>(swizzle));
return {};

View File

@@ -6,8 +6,8 @@
#include "common/common_types.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
#include "video_core/textures/texture.h"
@@ -18,46 +18,45 @@ using PixelFormat = VideoCore::Surface::PixelFormat;
namespace Sampler {
vk::Filter Filter(Tegra::Texture::TextureFilter filter);
VkFilter Filter(Tegra::Texture::TextureFilter filter);
vk::SamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter);
VkSamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter);
vk::SamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter);
VkSamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter);
vk::CompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func);
VkCompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func);
} // namespace Sampler
struct FormatInfo {
vk::Format format;
VkFormat format;
bool attachable;
bool storage;
};
FormatInfo SurfaceFormat(const VKDevice& device, FormatType format_type, PixelFormat pixel_format);
vk::ShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage);
VkShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage);
vk::PrimitiveTopology PrimitiveTopology(const VKDevice& device,
Maxwell::PrimitiveTopology topology);
VkPrimitiveTopology PrimitiveTopology(const VKDevice& device, Maxwell::PrimitiveTopology topology);
vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size);
VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size);
vk::CompareOp ComparisonOp(Maxwell::ComparisonOp comparison);
VkCompareOp ComparisonOp(Maxwell::ComparisonOp comparison);
vk::IndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format);
VkIndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format);
vk::StencilOp StencilOp(Maxwell::StencilOp stencil_op);
VkStencilOp StencilOp(Maxwell::StencilOp stencil_op);
vk::BlendOp BlendEquation(Maxwell::Blend::Equation equation);
VkBlendOp BlendEquation(Maxwell::Blend::Equation equation);
vk::BlendFactor BlendFactor(Maxwell::Blend::Factor factor);
VkBlendFactor BlendFactor(Maxwell::Blend::Factor factor);
vk::FrontFace FrontFace(Maxwell::FrontFace front_face);
VkFrontFace FrontFace(Maxwell::FrontFace front_face);
vk::CullModeFlags CullFace(Maxwell::CullFace cull_face);
VkCullModeFlags CullFace(Maxwell::CullFace cull_face);
vk::ComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle);
VkComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle);
} // namespace Vulkan::MaxwellToVK

View File

@@ -2,13 +2,18 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <algorithm>
#include <array>
#include <cstring>
#include <memory>
#include <optional>
#include <string>
#include <vector>
#include <fmt/format.h>
#include "common/assert.h"
#include "common/dynamic_library.h"
#include "common/logging/log.h"
#include "common/telemetry.h"
#include "core/core.h"
@@ -19,7 +24,6 @@
#include "core/settings.h"
#include "core/telemetry_session.h"
#include "video_core/gpu.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/renderer_vulkan.h"
#include "video_core/renderer_vulkan/vk_blit_screen.h"
#include "video_core/renderer_vulkan/vk_device.h"
@@ -29,30 +33,145 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#include "video_core/renderer_vulkan/vk_swapchain.h"
#include "video_core/renderer_vulkan/wrapper.h"
// Include these late to avoid polluting previous headers
#ifdef _WIN32
#include <windows.h>
// ensure include order
#include <vulkan/vulkan_win32.h>
#endif
#ifdef __linux__
#include <X11/Xlib.h>
#include <vulkan/vulkan_wayland.h>
#include <vulkan/vulkan_xlib.h>
#endif
namespace Vulkan {
namespace {
VkBool32 DebugCallback(VkDebugUtilsMessageSeverityFlagBitsEXT severity_,
using Core::Frontend::WindowSystemType;
VkBool32 DebugCallback(VkDebugUtilsMessageSeverityFlagBitsEXT severity,
VkDebugUtilsMessageTypeFlagsEXT type,
const VkDebugUtilsMessengerCallbackDataEXT* data,
[[maybe_unused]] void* user_data) {
const vk::DebugUtilsMessageSeverityFlagBitsEXT severity{severity_};
const char* message{data->pMessage};
if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eError) {
if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT) {
LOG_CRITICAL(Render_Vulkan, "{}", message);
} else if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eWarning) {
} else if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT) {
LOG_WARNING(Render_Vulkan, "{}", message);
} else if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eInfo) {
} else if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT) {
LOG_INFO(Render_Vulkan, "{}", message);
} else if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eVerbose) {
} else if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT) {
LOG_DEBUG(Render_Vulkan, "{}", message);
}
return VK_FALSE;
}
Common::DynamicLibrary OpenVulkanLibrary() {
Common::DynamicLibrary library;
#ifdef __APPLE__
// Check if a path to a specific Vulkan library has been specified.
char* libvulkan_env = getenv("LIBVULKAN_PATH");
if (!libvulkan_env || !library.Open(libvulkan_env)) {
// Use the libvulkan.dylib from the application bundle.
std::string filename = File::GetBundleDirectory() + "/Contents/Frameworks/libvulkan.dylib";
library.Open(filename.c_str());
}
#else
std::string filename = Common::DynamicLibrary::GetVersionedFilename("vulkan", 1);
if (!library.Open(filename.c_str())) {
// Android devices may not have libvulkan.so.1, only libvulkan.so.
filename = Common::DynamicLibrary::GetVersionedFilename("vulkan");
library.Open(filename.c_str());
}
#endif
return library;
}
vk::Instance CreateInstance(Common::DynamicLibrary& library, vk::InstanceDispatch& dld,
WindowSystemType window_type = WindowSystemType::Headless,
bool enable_layers = false) {
if (!library.IsOpen()) {
LOG_ERROR(Render_Vulkan, "Vulkan library not available");
return {};
}
if (!library.GetSymbol("vkGetInstanceProcAddr", &dld.vkGetInstanceProcAddr)) {
LOG_ERROR(Render_Vulkan, "vkGetInstanceProcAddr not present in Vulkan");
return {};
}
if (!vk::Load(dld)) {
LOG_ERROR(Render_Vulkan, "Failed to load Vulkan function pointers");
return {};
}
std::vector<const char*> extensions;
extensions.reserve(6);
switch (window_type) {
case Core::Frontend::WindowSystemType::Headless:
break;
#ifdef _WIN32
case Core::Frontend::WindowSystemType::Windows:
extensions.push_back(VK_KHR_WIN32_SURFACE_EXTENSION_NAME);
break;
#endif
#ifdef __linux__
case Core::Frontend::WindowSystemType::X11:
extensions.push_back(VK_KHR_XLIB_SURFACE_EXTENSION_NAME);
break;
case Core::Frontend::WindowSystemType::Wayland:
extensions.push_back(VK_KHR_WAYLAND_SURFACE_EXTENSION_NAME);
break;
#endif
default:
LOG_ERROR(Render_Vulkan, "Presentation not supported on this platform");
break;
}
if (window_type != Core::Frontend::WindowSystemType::Headless) {
extensions.push_back(VK_KHR_SURFACE_EXTENSION_NAME);
}
if (enable_layers) {
extensions.push_back(VK_EXT_DEBUG_UTILS_EXTENSION_NAME);
}
extensions.push_back(VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME);
const std::optional properties = vk::EnumerateInstanceExtensionProperties(dld);
if (!properties) {
LOG_ERROR(Render_Vulkan, "Failed to query extension properties");
return {};
}
for (const char* extension : extensions) {
const auto it =
std::find_if(properties->begin(), properties->end(), [extension](const auto& prop) {
return !std::strcmp(extension, prop.extensionName);
});
if (it == properties->end()) {
LOG_ERROR(Render_Vulkan, "Required instance extension {} is not available", extension);
return {};
}
}
static constexpr std::array layers_data{"VK_LAYER_LUNARG_standard_validation"};
vk::Span<const char*> layers = layers_data;
if (!enable_layers) {
layers = {};
}
vk::Instance instance = vk::Instance::Create(layers, extensions, dld);
if (!instance) {
LOG_ERROR(Render_Vulkan, "Failed to create Vulkan instance");
return {};
}
if (!vk::Load(*instance, dld)) {
LOG_ERROR(Render_Vulkan, "Failed to load Vulkan instance function pointers");
}
return instance;
}
std::string GetReadableVersion(u32 version) {
return fmt::format("{}.{}.{}", VK_VERSION_MAJOR(version), VK_VERSION_MINOR(version),
VK_VERSION_PATCH(version));
@@ -63,14 +182,14 @@ std::string GetDriverVersion(const VKDevice& device) {
// https://github.com/SaschaWillems/vulkan.gpuinfo.org/blob/5dddea46ea1120b0df14eef8f15ff8e318e35462/functions.php#L308-L314
const u32 version = device.GetDriverVersion();
if (device.GetDriverID() == vk::DriverIdKHR::eNvidiaProprietary) {
if (device.GetDriverID() == VK_DRIVER_ID_NVIDIA_PROPRIETARY_KHR) {
const u32 major = (version >> 22) & 0x3ff;
const u32 minor = (version >> 14) & 0x0ff;
const u32 secondary = (version >> 6) & 0x0ff;
const u32 tertiary = version & 0x003f;
return fmt::format("{}.{}.{}.{}", major, minor, secondary, tertiary);
}
if (device.GetDriverID() == vk::DriverIdKHR::eIntelProprietaryWindows) {
if (device.GetDriverID() == VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS_KHR) {
const u32 major = version >> 14;
const u32 minor = version & 0x3fff;
return fmt::format("{}.{}", major, minor);
@@ -141,32 +260,18 @@ void RendererVulkan::SwapBuffers(const Tegra::FramebufferConfig* framebuffer) {
render_window.PollEvents();
}
void RendererVulkan::TryPresent(int /*timeout_ms*/) {
bool RendererVulkan::TryPresent(int /*timeout_ms*/) {
// TODO (bunnei): ImplementMe
return true;
}
bool RendererVulkan::Init() {
PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr{};
render_window.RetrieveVulkanHandlers(&vkGetInstanceProcAddr, &instance, &surface);
const vk::DispatchLoaderDynamic dldi(instance, vkGetInstanceProcAddr);
std::optional<vk::DebugUtilsMessengerEXT> callback;
if (Settings::values.renderer_debug && dldi.vkCreateDebugUtilsMessengerEXT) {
callback = CreateDebugCallback(dldi);
if (!callback) {
return false;
}
}
if (!PickDevices(dldi)) {
if (callback) {
instance.destroy(*callback, nullptr, dldi);
}
library = OpenVulkanLibrary();
instance = CreateInstance(library, dld, render_window.GetWindowInfo().type,
Settings::values.renderer_debug);
if (!instance || !CreateDebugCallback() || !CreateSurface() || !PickDevices()) {
return false;
}
debug_callback = UniqueDebugUtilsMessengerEXT(
*callback, vk::ObjectDestroy<vk::Instance, vk::DispatchLoaderDynamic>(
instance, nullptr, device->GetDispatchLoader()));
Report();
@@ -175,7 +280,7 @@ bool RendererVulkan::Init() {
resource_manager = std::make_unique<VKResourceManager>(*device);
const auto& framebuffer = render_window.GetFramebufferLayout();
swapchain = std::make_unique<VKSwapchain>(surface, *device);
swapchain = std::make_unique<VKSwapchain>(*surface, *device);
swapchain->Create(framebuffer.width, framebuffer.height, false);
state_tracker = std::make_unique<StateTracker>(system);
@@ -197,10 +302,8 @@ void RendererVulkan::ShutDown() {
if (!device) {
return;
}
const auto dev = device->GetLogical();
const auto& dld = device->GetDispatchLoader();
if (dev && dld.vkDeviceWaitIdle) {
dev.waitIdle(dld);
if (const auto& dev = device->GetLogical()) {
dev.WaitIdle();
}
rasterizer.reset();
@@ -212,44 +315,94 @@ void RendererVulkan::ShutDown() {
device.reset();
}
std::optional<vk::DebugUtilsMessengerEXT> RendererVulkan::CreateDebugCallback(
const vk::DispatchLoaderDynamic& dldi) {
const vk::DebugUtilsMessengerCreateInfoEXT callback_ci(
{},
vk::DebugUtilsMessageSeverityFlagBitsEXT::eError |
vk::DebugUtilsMessageSeverityFlagBitsEXT::eWarning |
vk::DebugUtilsMessageSeverityFlagBitsEXT::eInfo |
vk::DebugUtilsMessageSeverityFlagBitsEXT::eVerbose,
vk::DebugUtilsMessageTypeFlagBitsEXT::eGeneral |
vk::DebugUtilsMessageTypeFlagBitsEXT::eValidation |
vk::DebugUtilsMessageTypeFlagBitsEXT::ePerformance,
&DebugCallback, nullptr);
vk::DebugUtilsMessengerEXT callback;
if (instance.createDebugUtilsMessengerEXT(&callback_ci, nullptr, &callback, dldi) !=
vk::Result::eSuccess) {
LOG_ERROR(Render_Vulkan, "Failed to create debug callback");
return {};
bool RendererVulkan::CreateDebugCallback() {
if (!Settings::values.renderer_debug) {
return true;
}
return callback;
debug_callback = instance.TryCreateDebugCallback(DebugCallback);
if (!debug_callback) {
LOG_ERROR(Render_Vulkan, "Failed to create debug callback");
return false;
}
return true;
}
bool RendererVulkan::PickDevices(const vk::DispatchLoaderDynamic& dldi) {
const auto devices = instance.enumeratePhysicalDevices(dldi);
bool RendererVulkan::CreateSurface() {
[[maybe_unused]] const auto& window_info = render_window.GetWindowInfo();
VkSurfaceKHR unsafe_surface = nullptr;
#ifdef _WIN32
if (window_info.type == Core::Frontend::WindowSystemType::Windows) {
const HWND hWnd = static_cast<HWND>(window_info.render_surface);
const VkWin32SurfaceCreateInfoKHR win32_ci{VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR,
nullptr, 0, nullptr, hWnd};
const auto vkCreateWin32SurfaceKHR = reinterpret_cast<PFN_vkCreateWin32SurfaceKHR>(
dld.vkGetInstanceProcAddr(*instance, "vkCreateWin32SurfaceKHR"));
if (!vkCreateWin32SurfaceKHR ||
vkCreateWin32SurfaceKHR(*instance, &win32_ci, nullptr, &unsafe_surface) != VK_SUCCESS) {
LOG_ERROR(Render_Vulkan, "Failed to initialize Win32 surface");
return false;
}
}
#endif
#ifdef __linux__
if (window_info.type == Core::Frontend::WindowSystemType::X11) {
const VkXlibSurfaceCreateInfoKHR xlib_ci{
VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR, nullptr, 0,
static_cast<Display*>(window_info.display_connection),
reinterpret_cast<Window>(window_info.render_surface)};
const auto vkCreateXlibSurfaceKHR = reinterpret_cast<PFN_vkCreateXlibSurfaceKHR>(
dld.vkGetInstanceProcAddr(*instance, "vkCreateXlibSurfaceKHR"));
if (!vkCreateXlibSurfaceKHR ||
vkCreateXlibSurfaceKHR(*instance, &xlib_ci, nullptr, &unsafe_surface) != VK_SUCCESS) {
LOG_ERROR(Render_Vulkan, "Failed to initialize Xlib surface");
return false;
}
}
if (window_info.type == Core::Frontend::WindowSystemType::Wayland) {
const VkWaylandSurfaceCreateInfoKHR wayland_ci{
VK_STRUCTURE_TYPE_WAYLAND_SURFACE_CREATE_INFO_KHR, nullptr, 0,
static_cast<wl_display*>(window_info.display_connection),
static_cast<wl_surface*>(window_info.render_surface)};
const auto vkCreateWaylandSurfaceKHR = reinterpret_cast<PFN_vkCreateWaylandSurfaceKHR>(
dld.vkGetInstanceProcAddr(*instance, "vkCreateWaylandSurfaceKHR"));
if (!vkCreateWaylandSurfaceKHR ||
vkCreateWaylandSurfaceKHR(*instance, &wayland_ci, nullptr, &unsafe_surface) !=
VK_SUCCESS) {
LOG_ERROR(Render_Vulkan, "Failed to initialize Wayland surface");
return false;
}
}
#endif
if (!unsafe_surface) {
LOG_ERROR(Render_Vulkan, "Presentation not supported on this platform");
return false;
}
surface = vk::SurfaceKHR(unsafe_surface, *instance, dld);
return true;
}
bool RendererVulkan::PickDevices() {
const auto devices = instance.EnumeratePhysicalDevices();
if (!devices) {
LOG_ERROR(Render_Vulkan, "Failed to enumerate physical devices");
return false;
}
// TODO(Rodrigo): Choose device from config file
const s32 device_index = Settings::values.vulkan_device;
if (device_index < 0 || device_index >= static_cast<s32>(devices.size())) {
if (device_index < 0 || device_index >= static_cast<s32>(devices->size())) {
LOG_ERROR(Render_Vulkan, "Invalid device index {}!", device_index);
return false;
}
const vk::PhysicalDevice physical_device = devices[device_index];
if (!VKDevice::IsSuitable(dldi, physical_device, surface)) {
const vk::PhysicalDevice physical_device((*devices)[static_cast<std::size_t>(device_index)],
dld);
if (!VKDevice::IsSuitable(physical_device, *surface)) {
return false;
}
device = std::make_unique<VKDevice>(dldi, physical_device, surface);
return device->Create(dldi, instance);
device = std::make_unique<VKDevice>(*instance, physical_device, *surface, dld);
return device->Create();
}
void RendererVulkan::Report() const {
@@ -275,4 +428,25 @@ void RendererVulkan::Report() const {
telemetry_session.AddField(field, "GPU_Vulkan_Extensions", extensions);
}
std::vector<std::string> RendererVulkan::EnumerateDevices() {
vk::InstanceDispatch dld;
Common::DynamicLibrary library = OpenVulkanLibrary();
vk::Instance instance = CreateInstance(library, dld);
if (!instance) {
return {};
}
const std::optional physical_devices = instance.EnumeratePhysicalDevices();
if (!physical_devices) {
return {};
}
std::vector<std::string> names;
names.reserve(physical_devices->size());
for (const auto& device : *physical_devices) {
names.push_back(vk::PhysicalDevice(device, dld).GetProperties().deviceName);
}
return names;
}
} // namespace Vulkan

View File

@@ -6,10 +6,13 @@
#include <memory>
#include <optional>
#include <string>
#include <vector>
#include "common/dynamic_library.h"
#include "video_core/renderer_base.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -42,24 +45,30 @@ public:
bool Init() override;
void ShutDown() override;
void SwapBuffers(const Tegra::FramebufferConfig* framebuffer) override;
void TryPresent(int timeout_ms) override;
bool TryPresent(int timeout_ms) override;
static std::vector<std::string> EnumerateDevices();
private:
std::optional<vk::DebugUtilsMessengerEXT> CreateDebugCallback(
const vk::DispatchLoaderDynamic& dldi);
bool CreateDebugCallback();
bool PickDevices(const vk::DispatchLoaderDynamic& dldi);
bool CreateSurface();
bool PickDevices();
void Report() const;
Core::System& system;
Common::DynamicLibrary library;
vk::InstanceDispatch dld;
vk::Instance instance;
vk::SurfaceKHR surface;
VKScreenInfo screen_info;
UniqueDebugUtilsMessengerEXT debug_callback;
vk::DebugCallback debug_callback;
std::unique_ptr<VKDevice> device;
std::unique_ptr<VKSwapchain> swapchain;
std::unique_ptr<VKMemoryManager> memory_manager;

View File

@@ -0,0 +1,50 @@
// Copyright 2020 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
/*
* Build instructions:
* $ glslangValidator -V quad_indexed.comp -o output.spv
* $ spirv-opt -O --strip-debug output.spv -o optimized.spv
* $ xxd -i optimized.spv
*
* Then copy that bytecode to the C++ file
*/
#version 460 core
layout (local_size_x = 1024) in;
layout (std430, set = 0, binding = 0) readonly buffer InputBuffer {
uint input_indexes[];
};
layout (std430, set = 0, binding = 1) writeonly buffer OutputBuffer {
uint output_indexes[];
};
layout (push_constant) uniform PushConstants {
uint base_vertex;
int index_shift; // 0: uint8, 1: uint16, 2: uint32
};
void main() {
int primitive = int(gl_GlobalInvocationID.x);
if (primitive * 6 >= output_indexes.length()) {
return;
}
int index_size = 8 << index_shift;
int flipped_shift = 2 - index_shift;
int mask = (1 << flipped_shift) - 1;
const int quad_swizzle[6] = int[](0, 1, 2, 0, 2, 3);
for (uint vertex = 0; vertex < 6; ++vertex) {
int offset = primitive * 4 + quad_swizzle[vertex];
int int_offset = offset >> flipped_shift;
int bit_offset = (offset & mask) * index_size;
uint packed_input = input_indexes[int_offset];
uint index = bitfieldExtract(packed_input, bit_offset, index_size);
output_indexes[primitive * 6 + vertex] = index + base_vertex;
}
}

View File

@@ -20,7 +20,6 @@
#include "video_core/gpu.h"
#include "video_core/morton.h"
#include "video_core/rasterizer_interface.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/renderer_vulkan.h"
#include "video_core/renderer_vulkan/vk_blit_screen.h"
#include "video_core/renderer_vulkan/vk_device.h"
@@ -30,6 +29,7 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_shader_util.h"
#include "video_core/renderer_vulkan/vk_swapchain.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
namespace Vulkan {
@@ -140,16 +140,25 @@ struct ScreenRectVertex {
std::array<f32, 2> position;
std::array<f32, 2> tex_coord;
static vk::VertexInputBindingDescription GetDescription() {
return vk::VertexInputBindingDescription(0, sizeof(ScreenRectVertex),
vk::VertexInputRate::eVertex);
static VkVertexInputBindingDescription GetDescription() {
VkVertexInputBindingDescription description;
description.binding = 0;
description.stride = sizeof(ScreenRectVertex);
description.inputRate = VK_VERTEX_INPUT_RATE_VERTEX;
return description;
}
static std::array<vk::VertexInputAttributeDescription, 2> GetAttributes() {
return {vk::VertexInputAttributeDescription(0, 0, vk::Format::eR32G32Sfloat,
offsetof(ScreenRectVertex, position)),
vk::VertexInputAttributeDescription(1, 0, vk::Format::eR32G32Sfloat,
offsetof(ScreenRectVertex, tex_coord))};
static std::array<VkVertexInputAttributeDescription, 2> GetAttributes() {
std::array<VkVertexInputAttributeDescription, 2> attributes;
attributes[0].location = 0;
attributes[0].binding = 0;
attributes[0].format = VK_FORMAT_R32G32_SFLOAT;
attributes[0].offset = offsetof(ScreenRectVertex, position);
attributes[1].location = 1;
attributes[1].binding = 0;
attributes[1].format = VK_FORMAT_R32G32_SFLOAT;
attributes[1].offset = offsetof(ScreenRectVertex, tex_coord);
return attributes;
}
};
@@ -172,16 +181,16 @@ std::size_t GetSizeInBytes(const Tegra::FramebufferConfig& framebuffer) {
static_cast<std::size_t>(framebuffer.height) * GetBytesPerPixel(framebuffer);
}
vk::Format GetFormat(const Tegra::FramebufferConfig& framebuffer) {
VkFormat GetFormat(const Tegra::FramebufferConfig& framebuffer) {
switch (framebuffer.pixel_format) {
case Tegra::FramebufferConfig::PixelFormat::ABGR8:
return vk::Format::eA8B8G8R8UnormPack32;
return VK_FORMAT_A8B8G8R8_UNORM_PACK32;
case Tegra::FramebufferConfig::PixelFormat::RGB565:
return vk::Format::eR5G6B5UnormPack16;
return VK_FORMAT_R5G6B5_UNORM_PACK16;
default:
UNIMPLEMENTED_MSG("Unknown framebuffer pixel format: {}",
static_cast<u32>(framebuffer.pixel_format));
return vk::Format::eA8B8G8R8UnormPack32;
return VK_FORMAT_A8B8G8R8_UNORM_PACK32;
}
}
@@ -219,8 +228,8 @@ void VKBlitScreen::Recreate() {
CreateDynamicResources();
}
std::tuple<VKFence&, vk::Semaphore> VKBlitScreen::Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated) {
std::tuple<VKFence&, VkSemaphore> VKBlitScreen::Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated) {
RefreshResources(framebuffer);
// Finish any pending renderpass
@@ -255,46 +264,76 @@ std::tuple<VKFence&, vk::Semaphore> VKBlitScreen::Draw(const Tegra::FramebufferC
framebuffer.stride, block_height_log2, framebuffer.height, 0, 1, 1,
map.GetAddress() + image_offset, host_ptr);
blit_image->Transition(0, 1, 0, 1, vk::PipelineStageFlagBits::eTransfer,
vk::AccessFlagBits::eTransferWrite,
vk::ImageLayout::eTransferDstOptimal);
blit_image->Transition(0, 1, 0, 1, VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_ACCESS_TRANSFER_WRITE_BIT, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
const vk::BufferImageCopy copy(image_offset, 0, 0,
{vk::ImageAspectFlagBits::eColor, 0, 0, 1}, {0, 0, 0},
{framebuffer.width, framebuffer.height, 1});
scheduler.Record([buffer_handle = *buffer, image = blit_image->GetHandle(),
copy](auto cmdbuf, auto& dld) {
cmdbuf.copyBufferToImage(buffer_handle, image, vk::ImageLayout::eTransferDstOptimal,
{copy}, dld);
});
VkBufferImageCopy copy;
copy.bufferOffset = image_offset;
copy.bufferRowLength = 0;
copy.bufferImageHeight = 0;
copy.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
copy.imageSubresource.mipLevel = 0;
copy.imageSubresource.baseArrayLayer = 0;
copy.imageSubresource.layerCount = 1;
copy.imageOffset.x = 0;
copy.imageOffset.y = 0;
copy.imageOffset.z = 0;
copy.imageExtent.width = framebuffer.width;
copy.imageExtent.height = framebuffer.height;
copy.imageExtent.depth = 1;
scheduler.Record(
[buffer = *buffer, image = *blit_image->GetHandle(), copy](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBufferToImage(buffer, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, copy);
});
}
map.Release();
blit_image->Transition(0, 1, 0, 1, vk::PipelineStageFlagBits::eFragmentShader,
vk::AccessFlagBits::eShaderRead,
vk::ImageLayout::eShaderReadOnlyOptimal);
blit_image->Transition(0, 1, 0, 1, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
VK_ACCESS_SHADER_READ_BIT, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
scheduler.Record([renderpass = *renderpass, framebuffer = *framebuffers[image_index],
descriptor_set = descriptor_sets[image_index], buffer = *buffer,
size = swapchain.GetSize(), pipeline = *pipeline,
layout = *pipeline_layout](auto cmdbuf, auto& dld) {
const vk::ClearValue clear_color{std::array{0.0f, 0.0f, 0.0f, 1.0f}};
const vk::RenderPassBeginInfo renderpass_bi(renderpass, framebuffer, {{0, 0}, size}, 1,
&clear_color);
layout = *pipeline_layout](vk::CommandBuffer cmdbuf) {
VkClearValue clear_color;
clear_color.color.float32[0] = 0.0f;
clear_color.color.float32[1] = 0.0f;
clear_color.color.float32[2] = 0.0f;
clear_color.color.float32[3] = 0.0f;
cmdbuf.beginRenderPass(renderpass_bi, vk::SubpassContents::eInline, dld);
cmdbuf.bindPipeline(vk::PipelineBindPoint::eGraphics, pipeline, dld);
cmdbuf.setViewport(
0,
{{0.0f, 0.0f, static_cast<f32>(size.width), static_cast<f32>(size.height), 0.0f, 1.0f}},
dld);
cmdbuf.setScissor(0, {{{0, 0}, size}}, dld);
VkRenderPassBeginInfo renderpass_bi;
renderpass_bi.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
renderpass_bi.pNext = nullptr;
renderpass_bi.renderPass = renderpass;
renderpass_bi.framebuffer = framebuffer;
renderpass_bi.renderArea.offset.x = 0;
renderpass_bi.renderArea.offset.y = 0;
renderpass_bi.renderArea.extent = size;
renderpass_bi.clearValueCount = 1;
renderpass_bi.pClearValues = &clear_color;
cmdbuf.bindVertexBuffers(0, {buffer}, {offsetof(BufferData, vertices)}, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eGraphics, layout, 0, {descriptor_set}, {},
dld);
cmdbuf.draw(4, 1, 0, 0, dld);
cmdbuf.endRenderPass(dld);
VkViewport viewport;
viewport.x = 0.0f;
viewport.y = 0.0f;
viewport.width = static_cast<float>(size.width);
viewport.height = static_cast<float>(size.height);
viewport.minDepth = 0.0f;
viewport.maxDepth = 1.0f;
VkRect2D scissor;
scissor.offset.x = 0;
scissor.offset.y = 0;
scissor.extent = size;
cmdbuf.BeginRenderPass(renderpass_bi, VK_SUBPASS_CONTENTS_INLINE);
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
cmdbuf.SetViewport(0, viewport);
cmdbuf.SetScissor(0, scissor);
cmdbuf.BindVertexBuffer(0, buffer, offsetof(BufferData, vertices));
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_GRAPHICS, layout, 0, descriptor_set, {});
cmdbuf.Draw(4, 1, 0, 0);
cmdbuf.EndRenderPass();
});
return {scheduler.GetFence(), *semaphores[image_index]};
@@ -334,165 +373,297 @@ void VKBlitScreen::CreateShaders() {
}
void VKBlitScreen::CreateSemaphores() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
semaphores.resize(image_count);
for (std::size_t i = 0; i < image_count; ++i) {
semaphores[i] = dev.createSemaphoreUnique({}, nullptr, dld);
}
std::generate(semaphores.begin(), semaphores.end(),
[this] { return device.GetLogical().CreateSemaphore(); });
}
void VKBlitScreen::CreateDescriptorPool() {
const std::array<vk::DescriptorPoolSize, 2> pool_sizes{
vk::DescriptorPoolSize{vk::DescriptorType::eUniformBuffer, static_cast<u32>(image_count)},
vk::DescriptorPoolSize{vk::DescriptorType::eCombinedImageSampler,
static_cast<u32>(image_count)}};
const vk::DescriptorPoolCreateInfo pool_ci(
{}, static_cast<u32>(image_count), static_cast<u32>(pool_sizes.size()), pool_sizes.data());
const auto dev = device.GetLogical();
descriptor_pool = dev.createDescriptorPoolUnique(pool_ci, nullptr, device.GetDispatchLoader());
std::array<VkDescriptorPoolSize, 2> pool_sizes;
pool_sizes[0].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
pool_sizes[0].descriptorCount = static_cast<u32>(image_count);
pool_sizes[1].type = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
pool_sizes[1].descriptorCount = static_cast<u32>(image_count);
VkDescriptorPoolCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT;
ci.maxSets = static_cast<u32>(image_count);
ci.poolSizeCount = static_cast<u32>(pool_sizes.size());
ci.pPoolSizes = pool_sizes.data();
descriptor_pool = device.GetLogical().CreateDescriptorPool(ci);
}
void VKBlitScreen::CreateRenderPass() {
const vk::AttachmentDescription color_attachment(
{}, swapchain.GetImageFormat(), vk::SampleCountFlagBits::e1, vk::AttachmentLoadOp::eClear,
vk::AttachmentStoreOp::eStore, vk::AttachmentLoadOp::eDontCare,
vk::AttachmentStoreOp::eDontCare, vk::ImageLayout::eUndefined,
vk::ImageLayout::ePresentSrcKHR);
VkAttachmentDescription color_attachment;
color_attachment.flags = 0;
color_attachment.format = swapchain.GetImageFormat();
color_attachment.samples = VK_SAMPLE_COUNT_1_BIT;
color_attachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
color_attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
color_attachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
color_attachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
color_attachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
color_attachment.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
const vk::AttachmentReference color_attachment_ref(0, vk::ImageLayout::eColorAttachmentOptimal);
VkAttachmentReference color_attachment_ref;
color_attachment_ref.attachment = 0;
color_attachment_ref.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
const vk::SubpassDescription subpass_description({}, vk::PipelineBindPoint::eGraphics, 0,
nullptr, 1, &color_attachment_ref, nullptr,
nullptr, 0, nullptr);
VkSubpassDescription subpass_description;
subpass_description.flags = 0;
subpass_description.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass_description.inputAttachmentCount = 0;
subpass_description.pInputAttachments = nullptr;
subpass_description.colorAttachmentCount = 1;
subpass_description.pColorAttachments = &color_attachment_ref;
subpass_description.pResolveAttachments = nullptr;
subpass_description.pDepthStencilAttachment = nullptr;
subpass_description.preserveAttachmentCount = 0;
subpass_description.pPreserveAttachments = nullptr;
const vk::SubpassDependency dependency(
VK_SUBPASS_EXTERNAL, 0, vk::PipelineStageFlagBits::eColorAttachmentOutput,
vk::PipelineStageFlagBits::eColorAttachmentOutput, {},
vk::AccessFlagBits::eColorAttachmentRead | vk::AccessFlagBits::eColorAttachmentWrite, {});
VkSubpassDependency dependency;
dependency.srcSubpass = VK_SUBPASS_EXTERNAL;
dependency.dstSubpass = 0;
dependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dependency.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dependency.srcAccessMask = 0;
dependency.dstAccessMask =
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
dependency.dependencyFlags = 0;
const vk::RenderPassCreateInfo renderpass_ci({}, 1, &color_attachment, 1, &subpass_description,
1, &dependency);
VkRenderPassCreateInfo renderpass_ci;
renderpass_ci.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
renderpass_ci.pNext = nullptr;
renderpass_ci.flags = 0;
renderpass_ci.attachmentCount = 1;
renderpass_ci.pAttachments = &color_attachment;
renderpass_ci.subpassCount = 1;
renderpass_ci.pSubpasses = &subpass_description;
renderpass_ci.dependencyCount = 1;
renderpass_ci.pDependencies = &dependency;
const auto dev = device.GetLogical();
renderpass = dev.createRenderPassUnique(renderpass_ci, nullptr, device.GetDispatchLoader());
renderpass = device.GetLogical().CreateRenderPass(renderpass_ci);
}
void VKBlitScreen::CreateDescriptorSetLayout() {
const std::array<vk::DescriptorSetLayoutBinding, 2> layout_bindings{
vk::DescriptorSetLayoutBinding(0, vk::DescriptorType::eUniformBuffer, 1,
vk::ShaderStageFlagBits::eVertex, nullptr),
vk::DescriptorSetLayoutBinding(1, vk::DescriptorType::eCombinedImageSampler, 1,
vk::ShaderStageFlagBits::eFragment, nullptr)};
const vk::DescriptorSetLayoutCreateInfo descriptor_layout_ci(
{}, static_cast<u32>(layout_bindings.size()), layout_bindings.data());
std::array<VkDescriptorSetLayoutBinding, 2> layout_bindings;
layout_bindings[0].binding = 0;
layout_bindings[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
layout_bindings[0].descriptorCount = 1;
layout_bindings[0].stageFlags = VK_SHADER_STAGE_VERTEX_BIT;
layout_bindings[0].pImmutableSamplers = nullptr;
layout_bindings[1].binding = 1;
layout_bindings[1].descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
layout_bindings[1].descriptorCount = 1;
layout_bindings[1].stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;
layout_bindings[1].pImmutableSamplers = nullptr;
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
descriptor_set_layout = dev.createDescriptorSetLayoutUnique(descriptor_layout_ci, nullptr, dld);
VkDescriptorSetLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.bindingCount = static_cast<u32>(layout_bindings.size());
ci.pBindings = layout_bindings.data();
descriptor_set_layout = device.GetLogical().CreateDescriptorSetLayout(ci);
}
void VKBlitScreen::CreateDescriptorSets() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const std::vector layouts(image_count, *descriptor_set_layout);
descriptor_sets.resize(image_count);
for (std::size_t i = 0; i < image_count; ++i) {
const vk::DescriptorSetLayout layout = *descriptor_set_layout;
const vk::DescriptorSetAllocateInfo descriptor_set_ai(*descriptor_pool, 1, &layout);
const vk::Result result =
dev.allocateDescriptorSets(&descriptor_set_ai, &descriptor_sets[i], dld);
ASSERT(result == vk::Result::eSuccess);
}
VkDescriptorSetAllocateInfo ai;
ai.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
ai.pNext = nullptr;
ai.descriptorPool = *descriptor_pool;
ai.descriptorSetCount = static_cast<u32>(image_count);
ai.pSetLayouts = layouts.data();
descriptor_sets = descriptor_pool.Allocate(ai);
}
void VKBlitScreen::CreatePipelineLayout() {
const vk::PipelineLayoutCreateInfo pipeline_layout_ci({}, 1, &descriptor_set_layout.get(), 0,
nullptr);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
pipeline_layout = dev.createPipelineLayoutUnique(pipeline_layout_ci, nullptr, dld);
VkPipelineLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.setLayoutCount = 1;
ci.pSetLayouts = descriptor_set_layout.address();
ci.pushConstantRangeCount = 0;
ci.pPushConstantRanges = nullptr;
pipeline_layout = device.GetLogical().CreatePipelineLayout(ci);
}
void VKBlitScreen::CreateGraphicsPipeline() {
const std::array shader_stages = {
vk::PipelineShaderStageCreateInfo({}, vk::ShaderStageFlagBits::eVertex, *vertex_shader,
"main", nullptr),
vk::PipelineShaderStageCreateInfo({}, vk::ShaderStageFlagBits::eFragment, *fragment_shader,
"main", nullptr)};
std::array<VkPipelineShaderStageCreateInfo, 2> shader_stages;
shader_stages[0].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
shader_stages[0].pNext = nullptr;
shader_stages[0].flags = 0;
shader_stages[0].stage = VK_SHADER_STAGE_VERTEX_BIT;
shader_stages[0].module = *vertex_shader;
shader_stages[0].pName = "main";
shader_stages[0].pSpecializationInfo = nullptr;
shader_stages[1].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
shader_stages[1].pNext = nullptr;
shader_stages[1].flags = 0;
shader_stages[1].stage = VK_SHADER_STAGE_FRAGMENT_BIT;
shader_stages[1].module = *fragment_shader;
shader_stages[1].pName = "main";
shader_stages[1].pSpecializationInfo = nullptr;
const auto vertex_binding_description = ScreenRectVertex::GetDescription();
const auto vertex_attrs_description = ScreenRectVertex::GetAttributes();
const vk::PipelineVertexInputStateCreateInfo vertex_input(
{}, 1, &vertex_binding_description, static_cast<u32>(vertex_attrs_description.size()),
vertex_attrs_description.data());
const vk::PipelineInputAssemblyStateCreateInfo input_assembly(
{}, vk::PrimitiveTopology::eTriangleStrip, false);
VkPipelineVertexInputStateCreateInfo vertex_input_ci;
vertex_input_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;
vertex_input_ci.pNext = nullptr;
vertex_input_ci.flags = 0;
vertex_input_ci.vertexBindingDescriptionCount = 1;
vertex_input_ci.pVertexBindingDescriptions = &vertex_binding_description;
vertex_input_ci.vertexAttributeDescriptionCount = u32{vertex_attrs_description.size()};
vertex_input_ci.pVertexAttributeDescriptions = vertex_attrs_description.data();
// Set a dummy viewport, it's going to be replaced by dynamic states.
const vk::Viewport viewport(0.0f, 0.0f, 1.0f, 1.0f, 0.0f, 1.0f);
const vk::Rect2D scissor({0, 0}, {1, 1});
VkPipelineInputAssemblyStateCreateInfo input_assembly_ci;
input_assembly_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
input_assembly_ci.pNext = nullptr;
input_assembly_ci.flags = 0;
input_assembly_ci.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP;
input_assembly_ci.primitiveRestartEnable = VK_FALSE;
const vk::PipelineViewportStateCreateInfo viewport_state({}, 1, &viewport, 1, &scissor);
VkPipelineViewportStateCreateInfo viewport_state_ci;
viewport_state_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO;
viewport_state_ci.pNext = nullptr;
viewport_state_ci.flags = 0;
viewport_state_ci.viewportCount = 1;
viewport_state_ci.pViewports = nullptr;
viewport_state_ci.scissorCount = 1;
viewport_state_ci.pScissors = nullptr;
const vk::PipelineRasterizationStateCreateInfo rasterizer(
{}, false, false, vk::PolygonMode::eFill, vk::CullModeFlagBits::eNone,
vk::FrontFace::eClockwise, false, 0.0f, 0.0f, 0.0f, 1.0f);
VkPipelineRasterizationStateCreateInfo rasterization_ci;
rasterization_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterization_ci.pNext = nullptr;
rasterization_ci.flags = 0;
rasterization_ci.depthClampEnable = VK_FALSE;
rasterization_ci.rasterizerDiscardEnable = VK_FALSE;
rasterization_ci.polygonMode = VK_POLYGON_MODE_FILL;
rasterization_ci.cullMode = VK_CULL_MODE_NONE;
rasterization_ci.frontFace = VK_FRONT_FACE_CLOCKWISE;
rasterization_ci.depthBiasEnable = VK_FALSE;
rasterization_ci.depthBiasConstantFactor = 0.0f;
rasterization_ci.depthBiasClamp = 0.0f;
rasterization_ci.depthBiasSlopeFactor = 0.0f;
rasterization_ci.lineWidth = 1.0f;
const vk::PipelineMultisampleStateCreateInfo multisampling({}, vk::SampleCountFlagBits::e1,
false, 0.0f, nullptr, false, false);
VkPipelineMultisampleStateCreateInfo multisampling_ci;
multisampling_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
multisampling_ci.pNext = nullptr;
multisampling_ci.flags = 0;
multisampling_ci.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT;
multisampling_ci.sampleShadingEnable = VK_FALSE;
multisampling_ci.minSampleShading = 0.0f;
multisampling_ci.pSampleMask = nullptr;
multisampling_ci.alphaToCoverageEnable = VK_FALSE;
multisampling_ci.alphaToOneEnable = VK_FALSE;
const vk::PipelineColorBlendAttachmentState color_blend_attachment(
false, vk::BlendFactor::eZero, vk::BlendFactor::eZero, vk::BlendOp::eAdd,
vk::BlendFactor::eZero, vk::BlendFactor::eZero, vk::BlendOp::eAdd,
vk::ColorComponentFlagBits::eR | vk::ColorComponentFlagBits::eG |
vk::ColorComponentFlagBits::eB | vk::ColorComponentFlagBits::eA);
VkPipelineColorBlendAttachmentState color_blend_attachment;
color_blend_attachment.blendEnable = VK_FALSE;
color_blend_attachment.srcColorBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.dstColorBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.colorBlendOp = VK_BLEND_OP_ADD;
color_blend_attachment.srcAlphaBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.dstAlphaBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.alphaBlendOp = VK_BLEND_OP_ADD;
color_blend_attachment.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
const vk::PipelineColorBlendStateCreateInfo color_blending(
{}, false, vk::LogicOp::eCopy, 1, &color_blend_attachment, {0.0f, 0.0f, 0.0f, 0.0f});
VkPipelineColorBlendStateCreateInfo color_blend_ci;
color_blend_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
color_blend_ci.flags = 0;
color_blend_ci.pNext = nullptr;
color_blend_ci.logicOpEnable = VK_FALSE;
color_blend_ci.logicOp = VK_LOGIC_OP_COPY;
color_blend_ci.attachmentCount = 1;
color_blend_ci.pAttachments = &color_blend_attachment;
color_blend_ci.blendConstants[0] = 0.0f;
color_blend_ci.blendConstants[1] = 0.0f;
color_blend_ci.blendConstants[2] = 0.0f;
color_blend_ci.blendConstants[3] = 0.0f;
const std::array<vk::DynamicState, 2> dynamic_states = {vk::DynamicState::eViewport,
vk::DynamicState::eScissor};
static constexpr std::array dynamic_states = {VK_DYNAMIC_STATE_VIEWPORT,
VK_DYNAMIC_STATE_SCISSOR};
VkPipelineDynamicStateCreateInfo dynamic_state_ci;
dynamic_state_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO;
dynamic_state_ci.pNext = nullptr;
dynamic_state_ci.flags = 0;
dynamic_state_ci.dynamicStateCount = static_cast<u32>(dynamic_states.size());
dynamic_state_ci.pDynamicStates = dynamic_states.data();
const vk::PipelineDynamicStateCreateInfo dynamic_state(
{}, static_cast<u32>(dynamic_states.size()), dynamic_states.data());
VkGraphicsPipelineCreateInfo pipeline_ci;
pipeline_ci.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
pipeline_ci.pNext = nullptr;
pipeline_ci.flags = 0;
pipeline_ci.stageCount = static_cast<u32>(shader_stages.size());
pipeline_ci.pStages = shader_stages.data();
pipeline_ci.pVertexInputState = &vertex_input_ci;
pipeline_ci.pInputAssemblyState = &input_assembly_ci;
pipeline_ci.pTessellationState = nullptr;
pipeline_ci.pViewportState = &viewport_state_ci;
pipeline_ci.pRasterizationState = &rasterization_ci;
pipeline_ci.pMultisampleState = &multisampling_ci;
pipeline_ci.pDepthStencilState = nullptr;
pipeline_ci.pColorBlendState = &color_blend_ci;
pipeline_ci.pDynamicState = &dynamic_state_ci;
pipeline_ci.layout = *pipeline_layout;
pipeline_ci.renderPass = *renderpass;
pipeline_ci.subpass = 0;
pipeline_ci.basePipelineHandle = 0;
pipeline_ci.basePipelineIndex = 0;
const vk::GraphicsPipelineCreateInfo pipeline_ci(
{}, static_cast<u32>(shader_stages.size()), shader_stages.data(), &vertex_input,
&input_assembly, nullptr, &viewport_state, &rasterizer, &multisampling, nullptr,
&color_blending, &dynamic_state, *pipeline_layout, *renderpass, 0, nullptr, 0);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
pipeline = dev.createGraphicsPipelineUnique({}, pipeline_ci, nullptr, dld);
pipeline = device.GetLogical().CreateGraphicsPipeline(pipeline_ci);
}
void VKBlitScreen::CreateSampler() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const vk::SamplerCreateInfo sampler_ci(
{}, vk::Filter::eLinear, vk::Filter::eLinear, vk::SamplerMipmapMode::eLinear,
vk::SamplerAddressMode::eClampToBorder, vk::SamplerAddressMode::eClampToBorder,
vk::SamplerAddressMode::eClampToBorder, 0.0f, false, 0.0f, false, vk::CompareOp::eNever,
0.0f, 0.0f, vk::BorderColor::eFloatOpaqueBlack, false);
sampler = dev.createSamplerUnique(sampler_ci, nullptr, dld);
VkSamplerCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.magFilter = VK_FILTER_LINEAR;
ci.minFilter = VK_FILTER_NEAREST;
ci.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR;
ci.addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
ci.addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
ci.addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
ci.mipLodBias = 0.0f;
ci.anisotropyEnable = VK_FALSE;
ci.maxAnisotropy = 0.0f;
ci.compareEnable = VK_FALSE;
ci.compareOp = VK_COMPARE_OP_NEVER;
ci.minLod = 0.0f;
ci.maxLod = 0.0f;
ci.borderColor = VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK;
ci.unnormalizedCoordinates = VK_FALSE;
sampler = device.GetLogical().CreateSampler(ci);
}
void VKBlitScreen::CreateFramebuffers() {
const vk::Extent2D size{swapchain.GetSize()};
framebuffers.clear();
const VkExtent2D size{swapchain.GetSize()};
framebuffers.resize(image_count);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
VkFramebufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.renderPass = *renderpass;
ci.attachmentCount = 1;
ci.width = size.width;
ci.height = size.height;
ci.layers = 1;
for (std::size_t i = 0; i < image_count; ++i) {
const vk::ImageView image_view{swapchain.GetImageViewIndex(i)};
const vk::FramebufferCreateInfo framebuffer_ci({}, *renderpass, 1, &image_view, size.width,
size.height, 1);
framebuffers[i] = dev.createFramebufferUnique(framebuffer_ci, nullptr, dld);
const VkImageView image_view{swapchain.GetImageViewIndex(i)};
ci.pAttachments = &image_view;
framebuffers[i] = device.GetLogical().CreateFramebuffer(ci);
}
}
@@ -507,54 +678,86 @@ void VKBlitScreen::ReleaseRawImages() {
}
void VKBlitScreen::CreateStagingBuffer(const Tegra::FramebufferConfig& framebuffer) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
VkBufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.size = CalculateBufferSize(framebuffer);
ci.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT |
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
const vk::BufferCreateInfo buffer_ci({}, CalculateBufferSize(framebuffer),
vk::BufferUsageFlagBits::eTransferSrc |
vk::BufferUsageFlagBits::eVertexBuffer |
vk::BufferUsageFlagBits::eUniformBuffer,
vk::SharingMode::eExclusive, 0, nullptr);
buffer = dev.createBufferUnique(buffer_ci, nullptr, dld);
buffer_commit = memory_manager.Commit(*buffer, true);
buffer = device.GetLogical().CreateBuffer(ci);
buffer_commit = memory_manager.Commit(buffer, true);
}
void VKBlitScreen::CreateRawImages(const Tegra::FramebufferConfig& framebuffer) {
raw_images.resize(image_count);
raw_buffer_commits.resize(image_count);
const auto format = GetFormat(framebuffer);
for (std::size_t i = 0; i < image_count; ++i) {
const vk::ImageCreateInfo image_ci(
{}, vk::ImageType::e2D, format, {framebuffer.width, framebuffer.height, 1}, 1, 1,
vk::SampleCountFlagBits::e1, vk::ImageTiling::eOptimal,
vk::ImageUsageFlagBits::eTransferDst | vk::ImageUsageFlagBits::eSampled,
vk::SharingMode::eExclusive, 0, nullptr, vk::ImageLayout::eUndefined);
VkImageCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.imageType = VK_IMAGE_TYPE_2D;
ci.format = GetFormat(framebuffer);
ci.extent.width = framebuffer.width;
ci.extent.height = framebuffer.height;
ci.extent.depth = 1;
ci.mipLevels = 1;
ci.arrayLayers = 1;
ci.samples = VK_SAMPLE_COUNT_1_BIT;
ci.tiling = VK_IMAGE_TILING_LINEAR;
ci.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
ci.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
raw_images[i] =
std::make_unique<VKImage>(device, scheduler, image_ci, vk::ImageAspectFlagBits::eColor);
for (std::size_t i = 0; i < image_count; ++i) {
raw_images[i] = std::make_unique<VKImage>(device, scheduler, ci, VK_IMAGE_ASPECT_COLOR_BIT);
raw_buffer_commits[i] = memory_manager.Commit(raw_images[i]->GetHandle(), false);
}
}
void VKBlitScreen::UpdateDescriptorSet(std::size_t image_index, vk::ImageView image_view) const {
const vk::DescriptorSet descriptor_set = descriptor_sets[image_index];
void VKBlitScreen::UpdateDescriptorSet(std::size_t image_index, VkImageView image_view) const {
VkDescriptorBufferInfo buffer_info;
buffer_info.buffer = *buffer;
buffer_info.offset = offsetof(BufferData, uniform);
buffer_info.range = sizeof(BufferData::uniform);
const vk::DescriptorBufferInfo buffer_info(*buffer, offsetof(BufferData, uniform),
sizeof(BufferData::uniform));
const vk::WriteDescriptorSet ubo_write(descriptor_set, 0, 0, 1,
vk::DescriptorType::eUniformBuffer, nullptr,
&buffer_info, nullptr);
VkWriteDescriptorSet ubo_write;
ubo_write.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
ubo_write.pNext = nullptr;
ubo_write.dstSet = descriptor_sets[image_index];
ubo_write.dstBinding = 0;
ubo_write.dstArrayElement = 0;
ubo_write.descriptorCount = 1;
ubo_write.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
ubo_write.pImageInfo = nullptr;
ubo_write.pBufferInfo = &buffer_info;
ubo_write.pTexelBufferView = nullptr;
const vk::DescriptorImageInfo image_info(*sampler, image_view,
vk::ImageLayout::eShaderReadOnlyOptimal);
const vk::WriteDescriptorSet sampler_write(descriptor_set, 1, 0, 1,
vk::DescriptorType::eCombinedImageSampler,
&image_info, nullptr, nullptr);
VkDescriptorImageInfo image_info;
image_info.sampler = *sampler;
image_info.imageView = image_view;
image_info.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
dev.updateDescriptorSets({ubo_write, sampler_write}, {}, dld);
VkWriteDescriptorSet sampler_write;
sampler_write.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
sampler_write.pNext = nullptr;
sampler_write.dstSet = descriptor_sets[image_index];
sampler_write.dstBinding = 1;
sampler_write.dstArrayElement = 0;
sampler_write.descriptorCount = 1;
sampler_write.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
sampler_write.pImageInfo = &image_info;
sampler_write.pBufferInfo = nullptr;
sampler_write.pTexelBufferView = nullptr;
device.GetLogical().UpdateDescriptorSets(std::array{ubo_write, sampler_write}, {});
}
void VKBlitScreen::SetUniformData(BufferData& data,

View File

@@ -8,9 +8,9 @@
#include <memory>
#include <tuple>
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -49,8 +49,8 @@ public:
void Recreate();
std::tuple<VKFence&, vk::Semaphore> Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated);
std::tuple<VKFence&, VkSemaphore> Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated);
private:
struct BufferData;
@@ -74,7 +74,7 @@ private:
void CreateStagingBuffer(const Tegra::FramebufferConfig& framebuffer);
void CreateRawImages(const Tegra::FramebufferConfig& framebuffer);
void UpdateDescriptorSet(std::size_t image_index, vk::ImageView image_view) const;
void UpdateDescriptorSet(std::size_t image_index, VkImageView image_view) const;
void SetUniformData(BufferData& data, const Tegra::FramebufferConfig& framebuffer) const;
void SetVertexData(BufferData& data, const Tegra::FramebufferConfig& framebuffer) const;
@@ -93,23 +93,23 @@ private:
const std::size_t image_count;
const VKScreenInfo& screen_info;
UniqueShaderModule vertex_shader;
UniqueShaderModule fragment_shader;
UniqueDescriptorPool descriptor_pool;
UniqueDescriptorSetLayout descriptor_set_layout;
UniquePipelineLayout pipeline_layout;
UniquePipeline pipeline;
UniqueRenderPass renderpass;
std::vector<UniqueFramebuffer> framebuffers;
std::vector<vk::DescriptorSet> descriptor_sets;
UniqueSampler sampler;
vk::ShaderModule vertex_shader;
vk::ShaderModule fragment_shader;
vk::DescriptorPool descriptor_pool;
vk::DescriptorSetLayout descriptor_set_layout;
vk::PipelineLayout pipeline_layout;
vk::Pipeline pipeline;
vk::RenderPass renderpass;
std::vector<vk::Framebuffer> framebuffers;
vk::DescriptorSets descriptor_sets;
vk::Sampler sampler;
UniqueBuffer buffer;
vk::Buffer buffer;
VKMemoryCommit buffer_commit;
std::vector<std::unique_ptr<VKFenceWatch>> watches;
std::vector<UniqueSemaphore> semaphores;
std::vector<vk::Semaphore> semaphores;
std::vector<std::unique_ptr<VKImage>> raw_images;
std::vector<VKMemoryCommit> raw_buffer_commits;
u32 raw_width = 0;

View File

@@ -11,48 +11,50 @@
#include "common/assert.h"
#include "common/bit_util.h"
#include "core/core.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_buffer_cache.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_stream_buffer.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
namespace {
const auto BufferUsage =
vk::BufferUsageFlagBits::eVertexBuffer | vk::BufferUsageFlagBits::eIndexBuffer |
vk::BufferUsageFlagBits::eUniformBuffer | vk::BufferUsageFlagBits::eStorageBuffer;
constexpr VkBufferUsageFlags BUFFER_USAGE =
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_INDEX_BUFFER_BIT |
VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT;
const auto UploadPipelineStage =
vk::PipelineStageFlagBits::eTransfer | vk::PipelineStageFlagBits::eVertexInput |
vk::PipelineStageFlagBits::eVertexShader | vk::PipelineStageFlagBits::eFragmentShader |
vk::PipelineStageFlagBits::eComputeShader;
constexpr VkPipelineStageFlags UPLOAD_PIPELINE_STAGE =
VK_PIPELINE_STAGE_TRANSFER_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT |
VK_PIPELINE_STAGE_VERTEX_SHADER_BIT | VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT |
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT;
const auto UploadAccessBarriers =
vk::AccessFlagBits::eTransferRead | vk::AccessFlagBits::eShaderRead |
vk::AccessFlagBits::eUniformRead | vk::AccessFlagBits::eVertexAttributeRead |
vk::AccessFlagBits::eIndexRead;
constexpr VkAccessFlags UPLOAD_ACCESS_BARRIERS =
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_UNIFORM_READ_BIT |
VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_ACCESS_INDEX_READ_BIT;
auto CreateStreamBuffer(const VKDevice& device, VKScheduler& scheduler) {
return std::make_unique<VKStreamBuffer>(device, scheduler, BufferUsage);
std::unique_ptr<VKStreamBuffer> CreateStreamBuffer(const VKDevice& device, VKScheduler& scheduler) {
return std::make_unique<VKStreamBuffer>(device, scheduler, BUFFER_USAGE);
}
} // Anonymous namespace
CachedBufferBlock::CachedBufferBlock(const VKDevice& device, VKMemoryManager& memory_manager,
CacheAddr cache_addr, std::size_t size)
: VideoCommon::BufferBlock{cache_addr, size} {
const vk::BufferCreateInfo buffer_ci({}, static_cast<vk::DeviceSize>(size),
BufferUsage | vk::BufferUsageFlagBits::eTransferSrc |
vk::BufferUsageFlagBits::eTransferDst,
vk::SharingMode::eExclusive, 0, nullptr);
VAddr cpu_addr, std::size_t size)
: VideoCommon::BufferBlock{cpu_addr, size} {
VkBufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.size = static_cast<VkDeviceSize>(size);
ci.usage = BUFFER_USAGE | VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
const auto& dld{device.GetDispatchLoader()};
const auto dev{device.GetLogical()};
buffer.handle = dev.createBufferUnique(buffer_ci, nullptr, dld);
buffer.commit = memory_manager.Commit(*buffer.handle, false);
buffer.handle = device.GetLogical().CreateBuffer(ci);
buffer.commit = memory_manager.Commit(buffer.handle, false);
}
CachedBufferBlock::~CachedBufferBlock() = default;
@@ -60,30 +62,30 @@ CachedBufferBlock::~CachedBufferBlock() = default;
VKBufferCache::VKBufferCache(VideoCore::RasterizerInterface& rasterizer, Core::System& system,
const VKDevice& device, VKMemoryManager& memory_manager,
VKScheduler& scheduler, VKStagingBufferPool& staging_pool)
: VideoCommon::BufferCache<Buffer, vk::Buffer, VKStreamBuffer>{rasterizer, system,
CreateStreamBuffer(device,
scheduler)},
: VideoCommon::BufferCache<Buffer, VkBuffer, VKStreamBuffer>{rasterizer, system,
CreateStreamBuffer(device,
scheduler)},
device{device}, memory_manager{memory_manager}, scheduler{scheduler}, staging_pool{
staging_pool} {}
VKBufferCache::~VKBufferCache() = default;
Buffer VKBufferCache::CreateBlock(CacheAddr cache_addr, std::size_t size) {
return std::make_shared<CachedBufferBlock>(device, memory_manager, cache_addr, size);
Buffer VKBufferCache::CreateBlock(VAddr cpu_addr, std::size_t size) {
return std::make_shared<CachedBufferBlock>(device, memory_manager, cpu_addr, size);
}
const vk::Buffer* VKBufferCache::ToHandle(const Buffer& buffer) {
VkBuffer VKBufferCache::ToHandle(const Buffer& buffer) {
return buffer->GetHandle();
}
const vk::Buffer* VKBufferCache::GetEmptyBuffer(std::size_t size) {
VkBuffer VKBufferCache::GetEmptyBuffer(std::size_t size) {
size = std::max(size, std::size_t(4));
const auto& empty = staging_pool.GetUnusedBuffer(size, false);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([size, buffer = *empty.handle](vk::CommandBuffer cmdbuf, auto& dld) {
cmdbuf.fillBuffer(buffer, 0, size, 0, dld);
scheduler.Record([size, buffer = *empty.handle](vk::CommandBuffer cmdbuf) {
cmdbuf.FillBuffer(buffer, 0, size, 0);
});
return &*empty.handle;
return *empty.handle;
}
void VKBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
@@ -92,15 +94,22 @@ void VKBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, st
std::memcpy(staging.commit->Map(size), data, size);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([staging = *staging.handle, buffer = *buffer->GetHandle(), offset,
size](auto cmdbuf, auto& dld) {
cmdbuf.copyBuffer(staging, buffer, {{0, offset, size}}, dld);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTransfer, UploadPipelineStage, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferWrite, UploadAccessBarriers,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, buffer,
offset, size)},
{}, dld);
scheduler.Record([staging = *staging.handle, buffer = buffer->GetHandle(), offset,
size](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBuffer(staging, buffer, VkBufferCopy{0, offset, size});
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstAccessMask = UPLOAD_ACCESS_BARRIERS;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = offset;
barrier.size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFER_BIT, UPLOAD_PIPELINE_STAGE, 0, {},
barrier, {});
});
}
@@ -108,17 +117,24 @@ void VKBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset,
u8* data) {
const auto& staging = staging_pool.GetUnusedBuffer(size, true);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([staging = *staging.handle, buffer = *buffer->GetHandle(), offset,
size](auto cmdbuf, auto& dld) {
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eVertexShader | vk::PipelineStageFlagBits::eFragmentShader |
vk::PipelineStageFlagBits::eComputeShader,
vk::PipelineStageFlagBits::eTransfer, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eShaderWrite,
vk::AccessFlagBits::eTransferRead, VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED, buffer, offset, size)},
{}, dld);
cmdbuf.copyBuffer(buffer, staging, {{offset, 0, size}}, dld);
scheduler.Record([staging = *staging.handle, buffer = buffer->GetHandle(), offset,
size](vk::CommandBuffer cmdbuf) {
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = offset;
barrier.size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_VERTEX_SHADER_BIT |
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT |
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_TRANSFER_BIT, 0, {}, barrier, {});
cmdbuf.CopyBuffer(buffer, staging, VkBufferCopy{offset, 0, size});
});
scheduler.Finish();
@@ -128,18 +144,31 @@ void VKBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset,
void VKBufferCache::CopyBlock(const Buffer& src, const Buffer& dst, std::size_t src_offset,
std::size_t dst_offset, std::size_t size) {
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([src_buffer = *src->GetHandle(), dst_buffer = *dst->GetHandle(), src_offset,
dst_offset, size](auto cmdbuf, auto& dld) {
cmdbuf.copyBuffer(src_buffer, dst_buffer, {{src_offset, dst_offset, size}}, dld);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTransfer, UploadPipelineStage, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferRead,
vk::AccessFlagBits::eShaderWrite, VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED, src_buffer, src_offset, size),
vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferWrite, UploadAccessBarriers,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, dst_buffer,
dst_offset, size)},
{}, dld);
scheduler.Record([src_buffer = src->GetHandle(), dst_buffer = dst->GetHandle(), src_offset,
dst_offset, size](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBuffer(src_buffer, dst_buffer, VkBufferCopy{src_offset, dst_offset, size});
std::array<VkBufferMemoryBarrier, 2> barriers;
barriers[0].sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barriers[0].pNext = nullptr;
barriers[0].srcAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
barriers[0].dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barriers[0].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[0].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[0].buffer = src_buffer;
barriers[0].offset = src_offset;
barriers[0].size = size;
barriers[1].sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barriers[1].pNext = nullptr;
barriers[1].srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
barriers[1].dstAccessMask = UPLOAD_ACCESS_BARRIERS;
barriers[1].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[1].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[1].buffer = dst_buffer;
barriers[1].offset = dst_offset;
barriers[1].size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFER_BIT, UPLOAD_PIPELINE_STAGE, 0, {},
barriers, {});
});
}

View File

@@ -11,11 +11,11 @@
#include "common/common_types.h"
#include "video_core/buffer_cache/buffer_cache.h"
#include "video_core/rasterizer_cache.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_stream_buffer.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -30,11 +30,11 @@ class VKScheduler;
class CachedBufferBlock final : public VideoCommon::BufferBlock {
public:
explicit CachedBufferBlock(const VKDevice& device, VKMemoryManager& memory_manager,
CacheAddr cache_addr, std::size_t size);
VAddr cpu_addr, std::size_t size);
~CachedBufferBlock();
const vk::Buffer* GetHandle() const {
return &*buffer.handle;
VkBuffer GetHandle() const {
return *buffer.handle;
}
private:
@@ -43,21 +43,21 @@ private:
using Buffer = std::shared_ptr<CachedBufferBlock>;
class VKBufferCache final : public VideoCommon::BufferCache<Buffer, vk::Buffer, VKStreamBuffer> {
class VKBufferCache final : public VideoCommon::BufferCache<Buffer, VkBuffer, VKStreamBuffer> {
public:
explicit VKBufferCache(VideoCore::RasterizerInterface& rasterizer, Core::System& system,
const VKDevice& device, VKMemoryManager& memory_manager,
VKScheduler& scheduler, VKStagingBufferPool& staging_pool);
~VKBufferCache();
const vk::Buffer* GetEmptyBuffer(std::size_t size) override;
VkBuffer GetEmptyBuffer(std::size_t size) override;
protected:
VkBuffer ToHandle(const Buffer& buffer) override;
void WriteBarrier() override {}
Buffer CreateBlock(CacheAddr cache_addr, std::size_t size) override;
const vk::Buffer* ToHandle(const Buffer& buffer) override;
Buffer CreateBlock(VAddr cpu_addr, std::size_t size) override;
void UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) override;

View File

@@ -10,13 +10,13 @@
#include "common/alignment.h"
#include "common/assert.h"
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_compute_pass.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -114,6 +114,35 @@ constexpr u8 quad_array[] = {
0xf9, 0x00, 0x02, 0x00, 0x4c, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x4b, 0x00, 0x00, 0x00,
0xfd, 0x00, 0x01, 0x00, 0x38, 0x00, 0x01, 0x00};
VkDescriptorSetLayoutBinding BuildQuadArrayPassDescriptorSetLayoutBinding() {
VkDescriptorSetLayoutBinding binding;
binding.binding = 0;
binding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
binding.descriptorCount = 1;
binding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
binding.pImmutableSamplers = nullptr;
return binding;
}
VkDescriptorUpdateTemplateEntryKHR BuildQuadArrayPassDescriptorUpdateTemplateEntry() {
VkDescriptorUpdateTemplateEntryKHR entry;
entry.dstBinding = 0;
entry.dstArrayElement = 0;
entry.descriptorCount = 1;
entry.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
entry.offset = 0;
entry.stride = sizeof(DescriptorUpdateEntry);
return entry;
}
VkPushConstantRange BuildComputePushConstantRange(std::size_t size) {
VkPushConstantRange range;
range.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
range.offset = 0;
range.size = static_cast<u32>(size);
return range;
}
// Uint8 SPIR-V module. Generated from the "shaders/" directory.
constexpr u8 uint8_pass[] = {
0x03, 0x02, 0x23, 0x07, 0x00, 0x00, 0x01, 0x00, 0x07, 0x00, 0x08, 0x00, 0x2f, 0x00, 0x00, 0x00,
@@ -191,53 +220,234 @@ constexpr u8 uint8_pass[] = {
0xf9, 0x00, 0x02, 0x00, 0x1d, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x1d, 0x00, 0x00, 0x00,
0xfd, 0x00, 0x01, 0x00, 0x38, 0x00, 0x01, 0x00};
// Quad indexed SPIR-V module. Generated from the "shaders/" directory.
constexpr u8 QUAD_INDEXED_SPV[] = {
0x03, 0x02, 0x23, 0x07, 0x00, 0x00, 0x01, 0x00, 0x07, 0x00, 0x08, 0x00, 0x7c, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x11, 0x00, 0x02, 0x00, 0x01, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x06, 0x00,
0x01, 0x00, 0x00, 0x00, 0x47, 0x4c, 0x53, 0x4c, 0x2e, 0x73, 0x74, 0x64, 0x2e, 0x34, 0x35, 0x30,
0x00, 0x00, 0x00, 0x00, 0x0e, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
0x0f, 0x00, 0x06, 0x00, 0x05, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x6d, 0x61, 0x69, 0x6e,
0x00, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x10, 0x00, 0x06, 0x00, 0x04, 0x00, 0x00, 0x00,
0x11, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
0x47, 0x00, 0x04, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00,
0x47, 0x00, 0x04, 0x00, 0x15, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00,
0x48, 0x00, 0x04, 0x00, 0x16, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x19, 0x00, 0x00, 0x00,
0x48, 0x00, 0x05, 0x00, 0x16, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x23, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x47, 0x00, 0x03, 0x00, 0x16, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
0x47, 0x00, 0x04, 0x00, 0x18, 0x00, 0x00, 0x00, 0x22, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x47, 0x00, 0x04, 0x00, 0x18, 0x00, 0x00, 0x00, 0x21, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
0x48, 0x00, 0x05, 0x00, 0x22, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x23, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x48, 0x00, 0x05, 0x00, 0x22, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
0x23, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x47, 0x00, 0x03, 0x00, 0x22, 0x00, 0x00, 0x00,
0x02, 0x00, 0x00, 0x00, 0x47, 0x00, 0x04, 0x00, 0x56, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00,
0x04, 0x00, 0x00, 0x00, 0x48, 0x00, 0x04, 0x00, 0x57, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x18, 0x00, 0x00, 0x00, 0x48, 0x00, 0x05, 0x00, 0x57, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x23, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x47, 0x00, 0x03, 0x00, 0x57, 0x00, 0x00, 0x00,
0x03, 0x00, 0x00, 0x00, 0x47, 0x00, 0x04, 0x00, 0x59, 0x00, 0x00, 0x00, 0x22, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x47, 0x00, 0x04, 0x00, 0x59, 0x00, 0x00, 0x00, 0x21, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x47, 0x00, 0x04, 0x00, 0x72, 0x00, 0x00, 0x00, 0x0b, 0x00, 0x00, 0x00,
0x19, 0x00, 0x00, 0x00, 0x13, 0x00, 0x02, 0x00, 0x02, 0x00, 0x00, 0x00, 0x21, 0x00, 0x03, 0x00,
0x03, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x15, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x20, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x07, 0x00, 0x00, 0x00,
0x07, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x15, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00,
0x20, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x17, 0x00, 0x04, 0x00, 0x0a, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x0b, 0x00, 0x00, 0x00,
0x01, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x00, 0x00, 0x3b, 0x00, 0x04, 0x00, 0x0b, 0x00, 0x00, 0x00,
0x0c, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00,
0x0d, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x0e, 0x00, 0x00, 0x00,
0x01, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x13, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x03, 0x00, 0x15, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x1e, 0x00, 0x03, 0x00, 0x16, 0x00, 0x00, 0x00, 0x15, 0x00, 0x00, 0x00,
0x20, 0x00, 0x04, 0x00, 0x17, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x16, 0x00, 0x00, 0x00,
0x3b, 0x00, 0x04, 0x00, 0x17, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
0x14, 0x00, 0x02, 0x00, 0x1b, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x21, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x1e, 0x00, 0x04, 0x00, 0x22, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x23, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x22, 0x00, 0x00, 0x00, 0x3b, 0x00, 0x04, 0x00, 0x23, 0x00, 0x00, 0x00,
0x24, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x25, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x26, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x2b, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00,
0x3b, 0x00, 0x00, 0x00, 0x06, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x3f, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x04, 0x00, 0x41, 0x00, 0x00, 0x00,
0x06, 0x00, 0x00, 0x00, 0x3b, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x42, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00,
0x43, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00, 0x2c, 0x00, 0x09, 0x00, 0x41, 0x00, 0x00, 0x00,
0x44, 0x00, 0x00, 0x00, 0x42, 0x00, 0x00, 0x00, 0x25, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x00, 0x00,
0x42, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x00, 0x00, 0x43, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00,
0x46, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 0x41, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x03, 0x00,
0x56, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x1e, 0x00, 0x03, 0x00, 0x57, 0x00, 0x00, 0x00,
0x56, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x58, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
0x57, 0x00, 0x00, 0x00, 0x3b, 0x00, 0x04, 0x00, 0x58, 0x00, 0x00, 0x00, 0x59, 0x00, 0x00, 0x00,
0x02, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x5b, 0x00, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x20, 0x00, 0x04, 0x00, 0x69, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00,
0x09, 0x00, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00, 0x70, 0x00, 0x00, 0x00,
0x00, 0x04, 0x00, 0x00, 0x2b, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00, 0x71, 0x00, 0x00, 0x00,
0x01, 0x00, 0x00, 0x00, 0x2c, 0x00, 0x06, 0x00, 0x0a, 0x00, 0x00, 0x00, 0x72, 0x00, 0x00, 0x00,
0x70, 0x00, 0x00, 0x00, 0x71, 0x00, 0x00, 0x00, 0x71, 0x00, 0x00, 0x00, 0x36, 0x00, 0x05, 0x00,
0x02, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0x00, 0x00, 0x00,
0xf8, 0x00, 0x02, 0x00, 0x05, 0x00, 0x00, 0x00, 0x3b, 0x00, 0x04, 0x00, 0x46, 0x00, 0x00, 0x00,
0x47, 0x00, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00, 0xf9, 0x00, 0x02, 0x00, 0x74, 0x00, 0x00, 0x00,
0xf8, 0x00, 0x02, 0x00, 0x74, 0x00, 0x00, 0x00, 0xf6, 0x00, 0x04, 0x00, 0x73, 0x00, 0x00, 0x00,
0x76, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xf9, 0x00, 0x02, 0x00, 0x75, 0x00, 0x00, 0x00,
0xf8, 0x00, 0x02, 0x00, 0x75, 0x00, 0x00, 0x00, 0x41, 0x00, 0x05, 0x00, 0x0e, 0x00, 0x00, 0x00,
0x0f, 0x00, 0x00, 0x00, 0x0c, 0x00, 0x00, 0x00, 0x0d, 0x00, 0x00, 0x00, 0x3d, 0x00, 0x04, 0x00,
0x09, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x0f, 0x00, 0x00, 0x00, 0x7c, 0x00, 0x04, 0x00,
0x06, 0x00, 0x00, 0x00, 0x11, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00, 0x84, 0x00, 0x05, 0x00,
0x06, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x11, 0x00, 0x00, 0x00, 0x13, 0x00, 0x00, 0x00,
0x44, 0x00, 0x05, 0x00, 0x09, 0x00, 0x00, 0x00, 0x19, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x7c, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00, 0x1a, 0x00, 0x00, 0x00,
0x19, 0x00, 0x00, 0x00, 0xaf, 0x00, 0x05, 0x00, 0x1b, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00,
0x14, 0x00, 0x00, 0x00, 0x1a, 0x00, 0x00, 0x00, 0xf7, 0x00, 0x03, 0x00, 0x1e, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0xfa, 0x00, 0x04, 0x00, 0x1c, 0x00, 0x00, 0x00, 0x1d, 0x00, 0x00, 0x00,
0x1e, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x1d, 0x00, 0x00, 0x00, 0xf9, 0x00, 0x02, 0x00,
0x73, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x1e, 0x00, 0x00, 0x00, 0x41, 0x00, 0x05, 0x00,
0x26, 0x00, 0x00, 0x00, 0x27, 0x00, 0x00, 0x00, 0x24, 0x00, 0x00, 0x00, 0x25, 0x00, 0x00, 0x00,
0x3d, 0x00, 0x04, 0x00, 0x06, 0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00, 0x27, 0x00, 0x00, 0x00,
0xc4, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00, 0x29, 0x00, 0x00, 0x00, 0x21, 0x00, 0x00, 0x00,
0x28, 0x00, 0x00, 0x00, 0x82, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00, 0x2e, 0x00, 0x00, 0x00,
0x2b, 0x00, 0x00, 0x00, 0x28, 0x00, 0x00, 0x00, 0xc4, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00,
0x31, 0x00, 0x00, 0x00, 0x25, 0x00, 0x00, 0x00, 0x2e, 0x00, 0x00, 0x00, 0x82, 0x00, 0x05, 0x00,
0x06, 0x00, 0x00, 0x00, 0x32, 0x00, 0x00, 0x00, 0x31, 0x00, 0x00, 0x00, 0x25, 0x00, 0x00, 0x00,
0xf9, 0x00, 0x02, 0x00, 0x35, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x35, 0x00, 0x00, 0x00,
0xf5, 0x00, 0x07, 0x00, 0x09, 0x00, 0x00, 0x00, 0x7b, 0x00, 0x00, 0x00, 0x0d, 0x00, 0x00, 0x00,
0x1e, 0x00, 0x00, 0x00, 0x6f, 0x00, 0x00, 0x00, 0x36, 0x00, 0x00, 0x00, 0xb0, 0x00, 0x05, 0x00,
0x1b, 0x00, 0x00, 0x00, 0x3c, 0x00, 0x00, 0x00, 0x7b, 0x00, 0x00, 0x00, 0x3b, 0x00, 0x00, 0x00,
0xf6, 0x00, 0x04, 0x00, 0x37, 0x00, 0x00, 0x00, 0x36, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0xfa, 0x00, 0x04, 0x00, 0x3c, 0x00, 0x00, 0x00, 0x36, 0x00, 0x00, 0x00, 0x37, 0x00, 0x00, 0x00,
0xf8, 0x00, 0x02, 0x00, 0x36, 0x00, 0x00, 0x00, 0x84, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00,
0x40, 0x00, 0x00, 0x00, 0x11, 0x00, 0x00, 0x00, 0x3f, 0x00, 0x00, 0x00, 0x3e, 0x00, 0x03, 0x00,
0x47, 0x00, 0x00, 0x00, 0x44, 0x00, 0x00, 0x00, 0x41, 0x00, 0x05, 0x00, 0x07, 0x00, 0x00, 0x00,
0x48, 0x00, 0x00, 0x00, 0x47, 0x00, 0x00, 0x00, 0x7b, 0x00, 0x00, 0x00, 0x3d, 0x00, 0x04, 0x00,
0x06, 0x00, 0x00, 0x00, 0x49, 0x00, 0x00, 0x00, 0x48, 0x00, 0x00, 0x00, 0x80, 0x00, 0x05, 0x00,
0x06, 0x00, 0x00, 0x00, 0x4a, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x49, 0x00, 0x00, 0x00,
0xc3, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00, 0x4e, 0x00, 0x00, 0x00, 0x4a, 0x00, 0x00, 0x00,
0x2e, 0x00, 0x00, 0x00, 0xc7, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00, 0x52, 0x00, 0x00, 0x00,
0x4a, 0x00, 0x00, 0x00, 0x32, 0x00, 0x00, 0x00, 0x84, 0x00, 0x05, 0x00, 0x06, 0x00, 0x00, 0x00,
0x54, 0x00, 0x00, 0x00, 0x52, 0x00, 0x00, 0x00, 0x29, 0x00, 0x00, 0x00, 0x41, 0x00, 0x06, 0x00,
0x5b, 0x00, 0x00, 0x00, 0x5c, 0x00, 0x00, 0x00, 0x59, 0x00, 0x00, 0x00, 0x42, 0x00, 0x00, 0x00,
0x4e, 0x00, 0x00, 0x00, 0x3d, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00, 0x5d, 0x00, 0x00, 0x00,
0x5c, 0x00, 0x00, 0x00, 0xcb, 0x00, 0x06, 0x00, 0x09, 0x00, 0x00, 0x00, 0x62, 0x00, 0x00, 0x00,
0x5d, 0x00, 0x00, 0x00, 0x54, 0x00, 0x00, 0x00, 0x29, 0x00, 0x00, 0x00, 0x7c, 0x00, 0x04, 0x00,
0x09, 0x00, 0x00, 0x00, 0x65, 0x00, 0x00, 0x00, 0x14, 0x00, 0x00, 0x00, 0x80, 0x00, 0x05, 0x00,
0x09, 0x00, 0x00, 0x00, 0x67, 0x00, 0x00, 0x00, 0x65, 0x00, 0x00, 0x00, 0x7b, 0x00, 0x00, 0x00,
0x41, 0x00, 0x05, 0x00, 0x69, 0x00, 0x00, 0x00, 0x6a, 0x00, 0x00, 0x00, 0x24, 0x00, 0x00, 0x00,
0x42, 0x00, 0x00, 0x00, 0x3d, 0x00, 0x04, 0x00, 0x09, 0x00, 0x00, 0x00, 0x6b, 0x00, 0x00, 0x00,
0x6a, 0x00, 0x00, 0x00, 0x80, 0x00, 0x05, 0x00, 0x09, 0x00, 0x00, 0x00, 0x6c, 0x00, 0x00, 0x00,
0x62, 0x00, 0x00, 0x00, 0x6b, 0x00, 0x00, 0x00, 0x41, 0x00, 0x06, 0x00, 0x5b, 0x00, 0x00, 0x00,
0x6d, 0x00, 0x00, 0x00, 0x18, 0x00, 0x00, 0x00, 0x42, 0x00, 0x00, 0x00, 0x67, 0x00, 0x00, 0x00,
0x3e, 0x00, 0x03, 0x00, 0x6d, 0x00, 0x00, 0x00, 0x6c, 0x00, 0x00, 0x00, 0x80, 0x00, 0x05, 0x00,
0x09, 0x00, 0x00, 0x00, 0x6f, 0x00, 0x00, 0x00, 0x7b, 0x00, 0x00, 0x00, 0x25, 0x00, 0x00, 0x00,
0xf9, 0x00, 0x02, 0x00, 0x35, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x37, 0x00, 0x00, 0x00,
0xf9, 0x00, 0x02, 0x00, 0x73, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x76, 0x00, 0x00, 0x00,
0xf9, 0x00, 0x02, 0x00, 0x74, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x73, 0x00, 0x00, 0x00,
0xfd, 0x00, 0x01, 0x00, 0x38, 0x00, 0x01, 0x00};
std::array<VkDescriptorSetLayoutBinding, 2> BuildInputOutputDescriptorSetBindings() {
std::array<VkDescriptorSetLayoutBinding, 2> bindings;
bindings[0].binding = 0;
bindings[0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
bindings[0].descriptorCount = 1;
bindings[0].stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
bindings[0].pImmutableSamplers = nullptr;
bindings[1].binding = 1;
bindings[1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
bindings[1].descriptorCount = 1;
bindings[1].stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
bindings[1].pImmutableSamplers = nullptr;
return bindings;
}
VkDescriptorUpdateTemplateEntryKHR BuildInputOutputDescriptorUpdateTemplate() {
VkDescriptorUpdateTemplateEntryKHR entry;
entry.dstBinding = 0;
entry.dstArrayElement = 0;
entry.descriptorCount = 2;
entry.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
entry.offset = 0;
entry.stride = sizeof(DescriptorUpdateEntry);
return entry;
}
} // Anonymous namespace
VKComputePass::VKComputePass(const VKDevice& device, VKDescriptorPool& descriptor_pool,
const std::vector<vk::DescriptorSetLayoutBinding>& bindings,
const std::vector<vk::DescriptorUpdateTemplateEntry>& templates,
const std::vector<vk::PushConstantRange> push_constants,
std::size_t code_size, const u8* code) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
vk::Span<VkDescriptorSetLayoutBinding> bindings,
vk::Span<VkDescriptorUpdateTemplateEntryKHR> templates,
vk::Span<VkPushConstantRange> push_constants, std::size_t code_size,
const u8* code) {
VkDescriptorSetLayoutCreateInfo descriptor_layout_ci;
descriptor_layout_ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
descriptor_layout_ci.pNext = nullptr;
descriptor_layout_ci.flags = 0;
descriptor_layout_ci.bindingCount = bindings.size();
descriptor_layout_ci.pBindings = bindings.data();
descriptor_set_layout = device.GetLogical().CreateDescriptorSetLayout(descriptor_layout_ci);
const vk::DescriptorSetLayoutCreateInfo descriptor_layout_ci(
{}, static_cast<u32>(bindings.size()), bindings.data());
descriptor_set_layout = dev.createDescriptorSetLayoutUnique(descriptor_layout_ci, nullptr, dld);
const vk::PipelineLayoutCreateInfo pipeline_layout_ci({}, 1, &*descriptor_set_layout,
static_cast<u32>(push_constants.size()),
push_constants.data());
layout = dev.createPipelineLayoutUnique(pipeline_layout_ci, nullptr, dld);
VkPipelineLayoutCreateInfo pipeline_layout_ci;
pipeline_layout_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
pipeline_layout_ci.pNext = nullptr;
pipeline_layout_ci.flags = 0;
pipeline_layout_ci.setLayoutCount = 1;
pipeline_layout_ci.pSetLayouts = descriptor_set_layout.address();
pipeline_layout_ci.pushConstantRangeCount = push_constants.size();
pipeline_layout_ci.pPushConstantRanges = push_constants.data();
layout = device.GetLogical().CreatePipelineLayout(pipeline_layout_ci);
if (!templates.empty()) {
const vk::DescriptorUpdateTemplateCreateInfo template_ci(
{}, static_cast<u32>(templates.size()), templates.data(),
vk::DescriptorUpdateTemplateType::eDescriptorSet, *descriptor_set_layout,
vk::PipelineBindPoint::eGraphics, *layout, 0);
descriptor_template = dev.createDescriptorUpdateTemplateUnique(template_ci, nullptr, dld);
VkDescriptorUpdateTemplateCreateInfoKHR template_ci;
template_ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR;
template_ci.pNext = nullptr;
template_ci.flags = 0;
template_ci.descriptorUpdateEntryCount = templates.size();
template_ci.pDescriptorUpdateEntries = templates.data();
template_ci.templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR;
template_ci.descriptorSetLayout = *descriptor_set_layout;
template_ci.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
template_ci.pipelineLayout = *layout;
template_ci.set = 0;
descriptor_template = device.GetLogical().CreateDescriptorUpdateTemplateKHR(template_ci);
descriptor_allocator.emplace(descriptor_pool, *descriptor_set_layout);
}
auto code_copy = std::make_unique<u32[]>(code_size / sizeof(u32) + 1);
std::memcpy(code_copy.get(), code, code_size);
const vk::ShaderModuleCreateInfo module_ci({}, code_size, code_copy.get());
module = dev.createShaderModuleUnique(module_ci, nullptr, dld);
const vk::PipelineShaderStageCreateInfo stage_ci({}, vk::ShaderStageFlagBits::eCompute, *module,
"main", nullptr);
VkShaderModuleCreateInfo module_ci;
module_ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
module_ci.pNext = nullptr;
module_ci.flags = 0;
module_ci.codeSize = code_size;
module_ci.pCode = code_copy.get();
module = device.GetLogical().CreateShaderModule(module_ci);
const vk::ComputePipelineCreateInfo pipeline_ci({}, stage_ci, *layout, nullptr, 0);
pipeline = dev.createComputePipelineUnique(nullptr, pipeline_ci, nullptr, dld);
VkComputePipelineCreateInfo pipeline_ci;
pipeline_ci.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO;
pipeline_ci.pNext = nullptr;
pipeline_ci.flags = 0;
pipeline_ci.layout = *layout;
pipeline_ci.basePipelineHandle = nullptr;
pipeline_ci.basePipelineIndex = 0;
VkPipelineShaderStageCreateInfo& stage_ci = pipeline_ci.stage;
stage_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stage_ci.pNext = nullptr;
stage_ci.flags = 0;
stage_ci.stage = VK_SHADER_STAGE_COMPUTE_BIT;
stage_ci.module = *module;
stage_ci.pName = "main";
stage_ci.pSpecializationInfo = nullptr;
pipeline = device.GetLogical().CreateComputePipeline(pipeline_ci);
}
VKComputePass::~VKComputePass() = default;
vk::DescriptorSet VKComputePass::CommitDescriptorSet(
VKUpdateDescriptorQueue& update_descriptor_queue, VKFence& fence) {
VkDescriptorSet VKComputePass::CommitDescriptorSet(VKUpdateDescriptorQueue& update_descriptor_queue,
VKFence& fence) {
if (!descriptor_template) {
return {};
return nullptr;
}
const auto set = descriptor_allocator->Commit(fence);
update_descriptor_queue.Send(*descriptor_template, set);
@@ -248,25 +458,21 @@ QuadArrayPass::QuadArrayPass(const VKDevice& device, VKScheduler& scheduler,
VKDescriptorPool& descriptor_pool,
VKStagingBufferPool& staging_buffer_pool,
VKUpdateDescriptorQueue& update_descriptor_queue)
: VKComputePass(device, descriptor_pool,
{vk::DescriptorSetLayoutBinding(0, vk::DescriptorType::eStorageBuffer, 1,
vk::ShaderStageFlagBits::eCompute, nullptr)},
{vk::DescriptorUpdateTemplateEntry(0, 0, 1, vk::DescriptorType::eStorageBuffer,
0, sizeof(DescriptorUpdateEntry))},
{vk::PushConstantRange(vk::ShaderStageFlagBits::eCompute, 0, sizeof(u32))},
std::size(quad_array), quad_array),
: VKComputePass(device, descriptor_pool, BuildQuadArrayPassDescriptorSetLayoutBinding(),
BuildQuadArrayPassDescriptorUpdateTemplateEntry(),
BuildComputePushConstantRange(sizeof(u32)), std::size(quad_array), quad_array),
scheduler{scheduler}, staging_buffer_pool{staging_buffer_pool},
update_descriptor_queue{update_descriptor_queue} {}
QuadArrayPass::~QuadArrayPass() = default;
std::pair<const vk::Buffer&, vk::DeviceSize> QuadArrayPass::Assemble(u32 num_vertices, u32 first) {
const u32 num_triangle_vertices = num_vertices * 6 / 4;
std::pair<VkBuffer, VkDeviceSize> QuadArrayPass::Assemble(u32 num_vertices, u32 first) {
const u32 num_triangle_vertices = (num_vertices / 4) * 6;
const std::size_t staging_size = num_triangle_vertices * sizeof(u32);
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(&*buffer.handle, 0, staging_size);
update_descriptor_queue.AddBuffer(*buffer.handle, 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
@@ -274,20 +480,25 @@ std::pair<const vk::Buffer&, vk::DeviceSize> QuadArrayPass::Assemble(u32 num_ver
ASSERT(num_vertices % 4 == 0);
const u32 num_quads = num_vertices / 4;
scheduler.Record([layout = *layout, pipeline = *pipeline, buffer = *buffer.handle, num_quads,
first, set](auto cmdbuf, auto& dld) {
first, set](vk::CommandBuffer cmdbuf) {
constexpr u32 dispatch_size = 1024;
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, pipeline, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, layout, 0, {set}, {}, dld);
cmdbuf.pushConstants(layout, vk::ShaderStageFlagBits::eCompute, 0, sizeof(first), &first,
dld);
cmdbuf.dispatch(Common::AlignUp(num_quads, dispatch_size) / dispatch_size, 1, 1, dld);
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, layout, 0, set, {});
cmdbuf.PushConstants(layout, VK_SHADER_STAGE_COMPUTE_BIT, 0, sizeof(first), &first);
cmdbuf.Dispatch(Common::AlignUp(num_quads, dispatch_size) / dispatch_size, 1, 1);
const vk::BufferMemoryBarrier barrier(
vk::AccessFlagBits::eShaderWrite, vk::AccessFlagBits::eVertexAttributeRead,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, buffer, 0,
static_cast<vk::DeviceSize>(num_quads) * 6 * sizeof(u32));
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eComputeShader,
vk::PipelineStageFlagBits::eVertexInput, {}, {}, {barrier}, {}, dld);
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = 0;
barrier.size = static_cast<VkDeviceSize>(num_quads) * 6 * sizeof(u32);
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, {barrier}, {});
});
return {*buffer.handle, 0};
}
@@ -295,45 +506,112 @@ std::pair<const vk::Buffer&, vk::DeviceSize> QuadArrayPass::Assemble(u32 num_ver
Uint8Pass::Uint8Pass(const VKDevice& device, VKScheduler& scheduler,
VKDescriptorPool& descriptor_pool, VKStagingBufferPool& staging_buffer_pool,
VKUpdateDescriptorQueue& update_descriptor_queue)
: VKComputePass(device, descriptor_pool,
{vk::DescriptorSetLayoutBinding(0, vk::DescriptorType::eStorageBuffer, 1,
vk::ShaderStageFlagBits::eCompute, nullptr),
vk::DescriptorSetLayoutBinding(1, vk::DescriptorType::eStorageBuffer, 1,
vk::ShaderStageFlagBits::eCompute, nullptr)},
{vk::DescriptorUpdateTemplateEntry(0, 0, 2, vk::DescriptorType::eStorageBuffer,
0, sizeof(DescriptorUpdateEntry))},
{}, std::size(uint8_pass), uint8_pass),
: VKComputePass(device, descriptor_pool, BuildInputOutputDescriptorSetBindings(),
BuildInputOutputDescriptorUpdateTemplate(), {}, std::size(uint8_pass),
uint8_pass),
scheduler{scheduler}, staging_buffer_pool{staging_buffer_pool},
update_descriptor_queue{update_descriptor_queue} {}
Uint8Pass::~Uint8Pass() = default;
std::pair<const vk::Buffer*, u64> Uint8Pass::Assemble(u32 num_vertices, vk::Buffer src_buffer,
u64 src_offset) {
std::pair<VkBuffer, u64> Uint8Pass::Assemble(u32 num_vertices, VkBuffer src_buffer,
u64 src_offset) {
const auto staging_size = static_cast<u32>(num_vertices * sizeof(u16));
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(&src_buffer, src_offset, num_vertices);
update_descriptor_queue.AddBuffer(&*buffer.handle, 0, staging_size);
update_descriptor_queue.AddBuffer(src_buffer, src_offset, num_vertices);
update_descriptor_queue.AddBuffer(*buffer.handle, 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([layout = *layout, pipeline = *pipeline, buffer = *buffer.handle, set,
num_vertices](auto cmdbuf, auto& dld) {
num_vertices](vk::CommandBuffer cmdbuf) {
constexpr u32 dispatch_size = 1024;
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, pipeline, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, layout, 0, {set}, {}, dld);
cmdbuf.dispatch(Common::AlignUp(num_vertices, dispatch_size) / dispatch_size, 1, 1, dld);
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, layout, 0, set, {});
cmdbuf.Dispatch(Common::AlignUp(num_vertices, dispatch_size) / dispatch_size, 1, 1);
const vk::BufferMemoryBarrier barrier(
vk::AccessFlagBits::eShaderWrite, vk::AccessFlagBits::eVertexAttributeRead,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, buffer, 0,
static_cast<vk::DeviceSize>(num_vertices) * sizeof(u16));
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eComputeShader,
vk::PipelineStageFlagBits::eVertexInput, {}, {}, {barrier}, {}, dld);
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = 0;
barrier.size = static_cast<VkDeviceSize>(num_vertices * sizeof(u16));
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, barrier, {});
});
return {&*buffer.handle, 0};
return {*buffer.handle, 0};
}
QuadIndexedPass::QuadIndexedPass(const VKDevice& device, VKScheduler& scheduler,
VKDescriptorPool& descriptor_pool,
VKStagingBufferPool& staging_buffer_pool,
VKUpdateDescriptorQueue& update_descriptor_queue)
: VKComputePass(device, descriptor_pool, BuildInputOutputDescriptorSetBindings(),
BuildInputOutputDescriptorUpdateTemplate(),
BuildComputePushConstantRange(sizeof(u32) * 2), std::size(QUAD_INDEXED_SPV),
QUAD_INDEXED_SPV),
scheduler{scheduler}, staging_buffer_pool{staging_buffer_pool},
update_descriptor_queue{update_descriptor_queue} {}
QuadIndexedPass::~QuadIndexedPass() = default;
std::pair<VkBuffer, u64> QuadIndexedPass::Assemble(
Tegra::Engines::Maxwell3D::Regs::IndexFormat index_format, u32 num_vertices, u32 base_vertex,
VkBuffer src_buffer, u64 src_offset) {
const u32 index_shift = [index_format] {
switch (index_format) {
case Tegra::Engines::Maxwell3D::Regs::IndexFormat::UnsignedByte:
return 0;
case Tegra::Engines::Maxwell3D::Regs::IndexFormat::UnsignedShort:
return 1;
case Tegra::Engines::Maxwell3D::Regs::IndexFormat::UnsignedInt:
return 2;
}
UNREACHABLE();
return 2;
}();
const u32 input_size = num_vertices << index_shift;
const u32 num_tri_vertices = (num_vertices / 4) * 6;
const std::size_t staging_size = num_tri_vertices * sizeof(u32);
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(src_buffer, src_offset, input_size);
update_descriptor_queue.AddBuffer(*buffer.handle, 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([layout = *layout, pipeline = *pipeline, buffer = *buffer.handle, set,
num_tri_vertices, base_vertex, index_shift](vk::CommandBuffer cmdbuf) {
static constexpr u32 dispatch_size = 1024;
const std::array push_constants = {base_vertex, index_shift};
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, layout, 0, set, {});
cmdbuf.PushConstants(layout, VK_SHADER_STAGE_COMPUTE_BIT, 0, sizeof(push_constants),
&push_constants);
cmdbuf.Dispatch(Common::AlignUp(num_tri_vertices, dispatch_size) / dispatch_size, 1, 1);
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = 0;
barrier.size = static_cast<VkDeviceSize>(num_tri_vertices * sizeof(u32));
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, barrier, {});
});
return {*buffer.handle, 0};
}
} // namespace Vulkan

View File

@@ -8,8 +8,9 @@
#include <utility>
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -22,24 +23,24 @@ class VKUpdateDescriptorQueue;
class VKComputePass {
public:
explicit VKComputePass(const VKDevice& device, VKDescriptorPool& descriptor_pool,
const std::vector<vk::DescriptorSetLayoutBinding>& bindings,
const std::vector<vk::DescriptorUpdateTemplateEntry>& templates,
const std::vector<vk::PushConstantRange> push_constants,
std::size_t code_size, const u8* code);
vk::Span<VkDescriptorSetLayoutBinding> bindings,
vk::Span<VkDescriptorUpdateTemplateEntryKHR> templates,
vk::Span<VkPushConstantRange> push_constants, std::size_t code_size,
const u8* code);
~VKComputePass();
protected:
vk::DescriptorSet CommitDescriptorSet(VKUpdateDescriptorQueue& update_descriptor_queue,
VKFence& fence);
VkDescriptorSet CommitDescriptorSet(VKUpdateDescriptorQueue& update_descriptor_queue,
VKFence& fence);
UniqueDescriptorUpdateTemplate descriptor_template;
UniquePipelineLayout layout;
UniquePipeline pipeline;
vk::DescriptorUpdateTemplateKHR descriptor_template;
vk::PipelineLayout layout;
vk::Pipeline pipeline;
private:
UniqueDescriptorSetLayout descriptor_set_layout;
vk::DescriptorSetLayout descriptor_set_layout;
std::optional<DescriptorAllocator> descriptor_allocator;
UniqueShaderModule module;
vk::ShaderModule module;
};
class QuadArrayPass final : public VKComputePass {
@@ -50,7 +51,7 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue);
~QuadArrayPass();
std::pair<const vk::Buffer&, vk::DeviceSize> Assemble(u32 num_vertices, u32 first);
std::pair<VkBuffer, VkDeviceSize> Assemble(u32 num_vertices, u32 first);
private:
VKScheduler& scheduler;
@@ -65,8 +66,25 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue);
~Uint8Pass();
std::pair<const vk::Buffer*, u64> Assemble(u32 num_vertices, vk::Buffer src_buffer,
u64 src_offset);
std::pair<VkBuffer, u64> Assemble(u32 num_vertices, VkBuffer src_buffer, u64 src_offset);
private:
VKScheduler& scheduler;
VKStagingBufferPool& staging_buffer_pool;
VKUpdateDescriptorQueue& update_descriptor_queue;
};
class QuadIndexedPass final : public VKComputePass {
public:
explicit QuadIndexedPass(const VKDevice& device, VKScheduler& scheduler,
VKDescriptorPool& descriptor_pool,
VKStagingBufferPool& staging_buffer_pool,
VKUpdateDescriptorQueue& update_descriptor_queue);
~QuadIndexedPass();
std::pair<VkBuffer, u64> Assemble(Tegra::Engines::Maxwell3D::Regs::IndexFormat index_format,
u32 num_vertices, u32 base_vertex, VkBuffer src_buffer,
u64 src_offset);
private:
VKScheduler& scheduler;

View File

@@ -5,7 +5,6 @@
#include <memory>
#include <vector>
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_compute_pipeline.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_device.h"
@@ -14,6 +13,7 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_shader_decompiler.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -30,7 +30,7 @@ VKComputePipeline::VKComputePipeline(const VKDevice& device, VKScheduler& schedu
VKComputePipeline::~VKComputePipeline() = default;
vk::DescriptorSet VKComputePipeline::CommitDescriptorSet() {
VkDescriptorSet VKComputePipeline::CommitDescriptorSet() {
if (!descriptor_template) {
return {};
}
@@ -39,74 +39,109 @@ vk::DescriptorSet VKComputePipeline::CommitDescriptorSet() {
return set;
}
UniqueDescriptorSetLayout VKComputePipeline::CreateDescriptorSetLayout() const {
std::vector<vk::DescriptorSetLayoutBinding> bindings;
vk::DescriptorSetLayout VKComputePipeline::CreateDescriptorSetLayout() const {
std::vector<VkDescriptorSetLayoutBinding> bindings;
u32 binding = 0;
const auto AddBindings = [&](vk::DescriptorType descriptor_type, std::size_t num_entries) {
const auto add_bindings = [&](VkDescriptorType descriptor_type, std::size_t num_entries) {
// TODO(Rodrigo): Maybe make individual bindings here?
for (u32 bindpoint = 0; bindpoint < static_cast<u32>(num_entries); ++bindpoint) {
bindings.emplace_back(binding++, descriptor_type, 1, vk::ShaderStageFlagBits::eCompute,
nullptr);
VkDescriptorSetLayoutBinding& entry = bindings.emplace_back();
entry.binding = binding++;
entry.descriptorType = descriptor_type;
entry.descriptorCount = 1;
entry.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
entry.pImmutableSamplers = nullptr;
}
};
AddBindings(vk::DescriptorType::eUniformBuffer, entries.const_buffers.size());
AddBindings(vk::DescriptorType::eStorageBuffer, entries.global_buffers.size());
AddBindings(vk::DescriptorType::eUniformTexelBuffer, entries.texel_buffers.size());
AddBindings(vk::DescriptorType::eCombinedImageSampler, entries.samplers.size());
AddBindings(vk::DescriptorType::eStorageImage, entries.images.size());
add_bindings(VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, entries.const_buffers.size());
add_bindings(VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, entries.global_buffers.size());
add_bindings(VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER, entries.texel_buffers.size());
add_bindings(VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, entries.samplers.size());
add_bindings(VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, entries.images.size());
const vk::DescriptorSetLayoutCreateInfo descriptor_set_layout_ci(
{}, static_cast<u32>(bindings.size()), bindings.data());
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createDescriptorSetLayoutUnique(descriptor_set_layout_ci, nullptr, dld);
VkDescriptorSetLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.bindingCount = static_cast<u32>(bindings.size());
ci.pBindings = bindings.data();
return device.GetLogical().CreateDescriptorSetLayout(ci);
}
UniquePipelineLayout VKComputePipeline::CreatePipelineLayout() const {
const vk::PipelineLayoutCreateInfo layout_ci({}, 1, &*descriptor_set_layout, 0, nullptr);
const auto dev = device.GetLogical();
return dev.createPipelineLayoutUnique(layout_ci, nullptr, device.GetDispatchLoader());
vk::PipelineLayout VKComputePipeline::CreatePipelineLayout() const {
VkPipelineLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.setLayoutCount = 1;
ci.pSetLayouts = descriptor_set_layout.address();
ci.pushConstantRangeCount = 0;
ci.pPushConstantRanges = nullptr;
return device.GetLogical().CreatePipelineLayout(ci);
}
UniqueDescriptorUpdateTemplate VKComputePipeline::CreateDescriptorUpdateTemplate() const {
std::vector<vk::DescriptorUpdateTemplateEntry> template_entries;
vk::DescriptorUpdateTemplateKHR VKComputePipeline::CreateDescriptorUpdateTemplate() const {
std::vector<VkDescriptorUpdateTemplateEntryKHR> template_entries;
u32 binding = 0;
u32 offset = 0;
FillDescriptorUpdateTemplateEntries(entries, binding, offset, template_entries);
if (template_entries.empty()) {
// If the shader doesn't use descriptor sets, skip template creation.
return UniqueDescriptorUpdateTemplate{};
return {};
}
const vk::DescriptorUpdateTemplateCreateInfo template_ci(
{}, static_cast<u32>(template_entries.size()), template_entries.data(),
vk::DescriptorUpdateTemplateType::eDescriptorSet, *descriptor_set_layout,
vk::PipelineBindPoint::eGraphics, *layout, DESCRIPTOR_SET);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createDescriptorUpdateTemplateUnique(template_ci, nullptr, dld);
VkDescriptorUpdateTemplateCreateInfoKHR ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR;
ci.pNext = nullptr;
ci.flags = 0;
ci.descriptorUpdateEntryCount = static_cast<u32>(template_entries.size());
ci.pDescriptorUpdateEntries = template_entries.data();
ci.templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR;
ci.descriptorSetLayout = *descriptor_set_layout;
ci.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
ci.pipelineLayout = *layout;
ci.set = DESCRIPTOR_SET;
return device.GetLogical().CreateDescriptorUpdateTemplateKHR(ci);
}
UniqueShaderModule VKComputePipeline::CreateShaderModule(const std::vector<u32>& code) const {
const vk::ShaderModuleCreateInfo module_ci({}, code.size() * sizeof(u32), code.data());
const auto dev = device.GetLogical();
return dev.createShaderModuleUnique(module_ci, nullptr, device.GetDispatchLoader());
vk::ShaderModule VKComputePipeline::CreateShaderModule(const std::vector<u32>& code) const {
VkShaderModuleCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.codeSize = code.size() * sizeof(u32);
ci.pCode = code.data();
return device.GetLogical().CreateShaderModule(ci);
}
UniquePipeline VKComputePipeline::CreatePipeline() const {
vk::PipelineShaderStageCreateInfo shader_stage_ci({}, vk::ShaderStageFlagBits::eCompute,
*shader_module, "main", nullptr);
vk::PipelineShaderStageRequiredSubgroupSizeCreateInfoEXT subgroup_size_ci;
vk::Pipeline VKComputePipeline::CreatePipeline() const {
VkComputePipelineCreateInfo ci;
VkPipelineShaderStageCreateInfo& stage_ci = ci.stage;
stage_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stage_ci.pNext = nullptr;
stage_ci.flags = 0;
stage_ci.stage = VK_SHADER_STAGE_COMPUTE_BIT;
stage_ci.module = *shader_module;
stage_ci.pName = "main";
stage_ci.pSpecializationInfo = nullptr;
VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT subgroup_size_ci;
subgroup_size_ci.sType =
VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_REQUIRED_SUBGROUP_SIZE_CREATE_INFO_EXT;
subgroup_size_ci.pNext = nullptr;
subgroup_size_ci.requiredSubgroupSize = GuestWarpSize;
if (entries.uses_warps && device.IsGuestWarpSizeSupported(vk::ShaderStageFlagBits::eCompute)) {
shader_stage_ci.pNext = &subgroup_size_ci;
if (entries.uses_warps && device.IsGuestWarpSizeSupported(VK_SHADER_STAGE_COMPUTE_BIT)) {
stage_ci.pNext = &subgroup_size_ci;
}
const vk::ComputePipelineCreateInfo create_info({}, shader_stage_ci, *layout, {}, 0);
const auto dev = device.GetLogical();
return dev.createComputePipelineUnique({}, create_info, nullptr, device.GetDispatchLoader());
ci.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.layout = *layout;
ci.basePipelineHandle = nullptr;
ci.basePipelineIndex = 0;
return device.GetLogical().CreateComputePipeline(ci);
}
} // namespace Vulkan

View File

@@ -7,9 +7,9 @@
#include <memory>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_shader_decompiler.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -25,42 +25,42 @@ public:
const SPIRVShader& shader);
~VKComputePipeline();
vk::DescriptorSet CommitDescriptorSet();
VkDescriptorSet CommitDescriptorSet();
vk::Pipeline GetHandle() const {
VkPipeline GetHandle() const {
return *pipeline;
}
vk::PipelineLayout GetLayout() const {
VkPipelineLayout GetLayout() const {
return *layout;
}
const ShaderEntries& GetEntries() {
const ShaderEntries& GetEntries() const {
return entries;
}
private:
UniqueDescriptorSetLayout CreateDescriptorSetLayout() const;
vk::DescriptorSetLayout CreateDescriptorSetLayout() const;
UniquePipelineLayout CreatePipelineLayout() const;
vk::PipelineLayout CreatePipelineLayout() const;
UniqueDescriptorUpdateTemplate CreateDescriptorUpdateTemplate() const;
vk::DescriptorUpdateTemplateKHR CreateDescriptorUpdateTemplate() const;
UniqueShaderModule CreateShaderModule(const std::vector<u32>& code) const;
vk::ShaderModule CreateShaderModule(const std::vector<u32>& code) const;
UniquePipeline CreatePipeline() const;
vk::Pipeline CreatePipeline() const;
const VKDevice& device;
VKScheduler& scheduler;
ShaderEntries entries;
UniqueDescriptorSetLayout descriptor_set_layout;
vk::DescriptorSetLayout descriptor_set_layout;
DescriptorAllocator descriptor_allocator;
VKUpdateDescriptorQueue& update_descriptor_queue;
UniquePipelineLayout layout;
UniqueDescriptorUpdateTemplate descriptor_template;
UniqueShaderModule shader_module;
UniquePipeline pipeline;
vk::PipelineLayout layout;
vk::DescriptorUpdateTemplateKHR descriptor_template;
vk::ShaderModule shader_module;
vk::Pipeline pipeline;
};
} // namespace Vulkan

View File

@@ -6,10 +6,10 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -17,19 +17,18 @@ namespace Vulkan {
constexpr std::size_t SETS_GROW_RATE = 0x20;
DescriptorAllocator::DescriptorAllocator(VKDescriptorPool& descriptor_pool,
vk::DescriptorSetLayout layout)
VkDescriptorSetLayout layout)
: VKFencedPool{SETS_GROW_RATE}, descriptor_pool{descriptor_pool}, layout{layout} {}
DescriptorAllocator::~DescriptorAllocator() = default;
vk::DescriptorSet DescriptorAllocator::Commit(VKFence& fence) {
return *descriptors[CommitResource(fence)];
VkDescriptorSet DescriptorAllocator::Commit(VKFence& fence) {
const std::size_t index = CommitResource(fence);
return descriptors_allocations[index / SETS_GROW_RATE][index % SETS_GROW_RATE];
}
void DescriptorAllocator::Allocate(std::size_t begin, std::size_t end) {
auto new_sets = descriptor_pool.AllocateDescriptors(layout, end - begin);
descriptors.insert(descriptors.end(), std::make_move_iterator(new_sets.begin()),
std::make_move_iterator(new_sets.end()));
descriptors_allocations.push_back(descriptor_pool.AllocateDescriptors(layout, end - begin));
}
VKDescriptorPool::VKDescriptorPool(const VKDevice& device)
@@ -37,53 +36,50 @@ VKDescriptorPool::VKDescriptorPool(const VKDevice& device)
VKDescriptorPool::~VKDescriptorPool() = default;
vk::DescriptorPool VKDescriptorPool::AllocateNewPool() {
vk::DescriptorPool* VKDescriptorPool::AllocateNewPool() {
static constexpr u32 num_sets = 0x20000;
static constexpr vk::DescriptorPoolSize pool_sizes[] = {
{vk::DescriptorType::eUniformBuffer, num_sets * 90},
{vk::DescriptorType::eStorageBuffer, num_sets * 60},
{vk::DescriptorType::eUniformTexelBuffer, num_sets * 64},
{vk::DescriptorType::eCombinedImageSampler, num_sets * 64},
{vk::DescriptorType::eStorageImage, num_sets * 40}};
static constexpr VkDescriptorPoolSize pool_sizes[] = {
{VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, num_sets * 90},
{VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, num_sets * 60},
{VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER, num_sets * 64},
{VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, num_sets * 64},
{VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, num_sets * 40}};
const vk::DescriptorPoolCreateInfo create_info(
vk::DescriptorPoolCreateFlagBits::eFreeDescriptorSet, num_sets,
static_cast<u32>(std::size(pool_sizes)), std::data(pool_sizes));
const auto dev = device.GetLogical();
return *pools.emplace_back(
dev.createDescriptorPoolUnique(create_info, nullptr, device.GetDispatchLoader()));
VkDescriptorPoolCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT;
ci.maxSets = num_sets;
ci.poolSizeCount = static_cast<u32>(std::size(pool_sizes));
ci.pPoolSizes = std::data(pool_sizes);
return &pools.emplace_back(device.GetLogical().CreateDescriptorPool(ci));
}
std::vector<UniqueDescriptorSet> VKDescriptorPool::AllocateDescriptors(
vk::DescriptorSetLayout layout, std::size_t count) {
std::vector layout_copies(count, layout);
vk::DescriptorSetAllocateInfo allocate_info(active_pool, static_cast<u32>(count),
layout_copies.data());
vk::DescriptorSets VKDescriptorPool::AllocateDescriptors(VkDescriptorSetLayout layout,
std::size_t count) {
const std::vector layout_copies(count, layout);
VkDescriptorSetAllocateInfo ai;
ai.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
ai.pNext = nullptr;
ai.descriptorPool = **active_pool;
ai.descriptorSetCount = static_cast<u32>(count);
ai.pSetLayouts = layout_copies.data();
std::vector<vk::DescriptorSet> sets(count);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
switch (const auto result = dev.allocateDescriptorSets(&allocate_info, sets.data(), dld)) {
case vk::Result::eSuccess:
break;
case vk::Result::eErrorOutOfPoolMemory:
active_pool = AllocateNewPool();
allocate_info.descriptorPool = active_pool;
if (dev.allocateDescriptorSets(&allocate_info, sets.data(), dld) == vk::Result::eSuccess) {
break;
}
[[fallthrough]];
default:
vk::throwResultException(result, "vk::Device::allocateDescriptorSetsUnique");
vk::DescriptorSets sets = active_pool->Allocate(ai);
if (!sets.IsOutOfPoolMemory()) {
return sets;
}
vk::PoolFree deleter(dev, active_pool, dld);
std::vector<UniqueDescriptorSet> unique_sets;
unique_sets.reserve(count);
for (const auto set : sets) {
unique_sets.push_back(UniqueDescriptorSet{set, deleter});
// Our current pool is out of memory. Allocate a new one and retry
active_pool = AllocateNewPool();
ai.descriptorPool = **active_pool;
sets = active_pool->Allocate(ai);
if (!sets.IsOutOfPoolMemory()) {
return sets;
}
return unique_sets;
// After allocating a new pool, we are out of memory again. We can't handle this from here.
throw vk::Exception(VK_ERROR_OUT_OF_POOL_MEMORY);
}
} // namespace Vulkan

View File

@@ -8,8 +8,8 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -17,21 +17,21 @@ class VKDescriptorPool;
class DescriptorAllocator final : public VKFencedPool {
public:
explicit DescriptorAllocator(VKDescriptorPool& descriptor_pool, vk::DescriptorSetLayout layout);
explicit DescriptorAllocator(VKDescriptorPool& descriptor_pool, VkDescriptorSetLayout layout);
~DescriptorAllocator() override;
DescriptorAllocator(const DescriptorAllocator&) = delete;
vk::DescriptorSet Commit(VKFence& fence);
VkDescriptorSet Commit(VKFence& fence);
protected:
void Allocate(std::size_t begin, std::size_t end) override;
private:
VKDescriptorPool& descriptor_pool;
const vk::DescriptorSetLayout layout;
const VkDescriptorSetLayout layout;
std::vector<UniqueDescriptorSet> descriptors;
std::vector<vk::DescriptorSets> descriptors_allocations;
};
class VKDescriptorPool final {
@@ -42,15 +42,14 @@ public:
~VKDescriptorPool();
private:
vk::DescriptorPool AllocateNewPool();
vk::DescriptorPool* AllocateNewPool();
std::vector<UniqueDescriptorSet> AllocateDescriptors(vk::DescriptorSetLayout layout,
std::size_t count);
vk::DescriptorSets AllocateDescriptors(VkDescriptorSetLayout layout, std::size_t count);
const VKDevice& device;
std::vector<UniqueDescriptorPool> pools;
vk::DescriptorPool active_pool;
std::vector<vk::DescriptorPool> pools;
vk::DescriptorPool* active_pool;
};
} // namespace Vulkan

View File

@@ -6,14 +6,15 @@
#include <chrono>
#include <cstdlib>
#include <optional>
#include <set>
#include <string_view>
#include <thread>
#include <unordered_set>
#include <vector>
#include "common/assert.h"
#include "core/settings.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -21,49 +22,43 @@ namespace {
namespace Alternatives {
constexpr std::array Depth24UnormS8Uint = {vk::Format::eD32SfloatS8Uint,
vk::Format::eD16UnormS8Uint, vk::Format{}};
constexpr std::array Depth16UnormS8Uint = {vk::Format::eD24UnormS8Uint,
vk::Format::eD32SfloatS8Uint, vk::Format{}};
constexpr std::array Depth24UnormS8_UINT = {VK_FORMAT_D32_SFLOAT_S8_UINT,
VK_FORMAT_D16_UNORM_S8_UINT, VkFormat{}};
constexpr std::array Depth16UnormS8_UINT = {VK_FORMAT_D24_UNORM_S8_UINT,
VK_FORMAT_D32_SFLOAT_S8_UINT, VkFormat{}};
} // namespace Alternatives
constexpr std::array REQUIRED_EXTENSIONS = {
VK_KHR_SWAPCHAIN_EXTENSION_NAME,
VK_KHR_16BIT_STORAGE_EXTENSION_NAME,
VK_KHR_8BIT_STORAGE_EXTENSION_NAME,
VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME,
VK_KHR_DESCRIPTOR_UPDATE_TEMPLATE_EXTENSION_NAME,
VK_EXT_VERTEX_ATTRIBUTE_DIVISOR_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_BALLOT_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_VOTE_EXTENSION_NAME,
VK_EXT_HOST_QUERY_RESET_EXTENSION_NAME,
};
template <typename T>
void SetNext(void**& next, T& data) {
*next = &data;
next = &data.pNext;
}
template <typename T>
T GetFeatures(vk::PhysicalDevice physical, const vk::DispatchLoaderDynamic& dldi) {
vk::PhysicalDeviceFeatures2 features;
T extension_features;
features.pNext = &extension_features;
physical.getFeatures2(&features, dldi);
return extension_features;
}
template <typename T>
T GetProperties(vk::PhysicalDevice physical, const vk::DispatchLoaderDynamic& dldi) {
vk::PhysicalDeviceProperties2 properties;
T extension_properties;
properties.pNext = &extension_properties;
physical.getProperties2(&properties, dldi);
return extension_properties;
}
constexpr const vk::Format* GetFormatAlternatives(vk::Format format) {
constexpr const VkFormat* GetFormatAlternatives(VkFormat format) {
switch (format) {
case vk::Format::eD24UnormS8Uint:
return Alternatives::Depth24UnormS8Uint.data();
case vk::Format::eD16UnormS8Uint:
return Alternatives::Depth16UnormS8Uint.data();
case VK_FORMAT_D24_UNORM_S8_UINT:
return Alternatives::Depth24UnormS8_UINT.data();
case VK_FORMAT_D16_UNORM_S8_UINT:
return Alternatives::Depth16UnormS8_UINT.data();
default:
return nullptr;
}
}
vk::FormatFeatureFlags GetFormatFeatures(vk::FormatProperties properties, FormatType format_type) {
VkFormatFeatureFlags GetFormatFeatures(VkFormatProperties properties, FormatType format_type) {
switch (format_type) {
case FormatType::Linear:
return properties.linearTilingFeatures;
@@ -76,79 +71,220 @@ vk::FormatFeatureFlags GetFormatFeatures(vk::FormatProperties properties, Format
}
}
std::unordered_map<VkFormat, VkFormatProperties> GetFormatProperties(
vk::PhysicalDevice physical, const vk::InstanceDispatch& dld) {
static constexpr std::array formats{VK_FORMAT_A8B8G8R8_UNORM_PACK32,
VK_FORMAT_A8B8G8R8_UINT_PACK32,
VK_FORMAT_A8B8G8R8_SNORM_PACK32,
VK_FORMAT_A8B8G8R8_SRGB_PACK32,
VK_FORMAT_B5G6R5_UNORM_PACK16,
VK_FORMAT_A2B10G10R10_UNORM_PACK32,
VK_FORMAT_A1R5G5B5_UNORM_PACK16,
VK_FORMAT_R32G32B32A32_SFLOAT,
VK_FORMAT_R32G32B32A32_UINT,
VK_FORMAT_R32G32_SFLOAT,
VK_FORMAT_R32G32_UINT,
VK_FORMAT_R16G16B16A16_UINT,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16_UNORM,
VK_FORMAT_R16G16_SNORM,
VK_FORMAT_R16G16_SFLOAT,
VK_FORMAT_R16_UNORM,
VK_FORMAT_R8G8B8A8_SRGB,
VK_FORMAT_R8G8_UNORM,
VK_FORMAT_R8G8_SNORM,
VK_FORMAT_R8_UNORM,
VK_FORMAT_R8_UINT,
VK_FORMAT_B10G11R11_UFLOAT_PACK32,
VK_FORMAT_R32_SFLOAT,
VK_FORMAT_R32_UINT,
VK_FORMAT_R32_SINT,
VK_FORMAT_R16_SFLOAT,
VK_FORMAT_R16G16B16A16_SFLOAT,
VK_FORMAT_B8G8R8A8_UNORM,
VK_FORMAT_R4G4B4A4_UNORM_PACK16,
VK_FORMAT_D32_SFLOAT,
VK_FORMAT_D16_UNORM,
VK_FORMAT_D16_UNORM_S8_UINT,
VK_FORMAT_D24_UNORM_S8_UINT,
VK_FORMAT_D32_SFLOAT_S8_UINT,
VK_FORMAT_BC1_RGBA_UNORM_BLOCK,
VK_FORMAT_BC2_UNORM_BLOCK,
VK_FORMAT_BC3_UNORM_BLOCK,
VK_FORMAT_BC4_UNORM_BLOCK,
VK_FORMAT_BC5_UNORM_BLOCK,
VK_FORMAT_BC5_SNORM_BLOCK,
VK_FORMAT_BC7_UNORM_BLOCK,
VK_FORMAT_BC6H_UFLOAT_BLOCK,
VK_FORMAT_BC6H_SFLOAT_BLOCK,
VK_FORMAT_BC1_RGBA_SRGB_BLOCK,
VK_FORMAT_BC2_SRGB_BLOCK,
VK_FORMAT_BC3_SRGB_BLOCK,
VK_FORMAT_BC7_SRGB_BLOCK,
VK_FORMAT_ASTC_4x4_SRGB_BLOCK,
VK_FORMAT_ASTC_8x8_SRGB_BLOCK,
VK_FORMAT_ASTC_8x5_SRGB_BLOCK,
VK_FORMAT_ASTC_5x4_SRGB_BLOCK,
VK_FORMAT_ASTC_5x5_UNORM_BLOCK,
VK_FORMAT_ASTC_5x5_SRGB_BLOCK,
VK_FORMAT_ASTC_10x8_UNORM_BLOCK,
VK_FORMAT_ASTC_10x8_SRGB_BLOCK,
VK_FORMAT_ASTC_6x6_UNORM_BLOCK,
VK_FORMAT_ASTC_6x6_SRGB_BLOCK,
VK_FORMAT_ASTC_10x10_UNORM_BLOCK,
VK_FORMAT_ASTC_10x10_SRGB_BLOCK,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK,
VK_FORMAT_ASTC_12x12_SRGB_BLOCK,
VK_FORMAT_ASTC_8x6_UNORM_BLOCK,
VK_FORMAT_ASTC_8x6_SRGB_BLOCK,
VK_FORMAT_ASTC_6x5_UNORM_BLOCK,
VK_FORMAT_ASTC_6x5_SRGB_BLOCK,
VK_FORMAT_E5B9G9R9_UFLOAT_PACK32};
std::unordered_map<VkFormat, VkFormatProperties> format_properties;
for (const auto format : formats) {
format_properties.emplace(format, physical.GetFormatProperties(format));
}
return format_properties;
}
} // Anonymous namespace
VKDevice::VKDevice(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDevice physical,
vk::SurfaceKHR surface)
: physical{physical}, properties{physical.getProperties(dldi)},
format_properties{GetFormatProperties(dldi, physical)} {
SetupFamilies(dldi, surface);
SetupFeatures(dldi);
VKDevice::VKDevice(VkInstance instance, vk::PhysicalDevice physical, VkSurfaceKHR surface,
const vk::InstanceDispatch& dld)
: dld{dld}, physical{physical}, properties{physical.GetProperties()},
format_properties{GetFormatProperties(physical, dld)} {
SetupFamilies(surface);
SetupFeatures();
}
VKDevice::~VKDevice() = default;
bool VKDevice::Create(const vk::DispatchLoaderDynamic& dldi, vk::Instance instance) {
bool VKDevice::Create() {
const auto queue_cis = GetDeviceQueueCreateInfos();
const std::vector extensions = LoadExtensions(dldi);
const std::vector extensions = LoadExtensions();
vk::PhysicalDeviceFeatures2 features2;
VkPhysicalDeviceFeatures2 features2;
features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
features2.pNext = nullptr;
void** next = &features2.pNext;
auto& features = features2.features;
features.vertexPipelineStoresAndAtomics = true;
features.robustBufferAccess = false;
features.fullDrawIndexUint32 = false;
features.imageCubeArray = false;
features.independentBlend = true;
features.depthClamp = true;
features.samplerAnisotropy = true;
features.largePoints = true;
features.multiViewport = true;
features.depthBiasClamp = true;
features.geometryShader = true;
features.tessellationShader = true;
features.sampleRateShading = false;
features.dualSrcBlend = false;
features.logicOp = false;
features.multiDrawIndirect = false;
features.drawIndirectFirstInstance = false;
features.depthClamp = true;
features.depthBiasClamp = true;
features.fillModeNonSolid = false;
features.depthBounds = false;
features.wideLines = false;
features.largePoints = true;
features.alphaToOne = false;
features.multiViewport = true;
features.samplerAnisotropy = true;
features.textureCompressionETC2 = false;
features.textureCompressionASTC_LDR = is_optimal_astc_supported;
features.textureCompressionBC = false;
features.occlusionQueryPrecise = true;
features.pipelineStatisticsQuery = false;
features.vertexPipelineStoresAndAtomics = true;
features.fragmentStoresAndAtomics = true;
features.shaderTessellationAndGeometryPointSize = false;
features.shaderImageGatherExtended = true;
features.shaderStorageImageExtendedFormats = false;
features.shaderStorageImageMultisample = false;
features.shaderStorageImageReadWithoutFormat = is_formatless_image_load_supported;
features.shaderStorageImageWriteWithoutFormat = true;
features.textureCompressionASTC_LDR = is_optimal_astc_supported;
features.shaderUniformBufferArrayDynamicIndexing = false;
features.shaderSampledImageArrayDynamicIndexing = false;
features.shaderStorageBufferArrayDynamicIndexing = false;
features.shaderStorageImageArrayDynamicIndexing = false;
features.shaderClipDistance = false;
features.shaderCullDistance = false;
features.shaderFloat64 = false;
features.shaderInt64 = false;
features.shaderInt16 = false;
features.shaderResourceResidency = false;
features.shaderResourceMinLod = false;
features.sparseBinding = false;
features.sparseResidencyBuffer = false;
features.sparseResidencyImage2D = false;
features.sparseResidencyImage3D = false;
features.sparseResidency2Samples = false;
features.sparseResidency4Samples = false;
features.sparseResidency8Samples = false;
features.sparseResidency16Samples = false;
features.sparseResidencyAliased = false;
features.variableMultisampleRate = false;
features.inheritedQueries = false;
vk::PhysicalDevice16BitStorageFeaturesKHR bit16_storage;
VkPhysicalDevice16BitStorageFeaturesKHR bit16_storage;
bit16_storage.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR;
bit16_storage.pNext = nullptr;
bit16_storage.storageBuffer16BitAccess = false;
bit16_storage.uniformAndStorageBuffer16BitAccess = true;
bit16_storage.storagePushConstant16 = false;
bit16_storage.storageInputOutput16 = false;
SetNext(next, bit16_storage);
vk::PhysicalDevice8BitStorageFeaturesKHR bit8_storage;
VkPhysicalDevice8BitStorageFeaturesKHR bit8_storage;
bit8_storage.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR;
bit8_storage.pNext = nullptr;
bit8_storage.storageBuffer8BitAccess = false;
bit8_storage.uniformAndStorageBuffer8BitAccess = true;
bit8_storage.storagePushConstant8 = false;
SetNext(next, bit8_storage);
vk::PhysicalDeviceHostQueryResetFeaturesEXT host_query_reset;
VkPhysicalDeviceHostQueryResetFeaturesEXT host_query_reset;
host_query_reset.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_HOST_QUERY_RESET_FEATURES_EXT;
host_query_reset.hostQueryReset = true;
SetNext(next, host_query_reset);
vk::PhysicalDeviceFloat16Int8FeaturesKHR float16_int8;
VkPhysicalDeviceFloat16Int8FeaturesKHR float16_int8;
if (is_float16_supported) {
float16_int8.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR;
float16_int8.pNext = nullptr;
float16_int8.shaderFloat16 = true;
float16_int8.shaderInt8 = false;
SetNext(next, float16_int8);
} else {
LOG_INFO(Render_Vulkan, "Device doesn't support float16 natively");
}
vk::PhysicalDeviceUniformBufferStandardLayoutFeaturesKHR std430_layout;
VkPhysicalDeviceUniformBufferStandardLayoutFeaturesKHR std430_layout;
if (khr_uniform_buffer_standard_layout) {
std430_layout.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_UNIFORM_BUFFER_STANDARD_LAYOUT_FEATURES_KHR;
std430_layout.pNext = nullptr;
std430_layout.uniformBufferStandardLayout = true;
SetNext(next, std430_layout);
} else {
LOG_INFO(Render_Vulkan, "Device doesn't support packed UBOs");
}
vk::PhysicalDeviceIndexTypeUint8FeaturesEXT index_type_uint8;
VkPhysicalDeviceIndexTypeUint8FeaturesEXT index_type_uint8;
if (ext_index_type_uint8) {
index_type_uint8.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INDEX_TYPE_UINT8_FEATURES_EXT;
index_type_uint8.pNext = nullptr;
index_type_uint8.indexTypeUint8 = true;
SetNext(next, index_type_uint8);
} else {
LOG_INFO(Render_Vulkan, "Device doesn't support uint8 indexes");
}
vk::PhysicalDeviceTransformFeedbackFeaturesEXT transform_feedback;
VkPhysicalDeviceTransformFeedbackFeaturesEXT transform_feedback;
if (ext_transform_feedback) {
transform_feedback.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT;
transform_feedback.pNext = nullptr;
transform_feedback.transformFeedback = true;
transform_feedback.geometryStreams = true;
SetNext(next, transform_feedback);
@@ -160,62 +296,48 @@ bool VKDevice::Create(const vk::DispatchLoaderDynamic& dldi, vk::Instance instan
LOG_INFO(Render_Vulkan, "Device doesn't support depth range unrestricted");
}
vk::DeviceCreateInfo device_ci({}, static_cast<u32>(queue_cis.size()), queue_cis.data(), 0,
nullptr, static_cast<u32>(extensions.size()), extensions.data(),
nullptr);
device_ci.pNext = &features2;
vk::Device dummy_logical;
if (physical.createDevice(&device_ci, nullptr, &dummy_logical, dldi) != vk::Result::eSuccess) {
LOG_CRITICAL(Render_Vulkan, "Logical device failed to be created!");
logical = vk::Device::Create(physical, queue_cis, extensions, features2, dld);
if (!logical) {
LOG_ERROR(Render_Vulkan, "Failed to create logical device");
return false;
}
dld.init(instance, dldi.vkGetInstanceProcAddr, dummy_logical, dldi.vkGetDeviceProcAddr);
logical = UniqueDevice(
dummy_logical, vk::ObjectDestroy<vk::NoParent, vk::DispatchLoaderDynamic>(nullptr, dld));
CollectTelemetryParameters();
graphics_queue = logical->getQueue(graphics_family, 0, dld);
present_queue = logical->getQueue(present_family, 0, dld);
graphics_queue = logical.GetQueue(graphics_family);
present_queue = logical.GetQueue(present_family);
return true;
}
vk::Format VKDevice::GetSupportedFormat(vk::Format wanted_format,
vk::FormatFeatureFlags wanted_usage,
FormatType format_type) const {
VkFormat VKDevice::GetSupportedFormat(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const {
if (IsFormatSupported(wanted_format, wanted_usage, format_type)) {
return wanted_format;
}
// The wanted format is not supported by hardware, search for alternatives
const vk::Format* alternatives = GetFormatAlternatives(wanted_format);
const VkFormat* alternatives = GetFormatAlternatives(wanted_format);
if (alternatives == nullptr) {
UNREACHABLE_MSG("Format={} with usage={} and type={} has no defined alternatives and host "
"hardware does not support it",
vk::to_string(wanted_format), vk::to_string(wanted_usage),
static_cast<u32>(format_type));
wanted_format, wanted_usage, format_type);
return wanted_format;
}
std::size_t i = 0;
for (vk::Format alternative = alternatives[0]; alternative != vk::Format{};
alternative = alternatives[++i]) {
for (VkFormat alternative = *alternatives; alternative; alternative = alternatives[++i]) {
if (!IsFormatSupported(alternative, wanted_usage, format_type)) {
continue;
}
LOG_WARNING(Render_Vulkan,
"Emulating format={} with alternative format={} with usage={} and type={}",
static_cast<u32>(wanted_format), static_cast<u32>(alternative),
static_cast<u32>(wanted_usage), static_cast<u32>(format_type));
wanted_format, alternative, wanted_usage, format_type);
return alternative;
}
// No alternatives found, panic
UNREACHABLE_MSG("Format={} with usage={} and type={} is not supported by the host hardware and "
"doesn't support any of the alternatives",
static_cast<u32>(wanted_format), static_cast<u32>(wanted_usage),
static_cast<u32>(format_type));
wanted_format, wanted_usage, format_type);
return wanted_format;
}
@@ -229,35 +351,39 @@ void VKDevice::ReportLoss() const {
return;
}
[[maybe_unused]] const std::vector data = graphics_queue.getCheckpointDataNV(dld);
[[maybe_unused]] const std::vector data = graphics_queue.GetCheckpointDataNV(dld);
// Catch here in debug builds (or with optimizations disabled) the last graphics pipeline to be
// executed. It can be done on a debugger by evaluating the expression:
// *(VKGraphicsPipeline*)data[0]
}
bool VKDevice::IsOptimalAstcSupported(const vk::PhysicalDeviceFeatures& features,
const vk::DispatchLoaderDynamic& dldi) const {
bool VKDevice::IsOptimalAstcSupported(const VkPhysicalDeviceFeatures& features) const {
// Disable for now to avoid converting ASTC twice.
return false;
static constexpr std::array astc_formats = {
vk::Format::eAstc4x4SrgbBlock, vk::Format::eAstc8x8SrgbBlock,
vk::Format::eAstc8x5SrgbBlock, vk::Format::eAstc5x4SrgbBlock,
vk::Format::eAstc5x5UnormBlock, vk::Format::eAstc5x5SrgbBlock,
vk::Format::eAstc10x8UnormBlock, vk::Format::eAstc10x8SrgbBlock,
vk::Format::eAstc6x6UnormBlock, vk::Format::eAstc6x6SrgbBlock,
vk::Format::eAstc10x10UnormBlock, vk::Format::eAstc10x10SrgbBlock,
vk::Format::eAstc12x12UnormBlock, vk::Format::eAstc12x12SrgbBlock,
vk::Format::eAstc8x6UnormBlock, vk::Format::eAstc8x6SrgbBlock,
vk::Format::eAstc6x5UnormBlock, vk::Format::eAstc6x5SrgbBlock};
VK_FORMAT_ASTC_4x4_UNORM_BLOCK, VK_FORMAT_ASTC_4x4_SRGB_BLOCK,
VK_FORMAT_ASTC_5x4_UNORM_BLOCK, VK_FORMAT_ASTC_5x4_SRGB_BLOCK,
VK_FORMAT_ASTC_5x5_UNORM_BLOCK, VK_FORMAT_ASTC_5x5_SRGB_BLOCK,
VK_FORMAT_ASTC_6x5_UNORM_BLOCK, VK_FORMAT_ASTC_6x5_SRGB_BLOCK,
VK_FORMAT_ASTC_6x6_UNORM_BLOCK, VK_FORMAT_ASTC_6x6_SRGB_BLOCK,
VK_FORMAT_ASTC_8x5_UNORM_BLOCK, VK_FORMAT_ASTC_8x5_SRGB_BLOCK,
VK_FORMAT_ASTC_8x6_UNORM_BLOCK, VK_FORMAT_ASTC_8x6_SRGB_BLOCK,
VK_FORMAT_ASTC_8x8_UNORM_BLOCK, VK_FORMAT_ASTC_8x8_SRGB_BLOCK,
VK_FORMAT_ASTC_10x5_UNORM_BLOCK, VK_FORMAT_ASTC_10x5_SRGB_BLOCK,
VK_FORMAT_ASTC_10x6_UNORM_BLOCK, VK_FORMAT_ASTC_10x6_SRGB_BLOCK,
VK_FORMAT_ASTC_10x8_UNORM_BLOCK, VK_FORMAT_ASTC_10x8_SRGB_BLOCK,
VK_FORMAT_ASTC_10x10_UNORM_BLOCK, VK_FORMAT_ASTC_10x10_SRGB_BLOCK,
VK_FORMAT_ASTC_12x10_UNORM_BLOCK, VK_FORMAT_ASTC_12x10_SRGB_BLOCK,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK, VK_FORMAT_ASTC_12x12_SRGB_BLOCK,
};
if (!features.textureCompressionASTC_LDR) {
return false;
}
const auto format_feature_usage{
vk::FormatFeatureFlagBits::eSampledImage | vk::FormatFeatureFlagBits::eBlitSrc |
vk::FormatFeatureFlagBits::eBlitDst | vk::FormatFeatureFlagBits::eTransferSrc |
vk::FormatFeatureFlagBits::eTransferDst};
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT | VK_FORMAT_FEATURE_BLIT_SRC_BIT |
VK_FORMAT_FEATURE_BLIT_DST_BIT | VK_FORMAT_FEATURE_TRANSFER_SRC_BIT |
VK_FORMAT_FEATURE_TRANSFER_DST_BIT};
for (const auto format : astc_formats) {
const auto format_properties{physical.getFormatProperties(format, dldi)};
const auto format_properties{physical.GetFormatProperties(format)};
if (!(format_properties.optimalTilingFeatures & format_feature_usage)) {
return false;
}
@@ -265,62 +391,49 @@ bool VKDevice::IsOptimalAstcSupported(const vk::PhysicalDeviceFeatures& features
return true;
}
bool VKDevice::IsFormatSupported(vk::Format wanted_format, vk::FormatFeatureFlags wanted_usage,
bool VKDevice::IsFormatSupported(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const {
const auto it = format_properties.find(wanted_format);
if (it == format_properties.end()) {
UNIMPLEMENTED_MSG("Unimplemented format query={}", vk::to_string(wanted_format));
UNIMPLEMENTED_MSG("Unimplemented format query={}", wanted_format);
return true;
}
const auto supported_usage = GetFormatFeatures(it->second, format_type);
return (supported_usage & wanted_usage) == wanted_usage;
}
bool VKDevice::IsSuitable(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDevice physical,
vk::SurfaceKHR surface) {
bool VKDevice::IsSuitable(vk::PhysicalDevice physical, VkSurfaceKHR surface) {
bool is_suitable = true;
std::bitset<REQUIRED_EXTENSIONS.size()> available_extensions;
constexpr std::array required_extensions = {
VK_KHR_SWAPCHAIN_EXTENSION_NAME,
VK_KHR_16BIT_STORAGE_EXTENSION_NAME,
VK_KHR_8BIT_STORAGE_EXTENSION_NAME,
VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME,
VK_EXT_VERTEX_ATTRIBUTE_DIVISOR_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_BALLOT_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_VOTE_EXTENSION_NAME,
VK_EXT_HOST_QUERY_RESET_EXTENSION_NAME,
};
std::bitset<required_extensions.size()> available_extensions{};
for (const auto& prop : physical.enumerateDeviceExtensionProperties(nullptr, dldi)) {
for (std::size_t i = 0; i < required_extensions.size(); ++i) {
for (const auto& prop : physical.EnumerateDeviceExtensionProperties()) {
for (std::size_t i = 0; i < REQUIRED_EXTENSIONS.size(); ++i) {
if (available_extensions[i]) {
continue;
}
available_extensions[i] =
required_extensions[i] == std::string_view{prop.extensionName};
const std::string_view name{prop.extensionName};
available_extensions[i] = name == REQUIRED_EXTENSIONS[i];
}
}
if (!available_extensions.all()) {
for (std::size_t i = 0; i < required_extensions.size(); ++i) {
for (std::size_t i = 0; i < REQUIRED_EXTENSIONS.size(); ++i) {
if (available_extensions[i]) {
continue;
}
LOG_ERROR(Render_Vulkan, "Missing required extension: {}", required_extensions[i]);
LOG_ERROR(Render_Vulkan, "Missing required extension: {}", REQUIRED_EXTENSIONS[i]);
is_suitable = false;
}
}
bool has_graphics{}, has_present{};
const auto queue_family_properties = physical.getQueueFamilyProperties(dldi);
const std::vector queue_family_properties = physical.GetQueueFamilyProperties();
for (u32 i = 0; i < static_cast<u32>(queue_family_properties.size()); ++i) {
const auto& family = queue_family_properties[i];
if (family.queueCount == 0) {
continue;
}
has_graphics |=
(family.queueFlags & vk::QueueFlagBits::eGraphics) != static_cast<vk::QueueFlagBits>(0);
has_present |= physical.getSurfaceSupportKHR(i, surface, dldi) != 0;
has_graphics |= family.queueFlags & VK_QUEUE_GRAPHICS_BIT;
has_present |= physical.GetSurfaceSupportKHR(i, surface);
}
if (!has_graphics || !has_present) {
LOG_ERROR(Render_Vulkan, "Device lacks a graphics and present queue");
@@ -328,7 +441,7 @@ bool VKDevice::IsSuitable(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDev
}
// TODO(Rodrigo): Check if the device matches all requeriments.
const auto properties{physical.getProperties(dldi)};
const auto properties{physical.GetProperties()};
const auto& limits{properties.limits};
constexpr u32 required_ubo_size = 65536;
@@ -345,7 +458,7 @@ bool VKDevice::IsSuitable(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDev
is_suitable = false;
}
const auto features{physical.getFeatures(dldi)};
const auto features{physical.GetFeatures()};
const std::array feature_report = {
std::make_pair(features.vertexPipelineStoresAndAtomics, "vertexPipelineStoresAndAtomics"),
std::make_pair(features.independentBlend, "independentBlend"),
@@ -377,9 +490,9 @@ bool VKDevice::IsSuitable(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDev
return is_suitable;
}
std::vector<const char*> VKDevice::LoadExtensions(const vk::DispatchLoaderDynamic& dldi) {
std::vector<const char*> VKDevice::LoadExtensions() {
std::vector<const char*> extensions;
const auto Test = [&](const vk::ExtensionProperties& extension,
const auto Test = [&](const VkExtensionProperties& extension,
std::optional<std::reference_wrapper<bool>> status, const char* name,
bool push) {
if (extension.extensionName != std::string_view(name)) {
@@ -393,22 +506,13 @@ std::vector<const char*> VKDevice::LoadExtensions(const vk::DispatchLoaderDynami
}
};
extensions.reserve(15);
extensions.push_back(VK_KHR_SWAPCHAIN_EXTENSION_NAME);
extensions.push_back(VK_KHR_16BIT_STORAGE_EXTENSION_NAME);
extensions.push_back(VK_KHR_8BIT_STORAGE_EXTENSION_NAME);
extensions.push_back(VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME);
extensions.push_back(VK_EXT_VERTEX_ATTRIBUTE_DIVISOR_EXTENSION_NAME);
extensions.push_back(VK_EXT_SHADER_SUBGROUP_BALLOT_EXTENSION_NAME);
extensions.push_back(VK_EXT_SHADER_SUBGROUP_VOTE_EXTENSION_NAME);
extensions.push_back(VK_EXT_HOST_QUERY_RESET_EXTENSION_NAME);
extensions.reserve(7 + REQUIRED_EXTENSIONS.size());
extensions.insert(extensions.begin(), REQUIRED_EXTENSIONS.begin(), REQUIRED_EXTENSIONS.end());
[[maybe_unused]] const bool nsight =
std::getenv("NVTX_INJECTION64_PATH") || std::getenv("NSIGHT_LAUNCHED");
bool has_khr_shader_float16_int8{};
bool has_ext_subgroup_size_control{};
bool has_ext_transform_feedback{};
for (const auto& extension : physical.enumerateDeviceExtensionProperties(nullptr, dldi)) {
for (const auto& extension : physical.EnumerateDeviceExtensionProperties()) {
Test(extension, khr_uniform_buffer_standard_layout,
VK_KHR_UNIFORM_BUFFER_STANDARD_LAYOUT_EXTENSION_NAME, true);
Test(extension, has_khr_shader_float16_int8, VK_KHR_SHADER_FLOAT16_INT8_EXTENSION_NAME,
@@ -428,38 +532,67 @@ std::vector<const char*> VKDevice::LoadExtensions(const vk::DispatchLoaderDynami
}
}
VkPhysicalDeviceFeatures2KHR features;
features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR;
VkPhysicalDeviceProperties2KHR properties;
properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR;
if (has_khr_shader_float16_int8) {
is_float16_supported =
GetFeatures<vk::PhysicalDeviceFloat16Int8FeaturesKHR>(physical, dldi).shaderFloat16;
VkPhysicalDeviceFloat16Int8FeaturesKHR float16_int8_features;
float16_int8_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR;
float16_int8_features.pNext = nullptr;
features.pNext = &float16_int8_features;
physical.GetFeatures2KHR(features);
is_float16_supported = float16_int8_features.shaderFloat16;
extensions.push_back(VK_KHR_SHADER_FLOAT16_INT8_EXTENSION_NAME);
}
if (has_ext_subgroup_size_control) {
const auto features =
GetFeatures<vk::PhysicalDeviceSubgroupSizeControlFeaturesEXT>(physical, dldi);
const auto properties =
GetProperties<vk::PhysicalDeviceSubgroupSizeControlPropertiesEXT>(physical, dldi);
VkPhysicalDeviceSubgroupSizeControlFeaturesEXT subgroup_features;
subgroup_features.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_FEATURES_EXT;
subgroup_features.pNext = nullptr;
features.pNext = &subgroup_features;
physical.GetFeatures2KHR(features);
is_warp_potentially_bigger = properties.maxSubgroupSize > GuestWarpSize;
VkPhysicalDeviceSubgroupSizeControlPropertiesEXT subgroup_properties;
subgroup_properties.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_PROPERTIES_EXT;
subgroup_properties.pNext = nullptr;
properties.pNext = &subgroup_properties;
physical.GetProperties2KHR(properties);
if (features.subgroupSizeControl && properties.minSubgroupSize <= GuestWarpSize &&
properties.maxSubgroupSize >= GuestWarpSize) {
is_warp_potentially_bigger = subgroup_properties.maxSubgroupSize > GuestWarpSize;
if (subgroup_features.subgroupSizeControl &&
subgroup_properties.minSubgroupSize <= GuestWarpSize &&
subgroup_properties.maxSubgroupSize >= GuestWarpSize) {
extensions.push_back(VK_EXT_SUBGROUP_SIZE_CONTROL_EXTENSION_NAME);
guest_warp_stages = properties.requiredSubgroupSizeStages;
guest_warp_stages = subgroup_properties.requiredSubgroupSizeStages;
}
} else {
is_warp_potentially_bigger = true;
}
if (has_ext_transform_feedback) {
const auto features =
GetFeatures<vk::PhysicalDeviceTransformFeedbackFeaturesEXT>(physical, dldi);
const auto properties =
GetProperties<vk::PhysicalDeviceTransformFeedbackPropertiesEXT>(physical, dldi);
VkPhysicalDeviceTransformFeedbackFeaturesEXT tfb_features;
tfb_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT;
tfb_features.pNext = nullptr;
features.pNext = &tfb_features;
physical.GetFeatures2KHR(features);
if (features.transformFeedback && features.geometryStreams &&
properties.maxTransformFeedbackStreams >= 4 && properties.maxTransformFeedbackBuffers &&
properties.transformFeedbackQueries && properties.transformFeedbackDraw) {
VkPhysicalDeviceTransformFeedbackPropertiesEXT tfb_properties;
tfb_properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_PROPERTIES_EXT;
tfb_properties.pNext = nullptr;
properties.pNext = &tfb_properties;
physical.GetProperties2KHR(properties);
if (tfb_features.transformFeedback && tfb_features.geometryStreams &&
tfb_properties.maxTransformFeedbackStreams >= 4 &&
tfb_properties.maxTransformFeedbackBuffers && tfb_properties.transformFeedbackQueries &&
tfb_properties.transformFeedbackDraw) {
extensions.push_back(VK_EXT_TRANSFORM_FEEDBACK_EXTENSION_NAME);
ext_transform_feedback = true;
}
@@ -468,10 +601,10 @@ std::vector<const char*> VKDevice::LoadExtensions(const vk::DispatchLoaderDynami
return extensions;
}
void VKDevice::SetupFamilies(const vk::DispatchLoaderDynamic& dldi, vk::SurfaceKHR surface) {
void VKDevice::SetupFamilies(VkSurfaceKHR surface) {
std::optional<u32> graphics_family_, present_family_;
const auto queue_family_properties = physical.getQueueFamilyProperties(dldi);
const std::vector queue_family_properties = physical.GetQueueFamilyProperties();
for (u32 i = 0; i < static_cast<u32>(queue_family_properties.size()); ++i) {
if (graphics_family_ && present_family_)
break;
@@ -480,10 +613,12 @@ void VKDevice::SetupFamilies(const vk::DispatchLoaderDynamic& dldi, vk::SurfaceK
if (queue_family.queueCount == 0)
continue;
if (queue_family.queueFlags & vk::QueueFlagBits::eGraphics)
if (queue_family.queueFlags & VK_QUEUE_GRAPHICS_BIT) {
graphics_family_ = i;
if (physical.getSurfaceSupportKHR(i, surface, dldi))
}
if (physical.GetSurfaceSupportKHR(i, surface)) {
present_family_ = i;
}
}
ASSERT(graphics_family_ && present_family_);
@@ -491,111 +626,49 @@ void VKDevice::SetupFamilies(const vk::DispatchLoaderDynamic& dldi, vk::SurfaceK
present_family = *present_family_;
}
void VKDevice::SetupFeatures(const vk::DispatchLoaderDynamic& dldi) {
const auto supported_features{physical.getFeatures(dldi)};
void VKDevice::SetupFeatures() {
const auto supported_features{physical.GetFeatures()};
is_formatless_image_load_supported = supported_features.shaderStorageImageReadWithoutFormat;
is_optimal_astc_supported = IsOptimalAstcSupported(supported_features, dldi);
is_optimal_astc_supported = IsOptimalAstcSupported(supported_features);
}
void VKDevice::CollectTelemetryParameters() {
const auto driver = GetProperties<vk::PhysicalDeviceDriverPropertiesKHR>(physical, dld);
VkPhysicalDeviceDriverPropertiesKHR driver;
driver.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR;
driver.pNext = nullptr;
VkPhysicalDeviceProperties2KHR properties;
properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR;
properties.pNext = &driver;
physical.GetProperties2KHR(properties);
driver_id = driver.driverID;
vendor_name = driver.driverName;
const auto extensions = physical.enumerateDeviceExtensionProperties(nullptr, dld);
const std::vector extensions = physical.EnumerateDeviceExtensionProperties();
reported_extensions.reserve(std::size(extensions));
for (const auto& extension : extensions) {
reported_extensions.push_back(extension.extensionName);
}
}
std::vector<vk::DeviceQueueCreateInfo> VKDevice::GetDeviceQueueCreateInfos() const {
static const float QUEUE_PRIORITY = 1.0f;
std::vector<VkDeviceQueueCreateInfo> VKDevice::GetDeviceQueueCreateInfos() const {
static constexpr float QUEUE_PRIORITY = 1.0f;
std::set<u32> unique_queue_families = {graphics_family, present_family};
std::vector<vk::DeviceQueueCreateInfo> queue_cis;
std::unordered_set<u32> unique_queue_families = {graphics_family, present_family};
std::vector<VkDeviceQueueCreateInfo> queue_cis;
for (u32 queue_family : unique_queue_families)
queue_cis.push_back({{}, queue_family, 1, &QUEUE_PRIORITY});
for (const u32 queue_family : unique_queue_families) {
VkDeviceQueueCreateInfo& ci = queue_cis.emplace_back();
ci.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.queueFamilyIndex = queue_family;
ci.queueCount = 1;
ci.pQueuePriorities = &QUEUE_PRIORITY;
}
return queue_cis;
}
std::unordered_map<vk::Format, vk::FormatProperties> VKDevice::GetFormatProperties(
const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDevice physical) {
static constexpr std::array formats{vk::Format::eA8B8G8R8UnormPack32,
vk::Format::eA8B8G8R8UintPack32,
vk::Format::eA8B8G8R8SnormPack32,
vk::Format::eA8B8G8R8SrgbPack32,
vk::Format::eB5G6R5UnormPack16,
vk::Format::eA2B10G10R10UnormPack32,
vk::Format::eA1R5G5B5UnormPack16,
vk::Format::eR32G32B32A32Sfloat,
vk::Format::eR32G32B32A32Uint,
vk::Format::eR32G32Sfloat,
vk::Format::eR32G32Uint,
vk::Format::eR16G16B16A16Uint,
vk::Format::eR16G16B16A16Snorm,
vk::Format::eR16G16B16A16Unorm,
vk::Format::eR16G16Unorm,
vk::Format::eR16G16Snorm,
vk::Format::eR16G16Sfloat,
vk::Format::eR16Unorm,
vk::Format::eR8G8B8A8Srgb,
vk::Format::eR8G8Unorm,
vk::Format::eR8G8Snorm,
vk::Format::eR8Unorm,
vk::Format::eR8Uint,
vk::Format::eB10G11R11UfloatPack32,
vk::Format::eR32Sfloat,
vk::Format::eR32Uint,
vk::Format::eR32Sint,
vk::Format::eR16Sfloat,
vk::Format::eR16G16B16A16Sfloat,
vk::Format::eB8G8R8A8Unorm,
vk::Format::eR4G4B4A4UnormPack16,
vk::Format::eD32Sfloat,
vk::Format::eD16Unorm,
vk::Format::eD16UnormS8Uint,
vk::Format::eD24UnormS8Uint,
vk::Format::eD32SfloatS8Uint,
vk::Format::eBc1RgbaUnormBlock,
vk::Format::eBc2UnormBlock,
vk::Format::eBc3UnormBlock,
vk::Format::eBc4UnormBlock,
vk::Format::eBc5UnormBlock,
vk::Format::eBc5SnormBlock,
vk::Format::eBc7UnormBlock,
vk::Format::eBc6HUfloatBlock,
vk::Format::eBc6HSfloatBlock,
vk::Format::eBc1RgbaSrgbBlock,
vk::Format::eBc2SrgbBlock,
vk::Format::eBc3SrgbBlock,
vk::Format::eBc7SrgbBlock,
vk::Format::eAstc4x4SrgbBlock,
vk::Format::eAstc8x8SrgbBlock,
vk::Format::eAstc8x5SrgbBlock,
vk::Format::eAstc5x4SrgbBlock,
vk::Format::eAstc5x5UnormBlock,
vk::Format::eAstc5x5SrgbBlock,
vk::Format::eAstc10x8UnormBlock,
vk::Format::eAstc10x8SrgbBlock,
vk::Format::eAstc6x6UnormBlock,
vk::Format::eAstc6x6SrgbBlock,
vk::Format::eAstc10x10UnormBlock,
vk::Format::eAstc10x10SrgbBlock,
vk::Format::eAstc12x12UnormBlock,
vk::Format::eAstc12x12SrgbBlock,
vk::Format::eAstc8x6UnormBlock,
vk::Format::eAstc8x6SrgbBlock,
vk::Format::eAstc6x5UnormBlock,
vk::Format::eAstc6x5SrgbBlock,
vk::Format::eE5B9G9R9UfloatPack32};
std::unordered_map<vk::Format, vk::FormatProperties> format_properties;
for (const auto format : formats) {
format_properties.emplace(format, physical.getFormatProperties(format, dldi));
}
return format_properties;
}
} // namespace Vulkan

View File

@@ -8,8 +8,9 @@
#include <string_view>
#include <unordered_map>
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -22,12 +23,12 @@ const u32 GuestWarpSize = 32;
/// Handles data specific to a physical device.
class VKDevice final {
public:
explicit VKDevice(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDevice physical,
vk::SurfaceKHR surface);
explicit VKDevice(VkInstance instance, vk::PhysicalDevice physical, VkSurfaceKHR surface,
const vk::InstanceDispatch& dld);
~VKDevice();
/// Initializes the device. Returns true on success.
bool Create(const vk::DispatchLoaderDynamic& dldi, vk::Instance instance);
bool Create();
/**
* Returns a format supported by the device for the passed requeriments.
@@ -36,20 +37,20 @@ public:
* @param format_type Format type usage.
* @returns A format supported by the device.
*/
vk::Format GetSupportedFormat(vk::Format wanted_format, vk::FormatFeatureFlags wanted_usage,
FormatType format_type) const;
VkFormat GetSupportedFormat(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const;
/// Reports a device loss.
void ReportLoss() const;
/// Returns the dispatch loader with direct function pointers of the device.
const vk::DispatchLoaderDynamic& GetDispatchLoader() const {
const vk::DeviceDispatch& GetDispatchLoader() const {
return dld;
}
/// Returns the logical device.
vk::Device GetLogical() const {
return logical.get();
const vk::Device& GetLogical() const {
return logical;
}
/// Returns the physical device.
@@ -79,7 +80,7 @@ public:
/// Returns true if the device is integrated with the host CPU.
bool IsIntegrated() const {
return properties.deviceType == vk::PhysicalDeviceType::eIntegratedGpu;
return properties.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU;
}
/// Returns the current Vulkan API version provided in Vulkan-formatted version numbers.
@@ -98,27 +99,27 @@ public:
}
/// Returns the driver ID.
vk::DriverIdKHR GetDriverID() const {
VkDriverIdKHR GetDriverID() const {
return driver_id;
}
/// Returns uniform buffer alignment requeriment.
vk::DeviceSize GetUniformBufferAlignment() const {
VkDeviceSize GetUniformBufferAlignment() const {
return properties.limits.minUniformBufferOffsetAlignment;
}
/// Returns storage alignment requeriment.
vk::DeviceSize GetStorageBufferAlignment() const {
VkDeviceSize GetStorageBufferAlignment() const {
return properties.limits.minStorageBufferOffsetAlignment;
}
/// Returns the maximum range for storage buffers.
vk::DeviceSize GetMaxStorageBufferRange() const {
VkDeviceSize GetMaxStorageBufferRange() const {
return properties.limits.maxStorageBufferRange;
}
/// Returns the maximum size for push constants.
vk::DeviceSize GetMaxPushConstantsSize() const {
VkDeviceSize GetMaxPushConstantsSize() const {
return properties.limits.maxPushConstantsSize;
}
@@ -138,8 +139,8 @@ public:
}
/// Returns true if the device can be forced to use the guest warp size.
bool IsGuestWarpSizeSupported(vk::ShaderStageFlagBits stage) const {
return (guest_warp_stages & stage) != vk::ShaderStageFlags{};
bool IsGuestWarpSizeSupported(VkShaderStageFlagBits stage) const {
return guest_warp_stages & stage;
}
/// Returns true if formatless image load is supported.
@@ -188,50 +189,44 @@ public:
}
/// Checks if the physical device is suitable.
static bool IsSuitable(const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDevice physical,
vk::SurfaceKHR surface);
static bool IsSuitable(vk::PhysicalDevice physical, VkSurfaceKHR surface);
private:
/// Loads extensions into a vector and stores available ones in this object.
std::vector<const char*> LoadExtensions(const vk::DispatchLoaderDynamic& dldi);
std::vector<const char*> LoadExtensions();
/// Sets up queue families.
void SetupFamilies(const vk::DispatchLoaderDynamic& dldi, vk::SurfaceKHR surface);
void SetupFamilies(VkSurfaceKHR surface);
/// Sets up device features.
void SetupFeatures(const vk::DispatchLoaderDynamic& dldi);
void SetupFeatures();
/// Collects telemetry information from the device.
void CollectTelemetryParameters();
/// Returns a list of queue initialization descriptors.
std::vector<vk::DeviceQueueCreateInfo> GetDeviceQueueCreateInfos() const;
std::vector<VkDeviceQueueCreateInfo> GetDeviceQueueCreateInfos() const;
/// Returns true if ASTC textures are natively supported.
bool IsOptimalAstcSupported(const vk::PhysicalDeviceFeatures& features,
const vk::DispatchLoaderDynamic& dldi) const;
bool IsOptimalAstcSupported(const VkPhysicalDeviceFeatures& features) const;
/// Returns true if a format is supported.
bool IsFormatSupported(vk::Format wanted_format, vk::FormatFeatureFlags wanted_usage,
bool IsFormatSupported(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const;
/// Returns the device properties for Vulkan formats.
static std::unordered_map<vk::Format, vk::FormatProperties> GetFormatProperties(
const vk::DispatchLoaderDynamic& dldi, vk::PhysicalDevice physical);
const vk::PhysicalDevice physical; ///< Physical device.
vk::DispatchLoaderDynamic dld; ///< Device function pointers.
vk::PhysicalDeviceProperties properties; ///< Device properties.
UniqueDevice logical; ///< Logical device.
vk::Queue graphics_queue; ///< Main graphics queue.
vk::Queue present_queue; ///< Main present queue.
u32 graphics_family{}; ///< Main graphics queue family index.
u32 present_family{}; ///< Main present queue family index.
vk::DriverIdKHR driver_id{}; ///< Driver ID.
vk::ShaderStageFlags guest_warp_stages{}; ///< Stages where the guest warp size can be forced.ed
bool is_optimal_astc_supported{}; ///< Support for native ASTC.
bool is_float16_supported{}; ///< Support for float16 arithmetics.
bool is_warp_potentially_bigger{}; ///< Host warp size can be bigger than guest.
vk::DeviceDispatch dld; ///< Device function pointers.
vk::PhysicalDevice physical; ///< Physical device.
VkPhysicalDeviceProperties properties; ///< Device properties.
vk::Device logical; ///< Logical device.
vk::Queue graphics_queue; ///< Main graphics queue.
vk::Queue present_queue; ///< Main present queue.
u32 graphics_family{}; ///< Main graphics queue family index.
u32 present_family{}; ///< Main present queue family index.
VkDriverIdKHR driver_id{}; ///< Driver ID.
VkShaderStageFlags guest_warp_stages{}; ///< Stages where the guest warp size can be forced.ed
bool is_optimal_astc_supported{}; ///< Support for native ASTC.
bool is_float16_supported{}; ///< Support for float16 arithmetics.
bool is_warp_potentially_bigger{}; ///< Host warp size can be bigger than guest.
bool is_formatless_image_load_supported{}; ///< Support for shader image read without format.
bool khr_uniform_buffer_standard_layout{}; ///< Support for std430 on UBOs.
bool ext_index_type_uint8{}; ///< Support for VK_EXT_index_type_uint8.
@@ -245,7 +240,7 @@ private:
std::vector<std::string> reported_extensions; ///< Reported Vulkan extensions.
/// Format properties dictionary.
std::unordered_map<vk::Format, vk::FormatProperties> format_properties;
std::unordered_map<VkFormat, VkFormatProperties> format_properties;
};
} // namespace Vulkan

Some files were not shown because too many files have changed in this diff Show More