Compare commits

...

56 Commits

Author SHA1 Message Date
Lioncash
1c340c6efa CMakeLists: Specify -Wextra on linux builds
Allows reporting more cases where logic errors may exist, such as
implicit fallthrough cases, etc.

We currently ignore unused parameters, since we currently have many
cases where this is intentional (virtual interfaces).

While we're at it, we can also tidy up any existing code that causes
warnings. This also uncovered a few bugs as well.
2020-04-15 21:33:46 -04:00
Fernando Sahmkow
e33196d4e7 Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
Mat M
4398bdb4c7 Merge pull request #3670 from lioncash/reorder
CMakeLists: Make -Wreorder a compile-time error
2020-04-15 14:40:05 -04:00
Lioncash
213fff67bc CMakeLists: Make -Wreorder a compile-time error
This can result in silent logic bugs within code, and given the amount
of times these kind of warnings are caused, they should be flagged at
compile-time so no new code is submitted with them.
2020-04-15 14:14:41 -04:00
Mat M
64b5985f0a Merge pull request #3662 from ReinUsesLisp/constant-attrs
gl_rasterizer: Implement constant vertex attributes
2020-04-15 11:54:50 -04:00
Mat M
9208d555b7 Merge pull request #3668 from ReinUsesLisp/vtx-format-16ui
maxwell_to_vk: Add uint16 vertex formats
2020-04-15 11:43:52 -04:00
Mat M
ab72696beb Merge pull request #3656 from ReinUsesLisp/glsl-full-decompile
gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
2020-04-15 03:17:46 -04:00
Mat M
4878d6bb49 Merge pull request #3654 from ReinUsesLisp/fix-fb-attach
gl_texture_cache: Fix layered texture attachment base level
2020-04-15 03:17:18 -04:00
Mat M
50c0a92db8 Merge pull request #3663 from ReinUsesLisp/fcmp-rc
shader/arithmetic: Add FCMP_CR variant
2020-04-15 03:16:56 -04:00
Mat M
13331a3a32 Merge pull request #3664 from ReinUsesLisp/fe3h-black-squares
Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
2020-04-15 03:14:28 -04:00
Mat M
3a759d2352 Merge pull request #3667 from ReinUsesLisp/viewport-trash
vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
2020-04-15 03:10:34 -04:00
ReinUsesLisp
3036067047 maxwell_to_vk: Add uint16 vertex formats 2020-04-15 04:06:30 -03:00
ReinUsesLisp
b4e43c64c8 maxwell_to_vk: Add missing breaks
Avoid invalid fallbacks.
2020-04-15 04:05:33 -03:00
ReinUsesLisp
0ca456830f vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
When the dynamic state is specified, pViewports and pScissors are
ignored, quoting the specification:

  pViewports is a pointer to an array of VkViewport structures, defining
  the viewport transforms. If the viewport state is dynamic, this member
  is ignored.

That said, AMD's proprietary driver itself seem to read it regardless of
what the specification says.
2020-04-15 03:30:08 -03:00
Rodrigo Locatti
0b132e8cc1 Merge pull request #3657 from ReinUsesLisp/viewport-zero
vk_rasterizer: Default to 1 viewports with a size of 0
2020-04-15 01:51:17 -03:00
Fernando Sahmkow
daddbeffd1 Texture Cache: Only do buffer copies on accurate GPU. (#3634)
This is a simple optimization as Buffer Copies are mostly used for texture recycling. They are, however, useful when games abuse undefined behavior but most 3D APIs forbid it.
2020-04-14 23:21:00 -04:00
ReinUsesLisp
fd6371eba7 Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
This reverts commit 05cf270836.

Apparently the first approach using floats instead of bitfieldInert
worked better for Fire Emblem: Three Houses. Reverting to get that
behavior back.
2020-04-14 21:24:33 -03:00
ReinUsesLisp
fefe7f18f9 shader/arithmetic: Add FCMP_CR variant
Adds another variant of FCMP.
2020-04-14 19:11:04 -03:00
Zach Hilman
e366b4ee1f Merge pull request #3660 from bunnei/friend-blocked-users
service: friend: Stub IFriendService::GetBlockedUserListIds.
2020-04-14 16:59:46 -04:00
Zach Hilman
8040f6d544 Merge pull request #3661 from bunnei/patch-manager-fix
file_sys: patch_manager: Return early when there are no layers to apply.
2020-04-14 16:59:25 -04:00
ReinUsesLisp
6dfcabc800 gl_rasterizer: Implement constant vertex attributes
Credits go to gdkchan from Ryujinx for finding constant attributes are
used in retail games.
2020-04-14 17:58:53 -03:00
bunnei
fc35803f91 file_sys: patch_manager: Return early when there are no layers to apply. 2020-04-14 16:25:55 -04:00
bunnei
598740f1dd service: friend: Stub IFriendService::GetBlockedUserListIds.
- This is safe to stub, as there should be no adverse consequences from reporting no blocked users.
2020-04-14 16:20:51 -04:00
ReinUsesLisp
37e5c4fa7c vk_rasterizer: Default to 1 viewports with a size of 0
Silence validation layer errors.
2020-04-14 04:44:34 -03:00
ReinUsesLisp
453d7419d9 gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
From my testing on a Splatoon 2 shader that takes 3800ms on average to
compile changing to FullDecompile reduces it to 900ms on average.

The shader decoder will automatically fallback to a more naive method if
it can't use full decompile.
2020-04-14 01:34:20 -03:00
ReinUsesLisp
21dc842171 gl_texture_cache: Fix layered texture attachment base level
The base level is already included in the texture view. If we specify
the base level in the texture again, this will end up in the incorrect
level and potentially out of bounds.
2020-04-13 18:24:56 -03:00
Rodrigo Locatti
7e4a132a77 Merge pull request #3636 from ReinUsesLisp/drop-vk-hpp
renderer_vulkan: Drop Vulkan-Hpp
2020-04-13 17:08:04 -03:00
Mat M
fbf13d3f48 Merge pull request #3651 from ReinUsesLisp/line-widths
gl_rasterizer: Implement line widths and smooth lines
2020-04-13 10:19:59 -04:00
Mat M
08266d70ba Merge pull request #3638 from ReinUsesLisp/remove-preserve-contents
texture_cache: Remove preserve_contents
2020-04-13 10:19:01 -04:00
Mat M
c4001225f6 Merge pull request #3631 from ReinUsesLisp/more-astc
texture/astc: More small ASTC optimizations
2020-04-13 10:17:32 -04:00
Mat M
7b62212461 Merge pull request #3619 from ReinUsesLisp/i2i
shader/conversion: Implement I2I sign extension, saturation and selection
2020-04-13 10:17:07 -04:00
Mat M
3351e1e94f Merge pull request #3627 from ReinUsesLisp/layered-view
gl_texture_cache: Attach view instead of base texture for layered attchments
2020-04-13 10:16:18 -04:00
Mat M
d37d899431 Merge pull request #3646 from ReinUsesLisp/fix-glsl-turing
gl_shader_decompiler: Improve generated code in HMergeH*
2020-04-13 10:15:12 -04:00
Mat M
47036859eb Merge pull request #3633 from ReinUsesLisp/clean-texdec
shader/texture: Remove type mismatches management from shader decoder
2020-04-13 10:13:05 -04:00
ReinUsesLisp
76615b9f34 gl_rasterizer: Implement line widths and smooth lines
Implements "legacy" features from OpenGL present on hardware such as
smooth lines and line width.
2020-04-13 01:30:34 -03:00
bunnei
a9f866264d Merge pull request #3606 from ReinUsesLisp/nvflinger
service/vi: Partially implement BufferQueue disconnect
2020-04-12 11:44:48 -04:00
Fernando Sahmkow
3d91dbb21d Merge pull request #3578 from ReinUsesLisp/vmnmx
shader/video: Partially implement VMNMX
2020-04-12 10:44:03 -04:00
Mat M
4aec01b850 Merge pull request #3644 from ReinUsesLisp/msaa
video_core: Add MSAA registers in 3D engine and TIC
2020-04-12 09:11:44 -04:00
ReinUsesLisp
76f178ba6e shader/video: Partially implement VMNMX
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.

VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.

This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
2020-04-12 00:34:42 -03:00
ReinUsesLisp
a7baf6fee4 video_core: Add MSAA registers in 3D engine and TIC
This adds the registers used for multisampling. It doesn't implement
anything for now.
2020-04-12 00:21:27 -03:00
ReinUsesLisp
94b0e2e5da texture_cache: Remove preserve_contents
preserve_contents was always true. We can't assume we don't have to
preserve clears because scissored and color masked clears exist.

This removes preserve_contents and assumes it as true at all times.
2020-04-11 01:51:02 -03:00
ReinUsesLisp
2905142f47 renderer_vulkan: Drop Vulkan-Hpp 2020-04-10 22:49:02 -03:00
ReinUsesLisp
8c0ba9c6fe service/vi: Partially implement BufferQueue disconnect 2020-04-10 01:00:50 -03:00
ReinUsesLisp
a87b16da9a shader/texture: Remove type mismatches management from shader decoder
Since commit e22816a5bb we handle type mismatches from the CPU.
We don't need to hack our shader decoder due to game bugs anymore.

Removed in this commit.
2020-04-10 00:57:32 -03:00
ReinUsesLisp
6bf5d2b011 astc: Hard code bit depth changes to 8 and use fast replicate 2020-04-09 18:37:12 -03:00
ReinUsesLisp
bd2c1ab8a0 astc: Use boost's static_vector to avoid heap allocations 2020-04-09 05:27:57 -03:00
ReinUsesLisp
5de130beea astc: Implement a fast precompiled alternative for Replicate 2020-04-09 03:58:25 -03:00
ReinUsesLisp
6b4d4473be astc: Move Replicate to a constexpr LUT when possible 2020-04-09 03:35:07 -03:00
ReinUsesLisp
d22a689250 astc: Make InputBitStream constexpr 2020-04-09 02:54:05 -03:00
ReinUsesLisp
0efc230381 astc: OutputBitStream style changes and make it constexpr 2020-04-09 02:37:51 -03:00
ReinUsesLisp
6c8f9f40d7 gl_texture_cache: Attach view instead of base texture for layered attachments
This way we are not ignoring the base layer of the current texture.
2020-04-08 22:20:25 -03:00
ReinUsesLisp
da706cad25 shader/conversion: Implement I2I sign extension, saturation and selection
Reimplements I2I adding sign extension, saturation (clamp source value
to the destination), selection and destination sizes that are not 32
bits wide.

It doesn't implement CC yet.
2020-04-07 02:19:44 -03:00
ReinUsesLisp
3185245845 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
ReinUsesLisp
fd0a2b5151 shader/memory: Add "using std::move" 2020-04-06 02:18:14 -03:00
ReinUsesLisp
79970c9174 shader/memory: Minor fixes in ATOM 2020-04-06 00:54:22 -03:00
ReinUsesLisp
08470d261d shader_bytecode: Fix I2I_IMM encoding 2020-03-28 18:49:07 -03:00
100 changed files with 3686 additions and 2553 deletions

View File

@@ -53,7 +53,11 @@ if (MSVC)
else()
add_compile_options(
-Wall
-Werror=implicit-fallthrough
-Werror=reorder
-Wextra
-Wno-attributes
-Wno-unused-parameter
)
if (APPLE AND CMAKE_CXX_COMPILER_ID STREQUAL Clang)

View File

@@ -123,7 +123,7 @@ Symbols GetSymbols(VAddr text_offset, Memory::Memory& memory) {
std::optional<std::string> GetSymbolName(const Symbols& symbols, VAddr func_address) {
const auto iter =
std::find_if(symbols.begin(), symbols.end(), [func_address](const auto& pair) {
const auto& [symbol, name] = pair;
const auto& symbol = pair.first;
const auto end_address = symbol.value + symbol.size;
return func_address >= symbol.value && func_address < end_address;
});
@@ -146,7 +146,7 @@ std::vector<ARM_Interface::BacktraceEntry> ARM_Interface::GetBacktrace() const {
auto fp = GetReg(29);
auto lr = GetReg(30);
while (true) {
out.push_back({"", 0, lr, 0});
out.push_back({"", 0, lr, 0, ""});
if (!fp) {
break;
}

View File

@@ -348,6 +348,12 @@ static void ApplyLayeredFS(VirtualFile& romfs, u64 title_id, ContentRecordType t
if (ext_dir != nullptr)
layers_ext.push_back(std::move(ext_dir));
}
// When there are no layers to apply, return early as there is no need to rebuild the RomFS
if (layers.empty() && layers_ext.empty()) {
return;
}
layers.push_back(std::move(extracted));
auto layered = LayeredVfsDirectory::MakeLayeredDirectory(std::move(layers));
@@ -434,7 +440,8 @@ std::map<std::string, std::string, std::less<>> PatchManager::GetPatchVersionNam
// Game Updates
const auto update_tid = GetUpdateTitleID(title_id);
PatchManager update{update_tid};
auto [nacp, discard_icon_file] = update.GetControlMetadata();
const auto metadata = update.GetControlMetadata();
const auto& nacp = metadata.first;
const auto update_disabled =
std::find(disabled.cbegin(), disabled.cend(), "Update") != disabled.cend();

View File

@@ -591,14 +591,18 @@ InstallResult RegisteredCache::InstallEntry(const NSP& nsp, bool overwrite_if_ex
InstallResult RegisteredCache::InstallEntry(const NCA& nca, TitleType type,
bool overwrite_if_exists, const VfsCopyFunction& copy) {
CNMTHeader header{
nca.GetTitleId(), ///< Title ID
0, ///< Ignore/Default title version
type, ///< Type
{}, ///< Padding
0x10, ///< Default table offset
1, ///< 1 Content Entry
0, ///< No Meta Entries
{}, ///< Padding
nca.GetTitleId(), // Title ID
0, // Ignore/Default title version
type, // Type
{}, // Padding
0x10, // Default table offset
1, // 1 Content Entry
0, // No Meta Entries
{}, // Padding
{}, // Reserved 1
0, // Is committed
0, // Required download system version
{}, // Reserved 2
};
OptionalHeader opt_header{0, 0};
ContentRecord c_rec{{}, {}, {}, GetCRTypeFromNCAType(nca.GetType()), {}};
@@ -848,7 +852,8 @@ VirtualFile ManualContentProvider::GetEntryUnparsed(u64 title_id, ContentRecordT
VirtualFile ManualContentProvider::GetEntryRaw(u64 title_id, ContentRecordType type) const {
const auto iter =
std::find_if(entries.begin(), entries.end(), [title_id, type](const auto& entry) {
const auto [title_type, content_type, e_title_id] = entry.first;
const auto content_type = std::get<1>(entry.first);
const auto e_title_id = std::get<2>(entry.first);
return content_type == type && e_title_id == title_id;
});
if (iter == entries.end())

View File

@@ -42,11 +42,11 @@ VirtualDir ExtractZIP(VirtualFile file) {
continue;
if (name.back() != '/') {
std::unique_ptr<zip_file_t, decltype(&zip_fclose)> file{
std::unique_ptr<zip_file_t, decltype(&zip_fclose)> file2{
zip_fopen_index(zip.get(), i, 0), zip_fclose};
std::vector<u8> buf(stat.size);
if (zip_fread(file.get(), buf.data(), buf.size()) != buf.size())
if (zip_fread(file2.get(), buf.data(), buf.size()) != s64(buf.size()))
return nullptr;
const auto parts = FileUtil::SplitPathComponents(stat.name);

View File

@@ -25,7 +25,7 @@ FramebufferLayout DefaultFrameLayout(u32 width, u32 height) {
ASSERT(height > 0);
// The drawing code needs at least somewhat valid values for both screens
// so just calculate them both even if the other isn't showing.
FramebufferLayout res{width, height};
FramebufferLayout res{width, height, false, {}};
const float window_aspect_ratio = static_cast<float>(height) / width;
const float emulation_aspect_ratio = EmulationAspectRatio(

View File

@@ -284,17 +284,17 @@ ResultCode HLERequestContext::WriteToOutgoingCommandBuffer(Thread& thread) {
std::vector<u8> HLERequestContext::ReadBuffer(int buffer_index) const {
std::vector<u8> buffer;
const bool is_buffer_a{BufferDescriptorA().size() > buffer_index &&
const bool is_buffer_a{BufferDescriptorA().size() > std::size_t(buffer_index) &&
BufferDescriptorA()[buffer_index].Size()};
auto& memory = Core::System::GetInstance().Memory();
if (is_buffer_a) {
ASSERT_MSG(BufferDescriptorA().size() > buffer_index,
ASSERT_MSG(BufferDescriptorA().size() > std::size_t(buffer_index),
"BufferDescriptorA invalid buffer_index {}", buffer_index);
buffer.resize(BufferDescriptorA()[buffer_index].Size());
memory.ReadBlock(BufferDescriptorA()[buffer_index].Address(), buffer.data(), buffer.size());
} else {
ASSERT_MSG(BufferDescriptorX().size() > buffer_index,
ASSERT_MSG(BufferDescriptorX().size() > std::size_t(buffer_index),
"BufferDescriptorX invalid buffer_index {}", buffer_index);
buffer.resize(BufferDescriptorX()[buffer_index].Size());
memory.ReadBlock(BufferDescriptorX()[buffer_index].Address(), buffer.data(), buffer.size());
@@ -310,7 +310,7 @@ std::size_t HLERequestContext::WriteBuffer(const void* buffer, std::size_t size,
return 0;
}
const bool is_buffer_b{BufferDescriptorB().size() > buffer_index &&
const bool is_buffer_b{BufferDescriptorB().size() > std::size_t(buffer_index) &&
BufferDescriptorB()[buffer_index].Size()};
const std::size_t buffer_size{GetWriteBufferSize(buffer_index)};
if (size > buffer_size) {
@@ -321,13 +321,13 @@ std::size_t HLERequestContext::WriteBuffer(const void* buffer, std::size_t size,
auto& memory = Core::System::GetInstance().Memory();
if (is_buffer_b) {
ASSERT_MSG(BufferDescriptorB().size() > buffer_index,
ASSERT_MSG(BufferDescriptorB().size() > std::size_t(buffer_index),
"BufferDescriptorB invalid buffer_index {}", buffer_index);
ASSERT_MSG(BufferDescriptorB()[buffer_index].Size() >= size,
"BufferDescriptorB buffer_index {} is not large enough", buffer_index);
memory.WriteBlock(BufferDescriptorB()[buffer_index].Address(), buffer, size);
} else {
ASSERT_MSG(BufferDescriptorC().size() > buffer_index,
ASSERT_MSG(BufferDescriptorC().size() > std::size_t(buffer_index),
"BufferDescriptorC invalid buffer_index {}", buffer_index);
ASSERT_MSG(BufferDescriptorC()[buffer_index].Size() >= size,
"BufferDescriptorC buffer_index {} is not large enough", buffer_index);
@@ -338,16 +338,16 @@ std::size_t HLERequestContext::WriteBuffer(const void* buffer, std::size_t size,
}
std::size_t HLERequestContext::GetReadBufferSize(int buffer_index) const {
const bool is_buffer_a{BufferDescriptorA().size() > buffer_index &&
const bool is_buffer_a{BufferDescriptorA().size() > std::size_t(buffer_index) &&
BufferDescriptorA()[buffer_index].Size()};
if (is_buffer_a) {
ASSERT_MSG(BufferDescriptorA().size() > buffer_index,
ASSERT_MSG(BufferDescriptorA().size() > std::size_t(buffer_index),
"BufferDescriptorA invalid buffer_index {}", buffer_index);
ASSERT_MSG(BufferDescriptorA()[buffer_index].Size() > 0,
"BufferDescriptorA buffer_index {} is empty", buffer_index);
return BufferDescriptorA()[buffer_index].Size();
} else {
ASSERT_MSG(BufferDescriptorX().size() > buffer_index,
ASSERT_MSG(BufferDescriptorX().size() > std::size_t(buffer_index),
"BufferDescriptorX invalid buffer_index {}", buffer_index);
ASSERT_MSG(BufferDescriptorX()[buffer_index].Size() > 0,
"BufferDescriptorX buffer_index {} is empty", buffer_index);
@@ -356,14 +356,14 @@ std::size_t HLERequestContext::GetReadBufferSize(int buffer_index) const {
}
std::size_t HLERequestContext::GetWriteBufferSize(int buffer_index) const {
const bool is_buffer_b{BufferDescriptorB().size() > buffer_index &&
const bool is_buffer_b{BufferDescriptorB().size() > std::size_t(buffer_index) &&
BufferDescriptorB()[buffer_index].Size()};
if (is_buffer_b) {
ASSERT_MSG(BufferDescriptorB().size() > buffer_index,
ASSERT_MSG(BufferDescriptorB().size() > std::size_t(buffer_index),
"BufferDescriptorB invalid buffer_index {}", buffer_index);
return BufferDescriptorB()[buffer_index].Size();
} else {
ASSERT_MSG(BufferDescriptorC().size() > buffer_index,
ASSERT_MSG(BufferDescriptorC().size() > std::size_t(buffer_index),
"BufferDescriptorC invalid buffer_index {}", buffer_index);
return BufferDescriptorC()[buffer_index].Size();
}

View File

@@ -103,7 +103,7 @@ static void ThreadWakeupCallback(u64 thread_handle, [[maybe_unused]] s64 cycles_
struct KernelCore::Impl {
explicit Impl(Core::System& system, KernelCore& kernel)
: system{system}, global_scheduler{kernel}, synchronization{system}, time_manager{system} {}
: global_scheduler{kernel}, synchronization{system}, time_manager{system}, system{system} {}
void Initialize(KernelCore& kernel) {
Shutdown();

View File

@@ -129,7 +129,7 @@ private:
LOG_DEBUG(Service_Audio, "called. rendering_time_limit_percent={}",
rendering_time_limit_percent);
ASSERT(rendering_time_limit_percent >= 0 && rendering_time_limit_percent <= 100);
ASSERT(rendering_time_limit_percent <= 100);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);

View File

@@ -451,7 +451,8 @@ FileSys::SaveDataSize FileSystemController::ReadSaveDataSize(FileSys::SaveDataTy
if (res != Loader::ResultStatus::Success) {
FileSys::PatchManager pm{system.CurrentProcess()->GetTitleID()};
auto [nacp_unique, discard] = pm.GetControlMetadata();
const auto metadata = pm.GetControlMetadata();
const auto& nacp_unique = metadata.first;
if (nacp_unique != nullptr) {
new_size = {nacp_unique->GetDefaultNormalSaveSize(),

View File

@@ -575,6 +575,7 @@ private:
0,
user_id->GetSize(),
{},
{},
});
continue;
@@ -595,6 +596,7 @@ private:
stoull_be(title_id->GetName()),
title_id->GetSize(),
{},
{},
});
}
}
@@ -619,6 +621,7 @@ private:
stoull_be(title_id->GetName()),
title_id->GetSize(),
{},
{},
});
}
}

View File

@@ -27,7 +27,7 @@ public:
{10110, nullptr, "GetFriendProfileImage"},
{10200, nullptr, "SendFriendRequestForApplication"},
{10211, nullptr, "AddFacedFriendRequestForApplication"},
{10400, nullptr, "GetBlockedUserListIds"},
{10400, &IFriendService::GetBlockedUserListIds, "GetBlockedUserListIds"},
{10500, nullptr, "GetProfileList"},
{10600, nullptr, "DeclareOpenOnlinePlaySession"},
{10601, &IFriendService::DeclareCloseOnlinePlaySession, "DeclareCloseOnlinePlaySession"},
@@ -121,6 +121,15 @@ private:
};
static_assert(sizeof(SizedFriendFilter) == 0x10, "SizedFriendFilter is an invalid size");
void GetBlockedUserListIds(Kernel::HLERequestContext& ctx) {
// This is safe to stub, as there should be no adverse consequences from reporting no
// blocked users.
LOG_WARNING(Service_ACC, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u32>(0); // Indicates there are no blocked users
}
void DeclareCloseOnlinePlaySession(Kernel::HLERequestContext& ctx) {
// Stub used by Splatoon 2
LOG_WARNING(Service_ACC, "(STUBBED) called");

View File

@@ -126,6 +126,13 @@ void BufferQueue::ReleaseBuffer(u32 slot) {
buffer_wait_event.writable->Signal();
}
void BufferQueue::Disconnect() {
queue.clear();
queue_sequence.clear();
id = 1;
layer_id = 1;
}
u32 BufferQueue::Query(QueryType type) {
LOG_WARNING(Service, "(STUBBED) called type={}", static_cast<u32>(type));

View File

@@ -87,6 +87,7 @@ public:
Service::Nvidia::MultiFence& multi_fence);
std::optional<std::reference_wrapper<const Buffer>> AcquireBuffer();
void ReleaseBuffer(u32 slot);
void Disconnect();
u32 Query(QueryType type);
u32 GetId() const {

View File

@@ -309,7 +309,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
offset = GetTZName(name, offset);
std_len = offset;
}
if (!std_len) {
if (std_len == 0) {
return {};
}
if (!GetOffset(name, offset, std_offset)) {
@@ -320,7 +320,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
int dest_len{};
int dest_offset{};
const char* dest_name{name + offset};
if (rule.chars.size() < char_count) {
if (rule.chars.size() < std::size_t(char_count)) {
return {};
}
@@ -343,7 +343,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
return {};
}
char_count += dest_len + 1;
if (rule.chars.size() < char_count) {
if (rule.chars.size() < std::size_t(char_count)) {
return {};
}
if (name[offset] != '\0' && name[offset] != ',' && name[offset] != ';') {
@@ -414,7 +414,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
if (is_reversed ||
(start_time < end_time &&
(end_time - start_time < (year_seconds + (std_offset - dest_offset))))) {
if (rule.ats.size() - 2 < time_count) {
if (rule.ats.size() - 2 < std::size_t(time_count)) {
break;
}
@@ -609,7 +609,7 @@ static bool ParseTimeZoneBinary(TimeZoneRule& time_zone_rule, FileSys::VirtualFi
}
const u64 position{(read_offset - sizeof(TzifHeader))};
const std::size_t bytes_read{vfs_file->GetSize() - sizeof(TzifHeader) - position};
const s64 bytes_read = s64(vfs_file->GetSize() - sizeof(TzifHeader) - position);
if (bytes_read < 0) {
return {};
}
@@ -621,11 +621,11 @@ static bool ParseTimeZoneBinary(TimeZoneRule& time_zone_rule, FileSys::VirtualFi
std::array<char, time_zone_name_max + 1> temp_name{};
vfs_file->ReadArray(temp_name.data(), bytes_read, read_offset);
if (bytes_read > 2 && temp_name[0] == '\n' && temp_name[bytes_read - 1] == '\n' &&
time_zone_rule.type_count + 2 <= time_zone_rule.ttis.size()) {
std::size_t(time_zone_rule.type_count) + 2 <= time_zone_rule.ttis.size()) {
temp_name[bytes_read - 1] = '\0';
std::array<char, time_zone_name_max> name{};
std::memcpy(name.data(), temp_name.data() + 1, bytes_read - 1);
std::memcpy(name.data(), temp_name.data() + 1, std::size_t(bytes_read - 1));
TimeZoneRule temp_rule;
if (ParsePosixName(name.data(), temp_rule)) {

View File

@@ -101,8 +101,8 @@ public:
}
std::u16string ReadInterfaceToken() {
u32 unknown = Read<u32_le>();
u32 length = Read<u32_le>();
[[maybe_unused]] const u32 unknown = Read<u32_le>();
const u32 length = Read<u32_le>();
std::u16string token{};
@@ -513,7 +513,8 @@ private:
auto& buffer_queue = nv_flinger->FindBufferQueue(id);
if (transaction == TransactionId::Connect) {
switch (transaction) {
case TransactionId::Connect: {
IGBPConnectRequestParcel request{ctx.ReadBuffer()};
IGBPConnectResponseParcel response{
static_cast<u32>(static_cast<u32>(DisplayResolution::UndockedWidth) *
@@ -521,14 +522,18 @@ private:
static_cast<u32>(static_cast<u32>(DisplayResolution::UndockedHeight) *
Settings::values.resolution_factor)};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::SetPreallocatedBuffer) {
break;
}
case TransactionId::SetPreallocatedBuffer: {
IGBPSetPreallocatedBufferRequestParcel request{ctx.ReadBuffer()};
buffer_queue.SetPreallocatedBuffer(request.data.slot, request.buffer);
IGBPSetPreallocatedBufferResponseParcel response{};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::DequeueBuffer) {
break;
}
case TransactionId::DequeueBuffer: {
IGBPDequeueBufferRequestParcel request{ctx.ReadBuffer()};
const u32 width{request.data.width};
const u32 height{request.data.height};
@@ -556,14 +561,18 @@ private:
},
buffer_queue.GetWritableBufferWaitEvent());
}
} else if (transaction == TransactionId::RequestBuffer) {
break;
}
case TransactionId::RequestBuffer: {
IGBPRequestBufferRequestParcel request{ctx.ReadBuffer()};
auto& buffer = buffer_queue.RequestBuffer(request.slot);
IGBPRequestBufferResponseParcel response{buffer};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::QueueBuffer) {
break;
}
case TransactionId::QueueBuffer: {
IGBPQueueBufferRequestParcel request{ctx.ReadBuffer()};
buffer_queue.QueueBuffer(request.data.slot, request.data.transform,
@@ -572,7 +581,9 @@ private:
IGBPQueueBufferResponseParcel response{1280, 720};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::Query) {
break;
}
case TransactionId::Query: {
IGBPQueryRequestParcel request{ctx.ReadBuffer()};
const u32 value =
@@ -580,15 +591,30 @@ private:
IGBPQueryResponseParcel response{value};
ctx.WriteBuffer(response.Serialize());
} else if (transaction == TransactionId::CancelBuffer) {
break;
}
case TransactionId::CancelBuffer: {
LOG_CRITICAL(Service_VI, "(STUBBED) called, transaction=CancelBuffer");
} else if (transaction == TransactionId::Disconnect ||
transaction == TransactionId::DetachBuffer) {
break;
}
case TransactionId::Disconnect: {
LOG_WARNING(Service_VI, "(STUBBED) called, transaction=Disconnect");
const auto buffer = ctx.ReadBuffer();
buffer_queue.Disconnect();
IGBPEmptyResponseParcel response{};
ctx.WriteBuffer(response.Serialize());
break;
}
case TransactionId::DetachBuffer: {
const auto buffer = ctx.ReadBuffer();
IGBPEmptyResponseParcel response{};
ctx.WriteBuffer(response.Serialize());
} else {
break;
}
default:
ASSERT_MSG(false, "Unimplemented");
}

View File

@@ -55,7 +55,7 @@ void DmntCheatVm::LogOpcode(const CheatVmOpcode& opcode) {
fmt::format("Cond Type: {:X}", static_cast<u32>(begin_cond->cond_type)));
callbacks->CommandLog(fmt::format("Rel Addr: {:X}", begin_cond->rel_address));
callbacks->CommandLog(fmt::format("Value: {:X}", begin_cond->value.bit64));
} else if (auto end_cond = std::get_if<EndConditionalOpcode>(&opcode.opcode)) {
} else if (std::holds_alternative<EndConditionalOpcode>(opcode.opcode)) {
callbacks->CommandLog("Opcode: End Conditional");
} else if (auto ctrl_loop = std::get_if<ControlLoopOpcode>(&opcode.opcode)) {
if (ctrl_loop->start_loop) {
@@ -399,6 +399,7 @@ bool DmntCheatVm::DecodeNextOpcode(CheatVmOpcode& out) {
// 8kkkkkkk
// Just parse the mask.
begin_keypress_cond.key_mask = first_dword & 0x0FFFFFFF;
opcode.opcode = begin_keypress_cond;
} break;
case CheatVmOpcodeType::PerformArithmeticRegister: {
PerformArithmeticRegisterOpcode perform_math_reg{};
@@ -779,7 +780,7 @@ void DmntCheatVm::Execute(const CheatProcessMetadata& metadata) {
if (!cond_met) {
SkipConditionalBlock();
}
} else if (auto end_cond = std::get_if<EndConditionalOpcode>(&cur_opcode.opcode)) {
} else if (std::holds_alternative<EndConditionalOpcode>(cur_opcode.opcode)) {
// Decrement the condition depth.
// We will assume, graciously, that mismatched conditional block ends are a nop.
if (condition_depth > 0) {

View File

@@ -153,9 +153,9 @@ void TelemetrySession::AddInitialInfo(Loader::AppLoader& app_loader) {
app_loader.ReadTitle(name);
if (name.empty()) {
auto [nacp, icon_file] = FileSys::PatchManager(program_id).GetControlMetadata();
if (nacp != nullptr) {
name = nacp->GetApplicationName();
const auto metadata = FileSys::PatchManager(program_id).GetControlMetadata();
if (metadata.first != nullptr) {
name = metadata.first->GetApplicationName();
}
}

View File

@@ -603,6 +603,7 @@ public:
if (std::abs(event.jaxis.value / 32767.0) < 0.5) {
break;
}
[[fallthrough]];
case SDL_JOYBUTTONUP:
case SDL_JOYHATMOTION:
return SDLEventToButtonParamPackage(state, event);

View File

@@ -156,7 +156,6 @@ add_library(video_core STATIC
if (ENABLE_VULKAN)
target_sources(video_core PRIVATE
renderer_vulkan/declarations.h
renderer_vulkan/fixed_pipeline_state.cpp
renderer_vulkan/fixed_pipeline_state.h
renderer_vulkan/maxwell_to_vk.cpp

View File

@@ -303,6 +303,10 @@ public:
return (type == Type::SignedNorm) || (type == Type::UnsignedNorm);
}
bool IsConstant() const {
return constant;
}
bool IsValid() const {
return size != Size::Invalid;
}
@@ -312,6 +316,35 @@ public:
}
};
struct MsaaSampleLocation {
union {
BitField<0, 4, u32> x0;
BitField<4, 4, u32> y0;
BitField<8, 4, u32> x1;
BitField<12, 4, u32> y1;
BitField<16, 4, u32> x2;
BitField<20, 4, u32> y2;
BitField<24, 4, u32> x3;
BitField<28, 4, u32> y3;
};
constexpr std::pair<u32, u32> Location(int index) const {
switch (index) {
case 0:
return {x0, y0};
case 1:
return {x1, y1};
case 2:
return {x2, y2};
case 3:
return {x3, y3};
default:
UNREACHABLE();
return {0, 0};
}
}
};
enum class DepthMode : u32 {
MinusOneToOne = 0,
ZeroToOne = 1,
@@ -793,7 +826,13 @@ public:
u32 rt_separate_frag_data;
INSERT_UNION_PADDING_WORDS(0xC);
INSERT_UNION_PADDING_WORDS(0x1);
u32 multisample_raster_enable;
u32 multisample_raster_samples;
std::array<u32, 4> multisample_sample_mask;
INSERT_UNION_PADDING_WORDS(0x5);
struct {
u32 address_high;
@@ -830,7 +869,16 @@ public:
std::array<VertexAttribute, NumVertexAttributes> vertex_attrib_format;
INSERT_UNION_PADDING_WORDS(0xF);
std::array<MsaaSampleLocation, 4> multisample_sample_locations;
INSERT_UNION_PADDING_WORDS(0x2);
union {
BitField<0, 1, u32> enable;
BitField<4, 3, u32> target;
} multisample_coverage_to_color;
INSERT_UNION_PADDING_WORDS(0x8);
struct {
union {
@@ -922,7 +970,10 @@ public:
BitField<4, 1, u32> triangle_rast_flip;
} screen_y_control;
INSERT_UNION_PADDING_WORDS(0x21);
float line_width_smooth;
float line_width_aliased;
INSERT_UNION_PADDING_WORDS(0x1F);
u32 vb_element_base;
u32 vb_base_instance;
@@ -943,7 +994,7 @@ public:
CounterReset counter_reset;
INSERT_UNION_PADDING_WORDS(0x1);
u32 multisample_enable;
u32 zeta_enable;
@@ -980,7 +1031,7 @@ public:
float polygon_offset_factor;
INSERT_UNION_PADDING_WORDS(0x1);
u32 line_smooth_enable;
struct {
u32 tic_address_high;
@@ -1007,7 +1058,11 @@ public:
float polygon_offset_units;
INSERT_UNION_PADDING_WORDS(0x11);
INSERT_UNION_PADDING_WORDS(0x4);
Tegra::Texture::MsaaMode multisample_mode;
INSERT_UNION_PADDING_WORDS(0xC);
union {
BitField<2, 1, u32> coord_origin;
@@ -1507,12 +1562,17 @@ ASSERT_REG_POSITION(stencil_back_func_ref, 0x3D5);
ASSERT_REG_POSITION(stencil_back_mask, 0x3D6);
ASSERT_REG_POSITION(stencil_back_func_mask, 0x3D7);
ASSERT_REG_POSITION(color_mask_common, 0x3E4);
ASSERT_REG_POSITION(rt_separate_frag_data, 0x3EB);
ASSERT_REG_POSITION(depth_bounds, 0x3E7);
ASSERT_REG_POSITION(rt_separate_frag_data, 0x3EB);
ASSERT_REG_POSITION(multisample_raster_enable, 0x3ED);
ASSERT_REG_POSITION(multisample_raster_samples, 0x3EE);
ASSERT_REG_POSITION(multisample_sample_mask, 0x3EF);
ASSERT_REG_POSITION(zeta, 0x3F8);
ASSERT_REG_POSITION(clear_flags, 0x43E);
ASSERT_REG_POSITION(fill_rectangle, 0x44F);
ASSERT_REG_POSITION(vertex_attrib_format, 0x458);
ASSERT_REG_POSITION(multisample_sample_locations, 0x478);
ASSERT_REG_POSITION(multisample_coverage_to_color, 0x47E);
ASSERT_REG_POSITION(rt_control, 0x487);
ASSERT_REG_POSITION(zeta_width, 0x48a);
ASSERT_REG_POSITION(zeta_height, 0x48b);
@@ -1538,6 +1598,8 @@ ASSERT_REG_POSITION(stencil_front_func_mask, 0x4E6);
ASSERT_REG_POSITION(stencil_front_mask, 0x4E7);
ASSERT_REG_POSITION(frag_color_clamp, 0x4EA);
ASSERT_REG_POSITION(screen_y_control, 0x4EB);
ASSERT_REG_POSITION(line_width_smooth, 0x4EC);
ASSERT_REG_POSITION(line_width_aliased, 0x4ED);
ASSERT_REG_POSITION(vb_element_base, 0x50D);
ASSERT_REG_POSITION(vb_base_instance, 0x50E);
ASSERT_REG_POSITION(clip_distance_enabled, 0x544);
@@ -1545,11 +1607,13 @@ ASSERT_REG_POSITION(samplecnt_enable, 0x545);
ASSERT_REG_POSITION(point_size, 0x546);
ASSERT_REG_POSITION(point_sprite_enable, 0x548);
ASSERT_REG_POSITION(counter_reset, 0x54C);
ASSERT_REG_POSITION(multisample_enable, 0x54D);
ASSERT_REG_POSITION(zeta_enable, 0x54E);
ASSERT_REG_POSITION(multisample_control, 0x54F);
ASSERT_REG_POSITION(condition, 0x554);
ASSERT_REG_POSITION(tsc, 0x557);
ASSERT_REG_POSITION(polygon_offset_factor, 0x55b);
ASSERT_REG_POSITION(polygon_offset_factor, 0x55B);
ASSERT_REG_POSITION(line_smooth_enable, 0x55C);
ASSERT_REG_POSITION(tic, 0x55D);
ASSERT_REG_POSITION(stencil_two_side_enable, 0x565);
ASSERT_REG_POSITION(stencil_back_op_fail, 0x566);
@@ -1558,6 +1622,7 @@ ASSERT_REG_POSITION(stencil_back_op_zpass, 0x568);
ASSERT_REG_POSITION(stencil_back_func_func, 0x569);
ASSERT_REG_POSITION(framebuffer_srgb, 0x56E);
ASSERT_REG_POSITION(polygon_offset_units, 0x56F);
ASSERT_REG_POSITION(multisample_mode, 0x574);
ASSERT_REG_POSITION(point_coord_replace, 0x581);
ASSERT_REG_POSITION(code_address, 0x582);
ASSERT_REG_POSITION(draw, 0x585);

View File

@@ -290,6 +290,23 @@ enum class VmadShr : u64 {
Shr15 = 2,
};
enum class VmnmxType : u64 {
Bits8,
Bits16,
Bits32,
};
enum class VmnmxOperation : u64 {
Mrg_16H = 0,
Mrg_16L = 1,
Mrg_8B0 = 2,
Mrg_8B2 = 3,
Acc = 4,
Min = 5,
Max = 6,
Nop = 7,
};
enum class XmadMode : u64 {
None = 0,
CLo = 1,
@@ -988,6 +1005,12 @@ union Instruction {
BitField<46, 2, u64> cache_mode;
} stg;
union {
BitField<23, 3, AtomicOp> operation;
BitField<48, 1, u64> extended;
BitField<20, 3, GlobalAtomicType> type;
} red;
union {
BitField<52, 4, AtomicOp> operation;
BitField<49, 3, GlobalAtomicType> type;
@@ -1484,7 +1507,7 @@ union Instruction {
TextureType GetTextureType() const {
// The TLDS instruction has a weird encoding for the texture type.
if (texture_info >= 0 && texture_info <= 1) {
if (texture_info <= 1) {
return TextureType::Texture1D;
}
if (texture_info == 2 || texture_info == 8 || texture_info == 12 ||
@@ -1650,6 +1673,42 @@ union Instruction {
BitField<47, 1, u64> cc;
} vmad;
union {
BitField<54, 1, u64> is_dest_signed;
BitField<48, 1, u64> is_src_a_signed;
BitField<49, 1, u64> is_src_b_signed;
BitField<37, 2, u64> src_format_a;
BitField<29, 2, u64> src_format_b;
BitField<56, 1, u64> mx;
BitField<55, 1, u64> sat;
BitField<36, 2, u64> selector_a;
BitField<28, 2, u64> selector_b;
BitField<50, 1, u64> is_op_b_register;
BitField<51, 3, VmnmxOperation> operation;
VmnmxType SourceFormatA() const {
switch (src_format_a) {
case 0b11:
return VmnmxType::Bits32;
case 0b10:
return VmnmxType::Bits16;
default:
return VmnmxType::Bits8;
}
}
VmnmxType SourceFormatB() const {
switch (src_format_b) {
case 0b11:
return VmnmxType::Bits32;
case 0b10:
return VmnmxType::Bits16;
default:
return VmnmxType::Bits8;
}
}
} vmnmx;
union {
BitField<20, 16, u64> imm20_16;
BitField<35, 1, u64> high_b_rr; // used on RR
@@ -1734,6 +1793,7 @@ public:
ST_S,
ST, // Store in generic memory
STG, // Store in global memory
RED, // Reduction operation
ATOM, // Atomic operation on global memory
ATOMS, // Atomic operation on shared memory
AL2P, // Transforms attribute memory into physical memory
@@ -1763,6 +1823,7 @@ public:
MEMBAR,
VMAD,
VSETP,
VMNMX,
FFMA_IMM, // Fused Multiply and Add
FFMA_CR,
FFMA_RC,
@@ -1817,7 +1878,8 @@ public:
ICMP_R,
ICMP_CR,
ICMP_IMM,
FCMP_R,
FCMP_RR,
FCMP_RC,
MUFU, // Multi-Function Operator
RRO_C, // Range Reduction Operator
RRO_R,
@@ -2042,6 +2104,7 @@ private:
INST("1110111101010---", Id::ST_L, Type::Memory, "ST_L"),
INST("101-------------", Id::ST, Type::Memory, "ST"),
INST("1110111011011---", Id::STG, Type::Memory, "STG"),
INST("1110101111111---", Id::RED, Type::Memory, "RED"),
INST("11101101--------", Id::ATOM, Type::Memory, "ATOM"),
INST("11101100--------", Id::ATOMS, Type::Memory, "ATOMS"),
INST("1110111110100---", Id::AL2P, Type::Memory, "AL2P"),
@@ -2070,6 +2133,7 @@ private:
INST("1110111110011---", Id::MEMBAR, Type::Trivial, "MEMBAR"),
INST("01011111--------", Id::VMAD, Type::Video, "VMAD"),
INST("0101000011110---", Id::VSETP, Type::Video, "VSETP"),
INST("0011101---------", Id::VMNMX, Type::Video, "VMNMX"),
INST("0011001-1-------", Id::FFMA_IMM, Type::Ffma, "FFMA_IMM"),
INST("010010011-------", Id::FFMA_CR, Type::Ffma, "FFMA_CR"),
INST("010100011-------", Id::FFMA_RC, Type::Ffma, "FFMA_RC"),
@@ -2124,7 +2188,8 @@ private:
INST("0101110100100---", Id::HSETP2_R, Type::HalfSetPredicate, "HSETP2_R"),
INST("0111111-0-------", Id::HSETP2_IMM, Type::HalfSetPredicate, "HSETP2_IMM"),
INST("0101110100011---", Id::HSET2_R, Type::HalfSet, "HSET2_R"),
INST("010110111010----", Id::FCMP_R, Type::Arithmetic, "FCMP_R"),
INST("010110111010----", Id::FCMP_RR, Type::Arithmetic, "FCMP_RR"),
INST("010010111010----", Id::FCMP_RC, Type::Arithmetic, "FCMP_RC"),
INST("0101000010000---", Id::MUFU, Type::Arithmetic, "MUFU"),
INST("0100110010010---", Id::RRO_C, Type::Arithmetic, "RRO_C"),
INST("0101110010010---", Id::RRO_R, Type::Arithmetic, "RRO_R"),
@@ -2170,7 +2235,7 @@ private:
INST("0011011-11111---", Id::SHF_LEFT_IMM, Type::Shift, "SHF_LEFT_IMM"),
INST("0100110011100---", Id::I2I_C, Type::Conversion, "I2I_C"),
INST("0101110011100---", Id::I2I_R, Type::Conversion, "I2I_R"),
INST("0011101-11100---", Id::I2I_IMM, Type::Conversion, "I2I_IMM"),
INST("0011100-11100---", Id::I2I_IMM, Type::Conversion, "I2I_IMM"),
INST("0100110010111---", Id::I2F_C, Type::Conversion, "I2F_C"),
INST("0101110010111---", Id::I2F_R, Type::Conversion, "I2F_R"),
INST("0011100-10111---", Id::I2F_IMM, Type::Conversion, "I2F_IMM"),

View File

@@ -12,8 +12,9 @@ namespace VideoCommon {
GPUAsynch::GPUAsynch(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer_,
std::unique_ptr<Core::Frontend::GraphicsContext>&& context)
: GPU(system, std::move(renderer_), true), gpu_thread{system}, gpu_context(std::move(context)),
cpu_context(renderer->GetRenderWindow().CreateSharedContext()) {}
: GPU(system, std::move(renderer_), true), gpu_thread{system},
cpu_context(renderer->GetRenderWindow().CreateSharedContext()),
gpu_context(std::move(context)) {}
GPUAsynch::~GPUAsynch() = default;

View File

@@ -140,8 +140,8 @@ void RasterizerOpenGL::SetupVertexFormat() {
const auto attrib = gpu.regs.vertex_attrib_format[index];
const auto gl_index = static_cast<GLuint>(index);
// Ignore invalid attributes.
if (!attrib.IsValid()) {
// Disable constant attributes.
if (attrib.IsConstant()) {
glDisableVertexAttribArray(gl_index);
continue;
}
@@ -345,7 +345,7 @@ void RasterizerOpenGL::ConfigureFramebuffers() {
texture_cache.GuardRenderTargets(true);
View depth_surface = texture_cache.GetDepthBufferSurface(true);
View depth_surface = texture_cache.GetDepthBufferSurface();
const auto& regs = gpu.regs;
UNIMPLEMENTED_IF(regs.rt_separate_frag_data == 0);
@@ -354,7 +354,7 @@ void RasterizerOpenGL::ConfigureFramebuffers() {
FramebufferCacheKey key;
const auto colors_count = static_cast<std::size_t>(regs.rt_control.count);
for (std::size_t index = 0; index < colors_count; ++index) {
View color_surface{texture_cache.GetColorBufferSurface(index, true)};
View color_surface{texture_cache.GetColorBufferSurface(index)};
if (!color_surface) {
continue;
}
@@ -387,12 +387,12 @@ void RasterizerOpenGL::ConfigureClearFramebuffer(bool using_color_fb, bool using
View color_surface;
if (using_color_fb) {
const std::size_t index = regs.clear_buffers.RT;
color_surface = texture_cache.GetColorBufferSurface(index, true);
color_surface = texture_cache.GetColorBufferSurface(index);
texture_cache.MarkColorBufferInUse(index);
}
View depth_surface;
if (using_depth_fb || using_stencil_fb) {
depth_surface = texture_cache.GetDepthBufferSurface(true);
depth_surface = texture_cache.GetDepthBufferSurface();
texture_cache.MarkDepthBufferInUse();
}
texture_cache.GuardRenderTargets(false);
@@ -496,6 +496,7 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
SyncPrimitiveRestart();
SyncScissorTest();
SyncPointState();
SyncLineState();
SyncPolygonOffset();
SyncAlphaTest();
SyncFramebufferSRGB();
@@ -1311,6 +1312,19 @@ void RasterizerOpenGL::SyncPointState() {
glDisable(GL_PROGRAM_POINT_SIZE);
}
void RasterizerOpenGL::SyncLineState() {
auto& gpu = system.GPU().Maxwell3D();
auto& flags = gpu.dirty.flags;
if (!flags[Dirty::LineWidth]) {
return;
}
flags[Dirty::LineWidth] = false;
const auto& regs = gpu.regs;
oglEnable(GL_LINE_SMOOTH, regs.line_smooth_enable);
glLineWidth(regs.line_smooth_enable ? regs.line_width_smooth : regs.line_width_aliased);
}
void RasterizerOpenGL::SyncPolygonOffset() {
auto& gpu = system.GPU().Maxwell3D();
auto& flags = gpu.dirty.flags;

View File

@@ -171,6 +171,9 @@ private:
/// Syncs the point state to match the guest state
void SyncPointState();
/// Syncs the line state to match the guest state
void SyncLineState();
/// Syncs the rasterizer enable state to match the guest state
void SyncRasterizeEnable();

View File

@@ -34,6 +34,8 @@
namespace OpenGL {
using Tegra::Engines::ShaderType;
using VideoCommon::Shader::CompileDepth;
using VideoCommon::Shader::CompilerSettings;
using VideoCommon::Shader::ProgramCode;
using VideoCommon::Shader::Registry;
using VideoCommon::Shader::ShaderIR;
@@ -43,7 +45,7 @@ namespace {
constexpr u32 STAGE_MAIN_OFFSET = 10;
constexpr u32 KERNEL_MAIN_OFFSET = 0;
constexpr VideoCommon::Shader::CompilerSettings COMPILER_SETTINGS{};
constexpr CompilerSettings COMPILER_SETTINGS{CompileDepth::FullDecompile};
/// Gets the address for the specified shader stage program
GPUVAddr GetShaderAddress(Core::System& system, Maxwell::ShaderProgram program) {

View File

@@ -835,7 +835,8 @@ private:
void DeclareConstantBuffers() {
u32 binding = device.GetBaseBindings(stage).uniform_buffer;
for (const auto& [index, cbuf] : ir.GetConstantBuffers()) {
for (const auto& buffers : ir.GetConstantBuffers()) {
const auto index = buffers.first;
code.AddLine("layout (std140, binding = {}) uniform {} {{", binding++,
GetConstBufferBlock(index));
code.AddLine(" uvec4 {}[{}];", GetConstBuffer(index), MAX_CONSTBUFFER_ELEMENTS);
@@ -1821,13 +1822,15 @@ private:
Expression HMergeH0(Operation operation) {
const std::string dest = VisitOperand(operation, 0).AsUint();
const std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("bitfieldInsert({}, {}, 0, 16)", dest, src), Type::Uint};
return {fmt::format("vec2(unpackHalf2x16({}).x, unpackHalf2x16({}).y)", src, dest),
Type::HalfFloat};
}
Expression HMergeH1(Operation operation) {
const std::string dest = VisitOperand(operation, 0).AsUint();
const std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("bitfieldInsert({}, {}, 16, 16)", dest, src), Type::Uint};
return {fmt::format("vec2(unpackHalf2x16({}).x, unpackHalf2x16({}).y)", dest, src),
Type::HalfFloat};
}
Expression HPack2(Operation operation) {
@@ -2117,8 +2120,14 @@ private:
return {};
}
return {fmt::format("atomic{}({}, {})", opname, Visit(operation[0]).GetCode(),
Visit(operation[1]).As(type)),
type};
Visit(operation[1]).AsUint()),
Type::Uint};
}
template <const std::string_view& opname, Type type>
Expression Reduce(Operation operation) {
code.AddLine("{};", Atomic<opname, type>(operation).GetCode());
return {};
}
Expression Branch(Operation operation) {
@@ -2477,6 +2486,20 @@ private:
&GLSLDecompiler::Atomic<Func::Or, Type::Int>,
&GLSLDecompiler::Atomic<Func::Xor, Type::Int>,
&GLSLDecompiler::Reduce<Func::Add, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Min, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Max, Type::Uint>,
&GLSLDecompiler::Reduce<Func::And, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Or, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Xor, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Add, Type::Int>,
&GLSLDecompiler::Reduce<Func::Min, Type::Int>,
&GLSLDecompiler::Reduce<Func::Max, Type::Int>,
&GLSLDecompiler::Reduce<Func::And, Type::Int>,
&GLSLDecompiler::Reduce<Func::Or, Type::Int>,
&GLSLDecompiler::Reduce<Func::Xor, Type::Int>,
&GLSLDecompiler::Branch,
&GLSLDecompiler::BranchIndirect,
&GLSLDecompiler::PushFlowStack,

View File

@@ -185,6 +185,12 @@ void SetupDirtyPointSize(Tables& tables) {
tables[0][OFF(point_sprite_enable)] = PointSize;
}
void SetupDirtyLineWidth(Tables& tables) {
tables[0][OFF(line_width_smooth)] = LineWidth;
tables[0][OFF(line_width_aliased)] = LineWidth;
tables[0][OFF(line_smooth_enable)] = LineWidth;
}
void SetupDirtyClipControl(Tables& tables) {
auto& table = tables[0];
table[OFF(screen_y_control)] = ClipControl;
@@ -233,6 +239,7 @@ void StateTracker::Initialize() {
SetupDirtyLogicOp(tables);
SetupDirtyFragmentClampColor(tables);
SetupDirtyPointSize(tables);
SetupDirtyLineWidth(tables);
SetupDirtyClipControl(tables);
SetupDirtyDepthClampEnabled(tables);
SetupDirtyMisc(tables);

View File

@@ -78,6 +78,7 @@ enum : u8 {
LogicOp,
FragmentClampColor,
PointSize,
LineWidth,
ClipControl,
DepthClampEnabled,

View File

@@ -411,14 +411,13 @@ CachedSurfaceView::~CachedSurfaceView() = default;
void CachedSurfaceView::Attach(GLenum attachment, GLenum target) const {
ASSERT(params.num_levels == 1);
const GLuint texture = surface.GetTexture();
if (params.num_layers > 1) {
// Layered framebuffer attachments
UNIMPLEMENTED_IF(params.base_layer != 0);
switch (params.target) {
case SurfaceTarget::Texture2DArray:
glFramebufferTexture(target, attachment, texture, params.base_level);
glFramebufferTexture(target, attachment, GetTexture(), 0);
break;
default:
UNIMPLEMENTED();
@@ -427,6 +426,7 @@ void CachedSurfaceView::Attach(GLenum attachment, GLenum target) const {
}
const GLenum view_target = surface.GetTarget();
const GLuint texture = surface.GetTexture();
switch (surface.GetSurfaceParams().target) {
case SurfaceTarget::Texture1D:
glFramebufferTexture1D(target, attachment, view_target, texture, params.base_level);

View File

@@ -315,8 +315,8 @@ public:
RendererOpenGL::RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system,
Core::Frontend::GraphicsContext& context)
: VideoCore::RendererBase{emu_window}, emu_window{emu_window}, system{system},
frame_mailbox{}, context{context}, has_debug_tool{HasDebugTool()} {}
: RendererBase{emu_window}, emu_window{emu_window}, system{system}, context{context},
has_debug_tool{HasDebugTool()} {}
RendererOpenGL::~RendererOpenGL() = default;

View File

@@ -1,60 +0,0 @@
// Copyright 2019 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
namespace vk {
class DispatchLoaderDynamic;
}
namespace Vulkan {
constexpr vk::DispatchLoaderDynamic* dont_use_me_dld = nullptr;
}
#define VULKAN_HPP_DEFAULT_DISPATCHER (*::Vulkan::dont_use_me_dld)
#define VULKAN_HPP_ENABLE_DYNAMIC_LOADER_TOOL 0
#define VULKAN_HPP_DISPATCH_LOADER_DYNAMIC 1
#include <vulkan/vulkan.hpp>
namespace Vulkan {
// vulkan.hpp unique handlers use DispatchLoaderStatic
template <typename T>
using UniqueHandle = vk::UniqueHandle<T, vk::DispatchLoaderDynamic>;
using UniqueAccelerationStructureNV = UniqueHandle<vk::AccelerationStructureNV>;
using UniqueBuffer = UniqueHandle<vk::Buffer>;
using UniqueBufferView = UniqueHandle<vk::BufferView>;
using UniqueCommandBuffer = UniqueHandle<vk::CommandBuffer>;
using UniqueCommandPool = UniqueHandle<vk::CommandPool>;
using UniqueDescriptorPool = UniqueHandle<vk::DescriptorPool>;
using UniqueDescriptorSet = UniqueHandle<vk::DescriptorSet>;
using UniqueDescriptorSetLayout = UniqueHandle<vk::DescriptorSetLayout>;
using UniqueDescriptorUpdateTemplate = UniqueHandle<vk::DescriptorUpdateTemplate>;
using UniqueDevice = UniqueHandle<vk::Device>;
using UniqueDeviceMemory = UniqueHandle<vk::DeviceMemory>;
using UniqueEvent = UniqueHandle<vk::Event>;
using UniqueFence = UniqueHandle<vk::Fence>;
using UniqueFramebuffer = UniqueHandle<vk::Framebuffer>;
using UniqueImage = UniqueHandle<vk::Image>;
using UniqueImageView = UniqueHandle<vk::ImageView>;
using UniqueInstance = UniqueHandle<vk::Instance>;
using UniqueIndirectCommandsLayoutNVX = UniqueHandle<vk::IndirectCommandsLayoutNVX>;
using UniqueObjectTableNVX = UniqueHandle<vk::ObjectTableNVX>;
using UniquePipeline = UniqueHandle<vk::Pipeline>;
using UniquePipelineCache = UniqueHandle<vk::PipelineCache>;
using UniquePipelineLayout = UniqueHandle<vk::PipelineLayout>;
using UniqueQueryPool = UniqueHandle<vk::QueryPool>;
using UniqueRenderPass = UniqueHandle<vk::RenderPass>;
using UniqueSampler = UniqueHandle<vk::Sampler>;
using UniqueSamplerYcbcrConversion = UniqueHandle<vk::SamplerYcbcrConversion>;
using UniqueSemaphore = UniqueHandle<vk::Semaphore>;
using UniqueShaderModule = UniqueHandle<vk::ShaderModule>;
using UniqueSurfaceKHR = UniqueHandle<vk::SurfaceKHR>;
using UniqueSwapchainKHR = UniqueHandle<vk::SwapchainKHR>;
using UniqueValidationCacheEXT = UniqueHandle<vk::ValidationCacheEXT>;
using UniqueDebugReportCallbackEXT = UniqueHandle<vk::DebugReportCallbackEXT>;
using UniqueDebugUtilsMessengerEXT = UniqueHandle<vk::DebugUtilsMessengerEXT>;
} // namespace Vulkan

View File

@@ -2,13 +2,15 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <iterator>
#include "common/assert.h"
#include "common/common_types.h"
#include "common/logging/log.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
namespace Vulkan::MaxwellToVK {
@@ -17,88 +19,89 @@ using Maxwell = Tegra::Engines::Maxwell3D::Regs;
namespace Sampler {
vk::Filter Filter(Tegra::Texture::TextureFilter filter) {
VkFilter Filter(Tegra::Texture::TextureFilter filter) {
switch (filter) {
case Tegra::Texture::TextureFilter::Linear:
return vk::Filter::eLinear;
return VK_FILTER_LINEAR;
case Tegra::Texture::TextureFilter::Nearest:
return vk::Filter::eNearest;
return VK_FILTER_NEAREST;
}
UNIMPLEMENTED_MSG("Unimplemented sampler filter={}", static_cast<u32>(filter));
return {};
}
vk::SamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter) {
VkSamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter) {
switch (mipmap_filter) {
case Tegra::Texture::TextureMipmapFilter::None:
// TODO(Rodrigo): None seems to be mapped to OpenGL's mag and min filters without mipmapping
// (e.g. GL_NEAREST and GL_LINEAR). Vulkan doesn't have such a thing, find out if we have to
// use an image view with a single mipmap level to emulate this.
return vk::SamplerMipmapMode::eLinear;
return VK_SAMPLER_MIPMAP_MODE_LINEAR;
;
case Tegra::Texture::TextureMipmapFilter::Linear:
return vk::SamplerMipmapMode::eLinear;
return VK_SAMPLER_MIPMAP_MODE_LINEAR;
case Tegra::Texture::TextureMipmapFilter::Nearest:
return vk::SamplerMipmapMode::eNearest;
return VK_SAMPLER_MIPMAP_MODE_NEAREST;
}
UNIMPLEMENTED_MSG("Unimplemented sampler mipmap mode={}", static_cast<u32>(mipmap_filter));
return {};
}
vk::SamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter) {
VkSamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter) {
switch (wrap_mode) {
case Tegra::Texture::WrapMode::Wrap:
return vk::SamplerAddressMode::eRepeat;
return VK_SAMPLER_ADDRESS_MODE_REPEAT;
case Tegra::Texture::WrapMode::Mirror:
return vk::SamplerAddressMode::eMirroredRepeat;
return VK_SAMPLER_ADDRESS_MODE_MIRRORED_REPEAT;
case Tegra::Texture::WrapMode::ClampToEdge:
return vk::SamplerAddressMode::eClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
case Tegra::Texture::WrapMode::Border:
return vk::SamplerAddressMode::eClampToBorder;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
case Tegra::Texture::WrapMode::Clamp:
if (device.GetDriverID() == vk::DriverIdKHR::eNvidiaProprietary) {
if (device.GetDriverID() == VK_DRIVER_ID_NVIDIA_PROPRIETARY_KHR) {
// Nvidia's Vulkan driver defaults to GL_CLAMP on invalid enumerations, we can hack this
// by sending an invalid enumeration.
return static_cast<vk::SamplerAddressMode>(0xcafe);
return static_cast<VkSamplerAddressMode>(0xcafe);
}
// TODO(Rodrigo): Emulate GL_CLAMP properly on other vendors
switch (filter) {
case Tegra::Texture::TextureFilter::Nearest:
return vk::SamplerAddressMode::eClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
case Tegra::Texture::TextureFilter::Linear:
return vk::SamplerAddressMode::eClampToBorder;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
}
UNREACHABLE();
return vk::SamplerAddressMode::eClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE;
case Tegra::Texture::WrapMode::MirrorOnceClampToEdge:
return vk::SamplerAddressMode::eMirrorClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;
case Tegra::Texture::WrapMode::MirrorOnceBorder:
UNIMPLEMENTED();
return vk::SamplerAddressMode::eMirrorClampToEdge;
return VK_SAMPLER_ADDRESS_MODE_MIRROR_CLAMP_TO_EDGE;
default:
UNIMPLEMENTED_MSG("Unimplemented wrap mode={}", static_cast<u32>(wrap_mode));
return {};
}
}
vk::CompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func) {
VkCompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func) {
switch (depth_compare_func) {
case Tegra::Texture::DepthCompareFunc::Never:
return vk::CompareOp::eNever;
return VK_COMPARE_OP_NEVER;
case Tegra::Texture::DepthCompareFunc::Less:
return vk::CompareOp::eLess;
return VK_COMPARE_OP_LESS;
case Tegra::Texture::DepthCompareFunc::LessEqual:
return vk::CompareOp::eLessOrEqual;
return VK_COMPARE_OP_LESS_OR_EQUAL;
case Tegra::Texture::DepthCompareFunc::Equal:
return vk::CompareOp::eEqual;
return VK_COMPARE_OP_EQUAL;
case Tegra::Texture::DepthCompareFunc::NotEqual:
return vk::CompareOp::eNotEqual;
return VK_COMPARE_OP_NOT_EQUAL;
case Tegra::Texture::DepthCompareFunc::Greater:
return vk::CompareOp::eGreater;
return VK_COMPARE_OP_GREATER;
case Tegra::Texture::DepthCompareFunc::GreaterEqual:
return vk::CompareOp::eGreaterOrEqual;
return VK_COMPARE_OP_GREATER_OR_EQUAL;
case Tegra::Texture::DepthCompareFunc::Always:
return vk::CompareOp::eAlways;
return VK_COMPARE_OP_ALWAYS;
}
UNIMPLEMENTED_MSG("Unimplemented sampler depth compare function={}",
static_cast<u32>(depth_compare_func));
@@ -112,92 +115,92 @@ namespace {
enum : u32 { Attachable = 1, Storage = 2 };
struct FormatTuple {
vk::Format format; ///< Vulkan format
int usage; ///< Describes image format usage
VkFormat format; ///< Vulkan format
int usage = 0; ///< Describes image format usage
} constexpr tex_format_tuples[] = {
{vk::Format::eA8B8G8R8UnormPack32, Attachable | Storage}, // ABGR8U
{vk::Format::eA8B8G8R8SnormPack32, Attachable | Storage}, // ABGR8S
{vk::Format::eA8B8G8R8UintPack32, Attachable | Storage}, // ABGR8UI
{vk::Format::eB5G6R5UnormPack16, {}}, // B5G6R5U
{vk::Format::eA2B10G10R10UnormPack32, Attachable | Storage}, // A2B10G10R10U
{vk::Format::eA1R5G5B5UnormPack16, Attachable}, // A1B5G5R5U (flipped with swizzle)
{vk::Format::eR8Unorm, Attachable | Storage}, // R8U
{vk::Format::eR8Uint, Attachable | Storage}, // R8UI
{vk::Format::eR16G16B16A16Sfloat, Attachable | Storage}, // RGBA16F
{vk::Format::eR16G16B16A16Unorm, Attachable | Storage}, // RGBA16U
{vk::Format::eR16G16B16A16Snorm, Attachable | Storage}, // RGBA16S
{vk::Format::eR16G16B16A16Uint, Attachable | Storage}, // RGBA16UI
{vk::Format::eB10G11R11UfloatPack32, Attachable | Storage}, // R11FG11FB10F
{vk::Format::eR32G32B32A32Uint, Attachable | Storage}, // RGBA32UI
{vk::Format::eBc1RgbaUnormBlock, {}}, // DXT1
{vk::Format::eBc2UnormBlock, {}}, // DXT23
{vk::Format::eBc3UnormBlock, {}}, // DXT45
{vk::Format::eBc4UnormBlock, {}}, // DXN1
{vk::Format::eBc5UnormBlock, {}}, // DXN2UNORM
{vk::Format::eBc5SnormBlock, {}}, // DXN2SNORM
{vk::Format::eBc7UnormBlock, {}}, // BC7U
{vk::Format::eBc6HUfloatBlock, {}}, // BC6H_UF16
{vk::Format::eBc6HSfloatBlock, {}}, // BC6H_SF16
{vk::Format::eAstc4x4UnormBlock, {}}, // ASTC_2D_4X4
{vk::Format::eB8G8R8A8Unorm, {}}, // BGRA8
{vk::Format::eR32G32B32A32Sfloat, Attachable | Storage}, // RGBA32F
{vk::Format::eR32G32Sfloat, Attachable | Storage}, // RG32F
{vk::Format::eR32Sfloat, Attachable | Storage}, // R32F
{vk::Format::eR16Sfloat, Attachable | Storage}, // R16F
{vk::Format::eR16Unorm, Attachable | Storage}, // R16U
{vk::Format::eUndefined, {}}, // R16S
{vk::Format::eUndefined, {}}, // R16UI
{vk::Format::eUndefined, {}}, // R16I
{vk::Format::eR16G16Unorm, Attachable | Storage}, // RG16
{vk::Format::eR16G16Sfloat, Attachable | Storage}, // RG16F
{vk::Format::eUndefined, {}}, // RG16UI
{vk::Format::eUndefined, {}}, // RG16I
{vk::Format::eR16G16Snorm, Attachable | Storage}, // RG16S
{vk::Format::eUndefined, {}}, // RGB32F
{vk::Format::eR8G8B8A8Srgb, Attachable}, // RGBA8_SRGB
{vk::Format::eR8G8Unorm, Attachable | Storage}, // RG8U
{vk::Format::eR8G8Snorm, Attachable | Storage}, // RG8S
{vk::Format::eR32G32Uint, Attachable | Storage}, // RG32UI
{vk::Format::eUndefined, {}}, // RGBX16F
{vk::Format::eR32Uint, Attachable | Storage}, // R32UI
{vk::Format::eR32Sint, Attachable | Storage}, // R32I
{vk::Format::eAstc8x8UnormBlock, {}}, // ASTC_2D_8X8
{vk::Format::eUndefined, {}}, // ASTC_2D_8X5
{vk::Format::eUndefined, {}}, // ASTC_2D_5X4
{vk::Format::eUndefined, {}}, // BGRA8_SRGB
{vk::Format::eBc1RgbaSrgbBlock, {}}, // DXT1_SRGB
{vk::Format::eBc2SrgbBlock, {}}, // DXT23_SRGB
{vk::Format::eBc3SrgbBlock, {}}, // DXT45_SRGB
{vk::Format::eBc7SrgbBlock, {}}, // BC7U_SRGB
{vk::Format::eR4G4B4A4UnormPack16, Attachable}, // R4G4B4A4U
{vk::Format::eAstc4x4SrgbBlock, {}}, // ASTC_2D_4X4_SRGB
{vk::Format::eAstc8x8SrgbBlock, {}}, // ASTC_2D_8X8_SRGB
{vk::Format::eAstc8x5SrgbBlock, {}}, // ASTC_2D_8X5_SRGB
{vk::Format::eAstc5x4SrgbBlock, {}}, // ASTC_2D_5X4_SRGB
{vk::Format::eAstc5x5UnormBlock, {}}, // ASTC_2D_5X5
{vk::Format::eAstc5x5SrgbBlock, {}}, // ASTC_2D_5X5_SRGB
{vk::Format::eAstc10x8UnormBlock, {}}, // ASTC_2D_10X8
{vk::Format::eAstc10x8SrgbBlock, {}}, // ASTC_2D_10X8_SRGB
{vk::Format::eAstc6x6UnormBlock, {}}, // ASTC_2D_6X6
{vk::Format::eAstc6x6SrgbBlock, {}}, // ASTC_2D_6X6_SRGB
{vk::Format::eAstc10x10UnormBlock, {}}, // ASTC_2D_10X10
{vk::Format::eAstc10x10SrgbBlock, {}}, // ASTC_2D_10X10_SRGB
{vk::Format::eAstc12x12UnormBlock, {}}, // ASTC_2D_12X12
{vk::Format::eAstc12x12SrgbBlock, {}}, // ASTC_2D_12X12_SRGB
{vk::Format::eAstc8x6UnormBlock, {}}, // ASTC_2D_8X6
{vk::Format::eAstc8x6SrgbBlock, {}}, // ASTC_2D_8X6_SRGB
{vk::Format::eAstc6x5UnormBlock, {}}, // ASTC_2D_6X5
{vk::Format::eAstc6x5SrgbBlock, {}}, // ASTC_2D_6X5_SRGB
{vk::Format::eE5B9G9R9UfloatPack32, {}}, // E5B9G9R9F
{VK_FORMAT_A8B8G8R8_UNORM_PACK32, Attachable | Storage}, // ABGR8U
{VK_FORMAT_A8B8G8R8_SNORM_PACK32, Attachable | Storage}, // ABGR8S
{VK_FORMAT_A8B8G8R8_UINT_PACK32, Attachable | Storage}, // ABGR8UI
{VK_FORMAT_B5G6R5_UNORM_PACK16}, // B5G6R5U
{VK_FORMAT_A2B10G10R10_UNORM_PACK32, Attachable | Storage}, // A2B10G10R10U
{VK_FORMAT_A1R5G5B5_UNORM_PACK16, Attachable}, // A1B5G5R5U (flipped with swizzle)
{VK_FORMAT_R8_UNORM, Attachable | Storage}, // R8U
{VK_FORMAT_R8_UINT, Attachable | Storage}, // R8UI
{VK_FORMAT_R16G16B16A16_SFLOAT, Attachable | Storage}, // RGBA16F
{VK_FORMAT_R16G16B16A16_UNORM, Attachable | Storage}, // RGBA16U
{VK_FORMAT_R16G16B16A16_SNORM, Attachable | Storage}, // RGBA16S
{VK_FORMAT_R16G16B16A16_UINT, Attachable | Storage}, // RGBA16UI
{VK_FORMAT_B10G11R11_UFLOAT_PACK32, Attachable | Storage}, // R11FG11FB10F
{VK_FORMAT_R32G32B32A32_UINT, Attachable | Storage}, // RGBA32UI
{VK_FORMAT_BC1_RGBA_UNORM_BLOCK}, // DXT1
{VK_FORMAT_BC2_UNORM_BLOCK}, // DXT23
{VK_FORMAT_BC3_UNORM_BLOCK}, // DXT45
{VK_FORMAT_BC4_UNORM_BLOCK}, // DXN1
{VK_FORMAT_BC5_UNORM_BLOCK}, // DXN2UNORM
{VK_FORMAT_BC5_SNORM_BLOCK}, // DXN2SNORM
{VK_FORMAT_BC7_UNORM_BLOCK}, // BC7U
{VK_FORMAT_BC6H_UFLOAT_BLOCK}, // BC6H_UF16
{VK_FORMAT_BC6H_SFLOAT_BLOCK}, // BC6H_SF16
{VK_FORMAT_ASTC_4x4_UNORM_BLOCK}, // ASTC_2D_4X4
{VK_FORMAT_B8G8R8A8_UNORM}, // BGRA8
{VK_FORMAT_R32G32B32A32_SFLOAT, Attachable | Storage}, // RGBA32F
{VK_FORMAT_R32G32_SFLOAT, Attachable | Storage}, // RG32F
{VK_FORMAT_R32_SFLOAT, Attachable | Storage}, // R32F
{VK_FORMAT_R16_SFLOAT, Attachable | Storage}, // R16F
{VK_FORMAT_R16_UNORM, Attachable | Storage}, // R16U
{VK_FORMAT_UNDEFINED}, // R16S
{VK_FORMAT_UNDEFINED}, // R16UI
{VK_FORMAT_UNDEFINED}, // R16I
{VK_FORMAT_R16G16_UNORM, Attachable | Storage}, // RG16
{VK_FORMAT_R16G16_SFLOAT, Attachable | Storage}, // RG16F
{VK_FORMAT_UNDEFINED}, // RG16UI
{VK_FORMAT_UNDEFINED}, // RG16I
{VK_FORMAT_R16G16_SNORM, Attachable | Storage}, // RG16S
{VK_FORMAT_UNDEFINED}, // RGB32F
{VK_FORMAT_R8G8B8A8_SRGB, Attachable}, // RGBA8_SRGB
{VK_FORMAT_R8G8_UNORM, Attachable | Storage}, // RG8U
{VK_FORMAT_R8G8_SNORM, Attachable | Storage}, // RG8S
{VK_FORMAT_R32G32_UINT, Attachable | Storage}, // RG32UI
{VK_FORMAT_UNDEFINED}, // RGBX16F
{VK_FORMAT_R32_UINT, Attachable | Storage}, // R32UI
{VK_FORMAT_R32_SINT, Attachable | Storage}, // R32I
{VK_FORMAT_ASTC_8x8_UNORM_BLOCK}, // ASTC_2D_8X8
{VK_FORMAT_UNDEFINED}, // ASTC_2D_8X5
{VK_FORMAT_UNDEFINED}, // ASTC_2D_5X4
{VK_FORMAT_UNDEFINED}, // BGRA8_SRGB
{VK_FORMAT_BC1_RGBA_SRGB_BLOCK}, // DXT1_SRGB
{VK_FORMAT_BC2_SRGB_BLOCK}, // DXT23_SRGB
{VK_FORMAT_BC3_SRGB_BLOCK}, // DXT45_SRGB
{VK_FORMAT_BC7_SRGB_BLOCK}, // BC7U_SRGB
{VK_FORMAT_R4G4B4A4_UNORM_PACK16, Attachable}, // R4G4B4A4U
{VK_FORMAT_ASTC_4x4_SRGB_BLOCK}, // ASTC_2D_4X4_SRGB
{VK_FORMAT_ASTC_8x8_SRGB_BLOCK}, // ASTC_2D_8X8_SRGB
{VK_FORMAT_ASTC_8x5_SRGB_BLOCK}, // ASTC_2D_8X5_SRGB
{VK_FORMAT_ASTC_5x4_SRGB_BLOCK}, // ASTC_2D_5X4_SRGB
{VK_FORMAT_ASTC_5x5_UNORM_BLOCK}, // ASTC_2D_5X5
{VK_FORMAT_ASTC_5x5_SRGB_BLOCK}, // ASTC_2D_5X5_SRGB
{VK_FORMAT_ASTC_10x8_UNORM_BLOCK}, // ASTC_2D_10X8
{VK_FORMAT_ASTC_10x8_SRGB_BLOCK}, // ASTC_2D_10X8_SRGB
{VK_FORMAT_ASTC_6x6_UNORM_BLOCK}, // ASTC_2D_6X6
{VK_FORMAT_ASTC_6x6_SRGB_BLOCK}, // ASTC_2D_6X6_SRGB
{VK_FORMAT_ASTC_10x10_UNORM_BLOCK}, // ASTC_2D_10X10
{VK_FORMAT_ASTC_10x10_SRGB_BLOCK}, // ASTC_2D_10X10_SRGB
{VK_FORMAT_ASTC_12x12_UNORM_BLOCK}, // ASTC_2D_12X12
{VK_FORMAT_ASTC_12x12_SRGB_BLOCK}, // ASTC_2D_12X12_SRGB
{VK_FORMAT_ASTC_8x6_UNORM_BLOCK}, // ASTC_2D_8X6
{VK_FORMAT_ASTC_8x6_SRGB_BLOCK}, // ASTC_2D_8X6_SRGB
{VK_FORMAT_ASTC_6x5_UNORM_BLOCK}, // ASTC_2D_6X5
{VK_FORMAT_ASTC_6x5_SRGB_BLOCK}, // ASTC_2D_6X5_SRGB
{VK_FORMAT_E5B9G9R9_UFLOAT_PACK32}, // E5B9G9R9F
// Depth formats
{vk::Format::eD32Sfloat, Attachable}, // Z32F
{vk::Format::eD16Unorm, Attachable}, // Z16
{VK_FORMAT_D32_SFLOAT, Attachable}, // Z32F
{VK_FORMAT_D16_UNORM, Attachable}, // Z16
// DepthStencil formats
{vk::Format::eD24UnormS8Uint, Attachable}, // Z24S8
{vk::Format::eD24UnormS8Uint, Attachable}, // S8Z24 (emulated)
{vk::Format::eD32SfloatS8Uint, Attachable}, // Z32FS8
{VK_FORMAT_D24_UNORM_S8_UINT, Attachable}, // Z24S8
{VK_FORMAT_D24_UNORM_S8_UINT, Attachable}, // S8Z24 (emulated)
{VK_FORMAT_D32_SFLOAT_S8_UINT, Attachable}, // Z32FS8
};
static_assert(std::size(tex_format_tuples) == VideoCore::Surface::MaxPixelFormat);
@@ -212,106 +215,106 @@ FormatInfo SurfaceFormat(const VKDevice& device, FormatType format_type, PixelFo
ASSERT(static_cast<std::size_t>(pixel_format) < std::size(tex_format_tuples));
auto tuple = tex_format_tuples[static_cast<std::size_t>(pixel_format)];
if (tuple.format == vk::Format::eUndefined) {
if (tuple.format == VK_FORMAT_UNDEFINED) {
UNIMPLEMENTED_MSG("Unimplemented texture format with pixel format={}",
static_cast<u32>(pixel_format));
return {vk::Format::eA8B8G8R8UnormPack32, true, true};
return {VK_FORMAT_A8B8G8R8_UNORM_PACK32, true, true};
}
// Use ABGR8 on hardware that doesn't support ASTC natively
if (!device.IsOptimalAstcSupported() && VideoCore::Surface::IsPixelFormatASTC(pixel_format)) {
tuple.format = VideoCore::Surface::IsPixelFormatSRGB(pixel_format)
? vk::Format::eA8B8G8R8SrgbPack32
: vk::Format::eA8B8G8R8UnormPack32;
? VK_FORMAT_A8B8G8R8_SRGB_PACK32
: VK_FORMAT_A8B8G8R8_UNORM_PACK32;
}
const bool attachable = tuple.usage & Attachable;
const bool storage = tuple.usage & Storage;
vk::FormatFeatureFlags usage;
VkFormatFeatureFlags usage;
if (format_type == FormatType::Buffer) {
usage = vk::FormatFeatureFlagBits::eStorageTexelBuffer |
vk::FormatFeatureFlagBits::eUniformTexelBuffer;
usage =
VK_FORMAT_FEATURE_STORAGE_TEXEL_BUFFER_BIT | VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT;
} else {
usage = vk::FormatFeatureFlagBits::eSampledImage | vk::FormatFeatureFlagBits::eTransferDst |
vk::FormatFeatureFlagBits::eTransferSrc;
usage = VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT | VK_FORMAT_FEATURE_TRANSFER_DST_BIT |
VK_FORMAT_FEATURE_TRANSFER_SRC_BIT;
if (attachable) {
usage |= IsZetaFormat(pixel_format) ? vk::FormatFeatureFlagBits::eDepthStencilAttachment
: vk::FormatFeatureFlagBits::eColorAttachment;
usage |= IsZetaFormat(pixel_format) ? VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT
: VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT;
}
if (storage) {
usage |= vk::FormatFeatureFlagBits::eStorageImage;
usage |= VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT;
}
}
return {device.GetSupportedFormat(tuple.format, usage, format_type), attachable, storage};
}
vk::ShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage) {
VkShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage) {
switch (stage) {
case Tegra::Engines::ShaderType::Vertex:
return vk::ShaderStageFlagBits::eVertex;
return VK_SHADER_STAGE_VERTEX_BIT;
case Tegra::Engines::ShaderType::TesselationControl:
return vk::ShaderStageFlagBits::eTessellationControl;
return VK_SHADER_STAGE_TESSELLATION_CONTROL_BIT;
case Tegra::Engines::ShaderType::TesselationEval:
return vk::ShaderStageFlagBits::eTessellationEvaluation;
return VK_SHADER_STAGE_TESSELLATION_EVALUATION_BIT;
case Tegra::Engines::ShaderType::Geometry:
return vk::ShaderStageFlagBits::eGeometry;
return VK_SHADER_STAGE_GEOMETRY_BIT;
case Tegra::Engines::ShaderType::Fragment:
return vk::ShaderStageFlagBits::eFragment;
return VK_SHADER_STAGE_FRAGMENT_BIT;
case Tegra::Engines::ShaderType::Compute:
return vk::ShaderStageFlagBits::eCompute;
return VK_SHADER_STAGE_COMPUTE_BIT;
}
UNIMPLEMENTED_MSG("Unimplemented shader stage={}", static_cast<u32>(stage));
return {};
}
vk::PrimitiveTopology PrimitiveTopology([[maybe_unused]] const VKDevice& device,
Maxwell::PrimitiveTopology topology) {
VkPrimitiveTopology PrimitiveTopology([[maybe_unused]] const VKDevice& device,
Maxwell::PrimitiveTopology topology) {
switch (topology) {
case Maxwell::PrimitiveTopology::Points:
return vk::PrimitiveTopology::ePointList;
return VK_PRIMITIVE_TOPOLOGY_POINT_LIST;
case Maxwell::PrimitiveTopology::Lines:
return vk::PrimitiveTopology::eLineList;
return VK_PRIMITIVE_TOPOLOGY_LINE_LIST;
case Maxwell::PrimitiveTopology::LineStrip:
return vk::PrimitiveTopology::eLineStrip;
return VK_PRIMITIVE_TOPOLOGY_LINE_STRIP;
case Maxwell::PrimitiveTopology::Triangles:
return vk::PrimitiveTopology::eTriangleList;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
case Maxwell::PrimitiveTopology::TriangleStrip:
return vk::PrimitiveTopology::eTriangleStrip;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP;
case Maxwell::PrimitiveTopology::TriangleFan:
return vk::PrimitiveTopology::eTriangleFan;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_FAN;
case Maxwell::PrimitiveTopology::Quads:
// TODO(Rodrigo): Use VK_PRIMITIVE_TOPOLOGY_QUAD_LIST_EXT whenever it releases
return vk::PrimitiveTopology::eTriangleList;
return VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST;
case Maxwell::PrimitiveTopology::Patches:
return vk::PrimitiveTopology::ePatchList;
return VK_PRIMITIVE_TOPOLOGY_PATCH_LIST;
default:
UNIMPLEMENTED_MSG("Unimplemented topology={}", static_cast<u32>(topology));
return {};
}
}
vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size) {
VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size) {
switch (type) {
case Maxwell::VertexAttribute::Type::SignedNorm:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Snorm;
return VK_FORMAT_R8_SNORM;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Snorm;
return VK_FORMAT_R8G8_SNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Snorm;
return VK_FORMAT_R8G8B8_SNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Snorm;
return VK_FORMAT_R8G8B8A8_SNORM;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Snorm;
return VK_FORMAT_R16_SNORM;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Snorm;
return VK_FORMAT_R16G16_SNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Snorm;
return VK_FORMAT_R16G16B16_SNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Snorm;
return VK_FORMAT_R16G16B16A16_SNORM;
case Maxwell::VertexAttribute::Size::Size_10_10_10_2:
return vk::Format::eA2B10G10R10SnormPack32;
return VK_FORMAT_A2B10G10R10_SNORM_PACK32;
default:
break;
}
@@ -319,23 +322,23 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::UnsignedNorm:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Unorm;
return VK_FORMAT_R8_UNORM;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Unorm;
return VK_FORMAT_R8G8_UNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Unorm;
return VK_FORMAT_R8G8B8_UNORM;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Unorm;
return VK_FORMAT_R8G8B8A8_UNORM;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Unorm;
return VK_FORMAT_R16_UNORM;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Unorm;
return VK_FORMAT_R16G16_UNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Unorm;
return VK_FORMAT_R16G16B16_UNORM;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Unorm;
return VK_FORMAT_R16G16B16A16_UNORM;
case Maxwell::VertexAttribute::Size::Size_10_10_10_2:
return vk::Format::eA2B10G10R10UnormPack32;
return VK_FORMAT_A2B10G10R10_UNORM_PACK32;
default:
break;
}
@@ -343,59 +346,69 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::SignedInt:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Sint;
return VK_FORMAT_R16G16B16A16_SINT;
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Sint;
return VK_FORMAT_R8_SINT;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Sint;
return VK_FORMAT_R8G8_SINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Sint;
return VK_FORMAT_R8G8B8_SINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Sint;
return VK_FORMAT_R8G8B8A8_SINT;
case Maxwell::VertexAttribute::Size::Size_32:
return vk::Format::eR32Sint;
return VK_FORMAT_R32_SINT;
default:
break;
}
break;
case Maxwell::VertexAttribute::Type::UnsignedInt:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Uint;
return VK_FORMAT_R8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Uint;
return VK_FORMAT_R8G8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Uint;
return VK_FORMAT_R8G8B8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Uint;
return VK_FORMAT_R8G8B8A8_UINT;
case Maxwell::VertexAttribute::Size::Size_16:
return VK_FORMAT_R16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16:
return VK_FORMAT_R16G16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return VK_FORMAT_R16G16B16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return VK_FORMAT_R16G16B16A16_UINT;
case Maxwell::VertexAttribute::Size::Size_32:
return vk::Format::eR32Uint;
return VK_FORMAT_R32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32:
return vk::Format::eR32G32Uint;
return VK_FORMAT_R32G32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32_32:
return vk::Format::eR32G32B32Uint;
return VK_FORMAT_R32G32B32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32_32_32:
return vk::Format::eR32G32B32A32Uint;
return VK_FORMAT_R32G32B32A32_UINT;
default:
break;
}
break;
case Maxwell::VertexAttribute::Type::UnsignedScaled:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Uscaled;
return VK_FORMAT_R8_USCALED;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Uscaled;
return VK_FORMAT_R8G8_USCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Uscaled;
return VK_FORMAT_R8G8B8_USCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Uscaled;
return VK_FORMAT_R8G8B8A8_USCALED;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Uscaled;
return VK_FORMAT_R16_USCALED;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Uscaled;
return VK_FORMAT_R16G16_USCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Uscaled;
return VK_FORMAT_R16G16B16_USCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Uscaled;
return VK_FORMAT_R16G16B16A16_USCALED;
default:
break;
}
@@ -403,21 +416,21 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::SignedScaled:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
return vk::Format::eR8Sscaled;
return VK_FORMAT_R8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_8_8:
return vk::Format::eR8G8Sscaled;
return VK_FORMAT_R8G8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8:
return vk::Format::eR8G8B8Sscaled;
return VK_FORMAT_R8G8B8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return vk::Format::eR8G8B8A8Sscaled;
return VK_FORMAT_R8G8B8A8_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Sscaled;
return VK_FORMAT_R16_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Sscaled;
return VK_FORMAT_R16G16_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Sscaled;
return VK_FORMAT_R16G16B16_SSCALED;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Sscaled;
return VK_FORMAT_R16G16B16A16_SSCALED;
default:
break;
}
@@ -425,21 +438,21 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
case Maxwell::VertexAttribute::Type::Float:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_32:
return vk::Format::eR32Sfloat;
return VK_FORMAT_R32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_32_32:
return vk::Format::eR32G32Sfloat;
return VK_FORMAT_R32G32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_32_32_32:
return vk::Format::eR32G32B32Sfloat;
return VK_FORMAT_R32G32B32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_32_32_32_32:
return vk::Format::eR32G32B32A32Sfloat;
return VK_FORMAT_R32G32B32A32_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16:
return vk::Format::eR16Sfloat;
return VK_FORMAT_R16_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16_16:
return vk::Format::eR16G16Sfloat;
return VK_FORMAT_R16G16_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return vk::Format::eR16G16B16Sfloat;
return VK_FORMAT_R16G16B16_SFLOAT;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return vk::Format::eR16G16B16A16Sfloat;
return VK_FORMAT_R16G16B16A16_SFLOAT;
default:
break;
}
@@ -450,210 +463,210 @@ vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttr
return {};
}
vk::CompareOp ComparisonOp(Maxwell::ComparisonOp comparison) {
VkCompareOp ComparisonOp(Maxwell::ComparisonOp comparison) {
switch (comparison) {
case Maxwell::ComparisonOp::Never:
case Maxwell::ComparisonOp::NeverOld:
return vk::CompareOp::eNever;
return VK_COMPARE_OP_NEVER;
case Maxwell::ComparisonOp::Less:
case Maxwell::ComparisonOp::LessOld:
return vk::CompareOp::eLess;
return VK_COMPARE_OP_LESS;
case Maxwell::ComparisonOp::Equal:
case Maxwell::ComparisonOp::EqualOld:
return vk::CompareOp::eEqual;
return VK_COMPARE_OP_EQUAL;
case Maxwell::ComparisonOp::LessEqual:
case Maxwell::ComparisonOp::LessEqualOld:
return vk::CompareOp::eLessOrEqual;
return VK_COMPARE_OP_LESS_OR_EQUAL;
case Maxwell::ComparisonOp::Greater:
case Maxwell::ComparisonOp::GreaterOld:
return vk::CompareOp::eGreater;
return VK_COMPARE_OP_GREATER;
case Maxwell::ComparisonOp::NotEqual:
case Maxwell::ComparisonOp::NotEqualOld:
return vk::CompareOp::eNotEqual;
return VK_COMPARE_OP_NOT_EQUAL;
case Maxwell::ComparisonOp::GreaterEqual:
case Maxwell::ComparisonOp::GreaterEqualOld:
return vk::CompareOp::eGreaterOrEqual;
return VK_COMPARE_OP_GREATER_OR_EQUAL;
case Maxwell::ComparisonOp::Always:
case Maxwell::ComparisonOp::AlwaysOld:
return vk::CompareOp::eAlways;
return VK_COMPARE_OP_ALWAYS;
}
UNIMPLEMENTED_MSG("Unimplemented comparison op={}", static_cast<u32>(comparison));
return {};
}
vk::IndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format) {
VkIndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format) {
switch (index_format) {
case Maxwell::IndexFormat::UnsignedByte:
if (!device.IsExtIndexTypeUint8Supported()) {
UNIMPLEMENTED_MSG("Native uint8 indices are not supported on this device");
return vk::IndexType::eUint16;
return VK_INDEX_TYPE_UINT16;
}
return vk::IndexType::eUint8EXT;
return VK_INDEX_TYPE_UINT8_EXT;
case Maxwell::IndexFormat::UnsignedShort:
return vk::IndexType::eUint16;
return VK_INDEX_TYPE_UINT16;
case Maxwell::IndexFormat::UnsignedInt:
return vk::IndexType::eUint32;
return VK_INDEX_TYPE_UINT32;
}
UNIMPLEMENTED_MSG("Unimplemented index_format={}", static_cast<u32>(index_format));
return {};
}
vk::StencilOp StencilOp(Maxwell::StencilOp stencil_op) {
VkStencilOp StencilOp(Maxwell::StencilOp stencil_op) {
switch (stencil_op) {
case Maxwell::StencilOp::Keep:
case Maxwell::StencilOp::KeepOGL:
return vk::StencilOp::eKeep;
return VK_STENCIL_OP_KEEP;
case Maxwell::StencilOp::Zero:
case Maxwell::StencilOp::ZeroOGL:
return vk::StencilOp::eZero;
return VK_STENCIL_OP_ZERO;
case Maxwell::StencilOp::Replace:
case Maxwell::StencilOp::ReplaceOGL:
return vk::StencilOp::eReplace;
return VK_STENCIL_OP_REPLACE;
case Maxwell::StencilOp::Incr:
case Maxwell::StencilOp::IncrOGL:
return vk::StencilOp::eIncrementAndClamp;
return VK_STENCIL_OP_INCREMENT_AND_CLAMP;
case Maxwell::StencilOp::Decr:
case Maxwell::StencilOp::DecrOGL:
return vk::StencilOp::eDecrementAndClamp;
return VK_STENCIL_OP_DECREMENT_AND_CLAMP;
case Maxwell::StencilOp::Invert:
case Maxwell::StencilOp::InvertOGL:
return vk::StencilOp::eInvert;
return VK_STENCIL_OP_INVERT;
case Maxwell::StencilOp::IncrWrap:
case Maxwell::StencilOp::IncrWrapOGL:
return vk::StencilOp::eIncrementAndWrap;
return VK_STENCIL_OP_INCREMENT_AND_WRAP;
case Maxwell::StencilOp::DecrWrap:
case Maxwell::StencilOp::DecrWrapOGL:
return vk::StencilOp::eDecrementAndWrap;
return VK_STENCIL_OP_DECREMENT_AND_WRAP;
}
UNIMPLEMENTED_MSG("Unimplemented stencil op={}", static_cast<u32>(stencil_op));
return {};
}
vk::BlendOp BlendEquation(Maxwell::Blend::Equation equation) {
VkBlendOp BlendEquation(Maxwell::Blend::Equation equation) {
switch (equation) {
case Maxwell::Blend::Equation::Add:
case Maxwell::Blend::Equation::AddGL:
return vk::BlendOp::eAdd;
return VK_BLEND_OP_ADD;
case Maxwell::Blend::Equation::Subtract:
case Maxwell::Blend::Equation::SubtractGL:
return vk::BlendOp::eSubtract;
return VK_BLEND_OP_SUBTRACT;
case Maxwell::Blend::Equation::ReverseSubtract:
case Maxwell::Blend::Equation::ReverseSubtractGL:
return vk::BlendOp::eReverseSubtract;
return VK_BLEND_OP_REVERSE_SUBTRACT;
case Maxwell::Blend::Equation::Min:
case Maxwell::Blend::Equation::MinGL:
return vk::BlendOp::eMin;
return VK_BLEND_OP_MIN;
case Maxwell::Blend::Equation::Max:
case Maxwell::Blend::Equation::MaxGL:
return vk::BlendOp::eMax;
return VK_BLEND_OP_MAX;
}
UNIMPLEMENTED_MSG("Unimplemented blend equation={}", static_cast<u32>(equation));
return {};
}
vk::BlendFactor BlendFactor(Maxwell::Blend::Factor factor) {
VkBlendFactor BlendFactor(Maxwell::Blend::Factor factor) {
switch (factor) {
case Maxwell::Blend::Factor::Zero:
case Maxwell::Blend::Factor::ZeroGL:
return vk::BlendFactor::eZero;
return VK_BLEND_FACTOR_ZERO;
case Maxwell::Blend::Factor::One:
case Maxwell::Blend::Factor::OneGL:
return vk::BlendFactor::eOne;
return VK_BLEND_FACTOR_ONE;
case Maxwell::Blend::Factor::SourceColor:
case Maxwell::Blend::Factor::SourceColorGL:
return vk::BlendFactor::eSrcColor;
return VK_BLEND_FACTOR_SRC_COLOR;
case Maxwell::Blend::Factor::OneMinusSourceColor:
case Maxwell::Blend::Factor::OneMinusSourceColorGL:
return vk::BlendFactor::eOneMinusSrcColor;
return VK_BLEND_FACTOR_ONE_MINUS_SRC_COLOR;
case Maxwell::Blend::Factor::SourceAlpha:
case Maxwell::Blend::Factor::SourceAlphaGL:
return vk::BlendFactor::eSrcAlpha;
return VK_BLEND_FACTOR_SRC_ALPHA;
case Maxwell::Blend::Factor::OneMinusSourceAlpha:
case Maxwell::Blend::Factor::OneMinusSourceAlphaGL:
return vk::BlendFactor::eOneMinusSrcAlpha;
return VK_BLEND_FACTOR_ONE_MINUS_SRC_ALPHA;
case Maxwell::Blend::Factor::DestAlpha:
case Maxwell::Blend::Factor::DestAlphaGL:
return vk::BlendFactor::eDstAlpha;
return VK_BLEND_FACTOR_DST_ALPHA;
case Maxwell::Blend::Factor::OneMinusDestAlpha:
case Maxwell::Blend::Factor::OneMinusDestAlphaGL:
return vk::BlendFactor::eOneMinusDstAlpha;
return VK_BLEND_FACTOR_ONE_MINUS_DST_ALPHA;
case Maxwell::Blend::Factor::DestColor:
case Maxwell::Blend::Factor::DestColorGL:
return vk::BlendFactor::eDstColor;
return VK_BLEND_FACTOR_DST_COLOR;
case Maxwell::Blend::Factor::OneMinusDestColor:
case Maxwell::Blend::Factor::OneMinusDestColorGL:
return vk::BlendFactor::eOneMinusDstColor;
return VK_BLEND_FACTOR_ONE_MINUS_DST_COLOR;
case Maxwell::Blend::Factor::SourceAlphaSaturate:
case Maxwell::Blend::Factor::SourceAlphaSaturateGL:
return vk::BlendFactor::eSrcAlphaSaturate;
return VK_BLEND_FACTOR_SRC_ALPHA_SATURATE;
case Maxwell::Blend::Factor::Source1Color:
case Maxwell::Blend::Factor::Source1ColorGL:
return vk::BlendFactor::eSrc1Color;
return VK_BLEND_FACTOR_SRC1_COLOR;
case Maxwell::Blend::Factor::OneMinusSource1Color:
case Maxwell::Blend::Factor::OneMinusSource1ColorGL:
return vk::BlendFactor::eOneMinusSrc1Color;
return VK_BLEND_FACTOR_ONE_MINUS_SRC1_COLOR;
case Maxwell::Blend::Factor::Source1Alpha:
case Maxwell::Blend::Factor::Source1AlphaGL:
return vk::BlendFactor::eSrc1Alpha;
return VK_BLEND_FACTOR_SRC1_ALPHA;
case Maxwell::Blend::Factor::OneMinusSource1Alpha:
case Maxwell::Blend::Factor::OneMinusSource1AlphaGL:
return vk::BlendFactor::eOneMinusSrc1Alpha;
return VK_BLEND_FACTOR_ONE_MINUS_SRC1_ALPHA;
case Maxwell::Blend::Factor::ConstantColor:
case Maxwell::Blend::Factor::ConstantColorGL:
return vk::BlendFactor::eConstantColor;
return VK_BLEND_FACTOR_CONSTANT_COLOR;
case Maxwell::Blend::Factor::OneMinusConstantColor:
case Maxwell::Blend::Factor::OneMinusConstantColorGL:
return vk::BlendFactor::eOneMinusConstantColor;
return VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_COLOR;
case Maxwell::Blend::Factor::ConstantAlpha:
case Maxwell::Blend::Factor::ConstantAlphaGL:
return vk::BlendFactor::eConstantAlpha;
return VK_BLEND_FACTOR_CONSTANT_ALPHA;
case Maxwell::Blend::Factor::OneMinusConstantAlpha:
case Maxwell::Blend::Factor::OneMinusConstantAlphaGL:
return vk::BlendFactor::eOneMinusConstantAlpha;
return VK_BLEND_FACTOR_ONE_MINUS_CONSTANT_ALPHA;
}
UNIMPLEMENTED_MSG("Unimplemented blend factor={}", static_cast<u32>(factor));
return {};
}
vk::FrontFace FrontFace(Maxwell::FrontFace front_face) {
VkFrontFace FrontFace(Maxwell::FrontFace front_face) {
switch (front_face) {
case Maxwell::FrontFace::ClockWise:
return vk::FrontFace::eClockwise;
return VK_FRONT_FACE_CLOCKWISE;
case Maxwell::FrontFace::CounterClockWise:
return vk::FrontFace::eCounterClockwise;
return VK_FRONT_FACE_COUNTER_CLOCKWISE;
}
UNIMPLEMENTED_MSG("Unimplemented front face={}", static_cast<u32>(front_face));
return {};
}
vk::CullModeFlags CullFace(Maxwell::CullFace cull_face) {
VkCullModeFlags CullFace(Maxwell::CullFace cull_face) {
switch (cull_face) {
case Maxwell::CullFace::Front:
return vk::CullModeFlagBits::eFront;
return VK_CULL_MODE_FRONT_BIT;
case Maxwell::CullFace::Back:
return vk::CullModeFlagBits::eBack;
return VK_CULL_MODE_BACK_BIT;
case Maxwell::CullFace::FrontAndBack:
return vk::CullModeFlagBits::eFrontAndBack;
return VK_CULL_MODE_FRONT_AND_BACK;
}
UNIMPLEMENTED_MSG("Unimplemented cull face={}", static_cast<u32>(cull_face));
return {};
}
vk::ComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle) {
VkComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle) {
switch (swizzle) {
case Tegra::Texture::SwizzleSource::Zero:
return vk::ComponentSwizzle::eZero;
return VK_COMPONENT_SWIZZLE_ZERO;
case Tegra::Texture::SwizzleSource::R:
return vk::ComponentSwizzle::eR;
return VK_COMPONENT_SWIZZLE_R;
case Tegra::Texture::SwizzleSource::G:
return vk::ComponentSwizzle::eG;
return VK_COMPONENT_SWIZZLE_G;
case Tegra::Texture::SwizzleSource::B:
return vk::ComponentSwizzle::eB;
return VK_COMPONENT_SWIZZLE_B;
case Tegra::Texture::SwizzleSource::A:
return vk::ComponentSwizzle::eA;
return VK_COMPONENT_SWIZZLE_A;
case Tegra::Texture::SwizzleSource::OneInt:
case Tegra::Texture::SwizzleSource::OneFloat:
return vk::ComponentSwizzle::eOne;
return VK_COMPONENT_SWIZZLE_ONE;
}
UNIMPLEMENTED_MSG("Unimplemented swizzle source={}", static_cast<u32>(swizzle));
return {};

View File

@@ -6,8 +6,8 @@
#include "common/common_types.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
#include "video_core/textures/texture.h"
@@ -18,46 +18,45 @@ using PixelFormat = VideoCore::Surface::PixelFormat;
namespace Sampler {
vk::Filter Filter(Tegra::Texture::TextureFilter filter);
VkFilter Filter(Tegra::Texture::TextureFilter filter);
vk::SamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter);
VkSamplerMipmapMode MipmapMode(Tegra::Texture::TextureMipmapFilter mipmap_filter);
vk::SamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter);
VkSamplerAddressMode WrapMode(const VKDevice& device, Tegra::Texture::WrapMode wrap_mode,
Tegra::Texture::TextureFilter filter);
vk::CompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func);
VkCompareOp DepthCompareFunction(Tegra::Texture::DepthCompareFunc depth_compare_func);
} // namespace Sampler
struct FormatInfo {
vk::Format format;
VkFormat format;
bool attachable;
bool storage;
};
FormatInfo SurfaceFormat(const VKDevice& device, FormatType format_type, PixelFormat pixel_format);
vk::ShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage);
VkShaderStageFlagBits ShaderStage(Tegra::Engines::ShaderType stage);
vk::PrimitiveTopology PrimitiveTopology(const VKDevice& device,
Maxwell::PrimitiveTopology topology);
VkPrimitiveTopology PrimitiveTopology(const VKDevice& device, Maxwell::PrimitiveTopology topology);
vk::Format VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size);
VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttribute::Size size);
vk::CompareOp ComparisonOp(Maxwell::ComparisonOp comparison);
VkCompareOp ComparisonOp(Maxwell::ComparisonOp comparison);
vk::IndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format);
VkIndexType IndexFormat(const VKDevice& device, Maxwell::IndexFormat index_format);
vk::StencilOp StencilOp(Maxwell::StencilOp stencil_op);
VkStencilOp StencilOp(Maxwell::StencilOp stencil_op);
vk::BlendOp BlendEquation(Maxwell::Blend::Equation equation);
VkBlendOp BlendEquation(Maxwell::Blend::Equation equation);
vk::BlendFactor BlendFactor(Maxwell::Blend::Factor factor);
VkBlendFactor BlendFactor(Maxwell::Blend::Factor factor);
vk::FrontFace FrontFace(Maxwell::FrontFace front_face);
VkFrontFace FrontFace(Maxwell::FrontFace front_face);
vk::CullModeFlags CullFace(Maxwell::CullFace cull_face);
VkCullModeFlags CullFace(Maxwell::CullFace cull_face);
vk::ComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle);
VkComponentSwizzle SwizzleSource(Tegra::Texture::SwizzleSource swizzle);
} // namespace Vulkan::MaxwellToVK

View File

@@ -24,7 +24,6 @@
#include "core/settings.h"
#include "core/telemetry_session.h"
#include "video_core/gpu.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/renderer_vulkan.h"
#include "video_core/renderer_vulkan/vk_blit_screen.h"
#include "video_core/renderer_vulkan/vk_device.h"
@@ -34,8 +33,9 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#include "video_core/renderer_vulkan/vk_swapchain.h"
#include "video_core/renderer_vulkan/wrapper.h"
// Include these late to avoid changing Vulkan-Hpp's dynamic dispatcher size
// Include these late to avoid polluting previous headers
#ifdef _WIN32
#include <windows.h>
// ensure include order
@@ -54,20 +54,19 @@ namespace {
using Core::Frontend::WindowSystemType;
VkBool32 DebugCallback(VkDebugUtilsMessageSeverityFlagBitsEXT severity_,
VkBool32 DebugCallback(VkDebugUtilsMessageSeverityFlagBitsEXT severity,
VkDebugUtilsMessageTypeFlagsEXT type,
const VkDebugUtilsMessengerCallbackDataEXT* data,
[[maybe_unused]] void* user_data) {
const auto severity{static_cast<vk::DebugUtilsMessageSeverityFlagBitsEXT>(severity_)};
const char* message{data->pMessage};
if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eError) {
if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT) {
LOG_CRITICAL(Render_Vulkan, "{}", message);
} else if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eWarning) {
} else if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_WARNING_BIT_EXT) {
LOG_WARNING(Render_Vulkan, "{}", message);
} else if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eInfo) {
} else if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_INFO_BIT_EXT) {
LOG_INFO(Render_Vulkan, "{}", message);
} else if (severity & vk::DebugUtilsMessageSeverityFlagBitsEXT::eVerbose) {
} else if (severity & VK_DEBUG_UTILS_MESSAGE_SEVERITY_VERBOSE_BIT_EXT) {
LOG_DEBUG(Render_Vulkan, "{}", message);
}
return VK_FALSE;
@@ -94,22 +93,24 @@ Common::DynamicLibrary OpenVulkanLibrary() {
return library;
}
UniqueInstance CreateInstance(Common::DynamicLibrary& library, vk::DispatchLoaderDynamic& dld,
WindowSystemType window_type = WindowSystemType::Headless,
bool enable_layers = false) {
vk::Instance CreateInstance(Common::DynamicLibrary& library, vk::InstanceDispatch& dld,
WindowSystemType window_type = WindowSystemType::Headless,
bool enable_layers = false) {
if (!library.IsOpen()) {
LOG_ERROR(Render_Vulkan, "Vulkan library not available");
return UniqueInstance{};
return {};
}
PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr;
if (!library.GetSymbol("vkGetInstanceProcAddr", &vkGetInstanceProcAddr)) {
if (!library.GetSymbol("vkGetInstanceProcAddr", &dld.vkGetInstanceProcAddr)) {
LOG_ERROR(Render_Vulkan, "vkGetInstanceProcAddr not present in Vulkan");
return UniqueInstance{};
return {};
}
if (!vk::Load(dld)) {
LOG_ERROR(Render_Vulkan, "Failed to load Vulkan function pointers");
return {};
}
dld.init(vkGetInstanceProcAddr);
std::vector<const char*> extensions;
extensions.reserve(4);
extensions.reserve(6);
switch (window_type) {
case Core::Frontend::WindowSystemType::Headless:
break;
@@ -136,45 +137,39 @@ UniqueInstance CreateInstance(Common::DynamicLibrary& library, vk::DispatchLoade
if (enable_layers) {
extensions.push_back(VK_EXT_DEBUG_UTILS_EXTENSION_NAME);
}
extensions.push_back(VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME);
u32 num_properties;
if (vk::enumerateInstanceExtensionProperties(nullptr, &num_properties, nullptr, dld) !=
vk::Result::eSuccess) {
LOG_ERROR(Render_Vulkan, "Failed to query number of extension properties");
return UniqueInstance{};
}
std::vector<vk::ExtensionProperties> properties(num_properties);
if (vk::enumerateInstanceExtensionProperties(nullptr, &num_properties, properties.data(),
dld) != vk::Result::eSuccess) {
const std::optional properties = vk::EnumerateInstanceExtensionProperties(dld);
if (!properties) {
LOG_ERROR(Render_Vulkan, "Failed to query extension properties");
return UniqueInstance{};
return {};
}
for (const char* extension : extensions) {
const auto it =
std::find_if(properties.begin(), properties.end(), [extension](const auto& prop) {
std::find_if(properties->begin(), properties->end(), [extension](const auto& prop) {
return !std::strcmp(extension, prop.extensionName);
});
if (it == properties.end()) {
if (it == properties->end()) {
LOG_ERROR(Render_Vulkan, "Required instance extension {} is not available", extension);
return UniqueInstance{};
return {};
}
}
const vk::ApplicationInfo application_info("yuzu Emulator", VK_MAKE_VERSION(0, 1, 0),
"yuzu Emulator", VK_MAKE_VERSION(0, 1, 0),
VK_API_VERSION_1_1);
const std::array layers = {"VK_LAYER_LUNARG_standard_validation"};
const vk::InstanceCreateInfo instance_ci(
{}, &application_info, enable_layers ? static_cast<u32>(layers.size()) : 0, layers.data(),
static_cast<u32>(extensions.size()), extensions.data());
vk::Instance unsafe_instance;
if (vk::createInstance(&instance_ci, nullptr, &unsafe_instance, dld) != vk::Result::eSuccess) {
LOG_ERROR(Render_Vulkan, "Failed to create Vulkan instance");
return UniqueInstance{};
static constexpr std::array layers_data{"VK_LAYER_LUNARG_standard_validation"};
vk::Span<const char*> layers = layers_data;
if (!enable_layers) {
layers = {};
}
dld.init(unsafe_instance);
return UniqueInstance(unsafe_instance, {nullptr, dld});
vk::Instance instance = vk::Instance::Create(layers, extensions, dld);
if (!instance) {
LOG_ERROR(Render_Vulkan, "Failed to create Vulkan instance");
return {};
}
if (!vk::Load(*instance, dld)) {
LOG_ERROR(Render_Vulkan, "Failed to load Vulkan instance function pointers");
}
return instance;
}
std::string GetReadableVersion(u32 version) {
@@ -187,14 +182,14 @@ std::string GetDriverVersion(const VKDevice& device) {
// https://github.com/SaschaWillems/vulkan.gpuinfo.org/blob/5dddea46ea1120b0df14eef8f15ff8e318e35462/functions.php#L308-L314
const u32 version = device.GetDriverVersion();
if (device.GetDriverID() == vk::DriverIdKHR::eNvidiaProprietary) {
if (device.GetDriverID() == VK_DRIVER_ID_NVIDIA_PROPRIETARY_KHR) {
const u32 major = (version >> 22) & 0x3ff;
const u32 minor = (version >> 14) & 0x0ff;
const u32 secondary = (version >> 6) & 0x0ff;
const u32 tertiary = version & 0x003f;
return fmt::format("{}.{}.{}.{}", major, minor, secondary, tertiary);
}
if (device.GetDriverID() == vk::DriverIdKHR::eIntelProprietaryWindows) {
if (device.GetDriverID() == VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS_KHR) {
const u32 major = version >> 14;
const u32 minor = version & 0x3fff;
return fmt::format("{}.{}", major, minor);
@@ -307,10 +302,8 @@ void RendererVulkan::ShutDown() {
if (!device) {
return;
}
const auto dev = device->GetLogical();
const auto& dld = device->GetDispatchLoader();
if (dev && dld.vkDeviceWaitIdle) {
dev.waitIdle(dld);
if (const auto& dev = device->GetLogical()) {
dev.WaitIdle();
}
rasterizer.reset();
@@ -326,23 +319,11 @@ bool RendererVulkan::CreateDebugCallback() {
if (!Settings::values.renderer_debug) {
return true;
}
const vk::DebugUtilsMessengerCreateInfoEXT callback_ci(
{},
vk::DebugUtilsMessageSeverityFlagBitsEXT::eError |
vk::DebugUtilsMessageSeverityFlagBitsEXT::eWarning |
vk::DebugUtilsMessageSeverityFlagBitsEXT::eInfo |
vk::DebugUtilsMessageSeverityFlagBitsEXT::eVerbose,
vk::DebugUtilsMessageTypeFlagBitsEXT::eGeneral |
vk::DebugUtilsMessageTypeFlagBitsEXT::eValidation |
vk::DebugUtilsMessageTypeFlagBitsEXT::ePerformance,
&DebugCallback, nullptr);
vk::DebugUtilsMessengerEXT unsafe_callback;
if (instance->createDebugUtilsMessengerEXT(&callback_ci, nullptr, &unsafe_callback, dld) !=
vk::Result::eSuccess) {
debug_callback = instance.TryCreateDebugCallback(DebugCallback);
if (!debug_callback) {
LOG_ERROR(Render_Vulkan, "Failed to create debug callback");
return false;
}
debug_callback = UniqueDebugUtilsMessengerEXT(unsafe_callback, {*instance, nullptr, dld});
return true;
}
@@ -357,8 +338,8 @@ bool RendererVulkan::CreateSurface() {
nullptr, 0, nullptr, hWnd};
const auto vkCreateWin32SurfaceKHR = reinterpret_cast<PFN_vkCreateWin32SurfaceKHR>(
dld.vkGetInstanceProcAddr(*instance, "vkCreateWin32SurfaceKHR"));
if (!vkCreateWin32SurfaceKHR || vkCreateWin32SurfaceKHR(instance.get(), &win32_ci, nullptr,
&unsafe_surface) != VK_SUCCESS) {
if (!vkCreateWin32SurfaceKHR ||
vkCreateWin32SurfaceKHR(*instance, &win32_ci, nullptr, &unsafe_surface) != VK_SUCCESS) {
LOG_ERROR(Render_Vulkan, "Failed to initialize Win32 surface");
return false;
}
@@ -372,8 +353,8 @@ bool RendererVulkan::CreateSurface() {
reinterpret_cast<Window>(window_info.render_surface)};
const auto vkCreateXlibSurfaceKHR = reinterpret_cast<PFN_vkCreateXlibSurfaceKHR>(
dld.vkGetInstanceProcAddr(*instance, "vkCreateXlibSurfaceKHR"));
if (!vkCreateXlibSurfaceKHR || vkCreateXlibSurfaceKHR(instance.get(), &xlib_ci, nullptr,
&unsafe_surface) != VK_SUCCESS) {
if (!vkCreateXlibSurfaceKHR ||
vkCreateXlibSurfaceKHR(*instance, &xlib_ci, nullptr, &unsafe_surface) != VK_SUCCESS) {
LOG_ERROR(Render_Vulkan, "Failed to initialize Xlib surface");
return false;
}
@@ -386,7 +367,7 @@ bool RendererVulkan::CreateSurface() {
const auto vkCreateWaylandSurfaceKHR = reinterpret_cast<PFN_vkCreateWaylandSurfaceKHR>(
dld.vkGetInstanceProcAddr(*instance, "vkCreateWaylandSurfaceKHR"));
if (!vkCreateWaylandSurfaceKHR ||
vkCreateWaylandSurfaceKHR(instance.get(), &wayland_ci, nullptr, &unsafe_surface) !=
vkCreateWaylandSurfaceKHR(*instance, &wayland_ci, nullptr, &unsafe_surface) !=
VK_SUCCESS) {
LOG_ERROR(Render_Vulkan, "Failed to initialize Wayland surface");
return false;
@@ -398,26 +379,30 @@ bool RendererVulkan::CreateSurface() {
return false;
}
surface = UniqueSurfaceKHR(unsafe_surface, {*instance, nullptr, dld});
surface = vk::SurfaceKHR(unsafe_surface, *instance, dld);
return true;
}
bool RendererVulkan::PickDevices() {
const auto devices = instance->enumeratePhysicalDevices(dld);
const auto devices = instance.EnumeratePhysicalDevices();
if (!devices) {
LOG_ERROR(Render_Vulkan, "Failed to enumerate physical devices");
return false;
}
const s32 device_index = Settings::values.vulkan_device;
if (device_index < 0 || device_index >= static_cast<s32>(devices.size())) {
if (device_index < 0 || device_index >= static_cast<s32>(devices->size())) {
LOG_ERROR(Render_Vulkan, "Invalid device index {}!", device_index);
return false;
}
const vk::PhysicalDevice physical_device = devices[static_cast<std::size_t>(device_index)];
if (!VKDevice::IsSuitable(physical_device, *surface, dld)) {
const vk::PhysicalDevice physical_device((*devices)[static_cast<std::size_t>(device_index)],
dld);
if (!VKDevice::IsSuitable(physical_device, *surface)) {
return false;
}
device = std::make_unique<VKDevice>(dld, physical_device, *surface);
return device->Create(*instance);
device = std::make_unique<VKDevice>(*instance, physical_device, *surface, dld);
return device->Create();
}
void RendererVulkan::Report() const {
@@ -444,30 +429,22 @@ void RendererVulkan::Report() const {
}
std::vector<std::string> RendererVulkan::EnumerateDevices() {
// Avoid putting DispatchLoaderDynamic, it's too large
auto dld_memory = std::make_unique<vk::DispatchLoaderDynamic>();
auto& dld = *dld_memory;
vk::InstanceDispatch dld;
Common::DynamicLibrary library = OpenVulkanLibrary();
UniqueInstance instance = CreateInstance(library, dld);
vk::Instance instance = CreateInstance(library, dld);
if (!instance) {
return {};
}
u32 num_devices;
if (instance->enumeratePhysicalDevices(&num_devices, nullptr, dld) != vk::Result::eSuccess) {
return {};
}
std::vector<vk::PhysicalDevice> devices(num_devices);
if (instance->enumeratePhysicalDevices(&num_devices, devices.data(), dld) !=
vk::Result::eSuccess) {
const std::optional physical_devices = instance.EnumeratePhysicalDevices();
if (!physical_devices) {
return {};
}
std::vector<std::string> names;
names.reserve(num_devices);
for (auto& device : devices) {
names.push_back(device.getProperties(dld).deviceName);
names.reserve(physical_devices->size());
for (const auto& device : *physical_devices) {
names.push_back(vk::PhysicalDevice(device, dld).GetProperties().deviceName);
}
return names;
}

View File

@@ -12,7 +12,7 @@
#include "common/dynamic_library.h"
#include "video_core/renderer_base.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -61,14 +61,14 @@ private:
Core::System& system;
Common::DynamicLibrary library;
vk::DispatchLoaderDynamic dld;
vk::InstanceDispatch dld;
UniqueInstance instance;
UniqueSurfaceKHR surface;
vk::Instance instance;
vk::SurfaceKHR surface;
VKScreenInfo screen_info;
UniqueDebugUtilsMessengerEXT debug_callback;
vk::DebugCallback debug_callback;
std::unique_ptr<VKDevice> device;
std::unique_ptr<VKSwapchain> swapchain;
std::unique_ptr<VKMemoryManager> memory_manager;

View File

@@ -20,7 +20,6 @@
#include "video_core/gpu.h"
#include "video_core/morton.h"
#include "video_core/rasterizer_interface.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/renderer_vulkan.h"
#include "video_core/renderer_vulkan/vk_blit_screen.h"
#include "video_core/renderer_vulkan/vk_device.h"
@@ -30,6 +29,7 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_shader_util.h"
#include "video_core/renderer_vulkan/vk_swapchain.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
namespace Vulkan {
@@ -140,16 +140,25 @@ struct ScreenRectVertex {
std::array<f32, 2> position;
std::array<f32, 2> tex_coord;
static vk::VertexInputBindingDescription GetDescription() {
return vk::VertexInputBindingDescription(0, sizeof(ScreenRectVertex),
vk::VertexInputRate::eVertex);
static VkVertexInputBindingDescription GetDescription() {
VkVertexInputBindingDescription description;
description.binding = 0;
description.stride = sizeof(ScreenRectVertex);
description.inputRate = VK_VERTEX_INPUT_RATE_VERTEX;
return description;
}
static std::array<vk::VertexInputAttributeDescription, 2> GetAttributes() {
return {vk::VertexInputAttributeDescription(0, 0, vk::Format::eR32G32Sfloat,
offsetof(ScreenRectVertex, position)),
vk::VertexInputAttributeDescription(1, 0, vk::Format::eR32G32Sfloat,
offsetof(ScreenRectVertex, tex_coord))};
static std::array<VkVertexInputAttributeDescription, 2> GetAttributes() {
std::array<VkVertexInputAttributeDescription, 2> attributes;
attributes[0].location = 0;
attributes[0].binding = 0;
attributes[0].format = VK_FORMAT_R32G32_SFLOAT;
attributes[0].offset = offsetof(ScreenRectVertex, position);
attributes[1].location = 1;
attributes[1].binding = 0;
attributes[1].format = VK_FORMAT_R32G32_SFLOAT;
attributes[1].offset = offsetof(ScreenRectVertex, tex_coord);
return attributes;
}
};
@@ -172,16 +181,16 @@ std::size_t GetSizeInBytes(const Tegra::FramebufferConfig& framebuffer) {
static_cast<std::size_t>(framebuffer.height) * GetBytesPerPixel(framebuffer);
}
vk::Format GetFormat(const Tegra::FramebufferConfig& framebuffer) {
VkFormat GetFormat(const Tegra::FramebufferConfig& framebuffer) {
switch (framebuffer.pixel_format) {
case Tegra::FramebufferConfig::PixelFormat::ABGR8:
return vk::Format::eA8B8G8R8UnormPack32;
return VK_FORMAT_A8B8G8R8_UNORM_PACK32;
case Tegra::FramebufferConfig::PixelFormat::RGB565:
return vk::Format::eR5G6B5UnormPack16;
return VK_FORMAT_R5G6B5_UNORM_PACK16;
default:
UNIMPLEMENTED_MSG("Unknown framebuffer pixel format: {}",
static_cast<u32>(framebuffer.pixel_format));
return vk::Format::eA8B8G8R8UnormPack32;
return VK_FORMAT_A8B8G8R8_UNORM_PACK32;
}
}
@@ -219,8 +228,8 @@ void VKBlitScreen::Recreate() {
CreateDynamicResources();
}
std::tuple<VKFence&, vk::Semaphore> VKBlitScreen::Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated) {
std::tuple<VKFence&, VkSemaphore> VKBlitScreen::Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated) {
RefreshResources(framebuffer);
// Finish any pending renderpass
@@ -255,46 +264,76 @@ std::tuple<VKFence&, vk::Semaphore> VKBlitScreen::Draw(const Tegra::FramebufferC
framebuffer.stride, block_height_log2, framebuffer.height, 0, 1, 1,
map.GetAddress() + image_offset, host_ptr);
blit_image->Transition(0, 1, 0, 1, vk::PipelineStageFlagBits::eTransfer,
vk::AccessFlagBits::eTransferWrite,
vk::ImageLayout::eTransferDstOptimal);
blit_image->Transition(0, 1, 0, 1, VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_ACCESS_TRANSFER_WRITE_BIT, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
const vk::BufferImageCopy copy(image_offset, 0, 0,
{vk::ImageAspectFlagBits::eColor, 0, 0, 1}, {0, 0, 0},
{framebuffer.width, framebuffer.height, 1});
scheduler.Record([buffer_handle = *buffer, image = blit_image->GetHandle(),
copy](auto cmdbuf, auto& dld) {
cmdbuf.copyBufferToImage(buffer_handle, image, vk::ImageLayout::eTransferDstOptimal,
{copy}, dld);
});
VkBufferImageCopy copy;
copy.bufferOffset = image_offset;
copy.bufferRowLength = 0;
copy.bufferImageHeight = 0;
copy.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
copy.imageSubresource.mipLevel = 0;
copy.imageSubresource.baseArrayLayer = 0;
copy.imageSubresource.layerCount = 1;
copy.imageOffset.x = 0;
copy.imageOffset.y = 0;
copy.imageOffset.z = 0;
copy.imageExtent.width = framebuffer.width;
copy.imageExtent.height = framebuffer.height;
copy.imageExtent.depth = 1;
scheduler.Record(
[buffer = *buffer, image = *blit_image->GetHandle(), copy](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBufferToImage(buffer, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, copy);
});
}
map.Release();
blit_image->Transition(0, 1, 0, 1, vk::PipelineStageFlagBits::eFragmentShader,
vk::AccessFlagBits::eShaderRead,
vk::ImageLayout::eShaderReadOnlyOptimal);
blit_image->Transition(0, 1, 0, 1, VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
VK_ACCESS_SHADER_READ_BIT, VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL);
scheduler.Record([renderpass = *renderpass, framebuffer = *framebuffers[image_index],
descriptor_set = descriptor_sets[image_index], buffer = *buffer,
size = swapchain.GetSize(), pipeline = *pipeline,
layout = *pipeline_layout](auto cmdbuf, auto& dld) {
const vk::ClearValue clear_color{std::array{0.0f, 0.0f, 0.0f, 1.0f}};
const vk::RenderPassBeginInfo renderpass_bi(renderpass, framebuffer, {{0, 0}, size}, 1,
&clear_color);
layout = *pipeline_layout](vk::CommandBuffer cmdbuf) {
VkClearValue clear_color;
clear_color.color.float32[0] = 0.0f;
clear_color.color.float32[1] = 0.0f;
clear_color.color.float32[2] = 0.0f;
clear_color.color.float32[3] = 0.0f;
cmdbuf.beginRenderPass(renderpass_bi, vk::SubpassContents::eInline, dld);
cmdbuf.bindPipeline(vk::PipelineBindPoint::eGraphics, pipeline, dld);
cmdbuf.setViewport(
0,
{{0.0f, 0.0f, static_cast<f32>(size.width), static_cast<f32>(size.height), 0.0f, 1.0f}},
dld);
cmdbuf.setScissor(0, {{{0, 0}, size}}, dld);
VkRenderPassBeginInfo renderpass_bi;
renderpass_bi.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
renderpass_bi.pNext = nullptr;
renderpass_bi.renderPass = renderpass;
renderpass_bi.framebuffer = framebuffer;
renderpass_bi.renderArea.offset.x = 0;
renderpass_bi.renderArea.offset.y = 0;
renderpass_bi.renderArea.extent = size;
renderpass_bi.clearValueCount = 1;
renderpass_bi.pClearValues = &clear_color;
cmdbuf.bindVertexBuffers(0, {buffer}, {offsetof(BufferData, vertices)}, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eGraphics, layout, 0, {descriptor_set}, {},
dld);
cmdbuf.draw(4, 1, 0, 0, dld);
cmdbuf.endRenderPass(dld);
VkViewport viewport;
viewport.x = 0.0f;
viewport.y = 0.0f;
viewport.width = static_cast<float>(size.width);
viewport.height = static_cast<float>(size.height);
viewport.minDepth = 0.0f;
viewport.maxDepth = 1.0f;
VkRect2D scissor;
scissor.offset.x = 0;
scissor.offset.y = 0;
scissor.extent = size;
cmdbuf.BeginRenderPass(renderpass_bi, VK_SUBPASS_CONTENTS_INLINE);
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
cmdbuf.SetViewport(0, viewport);
cmdbuf.SetScissor(0, scissor);
cmdbuf.BindVertexBuffer(0, buffer, offsetof(BufferData, vertices));
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_GRAPHICS, layout, 0, descriptor_set, {});
cmdbuf.Draw(4, 1, 0, 0);
cmdbuf.EndRenderPass();
});
return {scheduler.GetFence(), *semaphores[image_index]};
@@ -334,165 +373,297 @@ void VKBlitScreen::CreateShaders() {
}
void VKBlitScreen::CreateSemaphores() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
semaphores.resize(image_count);
for (std::size_t i = 0; i < image_count; ++i) {
semaphores[i] = dev.createSemaphoreUnique({}, nullptr, dld);
}
std::generate(semaphores.begin(), semaphores.end(),
[this] { return device.GetLogical().CreateSemaphore(); });
}
void VKBlitScreen::CreateDescriptorPool() {
const std::array<vk::DescriptorPoolSize, 2> pool_sizes{
vk::DescriptorPoolSize{vk::DescriptorType::eUniformBuffer, static_cast<u32>(image_count)},
vk::DescriptorPoolSize{vk::DescriptorType::eCombinedImageSampler,
static_cast<u32>(image_count)}};
const vk::DescriptorPoolCreateInfo pool_ci(
{}, static_cast<u32>(image_count), static_cast<u32>(pool_sizes.size()), pool_sizes.data());
const auto dev = device.GetLogical();
descriptor_pool = dev.createDescriptorPoolUnique(pool_ci, nullptr, device.GetDispatchLoader());
std::array<VkDescriptorPoolSize, 2> pool_sizes;
pool_sizes[0].type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
pool_sizes[0].descriptorCount = static_cast<u32>(image_count);
pool_sizes[1].type = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
pool_sizes[1].descriptorCount = static_cast<u32>(image_count);
VkDescriptorPoolCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT;
ci.maxSets = static_cast<u32>(image_count);
ci.poolSizeCount = static_cast<u32>(pool_sizes.size());
ci.pPoolSizes = pool_sizes.data();
descriptor_pool = device.GetLogical().CreateDescriptorPool(ci);
}
void VKBlitScreen::CreateRenderPass() {
const vk::AttachmentDescription color_attachment(
{}, swapchain.GetImageFormat(), vk::SampleCountFlagBits::e1, vk::AttachmentLoadOp::eClear,
vk::AttachmentStoreOp::eStore, vk::AttachmentLoadOp::eDontCare,
vk::AttachmentStoreOp::eDontCare, vk::ImageLayout::eUndefined,
vk::ImageLayout::ePresentSrcKHR);
VkAttachmentDescription color_attachment;
color_attachment.flags = 0;
color_attachment.format = swapchain.GetImageFormat();
color_attachment.samples = VK_SAMPLE_COUNT_1_BIT;
color_attachment.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;
color_attachment.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
color_attachment.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
color_attachment.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
color_attachment.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
color_attachment.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;
const vk::AttachmentReference color_attachment_ref(0, vk::ImageLayout::eColorAttachmentOptimal);
VkAttachmentReference color_attachment_ref;
color_attachment_ref.attachment = 0;
color_attachment_ref.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
const vk::SubpassDescription subpass_description({}, vk::PipelineBindPoint::eGraphics, 0,
nullptr, 1, &color_attachment_ref, nullptr,
nullptr, 0, nullptr);
VkSubpassDescription subpass_description;
subpass_description.flags = 0;
subpass_description.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass_description.inputAttachmentCount = 0;
subpass_description.pInputAttachments = nullptr;
subpass_description.colorAttachmentCount = 1;
subpass_description.pColorAttachments = &color_attachment_ref;
subpass_description.pResolveAttachments = nullptr;
subpass_description.pDepthStencilAttachment = nullptr;
subpass_description.preserveAttachmentCount = 0;
subpass_description.pPreserveAttachments = nullptr;
const vk::SubpassDependency dependency(
VK_SUBPASS_EXTERNAL, 0, vk::PipelineStageFlagBits::eColorAttachmentOutput,
vk::PipelineStageFlagBits::eColorAttachmentOutput, {},
vk::AccessFlagBits::eColorAttachmentRead | vk::AccessFlagBits::eColorAttachmentWrite, {});
VkSubpassDependency dependency;
dependency.srcSubpass = VK_SUBPASS_EXTERNAL;
dependency.dstSubpass = 0;
dependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dependency.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dependency.srcAccessMask = 0;
dependency.dstAccessMask =
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
dependency.dependencyFlags = 0;
const vk::RenderPassCreateInfo renderpass_ci({}, 1, &color_attachment, 1, &subpass_description,
1, &dependency);
VkRenderPassCreateInfo renderpass_ci;
renderpass_ci.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
renderpass_ci.pNext = nullptr;
renderpass_ci.flags = 0;
renderpass_ci.attachmentCount = 1;
renderpass_ci.pAttachments = &color_attachment;
renderpass_ci.subpassCount = 1;
renderpass_ci.pSubpasses = &subpass_description;
renderpass_ci.dependencyCount = 1;
renderpass_ci.pDependencies = &dependency;
const auto dev = device.GetLogical();
renderpass = dev.createRenderPassUnique(renderpass_ci, nullptr, device.GetDispatchLoader());
renderpass = device.GetLogical().CreateRenderPass(renderpass_ci);
}
void VKBlitScreen::CreateDescriptorSetLayout() {
const std::array<vk::DescriptorSetLayoutBinding, 2> layout_bindings{
vk::DescriptorSetLayoutBinding(0, vk::DescriptorType::eUniformBuffer, 1,
vk::ShaderStageFlagBits::eVertex, nullptr),
vk::DescriptorSetLayoutBinding(1, vk::DescriptorType::eCombinedImageSampler, 1,
vk::ShaderStageFlagBits::eFragment, nullptr)};
const vk::DescriptorSetLayoutCreateInfo descriptor_layout_ci(
{}, static_cast<u32>(layout_bindings.size()), layout_bindings.data());
std::array<VkDescriptorSetLayoutBinding, 2> layout_bindings;
layout_bindings[0].binding = 0;
layout_bindings[0].descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
layout_bindings[0].descriptorCount = 1;
layout_bindings[0].stageFlags = VK_SHADER_STAGE_VERTEX_BIT;
layout_bindings[0].pImmutableSamplers = nullptr;
layout_bindings[1].binding = 1;
layout_bindings[1].descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
layout_bindings[1].descriptorCount = 1;
layout_bindings[1].stageFlags = VK_SHADER_STAGE_FRAGMENT_BIT;
layout_bindings[1].pImmutableSamplers = nullptr;
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
descriptor_set_layout = dev.createDescriptorSetLayoutUnique(descriptor_layout_ci, nullptr, dld);
VkDescriptorSetLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.bindingCount = static_cast<u32>(layout_bindings.size());
ci.pBindings = layout_bindings.data();
descriptor_set_layout = device.GetLogical().CreateDescriptorSetLayout(ci);
}
void VKBlitScreen::CreateDescriptorSets() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const std::vector layouts(image_count, *descriptor_set_layout);
descriptor_sets.resize(image_count);
for (std::size_t i = 0; i < image_count; ++i) {
const vk::DescriptorSetLayout layout = *descriptor_set_layout;
const vk::DescriptorSetAllocateInfo descriptor_set_ai(*descriptor_pool, 1, &layout);
const vk::Result result =
dev.allocateDescriptorSets(&descriptor_set_ai, &descriptor_sets[i], dld);
ASSERT(result == vk::Result::eSuccess);
}
VkDescriptorSetAllocateInfo ai;
ai.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
ai.pNext = nullptr;
ai.descriptorPool = *descriptor_pool;
ai.descriptorSetCount = static_cast<u32>(image_count);
ai.pSetLayouts = layouts.data();
descriptor_sets = descriptor_pool.Allocate(ai);
}
void VKBlitScreen::CreatePipelineLayout() {
const vk::PipelineLayoutCreateInfo pipeline_layout_ci({}, 1, &descriptor_set_layout.get(), 0,
nullptr);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
pipeline_layout = dev.createPipelineLayoutUnique(pipeline_layout_ci, nullptr, dld);
VkPipelineLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.setLayoutCount = 1;
ci.pSetLayouts = descriptor_set_layout.address();
ci.pushConstantRangeCount = 0;
ci.pPushConstantRanges = nullptr;
pipeline_layout = device.GetLogical().CreatePipelineLayout(ci);
}
void VKBlitScreen::CreateGraphicsPipeline() {
const std::array shader_stages = {
vk::PipelineShaderStageCreateInfo({}, vk::ShaderStageFlagBits::eVertex, *vertex_shader,
"main", nullptr),
vk::PipelineShaderStageCreateInfo({}, vk::ShaderStageFlagBits::eFragment, *fragment_shader,
"main", nullptr)};
std::array<VkPipelineShaderStageCreateInfo, 2> shader_stages;
shader_stages[0].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
shader_stages[0].pNext = nullptr;
shader_stages[0].flags = 0;
shader_stages[0].stage = VK_SHADER_STAGE_VERTEX_BIT;
shader_stages[0].module = *vertex_shader;
shader_stages[0].pName = "main";
shader_stages[0].pSpecializationInfo = nullptr;
shader_stages[1].sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
shader_stages[1].pNext = nullptr;
shader_stages[1].flags = 0;
shader_stages[1].stage = VK_SHADER_STAGE_FRAGMENT_BIT;
shader_stages[1].module = *fragment_shader;
shader_stages[1].pName = "main";
shader_stages[1].pSpecializationInfo = nullptr;
const auto vertex_binding_description = ScreenRectVertex::GetDescription();
const auto vertex_attrs_description = ScreenRectVertex::GetAttributes();
const vk::PipelineVertexInputStateCreateInfo vertex_input(
{}, 1, &vertex_binding_description, static_cast<u32>(vertex_attrs_description.size()),
vertex_attrs_description.data());
const vk::PipelineInputAssemblyStateCreateInfo input_assembly(
{}, vk::PrimitiveTopology::eTriangleStrip, false);
VkPipelineVertexInputStateCreateInfo vertex_input_ci;
vertex_input_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;
vertex_input_ci.pNext = nullptr;
vertex_input_ci.flags = 0;
vertex_input_ci.vertexBindingDescriptionCount = 1;
vertex_input_ci.pVertexBindingDescriptions = &vertex_binding_description;
vertex_input_ci.vertexAttributeDescriptionCount = u32{vertex_attrs_description.size()};
vertex_input_ci.pVertexAttributeDescriptions = vertex_attrs_description.data();
// Set a dummy viewport, it's going to be replaced by dynamic states.
const vk::Viewport viewport(0.0f, 0.0f, 1.0f, 1.0f, 0.0f, 1.0f);
const vk::Rect2D scissor({0, 0}, {1, 1});
VkPipelineInputAssemblyStateCreateInfo input_assembly_ci;
input_assembly_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
input_assembly_ci.pNext = nullptr;
input_assembly_ci.flags = 0;
input_assembly_ci.topology = VK_PRIMITIVE_TOPOLOGY_TRIANGLE_STRIP;
input_assembly_ci.primitiveRestartEnable = VK_FALSE;
const vk::PipelineViewportStateCreateInfo viewport_state({}, 1, &viewport, 1, &scissor);
VkPipelineViewportStateCreateInfo viewport_state_ci;
viewport_state_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO;
viewport_state_ci.pNext = nullptr;
viewport_state_ci.flags = 0;
viewport_state_ci.viewportCount = 1;
viewport_state_ci.pViewports = nullptr;
viewport_state_ci.scissorCount = 1;
viewport_state_ci.pScissors = nullptr;
const vk::PipelineRasterizationStateCreateInfo rasterizer(
{}, false, false, vk::PolygonMode::eFill, vk::CullModeFlagBits::eNone,
vk::FrontFace::eClockwise, false, 0.0f, 0.0f, 0.0f, 1.0f);
VkPipelineRasterizationStateCreateInfo rasterization_ci;
rasterization_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterization_ci.pNext = nullptr;
rasterization_ci.flags = 0;
rasterization_ci.depthClampEnable = VK_FALSE;
rasterization_ci.rasterizerDiscardEnable = VK_FALSE;
rasterization_ci.polygonMode = VK_POLYGON_MODE_FILL;
rasterization_ci.cullMode = VK_CULL_MODE_NONE;
rasterization_ci.frontFace = VK_FRONT_FACE_CLOCKWISE;
rasterization_ci.depthBiasEnable = VK_FALSE;
rasterization_ci.depthBiasConstantFactor = 0.0f;
rasterization_ci.depthBiasClamp = 0.0f;
rasterization_ci.depthBiasSlopeFactor = 0.0f;
rasterization_ci.lineWidth = 1.0f;
const vk::PipelineMultisampleStateCreateInfo multisampling({}, vk::SampleCountFlagBits::e1,
false, 0.0f, nullptr, false, false);
VkPipelineMultisampleStateCreateInfo multisampling_ci;
multisampling_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
multisampling_ci.pNext = nullptr;
multisampling_ci.flags = 0;
multisampling_ci.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT;
multisampling_ci.sampleShadingEnable = VK_FALSE;
multisampling_ci.minSampleShading = 0.0f;
multisampling_ci.pSampleMask = nullptr;
multisampling_ci.alphaToCoverageEnable = VK_FALSE;
multisampling_ci.alphaToOneEnable = VK_FALSE;
const vk::PipelineColorBlendAttachmentState color_blend_attachment(
false, vk::BlendFactor::eZero, vk::BlendFactor::eZero, vk::BlendOp::eAdd,
vk::BlendFactor::eZero, vk::BlendFactor::eZero, vk::BlendOp::eAdd,
vk::ColorComponentFlagBits::eR | vk::ColorComponentFlagBits::eG |
vk::ColorComponentFlagBits::eB | vk::ColorComponentFlagBits::eA);
VkPipelineColorBlendAttachmentState color_blend_attachment;
color_blend_attachment.blendEnable = VK_FALSE;
color_blend_attachment.srcColorBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.dstColorBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.colorBlendOp = VK_BLEND_OP_ADD;
color_blend_attachment.srcAlphaBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.dstAlphaBlendFactor = VK_BLEND_FACTOR_ZERO;
color_blend_attachment.alphaBlendOp = VK_BLEND_OP_ADD;
color_blend_attachment.colorWriteMask = VK_COLOR_COMPONENT_R_BIT | VK_COLOR_COMPONENT_G_BIT |
VK_COLOR_COMPONENT_B_BIT | VK_COLOR_COMPONENT_A_BIT;
const vk::PipelineColorBlendStateCreateInfo color_blending(
{}, false, vk::LogicOp::eCopy, 1, &color_blend_attachment, {0.0f, 0.0f, 0.0f, 0.0f});
VkPipelineColorBlendStateCreateInfo color_blend_ci;
color_blend_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
color_blend_ci.flags = 0;
color_blend_ci.pNext = nullptr;
color_blend_ci.logicOpEnable = VK_FALSE;
color_blend_ci.logicOp = VK_LOGIC_OP_COPY;
color_blend_ci.attachmentCount = 1;
color_blend_ci.pAttachments = &color_blend_attachment;
color_blend_ci.blendConstants[0] = 0.0f;
color_blend_ci.blendConstants[1] = 0.0f;
color_blend_ci.blendConstants[2] = 0.0f;
color_blend_ci.blendConstants[3] = 0.0f;
const std::array<vk::DynamicState, 2> dynamic_states = {vk::DynamicState::eViewport,
vk::DynamicState::eScissor};
static constexpr std::array dynamic_states = {VK_DYNAMIC_STATE_VIEWPORT,
VK_DYNAMIC_STATE_SCISSOR};
VkPipelineDynamicStateCreateInfo dynamic_state_ci;
dynamic_state_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO;
dynamic_state_ci.pNext = nullptr;
dynamic_state_ci.flags = 0;
dynamic_state_ci.dynamicStateCount = static_cast<u32>(dynamic_states.size());
dynamic_state_ci.pDynamicStates = dynamic_states.data();
const vk::PipelineDynamicStateCreateInfo dynamic_state(
{}, static_cast<u32>(dynamic_states.size()), dynamic_states.data());
VkGraphicsPipelineCreateInfo pipeline_ci;
pipeline_ci.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
pipeline_ci.pNext = nullptr;
pipeline_ci.flags = 0;
pipeline_ci.stageCount = static_cast<u32>(shader_stages.size());
pipeline_ci.pStages = shader_stages.data();
pipeline_ci.pVertexInputState = &vertex_input_ci;
pipeline_ci.pInputAssemblyState = &input_assembly_ci;
pipeline_ci.pTessellationState = nullptr;
pipeline_ci.pViewportState = &viewport_state_ci;
pipeline_ci.pRasterizationState = &rasterization_ci;
pipeline_ci.pMultisampleState = &multisampling_ci;
pipeline_ci.pDepthStencilState = nullptr;
pipeline_ci.pColorBlendState = &color_blend_ci;
pipeline_ci.pDynamicState = &dynamic_state_ci;
pipeline_ci.layout = *pipeline_layout;
pipeline_ci.renderPass = *renderpass;
pipeline_ci.subpass = 0;
pipeline_ci.basePipelineHandle = 0;
pipeline_ci.basePipelineIndex = 0;
const vk::GraphicsPipelineCreateInfo pipeline_ci(
{}, static_cast<u32>(shader_stages.size()), shader_stages.data(), &vertex_input,
&input_assembly, nullptr, &viewport_state, &rasterizer, &multisampling, nullptr,
&color_blending, &dynamic_state, *pipeline_layout, *renderpass, 0, nullptr, 0);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
pipeline = dev.createGraphicsPipelineUnique({}, pipeline_ci, nullptr, dld);
pipeline = device.GetLogical().CreateGraphicsPipeline(pipeline_ci);
}
void VKBlitScreen::CreateSampler() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const vk::SamplerCreateInfo sampler_ci(
{}, vk::Filter::eLinear, vk::Filter::eLinear, vk::SamplerMipmapMode::eLinear,
vk::SamplerAddressMode::eClampToBorder, vk::SamplerAddressMode::eClampToBorder,
vk::SamplerAddressMode::eClampToBorder, 0.0f, false, 0.0f, false, vk::CompareOp::eNever,
0.0f, 0.0f, vk::BorderColor::eFloatOpaqueBlack, false);
sampler = dev.createSamplerUnique(sampler_ci, nullptr, dld);
VkSamplerCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.magFilter = VK_FILTER_LINEAR;
ci.minFilter = VK_FILTER_NEAREST;
ci.mipmapMode = VK_SAMPLER_MIPMAP_MODE_LINEAR;
ci.addressModeU = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
ci.addressModeV = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
ci.addressModeW = VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_BORDER;
ci.mipLodBias = 0.0f;
ci.anisotropyEnable = VK_FALSE;
ci.maxAnisotropy = 0.0f;
ci.compareEnable = VK_FALSE;
ci.compareOp = VK_COMPARE_OP_NEVER;
ci.minLod = 0.0f;
ci.maxLod = 0.0f;
ci.borderColor = VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK;
ci.unnormalizedCoordinates = VK_FALSE;
sampler = device.GetLogical().CreateSampler(ci);
}
void VKBlitScreen::CreateFramebuffers() {
const vk::Extent2D size{swapchain.GetSize()};
framebuffers.clear();
const VkExtent2D size{swapchain.GetSize()};
framebuffers.resize(image_count);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
VkFramebufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.renderPass = *renderpass;
ci.attachmentCount = 1;
ci.width = size.width;
ci.height = size.height;
ci.layers = 1;
for (std::size_t i = 0; i < image_count; ++i) {
const vk::ImageView image_view{swapchain.GetImageViewIndex(i)};
const vk::FramebufferCreateInfo framebuffer_ci({}, *renderpass, 1, &image_view, size.width,
size.height, 1);
framebuffers[i] = dev.createFramebufferUnique(framebuffer_ci, nullptr, dld);
const VkImageView image_view{swapchain.GetImageViewIndex(i)};
ci.pAttachments = &image_view;
framebuffers[i] = device.GetLogical().CreateFramebuffer(ci);
}
}
@@ -507,54 +678,86 @@ void VKBlitScreen::ReleaseRawImages() {
}
void VKBlitScreen::CreateStagingBuffer(const Tegra::FramebufferConfig& framebuffer) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
VkBufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.size = CalculateBufferSize(framebuffer);
ci.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT |
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
const vk::BufferCreateInfo buffer_ci({}, CalculateBufferSize(framebuffer),
vk::BufferUsageFlagBits::eTransferSrc |
vk::BufferUsageFlagBits::eVertexBuffer |
vk::BufferUsageFlagBits::eUniformBuffer,
vk::SharingMode::eExclusive, 0, nullptr);
buffer = dev.createBufferUnique(buffer_ci, nullptr, dld);
buffer_commit = memory_manager.Commit(*buffer, true);
buffer = device.GetLogical().CreateBuffer(ci);
buffer_commit = memory_manager.Commit(buffer, true);
}
void VKBlitScreen::CreateRawImages(const Tegra::FramebufferConfig& framebuffer) {
raw_images.resize(image_count);
raw_buffer_commits.resize(image_count);
const auto format = GetFormat(framebuffer);
for (std::size_t i = 0; i < image_count; ++i) {
const vk::ImageCreateInfo image_ci(
{}, vk::ImageType::e2D, format, {framebuffer.width, framebuffer.height, 1}, 1, 1,
vk::SampleCountFlagBits::e1, vk::ImageTiling::eOptimal,
vk::ImageUsageFlagBits::eTransferDst | vk::ImageUsageFlagBits::eSampled,
vk::SharingMode::eExclusive, 0, nullptr, vk::ImageLayout::eUndefined);
VkImageCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.imageType = VK_IMAGE_TYPE_2D;
ci.format = GetFormat(framebuffer);
ci.extent.width = framebuffer.width;
ci.extent.height = framebuffer.height;
ci.extent.depth = 1;
ci.mipLevels = 1;
ci.arrayLayers = 1;
ci.samples = VK_SAMPLE_COUNT_1_BIT;
ci.tiling = VK_IMAGE_TILING_LINEAR;
ci.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
ci.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
raw_images[i] =
std::make_unique<VKImage>(device, scheduler, image_ci, vk::ImageAspectFlagBits::eColor);
for (std::size_t i = 0; i < image_count; ++i) {
raw_images[i] = std::make_unique<VKImage>(device, scheduler, ci, VK_IMAGE_ASPECT_COLOR_BIT);
raw_buffer_commits[i] = memory_manager.Commit(raw_images[i]->GetHandle(), false);
}
}
void VKBlitScreen::UpdateDescriptorSet(std::size_t image_index, vk::ImageView image_view) const {
const vk::DescriptorSet descriptor_set = descriptor_sets[image_index];
void VKBlitScreen::UpdateDescriptorSet(std::size_t image_index, VkImageView image_view) const {
VkDescriptorBufferInfo buffer_info;
buffer_info.buffer = *buffer;
buffer_info.offset = offsetof(BufferData, uniform);
buffer_info.range = sizeof(BufferData::uniform);
const vk::DescriptorBufferInfo buffer_info(*buffer, offsetof(BufferData, uniform),
sizeof(BufferData::uniform));
const vk::WriteDescriptorSet ubo_write(descriptor_set, 0, 0, 1,
vk::DescriptorType::eUniformBuffer, nullptr,
&buffer_info, nullptr);
VkWriteDescriptorSet ubo_write;
ubo_write.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
ubo_write.pNext = nullptr;
ubo_write.dstSet = descriptor_sets[image_index];
ubo_write.dstBinding = 0;
ubo_write.dstArrayElement = 0;
ubo_write.descriptorCount = 1;
ubo_write.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
ubo_write.pImageInfo = nullptr;
ubo_write.pBufferInfo = &buffer_info;
ubo_write.pTexelBufferView = nullptr;
const vk::DescriptorImageInfo image_info(*sampler, image_view,
vk::ImageLayout::eShaderReadOnlyOptimal);
const vk::WriteDescriptorSet sampler_write(descriptor_set, 1, 0, 1,
vk::DescriptorType::eCombinedImageSampler,
&image_info, nullptr, nullptr);
VkDescriptorImageInfo image_info;
image_info.sampler = *sampler;
image_info.imageView = image_view;
image_info.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
dev.updateDescriptorSets({ubo_write, sampler_write}, {}, dld);
VkWriteDescriptorSet sampler_write;
sampler_write.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
sampler_write.pNext = nullptr;
sampler_write.dstSet = descriptor_sets[image_index];
sampler_write.dstBinding = 1;
sampler_write.dstArrayElement = 0;
sampler_write.descriptorCount = 1;
sampler_write.descriptorType = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
sampler_write.pImageInfo = &image_info;
sampler_write.pBufferInfo = nullptr;
sampler_write.pTexelBufferView = nullptr;
device.GetLogical().UpdateDescriptorSets(std::array{ubo_write, sampler_write}, {});
}
void VKBlitScreen::SetUniformData(BufferData& data,

View File

@@ -8,9 +8,9 @@
#include <memory>
#include <tuple>
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -49,8 +49,8 @@ public:
void Recreate();
std::tuple<VKFence&, vk::Semaphore> Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated);
std::tuple<VKFence&, VkSemaphore> Draw(const Tegra::FramebufferConfig& framebuffer,
bool use_accelerated);
private:
struct BufferData;
@@ -74,7 +74,7 @@ private:
void CreateStagingBuffer(const Tegra::FramebufferConfig& framebuffer);
void CreateRawImages(const Tegra::FramebufferConfig& framebuffer);
void UpdateDescriptorSet(std::size_t image_index, vk::ImageView image_view) const;
void UpdateDescriptorSet(std::size_t image_index, VkImageView image_view) const;
void SetUniformData(BufferData& data, const Tegra::FramebufferConfig& framebuffer) const;
void SetVertexData(BufferData& data, const Tegra::FramebufferConfig& framebuffer) const;
@@ -93,23 +93,23 @@ private:
const std::size_t image_count;
const VKScreenInfo& screen_info;
UniqueShaderModule vertex_shader;
UniqueShaderModule fragment_shader;
UniqueDescriptorPool descriptor_pool;
UniqueDescriptorSetLayout descriptor_set_layout;
UniquePipelineLayout pipeline_layout;
UniquePipeline pipeline;
UniqueRenderPass renderpass;
std::vector<UniqueFramebuffer> framebuffers;
std::vector<vk::DescriptorSet> descriptor_sets;
UniqueSampler sampler;
vk::ShaderModule vertex_shader;
vk::ShaderModule fragment_shader;
vk::DescriptorPool descriptor_pool;
vk::DescriptorSetLayout descriptor_set_layout;
vk::PipelineLayout pipeline_layout;
vk::Pipeline pipeline;
vk::RenderPass renderpass;
std::vector<vk::Framebuffer> framebuffers;
vk::DescriptorSets descriptor_sets;
vk::Sampler sampler;
UniqueBuffer buffer;
vk::Buffer buffer;
VKMemoryCommit buffer_commit;
std::vector<std::unique_ptr<VKFenceWatch>> watches;
std::vector<UniqueSemaphore> semaphores;
std::vector<vk::Semaphore> semaphores;
std::vector<std::unique_ptr<VKImage>> raw_images;
std::vector<VKMemoryCommit> raw_buffer_commits;
u32 raw_width = 0;

View File

@@ -11,32 +11,31 @@
#include "common/assert.h"
#include "common/bit_util.h"
#include "core/core.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_buffer_cache.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_stream_buffer.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
namespace {
const auto BufferUsage =
vk::BufferUsageFlagBits::eVertexBuffer | vk::BufferUsageFlagBits::eIndexBuffer |
vk::BufferUsageFlagBits::eUniformBuffer | vk::BufferUsageFlagBits::eStorageBuffer;
constexpr VkBufferUsageFlags BUFFER_USAGE =
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_INDEX_BUFFER_BIT |
VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT;
const auto UploadPipelineStage =
vk::PipelineStageFlagBits::eTransfer | vk::PipelineStageFlagBits::eVertexInput |
vk::PipelineStageFlagBits::eVertexShader | vk::PipelineStageFlagBits::eFragmentShader |
vk::PipelineStageFlagBits::eComputeShader;
constexpr VkPipelineStageFlags UPLOAD_PIPELINE_STAGE =
VK_PIPELINE_STAGE_TRANSFER_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT |
VK_PIPELINE_STAGE_VERTEX_SHADER_BIT | VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT |
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT;
const auto UploadAccessBarriers =
vk::AccessFlagBits::eTransferRead | vk::AccessFlagBits::eShaderRead |
vk::AccessFlagBits::eUniformRead | vk::AccessFlagBits::eVertexAttributeRead |
vk::AccessFlagBits::eIndexRead;
constexpr VkAccessFlags UPLOAD_ACCESS_BARRIERS =
VK_ACCESS_TRANSFER_READ_BIT | VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_UNIFORM_READ_BIT |
VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT | VK_ACCESS_INDEX_READ_BIT;
auto CreateStreamBuffer(const VKDevice& device, VKScheduler& scheduler) {
return std::make_unique<VKStreamBuffer>(device, scheduler, BufferUsage);
std::unique_ptr<VKStreamBuffer> CreateStreamBuffer(const VKDevice& device, VKScheduler& scheduler) {
return std::make_unique<VKStreamBuffer>(device, scheduler, BUFFER_USAGE);
}
} // Anonymous namespace
@@ -44,15 +43,18 @@ auto CreateStreamBuffer(const VKDevice& device, VKScheduler& scheduler) {
CachedBufferBlock::CachedBufferBlock(const VKDevice& device, VKMemoryManager& memory_manager,
VAddr cpu_addr, std::size_t size)
: VideoCommon::BufferBlock{cpu_addr, size} {
const vk::BufferCreateInfo buffer_ci({}, static_cast<vk::DeviceSize>(size),
BufferUsage | vk::BufferUsageFlagBits::eTransferSrc |
vk::BufferUsageFlagBits::eTransferDst,
vk::SharingMode::eExclusive, 0, nullptr);
VkBufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.size = static_cast<VkDeviceSize>(size);
ci.usage = BUFFER_USAGE | VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
const auto& dld{device.GetDispatchLoader()};
const auto dev{device.GetLogical()};
buffer.handle = dev.createBufferUnique(buffer_ci, nullptr, dld);
buffer.commit = memory_manager.Commit(*buffer.handle, false);
buffer.handle = device.GetLogical().CreateBuffer(ci);
buffer.commit = memory_manager.Commit(buffer.handle, false);
}
CachedBufferBlock::~CachedBufferBlock() = default;
@@ -60,9 +62,9 @@ CachedBufferBlock::~CachedBufferBlock() = default;
VKBufferCache::VKBufferCache(VideoCore::RasterizerInterface& rasterizer, Core::System& system,
const VKDevice& device, VKMemoryManager& memory_manager,
VKScheduler& scheduler, VKStagingBufferPool& staging_pool)
: VideoCommon::BufferCache<Buffer, vk::Buffer, VKStreamBuffer>{rasterizer, system,
CreateStreamBuffer(device,
scheduler)},
: VideoCommon::BufferCache<Buffer, VkBuffer, VKStreamBuffer>{rasterizer, system,
CreateStreamBuffer(device,
scheduler)},
device{device}, memory_manager{memory_manager}, scheduler{scheduler}, staging_pool{
staging_pool} {}
@@ -72,18 +74,18 @@ Buffer VKBufferCache::CreateBlock(VAddr cpu_addr, std::size_t size) {
return std::make_shared<CachedBufferBlock>(device, memory_manager, cpu_addr, size);
}
const vk::Buffer* VKBufferCache::ToHandle(const Buffer& buffer) {
const VkBuffer* VKBufferCache::ToHandle(const Buffer& buffer) {
return buffer->GetHandle();
}
const vk::Buffer* VKBufferCache::GetEmptyBuffer(std::size_t size) {
const VkBuffer* VKBufferCache::GetEmptyBuffer(std::size_t size) {
size = std::max(size, std::size_t(4));
const auto& empty = staging_pool.GetUnusedBuffer(size, false);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([size, buffer = *empty.handle](vk::CommandBuffer cmdbuf, auto& dld) {
cmdbuf.fillBuffer(buffer, 0, size, 0, dld);
scheduler.Record([size, buffer = *empty.handle](vk::CommandBuffer cmdbuf) {
cmdbuf.FillBuffer(buffer, 0, size, 0);
});
return &*empty.handle;
return empty.handle.address();
}
void VKBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
@@ -93,14 +95,21 @@ void VKBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, st
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([staging = *staging.handle, buffer = *buffer->GetHandle(), offset,
size](auto cmdbuf, auto& dld) {
cmdbuf.copyBuffer(staging, buffer, {{0, offset, size}}, dld);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTransfer, UploadPipelineStage, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferWrite, UploadAccessBarriers,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, buffer,
offset, size)},
{}, dld);
size](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBuffer(staging, buffer, VkBufferCopy{0, offset, size});
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstAccessMask = UPLOAD_ACCESS_BARRIERS;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = offset;
barrier.size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFER_BIT, UPLOAD_PIPELINE_STAGE, 0, {},
barrier, {});
});
}
@@ -109,16 +118,23 @@ void VKBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset,
const auto& staging = staging_pool.GetUnusedBuffer(size, true);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([staging = *staging.handle, buffer = *buffer->GetHandle(), offset,
size](auto cmdbuf, auto& dld) {
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eVertexShader | vk::PipelineStageFlagBits::eFragmentShader |
vk::PipelineStageFlagBits::eComputeShader,
vk::PipelineStageFlagBits::eTransfer, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eShaderWrite,
vk::AccessFlagBits::eTransferRead, VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED, buffer, offset, size)},
{}, dld);
cmdbuf.copyBuffer(buffer, staging, {{offset, 0, size}}, dld);
size](vk::CommandBuffer cmdbuf) {
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = offset;
barrier.size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_VERTEX_SHADER_BIT |
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT |
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_TRANSFER_BIT, 0, {}, barrier, {});
cmdbuf.CopyBuffer(buffer, staging, VkBufferCopy{offset, 0, size});
});
scheduler.Finish();
@@ -129,17 +145,30 @@ void VKBufferCache::CopyBlock(const Buffer& src, const Buffer& dst, std::size_t
std::size_t dst_offset, std::size_t size) {
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([src_buffer = *src->GetHandle(), dst_buffer = *dst->GetHandle(), src_offset,
dst_offset, size](auto cmdbuf, auto& dld) {
cmdbuf.copyBuffer(src_buffer, dst_buffer, {{src_offset, dst_offset, size}}, dld);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTransfer, UploadPipelineStage, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferRead,
vk::AccessFlagBits::eShaderWrite, VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED, src_buffer, src_offset, size),
vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferWrite, UploadAccessBarriers,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, dst_buffer,
dst_offset, size)},
{}, dld);
dst_offset, size](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBuffer(src_buffer, dst_buffer, VkBufferCopy{src_offset, dst_offset, size});
std::array<VkBufferMemoryBarrier, 2> barriers;
barriers[0].sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barriers[0].pNext = nullptr;
barriers[0].srcAccessMask = VK_ACCESS_TRANSFER_READ_BIT;
barriers[0].dstAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barriers[0].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[0].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[0].buffer = src_buffer;
barriers[0].offset = src_offset;
barriers[0].size = size;
barriers[1].sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barriers[1].pNext = nullptr;
barriers[1].srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
barriers[1].dstAccessMask = UPLOAD_ACCESS_BARRIERS;
barriers[1].srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[1].dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barriers[1].buffer = dst_buffer;
barriers[1].offset = dst_offset;
barriers[1].size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFER_BIT, UPLOAD_PIPELINE_STAGE, 0, {},
barriers, {});
});
}

View File

@@ -11,11 +11,11 @@
#include "common/common_types.h"
#include "video_core/buffer_cache/buffer_cache.h"
#include "video_core/rasterizer_cache.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_stream_buffer.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -33,8 +33,8 @@ public:
VAddr cpu_addr, std::size_t size);
~CachedBufferBlock();
const vk::Buffer* GetHandle() const {
return &*buffer.handle;
const VkBuffer* GetHandle() const {
return buffer.handle.address();
}
private:
@@ -43,21 +43,21 @@ private:
using Buffer = std::shared_ptr<CachedBufferBlock>;
class VKBufferCache final : public VideoCommon::BufferCache<Buffer, vk::Buffer, VKStreamBuffer> {
class VKBufferCache final : public VideoCommon::BufferCache<Buffer, VkBuffer, VKStreamBuffer> {
public:
explicit VKBufferCache(VideoCore::RasterizerInterface& rasterizer, Core::System& system,
const VKDevice& device, VKMemoryManager& memory_manager,
VKScheduler& scheduler, VKStagingBufferPool& staging_pool);
~VKBufferCache();
const vk::Buffer* GetEmptyBuffer(std::size_t size) override;
const VkBuffer* GetEmptyBuffer(std::size_t size) override;
protected:
void WriteBarrier() override {}
Buffer CreateBlock(VAddr cpu_addr, std::size_t size) override;
const vk::Buffer* ToHandle(const Buffer& buffer) override;
const VkBuffer* ToHandle(const Buffer& buffer) override;
void UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) override;

View File

@@ -10,13 +10,13 @@
#include "common/alignment.h"
#include "common/assert.h"
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_compute_pass.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -114,6 +114,35 @@ constexpr u8 quad_array[] = {
0xf9, 0x00, 0x02, 0x00, 0x4c, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x4b, 0x00, 0x00, 0x00,
0xfd, 0x00, 0x01, 0x00, 0x38, 0x00, 0x01, 0x00};
VkDescriptorSetLayoutBinding BuildQuadArrayPassDescriptorSetLayoutBinding() {
VkDescriptorSetLayoutBinding binding;
binding.binding = 0;
binding.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
binding.descriptorCount = 1;
binding.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
binding.pImmutableSamplers = nullptr;
return binding;
}
VkDescriptorUpdateTemplateEntryKHR BuildQuadArrayPassDescriptorUpdateTemplateEntry() {
VkDescriptorUpdateTemplateEntryKHR entry;
entry.dstBinding = 0;
entry.dstArrayElement = 0;
entry.descriptorCount = 1;
entry.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
entry.offset = 0;
entry.stride = sizeof(DescriptorUpdateEntry);
return entry;
}
VkPushConstantRange BuildQuadArrayPassPushConstantRange() {
VkPushConstantRange range;
range.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
range.offset = 0;
range.size = sizeof(u32);
return range;
}
// Uint8 SPIR-V module. Generated from the "shaders/" directory.
constexpr u8 uint8_pass[] = {
0x03, 0x02, 0x23, 0x07, 0x00, 0x00, 0x01, 0x00, 0x07, 0x00, 0x08, 0x00, 0x2f, 0x00, 0x00, 0x00,
@@ -191,53 +220,111 @@ constexpr u8 uint8_pass[] = {
0xf9, 0x00, 0x02, 0x00, 0x1d, 0x00, 0x00, 0x00, 0xf8, 0x00, 0x02, 0x00, 0x1d, 0x00, 0x00, 0x00,
0xfd, 0x00, 0x01, 0x00, 0x38, 0x00, 0x01, 0x00};
std::array<VkDescriptorSetLayoutBinding, 2> BuildUint8PassDescriptorSetBindings() {
std::array<VkDescriptorSetLayoutBinding, 2> bindings;
bindings[0].binding = 0;
bindings[0].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
bindings[0].descriptorCount = 1;
bindings[0].stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
bindings[0].pImmutableSamplers = nullptr;
bindings[1].binding = 1;
bindings[1].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
bindings[1].descriptorCount = 1;
bindings[1].stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
bindings[1].pImmutableSamplers = nullptr;
return bindings;
}
VkDescriptorUpdateTemplateEntryKHR BuildUint8PassDescriptorUpdateTemplateEntry() {
VkDescriptorUpdateTemplateEntryKHR entry;
entry.dstBinding = 0;
entry.dstArrayElement = 0;
entry.descriptorCount = 2;
entry.descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
entry.offset = 0;
entry.stride = sizeof(DescriptorUpdateEntry);
return entry;
}
} // Anonymous namespace
VKComputePass::VKComputePass(const VKDevice& device, VKDescriptorPool& descriptor_pool,
const std::vector<vk::DescriptorSetLayoutBinding>& bindings,
const std::vector<vk::DescriptorUpdateTemplateEntry>& templates,
const std::vector<vk::PushConstantRange> push_constants,
std::size_t code_size, const u8* code) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
vk::Span<VkDescriptorSetLayoutBinding> bindings,
vk::Span<VkDescriptorUpdateTemplateEntryKHR> templates,
vk::Span<VkPushConstantRange> push_constants, std::size_t code_size,
const u8* code) {
VkDescriptorSetLayoutCreateInfo descriptor_layout_ci;
descriptor_layout_ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
descriptor_layout_ci.pNext = nullptr;
descriptor_layout_ci.flags = 0;
descriptor_layout_ci.bindingCount = bindings.size();
descriptor_layout_ci.pBindings = bindings.data();
descriptor_set_layout = device.GetLogical().CreateDescriptorSetLayout(descriptor_layout_ci);
const vk::DescriptorSetLayoutCreateInfo descriptor_layout_ci(
{}, static_cast<u32>(bindings.size()), bindings.data());
descriptor_set_layout = dev.createDescriptorSetLayoutUnique(descriptor_layout_ci, nullptr, dld);
const vk::PipelineLayoutCreateInfo pipeline_layout_ci({}, 1, &*descriptor_set_layout,
static_cast<u32>(push_constants.size()),
push_constants.data());
layout = dev.createPipelineLayoutUnique(pipeline_layout_ci, nullptr, dld);
VkPipelineLayoutCreateInfo pipeline_layout_ci;
pipeline_layout_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
pipeline_layout_ci.pNext = nullptr;
pipeline_layout_ci.flags = 0;
pipeline_layout_ci.setLayoutCount = 1;
pipeline_layout_ci.pSetLayouts = descriptor_set_layout.address();
pipeline_layout_ci.pushConstantRangeCount = push_constants.size();
pipeline_layout_ci.pPushConstantRanges = push_constants.data();
layout = device.GetLogical().CreatePipelineLayout(pipeline_layout_ci);
if (!templates.empty()) {
const vk::DescriptorUpdateTemplateCreateInfo template_ci(
{}, static_cast<u32>(templates.size()), templates.data(),
vk::DescriptorUpdateTemplateType::eDescriptorSet, *descriptor_set_layout,
vk::PipelineBindPoint::eGraphics, *layout, 0);
descriptor_template = dev.createDescriptorUpdateTemplateUnique(template_ci, nullptr, dld);
VkDescriptorUpdateTemplateCreateInfoKHR template_ci;
template_ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR;
template_ci.pNext = nullptr;
template_ci.flags = 0;
template_ci.descriptorUpdateEntryCount = templates.size();
template_ci.pDescriptorUpdateEntries = templates.data();
template_ci.templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR;
template_ci.descriptorSetLayout = *descriptor_set_layout;
template_ci.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
template_ci.pipelineLayout = *layout;
template_ci.set = 0;
descriptor_template = device.GetLogical().CreateDescriptorUpdateTemplateKHR(template_ci);
descriptor_allocator.emplace(descriptor_pool, *descriptor_set_layout);
}
auto code_copy = std::make_unique<u32[]>(code_size / sizeof(u32) + 1);
std::memcpy(code_copy.get(), code, code_size);
const vk::ShaderModuleCreateInfo module_ci({}, code_size, code_copy.get());
module = dev.createShaderModuleUnique(module_ci, nullptr, dld);
const vk::PipelineShaderStageCreateInfo stage_ci({}, vk::ShaderStageFlagBits::eCompute, *module,
"main", nullptr);
VkShaderModuleCreateInfo module_ci;
module_ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
module_ci.pNext = nullptr;
module_ci.flags = 0;
module_ci.codeSize = code_size;
module_ci.pCode = code_copy.get();
module = device.GetLogical().CreateShaderModule(module_ci);
const vk::ComputePipelineCreateInfo pipeline_ci({}, stage_ci, *layout, nullptr, 0);
pipeline = dev.createComputePipelineUnique(nullptr, pipeline_ci, nullptr, dld);
VkComputePipelineCreateInfo pipeline_ci;
pipeline_ci.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO;
pipeline_ci.pNext = nullptr;
pipeline_ci.flags = 0;
pipeline_ci.layout = *layout;
pipeline_ci.basePipelineHandle = nullptr;
pipeline_ci.basePipelineIndex = 0;
VkPipelineShaderStageCreateInfo& stage_ci = pipeline_ci.stage;
stage_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stage_ci.pNext = nullptr;
stage_ci.flags = 0;
stage_ci.stage = VK_SHADER_STAGE_COMPUTE_BIT;
stage_ci.module = *module;
stage_ci.pName = "main";
stage_ci.pSpecializationInfo = nullptr;
pipeline = device.GetLogical().CreateComputePipeline(pipeline_ci);
}
VKComputePass::~VKComputePass() = default;
vk::DescriptorSet VKComputePass::CommitDescriptorSet(
VKUpdateDescriptorQueue& update_descriptor_queue, VKFence& fence) {
VkDescriptorSet VKComputePass::CommitDescriptorSet(VKUpdateDescriptorQueue& update_descriptor_queue,
VKFence& fence) {
if (!descriptor_template) {
return {};
return nullptr;
}
const auto set = descriptor_allocator->Commit(fence);
update_descriptor_queue.Send(*descriptor_template, set);
@@ -248,25 +335,21 @@ QuadArrayPass::QuadArrayPass(const VKDevice& device, VKScheduler& scheduler,
VKDescriptorPool& descriptor_pool,
VKStagingBufferPool& staging_buffer_pool,
VKUpdateDescriptorQueue& update_descriptor_queue)
: VKComputePass(device, descriptor_pool,
{vk::DescriptorSetLayoutBinding(0, vk::DescriptorType::eStorageBuffer, 1,
vk::ShaderStageFlagBits::eCompute, nullptr)},
{vk::DescriptorUpdateTemplateEntry(0, 0, 1, vk::DescriptorType::eStorageBuffer,
0, sizeof(DescriptorUpdateEntry))},
{vk::PushConstantRange(vk::ShaderStageFlagBits::eCompute, 0, sizeof(u32))},
std::size(quad_array), quad_array),
: VKComputePass(device, descriptor_pool, BuildQuadArrayPassDescriptorSetLayoutBinding(),
BuildQuadArrayPassDescriptorUpdateTemplateEntry(),
BuildQuadArrayPassPushConstantRange(), std::size(quad_array), quad_array),
scheduler{scheduler}, staging_buffer_pool{staging_buffer_pool},
update_descriptor_queue{update_descriptor_queue} {}
QuadArrayPass::~QuadArrayPass() = default;
std::pair<const vk::Buffer&, vk::DeviceSize> QuadArrayPass::Assemble(u32 num_vertices, u32 first) {
std::pair<const VkBuffer*, VkDeviceSize> QuadArrayPass::Assemble(u32 num_vertices, u32 first) {
const u32 num_triangle_vertices = num_vertices * 6 / 4;
const std::size_t staging_size = num_triangle_vertices * sizeof(u32);
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(&*buffer.handle, 0, staging_size);
update_descriptor_queue.AddBuffer(buffer.handle.address(), 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
@@ -274,66 +357,72 @@ std::pair<const vk::Buffer&, vk::DeviceSize> QuadArrayPass::Assemble(u32 num_ver
ASSERT(num_vertices % 4 == 0);
const u32 num_quads = num_vertices / 4;
scheduler.Record([layout = *layout, pipeline = *pipeline, buffer = *buffer.handle, num_quads,
first, set](auto cmdbuf, auto& dld) {
first, set](vk::CommandBuffer cmdbuf) {
constexpr u32 dispatch_size = 1024;
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, pipeline, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, layout, 0, {set}, {}, dld);
cmdbuf.pushConstants(layout, vk::ShaderStageFlagBits::eCompute, 0, sizeof(first), &first,
dld);
cmdbuf.dispatch(Common::AlignUp(num_quads, dispatch_size) / dispatch_size, 1, 1, dld);
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, layout, 0, set, {});
cmdbuf.PushConstants(layout, VK_SHADER_STAGE_COMPUTE_BIT, 0, sizeof(first), &first);
cmdbuf.Dispatch(Common::AlignUp(num_quads, dispatch_size) / dispatch_size, 1, 1);
const vk::BufferMemoryBarrier barrier(
vk::AccessFlagBits::eShaderWrite, vk::AccessFlagBits::eVertexAttributeRead,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, buffer, 0,
static_cast<vk::DeviceSize>(num_quads) * 6 * sizeof(u32));
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eComputeShader,
vk::PipelineStageFlagBits::eVertexInput, {}, {}, {barrier}, {}, dld);
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = 0;
barrier.size = static_cast<VkDeviceSize>(num_quads) * 6 * sizeof(u32);
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, {barrier}, {});
});
return {*buffer.handle, 0};
return {buffer.handle.address(), 0};
}
Uint8Pass::Uint8Pass(const VKDevice& device, VKScheduler& scheduler,
VKDescriptorPool& descriptor_pool, VKStagingBufferPool& staging_buffer_pool,
VKUpdateDescriptorQueue& update_descriptor_queue)
: VKComputePass(device, descriptor_pool,
{vk::DescriptorSetLayoutBinding(0, vk::DescriptorType::eStorageBuffer, 1,
vk::ShaderStageFlagBits::eCompute, nullptr),
vk::DescriptorSetLayoutBinding(1, vk::DescriptorType::eStorageBuffer, 1,
vk::ShaderStageFlagBits::eCompute, nullptr)},
{vk::DescriptorUpdateTemplateEntry(0, 0, 2, vk::DescriptorType::eStorageBuffer,
0, sizeof(DescriptorUpdateEntry))},
{}, std::size(uint8_pass), uint8_pass),
: VKComputePass(device, descriptor_pool, BuildUint8PassDescriptorSetBindings(),
BuildUint8PassDescriptorUpdateTemplateEntry(), {}, std::size(uint8_pass),
uint8_pass),
scheduler{scheduler}, staging_buffer_pool{staging_buffer_pool},
update_descriptor_queue{update_descriptor_queue} {}
Uint8Pass::~Uint8Pass() = default;
std::pair<const vk::Buffer*, u64> Uint8Pass::Assemble(u32 num_vertices, vk::Buffer src_buffer,
u64 src_offset) {
std::pair<const VkBuffer*, u64> Uint8Pass::Assemble(u32 num_vertices, VkBuffer src_buffer,
u64 src_offset) {
const auto staging_size = static_cast<u32>(num_vertices * sizeof(u16));
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(&src_buffer, src_offset, num_vertices);
update_descriptor_queue.AddBuffer(&*buffer.handle, 0, staging_size);
update_descriptor_queue.AddBuffer(buffer.handle.address(), 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([layout = *layout, pipeline = *pipeline, buffer = *buffer.handle, set,
num_vertices](auto cmdbuf, auto& dld) {
num_vertices](vk::CommandBuffer cmdbuf) {
constexpr u32 dispatch_size = 1024;
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, pipeline, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, layout, 0, {set}, {}, dld);
cmdbuf.dispatch(Common::AlignUp(num_vertices, dispatch_size) / dispatch_size, 1, 1, dld);
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, layout, 0, set, {});
cmdbuf.Dispatch(Common::AlignUp(num_vertices, dispatch_size) / dispatch_size, 1, 1);
const vk::BufferMemoryBarrier barrier(
vk::AccessFlagBits::eShaderWrite, vk::AccessFlagBits::eVertexAttributeRead,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED, buffer, 0,
static_cast<vk::DeviceSize>(num_vertices) * sizeof(u16));
cmdbuf.pipelineBarrier(vk::PipelineStageFlagBits::eComputeShader,
vk::PipelineStageFlagBits::eVertexInput, {}, {}, {barrier}, {}, dld);
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
barrier.dstAccessMask = VK_ACCESS_VERTEX_ATTRIBUTE_READ_BIT;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.buffer = buffer;
barrier.offset = 0;
barrier.size = static_cast<VkDeviceSize>(num_vertices * sizeof(u16));
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, barrier, {});
});
return {&*buffer.handle, 0};
return {buffer.handle.address(), 0};
}
} // namespace Vulkan

View File

@@ -8,8 +8,8 @@
#include <utility>
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -22,24 +22,24 @@ class VKUpdateDescriptorQueue;
class VKComputePass {
public:
explicit VKComputePass(const VKDevice& device, VKDescriptorPool& descriptor_pool,
const std::vector<vk::DescriptorSetLayoutBinding>& bindings,
const std::vector<vk::DescriptorUpdateTemplateEntry>& templates,
const std::vector<vk::PushConstantRange> push_constants,
std::size_t code_size, const u8* code);
vk::Span<VkDescriptorSetLayoutBinding> bindings,
vk::Span<VkDescriptorUpdateTemplateEntryKHR> templates,
vk::Span<VkPushConstantRange> push_constants, std::size_t code_size,
const u8* code);
~VKComputePass();
protected:
vk::DescriptorSet CommitDescriptorSet(VKUpdateDescriptorQueue& update_descriptor_queue,
VKFence& fence);
VkDescriptorSet CommitDescriptorSet(VKUpdateDescriptorQueue& update_descriptor_queue,
VKFence& fence);
UniqueDescriptorUpdateTemplate descriptor_template;
UniquePipelineLayout layout;
UniquePipeline pipeline;
vk::DescriptorUpdateTemplateKHR descriptor_template;
vk::PipelineLayout layout;
vk::Pipeline pipeline;
private:
UniqueDescriptorSetLayout descriptor_set_layout;
vk::DescriptorSetLayout descriptor_set_layout;
std::optional<DescriptorAllocator> descriptor_allocator;
UniqueShaderModule module;
vk::ShaderModule module;
};
class QuadArrayPass final : public VKComputePass {
@@ -50,7 +50,7 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue);
~QuadArrayPass();
std::pair<const vk::Buffer&, vk::DeviceSize> Assemble(u32 num_vertices, u32 first);
std::pair<const VkBuffer*, VkDeviceSize> Assemble(u32 num_vertices, u32 first);
private:
VKScheduler& scheduler;
@@ -65,8 +65,7 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue);
~Uint8Pass();
std::pair<const vk::Buffer*, u64> Assemble(u32 num_vertices, vk::Buffer src_buffer,
u64 src_offset);
std::pair<const VkBuffer*, u64> Assemble(u32 num_vertices, VkBuffer src_buffer, u64 src_offset);
private:
VKScheduler& scheduler;

View File

@@ -5,7 +5,6 @@
#include <memory>
#include <vector>
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_compute_pipeline.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_device.h"
@@ -14,6 +13,7 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_shader_decompiler.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -30,7 +30,7 @@ VKComputePipeline::VKComputePipeline(const VKDevice& device, VKScheduler& schedu
VKComputePipeline::~VKComputePipeline() = default;
vk::DescriptorSet VKComputePipeline::CommitDescriptorSet() {
VkDescriptorSet VKComputePipeline::CommitDescriptorSet() {
if (!descriptor_template) {
return {};
}
@@ -39,74 +39,109 @@ vk::DescriptorSet VKComputePipeline::CommitDescriptorSet() {
return set;
}
UniqueDescriptorSetLayout VKComputePipeline::CreateDescriptorSetLayout() const {
std::vector<vk::DescriptorSetLayoutBinding> bindings;
vk::DescriptorSetLayout VKComputePipeline::CreateDescriptorSetLayout() const {
std::vector<VkDescriptorSetLayoutBinding> bindings;
u32 binding = 0;
const auto AddBindings = [&](vk::DescriptorType descriptor_type, std::size_t num_entries) {
const auto add_bindings = [&](VkDescriptorType descriptor_type, std::size_t num_entries) {
// TODO(Rodrigo): Maybe make individual bindings here?
for (u32 bindpoint = 0; bindpoint < static_cast<u32>(num_entries); ++bindpoint) {
bindings.emplace_back(binding++, descriptor_type, 1, vk::ShaderStageFlagBits::eCompute,
nullptr);
VkDescriptorSetLayoutBinding& entry = bindings.emplace_back();
entry.binding = binding++;
entry.descriptorType = descriptor_type;
entry.descriptorCount = 1;
entry.stageFlags = VK_SHADER_STAGE_COMPUTE_BIT;
entry.pImmutableSamplers = nullptr;
}
};
AddBindings(vk::DescriptorType::eUniformBuffer, entries.const_buffers.size());
AddBindings(vk::DescriptorType::eStorageBuffer, entries.global_buffers.size());
AddBindings(vk::DescriptorType::eUniformTexelBuffer, entries.texel_buffers.size());
AddBindings(vk::DescriptorType::eCombinedImageSampler, entries.samplers.size());
AddBindings(vk::DescriptorType::eStorageImage, entries.images.size());
add_bindings(VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, entries.const_buffers.size());
add_bindings(VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, entries.global_buffers.size());
add_bindings(VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER, entries.texel_buffers.size());
add_bindings(VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, entries.samplers.size());
add_bindings(VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, entries.images.size());
const vk::DescriptorSetLayoutCreateInfo descriptor_set_layout_ci(
{}, static_cast<u32>(bindings.size()), bindings.data());
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createDescriptorSetLayoutUnique(descriptor_set_layout_ci, nullptr, dld);
VkDescriptorSetLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.bindingCount = static_cast<u32>(bindings.size());
ci.pBindings = bindings.data();
return device.GetLogical().CreateDescriptorSetLayout(ci);
}
UniquePipelineLayout VKComputePipeline::CreatePipelineLayout() const {
const vk::PipelineLayoutCreateInfo layout_ci({}, 1, &*descriptor_set_layout, 0, nullptr);
const auto dev = device.GetLogical();
return dev.createPipelineLayoutUnique(layout_ci, nullptr, device.GetDispatchLoader());
vk::PipelineLayout VKComputePipeline::CreatePipelineLayout() const {
VkPipelineLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.setLayoutCount = 1;
ci.pSetLayouts = descriptor_set_layout.address();
ci.pushConstantRangeCount = 0;
ci.pPushConstantRanges = nullptr;
return device.GetLogical().CreatePipelineLayout(ci);
}
UniqueDescriptorUpdateTemplate VKComputePipeline::CreateDescriptorUpdateTemplate() const {
std::vector<vk::DescriptorUpdateTemplateEntry> template_entries;
vk::DescriptorUpdateTemplateKHR VKComputePipeline::CreateDescriptorUpdateTemplate() const {
std::vector<VkDescriptorUpdateTemplateEntryKHR> template_entries;
u32 binding = 0;
u32 offset = 0;
FillDescriptorUpdateTemplateEntries(entries, binding, offset, template_entries);
if (template_entries.empty()) {
// If the shader doesn't use descriptor sets, skip template creation.
return UniqueDescriptorUpdateTemplate{};
return {};
}
const vk::DescriptorUpdateTemplateCreateInfo template_ci(
{}, static_cast<u32>(template_entries.size()), template_entries.data(),
vk::DescriptorUpdateTemplateType::eDescriptorSet, *descriptor_set_layout,
vk::PipelineBindPoint::eGraphics, *layout, DESCRIPTOR_SET);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createDescriptorUpdateTemplateUnique(template_ci, nullptr, dld);
VkDescriptorUpdateTemplateCreateInfoKHR ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR;
ci.pNext = nullptr;
ci.flags = 0;
ci.descriptorUpdateEntryCount = static_cast<u32>(template_entries.size());
ci.pDescriptorUpdateEntries = template_entries.data();
ci.templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR;
ci.descriptorSetLayout = *descriptor_set_layout;
ci.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
ci.pipelineLayout = *layout;
ci.set = DESCRIPTOR_SET;
return device.GetLogical().CreateDescriptorUpdateTemplateKHR(ci);
}
UniqueShaderModule VKComputePipeline::CreateShaderModule(const std::vector<u32>& code) const {
const vk::ShaderModuleCreateInfo module_ci({}, code.size() * sizeof(u32), code.data());
const auto dev = device.GetLogical();
return dev.createShaderModuleUnique(module_ci, nullptr, device.GetDispatchLoader());
vk::ShaderModule VKComputePipeline::CreateShaderModule(const std::vector<u32>& code) const {
VkShaderModuleCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.codeSize = code.size() * sizeof(u32);
ci.pCode = code.data();
return device.GetLogical().CreateShaderModule(ci);
}
UniquePipeline VKComputePipeline::CreatePipeline() const {
vk::PipelineShaderStageCreateInfo shader_stage_ci({}, vk::ShaderStageFlagBits::eCompute,
*shader_module, "main", nullptr);
vk::PipelineShaderStageRequiredSubgroupSizeCreateInfoEXT subgroup_size_ci;
vk::Pipeline VKComputePipeline::CreatePipeline() const {
VkComputePipelineCreateInfo ci;
VkPipelineShaderStageCreateInfo& stage_ci = ci.stage;
stage_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stage_ci.pNext = nullptr;
stage_ci.flags = 0;
stage_ci.stage = VK_SHADER_STAGE_COMPUTE_BIT;
stage_ci.module = *shader_module;
stage_ci.pName = "main";
stage_ci.pSpecializationInfo = nullptr;
VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT subgroup_size_ci;
subgroup_size_ci.sType =
VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_REQUIRED_SUBGROUP_SIZE_CREATE_INFO_EXT;
subgroup_size_ci.pNext = nullptr;
subgroup_size_ci.requiredSubgroupSize = GuestWarpSize;
if (entries.uses_warps && device.IsGuestWarpSizeSupported(vk::ShaderStageFlagBits::eCompute)) {
shader_stage_ci.pNext = &subgroup_size_ci;
if (entries.uses_warps && device.IsGuestWarpSizeSupported(VK_SHADER_STAGE_COMPUTE_BIT)) {
stage_ci.pNext = &subgroup_size_ci;
}
const vk::ComputePipelineCreateInfo create_info({}, shader_stage_ci, *layout, {}, 0);
const auto dev = device.GetLogical();
return dev.createComputePipelineUnique({}, create_info, nullptr, device.GetDispatchLoader());
ci.sType = VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.layout = *layout;
ci.basePipelineHandle = nullptr;
ci.basePipelineIndex = 0;
return device.GetLogical().CreateComputePipeline(ci);
}
} // namespace Vulkan

View File

@@ -7,9 +7,9 @@
#include <memory>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_shader_decompiler.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -25,42 +25,42 @@ public:
const SPIRVShader& shader);
~VKComputePipeline();
vk::DescriptorSet CommitDescriptorSet();
VkDescriptorSet CommitDescriptorSet();
vk::Pipeline GetHandle() const {
VkPipeline GetHandle() const {
return *pipeline;
}
vk::PipelineLayout GetLayout() const {
VkPipelineLayout GetLayout() const {
return *layout;
}
const ShaderEntries& GetEntries() {
const ShaderEntries& GetEntries() const {
return entries;
}
private:
UniqueDescriptorSetLayout CreateDescriptorSetLayout() const;
vk::DescriptorSetLayout CreateDescriptorSetLayout() const;
UniquePipelineLayout CreatePipelineLayout() const;
vk::PipelineLayout CreatePipelineLayout() const;
UniqueDescriptorUpdateTemplate CreateDescriptorUpdateTemplate() const;
vk::DescriptorUpdateTemplateKHR CreateDescriptorUpdateTemplate() const;
UniqueShaderModule CreateShaderModule(const std::vector<u32>& code) const;
vk::ShaderModule CreateShaderModule(const std::vector<u32>& code) const;
UniquePipeline CreatePipeline() const;
vk::Pipeline CreatePipeline() const;
const VKDevice& device;
VKScheduler& scheduler;
ShaderEntries entries;
UniqueDescriptorSetLayout descriptor_set_layout;
vk::DescriptorSetLayout descriptor_set_layout;
DescriptorAllocator descriptor_allocator;
VKUpdateDescriptorQueue& update_descriptor_queue;
UniquePipelineLayout layout;
UniqueDescriptorUpdateTemplate descriptor_template;
UniqueShaderModule shader_module;
UniquePipeline pipeline;
vk::PipelineLayout layout;
vk::DescriptorUpdateTemplateKHR descriptor_template;
vk::ShaderModule shader_module;
vk::Pipeline pipeline;
};
} // namespace Vulkan

View File

@@ -6,10 +6,10 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -17,19 +17,18 @@ namespace Vulkan {
constexpr std::size_t SETS_GROW_RATE = 0x20;
DescriptorAllocator::DescriptorAllocator(VKDescriptorPool& descriptor_pool,
vk::DescriptorSetLayout layout)
VkDescriptorSetLayout layout)
: VKFencedPool{SETS_GROW_RATE}, descriptor_pool{descriptor_pool}, layout{layout} {}
DescriptorAllocator::~DescriptorAllocator() = default;
vk::DescriptorSet DescriptorAllocator::Commit(VKFence& fence) {
return *descriptors[CommitResource(fence)];
VkDescriptorSet DescriptorAllocator::Commit(VKFence& fence) {
const std::size_t index = CommitResource(fence);
return descriptors_allocations[index / SETS_GROW_RATE][index % SETS_GROW_RATE];
}
void DescriptorAllocator::Allocate(std::size_t begin, std::size_t end) {
auto new_sets = descriptor_pool.AllocateDescriptors(layout, end - begin);
descriptors.insert(descriptors.end(), std::make_move_iterator(new_sets.begin()),
std::make_move_iterator(new_sets.end()));
descriptors_allocations.push_back(descriptor_pool.AllocateDescriptors(layout, end - begin));
}
VKDescriptorPool::VKDescriptorPool(const VKDevice& device)
@@ -37,53 +36,50 @@ VKDescriptorPool::VKDescriptorPool(const VKDevice& device)
VKDescriptorPool::~VKDescriptorPool() = default;
vk::DescriptorPool VKDescriptorPool::AllocateNewPool() {
vk::DescriptorPool* VKDescriptorPool::AllocateNewPool() {
static constexpr u32 num_sets = 0x20000;
static constexpr vk::DescriptorPoolSize pool_sizes[] = {
{vk::DescriptorType::eUniformBuffer, num_sets * 90},
{vk::DescriptorType::eStorageBuffer, num_sets * 60},
{vk::DescriptorType::eUniformTexelBuffer, num_sets * 64},
{vk::DescriptorType::eCombinedImageSampler, num_sets * 64},
{vk::DescriptorType::eStorageImage, num_sets * 40}};
static constexpr VkDescriptorPoolSize pool_sizes[] = {
{VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER, num_sets * 90},
{VK_DESCRIPTOR_TYPE_STORAGE_BUFFER, num_sets * 60},
{VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER, num_sets * 64},
{VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER, num_sets * 64},
{VK_DESCRIPTOR_TYPE_STORAGE_IMAGE, num_sets * 40}};
const vk::DescriptorPoolCreateInfo create_info(
vk::DescriptorPoolCreateFlagBits::eFreeDescriptorSet, num_sets,
static_cast<u32>(std::size(pool_sizes)), std::data(pool_sizes));
const auto dev = device.GetLogical();
return *pools.emplace_back(
dev.createDescriptorPoolUnique(create_info, nullptr, device.GetDispatchLoader()));
VkDescriptorPoolCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT;
ci.maxSets = num_sets;
ci.poolSizeCount = static_cast<u32>(std::size(pool_sizes));
ci.pPoolSizes = std::data(pool_sizes);
return &pools.emplace_back(device.GetLogical().CreateDescriptorPool(ci));
}
std::vector<UniqueDescriptorSet> VKDescriptorPool::AllocateDescriptors(
vk::DescriptorSetLayout layout, std::size_t count) {
std::vector layout_copies(count, layout);
vk::DescriptorSetAllocateInfo allocate_info(active_pool, static_cast<u32>(count),
layout_copies.data());
vk::DescriptorSets VKDescriptorPool::AllocateDescriptors(VkDescriptorSetLayout layout,
std::size_t count) {
const std::vector layout_copies(count, layout);
VkDescriptorSetAllocateInfo ai;
ai.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
ai.pNext = nullptr;
ai.descriptorPool = **active_pool;
ai.descriptorSetCount = static_cast<u32>(count);
ai.pSetLayouts = layout_copies.data();
std::vector<vk::DescriptorSet> sets(count);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
switch (const auto result = dev.allocateDescriptorSets(&allocate_info, sets.data(), dld)) {
case vk::Result::eSuccess:
break;
case vk::Result::eErrorOutOfPoolMemory:
active_pool = AllocateNewPool();
allocate_info.descriptorPool = active_pool;
if (dev.allocateDescriptorSets(&allocate_info, sets.data(), dld) == vk::Result::eSuccess) {
break;
}
[[fallthrough]];
default:
vk::throwResultException(result, "vk::Device::allocateDescriptorSetsUnique");
vk::DescriptorSets sets = active_pool->Allocate(ai);
if (!sets.IsOutOfPoolMemory()) {
return sets;
}
vk::PoolFree deleter(dev, active_pool, dld);
std::vector<UniqueDescriptorSet> unique_sets;
unique_sets.reserve(count);
for (const auto set : sets) {
unique_sets.push_back(UniqueDescriptorSet{set, deleter});
// Our current pool is out of memory. Allocate a new one and retry
active_pool = AllocateNewPool();
ai.descriptorPool = **active_pool;
sets = active_pool->Allocate(ai);
if (!sets.IsOutOfPoolMemory()) {
return sets;
}
return unique_sets;
// After allocating a new pool, we are out of memory again. We can't handle this from here.
throw vk::Exception(VK_ERROR_OUT_OF_POOL_MEMORY);
}
} // namespace Vulkan

View File

@@ -8,8 +8,8 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -17,21 +17,21 @@ class VKDescriptorPool;
class DescriptorAllocator final : public VKFencedPool {
public:
explicit DescriptorAllocator(VKDescriptorPool& descriptor_pool, vk::DescriptorSetLayout layout);
explicit DescriptorAllocator(VKDescriptorPool& descriptor_pool, VkDescriptorSetLayout layout);
~DescriptorAllocator() override;
DescriptorAllocator(const DescriptorAllocator&) = delete;
vk::DescriptorSet Commit(VKFence& fence);
VkDescriptorSet Commit(VKFence& fence);
protected:
void Allocate(std::size_t begin, std::size_t end) override;
private:
VKDescriptorPool& descriptor_pool;
const vk::DescriptorSetLayout layout;
const VkDescriptorSetLayout layout;
std::vector<UniqueDescriptorSet> descriptors;
std::vector<vk::DescriptorSets> descriptors_allocations;
};
class VKDescriptorPool final {
@@ -42,15 +42,14 @@ public:
~VKDescriptorPool();
private:
vk::DescriptorPool AllocateNewPool();
vk::DescriptorPool* AllocateNewPool();
std::vector<UniqueDescriptorSet> AllocateDescriptors(vk::DescriptorSetLayout layout,
std::size_t count);
vk::DescriptorSets AllocateDescriptors(VkDescriptorSetLayout layout, std::size_t count);
const VKDevice& device;
std::vector<UniqueDescriptorPool> pools;
vk::DescriptorPool active_pool;
std::vector<vk::DescriptorPool> pools;
vk::DescriptorPool* active_pool;
};
} // namespace Vulkan

View File

@@ -6,15 +6,15 @@
#include <chrono>
#include <cstdlib>
#include <optional>
#include <set>
#include <string_view>
#include <thread>
#include <unordered_set>
#include <vector>
#include "common/assert.h"
#include "core/settings.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -22,49 +22,43 @@ namespace {
namespace Alternatives {
constexpr std::array Depth24UnormS8Uint = {vk::Format::eD32SfloatS8Uint,
vk::Format::eD16UnormS8Uint, vk::Format{}};
constexpr std::array Depth16UnormS8Uint = {vk::Format::eD24UnormS8Uint,
vk::Format::eD32SfloatS8Uint, vk::Format{}};
constexpr std::array Depth24UnormS8_UINT = {VK_FORMAT_D32_SFLOAT_S8_UINT,
VK_FORMAT_D16_UNORM_S8_UINT, VkFormat{}};
constexpr std::array Depth16UnormS8_UINT = {VK_FORMAT_D24_UNORM_S8_UINT,
VK_FORMAT_D32_SFLOAT_S8_UINT, VkFormat{}};
} // namespace Alternatives
constexpr std::array REQUIRED_EXTENSIONS = {
VK_KHR_SWAPCHAIN_EXTENSION_NAME,
VK_KHR_16BIT_STORAGE_EXTENSION_NAME,
VK_KHR_8BIT_STORAGE_EXTENSION_NAME,
VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME,
VK_KHR_DESCRIPTOR_UPDATE_TEMPLATE_EXTENSION_NAME,
VK_EXT_VERTEX_ATTRIBUTE_DIVISOR_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_BALLOT_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_VOTE_EXTENSION_NAME,
VK_EXT_HOST_QUERY_RESET_EXTENSION_NAME,
};
template <typename T>
void SetNext(void**& next, T& data) {
*next = &data;
next = &data.pNext;
}
template <typename T>
T GetFeatures(vk::PhysicalDevice physical, const vk::DispatchLoaderDynamic& dld) {
vk::PhysicalDeviceFeatures2 features;
T extension_features;
features.pNext = &extension_features;
physical.getFeatures2(&features, dld);
return extension_features;
}
template <typename T>
T GetProperties(vk::PhysicalDevice physical, const vk::DispatchLoaderDynamic& dld) {
vk::PhysicalDeviceProperties2 properties;
T extension_properties;
properties.pNext = &extension_properties;
physical.getProperties2(&properties, dld);
return extension_properties;
}
constexpr const vk::Format* GetFormatAlternatives(vk::Format format) {
constexpr const VkFormat* GetFormatAlternatives(VkFormat format) {
switch (format) {
case vk::Format::eD24UnormS8Uint:
return Alternatives::Depth24UnormS8Uint.data();
case vk::Format::eD16UnormS8Uint:
return Alternatives::Depth16UnormS8Uint.data();
case VK_FORMAT_D24_UNORM_S8_UINT:
return Alternatives::Depth24UnormS8_UINT.data();
case VK_FORMAT_D16_UNORM_S8_UINT:
return Alternatives::Depth16UnormS8_UINT.data();
default:
return nullptr;
}
}
vk::FormatFeatureFlags GetFormatFeatures(vk::FormatProperties properties, FormatType format_type) {
VkFormatFeatureFlags GetFormatFeatures(VkFormatProperties properties, FormatType format_type) {
switch (format_type) {
case FormatType::Linear:
return properties.linearTilingFeatures;
@@ -77,79 +71,220 @@ vk::FormatFeatureFlags GetFormatFeatures(vk::FormatProperties properties, Format
}
}
std::unordered_map<VkFormat, VkFormatProperties> GetFormatProperties(
vk::PhysicalDevice physical, const vk::InstanceDispatch& dld) {
static constexpr std::array formats{VK_FORMAT_A8B8G8R8_UNORM_PACK32,
VK_FORMAT_A8B8G8R8_UINT_PACK32,
VK_FORMAT_A8B8G8R8_SNORM_PACK32,
VK_FORMAT_A8B8G8R8_SRGB_PACK32,
VK_FORMAT_B5G6R5_UNORM_PACK16,
VK_FORMAT_A2B10G10R10_UNORM_PACK32,
VK_FORMAT_A1R5G5B5_UNORM_PACK16,
VK_FORMAT_R32G32B32A32_SFLOAT,
VK_FORMAT_R32G32B32A32_UINT,
VK_FORMAT_R32G32_SFLOAT,
VK_FORMAT_R32G32_UINT,
VK_FORMAT_R16G16B16A16_UINT,
VK_FORMAT_R16G16B16A16_SNORM,
VK_FORMAT_R16G16B16A16_UNORM,
VK_FORMAT_R16G16_UNORM,
VK_FORMAT_R16G16_SNORM,
VK_FORMAT_R16G16_SFLOAT,
VK_FORMAT_R16_UNORM,
VK_FORMAT_R8G8B8A8_SRGB,
VK_FORMAT_R8G8_UNORM,
VK_FORMAT_R8G8_SNORM,
VK_FORMAT_R8_UNORM,
VK_FORMAT_R8_UINT,
VK_FORMAT_B10G11R11_UFLOAT_PACK32,
VK_FORMAT_R32_SFLOAT,
VK_FORMAT_R32_UINT,
VK_FORMAT_R32_SINT,
VK_FORMAT_R16_SFLOAT,
VK_FORMAT_R16G16B16A16_SFLOAT,
VK_FORMAT_B8G8R8A8_UNORM,
VK_FORMAT_R4G4B4A4_UNORM_PACK16,
VK_FORMAT_D32_SFLOAT,
VK_FORMAT_D16_UNORM,
VK_FORMAT_D16_UNORM_S8_UINT,
VK_FORMAT_D24_UNORM_S8_UINT,
VK_FORMAT_D32_SFLOAT_S8_UINT,
VK_FORMAT_BC1_RGBA_UNORM_BLOCK,
VK_FORMAT_BC2_UNORM_BLOCK,
VK_FORMAT_BC3_UNORM_BLOCK,
VK_FORMAT_BC4_UNORM_BLOCK,
VK_FORMAT_BC5_UNORM_BLOCK,
VK_FORMAT_BC5_SNORM_BLOCK,
VK_FORMAT_BC7_UNORM_BLOCK,
VK_FORMAT_BC6H_UFLOAT_BLOCK,
VK_FORMAT_BC6H_SFLOAT_BLOCK,
VK_FORMAT_BC1_RGBA_SRGB_BLOCK,
VK_FORMAT_BC2_SRGB_BLOCK,
VK_FORMAT_BC3_SRGB_BLOCK,
VK_FORMAT_BC7_SRGB_BLOCK,
VK_FORMAT_ASTC_4x4_SRGB_BLOCK,
VK_FORMAT_ASTC_8x8_SRGB_BLOCK,
VK_FORMAT_ASTC_8x5_SRGB_BLOCK,
VK_FORMAT_ASTC_5x4_SRGB_BLOCK,
VK_FORMAT_ASTC_5x5_UNORM_BLOCK,
VK_FORMAT_ASTC_5x5_SRGB_BLOCK,
VK_FORMAT_ASTC_10x8_UNORM_BLOCK,
VK_FORMAT_ASTC_10x8_SRGB_BLOCK,
VK_FORMAT_ASTC_6x6_UNORM_BLOCK,
VK_FORMAT_ASTC_6x6_SRGB_BLOCK,
VK_FORMAT_ASTC_10x10_UNORM_BLOCK,
VK_FORMAT_ASTC_10x10_SRGB_BLOCK,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK,
VK_FORMAT_ASTC_12x12_SRGB_BLOCK,
VK_FORMAT_ASTC_8x6_UNORM_BLOCK,
VK_FORMAT_ASTC_8x6_SRGB_BLOCK,
VK_FORMAT_ASTC_6x5_UNORM_BLOCK,
VK_FORMAT_ASTC_6x5_SRGB_BLOCK,
VK_FORMAT_E5B9G9R9_UFLOAT_PACK32};
std::unordered_map<VkFormat, VkFormatProperties> format_properties;
for (const auto format : formats) {
format_properties.emplace(format, physical.GetFormatProperties(format));
}
return format_properties;
}
} // Anonymous namespace
VKDevice::VKDevice(const vk::DispatchLoaderDynamic& dld, vk::PhysicalDevice physical,
vk::SurfaceKHR surface)
: dld{dld}, physical{physical}, properties{physical.getProperties(dld)},
format_properties{GetFormatProperties(dld, physical)} {
VKDevice::VKDevice(VkInstance instance, vk::PhysicalDevice physical, VkSurfaceKHR surface,
const vk::InstanceDispatch& dld)
: dld{dld}, physical{physical}, properties{physical.GetProperties()},
format_properties{GetFormatProperties(physical, dld)} {
SetupFamilies(surface);
SetupFeatures();
}
VKDevice::~VKDevice() = default;
bool VKDevice::Create(vk::Instance instance) {
bool VKDevice::Create() {
const auto queue_cis = GetDeviceQueueCreateInfos();
const std::vector extensions = LoadExtensions();
vk::PhysicalDeviceFeatures2 features2;
VkPhysicalDeviceFeatures2 features2;
features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
features2.pNext = nullptr;
void** next = &features2.pNext;
auto& features = features2.features;
features.vertexPipelineStoresAndAtomics = true;
features.robustBufferAccess = false;
features.fullDrawIndexUint32 = false;
features.imageCubeArray = false;
features.independentBlend = true;
features.depthClamp = true;
features.samplerAnisotropy = true;
features.largePoints = true;
features.multiViewport = true;
features.depthBiasClamp = true;
features.geometryShader = true;
features.tessellationShader = true;
features.sampleRateShading = false;
features.dualSrcBlend = false;
features.logicOp = false;
features.multiDrawIndirect = false;
features.drawIndirectFirstInstance = false;
features.depthClamp = true;
features.depthBiasClamp = true;
features.fillModeNonSolid = false;
features.depthBounds = false;
features.wideLines = false;
features.largePoints = true;
features.alphaToOne = false;
features.multiViewport = true;
features.samplerAnisotropy = true;
features.textureCompressionETC2 = false;
features.textureCompressionASTC_LDR = is_optimal_astc_supported;
features.textureCompressionBC = false;
features.occlusionQueryPrecise = true;
features.pipelineStatisticsQuery = false;
features.vertexPipelineStoresAndAtomics = true;
features.fragmentStoresAndAtomics = true;
features.shaderTessellationAndGeometryPointSize = false;
features.shaderImageGatherExtended = true;
features.shaderStorageImageExtendedFormats = false;
features.shaderStorageImageMultisample = false;
features.shaderStorageImageReadWithoutFormat = is_formatless_image_load_supported;
features.shaderStorageImageWriteWithoutFormat = true;
features.textureCompressionASTC_LDR = is_optimal_astc_supported;
features.shaderUniformBufferArrayDynamicIndexing = false;
features.shaderSampledImageArrayDynamicIndexing = false;
features.shaderStorageBufferArrayDynamicIndexing = false;
features.shaderStorageImageArrayDynamicIndexing = false;
features.shaderClipDistance = false;
features.shaderCullDistance = false;
features.shaderFloat64 = false;
features.shaderInt64 = false;
features.shaderInt16 = false;
features.shaderResourceResidency = false;
features.shaderResourceMinLod = false;
features.sparseBinding = false;
features.sparseResidencyBuffer = false;
features.sparseResidencyImage2D = false;
features.sparseResidencyImage3D = false;
features.sparseResidency2Samples = false;
features.sparseResidency4Samples = false;
features.sparseResidency8Samples = false;
features.sparseResidency16Samples = false;
features.sparseResidencyAliased = false;
features.variableMultisampleRate = false;
features.inheritedQueries = false;
vk::PhysicalDevice16BitStorageFeaturesKHR bit16_storage;
VkPhysicalDevice16BitStorageFeaturesKHR bit16_storage;
bit16_storage.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_16BIT_STORAGE_FEATURES_KHR;
bit16_storage.pNext = nullptr;
bit16_storage.storageBuffer16BitAccess = false;
bit16_storage.uniformAndStorageBuffer16BitAccess = true;
bit16_storage.storagePushConstant16 = false;
bit16_storage.storageInputOutput16 = false;
SetNext(next, bit16_storage);
vk::PhysicalDevice8BitStorageFeaturesKHR bit8_storage;
VkPhysicalDevice8BitStorageFeaturesKHR bit8_storage;
bit8_storage.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_8BIT_STORAGE_FEATURES_KHR;
bit8_storage.pNext = nullptr;
bit8_storage.storageBuffer8BitAccess = false;
bit8_storage.uniformAndStorageBuffer8BitAccess = true;
bit8_storage.storagePushConstant8 = false;
SetNext(next, bit8_storage);
vk::PhysicalDeviceHostQueryResetFeaturesEXT host_query_reset;
VkPhysicalDeviceHostQueryResetFeaturesEXT host_query_reset;
host_query_reset.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_HOST_QUERY_RESET_FEATURES_EXT;
host_query_reset.hostQueryReset = true;
SetNext(next, host_query_reset);
vk::PhysicalDeviceFloat16Int8FeaturesKHR float16_int8;
VkPhysicalDeviceFloat16Int8FeaturesKHR float16_int8;
if (is_float16_supported) {
float16_int8.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR;
float16_int8.pNext = nullptr;
float16_int8.shaderFloat16 = true;
float16_int8.shaderInt8 = false;
SetNext(next, float16_int8);
} else {
LOG_INFO(Render_Vulkan, "Device doesn't support float16 natively");
}
vk::PhysicalDeviceUniformBufferStandardLayoutFeaturesKHR std430_layout;
VkPhysicalDeviceUniformBufferStandardLayoutFeaturesKHR std430_layout;
if (khr_uniform_buffer_standard_layout) {
std430_layout.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_UNIFORM_BUFFER_STANDARD_LAYOUT_FEATURES_KHR;
std430_layout.pNext = nullptr;
std430_layout.uniformBufferStandardLayout = true;
SetNext(next, std430_layout);
} else {
LOG_INFO(Render_Vulkan, "Device doesn't support packed UBOs");
}
vk::PhysicalDeviceIndexTypeUint8FeaturesEXT index_type_uint8;
VkPhysicalDeviceIndexTypeUint8FeaturesEXT index_type_uint8;
if (ext_index_type_uint8) {
index_type_uint8.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_INDEX_TYPE_UINT8_FEATURES_EXT;
index_type_uint8.pNext = nullptr;
index_type_uint8.indexTypeUint8 = true;
SetNext(next, index_type_uint8);
} else {
LOG_INFO(Render_Vulkan, "Device doesn't support uint8 indexes");
}
vk::PhysicalDeviceTransformFeedbackFeaturesEXT transform_feedback;
VkPhysicalDeviceTransformFeedbackFeaturesEXT transform_feedback;
if (ext_transform_feedback) {
transform_feedback.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT;
transform_feedback.pNext = nullptr;
transform_feedback.transformFeedback = true;
transform_feedback.geometryStreams = true;
SetNext(next, transform_feedback);
@@ -161,60 +296,48 @@ bool VKDevice::Create(vk::Instance instance) {
LOG_INFO(Render_Vulkan, "Device doesn't support depth range unrestricted");
}
vk::DeviceCreateInfo device_ci({}, static_cast<u32>(queue_cis.size()), queue_cis.data(), 0,
nullptr, static_cast<u32>(extensions.size()), extensions.data(),
nullptr);
device_ci.pNext = &features2;
vk::Device unsafe_logical;
if (physical.createDevice(&device_ci, nullptr, &unsafe_logical, dld) != vk::Result::eSuccess) {
LOG_CRITICAL(Render_Vulkan, "Logical device failed to be created!");
logical = vk::Device::Create(physical, queue_cis, extensions, features2, dld);
if (!logical) {
LOG_ERROR(Render_Vulkan, "Failed to create logical device");
return false;
}
dld.init(instance, dld.vkGetInstanceProcAddr, unsafe_logical);
logical = UniqueDevice(unsafe_logical, {nullptr, dld});
CollectTelemetryParameters();
graphics_queue = logical->getQueue(graphics_family, 0, dld);
present_queue = logical->getQueue(present_family, 0, dld);
graphics_queue = logical.GetQueue(graphics_family);
present_queue = logical.GetQueue(present_family);
return true;
}
vk::Format VKDevice::GetSupportedFormat(vk::Format wanted_format,
vk::FormatFeatureFlags wanted_usage,
FormatType format_type) const {
VkFormat VKDevice::GetSupportedFormat(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const {
if (IsFormatSupported(wanted_format, wanted_usage, format_type)) {
return wanted_format;
}
// The wanted format is not supported by hardware, search for alternatives
const vk::Format* alternatives = GetFormatAlternatives(wanted_format);
const VkFormat* alternatives = GetFormatAlternatives(wanted_format);
if (alternatives == nullptr) {
UNREACHABLE_MSG("Format={} with usage={} and type={} has no defined alternatives and host "
"hardware does not support it",
vk::to_string(wanted_format), vk::to_string(wanted_usage),
static_cast<u32>(format_type));
wanted_format, wanted_usage, format_type);
return wanted_format;
}
std::size_t i = 0;
for (vk::Format alternative = alternatives[0]; alternative != vk::Format{};
alternative = alternatives[++i]) {
for (VkFormat alternative = *alternatives; alternative; alternative = alternatives[++i]) {
if (!IsFormatSupported(alternative, wanted_usage, format_type)) {
continue;
}
LOG_WARNING(Render_Vulkan,
"Emulating format={} with alternative format={} with usage={} and type={}",
static_cast<u32>(wanted_format), static_cast<u32>(alternative),
static_cast<u32>(wanted_usage), static_cast<u32>(format_type));
wanted_format, alternative, wanted_usage, format_type);
return alternative;
}
// No alternatives found, panic
UNREACHABLE_MSG("Format={} with usage={} and type={} is not supported by the host hardware and "
"doesn't support any of the alternatives",
static_cast<u32>(wanted_format), static_cast<u32>(wanted_usage),
static_cast<u32>(format_type));
wanted_format, wanted_usage, format_type);
return wanted_format;
}
@@ -228,38 +351,39 @@ void VKDevice::ReportLoss() const {
return;
}
[[maybe_unused]] const std::vector data = graphics_queue.getCheckpointDataNV(dld);
[[maybe_unused]] const std::vector data = graphics_queue.GetCheckpointDataNV(dld);
// Catch here in debug builds (or with optimizations disabled) the last graphics pipeline to be
// executed. It can be done on a debugger by evaluating the expression:
// *(VKGraphicsPipeline*)data[0]
}
bool VKDevice::IsOptimalAstcSupported(const vk::PhysicalDeviceFeatures& features) const {
bool VKDevice::IsOptimalAstcSupported(const VkPhysicalDeviceFeatures& features) const {
// Disable for now to avoid converting ASTC twice.
static constexpr std::array astc_formats = {
vk::Format::eAstc4x4UnormBlock, vk::Format::eAstc4x4SrgbBlock,
vk::Format::eAstc5x4UnormBlock, vk::Format::eAstc5x4SrgbBlock,
vk::Format::eAstc5x5UnormBlock, vk::Format::eAstc5x5SrgbBlock,
vk::Format::eAstc6x5UnormBlock, vk::Format::eAstc6x5SrgbBlock,
vk::Format::eAstc6x6UnormBlock, vk::Format::eAstc6x6SrgbBlock,
vk::Format::eAstc8x5UnormBlock, vk::Format::eAstc8x5SrgbBlock,
vk::Format::eAstc8x6UnormBlock, vk::Format::eAstc8x6SrgbBlock,
vk::Format::eAstc8x8UnormBlock, vk::Format::eAstc8x8SrgbBlock,
vk::Format::eAstc10x5UnormBlock, vk::Format::eAstc10x5SrgbBlock,
vk::Format::eAstc10x6UnormBlock, vk::Format::eAstc10x6SrgbBlock,
vk::Format::eAstc10x8UnormBlock, vk::Format::eAstc10x8SrgbBlock,
vk::Format::eAstc10x10UnormBlock, vk::Format::eAstc10x10SrgbBlock,
vk::Format::eAstc12x10UnormBlock, vk::Format::eAstc12x10SrgbBlock,
vk::Format::eAstc12x12UnormBlock, vk::Format::eAstc12x12SrgbBlock};
VK_FORMAT_ASTC_4x4_UNORM_BLOCK, VK_FORMAT_ASTC_4x4_SRGB_BLOCK,
VK_FORMAT_ASTC_5x4_UNORM_BLOCK, VK_FORMAT_ASTC_5x4_SRGB_BLOCK,
VK_FORMAT_ASTC_5x5_UNORM_BLOCK, VK_FORMAT_ASTC_5x5_SRGB_BLOCK,
VK_FORMAT_ASTC_6x5_UNORM_BLOCK, VK_FORMAT_ASTC_6x5_SRGB_BLOCK,
VK_FORMAT_ASTC_6x6_UNORM_BLOCK, VK_FORMAT_ASTC_6x6_SRGB_BLOCK,
VK_FORMAT_ASTC_8x5_UNORM_BLOCK, VK_FORMAT_ASTC_8x5_SRGB_BLOCK,
VK_FORMAT_ASTC_8x6_UNORM_BLOCK, VK_FORMAT_ASTC_8x6_SRGB_BLOCK,
VK_FORMAT_ASTC_8x8_UNORM_BLOCK, VK_FORMAT_ASTC_8x8_SRGB_BLOCK,
VK_FORMAT_ASTC_10x5_UNORM_BLOCK, VK_FORMAT_ASTC_10x5_SRGB_BLOCK,
VK_FORMAT_ASTC_10x6_UNORM_BLOCK, VK_FORMAT_ASTC_10x6_SRGB_BLOCK,
VK_FORMAT_ASTC_10x8_UNORM_BLOCK, VK_FORMAT_ASTC_10x8_SRGB_BLOCK,
VK_FORMAT_ASTC_10x10_UNORM_BLOCK, VK_FORMAT_ASTC_10x10_SRGB_BLOCK,
VK_FORMAT_ASTC_12x10_UNORM_BLOCK, VK_FORMAT_ASTC_12x10_SRGB_BLOCK,
VK_FORMAT_ASTC_12x12_UNORM_BLOCK, VK_FORMAT_ASTC_12x12_SRGB_BLOCK,
};
if (!features.textureCompressionASTC_LDR) {
return false;
}
const auto format_feature_usage{
vk::FormatFeatureFlagBits::eSampledImage | vk::FormatFeatureFlagBits::eBlitSrc |
vk::FormatFeatureFlagBits::eBlitDst | vk::FormatFeatureFlagBits::eTransferSrc |
vk::FormatFeatureFlagBits::eTransferDst};
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT | VK_FORMAT_FEATURE_BLIT_SRC_BIT |
VK_FORMAT_FEATURE_BLIT_DST_BIT | VK_FORMAT_FEATURE_TRANSFER_SRC_BIT |
VK_FORMAT_FEATURE_TRANSFER_DST_BIT};
for (const auto format : astc_formats) {
const auto format_properties{physical.getFormatProperties(format, dld)};
const auto format_properties{physical.GetFormatProperties(format)};
if (!(format_properties.optimalTilingFeatures & format_feature_usage)) {
return false;
}
@@ -267,61 +391,49 @@ bool VKDevice::IsOptimalAstcSupported(const vk::PhysicalDeviceFeatures& features
return true;
}
bool VKDevice::IsFormatSupported(vk::Format wanted_format, vk::FormatFeatureFlags wanted_usage,
bool VKDevice::IsFormatSupported(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const {
const auto it = format_properties.find(wanted_format);
if (it == format_properties.end()) {
UNIMPLEMENTED_MSG("Unimplemented format query={}", vk::to_string(wanted_format));
UNIMPLEMENTED_MSG("Unimplemented format query={}", wanted_format);
return true;
}
const auto supported_usage = GetFormatFeatures(it->second, format_type);
return (supported_usage & wanted_usage) == wanted_usage;
}
bool VKDevice::IsSuitable(vk::PhysicalDevice physical, vk::SurfaceKHR surface,
const vk::DispatchLoaderDynamic& dld) {
static constexpr std::array required_extensions = {
VK_KHR_SWAPCHAIN_EXTENSION_NAME,
VK_KHR_16BIT_STORAGE_EXTENSION_NAME,
VK_KHR_8BIT_STORAGE_EXTENSION_NAME,
VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME,
VK_EXT_VERTEX_ATTRIBUTE_DIVISOR_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_BALLOT_EXTENSION_NAME,
VK_EXT_SHADER_SUBGROUP_VOTE_EXTENSION_NAME,
VK_EXT_HOST_QUERY_RESET_EXTENSION_NAME,
};
bool VKDevice::IsSuitable(vk::PhysicalDevice physical, VkSurfaceKHR surface) {
bool is_suitable = true;
std::bitset<required_extensions.size()> available_extensions{};
std::bitset<REQUIRED_EXTENSIONS.size()> available_extensions;
for (const auto& prop : physical.enumerateDeviceExtensionProperties(nullptr, dld)) {
for (std::size_t i = 0; i < required_extensions.size(); ++i) {
for (const auto& prop : physical.EnumerateDeviceExtensionProperties()) {
for (std::size_t i = 0; i < REQUIRED_EXTENSIONS.size(); ++i) {
if (available_extensions[i]) {
continue;
}
available_extensions[i] =
required_extensions[i] == std::string_view{prop.extensionName};
const std::string_view name{prop.extensionName};
available_extensions[i] = name == REQUIRED_EXTENSIONS[i];
}
}
if (!available_extensions.all()) {
for (std::size_t i = 0; i < required_extensions.size(); ++i) {
for (std::size_t i = 0; i < REQUIRED_EXTENSIONS.size(); ++i) {
if (available_extensions[i]) {
continue;
}
LOG_ERROR(Render_Vulkan, "Missing required extension: {}", required_extensions[i]);
LOG_ERROR(Render_Vulkan, "Missing required extension: {}", REQUIRED_EXTENSIONS[i]);
is_suitable = false;
}
}
bool has_graphics{}, has_present{};
const auto queue_family_properties = physical.getQueueFamilyProperties(dld);
const std::vector queue_family_properties = physical.GetQueueFamilyProperties();
for (u32 i = 0; i < static_cast<u32>(queue_family_properties.size()); ++i) {
const auto& family = queue_family_properties[i];
if (family.queueCount == 0) {
continue;
}
has_graphics |=
(family.queueFlags & vk::QueueFlagBits::eGraphics) != static_cast<vk::QueueFlagBits>(0);
has_present |= physical.getSurfaceSupportKHR(i, surface, dld) != 0;
has_graphics |= family.queueFlags & VK_QUEUE_GRAPHICS_BIT;
has_present |= physical.GetSurfaceSupportKHR(i, surface);
}
if (!has_graphics || !has_present) {
LOG_ERROR(Render_Vulkan, "Device lacks a graphics and present queue");
@@ -329,7 +441,7 @@ bool VKDevice::IsSuitable(vk::PhysicalDevice physical, vk::SurfaceKHR surface,
}
// TODO(Rodrigo): Check if the device matches all requeriments.
const auto properties{physical.getProperties(dld)};
const auto properties{physical.GetProperties()};
const auto& limits{properties.limits};
constexpr u32 required_ubo_size = 65536;
@@ -346,7 +458,7 @@ bool VKDevice::IsSuitable(vk::PhysicalDevice physical, vk::SurfaceKHR surface,
is_suitable = false;
}
const auto features{physical.getFeatures(dld)};
const auto features{physical.GetFeatures()};
const std::array feature_report = {
std::make_pair(features.vertexPipelineStoresAndAtomics, "vertexPipelineStoresAndAtomics"),
std::make_pair(features.independentBlend, "independentBlend"),
@@ -380,7 +492,7 @@ bool VKDevice::IsSuitable(vk::PhysicalDevice physical, vk::SurfaceKHR surface,
std::vector<const char*> VKDevice::LoadExtensions() {
std::vector<const char*> extensions;
const auto Test = [&](const vk::ExtensionProperties& extension,
const auto Test = [&](const VkExtensionProperties& extension,
std::optional<std::reference_wrapper<bool>> status, const char* name,
bool push) {
if (extension.extensionName != std::string_view(name)) {
@@ -394,22 +506,13 @@ std::vector<const char*> VKDevice::LoadExtensions() {
}
};
extensions.reserve(15);
extensions.push_back(VK_KHR_SWAPCHAIN_EXTENSION_NAME);
extensions.push_back(VK_KHR_16BIT_STORAGE_EXTENSION_NAME);
extensions.push_back(VK_KHR_8BIT_STORAGE_EXTENSION_NAME);
extensions.push_back(VK_KHR_DRIVER_PROPERTIES_EXTENSION_NAME);
extensions.push_back(VK_EXT_VERTEX_ATTRIBUTE_DIVISOR_EXTENSION_NAME);
extensions.push_back(VK_EXT_SHADER_SUBGROUP_BALLOT_EXTENSION_NAME);
extensions.push_back(VK_EXT_SHADER_SUBGROUP_VOTE_EXTENSION_NAME);
extensions.push_back(VK_EXT_HOST_QUERY_RESET_EXTENSION_NAME);
extensions.reserve(7 + REQUIRED_EXTENSIONS.size());
extensions.insert(extensions.begin(), REQUIRED_EXTENSIONS.begin(), REQUIRED_EXTENSIONS.end());
[[maybe_unused]] const bool nsight =
std::getenv("NVTX_INJECTION64_PATH") || std::getenv("NSIGHT_LAUNCHED");
bool has_khr_shader_float16_int8{};
bool has_ext_subgroup_size_control{};
bool has_ext_transform_feedback{};
for (const auto& extension : physical.enumerateDeviceExtensionProperties(nullptr, dld)) {
for (const auto& extension : physical.EnumerateDeviceExtensionProperties()) {
Test(extension, khr_uniform_buffer_standard_layout,
VK_KHR_UNIFORM_BUFFER_STANDARD_LAYOUT_EXTENSION_NAME, true);
Test(extension, has_khr_shader_float16_int8, VK_KHR_SHADER_FLOAT16_INT8_EXTENSION_NAME,
@@ -429,38 +532,67 @@ std::vector<const char*> VKDevice::LoadExtensions() {
}
}
VkPhysicalDeviceFeatures2KHR features;
features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2_KHR;
VkPhysicalDeviceProperties2KHR properties;
properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR;
if (has_khr_shader_float16_int8) {
is_float16_supported =
GetFeatures<vk::PhysicalDeviceFloat16Int8FeaturesKHR>(physical, dld).shaderFloat16;
VkPhysicalDeviceFloat16Int8FeaturesKHR float16_int8_features;
float16_int8_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT16_INT8_FEATURES_KHR;
float16_int8_features.pNext = nullptr;
features.pNext = &float16_int8_features;
physical.GetFeatures2KHR(features);
is_float16_supported = float16_int8_features.shaderFloat16;
extensions.push_back(VK_KHR_SHADER_FLOAT16_INT8_EXTENSION_NAME);
}
if (has_ext_subgroup_size_control) {
const auto features =
GetFeatures<vk::PhysicalDeviceSubgroupSizeControlFeaturesEXT>(physical, dld);
const auto properties =
GetProperties<vk::PhysicalDeviceSubgroupSizeControlPropertiesEXT>(physical, dld);
VkPhysicalDeviceSubgroupSizeControlFeaturesEXT subgroup_features;
subgroup_features.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_FEATURES_EXT;
subgroup_features.pNext = nullptr;
features.pNext = &subgroup_features;
physical.GetFeatures2KHR(features);
is_warp_potentially_bigger = properties.maxSubgroupSize > GuestWarpSize;
VkPhysicalDeviceSubgroupSizeControlPropertiesEXT subgroup_properties;
subgroup_properties.sType =
VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_SIZE_CONTROL_PROPERTIES_EXT;
subgroup_properties.pNext = nullptr;
properties.pNext = &subgroup_properties;
physical.GetProperties2KHR(properties);
if (features.subgroupSizeControl && properties.minSubgroupSize <= GuestWarpSize &&
properties.maxSubgroupSize >= GuestWarpSize) {
is_warp_potentially_bigger = subgroup_properties.maxSubgroupSize > GuestWarpSize;
if (subgroup_features.subgroupSizeControl &&
subgroup_properties.minSubgroupSize <= GuestWarpSize &&
subgroup_properties.maxSubgroupSize >= GuestWarpSize) {
extensions.push_back(VK_EXT_SUBGROUP_SIZE_CONTROL_EXTENSION_NAME);
guest_warp_stages = properties.requiredSubgroupSizeStages;
guest_warp_stages = subgroup_properties.requiredSubgroupSizeStages;
}
} else {
is_warp_potentially_bigger = true;
}
if (has_ext_transform_feedback) {
const auto features =
GetFeatures<vk::PhysicalDeviceTransformFeedbackFeaturesEXT>(physical, dld);
const auto properties =
GetProperties<vk::PhysicalDeviceTransformFeedbackPropertiesEXT>(physical, dld);
VkPhysicalDeviceTransformFeedbackFeaturesEXT tfb_features;
tfb_features.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_FEATURES_EXT;
tfb_features.pNext = nullptr;
features.pNext = &tfb_features;
physical.GetFeatures2KHR(features);
if (features.transformFeedback && features.geometryStreams &&
properties.maxTransformFeedbackStreams >= 4 && properties.maxTransformFeedbackBuffers &&
properties.transformFeedbackQueries && properties.transformFeedbackDraw) {
VkPhysicalDeviceTransformFeedbackPropertiesEXT tfb_properties;
tfb_properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_TRANSFORM_FEEDBACK_PROPERTIES_EXT;
tfb_properties.pNext = nullptr;
properties.pNext = &tfb_properties;
physical.GetProperties2KHR(properties);
if (tfb_features.transformFeedback && tfb_features.geometryStreams &&
tfb_properties.maxTransformFeedbackStreams >= 4 &&
tfb_properties.maxTransformFeedbackBuffers && tfb_properties.transformFeedbackQueries &&
tfb_properties.transformFeedbackDraw) {
extensions.push_back(VK_EXT_TRANSFORM_FEEDBACK_EXTENSION_NAME);
ext_transform_feedback = true;
}
@@ -469,10 +601,10 @@ std::vector<const char*> VKDevice::LoadExtensions() {
return extensions;
}
void VKDevice::SetupFamilies(vk::SurfaceKHR surface) {
void VKDevice::SetupFamilies(VkSurfaceKHR surface) {
std::optional<u32> graphics_family_, present_family_;
const auto queue_family_properties = physical.getQueueFamilyProperties(dld);
const std::vector queue_family_properties = physical.GetQueueFamilyProperties();
for (u32 i = 0; i < static_cast<u32>(queue_family_properties.size()); ++i) {
if (graphics_family_ && present_family_)
break;
@@ -481,10 +613,10 @@ void VKDevice::SetupFamilies(vk::SurfaceKHR surface) {
if (queue_family.queueCount == 0)
continue;
if (queue_family.queueFlags & vk::QueueFlagBits::eGraphics) {
if (queue_family.queueFlags & VK_QUEUE_GRAPHICS_BIT) {
graphics_family_ = i;
}
if (physical.getSurfaceSupportKHR(i, surface, dld)) {
if (physical.GetSurfaceSupportKHR(i, surface)) {
present_family_ = i;
}
}
@@ -495,120 +627,48 @@ void VKDevice::SetupFamilies(vk::SurfaceKHR surface) {
}
void VKDevice::SetupFeatures() {
const auto supported_features{physical.getFeatures(dld)};
const auto supported_features{physical.GetFeatures()};
is_formatless_image_load_supported = supported_features.shaderStorageImageReadWithoutFormat;
is_optimal_astc_supported = IsOptimalAstcSupported(supported_features);
}
void VKDevice::CollectTelemetryParameters() {
const auto driver = GetProperties<vk::PhysicalDeviceDriverPropertiesKHR>(physical, dld);
VkPhysicalDeviceDriverPropertiesKHR driver;
driver.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_DRIVER_PROPERTIES_KHR;
driver.pNext = nullptr;
VkPhysicalDeviceProperties2KHR properties;
properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2_KHR;
properties.pNext = &driver;
physical.GetProperties2KHR(properties);
driver_id = driver.driverID;
vendor_name = driver.driverName;
const auto extensions = physical.enumerateDeviceExtensionProperties(nullptr, dld);
const std::vector extensions = physical.EnumerateDeviceExtensionProperties();
reported_extensions.reserve(std::size(extensions));
for (const auto& extension : extensions) {
reported_extensions.push_back(extension.extensionName);
}
}
std::vector<vk::DeviceQueueCreateInfo> VKDevice::GetDeviceQueueCreateInfos() const {
static const float QUEUE_PRIORITY = 1.0f;
std::vector<VkDeviceQueueCreateInfo> VKDevice::GetDeviceQueueCreateInfos() const {
static constexpr float QUEUE_PRIORITY = 1.0f;
std::set<u32> unique_queue_families = {graphics_family, present_family};
std::vector<vk::DeviceQueueCreateInfo> queue_cis;
std::unordered_set<u32> unique_queue_families = {graphics_family, present_family};
std::vector<VkDeviceQueueCreateInfo> queue_cis;
for (u32 queue_family : unique_queue_families)
queue_cis.push_back({{}, queue_family, 1, &QUEUE_PRIORITY});
for (const u32 queue_family : unique_queue_families) {
VkDeviceQueueCreateInfo& ci = queue_cis.emplace_back();
ci.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.queueFamilyIndex = queue_family;
ci.queueCount = 1;
ci.pQueuePriorities = &QUEUE_PRIORITY;
}
return queue_cis;
}
std::unordered_map<vk::Format, vk::FormatProperties> VKDevice::GetFormatProperties(
const vk::DispatchLoaderDynamic& dld, vk::PhysicalDevice physical) {
static constexpr std::array formats{vk::Format::eA8B8G8R8UnormPack32,
vk::Format::eA8B8G8R8UintPack32,
vk::Format::eA8B8G8R8SnormPack32,
vk::Format::eA8B8G8R8SrgbPack32,
vk::Format::eB5G6R5UnormPack16,
vk::Format::eA2B10G10R10UnormPack32,
vk::Format::eA1R5G5B5UnormPack16,
vk::Format::eR32G32B32A32Sfloat,
vk::Format::eR32G32B32A32Uint,
vk::Format::eR32G32Sfloat,
vk::Format::eR32G32Uint,
vk::Format::eR16G16B16A16Uint,
vk::Format::eR16G16B16A16Snorm,
vk::Format::eR16G16B16A16Unorm,
vk::Format::eR16G16Unorm,
vk::Format::eR16G16Snorm,
vk::Format::eR16G16Sfloat,
vk::Format::eR16Unorm,
vk::Format::eR8G8B8A8Srgb,
vk::Format::eR8G8Unorm,
vk::Format::eR8G8Snorm,
vk::Format::eR8Unorm,
vk::Format::eR8Uint,
vk::Format::eB10G11R11UfloatPack32,
vk::Format::eR32Sfloat,
vk::Format::eR32Uint,
vk::Format::eR32Sint,
vk::Format::eR16Sfloat,
vk::Format::eR16G16B16A16Sfloat,
vk::Format::eB8G8R8A8Unorm,
vk::Format::eR4G4B4A4UnormPack16,
vk::Format::eD32Sfloat,
vk::Format::eD16Unorm,
vk::Format::eD16UnormS8Uint,
vk::Format::eD24UnormS8Uint,
vk::Format::eD32SfloatS8Uint,
vk::Format::eBc1RgbaUnormBlock,
vk::Format::eBc2UnormBlock,
vk::Format::eBc3UnormBlock,
vk::Format::eBc4UnormBlock,
vk::Format::eBc5UnormBlock,
vk::Format::eBc5SnormBlock,
vk::Format::eBc7UnormBlock,
vk::Format::eBc6HUfloatBlock,
vk::Format::eBc6HSfloatBlock,
vk::Format::eBc1RgbaSrgbBlock,
vk::Format::eBc2SrgbBlock,
vk::Format::eBc3SrgbBlock,
vk::Format::eBc7SrgbBlock,
vk::Format::eAstc4x4UnormBlock,
vk::Format::eAstc4x4SrgbBlock,
vk::Format::eAstc5x4UnormBlock,
vk::Format::eAstc5x4SrgbBlock,
vk::Format::eAstc5x5UnormBlock,
vk::Format::eAstc5x5SrgbBlock,
vk::Format::eAstc6x5UnormBlock,
vk::Format::eAstc6x5SrgbBlock,
vk::Format::eAstc6x6UnormBlock,
vk::Format::eAstc6x6SrgbBlock,
vk::Format::eAstc8x5UnormBlock,
vk::Format::eAstc8x5SrgbBlock,
vk::Format::eAstc8x6UnormBlock,
vk::Format::eAstc8x6SrgbBlock,
vk::Format::eAstc8x8UnormBlock,
vk::Format::eAstc8x8SrgbBlock,
vk::Format::eAstc10x5UnormBlock,
vk::Format::eAstc10x5SrgbBlock,
vk::Format::eAstc10x6UnormBlock,
vk::Format::eAstc10x6SrgbBlock,
vk::Format::eAstc10x8UnormBlock,
vk::Format::eAstc10x8SrgbBlock,
vk::Format::eAstc10x10UnormBlock,
vk::Format::eAstc10x10SrgbBlock,
vk::Format::eAstc12x10UnormBlock,
vk::Format::eAstc12x10SrgbBlock,
vk::Format::eAstc12x12UnormBlock,
vk::Format::eAstc12x12SrgbBlock,
vk::Format::eE5B9G9R9UfloatPack32};
std::unordered_map<vk::Format, vk::FormatProperties> format_properties;
for (const auto format : formats) {
format_properties.emplace(format, physical.getFormatProperties(format, dld));
}
return format_properties;
}
} // namespace Vulkan

View File

@@ -8,8 +8,9 @@
#include <string_view>
#include <unordered_map>
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -22,12 +23,12 @@ const u32 GuestWarpSize = 32;
/// Handles data specific to a physical device.
class VKDevice final {
public:
explicit VKDevice(const vk::DispatchLoaderDynamic& dld, vk::PhysicalDevice physical,
vk::SurfaceKHR surface);
explicit VKDevice(VkInstance instance, vk::PhysicalDevice physical, VkSurfaceKHR surface,
const vk::InstanceDispatch& dld);
~VKDevice();
/// Initializes the device. Returns true on success.
bool Create(vk::Instance instance);
bool Create();
/**
* Returns a format supported by the device for the passed requeriments.
@@ -36,20 +37,20 @@ public:
* @param format_type Format type usage.
* @returns A format supported by the device.
*/
vk::Format GetSupportedFormat(vk::Format wanted_format, vk::FormatFeatureFlags wanted_usage,
FormatType format_type) const;
VkFormat GetSupportedFormat(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const;
/// Reports a device loss.
void ReportLoss() const;
/// Returns the dispatch loader with direct function pointers of the device.
const vk::DispatchLoaderDynamic& GetDispatchLoader() const {
const vk::DeviceDispatch& GetDispatchLoader() const {
return dld;
}
/// Returns the logical device.
vk::Device GetLogical() const {
return logical.get();
const vk::Device& GetLogical() const {
return logical;
}
/// Returns the physical device.
@@ -79,7 +80,7 @@ public:
/// Returns true if the device is integrated with the host CPU.
bool IsIntegrated() const {
return properties.deviceType == vk::PhysicalDeviceType::eIntegratedGpu;
return properties.deviceType == VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU;
}
/// Returns the current Vulkan API version provided in Vulkan-formatted version numbers.
@@ -98,27 +99,27 @@ public:
}
/// Returns the driver ID.
vk::DriverIdKHR GetDriverID() const {
VkDriverIdKHR GetDriverID() const {
return driver_id;
}
/// Returns uniform buffer alignment requeriment.
vk::DeviceSize GetUniformBufferAlignment() const {
VkDeviceSize GetUniformBufferAlignment() const {
return properties.limits.minUniformBufferOffsetAlignment;
}
/// Returns storage alignment requeriment.
vk::DeviceSize GetStorageBufferAlignment() const {
VkDeviceSize GetStorageBufferAlignment() const {
return properties.limits.minStorageBufferOffsetAlignment;
}
/// Returns the maximum range for storage buffers.
vk::DeviceSize GetMaxStorageBufferRange() const {
VkDeviceSize GetMaxStorageBufferRange() const {
return properties.limits.maxStorageBufferRange;
}
/// Returns the maximum size for push constants.
vk::DeviceSize GetMaxPushConstantsSize() const {
VkDeviceSize GetMaxPushConstantsSize() const {
return properties.limits.maxPushConstantsSize;
}
@@ -138,8 +139,8 @@ public:
}
/// Returns true if the device can be forced to use the guest warp size.
bool IsGuestWarpSizeSupported(vk::ShaderStageFlagBits stage) const {
return (guest_warp_stages & stage) != vk::ShaderStageFlags{};
bool IsGuestWarpSizeSupported(VkShaderStageFlagBits stage) const {
return guest_warp_stages & stage;
}
/// Returns true if formatless image load is supported.
@@ -188,15 +189,14 @@ public:
}
/// Checks if the physical device is suitable.
static bool IsSuitable(vk::PhysicalDevice physical, vk::SurfaceKHR surface,
const vk::DispatchLoaderDynamic& dld);
static bool IsSuitable(vk::PhysicalDevice physical, VkSurfaceKHR surface);
private:
/// Loads extensions into a vector and stores available ones in this object.
std::vector<const char*> LoadExtensions();
/// Sets up queue families.
void SetupFamilies(vk::SurfaceKHR surface);
void SetupFamilies(VkSurfaceKHR surface);
/// Sets up device features.
void SetupFeatures();
@@ -205,32 +205,28 @@ private:
void CollectTelemetryParameters();
/// Returns a list of queue initialization descriptors.
std::vector<vk::DeviceQueueCreateInfo> GetDeviceQueueCreateInfos() const;
std::vector<VkDeviceQueueCreateInfo> GetDeviceQueueCreateInfos() const;
/// Returns true if ASTC textures are natively supported.
bool IsOptimalAstcSupported(const vk::PhysicalDeviceFeatures& features) const;
bool IsOptimalAstcSupported(const VkPhysicalDeviceFeatures& features) const;
/// Returns true if a format is supported.
bool IsFormatSupported(vk::Format wanted_format, vk::FormatFeatureFlags wanted_usage,
bool IsFormatSupported(VkFormat wanted_format, VkFormatFeatureFlags wanted_usage,
FormatType format_type) const;
/// Returns the device properties for Vulkan formats.
static std::unordered_map<vk::Format, vk::FormatProperties> GetFormatProperties(
const vk::DispatchLoaderDynamic& dld, vk::PhysicalDevice physical);
vk::DispatchLoaderDynamic dld; ///< Device function pointers.
vk::PhysicalDevice physical; ///< Physical device.
vk::PhysicalDeviceProperties properties; ///< Device properties.
UniqueDevice logical; ///< Logical device.
vk::Queue graphics_queue; ///< Main graphics queue.
vk::Queue present_queue; ///< Main present queue.
u32 graphics_family{}; ///< Main graphics queue family index.
u32 present_family{}; ///< Main present queue family index.
vk::DriverIdKHR driver_id{}; ///< Driver ID.
vk::ShaderStageFlags guest_warp_stages{}; ///< Stages where the guest warp size can be forced.ed
bool is_optimal_astc_supported{}; ///< Support for native ASTC.
bool is_float16_supported{}; ///< Support for float16 arithmetics.
bool is_warp_potentially_bigger{}; ///< Host warp size can be bigger than guest.
vk::DeviceDispatch dld; ///< Device function pointers.
vk::PhysicalDevice physical; ///< Physical device.
VkPhysicalDeviceProperties properties; ///< Device properties.
vk::Device logical; ///< Logical device.
vk::Queue graphics_queue; ///< Main graphics queue.
vk::Queue present_queue; ///< Main present queue.
u32 graphics_family{}; ///< Main graphics queue family index.
u32 present_family{}; ///< Main present queue family index.
VkDriverIdKHR driver_id{}; ///< Driver ID.
VkShaderStageFlags guest_warp_stages{}; ///< Stages where the guest warp size can be forced.ed
bool is_optimal_astc_supported{}; ///< Support for native ASTC.
bool is_float16_supported{}; ///< Support for float16 arithmetics.
bool is_warp_potentially_bigger{}; ///< Host warp size can be bigger than guest.
bool is_formatless_image_load_supported{}; ///< Support for shader image read without format.
bool khr_uniform_buffer_standard_layout{}; ///< Support for std430 on UBOs.
bool ext_index_type_uint8{}; ///< Support for VK_EXT_index_type_uint8.
@@ -244,7 +240,7 @@ private:
std::vector<std::string> reported_extensions; ///< Reported Vulkan extensions.
/// Format properties dictionary.
std::unordered_map<vk::Format, vk::FormatProperties> format_properties;
std::unordered_map<VkFormat, VkFormatProperties> format_properties;
};
} // namespace Vulkan

View File

@@ -2,11 +2,13 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <array>
#include <cstring>
#include <vector>
#include "common/assert.h"
#include "common/common_types.h"
#include "common/microprofile.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/fixed_pipeline_state.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
@@ -16,6 +18,7 @@
#include "video_core/renderer_vulkan/vk_renderpass_cache.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -23,21 +26,26 @@ MICROPROFILE_DECLARE(Vulkan_PipelineCache);
namespace {
vk::StencilOpState GetStencilFaceState(const FixedPipelineState::StencilFace& face) {
return vk::StencilOpState(MaxwellToVK::StencilOp(face.action_stencil_fail),
MaxwellToVK::StencilOp(face.action_depth_pass),
MaxwellToVK::StencilOp(face.action_depth_fail),
MaxwellToVK::ComparisonOp(face.test_func), 0, 0, 0);
VkStencilOpState GetStencilFaceState(const FixedPipelineState::StencilFace& face) {
VkStencilOpState state;
state.failOp = MaxwellToVK::StencilOp(face.action_stencil_fail);
state.passOp = MaxwellToVK::StencilOp(face.action_depth_pass);
state.depthFailOp = MaxwellToVK::StencilOp(face.action_depth_fail);
state.compareOp = MaxwellToVK::ComparisonOp(face.test_func);
state.compareMask = 0;
state.writeMask = 0;
state.reference = 0;
return state;
}
bool SupportsPrimitiveRestart(vk::PrimitiveTopology topology) {
bool SupportsPrimitiveRestart(VkPrimitiveTopology topology) {
static constexpr std::array unsupported_topologies = {
vk::PrimitiveTopology::ePointList,
vk::PrimitiveTopology::eLineList,
vk::PrimitiveTopology::eTriangleList,
vk::PrimitiveTopology::eLineListWithAdjacency,
vk::PrimitiveTopology::eTriangleListWithAdjacency,
vk::PrimitiveTopology::ePatchList};
VK_PRIMITIVE_TOPOLOGY_POINT_LIST,
VK_PRIMITIVE_TOPOLOGY_LINE_LIST,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST,
VK_PRIMITIVE_TOPOLOGY_LINE_LIST_WITH_ADJACENCY,
VK_PRIMITIVE_TOPOLOGY_TRIANGLE_LIST_WITH_ADJACENCY,
VK_PRIMITIVE_TOPOLOGY_PATCH_LIST};
return std::find(std::begin(unsupported_topologies), std::end(unsupported_topologies),
topology) == std::end(unsupported_topologies);
}
@@ -49,7 +57,7 @@ VKGraphicsPipeline::VKGraphicsPipeline(const VKDevice& device, VKScheduler& sche
VKUpdateDescriptorQueue& update_descriptor_queue,
VKRenderPassCache& renderpass_cache,
const GraphicsPipelineCacheKey& key,
const std::vector<vk::DescriptorSetLayoutBinding>& bindings,
vk::Span<VkDescriptorSetLayoutBinding> bindings,
const SPIRVProgram& program)
: device{device}, scheduler{scheduler}, fixed_state{key.fixed_state}, hash{key.Hash()},
descriptor_set_layout{CreateDescriptorSetLayout(bindings)},
@@ -63,7 +71,7 @@ VKGraphicsPipeline::VKGraphicsPipeline(const VKDevice& device, VKScheduler& sche
VKGraphicsPipeline::~VKGraphicsPipeline() = default;
vk::DescriptorSet VKGraphicsPipeline::CommitDescriptorSet() {
VkDescriptorSet VKGraphicsPipeline::CommitDescriptorSet() {
if (!descriptor_template) {
return {};
}
@@ -72,27 +80,32 @@ vk::DescriptorSet VKGraphicsPipeline::CommitDescriptorSet() {
return set;
}
UniqueDescriptorSetLayout VKGraphicsPipeline::CreateDescriptorSetLayout(
const std::vector<vk::DescriptorSetLayoutBinding>& bindings) const {
const vk::DescriptorSetLayoutCreateInfo descriptor_set_layout_ci(
{}, static_cast<u32>(bindings.size()), bindings.data());
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createDescriptorSetLayoutUnique(descriptor_set_layout_ci, nullptr, dld);
vk::DescriptorSetLayout VKGraphicsPipeline::CreateDescriptorSetLayout(
vk::Span<VkDescriptorSetLayoutBinding> bindings) const {
VkDescriptorSetLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.bindingCount = bindings.size();
ci.pBindings = bindings.data();
return device.GetLogical().CreateDescriptorSetLayout(ci);
}
UniquePipelineLayout VKGraphicsPipeline::CreatePipelineLayout() const {
const vk::PipelineLayoutCreateInfo pipeline_layout_ci({}, 1, &*descriptor_set_layout, 0,
nullptr);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createPipelineLayoutUnique(pipeline_layout_ci, nullptr, dld);
vk::PipelineLayout VKGraphicsPipeline::CreatePipelineLayout() const {
VkPipelineLayoutCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.setLayoutCount = 1;
ci.pSetLayouts = descriptor_set_layout.address();
ci.pushConstantRangeCount = 0;
ci.pPushConstantRanges = nullptr;
return device.GetLogical().CreatePipelineLayout(ci);
}
UniqueDescriptorUpdateTemplate VKGraphicsPipeline::CreateDescriptorUpdateTemplate(
vk::DescriptorUpdateTemplateKHR VKGraphicsPipeline::CreateDescriptorUpdateTemplate(
const SPIRVProgram& program) const {
std::vector<vk::DescriptorUpdateTemplateEntry> template_entries;
std::vector<VkDescriptorUpdateTemplateEntry> template_entries;
u32 binding = 0;
u32 offset = 0;
for (const auto& stage : program) {
@@ -102,38 +115,47 @@ UniqueDescriptorUpdateTemplate VKGraphicsPipeline::CreateDescriptorUpdateTemplat
}
if (template_entries.empty()) {
// If the shader doesn't use descriptor sets, skip template creation.
return UniqueDescriptorUpdateTemplate{};
return {};
}
const vk::DescriptorUpdateTemplateCreateInfo template_ci(
{}, static_cast<u32>(template_entries.size()), template_entries.data(),
vk::DescriptorUpdateTemplateType::eDescriptorSet, *descriptor_set_layout,
vk::PipelineBindPoint::eGraphics, *layout, DESCRIPTOR_SET);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createDescriptorUpdateTemplateUnique(template_ci, nullptr, dld);
VkDescriptorUpdateTemplateCreateInfoKHR ci;
ci.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_UPDATE_TEMPLATE_CREATE_INFO_KHR;
ci.pNext = nullptr;
ci.flags = 0;
ci.descriptorUpdateEntryCount = static_cast<u32>(template_entries.size());
ci.pDescriptorUpdateEntries = template_entries.data();
ci.templateType = VK_DESCRIPTOR_UPDATE_TEMPLATE_TYPE_DESCRIPTOR_SET_KHR;
ci.descriptorSetLayout = *descriptor_set_layout;
ci.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
ci.pipelineLayout = *layout;
ci.set = DESCRIPTOR_SET;
return device.GetLogical().CreateDescriptorUpdateTemplateKHR(ci);
}
std::vector<UniqueShaderModule> VKGraphicsPipeline::CreateShaderModules(
std::vector<vk::ShaderModule> VKGraphicsPipeline::CreateShaderModules(
const SPIRVProgram& program) const {
std::vector<UniqueShaderModule> modules;
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
VkShaderModuleCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
std::vector<vk::ShaderModule> modules;
modules.reserve(Maxwell::MaxShaderStage);
for (std::size_t i = 0; i < Maxwell::MaxShaderStage; ++i) {
const auto& stage = program[i];
if (!stage) {
continue;
}
const vk::ShaderModuleCreateInfo module_ci({}, stage->code.size() * sizeof(u32),
stage->code.data());
modules.emplace_back(dev.createShaderModuleUnique(module_ci, nullptr, dld));
ci.codeSize = stage->code.size() * sizeof(u32);
ci.pCode = stage->code.data();
modules.push_back(device.GetLogical().CreateShaderModule(ci));
}
return modules;
}
UniquePipeline VKGraphicsPipeline::CreatePipeline(const RenderPassParams& renderpass_params,
const SPIRVProgram& program) const {
vk::Pipeline VKGraphicsPipeline::CreatePipeline(const RenderPassParams& renderpass_params,
const SPIRVProgram& program) const {
const auto& vi = fixed_state.vertex_input;
const auto& ia = fixed_state.input_assembly;
const auto& ds = fixed_state.depth_stencil;
@@ -141,19 +163,26 @@ UniquePipeline VKGraphicsPipeline::CreatePipeline(const RenderPassParams& render
const auto& ts = fixed_state.tessellation;
const auto& rs = fixed_state.rasterizer;
std::vector<vk::VertexInputBindingDescription> vertex_bindings;
std::vector<vk::VertexInputBindingDivisorDescriptionEXT> vertex_binding_divisors;
std::vector<VkVertexInputBindingDescription> vertex_bindings;
std::vector<VkVertexInputBindingDivisorDescriptionEXT> vertex_binding_divisors;
for (std::size_t i = 0; i < vi.num_bindings; ++i) {
const auto& binding = vi.bindings[i];
const bool instanced = binding.divisor != 0;
const auto rate = instanced ? vk::VertexInputRate::eInstance : vk::VertexInputRate::eVertex;
vertex_bindings.emplace_back(binding.index, binding.stride, rate);
const auto rate = instanced ? VK_VERTEX_INPUT_RATE_INSTANCE : VK_VERTEX_INPUT_RATE_VERTEX;
auto& vertex_binding = vertex_bindings.emplace_back();
vertex_binding.binding = binding.index;
vertex_binding.stride = binding.stride;
vertex_binding.inputRate = rate;
if (instanced) {
vertex_binding_divisors.emplace_back(binding.index, binding.divisor);
auto& binding_divisor = vertex_binding_divisors.emplace_back();
binding_divisor.binding = binding.index;
binding_divisor.divisor = binding.divisor;
}
}
std::vector<vk::VertexInputAttributeDescription> vertex_attributes;
std::vector<VkVertexInputAttributeDescription> vertex_attributes;
const auto& input_attributes = program[0]->entries.attributes;
for (std::size_t i = 0; i < vi.num_attributes; ++i) {
const auto& attribute = vi.attributes[i];
@@ -161,109 +190,194 @@ UniquePipeline VKGraphicsPipeline::CreatePipeline(const RenderPassParams& render
// Skip attributes not used by the vertex shaders.
continue;
}
vertex_attributes.emplace_back(attribute.index, attribute.buffer,
MaxwellToVK::VertexFormat(attribute.type, attribute.size),
attribute.offset);
auto& vertex_attribute = vertex_attributes.emplace_back();
vertex_attribute.location = attribute.index;
vertex_attribute.binding = attribute.buffer;
vertex_attribute.format = MaxwellToVK::VertexFormat(attribute.type, attribute.size);
vertex_attribute.offset = attribute.offset;
}
vk::PipelineVertexInputStateCreateInfo vertex_input_ci(
{}, static_cast<u32>(vertex_bindings.size()), vertex_bindings.data(),
static_cast<u32>(vertex_attributes.size()), vertex_attributes.data());
VkPipelineVertexInputStateCreateInfo vertex_input_ci;
vertex_input_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO;
vertex_input_ci.pNext = nullptr;
vertex_input_ci.flags = 0;
vertex_input_ci.vertexBindingDescriptionCount = static_cast<u32>(vertex_bindings.size());
vertex_input_ci.pVertexBindingDescriptions = vertex_bindings.data();
vertex_input_ci.vertexAttributeDescriptionCount = static_cast<u32>(vertex_attributes.size());
vertex_input_ci.pVertexAttributeDescriptions = vertex_attributes.data();
const vk::PipelineVertexInputDivisorStateCreateInfoEXT vertex_input_divisor_ci(
static_cast<u32>(vertex_binding_divisors.size()), vertex_binding_divisors.data());
VkPipelineVertexInputDivisorStateCreateInfoEXT input_divisor_ci;
input_divisor_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_DIVISOR_STATE_CREATE_INFO_EXT;
input_divisor_ci.pNext = nullptr;
input_divisor_ci.vertexBindingDivisorCount = static_cast<u32>(vertex_binding_divisors.size());
input_divisor_ci.pVertexBindingDivisors = vertex_binding_divisors.data();
if (!vertex_binding_divisors.empty()) {
vertex_input_ci.pNext = &vertex_input_divisor_ci;
vertex_input_ci.pNext = &input_divisor_ci;
}
const auto primitive_topology = MaxwellToVK::PrimitiveTopology(device, ia.topology);
const vk::PipelineInputAssemblyStateCreateInfo input_assembly_ci(
{}, primitive_topology,
ia.primitive_restart_enable && SupportsPrimitiveRestart(primitive_topology));
VkPipelineInputAssemblyStateCreateInfo input_assembly_ci;
input_assembly_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO;
input_assembly_ci.pNext = nullptr;
input_assembly_ci.flags = 0;
input_assembly_ci.topology = MaxwellToVK::PrimitiveTopology(device, ia.topology);
input_assembly_ci.primitiveRestartEnable =
ia.primitive_restart_enable && SupportsPrimitiveRestart(input_assembly_ci.topology);
const vk::PipelineTessellationStateCreateInfo tessellation_ci({}, ts.patch_control_points);
VkPipelineTessellationStateCreateInfo tessellation_ci;
tessellation_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO;
tessellation_ci.pNext = nullptr;
tessellation_ci.flags = 0;
tessellation_ci.patchControlPoints = ts.patch_control_points;
const vk::PipelineViewportStateCreateInfo viewport_ci({}, Maxwell::NumViewports, nullptr,
Maxwell::NumViewports, nullptr);
VkPipelineViewportStateCreateInfo viewport_ci;
viewport_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO;
viewport_ci.pNext = nullptr;
viewport_ci.flags = 0;
viewport_ci.viewportCount = Maxwell::NumViewports;
viewport_ci.pViewports = nullptr;
viewport_ci.scissorCount = Maxwell::NumViewports;
viewport_ci.pScissors = nullptr;
// TODO(Rodrigo): Find out what's the default register value for front face
const vk::PipelineRasterizationStateCreateInfo rasterizer_ci(
{}, rs.depth_clamp_enable, false, vk::PolygonMode::eFill,
rs.cull_enable ? MaxwellToVK::CullFace(rs.cull_face) : vk::CullModeFlagBits::eNone,
MaxwellToVK::FrontFace(rs.front_face), rs.depth_bias_enable, 0.0f, 0.0f, 0.0f, 1.0f);
VkPipelineRasterizationStateCreateInfo rasterization_ci;
rasterization_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;
rasterization_ci.pNext = nullptr;
rasterization_ci.flags = 0;
rasterization_ci.depthClampEnable = rs.depth_clamp_enable;
rasterization_ci.rasterizerDiscardEnable = VK_FALSE;
rasterization_ci.polygonMode = VK_POLYGON_MODE_FILL;
rasterization_ci.cullMode =
rs.cull_enable ? MaxwellToVK::CullFace(rs.cull_face) : VK_CULL_MODE_NONE;
rasterization_ci.frontFace = MaxwellToVK::FrontFace(rs.front_face);
rasterization_ci.depthBiasEnable = rs.depth_bias_enable;
rasterization_ci.depthBiasConstantFactor = 0.0f;
rasterization_ci.depthBiasClamp = 0.0f;
rasterization_ci.depthBiasSlopeFactor = 0.0f;
rasterization_ci.lineWidth = 1.0f;
const vk::PipelineMultisampleStateCreateInfo multisampling_ci(
{}, vk::SampleCountFlagBits::e1, false, 0.0f, nullptr, false, false);
VkPipelineMultisampleStateCreateInfo multisample_ci;
multisample_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO;
multisample_ci.pNext = nullptr;
multisample_ci.flags = 0;
multisample_ci.rasterizationSamples = VK_SAMPLE_COUNT_1_BIT;
multisample_ci.sampleShadingEnable = VK_FALSE;
multisample_ci.minSampleShading = 0.0f;
multisample_ci.pSampleMask = nullptr;
multisample_ci.alphaToCoverageEnable = VK_FALSE;
multisample_ci.alphaToOneEnable = VK_FALSE;
const vk::CompareOp depth_test_compare = ds.depth_test_enable
? MaxwellToVK::ComparisonOp(ds.depth_test_function)
: vk::CompareOp::eAlways;
VkPipelineDepthStencilStateCreateInfo depth_stencil_ci;
depth_stencil_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO;
depth_stencil_ci.pNext = nullptr;
depth_stencil_ci.flags = 0;
depth_stencil_ci.depthTestEnable = ds.depth_test_enable;
depth_stencil_ci.depthWriteEnable = ds.depth_write_enable;
depth_stencil_ci.depthCompareOp = ds.depth_test_enable
? MaxwellToVK::ComparisonOp(ds.depth_test_function)
: VK_COMPARE_OP_ALWAYS;
depth_stencil_ci.depthBoundsTestEnable = ds.depth_bounds_enable;
depth_stencil_ci.stencilTestEnable = ds.stencil_enable;
depth_stencil_ci.front = GetStencilFaceState(ds.front_stencil);
depth_stencil_ci.back = GetStencilFaceState(ds.back_stencil);
depth_stencil_ci.minDepthBounds = 0.0f;
depth_stencil_ci.maxDepthBounds = 0.0f;
const vk::PipelineDepthStencilStateCreateInfo depth_stencil_ci(
{}, ds.depth_test_enable, ds.depth_write_enable, depth_test_compare, ds.depth_bounds_enable,
ds.stencil_enable, GetStencilFaceState(ds.front_stencil),
GetStencilFaceState(ds.back_stencil), 0.0f, 0.0f);
std::array<vk::PipelineColorBlendAttachmentState, Maxwell::NumRenderTargets> cb_attachments;
std::array<VkPipelineColorBlendAttachmentState, Maxwell::NumRenderTargets> cb_attachments;
const std::size_t num_attachments =
std::min(cd.attachments_count, renderpass_params.color_attachments.size());
for (std::size_t i = 0; i < num_attachments; ++i) {
constexpr std::array component_table{
vk::ColorComponentFlagBits::eR, vk::ColorComponentFlagBits::eG,
vk::ColorComponentFlagBits::eB, vk::ColorComponentFlagBits::eA};
static constexpr std::array component_table = {
VK_COLOR_COMPONENT_R_BIT, VK_COLOR_COMPONENT_G_BIT, VK_COLOR_COMPONENT_B_BIT,
VK_COLOR_COMPONENT_A_BIT};
const auto& blend = cd.attachments[i];
vk::ColorComponentFlags color_components{};
VkColorComponentFlags color_components = 0;
for (std::size_t j = 0; j < component_table.size(); ++j) {
if (blend.components[j])
if (blend.components[j]) {
color_components |= component_table[j];
}
}
cb_attachments[i] = vk::PipelineColorBlendAttachmentState(
blend.enable, MaxwellToVK::BlendFactor(blend.src_rgb_func),
MaxwellToVK::BlendFactor(blend.dst_rgb_func),
MaxwellToVK::BlendEquation(blend.rgb_equation),
MaxwellToVK::BlendFactor(blend.src_a_func), MaxwellToVK::BlendFactor(blend.dst_a_func),
MaxwellToVK::BlendEquation(blend.a_equation), color_components);
VkPipelineColorBlendAttachmentState& attachment = cb_attachments[i];
attachment.blendEnable = blend.enable;
attachment.srcColorBlendFactor = MaxwellToVK::BlendFactor(blend.src_rgb_func);
attachment.dstColorBlendFactor = MaxwellToVK::BlendFactor(blend.dst_rgb_func);
attachment.colorBlendOp = MaxwellToVK::BlendEquation(blend.rgb_equation);
attachment.srcAlphaBlendFactor = MaxwellToVK::BlendFactor(blend.src_a_func);
attachment.dstAlphaBlendFactor = MaxwellToVK::BlendFactor(blend.dst_a_func);
attachment.alphaBlendOp = MaxwellToVK::BlendEquation(blend.a_equation);
attachment.colorWriteMask = color_components;
}
const vk::PipelineColorBlendStateCreateInfo color_blending_ci({}, false, vk::LogicOp::eCopy,
static_cast<u32>(num_attachments),
cb_attachments.data(), {});
constexpr std::array dynamic_states = {
vk::DynamicState::eViewport, vk::DynamicState::eScissor,
vk::DynamicState::eDepthBias, vk::DynamicState::eBlendConstants,
vk::DynamicState::eDepthBounds, vk::DynamicState::eStencilCompareMask,
vk::DynamicState::eStencilWriteMask, vk::DynamicState::eStencilReference};
const vk::PipelineDynamicStateCreateInfo dynamic_state_ci(
{}, static_cast<u32>(dynamic_states.size()), dynamic_states.data());
VkPipelineColorBlendStateCreateInfo color_blend_ci;
color_blend_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO;
color_blend_ci.pNext = nullptr;
color_blend_ci.flags = 0;
color_blend_ci.logicOpEnable = VK_FALSE;
color_blend_ci.logicOp = VK_LOGIC_OP_COPY;
color_blend_ci.attachmentCount = static_cast<u32>(num_attachments);
color_blend_ci.pAttachments = cb_attachments.data();
std::memset(color_blend_ci.blendConstants, 0, sizeof(color_blend_ci.blendConstants));
vk::PipelineShaderStageRequiredSubgroupSizeCreateInfoEXT subgroup_size_ci;
static constexpr std::array dynamic_states = {
VK_DYNAMIC_STATE_VIEWPORT, VK_DYNAMIC_STATE_SCISSOR,
VK_DYNAMIC_STATE_DEPTH_BIAS, VK_DYNAMIC_STATE_BLEND_CONSTANTS,
VK_DYNAMIC_STATE_DEPTH_BOUNDS, VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK,
VK_DYNAMIC_STATE_STENCIL_WRITE_MASK, VK_DYNAMIC_STATE_STENCIL_REFERENCE};
VkPipelineDynamicStateCreateInfo dynamic_state_ci;
dynamic_state_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO;
dynamic_state_ci.pNext = nullptr;
dynamic_state_ci.flags = 0;
dynamic_state_ci.dynamicStateCount = static_cast<u32>(dynamic_states.size());
dynamic_state_ci.pDynamicStates = dynamic_states.data();
VkPipelineShaderStageRequiredSubgroupSizeCreateInfoEXT subgroup_size_ci;
subgroup_size_ci.sType =
VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_REQUIRED_SUBGROUP_SIZE_CREATE_INFO_EXT;
subgroup_size_ci.pNext = nullptr;
subgroup_size_ci.requiredSubgroupSize = GuestWarpSize;
std::vector<vk::PipelineShaderStageCreateInfo> shader_stages;
std::vector<VkPipelineShaderStageCreateInfo> shader_stages;
std::size_t module_index = 0;
for (std::size_t stage = 0; stage < Maxwell::MaxShaderStage; ++stage) {
if (!program[stage]) {
continue;
}
const auto stage_enum = static_cast<Tegra::Engines::ShaderType>(stage);
const auto vk_stage = MaxwellToVK::ShaderStage(stage_enum);
auto& stage_ci = shader_stages.emplace_back(vk::PipelineShaderStageCreateFlags{}, vk_stage,
*modules[module_index++], "main", nullptr);
if (program[stage]->entries.uses_warps && device.IsGuestWarpSizeSupported(vk_stage)) {
VkPipelineShaderStageCreateInfo& stage_ci = shader_stages.emplace_back();
stage_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO;
stage_ci.pNext = nullptr;
stage_ci.flags = 0;
stage_ci.stage = MaxwellToVK::ShaderStage(static_cast<Tegra::Engines::ShaderType>(stage));
stage_ci.module = *modules[module_index++];
stage_ci.pName = "main";
stage_ci.pSpecializationInfo = nullptr;
if (program[stage]->entries.uses_warps && device.IsGuestWarpSizeSupported(stage_ci.stage)) {
stage_ci.pNext = &subgroup_size_ci;
}
}
const vk::GraphicsPipelineCreateInfo create_info(
{}, static_cast<u32>(shader_stages.size()), shader_stages.data(), &vertex_input_ci,
&input_assembly_ci, &tessellation_ci, &viewport_ci, &rasterizer_ci, &multisampling_ci,
&depth_stencil_ci, &color_blending_ci, &dynamic_state_ci, *layout, renderpass, 0, {}, 0);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createGraphicsPipelineUnique(nullptr, create_info, nullptr, dld);
VkGraphicsPipelineCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.stageCount = static_cast<u32>(shader_stages.size());
ci.pStages = shader_stages.data();
ci.pVertexInputState = &vertex_input_ci;
ci.pInputAssemblyState = &input_assembly_ci;
ci.pTessellationState = &tessellation_ci;
ci.pViewportState = &viewport_ci;
ci.pRasterizationState = &rasterization_ci;
ci.pMultisampleState = &multisample_ci;
ci.pDepthStencilState = &depth_stencil_ci;
ci.pColorBlendState = &color_blend_ci;
ci.pDynamicState = &dynamic_state_ci;
ci.layout = *layout;
ci.renderPass = renderpass;
ci.subpass = 0;
ci.basePipelineHandle = nullptr;
ci.basePipelineIndex = 0;
return device.GetLogical().CreateGraphicsPipeline(ci);
}
} // namespace Vulkan

View File

@@ -11,12 +11,12 @@
#include <vector>
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/fixed_pipeline_state.h"
#include "video_core/renderer_vulkan/vk_descriptor_pool.h"
#include "video_core/renderer_vulkan/vk_renderpass_cache.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_shader_decompiler.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -39,52 +39,52 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue,
VKRenderPassCache& renderpass_cache,
const GraphicsPipelineCacheKey& key,
const std::vector<vk::DescriptorSetLayoutBinding>& bindings,
vk::Span<VkDescriptorSetLayoutBinding> bindings,
const SPIRVProgram& program);
~VKGraphicsPipeline();
vk::DescriptorSet CommitDescriptorSet();
VkDescriptorSet CommitDescriptorSet();
vk::Pipeline GetHandle() const {
VkPipeline GetHandle() const {
return *pipeline;
}
vk::PipelineLayout GetLayout() const {
VkPipelineLayout GetLayout() const {
return *layout;
}
vk::RenderPass GetRenderPass() const {
VkRenderPass GetRenderPass() const {
return renderpass;
}
private:
UniqueDescriptorSetLayout CreateDescriptorSetLayout(
const std::vector<vk::DescriptorSetLayoutBinding>& bindings) const;
vk::DescriptorSetLayout CreateDescriptorSetLayout(
vk::Span<VkDescriptorSetLayoutBinding> bindings) const;
UniquePipelineLayout CreatePipelineLayout() const;
vk::PipelineLayout CreatePipelineLayout() const;
UniqueDescriptorUpdateTemplate CreateDescriptorUpdateTemplate(
vk::DescriptorUpdateTemplateKHR CreateDescriptorUpdateTemplate(
const SPIRVProgram& program) const;
std::vector<UniqueShaderModule> CreateShaderModules(const SPIRVProgram& program) const;
std::vector<vk::ShaderModule> CreateShaderModules(const SPIRVProgram& program) const;
UniquePipeline CreatePipeline(const RenderPassParams& renderpass_params,
const SPIRVProgram& program) const;
vk::Pipeline CreatePipeline(const RenderPassParams& renderpass_params,
const SPIRVProgram& program) const;
const VKDevice& device;
VKScheduler& scheduler;
const FixedPipelineState fixed_state;
const u64 hash;
UniqueDescriptorSetLayout descriptor_set_layout;
vk::DescriptorSetLayout descriptor_set_layout;
DescriptorAllocator descriptor_allocator;
VKUpdateDescriptorQueue& update_descriptor_queue;
UniquePipelineLayout layout;
UniqueDescriptorUpdateTemplate descriptor_template;
std::vector<UniqueShaderModule> modules;
vk::PipelineLayout layout;
vk::DescriptorUpdateTemplateKHR descriptor_template;
std::vector<vk::ShaderModule> modules;
vk::RenderPass renderpass;
UniquePipeline pipeline;
VkRenderPass renderpass;
vk::Pipeline pipeline;
};
} // namespace Vulkan

View File

@@ -6,22 +6,21 @@
#include <vector>
#include "common/assert.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_image.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
VKImage::VKImage(const VKDevice& device, VKScheduler& scheduler,
const vk::ImageCreateInfo& image_ci, vk::ImageAspectFlags aspect_mask)
VKImage::VKImage(const VKDevice& device, VKScheduler& scheduler, const VkImageCreateInfo& image_ci,
VkImageAspectFlags aspect_mask)
: device{device}, scheduler{scheduler}, format{image_ci.format}, aspect_mask{aspect_mask},
image_num_layers{image_ci.arrayLayers}, image_num_levels{image_ci.mipLevels} {
UNIMPLEMENTED_IF_MSG(image_ci.queueFamilyIndexCount != 0,
"Queue family tracking is not implemented");
const auto dev = device.GetLogical();
image = dev.createImageUnique(image_ci, nullptr, device.GetDispatchLoader());
image = device.GetLogical().CreateImage(image_ci);
const u32 num_ranges = image_num_layers * image_num_levels;
barriers.resize(num_ranges);
@@ -31,8 +30,8 @@ VKImage::VKImage(const VKDevice& device, VKScheduler& scheduler,
VKImage::~VKImage() = default;
void VKImage::Transition(u32 base_layer, u32 num_layers, u32 base_level, u32 num_levels,
vk::PipelineStageFlags new_stage_mask, vk::AccessFlags new_access,
vk::ImageLayout new_layout) {
VkPipelineStageFlags new_stage_mask, VkAccessFlags new_access,
VkImageLayout new_layout) {
if (!HasChanged(base_layer, num_layers, base_level, num_levels, new_access, new_layout)) {
return;
}
@@ -43,9 +42,21 @@ void VKImage::Transition(u32 base_layer, u32 num_layers, u32 base_level, u32 num
const u32 layer = base_layer + layer_it;
const u32 level = base_level + level_it;
auto& state = GetSubrangeState(layer, level);
barriers[cursor] = vk::ImageMemoryBarrier(
state.access, new_access, state.layout, new_layout, VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED, *image, {aspect_mask, level, 1, layer, 1});
auto& barrier = barriers[cursor];
barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = state.access;
barrier.dstAccessMask = new_access;
barrier.oldLayout = state.layout;
barrier.newLayout = new_layout;
barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
barrier.image = *image;
barrier.subresourceRange.aspectMask = aspect_mask;
barrier.subresourceRange.baseMipLevel = level;
barrier.subresourceRange.levelCount = 1;
barrier.subresourceRange.baseArrayLayer = layer;
barrier.subresourceRange.layerCount = 1;
state.access = new_access;
state.layout = new_layout;
}
@@ -53,16 +64,16 @@ void VKImage::Transition(u32 base_layer, u32 num_layers, u32 base_level, u32 num
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([barriers = barriers, cursor](auto cmdbuf, auto& dld) {
scheduler.Record([barriers = barriers, cursor](vk::CommandBuffer cmdbuf) {
// TODO(Rodrigo): Implement a way to use the latest stage across subresources.
constexpr auto stage_stub = vk::PipelineStageFlagBits::eAllCommands;
cmdbuf.pipelineBarrier(stage_stub, stage_stub, {}, 0, nullptr, 0, nullptr,
static_cast<u32>(cursor), barriers.data(), dld);
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_ALL_COMMANDS_BIT,
VK_PIPELINE_STAGE_ALL_COMMANDS_BIT, 0, {}, {},
vk::Span(barriers.data(), cursor));
});
}
bool VKImage::HasChanged(u32 base_layer, u32 num_layers, u32 base_level, u32 num_levels,
vk::AccessFlags new_access, vk::ImageLayout new_layout) noexcept {
VkAccessFlags new_access, VkImageLayout new_layout) noexcept {
const bool is_full_range = base_layer == 0 && num_layers == image_num_layers &&
base_level == 0 && num_levels == image_num_levels;
if (!is_full_range) {
@@ -91,11 +102,21 @@ bool VKImage::HasChanged(u32 base_layer, u32 num_layers, u32 base_level, u32 num
void VKImage::CreatePresentView() {
// Image type has to be 2D to be presented.
const vk::ImageViewCreateInfo image_view_ci({}, *image, vk::ImageViewType::e2D, format, {},
{aspect_mask, 0, 1, 0, 1});
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
present_view = dev.createImageViewUnique(image_view_ci, nullptr, dld);
VkImageViewCreateInfo image_view_ci;
image_view_ci.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
image_view_ci.pNext = nullptr;
image_view_ci.flags = 0;
image_view_ci.image = *image;
image_view_ci.viewType = VK_IMAGE_VIEW_TYPE_2D;
image_view_ci.format = format;
image_view_ci.components = {VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY,
VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY};
image_view_ci.subresourceRange.aspectMask = aspect_mask;
image_view_ci.subresourceRange.baseMipLevel = 0;
image_view_ci.subresourceRange.levelCount = 1;
image_view_ci.subresourceRange.baseArrayLayer = 0;
image_view_ci.subresourceRange.layerCount = 1;
present_view = device.GetLogical().CreateImageView(image_view_ci);
}
VKImage::SubrangeState& VKImage::GetSubrangeState(u32 layer, u32 level) noexcept {

View File

@@ -8,7 +8,7 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -18,16 +18,16 @@ class VKScheduler;
class VKImage {
public:
explicit VKImage(const VKDevice& device, VKScheduler& scheduler,
const vk::ImageCreateInfo& image_ci, vk::ImageAspectFlags aspect_mask);
const VkImageCreateInfo& image_ci, VkImageAspectFlags aspect_mask);
~VKImage();
/// Records in the passed command buffer an image transition and updates the state of the image.
void Transition(u32 base_layer, u32 num_layers, u32 base_level, u32 num_levels,
vk::PipelineStageFlags new_stage_mask, vk::AccessFlags new_access,
vk::ImageLayout new_layout);
VkPipelineStageFlags new_stage_mask, VkAccessFlags new_access,
VkImageLayout new_layout);
/// Returns a view compatible with presentation, the image has to be 2D.
vk::ImageView GetPresentView() {
VkImageView GetPresentView() {
if (!present_view) {
CreatePresentView();
}
@@ -35,28 +35,28 @@ public:
}
/// Returns the Vulkan image handler.
vk::Image GetHandle() const {
return *image;
const vk::Image& GetHandle() const {
return image;
}
/// Returns the Vulkan format for this image.
vk::Format GetFormat() const {
VkFormat GetFormat() const {
return format;
}
/// Returns the Vulkan aspect mask.
vk::ImageAspectFlags GetAspectMask() const {
VkImageAspectFlags GetAspectMask() const {
return aspect_mask;
}
private:
struct SubrangeState final {
vk::AccessFlags access{}; ///< Current access bits.
vk::ImageLayout layout = vk::ImageLayout::eUndefined; ///< Current image layout.
VkAccessFlags access = 0; ///< Current access bits.
VkImageLayout layout = VK_IMAGE_LAYOUT_UNDEFINED; ///< Current image layout.
};
bool HasChanged(u32 base_layer, u32 num_layers, u32 base_level, u32 num_levels,
vk::AccessFlags new_access, vk::ImageLayout new_layout) noexcept;
VkAccessFlags new_access, VkImageLayout new_layout) noexcept;
/// Creates a presentation view.
void CreatePresentView();
@@ -67,16 +67,16 @@ private:
const VKDevice& device; ///< Device handler.
VKScheduler& scheduler; ///< Device scheduler.
const vk::Format format; ///< Vulkan format.
const vk::ImageAspectFlags aspect_mask; ///< Vulkan aspect mask.
const u32 image_num_layers; ///< Number of layers.
const u32 image_num_levels; ///< Number of mipmap levels.
const VkFormat format; ///< Vulkan format.
const VkImageAspectFlags aspect_mask; ///< Vulkan aspect mask.
const u32 image_num_layers; ///< Number of layers.
const u32 image_num_levels; ///< Number of mipmap levels.
UniqueImage image; ///< Image handle.
UniqueImageView present_view; ///< Image view compatible with presentation.
vk::Image image; ///< Image handle.
vk::ImageView present_view; ///< Image view compatible with presentation.
std::vector<vk::ImageMemoryBarrier> barriers; ///< Pool of barriers.
std::vector<SubrangeState> subrange_states; ///< Current subrange state.
std::vector<VkImageMemoryBarrier> barriers; ///< Pool of barriers.
std::vector<SubrangeState> subrange_states; ///< Current subrange state.
bool state_diverged = false; ///< True when subresources mismatch in layout.
};

View File

@@ -11,9 +11,9 @@
#include "common/assert.h"
#include "common/common_types.h"
#include "common/logging/log.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -30,17 +30,11 @@ u64 GetAllocationChunkSize(u64 required_size) {
class VKMemoryAllocation final {
public:
explicit VKMemoryAllocation(const VKDevice& device, vk::DeviceMemory memory,
vk::MemoryPropertyFlags properties, u64 allocation_size, u32 type)
: device{device}, memory{memory}, properties{properties}, allocation_size{allocation_size},
shifted_type{ShiftType(type)} {}
VkMemoryPropertyFlags properties, u64 allocation_size, u32 type)
: device{device}, memory{std::move(memory)}, properties{properties},
allocation_size{allocation_size}, shifted_type{ShiftType(type)} {}
~VKMemoryAllocation() {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
dev.free(memory, nullptr, dld);
}
VKMemoryCommit Commit(vk::DeviceSize commit_size, vk::DeviceSize alignment) {
VKMemoryCommit Commit(VkDeviceSize commit_size, VkDeviceSize alignment) {
auto found = TryFindFreeSection(free_iterator, allocation_size,
static_cast<u64>(commit_size), static_cast<u64>(alignment));
if (!found) {
@@ -73,9 +67,8 @@ public:
}
/// Returns whether this allocation is compatible with the arguments.
bool IsCompatible(vk::MemoryPropertyFlags wanted_properties, u32 type_mask) const {
return (wanted_properties & properties) != vk::MemoryPropertyFlagBits(0) &&
(type_mask & shifted_type) != 0;
bool IsCompatible(VkMemoryPropertyFlags wanted_properties, u32 type_mask) const {
return (wanted_properties & properties) && (type_mask & shifted_type) != 0;
}
private:
@@ -111,11 +104,11 @@ private:
return std::nullopt;
}
const VKDevice& device; ///< Vulkan device.
const vk::DeviceMemory memory; ///< Vulkan memory allocation handler.
const vk::MemoryPropertyFlags properties; ///< Vulkan properties.
const u64 allocation_size; ///< Size of this allocation.
const u32 shifted_type; ///< Stored Vulkan type of this allocation, shifted.
const VKDevice& device; ///< Vulkan device.
const vk::DeviceMemory memory; ///< Vulkan memory allocation handler.
const VkMemoryPropertyFlags properties; ///< Vulkan properties.
const u64 allocation_size; ///< Size of this allocation.
const u32 shifted_type; ///< Stored Vulkan type of this allocation, shifted.
/// Hints where the next free region is likely going to be.
u64 free_iterator{};
@@ -125,22 +118,20 @@ private:
};
VKMemoryManager::VKMemoryManager(const VKDevice& device)
: device{device}, properties{device.GetPhysical().getMemoryProperties(
device.GetDispatchLoader())},
: device{device}, properties{device.GetPhysical().GetMemoryProperties()},
is_memory_unified{GetMemoryUnified(properties)} {}
VKMemoryManager::~VKMemoryManager() = default;
VKMemoryCommit VKMemoryManager::Commit(const vk::MemoryRequirements& requirements,
VKMemoryCommit VKMemoryManager::Commit(const VkMemoryRequirements& requirements,
bool host_visible) {
const u64 chunk_size = GetAllocationChunkSize(requirements.size);
// When a host visible commit is asked, search for host visible and coherent, otherwise search
// for a fast device local type.
const vk::MemoryPropertyFlags wanted_properties =
host_visible
? vk::MemoryPropertyFlagBits::eHostVisible | vk::MemoryPropertyFlagBits::eHostCoherent
: vk::MemoryPropertyFlagBits::eDeviceLocal;
const VkMemoryPropertyFlags wanted_properties =
host_visible ? VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
: VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT;
if (auto commit = TryAllocCommit(requirements, wanted_properties)) {
return commit;
@@ -161,23 +152,19 @@ VKMemoryCommit VKMemoryManager::Commit(const vk::MemoryRequirements& requirement
return commit;
}
VKMemoryCommit VKMemoryManager::Commit(vk::Buffer buffer, bool host_visible) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
auto commit = Commit(dev.getBufferMemoryRequirements(buffer, dld), host_visible);
dev.bindBufferMemory(buffer, commit->GetMemory(), commit->GetOffset(), dld);
VKMemoryCommit VKMemoryManager::Commit(const vk::Buffer& buffer, bool host_visible) {
auto commit = Commit(device.GetLogical().GetBufferMemoryRequirements(*buffer), host_visible);
buffer.BindMemory(commit->GetMemory(), commit->GetOffset());
return commit;
}
VKMemoryCommit VKMemoryManager::Commit(vk::Image image, bool host_visible) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
auto commit = Commit(dev.getImageMemoryRequirements(image, dld), host_visible);
dev.bindImageMemory(image, commit->GetMemory(), commit->GetOffset(), dld);
VKMemoryCommit VKMemoryManager::Commit(const vk::Image& image, bool host_visible) {
auto commit = Commit(device.GetLogical().GetImageMemoryRequirements(*image), host_visible);
image.BindMemory(commit->GetMemory(), commit->GetOffset());
return commit;
}
bool VKMemoryManager::AllocMemory(vk::MemoryPropertyFlags wanted_properties, u32 type_mask,
bool VKMemoryManager::AllocMemory(VkMemoryPropertyFlags wanted_properties, u32 type_mask,
u64 size) {
const u32 type = [&] {
for (u32 type_index = 0; type_index < properties.memoryTypeCount; ++type_index) {
@@ -191,24 +178,26 @@ bool VKMemoryManager::AllocMemory(vk::MemoryPropertyFlags wanted_properties, u32
return 0U;
}();
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
// Try to allocate found type.
const vk::MemoryAllocateInfo memory_ai(size, type);
vk::DeviceMemory memory;
if (const auto res = dev.allocateMemory(&memory_ai, nullptr, &memory, dld);
res != vk::Result::eSuccess) {
LOG_CRITICAL(Render_Vulkan, "Device allocation failed with code {}!", vk::to_string(res));
VkMemoryAllocateInfo memory_ai;
memory_ai.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
memory_ai.pNext = nullptr;
memory_ai.allocationSize = size;
memory_ai.memoryTypeIndex = type;
vk::DeviceMemory memory = device.GetLogical().TryAllocateMemory(memory_ai);
if (!memory) {
LOG_CRITICAL(Render_Vulkan, "Device allocation failed!");
return false;
}
allocations.push_back(
std::make_unique<VKMemoryAllocation>(device, memory, wanted_properties, size, type));
allocations.push_back(std::make_unique<VKMemoryAllocation>(device, std::move(memory),
wanted_properties, size, type));
return true;
}
VKMemoryCommit VKMemoryManager::TryAllocCommit(const vk::MemoryRequirements& requirements,
vk::MemoryPropertyFlags wanted_properties) {
VKMemoryCommit VKMemoryManager::TryAllocCommit(const VkMemoryRequirements& requirements,
VkMemoryPropertyFlags wanted_properties) {
for (auto& allocation : allocations) {
if (!allocation->IsCompatible(wanted_properties, requirements.memoryTypeBits)) {
continue;
@@ -220,10 +209,9 @@ VKMemoryCommit VKMemoryManager::TryAllocCommit(const vk::MemoryRequirements& req
return {};
}
/*static*/ bool VKMemoryManager::GetMemoryUnified(
const vk::PhysicalDeviceMemoryProperties& properties) {
bool VKMemoryManager::GetMemoryUnified(const VkPhysicalDeviceMemoryProperties& properties) {
for (u32 heap_index = 0; heap_index < properties.memoryHeapCount; ++heap_index) {
if (!(properties.memoryHeaps[heap_index].flags & vk::MemoryHeapFlagBits::eDeviceLocal)) {
if (!(properties.memoryHeaps[heap_index].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT)) {
// Memory is considered unified when heaps are device local only.
return false;
}
@@ -232,23 +220,19 @@ VKMemoryCommit VKMemoryManager::TryAllocCommit(const vk::MemoryRequirements& req
}
VKMemoryCommitImpl::VKMemoryCommitImpl(const VKDevice& device, VKMemoryAllocation* allocation,
vk::DeviceMemory memory, u64 begin, u64 end)
: device{device}, interval{begin, end}, memory{memory}, allocation{allocation} {}
const vk::DeviceMemory& memory, u64 begin, u64 end)
: device{device}, memory{memory}, interval{begin, end}, allocation{allocation} {}
VKMemoryCommitImpl::~VKMemoryCommitImpl() {
allocation->Free(this);
}
MemoryMap VKMemoryCommitImpl::Map(u64 size, u64 offset_) const {
const auto dev = device.GetLogical();
const auto address = reinterpret_cast<u8*>(
dev.mapMemory(memory, interval.first + offset_, size, {}, device.GetDispatchLoader()));
return MemoryMap{this, address};
return MemoryMap{this, memory.Map(interval.first + offset_, size)};
}
void VKMemoryCommitImpl::Unmap() const {
const auto dev = device.GetLogical();
dev.unmapMemory(memory, device.GetDispatchLoader());
memory.Unmap();
}
MemoryMap VKMemoryCommitImpl::Map() const {

View File

@@ -8,7 +8,7 @@
#include <utility>
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -32,13 +32,13 @@ public:
* memory. When passing false, it will try to allocate device local memory.
* @returns A memory commit.
*/
VKMemoryCommit Commit(const vk::MemoryRequirements& reqs, bool host_visible);
VKMemoryCommit Commit(const VkMemoryRequirements& reqs, bool host_visible);
/// Commits memory required by the buffer and binds it.
VKMemoryCommit Commit(vk::Buffer buffer, bool host_visible);
VKMemoryCommit Commit(const vk::Buffer& buffer, bool host_visible);
/// Commits memory required by the image and binds it.
VKMemoryCommit Commit(vk::Image image, bool host_visible);
VKMemoryCommit Commit(const vk::Image& image, bool host_visible);
/// Returns true if the memory allocations are done always in host visible and coherent memory.
bool IsMemoryUnified() const {
@@ -47,18 +47,18 @@ public:
private:
/// Allocates a chunk of memory.
bool AllocMemory(vk::MemoryPropertyFlags wanted_properties, u32 type_mask, u64 size);
bool AllocMemory(VkMemoryPropertyFlags wanted_properties, u32 type_mask, u64 size);
/// Tries to allocate a memory commit.
VKMemoryCommit TryAllocCommit(const vk::MemoryRequirements& requirements,
vk::MemoryPropertyFlags wanted_properties);
VKMemoryCommit TryAllocCommit(const VkMemoryRequirements& requirements,
VkMemoryPropertyFlags wanted_properties);
/// Returns true if the device uses an unified memory model.
static bool GetMemoryUnified(const vk::PhysicalDeviceMemoryProperties& properties);
static bool GetMemoryUnified(const VkPhysicalDeviceMemoryProperties& properties);
const VKDevice& device; ///< Device handler.
const vk::PhysicalDeviceMemoryProperties properties; ///< Physical device properties.
const bool is_memory_unified; ///< True if memory model is unified.
const VKDevice& device; ///< Device handler.
const VkPhysicalDeviceMemoryProperties properties; ///< Physical device properties.
const bool is_memory_unified; ///< True if memory model is unified.
std::vector<std::unique_ptr<VKMemoryAllocation>> allocations; ///< Current allocations.
};
@@ -68,7 +68,7 @@ class VKMemoryCommitImpl final {
public:
explicit VKMemoryCommitImpl(const VKDevice& device, VKMemoryAllocation* allocation,
vk::DeviceMemory memory, u64 begin, u64 end);
const vk::DeviceMemory& memory, u64 begin, u64 end);
~VKMemoryCommitImpl();
/// Maps a memory region and returns a pointer to it.
@@ -80,13 +80,13 @@ public:
MemoryMap Map() const;
/// Returns the Vulkan memory handler.
vk::DeviceMemory GetMemory() const {
return memory;
VkDeviceMemory GetMemory() const {
return *memory;
}
/// Returns the start position of the commit relative to the allocation.
vk::DeviceSize GetOffset() const {
return static_cast<vk::DeviceSize>(interval.first);
VkDeviceSize GetOffset() const {
return static_cast<VkDeviceSize>(interval.first);
}
private:
@@ -94,8 +94,8 @@ private:
void Unmap() const;
const VKDevice& device; ///< Vulkan device.
const vk::DeviceMemory& memory; ///< Vulkan device memory handler.
std::pair<u64, u64> interval{}; ///< Interval where the commit exists.
vk::DeviceMemory memory; ///< Vulkan device memory handler.
VKMemoryAllocation* allocation{}; ///< Pointer to the large memory allocation.
};

View File

@@ -13,7 +13,6 @@
#include "video_core/engines/kepler_compute.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/memory_manager.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/fixed_pipeline_state.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_compute_pipeline.h"
@@ -26,6 +25,7 @@
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/shader/compiler_settings.h"
namespace Vulkan {
@@ -36,12 +36,11 @@ using Tegra::Engines::ShaderType;
namespace {
// C++20's using enum
constexpr auto eUniformBuffer = vk::DescriptorType::eUniformBuffer;
constexpr auto eStorageBuffer = vk::DescriptorType::eStorageBuffer;
constexpr auto eUniformTexelBuffer = vk::DescriptorType::eUniformTexelBuffer;
constexpr auto eCombinedImageSampler = vk::DescriptorType::eCombinedImageSampler;
constexpr auto eStorageImage = vk::DescriptorType::eStorageImage;
constexpr VkDescriptorType UNIFORM_BUFFER = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
constexpr VkDescriptorType STORAGE_BUFFER = VK_DESCRIPTOR_TYPE_STORAGE_BUFFER;
constexpr VkDescriptorType UNIFORM_TEXEL_BUFFER = VK_DESCRIPTOR_TYPE_UNIFORM_TEXEL_BUFFER;
constexpr VkDescriptorType COMBINED_IMAGE_SAMPLER = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER;
constexpr VkDescriptorType STORAGE_IMAGE = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE;
constexpr VideoCommon::Shader::CompilerSettings compiler_settings{
VideoCommon::Shader::CompileDepth::FullDecompile};
@@ -126,32 +125,37 @@ ShaderType GetShaderType(Maxwell::ShaderProgram program) {
}
}
template <vk::DescriptorType descriptor_type, class Container>
void AddBindings(std::vector<vk::DescriptorSetLayoutBinding>& bindings, u32& binding,
vk::ShaderStageFlags stage_flags, const Container& container) {
template <VkDescriptorType descriptor_type, class Container>
void AddBindings(std::vector<VkDescriptorSetLayoutBinding>& bindings, u32& binding,
VkShaderStageFlags stage_flags, const Container& container) {
const u32 num_entries = static_cast<u32>(std::size(container));
for (std::size_t i = 0; i < num_entries; ++i) {
u32 count = 1;
if constexpr (descriptor_type == eCombinedImageSampler) {
if constexpr (descriptor_type == VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER) {
// Combined image samplers can be arrayed.
count = container[i].Size();
}
bindings.emplace_back(binding++, descriptor_type, count, stage_flags, nullptr);
VkDescriptorSetLayoutBinding& entry = bindings.emplace_back();
entry.binding = binding++;
entry.descriptorType = descriptor_type;
entry.descriptorCount = count;
entry.stageFlags = stage_flags;
entry.pImmutableSamplers = nullptr;
}
}
u32 FillDescriptorLayout(const ShaderEntries& entries,
std::vector<vk::DescriptorSetLayoutBinding>& bindings,
std::vector<VkDescriptorSetLayoutBinding>& bindings,
Maxwell::ShaderProgram program_type, u32 base_binding) {
const ShaderType stage = GetStageFromProgram(program_type);
const vk::ShaderStageFlags flags = MaxwellToVK::ShaderStage(stage);
const VkShaderStageFlags flags = MaxwellToVK::ShaderStage(stage);
u32 binding = base_binding;
AddBindings<eUniformBuffer>(bindings, binding, flags, entries.const_buffers);
AddBindings<eStorageBuffer>(bindings, binding, flags, entries.global_buffers);
AddBindings<eUniformTexelBuffer>(bindings, binding, flags, entries.texel_buffers);
AddBindings<eCombinedImageSampler>(bindings, binding, flags, entries.samplers);
AddBindings<eStorageImage>(bindings, binding, flags, entries.images);
AddBindings<UNIFORM_BUFFER>(bindings, binding, flags, entries.const_buffers);
AddBindings<STORAGE_BUFFER>(bindings, binding, flags, entries.global_buffers);
AddBindings<UNIFORM_TEXEL_BUFFER>(bindings, binding, flags, entries.texel_buffers);
AddBindings<COMBINED_IMAGE_SAMPLER>(bindings, binding, flags, entries.samplers);
AddBindings<STORAGE_IMAGE>(bindings, binding, flags, entries.images);
return binding;
}
@@ -318,7 +322,7 @@ void VKPipelineCache::Unregister(const Shader& shader) {
RasterizerCache::Unregister(shader);
}
std::pair<SPIRVProgram, std::vector<vk::DescriptorSetLayoutBinding>>
std::pair<SPIRVProgram, std::vector<VkDescriptorSetLayoutBinding>>
VKPipelineCache::DecompileShaders(const GraphicsPipelineCacheKey& key) {
const auto& fixed_state = key.fixed_state;
auto& memory_manager = system.GPU().MemoryManager();
@@ -335,7 +339,7 @@ VKPipelineCache::DecompileShaders(const GraphicsPipelineCacheKey& key) {
specialization.ndc_minus_one_to_one = fixed_state.rasterizer.ndc_minus_one_to_one;
SPIRVProgram program;
std::vector<vk::DescriptorSetLayoutBinding> bindings;
std::vector<VkDescriptorSetLayoutBinding> bindings;
for (std::size_t index = 0; index < Maxwell::MaxShaderProgram; ++index) {
const auto program_enum = static_cast<Maxwell::ShaderProgram>(index);
@@ -371,32 +375,49 @@ VKPipelineCache::DecompileShaders(const GraphicsPipelineCacheKey& key) {
return {std::move(program), std::move(bindings)};
}
template <vk::DescriptorType descriptor_type, class Container>
void AddEntry(std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries, u32& binding,
template <VkDescriptorType descriptor_type, class Container>
void AddEntry(std::vector<VkDescriptorUpdateTemplateEntry>& template_entries, u32& binding,
u32& offset, const Container& container) {
static constexpr u32 entry_size = static_cast<u32>(sizeof(DescriptorUpdateEntry));
const u32 count = static_cast<u32>(std::size(container));
if constexpr (descriptor_type == eCombinedImageSampler) {
if constexpr (descriptor_type == COMBINED_IMAGE_SAMPLER) {
for (u32 i = 0; i < count; ++i) {
const u32 num_samplers = container[i].Size();
template_entries.emplace_back(binding, 0, num_samplers, descriptor_type, offset,
entry_size);
VkDescriptorUpdateTemplateEntry& entry = template_entries.emplace_back();
entry.dstBinding = binding;
entry.dstArrayElement = 0;
entry.descriptorCount = num_samplers;
entry.descriptorType = descriptor_type;
entry.offset = offset;
entry.stride = entry_size;
++binding;
offset += num_samplers * entry_size;
}
return;
}
if constexpr (descriptor_type == eUniformTexelBuffer) {
if constexpr (descriptor_type == UNIFORM_TEXEL_BUFFER) {
// Nvidia has a bug where updating multiple uniform texels at once causes the driver to
// crash.
for (u32 i = 0; i < count; ++i) {
template_entries.emplace_back(binding + i, 0, 1, descriptor_type,
offset + i * entry_size, entry_size);
VkDescriptorUpdateTemplateEntry& entry = template_entries.emplace_back();
entry.dstBinding = binding + i;
entry.dstArrayElement = 0;
entry.descriptorCount = 1;
entry.descriptorType = descriptor_type;
entry.offset = offset + i * entry_size;
entry.stride = entry_size;
}
} else if (count > 0) {
template_entries.emplace_back(binding, 0, count, descriptor_type, offset, entry_size);
VkDescriptorUpdateTemplateEntry& entry = template_entries.emplace_back();
entry.dstBinding = binding;
entry.dstArrayElement = 0;
entry.descriptorCount = count;
entry.descriptorType = descriptor_type;
entry.offset = offset;
entry.stride = entry_size;
}
offset += count * entry_size;
binding += count;
@@ -404,12 +425,12 @@ void AddEntry(std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries,
void FillDescriptorUpdateTemplateEntries(
const ShaderEntries& entries, u32& binding, u32& offset,
std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries) {
AddEntry<eUniformBuffer>(template_entries, offset, binding, entries.const_buffers);
AddEntry<eStorageBuffer>(template_entries, offset, binding, entries.global_buffers);
AddEntry<eUniformTexelBuffer>(template_entries, offset, binding, entries.texel_buffers);
AddEntry<eCombinedImageSampler>(template_entries, offset, binding, entries.samplers);
AddEntry<eStorageImage>(template_entries, offset, binding, entries.images);
std::vector<VkDescriptorUpdateTemplateEntryKHR>& template_entries) {
AddEntry<UNIFORM_BUFFER>(template_entries, offset, binding, entries.const_buffers);
AddEntry<STORAGE_BUFFER>(template_entries, offset, binding, entries.global_buffers);
AddEntry<UNIFORM_TEXEL_BUFFER>(template_entries, offset, binding, entries.texel_buffers);
AddEntry<COMBINED_IMAGE_SAMPLER>(template_entries, offset, binding, entries.samplers);
AddEntry<STORAGE_IMAGE>(template_entries, offset, binding, entries.images);
}
} // namespace Vulkan

View File

@@ -19,12 +19,12 @@
#include "video_core/engines/const_buffer_engine_interface.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/rasterizer_cache.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/fixed_pipeline_state.h"
#include "video_core/renderer_vulkan/vk_graphics_pipeline.h"
#include "video_core/renderer_vulkan/vk_renderpass_cache.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_shader_decompiler.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/shader/registry.h"
#include "video_core/shader/shader_ir.h"
#include "video_core/surface.h"
@@ -172,7 +172,7 @@ protected:
void FlushObjectInner(const Shader& object) override {}
private:
std::pair<SPIRVProgram, std::vector<vk::DescriptorSetLayoutBinding>> DecompileShaders(
std::pair<SPIRVProgram, std::vector<VkDescriptorSetLayoutBinding>> DecompileShaders(
const GraphicsPipelineCacheKey& key);
Core::System& system;
@@ -194,6 +194,6 @@ private:
void FillDescriptorUpdateTemplateEntries(
const ShaderEntries& entries, u32& binding, u32& offset,
std::vector<vk::DescriptorUpdateTemplateEntry>& template_entries);
std::vector<VkDescriptorUpdateTemplateEntryKHR>& template_entries);
} // namespace Vulkan

View File

@@ -8,19 +8,19 @@
#include <utility>
#include <vector>
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_query_cache.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
namespace {
constexpr std::array QUERY_TARGETS = {vk::QueryType::eOcclusion};
constexpr std::array QUERY_TARGETS = {VK_QUERY_TYPE_OCCLUSION};
constexpr vk::QueryType GetTarget(VideoCore::QueryType type) {
constexpr VkQueryType GetTarget(VideoCore::QueryType type) {
return QUERY_TARGETS[static_cast<std::size_t>(type)];
}
@@ -35,29 +35,34 @@ void QueryPool::Initialize(const VKDevice& device_, VideoCore::QueryType type_)
type = type_;
}
std::pair<vk::QueryPool, std::uint32_t> QueryPool::Commit(VKFence& fence) {
std::pair<VkQueryPool, u32> QueryPool::Commit(VKFence& fence) {
std::size_t index;
do {
index = CommitResource(fence);
} while (usage[index]);
usage[index] = true;
return {*pools[index / GROW_STEP], static_cast<std::uint32_t>(index % GROW_STEP)};
return {*pools[index / GROW_STEP], static_cast<u32>(index % GROW_STEP)};
}
void QueryPool::Allocate(std::size_t begin, std::size_t end) {
usage.resize(end);
const auto dev = device->GetLogical();
const u32 size = static_cast<u32>(end - begin);
const vk::QueryPoolCreateInfo query_pool_ci({}, GetTarget(type), size, {});
pools.push_back(dev.createQueryPoolUnique(query_pool_ci, nullptr, device->GetDispatchLoader()));
VkQueryPoolCreateInfo query_pool_ci;
query_pool_ci.sType = VK_STRUCTURE_TYPE_QUERY_POOL_CREATE_INFO;
query_pool_ci.pNext = nullptr;
query_pool_ci.flags = 0;
query_pool_ci.queryType = GetTarget(type);
query_pool_ci.queryCount = static_cast<u32>(end - begin);
query_pool_ci.pipelineStatistics = 0;
pools.push_back(device->GetLogical().CreateQueryPool(query_pool_ci));
}
void QueryPool::Reserve(std::pair<vk::QueryPool, std::uint32_t> query) {
void QueryPool::Reserve(std::pair<VkQueryPool, u32> query) {
const auto it =
std::find_if(std::begin(pools), std::end(pools),
[query_pool = query.first](auto& pool) { return query_pool == *pool; });
std::find_if(pools.begin(), pools.end(), [query_pool = query.first](vk::QueryPool& pool) {
return query_pool == *pool;
});
ASSERT(it != std::end(pools));
const std::ptrdiff_t pool_index = std::distance(std::begin(pools), it);
@@ -76,12 +81,11 @@ VKQueryCache::VKQueryCache(Core::System& system, VideoCore::RasterizerInterface&
VKQueryCache::~VKQueryCache() = default;
std::pair<vk::QueryPool, std::uint32_t> VKQueryCache::AllocateQuery(VideoCore::QueryType type) {
std::pair<VkQueryPool, u32> VKQueryCache::AllocateQuery(VideoCore::QueryType type) {
return query_pools[static_cast<std::size_t>(type)].Commit(scheduler.GetFence());
}
void VKQueryCache::Reserve(VideoCore::QueryType type,
std::pair<vk::QueryPool, std::uint32_t> query) {
void VKQueryCache::Reserve(VideoCore::QueryType type, std::pair<VkQueryPool, u32> query) {
query_pools[static_cast<std::size_t>(type)].Reserve(query);
}
@@ -89,10 +93,10 @@ HostCounter::HostCounter(VKQueryCache& cache, std::shared_ptr<HostCounter> depen
VideoCore::QueryType type)
: VideoCommon::HostCounterBase<VKQueryCache, HostCounter>{std::move(dependency)}, cache{cache},
type{type}, query{cache.AllocateQuery(type)}, ticks{cache.Scheduler().Ticks()} {
const auto dev = cache.Device().GetLogical();
cache.Scheduler().Record([dev, query = query](vk::CommandBuffer cmdbuf, auto& dld) {
dev.resetQueryPoolEXT(query.first, query.second, 1, dld);
cmdbuf.beginQuery(query.first, query.second, vk::QueryControlFlagBits::ePrecise, dld);
const vk::Device* logical = &cache.Device().GetLogical();
cache.Scheduler().Record([logical, query = query](vk::CommandBuffer cmdbuf) {
logical->ResetQueryPoolEXT(query.first, query.second, 1);
cmdbuf.BeginQuery(query.first, query.second, VK_QUERY_CONTROL_PRECISE_BIT);
});
}
@@ -101,22 +105,16 @@ HostCounter::~HostCounter() {
}
void HostCounter::EndQuery() {
cache.Scheduler().Record([query = query](auto cmdbuf, auto& dld) {
cmdbuf.endQuery(query.first, query.second, dld);
});
cache.Scheduler().Record(
[query = query](vk::CommandBuffer cmdbuf) { cmdbuf.EndQuery(query.first, query.second); });
}
u64 HostCounter::BlockingQuery() const {
if (ticks >= cache.Scheduler().Ticks()) {
cache.Scheduler().Flush();
}
const auto dev = cache.Device().GetLogical();
const auto& dld = cache.Device().GetDispatchLoader();
u64 value;
dev.getQueryPoolResults(query.first, query.second, 1, sizeof(value), &value, sizeof(value),
vk::QueryResultFlagBits::e64 | vk::QueryResultFlagBits::eWait, dld);
return value;
return cache.Device().GetLogical().GetQueryResult<u64>(
query.first, query.second, VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT);
}
} // namespace Vulkan

View File

@@ -12,8 +12,8 @@
#include "common/common_types.h"
#include "video_core/query_cache.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace VideoCore {
class RasterizerInterface;
@@ -36,9 +36,9 @@ public:
void Initialize(const VKDevice& device, VideoCore::QueryType type);
std::pair<vk::QueryPool, std::uint32_t> Commit(VKFence& fence);
std::pair<VkQueryPool, u32> Commit(VKFence& fence);
void Reserve(std::pair<vk::QueryPool, std::uint32_t> query);
void Reserve(std::pair<VkQueryPool, u32> query);
protected:
void Allocate(std::size_t begin, std::size_t end) override;
@@ -49,7 +49,7 @@ private:
const VKDevice* device = nullptr;
VideoCore::QueryType type = {};
std::vector<UniqueQueryPool> pools;
std::vector<vk::QueryPool> pools;
std::vector<bool> usage;
};
@@ -61,9 +61,9 @@ public:
const VKDevice& device, VKScheduler& scheduler);
~VKQueryCache();
std::pair<vk::QueryPool, std::uint32_t> AllocateQuery(VideoCore::QueryType type);
std::pair<VkQueryPool, u32> AllocateQuery(VideoCore::QueryType type);
void Reserve(VideoCore::QueryType type, std::pair<vk::QueryPool, std::uint32_t> query);
void Reserve(VideoCore::QueryType type, std::pair<VkQueryPool, u32> query);
const VKDevice& Device() const noexcept {
return device;
@@ -91,7 +91,7 @@ private:
VKQueryCache& cache;
const VideoCore::QueryType type;
const std::pair<vk::QueryPool, std::uint32_t> query;
const std::pair<VkQueryPool, u32> query;
const u64 ticks;
};

View File

@@ -19,7 +19,6 @@
#include "core/memory.h"
#include "video_core/engines/kepler_compute.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/fixed_pipeline_state.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/renderer_vulkan.h"
@@ -39,6 +38,7 @@
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#include "video_core/renderer_vulkan/vk_texture_cache.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -60,32 +60,42 @@ namespace {
constexpr auto ComputeShaderIndex = static_cast<std::size_t>(Tegra::Engines::ShaderType::Compute);
vk::Viewport GetViewportState(const VKDevice& device, const Maxwell& regs, std::size_t index) {
const auto& viewport = regs.viewport_transform[index];
const float x = viewport.translate_x - viewport.scale_x;
const float y = viewport.translate_y - viewport.scale_y;
const float width = viewport.scale_x * 2.0f;
const float height = viewport.scale_y * 2.0f;
VkViewport GetViewportState(const VKDevice& device, const Maxwell& regs, std::size_t index) {
const auto& src = regs.viewport_transform[index];
const float width = src.scale_x * 2.0f;
const float height = src.scale_y * 2.0f;
const float reduce_z = regs.depth_mode == Maxwell::DepthMode::MinusOneToOne;
float near = viewport.translate_z - viewport.scale_z * reduce_z;
float far = viewport.translate_z + viewport.scale_z;
VkViewport viewport;
viewport.x = src.translate_x - src.scale_x;
viewport.y = src.translate_y - src.scale_y;
viewport.width = width != 0.0f ? width : 1.0f;
viewport.height = height != 0.0f ? height : 1.0f;
const float reduce_z = regs.depth_mode == Maxwell::DepthMode::MinusOneToOne ? 1.0f : 0.0f;
viewport.minDepth = src.translate_z - src.scale_z * reduce_z;
viewport.maxDepth = src.translate_z + src.scale_z;
if (!device.IsExtDepthRangeUnrestrictedSupported()) {
near = std::clamp(near, 0.0f, 1.0f);
far = std::clamp(far, 0.0f, 1.0f);
viewport.minDepth = std::clamp(viewport.minDepth, 0.0f, 1.0f);
viewport.maxDepth = std::clamp(viewport.maxDepth, 0.0f, 1.0f);
}
return vk::Viewport(x, y, width != 0 ? width : 1.0f, height != 0 ? height : 1.0f, near, far);
return viewport;
}
constexpr vk::Rect2D GetScissorState(const Maxwell& regs, std::size_t index) {
const auto& scissor = regs.scissor_test[index];
if (!scissor.enable) {
return {{0, 0}, {INT32_MAX, INT32_MAX}};
VkRect2D GetScissorState(const Maxwell& regs, std::size_t index) {
const auto& src = regs.scissor_test[index];
VkRect2D scissor;
if (src.enable) {
scissor.offset.x = static_cast<s32>(src.min_x);
scissor.offset.y = static_cast<s32>(src.min_y);
scissor.extent.width = src.max_x - src.min_x;
scissor.extent.height = src.max_y - src.min_y;
} else {
scissor.offset.x = 0;
scissor.offset.y = 0;
scissor.extent.width = std::numeric_limits<s32>::max();
scissor.extent.height = std::numeric_limits<s32>::max();
}
const u32 width = scissor.max_x - scissor.min_x;
const u32 height = scissor.max_y - scissor.min_y;
return {{static_cast<s32>(scissor.min_x), static_cast<s32>(scissor.min_y)}, {width, height}};
return scissor;
}
std::array<GPUVAddr, Maxwell::MaxShaderProgram> GetShaderAddresses(
@@ -97,8 +107,8 @@ std::array<GPUVAddr, Maxwell::MaxShaderProgram> GetShaderAddresses(
return addresses;
}
void TransitionImages(const std::vector<ImageView>& views, vk::PipelineStageFlags pipeline_stage,
vk::AccessFlags access) {
void TransitionImages(const std::vector<ImageView>& views, VkPipelineStageFlags pipeline_stage,
VkAccessFlags access) {
for (auto& [view, layout] : views) {
view->Transition(*layout, pipeline_stage, access);
}
@@ -127,13 +137,13 @@ Tegra::Texture::FullTextureInfo GetTextureInfo(const Engine& engine, const Entry
class BufferBindings final {
public:
void AddVertexBinding(const vk::Buffer* buffer, vk::DeviceSize offset) {
void AddVertexBinding(const VkBuffer* buffer, VkDeviceSize offset) {
vertex.buffer_ptrs[vertex.num_buffers] = buffer;
vertex.offsets[vertex.num_buffers] = offset;
++vertex.num_buffers;
}
void SetIndexBinding(const vk::Buffer* buffer, vk::DeviceSize offset, vk::IndexType type) {
void SetIndexBinding(const VkBuffer* buffer, VkDeviceSize offset, VkIndexType type) {
index.buffer = buffer;
index.offset = offset;
index.type = type;
@@ -217,14 +227,14 @@ private:
// Some of these fields are intentionally left uninitialized to avoid initializing them twice.
struct {
std::size_t num_buffers = 0;
std::array<const vk::Buffer*, Maxwell::NumVertexArrays> buffer_ptrs;
std::array<vk::DeviceSize, Maxwell::NumVertexArrays> offsets;
std::array<const VkBuffer*, Maxwell::NumVertexArrays> buffer_ptrs;
std::array<VkDeviceSize, Maxwell::NumVertexArrays> offsets;
} vertex;
struct {
const vk::Buffer* buffer = nullptr;
vk::DeviceSize offset;
vk::IndexType type;
const VkBuffer* buffer = nullptr;
VkDeviceSize offset;
VkIndexType type;
} index;
template <std::size_t N>
@@ -243,38 +253,35 @@ private:
return;
}
std::array<vk::Buffer, N> buffers;
std::array<VkBuffer, N> buffers;
std::transform(vertex.buffer_ptrs.begin(), vertex.buffer_ptrs.begin() + N, buffers.begin(),
[](const auto ptr) { return *ptr; });
std::array<vk::DeviceSize, N> offsets;
std::array<VkDeviceSize, N> offsets;
std::copy(vertex.offsets.begin(), vertex.offsets.begin() + N, offsets.begin());
if constexpr (is_indexed) {
// Indexed draw
scheduler.Record([buffers, offsets, index_buffer = *index.buffer,
index_offset = index.offset,
index_type = index.type](auto cmdbuf, auto& dld) {
cmdbuf.bindIndexBuffer(index_buffer, index_offset, index_type, dld);
cmdbuf.bindVertexBuffers(0, static_cast<u32>(N), buffers.data(), offsets.data(),
dld);
index_type = index.type](vk::CommandBuffer cmdbuf) {
cmdbuf.BindIndexBuffer(index_buffer, index_offset, index_type);
cmdbuf.BindVertexBuffers(0, static_cast<u32>(N), buffers.data(), offsets.data());
});
} else {
// Array draw
scheduler.Record([buffers, offsets](auto cmdbuf, auto& dld) {
cmdbuf.bindVertexBuffers(0, static_cast<u32>(N), buffers.data(), offsets.data(),
dld);
scheduler.Record([buffers, offsets](vk::CommandBuffer cmdbuf) {
cmdbuf.BindVertexBuffers(0, static_cast<u32>(N), buffers.data(), offsets.data());
});
}
}
};
void RasterizerVulkan::DrawParameters::Draw(vk::CommandBuffer cmdbuf,
const vk::DispatchLoaderDynamic& dld) const {
void RasterizerVulkan::DrawParameters::Draw(vk::CommandBuffer cmdbuf) const {
if (is_indexed) {
cmdbuf.drawIndexed(num_vertices, num_instances, 0, base_vertex, base_instance, dld);
cmdbuf.DrawIndexed(num_vertices, num_instances, 0, base_vertex, base_instance);
} else {
cmdbuf.draw(num_vertices, num_instances, base_vertex, base_instance, dld);
cmdbuf.Draw(num_vertices, num_instances, base_vertex, base_instance);
}
}
@@ -337,7 +344,7 @@ void RasterizerVulkan::Draw(bool is_indexed, bool is_instanced) {
const auto renderpass = pipeline.GetRenderPass();
const auto [framebuffer, render_area] = ConfigureFramebuffers(renderpass);
scheduler.RequestRenderpass({renderpass, framebuffer, {{0, 0}, render_area}, 0, nullptr});
scheduler.RequestRenderpass(renderpass, framebuffer, render_area);
UpdateDynamicStates();
@@ -345,19 +352,19 @@ void RasterizerVulkan::Draw(bool is_indexed, bool is_instanced) {
if (device.IsNvDeviceDiagnosticCheckpoints()) {
scheduler.Record(
[&pipeline](auto cmdbuf, auto& dld) { cmdbuf.setCheckpointNV(&pipeline, dld); });
[&pipeline](vk::CommandBuffer cmdbuf) { cmdbuf.SetCheckpointNV(&pipeline); });
}
BeginTransformFeedback();
const auto pipeline_layout = pipeline.GetLayout();
const auto descriptor_set = pipeline.CommitDescriptorSet();
scheduler.Record([pipeline_layout, descriptor_set, draw_params](auto cmdbuf, auto& dld) {
scheduler.Record([pipeline_layout, descriptor_set, draw_params](vk::CommandBuffer cmdbuf) {
if (descriptor_set) {
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eGraphics, pipeline_layout,
DESCRIPTOR_SET, 1, &descriptor_set, 0, nullptr, dld);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline_layout,
DESCRIPTOR_SET, descriptor_set, {});
}
draw_params.Draw(cmdbuf, dld);
draw_params.Draw(cmdbuf);
});
EndTransformFeedback();
@@ -389,48 +396,54 @@ void RasterizerVulkan::Clear() {
DEBUG_ASSERT(texceptions.none());
SetupImageTransitions(0, color_attachments, zeta_attachment);
const vk::RenderPass renderpass = renderpass_cache.GetRenderPass(GetRenderPassParams(0));
const VkRenderPass renderpass = renderpass_cache.GetRenderPass(GetRenderPassParams(0));
const auto [framebuffer, render_area] = ConfigureFramebuffers(renderpass);
scheduler.RequestRenderpass({renderpass, framebuffer, {{0, 0}, render_area}, 0, nullptr});
scheduler.RequestRenderpass(renderpass, framebuffer, render_area);
const auto& scissor = regs.scissor_test[0];
const vk::Offset2D scissor_offset(scissor.min_x, scissor.min_y);
vk::Extent2D scissor_extent{scissor.max_x - scissor.min_x, scissor.max_y - scissor.min_y};
scissor_extent.width = std::min(scissor_extent.width, render_area.width);
scissor_extent.height = std::min(scissor_extent.height, render_area.height);
const u32 layer = regs.clear_buffers.layer;
const vk::ClearRect clear_rect({scissor_offset, scissor_extent}, layer, 1);
VkClearRect clear_rect;
clear_rect.baseArrayLayer = regs.clear_buffers.layer;
clear_rect.layerCount = 1;
clear_rect.rect = GetScissorState(regs, 0);
clear_rect.rect.extent.width = std::min(clear_rect.rect.extent.width, render_area.width);
clear_rect.rect.extent.height = std::min(clear_rect.rect.extent.height, render_area.height);
if (use_color) {
const std::array clear_color = {regs.clear_color[0], regs.clear_color[1],
regs.clear_color[2], regs.clear_color[3]};
const vk::ClearValue clear_value{clear_color};
VkClearValue clear_value;
std::memcpy(clear_value.color.float32, regs.clear_color, sizeof(regs.clear_color));
const u32 color_attachment = regs.clear_buffers.RT;
scheduler.Record([color_attachment, clear_value, clear_rect](auto cmdbuf, auto& dld) {
const vk::ClearAttachment attachment(vk::ImageAspectFlagBits::eColor, color_attachment,
clear_value);
cmdbuf.clearAttachments(1, &attachment, 1, &clear_rect, dld);
scheduler.Record([color_attachment, clear_value, clear_rect](vk::CommandBuffer cmdbuf) {
VkClearAttachment attachment;
attachment.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
attachment.colorAttachment = color_attachment;
attachment.clearValue = clear_value;
cmdbuf.ClearAttachments(attachment, clear_rect);
});
}
if (!use_depth && !use_stencil) {
return;
}
vk::ImageAspectFlags aspect_flags;
VkImageAspectFlags aspect_flags = 0;
if (use_depth) {
aspect_flags |= vk::ImageAspectFlagBits::eDepth;
aspect_flags |= VK_IMAGE_ASPECT_DEPTH_BIT;
}
if (use_stencil) {
aspect_flags |= vk::ImageAspectFlagBits::eStencil;
aspect_flags |= VK_IMAGE_ASPECT_STENCIL_BIT;
}
scheduler.Record([clear_depth = regs.clear_depth, clear_stencil = regs.clear_stencil,
clear_rect, aspect_flags](auto cmdbuf, auto& dld) {
const vk::ClearDepthStencilValue clear_zeta(clear_depth, clear_stencil);
const vk::ClearValue clear_value{clear_zeta};
const vk::ClearAttachment attachment(aspect_flags, 0, clear_value);
cmdbuf.clearAttachments(1, &attachment, 1, &clear_rect, dld);
clear_rect, aspect_flags](vk::CommandBuffer cmdbuf) {
VkClearValue clear_value;
clear_value.depthStencil.depth = clear_depth;
clear_value.depthStencil.stencil = clear_stencil;
VkClearAttachment attachment;
attachment.aspectMask = aspect_flags;
attachment.colorAttachment = 0;
attachment.clearValue.depthStencil.depth = clear_depth;
attachment.clearValue.depthStencil.stencil = clear_stencil;
cmdbuf.ClearAttachments(attachment, clear_rect);
});
}
@@ -463,24 +476,24 @@ void RasterizerVulkan::DispatchCompute(GPUVAddr code_addr) {
buffer_cache.Unmap();
TransitionImages(sampled_views, vk::PipelineStageFlagBits::eComputeShader,
vk::AccessFlagBits::eShaderRead);
TransitionImages(image_views, vk::PipelineStageFlagBits::eComputeShader,
vk::AccessFlagBits::eShaderRead | vk::AccessFlagBits::eShaderWrite);
TransitionImages(sampled_views, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_ACCESS_SHADER_READ_BIT);
TransitionImages(image_views, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT);
if (device.IsNvDeviceDiagnosticCheckpoints()) {
scheduler.Record(
[&pipeline](auto cmdbuf, auto& dld) { cmdbuf.setCheckpointNV(nullptr, dld); });
[&pipeline](vk::CommandBuffer cmdbuf) { cmdbuf.SetCheckpointNV(nullptr); });
}
scheduler.Record([grid_x = launch_desc.grid_dim_x, grid_y = launch_desc.grid_dim_y,
grid_z = launch_desc.grid_dim_z, pipeline_handle = pipeline.GetHandle(),
layout = pipeline.GetLayout(),
descriptor_set = pipeline.CommitDescriptorSet()](auto cmdbuf, auto& dld) {
cmdbuf.bindPipeline(vk::PipelineBindPoint::eCompute, pipeline_handle, dld);
cmdbuf.bindDescriptorSets(vk::PipelineBindPoint::eCompute, layout, DESCRIPTOR_SET, 1,
&descriptor_set, 0, nullptr, dld);
cmdbuf.dispatch(grid_x, grid_y, grid_z, dld);
descriptor_set = pipeline.CommitDescriptorSet()](vk::CommandBuffer cmdbuf) {
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_COMPUTE, pipeline_handle);
cmdbuf.BindDescriptorSets(VK_PIPELINE_BIND_POINT_COMPUTE, layout, DESCRIPTOR_SET,
descriptor_set, {});
cmdbuf.Dispatch(grid_x, grid_y, grid_z);
});
}
@@ -599,7 +612,7 @@ RasterizerVulkan::Texceptions RasterizerVulkan::UpdateAttachments() {
Texceptions texceptions;
for (std::size_t rt = 0; rt < Maxwell::NumRenderTargets; ++rt) {
if (update_rendertargets) {
color_attachments[rt] = texture_cache.GetColorBufferSurface(rt, true);
color_attachments[rt] = texture_cache.GetColorBufferSurface(rt);
}
if (color_attachments[rt] && WalkAttachmentOverlaps(*color_attachments[rt])) {
texceptions[rt] = true;
@@ -607,7 +620,7 @@ RasterizerVulkan::Texceptions RasterizerVulkan::UpdateAttachments() {
}
if (update_rendertargets) {
zeta_attachment = texture_cache.GetDepthBufferSurface(true);
zeta_attachment = texture_cache.GetDepthBufferSurface();
}
if (zeta_attachment && WalkAttachmentOverlaps(*zeta_attachment)) {
texceptions[ZETA_TEXCEPTION_INDEX] = true;
@@ -625,13 +638,13 @@ bool RasterizerVulkan::WalkAttachmentOverlaps(const CachedSurfaceView& attachmen
continue;
}
overlap = true;
*layout = vk::ImageLayout::eGeneral;
*layout = VK_IMAGE_LAYOUT_GENERAL;
}
return overlap;
}
std::tuple<vk::Framebuffer, vk::Extent2D> RasterizerVulkan::ConfigureFramebuffers(
vk::RenderPass renderpass) {
std::tuple<VkFramebuffer, VkExtent2D> RasterizerVulkan::ConfigureFramebuffers(
VkRenderPass renderpass) {
FramebufferCacheKey key{renderpass, std::numeric_limits<u32>::max(),
std::numeric_limits<u32>::max(), std::numeric_limits<u32>::max()};
@@ -658,15 +671,20 @@ std::tuple<vk::Framebuffer, vk::Extent2D> RasterizerVulkan::ConfigureFramebuffer
const auto [fbentry, is_cache_miss] = framebuffer_cache.try_emplace(key);
auto& framebuffer = fbentry->second;
if (is_cache_miss) {
const vk::FramebufferCreateInfo framebuffer_ci(
{}, key.renderpass, static_cast<u32>(key.views.size()), key.views.data(), key.width,
key.height, key.layers);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
framebuffer = dev.createFramebufferUnique(framebuffer_ci, nullptr, dld);
VkFramebufferCreateInfo framebuffer_ci;
framebuffer_ci.sType = VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO;
framebuffer_ci.pNext = nullptr;
framebuffer_ci.flags = 0;
framebuffer_ci.renderPass = key.renderpass;
framebuffer_ci.attachmentCount = static_cast<u32>(key.views.size());
framebuffer_ci.pAttachments = key.views.data();
framebuffer_ci.width = key.width;
framebuffer_ci.height = key.height;
framebuffer_ci.layers = key.layers;
framebuffer = device.GetLogical().CreateFramebuffer(framebuffer_ci);
}
return {*framebuffer, vk::Extent2D{key.width, key.height}};
return {*framebuffer, VkExtent2D{key.width, key.height}};
}
RasterizerVulkan::DrawParameters RasterizerVulkan::SetupGeometry(FixedPipelineState& fixed_state,
@@ -714,10 +732,9 @@ void RasterizerVulkan::SetupShaderDescriptors(
void RasterizerVulkan::SetupImageTransitions(
Texceptions texceptions, const std::array<View, Maxwell::NumRenderTargets>& color_attachments,
const View& zeta_attachment) {
TransitionImages(sampled_views, vk::PipelineStageFlagBits::eAllGraphics,
vk::AccessFlagBits::eShaderRead);
TransitionImages(image_views, vk::PipelineStageFlagBits::eAllGraphics,
vk::AccessFlagBits::eShaderRead | vk::AccessFlagBits::eShaderWrite);
TransitionImages(sampled_views, VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT, VK_ACCESS_SHADER_READ_BIT);
TransitionImages(image_views, VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT);
for (std::size_t rt = 0; rt < std::size(color_attachments); ++rt) {
const auto color_attachment = color_attachments[rt];
@@ -725,19 +742,19 @@ void RasterizerVulkan::SetupImageTransitions(
continue;
}
const auto image_layout =
texceptions[rt] ? vk::ImageLayout::eGeneral : vk::ImageLayout::eColorAttachmentOptimal;
color_attachment->Transition(
image_layout, vk::PipelineStageFlagBits::eColorAttachmentOutput,
vk::AccessFlagBits::eColorAttachmentRead | vk::AccessFlagBits::eColorAttachmentWrite);
texceptions[rt] ? VK_IMAGE_LAYOUT_GENERAL : VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
color_attachment->Transition(image_layout, VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT);
}
if (zeta_attachment != nullptr) {
const auto image_layout = texceptions[ZETA_TEXCEPTION_INDEX]
? vk::ImageLayout::eGeneral
: vk::ImageLayout::eDepthStencilAttachmentOptimal;
zeta_attachment->Transition(image_layout, vk::PipelineStageFlagBits::eLateFragmentTests,
vk::AccessFlagBits::eDepthStencilAttachmentRead |
vk::AccessFlagBits::eDepthStencilAttachmentWrite);
? VK_IMAGE_LAYOUT_GENERAL
: VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
zeta_attachment->Transition(image_layout, VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT);
}
}
@@ -773,9 +790,9 @@ void RasterizerVulkan::BeginTransformFeedback() {
const std::size_t size = binding.buffer_size;
const auto [buffer, offset] = buffer_cache.UploadMemory(gpu_addr, size, 4, true);
scheduler.Record([buffer = *buffer, offset = offset, size](auto cmdbuf, auto& dld) {
cmdbuf.bindTransformFeedbackBuffersEXT(0, {buffer}, {offset}, {size}, dld);
cmdbuf.beginTransformFeedbackEXT(0, {}, {}, dld);
scheduler.Record([buffer = *buffer, offset = offset, size](vk::CommandBuffer cmdbuf) {
cmdbuf.BindTransformFeedbackBuffersEXT(0, 1, &buffer, &offset, &size);
cmdbuf.BeginTransformFeedbackEXT(0, 0, nullptr, nullptr);
});
}
@@ -786,7 +803,7 @@ void RasterizerVulkan::EndTransformFeedback() {
}
scheduler.Record(
[](auto cmdbuf, auto& dld) { cmdbuf.endTransformFeedbackEXT(0, {}, {}, dld); });
[](vk::CommandBuffer cmdbuf) { cmdbuf.EndTransformFeedbackEXT(0, 0, nullptr, nullptr); });
}
void RasterizerVulkan::SetupVertexArrays(FixedPipelineState::VertexInput& vertex_input,
@@ -837,7 +854,7 @@ void RasterizerVulkan::SetupIndexBuffer(BufferBindings& buffer_bindings, DrawPar
} else {
const auto [buffer, offset] =
quad_array_pass.Assemble(params.num_vertices, params.base_vertex);
buffer_bindings.SetIndexBinding(&buffer, offset, vk::IndexType::eUint32);
buffer_bindings.SetIndexBinding(buffer, offset, VK_INDEX_TYPE_UINT32);
params.base_vertex = 0;
params.num_vertices = params.num_vertices * 6 / 4;
params.is_indexed = true;
@@ -1022,7 +1039,7 @@ void RasterizerVulkan::SetupTexture(const Tegra::Texture::FullTextureInfo& textu
update_descriptor_queue.AddSampledImage(sampler, image_view);
const auto image_layout = update_descriptor_queue.GetLastImageLayout();
*image_layout = vk::ImageLayout::eShaderReadOnlyOptimal;
*image_layout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
sampled_views.push_back(ImageView{std::move(view), image_layout});
}
@@ -1039,7 +1056,7 @@ void RasterizerVulkan::SetupImage(const Tegra::Texture::TICEntry& tic, const Ima
update_descriptor_queue.AddImage(image_view);
const auto image_layout = update_descriptor_queue.GetLastImageLayout();
*image_layout = vk::ImageLayout::eGeneral;
*image_layout = VK_IMAGE_LAYOUT_GENERAL;
image_views.push_back(ImageView{std::move(view), image_layout});
}
@@ -1056,9 +1073,7 @@ void RasterizerVulkan::UpdateViewportsState(Tegra::Engines::Maxwell3D::Regs& reg
GetViewportState(device, regs, 10), GetViewportState(device, regs, 11),
GetViewportState(device, regs, 12), GetViewportState(device, regs, 13),
GetViewportState(device, regs, 14), GetViewportState(device, regs, 15)};
scheduler.Record([viewports](auto cmdbuf, auto& dld) {
cmdbuf.setViewport(0, static_cast<u32>(viewports.size()), viewports.data(), dld);
});
scheduler.Record([viewports](vk::CommandBuffer cmdbuf) { cmdbuf.SetViewport(0, viewports); });
}
void RasterizerVulkan::UpdateScissorsState(Tegra::Engines::Maxwell3D::Regs& regs) {
@@ -1072,9 +1087,7 @@ void RasterizerVulkan::UpdateScissorsState(Tegra::Engines::Maxwell3D::Regs& regs
GetScissorState(regs, 9), GetScissorState(regs, 10), GetScissorState(regs, 11),
GetScissorState(regs, 12), GetScissorState(regs, 13), GetScissorState(regs, 14),
GetScissorState(regs, 15)};
scheduler.Record([scissors](auto cmdbuf, auto& dld) {
cmdbuf.setScissor(0, static_cast<u32>(scissors.size()), scissors.data(), dld);
});
scheduler.Record([scissors](vk::CommandBuffer cmdbuf) { cmdbuf.SetScissor(0, scissors); });
}
void RasterizerVulkan::UpdateDepthBias(Tegra::Engines::Maxwell3D::Regs& regs) {
@@ -1082,8 +1095,8 @@ void RasterizerVulkan::UpdateDepthBias(Tegra::Engines::Maxwell3D::Regs& regs) {
return;
}
scheduler.Record([constant = regs.polygon_offset_units, clamp = regs.polygon_offset_clamp,
factor = regs.polygon_offset_factor](auto cmdbuf, auto& dld) {
cmdbuf.setDepthBias(constant, clamp, factor / 2.0f, dld);
factor = regs.polygon_offset_factor](vk::CommandBuffer cmdbuf) {
cmdbuf.SetDepthBias(constant, clamp, factor / 2.0f);
});
}
@@ -1093,9 +1106,8 @@ void RasterizerVulkan::UpdateBlendConstants(Tegra::Engines::Maxwell3D::Regs& reg
}
const std::array blend_color = {regs.blend_color.r, regs.blend_color.g, regs.blend_color.b,
regs.blend_color.a};
scheduler.Record([blend_color](auto cmdbuf, auto& dld) {
cmdbuf.setBlendConstants(blend_color.data(), dld);
});
scheduler.Record(
[blend_color](vk::CommandBuffer cmdbuf) { cmdbuf.SetBlendConstants(blend_color.data()); });
}
void RasterizerVulkan::UpdateDepthBounds(Tegra::Engines::Maxwell3D::Regs& regs) {
@@ -1103,7 +1115,7 @@ void RasterizerVulkan::UpdateDepthBounds(Tegra::Engines::Maxwell3D::Regs& regs)
return;
}
scheduler.Record([min = regs.depth_bounds[0], max = regs.depth_bounds[1]](
auto cmdbuf, auto& dld) { cmdbuf.setDepthBounds(min, max, dld); });
vk::CommandBuffer cmdbuf) { cmdbuf.SetDepthBounds(min, max); });
}
void RasterizerVulkan::UpdateStencilFaces(Tegra::Engines::Maxwell3D::Regs& regs) {
@@ -1116,24 +1128,24 @@ void RasterizerVulkan::UpdateStencilFaces(Tegra::Engines::Maxwell3D::Regs& regs)
[front_ref = regs.stencil_front_func_ref, front_write_mask = regs.stencil_front_mask,
front_test_mask = regs.stencil_front_func_mask, back_ref = regs.stencil_back_func_ref,
back_write_mask = regs.stencil_back_mask,
back_test_mask = regs.stencil_back_func_mask](auto cmdbuf, auto& dld) {
back_test_mask = regs.stencil_back_func_mask](vk::CommandBuffer cmdbuf) {
// Front face
cmdbuf.setStencilReference(vk::StencilFaceFlagBits::eFront, front_ref, dld);
cmdbuf.setStencilWriteMask(vk::StencilFaceFlagBits::eFront, front_write_mask, dld);
cmdbuf.setStencilCompareMask(vk::StencilFaceFlagBits::eFront, front_test_mask, dld);
cmdbuf.SetStencilReference(VK_STENCIL_FACE_FRONT_BIT, front_ref);
cmdbuf.SetStencilWriteMask(VK_STENCIL_FACE_FRONT_BIT, front_write_mask);
cmdbuf.SetStencilCompareMask(VK_STENCIL_FACE_FRONT_BIT, front_test_mask);
// Back face
cmdbuf.setStencilReference(vk::StencilFaceFlagBits::eBack, back_ref, dld);
cmdbuf.setStencilWriteMask(vk::StencilFaceFlagBits::eBack, back_write_mask, dld);
cmdbuf.setStencilCompareMask(vk::StencilFaceFlagBits::eBack, back_test_mask, dld);
cmdbuf.SetStencilReference(VK_STENCIL_FACE_BACK_BIT, back_ref);
cmdbuf.SetStencilWriteMask(VK_STENCIL_FACE_BACK_BIT, back_write_mask);
cmdbuf.SetStencilCompareMask(VK_STENCIL_FACE_BACK_BIT, back_test_mask);
});
} else {
// Front face defines both faces
scheduler.Record([ref = regs.stencil_back_func_ref, write_mask = regs.stencil_back_mask,
test_mask = regs.stencil_back_func_mask](auto cmdbuf, auto& dld) {
cmdbuf.setStencilReference(vk::StencilFaceFlagBits::eFrontAndBack, ref, dld);
cmdbuf.setStencilWriteMask(vk::StencilFaceFlagBits::eFrontAndBack, write_mask, dld);
cmdbuf.setStencilCompareMask(vk::StencilFaceFlagBits::eFrontAndBack, test_mask, dld);
test_mask = regs.stencil_back_func_mask](vk::CommandBuffer cmdbuf) {
cmdbuf.SetStencilReference(VK_STENCIL_FACE_FRONT_AND_BACK, ref);
cmdbuf.SetStencilWriteMask(VK_STENCIL_FACE_FRONT_AND_BACK, write_mask);
cmdbuf.SetStencilCompareMask(VK_STENCIL_FACE_FRONT_AND_BACK, test_mask);
});
}
}

View File

@@ -17,7 +17,6 @@
#include "video_core/memory_manager.h"
#include "video_core/rasterizer_accelerated.h"
#include "video_core/rasterizer_interface.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/fixed_pipeline_state.h"
#include "video_core/renderer_vulkan/vk_buffer_cache.h"
#include "video_core/renderer_vulkan/vk_compute_pass.h"
@@ -32,6 +31,7 @@
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_texture_cache.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Core {
class System;
@@ -49,11 +49,10 @@ namespace Vulkan {
struct VKScreenInfo;
using ImageViewsPack =
boost::container::static_vector<vk::ImageView, Maxwell::NumRenderTargets + 1>;
using ImageViewsPack = boost::container::static_vector<VkImageView, Maxwell::NumRenderTargets + 1>;
struct FramebufferCacheKey {
vk::RenderPass renderpass{};
VkRenderPass renderpass{};
u32 width = 0;
u32 height = 0;
u32 layers = 0;
@@ -101,7 +100,7 @@ class BufferBindings;
struct ImageView {
View view;
vk::ImageLayout* layout = nullptr;
VkImageLayout* layout = nullptr;
};
class RasterizerVulkan final : public VideoCore::RasterizerAccelerated {
@@ -137,7 +136,7 @@ public:
private:
struct DrawParameters {
void Draw(vk::CommandBuffer cmdbuf, const vk::DispatchLoaderDynamic& dld) const;
void Draw(vk::CommandBuffer cmdbuf) const;
u32 base_instance = 0;
u32 num_instances = 0;
@@ -154,7 +153,7 @@ private:
Texceptions UpdateAttachments();
std::tuple<vk::Framebuffer, vk::Extent2D> ConfigureFramebuffers(vk::RenderPass renderpass);
std::tuple<VkFramebuffer, VkExtent2D> ConfigureFramebuffers(VkRenderPass renderpass);
/// Setups geometry buffers and state.
DrawParameters SetupGeometry(FixedPipelineState& fixed_state, BufferBindings& buffer_bindings,
@@ -272,7 +271,7 @@ private:
u32 draw_counter = 0;
// TODO(Rodrigo): Invalidate on image destruction
std::unordered_map<FramebufferCacheKey, UniqueFramebuffer> framebuffer_cache;
std::unordered_map<FramebufferCacheKey, vk::Framebuffer> framebuffer_cache;
};
} // namespace Vulkan

View File

@@ -6,10 +6,10 @@
#include <vector>
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_renderpass_cache.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -17,7 +17,7 @@ VKRenderPassCache::VKRenderPassCache(const VKDevice& device) : device{device} {}
VKRenderPassCache::~VKRenderPassCache() = default;
vk::RenderPass VKRenderPassCache::GetRenderPass(const RenderPassParams& params) {
VkRenderPass VKRenderPassCache::GetRenderPass(const RenderPassParams& params) {
const auto [pair, is_cache_miss] = cache.try_emplace(params);
auto& entry = pair->second;
if (is_cache_miss) {
@@ -26,9 +26,9 @@ vk::RenderPass VKRenderPassCache::GetRenderPass(const RenderPassParams& params)
return *entry;
}
UniqueRenderPass VKRenderPassCache::CreateRenderPass(const RenderPassParams& params) const {
std::vector<vk::AttachmentDescription> descriptors;
std::vector<vk::AttachmentReference> color_references;
vk::RenderPass VKRenderPassCache::CreateRenderPass(const RenderPassParams& params) const {
std::vector<VkAttachmentDescription> descriptors;
std::vector<VkAttachmentReference> color_references;
for (std::size_t rt = 0; rt < params.color_attachments.size(); ++rt) {
const auto attachment = params.color_attachments[rt];
@@ -39,16 +39,25 @@ UniqueRenderPass VKRenderPassCache::CreateRenderPass(const RenderPassParams& par
// TODO(Rodrigo): Add eMayAlias when it's needed.
const auto color_layout = attachment.is_texception
? vk::ImageLayout::eGeneral
: vk::ImageLayout::eColorAttachmentOptimal;
descriptors.emplace_back(vk::AttachmentDescriptionFlagBits::eMayAlias, format.format,
vk::SampleCountFlagBits::e1, vk::AttachmentLoadOp::eLoad,
vk::AttachmentStoreOp::eStore, vk::AttachmentLoadOp::eDontCare,
vk::AttachmentStoreOp::eDontCare, color_layout, color_layout);
color_references.emplace_back(static_cast<u32>(rt), color_layout);
? VK_IMAGE_LAYOUT_GENERAL
: VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;
VkAttachmentDescription& descriptor = descriptors.emplace_back();
descriptor.flags = VK_ATTACHMENT_DESCRIPTION_MAY_ALIAS_BIT;
descriptor.format = format.format;
descriptor.samples = VK_SAMPLE_COUNT_1_BIT;
descriptor.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
descriptor.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
descriptor.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
descriptor.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
descriptor.initialLayout = color_layout;
descriptor.finalLayout = color_layout;
VkAttachmentReference& reference = color_references.emplace_back();
reference.attachment = static_cast<u32>(rt);
reference.layout = color_layout;
}
vk::AttachmentReference zeta_attachment_ref;
VkAttachmentReference zeta_attachment_ref;
if (params.has_zeta) {
const auto format =
MaxwellToVK::SurfaceFormat(device, FormatType::Optimal, params.zeta_pixel_format);
@@ -56,45 +65,68 @@ UniqueRenderPass VKRenderPassCache::CreateRenderPass(const RenderPassParams& par
static_cast<u32>(params.zeta_pixel_format));
const auto zeta_layout = params.zeta_texception
? vk::ImageLayout::eGeneral
: vk::ImageLayout::eDepthStencilAttachmentOptimal;
descriptors.emplace_back(vk::AttachmentDescriptionFlags{}, format.format,
vk::SampleCountFlagBits::e1, vk::AttachmentLoadOp::eLoad,
vk::AttachmentStoreOp::eStore, vk::AttachmentLoadOp::eLoad,
vk::AttachmentStoreOp::eStore, zeta_layout, zeta_layout);
zeta_attachment_ref =
vk::AttachmentReference(static_cast<u32>(params.color_attachments.size()), zeta_layout);
? VK_IMAGE_LAYOUT_GENERAL
: VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL;
VkAttachmentDescription& descriptor = descriptors.emplace_back();
descriptor.flags = 0;
descriptor.format = format.format;
descriptor.samples = VK_SAMPLE_COUNT_1_BIT;
descriptor.loadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
descriptor.storeOp = VK_ATTACHMENT_STORE_OP_STORE;
descriptor.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_LOAD;
descriptor.stencilStoreOp = VK_ATTACHMENT_STORE_OP_STORE;
descriptor.initialLayout = zeta_layout;
descriptor.finalLayout = zeta_layout;
zeta_attachment_ref.attachment = static_cast<u32>(params.color_attachments.size());
zeta_attachment_ref.layout = zeta_layout;
}
const vk::SubpassDescription subpass_description(
{}, vk::PipelineBindPoint::eGraphics, 0, nullptr, static_cast<u32>(color_references.size()),
color_references.data(), nullptr, params.has_zeta ? &zeta_attachment_ref : nullptr, 0,
nullptr);
VkSubpassDescription subpass_description;
subpass_description.flags = 0;
subpass_description.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
subpass_description.inputAttachmentCount = 0;
subpass_description.pInputAttachments = nullptr;
subpass_description.colorAttachmentCount = static_cast<u32>(color_references.size());
subpass_description.pColorAttachments = color_references.data();
subpass_description.pResolveAttachments = nullptr;
subpass_description.pDepthStencilAttachment = params.has_zeta ? &zeta_attachment_ref : nullptr;
subpass_description.preserveAttachmentCount = 0;
subpass_description.pPreserveAttachments = nullptr;
vk::AccessFlags access;
vk::PipelineStageFlags stage;
VkAccessFlags access = 0;
VkPipelineStageFlags stage = 0;
if (!color_references.empty()) {
access |=
vk::AccessFlagBits::eColorAttachmentRead | vk::AccessFlagBits::eColorAttachmentWrite;
stage |= vk::PipelineStageFlagBits::eColorAttachmentOutput;
access |= VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
stage |= VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
}
if (params.has_zeta) {
access |= vk::AccessFlagBits::eDepthStencilAttachmentRead |
vk::AccessFlagBits::eDepthStencilAttachmentWrite;
stage |= vk::PipelineStageFlagBits::eLateFragmentTests;
access |= VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT |
VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
stage |= VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT;
}
const vk::SubpassDependency subpass_dependency(VK_SUBPASS_EXTERNAL, 0, stage, stage, {}, access,
{});
VkSubpassDependency subpass_dependency;
subpass_dependency.srcSubpass = VK_SUBPASS_EXTERNAL;
subpass_dependency.dstSubpass = 0;
subpass_dependency.srcStageMask = stage;
subpass_dependency.dstStageMask = stage;
subpass_dependency.srcAccessMask = 0;
subpass_dependency.dstAccessMask = access;
subpass_dependency.dependencyFlags = 0;
const vk::RenderPassCreateInfo create_info({}, static_cast<u32>(descriptors.size()),
descriptors.data(), 1, &subpass_description, 1,
&subpass_dependency);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createRenderPassUnique(create_info, nullptr, dld);
VkRenderPassCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.attachmentCount = static_cast<u32>(descriptors.size());
ci.pAttachments = descriptors.data();
ci.subpassCount = 1;
ci.pSubpasses = &subpass_description;
ci.dependencyCount = 1;
ci.pDependencies = &subpass_dependency;
return device.GetLogical().CreateRenderPass(ci);
}
} // namespace Vulkan

View File

@@ -12,7 +12,7 @@
#include <boost/functional/hash.hpp>
#include "video_core/engines/maxwell_3d.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
namespace Vulkan {
@@ -85,13 +85,13 @@ public:
explicit VKRenderPassCache(const VKDevice& device);
~VKRenderPassCache();
vk::RenderPass GetRenderPass(const RenderPassParams& params);
VkRenderPass GetRenderPass(const RenderPassParams& params);
private:
UniqueRenderPass CreateRenderPass(const RenderPassParams& params) const;
vk::RenderPass CreateRenderPass(const RenderPassParams& params) const;
const VKDevice& device;
std::unordered_map<RenderPassParams, UniqueRenderPass> cache;
std::unordered_map<RenderPassParams, vk::RenderPass> cache;
};
} // namespace Vulkan

View File

@@ -6,83 +6,83 @@
#include <optional>
#include "common/assert.h"
#include "common/logging/log.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
namespace {
// TODO(Rodrigo): Fine tune these numbers.
constexpr std::size_t COMMAND_BUFFER_POOL_SIZE = 0x1000;
constexpr std::size_t FENCES_GROW_STEP = 0x40;
VkFenceCreateInfo BuildFenceCreateInfo() {
VkFenceCreateInfo fence_ci;
fence_ci.sType = VK_STRUCTURE_TYPE_FENCE_CREATE_INFO;
fence_ci.pNext = nullptr;
fence_ci.flags = 0;
return fence_ci;
}
} // Anonymous namespace
class CommandBufferPool final : public VKFencedPool {
public:
CommandBufferPool(const VKDevice& device)
: VKFencedPool(COMMAND_BUFFER_POOL_SIZE), device{device} {}
void Allocate(std::size_t begin, std::size_t end) override {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const u32 graphics_family = device.GetGraphicsFamily();
auto pool = std::make_unique<Pool>();
// Command buffers are going to be commited, recorded, executed every single usage cycle.
// They are also going to be reseted when commited.
const auto pool_flags = vk::CommandPoolCreateFlagBits::eTransient |
vk::CommandPoolCreateFlagBits::eResetCommandBuffer;
const vk::CommandPoolCreateInfo cmdbuf_pool_ci(pool_flags, graphics_family);
pool->handle = dev.createCommandPoolUnique(cmdbuf_pool_ci, nullptr, dld);
VkCommandPoolCreateInfo command_pool_ci;
command_pool_ci.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO;
command_pool_ci.pNext = nullptr;
command_pool_ci.flags =
VK_COMMAND_POOL_CREATE_TRANSIENT_BIT | VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT;
command_pool_ci.queueFamilyIndex = device.GetGraphicsFamily();
const vk::CommandBufferAllocateInfo cmdbuf_ai(*pool->handle,
vk::CommandBufferLevel::ePrimary,
static_cast<u32>(COMMAND_BUFFER_POOL_SIZE));
pool->cmdbufs =
dev.allocateCommandBuffersUnique<std::allocator<UniqueCommandBuffer>>(cmdbuf_ai, dld);
pools.push_back(std::move(pool));
Pool& pool = pools.emplace_back();
pool.handle = device.GetLogical().CreateCommandPool(command_pool_ci);
pool.cmdbufs = pool.handle.Allocate(COMMAND_BUFFER_POOL_SIZE);
}
vk::CommandBuffer Commit(VKFence& fence) {
VkCommandBuffer Commit(VKFence& fence) {
const std::size_t index = CommitResource(fence);
const auto pool_index = index / COMMAND_BUFFER_POOL_SIZE;
const auto sub_index = index % COMMAND_BUFFER_POOL_SIZE;
return *pools[pool_index]->cmdbufs[sub_index];
return pools[pool_index].cmdbufs[sub_index];
}
private:
struct Pool {
UniqueCommandPool handle;
std::vector<UniqueCommandBuffer> cmdbufs;
vk::CommandPool handle;
vk::CommandBuffers cmdbufs;
};
const VKDevice& device;
std::vector<std::unique_ptr<Pool>> pools;
std::vector<Pool> pools;
};
VKResource::VKResource() = default;
VKResource::~VKResource() = default;
VKFence::VKFence(const VKDevice& device, UniqueFence handle)
: device{device}, handle{std::move(handle)} {}
VKFence::VKFence(const VKDevice& device)
: device{device}, handle{device.GetLogical().CreateFence(BuildFenceCreateInfo())} {}
VKFence::~VKFence() = default;
void VKFence::Wait() {
static constexpr u64 timeout = std::numeric_limits<u64>::max();
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
switch (const auto result = dev.waitForFences(1, &*handle, true, timeout, dld)) {
case vk::Result::eSuccess:
switch (const VkResult result = handle.Wait()) {
case VK_SUCCESS:
return;
case vk::Result::eErrorDeviceLost:
case VK_ERROR_DEVICE_LOST:
device.ReportLoss();
[[fallthrough]];
default:
vk::throwResultException(result, "vk::waitForFences");
throw vk::Exception(result);
}
}
@@ -107,13 +107,11 @@ bool VKFence::Tick(bool gpu_wait, bool owner_wait) {
return false;
}
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
if (gpu_wait) {
// Wait for the fence if it has been requested.
dev.waitForFences({*handle}, true, std::numeric_limits<u64>::max(), dld);
(void)handle.Wait();
} else {
if (dev.getFenceStatus(*handle, dld) != vk::Result::eSuccess) {
if (handle.GetStatus() != VK_SUCCESS) {
// Vulkan fence is not ready, not much it can do here
return false;
}
@@ -126,7 +124,7 @@ bool VKFence::Tick(bool gpu_wait, bool owner_wait) {
protected_resources.clear();
// Prepare fence for reusage.
dev.resetFences({*handle}, dld);
handle.Reset();
is_used = false;
return true;
}
@@ -299,21 +297,16 @@ VKFence& VKResourceManager::CommitFence() {
return *found_fence;
}
vk::CommandBuffer VKResourceManager::CommitCommandBuffer(VKFence& fence) {
VkCommandBuffer VKResourceManager::CommitCommandBuffer(VKFence& fence) {
return command_buffer_pool->Commit(fence);
}
void VKResourceManager::GrowFences(std::size_t new_fences_count) {
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const vk::FenceCreateInfo fence_ci;
const std::size_t previous_size = fences.size();
fences.resize(previous_size + new_fences_count);
std::generate(fences.begin() + previous_size, fences.end(), [&]() {
return std::make_unique<VKFence>(device, dev.createFenceUnique(fence_ci, nullptr, dld));
});
std::generate(fences.begin() + previous_size, fences.end(),
[this] { return std::make_unique<VKFence>(device); });
}
} // namespace Vulkan

View File

@@ -7,7 +7,7 @@
#include <cstddef>
#include <memory>
#include <vector>
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -42,7 +42,7 @@ class VKFence {
friend class VKResourceManager;
public:
explicit VKFence(const VKDevice& device, UniqueFence handle);
explicit VKFence(const VKDevice& device);
~VKFence();
/**
@@ -69,7 +69,7 @@ public:
void RedirectProtection(VKResource* old_resource, VKResource* new_resource) noexcept;
/// Retreives the fence.
operator vk::Fence() const {
operator VkFence() const {
return *handle;
}
@@ -87,7 +87,7 @@ private:
bool Tick(bool gpu_wait, bool owner_wait);
const VKDevice& device; ///< Device handler
UniqueFence handle; ///< Vulkan fence
vk::Fence handle; ///< Vulkan fence
std::vector<VKResource*> protected_resources; ///< List of resources protected by this fence
bool is_owned = false; ///< The fence has been commited but not released yet.
bool is_used = false; ///< The fence has been commited but it has not been checked to be free.
@@ -181,7 +181,7 @@ public:
VKFence& CommitFence();
/// Commits an unused command buffer and protects it with a fence.
vk::CommandBuffer CommitCommandBuffer(VKFence& fence);
VkCommandBuffer CommitCommandBuffer(VKFence& fence);
private:
/// Allocates new fences.

View File

@@ -7,64 +7,64 @@
#include <unordered_map>
#include "common/assert.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_sampler_cache.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/textures/texture.h"
namespace Vulkan {
static std::optional<vk::BorderColor> TryConvertBorderColor(std::array<float, 4> color) {
namespace {
VkBorderColor ConvertBorderColor(std::array<float, 4> color) {
// TODO(Rodrigo): Manage integer border colors
if (color == std::array<float, 4>{0, 0, 0, 0}) {
return vk::BorderColor::eFloatTransparentBlack;
return VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK;
} else if (color == std::array<float, 4>{0, 0, 0, 1}) {
return vk::BorderColor::eFloatOpaqueBlack;
return VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK;
} else if (color == std::array<float, 4>{1, 1, 1, 1}) {
return vk::BorderColor::eFloatOpaqueWhite;
return VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE;
}
if (color[0] + color[1] + color[2] > 1.35f) {
// If color elements are brighter than roughly 0.5 average, use white border
return VK_BORDER_COLOR_FLOAT_OPAQUE_WHITE;
} else if (color[3] > 0.5f) {
return VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK;
} else {
if (color[0] + color[1] + color[2] > 1.35f) {
// If color elements are brighter than roughly 0.5 average, use white border
return vk::BorderColor::eFloatOpaqueWhite;
}
if (color[3] > 0.5f) {
return vk::BorderColor::eFloatOpaqueBlack;
}
return vk::BorderColor::eFloatTransparentBlack;
return VK_BORDER_COLOR_FLOAT_TRANSPARENT_BLACK;
}
}
} // Anonymous namespace
VKSamplerCache::VKSamplerCache(const VKDevice& device) : device{device} {}
VKSamplerCache::~VKSamplerCache() = default;
UniqueSampler VKSamplerCache::CreateSampler(const Tegra::Texture::TSCEntry& tsc) const {
const float max_anisotropy{tsc.GetMaxAnisotropy()};
const bool has_anisotropy{max_anisotropy > 1.0f};
const auto border_color{tsc.GetBorderColor()};
const auto vk_border_color{TryConvertBorderColor(border_color)};
constexpr bool unnormalized_coords{false};
const vk::SamplerCreateInfo sampler_ci(
{}, MaxwellToVK::Sampler::Filter(tsc.mag_filter),
MaxwellToVK::Sampler::Filter(tsc.min_filter),
MaxwellToVK::Sampler::MipmapMode(tsc.mipmap_filter),
MaxwellToVK::Sampler::WrapMode(device, tsc.wrap_u, tsc.mag_filter),
MaxwellToVK::Sampler::WrapMode(device, tsc.wrap_v, tsc.mag_filter),
MaxwellToVK::Sampler::WrapMode(device, tsc.wrap_p, tsc.mag_filter), tsc.GetLodBias(),
has_anisotropy, max_anisotropy, tsc.depth_compare_enabled,
MaxwellToVK::Sampler::DepthCompareFunction(tsc.depth_compare_func), tsc.GetMinLod(),
tsc.GetMaxLod(), vk_border_color.value_or(vk::BorderColor::eFloatTransparentBlack),
unnormalized_coords);
const auto& dld{device.GetDispatchLoader()};
const auto dev{device.GetLogical()};
return dev.createSamplerUnique(sampler_ci, nullptr, dld);
vk::Sampler VKSamplerCache::CreateSampler(const Tegra::Texture::TSCEntry& tsc) const {
VkSamplerCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.magFilter = MaxwellToVK::Sampler::Filter(tsc.mag_filter);
ci.minFilter = MaxwellToVK::Sampler::Filter(tsc.min_filter);
ci.mipmapMode = MaxwellToVK::Sampler::MipmapMode(tsc.mipmap_filter);
ci.addressModeU = MaxwellToVK::Sampler::WrapMode(device, tsc.wrap_u, tsc.mag_filter);
ci.addressModeV = MaxwellToVK::Sampler::WrapMode(device, tsc.wrap_v, tsc.mag_filter);
ci.addressModeW = MaxwellToVK::Sampler::WrapMode(device, tsc.wrap_p, tsc.mag_filter);
ci.mipLodBias = tsc.GetLodBias();
ci.anisotropyEnable = tsc.GetMaxAnisotropy() > 1.0f ? VK_TRUE : VK_FALSE;
ci.maxAnisotropy = tsc.GetMaxAnisotropy();
ci.compareEnable = tsc.depth_compare_enabled;
ci.compareOp = MaxwellToVK::Sampler::DepthCompareFunction(tsc.depth_compare_func);
ci.minLod = tsc.GetMinLod();
ci.maxLod = tsc.GetMaxLod();
ci.borderColor = ConvertBorderColor(tsc.GetBorderColor());
ci.unnormalizedCoordinates = VK_FALSE;
return device.GetLogical().CreateSampler(ci);
}
vk::Sampler VKSamplerCache::ToSamplerType(const UniqueSampler& sampler) const {
VkSampler VKSamplerCache::ToSamplerType(const vk::Sampler& sampler) const {
return *sampler;
}

View File

@@ -4,7 +4,7 @@
#pragma once
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/sampler_cache.h"
#include "video_core/textures/texture.h"
@@ -12,15 +12,15 @@ namespace Vulkan {
class VKDevice;
class VKSamplerCache final : public VideoCommon::SamplerCache<vk::Sampler, UniqueSampler> {
class VKSamplerCache final : public VideoCommon::SamplerCache<VkSampler, vk::Sampler> {
public:
explicit VKSamplerCache(const VKDevice& device);
~VKSamplerCache();
protected:
UniqueSampler CreateSampler(const Tegra::Texture::TSCEntry& tsc) const override;
vk::Sampler CreateSampler(const Tegra::Texture::TSCEntry& tsc) const override;
vk::Sampler ToSamplerType(const UniqueSampler& sampler) const override;
VkSampler ToSamplerType(const vk::Sampler& sampler) const override;
private:
const VKDevice& device;

View File

@@ -10,23 +10,22 @@
#include "common/assert.h"
#include "common/microprofile.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_query_cache.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_state_tracker.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
MICROPROFILE_DECLARE(Vulkan_WaitForWorker);
void VKScheduler::CommandChunk::ExecuteAll(vk::CommandBuffer cmdbuf,
const vk::DispatchLoaderDynamic& dld) {
void VKScheduler::CommandChunk::ExecuteAll(vk::CommandBuffer cmdbuf) {
auto command = first;
while (command != nullptr) {
auto next = command->GetNext();
command->Execute(cmdbuf, dld);
command->Execute(cmdbuf);
command->~Command();
command = next;
}
@@ -51,7 +50,7 @@ VKScheduler::~VKScheduler() {
worker_thread.join();
}
void VKScheduler::Flush(bool release_fence, vk::Semaphore semaphore) {
void VKScheduler::Flush(bool release_fence, VkSemaphore semaphore) {
SubmitExecution(semaphore);
if (release_fence) {
current_fence->Release();
@@ -59,7 +58,7 @@ void VKScheduler::Flush(bool release_fence, vk::Semaphore semaphore) {
AllocateNewContext();
}
void VKScheduler::Finish(bool release_fence, vk::Semaphore semaphore) {
void VKScheduler::Finish(bool release_fence, VkSemaphore semaphore) {
SubmitExecution(semaphore);
current_fence->Wait();
if (release_fence) {
@@ -89,17 +88,34 @@ void VKScheduler::DispatchWork() {
AcquireNewChunk();
}
void VKScheduler::RequestRenderpass(const vk::RenderPassBeginInfo& renderpass_bi) {
if (state.renderpass && renderpass_bi == *state.renderpass) {
void VKScheduler::RequestRenderpass(VkRenderPass renderpass, VkFramebuffer framebuffer,
VkExtent2D render_area) {
if (renderpass == state.renderpass && framebuffer == state.framebuffer &&
render_area.width == state.render_area.width &&
render_area.height == state.render_area.height) {
return;
}
const bool end_renderpass = state.renderpass.has_value();
state.renderpass = renderpass_bi;
Record([renderpass_bi, end_renderpass](auto cmdbuf, auto& dld) {
const bool end_renderpass = state.renderpass != nullptr;
state.renderpass = renderpass;
state.framebuffer = framebuffer;
state.render_area = render_area;
VkRenderPassBeginInfo renderpass_bi;
renderpass_bi.sType = VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO;
renderpass_bi.pNext = nullptr;
renderpass_bi.renderPass = renderpass;
renderpass_bi.framebuffer = framebuffer;
renderpass_bi.renderArea.offset.x = 0;
renderpass_bi.renderArea.offset.y = 0;
renderpass_bi.renderArea.extent = render_area;
renderpass_bi.clearValueCount = 0;
renderpass_bi.pClearValues = nullptr;
Record([renderpass_bi, end_renderpass](vk::CommandBuffer cmdbuf) {
if (end_renderpass) {
cmdbuf.endRenderPass(dld);
cmdbuf.EndRenderPass();
}
cmdbuf.beginRenderPass(renderpass_bi, vk::SubpassContents::eInline, dld);
cmdbuf.BeginRenderPass(renderpass_bi, VK_SUBPASS_CONTENTS_INLINE);
});
}
@@ -107,13 +123,13 @@ void VKScheduler::RequestOutsideRenderPassOperationContext() {
EndRenderPass();
}
void VKScheduler::BindGraphicsPipeline(vk::Pipeline pipeline) {
void VKScheduler::BindGraphicsPipeline(VkPipeline pipeline) {
if (state.graphics_pipeline == pipeline) {
return;
}
state.graphics_pipeline = pipeline;
Record([pipeline](auto cmdbuf, auto& dld) {
cmdbuf.bindPipeline(vk::PipelineBindPoint::eGraphics, pipeline, dld);
Record([pipeline](vk::CommandBuffer cmdbuf) {
cmdbuf.BindPipeline(VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline);
});
}
@@ -126,37 +142,50 @@ void VKScheduler::WorkerThread() {
}
auto extracted_chunk = std::move(chunk_queue.Front());
chunk_queue.Pop();
extracted_chunk->ExecuteAll(current_cmdbuf, device.GetDispatchLoader());
extracted_chunk->ExecuteAll(current_cmdbuf);
chunk_reserve.Push(std::move(extracted_chunk));
} while (!quit);
}
void VKScheduler::SubmitExecution(vk::Semaphore semaphore) {
void VKScheduler::SubmitExecution(VkSemaphore semaphore) {
EndPendingOperations();
InvalidateState();
WaitWorker();
std::unique_lock lock{mutex};
const auto queue = device.GetGraphicsQueue();
const auto& dld = device.GetDispatchLoader();
current_cmdbuf.end(dld);
current_cmdbuf.End();
const vk::SubmitInfo submit_info(0, nullptr, nullptr, 1, &current_cmdbuf, semaphore ? 1U : 0U,
&semaphore);
queue.submit({submit_info}, static_cast<vk::Fence>(*current_fence), dld);
VkSubmitInfo submit_info;
submit_info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
submit_info.pNext = nullptr;
submit_info.waitSemaphoreCount = 0;
submit_info.pWaitSemaphores = nullptr;
submit_info.pWaitDstStageMask = nullptr;
submit_info.commandBufferCount = 1;
submit_info.pCommandBuffers = current_cmdbuf.address();
submit_info.signalSemaphoreCount = semaphore ? 1 : 0;
submit_info.pSignalSemaphores = &semaphore;
device.GetGraphicsQueue().Submit(submit_info, *current_fence);
}
void VKScheduler::AllocateNewContext() {
++ticks;
VkCommandBufferBeginInfo cmdbuf_bi;
cmdbuf_bi.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
cmdbuf_bi.pNext = nullptr;
cmdbuf_bi.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
cmdbuf_bi.pInheritanceInfo = nullptr;
std::unique_lock lock{mutex};
current_fence = next_fence;
next_fence = &resource_manager.CommitFence();
current_cmdbuf = resource_manager.CommitCommandBuffer(*current_fence);
current_cmdbuf.begin({vk::CommandBufferUsageFlagBits::eOneTimeSubmit},
device.GetDispatchLoader());
current_cmdbuf = vk::CommandBuffer(resource_manager.CommitCommandBuffer(*current_fence),
device.GetDispatchLoader());
current_cmdbuf.Begin(cmdbuf_bi);
// Enable counters once again. These are disabled when a command buffer is finished.
if (query_cache) {
query_cache->UpdateCounters();
@@ -177,8 +206,8 @@ void VKScheduler::EndRenderPass() {
if (!state.renderpass) {
return;
}
state.renderpass = std::nullopt;
Record([](auto cmdbuf, auto& dld) { cmdbuf.endRenderPass(dld); });
state.renderpass = nullptr;
Record([](vk::CommandBuffer cmdbuf) { cmdbuf.EndRenderPass(); });
}
void VKScheduler::AcquireNewChunk() {

View File

@@ -13,7 +13,7 @@
#include <utility>
#include "common/common_types.h"
#include "common/threadsafe_queue.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -49,10 +49,10 @@ public:
~VKScheduler();
/// Sends the current execution context to the GPU.
void Flush(bool release_fence = true, vk::Semaphore semaphore = nullptr);
void Flush(bool release_fence = true, VkSemaphore semaphore = nullptr);
/// Sends the current execution context to the GPU and waits for it to complete.
void Finish(bool release_fence = true, vk::Semaphore semaphore = nullptr);
void Finish(bool release_fence = true, VkSemaphore semaphore = nullptr);
/// Waits for the worker thread to finish executing everything. After this function returns it's
/// safe to touch worker resources.
@@ -62,14 +62,15 @@ public:
void DispatchWork();
/// Requests to begin a renderpass.
void RequestRenderpass(const vk::RenderPassBeginInfo& renderpass_bi);
void RequestRenderpass(VkRenderPass renderpass, VkFramebuffer framebuffer,
VkExtent2D render_area);
/// Requests the current executino context to be able to execute operations only allowed outside
/// of a renderpass.
void RequestOutsideRenderPassOperationContext();
/// Binds a pipeline to the current execution context.
void BindGraphicsPipeline(vk::Pipeline pipeline);
void BindGraphicsPipeline(VkPipeline pipeline);
/// Assigns the query cache.
void SetQueryCache(VKQueryCache& query_cache_) {
@@ -101,8 +102,7 @@ private:
public:
virtual ~Command() = default;
virtual void Execute(vk::CommandBuffer cmdbuf,
const vk::DispatchLoaderDynamic& dld) const = 0;
virtual void Execute(vk::CommandBuffer cmdbuf) const = 0;
Command* GetNext() const {
return next;
@@ -125,9 +125,8 @@ private:
TypedCommand(TypedCommand&&) = delete;
TypedCommand& operator=(TypedCommand&&) = delete;
void Execute(vk::CommandBuffer cmdbuf,
const vk::DispatchLoaderDynamic& dld) const override {
command(cmdbuf, dld);
void Execute(vk::CommandBuffer cmdbuf) const override {
command(cmdbuf);
}
private:
@@ -136,7 +135,7 @@ private:
class CommandChunk final {
public:
void ExecuteAll(vk::CommandBuffer cmdbuf, const vk::DispatchLoaderDynamic& dld);
void ExecuteAll(vk::CommandBuffer cmdbuf);
template <typename T>
bool Record(T& command) {
@@ -175,7 +174,7 @@ private:
void WorkerThread();
void SubmitExecution(vk::Semaphore semaphore);
void SubmitExecution(VkSemaphore semaphore);
void AllocateNewContext();
@@ -198,8 +197,10 @@ private:
VKFence* next_fence = nullptr;
struct State {
std::optional<vk::RenderPassBeginInfo> renderpass;
vk::Pipeline graphics_pipeline;
VkRenderPass renderpass = nullptr;
VkFramebuffer framebuffer = nullptr;
VkExtent2D render_area = {0, 0};
VkPipeline graphics_pipeline = nullptr;
} state;
std::unique_ptr<CommandChunk> chunk;

View File

@@ -801,7 +801,7 @@ private:
if (IsOutputAttributeArray()) {
const u32 num = GetNumOutputVertices();
type = TypeArray(type, Constant(t_uint, num));
if (device.GetDriverID() != vk::DriverIdKHR::eIntelProprietaryWindows) {
if (device.GetDriverID() != VK_DRIVER_ID_INTEL_PROPRIETARY_WINDOWS_KHR) {
// Intel's proprietary driver fails to setup defaults for arrayed output
// attributes.
varying_default = ConstantComposite(type, std::vector(num, varying_default));
@@ -1938,11 +1938,8 @@ private:
return {};
}
template <Id (Module::*func)(Id, Id, Id, Id, Id), Type result_type,
Type value_type = result_type>
template <Id (Module::*func)(Id, Id, Id, Id, Id)>
Expression Atomic(Operation operation) {
const Id type_def = GetTypeDefinition(result_type);
Id pointer;
if (const auto smem = std::get_if<SmemNode>(&*operation[0])) {
pointer = GetSharedMemoryPointer(*smem);
@@ -1950,15 +1947,19 @@ private:
pointer = GetGlobalMemoryPointer(*gmem);
} else {
UNREACHABLE();
return {Constant(type_def, 0), result_type};
return {v_float_zero, Type::Float};
}
const Id value = As(Visit(operation[1]), value_type);
const Id scope = Constant(t_uint, static_cast<u32>(spv::Scope::Device));
const Id semantics = Constant(type_def, 0);
const Id semantics = Constant(t_uint, 0);
const Id value = AsUint(Visit(operation[1]));
return {(this->*func)(type_def, pointer, scope, semantics, value), result_type};
return {(this->*func)(t_uint, pointer, scope, semantics, value), Type::Uint};
}
template <Id (Module::*func)(Id, Id, Id, Id, Id)>
Expression Reduce(Operation operation) {
Atomic<func>(operation);
return {};
}
Expression Branch(Operation operation) {
@@ -2547,21 +2548,35 @@ private:
&SPIRVDecompiler::AtomicImageXor,
&SPIRVDecompiler::AtomicImageExchange,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMin, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMax, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMin>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMax>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMin, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMax, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMin>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMax>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicUMin>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicUMax>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicOr>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicXor>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicSMin>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicSMax>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicOr>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicXor>,
&SPIRVDecompiler::Branch,
&SPIRVDecompiler::BranchIndirect,

View File

@@ -8,27 +8,25 @@
#include "common/alignment.h"
#include "common/assert.h"
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_shader_util.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
UniqueShaderModule BuildShader(const VKDevice& device, std::size_t code_size, const u8* code_data) {
vk::ShaderModule BuildShader(const VKDevice& device, std::size_t code_size, const u8* code_data) {
// Avoid undefined behavior by copying to a staging allocation
ASSERT(code_size % sizeof(u32) == 0);
const auto data = std::make_unique<u32[]>(code_size / sizeof(u32));
std::memcpy(data.get(), code_data, code_size);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const vk::ShaderModuleCreateInfo shader_ci({}, code_size, data.get());
vk::ShaderModule shader_module;
if (dev.createShaderModule(&shader_ci, nullptr, &shader_module, dld) != vk::Result::eSuccess) {
UNREACHABLE_MSG("Shader module failed to build!");
}
return UniqueShaderModule(shader_module, vk::ObjectDestroy(dev, nullptr, dld));
VkShaderModuleCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.codeSize = code_size;
ci.pCode = data.get();
return device.GetLogical().CreateShaderModule(ci);
}
} // namespace Vulkan

View File

@@ -6,12 +6,12 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
class VKDevice;
UniqueShaderModule BuildShader(const VKDevice& device, std::size_t code_size, const u8* code_data);
vk::ShaderModule BuildShader(const VKDevice& device, std::size_t code_size, const u8* code_data);
} // namespace Vulkan

View File

@@ -13,6 +13,7 @@
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -71,17 +72,23 @@ VKBuffer* VKStagingBufferPool::TryGetReservedBuffer(std::size_t size, bool host_
}
VKBuffer& VKStagingBufferPool::CreateStagingBuffer(std::size_t size, bool host_visible) {
const auto usage =
vk::BufferUsageFlagBits::eTransferSrc | vk::BufferUsageFlagBits::eTransferDst |
vk::BufferUsageFlagBits::eUniformBuffer | vk::BufferUsageFlagBits::eStorageBuffer |
vk::BufferUsageFlagBits::eIndexBuffer;
const u32 log2 = Common::Log2Ceil64(size);
const vk::BufferCreateInfo buffer_ci({}, 1ULL << log2, usage, vk::SharingMode::eExclusive, 0,
nullptr);
const auto dev = device.GetLogical();
VkBufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.size = 1ULL << log2;
ci.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT |
VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | VK_BUFFER_USAGE_STORAGE_BUFFER_BIT |
VK_BUFFER_USAGE_INDEX_BUFFER_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
auto buffer = std::make_unique<VKBuffer>();
buffer->handle = dev.createBufferUnique(buffer_ci, nullptr, device.GetDispatchLoader());
buffer->commit = memory_manager.Commit(*buffer->handle, host_visible);
buffer->handle = device.GetLogical().CreateBuffer(ci);
buffer->commit = memory_manager.Commit(buffer->handle, host_visible);
auto& entries = GetCache(host_visible)[log2].entries;
return *entries.emplace_back(std::move(buffer), scheduler.GetFence(), epoch).buffer;

View File

@@ -11,9 +11,9 @@
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -22,7 +22,7 @@ class VKFenceWatch;
class VKScheduler;
struct VKBuffer final {
UniqueBuffer handle;
vk::Buffer handle;
VKMemoryCommit commit;
};

View File

@@ -9,11 +9,11 @@
#include "common/alignment.h"
#include "common/assert.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_stream_buffer.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -25,8 +25,8 @@ constexpr u64 WATCHES_RESERVE_CHUNK = 0x1000;
constexpr u64 STREAM_BUFFER_SIZE = 256 * 1024 * 1024;
std::optional<u32> FindMemoryType(const VKDevice& device, u32 filter,
vk::MemoryPropertyFlags wanted) {
const auto properties = device.GetPhysical().getMemoryProperties(device.GetDispatchLoader());
VkMemoryPropertyFlags wanted) {
const auto properties = device.GetPhysical().GetMemoryProperties();
for (u32 i = 0; i < properties.memoryTypeCount; i++) {
if (!(filter & (1 << i))) {
continue;
@@ -35,13 +35,13 @@ std::optional<u32> FindMemoryType(const VKDevice& device, u32 filter,
return i;
}
}
return {};
return std::nullopt;
}
} // Anonymous namespace
VKStreamBuffer::VKStreamBuffer(const VKDevice& device, VKScheduler& scheduler,
vk::BufferUsageFlags usage)
VkBufferUsageFlags usage)
: device{device}, scheduler{scheduler} {
CreateBuffers(usage);
ReserveWatches(current_watches, WATCHES_INITIAL_RESERVE);
@@ -78,17 +78,13 @@ std::tuple<u8*, u64, bool> VKStreamBuffer::Map(u64 size, u64 alignment) {
invalidated = true;
}
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
const auto pointer = reinterpret_cast<u8*>(dev.mapMemory(*memory, offset, size, {}, dld));
return {pointer, offset, invalidated};
return {memory.Map(offset, size), offset, invalidated};
}
void VKStreamBuffer::Unmap(u64 size) {
ASSERT_MSG(size <= mapped_size, "Reserved size is too small");
const auto dev = device.GetLogical();
dev.unmapMemory(*memory, device.GetDispatchLoader());
memory.Unmap();
offset += size;
@@ -101,30 +97,42 @@ void VKStreamBuffer::Unmap(u64 size) {
watch.fence.Watch(scheduler.GetFence());
}
void VKStreamBuffer::CreateBuffers(vk::BufferUsageFlags usage) {
const vk::BufferCreateInfo buffer_ci({}, STREAM_BUFFER_SIZE, usage, vk::SharingMode::eExclusive,
0, nullptr);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
buffer = dev.createBufferUnique(buffer_ci, nullptr, dld);
void VKStreamBuffer::CreateBuffers(VkBufferUsageFlags usage) {
VkBufferCreateInfo buffer_ci;
buffer_ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
buffer_ci.pNext = nullptr;
buffer_ci.flags = 0;
buffer_ci.size = STREAM_BUFFER_SIZE;
buffer_ci.usage = usage;
buffer_ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
buffer_ci.queueFamilyIndexCount = 0;
buffer_ci.pQueueFamilyIndices = nullptr;
const auto requirements = dev.getBufferMemoryRequirements(*buffer, dld);
const auto& dev = device.GetLogical();
buffer = dev.CreateBuffer(buffer_ci);
const auto& dld = device.GetDispatchLoader();
const auto requirements = dev.GetBufferMemoryRequirements(*buffer);
// Prefer device local host visible allocations (this should hit AMD's pinned memory).
auto type = FindMemoryType(device, requirements.memoryTypeBits,
vk::MemoryPropertyFlagBits::eHostVisible |
vk::MemoryPropertyFlagBits::eHostCoherent |
vk::MemoryPropertyFlagBits::eDeviceLocal);
auto type =
FindMemoryType(device, requirements.memoryTypeBits,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT |
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
if (!type) {
// Otherwise search for a host visible allocation.
type = FindMemoryType(device, requirements.memoryTypeBits,
vk::MemoryPropertyFlagBits::eHostVisible |
vk::MemoryPropertyFlagBits::eHostCoherent);
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
ASSERT_MSG(type, "No host visible and coherent memory type found");
}
const vk::MemoryAllocateInfo alloc_ci(requirements.size, *type);
memory = dev.allocateMemoryUnique(alloc_ci, nullptr, dld);
VkMemoryAllocateInfo memory_ai;
memory_ai.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
memory_ai.pNext = nullptr;
memory_ai.allocationSize = requirements.size;
memory_ai.memoryTypeIndex = *type;
dev.bindBufferMemory(*buffer, *memory, 0, dld);
memory = dev.AllocateMemory(memory_ai);
buffer.BindMemory(*memory, 0);
}
void VKStreamBuffer::ReserveWatches(std::vector<Watch>& watches, std::size_t grow_size) {

View File

@@ -9,7 +9,7 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -21,7 +21,7 @@ class VKScheduler;
class VKStreamBuffer final {
public:
explicit VKStreamBuffer(const VKDevice& device, VKScheduler& scheduler,
vk::BufferUsageFlags usage);
VkBufferUsageFlags usage);
~VKStreamBuffer();
/**
@@ -35,7 +35,7 @@ public:
/// Ensures that "size" bytes of memory are available to the GPU, potentially recording a copy.
void Unmap(u64 size);
vk::Buffer GetHandle() const {
VkBuffer GetHandle() const {
return *buffer;
}
@@ -46,20 +46,18 @@ private:
};
/// Creates Vulkan buffer handles committing the required the required memory.
void CreateBuffers(vk::BufferUsageFlags usage);
void CreateBuffers(VkBufferUsageFlags usage);
/// Increases the amount of watches available.
void ReserveWatches(std::vector<Watch>& watches, std::size_t grow_size);
void WaitPendingOperations(u64 requested_upper_bound);
const VKDevice& device; ///< Vulkan device manager.
VKScheduler& scheduler; ///< Command scheduler.
const vk::AccessFlags access; ///< Access usage of this stream buffer.
const vk::PipelineStageFlags pipeline_stage; ///< Pipeline usage of this stream buffer.
const VKDevice& device; ///< Vulkan device manager.
VKScheduler& scheduler; ///< Command scheduler.
UniqueBuffer buffer; ///< Mapped buffer.
UniqueDeviceMemory memory; ///< Memory allocation.
vk::Buffer buffer; ///< Mapped buffer.
vk::DeviceMemory memory; ///< Memory allocation.
u64 offset{}; ///< Buffer iterator.
u64 mapped_size{}; ///< Size reserved for the current copy.

View File

@@ -11,69 +11,64 @@
#include "common/logging/log.h"
#include "core/core.h"
#include "core/frontend/framebuffer_layout.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_resource_manager.h"
#include "video_core/renderer_vulkan/vk_swapchain.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
namespace {
vk::SurfaceFormatKHR ChooseSwapSurfaceFormat(const std::vector<vk::SurfaceFormatKHR>& formats,
bool srgb) {
if (formats.size() == 1 && formats[0].format == vk::Format::eUndefined) {
vk::SurfaceFormatKHR format;
format.format = vk::Format::eB8G8R8A8Unorm;
format.colorSpace = vk::ColorSpaceKHR::eSrgbNonlinear;
VkSurfaceFormatKHR ChooseSwapSurfaceFormat(vk::Span<VkSurfaceFormatKHR> formats, bool srgb) {
if (formats.size() == 1 && formats[0].format == VK_FORMAT_UNDEFINED) {
VkSurfaceFormatKHR format;
format.format = VK_FORMAT_B8G8R8A8_UNORM;
format.colorSpace = VK_COLOR_SPACE_SRGB_NONLINEAR_KHR;
return format;
}
const auto& found = std::find_if(formats.begin(), formats.end(), [srgb](const auto& format) {
const auto request_format = srgb ? vk::Format::eB8G8R8A8Srgb : vk::Format::eB8G8R8A8Unorm;
const auto request_format = srgb ? VK_FORMAT_B8G8R8A8_SRGB : VK_FORMAT_B8G8R8A8_UNORM;
return format.format == request_format &&
format.colorSpace == vk::ColorSpaceKHR::eSrgbNonlinear;
format.colorSpace == VK_COLOR_SPACE_SRGB_NONLINEAR_KHR;
});
return found != formats.end() ? *found : formats[0];
}
vk::PresentModeKHR ChooseSwapPresentMode(const std::vector<vk::PresentModeKHR>& modes) {
VkPresentModeKHR ChooseSwapPresentMode(vk::Span<VkPresentModeKHR> modes) {
// Mailbox doesn't lock the application like fifo (vsync), prefer it
const auto& found = std::find_if(modes.begin(), modes.end(), [](const auto& mode) {
return mode == vk::PresentModeKHR::eMailbox;
});
return found != modes.end() ? *found : vk::PresentModeKHR::eFifo;
const auto found = std::find(modes.begin(), modes.end(), VK_PRESENT_MODE_MAILBOX_KHR);
return found != modes.end() ? *found : VK_PRESENT_MODE_FIFO_KHR;
}
vk::Extent2D ChooseSwapExtent(const vk::SurfaceCapabilitiesKHR& capabilities, u32 width,
u32 height) {
VkExtent2D ChooseSwapExtent(const VkSurfaceCapabilitiesKHR& capabilities, u32 width, u32 height) {
constexpr auto undefined_size{std::numeric_limits<u32>::max()};
if (capabilities.currentExtent.width != undefined_size) {
return capabilities.currentExtent;
}
vk::Extent2D extent = {width, height};
VkExtent2D extent;
extent.width = std::max(capabilities.minImageExtent.width,
std::min(capabilities.maxImageExtent.width, extent.width));
std::min(capabilities.maxImageExtent.width, width));
extent.height = std::max(capabilities.minImageExtent.height,
std::min(capabilities.maxImageExtent.height, extent.height));
std::min(capabilities.maxImageExtent.height, height));
return extent;
}
} // Anonymous namespace
VKSwapchain::VKSwapchain(vk::SurfaceKHR surface, const VKDevice& device)
VKSwapchain::VKSwapchain(VkSurfaceKHR surface, const VKDevice& device)
: surface{surface}, device{device} {}
VKSwapchain::~VKSwapchain() = default;
void VKSwapchain::Create(u32 width, u32 height, bool srgb) {
const auto& dld = device.GetDispatchLoader();
const auto physical_device = device.GetPhysical();
const auto capabilities{physical_device.getSurfaceCapabilitiesKHR(surface, dld)};
const auto capabilities{physical_device.GetSurfaceCapabilitiesKHR(surface)};
if (capabilities.maxImageExtent.width == 0 || capabilities.maxImageExtent.height == 0) {
return;
}
device.GetLogical().waitIdle(dld);
device.GetLogical().WaitIdle();
Destroy();
CreateSwapchain(capabilities, width, height, srgb);
@@ -84,10 +79,8 @@ void VKSwapchain::Create(u32 width, u32 height, bool srgb) {
}
void VKSwapchain::AcquireNextImage() {
const auto dev{device.GetLogical()};
const auto& dld{device.GetDispatchLoader()};
dev.acquireNextImageKHR(*swapchain, std::numeric_limits<u64>::max(),
*present_semaphores[frame_index], {}, &image_index, dld);
device.GetLogical().AcquireNextImageKHR(*swapchain, std::numeric_limits<u64>::max(),
*present_semaphores[frame_index], {}, &image_index);
if (auto& fence = fences[image_index]; fence) {
fence->Wait();
@@ -96,29 +89,37 @@ void VKSwapchain::AcquireNextImage() {
}
}
bool VKSwapchain::Present(vk::Semaphore render_semaphore, VKFence& fence) {
const vk::Semaphore present_semaphore{*present_semaphores[frame_index]};
const std::array<vk::Semaphore, 2> semaphores{present_semaphore, render_semaphore};
const u32 wait_semaphore_count{render_semaphore ? 2U : 1U};
const auto& dld{device.GetDispatchLoader()};
bool VKSwapchain::Present(VkSemaphore render_semaphore, VKFence& fence) {
const VkSemaphore present_semaphore{*present_semaphores[frame_index]};
const std::array<VkSemaphore, 2> semaphores{present_semaphore, render_semaphore};
const auto present_queue{device.GetPresentQueue()};
bool recreated = false;
const vk::PresentInfoKHR present_info(wait_semaphore_count, semaphores.data(), 1,
&swapchain.get(), &image_index, {});
switch (const auto result = present_queue.presentKHR(&present_info, dld); result) {
case vk::Result::eSuccess:
VkPresentInfoKHR present_info;
present_info.sType = VK_STRUCTURE_TYPE_PRESENT_INFO_KHR;
present_info.pNext = nullptr;
present_info.waitSemaphoreCount = render_semaphore ? 2U : 1U;
present_info.pWaitSemaphores = semaphores.data();
present_info.swapchainCount = 1;
present_info.pSwapchains = swapchain.address();
present_info.pImageIndices = &image_index;
present_info.pResults = nullptr;
switch (const VkResult result = present_queue.Present(present_info)) {
case VK_SUCCESS:
break;
case vk::Result::eErrorOutOfDateKHR:
case VK_SUBOPTIMAL_KHR:
LOG_DEBUG(Render_Vulkan, "Suboptimal swapchain");
break;
case VK_ERROR_OUT_OF_DATE_KHR:
if (current_width > 0 && current_height > 0) {
Create(current_width, current_height, current_srgb);
recreated = true;
}
break;
default:
LOG_CRITICAL(Render_Vulkan, "Vulkan failed to present swapchain due to {}!",
vk::to_string(result));
UNREACHABLE();
LOG_CRITICAL(Render_Vulkan, "Failed to present with error {}", vk::ToString(result));
break;
}
ASSERT(fences[image_index] == nullptr);
@@ -132,74 +133,92 @@ bool VKSwapchain::HasFramebufferChanged(const Layout::FramebufferLayout& framebu
return framebuffer.width != current_width || framebuffer.height != current_height;
}
void VKSwapchain::CreateSwapchain(const vk::SurfaceCapabilitiesKHR& capabilities, u32 width,
void VKSwapchain::CreateSwapchain(const VkSurfaceCapabilitiesKHR& capabilities, u32 width,
u32 height, bool srgb) {
const auto& dld{device.GetDispatchLoader()};
const auto physical_device{device.GetPhysical()};
const auto formats{physical_device.getSurfaceFormatsKHR(surface, dld)};
const auto present_modes{physical_device.getSurfacePresentModesKHR(surface, dld)};
const auto formats{physical_device.GetSurfaceFormatsKHR(surface)};
const auto present_modes{physical_device.GetSurfacePresentModesKHR(surface)};
const vk::SurfaceFormatKHR surface_format{ChooseSwapSurfaceFormat(formats, srgb)};
const vk::PresentModeKHR present_mode{ChooseSwapPresentMode(present_modes)};
const VkSurfaceFormatKHR surface_format{ChooseSwapSurfaceFormat(formats, srgb)};
const VkPresentModeKHR present_mode{ChooseSwapPresentMode(present_modes)};
u32 requested_image_count{capabilities.minImageCount + 1};
if (capabilities.maxImageCount > 0 && requested_image_count > capabilities.maxImageCount) {
requested_image_count = capabilities.maxImageCount;
}
vk::SwapchainCreateInfoKHR swapchain_ci(
{}, surface, requested_image_count, surface_format.format, surface_format.colorSpace, {}, 1,
vk::ImageUsageFlagBits::eColorAttachment, {}, {}, {}, capabilities.currentTransform,
vk::CompositeAlphaFlagBitsKHR::eOpaque, present_mode, false, {});
VkSwapchainCreateInfoKHR swapchain_ci;
swapchain_ci.sType = VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR;
swapchain_ci.pNext = nullptr;
swapchain_ci.flags = 0;
swapchain_ci.surface = surface;
swapchain_ci.minImageCount = requested_image_count;
swapchain_ci.imageFormat = surface_format.format;
swapchain_ci.imageColorSpace = surface_format.colorSpace;
swapchain_ci.imageArrayLayers = 1;
swapchain_ci.imageUsage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
swapchain_ci.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE;
swapchain_ci.queueFamilyIndexCount = 0;
swapchain_ci.pQueueFamilyIndices = nullptr;
swapchain_ci.preTransform = capabilities.currentTransform;
swapchain_ci.compositeAlpha = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR;
swapchain_ci.presentMode = present_mode;
swapchain_ci.clipped = VK_FALSE;
swapchain_ci.oldSwapchain = nullptr;
const u32 graphics_family{device.GetGraphicsFamily()};
const u32 present_family{device.GetPresentFamily()};
const std::array<u32, 2> queue_indices{graphics_family, present_family};
if (graphics_family != present_family) {
swapchain_ci.imageSharingMode = vk::SharingMode::eConcurrent;
swapchain_ci.imageSharingMode = VK_SHARING_MODE_CONCURRENT;
swapchain_ci.queueFamilyIndexCount = static_cast<u32>(queue_indices.size());
swapchain_ci.pQueueFamilyIndices = queue_indices.data();
} else {
swapchain_ci.imageSharingMode = vk::SharingMode::eExclusive;
swapchain_ci.imageSharingMode = VK_SHARING_MODE_EXCLUSIVE;
}
// Request the size again to reduce the possibility of a TOCTOU race condition.
const auto updated_capabilities = physical_device.getSurfaceCapabilitiesKHR(surface, dld);
const auto updated_capabilities = physical_device.GetSurfaceCapabilitiesKHR(surface);
swapchain_ci.imageExtent = ChooseSwapExtent(updated_capabilities, width, height);
// Don't add code within this and the swapchain creation.
const auto dev{device.GetLogical()};
swapchain = dev.createSwapchainKHRUnique(swapchain_ci, nullptr, dld);
swapchain = device.GetLogical().CreateSwapchainKHR(swapchain_ci);
extent = swapchain_ci.imageExtent;
current_width = extent.width;
current_height = extent.height;
current_srgb = srgb;
images = dev.getSwapchainImagesKHR(*swapchain, dld);
images = swapchain.GetImages();
image_count = static_cast<u32>(images.size());
image_format = surface_format.format;
}
void VKSwapchain::CreateSemaphores() {
const auto dev{device.GetLogical()};
const auto& dld{device.GetDispatchLoader()};
present_semaphores.resize(image_count);
for (std::size_t i = 0; i < image_count; i++) {
present_semaphores[i] = dev.createSemaphoreUnique({}, nullptr, dld);
}
std::generate(present_semaphores.begin(), present_semaphores.end(),
[this] { return device.GetLogical().CreateSemaphore(); });
}
void VKSwapchain::CreateImageViews() {
const auto dev{device.GetLogical()};
const auto& dld{device.GetDispatchLoader()};
VkImageViewCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
// ci.image
ci.viewType = VK_IMAGE_VIEW_TYPE_2D;
ci.format = image_format;
ci.components = {VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY,
VK_COMPONENT_SWIZZLE_IDENTITY, VK_COMPONENT_SWIZZLE_IDENTITY};
ci.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
ci.subresourceRange.baseMipLevel = 0;
ci.subresourceRange.levelCount = 1;
ci.subresourceRange.baseArrayLayer = 0;
ci.subresourceRange.layerCount = 1;
image_views.resize(image_count);
for (std::size_t i = 0; i < image_count; i++) {
const vk::ImageViewCreateInfo image_view_ci({}, images[i], vk::ImageViewType::e2D,
image_format, {},
{vk::ImageAspectFlagBits::eColor, 0, 1, 0, 1});
image_views[i] = dev.createImageViewUnique(image_view_ci, nullptr, dld);
ci.image = images[i];
image_views[i] = device.GetLogical().CreateImageView(ci);
}
}

View File

@@ -7,7 +7,7 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Layout {
struct FramebufferLayout;
@@ -20,7 +20,7 @@ class VKFence;
class VKSwapchain {
public:
explicit VKSwapchain(vk::SurfaceKHR surface, const VKDevice& device);
explicit VKSwapchain(VkSurfaceKHR surface, const VKDevice& device);
~VKSwapchain();
/// Creates (or recreates) the swapchain with a given size.
@@ -31,12 +31,12 @@ public:
/// Presents the rendered image to the swapchain. Returns true when the swapchains had to be
/// recreated. Takes responsability for the ownership of fence.
bool Present(vk::Semaphore render_semaphore, VKFence& fence);
bool Present(VkSemaphore render_semaphore, VKFence& fence);
/// Returns true when the framebuffer layout has changed.
bool HasFramebufferChanged(const Layout::FramebufferLayout& framebuffer) const;
const vk::Extent2D& GetSize() const {
VkExtent2D GetSize() const {
return extent;
}
@@ -48,15 +48,15 @@ public:
return image_index;
}
vk::Image GetImageIndex(std::size_t index) const {
VkImage GetImageIndex(std::size_t index) const {
return images[index];
}
vk::ImageView GetImageViewIndex(std::size_t index) const {
VkImageView GetImageViewIndex(std::size_t index) const {
return *image_views[index];
}
vk::Format GetImageFormat() const {
VkFormat GetImageFormat() const {
return image_format;
}
@@ -65,30 +65,30 @@ public:
}
private:
void CreateSwapchain(const vk::SurfaceCapabilitiesKHR& capabilities, u32 width, u32 height,
void CreateSwapchain(const VkSurfaceCapabilitiesKHR& capabilities, u32 width, u32 height,
bool srgb);
void CreateSemaphores();
void CreateImageViews();
void Destroy();
const vk::SurfaceKHR surface;
const VkSurfaceKHR surface;
const VKDevice& device;
UniqueSwapchainKHR swapchain;
vk::SwapchainKHR swapchain;
std::size_t image_count{};
std::vector<vk::Image> images;
std::vector<UniqueImageView> image_views;
std::vector<UniqueFramebuffer> framebuffers;
std::vector<VkImage> images;
std::vector<vk::ImageView> image_views;
std::vector<vk::Framebuffer> framebuffers;
std::vector<VKFence*> fences;
std::vector<UniqueSemaphore> present_semaphores;
std::vector<vk::Semaphore> present_semaphores;
u32 image_index{};
u32 frame_index{};
vk::Format image_format{};
vk::Extent2D extent{};
VkFormat image_format{};
VkExtent2D extent{};
u32 current_width{};
u32 current_height{};

View File

@@ -17,7 +17,6 @@
#include "core/memory.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/morton.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/maxwell_to_vk.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
@@ -25,6 +24,7 @@
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_staging_buffer_pool.h"
#include "video_core/renderer_vulkan/vk_texture_cache.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/surface.h"
#include "video_core/textures/convert.h"
@@ -39,18 +39,18 @@ using VideoCore::Surface::SurfaceTarget;
namespace {
vk::ImageType SurfaceTargetToImage(SurfaceTarget target) {
VkImageType SurfaceTargetToImage(SurfaceTarget target) {
switch (target) {
case SurfaceTarget::Texture1D:
case SurfaceTarget::Texture1DArray:
return vk::ImageType::e1D;
return VK_IMAGE_TYPE_1D;
case SurfaceTarget::Texture2D:
case SurfaceTarget::Texture2DArray:
case SurfaceTarget::TextureCubemap:
case SurfaceTarget::TextureCubeArray:
return vk::ImageType::e2D;
return VK_IMAGE_TYPE_2D;
case SurfaceTarget::Texture3D:
return vk::ImageType::e3D;
return VK_IMAGE_TYPE_3D;
case SurfaceTarget::TextureBuffer:
UNREACHABLE();
return {};
@@ -59,35 +59,35 @@ vk::ImageType SurfaceTargetToImage(SurfaceTarget target) {
return {};
}
vk::ImageAspectFlags PixelFormatToImageAspect(PixelFormat pixel_format) {
VkImageAspectFlags PixelFormatToImageAspect(PixelFormat pixel_format) {
if (pixel_format < PixelFormat::MaxColorFormat) {
return vk::ImageAspectFlagBits::eColor;
return VK_IMAGE_ASPECT_COLOR_BIT;
} else if (pixel_format < PixelFormat::MaxDepthFormat) {
return vk::ImageAspectFlagBits::eDepth;
return VK_IMAGE_ASPECT_DEPTH_BIT;
} else if (pixel_format < PixelFormat::MaxDepthStencilFormat) {
return vk::ImageAspectFlagBits::eDepth | vk::ImageAspectFlagBits::eStencil;
return VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT;
} else {
UNREACHABLE_MSG("Invalid pixel format={}", static_cast<u32>(pixel_format));
return vk::ImageAspectFlagBits::eColor;
UNREACHABLE_MSG("Invalid pixel format={}", static_cast<int>(pixel_format));
return VK_IMAGE_ASPECT_COLOR_BIT;
}
}
vk::ImageViewType GetImageViewType(SurfaceTarget target) {
VkImageViewType GetImageViewType(SurfaceTarget target) {
switch (target) {
case SurfaceTarget::Texture1D:
return vk::ImageViewType::e1D;
return VK_IMAGE_VIEW_TYPE_1D;
case SurfaceTarget::Texture2D:
return vk::ImageViewType::e2D;
return VK_IMAGE_VIEW_TYPE_2D;
case SurfaceTarget::Texture3D:
return vk::ImageViewType::e3D;
return VK_IMAGE_VIEW_TYPE_3D;
case SurfaceTarget::Texture1DArray:
return vk::ImageViewType::e1DArray;
return VK_IMAGE_VIEW_TYPE_1D_ARRAY;
case SurfaceTarget::Texture2DArray:
return vk::ImageViewType::e2DArray;
return VK_IMAGE_VIEW_TYPE_2D_ARRAY;
case SurfaceTarget::TextureCubemap:
return vk::ImageViewType::eCube;
return VK_IMAGE_VIEW_TYPE_CUBE;
case SurfaceTarget::TextureCubeArray:
return vk::ImageViewType::eCubeArray;
return VK_IMAGE_VIEW_TYPE_CUBE_ARRAY;
case SurfaceTarget::TextureBuffer:
break;
}
@@ -95,73 +95,88 @@ vk::ImageViewType GetImageViewType(SurfaceTarget target) {
return {};
}
UniqueBuffer CreateBuffer(const VKDevice& device, const SurfaceParams& params,
std::size_t host_memory_size) {
vk::Buffer CreateBuffer(const VKDevice& device, const SurfaceParams& params,
std::size_t host_memory_size) {
// TODO(Rodrigo): Move texture buffer creation to the buffer cache
const vk::BufferCreateInfo buffer_ci({}, host_memory_size,
vk::BufferUsageFlagBits::eUniformTexelBuffer |
vk::BufferUsageFlagBits::eTransferSrc |
vk::BufferUsageFlagBits::eTransferDst,
vk::SharingMode::eExclusive, 0, nullptr);
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
return dev.createBufferUnique(buffer_ci, nullptr, dld);
VkBufferCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.size = static_cast<VkDeviceSize>(host_memory_size);
ci.usage = VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_SRC_BIT |
VK_BUFFER_USAGE_TRANSFER_DST_BIT;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
return device.GetLogical().CreateBuffer(ci);
}
vk::BufferViewCreateInfo GenerateBufferViewCreateInfo(const VKDevice& device,
const SurfaceParams& params,
vk::Buffer buffer,
std::size_t host_memory_size) {
VkBufferViewCreateInfo GenerateBufferViewCreateInfo(const VKDevice& device,
const SurfaceParams& params, VkBuffer buffer,
std::size_t host_memory_size) {
ASSERT(params.IsBuffer());
const auto format =
MaxwellToVK::SurfaceFormat(device, FormatType::Buffer, params.pixel_format).format;
return vk::BufferViewCreateInfo({}, buffer, format, 0, host_memory_size);
VkBufferViewCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_BUFFER_VIEW_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.buffer = buffer;
ci.format = MaxwellToVK::SurfaceFormat(device, FormatType::Buffer, params.pixel_format).format;
ci.offset = 0;
ci.range = static_cast<VkDeviceSize>(host_memory_size);
return ci;
}
vk::ImageCreateInfo GenerateImageCreateInfo(const VKDevice& device, const SurfaceParams& params) {
constexpr auto sample_count = vk::SampleCountFlagBits::e1;
constexpr auto tiling = vk::ImageTiling::eOptimal;
VkImageCreateInfo GenerateImageCreateInfo(const VKDevice& device, const SurfaceParams& params) {
ASSERT(!params.IsBuffer());
const auto [format, attachable, storage] =
MaxwellToVK::SurfaceFormat(device, FormatType::Optimal, params.pixel_format);
auto image_usage = vk::ImageUsageFlagBits::eSampled | vk::ImageUsageFlagBits::eTransferDst |
vk::ImageUsageFlagBits::eTransferSrc;
VkImageCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.imageType = SurfaceTargetToImage(params.target);
ci.format = format;
ci.mipLevels = params.num_levels;
ci.arrayLayers = static_cast<u32>(params.GetNumLayers());
ci.samples = VK_SAMPLE_COUNT_1_BIT;
ci.tiling = VK_IMAGE_TILING_OPTIMAL;
ci.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
ci.queueFamilyIndexCount = 0;
ci.pQueueFamilyIndices = nullptr;
ci.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
ci.usage = VK_IMAGE_USAGE_SAMPLED_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT |
VK_IMAGE_USAGE_TRANSFER_SRC_BIT;
if (attachable) {
image_usage |= params.IsPixelFormatZeta() ? vk::ImageUsageFlagBits::eDepthStencilAttachment
: vk::ImageUsageFlagBits::eColorAttachment;
ci.usage |= params.IsPixelFormatZeta() ? VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT
: VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;
}
if (storage) {
image_usage |= vk::ImageUsageFlagBits::eStorage;
ci.usage |= VK_IMAGE_USAGE_STORAGE_BIT;
}
vk::ImageCreateFlags flags;
vk::Extent3D extent;
switch (params.target) {
case SurfaceTarget::TextureCubemap:
case SurfaceTarget::TextureCubeArray:
flags |= vk::ImageCreateFlagBits::eCubeCompatible;
ci.flags |= VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT;
[[fallthrough]];
case SurfaceTarget::Texture1D:
case SurfaceTarget::Texture1DArray:
case SurfaceTarget::Texture2D:
case SurfaceTarget::Texture2DArray:
extent = vk::Extent3D(params.width, params.height, 1);
ci.extent = {params.width, params.height, 1};
break;
case SurfaceTarget::Texture3D:
extent = vk::Extent3D(params.width, params.height, params.depth);
ci.extent = {params.width, params.height, params.depth};
break;
case SurfaceTarget::TextureBuffer:
UNREACHABLE();
}
return vk::ImageCreateInfo(flags, SurfaceTargetToImage(params.target), format, extent,
params.num_levels, static_cast<u32>(params.GetNumLayers()),
sample_count, tiling, image_usage, vk::SharingMode::eExclusive, 0,
nullptr, vk::ImageLayout::eUndefined);
return ci;
}
} // Anonymous namespace
@@ -175,15 +190,13 @@ CachedSurface::CachedSurface(Core::System& system, const VKDevice& device,
memory_manager{memory_manager}, scheduler{scheduler}, staging_pool{staging_pool} {
if (params.IsBuffer()) {
buffer = CreateBuffer(device, params, host_memory_size);
commit = memory_manager.Commit(*buffer, false);
commit = memory_manager.Commit(buffer, false);
const auto buffer_view_ci =
GenerateBufferViewCreateInfo(device, params, *buffer, host_memory_size);
format = buffer_view_ci.format;
const auto dev = device.GetLogical();
const auto& dld = device.GetDispatchLoader();
buffer_view = dev.createBufferViewUnique(buffer_view_ci, nullptr, dld);
buffer_view = device.GetLogical().CreateBufferView(buffer_view_ci);
} else {
const auto image_ci = GenerateImageCreateInfo(device, params);
format = image_ci.format;
@@ -221,16 +234,15 @@ void CachedSurface::DownloadTexture(std::vector<u8>& staging_buffer) {
// We can't copy images to buffers inside a renderpass
scheduler.RequestOutsideRenderPassOperationContext();
FullTransition(vk::PipelineStageFlagBits::eTransfer, vk::AccessFlagBits::eTransferRead,
vk::ImageLayout::eTransferSrcOptimal);
FullTransition(VK_PIPELINE_STAGE_TRANSFER_BIT, VK_ACCESS_TRANSFER_READ_BIT,
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL);
const auto& buffer = staging_pool.GetUnusedBuffer(host_memory_size, true);
// TODO(Rodrigo): Do this in a single copy
for (u32 level = 0; level < params.num_levels; ++level) {
scheduler.Record([image = image->GetHandle(), buffer = *buffer.handle,
copy = GetBufferImageCopy(level)](auto cmdbuf, auto& dld) {
cmdbuf.copyImageToBuffer(image, vk::ImageLayout::eTransferSrcOptimal, buffer, {copy},
dld);
scheduler.Record([image = *image->GetHandle(), buffer = *buffer.handle,
copy = GetBufferImageCopy(level)](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyImageToBuffer(image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, buffer, copy);
});
}
scheduler.Finish();
@@ -257,15 +269,27 @@ void CachedSurface::UploadBuffer(const std::vector<u8>& staging_buffer) {
std::memcpy(src_buffer.commit->Map(host_memory_size), staging_buffer.data(), host_memory_size);
scheduler.Record([src_buffer = *src_buffer.handle, dst_buffer = *buffer,
size = host_memory_size](auto cmdbuf, auto& dld) {
const vk::BufferCopy copy(0, 0, size);
cmdbuf.copyBuffer(src_buffer, dst_buffer, {copy}, dld);
size = host_memory_size](vk::CommandBuffer cmdbuf) {
VkBufferCopy copy;
copy.srcOffset = 0;
copy.dstOffset = 0;
copy.size = size;
cmdbuf.CopyBuffer(src_buffer, dst_buffer, copy);
cmdbuf.pipelineBarrier(
vk::PipelineStageFlagBits::eTransfer, vk::PipelineStageFlagBits::eVertexShader, {}, {},
{vk::BufferMemoryBarrier(vk::AccessFlagBits::eTransferWrite,
vk::AccessFlagBits::eShaderRead, 0, 0, dst_buffer, 0, size)},
{}, dld);
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
barrier.pNext = nullptr;
barrier.srcAccessMask = VK_PIPELINE_STAGE_TRANSFER_BIT;
barrier.dstAccessMask = VK_PIPELINE_STAGE_VERTEX_SHADER_BIT;
barrier.srcQueueFamilyIndex = VK_ACCESS_TRANSFER_WRITE_BIT;
barrier.dstQueueFamilyIndex = VK_ACCESS_SHADER_READ_BIT;
barrier.srcQueueFamilyIndex = 0;
barrier.dstQueueFamilyIndex = 0;
barrier.buffer = dst_buffer;
barrier.offset = 0;
barrier.size = size;
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_VERTEX_SHADER_BIT,
0, {}, barrier, {});
});
}
@@ -273,43 +297,49 @@ void CachedSurface::UploadImage(const std::vector<u8>& staging_buffer) {
const auto& src_buffer = staging_pool.GetUnusedBuffer(host_memory_size, true);
std::memcpy(src_buffer.commit->Map(host_memory_size), staging_buffer.data(), host_memory_size);
FullTransition(vk::PipelineStageFlagBits::eTransfer, vk::AccessFlagBits::eTransferWrite,
vk::ImageLayout::eTransferDstOptimal);
FullTransition(VK_PIPELINE_STAGE_TRANSFER_BIT, VK_ACCESS_TRANSFER_WRITE_BIT,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
for (u32 level = 0; level < params.num_levels; ++level) {
vk::BufferImageCopy copy = GetBufferImageCopy(level);
if (image->GetAspectMask() ==
(vk::ImageAspectFlagBits::eDepth | vk::ImageAspectFlagBits::eStencil)) {
vk::BufferImageCopy depth = copy;
vk::BufferImageCopy stencil = copy;
depth.imageSubresource.aspectMask = vk::ImageAspectFlagBits::eDepth;
stencil.imageSubresource.aspectMask = vk::ImageAspectFlagBits::eStencil;
scheduler.Record([buffer = *src_buffer.handle, image = image->GetHandle(), depth,
stencil](auto cmdbuf, auto& dld) {
cmdbuf.copyBufferToImage(buffer, image, vk::ImageLayout::eTransferDstOptimal,
{depth, stencil}, dld);
const VkBufferImageCopy copy = GetBufferImageCopy(level);
if (image->GetAspectMask() == (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) {
scheduler.Record([buffer = *src_buffer.handle, image = *image->GetHandle(),
copy](vk::CommandBuffer cmdbuf) {
std::array<VkBufferImageCopy, 2> copies = {copy, copy};
copies[0].imageSubresource.aspectMask = VK_IMAGE_ASPECT_DEPTH_BIT;
copies[1].imageSubresource.aspectMask = VK_IMAGE_ASPECT_STENCIL_BIT;
cmdbuf.CopyBufferToImage(buffer, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
copies);
});
} else {
scheduler.Record([buffer = *src_buffer.handle, image = image->GetHandle(),
copy](auto cmdbuf, auto& dld) {
cmdbuf.copyBufferToImage(buffer, image, vk::ImageLayout::eTransferDstOptimal,
{copy}, dld);
scheduler.Record([buffer = *src_buffer.handle, image = *image->GetHandle(),
copy](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBufferToImage(buffer, image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, copy);
});
}
}
}
vk::BufferImageCopy CachedSurface::GetBufferImageCopy(u32 level) const {
const u32 vk_depth = params.target == SurfaceTarget::Texture3D ? params.GetMipDepth(level) : 1;
const std::size_t mip_offset = params.GetHostMipmapLevelOffset(level, is_converted);
return vk::BufferImageCopy(
mip_offset, 0, 0,
{image->GetAspectMask(), level, 0, static_cast<u32>(params.GetNumLayers())}, {0, 0, 0},
{params.GetMipWidth(level), params.GetMipHeight(level), vk_depth});
VkBufferImageCopy CachedSurface::GetBufferImageCopy(u32 level) const {
VkBufferImageCopy copy;
copy.bufferOffset = params.GetHostMipmapLevelOffset(level, is_converted);
copy.bufferRowLength = 0;
copy.bufferImageHeight = 0;
copy.imageSubresource.aspectMask = image->GetAspectMask();
copy.imageSubresource.mipLevel = level;
copy.imageSubresource.baseArrayLayer = 0;
copy.imageSubresource.layerCount = static_cast<u32>(params.GetNumLayers());
copy.imageOffset.x = 0;
copy.imageOffset.y = 0;
copy.imageOffset.z = 0;
copy.imageExtent.width = params.GetMipWidth(level);
copy.imageExtent.height = params.GetMipHeight(level);
copy.imageExtent.depth =
params.target == SurfaceTarget::Texture3D ? params.GetMipDepth(level) : 1;
return copy;
}
vk::ImageSubresourceRange CachedSurface::GetImageSubresourceRange() const {
VkImageSubresourceRange CachedSurface::GetImageSubresourceRange() const {
return {image->GetAspectMask(), 0, params.num_levels, 0,
static_cast<u32>(params.GetNumLayers())};
}
@@ -321,12 +351,12 @@ CachedSurfaceView::CachedSurfaceView(const VKDevice& device, CachedSurface& surf
aspect_mask{surface.GetAspectMask()}, device{device}, surface{surface},
base_layer{params.base_layer}, num_layers{params.num_layers}, base_level{params.base_level},
num_levels{params.num_levels}, image_view_type{image ? GetImageViewType(params.target)
: vk::ImageViewType{}} {}
: VK_IMAGE_VIEW_TYPE_1D} {}
CachedSurfaceView::~CachedSurfaceView() = default;
vk::ImageView CachedSurfaceView::GetHandle(SwizzleSource x_source, SwizzleSource y_source,
SwizzleSource z_source, SwizzleSource w_source) {
VkImageView CachedSurfaceView::GetHandle(SwizzleSource x_source, SwizzleSource y_source,
SwizzleSource z_source, SwizzleSource w_source) {
const u32 swizzle = EncodeSwizzle(x_source, y_source, z_source, w_source);
if (last_image_view && last_swizzle == swizzle) {
return last_image_view;
@@ -351,37 +381,45 @@ vk::ImageView CachedSurfaceView::GetHandle(SwizzleSource x_source, SwizzleSource
// Games can sample depth or stencil values on textures. This is decided by the swizzle value on
// hardware. To emulate this on Vulkan we specify it in the aspect.
vk::ImageAspectFlags aspect = aspect_mask;
if (aspect == (vk::ImageAspectFlagBits::eDepth | vk::ImageAspectFlagBits::eStencil)) {
VkImageAspectFlags aspect = aspect_mask;
if (aspect == (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) {
UNIMPLEMENTED_IF(x_source != SwizzleSource::R && x_source != SwizzleSource::G);
const bool is_first = x_source == SwizzleSource::R;
switch (params.pixel_format) {
case VideoCore::Surface::PixelFormat::Z24S8:
case VideoCore::Surface::PixelFormat::Z32FS8:
aspect = is_first ? vk::ImageAspectFlagBits::eDepth : vk::ImageAspectFlagBits::eStencil;
aspect = is_first ? VK_IMAGE_ASPECT_DEPTH_BIT : VK_IMAGE_ASPECT_STENCIL_BIT;
break;
case VideoCore::Surface::PixelFormat::S8Z24:
aspect = is_first ? vk::ImageAspectFlagBits::eStencil : vk::ImageAspectFlagBits::eDepth;
aspect = is_first ? VK_IMAGE_ASPECT_STENCIL_BIT : VK_IMAGE_ASPECT_DEPTH_BIT;
break;
default:
aspect = vk::ImageAspectFlagBits::eDepth;
aspect = VK_IMAGE_ASPECT_DEPTH_BIT;
UNIMPLEMENTED();
}
// Vulkan doesn't seem to understand swizzling of a depth stencil image, use identity
swizzle_x = vk::ComponentSwizzle::eR;
swizzle_y = vk::ComponentSwizzle::eG;
swizzle_z = vk::ComponentSwizzle::eB;
swizzle_w = vk::ComponentSwizzle::eA;
swizzle_x = VK_COMPONENT_SWIZZLE_R;
swizzle_y = VK_COMPONENT_SWIZZLE_G;
swizzle_z = VK_COMPONENT_SWIZZLE_B;
swizzle_w = VK_COMPONENT_SWIZZLE_A;
}
const vk::ImageViewCreateInfo image_view_ci(
{}, surface.GetImageHandle(), image_view_type, surface.GetImage().GetFormat(),
{swizzle_x, swizzle_y, swizzle_z, swizzle_w},
{aspect, base_level, num_levels, base_layer, num_layers});
VkImageViewCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO;
ci.pNext = nullptr;
ci.flags = 0;
ci.image = surface.GetImageHandle();
ci.viewType = image_view_type;
ci.format = surface.GetImage().GetFormat();
ci.components = {swizzle_x, swizzle_y, swizzle_z, swizzle_w};
ci.subresourceRange.aspectMask = aspect;
ci.subresourceRange.baseMipLevel = base_level;
ci.subresourceRange.levelCount = num_levels;
ci.subresourceRange.baseArrayLayer = base_layer;
ci.subresourceRange.layerCount = num_layers;
image_view = device.GetLogical().CreateImageView(ci);
const auto dev = device.GetLogical();
image_view = dev.createImageViewUnique(image_view_ci, nullptr, device.GetDispatchLoader());
return last_image_view = *image_view;
}
@@ -418,25 +456,36 @@ void VKTextureCache::ImageCopy(Surface& src_surface, Surface& dst_surface,
scheduler.RequestOutsideRenderPassOperationContext();
src_surface->Transition(copy_params.source_z, copy_params.depth, copy_params.source_level, 1,
vk::PipelineStageFlagBits::eTransfer, vk::AccessFlagBits::eTransferRead,
vk::ImageLayout::eTransferSrcOptimal);
dst_surface->Transition(
dst_base_layer, num_layers, copy_params.dest_level, 1, vk::PipelineStageFlagBits::eTransfer,
vk::AccessFlagBits::eTransferWrite, vk::ImageLayout::eTransferDstOptimal);
VK_PIPELINE_STAGE_TRANSFER_BIT, VK_ACCESS_TRANSFER_READ_BIT,
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL);
dst_surface->Transition(dst_base_layer, num_layers, copy_params.dest_level, 1,
VK_PIPELINE_STAGE_TRANSFER_BIT, VK_ACCESS_TRANSFER_WRITE_BIT,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
const vk::ImageSubresourceLayers src_subresource(
src_surface->GetAspectMask(), copy_params.source_level, copy_params.source_z, num_layers);
const vk::ImageSubresourceLayers dst_subresource(
dst_surface->GetAspectMask(), copy_params.dest_level, dst_base_layer, num_layers);
const vk::Offset3D src_offset(copy_params.source_x, copy_params.source_y, 0);
const vk::Offset3D dst_offset(copy_params.dest_x, copy_params.dest_y, dst_offset_z);
const vk::Extent3D extent(copy_params.width, copy_params.height, extent_z);
const vk::ImageCopy copy(src_subresource, src_offset, dst_subresource, dst_offset, extent);
const vk::Image src_image = src_surface->GetImageHandle();
const vk::Image dst_image = dst_surface->GetImageHandle();
scheduler.Record([src_image, dst_image, copy](auto cmdbuf, auto& dld) {
cmdbuf.copyImage(src_image, vk::ImageLayout::eTransferSrcOptimal, dst_image,
vk::ImageLayout::eTransferDstOptimal, {copy}, dld);
VkImageCopy copy;
copy.srcSubresource.aspectMask = src_surface->GetAspectMask();
copy.srcSubresource.mipLevel = copy_params.source_level;
copy.srcSubresource.baseArrayLayer = copy_params.source_z;
copy.srcSubresource.layerCount = num_layers;
copy.srcOffset.x = copy_params.source_x;
copy.srcOffset.y = copy_params.source_y;
copy.srcOffset.z = 0;
copy.dstSubresource.aspectMask = dst_surface->GetAspectMask();
copy.dstSubresource.mipLevel = copy_params.dest_level;
copy.dstSubresource.baseArrayLayer = dst_base_layer;
copy.dstSubresource.layerCount = num_layers;
copy.dstOffset.x = copy_params.dest_x;
copy.dstOffset.y = copy_params.dest_y;
copy.dstOffset.z = dst_offset_z;
copy.extent.width = copy_params.width;
copy.extent.height = copy_params.height;
copy.extent.depth = extent_z;
const VkImage src_image = src_surface->GetImageHandle();
const VkImage dst_image = dst_surface->GetImageHandle();
scheduler.Record([src_image, dst_image, copy](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyImage(src_image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, dst_image,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, copy);
});
}
@@ -445,25 +494,34 @@ void VKTextureCache::ImageBlit(View& src_view, View& dst_view,
// We can't blit inside a renderpass
scheduler.RequestOutsideRenderPassOperationContext();
src_view->Transition(vk::ImageLayout::eTransferSrcOptimal, vk::PipelineStageFlagBits::eTransfer,
vk::AccessFlagBits::eTransferRead);
dst_view->Transition(vk::ImageLayout::eTransferDstOptimal, vk::PipelineStageFlagBits::eTransfer,
vk::AccessFlagBits::eTransferWrite);
src_view->Transition(VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_ACCESS_TRANSFER_READ_BIT);
dst_view->Transition(VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_ACCESS_TRANSFER_WRITE_BIT);
VkImageBlit blit;
blit.srcSubresource = src_view->GetImageSubresourceLayers();
blit.srcOffsets[0].x = copy_config.src_rect.left;
blit.srcOffsets[0].y = copy_config.src_rect.top;
blit.srcOffsets[0].z = 0;
blit.srcOffsets[1].x = copy_config.src_rect.right;
blit.srcOffsets[1].y = copy_config.src_rect.bottom;
blit.srcOffsets[1].z = 1;
blit.dstSubresource = dst_view->GetImageSubresourceLayers();
blit.dstOffsets[0].x = copy_config.dst_rect.left;
blit.dstOffsets[0].y = copy_config.dst_rect.top;
blit.dstOffsets[0].z = 0;
blit.dstOffsets[1].x = copy_config.dst_rect.right;
blit.dstOffsets[1].y = copy_config.dst_rect.bottom;
blit.dstOffsets[1].z = 1;
const auto& cfg = copy_config;
const auto src_top_left = vk::Offset3D(cfg.src_rect.left, cfg.src_rect.top, 0);
const auto src_bot_right = vk::Offset3D(cfg.src_rect.right, cfg.src_rect.bottom, 1);
const auto dst_top_left = vk::Offset3D(cfg.dst_rect.left, cfg.dst_rect.top, 0);
const auto dst_bot_right = vk::Offset3D(cfg.dst_rect.right, cfg.dst_rect.bottom, 1);
const vk::ImageBlit blit(src_view->GetImageSubresourceLayers(), {src_top_left, src_bot_right},
dst_view->GetImageSubresourceLayers(), {dst_top_left, dst_bot_right});
const bool is_linear = copy_config.filter == Tegra::Engines::Fermi2D::Filter::Linear;
scheduler.Record([src_image = src_view->GetImage(), dst_image = dst_view->GetImage(), blit,
is_linear](auto cmdbuf, auto& dld) {
cmdbuf.blitImage(src_image, vk::ImageLayout::eTransferSrcOptimal, dst_image,
vk::ImageLayout::eTransferDstOptimal, {blit},
is_linear ? vk::Filter::eLinear : vk::Filter::eNearest, dld);
is_linear](vk::CommandBuffer cmdbuf) {
cmdbuf.BlitImage(src_image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, dst_image,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, blit,
is_linear ? VK_FILTER_LINEAR : VK_FILTER_NEAREST);
});
}

View File

@@ -13,10 +13,10 @@
#include "common/math_util.h"
#include "video_core/gpu.h"
#include "video_core/rasterizer_cache.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_image.h"
#include "video_core/renderer_vulkan/vk_memory_manager.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/wrapper.h"
#include "video_core/texture_cache/surface_base.h"
#include "video_core/texture_cache/texture_cache.h"
#include "video_core/textures/decoders.h"
@@ -60,15 +60,15 @@ public:
void UploadTexture(const std::vector<u8>& staging_buffer) override;
void DownloadTexture(std::vector<u8>& staging_buffer) override;
void FullTransition(vk::PipelineStageFlags new_stage_mask, vk::AccessFlags new_access,
vk::ImageLayout new_layout) {
void FullTransition(VkPipelineStageFlags new_stage_mask, VkAccessFlags new_access,
VkImageLayout new_layout) {
image->Transition(0, static_cast<u32>(params.GetNumLayers()), 0, params.num_levels,
new_stage_mask, new_access, new_layout);
}
void Transition(u32 base_layer, u32 num_layers, u32 base_level, u32 num_levels,
vk::PipelineStageFlags new_stage_mask, vk::AccessFlags new_access,
vk::ImageLayout new_layout) {
VkPipelineStageFlags new_stage_mask, VkAccessFlags new_access,
VkImageLayout new_layout) {
image->Transition(base_layer, num_layers, base_level, num_levels, new_stage_mask,
new_access, new_layout);
}
@@ -81,15 +81,15 @@ public:
return *image;
}
vk::Image GetImageHandle() const {
return image->GetHandle();
VkImage GetImageHandle() const {
return *image->GetHandle();
}
vk::ImageAspectFlags GetAspectMask() const {
VkImageAspectFlags GetAspectMask() const {
return image->GetAspectMask();
}
vk::BufferView GetBufferViewHandle() const {
VkBufferView GetBufferViewHandle() const {
return *buffer_view;
}
@@ -104,9 +104,9 @@ private:
void UploadImage(const std::vector<u8>& staging_buffer);
vk::BufferImageCopy GetBufferImageCopy(u32 level) const;
VkBufferImageCopy GetBufferImageCopy(u32 level) const;
vk::ImageSubresourceRange GetImageSubresourceRange() const;
VkImageSubresourceRange GetImageSubresourceRange() const;
Core::System& system;
const VKDevice& device;
@@ -116,11 +116,11 @@ private:
VKStagingBufferPool& staging_pool;
std::optional<VKImage> image;
UniqueBuffer buffer;
UniqueBufferView buffer_view;
vk::Buffer buffer;
vk::BufferView buffer_view;
VKMemoryCommit commit;
vk::Format format;
VkFormat format = VK_FORMAT_UNDEFINED;
};
class CachedSurfaceView final : public VideoCommon::ViewBase {
@@ -129,16 +129,16 @@ public:
const ViewParams& params, bool is_proxy);
~CachedSurfaceView();
vk::ImageView GetHandle(Tegra::Texture::SwizzleSource x_source,
Tegra::Texture::SwizzleSource y_source,
Tegra::Texture::SwizzleSource z_source,
Tegra::Texture::SwizzleSource w_source);
VkImageView GetHandle(Tegra::Texture::SwizzleSource x_source,
Tegra::Texture::SwizzleSource y_source,
Tegra::Texture::SwizzleSource z_source,
Tegra::Texture::SwizzleSource w_source);
bool IsSameSurface(const CachedSurfaceView& rhs) const {
return &surface == &rhs.surface;
}
vk::ImageView GetHandle() {
VkImageView GetHandle() {
return GetHandle(Tegra::Texture::SwizzleSource::R, Tegra::Texture::SwizzleSource::G,
Tegra::Texture::SwizzleSource::B, Tegra::Texture::SwizzleSource::A);
}
@@ -159,24 +159,24 @@ public:
return buffer_view;
}
vk::Image GetImage() const {
VkImage GetImage() const {
return image;
}
vk::BufferView GetBufferView() const {
VkBufferView GetBufferView() const {
return buffer_view;
}
vk::ImageSubresourceRange GetImageSubresourceRange() const {
VkImageSubresourceRange GetImageSubresourceRange() const {
return {aspect_mask, base_level, num_levels, base_layer, num_layers};
}
vk::ImageSubresourceLayers GetImageSubresourceLayers() const {
VkImageSubresourceLayers GetImageSubresourceLayers() const {
return {surface.GetAspectMask(), base_level, base_layer, num_layers};
}
void Transition(vk::ImageLayout new_layout, vk::PipelineStageFlags new_stage_mask,
vk::AccessFlags new_access) const {
void Transition(VkImageLayout new_layout, VkPipelineStageFlags new_stage_mask,
VkAccessFlags new_access) const {
surface.Transition(base_layer, num_layers, base_level, num_levels, new_stage_mask,
new_access, new_layout);
}
@@ -196,9 +196,9 @@ private:
// Store a copy of these values to avoid double dereference when reading them
const SurfaceParams params;
const vk::Image image;
const vk::BufferView buffer_view;
const vk::ImageAspectFlags aspect_mask;
const VkImage image;
const VkBufferView buffer_view;
const VkImageAspectFlags aspect_mask;
const VKDevice& device;
CachedSurface& surface;
@@ -206,12 +206,12 @@ private:
const u32 num_layers;
const u32 base_level;
const u32 num_levels;
const vk::ImageViewType image_view_type;
const VkImageViewType image_view_type;
vk::ImageView last_image_view;
u32 last_swizzle{};
VkImageView last_image_view = nullptr;
u32 last_swizzle = 0;
std::unordered_map<u32, UniqueImageView> view_cache;
std::unordered_map<u32, vk::ImageView> view_cache;
};
class VKTextureCache final : public TextureCacheBase {

View File

@@ -7,10 +7,10 @@
#include "common/assert.h"
#include "common/logging/log.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/vk_device.h"
#include "video_core/renderer_vulkan/vk_scheduler.h"
#include "video_core/renderer_vulkan/vk_update_descriptor.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -27,8 +27,8 @@ void VKUpdateDescriptorQueue::Acquire() {
entries.clear();
}
void VKUpdateDescriptorQueue::Send(vk::DescriptorUpdateTemplate update_template,
vk::DescriptorSet set) {
void VKUpdateDescriptorQueue::Send(VkDescriptorUpdateTemplateKHR update_template,
VkDescriptorSet set) {
if (payload.size() + entries.size() >= payload.max_size()) {
LOG_WARNING(Render_Vulkan, "Payload overflow, waiting for worker thread");
scheduler.WaitWorker();
@@ -37,21 +37,21 @@ void VKUpdateDescriptorQueue::Send(vk::DescriptorUpdateTemplate update_template,
const auto payload_start = payload.data() + payload.size();
for (const auto& entry : entries) {
if (const auto image = std::get_if<vk::DescriptorImageInfo>(&entry)) {
if (const auto image = std::get_if<VkDescriptorImageInfo>(&entry)) {
payload.push_back(*image);
} else if (const auto buffer = std::get_if<Buffer>(&entry)) {
payload.emplace_back(*buffer->buffer, buffer->offset, buffer->size);
} else if (const auto texel = std::get_if<vk::BufferView>(&entry)) {
} else if (const auto texel = std::get_if<VkBufferView>(&entry)) {
payload.push_back(*texel);
} else {
UNREACHABLE();
}
}
scheduler.Record([dev = device.GetLogical(), payload_start, set,
update_template]([[maybe_unused]] auto cmdbuf, auto& dld) {
dev.updateDescriptorSetWithTemplate(set, update_template, payload_start, dld);
});
scheduler.Record(
[payload_start, set, update_template, logical = &device.GetLogical()](vk::CommandBuffer) {
logical->UpdateDescriptorSet(set, update_template, payload_start);
});
}
} // namespace Vulkan

View File

@@ -9,7 +9,7 @@
#include <boost/container/static_vector.hpp>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/declarations.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -20,18 +20,18 @@ class DescriptorUpdateEntry {
public:
explicit DescriptorUpdateEntry() : image{} {}
DescriptorUpdateEntry(vk::DescriptorImageInfo image) : image{image} {}
DescriptorUpdateEntry(VkDescriptorImageInfo image) : image{image} {}
DescriptorUpdateEntry(vk::Buffer buffer, vk::DeviceSize offset, vk::DeviceSize size)
DescriptorUpdateEntry(VkBuffer buffer, VkDeviceSize offset, VkDeviceSize size)
: buffer{buffer, offset, size} {}
DescriptorUpdateEntry(vk::BufferView texel_buffer) : texel_buffer{texel_buffer} {}
DescriptorUpdateEntry(VkBufferView texel_buffer) : texel_buffer{texel_buffer} {}
private:
union {
vk::DescriptorImageInfo image;
vk::DescriptorBufferInfo buffer;
vk::BufferView texel_buffer;
VkDescriptorImageInfo image;
VkDescriptorBufferInfo buffer;
VkBufferView texel_buffer;
};
};
@@ -44,37 +44,35 @@ public:
void Acquire();
void Send(vk::DescriptorUpdateTemplate update_template, vk::DescriptorSet set);
void Send(VkDescriptorUpdateTemplateKHR update_template, VkDescriptorSet set);
void AddSampledImage(vk::Sampler sampler, vk::ImageView image_view) {
entries.emplace_back(vk::DescriptorImageInfo{sampler, image_view, {}});
void AddSampledImage(VkSampler sampler, VkImageView image_view) {
entries.emplace_back(VkDescriptorImageInfo{sampler, image_view, {}});
}
void AddImage(vk::ImageView image_view) {
entries.emplace_back(vk::DescriptorImageInfo{{}, image_view, {}});
void AddImage(VkImageView image_view) {
entries.emplace_back(VkDescriptorImageInfo{{}, image_view, {}});
}
void AddBuffer(const vk::Buffer* buffer, u64 offset, std::size_t size) {
void AddBuffer(const VkBuffer* buffer, u64 offset, std::size_t size) {
entries.push_back(Buffer{buffer, offset, size});
}
void AddTexelBuffer(vk::BufferView texel_buffer) {
void AddTexelBuffer(VkBufferView texel_buffer) {
entries.emplace_back(texel_buffer);
}
vk::ImageLayout* GetLastImageLayout() {
return &std::get<vk::DescriptorImageInfo>(entries.back()).imageLayout;
VkImageLayout* GetLastImageLayout() {
return &std::get<VkDescriptorImageInfo>(entries.back()).imageLayout;
}
private:
struct Buffer {
const vk::Buffer* buffer{};
u64 offset{};
std::size_t size{};
const VkBuffer* buffer = nullptr;
u64 offset = 0;
std::size_t size = 0;
};
using Variant = std::variant<vk::DescriptorImageInfo, Buffer, vk::BufferView>;
// Old gcc versions don't consider this trivially copyable.
// static_assert(std::is_trivially_copyable_v<Variant>);
using Variant = std::variant<VkDescriptorImageInfo, Buffer, VkBufferView>;
const VKDevice& device;
VKScheduler& scheduler;

View File

@@ -136,7 +136,8 @@ u32 ShaderIR::DecodeArithmetic(NodeBlock& bb, u32 pc) {
SetRegister(bb, instr.gpr0, value);
break;
}
case OpCode::Id::FCMP_R: {
case OpCode::Id::FCMP_RR:
case OpCode::Id::FCMP_RC: {
UNIMPLEMENTED_IF(instr.fcmp.ftz == 0);
Node op_c = GetRegister(instr.gpr39);
Node comp = GetPredicateComparisonFloat(instr.fcmp.cond, std::move(op_c), Immediate(0.0f));

View File

@@ -2,6 +2,10 @@
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <limits>
#include <optional>
#include <utility>
#include "common/assert.h"
#include "common/common_types.h"
#include "video_core/engines/shader_bytecode.h"
@@ -15,9 +19,49 @@ using Tegra::Shader::OpCode;
using Tegra::Shader::Register;
namespace {
constexpr OperationCode GetFloatSelector(u64 selector) {
return selector == 0 ? OperationCode::FCastHalf0 : OperationCode::FCastHalf1;
}
constexpr u32 SizeInBits(Register::Size size) {
switch (size) {
case Register::Size::Byte:
return 8;
case Register::Size::Short:
return 16;
case Register::Size::Word:
return 32;
case Register::Size::Long:
return 64;
}
return 0;
}
constexpr std::optional<std::pair<s32, s32>> IntegerSaturateBounds(Register::Size src_size,
Register::Size dst_size,
bool src_signed,
bool dst_signed) {
const u32 dst_bits = SizeInBits(dst_size);
if (src_size == Register::Size::Word && dst_size == Register::Size::Word) {
if (src_signed == dst_signed) {
return std::nullopt;
}
return std::make_pair(0, std::numeric_limits<s32>::max());
}
if (dst_signed) {
// Signed destination, clamp to [-128, 127] for instance
return std::make_pair(-(1 << (dst_bits - 1)), (1 << (dst_bits - 1)) - 1);
} else {
// Unsigned destination
if (dst_bits == 32) {
// Avoid shifting by 32, that is undefined behavior
return std::make_pair(0, s32(std::numeric_limits<u32>::max()));
}
return std::make_pair(0, (1 << dst_bits) - 1);
}
}
} // Anonymous namespace
u32 ShaderIR::DecodeConversion(NodeBlock& bb, u32 pc) {
@@ -28,14 +72,13 @@ u32 ShaderIR::DecodeConversion(NodeBlock& bb, u32 pc) {
case OpCode::Id::I2I_R:
case OpCode::Id::I2I_C:
case OpCode::Id::I2I_IMM: {
UNIMPLEMENTED_IF(instr.conversion.int_src.selector != 0);
UNIMPLEMENTED_IF(instr.conversion.dst_size != Register::Size::Word);
UNIMPLEMENTED_IF(instr.alu.saturate_d);
const bool src_signed = instr.conversion.is_input_signed;
const bool dst_signed = instr.conversion.is_output_signed;
const Register::Size src_size = instr.conversion.src_size;
const Register::Size dst_size = instr.conversion.dst_size;
const u32 selector = static_cast<u32>(instr.conversion.int_src.selector);
const bool input_signed = instr.conversion.is_input_signed;
const bool output_signed = instr.conversion.is_output_signed;
Node value = [&]() {
Node value = [this, instr, opcode] {
switch (opcode->get().GetId()) {
case OpCode::Id::I2I_R:
return GetRegister(instr.gpr20);
@@ -48,16 +91,60 @@ u32 ShaderIR::DecodeConversion(NodeBlock& bb, u32 pc) {
return Immediate(0);
}
}();
value = ConvertIntegerSize(value, instr.conversion.src_size, input_signed);
value = GetOperandAbsNegInteger(value, instr.conversion.abs_a, instr.conversion.negate_a,
input_signed);
if (input_signed != output_signed) {
value = SignedOperation(OperationCode::ICastUnsigned, output_signed, NO_PRECISE, value);
// Ensure the source selector is valid
switch (instr.conversion.src_size) {
case Register::Size::Byte:
break;
case Register::Size::Short:
ASSERT(selector == 0 || selector == 2);
break;
default:
ASSERT(selector == 0);
break;
}
if (src_size != Register::Size::Word || selector != 0) {
value = SignedOperation(OperationCode::IBitfieldExtract, src_signed, std::move(value),
Immediate(selector * 8), Immediate(SizeInBits(src_size)));
}
value = GetOperandAbsNegInteger(std::move(value), instr.conversion.abs_a,
instr.conversion.negate_a, src_signed);
if (instr.alu.saturate_d) {
if (src_signed && !dst_signed) {
Node is_negative = Operation(OperationCode::LogicalUGreaterEqual, value,
Immediate(1 << (SizeInBits(src_size) - 1)));
value = Operation(OperationCode::Select, std::move(is_negative), Immediate(0),
std::move(value));
// Simplify generated expressions, this can be removed without semantic impact
SetTemporary(bb, 0, std::move(value));
value = GetTemporary(0);
if (dst_size != Register::Size::Word) {
const Node limit = Immediate((1 << SizeInBits(dst_size)) - 1);
Node is_large =
Operation(OperationCode::LogicalUGreaterThan, std::move(value), limit);
value = Operation(OperationCode::Select, std::move(is_large), limit,
std::move(value));
}
} else if (const std::optional bounds =
IntegerSaturateBounds(src_size, dst_size, src_signed, dst_signed)) {
value = SignedOperation(OperationCode::IMax, src_signed, std::move(value),
Immediate(bounds->first));
value = SignedOperation(OperationCode::IMin, src_signed, std::move(value),
Immediate(bounds->second));
}
} else if (dst_size != Register::Size::Word) {
// No saturation, we only have to mask the result
Node mask = Immediate((1 << SizeInBits(dst_size)) - 1);
value = Operation(OperationCode::UBitwiseAnd, std::move(value), std::move(mask));
}
SetInternalFlagsFromInteger(bb, value, instr.generates_cc);
SetRegister(bb, instr.gpr0, value);
SetRegister(bb, instr.gpr0, std::move(value));
break;
}
case OpCode::Id::I2F_R:

View File

@@ -352,8 +352,10 @@ u32 ShaderIR::DecodeImage(NodeBlock& bb, u32 pc) {
registry.ObtainBoundSampler(static_cast<u32>(instr.image.index.Value()));
} else {
const Node image_register = GetRegister(instr.gpr39);
const auto [base_image, buffer, offset] = TrackCbuf(
image_register, global_code, static_cast<s64>(global_code.size()));
const auto result = TrackCbuf(image_register, global_code,
static_cast<s64>(global_code.size()));
const auto buffer = std::get<1>(result);
const auto offset = std::get<2>(result);
descriptor = registry.ObtainBindlessSampler(buffer, offset);
}
if (!descriptor) {
@@ -497,9 +499,12 @@ Image& ShaderIR::GetImage(Tegra::Shader::Image image, Tegra::Shader::ImageType t
Image& ShaderIR::GetBindlessImage(Tegra::Shader::Register reg, Tegra::Shader::ImageType type) {
const Node image_register = GetRegister(reg);
const auto [base_image, buffer, offset] =
const auto result =
TrackCbuf(image_register, global_code, static_cast<s64>(global_code.size()));
const auto buffer = std::get<1>(result);
const auto offset = std::get<2>(result);
const auto it =
std::find_if(std::begin(used_images), std::end(used_images),
[buffer = buffer, offset = offset](const Image& entry) {

View File

@@ -3,7 +3,9 @@
// Refer to the license.txt file included.
#include <algorithm>
#include <utility>
#include <vector>
#include <fmt/format.h>
#include "common/alignment.h"
@@ -16,6 +18,7 @@
namespace VideoCommon::Shader {
using std::move;
using Tegra::Shader::AtomicOp;
using Tegra::Shader::AtomicType;
using Tegra::Shader::Attribute;
@@ -27,29 +30,26 @@ using Tegra::Shader::StoreType;
namespace {
Node GetAtomOperation(AtomicOp op, bool is_signed, Node memory, Node data) {
const OperationCode operation_code = [op] {
switch (op) {
case AtomicOp::Add:
return OperationCode::AtomicIAdd;
case AtomicOp::Min:
return OperationCode::AtomicIMin;
case AtomicOp::Max:
return OperationCode::AtomicIMax;
case AtomicOp::And:
return OperationCode::AtomicIAnd;
case AtomicOp::Or:
return OperationCode::AtomicIOr;
case AtomicOp::Xor:
return OperationCode::AtomicIXor;
case AtomicOp::Exch:
return OperationCode::AtomicIExchange;
default:
UNIMPLEMENTED_MSG("op={}", static_cast<int>(op));
return OperationCode::AtomicIAdd;
}
}();
return SignedOperation(operation_code, is_signed, std::move(memory), std::move(data));
OperationCode GetAtomOperation(AtomicOp op) {
switch (op) {
case AtomicOp::Add:
return OperationCode::AtomicIAdd;
case AtomicOp::Min:
return OperationCode::AtomicIMin;
case AtomicOp::Max:
return OperationCode::AtomicIMax;
case AtomicOp::And:
return OperationCode::AtomicIAnd;
case AtomicOp::Or:
return OperationCode::AtomicIOr;
case AtomicOp::Xor:
return OperationCode::AtomicIXor;
case AtomicOp::Exch:
return OperationCode::AtomicIExchange;
default:
UNIMPLEMENTED_MSG("op={}", static_cast<int>(op));
return OperationCode::AtomicIAdd;
}
}
bool IsUnaligned(Tegra::Shader::UniformType uniform_type) {
@@ -90,23 +90,22 @@ u32 GetMemorySize(Tegra::Shader::UniformType uniform_type) {
Node ExtractUnaligned(Node value, Node address, u32 mask, u32 size) {
Node offset = Operation(OperationCode::UBitwiseAnd, address, Immediate(mask));
offset = Operation(OperationCode::ULogicalShiftLeft, std::move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldExtract, std::move(value), std::move(offset),
Immediate(size));
offset = Operation(OperationCode::ULogicalShiftLeft, move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldExtract, move(value), move(offset), Immediate(size));
}
Node InsertUnaligned(Node dest, Node value, Node address, u32 mask, u32 size) {
Node offset = Operation(OperationCode::UBitwiseAnd, std::move(address), Immediate(mask));
offset = Operation(OperationCode::ULogicalShiftLeft, std::move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldInsert, std::move(dest), std::move(value),
std::move(offset), Immediate(size));
Node offset = Operation(OperationCode::UBitwiseAnd, move(address), Immediate(mask));
offset = Operation(OperationCode::ULogicalShiftLeft, move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldInsert, move(dest), move(value), move(offset),
Immediate(size));
}
Node Sign16Extend(Node value) {
Node sign = Operation(OperationCode::UBitwiseAnd, value, Immediate(1U << 15));
Node is_sign = Operation(OperationCode::LogicalUEqual, std::move(sign), Immediate(1U << 15));
Node is_sign = Operation(OperationCode::LogicalUEqual, move(sign), Immediate(1U << 15));
Node extend = Operation(OperationCode::Select, is_sign, Immediate(0xFFFF0000), Immediate(0));
return Operation(OperationCode::UBitwiseOr, std::move(value), std::move(extend));
return Operation(OperationCode::UBitwiseOr, move(value), move(extend));
}
} // Anonymous namespace
@@ -379,20 +378,36 @@ u32 ShaderIR::DecodeMemory(NodeBlock& bb, u32 pc) {
if (IsUnaligned(type)) {
const u32 mask = GetUnalignedMask(type);
value = InsertUnaligned(gmem, std::move(value), real_address, mask, size);
value = InsertUnaligned(gmem, move(value), real_address, mask, size);
}
bb.push_back(Operation(OperationCode::Assign, gmem, value));
}
break;
}
case OpCode::Id::RED: {
UNIMPLEMENTED_IF_MSG(instr.red.type != GlobalAtomicType::U32);
UNIMPLEMENTED_IF_MSG(instr.red.operation != AtomicOp::Add);
const auto [real_address, base_address, descriptor] =
TrackGlobalMemory(bb, instr, true, true);
if (!real_address || !base_address) {
// Tracking failed, skip atomic.
break;
}
Node gmem = MakeNode<GmemNode>(real_address, base_address, descriptor);
Node value = GetRegister(instr.gpr0);
bb.push_back(Operation(OperationCode::ReduceIAdd, move(gmem), move(value)));
break;
}
case OpCode::Id::ATOM: {
UNIMPLEMENTED_IF_MSG(instr.atom.operation == AtomicOp::Inc ||
instr.atom.operation == AtomicOp::Dec ||
instr.atom.operation == AtomicOp::SafeAdd,
"operation={}", static_cast<int>(instr.atom.operation.Value()));
UNIMPLEMENTED_IF_MSG(instr.atom.type == GlobalAtomicType::S64 ||
instr.atom.type == GlobalAtomicType::U64,
instr.atom.type == GlobalAtomicType::U64 ||
instr.atom.type == GlobalAtomicType::F16x2_FTZ_RN ||
instr.atom.type == GlobalAtomicType::F32_FTZ_RN,
"type={}", static_cast<int>(instr.atom.type.Value()));
const auto [real_address, base_address, descriptor] =
@@ -403,11 +418,11 @@ u32 ShaderIR::DecodeMemory(NodeBlock& bb, u32 pc) {
}
const bool is_signed =
instr.atoms.type == AtomicType::S32 || instr.atoms.type == AtomicType::S64;
instr.atom.type == GlobalAtomicType::S32 || instr.atom.type == GlobalAtomicType::S64;
Node gmem = MakeNode<GmemNode>(real_address, base_address, descriptor);
Node value = GetAtomOperation(static_cast<AtomicOp>(instr.atom.operation), is_signed, gmem,
GetRegister(instr.gpr20));
SetRegister(bb, instr.gpr0, std::move(value));
SetRegister(bb, instr.gpr0,
SignedOperation(GetAtomOperation(instr.atom.operation), is_signed, gmem,
GetRegister(instr.gpr20)));
break;
}
case OpCode::Id::ATOMS: {
@@ -421,11 +436,10 @@ u32 ShaderIR::DecodeMemory(NodeBlock& bb, u32 pc) {
instr.atoms.type == AtomicType::S32 || instr.atoms.type == AtomicType::S64;
const s32 offset = instr.atoms.GetImmediateOffset();
Node address = GetRegister(instr.gpr8);
address = Operation(OperationCode::IAdd, std::move(address), Immediate(offset));
Node value =
GetAtomOperation(static_cast<AtomicOp>(instr.atoms.operation), is_signed,
GetSharedMemory(std::move(address)), GetRegister(instr.gpr20));
SetRegister(bb, instr.gpr0, std::move(value));
address = Operation(OperationCode::IAdd, move(address), Immediate(offset));
SetRegister(bb, instr.gpr0,
SignedOperation(GetAtomOperation(instr.atoms.operation), is_signed,
GetSharedMemory(move(address)), GetRegister(instr.gpr20)));
break;
}
case OpCode::Id::AL2P: {

View File

@@ -780,20 +780,6 @@ Node4 ShaderIR::GetTldsCode(Instruction instr, TextureType texture_type, bool is
// When lod is used always is in gpr20
const Node lod = lod_enabled ? GetRegister(instr.gpr20) : Immediate(0);
// Fill empty entries from the guest sampler
const std::size_t entry_coord_count = GetCoordCount(sampler.GetType());
if (type_coord_count != entry_coord_count) {
LOG_WARNING(HW_GPU, "Bound and built texture types mismatch");
// When the size is higher we insert zeroes
for (std::size_t i = type_coord_count; i < entry_coord_count; ++i) {
coords.push_back(GetRegister(Register::ZeroIndex));
}
// Then we ensure the size matches the number of entries (dropping unused values)
coords.resize(entry_coord_count);
}
Node4 values;
for (u32 element = 0; element < values.size(); ++element) {
auto coords_copy = coords;

View File

@@ -10,16 +10,24 @@
namespace VideoCommon::Shader {
using std::move;
using Tegra::Shader::Instruction;
using Tegra::Shader::OpCode;
using Tegra::Shader::Pred;
using Tegra::Shader::VideoType;
using Tegra::Shader::VmadShr;
using Tegra::Shader::VmnmxOperation;
using Tegra::Shader::VmnmxType;
u32 ShaderIR::DecodeVideo(NodeBlock& bb, u32 pc) {
const Instruction instr = {program_code[pc]};
const auto opcode = OpCode::Decode(instr);
if (opcode->get().GetId() == OpCode::Id::VMNMX) {
DecodeVMNMX(bb, instr);
return pc;
}
const Node op_a =
GetVideoOperand(GetRegister(instr.gpr8), instr.video.is_byte_chunk_a, instr.video.signed_a,
instr.video.type_a, instr.video.byte_height_a);
@@ -109,4 +117,54 @@ Node ShaderIR::GetVideoOperand(Node op, bool is_chunk, bool is_signed,
}
}
void ShaderIR::DecodeVMNMX(NodeBlock& bb, Tegra::Shader::Instruction instr) {
UNIMPLEMENTED_IF(!instr.vmnmx.is_op_b_register);
UNIMPLEMENTED_IF(instr.vmnmx.SourceFormatA() != VmnmxType::Bits32);
UNIMPLEMENTED_IF(instr.vmnmx.SourceFormatB() != VmnmxType::Bits32);
UNIMPLEMENTED_IF(instr.vmnmx.is_src_a_signed != instr.vmnmx.is_src_b_signed);
UNIMPLEMENTED_IF(instr.vmnmx.sat);
UNIMPLEMENTED_IF(instr.generates_cc);
Node op_a = GetRegister(instr.gpr8);
Node op_b = GetRegister(instr.gpr20);
Node op_c = GetRegister(instr.gpr39);
const bool is_oper1_signed = instr.vmnmx.is_src_a_signed; // Stubbed
const bool is_oper2_signed = instr.vmnmx.is_dest_signed;
const auto operation_a = instr.vmnmx.mx ? OperationCode::IMax : OperationCode::IMin;
Node value = SignedOperation(operation_a, is_oper1_signed, move(op_a), move(op_b));
switch (instr.vmnmx.operation) {
case VmnmxOperation::Mrg_16H:
value = BitfieldInsert(move(op_c), move(value), 16, 16);
break;
case VmnmxOperation::Mrg_16L:
value = BitfieldInsert(move(op_c), move(value), 0, 16);
break;
case VmnmxOperation::Mrg_8B0:
value = BitfieldInsert(move(op_c), move(value), 0, 8);
break;
case VmnmxOperation::Mrg_8B2:
value = BitfieldInsert(move(op_c), move(value), 16, 8);
break;
case VmnmxOperation::Acc:
value = Operation(OperationCode::IAdd, move(value), move(op_c));
break;
case VmnmxOperation::Min:
value = SignedOperation(OperationCode::IMin, is_oper2_signed, move(value), move(op_c));
break;
case VmnmxOperation::Max:
value = SignedOperation(OperationCode::IMax, is_oper2_signed, move(value), move(op_c));
break;
case VmnmxOperation::Nop:
break;
default:
UNREACHABLE();
break;
}
SetRegister(bb, instr.gpr0, move(value));
}
} // namespace VideoCommon::Shader

View File

@@ -178,6 +178,20 @@ enum class OperationCode {
AtomicIOr, /// (memory, int) -> int
AtomicIXor, /// (memory, int) -> int
ReduceUAdd, /// (memory, uint) -> void
ReduceUMin, /// (memory, uint) -> void
ReduceUMax, /// (memory, uint) -> void
ReduceUAnd, /// (memory, uint) -> void
ReduceUOr, /// (memory, uint) -> void
ReduceUXor, /// (memory, uint) -> void
ReduceIAdd, /// (memory, int) -> void
ReduceIMin, /// (memory, int) -> void
ReduceIMax, /// (memory, int) -> void
ReduceIAnd, /// (memory, int) -> void
ReduceIOr, /// (memory, int) -> void
ReduceIXor, /// (memory, int) -> void
Branch, /// (uint branch_target) -> void
BranchIndirect, /// (uint branch_target) -> void
PushFlowStack, /// (uint branch_target) -> void

View File

@@ -56,8 +56,7 @@ Node ShaderIR::GetConstBuffer(u64 index_, u64 offset_) {
const auto index = static_cast<u32>(index_);
const auto offset = static_cast<u32>(offset_);
const auto [entry, is_new] = used_cbufs.try_emplace(index);
entry->second.MarkAsUsed(offset);
used_cbufs.try_emplace(index).first->second.MarkAsUsed(offset);
return MakeNode<CbufNode>(index, Immediate(offset));
}
@@ -66,8 +65,7 @@ Node ShaderIR::GetConstBufferIndirect(u64 index_, u64 offset_, Node node) {
const auto index = static_cast<u32>(index_);
const auto offset = static_cast<u32>(offset_);
const auto [entry, is_new] = used_cbufs.try_emplace(index);
entry->second.MarkAsUsedIndirect();
used_cbufs.try_emplace(index).first->second.MarkAsUsedIndirect();
Node final_offset = [&] {
// Attempt to inline constant buffer without a variable offset. This is done to allow
@@ -166,6 +164,7 @@ Node ShaderIR::ConvertIntegerSize(Node value, Register::Size size, bool is_signe
std::move(value), Immediate(16));
value = SignedOperation(OperationCode::IArithmeticShiftRight, is_signed, NO_PRECISE,
std::move(value), Immediate(16));
return value;
case Register::Size::Word:
// Default - do nothing
return value;

View File

@@ -354,6 +354,9 @@ private:
/// Marks the usage of a input or output attribute.
void MarkAttributeUsage(Tegra::Shader::Attribute::Index index, u64 element);
/// Decodes VMNMX instruction and inserts its code into the passed basic block.
void DecodeVMNMX(NodeBlock& bb, Tegra::Shader::Instruction instr);
void WriteTexInstructionFloat(NodeBlock& bb, Tegra::Shader::Instruction instr,
const Node4& components);

View File

@@ -27,8 +27,9 @@ std::pair<Node, s64> FindOperation(const NodeBlock& code, s64 cursor,
if (const auto conditional = std::get_if<ConditionalNode>(&*node)) {
const auto& conditional_code = conditional->GetCode();
auto [found, internal_cursor] = FindOperation(
auto result = FindOperation(
conditional_code, static_cast<s64>(conditional_code.size() - 1), operation_code);
auto& found = result.first;
if (found) {
return {std::move(found), cursor};
}
@@ -186,8 +187,8 @@ std::tuple<Node, u32, u32> ShaderIR::TrackCbuf(Node tracked, const NodeBlock& co
std::optional<u32> ShaderIR::TrackImmediate(Node tracked, const NodeBlock& code, s64 cursor) const {
// Reduce the cursor in one to avoid infinite loops when the instruction sets the same register
// that it uses as operand
const auto [found, found_cursor] =
TrackRegister(&std::get<GprNode>(*tracked), code, cursor - 1);
const auto result = TrackRegister(&std::get<GprNode>(*tracked), code, cursor - 1);
const auto& found = result.first;
if (!found) {
return {};
}

View File

@@ -167,7 +167,6 @@ SurfaceParams SurfaceParams::CreateForImage(const FormatLookupTable& lookup_tabl
SurfaceParams SurfaceParams::CreateForDepthBuffer(Core::System& system) {
const auto& regs = system.GPU().Maxwell3D().regs;
regs.zeta_width, regs.zeta_height, regs.zeta.format, regs.zeta.memory_layout.type;
SurfaceParams params;
params.is_tiled = regs.zeta.memory_layout.type ==
Tegra::Engines::Maxwell3D::Regs::InvMemoryLayout::BlockLinear;

View File

@@ -108,7 +108,7 @@ public:
}
const auto params{SurfaceParams::CreateForTexture(format_lookup_table, tic, entry)};
const auto [surface, view] = GetSurface(gpu_addr, *cpu_addr, params, true, false);
const auto [surface, view] = GetSurface(gpu_addr, *cpu_addr, params, false);
if (guard_samplers) {
sampled_textures.push_back(surface);
}
@@ -128,7 +128,7 @@ public:
return GetNullSurface(SurfaceParams::ExpectedTarget(entry));
}
const auto params{SurfaceParams::CreateForImage(format_lookup_table, tic, entry)};
const auto [surface, view] = GetSurface(gpu_addr, *cpu_addr, params, true, false);
const auto [surface, view] = GetSurface(gpu_addr, *cpu_addr, params, false);
if (guard_samplers) {
sampled_textures.push_back(surface);
}
@@ -143,7 +143,7 @@ public:
return any_rt;
}
TView GetDepthBufferSurface(bool preserve_contents) {
TView GetDepthBufferSurface() {
std::lock_guard lock{mutex};
auto& maxwell3d = system.GPU().Maxwell3D();
if (!maxwell3d.dirty.flags[VideoCommon::Dirty::ZetaBuffer]) {
@@ -164,7 +164,7 @@ public:
return {};
}
const auto depth_params{SurfaceParams::CreateForDepthBuffer(system)};
auto surface_view = GetSurface(gpu_addr, *cpu_addr, depth_params, preserve_contents, true);
auto surface_view = GetSurface(gpu_addr, *cpu_addr, depth_params, true);
if (depth_buffer.target)
depth_buffer.target->MarkAsRenderTarget(false, NO_RT);
depth_buffer.target = surface_view.first;
@@ -174,7 +174,7 @@ public:
return surface_view.second;
}
TView GetColorBufferSurface(std::size_t index, bool preserve_contents) {
TView GetColorBufferSurface(std::size_t index) {
std::lock_guard lock{mutex};
ASSERT(index < Tegra::Engines::Maxwell3D::Regs::NumRenderTargets);
auto& maxwell3d = system.GPU().Maxwell3D();
@@ -204,9 +204,8 @@ public:
return {};
}
auto surface_view =
GetSurface(gpu_addr, *cpu_addr, SurfaceParams::CreateForFramebuffer(system, index),
preserve_contents, true);
auto surface_view = GetSurface(gpu_addr, *cpu_addr,
SurfaceParams::CreateForFramebuffer(system, index), true);
if (render_targets[index].target)
render_targets[index].target->MarkAsRenderTarget(false, NO_RT);
render_targets[index].target = surface_view.first;
@@ -260,9 +259,9 @@ public:
const std::optional<VAddr> src_cpu_addr =
system.GPU().MemoryManager().GpuToCpuAddress(src_gpu_addr);
std::pair<TSurface, TView> dst_surface =
GetSurface(dst_gpu_addr, *dst_cpu_addr, dst_params, true, false);
GetSurface(dst_gpu_addr, *dst_cpu_addr, dst_params, false);
std::pair<TSurface, TView> src_surface =
GetSurface(src_gpu_addr, *src_cpu_addr, src_params, true, false);
GetSurface(src_gpu_addr, *src_cpu_addr, src_params, false);
ImageBlit(src_surface.second, dst_surface.second, copy_config);
dst_surface.first->MarkAsModified(true, Tick());
}
@@ -451,22 +450,18 @@ private:
* @param overlaps The overlapping surfaces registered in the cache.
* @param params The parameters for the new surface.
* @param gpu_addr The starting address of the new surface.
* @param preserve_contents Indicates that the new surface should be loaded from memory or left
* blank.
* @param untopological Indicates to the recycler that the texture has no way to match the
* overlaps due to topological reasons.
**/
std::pair<TSurface, TView> RecycleSurface(std::vector<TSurface>& overlaps,
const SurfaceParams& params, const GPUVAddr gpu_addr,
const bool preserve_contents,
const MatchTopologyResult untopological) {
const bool do_load = preserve_contents && Settings::values.use_accurate_gpu_emulation;
for (auto& surface : overlaps) {
Unregister(surface);
}
switch (PickStrategy(overlaps, params, gpu_addr, untopological)) {
case RecycleStrategy::Ignore: {
return InitializeSurface(gpu_addr, params, do_load);
return InitializeSurface(gpu_addr, params, Settings::values.use_accurate_gpu_emulation);
}
case RecycleStrategy::Flush: {
std::sort(overlaps.begin(), overlaps.end(),
@@ -476,7 +471,7 @@ private:
for (auto& surface : overlaps) {
FlushSurface(surface);
}
return InitializeSurface(gpu_addr, params, preserve_contents);
return InitializeSurface(gpu_addr, params);
}
case RecycleStrategy::BufferCopy: {
auto new_surface = GetUncachedSurface(gpu_addr, params);
@@ -485,7 +480,7 @@ private:
}
default: {
UNIMPLEMENTED_MSG("Unimplemented Texture Cache Recycling Strategy!");
return InitializeSurface(gpu_addr, params, do_load);
return InitializeSurface(gpu_addr, params);
}
}
}
@@ -514,7 +509,9 @@ private:
}
const auto& final_params = new_surface->GetSurfaceParams();
if (cr_params.type != final_params.type) {
BufferCopy(current_surface, new_surface);
if (Settings::values.use_accurate_gpu_emulation) {
BufferCopy(current_surface, new_surface);
}
} else {
std::vector<CopyParams> bricks = current_surface->BreakDown(final_params);
for (auto& brick : bricks) {
@@ -621,14 +618,11 @@ private:
* @param params The parameters on the new surface.
* @param gpu_addr The starting address of the new surface.
* @param cache_addr The starting address of the new surface on physical memory.
* @param preserve_contents Indicates that the new surface should be loaded from memory or
* left blank.
*/
std::optional<std::pair<TSurface, TView>> Manage3DSurfaces(std::vector<TSurface>& overlaps,
const SurfaceParams& params,
const GPUVAddr gpu_addr,
const VAddr cpu_addr,
bool preserve_contents) {
const VAddr cpu_addr) {
if (params.target == SurfaceTarget::Texture3D) {
bool failed = false;
if (params.num_levels > 1) {
@@ -653,7 +647,8 @@ private:
break;
}
const u32 offset = static_cast<u32>(surface->GetCpuAddr() - cpu_addr);
const auto [x, y, z] = params.GetBlockOffsetXYZ(offset);
const auto offsets = params.GetBlockOffsetXYZ(offset);
const auto z = std::get<2>(offsets);
modified |= surface->IsModified();
const CopyParams copy_params(0, 0, 0, 0, 0, z, 0, 0, params.width, params.height,
1);
@@ -677,7 +672,7 @@ private:
return std::nullopt;
}
Unregister(surface);
return InitializeSurface(gpu_addr, params, preserve_contents);
return InitializeSurface(gpu_addr, params);
}
return std::nullopt;
}
@@ -688,7 +683,7 @@ private:
return {{surface, surface->GetMainView()}};
}
}
return InitializeSurface(gpu_addr, params, preserve_contents);
return InitializeSurface(gpu_addr, params);
}
}
@@ -711,13 +706,10 @@ private:
*
* @param gpu_addr The starting address of the candidate surface.
* @param params The parameters on the candidate surface.
* @param preserve_contents Indicates that the new surface should be loaded from memory or
* left blank.
* @param is_render Whether or not the surface is a render target.
**/
std::pair<TSurface, TView> GetSurface(const GPUVAddr gpu_addr, const VAddr cpu_addr,
const SurfaceParams& params, bool preserve_contents,
bool is_render) {
const SurfaceParams& params, bool is_render) {
// Step 1
// Check Level 1 Cache for a fast structural match. If candidate surface
// matches at certain level we are pretty much done.
@@ -726,8 +718,7 @@ private:
const auto topological_result = current_surface->MatchesTopology(params);
if (topological_result != MatchTopologyResult::FullMatch) {
std::vector<TSurface> overlaps{current_surface};
return RecycleSurface(overlaps, params, gpu_addr, preserve_contents,
topological_result);
return RecycleSurface(overlaps, params, gpu_addr, topological_result);
}
const auto struct_result = current_surface->MatchesStructure(params);
@@ -752,7 +743,7 @@ private:
// If none are found, we are done. we just load the surface and create it.
if (overlaps.empty()) {
return InitializeSurface(gpu_addr, params, preserve_contents);
return InitializeSurface(gpu_addr, params);
}
// Step 3
@@ -762,15 +753,13 @@ private:
for (const auto& surface : overlaps) {
const auto topological_result = surface->MatchesTopology(params);
if (topological_result != MatchTopologyResult::FullMatch) {
return RecycleSurface(overlaps, params, gpu_addr, preserve_contents,
topological_result);
return RecycleSurface(overlaps, params, gpu_addr, topological_result);
}
}
// Check if it's a 3D texture
if (params.block_depth > 0) {
auto surface =
Manage3DSurfaces(overlaps, params, gpu_addr, cpu_addr, preserve_contents);
auto surface = Manage3DSurfaces(overlaps, params, gpu_addr, cpu_addr);
if (surface) {
return *surface;
}
@@ -790,8 +779,7 @@ private:
return *view;
}
}
return RecycleSurface(overlaps, params, gpu_addr, preserve_contents,
MatchTopologyResult::FullMatch);
return RecycleSurface(overlaps, params, gpu_addr, MatchTopologyResult::FullMatch);
}
// Now we check if the candidate is a mipmap/layer of the overlap
std::optional<TView> view =
@@ -815,7 +803,7 @@ private:
pair.first->EmplaceView(params, gpu_addr, candidate_size);
if (mirage_view)
return {pair.first, *mirage_view};
return RecycleSurface(overlaps, params, gpu_addr, preserve_contents,
return RecycleSurface(overlaps, params, gpu_addr,
MatchTopologyResult::FullMatch);
}
return {current_surface, *view};
@@ -831,8 +819,7 @@ private:
}
}
// We failed all the tests, recycle the overlaps into a new texture.
return RecycleSurface(overlaps, params, gpu_addr, preserve_contents,
MatchTopologyResult::FullMatch);
return RecycleSurface(overlaps, params, gpu_addr, MatchTopologyResult::FullMatch);
}
/**
@@ -990,10 +977,10 @@ private:
}
std::pair<TSurface, TView> InitializeSurface(GPUVAddr gpu_addr, const SurfaceParams& params,
bool preserve_contents) {
bool do_load = true) {
auto new_surface{GetUncachedSurface(gpu_addr, params)};
Register(new_surface);
if (preserve_contents) {
if (do_load) {
LoadSurface(new_surface);
}
return {new_surface, new_surface->GetMainView()};

View File

@@ -20,6 +20,8 @@
#include <cstring>
#include <vector>
#include <boost/container/static_vector.hpp>
#include "common/common_types.h"
#include "video_core/textures/astc.h"
@@ -39,25 +41,25 @@ constexpr u32 Popcnt(u32 n) {
class InputBitStream {
public:
explicit InputBitStream(const u8* ptr, std::size_t start_offset = 0)
: m_CurByte(ptr), m_NextBit(start_offset % 8) {}
constexpr explicit InputBitStream(const u8* ptr, std::size_t start_offset = 0)
: cur_byte{ptr}, next_bit{start_offset % 8} {}
std::size_t GetBitsRead() const {
return m_BitsRead;
constexpr std::size_t GetBitsRead() const {
return bits_read;
}
u32 ReadBit() {
u32 bit = *m_CurByte >> m_NextBit++;
while (m_NextBit >= 8) {
m_NextBit -= 8;
m_CurByte++;
constexpr bool ReadBit() {
const bool bit = (*cur_byte >> next_bit++) & 1;
while (next_bit >= 8) {
next_bit -= 8;
cur_byte++;
}
m_BitsRead++;
return bit & 1;
bits_read++;
return bit;
}
u32 ReadBits(std::size_t nBits) {
constexpr u32 ReadBits(std::size_t nBits) {
u32 ret = 0;
for (std::size_t i = 0; i < nBits; ++i) {
ret |= (ReadBit() & 1) << i;
@@ -66,7 +68,7 @@ public:
}
template <std::size_t nBits>
u32 ReadBits() {
constexpr u32 ReadBits() {
u32 ret = 0;
for (std::size_t i = 0; i < nBits; ++i) {
ret |= (ReadBit() & 1) << i;
@@ -75,64 +77,58 @@ public:
}
private:
const u8* m_CurByte;
std::size_t m_NextBit = 0;
std::size_t m_BitsRead = 0;
const u8* cur_byte;
std::size_t next_bit = 0;
std::size_t bits_read = 0;
};
class OutputBitStream {
public:
explicit OutputBitStream(u8* ptr, s32 nBits = 0, s32 start_offset = 0)
: m_NumBits(nBits), m_CurByte(ptr), m_NextBit(start_offset % 8) {}
constexpr explicit OutputBitStream(u8* ptr, std::size_t bits = 0, std::size_t start_offset = 0)
: cur_byte{ptr}, num_bits{bits}, next_bit{start_offset % 8} {}
~OutputBitStream() = default;
s32 GetBitsWritten() const {
return m_BitsWritten;
constexpr std::size_t GetBitsWritten() const {
return bits_written;
}
void WriteBitsR(u32 val, u32 nBits) {
constexpr void WriteBitsR(u32 val, u32 nBits) {
for (u32 i = 0; i < nBits; i++) {
WriteBit((val >> (nBits - i - 1)) & 1);
}
}
void WriteBits(u32 val, u32 nBits) {
constexpr void WriteBits(u32 val, u32 nBits) {
for (u32 i = 0; i < nBits; i++) {
WriteBit((val >> i) & 1);
}
}
private:
void WriteBit(s32 b) {
if (done)
constexpr void WriteBit(bool b) {
if (bits_written >= num_bits) {
return;
}
const u32 mask = 1 << m_NextBit++;
const u32 mask = 1 << next_bit++;
// clear the bit
*m_CurByte &= static_cast<u8>(~mask);
*cur_byte &= static_cast<u8>(~mask);
// Write the bit, if necessary
if (b)
*m_CurByte |= static_cast<u8>(mask);
*cur_byte |= static_cast<u8>(mask);
// Next byte?
if (m_NextBit >= 8) {
m_CurByte += 1;
m_NextBit = 0;
if (next_bit >= 8) {
cur_byte += 1;
next_bit = 0;
}
done = done || ++m_BitsWritten >= m_NumBits;
}
s32 m_BitsWritten = 0;
const s32 m_NumBits;
u8* m_CurByte;
s32 m_NextBit = 0;
bool done = false;
u8* cur_byte;
std::size_t num_bits;
std::size_t bits_written = 0;
std::size_t next_bit = 0;
};
template <typename IntType>
@@ -195,9 +191,13 @@ struct IntegerEncodedValue {
u32 trit_value;
};
};
using IntegerEncodedVector = boost::container::static_vector<
IntegerEncodedValue, 64,
boost::container::static_vector_options<
boost::container::inplace_alignment<alignof(IntegerEncodedValue)>,
boost::container::throw_on_overflow<false>>::type>;
static void DecodeTritBlock(InputBitStream& bits, std::vector<IntegerEncodedValue>& result,
u32 nBitsPerValue) {
static void DecodeTritBlock(InputBitStream& bits, IntegerEncodedVector& result, u32 nBitsPerValue) {
// Implement the algorithm in section C.2.12
u32 m[5];
u32 t[5];
@@ -255,7 +255,7 @@ static void DecodeTritBlock(InputBitStream& bits, std::vector<IntegerEncodedValu
}
}
static void DecodeQus32Block(InputBitStream& bits, std::vector<IntegerEncodedValue>& result,
static void DecodeQus32Block(InputBitStream& bits, IntegerEncodedVector& result,
u32 nBitsPerValue) {
// Implement the algorithm in section C.2.12
u32 m[3];
@@ -343,8 +343,8 @@ static constexpr std::array EncodingsValues = MakeEncodedValues();
// Fills result with the values that are encoded in the given
// bitstream. We must know beforehand what the maximum possible
// value is, and how many values we're decoding.
static void DecodeIntegerSequence(std::vector<IntegerEncodedValue>& result, InputBitStream& bits,
u32 maxRange, u32 nValues) {
static void DecodeIntegerSequence(IntegerEncodedVector& result, InputBitStream& bits, u32 maxRange,
u32 nValues) {
// Determine encoding parameters
IntegerEncodedValue val = EncodingsValues[maxRange];
@@ -634,12 +634,14 @@ static void FillError(u32* outBuf, u32 blockWidth, u32 blockHeight) {
// Replicates low numBits such that [(toBit - 1):(toBit - 1 - fromBit)]
// is the same as [(numBits - 1):0] and repeats all the way down.
template <typename IntType>
static IntType Replicate(IntType val, u32 numBits, u32 toBit) {
if (numBits == 0)
static constexpr IntType Replicate(IntType val, u32 numBits, u32 toBit) {
if (numBits == 0) {
return 0;
if (toBit == 0)
}
if (toBit == 0) {
return 0;
IntType v = val & static_cast<IntType>((1 << numBits) - 1);
}
const IntType v = val & static_cast<IntType>((1 << numBits) - 1);
IntType res = v;
u32 reslen = numBits;
while (reslen < toBit) {
@@ -656,6 +658,89 @@ static IntType Replicate(IntType val, u32 numBits, u32 toBit) {
return res;
}
static constexpr std::size_t NumReplicateEntries(u32 num_bits) {
return std::size_t(1) << num_bits;
}
template <typename IntType, u32 num_bits, u32 to_bit>
static constexpr auto MakeReplicateTable() {
std::array<IntType, NumReplicateEntries(num_bits)> table{};
for (IntType value = 0; value < static_cast<IntType>(std::size(table)); ++value) {
table[value] = Replicate(value, num_bits, to_bit);
}
return table;
}
static constexpr auto REPLICATE_BYTE_TO_16_TABLE = MakeReplicateTable<u32, 8, 16>();
static constexpr u32 ReplicateByteTo16(std::size_t value) {
return REPLICATE_BYTE_TO_16_TABLE[value];
}
static constexpr auto REPLICATE_BIT_TO_7_TABLE = MakeReplicateTable<u32, 1, 7>();
static constexpr u32 ReplicateBitTo7(std::size_t value) {
return REPLICATE_BIT_TO_7_TABLE[value];
}
static constexpr auto REPLICATE_BIT_TO_9_TABLE = MakeReplicateTable<u32, 1, 9>();
static constexpr u32 ReplicateBitTo9(std::size_t value) {
return REPLICATE_BIT_TO_9_TABLE[value];
}
static constexpr auto REPLICATE_1_BIT_TO_8_TABLE = MakeReplicateTable<u32, 1, 8>();
static constexpr auto REPLICATE_2_BIT_TO_8_TABLE = MakeReplicateTable<u32, 2, 8>();
static constexpr auto REPLICATE_3_BIT_TO_8_TABLE = MakeReplicateTable<u32, 3, 8>();
static constexpr auto REPLICATE_4_BIT_TO_8_TABLE = MakeReplicateTable<u32, 4, 8>();
static constexpr auto REPLICATE_5_BIT_TO_8_TABLE = MakeReplicateTable<u32, 5, 8>();
static constexpr auto REPLICATE_6_BIT_TO_8_TABLE = MakeReplicateTable<u32, 6, 8>();
static constexpr auto REPLICATE_7_BIT_TO_8_TABLE = MakeReplicateTable<u32, 7, 8>();
static constexpr auto REPLICATE_8_BIT_TO_8_TABLE = MakeReplicateTable<u32, 8, 8>();
/// Use a precompiled table with the most common usages, if it's not in the expected range, fallback
/// to the runtime implementation
static constexpr u32 FastReplicateTo8(u32 value, u32 num_bits) {
switch (num_bits) {
case 1:
return REPLICATE_1_BIT_TO_8_TABLE[value];
case 2:
return REPLICATE_2_BIT_TO_8_TABLE[value];
case 3:
return REPLICATE_3_BIT_TO_8_TABLE[value];
case 4:
return REPLICATE_4_BIT_TO_8_TABLE[value];
case 5:
return REPLICATE_5_BIT_TO_8_TABLE[value];
case 6:
return REPLICATE_6_BIT_TO_8_TABLE[value];
case 7:
return REPLICATE_7_BIT_TO_8_TABLE[value];
case 8:
return REPLICATE_8_BIT_TO_8_TABLE[value];
default:
return Replicate(value, num_bits, 8);
}
}
static constexpr auto REPLICATE_1_BIT_TO_6_TABLE = MakeReplicateTable<u32, 1, 6>();
static constexpr auto REPLICATE_2_BIT_TO_6_TABLE = MakeReplicateTable<u32, 2, 6>();
static constexpr auto REPLICATE_3_BIT_TO_6_TABLE = MakeReplicateTable<u32, 3, 6>();
static constexpr auto REPLICATE_4_BIT_TO_6_TABLE = MakeReplicateTable<u32, 4, 6>();
static constexpr auto REPLICATE_5_BIT_TO_6_TABLE = MakeReplicateTable<u32, 5, 6>();
static constexpr u32 FastReplicateTo6(u32 value, u32 num_bits) {
switch (num_bits) {
case 1:
return REPLICATE_1_BIT_TO_6_TABLE[value];
case 2:
return REPLICATE_2_BIT_TO_6_TABLE[value];
case 3:
return REPLICATE_3_BIT_TO_6_TABLE[value];
case 4:
return REPLICATE_4_BIT_TO_6_TABLE[value];
case 5:
return REPLICATE_5_BIT_TO_6_TABLE[value];
default:
return Replicate(value, num_bits, 6);
}
}
class Pixel {
protected:
using ChannelType = s16;
@@ -674,10 +759,10 @@ public:
// significant bits when going from larger to smaller bit depth
// or by repeating the most significant bits when going from
// smaller to larger bit depths.
void ChangeBitDepth(const u8 (&depth)[4]) {
void ChangeBitDepth() {
for (u32 i = 0; i < 4; i++) {
Component(i) = ChangeBitDepth(Component(i), m_BitDepth[i], depth[i]);
m_BitDepth[i] = depth[i];
Component(i) = ChangeBitDepth(Component(i), m_BitDepth[i]);
m_BitDepth[i] = 8;
}
}
@@ -689,28 +774,23 @@ public:
// Changes the bit depth of a single component. See the comment
// above for how we do this.
static ChannelType ChangeBitDepth(Pixel::ChannelType val, u8 oldDepth, u8 newDepth) {
assert(newDepth <= 8);
static ChannelType ChangeBitDepth(Pixel::ChannelType val, u8 oldDepth) {
assert(oldDepth <= 8);
if (oldDepth == newDepth) {
if (oldDepth == 8) {
// Do nothing
return val;
} else if (oldDepth == 0 && newDepth != 0) {
return static_cast<ChannelType>((1 << newDepth) - 1);
} else if (newDepth > oldDepth) {
return Replicate(val, oldDepth, newDepth);
} else if (oldDepth == 0) {
return static_cast<ChannelType>((1 << 8) - 1);
} else if (8 > oldDepth) {
return static_cast<ChannelType>(FastReplicateTo8(static_cast<u32>(val), oldDepth));
} else {
// oldDepth > newDepth
if (newDepth == 0) {
return 0xFF;
} else {
u8 bitsWasted = static_cast<u8>(oldDepth - newDepth);
u16 v = static_cast<u16>(val);
v = static_cast<u16>((v + (1 << (bitsWasted - 1))) >> bitsWasted);
v = ::std::min<u16>(::std::max<u16>(0, v), static_cast<u16>((1 << newDepth) - 1));
return static_cast<u8>(v);
}
const u8 bitsWasted = static_cast<u8>(oldDepth - 8);
u16 v = static_cast<u16>(val);
v = static_cast<u16>((v + (1 << (bitsWasted - 1))) >> bitsWasted);
v = ::std::min<u16>(::std::max<u16>(0, v), static_cast<u16>((1 << 8) - 1));
return static_cast<u8>(v);
}
assert(false && "We shouldn't get here.");
@@ -760,8 +840,7 @@ public:
// up in the most-significant byte.
u32 Pack() const {
Pixel eightBit(*this);
const u8 eightBitDepth[4] = {8, 8, 8, 8};
eightBit.ChangeBitDepth(eightBitDepth);
eightBit.ChangeBitDepth();
u32 r = 0;
r |= eightBit.A();
@@ -816,8 +895,7 @@ static void DecodeColorValues(u32* out, u8* data, const u32* modes, const u32 nP
}
// We now have enough to decode our integer sequence.
std::vector<IntegerEncodedValue> decodedColorValues;
decodedColorValues.reserve(32);
IntegerEncodedVector decodedColorValues;
InputBitStream colorStream(data);
DecodeIntegerSequence(decodedColorValues, colorStream, range, nValues);
@@ -839,12 +917,12 @@ static void DecodeColorValues(u32* out, u8* data, const u32* modes, const u32 nP
u32 A = 0, B = 0, C = 0, D = 0;
// A is just the lsb replicated 9 times.
A = Replicate(bitval & 1, 1, 9);
A = ReplicateBitTo9(bitval & 1);
switch (val.encoding) {
// Replicate bits
case IntegerEncoding::JustBits:
out[outIdx++] = Replicate(bitval, bitlen, 8);
out[outIdx++] = FastReplicateTo8(bitval, bitlen);
break;
// Use algorithm in C.2.13
@@ -962,13 +1040,13 @@ static u32 UnquantizeTexelWeight(const IntegerEncodedValue& val) {
u32 bitval = val.bit_value;
u32 bitlen = val.num_bits;
u32 A = Replicate(bitval & 1, 1, 7);
u32 A = ReplicateBitTo7(bitval & 1);
u32 B = 0, C = 0, D = 0;
u32 result = 0;
switch (val.encoding) {
case IntegerEncoding::JustBits:
result = Replicate(bitval, bitlen, 6);
result = FastReplicateTo6(bitval, bitlen);
break;
case IntegerEncoding::Trit: {
@@ -1047,7 +1125,7 @@ static u32 UnquantizeTexelWeight(const IntegerEncodedValue& val) {
return result;
}
static void UnquantizeTexelWeights(u32 out[2][144], const std::vector<IntegerEncodedValue>& weights,
static void UnquantizeTexelWeights(u32 out[2][144], const IntegerEncodedVector& weights,
const TexelWeightParams& params, const u32 blockWidth,
const u32 blockHeight) {
u32 weightIdx = 0;
@@ -1545,8 +1623,7 @@ static void DecompressBlock(const u8 inBuf[16], const u32 blockWidth, const u32
static_cast<u8>((1 << (weightParams.GetPackedBitSize() % 8)) - 1);
memset(texelWeightData + clearByteStart, 0, 16 - clearByteStart);
std::vector<IntegerEncodedValue> texelWeightValues;
texelWeightValues.reserve(64);
IntegerEncodedVector texelWeightValues;
InputBitStream weightStream(texelWeightData);
@@ -1568,9 +1645,9 @@ static void DecompressBlock(const u8 inBuf[16], const u32 blockWidth, const u32
Pixel p;
for (u32 c = 0; c < 4; c++) {
u32 C0 = endpos32s[partition][0].Component(c);
C0 = Replicate(C0, 8, 16);
C0 = ReplicateByteTo16(C0);
u32 C1 = endpos32s[partition][1].Component(c);
C1 = Replicate(C1, 8, 16);
C1 = ReplicateByteTo16(C1);
u32 plane = 0;
if (weightParams.m_bDualPlane && (((planeIdx + 1) & 3) == c)) {

View File

@@ -131,6 +131,20 @@ enum class SwizzleSource : u32 {
OneFloat = 7,
};
enum class MsaaMode : u32 {
Msaa1x1 = 0,
Msaa2x1 = 1,
Msaa2x2 = 2,
Msaa4x2 = 3,
Msaa4x2_D3D = 4,
Msaa2x1_D3D = 5,
Msaa4x4 = 6,
Msaa2x2_VC4 = 8,
Msaa2x2_VC12 = 9,
Msaa4x2_VC8 = 10,
Msaa4x2_VC24 = 11,
};
union TextureHandle {
TextureHandle(u32 raw) : raw{raw} {}
@@ -197,6 +211,7 @@ struct TICEntry {
union {
BitField<0, 4, u32> res_min_mip_level;
BitField<4, 4, u32> res_max_mip_level;
BitField<8, 4, MsaaMode> msaa_mode;
BitField<12, 12, u32> min_lod_clamp;
};

View File

@@ -43,7 +43,7 @@ struct Client::Impl {
if (jwt.empty() && !allow_anonymous) {
LOG_ERROR(WebService, "Credentials must be provided for authenticated requests");
return Common::WebResult{Common::WebResult::Code::CredentialsMissing,
"Credentials needed"};
"Credentials needed", ""};
}
auto result = GenericRequest(method, path, data, accept, jwt);
@@ -81,12 +81,12 @@ struct Client::Impl {
cli = std::make_unique<httplib::SSLClient>(parsedUrl.m_Host.c_str(), port);
} else {
LOG_ERROR(WebService, "Bad URL scheme {}", parsedUrl.m_Scheme);
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Bad URL scheme"};
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Bad URL scheme", ""};
}
}
if (cli == nullptr) {
LOG_ERROR(WebService, "Invalid URL {}", host + path);
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Invalid URL"};
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Invalid URL", ""};
}
cli->set_timeout_sec(TIMEOUT_SECONDS);
@@ -118,27 +118,27 @@ struct Client::Impl {
if (!cli->send(request, response)) {
LOG_ERROR(WebService, "{} to {} returned null", method, host + path);
return Common::WebResult{Common::WebResult::Code::LibError, "Null response"};
return Common::WebResult{Common::WebResult::Code::LibError, "Null response", ""};
}
if (response.status >= 400) {
LOG_ERROR(WebService, "{} to {} returned error status code: {}", method, host + path,
response.status);
return Common::WebResult{Common::WebResult::Code::HttpError,
std::to_string(response.status)};
std::to_string(response.status), ""};
}
auto content_type = response.headers.find("content-type");
if (content_type == response.headers.end()) {
LOG_ERROR(WebService, "{} to {} returned no content", method, host + path);
return Common::WebResult{Common::WebResult::Code::WrongContent, ""};
return Common::WebResult{Common::WebResult::Code::WrongContent, "", ""};
}
if (content_type->second.find(accept) == std::string::npos) {
LOG_ERROR(WebService, "{} to {} returned wrong content: {}", method, host + path,
content_type->second);
return Common::WebResult{Common::WebResult::Code::WrongContent, "Wrong content"};
return Common::WebResult{Common::WebResult::Code::WrongContent, "Wrong content", ""};
}
return Common::WebResult{Common::WebResult::Code::Success, "", response.body};
}

View File

@@ -51,7 +51,8 @@ MicroProfileDialog::MicroProfileDialog(QWidget* parent) : QWidget(parent, Qt::Di
setWindowTitle(tr("MicroProfile"));
resize(1000, 600);
// Remove the "?" button from the titlebar and enable the maximize button
setWindowFlags(windowFlags() & ~Qt::WindowContextHelpButtonHint | Qt::WindowMaximizeButtonHint);
setWindowFlags((windowFlags() & ~Qt::WindowContextHelpButtonHint) |
Qt::WindowMaximizeButtonHint);
#if MICROPROFILE_ENABLED

View File

@@ -91,7 +91,8 @@ std::pair<std::vector<u8>, std::string> GetGameListCachedObject(
return generator();
}
if (file1.write(reinterpret_cast<const char*>(icon.data()), icon.size()) != icon.size()) {
if (file1.write(reinterpret_cast<const char*>(icon.data()), icon.size()) !=
s64(icon.size())) {
LOG_ERROR(Frontend, "Failed to write data to cache file.");
return generator();
}

View File

@@ -1019,9 +1019,9 @@ void GMainWindow::BootGame(const QString& filename) {
std::string title_name;
const auto res = Core::System::GetInstance().GetGameName(title_name);
if (res != Loader::ResultStatus::Success) {
const auto [nacp, icon_file] = FileSys::PatchManager(title_id).GetControlMetadata();
if (nacp != nullptr)
title_name = nacp->GetApplicationName();
const auto metadata = FileSys::PatchManager(title_id).GetControlMetadata();
if (metadata.first != nullptr)
title_name = metadata.first->GetApplicationName();
if (title_name.empty())
title_name = FileUtil::GetFilename(filename.toStdString());
@@ -1628,7 +1628,7 @@ void GMainWindow::OnMenuInstallToNAND() {
}
FileSys::InstallResult res;
if (index >= static_cast<size_t>(FileSys::TitleType::Application)) {
if (index >= static_cast<s32>(FileSys::TitleType::Application)) {
res = Core::System::GetInstance()
.GetFileSystemController()
.GetUserNANDContents()