Compare commits

53 Commits

Author SHA1 Message Date
Lioncash
77356731a9 hle_ipc: Remove std::size_t casts where applicable
These were added in the change that enabled -Wextra on linux builds so
as not to introduce interface changes in the same change as a
build-system flag addition.

Now that the flags are enabled, we can freely change the interface to
make these unnecessary.
2020-04-16 22:02:10 -04:00
bunnei
79c1269f0f Merge pull request #3673 from lioncash/extra
CMakeLists: Specify -Wextra on linux builds
2020-04-16 21:12:33 -04:00
Fernando Sahmkow
c81f256111 Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache
buffer_cache: Return handles instead of pointer to handles
2020-04-16 19:58:13 -04:00
bunnei
5a067eda84 Merge pull request #3675 from degasus/linux_shared_libraries
externals: Use shared libraries if possible
2020-04-16 18:17:18 -04:00
Markus Wick
b520978043 externals: Use shared libraries if possible
This is mostly done by pkgconfig.
I've focused on the larger and more stable libraries.
2020-04-16 17:03:17 +02:00
Markus Wick
fedf750e1b externals: Move LibreSSL linking to httplib.
Neither core nor web_services uses OpenSSL or LibreSSL directly.
However, they need to link against them because httplib requires it.
So let's declare this dependency within httplib instead of core and web_services.
2020-04-16 16:46:33 +02:00
Markus Wick
94c2c828a5 input_common: Use the CMake target instead of the variable. 2020-04-16 16:42:59 +02:00
Rodrigo Locatti
db67e017cb Merge pull request #3659 from bunnei/time-calc-standard-user
service: time: Implement CalculateStandardUserSystemClockDifferenceByUser.
2020-04-16 02:51:57 -03:00
ReinUsesLisp
090fd3fefa buffer_cache: Return handles instead of pointer to handles
The original idea of returning pointers is that handles can be moved.
The problem is that the implementation didn't take that into account and made
everything harder to work with. This commit drops pointer to handles and
returns the handles themselves. While it is still true that handles can
be invalidated, this way we get an old handle instead of a dangling
pointer.

This problem can be solved in the future with sparse buffers.
2020-04-16 02:33:34 -03:00
Rodrigo Locatti
a5a2ee8766 Merge pull request #3689 from lioncash/unused-var
decode/shift: Remove unused variable within Shift()
2020-04-16 02:05:54 -03:00
Rodrigo Locatti
d196ce0f71 Merge pull request #3688 from lioncash/nequal
surface_view: Add missing operator!= to ViewParams
2020-04-16 01:39:51 -03:00
Rodrigo Locatti
4209dba1f6 Merge pull request #3680 from lioncash/static
gl_device: Mark stage_swizzle as constexpr
2020-04-16 01:26:23 -03:00
Rodrigo Locatti
60e8de7c95 Merge pull request #3687 from lioncash/constness
surface_base: Make IsInside() a const member function
2020-04-16 01:22:50 -03:00
Rodrigo Locatti
612966399b Merge pull request #3685 from lioncash/copies
control_flow: Make use of std::move in TryInspectAddress()
2020-04-16 01:22:40 -03:00
Lioncash
cd2a12e78f decode/shift: Remove unused variable within Shift()
Removes a redundant variable that is already satisfied by the IsFull()
utility function.
2020-04-16 00:16:06 -04:00
Lioncash
5fbe8785d2 surface_view: Add missing operator!= to ViewParams
Provides logical symmetry to the interface.
2020-04-16 00:03:12 -04:00
Lioncash
d551c910bb surface_base: Make IsInside() a const member function
This doesn't modify internal state, so this can be made const.
2020-04-15 23:59:35 -04:00
bunnei
319df1db77 Merge pull request #3683 from lioncash/docs
video_core: Amend doxygen comment references
2020-04-15 23:54:58 -04:00
Lioncash
72a224d3fc control_flow: Make use of std::move in TryInspectAddress()
Eliminates redundant atomic reference count increments and decrements.
2020-04-15 23:31:22 -04:00
Lioncash
11837e8f13 video_core: Amend doxygen comment references
Fixes broken documentation references.
2020-04-15 22:33:29 -04:00
Lioncash
71fb156611 gl_device: Mark stage_swizzle as constexpr
Previously this was mutable even though it shouldn't be.
2020-04-15 21:59:13 -04:00
Lioncash
1c340c6efa CMakeLists: Specify -Wextra on linux builds
Allows reporting more cases where logic errors may exist, such as
implicit fallthrough cases, etc.

We ignore unused parameters for now, since there are many
cases where this is intentional (virtual interfaces).

While we're at it, we can also tidy up any existing code that causes
warnings. Doing so uncovered a few bugs as well.
2020-04-15 21:33:46 -04:00
Rodrigo Locatti
65cbb122ea Merge pull request #3649 from FernandoS27/3d-fix
Texture Cache: Read current data when flushing a 3D segment.
2020-04-15 17:06:55 -03:00
Fernando Sahmkow
e33196d4e7 Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00
Mat M
4398bdb4c7 Merge pull request #3670 from lioncash/reorder
CMakeLists: Make -Wreorder a compile-time error
2020-04-15 14:40:05 -04:00
Lioncash
213fff67bc CMakeLists: Make -Wreorder a compile-time error
This can result in silent logic bugs within code, and given how often
these kinds of warnings are triggered, they should be flagged at
compile-time so no new code is submitted with them.
2020-04-15 14:14:41 -04:00
Mat M
64b5985f0a Merge pull request #3662 from ReinUsesLisp/constant-attrs
gl_rasterizer: Implement constant vertex attributes
2020-04-15 11:54:50 -04:00
Fernando Sahmkow
6789d88a9c Texture Cache: Read current data when flushing a 3D segment.
This PR corrects flushing of 3D segments when data of other segments is
mixed in; it aims to preserve the data already in place.
2020-04-15 11:46:17 -04:00
Mat M
9208d555b7 Merge pull request #3668 from ReinUsesLisp/vtx-format-16ui
maxwell_to_vk: Add uint16 vertex formats
2020-04-15 11:43:52 -04:00
Mat M
ab72696beb Merge pull request #3656 from ReinUsesLisp/glsl-full-decompile
gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
2020-04-15 03:17:46 -04:00
Mat M
4878d6bb49 Merge pull request #3654 from ReinUsesLisp/fix-fb-attach
gl_texture_cache: Fix layered texture attachment base level
2020-04-15 03:17:18 -04:00
Mat M
50c0a92db8 Merge pull request #3663 from ReinUsesLisp/fcmp-rc
shader/arithmetic: Add FCMP_CR variant
2020-04-15 03:16:56 -04:00
Mat M
13331a3a32 Merge pull request #3664 from ReinUsesLisp/fe3h-black-squares
Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
2020-04-15 03:14:28 -04:00
Mat M
3a759d2352 Merge pull request #3667 from ReinUsesLisp/viewport-trash
vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
2020-04-15 03:10:34 -04:00
ReinUsesLisp
3036067047 maxwell_to_vk: Add uint16 vertex formats 2020-04-15 04:06:30 -03:00
ReinUsesLisp
b4e43c64c8 maxwell_to_vk: Add missing breaks
Avoid unintended fallthroughs.
2020-04-15 04:05:33 -03:00
ReinUsesLisp
0ca456830f vk_blit_screen: Initialize all members in VkPipelineViewportStateCreateInfo
When the dynamic state is specified, pViewports and pScissors are
ignored, quoting the specification:

  pViewports is a pointer to an array of VkViewport structures, defining
  the viewport transforms. If the viewport state is dynamic, this member
  is ignored.

That said, AMD's proprietary driver itself seems to read it regardless of
what the specification says.
2020-04-15 03:30:08 -03:00
Rodrigo Locatti
0b132e8cc1 Merge pull request #3657 from ReinUsesLisp/viewport-zero
vk_rasterizer: Default to 1 viewports with a size of 0
2020-04-15 01:51:17 -03:00
Fernando Sahmkow
daddbeffd1 Texture Cache: Only do buffer copies on accurate GPU. (#3634)
This is a simple optimization as Buffer Copies are mostly used for texture recycling. They are, however, useful when games abuse undefined behavior but most 3D APIs forbid it.
2020-04-14 23:21:00 -04:00
bunnei
eb676c343a service: time: Implement CalculateStandardUserSystemClockDifferenceByUser.
- Used by Animal Crossing: New Horizons.
2020-04-14 22:28:41 -04:00
ReinUsesLisp
fd6371eba7 Revert "gl_shader_decompiler: Implement merges with bitfieldInsert"
This reverts commit 05cf270836.

Apparently the first approach using floats instead of bitfieldInsert
worked better for Fire Emblem: Three Houses. Reverting to get that
behavior back.
2020-04-14 21:24:33 -03:00
ReinUsesLisp
fefe7f18f9 shader/arithmetic: Add FCMP_CR variant
Adds another variant of FCMP.
2020-04-14 19:11:04 -03:00
Zach Hilman
e366b4ee1f Merge pull request #3660 from bunnei/friend-blocked-users
service: friend: Stub IFriendService::GetBlockedUserListIds.
2020-04-14 16:59:46 -04:00
Zach Hilman
8040f6d544 Merge pull request #3661 from bunnei/patch-manager-fix
file_sys: patch_manager: Return early when there are no layers to apply.
2020-04-14 16:59:25 -04:00
ReinUsesLisp
6dfcabc800 gl_rasterizer: Implement constant vertex attributes
Credits go to gdkchan from Ryujinx for finding that constant attributes are
used in retail games.
2020-04-14 17:58:53 -03:00
bunnei
fc35803f91 file_sys: patch_manager: Return early when there are no layers to apply. 2020-04-14 16:25:55 -04:00
bunnei
598740f1dd service: friend: Stub IFriendService::GetBlockedUserListIds.
- This is safe to stub, as there should be no adverse consequences from reporting no blocked users.
2020-04-14 16:20:51 -04:00
ReinUsesLisp
37e5c4fa7c vk_rasterizer: Default to 1 viewports with a size of 0
Silence validation layer errors.
2020-04-14 04:44:34 -03:00
ReinUsesLisp
453d7419d9 gl_shader_cache: Use CompileDepth::FullDecompile on GLSL
In my testing on a Splatoon 2 shader that takes 3800ms on average to
compile, changing to FullDecompile reduces that to 900ms on average.

The shader decoder will automatically fall back to a more naive method if
it can't use full decompile.
2020-04-14 01:34:20 -03:00
ReinUsesLisp
21dc842171 gl_texture_cache: Fix layered texture attachment base level
The base level is already included in the texture view. If we specify
the base level in the texture again, this will end up in the incorrect
level and potentially out of bounds.
2020-04-13 18:24:56 -03:00
ReinUsesLisp
3185245845 shader/memory: Implement RED.E.ADD
Implements a reduction operation. It's an atomic operation that doesn't
return a value.

This commit introduces another primitive because some shading languages
might have a primitive for reduction operations.
2020-04-06 02:24:47 -03:00
ReinUsesLisp
fd0a2b5151 shader/memory: Add "using std::move" 2020-04-06 02:18:14 -03:00
ReinUsesLisp
79970c9174 shader/memory: Minor fixes in ATOM 2020-04-06 00:54:22 -03:00
80 changed files with 581 additions and 813 deletions

View File

@@ -3,13 +3,27 @@
set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${PROJECT_SOURCE_DIR}/CMakeModules)
include(DownloadExternals)
# pkgconfig -- it is used to find shared libraries without cmake modules on linux systems
find_package(PkgConfig)
if (NOT PkgConfig_FOUND)
function(pkg_check_modules)
# STUB
endfunction()
endif()
# Catch
add_library(catch-single-include INTERFACE)
target_include_directories(catch-single-include INTERFACE catch/single_include)
# libfmt
add_subdirectory(fmt)
add_library(fmt::fmt ALIAS fmt)
pkg_check_modules(FMT IMPORTED_TARGET GLOBAL fmt>=6.1.0)
if (FMT_FOUND)
add_library(fmt::fmt ALIAS PkgConfig::FMT)
else()
message(STATUS "fmt 6.1.0 or newer not found, falling back to externals")
add_subdirectory(fmt)
add_library(fmt::fmt ALIAS fmt)
endif()
# Dynarmic
if (ARCHITECTURE_x86_64)
@@ -30,9 +44,15 @@ add_subdirectory(glad)
add_subdirectory(inih)
# lz4
set(LZ4_BUNDLED_MODE ON)
add_subdirectory(lz4/contrib/cmake_unofficial EXCLUDE_FROM_ALL)
target_include_directories(lz4_static INTERFACE ./lz4/lib)
pkg_check_modules(LIBLZ4 IMPORTED_TARGET GLOBAL liblz4>=1.8.0)
if (LIBLZ4_FOUND)
add_library(lz4_static ALIAS PkgConfig::LIBLZ4)
else()
message(STATUS "liblz4 1.8.0 or newer not found, falling back to externals")
set(LZ4_BUNDLED_MODE ON)
add_subdirectory(lz4/contrib/cmake_unofficial EXCLUDE_FROM_ALL)
target_include_directories(lz4_static INTERFACE ./lz4/lib)
endif()
# mbedtls
add_subdirectory(mbedtls EXCLUDE_FROM_ALL)
@@ -47,15 +67,27 @@ add_library(unicorn-headers INTERFACE)
target_include_directories(unicorn-headers INTERFACE ./unicorn/include)
# Zstandard
add_subdirectory(zstd/build/cmake EXCLUDE_FROM_ALL)
target_include_directories(libzstd_static INTERFACE ./zstd/lib)
pkg_check_modules(LIBZSTD IMPORTED_TARGET GLOBAL libzstd>=1.3.8)
if (LIBZSTD_FOUND)
add_library(libzstd_static ALIAS PkgConfig::LIBZSTD)
else()
message(STATUS "libzstd 1.3.8 or newer not found, falling back to externals")
add_subdirectory(zstd/build/cmake EXCLUDE_FROM_ALL)
target_include_directories(libzstd_static INTERFACE ./zstd/lib)
endif()
# SoundTouch
add_subdirectory(soundtouch)
# Opus
add_subdirectory(opus)
target_include_directories(opus INTERFACE ./opus/include)
pkg_check_modules(OPUS IMPORTED_TARGET GLOBAL opus>=1.3.1)
if (OPUS_FOUND)
add_library(opus ALIAS PkgConfig::OPUS)
else()
message(STATUS "opus 1.3.1 or newer not found, falling back to externals")
add_subdirectory(opus)
target_include_directories(opus INTERFACE ./opus/include)
endif()
# Cubeb
if(ENABLE_CUBEB)
@@ -75,18 +107,35 @@ if (ENABLE_VULKAN)
endif()
# zlib
add_subdirectory(zlib EXCLUDE_FROM_ALL)
set(ZLIB_LIBRARIES z)
find_package(ZLIB 1.2.11)
if (NOT ZLIB_FOUND)
message(STATUS "zlib 1.2.11 or newer not found, falling back to externals")
add_subdirectory(zlib EXCLUDE_FROM_ALL)
set(ZLIB_LIBRARIES z)
endif()
# libzip
add_subdirectory(libzip EXCLUDE_FROM_ALL)
pkg_check_modules(LIBZIP IMPORTED_TARGET GLOBAL libzip>=1.5.3)
if (LIBZIP_FOUND)
add_library(zip ALIAS PkgConfig::LIBZIP)
else()
message(STATUS "libzip 1.5.3 or newer not found, falling back to externals")
add_subdirectory(libzip EXCLUDE_FROM_ALL)
endif()
if (ENABLE_WEB_SERVICE)
# LibreSSL
set(LIBRESSL_SKIP_INSTALL ON CACHE BOOL "")
add_subdirectory(libressl EXCLUDE_FROM_ALL)
target_include_directories(ssl INTERFACE ./libressl/include)
target_compile_definitions(ssl PRIVATE -DHAVE_INET_NTOP)
find_package(OpenSSL COMPONENTS Crypto SSL)
if (NOT OpenSSL_FOUND)
message(STATUS "OpenSSL not found, falling back to externals")
set(LIBRESSL_SKIP_INSTALL ON CACHE BOOL "")
add_subdirectory(libressl EXCLUDE_FROM_ALL)
target_include_directories(ssl INTERFACE ./libressl/include)
target_compile_definitions(ssl PRIVATE -DHAVE_INET_NTOP)
get_directory_property(OPENSSL_LIBRARIES
DIRECTORY libressl
DEFINITION OPENSSL_LIBS)
endif()
# lurlparser
add_subdirectory(lurlparser EXCLUDE_FROM_ALL)
@@ -94,6 +143,8 @@ if (ENABLE_WEB_SERVICE)
# httplib
add_library(httplib INTERFACE)
target_include_directories(httplib INTERFACE ./httplib)
target_compile_definitions(httplib INTERFACE -DCPPHTTPLIB_OPENSSL_SUPPORT)
target_link_libraries(httplib INTERFACE ${OPENSSL_LIBRARIES})
# JSON
add_library(json-headers INTERFACE)

View File

@@ -53,7 +53,11 @@ if (MSVC)
else()
add_compile_options(
-Wall
-Werror=implicit-fallthrough
-Werror=reorder
-Wextra
-Wno-attributes
-Wno-unused-parameter
)
if (APPLE AND CMAKE_CXX_COMPILER_ID STREQUAL Clang)

View File

@@ -591,11 +591,8 @@ target_link_libraries(core PUBLIC common PRIVATE audio_core video_core)
target_link_libraries(core PUBLIC Boost::boost PRIVATE fmt json-headers mbedtls opus unicorn)
if (YUZU_ENABLE_BOXCAT)
get_directory_property(OPENSSL_LIBS
DIRECTORY ${PROJECT_SOURCE_DIR}/externals/libressl
DEFINITION OPENSSL_LIBS)
target_compile_definitions(core PRIVATE -DCPPHTTPLIB_OPENSSL_SUPPORT -DYUZU_ENABLE_BOXCAT)
target_link_libraries(core PRIVATE httplib json-headers ${OPENSSL_LIBS} zip)
target_compile_definitions(core PRIVATE -DYUZU_ENABLE_BOXCAT)
target_link_libraries(core PRIVATE httplib json-headers zip)
endif()
if (ENABLE_WEB_SERVICE)

View File

@@ -123,7 +123,7 @@ Symbols GetSymbols(VAddr text_offset, Memory::Memory& memory) {
std::optional<std::string> GetSymbolName(const Symbols& symbols, VAddr func_address) {
const auto iter =
std::find_if(symbols.begin(), symbols.end(), [func_address](const auto& pair) {
const auto& [symbol, name] = pair;
const auto& symbol = pair.first;
const auto end_address = symbol.value + symbol.size;
return func_address >= symbol.value && func_address < end_address;
});
@@ -146,7 +146,7 @@ std::vector<ARM_Interface::BacktraceEntry> ARM_Interface::GetBacktrace() const {
auto fp = GetReg(29);
auto lr = GetReg(30);
while (true) {
out.push_back({"", 0, lr, 0});
out.push_back({"", 0, lr, 0, ""});
if (!fp) {
break;
}
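
The first hunk above replaces a structured binding, whose second name was never read, with a direct access to `pair.first`. A minimal sketch of the same lookup follows; `Symbol`, `Symbols` and `FindSymbolName` are hypothetical stand-ins, not the project's actual types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the project's symbol types.
struct Symbol {
    std::uint64_t value; // start address of the symbol
    std::uint64_t size;  // length of the symbol in bytes
};
using Symbols = std::vector<std::pair<Symbol, std::string>>;

std::optional<std::string> FindSymbolName(const Symbols& symbols, std::uint64_t address) {
    const auto iter = std::find_if(symbols.begin(), symbols.end(), [address](const auto& pair) {
        // Only the key is needed, so bind it alone instead of introducing an
        // unused name through `const auto& [symbol, name] = pair;`.
        const auto& symbol = pair.first;
        const auto end_address = symbol.value + symbol.size;
        return address >= symbol.value && address < end_address;
    });
    if (iter == symbols.end()) {
        return std::nullopt;
    }
    return iter->second;
}
```

The behavior is the same as with the structured binding; the only difference is that no unused name is introduced.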

View File

@@ -348,6 +348,12 @@ static void ApplyLayeredFS(VirtualFile& romfs, u64 title_id, ContentRecordType t
if (ext_dir != nullptr)
layers_ext.push_back(std::move(ext_dir));
}
// When there are no layers to apply, return early as there is no need to rebuild the RomFS
if (layers.empty() && layers_ext.empty()) {
return;
}
layers.push_back(std::move(extracted));
auto layered = LayeredVfsDirectory::MakeLayeredDirectory(std::move(layers));
@@ -434,7 +440,8 @@ std::map<std::string, std::string, std::less<>> PatchManager::GetPatchVersionNam
// Game Updates
const auto update_tid = GetUpdateTitleID(title_id);
PatchManager update{update_tid};
auto [nacp, discard_icon_file] = update.GetControlMetadata();
const auto metadata = update.GetControlMetadata();
const auto& nacp = metadata.first;
const auto update_disabled =
std::find(disabled.cbegin(), disabled.cend(), "Update") != disabled.cend();

View File

@@ -591,14 +591,18 @@ InstallResult RegisteredCache::InstallEntry(const NSP& nsp, bool overwrite_if_ex
InstallResult RegisteredCache::InstallEntry(const NCA& nca, TitleType type,
bool overwrite_if_exists, const VfsCopyFunction& copy) {
CNMTHeader header{
nca.GetTitleId(), ///< Title ID
0, ///< Ignore/Default title version
type, ///< Type
{}, ///< Padding
0x10, ///< Default table offset
1, ///< 1 Content Entry
0, ///< No Meta Entries
{}, ///< Padding
nca.GetTitleId(), // Title ID
0, // Ignore/Default title version
type, // Type
{}, // Padding
0x10, // Default table offset
1, // 1 Content Entry
0, // No Meta Entries
{}, // Padding
{}, // Reserved 1
0, // Is committed
0, // Required download system version
{}, // Reserved 2
};
OptionalHeader opt_header{0, 0};
ContentRecord c_rec{{}, {}, {}, GetCRTypeFromNCAType(nca.GetType()), {}};
@@ -848,7 +852,8 @@ VirtualFile ManualContentProvider::GetEntryUnparsed(u64 title_id, ContentRecordT
VirtualFile ManualContentProvider::GetEntryRaw(u64 title_id, ContentRecordType type) const {
const auto iter =
std::find_if(entries.begin(), entries.end(), [title_id, type](const auto& entry) {
const auto [title_type, content_type, e_title_id] = entry.first;
const auto content_type = std::get<1>(entry.first);
const auto e_title_id = std::get<2>(entry.first);
return content_type == type && e_title_id == title_id;
});
if (iter == entries.end())

View File

@@ -42,11 +42,11 @@ VirtualDir ExtractZIP(VirtualFile file) {
continue;
if (name.back() != '/') {
std::unique_ptr<zip_file_t, decltype(&zip_fclose)> file{
std::unique_ptr<zip_file_t, decltype(&zip_fclose)> file2{
zip_fopen_index(zip.get(), i, 0), zip_fclose};
std::vector<u8> buf(stat.size);
if (zip_fread(file.get(), buf.data(), buf.size()) != buf.size())
if (zip_fread(file2.get(), buf.data(), buf.size()) != s64(buf.size()))
return nullptr;
const auto parts = FileUtil::SplitPathComponents(stat.name);

View File

@@ -25,7 +25,7 @@ FramebufferLayout DefaultFrameLayout(u32 width, u32 height) {
ASSERT(height > 0);
// The drawing code needs at least somewhat valid values for both screens
// so just calculate them both even if the other isn't showing.
FramebufferLayout res{width, height};
FramebufferLayout res{width, height, false, {}};
const float window_aspect_ratio = static_cast<float>(height) / width;
const float emulation_aspect_ratio = EmulationAspectRatio(

View File

@@ -282,7 +282,7 @@ ResultCode HLERequestContext::WriteToOutgoingCommandBuffer(Thread& thread) {
return RESULT_SUCCESS;
}
std::vector<u8> HLERequestContext::ReadBuffer(int buffer_index) const {
std::vector<u8> HLERequestContext::ReadBuffer(std::size_t buffer_index) const {
std::vector<u8> buffer;
const bool is_buffer_a{BufferDescriptorA().size() > buffer_index &&
BufferDescriptorA()[buffer_index].Size()};
@@ -304,7 +304,7 @@ std::vector<u8> HLERequestContext::ReadBuffer(int buffer_index) const {
}
std::size_t HLERequestContext::WriteBuffer(const void* buffer, std::size_t size,
int buffer_index) const {
std::size_t buffer_index) const {
if (size == 0) {
LOG_WARNING(Core, "skip empty buffer write");
return 0;
@@ -337,7 +337,7 @@ std::size_t HLERequestContext::WriteBuffer(const void* buffer, std::size_t size,
return size;
}
std::size_t HLERequestContext::GetReadBufferSize(int buffer_index) const {
std::size_t HLERequestContext::GetReadBufferSize(std::size_t buffer_index) const {
const bool is_buffer_a{BufferDescriptorA().size() > buffer_index &&
BufferDescriptorA()[buffer_index].Size()};
if (is_buffer_a) {
@@ -355,7 +355,7 @@ std::size_t HLERequestContext::GetReadBufferSize(int buffer_index) const {
}
}
std::size_t HLERequestContext::GetWriteBufferSize(int buffer_index) const {
std::size_t HLERequestContext::GetWriteBufferSize(std::size_t buffer_index) const {
const bool is_buffer_b{BufferDescriptorB().size() > buffer_index &&
BufferDescriptorB()[buffer_index].Size()};
if (is_buffer_b) {

View File

@@ -179,10 +179,11 @@ public:
}
/// Helper function to read a buffer using the appropriate buffer descriptor
std::vector<u8> ReadBuffer(int buffer_index = 0) const;
std::vector<u8> ReadBuffer(std::size_t buffer_index = 0) const;
/// Helper function to write a buffer using the appropriate buffer descriptor
std::size_t WriteBuffer(const void* buffer, std::size_t size, int buffer_index = 0) const;
std::size_t WriteBuffer(const void* buffer, std::size_t size,
std::size_t buffer_index = 0) const;
/* Helper function to write a buffer using the appropriate buffer descriptor
*
@@ -194,7 +195,8 @@ public:
*/
template <typename ContiguousContainer,
typename = std::enable_if_t<!std::is_pointer_v<ContiguousContainer>>>
std::size_t WriteBuffer(const ContiguousContainer& container, int buffer_index = 0) const {
std::size_t WriteBuffer(const ContiguousContainer& container,
std::size_t buffer_index = 0) const {
using ContiguousType = typename ContiguousContainer::value_type;
static_assert(std::is_trivially_copyable_v<ContiguousType>,
@@ -205,10 +207,10 @@ public:
}
/// Helper function to get the size of the input buffer
std::size_t GetReadBufferSize(int buffer_index = 0) const;
std::size_t GetReadBufferSize(std::size_t buffer_index = 0) const;
/// Helper function to get the size of the output buffer
std::size_t GetWriteBufferSize(int buffer_index = 0) const;
std::size_t GetWriteBufferSize(std::size_t buffer_index = 0) const;
template <typename T>
std::shared_ptr<T> GetCopyObject(std::size_t index) {
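
The signature changes above switch `buffer_index` from `int` to `std::size_t`, so the comparison against `size()` no longer mixes signed and unsigned operands. A minimal sketch of the pattern, assuming a hypothetical `Descriptor` type and `GetBufferSize` helper:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a buffer descriptor.
struct Descriptor {
    std::size_t size;
};

// With an `int` index, `descriptors.size() > index` compares an unsigned value
// with a signed one: compilers can flag it under -Wsign-compare, and a negative
// index would be converted to a huge unsigned value before the comparison.
// Taking std::size_t keeps both operands unsigned and removes the casts.
std::size_t GetBufferSize(const std::vector<Descriptor>& descriptors, std::size_t index) {
    const bool is_valid = descriptors.size() > index && descriptors[index].size != 0;
    return is_valid ? descriptors[index].size : 0;
}
```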

View File

@@ -103,7 +103,7 @@ static void ThreadWakeupCallback(u64 thread_handle, [[maybe_unused]] s64 cycles_
struct KernelCore::Impl {
explicit Impl(Core::System& system, KernelCore& kernel)
: system{system}, global_scheduler{kernel}, synchronization{system}, time_manager{system} {}
: global_scheduler{kernel}, synchronization{system}, time_manager{system}, system{system} {}
void Initialize(KernelCore& kernel) {
Shutdown();
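
The reordered initializer list above matters because C++ initializes members in declaration order, not in the order the initializer list is written; `-Werror=reorder` turns a mismatch into a build failure. A minimal sketch with a hypothetical `Ordered` struct:

```cpp
#include <cassert>

// Members are always initialized in declaration order (first, then second),
// no matter how the constructor's initializer list is written.
struct Ordered {
    int first;
    int second;

    // The list matches the declaration order, so -Wreorder stays quiet and
    // `second` can safely depend on the already-initialized `first`.
    explicit Ordered(int value) : first{value}, second{first * 2} {}
};

// Writing `: second{value}, first{second + 1}` instead would still run the
// initializer for `first` before `second`, reading `second` before it has
// a value. That is the silent logic bug the warning is meant to catch.
```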

View File

@@ -129,7 +129,7 @@ private:
LOG_DEBUG(Service_Audio, "called. rendering_time_limit_percent={}",
rendering_time_limit_percent);
ASSERT(rendering_time_limit_percent >= 0 && rendering_time_limit_percent <= 100);
ASSERT(rendering_time_limit_percent <= 100);
IPC::ResponseBuilder rb{ctx, 2};
rb.Push(RESULT_SUCCESS);
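
The `>= 0` half of the assertion above was presumably dropped because the value is unsigned, which makes that comparison always true. The diff does not show the variable's type, so the 32-bit unsigned parameter below is an assumption, and `IsValidTimeLimitPercent` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Assumes the limit is an unsigned 32-bit value, as the removed check suggests.
bool IsValidTimeLimitPercent(std::uint32_t percent) {
    // `percent >= 0` is always true for an unsigned type, so only the
    // upper bound carries information.
    return percent <= 100;
}
```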

View File

@@ -451,7 +451,8 @@ FileSys::SaveDataSize FileSystemController::ReadSaveDataSize(FileSys::SaveDataTy
if (res != Loader::ResultStatus::Success) {
FileSys::PatchManager pm{system.CurrentProcess()->GetTitleID()};
auto [nacp_unique, discard] = pm.GetControlMetadata();
const auto metadata = pm.GetControlMetadata();
const auto& nacp_unique = metadata.first;
if (nacp_unique != nullptr) {
new_size = {nacp_unique->GetDefaultNormalSaveSize(),

View File

@@ -575,6 +575,7 @@ private:
0,
user_id->GetSize(),
{},
{},
});
continue;
@@ -595,6 +596,7 @@ private:
stoull_be(title_id->GetName()),
title_id->GetSize(),
{},
{},
});
}
}
@@ -619,6 +621,7 @@ private:
stoull_be(title_id->GetName()),
title_id->GetSize(),
{},
{},
});
}
}

View File

@@ -27,7 +27,7 @@ public:
{10110, nullptr, "GetFriendProfileImage"},
{10200, nullptr, "SendFriendRequestForApplication"},
{10211, nullptr, "AddFacedFriendRequestForApplication"},
{10400, nullptr, "GetBlockedUserListIds"},
{10400, &IFriendService::GetBlockedUserListIds, "GetBlockedUserListIds"},
{10500, nullptr, "GetProfileList"},
{10600, nullptr, "DeclareOpenOnlinePlaySession"},
{10601, &IFriendService::DeclareCloseOnlinePlaySession, "DeclareCloseOnlinePlaySession"},
@@ -121,6 +121,15 @@ private:
};
static_assert(sizeof(SizedFriendFilter) == 0x10, "SizedFriendFilter is an invalid size");
void GetBlockedUserListIds(Kernel::HLERequestContext& ctx) {
// This is safe to stub, as there should be no adverse consequences from reporting no
// blocked users.
LOG_WARNING(Service_ACC, "(STUBBED) called");
IPC::ResponseBuilder rb{ctx, 3};
rb.Push(RESULT_SUCCESS);
rb.Push<u32>(0); // Indicates there are no blocked users
}
void DeclareCloseOnlinePlaySession(Kernel::HLERequestContext& ctx) {
// Stub used by Splatoon 2
LOG_WARNING(Service_ACC, "(STUBBED) called");

View File

@@ -29,7 +29,7 @@ Time::Time(std::shared_ptr<Module> module, Core::System& system, const char* nam
{300, &Time::CalculateMonotonicSystemClockBaseTimePoint, "CalculateMonotonicSystemClockBaseTimePoint"},
{400, &Time::GetClockSnapshot, "GetClockSnapshot"},
{401, &Time::GetClockSnapshotFromSystemClockContext, "GetClockSnapshotFromSystemClockContext"},
{500, nullptr, "CalculateStandardUserSystemClockDifferenceByUser"},
{500, &Time::CalculateStandardUserSystemClockDifferenceByUser, "CalculateStandardUserSystemClockDifferenceByUser"},
{501, &Time::CalculateSpanBetween, "CalculateSpanBetween"},
};
// clang-format on

View File

@@ -308,6 +308,29 @@ void Module::Interface::GetClockSnapshotFromSystemClockContext(Kernel::HLEReques
ctx.WriteBuffer(&clock_snapshot, sizeof(Clock::ClockSnapshot));
}
void Module::Interface::CalculateStandardUserSystemClockDifferenceByUser(
Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_Time, "called");
IPC::RequestParser rp{ctx};
const auto snapshot_a = rp.PopRaw<Clock::ClockSnapshot>();
const auto snapshot_b = rp.PopRaw<Clock::ClockSnapshot>();
auto time_span_type{Clock::TimeSpanType::FromSeconds(snapshot_b.user_context.offset -
snapshot_a.user_context.offset)};
if ((snapshot_b.user_context.steady_time_point.clock_source_id !=
snapshot_a.user_context.steady_time_point.clock_source_id) ||
(snapshot_b.is_automatic_correction_enabled &&
snapshot_a.is_automatic_correction_enabled)) {
time_span_type.nanoseconds = 0;
}
IPC::ResponseBuilder rb{ctx, (sizeof(s64) / 4) + 2};
rb.Push(RESULT_SUCCESS);
rb.PushRaw(time_span_type.nanoseconds);
}
void Module::Interface::CalculateSpanBetween(Kernel::HLERequestContext& ctx) {
LOG_DEBUG(Service_Time, "called");
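
The decision logic of `CalculateStandardUserSystemClockDifferenceByUser` can be sketched in isolation. The structs below are simplified, hypothetical versions of the snapshot types, and the sketch assumes that `TimeSpanType::FromSeconds` converts seconds to nanoseconds by multiplying by 1,000,000,000.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified versions of the snapshot types used above.
struct SteadyTimePoint {
    std::uint64_t clock_source_id; // simplified; the real ID type may differ
};
struct SystemClockContext {
    std::int64_t offset; // seconds
    SteadyTimePoint steady_time_point;
};
struct ClockSnapshot {
    SystemClockContext user_context;
    bool is_automatic_correction_enabled;
};

// Returns the user clock difference (b - a) in nanoseconds, or 0 when the
// snapshots come from different clock sources or both have automatic
// correction enabled, mirroring the condition in the hunk above.
std::int64_t UserClockDifferenceNs(const ClockSnapshot& a, const ClockSnapshot& b) {
    constexpr std::int64_t ns_per_second = 1'000'000'000;
    const bool different_source = b.user_context.steady_time_point.clock_source_id !=
                                  a.user_context.steady_time_point.clock_source_id;
    const bool both_automatic =
        b.is_automatic_correction_enabled && a.is_automatic_correction_enabled;
    if (different_source || both_automatic) {
        return 0;
    }
    return (b.user_context.offset - a.user_context.offset) * ns_per_second;
}
```

In the actual handler the result is pushed as a raw `s64`, which is why the response size is `(sizeof(s64) / 4) + 2` words.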

View File

@@ -32,6 +32,7 @@ public:
void CalculateMonotonicSystemClockBaseTimePoint(Kernel::HLERequestContext& ctx);
void GetClockSnapshot(Kernel::HLERequestContext& ctx);
void GetClockSnapshotFromSystemClockContext(Kernel::HLERequestContext& ctx);
void CalculateStandardUserSystemClockDifferenceByUser(Kernel::HLERequestContext& ctx);
void CalculateSpanBetween(Kernel::HLERequestContext& ctx);
void GetSharedMemoryNativeHandle(Kernel::HLERequestContext& ctx);

View File

@@ -309,7 +309,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
offset = GetTZName(name, offset);
std_len = offset;
}
if (!std_len) {
if (std_len == 0) {
return {};
}
if (!GetOffset(name, offset, std_offset)) {
@@ -320,7 +320,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
int dest_len{};
int dest_offset{};
const char* dest_name{name + offset};
if (rule.chars.size() < char_count) {
if (rule.chars.size() < std::size_t(char_count)) {
return {};
}
@@ -343,7 +343,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
return {};
}
char_count += dest_len + 1;
if (rule.chars.size() < char_count) {
if (rule.chars.size() < std::size_t(char_count)) {
return {};
}
if (name[offset] != '\0' && name[offset] != ',' && name[offset] != ';') {
@@ -414,7 +414,7 @@ static bool ParsePosixName(const char* name, TimeZoneRule& rule) {
if (is_reversed ||
(start_time < end_time &&
(end_time - start_time < (year_seconds + (std_offset - dest_offset))))) {
if (rule.ats.size() - 2 < time_count) {
if (rule.ats.size() - 2 < std::size_t(time_count)) {
break;
}
@@ -609,7 +609,7 @@ static bool ParseTimeZoneBinary(TimeZoneRule& time_zone_rule, FileSys::VirtualFi
}
const u64 position{(read_offset - sizeof(TzifHeader))};
const std::size_t bytes_read{vfs_file->GetSize() - sizeof(TzifHeader) - position};
const s64 bytes_read = s64(vfs_file->GetSize() - sizeof(TzifHeader) - position);
if (bytes_read < 0) {
return {};
}
@@ -621,11 +621,11 @@ static bool ParseTimeZoneBinary(TimeZoneRule& time_zone_rule, FileSys::VirtualFi
std::array<char, time_zone_name_max + 1> temp_name{};
vfs_file->ReadArray(temp_name.data(), bytes_read, read_offset);
if (bytes_read > 2 && temp_name[0] == '\n' && temp_name[bytes_read - 1] == '\n' &&
time_zone_rule.type_count + 2 <= time_zone_rule.ttis.size()) {
std::size_t(time_zone_rule.type_count) + 2 <= time_zone_rule.ttis.size()) {
temp_name[bytes_read - 1] = '\0';
std::array<char, time_zone_name_max> name{};
std::memcpy(name.data(), temp_name.data() + 1, bytes_read - 1);
std::memcpy(name.data(), temp_name.data() + 1, std::size_t(bytes_read - 1));
TimeZoneRule temp_rule;
if (ParsePosixName(name.data(), temp_rule)) {
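
Changing `bytes_read` from `std::size_t` to `s64` above makes the `bytes_read < 0` check meaningful: an unsigned difference wraps around to a large positive value instead of going negative. A minimal sketch of the idea follows, with a hypothetical `RemainingBytes` helper. Unlike the hunk, which casts the unsigned result, the sketch converts the operands before subtracting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Computes how many bytes remain after `consumed`, or -1 if the file is too
// short. With an unsigned result, `file_size - consumed` would wrap around to
// a huge value and a `< 0` check could never fire.
std::int64_t RemainingBytes(std::size_t file_size, std::size_t consumed) {
    const auto remaining =
        static_cast<std::int64_t>(file_size) - static_cast<std::int64_t>(consumed);
    return remaining < 0 ? -1 : remaining;
}
```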

View File

@@ -101,8 +101,8 @@ public:
}
std::u16string ReadInterfaceToken() {
u32 unknown = Read<u32_le>();
u32 length = Read<u32_le>();
[[maybe_unused]] const u32 unknown = Read<u32_le>();
const u32 length = Read<u32_le>();
std::u16string token{};

View File

@@ -55,7 +55,7 @@ void DmntCheatVm::LogOpcode(const CheatVmOpcode& opcode) {
fmt::format("Cond Type: {:X}", static_cast<u32>(begin_cond->cond_type)));
callbacks->CommandLog(fmt::format("Rel Addr: {:X}", begin_cond->rel_address));
callbacks->CommandLog(fmt::format("Value: {:X}", begin_cond->value.bit64));
} else if (auto end_cond = std::get_if<EndConditionalOpcode>(&opcode.opcode)) {
} else if (std::holds_alternative<EndConditionalOpcode>(opcode.opcode)) {
callbacks->CommandLog("Opcode: End Conditional");
} else if (auto ctrl_loop = std::get_if<ControlLoopOpcode>(&opcode.opcode)) {
if (ctrl_loop->start_loop) {
@@ -399,6 +399,7 @@ bool DmntCheatVm::DecodeNextOpcode(CheatVmOpcode& out) {
// 8kkkkkkk
// Just parse the mask.
begin_keypress_cond.key_mask = first_dword & 0x0FFFFFFF;
opcode.opcode = begin_keypress_cond;
} break;
case CheatVmOpcodeType::PerformArithmeticRegister: {
PerformArithmeticRegisterOpcode perform_math_reg{};
@@ -779,7 +780,7 @@ void DmntCheatVm::Execute(const CheatProcessMetadata& metadata) {
if (!cond_met) {
SkipConditionalBlock();
}
} else if (auto end_cond = std::get_if<EndConditionalOpcode>(&cur_opcode.opcode)) {
} else if (std::holds_alternative<EndConditionalOpcode>(cur_opcode.opcode)) {
// Decrement the condition depth.
// We will assume, graciously, that mismatched conditional block ends are a nop.
if (condition_depth > 0) {
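
Two of the hunks above swap `std::get_if` for `std::holds_alternative` where the returned pointer was never used. A minimal sketch of the distinction, using hypothetical opcode payloads that are much smaller than the real ones:

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical opcode payloads; the real ones carry more fields.
struct BeginConditional {
    int value;
};
struct EndConditional {};
using Opcode = std::variant<BeginConditional, EndConditional>;

std::string Describe(const Opcode& opcode) {
    // The payload is needed here, so the pointer from std::get_if is used.
    if (const auto* begin = std::get_if<BeginConditional>(&opcode)) {
        return "begin " + std::to_string(begin->value);
    }
    // Only the active alternative matters here; std::holds_alternative
    // answers that without creating an unused pointer variable.
    if (std::holds_alternative<EndConditional>(opcode)) {
        return "end";
    }
    return "unknown";
}
```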

View File

@@ -153,9 +153,9 @@ void TelemetrySession::AddInitialInfo(Loader::AppLoader& app_loader) {
app_loader.ReadTitle(name);
if (name.empty()) {
auto [nacp, icon_file] = FileSys::PatchManager(program_id).GetControlMetadata();
if (nacp != nullptr) {
name = nacp->GetApplicationName();
const auto metadata = FileSys::PatchManager(program_id).GetControlMetadata();
if (metadata.first != nullptr) {
name = metadata.first->GetApplicationName();
}
}

View File

@@ -27,4 +27,4 @@ if(SDL2_FOUND)
endif()
create_target_directory_groups(input_common)
target_link_libraries(input_common PUBLIC core PRIVATE common ${Boost_LIBRARIES})
target_link_libraries(input_common PUBLIC core PRIVATE common Boost::boost)

View File

@@ -603,6 +603,7 @@ public:
if (std::abs(event.jaxis.value / 32767.0) < 0.5) {
break;
}
[[fallthrough]];
case SDL_JOYBUTTONUP:
case SDL_JOYHATMOTION:
return SDLEventToButtonParamPackage(state, event);
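
Review note on the hunk above: `[[fallthrough]]` marks the drop from the axis case into the button and hat cases as intentional, so implicit-fallthrough warnings stay quiet. The sketch below mirrors the control flow only; `EventType` and `ShouldHandle` are hypothetical names and no SDL types are involved.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical event kinds standing in for the SDL event types.
enum class EventType { AxisMotion, ButtonUp, HatMotion, Other };

// Returns true when the event should be converted into a binding.
bool ShouldHandle(EventType type, int axis_value) {
    switch (type) {
    case EventType::AxisMotion:
        if (std::abs(axis_value / 32767.0) < 0.5) {
            break; // Small deflection: ignore the event.
        }
        [[fallthrough]]; // Large deflection: handled like a button or hat event.
    case EventType::ButtonUp:
    case EventType::HatMotion:
        return true;
    default:
        break;
    }
    return false;
}
```

An axis value of 30000 reaches the shared `return true`, while 1000 takes the early `break` and yields `false`.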

View File

@@ -160,8 +160,6 @@ if (ENABLE_VULKAN)
renderer_vulkan/fixed_pipeline_state.h
renderer_vulkan/maxwell_to_vk.cpp
renderer_vulkan/maxwell_to_vk.h
renderer_vulkan/nsight_aftermath_tracker.cpp
renderer_vulkan/nsight_aftermath_tracker.h
renderer_vulkan/renderer_vulkan.h
renderer_vulkan/renderer_vulkan.cpp
renderer_vulkan/vk_blit_screen.cpp
@@ -215,30 +213,19 @@ if (ENABLE_VULKAN)
renderer_vulkan/wrapper.cpp
renderer_vulkan/wrapper.h
)
target_include_directories(video_core PRIVATE sirit ../../externals/Vulkan-Headers/include)
target_compile_definitions(video_core PRIVATE HAS_VULKAN)
endif()
create_target_directory_groups(video_core)
target_link_libraries(video_core PUBLIC common core)
target_link_libraries(video_core PRIVATE glad)
if (ENABLE_VULKAN)
target_include_directories(video_core PRIVATE sirit ../../externals/Vulkan-Headers/include)
target_compile_definitions(video_core PRIVATE HAS_VULKAN)
target_link_libraries(video_core PRIVATE sirit)
endif()
if (ENABLE_NSIGHT_AFTERMATH)
if (NOT DEFINED ENV{NSIGHT_AFTERMATH_SDK})
message(ERROR "Environment variable NSIGHT_AFTERMATH_SDK has to be provided")
endif()
if (NOT WIN32)
message(ERROR "Nsight Aftermath doesn't support non-Windows platforms")
endif()
target_compile_definitions(video_core PRIVATE HAS_NSIGHT_AFTERMATH)
target_include_directories(video_core PRIVATE "$ENV{NSIGHT_AFTERMATH_SDK}/include")
endif()
if (MSVC)
target_compile_options(video_core PRIVATE /we4267)
else()

View File

@@ -29,10 +29,10 @@ namespace VideoCommon {
using MapInterval = std::shared_ptr<MapIntervalBase>;
template <typename TBuffer, typename TBufferType, typename StreamBuffer>
template <typename OwnerBuffer, typename BufferType, typename StreamBuffer>
class BufferCache {
public:
using BufferInfo = std::pair<const TBufferType*, u64>;
using BufferInfo = std::pair<BufferType, u64>;
BufferInfo UploadMemory(GPUVAddr gpu_addr, std::size_t size, std::size_t alignment = 4,
bool is_written = false, bool use_fast_cbuf = false) {
@@ -89,9 +89,7 @@ public:
}
}
const u64 offset = static_cast<u64>(block->GetOffset(cpu_addr));
return {ToHandle(block), offset};
return {ToHandle(block), static_cast<u64>(block->GetOffset(cpu_addr))};
}
/// Uploads from a host memory. Returns the OpenGL buffer where it's located and its offset.
@@ -156,7 +154,7 @@ public:
}
}
virtual const TBufferType* GetEmptyBuffer(std::size_t size) = 0;
virtual BufferType GetEmptyBuffer(std::size_t size) = 0;
protected:
explicit BufferCache(VideoCore::RasterizerInterface& rasterizer, Core::System& system,
@@ -166,19 +164,19 @@ protected:
~BufferCache() = default;
virtual const TBufferType* ToHandle(const TBuffer& storage) = 0;
virtual BufferType ToHandle(const OwnerBuffer& storage) = 0;
virtual void WriteBarrier() = 0;
virtual TBuffer CreateBlock(VAddr cpu_addr, std::size_t size) = 0;
virtual OwnerBuffer CreateBlock(VAddr cpu_addr, std::size_t size) = 0;
virtual void UploadBlockData(const TBuffer& buffer, std::size_t offset, std::size_t size,
virtual void UploadBlockData(const OwnerBuffer& buffer, std::size_t offset, std::size_t size,
const u8* data) = 0;
virtual void DownloadBlockData(const TBuffer& buffer, std::size_t offset, std::size_t size,
virtual void DownloadBlockData(const OwnerBuffer& buffer, std::size_t offset, std::size_t size,
u8* data) = 0;
virtual void CopyBlock(const TBuffer& src, const TBuffer& dst, std::size_t src_offset,
virtual void CopyBlock(const OwnerBuffer& src, const OwnerBuffer& dst, std::size_t src_offset,
std::size_t dst_offset, std::size_t size) = 0;
virtual BufferInfo ConstBufferUpload(const void* raw_pointer, std::size_t size) {
@@ -221,9 +219,8 @@ private:
return std::make_shared<MapIntervalBase>(start, end, gpu_addr);
}
MapInterval MapAddress(const TBuffer& block, const GPUVAddr gpu_addr, const VAddr cpu_addr,
MapInterval MapAddress(const OwnerBuffer& block, const GPUVAddr gpu_addr, const VAddr cpu_addr,
const std::size_t size) {
std::vector<MapInterval> overlaps = GetMapsInRange(cpu_addr, size);
if (overlaps.empty()) {
auto& memory_manager = system.GPU().MemoryManager();
@@ -272,7 +269,7 @@ private:
return new_map;
}
void UpdateBlock(const TBuffer& block, VAddr start, VAddr end,
void UpdateBlock(const OwnerBuffer& block, VAddr start, VAddr end,
std::vector<MapInterval>& overlaps) {
const IntervalType base_interval{start, end};
IntervalSet interval_set{};
@@ -313,7 +310,7 @@ private:
void FlushMap(MapInterval map) {
std::size_t size = map->GetEnd() - map->GetStart();
TBuffer block = blocks[map->GetStart() >> block_page_bits];
OwnerBuffer block = blocks[map->GetStart() >> block_page_bits];
staging_buffer.resize(size);
DownloadBlockData(block, block->GetOffset(map->GetStart()), size, staging_buffer.data());
system.Memory().WriteBlockUnsafe(map->GetStart(), staging_buffer.data(), size);
@@ -328,7 +325,7 @@ private:
buffer_ptr += size;
buffer_offset += size;
return {&stream_buffer_handle, uploaded_offset};
return {stream_buffer_handle, uploaded_offset};
}
void AlignBuffer(std::size_t alignment) {
@@ -338,11 +335,11 @@ private:
buffer_offset = offset_aligned;
}
TBuffer EnlargeBlock(TBuffer buffer) {
OwnerBuffer EnlargeBlock(OwnerBuffer buffer) {
const std::size_t old_size = buffer->GetSize();
const std::size_t new_size = old_size + block_page_size;
const VAddr cpu_addr = buffer->GetCpuAddr();
TBuffer new_buffer = CreateBlock(cpu_addr, new_size);
OwnerBuffer new_buffer = CreateBlock(cpu_addr, new_size);
CopyBlock(buffer, new_buffer, 0, 0, old_size);
buffer->SetEpoch(epoch);
pending_destruction.push_back(buffer);
@@ -356,14 +353,14 @@ private:
return new_buffer;
}
TBuffer MergeBlocks(TBuffer first, TBuffer second) {
OwnerBuffer MergeBlocks(OwnerBuffer first, OwnerBuffer second) {
const std::size_t size_1 = first->GetSize();
const std::size_t size_2 = second->GetSize();
const VAddr first_addr = first->GetCpuAddr();
const VAddr second_addr = second->GetCpuAddr();
const VAddr new_addr = std::min(first_addr, second_addr);
const std::size_t new_size = size_1 + size_2;
TBuffer new_buffer = CreateBlock(new_addr, new_size);
OwnerBuffer new_buffer = CreateBlock(new_addr, new_size);
CopyBlock(first, new_buffer, 0, new_buffer->GetOffset(first_addr), size_1);
CopyBlock(second, new_buffer, 0, new_buffer->GetOffset(second_addr), size_2);
first->SetEpoch(epoch);
@@ -380,8 +377,8 @@ private:
return new_buffer;
}
TBuffer GetBlock(const VAddr cpu_addr, const std::size_t size) {
TBuffer found{};
OwnerBuffer GetBlock(const VAddr cpu_addr, const std::size_t size) {
OwnerBuffer found;
const VAddr cpu_addr_end = cpu_addr + size - 1;
u64 page_start = cpu_addr >> block_page_bits;
const u64 page_end = cpu_addr_end >> block_page_bits;
@@ -457,7 +454,7 @@ private:
Core::System& system;
std::unique_ptr<StreamBuffer> stream_buffer;
TBufferType stream_buffer_handle{};
BufferType stream_buffer_handle{};
bool invalidated = false;
@@ -475,9 +472,9 @@ private:
static constexpr u64 block_page_bits = 21;
static constexpr u64 block_page_size = 1ULL << block_page_bits;
std::unordered_map<u64, TBuffer> blocks;
std::unordered_map<u64, OwnerBuffer> blocks;
std::list<TBuffer> pending_destruction;
std::list<OwnerBuffer> pending_destruction;
u64 epoch = 0;
u64 modified_ticks = 0;
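
Review note on the buffer cache hunks above: `BufferInfo` now carries the handle itself (`std::pair<BufferType, u64>`) rather than a pointer to it, so callers no longer dereference anything. The following is a reduced sketch of that shape, not the real class; `Block`, `MiniCache`, `UintCache` and `Locate` are hypothetical names, and an `unsigned` stands in for an API handle.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

using u64 = std::uint64_t;

// Hypothetical owner type; the real blocks own graphics API objects.
struct Block {
    unsigned handle;
    u64 base;
    u64 GetOffset(u64 addr) const {
        return addr - base;
    }
};

template <typename OwnerBuffer, typename BufferType>
class MiniCache {
public:
    using BufferInfo = std::pair<BufferType, u64>;
    virtual ~MiniCache() = default;

    // The handle is copied into the result, so it does not point into the block.
    BufferInfo Locate(const OwnerBuffer& block, u64 addr) {
        return {ToHandle(block), block->GetOffset(addr)};
    }

protected:
    virtual BufferType ToHandle(const OwnerBuffer& storage) = 0;
};

class UintCache final : public MiniCache<std::shared_ptr<Block>, unsigned> {
protected:
    unsigned ToHandle(const std::shared_ptr<Block>& storage) override {
        return storage->handle;
    }
};
```

Because the pair holds a copy, the returned handle value stays readable even after the caller drops its reference to the block; whether the underlying API object is still alive is a separate question that this sketch does not model.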

View File

@@ -303,6 +303,10 @@ public:
return (type == Type::SignedNorm) || (type == Type::UnsignedNorm);
}
bool IsConstant() const {
return constant;
}
bool IsValid() const {
return size != Size::Invalid;
}

View File

@@ -1005,6 +1005,12 @@ union Instruction {
BitField<46, 2, u64> cache_mode;
} stg;
union {
BitField<23, 3, AtomicOp> operation;
BitField<48, 1, u64> extended;
BitField<20, 3, GlobalAtomicType> type;
} red;
union {
BitField<52, 4, AtomicOp> operation;
BitField<49, 3, GlobalAtomicType> type;
@@ -1501,7 +1507,7 @@ union Instruction {
TextureType GetTextureType() const {
// The TLDS instruction has a weird encoding for the texture type.
if (texture_info >= 0 && texture_info <= 1) {
if (texture_info <= 1) {
return TextureType::Texture1D;
}
if (texture_info == 2 || texture_info == 8 || texture_info == 12 ||
@@ -1787,6 +1793,7 @@ public:
ST_S,
ST, // Store in generic memory
STG, // Store in global memory
RED, // Reduction operation
ATOM, // Atomic operation on global memory
ATOMS, // Atomic operation on shared memory
AL2P, // Transforms attribute memory into physical memory
@@ -1871,7 +1878,8 @@ public:
ICMP_R,
ICMP_CR,
ICMP_IMM,
FCMP_R,
FCMP_RR,
FCMP_RC,
MUFU, // Multi-Function Operator
RRO_C, // Range Reduction Operator
RRO_R,
@@ -2096,6 +2104,7 @@ private:
INST("1110111101010---", Id::ST_L, Type::Memory, "ST_L"),
INST("101-------------", Id::ST, Type::Memory, "ST"),
INST("1110111011011---", Id::STG, Type::Memory, "STG"),
INST("1110101111111---", Id::RED, Type::Memory, "RED"),
INST("11101101--------", Id::ATOM, Type::Memory, "ATOM"),
INST("11101100--------", Id::ATOMS, Type::Memory, "ATOMS"),
INST("1110111110100---", Id::AL2P, Type::Memory, "AL2P"),
@@ -2179,7 +2188,8 @@ private:
INST("0101110100100---", Id::HSETP2_R, Type::HalfSetPredicate, "HSETP2_R"),
INST("0111111-0-------", Id::HSETP2_IMM, Type::HalfSetPredicate, "HSETP2_IMM"),
INST("0101110100011---", Id::HSET2_R, Type::HalfSet, "HSET2_R"),
INST("010110111010----", Id::FCMP_R, Type::Arithmetic, "FCMP_R"),
INST("010110111010----", Id::FCMP_RR, Type::Arithmetic, "FCMP_RR"),
INST("010010111010----", Id::FCMP_RC, Type::Arithmetic, "FCMP_RC"),
INST("0101000010000---", Id::MUFU, Type::Arithmetic, "MUFU"),
INST("0100110010010---", Id::RRO_C, Type::Arithmetic, "RRO_C"),
INST("0101110010010---", Id::RRO_R, Type::Arithmetic, "RRO_R"),
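
Review note on the `GetTextureType` hunk above: the dropped `texture_info >= 0` test is always true if the field is unsigned, which is the kind of comparison that `-Wextra` tends to flag. A minimal sketch, assuming the field behaves like a `u64`; `Classify` and the reduced `TextureType` enum are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Reduced, hypothetical version of the texture type enum.
enum class TextureType { Texture1D, Other };

TextureType Classify(std::uint64_t texture_info) {
    // For an unsigned operand, `texture_info >= 0` can never be false,
    // so the upper bound alone expresses the same condition.
    static_assert(std::is_unsigned_v<decltype(texture_info)>,
                  "the simplification relies on an unsigned operand");
    if (texture_info <= 1) {
        return TextureType::Texture1D;
    }
    return TextureType::Other;
}
```

Values 0 and 1 map to `Texture1D` and everything else falls through to the later checks, exactly as before the change.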

View File

@@ -12,8 +12,9 @@ namespace VideoCommon {
GPUAsynch::GPUAsynch(Core::System& system, std::unique_ptr<VideoCore::RendererBase>&& renderer_,
std::unique_ptr<Core::Frontend::GraphicsContext>&& context)
: GPU(system, std::move(renderer_), true), gpu_thread{system}, gpu_context(std::move(context)),
cpu_context(renderer->GetRenderWindow().CreateSharedContext()) {}
: GPU(system, std::move(renderer_), true), gpu_thread{system},
cpu_context(renderer->GetRenderWindow().CreateSharedContext()),
gpu_context(std::move(context)) {}
GPUAsynch::~GPUAsynch() = default;
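
Review note on the hunk above: reordering the initializer list does not change behavior, because C++ constructs non-static members in declaration order regardless of how the list is written; the reorder presumably just matches the header and quiets `-Wreorder`. A sketch that records construction order, using hypothetical `Tracer` and `Holder` types:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Global log of construction events, used only for this illustration.
std::vector<std::string>& Log() {
    static std::vector<std::string> log;
    return log;
}

struct Tracer {
    explicit Tracer(const char* name) {
        Log().push_back(name);
    }
};

struct Holder {
    // The list is written in declaration order. Swapping the two entries
    // would not change the construction order, only trigger a warning.
    Holder() : cpu_context{"cpu"}, gpu_context{"gpu"} {}

    Tracer cpu_context; // Declared first, so constructed first.
    Tracer gpu_context;
};
```

Constructing a `Holder` appends `"cpu"` and then `"gpu"` to the log.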

View File

@@ -55,33 +55,31 @@ void OGLBufferCache::WriteBarrier() {
glMemoryBarrier(GL_ALL_BARRIER_BITS);
}
const GLuint* OGLBufferCache::ToHandle(const Buffer& buffer) {
GLuint OGLBufferCache::ToHandle(const Buffer& buffer) {
return buffer->GetHandle();
}
const GLuint* OGLBufferCache::GetEmptyBuffer(std::size_t) {
static const GLuint null_buffer = 0;
return &null_buffer;
GLuint OGLBufferCache::GetEmptyBuffer(std::size_t) {
return 0;
}
void OGLBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) {
glNamedBufferSubData(*buffer->GetHandle(), static_cast<GLintptr>(offset),
glNamedBufferSubData(buffer->GetHandle(), static_cast<GLintptr>(offset),
static_cast<GLsizeiptr>(size), data);
}
void OGLBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
u8* data) {
MICROPROFILE_SCOPE(OpenGL_Buffer_Download);
glGetNamedBufferSubData(*buffer->GetHandle(), static_cast<GLintptr>(offset),
glGetNamedBufferSubData(buffer->GetHandle(), static_cast<GLintptr>(offset),
static_cast<GLsizeiptr>(size), data);
}
void OGLBufferCache::CopyBlock(const Buffer& src, const Buffer& dst, std::size_t src_offset,
std::size_t dst_offset, std::size_t size) {
glCopyNamedBufferSubData(*src->GetHandle(), *dst->GetHandle(),
static_cast<GLintptr>(src_offset), static_cast<GLintptr>(dst_offset),
static_cast<GLsizeiptr>(size));
glCopyNamedBufferSubData(src->GetHandle(), dst->GetHandle(), static_cast<GLintptr>(src_offset),
static_cast<GLintptr>(dst_offset), static_cast<GLsizeiptr>(size));
}
OGLBufferCache::BufferInfo OGLBufferCache::ConstBufferUpload(const void* raw_pointer,
@@ -89,7 +87,7 @@ OGLBufferCache::BufferInfo OGLBufferCache::ConstBufferUpload(const void* raw_poi
DEBUG_ASSERT(cbuf_cursor < std::size(cbufs));
const GLuint& cbuf = cbufs[cbuf_cursor++];
glNamedBufferSubData(cbuf, 0, static_cast<GLsizeiptr>(size), raw_pointer);
return {&cbuf, 0};
return {cbuf, 0};
}
} // namespace OpenGL

View File

@@ -34,12 +34,12 @@ public:
explicit CachedBufferBlock(VAddr cpu_addr, const std::size_t size);
~CachedBufferBlock();
const GLuint* GetHandle() const {
return &gl_buffer.handle;
GLuint GetHandle() const {
return gl_buffer.handle;
}
private:
OGLBuffer gl_buffer{};
OGLBuffer gl_buffer;
};
class OGLBufferCache final : public GenericBufferCache {
@@ -48,7 +48,7 @@ public:
const Device& device, std::size_t stream_size);
~OGLBufferCache();
const GLuint* GetEmptyBuffer(std::size_t) override;
GLuint GetEmptyBuffer(std::size_t) override;
void Acquire() noexcept {
cbuf_cursor = 0;
@@ -57,9 +57,9 @@ public:
protected:
Buffer CreateBlock(VAddr cpu_addr, std::size_t size) override;
void WriteBarrier() override;
GLuint ToHandle(const Buffer& buffer) override;
const GLuint* ToHandle(const Buffer& buffer) override;
void WriteBarrier() override;
void UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) override;

View File

@@ -87,7 +87,7 @@ u32 Extract(u32& base, u32& num, u32 amount, std::optional<GLenum> limit = {}) {
std::array<Device::BaseBindings, Tegra::Engines::MaxShaderTypes> BuildBaseBindings() noexcept {
std::array<Device::BaseBindings, Tegra::Engines::MaxShaderTypes> bindings;
static std::array<std::size_t, 5> stage_swizzle = {0, 1, 2, 3, 4};
static constexpr std::array<std::size_t, 5> stage_swizzle{0, 1, 2, 3, 4};
const u32 total_ubos = GetInteger<u32>(GL_MAX_UNIFORM_BUFFER_BINDINGS);
const u32 total_ssbos = GetInteger<u32>(GL_MAX_SHADER_STORAGE_BUFFER_BINDINGS);
const u32 total_samplers = GetInteger<u32>(GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS);

View File

@@ -140,8 +140,8 @@ void RasterizerOpenGL::SetupVertexFormat() {
const auto attrib = gpu.regs.vertex_attrib_format[index];
const auto gl_index = static_cast<GLuint>(index);
// Ignore invalid attributes.
if (!attrib.IsValid()) {
// Disable constant attributes.
if (attrib.IsConstant()) {
glDisableVertexAttribArray(gl_index);
continue;
}
@@ -188,10 +188,8 @@ void RasterizerOpenGL::SetupVertexBuffer() {
ASSERT(end > start);
const u64 size = end - start + 1;
const auto [vertex_buffer, vertex_buffer_offset] = buffer_cache.UploadMemory(start, size);
// Bind the vertex array to the buffer at the current offset.
vertex_array_pushbuffer.SetVertexBuffer(static_cast<GLuint>(index), vertex_buffer,
vertex_buffer_offset, vertex_array.stride);
glBindVertexBuffer(static_cast<GLuint>(index), vertex_buffer, vertex_buffer_offset,
vertex_array.stride);
}
}
@@ -222,7 +220,7 @@ GLintptr RasterizerOpenGL::SetupIndexBuffer() {
const auto& regs = system.GPU().Maxwell3D().regs;
const std::size_t size = CalculateIndexBufferSize();
const auto [buffer, offset] = buffer_cache.UploadMemory(regs.index_array.IndexStart(), size);
vertex_array_pushbuffer.SetIndexBuffer(buffer);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buffer);
return offset;
}
@@ -524,7 +522,6 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
// Prepare vertex array format.
SetupVertexFormat();
vertex_array_pushbuffer.Setup();
// Upload vertex and index data.
SetupVertexBuffer();
@@ -534,17 +531,13 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
index_buffer_offset = SetupIndexBuffer();
}
// Prepare packed bindings.
bind_ubo_pushbuffer.Setup();
bind_ssbo_pushbuffer.Setup();
// Setup emulation uniform buffer.
GLShader::MaxwellUniformData ubo;
ubo.SetFromRegs(gpu);
const auto [buffer, offset] =
buffer_cache.UploadHostMemory(&ubo, sizeof(ubo), device.GetUniformBufferAlignment());
bind_ubo_pushbuffer.Push(EmulationUniformBlockBinding, buffer, offset,
static_cast<GLsizeiptr>(sizeof(ubo)));
glBindBufferRange(GL_UNIFORM_BUFFER, EmulationUniformBlockBinding, buffer, offset,
static_cast<GLsizeiptr>(sizeof(ubo)));
// Setup shaders and their used resources.
texture_cache.GuardSamplers(true);
@@ -557,11 +550,6 @@ void RasterizerOpenGL::Draw(bool is_indexed, bool is_instanced) {
// Signal the buffer cache that we are not going to upload more things.
buffer_cache.Unmap();
// Now that we are no longer uploading data, we can safely bind the buffers to OpenGL.
vertex_array_pushbuffer.Bind();
bind_ubo_pushbuffer.Bind();
bind_ssbo_pushbuffer.Bind();
program_manager.BindGraphicsPipeline();
if (texture_cache.TextureBarrier()) {
@@ -630,17 +618,11 @@ void RasterizerOpenGL::DispatchCompute(GPUVAddr code_addr) {
(Maxwell::MaxConstBufferSize + device.GetUniformBufferAlignment());
buffer_cache.Map(buffer_size);
bind_ubo_pushbuffer.Setup();
bind_ssbo_pushbuffer.Setup();
SetupComputeConstBuffers(kernel);
SetupComputeGlobalMemory(kernel);
buffer_cache.Unmap();
bind_ubo_pushbuffer.Bind();
bind_ssbo_pushbuffer.Bind();
const auto& launch_desc = system.GPU().KeplerCompute().launch_description;
glDispatchCompute(launch_desc.grid_dim_x, launch_desc.grid_dim_y, launch_desc.grid_dim_z);
++num_queued_commands;
@@ -771,8 +753,8 @@ void RasterizerOpenGL::SetupConstBuffer(u32 binding, const Tegra::Engines::Const
const ConstBufferEntry& entry) {
if (!buffer.enabled) {
// Set values to zero to unbind buffers
bind_ubo_pushbuffer.Push(binding, buffer_cache.GetEmptyBuffer(sizeof(float)), 0,
sizeof(float));
glBindBufferRange(GL_UNIFORM_BUFFER, binding, buffer_cache.GetEmptyBuffer(sizeof(float)), 0,
sizeof(float));
return;
}
@@ -783,7 +765,7 @@ void RasterizerOpenGL::SetupConstBuffer(u32 binding, const Tegra::Engines::Const
const auto alignment = device.GetUniformBufferAlignment();
const auto [cbuf, offset] = buffer_cache.UploadMemory(buffer.address, size, alignment, false,
device.HasFastBufferSubData());
bind_ubo_pushbuffer.Push(binding, cbuf, offset, size);
glBindBufferRange(GL_UNIFORM_BUFFER, binding, cbuf, offset, size);
}
void RasterizerOpenGL::SetupDrawGlobalMemory(std::size_t stage_index, const Shader& shader) {
@@ -819,7 +801,8 @@ void RasterizerOpenGL::SetupGlobalMemory(u32 binding, const GlobalMemoryEntry& e
const auto alignment{device.GetShaderStorageBufferAlignment()};
const auto [ssbo, buffer_offset] =
buffer_cache.UploadMemory(gpu_addr, size, alignment, entry.IsWritten());
bind_ssbo_pushbuffer.Push(binding, ssbo, buffer_offset, static_cast<GLsizeiptr>(size));
glBindBufferRange(GL_SHADER_STORAGE_BUFFER, binding, ssbo, buffer_offset,
static_cast<GLsizeiptr>(size));
}
void RasterizerOpenGL::SetupDrawTextures(std::size_t stage_index, const Shader& shader) {
@@ -1432,7 +1415,7 @@ void RasterizerOpenGL::EndTransformFeedback() {
const GPUVAddr gpu_addr = binding.Address();
const std::size_t size = binding.buffer_size;
const auto [dest_buffer, offset] = buffer_cache.UploadMemory(gpu_addr, size, 4, true);
glCopyNamedBufferSubData(handle, *dest_buffer, 0, offset, static_cast<GLsizeiptr>(size));
glCopyNamedBufferSubData(handle, dest_buffer, 0, offset, static_cast<GLsizeiptr>(size));
}
}

View File

@@ -231,9 +231,7 @@ private:
static constexpr std::size_t STREAM_BUFFER_SIZE = 128 * 1024 * 1024;
OGLBufferCache buffer_cache;
VertexArrayPushBuffer vertex_array_pushbuffer{state_tracker};
BindBuffersRangePushBuffer bind_ubo_pushbuffer{GL_UNIFORM_BUFFER};
BindBuffersRangePushBuffer bind_ssbo_pushbuffer{GL_SHADER_STORAGE_BUFFER};
GLint vertex_binding = 0;
std::array<OGLBuffer, Tegra::Engines::Maxwell3D::Regs::NumTransformFeedbackBuffers>
transform_feedback_buffers;

View File

@@ -34,6 +34,8 @@
namespace OpenGL {
using Tegra::Engines::ShaderType;
using VideoCommon::Shader::CompileDepth;
using VideoCommon::Shader::CompilerSettings;
using VideoCommon::Shader::ProgramCode;
using VideoCommon::Shader::Registry;
using VideoCommon::Shader::ShaderIR;
@@ -43,7 +45,7 @@ namespace {
constexpr u32 STAGE_MAIN_OFFSET = 10;
constexpr u32 KERNEL_MAIN_OFFSET = 0;
constexpr VideoCommon::Shader::CompilerSettings COMPILER_SETTINGS{};
constexpr CompilerSettings COMPILER_SETTINGS{CompileDepth::FullDecompile};
/// Gets the address for the specified shader stage program
GPUVAddr GetShaderAddress(Core::System& system, Maxwell::ShaderProgram program) {

View File

@@ -835,7 +835,8 @@ private:
void DeclareConstantBuffers() {
u32 binding = device.GetBaseBindings(stage).uniform_buffer;
for (const auto& [index, cbuf] : ir.GetConstantBuffers()) {
for (const auto& buffers : ir.GetConstantBuffers()) {
const auto index = buffers.first;
code.AddLine("layout (std140, binding = {}) uniform {} {{", binding++,
GetConstBufferBlock(index));
code.AddLine(" uvec4 {}[{}];", GetConstBuffer(index), MAX_CONSTBUFFER_ELEMENTS);
@@ -1821,13 +1822,15 @@ private:
Expression HMergeH0(Operation operation) {
const std::string dest = VisitOperand(operation, 0).AsUint();
const std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("bitfieldInsert({}, {}, 0, 16)", dest, src), Type::Uint};
return {fmt::format("vec2(unpackHalf2x16({}).x, unpackHalf2x16({}).y)", src, dest),
Type::HalfFloat};
}
Expression HMergeH1(Operation operation) {
const std::string dest = VisitOperand(operation, 0).AsUint();
const std::string src = VisitOperand(operation, 1).AsUint();
return {fmt::format("bitfieldInsert({}, {}, 16, 16)", dest, src), Type::Uint};
return {fmt::format("vec2(unpackHalf2x16({}).x, unpackHalf2x16({}).y)", dest, src),
Type::HalfFloat};
}
Expression HPack2(Operation operation) {
@@ -2117,8 +2120,14 @@ private:
return {};
}
return {fmt::format("atomic{}({}, {})", opname, Visit(operation[0]).GetCode(),
Visit(operation[1]).As(type)),
type};
Visit(operation[1]).AsUint()),
Type::Uint};
}
template <const std::string_view& opname, Type type>
Expression Reduce(Operation operation) {
code.AddLine("{};", Atomic<opname, type>(operation).GetCode());
return {};
}
Expression Branch(Operation operation) {
@@ -2477,6 +2486,20 @@ private:
&GLSLDecompiler::Atomic<Func::Or, Type::Int>,
&GLSLDecompiler::Atomic<Func::Xor, Type::Int>,
&GLSLDecompiler::Reduce<Func::Add, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Min, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Max, Type::Uint>,
&GLSLDecompiler::Reduce<Func::And, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Or, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Xor, Type::Uint>,
&GLSLDecompiler::Reduce<Func::Add, Type::Int>,
&GLSLDecompiler::Reduce<Func::Min, Type::Int>,
&GLSLDecompiler::Reduce<Func::Max, Type::Int>,
&GLSLDecompiler::Reduce<Func::And, Type::Int>,
&GLSLDecompiler::Reduce<Func::Or, Type::Int>,
&GLSLDecompiler::Reduce<Func::Xor, Type::Int>,
&GLSLDecompiler::Branch,
&GLSLDecompiler::BranchIndirect,
&GLSLDecompiler::PushFlowStack,
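
Review note on the `HMergeH0` and `HMergeH1` hunks above: in GLSL, `unpackHalf2x16(v).x` comes from the 16 least significant bits of `v` and `.y` from the 16 most significant bits. Read at the bit level, the added expressions select halves as sketched below. This deliberately ignores the half-to-float conversion itself, and `MergeH0` and `MergeH1` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// H0 merge: low half taken from src, high half kept from dest.
constexpr u32 MergeH0(u32 dest, u32 src) {
    return (dest & 0xFFFF0000u) | (src & 0x0000FFFFu);
}

// H1 merge: low half kept from dest, high half taken from src.
constexpr u32 MergeH1(u32 dest, u32 src) {
    return (dest & 0x0000FFFFu) | (src & 0xFFFF0000u);
}

static_assert(MergeH0(0xAAAABBBBu, 0xCCCCDDDDu) == 0xAAAADDDDu);
static_assert(MergeH1(0xAAAABBBBu, 0xCCCCDDDDu) == 0xCCCCBBBBu);
```

Note that for H1 the removed `bitfieldInsert({}, {}, 16, 16)` form inserted the low 16 bits of `src` into the upper half, whereas the added form takes the upper half of `src`, so the two versions are not bit-for-bit equivalent.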

View File

@@ -417,7 +417,7 @@ void CachedSurfaceView::Attach(GLenum attachment, GLenum target) const {
switch (params.target) {
case SurfaceTarget::Texture2DArray:
glFramebufferTexture(target, attachment, GetTexture(), params.base_level);
glFramebufferTexture(target, attachment, GetTexture(), 0);
break;
default:
UNIMPLEMENTED();

View File

@@ -315,8 +315,8 @@ public:
RendererOpenGL::RendererOpenGL(Core::Frontend::EmuWindow& emu_window, Core::System& system,
Core::Frontend::GraphicsContext& context)
: VideoCore::RendererBase{emu_window}, emu_window{emu_window}, system{system},
frame_mailbox{}, context{context}, has_debug_tool{HasDebugTool()} {}
: RendererBase{emu_window}, emu_window{emu_window}, system{system}, context{context},
has_debug_tool{HasDebugTool()} {}
RendererOpenGL::~RendererOpenGL() = default;

View File

@@ -14,68 +14,6 @@
namespace OpenGL {
struct VertexArrayPushBuffer::Entry {
GLuint binding_index{};
const GLuint* buffer{};
GLintptr offset{};
GLsizei stride{};
};
VertexArrayPushBuffer::VertexArrayPushBuffer(StateTracker& state_tracker)
: state_tracker{state_tracker} {}
VertexArrayPushBuffer::~VertexArrayPushBuffer() = default;
void VertexArrayPushBuffer::Setup() {
index_buffer = nullptr;
vertex_buffers.clear();
}
void VertexArrayPushBuffer::SetIndexBuffer(const GLuint* buffer) {
index_buffer = buffer;
}
void VertexArrayPushBuffer::SetVertexBuffer(GLuint binding_index, const GLuint* buffer,
GLintptr offset, GLsizei stride) {
vertex_buffers.push_back(Entry{binding_index, buffer, offset, stride});
}
void VertexArrayPushBuffer::Bind() {
if (index_buffer) {
state_tracker.BindIndexBuffer(*index_buffer);
}
for (const auto& entry : vertex_buffers) {
glBindVertexBuffer(entry.binding_index, *entry.buffer, entry.offset, entry.stride);
}
}
struct BindBuffersRangePushBuffer::Entry {
GLuint binding;
const GLuint* buffer;
GLintptr offset;
GLsizeiptr size;
};
BindBuffersRangePushBuffer::BindBuffersRangePushBuffer(GLenum target) : target{target} {}
BindBuffersRangePushBuffer::~BindBuffersRangePushBuffer() = default;
void BindBuffersRangePushBuffer::Setup() {
entries.clear();
}
void BindBuffersRangePushBuffer::Push(GLuint binding, const GLuint* buffer, GLintptr offset,
GLsizeiptr size) {
entries.push_back(Entry{binding, buffer, offset, size});
}
void BindBuffersRangePushBuffer::Bind() {
for (const Entry& entry : entries) {
glBindBufferRange(target, entry.binding, *entry.buffer, entry.offset, entry.size);
}
}
void LabelGLObject(GLenum identifier, GLuint handle, VAddr addr, std::string_view extra_info) {
if (!GLAD_GL_KHR_debug) {
// We don't need to throw an error as this is just for debugging

View File

@@ -11,49 +11,6 @@
namespace OpenGL {
class StateTracker;
class VertexArrayPushBuffer final {
public:
explicit VertexArrayPushBuffer(StateTracker& state_tracker);
~VertexArrayPushBuffer();
void Setup();
void SetIndexBuffer(const GLuint* buffer);
void SetVertexBuffer(GLuint binding_index, const GLuint* buffer, GLintptr offset,
GLsizei stride);
void Bind();
private:
struct Entry;
StateTracker& state_tracker;
const GLuint* index_buffer{};
std::vector<Entry> vertex_buffers;
};
class BindBuffersRangePushBuffer final {
public:
explicit BindBuffersRangePushBuffer(GLenum target);
~BindBuffersRangePushBuffer();
void Setup();
void Push(GLuint binding, const GLuint* buffer, GLintptr offset, GLsizeiptr size);
void Bind();
private:
struct Entry;
GLenum target;
std::vector<Entry> entries;
};
void LabelGLObject(GLenum identifier, GLuint handle, VAddr addr, std::string_view extra_info = {});
} // namespace OpenGL

View File

@@ -360,6 +360,7 @@ VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttrib
default:
break;
}
break;
case Maxwell::VertexAttribute::Type::UnsignedInt:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:
@@ -370,6 +371,14 @@ VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttrib
return VK_FORMAT_R8G8B8_UINT;
case Maxwell::VertexAttribute::Size::Size_8_8_8_8:
return VK_FORMAT_R8G8B8A8_UINT;
case Maxwell::VertexAttribute::Size::Size_16:
return VK_FORMAT_R16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16:
return VK_FORMAT_R16G16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16_16:
return VK_FORMAT_R16G16B16_UINT;
case Maxwell::VertexAttribute::Size::Size_16_16_16_16:
return VK_FORMAT_R16G16B16A16_UINT;
case Maxwell::VertexAttribute::Size::Size_32:
return VK_FORMAT_R32_UINT;
case Maxwell::VertexAttribute::Size::Size_32_32:
@@ -381,6 +390,7 @@ VkFormat VertexFormat(Maxwell::VertexAttribute::Type type, Maxwell::VertexAttrib
default:
break;
}
break;
case Maxwell::VertexAttribute::Type::UnsignedScaled:
switch (size) {
case Maxwell::VertexAttribute::Size::Size_8:

View File

@@ -1,220 +0,0 @@
// Copyright 2020 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#ifdef HAS_NSIGHT_AFTERMATH
#include <mutex>
#include <string>
#include <string_view>
#include <utility>
#include <vector>
#include <fmt/format.h>
#define VK_NO_PROTOTYPES
#include <vulkan/vulkan.h>
#include <GFSDK_Aftermath.h>
#include <GFSDK_Aftermath_Defines.h>
#include <GFSDK_Aftermath_GpuCrashDump.h>
#include <GFSDK_Aftermath_GpuCrashDumpDecoding.h>
#include "common/common_paths.h"
#include "common/common_types.h"
#include "common/file_util.h"
#include "common/logging/log.h"
#include "common/scope_exit.h"
#include "video_core/renderer_vulkan/nsight_aftermath_tracker.h"
namespace Vulkan {
static constexpr char AFTERMATH_LIB_NAME[] = "GFSDK_Aftermath_Lib.x64.dll";
NsightAftermathTracker::NsightAftermathTracker() = default;
NsightAftermathTracker::~NsightAftermathTracker() {
if (initialized) {
(void)GFSDK_Aftermath_DisableGpuCrashDumps();
}
}
bool NsightAftermathTracker::Initialize() {
if (!dl.Open(AFTERMATH_LIB_NAME)) {
LOG_ERROR(Render_Vulkan, "Failed to load Nsight Aftermath DLL");
return false;
}
if (!dl.GetSymbol("GFSDK_Aftermath_DisableGpuCrashDumps",
&GFSDK_Aftermath_DisableGpuCrashDumps) ||
!dl.GetSymbol("GFSDK_Aftermath_EnableGpuCrashDumps",
&GFSDK_Aftermath_EnableGpuCrashDumps) ||
!dl.GetSymbol("GFSDK_Aftermath_GetShaderDebugInfoIdentifier",
&GFSDK_Aftermath_GetShaderDebugInfoIdentifier) ||
!dl.GetSymbol("GFSDK_Aftermath_GetShaderHashSpirv", &GFSDK_Aftermath_GetShaderHashSpirv) ||
!dl.GetSymbol("GFSDK_Aftermath_GpuCrashDump_CreateDecoder",
&GFSDK_Aftermath_GpuCrashDump_CreateDecoder) ||
!dl.GetSymbol("GFSDK_Aftermath_GpuCrashDump_DestroyDecoder",
&GFSDK_Aftermath_GpuCrashDump_DestroyDecoder) ||
!dl.GetSymbol("GFSDK_Aftermath_GpuCrashDump_GenerateJSON",
&GFSDK_Aftermath_GpuCrashDump_GenerateJSON) ||
!dl.GetSymbol("GFSDK_Aftermath_GpuCrashDump_GetJSON",
&GFSDK_Aftermath_GpuCrashDump_GetJSON)) {
LOG_ERROR(Render_Vulkan, "Failed to load Nsight Aftermath function pointers");
return false;
}
dump_dir = FileUtil::GetUserPath(FileUtil::UserPath::LogDir) + "gpucrash";
(void)FileUtil::DeleteDirRecursively(dump_dir);
if (!FileUtil::CreateDir(dump_dir)) {
LOG_ERROR(Render_Vulkan, "Failed to create Nsight Aftermath dump directory");
return false;
}
if (!GFSDK_Aftermath_SUCCEED(GFSDK_Aftermath_EnableGpuCrashDumps(
GFSDK_Aftermath_Version_API, GFSDK_Aftermath_GpuCrashDumpWatchedApiFlags_Vulkan,
GFSDK_Aftermath_GpuCrashDumpFeatureFlags_Default, GpuCrashDumpCallback,
ShaderDebugInfoCallback, CrashDumpDescriptionCallback, this))) {
LOG_ERROR(Render_Vulkan, "GFSDK_Aftermath_EnableGpuCrashDumps failed");
return false;
}
LOG_INFO(Render_Vulkan, "Nsight Aftermath dump directory is \"{}\"", dump_dir);
initialized = true;
return true;
}
void NsightAftermathTracker::SaveShader(const std::vector<u32>& spirv) const {
if (!initialized) {
return;
}
std::vector<u32> spirv_copy = spirv;
GFSDK_Aftermath_SpirvCode shader;
shader.pData = spirv_copy.data();
shader.size = static_cast<u32>(spirv_copy.size() * 4);
std::scoped_lock lock{mutex};
GFSDK_Aftermath_ShaderHash hash;
if (!GFSDK_Aftermath_SUCCEED(
GFSDK_Aftermath_GetShaderHashSpirv(GFSDK_Aftermath_Version_API, &shader, &hash))) {
LOG_ERROR(Render_Vulkan, "Failed to hash SPIR-V module");
return;
}
FileUtil::IOFile file(fmt::format("{}/source_{:016x}.spv", dump_dir, hash.hash), "wb");
if (!file.IsOpen()) {
LOG_ERROR(Render_Vulkan, "Failed to dump SPIR-V module with hash={:016x}", hash.hash);
return;
}
if (file.WriteArray(spirv.data(), spirv.size()) != spirv.size()) {
LOG_ERROR(Render_Vulkan, "Failed to write SPIR-V module with hash={:016x}", hash.hash);
return;
}
}
void NsightAftermathTracker::OnGpuCrashDumpCallback(const void* gpu_crash_dump,
u32 gpu_crash_dump_size) {
std::scoped_lock lock{mutex};
LOG_CRITICAL(Render_Vulkan, "called");
GFSDK_Aftermath_GpuCrashDump_Decoder decoder;
if (!GFSDK_Aftermath_SUCCEED(GFSDK_Aftermath_GpuCrashDump_CreateDecoder(
GFSDK_Aftermath_Version_API, gpu_crash_dump, gpu_crash_dump_size, &decoder))) {
LOG_ERROR(Render_Vulkan, "Failed to create decoder");
return;
}
SCOPE_EXIT({ GFSDK_Aftermath_GpuCrashDump_DestroyDecoder(decoder); });
u32 json_size = 0;
if (!GFSDK_Aftermath_SUCCEED(GFSDK_Aftermath_GpuCrashDump_GenerateJSON(
decoder, GFSDK_Aftermath_GpuCrashDumpDecoderFlags_ALL_INFO,
GFSDK_Aftermath_GpuCrashDumpFormatterFlags_NONE, nullptr, nullptr, nullptr, nullptr,
this, &json_size))) {
LOG_ERROR(Render_Vulkan, "Failed to generate JSON");
return;
}
std::vector<char> json(json_size);
if (!GFSDK_Aftermath_SUCCEED(
GFSDK_Aftermath_GpuCrashDump_GetJSON(decoder, json_size, json.data()))) {
LOG_ERROR(Render_Vulkan, "Failed to query JSON");
return;
}
const std::string base_name = [this] {
const int id = dump_id++;
if (id == 0) {
return fmt::format("{}/crash.nv-gpudmp", dump_dir);
} else {
return fmt::format("{}/crash_{}.nv-gpudmp", dump_dir, id);
}
}();
std::string_view dump_view(static_cast<const char*>(gpu_crash_dump), gpu_crash_dump_size);
if (FileUtil::WriteStringToFile(false, base_name, dump_view) != gpu_crash_dump_size) {
LOG_ERROR(Render_Vulkan, "Failed to write dump file");
return;
}
const std::string_view json_view(json.data(), json.size());
if (FileUtil::WriteStringToFile(true, base_name + ".json", json_view) != json.size()) {
LOG_ERROR(Render_Vulkan, "Failed to write JSON");
return;
}
}
void NsightAftermathTracker::OnShaderDebugInfoCallback(const void* shader_debug_info,
u32 shader_debug_info_size) {
std::scoped_lock lock{mutex};
GFSDK_Aftermath_ShaderDebugInfoIdentifier identifier;
if (!GFSDK_Aftermath_SUCCEED(GFSDK_Aftermath_GetShaderDebugInfoIdentifier(
GFSDK_Aftermath_Version_API, shader_debug_info, shader_debug_info_size, &identifier))) {
LOG_ERROR(Render_Vulkan, "GFSDK_Aftermath_GetShaderDebugInfoIdentifier failed");
return;
}
const std::string path =
fmt::format("{}/shader_{:016x}{:016x}.nvdbg", dump_dir, identifier.id[0], identifier.id[1]);
FileUtil::IOFile file(path, "wb");
if (!file.IsOpen()) {
LOG_ERROR(Render_Vulkan, "Failed to create file {}", path);
return;
}
if (file.WriteBytes(static_cast<const u8*>(shader_debug_info), shader_debug_info_size) !=
shader_debug_info_size) {
LOG_ERROR(Render_Vulkan, "Failed to write file {}", path);
return;
}
}
void NsightAftermathTracker::OnCrashDumpDescriptionCallback(
PFN_GFSDK_Aftermath_AddGpuCrashDumpDescription add_description) {
add_description(GFSDK_Aftermath_GpuCrashDumpDescriptionKey_ApplicationName, "yuzu");
}
void NsightAftermathTracker::GpuCrashDumpCallback(const void* gpu_crash_dump,
u32 gpu_crash_dump_size, void* user_data) {
static_cast<NsightAftermathTracker*>(user_data)->OnGpuCrashDumpCallback(gpu_crash_dump,
gpu_crash_dump_size);
}
void NsightAftermathTracker::ShaderDebugInfoCallback(const void* shader_debug_info,
u32 shader_debug_info_size, void* user_data) {
static_cast<NsightAftermathTracker*>(user_data)->OnShaderDebugInfoCallback(
shader_debug_info, shader_debug_info_size);
}
void NsightAftermathTracker::CrashDumpDescriptionCallback(
PFN_GFSDK_Aftermath_AddGpuCrashDumpDescription add_description, void* user_data) {
static_cast<NsightAftermathTracker*>(user_data)->OnCrashDumpDescriptionCallback(
add_description);
}
} // namespace Vulkan
#endif // HAS_NSIGHT_AFTERMATH
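
Note on the callback wiring in the removed tracker: the static `*Callback` functions receive a `void* user_data` and forward to the matching `On*` member function, which is also why the header deletes the move operations. A minimal standalone sketch of that trampoline pattern follows. `FakeCrashApi`, `Tracker`, and `DumpTrampoline` are hypothetical names for illustration only; this is not the Aftermath API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a C API that stores a callback and an opaque pointer.
using DumpCallback = void (*)(const void* data, std::uint32_t size, void* user_data);

struct FakeCrashApi {
    DumpCallback callback = nullptr;
    void* user_data = nullptr;

    void Enable(DumpCallback cb, void* user) {
        callback = cb;
        user_data = user;
    }

    // Simulates the library reporting a dump of `size` bytes.
    void Fire(const void* data, std::uint32_t size) const {
        assert(callback != nullptr);
        callback(data, size, user_data);
    }
};

class Tracker {
public:
    explicit Tracker(FakeCrashApi& api) {
        // The API now holds `this`, so the object must never change address.
        api.Enable(&Tracker::DumpTrampoline, this);
    }

    Tracker(const Tracker&) = delete;
    Tracker& operator=(const Tracker&) = delete;
    Tracker(Tracker&&) = delete;
    Tracker& operator=(Tracker&&) = delete;

    int NumDumps() const { return num_dumps; }
    std::size_t TotalBytes() const { return total_bytes; }

private:
    // Static function with the C signature; forwards to the member function.
    static void DumpTrampoline(const void* data, std::uint32_t size, void* user_data) {
        static_cast<Tracker*>(user_data)->OnDump(data, size);
    }

    void OnDump(const void* /*data*/, std::uint32_t size) {
        ++num_dumps;
        total_bytes += size;
    }

    int num_dumps = 0;
    std::size_t total_bytes = 0;
};
```

Constructing a `Tracker` against a `FakeCrashApi` and calling `Fire` twice should leave `NumDumps()` at 2, with `TotalBytes()` equal to the sum of the two sizes.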


@@ -1,87 +0,0 @@
// Copyright 2020 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <mutex>
#include <string>
#include <vector>
#define VK_NO_PROTOTYPES
#include <vulkan/vulkan.h>
#ifdef HAS_NSIGHT_AFTERMATH
#include <GFSDK_Aftermath_Defines.h>
#include <GFSDK_Aftermath_GpuCrashDump.h>
#include <GFSDK_Aftermath_GpuCrashDumpDecoding.h>
#endif
#include "common/common_types.h"
#include "common/dynamic_library.h"
namespace Vulkan {
class NsightAftermathTracker {
public:
NsightAftermathTracker();
~NsightAftermathTracker();
NsightAftermathTracker(const NsightAftermathTracker&) = delete;
NsightAftermathTracker& operator=(const NsightAftermathTracker&) = delete;
// Delete move semantics because Aftermath initialization uses a pointer to this.
NsightAftermathTracker(NsightAftermathTracker&&) = delete;
NsightAftermathTracker& operator=(NsightAftermathTracker&&) = delete;
bool Initialize();
void SaveShader(const std::vector<u32>& spirv) const;
private:
#ifdef HAS_NSIGHT_AFTERMATH
static void GpuCrashDumpCallback(const void* gpu_crash_dump, u32 gpu_crash_dump_size,
void* user_data);
static void ShaderDebugInfoCallback(const void* shader_debug_info, u32 shader_debug_info_size,
void* user_data);
static void CrashDumpDescriptionCallback(
PFN_GFSDK_Aftermath_AddGpuCrashDumpDescription add_description, void* user_data);
void OnGpuCrashDumpCallback(const void* gpu_crash_dump, u32 gpu_crash_dump_size);
void OnShaderDebugInfoCallback(const void* shader_debug_info, u32 shader_debug_info_size);
void OnCrashDumpDescriptionCallback(
PFN_GFSDK_Aftermath_AddGpuCrashDumpDescription add_description);
mutable std::mutex mutex;
std::string dump_dir;
int dump_id = 0;
bool initialized = false;
Common::DynamicLibrary dl;
PFN_GFSDK_Aftermath_DisableGpuCrashDumps GFSDK_Aftermath_DisableGpuCrashDumps;
PFN_GFSDK_Aftermath_EnableGpuCrashDumps GFSDK_Aftermath_EnableGpuCrashDumps;
PFN_GFSDK_Aftermath_GetShaderDebugInfoIdentifier GFSDK_Aftermath_GetShaderDebugInfoIdentifier;
PFN_GFSDK_Aftermath_GetShaderHashSpirv GFSDK_Aftermath_GetShaderHashSpirv;
PFN_GFSDK_Aftermath_GpuCrashDump_CreateDecoder GFSDK_Aftermath_GpuCrashDump_CreateDecoder;
PFN_GFSDK_Aftermath_GpuCrashDump_DestroyDecoder GFSDK_Aftermath_GpuCrashDump_DestroyDecoder;
PFN_GFSDK_Aftermath_GpuCrashDump_GenerateJSON GFSDK_Aftermath_GpuCrashDump_GenerateJSON;
PFN_GFSDK_Aftermath_GpuCrashDump_GetJSON GFSDK_Aftermath_GpuCrashDump_GetJSON;
#endif
};
#ifndef HAS_NSIGHT_AFTERMATH
inline NsightAftermathTracker::NsightAftermathTracker() = default;
inline NsightAftermathTracker::~NsightAftermathTracker() = default;
inline bool NsightAftermathTracker::Initialize() {
return false;
}
inline void NsightAftermathTracker::SaveShader(const std::vector<u32>&) const {}
#endif
} // namespace Vulkan


@@ -535,7 +535,9 @@ void VKBlitScreen::CreateGraphicsPipeline() {
viewport_state_ci.pNext = nullptr;
viewport_state_ci.flags = 0;
viewport_state_ci.viewportCount = 1;
viewport_state_ci.pViewports = nullptr;
viewport_state_ci.scissorCount = 1;
viewport_state_ci.pScissors = nullptr;
VkPipelineRasterizationStateCreateInfo rasterization_ci;
rasterization_ci.sType = VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO;


@@ -74,18 +74,18 @@ Buffer VKBufferCache::CreateBlock(VAddr cpu_addr, std::size_t size) {
return std::make_shared<CachedBufferBlock>(device, memory_manager, cpu_addr, size);
}
const VkBuffer* VKBufferCache::ToHandle(const Buffer& buffer) {
VkBuffer VKBufferCache::ToHandle(const Buffer& buffer) {
return buffer->GetHandle();
}
const VkBuffer* VKBufferCache::GetEmptyBuffer(std::size_t size) {
VkBuffer VKBufferCache::GetEmptyBuffer(std::size_t size) {
size = std::max(size, std::size_t(4));
const auto& empty = staging_pool.GetUnusedBuffer(size, false);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([size, buffer = *empty.handle](vk::CommandBuffer cmdbuf) {
cmdbuf.FillBuffer(buffer, 0, size, 0);
});
return empty.handle.address();
return *empty.handle;
}
void VKBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
@@ -94,7 +94,7 @@ void VKBufferCache::UploadBlockData(const Buffer& buffer, std::size_t offset, st
std::memcpy(staging.commit->Map(size), data, size);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([staging = *staging.handle, buffer = *buffer->GetHandle(), offset,
scheduler.Record([staging = *staging.handle, buffer = buffer->GetHandle(), offset,
size](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBuffer(staging, buffer, VkBufferCopy{0, offset, size});
@@ -117,7 +117,7 @@ void VKBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset,
u8* data) {
const auto& staging = staging_pool.GetUnusedBuffer(size, true);
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([staging = *staging.handle, buffer = *buffer->GetHandle(), offset,
scheduler.Record([staging = *staging.handle, buffer = buffer->GetHandle(), offset,
size](vk::CommandBuffer cmdbuf) {
VkBufferMemoryBarrier barrier;
barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
@@ -144,7 +144,7 @@ void VKBufferCache::DownloadBlockData(const Buffer& buffer, std::size_t offset,
void VKBufferCache::CopyBlock(const Buffer& src, const Buffer& dst, std::size_t src_offset,
std::size_t dst_offset, std::size_t size) {
scheduler.RequestOutsideRenderPassOperationContext();
scheduler.Record([src_buffer = *src->GetHandle(), dst_buffer = *dst->GetHandle(), src_offset,
scheduler.Record([src_buffer = src->GetHandle(), dst_buffer = dst->GetHandle(), src_offset,
dst_offset, size](vk::CommandBuffer cmdbuf) {
cmdbuf.CopyBuffer(src_buffer, dst_buffer, VkBufferCopy{src_offset, dst_offset, size});


@@ -33,8 +33,8 @@ public:
VAddr cpu_addr, std::size_t size);
~CachedBufferBlock();
const VkBuffer* GetHandle() const {
return buffer.handle.address();
VkBuffer GetHandle() const {
return *buffer.handle;
}
private:
@@ -50,15 +50,15 @@ public:
VKScheduler& scheduler, VKStagingBufferPool& staging_pool);
~VKBufferCache();
const VkBuffer* GetEmptyBuffer(std::size_t size) override;
VkBuffer GetEmptyBuffer(std::size_t size) override;
protected:
VkBuffer ToHandle(const Buffer& buffer) override;
void WriteBarrier() override {}
Buffer CreateBlock(VAddr cpu_addr, std::size_t size) override;
const VkBuffer* ToHandle(const Buffer& buffer) override;
void UploadBlockData(const Buffer& buffer, std::size_t offset, std::size_t size,
const u8* data) override;
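
Note on the handle change in the buffer cache: `VkBuffer` is a small opaque handle, so returning it by value lets the scheduler lambdas capture a copy instead of dereferencing a pointer into an object that may later move. A rough sketch of why the copy is the safer capture, under stated assumptions: `FakeHandle`, `Block`, `Recorder`, and `RecordThenGrow` are hypothetical stand-ins, not yuzu types.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using FakeHandle = std::uint64_t; // hypothetical stand-in for VkBuffer

class Block {
public:
    explicit Block(FakeHandle handle_) : handle{handle_} {
        assert(handle != 0); // zero plays the role of a null handle here
    }

    // Returns the handle itself, not a pointer to the member.
    FakeHandle GetHandle() const { return handle; }

private:
    FakeHandle handle;
};

using Command = std::function<void(std::vector<FakeHandle>&)>;

// Records deferred commands and runs them later, like a command scheduler.
class Recorder {
public:
    void Record(Command command) { commands.push_back(std::move(command)); }

    std::vector<FakeHandle> Flush() {
        std::vector<FakeHandle> bound;
        for (const Command& command : commands) {
            command(bound);
        }
        commands.clear();
        return bound;
    }

private:
    std::vector<Command> commands;
};

inline std::vector<FakeHandle> RecordThenGrow() {
    std::vector<Block> blocks;
    blocks.emplace_back(FakeHandle{42});

    Recorder recorder;
    // Capture the handle value; a pointer into blocks[0] would dangle below.
    recorder.Record([buffer = blocks[0].GetHandle()](std::vector<FakeHandle>& bound) {
        bound.push_back(buffer);
    });

    // Growing the vector relocates every Block, but the captured copy is unaffected.
    for (FakeHandle h = 100; h < 200; ++h) {
        blocks.emplace_back(h);
    }
    return recorder.Flush();
}
```

The deferred command still sees the original handle value after the owning objects have been relocated.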


@@ -343,13 +343,13 @@ QuadArrayPass::QuadArrayPass(const VKDevice& device, VKScheduler& scheduler,
QuadArrayPass::~QuadArrayPass() = default;
std::pair<const VkBuffer*, VkDeviceSize> QuadArrayPass::Assemble(u32 num_vertices, u32 first) {
std::pair<VkBuffer, VkDeviceSize> QuadArrayPass::Assemble(u32 num_vertices, u32 first) {
const u32 num_triangle_vertices = num_vertices * 6 / 4;
const std::size_t staging_size = num_triangle_vertices * sizeof(u32);
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(buffer.handle.address(), 0, staging_size);
update_descriptor_queue.AddBuffer(*buffer.handle, 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
@@ -377,7 +377,7 @@ std::pair<const VkBuffer*, VkDeviceSize> QuadArrayPass::Assemble(u32 num_vertice
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, {barrier}, {});
});
return {buffer.handle.address(), 0};
return {*buffer.handle, 0};
}
Uint8Pass::Uint8Pass(const VKDevice& device, VKScheduler& scheduler,
@@ -391,14 +391,14 @@ Uint8Pass::Uint8Pass(const VKDevice& device, VKScheduler& scheduler,
Uint8Pass::~Uint8Pass() = default;
std::pair<const VkBuffer*, u64> Uint8Pass::Assemble(u32 num_vertices, VkBuffer src_buffer,
u64 src_offset) {
std::pair<VkBuffer, u64> Uint8Pass::Assemble(u32 num_vertices, VkBuffer src_buffer,
u64 src_offset) {
const auto staging_size = static_cast<u32>(num_vertices * sizeof(u16));
auto& buffer = staging_buffer_pool.GetUnusedBuffer(staging_size, false);
update_descriptor_queue.Acquire();
update_descriptor_queue.AddBuffer(&src_buffer, src_offset, num_vertices);
update_descriptor_queue.AddBuffer(buffer.handle.address(), 0, staging_size);
update_descriptor_queue.AddBuffer(src_buffer, src_offset, num_vertices);
update_descriptor_queue.AddBuffer(*buffer.handle, 0, staging_size);
const auto set = CommitDescriptorSet(update_descriptor_queue, scheduler.GetFence());
scheduler.RequestOutsideRenderPassOperationContext();
@@ -422,7 +422,7 @@ std::pair<const VkBuffer*, u64> Uint8Pass::Assemble(u32 num_vertices, VkBuffer s
cmdbuf.PipelineBarrier(VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_PIPELINE_STAGE_VERTEX_INPUT_BIT, 0, {}, barrier, {});
});
return {buffer.handle.address(), 0};
return {*buffer.handle, 0};
}
} // namespace Vulkan


@@ -50,7 +50,7 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue);
~QuadArrayPass();
std::pair<const VkBuffer*, VkDeviceSize> Assemble(u32 num_vertices, u32 first);
std::pair<VkBuffer, VkDeviceSize> Assemble(u32 num_vertices, u32 first);
private:
VKScheduler& scheduler;
@@ -65,7 +65,7 @@ public:
VKUpdateDescriptorQueue& update_descriptor_queue);
~Uint8Pass();
std::pair<const VkBuffer*, u64> Assemble(u32 num_vertices, VkBuffer src_buffer, u64 src_offset);
std::pair<VkBuffer, u64> Assemble(u32 num_vertices, VkBuffer src_buffer, u64 src_offset);
private:
VKScheduler& scheduler;


@@ -105,8 +105,6 @@ vk::DescriptorUpdateTemplateKHR VKComputePipeline::CreateDescriptorUpdateTemplat
}
vk::ShaderModule VKComputePipeline::CreateShaderModule(const std::vector<u32>& code) const {
device.SaveShader(code);
VkShaderModuleCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO;
ci.pNext = nullptr;


@@ -9,7 +9,6 @@
#include <string_view>
#include <thread>
#include <unordered_set>
#include <utility>
#include <vector>
#include "common/assert.h"
@@ -168,7 +167,6 @@ bool VKDevice::Create() {
VkPhysicalDeviceFeatures2 features2;
features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;
features2.pNext = nullptr;
const void* first_next = &features2;
void** next = &features2.pNext;
auto& features = features2.features;
@@ -298,19 +296,7 @@ bool VKDevice::Create() {
LOG_INFO(Render_Vulkan, "Device doesn't support depth range unrestricted");
}
VkDeviceDiagnosticsConfigCreateInfoNV diagnostics_nv;
if (nv_device_diagnostics_config) {
nsight_aftermath_tracker.Initialize();
diagnostics_nv.sType = VK_STRUCTURE_TYPE_DEVICE_DIAGNOSTICS_CONFIG_CREATE_INFO_NV;
diagnostics_nv.pNext = &features2;
diagnostics_nv.flags = VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_SHADER_DEBUG_INFO_BIT_NV |
VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_RESOURCE_TRACKING_BIT_NV |
VK_DEVICE_DIAGNOSTICS_CONFIG_ENABLE_AUTOMATIC_CHECKPOINTS_BIT_NV;
first_next = &diagnostics_nv;
}
logical = vk::Device::Create(physical, queue_cis, extensions, first_next, dld);
logical = vk::Device::Create(physical, queue_cis, extensions, features2, dld);
if (!logical) {
LOG_ERROR(Render_Vulkan, "Failed to create logical device");
return false;
@@ -358,12 +344,17 @@ VkFormat VKDevice::GetSupportedFormat(VkFormat wanted_format, VkFormatFeatureFla
void VKDevice::ReportLoss() const {
LOG_CRITICAL(Render_Vulkan, "Device loss occurred!");
// Wait for the log to flush and for Nsight Aftermath to dump the results
std::this_thread::sleep_for(std::chrono::seconds{3});
}
// Wait some time to let the log flush
std::this_thread::sleep_for(std::chrono::seconds{1});
void VKDevice::SaveShader(const std::vector<u32>& spirv) const {
nsight_aftermath_tracker.SaveShader(spirv);
if (!nv_device_diagnostic_checkpoints) {
return;
}
[[maybe_unused]] const std::vector data = graphics_queue.GetCheckpointDataNV(dld);
// Catch here in debug builds (or with optimizations disabled) the last graphics pipeline to be
// executed. It can be done on a debugger by evaluating the expression:
// *(VKGraphicsPipeline*)data[0]
}
bool VKDevice::IsOptimalAstcSupported(const VkPhysicalDeviceFeatures& features) const {
@@ -536,8 +527,8 @@ std::vector<const char*> VKDevice::LoadExtensions() {
Test(extension, has_ext_transform_feedback, VK_EXT_TRANSFORM_FEEDBACK_EXTENSION_NAME,
false);
if (Settings::values.renderer_debug) {
Test(extension, nv_device_diagnostics_config,
VK_NV_DEVICE_DIAGNOSTICS_CONFIG_EXTENSION_NAME, true);
Test(extension, nv_device_diagnostic_checkpoints,
VK_NV_DEVICE_DIAGNOSTIC_CHECKPOINTS_EXTENSION_NAME, true);
}
}


@@ -10,7 +10,6 @@
#include <vector>
#include "common/common_types.h"
#include "video_core/renderer_vulkan/nsight_aftermath_tracker.h"
#include "video_core/renderer_vulkan/wrapper.h"
namespace Vulkan {
@@ -44,9 +43,6 @@ public:
/// Reports a device loss.
void ReportLoss() const;
/// Reports a shader to Nsight Aftermath.
void SaveShader(const std::vector<u32>& spirv) const;
/// Returns the dispatch loader with direct function pointers of the device.
const vk::DeviceDispatch& GetDispatchLoader() const {
return dld;
@@ -177,6 +173,11 @@ public:
return ext_transform_feedback;
}
/// Returns true if the device supports VK_NV_device_diagnostic_checkpoints.
bool IsNvDeviceDiagnosticCheckpoints() const {
return nv_device_diagnostic_checkpoints;
}
/// Returns the vendor name reported from Vulkan.
std::string_view GetVendorName() const {
return vendor_name;
@@ -232,7 +233,7 @@ private:
bool ext_depth_range_unrestricted{}; ///< Support for VK_EXT_depth_range_unrestricted.
bool ext_shader_viewport_index_layer{}; ///< Support for VK_EXT_shader_viewport_index_layer.
bool ext_transform_feedback{}; ///< Support for VK_EXT_transform_feedback.
bool nv_device_diagnostics_config{}; ///< Support for VK_NV_device_diagnostics_config.
bool nv_device_diagnostic_checkpoints{}; ///< Support for VK_NV_device_diagnostic_checkpoints.
// Telemetry parameters
std::string vendor_name; ///< Device's driver name.
@@ -240,9 +241,6 @@ private:
/// Format properties dictionary.
std::unordered_map<VkFormat, VkFormatProperties> format_properties;
/// Nsight Aftermath GPU crash tracker
NsightAftermathTracker nsight_aftermath_tracker;
};
} // namespace Vulkan


@@ -147,8 +147,6 @@ std::vector<vk::ShaderModule> VKGraphicsPipeline::CreateShaderModules(
continue;
}
device.SaveShader(stage->code);
ci.codeSize = stage->code.size() * sizeof(u32);
ci.pCode = stage->code.data();
modules.push_back(device.GetLogical().CreateShaderModule(ci));


@@ -32,7 +32,7 @@ public:
* memory. When passing false, it will try to allocate device local memory.
* @returns A memory commit.
*/
VKMemoryCommit Commit(const VkMemoryRequirements& reqs, bool host_visible);
VKMemoryCommit Commit(const VkMemoryRequirements& requirements, bool host_visible);
/// Commits memory required by the buffer and binds it.
VKMemoryCommit Commit(const vk::Buffer& buffer, bool host_visible);


@@ -113,19 +113,8 @@ u64 HostCounter::BlockingQuery() const {
if (ticks >= cache.Scheduler().Ticks()) {
cache.Scheduler().Flush();
}
u64 data;
const VkResult result = cache.Device().GetLogical().GetQueryResults(
query.first, query.second, 1, sizeof(data), &data, sizeof(data),
VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT);
switch (result) {
case VK_SUCCESS:
return data;
case VK_ERROR_DEVICE_LOST:
cache.Device().ReportLoss();
[[fallthrough]];
default:
throw vk::Exception(result);
}
return cache.Device().GetLogical().GetQueryResult<u64>(
query.first, query.second, VK_QUERY_RESULT_64_BIT | VK_QUERY_RESULT_WAIT_BIT);
}
} // namespace Vulkan
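
Note on the `BlockingQuery` hunk: the explicit `switch` on `VkResult` is replaced by a typed helper that is expected to throw on failure. The sketch below shows one way such a throwing wrapper could be structured. It is not the implementation from this diff: `Result`, `ResultError`, `RawGetQueryResults`, and the `device_ok` flag are all hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>

enum class Result { Success, DeviceLost }; // hypothetical result codes

class ResultError : public std::runtime_error {
public:
    explicit ResultError(Result code_) : std::runtime_error{"call failed"}, code{code_} {}
    Result code;
};

// Turns a non-success result code into an exception.
inline void Check(Result result) {
    if (result != Result::Success) {
        throw ResultError{result};
    }
}

// Hypothetical raw call: fills `data` with a fixed counter value or reports a loss.
inline Result RawGetQueryResults(bool device_ok, std::size_t data_size, void* data) {
    assert(data != nullptr);
    if (!device_ok) {
        return Result::DeviceLost;
    }
    const std::uint64_t value = 1234;
    std::memcpy(data, &value, data_size < sizeof(value) ? data_size : sizeof(value));
    return Result::Success;
}

// Typed convenience wrapper: one result, returned by value, errors as exceptions.
template <typename T>
T GetQueryResult(bool device_ok) {
    static_assert(std::is_trivially_copyable<T>::value, "T is filled with raw bytes");
    T data{};
    Check(RawGetQueryResults(device_ok, sizeof(data), &data));
    return data;
}
```

With this shape the call site shrinks to a single `return`, at the cost of moving any device-loss handling into whoever catches the exception.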


@@ -62,13 +62,16 @@ constexpr auto ComputeShaderIndex = static_cast<std::size_t>(Tegra::Engines::Sha
VkViewport GetViewportState(const VKDevice& device, const Maxwell& regs, std::size_t index) {
const auto& src = regs.viewport_transform[index];
const float width = src.scale_x * 2.0f;
const float height = src.scale_y * 2.0f;
VkViewport viewport;
viewport.x = src.translate_x - src.scale_x;
viewport.y = src.translate_y - src.scale_y;
viewport.width = src.scale_x * 2.0f;
viewport.height = src.scale_y * 2.0f;
viewport.width = width != 0.0f ? width : 1.0f;
viewport.height = height != 0.0f ? height : 1.0f;
const float reduce_z = regs.depth_mode == Maxwell::DepthMode::MinusOneToOne;
const float reduce_z = regs.depth_mode == Maxwell::DepthMode::MinusOneToOne ? 1.0f : 0.0f;
viewport.minDepth = src.translate_z - src.scale_z * reduce_z;
viewport.maxDepth = src.translate_z + src.scale_z;
if (!device.IsExtDepthRangeUnrestrictedSupported()) {
@@ -134,13 +137,13 @@ Tegra::Texture::FullTextureInfo GetTextureInfo(const Engine& engine, const Entry
class BufferBindings final {
public:
void AddVertexBinding(const VkBuffer* buffer, VkDeviceSize offset) {
vertex.buffer_ptrs[vertex.num_buffers] = buffer;
void AddVertexBinding(VkBuffer buffer, VkDeviceSize offset) {
vertex.buffers[vertex.num_buffers] = buffer;
vertex.offsets[vertex.num_buffers] = offset;
++vertex.num_buffers;
}
void SetIndexBinding(const VkBuffer* buffer, VkDeviceSize offset, VkIndexType type) {
void SetIndexBinding(VkBuffer buffer, VkDeviceSize offset, VkIndexType type) {
index.buffer = buffer;
index.offset = offset;
index.type = type;
@@ -224,19 +227,19 @@ private:
// Some of these fields are intentionally left uninitialized to avoid initializing them twice.
struct {
std::size_t num_buffers = 0;
std::array<const VkBuffer*, Maxwell::NumVertexArrays> buffer_ptrs;
std::array<VkBuffer, Maxwell::NumVertexArrays> buffers;
std::array<VkDeviceSize, Maxwell::NumVertexArrays> offsets;
} vertex;
struct {
const VkBuffer* buffer = nullptr;
VkBuffer buffer = nullptr;
VkDeviceSize offset;
VkIndexType type;
} index;
template <std::size_t N>
void BindStatic(VKScheduler& scheduler) const {
if (index.buffer != nullptr) {
if (index.buffer) {
BindStatic<N, true>(scheduler);
} else {
BindStatic<N, false>(scheduler);
@@ -251,18 +254,14 @@ private:
}
std::array<VkBuffer, N> buffers;
std::transform(vertex.buffer_ptrs.begin(), vertex.buffer_ptrs.begin() + N, buffers.begin(),
[](const auto ptr) { return *ptr; });
std::array<VkDeviceSize, N> offsets;
std::copy(vertex.buffers.begin(), vertex.buffers.begin() + N, buffers.begin());
std::copy(vertex.offsets.begin(), vertex.offsets.begin() + N, offsets.begin());
if constexpr (is_indexed) {
// Indexed draw
scheduler.Record([buffers, offsets, index_buffer = *index.buffer,
index_offset = index.offset,
index_type = index.type](vk::CommandBuffer cmdbuf) {
cmdbuf.BindIndexBuffer(index_buffer, index_offset, index_type);
scheduler.Record([buffers, offsets, index = index](vk::CommandBuffer cmdbuf) {
cmdbuf.BindIndexBuffer(index.buffer, index.offset, index.type);
cmdbuf.BindVertexBuffers(0, static_cast<u32>(N), buffers.data(), offsets.data());
});
} else {
@@ -347,6 +346,11 @@ void RasterizerVulkan::Draw(bool is_indexed, bool is_instanced) {
buffer_bindings.Bind(scheduler);
if (device.IsNvDeviceDiagnosticCheckpoints()) {
scheduler.Record(
[&pipeline](vk::CommandBuffer cmdbuf) { cmdbuf.SetCheckpointNV(&pipeline); });
}
BeginTransformFeedback();
const auto pipeline_layout = pipeline.GetLayout();
@@ -473,6 +477,11 @@ void RasterizerVulkan::DispatchCompute(GPUVAddr code_addr) {
TransitionImages(image_views, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
VK_ACCESS_SHADER_READ_BIT | VK_ACCESS_SHADER_WRITE_BIT);
if (device.IsNvDeviceDiagnosticCheckpoints()) {
scheduler.Record(
[&pipeline](vk::CommandBuffer cmdbuf) { cmdbuf.SetCheckpointNV(nullptr); });
}
scheduler.Record([grid_x = launch_desc.grid_dim_x, grid_y = launch_desc.grid_dim_y,
grid_z = launch_desc.grid_dim_z, pipeline_handle = pipeline.GetHandle(),
layout = pipeline.GetLayout(),
@@ -777,7 +786,7 @@ void RasterizerVulkan::BeginTransformFeedback() {
const std::size_t size = binding.buffer_size;
const auto [buffer, offset] = buffer_cache.UploadMemory(gpu_addr, size, 4, true);
scheduler.Record([buffer = *buffer, offset = offset, size](vk::CommandBuffer cmdbuf) {
scheduler.Record([buffer = buffer, offset = offset, size](vk::CommandBuffer cmdbuf) {
cmdbuf.BindTransformFeedbackBuffersEXT(0, 1, &buffer, &offset, &size);
cmdbuf.BeginTransformFeedbackEXT(0, 0, nullptr, nullptr);
});
@@ -857,7 +866,7 @@ void RasterizerVulkan::SetupIndexBuffer(BufferBindings& buffer_bindings, DrawPar
auto format = regs.index_array.format;
const bool is_uint8 = format == Maxwell::IndexFormat::UnsignedByte;
if (is_uint8 && !device.IsExtIndexTypeUint8Supported()) {
std::tie(buffer, offset) = uint8_pass.Assemble(params.num_vertices, *buffer, offset);
std::tie(buffer, offset) = uint8_pass.Assemble(params.num_vertices, buffer, offset);
format = Maxwell::IndexFormat::UnsignedShort;
}
@@ -994,8 +1003,8 @@ void RasterizerVulkan::SetupGlobalBuffer(const GlobalBufferEntry& entry, GPUVAdd
const auto size = memory_manager.Read<u32>(address + 8);
if (size == 0) {
// Sometimes global memory pointers don't have a proper size. Upload a dummy entry because
// Vulkan doesn't like empty buffers.
// Sometimes global memory pointers don't have a proper size. Upload a dummy entry
// because Vulkan doesn't like empty buffers.
constexpr std::size_t dummy_size = 4;
const auto buffer = buffer_cache.GetEmptyBuffer(dummy_size);
update_descriptor_queue.AddBuffer(buffer, 0, dummy_size);


@@ -166,15 +166,7 @@ void VKScheduler::SubmitExecution(VkSemaphore semaphore) {
submit_info.pCommandBuffers = current_cmdbuf.address();
submit_info.signalSemaphoreCount = semaphore ? 1 : 0;
submit_info.pSignalSemaphores = &semaphore;
switch (const VkResult result = device.GetGraphicsQueue().Submit(submit_info, *current_fence)) {
case VK_SUCCESS:
break;
case VK_ERROR_DEVICE_LOST:
device.ReportLoss();
[[fallthrough]];
default:
vk::Check(result);
}
device.GetGraphicsQueue().Submit(submit_info, *current_fence);
}
void VKScheduler::AllocateNewContext() {


@@ -1938,11 +1938,8 @@ private:
return {};
}
template <Id (Module::*func)(Id, Id, Id, Id, Id), Type result_type,
Type value_type = result_type>
template <Id (Module::*func)(Id, Id, Id, Id, Id)>
Expression Atomic(Operation operation) {
const Id type_def = GetTypeDefinition(result_type);
Id pointer;
if (const auto smem = std::get_if<SmemNode>(&*operation[0])) {
pointer = GetSharedMemoryPointer(*smem);
@@ -1950,15 +1947,19 @@ private:
pointer = GetGlobalMemoryPointer(*gmem);
} else {
UNREACHABLE();
return {Constant(type_def, 0), result_type};
return {v_float_zero, Type::Float};
}
const Id value = As(Visit(operation[1]), value_type);
const Id scope = Constant(t_uint, static_cast<u32>(spv::Scope::Device));
const Id semantics = Constant(type_def, 0);
const Id semantics = Constant(t_uint, 0);
const Id value = AsUint(Visit(operation[1]));
return {(this->*func)(type_def, pointer, scope, semantics, value), result_type};
return {(this->*func)(t_uint, pointer, scope, semantics, value), Type::Uint};
}
template <Id (Module::*func)(Id, Id, Id, Id, Id)>
Expression Reduce(Operation operation) {
Atomic<func>(operation);
return {};
}
Expression Branch(Operation operation) {
@@ -2547,21 +2548,35 @@ private:
&SPIRVDecompiler::AtomicImageXor,
&SPIRVDecompiler::AtomicImageExchange,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMin, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMax, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor, Type::Uint>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMin>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicUMax>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMin, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMax, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor, Type::Int>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicExchange>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMin>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicSMax>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicOr>,
&SPIRVDecompiler::Atomic<&Module::OpAtomicXor>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicUMin>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicUMax>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicOr>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicXor>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicIAdd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicSMin>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicSMax>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicAnd>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicOr>,
&SPIRVDecompiler::Reduce<&Module::OpAtomicXor>,
&SPIRVDecompiler::Branch,
&SPIRVDecompiler::BranchIndirect,


@@ -35,12 +35,13 @@ void VKUpdateDescriptorQueue::Send(VkDescriptorUpdateTemplateKHR update_template
payload.clear();
}
// TODO(Rodrigo): Rework to write the payload directly
const auto payload_start = payload.data() + payload.size();
for (const auto& entry : entries) {
if (const auto image = std::get_if<VkDescriptorImageInfo>(&entry)) {
payload.push_back(*image);
} else if (const auto buffer = std::get_if<Buffer>(&entry)) {
payload.emplace_back(*buffer->buffer, buffer->offset, buffer->size);
} else if (const auto buffer = std::get_if<VkDescriptorBufferInfo>(&entry)) {
payload.push_back(*buffer);
} else if (const auto texel = std::get_if<VkBufferView>(&entry)) {
payload.push_back(*texel);
} else {


@@ -18,12 +18,11 @@ class VKScheduler;
class DescriptorUpdateEntry {
public:
explicit DescriptorUpdateEntry() : image{} {}
explicit DescriptorUpdateEntry() {}
DescriptorUpdateEntry(VkDescriptorImageInfo image) : image{image} {}
DescriptorUpdateEntry(VkBuffer buffer, VkDeviceSize offset, VkDeviceSize size)
: buffer{buffer, offset, size} {}
DescriptorUpdateEntry(VkDescriptorBufferInfo buffer) : buffer{buffer} {}
DescriptorUpdateEntry(VkBufferView texel_buffer) : texel_buffer{texel_buffer} {}
@@ -54,8 +53,8 @@ public:
entries.emplace_back(VkDescriptorImageInfo{{}, image_view, {}});
}
void AddBuffer(const VkBuffer* buffer, u64 offset, std::size_t size) {
entries.push_back(Buffer{buffer, offset, size});
void AddBuffer(VkBuffer buffer, u64 offset, std::size_t size) {
entries.emplace_back(VkDescriptorBufferInfo{buffer, offset, size});
}
void AddTexelBuffer(VkBufferView texel_buffer) {
@@ -67,12 +66,7 @@ public:
}
private:
struct Buffer {
const VkBuffer* buffer = nullptr;
u64 offset = 0;
std::size_t size = 0;
};
using Variant = std::variant<VkDescriptorImageInfo, Buffer, VkBufferView>;
using Variant = std::variant<VkDescriptorImageInfo, VkDescriptorBufferInfo, VkBufferView>;
const VKDevice& device;
VKScheduler& scheduler;
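
Note on the descriptor queue change: the variant now stores the buffer descriptor by value, and `Send` dispatches on the active alternative with `std::get_if`. A self-contained sketch of that dispatch, using simplified types: `ImageInfo`, `BufferInfo`, `TexelView`, `UpdateQueue`, and `Flatten` are hypothetical and only mimic the shape of the real Vulkan structs and the queue class.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>
#include <vector>

// Simplified, hypothetical stand-ins for the Vulkan descriptor structs.
struct ImageInfo {
    std::uint64_t image_view;
};
struct BufferInfo {
    std::uint64_t buffer;
    std::uint64_t offset;
    std::uint64_t size;
};
struct TexelView {
    std::uint64_t view;
};

class UpdateQueue {
    using Variant = std::variant<ImageInfo, BufferInfo, TexelView>;

public:
    void AddImage(std::uint64_t image_view) {
        entries.emplace_back(ImageInfo{image_view});
    }

    // The handle is stored by value, so nothing has to be dereferenced later.
    void AddBuffer(std::uint64_t buffer, std::uint64_t offset, std::uint64_t size) {
        entries.emplace_back(BufferInfo{buffer, offset, size});
    }

    void AddTexelBuffer(std::uint64_t view) {
        entries.emplace_back(TexelView{view});
    }

    // Walks the entries in insertion order and collects the primary handle of each.
    std::vector<std::uint64_t> Flatten() const {
        std::vector<std::uint64_t> payload;
        for (const Variant& entry : entries) {
            if (const auto image = std::get_if<ImageInfo>(&entry)) {
                payload.push_back(image->image_view);
            } else if (const auto buffer = std::get_if<BufferInfo>(&entry)) {
                payload.push_back(buffer->buffer);
            } else if (const auto texel = std::get_if<TexelView>(&entry)) {
                payload.push_back(texel->view);
            } else {
                assert(false && "unhandled variant alternative");
            }
        }
        return payload;
    }

private:
    std::vector<Variant> entries;
};
```

Adding an image, a buffer, and a texel buffer in that order should flatten to their three handles in the same order.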


@@ -61,6 +61,7 @@ void Load(VkDevice device, DeviceDispatch& dld) noexcept {
X(vkCmdPipelineBarrier);
X(vkCmdPushConstants);
X(vkCmdSetBlendConstants);
X(vkCmdSetCheckpointNV);
X(vkCmdSetDepthBias);
X(vkCmdSetDepthBounds);
X(vkCmdSetScissor);
@@ -115,6 +116,7 @@ void Load(VkDevice device, DeviceDispatch& dld) noexcept {
X(vkGetFenceStatus);
X(vkGetImageMemoryRequirements);
X(vkGetQueryPoolResults);
X(vkGetQueueCheckpointDataNV);
X(vkMapMemory);
X(vkQueueSubmit);
X(vkResetFences);
@@ -407,6 +409,17 @@ DebugCallback Instance::TryCreateDebugCallback(
return DebugCallback(messenger, handle, *dld);
}
std::vector<VkCheckpointDataNV> Queue::GetCheckpointDataNV(const DeviceDispatch& dld) const {
if (!dld.vkGetQueueCheckpointDataNV) {
return {};
}
u32 num;
dld.vkGetQueueCheckpointDataNV(queue, &num, nullptr);
std::vector<VkCheckpointDataNV> checkpoints(num);
dld.vkGetQueueCheckpointDataNV(queue, &num, checkpoints.data());
return checkpoints;
}
void Buffer::BindMemory(VkDeviceMemory memory, VkDeviceSize offset) const {
Check(dld->vkBindBufferMemory(owner, handle, memory, offset));
}
@@ -456,11 +469,12 @@ std::vector<VkImage> SwapchainKHR::GetImages() const {
}
Device Device::Create(VkPhysicalDevice physical_device, Span<VkDeviceQueueCreateInfo> queues_ci,
Span<const char*> enabled_extensions, const void* next,
Span<const char*> enabled_extensions,
const VkPhysicalDeviceFeatures2& enabled_features,
DeviceDispatch& dld) noexcept {
VkDeviceCreateInfo ci;
ci.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
ci.pNext = next;
ci.pNext = &enabled_features;
ci.flags = 0;
ci.queueCreateInfoCount = queues_ci.size();
ci.pQueueCreateInfos = queues_ci.data();
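
Note on `Queue::GetCheckpointDataNV` earlier in this file: it follows the usual two-call enumeration idiom, asking for the count first and then filling a vector of that size. A standalone sketch of the idiom against a fake C function follows. `FakeGetCheckpoints` and its three canned values are assumptions for illustration and are unrelated to the real driver entry point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical C-style enumerator: with a null output pointer it reports how many
// items exist; otherwise it writes at most *count items and updates *count.
inline void FakeGetCheckpoints(std::uint32_t* count, int* checkpoints) {
    assert(count != nullptr);
    static constexpr int available[] = {10, 20, 30};
    constexpr std::uint32_t num_available = 3;
    if (checkpoints == nullptr) {
        *count = num_available;
        return;
    }
    const std::uint32_t written = std::min(*count, num_available);
    std::copy_n(available, written, checkpoints);
    *count = written;
}

inline std::vector<int> GetCheckpoints() {
    std::uint32_t num = 0;
    FakeGetCheckpoints(&num, nullptr);         // first call: query the count
    std::vector<int> checkpoints(num);
    FakeGetCheckpoints(&num, checkpoints.data()); // second call: fill the storage
    checkpoints.resize(num);                   // shrink if fewer items were written
    return checkpoints;
}
```

The sketch zero-initializes the counter and resizes after the second call; both are defensive choices of the example, not something the diff itself does.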


@@ -197,6 +197,7 @@ struct DeviceDispatch : public InstanceDispatch {
PFN_vkCmdPipelineBarrier vkCmdPipelineBarrier;
PFN_vkCmdPushConstants vkCmdPushConstants;
PFN_vkCmdSetBlendConstants vkCmdSetBlendConstants;
PFN_vkCmdSetCheckpointNV vkCmdSetCheckpointNV;
PFN_vkCmdSetDepthBias vkCmdSetDepthBias;
PFN_vkCmdSetDepthBounds vkCmdSetDepthBounds;
PFN_vkCmdSetScissor vkCmdSetScissor;
@@ -251,6 +252,7 @@ struct DeviceDispatch : public InstanceDispatch {
PFN_vkGetFenceStatus vkGetFenceStatus;
PFN_vkGetImageMemoryRequirements vkGetImageMemoryRequirements;
PFN_vkGetQueryPoolResults vkGetQueryPoolResults;
PFN_vkGetQueueCheckpointDataNV vkGetQueueCheckpointDataNV;
PFN_vkMapMemory vkMapMemory;
PFN_vkQueueSubmit vkQueueSubmit;
PFN_vkResetFences vkResetFences;
@@ -565,8 +567,12 @@ public:
/// Construct a queue handle.
constexpr Queue(VkQueue queue, const DeviceDispatch& dld) noexcept : queue{queue}, dld{&dld} {}
VkResult Submit(Span<VkSubmitInfo> submit_infos, VkFence fence) const noexcept {
return dld->vkQueueSubmit(queue, submit_infos.size(), submit_infos.data(), fence);
/// Returns the checkpoint data.
/// @note Returns an empty vector when the function pointer is not present.
std::vector<VkCheckpointDataNV> GetCheckpointDataNV(const DeviceDispatch& dld) const;
void Submit(Span<VkSubmitInfo> submit_infos, VkFence fence) const {
Check(dld->vkQueueSubmit(queue, submit_infos.size(), submit_infos.data(), fence));
}
VkResult Present(const VkPresentInfoKHR& present_info) const noexcept {
@@ -653,7 +659,8 @@ class Device : public Handle<VkDevice, NoOwner, DeviceDispatch> {
public:
static Device Create(VkPhysicalDevice physical_device, Span<VkDeviceQueueCreateInfo> queues_ci,
Span<const char*> enabled_extensions, const void* next,
Span<const char*> enabled_extensions,
const VkPhysicalDeviceFeatures2& enabled_features,
DeviceDispatch& dld) noexcept;
Queue GetQueue(u32 family_index) const noexcept;
@@ -727,11 +734,18 @@ public:
dld->vkResetQueryPoolEXT(handle, query_pool, first, count);
}
VkResult GetQueryResults(VkQueryPool query_pool, u32 first, u32 count, std::size_t data_size,
void* data, VkDeviceSize stride, VkQueryResultFlags flags) const
noexcept {
return dld->vkGetQueryPoolResults(handle, query_pool, first, count, data_size, data, stride,
flags);
void GetQueryResults(VkQueryPool query_pool, u32 first, u32 count, std::size_t data_size,
void* data, VkDeviceSize stride, VkQueryResultFlags flags) const {
Check(dld->vkGetQueryPoolResults(handle, query_pool, first, count, data_size, data, stride,
flags));
}
template <typename T>
T GetQueryResult(VkQueryPool query_pool, u32 first, VkQueryResultFlags flags) const {
static_assert(std::is_trivially_copyable_v<T>);
T value;
GetQueryResults(query_pool, first, 1, sizeof(T), &value, sizeof(T), flags);
return value;
}
};
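
`GetQueryResult<T>` above wraps the byte-oriented `GetQueryResults` so that a caller can read a single value of a known type. A rough sketch of the same idea outside Vulkan follows; `ReadBytes` and `ReadValue` are hypothetical names used for illustration, not functions from this diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical raw API: copies `size` bytes of a fixed 64-bit counter into `data`.
inline void ReadBytes(std::size_t size, void* data) {
    const std::uint64_t counter = 42;
    assert(size <= sizeof(counter));
    std::memcpy(data, &counter, size);
}

// Typed convenience wrapper: only trivially copyable types may be filled bytewise.
template <typename T>
T ReadValue() {
    static_assert(std::is_trivially_copyable_v<T>);
    T value;
    ReadBytes(sizeof(T), &value);
    return value;
}
```

The `static_assert` plays the same role as in the hunk: it rejects types for which a raw byte copy would not produce a valid object.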
@@ -906,6 +920,10 @@ public:
dld->vkCmdPushConstants(handle, layout, flags, offset, size, values);
}
void SetCheckpointNV(const void* checkpoint_marker) const noexcept {
dld->vkCmdSetCheckpointNV(handle, checkpoint_marker);
}
void SetViewport(u32 first, Span<VkViewport> viewports) const noexcept {
dld->vkCmdSetViewport(handle, first, viewports.size(), viewports.data());
}


@@ -484,17 +484,17 @@ bool TryInspectAddress(CFGRebuildState& state) {
}
case BlockCollision::Inside: {
// This case is the tricky one:
// We need to Split the block in 2 sepparate blocks
// We need to split the block into 2 separate blocks
const u32 end = state.block_info[block_index].end;
BlockInfo& new_block = CreateBlockInfo(state, address, end);
BlockInfo& current_block = state.block_info[block_index];
current_block.end = address - 1;
new_block.branch = current_block.branch;
new_block.branch = std::move(current_block.branch);
BlockBranchInfo forward_branch = MakeBranchInfo<SingleBranch>();
const auto branch = std::get_if<SingleBranch>(forward_branch.get());
branch->address = address;
branch->ignore = true;
current_block.branch = forward_branch;
current_block.branch = std::move(forward_branch);
return true;
}
default:


@@ -136,7 +136,8 @@ u32 ShaderIR::DecodeArithmetic(NodeBlock& bb, u32 pc) {
SetRegister(bb, instr.gpr0, value);
break;
}
case OpCode::Id::FCMP_R: {
case OpCode::Id::FCMP_RR:
case OpCode::Id::FCMP_RC: {
UNIMPLEMENTED_IF(instr.fcmp.ftz == 0);
Node op_c = GetRegister(instr.gpr39);
Node comp = GetPredicateComparisonFloat(instr.fcmp.cond, std::move(op_c), Immediate(0.0f));


@@ -352,8 +352,10 @@ u32 ShaderIR::DecodeImage(NodeBlock& bb, u32 pc) {
registry.ObtainBoundSampler(static_cast<u32>(instr.image.index.Value()));
} else {
const Node image_register = GetRegister(instr.gpr39);
const auto [base_image, buffer, offset] = TrackCbuf(
image_register, global_code, static_cast<s64>(global_code.size()));
const auto result = TrackCbuf(image_register, global_code,
static_cast<s64>(global_code.size()));
const auto buffer = std::get<1>(result);
const auto offset = std::get<2>(result);
descriptor = registry.ObtainBindlessSampler(buffer, offset);
}
if (!descriptor) {
@@ -497,9 +499,12 @@ Image& ShaderIR::GetImage(Tegra::Shader::Image image, Tegra::Shader::ImageType t
Image& ShaderIR::GetBindlessImage(Tegra::Shader::Register reg, Tegra::Shader::ImageType type) {
const Node image_register = GetRegister(reg);
const auto [base_image, buffer, offset] =
const auto result =
TrackCbuf(image_register, global_code, static_cast<s64>(global_code.size()));
const auto buffer = std::get<1>(result);
const auto offset = std::get<2>(result);
const auto it =
std::find_if(std::begin(used_images), std::end(used_images),
[buffer = buffer, offset = offset](const Image& entry) {


@@ -3,7 +3,9 @@
// Refer to the license.txt file included.
#include <algorithm>
#include <utility>
#include <vector>
#include <fmt/format.h>
#include "common/alignment.h"
@@ -16,6 +18,7 @@
namespace VideoCommon::Shader {
using std::move;
using Tegra::Shader::AtomicOp;
using Tegra::Shader::AtomicType;
using Tegra::Shader::Attribute;
@@ -27,29 +30,26 @@ using Tegra::Shader::StoreType;
namespace {
Node GetAtomOperation(AtomicOp op, bool is_signed, Node memory, Node data) {
const OperationCode operation_code = [op] {
switch (op) {
case AtomicOp::Add:
return OperationCode::AtomicIAdd;
case AtomicOp::Min:
return OperationCode::AtomicIMin;
case AtomicOp::Max:
return OperationCode::AtomicIMax;
case AtomicOp::And:
return OperationCode::AtomicIAnd;
case AtomicOp::Or:
return OperationCode::AtomicIOr;
case AtomicOp::Xor:
return OperationCode::AtomicIXor;
case AtomicOp::Exch:
return OperationCode::AtomicIExchange;
default:
UNIMPLEMENTED_MSG("op={}", static_cast<int>(op));
return OperationCode::AtomicIAdd;
}
}();
return SignedOperation(operation_code, is_signed, std::move(memory), std::move(data));
OperationCode GetAtomOperation(AtomicOp op) {
switch (op) {
case AtomicOp::Add:
return OperationCode::AtomicIAdd;
case AtomicOp::Min:
return OperationCode::AtomicIMin;
case AtomicOp::Max:
return OperationCode::AtomicIMax;
case AtomicOp::And:
return OperationCode::AtomicIAnd;
case AtomicOp::Or:
return OperationCode::AtomicIOr;
case AtomicOp::Xor:
return OperationCode::AtomicIXor;
case AtomicOp::Exch:
return OperationCode::AtomicIExchange;
default:
UNIMPLEMENTED_MSG("op={}", static_cast<int>(op));
return OperationCode::AtomicIAdd;
}
}
bool IsUnaligned(Tegra::Shader::UniformType uniform_type) {
@@ -90,23 +90,22 @@ u32 GetMemorySize(Tegra::Shader::UniformType uniform_type) {
Node ExtractUnaligned(Node value, Node address, u32 mask, u32 size) {
Node offset = Operation(OperationCode::UBitwiseAnd, address, Immediate(mask));
offset = Operation(OperationCode::ULogicalShiftLeft, std::move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldExtract, std::move(value), std::move(offset),
Immediate(size));
offset = Operation(OperationCode::ULogicalShiftLeft, move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldExtract, move(value), move(offset), Immediate(size));
}
Node InsertUnaligned(Node dest, Node value, Node address, u32 mask, u32 size) {
Node offset = Operation(OperationCode::UBitwiseAnd, std::move(address), Immediate(mask));
offset = Operation(OperationCode::ULogicalShiftLeft, std::move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldInsert, std::move(dest), std::move(value),
std::move(offset), Immediate(size));
Node offset = Operation(OperationCode::UBitwiseAnd, move(address), Immediate(mask));
offset = Operation(OperationCode::ULogicalShiftLeft, move(offset), Immediate(3));
return Operation(OperationCode::UBitfieldInsert, move(dest), move(value), move(offset),
Immediate(size));
}
Node Sign16Extend(Node value) {
Node sign = Operation(OperationCode::UBitwiseAnd, value, Immediate(1U << 15));
Node is_sign = Operation(OperationCode::LogicalUEqual, std::move(sign), Immediate(1U << 15));
Node is_sign = Operation(OperationCode::LogicalUEqual, move(sign), Immediate(1U << 15));
Node extend = Operation(OperationCode::Select, is_sign, Immediate(0xFFFF0000), Immediate(0));
return Operation(OperationCode::UBitwiseOr, std::move(value), std::move(extend));
return Operation(OperationCode::UBitwiseOr, move(value), move(extend));
}
} // Anonymous namespace
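
`Sign16Extend` above builds its result as a graph of IR nodes. The same arithmetic on a plain integer is sketched below, as an illustration only; the real function operates on `Node` values, not on `std::uint32_t`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the node graph of Sign16Extend on a plain integer:
// test bit 15, then OR in the upper 16 bits when it is set.
constexpr std::uint32_t Sign16Extend(std::uint32_t value) {
    const std::uint32_t sign = value & (1U << 15);
    const bool is_sign = sign == (1U << 15);
    const std::uint32_t extend = is_sign ? 0xFFFF0000U : 0U;
    return value | extend;
}

static_assert(Sign16Extend(0x8000U) == 0xFFFF8000U);
static_assert(Sign16Extend(0x7FFFU) == 0x00007FFFU);
```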
@@ -379,20 +378,36 @@ u32 ShaderIR::DecodeMemory(NodeBlock& bb, u32 pc) {
if (IsUnaligned(type)) {
const u32 mask = GetUnalignedMask(type);
value = InsertUnaligned(gmem, std::move(value), real_address, mask, size);
value = InsertUnaligned(gmem, move(value), real_address, mask, size);
}
bb.push_back(Operation(OperationCode::Assign, gmem, value));
}
break;
}
case OpCode::Id::RED: {
UNIMPLEMENTED_IF_MSG(instr.red.type != GlobalAtomicType::U32);
UNIMPLEMENTED_IF_MSG(instr.red.operation != AtomicOp::Add);
const auto [real_address, base_address, descriptor] =
TrackGlobalMemory(bb, instr, true, true);
if (!real_address || !base_address) {
// Tracking failed, skip atomic.
break;
}
Node gmem = MakeNode<GmemNode>(real_address, base_address, descriptor);
Node value = GetRegister(instr.gpr0);
bb.push_back(Operation(OperationCode::ReduceIAdd, move(gmem), move(value)));
break;
}
case OpCode::Id::ATOM: {
UNIMPLEMENTED_IF_MSG(instr.atom.operation == AtomicOp::Inc ||
instr.atom.operation == AtomicOp::Dec ||
instr.atom.operation == AtomicOp::SafeAdd,
"operation={}", static_cast<int>(instr.atom.operation.Value()));
UNIMPLEMENTED_IF_MSG(instr.atom.type == GlobalAtomicType::S64 ||
instr.atom.type == GlobalAtomicType::U64,
instr.atom.type == GlobalAtomicType::U64 ||
instr.atom.type == GlobalAtomicType::F16x2_FTZ_RN ||
instr.atom.type == GlobalAtomicType::F32_FTZ_RN,
"type={}", static_cast<int>(instr.atom.type.Value()));
const auto [real_address, base_address, descriptor] =
@@ -403,11 +418,11 @@ u32 ShaderIR::DecodeMemory(NodeBlock& bb, u32 pc) {
}
const bool is_signed =
instr.atoms.type == AtomicType::S32 || instr.atoms.type == AtomicType::S64;
instr.atom.type == GlobalAtomicType::S32 || instr.atom.type == GlobalAtomicType::S64;
Node gmem = MakeNode<GmemNode>(real_address, base_address, descriptor);
Node value = GetAtomOperation(static_cast<AtomicOp>(instr.atom.operation), is_signed, gmem,
GetRegister(instr.gpr20));
SetRegister(bb, instr.gpr0, std::move(value));
SetRegister(bb, instr.gpr0,
SignedOperation(GetAtomOperation(instr.atom.operation), is_signed, gmem,
GetRegister(instr.gpr20)));
break;
}
case OpCode::Id::ATOMS: {
@@ -421,11 +436,10 @@ u32 ShaderIR::DecodeMemory(NodeBlock& bb, u32 pc) {
instr.atoms.type == AtomicType::S32 || instr.atoms.type == AtomicType::S64;
const s32 offset = instr.atoms.GetImmediateOffset();
Node address = GetRegister(instr.gpr8);
address = Operation(OperationCode::IAdd, std::move(address), Immediate(offset));
Node value =
GetAtomOperation(static_cast<AtomicOp>(instr.atoms.operation), is_signed,
GetSharedMemory(std::move(address)), GetRegister(instr.gpr20));
SetRegister(bb, instr.gpr0, std::move(value));
address = Operation(OperationCode::IAdd, move(address), Immediate(offset));
SetRegister(bb, instr.gpr0,
SignedOperation(GetAtomOperation(instr.atoms.operation), is_signed,
GetSharedMemory(move(address)), GetRegister(instr.gpr20)));
break;
}
case OpCode::Id::AL2P: {


@@ -23,7 +23,6 @@ Node IsFull(Node shift) {
}
Node Shift(OperationCode opcode, Node value, Node shift) {
Node is_full = Operation(OperationCode::LogicalIEqual, shift, Immediate(32));
Node shifted = Operation(opcode, move(value), shift);
return Operation(OperationCode::Select, IsFull(move(shift)), Immediate(0), move(shifted));
}


@@ -178,6 +178,20 @@ enum class OperationCode {
AtomicIOr, /// (memory, int) -> int
AtomicIXor, /// (memory, int) -> int
ReduceUAdd, /// (memory, uint) -> void
ReduceUMin, /// (memory, uint) -> void
ReduceUMax, /// (memory, uint) -> void
ReduceUAnd, /// (memory, uint) -> void
ReduceUOr, /// (memory, uint) -> void
ReduceUXor, /// (memory, uint) -> void
ReduceIAdd, /// (memory, int) -> void
ReduceIMin, /// (memory, int) -> void
ReduceIMax, /// (memory, int) -> void
ReduceIAnd, /// (memory, int) -> void
ReduceIOr, /// (memory, int) -> void
ReduceIXor, /// (memory, int) -> void
Branch, /// (uint branch_target) -> void
BranchIndirect, /// (uint branch_target) -> void
PushFlowStack, /// (uint branch_target) -> void


@@ -56,8 +56,7 @@ Node ShaderIR::GetConstBuffer(u64 index_, u64 offset_) {
const auto index = static_cast<u32>(index_);
const auto offset = static_cast<u32>(offset_);
const auto [entry, is_new] = used_cbufs.try_emplace(index);
entry->second.MarkAsUsed(offset);
used_cbufs.try_emplace(index).first->second.MarkAsUsed(offset);
return MakeNode<CbufNode>(index, Immediate(offset));
}
@@ -66,8 +65,7 @@ Node ShaderIR::GetConstBufferIndirect(u64 index_, u64 offset_, Node node) {
const auto index = static_cast<u32>(index_);
const auto offset = static_cast<u32>(offset_);
const auto [entry, is_new] = used_cbufs.try_emplace(index);
entry->second.MarkAsUsedIndirect();
used_cbufs.try_emplace(index).first->second.MarkAsUsedIndirect();
Node final_offset = [&] {
// Attempt to inline constant buffer without a variable offset. This is done to allow
@@ -166,6 +164,7 @@ Node ShaderIR::ConvertIntegerSize(Node value, Register::Size size, bool is_signe
std::move(value), Immediate(16));
value = SignedOperation(OperationCode::IArithmeticShiftRight, is_signed, NO_PRECISE,
std::move(value), Immediate(16));
return value;
case Register::Size::Word:
// Default - do nothing
return value;


@@ -27,8 +27,9 @@ std::pair<Node, s64> FindOperation(const NodeBlock& code, s64 cursor,
if (const auto conditional = std::get_if<ConditionalNode>(&*node)) {
const auto& conditional_code = conditional->GetCode();
auto [found, internal_cursor] = FindOperation(
auto result = FindOperation(
conditional_code, static_cast<s64>(conditional_code.size() - 1), operation_code);
auto& found = result.first;
if (found) {
return {std::move(found), cursor};
}
@@ -186,8 +187,8 @@ std::tuple<Node, u32, u32> ShaderIR::TrackCbuf(Node tracked, const NodeBlock& co
std::optional<u32> ShaderIR::TrackImmediate(Node tracked, const NodeBlock& code, s64 cursor) const {
// Reduce the cursor in one to avoid infinite loops when the instruction sets the same register
// that it uses as operand
const auto [found, found_cursor] =
TrackRegister(&std::get<GprNode>(*tracked), code, cursor - 1);
const auto result = TrackRegister(&std::get<GprNode>(*tracked), code, cursor - 1);
const auto& found = result.first;
if (!found) {
return {};
}


@@ -248,8 +248,14 @@ void SurfaceBaseImpl::FlushBuffer(Tegra::MemoryManager& memory_manager,
// Use an extra temporal buffer
auto& tmp_buffer = staging_cache.GetBuffer(1);
// Special case for 3D Texture Segments
const bool must_read_current_data =
params.block_depth > 0 && params.target == VideoCore::Surface::SurfaceTarget::Texture2D;
tmp_buffer.resize(guest_memory_size);
host_ptr = tmp_buffer.data();
if (must_read_current_data) {
memory_manager.ReadBlockUnsafe(gpu_addr, host_ptr, guest_memory_size);
}
if (params.is_tiled) {
ASSERT_MSG(params.block_width == 0, "Block width is defined as {}", params.block_width);


@@ -72,9 +72,9 @@ public:
return (cpu_addr < end) && (cpu_addr_end > start);
}
bool IsInside(const GPUVAddr other_start, const GPUVAddr other_end) {
bool IsInside(const GPUVAddr other_start, const GPUVAddr other_end) const {
const GPUVAddr gpu_addr_end = gpu_addr + guest_memory_size;
return (gpu_addr <= other_start && other_end <= gpu_addr_end);
return gpu_addr <= other_start && other_end <= gpu_addr_end;
}
// Use only when recycling a surface


@@ -167,7 +167,6 @@ SurfaceParams SurfaceParams::CreateForImage(const FormatLookupTable& lookup_tabl
SurfaceParams SurfaceParams::CreateForDepthBuffer(Core::System& system) {
const auto& regs = system.GPU().Maxwell3D().regs;
regs.zeta_width, regs.zeta_height, regs.zeta.format, regs.zeta.memory_layout.type;
SurfaceParams params;
params.is_tiled = regs.zeta.memory_layout.type ==
Tegra::Engines::Maxwell3D::Regs::InvMemoryLayout::BlockLinear;


@@ -20,4 +20,8 @@ bool ViewParams::operator==(const ViewParams& rhs) const {
std::tie(rhs.base_layer, rhs.num_layers, rhs.base_level, rhs.num_levels, rhs.target);
}
bool ViewParams::operator!=(const ViewParams& rhs) const {
return !operator==(rhs);
}
} // namespace VideoCommon
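
The new `operator!=` is defined through `operator==`, so the two comparisons cannot drift apart. A reduced sketch of that pattern follows; `View` is a hypothetical two-field struct, not the real `ViewParams`.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical, reduced version of ViewParams with only two fields.
struct View {
    std::uint32_t base_layer;
    std::uint32_t num_layers;

    bool operator==(const View& rhs) const {
        return std::tie(base_layer, num_layers) == std::tie(rhs.base_layer, rhs.num_layers);
    }

    // Defined through operator== so the two can never disagree.
    bool operator!=(const View& rhs) const {
        return !operator==(rhs);
    }
};
```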


@@ -21,6 +21,7 @@ struct ViewParams {
std::size_t Hash() const;
bool operator==(const ViewParams& rhs) const;
bool operator!=(const ViewParams& rhs) const;
bool IsLayered() const {
switch (target) {


@@ -509,7 +509,9 @@ private:
}
const auto& final_params = new_surface->GetSurfaceParams();
if (cr_params.type != final_params.type) {
BufferCopy(current_surface, new_surface);
if (Settings::values.use_accurate_gpu_emulation) {
BufferCopy(current_surface, new_surface);
}
} else {
std::vector<CopyParams> bricks = current_surface->BreakDown(final_params);
for (auto& brick : bricks) {
@@ -612,10 +614,10 @@ private:
* textures within the GPU if possible. Falls back to LLE when it isn't possible to use any of
* the HLE methods.
*
* @param overlaps The overlapping surfaces registered in the cache.
* @param params The parameters on the new surface.
* @param gpu_addr The starting address of the new surface.
* @param cache_addr The starting address of the new surface on physical memory.
* @param overlaps The overlapping surfaces registered in the cache.
* @param params The parameters on the new surface.
* @param gpu_addr The starting address of the new surface.
* @param cpu_addr The starting address of the new surface on physical memory.
*/
std::optional<std::pair<TSurface, TView>> Manage3DSurfaces(std::vector<TSurface>& overlaps,
const SurfaceParams& params,
@@ -645,7 +647,8 @@ private:
break;
}
const u32 offset = static_cast<u32>(surface->GetCpuAddr() - cpu_addr);
const auto [x, y, z] = params.GetBlockOffsetXYZ(offset);
const auto offsets = params.GetBlockOffsetXYZ(offset);
const auto z = std::get<2>(offsets);
modified |= surface->IsModified();
const CopyParams copy_params(0, 0, 0, 0, 0, z, 0, 0, params.width, params.height,
1);


@@ -8,9 +8,4 @@ add_library(web_service STATIC
)
create_target_directory_groups(web_service)
get_directory_property(OPENSSL_LIBS
DIRECTORY ${PROJECT_SOURCE_DIR}/externals/libressl
DEFINITION OPENSSL_LIBS)
target_compile_definitions(web_service PRIVATE -DCPPHTTPLIB_OPENSSL_SUPPORT)
target_link_libraries(web_service PRIVATE common json-headers ${OPENSSL_LIBS} httplib lurlparser)
target_link_libraries(web_service PRIVATE common json-headers httplib lurlparser)


@@ -43,7 +43,7 @@ struct Client::Impl {
if (jwt.empty() && !allow_anonymous) {
LOG_ERROR(WebService, "Credentials must be provided for authenticated requests");
return Common::WebResult{Common::WebResult::Code::CredentialsMissing,
"Credentials needed"};
"Credentials needed", ""};
}
auto result = GenericRequest(method, path, data, accept, jwt);
@@ -81,12 +81,12 @@ struct Client::Impl {
cli = std::make_unique<httplib::SSLClient>(parsedUrl.m_Host.c_str(), port);
} else {
LOG_ERROR(WebService, "Bad URL scheme {}", parsedUrl.m_Scheme);
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Bad URL scheme"};
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Bad URL scheme", ""};
}
}
if (cli == nullptr) {
LOG_ERROR(WebService, "Invalid URL {}", host + path);
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Invalid URL"};
return Common::WebResult{Common::WebResult::Code::InvalidURL, "Invalid URL", ""};
}
cli->set_timeout_sec(TIMEOUT_SECONDS);
@@ -118,27 +118,27 @@ struct Client::Impl {
if (!cli->send(request, response)) {
LOG_ERROR(WebService, "{} to {} returned null", method, host + path);
return Common::WebResult{Common::WebResult::Code::LibError, "Null response"};
return Common::WebResult{Common::WebResult::Code::LibError, "Null response", ""};
}
if (response.status >= 400) {
LOG_ERROR(WebService, "{} to {} returned error status code: {}", method, host + path,
response.status);
return Common::WebResult{Common::WebResult::Code::HttpError,
std::to_string(response.status)};
std::to_string(response.status), ""};
}
auto content_type = response.headers.find("content-type");
if (content_type == response.headers.end()) {
LOG_ERROR(WebService, "{} to {} returned no content", method, host + path);
return Common::WebResult{Common::WebResult::Code::WrongContent, ""};
return Common::WebResult{Common::WebResult::Code::WrongContent, "", ""};
}
if (content_type->second.find(accept) == std::string::npos) {
LOG_ERROR(WebService, "{} to {} returned wrong content: {}", method, host + path,
content_type->second);
return Common::WebResult{Common::WebResult::Code::WrongContent, "Wrong content"};
return Common::WebResult{Common::WebResult::Code::WrongContent, "Wrong content", ""};
}
return Common::WebResult{Common::WebResult::Code::Success, "", response.body};
}
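
The `WebResult` hunks above add an explicit empty string for the last member in every brace-initialized return. A plausible reason is a missing-field-initializer warning under `-Wextra`, though the diff does not say so. A minimal sketch, assuming a three-member aggregate; the struct and its field names below are hypothetical and only approximate `Common::WebResult`:

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced stand-in for Common::WebResult (field names are assumed).
struct WebResult {
    enum class Code { Success, InvalidURL };
    Code result_code;
    std::string result_string;
    std::string returned_data;
};

// Every member is initialized explicitly, including the empty payload.
inline WebResult MakeInvalidUrlError() {
    return WebResult{WebResult::Code::InvalidURL, "Bad URL scheme", ""};
}
```

Omitting the last initializer would still value-initialize `returned_data` to an empty string; spelling it out only makes the intent visible.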


@@ -51,7 +51,8 @@ MicroProfileDialog::MicroProfileDialog(QWidget* parent) : QWidget(parent, Qt::Di
setWindowTitle(tr("MicroProfile"));
resize(1000, 600);
// Remove the "?" button from the titlebar and enable the maximize button
setWindowFlags(windowFlags() & ~Qt::WindowContextHelpButtonHint | Qt::WindowMaximizeButtonHint);
setWindowFlags((windowFlags() & ~Qt::WindowContextHelpButtonHint) |
Qt::WindowMaximizeButtonHint);
#if MICROPROFILE_ENABLED
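
The `setWindowFlags` change above only adds parentheses: in C++ `&` already binds tighter than `|`, so the value is the same and the grouping is simply made explicit. A small sketch with hypothetical flag constants in place of the `Qt::WindowFlags` bits:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values standing in for Qt::WindowFlags bits.
constexpr std::uint32_t kHelpButton = 1U << 0;
constexpr std::uint32_t kMaximizeButton = 1U << 1;
constexpr std::uint32_t kTitle = 1U << 2;

// Clears the help bit first, then sets the maximize bit.
constexpr std::uint32_t AdjustFlags(std::uint32_t flags) {
    return (flags & ~kHelpButton) | kMaximizeButton;
}

static_assert(AdjustFlags(kHelpButton | kTitle) == (kTitle | kMaximizeButton));
```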


@@ -91,7 +91,8 @@ std::pair<std::vector<u8>, std::string> GetGameListCachedObject(
return generator();
}
if (file1.write(reinterpret_cast<const char*>(icon.data()), icon.size()) != icon.size()) {
if (file1.write(reinterpret_cast<const char*>(icon.data()), icon.size()) !=
s64(icon.size())) {
LOG_ERROR(Frontend, "Failed to write data to cache file.");
return generator();
}


@@ -1019,9 +1019,9 @@ void GMainWindow::BootGame(const QString& filename) {
std::string title_name;
const auto res = Core::System::GetInstance().GetGameName(title_name);
if (res != Loader::ResultStatus::Success) {
const auto [nacp, icon_file] = FileSys::PatchManager(title_id).GetControlMetadata();
if (nacp != nullptr)
title_name = nacp->GetApplicationName();
const auto metadata = FileSys::PatchManager(title_id).GetControlMetadata();
if (metadata.first != nullptr)
title_name = metadata.first->GetApplicationName();
if (title_name.empty())
title_name = FileUtil::GetFilename(filename.toStdString());
@@ -1628,7 +1628,7 @@ void GMainWindow::OnMenuInstallToNAND() {
}
FileSys::InstallResult res;
if (index >= static_cast<size_t>(FileSys::TitleType::Application)) {
if (index >= static_cast<s32>(FileSys::TitleType::Application)) {
res = Core::System::GetInstance()
.GetFileSystemController()
.GetUserNANDContents()