Compare commits

..

37 Commits

Author SHA1 Message Date
bunnei
55c49d5bf4 gl_shader_gen: Set position.w to 1. 2018-06-15 20:47:04 -04:00
bunnei
17f3590d59 Merge pull request #560 from Subv/crash_widget
Qt: Removed the Registers widget.
2018-06-13 10:15:00 -04:00
Subv
7786f41cc0 Qt: Removed the Registers widget.
It was crashing and nobody actually uses this.
2018-06-12 20:33:32 -05:00
bunnei
019d7208c8 Merge pull request #556 from Subv/dma_engine
GPU: Partially implemented the Maxwell DMA engine.
2018-06-12 14:25:17 -04:00
bunnei
2015a1b180 Merge pull request #558 from Subv/iadd32i
GPU: Implemented the iadd32i shader instruction.
2018-06-12 14:19:25 -04:00
Subv
db0497b808 GPU: Implemented the iadd32i shader instruction. 2018-06-12 11:46:45 -05:00
Subv
987a170665 GPU: Partially implemented the Maxwell DMA engine.
Only tiled->linear and linear->tiled copies that aren't offsetted are supported for now. Queries are not supported. Swizzled copies are not supported.
2018-06-12 11:27:36 -05:00
bunnei
33dbf24b56 Merge pull request #557 from shinyquagsire23/libnx-hid-fix
hid: Update all layouts and only show handheld as connected, fixes libnx input for P1_AUTO
2018-06-12 09:07:38 -04:00
bunnei
2dc8b5c224 Merge pull request #552 from bunnei/sat-fmul
gl_shader_decompiler: Implement saturate for float instructions.
2018-06-11 23:19:37 -04:00
bunnei
5f3d6c85db gl_shader_decompiler: Implement saturate for float instructions. 2018-06-11 21:46:34 -04:00
shinyquagsire23
2f9c0e7c7e hid: Update all layouts and only show handheld as connected, fixes libnx input for P1_AUTO 2018-06-11 19:41:29 -06:00
bunnei
09b8a16414 Merge pull request #555 from Subv/gpu_sysregs
GPU: Convert the gl_InstanceId and gl_VertexID variables to floats when reading from them.
2018-06-10 20:55:27 -04:00
Subv
004b1b3830 GPU: Convert the gl_InstanceId and gl_VertexID variables to floats when reading from them.
This corrects the invalid position values in some games when doing attribute-less rendering.
2018-06-10 13:50:19 -05:00
bunnei
281fd881a0 Merge pull request #553 from Subv/iset
GPU: Implement the ISET family of shader instructions.
2018-06-10 10:50:38 -04:00
Subv
b366b885a1 GPU: Implement the iset family of shader instructions. 2018-06-09 16:19:13 -05:00
Subv
3cb753eeb1 GPU: Added decodings for the ISET family of instructions. 2018-06-09 15:56:50 -05:00
bunnei
d81aaa3ed3 Merge pull request #550 from Subv/ssy
GPU: Stub the SSY shader instruction.
2018-06-09 00:42:53 -04:00
bunnei
e2176dc7ce Merge pull request #551 from bunnei/shr
gl_shader_decompiler: Implement SHR instruction.
2018-06-09 00:42:44 -04:00
bunnei
174c22e5f6 Merge pull request #549 from bunnei/iadd
gl_shader_decompiler: Implement IADD instruction.
2018-06-09 00:34:03 -04:00
bunnei
5440b9c634 gl_shader_decompiler: Implement SHR instruction. 2018-06-09 00:01:17 -04:00
Subv
abec5f82e2 GPU: Stub the SSY shader instruction.
This instruction tells the GPU where the flow reconverges in a non-uniform control flow scenario, we can ignore this when generating GLSL code.
2018-06-08 22:46:10 -05:00
bunnei
bbc4f369ed gl_shader_decompiler: Implement IADD instruction. 2018-06-08 23:25:22 -04:00
bunnei
79e9c2e237 gl_shader_decompiler: Add missing asserts for saturate_a instructions. 2018-06-08 23:24:10 -04:00
bunnei
83517cb53a Merge pull request #505 from janisozaur/ccache-travis
Enable ccache usage on Travis
2018-06-08 18:51:59 -04:00
bunnei
9949e4d508 Merge pull request #533 from mailwl/array-to-buffer
Common/string_util: add StringFromBuffer() function
2018-06-08 18:51:00 -04:00
bunnei
c116b220e9 Merge pull request #548 from Subv/blend
GPU: Fixed ghosting when drawing with blending disabled
2018-06-08 18:48:12 -04:00
Subv
c011b6f67e GPU: Synchronize the blend state on every draw call.
Only independent blending on render target 0 is implemented for now.

This fixes the elongated squids in Splatoon 2's boot screen.
2018-06-08 17:05:52 -05:00
Subv
c712dafaee GPU: Added registers for normal and independent blending. 2018-06-08 17:04:41 -05:00
bunnei
a931cf9e8b Merge pull request #547 from Subv/compressed_alignment
GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
2018-06-08 16:40:49 -04:00
bunnei
a941a94148 Merge pull request #546 from Subv/flush_ubo_buffer
Rasterizer: Flush the written region when writing shader uniform data before copying it to the uniform buffers.
2018-06-08 16:39:55 -04:00
Subv
8d9534d830 GLCache: Align compressed texture sizes to their compression ratio, and then align that compressed size to the block height for tiled textures.
This fixes issues with retrieving non-block-aligned tiled compressed textures from the cache.
2018-06-08 12:27:19 -05:00
Subv
47dc5e0dab Rasterizer: Flush the written region when writing shader uniform data before copying it to the uniform buffers.
This fixes the flip_viewport uniform having invalid values when drawing.
2018-06-08 12:22:39 -05:00
Michał Janiszewski
f3885845fc Cache ccache on Travis 2018-06-07 21:43:33 +02:00
Michał Janiszewski
c0d3e2da4e Add ccache support for macOS on Travis 2018-06-07 21:43:33 +02:00
Michał Janiszewski
517112f549 Add ccache support for Linux on Travis 2018-06-07 21:43:32 +02:00
Michał Janiszewski
6324d86c71 Install cmake from repositories for Ubuntu
Ubuntu 18.04 already has cmake 3.10.2
2018-06-07 21:42:12 +02:00
mailwl
a2efb1dd48 Common/string_util: add StringFromBuffer function
convert input buffer (std::vector<u8>) to string, stripping zero chars
2018-06-07 09:59:47 +03:00
31 changed files with 518 additions and 425 deletions

View File

@@ -42,3 +42,7 @@ notifications:
webhooks:
urls:
- https://api.yuzu-emu.org/code/travis/notify
cache:
directories:
- $HOME/.ccache

View File

@@ -1,3 +1,3 @@
#!/bin/bash -ex
docker run -v $(pwd):/yuzu ubuntu:18.04 /bin/bash /yuzu/.travis/linux/docker.sh
docker run -e CCACHE_DIR=/ccache -v $HOME/.ccache:/ccache -v $(pwd):/yuzu ubuntu:18.04 /bin/bash /yuzu/.travis/linux/docker.sh

View File

@@ -1,16 +1,18 @@
#!/bin/bash -ex
apt-get update
apt-get install -y build-essential git libqt5opengl5-dev libsdl2-dev libssl-dev python qtbase5-dev wget ninja-build
# Get a recent version of CMake
wget https://cmake.org/files/v3.10/cmake-3.10.1-Linux-x86_64.sh
sh cmake-3.10.1-Linux-x86_64.sh --exclude-subdir --prefix=/ --skip-license
apt-get install --no-install-recommends -y build-essential git libqt5opengl5-dev libsdl2-dev libssl-dev python qtbase5-dev wget cmake ninja-build ccache
cd /yuzu
export PATH=/usr/lib/ccache:$PATH
ln -sf /usr/bin/ccache /usr/lib/ccache/cc
ln -sf /usr/bin/ccache /usr/lib/ccache/c++
mkdir build && cd build
ccache --show-stats > ccache_before
cmake .. -DYUZU_BUILD_UNICORN=ON -DCMAKE_BUILD_TYPE=Release -G Ninja
ninja
ccache --show-stats > ccache_after
diff -U100 ccache_before ccache_after || true
ctest -VV -C Release

View File

@@ -7,8 +7,12 @@ export Qt5_DIR=$(brew --prefix)/opt/qt5
export UNICORNDIR=$(pwd)/externals/unicorn
mkdir build && cd build
export PATH=/usr/local/opt/ccache/libexec:$PATH
ccache --show-stats > ccache_before
cmake --version
cmake .. -DYUZU_BUILD_UNICORN=ON -DCMAKE_BUILD_TYPE=Release
make -j4
ccache --show-stats > ccache_after
diff -U100 ccache_before ccache_after || true
ctest -VV -C Release

View File

@@ -1,5 +1,5 @@
#!/bin/sh -ex
brew update
brew install dylibbundler p7zip qt5 sdl2
brew install dylibbundler p7zip qt5 sdl2 ccache
brew outdated cmake || brew upgrade cmake

View File

@@ -64,6 +64,10 @@ std::string ArrayToString(const u8* data, size_t size, int line_len, bool spaces
return oss.str();
}
std::string StringFromBuffer(const std::vector<u8>& data) {
return std::string(data.begin(), std::find(data.begin(), data.end(), '\0'));
}
// Turns " hej " into "hej". Also handles tabs.
std::string StripSpaces(const std::string& str) {
const size_t s = str.find_first_not_of(" \t\r\n");

View File

@@ -21,6 +21,8 @@ std::string ToUpper(std::string str);
std::string ArrayToString(const u8* data, size_t size, int line_len = 20, bool spaces = true);
std::string StringFromBuffer(const std::vector<u8>& data);
std::string StripSpaces(const std::string& s);
std::string StripQuotes(const std::string& s);

View File

@@ -44,14 +44,14 @@ Cpu& System::CurrentCpuCore() {
}
// Otherwise, use single-threaded mode active_core variable
return *cpu_cores[active_core = 4];
return *cpu_cores[active_core];
}
System::ResultStatus System::RunLoop(bool tight_loop) {
status = ResultStatus::Success;
// Update thread_to_cpu in case Core 0 is run from a different host thread
thread_to_cpu[std::this_thread::get_id()] = cpu_cores[4];
thread_to_cpu[std::this_thread::get_id()] = cpu_cores[0];
if (GDBStub::IsServerEnabled()) {
GDBStub::HandlePacket();
@@ -68,7 +68,7 @@ System::ResultStatus System::RunLoop(bool tight_loop) {
}
}
for (active_core = 4; active_core < NUM_CPU_CORES; ++active_core) {
for (active_core = 0; active_core < NUM_CPU_CORES; ++active_core) {
cpu_cores[active_core]->RunLoop(tight_loop);
if (Settings::values.use_multi_core) {
// Cores 1-3 are run on other threads in this mode

View File

@@ -4,6 +4,7 @@
#include <cinttypes>
#include "common/logging/log.h"
#include "common/string_util.h"
#include "core/core.h"
#include "core/file_sys/directory.h"
#include "core/file_sys/filesystem.h"
@@ -258,9 +259,7 @@ public:
IPC::RequestParser rp{ctx};
auto file_buffer = ctx.ReadBuffer();
auto end = std::find(file_buffer.begin(), file_buffer.end(), '\0');
std::string name(file_buffer.begin(), end);
std::string name = Common::StringFromBuffer(file_buffer);
u64 mode = rp.Pop<u64>();
u32 size = rp.Pop<u32>();
@@ -275,9 +274,7 @@ public:
IPC::RequestParser rp{ctx};
auto file_buffer = ctx.ReadBuffer();
auto end = std::find(file_buffer.begin(), file_buffer.end(), '\0');
std::string name(file_buffer.begin(), end);
std::string name = Common::StringFromBuffer(file_buffer);
NGLOG_DEBUG(Service_FS, "called file {}", name);
@@ -289,9 +286,7 @@ public:
IPC::RequestParser rp{ctx};
auto file_buffer = ctx.ReadBuffer();
auto end = std::find(file_buffer.begin(), file_buffer.end(), '\0');
std::string name(file_buffer.begin(), end);
std::string name = Common::StringFromBuffer(file_buffer);
NGLOG_DEBUG(Service_FS, "called directory {}", name);
@@ -305,13 +300,11 @@ public:
std::vector<u8> buffer;
buffer.resize(ctx.BufferDescriptorX()[0].Size());
Memory::ReadBlock(ctx.BufferDescriptorX()[0].Address(), buffer.data(), buffer.size());
auto end = std::find(buffer.begin(), buffer.end(), '\0');
std::string src_name(buffer.begin(), end);
std::string src_name = Common::StringFromBuffer(buffer);
buffer.resize(ctx.BufferDescriptorX()[1].Size());
Memory::ReadBlock(ctx.BufferDescriptorX()[1].Address(), buffer.data(), buffer.size());
end = std::find(buffer.begin(), buffer.end(), '\0');
std::string dst_name(buffer.begin(), end);
std::string dst_name = Common::StringFromBuffer(buffer);
NGLOG_DEBUG(Service_FS, "called file '{}' to file '{}'", src_name, dst_name);
@@ -323,9 +316,7 @@ public:
IPC::RequestParser rp{ctx};
auto file_buffer = ctx.ReadBuffer();
auto end = std::find(file_buffer.begin(), file_buffer.end(), '\0');
std::string name(file_buffer.begin(), end);
std::string name = Common::StringFromBuffer(file_buffer);
auto mode = static_cast<FileSys::Mode>(rp.Pop<u32>());
@@ -349,9 +340,7 @@ public:
IPC::RequestParser rp{ctx};
auto file_buffer = ctx.ReadBuffer();
auto end = std::find(file_buffer.begin(), file_buffer.end(), '\0');
std::string name(file_buffer.begin(), end);
std::string name = Common::StringFromBuffer(file_buffer);
// TODO(Subv): Implement this filter.
u32 filter_flags = rp.Pop<u32>();
@@ -376,9 +365,7 @@ public:
IPC::RequestParser rp{ctx};
auto file_buffer = ctx.ReadBuffer();
auto end = std::find(file_buffer.begin(), file_buffer.end(), '\0');
std::string name(file_buffer.begin(), end);
std::string name = Common::StringFromBuffer(file_buffer);
NGLOG_DEBUG(Service_FS, "called file {}", name);

View File

@@ -94,7 +94,6 @@ private:
layout.header.latest_entry = (layout.header.latest_entry + 1) % HID_NUM_ENTRIES;
ControllerInputEntry& entry = layout.entries[layout.header.latest_entry];
entry.connection_state = ConnectionState_Connected | ConnectionState_Wired;
entry.timestamp++;
// TODO(shinyquagsire23): Is this always identical to timestamp?
entry.timestamp_2++;
@@ -103,6 +102,8 @@ private:
if (controller != Controller_Handheld)
continue;
entry.connection_state = ConnectionState_Connected | ConnectionState_Wired;
// TODO(shinyquagsire23): Set up some LUTs for each layout mapping in the future?
// For now everything is just the default handheld layout, but split Joy-Con will
// rotate the face buttons and directions for certain layouts.

View File

@@ -12,7 +12,7 @@ namespace Service::HID {
// Begin enums and output structs
constexpr u32 HID_NUM_ENTRIES = 17;
constexpr u32 HID_NUM_LAYOUTS = 2;
constexpr u32 HID_NUM_LAYOUTS = 7;
constexpr s32 HID_JOYSTICK_MAX = 0x8000;
constexpr s32 HID_JOYSTICK_MIN = -0x8000;

View File

@@ -9,6 +9,8 @@ add_library(video_core STATIC
engines/maxwell_3d.h
engines/maxwell_compute.cpp
engines/maxwell_compute.h
engines/maxwell_dma.cpp
engines/maxwell_dma.h
engines/shader_bytecode.h
gpu.cpp
gpu.h

View File

@@ -16,6 +16,7 @@
#include "video_core/engines/fermi_2d.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/engines/maxwell_compute.h"
#include "video_core/engines/maxwell_dma.h"
#include "video_core/gpu.h"
#include "video_core/renderer_base.h"
#include "video_core/video_core.h"
@@ -60,8 +61,11 @@ void GPU::WriteReg(u32 method, u32 subchannel, u32 value, u32 remaining_params)
case EngineID::MAXWELL_COMPUTE_B:
maxwell_compute->WriteReg(method, value);
break;
case EngineID::MAXWELL_DMA_COPY_A:
maxwell_dma->WriteReg(method, value);
break;
default:
UNIMPLEMENTED();
UNIMPLEMENTED_MSG("Unimplemented engine");
}
}

View File

@@ -47,6 +47,7 @@ void Fermi2D::HandleSurfaceCopy() {
if (regs.src.linear == regs.dst.linear) {
// If the input layout and the output layout are the same, just perform a raw copy.
ASSERT(regs.src.BlockHeight() == regs.dst.BlockHeight());
Memory::CopyBlock(dest_cpu, source_cpu,
src_bytes_per_pixel * regs.dst.width * regs.dst.height);
return;

View File

@@ -318,6 +318,7 @@ public:
Equation equation_a;
Factor factor_source_a;
Factor factor_dest_a;
INSERT_PADDING_WORDS(1);
};
union {
@@ -432,7 +433,27 @@ public:
};
} rt_control;
INSERT_PADDING_WORDS(0xCF);
INSERT_PADDING_WORDS(0x31);
u32 independent_blend_enable;
INSERT_PADDING_WORDS(0x15);
struct {
u32 separate_alpha;
Blend::Equation equation_rgb;
Blend::Factor factor_source_rgb;
Blend::Factor factor_dest_rgb;
Blend::Equation equation_a;
Blend::Factor factor_source_a;
INSERT_PADDING_WORDS(1);
Blend::Factor factor_dest_a;
u32 enable_common;
u32 enable[NumRenderTargets];
} blend;
INSERT_PADDING_WORDS(0x77);
struct {
u32 tsc_address_high;
@@ -557,9 +578,7 @@ public:
} vertex_array[NumVertexArrays];
Blend blend;
INSERT_PADDING_WORDS(0x39);
Blend independent_blend[NumRenderTargets];
struct {
u32 limit_high;
@@ -722,6 +741,8 @@ ASSERT_REG_POSITION(vertex_buffer, 0x35D);
ASSERT_REG_POSITION(zeta, 0x3F8);
ASSERT_REG_POSITION(vertex_attrib_format[0], 0x458);
ASSERT_REG_POSITION(rt_control, 0x487);
ASSERT_REG_POSITION(independent_blend_enable, 0x4B9);
ASSERT_REG_POSITION(blend, 0x4CF);
ASSERT_REG_POSITION(tsc, 0x557);
ASSERT_REG_POSITION(tic, 0x55D);
ASSERT_REG_POSITION(code_address, 0x582);
@@ -729,7 +750,7 @@ ASSERT_REG_POSITION(draw, 0x585);
ASSERT_REG_POSITION(index_array, 0x5F2);
ASSERT_REG_POSITION(query, 0x6C0);
ASSERT_REG_POSITION(vertex_array[0], 0x700);
ASSERT_REG_POSITION(blend, 0x780);
ASSERT_REG_POSITION(independent_blend, 0x780);
ASSERT_REG_POSITION(vertex_array_limit[0], 0x7C0);
ASSERT_REG_POSITION(shader_config[0], 0x800);
ASSERT_REG_POSITION(const_buffer, 0x8E0);

View File

@@ -0,0 +1,69 @@
// Copyright 2018 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include "core/memory.h"
#include "video_core/engines/maxwell_dma.h"
#include "video_core/textures/decoders.h"
namespace Tegra {
namespace Engines {
MaxwellDMA::MaxwellDMA(MemoryManager& memory_manager) : memory_manager(memory_manager) {}
void MaxwellDMA::WriteReg(u32 method, u32 value) {
ASSERT_MSG(method < Regs::NUM_REGS,
"Invalid MaxwellDMA register, increase the size of the Regs structure");
regs.reg_array[method] = value;
#define MAXWELLDMA_REG_INDEX(field_name) \
(offsetof(Tegra::Engines::MaxwellDMA::Regs, field_name) / sizeof(u32))
switch (method) {
case MAXWELLDMA_REG_INDEX(exec): {
HandleCopy();
break;
}
}
#undef MAXWELLDMA_REG_INDEX
}
void MaxwellDMA::HandleCopy() {
NGLOG_WARNING(HW_GPU, "Requested a DMA copy");
const GPUVAddr source = regs.src_address.Address();
const GPUVAddr dest = regs.dst_address.Address();
const VAddr source_cpu = *memory_manager.GpuToCpuAddress(source);
const VAddr dest_cpu = *memory_manager.GpuToCpuAddress(dest);
// TODO(Subv): Perform more research and implement all features of this engine.
ASSERT(regs.exec.enable_swizzle == 0);
ASSERT(regs.exec.enable_2d == 1);
ASSERT(regs.exec.query_mode == Regs::QueryMode::None);
ASSERT(regs.exec.query_intr == Regs::QueryIntr::None);
ASSERT(regs.exec.copy_mode == Regs::CopyMode::Unk2);
ASSERT(regs.src_params.pos_x == 0);
ASSERT(regs.src_params.pos_y == 0);
ASSERT(regs.dst_params.pos_x == 0);
ASSERT(regs.dst_params.pos_y == 0);
ASSERT(regs.exec.is_dst_linear != regs.exec.is_src_linear);
u8* src_buffer = Memory::GetPointer(source_cpu);
u8* dst_buffer = Memory::GetPointer(dest_cpu);
if (regs.exec.is_dst_linear && !regs.exec.is_src_linear) {
// If the input is tiled and the output is linear, deswizzle the input and copy it over.
Texture::CopySwizzledData(regs.src_params.size_x, regs.src_params.size_y, 1, 1, src_buffer,
dst_buffer, true, regs.src_params.BlockHeight());
} else {
// If the input is linear and the output is tiled, swizzle the input and copy it over.
Texture::CopySwizzledData(regs.dst_params.size_x, regs.dst_params.size_y, 1, 1, dst_buffer,
src_buffer, false, regs.dst_params.BlockHeight());
}
}
} // namespace Engines
} // namespace Tegra

View File

@@ -0,0 +1,155 @@
// Copyright 2018 yuzu Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <array>
#include "common/assert.h"
#include "common/bit_field.h"
#include "common/common_funcs.h"
#include "common/common_types.h"
#include "video_core/gpu.h"
#include "video_core/memory_manager.h"
namespace Tegra {
namespace Engines {
class MaxwellDMA final {
public:
explicit MaxwellDMA(MemoryManager& memory_manager);
~MaxwellDMA() = default;
/// Write the value to the register identified by method.
void WriteReg(u32 method, u32 value);
struct Regs {
static constexpr size_t NUM_REGS = 0x1D6;
struct Parameters {
union {
BitField<0, 4, u32> block_depth;
BitField<4, 4, u32> block_height;
BitField<8, 4, u32> block_width;
};
u32 size_x;
u32 size_y;
u32 size_z;
u32 pos_z;
union {
BitField<0, 16, u32> pos_x;
BitField<16, 16, u32> pos_y;
};
u32 BlockHeight() const {
return 1 << block_height;
}
};
static_assert(sizeof(Parameters) == 24, "Parameters has wrong size");
enum class CopyMode : u32 {
None = 0,
Unk1 = 1,
Unk2 = 2,
};
enum class QueryMode : u32 {
None = 0,
Short = 1,
Long = 2,
};
enum class QueryIntr : u32 {
None = 0,
Block = 1,
NonBlock = 2,
};
union {
struct {
INSERT_PADDING_WORDS(0xC0);
struct {
union {
BitField<0, 2, CopyMode> copy_mode;
BitField<2, 1, u32> flush;
BitField<3, 2, QueryMode> query_mode;
BitField<5, 2, QueryIntr> query_intr;
BitField<7, 1, u32> is_src_linear;
BitField<8, 1, u32> is_dst_linear;
BitField<9, 1, u32> enable_2d;
BitField<10, 1, u32> enable_swizzle;
};
} exec;
INSERT_PADDING_WORDS(0x3F);
struct {
u32 address_high;
u32 address_low;
GPUVAddr Address() const {
return static_cast<GPUVAddr>((static_cast<GPUVAddr>(address_high) << 32) |
address_low);
}
} src_address;
struct {
u32 address_high;
u32 address_low;
GPUVAddr Address() const {
return static_cast<GPUVAddr>((static_cast<GPUVAddr>(address_high) << 32) |
address_low);
}
} dst_address;
u32 src_pitch;
u32 dst_pitch;
u32 x_count;
u32 y_count;
INSERT_PADDING_WORDS(0xBB);
Parameters dst_params;
INSERT_PADDING_WORDS(1);
Parameters src_params;
INSERT_PADDING_WORDS(0x13);
};
std::array<u32, NUM_REGS> reg_array;
};
} regs{};
MemoryManager& memory_manager;
private:
/// Performs the copy from the source buffer to the destination buffer as configured in the
/// registers.
void HandleCopy();
};
#define ASSERT_REG_POSITION(field_name, position) \
static_assert(offsetof(MaxwellDMA::Regs, field_name) == position * 4, \
"Field " #field_name " has invalid position")
ASSERT_REG_POSITION(exec, 0xC0);
ASSERT_REG_POSITION(src_address, 0x100);
ASSERT_REG_POSITION(dst_address, 0x102);
ASSERT_REG_POSITION(src_pitch, 0x104);
ASSERT_REG_POSITION(dst_pitch, 0x105);
ASSERT_REG_POSITION(x_count, 0x106);
ASSERT_REG_POSITION(y_count, 0x107);
ASSERT_REG_POSITION(dst_params, 0x1C3);
ASSERT_REG_POSITION(src_params, 0x1CA);
#undef ASSERT_REG_POSITION
} // namespace Engines
} // namespace Tegra

View File

@@ -216,12 +216,12 @@ union Instruction {
union {
BitField<20, 19, u64> imm20_19;
BitField<20, 32, u64> imm20_32;
BitField<20, 32, s64> imm20_32;
BitField<45, 1, u64> negate_b;
BitField<46, 1, u64> abs_a;
BitField<48, 1, u64> negate_a;
BitField<49, 1, u64> abs_b;
BitField<50, 1, u64> abs_d;
BitField<50, 1, u64> saturate_d;
BitField<56, 1, u64> negate_imm;
union {
@@ -246,7 +246,7 @@ union Instruction {
float GetImm20_32() const {
float result{};
u32 imm{static_cast<u32>(imm20_32)};
s32 imm{static_cast<s32>(imm20_32)};
std::memcpy(&result, &imm, sizeof(imm));
return result;
}
@@ -259,11 +259,20 @@ union Instruction {
}
} alu;
union {
BitField<48, 1, u64> is_signed;
} shift;
union {
BitField<39, 5, u64> shift_amount;
BitField<48, 1, u64> negate_b;
BitField<49, 1, u64> negate_a;
} iscadd;
} alu_integer;
union {
BitField<54, 1, u64> saturate;
BitField<56, 1, u64> negate_a;
} iadd32i;
union {
BitField<20, 8, u64> shift_position;
@@ -324,6 +333,15 @@ union Instruction {
BitField<56, 1, u64> neg_imm;
} fset;
union {
BitField<39, 3, u64> pred39;
BitField<42, 1, u64> neg_pred;
BitField<44, 1, u64> bf;
BitField<45, 2, PredOperation> op;
BitField<48, 1, u64> is_signed;
BitField<49, 3, PredCondition> cond;
} iset;
union {
BitField<10, 2, Register::Size> size;
BitField<12, 1, u64> is_output_signed;
@@ -331,7 +349,6 @@ union Instruction {
BitField<41, 2, u64> selector;
BitField<45, 1, u64> negate_a;
BitField<49, 1, u64> abs_a;
BitField<50, 1, u64> saturate_a;
union {
BitField<39, 2, F2iRoundingOp> rounding;
@@ -410,6 +427,7 @@ class OpCode {
public:
enum class Id {
KIL,
SSY,
BFE_C,
BFE_R,
BFE_IMM,
@@ -434,6 +452,10 @@ public:
FMUL_R,
FMUL_IMM,
FMUL32_IMM,
IADD_C,
IADD_R,
IADD_IMM,
IADD32I,
ISCADD_C, // Scale and Add
ISCADD_R,
ISCADD_IMM,
@@ -479,6 +501,9 @@ public:
ISETP_C,
ISETP_IMM,
ISETP_R,
ISET_R,
ISET_C,
ISET_IMM,
PSETP,
XMAD_IMM,
XMAD_CR,
@@ -489,15 +514,17 @@ public:
enum class Type {
Trivial,
Arithmetic,
ArithmeticInteger,
ArithmeticIntegerImmediate,
Bfe,
Logic,
Shift,
ScaledAdd,
Ffma,
Flow,
Memory,
FloatSet,
FloatSetPredicate,
IntegerSet,
IntegerSetPredicate,
PredicateSetPredicate,
Conversion,
@@ -596,6 +623,7 @@ private:
std::vector<Matcher> table = {
#define INST(bitstring, op, type, name) Detail::GetMatcher(bitstring, op, type, name)
INST("111000110011----", Id::KIL, Type::Flow, "KIL"),
INST("111000101001----", Id::SSY, Type::Flow, "SSY"),
INST("111000100100----", Id::BRA, Type::Flow, "BRA"),
INST("1110111111011---", Id::LD_A, Type::Memory, "LD_A"),
INST("1110111110010---", Id::LD_C, Type::Memory, "LD_C"),
@@ -617,9 +645,13 @@ private:
INST("0101110001101---", Id::FMUL_R, Type::Arithmetic, "FMUL_R"),
INST("0011100-01101---", Id::FMUL_IMM, Type::Arithmetic, "FMUL_IMM"),
INST("00011110--------", Id::FMUL32_IMM, Type::Arithmetic, "FMUL32_IMM"),
INST("0100110000011---", Id::ISCADD_C, Type::ScaledAdd, "ISCADD_C"),
INST("0101110000011---", Id::ISCADD_R, Type::ScaledAdd, "ISCADD_R"),
INST("0011100-00011---", Id::ISCADD_IMM, Type::ScaledAdd, "ISCADD_IMM"),
INST("0100110000010---", Id::IADD_C, Type::ArithmeticInteger, "IADD_C"),
INST("0101110000010---", Id::IADD_R, Type::ArithmeticInteger, "IADD_R"),
INST("0011100-00010---", Id::IADD_IMM, Type::ArithmeticInteger, "IADD_IMM"),
INST("0001110---------", Id::IADD32I, Type::ArithmeticIntegerImmediate, "IADD32I"),
INST("0100110000011---", Id::ISCADD_C, Type::ArithmeticInteger, "ISCADD_C"),
INST("0101110000011---", Id::ISCADD_R, Type::ArithmeticInteger, "ISCADD_R"),
INST("0011100-00011---", Id::ISCADD_IMM, Type::ArithmeticInteger, "ISCADD_IMM"),
INST("0101000010000---", Id::MUFU, Type::Arithmetic, "MUFU"),
INST("0100110010010---", Id::RRO_C, Type::Arithmetic, "RRO_C"),
INST("0101110010010---", Id::RRO_R, Type::Arithmetic, "RRO_R"),
@@ -665,6 +697,9 @@ private:
INST("010010110110----", Id::ISETP_C, Type::IntegerSetPredicate, "ISETP_C"),
INST("010110110110----", Id::ISETP_R, Type::IntegerSetPredicate, "ISETP_R"),
INST("0011011-0110----", Id::ISETP_IMM, Type::IntegerSetPredicate, "ISETP_IMM"),
INST("010110110101----", Id::ISET_R, Type::IntegerSet, "ISET_R"),
INST("010010110101----", Id::ISET_C, Type::IntegerSet, "ISET_C"),
INST("0011011-0101----", Id::ISET_IMM, Type::IntegerSet, "ISET_IMM"),
INST("0101000010010---", Id::PSETP, Type::PredicateSetPredicate, "PSETP"),
INST("0011011-00------", Id::XMAD_IMM, Type::Arithmetic, "XMAD_IMM"),
INST("0100111---------", Id::XMAD_CR, Type::Arithmetic, "XMAD_CR"),

View File

@@ -5,6 +5,7 @@
#include "video_core/engines/fermi_2d.h"
#include "video_core/engines/maxwell_3d.h"
#include "video_core/engines/maxwell_compute.h"
#include "video_core/engines/maxwell_dma.h"
#include "video_core/gpu.h"
namespace Tegra {
@@ -14,6 +15,7 @@ GPU::GPU() {
maxwell_3d = std::make_unique<Engines::Maxwell3D>(*memory_manager);
fermi_2d = std::make_unique<Engines::Fermi2D>(*memory_manager);
maxwell_compute = std::make_unique<Engines::MaxwellCompute>();
maxwell_dma = std::make_unique<Engines::MaxwellDMA>(*memory_manager);
}
GPU::~GPU() = default;

View File

@@ -63,6 +63,7 @@ namespace Engines {
class Fermi2D;
class Maxwell3D;
class MaxwellCompute;
class MaxwellDMA;
} // namespace Engines
enum class EngineID {
@@ -103,6 +104,8 @@ private:
std::unique_ptr<Engines::Fermi2D> fermi_2d;
/// Compute engine
std::unique_ptr<Engines::MaxwellCompute> maxwell_compute;
/// DMA engine
std::unique_ptr<Engines::MaxwellDMA> maxwell_dma;
};
} // namespace Tegra

View File

@@ -218,6 +218,9 @@ void RasterizerOpenGL::SetupShaders(u8* buffer_ptr, GLintptr buffer_offset) {
ubo.SetFromRegs(gpu.state.shader_stages[stage]);
std::memcpy(buffer_ptr, &ubo, sizeof(ubo));
// Flush the buffer so that the GPU can see the data we just wrote.
glFlushMappedBufferRange(GL_ARRAY_BUFFER, buffer_offset, sizeof(ubo));
// Upload uniform data as one UBO per stage
const GLintptr ubo_offset = buffer_offset;
copy_buffer(uniform_buffers[stage].handle, ubo_offset,
@@ -346,6 +349,9 @@ void RasterizerOpenGL::DrawArrays() {
// Sync the viewport
SyncViewport(surfaces_rect, res_scale);
// Sync the blend state registers
SyncBlendState();
// TODO(bunnei): Sync framebuffer_scale uniform here
// TODO(bunnei): Sync scissorbox uniform(s) here
@@ -452,32 +458,7 @@ void RasterizerOpenGL::DrawArrays() {
}
}
void RasterizerOpenGL::NotifyMaxwellRegisterChanged(u32 method) {
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
switch (method) {
case MAXWELL3D_REG_INDEX(blend.separate_alpha):
ASSERT_MSG(false, "unimplemented");
break;
case MAXWELL3D_REG_INDEX(blend.equation_rgb):
state.blend.rgb_equation = MaxwellToGL::BlendEquation(regs.blend.equation_rgb);
break;
case MAXWELL3D_REG_INDEX(blend.factor_source_rgb):
state.blend.src_rgb_func = MaxwellToGL::BlendFunc(regs.blend.factor_source_rgb);
break;
case MAXWELL3D_REG_INDEX(blend.factor_dest_rgb):
state.blend.dst_rgb_func = MaxwellToGL::BlendFunc(regs.blend.factor_dest_rgb);
break;
case MAXWELL3D_REG_INDEX(blend.equation_a):
state.blend.a_equation = MaxwellToGL::BlendEquation(regs.blend.equation_a);
break;
case MAXWELL3D_REG_INDEX(blend.factor_source_a):
state.blend.src_a_func = MaxwellToGL::BlendFunc(regs.blend.factor_source_a);
break;
case MAXWELL3D_REG_INDEX(blend.factor_dest_a):
state.blend.dst_a_func = MaxwellToGL::BlendFunc(regs.blend.factor_dest_a);
break;
}
}
void RasterizerOpenGL::NotifyMaxwellRegisterChanged(u32 method) {}
void RasterizerOpenGL::FlushAll() {
MICROPROFILE_SCOPE(OpenGL_CacheManagement);
@@ -757,14 +738,21 @@ void RasterizerOpenGL::SyncDepthOffset() {
UNREACHABLE();
}
void RasterizerOpenGL::SyncBlendEnabled() {
UNREACHABLE();
}
void RasterizerOpenGL::SyncBlendState() {
const auto& regs = Core::System().GetInstance().GPU().Maxwell3D().regs;
ASSERT_MSG(regs.independent_blend_enable == 1, "Only independent blending is implemented");
void RasterizerOpenGL::SyncBlendFuncs() {
UNREACHABLE();
}
// TODO(Subv): Support more than just render target 0.
state.blend.enabled = regs.blend.enable[0] != 0;
void RasterizerOpenGL::SyncBlendColor() {
UNREACHABLE();
if (!state.blend.enabled)
return;
ASSERT_MSG(!regs.independent_blend[0].separate_alpha, "Unimplemented");
state.blend.rgb_equation = MaxwellToGL::BlendEquation(regs.independent_blend[0].equation_rgb);
state.blend.src_rgb_func = MaxwellToGL::BlendFunc(regs.independent_blend[0].factor_source_rgb);
state.blend.dst_rgb_func = MaxwellToGL::BlendFunc(regs.independent_blend[0].factor_dest_rgb);
state.blend.a_equation = MaxwellToGL::BlendEquation(regs.independent_blend[0].equation_a);
state.blend.src_a_func = MaxwellToGL::BlendFunc(regs.independent_blend[0].factor_source_a);
state.blend.dst_a_func = MaxwellToGL::BlendFunc(regs.independent_blend[0].factor_dest_a);
}

View File

@@ -121,14 +121,8 @@ private:
/// Syncs the depth offset to match the guest state
void SyncDepthOffset();
/// Syncs the blend enabled status to match the guest state
void SyncBlendEnabled();
/// Syncs the blend functions to match the guest state
void SyncBlendFuncs();
/// Syncs the blend color to match the guest state
void SyncBlendColor();
/// Syncs the blend state to match the guest state
void SyncBlendState();
bool has_ARB_buffer_storage;
bool has_ARB_direct_state_access;

View File

@@ -1033,8 +1033,11 @@ Surface RasterizerCacheOpenGL::GetTextureSurface(const Tegra::Texture::FullTextu
params.addr = config.tic.Address();
params.is_tiled = config.tic.IsTiled();
params.pixel_format = SurfaceParams::PixelFormatFromTextureFormat(config.tic.format);
params.width = config.tic.Width() / params.GetCompresssionFactor();
params.height = config.tic.Height() / params.GetCompresssionFactor();
params.width = Common::AlignUp(config.tic.Width(), params.GetCompresssionFactor()) /
params.GetCompresssionFactor();
params.height = Common::AlignUp(config.tic.Height(), params.GetCompresssionFactor()) /
params.GetCompresssionFactor();
// TODO(Subv): Different types per component are not supported.
ASSERT(config.tic.r_type.Value() == config.tic.g_type.Value() &&
@@ -1045,6 +1048,8 @@ Surface RasterizerCacheOpenGL::GetTextureSurface(const Tegra::Texture::FullTextu
if (config.tic.IsTiled()) {
params.block_height = config.tic.BlockHeight();
params.width = Common::AlignUp(params.width, params.block_height);
params.height = Common::AlignUp(params.height, params.block_height);
} else {
// Use the texture-provided stride value if the texture isn't tiled.
params.stride = static_cast<u32>(params.PixelsInBytes(config.tic.Pitch()));

View File

@@ -299,13 +299,15 @@ public:
* @param value The code representing the value to assign.
* @param dest_num_components Number of components in the destination.
* @param value_num_components Number of components in the value.
* @param is_abs Optional, when True, applies absolute value to output.
* @param is_saturated Optional, when True, saturates the provided value.
* @param dest_elem Optional, the destination element to use for the operation.
*/
void SetRegisterToFloat(const Register& reg, u64 elem, const std::string& value,
u64 dest_num_components, u64 value_num_components, bool is_abs = false,
u64 dest_elem = 0) {
SetRegister(reg, elem, value, dest_num_components, value_num_components, is_abs, dest_elem);
u64 dest_num_components, u64 value_num_components,
bool is_saturated = false, u64 dest_elem = 0) {
SetRegister(reg, elem, is_saturated ? "clamp(" + value + ", 0.0, 1.0)" : value,
dest_num_components, value_num_components, dest_elem);
}
/**
@@ -315,18 +317,21 @@ public:
* @param value The code representing the value to assign.
* @param dest_num_components Number of components in the destination.
* @param value_num_components Number of components in the value.
* @param is_abs Optional, when True, applies absolute value to output.
* @param is_saturated Optional, when True, saturates the provided value.
* @param dest_elem Optional, the destination element to use for the operation.
*/
void SetRegisterToInteger(const Register& reg, bool is_signed, u64 elem,
const std::string& value, u64 dest_num_components,
u64 value_num_components, bool is_abs = false, u64 dest_elem = 0) {
u64 value_num_components, bool is_saturated = false,
u64 dest_elem = 0) {
ASSERT_MSG(!is_saturated, "Unimplemented");
const std::string func = GetGLSLConversionFunc(
is_signed ? GLSLRegister::Type::Integer : GLSLRegister::Type::UnsignedInteger,
GLSLRegister::Type::Float);
SetRegister(reg, elem, func + '(' + value + ')', dest_num_components, value_num_components,
is_abs, dest_elem);
dest_elem);
}
/**
@@ -500,12 +505,10 @@ private:
* @param value The code representing the value to assign.
* @param dest_num_components Number of components in the destination.
* @param value_num_components Number of components in the value.
* @param is_abs Optional, when True, applies absolute value to output.
* @param dest_elem Optional, the destination element to use for the operation.
*/
void SetRegister(const Register& reg, u64 elem, const std::string& value,
u64 dest_num_components, u64 value_num_components, bool is_abs,
u64 dest_elem) {
u64 dest_num_components, u64 value_num_components, u64 dest_elem) {
std::string dest = GetRegister(reg, dest_elem);
if (dest_num_components > 1) {
dest += GetSwizzle(elem);
@@ -516,8 +519,6 @@ private:
src += GetSwizzle(elem);
}
src = is_abs ? "abs(" + src + ')' : src;
shader.AddLine(dest + " = " + src + ';');
}
@@ -538,7 +539,7 @@ private:
// vertex shader, and what's the value of the fourth element when inside a Tess Eval
// shader.
ASSERT(stage == Maxwell3D::Regs::ShaderStage::Vertex);
return "vec4(0, 0, gl_InstanceID, gl_VertexID)";
return "vec4(0, 0, uintBitsToFloat(gl_InstanceID), uintBitsToFloat(gl_VertexID))";
default:
const u32 index{static_cast<u32>(attribute) -
static_cast<u32>(Attribute::Index::Attribute_0)};
@@ -808,7 +809,8 @@ private:
case OpCode::Id::FMUL_C:
case OpCode::Id::FMUL_R:
case OpCode::Id::FMUL_IMM: {
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " * " + op_b, 1, 1, instr.alu.abs_d);
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " * " + op_b, 1, 1,
instr.alu.saturate_d);
break;
}
case OpCode::Id::FMUL32_IMM: {
@@ -821,37 +823,39 @@ private:
case OpCode::Id::FADD_C:
case OpCode::Id::FADD_R:
case OpCode::Id::FADD_IMM: {
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " + " + op_b, 1, 1, instr.alu.abs_d);
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " + " + op_b, 1, 1,
instr.alu.saturate_d);
break;
}
case OpCode::Id::MUFU: {
switch (instr.sub_op) {
case SubOp::Cos:
regs.SetRegisterToFloat(instr.gpr0, 0, "cos(" + op_a + ')', 1, 1,
instr.alu.abs_d);
instr.alu.saturate_d);
break;
case SubOp::Sin:
regs.SetRegisterToFloat(instr.gpr0, 0, "sin(" + op_a + ')', 1, 1,
instr.alu.abs_d);
instr.alu.saturate_d);
break;
case SubOp::Ex2:
regs.SetRegisterToFloat(instr.gpr0, 0, "exp2(" + op_a + ')', 1, 1,
instr.alu.abs_d);
instr.alu.saturate_d);
break;
case SubOp::Lg2:
regs.SetRegisterToFloat(instr.gpr0, 0, "log2(" + op_a + ')', 1, 1,
instr.alu.abs_d);
instr.alu.saturate_d);
break;
case SubOp::Rcp:
regs.SetRegisterToFloat(instr.gpr0, 0, "1.0 / " + op_a, 1, 1, instr.alu.abs_d);
regs.SetRegisterToFloat(instr.gpr0, 0, "1.0 / " + op_a, 1, 1,
instr.alu.saturate_d);
break;
case SubOp::Rsq:
regs.SetRegisterToFloat(instr.gpr0, 0, "inversesqrt(" + op_a + ')', 1, 1,
instr.alu.abs_d);
instr.alu.saturate_d);
break;
case SubOp::Min:
regs.SetRegisterToFloat(instr.gpr0, 0, "min(" + op_a + "," + op_b + ')', 1, 1,
instr.alu.abs_d);
instr.alu.saturate_d);
break;
default:
NGLOG_CRITICAL(HW_GPU, "Unhandled MUFU sub op: {0:x}",
@@ -973,6 +977,19 @@ private:
}
switch (opcode->GetId()) {
case OpCode::Id::SHR_C:
case OpCode::Id::SHR_R:
case OpCode::Id::SHR_IMM: {
if (!instr.shift.is_signed) {
// Logical shift right
op_a = "uint(" + op_a + ')';
}
// Cast to int is superfluous for arithmetic shift, it's only for a logical shift
regs.SetRegisterToInteger(instr.gpr0, true, 0, "int(" + op_a + " >> " + op_b + ')',
1, 1);
break;
}
case OpCode::Id::SHL_C:
case OpCode::Id::SHL_R:
case OpCode::Id::SHL_IMM:
@@ -986,13 +1003,34 @@ private:
break;
}
case OpCode::Type::ScaledAdd: {
case OpCode::Type::ArithmeticIntegerImmediate: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8);
if (instr.iscadd.negate_a)
if (instr.iadd32i.negate_a)
op_a = '-' + op_a;
std::string op_b = instr.iscadd.negate_b ? "-" : "";
std::string op_b = '(' + std::to_string(instr.alu.imm20_32.Value()) + ')';
switch (opcode->GetId()) {
case OpCode::Id::IADD32I:
regs.SetRegisterToInteger(instr.gpr0, true, 0, op_a + " + " + op_b, 1, 1,
instr.iadd32i.saturate != 0);
break;
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled ArithmeticIntegerImmediate instruction: {}",
opcode->GetName());
UNREACHABLE();
}
}
break;
}
case OpCode::Type::ArithmeticInteger: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8);
if (instr.alu_integer.negate_a)
op_a = '-' + op_a;
std::string op_b = instr.alu_integer.negate_b ? "-" : "";
if (instr.is_b_imm) {
op_b += '(' + std::to_string(instr.alu.GetSignedImm20_20()) + ')';
@@ -1005,10 +1043,30 @@ private:
}
}
std::string shift = std::to_string(instr.iscadd.shift_amount.Value());
switch (opcode->GetId()) {
case OpCode::Id::IADD_C:
case OpCode::Id::IADD_R:
case OpCode::Id::IADD_IMM: {
regs.SetRegisterToInteger(instr.gpr0, true, 0, op_a + " + " + op_b, 1, 1,
instr.alu.saturate_d);
break;
}
case OpCode::Id::ISCADD_C:
case OpCode::Id::ISCADD_R:
case OpCode::Id::ISCADD_IMM: {
std::string shift = std::to_string(instr.alu_integer.shift_amount.Value());
regs.SetRegisterToInteger(instr.gpr0, true, 0,
"((" + op_a + " << " + shift + ") + " + op_b + ')', 1, 1);
break;
}
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled ArithmeticInteger instruction: {}",
opcode->GetName());
UNREACHABLE();
}
}
regs.SetRegisterToInteger(instr.gpr0, true, 0,
"((" + op_a + " << " + shift + ") + " + op_b + ')', 1, 1);
break;
}
case OpCode::Type::Ffma: {
@@ -1045,13 +1103,13 @@ private:
}
}
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " * " + op_b + " + " + op_c, 1, 1);
regs.SetRegisterToFloat(instr.gpr0, 0, op_a + " * " + op_b + " + " + op_c, 1, 1,
instr.alu.saturate_d);
break;
}
case OpCode::Type::Conversion: {
ASSERT_MSG(instr.conversion.size == Register::Size::Word, "Unimplemented");
ASSERT_MSG(!instr.conversion.negate_a, "Unimplemented");
ASSERT_MSG(!instr.conversion.saturate_a, "Unimplemented");
switch (opcode->GetId()) {
case OpCode::Id::I2I_R: {
@@ -1065,7 +1123,7 @@ private:
}
regs.SetRegisterToInteger(instr.gpr0, instr.conversion.is_output_signed, 0, op_a, 1,
1);
1, instr.alu.saturate_d);
break;
}
case OpCode::Id::I2F_R: {
@@ -1106,7 +1164,7 @@ private:
op_a = "abs(" + op_a + ')';
}
regs.SetRegisterToFloat(instr.gpr0, 0, op_a, 1, 1);
regs.SetRegisterToFloat(instr.gpr0, 0, op_a, 1, 1, instr.alu.saturate_d);
break;
}
case OpCode::Id::F2I_R: {
@@ -1198,8 +1256,8 @@ private:
const std::string op_b = regs.GetRegisterAsFloat(instr.gpr8.Value() + 1);
const std::string sampler = GetSampler(instr.sampler);
const std::string coord = "vec2 coords = vec2(" + op_a + ", " + op_b + ");";
// Add an extra scope and declare the texture coords inside to prevent overwriting
// them in case they are used as outputs of the texs instruction.
// Add an extra scope and declare the texture coords inside to prevent
// overwriting them in case they are used as outputs of the texs instruction.
shader.AddLine("{");
++shader.scope;
shader.AddLine(coord);
@@ -1230,8 +1288,8 @@ private:
shader.AddLine(coord);
const std::string texture = "texture(" + sampler + ", coords)";
// TEXS has two destination registers. RG goes into gpr0+0 and gpr0+1, and BA goes
// into gpr28+0 and gpr28+1
// TEXS has two destination registers. RG goes into gpr0+0 and gpr0+1, and BA
// goes into gpr28+0 and gpr28+1
size_t offset{};
for (const auto& dest : {instr.gpr0.Value(), instr.gpr28.Value()}) {
@@ -1380,8 +1438,8 @@ private:
op_b = "abs(" + op_b + ')';
}
// The fset instruction sets a register to 1.0 if the condition is true, and to 0
// otherwise.
// The fset instruction sets a register to 1.0 or -1 (depending on the bf bit) if the
// condition is true, and to 0 otherwise.
std::string second_pred =
GetPredicateCondition(instr.fset.pred39, instr.fset.neg_pred != 0);
@@ -1399,6 +1457,41 @@ private:
}
break;
}
case OpCode::Type::IntegerSet: {
std::string op_a = regs.GetRegisterAsInteger(instr.gpr8, 0, instr.iset.is_signed);
std::string op_b;
if (instr.is_b_imm) {
op_b = std::to_string(instr.alu.GetSignedImm20_20());
} else {
if (instr.is_b_gpr) {
op_b = regs.GetRegisterAsInteger(instr.gpr20, 0, instr.iset.is_signed);
} else {
op_b = regs.GetUniform(instr.cbuf34.index, instr.cbuf34.offset,
GLSLRegister::Type::Integer);
}
}
// The iset instruction sets a register to 1.0 or -1 (depending on the bf bit) if the
// condition is true, and to 0 otherwise.
std::string second_pred =
GetPredicateCondition(instr.iset.pred39, instr.iset.neg_pred != 0);
std::string comparator = GetPredicateComparison(instr.iset.cond);
std::string combiner = GetPredicateCombiner(instr.iset.op);
std::string predicate = "(((" + op_a + ") " + comparator + " (" + op_b + ")) " +
combiner + " (" + second_pred + "))";
if (instr.iset.bf) {
regs.SetRegisterToFloat(instr.gpr0, 0, predicate + " ? 1.0 : 0.0", 1, 1);
} else {
regs.SetRegisterToInteger(instr.gpr0, false, 0, predicate + " ? 0xFFFFFFFF : 0", 1,
1);
}
break;
}
default: {
switch (opcode->GetId()) {
case OpCode::Id::EXIT: {
@@ -1412,8 +1505,8 @@ private:
shader.AddLine("return true;");
if (instr.pred.pred_index == static_cast<u64>(Pred::UnusedIndex)) {
// If this is an unconditional exit then just end processing here, otherwise we
// have to account for the possibility of the condition not being met, so
// If this is an unconditional exit then just end processing here, otherwise
// we have to account for the possibility of the condition not being met, so
// continue processing the next instruction.
offset = PROGRAM_END - 1;
}
@@ -1435,6 +1528,11 @@ private:
regs.SetRegisterToInputAttibute(instr.gpr0, attribute.element, attribute.index);
break;
}
case OpCode::Id::SSY: {
// The SSY opcode tells the GPU where to re-converge divergent execution paths, we
// can ignore this when generating GLSL code.
break;
}
default: {
NGLOG_CRITICAL(HW_GPU, "Unhandled instruction: {}", opcode->GetName());
UNREACHABLE();

View File

@@ -39,6 +39,10 @@ void main() {
// Viewport can be flipped, which is unsupported by glViewport
position.xy *= viewport_flip.xy;
gl_Position = position;
// TODO(bunnei): This is likely a hack, position.w should be interpolated as 1.0
// For now, this is here to bring order in lieu of proper emulation
position.w = 1.0;
}
)";
out += program.first;

View File

@@ -32,8 +32,6 @@ add_executable(yuzu
debugger/graphics/graphics_surface.h
debugger/profiler.cpp
debugger/profiler.h
debugger/registers.cpp
debugger/registers.h
debugger/wait_tree.cpp
debugger/wait_tree.h
game_list.cpp
@@ -60,7 +58,6 @@ set(UIS
configuration/configure_graphics.ui
configuration/configure_input.ui
configuration/configure_system.ui
debugger/registers.ui
hotkeys.ui
main.ui
)

View File

@@ -1,190 +0,0 @@
// Copyright 2014 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#include <QTreeWidgetItem>
#include "core/arm/arm_interface.h"
#include "core/core.h"
#include "yuzu/debugger/registers.h"
#include "yuzu/util/util.h"
RegistersWidget::RegistersWidget(QWidget* parent) : QDockWidget(parent) {
cpu_regs_ui.setupUi(this);
tree = cpu_regs_ui.treeWidget;
tree->addTopLevelItem(core_registers = new QTreeWidgetItem(QStringList(tr("Registers"))));
tree->addTopLevelItem(vfp_registers = new QTreeWidgetItem(QStringList(tr("VFP Registers"))));
tree->addTopLevelItem(vfp_system_registers =
new QTreeWidgetItem(QStringList(tr("VFP System Registers"))));
tree->addTopLevelItem(cpsr = new QTreeWidgetItem(QStringList("CPSR")));
for (int i = 0; i < 16; ++i) {
QTreeWidgetItem* child = new QTreeWidgetItem(QStringList(QString("R[%1]").arg(i)));
core_registers->addChild(child);
}
for (int i = 0; i < 32; ++i) {
QTreeWidgetItem* child = new QTreeWidgetItem(QStringList(QString("S[%1]").arg(i)));
vfp_registers->addChild(child);
}
QFont font = GetMonospaceFont();
CreateCPSRChildren();
CreateVFPSystemRegisterChildren();
// Set Registers to display in monospace font
for (int i = 0; i < core_registers->childCount(); ++i)
core_registers->child(i)->setFont(1, font);
for (int i = 0; i < vfp_registers->childCount(); ++i)
vfp_registers->child(i)->setFont(1, font);
for (int i = 0; i < vfp_system_registers->childCount(); ++i) {
vfp_system_registers->child(i)->setFont(1, font);
for (int x = 0; x < vfp_system_registers->child(i)->childCount(); ++x) {
vfp_system_registers->child(i)->child(x)->setFont(1, font);
}
}
// Set CSPR to display in monospace font
cpsr->setFont(1, font);
for (int i = 0; i < cpsr->childCount(); ++i) {
cpsr->child(i)->setFont(1, font);
for (int x = 0; x < cpsr->child(i)->childCount(); ++x) {
cpsr->child(i)->child(x)->setFont(1, font);
}
}
setEnabled(false);
}
void RegistersWidget::OnDebugModeEntered() {
if (!Core::System::GetInstance().IsPoweredOn())
return;
for (int i = 0; i < core_registers->childCount(); ++i)
core_registers->child(i)->setText(
1, QString("0x%1").arg(Core::CurrentArmInterface().GetReg(i), 8, 16, QLatin1Char('0')));
UpdateCPSRValues();
}
void RegistersWidget::OnDebugModeLeft() {}
void RegistersWidget::OnEmulationStarting(EmuThread* emu_thread) {
setEnabled(true);
}
void RegistersWidget::OnEmulationStopping() {
// Reset widget text
for (int i = 0; i < core_registers->childCount(); ++i)
core_registers->child(i)->setText(1, QString(""));
for (int i = 0; i < vfp_registers->childCount(); ++i)
vfp_registers->child(i)->setText(1, QString(""));
for (int i = 0; i < cpsr->childCount(); ++i)
cpsr->child(i)->setText(1, QString(""));
cpsr->setText(1, QString(""));
// FPSCR
for (int i = 0; i < vfp_system_registers->child(0)->childCount(); ++i)
vfp_system_registers->child(0)->child(i)->setText(1, QString(""));
// FPEXC
for (int i = 0; i < vfp_system_registers->child(1)->childCount(); ++i)
vfp_system_registers->child(1)->child(i)->setText(1, QString(""));
vfp_system_registers->child(0)->setText(1, QString(""));
vfp_system_registers->child(1)->setText(1, QString(""));
vfp_system_registers->child(2)->setText(1, QString(""));
vfp_system_registers->child(3)->setText(1, QString(""));
setEnabled(false);
}
void RegistersWidget::CreateCPSRChildren() {
cpsr->addChild(new QTreeWidgetItem(QStringList("M")));
cpsr->addChild(new QTreeWidgetItem(QStringList("T")));
cpsr->addChild(new QTreeWidgetItem(QStringList("F")));
cpsr->addChild(new QTreeWidgetItem(QStringList("I")));
cpsr->addChild(new QTreeWidgetItem(QStringList("A")));
cpsr->addChild(new QTreeWidgetItem(QStringList("E")));
cpsr->addChild(new QTreeWidgetItem(QStringList("IT")));
cpsr->addChild(new QTreeWidgetItem(QStringList("GE")));
cpsr->addChild(new QTreeWidgetItem(QStringList("DNM")));
cpsr->addChild(new QTreeWidgetItem(QStringList("J")));
cpsr->addChild(new QTreeWidgetItem(QStringList("Q")));
cpsr->addChild(new QTreeWidgetItem(QStringList("V")));
cpsr->addChild(new QTreeWidgetItem(QStringList("C")));
cpsr->addChild(new QTreeWidgetItem(QStringList("Z")));
cpsr->addChild(new QTreeWidgetItem(QStringList("N")));
}
void RegistersWidget::UpdateCPSRValues() {
const u32 cpsr_val = Core::CurrentArmInterface().GetCPSR();
cpsr->setText(1, QString("0x%1").arg(cpsr_val, 8, 16, QLatin1Char('0')));
cpsr->child(0)->setText(
1, QString("b%1").arg(cpsr_val & 0x1F, 5, 2, QLatin1Char('0'))); // M - Mode
cpsr->child(1)->setText(1, QString::number((cpsr_val >> 5) & 1)); // T - State
cpsr->child(2)->setText(1, QString::number((cpsr_val >> 6) & 1)); // F - FIQ disable
cpsr->child(3)->setText(1, QString::number((cpsr_val >> 7) & 1)); // I - IRQ disable
cpsr->child(4)->setText(1, QString::number((cpsr_val >> 8) & 1)); // A - Imprecise abort
cpsr->child(5)->setText(1, QString::number((cpsr_val >> 9) & 1)); // E - Data endianness
cpsr->child(6)->setText(1,
QString::number((cpsr_val >> 10) & 0x3F)); // IT - If-Then state (DNM)
cpsr->child(7)->setText(1,
QString::number((cpsr_val >> 16) & 0xF)); // GE - Greater-than-or-Equal
cpsr->child(8)->setText(1, QString::number((cpsr_val >> 20) & 0xF)); // DNM - Do not modify
cpsr->child(9)->setText(1, QString::number((cpsr_val >> 24) & 1)); // J - Jazelle
cpsr->child(10)->setText(1, QString::number((cpsr_val >> 27) & 1)); // Q - Saturation
cpsr->child(11)->setText(1, QString::number((cpsr_val >> 28) & 1)); // V - Overflow
cpsr->child(12)->setText(1, QString::number((cpsr_val >> 29) & 1)); // C - Carry/Borrow/Extend
cpsr->child(13)->setText(1, QString::number((cpsr_val >> 30) & 1)); // Z - Zero
cpsr->child(14)->setText(1, QString::number((cpsr_val >> 31) & 1)); // N - Negative/Less than
}
void RegistersWidget::CreateVFPSystemRegisterChildren() {
QTreeWidgetItem* const fpscr = new QTreeWidgetItem(QStringList("FPSCR"));
fpscr->addChild(new QTreeWidgetItem(QStringList("IOC")));
fpscr->addChild(new QTreeWidgetItem(QStringList("DZC")));
fpscr->addChild(new QTreeWidgetItem(QStringList("OFC")));
fpscr->addChild(new QTreeWidgetItem(QStringList("UFC")));
fpscr->addChild(new QTreeWidgetItem(QStringList("IXC")));
fpscr->addChild(new QTreeWidgetItem(QStringList("IDC")));
fpscr->addChild(new QTreeWidgetItem(QStringList("IOE")));
fpscr->addChild(new QTreeWidgetItem(QStringList("DZE")));
fpscr->addChild(new QTreeWidgetItem(QStringList("OFE")));
fpscr->addChild(new QTreeWidgetItem(QStringList("UFE")));
fpscr->addChild(new QTreeWidgetItem(QStringList("IXE")));
fpscr->addChild(new QTreeWidgetItem(QStringList("IDE")));
fpscr->addChild(new QTreeWidgetItem(QStringList(tr("Vector Length"))));
fpscr->addChild(new QTreeWidgetItem(QStringList(tr("Vector Stride"))));
fpscr->addChild(new QTreeWidgetItem(QStringList(tr("Rounding Mode"))));
fpscr->addChild(new QTreeWidgetItem(QStringList("FZ")));
fpscr->addChild(new QTreeWidgetItem(QStringList("DN")));
fpscr->addChild(new QTreeWidgetItem(QStringList("V")));
fpscr->addChild(new QTreeWidgetItem(QStringList("C")));
fpscr->addChild(new QTreeWidgetItem(QStringList("Z")));
fpscr->addChild(new QTreeWidgetItem(QStringList("N")));
QTreeWidgetItem* const fpexc = new QTreeWidgetItem(QStringList("FPEXC"));
fpexc->addChild(new QTreeWidgetItem(QStringList("IOC")));
fpexc->addChild(new QTreeWidgetItem(QStringList("OFC")));
fpexc->addChild(new QTreeWidgetItem(QStringList("UFC")));
fpexc->addChild(new QTreeWidgetItem(QStringList("INV")));
fpexc->addChild(new QTreeWidgetItem(QStringList(tr("Vector Iteration Count"))));
fpexc->addChild(new QTreeWidgetItem(QStringList("FP2V")));
fpexc->addChild(new QTreeWidgetItem(QStringList("EN")));
fpexc->addChild(new QTreeWidgetItem(QStringList("EX")));
vfp_system_registers->addChild(fpscr);
vfp_system_registers->addChild(fpexc);
vfp_system_registers->addChild(new QTreeWidgetItem(QStringList("FPINST")));
vfp_system_registers->addChild(new QTreeWidgetItem(QStringList("FPINST2")));
}
void RegistersWidget::UpdateVFPSystemRegisterValues() {
UNIMPLEMENTED();
}

View File

@@ -1,42 +0,0 @@
// Copyright 2014 Citra Emulator Project
// Licensed under GPLv2 or any later version
// Refer to the license.txt file included.
#pragma once
#include <QDockWidget>
#include "ui_registers.h"
class QTreeWidget;
class QTreeWidgetItem;
class EmuThread;
class RegistersWidget : public QDockWidget {
Q_OBJECT
public:
explicit RegistersWidget(QWidget* parent = nullptr);
public slots:
void OnDebugModeEntered();
void OnDebugModeLeft();
void OnEmulationStarting(EmuThread* emu_thread);
void OnEmulationStopping();
private:
void CreateCPSRChildren();
void UpdateCPSRValues();
void CreateVFPSystemRegisterChildren();
void UpdateVFPSystemRegisterValues();
Ui::ARMRegisters cpu_regs_ui;
QTreeWidget* tree;
QTreeWidgetItem* core_registers;
QTreeWidgetItem* vfp_registers;
QTreeWidgetItem* vfp_system_registers;
QTreeWidgetItem* cpsr;
};

View File

@@ -1,40 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<ui version="4.0">
<class>ARMRegisters</class>
<widget class="QDockWidget" name="ARMRegisters">
<property name="geometry">
<rect>
<x>0</x>
<y>0</y>
<width>400</width>
<height>300</height>
</rect>
</property>
<property name="windowTitle">
<string>ARM Registers</string>
</property>
<widget class="QWidget" name="dockWidgetContents">
<layout class="QVBoxLayout" name="verticalLayout">
<item>
<widget class="QTreeWidget" name="treeWidget">
<property name="alternatingRowColors">
<bool>true</bool>
</property>
<column>
<property name="text">
<string>Register</string>
</property>
</column>
<column>
<property name="text">
<string>Value</string>
</property>
</column>
</widget>
</item>
</layout>
</widget>
</widget>
<resources/>
<connections/>
</ui>

View File

@@ -33,7 +33,6 @@
#include "yuzu/debugger/graphics/graphics_breakpoints.h"
#include "yuzu/debugger/graphics/graphics_surface.h"
#include "yuzu/debugger/profiler.h"
#include "yuzu/debugger/registers.h"
#include "yuzu/debugger/wait_tree.h"
#include "yuzu/game_list.h"
#include "yuzu/hotkeys.h"
@@ -169,15 +168,6 @@ void GMainWindow::InitializeDebugWidgets() {
debug_menu->addAction(microProfileDialog->toggleViewAction());
#endif
registersWidget = new RegistersWidget(this);
addDockWidget(Qt::RightDockWidgetArea, registersWidget);
registersWidget->hide();
debug_menu->addAction(registersWidget->toggleViewAction());
connect(this, &GMainWindow::EmulationStarting, registersWidget,
&RegistersWidget::OnEmulationStarting);
connect(this, &GMainWindow::EmulationStopping, registersWidget,
&RegistersWidget::OnEmulationStopping);
graphicsBreakpointsWidget = new GraphicsBreakPointsWidget(debug_context, this);
addDockWidget(Qt::RightDockWidgetArea, graphicsBreakpointsWidget);
graphicsBreakpointsWidget->hide();
@@ -460,17 +450,12 @@ void GMainWindow::BootGame(const QString& filename) {
connect(render_window, &GRenderWindow::Closed, this, &GMainWindow::OnStopGame);
// BlockingQueuedConnection is important here, it makes sure we've finished refreshing our views
// before the CPU continues
connect(emu_thread.get(), &EmuThread::DebugModeEntered, registersWidget,
&RegistersWidget::OnDebugModeEntered, Qt::BlockingQueuedConnection);
connect(emu_thread.get(), &EmuThread::DebugModeEntered, waitTreeWidget,
&WaitTreeWidget::OnDebugModeEntered, Qt::BlockingQueuedConnection);
connect(emu_thread.get(), &EmuThread::DebugModeLeft, registersWidget,
&RegistersWidget::OnDebugModeLeft, Qt::BlockingQueuedConnection);
connect(emu_thread.get(), &EmuThread::DebugModeLeft, waitTreeWidget,
&WaitTreeWidget::OnDebugModeLeft, Qt::BlockingQueuedConnection);
// Update the GUI
registersWidget->OnDebugModeEntered();
if (ui.action_Single_Window_Mode->isChecked()) {
game_list->hide();
}

View File

@@ -19,7 +19,6 @@ class GraphicsSurfaceWidget;
class GRenderWindow;
class MicroProfileDialog;
class ProfilerWidget;
class RegistersWidget;
class WaitTreeWidget;
namespace Tegra {
@@ -163,7 +162,6 @@ private:
// Debugger panes
ProfilerWidget* profilerWidget;
MicroProfileDialog* microProfileDialog;
RegistersWidget* registersWidget;
GraphicsBreakPointsWidget* graphicsBreakpointsWidget;
GraphicsSurfaceWidget* graphicsSurfaceWidget;
WaitTreeWidget* waitTreeWidget;