Commit Graph

10419 Commits

Author SHA1 Message Date
Lioncash
894d9a8253 CMakeLists: Define QT_USE_QSTRINGBUILDER for the Qt target
This is a compile definition introduced in Qt 4.8 for reducing the total
potential number of strings created when performing string
concatenation. This allows for less memory churn.

This can be read about here:
https://blog.qt.io/blog/2011/06/13/string-concatenation-with-qstringbuilder/

For a change that isn't source-compatible, we only had one occurrence
that actually need to have its type clarified, which is pretty good, as
far as transitioning goes.
2019-04-15 17:59:41 -04:00
Lioncash
74a4f2ef19 svc: Specify handle value in thread's name
Allows the handle to be seen alongside the entry point.
2019-04-15 15:56:18 -04:00
Fernando Sahmkow
7a3e916aa3 Correct Kepler Memory on Linear Pushes. 2019-04-15 14:51:36 -04:00
Fernando Sahmkow
96a79e7c42 Support compressed formats on linear textures. 2019-04-15 13:56:09 -04:00
Lioncash
69ff7a5779 common/{lz4_compression, zstd_compression}: Add missing header guards
These two files were missing the #pragma once directive.
2019-04-15 13:00:08 -04:00
Fernando Sahmkow
9fd7921265 Correct Pitch in Fermi2D 2019-04-15 12:24:29 -04:00
Lioncash
c159aec1ad kernel/thread: Remove BoostPriority()
This is a holdover from Citra that currently remains unused, so it can
be removed from the Thread interface.
2019-04-15 06:59:19 -04:00
Lioncash
f55e040284 kernel/thread: Remove unused guest_handle member variable
This member variable is entirely unused. It was only set but never
actually utilized. Given that, we can remove it to get rid of noise in
the thread interface.
2019-04-14 06:06:06 -04:00
ReinUsesLisp
adc9b7754b gl_shader_decompiler: Use variable AOFFI on supported hardware 2019-04-14 05:13:19 -03:00
ReinUsesLisp
d91524b6d7 shader_ir: Implement STG, keep track of global memory usage and flush 2019-04-14 00:25:32 -03:00
bunnei
d587934679 Merge pull request #2378 from lioncash/ro
ldr: Minor amendments to IPC-related parameters
2019-04-13 22:16:10 -04:00
bunnei
02baac5c7f Merge pull request #2373 from FernandoS27/z32
Set Pixel Format to Z32 if its R32F and depth compare enabled, and Implement format ZF32_X24S8
2019-04-13 22:14:51 -04:00
bunnei
25e3079209 Merge pull request #2357 from zarroboogs/force-30fps-mode
Add a toggle to force 30FPS mode
2019-04-13 22:14:04 -04:00
bunnei
eba963d124 Merge pull request #2381 from lioncash/fs
fsp_srv: Minor cleanup related changes
2019-04-13 22:09:58 -04:00
bunnei
530defc20e Merge pull request #2386 from ReinUsesLisp/shader-manager
gl_shader_manager: Move code to source file and minor clean up
2019-04-13 22:09:27 -04:00
bunnei
a15684f080 Merge pull request #2017 from jroweboy/glwidget
Frontend: Migrate to QOpenGLWindow and support shared contexts
2019-04-13 22:08:40 -04:00
bunnei
997c6de5c9 Merge pull request #2389 from FreddyFunk/rename-gamedir
ui_settings: Rename game directory variables
2019-04-13 22:06:51 -04:00
Lioncash
28dcc936d5 kernel/svc: Implement svcUnmapProcessCodeMemory
Essentially performs the inverse of svcMapProcessCodeMemory. This unmaps
the aliasing region first, then restores the general traits of the
aliased memory.

What this entails, is:

- Restoring Read/Write permissions to the VMA.
- Restoring its memory state to reflect it as a general heap memory region.
- Clearing the memory attributes on the region.
2019-04-12 21:56:03 -04:00
Lioncash
d99e0ae59f kernel/svc: Implement svcMapProcessCodeMemory
This is utilized for mapping code modules into memory. Notably, the
ldr service would call this in order to map objects into memory.
2019-04-12 21:55:50 -04:00
bunnei
fb18d01faf Merge pull request #2391 from lioncash/scope
common/scope_exit: Replace std::move with std::forward in ScopeExit()
2019-04-12 21:52:35 -04:00
bunnei
891e6a66a7 Merge pull request #2392 from lioncash/swap
common/swap: Minor cleanup and improvements to byte swapping functions
2019-04-12 21:52:16 -04:00
FreddyFunk
ab1fa60603 Fix Clang Format 2019-04-12 16:40:35 +02:00
Lioncash
58130d5cf6 common/swap: Improve codegen of the default swap fallbacks
Uses arithmetic that can be identified more trivially by compilers for
optimizations. e.g. Rather than shifting the halves of the value and
then swapping and combining them, we can swap them in place.

e.g. for the original swap32 code on x86-64, clang 8.0 would generate:

    mov     ecx, edi
    rol     cx, 8
    shl     ecx, 16
    shr     edi, 16
    rol     di, 8
    movzx   eax, di
    or      eax, ecx
    ret

while GCC 8.3 would generate the ideal:

    mov     eax, edi
    bswap   eax
    ret

now both generate the same optimal output.

MSVC used to generate the following with the old code:

    mov     eax, ecx
    rol     cx, 8
    shr     eax, 16
    rol     ax, 8
    movzx   ecx, cx
    movzx   eax, ax
    shl     ecx, 16
    or      eax, ecx
    ret     0

Now MSVC also generates a similar, but equally optimal result as clang/GCC:

    bswap   ecx
    mov     eax, ecx
    ret     0

====

In the swap64 case, for the original code, clang 8.0 would generate:

    mov     eax, edi
    bswap   eax
    shl     rax, 32
    shr     rdi, 32
    bswap   edi
    or      rax, rdi
    ret

(almost there, but still missing the mark)

while, again, GCC 8.3 would generate the more ideal:

    mov     rax, rdi
    bswap   rax
    ret

now clang also generates the optimal sequence for this fallback as well.

This is a case where MSVC unfortunately falls short, despite the new
code, this one still generates a doozy of an output.

    mov     r8, rcx
    mov     r9, rcx
    mov     rax, 71776119061217280
    mov     rdx, r8
    and     r9, rax
    and     edx, 65280
    mov     rax, rcx
    shr     rax, 16
    or      r9, rax
    mov     rax, rcx
    shr     r9, 16
    mov     rcx, 280375465082880
    and     rax, rcx
    mov     rcx, 1095216660480
    or      r9, rax
    mov     rax, r8
    and     rax, rcx
    shr     r9, 16
    or      r9, rax
    mov     rcx, r8
    mov     rax, r8
    shr     r9, 8
    shl     rax, 16
    and     ecx, 16711680
    or      rdx, rax
    mov     eax, -16777216
    and     rax, r8
    shl     rdx, 16
    or      rdx, rcx
    shl     rdx, 16
    or      rax, rdx
    shl     rax, 8
    or      rax, r9
    ret     0

which is pretty unfortunate.
2019-04-12 00:07:39 -04:00
Lioncash
8dcb5915dd core/core: Move process execution start to System's Load()
This gives us significantly more control over where in the
initialization process we start execution of the main process.

Previously we were running the main process before the CPU or GPU
threads were initialized (not good). This amends execution to start
after all of our threads are properly set up.
2019-04-11 22:11:41 -04:00
Lioncash
45e88973e6 core/process: Remove unideal page table setting from LoadFromMetadata()
Initially required due to the split codepath with how the initial main
process instance was initialized. We used to initialize the process
like:

Init() {
    main_process = Process::Create(...);
    kernel.MakeCurrentProcess(main_process.get());
}

Load() {
    const auto load_result = loader.Load(*kernel.GetCurrentProcess());
    if (load_result != Loader::ResultStatus::Success) {
        // Handle error here.
    }
    ...
}

which presented a problem.

Setting a created process as the main process would set the page table
for that process as the main page table. This is fine... until we get to
the part that the page table can have its size changed in the Load()
function via NPDM metadata, which can dictate either a 32-bit, 36-bit,
or 39-bit usable address space.

Now that we have full control over the process' creation in load, we can
simply set the initial process as the main process after all the loading
is done, reflecting the potential page table changes without any
special-casing behavior.

We can also remove the cache flushing within LoadModule(), as execution
wouldn't have even begun yet during all usages of this function, now
that we have the initialization order cleaned up.
2019-04-11 22:11:41 -04:00
Lioncash
171210cb0d core/core: Move main process creation into Load()
Now that we have dependencies on the initialization order, we can move
the creation of the main process to a more sensible area: where we
actually load in the executable data.

This allows localizing the creation and loading of the process in one
location, making the initialization of the process much nicer to trace.
2019-04-11 22:11:40 -04:00
Lioncash
5a25f386e5 video_core/gpu: Create threads separately from initialization
Like with CPU emulation, we generally don't want to fire off the threads
immediately after the relevant classes are initialized, we want to do
this after all necessary data is done loading first.

This splits the thread creation into its own interface member function
to allow controlling when these threads in particular get created.
2019-04-11 22:11:40 -04:00
Lioncash
20ce175076 core/cpu_core_manager: Create threads separately from initialization.
Our initialization process is a little wonky than one would expect when
it comes to code flow. We initialize the CPU last, as opposed to
hardware, where the CPU obviously needs to be first, otherwise nothing
else would work, and we have code that adds checks to get around this.

For example, in the page table setting code, we check to see if the
system is turned on before we even notify the CPU instances of a page
table switch. This results in dead code (at the moment), because the
only time a page table switch will occur is when the system is *not*
running, preventing the emulated CPU instances from being notified of a
page table switch in a convenient manner (technically the code path
could be taken, but we don't emulate the process creation svc handlers
yet).

This moves the threads creation into its own member function of the core
manager and restores a little order (and predictability) to our
initialization process.

Previously, in the multi-threaded cases, we'd kick off several threads
before even the main kernel process was created and ready to execute (gross!).
Now the initialization process is like so:

Initialization:
  1. Timers

  2. CPU

  3. Kernel

  4. Filesystem stuff (kind of gross, but can be amended trivially)

  5. Applet stuff (ditto in terms of being kind of gross)

  6. Main process (will be moved into the loading step in a following
                   change)

  7. Telemetry (this should be initialized last in the future).

  8. Services (4 and 5 should ideally be alongside this).

  9. GDB (gross. Uses namespace scope state. Needs to be refactored into a
          class or booted altogether).

  10. Renderer

  11. GPU (will also have its threads created in a separate step in a
           following change).

Which... isn't *ideal* per-se, however getting rid of the wonky
intertwining of CPU state initialization out of this mix gets rid of
most of the footguns when it comes to our initialization process.
2019-04-11 22:11:40 -04:00
bunnei
793ab146b5 Merge pull request #2235 from ReinUsesLisp/spirv-decompiler
vk_shader_decompiler: Implement a SPIR-V decompiler
2019-04-11 21:54:23 -04:00
bunnei
02ea06348a Merge pull request #2360 from lioncash/svc-global
kernel/svc: Deglobalize the supervisor call handlers
2019-04-11 21:50:05 -04:00
bunnei
a113a11eca Merge pull request #2388 from lioncash/constexpr
kernel: Make handle type declarations constexpr
2019-04-11 21:49:45 -04:00
Lioncash
cc8d2f5a9f common/swap: Mark byte swapping free functions with [[nodiscard]] and noexcept
Allows the compiler to inform when the result of a swap function is
being ignored (which is 100% a bug in all usage scenarios). We also mark
them noexcept to allow other functions using them to be able to be
marked as noexcept and play nicely with things that potentially inspect
"nothrowability".
2019-04-11 20:42:44 -04:00
Lioncash
417997f624 common/swap: Simplify swap function ifdefs
Including every OS' own built-in byte swapping functions is kind of
undesirable, since it adds yet another build path to ensure compilation
succeeds on.

Given we only support clang, GCC, and MSVC for the time being, we can
utilize their built-in functions directly instead of going through the
OS's API functions.

This shrinks the overall code down to just

if (msvc)
  use msvc's functions
else if (clang or gcc)
  use clang/gcc's builtins
else
  use the slow path
2019-04-11 20:36:19 -04:00
Lioncash
a35089bcc3 common/swap: Remove 32-bit ARM path
We don't plan to support host 32-bit ARM execution environments, so this
is essentially dead code.
2019-04-11 20:15:47 -04:00
Lioncash
6e9399aebf common/scope_exit: Replace std::move with std::forward in ScopeExit()
The template type here is actually a forwarding reference, not an rvalue
reference in this case, so it's more appropriate to use std::forward to
preserve the value category of the type being moved.
2019-04-11 20:01:33 -04:00
Lioncash
e7de4ad13e kernel: Make handle type declarations constexpr
Some objects declare their handle type as const, while others declare it
as constexpr. This makes the const ones constexpr for consistency, and
prevent unexpected compilation errors if these happen to be attempted to be
used within a constexpr context.
2019-04-11 16:34:53 -04:00
FreddyFunk
d0485c3070 ui_settings: Rename game directory variables 2019-04-11 19:55:56 +02:00
Fernando Sahmkow
3f79f91b59 gl_rasterizer_cache: Relax restrictions on FastCopySurface and FastLayeredCopySurface 2019-04-11 13:14:28 -04:00
Lioncash
3313392214 service: Update service function tables
Updates function tables based off information from SwitchBrew.
2019-04-11 02:47:00 -04:00
bunnei
8a2f6ef4ce Merge pull request #2278 from ReinUsesLisp/vc-texture-cache
video_core: Implement API agnostic view based texture cache
2019-04-10 21:17:35 -04:00
bunnei
2c00d75fbb Merge pull request #2372 from FernandoS27/fermi-fix
Correct Fermi Copy on Linear Textures.
2019-04-10 21:17:03 -04:00
ReinUsesLisp
802316682c gl_shader_manager: Move code to source file and minor clean up 2019-04-10 19:29:15 -03:00
ReinUsesLisp
6c1cfe9e1d gl_rasterizer: Apply just the needed state on Clear 2019-04-10 18:13:15 -03:00
Lioncash
c9e839c5cb ldr: Mark IsValidNROHash() as a const member function
This doesn't modify instance state, so it can be made const.
2019-04-10 15:57:02 -04:00
Lioncash
a3b016769a ldr: Amend parameters for LoadNro/UnloadNro LoadNrr/UnloadNrr
The initial two words indicate a process ID. Also UnloadNro only
specifies one address, not two.
2019-04-10 15:56:43 -04:00
ReinUsesLisp
2b15011114 gl_device: Implement interface and add uniform offset alignment 2019-04-10 15:56:12 -03:00
ReinUsesLisp
6306ade6e5 vk_shader_decompiler: Implement flow primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp
05b14bc870 vk_shader_decompiler: Implement most common texture primitives 2019-04-10 14:20:25 -03:00
ReinUsesLisp
85eb68606a vk_shader_decompiler: Implement texture decompilation helper functions 2019-04-10 14:20:25 -03:00
ReinUsesLisp
3071adb946 vk_shader_decompiler: Implement Assign and LogicalAssign 2019-04-10 14:20:25 -03:00