Full Report
NVIDIA publishes open source kernel modules for Linux (open-gpu-kernel-modules). While reviewing them, the researchers found a null pointer dereference triggered simply by setting the MEMORY_DESCRIPTOR on UVM_MAP_EXTERNAL_ALLOCATION in the Unified Virtual Memory (UVM) framework. Usually, this would just be a crash, but here it has a much more devious consequence.
The functions threadStateInit() and threadStateFree() are used throughout open-gpu-kernel-modules. The thread-state structure is added to a global red-black tree during initialization and removed once it is freed. The tree entry is a pointer into the kernel stack! If there's a kernel oops, the stack is cleaned up but threadStateFree() never runs, so the pointer stays in the tree. The second vulnerability is therefore a stack use-after-free triggered by the null pointer dereference from above. I imagine that the author found this issue first and then searched for a crash.
The kernel stack is located in the vmalloc area, whose allocator hands out virtually contiguous memory with page granularity. It's used for stack allocations and for large kernel allocations. The goal is to reclaim the freed stack memory with a new allocation that the attacker can read and write, so that the driver then treats attacker-controlled data as a thread state. Several kernel paths make such allocations; two of note are forking and Video4Linux2 (V4L2) buffers. Video4Linux2 buffers are accessible from userland, so the attacker can read and write arbitrary data in them. Forking lets us allocate 4 pages plus a guard page that will quickly be freed. Together, these give us what we need for memory grooming: combining them makes it possible to reclaim the section of memory that held the stack containing the thread state.
The UAF occurs at a node in a Red-Black tree. Opening a GPU device inserts a new node, which adds a new pointer to the dangling node for a brief moment. By continually reading the node, it's possible to leak a kernel stack address. We don't know the exact offset because of the randomize_kstack_offset feature, but it's a good starting point.
With the leak, it's now possible to plant stack addresses in the Red-Black tree node that we control. To exploit this, the author abused the recoloring and rotations performed on the tree. Similar to the classic unlink technique, this yields an arbitrary write primitive, though with constraints on where it can write. They couldn't find any good targets within the dynamically allocated data, so they decided to target the running kernel stack instead. Within the open syscall, we can calculate the offsets we need to write to and then overwrite arbitrary values on the stack. Notably, the file pointer can be corrupted, and it contains function pointers!
To get the KASLR slide, sock_from_file() can be used to access private socket data and check the file type. By triggering different errors, it's possible to leak KASLR. The llseek() handler is called without any check on the validity of its pointer. From userland, we can control the parameters and return values. How nice! With this, it's possible to corrupt the struct file directly to achieve code execution within the kernel.
With this powerful call primitive, they built three further primitives: kernel symbolication, arbitrary read, and arbitrary write. With these, they overwrote the creds of a process to become root. Neat!
A couple of things stood out to me. First, the complexity of getting this primitive to work is off the charts; there are so many little details to figure out. Second, I had no idea that structures could be randomized by the kernel at build time, making remote exploitation much harder. Overall, a great and in-depth post!
Analysis Summary
# Vulnerability: NVIDIA Open GPU Kernel Module Stack Use-After-Free
## CVE Details
- **CVE ID:** CVE-2025-23300 (Null Pointer Dereference), CVE-2025-23280 (Use-After-Free)
- **CVSS Score:** Not explicitly listed, but the impact allows for full kernel-level code execution (Critical equivalent).
- **CWE:** CWE-476 (NULL Pointer Dereference), CWE-416 (Use-After-Free)
## Affected Systems
- **Products:** NVIDIA Linux Open GPU Kernel Modules (`nvidia.ko` and `nvidia-uvm.ko`).
- **Versions:** Versions prior to the October 2025 security update.
- **Configurations:** Systems using the open-source kernel modules for NVIDIA GPUs (standard for many modern Linux distributions since 2024). Accessible via unprivileged local processes.
## Vulnerability Description
The exploit chain involves two distinct vulnerabilities:
1. **CVE-2025-23300:** A null pointer dereference in the `nvidia-uvm` module triggered by setting a specific `MEMORY_DESCRIPTOR` on the `UVM_MAP_EXTERNAL_ALLOCATION`.
2. **CVE-2025-23280:** A stack use-after-free (UAF) in the `nvidia` module. The functions `threadStateInit()` and `threadStateFree()` manage a `thread-state` structure that points to the kernel stack. This structure is stored in a global Red-Black (RB) tree. When a "kernel oops" (triggered by the first bug) occurs, the stack is cleaned up, but the pointer remains in the RB tree.
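The lifetime problem behind the second bug can be modelled in a few lines. This is a toy sketch only, not the driver's code: `ThreadStateRegistry`, `handle_request`, and the `live` flag are hypothetical names, and a Python dict stands in for the global RB tree. It shows how an abnormal exit between the init and free calls leaves a registry entry that refers to storage which is no longer valid.

```python
class ThreadStateRegistry:
    """Toy stand-in for the global tree of thread-state entries (hypothetical)."""

    def __init__(self):
        self.entries = {}

    def init(self, thread_id, frame):
        # Analogue of threadStateInit(): publish a reference to stack data.
        self.entries[thread_id] = frame

    def free(self, thread_id):
        # Analogue of threadStateFree(): withdraw the reference.
        self.entries.pop(thread_id, None)

    def stale(self):
        # Entries whose backing "stack frame" is gone but are still registered.
        return [tid for tid, frame in self.entries.items() if not frame["live"]]


def handle_request(registry, thread_id, oops):
    frame = {"owner": thread_id, "live": True}  # stands in for stack-resident data
    registry.init(thread_id, frame)
    if oops:
        frame["live"] = False  # the stack is torn down...
        return                 # ...but the free call below never runs
    registry.free(thread_id)
    frame["live"] = False


registry = ThreadStateRegistry()
handle_request(registry, 1, oops=False)  # normal path: entry removed
handle_request(registry, 2, oops=True)   # oops path: entry left behind
print(registry.stale())
```

In the model, the normal path leaves nothing behind, while the simulated oops leaves one entry whose backing data is gone, which is the condition the description above calls a stack use-after-free.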
## Exploitation
- **Status:** PoC available (developed by Quarkslab researchers).
- **Complexity:** High (requires kernel memory grooming, KASLR bypass, and RB-tree manipulation).
- **Attack Vector:** Local (unprivileged userland process).
- **Exploit Flow:**
  - **Memory Grooming:** Uses `fork()` and Video4Linux2 (V4L2) buffers to reclaim `vmalloc` area memory.
  - **Information Leak:** Reads from the RB-tree node to leak kernel stack addresses.
  - **Arbitrary Write:** Abuses RB-tree rotations/recoloring to gain a constrained write primitive, eventually overwriting the kernel stack.
  - **Privilege Escalation:** Overwrites `struct file` function pointers (e.g., `llseek`) to execute a call primitive that overwrites process credentials to achieve root.
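To see why tree rebalancing matters here, it helps to recall which fields a rotation stores to. The sketch below is a textbook left rotation on plain Python objects, not the driver's tree implementation; `Node`, `rotate_left`, and the node names are illustrative assumptions. It records every (node, field) slot the rotation writes.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.left = None
        self.right = None


def rotate_left(x):
    """Textbook left rotation around x; returns the (node, field) slots it writes."""
    writes = []

    def store(node, field, value):
        setattr(node, field, value)
        writes.append((node.name, field))

    y = x.right
    store(x, "right", y.left)
    if y.left is not None:
        store(y.left, "parent", x)
    store(y, "parent", x.parent)
    if x.parent is not None:
        if x.parent.left is x:
            store(x.parent, "left", y)
        else:
            store(x.parent, "right", y)
    store(y, "left", x)
    store(x, "parent", y)
    return writes


# Build: p.left = x, x.right = y, y.left = b
p, x, y, b = Node("p"), Node("x"), Node("y"), Node("b")
p.left, x.parent = x, p
x.right, y.parent = y, x
y.left, b.parent = b, y

writes = rotate_left(x)
print(writes)
```

Every store targets a node reached by following pointers out of `x` and `y`. If a node's links can be chosen by someone else, as with the dangling node described above, the destinations of those stores are chosen too, while the values written remain node addresses. That is one way to read the "constrained" qualifier on the write primitive.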
## Impact
- **Confidentiality:** High (Arbitrary kernel memory read).
- **Integrity:** High (Arbitrary kernel memory write/Root escalation).
- **Availability:** High (Kernel crash/Panic).
## Remediation
### Patches
- **NVIDIA GPU Display Driver Update (October 2025):** Users should update to the driver versions released alongside the October 2025 Security Bulletin.
- **Kernel Modules:** Ensure `open-gpu-kernel-modules` are updated to the latest patched version on GitHub.
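A version check along these lines can help confirm that a host is on a fixed release. It is a minimal sketch under stated assumptions: the fixed version numbers are not given here, so `minimum_fixed` is a placeholder to be filled in from the vendor bulletin, and the sample string is invented for illustration. On a real system the input might be the contents of `/proc/driver/nvidia/version`, assuming that file is present. Fixed versions can differ per driver branch, so a single threshold is a simplification.

```python
import re


def parse_driver_version(text):
    """Return the first dotted version in text as a tuple of ints, or None."""
    m = re.search(r"\b(\d+)\.(\d+)(?:\.(\d+))?\b", text)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups() if g is not None)


def is_patched(installed, minimum_fixed):
    """True if the installed version tuple is at least minimum_fixed."""
    return installed >= minimum_fixed


# Invented sample text and placeholder threshold, for illustration only.
sample = "NVRM version: Example Open Kernel Module 100.20.3 Release Build"
installed = parse_driver_version(sample)
minimum_fixed = (100, 20, 4)  # placeholder: take the real value from the bulletin
print(installed, is_patched(installed, minimum_fixed))
```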
### Workarounds
- Transitioning back to the proprietary (closed-source) NVIDIA kernel modules may mitigate these specific flaws in the open-source implementation, though this is not a recommended long-term solution.
## Detection
- **Indicators of Compromise:** Excessive kernel "oops" or null pointer dereference logs in `dmesg` associated with `nvidia-uvm`.
- **Detection Methods:** Monitoring for unusual `ioctl` calls to `/dev/nvidia*` from unauthorized or unprivileged processes.
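The log-based indicator can be turned into a simple scan of kernel log text. This is a rough sketch, not a vetted detection rule: the two regular expressions, the `window` size, and the sample log are assumptions made for illustration, and real oops output varies by kernel version and architecture. A module name appearing in the lines after an oops does not prove the fault originated in that module, so hits should be treated as leads for manual review.

```python
import re

NULL_DEREF = re.compile(r"BUG: kernel NULL pointer dereference")
UVM_HINT = re.compile(r"nvidia[_-]uvm")


def find_uvm_null_derefs(log_text, window=40):
    """Return NULL-dereference lines followed, within `window` lines, by an nvidia-uvm mention."""
    lines = log_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if NULL_DEREF.search(line):
            context = lines[i:i + window]
            if any(UVM_HINT.search(c) for c in context):
                hits.append(line.strip())
    return hits


# Invented log excerpt, for illustration only.
sample_log = """\
[  101.000001] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  101.000002] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  101.000003] Modules linked in: nvidia_uvm(OE) nvidia(OE)
[  205.000001] usb 1-1: new high-speed USB device number 2
"""

for hit in find_uvm_null_derefs(sample_log):
    print(hit)
```

In practice the input would come from `dmesg` or the system journal, and a count of hits per boot is more useful than a single match.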
## References
- NVIDIA Security Bulletin (October 2025): hxxps://nvidia[.]custhelp[.]com/app/answers/detail/a_id/5703
- Quarkslab Technical Analysis: hxxps://blog[.]quarkslab[.]com/nvidia_gpu_kernel_vmalloc_exploit[.]html
- NVIDIA Open GPU Kernel Modules Repository: hxxps://github[.]com/NVIDIA/open-gpu-kernel-modules