vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.
Project Subscriptions
No data.
Advisories
| Source | ID | Title |
|---|---|---|
Github GHSA |
GHSA-5jv2-g5wq-cmr4 | vLLM: GGUF dequantize kernel int truncation exposes uninitialized GPU memory in multi-tenant serving |
Fixes
Solution
No solution given by the vendor.
Workaround
No workaround given by the vendor.
References
History
Mon, 22 Jun 2026 22:45:00 +0000
| Type | Values Removed | Values Added |
|---|---|---|
| Description | vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0. | |
| Title | vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow | |
| Weaknesses | CWE-200 CWE-681 |
|
| References |
| |
| Metrics |
cvssV4_0
|
Projects
Sign in to view the affected projects.
Status: PUBLISHED
Assigner: GitHub_M
Published:
Updated: 2026-06-22T21:55:42.001Z
Reserved: 2026-06-11T15:46:12.316Z
Link: CVE-2026-53923
No data.
No data.
No data.
OpenCVE Enrichment
Updated: 2026-06-22T23:30:05Z
Github GHSA