# CVE-2026-7482: Ollama GGUF Heap OOB Read Reproduction

This repository contains my local reproduction script for CVE-2026-7482, a heap out-of-bounds (OOB) read in the GGUF loading and quantization paths of vulnerable Ollama versions.

The important result from this work is narrow: I could trigger the heap OOB condition reliably and produce OOB-influenced quantized GGUF artifacts. I could not demonstrate a clear black-box impact, such as reliable plaintext secret recovery or direct extraction of canary strings from the resulting artifact.

## What This PoC Does

`exp.py` creates two GGUF files:

- a malicious truncated GGUF with a tensor that declares more bytes than the file actually contains;
- a full zero-filled control GGUF with the same declared tensor shape.

It uploads both files to a vulnerable Ollama instance through the local Ollama API, triggers quantization with `/api/create`, copies the generated GGUF blobs from the local Docker container, and compares the malicious output against the zero-control output.

The differential comparison is useful because any difference from the zero-filled control output shows that the vulnerable quantization path used bytes that were not present in the original malicious GGUF file. In my tests, this behavior was stable on Ollama `0.17.0`, and the fixed `0.17.1` rejected the same input.
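The core of such a differential comparison can be sketched as below. This is a minimal illustration, not the logic in `exp.py`: the helper name `differing_ranges` is hypothetical, and it assumes the two tensor payloads have already been extracted as equal-length byte buffers.

```python
def differing_ranges(a, b):
    """Return half-open (start, end) ranges where two equal-length buffers differ."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    ranges = []
    start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y and start is None:
            start = i  # a differing run begins here
        elif x == y and start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, len(a)))  # run extends to the end of the buffer
    return ranges


# Toy data standing in for the control and malicious payloads.
control = bytes(16)
observed = bytes([0, 0, 7, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
print(differing_ranges(control, observed))  # [(2, 4), (14, 16)]
```

Reporting ranges instead of a single equal/not-equal flag makes it easier to see whether the differences line up with the part of the tensor that the truncated file did not supply.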

## Requirements

- Python 3 with `requests`
- Docker access to the vulnerable Ollama container
- Ollama `0.17.0` exposed on a local API port
- A vulnerable test container name, for example `ollama-old-test`

Example lab target:

```bash
docker run -d --name ollama-old-test -p 11435:11434 ollama/ollama:0.17.0
```
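Before running the script, it can help to confirm that the lab container really reports the version under test. The sketch below is an assumption-laden helper and not part of `exp.py`: the function names are hypothetical, it queries Ollama's `GET /api/version` endpoint, and it only recognizes the single version (`0.17.0`) that I tested as vulnerable.

```python
import json
import urllib.request

# The only version this README reports as vulnerable in testing.
TESTED_VULNERABLE = (0, 17, 0)


def parse_version(text):
    """Turn a string such as '0.17.0' into (0, 17, 0), ignoring any '-suffix'."""
    core = text.strip().lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def fetch_version(base_url="http://localhost:11435"):
    """Read the version string from the Ollama API (GET /api/version)."""
    with urllib.request.urlopen(f"{base_url}/api/version", timeout=5) as resp:
        return json.load(resp)["version"]


def is_tested_vulnerable(version_text):
    """True only for the exact version tested as vulnerable in this README."""
    return parse_version(version_text) == TESTED_VULNERABLE
```

With the lab container running, `is_tested_vulnerable(fetch_version())` should return `True` for the `ollama/ollama:0.17.0` image started above.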

## Usage

Install the only Python dependency:

```bash
python3 -m pip install requests
```

Run the default test against `http://localhost:11435` and container `ollama-old-test`:

```bash
python3 exp.py
```

Explicit arguments:

```bash
python3 exp.py http://localhost:11435 ollama-old-test Q4_K_M F16
python3 exp.py http://localhost:11435 ollama-old-test Q8_0 F16
python3 exp.py http://localhost:11435 ollama-old-test Q8_0 F32
```

The script writes local artifacts such as:

- `malicious_model.gguf`
- `control_model.gguf`
- `quantized_model.gguf`
- `control_quantized_model.gguf`
- `q8_dequantized_f32.bin`
- `q8_pseudo_f16.bin`
- `q8_pseudo_f16.txt`

## Findings

In my local tests, the vulnerable Ollama version produced quantized outputs in which the tensor data from the malicious file differed from the zero-control tensor, even though the malicious GGUF file did not contain those bytes.

That is enough to show an OOB-influenced artifact. It is not enough to claim practical black-box data disclosure.

I also tested canary-style data in concurrent model prompts and searched the generated artifacts, Q8_0 dequantized float32 bytes, and pseudo-F16 reconstruction output. I did not recover exact canaries or meaningful plaintext fragments.
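A canary search of this kind can be sketched as follows. The helper names and the `CANARY` marker are hypothetical, and the sketch assumes each artifact is small enough to read into memory; it is not the exact search used in `exp.py`.

```python
from pathlib import Path


def find_all(data, needle):
    """Return every offset at which `needle` occurs in `data`."""
    if not needle:
        raise ValueError("needle must not be empty")
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + 1)
    return offsets


def search_artifacts(paths, canary):
    """Map each existing artifact path to the offsets where the canary appears."""
    results = {}
    for path in map(Path, paths):
        if path.is_file():  # silently skip artifacts that were not generated
            results[str(path)] = find_all(path.read_bytes(), canary)
    return results


# Toy buffer standing in for an artifact's contents.
sample = b"\x00\x01CANARY\x00\x00CANARY"
print(find_all(sample, b"CANARY"))  # [2, 10]
```

For example, `search_artifacts(["quantized_model.gguf", "q8_dequantized_f32.bin"], b"CANARY")` would return an empty offset list for each file when no exact match exists, which matches what I observed.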

The likely reason is that the bytes are not copied out as raw heap memory. They pass through the model conversion and quantization pipeline:

```text
heap bytes -> interpreted as F16/F32 tensor values -> converted/quantized -> GGUF tensor output
```

This path is lossy, especially with quantized formats such as `Q4_K_M`. `Q8_0` preserves more numeric information than `Q4_K_M`, but it still did not produce reliable plaintext recovery in my black-box-style tests.
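The loss can be illustrated with a toy round trip. The sketch below is a simplified model of Q8_0-style quantization (one scale per block of 32 values, 8-bit integer quants), not Ollama's or ggml's actual code: it keeps the scale in full precision and handles only a single block. The canary string and function names are made up for the example.

```python
import struct


def bytes_to_f16(data):
    """Interpret raw bytes as little-endian IEEE 754 half-precision floats."""
    count = len(data) // 2
    return list(struct.unpack(f"<{count}e", data[: count * 2]))


def q8_0_roundtrip(block):
    """Quantize one block to 8-bit integers with a shared scale, then dequantize."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block)
    scale = amax / 127.0
    quants = [round(v / scale) for v in block]  # each fits in [-127, 127]
    return [q * scale for q in quants]


def f16_to_bytes(values):
    """Pack floats back into little-endian half-precision bytes."""
    return struct.pack(f"<{len(values)}e", *values)


# 64 bytes of ASCII text, read as one block of 32 half floats.
canary = b"CANARY_0123456789_SECRET_TOKEN__" * 2
values = bytes_to_f16(canary)
recovered = f16_to_bytes(q8_0_roundtrip(values))

survived = sum(
    1 for i in range(0, len(canary), 2) if canary[i : i + 2] == recovered[i : i + 2]
)
print(f"{survived} of {len(values)} two-byte slots survived the round trip")
print(recovered == canary)  # False
```

Because every value in a block shares one scale, small values next to a large one collapse to zero, so the original byte pairs cannot be reconstructed even in this idealized setting.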

## Scope And Limitations

This is a local lab reproduction and artifact analysis helper.

It does not provide a reliable remote secret exfiltration primitive. It also requires local Docker access to copy Ollama's generated blob from the test container, so the analysis step is not a pure remote black-box workflow.

The practical conclusion from my tests is:

- Heap OOB behavior is reproducible.
- OOB-influenced quantized artifacts are observable.
- Clear black-box plaintext impact was not demonstrated.

## References

- NVD entry: https://nvd.nist.gov/vuln/detail/CVE-2026-7482
- Ollama fix commit: https://github.com/ollama/ollama/commit/88d57d0483cca907e0b23a968c83627a20b21047

## Related Work

There is also a separate PoC repository by 0x0OZ:

https://github.com/0x0OZ/CVE-2026-7482-PoC

That implementation demonstrates a stronger white-box-style workflow by pushing the generated model artifact to a controlled registry. With the registry upload-flow fix from my PR, it completes cleanly in my local lab:

https://github.com/0x0OZ/CVE-2026-7482-PoC/pull/1

Even with that better end-to-end artifact collection path, the same data-quality caveat remains important: the output is quantized/model-transformed data, not a direct raw heap dump.

## Disclaimer

This repository is for authorized vulnerability research and defensive reproduction only. Test only against systems you own or have explicit permission to assess.