Exploit for sentinel

2026-06-27 | CVSS 6.1
## https://sploitus.com/exploit?id=197F5F1A-4022-5470-BA29-351D92AC0901
# Sentinel — Agentic Code & System Quality Guardian

Production-grade AI agent for automated code review, security analysis, and system quality assessment. Built with LangGraph orchestration, real tool execution (bandit, ruff, custom secret scanner), a configurable policy engine, and AgentOps-compatible observability.

**Not a demo.** 62 pytest tests (43 unit, 19 API integration). 10 benchmark tasks across 5 vulnerability categories. CI-reproducible with a deterministic simulated backend; no API keys needed.

## Quick Start

```bash
pip install -e ".[dev]"
sentinel scan fixtures/vulnerable                    # Full scan
sentinel scan fixtures/vulnerable --plan security    # Security-focused
sentinel scan fixtures/vulnerable --plan quick       # Lint-only
sentinel benchmark                                   # Run evaluation suite
```

## Architecture

```
                    ┌─────────────────────────────────┐
                    │        Sentinel Agent            │
                    │                                  │
  Target Code ──▶   │  Plan ──▶ Execute ──▶ Verify ──▶│──▶ Report
                    │    │          │          │       │
                    │    ▼          ▼          ▼       │
                    │ Planner   Tool       Policy      │
                    │ (sim/LLM) Registry   Engine      │
                    └─────────────────────────────────┘
                              │
                    ┌─────────┼─────────┐
                    ▼         ▼         ▼
                 bandit     ruff    sentinel-secrets
                (SAST)    (linter)  (regex scanner)
```

### Agent Pipeline

1. **Plan**: Selects analysis strategy (full, security, quick, secrets) and tools
2. **Execute**: Runs real analysis tools against target code with typed output parsing
3. **Verify**: Applies configurable policy engine — severity thresholds, blocking categories, allowlists
4. **Report**: Structured findings with severity, CWE IDs, locations, and remediation
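
The four stages above can be sketched as plain functions over a shared state object. This is a minimal illustration, not the project's LangGraph code: `ScanState`, `PLAN_TOOLS`, `run_scan`, and the tool sets chosen for the `full` and `security` plans are assumptions made for the example, and the tool runners are injected so the sketch runs without bandit or ruff installed.

```python
from dataclasses import dataclass, field

# Hypothetical plan -> tools mapping. The README states that "quick" is
# ruff-only and "secrets" is secret-scan only; the other two are guesses.
PLAN_TOOLS = {
    "full": ["bandit", "ruff", "sentinel-secrets"],
    "security": ["bandit", "sentinel-secrets"],
    "quick": ["ruff"],
    "secrets": ["sentinel-secrets"],
}


@dataclass
class ScanState:
    target: str
    plan_type: str = "full"
    tools: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    passed: bool = False


def plan(state):
    """Plan: pick the tools for the requested strategy."""
    state.tools = list(PLAN_TOOLS[state.plan_type])
    return state


def execute(state, runners):
    """Execute: call each selected tool and collect its findings."""
    for name in state.tools:
        state.findings.extend(runners[name](state.target))
    return state


def verify(state, max_high=0):
    """Verify: apply a single severity threshold (a stand-in for the policy engine)."""
    high = sum(1 for f in state.findings if f["severity"] == "high")
    state.passed = high <= max_high
    return state


def report(state):
    """Report: return a structured summary of the run."""
    return {
        "target": state.target,
        "plan": state.plan_type,
        "passed": state.passed,
        "findings": state.findings,
    }


def run_scan(target, plan_type, runners):
    state = ScanState(target=target, plan_type=plan_type)
    return report(verify(execute(plan(state), runners)))
```

Passing the runners in as a dictionary keeps each stage testable in isolation, which is one way to get the deterministic behaviour the README describes.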

## Features

### Deployable AI Agent Service (NEW v0.2)
- **REST API** — FastAPI server with POST /scan, GET /scans, GET /scans/{id}, DELETE /scans/{id}
- **Persistent Storage** — SQLite database for scan history, findings, and metrics
- **Live Dashboard** — Built-in HTML dashboard showing scan stats, recent results, and API reference
- **Stats & Monitoring** — /stats endpoint with pass/fail rates, /health endpoint for orchestration
- **Docker Compose** — API + worker deployment with health checks and persistent volume
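
As a rough illustration of how scan history and the pass/fail numbers behind `/stats` could be kept in SQLite, the sketch below uses only the standard library. The table layout and the helper names (`open_db`, `save_scan`, `delete_scan`, `stats`) are assumptions for the example; the README does not document the real schema.

```python
import sqlite3

# Hypothetical schema: one row per scan, one row per finding.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scans (
    id          TEXT PRIMARY KEY,
    target_path TEXT NOT NULL,
    plan_type   TEXT NOT NULL,
    passed      INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS findings (
    scan_id  TEXT NOT NULL REFERENCES scans(id) ON DELETE CASCADE,
    severity TEXT NOT NULL,
    cwe      TEXT,
    message  TEXT NOT NULL
);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE
    conn.executescript(SCHEMA)
    return conn


def save_scan(conn, scan_id, target_path, plan_type, passed, findings):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO scans VALUES (?, ?, ?, ?)",
            (scan_id, target_path, plan_type, int(passed)),
        )
        conn.executemany(
            "INSERT INTO findings VALUES (?, ?, ?, ?)",
            [(scan_id, f["severity"], f.get("cwe"), f["message"]) for f in findings],
        )


def delete_scan(conn, scan_id):
    with conn:
        conn.execute("DELETE FROM scans WHERE id = ?", (scan_id,))


def stats(conn):
    total, passed = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(passed), 0) FROM scans"
    ).fetchone()
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "pass_rate": passed / total if total else 0.0,
    }
```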

### Multi-Tool Analysis (v0.1)
- **bandit** — Python SAST (SQL injection, command injection, hardcoded credentials, unsafe deserialization, weak crypto)
- **ruff** — Fast Python linter with security rule support
- **sentinel-secrets** — Built-in regex scanner for API keys, passwords, tokens, AWS credentials, GitHub PATs, private keys
- **safety** — Dependency vulnerability scanner (CVE database)
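
A regex-based secret scanner of the kind described for `sentinel-secrets` might look roughly like this. The patterns below are widely known public token formats chosen for illustration; they are not taken from Sentinel's rule set, and `PATTERNS` and `scan_text` are hypothetical names.

```python
import re

# Illustrative patterns only; a real scanner would carry many more rules.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private-key": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "hardcoded-password": re.compile(r"""(?i)\bpassword\s*=\s*["'][^"']+["']"""),
}


def scan_text(text, filename="<memory>"):
    """Return one finding per (line, rule) match, with 1-based line numbers."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append({"rule": rule, "file": filename, "line": lineno})
    return findings
```

Note that the password rule only fires on a quoted literal, so a lookup such as `password = os.environ["DB_PASSWORD"]` is not reported.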

### Configurable Policy Engine
- Global severity thresholds (max_critical, max_high, max_medium, max_low)
- Per-category thresholds with blocking rules
- Allowlist: suppress specific finding IDs, CWE IDs, or file patterns
- CI-friendly exit codes (0 = pass, 1 = policy failure)
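
The policy checks listed above can be sketched as a small pure function. The `max_*` keys mirror the threshold names in the README; everything else here (the `blocking_categories` and `allowlist` layout, the numeric limits, and the finding fields) is an assumption for the sake of a runnable example and may not match `config/policy.yaml`.

```python
from fnmatch import fnmatch

SEVERITIES = ("critical", "high", "medium", "low")

# Hypothetical policy; the limits are placeholders, not Sentinel's defaults.
EXAMPLE_POLICY = {
    "max_critical": 0,
    "max_high": 0,
    "max_medium": 5,
    "max_low": 20,
    "blocking_categories": ["hardcoded-secrets"],
    "allowlist": {
        "finding_ids": [],
        "cwe_ids": [],
        "file_patterns": ["tests/*"],
    },
}


def is_allowlisted(finding, allowlist):
    """A finding is suppressed if its ID, CWE, or file matches the allowlist."""
    return (
        finding["id"] in allowlist["finding_ids"]
        or finding.get("cwe") in allowlist["cwe_ids"]
        or any(fnmatch(finding["file"], pat) for pat in allowlist["file_patterns"])
    )


def evaluate(findings, policy):
    """Return (exit_code, violations): 0 means pass, 1 means policy failure."""
    kept = [f for f in findings if not is_allowlisted(f, policy["allowlist"])]
    violations = []
    for severity in SEVERITIES:
        count = sum(1 for f in kept if f["severity"] == severity)
        limit = policy[f"max_{severity}"]
        if count > limit:
            violations.append(f"{severity}: {count} found, {limit} allowed")
    for f in kept:
        if f["category"] in policy["blocking_categories"]:
            violations.append(f"blocking category: {f['category']} ({f['id']})")
    return (1 if violations else 0), violations
```

In a CI job the returned code would be passed to `sys.exit`, giving the 0/1 behaviour described in the last bullet.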

### Evaluation Framework
- 10 benchmark tasks across 5 vulnerability categories
- Clean code tests (zero false positives on secure patterns)
- Mixed code tests (discrimination between safe and vulnerable code)
- Plan type comparisons (full vs security vs quick)
- Deterministic simulated backend for CI reproducibility
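
One way to picture the benchmark tasks is as a fixture path plus the set of categories the scan is expected to report. The sketch below is an assumed shape, not the code in `evals/`: `BenchmarkTask`, `check`, and `run_benchmark` are hypothetical, and the `scan` callable is injected so the example stays deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkTask:
    name: str
    target: str
    expected: frozenset        # categories that must be reported
    allow_extra: bool = True   # False for clean-code tasks: no extra findings allowed


def check(task, found):
    found = frozenset(found)
    if task.allow_extra:
        return task.expected <= found   # every expected category was detected
    return task.expected == found       # exact match, so zero false positives


def run_benchmark(tasks, scan):
    """Run each task through `scan(target) -> iterable of categories`."""
    results = {task.name: check(task, scan(task.target)) for task in tasks}
    passed = sum(results.values())
    return {
        "results": results,
        "pass_rate": passed / len(tasks) if tasks else 0.0,
    }
```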

### Production-Ready
- Docker + Docker Compose with health checks
- Structured logging (JSON)
- Typed state models (Pydantic)
- 62 pytest tests (43 unit, 19 API integration)
- Typer CLI with rich terminal output
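
For the structured-logging bullet, a JSON formatter built on the standard `logging` module is one common approach. This is a sketch under assumptions: the field names and the `fields` key passed through `extra` are invented for the example, and the project may log differently.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Extra structured fields arrive via logger.info(..., extra={"fields": {...}}).
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def make_logger(stream, name="sentinel"):
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: log one event into an in-memory buffer and read it back.
buffer = io.StringIO()
log = make_logger(buffer)
log.info("scan finished", extra={"fields": {"scan_id": "abc12345", "findings": 3}})
record = json.loads(buffer.getvalue())
```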

## Vulnerabilities Detected

| Category | CWE | Detection Method |
|----------|-----|-----------------|
| SQL Injection | CWE-89 | bandit (B608), ruff (S608) |
| Hardcoded Secrets | CWE-798 | bandit (B105-107), sentinel-secrets |
| Command Injection | CWE-78 | bandit (B602-604) |
| Path Traversal | CWE-22 | bandit |
| Unsafe Deserialization | CWE-502 | bandit (B301-302) |
| Weak Cryptography | CWE-327 | bandit (B303, B324) |
| Dangerous Functions | — | bandit (B102, B307, B601) |
| XSS | CWE-79 | ruff |
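
To make the first row concrete, here is a small self-contained example of the string-built SQL that checks such as B608 target, next to the parameterized form. The table, function names, and payload are invented for the illustration and are not taken from the project's fixtures.

```python
import sqlite3


def find_user_vulnerable(conn, name):
    # String-built SQL (CWE-89): user input becomes part of the statement.
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # Parameterized query: the driver passes the value separately from the SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"
leaked = find_user_vulnerable(conn, payload)  # the OR clause matches every row
safe = find_user_safe(conn, payload)          # no user has this literal name
```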

## Project Structure

```
sentinel/
├── src/sentinel/
│   ├── state.py            # Typed state models (Pydantic)
│   ├── tools/
│   │   └── registry.py     # Tool registry + execution + output parsers
│   ├── policy/
│   │   └── engine.py       # Configurable policy engine
│   ├── agent/
│   │   └── runner.py       # LangGraph agent loop (plan→execute→verify→report)
│   ├── storage/
│   │   └── __init__.py     # SQLite persistence for scan history
│   ├── api/
│   │   └── __init__.py     # FastAPI REST server + dashboard
│   ├── evals/
│   │   └── __init__.py     # 10-task benchmark suite
│   └── cli/
│       └── main.py         # Typer CLI (scan, serve, benchmark, eval, tools, policy)
├── fixtures/
│   ├── vulnerable/         # Known-vulnerable code (5 categories, 5 files)
│   ├── clean/              # Secure code patterns (2 files)
│   └── mixed/              # Mixed code (1 file with both safe + vulnerable)
├── config/
│   └── policy.yaml         # Default policy configuration
├── tests/
│   ├── test_sentinel.py    # 43-unit pytest suite (agent, tools, policy, benchmarks)
│   └── test_api.py         # 19 API integration tests (REST, persistence, dashboard)
├── Dockerfile
├── docker-compose.yml      # API + worker deployment with health checks
└── README.md
```

## CLI Reference

```bash
# Scan with different plans
sentinel scan <target>                          # Full analysis (default)
sentinel scan <target> --plan security          # Security-focused
sentinel scan <target> --plan quick             # Ruff-only lint
sentinel scan <target> --plan secrets           # Secret scan only

# With options
sentinel scan <target> --policy custom.yaml     # Custom policy
sentinel scan <target> --output report.json     # JSON output
sentinel scan <target> --verbose                # Detailed findings

# REST API server (NEW v0.2)
sentinel serve                                  # Start API on http://0.0.0.0:8000
sentinel serve --port 3000                      # Custom port
sentinel serve --reload                         # Dev mode with auto-reload
sentinel serve --db /path/to/scans.db           # Custom DB path

# API endpoints (when serve is running)
# POST /scan          — Trigger a scan
# GET  /scans         — List recent scans
# GET  /scans/{id}    — Get scan details + findings
# DELETE /scans/{id}  — Delete a scan
# GET  /health        — Health check
# GET  /stats         — Pass/fail statistics
# GET  /              — HTML dashboard
# GET  /docs          — OpenAPI documentation

# Evaluation
sentinel benchmark                              # Run 10-task benchmark suite
sentinel benchmark --output results.json        # With JSON output
sentinel eval                           # CI-friendly eval (exit 1 on fail)
sentinel eval  --output ci.json         # With JSON output

# Utilities
sentinel tools                                  # List available tools
sentinel policy                                 # Show default policy
```

## Docker

```bash
# Start API server (recommended)
docker compose up -d                            # API on http://localhost:8000
docker compose --profile worker up -d           # API + background worker

# CLI mode
docker compose run sentinel scan /app/fixtures  # Custom scan
docker build -t sentinel .                      # Build only
```

### API Usage Examples

```bash
# Trigger a scan
curl -X POST http://localhost:8000/scan \
  -H "Content-Type: application/json" \
  -d '{"target_path": "/app/fixtures/vulnerable", "plan_type": "security"}'

# Get scan history
curl "http://localhost:8000/scans?limit=5"

# Get scan details
curl http://localhost:8000/scans/abc12345

# Health check
curl http://localhost:8000/health

# Stats
curl http://localhost:8000/stats
```
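
The same calls can be made from Python with the standard library. The endpoint paths and JSON fields come from the curl examples above; the helper names are hypothetical, and the assumption that responses are JSON is not confirmed by the README.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"


def build_scan_request(target_path, plan_type="security", base_url=BASE_URL):
    """Build the POST /scan request without sending it."""
    body = json.dumps({"target_path": target_path, "plan_type": plan_type})
    return Request(
        f"{base_url}/scan",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_history_url(limit=5, base_url=BASE_URL):
    """Build the GET /scans URL with an encoded query string."""
    return f"{base_url}/scans?{urlencode({'limit': limit})}"


def trigger_scan(target_path, plan_type="security"):
    """Send the request; assumes the server is running and replies with JSON."""
    with urlopen(build_scan_request(target_path, plan_type)) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the client easy to check without a running server.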

## Tradeoffs

| Decision | Rationale |
|----------|-----------|
| Simulated planner, not LLM | CI-reproducible; no API keys needed; deterministic results |
| Subprocess tool execution | Uses real bandit/ruff binaries for authentic results |
| bandit exit code 1 treated as success | bandit exits with code 1 when it finds issues, so Sentinel treats 1 as a completed scan rather than a tool failure |
| No real LLM integration | Keep project self-contained; LLM backend via env vars is a planned extension |
| fixtures/ not in skip paths | Test fixtures must stay scannable for the benchmarks; real projects should exclude `fixtures/` in their own policy |
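
The subprocess and exit-code rows can be illustrated with a short wrapper around bandit's JSON output. This is a sketch, not the code in `tools/registry.py`: `run_bandit` and `parse_bandit_json` are hypothetical names, and the finding fields are chosen for the example.

```python
import json
import subprocess


def parse_bandit_json(stdout):
    """Convert bandit's JSON report into a list of plain finding dicts."""
    data = json.loads(stdout)
    findings = []
    for item in data.get("results", []):
        cwe = item.get("issue_cwe") or {}  # absent in older bandit releases
        findings.append({
            "rule": item["test_id"],
            "severity": item["issue_severity"].lower(),
            "file": item["filename"],
            "line": item["line_number"],
            "message": item["issue_text"],
            "cwe": f"CWE-{cwe['id']}" if "id" in cwe else None,
        })
    return findings


def run_bandit(target):
    """Run bandit recursively; exit code 1 means issues were found, not a crash."""
    proc = subprocess.run(
        ["bandit", "-r", target, "-f", "json", "-q"],
        capture_output=True,
        text=True,
        check=False,  # do not raise on exit code 1
    )
    if proc.returncode not in (0, 1):
        raise RuntimeError(f"bandit failed ({proc.returncode}): {proc.stderr.strip()}")
    return parse_bandit_json(proc.stdout)
```

Using `check=False` and testing the return code by hand is what lets a run with findings count as a successful tool execution.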

## Quality Bar

- ✅ Non-trivial architecture (6 modules, typed state, tool registry, policy engine, REST API, persistence)
- ✅ 62 pytest tests (all passing): 43 unit + 19 API integration
- ✅ 10 benchmark tasks (100% pass rate)
- ✅ Realistic vulnerable code fixtures (5 categories, 30+ individual vulnerabilities)
- ✅ Configurable policy engine with severity thresholds, blocking categories, allowlists
- ✅ Real tool execution (bandit, ruff, custom secret scanner)
- ✅ Deployable REST API with FastAPI + SQLite persistence + HTML dashboard
- ✅ Docker Compose for API + worker deployment with health checks
- ✅ Reproducible setup (`pip install -e ".[dev]" && pytest && sentinel benchmark`)
- ✅ Polished README with architecture diagram, API docs, and CLI reference

## Roadmap

- [x] v0.1 — Core agent loop, tool registry, policy engine, benchmarks, tests
- [x] v0.2 — REST API server, SQLite persistence, dashboard, API deployment (Docker Compose)
- [ ] v0.3 — LLM-driven planner (opt-in via env vars)
- [ ] v0.4 — CI/CD integration (GitHub Actions, SARIF output for code scanning)
- [ ] v0.5 — GitHub App integration (webhook receiver, PR review comments)
- [ ] v1.0 — AgentOps deep integration (trace/eval/metrics) & K8s deployment manifests

## License

MIT