# MCATester: AI-Powered OSINT & Vulnerability Discovery Platform
> Built during a security research internship at the National e-Governance Division (NeGD), MeitY, New Delhi.
MCATester is a full-stack OSINT and vulnerability discovery platform that turns passive reconnaissance into **confirmed, zero-false-positive security findings**, with an AI decision layer that makes the scanner adaptive rather than just automated.
---
## The core problem it solves
Most scanners produce noise. Running gobuster + nikto + sqlmap on a real target produces hundreds of raw results that take hours of manual filtering. MCATester reports only confirmed findings: a SQLi finding means the database actually executed a sleep command, and an XSS finding means the payload was reflected unescaped in the HTML response.
**On mca.gov.in (before vs after noise reduction):**
```
First version:   68 findings, 61 false positives (all 403 responses)
Current version: 11 findings, 0 false positives
```
The key insight: 403 responses are ambiguous. A WAF returning 403 on `/admin` doesn't mean an admin panel exists. Content confirmation (checking what the 403 response body actually contains) eliminates this entire class of false positive.
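A minimal sketch of the content-confirmation idea, not the project's actual `content_confirmation.py`. The function name `body_confirms` and both pattern lists are assumptions made for illustration; the real tool ships 35 signatures that are not listed in this README.

```python
import re

# Hypothetical block-page signatures (illustrative only).
BLOCK_PAGE_PATTERNS = [
    re.compile(r"access denied", re.I),
    re.compile(r"request (was )?blocked", re.I),
    re.compile(r"web application firewall", re.I),
]


def body_confirms(body: str, evidence: list[re.Pattern]) -> bool:
    """Return True only when the response body itself proves the resource exists."""
    # A generic deny page says nothing about whether the path is real.
    if any(p.search(body) for p in BLOCK_PAGE_PATTERNS):
        return False
    # Otherwise require positive evidence specific to the thing being reported.
    return any(p.search(body) for p in evidence)


swagger_evidence = [re.compile(r'"swagger"\s*:', re.I)]
print(body_confirms("<h1>403 Forbidden</h1><p>Access Denied</p>", swagger_evidence))
print(body_confirms('{"swagger": "2.0", "paths": {}}', swagger_evidence))
```

The design choice here is that a status code alone never produces a finding; only body content does.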
---
## Real findings: Ministry of Corporate Affairs, India
Discovered during authorized research on `mca.gov.in`:
```
CRITICAL  CVE-2023-27997  CVSS 9.8
          vpnv3.mca.gov.in:4111: Fortinet SSL VPN pre-auth heap overflow
          Unauthenticated remote code execution, no credentials required

CRITICAL  CVE-2022-40684  CVSS 9.8
          Fortinet authentication bypass: full admin access without credentials
          Affected: FortiOS 7.0.0-7.0.6, 7.2.0-7.2.1

CRITICAL  CVE-2018-13379  CVSS 9.1
          Fortinet path traversal: VPN session credentials readable
          via /remote/fgt_lang without authentication

HIGH      Unauthenticated File-Serving API
          pminternship.mca.gov.in/mca-api/files/get-file-by-path
          No auth required to request arbitrary file paths

HIGH      CVE-2023-24486  CVSS 8.8
          GroupWise WebAccess XSS + session hijack
          mail.mca.gov.in: active groupware installation
```
Responsibly disclosed to CERT-In (`incident@cert-in.org.in`) with a full PDF report.
---
## Confirmed findings on demo.testfire.net (deliberately vulnerable lab)
```
CRITICAL  SQL Injection: Time-based blind (PostgreSQL confirmed)
          URL     : http://demo.testfire.net/search.jsp
          Payload : '; SELECT pg_sleep(3)--
          Evidence: 3.8s response vs 0.6s baseline

CRITICAL  Swagger/OpenAPI UI exposed publicly
          URL     : http://demo.testfire.net/swagger/properties.json
          Email leaked: jsmtih@altoromutual.com

HIGH      Reflected XSS
          URL     : http://demo.testfire.net/search.jsp
          Payload : reflected unescaped in HTML response

[AI-Agent] Risk: CRITICAL (score: 9.5/10)
[AI-Agent] → Remove public Swagger access
[AI-Agent] → Patch CVE-2025-24813 (Tomcat partial PUT RCE, CVSS 9.8)
[AI-Agent] → Fix SQLi with parameterized queries
```
---
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                  MCATester: 16-Stage Pipeline                  │
│                                                                │
│  Stage 1   DNS + Whois + Subdomain Enum (crt.sh/VT/HT)         │
│  Stage 2   Recursive Asset Discovery (parallel, 20 threads)    │
│  Stage 3   Subdomain Takeover Detection (20 services)          │
│  Stage 4   Threat Intel (urlscan / AbuseIPDB / OTX)            │
│  Stage 5   Tech Stack Detection (WhatWeb + headers)            │
│  Stage 6   AI Context Injector: Gemini generates dork queries  │
│  Stage 7   Google Dorking (20+ categories, DDG + Serper)       │
│  Stage 8   Fetch + Content Confirmation (35 patterns)          │
│  Stage 9   Active Probing + WAF Detection                      │
│  Stage 10  Attack Chain Orchestrator                           │
│  Stage 11  Header Security Analysis                            │
│  Stage 12  Payload Injection (SQLi / XSS / Traversal)          │
│  Stage 13  CVE Correlation + NVD Enrichment                    │
│  Stage 14  AI Decision Engine (Groq): 5 decision points        │
│  Stage 15  Gemini Report Generation                            │
│  Stage 16  PDF Export + Webhook Alerts                         │
│                                                                │
│  FastAPI backend + SQLite + Real-time dashboard                │
└────────────────────────────────────────────────────────────────┘
```
The AI Decision Engine (Stage 14) does more than write reports. It makes decisions at 5 points: target triage, URL prioritization, injection targeting, CVE exploitability assessment, and final risk ranking.
---
## How MCATester compares
| Capability | MCATester | Nikto | gobuster | Burp Suite Free |
|---|:---:|:---:|:---:|:---:|
| Zero false positives | โ | โ | โ | Manual |
| CVE correlation | โ | Partial | โ | โ |
| SQLi/XSS confirmation | โ | โ | โ | Manual |
| Subdomain takeover | โ | โ | โ | โ |
| AI risk scoring | โ | โ | โ | โ |
| Attack chain orchestration | โ | โ | โ | Manual |
| Real-time dashboard | โ | โ | โ | โ |
| PDF report | โ | โ | โ | Pro only |
| Drift detection | โ | โ | โ | โ |
| Webhook alerts | โ | โ | โ | โ |
---
## Features
### Passive Recon
- DNS (A/MX/NS/TXT/SOA/Reverse), Whois
- Subdomain enumeration: VirusTotal, crt.sh, HackerTarget, sublist3r (45+ subdomains found on real targets)
- GitHub recon: repositories referencing the target
- Threat intelligence: urlscan.io, AbuseIPDB, OTX AlienVault
- Email discovery + Holehe registration checking (400+ sites)
- IP intelligence: ASN, ISP, geolocation
### Active Discovery
- Parallel recursive asset scanning (28 assets in 3 min)
- WhatWeb tech stack fingerprinting
- 66+ path probes with WAF detection
- AI-targeted probing: Gemini generates paths specific to the detected tech stack
- Subdomain takeover detection (20 services: GitHub Pages, Heroku, Netlify, Vercel, AWS S3, Azure, Shopify, HubSpot, Zendesk...)
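Takeover detection of this kind is commonly done by pairing a dangling CNAME target with a known "unclaimed" error string in the response body. The sketch below illustrates that pairing on already-fetched data. The table name, the function name, and the three entries are assumptions for illustration; the project's real list of 20 services is not reproduced here.

```python
from typing import Optional

# Illustrative subset: (CNAME suffix, body marker shown when the resource is unclaimed).
TAKEOVER_FINGERPRINTS = {
    "GitHub Pages": (".github.io", "There isn't a GitHub Pages site here."),
    "Heroku": (".herokuapp.com", "No such app"),
    "AWS S3": (".s3.amazonaws.com", "NoSuchBucket"),
}


def detect_takeover(cname: str, body: str) -> Optional[str]:
    """Return the service name if both the CNAME and the body marker match."""
    cname = cname.rstrip(".").lower()  # DNS answers often end with a trailing dot
    for service, (suffix, marker) in TAKEOVER_FINGERPRINTS.items():
        if cname.endswith(suffix) and marker in body:
            return service
    return None


print(detect_takeover("org.github.io.", "<p>There isn't a GitHub Pages site here.</p>"))
print(detect_takeover("org.github.io", "<h1>Welcome</h1>"))
```

Requiring both signals keeps a live site that merely uses one of these hosts from being flagged.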
### Vulnerability Confirmation
- **SQL Injection**: error-based + time-based blind, auto-detects MySQL/MSSQL/PostgreSQL
- **Reflected XSS**: safe payload reflection detection
- **Path Traversal**: file API parameter testing with content confirmation
- **WAF pre-check**: if the WAF blocks all pages, skip injection (saves ~3 min on hardened targets)
- **Content Confirmation**: 35 signatures, kills 403 false positives
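For the time-based blind case, the confirmation boils down to comparing the injected request's latency against a baseline. The sketch below shows one way to make that decision. The function name and the 0.8 tolerance factor are assumptions, not values taken from the project.

```python
from statistics import median


def is_time_based_hit(baseline_samples, injected_seconds, sleep_seconds, tolerance=0.8):
    """Decide whether an injected sleep actually delayed the response.

    baseline_samples: response times (seconds) of normal requests
    injected_seconds: response time of the request carrying the sleep payload
    sleep_seconds:    the delay the payload asked the database for
    tolerance:        fraction of the requested delay that must be observed (assumed)
    """
    baseline = median(baseline_samples)  # median resists a single slow outlier
    observed_delay = injected_seconds - baseline
    return observed_delay >= sleep_seconds * tolerance


# Figures from the demo.testfire.net finding: 3.8s response vs 0.6s baseline, pg_sleep(3)
print(is_time_based_hit([0.6, 0.5, 0.7], injected_seconds=3.8, sleep_seconds=3))
```

In practice a scanner would likely repeat the injected request before reporting, since a single slow response can be network jitter.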
### Attack Chain Orchestrator
When a finding is confirmed, the orchestrator automatically fires follow-up probes:
- Swagger found → probe 12 API endpoints
- VPN login found → probe Fortinet-specific paths
- File API found → test 10 traversal payloads (deduplicated by base endpoint)
- Webmail found → probe 7 credential paths
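The rules above can be read as a lookup table from finding type to follow-up probes, with deduplication on the base endpoint. The sketch below assumes that reading. `FOLLOW_UP_PROBES`, `plan_follow_ups`, and every probe string are hypothetical placeholders, not the contents of `orchestrator.py`.

```python
from urllib.parse import urlsplit

# Hypothetical rule table; the probe names are placeholders, not real paths or payloads.
FOLLOW_UP_PROBES = {
    "swagger": ["api-endpoint-probe-1", "api-endpoint-probe-2"],
    "vpn_login": ["vendor-path-probe-1"],
    "file_api": ["traversal-probe-1", "traversal-probe-2", "traversal-probe-3"],
    "webmail": ["credential-path-probe-1"],
}


def plan_follow_ups(findings):
    """findings: iterable of (kind, url). Returns a list of (base_url, probe) pairs."""
    seen = set()
    plan = []
    for kind, url in findings:
        parts = urlsplit(url)
        base = f"{parts.scheme}://{parts.netloc}{parts.path}"  # drop query string
        if (kind, base) in seen:
            continue  # same endpoint already scheduled; avoid duplicate probing
        seen.add((kind, base))
        for probe in FOLLOW_UP_PROBES.get(kind, []):
            plan.append((base, probe))
    return plan


findings = [
    ("file_api", "https://example.test/files/get?path=a.pdf"),
    ("file_api", "https://example.test/files/get?path=b.pdf"),
    ("swagger", "https://example.test/swagger"),
]
for base, probe in plan_follow_ups(findings):
    print(base, probe)
```

Deduplicating on the URL without its query string means two hits on the same file API trigger one round of follow-ups rather than two.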
### CVE Intelligence
- Static knowledge base: Fortinet, Lotus Domino, GroupWise, Tomcat, Apache, nginx, WordPress, PHP
- NVD API enrichment for confirmed CVEs
- Auto-matches detected tech stack to CVE database
- AI exploitability assessment with confidence levels
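Matching a detected version against a CVE's affected ranges can be sketched with plain tuple comparison. The version ranges below are the FortiOS ranges quoted earlier in this README for CVE-2022-40684. The `AFFECTED_RANGES` structure and the function names are assumptions, not the layout used by `cve_correlation.py`.

```python
def parse_version(text):
    """Turn '7.0.6' into (7, 0, 6) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Inclusive (low, high) ranges per CVE; structure is hypothetical.
AFFECTED_RANGES = {
    "CVE-2022-40684": [("7.0.0", "7.0.6"), ("7.2.0", "7.2.1")],
}


def is_affected(cve_id, detected_version):
    version = parse_version(detected_version)
    return any(
        parse_version(low) <= version <= parse_version(high)
        for low, high in AFFECTED_RANGES.get(cve_id, [])
    )


print(is_affected("CVE-2022-40684", "7.0.5"))
print(is_affected("CVE-2022-40684", "7.0.7"))
```

Comparing tuples avoids the string-comparison trap where "7.0.10" sorts before "7.0.6".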
### AI Decision Engine (Groq)
5 real decisions per scan, not just report formatting:
```
Decision 1: Target triage
  → Classifies as government/enterprise/SaaS
  → Identifies high-value subdomains to prioritize

Decision 2: URL ranking
  → Ranks 40+ discovered URLs by exploitation potential
  → VPN login page > generic content page

Decision 3: Injection targeting
  → Selects which pages are worth injection testing
  → Skips pages with no injectable parameters

Decision 4: CVE exploitability
  → Assesses if correlated CVEs are likely exploitable
  → Considers service accessibility + version ranges

Decision 5: Final risk assessment
  → Risk score (0-10)
  → Executive summary (2-3 sentences for management)
  → Technical summary (attack vectors for security team)
  → Specific immediate actions
```
### Dashboard & Reporting
- Real-time web dashboard: live scan status, severity donut, risk trend chart
- Attack Chains page: findings grouped by CVE Intelligence / Active Exploitation / Infrastructure
- Alerts page: all CRITICAL/HIGH findings across all scans, grouped by target with timestamps
- Drift detection: scan-over-scan comparison, flags new/resolved/changed findings
- PDF report: VAPT-style with findings, evidence, CVSS scores, remediation steps
- Webhooks: Slack, Discord, Telegram for HIGH+ findings
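Drift detection as described (new / resolved / changed) amounts to a set difference between two scans. The sketch below assumes each scan is a dict mapping a finding key to its severity. That representation and the name `diff_scans` are illustrative, and the internals of `delta_detection.py` may differ.

```python
def diff_scans(previous, current):
    """Compare two scans, each a dict of finding key -> severity."""
    prev_keys, curr_keys = set(previous), set(current)
    return {
        "new": sorted(curr_keys - prev_keys),
        "resolved": sorted(prev_keys - curr_keys),
        # Present in both scans but with a different severity
        "changed": sorted(k for k in prev_keys & curr_keys if previous[k] != current[k]),
    }


last_week = {"sqli:/search.jsp": "CRITICAL", "xss:/search.jsp": "HIGH", "headers:/": "LOW"}
today = {"sqli:/search.jsp": "CRITICAL", "headers:/": "MEDIUM", "swagger:/swagger": "CRITICAL"}
print(diff_scans(last_week, today))
```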
---
## Installation
**Requirements:** Python 3.10+, Linux or WSL2, nmap
```bash
git clone https://github.com/yourusername/MCATester.git
cd MCATester
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Optional but improves results significantly
pip install groq
sudo apt install nmap whatweb
```
### API Keys (`.env`)
```bash
cp .env.example .env
# Edit .env with your keys
```
| Key | Where to get | Cost |
|---|---|---|
| `GEMINI_API_KEY` | aistudio.google.com | Free (15 req/min) |
| `SERPER_API_KEY` | serper.dev | Free (2500/month) |
| `VIRUSTOTAL_API_KEY` | virustotal.com | Free (500/day) |
| `GROQ_API_KEY` | console.groq.com | Free (fast) |
| `SHODAN_API_KEY` | shodan.io | $49/year |
---
## Usage
### CLI
```bash
# Full scan: all 16 stages
python osint_agent.py mca.gov.in
# Passive only: no active probing or injection
python osint_agent.py mca.gov.in --passive
# Skip recursive discovery (faster: ~5 min vs ~10 min)
python osint_agent.py mca.gov.in --no-recursive
```
### Dashboard
```bash
python server.py
# Open http://localhost:8000
```
Enter domain → Start Scan → watch results populate in real time.
---
## Scan performance
```
mca.gov.in (45 subdomains, WAF protected):
  Total time      : ~10 minutes
  Findings        : 11 (zero false positives)
  False positives : 0 (was 61 in v1)

demo.testfire.net (no WAF, vulnerable):
  Total time      : ~12 minutes
  Findings        : 15 (confirmed SQLi + XSS + CVEs)

Time breakdown (approximate):
  Subdomain enum      : 2 min (crt.sh + VirusTotal sequential)
  Recursive discovery : 3 min (28 assets parallel)
  Dorking             : 2 min (DDG + Serper)
  Payload injection   : 0 min (WAF pre-check skips on mca.gov.in)
                        3 min (full testing on demo.testfire.net)
  CVE + AI decisions  : 1 min (5 Groq calls)
  Other stages        : 2 min
```
---
## Project structure
```
MCATester/
├── osint_agent.py           # Main pipeline: 16 stages, CLI entry
├── server.py                # FastAPI backend: scan management + API
├── orchestrator.py          # Attack chain engine
├── ai_decision_engine.py    # Groq LLM: 5 decision points per scan
├── ai_context_injector.py   # Gemini: targeted dork + path generation
├── cve_correlation.py       # CVE matching + NVD API enrichment
├── payload_injector.py      # SQLi/XSS/traversal with WAF pre-check
├── subdomain_takeover.py    # Dangling CNAME: 20 services
├── recursive_discovery.py   # Parallel subdomain + port scanner
├── delta_detection.py       # Scan-over-scan diff
├── content_confirmation.py  # 35-pattern false-positive eliminator
├── webhooks.py              # Slack/Discord/Telegram alerts
├── osint_features.py        # PDF report generator
├── osint_identity.py        # IP intel + Holehe
├── search.py                # DDG/Serper wrapper
├── static/
│   └── index.html           # Real-time dashboard SPA
├── requirements.txt
├── .env.example
└── README.md
```
---
## Responsible use
**Only test systems you own or have explicit written permission to test.**
Built-in safety measures:
- Warning banner on every CLI run
- `--passive` mode disables all active testing
- Injection payloads are read-only diagnostics: no write operations
- Rate limiting (1s between requests)
- WAF pre-check skips injection when target is hardened
- 403 responses never reported as findings
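The 1-second spacing between requests can be enforced with a small limiter that every outgoing request passes through. This is a generic sketch, not the project's code. The class name and the injectable `clock`/`sleep` parameters are assumptions added so the behaviour can be tested without real delays.

```python
import time


class RateLimiter:
    """Ensure at least `min_interval` seconds between consecutive calls to wait()."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


limiter = RateLimiter(min_interval=1.0)
limiter.wait()  # first call returns immediately
# ... send request here; the next limiter.wait() blocks until 1s has elapsed
```

Using `time.monotonic` rather than wall-clock time keeps the spacing correct even if the system clock is adjusted mid-scan.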
For disclosures in India, contact CERT-In: `incident@cert-in.org.in`
---
## Tech stack
| Layer | Technology |
|---|---|
| Pipeline | Python 3.12 |
| Backend API | FastAPI + SQLite |
| Frontend | Vanilla JS + CSS custom properties |
| AI decisions | Groq (llama-3.3-70b-versatile) |
| AI context | Google Gemini 2.5 Flash |
| PDF generation | ReportLab |
| Port scanning | Shodan InternetDB + nmap fallback |
| Tech detection | WhatWeb + header inference |
| Subdomain data | crt.sh + VirusTotal + HackerTarget |
---
## Roadmap
- [ ] Screenshot capture: Playwright screenshots of all discovered assets
- [ ] Scheduled scanning: 24h autonomous monitoring with drift alerts
- [ ] nuclei integration: template-based CVE confirmation
- [ ] Multi-target mode: scan an entire organization at once
- [ ] SARIF export: GitHub Security tab integration
---
## Author
**SANKARAYOUGI SRIVASTESWAR**, B.Tech Computer Science, VIT-AP University
Security research intern, National e-Governance Division (NeGD), MeitY, New Delhi
---
*For authorized security testing and research only. The author is not responsible for misuse.*