Cyber Net — AI-Powered Autonomous Penetration Testing Platform
Authorized testing environments only. Cyber Net is designed for controlled penetration testing in authorized lab and staging environments. All execution requires explicit operator approval.
Inspiration
Across internships and security research, two critical gaps continued to surface:
- Manual penetration testing doesn't scale—security teams struggle to continuously validate defenses across large, evolving network topologies with limited resources.
- Threat intelligence remains theoretical—MITRE ATT&CK techniques, CVE data, EPSS scores, and CISA KEV listings sit in dashboards but rarely translate into actionable, prioritized offensive operations.
Response: Cyber Net is an AI-driven offensive security orchestrator that ranks network vulnerabilities by exploitation likelihood (MITRE ATT&CK, EPSS, CVSS, KEV) and autonomously generates multi-step attack playbooks with executable shell commands, bridging the gap between threat intelligence and real-world penetration testing.
What it does
Cyber Net is an autonomous offensive security platform that visualizes your network topology and orchestrates AI-powered attacks:
Multi-Tier Intelligence Analysis: Three scan modes for different use cases:
- Standard Scan (Grok): Fast, comprehensive vulnerability discovery using X.AI's Grok model via OpenRouter for rapid reconnaissance
- Grounded Scan (Gemini + Search): Real-time threat intelligence from Google Search integration, ensuring recommendations reflect the latest CVEs and exploitation trends
- Deep Scan (Gemini 2.5 Pro): Extended thinking mode with 32k token budget for complex attack chain analysis and multi-step exploitation paths
Explainable Threat Prioritization: Each vulnerability displays MITRE ATT&CK ID, EPSS probability, CVSS base score, and CISA KEV badge with our transparent scoring formula:
Score = 0.55×EPSS + 0.20×EnvFit + 0.05×(CVSS/10) + 0.20×KEV_BUMP
Autonomous Execution Playbooks: Select any attack vector and our AI agents generate step-by-step exploitation sequences with ready-to-execute shell commands, from initial reconnaissance through privilege escalation to impact demonstration.
Real-Time Impact Visualization: Watch as simulated attacks progress through your network topology, compromise additional nodes, and demonstrate actual blast radius via graph-based reachability analysis. Devices cascade offline, connections fail, and the true scope of exploitation becomes immediately visible.
Comprehensive Network Reports: AI-generated security posture analysis identifies your largest attack surface, assigns an overall security score (1-5), and prioritizes the top 3 critical exploits across your entire infrastructure.
Blue Team Response Book: Every identified vulnerability automatically populates a mitigation tracking system with actionable defensive measures, assignee management, and implementation status workflows.
How we built it
AI Orchestration Engine
We architected a multi-model AI system that leverages the strengths of different foundation models:
- X.AI Grok (via OpenRouter): Powers Standard Scans with rapid, comprehensive vulnerability analysis and JSON-structured output for consistent parsing
- Google Gemini 2.5 Flash: Handles Grounded Scans with real-time web search integration, pulling the latest CVE data, EPSS scores, and exploitation trends directly from authoritative sources
- Google Gemini 2.5 Pro: Executes Deep Scans with extended thinking capabilities (32k token budget), enabling complex multi-step attack chain reasoning and creative vulnerability discovery
All models respond via enforced JSON schemas with strict type validation, ensuring every recommendation includes MITRE ATT&CK mappings, EPSS/CVSS/KEV data, and calculated threat scores.
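A minimal sketch of the kind of strict type validation described above, using only the Python standard library. The field names (`attack_id`, `cve`, `epss`, `cvss`, `kev`) are assumptions made for illustration; the project's actual schema is not shown in this write-up.

```python
import json

# Hypothetical required fields and their expected types; the real schema may differ.
REQUIRED_FIELDS = {
    "attack_id": str,  # MITRE ATT&CK technique ID
    "cve": str,
    "epss": float,     # probability in [0, 1]
    "cvss": float,     # base score in [0, 10]
    "kev": bool,       # listed in the CISA KEV catalog?
}


def validate_finding(raw: str) -> dict:
    """Parse one model response; reject it unless every field is present and well-typed."""
    finding = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(finding, dict):
        raise ValueError("expected a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in finding:
            raise ValueError(f"missing field: {name}")
        value = finding[name]
        # Accept ints where floats are expected, but never treat bools as numbers.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected) or (expected is not bool and isinstance(value, bool)):
            raise ValueError(f"wrong type for {name}: {type(value).__name__}")
        finding[name] = value
    if not 0.0 <= finding["epss"] <= 1.0:
        raise ValueError("epss out of range")
    if not 0.0 <= finding["cvss"] <= 10.0:
        raise ValueError("cvss out of range")
    return finding
```

Rejecting a response outright, rather than patching missing fields, keeps a malformed recommendation from reaching the scoring step.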
Custom Network Visualization Engine
Built from the ground up using SVG and React, our topology engine provides:
- Interactive pan/zoom controls with smooth animations and viewport management
- Real-time state synchronization across network devices (online/offline/compromised states)
- Graph-based blast radius calculation using breadth-first search algorithms to model cascading failures when critical infrastructure nodes are compromised
- Dynamic connection visualization with intermittent link detection and animated connection states
- Attack surface highlighting that automatically identifies and emphasizes high-risk devices
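The blast radius calculation mentioned in the list above can be sketched as a plain breadth-first search over an adjacency list. This is an illustrative sketch, not the project's code: the function name, the device names, and the choice to treat edges as directed dependencies are all assumptions.

```python
from collections import deque


def blast_radius(adjacency: dict[str, list[str]], compromised: str) -> list[str]:
    """Return every node reachable from the compromised node, in BFS order."""
    visited = {compromised}
    order = []
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical topology: device names are illustrative only.
topology = {
    "core-router": ["switch-a", "switch-b"],
    "switch-a": ["web-01", "db-01"],
    "switch-b": ["printer"],
    "web-01": ["db-01"],
}
print(blast_radius(topology, "switch-a"))  # ['web-01', 'db-01']
```

Because each node and edge is visited at most once, the cost is linear in the size of the graph, which suits recomputing the result while a simulation is running.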
Autonomous Execution Framework
Our AI agents generate executable shell commands tailored to specific vulnerabilities:
- Reconnaissance commands: nmap, curl, dig, custom protocol probes
- Exploitation sequences: Vulnerability-specific payloads, path traversal attacks, RCE chains
- Lateral movement: Network pivoting, credential harvesting, service exploitation
- Impact demonstration: Service disruption, data exfiltration simulation, persistence establishment
Each playbook is generated on-demand by Gemini AI, adapted to the exact device configuration, operating system, and vulnerability context. Commands are formatted in clean Markdown code blocks for operator review before execution.
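One way the operator review step could work is to pull the commands out of the generated Markdown and keep only those an operator explicitly approves. The sketch below is an assumption about that flow, not the project's implementation; `extract_commands`, `approved_commands`, and the sample playbook text are hypothetical, and nothing here executes a command.

```python
import re

TICKS = "`" * 3  # built at runtime so this sketch can itself sit inside a fenced block
FENCE = re.compile(TICKS + r"[a-zA-Z]*\n(.*?)" + TICKS, re.DOTALL)


def extract_commands(playbook_markdown: str) -> list[str]:
    """Collect non-empty, non-comment lines from every fenced code block."""
    commands = []
    for block in FENCE.findall(playbook_markdown):
        for line in block.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                commands.append(line)
    return commands


def approved_commands(commands: list[str], approve) -> list[str]:
    """Keep only the commands the operator approves; nothing is executed here."""
    return [cmd for cmd in commands if approve(cmd)]


# Hypothetical playbook text; the address is a private lab IP chosen for illustration.
sample = f"Step 1: enumerate services\n{TICKS}bash\n# recon\nnmap -sV 10.0.0.5\n{TICKS}\n"
print(extract_commands(sample))  # ['nmap -sV 10.0.0.5']
```

Passing `approve` in as a callback keeps the approval policy (interactive prompt, allow-list, ticket check) separate from the parsing.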
Threat Intelligence Integration
We integrate real-world threat data to ensure prioritization reflects actual exploitation likelihood:
- EPSS (Exploit Prediction Scoring System): Probability of exploitation in the next 30 days, sourced from FIRST.org
- CISA KEV (Known Exploited Vulnerabilities): Federal catalog of vulnerabilities exploited in the wild
- CVSS (Common Vulnerability Scoring System): Industry-standard severity ratings from NVD
- MITRE ATT&CK: Adversarial tactics and techniques taxonomy for consistent classification
Our Grounded Scan mode uses Google Search to pull the latest intelligence, ensuring recommendations stay current with emerging threats without manual database updates.
Scoring Algorithm
We developed an explainable, auditable scoring model that prioritizes likelihood over severity:
Formula: $$ \text{Score} = 0.55 \times \text{EPSS} + 0.20 \times \text{EnvFit} + 0.05 \times \left(\frac{\text{CVSS}}{10}\right) + 0.20 \times \text{KEV}_{\text{bump}} $$
Rationale:
- EPSS (55% weight): Likelihood of exploitation—the most critical factor for risk prioritization
- Environment Fit (20% weight): OS/service matching ensures relevance (e.g., Apache CVEs won't appear for PostgreSQL hosts)
- CVSS (5% weight): Severity as a light tiebreaker, not the primary driver
- KEV bump (+0.20): Actively exploited vulnerabilities get automatic priority boost
This approach aligns remediation efforts with real-world attacker behavior rather than theoretical severity scores.
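The formula can be written as a short function. The sketch below assumes that EPSS and EnvFit are both in the range 0 to 1, that CVSS is the 0 to 10 base score, and that the KEV bump is 1 for KEV-listed vulnerabilities and 0 otherwise; the write-up does not state these ranges explicitly, and the input values are illustrative rather than real scores.

```python
def threat_score(epss: float, env_fit: float, cvss: float, kev: bool) -> float:
    """Score = 0.55*EPSS + 0.20*EnvFit + 0.05*(CVSS/10) + 0.20*KEV_bump."""
    kev_bump = 1.0 if kev else 0.0  # assumption: the bump is a 0/1 indicator
    return 0.55 * epss + 0.20 * env_fit + 0.05 * (cvss / 10.0) + 0.20 * kev_bump


# Two hypothetical findings with the same CVSS base score but different likelihood.
likely = threat_score(epss=0.90, env_fit=1.0, cvss=9.8, kev=True)
unlikely = threat_score(epss=0.02, env_fit=1.0, cvss=9.8, kev=False)
print(round(likely, 3), round(unlikely, 3))  # about 0.944 and 0.26
```

Both findings carry the same severity, yet the first scores far higher (0.495 + 0.20 + 0.049 + 0.20 versus 0.011 + 0.20 + 0.049), which is the likelihood-over-severity behavior the rationale describes.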
Challenges we ran into
Multi-model orchestration complexity: Coordinating three models from two providers (X.AI Grok via OpenRouter, plus Google Gemini 2.5 Flash and Pro) with varying API structures, rate limits, and response formats required robust error handling and fallback logic.
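A minimal sketch of what such fallback logic might look like, assuming each provider is wrapped in a plain callable that takes a prompt and returns text. The function name and the stub providers are hypothetical; real clients would also need timeouts and rate-limit handling.

```python
from typing import Callable, Sequence


def scan_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each provider in order; return (name, response) from the first that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real API clients.
def flaky(prompt: str) -> str:
    raise TimeoutError("rate limited")


def steady(prompt: str) -> str:
    return '{"findings": []}'


print(scan_with_fallback("scan host", [("primary", flaky), ("backup", steady)]))
```

Collecting the per-provider errors makes the final failure message useful when every provider is down.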
JSON schema enforcement: Getting LLMs to consistently return valid, parseable JSON with all required fields (EPSS, CVSS, MITRE ATT&CK IDs) required extensive prompt engineering and schema validation.
Real-time state management at scale: Synchronizing live attack execution state (compromised devices, offline nodes, intermittent connections) across the network diagram while maintaining 60fps animations required careful React optimization and state batching.
Graph algorithm performance: Computing blast radius via BFS for networks with 100+ nodes and complex topologies while updating in real-time during attack simulations required efficient adjacency list representations.
Grounded search quality: Ensuring Google Search grounding returned authoritative CVE sources rather than generic security blog posts required refined prompting and result filtering.
Accomplishments we're proud of
Multi-model AI architecture that seamlessly coordinates X.AI Grok and Google Gemini models, selecting the optimal engine for each scan type (speed vs. depth vs. currency).
Explainable threat scoring where every recommendation shows its exact EPSS %, CVSS base, KEV badge, and ATT&CK ID with transparent scoring logic that security teams can audit and trust.
Autonomous playbook generation that creates ready-to-execute shell command sequences adapted to specific device configurations, vulnerabilities, and exploitation contexts.
Real-time blast radius visualization using graph algorithms to demonstrate cascading failures and network-wide impact as attacks progress through the topology.
Zero manual intelligence updates via Google Search grounding—Grounded Scans automatically pull the latest CVE data, EPSS scores, and exploitation trends without maintaining a vulnerability database.
What we learned
EPSS transforms prioritization. Weighting exploitation likelihood heavily (55%) ensures remediation efforts focus on vulnerabilities attackers actually target, not just high-CVSS items that may never be exploited.
Multi-model orchestration scales intelligence. Different AI models excel at different tasks—Grok for speed, Gemini Flash for currency, Gemini Pro for depth. Coordinating them provides better results than any single model.
MITRE ATT&CK enables shared language. Mapping every vulnerability to ATT&CK techniques creates a common taxonomy that both AI agents and human operators understand, making automation reliable and detections well-scoped.
Explainability builds trust. Transparent scoring formulas and visible threat intelligence data (EPSS %, KEV status, sources) allow security teams to confidently act on AI recommendations without "black box" uncertainty.
Graph algorithms model real impact. Breadth-first search for blast radius calculation accurately demonstrates how critical infrastructure failures cascade through network dependencies—making risk tangible.
What's next for Cyber Net
RAG-Enhanced Threat Intelligence Architecture
Implement a dual-generation verification system to ensure recommendation accuracy:
- Vector database storing the complete MITRE ATT&CK framework (14 tactics, 193+ techniques, 400+ sub-techniques) and NVD CVE corpus with embeddings for semantic search
- Creative generation phase: LLMs generate vulnerability hypotheses and attack chains unrestricted
- RAG verification phase: Retrieval-augmented generation validates every ATT&CK ID, CVE number, CVSS score, and technique description against authoritative embeddings
- Contradiction detection: Automated flagging when creative outputs diverge from ground truth, triggering regeneration with constraints
- Citation linking: Every recommendation includes direct links to MITRE technique pages and NVD CVE entries with confidence scores
This architecture combines creative AI reasoning for novel attack discovery with factual grounding to eliminate hallucinated vulnerabilities.
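The verification phase could begin with a cheap check before any retrieval: confirm that each identifier is well-formed and present in a trusted set. In the sketch below, the in-memory sets are a stand-in for the planned vector database, and the field names and `flag_unverified` helper are hypothetical.

```python
import re

ATTACK_ID = re.compile(r"T\d{4}(\.\d{3})?")  # technique, optionally with sub-technique
CVE_ID = re.compile(r"CVE-\d{4}-\d{4,}")


def flag_unverified(finding: dict, known_techniques: set[str], known_cves: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the IDs check out."""
    problems = []
    attack_id = finding.get("attack_id", "")
    cve = finding.get("cve", "")
    if not ATTACK_ID.fullmatch(attack_id):
        problems.append(f"malformed ATT&CK ID: {attack_id!r}")
    elif attack_id not in known_techniques:
        problems.append(f"unknown ATT&CK ID: {attack_id}")
    if not CVE_ID.fullmatch(cve):
        problems.append(f"malformed CVE ID: {cve!r}")
    elif cve not in known_cves:
        problems.append(f"unknown CVE ID: {cve}")
    return problems


# Tiny stand-in corpora; the CVE number is a placeholder, not a real entry.
techniques = {"T1190"}
cves = {"CVE-0000-0001"}
print(flag_unverified({"attack_id": "T9999", "cve": "CVE-0000-0001"}, techniques, cves))
```

A non-empty result would be the trigger for the regeneration step described above.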
Advanced Persistent Threat (APT) Emulation
Transform single-exploit execution into full-scale adversary campaigns:
- Multi-stage attack chains: Automatically sequence exploits across the kill chain—initial access → persistence → privilege escalation → lateral movement → collection → exfiltration
- APT behavior modeling: Simulate real threat actor TTPs (APT29, APT28, Lazarus Group) by chaining their documented techniques from MITRE ATT&CK
- Temporal pacing: Introduce realistic delays between attack stages to mimic low-and-slow campaigns that evade detection
- Objective-driven planning: Define campaign goals (e.g., "exfiltrate financial data") and let AI agents autonomously plan and execute multi-week operations
- Purple team orchestration: Run coordinated red team attacks while simultaneously tasking blue team agents to detect, respond, and remediate in real-time
Continuous Security Validation Platform
Evolve from point-in-time scans to always-on offensive testing:
- Scheduled autonomous campaigns: Daily/weekly attack simulations that validate security controls without human intervention
- Drift detection: Automatically identify when configuration changes introduce new vulnerabilities (e.g., firewall rule modifications, patch rollbacks)
- Regression testing: Ensure previously remediated vulnerabilities stay fixed and don't reappear after updates
- Compliance validation: Continuously verify PCI-DSS, NIST 800-53, ISO 27001 security control effectiveness through targeted attack simulations
- Remediation tracking: Closed-loop validation that confirms mitigations actually prevent exploitation before marking tickets complete
Collaborative Multi-Agent Swarms
Deploy coordinated AI agent teams for complex operations:
- Specialized agent roles: Reconnaissance specialists, exploitation experts, persistence engineers, data exfiltration agents, anti-forensics units
- Inter-agent communication: Shared knowledge base where agents exchange discovered credentials, network maps, and access paths
- Parallel execution: Multiple agents attack different network segments simultaneously, coordinating to achieve unified objectives
- Competitive modes: Red team agents compete against blue team defensive agents in real-time adversarial games to stress-test both offensive and defensive AI capabilities
Blue Team Intelligence Integration
Build defensive AI counterparts that learn from offensive operations:
- Automated detection rule generation: Convert successful attack chains into Sigma/YARA/Snort signatures and SIEM correlation rules
- Defensive playbook creation: Generate incident response runbooks based on observed attack paths and effective countermeasures
- Proactive countermeasures: Deploy honeypots, deception assets, and dynamic firewall rules in response to reconnaissance activity
- Alert fatigue reduction: Train ML models on historical attack data to tune detection thresholds and reduce false positives
Live Threat Intelligence Feeds
Replace static data with real-time authoritative sources:
- FIRST EPSS API integration: Pull daily exploitation probability updates directly from the official EPSS feed
- CISA KEV monitoring: Webhook subscriptions that alert when new known-exploited vulnerabilities are added to the catalog
- NVD CVE enrichment: Automatic vulnerability detail enrichment as new CVEs are published and scored
- Threat actor tracking: Integrate with commercial threat intelligence platforms (Recorded Future, Mandiant) to prioritize attacks matching active campaigns
- Zero-day detection: Flag anomalous behavior that doesn't match known CVEs, potentially indicating novel exploitation techniques
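For the EPSS item above, a first step could be building the query URL and parsing the response, without yet making a network call. The endpoint and the response shape (a `data` array whose rows carry `cve` and a string-valued `epss`) reflect our understanding of the public FIRST API and should be verified against its documentation; the helper names and CVE numbers are placeholders.

```python
import json
from urllib.parse import urlencode

# Assumed public FIRST endpoint; confirm against the official documentation before use.
EPSS_ENDPOINT = "https://api.first.org/data/v1/epss"


def build_epss_url(cve_ids: list[str]) -> str:
    """Build a query URL for one or more CVE IDs (comma-separated)."""
    return EPSS_ENDPOINT + "?" + urlencode({"cve": ",".join(cve_ids)}, safe=",")


def parse_epss(payload: str) -> dict[str, float]:
    """Map each CVE ID to its EPSS probability, converting from the string form."""
    body = json.loads(payload)
    return {row["cve"]: float(row["epss"]) for row in body.get("data", [])}


# Placeholder response with made-up values, shaped like the assumed API output.
sample = '{"status": "OK", "data": [{"cve": "CVE-0000-0001", "epss": "0.5"}]}'
print(build_epss_url(["CVE-0000-0001"]))
print(parse_epss(sample))  # {'CVE-0000-0001': 0.5}
```

Keeping URL construction and parsing separate from the HTTP call makes both easy to test offline.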
Extended Topology Discovery
Automate network mapping beyond manual configuration:
- Active scanning integration: Embedded nmap, Nessus, and Qualys agents for automatic device discovery and service fingerprinting
- Passive network monitoring: SPAN/TAP traffic analysis to build topology from observed communications without intrusive scanning
- SNMP/CDP/LLDP polling: Discover network infrastructure interconnections directly from link-layer discovery protocols and management planes
- Cloud API integration: Automatic topology generation from AWS/Azure/GCP asset inventories via cloud provider APIs
- Asset management sync: Bidirectional integration with ServiceNow, Jira, and CMDBs to keep topology and business context synchronized
Custom Attack Playbook Builder
Empower security teams to define organization-specific testing scenarios:
- Visual attack graph editor: Drag-and-drop interface for composing multi-step attack chains from ATT&CK techniques
- Parameterized templates: Reusable attack patterns with variables for target IPs, credentials, payload paths
- Industry-specific scenarios: Pre-built playbooks for finance (card data theft), healthcare (PHI exfiltration), critical infrastructure (OT/SCADA attacks)
- Compliance test libraries: One-click campaigns that validate specific regulatory requirements (PCI-DSS 11.3, NIST 800-53 CA-8)
- Team sharing: Central repository where security teams publish and exchange effective attack scenarios
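A parameterized template could be as simple as a string with named variables, validated before rendering so that a parameter cannot smuggle extra shell syntax into the command. This sketch uses Python's `string.Template`; the template text, the `render_recon` helper, and the validation rules are illustrative assumptions.

```python
import ipaddress
import re
from string import Template

# Hypothetical reusable pattern: service detection against a chosen port list.
RECON_TEMPLATE = Template("nmap -sV -p $ports $target_ip")
PORT_LIST = re.compile(r"\d{1,5}(,\d{1,5})*")


def render_recon(target_ip: str, ports: str) -> str:
    """Fill the template after checking that both parameters are well-formed."""
    ipaddress.ip_address(target_ip)  # raises ValueError for anything that is not an IP literal
    if not PORT_LIST.fullmatch(ports):
        raise ValueError(f"invalid port list: {ports!r}")
    return RECON_TEMPLATE.substitute(target_ip=target_ip, ports=ports)


# Private lab address chosen for illustration.
print(render_recon("10.0.0.5", "22,80"))  # nmap -sV -p 22,80 10.0.0.5
```

The rendered string is only text for operator review; validating inputs first is what keeps a shared template safe to reuse.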
Attack Replay and Forensics
Transform execution logs into learning opportunities:
- Step-by-step replay: Visualize past attacks frame-by-frame to understand decision paths and identify defensive gaps
- What-if analysis: Modify network configurations retroactively and replay attacks to test if mitigations would have worked
- Timeline reconstruction: Generate detailed attack timelines with timestamps, commands executed, data accessed, and lateral movement paths
- Evidence export: Package attack logs and artifacts in formats compatible with forensic tools (PCAP, Syslog, Windows Event Logs)
CI/CD Security Integration
Embed offensive testing in development pipelines:
- API-first architecture: RESTful endpoints for triggering scans, retrieving results, and managing campaigns programmatically
- GitHub Actions / GitLab CI runners: Pre-built pipeline steps that automatically test infrastructure-as-code (Terraform, CloudFormation) for exploitable misconfigurations
- Shift-left security: Validate security controls in staging environments before production deployment
- Quality gates: Block deployments that introduce high-EPSS vulnerabilities or expand attack surface beyond defined thresholds
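The quality gate idea can be sketched as a function a pipeline step might call with the scan results. The 0.5 threshold, the `epss` field name, and the `gate_deployment` helper are assumptions for illustration; a real gate would take its thresholds from team policy.

```python
def gate_deployment(findings: list[dict], epss_threshold: float = 0.5) -> tuple[bool, list[dict]]:
    """Return (allowed, blocking_findings); any finding at or above the threshold blocks."""
    blocking = [f for f in findings if f["epss"] >= epss_threshold]
    return (len(blocking) == 0, blocking)


# Hypothetical scan output with made-up probabilities.
results = [
    {"cve": "CVE-0000-0001", "epss": 0.03},
    {"cve": "CVE-0000-0002", "epss": 0.71},
]
allowed, blocking = gate_deployment(results)
print(allowed, [f["cve"] for f in blocking])  # False ['CVE-0000-0002']
```

A CI step could exit with a non-zero status when `allowed` is false, which is how most pipelines decide to stop a deployment.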
Adversary Behavior Analytics
Apply machine learning to attack pattern recognition:
- Attack graph mining: Identify common exploitation paths across historical campaigns to predict likely attacker next moves
- Vulnerability clustering: Group related CVEs and techniques to prioritize remediation efforts that eliminate entire attack classes
- Anomaly detection: Flag unusual network behavior that deviates from known attack patterns, potentially indicating zero-days or insider threats
- Predictive risk modeling: Forecast which devices are most likely to be targeted based on historical exploitation trends and network position
Technologies
AI & Machine Learning:
- X.AI Grok via OpenRouter — Fast vulnerability discovery
- Google Gemini 2.5 Flash — Grounded search intelligence
- Google Gemini 2.5 Pro — Extended thinking mode
- Structured JSON schema enforcement with type validation
Threat Intelligence:
- MITRE ATT&CK Framework — Adversarial technique taxonomy
- FIRST EPSS — Exploitation probability scoring
- CISA KEV Catalog — Known exploited vulnerabilities
- CVSS v3.1 — Common vulnerability scoring
- Real-time Google Search grounding
Frontend & Visualization:
- React 19 — UI framework
- TypeScript — Type-safe development
- Custom SVG engine — Interactive network topology
- Tailwind CSS via CDN — Styling system
- Vite — Build tooling and dev server
Infrastructure:
- Docker containerization
- Netlify/Vercel deployment support
- Node.js 20 runtime
Sources
Threat Intelligence & Frameworks:
- MITRE ATT&CK — https://attack.mitre.org/
- FIRST EPSS — https://www.first.org/epss/
- CISA KEV Catalog — https://www.cisa.gov/known-exploited-vulnerabilities-catalog
- CVSS Specification — https://www.first.org/cvss/v3-1/specification-document
- NVD — https://nvd.nist.gov/
AI & APIs:
- Google Gemini AI — https://ai.google.dev/
- OpenRouter — https://openrouter.ai/
- X.AI Grok — https://x.ai/
Build Tools:
- Vite — https://vitejs.dev/
- React — https://react.dev/
Built With
- ai
- api
- postgresql
- python
