
Cyber Net — AI-Powered Autonomous Penetration Testing Platform

Authorized testing environments only. Cyber Net is designed for controlled penetration testing in authorized lab and staging environments. All execution requires explicit operator approval.

Inspiration

Across internships and security research, two critical gaps continued to surface:

Manual penetration testing doesn't scale—security teams struggle to continuously validate defenses across large, evolving network topologies with limited resources.
Threat intelligence remains theoretical—MITRE ATT&CK techniques, CVE data, EPSS scores, and CISA KEV listings sit in dashboards but rarely translate into actionable, prioritized offensive operations.

Response: Cyber Net is an AI-driven offensive security orchestrator that ranks network vulnerabilities by exploitation likelihood (MITRE ATT&CK, EPSS, CVSS, KEV) and autonomously generates multi-step attack playbooks with executable shell commands, bridging the gap between threat intelligence and real-world penetration testing.

What it does

Cyber Net is an autonomous offensive security platform that visualizes your network topology and orchestrates AI-powered attacks:

Multi-Tier Intelligence Analysis: Three scan modes for different use cases:
- Standard Scan (Grok): Fast, comprehensive vulnerability discovery using xAI's Grok model via OpenRouter for rapid reconnaissance
- Grounded Scan (Gemini + Search): Real-time threat intelligence from Google Search integration, ensuring recommendations reflect the latest CVEs and exploitation trends
- Deep Scan (Gemini 2.5 Pro): Extended thinking mode with 32k token budget for complex attack chain analysis and multi-step exploitation paths
Explainable Threat Prioritization: Each vulnerability displays MITRE ATT&CK ID, EPSS probability, CVSS base score, and CISA KEV badge with our transparent scoring formula: Score = 0.55×EPSS + 0.20×EnvFit + 0.05×(CVSS/10) + 0.20×KEV_BUMP
Autonomous Execution Playbooks: Select any attack vector and our AI agents generate step-by-step exploitation sequences with ready-to-execute shell commands—from initial reconnaissance through privilege escalation to impact demonstration.
Real-Time Impact Visualization: Watch as simulated attacks progress through your network topology, compromise additional nodes, and show the resulting blast radius via graph-based reachability analysis. Devices cascade offline, connections fail, and the full scope of a successful exploit becomes immediately visible.
Comprehensive Network Reports: AI-generated security posture analysis identifies your largest attack surface, assigns an overall security score (1-5), and prioritizes the top 3 critical exploits across your entire infrastructure.
Blue Team Response Book: Every identified vulnerability automatically populates a mitigation tracking system with actionable defensive measures, assignee management, and implementation status workflows.

How we built it

AI Orchestration Engine

We architected a multi-model AI system that leverages the strengths of different foundation models:

xAI Grok (via OpenRouter): Powers Standard Scans with rapid, comprehensive vulnerability analysis and JSON-structured output for consistent parsing
Google Gemini 2.5 Flash: Handles Grounded Scans with real-time web search integration, pulling the latest CVE data, EPSS scores, and exploitation trends directly from authoritative sources
Google Gemini 2.5 Pro: Executes Deep Scans with extended thinking capabilities (32k token budget), enabling complex multi-step attack chain reasoning and creative vulnerability discovery

All models respond via enforced JSON schemas with strict type validation, ensuring every recommendation includes MITRE ATT&CK mappings, EPSS/CVSS/KEV data, and calculated threat scores.
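
As an illustration of this kind of schema enforcement, here is a minimal sketch of a validator for one recommendation. The field names (`title`, `attack_id`, `epss`, `cvss`, `kev`) are assumptions made for the example, not the project's actual schema.

```python
import json
import re

# Hypothetical field names and types; the project's real schema is not shown here.
REQUIRED_FIELDS = {
    "title": str,
    "attack_id": str,  # MITRE ATT&CK technique ID such as "T1190"
    "epss": float,     # exploitation probability in [0, 1]
    "cvss": float,     # base score in [0, 10]
    "kev": bool,       # True if listed in the CISA KEV catalog
}
ATTACK_ID = re.compile(r"T\d{4}(\.\d{3})?")


def validate_recommendation(raw: str) -> dict:
    """Parse one model response; raise ValueError if it violates the schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # Accept JSON integers where a float is expected (e.g. "cvss": 9).
        if expected is float and type(value) is int:
            value = data[name] = float(value)
        if type(value) is not expected:
            raise ValueError(f"{name} must be {expected.__name__}")

    if not ATTACK_ID.fullmatch(data["attack_id"]):
        raise ValueError("attack_id is not a valid ATT&CK technique ID")
    if not 0.0 <= data["epss"] <= 1.0:
        raise ValueError("epss must be between 0 and 1")
    if not 0.0 <= data["cvss"] <= 10.0:
        raise ValueError("cvss must be between 0 and 10")
    return data
```

A response that fails validation can be rejected and the model asked again, rather than letting a partial record reach the UI.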

Custom Network Visualization Engine

Built from the ground up using SVG and React, our topology engine provides:

Interactive pan/zoom controls with smooth animations and viewport management
Real-time state synchronization across network devices (online/offline/compromised states)
Graph-based blast radius calculation using breadth-first search algorithms to model cascading failures when critical infrastructure nodes are compromised
Dynamic connection visualization with intermittent link detection and animated connection states
Attack surface highlighting that automatically identifies and emphasizes high-risk devices
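
The blast radius calculation mentioned above can be sketched as a plain breadth-first search over an adjacency list. The topology and node names below are hypothetical, and the sketch assumes an edge from A to B means that B is affected when A is compromised.

```python
from collections import deque


def blast_radius(adjacency: dict[str, list[str]], compromised: str) -> set[str]:
    """Return every node reachable from `compromised`, including the node itself."""
    seen = {compromised}
    queue = deque([compromised])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical topology: an edge A -> B means B goes down with A.
topology = {
    "router": ["switch"],
    "switch": ["web", "db"],
    "web": [],
    "db": [],
    "printer": [],
}
affected = blast_radius(topology, "switch")
```

Each node and edge is visited at most once, so the cost is linear in the size of the graph.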

Autonomous Execution Framework

Our AI agents generate executable shell commands tailored to specific vulnerabilities:

Reconnaissance commands: nmap, curl, dig, custom protocol probes
Exploitation sequences: Vulnerability-specific payloads, path traversal attacks, RCE chains
Lateral movement: Network pivoting, credential harvesting, service exploitation
Impact demonstration: Service disruption, data exfiltration simulation, persistence establishment

Each playbook is generated on-demand by Gemini AI, adapted to the exact device configuration, operating system, and vulnerability context. Commands are formatted in clean Markdown code blocks for operator review before execution.
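
One way to turn such a Markdown playbook into a reviewable command list is sketched below. This is an assumption about how the review step could work, not the project's implementation; `extract_commands` and `approved_commands` are hypothetical helpers, and nothing here executes a command.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can live inside a fence
BLOCK_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def extract_commands(markdown: str) -> list[str]:
    """Collect non-empty, non-comment lines from every fenced code block."""
    commands = []
    for block in BLOCK_RE.findall(markdown):
        for line in block.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):
                commands.append(line)
    return commands


def approved_commands(markdown: str, approve) -> list[str]:
    """Keep only the commands the operator callback explicitly approves."""
    return [cmd for cmd in extract_commands(markdown) if approve(cmd)]


# Hypothetical playbook text; 192.0.2.10 is a documentation-only address.
playbook = (
    "Step 1: service discovery\n"
    + FENCE + "bash\n# fingerprint services\nnmap -sV 192.0.2.10\n" + FENCE + "\n"
)
```

Passing an `approve` callback that prompts the operator keeps a human decision between generation and execution.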

Threat Intelligence Integration

We integrate real-world threat data to ensure prioritization reflects actual exploitation likelihood:

EPSS (Exploit Prediction Scoring System): Probability of exploitation in the next 30 days, sourced from FIRST.org
CISA KEV (Known Exploited Vulnerabilities): Federal catalog of vulnerabilities exploited in the wild
CVSS (Common Vulnerability Scoring System): Industry-standard severity ratings from NVD
MITRE ATT&CK: Adversarial tactics and techniques taxonomy for consistent classification

Our Grounded Scan mode uses Google Search to pull the latest intelligence, ensuring recommendations stay current with emerging threats without manual database updates.

Scoring Algorithm

We developed an explainable, auditable scoring model that prioritizes likelihood over severity:

Formula: $$ \text{Score} = 0.55 \times \text{EPSS} + 0.20 \times \text{EnvFit} + 0.05 \times \left(\frac{\text{CVSS}}{10}\right) + 0.20 \times \text{KEV}_{\text{bump}} $$

Rationale:

EPSS (55% weight): Likelihood of exploitation—the most critical factor for risk prioritization
Environment Fit (20% weight): OS/service matching ensures relevance (e.g., Apache CVEs won't appear for PostgreSQL hosts)
CVSS (5% weight): Severity as a light tiebreaker, not the primary driver
KEV bump (+0.20): Actively exploited vulnerabilities get automatic priority boost

This approach aligns remediation efforts with real-world attacker behavior rather than theoretical severity scores.
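
The formula translates directly into code. The sketch below assumes that EPSS and EnvFit are values in [0, 1] and that the KEV bump is 1 for a KEV-listed vulnerability and 0 otherwise; the write-up does not state these ranges explicitly, and the input values are illustrative rather than real EPSS data.

```python
def threat_score(epss: float, env_fit: float, cvss: float, kev: bool) -> float:
    """Score = 0.55*EPSS + 0.20*EnvFit + 0.05*(CVSS/10) + 0.20*KEV_bump."""
    kev_bump = 1.0 if kev else 0.0  # assumption: binary bump
    return 0.55 * epss + 0.20 * env_fit + 0.05 * (cvss / 10) + 0.20 * kev_bump


# Two findings with the same CVSS but very different exploitation likelihood.
likely = threat_score(epss=0.90, env_fit=1.0, cvss=9.8, kev=True)     # about 0.944
unlikely = threat_score(epss=0.02, env_fit=1.0, cvss=9.8, kev=False)  # about 0.260
```

Both findings carry a CVSS of 9.8, yet the first scores far higher because likelihood and KEV status dominate the weighting.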

Challenges we ran into

Multi-model orchestration complexity: Coordinating three models across two providers (Grok via OpenRouter, plus Gemini 2.5 Flash and Gemini 2.5 Pro from Google) with varying API structures, rate limits, and response formats required robust error handling and fallback logic.
JSON schema enforcement: Getting LLMs to consistently return valid, parseable JSON with all required fields (EPSS, CVSS, MITRE ATT&CK IDs) required extensive prompt engineering and schema validation.
Real-time state management at scale: Synchronizing live attack execution state (compromised devices, offline nodes, intermittent connections) across the network diagram while maintaining 60fps animations required careful React optimization and state batching.
Graph algorithm performance: Computing blast radius via BFS for networks with 100+ nodes and complex topologies while updating in real-time during attack simulations required efficient adjacency list representations.
Grounded search quality: Ensuring Google Search grounding returned authoritative CVE sources rather than generic security blog posts required refined prompting and result filtering.
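
The fallback logic mentioned in the first challenge could take roughly the following shape. This is a generic sketch with stub providers, not the project's code; real provider calls would replace the hypothetical `call` functions.

```python
from typing import Callable, Sequence

Provider = tuple[str, Callable[[str], str]]


def first_successful(providers: Sequence[Provider], prompt: str) -> tuple[str, str]:
    """Try each provider in order; return (name, response) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # rate limits, timeouts, malformed responses
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Collecting the per-provider errors makes the final failure message useful when every engine is unavailable.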

Accomplishments we're proud of

Multi-model AI architecture that seamlessly coordinates xAI Grok and Google Gemini models, selecting the optimal engine for each scan type (speed vs. depth vs. currency).
Explainable threat scoring where every recommendation shows its exact EPSS %, CVSS base, KEV badge, and ATT&CK ID with transparent scoring logic that security teams can audit and trust.
Autonomous playbook generation that creates ready-to-execute shell command sequences adapted to specific device configurations, vulnerabilities, and exploitation contexts.
Real-time blast radius visualization using graph algorithms to demonstrate cascading failures and network-wide impact as attacks progress through the topology.
Zero manual intelligence updates via Google Search grounding—Grounded Scans automatically pull the latest CVE data, EPSS scores, and exploitation trends without maintaining a vulnerability database.

What we learned

EPSS transforms prioritization. Weighting exploitation likelihood heavily (55%) ensures remediation efforts focus on vulnerabilities attackers actually target, not just high-CVSS items that may never be exploited.
Multi-model orchestration scales intelligence. Different AI models excel at different tasks—Grok for speed, Gemini Flash for currency, Gemini Pro for depth. Coordinating them provides better results than any single model.
MITRE ATT&CK enables shared language. Mapping every vulnerability to ATT&CK techniques creates a common taxonomy that both AI agents and human operators understand, making automation reliable and detections well-scoped.
Explainability builds trust. Transparent scoring formulas and visible threat intelligence data (EPSS %, KEV status, sources) allow security teams to confidently act on AI recommendations without "black box" uncertainty.
Graph algorithms model real impact. Breadth-first search for blast radius calculation shows how failures of critical infrastructure can cascade through network dependencies, which makes risk tangible.

What's next for Cyber Net

RAG-Enhanced Threat Intelligence Architecture

Implement a dual-generation verification system to ensure recommendation accuracy:

Vector database storing the complete MITRE ATT&CK framework (14 tactics, 193+ techniques, 400+ sub-techniques) and NVD CVE corpus with embeddings for semantic search
Creative generation phase: LLMs generate vulnerability hypotheses and attack chains unrestricted
RAG verification phase: Retrieval-augmented generation validates every ATT&CK ID, CVE number, CVSS score, and technique description against authoritative embeddings
Contradiction detection: Automated flagging when creative outputs diverge from ground truth, triggering regeneration with constraints
Citation linking: Every recommendation includes direct links to MITRE technique pages and NVD CVE entries with confidence scores

This architecture combines creative AI reasoning for novel attack discovery with factual grounding to catch hallucinated vulnerabilities before they reach the operator.
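
A much simpler stand-in for the verification phase is sketched below: it extracts CVE and ATT&CK identifiers from generated text and flags any that are missing from a known-good set. It uses plain set lookups instead of the vector retrieval described above, and the known sets are tiny placeholders.

```python
import re

CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b")
TECHNIQUE_RE = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def unverified_ids(text: str, known_cves: set[str], known_techniques: set[str]) -> list[str]:
    """Return identifiers mentioned in `text` that are absent from the known sets."""
    cves = set(CVE_RE.findall(text))
    techniques = set(TECHNIQUE_RE.findall(text))
    return sorted((cves - known_cves) | (techniques - known_techniques))


# Placeholder ground truth; a real system would load the full ATT&CK and NVD data.
known_cves = {"CVE-2021-44228"}
known_techniques = {"T1190"}
draft = "Exploit CVE-2021-44228 via T1190, then chain CVE-2099-00001 with T9999.001."
flagged = unverified_ids(draft, known_cves, known_techniques)
```

A non-empty result would be the trigger for regeneration with constraints.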

Advanced Persistent Threat (APT) Emulation

Transform single-exploit execution into full-scale adversary campaigns:

Multi-stage attack chains: Automatically sequence exploits across the kill chain—initial access → persistence → privilege escalation → lateral movement → collection → exfiltration
APT behavior modeling: Simulate real threat actor TTPs (APT29, APT28, Lazarus Group) by chaining their documented techniques from MITRE ATT&CK
Temporal pacing: Introduce realistic delays between attack stages to mimic low-and-slow campaigns that evade detection
Objective-driven planning: Define campaign goals (e.g., "exfiltrate financial data") and let AI agents autonomously plan and execute multi-week operations
Purple team orchestration: Run coordinated red team attacks while simultaneously tasking blue team agents to detect, respond, and remediate in real-time

Continuous Security Validation Platform

Evolve from point-in-time scans to always-on offensive testing:

Scheduled autonomous campaigns: Daily/weekly attack simulations that validate security controls without human intervention
Drift detection: Automatically identify when configuration changes introduce new vulnerabilities (e.g., firewall rule modifications, patch rollbacks)
Regression testing: Ensure previously remediated vulnerabilities stay fixed and don't reappear after updates
Compliance validation: Continuously verify PCI-DSS, NIST 800-53, ISO 27001 security control effectiveness through targeted attack simulations
Remediation tracking: Closed-loop validation that confirms mitigations actually prevent exploitation before marking tickets complete
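
Drift detection and regression testing both reduce to comparing finding sets between scans. The sketch below assumes each finding is represented by a stable string key (for example host plus CVE); the keys shown are placeholders.

```python
def compare_scans(previous: set[str], current: set[str], remediated: set[str]) -> dict[str, set[str]]:
    """Classify findings by comparing the current scan with the previous one."""
    return {
        "new": current - previous,           # drift: introduced since the last scan
        "fixed": previous - current,         # no longer observed
        "regressed": current & remediated,   # marked remediated but observed again
    }


# Placeholder finding keys in "host:finding" form.
report = compare_scans(
    previous={"web:finding-a", "db:finding-b"},
    current={"web:finding-a", "db:finding-c", "app:finding-d"},
    remediated={"db:finding-c"},
)
```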

Collaborative Multi-Agent Swarms

Deploy coordinated AI agent teams for complex operations:

Specialized agent roles: Reconnaissance specialists, exploitation experts, persistence engineers, data exfiltration agents, anti-forensics units
Inter-agent communication: Shared knowledge base where agents exchange discovered credentials, network maps, and access paths
Parallel execution: Multiple agents attack different network segments simultaneously, coordinating to achieve unified objectives
Competitive modes: Red team agents compete against blue team defensive agents in real-time adversarial games to stress-test both offensive and defensive AI capabilities

Blue Team Intelligence Integration

Build defensive AI counterparts that learn from offensive operations:

Automated detection rule generation: Convert successful attack chains into Sigma/YARA/Snort signatures and SIEM correlation rules
Defensive playbook creation: Generate incident response runbooks based on observed attack paths and effective countermeasures
Proactive countermeasures: Deploy honeypots, deception assets, and dynamic firewall rules in response to reconnaissance activity
Alert fatigue reduction: Train ML models on historical attack data to tune detection thresholds and reduce false positives

Live Threat Intelligence Feeds

Replace static data with real-time authoritative sources:

FIRST EPSS API integration: Pull daily exploitation probability updates directly from the official EPSS feed
CISA KEV monitoring: Webhook subscriptions that alert when new known-exploited vulnerabilities are added to the catalog
NVD CVE enrichment: Automatic vulnerability detail enrichment as new CVEs are published and scored
Threat actor tracking: Integrate with commercial threat intelligence platforms (Recorded Future, Mandiant) to prioritize attacks matching active campaigns
Zero-day detection: Flag anomalous behavior that doesn't match known CVEs, potentially indicating novel exploitation techniques
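
For the EPSS item, a minimal sketch of a client for the public FIRST endpoint follows. It assumes the response is a JSON object whose `data` array holds entries with `cve` and `epss` fields, with `epss` encoded as a string; check that shape against the current FIRST documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EPSS_URL = "https://api.first.org/data/v1/epss"


def parse_epss(payload: dict) -> dict[str, float]:
    """Map CVE ID to EPSS probability from a decoded API response."""
    return {row["cve"]: float(row["epss"]) for row in payload.get("data", [])}


def fetch_epss(cve_ids: list[str], timeout: float = 10.0) -> dict[str, float]:
    """Query the EPSS API for several CVEs at once (performs a network call)."""
    query = urlencode({"cve": ",".join(cve_ids)})
    with urlopen(f"{EPSS_URL}?{query}", timeout=timeout) as response:
        return parse_epss(json.load(response))
```

Keeping parsing separate from fetching lets the parser be tested against a saved response without network access.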

Extended Topology Discovery

Automate network mapping beyond manual configuration:

Active scanning integration: Embedded nmap, Nessus, and Qualys agents for automatic device discovery and service fingerprinting
Passive network monitoring: SPAN/TAP traffic analysis to build topology from observed communications without intrusive scanning
SNMP/CDP/LLDP polling: Discover network infrastructure interconnections directly from link-layer discovery protocols (CDP, LLDP) and SNMP management data
Cloud API integration: Automatic topology generation from AWS/Azure/GCP asset inventories via cloud provider APIs
Asset management sync: Bidirectional integration with ServiceNow, Jira, and CMDBs to keep topology and business context synchronized

Custom Attack Playbook Builder

Empower security teams to define organization-specific testing scenarios:

Visual attack graph editor: Drag-and-drop interface for composing multi-step attack chains from ATT&CK techniques
Parameterized templates: Reusable attack patterns with variables for target IPs, credentials, payload paths
Industry-specific scenarios: Pre-built playbooks for finance (card data theft), healthcare (PHI exfiltration), critical infrastructure (OT/SCADA attacks)
Compliance test libraries: One-click campaigns that validate specific regulatory requirements (PCI-DSS 11.3, NIST 800-53 CA-8)
Team sharing: Central repository where security teams publish and exchange effective attack scenarios
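
Parameterized templates could be as simple as standard-library string templates with shell quoting applied to every variable. The template text and parameter names below are hypothetical, and `render_step` only produces a string for operator review.

```python
import shlex
from string import Template


def render_step(template: str, **params: str) -> str:
    """Fill a command template, shell-quoting each value to prevent injection."""
    quoted = {name: shlex.quote(value) for name, value in params.items()}
    # substitute() raises KeyError if the template uses a variable not supplied.
    return Template(template).substitute(quoted)


# 192.0.2.10 is a documentation-only address.
step = render_step("nmap -sV -p $ports $target", ports="80,443", target="192.0.2.10")
```

Quoting at render time means a parameter value containing shell metacharacters stays a single argument.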

Attack Replay and Forensics

Transform execution logs into learning opportunities:

Step-by-step replay: Visualize past attacks frame-by-frame to understand decision paths and identify defensive gaps
What-if analysis: Modify network configurations retroactively and replay attacks to test if mitigations would have worked
Timeline reconstruction: Generate detailed attack timelines with timestamps, commands executed, data accessed, and lateral movement paths
Evidence export: Package attack logs and artifacts in formats compatible with forensic tools (PCAP, Syslog, Windows Event Logs)

CI/CD Security Integration

Embed offensive testing in development pipelines:

API-first architecture: RESTful endpoints for triggering scans, retrieving results, and managing campaigns programmatically
GitHub Actions / GitLab CI runners: Pre-built pipeline steps that automatically test infrastructure-as-code (Terraform, CloudFormation) for exploitable misconfigurations
Shift-left security: Validate security controls in staging environments before production deployment
Quality gates: Block deployments that introduce high-EPSS vulnerabilities or expand attack surface beyond defined thresholds
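
A quality gate of this kind might be sketched as follows. The threshold of 0.5 and the finding shape (`cve`, `epss`) are assumptions chosen for illustration, and the identifiers are placeholders.

```python
def quality_gate(findings: list[dict], max_epss: float = 0.5) -> tuple[bool, list[str]]:
    """Return (passed, blocking_ids); any finding at or above max_epss blocks."""
    blocking = sorted(f["cve"] for f in findings if f["epss"] >= max_epss)
    return (len(blocking) == 0, blocking)


# Placeholder findings from a hypothetical staging scan.
findings = [
    {"cve": "CVE-A", "epss": 0.03},
    {"cve": "CVE-B", "epss": 0.81},
]
passed, blocking = quality_gate(findings)
```

In a pipeline step, a failed gate would map to a non-zero exit code so the deployment stops.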

Adversary Behavior Analytics

Apply machine learning to attack pattern recognition:

Attack graph mining: Identify common exploitation paths across historical campaigns to predict likely attacker next moves
Vulnerability clustering: Group related CVEs and techniques to prioritize remediation efforts that eliminate entire attack classes
Anomaly detection: Flag unusual network behavior that deviates from known attack patterns, potentially indicating zero-days or insider threats
Predictive risk modeling: Forecast which devices are most likely to be targeted based on historical exploitation trends and network position