Smart Contract Exploitation Agent

AI-powered agent that autonomously discovers and exploits smart contract vulnerabilities using xAI's Grok.

Tested on 533 real-world contracts | 30+ successful exploits | ~$20M total USD value exploited | Fully autonomous workflow

Docker-based sandbox supporting Live Contracts (on-chain forking) and CTF Challenges (Damn Vulnerable DeFi)

Successful Exploit Categories

Access Control - Shezmu ($4.9M), TempleDao ($2.3M), SuperRare ($730K), DEPUSDT_LEVUSDC ($105K), Cftoken
Flash Loan Attacks - ChiSale ($16.3K), CFC, NovaXM2E (2.86 ETH)
Reentrancy - Cream Finance, Convergence
Logic Flaws - DeezNutz404, Novo (4.75M tokens)
Price Oracle Manipulation - Multiple DeFi protocols

Note: Most exploits target contracts deployed after Grok's knowledge cutoff. The agent operates without web search tools, relying solely on source code analysis and reasoning. Full chain-of-thought reasoning and execution logs are available in the reports/ directory.

Architecture

┌─────────────────────┐
│   Python Agent      │  ← Grok API (ReAct reasoning)
│   (Host Machine)    │
└─────────┬───────────┘
          │
          ├─── workspace/ (shared volume)
          │    ├── contract_context/    # Source code, ABI, state
          │    ├── memory/              # Persistent learning
          │    └── Exploit.sol          # Generated exploits
          │
          ▼
┌─────────────────────┐
│  Docker Container   │
│  ├── Foundry        │  ← Solidity compilation & testing
│  ├── Heimdall       │  ← Bytecode decompilation
│  ├── Anvil          │  ← Blockchain forking
│  └── Slither        │  ← Static analysis
└─────────────────────┘

Blockchain Fork & Isolated Execution

One key technical challenge is building an isolated sandbox that can reproduce historical blockchain state at a chosen block:

Multi-Chain Support:

✅ Ethereum Mainnet - Fork and replay at any block height
✅ BSC (Binance Smart Chain) - Full state reproduction

Technical Implementation:

Anvil Forking - Uses Foundry's Anvil to fork live blockchains at specific block heights
State Snapshot - Captures contract storage, balances, and deployment data at target block
Perfect Isolation - Each exploit runs in a fresh Docker container with no cross-contamination
Fast Startup - Container spins up and forks the blockchain in under 5 seconds

Why This is Hard:

Must handle tens of millions of blocks of historical data via RPC
Preserve exact state including storage slots, balances, nonces
Handle proxy contracts, upgradeable patterns, and complex dependencies
Support both verified contracts (source available) and unverified (bytecode decompilation)

This enables testing exploits against real production state at the exact block height when vulnerabilities existed.
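
As a rough illustration of the forking step, the sketch below builds an Anvil command line pinned to one block. The helper name `build_anvil_command` and the `RPC_ENV_BY_CHAIN` mapping are hypothetical and not taken from this repository; the only parts assumed from Foundry are Anvil's `--fork-url`, `--fork-block-number`, and `--port` flags.

```python
import os
import shlex

# Hypothetical mapping from a config's chain_name to the RPC variable in .env.
RPC_ENV_BY_CHAIN = {"ethereum": "ETH_RPC_URL", "bsc": "BSC_RPC_URL"}


def build_anvil_command(chain_name, block_height, env=None, port=8545):
    """Return the argv list for an Anvil fork pinned to a single block."""
    env = os.environ if env is None else env
    rpc_var = RPC_ENV_BY_CHAIN.get(chain_name)
    if rpc_var is None:
        raise ValueError(f"unsupported chain: {chain_name!r}")
    rpc_url = env.get(rpc_var)
    if not rpc_url:
        raise ValueError(f"{rpc_var} is not set")
    if block_height < 0:
        raise ValueError("block_height must be non-negative")
    return [
        "anvil",
        "--fork-url", rpc_url,
        "--fork-block-number", str(block_height),
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Placeholder URL; a real run needs an RPC endpoint that serves historical state.
    demo_env = {"ETH_RPC_URL": "https://rpc.example.invalid"}
    print(shlex.join(build_anvil_command("ethereum", 21450000, env=demo_env)))
```

Returning an argv list rather than a shell string keeps the RPC URL out of shell interpolation when the command is later passed to a process runner.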

Agent Workspace & Tools

The agent operates in a shared workspace/ directory with access to:

Available Tools:

execute - Run bash commands in Docker (forge, cast, slither, etc.)
read - Read files (source code, ABI, state snapshots)
write - Create/modify Solidity exploit files
list_dir - Browse contract context and memory
grep - Search patterns in source code
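
To make the tool list concrete, here is a minimal sketch of how the file tools (read, write, list_dir, grep) could be confined to the shared workspace. The class name `WorkspaceTools` and its method signatures are assumptions for illustration, not the contents of agents/tools.py, and the execute tool is left out because it depends on the Docker setup.

```python
import re
from pathlib import Path


class WorkspaceTools:
    """File tools that refuse any path outside the workspace root."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _resolve(self, relative):
        # Resolve symlinks and ".." first, then check the result is still inside root.
        path = (self.root / relative).resolve()
        if path != self.root and self.root not in path.parents:
            raise PermissionError(f"{relative!r} escapes the workspace")
        return path

    def read(self, relative):
        return self._resolve(relative).read_text(encoding="utf-8")

    def write(self, relative, content):
        path = self._resolve(relative)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")

    def list_dir(self, relative="."):
        return sorted(p.name for p in self._resolve(relative).iterdir())

    def grep(self, pattern, relative="."):
        """Return (path relative to root, line number, line) for each regex match."""
        regex = re.compile(pattern)
        base = self._resolve(relative)
        files = [base] if base.is_file() else sorted(p for p in base.rglob("*") if p.is_file())
        hits = []
        for file in files:
            try:
                lines = file.read_text(encoding="utf-8").splitlines()
            except UnicodeDecodeError:
                continue  # skip binary files
            for number, line in enumerate(lines, start=1):
                if regex.search(line):
                    hits.append((str(file.relative_to(self.root)), number, line))
        return hits
```

Resolving the path before the containment check means both `../` segments and absolute paths are rejected.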

Workspace Structure:

workspace/
├── contract_context/           # Target contract data
│   └── 0xAddress/
│       ├── source/*.sol        # Contract source code
│       ├── abi.json           # Contract ABI
│       └── state_at_block.json # On-chain state snapshot
├── memory/                     # Persistent learning
│   ├── global/strategies/      # Exploit patterns library
│   └── current_case/          # Current task context
├── Exploit.sol                # Agent writes exploit here
└── foundry.toml               # Forge configuration

Quick Start

Prerequisites

# Clone the repo
git clone <repo-url>
cd <repo-name>

# Install dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
playwright install chromium

# Configure API keys
cp env.example .env
# Edit .env with your XAI_API_KEY, ETH_RPC_URL, ETHERSCAN_API_KEY (and BSC_RPC_URL for BSC targets)

Build Docker Image

docker build -t exploit-agent .

Test the Setup

# Verify sandbox environment
python tests/test_sandbox.py

# Test agent tools
python tests/test_tools.py

Usage

Option A: CTF Mode (Damn Vulnerable DeFi)

Best for learning and benchmarking. No blockchain access needed for most challenges.

Step 1: Setup DVD (one-time)

# Clone DVD repo into workspace
cd workspace
git clone https://github.com/theredguild/damn-vulnerable-defi.git dvd
cd dvd
git submodule update --init --recursive
cd ../..

# Prepare all 18 challenges
python prepare_context.py ctf --all

Step 2: Run a Challenge

Example with unstoppable (easy difficulty):

# First run (creates fresh memory)
python run_agent.py contracts/ctf/unstoppable.json

# Retry same challenge (memory preserved - agent remembers previous attempts)
python run_agent.py contracts/ctf/unstoppable.json

# Switch to different challenge (clear short-term memory with --reset)
python run_agent.py contracts/ctf/truster.json --reset

The agent will:

Read the vulnerable contract source code
Analyze the vulnerability
Write exploit code in the test file
Run forge test to verify

Step 3: Batch Evaluation (Optional)

Run the agent on multiple challenges to benchmark performance.

Each challenge gets a fresh current_case/ memory, but global/strategies/ accumulates across challenges (agent learns from previous exploits).

# List all 18 challenges
python -m evaluation.run_ctf_eval --list

# Batch run all challenges
python -m evaluation.run_ctf_eval --all

# Batch run by difficulty (easy/medium/hard/expert)
python -m evaluation.run_ctf_eval --difficulty easy

# Run single challenge
python -m evaluation.run_ctf_eval --challenge unstoppable

Generates a summary report with pass rates by difficulty and category. Results are saved to runs/ctf_eval/.
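
The per-difficulty pass rate in the summary can be pictured with the sketch below. The record shape (`difficulty`, `passed`) and the function name `pass_rates` are hypothetical; the actual report format written to runs/ctf_eval/ may differ.

```python
from collections import defaultdict


def pass_rates(records):
    """Map each difficulty to the fraction of its challenges that passed."""
    counts = defaultdict(lambda: [0, 0])  # difficulty -> [passed, total]
    for record in records:
        bucket = counts[record["difficulty"]]
        if record["passed"]:
            bucket[0] += 1
        bucket[1] += 1
    return {difficulty: passed / total for difficulty, (passed, total) in counts.items()}


if __name__ == "__main__":
    # Made-up placeholder records, not real evaluation results.
    demo = [
        {"challenge": "unstoppable", "difficulty": "easy", "passed": True},
        {"challenge": "truster", "difficulty": "easy", "passed": False},
    ]
    print(pass_rates(demo))
```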

Option B: Live Contract Mode

For exploiting real on-chain contracts. Requires RPC access.

Step 1: Create Contract Config

Create contracts/live/my_contract.json:

{
  "type": "live",
  "name": "MyContract",
  "address": "0x1234567890123456789012345678901234567890",
  "description": "Description of the vulnerability",
  "chain_id": 1,
  "chain_name": "ethereum",
  "block_height": 21450000,
  "hints": "Optional hints for the agent..."
}
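
Before running the preparation step, it can help to sanity-check a config like the one above. The sketch below is one possible check, assuming that `type`, `name`, `address`, `chain_id`, `chain_name`, and `block_height` are required while `description` and `hints` are optional; that split is a guess, and `validate_live_config` is a hypothetical helper rather than part of prepare_context.py.

```python
import json
import re

# Assumed required fields and their JSON types (not confirmed by the project).
REQUIRED_FIELDS = {
    "type": str,
    "name": str,
    "address": str,
    "chain_id": int,
    "chain_name": str,
    "block_height": int,
}
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def validate_live_config(raw_json):
    """Return a list of problems found in a live-contract config; empty means OK."""
    config = json.loads(raw_json)
    if not isinstance(config, dict):
        return ["config must be a JSON object"]
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        value = config.get(field)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"{field} must be a {expected_type.__name__}")
    if errors:
        return errors
    if config["type"] != "live":
        errors.append('type must be "live"')
    if not ADDRESS_RE.fullmatch(config["address"]):
        errors.append("address must be 0x followed by 40 hex characters")
    if config["block_height"] < 0:
        errors.append("block_height must be non-negative")
    return errors
```

Collecting every problem in a list, instead of raising on the first one, lets a single run report all config mistakes at once.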

Step 2: Prepare Context

python prepare_context.py live contracts/live/my_contract.json

This downloads:

Contract source code from Etherscan (or decompiles via Heimdall if unverified)
ABI and deployment info
State variables at the configured block
Proxy detection (if applicable)

Step 3: Run Agent

Example with sorra_staking:

# First run
python run_agent.py contracts/live/sorra_staking.json

# Retry same contract (memory preserved)
python run_agent.py contracts/live/sorra_staking.json

# Switch to different contract (clear short-term memory)
python run_agent.py contracts/live/other_contract.json --reset

The agent will:

Fork the blockchain at the specified block
Fund the agent address with ETH
Analyze the contract and execute exploit
Report balance changes

Results are saved to runs/live_<name>_<timestamp>/.

CTF Challenges Reference

| Difficulty | Challenges |
|------------|------------|
| Easy | unstoppable, naive-receiver, truster, side-entrance |
| Medium | the-rewarder, selfie, compromised, puppet, puppet-v2 |
| Hard | free-rider, backdoor, climber, wallet-mining |
| Expert | puppet-v3, abi-smuggling, shards, curvy-puppet, withdrawal |

Memory System

The agent uses filesystem-based memory for persistent learning:

workspace/memory/
├── global/strategies/     # Long-term: persists across all runs
│   ├── reentrancy.md      # Agent writes learned patterns here
│   ├── flash_loan.md
│   └── ...
└── current_case/          # Short-term: current task only
    ├── todo.md            # Task tracking
    └── attempts.md        # Exploit attempt log

global/strategies/: Agent writes learned exploit patterns here after successful exploits. Accumulates across batch runs.
current_case/: Preserved on retry, cleared with --reset or automatically in batch mode.
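
The two memory lifetimes described above can be sketched as follows. This is an illustration of the stated behavior (short-term memory cleared, long-term strategies kept), not the code in agents/memory.py; the function name `reset_current_case` is hypothetical.

```python
import shutil
from pathlib import Path


def reset_current_case(memory_root):
    """Empty current_case/ while leaving global/strategies/ untouched."""
    memory_root = Path(memory_root)
    current_case = memory_root / "current_case"
    if current_case.exists():
        shutil.rmtree(current_case)
    current_case.mkdir(parents=True)
    # Make sure the long-term directory exists, but never delete from it.
    (memory_root / "global" / "strategies").mkdir(parents=True, exist_ok=True)
    return current_case
```

A retry would simply skip this call, which matches the note that memory is preserved unless --reset is passed.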

Vulnerability Searchers

Automated scanners that discovered 200+ high-risk contracts on the BSC blockchain.

Proxy Scanner

Scan for vulnerable upgradeable proxy contracts (uninitialized implementations).

python searcher/continuous_scan.py --chain bsc --from-block 59000000 --to-block 60000000

Events scanned: EIP-1967 (Upgraded, AdminChanged, BeaconUpgraded), Diamond (DiamondCut)

Critical signal: UNINITIALIZED_IMPL - attacker can call initialize() to take ownership.

Burn Scanner

Scan for ERC20 tokens with dangerous burn functions (rug pull patterns).

python searcher/burn_scanner.py --chain bsc --from-block 59000000 --to-block 60000000

How it works:

Scan PairCreated events from DEX factories (PancakeSwap, etc.)
Get bytecode of each new token
Check for dangerous burn function selectors

Dangerous patterns (24 signatures; examples below):

| Risk | Functions |
|------|-----------|
| 🔴 HIGH | `burnPair(address)`, `burn(address,uint256)`, `burnLP(address)`, `destroyPair(address)`, `burnAll(address)` |
| 🟠 MEDIUM | `burnFrom(address,uint256)`, `emergencyBurn`, `forceBurn`, `adminBurn` |

Exploit: Call burnPair(lpAddress) → burns LP tokens → sync() → swap remaining tokens for all liquidity.

Project Structure

.
├── agents/
│   ├── react_agent.py          # ReAct agent (live + CTF)
│   ├── prompts.py              # Live contract prompts
│   ├── ctf_prompts.py          # CTF prompts (no hints)
│   ├── tools.py                # bash, grep, file_ops, forge
│   └── memory.py               # Memory system (filesystem-based)
│
├── prepare/                    # Context preparation
│   ├── live_context.py         # Live contract context
│   └── ctf_context.py          # CTF challenge context
│
├── evaluation/                 # Batch evaluation (benchmark)
│   ├── ctf_evaluator.py        # CTF batch evaluator
│   └── run_ctf_eval.py         # CLI for batch runs
│
├── searcher/                   # Vulnerability scanners
│   ├── continuous_scan.py      # Proxy scanner (EIP-1967, Diamond)
│   ├── burn_scanner.py         # ERC20 burn function scanner
│   └── proxy_scanner.py        # Single address scanner
│
├── contracts/                  # Config files (input)
│   ├── live/                   # Live contract configs (name, address, chain)
│   └── ctf/                    # CTF configs (name, difficulty, test_function)
│
├── workspace/                  # Prepared context (generated)
│   ├── dvd/                    # DVD repo (Damn Vulnerable DeFi)
│   ├── ctf_context/            # CTF context (source code, README, test file)
│   ├── contract_context/       # Live contract context (ABI, source, state)
│   └── memory/                 # Agent memory (persistent)
│       ├── global/strategies/  # Long-term: learned exploit patterns
│       └── current_case/       # Short-term: current task todo/attempts
├── tests/                      # Test files
├── runs/                       # Results
│
├── prepare_context.py          # Unified preparation entry
├── run_agent.py               # Main agent runner
└── Dockerfile                  # Sandbox image

Environment Variables

| Variable | Required For | Description |
|----------|--------------|-------------|
| `XAI_API_KEY` | All | Grok API key |
| `ETH_RPC_URL` | Live (ETH) | Ethereum RPC (Alchemy) |
| `BSC_RPC_URL` | Live (BSC) | BSC RPC (Alchemy) |
| `ETHERSCAN_API_KEY` | Live | Contract source download |

License

MIT

ZhimaoL/surf-hack
