Prompt Optimization for Every Agent
Optimize your agent's prompts with a POST request. Start training on your production code as-is. Any language, any framework. Powered by Genetic Evolutionary Prompt Optimization (GEPA).
Get started with Synth AI
Performance Benchmarks
[Chart: baseline vs. optimized reward]
Train with Your Code As-Is
Wrap your code in a few HTTP routes and start training.
Requirements (GEPA)
1. Expose a /rollout route wrapping your agent logic.
2. Use Cloudflare Tunnel to stand up a secure endpoint.
3. Create a dataset of tasks to evaluate your prompt.
4. Send a POST request via OpenAPI to start training.
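To make step 1 concrete, here is a minimal sketch of a /rollout route using only the Python standard library. The payload shape (`prompt`, `task`) and the `run_agent` helper are hypothetical illustrations, not Synth AI's actual contract; in practice you would wrap your existing agent code and return whatever reward signal your evaluation uses.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def run_agent(prompt: str, task: dict) -> dict:
    """Stand-in for your existing agent logic (hypothetical).
    A real implementation would call your LLM with `prompt` and the
    task input, then score the result against the task's label."""
    answer = "card_linking"  # placeholder for a real model call
    reward = 1.0 if answer == task.get("label") else 0.0
    return {"answer": answer, "reward": reward}

class RolloutHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/rollout":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        # Assumed payload shape: {"prompt": "...", "task": {...}}
        result = run_agent(payload["prompt"], payload["task"])
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging

# To serve locally (then expose via Cloudflare Tunnel):
# ThreadingHTTPServer(("127.0.0.1", 8080), RolloutHandler).serve_forever()
```

Because the route is plain HTTP, the same shape works in any language or framework; the optimizer only needs to POST tasks and read back rewards.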
Example Output
Baseline Prompt: Reward 39%
"You are an expert banking assistant that classifies customer queries into banking intents. Given a customer message, respond with exactly one intent label from the provided list using the banking77_classify tool."
Optimized (Gen 4): Reward 80% (+105% lift)
[Decision Policy]
"Prefer the single most specific intent that directly matches the user's primary request. If ambiguous, use keyword mappings..."
[Keyword Mappings]
"Card linking / adding / showing in app -> card_linking (Keywords: 'link', 'add to app')..."
Supported Languages: any language that can serve an HTTP endpoint.
What's different about Synth AI?
| Deployment | Synth AI | DSPy |
| --- | --- | --- |
| **Serverless Cloud API**: managed infrastructure with no setup required | | |
| **Zero Code Changes**: wrap existing code in an HTTP endpoint, optimize via API | | |
| **Cloud-Tunneled Local Dev**: expose local task apps to the cloud optimizer without deployment | | |
| **Offline Compatible**: run entirely on your machine, works offline | | |
| Programming Model | Synth AI | DSPy |
| --- | --- | --- |
| **Provider Flexibility**: LiteLLM + local LMs (Ollama, SGLang) + any OpenAI-compatible endpoint | | |
| **Built-in Agent Modules**: ReAct, CodeAct, ProgramOfThought modules for tool-use workflows | | |
| Agent Support | Synth AI | DSPy |
| --- | --- | --- |
| **Coding Agent Optimization**: optimize system prompts, AGENTS.md, and skills for Codex/OpenCode/Claude | | |
| **Cloud Sandboxes**: isolated Daytona VMs for agent evaluation (~3 s provisioning) | | |
| Evaluation & Inference | Synth AI | DSPy |
| --- | --- | --- |
| **GraphGen**: train custom prompt graphs from JSON datasets | | |
| **Verifier Optimization**: calibrate evaluation rubrics to correlate with ground truth | | |
| **Production Inference APIs**: serve trained graphs and verifiers via hosted endpoints | | |
| **RLM for Massive Context**: handle 1M+ token contexts via tool-based search | | |
| **Built-in Retrieval/RAG**: Retrieve and ColBERTv2 modules for retrieval-augmented workflows | | |
| GEPA Algorithm | Synth AI | DSPy |
| --- | --- | --- |
| **Evolutionary Optimization**: population-based search with LLM-guided mutations | | |
| **Pareto Multi-Objective**: optimize across accuracy, prompt length, and cost | | |
| **Rich Textual Feedback**: use execution traces and error logs to guide mutations | | |
| **Pattern-Mode Transformations**: structured diffs and partial edits instead of full prompt rewrites | | |
| **Cost & Efficiency Controls**: proxy models, adaptive pools, spend limits, token budgets, concurrency | | |
| **Tool Optimization**: automatically optimize tool definitions and usage patterns | | |
| **Inference-Time Search**: track and select best outputs at runtime via `track_best_outputs` | | |
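The core GEPA loop (population-based search, LLM-guided mutation, Pareto selection over multiple objectives) can be sketched in a few lines. This is a toy model, not Synth AI's implementation: `mutate` stands in for an LLM call, and `evaluate` simulates reward rather than running rollouts.

```python
import random

def mutate(prompt: str) -> str:
    """Stand-in for an LLM-guided mutation; a real system would feed
    execution traces and error logs to a model and apply a structured edit."""
    return prompt + random.choice(
        [" Be concise.", " Name the tool you used.", " Prefer the most specific intent."]
    )

def evaluate(prompt: str) -> tuple[float, int]:
    """Toy objectives: (reward, prompt length). Reward is simulated here;
    in practice it would come from /rollout evaluations."""
    return (min(1.0, 0.4 + 0.1 * prompt.count(".")), len(prompt))

def pareto_front(scored: list) -> list:
    """Keep candidates not dominated on (higher reward, shorter prompt)."""
    front = []
    for i, (_, (r, n)) in enumerate(scored):
        dominated = any(
            r2 >= r and n2 <= n and (r2 > r or n2 < n)
            for j, (_, (r2, n2)) in enumerate(scored) if j != i
        )
        if not dominated:
            front.append(scored[i][0])
    return front

def optimize(seed: str, generations: int = 4, pool: int = 6) -> list[str]:
    """Evolve a seed prompt: mutate survivors, score, keep the Pareto front."""
    population = [seed]
    for _ in range(generations):
        candidates = population + [
            mutate(random.choice(population)) for _ in range(pool)
        ]
        scored = [(p, evaluate(p)) for p in candidates]
        population = pareto_front(scored)
    return population
```

Keeping the whole Pareto front, rather than a single best candidate, is what lets the optimizer trade accuracy against prompt length and cost instead of collapsing to one objective.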
Pricing
Pay for training compute. Inference is billed at provider rates.
$10 free credits monthly for new users