MCP server unit testing, end-to-end (e2e) testing, and server evals
The @mcpjam/sdk is a TypeScript SDK for testing, evaluating, and building applications with MCP (Model Context Protocol) servers. It provides everything you need to ensure your MCP server works reliably across different LLMs and environments.
Use the suite classes inside Jest or Vitest, then convert the results into shared JSON or JUnit XML reports with toConformanceReport(...) and renderConformanceReportJUnitXml(...).
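For readers unfamiliar with the JUnit XML format that CI systems consume, the following is a minimal sketch of what such a rendering step produces. It is not the SDK's implementation: the CaseResult type and the renderJUnitXml and escapeXml helpers are hypothetical names introduced here for illustration, and the SDK's real report shape may differ.

```typescript
// Hypothetical result shape, for illustration only.
// The report type used by @mcpjam/sdk may look different.
interface CaseResult {
  name: string;
  passed: boolean;
  durationMs: number;
  message?: string;
}

// Escape the characters that are unsafe inside XML attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one <testsuite> element with one <testcase> per result.
// JUnit reports express time in seconds, so milliseconds are converted.
function renderJUnitXml(suiteName: string, cases: CaseResult[]): string {
  const failures = cases.filter((c) => !c.passed).length;
  const body = cases.map((c) => {
    const seconds = (c.durationMs / 1000).toFixed(3);
    const open = `  <testcase name="${escapeXml(c.name)}" time="${seconds}"`;
    if (c.passed) return `${open}/>`;
    const message = escapeXml(c.message ?? "failed");
    return `${open}>\n    <failure message="${message}"/>\n  </testcase>`;
  });
  return [
    `<?xml version="1.0" encoding="UTF-8"?>`,
    `<testsuite name="${escapeXml(suiteName)}" tests="${cases.length}" failures="${failures}">`,
    ...body,
    `</testsuite>`,
  ].join("\n");
}

const exampleXml = renderJUnitXml("conformance", [
  { name: "lists tools", passed: true, durationMs: 1500 },
  { name: "calls add", passed: false, durationMs: 20, message: "tool not called" },
]);
console.log(exampleXml);
```

Most CI systems can ingest a file in this shape as a test report, which is why a shared XML output is useful alongside the JSON form.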
    import { MCPClientManager, HostRunner, EvalTest } from "@mcpjam/sdk";

    // Connect to your MCP server
    const manager = new MCPClientManager({
      myServer: {
        command: "npx",
        args: ["-y", "@modelcontextprotocol/server-everything"],
      },
    });

    // Create a runner with LLM + MCP tools
    const runner = new HostRunner({
      tools: await manager.getToolsForAiSdk(),
      model: "anthropic/claude-sonnet-4-20250514",
      apiKey: process.env.ANTHROPIC_API_KEY,
    });

    // Run a prompt and inspect results
    const result = await runner.run("Add 2 and 3");
    console.log(result.toolsCalled()); // ["add"]
    console.log(result.e2eLatencyMs()); // 1234

    // Run statistical evaluation
    const test = new EvalTest({
      name: "addition",
      test: async (runner) => {
        const r = await runner.run("Add 2+3");
        return r.hasToolCall("add");
      },
    });

    await test.run(runner, { iterations: 30 });
    console.log(`Accuracy: ${(test.accuracy() * 100).toFixed(1)}%`);
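The example ends by printing an accuracy figure over 30 iterations. As a rough illustration of what such a number represents, the sketch below assumes accuracy is the fraction of iterations whose test callback returned true. That definition is an assumption made here, and summarizeOutcomes and meetsThreshold are hypothetical helpers, not SDK functions.

```typescript
// Hypothetical helpers, not part of @mcpjam/sdk.
interface EvalSummary {
  passed: number;
  total: number;
  accuracy: number; // fraction in [0, 1]
}

// Count how many iterations passed and derive the pass fraction.
function summarizeOutcomes(outcomes: boolean[]): EvalSummary {
  const total = outcomes.length;
  const passed = outcomes.filter(Boolean).length;
  return { passed, total, accuracy: total === 0 ? 0 : passed / total };
}

// Gate a CI job on a minimum pass rate; an empty run never passes.
function meetsThreshold(summary: EvalSummary, minAccuracy: number): boolean {
  return summary.total > 0 && summary.accuracy >= minAccuracy;
}

// 27 passing iterations out of 30.
const outcomes: boolean[] = [
  ...Array<boolean>(27).fill(true),
  ...Array<boolean>(3).fill(false),
];
const summary = summarizeOutcomes(outcomes);
console.log(`Accuracy: ${(summary.accuracy * 100).toFixed(1)}%`); // Accuracy: 90.0%
console.log(meetsThreshold(summary, 0.8)); // true
```

Because LLM tool selection is not deterministic, a threshold on the pass rate tends to be a more stable CI gate than asserting on a single run.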
Renames in 1.11: TestAgent → HostRunner, EvalAgent → HostExecutor, .prompt() → .run(), Host.addServer() → Host.requireServer(). There are no deprecation aliases, so code that uses the old names will not type-check against @mcpjam/sdk@>=1.11.
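Since there are no aliases, existing test files have to be updated by hand or by script. The following is a minimal sketch of a first-pass textual migration, assuming a plain regex rewrite is acceptable as a starting point. The migrateSource helper is hypothetical and not shipped with the SDK, and because it does not parse TypeScript it will also rewrite unrelated .prompt( or .addServer( calls, so the resulting diff should be reviewed.

```typescript
// Hypothetical migration helper for the 1.11 renames; not part of @mcpjam/sdk.
// Each entry pairs a pattern for the old name with its replacement.
const RENAMES: ReadonlyArray<[RegExp, string]> = [
  [/\bTestAgent\b/g, "HostRunner"],
  [/\bEvalAgent\b/g, "HostExecutor"],
  [/\.prompt\(/g, ".run("],
  [/\.addServer\(/g, ".requireServer("],
];

// Apply every rename in order to the given source text.
function migrateSource(source: string): string {
  return RENAMES.reduce(
    (text, [pattern, replacement]) => text.replace(pattern, replacement),
    source,
  );
}

const before = 'const agent = new TestAgent(config); await agent.prompt("Add 2 and 3");';
console.log(migrateSource(before));
// const agent = new HostRunner(config); await agent.run("Add 2 and 3");
```

Running the type checker afterwards is the real verification step: any old name the rewrite missed will surface as a compile error against 1.11.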
For richer setups (pinning a host style/profile, applying a sandbox/permission policy, or running the same configuration in CI that the MCPJam Inspector uses), start from a Host spec instead of constructing a runner directly:
    import { MCPClientManager, Host, EvalTest } from "@mcpjam/sdk";

    const manager = new MCPClientManager();
    await manager.connectToServer("everything", {
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-everything"],
    });

    const host = new Host({
      style: "mcpjam",
      model: "openai/gpt-4o",
    }).requireServer("everything");

    // Bind the spec to the live manager. apiKey lives on the runtime, not per-call.
    const runtime = host.withManager(manager, {
      apiKey: process.env.OPENAI_API_KEY!,
    });

    const test = new EvalTest({
      name: "add",
      test: async (runner) => (await runner.run("Add 2+3")).hasToolCall("add"),
    });

    await test.run(runtime, { iterations: 10 });