Getting Started
Connect your AI assistant to MarcoPolo, provision your workspace, and run your first live query.
1. Connect your AI Client
The simplest way to connect is through MCP (Model Context Protocol), the standard for connecting AI assistants to external tools. For Claude and Codex, the dedicated plugins provide a richer experience with built-in skills for data workflows. Both paths connect to the same MarcoPolo workspace.
| AI Tool | Connection Path |
|---|---|
| Claude | MCP: Admin Settings → Custom Connectors → Add URL: https://mcp.marcopolo.dev<br>Plugin: Install the Claude Plugin |
| Claude Code | MCP: Run `claude mcp add marcopolo --transport http https://mcp.marcopolo.dev`<br>Plugin: Install the Claude Plugin |
| Codex | MCP: Add https://mcp.marcopolo.dev to your MCP settings<br>Plugin: Install the Codex Plugin |
| ChatGPT | Apps Marketplace → Search "MarcoPolo" |
| Cursor / VS Code | Add https://mcp.marcopolo.dev to your mcp.json or MCP settings. |
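For editors that read an mcp.json file (Cursor, VS Code), the entry is short. A minimal sketch, assuming the client accepts a plain HTTP URL under a `url` key (some clients use a different key name, so check your editor's MCP docs) and that `marcopolo` is simply a label you choose:

```json
{
  "mcpServers": {
    "marcopolo": {
      "url": "https://mcp.marcopolo.dev"
    }
  }
}
```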
Specialized Integrations
If you aren't using a standard MCP client, follow these specific guides:
- Claude Desktop Extension: Connect MarcoPolo to the free version of Claude Desktop via the desktop extension.
- Claude Plugin: Plugin for Claude Code and Claude Desktop with bundled skills, MCP configuration, and a data analyst subagent.
- Codex Plugin: Plugin for OpenAI Codex with bundled skills and MCP configuration.
- Developer SDKs: Use MCP or the Sandbox SDK to integrate MarcoPolo into custom LangChain agents or autonomous Python scripts.
- Replit:
2. Connect and Establish Identity
Once the server URL is added, you must explicitly Connect to the MarcoPolo MCP server within your AI app. This action triggers the authentication process. You can establish your identity using Google, Microsoft, GitHub, or by providing your Email address.
No data source required to start. If you sign in with a personal email (e.g., @gmail.com), your workspace comes pre-populated with demo data sources: a Salesforce org, a Snowflake warehouse, and an S3 bucket. You can run the examples on this page immediately.
- Automatic Workspace Provisioning: The moment you first authenticate, MarcoPolo automatically provisions your workspace. All subsequent queries, files, and cached data persist within this secure boundary.
- Domain-Based Enablement: Your email domain maps you to your company’s environment. This enables you to share data sources, business context, and analysis artifacts with other users or groups within your organization.
3. Start with a Conversation
Start with a conversation, not a configuration. MarcoPolo eliminates the need for manual config files. You manage your environment through the same chat interface where you get your answers.
Ask your AI what it can see. MarcoPolo will inspect your workspace and return the state.
What data sources do I have?
AI calls list_datasources()
→ salesforce-demo-datasource (Salesforce) [query, schema]
→ snowflake-demo-datasource (Snowflake) [query, schema]
→ aws-s3-demo-datasource (S3) [browse, query, download]
AI: You have three demo data sources: a Salesforce org, a Snowflake
warehouse with an ANALYTICS database (CUSTOMERS, ORDERS,
PRODUCT_METRICS tables), and an S3 bucket. You can also connect
your own sources - PostgreSQL, BigQuery, Jira, HubSpot, and more.
4. Run your First Query
Your AI fetches the schema, writes a query file, executes it in your workspace, and caches the results in DuckDB before answering.
Who are the top customers by total purchases?
AI calls get_schema("snowflake-demo-datasource", database="ANALYTICS")
→ 3 tables: CUSTOMERS (10 rows), ORDERS (17 rows), PRODUCT_METRICS (16 rows)
AI calls get_schema("snowflake-demo-datasource", database="ANALYTICS", table="CUSTOMERS")
→ 10 columns: CUSTOMER_ID, CUSTOMER_NAME, EMAIL, COUNTRY,
ACCOUNT_STATUS, TOTAL_PURCHASES, ...
AI writes queries/snowflake-demo-datasource/top_customers.sql
AI calls query("snowflake-demo-datasource", "queries/snowflake-demo-datasource/top_customers.sql")
→ 5 rows → loaded into DuckDB table: top_customers
AI: Top 5 customers by total purchases:
1. Fiona Green (Germany) $3,450.80
2. Charlie Brown (UK) $2,340.00
3. Jessica Martinez (Spain) $2,100.00
4. Hannah Lee (Japan) $1,890.45
5. Alice Johnson (USA) $1,250.50
Results are cached in DuckDB - you can run follow-up queries
against this data without hitting Snowflake again.
You've run your first query. The results are cached in your workspace's DuckDB instance, and the query file is saved for reuse. From here, you can ask follow-up questions against the cached data, connect additional sources, or start building on the workflows below.
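The query file itself is ordinary SQL. A sketch of what queries/snowflake-demo-datasource/top_customers.sql might contain, assuming the column names shown in the schema above (CUSTOMER_NAME, COUNTRY, TOTAL_PURCHASES) and the ANALYTICS.CUSTOMERS qualification - the actual file the AI writes may differ:

```sql
-- Top 5 customers by lifetime purchase total
SELECT
    CUSTOMER_NAME,
    COUNTRY,
    TOTAL_PURCHASES
FROM ANALYTICS.CUSTOMERS
ORDER BY TOTAL_PURCHASES DESC
LIMIT 5;
```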
5. Connect your Own Data
When you're ready to work with production data, ask your AI to generate a connector link. You configure credentials in a secure web form - they're encrypted and never exposed to the AI.
I want to connect my PostgreSQL database.
AI calls generate_connector_url("pg")
→ https://mcp.marcopolo.dev/app/datasources/new?step=configure&type=pg
AI: Open this link to configure your PostgreSQL connection.
You'll enter your host, port, database name, and credentials
in a secure form. Once saved, I can query it immediately.
Once configured, the new source appears alongside your demo data. You can query it immediately.
What tables are in my Postgres database?
AI calls get_schema("my-postgres", database="public")
→ 14 tables: customers, orders, invoices, products, ...
AI calls get_schema("my-postgres", database="public", table="customers")
→ 8 columns: id, name, email, created_at, plan, mrr, ...
From here, the workflow is the same: your AI writes queries, caches results in DuckDB, and you iterate. All your data sources - demo and production - are available in the same workspace.
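Because results land in DuckDB, follow-up questions run against the cached table instead of the warehouse. A sketch of the kind of query the AI might issue next, assuming the cached table and column names from the earlier top_customers result:

```sql
-- Re-slice the cached result without a round trip to Snowflake
SELECT COUNTRY, SUM(TOTAL_PURCHASES) AS total
FROM top_customers
GROUP BY COUNTRY
ORDER BY total DESC;
```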
Next Steps
- Connect your data sources: Add the databases, warehouses, cloud storage, and SaaS applications you work with regularly.
- Open the Web App: Browse the files, tables, and dashboards your AI has created in your persistent workspace.
- Write a RULES.md: Define your business logic and metric definitions so the AI generates accurate queries consistently.
- Security overview: Review how MarcoPolo uses isolated containers and encrypted credential storage to protect your data.
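A RULES.md is plain markdown prose the AI reads before writing queries. A minimal sketch of the kind of definitions that keep generated queries consistent - the metric names and conventions here are invented examples, reusing columns from the demo schema (ACCOUNT_STATUS, TOTAL_PURCHASES, EMAIL):

```markdown
# RULES.md

## Metric definitions
- "Active customer": a row in ANALYTICS.CUSTOMERS with ACCOUNT_STATUS = 'active'.
- "Revenue": use TOTAL_PURCHASES; do not re-sum raw order lines.

## Query conventions
- Exclude rows where EMAIL is NULL (test accounts).
- All currency amounts are USD unless a column name says otherwise.
```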