A comprehensive AI assistant system with voice, text, browser automation, and productivity integrations.

    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                               USER INTERFACES                               │
    ├─────────────────┬─────────────────┬─────────────────┬───────────────────────┤
    │  Voice (VAPI)   │ Text (iMessage) │   Mobile App    │     24/7 Recorder     │
    │ User + Outreach │ via BlueBubbles │  (coming soon)  │     (coming soon)     │
    └────────┬────────┴────────┬────────┴────────┬────────┴───────────┬───────────┘
             │                 │                 │                    │
             ▼                 ▼                 ▼                    ▼
    ┌─────────────────────────────────────────────────────────────────────────────┐
    │                      ORCHESTRATOR (Claude 4.5 Sonnet)                       │
    │                              AWS EC2 Instance                               │
    │  ┌───────────────────────────────────────────────────────────────────────┐  │
    │  │                              Tool Router                              │  │
    │  │  • Communication Tools     • Browser Automation   • Credentials       │  │
    │  │  • Composio Integrations   • User Confirmation    • Context RAG       │  │
    │  └───────────────────────────────────────────────────────────────────────┘  │
    └────────┬────────┬────────┬────────┬────────┬────────┬───────────────────────┘
             │        │        │        │        │        │
             ▼        ▼        ▼        ▼        ▼        ▼
    ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌─────────────────┐
    │    VAPI    │ │  Texting   │ │ Stagehand  │ │ Passwords  │ │    Composio     │
    │  Voice AI  │ │  iMessage  │ │  Browser   │ │  Manager   │ │ Gmail/Cal/Slack │
    └────────────┘ └────────────┘ └────────────┘ └────────────┘ └─────────────────┘

The orchestrator is the central brain of the system, powered by Claude 4.5 Sonnet with tool-use capabilities.
Features:
- Processes requests from all interfaces (voice, text, app)
- Routes to appropriate tools based on intent
- Manages conversation context per user
- Coordinates multi-step tasks
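As a rough illustration of the routing step, the sketch below dispatches a model-issued tool call to a registered handler. The `tool_use` and `tool_result` block shapes follow the Anthropic Messages API; the `ToolRouter` class, its method names, and the stub handler are hypothetical and not taken from this repository. Handlers are synchronous here only for brevity; real tools would almost certainly be asynchronous.

```typescript
// Hypothetical tool router: maps tool names to handlers and wraps their output
// in the "tool_result" shape that is sent back to the model.
type ToolInput = Record<string, unknown>;
type ToolHandler = (input: ToolInput, userId: string) => string;

interface ToolUseBlock {
  type: "tool_use";
  id: string;
  name: string;
  input: ToolInput;
}

interface ToolResultBlock {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
}

class ToolRouter {
  private readonly handlers = new Map<string, ToolHandler>();

  register(toolName: string, handler: ToolHandler): void {
    this.handlers.set(toolName, handler);
  }

  // Run the handler named in the tool_use block; report failures as error results
  // so the model can react instead of the request crashing.
  dispatch(block: ToolUseBlock, userId: string): ToolResultBlock {
    const handler = this.handlers.get(block.name);
    if (!handler) {
      return {
        type: "tool_result",
        tool_use_id: block.id,
        content: `Unknown tool: ${block.name}`,
        is_error: true,
      };
    }
    try {
      return { type: "tool_result", tool_use_id: block.id, content: handler(block.input, userId) };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { type: "tool_result", tool_use_id: block.id, content: message, is_error: true };
    }
  }
}

// Stub handler standing in for the real browser tool.
const toolRouter = new ToolRouter();
toolRouter.register("browse_web", (input) => `browsing for: ${String(input["goal"])}`);

const toolResult = toolRouter.dispatch(
  { type: "tool_use", id: "toolu_01", name: "browse_web", input: { goal: "order pizza" } },
  "user-1",
);
console.log(toolResult.content);
```

Returning an error result for unknown or failing tools keeps the conversation going and lets the model choose a fallback.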
Endpoints:
- `POST /message` - Main message processing
- `POST /webhook/text` - Text message webhook
- `POST /webhook/voice` - VAPI voice webhook
- `GET /health` - Health check
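To make the endpoint list concrete, here is a minimal route table for the four paths. Only the methods and paths come from this README; the handler bodies, response shapes, and the `handleRequest` helper are placeholders. The real server is an Express app, which this sketch does not attempt to reproduce.

```typescript
// Hypothetical route table keyed by "METHOD path"; responses are placeholders.
interface EndpointResponse {
  status: number;
  body: Record<string, unknown>;
}
type EndpointHandler = (payload: unknown) => EndpointResponse;

const endpoints: Record<string, EndpointHandler> = {
  "POST /message": (payload) => ({ status: 200, body: { source: "message", payload } }),
  "POST /webhook/text": (payload) => ({ status: 200, body: { source: "text", payload } }),
  "POST /webhook/voice": (payload) => ({ status: 200, body: { source: "voice", payload } }),
  "GET /health": () => ({ status: 200, body: { status: "ok" } }),
};

// Look up the handler for a request; anything else is a 404.
function handleRequest(method: string, path: string, payload?: unknown): EndpointResponse {
  const handler = endpoints[`${method.toUpperCase()} ${path}`];
  return handler ? handler(payload) : { status: 404, body: { error: "Not found" } };
}

console.log(handleRequest("GET", "/health"));
```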
VAPI-powered voice agents for phone interactions.
Agents:
- User Agent - Receives calls from users
- Outreach Agent - Makes calls to businesses/contacts on the user's behalf
Flow:

    User → Phone Call → VAPI User Agent → Orchestrator → [Tool Execution] → Response
                                               ↓
                                      VAPI Outreach Agent → Pizza Place (example)

iMessage/SMS integration via BlueBubbles + local Mac bridge.
Components:
- `server/` - Node.js server on EC2 for message processing
- `mac-bridge/` - Python script on Mac for iMessage database access
Flow:

    User → iMessage → Mac (chat.db) → Bridge → EC2 Server → Orchestrator → Reply → Mac → iMessage

AI-powered browser automation using Stagehand.
Capabilities:
- Navigate websites
- Fill forms
- Extract information
- Complete purchases (with credential access)
Use Cases:
- Order food/products
- Make reservations
- Research and data extraction
- Account management
Secure credential storage with encryption.
Features:
- AES-256-GCM encryption
- Prisma + PostgreSQL storage
- CSV import (1Password, LastPass, Bitwarden, Chrome)
- Audit logging
- Rate limiting
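
For readers unfamiliar with AES-256-GCM, the following sketch shows one way to encrypt and decrypt a secret with Node's built-in `crypto` module. The `EncryptedSecret` shape, the function names, and the base64 encoding are assumptions for illustration; the project's actual schema and key management are not described here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage shape: every field is base64-encoded.
interface EncryptedSecret {
  iv: string; // 12-byte nonce, generated fresh for every encryption
  authTag: string; // 16-byte GCM authentication tag
  ciphertext: string;
}

function encryptSecret(plaintext: string, key: Buffer): EncryptedSecret {
  if (key.length !== 32) throw new Error("AES-256 requires a 32-byte key");
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    authTag: cipher.getAuthTag().toString("base64"),
    ciphertext: ciphertext.toString("base64"),
  };
}

function decryptSecret(secret: EncryptedSecret, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(secret.iv, "base64"));
  decipher.setAuthTag(Buffer.from(secret.authTag, "base64"));
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(secret.ciphertext, "base64")),
    decipher.final(), // throws if the ciphertext or tag has been tampered with
  ]);
  return plaintext.toString("utf8");
}

// Demo key only: a real deployment would load the key from a secret store.
const demoKey = randomBytes(32);
const storedSecret = encryptSecret("hunter2", demoKey);
console.log(decryptSecret(storedSecret, demoKey));
```

GCM authenticates as well as encrypts, so a modified ciphertext or tag fails at `decipher.final()` rather than yielding garbage.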
Integration: Stagehand can request credentials when authentication is needed for browser tasks.
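
One way to picture this integration together with the audit-logging and rate-limiting features is a small broker that sits between the browser agent and the credential store. The `CredentialBroker` class, the sliding-window limit of five requests per minute, and the in-memory log are all hypothetical choices made for this sketch.

```typescript
// Hypothetical broker: every lookup is rate-limited per requester and audit-logged.
interface AuditEntry {
  at: number; // timestamp in ms
  requester: string;
  site: string;
  allowed: boolean;
}

class CredentialBroker {
  private readonly auditLog: AuditEntry[] = [];
  private readonly recent = new Map<string, number[]>(); // requester -> allowed request times
  private readonly lookup: (site: string) => string | undefined;
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(lookup: (site: string) => string | undefined, maxRequests = 5, windowMs = 60_000) {
    this.lookup = lookup;
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns the credential, or undefined when rate-limited or not found.
  request(requester: string, site: string, now: number = Date.now()): string | undefined {
    const windowStart = now - this.windowMs;
    const timestamps = (this.recent.get(requester) ?? []).filter((t) => t > windowStart);
    const allowed = timestamps.length < this.maxRequests;
    if (allowed) timestamps.push(now);
    this.recent.set(requester, timestamps);
    this.auditLog.push({ at: now, requester, site, allowed }); // denied attempts are logged too
    return allowed ? this.lookup(site) : undefined;
  }

  entries(): readonly AuditEntry[] {
    return this.auditLog;
  }
}

// Stub store standing in for the encrypted database.
const vault = new Map([["dominos.com", "s3cret"]]);
const broker = new CredentialBroker((site) => vault.get(site), 2, 60_000);
console.log(broker.request("stagehand", "dominos.com", 1_000));
```

Logging denied attempts alongside successful ones makes unusual access patterns visible in the audit trail.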
Productivity integrations via Composio, exposed through the Model Context Protocol (MCP).
Supported Services:
- Gmail - Search, read, send emails
- Google Calendar - View, create events
- Notion - Search, read, create pages
- Slack - Send messages, search channels
- Google Docs/Sheets
Authentication: Per-user OAuth via Composio's connection flow.
System prompts for all agents and components.
Files:
- `orchestrator.md` - Main orchestrator behavior
- `vapi-user.md` - User voice agent
- `vapi-outreach.md` - Outreach voice agent
- `texting-user.md` - Text messaging (user)
- `texting-outreach.md` - Text messaging (outreach)
- `stagehand.md` - Browser automation
- `passwords.md` - Credential manager
Prerequisites:
- AWS EC2 instance (Amazon Linux 2023)
- Mac with macOS for iMessage bridge
- Node.js 20+
- Python 3.9+
- Docker (optional)
Clone the repository:

    git clone https://github.com/yourusername/assistant.git
    cd assistant

Set up the EC2 environment:

    cd scripts
    ./setup_ec2_env.sh

Then SSH to EC2 and edit the .env files with your API keys.

Deploy all services:

    ./deploy_all.sh

VAPI setup:
- Sign up at vapi.ai
- Create two assistants using configs from `vapi/`
- Purchase phone numbers
- Set webhook URLs to your EC2 orchestrator
Composio setup:
- Sign up at composio.dev
- Get API key
- Users authenticate via OAuth redirect flow
Mac bridge setup:

    cd comms/mac-bridge
    pip install -r requirements.txt
    python bridge_direct.py

API keys:

| Service | Environment Variable | Purpose |
|---|---|---|
| Anthropic | `ANTHROPIC_API_KEY` | Claude LLM |
| VAPI | `VAPI_API_KEY` | Voice agents |
| Composio | `COMPOSIO_API_KEY` | Productivity integrations |
| OpenAI | `OPENAI_API_KEY` | Stagehand vision |
| Browserbase | `BROWSERBASE_API_KEY` | Cloud browser (optional) |
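
A startup check along these lines can catch a missing key before the first request fails. It assumes every key except the Browserbase one is required, which the table implies but does not state outright; the `missingKeys` helper is hypothetical.

```typescript
// Assumption: only BROWSERBASE_API_KEY is optional, per the "(optional)" note in the table.
const REQUIRED_KEYS = [
  "ANTHROPIC_API_KEY",
  "VAPI_API_KEY",
  "COMPOSIO_API_KEY",
  "OPENAI_API_KEY",
] as const;

type Env = Record<string, string | undefined>;

// Return the required keys that are unset or blank.
function missingKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => (env[key] ?? "").trim() === "");
}

const exampleEnv: Env = {
  ANTHROPIC_API_KEY: "sk-example",
  VAPI_API_KEY: "vapi-example",
  COMPOSIO_API_KEY: "   ", // whitespace only counts as missing
};
console.log(missingKeys(exampleEnv));
```

In a Node service this would typically be called once at boot as `missingKeys(process.env)`, exiting with a clear message if the result is non-empty.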
Example workflow, food order by text:
1. User texts: "Order me a large pepperoni from Domino's"
2. Orchestrator receives via texting webhook
3. Orchestrator calls browse_web tool with goal
4. Stagehand navigates to Domino's
5. If login needed → get_password tool
6. If payment needed → request_user_confirmation + get_credit_card
7. Order placed, confirmation sent back to user via text
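
The conditional steps above can be sketched as a small function that takes the tools as an injected interface. The tool names come from the steps; their signatures, return types, and the `runOrder` helper are assumptions, and the calls are synchronous only to keep the sketch short.

```typescript
// Hypothetical signatures for the tools named in the workflow.
interface OrderTools {
  browse_web(goal: string): { loginRequired: boolean; paymentRequired: boolean };
  get_password(site: string): string;
  request_user_confirmation(summary: string): boolean;
  get_credit_card(): string;
}

type OrderOutcome = "placed" | "cancelled";

// Walk the workflow, recording each tool call in `trace` for inspection.
function runOrder(goal: string, site: string, tools: OrderTools, trace: string[] = []): OrderOutcome {
  trace.push("browse_web");
  const page = tools.browse_web(goal);
  if (page.loginRequired) {
    trace.push("get_password");
    tools.get_password(site);
  }
  if (page.paymentRequired) {
    trace.push("request_user_confirmation");
    // Card details are fetched only after the user approves the purchase.
    if (!tools.request_user_confirmation(`Confirm purchase: ${goal}`)) return "cancelled";
    trace.push("get_credit_card");
    tools.get_credit_card();
  }
  return "placed";
}

// Stub tools where the user declines the purchase; the site name is illustrative.
const callTrace: string[] = [];
const orderOutcome = runOrder(
  "large pepperoni pizza",
  "dominos.com",
  {
    browse_web: () => ({ loginRequired: true, paymentRequired: true }),
    get_password: () => "stub-password",
    request_user_confirmation: () => false,
    get_credit_card: () => "stub-card",
  },
  callTrace,
);
console.log(orderOutcome, callTrace);
```

Ordering the confirmation before the card lookup means a declined purchase never touches payment credentials.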
Example workflow, calendar check by voice:
1. User calls VAPI user line
2. Voice transcribed, sent to orchestrator
3. Orchestrator calls get_calendar_events (Composio)
4. Events retrieved from user's Google Calendar
5. Response spoken back via VAPI
Example workflow, outreach call to a dentist office:
1. User texts request
2. Orchestrator determines outreach call needed
3. Orchestrator calls make_outreach_call with task
4. VAPI outreach agent calls dentist office
5. Agent negotiates new appointment
6. Result reported back to user via text

    assistant/
    ├── orchestrator/           # Central brain (Claude 4.5 Sonnet)
    │   ├── src/
    │   │   ├── index.ts        # Express server
    │   │   ├── agent.ts        # Claude agent with tools
    │   │   └── tools/          # Tool implementations
    │   └── Dockerfile
    ├── prompts/                # All agent prompts
    ├── vapi/                   # Voice agent integration
    │   └── src/
    ├── composio/               # MCP integrations
    │   └── src/
    ├── comms/                  # Text messaging (iMessage)
    │   ├── server/             # EC2 Node.js server
    │   ├── mac-bridge/         # Local Mac Python bridge
    │   └── scripts/
    ├── stagehand/              # Browser automation
    │   └── src/
    ├── passwords/              # Credential management
    │   ├── prisma/
    │   └── src/
    ├── scripts/                # Deployment scripts
    └── README.md

Roadmap:
- Mobile app (React Native)
- 24/7 audio recorder with speaker diarization
- RAG system for context retrieval
- Meta glasses integration
- Accessibility browser for visually impaired
- Multi-user support with proper auth
Security:
- All credentials encrypted at rest (AES-256-GCM)
- Credential access logged for audit
- Rate limiting on sensitive operations
- User confirmation required for purchases
- Shared secrets for inter-service auth
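
The README does not say how the shared secrets are used. One common scheme, shown here purely as an assumption, is for the sender to HMAC-sign the request body and for the receiver to verify the signature in constant time. The function names and the hex encoding are illustrative.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sender side: compute an HMAC-SHA256 signature over the raw request body.
function signBody(body: string, sharedSecret: string): string {
  return createHmac("sha256", sharedSecret).update(body).digest("hex");
}

// Receiver side: recompute the signature and compare in constant time.
function verifySignature(body: string, signature: string, sharedSecret: string): boolean {
  const expected = Buffer.from(signBody(body, sharedSecret), "hex");
  const received = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}

const requestBody = JSON.stringify({ text: "hi" });
const requestSignature = signBody(requestBody, "dev-secret");
console.log(verifySignature(requestBody, requestSignature, "dev-secret"));
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct.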
License: MIT