actuallyarjun/rover.

AI Personal Assistant

A comprehensive AI assistant system with voice, text, browser automation, and productivity integrations.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                               USER INTERFACES                               │
├─────────────────┬─────────────────┬─────────────────┬───────────────────────┤
│   Voice (VAPI)  │  Text (iMessage)│   Mobile App    │   24/7 Recorder       │
│  User + Outreach│  via BlueBubbles│  (coming soon)  │   (coming soon)       │
└────────┬────────┴────────┬────────┴────────┬────────┴───────────┬───────────┘
         │                 │                 │                    │
         ▼                 ▼                 ▼                    ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                      ORCHESTRATOR (Claude 4.5 Sonnet)                       │
│                              AWS EC2 Instance                               │
│  ┌──────────────────────────────────────────────────────────────────────┐   │
│  │                             Tool Router                              │   │
│  │  • Communication Tools    • Browser Automation    • Credentials      │   │
│  │  • Composio Integrations  • User Confirmation     • Context RAG      │   │
│  └──────────────────────────────────────────────────────────────────────┘   │
└────────┬────────┬────────┬────────┬────────┬────────┬───────────────────────┘
         │        │        │        │        │        │
         ▼        ▼        ▼        ▼        ▼        ▼
┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌─────────────────┐
│   VAPI     │ │  Texting   │ │ Stagehand  │ │ Passwords  │ │    Composio     │
│  Voice AI  │ │  iMessage  │ │  Browser   │ │  Manager   │ │ Gmail/Cal/Slack │
└────────────┘ └────────────┘ └────────────┘ └────────────┘ └─────────────────┘

Components

1. Orchestrator (/orchestrator)

Central brain powered by Claude 4.5 Sonnet with tool use capabilities.

Features:

  • Processes requests from all interfaces (voice, text, app)
  • Routes to appropriate tools based on intent
  • Manages conversation context per user
  • Coordinates multi-step tasks
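
The routing and per-user context described above could be sketched roughly as follows. This is a minimal illustration in Python, not the orchestrator's actual TypeScript code: the tool names are taken from the example flows in this README, while the registry, the handler signatures, and the `dispatch` helper are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical registry: tool name -> handler. The handler signature
# (dict in, str out) is an assumption made for this sketch.
TOOLS: Dict[str, Callable[[dict], str]] = {}


def tool(name: str):
    """Register a handler under a tool name."""
    def decorator(fn):
        TOOLS[name] = fn
        return fn
    return decorator


@tool("get_calendar_events")
def get_calendar_events(args: dict) -> str:
    return f"(stub) events for {args.get('date', 'today')}"


@tool("browse_web")
def browse_web(args: dict) -> str:
    return f"(stub) browsing with goal: {args['goal']}"


# Per-user conversation context, keyed by a user identifier.
contexts: Dict[str, List[dict]] = defaultdict(list)


def dispatch(user_id: str, tool_name: str, args: dict) -> str:
    """Run the requested tool and record the call in that user's context."""
    handler = TOOLS.get(tool_name)
    if handler is None:
        result = f"unknown tool: {tool_name}"
    else:
        result = handler(args)
    contexts[user_id].append({"tool": tool_name, "args": args, "result": result})
    return result
```

In the real system the model chooses the tool name and arguments; here `dispatch` is called directly so the routing step can be read in isolation.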

Endpoints:

  • POST /message - Main message processing
  • POST /webhook/text - Text message webhook
  • POST /webhook/voice - VAPI voice webhook
  • GET /health - Health check
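
As a rough illustration of how a client might talk to the main endpoint, the sketch below builds a JSON POST to /message. The README does not document the request schema, so the field names `userId` and `message`, and the port 3000, are placeholders rather than the real contract.

```python
import json
import urllib.request


def build_message_request(base_url: str, user_id: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the orchestrator's /message endpoint."""
    # "userId" and "message" are assumed field names, not the documented schema.
    body = json.dumps({"userId": user_id, "message": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/message",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The host and port below are placeholders for your EC2 orchestrator address.
req = build_message_request("http://localhost:3000", "user-1", "What's on my calendar today?")
# To actually send it: urllib.request.urlopen(req, timeout=10)
```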

2. Voice Integration (/vapi)

VAPI-powered voice agents for phone interactions.

Agents:

  • User Agent - Receives calls from users
  • Outreach Agent - Makes calls to businesses/contacts on the user's behalf

Flow:

User → Phone Call → VAPI User Agent → Orchestrator → [Tool Execution] → Response
                                          ↓
                            VAPI Outreach Agent → Pizza Place (example)

3. Text Messaging (/comms)

iMessage/SMS integration via BlueBubbles + local Mac bridge.

Components:

  • server/ - Node.js server on EC2 for message processing
  • mac-bridge/ - Python script on Mac for iMessage database access

Flow:

User → iMessage → Mac (chat.db) → Bridge → EC2 Server → Orchestrator → Reply → Mac → iMessage
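
The "Mac (chat.db) → Bridge" step amounts to polling a SQLite database for new incoming rows. The sketch below shows one way that could look. It is not the code in bridge_direct.py: it assumes a cut-down version of the Messages schema (`message` and `handle` tables) and runs against an in-memory database so it works anywhere. On a real Mac the file lives at ~/Library/Messages/chat.db and reading it requires Full Disk Access.

```python
import sqlite3


def fetch_new_messages(conn: sqlite3.Connection, last_rowid: int):
    """Return (rowid, sender, text) for incoming messages newer than last_rowid."""
    return conn.execute(
        """
        SELECT m.ROWID, h.id, m.text
        FROM message AS m
        JOIN handle AS h ON h.ROWID = m.handle_id
        WHERE m.ROWID > ? AND m.is_from_me = 0 AND m.text IS NOT NULL
        ORDER BY m.ROWID
        """,
        (last_rowid,),
    ).fetchall()


# Demo database with only the columns this sketch needs (an assumption,
# the real chat.db has many more columns and tables).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE handle (ROWID INTEGER PRIMARY KEY, id TEXT);
    CREATE TABLE message (ROWID INTEGER PRIMARY KEY, text TEXT,
                          handle_id INTEGER, is_from_me INTEGER);
    INSERT INTO handle VALUES (1, '+15550100');
    INSERT INTO message VALUES (1, 'old message', 1, 0);
    INSERT INTO message VALUES (2, 'Order me a pizza', 1, 0);
    INSERT INTO message VALUES (3, 'On it!', 1, 1);
    """
)

# Only the unseen incoming message (ROWID 2) is returned; the bridge would
# forward it to the EC2 server and remember the highest ROWID it has seen.
new_messages = fetch_new_messages(conn, last_rowid=1)
```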

4. Browser Automation (/stagehand)

AI-powered browser automation using Stagehand.

Capabilities:

  • Navigate websites
  • Fill forms
  • Extract information
  • Complete purchases (with credential access)

Use Cases:

  • Order food/products
  • Make reservations
  • Research and data extraction
  • Account management

5. Password Manager (/passwords)

Secure credential storage with encryption.

Features:

  • AES-256-GCM encryption
  • Prisma + PostgreSQL storage
  • CSV import (1Password, LastPass, Bitwarden, Chrome)
  • Audit logging
  • Rate limiting

Integration: Stagehand can request credentials when authentication is needed for browser tasks.
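
To make the audit logging and rate limiting features concrete, here is a minimal in-memory sketch of a gate that a credential request from Stagehand might pass through. The class name, the limits, and the log format are all assumptions for illustration; the actual service stores data with Prisma + PostgreSQL and may implement both features differently.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class CredentialGate:
    """Sliding-window rate limiter that also records every access attempt."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # requester -> timestamps of allowed requests
        self.audit_log = []              # (timestamp, requester, site, allowed)

    def allow(self, requester: str, site: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[requester]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        allowed = len(hits) < self.max_requests
        if allowed:
            hits.append(now)
        # Denied attempts are logged too, so the audit trail shows abuse.
        self.audit_log.append((now, requester, site, allowed))
        return allowed
```

A caller would check `gate.allow("stagehand", "dominos.com")` before decrypting anything, and return an error instead of the credential when it is False.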

6. Composio MCP (/composio)

Productivity integrations via Composio's Model Context Protocol.

Supported Services:

  • Gmail - Search, read, send emails
  • Google Calendar - View, create events
  • Notion - Search, read, create pages
  • Slack - Send messages, search channels
  • Google Docs/Sheets

Authentication: Per-user OAuth via Composio's connection flow.

7. Prompts (/prompts)

System prompts for all agents and components.

Files:

  • orchestrator.md - Main orchestrator behavior
  • vapi-user.md - User voice agent
  • vapi-outreach.md - Outreach voice agent
  • texting-user.md - Text messaging (user)
  • texting-outreach.md - Text messaging (outreach)
  • stagehand.md - Browser automation
  • passwords.md - Credential manager

Setup

Prerequisites

  • AWS EC2 instance (Amazon Linux 2023)
  • Mac with macOS for iMessage bridge
  • Node.js 20+
  • Python 3.9+
  • Docker (optional)

1. Clone Repository

git clone https://github.com/yourusername/assistant.git
cd assistant

2. Set Up EC2 Environment

cd scripts
./setup_ec2_env.sh

Then SSH into the EC2 instance and edit the .env files to add your API keys (see API Keys Required).

3. Deploy Services

./deploy_all.sh

4. Configure VAPI (Optional)

  1. Sign up at vapi.ai
  2. Create two assistants using configs from vapi/
  3. Purchase phone numbers
  4. Set webhook URLs to your EC2 orchestrator

5. Configure Composio (Optional)

  1. Sign up at composio.dev
  2. Get API key
  3. Users authenticate via OAuth redirect flow

6. Start Mac Bridge

cd comms/mac-bridge
pip install -r requirements.txt
python bridge_direct.py

API Keys Required

| Service     | Environment Variable | Purpose                   |
|-------------|----------------------|---------------------------|
| Anthropic   | ANTHROPIC_API_KEY    | Claude LLM                |
| VAPI        | VAPI_API_KEY         | Voice agents              |
| Composio    | COMPOSIO_API_KEY     | Productivity integrations |
| OpenAI      | OPENAI_API_KEY       | Stagehand vision          |
| Browserbase | BROWSERBASE_API_KEY  | Cloud browser (optional)  |
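
Before deploying, it can help to check that the keys are actually set. The helper below is a small sketch, not part of the repository. It treats only Browserbase as optional, following the list above; since the VAPI and Composio setup steps are themselves marked optional, you may want to move those keys to the optional list too.

```python
import os

REQUIRED = ["ANTHROPIC_API_KEY", "VAPI_API_KEY", "COMPOSIO_API_KEY", "OPENAI_API_KEY"]
OPTIONAL = ["BROWSERBASE_API_KEY"]


def missing_keys(env=os.environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


# Pass a dict to check a parsed .env file instead of the process environment.
example_missing = missing_keys({"ANTHROPIC_API_KEY": "placeholder-value"})
```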

Example Flows

"Order me a pizza"

1. User texts: "Order me a large pepperoni from Domino's"
2. Orchestrator receives via texting webhook
3. Orchestrator calls browse_web tool with goal
4. Stagehand navigates to Domino's
5. If login needed → get_password tool
6. If payment needed → request_user_confirmation + get_credit_card
7. Order placed, confirmation sent back to user via text
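
The key ordering in steps 6 and 7 is that payment details are fetched only after the user says yes. A sketch of that gate is shown below, with the two tools passed in as plain callables. The `place_order` function and its signature are hypothetical; only the tool names request_user_confirmation and get_credit_card come from the flow above.

```python
from typing import Callable


def place_order(goal: str, total_usd: float,
                confirm: Callable[[str], bool],
                get_card: Callable[[], str]) -> str:
    """Only fetch payment details after the user has explicitly confirmed."""
    prompt = f"Confirm purchase: {goal} for ${total_usd:.2f}? (yes/no)"
    if not confirm(prompt):          # stands in for request_user_confirmation
        return "cancelled: user declined"
    card = get_card()                # stands in for get_credit_card
    return f"ordered: {goal} (card ending {card[-4:]})"
```

Because `get_card` is never called on the declined path, a refusal leaves no credential access to log.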

"What's on my calendar today?"

1. User calls VAPI user line
2. Voice transcribed, sent to orchestrator
3. Orchestrator calls get_calendar_events (Composio)
4. Events retrieved from user's Google Calendar
5. Response spoken back via VAPI

"Call the dentist and reschedule my appointment"

1. User texts request
2. Orchestrator determines outreach call needed
3. Orchestrator calls make_outreach_call with task
4. VAPI outreach agent calls dentist office
5. Agent negotiates new appointment
6. Result reported back to user via text

Directory Structure

assistant/
├── orchestrator/       # Central brain (Claude 4.5 Sonnet)
│   ├── src/
│   │   ├── index.ts    # Express server
│   │   ├── agent.ts    # Claude agent with tools
│   │   └── tools/      # Tool implementations
│   └── Dockerfile
├── prompts/            # All agent prompts
├── vapi/               # Voice agent integration
│   └── src/
├── composio/           # MCP integrations
│   └── src/
├── comms/              # Text messaging (iMessage)
│   ├── server/         # EC2 Node.js server
│   ├── mac-bridge/     # Local Mac Python bridge
│   └── scripts/
├── stagehand/          # Browser automation
│   └── src/
├── passwords/          # Credential management
│   ├── prisma/
│   └── src/
├── scripts/            # Deployment scripts
└── README.md

Future Roadmap

  • Mobile app (React Native)
  • 24/7 audio recorder with speaker diarization
  • RAG system for context retrieval
  • Meta glasses integration
  • Accessibility browser for visually impaired
  • Multi-user support with proper auth

Security Notes

  • All credentials encrypted at rest (AES-256-GCM)
  • Credential access logged for audit
  • Rate limiting on sensitive operations
  • User confirmation required for purchases
  • Shared secrets for inter-service auth
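
The README does not say how the shared secrets are used between services. One common approach, shown here purely as an assumption, is to sign each request body with HMAC-SHA256 and have the receiver recompute and compare the signature in constant time.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected, signature)
```

The sender would attach the output of `sign` as a request header (the header name is up to you), and the receiving service would reject any request for which `verify` returns False.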

License

MIT

About

TAMUHack 2026 - Rover
