Inspiration

The Rabbit R1 was a good idea, but it's hobbled by being closed source.

What it does

It gives GPT-4o a body and lets it use tools.

How we built it

We used the following tech:

  • Silero VAD running in the browser via ONNX and WebGPU
  • Whisper-Base via WebGPU
  • Deepgram Aura for TTS
  • Mailgun for emails
  • Spotify for song integration
  • Suno.ai for song creation
  • Next.js/React
  • Web Streams for frontend-backend integration
  • Clerk.dev for user authentication

Challenges we ran into

  • Latency was a challenge, so we turned to WebGPU to get the time to first response as low as possible
  • Midway through, we realized it was still too slow and had to rewrite our networking stack to use web streaming
  • Our Raspberry Pi didn't support WebGPU, so we couldn't build the hardware component of our project :(
  • Working with OAuth and authentication on a short timeline is really stressful!
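
To illustrate the web streaming rewrite mentioned above, here is a minimal sketch using the standard Web Streams API. It is not our actual networking code: `streamFromParts` and `readStream` are hypothetical helper names, and the array of text parts stands in for a model response that arrives piece by piece.

```typescript
// Hypothetical helper: wraps text parts in a ReadableStream of UTF-8 bytes,
// standing in for a response that is produced incrementally on the backend.
function streamFromParts(parts: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  let i = 0;
  return new ReadableStream<Uint8Array>({
    pull(controller) {
      if (i < parts.length) {
        controller.enqueue(encoder.encode(parts[i++]));
      } else {
        controller.close();
      }
    },
  });
}

// Hypothetical helper: reads the stream chunk by chunk, passing each decoded
// piece to onChunk as soon as it arrives, and returns the full text at the end.
async function readStream(
  stream: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void,
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let full = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    const text = decoder.decode(value, { stream: true });
    full += text;
    onChunk(text);
  }
  // Flush any bytes the decoder was still holding
  const tail = decoder.decode();
  if (tail) {
    full += tail;
    onChunk(tail);
  }
  return full;
}
```

On the backend, a stream like this can be returned directly with `new Response(stream)`, and on the frontend the `body` of a `fetch` response is itself a `ReadableStream`, so the UI can start acting on the first chunk instead of waiting for the whole reply.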

Accomplishments that we're proud of

  • We're beating the Rabbit R1 on latency since we allow the AI to respond in parts, rather than all at once
  • We built out an agent that learns novel capabilities in context by self-assembling tools
  • Doing transcription and VAD on-device helps privacy tremendously: we don't send raw audio to third parties, and we're less likely to pick up and record non-consenting parties than most AI hardware projects these days
  • We got a real domain for our project!
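
One way to let the AI respond in parts is to cut the streamed text into sentences and hand each one off for speech as soon as it is complete. The sketch below shows that idea only; `SentenceChunker` is a hypothetical class, not our actual implementation, and the sentence boundary rule (punctuation followed by whitespace) is a simplifying assumption.

```typescript
// Hypothetical class: buffers streamed text fragments and releases
// complete sentences as soon as they are available.
class SentenceChunker {
  private buffer = "";

  // Adds a streamed fragment and returns any sentences it completed.
  push(fragment: string): string[] {
    this.buffer += fragment;
    const sentences: string[] = [];
    // Assumption: a sentence ends at ., ! or ? followed by whitespace.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.buffer.slice(start, end).trim());
      start = end;
    }
    // Keep the unfinished remainder for the next fragment
    this.buffer = this.buffer.slice(start);
    return sentences;
  }

  // Returns whatever text is left once the stream has ended.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest ? [rest] : [];
  }
}
```

Each sentence returned by `push` could be sent to text-to-speech immediately, so audio for the first sentence can play while the rest of the reply is still being generated. Because a boundary needs trailing whitespace, a sentence that ends exactly at the end of a fragment is held until the next fragment or `flush`.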

What we learned

  • Latency is a huge factor in how it feels to use an agent
  • Raspberry Pis have a long way to go on Vulkan/WebGPU support

What's next for Open Rabbit

  • Clean up the UI
  • Get this project running on a Raspberry Pi
  • Share it with the world and potentially start a Kickstarter!
