Inspiration
Selali has been working in the HCI space for nearly a decade, and Avi has been building multi-sensory, multi-modal GenAI experiences for the past couple of years.
The recent wave of AI wearables hints at a brand-new paradigm of HCI (human-computer interaction): agent-computer interfaces. The r1 was a $100M attempt at redefining our interaction with compute as we know it, but it didn't deliver on its promises.
Why wait for the future, when you can build it yourself?
What it does
m1: a cross-platform PWA that lets users create with the latest GenAI tools through a voice interface
- Core functionality (a minimal routing sketch follows this list):
  - Ultra-low-latency multimodal conversational agent
  - Generate songs on the fly with Suno and get a link to a web-app DAW
  - Generate research reports with Exa
  - Operate your computer from anywhere in the world with just your voice using OpenInterpreter
  - Control your Philips Hue lights through Milo
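Here's a minimal sketch, in TypeScript with the OpenAI Node SDK, of how a transcribed utterance could be routed to these tools via function calling. The tool names, parameter schemas, and model choice are illustrative assumptions for this sketch, not our exact implementation.

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Illustrative tool schemas; names and parameters are assumptions.
const tools: OpenAI.Chat.Completions.ChatCompletionTool[] = [
  {
    type: "function",
    function: {
      name: "generate_song",
      description: "Generate a song from a text prompt via Suno",
      parameters: {
        type: "object",
        properties: { prompt: { type: "string" } },
        required: ["prompt"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "set_hue_light",
      description: "Turn a Philips Hue light on or off",
      parameters: {
        type: "object",
        properties: { lightId: { type: "string" }, on: { type: "boolean" } },
        required: ["lightId", "on"],
      },
    },
  },
];

// Route one transcribed user utterance to a tool call (or a plain reply).
export async function routeUtterance(transcript: string) {
  const res = await openai.chat.completions.create({
    model: "gpt-4o", // model name is an assumption for this sketch
    messages: [{ role: "user", content: transcript }],
    tools,
  });
  return res.choices[0].message; // contains either text or tool_calls
}
```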
How we built it
TypeScript/Next.js, Deepgram/Cartesia (Vapi?), Whisper/Nova-2, Suno AI, Exa AI, OpenAI, Vercel, OpenInterpreter
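The core loop wires Deepgram speech-to-text into an OpenAI completion, then text-to-speech. A rough sketch of one turn, assuming the Deepgram v3 JS SDK; the TTS step is left as a comment rather than guessing at Cartesia's client API:

```typescript
import { createClient } from "@deepgram/sdk";
import OpenAI from "openai";

const deepgram = createClient(process.env.DEEPGRAM_API_KEY!);
const openai = new OpenAI();

// One turn of the voice loop: audio URL in, assistant text out.
export async function voiceTurn(audioUrl: string): Promise<string> {
  // 1. Speech-to-text with Deepgram's Nova-2 model.
  const { result, error } = await deepgram.listen.prerecorded.transcribeUrl(
    { url: audioUrl },
    { model: "nova-2", smart_format: true }
  );
  if (error || !result) throw error ?? new Error("no transcription result");
  const transcript = result.results.channels[0].alternatives[0].transcript;

  // 2. LLM turn (system prompt omitted; see Challenges below).
  const completion = await openai.chat.completions.create({
    model: "gpt-4o", // model choice is an assumption for this sketch
    messages: [{ role: "user", content: transcript }],
  });
  const reply = completion.choices[0].message.content ?? "";

  // 3. Text-to-speech would happen here (Cartesia in our stack); omitted
  //    so this sketch doesn't misstate its client API.
  return reply;
}
```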
Challenges we ran into
We initially started with the concept of "chat with a website," but it didn't seem exciting enough, so we pivoted toward IoT hardware. We were missing some cables, so we pivoted again, this time to multi-device, multi-agent interactions that maximize user experience.
We also ran into issues getting CrewAI working with our TypeScript framework. And most LLMs don't sound natural or conversational without extensive prompt engineering; an example of the kind of prompt this takes follows.
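For illustration, a persona-style system prompt of the sort we iterated on (the wording here is a sketch, not the prompt we shipped):

```typescript
// Illustrative system prompt; the shipped prompt went through many iterations.
const SYSTEM_PROMPT = `You are Milo, a friendly voice assistant.
Speak the way a person talks: short sentences, contractions, no bullet
points or markdown. Keep replies under three sentences unless the user
asks for detail. Never read out URLs; say you'll send a link instead.`;
```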
Accomplishments that we're proud of
Reducing technical debt for creators, and shipping a working demo.
Use Cases:
Agents and their use cases are a multi-trillion-dollar industry that will redefine work and culture as we know it. One answer to the search for meaning in this moment is to create and build, and GenAI and agentic systems reduce the technical debt of doing so. With the recent trend of AI wearables, we wanted to see how we could improve functionality and reduce costs, and what better way than a serverless, cross-platform PWA?
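On the PWA point: with the Next.js App Router, an `app/manifest.ts` route like the sketch below is all it takes to make the app installable on desktop and mobile. The name, colors, and icon paths here are placeholders, not our actual config.

```typescript
// app/manifest.ts — makes the Next.js app installable as a PWA.
// Name, colors, and icon paths below are placeholders.
import type { MetadataRoute } from "next";

export default function manifest(): MetadataRoute.Manifest {
  return {
    name: "m1",
    short_name: "m1",
    description: "Voice-first GenAI creation tools",
    start_url: "/",
    display: "standalone",
    background_color: "#000000",
    theme_color: "#000000",
    icons: [
      { src: "/icon-192.png", sizes: "192x192", type: "image/png" },
      { src: "/icon-512.png", sizes: "512x512", type: "image/png" },
    ],
  };
}
```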
What's next for Milo—M1
Feature Roadmap:
- OpenInterpreter
- MultiOn for browser interactions
- Ability to choose between Suno and Udio for generations
- 2D-to-3D image pipeline (HF API/Meshy)
- Rewind RAG
- OctoML (text-based image editing)
- LTX Studio / Runway / CrewAI Vid
- CoT Researcher
- Stock Analyst
Built With
- deepgram
- exa
- nextjs
- open-interpreter
- openai
- philips-hue
- suno
- vercel
- whisper