Inspiration

Apple, OpenAI, and others are racing to discover the next paradigm for computing. We believe we have an answer: an LLM-based Operating System that makes your computer alive.

We’re introducing AgentOS, the system that enables your computer to reason, act, and interact on your behalf. Summon it with Command-Enter, and watch it write essays on Word, play a song on Spotify, and even train an ML model locally.

What it does

AgentOS gives your computer direct access to LLMs, and then commands it with voice. It can use your apps, edit your files, text your friends, and analyze data. This is more than an assistant: it’s a new hands-free solution, especially for seniors or physically disabled individuals.

You can access and try it (on Mac) here. Examples to get started:

  • “Send a text message to Ray saying What’s up?”
  • “Take a screenshot and send it to Robert on Slack”
  • “Analyze ~/housing2019.csv and train a simple regression model on it”

Siri leaves a lot to be desired – in fact, it easily fails on the above.

How we built it

To ensure smooth development on Apple, we chose to build our frontend on SwiftUI. AgentOS utilizes OpenAI’s Whisper-1 model to transcribe voice to commands, and GPT-4 to translate high-level requests into Apple-native commands. These Applescripts are finally processed and run through crafty, devious usage of SwiftUI.

Challenges we ran into

The OpenAI API is heavily limited in regards to its knowledge of the capabilities of Applescript, and many of our challenges came through designing high-quality prompts to generate working scripts.

Accomplishments that we're proud of

We thought that the most interesting part of our project was discovering how impactful prompt-engineering was. We spent a lot of time designing and tuning the examples included in our prompt, so that GPT-4 had sufficient information to extrapolate and perform tasks that weren’t specified in the prompt itself.

What we learned

SwiftUI is a pain. On a more serious note, Apple tools have much looser restrictions than Windows or Linux with respect to interacting with applications via the command line.

What's next for AgentOS

In the future, we want AgentOS to work on all devices, including Windows and Linux. We hope to be able to perform an even broader range of tasks, such as interacting with the web or going beyond keyboard/mouse input.

Try it

Here Summon with Command-Enter (Mac-only).

Built With

Share this project:

Updates