Inspiration

I do a lot of repetitive browser tasks by hand, and they take up a lot of time.

What it does

The agent takes an uploaded screen recording of you performing a browser task by hand and then automates that task.

How we built it

I built it using Next.js, TwelveLabs (multimodal API), OpenAI, and Hyperbrowser (browser agent).

Challenges we ran into

Constructing accurate steps for the agent to follow was difficult.
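
One way to make step construction less error-prone is to validate the model's output before it reaches the browser agent. The sketch below is an illustration only, not the project's actual code: the `Step` format, the three action names, and the `parseSteps` helper are all assumptions made for this example.

```typescript
// Hypothetical step format. The project's real schema is not described here.
type Step =
  | { action: "navigate"; url: string }
  | { action: "click"; target: string }
  | { action: "type"; target: string; text: string };

type ParseResult = { steps: Step[]; errors: string[] };

const isNonEmptyString = (v: unknown): v is string =>
  typeof v === "string" && v.trim().length > 0;

// Parse the model's JSON output and keep only the steps that pass validation.
function parseSteps(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { steps: [], errors: ["output is not valid JSON"] };
  }
  if (!Array.isArray(data)) {
    return { steps: [], errors: ["expected a JSON array of steps"] };
  }

  const steps: Step[] = [];
  const errors: string[] = [];
  data.forEach((item: unknown, i: number) => {
    const { action, url, target, text } = (item ?? {}) as Record<string, unknown>;
    if (action === "navigate" && isNonEmptyString(url)) {
      steps.push({ action: "navigate", url });
    } else if (action === "click" && isNonEmptyString(target)) {
      steps.push({ action: "click", target });
    } else if (action === "type" && isNonEmptyString(target) && typeof text === "string") {
      steps.push({ action: "type", target, text });
    } else {
      // Report the 1-based position so the step can be regenerated or fixed.
      errors.push(`step ${i + 1}: unknown action or missing field`);
    }
  });
  return { steps, errors };
}
```

If `errors` is non-empty, the caller could ask the model to regenerate the steps instead of running a partial plan.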

Accomplishments that we're proud of

A working demo that automates sending an email.

What we learned

I learned how to make a browser agent effective and useful for everyday tasks.

What's next for Browsor Agent

Run multiple tasks in parallel.
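
As a rough sketch of what parallel execution could look like, the helper below runs several async tasks with a cap on how many are in flight at once. It is an assumption about one possible design, not planned project code: `runWithLimit` and `Outcome` are hypothetical names, and each task would wrap whatever call starts one browser-agent session.

```typescript
// Result of one task: either its value or the error it threw.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

// Run async tasks with at most `limit` in flight. Results keep input order,
// and one failing task does not stop the others.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<Outcome<T>[]> {
  const results: Outcome<T>[] = new Array(tasks.length);
  let next = 0;

  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++; // claimed synchronously, so no two workers share an index
      try {
        results[i] = { ok: true, value: await tasks[i]() };
      } catch (error) {
        results[i] = { ok: false, error };
      }
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A cap is useful here because each browser session consumes memory and API quota, so launching every task at once may not be practical.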

Built With

  • hyperbrowser
  • nextjs
  • openai
  • twelvelabs