Inspiration
I often do a lot of repetitive browser tasks manually, usually taking up a long time.
What it does
The agent takes the uploaded screen recording (of you performing the manual browser task) and then automates it.
How we built it
I built it using Nextjs, Twelvelabs (multimodal API), OpenAI and Hyperbrowser (browser agent).
Challenges we ran into
Difficulty in constructing accurate steps for agent.
Accomplishments that we're proud of
Working demo that works for sending an email.
What we learned
Learned how to make browser agent effective and useful for everyday tasks.
What's next for Browsor Agent
Run multiple tasks in parallel.
Built With
- hyperbrowser
- nextjs
- openai
- twelvelabs
Log in or sign up for Devpost to join the conversation.