about the project
what inspired me
got frustrated seeing how many images on the web have zero alt text. screen readers just say "image, image, image" which is useless. figured with all the vision AI models out there now, this should be solvable.
what I learned
- chrome extensions are actually pretty powerful - with the right host permissions you can inject code into any webpage
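- openai's api can't directly fetch most image URLs because lots of hosts block hotlinking or unknown clients (and cors stops the extension from reading the image bytes itself in the browser), so you have to proxy them through your server and convert to base64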
- manifest v3 service workers are different from the old background scripts - they don't stay alive permanently
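the service worker point is easiest to see in code: anything kept in a global variable is gone when the worker gets shut down. here's a minimal sketch, assuming the stats are kept in `chrome.storage.local` instead. `bumpCounter` and the `generatedCount` key are made-up names for illustration, not the extension's actual code.

```javascript
// Hypothetical helper: keep a counter in extension storage so it survives
// the service worker being stopped and restarted.
// `storage` is anything with promise-based get/set, e.g. chrome.storage.local.
async function bumpCounter(storage, key = "generatedCount") {
  const stored = await storage.get(key); // resolves to { [key]: value } or {}
  const next = (stored[key] ?? 0) + 1;
  await storage.set({ [key]: next });
  return next;
}

// In the service worker this would be called as:
//   bumpCounter(chrome.storage.local)
```

passing the storage object in as a parameter also makes the helper easy to try outside the browser with a fake object.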
how I built it
extension side:
- content script scans the DOM for <img> tags and checks which ones have empty/missing alt attributes
- background service worker handles API calls and manages the context menu
- popup shows stats and lets you generate alt text for specific images
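the scanning step in the content script could look roughly like this. it's a sketch under assumptions: `findImagesMissingAlt` is a hypothetical name, and it treats whitespace-only alt text as empty.

```javascript
// Hypothetical helper: return the <img> elements under `root` whose alt
// attribute is missing, empty, or only whitespace.
function findImagesMissingAlt(root) {
  return Array.from(root.querySelectorAll("img")).filter((img) => {
    const alt = img.getAttribute("alt"); // null when the attribute is absent
    return alt === null || alt.trim() === "";
  });
}

// In a content script: findImagesMissingAlt(document)
```

one caveat: `alt=""` is also the standard way to mark purely decorative images, so a stricter version might only flag images with no alt attribute at all.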
backend:
- express server that takes image URLs
- fetches the image using axios with proper user-agent headers
- converts to base64
- sends to gpt-4o vision api
- returns the generated description
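the base64 and request-building steps above can be sketched as two small functions. this is an illustration, not the actual server code: the function names, the prompt wording and the `max_tokens` value are placeholders, and the express route and the axios fetch (e.g. with `responseType: "arraybuffer"`) are left out so the snippet runs on plain node.

```javascript
// Turn raw image bytes into a data URL (the "converts to base64" step).
function toDataUrl(bytes, contentType) {
  return `data:${contentType};base64,${Buffer.from(bytes).toString("base64")}`;
}

// Build a chat completions request body with the image attached.
// The prompt text and max_tokens are placeholder values.
function buildVisionRequest(dataUrl) {
  return {
    model: "gpt-4o",
    max_tokens: 100,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "Write concise alt text for this image." },
          { type: "image_url", image_url: { url: dataUrl } },
        ],
      },
    ],
  };
}
```

the server would then POST that body to the chat completions endpoint and send the text of the first choice back to the extension.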
Built With
- axios
- base64
- chat-gpt-api
- chrome
- css
- dom-api
- extension
- html
- javascript
- manifest-v3
- node.js