about the project

what inspired me

got frustrated seeing how many images on the web have zero alt text. screen readers just say "image, image, image" which is useless. figured with all the vision AI models out there now, this should be solvable.

what I learned

  • chrome extensions are actually pretty powerful - you can inject code into any webpage
  • openai's api can't directly access most image URLs due to cors/blocking, so you have to proxy them through your server and convert to base64
  • manifest v3 service workers are different from the old background scripts - they don't stay alive permanently

how I built it

extension side:

  • content script scans the DOM for <img> tags and checks which ones have empty/missing alt attributes
  • background service worker handles API calls and manages the context menu
  • popup shows stats and lets you generate alt text for specific images

backend:

  • express server that takes image URLs
  • fetches the image using axios with proper user-agent headers
  • converts to base64
  • sends to gpt-4o vision api
  • returns the generated description

Built With

Share this project:

Updates