INSPIRATION

Suppose someone posts a picture on social media without a caption. Every viewer interprets it differently, so the picture loses its intended meaning. A caption gives the poster and the viewer a clear, shared meaning for the picture, which makes it an effective way of communicating your emotions and intent.

WHAT IT DOES

In simple terms, given an image, it generates a caption. The user uploads any image of their choice, and we feed it to Google Cloud's Vision API to generate meaningful labels. Using the labels and their salience scores, the algorithm searches the internet for content that relates to the image. That content is then processed by Google Cloud's Natural Language API to produce the best-matching entities related to the meaning of the text. Lastly, a heuristic function matches the most appropriate caption to the image.
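As a rough illustration of that last matching step, here is a minimal sketch of one possible heuristic. It is not the project's actual function, which is not described here. It assumes the image labels arrive as a mapping from label to score, and that each candidate caption carries a mapping from entity to salience. The names `Candidate`, `caption_score`, and `best_caption` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """A candidate caption plus its entities (hypothetical structure)."""
    text: str
    entities: Dict[str, float]  # entity name -> salience in [0, 1]


def caption_score(labels: Dict[str, float], candidate: Candidate) -> float:
    """Sum label_score * entity_salience over names found in both mappings.

    Assumption: a simple overlap-weighted sum stands in for the real heuristic.
    Names are compared case-insensitively.
    """
    entities = {name.lower(): sal for name, sal in candidate.entities.items()}
    return sum(
        score * entities.get(label.lower(), 0.0)
        for label, score in labels.items()
    )


def best_caption(
    labels: Dict[str, float], candidates: List[Candidate]
) -> Optional[str]:
    """Return the text of the highest-scoring candidate, or None if empty."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: caption_score(labels, c)).text


# Made-up example data, for illustration only.
example_labels = {"Dog": 0.97, "Beach": 0.88}
example_candidates = [
    Candidate("Life is better at the beach with a dog.",
              {"beach": 0.6, "dog": 0.3}),
    Candidate("The mountains are calling.", {"mountains": 0.9}),
]
print(best_caption(example_labels, example_candidates))
```

With this made-up data, the first candidate wins because it shares two names with the image labels, while the second shares none.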

BUILDING PROCESS

Given the size of the project, we decided to break it down into several stages.

Stage 1) Setting up the Vision API along with the credentials.
Stage 2) Creating a script to scrape quotes online.
Stage 3) Setting up the Natural Language API along with the credentials.
Stage 4) Designing the heuristic function.
Stage 5) Integrating both APIs to work together.
Stage 6) Building a user interface for generating captions.
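
The way these stages fit together can be sketched as a single function that takes each stage as a callable. This is only an illustration under assumptions: the function name `generate_caption`, the parameter names, and the stub stages below are hypothetical, and the real Vision and Natural Language API calls are replaced by stand-ins so the sketch runs without credentials.

```python
from typing import Callable, Dict, List, Optional

Labels = Dict[str, float]    # label -> score (what Stage 1 would produce)
Entities = Dict[str, float]  # entity -> salience (what Stage 3 would produce)


def generate_caption(
    image_bytes: bytes,
    label_image: Callable[[bytes], Labels],
    find_quotes: Callable[[List[str]], List[str]],
    analyze_entities: Callable[[str], Entities],
    score: Callable[[Labels, Entities], float],
) -> Optional[str]:
    """Run the stages in order and return the best-scoring quote, if any."""
    labels = label_image(image_bytes)
    # Search with the highest-scoring labels first.
    terms = sorted(labels, key=lambda name: labels[name], reverse=True)
    best_text, best_score = None, float("-inf")
    for text in find_quotes(terms):
        current = score(labels, analyze_entities(text))
        if current > best_score:
            best_text, best_score = text, current
    return best_text


# Stand-in stages (assumptions, not real API calls).
def fake_label_image(_image: bytes) -> Labels:
    return {"dog": 0.9, "beach": 0.8}


def fake_find_quotes(_terms: List[str]) -> List[str]:
    return ["A dog is a friend for life.",
            "The beach is my happy place with my dog."]


def fake_analyze_entities(text: str) -> Entities:
    # Treat every word as an entity with equal salience.
    words = text.lower().replace(".", "").split()
    return {word: 1.0 / len(words) for word in words}


def overlap_score(labels: Labels, entities: Entities) -> float:
    return sum(s * entities.get(name, 0.0) for name, s in labels.items())


print(generate_caption(b"", fake_label_image, fake_find_quotes,
                       fake_analyze_entities, overlap_score))
```

Passing the stages in as arguments keeps each one replaceable, so a stub can be swapped for a real API client once its credentials are set up.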

CHALLENGES

One challenge we faced was that Google Chrome wasn't clearing its cache, which prevented the web application from updating with the new front-end changes we made. Another challenge was setting up the credentials for automated verification of API requests. Given the difficulty of this project relative to our overall experience, we also found it hard to learn and implement solutions to complex technical problems.

ACCOMPLISHMENTS

1) Successfully completing our desired goal.
2) Integrating and learning about the APIs.
3) Exploring and implementing new technologies.
4) Last but not least, improving our teamwork and problem-solving skills.

NEXT STEPS

Smartphone use is very high nowadays. Although we have made an application for Windows, we would like to make our software accessible to all types of smartphone users. Captioning software makes the most sense on mobile, since the majority of people use social media on their smartphones.

Built With

cloud-natural-language
cloud-vision-api
css
flask
html
json
python

Updates

Shahil Patel started this project — Jan 27, 2019 08:25 AM EST

Leave feedback in the comments!
