Inspiration

We were inspired by the idea of combining text-generation and image-generation models to create a fully automated comics generator.

What it does

The project takes three inputs: a short description of the situation, a style for the comics, and a size parameter. The user receives a generated comic image that can be downloaded.
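As a rough illustration of how the three inputs could be turned into a request for the image model, here is a minimal sketch. The class name `ComicRequest`, its methods, and the prompt wording are hypothetical and not taken from the project. It also assumes the size parameter means the output resolution, restricted to the square sizes that DALL-E 2 accepts.

```python
from dataclasses import dataclass

# Square resolutions accepted by the DALL-E 2 image endpoint.
ALLOWED_SIZES = (256, 512, 1024)


@dataclass
class ComicRequest:
    """Hypothetical container for the three user inputs."""
    description: str  # short description of the situation
    style: str        # comics style, e.g. "manga"
    size: int         # assumed to be the side length of the image in pixels

    def image_prompt(self) -> str:
        # Combine the situation and the style into one text prompt.
        return f"{self.description.strip()}, in the style of {self.style.strip()} comics"

    def size_param(self) -> str:
        # Format the size as "WIDTHxHEIGHT" and reject unsupported values.
        if self.size not in ALLOWED_SIZES:
            raise ValueError(f"size must be one of {ALLOWED_SIZES}, got {self.size}")
        return f"{self.size}x{self.size}"


request = ComicRequest("A cat saves the city ", "manga", 512)
print(request.image_prompt())
print(request.size_param())
```

Validating the size before calling the model avoids spending an API request on a value that would be rejected anyway.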

How we built it

We started our project by setting up the GPT-3 and DALL-E 2 models. The next step was to add an object detector, which finds the location of the main character in the image. The location is very important because we use it to place the speech bubbles afterwards. After that, we assembled all the parts to create a baseline and set up a web application. We spent the remaining time tuning and improving the project and creating marketing and monetisation plans.
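To show why the character's location matters for the speech bubbles, here is a minimal sketch of one possible placement rule. It is not the project's actual logic: the function `place_bubble`, the `(x, y, w, h)` box format, and the margin are all assumptions made for illustration. The rule tries to put the bubble above the character, and falls back to the right or left side when there is no room above.

```python
def place_bubble(image_w, image_h, box, bubble_w, bubble_h, margin=10):
    """Return the top-left (x, y) for a speech bubble near a character.

    box is the detected character bounding box as (x, y, w, h) in pixels,
    with the origin in the top-left corner of the image (assumed format).
    """
    x, y, w, h = box

    # Centre the bubble horizontally over the character, kept inside the image.
    bx = x + w // 2 - bubble_w // 2
    bx = max(margin, min(bx, image_w - bubble_w - margin))

    # Preferred position: directly above the character.
    if y - bubble_h - margin >= 0:
        return bx, y - bubble_h - margin

    # Otherwise align the bubble with the top of the box and try the sides.
    by = max(margin, min(y, image_h - bubble_h - margin))
    if x + w + margin + bubble_w <= image_w:
        return x + w + margin, by      # right of the character
    if x - margin - bubble_w >= 0:
        return x - margin - bubble_w, by  # left of the character

    # Last resort: top of the image, possibly overlapping the character.
    return bx, margin


# Character in the lower half of a 512x512 image: the bubble fits above it.
print(place_bubble(512, 512, (200, 300, 100, 150), 160, 80))
```

The returned coordinates could then be passed to a drawing routine, for example an OpenCV ellipse plus text, to render the bubble itself.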

Challenges we ran into

One of the most challenging parts was setting up a good object detector for character bounding box regression. A model that is pre-trained on a dataset of real people has difficulty recognising superheroes or cartoon characters.

Accomplishments that we're proud of

We are proud that we managed to create a complete application that contains everything the user needs. The current version is available for everyone here to test. We also managed to leave enough time to focus on the marketing and monetisation strategy, which is one of the most important parts of the project.

What we learned

We learned how to work with the OpenAI API, how to process images with the OpenCV library, and how to do object detection.

What's next for Comics Generator

This project definitely has a future. Many things can be implemented to make it more specialised, depending on the area where it is applied. One change that could greatly improve the quality is sending more detailed requests to the models. It would also be nice to give the user more freedom in creating the comics, for example by letting them choose among multiple candidate images.
