Inspiration
Track: Traditional. As someone who has recently started working in Deep Learning and Object Detection with YOLOv3, I was always troubled by having to draft hundreds of lines of OpenCV scripts just to run a single image or video. Another thing that bothered me a lot was the tedious process of training models. It could take hours to figure out the right kind of files and where to place them. The entire process can easily be automated, letting developers focus on the creative aspects of their projects rather than the repetitive ones. My main inspiration was to give back to the developer community by creating a project that makes Object Detection with YOLOv3 simple, understandable, and straightforward. I struggled with the hassle of this process myself, and for future developers diving into YOLO, this could be an awesome starting point. I also think that for a lot of companies working with Object Detection, constantly creating and testing models is an integral part of their day-to-day routine. By accelerating this process, we can free up a significant amount of their time for the work they actually want to do.
What it does
Auto-YOLO essentially automates the entire process of creating and running models. The project has three sections: Image, Video, and Train. Developers can pick their own YOLOv3 model and run it on any medium they like. All it takes is pasting in a couple of paths and uploading the file, and boom, they are done. Behind the scenes, I studied the YOLOv3 process and created an execution algorithm with OpenCV's help. The output is right there on the screen. Since this is a dev-side project, for videos we also output a frame-by-frame result to give developers a sense of what's going on in the background. The coolest aspect of the project is the auto-train. Within seconds, developers can provide paths to their data and annotations, fill in a few basic details, and just hit run. We take care of the rest. It makes the development side of things far more efficient.
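To make the auto-train idea more concrete, here is a minimal sketch of the kind of file preparation a Darknet-style YOLOv3 training run usually needs. It is not the actual Auto-YOLO code: the function names `yolo_filters` and `write_darknet_files` are hypothetical, the file names `obj.names`, `obj.data`, and `train.txt` follow a common Darknet convention, and reusing the training list as the validation list is a simplification made only for this example.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def yolo_filters(num_classes: int) -> int:
    """Filters for the conv layer before each YOLOv3 detection layer.

    Each of the 3 anchors per scale predicts 4 box coordinates,
    1 objectness score, and one score per class.
    """
    return (num_classes + 5) * 3


def write_darknet_files(image_dir, class_names, out_dir, backup_dir="backup"):
    """Write obj.names, train.txt, and obj.data (hypothetical helper)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Collect image files only; anything else in the folder is skipped.
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )

    names_file = out / "obj.names"
    train_file = out / "train.txt"
    data_file = out / "obj.data"

    # One class name per line, in class-index order.
    names_file.write_text("\n".join(class_names) + "\n")
    # One image path per line.
    train_file.write_text("".join(f"{p.as_posix()}\n" for p in images))
    # Assumption for this sketch: the training list doubles as the validation list.
    data_file.write_text(
        f"classes = {len(class_names)}\n"
        f"train = {train_file.as_posix()}\n"
        f"valid = {train_file.as_posix()}\n"
        f"names = {names_file.as_posix()}\n"
        f"backup = {backup_dir}\n"
    )
    return data_file
```

With something like this in place, the developer only supplies an image folder and a list of class names; the value from `yolo_filters` is what would then be written into the model's `.cfg` file for a custom class count.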
How I built it
Being a newbie to DL myself, comparing the many ways to approach this project was a task in itself. After studying YOLOv3 and OpenCV, I created a simple execution algorithm using Darknet, blob processing, non-maximum suppression (NMS), and other key aspects of the YOLO algorithm. I then extended it to handle video as well. To introduce the training side of things, I first drafted all the necessary steps and variables, then used server-request and file-management libraries to automate the process. While the time taken to train the model itself will invariably remain the same, the time taken to get to that point is cut significantly.
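For readers new to NMS, the sketch below shows the greedy idea in plain Python: keep the highest-scoring box, drop any remaining box that overlaps it too much, and repeat. It is an illustration under assumptions rather than the project's own code. The function names and the threshold defaults are made up for the example, and boxes are assumed to be `(x, y, w, h)` tuples with a top-left origin. In an OpenCV pipeline this step is typically handled by `cv2.dnn.NMSBoxes`.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Greedy non-maximum suppression; returns the indices of kept boxes."""
    # Discard low-confidence detections, then visit the rest best-first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already-kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 10, 10), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.3]
kept = nms(boxes, scores)  # indices of the boxes that survive suppression
```

In this example the second box overlaps the first heavily and is suppressed, the fourth falls below the score threshold, and the distant third box is kept alongside the first.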
Challenges I ran into
The UI was significantly hard, considering I had no previous experience with it. I tried PyQt, but that didn't work out so well, so I ended up with Streamlit, whose UI is excellent. Another challenge I ran into was the Auto-Train algorithm. It took five variations to get it to execute correctly, because it had to follow the exact steps outlined by the original developers of YOLO. Hosting is another hurdle I hit, and I haven't quite figured it out yet. For now, the code is available for free on GitHub with all steps outlined. I want to integrate a terminal into my web app because a large chunk of what is happening is output there, so in the meantime, letting developers run it themselves and check their own terminals was the best option. I am working towards a solution for that as well.
Accomplishments that I'm proud of
Stringing everything together. Launching a product that can impact so many developers and really accelerate what the DL community is trying to achieve. Making it open-source. This is a big deal for me because it's my first ever project in DL.
What I learned
I learned a lot about Streamlit, documentation, deadlines, and most importantly how to be resourceful. Since my project targets efficiency, I strove to replicate that in every aspect of the development process.
What's next for Auto-YOLO
This is just a template version to show the impact and scope of the technology. After adding more features such as live cameras, YouTube streams, and additional customization of the current features, we plan on partnering with companies working in computer vision and distributing this software. For the open-source developer community, we will work consistently on maintaining a free, usable, and effective version of Auto-YOLO.
