Inspiration

The inspiration was a project my sister conducted on DNA discovery. Part of the effort involved combing through quotes from public figures. Most of what constitutes immediately actionable knowledge from key decision makers still occurs in spoken form. That knowledge is generally locked inside video files, where it cannot be searched or easily accessed.

What it does

Vidivox lets users ask ChatGPT questions about what public figures have recently been discussing on various topics. It then searches public video and audio on YouTube, Twitter, Facebook, and public podcasts to locate the key discussions within those recordings.

How we built it

Hugging Face Hub offers models for speech recognition, along with models such as pyannote for segmentation by speaker and diarization (i.e., determining who is speaking). These models were set up as Flask endpoints running in a GPU-enabled AWS server environment.
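As a rough illustration of how recognition and diarization output could be combined behind a Flask endpoint, here is a minimal sketch. It is not the actual Vidivox code: the function `attribute_speakers`, the `/attribute` route, and the input shape (lists of dicts with `start`, `end`, `text`, and `speaker` keys) are all assumptions made for the example, and the model calls themselves are left out.

```python
from typing import Dict, List


def attribute_speakers(transcript: List[Dict], turns: List[Dict]) -> List[Dict]:
    """Label each transcript segment with the speaker whose turn overlaps it most.

    Both inputs are lists of dicts with "start" and "end" times in seconds.
    Transcript items also carry "text"; turn items also carry "speaker".
    These shapes are assumptions for this sketch, not a library format.
    """
    labeled = []
    for seg in transcript:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            # Length of the time interval shared by the segment and the turn.
            overlap = min(seg["end"], turn["end"]) - max(seg["start"], turn["start"])
            if overlap > best_overlap:
                best_speaker, best_overlap = turn["speaker"], overlap
        # Segments that overlap no turn keep speaker=None.
        labeled.append({**seg, "speaker": best_speaker})
    return labeled


def create_app():
    """Build a Flask app exposing the helper as a JSON endpoint (hypothetical route)."""
    # Imported here so the helper above can be used without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/attribute", methods=["POST"])
    def attribute():
        payload = request.get_json(force=True)
        result = attribute_speakers(payload["transcript"], payload["turns"])
        return jsonify(result)

    return app
```

In a real deployment the endpoint would presumably accept an audio file, run the models on the GPU, and return the labeled segments; taking precomputed segments keeps the sketch small and testable on a laptop.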

Challenges we ran into

-The killer was access to GPUs. When you can't call over to the Lambda in the next room to run your models, you appreciate how constrained cloud compute has become as these models have grown so quickly.
-As a result of the time spent working through the environment, I drafted but did not test the OpenAI integration.

Accomplishments that we're proud of

-Successfully tested the concept and stood up working endpoints for search and Q&A.

What we learned

-The concept is sound. There is a calculation to do regarding when the cost of compute falls enough to make mass indexing possible.
-Development was slower without corporate resources, a real reminder to carefully consider the constraints of a startup environment.
-Working solo is a reminder of the value brought by others. This is a team sport for a reason.

What's next for Vidivox

-Finish and launch before EOD at vidivox.net
