Metrophon | Devpost

Inspiration

If you've spent much time learning from video lectures, YouTube tutorials or any other prerecorded educational material, you'll have noticed that the 'Information Density'* (informally, the amount of brain cells required to process what's currently being discussed) fluctuates throughout the content.

Almost everyone deals with this without consciously noticing it: fast-forwarding through the intro (BEFORE WE CONTINUE, WE THANK OUR SPONSORS RAID SHA-) and outro, skimming through the description of the tutorial (usually at 1.5x speed), and slowing down or pausing the video at the main sections that really matter, where you need time to process the information.

*'Information Density' can be loosely defined, in an academic sense, as the amount and complexity of new information imparted per unit of time (or per page), assuming you satisfy the course prerequisites.

What it does

Infuriated by the bottleneck this poses (and having resorted to reading audio transcripts instead), we've built Metrophon, an AI-powered tool that smooths out the 'Information Density' for you by splitting lectures into segments and adjusting the playback speed of each one, so you spend your time only on what's important while still watching the entire video.

Saves time, reduces mental lethargy, and empowers you to speedrun your education!
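
To make the idea concrete, here is a minimal Python sketch of how a per-segment density score could be turned into playback speeds. The function names, the speed bounds and the inverse-proportional rule are illustrative assumptions, not necessarily what Metrophon does internally.

```python
def playback_speeds(densities, min_speed=0.75, max_speed=2.0):
    """Map per-segment density scores to playback speeds.

    Assumed rule: speed is inversely proportional to density, scaled so a
    segment of average density plays at 1.0x, then clamped to the bounds.
    """
    if not densities:
        return []
    mean = sum(densities) / len(densities)
    speeds = []
    for d in densities:
        # A zero-density segment (e.g. silence) plays as fast as allowed.
        raw = mean / d if d > 0 else max_speed
        speeds.append(max(min_speed, min(max_speed, raw)))
    return speeds


def adjusted_duration(segment_seconds, speeds):
    """Total watch time after each segment is played at its own speed."""
    return sum(t / s for t, s in zip(segment_seconds, speeds))
```

Dense segments end up below 1.0x and light segments above it, so the viewer still sees every segment but spends proportionally more time on the parts that need it.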

How we built it

A broad outline of its internals is as follows:

First, the YouTube video is accessed, along with its activity heatmap and transcript.
The topic summarization functionality of Symbl.ai is then used to determine the core topics.
We then clean the transcript data and generate several features using both standard and custom NLP algorithms: Flesch-Kincaid readability, lexical diversity, topic frequency, co-occurrence matrices, implication weights, etc.
Using the heatmap as our labels, we train an ML model with PyTorch, which predicts the 'Information Density' graph.
This graph is then used to split the video, speed up the segments appropriately and merge them back together, giving the result a normalized information density.
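
As an illustration of the feature step above, two of the standard features can be sketched in a few lines of Python. This is a simplified sketch rather than the project's actual feature code: the function names are hypothetical, the syllable counter is a rough vowel-group heuristic, lexical diversity is taken to be a plain type-token ratio, and the transcript is assumed to contain sentence punctuation.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count runs of vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def transcript_features(text):
    """Return Flesch-Kincaid grade level and lexical diversity for a text chunk."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return {"fk_grade": 0.0, "lexical_diversity": 0.0}

    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade level formula.
    fk_grade = (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )
    # Type-token ratio: unique words divided by total words.
    unique = {w.lower() for w in words}
    return {"fk_grade": fk_grade, "lexical_diversity": len(unique) / len(words)}
```

Running this over fixed-length windows of the transcript would give one feature vector per window, which could then be lined up against the heatmap value for the same time range.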

Challenges we ran into

Building the model - Being the first of its kind, Metrophon required a lot of work, from scraping hundreds of videos from YouTube and extracting their heatmaps and transcripts, to brainstorming, experimenting with and compiling appropriate features to determine 'Information Density'. With no precedents at all, there was a lot of uncertainty around our assumption that information density correlates with activity heatmaps. To our great joy, the correlation turned out to be a good one!
Timezones - We were spread across 4 different timezones, with teammates halfway across the globe. However, we managed to schedule our work well (and pulled a few all-nighters), which allowed for real-time collaboration.
Spaghetti code - Rise, statically typed languages gang. Personally speaking, I would never write production-grade applications in Python.
Full frontend-backend integration - Due to time constraints, complete frontend-backend integration wasn't possible, so while the client is running, the loading bar shows up in the console log. However, this could be remedied with a few extra hours of work.

What's next for Metrophon

The power of this tech lies in its scalability: agnostic of the topic, it can be used on many forms of educational content such as podcast episodes, audiobooks, cooking and music videos, in addition to the usual OCW lectures. Future plans include extending the software to pinpoint the core sections of articles and research papers, and auto-generating synopses for Google Books and the like, eliminating a lot of fluff.

We'll probably optimise it for speed and host a website that lets anyone watch YouTube videos through it.