Sportlight by The Unknowns
Inspiration - Problem Statement #3 by Experion Technologies
Publishing highlights after a sports game is a mandatory practice in the broadcast industry. A highlights video gives viewers a quick summary of the game by showing its most interesting events. In a manual process, a video editor has to go through the entire game, identify candidate clips for the highlights and compile them into a single video. This can be a cumbersome and time-consuming process.
What it does
The application uses AI-enabled methods to automatically generate a highlights data feed from an input video file. The output data feed contains the start and end timestamps of interesting clips from the given video. For example, for a cricket game the clips may contain the fall of wickets, exceptional fielding, batsmen hitting boundaries, etc. Finally, the highlights of that particular game are displayed to the user sequentially.
How we built it
- We would be building a web application, where the user can directly upload the video to the web server built using Node.js.
- The video will be processed frame by frame. The audio will be converted to text through an API call to symbl.ai.
- The obtained text, along with its timestamps, will be processed by a Python script that uses NLP tools such as NLTK and WordNet to interpret the text.
- Firstly, the script would pre-process the data by removing unwanted tokens such as punctuation, special characters, connectives and other unnecessary words.
- The script will then convert each phrase into its vector form, so that a phrase is represented in n-dimensional vector space. A wordnet of some common cricket highlight words would be created and stored for later use.
- All of this is carried out keeping in mind the emotions of the commentators in the audio and the words they speak, for example “bowled” and “that’s a six” (for cricket).
- The phrase would be passed through a TF-IDF vectorizer. Different similarity checks would then be carried out to measure the similarity between the received text and the highlight words.
- Once we get the check results, we can use them to decide whether that particular frame should be used. If yes, we would output the start and end timestamps of the highlight (considering a 5-second time interval).
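The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the project's actual script: it uses the standard library instead of NLTK, and the stop-word list, the highlight phrases, the 0.5 threshold, and the reading of the 5-second interval as "start plus 5 seconds" are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop words and highlight phrases, for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "is", "it", "to", "of", "in",
              "that", "s", "what", "for", "on"}
HIGHLIGHT_PHRASES = ["bowled", "that's a six", "caught out", "four runs boundary"]


def tokenize(text):
    """Lowercase the text, strip punctuation, and drop stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_idf(documents):
    """Smoothed inverse document frequency over tokenized documents."""
    n = len(documents)
    df = Counter(term for doc in documents for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector; terms outside the vocabulary are ignored."""
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_highlights(segments, threshold=0.5, window=5.0):
    """Return (start, end) windows for segments that resemble a highlight phrase.

    segments: list of (start_seconds, transcript_text) pairs.
    threshold and window are assumed values, not tuned ones.
    """
    ref_docs = [tokenize(p) for p in HIGHLIGHT_PHRASES]
    idf = build_idf(ref_docs)
    ref_vecs = [tfidf(d, idf) for d in ref_docs]
    highlights = []
    for start, text in segments:
        vec = tfidf(tokenize(text), idf)
        score = max((cosine(vec, r) for r in ref_vecs), default=0.0)
        if score >= threshold:
            highlights.append((start, start + window))
    return highlights


segments = [
    (12.0, "And he is bowled! What a delivery."),
    (40.0, "The players are taking a drinks break."),
    (75.5, "That's a six, straight over the bowler's head!"),
]
highlights = find_highlights(segments)
```

Because words outside the highlight vocabulary are dropped before scoring, a single matching word can produce a high similarity. That is one reason the threshold is hard to choose in practice.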
Challenges we ran into
- One major challenge we faced was fixing the threshold values for the different similarity checks used to decide whether a given phrase belongs to a highlight.
- Different TF-IDF vectors should be created for different sports in order to encompass all kinds of sports highlights.
- Segregating different phrases belonging to the same highlight so as to avoid repetitive highlights.
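One possible way to handle the repetition problem is to merge highlight windows that overlap or sit close together, so several phrases about the same event produce a single clip. The sketch below is an assumption about how this could be done, not the approach the project settled on; the function name `merge_highlights` and the 2-second `gap` are hypothetical.

```python
def merge_highlights(windows, gap=2.0):
    """Merge (start, end) windows that overlap or lie within `gap` seconds."""
    merged = []
    for start, end in sorted(windows):
        if merged and start - merged[-1][1] <= gap:
            # Extend the previous window instead of adding a repeated clip.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


windows = [(10.0, 15.0), (13.0, 18.0), (19.5, 24.5), (60.0, 65.0)]
merged = merge_highlights(windows)
```

Here the first three windows collapse into one clip from 10.0 to 24.5 seconds, while the window at 60.0 seconds stays separate.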
What we implemented till now
- Initial setup of the server with the project structure and git repository.
- Constructed the wordnet cloud and TF-IDF vector.
- Established a connection between the symbl.ai API and the web server to transfer and analyse the uploaded video.
- Integrated the web application and the Python pipeline.
- Displaying the highlights from the obtained timestamps to the user through a user-friendly interface.
What's upcoming
- Optimising the time taken for the video to be processed by the API (symbl.ai) and the NLTK model.
- Adding user-centric features for easy access.
To try it out
The URL provided must be a publicly available URL. Currently we do not support any redirected links, shortened links (e.g. bit.ly), YouTube, Vimeo, or links from any audio/video platforms.
Example link which can be used: https://firebasestorage.googleapis.com/v0/b/sportlight-1407.appspot.com/o/videoplayback.mp4?alt=media&token=482ef95c-a34c-4227-a223-25e29e115d2a
Domain Name from GoDaddy Registry: https://www.knowunknowns.co/
Note that uploading may take time because the video is processed by multiple models via an API.
Built With
- express.js
- github
- natural-language-processing
- natural-language-toolkit
- next.js
- node.js
- postman
- python
- symbl.ai
- wordnet