Inspiration
The inspiration for our AI movie trailer workflow came from the sights on the UC San Diego campus: fog had enveloped the grounds so thoroughly that you could no longer see more than 20 feet, which convinced us we had special footage for a spooky theme. With Halloween approaching and the prompt of the Cerebral Hack Hackathon in front of us, the idea came together.
What it does
Our solution leverages AI to streamline the production of a Halloween-themed horror trailer, addressing the inefficiencies and time-consuming nature of traditional trailer creation. By integrating Twelve Labs' video foundation models and Eleven Labs' sound generation capabilities, we automate several key tasks in the trailer-making process: footage selection with the Marengo 2.6 model, text generation with the Pegasus 1 model, and AI-generated sound effects and narration. These features significantly reduce manual effort, letting creators focus on storytelling and artistic expression, while ffmpeg handles audio mixing and video assembly to deliver a polished final product. Overall, our solution empowers creators to produce high-quality, engaging trailers more efficiently.
How we built it
Setup and Initialization

The project begins by importing the necessary libraries and setting up API keys for Twelve Labs and Eleven Labs. The Twelve Labs client is initialized with its API key, giving access to functionality such as searching stock footage and generating text prompts.

Sound Effect Generation

A function converts text descriptions into audio files using the Eleven Labs API. The generated audio is saved to a specified output path so that each sound effect matches the duration and theme of its video clip.

Footage Selection

We search our indexed stock footage and select the best clips based on duration and confidence scores. The selected clips are downloaded and renamed for further processing, ensuring that only the most relevant, high-quality footage ends up in the trailer.

Clip Generation

An orchestration function ties these steps together: it searches for clips, generates sound prompts, renames the downloaded clips, and produces the matching sound effects, so each clip arrives at the editing stage with appropriate audio.

Audio and Video Concatenation

The script uses ffmpeg to concatenate video clips and mix audio streams. The selected clips are combined into a single cohesive trailer, the generated sound effects and narration are merged with the video, and the final video and audio are muxed into one output file, ready for the finishing touches.

Narration Generation

The script generates a narration for the trailer using the Eleven Labs API, based on a script that describes the plot of the video in an engaging manner. The generated narration is then integrated into the final video, adding a professional touch. We also use Kindo.ai to run the Llama 3.1 70B Versatile model, which translates the narration script into other languages; it is optimized for multilingual dialogue and performs strongly against many open-source and closed chat models on common industry benchmarks.

Altogether, this pipeline uses AI models and APIs to automate the repetitive parts of trailer production (footage selection, sound-effect generation, and video editing), so creators can focus on storytelling and artistic expression, while the combination of original footage with AI-generated sound and narration keeps the final product personal and immersive. Hedged code sketches of the main steps follow.
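A minimal sketch of the setup and footage-selection steps, assuming the Twelve Labs Python SDK. The API key, index ID, query text, and thresholds are placeholders, and exact parameter and field names can differ between SDK versions:

```python
# Sketch: search a Twelve Labs index of our fog footage for spooky moments
# and keep the longest, highest-confidence matches (Marengo-powered search).
from twelvelabs import TwelveLabs

TL_API_KEY = "your-twelve-labs-key"  # placeholder
INDEX_ID = "your-index-id"           # placeholder: index holding our footage

client = TwelveLabs(api_key=TL_API_KEY)

result = client.search.query(
    index_id=INDEX_ID,
    query_text="dense fog rolling over an empty campus at night",
    options=["visual"],
)

# Keep clips that are long enough and score above a confidence threshold;
# the 3-second / score-80 cutoffs are illustrative, not our exact values.
clips = [
    {"video_id": c.video_id, "start": c.start, "end": c.end, "score": c.score}
    for c in result.data
    if (c.end - c.start) >= 3 and c.score >= 80
]
clips.sort(key=lambda c: c["score"], reverse=True)
```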
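The sound prompts themselves can come from Pegasus's open-ended text generation, asking the model to describe a fitting sound for each clip. This continues the sketch above (`client` and `clips` carry over); the prompt wording is illustrative, not the project's exact prompt:

```python
# Sketch: ask Pegasus to propose an eerie sound-effect description per clip.
for clip in clips:
    res = client.generate.text(
        video_id=clip["video_id"],
        prompt=(
            "In one sentence, describe an eerie sound effect that would match "
            f"the action between {clip['start']:.0f}s and {clip['end']:.0f}s."
        ),
    )
    clip["sound_prompt"] = res.data  # plain-text description for Eleven Labs
```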
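Those descriptions are then turned into audio. A sketch of the sound-effect function, assuming Eleven Labs' sound-generation REST endpoint; file paths and the example prompt are placeholders:

```python
# Sketch: convert a text description into a sound effect of a given length
# via Eleven Labs' sound-generation endpoint, saving the MP3 to disk.
import requests

EL_API_KEY = "your-eleven-labs-key"  # placeholder

def generate_sound_effect(text: str, duration_s: float, out_path: str) -> None:
    resp = requests.post(
        "https://api.elevenlabs.io/v1/sound-generation",
        headers={"xi-api-key": EL_API_KEY},
        json={"text": text, "duration_seconds": duration_s},
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)

generate_sound_effect(
    "low droning wind with distant creaking branches, horror ambience",
    duration_s=5.0,
    out_path="clips/clip_01_sfx.mp3",
)
```

Matching `duration_s` to each clip's `end - start` keeps the effect aligned with the footage.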
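For assembly, a sketch of the two ffmpeg invocations: the concat demuxer joins the clips, and the `amix` filter blends narration with the sound-effect track before muxing the result onto the video. File names are placeholders:

```python
# Sketch: concatenate clips, then mix narration + sound effects under the video.
import subprocess

clip_paths = ["clips/clip_01.mp4", "clips/clip_02.mp4", "clips/clip_03.mp4"]

# The concat demuxer reads its inputs from a text file; clips must share
# codecs, otherwise a re-encode (instead of "-c copy") is needed.
with open("concat_list.txt", "w") as f:
    for p in clip_paths:
        f.write(f"file '{p}'\n")

subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
     "-i", "concat_list.txt", "-c", "copy", "trailer_video.mp4"],
    check=True,
)

# Mix the two audio tracks and map the result onto the (unchanged) video.
subprocess.run(
    ["ffmpeg", "-y",
     "-i", "trailer_video.mp4", "-i", "narration.mp3", "-i", "sfx_track.mp3",
     "-filter_complex", "[1:a][2:a]amix=inputs=2:duration=longest[aout]",
     "-map", "0:v", "-map", "[aout]",
     "-c:v", "copy", "-shortest", "trailer_final.mp4"],
    check=True,
)
```

Explicit `-map` flags matter here: without them, ffmpeg selects streams on its own and can drop the audio you wanted, which is one common cause of the silent-video problem we describe under Challenges.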
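Finally, a sketch of the translation-plus-narration step. The Kindo URL, header name, and model identifier below are assumptions based on an OpenAI-style chat-completions interface, not confirmed values (check Kindo's documentation); the Eleven Labs text-to-speech call follows their public REST API, with the voice ID and script text as placeholders:

```python
# Sketch: translate the narration script with Llama 3.1 70B Versatile via
# Kindo, then voice it with Eleven Labs' multilingual text-to-speech.
import requests

KINDO_API_KEY = "your-kindo-key"     # placeholder
EL_API_KEY = "your-eleven-labs-key"  # placeholder
VOICE_ID = "your-voice-id"           # placeholder Eleven Labs voice

def translate_script(script: str, language: str) -> str:
    # Assumed OpenAI-compatible endpoint and model name; verify against
    # Kindo's documentation before use.
    resp = requests.post(
        "https://llm.kindo.ai/v1/chat/completions",
        headers={"api-key": KINDO_API_KEY},
        json={
            "model": "llama-3.1-70b-versatile",
            "messages": [
                {"role": "system",
                 "content": f"Translate the user's text into {language}. "
                            "Reply with the translation only."},
                {"role": "user", "content": script},
            ],
        },
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

def narrate(text: str, out_path: str) -> None:
    # Eleven Labs text-to-speech; eleven_multilingual_v2 handles many languages.
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": EL_API_KEY},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)

script = "This Halloween, the fog hides more than the campus..."
narrate(translate_script(script, "Spanish"), "narration_es.mp3")
```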
Challenges we ran into
During the hackathon, we encountered several challenges that required creative problem-solving and perseverance. Initially, we struggled to make successful POST and GET requests to the Twelve Labs API for accessing stock videos; after extensive research and troubleshooting, we finally found a working approach online. Midway through the hackathon, we hit a significant mental block when our ffmpeg-concatenated videos came out without audio. A strategic break helped us return to the project with renewed focus, and after seeking advice from professionals and doing thorough research, we found a way to integrate the audio streams correctly. These challenges tested our resilience and adaptability, but we overcame them through collaboration and resourcefulness.
Accomplishments that we're proud of
- Seamless integration of Kindo AI, Twelve Labs, and Eleven Labs
- Combining our own filmed footage with our AI-generated script to create exceptional video clips
What we learned
Throughout the project, we learned the importance of robust error handling and the need for efficient data management to handle larger datasets. We also realized the potential of AI in enhancing creative processes, allowing us to focus more on storytelling and artistic expression. The experience highlighted the value of integrating various AI models and tools to automate repetitive tasks, ultimately improving the quality and efficiency of the production process. Overall, the project was a valuable learning experience, demonstrating how AI can be a powerful ally in creative endeavors.
Our final submission: https://youtu.be/OunRTtz-RIw


