
Meta Releases SAM Audio: AI-Powered Sound Segmentation
Meta announces SAM Audio, bringing the power of Segment Anything to audio processing. Isolate specific sounds from complex recordings using natural language prompts.
Isolate specific sounds from complex audio recordings using AI-powered text prompts • Meta's Segment Anything for Audio • Free to use

Discover how creators are using SAM Audio for podcast editing, music production, video post-production, and more.

A comprehensive guide to using SAM Audio for your first audio segmentation project. Learn the basics and advanced techniques.
Upload an audio or video file to the demo interface
Enter a text prompt describing the sound you want to isolate
Wait for the AI to segment the audio and listen to the results
Download the isolated sound and background separately
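The two downloads in the last step can be pictured with a small sketch. It assumes the background track is simply the original mix minus the isolated sound, which this page does not confirm; the tones below are stand-in data, and `tone` and `residual` are hypothetical helpers, not part of SAM Audio.

```python
import math

SAMPLE_RATE = 16_000  # assumed sample rate for this toy example


def tone(freq_hz, n_samples=1600, amplitude=0.3):
    """Generate a sine tone as a list of floats in [-1, 1]."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def residual(mixture, isolated):
    """Background track, ASSUMING background = mixture - isolated."""
    if len(mixture) != len(isolated):
        raise ValueError("tracks must have the same length")
    return [m - s for m, s in zip(mixture, isolated)]


# Stand-ins for a real recording and a model's output (hypothetical data).
speech = tone(220.0)
street_noise = tone(1000.0)
mixture = [a + b for a, b in zip(speech, street_noise)]

# Pretend the prompt "speech" isolated the voice perfectly.
isolated = speech
background = residual(mixture, isolated)
```

Under that assumption, adding the isolated sound and the background back together gives the original recording, so neither download loses anything from the mix.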
Use natural language prompts to isolate any sound from complex audio recordings with precision.
Segment audio using text descriptions, visual cues, or time-based selections for maximum flexibility.
Built on Meta's cutting-edge Segment Anything Model (SAM) technology, extended to audio processing.
Fast audio segmentation that works with both audio and video files directly in your browser.
"SAM Audio has completely transformed how I clean up outdoor recordings. Being able to isolate specific sounds with just text prompts is incredible."
"This technology makes it super straightforward to pull apart mixed sounds from recordings. The multi-modal prompts are a game-changer."
"Finally, an AI tool that actually delivers on its promise. Removing unwanted background noise has never been this easy."
SAM Audio is Meta's AI-powered audio segmentation model that enables you to isolate specific sounds from complex, noisy recordings using simple natural-language prompts. It's based on the Segment Anything Model (SAM) technology, extended to audio processing.
SAM Audio accepts three types of prompts: text descriptions (like 'dog barking' or 'guitar solo'), visual cues (clicking on objects in video), and time-based selections. The AI model then isolates the sounds that match the prompt from the audio mix.
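One way to picture the three prompt types is as three small record types. This is a sketch for illustration only: the class names, fields, and the `validate` helper are hypothetical and are not SAM Audio's actual API.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class TextPrompt:
    description: str  # e.g. "dog barking"


@dataclass(frozen=True)
class VisualPrompt:
    time_s: float  # video timestamp of the click, in seconds
    x: float       # click position, normalized to [0, 1]
    y: float


@dataclass(frozen=True)
class TimePrompt:
    start_s: float  # start of the selected range, in seconds
    end_s: float    # end of the selected range, in seconds


Prompt = Union[TextPrompt, VisualPrompt, TimePrompt]


def validate(prompt: Prompt, duration_s: float) -> None:
    """Raise ValueError if the prompt cannot apply to a clip of this length."""
    if isinstance(prompt, TextPrompt):
        if not prompt.description.strip():
            raise ValueError("text prompt is empty")
    elif isinstance(prompt, VisualPrompt):
        if not (0.0 <= prompt.time_s <= duration_s):
            raise ValueError("click time is outside the clip")
        if not (0.0 <= prompt.x <= 1.0 and 0.0 <= prompt.y <= 1.0):
            raise ValueError("click position must be normalized to [0, 1]")
    elif isinstance(prompt, TimePrompt):
        if not (0.0 <= prompt.start_s < prompt.end_s <= duration_s):
            raise ValueError("time range is empty or outside the clip")
    else:
        raise TypeError(f"unsupported prompt type: {type(prompt).__name__}")


# One example of each prompt type, checked against a 10-second clip.
prompts = [
    TextPrompt("dog barking"),
    VisualPrompt(time_s=2.5, x=0.4, y=0.6),
    TimePrompt(start_s=1.0, end_s=3.0),
]
for p in prompts:
    validate(p, duration_s=10.0)
```

Keeping each prompt type separate makes it clear what information each one carries: a phrase, a point in a video frame, or a span of time.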
SAM Audio is perfect for removing street noise from outdoor videos, cleaning up podcast recordings, isolating individual instruments from concert footage, extracting dialogue from noisy environments, and creating clean audio layers for music production.
Yes! SAM Audio can process both audio and video files. You can upload video files and the tool will extract and segment the audio track.
The web-based demo on Hugging Face Spaces is free to use. Meta has also released the model code under an open license for developers to integrate into their own projects.