Source Separation
Source Separation decomposes a mixed audio signal into its individual sources – e.g., vocals, drums, bass, and other instruments in a song – with applications ranging from vocal isolation to podcast cleanup.
Explanation
Models such as Demucs (Meta) and its hybrid successor HTDemucs use U-Net-style architectures operating in the time and/or frequency domain, decomposing songs into 4-6 stems (e.g., vocals, drums, bass, other). Separating speech from background noise is a closely related task in the same family.
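The core idea behind frequency-domain separation is masking: transform the mixture into a spectral representation, keep the bins belonging to one source, and transform back. A minimal sketch with synthetic tones (this is a toy illustration of the masking principle, not how Demucs works internally – real sources overlap in frequency, which is exactly why learned models are needed):

```python
import numpy as np

sr = 16000                              # sample rate in Hz
t = np.arange(sr) / sr                  # 1 second of time stamps

# Two synthetic "sources": a low tone and a quieter high tone
low = np.sin(2 * np.pi * 440 * t)       # stand-in for a low-frequency source
high = 0.5 * np.sin(2 * np.pi * 3000 * t)  # stand-in for a high-frequency source
mix = low + high                        # the observed mixture

# Frequency-domain masking: FFT, then a binary mask at a 1 kHz cutoff
spec = np.fft.rfft(mix)
freqs = np.fft.rfftfreq(len(mix), d=1 / sr)
mask = freqs < 1000                     # True for bins assigned to the low source

# Apply the mask and its complement, then invert back to waveforms
est_low = np.fft.irfft(spec * mask, n=len(mix))
est_high = np.fft.irfft(spec * ~mask, n=len(mix))
```

Because the toy sources occupy disjoint frequency bands, the mask recovers them almost exactly; neural separators replace the hand-set cutoff with a learned, time-varying mask (or a direct waveform mapping). In practice you would run a pretrained model instead, e.g. the Demucs CLI with `demucs --two-stems=vocals song.mp3` to write vocal and accompaniment stems to disk.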
Marketing Relevance
Enables vocal isolation for marketing remixes, karaoke creation, podcast cleanup, and music analysis.
Common Pitfalls
Strong spectral overlap between sources produces audible artifacts and "bleeding" between stems. Isolating vocals from copyrighted recordings raises rights questions. Mono mixes are harder to separate than stereo, since spatial cues are missing.
Origin & History
ICA (Independent Component Analysis, 1990s) was the classic statistical approach. Wave-U-Net (2018) brought end-to-end neural separation. Demucs (Meta, 2019-2023) became the open-source standard, and MDX-Net won the Music Demixing (MDX) Challenges.
Comparisons & Differences
Source Separation vs. Speech Enhancement
Speech Enhancement removes noise from a single speech signal; Source Separation splits arbitrary sources (vocals, instruments) apart from each other.
Source Separation vs. Audio Generation
Audio Generation creates new audio; Source Separation decomposes existing audio into components.