Extract One Speaker from a Recording

Isolate the main speaker from everything else — other voices, music and noise — to get a focused single-voice track from a busy recording.

Premium AI tool (Pro), included with any paid plan.

How it works

The separation engine extracts the foreground voice and pushes background talkers, music and ambience down, leaving the target speaker as the clear centre of the recording.
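As a rough illustration of what "pushing down" means, the sketch below assumes the recording has already been split into a voice stem and a background stem, then remixes them with the background attenuated by a fixed number of decibels. The function name, the 30 dB default and the two-stem layout are assumptions for illustration only; this page does not document how the engine actually remixes its output.

```python
def push_down_background(voice, background, reduction_db=30.0):
    """Remix two equal-length sample lists with the background attenuated.

    Hypothetical helper: assumes separation has already produced both stems.
    """
    if len(voice) != len(background):
        raise ValueError("stems must have the same number of samples")
    # Convert the decibel reduction into a linear amplitude factor.
    gain = 10 ** (-reduction_db / 20)
    # The voice passes through unchanged; the background is scaled down.
    return [v + gain * b for v, b in zip(voice, background)]
```

With a 20 dB reduction the background is scaled to one tenth of its original amplitude, so it stays faintly present rather than being cut to silence.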

What it's good for

  • Pulling one host from a noisy room
  • Focusing a presenter over a crowd
  • Isolating dialogue from ambience
  • Prepping a clean voice for cloning

Details

Engine: Demucs
Formats: MP3, WAV, M4A, FLAC, OGG, AAC
Price: Paid plans
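
The details above name Demucs as the engine and list the accepted formats. A minimal sketch of how a comparable local run might be prepared follows, assuming the open-source Demucs command-line tool and its `--two-stems` option, which splits a track into the named stem and everything else. The helper name `build_demucs_command` and the output folder are hypothetical, and how the hosted tool actually invokes its engine is not stated on this page.

```python
from pathlib import Path

# Formats listed under Details.
SUPPORTED = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}

def build_demucs_command(path, out_dir="separated"):
    """Return the argument list for a two-stem (vocals vs. rest) Demucs run."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED:
        raise ValueError(f"unsupported format: {suffix or 'none'}")
    # --two-stems=vocals asks Demucs for a vocals stem and a no_vocals stem;
    # -o sets the folder where the separated tracks are written.
    return ["demucs", "--two-stems=vocals", "-o", out_dir, str(path)]

print(build_demucs_command("interview.wav"))
```

The returned list can be passed to `subprocess.run` on a machine where Demucs is installed. Building the command separately keeps the format check testable without running the model.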

Frequently asked questions

How is this different from denoising?
Denoising removes non-speech noise but leaves other voices and music. Target-speaker extraction also removes competing voices and music, keeping only the main speaker.

Can I choose which speaker to extract?
Not today. The tool extracts the dominant foreground voice automatically. Reference-guided extraction is planned for a future update.

Are background voices removed completely?
They're strongly reduced. A background voice as loud as the target is the hardest case and may leave faint traces.

How is this different from speaker separation?
Speaker separation untangles two people talking over each other on an otherwise clean track, while this tool isolates one voice from a full mix of other talkers, music and ambient noise.

Can I use the result for voice cloning?
It produces a focused single-voice track that works well as cloning input, though a short, naturally clean recording will always beat a heavily processed extraction.

Does it reduce music and ambience too?
Yes, music and ambience are pushed down along with competing talkers, so the target speaker is left as the clear foreground of the recording.

How long does processing take?
Most clips finish within a minute, with processing time tied to the length of the recording rather than how crowded the background is.
