Separate Overlapping Speakers

When two people talk over each other on one track, this tool pulls the dominant voice forward and reduces the overlap, making cross-talk easier to follow and transcribe.

Pro: Premium AI tool, included with any paid plan.


How it works

The tool separates overlapping speech by source, isolating the foreground speaker from the competing voice and the room sound. The result is a clearer single-speaker track from a messy cross-talk recording.
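The Details section names Demucs as the engine. How the service configures and invokes it is not documented on this page, so the following is only a sketch of what a local run with the open-source Demucs command line could look like. The helper name `build_demucs_command`, the `htdemucs` model choice and the output folder are assumptions made for illustration. Stock Demucs in two-stem mode splits vocals from everything else; separating one speaker from another may depend on tuning that this page does not describe.

```python
import shlex
import sys
from pathlib import Path


def build_demucs_command(audio_path, out_dir="separated", model="htdemucs"):
    """Build (but do not run) a Demucs CLI call that keeps a voice stem.

    Hypothetical helper: the model name and output folder are illustrative
    assumptions, not settings confirmed for this tool.
    """
    return [
        sys.executable, "-m", "demucs",
        "-n", model,                # pretrained model to load
        "--two-stems", "vocals",    # write "vocals" and "no_vocals" stems
        "-o", str(Path(out_dir)),   # folder that receives the stems
        str(Path(audio_path)),
    ]


if __name__ == "__main__":
    cmd = build_demucs_command("interview.wav")
    print(shlex.join(cmd))
    # To actually run it (requires `pip install demucs`):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string keeps file names with spaces intact when the list is later passed to `subprocess.run`.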

What it's good for

Cross-talk in interviews
Single-mic two-person recordings
Cleaning up debate audio
Transcription prep for overlap

Details

Engine: Demucs
Formats: MP3, WAV, M4A, FLAC, OGG, AAC
Price: Paid plans
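If you prepare files in a script before uploading, a small pre-check against the format list above can catch unsupported files early. This is a minimal sketch; `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, and the check looks only at the file extension, not at the actual contents of the file.

```python
from pathlib import Path

# Extensions taken from the Formats line in the Details section.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac"}


def is_supported(filename):
    """Return True if the file extension is one of the listed formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ["Panel_Debate.WAV", "interview.m4a", "notes.txt"]:
        print(name, is_supported(name))
```

The comparison is case-insensitive, so `Panel_Debate.WAV` passes while `notes.txt` does not.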

Frequently asked questions

Can it give me each speaker on a separate track?

Not yet. It isolates the dominant near-field speaker and suppresses the overlap. Full per-speaker diarization into separate tracks is on our roadmap; today it cleans the foreground voice.

Does it help transcription accuracy?

Yes. Reducing the competing voice and room sound makes speech-to-text more accurate on overlapping sections.

What recording works best?

Recordings where your target speaker is closest to the mic separate best, since proximity gives the model a strong foreground cue.

How is this different from target-speaker extraction?

Speaker separation is built for two voices talking over each other on a clean track, while target-speaker extraction pulls one voice out of a wider mess of voices, music and noise.

Does it work with three or more talkers?

It is tuned for two overlapping voices. With three or more it still lifts the dominant near-field speaker, but the result is less clean than with a true two-person cross-talk recording.

How long does separation take?

A typical interview segment processes in under a minute, scaling with clip length rather than with how much overlap there is.

Does it work if both speakers are equally loud?

Equal-volume voices are the hardest case and leave more of the competing speaker behind. The tool performs best when your target is clearly the closer, louder voice.
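To get a rough sense of whether your target is "clearly the louder voice", you can compare the levels of two short passages where each speaker talks alone. The sketch below is only an illustration of that comparison, not how the tool itself decides which voice is dominant. It assumes you have already extracted the two solo passages as lists of float samples, and the function names are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_difference_db(target, competing):
    """How many dB louder the target passage is than the competing one."""
    return 20.0 * math.log10(rms(target) / rms(competing))


if __name__ == "__main__":
    # Synthetic tones stand in for the two voices (illustrative data only).
    rate = 8000
    target = [0.5 * math.sin(2 * math.pi * 220 * i / rate) for i in range(rate)]
    competing = [0.25 * math.sin(2 * math.pi * 330 * i / rate) for i in range(rate)]
    print(round(level_difference_db(target, competing), 1))
```

A result near 0 dB corresponds to the equal-volume case described above. In the synthetic example the target has twice the amplitude of the competing tone, which works out to a difference of about 6 dB.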

Related tools

Vocal Isolation (Extract Voice)

Pull a clean vocal out of a full mix, separating the singer or speaker …

Vocal Remover (Karaoke)

Strip the lead vocal out of a song to make an instrumental or karaoke …

Stem Separation (Drums/Bass/Vocals)

Break a finished song back into its parts — vocals, drums, bass and other …

Remove Background Music from Speech

When a voiceover, interview or clip has music underneath, this tool removes the music …

Target-Speaker Extraction

Isolate the main speaker from everything else — other voices, music and noise — …

Background Noise Removal

Strip steady and shifting background noise — air conditioning, fans, street hum, room tone …

