voice activity detection models

2 models · ranked by Hugging Face downloads

segmentation-3.0

Pyannote segmentation-3.0 is a speaker segmentation model for detecting speaker changes, overlapping speech, and voice activity in audio. It produces frame-level predictions used as input to the full speaker diarization pipeline. The model can also run standalone for voice activity detection or overlapped speech detection without the full diarization stack.
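As a rough illustration of how frame-level predictions can be turned into voice activity and overlapped speech regions, here is a minimal sketch in plain Python. It does not call pyannote itself. The input layout (one row per frame, one 0/1 column per speaker), the helper names `frames_to_segments` and `speech_and_overlap`, and the `frame_step` value are all assumptions made for illustration, not the library's API or the model's actual frame rate.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[float, float]


def frames_to_segments(active: Sequence[bool], frame_step: float) -> List[Segment]:
    """Merge runs of consecutive active frames into (start, end) times in seconds."""
    segments: List[Segment] = []
    start = None  # index of the first frame of the current run, if any
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            segments.append((start * frame_step, i * frame_step))
            start = None
    if start is not None:
        # The last run extends to the end of the input.
        segments.append((start * frame_step, len(active) * frame_step))
    return segments


def speech_and_overlap(
    speaker_frames: Sequence[Sequence[int]], frame_step: float
) -> Tuple[List[Segment], List[Segment]]:
    """Derive speech and overlap regions from per-speaker binary activity.

    speaker_frames[t][k] is 1 if (hypothetical) speaker k is active in frame t.
    Speech means at least one active speaker; overlap means at least two.
    """
    counts = [sum(row) for row in speaker_frames]
    speech = frames_to_segments([c >= 1 for c in counts], frame_step)
    overlap = frames_to_segments([c >= 2 for c in counts], frame_step)
    return speech, overlap


# Hypothetical binarized output for 6 frames and 3 speakers.
frames = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
# frame_step=0.5 s is an illustrative value, not the model's real resolution.
speech, overlap = speech_and_overlap(frames, frame_step=0.5)
```

In practice the model's raw scores would first need to be binarized, and short gaps or very short segments are usually smoothed out; both steps are omitted here to keep the sketch small.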

6,913,137 ↓ · 1,195 ♡

segmentation

Pyannote segmentation (v1.x) is the earlier version of pyannote's speaker segmentation model, the predecessor of the current segmentation-3.0. It handles voice activity detection and speaker change detection and is used within older pyannote speaker diarization pipelines. It is released under the MIT license.

4,151,918 ↓ · 678 ♡