Skill · Speech & Audio AI · v1.0

Speaker Diarization

by AaaS · open-source · Last verified 2026-03-17

Enables agents to segment audio recordings by speaker identity, answering 'who spoke when' for downstream summarization and analysis tasks. Covers embedding-based clustering (pyannote.audio, NeMo), overlapping speech handling, and merging diarization with ASR transcripts.
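
The merging step mentioned above (combining diarization output with an ASR transcript) can be sketched as follows. This is a minimal illustration, not the method of any listed integration: it assumes the diarizer yields speaker turns with start/end times and the ASR system yields word-level timestamps, and it labels each word with the turn it overlaps most. The names `Turn`, `Word`, `assign_speakers`, and `group_utterances` are hypothetical; real toolkits such as pyannote.audio or NeMo emit their own data structures, which would need to be converted first.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    """One diarization segment (hypothetical container)."""
    start: float
    end: float
    speaker: str


@dataclass
class Word:
    """One ASR word with timestamps (hypothetical container)."""
    start: float
    end: float
    text: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns, unknown="UNKNOWN"):
    """Label each word with the speaker whose turn overlaps it the most.

    Words that overlap no turn at all receive the `unknown` label.
    """
    labeled = []
    for w in words:
        best_speaker, best_overlap = unknown, 0.0
        for t in turns:
            ov = overlap(w.start, w.end, t.start, t.end)
            if ov > best_overlap:
                best_speaker, best_overlap = t.speaker, ov
        labeled.append((best_speaker, w.text))
    return labeled


def group_utterances(labeled):
    """Merge consecutive words from the same speaker into one utterance."""
    utterances = []
    for speaker, text in labeled:
        if utterances and utterances[-1][0] == speaker:
            utterances[-1] = (speaker, utterances[-1][1] + " " + text)
        else:
            utterances.append((speaker, text))
    return utterances


if __name__ == "__main__":
    # Illustrative data only; the two turns overlap between 1.8 s and 2.0 s.
    turns = [Turn(0.0, 2.0, "SPEAKER_00"), Turn(1.8, 4.0, "SPEAKER_01")]
    words = [
        Word(0.1, 0.5, "hello"),
        Word(0.6, 1.0, "there"),
        Word(2.1, 2.5, "hi"),
        Word(5.0, 5.3, "um"),
    ]
    for speaker, text in group_utterances(assign_speakers(words, turns)):
        print(f"{speaker}: {text}")
```

Maximum overlap is only one simple heuristic. In regions of overlapping speech a word may legitimately belong to more than one turn, so a production pipeline would likely need a more careful tie-breaking or multi-label strategy.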

https://aaas.blog/skill/speaker-diarization
C+ · Average
Adoption: B · Quality: A · Freshness: A · Citations: B · Engagement: F

Specifications

License
MIT
Pricing
open-source
Capabilities
speaker-segmentation, speaker-clustering, overlap-detection, speaker-counting, transcript-alignment
Integrations
pyannote, nemo, assemblyai, huggingface
Use Cases
meeting-minutes, podcast-editing, legal-deposition-analysis, customer-service-qa
API Available
No
Difficulty
intermediate
Prerequisites
speech-recognition
Supported Agents
voice-agent
Tags
diarization, speaker-id, audio, pyannote, meeting-analysis
Added
2026-03-17
Completeness
100%

Index Score

57.4
Adoption
66
Quality
80
Freshness
82
Citations
60
Engagement
0
