Skill · Speech & Audio AI · v1.0

Speaker Diarization

by AaaS · open-source · Last verified 2026-03-17

Enables agents to segment audio recordings by speaker identity, answering 'who spoke when' for downstream summarization and analysis tasks. Covers embedding-based clustering (pyannote.audio, NeMo), overlapping speech handling, and merging diarization with ASR transcripts.
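As an illustration of the transcript-merging step, here is a minimal sketch that assumes the diarizer has already produced speaker turns and the ASR system has produced word-level timestamps. The `Turn` and `Word` types and the `assign_speakers` and `to_utterances` helpers are hypothetical names introduced for this example; they are not part of pyannote.audio, NeMo, or any listed integration. Each word is given the speaker whose turn overlaps it the longest, which is one simple way to resolve words that fall inside overlapping speech.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Turn:
    """One diarization segment (hypothetical type): who spoke from start to end, in seconds."""
    start: float
    end: float
    speaker: str


@dataclass
class Word:
    """One ASR word with timestamps in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words: List[Word], turns: List[Turn]) -> List[Tuple[Optional[str], Word]]:
    """Label each word with the speaker whose turn overlaps it the most.

    Words that overlap no turn are labeled None rather than guessed.
    """
    labeled = []
    for w in words:
        best_speaker: Optional[str] = None
        best = 0.0
        for t in turns:
            ov = overlap(w.start, w.end, t.start, t.end)
            if ov > best:
                best, best_speaker = ov, t.speaker
        labeled.append((best_speaker, w))
    return labeled


def to_utterances(labeled: List[Tuple[Optional[str], Word]]) -> List[Tuple[Optional[str], str]]:
    """Group consecutive words with the same speaker label into utterances."""
    groups: List[Tuple[Optional[str], List[str]]] = []
    for speaker, w in labeled:
        if groups and groups[-1][0] == speaker:
            groups[-1][1].append(w.text)
        else:
            groups.append((speaker, [w.text]))
    return [(speaker, " ".join(texts)) for speaker, texts in groups]


if __name__ == "__main__":
    # Made-up timestamps for illustration; the two turns overlap between 1.8 s and 2.0 s.
    turns = [Turn(0.0, 2.0, "SPEAKER_00"), Turn(1.8, 4.0, "SPEAKER_01")]
    words = [
        Word(0.1, 0.5, "hello"),
        Word(0.6, 1.0, "there"),
        Word(1.7, 2.3, "hi"),
        Word(2.4, 2.9, "back"),
    ]
    for speaker, text in to_utterances(assign_speakers(words, turns)):
        print(f"{speaker}: {text}")
```

The nested loop is quadratic in the worst case, which is usually acceptable for a single meeting; for long recordings, sorting both lists and sweeping through them once would be a reasonable refinement. A production pipeline might also keep both speakers for words inside overlapped regions instead of picking one.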

https://aaas.blog/skill/speaker-diarization

Grade: C (Below Average)

Adoption: B · Quality: A · Freshness: A · Citations: F · Engagement: F

Specifications

License: MIT
Pricing: open-source
Capabilities: speaker-segmentation, speaker-clustering, overlap-detection, speaker-counting, transcript-alignment
Integrations: pyannote, nemo, assemblyai, huggingface
Use Cases: meeting-minutes, podcast-editing, legal-deposition-analysis, customer-service-qa
API Available: No
Difficulty: intermediate
Prerequisites: speech-recognition
Supported Agents: voice-agent
Tags: diarization, speaker-id, audio, pyannote, meeting-analysis
Added: 2026-03-17
Completeness: 87%

