Script · Speech & Audio AI · v1.0

Audio Classification Setup

by Community · open-source · Last verified 2026-03-17

Configures an audio classification system using Audio Spectrogram Transformer (AST) or YAMNet fine-tuned on AudioSet, with Mel spectrogram feature extraction and batch inference. Exports per-clip predictions with top-5 class probabilities and integrates with a streaming event bus for real-time use.
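
The per-clip top-5 export described above could be sketched as follows. This is a minimal illustration, not the linked repository's own code: it assumes the Hugging Face `transformers` port of AST, the checkpoint name `MIT/ast-finetuned-audioset-10-10-0.4593`, 16 kHz mono input, and per-class sigmoid scoring (AudioSet is multi-label). The helper names `top_k_probs` and `classify_batch` are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def _sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def top_k_probs(logits: Sequence[float], labels: Sequence[str], k: int = 5) -> List[Dict]:
    """Turn one clip's raw class logits into its k highest-scoring labels."""
    # Assumption: independent sigmoid per class, since AudioSet is multi-label.
    scored = [(label, _sigmoid(x)) for label, x in zip(labels, logits)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "prob": round(p, 4)} for label, p in scored[:k]]


def classify_batch(waveforms, sampling_rate: int = 16000,
                   checkpoint: str = "MIT/ast-finetuned-audioset-10-10-0.4593",
                   k: int = 5) -> List[List[Dict]]:
    """Run one batch of 1-D float waveforms through AST; return top-k per clip."""
    # Heavy imports are deferred so top_k_probs stays usable without torch installed.
    import torch
    from transformers import ASTFeatureExtractor, ASTForAudioClassification

    extractor = ASTFeatureExtractor.from_pretrained(checkpoint)
    model = ASTForAudioClassification.from_pretrained(checkpoint).eval()

    # The extractor computes the Mel filterbank features and pads to a fixed length.
    inputs = extractor(waveforms, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, num_classes)

    labels = [model.config.id2label[i] for i in range(model.config.num_labels)]
    return [top_k_probs(row.tolist(), labels, k) for row in logits]
```

Calling `classify_batch([clip_a, clip_b])` with two NumPy waveforms would return two lists of up to five `{"label", "prob"}` records each. Loading the model once outside the function would be preferable for repeated batches; it is kept inline here only to keep the sketch short.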

https://github.com/YuanGongND/ast

Grade: D (Poor)

Adoption: C+ · Quality: B+ · Freshness: A · Citations: F · Engagement: F

Specifications

License: Apache-2.0
Pricing: open-source
Capabilities: mel-spectrogram, top-k-predictions, batch-inference, streaming-support
Integrations: pytorch, torchaudio, huggingface, kafka
Use Cases: anomaly-sound-detection, wildlife-monitoring, smart-home-triggers
API Available: No
Language: python
Dependencies: torch, torchaudio, transformers, librosa, numpy
Environment: Python 3.10+
Est. Runtime: 1-5 minutes
Tags: audio-classification, sound-events, ast, audioset, environmental-audio
Added: 2026-03-17
Completeness: 80%
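
The streaming-support capability and Kafka integration listed above might be wired up along these lines. This is a sketch under stated assumptions: the event schema, the topic name `audio.events`, the 0.5 trigger threshold, and the functions `build_event` and `publish` are all hypothetical, and the transport is passed in as a plain callable so the example does not depend on any particular client library.

```python
import json
import time
from typing import Callable, Dict, List


def build_event(clip_id: str, predictions: List[Dict], threshold: float = 0.5) -> Dict:
    """Wrap one clip's top-k predictions in a JSON-serializable event."""
    return {
        "clip_id": clip_id,
        "timestamp": time.time(),  # wall-clock time the event was built
        "predictions": predictions,
        # Labels whose probability meets the (assumed) trigger threshold.
        "triggered": [p["label"] for p in predictions if p["prob"] >= threshold],
    }


def publish(events: List[Dict], send: Callable[[str, bytes], object],
            topic: str = "audio.events") -> int:
    """Encode each event as UTF-8 JSON and hand it to send(topic, payload)."""
    for event in events:
        send(topic, json.dumps(event).encode("utf-8"))
    return len(events)
```

With the `kafka-python` client, for instance, `send` could be `lambda topic, payload: producer.send(topic, value=payload)` for an existing `KafkaProducer`; in tests it can simply append to a list.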

