Content Filtering
by AaaS · open-source · Last verified 2026-03-01
Screens LLM inputs and outputs for harmful, inappropriate, or policy-violating content. Implements multi-category classification covering toxicity, violence, sexual content, hate speech, and custom policy rules with configurable thresholds and escalation paths.
https://aaas.blog/skill/content-filtering
Grade: B (Above Average)
Adoption: B+ · Quality: A · Freshness: B+ · Citations: B · Engagement: F
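The description's per-category thresholds and escalation paths can be illustrated with a minimal sketch. It assumes an upstream classifier has already produced a score between 0 and 1 for each category. The names `Action`, `Threshold`, `DEFAULT_POLICY`, and `decide`, along with every threshold value, are hypothetical and are not taken from the skill itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class Action(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"  # e.g. route to human review
    BLOCK = "block"


@dataclass(frozen=True)
class Threshold:
    escalate: float  # scores at or above this are sent for review
    block: float     # scores at or above this are rejected outright


# Illustrative values only (assumption); tune per deployment.
DEFAULT_POLICY: Mapping[str, Threshold] = {
    "toxicity": Threshold(escalate=0.5, block=0.8),
    "violence": Threshold(escalate=0.5, block=0.8),
    "sexual": Threshold(escalate=0.4, block=0.7),
    "hate": Threshold(escalate=0.3, block=0.6),
}


def decide(
    scores: Mapping[str, float],
    policy: Mapping[str, Threshold] = DEFAULT_POLICY,
) -> tuple[Action, list[str]]:
    """Return the most severe action and the categories that triggered it."""
    blocked: list[str] = []
    escalated: list[str] = []
    for category, limit in policy.items():
        # A category the classifier did not score is treated as 0.0.
        score = scores.get(category, 0.0)
        if score >= limit.block:
            blocked.append(category)
        elif score >= limit.escalate:
            escalated.append(category)
    if blocked:
        return Action.BLOCK, blocked
    if escalated:
        return Action.ESCALATE, escalated
    return Action.ALLOW, []
```

In this sketch a custom policy rule is just another entry in the mapping, for example `{**DEFAULT_POLICY, "pii": Threshold(escalate=0.2, block=0.5)}` passed as the `policy` argument. The same function can be applied to both the prompt and the model's response.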
Specifications
- License: MIT
- Pricing: open-source
- Capabilities: toxicity-detection, category-classification, threshold-configuration, escalation-routing, policy-enforcement
- Integrations: openai, anthropic, langchain
- Use Cases: content-moderation, user-generated-content, chat-safety, enterprise-compliance
- API Available: No
- Difficulty: intermediate
- Prerequisites: (none listed)
- Supported Agents: claude-code
- Tags: moderation, filtering, safety, content-policy, trust-safety
- Added: 2026-03-17
- Completeness: 100%
Index Score: 61.2
- Adoption: 72
- Quality: 82
- Freshness: 78
- Citations: 64
- Engagement: 0