Jailbreak Detection
by AaaS · open-source · Last verified 2026-03-01
Detects and blocks jailbreak attempts that try to bypass LLM safety training through adversarial prompting techniques. Uses pattern recognition, semantic analysis, and classifier-based approaches to identify known and novel jailbreak vectors before they reach the model.
https://aaas.blog/skill/jailbreak-detection ↗C+
C+—Average
Adoption: C+Quality: AFreshness: ACitations: C+Engagement: F
Specifications
- License
- MIT
- Pricing
- open-source
- Capabilities
- pattern-recognition, semantic-analysis, classifier-detection, adversarial-input-blocking, attack-logging
- Integrations
- langchain, openai, anthropic
- Use Cases
- chatbot-security, enterprise-ai-protection, red-team-defense, safety-monitoring
- API Available
- No
- Difficulty
- advanced
- Prerequisites
- prompt-injection-defense
- Supported Agents
- Tags
- jailbreak, detection, security, adversarial, defense
- Added
- 2026-03-17
- Completeness
- 100%
Index Score
51.2Adoption
52
Quality
82
Freshness
86
Citations
56
Engagement
0