Capybara
by Argilla / LDJnr · open-source · Last verified 2026-03-17
Capybara is a high-quality instruction-tuning dataset of 15,000 diverse, long-form single- and multi-turn conversations synthesized to cover a wide range of topics and response styles, designed to improve model coherence and verbosity on open-ended tasks. It emphasizes narrative quality and conceptual depth over simple factual responses, making it particularly effective for improving chat model fluency and reasoning.
https://huggingface.co/datasets/LDJnr/Capybara ↗C+
C+—Average
Adoption: BQuality: AFreshness: ACitations: BEngagement: F
Specifications
- License
- CC BY 4.0
- Pricing
- open-source
- Capabilities
- instruction-tuning, long-form-generation, chat-finetuning
- Integrations
- huggingface-datasets
- Use Cases
- sft-training, chat-model-finetuning, response-quality-improvement
- API Available
- Yes
- Tags
- instruction-tuning, long-form, diverse, synthetic, sft
- Added
- 2026-03-17
- Completeness
- 100%
Index Score
57.4Adoption
65
Quality
82
Freshness
80
Citations
60
Engagement
0
Put AI to work for your business
Deploy this dataset alongside autonomous AaaS agents that handle tasks end-to-end — no babysitting required.