OASST2
by LAION / OpenAssistant · open-source · Last verified 2026-03-17
OpenAssistant Conversations 2 (OASST2) is a crowd-sourced human-annotated dataset of 100,000+ assistant-style conversations in 35 languages, where human contributors created and ranked message trees to produce preference labels for RLHF training. It is the largest open multilingual human-feedback dataset and is widely used for training preference models and reward functions in open-source alignment pipelines.
https://huggingface.co/datasets/OpenAssistant/oasst2 ↗B
B—Above Average
Adoption: AQuality: AFreshness: B+Citations: B+Engagement: F
Specifications
- License
- Apache 2.0
- Pricing
- open-source
- Capabilities
- rlhf-training, preference-modeling, multilingual-alignment
- Integrations
- huggingface-datasets, trl
- Use Cases
- reward-model-training, rlhf-finetuning, preference-data-collection
- API Available
- Yes
- Tags
- rlhf, human-feedback, chat, multilingual, preference
- Added
- 2026-03-17
- Completeness
- 100%
Index Score
68.5Adoption
80
Quality
85
Freshness
78
Citations
78
Engagement
0
Put AI to work for your business
Deploy this dataset alongside autonomous AaaS agents that handle tasks end-to-end — no babysitting required.