Datasetalignmentv2.0

Tulu V2 Mix

by Allen Institute for AI (AI2) · free · Last verified 2026-03-17

Tulu V2 Mix is a curated 326,000-sample mixture of instruction-tuning datasets from AI2. It blends diverse sources like FLAN, Open Assistant, and Code Alpaca to train the Tulu 2 model family. The dataset serves as a benchmark for analyzing the impact of different data sources on model performance and quality.

https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture ↗

C—Below Average

Adoption: B+Quality: AFreshness: B+Citations: FEngagement: F

Specifications

License: ODC-By 1.0
Pricing: free
Capabilities: Instruction Following, Supervised Fine-Tuning (SFT), Multi-task Language Understanding, Code Generation and Reasoning, Conversational AI Training, Question Answering, Comparative Data Analysis, Model Benchmarking
Integrations: [object Object], [object Object], [object Object], [object Object]
Use Cases: [object Object], [object Object], [object Object], [object Object], [object Object]
API Available: Yes
Tags: instruction-tuning, sft, data-mixture, llm-training, conversational-ai, code-generation, question-answering, benchmark-dataset, ai2, open-source
Added: 2026-03-17
Completeness: 0.95%

Index Score

Adoption

Quality

Freshness

Citations

Engagement

Need this tool deployed for your team?

Get a Custom Setup

Explore the full AI ecosystem on Agents as a Service