Dataset · instruction-tuning · v1.0

Alpaca Dataset

by Stanford University · open-source · Last verified 2026-03-17

Stanford Alpaca is a dataset of 52,000 instruction-following examples generated by applying the self-instruct technique to OpenAI's text-davinci-003 (a GPT-3.5-series model). This foundational dataset enabled the creation of the Alpaca 7B model, which was fine-tuned from LLaMA 7B, and popularized cost-effective instruction tuning.

https://github.com/tatsu-lab/stanford_alpaca

Index Score: C+ (Average)

Adoption: A · Quality: B+ · Freshness: B · Citations: F · Engagement: F

Specifications

License: CC-BY-NC-4.0
Pricing: open-source
Capabilities: instruction-tuning, self-instruct
Integrations: huggingface-datasets
Use Cases: fine-tuning, instruction-following, research
API Available: No
Tags: instruction-following, self-instruct, stanford, gpt-generated
Added: 2026-03-17
Completeness: 100%
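
For the fine-tuning use case listed above, each Alpaca record is a JSON object with `instruction`, `input`, and `output` fields, where `input` may be empty. The following is a minimal sketch of turning one record into a prompt/completion pair. The template wording is modeled on the one in the stanford_alpaca repository and should be checked against the repository before training; the helper name `format_example` and the sample record are hypothetical and introduced only for illustration.

```python
from typing import Dict

# Templates modeled on the stanford_alpaca repository; treat the exact
# wording as an assumption and verify it against the repo before use.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)
PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def format_example(record: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: build a prompt/completion pair from one record."""
    has_input = bool(record.get("input", "").strip())
    template = PROMPT_WITH_INPUT if has_input else PROMPT_NO_INPUT
    # str.format ignores the unused `input` keyword for the no-input template.
    prompt = template.format(
        instruction=record["instruction"],
        input=record.get("input", ""),
    )
    return {"prompt": prompt, "completion": record["output"]}


# Illustrative record, not taken from the dataset.
sample = {
    "instruction": "Classify the sentiment of the sentence.",
    "input": "I loved this film.",
    "output": "Positive",
}
example = format_example(sample)
print(example["prompt"] + example["completion"])
```

Records with an empty `input` fall back to the shorter template, so the `### Input:` section never appears with nothing under it. In practice the records would come from the `alpaca_data.json` file in the repository or from the Hugging Face Datasets copy rather than from a hand-written dictionary.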

