Dataset · instruction-tuning · v2.0

Dolly-15K

by Databricks · free · Last verified 2026-03-17

Dolly-15K is an open-source dataset of about 15,000 human-written instruction-following records. Created by Databricks employees, it is designed for fine-tuning large language models to follow instructions in the manner of ChatGPT, using a relatively small, targeted dataset.

https://huggingface.co/datasets/databricks/databricks-dolly-15k
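
As a rough illustration of how records like these are typically prepared for supervised fine-tuning, the sketch below turns one record into a single training string. It assumes the field names published on the dataset card (instruction, context, response, category). The prompt template and the helper names format_record and iter_formatted are hypothetical, chosen for illustration, and are not a format prescribed by Databricks.

```python
from typing import Dict, Iterator


def format_record(record: Dict[str, str]) -> str:
    """Turn one Dolly-style record into a single training string.

    Assumes the keys "instruction", "context" and "response" (per the
    dataset card). The "### ..." template is an illustrative choice only.
    """
    instruction = record["instruction"].strip()
    context = (record.get("context") or "").strip()
    response = record["response"].strip()

    parts = [f"### Instruction:\n{instruction}"]
    if context:
        # Many records have an empty context; skip the section in that case.
        parts.append(f"### Context:\n{context}")
    parts.append(f"### Response:\n{response}")
    return "\n\n".join(parts)


def iter_formatted(split: str = "train") -> Iterator[str]:
    """Yield formatted strings for every record in the hosted dataset.

    Requires the Hugging Face `datasets` package and network access, so the
    import is kept local to this function.
    """
    from datasets import load_dataset

    dataset = load_dataset("databricks/databricks-dolly-15k", split=split)
    for record in dataset:
        yield format_record(record)
```

Calling format_record on a record with an empty context yields only the instruction and response sections, which keeps prompts for open-ended tasks short.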

C (Below Average)

Adoption: A · Quality: B+ · Freshness: B · Citations: F · Engagement: F

Specifications

License: CC-BY-SA-3.0
Pricing: free
Capabilities: Supervised Fine-Tuning (SFT), Instruction-Following Model Training, Natural Language Generation (NLG), Question Answering, Text Summarization, Creative Writing and Brainstorming, Information Extraction, Dialogue System Development
Integrations: Hugging Face Datasets, PyTorch, TensorFlow, Databricks Platform, JAX
API Available: No
Tags: instruction-tuning, supervised-fine-tuning, human-generated-data, databricks, llm-training, open-source-dataset, natural-language-processing, question-answering, dialogue-generation, model-alignment
Added: 2026-03-17
Completeness: 0.85%
