Dataset · instruction-tuning · v2.0

Dolly-15K

by Databricks · free · Last verified 2026-03-17

Dolly-15K is an open-source dataset of about 15,000 human-written instruction-following records. Written by Databricks employees, it is designed for fine-tuning large language models to follow instructions in the manner of ChatGPT, using a relatively small, targeted dataset.

https://huggingface.co/datasets/databricks/databricks-dolly-15k
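A minimal sketch of how the records might be prepared for supervised fine-tuning. It assumes the field names published on the dataset card (`instruction`, `context`, `response`, `category`). The prompt template and the helper names `format_record` and `load_formatted` are hypothetical choices made for this illustration, not a format prescribed by Databricks.

```python
from typing import Dict, List

# Hypothetical prompt templates for this sketch; any consistent layout works.
PROMPT_WITH_CONTEXT = (
    "### Instruction:\n{instruction}\n\n"
    "### Context:\n{context}\n\n"
    "### Response:\n{response}"
)
PROMPT_NO_CONTEXT = (
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)


def format_record(record: Dict[str, str]) -> str:
    """Render one record as a single training string.

    The context section is omitted when the record has no context.
    """
    context = (record.get("context") or "").strip()
    template = PROMPT_WITH_CONTEXT if context else PROMPT_NO_CONTEXT
    return template.format(
        instruction=record["instruction"].strip(),
        context=context,
        response=record["response"].strip(),
    )


def load_formatted(split: str = "train") -> List[str]:
    """Download the dataset from the Hugging Face Hub and format every record.

    Assumes `pip install datasets` and network access.
    """
    # Imported lazily so format_record can be used without the library.
    from datasets import load_dataset

    dataset = load_dataset("databricks/databricks-dolly-15k", split=split)
    return [format_record(row) for row in dataset]


# Illustrative record, made up for this sketch (not taken from the dataset).
example = {
    "instruction": "Summarize the passage in one sentence.",
    "context": "Dolly-15K was written by Databricks employees.",
    "response": "Databricks staff authored the Dolly-15K records.",
    "category": "summarization",
}
print(format_record(example))
```

Calling `load_formatted()` returns one formatted string per record, ready to be tokenized by whichever training framework is in use.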
B (Above Average)
Adoption: A · Quality: B+ · Freshness: B · Citations: A · Engagement: F

Specifications

License
CC-BY-SA-3.0
Pricing
free
Capabilities
Supervised Fine-Tuning (SFT), Instruction-Following Model Training, Natural Language Generation (NLG), Question Answering, Text Summarization, Creative Writing and Brainstorming, Information Extraction, Dialogue System Development
Integrations
Hugging Face Datasets, PyTorch, TensorFlow, Databricks Platform, JAX
API Available
No
Tags
instruction-tuning, supervised-fine-tuning, human-generated-data, databricks, llm-training, open-source-dataset, natural-language-processing, question-answering, dialogue-generation, model-alignment
Added
2026-03-17
Completeness
85%

Index Score

68.3
Adoption
80
Quality
79
Freshness
68
Citations
82
Engagement
0
