Genstruct
by NousResearch · open-source · Last verified 2026-03-17
Genstruct is a synthetic instruction dataset generated by the Genstruct-7B model, which converts raw documents into structured instruction-response pairs. Unlike typical self-instruct approaches, Genstruct grounds every instruction in a source document, ensuring factual consistency and enabling controllable synthetic data generation from any text corpus.
https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1 ↗C+
C+—Average
Adoption: BQuality: B+Freshness: ACitations: C+Engagement: F
Specifications
- License
- CC BY 4.0
- Pricing
- open-source
- Capabilities
- instruction-tuning, document-grounded-generation, structured-output
- Integrations
- huggingface-datasets
- Use Cases
- sft-training, knowledge-grounded-qa, domain-specific-finetuning
- API Available
- Yes
- Tags
- synthetic, instruction-tuning, document-grounded, structured-generation
- Added
- 2026-03-17
- Completeness
- 100%
Index Score
53.4Adoption
60
Quality
78
Freshness
82
Citations
55
Engagement
0
Put AI to work for your business
Deploy this dataset alongside autonomous AaaS agents that handle tasks end-to-end — no babysitting required.