DocETL
by UC Berkeley · open-source · Last verified 2026-03-17
LLM-powered ETL system from UC Berkeley for building complex document processing and analysis pipelines. It optimizes multi-step workflows with automatic plan generation and quality assessment.
https://ucbepic.github.io/docetl/
Overall grade: D (Poor)
Adoption: D · Quality: B+ · Freshness: A · Citations: D · Engagement: F
Specifications
- License
- MIT
- Pricing
- open-source
- Capabilities
- document-etl, pipeline-optimization, quality-assessment, multi-step-processing, llm-operations
- Integrations
- openai, anthropic
- Use Cases
- document-analysis, data-extraction, research-pipelines, content-processing
- API Available
- Yes
- SDK Languages
- python
- Deployment
- self-hosted
- Rate Limits
- N/A (open-source)
- Data Privacy
- Self-hosted, user-managed
- Tags
- document-processing, etl, llm-powered, data-pipeline
- Added
- 2026-03-17
- Completeness
- 100%
Index Score
39.1
Adoption
35
Quality
78
Freshness
85
Citations
38
Engagement
0