Complete-muE: Optimal Hyperparameter Transfer and Scaling for MoE Models
by [unverified] · free · Last verified 2026-06-21T03:07:55.172Z
We propose Complete-muE, a framework that targets hyperparameter transfer across dense FFN and any Mixture-of-Experts (MoE) setups in transformer blocks. Existing tools such as $\mu$P (which requires a fixed architecture) or SDE (which requires a fixed per-step token count) cannot directly solve the hyperparameter...
http://arxiv.org/abs/2605.23893v1
F (Critical)
Adoption: F · Quality: F · Freshness: A+ · Citations: F · Engagement: F
Specifications
- Pricing: free
- Capabilities: unverified
- Integrations: none listed
- Use Cases: none listed
- API Available: No
- Tags: auto-discovered
- Added: 2026-06-21T03:07:55.172Z
- Completeness: 60%
Index Score: 0
- Adoption: 0
- Quality: 0
- Freshness: 100
- Citations: 0
- Engagement: 0