#ml-efficiency
Supercharge Your Model Training
Python
5399
10 days ago
MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.
Python
3
12 days ago
(Unofficial) Building Hugging Face SmolLM, a blazingly fast and small language model, in PyTorch with an implementation of grouped query attention (GQA)
Python
1
7 months ago