ml-efficiency

(Unofficial) Building Hugging Face's SmolLM, a blazingly fast and small language model, with a PyTorch implementation of grouped-query attention (GQA)

Python
1
3 months ago