Repository navigation

moe

Website
Wikipedia

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

fine-tuning llama 大语言模型 peft transformers rlhf qlora quantization qwen instruction-tuning gpt lora large-language-models agent 人工智能 moe llama3 deepseek gemma 自然语言处理

Python

59688

7321

6 小时前

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

CUDA inference llama llava 大语言模型 llm-serving moe PyTorch transformer vlm llama3 deepseek deepseek-v3 deepseek-r1 qwen3 blackwell openai kimi gpt-oss

Python

18602

3070

39 分钟前

NVIDIA / TensorRT-LLM

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT LLM also contains components to create Python and C++ runtimes that orchestrate the inference execution in performant way.

blackwell CUDA moe PyTorch llm-serving

C++

11772

1779

20 小时前

modelscope / ms-swift

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Phi4, ...) (AAAI 2025).

大语言模型 lora llama sft multimodal peft internvl liger deepseek-r1 embedding grpo open-r1 megatron llama4 qwen3 reranker moe

Python

10190

896

1 天前

czy0729 / Bangumi

An unofficial https://bgm.tv ui first app client for Android and iOS, built with React Native. 一个无广告、以爱好为驱动、不以盈利为目的、专门做 ACG 的类似豆瓣的追番记录，bgm.tv 第三方客户端。为移动端重新设计，内置大量加强的网页端难以实现的功能，且提供了相当的自定义选项。目前已适配 iOS / Android。

React Native mobx iOS React Android bangumi design expo moe

TypeScript

4830

152

1 天前

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

gpu large-large-models CUDA PyTorch llm-inference jit attention Nvidia distributed-inference moe

Cuda

3843

530

4 小时前

zai-org / GLM-4.5

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

agent 大语言模型 moe glm

Python

2841

282

5 天前

PKU-YuanGroup / MoE-LLaVA

【TMM 2025🔥】 Mixture-of-Experts for Large Vision-Language Models

large-vision-language-model mixture-of-experts moe multi-modal

Python

2247

140

3 个月前

MoonshotAI / MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

flash-attention 大语言模型 llm-serving llm-training moe PyTorch transformer

Python

1911

114

6 个月前

davidmrau / mixture-of-experts

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

moe mixture-of-experts PyTorch

Python

1182

110

1 年前

pjlab-sys4nlp / llama-moe

⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)

llama 大语言模型 mixture-of-experts moe

Python

992

10 个月前

microsoft / Tutel

Tutel MoE: Optimized Mixture-of-Experts Library, Support GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4

PyTorch moe mixture-of-experts deepseek 大语言模型

928

104

19 天前

sail-sg / Adan

Adan: Adaptive Nesterov Momentum Algorithm for Faster Optimizing Deep Models

bert-model convnext 深度学习 fairseq optimizer resnet timm vit transformer-xl 人工智能 diffusion dreamfusion gpt2 PyTorch cuda-programming llm-training 大语言模型 moe

Python

798

4 个月前

open-compass / MixtralKit

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI

大语言模型 mistral moe

Python

770

2 年前

ScienceOne-AI / DeepSeek-671B-SFT-Guide

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions. (DeepSeek-V3/R1 满血版 671B 全参数微调的开源解决方案，包含从训练到推理的完整代码和脚本，以及实践中积累一些经验和结论。)

deepseek-r1 大语言模型 moe sft Python

Python

764

7 个月前