Repository navigation

deepspeed (Python, 7133 stars, updated 6 days ago)

Best practices & guides on how to write distributed PyTorch training code (Python, 488 stars, updated 7 months ago)
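
The entry above covers how to write distributed PyTorch training code. For orientation, a minimal single-node DistributedDataParallel loop looks roughly like the sketch below; the model and data are placeholders, not taken from the repo.

```python
# Minimal single-node DistributedDataParallel (DDP) sketch.
# Launch with: torchrun --nproc_per_node=<num_gpus> ddp_minimal.py
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


def main():
    # torchrun sets RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR, MASTER_PORT.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(nn.Linear(1024, 1024).cuda(local_rank), device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):  # placeholder training loop with random data
        x = torch.randn(32, 1024, device=local_rank)
        loss = model(x).pow(2).mean()
        optimizer.zero_grad()
        loss.backward()  # DDP all-reduces gradients across ranks here
        optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```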

GLake: optimizing GPU memory management and IO transmission (Python, 479 stars, updated 6 months ago)

Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-like MLLM on an RTX 3090/4090 with 24 GB of VRAM. (Jupyter Notebook, 473 stars, updated 7 months ago)

Large Language Models for All, 🦙 Cult and More. Stay in touch! (HTML, 444 stars, updated 2 years ago)

Guide: finetune GPT2-XL (1.5 billion parameters) and GPT-Neo (2.7B) on a single GPU with Hugging Face Transformers using DeepSpeed (Python, 436 stars, updated 2 years ago)
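
The usual way DeepSpeed makes single-GPU finetuning of billion-parameter models possible is ZeRO with CPU offload, which moves optimizer state into CPU RAM. A hedged sketch of that setup through the Hugging Face Trainer follows; the concrete field values are illustrative assumptions, not the guide's exact settings.

```python
# Hedged sketch: ZeRO stage 2 with CPU offload keeps optimizer state in
# CPU RAM so a ~1.5B-parameter model can be finetuned on one GPU.
# All field values are illustrative assumptions, not the guide's settings.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 16,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},  # optimizer state -> CPU RAM
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
}

# The Hugging Face Trainer accepts the config as a dict or a JSON file path:
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    fp16=True,
    deepspeed=ds_config,
)
```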

Collaborative Training of Large Language Models in an Efficient Way (Python, 415 stars, updated a year ago)

A large language model training and evaluation tool built on HuggingFace. Supports a web UI and terminal inference for each model, parameter-efficient and full-parameter training (pre-training, SFT, RM, PPO, DPO), as well as model merging and quantization. (Python, 220 stars, updated 2 years ago)

DeepSpeed tutorials, annotated examples, and study notes (efficient large-model training) (Python, 178 stars, updated 2 years ago)

LLaMA 2 finetuning with DeepSpeed and LoRA (Python, 176 stars, updated 2 years ago)
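
LoRA freezes the base weights and trains small low-rank adapter matrices, which is what makes models of this size finetunable on modest hardware. A sketch of the usual Hugging Face PEFT wiring is below; the hyperparameters and target modules are common defaults assumed here, not this repo's choices.

```python
# Hedged sketch: attach LoRA adapters to a causal LM with Hugging Face PEFT.
# Hyperparameters and target modules are common defaults, not this repo's.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling applied to the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights

# From here, training proceeds as usual, e.g. via deepspeed.initialize()
# or a Hugging Face Trainer configured with a DeepSpeed config.
```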

A full pipeline to finetune the ChatGLM LLM with LoRA and RLHF on consumer hardware: an implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM. (Python, 138 stars, updated 2 years ago)

A simple and efficient RevNet library for PyTorch with XLA and DeepSpeed support and parameter offload (Python, 129 stars, updated 3 years ago)
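
The point of a RevNet is that each block is invertible, so activations can be recomputed during the backward pass instead of stored, cutting activation memory roughly in proportion to depth. A minimal, library-free illustration of the reversible coupling (a toy version, not this library's API):

```python
# Toy reversible (RevNet-style) block: the input pair (x1, x2) can be
# reconstructed exactly from the output pair (y1, y2), so activations
# need not be stored for the backward pass.
import torch
import torch.nn as nn


class ReversibleBlock(nn.Module):
    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2


block = ReversibleBlock(nn.Linear(16, 16), nn.Linear(16, 16))
x1, x2 = torch.randn(4, 16), torch.randn(4, 16)
r1, r2 = block.inverse(*block.forward(x1, x2))
assert torch.allclose(r1, x1, atol=1e-5) and torch.allclose(r2, x2, atol=1e-5)
```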

Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode; the repo claims it is faster than ZeRO/ZeRO++/FSDP. (Python, 98 stars, updated 2 years ago)
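
DeepSpeed's pipeline mode expresses the model as an ordered list of layers, partitions that list into stages across GPUs, and streams micro-batches through the stages to keep them busy. A hedged toy sketch follows; layer sizes, stage count, and config values are illustrative assumptions, not this repo's setup.

```python
# Hedged sketch of DeepSpeed pipeline parallelism with a toy MLP.
# Launch with: deepspeed --num_gpus=2 pipeline_minimal.py
# Layer sizes, stage count, and config values are illustrative assumptions.
import torch.nn as nn
import deepspeed
from deepspeed.pipe import PipelineModule

layers = [nn.Linear(1024, 1024) for _ in range(8)]  # toy stand-in for an LLM
model = PipelineModule(layers=layers, num_stages=2, loss_fn=nn.MSELoss())

engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config={
        "train_batch_size": 8,
        "train_micro_batch_size_per_gpu": 1,  # 8 micro-batches fill the pipeline
        "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    },
)

# Training then runs via engine.train_batch(data_iter=...), where the iterator
# yields (input, label) pairs; DeepSpeed schedules the micro-batches per stage.
```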

llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, compute resource management, monitoring, and more. (Python, 86 stars, updated a year ago)

Scripts for LLM pre-training and fine-tuning (with/without LoRA, DeepSpeed) (Python, 84 stars, updated 2 years ago)

✏️ A zero-cost, hands-on LLM fine-tuning starter project: ⚡️ train a legal-domain LLM step by step on Colab, based on microsoft/phi-1_5 and chatglm3, covering both LoRA and full-parameter fine-tuning. (Jupyter Notebook, 78 stars, updated 2 years ago)