# data-parallelism

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python · 37996 stars · updated 16 hours ago

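As a flavor of what data-parallel training with DeepSpeed looks like, here is a minimal sketch built on the standard `deepspeed.initialize` entry point; the model, the config values, and `loader` are placeholders for illustration, not code from the repository.

```python
import torch
import deepspeed

# Stand-in model; replace with your own network.
model = torch.nn.Linear(1024, 10)

# Minimal DeepSpeed config: plain data parallelism, Adam optimizer.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

# Wrap the model in a data-parallel engine; launch with the `deepspeed`
# launcher (or torchrun) so each GPU runs one worker process.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

for batch, labels in loader:  # `loader` is a hypothetical DataLoader
    batch = batch.to(model_engine.device)
    labels = labels.to(model_engine.device)
    loss = torch.nn.functional.cross_entropy(model_engine(batch), labels)
    model_engine.backward(loss)  # gradients are all-reduced across ranks here
    model_engine.step()
```
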
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.

Python · 622 stars · updated 7 years ago

A state-of-the-art multithreading runtime: message-passing based, fast, scalable, ultra-low overhead

Nim · 555 stars · updated 10 months ago

PaddlePaddle (飞桨) large-model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, biocomputing large models, and other domains.

Python · 465 stars · updated 1 year ago

Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.

Python · 267 stars · updated 2 years ago

Ternary Gradients to Reduce Communication in Distributed Deep Learning (TensorFlow)

Python · 182 stars · updated 6 years ago

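The core trick, roughly: each worker stochastically quantizes its gradient to three levels before communicating it. A toy NumPy sketch of that quantizer (an illustration of the idea only, not the repository's TensorFlow code):

```python
import numpy as np

def ternarize(grad, rng=None):
    """Stochastically quantize a gradient to {-s, 0, +s} with s = max|grad|.

    Coordinate i keeps its sign with probability |g_i| / s and is zeroed
    otherwise, so the quantized gradient equals the true one in expectation.
    """
    rng = rng or np.random.default_rng()
    s = np.abs(grad).max()
    if s == 0.0:
        return grad
    keep = rng.random(grad.shape) < np.abs(grad) / s
    return s * np.sign(grad) * keep

# Workers would transmit the scalar s plus a 2-bit code per coordinate
# instead of full fp32 gradients, shrinking communication volume.
g = np.random.randn(5)
print(g)
print(ternarize(g))
```
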
Large-scale 4D-parallel pre-training for 🤗 transformers with Mixture of Experts *(still a work in progress)*

Python · 82 stars · updated 1 year ago

Multi-GPU training for Keras

Python · 44 stars · updated 8 years ago

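For context, current TensorFlow expresses the same synchronous multi-GPU data-parallel pattern with `tf.distribute.MirroredStrategy`; a short sketch of that approach (not this repository's API, and the model is a placeholder):

```python
import tensorflow as tf

# Synchronous data parallelism across all visible GPUs: each replica receives
# a slice of every batch and gradients are all-reduced before the update.
strategy = tf.distribute.MirroredStrategy()

with strategy.scope():  # variables created here are mirrored on every GPU
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# model.fit(x_train, y_train, batch_size=256, epochs=2)  # data assumed to exist
```
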
SC23 Deep Learning at Scale Tutorial Material

Python · 43 stars · updated 7 months ago

WIP. Veloce is a low-code, Ray-based parallelization library for efficient and heterogeneous machine learning computation.

Python · 18 stars · updated 3 years ago

♨ Optimized Gaussian blur filter on CPU.

C++ · 17 stars · updated 7 years ago

☕ Implementation of parallel matrix multiplication using the Fox algorithm on Peking University's high-performance computing system

C · 13 stars · updated 6 years ago

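For intuition, Fox's algorithm runs q stages on a q × q process grid: at stage s, block A[i, (i+s) mod q] is broadcast along process row i while the B blocks are circularly shifted upward by one row per stage. Below is a single-process NumPy simulation of that block schedule (the repository itself implements it in C with MPI):

```python
import numpy as np

def fox_matmul(A, B, q):
    """Single-process simulation of Fox's algorithm on a q x q block grid.

    In the real MPI version each block lives on its own process; here the
    stage-by-stage broadcast/shift schedule reduces to index arithmetic.
    """
    n = A.shape[0]
    assert A.shape == B.shape == (n, n) and n % q == 0
    bs = n // q
    blk = lambda M, i, j: M[i * bs:(i + 1) * bs, j * bs:(j + 1) * bs]

    C = np.zeros((n, n))
    for s in range(q):                # stage s of q
        for i in range(q):
            k = (i + s) % q           # A[i, k] is broadcast along process row i
            for j in range(q):
                # after s upward shifts, process (i, j) holds block B[k, j]
                C[i * bs:(i + 1) * bs, j * bs:(j + 1) * bs] += blk(A, i, k) @ blk(B, k, j)
    return C

# Sanity check against the direct product.
A, B = np.random.rand(6, 6), np.random.rand(6, 6)
assert np.allclose(fox_matmul(A, B, 3), A @ B)
```
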
This repository provides hands-on labs for PyTorch-based distributed training and SageMaker distributed training. It is written so that beginners can get started easily, walking through step-by-step code modifications that start from a basic BERT use case.

Jupyter Notebook · 13 stars · updated 2 years ago