#distributed-deep-learning

BigDL: Distributed TensorFlow, Keras and PyTorch on Apache Spark/Flink & Ray

Jupyter Notebook
2674
23 days ago

Learn applied deep learning from zero to deployment using TensorFlow 1.8+

Jupyter Notebook
162
7 years ago

A Portable C Library for Distributed CNN Inference on IoT Edge Clusters

C
83
5 years ago

Chimera: bidirectional pipeline parallelism for efficiently training large-scale models.

Python
62
1 month ago

Distributed training of DNNs • C++/MPI Proxies (GPT-2, GPT-3, CosmoFlow, DLRM)

C++
42
1 year ago

SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training

Python
32
2 years ago

Ok-Topk is a scheme for distributed training with sparse gradients. It integrates a novel sparse allreduce algorithm (less than 6k communication volume, which is asymptotically optimal) with the decentralized parallel Stochastic Gradient Descent (SGD) optimizer, and its convergence is established both theoretically and empirically.

Python
25
2 years ago

🚨 Prediction of the Resource Consumption of Distributed Deep Learning Systems

Python
15
2 years ago

TensorFlow (1.8+) Datasets, Feature Columns, Estimators and Distributed Training using Google Cloud Machine Learning Engine

Jupyter Notebook
12
7 years ago

Decentralized Asynchronous Training on Heterogeneous Devices

Python
10
1 month ago

Eager-SGD is a decentralized asynchronous SGD scheme. It uses novel partial collective operations to accumulate gradients across all processes.

Python
8
3 years ago

Scalable NLP model fine-tuning and batch inference with Ray and Anyscale

Jupyter Notebook
6
2 years ago

WAGMA-SGD is a decentralized asynchronous SGD scheme based on wait-avoiding group model averaging. Synchronization is relaxed by making the collectives externally triggerable; that is, a collective can be initiated without requiring all processes to enter it. It partially reduces the data within non-overlapping groups of processes, improving parallel scalability.

Python
6
4 years ago

Java-based Convolutional Neural Network package running on the Apache Spark framework

Java
4
8 years ago

This repository contains implementations of a wide variety of deep learning projects across computer vision, NLP, federated learning, and distributed learning. It includes university projects as well as projects undertaken out of personal interest in deep learning.

Jupyter Notebook
2
3 years ago

Distributed deep learning framework based on PyTorch, Numba, NCCL, and ZeroMQ.

Python
2
2 years ago