checkpointing

Cedana: Access and run on compute anywhere in the world, on any provider. Migrate seamlessly between providers, arbitraging price/performance in real time to maximize pure runtime.

Go
58
17 days ago

Very-Low Overhead Checkpointing System

C++
57
3 months ago

Keras wrapper that autosaves what ModelCheckpoint cannot.

Python
24
3 years ago

Extending DOLFINx with checkpointing functionality

Python
24
18 days ago

This Flink project consumes streams from an Azure Event Hub and produces to a different Event Hub. It also includes the config files for deploying it on Kubernetes.

Java
2
4 years ago

Code and tutorial on integrating wandb sweeps with Slurm pre-emption

Python
2
7 months ago

A standalone Flink producer used for testing the contents of the flink-consume-produce-ek repo.

Java
1
4 years ago

A Python package for performing memory-intensive computations in parallel using chunks and checkpointing.

Python
1
3 days ago

A shared library to help test your code with failure injection

C++
1
3 months ago

DMTCP scripts to get Python scripts working with SLURM.

Shell
1
1 year ago

A face recognition manager for digital albums that isolates images of a specified person.

Python
0
10 months ago

Compile a torch model to a checkpointed model

Python
0
5 years ago

🌀 data objects for Bash (attempt one).

0
7 months ago

Koo and Toueg’s checkpointing and recovery protocol

Java
0
1 year ago

A Python package for checkpointing, saving, and loading objects.

Python
0
3 days ago