
reliability-engineering

litmuschaos/litmus

Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q

Go
4802
5 days ago
bregman-arie/sre-checklist

A checklist for anyone practicing Site Reliability Engineering

2393
1 year ago

Hands on labs and code to help you learn, measure, and build using architectural best practices.

Python
2066
1 month ago
Python
1950
1 year ago

This repository provides a design methodology and approach to building highly-reliable applications on Microsoft Azure for mission-critical workloads.

521
7 months ago

Reliability engineering toolkit for Python - https://reliability.readthedocs.io/en/latest/

Python
386
5 months ago

Serverless chaos monkey for AWS (runs on AWS Lambda) ☁️ 💥

JavaScript
290
2 years ago

OpenShift Guide. Learn about the Red Hat OpenShift Container Platform, Data Science, CodeReady Containers, Podman, Buildah, and Kubernetes.

Python
154
2 years ago

Probabilistic Risk Analysis Tool (fault tree analysis, event tree analysis, etc.)

C++
151
2 years ago

The Chaos Toolkit core library

Python
77
1 year ago

An opinionated list of attributes and policies that need to be met in order to establish a stable software system.

52
8 years ago

A Python package for survival analysis. The most flexible survival analysis package available. SurPyval can work with arbitrary combinations of observed, censored, and truncated data. SurPyval can also fit distributions with 'offsets' with ease, for example the three-parameter Weibull distribution.

Python
48
9 months ago

A collection of templates ported from the SRE Workbook

41
7 years ago