reliability-engineering

litmuschaos/litmus

Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q

Go
4670
2 days ago
bregman-arie/sre-checklist

A checklist for anyone practicing Site Reliability Engineering

2354
1 year ago

Hands on labs and code to help you learn, measure, and build using architectural best practices.

Python
2040
2 days ago
Python
1919
9 months ago

This repository provides a design methodology and approach to building highly-reliable applications on Microsoft Azure for mission-critical workloads.

521
3 months ago

Reliability engineering toolkit for Python - https://reliability.readthedocs.io/en/latest/

Python
365
1 month ago

Serverless chaos monkey for AWS (runs on AWS Lambda) ☁️ 💥

JavaScript
290
1 year ago

OpenShift Guide. Learn about the Red Hat OpenShift Container Platform, Data Science, Code Ready Containers, Podman, Buildah, and Kubernetes.

Python
150
1 year ago

Probabilistic Risk Analysis Tool (fault tree analysis, event tree analysis, etc.)

C++
147
2 years ago

The Chaos Toolkit core library

Python
77
8 months ago

An opinionated list of attributes and policies that need to be met in order to establish a stable software system.

52
8 years ago

A Python package for survival analysis. The most flexible survival analysis package available. SurPyval can work with arbitrary combinations of observed, censored, and truncated data. SurPyval can also easily fit distributions with 'offsets', for example the three-parameter Weibull distribution.

Python
48
5 months ago

A collection of templates ported from the SRE Workbook

40
7 years ago