Repository navigation
self-healing
- Website
- Wikipedia
CloudNativePG is a comprehensive platform designed to seamlessly manage PostgreSQL databases within Kubernetes environments, covering the entire operational lifecycle from initial deployment to ongoing maintenance
Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
Infrastructure monitoring framework turning DevOps runbooks into automated actions
Provides deep monitoring and self-healing of Kubernetes clusters
A collection of cluster reliability tools for Kubernetes
SaunaFS is a free-and open source, distributed POSIX file system inspired by Google File System.
A collection of nodes for improve Node-RED dependability by enabling self-healing capabilities.
An experimental cluster brings Prometheus and OpenStack together
The DeMoD Communications Framework (DCF) is an open-source evolution of the DeMoD Secure Protocol (DSP), designed as a shareware version to promote community-driven development
A Docker Healer - Auto Restarting Unhealthy Containers
Java application for orchestrating AEM infrastructure created using aem-aws-stack-builder
Aegis is an LLM-powered AI cluster autonomous operations system, focused on intelligent capabilities such as Fault Diagnosis, Self-healing, Root Cause Analysis, Cluster Inspection, and Alert Optimization.
A Terraform module to quickly deploy a secure, persistent, highly available, self healing, efficient, cost effective and self managed single-node or multi-node MongoDB NoSQL document database cluster on AWS ECS cluster with monitoring and alerting enabled.
Turn photos into forcefield particle animations in real-time. js + canvas, open source
Robot Framework and OpenAI integration
A monitoring and healing framework. It uses Pester tests (TDD) to determine the state of an IT system entity. Then, if the entity is in a faulted state HealOps will try to repair it. All along HealOps reports metrics to a backend report system and HealOps status is sent to stakeholders. In order to e.g. trigger alarms and get on-call personnel on an issue that could not be repaired.
OpenFero is intended as an event-triggered job scheduler for code agnostic recovery jobs.