SRE

Wikipedia

Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.

bregman-arie/devops-exercises

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

Python
79141
7 days ago

A curated list of amazingly awesome open-source sysadmin resources.

31195
2 months ago
upgundecha/howtheysre

A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

JavaScript
9470
1 month ago
runatlantis/atlantis
Go
8538
1 hour ago

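⭐ [Published book] JD.com purchase link: https://item.jd.com/10183653901041.html An in-depth treatment of cloud-native technologies, including kernel networking, Kubernetes, service mesh, and containers. A practice-tested guide to DevOps and SRE.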

JavaScript
8386
18 days ago
linkedin/school-of-sre

At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.

HTML
8042
1 year ago

Coroot is an open-source observability and APM tool with AI-powered Root Cause Analysis. It combines metrics, logs, traces, continuous profiling, and SLO-based alerting with predefined dashboards and inspections.

Go
7102
2 days ago

Giving Kubernetes Superpowers to everyone

Go
7004
17 hours ago

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html

Python
6336
3 days ago

Compilation of public failure/horror stories related to Kubernetes

HTML
6216
5 years ago

Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts

Groovy
5901
1 day ago

Kubernetes prompt info for bash and zsh

Shell
3718
4 months ago

CDN Up and Running - Building a CDN from Scratch to Learn about CDN, Nginx, Lua, Prometheus, Grafana, Load balancing, and Containers.

Lua
3555
1 year ago
wmariuss/awesome-devops

A curated list of awesome DevOps platforms, tools, practices and resources

Python
3316
6 hours ago
bregman-arie/sre-checklist

A checklist for anyone practicing Site Reliability Engineering

2413
2 years ago
DevOpsHiveHQ/dynamic-devops-roadmap

A FREE pragmatic DevOps learning to kickstart your DevOps career and knowledge in the Cloud Native era following the Agile MVP style! ⭐ (2025 plans for DevOps, Cloud, Platform, SRE, SWE)

TypeScript
2189
1 month ago