aliases:
- SRE
- Reliability engineering
tags:
- Type/Tech
- area/tech/devops
- seed
publish: true
version: 1
dateCreated: 2021-10-24, 19:05
dateModified: 2024-04-04, 09:50
from:
- "[[Tech]]"
- "[[DevOps]]"
- "[[Software Measurement]]"
related:
contra:
to:
Site Reliability Engineering (SRE) is a set of principles and practices that applies aspects of software engineering to IT infrastructure and operations. SRE claims to create highly reliable and scalable software systems. Although they are closely related, SRE is slightly different from DevOps. | |
---|---|
wikipedia:: Site reliability engineering | |
Reliability engineering is a sub-discipline of systems engineering that emphasizes the ability of equipment to function without failure. Reliability describes the ability of a system or component to function under stated conditions for a specified period of time. Reliability is closely related to availability, which is typically described as the ability of a component or system to function at a specified moment or interval of time. | |
wikipedia:: Reliability engineering |
Also, reliability engineering, in general
Largely focuses on metrics, logging, monitoring, alerting, and recovery, and engineering for resilience like the chaos monkey, etc.