DrDroid

Error budget

What is an Error Budget?

An error budget is a way of defining acceptable reliability limits and managing variance within them. It represents the amount of errors or downtime that a service can experience within a defined period of time. Think of it as a limit on how many errors your software can have before it is considered unreliable and becomes a problem for the engineering team to manage.
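To make the definition concrete, a budget for an availability target is commonly computed as one minus the SLO target, multiplied by the length of the window. The sketch below assumes a time-based availability SLO; the function names `error_budget_minutes` and `budget_remaining_minutes` are hypothetical and used only for illustration.

```python
def error_budget_minutes(slo_target: float, window_minutes: float) -> float:
    """Allowed downtime (in minutes) for an availability SLO over a window.

    slo_target is a fraction, e.g. 0.999 for "99.9% available".
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    if window_minutes < 0:
        raise ValueError("window_minutes must be non-negative")
    # The budget is whatever the SLO does not promise: (1 - target) of the window.
    return (1.0 - slo_target) * window_minutes


def budget_remaining_minutes(
    slo_target: float, window_minutes: float, downtime_minutes: float
) -> float:
    """Budget left after subtracting observed downtime (negative if overspent)."""
    return error_budget_minutes(slo_target, window_minutes) - downtime_minutes


# Illustrative values (assumptions, not taken from the article):
# a 99.9% target over a 30-day window, with 10 minutes of downtime so far.
THIRTY_DAYS_IN_MINUTES = 30 * 24 * 60
budget = error_budget_minutes(0.999, THIRTY_DAYS_IN_MINUTES)
remaining = budget_remaining_minutes(0.999, THIRTY_DAYS_IN_MINUTES, 10.0)
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43 minutes of downtime, so 10 minutes of downtime leaves about 33 minutes of budget.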

How does Error Budget help engineering teams?

An error budget helps teams prioritize work, ensuring that the most critical problems are dealt with first. It also helps prevent the accumulation of technical debt that can slow down development over time.

Engineering teams create error budget policies to protect customers from repeated misses of service level objectives (SLOs).
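As a rough illustration of how such a policy might be checked in practice, the sketch below maps budget consumption to a release decision. It assumes a request-based SLO, and everything in it is hypothetical: the `BudgetStatus` class, the `release_decision` function, the 75% warning threshold, and the "ship" / "caution" / "freeze" labels are illustrative choices, not part of any published policy.

```python
import math
from dataclasses import dataclass


@dataclass
class BudgetStatus:
    """Event counts observed in the current SLO window (hypothetical model)."""

    total_events: int
    failed_events: int
    slo_target: float  # fraction, e.g. 0.999

    @property
    def allowed_failures(self) -> float:
        # Failures the SLO tolerates for this many events.
        return (1.0 - self.slo_target) * self.total_events

    @property
    def consumed_fraction(self) -> float:
        # Share of the budget already spent; 1.0 means fully spent.
        allowed = self.allowed_failures
        if allowed == 0:
            return math.inf if self.failed_events else 0.0
        return self.failed_events / allowed


def release_decision(status: BudgetStatus, warn_at: float = 0.75) -> str:
    """Return 'freeze', 'caution', or 'ship' based on budget consumption.

    warn_at is an assumed threshold; a real policy would set its own.
    """
    used = status.consumed_fraction
    # isclose guards the exact-boundary case against floating-point rounding.
    if used >= 1.0 or math.isclose(used, 1.0):
        return "freeze"  # budget spent: pause risky changes
    if used >= warn_at:
        return "caution"  # budget nearly spent: slow down
    return "ship"


# Illustrative windows of 1,000,000 requests against a 99.9% target.
healthy = release_decision(BudgetStatus(1_000_000, 500, 0.999))
strained = release_decision(BudgetStatus(1_000_000, 800, 0.999))
exhausted = release_decision(BudgetStatus(1_000_000, 1_200, 0.999))
```

A real policy would also spell out who makes the call, which changes are exempt, and how the window is measured; the point here is only that the decision can be driven directly by how much of the budget has been consumed.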

Here’s an example of an error budget policy by Google SRE.