Alert fatigue in software engineering is a state of exhaustion in which the sheer volume of alerts leads the people responsible for handling them to turn a deaf ear, so that alerts end up overlooked or disregarded.
For example, if a Slack channel receives 100+ alerts a day, an on-call engineer may grow numb to the channel because of the noise, and may end up looking for some other way to get actionable signals (perhaps checking another metric, or waiting for someone to ping them).
Here are some practices that we have seen help engineering teams reduce alert fatigue:
- Make tuning the warning and incident threshold levels part of the KRAs (key result areas) of the on-call engineer / SRE.
- Define and prioritize symptom-based alerts so that the needles are not lost in the haystack.
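To make the two practices above concrete, here is a minimal sketch of a symptom-based check with separate warning and incident thresholds. It is an illustration under assumptions, not a description of any particular tool: the symptom chosen (request error rate), the threshold values, and the names `Thresholds`, `classify`, and `should_page` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical threshold levels that an on-call engineer would tune over time."""
    warning: float   # error-rate fraction worth a look during working hours
    incident: float  # error-rate fraction that should page someone


def classify(error_rate: float, thresholds: Thresholds) -> str:
    """Map a user-facing symptom (error rate) to a severity level."""
    if error_rate >= thresholds.incident:
        return "incident"
    if error_rate >= thresholds.warning:
        return "warning"
    return "ok"


def should_page(severity: str) -> bool:
    """Only incidents interrupt the on-call engineer; warnings can go to a digest."""
    return severity == "incident"


# Assumed example values; real thresholds depend on the service and its traffic.
thresholds = Thresholds(warning=0.02, incident=0.05)
samples = [0.001, 0.03, 0.08]
severities = [classify(rate, thresholds) for rate in samples]
pages = [should_page(severity) for severity in severities]
```

With these assumed values, only the last sample pages the engineer. Keeping the thresholds in one small, named structure is one way to make tuning them an explicit, reviewable task rather than an afterthought.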
Doctor Droid helps companies monitor the critical KPIs associated with their operations and product, keeping the focus on customer experience.
Our team has deep experience in helping companies set up their monitoring and observability stack, so if you need any help setting yours up, we are happy to assist. You can reach out to us here.