How a High-Growth Marketplace Automated On-Call and Slashed MTTA by 96%
Company Overview
A high-growth e-commerce marketplace whose DevOps team is responsible for maintaining critical infrastructure services.
The Challenge
As the platform scaled, the marketplace's DevOps team faced mounting alert fatigue:
Repetitive warnings from VMs, Elasticsearch, and PostgreSQL
Key infrastructure components were generating frequent alerts requiring manual intervention.
40% of on-call time wasted on non-critical alerts
Engineers were spending significant time addressing alerts that didn't require immediate attention.
Engineers overwhelmed, on-call rotations stretched thin
The high volume of alerts was leading to alert fatigue and burnout among the engineering team.
They needed a way to reduce noise, regain focus, and make on-call survivable again.
The Implementation
They rolled out DrDroid across their monitoring and infrastructure stack:
Integrated with AWS, databases, Elasticsearch, PostgreSQL, and Slack
Connected DrDroid with their entire infrastructure and communication stack.
Deployed automated playbooks for top recurring incidents
Created automated playbooks to handle common alert scenarios without human intervention.
Achieved full coverage across key services within 5 weeks
DrDroid covered all key services within 5 weeks.
The process was plug-and-play. No manual scripting. Just results.
The Results
MTTA down from 15 minutes to <60 seconds
Mean Time To Acknowledge (MTTA) dropped by 96%, giving engineers back their time.
Escalations cut by 70%
The number of incidents requiring escalation to senior engineers was reduced by 70%, giving engineers back their focus.
False positives reduced by 85%
DrDroid's intelligent alert filtering dramatically reduced the number of false positive alerts, restoring trust in the alert system.
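The MTTA figures above can be sanity-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 36-second value is derived from the stated 96% reduction rather than reported in this case study. It shows that a 96% drop from 15 minutes implies roughly 36 seconds, which is consistent with "<60 seconds" (a drop to exactly 60 seconds would be about 93%).

```python
def percent_reduction(before_s: float, after_s: float) -> float:
    """Return the percentage drop from before_s to after_s."""
    if before_s <= 0:
        raise ValueError("before_s must be positive")
    return (before_s - after_s) / before_s * 100.0


def after_for_reduction(before_s: float, reduction_pct: float) -> float:
    """Return the value left after cutting before_s by reduction_pct percent."""
    return before_s * (1.0 - reduction_pct / 100.0)


# Baseline from the case study: 15 minutes, expressed in seconds.
BASELINE_MTTA_S = 15 * 60

# MTTA implied by a 96% reduction (derived, not a reported measurement).
print(round(after_for_reduction(BASELINE_MTTA_S, 96), 1))  # 36.0

# Reduction if MTTA were exactly 60 seconds, for comparison.
print(round(percent_reduction(BASELINE_MTTA_S, 60), 1))  # 93.3
```

The same two helpers apply to any before/after pair of response-time measurements, provided both are in the same unit.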
"Now I actually trust the alerts in my inbox. Everything noisy gets handled before it even reaches me. DrDroid has been a game-changer for our on-call experience. Our engineers are no longer overwhelmed with alert noise, and we can automatically resolve most common issues."