MySQL Monitoring with Prometheus: Production-Ready Alert Rules

Proactive monitoring is crucial for running MySQL in production. This post walks through a curated set of Prometheus alert rules for MySQL from the open-source project prometheus-alert-templates. You'll learn to detect outages, elevated error rates, InnoDB log waits, excessive rollbacks, and slow queries, all using PromQL. We also include tuning tips and setup instructions for integrating these alerts into your Prometheus + Alertmanager pipeline.

Core Alert Rules

MySQL is down

absent(up{job=~"mysql.*"} == 1)

Why this matters

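This alert fires when no MySQL scrape target is reporting metrics: either the target is down or Prometheus cannot reach it. The expression filters for targets with up == 1 and wraps the result in absent(), which returns a value only when that filtered set is empty. As a consequence, the alert stays silent as long as at least one matching target is up.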

Tuning tips

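Make sure your MySQL job names match the regex mysql.*, or adjust the matcher to fit your Prometheus job labels. If you run several instances and want to hear about each one individually, a per-target expression such as up{job=~"mysql.*"} == 0 may be a better fit.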
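As a minimal sketch, the expression above could be wrapped in a rule like the following. The group name, alert name, for duration, severity label, and annotation text are illustrative assumptions and are not taken from the template file.

```yaml
groups:
  - name: mysql-availability        # hypothetical group name
    rules:
      - alert: MySQLDown            # hypothetical alert name
        # Fires when no target whose job label matches mysql.* reports up == 1.
        expr: absent(up{job=~"mysql.*"} == 1)
        for: 1m                     # assumed hold time, to ride out a single failed scrape
        labels:
          severity: critical        # assumed label, used later for Alertmanager routing
        annotations:
          summary: "No MySQL scrape target is up"
```

Note that up only tells you whether the scrape of the exporter succeeded. If you collect metrics with mysqld_exporter, its mysql_up gauge reports whether the exporter can reach the MySQL server itself, so a complementary mysql_up == 0 rule may be worth considering.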

MySQL instance experiencing errors

rate(mysql_global_status_errors[1m]) > 0.5

Why this matters

Triggers when the error rate for MySQL is over 0.5 per second. Useful to catch repeated query errors, connection issues, or other unexpected behavior.

Tuning tips

Tune the threshold (> 0.5) based on expected workload and business tolerance to transient errors. You might average over 5m for less noisy alerting.
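The 5m variant suggested above might look like this as a full rule. The metric name is copied from the expression in this section, so check that your exporter actually exposes it. The alert name, for duration, label, and annotations are assumptions for illustration.

```yaml
groups:
  - name: mysql-errors              # hypothetical group name
    rules:
      - alert: MySQLHighErrorRate   # hypothetical alert name
        # Same metric and threshold as above, averaged over 5m instead of 1m.
        expr: rate(mysql_global_status_errors[5m]) > 0.5
        for: 5m                     # assumed: condition must hold for 5 minutes before firing
        labels:
          severity: warning         # assumed label
        annotations:
          summary: "MySQL error rate above 0.5/s on {{ $labels.instance }}"
          description: "Current rate: {{ $value | humanize }} errors/s"
```

A longer rate window smooths short bursts, and the for clause adds a second layer of damping; in exchange, the alert takes longer to fire.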

MySQL-InnoDB Log Waits

rate(mysql_global_status_innodb_log_waits[1m]) > 0

Why this matters

Fires when InnoDB has had to wait for the log buffer to be flushed, which suggests IO contention or an innodb_log_buffer_size that is too small.

Tuning tips

Use this alert to catch disk saturation issues. Tune based on expected log throughput and IO speed.

MySQL handler rollbacks indicate issues

rate(mysql_global_status_handler_rollback[5m]) > 10

Why this matters

This detects if excessive rollbacks are occurring in query handlers, which can signal application or SQL logic bugs.

Tuning tips

Adjust the rollback threshold based on your application's normal query patterns.
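One way to find a sensible threshold is to record the rollback rate as its own series, graph it for a while, and then alert on the recorded series. The sketch below assumes that approach; the recorded series name, alert name, for duration, and label are hypothetical.

```yaml
groups:
  - name: mysql-rollbacks           # hypothetical group name
    rules:
      # Recording rule: stores the per-second rollback rate so the baseline can be graphed.
      - record: instance:mysql_handler_rollback:rate5m   # hypothetical series name
        expr: rate(mysql_global_status_handler_rollback[5m])
      # Alerting rule: reuses the recorded series with the threshold from this section.
      - alert: MySQLHighRollbackRate                     # hypothetical alert name
        expr: instance:mysql_handler_rollback:rate5m > 10
        for: 5m                     # assumed hold time
        labels:
          severity: warning         # assumed label
        annotations:
          summary: "High handler rollback rate on {{ $labels.instance }}"
```

Rules in a group are evaluated in order, so the alert sees the value recorded in the same evaluation cycle.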

MySQL slow queries over threshold

rate(mysql_global_status_slow_queries[1m]) > 1

Why this matters

Fires when MySQL records more than one slow query per second, which points to performance regressions or locking issues.

Tuning tips

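Set long_query_time on your MySQL server so that the server's definition of a slow query matches what this alert should catch. The underlying Slow_queries counter increments for every query that runs longer than long_query_time seconds.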

Quick Setup

Download the MySQL alert rules file from https://github.com/DrDroidLab/prometheus-alert-templates/blob/master/mysql/mysql_alert_rules.yml

Include the alert rule file in your Prometheus configuration using the 'rule_files' directive.
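A prometheus.yml excerpt along these lines would load the rules and point Prometheus at Alertmanager. The file path, job name, and target addresses are assumptions; substitute your own.

```yaml
# prometheus.yml (excerpt)
rule_files:
  - "rules/mysql_alert_rules.yml"       # assumed path to the downloaded rules file

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # assumed Alertmanager address

scrape_configs:
  - job_name: "mysql"                   # must match the mysql.* job regex used by the rules
    static_configs:
      - targets: ["localhost:9104"]     # assumed exporter address (9104 is mysqld_exporter's default port)
```

Before loading the file, you can validate its syntax with promtool check rules followed by the path to the rules file.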

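Reload Prometheus to apply the new alert rules, either by sending the process a SIGHUP or by sending a POST request to the /-/reload endpoint (available when Prometheus runs with --web.enable-lifecycle). A restart also works.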

Ensure Alertmanager is properly configured to receive and route these alerts.
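For orientation, a small alertmanager.yml might route alerts as sketched below. It assumes the rules carry a severity label, which you should verify against the downloaded file; the receiver names and webhook URL are placeholders.

```yaml
# alertmanager.yml (sketch)
route:
  receiver: default                     # fallback receiver for anything not matched below
  group_by: ["alertname", "instance"]
  routes:
    - matchers:
        - severity="critical"           # assumes rules set a severity label
      receiver: oncall

receivers:
  - name: default                       # no integrations configured; alerts are only visible in the UI
  - name: oncall
    webhook_configs:
      - url: "http://example.internal/alert-hook"   # placeholder URL
```

Grouping by alertname and instance keeps notifications for the same problem on the same server bundled together.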

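Frequently Asked Questions

What exporter do I need to collect MySQL metrics?

The mysql_global_status_* metric names used in these rules follow the naming of the Prometheus mysqld_exporter, so that exporter is the natural choice. Check that the exact metric names in the rules match what your exporter version exposes.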

Ready to Get Started?

Integrate these MySQL Prometheus alert templates into your production monitoring stack today. Get the rules on GitHub: https://github.com/DrDroidLab/prometheus-alert-templates/blob/master/mysql