Production-Ready Template

Proactive MongoDB Monitoring with Prometheus Alert Templates

Monitoring your MongoDB infrastructure with Prometheus is critical for ensuring performance, reliability, and early detection of anomalies. This post explores the MongoDB alert rule set from DrDroidLab's Prometheus Alert Templates repository. We’ll walk through the types of alerts defined, dig into the key rules, and provide guidance on setup and tuning.

Core Alert Rules

MongoDown
Service availability check
mongodb_up == 0
Why this matters
This alert fires when the MongoDB exporter reports that MongoDB is unreachable. It generally indicates MongoDB is either down, misconfigured, or experiencing network issues.
Tuning tips
Use 'for: 2m' to reduce noise during restarts or transient network blips. Consider excluding known maintenance windows with Alertmanager routing logic.
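As a rough sketch of how the tuning tip above might look in a rule file, the fragment below wraps the expression in a Prometheus alerting rule with a 2-minute 'for' clause. The group name, severity label, and annotation text are illustrative assumptions, not taken from the repository.

```yaml
groups:
  - name: mongodb-availability   # hypothetical group name
    rules:
      - alert: MongoDown
        expr: mongodb_up == 0
        for: 2m                  # wait out restarts and short network blips
        labels:
          severity: critical     # assumed label; match it to your Alertmanager routes
        annotations:
          summary: "MongoDB exporter reports {{ $labels.instance }} as down"
```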
MongoReplicationLag
Replication health warning
mongodb_mongod_replset_member_optime_date{state="SECONDARY"} < ignoring(state) (mongodb_mongod_replset_member_optime_date{state="PRIMARY"} - 60)
Why this matters
Fires when a secondary's last applied operation trails the primary's by more than 60 seconds. Sustained lag points to data sync problems that can cause stale reads from secondaries or slower failover.
Tuning tips
Adjust the lag threshold based on your application’s tolerance, e.g., 30s for real-time applications or 120s for batch workloads. Monitor trends to detect performance degradation.
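To make the threshold adjustment concrete, here is a minimal sketch that reuses the expression above unchanged apart from tightening the lag from 60 to 30 seconds. The group name and the 'for' duration are assumptions chosen for illustration.

```yaml
groups:
  - name: mongodb-replication    # hypothetical group name
    rules:
      - alert: MongoReplicationLag
        # Same expression as the template, with the 60 replaced by 30 (seconds)
        expr: >
          mongodb_mongod_replset_member_optime_date{state="SECONDARY"}
          < ignoring(state)
          (mongodb_mongod_replset_member_optime_date{state="PRIMARY"} - 30)
        for: 5m                  # assumed duration; require sustained lag before alerting
        labels:
          severity: warning      # assumed label
```

For batch workloads, the same rule with 120 in place of 30 would match the looser tolerance mentioned above.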
MongoReplicationStateNotPrimary
Replica set health check
min(mongodb_mongod_replset_member_state{state="PRIMARY"}) != 1
Why this matters
This checks whether any MongoDB instance is currently in the PRIMARY state. If none is, a failover is likely in progress or the replica set is unhealthy, and the replica set cannot accept writes until a primary is elected.
Tuning tips
Set a short 'for' duration (e.g., 1m) to quickly respond to failover events, but buffer slightly to avoid temporary election noise during restarts.
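A possible rule definition following that advice is sketched below, with a 1-minute 'for' clause. The group name and severity label are assumptions.

```yaml
groups:
  - name: mongodb-replica-state   # hypothetical group name
    rules:
      - alert: MongoReplicationStateNotPrimary
        expr: min(mongodb_mongod_replset_member_state{state="PRIMARY"}) != 1
        for: 1m                   # short, but long enough to ride out a routine election
        labels:
          severity: critical      # assumed label
```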
MongoReplicationStateNotSecondary
Service availability check
count(mongodb_mongod_replset_member_state{state="SECONDARY"}) < 1
Why this matters
Fires when no member of the replica set is in the SECONDARY state. Without a secondary there is no failover target, so high availability and fault tolerance are no longer guaranteed.
Tuning tips
Use this alert in clusters with redundancy requirements. In smaller dev/test environments, consider disabling or loosening thresholds.
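One way to limit this alert to clusters that need redundancy is to add a label matcher to the expression, as in the sketch below. The "environment" label and its "production" value are assumptions; substitute whichever label your scrape configuration uses to separate production from dev/test.

```yaml
groups:
  - name: mongodb-replica-redundancy   # hypothetical group name
    rules:
      - alert: MongoReplicationStateNotSecondary
        # "environment" is an assumed label, not something the exporter sets by default
        expr: count(mongodb_mongod_replset_member_state{state="SECONDARY", environment="production"}) < 1
        for: 5m                        # assumed duration, looser than the primary check
```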
MongoNumberCursorsOpenHigh
Resource usage warning
mongodb_mongod_metrics_cursor_open{state="total"} > 10000
Why this matters
Indicates the number of open MongoDB cursors exceeds 10,000. Could signal long-lived queries or resource leaks affecting memory and performance.
Tuning tips
Baseline your typical open cursor count and adjust the threshold accordingly (e.g., high-throughput systems may need a higher threshold). Enable query profiling if this alert fires frequently.
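For instance, a high-throughput deployment might raise the threshold as sketched here. The value 20000, the 'for' duration, and the labels are illustrative assumptions rather than recommendations; derive the real number from your own baseline.

```yaml
groups:
  - name: mongodb-cursors   # hypothetical group name
    rules:
      - alert: MongoNumberCursorsOpenHigh
        # 20000 is an example value for a busy system, not a recommended default
        expr: mongodb_mongod_metrics_cursor_open{state="total"} > 20000
        for: 10m            # assumed; avoids alerting on short spikes
        labels:
          severity: warning # assumed label
```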

Quick Setup

1. Clone the repo: git clone https://github.com/DrDroidLab/prometheus-alert-templates.git
2. Navigate to the mongoDB folder: cd prometheus-alert-templates/mongoDB
3. Copy the alert rule file into your Prometheus alerting rules directory
4. Reload Prometheus configuration or restart the Prometheus server
5. Validate alert rules are active using Prometheus UI or API
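For Prometheus to pick up the copied file, its path has to be listed under rule_files in prometheus.yml. The fragment below is a minimal sketch; the directory path is an assumption, so point it at wherever you actually placed the rule file.

```yaml
# prometheus.yml (fragment)
rule_files:
  - /etc/prometheus/rules/*.yml   # assumed rules directory containing the copied MongoDB rule file
```

Before reloading, running promtool check rules against the copied file catches syntax errors early. Once loaded, the rules appear on the Alerts page of the Prometheus UI.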

Frequently Asked Questions

Do these alerts require the MongoDB exporter?
How do I integrate this with Alertmanager?
What version of MongoDB do these alerts support?
Can I disable specific alerts?
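On the Alertmanager question above, the Prometheus side of the integration is usually a short alerting block in prometheus.yml, sketched below. The target address is an assumption (9093 is Alertmanager's default port); routing, receivers, and silences are then configured in Alertmanager itself.

```yaml
# prometheus.yml (fragment)
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - "localhost:9093"   # assumed Alertmanager address
```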

Ready to Get Started?

Start monitoring MongoDB effectively with these battle-tested Prometheus alert templates. Clone and customize the alert rules from the GitHub repo at https://github.com/DrDroidLab/prometheus-alert-templates/blob/master/mongoDB to match your alerting strategy and operational maturity.


Doctor Droid