
Monitoring & Alerts

Running a validator requires 24/7 visibility into node health, consensus participation, and resource usage. This guide covers setting up Prometheus and Grafana for Mersennet monitoring, plus recommended alert rules.

Overview

A typical monitoring stack includes:

Component      Purpose
Prometheus     Scrapes metrics from the Mersennet node
Grafana        Dashboards and visualization
Alertmanager   Routes alerts (email, Slack, PagerDuty)

Key Metrics

Monitor the following metrics exposed by the Mersennet node:

Metric                                 Description
prime_chain_block_height               Current block height; should increase steadily
prime_chain_total_stake                Total staked PRIM across all validators
prime_chain_tx_count                   Transaction count (per block or cumulative)
prime_chain_pending_txs                Mempool size; high values may indicate congestion
prime_chain_peer_count                 Number of connected P2P peers
prime_chain_validator_missed_blocks    Blocks you failed to sign (slashing risk)

Tip: Metric names may vary by implementation. Check your node's /metrics endpoint or documentation for the exact names.
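
Because the names can differ between builds, it helps to confirm which ones your node actually serves before wiring up dashboards. The following is a minimal sketch, not part of the Mersennet tooling: `missing_metrics` is a hypothetical helper that reads Prometheus text-format output on stdin and prints every expected name that does not appear. The port 9091 in the usage comment is the example port used in this guide.

```shell
# Print each expected metric name that is absent from a Prometheus
# text-format exposition read on stdin (one missing name per line).
missing_metrics() {
  local exposition name
  exposition="$(cat)"
  for name in "$@"; do
    # A sample line starts with the metric name followed by a space or '{'.
    if ! grep -Eq "^${name}(\{| )" <<<"$exposition"; then
      echo "$name"
    fi
  done
}

# Example (assumes the node serves metrics on localhost:9091):
# curl -s http://localhost:9091/metrics | missing_metrics \
#   prime_chain_block_height prime_chain_peer_count prime_chain_pending_txs
```

Empty output means every listed metric was found; any printed name needs to be looked up in your node's documentation.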

Prometheus Setup

1. Install Prometheus

# Ubuntu/Debian
sudo apt update
sudo apt install prometheus

# Or use the official binary
wget https://github.com/prometheus/prometheus/releases/download/v2.45.0/prometheus-2.45.0.linux-amd64.tar.gz
tar xvfz prometheus-*.tar.gz
cd prometheus-*

2. Configure Scraping

Edit prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prime-chain'
    static_configs:
      - targets: ['localhost:9091']  # Mersennet metrics port

Ensure your Mersennet node exposes metrics on the configured port (e.g. 9091). The exact port is set in the node config.

3. Start Prometheus

./prometheus --config.file=prometheus.yml
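
Once Prometheus is running, you can check that it is scraping the node by querying the built-in `up` metric, which is 1 when the last scrape of a target succeeded. The sketch below is a rough check under stated assumptions: `scrape_is_up` is a hypothetical helper that pattern-matches the compact JSON returned by the Prometheus HTTP API rather than parsing it properly, and the URL assumes Prometheus listens on its default port 9090.

```shell
# Succeed if an instant-query response (JSON on stdin) contains at least
# one series whose sample value is "1", i.e. the last scrape succeeded.
scrape_is_up() {
  grep -Eq '"value":\[[0-9.e+]+,"1"\]'
}

# Example (assumes Prometheus on localhost:9090 and the job name
# 'prime-chain' from the scrape config):
# curl -sG http://localhost:9090/api/v1/query \
#   --data-urlencode 'query=up{job="prime-chain"}' | scrape_is_up \
#   && echo "target is up"
```

For anything beyond a quick check, a JSON-aware tool is a safer way to read the response.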

Grafana Setup

1. Install Grafana

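# Ubuntu/Debian
sudo apt install -y apt-transport-https software-properties-common wget
# Import the Grafana signing key into a dedicated keyring (apt-key is deprecated)
sudo mkdir -p /etc/apt/keyrings
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/etc/apt/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update
sudo apt install -y grafana
sudo systemctl enable grafana-server
sudo systemctl start grafana-server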

2. Add Prometheus Data Source

  1. Open Grafana at http://localhost:3000
  2. Log in (default: admin/admin)
  3. Configuration → Data Sources → Add data source
  4. Select Prometheus
  5. URL: http://localhost:9090
  6. Save & Test
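
The same data source can also be set up through Grafana's file-based provisioning instead of the UI, which is easier to reproduce on a rebuilt host. This is a minimal sketch: `write_prometheus_datasource` is a hypothetical helper, and the target directory in the usage comment is the default location for the Debian/Ubuntu package, which may differ on your install.

```shell
# Write a Grafana data source provisioning file into the given directory.
write_prometheus_datasource() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/prometheus.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
}

# Example (assumes the Debian/Ubuntu package layout):
# write_prometheus_datasource /tmp/grafana-ds
# sudo cp /tmp/grafana-ds/prometheus.yaml /etc/grafana/provisioning/datasources/
# sudo systemctl restart grafana-server
```

Grafana reads provisioning files at startup, so the restart in the usage comment is what makes the data source appear.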

3. Import or Create Dashboards

Create panels for:

  • Block height: graph of prime_chain_block_height over time
  • Total stake: gauge or stat for prime_chain_total_stake
  • Transaction count: rate of prime_chain_tx_count
  • Pending transactions: prime_chain_pending_txs
  • Peer count: prime_chain_peer_count
  • Missed blocks: prime_chain_validator_missed_blocks (critical for validators)
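
Before building panels, it can be useful to try the queries against the Prometheus HTTP API directly. The sketch below assumes Prometheus on localhost:9090; `promql` and the `PROM_URL` variable are hypothetical names introduced for illustration. The `rate()` query only makes sense if prime_chain_tx_count is a cumulative counter, which should be confirmed for your node.

```shell
# Base URL of the Prometheus server (assumed default; override as needed).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant PromQL query and print the raw JSON response.
promql() {
  curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Candidate panel queries (the same expressions can be pasted into a
# Grafana panel's query editor):
# promql 'prime_chain_block_height'
# promql 'prime_chain_total_stake'
# promql 'rate(prime_chain_tx_count[5m])'   # assumes a cumulative counter
# promql 'prime_chain_pending_txs'
# promql 'prime_chain_peer_count'
# promql 'increase(prime_chain_validator_missed_blocks[1h])'
```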

Alert Rules

Configure Prometheus alerting to catch issues before they cause slashing or downtime.

Prometheus Alert Rules

Create alerts.yml next to prometheus.yml. Prometheus 2.x loads alert rules only from separate rule files, not from prometheus.yml itself:

groups:
  - name: prime-chain
    rules:
      # Block production stalled
      - alert: MersennetBlockStalled
        expr: increase(prime_chain_block_height[5m]) == 0
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Mersennet block production stalled"
          description: "No new blocks in 5 minutes. Node may be out of sync or consensus may be stuck."

      # Missed blocks (slashing risk)
      - alert: MersennetMissedBlocks
        expr: increase(prime_chain_validator_missed_blocks[1h]) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Validator missed blocks"
          description: "Validator has missed blocks in the last hour. Risk of downtime slashing."

      # Low disk space
      - alert: MersennetLowDiskSpace
        expr: (node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low disk space on Mersennet node"
          description: "Less than 10% disk space remaining. Node may stop if disk fills."

      # Peer count too low
      - alert: MersennetLowPeerCount
        expr: prime_chain_peer_count < 3
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Low P2P peer count"
          description: "Fewer than 3 peers connected. Network connectivity may be degraded."

Reference the rules file in prometheus.yml:

rule_files:
  - 'alerts.yml'
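
A syntax error in a rules file keeps the rules from loading, so it is worth validating the file before reloading. The sketch below uses `promtool`, which ships alongside the `prometheus` binary in the release tarball, and then sends SIGHUP, which makes Prometheus re-read its configuration and rule files. `check_and_reload_rules` is a hypothetical helper name, and it assumes the server process is called `prometheus`.

```shell
# Validate a rules file, then ask the running Prometheus to reload.
check_and_reload_rules() {
  local rules="${1:-alerts.yml}"
  # Stop here if the rules file does not pass validation.
  promtool check rules "$rules" || return 1
  # SIGHUP triggers a reload of prometheus.yml and all rule files.
  pkill -HUP -x prometheus
}

# Example:
# check_and_reload_rules alerts.yml
```

If Prometheus was started with --web.enable-lifecycle, a POST request to its /-/reload endpoint is an alternative to the signal.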

Alertmanager (Optional)

To send alerts to email, Slack, or PagerDuty:

  1. Install Alertmanager
  2. Configure receivers (e.g. Slack webhook)
  3. Point Prometheus at Alertmanager in the alerting section of prometheus.yml (the alertmanager.url flag applies only to Prometheus 1.x)
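
As a starting point for steps 2 and 3, the sketch below writes a minimal Alertmanager config with a single Slack receiver and prints the block that tells Prometheus where Alertmanager is. The function names, receiver name, channel, and webhook URL are all placeholders for illustration; port 9093 is Alertmanager's default listen port.

```shell
# Write a minimal Alertmanager config that routes every alert to Slack.
# The webhook URL and channel are placeholders; replace them with your own.
write_alertmanager_config() {
  local out="${1:-alertmanager.yml}"
  cat > "$out" <<'EOF'
route:
  receiver: slack-validators
  group_by: ['alertname']
receivers:
  - name: slack-validators
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK'
        channel: '#validator-alerts'
        send_resolved: true
EOF
}

# Print the block to add to prometheus.yml so that Prometheus forwards
# firing alerts to Alertmanager.
print_alerting_block() {
  cat <<'EOF'
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
EOF
}

# Example:
# write_alertmanager_config alertmanager.yml
# print_alerting_block   # paste the output into prometheus.yml
```

Routing everything to one receiver keeps the example short; separate routes per severity label are a natural next step once the basic path works.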

Best Practices

Practice   Recommendation
Uptime     Aim for 99.9%+ to avoid downtime slashing
Disk       Monitor usage and expand storage before free space drops to 10%
Peers      Maintain at least 5–10 stable peers
Backups    Back up the validator key and config; never expose the key
Alerts     Route critical alerts to a channel you monitor 24/7

Next Steps