Skip to content
← All runbooks

Monitoring & on-call

Discord Alerting Topology

Phase 12, consolidated alerting routes to #ops-alerts.

Channel: #ops-alerts

All production alerts route to this single channel. Team members with on-call role have notifications enabled.

Alert Sources

Source
Honeybadger
Trigger
Missed heartbeat (daemon down)
Format
Webhook embed
Source
Sentry
Trigger
Critical error rate-limited (>5 in 5min)
Format
Sentry Discord bot
Source
New Relic
Trigger
Threshold breach (p95, error rate, lag)
Format
Webhook embed
Source
GitHub Actions
Trigger
Workflow failure on master
Format
GHA Discord webhook (Phase 3)
Source
Subgraph cron
Trigger
Block lag >200 for 30min
Format
Custom webhook (Phase 4)
Source
Vigil-keeper
Trigger
Liquidation executed (informational)
Format
Custom webhook
Source
Lantern
Trigger
Publish completed (informational)
Format
Custom webhook

Example Payloads

Honeybadger Missed Heartbeat

{
  "content": "⚠️ **Honeybadger**: `notifier` missed heartbeat",
  "embeds": [{
    "title": "Check-In Missed: notifier",
    "description": "No ping received in 60s (expected every 30s)",
    "color": 16776960,
    "timestamp": "2026-05-28T14:00:00Z"
  }]
}

New Relic Threshold Alert

{
  "content": "🚨 **New Relic**: p95 latency > 3s for 10 min",
  "embeds": [{
    "title": "CRITICAL: p95 latency > 3s for 10 min",
    "description": "atrium-verify p95 response time exceeded 3s threshold",
    "color": 15158332,
    "fields": [
      { "name": "Current Value", "value": "4.2s", "inline": true },
      { "name": "Threshold", "value": "3s", "inline": true },
      { "name": "Duration", "value": "12 min", "inline": true }
    ]
  }]
}

Sentry Critical Error

{
  "content": "🐛 **Sentry**: New critical error in atrium-verify",
  "embeds": [{
    "title": "TypeError: Cannot read properties of undefined",
    "url": "https://atrium.sentry.io/issues/12345",
    "color": 15158332,
    "fields": [
      { "name": "Level", "value": "error", "inline": true },
      { "name": "Events", "value": "12", "inline": true }
    ]
  }]
}

Subgraph Staleness

{
  "content": "📊 **Scribe**: Block lag exceeded threshold",
  "embeds": [{
    "title": "Subgraph lag > 200 blocks for 30 min",
    "description": "Current lag: 247 blocks. Investigate indexer health.",
    "color": 16776960
  }]
}

GHA Workflow Failure

{
  "content": "❌ **GitHub Actions**: `ci.yml` failed on `master`",
  "embeds": [{
    "title": "CI Pipeline Failed",
    "url": "https://github.com/atrium-protocol/atrium/actions/runs/12345",
    "color": 15158332
  }]
}

Informational (low-priority, no @mention)

Vigil-Keeper Liquidation

{
  "content": "⚡ Vigil-keeper executed liquidation for account `0x1a2b...`, margin ratio 0.92"
}

Lantern Publish

{
  "content": "✅ Lantern published attestation #142, 3 leaves, root `0xabc1...`"
}

Routing Rules

  • P0/P1 alerts (Honeybadger miss, NR critical, Sentry fatal): @on-call mention
  • Informational (liquidations, publishes): no mention, logged for audit trail
  • Rate limiting: Sentry limited to 1 alert per issue per 10 min; NR limited by condition duration