Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.agentbot.raveculture.xyz/llms.txt

Use this file to discover all available pages before exploring further.

Watchdog API

The watchdog monitors agent gateway processes for health and automatically recovers from failures. It detects crash loops, performs auto-repair, and sends notifications via Telegram and Discord.

How It Works

Gateway running


Health check every 2 min

      ├── Healthy → continue monitoring

      └── Unhealthy

           ├── First failure → mark degraded
           ├── Retry every 5 sec
           ├── 3 failures → trigger repair


      Auto-repair

           ├── Kill gateway process
           ├── Wait 5 seconds
           ├── Restart gateway
           ├── Verify health

           ├── Success → resume normal monitoring
           └── Failure → retry (max 2 attempts)

                              └── Crash loop detected → notify + stop

Configuration

VariableDefaultDescription
WATCHDOG_CHECK_INTERVAL120Health check interval in seconds
WATCHDOG_DEGRADED_CHECK_INTERVAL5Retry interval when degraded (seconds)
WATCHDOG_MAX_REPAIR_ATTEMPTS2Max auto-repair attempts before crash-loop
WATCHDOG_CRASH_LOOP_WINDOW300Window for crash-loop detection (seconds)
WATCHDOG_CRASH_LOOP_THRESHOLD3Crashes in window to trigger crash-loop
WATCHDOG_AUTO_REPAIRtrueEnable/disable auto-repair
TELEGRAM_BOT_TOKENTelegram bot token for notifications
TELEGRAM_ADMIN_CHAT_IDChat ID for Telegram notifications
DISCORD_WEBHOOK_URLDiscord webhook URL for notifications

Lifecycle States

StateDescription
stoppedGateway is not running
startingGateway process started, waiting for health check
runningGateway is healthy
degradedHealth check failed, retrying
crash_loopToo many crashes, auto-repair exhausted
repairingAuto-repair in progress

Notifications

The watchdog sends notifications for:
  • Crash detected — Gateway process exited unexpectedly
  • Crash loop — 3+ crashes in 5-minute window
  • Auto-repair started — Attempting to restart the gateway
  • Auto-repair succeeded — Gateway is healthy again
  • Auto-repair failed — Could not recover, needs manual intervention

Telegram Format

🐺 Agentbot Watchdog
🔴 Crash loop detected - View logs
Trigger: `crash_loop`
Attempt count: 3

Discord Format

Embedded message with color coding:
  • 🔴 Red: crashes, failures
  • 🟡 Yellow: repairing, degraded
  • 🟢 Green: recovered, healthy

Test Script

Test the watchdog by injecting an invalid config:
#!/bin/bash
CONTAINER_NAME="agentbot-<userId>"
docker exec "$CONTAINER_NAME" node -e "
  const fs = require('fs');
  const cfg = JSON.parse(fs.readFileSync('/data/.openclaw/openclaw.json'));
  cfg.hooks = cfg.hooks || {};
  cfg.hooks.transformDir = '/tmp/does-not-exist';
  fs.writeFileSync('/data/.openclaw/openclaw.json', JSON.stringify(cfg, null, 2));
"
The watchdog should detect the gateway failure and auto-repair within 2 health check cycles.