March 14, 2026·5 min read·ERPFit Team
Tags: monitoring, infrastructure, typescript, uptime-kuma
How We Monitor 19+ Services with Uptime Kuma and TypeScript

Real-world monitoring setup for SME infrastructure — from Uptime Kuma configuration to health check scripts and Telegram alerts.

When you manage infrastructure for multiple clients — websites, APIs, email servers, databases — the question isn't "will there be incidents" but "how fast will you know about them." Our goal: under 5 minutes.

Why Uptime Kuma

Before Uptime Kuma, we tried Datadog, UptimeRobot, and HetrixTools. Each fell short in some way: Datadog was expensive (hundreds of dollars per month for a small team), UptimeRobot caps the free tier at 50 monitors, and elsewhere critical features were missing.

Uptime Kuma is self-hosted, open source, and runs on Docker. Unlimited monitors, no monthly fees. Setup takes about 5 minutes:

docker run -d --restart=always \
  -p 3001:3001 \
  -v uptime-kuma:/app/data \
  --name uptime-kuma \
  louislam/uptime-kuma:1

The web UI is clean and easy to use. It supports multiple monitor types: HTTP(S), TCP, DNS, Docker containers, and push-based monitors for custom scripts.

What We Monitor

19+ endpoints, split into 3 groups:

Group 1: Websites & APIs (HTTP monitors)

Check HTTP status code + response time. Interval: 60 seconds. Includes:

  • Client websites (WordPress, static sites)
  • API endpoints (health check routes)
  • Admin panels (ERPNext, Mautic, HestiaCP)
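To make "status code + response time" concrete, here is a minimal sketch of that kind of check in TypeScript. It is an illustration only, not Uptime Kuma's internals: `classify`, `checkHttp`, and the 5-second budget are hypothetical names and values chosen for the example.

```typescript
// Hypothetical sketch of an HTTP status + response-time check.
interface HttpCheckResult {
  ok: boolean
  status: number | null
  responseMs: number
  error?: string
}

// Pure classification step: a 2xx/3xx status within the time budget counts as up.
function classify(status: number, responseMs: number, maxMs = 5000): boolean {
  return status >= 200 && status < 400 && responseMs <= maxMs
}

async function checkHttp(url: string, maxMs = 5000): Promise<HttpCheckResult> {
  const started = Date.now()
  try {
    // AbortSignal.timeout cancels the request if the server hangs.
    const res = await fetch(url, { signal: AbortSignal.timeout(maxMs) })
    const responseMs = Date.now() - started
    return { ok: classify(res.status, responseMs, maxMs), status: res.status, responseMs }
  } catch (err) {
    // Network errors and timeouts both land here and count as a failed check.
    return { ok: false, status: null, responseMs: Date.now() - started, error: String(err) }
  }
}
```

Keeping the pass/fail decision in a separate pure function makes the threshold easy to unit-test without touching the network.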

Group 2: SSL Certificates

Uptime Kuma automatically checks SSL certificate expiry dates. Alerts 14 days before expiry. Most certs auto-renew via Let's Encrypt, but sometimes auto-renewal fails — without monitoring, you only find out when a client calls saying "the website shows a security warning."
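For readers who want the same check in a script, the sketch below shows one way to read a certificate's expiry date with Node's `tls` module and compare it against a 14-day warning window. The function names are hypothetical. It assumes the runtime can parse the `valid_to` string returned by `getPeerCertificate()` with `new Date()`, which holds on Node but is worth verifying elsewhere.

```typescript
import { connect } from 'node:tls'

// Whole days between `now` and the certificate's expiry (negative if already expired).
function daysUntilExpiry(validTo: string, now: Date = new Date()): number {
  const msPerDay = 24 * 60 * 60 * 1000
  return Math.floor((new Date(validTo).getTime() - now.getTime()) / msPerDay)
}

// Opens a TLS connection and resolves with the peer certificate's valid_to string.
// With default settings an already-expired certificate rejects via the 'error' event.
function getCertExpiry(host: string, port = 443): Promise<string> {
  return new Promise((resolve, reject) => {
    const socket = connect({ host, port, servername: host }, () => {
      const cert = socket.getPeerCertificate()
      socket.end()
      if (cert && cert.valid_to) resolve(cert.valid_to)
      else reject(new Error('no certificate returned'))
    })
    socket.setTimeout(10_000, () => socket.destroy(new Error('TLS timeout')))
    socket.on('error', reject)
  })
}

// True when the certificate expires within the warning window.
async function certNeedsAttention(host: string, warnDays = 14): Promise<boolean> {
  return daysUntilExpiry(await getCertExpiry(host)) <= warnDays
}
```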

Group 3: Custom Health Checks (Push monitors)

Some things can't be checked with a simple HTTP request. For example: has the backup completed? Is the database running? How much disk space is left?

For these cases, we use push monitors. A script runs via cron, checks conditions, then sends a heartbeat to Uptime Kuma if everything is fine:

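// health-check.ts: runs every 5 minutes via cron.
// checkDiskSpace, checkPostgresConnection and checkBackupAge are our own helpers;
// each resolves to an object with an `ok` flag.
const UPTIME_KUMA_PUSH_URL = process.env.UPTIME_KUMA_PUSH_URL  // assumed env var name

const checks = [
  checkDiskSpace('/'),           // > 20% free
  checkPostgresConnection(),     // can connect
  checkBackupAge('/backups/'),   // < 25 hours old
]

// If any check throws, Promise.all rejects and no heartbeat is sent,
// so the monitor goes DOWN once the heartbeat window passes.
const results = await Promise.all(checks)
const allPassed = results.every(r => r.ok)

if (allPassed && UPTIME_KUMA_PUSH_URL) {
  // Send heartbeat to Uptime Kuma push URL
  await fetch(UPTIME_KUMA_PUSH_URL)
}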

If the script doesn't send a heartbeat within 10 minutes, Uptime Kuma marks the service as DOWN and sends an alert.
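Staying silent works, but the alert then only says that the heartbeat is missing. A possible refinement is to report the failure and its reason explicitly. The sketch below assumes the push endpoint accepts `status` and `msg` query parameters, as the push URL shown in the monitor's settings suggests. `buildHeartbeatUrl` and `CheckResult` are hypothetical names.

```typescript
// Hypothetical helper: turns check results into a push URL with status and message.
interface CheckResult {
  ok: boolean
  error?: string
}

function buildHeartbeatUrl(pushUrl: string, results: CheckResult[]): string {
  const failures = results.filter(r => !r.ok)
  const url = new URL(pushUrl)
  // Assumption: the endpoint reads `status` (up/down) and `msg` from the query string.
  url.searchParams.set('status', failures.length === 0 ? 'up' : 'down')
  url.searchParams.set(
    'msg',
    failures.length === 0 ? 'OK' : failures.map(f => f.error ?? 'check failed').join('; '),
  )
  return url.toString()
}
```

The script would then call `fetch(buildHeartbeatUrl(pushUrl, results))` on every run, whether or not the checks passed. The missing-heartbeat timeout still covers the case where the script itself never runs.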

Telegram Alerts

Email notifications are slow, especially at night. We use a Telegram bot for alerts because:

  • Instant push notifications on phone
  • Can create group chats for different service groups
  • Free, unlimited messages
  • Simple API — one HTTP POST and done

Uptime Kuma has built-in Telegram integration: create a bot, get the token and chat ID, and enter them in the notification settings. When a service goes DOWN, the message arrives in seconds.
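For alerts sent from our own scripts rather than from Uptime Kuma, the "one HTTP POST" looks roughly like the sketch below. It uses the Telegram Bot API's `sendMessage` method. The helper names are hypothetical, and the token and chat ID are placeholders supplied by the caller.

```typescript
// Builds the request for the Telegram Bot API sendMessage method.
function buildTelegramRequest(botToken: string, chatId: string, text: string) {
  return {
    url: `https://api.telegram.org/bot${botToken}/sendMessage`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ chat_id: chatId, text }),
    },
  }
}

// Sends the alert; resolves to false when Telegram answers with a non-2xx status.
async function sendTelegramAlert(botToken: string, chatId: string, text: string): Promise<boolean> {
  const { url, init } = buildTelegramRequest(botToken, chatId, text)
  const res = await fetch(url, init)
  return res.ok
}
```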

TypeScript Health Check Scripts

Beyond Uptime Kuma, we have custom health check scripts running on Bun that check deeper than what Uptime Kuma covers:

Backup Verification

Backups run daily via cron. But if the cron job fails, nobody knows — unless you check.

// verify-backups.ts
import { readdir, stat } from 'node:fs/promises'
import { join } from 'node:path'

// Returns ok: true when the newest .tar.gz in `dir` is younger than maxAgeHours.
async function checkBackupAge(dir: string, maxAgeHours = 25) {
  const files = await readdir(dir)
  // Lexicographic sort: assumes filenames begin with a sortable date such as YYYY-MM-DD.
  const latest = files
    .filter(f => f.endsWith('.tar.gz'))
    .sort()
    .pop()

  if (!latest) return { ok: false, error: 'No backup files found' }

  const info = await stat(join(dir, latest))
  const ageHours = (Date.now() - info.mtimeMs) / (1000 * 60 * 60)

  return {
    ok: ageHours < maxAgeHours,
    file: latest,
    ageHours: Math.round(ageHours * 10) / 10,  // rounded to one decimal place
  }
}

The script runs every 6 hours. If the newest backup is older than 25 hours (a 1-hour buffer on top of the daily cron schedule), it sends an alert.

Disk Space Monitoring

In our experience, running out of disk space is the most common cause of incidents on a VPS: the database crashes, logs can't be written, and containers won't start. The check is simple but essential.
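A minimal sketch of such a check, assuming a runtime that provides `statfs` in `node:fs/promises` (recent Node versions do; check yours before relying on it). The 20% threshold mirrors the comment in the health check script above. `freePercent` is a hypothetical helper name.

```typescript
import { statfs } from 'node:fs/promises'

// Percentage of filesystem blocks still available to unprivileged users.
function freePercent(availableBlocks: number, totalBlocks: number): number {
  if (totalBlocks <= 0) return 0
  return (availableBlocks / totalBlocks) * 100
}

// Resolves to ok: true while more than minFreePercent of the filesystem is free.
async function checkDiskSpace(path: string, minFreePercent = 20) {
  const fs = await statfs(path)
  const free = freePercent(fs.bavail, fs.blocks)
  return {
    ok: free > minFreePercent,
    freePercent: Math.round(free * 10) / 10,  // rounded to one decimal place
  }
}
```

Using `bavail` rather than `bfree` excludes blocks reserved for root, which is closer to what an application can actually write.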

Database Connection Check

We don't just check whether the port is open; the script actually connects and runs a small query (SELECT 1). If PostgreSQL is running but overloaded (connection pool exhausted), a TCP port check still passes but the app doesn't work.
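The important detail is a timeout: an exhausted pool usually makes the query hang rather than fail. The sketch below is driver-agnostic and purely illustrative. `runQuery` is a hypothetical callback standing in for whatever client the app uses, and the 3-second limit is an assumed value.

```typescript
// Hypothetical sketch: `runQuery` wraps the real client call,
// for example () => pool.query('SELECT 1') with a PostgreSQL driver.
async function checkDatabase(
  runQuery: () => Promise<unknown>,
  timeoutMs = 3000,
): Promise<{ ok: boolean; error?: string }> {
  let timer: ReturnType<typeof setTimeout> | undefined
  // Rejects if the query has not settled within timeoutMs.
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`no reply within ${timeoutMs} ms`)), timeoutMs)
  })
  try {
    await Promise.race([runQuery(), timeout])
    return { ok: true }
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) }
  } finally {
    clearTimeout(timer)
  }
}
```

A hung query and a refused connection both come back as `ok: false`, so the caller can treat them the same way.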

Real Results

Over the past 12 months:

  • Caught 3 SSL certificate renewal failures — before the cert expired
  • Caught 2 VPS disk space exhaustions — before the database crashed
  • Caught 1 ISP routing issue — knew about it before clients called
  • Average time from incident to alert: under 3 minutes

Without monitoring, these incidents would only be discovered when clients called — usually hours later, after the damage was done.

Cost

Uptime Kuma runs in a Docker container on an existing server. No additional hosting cost. Telegram Bot is free. Health check scripts — write once, run forever.

Total cost to monitor 19+ services: $0/month.

Compared to Datadog ($23/host/month) or UptimeRobot Pro ($7/month for 50 monitors), self-hosted monitoring is the clear choice for SMEs.

Lesson

Monitoring doesn't need to be complex. You don't need Prometheus + Grafana + Alertmanager for 20 services. Uptime Kuma, a few TypeScript scripts, and a Telegram bot are enough to sleep well at night.

What matters isn't the tool; it's monitoring the right things and alerting the right people. Backup verification is the check most commonly forgotten: everyone sets up a cron backup and then forgets about it, until they need to restore and discover backups stopped running 3 months ago.
