Quick Answer: For quick uptime monitoring, deploy Uptime Kuma:
docker run -d -p 3001:3001 -v uptime-data:/app/data louislam/uptime-kuma. For full metrics, deploy Prometheus + Grafana + node-exporter. For logs, use journalctl -f or set up Loki.
Your server is running. But how do you know it's healthy? How do you find out when something breaks — before your users do? Monitoring answers these questions.
This guide covers everything from simple uptime checks to full observability stacks.
What to Monitor
| Category | What to Watch | Why |
|---|---|---|
| Uptime | Is the service responding? | Know immediately when something goes down |
| CPU | Usage percentage | High CPU = performance issues or crypto miners |
| Memory | Used vs available RAM | Memory leaks crash services |
| Disk | Space remaining | Full disk = everything breaks |
| Network | Bandwidth, latency, errors | Detect DDoS, bandwidth limits |
| Services | Nginx, Docker, databases running? | Restart failed services automatically |
| Logs | Errors, warnings, unusual patterns | Find root cause of problems |
| SSL | Certificate expiry | Don't let HTTPS break |
| Response time | How fast are your endpoints? | Catch slowdowns before users notice |
Part 1: Quick Health Checks (No Tools Needed)
You can monitor the basics with commands you already have:
Health Check Script
#!/bin/bash
echo "=== Server Health ==="
echo "Uptime: $(uptime -p)"
echo "Load: $(awk '{print $1, $2, $3}' /proc/loadavg)"
# Field 2 of top's Cpu(s) line is user-space CPU time, not total usage
echo "CPU: $(top -bn1 | grep "Cpu(s)" | awk '{print $2}')% user"
echo "RAM: $(free -m | awk 'NR==2{printf "%dMB/%dMB (%.1f%%)", $3, $2, $3/$2*100}')"
echo "Disk: $(df -h / | awk 'NR==2{print $3 "/" $2 " (" $5 ")"}')"
echo "Docker: $(docker ps -q 2>/dev/null | wc -l) containers running"
echo "Failed services: $(systemctl --failed --no-legend | wc -l)"
echo "Nginx: $(systemctl is-active nginx)"
echo "SSH: $(systemctl is-active ssh)"
Watch Resources Live
# CPU + memory + processes
htop
# Disk I/O
iostat -x 1
# Network bandwidth
iftop
# Or: vnstat -l
# Docker resource usage
docker stats
Cron-Based Alerts
Simple monitoring without extra tools: a cron job runs the checks and sends alerts by email or Telegram.
#!/bin/bash
# /opt/monitor.sh — run every 5 minutes via cron
# Check disk
DISK_USAGE=$(df / | awk 'NR==2{print $5}' | tr -d '%')
if [ "$DISK_USAGE" -gt 90 ]; then
    echo "ALERT: Disk usage at ${DISK_USAGE}%"
    # Send alert (Telegram, email, webhook)
fi
# Check memory
MEM_USAGE=$(free | awk 'NR==2{printf "%.0f", $3/$2*100}')
if [ "$MEM_USAGE" -gt 90 ]; then
    echo "ALERT: Memory usage at ${MEM_USAGE}%"
fi
# Check if nginx is running
if ! systemctl is-active --quiet nginx; then
    echo "ALERT: Nginx is down! Attempting restart..."
    systemctl restart nginx
fi
# Check if a URL responds
HTTP_CODE=$(curl -so /dev/null -w '%{http_code}' --max-time 10 https://yourdomain.com)
if [ "$HTTP_CODE" != "200" ]; then
    echo "ALERT: Website returned $HTTP_CODE"
fi
# Make the script executable, then schedule it every 5 minutes
chmod +x /opt/monitor.sh
crontab -e
# Add: */5 * * * * /opt/monitor.sh >> /var/log/monitor.log 2>&1
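The "Send alert" placeholder in the script above could be filled in roughly as follows. This is a minimal sketch that assumes you have already created a Telegram bot and know your chat ID. `send_alert`, `TELEGRAM_BOT_TOKEN`, `TELEGRAM_CHAT_ID`, and `DRY_RUN` are illustrative names, not part of the original script. The only external interface used is the Telegram Bot API's `sendMessage` method.

```shell
#!/bin/bash
# Sketch: send an alert to Telegram via the Bot API's sendMessage method.
# TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID are placeholders you must supply.
# DRY_RUN=1 prints the message instead of sending it (useful for testing).

send_alert() {
    # Prefix every message with the host name so alerts are attributable
    local message="[$(uname -n)] $1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "DRY RUN: $message"
        return 0
    fi
    if [ -z "${TELEGRAM_BOT_TOKEN:-}" ] || [ -z "${TELEGRAM_CHAT_ID:-}" ]; then
        echo "send_alert: TELEGRAM_BOT_TOKEN or TELEGRAM_CHAT_ID not set" >&2
        return 1
    fi
    # --data-urlencode makes curl send a POST with safely encoded fields
    curl -s --max-time 10 \
        --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
        --data-urlencode "text=${message}" \
        "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" > /dev/null
}

# Usage inside /opt/monitor.sh:
# send_alert "Disk usage at ${DISK_USAGE}%"
```

Keeping the transport in one function means you can swap Telegram for email or a webhook later without touching the individual checks.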
Part 2: Uptime Monitoring (Uptime Kuma)
Uptime Kuma is a popular open-source, self-hosted uptime monitor with a clean UI and easy setup. It supports HTTP, TCP, ping, DNS, Docker container checks, and more.
Deploy with Docker
# compose.yml
services:
  uptime-kuma:
    image: louislam/uptime-kuma
    restart: unless-stopped
    ports:
      - "3001:3001"
    volumes:
      - uptime-data:/app/data
volumes:
  uptime-data:
docker compose up -d
# Visit http://YOUR_SERVER:3001
What to Monitor
Add these monitors:
| Type | Target | Interval |
|---|---|---|
| HTTP | https://yourdomain.com | 60s |
| HTTP | https://yourdomain.com/api/health | 30s |
| TCP | localhost:5432 (PostgreSQL) | 60s |
| TCP | localhost:6379 (Redis) | 60s |
| Ping | Your other servers | 60s |
| Docker | Container names | 60s |
| SSL | Your domains (checks expiry) | 86400s (daily) |
Alerting
Uptime Kuma supports 90+ notification channels:
- Telegram — most popular
- Discord — webhook
- Slack — webhook
- Email — SMTP
- Webhook — any custom URL
- PagerDuty, Opsgenie — enterprise
Set up a Telegram bot notification so you get an instant alert on your phone when anything goes down.
Part 3: Metrics Stack (Prometheus + Grafana)
Use this stack when you want detailed server metrics (CPU, RAM, disk, network) tracked over time, with graphs and dashboards.
Architecture
node-exporter (collects metrics) → Prometheus (stores metrics) → Grafana (visualizes)
Deploy with Docker Compose
# compose.yml
services:
  prometheus:
    image: prom/prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prom_data:/prometheus
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana
    restart: unless-stopped
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin  # change this before exposing Grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
    depends_on:
      - prometheus
  node-exporter:
    image: prom/node-exporter
    restart: unless-stopped
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
volumes:
  prom_data:
  grafana_data:
Prometheus Config
Create prometheus.yml:
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['node-exporter:9100']
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Start and Configure
docker compose up -d
# Prometheus: http://YOUR_SERVER:9090
# Grafana: http://YOUR_SERVER:3000 (admin/admin)
Grafana Setup
- Log in to Grafana (admin/admin)
- Add data source → Prometheus → URL: http://prometheus:9090
- Import dashboard → ID: 1860 (Node Exporter Full)
- You now have CPU, RAM, disk, network, and more in beautiful graphs
What You Get
- CPU usage over time with per-core breakdown
- Memory used/cached/free trends
- Disk I/O read/write speeds
- Network bandwidth in/out
- System load averages
- Disk space trends (predict when you'll run out)
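The disk space prediction mentioned in the last item can be sketched with PromQL's `predict_linear` function applied to node-exporter's `node_filesystem_avail_bytes` metric. The helper name `disk_full_query`, the 6-hour lookback window, and the 24-hour default horizon are assumptions for illustration; tune them to how quickly your disk usage changes.

```shell
#!/bin/bash
# disk_full_query: build a PromQL expression that returns a result only when
# the root filesystem is projected to run out of space within the given
# number of hours. The 6h lookback window is an assumption.
disk_full_query() {
    local hours="${1:-24}"
    echo "predict_linear(node_filesystem_avail_bytes{mountpoint=\"/\"}[6h], ${hours}*3600) < 0"
}

# Usage against the Prometheus HTTP API (YOUR_SERVER is a placeholder):
# curl -sG "http://YOUR_SERVER:9090/api/v1/query" \
#      --data-urlencode "query=$(disk_full_query 24)"
```

An empty result list in the API response means the disk is not projected to fill within the chosen horizon. You can also paste the same expression into a Grafana panel or a Prometheus alert rule.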
Part 4: Log Monitoring
journalctl (Built-in)
# All logs
journalctl -f # Follow live
# Specific service
journalctl -u nginx -f
journalctl -u docker -f
# Errors only
journalctl -p err --since "1 hour ago"
# Failed services
systemctl --failed
Simple Log Monitoring Script
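#!/bin/bash
# Watch for errors in key logs
# -F keeps following across log rotation; -q drops the "==> file <==" headers,
# which would otherwise match the "fail" pattern
tail -qF /var/log/nginx/error.log \
    /var/log/auth.log \
    /var/log/fail2ban.log | \
    grep --line-buffered -iE "error|fail|denied|ban" | \
    while IFS= read -r line; do
        echo "[ALERT] $line"
        # Send to Telegram/Discord here
    done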
Loki + Grafana (Advanced Log Aggregation)
For searchable, indexed logs across multiple servers:
# Add to your monitoring compose.yml
services:
  loki:
    image: grafana/loki
    restart: unless-stopped
    ports:
      - "3100:3100"
    volumes:
      - loki_data:/loki
  promtail:
    image: grafana/promtail
    restart: unless-stopped
    volumes:
      - /var/log:/var/log:ro
      - ./promtail.yml:/etc/promtail/config.yml
    depends_on:
      - loki
volumes:
  loki_data:
In Grafana: Add data source → Loki → URL: http://loki:3100
Now you can search all your logs from Grafana's Explore panel.
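The compose file above mounts ./promtail.yml into the Promtail container, so that file has to exist before you start the stack. A minimal sketch of a config that ships everything under /var/log to Loki might look like the following. The helper name `write_promtail_config`, the job name, and the labels are illustrative choices; the URL assumes the Loki service is named `loki` as in the compose file.

```shell
#!/bin/bash
# Sketch: write a minimal promtail.yml next to compose.yml.
# The job name and labels are illustrative; adjust paths for your distro.
write_promtail_config() {
    local target="${1:-./promtail.yml}"
    cat > "$target" <<'EOF'
server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml   # where Promtail remembers read offsets

clients:
  - url: http://loki:3100/loki/api/v1/push   # "loki" is the compose service name

scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          __path__: /var/log/*log   # matches the /var/log mount in compose.yml
EOF
}

# Usage: run once in the directory that holds compose.yml
# write_promtail_config ./promtail.yml
```

Create the file before running docker compose up -d. If the host path is missing, Docker creates a directory with that name instead, and Promtail fails to start.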
Full reference: Linux Log Files Explained
Part 5: Docker Monitoring
Check Container Health
# Running containers
docker ps
# Resource usage per container
docker stats
# Container logs
docker logs container-name -f --tail 50
# Inspect health check status
docker inspect --format='{{.State.Health.Status}}' container-name
Monitor Docker with Prometheus
Add cAdvisor to your compose stack:
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor
    restart: unless-stopped
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
Add to prometheus.yml:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
Import Grafana dashboard ID 14282 for Docker container metrics.
Part 6: SSL Certificate Monitoring
Don't let your HTTPS certificates expire.
With Uptime Kuma
Add an HTTP(s) monitor for each domain and enable its certificate expiry notification. Set the alert threshold to 14 days before expiry.
Manual Check
# Check certificate expiry
echo | openssl s_client -connect yourdomain.com:443 2>/dev/null | openssl x509 -noout -enddate
# Script to check and alert
EXPIRY=$(echo | openssl s_client -connect yourdomain.com:443 2>/dev/null | openssl x509 -noout -enddate | cut -d= -f2)
DAYS_LEFT=$(( ($(date -d "$EXPIRY" +%s) - $(date +%s)) / 86400 ))
if [ "$DAYS_LEFT" -lt 14 ]; then
    echo "ALERT: SSL expires in $DAYS_LEFT days!"
fi
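If you have more than one domain, the check above can be wrapped in functions and looped. This is a sketch under the same assumptions as the original script (GNU `date` with `-d`, `openssl` installed). The function names `days_until` and `cert_days_left` and the example domains are hypothetical.

```shell
#!/bin/bash
# days_until: whole days between "now" and an expiry date string.
# $1: date string as printed by `openssl x509 -enddate` (after "notAfter=")
# $2: optional "now" as epoch seconds (defaults to the current time)
# Requires GNU date (the -d option).
days_until() {
    local expiry_epoch now_epoch
    expiry_epoch=$(date -d "$1" +%s) || return 1
    now_epoch=${2:-$(date +%s)}
    echo $(( (expiry_epoch - now_epoch) / 86400 ))
}

# cert_days_left: fetch a domain's certificate and print days until expiry.
cert_days_left() {
    local domain="$1" expiry
    expiry=$(echo | openssl s_client -servername "$domain" -connect "$domain:443" 2>/dev/null \
        | openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
    [ -n "$expiry" ] || return 1
    days_until "$expiry"
}

# Usage (example.com and example.org are placeholder domains):
# for d in example.com example.org; do
#     left=$(cert_days_left "$d") || { echo "ALERT: could not read certificate for $d"; continue; }
#     [ "$left" -lt 14 ] && echo "ALERT: $d certificate expires in $left days"
# done
```

Separating the date arithmetic from the network call makes the arithmetic easy to check offline, and a failed connection is reported as its own alert rather than as a bogus day count.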
Online Check
Use our SSL Certificate Checker to check any domain's certificate status.
Part 7: Alerting Best Practices
What to Alert On (Wake You Up)
- Service down (HTTP check returns non-200)
- Disk > 90% full
- SSL certificate expires within 7 days
- Server unreachable (ping fails)
- Multiple failed SSH attempts (potential attack)
What to Warn About (Check Tomorrow)
- CPU > 80% for 15+ minutes
- Memory > 85%
- Disk > 75%
- Response time > 2 seconds
- Docker container restarting frequently
What to Just Log
- Individual request errors (404s, 500s)
- Normal SSH logins
- Cron job completions
- Package updates available
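The numeric thresholds in the tiers above can be encoded in one place, so every check script agrees on what counts as an alert versus a warning. The sketch below covers only the percentage-based rules (disk, memory, CPU). The function name `classify` and the metric names are illustrative.

```shell
#!/bin/bash
# classify: map a metric name and integer percentage to a severity tier,
# using the thresholds listed above. Metric names are illustrative.
# Prints one of: alert, warn, ok
classify() {
    local metric="$1" value="$2"
    case "$metric" in
        disk)
            if   [ "$value" -gt 90 ]; then echo alert
            elif [ "$value" -gt 75 ]; then echo warn
            else echo ok; fi ;;
        memory)
            if [ "$value" -gt 85 ]; then echo warn; else echo ok; fi ;;
        cpu)
            # The "for 15+ minutes" condition is not modelled here; feed this
            # a 15-minute average rather than an instantaneous reading.
            if [ "$value" -gt 80 ]; then echo warn; else echo ok; fi ;;
        *)
            echo "classify: unknown metric '$metric'" >&2
            return 1 ;;
    esac
}

# Usage:
# case "$(classify disk "$DISK_USAGE")" in
#     alert) echo "ALERT: disk at ${DISK_USAGE}%" ;;
#     warn)  echo "WARN: disk at ${DISK_USAGE}%" ;;
# esac
```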
Alert Fatigue
The #1 monitoring mistake: too many alerts. If you get 50 alerts a day, you stop reading them. Keep alerts to critical issues only.
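One way to cut repeat alerts from a check that runs every few minutes is a per-alert cooldown. The sketch below stores the time of the last alert in a small state file and suppresses repeats inside the window. The function name `should_alert`, the `ALERT_STATE_DIR` location, and the one-hour default are assumptions, not part of the scripts above.

```shell
#!/bin/bash
# should_alert: return 0 (send) if no alert with this key fired within the
# cooldown window, else 1 (suppress). State is one timestamp file per key.
# ALERT_STATE_DIR and the default 3600-second cooldown are assumptions.
ALERT_STATE_DIR="${ALERT_STATE_DIR:-/tmp/monitor-alerts}"

should_alert() {
    local key="$1" cooldown="${2:-3600}"
    local now="${3:-$(date +%s)}"      # third arg lets callers inject a clock
    local stamp="$ALERT_STATE_DIR/$key"
    mkdir -p "$ALERT_STATE_DIR"
    if [ -f "$stamp" ]; then
        local last
        last=$(cat "$stamp")
        if [ $(( now - last )) -lt "$cooldown" ]; then
            return 1
        fi
    fi
    # Record this alert so later calls inside the window are suppressed
    echo "$now" > "$stamp"
    return 0
}

# Usage:
# if should_alert disk-full 3600; then echo "ALERT: Disk usage at 95%"; fi
```

With a five-minute cron schedule and a one-hour cooldown, a persistent problem produces one message per hour instead of twelve.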
Part 8: Monitoring Checklist
Minimum Setup (Any Server)
# 1. Install Uptime Kuma
docker run -d -p 3001:3001 -v uptime-data:/app/data --restart unless-stopped louislam/uptime-kuma
# 2. Set up monitors for your services
# 3. Configure Telegram alerts
# 4. Add cron health check script
# Done. 90% of monitoring value for 10 minutes of work.
Full Stack (Production)
- Uptime Kuma — external uptime checks + SSL monitoring
- Prometheus + node-exporter — server metrics
- Grafana — dashboards and visualization
- cAdvisor — Docker container metrics
- Loki + Promtail — log aggregation
- Fail2ban — automatic intrusion response
Resource Requirements
| Stack | RAM Needed | Best For |
|---|---|---|
| Uptime Kuma only | 128MB | Small setups, 1-5 servers |
| Prometheus + Grafana | 512MB-1GB | Medium setups |
| Full stack (all above) | 2GB+ | Production, multiple servers |