Metrics Module
The metrics module collects system resource usage and fail2ban status, reporting them to the Bloqd server for monitoring.
Collected Metrics
| Metric | Description | Field Name |
|---|---|---|
| CPU Usage | Percentage of CPU in use | cpu_percent |
| CPU Count | Number of CPU cores | cpu_count |
| Memory Total | Total system memory (bytes) | mem_total |
| Memory Used | Used memory (bytes) | mem_used |
| Memory Percent | Memory usage percentage | mem_percent |
| Memory Available | Available memory (bytes) | mem_available |
| Disk Total | Total disk space (bytes) | disk_total |
| Disk Used | Used disk space (bytes) | disk_used |
| Disk Percent | Disk usage percentage | disk_percent |
| Disk Free | Free disk space (bytes) | disk_free |
| Load 1m | 1-minute load average | load_1 |
| Load 5m | 5-minute load average | load_5 |
| Load 15m | 15-minute load average | load_15 |
| Uptime | System uptime (seconds) | uptime |
| Boot Time | Unix timestamp of boot | boot_time |
| fail2ban Running | fail2ban service status | fail2ban_running |
| fail2ban Version | Installed version | fail2ban_version |
| fail2ban Jails | List of active jails | fail2ban_jails |
Configuration
modules:
  metrics:
    enabled: true
    interval: 300  # 5 minutes
    collect:
      cpu: true
      memory: true
      disk: true
      load: true
      network: false
| Setting | Description | Default |
|---|---|---|
| enabled | Enable metrics collection | true |
| interval | Collection interval (seconds) | 300 |
| collect.cpu | Collect CPU metrics | true |
| collect.memory | Collect memory metrics | true |
| collect.disk | Collect disk metrics | true |
| collect.load | Collect load averages | true |
| collect.network | Collect network I/O | false |
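
One way to read the defaults table is that any key missing from modules.metrics falls back to its default. The sketch below illustrates that merge in Python. `DEFAULTS` and `resolve_metrics_config` are hypothetical names chosen for illustration, and the agent's actual loader may differ, for example in how it treats a partially specified collect block.

```python
import copy

# Defaults copied from the settings table; the names are illustrative only.
DEFAULTS = {
    "enabled": True,
    "interval": 300,
    "collect": {
        "cpu": True,
        "memory": True,
        "disk": True,
        "load": True,
        "network": False,
    },
}


def resolve_metrics_config(user_config: dict) -> dict:
    """Overlay the modules.metrics section of a parsed config onto the defaults."""
    section = (user_config.get("modules") or {}).get("metrics") or {}
    resolved = copy.deepcopy(DEFAULTS)
    for key, value in section.items():
        if key == "collect" and isinstance(value, dict):
            # Merge collector flags one by one so unspecified ones keep their default.
            resolved["collect"].update(value)
        else:
            resolved[key] = value
    return resolved
```

With this approach, a config that only sets `interval: 600` and `collect.network: true` still ends up with the CPU, memory, disk, and load collectors enabled.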
Network Metrics (Optional)
When collect.network is enabled, the following fields are also collected:
| Metric | Description |
|---|---|
| net_bytes_sent | Total bytes sent |
| net_bytes_recv | Total bytes received |
| net_packets_sent | Total packets sent |
| net_packets_recv | Total packets received |
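
The four fields line up with the attributes returned by psutil's `net_io_counters()`. The sketch below shows that mapping, plus a small helper that turns two samples into a byte rate, assuming the values are cumulative totals as psutil reports them. `network_metrics` and `bytes_per_second` are hypothetical helpers, not part of the agent.

```python
def network_metrics(counters) -> dict:
    """Map a psutil.net_io_counters() result onto the documented field names."""
    return {
        "net_bytes_sent": counters.bytes_sent,
        "net_bytes_recv": counters.bytes_recv,
        "net_packets_sent": counters.packets_sent,
        "net_packets_recv": counters.packets_recv,
    }


def bytes_per_second(previous: dict, current: dict, elapsed_seconds: float) -> dict:
    """Derive send/receive rates from two samples of cumulative totals."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Clamp at zero in case the counters reset between samples (e.g. a reboot).
    sent = max(current["net_bytes_sent"] - previous["net_bytes_sent"], 0)
    recv = max(current["net_bytes_recv"] - previous["net_bytes_recv"], 0)
    return {"sent": sent / elapsed_seconds, "recv": recv / elapsed_seconds}
```

On a host with psutil installed, `network_metrics(psutil.net_io_counters())` produces one sample; two samples taken one collection interval apart can be passed to `bytes_per_second` with the interval as `elapsed_seconds`.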
How It Works
- Module runs at configured interval
- Collects enabled metrics using psutil
- Queries fail2ban status via fail2ban-client
- Sends combined data via heartbeat API
┌─────────────────┐
│  Metrics Module │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌──────────┐
│psutil │ │fail2ban- │
│       │ │ client   │
└───┬───┘ └────┬─────┘
    │          │
    └────┬─────┘
         │
         ▼
┌─────────────────┐
│  Bloqd Server   │
│   (heartbeat)   │
└─────────────────┘
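
The psutil side of one collection cycle can be sketched roughly as follows. This is an illustration, not the agent's actual code: `collect_metrics` is a hypothetical function, the psutil module is passed in as `ps` so the sketch stays easy to exercise, and measuring the `/` filesystem and always reporting uptime are assumptions.

```python
import time


def collect_metrics(ps, collect: dict, disk_path: str = "/") -> dict:
    """Gather the enabled metrics from a psutil-like module `ps`."""
    metrics = {}
    if collect.get("cpu", True):
        # interval=1 blocks for one second to measure CPU usage over that window.
        metrics["cpu_percent"] = ps.cpu_percent(interval=1)
        metrics["cpu_count"] = ps.cpu_count()
    if collect.get("memory", True):
        mem = ps.virtual_memory()
        metrics.update(
            mem_total=mem.total,
            mem_used=mem.used,
            mem_percent=mem.percent,
            mem_available=mem.available,
        )
    if collect.get("disk", True):
        # Assumption: the root filesystem is the one being reported.
        disk = ps.disk_usage(disk_path)
        metrics.update(
            disk_total=disk.total,
            disk_used=disk.used,
            disk_percent=disk.percent,
            disk_free=disk.free,
        )
    # Assumption: uptime and boot time are reported regardless of collector flags.
    boot = ps.boot_time()
    metrics["boot_time"] = int(boot)
    metrics["uptime"] = int(time.time() - boot)
    return metrics
```

With psutil installed, calling `collect_metrics(psutil, {"cpu": True, "memory": True, "disk": True})` returns a dictionary whose keys match the field names in the Collected Metrics table.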
Dashboard Display
Metrics appear on the server detail page:
- Overview Tab: Current values
- Metrics Tab: Historical charts
  - CPU usage over time
  - Memory usage over time
  - Disk usage over time
  - Load averages
Events
| Event | Direction | Description |
|---|---|---|
| metrics_collected | Emits | Metrics successfully collected |
API Payload
Metrics are sent via the heartbeat endpoint:
{
  "hostname": "web-server-01",
  "cpu_percent": 25.5,
  "cpu_count": 4,
  "mem_total": 8589934592,
  "mem_used": 4294967296,
  "mem_percent": 50.0,
  "mem_available": 4294967296,
  "disk_total": 107374182400,
  "disk_used": 53687091200,
  "disk_percent": 50.0,
  "disk_free": 53687091200,
  "load_1": 0.5,
  "load_5": 0.4,
  "load_15": 0.3,
  "uptime": 864000,
  "boot_time": 1704067200,
  "fail2ban_running": true,
  "fail2ban_version": "0.11.2",
  "fail2ban_jails": ["sshd", "nginx-http-auth"]
}
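
When debugging a payload, it can help to check that each field has the type shown in the example above. The following is a minimal sketch of such a check. `EXPECTED_TYPES` and `check_payload` are hypothetical names, the types are inferred from the example rather than from a published schema, and every field is treated as optional.

```python
import json

NUMBER = (int, float)

# Types inferred from the example payload; this is not an official schema.
EXPECTED_TYPES = {
    "hostname": str,
    "cpu_percent": NUMBER, "cpu_count": int,
    "mem_total": int, "mem_used": int, "mem_percent": NUMBER, "mem_available": int,
    "disk_total": int, "disk_used": int, "disk_percent": NUMBER, "disk_free": int,
    "load_1": NUMBER, "load_5": NUMBER, "load_15": NUMBER,
    "uptime": int, "boot_time": int,
    "fail2ban_running": bool, "fail2ban_version": str, "fail2ban_jails": list,
}


def check_payload(raw: str) -> list:
    """Return a list of type problems found in a heartbeat payload (empty if none)."""
    data = json.loads(raw)
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in data:
            continue  # fields may be absent, e.g. load averages on Windows
        value = data[field]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            problems.append(f"{field}: unexpected type {type(value).__name__}")
    return problems
```

An empty list means every field that is present matches the expected type; each entry otherwise names the offending field.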
Troubleshooting
Metrics Not Updating
- Check agent logs:
    journalctl -u bloqd-agent | grep -i metrics
- Verify module is enabled in config
- Check interval setting (default 5 minutes)
High CPU from Agent
If the agent uses excessive CPU:
- Increase metrics interval:
    modules:
      metrics:
        interval: 600  # 10 minutes
- Disable unused collectors:
    modules:
      metrics:
        collect:
          network: false
Load Average Not Available
Load averages are not available on Windows. The agent handles this gracefully and omits these metrics.
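
A minimal sketch of this kind of graceful omission is shown below. It uses the standard library's `os.getloadavg`, which is not defined on Windows, rather than whatever call the agent makes internally; `load_metrics` is a hypothetical name.

```python
import os


def load_metrics() -> dict:
    """Return load averages, or an empty dict where the platform has none."""
    getloadavg = getattr(os, "getloadavg", None)
    if getloadavg is None:
        # os.getloadavg does not exist on Windows, so omit the fields entirely.
        return {}
    try:
        one, five, fifteen = getloadavg()
    except OSError:
        # Raised when the load average cannot be obtained on this system.
        return {}
    return {"load_1": one, "load_5": five, "load_15": fifteen}
```

Returning an empty dictionary lets the caller merge the result into the payload without special-casing the platform.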
fail2ban Info Missing
If fail2ban metrics are missing:
- Check fail2ban is running:
    systemctl status fail2ban
- Verify fail2ban-client works:
    fail2ban-client status
- Check agent has permission to run fail2ban-client (requires root)
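
To reproduce roughly what the agent would see, the checks above can be combined into a short script. This is a sketch under assumptions: `parse_jail_list` and `fail2ban_status` are hypothetical helpers, and the parser relies on the `Jail list:` line that `fail2ban-client status` prints.

```python
import shutil
import subprocess


def parse_jail_list(status_output: str) -> list:
    """Extract jail names from the output of `fail2ban-client status`."""
    for line in status_output.splitlines():
        if "Jail list:" in line:
            _, _, jails = line.partition("Jail list:")
            return [name.strip() for name in jails.split(",") if name.strip()]
    return []


def fail2ban_status() -> dict:
    """Run fail2ban-client and report whether it responded, plus its jails."""
    unavailable = {"fail2ban_running": False, "fail2ban_jails": []}
    if shutil.which("fail2ban-client") is None:
        return unavailable  # client not installed or not on PATH
    try:
        result = subprocess.run(
            ["fail2ban-client", "status"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return unavailable
    if result.returncode != 0:
        # Typically the server is stopped or the caller lacks permission.
        return unavailable
    return {"fail2ban_running": True, "fail2ban_jails": parse_jail_list(result.stdout)}
```

Running `fail2ban_status()` as the same user as the agent shows whether a permission problem, rather than fail2ban itself, explains the missing fields.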