TheTechBasket/System-Alert

server-monitor

Lightweight Linux server monitoring with Bash, systemd, and alert fan-out to Telegram, Discord, or a generic webhook.

The project is built for self-hosted servers where you want:

  • no agent install outside your own scripts
  • no SaaS dependency for basic alerting
  • a single deploy command that pushes scripts and the server .env
  • enough control to tune cadence, thresholds, and which services matter

Repository layout

server-monitor/
├── .env.example                 # Local deploy config template
├── .env                         # Local deploy targets and SSH defaults (ignored)
├── .env.server.example          # Server alert config template
├── .env.server                  # Optional local defaults for deploy.sh (ignored)
├── .deploy/                     # Generated per-host env files from deploy.sh (ignored)
├── deploy.sh                    # Interactive deployment and config generation
├── README.md
├── scripts/
│   ├── lib/
│   │   └── common.sh            # Shared env/state helpers
│   ├── monitors/
│   │   ├── brute-force.sh
│   │   ├── log-size.sh
│   │   ├── resources.sh
│   │   ├── services.sh
│   │   └── ssh-login.sh
│   ├── notifiers/
│   │   ├── discord.sh
│   │   ├── telegram.sh
│   │   └── webhook.sh
│   ├── notify.sh
│   └── run-monitors.sh
└── systemd/
    ├── server-monitor.service
    └── server-monitor.timer

What changed in deployment

deploy.sh now does the full server-side config flow instead of stopping at script upload.

It now:

  • keeps local deploy hosts in .env
  • uses .env.server and .env.server.example as defaults for alert config
  • interactively asks which monitors and alert platforms to enable
  • asks for notifier values like Telegram token, chat ID, Discord webhook, or generic webhook settings
  • asks for cadence defaults such as resource checks, service checks, brute-force checks, and storage checks
  • asks whether scheduling should be handled by the built-in systemd timer or by your own cron
  • asks which services to monitor from the remote server's enabled service list
  • generates a per-host env file under .deploy/<host>.env.server
  • generates cron setup artifacts when cron mode is selected
  • uploads that generated file to the server as /etc/server-monitor/.env
  • reloads an existing timer deployment and sends a "Server Monitor Updated" alert when a prior server config already exists

This means local .env stays local, and the server receives only the generated alert config as .env.

Default alert behaviour

The new defaults are meant to be useful without being noisy:

  • CPU, RAM, disk, I/O wait, and network checks run every 1 minute
  • resource alerts wait for a sustained breach of 5 minutes before firing
  • service status checks run every 1 minute
  • brute-force checks run every 1 minute over the last 5 minutes of SSH logs
  • log/storage checks run every 1440 minutes, which is once a day
  • SSH login alerts are event-driven through PAM and fire immediately
  • recovery alerts are enabled by default

You can override all of these during deploy or by editing .env.server locally.
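
Since .env.server is an ordinary shell-style env file, overrides are plain assignments. A hypothetical fragment, with key names assumed for illustration (the exact keys live in .env.server.example):

```shell
# Hypothetical .env.server fragment — these key names are illustrative
# assumptions, not necessarily the exact keys in .env.server.example.
RESOURCE_CHECK_INTERVAL_MIN=1      # CPU/RAM/disk/iowait/network cadence
RESOURCE_SUSTAIN_MIN=5             # breach must persist this long before an alert
SERVICE_CHECK_INTERVAL_MIN=1
BRUTE_FORCE_CHECK_INTERVAL_MIN=1
LOG_CHECK_INTERVAL_MIN=1440        # once a day
RECOVERY_ALERTS=true               # send an all-clear when a breach ends
```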

Setup

1. Configure deploy targets

cp .env.example .env

Example local .env:

SSH_HOST_1="100.64.0.2"
SSH_HOST_2="100.64.0.3"
SSH_USER="ploi"
REMOTE_DIR="/etc/server-monitor"

2. Review server defaults

cp .env.server.example .env.server

This file is optional but useful. deploy.sh uses it as the default answer set for interactive prompts.

3. Deploy

./deploy.sh

Useful variants:

./deploy.sh --server 1
./deploy.sh --host 100.64.0.9 --user ploi
./deploy.sh --cron
./deploy.sh --non-interactive
./deploy.sh --test
./deploy.sh --no-timer
./deploy.sh --no-pam

During interactive deploy, the script will ask for:

  • alert platforms to enable
  • credentials or webhook values for those platforms
  • which monitor classes to enable
  • whether systemd timer or self-managed cron should run the monitors
  • how often each check should run
  • how long resource pressure should be sustained before alerting
  • which services to watch on that host
  • thresholds for enabled monitors

For a single-host deploy, the latest answers are also written back to local .env.server so the next deploy starts from the same values.

Generated server config

Every deploy creates a host-specific file locally:

.deploy/100.64.0.2.env.server

That generated file is copied to the remote machine as:

/etc/server-monitor/.env

The remote .env is installed with:

  • owner: root:root
  • mode: 600

If cron mode is selected, deploy also generates:

.deploy/<host>.cron.setup.sh
.deploy/<host>.cron.setup.md

For single-host deploys, it also refreshes local cron.setup.md.

Monitor coverage

| Monitor | Script | Default cadence | Trigger |
| --- | --- | --- | --- |
| CPU | scripts/monitors/resources.sh | 1 min | sustained CPU % threshold |
| RAM | scripts/monitors/resources.sh | 1 min | sustained RAM % threshold |
| Disk space | scripts/monitors/resources.sh | 1 min | sustained partition usage threshold |
| Disk I/O wait | scripts/monitors/resources.sh | 1 min | sustained iowait threshold |
| Network spike | scripts/monitors/resources.sh | 1 min | sustained RX/TX MB/s spike |
| Services | scripts/monitors/services.sh | 1 min | selected service is not active |
| Brute force | scripts/monitors/brute-force.sh | 1 min | failed SSH attempts over 5 min |
| Log size | scripts/monitors/log-size.sh | 1440 min | large file or oversized log dir |
| SSH login | scripts/monitors/ssh-login.sh | event-driven | PAM session open |
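
The brute-force check amounts to a windowed count of failed attempts compared against a threshold. A minimal sketch of that logic, with an assumed threshold variable and a plain log-file input (the real monitor may read journalctl or /var/log/auth.log instead):

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the brute-force count; not the project's actual code.
BRUTE_FORCE_THRESHOLD="${BRUTE_FORCE_THRESHOLD:-5}"   # assumed variable name

count_failed() {
  # Count failed SSH password attempts in the given log slice
  grep -c 'Failed password' "$1"
}

over_threshold() {
  # 0 (alert) when the count reaches the threshold
  local n
  n=$(count_failed "$1") || true   # grep -c prints 0 and exits 1 on no match
  [ "$n" -ge "$BRUTE_FORCE_THRESHOLD" ]
}
```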

Service monitoring

Service monitoring no longer has to track every enabled unit.

During deploy, the script reads enabled services from the target host and shows a selectable list. The chosen services are stored in:

SERVICES_TO_MONITOR="nginx php8.3-fpm mysql"

If you leave this blank in a hand-written env, the monitor falls back to the filtered enabled-service list on the server.
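
A minimal sketch of that selection-plus-fallback logic, with assumed structure (the actual services.sh may differ):

```shell
#!/usr/bin/env bash
# Hypothetical sketch; SERVICES_TO_MONITOR would come from /etc/server-monitor/.env.
SERVICES_TO_MONITOR="${SERVICES_TO_MONITOR:-}"

list_targets() {
  if [ -n "$SERVICES_TO_MONITOR" ]; then
    # Explicit selection from deploy; intentional word splitting, one unit per line
    printf '%s\n' $SERVICES_TO_MONITOR
  else
    # Fallback: every enabled service unit on the host
    systemctl list-unit-files --state=enabled --type=service --no-legend |
      awk '{print $1}'
  fi
}

check_service() {
  # Exit 0 when the unit is active, non-zero otherwise
  systemctl is-active --quiet "$1"
}
```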

Commands on the server

Check configured monitors:

sudo bash /etc/server-monitor/scripts/run-monitors.sh --list

Force a full run regardless of cadence:

sudo bash /etc/server-monitor/scripts/run-monitors.sh --force

Run one monitor directly:

sudo bash /etc/server-monitor/scripts/run-monitors.sh services

Send a test notification:

sudo bash /etc/server-monitor/scripts/notify.sh "Test" "Hello from $(hostname)" "" "info"

Scheduling

Systemd timer mode

By default deploy.sh installs or reloads:

  • /etc/systemd/system/server-monitor.service
  • /etc/systemd/system/server-monitor.timer

The timer runs every 60 seconds. Per-monitor cadence is controlled inside the scripts via env values, so you get fine-grained timing without needing multiple timers.
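
As a sketch, a 60-second systemd timer generally takes the following shape; the shipped units under systemd/ may differ in their details:

```ini
# Illustrative 60-second timer; the project's actual unit may differ.
[Unit]
Description=Run server-monitor checks every minute

[Timer]
OnBootSec=60
OnUnitActiveSec=60
AccuracySec=1s
Unit=server-monitor.service

[Install]
WantedBy=timers.target
```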

If a timer deployment already exists, deploy does this safely:

  • copies updated units
  • runs systemctl daemon-reload
  • resets failed state
  • restarts the timer if it already existed
  • enables it if it did not
  • sends an update alert if a config was already present on the server

Cron mode

If you prefer to manage scheduling yourself, choose cron mode interactively or pass --cron.

Deploy will then:

  • upload scripts and .env normally
  • disable any existing server-monitor.timer
  • generate a helper shell script and markdown instructions under .deploy/
  • leave execution to your own cron

Recommended cron entry:

* * * * * /bin/bash /etc/server-monitor/scripts/run-monitors.sh

That single every-minute cron works because run-monitors.sh already respects the cadence values in /etc/server-monitor/.env.
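
One common way to implement such a cadence gate is a per-monitor timestamp file compared against the configured interval. The sketch below uses assumed names and is not the project's actual code:

```shell
#!/usr/bin/env bash
# Hypothetical cadence gate; state path and names are assumptions.
STATE_DIR="${STATE_DIR:-/etc/server-monitor/state}"

due() {
  # due <monitor> <interval-minutes>: returns 0 (run now) only when the
  # interval since the last recorded run has elapsed; records the new run time.
  local stamp="$STATE_DIR/$1.last" now last
  now=$(date +%s)
  last=$(cat "$stamp" 2>/dev/null || echo 0)
  if [ $(( now - last )) -ge $(( $2 * 60 )) ]; then
    echo "$now" > "$stamp"
    return 0
  fi
  return 1
}
```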

Safety and security

The deploy and runtime flow was tightened specifically to avoid destabilizing the server.

Runtime safety

  • monitor execution is serialized with flock when available, so overlapping runs do not pile up
  • resource alerts require sustained pressure before sending, which reduces false positives from short spikes
  • log-size checks run daily by default to avoid unnecessary filesystem churn
  • notifier HTTP calls run with request timeouts, so alert endpoints do not stall monitoring forever
  • systemd units use NoNewPrivileges=true, PrivateTmp=true, ProtectSystem=strict, and ProtectHome=true
  • the service can only write to /etc/server-monitor/state
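
The flock serialization can be sketched as follows; the lock path is an assumption:

```shell
#!/usr/bin/env bash
# Illustrative run serialization; the project's lock path may differ.
LOCKFILE="${LOCKFILE:-/tmp/server-monitor.lock}"

exec 9>"$LOCKFILE"
if ! flock -n 9; then
  # Another run already holds the lock; bail out instead of piling up.
  echo "skipped: run already in progress"
  exit 0
fi
echo "lock acquired; running monitors"
# ... monitors execute here; the lock is released when fd 9 closes on exit ...
```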

Deploy safety

  • deploy uploads into a temporary staging directory first
  • dry-run still checks SSH, sudo, remote temp staging, and rsync-to-temp viability
  • the server config is installed as a root-only .env
  • local .env is never copied to the server
  • legacy flat-file paths are cleaned up only inside the monitor install directory
  • timer reloads are idempotent, so repeated deploys do not duplicate units or hooks
  • the PAM line is appended only if the hook is not already present
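
The idempotent append pattern behind that last bullet can be as simple as a whole-line fixed-string grep; the helper name and the example hook line below are hypothetical, not necessarily what deploy.sh installs:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line only if it is not already present verbatim.
append_once() {
  # append_once <file> <line>
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example hook line (illustrative path, not the project's confirmed PAM line):
hook='session optional pam_exec.so quiet /etc/server-monitor/scripts/monitors/ssh-login.sh'
```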

Secret handling

  • notifier tokens and webhooks live only in .env.server, generated .deploy/*, and the remote /etc/server-monitor/.env
  • .env, .env.server, and .deploy/ are ignored by git
  • the unprivileged deploy user cannot read /etc/server-monitor/.env without sudo

Suggestions already folded into the code

These were worth implementing immediately rather than leaving as notes:

  • moved the repo into real scripts/ and systemd/ folders so local and remote layouts match
  • fixed the log scan find expression so file matching behaves correctly
  • added load average context to resource alerts
  • added service target selection instead of blindly watching everything
  • moved cadence into config instead of hardcoding one schedule for every check

Good next improvements

If you want to push this further, these are the next changes worth considering:

  1. add a validate-env.sh script so malformed .env.server values fail before deploy starts
  2. add a second notifier retry queue for temporary outbound network failures
  3. replace the plain-text interactive prompts with a config summary and confirmation step before upload
  4. add per-service grace periods so noisy restart cycles do not alert instantly during deployments
  5. add lightweight tests for env parsing and cadence logic
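
As a starting point for item 1, a validator could simply check that required keys exist and that cadences are numeric; the key names here are assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical validate-env.sh sketch; key names are illustrative.
validate_env() {
  # validate_env <env-file>: non-zero exit with a message on the first problem
  local file="$1" key val
  for key in RESOURCE_CHECK_INTERVAL_MIN LOG_CHECK_INTERVAL_MIN; do
    val=$(grep -E "^${key}=" "$file" | tail -1 | cut -d= -f2- | tr -d '"')
    if [ -z "$val" ]; then
      echo "missing: $key" >&2; return 1
    fi
    case "$val" in
      ''|*[!0-9]*) echo "not a number: $key=$val" >&2; return 1 ;;
    esac
  done
}
```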

About

[Beta] Get timely alerts
