This repository contains the complete configuration for a robust monitoring stack using Docker, Prometheus, and Grafana. It is designed to provide comprehensive health and performance monitoring for servers and websites across different cloud environments and operating systems.
The entire stack runs in Docker containers orchestrated by Docker Compose, making it portable and easy to manage.
This configuration is pre-built to monitor:
- Host System Metrics (DigitalOcean VM): Full health monitoring of the main server running the stack (CPU, Memory, Disk I/O, Network).
- Remote Linux System Metrics (AWS VM): Full health monitoring of a remote AWS EC2 instance.
- Remote Windows System Metrics (Azure Stack Hub VM): Health monitoring of a Windows Server VM, including CPU, Memory, and Disk Usage.
- Database Metrics (MS SQL Server 2022): Performance and health metrics for a Microsoft SQL Server instance.
- Website & URL Health: Uptime, response time, and SSL certificate health for 22 web endpoints, separated into logical groups.
The configuration is split into multiple files for easier management.
.
├── docker-compose.yml              # Main file to launch all services
├── alertmanager_config/
│   └── config.yml                  # Alertmanager configuration with Graph API webhook
├── prometheus_config/
│   ├── prometheus.yml              # Main prometheus config, loads job files
│   ├── alert.rules.yml             # Alert rules definition
│   └── scrape_configs/
│       ├── websites.yml            # All website monitoring jobs
│       ├── nodes.yml               # All Linux/Windows server monitoring jobs
│       └── sql.yml                 # The SQL server monitoring job
├── loki_config/
│   └── loki-config.yml             # Loki log aggregation configuration
├── promtail_config/
│   └── promtail-config.yml         # Promtail log collection configuration
├── nginx_config/
│   └── default.conf                # NGINX reverse proxy for Grafana
├── blackbox_config/
│   └── config.yml                  # Blackbox Exporter URL probing configuration
├── graph_email_service/            # NEW: Microsoft Graph API email service
│   ├── Dockerfile                  # Docker build configuration
│   ├── package.json                # Node.js dependencies
│   ├── graph-email-service.js      # Main email service application
│   └── healthcheck.js              # Health check endpoint
└── README.md                       # This documentation file
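As a quick sanity check after cloning, the layout above can be verified with a small shell sketch. The helper name missing_configs is hypothetical and not part of this repository; the file list is copied from the tree above, so adjust it if your checkout differs.

```shell
# List the config files from the tree above that are missing under a repo root.
# missing_configs is a hypothetical helper name, not something this repo ships.
missing_configs() {
  root="${1:-.}"
  for f in \
    docker-compose.yml \
    alertmanager_config/config.yml \
    prometheus_config/prometheus.yml \
    prometheus_config/alert.rules.yml \
    prometheus_config/scrape_configs/websites.yml \
    prometheus_config/scrape_configs/nodes.yml \
    prometheus_config/scrape_configs/sql.yml \
    loki_config/loki-config.yml \
    promtail_config/promtail-config.yml \
    nginx_config/default.conf \
    blackbox_config/config.yml
  do
    # Print each expected path that does not exist as a regular file.
    [ -f "$root/$f" ] || echo "$f"
  done
}
```

Running missing_configs /path/to/your/cloned/repo prints nothing when every file is in place, and one relative path per line otherwise.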
Before you begin, you will need:
- A primary server (e.g., a DigitalOcean VM) where the monitoring stack will run.
- Docker and Docker Compose installed on this primary server.
- A domain name (e.g., adplay-mobile.com).
- A Cloudflare account managing your domain's DNS.
- One or more remote servers to monitor (e.g., an AWS EC2 instance and a Windows VM).
- Install Docker.
- Run the Node Exporter container:
sudo docker run -d --name=node-exporter --net="host" --pid="host" -v "/:/host:ro,rslave" quay.io/prometheus/node-exporter:latest --path.rootfs=/host
- In your cloud firewall (e.g., AWS Security Group), open TCP port 9100 and allow access from your DigitalOcean monitoring server's IP.
- Follow the detailed manual setup instructions to install windows_exporter (on port 9182) and sql_exporter (on port 9187). This includes creating a low-privilege user and setting up the services.
- In your cloud firewall (e.g., Azure Stack Hub NSG), open TCP ports 9182 and 9187 and allow access from your DigitalOcean monitoring server's IP.
- In the Windows Defender Firewall on the VM, also add inbound rules to allow traffic on TCP ports 9182 and 9187.
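Once the firewall rules are in place, it can help to confirm from the monitoring server that each exporter actually answers. The following is a minimal sketch, assuming the exporters serve their metrics on the usual /metrics path; the function names and the 203.0.113.x addresses are placeholders, not values from this repository.

```shell
# Build the metrics URL for an exporter from a host and a port.
metrics_url() {
  printf 'http://%s:%s/metrics\n' "$1" "$2"
}

# Succeed only if the exporter answers on /metrics within 5 seconds.
check_exporter() {
  curl -fsS --max-time 5 -o /dev/null "$(metrics_url "$1" "$2")"
}

# Example run from the monitoring server (placeholder IPs, replace with your own):
#   check_exporter 203.0.113.10 9100 && echo "node_exporter OK"
#   check_exporter 203.0.113.20 9182 && echo "windows_exporter OK"
#   check_exporter 203.0.113.20 9187 && echo "sql_exporter OK"
```

A timeout here usually points at a cloud firewall or Windows Defender rule, while a connection refused suggests the exporter service itself is not running.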
- Clone this repository to your DigitalOcean server.
- Configure DNS: In Cloudflare, create an A record for a subdomain (e.g., monitoring) that points to your DigitalOcean server's public IP. Ensure the proxy status is Enabled (orange cloud).
- Configure NGINX: Edit nginx_config/default.conf and replace monitoring.adplay-mobile.com with your chosen subdomain.
- Configure Prometheus:
  - Edit prometheus_config/scrape_configs/nodes.yml and add the public IPs of your AWS and Windows VMs.
  - Edit prometheus_config/scrape_configs/websites.yml to add or remove any URLs you wish to monitor.
- Launch the main monitoring stack:
  cd /path/to/your/cloned/repo
  sudo docker-compose up -d
- Launch the local Node Exporter: This command runs the Node Exporter for the DigitalOcean VM itself and connects it to the Docker network.
  # (Ensure any old node-exporter is removed: sudo docker rm -f node-exporter)
  sudo docker run -d \
    --name=node-exporter \
    --network=monitoring-stack_monitoring_net \
    -v "/proc:/host/proc:ro" \
    -v "/sys:/host/sys:ro" \
    -v "/:/rootfs:ro" \
    quay.io/prometheus/node-exporter:latest \
    --path.procfs=/host/proc \
    --path.sysfs=/host/sys \
    --path.rootfs=/rootfs
  (Note: The network name is based on the directory name. Use docker network ls to confirm.)
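Because the network name depends on the directory the repository was cloned into, it may be safer to look it up than to type it. This is a small sketch, assuming the network is declared as monitoring_net in docker-compose.yml (which the names used in this README suggest, but which is not shown here); pick_monitoring_net is a hypothetical helper name.

```shell
# Pick the Compose-created network from a list of network names on stdin.
# Assumes the Compose network key is "monitoring_net", so the real name
# ends in "_monitoring_net" with the project name as prefix.
pick_monitoring_net() {
  grep -E '_monitoring_net$' | head -n 1
}

# On the Docker host:
#   NET="$(sudo docker network ls --format '{{.Name}}' | pick_monitoring_net)"
#   echo "Use --network=$NET"
```

If the lookup prints nothing, the stack has probably not been started yet, since Compose creates the network on the first docker-compose up.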
- Navigate to your domain (e.g., https://monitoring.adplay-mobile.com).
- Log in to Grafana (admin / your password).
- Add Prometheus as a Data Source:
  - Go to Administration -> Data sources -> Add data source.
  - Select Prometheus.
  - Set the Prometheus server URL to http://prometheus:YOUR_PORT.
  - Click "Save & Test".
- Import Dashboards:
  - Go to Dashboards -> New -> Import.
  - Import the following dashboards by their ID. Remember to change the Name and UID for each import to create separate, dedicated dashboards.
    - 7587: For website health (Blackbox). Import this multiple times, once for each job (blackbox, blackbox_portals, etc.).
    - 1860: For Linux server health (Node Exporter).
    - 14694 or 10467: For Windows Server health.
    - 14333 or 19420: For MS SQL Server health.
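The data source step above can also be scripted against Grafana's HTTP API instead of clicking through the UI. The sketch below assumes basic authentication with the admin account is enabled and that the API is reachable through the NGINX proxy; datasource_payload is a hypothetical helper, and the password, port and subdomain remain placeholders.

```shell
# Emit the JSON body for Grafana's "create data source" endpoint.
# $1 is the Prometheus URL as seen from inside the Grafana container.
datasource_payload() {
  printf '{"name":"Prometheus","type":"prometheus","access":"proxy","url":"%s"}\n' "$1"
}

# Example call (placeholders: port, password, subdomain):
#   datasource_payload "http://prometheus:YOUR_PORT" |
#     curl -fsS -u "admin:YOUR_PASSWORD" \
#       -H 'Content-Type: application/json' \
#       -X POST --data @- \
#       "https://<your_subdomain>/api/datasources"
```

The UI route remains the simpler choice for a single server; a scripted call mainly pays off when the stack is rebuilt often.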
- Grafana Dashboard: https://<your_subdomain>
- Prometheus UI: http://<Your_DO_Server_IP>:YOUR_PORT (requires opening this port, 9090 by default, in your DigitalOcean firewall).
- Logs:
  docker-compose logs
  docker-compose logs [service-name]
  docker-compose logs -f [service-name]
- Loki crash loop: set allow_structured_metadata: false in loki-config.yml.
- Connectivity (confirm the exact network name with docker network ls first):
  docker network inspect monitoring_stack_monitoring_net
  docker-compose exec prometheus ping alertmanager
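Beyond ping, each service exposes an HTTP health endpoint that says more about whether it is actually serving. The following is a rough sketch using the upstream default paths; it assumes the relevant ports are published on the host, which this README does not confirm, and both function names are hypothetical.

```shell
# Map a service name to its built-in health path (upstream defaults).
health_path() {
  case "$1" in
    prometheus|alertmanager) echo "/-/healthy" ;;
    grafana)                 echo "/api/health" ;;
    loki)                    echo "/ready" ;;
    *)                       return 1 ;;
  esac
}

# Probe one service: check_health <service> <host> <port>
# e.g. check_health prometheus localhost YOUR_PORT
check_health() {
  path="$(health_path "$1")" || { echo "unknown service: $1" >&2; return 2; }
  curl -fsS --max-time 5 -o /dev/null "http://$2:$3$path"
}
```

A non-zero exit from check_health narrows the problem to that one container, whose logs can then be followed with docker-compose logs -f [service-name].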
Prometheus ─┬─ Alertmanager ── Graph Email Service ── Microsoft Graph API
            │
            └─ Grafana
                └─ Loki ── Promtail


