A production-grade Windows telemetry agent focused on System Stability, Critical Fault Detection, and ML Training Data Generation. Uses the modern Windows Eventing API (EvtQuery) to collect high-value system events, categorizes them for diagnostic ML pipelines, and streams them through a Kafka → PostgreSQL data pipeline.
graph TD
subgraph "Windows Host (Producer)"
WE[Windows Event Logs] --> |EvtQuery| COL[collector.py]
COL --> |Parse & Hash| CLS{ErrorClassifier}
CLS --> |Decorate| FMT[JSON Payload formatting\n+ Diagnostic Context]
end
subgraph "Local Fallback"
FMT -.-> |Network failure| LFS[(collected_events.json)]
end
subgraph "Linux Server / WSL (Broker & Consumer)"
FMT --> |Kafka Producer API\nacks=all, retries=5| KAFKA[Kafka Broker\nTopic: sentinel-events]
KAFKA --> |KafkaConsumer API| K2P[kafka_to_postgres.py]
K2P --> |psycopg2| PG[(PostgreSQL Database)]
end
classDef windows fill:#e1f5fe,stroke:#01579b,stroke-width:2px;
classDef linux fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px;
classDef storage fill:#fff3e0,stroke:#e65100,stroke-width:2px;
class WE,COL,CLS,FMT windows;
class KAFKA,K2P linux;
class LFS,PG storage;
- Targeted Monitoring: System, Kernel-Power, and DriverFrameworks logs.
- Auto-Classification Engine: Classifies events into ML-ready labels (SYSTEM_FAULT, DRIVER_ISSUE, SERVICE_ERROR, RESOURCE_WARNING, SECURITY_EVENT, etc.).
- Diagnostic Context: Attaches point-in-time system resource snapshots (CPU >90%, Mem >90%, Disk <10%) to the exact moment an error occurred.
- Guaranteed Delivery: Producer confirms every message explicitly via future.get(timeout=10). No silent drops.
- Resilience: Producer gracefully handles Kafka outages with exponential backoff reconnections, while Consumer handles DB outages identically.
- Transactional Consistency: Consumer implements batch-level rollbacks and delays Kafka offset commits until DB insert is confirmed.
- Storage Guard: Automatically pauses collection if system disk drops below 1GB free space.
- Graceful Elevation: Detects Administrator privileges via channel probing and falls back gracefully when limited.
- Kafka Publishing: Events streamed to Kafka topic sentinel-events via kafka-python-ng.
- PostgreSQL Consumer: Standalone kafka_to_postgres.py script reads from Kafka and writes to PostgreSQL with idempotent dedup (ON CONFLICT DO NOTHING).
- Three Delivery Modes: Local file (testing), Kafka pipeline, or HTTPS, switchable via environment variables.
- Hardware-Tied Hashing: SHA256 hashes for deduplication using (raw_xml + machine_guid + record_id).
- Atomic Checkpoints: Checkpoints only advance if transmission to the server succeeds.
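Two of the features above, the deduplication hash and the resource alerts, can be sketched in a few lines. This is a minimal illustration, not the agent's actual code: the function names, the plain string concatenation, and the UTF-8 encoding are assumptions, and the alert labels other than "HIGH MEMORY" are hypothetical. The thresholds are the ones listed above; the agent's real comparison may differ.

```python
import hashlib


def event_hash(raw_xml: str, machine_guid: str, record_id: int) -> str:
    """SHA256 over raw_xml + machine_guid + record_id (concatenation order assumed)."""
    material = raw_xml + machine_guid + str(record_id)
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def resource_alerts(cpu_percent: float, mem_percent: float, disk_free_percent: float) -> list:
    """Return alert labels for a resource snapshot taken when the event was read."""
    alerts = []
    if cpu_percent > 90:
        alerts.append("HIGH CPU")      # hypothetical label
    if mem_percent > 90:
        alerts.append("HIGH MEMORY")   # label seen in the sample payload
    if disk_free_percent < 10:
        alerts.append("LOW DISK")      # hypothetical label
    return alerts
```

Because the machine GUID and record ID are part of the hashed material, the same event XML read on two different hosts yields two different hashes, while re-reading the same record on one host yields the same hash.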
SentinelCore/
├── src/ # Core agent and data pipeline
│ ├── collector.py # Main log collection and Kafka publisher (Run on Windows)
│ ├── kafka_to_postgres.py # Kafka → PostgreSQL consumer (Run in WSL/Linux)
│ ├── analyze_logs.py # Helper functions for log analysis
│ └── enhanced_analyzer.py # Advanced ML correlation framework
├── tests/ # End-to-end and live testing suite
│ ├── test_e2e.py # E2E unit tests
│ ├── test_live_errors.py # Live testing against real Windows Event Log
│ └── validate_collector.py # Pipeline dependency validator
├── deploy/ # Fully automated deployment tooling
│ └── deploy_startup.ps1 # Registers agent as a SYSTEM service
├── docs/ # Documentation and Guides
│ ├── LOCAL_TESTING_GUIDE.md # Local testing and Kafka pipeline usage
│ └── WSL_KAFKA_POSTGRES_SETUP.md # WSL infrastructure setup
├── config.json # Pipeline configuration and Kafka tuning parameters
├── requirements.txt # Standard Python dependencies
└── README.md
Requirements: Windows 10/11, WSL (Ubuntu), Python 3.9+
Before the Windows agent can run, the receiving pipeline must be online.
- Install Kafka and PostgreSQL in WSL. Detailed steps are in docs/WSL_KAFKA_POSTGRES_SETUP.md.
- Configure config.json in the root directory on the Windows side. Ensure bootstrap_servers matches your WSL IP address:
{
  "kafka": {
    "bootstrap_servers": "172.30.178.75:9092",
    "topic": "sentinel-events",
    "client_id": "windows-test-agent",
    "acks": "all",
    "retries": 5,
    "retry_backoff_ms": 3000,
    "linger_ms": 50,
    "request_timeout_ms": 15000
  },
  "agent": {
    "system_id_mode": "AUTO",
    "batch_size": 20,
    "retry_attempts": 3,
    "retry_backoff_seconds": 3
  }
}
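To make the role of the "kafka" section concrete, the sketch below shows one way its keys could be mapped onto a kafka-python-ng producer. The helper names load_config and producer_kwargs are hypothetical and not taken from collector.py. The listed keys are all valid KafkaProducer keyword arguments; "topic" is left out because it is passed to send() rather than to the constructor.

```python
import json

# Keys in the "kafka" section that KafkaProducer accepts as keyword arguments.
PRODUCER_KEYS = (
    "bootstrap_servers",
    "client_id",
    "acks",
    "retries",
    "retry_backoff_ms",
    "linger_ms",
    "request_timeout_ms",
)


def load_config(path: str = "config.json") -> dict:
    """Read the pipeline configuration from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def producer_kwargs(config: dict) -> dict:
    """Select only the producer settings; 'topic' is used at send time instead."""
    kafka_cfg = config["kafka"]
    return {key: kafka_cfg[key] for key in PRODUCER_KEYS if key in kafka_cfg}


# Usage (requires kafka-python-ng and a reachable broker):
#   from kafka import KafkaProducer
#   cfg = load_config()
#   producer = KafkaProducer(**producer_kwargs(cfg))
#   topic = cfg["kafka"]["topic"]
```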
The consumer script (src/kafka_to_postgres.py) connects to Kafka, subscribes to the topic, and writes events to the PostgreSQL database.
- Open a WSL Ubuntu terminal.
- Install consumer dependencies:
pip install kafka-python-ng psycopg2-binary
- Run the consumer script:
python3 src/kafka_to_postgres.py
- Leave this running in the terminal. It uses exponential backoff to handle any temporary network interruptions automatically.
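The consumer behaviour described in this README (idempotent inserts, batch-level rollback, offsets committed only after the database commit, exponential backoff on failure) can be sketched as follows. This is an illustration under stated assumptions, not the contents of kafka_to_postgres.py: the table name events, its columns, the event_hash field, and the backoff parameters are all hypothetical. It assumes a psycopg2 connection and a KafkaConsumer created with enable_auto_commit=False.

```python
import json

# Hypothetical table and columns; a unique constraint on event_hash is assumed.
INSERT_SQL = (
    "INSERT INTO events (event_hash, payload) VALUES (%s, %s) "
    "ON CONFLICT (event_hash) DO NOTHING"
)


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield retry delays in seconds that grow geometrically up to a cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def process_batch(records, conn, consumer) -> bool:
    """Insert one batch; commit Kafka offsets only after the DB commit succeeds."""
    try:
        with conn.cursor() as cur:
            for rec in records:
                cur.execute(INSERT_SQL, (rec["event_hash"], json.dumps(rec)))
        conn.commit()
    except Exception:
        # Roll back the whole batch; offsets stay uncommitted so the
        # same messages can be processed again after a backoff delay.
        conn.rollback()
        return False
    consumer.commit()
    return True
```

If the process stops between the database commit and the offset commit, the batch is delivered again on restart, and ON CONFLICT DO NOTHING turns the repeated inserts into no-ops.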
The collector script (src/collector.py) monitors Windows Event Logs, structures them, and pushes them to Kafka.
- Open PowerShell as Administrator.
- Install producer dependencies:
pip install -r requirements.txt
- Run the collector manually to verify data flow:
$env:SENTINEL_KAFKA_MODE = "true"
python src\collector.py
- Watch the WSL terminal to confirm the consumer successfully inserts the events into PostgreSQL.
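When checking data flow, it helps to know how the producer side confirms delivery. The sketch below shows the pattern this README describes: block on each send with future.get(timeout=10) and advance the checkpoint only for confirmed events. The function name and the per-event record_id field are assumptions made for illustration; producer is expected to behave like a kafka-python-ng KafkaProducer.

```python
import json


def publish_with_checkpoint(producer, topic, events, checkpoint, timeout=10):
    """Send events in order and return the last record_id the broker confirmed."""
    for event in events:
        data = json.dumps(event).encode("utf-8")
        try:
            future = producer.send(topic, value=data)
            # Blocks until the broker acknowledges, or raises on failure/timeout.
            future.get(timeout=timeout)
        except Exception:
            # Stop at the first unconfirmed event so the checkpoint never
            # moves past something that may not have been delivered.
            break
        checkpoint = event["record_id"]
    return checkpoint
```

Events after the first failure are not attempted in this sketch, so the next collection run resumes from the returned checkpoint and resends them.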
Once you have verified that the pipeline works end to end, you can configure SentinelCore to run silently in the background every time Windows boots.
- Open PowerShell as Administrator.
- Run the deployment script:
cd C:\path\to\SentinelCore
.\deploy\deploy_startup.ps1
This script does the following:
- Installs dependencies
- Creates an isolated Python virtual environment (.venv)
- Creates a Scheduled Task named SentinelCore Agent that elevates as the SYSTEM user
- Runs invisibly on OS startup
If you ever need to stop the background agent, simply delete the scheduled task or stop it from the Windows Task Scheduler GUI.
The structured JSON payload is designed for direct ingestion into Machine Learning pipelines for predictive maintenance models:
{
"system_id": "machine-guid",
"events": [
{
"fault_type": "DRIVER_ISSUE",
"severity": "WARNING",
"provider_name": "Microsoft-Windows-Kernel-PnP",
"event_id": 219,
"cpu_usage_percent": 45.2,
"memory_usage_percent": 88.1,
"disk_free_percent": 15.0,
"message": "Microsoft-Windows-Kernel-PnP Event 219 (WARNING) on channel System",
"created_at": "2026-02-27T01:30:00Z",
"diagnostic_context": {
"resource_alert": ["HIGH MEMORY"]
},
"raw_xml": "<Event>...</Event>"
}
]
}
Production-grade software. Ensure compliance with your organization's telemetry policies before deployment.
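As a rough illustration of ingesting the payload format shown in this section, the sketch below flattens one payload into per-event rows suitable for a tabular training set. The function name flatten_payload and the derived alert_count column are hypothetical; the remaining field names are taken from the sample payload.

```python
def flatten_payload(payload: dict) -> list:
    """Turn one agent payload into a list of flat per-event feature rows."""
    rows = []
    for event in payload.get("events", []):
        alerts = event.get("diagnostic_context", {}).get("resource_alert", [])
        rows.append({
            "system_id": payload.get("system_id"),
            "fault_type": event.get("fault_type"),      # candidate training label
            "severity": event.get("severity"),
            "provider_name": event.get("provider_name"),
            "event_id": event.get("event_id"),
            "cpu_usage_percent": event.get("cpu_usage_percent"),
            "memory_usage_percent": event.get("memory_usage_percent"),
            "disk_free_percent": event.get("disk_free_percent"),
            "alert_count": len(alerts),                 # hypothetical derived feature
        })
    return rows
```

The raw_xml and message fields are left out of the rows here because they are free text; a real pipeline might keep them for separate text features.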