A lightweight, distributed data processing framework implemented in Python. It features a Master-Worker architecture with dynamic load balancing, fault tolerance, and real-time monitoring.
- Distributed Architecture: Master-Worker pattern using raw sockets.
- Dynamic Load Balancing: Automatically throttles task assignment when workers are overloaded (>80% CPU).
- Fault Tolerance: Detects worker failures and re-queues lost tasks automatically.
- Real-time Dashboard: CLI-based TUI to monitor throughput and cluster health.
- Zero Dependencies: Core logic uses only the Python standard library (optional `psutil` for accurate CPU metrics).
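
The load-balancing rule above can be illustrated with a small sketch. It is not the project's actual scheduler: the `WorkerStatus` class and `pick_worker` function are hypothetical names, and choosing the least-loaded eligible worker is an assumption. Only the 80% CPU threshold comes from the feature list.

```python
from dataclasses import dataclass
from typing import List, Optional

CPU_THRESHOLD = 80.0  # percent; workers above this are treated as overloaded


@dataclass
class WorkerStatus:
    """Hypothetical record of what the master knows about one worker."""
    worker_id: str
    cpu_percent: float  # last CPU usage reported by the worker


def pick_worker(workers: List[WorkerStatus]) -> Optional[WorkerStatus]:
    """Return the least-loaded worker at or below the threshold, or None.

    Returning None means every worker is overloaded (or none is connected),
    so the task should stay in the queue until a worker frees up.
    """
    eligible = [w for w in workers if w.cpu_percent <= CPU_THRESHOLD]
    if not eligible:
        return None
    return min(eligible, key=lambda w: w.cpu_percent)
```

With workers reporting 91%, 42.5%, and 60% CPU, this sketch selects the worker at 42.5%; if every worker is above 80%, it returns `None` and assignment is throttled.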
- Clone the repository:

      git clone https://github.com/Jaskirat-s7/distributed-systems.git
      cd distributed-systems

- (Optional) Install dependencies for accurate CPU monitoring:

      pip install psutil

Note: The system works without `psutil` by using simulated CPU metrics.
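
One common way to make a dependency like `psutil` optional is to guard the import and fall back to a simulated value. The sketch below assumes that pattern. The function name `get_cpu_percent` and the simulated range of 10 to 90 percent are illustrative choices, not taken from the project's code.

```python
import random

try:
    import psutil  # optional third-party dependency
except ImportError:
    psutil = None  # fall back to simulated metrics below


def get_cpu_percent() -> float:
    """Return CPU usage as a percentage between 0 and 100."""
    if psutil is not None:
        # Real measurement, sampled over a short 0.1 second window.
        return psutil.cpu_percent(interval=0.1)
    # Simulated value (assumed range) used when psutil is not installed.
    return random.uniform(10.0, 90.0)
```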
Run the automated cluster simulation to see the system in action:

    python3 run_cluster.py

Alternatively, start the nodes manually:

- Start the Master Node:

      python3 main.py --mode master

- Start Worker Nodes: Open new terminals and run:

      python3 main.py --mode worker

  You can add as many workers as you like.

- Master Node (`master.py`):
  - Manages the Task Queue.
  - Listens for Worker connections.
  - Distributes tasks based on Worker health (Heartbeats).
  - Handles re-queuing of tasks if a Worker disconnects.
- Worker Node (`worker.py`):
  - Connects to Master.
  - Requests tasks and executes simulated CPU-intensive work (Prime Factorization).
  - Sends Heartbeats with CPU usage stats.
- Dashboard (`dashboard.py`):
  - Visualizes active workers, task progress, and system throughput using `curses`.
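
The interplay between heartbeats and task re-queuing described for the Master Node can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `master.py`: the `TaskTracker` class, its method names, and the 5-second `HEARTBEAT_TIMEOUT` are all hypothetical, and detecting failure through a missed heartbeat is an assumed mechanism (the real master may instead react to a closed socket).

```python
import time
from collections import deque
from typing import Deque, Dict, List, Optional

HEARTBEAT_TIMEOUT = 5.0  # seconds without a heartbeat before a worker is presumed dead (assumed value)


class TaskTracker:
    """Hypothetical bookkeeping for queued and in-flight tasks."""

    def __init__(self, tasks: List[int]) -> None:
        self.queue: Deque[int] = deque(tasks)      # tasks waiting for a worker
        self.in_flight: Dict[str, List[int]] = {}  # worker id -> tasks it holds
        self.last_seen: Dict[str, float] = {}      # worker id -> last heartbeat time

    def heartbeat(self, worker_id: str, now: Optional[float] = None) -> None:
        """Record that a worker is still alive."""
        self.last_seen[worker_id] = time.monotonic() if now is None else now

    def assign(self, worker_id: str) -> Optional[int]:
        """Hand the next queued task to a worker, or None if the queue is empty."""
        if not self.queue:
            return None
        task = self.queue.popleft()
        self.in_flight.setdefault(worker_id, []).append(task)
        return task

    def complete(self, worker_id: str, task: int) -> None:
        """Mark a task as finished so it is never re-queued."""
        self.in_flight[worker_id].remove(task)

    def reap_dead_workers(self, now: Optional[float] = None) -> List[str]:
        """Drop workers whose heartbeat is stale and re-queue their tasks."""
        now = time.monotonic() if now is None else now
        dead = [w for w, seen in self.last_seen.items()
                if now - seen > HEARTBEAT_TIMEOUT]
        for worker_id in dead:
            lost = self.in_flight.pop(worker_id, [])
            # Put lost tasks back at the front of the queue, in original order.
            self.queue.extendleft(reversed(lost))
            del self.last_seen[worker_id]
        return dead
```

In this sketch the master would call `reap_dead_workers()` periodically. Tasks held by a silent worker return to the front of the queue so that another worker picks them up first.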
License: MIT