Multi-Site Ray Cluster

Unified compute fabric across ACTIVATE resources using Ray. Deploys a Ray head node on any resource, connects workers from one or more additional sites via SSH tunnels, and provides a live dashboard showing cluster topology and task placement.

Architecture

                ACTIVATE Workflow
                      |
           +----------+----------+------------ ...
           v          v          v
        Site 1      Site 2     Site N
       Ray HEAD    Ray WORKER  Ray WORKER
       Dashboard   (SSH/SLURM) (SSH/SLURM)
           |          |          |
           +--- Ray Cluster -----+
                      |
              Workload Options:
              - Fractal rendering (visual)
              - Mathematical benchmark (charts)
              - Cluster only (bring your own workload)

Quick Start

1. Add this workflow to your ACTIVATE account.
2. Configure:
   - Head Node: select any resource. It runs the Ray coordinator and dashboard and does no compute.
   - Compute Workers: add one or more worker sites (SSH, SLURM, or PBS resources).
   - Workload: choose fractal rendering, benchmark, or cluster-only mode.
3. Click Execute.
4. Open the session link to view the live dashboard.

Workload Modes

Fractal Rendering

Distributes Mandelbrot tile rendering across all sites. Tiles appear live on a canvas, color-coded by which site rendered them.
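As an illustration of how tile rendering can be spread across a Ray cluster, here is a minimal sketch. It is not the code in benchmark.py: the names `tile_grid`, `escape_time`, `render_tile`, and `render_distributed` are hypothetical, as are the view bounds and the tile size. Ray is imported lazily so the pure-Python helpers run without it.

```python
def tile_grid(width, height, tile):
    """Split a width x height image into (x0, y0, w, h) tiles."""
    return [
        (x, y, min(tile, width - x), min(tile, height - y))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def escape_time(c, max_iter):
    """Iterations before |z| exceeds 2 under z -> z*z + c (max_iter if it never does)."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def render_tile(x0, y0, w, h, width, height, max_iter=50):
    """Return rows of iteration counts for one tile of the view [-2, 1] x [-1.5, 1.5]."""
    rows = []
    for py in range(y0, y0 + h):
        row = []
        for px in range(x0, x0 + w):
            c = complex(-2.0 + 3.0 * px / width, -1.5 + 3.0 * py / height)
            row.append(escape_time(c, max_iter))
        rows.append(row)
    return rows


def render_distributed(width=256, height=256, tile=64):
    """Render every tile as a Ray task; requires a running cluster."""
    import ray  # lazy import: the helpers above do not need Ray

    ray.init(address="auto")  # attach to an existing cluster
    remote_tile = ray.remote(render_tile)
    tiles = tile_grid(width, height, tile)
    refs = [remote_tile.remote(*t, width, height) for t in tiles]
    return dict(zip(tiles, ray.get(refs)))
```

To color tiles by site, a task could also return the ID of the node it ran on, for example via `ray.get_runtime_context().get_node_id()`. Whether the dashboard does it this way is an assumption.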

Mathematical Benchmark

Three phases:

1. Task Throughput: bursts many small tasks to measure scheduling rate and placement distribution
2. CPU Compute: matrix multiplications (NumPy) measuring GFLOPS per node
3. Scaling Test: compares multi-site vs single-site throughput
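The metrics behind these phases can be sketched as follows. This is an illustrative sketch, not the logic in benchmark.py: the function names are hypothetical, and the GFLOPS figure uses the common estimate of 2·n³ floating-point operations for an n×n matrix multiply. NumPy is imported lazily so the formulas can be checked without it.

```python
import time


def tasks_per_second(task_count, seconds):
    """Scheduling rate for a burst of small tasks."""
    return task_count / seconds


def matmul_gflops(n, seconds):
    """GFLOPS for one n x n by n x n multiply, using the 2 * n^3 flop estimate."""
    return 2.0 * n ** 3 / seconds / 1e9


def scaling_ratio(multi_site_rate, single_site_rate):
    """How many times faster the multi-site run is than the single-site run."""
    return multi_site_rate / single_site_rate


def time_matmul(n=2048, repeats=3):
    """Time an n x n NumPy matmul and report the best-case GFLOPS."""
    import numpy as np  # lazy import: only this function needs NumPy

    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a @ b
        best = min(best, time.perf_counter() - start)
    return matmul_gflops(n, best)
```

Taking the best of several repeats reduces noise from warm-up and from other processes on the node.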

Cluster Only

Deploys the Ray cluster with no built-in workload. The dashboard shows connection instructions with copy-paste commands for SSH tunnels, Ray job submission, and direct head node access. Use this mode to run your own Ray jobs, training scripts, or interactive workloads.
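As a rough sketch of what connecting in this mode can look like, the snippet below builds an SSH port-forward command and submits a job through Ray's Jobs API. The helper names, the user and host values, and the choice to forward port 8265 to the same local port are assumptions for illustration. The dashboard's Connect tab shows the actual commands for your cluster.

```python
def tunnel_command(user, head_host, port=8265):
    """Build an SSH command that forwards the head node's dashboard port locally."""
    return ["ssh", "-N", "-L", f"{port}:localhost:{port}", f"{user}@{head_host}"]


def submit_job(entrypoint, port=8265):
    """Submit a shell entrypoint as a Ray job through the forwarded port."""
    from ray.job_submission import JobSubmissionClient  # requires Ray installed

    client = JobSubmissionClient(f"http://127.0.0.1:{port}")
    return client.submit_job(entrypoint=entrypoint)  # returns the job ID
```

With the tunnel running in another terminal, `submit_job("python my_script.py")` would queue `my_script.py` (a placeholder name) on the cluster.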

Worker Dispatch

Each worker node registers one task slot with Ray, so Ray runs one task per node at a time. Ray handles placement across nodes; each task uses internal parallelism (OpenMP, MPI, PyTorch, etc.) for multi-core or GPU work within its node.
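A minimal sketch of this pattern: one Ray task claims a node's slot and fans out locally. It assumes the slot corresponds to one Ray CPU resource (`num_cpus=1`), which this README does not state. The function names are hypothetical. Threads stand in for OpenMP or BLAS threads here; pure-Python threads are limited by the GIL, so real workloads would rely on a native library for the parallel part.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares_chunk(bounds):
    """Sum i*i over the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))


def node_local_work(n, workers=None):
    """Split one task's work across local workers inside a single Ray slot."""
    workers = workers or os.cpu_count() or 1
    step = max(1, n // workers)
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares_chunk, chunks))


def run_on_cluster(n):
    """Run node_local_work as one Ray task; requires a running cluster."""
    import ray  # lazy import: the helpers above do not need Ray

    ray.init(address="auto")
    task = ray.remote(num_cpus=1)(node_local_work)  # assumed: 1 CPU == 1 slot
    return ray.get(task.remote(n))
```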

Worker dispatch modes:

- SSH: Direct connection to the remote host (single node per site)
- SLURM: Submit via srun with configurable partition, account, QoS, nodes, and walltime
- PBS: Submit via qsub with configurable directives
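To make the SLURM mode concrete, here is a sketch of assembling an `srun` invocation from the configurable fields. It is not the logic in dispatch_workers.sh: the helper names, the default walltime, and the example head address `127.0.0.1:6379` are assumptions. The flags themselves (`--nodes`, `--time`, `--partition`, `--account`, `--qos` for srun; `--address`, `--num-cpus`, `--block` for `ray start`) are standard options of those tools.

```python
def ray_worker_command(head_address, num_cpus=1):
    """Command a worker runs to join the cluster and stay in the foreground."""
    return ["ray", "start", f"--address={head_address}",
            f"--num-cpus={num_cpus}", "--block"]


def srun_command(worker_cmd, nodes=1, walltime="01:00:00",
                 partition=None, account=None, qos=None):
    """Wrap a worker command in srun, adding only the options that are set."""
    args = ["srun", f"--nodes={nodes}", f"--time={walltime}"]
    for flag, value in (("partition", partition),
                        ("account", account),
                        ("qos", qos)):
        if value:
            args.append(f"--{flag}={value}")
    return args + list(worker_cmd)


if __name__ == "__main__":
    worker = ray_worker_command("127.0.0.1:6379")
    print(" ".join(srun_command(worker, nodes=2, partition="compute")))
```

Building the command as a list rather than a single string avoids shell-quoting problems when it is passed to `subprocess.run`.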

Cross-site workers connect through SSH tunnels with unique loopback IPs (127.0.X.Y) for multi-node support.
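One way to hand out such addresses is sketched below. The mapping of X to a site index and Y to a node index is an assumption for illustration, as is the helper name. The actual scheme used by the dispatch script may differ. On Linux the whole 127.0.0.0/8 block is loopback, so these addresses need no extra interface setup there.

```python
import ipaddress


def worker_loopback_ip(site_index, node_index):
    """Return a distinct loopback address 127.0.<site>.<node> for a tunneled worker."""
    if not 0 <= site_index <= 255:
        raise ValueError("site_index must be in 0..255")
    if not 1 <= node_index <= 254:
        raise ValueError("node_index must be in 1..254")
    addr = ipaddress.IPv4Address(f"127.0.{site_index}.{node_index}")
    assert addr.is_loopback  # sanity check: still inside 127.0.0.0/8
    return str(addr)
```

For example, `worker_loopback_ip(2, 5)` gives `127.0.2.5`, so the fifth node of the second site gets an address no other tunneled worker shares.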

Dashboard

- Cluster tab: Node topology grouped by site, task placement bar chart, throughput over time
- Connect tab: SSH tunnel commands, Python examples, Ray Jobs CLI, cluster info table (cluster_only mode)
- Ray Dashboard tab: Proxied Ray native dashboard (port 8265)

Files

ray-cluster/
├── workflow.yaml              # Multi-site workflow definition
├── README.md
├── ROADMAP.md                 # Future improvements and priorities
├── thumbnail.png
└── scripts/
    ├── setup.sh               # Install Ray + NumPy via uv/pip (handles old Python)
    ├── start_ray_head.sh      # Start Ray head (--num-cpus=0) + dashboard
    ├── dispatch_workers.sh    # Connect workers from all sites (SSH/SLURM/PBS)
    ├── run_benchmark.sh       # Run benchmark, POST results to dashboard
    ├── benchmark.py           # Ray distributed benchmark + fractal tasks
    ├── dashboard.py           # FastAPI live dashboard server (WebSocket)
    ├── diagnose.sh            # Cluster diagnostic tool
    ├── diagnose_cluster.py    # Ray cluster health checker
    └── templates/
        └── index.html         # Dashboard UI
