Multi-tower inference: remote command delegation - Windows and cuda support - end to end installation fixes #315
Cross-platform install (macOS, Ubuntu/Debian, WSL2):
- GPU detection (CUDA, Metal, CPU-only)
- System deps, Node.js 22, Rust via rustup
- Python venv with ML packages (torch, transformers, peft) when GPU present
- npm install + TypeScript build + Rust worker build
- Default config.env creation
- Idempotent (safe to re-run)
Usage: git clone ... && cd continuum/src && bash scripts/install.sh
Pull request overview
Adds a new one-shot installer script intended to bootstrap a “Continuum Tower” environment across macOS/Linux/WSL by installing toolchains, optionally setting up a Python ML venv based on GPU detection, building the project, and writing a default user config.
Changes:
- Introduces src/scripts/install.sh to install system dependencies, Node.js, Rust, and (optionally) a Python ML venv.
- Adds basic GPU detection (CUDA vs Metal vs CPU-only) to decide which PyTorch wheel index to use.
- Runs the repo build (npm install, npm run build:ts, scripts/setup-rust.sh) and creates a default ~/.continuum/config.env if missing.
    npm install --silent 2>&1 | tail -3

    echo -e " Building TypeScript..."
    npm run build:ts 2>&1 | tail -1

    echo -e " Building Rust workers..."
    bash scripts/setup-rust.sh 2>&1 | tail -5
These build steps pipe output to tail, which will mask failures unless pipefail is enabled (and even then truncates the logs needed to debug). Consider removing the | tail ... truncation or capturing logs to a file while still letting failures surface clearly.
    - npm install --silent 2>&1 | tail -3
    - echo -e " Building TypeScript..."
    - npm run build:ts 2>&1 | tail -1
    - echo -e " Building Rust workers..."
    - bash scripts/setup-rust.sh 2>&1 | tail -5
    + npm install --silent > npm_install.log 2>&1
    + tail -3 npm_install.log
    + echo -e " Building TypeScript..."
    + npm run build:ts > build_ts.log 2>&1
    + tail -1 build_ts.log
    + echo -e " Building Rust workers..."
    + bash scripts/setup-rust.sh > setup_rust.log 2>&1
    + tail -5 setup_rust.log
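A variation on the suggestion above, sketched under assumptions: a small wrapper that writes the full output to a log file, prints only the tail, and returns the command's own exit status so a failure still stops the script. The name run_logged, the log paths, and the stand-in commands are hypothetical and not part of install.sh.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical helper (not part of install.sh): run a command, keep the full
# log on disk, show only the last N lines, and propagate the exit status.
run_logged() {
    local log_file="$1" tail_lines="$2"
    shift 2
    local rc=0
    # `|| rc=$?` keeps `set -e` from aborting before the log tail is printed.
    "$@" >"$log_file" 2>&1 || rc=$?
    tail -n "$tail_lines" "$log_file"
    if [ "$rc" -ne 0 ]; then
        echo "Command failed (exit $rc): $*. Full log: $log_file" >&2
    fi
    return "$rc"
}

# Demo with stand-in commands in place of npm and setup-rust.sh.
log_dir=$(mktemp -d)
run_logged "$log_dir/ok.log" 1 sh -c 'echo step1; echo step2'
if ! run_logged "$log_dir/fail.log" 3 sh -c 'echo building; exit 7'; then
    echo "failure surfaced as expected"
fi
```

In the installer, a call such as `run_logged npm_install.log 3 npm install --silent` would keep the three-line summary while leaving the complete log available for debugging.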
src/scripts/install.sh
Outdated
    curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
    sudo apt-get install -y nodejs
On Linux/WSL the script uses sudo unconditionally for NodeSource setup + apt install. This will fail when running as root (or in minimal environments without sudo). You already compute a SUDO variable in install_system_deps; reuse that pattern here or detect id -u before calling sudo.
    - curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
    - sudo apt-get install -y nodejs
    + # Determine whether to use sudo (for non-root users with sudo installed)
    + local SUDO_CMD="sudo"
    + if [ "$(id -u)" -eq 0 ] || ! command -v sudo >/dev/null 2>&1; then
    +     SUDO_CMD=""
    + fi
    + if [ -n "$SUDO_CMD" ]; then
    +     curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
    +     sudo apt-get install -y nodejs
    + else
    +     curl -fsSL https://deb.nodesource.com/setup_22.x | bash -
    +     apt-get install -y nodejs
    + fi
    echo -e "${BLUE}━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━${NC}"
    echo ""

    PLATFORM=$(preflight_detect_platform)
preflight_detect_platform can return unknown, but most case "$PLATFORM" statements have no default branch and still print success messages. This can lead to a confusing partial install on unsupported systems; consider explicitly erroring out if PLATFORM is unknown (or adding *) cases that exit 1 with a clear message).
    - PLATFORM=$(preflight_detect_platform)
    + PLATFORM=$(preflight_detect_platform)
    + if [ "$PLATFORM" = "unknown" ]; then
    +     echo -e " Platform: ${RED}unknown${NC}"
    +     echo ""
    +     echo -e "${RED}Error:${NC} Unsupported platform detected by preflight_detect_platform."
    +     echo "This installer currently supports: macOS (Apple Silicon), Ubuntu/Debian (x86_64), and WSL2."
    +     echo "If you're on a supported system and see this message, please update your scripts or open an issue."
    +     exit 1
    + fi
    # Installs: Node.js, Rust, Python venv (with ML packages if GPU detected), system deps
    # Idempotent: safe to run multiple times (skips what's already installed)

    set -e
The script pipes several critical build/install commands into tail later on, but only set -e is enabled. In bash, a failing command in a pipeline won't fail the script unless set -o pipefail is set, so npm install / npm run build:ts / setup-rust.sh failures could be silently ignored.
    - set -e
    + set -e
    + set -o pipefail
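The difference is easy to confirm in isolation. The sketch below uses `false` as a stand-in for a failing npm or cargo step (an assumption for illustration) and compares the pipeline's exit status with and without pipefail.

```shell
#!/usr/bin/env bash

# Without pipefail, a pipeline's status is that of its last command (tail),
# so the failure of the first command is lost.
set +o pipefail
status_without=0
false | tail -n 1 || status_without=$?

# With pipefail, the pipeline reports the rightmost non-zero exit status.
set -o pipefail
status_with=0
false | tail -n 1 || status_with=$?

echo "without pipefail: $status_without"
echo "with pipefail:    $status_with"
```

Run with bash, this reports 0 for the first case and 1 for the second, which is why `set -e` alone does not catch a failed `npm install ... | tail -3`.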
    # Python venv support
    if ! python3 -m venv --help &>/dev/null 2>&1; then
        # Detect python version for correct package name
        local pyver=$(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')" 2>/dev/null || echo "3.12")
        needed+=("python${pyver}-venv")
Linux/WSL system deps don't ensure python3 is installed, but later steps call python3 -m venv unconditionally. On minimal Debian/Ubuntu images this will fail even if pythonX.Y-venv is installed; consider adding python3 (and usually python3-pip) to the needed list when python3 isn't present.
    - # Python venv support
    - if ! python3 -m venv --help &>/dev/null 2>&1; then
    -     # Detect python version for correct package name
    -     local pyver=$(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')" 2>/dev/null || echo "3.12")
    -     needed+=("python${pyver}-venv")
    + # Python + venv support
    + if command -v python3 &>/dev/null; then
    +     # Ensure venv module is available for the installed python3
    +     if ! python3 -m venv --help &>/dev/null 2>&1; then
    +         # Detect python version for correct package name
    +         local pyver
    +         pyver=$(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')" 2>/dev/null || echo "3.12")
    +         needed+=("python${pyver}-venv")
    +     fi
    + else
    +     # No python3 installed: install interpreter, pip, and generic venv package
    +     needed+=("python3" "python3-pip" "python3-venv")
    macos)
        preflight_check_build_tools
        # Homebrew packages
        for pkg in jq git; do
            if ! command -v "$pkg" &>/dev/null; then
                echo -e " Installing $pkg..."
                brew install "$pkg"
            fi
        done
On macOS this assumes Homebrew is available (brew install ...) but there's no check or guidance if brew is missing. Consider checking command -v brew and either installing it or printing a clear instruction before attempting installs.
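One possible shape for such a check, as a minimal sketch: the function name require_homebrew and the message wording are assumptions, and the command name is taken as an optional argument only so that both outcomes can be exercised outside macOS.

```shell
#!/usr/bin/env bash

# Hypothetical guard (not in install.sh): fail early with guidance when
# Homebrew is missing instead of letting `brew install` fail later.
require_homebrew() {
    local brew_cmd="${1:-brew}"
    if ! command -v "$brew_cmd" >/dev/null 2>&1; then
        echo "Error: Homebrew is required on macOS but '$brew_cmd' was not found." >&2
        echo "Install it from https://brew.sh and re-run this script." >&2
        return 1
    fi
}

# In the installer this would sit at the top of the macos) branch:
#   require_homebrew || exit 1
if require_homebrew; then
    echo "Homebrew found"
else
    echo "Homebrew missing; the installer would stop here"
fi
```

Printing an instruction rather than installing Homebrew automatically keeps the script from running a second remote installer without the user's consent; either choice would address the review comment.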
    $pip install -q transformers peft accelerate datasets trl bitsandbytes pytest
bitsandbytes is installed for both CUDA and Metal paths, but it is generally CUDA-specific and commonly fails to install/import on macOS/MPS environments. This will likely break the install on Apple Silicon; consider only installing bitsandbytes when HAS_CUDA=true (and skipping it for Metal/CPU).
    - $pip install -q transformers peft accelerate datasets trl bitsandbytes pytest
    + $pip install -q transformers peft accelerate datasets trl pytest
    + if $HAS_CUDA; then
    +     echo -e " Installing CUDA-specific packages..."
    +     $pip install -q bitsandbytes
    + fi
    # Continuum Tower Install — One command to get a tower running.
    #
    # Usage:
    # git clone https://github.com/CambrianTech/continuum.git
    # cd continuum/src
    # bash scripts/install.sh
    #
    # Works on: macOS (Apple Silicon), Ubuntu/Debian (x86_64), WSL2
    # Installs: Node.js, Rust, Python venv (with ML packages if GPU detected), system deps
    # Idempotent: safe to run multiple times (skips what's already installed)
The PR title/description focus on multi-tower inference and remote command delegation, but the only change in this PR is adding an install script. Either the PR metadata should be updated to reflect the installer work, or the missing multi-tower changes should be included so the PR matches its stated purpose.
    echo -e "\n${YELLOW}Creating default config at $CONFIG_FILE${NC}"
    cat > "$CONFIG_FILE" << 'ENVEOF'
    # Continuum Tower Configuration
    # Add API keys here for cloud provider access.
    # Uncomment and fill in the ones you need:

    # ANTHROPIC_API_KEY=sk-ant-...
    # OPENAI_API_KEY=sk-proj-...
    # DEEPSEEK_API_KEY=sk-...
    # HF_TOKEN=hf_...

    # Storage path for models, adapters, datasets (default: ~/.continuum)
    # CONTINUUM_STORAGE_PATH=/path/to/storage
    ENVEOF
    echo -e " ${YELLOW}Edit $CONFIG_FILE to add your API keys${NC}"
The installer writes its own minimal ~/.continuum/config.env template, but the repo already has scripts/ensure-config.ts as the canonical config generator (with required defaults like HTTP_PORT/WS_PORT, LOG_LEVEL, etc.). Creating a partial file here will prevent ensure-config.ts from generating the full default config and can leave the system missing expected keys; consider invoking the existing ensure-config script (or reusing its template) instead of writing a second format.
    - echo -e "\n${YELLOW}Creating default config at $CONFIG_FILE${NC}"
    - cat > "$CONFIG_FILE" << 'ENVEOF'
    - # Continuum Tower Configuration
    - # Add API keys here for cloud provider access.
    - # Uncomment and fill in the ones you need:
    - # ANTHROPIC_API_KEY=sk-ant-...
    - # OPENAI_API_KEY=sk-proj-...
    - # DEEPSEEK_API_KEY=sk-...
    - # HF_TOKEN=hf_...
    - # Storage path for models, adapters, datasets (default: ~/.continuum)
    - # CONTINUUM_STORAGE_PATH=/path/to/storage
    - ENVEOF
    - echo -e " ${YELLOW}Edit $CONFIG_FILE to add your API keys${NC}"
    + echo -e "\n${YELLOW}No config found at $CONFIG_FILE — generating default config via ensure-config${NC}"
    + if command -v npm >/dev/null 2>&1; then
    +     (
    +         cd "$PROJECT_DIR"
    +         # Use the canonical config generator to create a full default config.env
    +         if ! npm run ensure-config >/dev/null 2>&1; then
    +             echo -e " ${YELLOW}Warning:${NC} 'npm run ensure-config' failed. You may need to run it manually after installation."
    +         fi
    +     )
    + else
    +     echo -e " ${YELLOW}Warning:${NC} npm is not available. Please run 'npm run ensure-config' from the src directory to create $CONFIG_FILE."
    + fi
        echo -e " Python ML: $VENV_DIR"
    fi
    echo ""
    echo -e " ${YELLOW}Start:${NC} cd src && npm start"
The post-install instruction cd src && npm start conflicts with the usage instructions at the top (cd continuum/src before running this script). If the user is already in src/, cd src will land in src/src and fail; consider printing either just npm start or an unambiguous path from the repo root.
    - echo -e " ${YELLOW}Start:${NC} cd src && npm start"
    + echo -e " ${YELLOW}Start:${NC} npm start"
- Guard against running as root (sudo puts config under /root)
- Node.js via nvm instead of nodesource apt repo (no sudo needed)
- Only apt-get install uses sudo elevation
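A root guard of this kind might look like the sketch below. The function name refuse_root and the messages are assumptions, and the actual check in install.sh may differ. The UID is accepted as an optional argument only so that both branches can be exercised.

```shell
#!/usr/bin/env bash

# Hypothetical guard: refuse to run as root so that config ends up under the
# invoking user's home directory rather than /root.
refuse_root() {
    local uid="${1:-$(id -u)}"
    if [ "$uid" -eq 0 ]; then
        echo "Error: do not run this installer as root or with sudo." >&2
        echo "Config would be written under /root instead of your home directory." >&2
        return 1
    fi
}

# In the installer this would run before any other step:
#   refuse_root || exit 1
if refuse_root; then
    echo "running as a regular user; continuing"
else
    echo "running as root; the installer would stop here"
fi
```

With the guard in place, only the individual apt-get calls need elevation, which matches the last bullet above.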
- cargo-features.sh: shared helper detects GPU (Metal on macOS, CUDA on Linux/WSL via nvidia-smi, CPU-only fallback). Future: ROCm for AMD.
- parallel-start.sh: uses --no-default-features + detected GPU features instead of hardcoded default=["metal"] (which fails on Linux with objc2)
- setup-rust.sh: same feature detection for per-worker builds
- Source ~/.cargo/env and nvm in parallel-start for non-interactive shells (SSH sessions don't load .bashrc)
accelerate-src links Apple's Accelerate.framework — fails on Linux. Now: metal+accelerate on macOS, cuda on Linux, cpu-only fallback. Workspace deps no longer hardcode accelerate feature.
Removes --no-default-features (caused rusqlite/regex unresolved on Linux). default=[] means cargo build alone gets CPU-only, scripts always pass --features metal,accelerate (macOS) or --features cuda (Linux).
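The feature selection described in these commits can be sketched roughly as follows. This is an illustration under assumptions, not the contents of cargo-features.sh: the function name detect_gpu_features is hypothetical, and the OS is accepted as an optional argument only for demonstration. The script prints the cargo command instead of running it.

```shell
#!/usr/bin/env bash

# Hypothetical sketch: choose cargo features from the platform and GPU.
#   macOS                    -> metal,accelerate
#   Linux/WSL + nvidia-smi   -> cuda
#   anything else            -> no features (CPU-only, since default = [])
detect_gpu_features() {
    local os="${1:-$(uname -s)}"
    case "$os" in
        Darwin)
            echo "metal,accelerate"
            ;;
        Linux)
            # WSL2 also reports Linux here; nvidia-smi must exist and succeed.
            if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
                echo "cuda"
            else
                echo ""
            fi
            ;;
        *)
            echo ""
            ;;
    esac
}

features=$(detect_gpu_features)
if [ -n "$features" ]; then
    echo "cargo build --features $features"
else
    echo "cargo build"
fi
```

Keeping the decision in one shared function means every build script passes the same flags, and a bare `cargo build` still compiles on any platform.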
models/ in .gitignore was not root-relative, so it matched ALL models/ dirs, including continuum-core/src/models/ (Rust source). Changed to /models/ (root-relative) to only match the top-level ML model downloads. As a result, the models module (352 lines, model discovery via provider APIs) was never committed and only existed locally on macOS.
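The pattern difference can be reproduced in a throwaway repository. The sketch below assumes git is installed; the file names weights.bin and mod.rs are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

command -v git >/dev/null 2>&1 || { echo "git not available; skipping demo"; exit 0; }

# Scratch repository with a top-level models/ dir and a nested source dir.
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p models continuum-core/src/models
touch models/weights.bin continuum-core/src/models/mod.rs

# Report whether git would ignore the given path.
is_ignored() {
    if git check-ignore -q "$1"; then echo "ignored"; else echo "not ignored"; fi
}

# Unanchored pattern: matches a directory named models at any depth.
echo "models/" > .gitignore
echo "models/  -> source file: $(is_ignored continuum-core/src/models/mod.rs)"

# Anchored pattern: matches only the models directory at the repository root.
echo "/models/" > .gitignore
echo "/models/ -> source file: $(is_ignored continuum-core/src/models/mod.rs)"
echo "/models/ -> downloads:   $(is_ignored models/weights.bin)"
```

With the unanchored pattern the nested source file is ignored; after anchoring, only the top-level download directory is.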
…ly target
These deps were accidentally placed under [target.'cfg(target_os = "macos")'.dependencies] instead of [dependencies]. Worked on macOS, broke on every other platform with 69 errors.
… GpuConvertPlugin
- gpu_bridge: cross-platform stub (has_bridge → false on non-macOS)
- release_unused_buffers: #[cfg(target_os = "macos")] gate
- GpuConvertPlugin: platform-conditional plugin (no-op on non-macOS)
GPU→GPU rendering still active on macOS, falls back to CPU readback on Linux
…gitignored)"
This reverts commit 7084ce5.
Mirrors cargo-features.sh logic in TypeScript — passes --features metal,accelerate (macOS), cuda (Linux+NVIDIA), or nothing (CPU-only) to cargo test. Without this, ts-rs export tests fail on non-macOS because they try to compile with default (empty) features. Also reverted accidental commit of 262 generated .ts files — these must be generated by the build pipeline, not checked in.
Fixes protobuf descriptor crash on Linux — webrtc-sys and ort-sys both statically linked protobuf, causing duplicate registration at runtime. load-dynamic loads libonnxruntime.so at runtime via dlopen, avoiding the static protobuf conflict entirely. Removed --allow-multiple-definition linker hack (no longer needed).
Changed build/ to /build/ (root-relative) so command subdirectories named "build" are tracked. Same pattern as the models/ fix.
Linux strictly enforces ulimit -v (kills process on exceed). CUDA runtime + WebRTC + ONNX need 8-16GB virtual memory. macOS doesn't enforce it anyway, so keep as soft hint there only.
- SystemPaths.ts: use $USER env var for Postgres URL fallback
- parallel-start.sh: use $USER for psql health checks
- More hardcoded references remain in seed data/tests (separate cleanup)
- ZERO "joel" references remaining (was 59 across 36 files)
- ZERO "FlashGordon" drive paths
- Database configs use $USER env var, not hardcoded username
- Seed data uses "owner" as generic primary human identifier
- Test fixtures use "test-user" / "user-owner-00001"
- Shell scripts use ${USER} for Postgres operations
- Rust binaries use /tmp as HOME fallback
- Shared code does NOT use process.env (browser-safe)
- Fixed per-package GPU features (accelerate only for continuum-core)
Runtime identity should come from onboarding, not source code.
load-dynamic broke TTS/STT on macOS (ORT panic — no libonnxruntime.dylib). macOS uses download-binaries (static linking, works with macOS linker). Linux uses load-dynamic-ort feature flag to avoid protobuf conflict with webrtc-sys.
…bient light
- Reverted gpu_convert_plugin() abstraction, which caused a Bevy animation freeze on macOS. Root cause unclear but reverting fixed it immediately.
- gpu_bridge stub and Metal cfg gates kept for Linux compilation.
- Added AmbientLight (brightness 500) for base face illumination.
- Key light moved to top-right-front, fill boosted to 15k from left.
GPU readback returns corrupted data (green/noise artifacts) before the Metal render pipeline is fully warmed up. Skip first 30 ReadbackComplete callbacks (~2s at 15fps) per slot before publishing frames to WebRTC. Zero-cost check (integer comparison only, no analysis).
Summary
- Commands.execute() transparently delegates to remote towers
- tower://5090local → mesh address long-term
Test plan
- npm start succeeds on 5090
- ./jtag ping responds from 5090
🤖 Generated with Claude Code