fix: prevent azlin gui from reusing stale local VNC tunnels #958

Merged: rysweet merged 2 commits into main from fix/gui-wrong-vm-routing on Apr 8, 2026
@rysweet commented Apr 8, 2026

Summary

  • isolate each azlin gui VNC forward onto its own local loopback port instead of reusing 5901
  • require GUI tunnel readiness to be owned by the current process tree, not any listener already bound on that port
  • add a live gadugi regression scenario that occupies local 5901 and proves azlin gui devy still connects to the correct VM
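
The first bullet can be illustrated with a minimal sketch, assuming the port is chosen by asking the OS for an ephemeral port on the loopback interface. The helper names `pick_free_loopback_port` and `local_forward_spec` are hypothetical and do not come from the azlin source, and the remote VNC port passed to the forward spec is likewise an assumption.

```rust
use std::io;
use std::net::{Ipv4Addr, SocketAddrV4, TcpListener};

/// Ask the OS for a currently free TCP port on 127.0.0.1.
/// Hypothetical helper: the real azlin function is not shown in this PR.
fn pick_free_loopback_port() -> io::Result<u16> {
    // Binding to port 0 lets the kernel choose an unused ephemeral port.
    let listener = TcpListener::bind(SocketAddrV4::new(Ipv4Addr::LOCALHOST, 0))?;
    let port = listener.local_addr()?.port();
    // The listener is dropped on return, which frees the port for the SSH
    // forward. Another process could grab it in between, which is one reason
    // readiness still has to be verified by listener ownership.
    Ok(port)
}

/// Build the argument for `ssh -L` in `bind_address:port:host:hostport` form.
/// The remote port is an assumption supplied by the caller.
fn local_forward_spec(local_port: u16, remote_port: u16) -> String {
    format!("127.0.0.1:{}:127.0.0.1:{}", local_port, remote_port)
}
```

A caller would pass the resulting spec to `ssh -L` and point the viewer at `127.0.0.1:<port>`, so two concurrent GUI sessions never share a local forward.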

Validation

  • cargo test -q -p azlin --bin azlin test_build_vnc_tunnel_args -- --nocapture
  • cargo test -q -p azlin --bin azlin test_build_vnc_viewer_args -- --nocapture
  • cargo test -q -p azlin --bin azlin test_gui_routed_ssh_command_prefix_starts_with_ssh_binary -- --nocapture
  • cargo build -q -p azlin --bin azlin

Live evidence

  • with a stale Simard bastion tunnel still alive on local port 50200, azlin gui devy opened a fresh bastion tunnel to .../virtualMachines/devy
  • the same run launched vncviewer against 127.0.0.1:<random-port>, proving the GUI path no longer reuses a hard-coded local 5901 forward

Merge readiness

QA-team evidence

  • Scenario files: tests/agentic-scenarios/pr-958-gui-tunnel-isolation.yaml
  • Validation command: cd rust && AZLIN_BIN=./target/debug/azlin gadugi-test validate -f ../tests/agentic-scenarios/pr-958-gui-tunnel-isolation.yaml
  • Validation result: passed
  • Run command: cd rust && AZURE_CONFIG_DIR=$HOME/azure-atevet17 AZLIN_BIN=./target/debug/azlin gadugi-test run -d ../tests/agentic-scenarios -s pr-958-gui-tunnel-isolation
  • Run target: live Azure env (devy) with AZURE_CONFIG_DIR=$HOME/azure-atevet17
  • Run result: passed
  • Evidence location: outputs/sessions/session_8a1d8d55-18e0-4156-854c-df49f32e3c48_2026-04-08T01-37-01-310Z.json

Documentation

  • User-facing docs impact: no
  • Updated docs: none
  • PR description links added: n/a
  • Rationale if not applicable: checked rust/crates/azlin/src/cmd_gui.rs, rust/crates/azlin/src/bastion_tunnel.rs, and tests/agentic-scenarios/pr-958-gui-tunnel-isolation.yaml; the code changes are internal tunnel-selection/readiness plumbing, the scenario is QA-only, and the existing docs already describe GUI forwarding as localhost:<local_port> rather than fixed 5901

Quality-audit

  • Cycle 1 summary: confirmed the GUI local-port readiness check could accept a foreign listener and the first scenario draft could false-positive; fixed by reusing process-tree ownership checks and binding the scenario to the current GUI PID tree
  • Cycle 2 summary: confirmed lsof exit-code handling and tool-availability robustness gaps plus scenario portability/e2e-proof gaps; fixed by treating lsof exit code 1 as empty, adding a Linux /proc listener-ownership fallback, removing /tmp assumptions, and requiring an actual TigerVNC connection line
  • Cycle 3 summary: no 2/3-confirmed findings; only single-agent nits remained
  • Additional cycles: none
  • Final clean cycle: 3
  • Fixes followed default-workflow: yes; the initial default-workflow findings were recovered into this clean branch and each confirmed follow-up issue was closed through QA-team and quality-audit loops
  • Convergence summary: after two fix cycles the gadugi scenario reran cleanly and cycle 3 produced zero confirmed medium/high correctness or security issues
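
The cycle 1 and cycle 2 fixes can be sketched roughly as follows, assuming the listener query is something like `lsof -t`, which prints one PID per line and exits with status 1 when nothing matches. Both function names are hypothetical and the real azlin code may be structured differently.

```rust
/// Interpret the result of an `lsof -t` style listener query.
/// Hypothetical helper: `exit_code` is the process exit status (None if the
/// process was killed by a signal) and `stdout` is its captured output.
fn parse_lsof_pids(exit_code: Option<i32>, stdout: &str) -> Result<Vec<u32>, String> {
    match exit_code {
        // Success: one PID per non-empty line.
        Some(0) => stdout
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .map(|line| {
                line.parse::<u32>()
                    .map_err(|e| format!("unexpected lsof output {:?}: {}", line, e))
            })
            .collect(),
        // Status 1 is treated as "no listener found", not as a hard failure.
        Some(1) => Ok(Vec::new()),
        // Anything else (other status, or killed by a signal) is an error.
        other => Err(format!("lsof failed with status {:?}", other)),
    }
}

/// The tunnel counts as ready only if some listener PID belongs to the
/// current process tree. A foreign listener on the same port is rejected.
fn listener_owned_by_tree(listener_pids: &[u32], tree_pids: &[u32]) -> bool {
    listener_pids.iter().any(|pid| tree_pids.contains(pid))
}
```

Keeping the interpretation separate from the process spawn makes the exit-code handling testable without `lsof` installed, which matters given the tool-availability gap noted in cycle 2.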

CI

  • Checks command: gh pr checks 958
  • Result: all green
  • Skipped checks: Security Scanning/OSSF Scorecard (workflow skipped by repo configuration)
  • Flaky reruns performed: none
  • Real failures fixed: none

Scope

  • Changed files reviewed: gh pr diff 958 --name-only -> rust/crates/azlin/src/bastion_tunnel.rs, rust/crates/azlin/src/cmd_gui.rs, tests/agentic-scenarios/pr-958-gui-tunnel-isolation.yaml
  • Unrelated changes: none

Verdict

  • Merge-ready: yes
  • Remaining blockers: none

@rysweet changed the base branch from fix/chromium-snap-session-scope to main on April 8, 2026 01:44
Ryan Sweet and others added 2 commits April 7, 2026 18:44
Avoid reusing the fixed local port 5901 for GUI sessions.
Each azlin gui run now picks its own free local forward port, waits for the SSH tunnel to bind, and launches vncviewer against that port.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use process-tree listener ownership checks for GUI VNC tunnels, add Linux /proc ownership fallback, and cover the regression with a live gadugi scenario.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
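
One way a Linux /proc ownership fallback like the one named in the second commit could work is to read `/proc/net/tcp`, find the socket inode listening on the local port, and then look for a `socket:[inode]` link under `/proc/<pid>/fd` for PIDs in the GUI process tree. The sketch below covers only the parsing step. The function names are hypothetical and it is not taken from `bastion_tunnel.rs`.

```rust
/// Return the socket inodes of IPv4 TCP sockets in LISTEN state on `port`,
/// given the text of /proc/net/tcp. Hypothetical helper for illustration.
fn listening_inodes(proc_net_tcp: &str, port: u16) -> Vec<u64> {
    proc_net_tcp
        .lines()
        .skip(1) // first line is the column header
        .filter_map(|line| {
            let fields: Vec<&str> = line.split_whitespace().collect();
            // Columns: sl, local_address, rem_address, st, tx:rx queue,
            // tr:tm->when, retrnsmt, uid, timeout, inode, ...
            if fields.len() < 10 || fields[3] != "0A" {
                return None; // 0A is the LISTEN state
            }
            // local_address is HEXIP:HEXPORT
            let (_, port_hex) = fields[1].rsplit_once(':')?;
            if u16::from_str_radix(port_hex, 16).ok()? != port {
                return None;
            }
            fields[9].parse::<u64>().ok()
        })
        .collect()
}

/// The symlink target that /proc/<pid>/fd/<n> shows for a socket inode.
fn socket_link_target(inode: u64) -> String {
    format!("socket:[{}]", inode)
}
```

A full fallback would compare `socket_link_target(inode)` against the results of `std::fs::read_link` on each entry of `/proc/<pid>/fd` for every PID in the tree. `/proc/net/tcp6` uses the same column layout for IPv6 listeners.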
@rysweet force-pushed the fix/gui-wrong-vm-routing branch from 960d7bd to acab50a on April 8, 2026 01:44
@rysweet merged commit 2d6e991 into main on Apr 8, 2026
11 checks passed