
[fix] I/P: Fix connection failure during some remote builds#201

Open
hanno-becker wants to merge 9 commits into main from ip_build_fix
Conversation


@hanno-becker hanno-becker commented Mar 29, 2026

Symptom: Sessions with multiple `theories` blocks in ROOT failed when built remotely via I/P, with repeated "Connection refused: failed to open socket 127.0.0.1:<port>" errors. The build completed all theory loading but exited with rc=1.

Issue: When proxying a build to a remote machine, the proxy rewrites bash_process_address in PIDE messages so that the remote ML process connects to a remote Bash.Server instead of the local one. The build_session message carries a separate Options object per theory group, each containing bash_process_address. The rewrite only replaced the first occurrence (count=1), leaving subsequent groups with the unrewritten local address.

Fix: Remove the count limit from replace() for both bash_process_address and bash_process_password, so all theory groups are rewritten.
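A minimal Python sketch of the failure mode and the fix (message layout, option names, and addresses here are illustrative, not the actual I/P wire format):

```python
# Hypothetical sketch: a build_session message carries one serialized
# Options blob per theory group, each containing bash_process_address.
LOCAL = "bash_process_address=127.0.0.1:45001"
REMOTE = "bash_process_address=10.0.0.7:45001"

# Three theory groups, each with its own copy of the local address.
msg = f"build_session [{LOCAL}] [{LOCAL}] [{LOCAL}]"

# Buggy rewrite: count=1 replaces only the first theory group's address.
buggy = msg.replace(LOCAL, REMOTE, 1)
assert buggy.count(LOCAL) == 2  # later groups still point at localhost

# Fixed rewrite: no count limit, so every theory group is rewritten.
fixed = msg.replace(LOCAL, REMOTE)
assert fixed.count(LOCAL) == 0
```

With the count limit removed, the remote ML process sees the remote Bash.Server address in every theory group, not just the first.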

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Source file verification was skipped entirely when the variable was
missing. It now defaults to the resolved --dir (or pwd), shows a warning,
and proceeds with source status reporting.
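The fallback described above could look like this sketch (flag handling and variable names are assumed, not the actual implementation):

```python
import os
import sys

def resolve_source_dir(env_value, dir_opt):
    """Hypothetical helper: fall back to the resolved --dir, or the
    working directory, when the source variable is unset, instead of
    silently skipping source verification."""
    if env_value:
        return env_value
    fallback = os.path.abspath(dir_opt) if dir_opt else os.getcwd()
    print(f"Warning: source variable unset; defaulting to {fallback}",
          file=sys.stderr)
    return fallback

# Unset variable: warn and use the working directory.
assert resolve_source_dir(None, None) == os.getcwd()
# Variable set: use it directly, no warning.
assert resolve_source_dir("/src/proj", ".") == "/src/proj"
```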

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
This allows an agent with an I/R MCP connection to spawn
multiple subagents working concurrently in separate REPLs.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
MCP clients can now use \<Rightarrow> and ⇒ interchangeably in
pattern, old_str, new_str, and content parameters. Whitespace
differences (extra spaces, tabs) are also tolerated during matching.
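The matching described above can be sketched as a normalization pass (the symbol table and function names are assumed for illustration; the real implementation may differ):

```python
import re

# Hypothetical symbol table: map Isabelle ASCII symbol escapes to their
# Unicode renderings so either spelling matches the other.
SYMBOLS = {r"\<Rightarrow>": "⇒"}

def normalize(s: str) -> str:
    """Canonicalize symbols and collapse whitespace before matching."""
    for ascii_form, unicode_form in SYMBOLS.items():
        s = s.replace(ascii_form, unicode_form)
    # Tolerate extra spaces and tabs between tokens.
    return re.sub(r"\s+", " ", s).strip()

# Both spellings, with differing whitespace, compare equal after
# normalization.
assert normalize(r"nat \<Rightarrow>  nat") == normalize("nat ⇒ nat")
```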

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Without timing data from REPL steps, agents have a hard time optimizing
proofs for performance.

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
LLM-based MCP clients of I/R are sensitive to the amount of information
they receive, but the I/R output contains a lot of noise, including:

- Every evaluation is suffixed with `val it = (): unit`
- 'Duplicate simpset' warnings

This commit adds an extensible noise filter to I/R to increase the
information density of the response to the LLM.
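An extensible filter of this kind can be sketched as follows (the pattern list and exact warning text are assumptions, not the actual I/R filter):

```python
import re

# Hypothetical noise patterns; extend this list to suppress further
# known-noise lines.
NOISE_PATTERNS = [
    re.compile(r"^val it = \(\): unit$"),   # unit result of every eval
    re.compile(r"Duplicate simpset"),        # assumed warning wording
]

def filter_noise(output: str) -> str:
    """Drop lines matching any noise pattern, keeping only signal."""
    kept = [line for line in output.splitlines()
            if not any(p.search(line.strip()) for p in NOISE_PATTERNS)]
    return "\n".join(kept)

raw = "lemma proved\nval it = (): unit"
assert filter_noise(raw) == "lemma proved"
```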

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Timeout was a single global Synchronized.var shared across all REPLs,
so one MCP client changing it affected every other concurrent session.
Move it to a per-REPL field: init/init_from_document default to 10s,
fork inherits from parent, and Ir.timeout now takes a REPL ID.
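The ownership change can be illustrated with a small sketch (class and field names are hypothetical; the real code uses Isabelle/ML, not Python):

```python
DEFAULT_TIMEOUT_S = 10.0  # init/init_from_document default

class Repl:
    """Hypothetical per-REPL state: each session owns its timeout
    instead of sharing one global mutable value."""
    def __init__(self, timeout_s: float = DEFAULT_TIMEOUT_S):
        self.timeout_s = timeout_s

    def fork(self) -> "Repl":
        # A forked REPL inherits the parent's timeout.
        return Repl(self.timeout_s)

a, b = Repl(), Repl()
a.timeout_s = 60.0          # one client tunes its own session...
assert b.timeout_s == 10.0  # ...without affecting concurrent sessions
assert a.fork().timeout_s == 60.0  # children inherit from the parent
```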

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
Symptom: when multiple MCP clients maintain persistent TCP connections to
I/R (e.g. repl.py's default pool of 5 + 1 console = 6 connections),
Isabelle/jEdit becomes sluggish or unresponsive — PIDE proof checking
and interactive editing stall even though the REPL connections are idle.

Analysis: I/R's TCP server used Future.fork (tcp_handler.ML) and
Future.forks (ml_repl.ML) to spawn connection handler and accept loop
threads.  Isabelle's Future system is a fixed-size worker thread pool
(typically max_threads = 8), not raw OS threads.  Each Future.fork body
that blocks on socket I/O permanently occupies a worker slot — the
worker cannot return to the pool to pick up other tasks.  With 6
persistent connections + 1 accept loop, 7 of 8 worker threads are
parked on blocking reads or Synchronized.guarded_access, leaving a
single worker for all PIDE tasks and I/R eval futures combined.

The eval futures themselves were already correctly prioritized at
ir_pri = ~1 (below PIDE's >= 0), but this was moot when no workers
were available to execute them.

Fix: replace Future.fork/Future.forks with Isabelle_Thread.fork for all
I/O-bound threads (TCP connection handlers and accept loop).
Isabelle_Thread creates standalone OS threads outside the Future worker
pool — the standard Isabelle idiom for I/O threads (cf.
Message_Channel.make in message_channel.ML).  Connection threads can
block on socket I/O indefinitely without reducing the pool's capacity.
The full worker pool remains available for PIDE and I/R eval futures.

Output routing is unaffected: connection handlers use direct
Message_Channel.message calls (not Private_Output), while eval futures
still run in the worker pool with correct Future group ancestry
(ir_group → conn_group → cmd_group) for find_connection_message().

Added prominent warnings at is_server_context(), connection_serve(), and
eval_in_group() documenting that Private_Output wrappers (writeln,
warning, etc.) only route correctly on Future worker threads — code on
connection Isabelle_Threads must use the direct `message` function.
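The starvation mechanism and the fix can be reproduced in a Python analogue (this is an illustration of the pool-exhaustion effect, not Isabelle/ML code; `ThreadPoolExecutor` plays the role of the Future worker pool and `threading.Thread` the role of `Isabelle_Thread.fork`):

```python
from concurrent.futures import ThreadPoolExecutor
import threading
import time

release = threading.Event()

# Buggy analogue: blocking "connection handlers" submitted to a fixed
# pool park both worker slots, so an unrelated task cannot run.
pool = ThreadPoolExecutor(max_workers=2)
for _ in range(2):
    pool.submit(release.wait)          # parked on "socket I/O"
starved = pool.submit(lambda: "pide task")
time.sleep(0.2)
assert not starved.done()              # starved: no free worker

release.set()                          # unblock the handlers
assert starved.result(timeout=5) == "pide task"
pool.shutdown()

# Fixed analogue: blocking handlers run on standalone OS threads,
# leaving the whole pool available for real work.
release2 = threading.Event()
pool2 = ThreadPoolExecutor(max_workers=2)
handlers = [threading.Thread(target=release2.wait, daemon=True)
            for _ in range(2)]
for t in handlers:
    t.start()
assert pool2.submit(lambda: "pide task").result(timeout=5) == "pide task"
release2.set()
pool2.shutdown()
```

The same arithmetic as in the report applies: with a pool of 8 and 7 permanently blocked bodies, a single worker is left for all PIDE and eval tasks combined.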

Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>