feat(aios-runtime): add loop detection middleware#5

Merged
broomva merged 1 commit into main from bro-423-loop-detection on Apr 3, 2026
Conversation

@broomva
Owner

@broomva broomva commented Apr 3, 2026

Summary

  • add loop detection middleware on the canonical aiOS runtime path
  • introduce middleware-installed tool-call guards so repeated tool calls can be warned or blocked before execution
  • persist loop detection events to the journal and add end-to-end tests for normal, warning, and hard-stop flows

Validation

  • cargo fmt --all
  • cargo clippy --workspace -- -D warnings
  • cargo test --workspace

Linear: BRO-423

Summary by CodeRabbit

Release Notes

  • New Features

    • Implemented loop detection middleware to prevent repeated tool calls; warns after configured repetitions and enforces hard stops to break execution patterns.
  • Documentation

    • Updated architecture and status documentation to reflect loop detection implementation in the runtime.
  • Tests

    • Added tests covering normal tool flows and repeated tool-call detection with warning/blocking behavior.

@linear

linear bot commented Apr 3, 2026

@coderabbitai

coderabbitai bot commented Apr 3, 2026

📝 Walkthrough

The PR adds loop detection middleware to the aiOS runtime. It introduces a ToolCallGuard trait and a LoopDetectionMiddleware implementation that tracks tool-call signatures within a sliding window. Guards are evaluated during turn execution, and a repeated call that exceeds the configured thresholds is either warned about or blocked.
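To make the sliding-window idea concrete, here is a minimal sketch that counts how often a signature recurs in a bounded window and maps the count to an allow, warn, or block outcome. The names `Decision` and `LoopWindow` are hypothetical, and the threshold semantics (the current call counts toward the repetition total, and blocked calls are still recorded) are assumptions; the actual `ToolCallGuard` signature and `LoopDetectionMiddleware` internals are not shown on this page.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the PR's `ToolCallGuardDecision`.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Warn { repetitions: usize },
    Block { repetitions: usize },
}

/// Hypothetical window of recent tool-call signatures for one session.
pub struct LoopWindow {
    warning_threshold: usize,
    hard_stop_limit: usize,
    window_size: usize,
    history: VecDeque<String>,
}

impl LoopWindow {
    pub fn new(warning_threshold: usize, hard_stop_limit: usize, window_size: usize) -> Self {
        Self {
            warning_threshold,
            hard_stop_limit,
            window_size,
            history: VecDeque::new(),
        }
    }

    /// Records `signature` and classifies the call by how many times it now
    /// appears in the window, the current call included (an assumption).
    pub fn observe(&mut self, signature: &str) -> Decision {
        let earlier = self.history.iter().filter(|s| s.as_str() == signature).count();
        let repetitions = earlier + 1;
        self.history.push_back(signature.to_string());
        // Evict the oldest entries so the window stays bounded.
        while self.history.len() > self.window_size {
            self.history.pop_front();
        }
        if repetitions >= self.hard_stop_limit {
            Decision::Block { repetitions }
        } else if repetitions >= self.warning_threshold {
            Decision::Warn { repetitions }
        } else {
            Decision::Allow
        }
    }
}
```

Under these assumptions, with the default values quoted later in the review (3, 5, 20), the third identical call would warn and the fifth would block.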

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Runtime Core Implementation (crates/aios-runtime/src/lib.rs) | Added ToolCallGuard trait with ToolCallGuardDecision enum (Allow/Warn/Block), LoopDetectionConfig, and LoopDetectionMiddleware. Extended TurnContext with tool_call_guards field. Integrated guard evaluation into tool-call execution flow via evaluate_tool_call_guards, persist_loop_guard_event, and emit_guard_message helpers. Added signature hashing utilities using Blake3. |
| Runtime Dependencies & Documentation (crates/aios-runtime/Cargo.toml, crates/aios-runtime/README.md) | Added blake3 = "1.8" dependency. Updated README to document per-turn tool-call guard evaluation and loop detection middleware behavior. |
| Kernel Tests (crates/aios-kernel/src/lib.rs) | Added two async Tokio tests exercising LoopDetectionMiddleware: loop_detection_allows_normal_tool_flow (distinct tool calls) and loop_detection_warns_then_hard_stops_repeated_tool_calls (repeated calls with warning/hard-stop assertions). |
| Architecture & Status Documentation (docs/ARCHITECTURE.md, docs/STATUS.md) | Updated kernel tick lifecycle to include tool-call guard phase before gate. Added loop detection status entry with implementation details. |
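The signature hashing mentioned in the table can be pictured roughly as follows. This is only a sketch: the PR uses Blake3, whereas the example swaps in the standard library's `DefaultHasher` so that it compiles without external crates. The function name `tool_call_signature` and the normalization rule (trimmed tool name, arguments in sorted key order) are assumptions, not the PR's actual utilities.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Hypothetical signature helper. `BTreeMap` iterates in sorted key order,
/// so two calls with the same arguments hash identically regardless of the
/// order in which the arguments were supplied.
/// NOTE: `DefaultHasher` is a stand-in for Blake3 here; it is not
/// cryptographic and its output is not guaranteed stable across Rust releases.
pub fn tool_call_signature(tool: &str, args: &BTreeMap<String, String>) -> String {
    let mut hasher = DefaultHasher::new();
    tool.trim().hash(&mut hasher);
    for (key, value) in args {
        key.hash(&mut hasher);
        value.hash(&mut hasher);
    }
    format!("{:016x}", hasher.finish())
}
```

Whatever hash is used, the point of normalizing before hashing is that semantically identical calls collapse to one signature, which is what lets the window count them as repetitions.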

Sequence Diagram(s)

sequenceDiagram
    participant Model
    participant Runtime as Runtime
    participant Guard as Loop Detection Guard
    participant Decision
    participant ToolExec as Tool Execution

    Model->>Runtime: Emits ToolCall
    Runtime->>Runtime: Normalize & hash<br/>tool-call signature
    Runtime->>Guard: on_tool_call(context, call)
    Guard->>Guard: Check sliding window<br/>for repetitions
    alt Repeated call exceeds threshold
        Guard->>Decision: Block decision
    else Repeated call in warning threshold
        Guard->>Decision: Warn decision
    else New/allowed call
        Guard->>Decision: Allow decision
    end
    Decision-->>Runtime: ToolCallGuardDecision
    Runtime->>Runtime: Persist custom event<br/>(warning/hard-stop)
    alt Decision is Block
        Runtime->>Runtime: Force text-only<br/>response (skip tool call)
    else Decision is Allow/Warn
        Runtime->>ToolExec: Proceed to policy<br/>evaluation & execution
    end
    ToolExec-->>Model: Tool result or<br/>guard message
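The flow in the diagram can be sketched in code as below. All names here (`GuardDecision`, `Guard`, `Outcome`, `CountGuard`, `run_tool_call`) and the journal event strings are hypothetical; the sketch assumes guards run in order, that a warning is journaled but lets the call proceed, and that the first block decision skips execution. The PR's real helpers (evaluate_tool_call_guards, persist_loop_guard_event, emit_guard_message) may differ.

```rust
/// Hypothetical, simplified guard decision.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GuardDecision {
    Allow,
    Warn(String),
    Block(String),
}

/// Hypothetical guard hook, called once per tool call.
pub trait Guard {
    fn on_tool_call(&mut self, tool: &str) -> GuardDecision;
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Executed { warnings: Vec<String> },
    Skipped { reason: String },
}

/// Evaluates every guard before execution; `journal` stands in for the event log.
pub fn run_tool_call(
    guards: &mut [Box<dyn Guard>],
    tool: &str,
    journal: &mut Vec<String>,
) -> Outcome {
    let mut warnings = Vec::new();
    for guard in guards.iter_mut() {
        match guard.on_tool_call(tool) {
            GuardDecision::Allow => {}
            GuardDecision::Warn(message) => {
                journal.push(format!("guard.warning: {}", message));
                warnings.push(message);
            }
            GuardDecision::Block(reason) => {
                // A block short-circuits: the tool is never executed.
                journal.push(format!("guard.hard_stop: {}", reason));
                return Outcome::Skipped { reason };
            }
        }
    }
    Outcome::Executed { warnings }
}

/// Toy guard for illustration: counts calls regardless of their arguments.
pub struct CountGuard {
    calls: usize,
    warn_at: usize,
    block_at: usize,
}

impl CountGuard {
    pub fn new(warn_at: usize, block_at: usize) -> Self {
        Self { calls: 0, warn_at, block_at }
    }
}

impl Guard for CountGuard {
    fn on_tool_call(&mut self, tool: &str) -> GuardDecision {
        self.calls += 1;
        let message = format!("{} called {} times", tool, self.calls);
        if self.calls >= self.block_at {
            GuardDecision::Block(message)
        } else if self.calls >= self.warn_at {
            GuardDecision::Warn(message)
        } else {
            GuardDecision::Allow
        }
    }
}
```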

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • aiOS#4: Builds on earlier middleware work with modifications to TurnContext/turn middleware APIs and kernel tests, sharing the same architectural foundation for guard integration.

Poem

🐰 A guard in the loop, watching calls go round,
When patterns repeat, the detector's found!
Blake3 hashes leap, signatures dance free,
Warn, then stop—loop detection's key! 🔐✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.74% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat(aios-runtime): add loop detection middleware' directly and concisely describes the main change: introducing loop detection middleware to the aios-runtime crate, which aligns with all the substantive changes in the PR. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/aios-runtime/src/lib.rs (1)

61-95: 🛠️ Refactor suggestion | 🟠 Major

Decouple the generic guard hook from loop-detection-specific payloads.

ToolCallGuardDecision bakes in loop-only fields (repetitions, signature), and persist_loop_guard_event() hardcodes loop_detection.* for every guard decision. A second guard type cannot use this API without inventing bogus loop metadata or emitting mislabeled journal entries. As per coding guidelines, "Keep public APIs version-aware and backward compatible unless explicitly changing contracts".

Also applies to: 1513-1561
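One possible shape for the suggested decoupling is sketched below. It is illustrative only: `GuardMetadata`, the reshaped `ToolCallGuardDecision`, the helper `journal_event_kind`, and the event-name suffixes are all assumptions, since the page only states that the current code hardcodes a `loop_detection.*` prefix.

```rust
/// Hypothetical guard-specific payloads; each guard type gets its own variant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum GuardMetadata {
    LoopDetection { repetitions: usize, signature: String },
    Other { guard: String, detail: String },
}

/// Hypothetical reshaping of the decision type with no loop-only fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToolCallGuardDecision {
    Allow,
    Warn { message: String, metadata: Option<GuardMetadata> },
    Block { message: String, metadata: Option<GuardMetadata> },
}

/// Picks the journal event kind from the metadata variant, so only loop
/// decisions are labeled `loop_detection.*`. Returns `None` for `Allow`.
pub fn journal_event_kind(decision: &ToolCallGuardDecision) -> Option<String> {
    let (severity, metadata) = match decision {
        ToolCallGuardDecision::Allow => return None,
        ToolCallGuardDecision::Warn { metadata, .. } => ("warning", metadata),
        ToolCallGuardDecision::Block { metadata, .. } => ("hard_stop", metadata),
    };
    let namespace = match metadata {
        Some(GuardMetadata::LoopDetection { .. }) => "loop_detection".to_string(),
        Some(GuardMetadata::Other { guard, .. }) => format!("guard.{}", guard),
        None => "guard".to_string(),
    };
    Some(format!("{}.{}", namespace, severity))
}
```

With a layout like this, a second guard type would not need to invent loop metadata, and its journal entries would land under its own namespace.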

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aios-runtime/src/lib.rs` around lines 61 - 95, ToolCallGuardDecision
currently embeds loop-detection-specific fields (repetitions, signature) and the
ToolCallGuard API plus journal emission (persist_loop_guard_event) assumes loop
metadata for every guard; refactor by removing loop-only fields from the general
ToolCallGuardDecision and replace them with a generic, extensible payload (e.g.,
an optional metadata map or a GuardMetadata enum) so non-loop guards can provide
their own structured info, add a concrete LoopDetection variant or struct for
loop-related data used by the loop-detection guard, and update
ToolCallGuard::on_tool_call callers and the journal emission site
(persist_loop_guard_event usage) to detect the LoopDetection variant and only
persist loop_* entries for that case while other metadata is recorded under a
generic guard metadata path; update types referenced in this change:
ToolCallGuardDecision, ToolCallGuard, and the journal persistence call sites to
handle the new metadata shape.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e051d650-2938-4417-9959-9e24d2eafffa

📥 Commits

Reviewing files that changed from the base of the PR and between c079c2e and 0240190.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • crates/aios-kernel/src/lib.rs
  • crates/aios-runtime/Cargo.toml
  • crates/aios-runtime/README.md
  • crates/aios-runtime/src/lib.rs
  • docs/ARCHITECTURE.md
  • docs/STATUS.md

Comment on lines +137 to +152
#[derive(Debug, Clone)]
pub struct LoopDetectionConfig {
    pub warning_threshold: usize,
    pub hard_stop_limit: usize,
    pub window_size: usize,
}

impl Default for LoopDetectionConfig {
    fn default() -> Self {
        Self {
            warning_threshold: 3,
            hard_stop_limit: 5,
            window_size: 20,
        }
    }
}

⚠️ Potential issue | 🟠 Major

Validate LoopDetectionConfig invariants at construction time.

warning_threshold == 0, warning_threshold > hard_stop_limit, or either threshold being greater than window_size + 1 all create broken safety behavior from a public config surface. Reject invalid combinations in new/a builder and cover the boundaries with a couple of tests. As per coding guidelines, "do not silently widen capabilities or safety boundaries".

Also applies to: 160-166
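A validating constructor along the lines requested could look like the sketch below. The private fields, the `ConfigError` type, and its variant names are assumptions introduced for illustration; the default values and the three invariants are the ones quoted in this comment.

```rust
/// Sketch of the config with private fields, so every instance is built
/// through `new` or `Default` (a change from the PR's public fields).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LoopDetectionConfig {
    warning_threshold: usize,
    hard_stop_limit: usize,
    window_size: usize,
}

/// Hypothetical error type naming the violated invariant.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    ZeroWarningThreshold,
    WarningAboveHardStop,
    ThresholdExceedsWindow,
}

impl LoopDetectionConfig {
    pub fn new(
        warning_threshold: usize,
        hard_stop_limit: usize,
        window_size: usize,
    ) -> Result<Self, ConfigError> {
        if warning_threshold == 0 {
            return Err(ConfigError::ZeroWarningThreshold);
        }
        if warning_threshold > hard_stop_limit {
            return Err(ConfigError::WarningAboveHardStop);
        }
        // hard_stop_limit >= warning_threshold at this point, so bounding the
        // hard stop by window_size + 1 bounds both thresholds.
        if hard_stop_limit > window_size.saturating_add(1) {
            return Err(ConfigError::ThresholdExceedsWindow);
        }
        Ok(Self { warning_threshold, hard_stop_limit, window_size })
    }
}

impl Default for LoopDetectionConfig {
    fn default() -> Self {
        // Values from the PR; they satisfy all three invariants.
        Self { warning_threshold: 3, hard_stop_limit: 5, window_size: 20 }
    }
}
```

Returning a `Result` rather than panicking is one of the two options the comment leaves open; which fits better depends on the crate's conventions.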

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aios-runtime/src/lib.rs` around lines 137 - 152, LoopDetectionConfig
currently allows invalid combinations via Default and public fields; add a
constructor (e.g., LoopDetectionConfig::new or a builder) that validates
invariants and returns a Result/Err on invalid input, and update Default to call
that constructor or return a guaranteed-valid instance: ensure warning_threshold
> 0, warning_threshold <= hard_stop_limit, and both warning_threshold and
hard_stop_limit <= window_size + 1; reject or panic on invalid combos per crate
conventions and add unit tests covering boundary cases (0, warning==hard_stop,
warning>hard_stop, thresholds == window_size+1, and > window_size+1) to prevent
silently widening safety boundaries, referencing struct LoopDetectionConfig and
its Default impl.

Comment on lines +154 to +158
#[derive(Clone)]
pub struct LoopDetectionMiddleware {
    config: LoopDetectionConfig,
    history_by_session: Arc<Mutex<HashMap<String, VecDeque<String>>>>,
}

⚠️ Potential issue | 🟠 Major

Persist the repetition window outside process memory.

history_by_session only lives in RAM, so warnings/hard-stops reset after a restart or a fresh runtime attaching to the same session. That makes loop decisions non-replayable from the journal and weakens the safety boundary during recovery. As per coding guidelines, "Avoid hidden mutable state outside workspace files and event logs" and "do not silently widen capabilities or safety boundaries".

Also applies to: 168-191
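One way to make the window replayable, sketched under assumptions, is to treat the journal as the source of truth and rebuild the in-memory map from it on startup. `ToolCallRecord` and `rebuild_history` are hypothetical names, and the sketch assumes the journal already stores one record per guarded tool call with its session id and signature, which the page does not confirm.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical journal record for one guarded tool call.
pub struct ToolCallRecord {
    pub session_id: String,
    pub signature: String,
}

/// Replays journal records in order and returns, per session, the last
/// `window_size` signatures. The result has the same shape as the
/// `history_by_session` map shown above.
pub fn rebuild_history(
    journal: &[ToolCallRecord],
    window_size: usize,
) -> HashMap<String, VecDeque<String>> {
    let mut history: HashMap<String, VecDeque<String>> = HashMap::new();
    for record in journal {
        let window = history.entry(record.session_id.clone()).or_default();
        window.push_back(record.signature.clone());
        // Keep only the most recent `window_size` entries.
        while window.len() > window_size {
            window.pop_front();
        }
    }
    history
}
```

Because the rebuilt state depends only on journal contents, a restarted or newly attached runtime would reach the same warn and hard-stop decisions as the original process.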

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/aios-runtime/src/lib.rs` around lines 154 - 158, The
LoopDetectionMiddleware currently stores session repetition state in memory via
the history_by_session Arc<Mutex<HashMap<String, VecDeque<String>>>> field,
which makes loop decisions non-replayable; change the implementation so the
repetition window is persisted to durable storage (e.g., workspace files or the
event journal) instead of kept only in RAM: replace or wrap history_by_session
with a storage-backed provider used by the LoopDetectionMiddleware constructor
and all places that push/pop history (reference LoopDetectionMiddleware,
history_by_session, LoopDetectionConfig, and any methods that read/update the
VecDeque) so the middleware loads history on init and appends updates to the
durable log/file whenever the window changes, ensuring behavior is replayable
after restarts and consistent across runtime attachments.

@broomva broomva merged commit bc854c1 into main Apr 3, 2026
3 checks passed