perf: avoid cloning global middleware list on every request #1350

Open
sansyrox wants to merge 1 commit into main from perf/avoid-middleware-list-clone
Conversation

@sansyrox
Member

@sansyrox sansyrox commented Mar 27, 2026

Summary

  • get_global_middlewares() returned Vec<FunctionInfo> via .to_vec(), cloning every FunctionInfo (each requiring GIL acquisition for Py<PyAny> refcount bumps). Called twice per request (before + after middleware).
  • Now returns RwLockReadGuard<Vec<FunctionInfo>> and the server iterates over borrowed references, chaining route-specific middleware via Iterator::chain.
  • Eliminates 2×N FunctionInfo clones and GIL acquisitions per request where N is the number of global middlewares.
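The borrowing pattern described in the summary can be sketched as follows. This is a minimal, self-contained illustration, not the code from the PR: `FunctionInfo` here is a hypothetical stand-in (the real type wraps a `Py<PyAny>`), and `middleware_order` is an invented helper. The stand-in deliberately does not implement `Clone`, so the compiler confirms that no `FunctionInfo` is copied.

```rust
use std::sync::RwLock;

// Hypothetical stand-in for Robyn's FunctionInfo. It has no Clone impl,
// so any accidental clone of a middleware entry would fail to compile.
#[derive(Debug)]
struct FunctionInfo {
    name: String,
}

/// Visits the global middlewares by reference, then the optional
/// route-specific one, and records the order in which they were seen.
fn middleware_order(
    globals: &RwLock<Vec<FunctionInfo>>,
    route_specific: Option<&FunctionInfo>,
) -> Vec<String> {
    // The read guard borrows the stored Vec; the list itself is not copied.
    let guard = globals.read().unwrap();
    let order: Vec<String> = guard
        .iter()
        // Option<&T> implements IntoIterator<Item = &T>, so it chains directly.
        .chain(route_specific)
        // Only the name is copied, to build the owned result for this demo.
        .map(|f| f.name.clone())
        .collect();
    order
}

fn main() {
    let globals = RwLock::new(vec![
        FunctionInfo { name: "auth".into() },
        FunctionInfo { name: "logging".into() },
    ]);
    let route = FunctionInfo { name: "route_before".into() };

    println!("{:?}", middleware_order(&globals, Some(&route)));
    println!("{:?}", middleware_order(&globals, None));
}
```

In this synchronous sketch the guard is released when `middleware_order` returns. The sketch does not model what happens when the loop body awaits, which is the concern raised later in the review.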

Test plan

  • All existing integration tests pass (middleware behavior is unchanged, only allocation pattern differs)
  • check_response helper validates global_after header is present on every response, confirming after-middleware still runs

Made with Cursor

Summary by CodeRabbit

Release Notes

  • Performance Improvements
    • Optimized middleware execution to reduce memory overhead and improve request processing efficiency.

get_global_middlewares() was returning Vec<FunctionInfo> via .to_vec(),
which cloned every FunctionInfo (each requiring GIL acquisition for
Py<PyAny> refcount bumps). This happened twice per request (before +
after middleware).

Return a RwLockReadGuard instead and iterate over borrowed references,
chaining route-specific middleware via Iterator::chain. This eliminates
2*N FunctionInfo clones and GIL acquisitions per request where N is the
number of global middlewares.

Made-with: Cursor
@vercel

vercel bot commented Mar 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
|---|---|---|---|
| robyn | Ready | Preview, Comment | Mar 27, 2026 0:05am |

@coderabbitai

coderabbitai bot commented Mar 27, 2026

Walkthrough

The pull request optimizes middleware handling by eliminating unnecessary vector copies in the middleware router and refactoring middleware execution loops to avoid building combined mutable vectors. Instead, route-specific middleware is wrapped as Option and composed using iterator chaining.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Middleware Router Optimization: src/routers/middleware_router.rs | Changed get_global_middlewares return type from Vec<FunctionInfo> to std::sync::RwLockReadGuard<'_, Vec<FunctionInfo>>, eliminating the .to_vec() copy and exposing the underlying read-locked storage directly. |
| Middleware Execution Loop Refactoring: src/server.rs | Refactored before/after middleware execution loops to avoid building combined mutable vectors. Route-specific middleware is now wrapped as Option and composed with global middleware using iter().chain(), with direct reference passing to execution functions. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hopping through the middlewares so fine,
No copies made, no vectors combined,
With chains of Options, we optimize the way,
Guard the data, but don't delay!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ❓ Inconclusive | The description provides a clear technical summary of the problem, solution, and performance improvement. However, it's missing the required PR template structure including the fixed issue reference, PR checklist items, and pre-commit verification. | Complete the PR template by adding the issue reference (## Description - fixes #), filling out the PR Checklist items with status, and confirming pre-commit hooks were run. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly and specifically describes the main performance optimization: avoiding cloning the global middleware list on every request. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/server.rs (1)

506-537: ⚠️ Potential issue | 🔴 Critical

Read lock held across .await — deadlock risk persists here.

The before_middlewares guard (lines 506-507) remains held throughout the for loop, including the .await on line 519. This is the downstream manifestation of the issue raised in middleware_router.rs. The same applies to after_middlewares at lines 583-617.

If the fix in middleware_router.rs is to return a cloned Vec, this code will work correctly as-is since there would be no guard to hold.
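The hazard can be observed without any async runtime. The following is a minimal sketch using only the standard library: while a read guard is alive, `try_write` on the same `RwLock` fails, and it succeeds once the guard is dropped. The function names and the `Vec<String>` payload are illustrative assumptions, not code from server.rs.

```rust
use std::sync::RwLock;

/// Returns true if a writer is locked out while a read guard is alive.
/// This mirrors the state of a loop that iterates over the guard and
/// suspends at an .await in its body.
fn writer_blocked_while_guard_held(list: &RwLock<Vec<String>>) -> bool {
    let _guard = list.read().unwrap(); // stays alive until the function returns
    let blocked = list.try_write().is_err();
    blocked
}

/// Returns true if a writer can take the lock once the guard is released.
fn writer_free_after_guard_dropped(list: &RwLock<Vec<String>>) -> bool {
    let guard = list.read().unwrap();
    let _len = guard.len(); // use the guard briefly
    drop(guard); // release before doing anything slow
    let acquired = list.try_write().is_ok();
    acquired
}

fn main() {
    let list = RwLock::new(vec!["auth".to_string(), "logging".to_string()]);
    println!(
        "writer blocked while guard held: {}",
        writer_blocked_while_guard_held(&list)
    );
    println!(
        "writer free after guard dropped: {}",
        writer_free_after_guard_dropped(&list)
    );
}
```

A blocking `write()` call in place of `try_write()` would park the calling OS thread until every read guard is gone, which is the behavior the review warns about when the holder is a suspended task.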

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/server.rs` around lines 506 - 537, The read lock returned by
middleware_router.get_global_middlewares is being held across await points (used
via before_middlewares in the for loop), causing deadlock risk; fix by ensuring
you iterate over an owned collection instead of a guard—have
get_global_middlewares return a cloned Vec or immediately collect/clone into a
local Vec (e.g., let before_middlewares =
middleware_router.get_global_middlewares(...).clone() or let before_list: Vec<_>
= before_middlewares.iter().cloned().collect()), similarly materialize
route_before into an owned Option<(function, route_params)> and combine it with
the owned before Vec so execute_middleware_function(&request,
before_middleware).await runs without holding any read lock; apply the same
pattern for after_middlewares/route_after and their loop.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dded83aa-4438-421b-9c90-6040372a5da2

📥 Commits

Reviewing files that changed from the base of the PR and between a54ff96 and ab9fbe1.

📒 Files selected for processing (2)
  • src/routers/middleware_router.rs
  • src/server.rs

Comment on lines +93 to 102
     pub fn get_global_middlewares(
         &self,
         middleware_type: &MiddlewareType,
+    ) -> std::sync::RwLockReadGuard<'_, Vec<FunctionInfo>> {
         self.globals
             .get(middleware_type)
             .unwrap()
             .read()
             .unwrap()
-            .to_vec()
     }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for patterns where middleware might add other middleware dynamically
rg -n "add_global_middleware|add_middleware" --type py -C 3

# Check if there are any tests that register middleware from within middleware
rg -n "middleware.*middleware" --type py -C 2

Repository: sparckles/Robyn

Length of output: 7001


🏁 Script executed:

# Check server.rs for usage of get_global_middlewares
rg -n "get_global_middlewares" src/routers/server.rs -A 10 -B 2

Repository: sparckles/Robyn

Length of output: 120


🏁 Script executed:

# Also check if middleware can be added after server startup
rg -n "add_global_middleware" src/ -A 3 -B 1

Repository: sparckles/Robyn

Length of output: 770


🏁 Script executed:

# Verify the Python-Rust boundary and whether Python middleware can call add_global_middleware
rg -n "add_global_middleware\|add_middleware" . --type py -A 5 | head -100

Repository: sparckles/Robyn

Length of output: 41


🏁 Script executed:

# Find where get_global_middlewares is used in server.rs
rg -n "get_global_middlewares" src/server.rs -A 15 -B 3

Repository: sparckles/Robyn

Length of output: 1960


🏁 Script executed:

# Get context around the line numbers mentioned (519, 590-595)
sed -n '510,530p' src/server.rs
sed -n '580,610p' src/server.rs

Repository: sparckles/Robyn

Length of output: 2510


🏁 Script executed:

# Check the middleware execution function to see if it can trigger registration
rg -n "execute_middleware_function|execute_after_middleware_function" src/ -A 10 | head -150

Repository: sparckles/Robyn

Length of output: 3446


🏁 Script executed:

# Check if add_global_middleware is exposed to Python bindings
rg -n "add_global_middleware" . --type rs | grep -E "pub.*fn|#\[pymethods\]|#\[pyo3\]" -B 2 -A 1

Repository: sparckles/Robyn

Length of output: 85


🏁 Script executed:

# Look for Python FFI/PyO3 bindings
rg -n "add_global_middleware\|pymethods" . --type rs -B 5 -A 3 | head -80

Repository: sparckles/Robyn

Length of output: 85


🏁 Script executed:

# Check if middleware has access to server instance in Python
fd -e py . | xargs grep -l "server\." | head -5
cat robyn/router.py | grep -A 20 "def before_request\|def after_request" | head -40

Repository: sparckles/Robyn

Length of output: 127


🏁 Script executed:

# Search for PyO3 bindings of add_global_middleware
rg "add_global_middleware" src/ -B 3 -A 3 | grep -E "pub fn|#\[" -B 2 -A 2

Repository: sparckles/Robyn

Length of output: 679


🏁 Script executed:

# Check if Server is exposed to Python bindings
rg "#\[pymethods\]" src/ -A 30 | head -100

Repository: sparckles/Robyn

Length of output: 5401


🏁 Script executed:

# Look at the server.rs bindings to see what's exposed to Python
rg "fn add_global_middleware\|#\[pyo3\]\|#\[pymethods\]" src/server.rs -B 2 -A 5

Repository: sparckles/Robyn

Length of output: 41


🏁 Script executed:

# Check if Python middleware can call Robyn API methods at runtime
grep -r "self.app\|self.server\|add_global_middleware\|add_middleware" robyn/ --include="*.py" | grep -v "def " | head -20

Repository: sparckles/Robyn

Length of output: 757


🏁 Script executed:

# Check if Server struct has pymethods
rg "#\[pymethods\]|#\[pyclass\]" src/server.rs -A 10 | head -50

Repository: sparckles/Robyn

Length of output: 923


🏁 Script executed:

# Look at the Server struct definition to see if it's a PyClass
rg "pub struct Server|#\[pyclass\]" src/server.rs -B 2 -A 5

Repository: sparckles/Robyn

Length of output: 289


🏁 Script executed:

# Check the actual current return type by looking at the method signature more carefully
sed -n '93,102p' src/routers/middleware_router.rs

Repository: sparckles/Robyn

Length of output: 336


🏁 Script executed:

# Check if there are any other callers of get_global_middlewares that might be affected
rg "get_global_middlewares" src/ -B 5 -A 10

Repository: sparckles/Robyn

Length of output: 2807


🏁 Script executed:

# Check the pymethods block for Server to see what methods are exposed to Python
rg "#\[pymethods\]" src/server.rs -A 200 | head -250

Repository: sparckles/Robyn

Length of output: 10413


🏁 Script executed:

# Specifically look for add_global_middleware in the pymethods section
sed -n '30,500p' src/server.rs | grep -A 5 "add_global_middleware"

Repository: sparckles/Robyn

Length of output: 383


🏁 Script executed:

# Check if there's any way for Python middleware to access the Server instance
rg "FunctionInfo|execute_middleware_function" src/executors/mod.rs -B 5 -A 15 | head -80

Repository: sparckles/Robyn

Length of output: 2665


🏁 Script executed:

# Verify that add_global_middleware is NOT in the pymethods block
sed -n '25,450p' src/server.rs | grep -E "pub fn|add_global_middleware" | head -30

Repository: sparckles/Robyn

Length of output: 656


Holding a std::sync::RwLock guard across an async .await is an anti-pattern that can cause thread starvation.

Callers in server.rs hold the returned RwLockReadGuard for the entire middleware iteration loop, including across .await points. std::sync::RwLock is not designed for async contexts: a contended acquisition blocks the underlying OS thread rather than yielding to the async runtime, which can starve the executor when the lock is held across .await boundaries.

Consider one of these alternatives:

  1. Clone the Vec under the lock to release it before async iteration (restores original behavior)
  2. Use tokio::sync::RwLock which is designed for async contexts
  3. Collect middleware into a local Vec before iterating
Proposed fix: Clone inside lock to release before async iteration
     pub fn get_global_middlewares(
         &self,
         middleware_type: &MiddlewareType,
-    ) -> std::sync::RwLockReadGuard<'_, Vec<FunctionInfo>> {
+    ) -> Vec<FunctionInfo> {
         self.globals
             .get(middleware_type)
             .unwrap()
             .read()
             .unwrap()
+            .clone()
     }
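The trade-off behind this proposed fix can be sketched with standard-library types only. In the sketch below, `FunctionInfo` and `MiddlewareRouter` are simplified hypothetical stand-ins (the real `get_global_middlewares` also takes a `MiddlewareType`), `router_with` is an invented helper, and a shared counter stands in for the per-clone cost that the PR description attributes to `Py<PyAny>` refcount bumps. It shows that cloning under the lock frees the lock before iteration, at the price of one clone per middleware per call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

// Hypothetical stand-in for FunctionInfo; `clones` counts how often it is cloned.
struct FunctionInfo {
    name: String,
    clones: Arc<AtomicUsize>,
}

impl Clone for FunctionInfo {
    fn clone(&self) -> Self {
        self.clones.fetch_add(1, Ordering::Relaxed);
        FunctionInfo {
            name: self.name.clone(),
            clones: Arc::clone(&self.clones),
        }
    }
}

// Simplified router: a single list instead of one list per middleware type.
struct MiddlewareRouter {
    globals: RwLock<Vec<FunctionInfo>>,
}

impl MiddlewareRouter {
    /// Clones the list while the read lock is held; the temporary guard is
    /// released when this function returns, so callers never hold the lock.
    fn get_global_middlewares(&self) -> Vec<FunctionInfo> {
        self.globals.read().unwrap().clone()
    }
}

// Invented helper that builds a router whose entries share one clone counter.
fn router_with(names: &[&str], clones: &Arc<AtomicUsize>) -> MiddlewareRouter {
    let list = names
        .iter()
        .map(|n| FunctionInfo {
            name: n.to_string(),
            clones: Arc::clone(clones),
        })
        .collect();
    MiddlewareRouter { globals: RwLock::new(list) }
}

fn main() {
    let clones = Arc::new(AtomicUsize::new(0));
    let router = router_with(&["auth", "logging"], &clones);

    let snapshot = router.get_global_middlewares();
    // The lock is already free here, so a writer could register middleware
    // while the snapshot is being iterated.
    let writer_ok = router.globals.try_write().is_ok();
    for f in &snapshot {
        println!("running {}", f.name);
    }
    println!(
        "writer_ok = {writer_ok}, clones = {}",
        clones.load(Ordering::Relaxed)
    );
}
```

Calling `get_global_middlewares` once for before-middleware and once for after-middleware would double the clone count, which matches the 2×N figure in the PR summary.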
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    pub fn get_global_middlewares(
-        &self,
-        middleware_type: &MiddlewareType,
-    ) -> std::sync::RwLockReadGuard<'_, Vec<FunctionInfo>> {
-        self.globals
-            .get(middleware_type)
-            .unwrap()
-            .read()
-            .unwrap()
-            .to_vec()
-    }
+    pub fn get_global_middlewares(
+        &self,
+        middleware_type: &MiddlewareType,
+    ) -> Vec<FunctionInfo> {
+        self.globals
+            .get(middleware_type)
+            .unwrap()
+            .read()
+            .unwrap()
+            .clone()
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routers/middleware_router.rs` around lines 93 - 102, The
get_global_middlewares method currently returns a std::sync::RwLockReadGuard<'_,
Vec<FunctionInfo>> which causes the RwLock to be held across await points;
instead, acquire the read lock on self.globals.get(middleware_type).unwrap(),
clone the Vec<FunctionInfo> while holding the lock, then release the lock and
return the owned Vec<FunctionInfo> (change the method signature to return
Vec<FunctionInfo>); update callers in server.rs that iterate middlewares (which
previously held the guard) to use the cloned Vec so no std::sync::RwLock is held
across async .await boundaries.

@codspeed-hq

codspeed-hq bot commented Mar 27, 2026

Merging this PR will not alter performance

✅ 189 untouched benchmarks


Comparing perf/avoid-middleware-list-clone (ab9fbe1) with main (a54ff96)
