
fix(deployment_manager): harden and optimize concurrent GitHub file uploads#68

Draft
Copilot wants to merge 2 commits into optimize-github-deployment-5034492779168511499 from copilot/sub-pr-60

Conversation


Copilot AI commented Mar 14, 2026

_upload_to_github had several correctness and robustness gaps introduced when migrating to httpx: silent auth failures, blocking file I/O on the event loop, swallowed CancelledError, no rate-limit handling, and unbounded coroutine creation for large repos.

Changes

  • Auth validation – Check user_response.status_code != 200 before parsing JSON; raises a descriptive exception instead of a KeyError on login
  • Non-blocking file reads – open().read() replaced with await asyncio.to_thread(file_path.read_bytes), keeping the event loop free during disk I/O
  • CancelledError propagation – Explicit except asyncio.CancelledError: raise before the broad except Exception, so task cancellation (e.g. server shutdown) is not silently swallowed
  • Rate-limit retry/backoff – On HTTP 403/429, retries up to 3 times with exponential backoff; honors Retry-After header with a safe fallback for non-integer values; no wasted sleep after the final attempt; exhausted retries are logged as errors (not silently None)
  • Bounded queue-based worker pool – Replaces Semaphore(10) + gather(all_coroutines) with a proper producer/consumer pattern:
    • asyncio.Queue(maxsize=20) prevents unbounded memory growth for large repos
    • Workers started before queue fill for immediate consumption
    • try/finally ensures queue.task_done() is always called
    • gather(..., return_exceptions=True) prevents one worker error from cancelling siblings
# Before: O(files) coroutine objects created up-front
results = await asyncio.gather(*(run_with_semaphore(task) for task in upload_tasks))

# After: O(workers) coroutines, bounded queue, retry-aware
queue: asyncio.Queue = asyncio.Queue(maxsize=MAX_WORKERS * 2)
worker_tasks = [asyncio.create_task(worker()) for _ in range(MAX_WORKERS)]
# ... fill queue ...
results = await asyncio.gather(*worker_tasks, return_exceptions=True)
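For context, the bullet points above can be condensed into a minimal runnable sketch of the bounded worker-pool pattern. Names such as process, run_pool, and the MAX_WORKERS value here are illustrative stand-ins, not code taken from deployment_manager.py:

```python
import asyncio

MAX_WORKERS = 4

async def process(item: int) -> int:
    # Stand-in for the real per-file upload coroutine.
    await asyncio.sleep(0)
    return item * 2

async def run_pool(items: list[int]) -> list[int]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=MAX_WORKERS * 2)
    results: list[int] = []

    async def worker() -> None:
        while True:
            item = await queue.get()
            try:
                if item is None:  # sentinel: no more work for this worker
                    return
                results.append(await process(item))
            finally:
                queue.task_done()

    # Start workers first so they drain the queue while we fill it.
    workers = [asyncio.create_task(worker()) for _ in range(MAX_WORKERS)]
    for item in items:
        await queue.put(item)  # blocks when the queue is full: bounded memory
    for _ in range(MAX_WORKERS):
        await queue.put(None)  # one sentinel per worker
    # return_exceptions=True keeps one failed worker from cancelling the rest.
    await asyncio.gather(*workers, return_exceptions=True)
    return results

print(sorted(asyncio.run(run_pool(list(range(10))))))
```

Only MAX_WORKERS coroutines exist at any time, and the bounded queue applies backpressure to the producer for large file sets.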



vercel bot commented Mar 14, 2026

The latest updates on your projects.

Project v0-uvai: Ready (updated Mar 14, 2026 5:07pm UTC)

@railway-app railway-app bot temporarily deployed to EventRelay / EventRelay-pr-68 March 14, 2026 16:56 Destroyed

railway-app bot commented Mar 14, 2026

🚅 Deployed to the EventRelay-pr-68 environment in EventRelay

Service status (updated Mar 14, 2026 at 5:10 pm UTC):
  • eslint-config – ✅ Success (Web)
  • EventRelay – ✅ Success (Web)
  • web – ✅ Success (Web)
  • supabase – ✅ Success
  • guides – ✅ Success (Web)

…_github

- Check user_response.status_code before parsing JSON to surface auth errors
- Use asyncio.to_thread for file reads to avoid blocking the event loop
- Re-raise asyncio.CancelledError to allow proper task cancellation
- Add retry/backoff (3 retries, exponential) for 403/429 rate limit responses
- Wrap Retry-After header parsing in try/except for non-integer values
- Replace semaphore+gather with bounded queue-based worker pool (maxsize=20)
- Start workers before filling queue for immediate consumption
- Add try/finally in worker to guarantee queue.task_done() is always called
- Use return_exceptions=True in gather to prevent one error cancelling all workers

Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>

Copilot AI commented Mar 14, 2026

Just as a heads up, I was blocked by some firewall rules while working on your feedback.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/user
    • Triggering command: /usr/bin/python python -m pytest tests/testing/test_deployment_pipeline.py -v (http block)
    • Triggering command: /usr/bin/python python -m pytest tests/testing/test_deployment_pipeline.py::TestDeploymentPipelineIntegration::test_deployment_manager_orchestration -v (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI changed the title [WIP] Fix performance issue with GitHub deployment calls fix(deployment_manager): harden and optimize concurrent GitHub file uploads Mar 14, 2026
Copilot AI requested a review from groupthinking March 14, 2026 17:07
    logger.error(f"Exhausted retries uploading {relative_path} due to rate limiting")
    return None
except asyncio.CancelledError:
    # Propagate cancellation so calling code can handle shutdown correctly

Worker tasks can fail to call queue.task_done() when cancelled at a race condition boundary between queue.get() success and entering the try/finally block

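One defensive shape for the worker loop is sketched below, with a hypothetical handle() coroutine standing in for the real upload. In CPython, cancellation is only delivered at await points, so placing everything after a successful queue.get() inside try/finally keeps the queue's unfinished-task count accurate on success, error, and cancellation alike:

```python
import asyncio

async def handle(item: str) -> None:
    await asyncio.sleep(0)  # stand-in for the real per-item upload work

async def worker(queue: asyncio.Queue) -> None:
    while True:
        item = await queue.get()
        try:
            if item is None:  # sentinel: stop this worker
                return
            await handle(item)
        finally:
            queue.task_done()  # runs on success, error, and cancellation

async def demo() -> bool:
    queue: asyncio.Queue = asyncio.Queue()
    for item in ("a.txt", "b.txt", None):
        queue.put_nowait(item)
    await worker(queue)
    # join() returns only if task_done() was called once per put item.
    await asyncio.wait_for(queue.join(), timeout=1)
    return True

print(asyncio.run(demo()))
```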

    await queue.put(item)
# Sentinel values to signal each worker to stop
for _ in range(MAX_WORKERS):
    await queue.put(None)

Missing await queue.join() call in worker pool pattern causes improper synchronization between item processing and worker shutdown

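The ordering this comment suggests can be sketched as follows, with illustrative names: drain the queue with queue.join() first, then send the sentinels, so workers cannot exit while real items are still unprocessed:

```python
import asyncio

MAX_WORKERS = 2

async def worker(queue: asyncio.Queue, out: list) -> None:
    while True:
        item = await queue.get()
        try:
            if item is None:  # sentinel: stop this worker
                return
            out.append(item)
        finally:
            queue.task_done()

async def run_demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    out: list = []
    workers = [asyncio.create_task(worker(queue, out)) for _ in range(MAX_WORKERS)]
    for item in range(5):
        await queue.put(item)
    # Wait until every real item has been fully processed...
    await queue.join()
    # ...then signal shutdown; joining first prevents premature worker exit.
    for _ in range(MAX_WORKERS):
        await queue.put(None)
    await asyncio.gather(*workers)
    return sorted(out)

print(asyncio.run(run_demo()))
```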

response = await client.put(upload_url, headers=headers, json=file_data)
if response.status_code in [200, 201]:
    return str(relative_path)
elif response.status_code in [403, 429]:

Retry-After header with RFC 7231 datetime format is not parsed correctly, causing fallback to exponential backoff instead of respecting the server's specified retry time



Copilot AI left a comment


Pull request overview

This PR hardens and improves the robustness of GitHub repository uploads in DeploymentManager._upload_to_github, primarily addressing async correctness (cancellation + non-blocking I/O) and operational behavior (rate limiting + bounded concurrency).

Changes:

  • Add explicit auth status validation before parsing /user response JSON.
  • Replace blocking file reads with asyncio.to_thread(...), add rate-limit retry/backoff, and switch to a queue-based worker pool for bounded concurrency.
  • Regenerate/update *.egg-info packaging metadata (appears incidental to the deployment-manager change).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.

Files reviewed:
  • src/youtube_extension/backend/deployment_manager.py – Improves GitHub upload resilience (auth check, non-blocking reads, retry/backoff, worker pool).
  • src/youtube_extension.egg-info/requires.txt – Updates generated dependency metadata (should not be committed).
  • src/youtube_extension.egg-info/SOURCES.txt – Updates generated source manifest (should not be committed).
  • src/youtube_extension.egg-info/PKG-INFO – Updates generated long description/metadata (should not be committed; also contains conflicting runtime claims).
Comments suppressed due to low confidence (1)

src/youtube_extension.egg-info/requires.txt:60

  • PR description focuses on hardening _upload_to_github, but this PR also includes regenerated *.egg-info metadata (dependencies + long description changes). Unless the intent is to change packaging metadata, these files should be removed from the PR to keep the change set scoped and reviewable.
google-genai>=1.0.0
google-generativeai>=0.3.0
opentelemetry-distro>=0.40b0
opentelemetry-exporter-otlp>=1.20.0
psutil>=5.9.0
ffmpeg-python>=0.2.0
qrcode[pil]>=7.0
opencv-python>=4.8.0

[deploy]
docker>=6.1.0
kubernetes>=28.1.0
helm>=3.12.0
flyctl>=0.1.0

[dev]
pytest>=7.4.0
pytest-asyncio>=0.21.0
pytest-cov>=4.1.0
pytest-xdist>=3.5.0
pytest-mock>=3.12.0
black>=23.0.0
isort>=5.12.0
flake8>=6.0.0
mypy>=1.7.0
pre-commit>=3.5.0
ruff>=0.1.0
jupyter>=1.0.0
ipykernel>=6.25.0
notebook>=7.0.0

[docs]
mkdocs>=1.5.0

Comment on lines +596 to +636
    # Retry with exponential backoff on 403/429 (rate limit / abuse detection).
    # max_retries=3 means 1 initial attempt + 3 retries = 4 total requests.
    max_retries = 3
    for attempt in range(max_retries + 1):
        response = await client.put(upload_url, headers=headers, json=file_data)
        if response.status_code in [200, 201]:
            return str(relative_path)
        elif response.status_code in [403, 429]:
            if attempt == max_retries:
                break  # exhausted all retries, no point sleeping
            try:
                retry_after = int(response.headers.get("Retry-After", 2 ** (attempt + 1)))
            except (ValueError, TypeError):
                retry_after = 2 ** (attempt + 1)
            logger.warning(
                f"Rate limited uploading {relative_path} "
                f"(attempt {attempt + 1}/{max_retries + 1}), retrying in {retry_after}s"
            )
            await asyncio.sleep(retry_after)
        else:
            logger.warning(
                f"Failed to upload {relative_path}: "
                f"{response.status_code} - {response.text}"
            )
            return None

    logger.error(f"Exhausted retries uploading {relative_path} due to rate limiting")
    return None
except asyncio.CancelledError:
    # Propagate cancellation so calling code can handle shutdown correctly
    raise
except Exception as e:
    logger.warning(f"Error uploading {file_path}: {e}")
    return None

# Collect tasks
for file_path in project_path_obj.rglob("*"):
    # Skip excluded directories and dotfiles
    if should_skip_path(file_path.relative_to(project_path_obj)):
        continue
    if file_path.is_file() and not file_path.name.startswith('.'):
        relative_path = file_path.relative_to(project_path_obj)
        upload_tasks.append(upload_file(client, file_path, relative_path))

# Run uploads concurrently with a semaphore to avoid overwhelming the GitHub API.
# GitHub's secondary rate limits for concurrent writes are not strictly documented,
# but 10-20 concurrent requests is a safe maximum.
semaphore = asyncio.Semaphore(10)

async def run_with_semaphore(coro):
    async with semaphore:
        return await coro

results = await asyncio.gather(*(run_with_semaphore(task) for task in upload_tasks))
uploaded_files = [res for res in results if res is not None]

# Bounded worker-pool: only MAX_WORKERS coroutines exist at a time, keeping
# memory usage O(workers) rather than O(files) for large repositories.
# maxsize=MAX_WORKERS*2 keeps the queue bounded so producers don't outpace consumers.
MAX_WORKERS = 10
queue: asyncio.Queue = asyncio.Queue(maxsize=MAX_WORKERS * 2)
uploaded_files: list[str] = []
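As a side note on the explicit CancelledError re-raise in the hunk above: since Python 3.8, asyncio.CancelledError derives from BaseException, so a bare except Exception no longer swallows it, but the explicit re-raise still documents intent and guards broader or older-style handlers. A small self-contained demonstration (demo_cancel and cancellable are illustrative names):

```python
import asyncio

async def cancellable() -> str:
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        raise  # propagate so the caller sees the task as cancelled
    except Exception:
        return "swallowed"
    return "done"

async def demo_cancel() -> bool:
    task = asyncio.create_task(cancellable())
    await asyncio.sleep(0)  # let the task reach its sleep
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        return True  # cancellation propagated correctly
    return False

print(asyncio.run(demo_cancel()))
```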
Comment on lines +573 to +580
# Collect file list up-front; coroutines are created lazily inside workers
files_to_upload = [
    (file_path, file_path.relative_to(project_path_obj))
    for file_path in project_path_obj.rglob("*")
    if not should_skip_path(file_path.relative_to(project_path_obj))
    and file_path.is_file()
    and not file_path.name.startswith('.')
]
Comment on lines 589 to 595
file_data = {
    "message": f"Add {relative_path}",
    "content": encoded_content
}

upload_url = f"https://api.github.com/repos/{username}/{repo_name}/contents/{relative_path}"
response = await client.put(upload_url, headers=headers, json=file_data)
