Fix how we fetch job logs #214

Closed
golharam wants to merge 3 commits into main from bugfix/get_job_log

@golharam (Contributor) commented Apr 8, 2026

get_job_log wasn't returning AWS CloudWatch logs. Claude identified a few issues:

Performance Issues

  1. Synchronous Blocking I/O
    The get_log_events() function makes synchronous boto3 calls to CloudWatch Logs, which blocks the entire request thread while waiting for AWS responses.

  2. Fetching All Logs at Once
    Lines 219-234 in services.py fetch ALL log events in a while loop until exhausted. For large jobs with millions of log lines, this causes:

    - Memory accumulation (all logs stored in the events list)
    - Extended processing time (multiple AWS API calls in sequence)
    - Gateway timeouts before a response can be sent

  3. Inefficient boto3 Client Creation
    Lines 221-224 create a new boto3 client inside the while loop on every iteration, adding unnecessary overhead.

  4. No Pagination Support
    The API endpoint doesn't support pagination - clients must wait for the entire log file, even if they only need recent entries.

  5. No Caching
    Logs for completed jobs never change, but they're fetched from CloudWatch every time.
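
For illustration, items 2-4 could be addressed together along these lines. This is a minimal sketch, not the code in services.py: `iter_log_pages`, `log_group`, `log_stream`, and `page_size` are hypothetical names, and `client` is assumed to be a CloudWatch Logs client created once by the caller (for example with `boto3.client("logs")`) rather than inside the loop.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_log_pages(client: Any, log_group: str, log_stream: str,
                   page_size: int = 1000) -> Iterator[List[Dict[str, Any]]]:
    """Yield one page of CloudWatch log events at a time, oldest first.

    `client` is created once by the caller and reused for every request,
    so no new client is built per iteration.
    """
    kwargs: Dict[str, Any] = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startFromHead": True,
        "limit": page_size,
    }
    token: Optional[str] = None
    while True:
        if token is not None:
            kwargs["nextToken"] = token
        response = client.get_log_events(**kwargs)
        # Empty pages can occur mid-stream, so skip them instead of stopping.
        if response["events"]:
            yield response["events"]
        next_token = response.get("nextForwardToken")
        # CloudWatch signals the end of the stream by returning the same
        # token that was passed in.
        if next_token is None or next_token == token:
            break
        token = next_token
```

Because this is a generator, only one page is held in memory at a time, and a caller that needs just the first page can stop iterating early.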

@EricSDavis - I ended up implementing a StreamingResponse, which seems to be the best solution here. However, the frontend still isn't displaying anything.
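
As a rough sketch of the streaming idea (not the code in this PR), the response body can be produced by a generator that emits one text chunk per page of events. `stream_log_lines` is a hypothetical helper, and the wiring shown in the trailing comment assumes a FastAPI/Starlette endpoint.

```python
from typing import Any, Dict, Iterable, Iterator, List


def stream_log_lines(pages: Iterable[List[Dict[str, Any]]]) -> Iterator[str]:
    """Turn pages of log events into newline-terminated text chunks."""
    for page in pages:
        # One chunk per page keeps memory bounded by the page size.
        yield "".join(event["message"].rstrip("\n") + "\n" for event in page)


# Assumed wiring inside an endpoint (requires FastAPI/Starlette):
#   return StreamingResponse(stream_log_lines(pages), media_type="text/plain")
```

With this shape the server starts sending data as soon as the first page arrives, instead of waiting for the full log.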

@golharam golharam requested a review from EricSDavis April 8, 2026 02:14
@golharam (Contributor, Author) commented

Closing in favor of #215

@golharam golharam closed this Apr 10, 2026