Fix(ingestion): implement retry mechanism + solve Docker cross-platform issue#557
Fix(ingestion): implement retry mechanism + solve Docker cross-platform issue#557tanmayjoddar wants to merge 2 commits intocertego:developfrom
Conversation
|
Hi @Lorygold , Just a gentle follow-up on this PR. Thanks! |
|
@lucaCigarini please run the CI |
|
I'm afraid I cannot run CI manually. The author needs to push another commit (even an empty one like this |
|
I pushed a new commit to trigger the CI pipeline as requested. It looks like the workflow is currently awaiting maintainer approval. Thanks! |
|
Hi @lucaCigarini @Lorygold — sorry for the noise. The only remaining failure is Black formatting, pushing the fix now. |
|
Hi @lucaCigarini @Lorygold — all CI issues are now resolved:
Would really appreciate it if someone could approve the workflow run when you get a chance. Thank you for your patience! |
|
Fix: Added Pushing the fix now — will update here once CI completes. |
|
Hi @lucaCigarini @Lorygold — could someone please approve the CI workflow run? I'm actively working on getting all tests to pass and would appreciate being able to see the full output. Thanks! |
- Add shared _read_retry_config() in BaseIngestion - Add connection_retry.py with generic backoff decorator (backoff library) - Elasticsearch, OpenSearch, Splunk: eager _initialize_connection() with retry on startup; health-check after successful connect - process_user_logins catches, logs and returns [] on error (per-user loop must not abort remaining users) - process_users re-raises so the task knows if the user-list fetch failed - Move retry config from env vars into ingestion.json; defaults applied when block is absent (backward compatible) Closes certego#545
process_user_logins catches exceptions, logs them and returns [] — it does not re-raise. Tests should only assert the log, not a raised exception.

Implemented comprehensive retry logic for all ingestion sources.
During testing, discovered and fixed a critical Docker cross-platform bug that prevented
BuffaLogs from starting in Linux containers on Windows development machines (WSL2). This
blocked all testing until resolved.
Retry Mechanism Implementation:
Backward Compatibility:
Critical Docker Bug Discovered & Fixed:
While testing the retry implementation, discovered BuffaLogs container failed to start with error:
exec ./run.sh: no such file or directoryRoot Cause: Shell scripts (run.sh, run_worker.sh, run_beat.sh) had Windows line endings
(CRLF) which Linux containers cannot execute. This is a cross-platform compatibility issue
that affects anyone developing on Windows with WSL2/Docker Desktop.
Solution Implemented:
/bin/bashinvocationFiles Modified:
Removed:
Testing:
Fixes #545
Elasticsearch retry with exception logging via predicate-based backoff hook
Docker Cross-Platform Fix: Shell Script Line Endings Resolved
Problem:
exec ./run.sh: no such file or directoryon fresh docker-compose upRoot Cause: Shell scripts (run.sh, run_worker.sh, run_beat.sh) had Windows CRLF line endings; Linux containers cannot execute them.
Solution:
/bin/bashinvocationResult: All services running healthy on both Windows and Linux environments
