Fix health check to use HTTPS#211

Merged
samuelvkwong merged 2 commits into main from fix-healthcheck-https
Apr 2, 2026

Conversation

@samuelvkwong
Collaborator

@samuelvkwong samuelvkwong commented Apr 2, 2026

Summary

  • The Docker Swarm health check was using curl -f http://localhost/health/, but SECURE_SSL_REDIRECT returns a 301 redirect to HTTPS
  • This caused health checks to fail, leading Swarm to restart replicas until max_attempts was exhausted
  • Changed to curl -fk https://localhost/health/ to hit the HTTPS endpoint directly (-k skips cert verification for localhost)
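
To see the redirect in isolation, here is a minimal sketch using only the Python standard library. The handler is a stand-in, not Django: it assumes that SECURE_SSL_REDIRECT answers every plain-HTTP request with a 301 pointing at the same path over HTTPS, as described above. The names RedirectToHttps and probe_plain_http are made up for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectToHttps(BaseHTTPRequestHandler):
    """Stand-in for an app that redirects all plain-HTTP traffic to HTTPS."""

    def do_GET(self):
        # Drop the port from the Host header and point at the HTTPS origin.
        host = self.headers.get("Host", "localhost").split(":")[0]
        self.send_response(301)
        self.send_header("Location", f"https://{host}{self.path}")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def probe_plain_http(path="/health/"):
    """Send one GET over plain HTTP without following redirects.

    Returns (status_code, Location header) as seen by the probe.
    """
    server = HTTPServer(("127.0.0.1", 0), RedirectToHttps)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()
        result = (resp.status, resp.getheader("Location"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling probe_plain_http() against this stand-in yields a 301 with a Location of https://127.0.0.1/health/, so a probe that speaks plain HTTP never reaches the health view itself.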

Test plan

  • Exec into a running web container and verify curl -fk https://localhost/health/ returns {"status": "ok"}
  • Deploy and confirm docker ps shows containers as (healthy)
  • Verify no restart loops in docker service ps

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Integrated OpenTelemetry observability capabilities across the application.
  • Chores

    • Updated health endpoint checks to use HTTPS with TLS support.
    • Added observability configuration example and environment variable setup.

@coderabbitai

coderabbitai bot commented Apr 2, 2026

Warning

Rate limit exceeded

@samuelvkwong has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 14 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 6 minutes and 14 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d043b846-a686-47f4-873c-a1f621398aa9

📥 Commits

Reviewing files that changed from the base of the PR and between ef444cf and 33d51d8.

📒 Files selected for processing (1)
  • docker-compose.prod.yml
📝 Walkthrough

Walkthrough

This pull request introduces OpenTelemetry observability infrastructure across the application. Changes include adding initialization code in Django and ASGI entry points, configuring Docker Compose services with observability networks, updating environment configuration examples, and updating the shared dependency reference to enable telemetry features.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Configuration & Dependency: .gitignore, example.env, pyproject.toml | Added .gitignore entry for compose override file, included OTEL collector endpoint in example environment, updated adit-radis-shared dependency from version 0.19.1 to @openobserve branch. |
| Docker Compose Setup: docker-compose.override.yml.example, docker-compose.prod.yml | Added example compose file defining observability anchor with OTEL networking and environment configuration for init, web, default_worker, and llm_worker services; updated web service healthcheck to use HTTPS with insecure TLS flag. |
| OpenTelemetry Initialization: manage.py, radis/asgi.py, radis/settings/base.py | Added early setup_opentelemetry() invocation in both Django management command and ASGI entry points before framework initialization; configured Django logging to register OTEL handler when telemetry is active. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • openradx/radis#168: Modifies the same pyproject.toml adit-radis-shared dependency entry with a different git reference update.

Poem

🐰 With traces that hop and logs that gleam,
We instrument the Django stream.
OTEL threads through entry and app,
Observability closes the gap! ✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The PR title 'Fix health check to use HTTPS' directly matches the primary change in docker-compose.prod.yml, where the healthcheck was updated from HTTP to HTTPS. However, the changeset includes substantial observability-related changes across multiple files (OpenTelemetry setup, environment configuration, Docker Compose overrides) that are not reflected in the title. | Clarify whether the PR focuses primarily on the HTTPS health check fix (current title is accurate) or on the broader observability integration. If observability is the main objective, consider a more inclusive title such as 'Add OpenTelemetry observability integration and fix health check to use HTTPS'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@claude

claude bot commented Apr 2, 2026

Code Review

Thanks for the PR! The health check fix is correct, but there are several concerns worth addressing — especially one that could break reproducible builds.


Critical: Mutable branch reference in pyproject.toml

pyproject.toml:96

-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@0.19.1",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",

This pins the dependency to a branch name (openobserve) rather than a version tag or commit SHA. Branch refs are mutable — any future commit pushed to openobserve will silently change what gets installed in new environments, breaking reproducibility and making it impossible to audit what's actually running in production.

The uv.lock currently pins it to commit a945c9d93841fc6c376df7aec78550fec1715b93, so existing environments are safe — but any uv sync after a new commit lands on that branch will silently pick it up.

Recommendation: Either:

  • Land the OTel changes in adit-radis-shared, cut a new version tag (e.g. 0.20.0), and reference that
  • Or use the full commit SHA in the meantime: @a945c9d93841fc6c376df7aec78550fec1715b93
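
As a rough illustration of the difference between the two pins, the sketch below extracts the git ref from a direct-reference requirement string and reports whether it is a full 40-character commit SHA. The helper names git_ref and is_pinned_to_commit are hypothetical, and the check is purely textual: it cannot tell a tag from a branch, it can only say that a ref is not a commit hash.

```python
import re

# A full-length git commit hash: 40 lowercase hex characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def git_ref(requirement):
    """Return the ref of a 'name @ git+<url>@<ref>' requirement, or None.

    None is returned when the string is not a git direct reference or
    carries no explicit ref.
    """
    _, sep, url = requirement.partition(" @ ")
    if not sep or not url.startswith("git+"):
        return None
    url = url.split("#", 1)[0]            # drop any '#subdirectory=...' fragment
    after_scheme = url.split("://", 1)[-1]
    # Skip the host part so 'git@github.com' style userinfo is not mistaken for a ref.
    _, slash, path = after_scheme.partition("/")
    if not slash or "@" not in path:
        return None
    return path.rpartition("@")[2]


def is_pinned_to_commit(requirement):
    """True only when the ref is a full commit SHA, i.e. cannot move."""
    ref = git_ref(requirement)
    return ref is not None and _FULL_SHA.fullmatch(ref) is not None
```

A check along these lines could run in CI to flag refs such as openobserve, while a stricter policy (for example, allowing version tags) would need to query the remote repository.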

PR Scope

The PR title says "Fix health check to use HTTPS" but ~310 of the 318 additions are OpenTelemetry integration. These are unrelated changes and mixing them makes the git history harder to reason about. Consider splitting into two PRs:

  1. One-liner health check fix (immediately mergeable)
  2. OTel integration as its own PR with appropriate context

Health check fix (docker-compose.prod.yml:55)

The fix itself is correct. Using -k (skip TLS verification) is appropriate for a localhost health check where no valid cert exists, and is a standard pattern. Optionally, a short inline comment explaining why -k is needed would help future readers:

# -k skips TLS cert verification for localhost (no valid cert for 127.0.0.1)
test: ["CMD", "curl", "-fk", "https://localhost/health/"]

OTel initialization: graceful degradation

manage.py:82 / radis/asgi.py:115

setup_opentelemetry()

What happens when OTEL_EXPORTER_OTLP_ENDPOINT is empty or unreachable? In development environments (the default case), the example.env sets:

OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector.local:4318

If OTel is unconditionally initialized with an unreachable endpoint, every request could generate connection errors or introduce latency. It would be good to document (or verify) that setup_opentelemetry() is a no-op when the endpoint is empty/unset. The docker-compose.override.yml.example correctly uses ${OTEL_EXPORTER_OTLP_ENDPOINT:-} (empty default), which is good — but example.env should probably default to an empty value too, or add a comment indicating users should leave it blank unless they're running the observability stack.
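
One way to make the empty-endpoint case explicit at the call site is sketched below. This is an assumption-laden wrapper, not the behaviour of adit_radis_shared: whether setup_opentelemetry() already performs such a check internally is not known from this PR. The helpers otlp_endpoint and setup_telemetry_if_configured are hypothetical names.

```python
import os
from urllib.parse import urlparse


def otlp_endpoint(environ=None):
    """Return the configured OTLP endpoint, or None if telemetry should stay off.

    Treats an unset, blank, or non-HTTP(S) value as "not configured".
    """
    env = os.environ if environ is None else environ
    value = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip()
    if not value:
        return None
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    return value


def setup_telemetry_if_configured(setup, environ=None):
    """Call `setup()` only when an endpoint is configured.

    Returns True if setup ran, False if it was skipped.
    """
    if otlp_endpoint(environ) is None:
        return False
    setup()
    return True
```

In manage.py or radis/asgi.py this would be used as setup_telemetry_if_configured(setup_opentelemetry). Note that it only covers the unset or blank case; an endpoint that is configured but unreachable still depends on how the exporter handles connection failures.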


Minor: noqa: E402 in asgi.py

from adit_radis_shared.telemetry import setup_opentelemetry  # noqa: E402
...
from django.core.asgi import get_asgi_application  # noqa: E402

This is fine and unavoidable given OTel must be initialized before Django. A brief comment explaining the ordering requirement would help maintainers understand why this pattern is used here (similar to how it's already done in manage.py).


Missing tests

The OTel integration has no test coverage. At minimum, it would be useful to have a test that:

  • Verifies the health endpoint still returns 200 when OTel is disabled
  • Verifies setup_opentelemetry() is a no-op (no exceptions) when OTEL_EXPORTER_OTLP_ENDPOINT is unset

This is especially important for a production-facing change that touches the ASGI entry point.
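
A possible shape for the second test is sketched here with unittest and mock.patch.dict. Since the real setup_opentelemetry() lives in adit-radis-shared and its behaviour is not shown in this PR, the sketch exercises a hypothetical stand-in (setup_opentelemetry_stub) that returns whether it would have activated telemetry; in the real suite the stub would be replaced by the actual import.

```python
import os
import unittest
from unittest import mock


def setup_opentelemetry_stub():
    """Hypothetical stand-in for the real setup function.

    Returns True if telemetry would be activated, False if it is a no-op.
    """
    return bool(os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip())


class TelemetryDisabledTest(unittest.TestCase):
    def test_noop_when_endpoint_unset(self):
        # Copy the environment without the OTLP variable, then swap it in.
        env = {
            k: v for k, v in os.environ.items()
            if k != "OTEL_EXPORTER_OTLP_ENDPOINT"
        }
        with mock.patch.dict(os.environ, env, clear=True):
            self.assertFalse(setup_opentelemetry_stub())

    def test_noop_when_endpoint_empty(self):
        with mock.patch.dict(os.environ, {"OTEL_EXPORTER_OTLP_ENDPOINT": ""}):
            self.assertFalse(setup_opentelemetry_stub())


def run_suite():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TelemetryDisabledTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

mock.patch.dict restores os.environ when the with block exits, so the tests do not leak configuration into each other.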


Summary

| Issue | Severity |
|---|---|
| Branch pin (@openobserve) instead of tag/SHA | High (breaks reproducible builds) |
| PR mixes unrelated changes | Medium |
| OTel graceful degradation with missing endpoint | Medium |
| No tests for OTel integration | Medium |
| example.env default endpoint | Low |
| Missing comments in asgi.py and health check | Low |

The health check fix is good to go once it's separated. The OTel work needs the dependency pin resolved before merge.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request integrates OpenTelemetry for application-wide observability, adding OTLP exporter configurations, environment variables, and initialization logic in manage.py and asgi.py. It also updates the adit-radis-shared dependency to a branch reference and switches the production healthcheck to HTTPS. Feedback focuses on ensuring build stability by using immutable version tags instead of branch references for dependencies. Additionally, it is recommended to guard telemetry initialization with an activity check to prevent unnecessary overhead in environments where observability is not enabled.

pyproject.toml Outdated
 requires-python = ">=3.12,<4.0"
 dependencies = [
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@0.19.1",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",

medium

Using a branch reference (@openobserve) for a dependency is discouraged for production-bound code. Branches are mutable, which can lead to non-deterministic builds if the branch is updated. It is safer to use a specific version tag or a commit hash to ensure environment stability.

manage.py Outdated
Comment on lines +28 to +30
from adit_radis_shared.telemetry import setup_opentelemetry

setup_opentelemetry()

medium

OpenTelemetry is initialized unconditionally in manage.py, which affects all management commands (including migrations and shell). It is recommended to guard this with is_telemetry_active() to avoid unnecessary overhead or connection attempts in environments where telemetry is not required, consistent with the logic used in the settings.

Suggested change
-from adit_radis_shared.telemetry import setup_opentelemetry
-setup_opentelemetry()
+from adit_radis_shared.telemetry import setup_opentelemetry, is_telemetry_active
+if is_telemetry_active():
+    setup_opentelemetry()

radis/asgi.py Outdated
Comment on lines +18 to +20
from adit_radis_shared.telemetry import setup_opentelemetry # noqa: E402

setup_opentelemetry()

medium

OpenTelemetry is initialized unconditionally here. Consider guarding this call with is_telemetry_active() to ensure it only runs when telemetry is explicitly configured, consistent with the logic used in radis/settings/base.py.

Suggested change
-from adit_radis_shared.telemetry import setup_opentelemetry  # noqa: E402
-setup_opentelemetry()
+from adit_radis_shared.telemetry import setup_opentelemetry, is_telemetry_active  # noqa: E402
+if is_telemetry_active():
+    setup_opentelemetry()

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
docker-compose.override.yml.example (1)

9-11: Ensure documentation covers the external network prerequisite.

The openradx-observability network must exist before using this override (created by the referenced observability stack). Consider adding a brief comment in this file noting this dependency, or ensure the linked documentation in example.env covers this clearly.

📝 Suggested comment addition
 networks:
   openradx-observability:
+    # This network is created by the openradx-observability stack.
+    # See https://github.com/openradx/openradx-observability
     external: true
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose.override.yml.example` around lines 9 - 11, Add a short comment
above the networks block in docker-compose.override.yml.example that explains
the external network "openradx-observability" must be created beforehand (it’s
provided by the observability stack), and point readers to the example.env or
observability stack documentation for steps to create it so users know this
prerequisite before using the override.
pyproject.toml (1)

10-10: Using a branch reference with uv.lock pinning is reproducible but not ideal practice.

The uv.lock file pins adit-radis-shared to a specific commit (a945c9d93841fc6c376df7aec78550fec1715b93) on the openobserve branch, so builds remain reproducible. However, relying on a branch reference in pyproject.toml while pinning in the lock file is inconsistent with best practices.

For clarity and proper version communication, tag and release the openobserve branch changes as a stable version in adit-radis-shared, then update this reference to use the version tag instead of the branch name.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 10, The pyproject dependency string
"adit-radis-shared @
git+https://github.com/openradx/adit-radis-shared.git@openobserve" should not
point to a branch; instead tag and release the current openobserve commit in the
adit-radis-shared repo (create a semantic version tag), update the
pyproject.toml dependency to use that version tag (e.g., replace `@openobserve`
with `@vX.Y.Z`), and then regenerate the lockfile (uv.lock) so it pins the
released tag commit consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@docker-compose.override.yml.example`:
- Around line 9-11: Add a short comment above the networks block in
docker-compose.override.yml.example that explains the external network
"openradx-observability" must be created beforehand (it’s provided by the
observability stack), and point readers to the example.env or observability
stack documentation for steps to create it so users know this prerequisite
before using the override.

In `@pyproject.toml`:
- Line 10: The pyproject dependency string "adit-radis-shared @
git+https://github.com/openradx/adit-radis-shared.git@openobserve" should not
point to a branch; instead tag and release the current openobserve commit in the
adit-radis-shared repo (create a semantic version tag), update the
pyproject.toml dependency to use that version tag (e.g., replace `@openobserve`
with `@vX.Y.Z`), and then regenerate the lockfile (uv.lock) so it pins the
released tag commit consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d2def08d-fa36-4456-879c-3c2fd49c8948

📥 Commits

Reviewing files that changed from the base of the PR and between 20d5ee0 and ef444cf.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • .gitignore
  • docker-compose.override.yml.example
  • docker-compose.prod.yml
  • example.env
  • manage.py
  • pyproject.toml
  • radis/asgi.py
  • radis/settings/base.py

The health check was using plain HTTP, but SECURE_SSL_REDIRECT causes a
301 redirect to HTTPS, making the probe fail and Swarm restart replicas.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@samuelvkwong samuelvkwong force-pushed the fix-healthcheck-https branch from ef444cf to 044e7dc Compare April 2, 2026 10:14
@samuelvkwong samuelvkwong merged commit b48e154 into main Apr 2, 2026
2 checks passed
@samuelvkwong samuelvkwong deleted the fix-healthcheck-https branch April 2, 2026 10:56