
Integrate OpenTelemetry from adit-radis-shared#191

Merged
samuelvkwong merged 13 commits into main from openobserve
Apr 2, 2026

Conversation

@samuelvkwong
Collaborator

@samuelvkwong samuelvkwong commented Jan 31, 2026

Summary

  • Integrate shared OpenTelemetry telemetry module from adit-radis-shared
  • Initialize telemetry in manage.py and radis/asgi.py before Django loads
  • Conditionally add OTel logging handler in radis/settings/base.py when telemetry is active
  • Update adit-radis-shared dependency to openobserve branch (includes telemetry module)

Dependencies

Test plan

  • Verify uv sync resolves all dependencies correctly
  • Verify RADIS starts without telemetry when OTEL_EXPORTER_OTLP_ENDPOINT is not set
  • Verify telemetry initializes when OTEL_EXPORTER_OTLP_ENDPOINT is set
  • Run existing tests to confirm no regressions
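The two startup checks in this plan hinge on a single switch: whether OTEL_EXPORTER_OTLP_ENDPOINT is set. A minimal sketch of that gate, assuming activation is keyed purely on the variable being non-empty. `telemetry_enabled` is a hypothetical stand-in written for illustration; the real check lives in adit-radis-shared and may differ.

```python
import os
from typing import Mapping, Optional


def telemetry_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical gate: telemetry is on only when an OTLP endpoint is configured.

    Assumption: an unset, empty, or whitespace-only value means "off".
    """
    env = os.environ if environ is None else environ
    return bool(env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "").strip())


if __name__ == "__main__":
    # Reports the state for the current process environment.
    print("telemetry active:", telemetry_enabled())
```

Passing a plain mapping instead of os.environ keeps the check easy to exercise in a unit test without touching the real environment.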

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added observability and distributed tracing via OpenTelemetry
    • Configurable OTLP endpoint for telemetry export
    • Default exclusion of health/static paths from tracing
    • Included an example observability compose configuration
  • Chores

    • Updated a shared dependency reference for telemetry
    • Added telemetry-related environment example and .gitignore entry
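For the health/static exclusion, OTEL_PYTHON_DJANGO_EXCLUDED_URLS takes a comma-separated list of regular expressions. The sketch below approximates how such a list might be applied to request URLs. It is an illustration written for this thread, not the instrumentation's actual code, and the example value `health,static` is an assumption rather than the value shipped in docker-compose.base.yml.

```python
import re
from typing import Optional, Pattern


def parse_excluded_urls(raw: str) -> Optional[Pattern[str]]:
    """Turn a comma-separated list of regexes into one compiled alternation."""
    patterns = [part.strip() for part in raw.split(",") if part.strip()]
    if not patterns:
        return None
    return re.compile("|".join(patterns))


def is_excluded(url: str, compiled: Optional[Pattern[str]]) -> bool:
    """A URL is skipped when any of the configured patterns matches anywhere in it."""
    return compiled is not None and compiled.search(url) is not None


# Assumed example value; the real default is defined in the compose file.
EXCLUDED = parse_excluded_urls("health,static")
```

Because the patterns are searched rather than anchored, a short pattern such as `health` also matches unrelated URLs that merely contain that word, so tighter expressions may be preferable in practice.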

samuelvkwong and others added 2 commits January 31, 2026 00:21
Use the shared telemetry module from adit-radis-shared to set up
OpenTelemetry traces, metrics, and logs. Telemetry is initialized
in manage.py and asgi.py before Django loads, and the OTel logging
handler is added conditionally in settings.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Jan 31, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds OpenTelemetry initialization and logging integration across application entry points and settings, updates the adit-radis-shared dependency to the telemetry-enabled tag, and exposes OTLP configuration via Docker Compose and example environment files.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Entrypoints / Telemetry Init: manage.py, radis/asgi.py | Call setup_opentelemetry() early during process startup so tracing begins before Django initializes (manage and ASGI flows). |
| Settings / Logging: radis/settings/base.py | Import telemetry helpers and conditionally add the OpenTelemetry logging handler when telemetry is active. |
| Dependency: pyproject.toml | Update adit-radis-shared dependency tag from 0.19.1 to openobserve. |
| Compose & Env: docker-compose.base.yml, example.env | Add OTEL_PYTHON_DJANGO_EXCLUDED_URLS to shared compose env and add OTEL_EXPORTER_OTLP_ENDPOINT to example env (observability config). |
| Observability compose example: docker-compose.observability.yml.example | New example file defining x-observability anchor to attach services to openradx-observability network and inject OTLP endpoint env var. |
| Repository config: .gitignore | Add docker-compose.observability.yml to .gitignore as an opt-in file. |
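To make the "Settings / Logging" entry concrete, here is a rough sketch of conditionally attaching a handler to a dictConfig-style LOGGING dict. Everything in it is an assumption for illustration: `with_otel_handler` is a hypothetical helper (the shared module's add_otel_logging_handler may mutate the dict in place and use different keys), and the handler class path refers to the OpenTelemetry SDK's LoggingHandler, which is named here but never imported.

```python
import copy
from typing import Any, Dict


def with_otel_handler(logging_config: Dict[str, Any], active: bool) -> Dict[str, Any]:
    """Return a copy of LOGGING with an "otel" handler on the root logger when active."""
    config = copy.deepcopy(logging_config)
    if not active:
        # Telemetry is off: hand back the configuration unchanged.
        return config

    # Assumed handler definition; the key name "otel" is arbitrary.
    config.setdefault("handlers", {})["otel"] = {
        "class": "opentelemetry.sdk._logs.LoggingHandler",
        "level": "INFO",
    }
    root_handlers = config.setdefault("root", {}).setdefault("handlers", [])
    if "otel" not in root_handlers:
        root_handlers.append("otel")
    return config


BASE_LOGGING = {
    "version": 1,
    "handlers": {"console": {"class": "logging.StreamHandler"}},
    "root": {"handlers": ["console"], "level": "INFO"},
}
```

Returning a copy rather than editing the dict in place is a design choice of this sketch; it makes the on and off configurations easy to compare side by side.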

Sequence Diagram(s)

sequenceDiagram
    participant Entrypoint as Entrypoint\n(`manage.py`/ASGI)
    participant OTEL as OpenTelemetry\nSetup
    participant Django as Django\nFramework
    participant Logger as Django\nLogging
    participant OTLP as OTLP\nExporter

    Entrypoint->>OTEL: setup_opentelemetry()
    OTEL-->>Entrypoint: initialized

    Entrypoint->>Django: initialize application
    Django->>Logger: configure LOGGING
    note right of Logger: add_otel_logging_handler(LOGGING) if active

    Django->>Logger: emit log / trace
    Logger->>OTLP: handler forwards to OTLP exporter
    OTLP-->>Logger: send confirmation

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 I padded through startup, light and fleet,
Traces blooming where processes meet,
Logs now hop on shiny OTLP streams,
Observing dreams in observability gleams.
— a rabbit, tracing carrots 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Integrate OpenTelemetry from adit-radis-shared' directly and accurately summarizes the main objective of the pull request, which integrates OpenTelemetry telemetry initialization across multiple entry points (manage.py, asgi.py, settings) and updates the shared dependency. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@gemini-code-assist

Summary of Changes

Hello @samuelvkwong, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces robust observability capabilities by integrating OpenTelemetry into the application. It sets up the necessary infrastructure to collect telemetry data by initializing the tracing system at an early stage of application startup and configures logging to forward relevant data when telemetry is active. These changes are supported by updating the shared dependency module and incorporating new OpenTelemetry libraries, paving the way for better monitoring and debugging of the application's performance and behavior.

Highlights

  • OpenTelemetry Integration: The project now integrates OpenTelemetry for enhanced observability, allowing for the collection of traces, metrics, and logs.
  • Early Telemetry Initialization: OpenTelemetry is initialized early in the application's lifecycle, specifically in manage.py and radis/asgi.py, to ensure comprehensive tracing of all requests before Django fully loads.
  • Conditional Logging Handler: A new OpenTelemetry logging handler is conditionally added to the logging configuration in radis/settings/base.py, activating only when telemetry is enabled.
  • Dependency Updates: The adit-radis-shared dependency has been updated to reference the openobserve branch, and several new OpenTelemetry-related packages have been added to uv.lock to support the new telemetry features.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly integrates OpenTelemetry for tracing and logging by initializing it early in manage.py and radis/asgi.py. The logging configuration is also properly updated. My only concern is with the dependency on a git branch in pyproject.toml, which can lead to non-reproducible builds. I've suggested pinning to a specific commit hash to improve stability. Otherwise, the changes are solid.

pyproject.toml Outdated
 requires-python = ">=3.12,<4.0"
 dependencies = [
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@0.19.1",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",

high

Depending on a git branch (openobserve) can lead to unpredictable and non-reproducible builds, as the branch can be updated or deleted. It's much safer to pin to a specific commit hash or a tag.

Since the associated PR in adit-radis-shared is not yet merged, I recommend pinning to the specific commit hash from that branch for now. Once the shared library PR is merged and a new version is released, this should be updated to point to the new version tag.

Suggested change
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@fd2c783a389d2ea9c275b22a794a99f0fa2ba382",
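One way to keep a branch ref from slipping back in is a small check over the dependency strings. The sketch below is a hedged illustration, not part of this PR: `git_ref` and `is_pinned_to_commit` are hypothetical helpers, and they only recognise the `name @ git+URL@ref` shape used in pyproject.toml here. Branch names containing a slash are deliberately not handled.

```python
import re
from typing import Optional

# A full Git commit SHA-1 is 40 lowercase hexadecimal characters.
_FULL_SHA = re.compile(r"[0-9a-f]{40}")


def git_ref(requirement: str) -> Optional[str]:
    """Extract the ref after the final '@' of a 'name @ git+URL@ref' requirement."""
    _name, sep, url = requirement.partition(" @ ")
    if not sep or not url.startswith("git+"):
        return None
    _base, at, ref = url.rpartition("@")
    # A '/' after the last '@' means that '@' belonged to the URL itself
    # (for example user@host), so no ref was given.
    if not at or "/" in ref:
        return None
    return ref


def is_pinned_to_commit(requirement: str) -> bool:
    """True only when the ref is a full commit hash; branches and tags return False."""
    ref = git_ref(requirement)
    return ref is not None and _FULL_SHA.fullmatch(ref) is not None
```

A version tag such as 0.19.1 also reports False, since the check cannot tell a tag from a branch by name alone; a project that accepts release tags would need an extra rule for them.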


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@radis/asgi.py`:
- Around line 18-22: Remove the unused inline flake8 suppression comments by
deleting the trailing "# noqa: E402" on the import of setup_opentelemetry and
the import of get_asgi_application in radis/asgi.py; keep the calls/imports
(setup_opentelemetry() and get_asgi_application) unchanged, and only remove the
E402 suppressions (or enable E402 in config if you truly need to suppress it).
🧹 Nitpick comments (1)
pyproject.toml (1)

10-10: Prefer pinning adit-radis-shared to a commit/tag for reproducible builds.

A moving branch ref can introduce unreviewed changes into builds. Consider pinning to a commit SHA or a version tag once the dependency is stable.

♻️ Example pinning approach
-    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@openobserve",
+    "adit-radis-shared @ git+https://github.com/openradx/adit-radis-shared.git@<commit-sha>",

Comment on lines +18 to +22
from adit_radis_shared.telemetry import setup_opentelemetry # noqa: E402

setup_opentelemetry()

from django.core.asgi import get_asgi_application # noqa: E402

⚠️ Potential issue | 🟡 Minor

Remove unused # noqa: E402 directives (RUF100).

Ruff reports these as unused; drop them (or enable E402 if you truly need suppression).

🧹 Suggested cleanup
-from adit_radis_shared.telemetry import setup_opentelemetry  # noqa: E402
+from adit_radis_shared.telemetry import setup_opentelemetry
@@
-from django.core.asgi import get_asgi_application  # noqa: E402
+from django.core.asgi import get_asgi_application
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-from adit_radis_shared.telemetry import setup_opentelemetry  # noqa: E402
-setup_opentelemetry()
-from django.core.asgi import get_asgi_application  # noqa: E402
+from adit_radis_shared.telemetry import setup_opentelemetry
+setup_opentelemetry()
+from django.core.asgi import get_asgi_application
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 18-18: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


[warning] 22-22: Unused noqa directive (non-enabled: E402)

Remove unused noqa directive

(RUF100)


Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
example.env (1)

124-125: ⚠️ Potential issue | 🟡 Minor

Fix typo in proxy note.

Line 124: “Malke” → “Make”.

Suggested change
-# Malke sure to use .local in NO_PROXY as otherwise the communication with
+# Make sure to use .local in NO_PROXY as otherwise the communication with
🤖 Fix all issues with AI agents
In `@otel-collector-config.yaml`:
- Around line 36-41: The config currently sets otlphttp/openobserve
tls.insecure: true which disables TLS verification; change it so tls.insecure is
driven by an environment variable (e.g., OPENOBSERVE_TLS_INSECURE) with a secure
default of false, update the otlphttp/openobserve block to read the env var for
tls.insecure (falling back to false) and document that production should not set
it to true; ensure references to endpoint and headers remain unchanged and
validate the env parsing follows your config templating conventions.
🧹 Nitpick comments (2)
example.env (1)

112-120: Add a dev‑only warning for the default OpenObserve credentials.

Line 112‑119: the defaults are handy for local use but can be unintentionally reused outside dev. A short warning keeps expectations clear and reminds that OPENOBSERVE_AUTH_HEADER must match the root creds.

Suggested change
-# OpenObserve
+# OpenObserve (dev-only defaults — change for production)
 ZO_ROOT_USER_EMAIL="admin@localhost.com"
 ZO_ROOT_USER_PASSWORD="admin"
 OPENOBSERVE_DEV_PORT=5080

 # OpenTelemetry Configuration
 OPENOBSERVE_ENDPOINT=http://openobserve.local:5080/api/default
+# OPENOBSERVE_AUTH_HEADER should match ZO_ROOT_USER_* (Basic base64(user:password))
 OPENOBSERVE_AUTH_HEADER="Basic YWRtaW5AbG9jYWxob3N0LmNvbTphZG1pbg=="
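The relationship the added comment describes (Basic base64(user:password)) can be checked with a few lines of Python. This is a small sketch using only the standard library; `basic_auth_header` is a hypothetical helper name, and the credentials are the dev defaults from example.env shown above.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value from a user and password."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


if __name__ == "__main__":
    # Dev-only defaults from example.env; regenerate the header whenever they change.
    print(basic_auth_header("admin@localhost.com", "admin"))
```

Since base64 is an encoding and not encryption, anyone who can read the header can recover the password, which is why the defaults should stay confined to local development.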
docker-compose.base.yml (1)

63-71: Pin the OpenObserve image to a specific version instead of using latest.

Using latest makes deployments non-reproducible and risks unintended breaking changes. OpenObserve's documentation recommends pinning to a fixed release version (e.g., v0.50.2) in production environments. Use an environment variable for the version tag and document it in example.env to maintain reproducibility across environments.

Suggested change
-  openobserve:
-    image: public.ecr.aws/zinclabs/openobserve:latest
+  openobserve:
+    image: public.ecr.aws/zinclabs/openobserve:${OPENOBSERVE_IMAGE_TAG:?}

Comment on lines +36 to +41
  otlphttp/openobserve:
    endpoint: ${env:OPENOBSERVE_ENDPOINT}
    headers:
      Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
    tls:
      insecure: true

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

OpenTelemetry Collector otlphttp exporter tls.insecure configuration environment variables

💡 Result:

OpenTelemetry Collector doesn’t have a special “environment variable name” that directly maps to exporters.otlphttp.tls.insecure. Instead, you set it in Collector YAML and (optionally) substitute from an environment variable.

1) Use env-var substitution inside config.yaml

Collector configs support ${env:VAR} expansion. [1]

exporters:
  otlphttp:
    endpoint: ${env:OTLPHTTP_ENDPOINT}
    tls:
      insecure: ${env:OTLPHTTP_TLS_INSECURE}

Example shell:

export OTLPHTTP_ENDPOINT="https://otel-gateway.internal:4318"
export OTLPHTTP_TLS_INSECURE="true"
otelcol --config=/etc/otelcol/config.yaml

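tls.insecure: true is a standard Collector TLS client setting; it disables client transport security for the connection, whereas skipping certificate verification is the separate insecure_skip_verify option. [1][2]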

2) (Alternative) Put the whole config in an env var and load it

The Collector can read config from an environment variable via --config=env:.... [1]

Important note (common confusion)

OTEL_EXPORTER_OTLP_INSECURE is an SDK OTLP exporter environment variable, and the spec notes “insecure” applies to OTLP/gRPC behavior; OTLP/HTTP uses the URL scheme (http vs https) for transport security decisions. So don’t expect OTEL_EXPORTER_OTLP_INSECURE to configure the Collector’s otlphttp exporter TLS. [3]

Sources: Collector config + env expansion + TLS options [1], Collector TLS config package docs [2], OTLP exporter spec (insecure env var semantics) [3].


🌐 Web query:

OpenTelemetry Collector configuration environment variable substitution syntax

💡 Result:

OpenTelemetry Collector config supports environment variable substitution using ${...}:

  • Basic

    • ${env:VAR_NAME} (Collector docs example) [1]
    • ${VAR_NAME} is also considered valid by the OTel configuration data model spec (the env: prefix is optional) [2]
  • Default value (bash-style)

    • ${env:VAR_NAME:-default} (if VAR_NAME is unset/empty) [1][2]
  • Escaping a literal dollar sign

    • Use $$ to render a literal $ (no substitution) [1]

Notes:

  • Substitution applies to scalar values (not YAML mapping keys) [2]
  • If a variable is unset and no default is provided, it’s replaced with an empty value [2]

Example

processors:
  attributes/example:
    actions:
      - key: ${env:DB_KEY:-mydefault}
        action: ${env:OPERATION:-}
exporters:
  prometheus:
    namespace: $$DataVisualization

[1]

Sources: OpenTelemetry Collector configuration docs [1], OTel configuration data model spec (env substitution ABNF and rules) [2].
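The substitution rules summarized in this result can be mimicked in a few lines of Python. This is only a model of the documented behaviour (`${env:VAR}`, `${env:VAR:-default}`, and `$$`), not the Collector's implementation; the `expand` helper is hypothetical and written for illustration.

```python
import re
from typing import Mapping

# Matches either a literal "$$" or "${[env:]NAME[:-default]}".
_PATTERN = re.compile(
    r"\$\$|\$\{(?:env:)?([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}"
)


def expand(text: str, environ: Mapping[str, str]) -> str:
    """Expand environment references in a scalar config value."""

    def replace(match: "re.Match[str]") -> str:
        if match.group(0) == "$$":
            # "$$" escapes to a single literal dollar sign.
            return "$"
        name, default = match.group(1), match.group(2)
        value = environ.get(name, "")
        if value == "" and default is not None:
            # Unset or empty variable: fall back to the default.
            return default
        # Unset with no default yields an empty value.
        return value

    return _PATTERN.sub(replace, text)
```

With an empty environment, `insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false}` expands to `insecure: false`, which is the secure-by-default behaviour the review asks for.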


🌐 Web query:

OpenTelemetry Collector production TLS settings security recommendations

💡 Result:

Production TLS security recommendations for the OpenTelemetry Collector

  1. Enable TLS everywhere (receiver + exporter)

    • Configure receiver-side TLS with cert_file + key_file to encrypt inbound telemetry. [1]
    • Configure exporter-side TLS with ca_file (and optionally client cert_file + key_file). [1]
  2. Prefer mTLS for collector ingress (stronger authentication)

    • On receivers, require client certs with client_ca_file (this enforces “RequireAndVerifyClientCert”). [1][2]
  3. Do not use insecure TLS flags in production

    • Keep insecure: false (don’t disable TLS). [1][2]
    • Keep insecure_skip_verify: false (don’t skip cert verification). [1][2]
  4. Enforce modern TLS versions

    • Set min_version to "1.2" at minimum, and prefer "1.3" where possible. TLS 1.0/1.1 are deprecated and should be avoided. [2][3]
  5. Use safe cipher suites (mainly relevant to TLS 1.2)

    • If you must pin suites, use cipher_suites; otherwise the Collector uses a safe default list. [2][3]
  6. Plan certificate rotation

    • Use reload_interval so the Collector reloads certificates without a restart (helps with automated rotation). [1][2]
  7. Protect private keys and config secrets

    • Treat TLS private keys and config as sensitive; store them securely (encrypted storage/secret store) and consider environment-variable expansion for sensitive values. [4]

Minimal hardened examples

OTLP receiver (gRPC) with mTLS

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /etc/otel/server.crt
          key_file: /etc/otel/server.key
          client_ca_file: /etc/otel/clients_ca.crt
          min_version: "1.2"
          reload_interval: 1h

Based on the Collector TLS/mTLS options. [1][2]

OTLP exporter (gRPC) with server verification (and optional client cert)

exporters:
  otlp:
    endpoint: upstream-collector:4317
    tls:
      ca_file: /etc/otel/upstream_ca.crt
      min_version: "1.2"
      # optional for mTLS:
      # cert_file: /etc/otel/client.crt
      # key_file: /etc/otel/client.key

Client-side TLS configuration. [1][2]

Sources: OpenTelemetry Collector TLS configuration docs [1], Collector configtls reference (all TLS knobs, defaults, and version guidance) [2], Splunk OTel Collector TLS settings (same upstream options, includes TLS version/cipher guidance) [3], OpenTelemetry Collector config security best practices [4].


Avoid shipping tls.insecure: true as the default.

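Lines 40–41: insecure: true turns off client transport security for this exporter (skipping certificate verification is the separate insecure_skip_verify option) and is unsafe for production. The OpenTelemetry Collector documentation recommends keeping insecure: false for all production deployments. Gate this setting by environment variable with a secure default: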

Suggested change
   otlphttp/openobserve:
     endpoint: ${env:OPENOBSERVE_ENDPOINT}
     headers:
       Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
     tls:
-      insecure: true
+      insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-  otlphttp/openobserve:
-    endpoint: ${env:OPENOBSERVE_ENDPOINT}
-    headers:
-      Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
-    tls:
-      insecure: true
+  otlphttp/openobserve:
+    endpoint: ${env:OPENOBSERVE_ENDPOINT}
+    headers:
+      Authorization: ${env:OPENOBSERVE_AUTH_HEADER}
+    tls:
+      insecure: ${env:OPENOBSERVE_TLS_INSECURE:-false}

@claude

claude bot commented Feb 1, 2026

PR Review: OpenTelemetry Integration

Reviewed this PR - overall solid implementation. Details below:

Strengths

  1. Proper initialization timing (setup_opentelemetry before Django loads)
  2. Conditional activation via is_telemetry_active()
  3. Good infrastructure design (OpenObserve + OTel collector)
  4. Smart filtering of health/static logs

Issues

1. Security: Hardcoded Credentials (example.env:113-119)

  • Base64 admin creds in OPENOBSERVE_AUTH_HEADER
  • Add warning comments for production

2. Missing Health Checks (docker-compose.base.yml:63-84)

  • otel-collector depends on openobserve without health checks
  • Add service_healthy condition

3. Missing Dependencies (docker-compose.base.yml:42-55)

  • App services should depend on otel-collector
  • Add to web, default_worker, llm_worker

4. Default OTEL_EXPORTER_OTLP_ENDPOINT (docker-compose.base.yml:34)

  • Default may confuse production deployments
  • Consider empty default, set only in dev

5. No Resource Limits (docker-compose.base.yml:63-84)

  • Add memory/CPU limits guidance

6. Collector Config (otel-collector-config.yaml:11-14)

  • Add memory_limiter processor for high log volumes

7. Missing Documentation

  • Add observability section to CLAUDE.md
  • Document OpenObserve UI access and telemetry toggle

Testing

  • No tests for telemetry initialization or graceful degradation

Dependencies

  • pyproject.toml uses unmerged @openobserve branch
  • Update after adit-radis-shared#152 merges

Must Fix Before Merge

  1. Security warning in example.env
  2. Documentation in CLAUDE.md
  3. Note dependency update needed

Should Fix

  1. Health check dependencies
  2. memory_limiter processor
  3. App service dependencies

Approval

Almost ready after Must Fix items addressed. Core implementation follows OpenTelemetry best practices. Great work!

@claude

claude bot commented Feb 1, 2026

Code Review: OpenTelemetry Integration

Thanks for implementing OpenTelemetry observability in RADIS! This is a solid foundation for monitoring and tracing. Here's my detailed review:

✅ Strengths

  1. Proper initialization order: Correctly initializing telemetry before Django loads in both manage.py and asgi.py ensures all requests are instrumented
  2. Conditional activation: The is_telemetry_active() check allows graceful degradation when telemetry isn't configured
  3. Smart filtering: Excluding health checks and static files from logs reduces noise
  4. Comprehensive instrumentation: Includes Django, psycopg, and WSGI instrumentation
  5. Good separation of concerns: Telemetry logic is properly encapsulated in the shared library

🔍 Issues & Concerns

1. Dependency on unreleased branch (Critical)

The PR depends on adit-radis-shared@openobserve branch instead of a released version. This creates risks:

  • Unstable dependency that could change or disappear
  • CI/CD pipelines may fail if branch is force-pushed or deleted
  • Difficult for other developers to reproduce environments

Recommendation: Wait for adit-radis-shared#152 to be merged and released, then update to a tagged version

2. Security: Hardcoded credentials in example.env (High)

The base64-encoded auth header contains default credentials (admin@localhost.com:admin). While this is in example.env, developers might copy these to production.

Recommendations:

  • Add prominent comment warning these are development-only credentials
  • Consider using a placeholder or document password rotation best practices

3. Always-on services increase resource usage (Medium)

OpenObserve and otel-collector services are always started in base config, even when telemetry isn't needed. This adds significant memory overhead for development.

Recommendation: Move these services to a Docker profile so they only start when needed.

4. Missing health check configuration (Low)

The otel-collector has a health check endpoint at :13133 but no Docker health check is configured.

📊 Performance Considerations

  1. Batch processing is well-configured: 5s timeout and 1000 batch size are reasonable defaults
  2. Log filtering reduces overhead: Excluding health/static endpoints is good
  3. Consider adding trace sampling for high-traffic production environments
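On the sampling suggestion: ratio-based samplers typically decide from the trace ID itself, so every service that sees the same trace reaches the same verdict. The sketch below illustrates that idea with the standard library only. It is an assumption-laden illustration, not the SDK's code, and `should_sample` is a hypothetical name; in a real deployment the OpenTelemetry SDK's built-in ratio sampler would be the likely choice.

```python
# Only the lower 64 bits of the 128-bit trace ID take part in the decision.
TRACE_ID_LIMIT = (1 << 64) - 1


def should_sample(trace_id: int, ratio: float) -> bool:
    """Deterministically keep roughly `ratio` of all traces, based on the trace ID."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    # IDs whose lower 64 bits fall below this bound are kept.
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound
```

Because the decision depends only on the ID, no coordination between the web and worker processes is needed. With the SDK, the equivalent behaviour is usually selected through the OTEL_TRACES_SAMPLER and OTEL_TRACES_SAMPLER_ARG environment variables rather than custom code.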

🧪 Test Coverage

Missing: No tests for telemetry integration. Consider adding unit tests for graceful degradation when OTEL_EXPORTER_OTLP_ENDPOINT is unset and integration tests verifying telemetry initializes without errors.

📝 Documentation Gaps

  1. CLAUDE.md should document the new telemetry feature (how to enable/disable, access OpenObserve UI, troubleshooting)
  2. New environment variables missing from CLAUDE.md
  3. README should mention observability as a feature

🎯 Overall Assessment

This is a well-implemented feature with good architectural decisions. The main blocker is the dependency on an unreleased branch. Once that's resolved and documentation is added, this will be ready to merge.

The integration follows OpenTelemetry best practices and the collector configuration is production-ready. Great work!

samuelvkwong and others added 4 commits March 31, 2026 12:32
Remove OpenObserve and OTel collector services from RADIS's Docker
Compose stack. Telemetry is now sent directly to the centralized
collector via OTEL_EXPORTER_OTLP_ENDPOINT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the OTLP endpoint mapping from docker-compose.base.yml since
the observability stack is now centralized in openradx-observability.
Add docker-compose.observability.yml.example as an opt-in overlay
referencing the env variable, and ignore the active copy in .gitignore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Point OTEL_EXPORTER_OTLP_ENDPOINT to the collector on the shared
openradx-observability network instead of host.docker.internal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
samuelvkwong and others added 3 commits April 2, 2026 08:47
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@samuelvkwong samuelvkwong merged commit bd6c2ef into main Apr 2, 2026
2 checks passed
@samuelvkwong samuelvkwong deleted the openobserve branch April 2, 2026 11:27