LCORE-1251: Added TLS E2E Tests #1413
jrobertboos wants to merge 1 commit into lightspeed-core:main
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
Walkthrough

Adds end-to-end TLS testing: a mock TLS inference server, compose changes to start and share TLS certificates, TLS-specific test configs, feature scenarios and step implementations to mutate/restore runtime config, and a dev dependency for certificate generation.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Tester as Test Runner
    participant Compose as Docker Compose
    participant Mock as mock-tls-inference
    participant Llama as llama-stack
    participant LS as lightspeed-stack
    Tester->>Compose: docker-compose up
    Compose->>Mock: build & start mock-tls-inference
    Mock-->>Compose: healthcheck OK (https://localhost:8443/health)
    Compose->>Llama: start llama-stack (wait healthy)
    Compose->>LS: start lightspeed-stack (wait healthy)
    Tester->>LS: send test request (HTTP)
    LS->>Llama: forward inference request
    Llama->>Mock: HTTPS call to tls-openai (/v1/chat/completions)
    Mock-->>Llama: deterministic TLS response
    Llama-->>LS: return inference result
    LS-->>Tester: final HTTP response (assert status/body)
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Pre-merge checks: ✅ 3 passed
I like how @max-svistunov used
tisnik left a comment:

PTAL at comments.
Also, certs can be generated on the fly - please look at @max-svistunov's PR.
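For on-the-fly generation, an OpenSSL-based helper is one option (the review later converged on trustme). This is a hedged sketch of what a runtime replacement for the committed certs/ directory might look like; the output file names mirror the PR's certs/ directory, while key size, validity, and subjects are assumptions:

```shell
#!/usr/bin/env bash
# Sketch: generate an ephemeral CA + server cert at test startup instead of
# committing key material. File names mirror the PR's certs/ directory;
# everything else (key size, validity period, subjects) is an assumption.
set -euo pipefail
dir="${1:-$(mktemp -d)}"

# 1. Self-signed CA (key + cert)
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout "$dir/ca.key" -out "$dir/ca.crt" -subj "/CN=e2e-test-ca"

# 2. Server key + CSR
openssl req -newkey rsa:2048 -nodes \
  -keyout "$dir/server.key" -out "$dir/server.csr" -subj "/CN=localhost"

# 3. Sign the CSR with the CA, adding a SAN for localhost
printf "subjectAltName=DNS:localhost,IP:127.0.0.1\n" > "$dir/san.cnf"
openssl x509 -req -in "$dir/server.csr" -days 1 \
  -CA "$dir/ca.crt" -CAkey "$dir/ca.key" -CAcreateserial \
  -extfile "$dir/san.cnf" -out "$dir/server.crt"

echo "ephemeral certs written to $dir"
```

Because the keys are regenerated per run, nothing sensitive lands in the repo, and the output directory can simply be added to .gitignore.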
tests/e2e/features/environment.py (outdated)

```python
switch_config(context.feature_config)
restart_container("lightspeed-stack")

if "TLS" in feature.tags:
```
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/features/environment.py`:
- Around line 591-598: After restoring TLS defaults with
switch_config(context.default_config_backup), also remove the backup and restart
the service: call remove_config_backup(context.default_config_backup) and clear
the attribute (e.g., delattr(context, "default_config_backup") or set it to
None), then if not context.is_library_mode call
restart_container("lightspeed-stack") so the restored file is actually reloaded;
keep the existing handling for run_yaml_backup
(switch_run_config/remove_config_backup) unchanged.
- Around line 534-542: The TLS feature block currently hardcodes Docker TLS
configs and doesn't work on Prow; modify the block that checks "TLS" in
feature.tags to detect the Prow environment (e.g., via an env var such as PROW
or a CI indicator) and short-circuit by skipping the feature when running on
Prow (call feature.skip or raise behave.exceptions.Skipped with a clear message)
instead of calling create_config_backup, switch_config or switch_run_config;
alternatively, if you prefer to support Prow, implement the missing Prow-side
TLS plumbing where switch_run_config would apply run.yaml variants—reference the
existing symbols create_config_backup, switch_config and switch_run_config to
locate the code to change.
In `@tests/e2e/mock_tls_inference_server/certs/ca.key`:
- Around line 1-52: This file contains a committed CA private key (ca.key) which
must be removed; update the TLS test fixture to generate ephemeral certs at test
runtime (e.g., use trustme or an OpenSSL-based helper in the mock TLS server
setup used by the mock_tls_inference_server tests), change the test setup code
that currently reads the static ca.key to instead create and load
runtime-generated key/certpair into the test server and client, ensure generated
artifacts are not committed by removing ca.key from the repo and adding runtime
output paths to .gitignore, and update any test/CI initialization helpers to
produce and clean up certs during test execution.
In `@tests/e2e/mock_tls_inference_server/Dockerfile`:
- Around line 1-6: The Dockerfile runs the mock TLS server as root; update it to
create and switch to a non-root user before CMD so the container runs with least
privilege: add steps to create a dedicated user/group (e.g., "appuser"), chown
the application files and certs (referencing the WORKDIR /app and files copied
by COPY server.py and COPY certs/), and add a USER appuser instruction before
CMD ["python", "server.py"] so the process runs as that non-root user.
In `@tests/e2e/mock_tls_inference_server/server.py`:
- Around line 64-69: The current handler silently swallows JSON errors and falls
back to {} then a default MODEL_ID which lets malformed/empty requests succeed;
change the except block so that when json.loads(body) raises
JSONDecodeError/UnicodeDecodeError you return an HTTP 400/422 error response (no
longer set request_data = {}) and do not proceed; after parsing, validate
request_data contains required completion fields (e.g., ensure
request_data.get("model") is present or that one of "input"/"prompt"/"messages"
exists) and if validation fails return a 400/422 error instead of defaulting
model = MODEL_ID; update the logic around request_data, model and any response
path in the request handler so malformed or empty payloads are rejected.
- Around line 105-115: The TLS server context currently doesn't pin protocol
versions; update _make_tls_context to accept optional min_version and
max_version parameters (e.g., of type ssl.TLSVersion or accepted string mapped
to ssl.TLSVersion) and set ctx.minimum_version and ctx.maximum_version
accordingly before returning. Ensure callers in tests (the run-tls-min-version
scenario) pass min_version=ssl.TLSVersion.TLSv1_3 (or equivalent) so the mock
server enforces the same TLS bounds as the client, and adjust any call sites to
supply these arguments.
In `@tests/e2e/utils/utils.py`:
- Around line 347-348: The function currently substitutes an empty string when
context.faiss_vector_store_id is missing, which hides configuration errors;
change it to fail fast by checking if the placeholder "{VECTOR_STORE_ID}"
appears in result and context.faiss_vector_store_id is falsy, and in that case
raise a clear AssertionError/ValueError (mentioning that FAISS vector store ID
is required) instead of returning result.replace with an empty string; otherwise
continue to replace using context.faiss_vector_store_id as before (referencing
result, context.faiss_vector_store_id, and the "{VECTOR_STORE_ID}" placeholder).
- Around line 180-181: The Prow branch in switch_run_config() currently returns
early and skips applying TLS run.yaml updates; modify switch_run_config to
detect Prow via is_prow_environment(), call update_config_configmap(source_path)
to push the updated run.yaml/configmap into the cluster, then return so
subsequent restart logic isn't run locally; target the switch_run_config
function and ensure update_config_configmap is invoked with the same source_path
used by switch_run_config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d22b2762-aa0d-4c94-9347-02f856f1c52c
📒 Files selected for processing (21)
- docker-compose-library.yaml
- docker-compose.yaml
- tests/e2e/configs/run-tls-min-version.yaml
- tests/e2e/configs/run-tls-mtls.yaml
- tests/e2e/configs/run-tls-verify-ca.yaml
- tests/e2e/configs/run-tls-verify-default.yaml
- tests/e2e/configs/run-tls-verify-false.yaml
- tests/e2e/configuration/library-mode/lightspeed-stack-tls.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-tls.yaml
- tests/e2e/features/environment.py
- tests/e2e/features/tls.feature
- tests/e2e/mock_tls_inference_server/Dockerfile
- tests/e2e/mock_tls_inference_server/certs/ca.crt
- tests/e2e/mock_tls_inference_server/certs/ca.key
- tests/e2e/mock_tls_inference_server/certs/client.crt
- tests/e2e/mock_tls_inference_server/certs/client.key
- tests/e2e/mock_tls_inference_server/certs/server.crt
- tests/e2e/mock_tls_inference_server/certs/server.key
- tests/e2e/mock_tls_inference_server/generate_certs.sh
- tests/e2e/mock_tls_inference_server/server.py
- tests/e2e/utils/utils.py
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY server.py .
COPY certs/ certs/
EXPOSE 8443 8444
CMD ["python", "server.py"]
```
Run the mock TLS server as a non-root user.
Container currently runs as root, which weakens container security posture and can fail policy checks.
Proposed fix

```diff
 FROM python:3.12-slim
 WORKDIR /app
+RUN useradd --create-home --uid 10001 appuser
 COPY server.py .
 COPY certs/ certs/
+RUN chown -R appuser:appuser /app
+USER appuser
 EXPOSE 8443 8444
 CMD ["python", "server.py"]
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```dockerfile
FROM python:3.12-slim
WORKDIR /app
RUN useradd --create-home --uid 10001 appuser
COPY server.py .
COPY certs/ certs/
RUN chown -R appuser:appuser /app
USER appuser
EXPOSE 8443 8444
CMD ["python", "server.py"]
```
🧰 Tools
🪛 Trivy (0.69.3)
[error] 1-1: Image user should not be 'root'
Specify at least 1 USER command in Dockerfile with non-root user as argument
Rule: DS-0002
(IaC/Dockerfile)
```python
try:
    request_data = json.loads(body.decode("utf-8"))
except (json.JSONDecodeError, UnicodeDecodeError):
    request_data = {}

model = request_data.get("model", MODEL_ID)
```
Reject empty or malformed completion payloads.
Falling back to {} and then defaulting model lets this mock return 200 even when the client stops sending a valid OpenAI request, so the TLS scenario can false-pass while request construction is broken.
🐛 Proposed fix

```diff
 try:
     request_data = json.loads(body.decode("utf-8"))
 except (json.JSONDecodeError, UnicodeDecodeError):
-    request_data = {}
+    self.send_error(400, "Invalid JSON body")
+    return

-model = request_data.get("model", MODEL_ID)
+if not isinstance(request_data, dict) or "model" not in request_data:
+    self.send_error(400, "Missing required field: model")
+    return
+
+model = request_data["model"]
```

📝 Committable suggestion
```python
try:
    request_data = json.loads(body.decode("utf-8"))
except (json.JSONDecodeError, UnicodeDecodeError):
    self.send_error(400, "Invalid JSON body")
    return

if not isinstance(request_data, dict) or "model" not in request_data:
    self.send_error(400, "Missing required field: model")
    return

model = request_data["model"]
```
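To see the stricter handler end to end, here is a self-contained sketch (plain HTTP for brevity; the PR's mock wraps the same handler logic in TLS, and MODEL_ID plus the handler name are placeholders, not the mock server's actual identifiers):

```python
import json
from http.server import BaseHTTPRequestHandler

MODEL_ID = "mock-model"  # placeholder; the real server defines its own


class StrictHandler(BaseHTTPRequestHandler):
    """Reject malformed or incomplete completion payloads with HTTP 400."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            request_data = json.loads(body.decode("utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            self.send_error(400, "Invalid JSON body")
            return
        if not isinstance(request_data, dict) or "model" not in request_data:
            self.send_error(400, "Missing required field: model")
            return
        payload = json.dumps({"model": request_data["model"]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep test output quiet
        pass
```

With this shape, a broken client that stops sending valid OpenAI-style requests surfaces as a 400 in the e2e run instead of a deceptive 200.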
Actionable comments posted: 2
♻️ Duplicate comments (3)
tests/e2e/features/environment.py (2)

Lines 595-602: ⚠️ Potential issue | 🟠 Major: Restart lightspeed-stack after restoring the TLS defaults.

This only restores files on disk. The running service keeps the TLS config loaded into later features, and this branch never removes context.default_config_backup.

♻️ Proposed fix

```diff
 if "TLS" in feature.tags:
     switch_config(context.default_config_backup)
     run_backup = getattr(context, "run_yaml_backup", None)
     if run_backup:
         switch_run_config(run_backup)
         remove_config_backup(run_backup)
     if not context.is_library_mode:
         restart_container("llama-stack")
+        restart_container("lightspeed-stack")
+    remove_config_backup(context.default_config_backup)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/features/environment.py` around lines 595 - 602, After restoring TLS defaults with switch_config(context.default_config_backup), also delete the backup (clear context.default_config_backup) and, if a run_backup was applied, remove_config_backup(run_backup) as before; finally restart the running service "lightspeed-stack" (not "llama-stack") when not in library mode by calling restart_container("lightspeed-stack") so the restored TLS files are picked up by the running service. Ensure the branch still checks getattr(context, "run_yaml_backup", None) and uses switch_run_config(run_backup) prior to remove_config_backup(run_backup).
Lines 538-545: ⚠️ Potential issue | 🟠 Major: Guard @TLS in Prow.

This branch assumes Docker-side config swapping, but switch_run_config() is a no-op in tests/e2e/utils/utils.py on Prow. Without an explicit skip or separate Prow plumbing, the @TLS *Config scenarios silently exercise the wrong run.yaml there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/features/environment.py` around lines 538 - 545, The TLS branch in environment.py assumes Docker-side config swapping but on Prow switch_run_config() is a no-op, causing `@TLS` scenarios to exercise the wrong run.yaml; update the TLS handling (the block that sets context.feature_config, context.default_config_backup via create_config_backup, context.run_yaml_backup, and calls switch_config) to detect Prow (e.g., an existing PROW env var or a context.is_prow flag) and either skip TLS scenarios when running on Prow or call an alternative plumbing path; specifically, guard the TLS branch so it does not perform config swaps on Prow (or explicitly skip the scenario) and ensure any calls to switch_run_config()/switch_config() are only executed when not on Prow.

tests/e2e/mock_tls_inference_server/Dockerfile (1)

Lines 1-14: ⚠️ Potential issue | 🟠 Major: Still running as root.

Add a non-root USER before CMD; otherwise this container keeps broader-than-needed privileges for the test server.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/mock_tls_inference_server/Dockerfile` around lines 1 - 14, The Dockerfile is still running as root; add a dedicated non-root user and switch to it before CMD to reduce privileges: create a user (e.g., adduser or useradd with a fixed UID/GID like 1000), ensure ownership of copied artifacts and the /certs directory is changed (chown) so that the new user can access server.py and /certs, perform package installs and file copies as root first, then add a USER instruction (referencing USER and CMD in the Dockerfile) to run the container as that non-root user.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a26de868-fde3-4fe4-a391-b36f163c1457
⛔ Files ignored due to path filters (1)
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
- docker-compose-library.yaml
- docker-compose.yaml
- pyproject.toml
- src/app/endpoints/responses.py
- tests/e2e/configs/run-tls-min-version.yaml
- tests/e2e/configs/run-tls-mtls.yaml
- tests/e2e/configs/run-tls-verify-ca.yaml
- tests/e2e/configs/run-tls-verify-default.yaml
- tests/e2e/configs/run-tls-verify-false.yaml
- tests/e2e/configuration/library-mode/lightspeed-stack-tls.yaml
- tests/e2e/configuration/server-mode/lightspeed-stack-tls.yaml
- tests/e2e/features/environment.py
- tests/e2e/features/tls.feature
- tests/e2e/mock_tls_inference_server/Dockerfile
- tests/e2e/mock_tls_inference_server/server.py
- tests/e2e/test_list.txt
- tests/e2e/utils/utils.py
✅ Files skipped from review due to trivial changes (12)
- pyproject.toml
- src/app/endpoints/responses.py
- tests/e2e/configuration/server-mode/lightspeed-stack-tls.yaml
- tests/e2e/test_list.txt
- tests/e2e/configuration/library-mode/lightspeed-stack-tls.yaml
- tests/e2e/features/tls.feature
- tests/e2e/configs/run-tls-mtls.yaml
- tests/e2e/configs/run-tls-verify-false.yaml
- tests/e2e/configs/run-tls-min-version.yaml
- tests/e2e/configs/run-tls-verify-ca.yaml
- tests/e2e/mock_tls_inference_server/server.py
- tests/e2e/configs/run-tls-verify-default.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/e2e/utils/utils.py
```yaml
ports:
  - "8443:8443"
  - "8444:8444"
```
Don't publish the mock TLS ports to the host here either.
The library-mode stack reaches mock-tls-inference over the compose network, so these host bindings only make unrelated local runs fail when 8443 or 8444 is already occupied.
✂️ Proposed fix

```diff
-    ports:
-      - "8443:8443"
-      - "8444:8444"
+    expose:
+      - "8443"
+      - "8444"
```

📝 Committable suggestion
```yaml
expose:
  - "8443"
  - "8444"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker-compose-library.yaml` around lines 125 - 127, Remove the unnecessary
host port bindings from the mock TLS service so the library-mode stack uses the
compose network only; locate the mock-tls-inference service (look for the
service name "mock-tls-inference") and delete or comment out the ports block
entries "- "8443:8443"" and "- "8444:8444"" (or remove the entire ports: section
if it becomes empty) so the service is reachable only via the compose network
and does not publish ports to the host.
```yaml
ports:
  - "8443:8443"
  - "8444:8444"
```
Don't publish the mock TLS ports to the host.
Everything here talks to mock-tls-inference over lightspeednet, and the healthcheck runs in-container, so binding 8443/8444 on the host only adds avoidable startup failures when those ports are already in use.
✂️ Proposed fix

```diff
-    ports:
-      - "8443:8443"
-      - "8444:8444"
+    expose:
+      - "8443"
+      - "8444"
```

📝 Committable suggestion
```yaml
expose:
  - "8443"
  - "8444"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docker-compose.yaml` around lines 153 - 155, Remove the host port bindings
"8443:8443" and "8444:8444" from the docker-compose service that talks to
mock-tls-inference over lightspeednet (the entries under ports currently mapping
8443 and 8444), so the service only exposes those ports internally to the
compose network; keep the internal container ports as needed for in-container
healthcheck and inter-service communication but do not publish them to the host
to avoid startup conflicts.
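Under that suggestion, the mock service stanza would look roughly as follows. This is a sketch: the build path, healthcheck probe, and timing values are assumptions (the healthcheck URL comes from the sequence diagram in the walkthrough), not the PR's exact compose file:

```yaml
  mock-tls-inference:
    build: tests/e2e/mock_tls_inference_server   # assumed build context
    expose:          # reachable on the compose network only, not the host
      - "8443"
      - "8444"
    healthcheck:
      # The mock serves HTTPS with a self-signed chain, so the probe has to
      # skip verification; the exact probe command is an assumption.
      test: ["CMD", "python", "-c", "import ssl, urllib.request; urllib.request.urlopen('https://localhost:8443/health', context=ssl._create_unverified_context())"]
      interval: 5s
      retries: 5
```

With expose instead of ports, other services on lightspeednet still reach the mock by service name, while host-port collisions on 8443/8444 disappear.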
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/e2e/features/steps/tls.py (1)

Lines 122-235: Consider extracting repeated config initialization into a helper.

All five TLS configuration steps share identical boilerplate for ensuring nested dict keys exist before setting TLS options. This could be consolidated into a helper function.

Proposed refactor

```python
def _get_tls_config_section(provider: dict) -> dict:
    """Ensure provider has config.network.tls structure and return tls dict reference.

    Parameters:
        provider: The provider configuration dictionary.

    Returns:
        Reference to the tls config dict (creates nested structure if needed).
    """
    if "config" not in provider:
        provider["config"] = {}
    if "network" not in provider["config"]:
        provider["config"]["network"] = {}
    if "tls" not in provider["config"]["network"]:
        provider["config"]["network"]["tls"] = {}
    return provider["config"]["network"]


@given("Llama Stack is configured with TLS verification disabled")
def configure_tls_verify_false(context: Context) -> None:
    """Configure run.yaml with TLS verify: false."""
    _backup_llama_config()
    config = _load_llama_config()
    provider = _find_tls_openai_provider(config)
    _get_tls_config_section(provider)["tls"] = {"verify": False}
    _write_config(config, _LLAMA_STACK_CONFIG)
```

Apply the same pattern to the other four configuration steps.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/e2e/features/environment.py`:
- Around line 502-509: The TLS feature handler sets context.feature_config and
calls create_config_backup("lightspeed-stack.yaml") and
switch_config(context.feature_config) but does not call
restart_container("lightspeed-stack"), so the lightspeed-stack container won't
pick up the new TLS config; after switch_config(context.feature_config) add a
call to restart_container("lightspeed-stack") (matching how other tag handlers
like Authorized/RBAC use restart_container) to ensure the container is restarted
and the TLS configuration is applied.
---
Nitpick comments:
In `@tests/e2e/features/steps/tls.py`:
- Around line 122-235: Multiple step handlers (configure_tls_verify_false,
configure_tls_verify_ca, configure_tls_verify_true, configure_tls_mtls,
configure_tls_min_version) repeat the same nested dict setup for
provider["config"]["network"]["tls"]; add a helper function (e.g.,
_get_tls_config_section(provider)) that ensures provider["config"],
provider["config"]["network"], and provider["config"]["network"]["tls"] exist
and returns the tls dict (or network dict reference) and then replace the
repeated if-blocks in each step to call that helper and assign the TLS fields
(keep the configure_tls_mtls change to provider["config"]["base_url"] as-is),
then call _write_config as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7697c5ae-3f60-4820-ae07-8770eefacf4e
📒 Files selected for processing (3)
- tests/e2e/features/environment.py
- tests/e2e/features/steps/tls.py
- tests/e2e/features/tls.feature
✅ Files skipped from review due to trivial changes (1)
- tests/e2e/features/tls.feature
🧹 Nitpick comments (1)
tests/e2e/features/steps/tls.py (1)

Lines 153-242: Consider extracting the repeated nested structure setup.

The pattern for ensuring the nested config.network structure appears in all five TLS configuration functions:

```python
if "config" not in provider:
    provider["config"] = {}
if "network" not in provider["config"]:
    provider["config"]["network"] = {}
```

This could be consolidated into a helper that returns the prepared provider or merges into _ensure_tls_provider. Given your comment about future trustme refactoring, this is low priority, but it would reduce repetition and simplify each step function to focus on its specific TLS settings.

♻️ Optional: Extract network config setup

```diff
+def _ensure_network_config(provider: dict) -> dict:
+    """Ensure provider has config.network structure and return the network dict."""
+    provider.setdefault("config", {}).setdefault("network", {})
+    return provider["config"]["network"]
+
+
 @given("Llama Stack is configured with TLS verification disabled")
 def configure_tls_verify_false(context: Context) -> None:
     _backup_llama_config()
     config = _load_llama_config()
     provider = _ensure_tls_provider(config)
-
-    if "config" not in provider:
-        provider["config"] = {}
-    if "network" not in provider["config"]:
-        provider["config"]["network"] = {}
-
-    provider["config"]["network"]["tls"] = {"verify": False}
+    network = _ensure_network_config(provider)
+    network["tls"] = {"verify": False}
     _write_config(config, _LLAMA_STACK_CONFIG)
```

Apply similar simplification to the other four TLS configuration functions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/features/steps/tls.py` around lines 153 - 242, The TLS step functions (configure_tls_verify_false, configure_tls_verify_ca, configure_tls_verify_true, configure_tls_mtls) all repeat the same nested setup for provider["config"]["network"]; extract this into a small helper (e.g., ensure_provider_network(provider) or extend _ensure_tls_provider) that ensures provider["config"] and provider["config"]["network"] exist and returns the provider, then replace the repeated if-blocks in each step with a single call to that helper so each step only sets the tls-specific fields and base_url.
@jrobertboos you need to rebase + resolve conflicts

Force-pushed from d4d67ea to b8daafc

@tisnik rebased and resolved conflicts :)
(lcore-1251) fixed TLS tests & removed other e2e tests for quicker test running

(lcore-1251) restored test_list.txt

(lcore-1251) use `trustme` for certs

(lcore-1251) quick TLS server fix

(lcore-1251) removed tags in place of steps

(fix) removed unused code

fix TLS config

verified correct LLM response

clean

LCORE-1253: Add e2e proxy and TLS networking tests

Add comprehensive end-to-end tests verifying that Llama Stack's NetworkConfig (proxy, TLS) works correctly through the Lightspeed Stack pipeline.

Test infrastructure:
- TunnelProxy: Async HTTP CONNECT tunnel proxy that creates TCP tunnels for HTTPS traffic. Tracks the CONNECT count and target hosts.
- InterceptionProxy: Async TLS-intercepting (MITM) proxy using a trustme CA to generate per-target server certificates. Simulates corporate SSL-inspection proxies.

Behave scenarios (tests/e2e/features/proxy.feature):
- Tunnel proxy: Configures run.yaml with a NetworkConfig proxy pointing to a local tunnel proxy. Verifies that a CONNECT to api.openai.com:443 is observed and the LLM query succeeds through the proxy.
- Interception proxy: Configures run.yaml with a proxy and a custom CA cert (trustme). Verifies TLS interception of api.openai.com traffic and a successful LLM query through the MITM proxy.
- TLS version: Configures run.yaml with min_version TLSv1.2 and verifies the LLM query succeeds under the TLS constraint.

Each scenario dynamically generates a modified run-ci.yaml with the appropriate NetworkConfig, restarts Llama Stack with the new config, restarts the Lightspeed Stack, and sends a query to verify the full pipeline.

Added trustme>=1.2.1 to dev dependencies.
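A CONNECT tunnel proxy of this shape can be sketched with asyncio. This is an illustrative sketch, not the PR's actual TunnelProxy API — names such as `handle_client` and `seen_targets` are assumptions:

```python
import asyncio


def parse_connect_target(request_line: bytes) -> tuple[str, int]:
    """Parse b'CONNECT host:port HTTP/1.1' into (host, port).

    Raises ValueError on a malformed method, target, or port; the proxy
    can turn that into a 400 response instead of crashing.
    """
    parts = request_line.split(b" ")
    if len(parts) != 3 or parts[0] != b"CONNECT":
        raise ValueError("not a CONNECT request")
    host, sep, port = parts[1].rpartition(b":")
    if not sep or not host:
        raise ValueError("missing host:port target")
    return host.decode(), int(port)  # int() raises ValueError on a bad port


async def handle_client(reader, writer, seen_targets: list[str]) -> None:
    """Accept one CONNECT request, open a TCP tunnel, and relay bytes both ways."""
    request = await reader.readuntil(b"\r\n\r\n")
    try:
        host, port = parse_connect_target(request.split(b"\r\n", 1)[0])
    except ValueError:
        writer.write(b"HTTP/1.1 400 Bad Request\r\n\r\n")
        await writer.drain()
        writer.close()
        return
    try:
        # Timeout guards against stalls on unreachable targets.
        up_reader, up_writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout=10
        )
    except (OSError, asyncio.TimeoutError):
        writer.write(b"HTTP/1.1 502 Bad Gateway\r\n\r\n")
        await writer.drain()
        writer.close()
        return
    seen_targets.append(f"{host}:{port}")  # lets a test assert the CONNECT happened
    writer.write(b"HTTP/1.1 200 Connection Established\r\n\r\n")
    await writer.drain()

    async def pump(src, dst):
        # Relay until the source closes, then close the sink.
        while data := await src.read(65536):
            dst.write(data)
            await dst.drain()
        dst.close()

    await asyncio.gather(pump(reader, up_writer), pump(up_reader, writer))
```

Serving this handler via `asyncio.start_server` on a loop in a background thread gives the test a proxy whose `seen_targets` list can be asserted against `api.openai.com:443`.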
LCORE-1253: Add negative tests, TLS/cipher scenarios, and cleanup hooks

Expand proxy e2e test coverage to fully address all acceptance criteria:

AC1 (tunnel proxy):
- Add negative test: LLM query fails gracefully when the proxy is unreachable

AC2 (interception proxy with CA):
- Add negative test: LLM query fails when the interception proxy CA is not provided (verifies "only successful when the correct CA is provided")

AC3 (TLS version and ciphers):
- Add TLSv1.3 minimum-version scenario
- Add custom cipher-suite configuration scenario (ECDHE+AESGCM:DHE+AESGCM)

Test infrastructure:
- Add an after_scenario cleanup hook in environment.py that restores the original Llama Stack and Lightspeed Stack configs after @Proxy scenarios, preventing config leaks between scenarios.
- Use a different port for each interception proxy instance to avoid address-already-in-use errors in sequential scenarios.

Documentation:
- Update docs/e2e_scenarios.md with all 7 proxy test scenarios.
- Update docs/e2e_testing.md with proxy-related Behave tags (@Proxy, @tunnelproxy, @InterceptionProxy, @TLSVersion, @tlscipher).

LCORE-1253: Address review feedback

Changes requested by reviewer (tisnik) and CodeRabbit:
- Detect Docker mode once in before_all and store it as context.is_docker_mode. All proxy step functions now use the context attribute instead of calling _is_docker_mode() repeatedly.
- Log the exception in _restore_original_services instead of silently swallowing it.
- Only clear context.services_modified on successful restoration, not when restoration fails (prevents leaking modified state).
- Add a 10-second timeout to the tunnel proxy's open_connection to prevent stalls on unreachable targets.
- Handle a malformed CONNECT port with a ValueError catch and a 400 response.

LCORE-1253: Replace tag-based cleanup with Background restore step

Move config restoration from the @Proxy after_scenario hook to an explicit Background Given step.
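The TLS constraints in AC3 map onto the standard library's `ssl` module. A sketch of how a client-side context might enforce them — the minimum version and cipher string come from the scenarios, but the function itself is an assumption, not the stack's actual wiring:

```python
import ssl
from typing import Optional


def make_client_context(
    min_version: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2,
    ciphers: Optional[str] = None,
    ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client SSLContext matching a NetworkConfig-style TLS section."""
    # ca_file lets the interception-proxy scenario trust the trustme CA.
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.minimum_version = min_version
    if ciphers:
        # OpenSSL cipher string, e.g. the scenario's "ECDHE+AESGCM:DHE+AESGCM".
        ctx.set_ciphers(ciphers)
    return ctx


ctx = make_client_context(ssl.TLSVersion.TLSv1_2, "ECDHE+AESGCM:DHE+AESGCM")
```

Without a matching `cafile`, a handshake against the MITM proxy fails certificate verification — which is exactly what the negative AC2 scenario asserts at the HTTP layer.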
This follows the team convention that tags are used only for test selection (filtering), not for triggering behavior.

The Background step "The original Llama Stack config is restored if modified" runs before every scenario. If a previous scenario left a modified run.yaml (detected by backup-file existence), it restores the original and restarts the services. This handles cleanup even when the previous scenario failed midway.

Removed:
- the @Proxy tag from the feature file (it was triggering the after_scenario hook)
- the after_scenario hook for @Proxy in environment.py
- the _restore_original_services function (replaced by the Background step)
- context.services_modified tracking (no hook reads it)

Updated docs/e2e_testing.md: tags are documented as selection-only, not behavior-triggering.

LCORE-1253: Address radofuchs review feedback

Rewrite the proxy e2e tests to follow project conventions:
- Reuse existing step definitions: use "I use query to ask question" from llm_query_response.py and "The status code of the response is" from common_http.py instead of custom query/response steps.
- Split the service restart into two explicit Given steps, "Llama Stack is restarted" and "Lightspeed Stack is restarted", so the restart ordering is visible in the feature file.
- Remove the local (non-Docker) code path. Proxy tests use restart_container() exclusively, consistent with the rest of the e2e test suite.
- Check the specific status code 500 for error scenarios instead of the broad >= 400 range.
- Remove the custom send_query, verify_llm_response, and verify_error_response steps that duplicated existing functionality.

Net reduction: -183 lines from the step definitions.

LCORE-1253: Clean up proxy servers between scenarios

Stop proxy servers and their event loops explicitly in the Background restore step. Previously, proxy daemon threads were left running after each scenario, causing asyncio "Task was destroyed but it is pending" warnings at process exit.
The _stop_proxy helper schedules an async stop on the proxy's event loop, waits for it to complete, then stops the loop. Context references are cleared so the next scenario starts clean.

LCORE-1253: Stop proxy servers after last scenario in after_feature

Add proxy cleanup in after_feature to stop proxy servers left running from the last scenario. The Background restore step handles cleanup between scenarios, but the last scenario's proxies persist until process exit, causing asyncio "Task was destroyed" warnings.

The cleanup checks for proxy objects on the context (no tag check needed) and calls _stop_proxy to gracefully shut down the event loops.

fixed dup steps

addressed comments
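The shutdown sequence described above — schedule an async stop on the proxy's loop, wait for it, then stop the loop — can be sketched as follows. The `MockProxy` class is a minimal stand-in, not the PR's proxy implementation:

```python
import asyncio
import threading


class MockProxy:
    """Stand-in for a proxy: an asyncio server on a loop in a daemon thread."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self.server = None
        self.ready = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        asyncio.set_event_loop(self.loop)
        self.server = self.loop.run_until_complete(
            asyncio.start_server(self._echo, "127.0.0.1", 0)
        )
        self.ready.set()
        self.loop.run_forever()

    async def _echo(self, reader, writer) -> None:
        writer.write(await reader.read(1024))
        await writer.drain()
        writer.close()

    async def _astop(self) -> None:
        if self.server is not None:
            self.server.close()
            await self.server.wait_closed()


def stop_proxy(proxy: MockProxy, timeout: float = 5.0) -> None:
    """Schedule the async stop on the proxy's loop, wait, then stop the loop."""
    future = asyncio.run_coroutine_threadsafe(proxy._astop(), proxy.loop)
    future.result(timeout)                      # wait for the server to close
    proxy.loop.call_soon_threadsafe(proxy.loop.stop)
    proxy.thread.join(timeout)                  # loop stopped -> thread exits
```

Closing the server before stopping the loop is what avoids the "Task was destroyed but it is pending" warnings: pending connection tasks finish (or are cancelled by the close) while the loop is still running.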
PR Reviewer Guide 🔍

Here are some key observations to aid the review process:
Description
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Tests
Chores