Infra/mlo 3 hermetic workspaces docker#635
Conversation
Review Summary by QodoAdd hermetic Docker workspace and Makefile for standardized development
WalkthroughsDescription• Introduce hermetic Docker workspace with docker-compose for reproducible development • Add Makefile with standardized targets for build, test, and security operations • Create requirements.txt and requirements-dev.txt for dependency management • Update README with Docker setup and Makefile usage instructions • Add .dockerignore and persistent volume directories for clean builds Diagramflowchart LR
A["Dockerfile<br/>Python 3.12 + PyTorch"] --> B["docker-compose.yml<br/>Service Configuration"]
C["requirements.txt<br/>Runtime Dependencies"] --> A
D["requirements-dev.txt<br/>Dev Dependencies"] --> A
B --> E["Makefile<br/>Standardized Commands"]
E --> F["Local & Docker<br/>Workflows"]
G[".dockerignore<br/>Clean Build Context"] --> A
H["README.md<br/>Usage Documentation"] --> F
File Changes2. docker-compose.yml
|
Code Review by Qodo
1. Root-owned bind mount
|
📝 WalkthroughWalkthroughIntroduces a complete Docker-based development environment for NeuralDBG, including containerization files (Dockerfile, docker-compose.yml, .dockerignore), a Makefile with automation targets, project directory structure placeholders, updated dependencies (numpy, development packages), and documentation for Docker-based workflows. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip CodeRabbit can use your project's `ruff` configuration to improve the quality of Python code reviews.Add a Ruff configuration file to your project to customize how CodeRabbit runs |
|
| working_dir: /workspace | ||
| command: bash -lc "sleep infinity" | ||
| tty: true | ||
| stdin_open: true | ||
| volumes: | ||
| - ./:/workspace | ||
| - ./data:/data | ||
| - ./models:/models | ||
| - ./outputs:/outputs |
There was a problem hiding this comment.
1. Root-owned bind mount 🐞 Bug ⛯ Reliability
The dev container runs as root while bind-mounting ./ into /workspace, so running pytest/coverage inside Docker will create root-owned files on the host (e.g., .pytest_cache, htmlcov, .coverage) and can break local cleanup/editing without sudo. This is triggered directly by make test-docker / docker-compose run workflows.
Agent Prompt
### Issue description
The docker-compose dev workflow bind-mounts the repo into the container, but the container runs as root, causing root-owned artifacts to be created on the host when running tests/tools in Docker.
### Issue Context
This is especially problematic because the Makefile’s `test-docker` path runs `pytest` inside the container, which writes `.pytest_cache` and potentially other files to the bind-mounted workspace.
### Fix Focus Areas
- docker-compose.yml[8-16]
- Dockerfile[1-22]
- Makefile[51-52]
### Implementation notes
- Add a non-root user in the Dockerfile and set `USER` to it.
- Optionally accept build args for UID/GID and create the user with those IDs.
- Alternatively/additionally set `user:` in `docker-compose.yml` to `${UID}:${GID}` (with defaults) so the container runs with host-like permissions.
- Ensure `/data`, `/models`, `/outputs` remain writable for that user.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
Makefile (2)
69-71: Consider extending the clean target to remove__pycache__directories.The
.PHONYdeclaration and.dockerignorereference__pycache__, but the clean target doesn't remove it.Proposed fix
clean: rm -rf .pytest_cache htmlcov .mypy_cache rm -f .coverage .coverage.* + find . -type d -name "__pycache__" -exec rm -rf {} + 2>/dev/null || true🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 69 - 71, The Makefile's clean target doesn't remove __pycache__ directories referenced elsewhere (.PHONY and .dockerignore); update the clean target (symbol: clean) to also delete all __pycache__ directories across the tree (use a recursive find/remove approach rather than hardcoding paths for portability), so running make clean removes .pytest_cache, htmlcov, .mypy_cache, .coverage files and every __pycache__ directory; keep .PHONY unchanged but ensure the clean recipe uses a safe command that handles no-matches gracefully (e.g., a find-based removal).
54-56: Consider clarifying the coverage source path.The
--source=neuraldbgassumes the package is at the repository root. If the source structure changes, this path would need updating.Also, consider adding
--branchfor branch coverage if you want more thorough coverage metrics:- $(PYTHON) -m coverage run --source=neuraldbg -m pytest + $(PYTHON) -m coverage run --source=neuraldbg --branch -m pytest🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 54 - 56, The coverage Makefile target uses a hardcoded --source=neuraldbg which can break if the package layout or name changes; update the coverage invocation in the coverage target to reference a variable or discovered path (e.g., a PACKAGE or SRC_DIR make variable used elsewhere) instead of the literal "neuraldbg", and add the --branch flag to the coverage run command so branch coverage is measured; look for the coverage target in the Makefile and modify the line invoking "python -m coverage run --source=neuraldbg -m pytest" to use the variable (and include --branch) so the source path is easy to update and branch coverage is enabled.Dockerfile (1)
1-22: Consider adding a non-root user for improved security posture.Running containers as root, even for development, is a security best practice violation (Trivy DS-0002). While this is a dev container, adding a non-root user prevents accidental permission issues when files are created in mounted volumes.
Proposed fix
FROM python:3.12-slim ENV PIP_NO_CACHE_DIR=1 \ PYTHONDONTWRITEBYTECODE=1 \ PYTHONUNBUFFERED=1 +RUN useradd --create-home --shell /bin/bash devuser + WORKDIR /workspace COPY requirements.txt requirements-dev.txt ./ RUN apt-get update && \ apt-get install -y --no-install-recommends build-essential git && \ rm -rf /var/lib/apt/lists/* RUN python -m pip install --upgrade pip && \ python -m pip install --index-url https://download.pytorch.org/whl/cpu --extra-index-url https://pypi.org/simple torch && \ python -m pip install -r requirements.txt && \ python -m pip install -r requirements-dev.txt COPY . . +RUN chown -R devuser:devuser /workspace +USER devuser + CMD ["bash"]Note: The SonarCloud suggestion to merge RUN instructions is intentionally not followed here—separating apt-get and pip layers improves caching when only Python dependencies change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Dockerfile` around lines 1 - 22, Add a non-root user and switch to it after privileged setup: create a user/group (e.g., useradd/addgroup or adduser) with a deterministic uid/gid (e.g., 1000), set a HOME, chown the WORKDIR (/workspace) and any other needed directories to that user, and add a USER instruction to switch to that user before CMD; keep the privileged RUN steps that require root (apt-get, pip installs) earlier and perform the chown after COPY . . so the non-root user owns the workspace. Reference symbols: WORKDIR /workspace, RUN (apt-get / pip install layers), COPY . ., and add USER <username> plus the useradd/chown commands in a new RUN.docker-compose.yml (1)
1-1: Remove the deprecatedversionattribute.The
versionkey is obsolete in Docker Compose v2+ and is ignored. Removing it avoids confusion and aligns with current best practices.Proposed fix
-version: "3.9" - services:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose.yml` at line 1, Remove the deprecated top-level "version" key from docker-compose.yml: delete the line containing version: "3.9" and keep the remaining root keys (services, networks, volumes, etc.) at the file root so the compose file is valid for Compose v2+; ensure no other keys rely on the removed attribute and the file still begins directly with the services: block.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.dockerignore:
- Around line 1-16: Update the .dockerignore contents to prevent sensitive files
from being included in images by adding patterns for environment and secret
files; specifically add entries for ".env", ".env.*", and "secrets/" (and
optionally "secrets/**") so that COPY . . in the Dockerfile cannot accidentally
package these files.
In `@requirements-dev.txt`:
- Around line 1-5: Project lacks a deterministic lockfile causing
non-reproducible installs; generate and commit a lock file and update install
targets to use it. Create a lock (e.g., requirements.lock or constraints.txt or
use poetry.lock if using Poetry) by resolving exact versions (pip-compile or
poetry lock) and include hashes (pip --require-hashes or pip-compile
--generate-hashes); then update the install commands in Dockerfile and Makefile
to install from the lock (pip install -r requirements.lock or pip install
--constraint constraints.txt) instead of unbounded
requirements.txt/requirements-dev.txt so rebuilds are reproducible. Ensure
unique symbols to change: requirements-dev.txt, requirements.txt, generated lock
file name, Dockerfile install commands, and Makefile install target.
---
Nitpick comments:
In `@docker-compose.yml`:
- Line 1: Remove the deprecated top-level "version" key from docker-compose.yml:
delete the line containing version: "3.9" and keep the remaining root keys
(services, networks, volumes, etc.) at the file root so the compose file is
valid for Compose v2+; ensure no other keys rely on the removed attribute and
the file still begins directly with the services: block.
In `@Dockerfile`:
- Around line 1-22: Add a non-root user and switch to it after privileged setup:
create a user/group (e.g., useradd/addgroup or adduser) with a deterministic
uid/gid (e.g., 1000), set a HOME, chown the WORKDIR (/workspace) and any other
needed directories to that user, and add a USER instruction to switch to that
user before CMD; keep the privileged RUN steps that require root (apt-get, pip
installs) earlier and perform the chown after COPY . . so the non-root user owns
the workspace. Reference symbols: WORKDIR /workspace, RUN (apt-get / pip install
layers), COPY . ., and add USER <username> plus the useradd/chown commands in a
new RUN.
In `@Makefile`:
- Around line 69-71: The Makefile's clean target doesn't remove __pycache__
directories referenced elsewhere (.PHONY and .dockerignore); update the clean
target (symbol: clean) to also delete all __pycache__ directories across the
tree (use a recursive find/remove approach rather than hardcoding paths for
portability), so running make clean removes .pytest_cache, htmlcov, .mypy_cache,
.coverage files and every __pycache__ directory; keep .PHONY unchanged but
ensure the clean recipe uses a safe command that handles no-matches gracefully
(e.g., a find-based removal).
- Around line 54-56: The coverage Makefile target uses a hardcoded
--source=neuraldbg which can break if the package layout or name changes; update
the coverage invocation in the coverage target to reference a variable or
discovered path (e.g., a PACKAGE or SRC_DIR make variable used elsewhere)
instead of the literal "neuraldbg", and add the --branch flag to the coverage
run command so branch coverage is measured; look for the coverage target in the
Makefile and modify the line invoking "python -m coverage run --source=neuraldbg
-m pytest" to use the variable (and include --branch) so the source path is easy
to update and branch coverage is enabled.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9bbb6fb4-83b4-41dd-b22e-7c8dc8e9d2ea
📒 Files selected for processing (11)
.dockerignore.gitignoreDockerfileMakefileREADME.mddata/.gitkeepdocker-compose.ymlmodels/.gitkeepoutputs/.gitkeeprequirements-dev.txtrequirements.txt
| .git | ||
| .github | ||
| .venv | ||
| venv | ||
| __pycache__ | ||
| .pytest_cache | ||
| .coverage | ||
| .coverage.* | ||
| htmlcov | ||
| build | ||
| neuraldbg.egg-info | ||
| artifacts | ||
| data | ||
| models | ||
| outputs | ||
| *.log |
There was a problem hiding this comment.
Add .env and secrets patterns to prevent accidental inclusion in Docker images.
The .dockerignore file should exclude .env, .env.*, and secrets/ to prevent sensitive data from being copied into Docker images via the COPY . . instruction in the Dockerfile.
Proposed fix
data
models
outputs
*.log
+.env
+.env.*
+secrets/Based on learnings: "Certain files MUST be in .gitignore and NEVER committed publicly: ... .env, and API keys/tokens"
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| .git | |
| .github | |
| .venv | |
| venv | |
| __pycache__ | |
| .pytest_cache | |
| .coverage | |
| .coverage.* | |
| htmlcov | |
| build | |
| neuraldbg.egg-info | |
| artifacts | |
| data | |
| models | |
| outputs | |
| *.log | |
| .git | |
| .github | |
| .venv | |
| venv | |
| __pycache__ | |
| .pytest_cache | |
| .coverage | |
| .coverage.* | |
| htmlcov | |
| build | |
| neuraldbg.egg-info | |
| artifacts | |
| data | |
| models | |
| outputs | |
| *.log | |
| .env | |
| .env.* | |
| secrets/ |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.dockerignore around lines 1 - 16, Update the .dockerignore contents to
prevent sensitive files from being included in images by adding patterns for
environment and secret files; specifically add entries for ".env", ".env.*", and
"secrets/" (and optionally "secrets/**") so that COPY . . in the Dockerfile
cannot accidentally package these files.
| pytest>=8.0.0,<9.0.0 | ||
| coverage>=7.0.0,<8.0.0 | ||
| bandit>=1.7.0,<2.0.0 | ||
| safety>=3.0.0,<4.0.0 | ||
| pre-commit>=3.0.0,<4.0.0 |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify whether dependency resolution is deterministic or range-based.
echo "== Dependency manifests =="
fd 'requirements.*\.txt|constraints.*\.txt|poetry\.lock|Pipfile\.lock|uv\.lock'
echo
echo "== Version specifiers inside requirements files =="
rg -n '^[a-zA-Z0-9_.-]+(>=|<=|~=|>|<)' --glob 'requirements*.txt' --glob 'constraints*.txt'
echo
echo "== Install commands (Dockerfile / Makefile / scripts) =="
rg -n 'pip install .*requirements(-dev)?\.txt|--require-hashes|-c constraints' Dockerfile MakefileRepository: LambdaSection/NeuralDBG
Length of output: 749
Dependency resolution is not deterministic; add a lock file to guarantee reproducible builds.
The project lacks a lock mechanism (no poetry.lock, requirements.lock, or constraints file). While requirements-dev.txt uses bounded ranges (e.g., >=8.0.0,<9.0.0), requirements.txt uses unbounded ranges (numpy>=1.26.0 with no upper bound). For a truly hermetic workspace, install commands in the Dockerfile and Makefile should pin exact versions or resolve dependencies through a generated lock file with --require-hashes to ensure every rebuild is identical.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@requirements-dev.txt` around lines 1 - 5, Project lacks a deterministic
lockfile causing non-reproducible installs; generate and commit a lock file and
update install targets to use it. Create a lock (e.g., requirements.lock or
constraints.txt or use poetry.lock if using Poetry) by resolving exact versions
(pip-compile or poetry lock) and include hashes (pip --require-hashes or
pip-compile --generate-hashes); then update the install commands in Dockerfile
and Makefile to install from the lock (pip install -r requirements.lock or pip
install --constraint constraints.txt) instead of unbounded
requirements.txt/requirements-dev.txt so rebuilds are reproducible. Ensure
unique symbols to change: requirements-dev.txt, requirements.txt, generated lock
file name, Dockerfile install commands, and Makefile install target.
There was a problem hiding this comment.
4 issues found across 11 files
Confidence score: 3/5
- There is concrete user-facing risk in
requirements.txt:numpy>=1.26.0drops Python 3.8 support, so installs can fail for an environment the project still claims to support. - The
Makefilechange to hardcodeddocker-composemay break local workflows on systems that only provide Compose v2 (docker compose), which can block common dev/CI tasks. Dockerfileanddocker-compose.ymlintroduce medium operational risk: unpinned base/tooling weakens reproducibility, and missing host-user mapping can create root-owned files in a bind-mounted repo.- Pay close attention to
requirements.txt,Makefile,Dockerfile,docker-compose.yml- Python compatibility, compose command portability, reproducible builds, and file-permission side effects need validation before merge.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="Makefile">
<violation number="1" location="Makefile:4">
P2: Hardcoding `docker-compose` breaks the Make targets on hosts that only have the standard Compose v2 plugin.</violation>
</file>
<file name="requirements.txt">
<violation number="1" location="requirements.txt:1">
P1: `numpy>=1.26.0` drops Python 3.8 support, so installs will fail on a version this project still advertises as supported.</violation>
</file>
<file name="Dockerfile">
<violation number="1" location="Dockerfile:1">
P2: This workspace is not actually hermetic because the base image and critical Python tooling are left unpinned. A rebuild later can resolve different images or wheels and break reproducibility without any repo change.</violation>
</file>
<file name="docker-compose.yml">
<violation number="1" location="docker-compose.yml:12">
P2: Add a host-user mapping for the bind-mounted workspace; otherwise container writes will leave root-owned files in the repo.</violation>
</file>
Architecture diagram
sequenceDiagram
participant Dev as Developer
participant Make as Makefile
participant Compose as Docker Compose
participant Reg as PyTorch/PyPi Registry
participant FS as Host Filesystem
participant Cont as NeuralDBG Container
Note over Dev,Cont: NEW: Hermetic Development Lifecycle
rect rgb(23, 37, 84)
Note right of Dev: Build Phase
Dev->>Make: make build
Make->>Compose: build neuraldbg-dev
Compose->>Reg: NEW: Fetch Python 3.12-slim & Torch CPU
Reg-->>Compose: image layers
Compose->>FS: Copy source code & requirements
Compose-->>Dev: Image: neuraldbg-dev:latest
end
rect rgb(5, 46, 22)
Note right of Dev: Runtime Phase
Dev->>Make: make up
Make->>Compose: up -d
Compose->>FS: NEW: Bind mount source, /data, /models, /outputs
Compose->>Cont: Start container (sleep infinity)
Cont-->>Dev: Workspace ready
end
rect rgb(69, 26, 3)
Note right of Dev: Execution Phase (e.g., Testing)
Dev->>Make: make test-docker
Make->>Compose: run --rm neuraldbg-dev "pytest"
Compose->>Cont: Execute command
Cont->>FS: Read source & data volumes
Cont->>Cont: Run PyTorch-based logic
alt Success Path
Cont-->>Dev: 38 passed
else Failure Path (e.g., Coverage Gate)
Cont->>Cont: Run coverage report
opt Coverage < 60%
Cont-->>Dev: Exit Code 1 (Fail)
end
end
end
Dev->>Make: make down
Make->>Compose: down
Compose->>Cont: Stop & Remove
Note over FS: Persistent artifacts remain in ./data, ./models, ./outputs
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| @@ -0,0 +1 @@ | |||
| numpy>=1.26.0 | |||
There was a problem hiding this comment.
P1: numpy>=1.26.0 drops Python 3.8 support, so installs will fail on a version this project still advertises as supported.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At requirements.txt, line 1:
<comment>`numpy>=1.26.0` drops Python 3.8 support, so installs will fail on a version this project still advertises as supported.</comment>
<file context>
@@ -0,0 +1 @@
+numpy>=1.26.0
</file context>
| numpy>=1.26.0 | |
| numpy>=1.24.0,<1.26.0; python_version < "3.9" | |
| numpy>=1.26.0; python_version >= "3.9" |
| SHELL := /bin/bash | ||
|
|
||
| PYTHON := $(shell if [ -x .venv/bin/python ]; then echo .venv/bin/python; elif command -v python3 >/dev/null 2>&1; then echo python3; else echo python; fi) | ||
| COMPOSE := docker-compose |
There was a problem hiding this comment.
P2: Hardcoding docker-compose breaks the Make targets on hosts that only have the standard Compose v2 plugin.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At Makefile, line 4:
<comment>Hardcoding `docker-compose` breaks the Make targets on hosts that only have the standard Compose v2 plugin.</comment>
<file context>
@@ -0,0 +1,71 @@
+SHELL := /bin/bash
+
+PYTHON := $(shell if [ -x .venv/bin/python ]; then echo .venv/bin/python; elif command -v python3 >/dev/null 2>&1; then echo python3; else echo python; fi)
+COMPOSE := docker-compose
+
+.PHONY: help install install-dev up down build rebuild shell test test-docker coverage bandit safety security precommit clean
</file context>
| @@ -0,0 +1,22 @@ | |||
| FROM python:3.12-slim | |||
There was a problem hiding this comment.
P2: This workspace is not actually hermetic because the base image and critical Python tooling are left unpinned. A rebuild later can resolve different images or wheels and break reproducibility without any repo change.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At Dockerfile, line 1:
<comment>This workspace is not actually hermetic because the base image and critical Python tooling are left unpinned. A rebuild later can resolve different images or wheels and break reproducibility without any repo change.</comment>
<file context>
@@ -0,0 +1,22 @@
+FROM python:3.12-slim
+
+ENV PIP_NO_CACHE_DIR=1 \
</file context>
| command: bash -lc "sleep infinity" | ||
| tty: true | ||
| stdin_open: true | ||
| volumes: |
There was a problem hiding this comment.
P2: Add a host-user mapping for the bind-mounted workspace; otherwise container writes will leave root-owned files in the repo.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At docker-compose.yml, line 12:
<comment>Add a host-user mapping for the bind-mounted workspace; otherwise container writes will leave root-owned files in the repo.</comment>
<file context>
@@ -0,0 +1,16 @@
+ command: bash -lc "sleep infinity"
+ tty: true
+ stdin_open: true
+ volumes:
+ - ./:/workspace
+ - ./data:/data
</file context>


Summary
Introduce a hermetic Docker-based dev workspace for NeuralDBG and standardize local/dev commands with a Makefile.
Changes
Validation
Notes
Warnings come from PyTorch compile/deprecation and full backward hook behavior, not functional failures.
Summary by cubic
Adds a hermetic Docker-based dev workspace and a Makefile to standardize local workflows for NeuralDBG, addressing Linear MLO-3. Improves reproducibility and makes tests and security checks easy to run locally or in Docker.
New Features
torch; installs runtime and dev deps.docker-compose.ymldev service with bind-mounted workspace and persistent/data,/models,/outputs.bandit,safety), andpre-commit..dockerignoreto shrink build context;.gitignore+.gitkeepto track empty data/model/output dirs; README docs for quick start.Migration
make build && make up; open a shell:make shell.make test-docker(localmake teststill works)../data,./models,./outputsfor persistent artifacts via mounted volumes.Written for commit 7644db7. Summary will update on new commits.
Summary by CodeRabbit
New Features
Documentation
Chores