Add staging environment for testing worker scaling on Docker Swarm#315

Open
samuelvkwong wants to merge 113 commits into main from add-staging-environment

Conversation

@samuelvkwong (Collaborator) commented Mar 25, 2026

Summary

Adds a staging deployment environment that enables testing worker scaling and Docker Swarm deployment features while maintaining development settings (DEBUG=True, debug toolbar, etc.).

Motivation

Currently, only production can use Docker Swarm for scaling workers. This PR enables developers to:

  • Test worker scaling locally before deploying to production
  • Test Docker Swarm deployment procedures
  • Validate multi-worker concurrent processing behavior
  • All while keeping dev-friendly settings (debug mode, detailed logging, etc.)

Changes

New Files

  • docker-compose.staging.yml - Swarm deployment config with dev settings
  • STAGING_DEPLOYMENT.md - Comprehensive deployment and usage guide

Modified Files

  • example.env - Added staging port variables to avoid conflicts

Key Features

  • Dev settings on Swarm: DEBUG=True, debug toolbar, console email
  • Different ports: runs on localhost:8001 (dev uses 8000)
  • Pre-scaled workers: starts with 3 mass_transfer_workers by default
  • No config needed: uses existing .env file
  • All three can run together: dev, staging, and production simultaneously

Port Allocation

No conflicts - all three environments can run at the same time:

| Service        | Dev   | Staging | Production |
|----------------|-------|---------|------------|
| Web UI         | 8000  | 8001    | 80, 443    |
| PostgreSQL     | 5432  | 5433    | (internal) |
| DICOM Receiver | 11122 | 11123   | 11112      |
| Orthanc1       | 7501  | 7503    | (internal) |
| Orthanc2       | 7502  | 7504    | (internal) |
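For illustration, the staging port remaps above might look roughly like this in the compose override. This is a hedged sketch only; the service names (`web`, `postgres`, `receiver`) are hypothetical and the actual `docker-compose.staging.yml` in this PR may be structured differently:

```yaml
# Hypothetical sketch of staging port remapping (service names illustrative)
services:
  web:
    ports:
      - "8001:8000"    # staging Web UI; dev binds 8000 directly
  postgres:
    ports:
      - "5433:5432"    # staging PostgreSQL, avoiding the dev 5432
  receiver:
    ports:
      - "11123:11112"  # staging DICOM receiver
```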

Usage

```shell
# Deploy staging (uses existing .env)
docker swarm init
docker stack deploy -c docker-compose.base.yml -c docker-compose.staging.yml adit-staging

# Access at http://localhost:8001

# Scale workers
docker service scale adit-staging_mass_transfer_worker=5

# Cleanup
docker stack rm adit-staging
```

Testing Checklist

  • Verify staging deploys successfully
  • Confirm accessible on http://localhost:8001
  • Test worker scaling with docker service scale
  • Verify all three environments can run simultaneously
  • Confirm debug toolbar visible in staging

See STAGING_DEPLOYMENT.md for full documentation.

Summary by CodeRabbit

  • New Features

    • Mass Transfer: bulk DICOM export with partitioned date ranges, JSON filter editor (CodeMirror), pseudonymization modes, optional DICOM→NIfTI conversion, job/task UI, volume listing, CSV export, background worker/queue with scaling, and preference persistence.
  • Bug Fixes / Reliability

    • Improved task cancellation, DIMSE/connection cleanup, safer handling of empty/partial transfer responses, task timeout/cleanup hooks, and worker restart/resilience.
  • Documentation

    • Staging deployment guide and full Mass Transfer specification added.
  • Tests

    • Extensive tests for partitions, pseudonymization, processor behavior, forms, and cleanup.

NumericalAdvantage and others added 30 commits February 18, 2026 16:06
The recursive split used `mid` as the boundary for both halves, causing
studies at the midpoint to appear in both. Additionally, since the DICOM
query operates at date-level granularity, same-day splits produced
identical queries returning the same results in both halves.

Shift the right half to start at mid + 1s and deduplicate by
StudyInstanceUID when merging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
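The fix described in this commit can be sketched in plain Python. The names `split_range` and `merge_results` are illustrative, not the project's actual API; the point is the two invariants the commit states: the right half starts at `mid + 1s`, and merged results are deduplicated by StudyInstanceUID:

```python
from datetime import datetime, timedelta

def split_range(start: datetime, end: datetime):
    # Split [start, end] into two non-overlapping halves.
    # The right half starts at mid + 1 second so the boundary
    # study cannot land in both halves (the original bug used
    # `mid` as the boundary of both).
    mid = start + (end - start) / 2
    return (start, mid), (mid + timedelta(seconds=1), end)

def merge_results(left: list[dict], right: list[dict]) -> list[dict]:
    # Deduplicate by StudyInstanceUID when merging, since the DICOM
    # query runs at date granularity and same-day splits can still
    # return overlapping studies.
    seen: set[str] = set()
    merged = []
    for study in left + right:
        uid = study["StudyInstanceUID"]
        if uid not in seen:
            seen.add(uid)
            merged.append(study)
    return merged
```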
One task per partition handles discovery + export + convert on a separate
mass_transfer queue. Includes review fixes and comprehensive test suite.
… actual reason, improve status message and fix previous jobs from not being processed if the worker processing them was dead
…tric folder structure, and two-mode pseudonymization
…mpty retrievals

Reverts the C-MOVE preference since C-MOVE requires PACS-side AE
registration. Restores study/series pseudonymized UID fields on
MassTransferVolume and populates them after DICOM anonymization.
Tracks image count from C-GET to skip series with 0 retrieved images
instead of passing empty directories to dcm2niix.
Restores 0009 (which removed the fields) and adds 0010 to re-add them,
since 0009 already ran on existing databases.
… C-GET

IMPAX PACS stores most images in compressed formats (JPEG Lossless, JPEG 2000,
etc.) but our storage contexts only offered uncompressed transfer syntaxes.
During C-GET, the PACS could not send images back because no matching transfer
syntax was negotiated, resulting in Success status with 0 images delivered.
The previous commit added compressed transfer syntaxes to
StoragePresentationContexts, but dimse_connector.py was only passing
cx.abstract_syntax to add_requested_context (which defaults to
uncompressed-only). Now passes cx.transfer_syntax so the PACS can
actually send images in JPEG Lossless and other compressed formats.
medihack and others added 30 commits March 27, 2026 22:27
Avoid cancelling the fetch task after the sentinel has been received,
ensuring exceptions from the completed fetch function propagate instead
of being swallowed by cancellation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
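The pattern this commit describes can be shown with a minimal asyncio sketch (the producer/consumer names are illustrative, not the project's code): once the sentinel has been received, the fetch task is awaited rather than cancelled, so an exception stored on the finished task propagates instead of being swallowed:

```python
import asyncio

SENTINEL = object()

async def fetch(queue: asyncio.Queue) -> None:
    # Illustrative producer: enqueue one item, then fail.
    try:
        await queue.put("item-1")
        raise RuntimeError("fetch failed")
    finally:
        await queue.put(SENTINEL)  # always signal end-of-stream

async def consume() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    fetch_task = asyncio.create_task(fetch(queue))
    items = []
    while True:
        item = await queue.get()
        if item is SENTINEL:
            break
        items.append(item)
    # Do NOT fetch_task.cancel() here: awaiting the completed task
    # re-raises any exception it stored, instead of the cancellation
    # masking it.
    await fetch_task
    return items
```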
…b FK

Procrastinate deletes finished jobs via a raw SQL stored procedure
(procrastinate_finish_job_v1), which bypasses Django's ORM-level
on_delete=SET_NULL handling. This caused a ForeignKeyViolation when
the worker tried to finalize completed MassTransferTasks, because the
database-level FK constraint on queued_job_id defaulted to NO ACTION.

The other apps (batch_transfer, selective_transfer, batch_query) already
had a RunSQL migration step calling procrastinate_on_delete_sql() to set
ON DELETE SET NULL at the PostgreSQL level. The mass_transfer app was
missing this step because the queued_job field was included in the
initial CreateModel migration rather than added later via AddField.

This resulted in an infinite retry loop: each task completed its DICOM
work successfully, but Procrastinate could not delete the job row, so
the task was re-picked and re-processed indefinitely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously mass transfer only supported folder destinations. This adds
support for server destinations by downloading series to a temp directory
and uploading via the destination operator. NIfTI conversion is disabled
for server destinations both in the form (Alpine.js checkbox disabling)
and backend (clean method clears the flag).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CSS overrides scoped to [data-bs-theme="dark"] so the JSON filter
editor automatically matches the site-wide theme toggle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nsfer

Create MassTransferVolume records in PENDING state immediately after
discovery so they appear in the UI while the task is running. Each
volume is updated in place during transfer and saved via a finally
block that guarantees persistence even on RetriableDicomError.

Key changes:
- Remove SeriesTransferResult dataclass; transfer methods now update
  MassTransferVolume fields directly instead of returning result objects
- Replace _group_series() with _create_pending_volumes() + _group_volumes()
- Remove _create_volume_record(); volumes are bulk-created upfront
- Add persistent parameter to DicomOperator to abstract over
  DimseConnector.auto_close
- Move destination setup (dest_operator/output_base) before try block
  with assert ensuring exactly one is set
- Add defensive PENDING check and safe volume.save() in finally block
- Replace protocol-specific comments (C-GET/C-FIND/DIMSE) with
  protocol-agnostic terms (fetch/query)
- Restyle task detail template with Back to Job button and pagination
- Add test fixture and _fake_export_success helper to reduce boilerplate

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Defers individual task queuing to a background Procrastinate job on the
default queue so that the HTTP view returns immediately instead of
blocking on potentially thousands of defer() calls. Adds factories,
a dedicated tasks module, and comprehensive tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace unlimited process timeout (None) with a 24-hour cap to prevent
runaway tasks from blocking the mass_transfer queue indefinitely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix task reset routing: override queue_pending_task() on MassTransferTask
  so single-task resets use the mass_transfer queue instead of the generic
  dicom queue with wrong timeout
- Deduplicate queue logic: queue_mass_transfer_tasks now delegates to
  task.queue_pending_task() instead of duplicating the defer/save code
- Add error handling to task queueing loop with logging and re-raise
- Narrow bare except clauses in _extract_dicom_metadata and
  _write_dicom_metadata to specific exception types with logging
- Move close() error guard from processor into DicomOperator.close()
- Add debug logging to silent except OSError: pass blocks
- Remove unused cleanup_on_failure hook from DicomTask and all call sites
- Add model-level clean() to MassTransferJob for date range validation
- Add CheckConstraint ensuring partition_start < partition_end
- Add MinValueValidator(1) to DicomServer.max_search_results
- Make FilterSpec and DiscoveredSeries frozen dataclasses
- Fix dimse_connector docstring referencing non-existent auto_config
- Add docstring to build_partitions(), expand filters_json help_text
- Sync inherited TransferTask field validators in migration
- Fix test fixtures to satisfy new partition constraint
- Add 10 new tests: filter JSON validation, discover_series filtering,
  date range rejection, source authorization
- Delete outdated mass_transfer_spec.md
- Document assertion usage policy in CLAUDE.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace plain string identifiers with realistic DICOM UID format
in test helper calls for consistency with actual DICOM data.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace run_worker_once() with direct queue_mass_transfer_tasks() calls
in the queue tests. The worker spawned pebble daemon subprocesses that
left PostgreSQL connections open, causing a teardown warning when running
the full test suite. The queueing logic is now tested directly while
task processing is already covered by core/tests/test_tasks.py.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove unnecessary SHA hash suffix from study folder names since studies
are already nested under patient folders, and include series description
in the fallback when series number is missing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Decouple pseudonym computation from dicognito's IDAnonymizer to ensure
stability across library upgrades. The new standalone compute_pseudonym
function uses SHA-256 with divmod extraction into A-Z0-9, matching
dicognito's approach but with a stronger hash. Deterministic pseudonyms
are 14 chars, random pseudonyms are 15 chars so the two modes can be
distinguished by length.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ce tests

Instead of raising an error when switching DICOM services (e.g. C-FIND
to C-GET) on a persistent connection, the DimseConnector now
transparently closes the old association and opens a new one. This
simplifies callers that need to chain different operations.

Also adds Playwright acceptance tests covering pseudonymized,
unpseudonymized, folder-destination, and NIfTI-conversion mass
transfers across all three transfer protocols (C-MOVE, C-GET, DICOMweb).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The procrastinate worker thread running queue_mass_transfer_tasks opens
a Django DB connection (via MassTransferTask.save() in queue_pending_task)
that was never closed, since worker threads don't go through Django's
request/response lifecycle. This left an idle PostgreSQL session that
blocked test database teardown, causing a PytestWarning.

Add db.close_old_connections() in a finally block, matching the pattern
already used in core/tasks.py _run_dicom_task.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nvironment

# Conflicts:
#	adit/mass_transfer/processors.py
#	adit/mass_transfer/tests/test_processor.py
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Cap filters JSON display at 20 lines with overflow scroll
- Clear queued_job_id before deleting procrastinate job on cancel
Add a new filter field that allows excluding series with fewer than
a specified number of instances. Includes form validation (must be >= 1),
processor filtering logic, and comprehensive tests for both.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows parsing semicolon-delimited CSVs (common in European Excel exports)
by passing e.g. --delimiter ";". Defaults to comma for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
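The delimiter behavior described above boils down to passing the option through to the CSV reader. A minimal stdlib sketch (the helper name is illustrative; the project's command-line flag is `--delimiter`):

```python
import csv
import io

def read_rows(text: str, delimiter: str = ",") -> list[list[str]]:
    # Comma by default for backwards compatibility; pass ";" for
    # semicolon-delimited CSVs common in European Excel exports.
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))
```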
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>