
Add NIfTI conversion API #300

Open

mhumzaarain wants to merge 12 commits into main from DCM_TO_NIFTI_API

Conversation


@mhumzaarain mhumzaarain commented Feb 10, 2026

Summary

  • Add three new WADO-RS endpoints (/nifti at study, series, and instance level)
    that stream DICOM-to-NIfTI converted files as multipart HTTP responses
  • Add granular error handling for dcm2niix exit codes with a typed exception
    hierarchy (DcmToNiftiConversionError and subclasses)
  • Add modality-based pre-filtering to skip non-image series (SR, KO, PR) before
    fetching, with warnings logged for unexpected conversion failures
  • Add adit-client methods (retrieve_nifti_* / iter_nifti_*) for
    programmatic consumption of the new endpoints

Specification

New Endpoints

Three WADO-RS endpoints for on-the-fly DICOM-to-NIfTI conversion, appended as /nifti to the standard WADO-RS retrieve paths:

| Level | URL Pattern | View |
|---|---|---|
| Study | {ae_title}/wadors/studies/{study_uid}/nifti | RetrieveNiftiStudyAPIView |
| Series | {ae_title}/wadors/studies/{study_uid}/series/{series_uid}/nifti | RetrieveNiftiSeriesAPIView |
| Instance | {ae_title}/wadors/studies/{study_uid}/series/{series_uid}/instances/{image_uid}/nifti | RetrieveNiftiImageAPIView |

Response format: multipart/related; type=application/octet-stream with boundary nifti-boundary. Each part includes a Content-Disposition header with the output filename. Files are yielded in a deterministic order per conversion: JSON sidecar first, then NIfTI (.nii.gz / .nii), then .bval, then .bvec.
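The deterministic part ordering can be illustrated with a small sort helper (a hypothetical sketch for illustration only, not the PR's actual code):

```python
def conversion_output_order(filenames: list[str]) -> list[str]:
    """Sort converter output files into the documented part order:
    JSON sidecar first, then NIfTI (.nii.gz / .nii), then .bval, then .bvec."""
    def rank(name: str) -> int:
        if name.endswith(".json"):
            return 0
        if name.endswith((".nii.gz", ".nii")):
            return 1
        if name.endswith(".bval"):
            return 2
        if name.endswith(".bvec"):
            return 3
        return 4  # anything unexpected streams last

    return sorted(filenames, key=rank)
```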

Conversion Pipeline

  1. Fetch — DICOM datasets are retrieved from the configured PACS via DicomOperator (DIMSE).
  2. Filter (study-level only) — Series with non-image modalities (SR, KO, PR) are skipped before fetching to avoid unnecessary network and conversion overhead.
  3. Convert — Datasets are written to a temporary directory and converted with dcm2niix (-f "%s-%d" -z y).
  4. Stream — Output files are read with aiofiles and streamed as a multipart response. Study-level requests process one series at a time to bound memory usage.
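The four steps above can be sketched as an async generator. This is an illustrative outline only: the function name, the injected `fetch_datasets` and `run_converter` callables, and the staging filenames are assumptions standing in for the PR's QIDO query, DicomOperator fetch, and dcm2niix invocation.

```python
from pathlib import Path
from tempfile import TemporaryDirectory
from typing import AsyncIterator, Awaitable, Callable, Iterable

NON_IMAGE_MODALITIES = {"SR", "KO", "PR"}


async def retrieve_nifti_sketch(
    series_infos: Iterable[dict],
    fetch_datasets: Callable[[str], Awaitable[list[bytes]]],
    run_converter: Callable[[Path, Path], None],
) -> AsyncIterator[tuple[str, bytes]]:
    """Yield (filename, content) pairs one series at a time to bound memory."""
    for info in series_infos:
        # Step 2: skip non-image series before any network fetch.
        if info.get("Modality") in NON_IMAGE_MODALITIES:
            continue
        # Step 1: fetch this series' datasets from the PACS.
        datasets = await fetch_datasets(info["SeriesInstanceUID"])
        with TemporaryDirectory() as tmp:
            in_dir = Path(tmp, "in")
            out_dir = Path(tmp, "out")
            in_dir.mkdir()
            out_dir.mkdir()
            # Step 3: stage datasets on disk and run the converter.
            for i, raw in enumerate(datasets):
                (in_dir / f"{i}.dcm").write_bytes(raw)
            run_converter(in_dir, out_dir)
            # Step 4: stream the converter's output files.
            for path in sorted(out_dir.iterdir()):
                yield path.name, path.read_bytes()
```

Because each series is staged, converted, and streamed inside its own temporary directory, peak memory and disk usage stay proportional to the largest single series rather than the whole study.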

Error Handling

dcm2niix Exit-Code Mapping

A typed exception hierarchy (DcmToNiftiConversionError and subclasses) maps each dcm2niix exit code to a specific exception:

| Exit Code | Meaning | Exception |
|---|---|---|
| 0 | Success | (none) |
| 1 | Unspecified error | DcmToNiftiConversionError |
| 2 | No DICOM found | NoValidDicomError |
| 4 | Corrupt DICOM | InvalidDicomError |
| 5 | Invalid input folder | InputDirectoryError |
| 6 | Invalid output folder | OutputDirectoryError |
| 7 | Write permission error | OutputDirectoryError |
| 8 | Partial conversion | Logged as warning; conversion continues |
| 9 | Rename error | DcmToNiftiConversionError |
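A minimal sketch of this mapping follows. The exception classes and the DcmToNiftiExitCode enum come from the PR description, but the member names and the dispatch helper are assumptions for illustration:

```python
from enum import IntEnum


class DcmToNiftiConversionError(Exception): ...
class NoValidDicomError(DcmToNiftiConversionError): ...
class InvalidDicomError(DcmToNiftiConversionError): ...
class InputDirectoryError(DcmToNiftiConversionError): ...
class OutputDirectoryError(DcmToNiftiConversionError): ...


class DcmToNiftiExitCode(IntEnum):
    SUCCESS = 0
    UNSPECIFIED_ERROR = 1
    NO_DICOM_FOUND = 2
    CORRUPT_DICOM = 4
    INVALID_INPUT_FOLDER = 5
    INVALID_OUTPUT_FOLDER = 6
    WRITE_PERMISSION_ERROR = 7
    PARTIAL_CONVERSION = 8
    RENAME_ERROR = 9


_EXIT_CODE_EXCEPTIONS = {
    DcmToNiftiExitCode.UNSPECIFIED_ERROR: DcmToNiftiConversionError,
    DcmToNiftiExitCode.NO_DICOM_FOUND: NoValidDicomError,
    DcmToNiftiExitCode.CORRUPT_DICOM: InvalidDicomError,
    DcmToNiftiExitCode.INVALID_INPUT_FOLDER: InputDirectoryError,
    DcmToNiftiExitCode.INVALID_OUTPUT_FOLDER: OutputDirectoryError,
    DcmToNiftiExitCode.WRITE_PERMISSION_ERROR: OutputDirectoryError,
    DcmToNiftiExitCode.RENAME_ERROR: DcmToNiftiConversionError,
}


def raise_for_exit_code(code: int, stderr: str = "") -> None:
    """Raise the mapped exception; success and partial conversion (8) return
    normally, since exit code 8 is only logged as a warning by the caller."""
    if code in (DcmToNiftiExitCode.SUCCESS, DcmToNiftiExitCode.PARTIAL_CONVERSION):
        return
    exc = _EXIT_CODE_EXCEPTIONS.get(code, DcmToNiftiConversionError)
    raise exc(f"dcm2niix failed with exit code {code}: {stderr}")
```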

Graceful Degradation

  • NoValidDicomError and NoSpatialDataError are logged as warnings and the series is skipped — the request continues with remaining series.
  • Other DcmToNiftiConversionError subclasses are also logged as warnings and skipped.
  • PACS-level errors are mapped to HTTP semantics: RetriableDicomError → 503, any other DicomError → 502.

Modality Pre-Filtering

At study level, a QIDO-RS-style series query is issued first. Series whose Modality is in {"SR", "KO", "PR"} are skipped with a debug log. This prevents fetching large structured-report or presentation-state series that cannot produce NIfTI output.

adit-client Methods

Six new public methods on AditClient, in two flavors per level:

| Method | Returns |
|---|---|
| retrieve_nifti_study(ae_title, study_uid) | list[tuple[str, BytesIO]] |
| iter_nifti_study(ae_title, study_uid) | Iterator[tuple[str, BytesIO]] |
| retrieve_nifti_series(ae_title, study_uid, series_uid) | list[tuple[str, BytesIO]] |
| iter_nifti_series(ae_title, study_uid, series_uid) | Iterator[tuple[str, BytesIO]] |
| retrieve_nifti_image(ae_title, study_uid, series_uid, image_uid) | list[tuple[str, BytesIO]] |
| iter_nifti_image(ae_title, study_uid, series_uid, image_uid) | Iterator[tuple[str, BytesIO]] |

Security

Filenames in Content-Disposition headers are sanitized: \r, \n, and " characters are stripped to prevent header injection.
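The sanitization described above can be sketched as a small helper (a hypothetical stand-in for the renderer's actual code):

```python
def sanitize_content_disposition_filename(filename: str) -> str:
    """Strip characters that could terminate the quoted header value or
    inject additional header lines (CR, LF, double quote)."""
    for ch in ("\r", "\n", '"'):
        filename = filename.replace(ch, "")
    return filename
```

Stripping rather than rejecting keeps the response streamable even when dcm2niix produces an unusual output filename.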

Files Changed (8 files)

| File | Change |
|---|---|
| adit/dicom_web/urls.py | 3 new URL patterns |
| adit/dicom_web/views.py | 3 new view classes |
| adit/dicom_web/renderers.py | WadoMultipartApplicationNiftiRenderer |
| adit/dicom_web/utils/wadors_utils.py | wado_retrieve_nifti(), _process_single_fetch(), _fetch_dicom_data(), modality filtering |
| adit/core/errors.py | DcmToNiftiConversionError hierarchy (6 subclasses) |
| adit/core/utils/dicom_to_nifti_converter.py | DcmToNiftiExitCode enum, exit-code-to-exception mapping |
| adit-client/adit_client/client.py | 6 public methods + multipart parsing helpers |

Test Coverage

49 tests total (43 unit + 6 acceptance), covering all new components.

Unit Tests (43 tests)

| Test File | Tests | What It Covers |
|---|---|---|
| adit/core/tests/utils/test_dicom_to_nifti_converter.py | 13 | All dcm2niix exit codes (0-9), subprocess errors, folder validation, warning detection |
| adit/dicom_web/tests/utils/test_wado_retrieve_nifti.py | 13 | Modality pre-filtering (SR/KO/PR skip), study/series/image level dispatch, file yield ordering (json, nifti, bval, bvec), NoValidDicomError/NoSpatialDataError/DcmToNiftiConversionError graceful handling, error wrapping (RetriableDicomError -> 503, DicomError -> 502) |
| adit/dicom_web/tests/test_renderers.py | 5 | Multipart boundary formatting (single/multiple/empty), filename sanitization (\r, \n, " stripped), content_type property |
| adit-client/tests/test_nifti_client.py | 12 | _extract_filename (valid, path stripping, missing header, no filename field), _extract_part_content_with_headers (boundary markers, empty bytes, normal content), _iter_multipart_response (header parsing, response-header fallback, missing disposition error) |

Acceptance Tests (6 tests)

| Test File | Tests | What It Covers |
|---|---|---|
| adit/dicom_web/tests/acceptance/test_wadors.py | 6 | End-to-end NIfTI retrieval against live Orthanc for all 3 levels (study/series/image) x 2 modes (retrieve_* / iter_*), verifying .nii.gz + .json presence, non-empty content, and filename consistency between retrieve and iter |

Design Decisions

  • No pseudonymization/trial-protocol support — NIfTI endpoints do not accept Pseudonym or TrialProtocolID parameters (unlike DICOM retrieve endpoints), since manipulation is not applicable to NIfTI output.
  • Per-series fetching at study level — Instead of loading an entire study into memory, each series is fetched and converted independently, bounding peak memory.
  • Warnings over failures — Conversion errors on individual series do not abort the entire study-level request; partial results are returned with warnings logged.

API Usage (adit-client)

from adit_client import AditClient

client = AditClient(server_url="https://<adit-host>", token="<your-token>")

# --- Study level ---

# Buffered: loads all files into memory at once
files = client.retrieve_nifti_study(ae_title="ORTHANC", study_uid="1.2.3...")
for filename, content in files:
    with open(filename, "wb") as f:
        f.write(content.read())

# Streaming: memory-efficient, processes one file at a time
for filename, content in client.iter_nifti_study(ae_title="ORTHANC", study_uid="1.2.3..."):
    with open(filename, "wb") as f:
        f.write(content.read())

# --- Series level ---

files = client.retrieve_nifti_series(
    ae_title="ORTHANC",
    study_uid="1.2.3...",
    series_uid="1.2.3.4...",
)

for filename, content in client.iter_nifti_series(
    ae_title="ORTHANC",
    study_uid="1.2.3...",
    series_uid="1.2.3.4...",
):
    with open(filename, "wb") as f:
        f.write(content.read())

# --- Instance level ---

files = client.retrieve_nifti_image(
    ae_title="ORTHANC",
    study_uid="1.2.3...",
    series_uid="1.2.3.4...",
    image_uid="1.2.3.4.5...",
)

for filename, content in client.iter_nifti_image(
    ae_title="ORTHANC",
    study_uid="1.2.3...",
    series_uid="1.2.3.4...",
    image_uid="1.2.3.4.5...",
):
    with open(filename, "wb") as f:
        f.write(content.read())

Return type: Each method returns (filename, BytesIO) tuples. Typical filenames include:

  • series_description-sequence_name.json — JSON sidecar with DICOM metadata
  • series_description-sequence_name.nii.gz — compressed NIfTI image
  • series_description-sequence_name.bval — b-values (diffusion MRI only)
  • series_description-sequence_name.bvec — b-vectors (diffusion MRI only)

Summary by CodeRabbit

  • New Features

    • NIfTI REST endpoints and client support for study/series/image retrieval with both batch and streaming APIs; responses preserve per-file filenames and sidecars.
  • Improvements

    • Robust DICOM→NIfTI conversion with explicit exit-code handling, clearer error types, safer temp/IO handling and better logging.
    • Asynchronous fetch-and-convert pipeline with modality filtering, ordered streaming, and multipart responses that expose per-part filenames.
  • Tests

    • Acceptance tests for NIfTI endpoints, filename parity, and non-empty contents.


coderabbitai bot commented Feb 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds server-side DICOM→NIfTI conversion and streaming: new wadors NIfTI endpoints, renderer, async utilities to fetch/convert DICOM to NIfTI (and sidecars), client methods to retrieve/iterate multipart NIfTI parts as (filename, BytesIO), multipart parsing, conversion-specific errors, and acceptance tests.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Error Handling: adit/core/errors.py | Adds DcmToNiftiConversionError hierarchy (NoValidDicomError, InvalidDicomError, OutputDirectoryError, InputDirectoryError, ExternalToolError, NoSpatialDataError) and extends BatchFileSizeError to store batch_tasks_count and max_batch_size. |
| Conversion Logic: adit/core/utils/dicom_to_nifti_converter.py | Introduces DcmToNiftiExitCode IntEnum, validates input/output dirs, captures dcm2niix stdout/stderr, maps exit codes to typed exceptions, and raises ExternalToolError for subprocess failures. |
| WADO Utilities & Workflow: adit/dicom_web/utils/wadors_utils.py | Adds NON_IMAGE_MODALITIES, _fetch_dicom_data, wado_retrieve_nifti, _process_single_fetch; switches to sentinel-based async fetch with queueing, uses TemporaryDirectory/aiofiles for staging, invokes DicomToNiftiConverter, skips non-image series, and maps conversion errors to API errors. |
| Renderer, Views & URLs: adit/dicom_web/renderers.py, adit/dicom_web/views.py, adit/dicom_web/urls.py | Adds WadoMultipartApplicationNiftiRenderer (multipart/related with nifti-boundary), three RetrieveNifti*APIView views using that renderer and wado_retrieve_nifti, and URL routes for study/series/image NIfTI endpoints. |
| Client: adit-client/adit_client/client.py | Adds public retrieve_/iter_ methods for study/series/image NIfTI; implements multipart parsing helpers _iter_multipart_response, _extract_part_content_with_headers, _extract_filename to yield (filename, BytesIO) pairs; updates imports and type hints. |
| Tests: adit/dicom_web/tests/acceptance/test_wadors.py | Adds acceptance tests for retrieve/iter NIfTI endpoints validating .nii(.gz) and .json sidecars, non-empty BytesIO contents, and filename parity between retrieve and iter methods. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as AditClient
    participant API as WADO API View
    participant Utils as wadors_utils
    participant DICOM as DICOM Server
    participant Converter as DicomToNiftiConverter
    participant Renderer as NIfTI Renderer

    Client->>API: GET /<ae>/wadors/.../nifti
    API->>Utils: wado_retrieve_nifti(query, level)
    Utils->>DICOM: fetch DICOM datasets
    DICOM-->>Utils: list[Dataset]
    Utils->>Converter: convert datasets -> files on disk
    Converter-->>Utils: files (filename, BytesIO)
    Utils-->>API: AsyncIterator[(filename, BytesIO)]
    API->>Renderer: render(async tuples)
    Renderer-->>API: multipart byte stream
    API-->>Client: HTTP multipart/related response
    Client->>Client: _iter_multipart_response() yields (filename, BytesIO)

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 I hopped through stacks and spun a thread of light,

DICOM to NIfTI, tucked filenames tight.
Boundaries hum, streams dance in tune,
Files arrive like carrots under moon.
A little rabbit cheers the new byte flight.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 51.72%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Add NIfTI conversion API' directly and clearly summarizes the main feature addition: DICOM-to-NIfTI conversion endpoints and client support. |
| Description Check | ✅ Passed | Check skipped; CodeRabbit's high-level summary is enabled. |


@gemini-code-assist
Contributor

Summary of Changes

Hello @mhumzaarain, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the system's interoperability by introducing direct DICOM-to-NIfTI conversion capabilities via new WADO-RS API endpoints. It streamlines the process of accessing medical imaging data in a widely used neuroimaging format, improving data accessibility for researchers and applications. The changes also include robust error handling and intelligent filtering to ensure reliable and efficient conversions.

Highlights

  • New WADO-RS NIfTI Conversion Endpoints: Introduced three new WADO-RS API endpoints (/nifti at study, series, and instance levels) that stream DICOM-to-NIfTI converted files as multipart HTTP responses. This allows clients to directly request NIfTI formatted data from the DICOM web server.
  • Enhanced Error Handling for DICOM to NIfTI Conversion: Implemented a granular exception hierarchy for dcm2niix exit codes, including specific errors for no valid DICOM, invalid DICOM, input/output directory issues, and external tool failures. This provides more precise feedback on conversion problems.
  • Modality-Based Pre-filtering: Added logic to pre-filter and skip non-image series (e.g., Structured Reports, Key Object Selection, Presentation State) before attempting DICOM-to-NIfTI conversion, improving efficiency and preventing unnecessary processing.
  • Client-Side API Integration: New methods (retrieve_nifti_* and iter_nifti_*) have been added to the adit-client to programmatically consume the newly exposed NIfTI conversion endpoints, simplifying client interaction.
Changelog
  • adit-client/adit_client/client.py
    • Added BytesIO and Union to typing imports.
    • Implemented retrieve_nifti_study, iter_nifti_study, retrieve_nifti_series, iter_nifti_series, retrieve_nifti_image, and iter_nifti_image methods for NIfTI retrieval.
    • Added private helper methods _extract_filename, _extract_part_content_with_headers, and _iter_multipart_response to parse multipart HTTP responses and extract filenames.
  • adit/core/errors.py
    • Introduced DcmToNiftiConversionError as a base exception for DICOM to NIfTI conversion.
    • Defined several subclasses of DcmToNiftiConversionError for specific failure scenarios: NoValidDicomError, InvalidDicomError, OutputDirectoryError, InputDirectoryError, ExternalToolError, and NoSpatialDataError.
  • adit/core/utils/dicom_to_nifti_converter.py
    • Imported the newly defined DcmToNiftiConversionError and its subclasses.
    • Added DcmToNiftiExitCode IntEnum to map dcm2niix exit codes to meaningful constants.
    • Modified the convert method to handle dcm2niix subprocess results more robustly, mapping exit codes to the new custom exception types and logging warnings for partial conversions.
  • adit/dicom_web/renderers.py
    • Created WadoMultipartApplicationNiftiRenderer to render multipart/related responses for NIfTI files, including content-type and content-disposition headers for each part.
  • adit/dicom_web/tests/acceptance/test_wadors.py
    • Imported BytesIO for testing file content.
    • Added acceptance tests test_retrieve_nifti_study, test_retrieve_nifti_series, test_iter_nifti_study, and test_iter_nifti_series to verify the functionality of the new NIfTI endpoints and client methods.
  • adit/dicom_web/urls.py
    • Imported RetrieveNiftiImageAPIView, RetrieveNiftiSeriesAPIView, and RetrieveNiftiStudyAPIView.
    • Added new URL patterns for /nifti endpoints at the study, series, and instance levels, routing to the respective NIfTI retrieval API views.
  • adit/dicom_web/utils/wadors_utils.py
    • Imported necessary modules for file operations (os, BytesIO, Path, aiofiles, aiofiles.os, TemporaryDirectory) and the new error types.
    • Imported DicomToNiftiConverter and write_dataset.
    • Defined NON_IMAGE_MODALITIES set to filter out non-image DICOM series.
    • Added _fetch_dicom_data as a synchronous helper to fetch DICOM datasets.
    • Implemented wado_retrieve_nifti to orchestrate DICOM retrieval and NIfTI conversion, including modality-based filtering and handling study-level requests by processing series individually.
    • Created _process_single_fetch to manage the temporary storage, conversion using DicomToNiftiConverter, and yielding of NIfTI, JSON, bval, and bvec files.
  • adit/dicom_web/views.py
    • Imported WadoMultipartApplicationNiftiRenderer and wado_retrieve_nifti.
    • Added RetrieveNiftiStudyAPIView, RetrieveNiftiSeriesAPIView, and RetrieveNiftiImageAPIView classes, each configured with WadoMultipartApplicationNiftiRenderer and calling wado_retrieve_nifti for their respective levels.


claude bot commented Feb 10, 2026

Code Review: WADO-RS NIfTI Conversion API Endpoints

Summary

This PR adds NIfTI conversion endpoints to the DICOMweb API, allowing users to retrieve DICOM data converted to NIfTI format. The implementation includes proper error handling, client library methods, and acceptance tests. Overall, the code quality is good with thoughtful architecture decisions.

✅ Strengths

  1. Excellent Error Handling: The typed exception hierarchy for dcm2niix exit codes is well-designed and makes error handling explicit and maintainable.

  2. Memory Efficiency: Study-level requests fetch series individually rather than loading entire studies into memory, which is critical for large datasets.

  3. Smart Pre-filtering: Non-image modalities (SR, KO, PR) are filtered before fetching, avoiding unnecessary DICOM transfers.

  4. Comprehensive Testing: Acceptance tests cover all three levels (study, series, instance) and both retrieve/iter variants.

  5. Async Architecture: Proper use of async/await with sync_to_async for blocking operations.

  6. Client API Design: Both retrieve_* (load all) and iter_* (streaming) methods provide flexibility for different use cases.

🔍 Issues & Recommendations

High Priority

  1. Unused Import in client.py (line 9)

    • Union is imported but never used
    • Fix: Remove the unused import
  2. Filename Sanitization Missing (adit-client/adit_client/client.py:108)

    • Filenames from Content-Disposition headers are used without validation
    • Security Risk: Potential path traversal if dcm2niix generates unexpected filenames
    • Recommendation: Add filename sanitization:
    import os  # required at module top for os.path.basename

    def _extract_filename(self, content_disposition: str) -> str:
        if not content_disposition or "filename=" not in content_disposition:
            raise ValueError("No filename found in Content-Disposition header")
        filename = content_disposition.split("filename=")[1].strip('"')
        # Sanitize: strip any path components to prevent path traversal
        filename = os.path.basename(filename)
        if not filename or filename.startswith("."):
            raise ValueError(f"Invalid filename: {filename}")
        return filename
  3. Missing Error Handling for Empty Results (wadors_utils.py:258-259)

    • If aiofiles.os.scandir returns no files after conversion, the function silently yields nothing
    • Recommendation: Add validation after scandir to ensure at least some files were created (or log a warning)
  4. Client Code Relies on Private API (adit-client/adit_client/client.py:21-27, 96-101)

    • Using dicomweb_client._http_get, _extract_part_content, and _decode_multipart_message
    • Risk: Breaking changes in dicomweb-client library updates
    • Recommendation: Consider contributing multipart header parsing to dicomweb-client upstream, or add version pinning with tests to catch breaking changes

Medium Priority

  1. Inconsistent Error Logging (wadors_utils.py:246-256)

    • NoValidDicomError and NoSpatialDataError are logged as warnings and execution continues
    • Other DcmToNiftiConversionError subclasses are also logged as warnings
    • Generic exceptions are logged as errors and re-raised
    • Question: Should conversion failures for individual series in a study-level request fail the entire request or just skip that series? The current behavior (skip) seems intentional but isn't documented in the docstring.
    • Recommendation: Document this behavior in the function docstring
  2. File Ordering Not Guaranteed (wadors_utils.py:276)

    • file_pairs.items() iteration order depends on dict insertion order (which is guaranteed in Python 3.7+, so this is fine)
    • However, the comment at line 222 says "yields files in order: JSON first, then NIfTI, then bval, then bvec" but this is only true within each base_name group, not across groups
    • Recommendation: Clarify the docstring to indicate files are grouped by base_name, with ordering within each group
  3. Potential Race Condition with TemporaryDirectory (wadors_utils.py:229)

    • Async code writes multiple DICOM files in a loop (lines 232-236)
    • These are sequential, which is correct, but the comment/pattern might benefit from making it explicit that ordering matters
    • Recommendation: No code change needed, just noting this is handled correctly
  4. Missing Test for Instance-Level NIfTI

    • Tests exist for test_retrieve_nifti_study, test_retrieve_nifti_series, test_iter_nifti_study, and test_iter_nifti_series
    • Missing: test_retrieve_nifti_image and test_iter_nifti_image for instance-level endpoints
    • Recommendation: Add instance-level tests for completeness

Low Priority

  1. Magic String for Boundary (renderers.py:94)

    • boundary: str = "nifti-boundary" is hardcoded
    • Recommendation: Consider if this should be configurable or at least documented why this specific value was chosen
  2. Type Annotation Inconsistency (wadors_utils.py:261)

    • file_pairs: dict[str, dict[str, str]] is well-typed, but the inner dict values could be more specific (they're filenames)
    • Minor improvement: Could use a TypedDict for clarity, but current typing is acceptable
  3. Code Duplication in Client (adit-client/adit_client/client.py:17-101)

    • The six NIfTI methods (retrieve/iter × study/series/image) have significant duplication
    • Recommendation: Consider extracting common logic into a helper method:
    def _retrieve_nifti(self, ae_title: str, level: str, study_uid: str,
                        series_uid: str | None = None, image_uid: str | None = None,
                        stream: bool = False):
        # Common logic here

    However, the current explicit methods have better IDE support and type safety, so this is optional.

🎯 Code Quality

  • Follows Google Python Style Guide: ✅
  • Line length (100 chars): ✅
  • Type hints: ✅ Well-typed throughout
  • Docstrings: ✅ Present and informative
  • Error messages: ✅ Clear and actionable
  • Logging: ✅ Appropriate use of debug/warning/error levels

🔒 Security

  • ✅ No SQL injection risks (uses ORM)
  • ✅ No command injection (subprocess args are properly structured)
  • ⚠️ Minor risk: Filename from Content-Disposition header should be sanitized (see High Priority issue 2 above)
  • ✅ Temporary files are properly cleaned up via TemporaryDirectory context manager
  • ✅ No sensitive data exposure in error messages

🚀 Performance

  • Excellent: Study-level requests process series individually (streaming)
  • ✅ Uses async I/O for file operations (aiofiles)
  • ✅ Pre-filters non-convertible series before fetching
  • ✅ Streaming HTTP responses for large datasets
  • ⚠️ Consideration: Each series creates a separate temp directory and runs dcm2niix. For studies with many series, this could be I/O intensive. Consider adding rate limiting or concurrency controls for production use at scale.

🧪 Test Coverage

  • ✅ Acceptance tests for study and series retrieval
  • ✅ Tests for both retrieve and iter variants
  • ✅ Validates file types (.nii.gz, .json) are present
  • ✅ Validates content is not empty
  • Missing: Instance-level tests (see the instance-level test gap noted under Medium Priority above)
  • Missing: Error case tests (what happens when conversion fails?)
  • Missing: Tests for non-image modality filtering

📝 Documentation

  • ✅ Clear docstrings on main functions
  • ✅ Inline comments explain non-obvious logic
  • ✅ PR description clearly explains the changes
  • ⚠️ Consider: Adding a note to CLAUDE.md about the new NIfTI endpoints for future reference

Verdict

This is a well-implemented feature with thoughtful architecture decisions. The error handling and memory management are particularly well done. The main concerns are:

  1. Add filename sanitization for security
  2. Add missing test coverage for instance-level endpoints and error cases
  3. Document the behavior when individual series fail conversion

After addressing the high-priority items, this would be ready to merge.


Review generated with assistance from Claude Code

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces new API endpoints for DICOM to NIfTI conversion, along with corresponding client methods. However, several issues were identified: a high-severity PII leak due to missing pseudonymization in the new endpoints, a medium-severity HTTP header injection vulnerability in the NIfTI renderer, and a medium-severity insecure filename parsing issue in the client library. There is also code duplication in both the client and the server-side views that could be refactored, a potential high memory usage issue when streaming large NIfTI files, and a fragile implementation detail in the client that relies on monkey-patching a third-party library. Addressing these security and maintainability concerns will make the new functionality more robust and scalable.

Comment on lines +327 to +349
class RetrieveNiftiStudyAPIView(RetrieveAPIView):
    renderer_classes = [WadoMultipartApplicationNiftiRenderer]

    async def get(
        self, request: AuthenticatedApiRequest, ae_title: str, study_uid: str
    ) -> StreamingHttpResponse:
        async with self.track_session(request.user) as session:
            source_server = await self._get_dicom_server(request, ae_title)

            query = self.query.copy()
            query["StudyInstanceUID"] = study_uid

            images = wado_retrieve_nifti(source_server, query, "STUDY")

            renderer = cast(
                WadoMultipartApplicationNiftiRenderer, getattr(request, "accepted_renderer")
            )
            return StreamingHttpResponse(
                streaming_content=_StreamingSessionWrapper(
                    renderer.render(images), session, self._finalize_statistic
                ),
                content_type=renderer.content_type,
            )

security-high

The get method in RetrieveNiftiStudyAPIView introduces a high-severity PII leak because it does not implement pseudonymization, which is crucial for protecting patient privacy in ADIT. This allows raw DICOM data, including PII, to be converted to NIfTI. Please ensure pseudonymization parameters are extracted and applied during retrieval and conversion, similar to RetrieveStudyAPIView. Furthermore, this method duplicates logic found in RetrieveNiftiSeriesAPIView and RetrieveNiftiImageAPIView, which could be refactored into a shared base class or helper method for better maintainability.

Collaborator Author

NIfTI files contain only voxel data and spatial orientation. They have zero DICOM tags, zero patient identity fields, zero PHI by design. The conversion from DICOM → NIfTI is itself the stripping of all patient identity. There is nothing to pseudonymize because there are no identity fields in the output format.

Comment on lines +417 to +440
class RetrieveNiftiSeriesAPIView(RetrieveAPIView):
    renderer_classes = [WadoMultipartApplicationNiftiRenderer]

    async def get(
        self, request: AuthenticatedApiRequest, ae_title: str, study_uid: str, series_uid: str
    ) -> StreamingHttpResponse:
        async with self.track_session(request.user) as session:
            source_server = await self._get_dicom_server(request, ae_title)

            query = self.query.copy()
            query["StudyInstanceUID"] = study_uid
            query["SeriesInstanceUID"] = series_uid

            images = wado_retrieve_nifti(source_server, query, "SERIES")

            renderer = cast(
                WadoMultipartApplicationNiftiRenderer, getattr(request, "accepted_renderer")
            )
            return StreamingHttpResponse(
                streaming_content=_StreamingSessionWrapper(
                    renderer.render(images), session, self._finalize_statistic
                ),
                content_type=renderer.content_type,
            )
Contributor

security-high high

The new NIfTI conversion endpoint does not implement pseudonymization. This allows users to retrieve raw DICOM data converted to NIfTI, including PII, even when pseudonymization is expected. Please ensure that pseudonymization parameters are extracted from the request and applied during the retrieval and conversion process.

Collaborator Author

NIfTI files contain only voxel data and spatial orientation. They have zero DICOM tags, zero patient identity fields, zero PHI by design. The conversion from DICOM → NIfTI is itself the stripping of all patient identity. There is nothing to pseudonymize because there are no identity fields in the output format.

Comment on lines +515 to +544
class RetrieveNiftiImageAPIView(RetrieveAPIView):
    renderer_classes = [WadoMultipartApplicationNiftiRenderer]

    async def get(
        self,
        request: AuthenticatedApiRequest,
        ae_title: str,
        study_uid: str,
        series_uid: str,
        image_uid: str,
    ) -> StreamingHttpResponse:
        async with self.track_session(request.user) as session:
            source_server = await self._get_dicom_server(request, ae_title)

            query = self.query.copy()
            query["StudyInstanceUID"] = study_uid
            query["SeriesInstanceUID"] = series_uid
            query["SOPInstanceUID"] = image_uid

            images = wado_retrieve_nifti(source_server, query, "IMAGE")

            renderer = cast(
                WadoMultipartApplicationNiftiRenderer, getattr(request, "accepted_renderer")
            )
            return StreamingHttpResponse(
                streaming_content=_StreamingSessionWrapper(
                    renderer.render(images), session, self._finalize_statistic
                ),
                content_type=renderer.content_type,
            )
Contributor

security-high high

The new NIfTI conversion endpoint does not implement pseudonymization. This allows users to retrieve raw DICOM data converted to NIfTI, including PII, even when pseudonymization is expected. Please ensure that pseudonymization parameters are extracted from the request and applied during the retrieval and conversion process.

Comment on lines +296 to +331
    def _iter_multipart_response(self, response, stream=False) -> Iterator[tuple[str, BytesIO]]:
        """Parse a multipart response, yielding (filename, content) tuples."""
        dicomweb_client = self._create_dicom_web_client("")
        original_extract_method = dicomweb_client._extract_part_content

        try:
            dicomweb_client._extract_part_content = self._extract_part_content_with_headers

            for part in dicomweb_client._decode_multipart_message(response, stream=stream):
                headers = {}
                content = part

                idx = part.find(b"\r\n\r\n")
                if idx > -1:
                    headers_bytes = part[:idx]
                    content = part[idx + 4 :]

                    for header_line in headers_bytes.split(b"\r\n"):
                        if header_line and b":" in header_line:
                            name, value = header_line.split(b":", 1)
                            headers[name.decode("utf-8").strip()] = value.decode("utf-8").strip()

                content_disposition = headers.get("Content-Disposition")
                if content_disposition:
                    filename = self._extract_filename(content_disposition)
                else:
                    for header, value in response.headers.items():
                        if header.lower() == "content-disposition":
                            filename = self._extract_filename(value)
                            break
                    else:
                        raise ValueError("No Content-Disposition header found in response")

                yield (filename, BytesIO(content))
        finally:
            dicomweb_client._extract_part_content = original_extract_method
Contributor

high

The _iter_multipart_response method relies on monkey-patching the private _extract_part_content method of dicomweb_client.DICOMwebClient. This is a fragile approach that can break with future updates to the dicomweb-client library. It would be more robust to handle the HTTP request and multipart parsing without relying on the client's internal implementation for this specific case.

A more robust approach would be to use the underlying requests.Session from the DICOMwebClient to make the GET request and then use a dedicated multipart parsing library (like requests_toolbelt.multipart.decoder) or parse the response stream manually. This would decouple your client from the internal implementation details of dicomweb-client.

Comment on lines +217 to +299
async def _process_single_fetch(
    dicom_images: list[Dataset],
) -> AsyncIterator[tuple[str, BytesIO]]:
    """Convert a list of DICOM datasets to NIfTI format and yield the resulting files.

    For each conversion, yields files in order: JSON first, then NIfTI (.nii.gz or .nii),
    then bval, then bvec.

    If conversion fails with NoValidDicomError or NoSpatialDataError, a warning is logged
    because the series was expected to contain image data (non-image modalities are filtered
    out before this function is called).
    """
    async with TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)

        for file_idx, dicom_image in enumerate(dicom_images):
            dicom_file_path = temp_path / f"dicom_file_{file_idx}.dcm"
            await sync_to_async(write_dataset, thread_sensitive=False)(
                dicom_image, dicom_file_path
            )

        nifti_output_dir = temp_path / "nifti_output"
        await aiofiles.os.makedirs(nifti_output_dir, exist_ok=True)
        converter = DicomToNiftiConverter()

        try:
            await sync_to_async(converter.convert, thread_sensitive=False)(
                temp_path, nifti_output_dir
            )
        except (NoValidDicomError, NoSpatialDataError) as e:
            # The series passed the modality check but still failed conversion.
            # This is unexpected and worth logging as a warning.
            logger.warning(f"Series conversion failed unexpectedly: {e}")
            return
        except DcmToNiftiConversionError as e:
            logger.warning(f"Failed to convert DICOM files to NIfTI: {e}")
            return
        except Exception as e:
            logger.error(f"Error during DICOM to NIfTI conversion: {e}")
            raise

        entries = await aiofiles.os.scandir(nifti_output_dir)
        all_files = [entry.name for entry in entries]

        file_pairs: dict[str, dict[str, str]] = {}
        for filename in all_files:
            base_name, ext = os.path.splitext(filename)
            if ext == ".json":
                file_pairs.setdefault(base_name, {})["json"] = filename
            elif ext == ".gz" and base_name.endswith(".nii"):
                actual_base = os.path.splitext(base_name)[0]
                file_pairs.setdefault(actual_base, {})["nifti"] = filename
            elif ext == ".nii":
                file_pairs.setdefault(base_name, {})["nifti"] = filename
            elif ext == ".bval":
                file_pairs.setdefault(base_name, {})["bval"] = filename
            elif ext == ".bvec":
                file_pairs.setdefault(base_name, {})["bvec"] = filename

        for _base_name, files in file_pairs.items():
            if "json" in files:
                json_file_path = os.path.join(nifti_output_dir, files["json"])
                async with aiofiles.open(json_file_path, "rb") as f:
                    json_content = await f.read()
                yield files["json"], BytesIO(json_content)

            if "nifti" in files:
                nifti_file_path = os.path.join(nifti_output_dir, files["nifti"])
                async with aiofiles.open(nifti_file_path, "rb") as f:
                    nifti_content = await f.read()
                yield files["nifti"], BytesIO(nifti_content)

            if "bval" in files:
                bval_file_path = os.path.join(nifti_output_dir, files["bval"])
                async with aiofiles.open(bval_file_path, "rb") as f:
                    bval_content = await f.read()
                yield files["bval"], BytesIO(bval_content)

            if "bvec" in files:
                bvec_file_path = os.path.join(nifti_output_dir, files["bvec"])
                async with aiofiles.open(bvec_file_path, "rb") as f:
                    bvec_content = await f.read()
                yield files["bvec"], BytesIO(bvec_content)
Contributor

high

The _process_single_fetch function reads the entire content of each generated NIfTI file into a BytesIO object in memory before yielding it. NIfTI files can be very large, potentially leading to high memory consumption and scalability issues, especially when processing studies with multiple large series.

To improve memory efficiency, you should stream the file content instead of loading it all at once. This would likely require changes in a few places:

  1. _process_single_fetch could yield an async generator for each file's content that reads it from the temporary file in chunks.
  2. The WadoMultipartApplicationNiftiRenderer would need to be updated to consume this async generator of chunks instead of calling .getvalue() on a BytesIO object.

This change would ensure that the server's memory usage remains low regardless of the size of the NIfTI files being served.
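A chunked-reading generator along these lines could back such a renderer. The PR uses aiofiles; this self-contained sketch uses stdlib `asyncio.to_thread` instead, and the name and chunk size are illustrative:

```python
import asyncio
from pathlib import Path
from typing import AsyncIterator

CHUNK_SIZE = 64 * 1024


async def stream_file_chunks(
    path: Path, chunk_size: int = CHUNK_SIZE
) -> AsyncIterator[bytes]:
    """Yield a file in fixed-size chunks without loading it fully into memory.

    Memory use stays bounded by chunk_size rather than by file size, and
    blocking reads run off the event loop via asyncio.to_thread.
    """
    with open(path, "rb") as f:
        while True:
            chunk = await asyncio.to_thread(f.read, chunk_size)
            if not chunk:
                break
            yield chunk
```

The renderer would then iterate these chunks into the multipart stream instead of calling `.getvalue()` on a fully buffered `BytesIO`.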

        yield f"\r\n--{self.boundary}\r\n".encode()

        yield "Content-Type: application/octet-stream\r\n".encode()
        yield f'Content-Disposition: attachment; filename="{filename}"\r\n\r\n'.encode()
Contributor

security-medium medium

The filename is inserted into the Content-Disposition header without sanitization. If the filename contains double quotes or line breaks (which could be triggered by malicious DICOM metadata if dcm2niix does not sanitize it), it can lead to HTTP header injection or response splitting. Please sanitize the filename by removing or escaping double quotes and control characters.
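A small sanitizer in the spirit of this suggestion (hypothetical helper, not part of the PR): strip the characters that can break out of a quoted `Content-Disposition` parameter before interpolating.

```python
import re

# CR/LF enable header injection / response splitting; quotes and
# backslashes can escape the quoted-string; other control bytes are
# never legitimate in a filename.
_UNSAFE_CHARS = re.compile(r'[\r\n"\\]|[\x00-\x1f\x7f]')


def sanitize_disposition_filename(filename: str) -> str:
    """Return a filename safe to embed in a quoted Content-Disposition value."""
    cleaned = _UNSAFE_CHARS.sub("", filename)
    # Fall back to a fixed name if nothing survives sanitization.
    return cleaned or "download"
```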

"""Extract filename from Content-Disposition header."""
if not content_disposition or "filename=" not in content_disposition:
raise ValueError("No filename found in Content-Disposition header")
filename = content_disposition.split("filename=")[1].strip('"')
Contributor

security-medium medium

The filename extraction from the Content-Disposition header is fragile and insecure. It fails to handle multiple parameters correctly and does not sanitize the resulting filename, which could lead to path traversal in applications using this library if they use the returned filename to save files without further validation. Please use a robust parsing method and sanitize the filename to remove any path components or dangerous characters.
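One way to neutralize path components before using a server-supplied filename locally (hypothetical helper, not in the PR) is to reduce it to a bare base name, handling both separator conventions:

```python
from pathlib import PurePosixPath, PureWindowsPath


def safe_local_name(filename: str) -> str:
    """Reduce a server-supplied filename to a bare base name.

    Strips POSIX path components first, then Windows ones, so neither
    "../../etc/passwd" nor "..\\..\\evil.nii" can escape the directory
    the caller writes into.
    """
    name = PureWindowsPath(PurePosixPath(filename).name).name
    if name in ("", ".", ".."):
        raise ValueError(f"unsafe filename: {filename!r}")
    return name
```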

Comment on lines +193 to +277
    def retrieve_nifti_study(self, ae_title: str, study_uid: str) -> list[tuple[str, BytesIO]]:
        """Retrieve NIfTI files for a study."""
        url = f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
        dicomweb_client = self._create_dicom_web_client(ae_title)
        response = dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )
        return list(self._iter_multipart_response(response, stream=False))

    def iter_nifti_study(self, ae_title: str, study_uid: str) -> Iterator[tuple[str, BytesIO]]:
        """Iterate over NIfTI files for a study."""
        url = f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
        dicomweb_client = self._create_dicom_web_client(ae_title)
        response = dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )
        yield from self._iter_multipart_response(response, stream=True)

    def retrieve_nifti_series(
        self, ae_title: str, study_uid: str, series_uid: str
    ) -> list[tuple[str, BytesIO]]:
        """Retrieve NIfTI files for a series."""
        url = (
            f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/nifti"
        )
        dicomweb_client = self._create_dicom_web_client(ae_title)
        response = dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )
        return list(self._iter_multipart_response(response, stream=False))

    def iter_nifti_series(
        self, ae_title: str, study_uid: str, series_uid: str
    ) -> Iterator[tuple[str, BytesIO]]:
        """Iterate over NIfTI files for a series."""
        url = (
            f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/nifti"
        )
        dicomweb_client = self._create_dicom_web_client(ae_title)
        response = dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )
        yield from self._iter_multipart_response(response, stream=True)

    def retrieve_nifti_image(
        self, ae_title: str, study_uid: str, series_uid: str, image_uid: str
    ) -> list[tuple[str, BytesIO]]:
        """Retrieve NIfTI files for a single image."""
        url = (
            f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/instances/{image_uid}/nifti"
        )
        dicomweb_client = self._create_dicom_web_client(ae_title)
        response = dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )
        return list(self._iter_multipart_response(response, stream=False))

    def iter_nifti_image(
        self, ae_title: str, study_uid: str, series_uid: str, image_uid: str
    ) -> Iterator[tuple[str, BytesIO]]:
        """Iterate over NIfTI files for a single image."""
        url = (
            f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/instances/{image_uid}/nifti"
        )
        dicomweb_client = self._create_dicom_web_client(ae_title)
        response = dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )
        yield from self._iter_multipart_response(response, stream=True)
Contributor

medium

There is significant code duplication across the retrieve_nifti_* and iter_nifti_* methods for study, series, and image levels. The logic for URL construction, making the HTTP request, and processing the response is nearly identical in all six methods. This makes the code harder to maintain.

Consider refactoring this logic into one or two private helper methods. For example, you could have a method that constructs the URL and another that performs the request and yields the multipart content. The public methods would then just call these helpers with the appropriate parameters. This would reduce redundancy and improve maintainability.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@adit/core/errors.py`:
- Around line 88-91: The code currently defines NoSpatialDataError but it is
never raised by DicomToNiftiConverter.convert() while
wadors_utils._process_single_fetch catches it alongside NoValidDicomError, so
either remove the dead except clause or make the converter raise the error; to
fix, decide which approach: (A) remove the NoSpatialDataError branch from
wadors_utils._process_single_fetch so only NoValidDicomError is handled, or (B)
update DicomToNiftiConverter.convert to validate spatial attributes (e.g., check
presence of image orientation/position, pixel spacing, slice thickness or other
spatial metadata) and raise NoSpatialDataError with a clear message when those
checks fail (ensure the converter docstring and exception list include
NoSpatialDataError and import/use the NoSpatialDataError symbol), so the handler
in wadors_utils remains correct.

In `@adit/core/utils/dicom_to_nifti_converter.py`:
- Around line 116-117: The except block that catches subprocess.SubprocessError
and raises ExternalToolError loses the original traceback; update the raise so
it chains the original exception (use "raise ExternalToolError(... ) from e")
when handling subprocess.SubprocessError thrown while invoking dcm2niix (the
place referencing ExternalToolError and subprocess.SubprocessError in
dicom_to_nifti_converter.py), so the original exception context is preserved for
debugging.

In `@adit/dicom_web/utils/wadors_utils.py`:
- Around line 211-214: In the except handlers in wadors_utils.py (the blocks
catching RetriableDicomError and DicomError), chain the new exceptions to the
original by using "raise ServiceUnavailableApiError(... ) from err" and "raise
BadGatewayApiError(... ) from err" respectively so the original traceback is
preserved; update the two raise statements inside the except RetriableDicomError
as err and except DicomError as err blocks to use "from err" while keeping the
existing error message string.
- Around line 183-190: Accessing series.Modality can raise AttributeError when
Modality is absent; update the loop in wadors_utils.py (the for series in
series_list block) to defensively read the modality (e.g., via getattr(series,
"Modality", None) or try/except AttributeError) and handle a missing modality by
logging a debug/warn with series.SeriesInstanceUID and continuing (still use
NON_IMAGE_MODALITIES to filter when modality is present). Ensure logger.debug
message still references SeriesInstanceUID and that the code never lets an
AttributeError propagate out of the series iteration.
- Around line 254-256: Replace the logger.error call inside the except Exception
as e handler for the DICOM-to-NIfTI conversion with logger.exception so the
traceback is preserved; specifically, in the except block that currently reads
"except Exception as e: logger.error(f'Error during DICOM to NIfTI conversion:
{e}'); raise", change it to call logger.exception("Error during DICOM to NIfTI
conversion") (keep the subsequent bare raise) so the full stack trace is logged
for the conversion routine that uses the logger instance.

In `@adit/dicom_web/views.py`:
- Around line 336-337: wado_retrieve_nifti should validate the first fetch
before streaming: call the existing peek_images (same check used by
wado_retrieve) to perform an initial _fetch_dicom_data probe and raise/handle
errors early; update wado_retrieve_nifti to invoke peek_images (or replicate its
first-fetch logic) using the same query (including the StudyInstanceUID
assignment) and only proceed to start the streaming generator if peek_images
succeeds so fetch errors aren't emitted as stream payloads.
🧹 Nitpick comments (9)
adit/dicom_web/views.py (1)

327-349: No peek_images call — conversion errors may surface mid-stream.

The existing DICOM retrieve views call await self.peek_images(images) to catch errors before streaming begins, allowing a proper HTTP error response. The NIfTI views skip this step. While wado_retrieve_nifti internally catches DcmToNiftiConversionError and logs warnings, DICOM-level errors like RetriableDicomError/DicomError are re-raised as API exceptions. If these occur after StreamingHttpResponse has already started sending headers, Django cannot return a clean error status code.

Consider whether a similar peek mechanism (or pre-flight validation) would be appropriate for the NIfTI endpoints, particularly for the SERIES and IMAGE levels where the fetch happens in one shot before conversion.

adit/dicom_web/tests/acceptance/test_wadors.py (2)

662-667: Use strict=True in zip() for stronger assertion.

Although the length check on line 665 guards against mismatches, adding strict=True to zip() is a good defensive practice per Ruff B905. Same applies to line 689.

♻️ Proposed fix
-    for (r_name, _), (i_name, _) in zip(retrieved, iterated):
+    for (r_name, _), (i_name, _) in zip(retrieved, iterated, strict=True):

585-645: Missing test coverage for image-level NIfTI and error scenarios.

There are tests for study-level and series-level NIfTI retrieval but none for:

  • retrieve_nifti_image / iter_nifti_image (the image-level NIfTI endpoint)
  • Error scenarios (e.g., non-existent study/series, server without WADO support)

These could be added in a follow-up.

Would you like me to draft acceptance tests for image-level NIfTI retrieval and error cases, or open an issue to track this?

adit/dicom_web/renderers.py (1)

90-118: No error handling in NIfTI renderer — contrast with DICOM renderer.

WadoMultipartApplicationDicomRenderer.render() (lines 45-52) catches exceptions from the upstream iterator and emits an _error_stream part, allowing the client to detect failures mid-stream. The NIfTI renderer has no equivalent: if the upstream iterator raises, the multipart response will be truncated with no error signal.

Given that wadors_utils._process_single_fetch catches most conversion errors internally, this is low-risk in practice, but an unhandled edge case (e.g., I/O error reading a temp file) would produce a silently truncated response.

adit/core/utils/dicom_to_nifti_converter.py (1)

107-114: Success log runs after partial conversion — minor misleading message.

When exit_code == PARTIAL_CONVERSION (8), the method logs a warning but then falls through to line 119 which logs "successfully converted." This could be confusing in logs. Consider returning early or adjusting the success message.

Also, the condition on line 111 (exit_code == UNSPECIFIED_ERROR or exit_code != 0) could be simplified to just exit_code != 0 since all known non-zero codes are already handled above.

♻️ Suggested simplification
-            elif exit_code == DcmToNiftiExitCode.UNSPECIFIED_ERROR or exit_code != 0:
+            elif exit_code != 0:
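The early-return shape this comment suggests can be sketched as follows. The enum names mirror the PR; the numeric codes (0 for success, 8 for partial conversion) are taken from this review and should be verified against the dcm2niix documentation:

```python
import logging
from enum import IntEnum

logger = logging.getLogger("dcm2niix")


class DcmToNiftiExitCode(IntEnum):
    # Illustrative values only -- confirm against dcm2niix before relying on them.
    SUCCESS = 0
    PARTIAL_CONVERSION = 8


def conversion_succeeded(exit_code: int) -> bool:
    """Log the dcm2niix result and return True only on full success.

    Returning early on every non-zero code ensures the success message
    can never follow a partial-conversion warning in the logs.
    """
    if exit_code == DcmToNiftiExitCode.SUCCESS:
        logger.info("dcm2niix finished successfully")
        return True
    if exit_code == DcmToNiftiExitCode.PARTIAL_CONVERSION:
        logger.warning("dcm2niix converted only part of the input")
        return False
    logger.warning("dcm2niix exited with unexpected code %s", exit_code)
    return False
```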
adit/dicom_web/utils/wadors_utils.py (1)

229-236: All DICOM datasets loaded into memory simultaneously before conversion.

_process_single_fetch receives the full list[Dataset], writes them all to disk, then runs conversion. For large series (e.g., thousands of CT slices), this holds all datasets in memory until the temp files are written. This is an inherent consequence of the synchronous _fetch_dicom_data design but worth noting as a scaling concern for study-level requests where multiple series are processed sequentially.

adit-client/adit_client/client.py (3)

279-284: Filename extraction is fragile for edge cases.

split("filename=")[1].strip('"') assumes a specific format. It would fail or produce incorrect results if:

  • The value uses filename*= (RFC 5987 encoding)
  • Multiple filename= tokens appear in the header
  • The filename contains escaped quotes

For dcm2niix-generated filenames this is safe in practice, but consider using Python's email.message module for more robust parsing (the legacy cgi module is deprecated and was removed in Python 3.13).

♻️ More robust alternative
+    import re
+
     def _extract_filename(self, content_disposition: str) -> str:
         """Extract filename from Content-Disposition header."""
         if not content_disposition or "filename=" not in content_disposition:
             raise ValueError("No filename found in Content-Disposition header")
-        filename = content_disposition.split("filename=")[1].strip('"')
+        match = re.search(r'filename="([^"]+)"', content_disposition)
+        if not match:
+            match = re.search(r'filename=(\S+)', content_disposition)
+        if not match:
+            raise ValueError("No filename found in Content-Disposition header")
+        filename = match.group(1)
         return filename

193-277: Significant duplication across six NIfTI methods — consider a shared helper.

The retrieve/iter method pairs for study, series, and image levels share nearly identical structure, differing only in URL path and parameters. A private helper could consolidate this:

♻️ Sketch of a shared helper
def _nifti_request(self, url: str, stream: bool) -> Iterator[tuple[str, BytesIO]]:
    """Shared helper for NIfTI multipart requests."""
    dicomweb_client = self._create_dicom_web_client("")
    response = dicomweb_client._http_get(
        url,
        headers={"Accept": "multipart/related; type=application/octet-stream"},
        stream=True,
    )
    if stream:
        yield from self._iter_multipart_response(response, stream=True)
    else:
        yield from self._iter_multipart_response(response, stream=False)

def retrieve_nifti_study(self, ae_title: str, study_uid: str) -> list[tuple[str, BytesIO]]:
    url = f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
    return list(self._nifti_request(url, stream=False))

def iter_nifti_study(self, ae_title: str, study_uid: str) -> Iterator[tuple[str, BytesIO]]:
    url = f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
    yield from self._nifti_request(url, stream=True)
# ... similar for series and image

296-331: Document the dicomweb-client version this multipart parsing approach was tested against.

The monkey-patching of _extract_part_content is necessary given that dicomweb-client provides no public API for extracting per-part headers from multipart responses. The code already clearly explains this purpose in the _extract_part_content_with_headers docstring.

To improve maintainability, add a version comment to _iter_multipart_response documenting which dicomweb-client version this was tested against (currently ≥0.60.0 per pyproject.toml). This helps future maintainers quickly identify when to re-verify compatibility after library upgrades. For example:

def _iter_multipart_response(self, response, stream=False) -> Iterator[tuple[str, BytesIO]]:
    """Parse a multipart response, yielding (filename, content) tuples.
    
    Note: Uses private API _extract_part_content (tested with dicomweb-client 0.60.0+)
    to access per-part headers. Verify compatibility after library upgrades.
    """

Comment on lines +116 to +117
except subprocess.SubprocessError as e:
raise ExternalToolError(f"Failed to execute dcm2niix: {e}")
Contributor

🛠️ Refactor suggestion | 🟠 Major

Chain the original exception with raise ... from e.

This loses the original traceback, making debugging harder. Ruff B904 correctly flags this.

🔧 Proposed fix
         except subprocess.SubprocessError as e:
-            raise ExternalToolError(f"Failed to execute dcm2niix: {e}")
+            raise ExternalToolError(f"Failed to execute dcm2niix: {e}") from e
🧰 Tools
🪛 Ruff (0.15.0)

[warning] 117-117: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 117-117: Avoid specifying long messages outside the exception class

(TRY003)


Comment on lines +183 to +190
            for series in series_list:
                modality = series.Modality
                if modality in NON_IMAGE_MODALITIES:
                    logger.debug(
                        f"Skipping non-image series {series.SeriesInstanceUID} "
                        f"(modality: {modality})"
                    )
                    continue
Contributor

⚠️ Potential issue | 🟡 Minor

series.Modality access may raise AttributeError if Modality is absent.

Not all PACS return Modality in find_series results. If the attribute is missing, this line will raise an unhandled AttributeError, skipping the rest of the study.

🛡️ Proposed defensive access
             for series in series_list:
-                modality = series.Modality
+                modality = getattr(series, "Modality", None)
                 if modality in NON_IMAGE_MODALITIES:
                     logger.debug(
                         f"Skipping non-image series {series.SeriesInstanceUID} "
                         f"(modality: {modality})"
                     )
                     continue

Comment on lines +211 to +214
except RetriableDicomError as err:
raise ServiceUnavailableApiError(str(err))
except DicomError as err:
raise BadGatewayApiError(str(err))
Contributor

🛠️ Refactor suggestion | 🟠 Major

Chain exceptions with raise ... from err.

Both raise statements discard the original traceback. Ruff B904 flags this correctly.

🔧 Proposed fix
     except RetriableDicomError as err:
-        raise ServiceUnavailableApiError(str(err))
+        raise ServiceUnavailableApiError(str(err)) from err
     except DicomError as err:
-        raise BadGatewayApiError(str(err))
+        raise BadGatewayApiError(str(err)) from err
🧰 Tools
🪛 Ruff (0.15.0)

[warning] 212-212: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 214-214: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In `@adit/dicom_web/utils/wadors_utils.py` around lines 211 - 214, In the except
handlers in wadors_utils.py (the blocks catching RetriableDicomError and
DicomError), chain the new exceptions to the original by using "raise
ServiceUnavailableApiError(... ) from err" and "raise BadGatewayApiError(... )
from err" respectively so the original traceback is preserved; update the two
raise statements inside the except RetriableDicomError as err and except
DicomError as err blocks to use "from err" while keeping the existing error
message string.

Comment on lines +254 to +256
except Exception as e:
logger.error(f"Error during DICOM to NIfTI conversion: {e}")
raise
Contributor

🛠️ Refactor suggestion | 🟠 Major

Use logger.exception instead of logger.error to preserve traceback.

logger.error with an f-string loses the exception traceback. logger.exception automatically includes it, which is critical for debugging unexpected failures in the conversion pipeline. Ruff TRY400 flags this.

🔧 Proposed fix
         except Exception as e:
-            logger.error(f"Error during DICOM to NIfTI conversion: {e}")
+            logger.exception(f"Error during DICOM to NIfTI conversion: {e}")
             raise
📝 Committable suggestion


Suggested change
except Exception as e:
logger.error(f"Error during DICOM to NIfTI conversion: {e}")
raise
except Exception as e:
logger.exception(f"Error during DICOM to NIfTI conversion: {e}")
raise
🧰 Tools
🪛 Ruff (0.15.0)

[warning] 255-255: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

🤖 Prompt for AI Agents
In `@adit/dicom_web/utils/wadors_utils.py` around lines 254 - 256, Replace the
logger.error call inside the except Exception as e handler for the
DICOM-to-NIfTI conversion with logger.exception so the traceback is preserved;
specifically, in the except block that currently reads "except Exception as e:
logger.error(f'Error during DICOM to NIfTI conversion: {e}'); raise", change it
to call logger.exception("Error during DICOM to NIfTI conversion") (keep the
subsequent bare raise) so the full stack trace is logged for the conversion
routine that uses the logger instance.

Comment on lines +336 to +337
query = self.query.copy()
query["StudyInstanceUID"] = study_uid
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's check the adit/dicom_web/views.py file around lines 336-337
cat -n adit/dicom_web/views.py | sed -n '330,345p'

Repository: openradx/adit

Length of output: 838


🏁 Script executed:

# Let's also get the broader context to understand the function
cat -n adit/dicom_web/views.py | sed -n '320,360p'

Repository: openradx/adit

Length of output: 2028


🏁 Script executed:

# Search for wado_retrieve_nifti function
rg -A 30 "def wado_retrieve_nifti" adit/

Repository: openradx/adit

Length of output: 2368


🏁 Script executed:

# Search for _fetch_dicom_data function
rg -A 20 "def _fetch_dicom_data" adit/

Repository: openradx/adit

Length of output: 1503


🏁 Script executed:

# Look for DicomOperator.fetch_series or fetch_study implementations
rg -A 15 "def fetch_series" adit/ --type py

Repository: openradx/adit

Length of output: 2443


🏁 Script executed:

# Search for DicomOperator class and its methods
rg "class DicomOperator" adit/ --type py -A 5

Repository: openradx/adit

Length of output: 389


🏁 Script executed:

# Find where self.query is defined in RetrieveAPIView
rg -B 5 "class RetrieveAPIView" adit/dicom_web/views.py | head -20

Repository: openradx/adit

Length of output: 195


🏁 Script executed:

# Get the full wado_retrieve_nifti function
cat adit/dicom_web/utils/wadors_utils.py | sed -n '/^async def wado_retrieve_nifti/,/^async def /p' | head -100

Repository: openradx/adit

Length of output: 2292


🏁 Script executed:

# Get the full _fetch_dicom_data function
cat adit/dicom_web/utils/wadors_utils.py | sed -n '/^def _fetch_dicom_data/,/^def /p' | head -60

Repository: openradx/adit

Length of output: 2054


🏁 Script executed:

# Look for self.query initialization and see what fields it contains
rg "self\.query\s*=" adit/dicom_web/views.py -B 2 -A 2

Repository: openradx/adit

Length of output: 39


🏁 Script executed:

# Check DicomOperator.fetch_study implementation
rg -A 20 "def fetch_study" adit/core/utils/dicom_operator.py

Repository: openradx/adit

Length of output: 682


🏁 Script executed:

# Check how QueryDataset handles empty values
rg -A 5 "class QueryDataset" adit/ --type py

Repository: openradx/adit

Length of output: 548


🏁 Script executed:

# Look for query property/field in RetrieveAPIView and WebDicomAPIView parent
rg "query\s*=" adit/dicom_web/views.py -B 3 -A 3 | head -40

Repository: openradx/adit

Length of output: 1475


🏁 Script executed:

# Check WebDicomAPIView for self.query
rg -B 5 -A 15 "class WebDicomAPIView" adit/dicom_web/views.py

Repository: openradx/adit

Length of output: 674


🏁 Script executed:

# Look for query initialization in parent classes or as a property
rg "self\.query\s*=|\.query\s*=" adit/dicom_web/ -B 2 -A 2

Repository: openradx/adit

Length of output: 39


🏁 Script executed:

# Get full fetch_series implementation
cat adit/core/utils/dicom_operator.py | sed -n '/def fetch_series/,/^    def /p' | head -50

Repository: openradx/adit

Length of output: 1277


🏁 Script executed:

# Search for retrieve methods that use peek or error recovery
rg "peek|error|recovery|try|except" adit/dicom_web/views.py | head -20

Repository: openradx/adit

Length of output: 1141


🏁 Script executed:

# Check the DICOM retrieve pattern for comparison
rg -B 5 -A 15 "class.*Retrieve.*APIView" adit/dicom_web/views.py | head -100

Repository: openradx/adit

Length of output: 3773


🏁 Script executed:

# Look for the wado_retrieve function to compare with wado_retrieve_nifti
rg -A 50 "^async def wado_retrieve\(" adit/dicom_web/utils/wadors_utils.py

Repository: openradx/adit

Length of output: 2033


🏁 Script executed:

# Check _fetch_images_with_wado_rs implementation
rg -A 30 "def _fetch_images_with_wado_rs" adit/core/utils/dicom_operator.py

Repository: openradx/adit

Length of output: 1303


🏁 Script executed:

# Look for error handling differences between the two retrieve paths
rg -B 3 -A 10 "peek_images" adit/dicom_web/views.py

Repository: openradx/adit

Length of output: 2490


🏁 Script executed:

# Check if there are any guards or validations for PatientID in WADO operations
rg "PatientID" adit/dicom_web/utils/wadors_utils.py -B 2 -A 2

Repository: openradx/adit

Length of output: 1715


🏁 Script executed:

# Check if DicomWebConnector.send_wado_rs handles empty PatientID
rg -A 20 "def send_wado_rs" adit/core/utils/ --type py

Repository: openradx/adit

Length of output: 1701


🏁 Script executed:

# Check how find_series works and if it requires PatientID
rg -A 15 "def find_series" adit/core/utils/dicom_operator.py

Repository: openradx/adit

Length of output: 734


🏁 Script executed:

# Look at how QueryDataset is constructed and if empty strings cause issues
rg -B 5 -A 20 "def from_dict" adit/core/utils/dicom_dataset.py

Repository: openradx/adit

Length of output: 813


🏁 Script executed:

# Check the send_wado_rs implementation in DimseConnector or DicomWebConnector
rg -B 3 "send_wado_rs" adit/core/utils/ --type py -A 15

Repository: openradx/adit

Length of output: 2718


🏁 Script executed:

# Check the full send_wado_rs implementation to see if it filters empty values
cat adit/core/utils/dicom_web_connector.py | sed -n '/def send_wado_rs/,/^    def /p' | head -80

Repository: openradx/adit

Length of output: 2379


🏁 Script executed:

# Check if there's any validation or filtering of empty strings in the query
rg -B 5 -A 10 "dictify" adit/core/utils/dicom_dataset.py

Repository: openradx/adit

Length of output: 1255


🏁 Script executed:

# Look at how find_series is called in wado_retrieve_nifti - note it doesn't pass PatientID
rg -B 3 -A 8 "find_series" adit/dicom_web/utils/wadors_utils.py

Repository: openradx/adit

Length of output: 457


🏁 Script executed:

# Check if QueryDataset.create handles empty values differently than from_dict
rg -B 5 -A 20 "def create" adit/core/utils/dicom_dataset.py | head -60

Repository: openradx/adit

Length of output: 1400


🏁 Script executed:

# Check if there are any tests for wado_retrieve_nifti with study-level retrieval
rg -B 5 -A 20 "test.*wado_retrieve_nifti.*STUDY" adit/dicom_web/tests/ --type py

Repository: openradx/adit

Length of output: 39


🏁 Script executed:

# Check find_series implementation to see if it handles the query correctly
cat adit/core/utils/dicom_operator.py | sed -n '/def find_series/,/^    def /p' | head -40

Repository: openradx/adit

Length of output: 1690


🏁 Script executed:

# Look at how find_series calls the connector - does it pass PatientID?
rg -B 3 -A 15 "def find_series" adit/core/utils/dicom_operator.py

Repository: openradx/adit

Length of output: 819


🏁 Script executed:

# Check DicomWebConnector find_series to see if empty PatientID causes issues
rg -A 30 "def find_series" adit/core/utils/dicom_web_connector.py

Repository: openradx/adit

Length of output: 39


🏁 Script executed:

# Check the find_series usage more carefully - see what server capabilities affect the behavior
cat adit/core/utils/dicom_operator.py | sed -n '/def find_series/,/^    def /p' | head -80

Repository: openradx/adit

Length of output: 2148


🏁 Script executed:

# Check if there's error handling for find_series in wado_retrieve_nifti
cat adit/dicom_web/utils/wadors_utils.py | sed -n '/^async def wado_retrieve_nifti/,/^async def /p' | head -120

Repository: openradx/adit

Length of output: 2292


🏁 Script executed:

# Check DicomWebConnector find_series to see if it filters empty PatientID
cat adit/core/utils/dicom_web_connector.py | sed -n '/def find_series/,/^    def /p' | head -80

Repository: openradx/adit

Length of output: 39


🏁 Script executed:

# Check if wado_retrieve has peek_images error recovery
rg -B 2 -A 10 "peek_images" adit/dicom_web/views.py | grep -A 10 "RetrieveNiftiStudyAPIView" | head -20

Repository: openradx/adit

Length of output: 39


NIfTI path lacks peek/error recovery: errors in early fetches won't be caught before streaming starts.

Empty PatientID in the query is safe for WADO-RS operations (it is not used at the series level), and find_series validates PatientID adequately based on server capabilities. However, unlike wado_retrieve, wado_retrieve_nifti does not call peek_images() to validate the first fetch before beginning the stream. This means errors that occur during the first _fetch_dicom_data call will be streamed as error messages instead of triggering early exception handling.

🤖 Prompt for AI Agents
In `@adit/dicom_web/views.py` around lines 336 - 337, wado_retrieve_nifti should
validate the first fetch before streaming: call the existing peek_images (same
check used by wado_retrieve) to perform an initial _fetch_dicom_data probe and
raise/handle errors early; update wado_retrieve_nifti to invoke peek_images (or
replicate its first-fetch logic) using the same query (including the
StudyInstanceUID assignment) and only proceed to start the streaming generator
if peek_images succeeds so fetch errors aren't emitted as stream payloads.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and valuable feature by adding WADO-RS endpoints for DICOM-to-NIfTI conversion. However, a security audit identified two high-severity vulnerabilities: a Path Traversal vulnerability in the adit-client due to improper handling of the Content-Disposition header, and an HTTP Header Injection (CRLF Injection) vulnerability in the WadoMultipartApplicationNiftiRenderer due to unsanitized filenames being placed in response headers. These issues pose serious security risks and require immediate remediation. Additionally, while the implementation is comprehensive and well-structured, feedback focuses on improving maintainability by reducing code duplication in a few key areas.

Comment on lines +279 to +284
def _extract_filename(self, content_disposition: str) -> str:
"""Extract filename from Content-Disposition header."""
if not content_disposition or "filename=" not in content_disposition:
raise ValueError("No filename found in Content-Disposition header")
filename = content_disposition.split("filename=")[1].strip('"')
return filename
Contributor

security-high high

The _extract_filename function extracts a filename from the Content-Disposition header of an HTTP response. The current implementation does not sanitize the extracted filename for path traversal characters (e.g., ../). An attacker who can control the server's HTTP response can set a malicious Content-Disposition header, such as attachment; filename="../../../etc/passwd". If the user of the AditClient library then uses this filename to write the received data to the filesystem, it can result in writing files to arbitrary locations on the system where the client is run, potentially leading to code execution or system compromise. The filename should be sanitized to remove any directory information.
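One way to close the traversal hole is to parse the header with a regex and strip all directory components before returning the name; this is a hedged sketch (the helper name is invented, not adit-client's current code):

```python
import os
import re


def extract_safe_filename(content_disposition: str) -> str:
    """Extract a filename from Content-Disposition, refusing path components."""
    match = re.search(r'filename="?([^";\r\n]+)"?', content_disposition or "")
    if not match:
        raise ValueError("No filename found in Content-Disposition header")
    # Normalize separators, then keep only the last path segment so values
    # like "../../../etc/passwd" cannot escape the target directory.
    candidate = match.group(1).strip().replace("\\", "/")
    filename = os.path.basename(candidate)
    if not filename or filename in (".", ".."):
        raise ValueError("Invalid filename in Content-Disposition header")
    return filename
```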

yield f"\r\n--{self.boundary}\r\n".encode()

yield "Content-Type: application/octet-stream\r\n".encode()
yield f'Content-Disposition: attachment; filename="{filename}"\r\n\r\n'.encode()
Contributor

security-high high

The render method of the WadoMultipartApplicationNiftiRenderer class constructs a Content-Disposition header using a filename that originates from the output of the dcm2niix tool. The filename format is based on the DICOM Series Description (%s), which can be controlled by an attacker if they can provide a malicious DICOM file. The filename is embedded directly into the response header without sanitization. If an attacker crafts a DICOM file with a SeriesDescription containing CRLF characters (e.g., foo\r\nContent-Type: text/html), the generated filename can break out of the Content-Disposition header and inject arbitrary HTTP headers and body content into the response. This can lead to vulnerabilities like Cross-Site Scripting (XSS), cache poisoning, or session fixation. The filename should be sanitized to remove any characters that are not valid in a filename or could be used for header injection.
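A conservative whitelist, applied before the filename ever reaches a response header, neutralizes CRLF injection; the sketch below is a hypothetical helper, not the renderer's current code:

```python
import re


def sanitize_header_filename(filename: str) -> str:
    """Map a dcm2niix-derived filename onto a header-safe character set."""
    # Replacing everything outside [A-Za-z0-9._-] removes CR, LF, quotes and
    # separators, so the value cannot terminate or extend the header line.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", filename)
    return safe or "unnamed"
```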

Comment on lines +193 to +277
def retrieve_nifti_study(self, ae_title: str, study_uid: str) -> list[tuple[str, BytesIO]]:
"""Retrieve NIfTI files for a study."""
url = f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
dicomweb_client = self._create_dicom_web_client(ae_title)
response = dicomweb_client._http_get(
url,
headers={"Accept": "multipart/related; type=application/octet-stream"},
stream=True,
)
return list(self._iter_multipart_response(response, stream=False))

def iter_nifti_study(self, ae_title: str, study_uid: str) -> Iterator[tuple[str, BytesIO]]:
"""Iterate over NIfTI files for a study."""
url = f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
dicomweb_client = self._create_dicom_web_client(ae_title)
response = dicomweb_client._http_get(
url,
headers={"Accept": "multipart/related; type=application/octet-stream"},
stream=True,
)
yield from self._iter_multipart_response(response, stream=True)

def retrieve_nifti_series(
self, ae_title: str, study_uid: str, series_uid: str
) -> list[tuple[str, BytesIO]]:
"""Retrieve NIfTI files for a series."""
url = (
f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
f"series/{series_uid}/nifti"
)
dicomweb_client = self._create_dicom_web_client(ae_title)
response = dicomweb_client._http_get(
url,
headers={"Accept": "multipart/related; type=application/octet-stream"},
stream=True,
)
return list(self._iter_multipart_response(response, stream=False))

def iter_nifti_series(
self, ae_title: str, study_uid: str, series_uid: str
) -> Iterator[tuple[str, BytesIO]]:
"""Iterate over NIfTI files for a series."""
url = (
f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
f"series/{series_uid}/nifti"
)
dicomweb_client = self._create_dicom_web_client(ae_title)
response = dicomweb_client._http_get(
url,
headers={"Accept": "multipart/related; type=application/octet-stream"},
stream=True,
)
yield from self._iter_multipart_response(response, stream=True)

def retrieve_nifti_image(
self, ae_title: str, study_uid: str, series_uid: str, image_uid: str
) -> list[tuple[str, BytesIO]]:
"""Retrieve NIfTI files for a single image."""
url = (
f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
f"series/{series_uid}/instances/{image_uid}/nifti"
)
dicomweb_client = self._create_dicom_web_client(ae_title)
response = dicomweb_client._http_get(
url,
headers={"Accept": "multipart/related; type=application/octet-stream"},
stream=True,
)
return list(self._iter_multipart_response(response, stream=False))

def iter_nifti_image(
self, ae_title: str, study_uid: str, series_uid: str, image_uid: str
) -> Iterator[tuple[str, BytesIO]]:
"""Iterate over NIfTI files for a single image."""
url = (
f"{self.server_url}/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
f"series/{series_uid}/instances/{image_uid}/nifti"
)
dicomweb_client = self._create_dicom_web_client(ae_title)
response = dicomweb_client._http_get(
url,
headers={"Accept": "multipart/related; type=application/octet-stream"},
stream=True,
)
yield from self._iter_multipart_response(response, stream=True)
Contributor

high

There is significant code duplication across the retrieve_nifti_* and iter_nifti_* methods. The logic for URL construction and making the HTTP request is repeated in each method. This can be refactored to improve maintainability and reduce redundancy.

I suggest introducing a private helper method to handle the common logic of making the request. This will make the public methods much cleaner and easier to manage.

    def _make_nifti_request(self, ae_title: str, path: str):
        url = f"{self.server_url}{path}"
        dicomweb_client = self._create_dicom_web_client(ae_title)
        return dicomweb_client._http_get(
            url,
            headers={"Accept": "multipart/related; type=application/octet-stream"},
            stream=True,
        )

    def retrieve_nifti_study(self, ae_title: str, study_uid: str) -> list[tuple[str, BytesIO]]:
        """Retrieve NIfTI files for a study."""
        path = f"/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
        response = self._make_nifti_request(ae_title, path)
        return list(self._iter_multipart_response(response, stream=False))

    def iter_nifti_study(self, ae_title: str, study_uid: str) -> Iterator[tuple[str, BytesIO]]:
        """Iterate over NIfTI files for a study."""
        path = f"/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/nifti"
        response = self._make_nifti_request(ae_title, path)
        yield from self._iter_multipart_response(response, stream=True)

    def retrieve_nifti_series(
        self, ae_title: str, study_uid: str, series_uid: str
    ) -> list[tuple[str, BytesIO]]:
        """Retrieve NIfTI files for a series."""
        path = (
            f"/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/nifti"
        )
        response = self._make_nifti_request(ae_title, path)
        return list(self._iter_multipart_response(response, stream=False))

    def iter_nifti_series(
        self, ae_title: str, study_uid: str, series_uid: str
    ) -> Iterator[tuple[str, BytesIO]]:
        """Iterate over NIfTI files for a series."""
        path = (
            f"/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/nifti"
        )
        response = self._make_nifti_request(ae_title, path)
        yield from self._iter_multipart_response(response, stream=True)

    def retrieve_nifti_image(
        self, ae_title: str, study_uid: str, series_uid: str, image_uid: str
    ) -> list[tuple[str, BytesIO]]:
        """Retrieve NIfTI files for a single image."""
        path = (
            f"/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/instances/{image_uid}/nifti"
        )
        response = self._make_nifti_request(ae_title, path)
        return list(self._iter_multipart_response(response, stream=False))

    def iter_nifti_image(
        self, ae_title: str, study_uid: str, series_uid: str, image_uid: str
    ) -> Iterator[tuple[str, BytesIO]]:
        """Iterate over NIfTI files for a single image."""
        path = (
            f"/api/dicom-web/{ae_title}/wadors/studies/{study_uid}/"
            f"series/{series_uid}/instances/{image_uid}/nifti"
        )
        response = self._make_nifti_request(ae_title, path)
        yield from self._iter_multipart_response(response, stream=True)

Collaborator Author

Thanks for the suggestion. I considered this but decided to keep the explicit methods as-is. The "duplication" is only ~3 lines of boilerplate per method (URL construction + _http_get call), and having six clearly-named methods with distinct type signatures provides better IDE support (autocomplete, go-to-definition) and makes each method's URL path immediately visible without indirection. The proposed helper doesn't meaningfully reduce complexity; it just moves it behind another layer.

Comment on lines +91 to +114
if exit_code == DcmToNiftiExitCode.SUCCESS:
pass
elif exit_code == DcmToNiftiExitCode.NO_DICOM_FOUND:
raise NoValidDicomError(f"No DICOM images found in input folder: {error_msg}")
elif exit_code == DcmToNiftiExitCode.VERSION_REPORT:
logger.info(f"dcm2niix version report: {error_msg}")
elif exit_code == DcmToNiftiExitCode.CORRUPT_DICOM:
raise InvalidDicomError(f"Corrupt DICOM file: {error_msg}")
elif exit_code == DcmToNiftiExitCode.INVALID_INPUT_FOLDER:
raise InputDirectoryError(f"Input folder invalid: {error_msg}")
elif exit_code == DcmToNiftiExitCode.INVALID_OUTPUT_FOLDER:
raise OutputDirectoryError(f"Output folder invalid: {error_msg}")
elif exit_code == DcmToNiftiExitCode.WRITE_PERMISSION_ERROR:
raise OutputDirectoryError(
f"Unable to write to output folder (check permissions): {error_msg}"
)
elif exit_code == DcmToNiftiExitCode.PARTIAL_CONVERSION:
logger.warning(f"Converted some but not all input DICOMs: {error_msg}")
elif exit_code == DcmToNiftiExitCode.RENAME_ERROR:
raise DcmToNiftiConversionError(f"Unable to rename files: {error_msg}")
elif exit_code == DcmToNiftiExitCode.UNSPECIFIED_ERROR or exit_code != 0:
raise DcmToNiftiConversionError(
f"Unspecified error (exit code {exit_code}): {error_msg}"
)
Contributor

medium

The long if/elif chain to handle dcm2niix exit codes is functional but can be made more concise and maintainable. Using a dictionary to map exit codes to exceptions and messages would make this logic more data-driven and easier to extend if more exit codes need to be handled in the future.

            error_map = {
                DcmToNiftiExitCode.NO_DICOM_FOUND: (NoValidDicomError, f"No DICOM images found in input folder: {error_msg}"),
                DcmToNiftiExitCode.CORRUPT_DICOM: (InvalidDicomError, f"Corrupt DICOM file: {error_msg}"),
                DcmToNiftiExitCode.INVALID_INPUT_FOLDER: (InputDirectoryError, f"Input folder invalid: {error_msg}"),
                DcmToNiftiExitCode.INVALID_OUTPUT_FOLDER: (OutputDirectoryError, f"Output folder invalid: {error_msg}"),
                DcmToNiftiExitCode.WRITE_PERMISSION_ERROR: (OutputDirectoryError, f"Unable to write to output folder (check permissions): {error_msg}"),
                DcmToNiftiExitCode.RENAME_ERROR: (DcmToNiftiConversionError, f"Unable to rename files: {error_msg}"),
            }

            if exit_code == DcmToNiftiExitCode.SUCCESS:
                pass
            elif exit_code == DcmToNiftiExitCode.VERSION_REPORT:
                logger.info(f"dcm2niix version report: {error_msg}")
            elif exit_code == DcmToNiftiExitCode.PARTIAL_CONVERSION:
                logger.warning(f"Converted some but not all input DICOMs: {error_msg}")
            elif exit_code in error_map:
                exc_class, msg = error_map[exit_code]
                raise exc_class(msg)
            elif exit_code != 0:
                raise DcmToNiftiConversionError(
                    f"Unspecified error (exit code {exit_code}): {error_msg}"
                )

Collaborator Author

I prefer keeping the if/elif chain here. The exit codes have three distinct behaviors: raise an exception, log and continue, or do nothing (SUCCESS), which don't all fit cleanly into a dictionary lookup. The suggested refactor still needs if/elif for SUCCESS, VERSION_REPORT, and PARTIAL_CONVERSION, so it splits the logic across two patterns (dict + if/elif) instead of one consistent chain. The current version is 24 lines, reads top-to-bottom, and each case is immediately visible. I don't think the indirection improves readability or maintainability for a fixed set of 10 exit codes that rarely changes.

Comment on lines +276 to +299
for _base_name, files in file_pairs.items():
if "json" in files:
json_file_path = os.path.join(nifti_output_dir, files["json"])
async with aiofiles.open(json_file_path, "rb") as f:
json_content = await f.read()
yield files["json"], BytesIO(json_content)

if "nifti" in files:
nifti_file_path = os.path.join(nifti_output_dir, files["nifti"])
async with aiofiles.open(nifti_file_path, "rb") as f:
nifti_content = await f.read()
yield files["nifti"], BytesIO(nifti_content)

if "bval" in files:
bval_file_path = os.path.join(nifti_output_dir, files["bval"])
async with aiofiles.open(bval_file_path, "rb") as f:
bval_content = await f.read()
yield files["bval"], BytesIO(bval_content)

if "bvec" in files:
bvec_file_path = os.path.join(nifti_output_dir, files["bvec"])
async with aiofiles.open(bvec_file_path, "rb") as f:
bvec_content = await f.read()
yield files["bvec"], BytesIO(bvec_content)
Contributor

medium

The logic for yielding the different file types (json, nifti, bval, bvec) is repetitive. Each file type has its own if block with nearly identical code for opening, reading, and yielding the file content. This can be simplified to reduce duplication.

Consider using a loop over a predefined list of file types to handle this more elegantly.

Suggested change
for _base_name, files in file_pairs.items():
if "json" in files:
json_file_path = os.path.join(nifti_output_dir, files["json"])
async with aiofiles.open(json_file_path, "rb") as f:
json_content = await f.read()
yield files["json"], BytesIO(json_content)
if "nifti" in files:
nifti_file_path = os.path.join(nifti_output_dir, files["nifti"])
async with aiofiles.open(nifti_file_path, "rb") as f:
nifti_content = await f.read()
yield files["nifti"], BytesIO(nifti_content)
if "bval" in files:
bval_file_path = os.path.join(nifti_output_dir, files["bval"])
async with aiofiles.open(bval_file_path, "rb") as f:
bval_content = await f.read()
yield files["bval"], BytesIO(bval_content)
if "bvec" in files:
bvec_file_path = os.path.join(nifti_output_dir, files["bvec"])
async with aiofiles.open(bvec_file_path, "rb") as f:
bvec_content = await f.read()
yield files["bvec"], BytesIO(bvec_content)
file_order = ["json", "nifti", "bval", "bvec"]
for _base_name, files in file_pairs.items():
for file_type in file_order:
if file_type in files:
filename = files[file_type]
file_path = os.path.join(nifti_output_dir, filename)
async with aiofiles.open(file_path, "rb") as f:
content = await f.read()
yield filename, BytesIO(content)

Collaborator Author

Good suggestion

@claude

claude bot commented Feb 11, 2026

Pull Request Review: WADO-RS NIfTI Conversion API Endpoints

This is a solid implementation that adds NIfTI conversion capabilities to the DICOMweb API. Here's my detailed feedback:


✅ Strengths

  1. Excellent Error Handling: The granular error hierarchy for dcm2niix exit codes is well-designed and properly maps tool-specific errors to typed exceptions.

  2. Good Modality Filtering: Pre-filtering non-image modalities (SR, KO, PR) before fetching prevents unnecessary network transfers and failed conversions.

  3. Comprehensive Test Coverage: Acceptance tests cover all three levels (study, series, instance) and both batch and streaming APIs, which is thorough.

  4. Client API Design: Dual batch/streaming methods (retrieve_nifti_* and iter_nifti_*) provide flexibility for different use cases.


🔧 Code Quality Issues

1. Potential Resource Leak in Client (adit-client/adit_client/client.py:121-156)

The _iter_multipart_response method patches _extract_part_content but doesn't guarantee restoration on exception. While there's a finally block, if an exception occurs during iteration after yielding values, the patch might not be restored properly.

Recommendation: Consider creating a new DICOMwebClient instance instead of patching the shared one, or add a context manager to ensure proper cleanup.
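The restore-on-exception concern can be addressed with a small context manager; `patched_attr` below is a generic sketch, not existing adit-client code:

```python
from contextlib import contextmanager


@contextmanager
def patched_attr(obj, name, value):
    """Temporarily replace an attribute, restoring it even if the body raises."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs on normal exit, on exceptions, and when a consuming
        # generator is closed mid-iteration.
        setattr(obj, name, original)
```

Wrapping the patched call in `with patched_attr(client, "_extract_part_content", custom_fn):` guarantees the shared client is restored no matter how iteration ends.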

2. Unused Import (adit-client/adit_client/client.py:10)

The Union type hint is imported but not used in the visible code.

3. NoSpatialDataError Not Raised (adit/core/errors.py:88-91)

NoSpatialDataError is defined but never raised in the converter code. Either remove it or add the corresponding logic to raise it when appropriate.

4. Condition Always True (adit/core/utils/dicom_to_nifti_converter.py:111)

The condition elif exit_code == DcmToNiftiExitCode.UNSPECIFIED_ERROR or exit_code != 0: will catch any non-zero exit code that wasn't matched earlier. The or exit_code != 0 part makes the first check redundant. Consider simplifying to just else:.

5. File Extension Parsing (adit/dicom_web/utils/wadors_utils.py:782-793)

The file extension parsing with .nii.gz files works but is convoluted. Consider using a more direct approach with endswith() checks.
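A direct endswith()-based split, as suggested, might look like this (the helper name and exact extension list are illustrative, not the PR's code):

```python
def split_nifti_name(filename: str) -> tuple[str, str]:
    """Split a dcm2niix output name into (base, extension),
    treating the compound .nii.gz suffix as a single unit."""
    for ext in (".nii.gz", ".nii", ".json", ".bval", ".bvec"):
        if filename.endswith(ext):
            return filename[: -len(ext)], ext
    base, dot, ext = filename.rpartition(".")
    return (base, dot + ext) if dot else (filename, "")
```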


🔒 Security Considerations

1. Filename Injection (adit/dicom_web/renderers.py:348)

Good job sanitizing the filename for Content-Disposition. However, consider also removing other control characters and validating the filename more strictly with a regex pattern that allows only safe characters.

2. Path Traversal (adit-client/adit_client/client.py:109)

Good use of os.path.basename() to prevent path traversal attacks. This is the correct approach.


⚡ Performance Considerations

1. Memory Efficiency - Study-Level Fetches (adit/dicom_web/utils/wadors_utils.py:695-722)

Excellent design! Fetching series individually for study-level requests prevents loading entire studies into memory. This is the right approach for large datasets.

2. Temporary File Management

Using TemporaryDirectory with async cleanup is good, but for very large studies with many series, you might accumulate significant disk usage before cleanup. Consider adding periodic cleanup or size limits for production use.


🐛 Potential Bugs

1. Missing PatientID Validation (adit/dicom_web/utils/wadors_utils.py:712)

In the study-level fetch, you access query["PatientID"] but it might not exist if the original request didn't include it. Add validation to ensure required fields are present.
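
A minimal guard might look like this; require_field is a hypothetical helper, and the real view would raise its own API error type rather than ValueError.

```python
def require_field(query: dict, field: str) -> str:
    """Return query[field], failing loudly if it is missing or empty."""
    value = query.get(field)
    if not value:
        raise ValueError(f"missing required query field: {field}")
    return value
```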

2. Iterator Consumption (adit/dicom_web/views.py:836)

You pass async iterators directly to wado_retrieve_nifti, which is correct. However, ensure that exceptions during iteration are properly propagated and don't leave resources open. The current implementation looks safe.


📝 Documentation & Style

1. Missing Docstring Examples

The client methods would benefit from usage examples in docstrings showing how to use the returned tuples and save files.

2. Consistent Logging

Good fix changing the logger name from "__name__" to __name__ in line 623.


🧪 Testing Suggestions

  1. Edge Cases: Add tests for:

    • Studies with only non-image modalities (SR/KO/PR)
    • Empty series
    • Corrupt DICOM files
    • Very large studies (performance testing)
  2. Error Handling: Test each exception type in the error hierarchy to ensure proper propagation through the API.

  3. Streaming: Verify that streaming responses properly handle client disconnection and don't leak resources.


📊 Overall Assessment

Score: 8.5/10

This is a well-architected feature with good separation of concerns, comprehensive error handling, and solid test coverage. The main improvements needed are:

  1. Fix the resource leak potential in client patching
  2. Add input validation for required fields
  3. Clean up unused code (NoSpatialDataError, Union import)
  4. Improve filename sanitization
  5. Simplify the exit code condition logic

The performance optimizations (series-by-series fetching, streaming responses) show good understanding of the problem domain. Once the minor issues are addressed, this will be production-ready.


Recommendation: Approve with minor revisions suggested above.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@adit/dicom_web/renderers.py`:
- Around line 101-120: The generator in render (streaming_content) emits only a
closing boundary when images yields zero items because first_part remains True;
modify streaming_content to track whether any part was yielded (e.g., use a
boolean like first_part or parts_emitted) and after the async for loop only emit
the closing boundary if at least one part was emitted—otherwise yield nothing
(or raise/return an empty body) so no standalone closing boundary is sent;
update references in streaming_content (first_part, boundary, images, render)
accordingly.

In `@adit/dicom_web/utils/wadors_utils.py`:
- Around line 246-253: The exception handlers in _process_single_fetch currently
swallow NoValidDicomError, NoSpatialDataError, and DcmToNiftiConversionError and
just return, which yields an empty response for non-STUDY requests; update the
except blocks in _process_single_fetch so that if the request level (the
function's level parameter) is not "STUDY" you re-raise the caught exception (or
raise an appropriate HTTP/ValueError) instead of silently returning, and only
log-and-return for level == "STUDY"; ensure the change is consistent with how
wado_retrieve_nifti expects to handle errors so callers at SERIES/IMAGE levels
receive an error response rather than an empty multipart payload.
- Around line 258-259: The result of aiofiles.os.scandir in wadors_utils.py is a
regular os.scandir iterator and should be used as a (synchronous) context
manager to guarantee the underlying OS file descriptor is closed; replace the
current direct await aiofiles.os.scandir(nifti_output_dir) usage (the entries ->
all_files list comprehension) with a with (await
aiofiles.os.scandir(nifti_output_dir)) as entries: block and then build
all_files = [entry.name for entry in entries] inside that block so entries is
always cleaned up even on error or interruption.
🧹 Nitpick comments (5)
adit/dicom_web/tests/acceptance/test_wadors.py (1)

665-669: Add strict=True to zip() calls (Ruff B905).

The preceding assertion already checks equal lengths, but strict=True provides a safety net if that assertion is ever removed or reordered.

🔧 Proposed fix
-    for (r_name, _), (i_name, _) in zip(retrieved, iterated):
+    for (r_name, _), (i_name, _) in zip(retrieved, iterated, strict=True):

Apply the same change on lines 693 and 756.

Also applies to: 690-694, 753-757

adit/dicom_web/utils/wadors_utils.py (2)

229-244: dcm2niix input directory includes the nifti_output subdirectory.

temp_path is used as both the parent for DICOM files and the location of nifti_output_dir. Since dcm2niix scans the input directory recursively, it will also scan into nifti_output/. On the first conversion call this is harmless (the output dir is empty), but it's fragile — if the code is ever refactored to run multiple conversions into the same temp dir, stale outputs could interfere.

Consider placing DICOM files in their own subdirectory (e.g., temp_path / "dicom_input").

🔧 Proposed fix
     async with TemporaryDirectory() as temp_dir:
         temp_path = Path(temp_dir)
+        dicom_input_dir = temp_path / "dicom_input"
+        await aiofiles.os.makedirs(dicom_input_dir, exist_ok=True)

         for file_idx, dicom_image in enumerate(dicom_images):
-            dicom_file_path = temp_path / f"dicom_file_{file_idx}.dcm"
+            dicom_file_path = dicom_input_dir / f"dicom_file_{file_idx}.dcm"
             await sync_to_async(write_dataset, thread_sensitive=False)(
                 dicom_image, dicom_file_path
             )

         nifti_output_dir = temp_path / "nifti_output"
         await aiofiles.os.makedirs(nifti_output_dir, exist_ok=True)
         converter = DicomToNiftiConverter()

         try:
             await sync_to_async(converter.convert, thread_sensitive=False)(
-                temp_path, nifti_output_dir
+                dicom_input_dir, nifti_output_dir
             )

118-156: _fetch_dicom_data largely duplicates wado_retrieve's fetch logic.

Both functions build a QueryDataset, dispatch on level, and call the same operator.fetch_* methods. The only difference is sync vs async and the callback accumulating into a list vs a queue. Consider extracting the shared dispatch into a helper to reduce duplication.

Also, the assert on line 145 will be stripped under python -O. The existing wado_retrieve (line 76) has the same issue, so this is consistent, but worth noting.

adit-client/adit_client/client.py (2)

297-332: Monkey-patching DICOMwebClient._extract_part_content is fragile.

This approach depends on the internal structure of dicomweb-client, which may change without notice across versions. Additionally, line 299 creates a throwaway DICOMwebClient with an empty ae_title solely to access _decode_multipart_message, allocating an unnecessary HTTP session.

Consider implementing a standalone multipart parser (or using a library like requests-toolbelt) instead of patching the library internals. This would decouple the NIfTI streaming from the dicomweb-client library's private API.
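
A standalone parser for multipart/related bodies can be quite small. The sketch below is a simplified illustration of RFC 2046 boundary handling, not a complete implementation; it assumes CRLF line endings and a known boundary string.

```python
def parse_multipart(body: bytes, boundary: str) -> list[tuple[dict, bytes]]:
    """Split a multipart body into (headers, content) pairs."""
    delimiter = b"--" + boundary.encode()
    parts = []
    for chunk in body.split(delimiter)[1:]:  # skip any preamble
        if chunk.startswith(b"--"):  # reached the closing delimiter
            break
        chunk = chunk.lstrip(b"\r\n")
        header_blob, _, content = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in header_blob.split(b"\r\n"):
            name, _, value = line.partition(b":")
            headers[name.decode().strip().lower()] = value.decode().strip()
        parts.append((headers, content.rstrip(b"\r\n")))
    return parts
```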


194-278: All six NIfTI methods use dicomweb_client._http_get (private API).

This is the same concern as the monkey-patching: coupling to DICOMwebClient internals. If you do proceed with this approach, consider using requests.Session directly (which is already configured via _create_dicom_web_client) to avoid depending on private methods.

Also note dicomweb_client is instantiated on e.g., lines 197 and 208 but only used for _http_get; the same client instance could be reused within _iter_multipart_response instead of creating a second one on line 299.

Comment on lines +101 to +120
def render(self, images: AsyncIterator[tuple[str, BytesIO]]) -> AsyncIterator[bytes]:
async def streaming_content():
first_part = True
async for filename, file_content in images:
if first_part:
yield f"--{self.boundary}\r\n".encode()
first_part = False
else:
yield f"\r\n--{self.boundary}\r\n".encode()

yield "Content-Type: application/octet-stream\r\n".encode()
safe_filename = filename.replace("\r", "").replace("\n", "").replace('"', "")
disposition = f'Content-Disposition: attachment; filename="{safe_filename}"'
yield f"{disposition}\r\n\r\n".encode()

yield file_content.getvalue()

yield f"\r\n--{self.boundary}--\r\n".encode()

return streaming_content()

⚠️ Potential issue | 🟡 Minor

Empty iterator produces a malformed closing boundary without an opening part.

If images yields zero items (e.g., all series are filtered out as non-image), first_part stays True and the generator emits only the closing boundary \r\n--nifti-boundary--\r\n with no opening boundary. Multipart parsers may reject this.

Consider yielding nothing (or an empty body with a proper status code) when there are no parts.

🔧 Proposed fix
         async def streaming_content():
             first_part = True
             async for filename, file_content in images:
                 if first_part:
                     yield f"--{self.boundary}\r\n".encode()
                     first_part = False
                 else:
                     yield f"\r\n--{self.boundary}\r\n".encode()

                 yield "Content-Type: application/octet-stream\r\n".encode()
                 safe_filename = filename.replace("\r", "").replace("\n", "").replace('"', "")
                 disposition = f'Content-Disposition: attachment; filename="{safe_filename}"'
                 yield f"{disposition}\r\n\r\n".encode()

                 yield file_content.getvalue()

-            yield f"\r\n--{self.boundary}--\r\n".encode()
+            if not first_part:
+                yield f"\r\n--{self.boundary}--\r\n".encode()

Comment on lines +246 to +253
except (NoValidDicomError, NoSpatialDataError) as e:
# The series passed the modality check but still failed conversion.
# This is unexpected and worth logging as a warning.
logger.warning(f"Series conversion failed unexpectedly: {e}")
return
except DcmToNiftiConversionError as e:
logger.warning(f"Failed to convert DICOM files to NIfTI: {e}")
return

⚠️ Potential issue | 🟡 Minor

Silent return on conversion failure may confuse callers at SERIES/IMAGE level.

For study-level requests, silently skipping a failed series is reasonable (other series may succeed). But for series-level or image-level requests, _process_single_fetch silently returns empty results on NoValidDicomError, NoSpatialDataError, or DcmToNiftiConversionError. The caller in wado_retrieve_nifti (lines 204-209) doesn't distinguish the level, so the client receives an empty multipart response with no error indication.

Consider propagating these errors (or returning an appropriate HTTP error) when level != "STUDY".


Comment on lines +258 to +259
entries = await aiofiles.os.scandir(nifti_output_dir)
all_files = [entry.name for entry in entries]

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, check aiofiles version and scandir implementation
pip install aiofiles 2>/dev/null && python3 << 'ENDPYTHON'
import inspect
import aiofiles.os

# Check if scandir exists and how it's implemented
if hasattr(aiofiles.os, 'scandir'):
    print("=== aiofiles.os.scandir exists ===")
    scandir_func = aiofiles.os.scandir
    print(f"Type: {type(scandir_func)}")
    print(f"Module: {scandir_func.__module__}")
    
    # Try to get source or info
    try:
        sig = inspect.signature(scandir_func)
        print(f"Signature: {sig}")
    except:
        print("Could not get signature")
    
    # Check if it's a coroutine function
    print(f"Is coroutine function: {inspect.iscoroutinefunction(scandir_func)}")
    
    # Try to get docstring
    if scandir_func.__doc__:
        print(f"\nDocstring:\n{scandir_func.__doc__}")
else:
    print("scandir not found in aiofiles.os")

# Also check aiofiles version
import aiofiles
print(f"\naiofiles version: {aiofiles.__version__}")
ENDPYTHON

Repository: openradx/adit

Length of output: 863


🏁 Script executed:

#!/bin/bash
# Read the actual code at lines 258-259 with context
cat -n adit/dicom_web/utils/wadors_utils.py | sed -n '250,270p'

Repository: openradx/adit

Length of output: 1151


🏁 Script executed:

#!/bin/bash
# Check aiofiles source code for scandir implementation and resource management
python3 << 'ENDPYTHON'
import inspect
import aiofiles.os

# Get more details about the scandir function
scandir_func = aiofiles.os.scandir

# Try to get the actual implementation
try:
    # Check if it's wrapping os.scandir
    import os
    print("=== Standard os.scandir() ===")
    print(f"Is context manager: {hasattr(os.scandir('.'), '__enter__')}")
    
    # Check return type
    std_scandir = os.scandir('.')
    print(f"Type: {type(std_scandir)}")
    print(f"Has __enter__: {hasattr(std_scandir, '__enter__')}")
    print(f"Has __aenter__: {hasattr(std_scandir, '__aenter__')}")
    print(f"Has __iter__: {hasattr(std_scandir, '__iter__')}")
    std_scandir.close()
except Exception as e:
    print(f"Error checking os.scandir: {e}")

print("\n=== Checking aiofiles implementation ===")
# Try to find where aiofiles.os module comes from
print(f"aiofiles.os module: {aiofiles.os.__file__}")

# Check if we can see the actual wrapping
try:
    import aiofiles.os as aos_module
    print(f"Module contents sample: {[x for x in dir(aos_module) if not x.startswith('_')][:10]}")
except:
    pass
ENDPYTHON

Repository: openradx/adit

Length of output: 459


🏁 Script executed:

#!/bin/bash
# Check aiofiles source directly for scandir resource management
python3 << 'ENDPYTHON'
import aiofiles.os
import inspect

# Try to get source code of scandir
try:
    source_file = inspect.getfile(aiofiles.os.scandir)
    print(f"Source file: {source_file}")
    
    # Read the file to find scandir
    with open(source_file, 'r') as f:
        content = f.read()
    
    # Look for scandir definition/wrapping
    if 'scandir' in content:
        # Find the scandir part
        lines = content.split('\n')
        for i, line in enumerate(lines):
            if 'scandir' in line.lower():
                start = max(0, i - 2)
                end = min(len(lines), i + 15)
                print(f"\nContext around scandir (lines {start}-{end}):")
                for j in range(start, end):
                    print(f"{j}: {lines[j]}")
                break
except Exception as e:
    print(f"Error: {e}")

# Also check aiofiles documentation for resource management
print("\n=== Checking aiofiles module structure ===")
import aiofiles
print(f"aiofiles.__file__: {aiofiles.__file__}")
ENDPYTHON

Repository: openradx/adit

Length of output: 272


🌐 Web query:

aiofiles os.scandir resource management context manager

💡 Result:

  • aiofiles.os.scandir is just os.scandir executed in a threadpool (scandir = wrap(os.scandir)), i.e. you await it to create the iterator. It does not return a special async context manager/async iterator wrapper. [1], [2]
  • The returned object is the normal os.scandir() iterator (ScandirIterator), which is a synchronous context manager and has .close() to free OS resources (fd/handle). Using with ... (or calling .close()) is recommended to avoid ResourceWarning if it isn’t exhausted. [3]

Typical usage:

from aiofiles import os as aios

it = await aios.scandir(path)   # created in executor
with it:                        # regular (sync) context manager
    for entry in it:            # regular (sync) iteration
        ...

If you need closing to happen off-thread too:

it = await aios.scandir(path)
try:
    ...
finally:
    await asyncio.get_running_loop().run_in_executor(None, it.close)

Sources:
[1] https://sources.debian.org/src/aiofiles/24.1.0-2/src/aiofiles/os.py
[2] https://sources.debian.org/src/aiofiles/24.1.0-2/src/aiofiles/ospath.py
[3] https://docs.python.org/3.12/library/os.html#os.scandir


🌐 Web query:

aiofiles scandir file descriptor leak

💡 Result:

aiofiles.os.scandir() can look like it “leaks” file descriptors because it ultimately returns a regular os.scandir() iterator, which holds an OS directory handle (FD) until the iterator is exhausted or explicitly closed. If you keep creating scandir iterators without exhausting/closing them (or they live longer than you expect), open FDs accumulate. aiofiles is just delegating the call to the standard-library API via an executor; it doesn’t change this lifecycle. [1][2][3]

What to do

Always close the iterator (or use it as a context manager):

from aiofiles import os as aios

async def walk(path):
    with (await aios.scandir(path)) as it:  # closes FD on exit
        for entry in it:
            ...

Or:

it = await aios.scandir(path)
try:
    for entry in it:
        ...
finally:
    it.close()

If you fully iterate the scandir iterator, it also closes automatically, but relying on “eventual GC” is exactly what leads to FD growth in real systems. [1][2]

Sources:
[1] Python bug tracker discussion of os.scandir() FD lifetime and adding close() / context manager behavior: (bugs.python.org)
[2] Python.org discussion noting with/close() is needed to avoid leaking scandir FDs (especially when GC is delayed): (discuss.python.org)
[3] aiofiles README listing aiofiles.os.scandir as an executor-enabled wrapper: (github.com)



Wrap aiofiles.os.scandir() result in a context manager to ensure proper file descriptor cleanup.

aiofiles.os.scandir() returns a standard os.scandir() iterator, which is a context manager holding an OS file descriptor. The current code iterates fully via list comprehension, but this does not guarantee cleanup if iteration is interrupted or on error. Use with (await aiofiles.os.scandir(nifti_output_dir)) as entries: to ensure the file descriptor is always released.


@claude

claude bot commented Feb 11, 2026

PR Review: Add WADO-RS NIfTI conversion API endpoints

Summary

This PR adds DICOM-to-NIfTI conversion capabilities to the WADO-RS API, introducing three new endpoints at study, series, and instance levels. The implementation includes granular error handling, modality-based pre-filtering, and streaming multipart responses with associated metadata files.

Code Quality & Best Practices

Strengths

  1. Excellent error handling hierarchy: The new exception classes provide clear, typed exceptions for different dcm2niix failure modes.

  2. Comprehensive exit code mapping: Documents dcm2niix exit codes with proper IntEnum usage.

  3. Modality pre-filtering: Smart optimization to skip non-image modalities before fetching.

  4. Good test coverage: Six new acceptance tests cover all endpoints.

  5. Streaming architecture: Async iterator pattern efficiently handles large conversions.

Potential Issues & Improvements

Critical Issues

  1. Memory leak risk in client parsing (adit-client/adit_client/client.py:121-156):

    • Creates a new DICOMwebClient on every call and monkey-patches it
    • Recommendation: Consider creating a context manager or caching
  2. Resource cleanup in exception paths (adit/dicom_web/utils/wadors_utils.py:865-875):

    • Early returns while inside async with TemporaryDirectory()
    • Recommendation: Ensure cleanup even if consumers don't exhaust the generator
  3. Unsafe filename handling (adit/dicom_web/renderers.py:112):

    • Sanitization incomplete. Filenames could contain path separators
    • Recommendation: Use werkzeug.utils.secure_filename or proper sanitization

Medium Priority Issues

  1. Subprocess error handling (adit/core/utils/dicom_to_nifti_converter.py:116):

    • Catches SubprocessError but subprocess.run() doesn't raise it
    • Recommendation: Handle FileNotFoundError and PermissionError
  2. Async queue ordering (adit/dicom_web/utils/wadors_utils.py:660-694):

    • call_soon_threadsafe with put_nowait doesn't guarantee ordering
    • Recommendation: Consider asyncio.run_coroutine_threadsafe
  3. Content-Disposition parsing (adit-client/adit_client/client.py:104-109):

    • Doesn't handle quoted filenames with spaces properly
    • Recommendation: Use proper header parsing

Security Concerns

  1. Path traversal via filenames: Incomplete sanitization
  2. Unbounded disk usage: No limits on study size
  3. No rate limiting: Expensive operations lack throttling
  4. Command injection: Low risk - safely constructed

Test Coverage

Good coverage of happy paths, but missing:

  • Error scenarios (conversion failures, corrupt DICOM)
  • Large studies for memory efficiency testing
  • Empty responses (study with only non-image series)
  • Malformed headers
  • Concurrent requests

Recommendations

Must Fix (Before Merge)

  1. Fix filename sanitization
  2. Add resource limits
  3. Fix subprocess error handling
  4. Improve Content-Disposition parsing

Should Fix

  1. Add error scenario tests
  2. Document file ordering guarantee
  3. Consider memory leak risk

Nice to Have

  1. Add caching for converted files
  2. Add async client methods
  3. Batch conversion optimization

Conclusion

Solid implementation with good architecture. However, address security and resource management concerns before merging, particularly filename sanitization, resource limits, and cleanup in exception paths.

Overall: Approve with changes requested

Great work! Code quality is high, but please address the critical issues.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@adit/dicom_web/utils/wadors_utils.py`:
- Around line 108-118: The exception handlers in wadors_utils.py must re-raise
API errors using exception chaining to preserve tracebacks: in the
ExceptionGroup loop inside the try/except, change the raises to "raise
ServiceUnavailableApiError(str(exc)) from exc" and "raise
BadGatewayApiError(str(exc)) from exc" when inspecting each exc; likewise update
the individual handlers to "raise ServiceUnavailableApiError(str(err)) from err"
in the except RetriableDicomError block and "raise BadGatewayApiError(str(err))
from err" in the except DicomError block so the original exceptions (exc/err)
are attached as the __cause__.
🧹 Nitpick comments (2)
adit/dicom_web/utils/wadors_utils.py (2)

121-159: Duplicated fetch dispatch logic.

The STUDY/SERIES/IMAGE dispatch (Lines 134–157) mirrors fetch_with_sentinel (Lines 67–91) nearly verbatim. Consider extracting a shared helper that accepts a callback, reducing the maintenance burden of keeping both branches in sync.


235-237: Dispatching sync_to_async per file is inefficient for large series.

Each iteration dispatches a separate thread-pool task for a single write_dataset call. For a series with hundreds of DICOM images, this creates significant overhead. Batch all writes into one sync_to_async call.

♻️ Proposed refactor
-        for file_idx, dicom_image in enumerate(dicom_images):
-            dicom_file_path = temp_path / f"dicom_file_{file_idx}.dcm"
-            await sync_to_async(write_dataset, thread_sensitive=False)(dicom_image, dicom_file_path)
+        def _write_all_dicoms(images: list[Dataset], dest: Path) -> None:
+            for file_idx, dicom_image in enumerate(images):
+                write_dataset(dicom_image, dest / f"dicom_file_{file_idx}.dcm")
+
+        await sync_to_async(_write_all_dicoms, thread_sensitive=False)(dicom_images, temp_path)

Comment on lines +108 to +118
except ExceptionGroup as eg:
# Extract the first relevant exception
for exc in eg.exceptions:
if isinstance(exc, RetriableDicomError):
raise ServiceUnavailableApiError(str(exc))
elif isinstance(exc, DicomError):
raise BadGatewayApiError(str(exc))
raise # Re-raise if no handled exception found
raise
except RetriableDicomError as err:
raise ServiceUnavailableApiError(str(err))
except DicomError as err:
raise BadGatewayApiError(str(err))

⚠️ Potential issue | 🟡 Minor

Chain re-raised exceptions with from to preserve tracebacks.

Ruff B904 flags all four raise statements here. The ExceptionGroup handler (Lines 111, 113) should use from exc, and the individual handlers (Lines 116, 118) should use from err.

🔧 Proposed fix
     except ExceptionGroup as eg:
         for exc in eg.exceptions:
             if isinstance(exc, RetriableDicomError):
-                raise ServiceUnavailableApiError(str(exc))
+                raise ServiceUnavailableApiError(str(exc)) from exc
             elif isinstance(exc, DicomError):
-                raise BadGatewayApiError(str(exc))
+                raise BadGatewayApiError(str(exc)) from exc
         raise
     except RetriableDicomError as err:
-        raise ServiceUnavailableApiError(str(err))
+        raise ServiceUnavailableApiError(str(err)) from err
     except DicomError as err:
-        raise BadGatewayApiError(str(err))
+        raise BadGatewayApiError(str(err)) from err
🧰 Tools
🪛 Ruff (0.15.0)

[warning] 111-111: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 113-113: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 116-116: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


[warning] 118-118: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🤖 Prompt for AI Agents
In `@adit/dicom_web/utils/wadors_utils.py` around lines 108-118: the exception handlers must re-raise API errors using exception chaining to preserve tracebacks. In the ExceptionGroup loop, change the raises to `raise ServiceUnavailableApiError(str(exc)) from exc` and `raise BadGatewayApiError(str(exc)) from exc` when inspecting each `exc`; likewise update the individual handlers to `raise ServiceUnavailableApiError(str(err)) from err` in the `except RetriableDicomError` block and `raise BadGatewayApiError(str(err)) from err` in the `except DicomError` block, so the original exceptions (`exc`/`err`) are attached as the `__cause__`.
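As a minimal, standalone illustration of the chaining the prompt asks for (the error class names mirror those in the PR, but this sketch defines its own stand-ins rather than importing the project's classes):

```python
class DicomError(Exception):
    """Stand-in for the PR's DicomError."""

class BadGatewayApiError(Exception):
    """Stand-in for the PR's BadGatewayApiError."""

def proxy_fetch() -> None:
    """Simulate a failing DICOM operation."""
    raise DicomError("peer aborted association")

def handle_fetch() -> None:
    try:
        proxy_fetch()
    except DicomError as err:
        # `from err` attaches the original exception as __cause__, so the
        # traceback shows the real failure instead of the misleading
        # "During handling of the above exception, another exception occurred".
        raise BadGatewayApiError(str(err)) from err

try:
    handle_fetch()
except BadGatewayApiError as api_err:
    # The original DicomError is preserved on the chain.
    print(type(api_err.__cause__).__name__)  # DicomError
```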

@claude

claude bot commented Feb 11, 2026

Pull Request Review: Add WADO-RS NIfTI Conversion API Endpoints

This is a well-structured PR that adds a useful feature for converting DICOM to NIfTI format via REST API endpoints. Overall the implementation is solid with good test coverage.


✅ Strengths

  1. Excellent error handling hierarchy - The new DcmToNiftiConversionError exception hierarchy in adit/core/errors.py:52-91 maps clearly to dcm2niix exit codes.

  2. Comprehensive test coverage - Six acceptance tests cover all three levels (study/series/image) with both streaming and batch retrieval methods.

  3. Smart modality filtering - Filtering non-image modalities (SR, KO, PR) before fetching in adit/dicom_web/utils/wadors_utils.py:806-811 prevents unnecessary DICOM fetches.

  4. Memory-efficient streaming - Study-level requests fetch series individually rather than loading the entire study into memory.

  5. Proper async/await patterns - The refactored wado_retrieve function correctly uses call_soon_threadsafe for thread-safe queue operations.


🐛 Potential Bugs (High Priority)

1. Race condition in sentinel delivery (adit/dicom_web/utils/wadors_utils.py:660-708)

If the async task is cancelled before the sentinel is placed, the consumer loop will hang forever waiting for None. Add proper cancellation handling with try/except asyncio.CancelledError.
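A hedged sketch of the suggested fix (the queue, sentinel, and function names here are illustrative, not the PR's actual code): placing the sentinel in a `finally` block guarantees it is delivered even when the fetch task is cancelled, so the consumer loop can never hang.

```python
import asyncio

SENTINEL = None  # the consumer stops when it sees this

async def fetch_task(queue: asyncio.Queue) -> None:
    """Illustrative producer: enqueue results, then guarantee the sentinel."""
    try:
        await queue.put("series-1")
        await asyncio.sleep(10)  # simulate a slow PACS fetch
        await queue.put("series-2")
    finally:
        # Runs on normal exit, on error, AND on cancellation, so the
        # consumer always receives its termination signal.
        queue.put_nowait(SENTINEL)

async def consume(queue: asyncio.Queue) -> list:
    results = []
    while (item := await queue.get()) is not SENTINEL:
        results.append(item)
    return results

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    task = asyncio.create_task(fetch_task(queue))
    await asyncio.sleep(0.01)
    task.cancel()  # cancel mid-fetch
    results = await consume(queue)  # returns instead of hanging forever
    try:
        await task
    except asyncio.CancelledError:
        pass
    return results

print(asyncio.run(main()))  # ['series-1']
```

Without the `finally`, cancelling the task between the first `put` and the sentinel would leave `consume` blocked on `queue.get()` indefinitely.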

2. Inconsistent exception handling (adit/dicom_web/utils/wadors_utils.py:723-737)

The nested fetch_with_sentinel() function could raise exceptions (e.g., ValueError for invalid level) that would not be handled properly. Add a catch-all exception handler.

3. Potential filename injection vulnerability (adit/dicom_web/renderers.py:348-349)

While filenames are sanitized by removing some characters, consider using a more robust sanitization approach with a whitelist of allowed characters and length limits.


⚠️ Code Quality Issues

4. Line length violations - Several lines exceed the 100-character limit. Run uv run cli format-code to auto-fix.

5. Type annotation style (adit-client/adit_client/client.py:111) - Use Optional[bytes] instead of Union[bytes, None] per PEP 484.

6. Inconsistent log levels (adit/dicom_web/utils/wadors_utils.py:865-875) - Use consistent log levels based on whether errors are expected/handled vs unexpected/re-raised.


📝 Documentation

7. Missing CLAUDE.md update - The new /nifti endpoints are not documented in CLAUDE.md under the DICOMweb section.

8. Client method docstrings - Methods like retrieve_nifti_study() could include: expected return format, what files are included, and exception types.


Summary

Overall Assessment: High-quality PR with thoughtful design. Error handling is excellent, test coverage is comprehensive, and the async implementation is mostly correct.

Critical items to address:

  • Issue 1: Fix potential race condition in sentinel delivery
  • Issue 3: Improve filename sanitization

Recommended improvements:

  • Issue 2: Add comprehensive exception handling
  • Issue 4: Fix line length violations
  • Issue 7: Update CLAUDE.md

Stats:

  • Files changed: 8 | Lines added: 801 | Lines removed: 52
  • Test coverage: ✅ Excellent (6 acceptance tests)
  • Code style: ⚠️ Minor violations
  • Security: ⚠️ One low-severity issue

Great work overall! The architecture is sound and follows Django and async best practices. 🚀

@claude

claude bot commented Feb 11, 2026

Pull Request Review: Add WADO-RS NIfTI conversion API endpoints

Overview

This PR adds three new WADO-RS endpoints for streaming DICOM-to-NIfTI converted files at study, series, and instance levels. The implementation includes robust error handling, modality filtering, and client library support.

✅ Strengths

1. Excellent Error Handling

  • Granular exception hierarchy (adit/core/errors.py:52-91): Well-designed typed exceptions (DcmToNiftiConversionError and subclasses) mapped to specific dcm2niix exit codes
  • Comprehensive exit code handling (adit/core/utils/dicom_to_nifti_converter.py:88-114): All documented dcm2niix exit codes are properly handled with appropriate exception types
  • Graceful degradation (adit/dicom_web/utils/wadors_utils.py:247-257): Conversion failures for individual series are logged as warnings and don't fail the entire request

2. Performance & Memory Efficiency

  • Series-level streaming (adit/dicom_web/utils/wadors_utils.py:179-206): For study-level requests, each series is fetched and converted individually, preventing loading entire studies into memory
  • Async/await throughout: Proper use of async patterns with aiofiles for I/O operations
  • Multipart streaming (adit/dicom_web/renderers.py:326-357): Files are streamed as they're generated

3. Smart Modality Filtering

  • Pre-filtering (adit/dicom_web/utils/wadors_utils.py:186-193): Non-image modalities (SR, KO, PR) are filtered before fetching, reducing unnecessary network traffic and processing
  • Well-documented constants (adit/dicom_web/utils/wadors_utils.py:32-34): Clear comments explaining what each modality represents

4. Comprehensive Testing

  • Acceptance tests cover all six new methods (retrieve/iter for study/series/image)
  • Content validation: Tests verify both file types (.nii.gz, .json) and non-empty content
  • Consistency checks: Tests compare retrieve vs. iter methods for equivalence

🔍 Issues & Recommendations

Critical Issues

1. NoSpatialDataError Defined but Never Raised ⚠️

Location: adit/core/errors.py:88-91

The NoSpatialDataError exception is defined but never raised in dicom_to_nifti_converter.py. However, it's caught in wadors_utils.py:247. This suggests it should be raised for a specific dcm2niix exit code, but it's unclear which one. Consider:

  • Is this for exit code 2 (NO_DICOM_FOUND) when files exist but have no spatial data?
  • Should dcm2niix output be parsed for specific error messages?
  • Document when this exception should be raised, or remove it if unused

2. Subprocess Error Handling Logic Issue ⚠️

Location: adit/core/utils/dicom_to_nifti_converter.py:111-114

The condition elif exit_code == DcmToNiftiExitCode.UNSPECIFIED_ERROR or exit_code != 0: has a logical flaw: the exit_code == UNSPECIFIED_ERROR check is redundant, because UNSPECIFIED_ERROR is itself non-zero and therefore already covered by or exit_code != 0, which at this point in the chain matches every remaining non-zero code. Consider simplifying to else:, since this branch handles all remaining non-zero exit codes.
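The simplification might look like the following sketch (the enum and exception names echo the PR, but the specific members, code values, and messages here are assumptions for illustration):

```python
from enum import IntEnum

class DcmToNiftiExitCode(IntEnum):
    # Illustrative subset of dcm2niix exit codes; the PR's actual enum
    # may define more members with different values.
    SUCCESS = 0
    UNSPECIFIED_ERROR = 1
    NO_DICOM_FOUND = 2

class DcmToNiftiConversionError(Exception):
    """Base class for conversion failures (mirrors the PR's hierarchy)."""

class NoDicomFoundError(DcmToNiftiConversionError):
    """Raised when dcm2niix reports no DICOM input files."""

def check_exit_code(exit_code: int) -> None:
    if exit_code == DcmToNiftiExitCode.SUCCESS:
        return
    if exit_code == DcmToNiftiExitCode.NO_DICOM_FOUND:
        raise NoDicomFoundError("no DICOM files found in input directory")
    # A plain fall-through branch: every remaining non-zero code,
    # including UNSPECIFIED_ERROR, ends up here, so no redundant
    # `== UNSPECIFIED_ERROR or != 0` condition is needed.
    raise DcmToNiftiConversionError(f"dcm2niix failed with exit code {exit_code}")
```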

3. Client Method Uses Private API ⚠️

Location: adit-client/adit_client/client.py:22, 124, 156

The client relies on private methods of DICOMwebClient (methods starting with _) including monkey-patching _extract_part_content. This is fragile - private methods can change without warning in upstream releases. Consider:

  1. Implement multipart parsing from scratch using email.message or requests-toolbelt
  2. Contribute an upstream patch to dicomweb-client to expose filename metadata
  3. Document this as a known risk and pin dicomweb-client version strictly

Security Concerns

4. Filename Sanitization Could Be Stronger 🔒

Location: adit/dicom_web/renderers.py:348

Current sanitization only strips \r, \n, and ". Consider additional hardening:

  • Strip all ASCII control characters (0x00-0x1F, 0x7F)
  • Enforce maximum filename length
  • Example: safe_filename = "".join(c for c in filename if c.isprintable() and c not in '\r\n"\\/:*?<>|')[:255]
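A runnable version of that whitelist approach might look like this (the function name, the exact character set, and the length limit are illustrative choices, not the PR's code):

```python
import os

def sanitize_filename(filename: str, max_length: int = 255) -> str:
    """Whitelist-based filename sanitization sketch.

    Drops path components, every non-printable character (which covers the
    ASCII control range 0x00-0x1F and 0x7F), characters that are problematic
    in Content-Disposition headers or file systems, and truncates the result
    to a maximum length.
    """
    # Normalize Windows separators so basename() strips them on any platform.
    name = os.path.basename(filename.replace("\\", "/"))
    unsafe = set('\r\n"\\/:*?<>|')
    name = "".join(c for c in name if c.isprintable() and c not in unsafe)
    if not name:
        raise ValueError("filename empty after sanitization")
    return name[:max_length]

print(sanitize_filename('bad\r\nname".nii.gz'))  # badname.nii.gz
print(sanitize_filename("../../etc/passwd"))     # passwd
```

Raising on an empty result (rather than silently substituting a default) also covers issue 5 below: the client-side `_extract_filename` could apply the same check after `os.path.basename()`.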

5. No Input Validation in Client 🔒

Location: adit-client/adit_client/client.py:108

The _extract_filename method doesn't validate filenames after extraction. Add validation after os.path.basename() to check for empty filenames, control characters, or suspicious patterns.

Performance & Optimization

6. Temporary File Cleanup

Location: adit/dicom_web/utils/wadors_utils.py:232

Good use of TemporaryDirectory for cleanup. For large studies with many series, temp files accumulate until the entire request completes. Consider cleaning up each series temp directory immediately after yielding its files (though current approach is acceptable).

7. File Reading Could Stream Chunks

Location: adit/dicom_web/utils/wadors_utils.py:282-284

Currently reads entire NIfTI file into memory before yielding. For very large NIfTI files, consider chunked reading. Acceptable for most medical imaging use cases, but worth noting for future optimization.
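If chunked reading is pursued, the idea is roughly the following (the helper name and chunk size are illustrative; the PR uses aiofiles for async I/O, but a synchronous generator shows the same principle):

```python
from pathlib import Path
from typing import Iterator

def iter_file_chunks(path: Path, chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield a file's bytes in fixed-size chunks instead of a single read()."""
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk
```

Each yielded chunk can be forwarded to the multipart stream as it is produced, bounding peak memory per file by `chunk_size` rather than the file's full size.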

Code Quality

8. Logger Name Typo Fixed

Location: adit/dicom_web/utils/wadors_utils.py:30

Good catch fixing logger = logging.getLogger("__name__") to logger = logging.getLogger(__name__)

9. Improved Async Pattern

Location: adit/dicom_web/utils/wadors_utils.py:61-98

The refactoring of wado_retrieve from TaskGroup to a simpler sentinel pattern is cleaner and easier to follow.

Testing

10. Test Coverage Gaps

While acceptance tests are comprehensive, consider adding:

  • Unit tests for DicomToNiftiConverter exit code handling
  • Unit tests for filename extraction and sanitization
  • Error case tests: What happens when dcm2niix is not installed?
  • Edge cases: Empty studies, all-SR studies, mixed modalities

11. Hard-coded Test Data

Location: adit/dicom_web/tests/acceptance/test_wadors.py:424-426

Filters match NON_IMAGE_MODALITIES but are hard-coded in tests. Import and use NON_IMAGE_MODALITIES constant from wadors_utils to ensure consistency.

Documentation

12. API Documentation Missing

The PR adds significant new functionality but doesn't update CLAUDE.md or API documentation. Should document:

  • New NIfTI endpoints and usage
  • Expected behavior with non-image modalities
  • What files are returned (.nii.gz, .json, .bval, .bvec)
  • Performance characteristics for large studies

🎯 Recommendations Summary

Must Fix (Before Merge)

  1. Resolve NoSpatialDataError - either implement it or remove it
  2. Fix exit code handling logic in dicom_to_nifti_converter.py:111
  3. Strengthen filename sanitization in both server and client
  4. Add input validation for filename extraction in client

Should Fix (Soon)

  1. Document or mitigate private API usage in dicomweb-client
  2. Add unit tests for error handling paths
  3. Use NON_IMAGE_MODALITIES constant in tests
  4. Update CLAUDE.md with NIfTI endpoint documentation

Nice to Have

  1. Consider chunked file reading for large NIfTI files
  2. Add error handling tests (missing dcm2niix, malformed data, etc.)
  3. Add view class docstrings

📊 Overall Assessment

Quality Score: 8/10

This is a well-architected feature with thoughtful error handling, good performance characteristics, and comprehensive testing. The main concerns are around security hardening (filename validation), the use of private APIs in the client, and some minor logic issues in error handling.

The code follows the project's conventions well, uses appropriate async patterns, and the streaming approach is excellent for memory efficiency.

Recommendation: Request changes for the "Must Fix" items, then approve. This is solid work overall!

@mhumzaarain mhumzaarain changed the title from "Add WADO-RS NIfTI conversion API endpoints" to "Add NIfTI conversion API" on Feb 11, 2026
- Track sentinel receipt in wado_retrieve to avoid cancelling fetch_task when consumer exits normally after seeing the sentinel
- Update test fake_run stubs to create output folders/files so converter validation passes
@mhumzaarain mhumzaarain linked an issue Mar 9, 2026 that may be closed by this pull request



Successfully merging this pull request may close these issues.

Nifti Export
