Refactor legacy MinorTraceChemistry into the Ocotillo schema via backfill job #600

@kbighorse

Summary

Implement a repeatable, idempotent backfill job to migrate legacy NMA_MinorTraceChemistry records into the new Ocotillo schema (Observation, Sample, Parameter, Notes tables).

Source feature spec: tests/features/nma-chemistry-minortracechemistry-refactor.feature

Requirements

Core backfill (idempotent)

  • Create Observation records from legacy NMA_MinorTraceChemistry rows, keyed by GlobalID → nma_pk_chemistryresults
  • Link each Observation to its Sample via SamplePtID → nma_pk_chemistrysample
  • Re-running the job must not create duplicates
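
One way to meet the no-duplicates requirement is an upsert keyed on the legacy GlobalID. The following is a minimal sketch using SQLite from the Python standard library, not the project's actual job: the function name `backfill_observations` and the trimmed-down `observation` table are assumptions for illustration, and it relies on SQLite 3.24 or newer for `ON CONFLICT ... DO UPDATE`.

```python
import sqlite3


def backfill_observations(conn, legacy_rows):
    """Upsert one observation per legacy row, keyed by GlobalID.

    Hypothetical helper: the table layout is a simplified stand-in
    for the real Observation table.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS observation (
            id INTEGER PRIMARY KEY,
            nma_pk_chemistryresults TEXT NOT NULL UNIQUE,
            value REAL,
            unit TEXT
        )
        """
    )
    # The UNIQUE constraint makes the legacy key the conflict target, so a
    # second run updates the existing row instead of inserting a new one.
    conn.executemany(
        """
        INSERT INTO observation (nma_pk_chemistryresults, value, unit)
        VALUES (:GlobalID, :SampleValue, :Units)
        ON CONFLICT(nma_pk_chemistryresults)
        DO UPDATE SET value = excluded.value, unit = excluded.unit
        """,
        legacy_rows,
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM observation").fetchone()[0]
```

Calling the function twice with the same rows should leave the row count unchanged, which is the property the re-run scenario checks.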

Field mapping

| Legacy Field | Target | Notes |
| --- | --- | --- |
| GlobalID | Observation.nma_pk_chemistryresults | Idempotency key |
| SamplePtID | Sample linkage via nma_pk_chemistrysample | Also links to Thing |
| Analyte | Parameter.parameter_name (matrix = "water") | Create Parameter if needed |
| SampleValue | Observation.value | |
| Units | Observation.unit | |
| AnalysisDate | Observation.observation_datetime | |
| AnalysisMethod | Observation.analysis_method_name | Preserve as-is (e.g. "Field analysis", "EPA 200.8") |
| AnalysesAgency | Observation.analysis_agency | |
| Uncertainty | Observation.uncertainty | |
| Symbol = < | Observation.detect_flag = false | Value is detection limit, not detected concentration |
| Volume | Sample.volume | Populated on related Sample |
| VolumeUnit | Sample.volume_unit | Populated on related Sample |
| Notes | Notes table record | target_table="observation", note_type="Chemistry Observation" |
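
The mapping above could be expressed as a pure function that splits one legacy row into per-table payloads. This is a sketch under stated assumptions rather than the project's code: `map_legacy_row`, the dict-based payloads, and the `content` key on the note are hypothetical, and treating every Symbol other than `<` as detected is an assumption the issue does not state.

```python
# Legacy column -> Observation attribute, taken from the field mapping table.
OBSERVATION_FIELDS = {
    "GlobalID": "nma_pk_chemistryresults",
    "SampleValue": "value",
    "Units": "unit",
    "AnalysisDate": "observation_datetime",
    "AnalysisMethod": "analysis_method_name",
    "AnalysesAgency": "analysis_agency",
    "Uncertainty": "uncertainty",
}


def map_legacy_row(row):
    """Split one legacy NMA_MinorTraceChemistry row into target payloads."""
    # Values are copied unchanged, so AnalysisMethod is preserved as-is.
    observation = {
        target: row.get(source) for source, target in OBSERVATION_FIELDS.items()
    }
    # Symbol "<" means the value is a detection limit, not a concentration.
    # Assumption: any other Symbol (or none) counts as detected.
    observation["detect_flag"] = row.get("Symbol") != "<"

    parameter = {"parameter_name": row.get("Analyte"), "matrix": "water"}

    # Volume and VolumeUnit belong to the related Sample, not the Observation.
    sample = {
        "nma_pk_chemistrysample": row.get("SamplePtID"),
        "volume": row.get("Volume"),
        "volume_unit": row.get("VolumeUnit"),
    }

    note = None
    if row.get("Notes"):
        note = {
            "target_table": "observation",
            "note_type": "Chemistry Observation",
            "content": row["Notes"],  # hypothetical column name
        }

    # Fields absent from the mapping (e.g. OBJECTID) are simply never copied.
    return {
        "observation": observation,
        "parameter": parameter,
        "sample": sample,
        "note": note,
    }
```

Keeping the mapping free of database calls makes it easy to unit-test each row of the table separately from the persistence step.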

Unmapped fields (ignored)

  • SamplePointID, OBJECTID, WCLab_ID — not persisted in new schema
  • Volume and VolumeUnit go to Sample, not Observation

Orphan prevention

  • Skip legacy records whose SamplePtID does not match an existing Sample
  • Report count of skipped records with reason (missing Sample linkage)
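
The skip-and-report rule can be illustrated with a small pre-filter that runs before any Observation is written. This is only a sketch: `partition_orphans` and the in-memory set of known Sample keys are assumptions, and a real job would presumably look the keys up in the Sample table.

```python
from collections import Counter


def partition_orphans(legacy_rows, known_sample_ids):
    """Separate rows with a matching Sample from orphans.

    Returns (linked_rows, skipped_counts), where skipped_counts maps a
    reason string to the number of rows skipped for that reason.
    """
    linked = []
    skipped = Counter()
    for row in legacy_rows:
        if row.get("SamplePtID") in known_sample_ids:
            linked.append(row)
        else:
            skipped["missing Sample linkage"] += 1
    return linked, dict(skipped)
```

Returning the counts alongside the linked rows lets the job log the skipped total with its reason at the end of each run.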

Linkage integrity

  • Each Observation must link to its Sample and the Thing associated with that Sample (no orphaned observations)
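
A post-run check for this rule might look like the query below, which lists observations lacking either a Sample or a Thing. It is a sketch only: the `sample_id` and `thing_id` foreign-key columns and the lowercase table names are assumed for illustration and may not match the Ocotillo schema.

```python
import sqlite3

# Assumed layout: observation.sample_id -> sample.id, sample.thing_id -> thing.id
ORPHAN_QUERY = """
SELECT o.id
FROM observation AS o
LEFT JOIN sample AS s ON s.id = o.sample_id
LEFT JOIN thing AS t ON t.id = s.thing_id
WHERE s.id IS NULL OR t.id IS NULL
ORDER BY o.id
"""


def find_orphan_observations(conn):
    """Return ids of observations with no Sample, or whose Sample has no Thing."""
    return [row[0] for row in conn.execute(ORPHAN_QUERY)]
```

An empty result after the backfill would indicate that every Observation resolves to both a Sample and a Thing.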

Acceptance Criteria

All scenarios in the feature file pass:

  • Backfill creates Observation records and can be re-run without duplicates
  • Volume and VolumeUnit populate the related Sample
  • Observations link to Sample (and Thing) by SamplePtID
  • AnalysisMethod values are preserved as-is
  • Notes are stored in the Notes table and linked to the Observation
  • Symbol < sets detect_flag to false
  • Unmapped legacy fields are not persisted
  • Orphan legacy records are skipped and reported
