Implement OpenAI two‑pass ETS extraction pipeline with QC, review integration, migrations and tests by karilint · Pull Request #239 · karilint/mammalbase

karilint · 2026-03-05T14:20:48Z

Motivation

Replace the brittle regex/legacy extractor with a production two‑pass OpenAI (Responses API) workflow (PASS1 evidence → PASS2 ETS) and add deterministic QC before any ETS import.
Preserve the legacy baseline extractor behind flags, and keep the existing review/import UI while adapting it to accept LLM‑generated ETS payloads and re‑normalize them after curator edits.
Build a DB‑driven trait vocabulary (abbreviation dictionary + trait list) with TTL caching to prime LLM prompts, and preserve page provenance for auditability.

Description

Added a production OpenAI adapter using Structured Outputs and pydantic models with retries/backoff in app/recode_extraction/adapters/openai_client.py and prompt templates in app/recode_extraction/services/openai_two_pass_prompts.py.
Implemented a DB‑driven trait vocabulary service with 6‑hour cache and bootstrap fallback in app/recode_extraction/services/trait_vocabulary.py.
Implemented deterministic QC/normalization (range, mean±SD, point parsing, dedupe, ETS validation and provenance tagging) in app/recode_extraction/services/qc.py and wired it into the pipeline.
Extended the orchestrator to run the openai_two_pass backend: per‑page PASS1 evidence extraction, persistence of the merged evidence, PASS2 structuring, then QC, after which candidates are created with ets_payload in ExtractedAssertionModel. The legacy pipeline is preserved in _run_legacy_pipeline (see app/recode_extraction/services/orchestrator.py).
Adapted review/import flow so persist_approved_assertions_to_ets will import prefilled ets_payload (and re‑normalize when curators edit values/units) in app/recode_extraction/services/review.py.
Added model fields and migration to persist artifacts and QC: pass1_evidence_package, pass2_structured_package, qc_summary on SourceExtractionRun, and qc_errors + larger unmapped_reason on ExtractedAssertionModel (app/recode_extraction/migrations/0006_openai_two_pass_fields.py, app/recode_extraction/models.py).
Exposed backend selection and QC indicators in views/templates and added settings toggles and OpenAI config variables in app/config/settings.py and app/recode_extraction/templates/*.
Added tests that mock OpenAI and PDF text extraction to validate orchestration and QC behavior in app/tests/recode_extraction/test_openai_two_pass_pipeline.py and app/tests/recode_extraction/test_openai_qc_normalization.py, and updated review tests to cover prefilled ETS payload persistence.
Updated docs (docs/recode_integration.md) and app/requirements.txt (openai, pydantic), and kept the NE/RE graph pipeline out of production paths.
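
For reviewers, the deterministic QC step described above (range, mean±SD, point parsing, dedupe) can be pictured roughly as follows. This is a minimal sketch, not the code in qc.py: ParsedValue, parse_value, dedupe, and the accepted string formats are all hypothetical names and assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical record type; the real QC output schema in qc.py may differ.
@dataclass(frozen=True)
class ParsedValue:
    kind: str  # "range", "mean_sd", or "point"
    mean: Optional[float] = None
    sd: Optional[float] = None
    low: Optional[float] = None
    high: Optional[float] = None


# Assumption: values are non-negative decimals such as "12" or "12.5".
_NUM = r"(\d+(?:\.\d+)?)"
_MEAN_SD = re.compile(rf"^\s*{_NUM}\s*(?:±|\+/-)\s*{_NUM}\s*$")
_RANGE = re.compile(rf"^\s*{_NUM}\s*(?:-|–|to)\s*{_NUM}\s*$")
_POINT = re.compile(rf"^\s*{_NUM}\s*$")


def parse_value(text: str) -> Optional[ParsedValue]:
    """Classify a raw value string; return None when nothing matches."""
    m = _MEAN_SD.match(text)
    if m:
        return ParsedValue("mean_sd", mean=float(m.group(1)), sd=float(m.group(2)))
    m = _RANGE.match(text)
    if m:
        # Assumption: reversed bounds are reordered rather than rejected.
        low, high = sorted((float(m.group(1)), float(m.group(2))))
        return ParsedValue("range", low=low, high=high)
    m = _POINT.match(text)
    if m:
        return ParsedValue("point", mean=float(m.group(1)))
    return None


def dedupe(values: List[ParsedValue]) -> List[ParsedValue]:
    """Drop exact duplicates while keeping first-seen order."""
    seen = set()
    unique = []
    for value in values:
        if value not in seen:
            seen.add(value)
            unique.append(value)
    return unique
```

Because the parsing is pure string matching with no model calls, the same input always yields the same result, which is the property the QC step relies on before any ETS import.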

Testing

Static/compile checks: ran python -m py_compile on new modules and related tests and they passed locally in this environment.
Unit tests (mocked OpenAI): added app/tests/recode_extraction/test_openai_two_pass_pipeline.py and app/tests/recode_extraction/test_openai_qc_normalization.py, which mock the OpenAI client and PdfToTextService so that no network calls occur.
Full test run: attempted with pytest --ds=config.settings ... but the Django test harness failed during DB setup because the MySQL NAME setting was missing/None (no CI/local DB config is available in this environment). The automated Django tests therefore could not complete; the failures are environmental, not logic failures in the added code.
Migration: created 0006_openai_two_pass_fields.py to add the JSON fields and qc_errors; a makemigrations run was prepared and the migration was committed.
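
The mocking approach used by the unit tests can be illustrated with a small sketch. It assumes a simplified orchestration function; run_two_pass, extract_pages, extract_evidence, and structure_ets are hypothetical names chosen for this example and are not taken from the PR's modules.

```python
from unittest.mock import MagicMock


def run_two_pass(pdf_service, llm_client, pdf_path):
    """Hypothetical stand-in for the orchestrator: PASS1 per page, then PASS2."""
    pages = pdf_service.extract_pages(pdf_path)
    evidence = [llm_client.extract_evidence(page) for page in pages]
    return llm_client.structure_ets(evidence)


def test_two_pass_uses_mocks_only():
    # Both collaborators are mocks, so no PDF is read and no network call is made.
    pdf_service = MagicMock()
    pdf_service.extract_pages.return_value = ["page one text", "page two text"]

    llm_client = MagicMock()
    llm_client.extract_evidence.side_effect = lambda page: {"evidence": page}
    llm_client.structure_ets.return_value = [{"trait": "body mass", "value": "12.5"}]

    result = run_two_pass(pdf_service, llm_client, "paper.pdf")

    assert result == [{"trait": "body mass", "value": "12.5"}]
    # PASS1 runs once per page; PASS2 runs once on the merged evidence.
    assert llm_client.extract_evidence.call_count == 2
    llm_client.structure_ets.assert_called_once_with(
        [{"evidence": "page one text"}, {"evidence": "page two text"}]
    )
```

A test of this shape needs neither a database nor an API key, so it can run with plain pytest even where the Django DB setup is unavailable.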

Files of primary interest: app/recode_extraction/adapters/openai_client.py, app/recode_extraction/services/{openai_two_pass_prompts.py,trait_vocabulary.py,qc.py,openai_two_pass.py}, app/recode_extraction/services/orchestrator.py, app/recode_extraction/services/review.py, app/recode_extraction/models.py, app/recode_extraction/migrations/0006_openai_two_pass_fields.py, tests under app/tests/recode_extraction/ and docs update docs/recode_integration.md.

If you want, I can: run the test matrix inside a configured CI (with DB), trim or extend prompt examples, or split the PR into smaller PRs (prompts/client + orchestrator + QC + UI) for easier review.

Codex Task

Add tests for OpenAI two-pass pipeline and QC behavior

4816bfb

karilint added the codex label Mar 5, 2026 — with ChatGPT Codex Connector

karilint added 9 commits March 6, 2026 11:33

Guard review flow against missing DB columns and backfill migration

2307fe2

Fix Responses schema for Pass2 metadata and break import cycle

ef6d307

Polish schema test style and document pytest warning context

fbea466

Accept mocked PASS1 evidence payloads in two-pass orchestrator

4cf70a8

Reduce OpenAI token usage with PASS1 evidence compaction and filtering

5eb4840

Clean warning sources in settings and validation regex

dc2e859

Add reusable OpenAI mocking fixture and adapter unit tests

586612d

Improve two-pass coverage, citation defaults, and importer locality safety

e072033

Add extraction phase timing logs and total runtime summary

73d3a9f

karilint wants to merge 10 commits into codex/prepare-implementation-plan-for-recode-extraction from codex/implement-openai-two-pass-trait-extraction