feat: add SQLite storage as single source of truth #321
Conversation
…ecurity (#313) Add in-memory SQLite database as foundation for replacing intermediate data structures (CombinedRawDataset, CombinedIndexedDataset). Includes schema with 13 tables, FK indexes, authorizer for security, insert API, and DatabasePopulator that converts RawDataset to SQL rows. Phase 1+2 of the migration plan — pure addition with opt-in DB param on CombinedRawDatasetsGenerator.
Add SQL CHECK constraints to enforce valid values at the database level for significance, lifecycle_state, implementation, category, verification_type, test status, variant, and element_kind columns.
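A sketch of one such CHECK constraint enforcing a closed value set at the database level. The `significance` column name comes from the commit above, but the allowed values (`'shall'`, `'should'`, `'may'`) are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE requirements (
           id TEXT PRIMARY KEY,
           significance TEXT NOT NULL
               CHECK (significance IN ('shall', 'should', 'may'))
       )"""
)
conn.execute("INSERT INTO requirements VALUES ('REQ_001', 'shall')")  # accepted
try:
    conn.execute("INSERT INTO requirements VALUES ('REQ_002', 'maybe')")
    rejected = False
except sqlite3.IntegrityError:  # "CHECK constraint failed"
    rejected = True
```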
Add ELToSQLCompiler that translates Lark expression language parse trees into SQL WHERE clauses with parameterized queries. Add DatabaseFilterProcessor that replicates the recursive DAG-walk filter logic using SQL DELETEs with cascade cleanup of orphaned SVCs and MVRs.
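A simplified illustration of the idea behind ELToSQLCompiler (not the real implementation, which walks Lark parse trees): compile an expression tree bottom-up into a parameterized WHERE clause, so filter values travel as bind parameters rather than as SQL text:

```python
def compile_expr(node):
    """node: ('and'|'or', left, right) or ('==', column, value)."""
    op = node[0]
    if op in ("and", "or"):
        left_sql, left_params = compile_expr(node[1])
        right_sql, right_params = compile_expr(node[2])
        return f"({left_sql} {op.upper()} {right_sql})", left_params + right_params
    if op == "==":
        column, value = node[1], node[2]
        if column not in {"id", "urn", "category"}:  # whitelist column names
            raise ValueError(f"unknown column: {column}")
        return f"{column} = ?", [value]
    raise ValueError(f"unknown operator: {op}")

tree = ("and", ("==", "urn", "reqstool-demo"), ("==", "id", "REQ_001"))
sql, params = compile_expr(tree)
# sql    -> "(urn = ? AND id = ?)"
# params -> ["reqstool-demo", "REQ_001"]
```

Whitelisting column names while parameterizing values is what keeps user-supplied filter expressions from becoming an injection surface.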
…ommands (#313) Phase 4 of the SQLite storage migration: replace CombinedIndexedDataset with direct database queries through RequirementsRepository and service layer (StatisticsService, ExportService).
- Add RequirementsRepository as the data access layer wrapping RequirementsDatabase
- Add StatisticsService with TestStats/RequirementStatus/TotalStats dataclasses
- Add ExportService for JSON export with --req-ids/--svc-ids filtering
- Add build_database() pipeline helper in storage/pipeline.py
- Rewrite status, export, and report commands to use DB pipeline
- Migrate LifecycleValidator from CombinedIndexedDataset to RequirementsRepository
- Migrate GroupByOrganizor from CombinedIndexedDataset to RequirementsRepository
- Fix multi-pass DB population to satisfy FK constraints across URNs
- Update all affected tests for new interfaces

Signed-off-by: Jimisola Laursen <jimisola@jimisola.com>
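A sketch of the frozen-dataclass shape the service layer returns; the real TestStats/RequirementStatus/TotalStats carry more fields than shown here:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TestStats:
    total: int = 0
    passed: int = 0
    failed: int = 0
    skipped: int = 0
    missing: int = 0

stats = TestStats(total=3, passed=2, failed=1)
# frozen=True makes results immutable: any assignment raises
# dataclasses.FrozenInstanceError, so computed statistics cannot drift.
```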
…file names (#313)
- Make build_database() a context manager; update all commands to use `with` blocks
- Replace Utils dict helpers with collections.defaultdict in parsing graph
- Delete unused CombinedIndexedDataset, statistics_container, statistics_generator, indexed_dataset_filter_processor, and 5 dead Utils methods
- Remove empty RequirementsELTransformer/SVCsELTransformer subclasses
- Rename files to match their primary class names (el_compiler → el_to_sql_compiler, filter_processor → database_filter_processor, generic_el → generic_el_transformer)
- Add unit tests for RequirementsRepository, StatisticsService, ExportService, pipeline
- Update CLAUDE.md architecture docs to reflect SQLite pipeline

Signed-off-by: jimisola <jimisola@jimisola.com>
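The "get-or-create a list value" dict helpers collapse into `collections.defaultdict`; a minimal sketch of the pattern as it might appear in a parsing graph (node names illustrative):

```python
from collections import defaultdict

edges = defaultdict(list)              # node -> list of successor nodes
for src, dst in [("a", "b"), ("a", "c"), ("b", "c")]:
    edges[src].append(dst)             # no explicit "if src not in edges" check

# dict(edges) -> {"a": ["b", "c"], "b": ["c"]}
```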
…lls (#313) Replace 13-column status table (5 sub-columns per test group) with 5 columns using compact inline formatting. Each test cell shows positionally aligned counts (T P F S M) with colored numbers and dim dashes for zeros. Remove merged-header complexity.
- Add _format_test_cell() for single-cell test stats rendering
- Use orange for missing, yellow for skipped (was both red)
- Empty cell for not_applicable (was ambiguous dash)
- Color-coded legend
- Delete _build_merged_headers, _parse_col_widths, _replace_header_with_merged, _format_cell

Signed-off-by: jimisola <jimisola@jimisola.com>
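A plain-text sketch of that cell layout (the real _format_test_cell also adds color markup, which this stand-in skips): counts appear positionally as T P F S M, with dashes standing in for zeros so the eye can scan columns.

```python
def format_test_cell(total, passed, failed, skipped, missing):
    # Positional layout: Total Passed Failed Skipped Missing; "-" for zero.
    counts = (total, passed, failed, skipped, missing)
    return " ".join(str(c) if c else "-" for c in counts)

print(format_test_cell(4, 3, 1, 0, 0))   # -> 4 3 1 - -
```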
@Jonas-Werne @lfvdavid It runs, regression tests pass, etc., so please try it locally.
#313) Fix two bugs:
- TotalStats.missing_automated_tests and missing_manual_tests were never aggregated from per-requirement stats, always reporting 0. Now accumulated from each requirement's TestStats after calculation.
- Dangling FK references (e.g. an SVC referencing a non-existent requirement) crashed with IntegrityError. Now caught gracefully with warnings, allowing semantic validation to report all errors.

Signed-off-by: jimisola <jimisola@jimisola.com>
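A sketch of the graceful-handling pattern for the second bug: catch `sqlite3.IntegrityError` per row and collect a warning instead of crashing, so semantic validation can still report every error. The schema here is a simplified stand-in:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE requirements (id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE svcs (id TEXT, req_id TEXT REFERENCES requirements (id))")
conn.execute("INSERT INTO requirements VALUES ('REQ_001')")

warnings = []
for svc_id, req_id in [("SVC_001", "REQ_001"), ("SVC_002", "REQ_MISSING")]:
    try:
        conn.execute("INSERT INTO svcs VALUES (?, ?)", (svc_id, req_id))
    except sqlite3.IntegrityError:  # dangling FK: target requirement missing
        warnings.append(f"{svc_id} references non-existent requirement {req_id}")

# warnings -> ["SVC_002 references non-existent requirement REQ_MISSING"]
```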
Regression testing verified directly against main. Baselines were from pre-Pydantic-v2 and are no longer needed. Signed-off-by: jimisola <jimisola@jimisola.com>
Regression Test Results

Full regression comparison of:

Reports (asciidoc + markdown) — 6/6 cosmetic only

All report diffs are cosmetic: no data value changes in any report output.

Export JSON — intentional schema changes

All diffs are the new schema per #315.

Status JSON — intentional schema changes

Same schema redesign as export: cleaner naming, nested …

Bugs found and fixed

Stale baselines removed
Signed-off-by: jimisola <jimisola@jimisola.com>
Extract helper methods from ExportService.to_export_dict (C901: 21→<10), StatisticsService._calculate (C901: 17→<10), and DatabasePopulator.populate_from_raw_dataset (C901: 14→<10) to satisfy the flake8 C901 complexity threshold. Signed-off-by: jimisola <jimisola@jimisola.com>
The top-level `from reqstool.common.utils import Utils` was incorrectly removed in the dead-code cleanup commit, causing NameError when running as an installed package (the conditional import only covers direct exec). Signed-off-by: jimisola <jimisola@jimisola.com>
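A toy reproduction of that failure mode (stand-in names, not the real module): a name bound only under `__main__` is invisible when the module is imported normally, which is exactly why the top-level import is required for the installed-package path.

```python
import types

# Source of a stand-in module that only binds a name under direct execution.
source = '''
if __name__ == "__main__":
    Utils = "helpers"        # conditional binding, like the conditional import

def needs_utils():
    return Utils             # NameError on normal import: the branch never ran
'''

mod = types.ModuleType("fake_module")
exec(compile(source, "fake_module.py", "exec"), mod.__dict__)
try:
    mod.needs_utils()
    raised = False
except NameError:
    raised = True
# raised -> True: only a top-level binding covers the imported-package path
```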
…unts (#313) Replace CRD→CID pipeline in all three commands (status, export, report) with direct DB queries via RequirementsRepository and service layer. Fix StatisticsService undercounting missing automated tests and MVRs by aggregating from per-requirement stats instead of global annotation scan. Signed-off-by: Jimisola Laursen <jimisola@jimisola.com>
Jonas-Werne left a comment
I removed the commented-out import from the demo project and got this, and it did not get an enumerated value.
But more importantly, with the current representation you can't see the ID of the requirement, so this will not work.
I like the enumerations but they can't replace the ID.
We could combine the URN + ID columns to contain the full ID
(reqstool-demo:REQ_001, ext-001:REQ_001, etc.); then we could change the ID column to a STATUS column that includes the enumerated status.
What enumerated value are you referring to?
REQ_PASS, REQ_MANUAL_FAIL, etc. are the IDs of those requirements, as per https://github.com/reqstool/reqstool-demo/blob/main/docs/reqstool/requirements.yml. What do you mean that you can't see the ID?
We should add @requirements that are only implemented once, to cover that case as well. Or rather, remove the class-level annotations except for one requirement that we name accordingly, e.g.
My mistake, I confused the ID with some status of the requirement, as if REQ_PASS meant everything was done and REQ_NOT_IMPLEMENTED meant that you had not annotated the code.
No worries, I appreciate you taking a look. I thought that the REQ_nnn IDs were not self-explanatory the way the new IDs are.
I think @lfvdavid has the best data to regression-test this. Maybe we should generate more complex data in the demo project: all levels (microservice, system, external), include filtering of requirements, and also some MVRs.
Ideally, I don't want the demo to be too complex. It should just demonstrate the basics for new users. What do you think? I'd rather have a more complex example directly in reqstool-client then. And we still need to dogfood reqstool-client and the others with reqstool.
Signed-off-by: jimisola <jimisola@users.noreply.github.com>

Improvements
This PR replaces the hand-built, multi-stage intermediate data pipeline with an in-memory SQLite database as the single source of truth.
Complexity reduction

- Index dicts (svcs_from_req, mvrs_from_svc, reqs_from_urn, etc.)
- IndexedDatasetFilterProcessor with recursive cleanup
- StatisticsGenerator with manual aggregation loops → StatisticsService

What was removed (1,181 lines)

- StatisticsContainer/TotalStatisticsItem (105 lines)
- StatisticsGenerator (313 lines)
- CombinedIndexedDatasetGenerator (302 lines)
- IndexedDatasetFilterProcessor (390 lines)
- CombinedIndexedDataset (53 lines)

What replaced it (1,567 lines)

- StatisticsService, ExportService

The net increase of ~386 lines trades imperative index-maintenance code for a declarative storage layer with FK constraints, CASCADE rules, ACID semantics, and a security authorizer.
Summary

Replace intermediate data structures (CombinedIndexedDataset, StatisticsContainer, StatisticsGenerator) with an in-memory SQLite database as the single source of truth. All commands (status, export, report) now query the database through a RequirementsRepository and service layer instead of traversing in-memory dicts.

Phases completed in this PR

- RequirementsDatabase, DatabasePopulator, authorizer (security sandbox)
- DatabaseFilterProcessor for applying requirement/SVC filters via SQL DELETE
- RequirementsRepository: data access layer with all entity queries and test result resolution
- StatisticsService: replaces StatisticsGenerator + StatisticsContainer with frozen TestStats/RequirementStatus/TotalStats dataclasses
- ExportService: replaces GenerateJsonCommand inline logic with --req-ids/--svc-ids filtering
- build_database(): pipeline helper: Location → parse → populate DB → filter → validate
- LifecycleValidator and GroupByOrganizor migrated from CombinedIndexedDataset to RequirementsRepository
- status_output.schema.json and export_output.schema.json for output validation

Status table (before → after)
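A sketch of what the `build_database()` context-manager pipeline could look like; all signatures and steps here are hypothetical simplifications of the real helper, which parses a Location, populates via DatabasePopulator, applies filters, and validates:

```python
from contextlib import contextmanager
import sqlite3

@contextmanager
def build_database(location):
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("PRAGMA foreign_keys = ON")
        # real pipeline: parse(location) -> DatabasePopulator -> filters -> validate
        conn.execute("CREATE TABLE requirements (id TEXT PRIMARY KEY)")
        yield conn
    finally:
        conn.close()          # commands cannot leak the in-memory database

with build_database("path/to/project") as db:
    db.execute("INSERT INTO requirements VALUES ('REQ_001')")
    count = db.execute("SELECT count(*) FROM requirements").fetchone()[0]
```

The `with` block guarantees the in-memory database is released even if a command raises mid-query.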
Before: 13 columns with complex merged header spanning T/P/F/S/M sub-columns for each test group.
After: 5 clean columns — each test cell shows inline counts with colors (green=passed, red=failed, yellow=skipped, orange=missing):
Key design decisions

- Location → CombinedRawDatasetsGenerator(database=db) → DatabaseFilterProcessor → RequirementsRepository → commands
- Legacy structures (StatisticsGenerator, StatisticsContainer, CombinedIndexedDataset, etc.) kept for Phase 5 cleanup
- _TotalStatsTemplateAdapter to bridge new attribute names to existing Jinja2 templates

Closes

- StatisticsContainer renamed/refactored into StatisticsService + clean dataclasses

Related
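The adapter idea above can be sketched as follows; the legacy attribute names and the dataclass fields here are hypothetical placeholders, not the real template contract:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TotalStats:
    total_requirements: int
    completed_requirements: int

class _TotalStatsTemplateAdapter:
    """Exposes legacy attribute names on the new dataclass so existing
    Jinja2 templates render unchanged (names assumed for illustration)."""

    def __init__(self, stats: TotalStats):
        self._stats = stats

    @property
    def nr_of_total_requirements(self):        # legacy template name (assumed)
        return self._stats.total_requirements

    @property
    def nr_of_completed_requirements(self):    # legacy template name (assumed)
        return self._stats.completed_requirements

adapter = _TotalStatsTemplateAdapter(TotalStats(10, 7))
# adapter.nr_of_total_requirements -> 10
```

Bridging at the adapter layer lets the templates migrate later, in their own PR, instead of coupling the storage rewrite to a template rewrite.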
Test plan

- status_output.schema.json
- export_output.schema.json
- reqstool-demo (requires ./mvnw verify in sibling project)

🤖 Generated with Claude Code