Is your feature request related to a problem? Please describe.
When inspecting or diffing CAS contents in tests, we frequently rely on a simple CSV stringification that:
- does not preserve rich, human-friendly output (HTML) for easier visual inspection,
- lacks configurable columns (anchor, covered text, indexed status),
- produces unstable ordering for multi-valued/annotation features and ambiguous anchors,
- offers no convenient way to exclude noisy features/types or treat empty strings specially,
- and forces long covered text into the output, making diffs noisy.
I'm often frustrated when test failures produce long, hard‑to‑scan CAS dumps or when small, irrelevant differences (e.g., non-deterministic anchor numbering or list order) make comparisons brittle.
Describe the solution you'd like
Add an enhanced CAS -> comparable text utility with the following capabilities:
- Output formats: Keep CSV but add an HTML renderer for nicer human-readable tables.
- Configurable columns: enable/disable an anchor column, an indexed column, and a covered‑text column (with configurable max length and middle-abbreviation).
- Anchor formatting: anchors include type short name, optional annotation offsets, optional sofa id, optional indexing marker, and stable disambiguation suffixes for duplicate anchors; support optional anchor feature hash suffix.
- Stable ordering: when multi‑valued features hold annotations, optionally sort them by begin (ascending), then end (descending), then type name to provide deterministic set‑like ordering.
- Index awareness: mark FSs as indexed and optionally add a dedicated `<INDEXED>` column; use indexed status as a tie-breaker when ordering.
- Exclusions: allow regex patterns to exclude specific features or types from rendering (cache regex compilation for performance).
- Null/empty handling: configurable `nullValue`, and an option to treat empty strings as null so empty values don’t clutter diffs.
- Multi‑valued rendering: robust handling of array/list FSs and primitive arrays, rendering them as bracketed lists; handle nested multi-valued structures recursively.
- Rendering options: omit the XML declaration in HTML output and use minimal inline styling so the HTML is self-contained.
- Public API knobs: setters/getters for all of the above flags so callers can tune output for different use cases (compact machine diffs vs human inspection).
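To make the stable-ordering item concrete, here is a minimal sketch of the requested sort key (begin ascending, end descending, type name). `Span` and `SpanOrdering` are hypothetical stand-ins invented for illustration; they are not existing UIMA classes, and a real implementation would compare actual annotations and could add the indexed status as a further tie-breaker.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for an annotation; not a UIMA class.
final class Span {
    final String typeName;
    final int begin;
    final int end;

    Span(String typeName, int begin, int end) {
        this.typeName = typeName;
        this.begin = begin;
        this.end = end;
    }

    @Override
    public String toString() {
        return typeName + "[" + begin + "-" + end + "]";
    }
}

// Hypothetical helper illustrating the proposed ordering.
final class SpanOrdering {
    // begin ascending, then end descending (longer spans first), then type name.
    static final Comparator<Span> STABLE_ORDER =
            Comparator.comparingInt((Span s) -> s.begin)
                    .thenComparing(Comparator.comparingInt((Span s) -> s.end).reversed())
                    .thenComparing((Span s) -> s.typeName);

    // Returns a sorted copy so the caller's list is left untouched.
    static List<Span> sorted(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(STABLE_ORDER);
        return copy;
    }
}
```

Sorting a copy rather than the feature value itself keeps the CAS unchanged, so enabling the option would only affect the rendered text.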
This produces a single stable, configurable comparable representation useful for both automated assertions and human debugging.
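As a rough illustration of three of the smaller knobs (middle-abbreviation of covered text, cached regex exclusions, and treating empty strings as null), the sketch below uses only the JDK. The class `RenderHelpers` and its method names are assumptions made up for this example, as is the choice of `...` as the abbreviation marker and of full-match semantics for exclusion patterns.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Hypothetical helper class; names and defaults are illustrative only.
final class RenderHelpers {
    // Compiled patterns are cached by their source string so repeated checks do not recompile.
    private static final Map<String, Pattern> PATTERN_CACHE = new ConcurrentHashMap<>();

    // Shortens text to maxLength characters by replacing the middle with "...".
    // Leaves the text unchanged when it already fits or when maxLength is too
    // small to hold the marker plus at least one character.
    static String abbreviateMiddle(String text, int maxLength) {
        final String marker = "...";
        if (text == null || text.length() <= maxLength || maxLength <= marker.length()) {
            return text;
        }
        int keep = maxLength - marker.length();
        int head = (keep + 1) / 2; // head gets the extra character when keep is odd
        int tail = keep / 2;
        return text.substring(0, head) + marker + text.substring(text.length() - tail);
    }

    // True if the name fully matches any exclusion regex (full match is an assumption).
    static boolean isExcluded(String name, Iterable<String> exclusionRegexes) {
        for (String regex : exclusionRegexes) {
            Pattern pattern = PATTERN_CACHE.computeIfAbsent(regex, Pattern::compile);
            if (pattern.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    // Substitutes nullValue for null and, if requested, for empty strings.
    static String renderValue(String value, String nullValue, boolean emptyAsNull) {
        if (value == null || (emptyAsNull && value.isEmpty())) {
            return nullValue;
        }
        return value;
    }
}
```

For example, under these assumptions `RenderHelpers.abbreviateMiddle("The quick brown fox", 11)` yields `The ... fox`, which keeps both ends of the covered text visible in a diff while bounding the column width.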