Improve CasToComparableText

**Is your feature request related to a problem? Please describe.**
When inspecting or diffing CAS contents in tests, we frequently rely on a simple CSV stringification that:
- offers no rich, human-friendly output format (e.g. HTML) for visual inspection,
- lacks configurable columns (anchor, covered text, indexed status),
- produces unstable ordering for multi-valued/annotation features and ambiguous anchors,
- offers no convenient way to exclude noisy features/types or treat empty strings specially,
- and forces long covered text into the output, making diffs noisy.

I'm often frustrated when test failures produce long, hard‑to‑scan CAS dumps or when small, irrelevant differences (e.g., non-deterministic anchor numbering or list order) make comparisons brittle.

**Describe the solution you'd like**
Add an enhanced CAS -> comparable text utility with the following capabilities:
- Output formats: Keep CSV but add an HTML renderer for nicer human-readable tables.
- Configurable columns: enable/disable an anchor column, an indexed column, and a covered‑text column (with configurable max length and middle-abbreviation).
- Anchor formatting: anchors consist of the type short name, optional annotation offsets, an optional sofa id, and an optional indexing marker, plus a stable disambiguation suffix for duplicate anchors and an optional anchor feature hash suffix.
- Stable ordering: when multi‑valued features hold annotations, optionally sort them by begin (ascending), then end (descending), then type name to provide a deterministic, set‑like ordering.
- Index awareness: mark FSs as indexed and optionally add a dedicated `<INDEXED>` column; use indexed status as a tie-breaker when ordering.
- Exclusions: allow regex patterns to exclude specific features or types from rendering (cache regex compilation for performance).
- Null/empty handling: configurable `nullValue`, and an option to treat empty strings as null so empty values don’t clutter diffs.
- Multi‑valued rendering: robust handling of array/list FSs and primitive arrays, rendering them as bracketed lists; handle nested multi-valued structures recursively.
- Rendering options: omit the XML declaration in HTML output and use minimal inline styling so the HTML is self-contained.
- Public API knobs: setters/getters for all above flags so callers can tune output for different use cases (compact machine diffs vs human inspection).
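
To illustrate the "Stable ordering" item above, here is a minimal sketch of the proposed sort: begin ascending, then end descending, then type name. `Span` is a hypothetical stand-in for an annotation, not a UIMA class, and `AnnotationOrderSketch` is an illustrative name only. How this would be wired into `CasToComparableText` is left open.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: Span is a hypothetical stand-in for an annotation.
final class AnnotationOrderSketch {

    static final class Span {
        final String typeName;
        final int begin;
        final int end;

        Span(String typeName, int begin, int end) {
            this.typeName = typeName;
            this.begin = begin;
            this.end = end;
        }

        @Override
        public String toString() {
            return typeName + "[" + begin + "," + end + "]";
        }
    }

    // begin ascending, then end descending (longer spans first), then type name.
    static final Comparator<Span> SET_LIKE_ORDER = (a, b) -> {
        int c = Integer.compare(a.begin, b.begin);
        if (c != 0) {
            return c;
        }
        c = Integer.compare(b.end, a.end); // arguments swapped for descending order
        if (c != 0) {
            return c;
        }
        return a.typeName.compareTo(b.typeName);
    };

    // Returns a sorted copy so the caller's list is left untouched.
    static List<Span> sorted(List<Span> spans) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(SET_LIKE_ORDER);
        return copy;
    }
}
```

Sorting a copy rather than the feature value itself keeps rendering free of side effects on the CAS. Spans that agree on all three keys keep their input order because `List.sort` is stable, so a further tie-breaker (such as the indexed status mentioned above) would still be needed for full determinism.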

The result is a single stable, configurable, comparable representation that is useful for both automated assertions and human debugging.
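
As a rough sketch of three of the smaller knobs requested above (middle-abbreviation of covered text, `nullValue` handling with empty-as-null, and regex exclusions with cached compilation), something like the following could serve as a starting point. All class and method names here are hypothetical. The `...` marker and the choice of full-match semantics for exclusion patterns are assumptions, not part of the request.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Illustrative only: names and defaults are assumptions, not an existing API.
final class CellFormatSketch {

    // Compiled patterns are cached by their source string to avoid recompiling per cell.
    private static final Map<String, Pattern> PATTERN_CACHE = new ConcurrentHashMap<>();

    private static final String MARKER = "...";

    // Keeps the head and tail of the text and replaces the middle with a marker.
    static String abbreviateMiddle(String text, int maxLength) {
        if (text == null || text.length() <= maxLength) {
            return text;
        }
        if (maxLength <= MARKER.length()) {
            // Not enough room for head + marker + tail: fall back to plain truncation.
            return text.substring(0, Math.max(0, maxLength));
        }
        int keep = maxLength - MARKER.length();
        int head = (keep + 1) / 2; // head gets the extra character when keep is odd
        int tail = keep / 2;
        return text.substring(0, head) + MARKER + text.substring(text.length() - tail);
    }

    // Maps null (and optionally the empty string) to the configured placeholder.
    static String renderValue(String value, String nullValue, boolean emptyIsNull) {
        if (value == null || (emptyIsNull && value.isEmpty())) {
            return nullValue;
        }
        return value;
    }

    // True if the name fully matches any of the exclusion patterns.
    static boolean isExcluded(String name, Iterable<String> excludeRegexes) {
        for (String regex : excludeRegexes) {
            Pattern pattern = PATTERN_CACHE.computeIfAbsent(regex, Pattern::compile);
            if (pattern.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Abbreviating in the middle keeps both ends of the covered text visible, which tends to be enough to recognize a span in a diff while bounding the column width.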

Improve CasToComparableText #444