Skip to content

Add multi-index lookup and GroupedMatrix for structured groupBy results#205

Merged
perNyfelt merged 25 commits intomainfrom
core-v3.7.0-2.1
Mar 10, 2026
Merged

Add multi-index lookup and GroupedMatrix for structured groupBy results#205
perNyfelt merged 25 commits intomainfrom
core-v3.7.0-2.1

Conversation

@perNyfelt
Copy link
Member

  • Add createIndex/resetIndex/lookup/lookupIndices to Matrix with lazy-invalidated internal index (database-style indexing over data columns)
  • Add GroupedMatrix class with get(), keys(), level(), agg(), each()
  • Change Stat.groupBy to return GroupedMatrix with compound List keys
  • Add Matrix.groupBy() convenience method
  • Propagate index through clone/subset/selectColumns, clear on drop
  • Emit #index: metadata in toCsvString
  • Update GgStat callers to use toStringKeyMap() for compat
  • Add 32 new tests (17 index + 16 GroupedMatrix)
  • Update docs and cookbook

- Add createIndex/resetIndex/lookup/lookupIndices to Matrix with
  lazy-invalidated internal index (database-style indexing over data columns)
- Add GroupedMatrix class with get(), keys(), level(), agg(), each()
- Change Stat.groupBy to return GroupedMatrix with compound List keys
- Add Matrix.groupBy() convenience method
- Propagate index through clone/subset/selectColumns, clear on drop
- Emit #index: metadata in toCsvString
- Update GgStat callers to use toStringKeyMap() for compat
- Add 32 new tests (17 index + 16 GroupedMatrix)
- Update docs and cookbook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 9, 2026 13:27
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds database-style indexing to Matrix and introduces GroupedMatrix as a structured return type for groupBy, updating callers, tests, and docs to match.

Changes:

  • Add Matrix index API (createIndex/resetIndex/lookup/lookupIndices) with lazy invalidation and CSV metadata emission.
  • Introduce GroupedMatrix and change Stat.groupBy/Matrix.groupBy to return it (with legacy toStringKeyMap() compatibility).
  • Update ggplot stats, tests, and documentation to use the new grouping/indexing behavior.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
matrix-ggplot/src/main/groovy/se/alipsa/matrix/gg/stat/GgStat.groovy Updates Stat.groupBy(...) call sites to use GroupedMatrix.toStringKeyMap() for legacy string-key behavior.
matrix-core/v3.7.0-roadmap.md Documents the new index + GroupedMatrix design and implementation checklist.
matrix-core/src/test/groovy/StatTest.groovy Updates groupBy assertions to use GroupedMatrix.get(...) and adds a legacy toStringKeyMap() check.
matrix-core/src/test/groovy/MatrixTest.groovy Adds index API tests plus Matrix.groupBy() convenience method tests.
matrix-core/src/test/groovy/GroupedMatrixTest.groovy Adds a new test suite covering GroupedMatrix accessors and aggregation APIs.
matrix-core/src/main/groovy/se/alipsa/matrix/core/Stat.groovy Changes groupBy implementation/return type to GroupedMatrix.
matrix-core/src/main/groovy/se/alipsa/matrix/core/Matrix.groovy Implements index fields, invalidation hooks, CSV #index: emission, lookup APIs, and groupBy convenience methods.
matrix-core/src/main/groovy/se/alipsa/matrix/core/GroupedMatrix.groovy Adds the new GroupedMatrix type with structured keys + aggregation helpers + legacy map conversions.
docs/python-comparison.md Updates feature comparison to reflect rolling/cumulative ops and new indexing/grouping support.
docs/cookbook/matrix-core.md Adds cookbook sections/examples for indexing/lookup and GroupedMatrix usage.

- Remove dead branch in GroupedMatrix.agg, clarify keys are source column names
- Add key count validation to lookupIndices (0 keys, too many keys)
- Preserve index on empty lookup results for consistency
- Clear indexMap in copyIndexTo to prevent stale state
- Use immutable keys in Stat.groupBy to protect map integrity
- Defensive-copy groups map in GroupedMatrix constructor

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

- Sort partial-key lookupIndices results to preserve original row order
- Document #index: comment header in toCsvString Javadoc
- Infer types in GroupedMatrix.agg (group column types from source,
  aggregation types from first result)
- Update python-comparison.md groupBy section for GroupedMatrix API
- Add invalidateIndex to all remaining mutating methods: apply(rows),
  apply(criteria), replace(col,from,to), replace(col,type,values),
  upsertColumn
- Clear index on rename if indexed column is renamed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

…kupIndices

- Fix agg type inference to use per-column slots, preventing misalignment
  when some aggregations return null for the first group
- Add #index: parsing to MatrixBuilder for full CSV round-trip support
- Return unmodifiable list from lookupIndices full-key path
- Extend testIndexCsvRoundTrip to verify actual round-trip restore

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Remove outdated reference to underscore-separated string keys and
document the structured List<?> compound keys and toStringKeyMap()
for backward compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

Return Collections.emptyList() for missing full keys and
Collections.unmodifiableList() for partial-key results, matching
the full-key behavior for a consistent API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

…cates

- Add public markIndexDirty() for direct column mutation scenarios
- Wrap index map keys with Collections.unmodifiableList() in buildIndex()
- Reject duplicate column names in createIndex()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

- MatrixBuilder: filter out empty strings from #index: spec parsing
  to avoid confusing errors from trailing commas or empty entries
- GroupedMatrix.agg: infer aggregated column types as the common
  supertype across all groups instead of using the first non-null
  type, ensuring stable types regardless of group iteration order

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@perNyfelt perNyfelt requested a review from Copilot March 9, 2026 16:20
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

perNyfelt and others added 2 commits March 10, 2026 12:53
…der leak

- GroupedMatrix constructor: defensively copy and wrap all key lists as
  immutable to protect map integrity from external callers
- Matrix.rename: update indexedColumnNames in-place instead of dropping
  the index, so callers don't lose their index on column rename
- MatrixBuilder: clear indexColumns at parse start to prevent index
  leaking between reuses of the same builder instance
- Fix lookupIndices comment: "unmodifiable view" instead of "defensive copy"
- Roadmap: clarify toStringKeyMap() is the backward-compat method

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove clearPendingIndexColumns() from types() methods so that
  overriding types after csvString() parsing preserves the #index:
  metadata (index is still cleared when data/columns/rows change)
- Update toMap() Javadoc to clarify it is not the legacy format;
  point to toStringKeyMap() for backward compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.

@perNyfelt perNyfelt merged commit 82fc4ab into main Mar 10, 2026
4 of 5 checks passed
@perNyfelt perNyfelt deleted the core-v3.7.0-2.1 branch March 10, 2026 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants