Ranking Parent Documents with Nested Features by dhruva-r · Pull Request #946 · Yelp/nrtsearch

dhruva-r · 2026-03-30T02:18:21Z

Jira Ticket

Summary of Changes

Add _CHILDREN. prefix support in SegmentDocLookup, allowing scoring plugins running on parent documents to access aggregated doc values from their child (nested) documents (e.g., _CHILDREN.appointments.price
returns all appointment prices across a parent's children)
Store _nested_child_count as a new meta field on parent documents at index time so ChildAggregatedDocValues can compute the child range without backward scanning
Filter children by nested path using BitSetProducer, so indexes with multiple nested types (e.g., appointments and reviews) return only values from the target path
Add ChildFilter proto message on SearchRequest, enabling query-time filtering of which children are included in _CHILDREN. aggregation (e.g., only appointments with price >= 50)
Cache the child field iterator in ChildAggregatedDocValues for efficiency, with a lastParentDocId guard to prevent incorrect backward iteration on repeated access for the same document

Tests

dhruva-r · 2026-04-01T21:30:49Z

If a different approach is required or if this approach is inefficient let me know.

vim345

Overall LGTM. I just had one comment.

Also, AI mentioned the following about missing test coverage:

 What's missing:

  1. No test for the issue #1 bug path — there's no test that verifies the _nested_child_count doc values accessor is reused across multiple documents in the same segment (i.e. no performance regression test, and the bug wouldn't cause a test failure — just extra allocations).
  2. No test for a parent with zero children — the childCount <= 0 early-return path in setDocId is untested.
  3. No test for documents without nested fields — i.e. calling _CHILDREN. on a non-parent document in a mixed index.
  4. No test for the lastParentDocId caching guard — the early-return if (parentDocId == lastParentDocId) return; is never exercised by a test that calls setDocId twice with the same doc ID.
  5. ChildFilterIntegrationTest uses scores to encode results rather than fetching actual field values, which makes assertions somewhat fragile and hard to read — though this is a style concern rather than a correctness gap.

Is it possible to address these as well?

vim345 · 2026-04-03T17:42:34Z

src/main/java/com/yelp/nrtsearch/server/doc/ChildAggregatedDocValues.java

+      return;
+    }
+
+    LoadedDocValues<?> childCountDocValues = childCountFieldDef.getDocValues(leafContext);


getDocValues(leafContext) is called on every setDocId invocation rather than once in the constructor (as is done for childFieldDocValues). This allocates a new accessor on every parent document scored. It should be initialized once in the constructor and reused like childFieldDocValues.

Is it possible for you to do some benchmarking to see what the performance looks like with or without the above suggestion?

dhruva-r added 5 commits March 27, 2026 13:32

solution1 commit

3a9971c

added support to add query to child filters

aa31423

updating childAggregatedDocValues

f543603

removing dev comments and spotlessApply

f95d257

adding resource file to fix failing build

33a72d6

dhruva-r marked this pull request as ready for review April 1, 2026 21:29

dhruva-r requested review from vim345 and waziqi89 April 1, 2026 21:29

vim345 reviewed Apr 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ranking Parent Documents with Nested Features#946

Ranking Parent Documents with Nested Features#946
dhruva-r wants to merge 5 commits intomainfrom
dhruva_parents_documents_can_reference_nested_fields

dhruva-r commented Mar 30, 2026

Uh oh!

dhruva-r commented Apr 1, 2026

Uh oh!

vim345 left a comment

Uh oh!

vim345 Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

dhruva-r commented Mar 30, 2026

Jira Ticket

Summary of Changes

Tests

Uh oh!

dhruva-r commented Apr 1, 2026

Uh oh!

vim345 left a comment

Choose a reason for hiding this comment

Uh oh!

vim345 Apr 3, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants