Optimize Hybrid Search for performance #822

@paullizer

Description

User: what do you think of this suggestion?

application/single_app/functions_search.py

hybrid_search() now calls normalize_document_revision_families(...) on every search request (and potentially across multiple groups/public workspaces). That helper queries Cosmos for all accessible documents and can upsert items + update search-index visibility, which is expensive to do in the hot path and can add significant latency/cost under load. Consider moving normalization to mutation-time (upload/revision/delete), running it asynchronously, or caching/TTL-gating it so searches don’t routinely trigger Cosmos + index writes.

GitHub Copilot: I’m checking the Cosmos-specific guidance first, then I’ll inspect the search path and revision-normalization code to assess whether this suggestion is technically sound.

Read , lines 1 to 400

Read

Read skill cosmosdb-best-practices

Read memory

I’ve confirmed the search path does invoke revision normalization up front. Next I’m tracing what that helper actually does and whether the same work already happens on upload/delete paths.

Searched for text normalize_document_revision_families (**/application/single_app/**/*.py), 5 results

Read , lines 80 to 190

Searched for regex search_visibility_state|is_current_version|revision_family_id (**/application/single_app/**/*.py), 61 results

I need the normalization helper and the mutation paths side by side so I can judge whether the search-path write is redundant or protecting against gaps elsewhere.

Read , lines 400 to 540

Read , lines 580 to 740

Read , lines 3020 to 3075

Read , lines 3110 to 3155

I’ve verified upload and delete paths already maintain the revision flags and search visibility. I’m now checking how broad the search-time query is, because that determines how expensive the safety net really is.

Searched for text def _query_accessible_documents (**/application/single_app/functions_documents.py), 1 result

Read , lines 320 to 405

Searched for regex hybrid_search\( (**/application/single_app/**/*.py), 7 results

The expensive part is confirmed: search-time normalization does a full accessible-document Cosmos query before any cache hit, and it can fan out across group/public scopes. I’m checking whether that hot-path repair is there mainly for legacy/backfill safety, which affects how strongly I’d endorse moving it.

Searched for regex select_current_documents|normalize_document_revision_families\( (**/application/single_app/**/*.py), 12 results

Read , lines 260 to 330

Searched for regex set_document_chunk_visibility\( (**/application/single_app/functions_documents.py), 5 results

I have enough to form an opinion; I’m just pulling exact line anchors so I can point to the hot-path and the mutation-time safeguards precisely.

Searched for regex generate_search_cache_key|get_cached_search_results|normalize_document_revision_families\(|_query_accessible_documents\(|cosmos_container.upsert_item\(|set_document_chunk_visibility\( (**/application/single_app/functions_search.py), 7 results

Searched for regex def normalize_document_revision_families|def _query_accessible_documents|set_document_chunk_visibility\(existing_document, active=False\)|existing_document\['is_current_version'\] = False|search_visibility_state': 'active'|promoted_document\['search_visibility_state'\] = 'active' (**/application/single_app/functions_documents.py), 5 results

Searched for regex cosmos_container.upsert_item\(document_item\)|document_item\["search_visibility_state"\] = "archived"|document_item\["search_visibility_state"\] = "active" (**/application/single_app/functions_documents.py), 4 results

Searched for regex revision_family_id": revision_family_id|"is_current_version": True|"search_visibility_state": "active" (**/application/single_app/functions_documents.py), 9 results

I’ve got the core answer. One last detail: I’m checking whether search itself filters on current/archived status, because that decides whether hot-path normalization is optional or currently compensating for missing search filters.

Searched for text user_id eq '@|user_id eq '|group_id eq '|public_workspace_id eq '|search_visibility_state (**/application/single_app/functions_search.py), 20 results

Read , lines 190 to 360

Searched for regex set_document_chunk_visibility\(promoted_document, active=True\)|promoted_document\['is_current_version'\] = True (**/application/single_app/functions_documents.py), 2 results

Searched for text def hybrid_search|def select_current_documents (**/application/single_app/**/*.py), 2 results

It’s a good suggestion, and I’d treat it as a real performance concern rather than a minor cleanup.

hybrid_search() in functions_search.py runs normalize_document_revision_families(...) before it builds the cache key and checks the cache, so even cache-hit searches pay the normalization cost first. The helper itself, in functions_documents.py, does a cross-partition SELECT * over all accessible docs, and if it finds drift it can update both the Cosmos items and their search-index visibility. With doc_scope="all", that can fan out across personal docs, multiple groups, and visible public workspaces, so the latency/RU concern is valid.
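To make the ordering problem concrete, here is a minimal sketch of a search entry point that checks the cache before doing any normalization. It is not the actual hybrid_search() code: the function name hybrid_search_sketch and all four injected callables are hypothetical stand-ins, since the real signatures of the cache and normalization helpers are not shown here.

```python
from typing import Callable, List, Optional


def hybrid_search_sketch(
    query: str,
    scope_key: str,
    *,
    make_cache_key: Callable[[str, str], str],
    get_cached: Callable[[str], Optional[List[str]]],
    normalize: Callable[[str], None],
    run_search: Callable[[str, str], List[str]],
) -> List[str]:
    """Hypothetical ordering: cache lookup first, normalization only on a miss."""
    cache_key = make_cache_key(query, scope_key)
    cached = get_cached(cache_key)
    if cached is not None:
        # Cache hit: return without any document query or index write.
        return cached
    # Cache miss: pay the normalization cost once, then search.
    normalize(scope_key)
    return run_search(query, scope_key)
```

With this ordering a cache hit skips the repair step entirely, so cached results can stay stale until the cache entry expires. Whether that is acceptable depends on how the real cache is invalidated on upload and delete.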

The main caveat is correctness. Search in functions_search.py currently filters on scope fields like user_id, group_id, and public_workspace_id, not on a current/archive flag. So search-time normalization is also acting as a repair loop for legacy or inconsistent data. The good news is that the normal mutation paths in functions_documents.py already do most of this work on revision upload and on delete/promotion. So I’d support the suggestion, but in this form: move invariant maintenance to mutation-time, keep an async/backfill repair path for drift, and use TTL-gating only as an interim mitigation. Caching by itself is not enough unless the cache lookup can happen before normalization or normalization gets its own TTL.
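As an illustration of what "invariant maintenance at mutation-time" means, here is a small sketch of the state change on revision upload. The field names is_current_version and search_visibility_state and the values "active" and "archived" appear in the code searches above. The function apply_new_revision, its signature, and the rule that non-current revisions are always archived are assumptions made for this example, not the existing implementation.

```python
from typing import List, Tuple


def apply_new_revision(family: List[dict], new_doc: dict) -> Tuple[List[dict], dict]:
    """Hypothetical helper: compute the updates needed when new_doc becomes current.

    Returns (demoted, promoted). Inputs are not mutated; the caller would
    persist the returned copies and update search-index visibility to match.
    """
    demoted = [
        {**doc, "is_current_version": False, "search_visibility_state": "archived"}
        for doc in family
        if doc.get("is_current_version")
    ]
    promoted = {
        **new_doc,
        "is_current_version": True,
        "search_visibility_state": "active",
    }
    return demoted, promoted
```

If every upload and delete/promotion path applies a change like this, search no longer needs to repair state on each request and only legacy data remains to be handled separately.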

User: how much work and effort is this?

GitHub Copilot: This is probably medium effort, not huge.

The quick win is small: about 2 to 4 hours. That would be something like TTL-gating or scope-level caching so functions_search.py does not run revision normalization on every search. That reduces latency fast, but it still leaves the underlying design in place.
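A TTL gate of the kind described could look roughly like the sketch below. The class name NormalizationGate and the idea of keying by a scope string are assumptions for illustration. The gate is in-process only, so in a multi-worker deployment each worker would keep its own timestamps.

```python
import threading
import time
from typing import Callable, Dict


class NormalizationGate:
    """Hypothetical per-scope TTL gate: allow one normalization run per scope per TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._last_run: Dict[str, float] = {}
        self._lock = threading.Lock()

    def should_run(self, scope_key: str) -> bool:
        """Return True and record the time if the scope is due for normalization."""
        now = self._clock()
        with self._lock:
            last = self._last_run.get(scope_key)
            if last is not None and now - last < self._ttl:
                return False
            self._last_run[scope_key] = now
            return True
```

The search path would then call the normalization helper only when should_run(scope_key) returns True, for example with a scope key such as "user:<id>" or "group:<id>". The TTL bounds how long drift can go unrepaired, which is why it works as a mitigation and not as the final design.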

The proper fix is more like 1 day of focused work, maybe 1.5 days if you want it cleaned up well. The reason it is not trivial is that search-time normalization in functions_search.py is currently compensating for possible drift, while the actual normalization helper in functions_documents.py can both read broadly and write to Cosmos and the search index. The good part is that the upload and delete/promotion flows in functions_documents.py already maintain most of this state at mutation time, so this is more of a refactor and rollout problem than a full redesign.

If you also want a safe rollout for old data, I’d call it 1 to 2 days total. That version would include:

  1. Remove or gate hot-path normalization from search.
  2. Keep mutation-time updates as the source of truth.
  3. Add a one-time repair/backfill path for legacy documents.
  4. Add regression coverage for upload revisioning, current-version promotion, and search visibility across personal, group, and public scopes.

The main risk is not coding time, it is correctness during rollout. If you simply remove the search-time repair without a backfill or guard, stale legacy revisions could remain searchable. That is the part that needs care.
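One way to reduce that rollout risk is a read-only drift check that a backfill job or a regression test could run before the search-time repair is removed. The sketch below is an assumption-laden example: find_revision_drift is a hypothetical name, documents are plain dicts, and it assumes the invariant is "exactly one current version per revision family, current is active, everything else is archived". Documents without a revision_family_id are treated as their own family, which is also an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def find_revision_drift(documents: Iterable[dict]) -> Dict[str, List[str]]:
    """Hypothetical check: map each inconsistent revision family to its issues."""
    families: Dict[str, List[dict]] = defaultdict(list)
    for doc in documents:
        # Assumption: a legacy doc with no family id forms a family of one.
        family_id = doc.get("revision_family_id") or doc["id"]
        families[family_id].append(doc)

    problems: Dict[str, List[str]] = {}
    for family_id, docs in families.items():
        issues: List[str] = []
        current_count = sum(1 for d in docs if d.get("is_current_version"))
        if current_count != 1:
            issues.append(f"expected 1 current version, found {current_count}")
        for d in docs:
            expected = "active" if d.get("is_current_version") else "archived"
            if d.get("search_visibility_state") != expected:
                issues.append(f"{d['id']}: visibility should be {expected}")
        if issues:
            problems[family_id] = issues
    return problems
```

An empty result for a scope suggests the search-time repair is no longer doing useful work there. A non-empty result lists the families a one-time backfill would need to fix.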

If you want, I can turn this into a concrete implementation plan with the lowest-risk option first:

  1. Quick mitigation
  2. Proper refactor
  3. Proper refactor plus legacy backfill

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    Status

    Pending Evaluation

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
