Replace git ls-tree with projection-based folder count #1917

Open
tyrielv wants to merge 2 commits into microsoft:master from tyrielv:tyrielv/hydration-status-4-projection

tyrielv commented Mar 19, 2026

Replace git ls-tree with projection-based folder count for hydration status

Summary

Replace the slow git ls-tree -r -d HEAD approach for counting total folders (used as the denominator in hydration percentage) with an in-memory projection tree walk. On a large repo with 2.8 million files (~387K directories), this reduces folder count time from ~10-25 seconds (wide variance observed in practice) to ~78ms in the mount process, and to ~2 seconds via index parsing in the unmounted fallback path.

Once this PR merges, a follow-up PR will further improve the hydration status hook by using IPC over the mount's named pipe to fetch the cached data instead of launching gvfs health --status.

What changed

| Area | Change |
|---|---|
| GitIndexProjection | GetProjectedFolderCount(): walks the in-memory FolderData tree using a stack (mounted repos) |
| GitIndexProjection.GitIndexParser | CountIndexFolders(): parses git index v4 to extract unique directories (unmounted fallback for gvfs health) |
| EnlistmentHydrationSummary | CreateSummary() now takes a required Func<int> projectedFolderCountProvider parameter; GetHeadTreeCount() removed entirely |
| GitStatusCache | SetProjectedFolderCountProvider(): late-bound provider set by FileSystemCallbacks after GitIndexProjection is created (documented circular dependency) |
| FileSystemCallbacks | Wires the provider: () => this.GitIndexProjection.GetProjectedFolderCount() |
| HealthVerb | Fallback path uses GitIndexProjection.CountIndexFolders() for unmounted repos |
| Dead code removed | GetHeadTreeCount(), GitProcess.GetHeadTreeId(), GVFSConstants.TreeCount cache file constant |
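
For readers unfamiliar with the approach, here is a minimal sketch of a stack-based folder count over an in-memory tree. FolderNode and FolderCounter are hypothetical stand-ins, not the real FolderData type or the PR's code, and the choice to exclude the root from the count is an assumption made only for this illustration.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a projected folder; not the real FolderData type.
public sealed class FolderNode
{
    public FolderNode(string name)
    {
        this.Name = name;
    }

    public string Name { get; }

    public List<FolderNode> ChildFolders { get; } = new List<FolderNode>();

    // Convenience helper for building a small tree in examples.
    public FolderNode AddChild(string name)
    {
        FolderNode child = new FolderNode(name);
        this.ChildFolders.Add(child);
        return child;
    }
}

public static class FolderCounter
{
    // Counts every folder below the root using an explicit stack, so a very
    // deep tree cannot overflow the call stack the way recursion could.
    // Assumption: the root itself is not counted.
    public static int CountDescendantFolders(FolderNode root)
    {
        int count = 0;
        Stack<FolderNode> pending = new Stack<FolderNode>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            FolderNode current = pending.Pop();
            foreach (FolderNode child in current.ChildFolders)
            {
                count++;
                pending.Push(child);
            }
        }

        return count;
    }
}
```

The walk touches each folder once and does no I/O, which is consistent with the millisecond-scale timing reported in the summary.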

Threading fixes (from prior PR review feedback)

  • Interlocked.Exchange for activeHydrationTask — fixes race between background thread and Dispose()
  • Bounded Wait(5s) in Dispose() — prevents shutdown hang if hydration task stalls
  • Compacted blank lines around ThrowIfCancellationRequested() calls
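
The first two fixes follow a common pattern that can be sketched as below. BackgroundWorkOwner and its members are hypothetical names for illustration, not the GitStatusCache code; only the 5-second bound is taken from the description above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical owner of a single background task; names are illustrative.
public sealed class BackgroundWorkOwner : IDisposable
{
    private readonly CancellationTokenSource cancellation = new CancellationTokenSource();
    private Task activeTask;
    private int disposed;

    public void Start(Action<CancellationToken> work)
    {
        CancellationToken token = this.cancellation.Token;
        Task started = Task.Run(() => work(token));

        // Publish the task reference atomically so a concurrent stop
        // never observes a torn or stale field.
        Interlocked.Exchange(ref this.activeTask, started);
    }

    // Returns true if nothing was running or the task ended within the timeout.
    public bool StopAndWait(TimeSpan timeout)
    {
        this.cancellation.Cancel();

        // Atomically take ownership of the task; a second caller gets null.
        Task task = Interlocked.Exchange(ref this.activeTask, null);
        if (task == null)
        {
            return true;
        }

        try
        {
            // Bounded wait: a stalled task cannot hang shutdown.
            return task.Wait(timeout);
        }
        catch (AggregateException)
        {
            // The task faulted or was canceled, so it has finished.
            return true;
        }
    }

    public void Dispose()
    {
        if (Interlocked.Exchange(ref this.disposed, 1) != 0)
        {
            return;
        }

        this.StopAndWait(TimeSpan.FromSeconds(5));
        this.cancellation.Dispose();
    }
}
```

Swapping the field to null before waiting means the background thread and Dispose() cannot both act on the same task reference, and the timeout trades a guaranteed clean shutdown for a guaranteed bounded one.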

Tests

  • Rewrote EnlistmentHydrationSummaryTests — 5 tests for CountIndexFolders using synthetic git index v4 binary
  • Added EnlistmentHydrationSummaryContextTests: GetIndexFileCount error path + CreateSummary cancellation

Manual testing

Verified on a large repo (~2.8M files, ~387K dirs) — gvfs health --status and git status both show correct hydration percentages, updating from 6% → 11% folders after hydrating 10,000 directories.

Replace the slow git ls-tree -r -d HEAD approach (~25s on a large repo
with 2.8 million files) with an in-memory projection tree walk (~78ms) via
FolderData.GetRecursiveFolderCount.

- Add GitIndexProjection.GetProjectedFolderCount for mounted repos
- Add GitIndexProjection.CountIndexFolders (git index parser) as fallback
  for unmounted repos (gvfs health --status)
- Add GitStatusCache.SetProjectedFolderCountProvider, called by
  FileSystemCallbacks after GitIndexProjection is created
- Make projectedFolderCountProvider a required parameter of CreateSummary
- Remove GetHeadTreeCount, GetHeadTreeId, and TreeCountCache.dat
- Add GVFSEnlistment.GitIndexPath convenience property
- Fix activeHydrationTask race (Interlocked.Exchange) and add bounded
  Dispose wait
- Rewrite EnlistmentHydrationSummaryTests for CountIndexFolders

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@tyrielv tyrielv requested a review from KeithIsSleeping March 20, 2026 17:28
@KeithIsSleeping KeithIsSleeping left a comment

Multi-Model Code Review

Reviewed with Claude Opus 4.6, GPT-5.3-Codex, and GPT-5.4. Note: Opus found no problem with the item below; this finding comes from the two GPT models only (2/3). Flagging for author consideration.

/// This is computed from the in-memory tree built during index parsing,
/// so it is essentially free (no I/O, no process spawn).
/// </summary>
public virtual int GetProjectedFolderCount()

MEDIUM (disputed — 2/3 models flag, Opus disagrees): Folder count semantics diverge between mount path and health fallback

GetProjectedFolderCount() walks the in-memory projection tree, which is built only from entries accepted by AddIndexEntryToProjection (skip-worktree/merge-state filtered). CountIndexFolders() in the fallback path parses raw index entries and counts directories for ALL entries with no equivalent filtering.

This means the hydration denominator can differ depending on the code path (mounted status cache vs. the gvfs health --status fallback). On the mounted path, the filtered denominator could also fall below HydratedFolderCount, causing EnlistmentHydrationSummary.IsValid to fail.

Note: Opus 4.6 analyzed this and concluded the count is correct, arguing the projection tree is the right denominator since only projected folders can be hydrated. This may be intentional behavior — flagging for author to confirm.
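
To make the disputed point concrete, here is a minimal sketch of how a filtered and an unfiltered directory count can diverge. DirectoryCounter is a hypothetical helper that works on already-decoded, '/'-separated index paths. It does not parse the index format, and which entries the real AddIndexEntryToProjection accepts is not modeled here: the caller decides what to pass in.

```csharp
using System;
using System.Collections.Generic;

public static class DirectoryCounter
{
    // Counts the unique directories implied by a set of '/'-separated file
    // paths. Each path contributes its parent directory and all ancestors.
    // Assumption: paths compare case-sensitively (ordinal).
    public static int CountUniqueDirectories(IEnumerable<string> filePaths)
    {
        HashSet<string> directories = new HashSet<string>(StringComparer.Ordinal);

        foreach (string path in filePaths)
        {
            int slash = path.LastIndexOf('/');
            while (slash > 0)
            {
                string directory = path.Substring(0, slash);
                if (!directories.Add(directory))
                {
                    // Already recorded, so its ancestors were recorded too.
                    break;
                }

                slash = directory.LastIndexOf('/');
            }
        }

        return directories.Count;
    }
}
```

Given the entries src/a.cs, src/gen/b.cs, and README.md, counting all of them yields 2 directories (src and src/gen). If a filter drops src/gen/b.cs before counting, the result is 1. A directory whose entries are all filtered out therefore appears in one denominator but not the other.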
