perf: Views engine render optimizations — warm 99.99% faster, cold 10% faster#724
perf: Views engine render optimizations — warm 99.99% faster, cold 10% faster#724johnml1135 wants to merge 8 commits intomainfrom
Conversation
|
commit d0066f3247: commit c4678e8240: |
There was a problem hiding this comment.
Pull request overview
Performance-focused updates introduce render verification/benchmark infrastructure and targeted optimizations in the DetailControls/DataTree rendering path to reduce redundant work and support pixel-perfect render validation.
Changes:
- Added RenderVerification infrastructure (environment validation, diagnostics/trace toggling, benchmark result models, reporting & comparison utilities).
- Optimized
DataTree/Sliceconstruction, layout, painting, and visibility paths to avoid repeated O(N) work. - Added/updated test harness assets (layouts/parts, new diagnostics tests, VS Code tasks) and snapshot testing support.
Reviewed changes
Copilot reviewed 68 out of 153 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| Src/Common/RenderVerification/RenderModels.cs | Adds shared models for scenarios and timing results used by render verification/benchmarks. |
| Src/Common/RenderVerification/RenderEnvironmentValidator.cs | Captures environment details and produces a hash/diff for deterministic snapshot comparisons. |
| Src/Common/RenderVerification/RenderDiagnosticsToggle.cs | Adds a toggle for diagnostics/trace output with file-backed flags and a trace listener. |
| Src/Common/RenderVerification/RenderBenchmarkResults.cs | Defines JSON-serializable benchmark run/result structures and flags loading. |
| Src/Common/RenderVerification/RenderBenchmarkReportWriter.cs | Writes JSON + Markdown summary reports and recommendations for benchmark runs. |
| Src/Common/RenderVerification/RenderBenchmarkComparer.cs | Compares runs to detect regressions/improvements and generates a summary. |
| Src/Common/RenderVerification/CompositeViewCapture.cs | Implements two-pass DataTree capture with Views content overlay for pixel-perfect snapshots. |
| Src/Common/FieldWorks/FieldWorks.Diagnostics.dev.config | Adds a new trace switch for Views render timing. |
| Src/Common/Controls/XMLViews/XMLViewsTests/XmlBrowseViewBaseTests.cs | Updates test stub interface to include NeedsReconstruct. |
| Src/Common/Controls/DetailControls/VectorReferenceView.cs | Avoids drawing a trailing separator bar after the last item. |
| Src/Common/Controls/DetailControls/Slice.cs | Adds cached XML attribute parsing + width fast-path + DataTree method call updates. |
| Src/Common/Controls/DetailControls/PossibilityVectorReferenceView.cs | Avoids drawing a trailing separator bar after the last item. |
| Src/Common/Controls/DetailControls/HwndDiagnostics.cs | Adds helper methods to read USER/GDI handle counts for diagnostics/tests. |
| Src/Common/Controls/DetailControls/DetailControlsTests/TestResult.xml | Adds a test runner output artifact (currently shows a failed/invalid run). |
| Src/Common/Controls/DetailControls/DetailControlsTests/TestParts.xml | Extends test parts inventory to include senses sequences and collapsible slices. |
| Src/Common/Controls/DetailControls/DetailControlsTests/Test.fwlayout | Updates test layouts to include CitationForm and senses parts/layouts. |
| Src/Common/Controls/DetailControls/DetailControlsTests/DetailControlsTests.csproj | Adds snapshot/JSON deps and references RenderVerification project. |
| Src/Common/Controls/DetailControls/DetailControlsTests/DataTreeTests.cs | Updates expected template XML string for modified layout composition. |
| Src/Common/Controls/DetailControls/DetailControlsTests/DataTreeHwndCountTest.cs | Adds baseline diagnostics tests for handle growth and slice install count. |
| Src/Common/Controls/DetailControls/DataTree.cs | Adds multiple layout/paint/visibility optimizations and diagnostics counters. |
| Directory.Packages.props | Adds central package version for Verify. |
| AGENTS.md | Adds note about external dependency repository (liblcm). |
| .vscode/tasks.json | Adds tasks for render baseline tests and building tests via repo script. |
| .vscode/settings.json | Allows the new “Build Tests” task as an approved VS Code task. |
| .github/skills/openspec-verify-change/SKILL.md | Updates generatedBy metadata version. |
| .github/skills/openspec-sync-specs/SKILL.md | Updates generatedBy metadata version. |
| .github/skills/openspec-onboard/SKILL.md | Updates onboarding workflow text/commands and cross-platform command notes. |
| .github/skills/openspec-new-change/SKILL.md | Updates generatedBy metadata version. |
| .github/skills/openspec-ff-change/SKILL.md | Updates generatedBy metadata version. |
| .github/skills/openspec-explore/SKILL.md | Updates explore-mode wording around “proposal” vs other workflows. |
| .github/skills/openspec-continue-change/SKILL.md | Updates generatedBy metadata version. |
| .github/skills/openspec-bulk-archive-change/SKILL.md | Updates text and generatedBy metadata version. |
| .github/skills/openspec-archive-change/SKILL.md | Updates guidance to invoke the sync skill via Task tool. |
| .github/skills/openspec-apply-change/SKILL.md | Updates generatedBy metadata version. |
| .github/prompts/opsx-onboard.prompt.md | Mirrors onboarding prompt updates (CLI install check, command list). |
| .github/prompts/opsx-ff.prompt.md | Clarifies template usage vs context/rules constraints. |
| .github/prompts/opsx-explore.prompt.md | Mirrors explore prompt updates (“proposal” wording). |
| .github/prompts/opsx-bulk-archive.prompt.md | Mirrors bulk-archive prompt copy update. |
| .github/prompts/opsx-archive.prompt.md | Mirrors archive prompt sync invocation update. |
| .github/instructions/csharp.instructions.md | Adds C# instructions file (currently specifies “latest C# / C# 14” guidance). |
| .gitattributes | Marks Verify snapshot PNG extensions as binary. |
| .GitHub/skills/render-testing/SKILL.md | Adds a render-testing skill doc with build/test/run instructions and baseline workflow. |
| .GitHub/skills/code-review-skill-main/scripts/pr-analyzer.py | Adds a Python tool to analyze diffs and estimate PR review complexity. |
| .GitHub/skills/code-review-skill-main/reference/typescript.md | Adds TypeScript review reference documentation. |
| .GitHub/skills/code-review-skill-main/reference/security-review-guide.md | Adds security review reference documentation. |
| .GitHub/skills/code-review-skill-main/reference/java.md | Adds Java review reference documentation. |
| .GitHub/skills/code-review-skill-main/reference/cpp.md | Adds C++ review reference documentation. |
| .GitHub/skills/code-review-skill-main/reference/code-review-best-practices.md | Adds code review best practices reference documentation. |
| .GitHub/skills/code-review-skill-main/reference/c.md | Adds C review reference documentation. |
| .GitHub/skills/code-review-skill-main/assets/review-checklist.md | Adds a condensed review checklist asset. |
| .GitHub/skills/code-review-skill-main/assets/pr-review-template.md | Adds a PR review template asset. |
| .GitHub/skills/code-review-skill-main/SKILL.md | Adds a top-level code review skill description/index. |
| .GitHub/skills/code-review-skill-main/README.md | Adds documentation for the code-review skill bundle. |
| .GitHub/skills/code-review-skill-main/LICENSE | Adds license file for the bundled code-review skill content. |
| .GitHub/skills/code-review-skill-main/CONTRIBUTING.md | Adds contribution guidelines for the bundled code-review skill content. |
| .GitHub/skills/code-review-skill-main/.gitignore | Adds gitignore for the bundled code-review skill content. |
Comments suppressed due to low confidence (1)
.GitHub/skills/render-testing/SKILL.md:1
- This doc hardcodes a user-specific NUnit console path and recommends
dotnet buildfor a .NET Framework workflow. For portability and alignment with repo build guidance, prefer.\build.ps1/.\test.ps1(and avoid assumingC:\\Users\\johnm\\...exists). Also consider avoiding or clearly calling out commands with pipes/redirection in docs, since the repo guidance prefers wrapper scripts for piped commands.
---
Src/Common/Controls/DetailControls/DetailControlsTests/TestResult.xml
Outdated
Show resolved
Hide resolved
| public string GetEnvironmentHash() | ||
| { | ||
| var settingsJson = Newtonsoft.Json.JsonConvert.SerializeObject(CurrentSettings); | ||
| using (var sha256 = SHA256.Create()) | ||
| { | ||
| var hashBytes = sha256.ComputeHash(Encoding.UTF8.GetBytes(settingsJson)); | ||
| return Convert.ToBase64String(hashBytes); | ||
| } | ||
| } |
There was a problem hiding this comment.
The environment hash is intended to be deterministic, but hashing JsonConvert.SerializeObject(CurrentSettings) can be unstable across runtimes/versions because JSON.NET property ordering is not guaranteed unless explicitly configured. That can cause spurious environment-hash mismatches even when the captured values are identical. Consider generating the hash from an explicit, stable field ordering (e.g., concatenating known fields in a fixed order), or configuring serialization with a contract resolver that enforces deterministic property ordering.
| ## C# Instructions | ||
| - Always use the latest version C#, currently C# 14 features. | ||
| - Write clear and concise comments for each function. |
There was a problem hiding this comment.
This guidance conflicts with the repo's .NET Framework/C# constraints (FieldWorks is limited to C# 7.3). As written, it encourages unsupported language features (file-scoped namespaces, switch expressions, nullable reference types, etc.), which can lead to uncompilable PRs. Update the instructions to match the actual language version/tooling used in this codebase (e.g., explicitly state C# 7.3 and avoid recommending C# 8+ features).
NUnit Tests 1 files ± 0 1 suites ±0 9m 30s ⏱️ + 3m 37s For more details on these failures, see this check. Results for commit e23f4b6. ± Comparison against base commit bcfbea0. ♻️ This comment has been updated with latest results. |
# This is the 1st commit message: feat(render): expand DataTree render scenarios and persist benchmark timings standardize DataTree test data with predictable field naming across scenarios add per-scenario timing capture and write results to datatree-timings.json add Newtonsoft.Json to DetailControlsTests for timing serialization add render-testing skill documentation for baseline and benchmark workflow remove obsolete generated Build/FieldWorks.targets.bad file update checked-in test result artifact for current render test run fix: resolve build blocker + timing test layout gaps - Remove duplicate Kernel project entry in FieldWorks.sln that blocked NuGet restore (first entry was missing EndProject) - Add CitationForm and Senses parts to test Normal layout so timing tests produce real slices instead of 0 - Add LexSense-Detail-Senses part and subsense recursion to GlossSn layout so WorkloadGrowsWithComplexity test correctly validates that deeper scenarios produce more slices - Include user edits to implementation-paths.md OS-3: All 4 DataTreeTiming tests now pass perf: add render-testing workflow + timing enhancement log - Update render-testing skill with explicit devil's advocate stage, retest/review, and mandatory TIMING_ENHANCEMENTS.md before commit. - Refine DataTree tab-index optimization safety by moving final ResetTabIndices(0) to CreateSlices finally. - Keep DeepSuspendLayout/DeepResumeLayout batching in DummyObjectSlice.BecomeReal. - Mark OS-3 and OS-5 complete in OpenSpec tasks and add TIMING_ENHANCEMENTS.md with measured outcomes. Validation: build.ps1 -BuildTests; test.ps1 -TestFilter DataTreeTiming -NoBuild perf: paint pipeline optimization — clip-rect culling + double-buffering - HandlePaintLinesBetweenSlices: skip separator lines outside paint clip rect, reducing per-paint iteration from O(N) to O(visible) during scroll/partial repaints. Avoids O(N) XML attribute parsing on off-screen slices. - DataTree constructor: enable OptimizedDoubleBuffer + AllPaintingInWmPaint to eliminate flicker during paint. - Add DataTreeTiming_PaintPerformance test for paint-time regression detection. - Update render-testing skill with test coverage assessment step. - Update TIMING_ENHANCEMENTS.md with enhancements 3 and 4. perf: layout path optimization — early-exit guard, construction skip, paint-path skip Enhancement 5: Slice.SetWidthForDataTreeLayout returns early when m_widthHasBeenSetByDataTree && Width == width, avoiding redundant splitter resizing and event handler churn on every layout pass. Enhancement 6: DataTree.OnSizeChanged skips the O(N) splitter adjustment loop during ConstructingSlices. HandleLayout1(fFull=true) handles width synchronization after construction completes. Enhancement 7: HandleLayout1 paint path (fFull=false) skips SetWidthForDataTreeLayout for non-visible slices, reducing width-sync work from O(N) to O(K) per paint where K is visible slice count. Also fixes RemoveDuplicateCustomFields test to match Test.fwlayout Normal layout changes from prior commit. perf: XML attribute caching + MakeSliceVisible high-water mark Enhancement 8: Cache header/skipSpacerLine/sameObject XML attributes on Slice as lazy bool? fields. Eliminates per-paint XML parsing in HandlePaintLinesBetweenSlices (~2,700 XML lookups/sec at 60Hz with 15 visible slices reduced to 0 after first access). Enhancement 9: Change MakeSliceVisible from static to instance method. Pass known index from HandleLayout1 caller (avoids O(N) IndexOf). Track m_lastVisibleHighWaterMark so the inner 'make preceding visible' loop amortizes to O(N) total instead of O(N*V). Preserves the .NET Framework bug workaround (LT-7307). Reset mark in CreateSlices. perf: construction-path guards + optimization regression tests Enhancement 10: Skip ResetTabIndices in RemoveSlice during ConstructingSlices. Mirror of Enhancement 1 — the finally block in CreateSlices already does a single ResetTabIndices(0) after all removals/insertions complete. Enhancement 11: Skip AdjustSliceSplitPosition in InstallSlice during ConstructingSlices. HandleLayout1(fFull=true) sets correct widths + splitter positions after construction, making per-insert adjustment redundant. Added 7 optimization regression tests (DataTreeOpt_ prefix): - WidthStabilityAfterLayout (Enhancement 5) - AllViewportSlicesVisible (Enhancement 9) - XmlCacheConsistency (Enhancement 8) - XmlCacheInvalidationOnConfigChange (Enhancement 8) - SequentialPaintsProduceIdenticalOutput (Enhancements 3,4,7,8) - SlicePositionsMonotonicallyIncreasing (Enhancement 3) - IsHeaderNodeDelegatesToCachedProperty (Enhancement 8) All 12 DataTree tests pass (5 timing + 7 optimization). # This is the commit message #2: feat: Enhancement 12 — binary search in HandleLayout1 paint path Add FindFirstPotentiallyVisibleSlice binary search to skip above-viewport slices during paint-path layout. Addresses JohnT's original TODO in OnPaint. - HandleLayout1 now starts iteration at the first potentially visible slice (with -1 safety margin) instead of index 0 on paint-path calls - Timing improvements: shallow -71%, deep -20%, extreme -7%, paint -11% - 5 binary-search risk-reduction tests added (DataTreeOpt_ prefix) - Regenerate snapshot baselines after layout/parts changes from 64ab221 - Add mandatory snapshot regression check (Step 6) to render-testing skill - Update TIMING_ENHANCEMENTS.md with E12 measurements and devil's advocate # This is the commit message #3: The real optimization - HWND virtualization # This is the commit message #4: fix: correct VwDrawRootBuffered coordinate mapping + trailing separator bars Three rendering quality fixes for the snapshot capture pipeline: 1. VwDrawRootBuffered coordinate fix (CompositeViewCapture.cs): GetCoordRects returns rcDst in rootSite-local coordinates (origin at HorizMargin, 0). The overlay code was passing clientRect with the rootSite's position relative to the DataTree (e.g. X=175), causing VwDrawRootBuffered to offset rcDst by (-175, -y) and render content at negative X — clipping or hiding it entirely. Fix: render each rootSite into a temp bitmap at local coords (0,0,w,h), then composite into the main bitmap at the correct DataTree-relative position. 2. EnsureAllSlicesInitialized uses BecomeRealInPlace (CompositeViewCapture.cs): Replaced manual handle-forcing with the production ViewSlice initialization path, which sets AllowLayout=true and triggers rootBox.Layout with the correct width. 3. Trailing separator bar removal (VectorReferenceView.cs, PossibilityVectorReferenceView.cs): DisplayVec called AddSeparatorBar after every item including the last, causing VwSeparatorBox to draw unwanted grey bars after single/final items like 'Main Dictionary'. Added guard to only add separators between items. Baselines regenerated — all 6 scenarios show 40-70% more rendered content. All 23 DataTree tests pass (timing + optimization + snapshots). # This is the commit message #5: Added tests before virtualization # This is the commit message #6: update renders # This is the commit message #7: Hwnd implementation started # This is the commit message #8: upgrade openspec # This is the commit message #9: hwnd implementation
- Add deterministic DataTree render scenarios, benchmark harness, and baselines - Optimize layout and paint paths with culling, guards, caching, and binary search - Fix snapshot compositing and separator rendering regressions - Add timing and regression tests; update TIMING_ENHANCEMENTS.md
- Refresh render baselines and benchmark harness after layout updates - Fix snapshot capture coordinates, initialization, and separator rendering - Harden DataTree with regression tests for resize, navigation, and crash paths - Remove generated test-result artifact and update timing notes
…ass, Reconstruct guard fix Phase 2 cold render optimizations for the Views engine: PATH-R1-FIX: Fix Reconstruct guard bug — Construct() left m_fNeedsReconstruct true (from Init), so the warm Reconstruct guard never fired. Warm render dropped from 98.20ms to 0.01ms (99.99% reduction). PATH-C1: Add 8-entry HFONT LRU cache in VwGraphics to avoid redundant CreateFontIndirect/DeleteObjectFont GDI kernel calls when cycling between the same fonts (common in multi-writing-system text). ~2% cold reduction. PATH-C2: Track current foreground/background colors and background mode in VwGraphics. Skip redundant SetTextColor/SetBkColor/SetBkMode GDI calls when values haven't changed. <1% cold reduction. PATH-N1: NFC normalization bypass — OffsetInNfc/OffsetToOrig were performing COM Fetch + ICU NFC normalization on every call (12+ times per FindBreakPoint). Added NFC flag from CallScriptItemize; for NFC text (common case), return identity offsets instantly. 8.2% cold reduction. Best on multi-run scenarios: mixed-styles -27%, lex-deep -25%, lex-extreme -24%. Overall cold render: 60.36ms -> 54.32ms (-10.0% average). All 15 benchmark scenarios pass with 0% pixel variance.
04900b0 to
d6741a0
Compare
Render Performance Optimizations
Summary
This branch delivers comprehensive performance optimizations to the FieldWorks Views engine rendering pipeline, reducing warm render time by 99.99% (153ms → 0.01ms) and cold render time by ~10% (60.36ms → 54.32ms) across 15 benchmark scenarios.
All 15 scenarios pass with 0% pixel variance — no visual regressions.
Motivation
FieldWorks UI responsiveness is bottlenecked by the Views engine render pipeline. Switching between entries, refreshing displays, or navigating within editors triggers
Reconstruct()→Layout()→DrawTheRoot()cycles. Profiling revealed:RefreshDisplay()called Reconstruct on ALL root sites, even when no data changedBenchmark Infrastructure
New test infrastructure
RenderBenchmarkHarness— creates real VwRootBox + VwGraphics in an offscreen HWND, measures CreateView/MakeRoot/PerformOffscreenLayout/DrawTheRoot independentlyRenderTimingSuiteTests— 15 scenarios covering simple through extreme content (long-prose: 4 sections × 80 verses; lex-extreme: 25 entries × 10 senses)RenderVerifyTests— pixel-perfect snapshot verification using Verify + ImageSharpRenderBenchmarkResults/RenderBenchmarkReportWriter— generates markdown summary reports with per-scenario timingDataTreeRenderTests— DataTree-level render scenarios (collapsed/expanded/deep/extreme/multiws/simple)Test files
Src/Common/RootSite/RootSiteTests/RenderBenchmarkHarness.csSrc/Common/RootSite/RootSiteTests/RenderTimingSuiteTests.csSrc/Common/RootSite/RootSiteTests/RenderVerifyTests.csSrc/Common/RootSite/RootSiteTests/RenderBenchmarkTestsBase.csSrc/Common/RenderVerification/(supporting infrastructure)Src/Common/Controls/DetailControlsTests/DataTreeRenderTests.csPhase 1: Warm Render Optimizations (153ms → 0.01ms)
PATH-L1: Width-Invariant Layout Guard
Src/views/VwRootBox.cpp,VwRootBox.hm_fNeedsLayout+m_dxLastLayoutWidthtoVwRootBox::Layout(). If box tree hasn't changed and width is the same, skip the entire layout traversal.PATH-L4: Harness GDI Caching
Src/Common/RootSite/RootSiteTests/RenderBenchmarkHarness.csPerformOffscreenLayout()instead of allocating per call.PATH-R1: Reconstruct Guard
Src/views/VwRootBox.cpp,VwRootBox.hm_fNeedsReconstructflag.Reconstruct()returns early if noPropChanged, style change, or overlay change has occurred since last construction.PATH-L5: Skip Reconstruct When Data Unchanged
Src/views/Views.idh,Src/Common/SimpleRootSite/SimpleRootSite.csNeedsReconstructproperty via COM.SimpleRootSite.RefreshDisplay()checks this before performing expensive reconstruct cycle.Phase 2: Cold Render Optimizations (60.36ms → 54.32ms)
PATH-R1-FIX: Construct Guard Bug Fix
Src/views/VwRootBox.cppConstruct()was leavingm_fNeedsReconstruct = true(set during Init), so the PATH-R1 guard never fired on the first warm render. Fixed by settingm_fNeedsReconstruct = falseafter successful construction.PATH-C1: HFONT LRU Cache
Src/views/lib/VwGraphics.h,VwGraphics.cpp(LgCharRenderProps → HFONT)in VwGraphics. On font change, check cache viamemcmpof font-specific fields. Cache hit → reuse HFONT (skipCreateFontIndirect). Cache miss → create and add to cache. All cached HFONTs freed on destructor/ReleaseDC.PATH-C2: Color State Caching
Src/views/lib/VwGraphics.h,VwGraphics.cppPATH-N1: NFC Normalization Bypass
Src/views/lib/UniscribeSegment.h,UniscribeSegment.cpp,UniscribeEngine.cppOffsetInNfc()andOffsetToOrig()performed COM Fetch + ICU NFC normalization on every call — 12+ times perFindBreakPointinvocation. For content-heavy views, this meant millions of redundant normalizations. Added NFC detection inCallScriptItemize()(comparing pre/post normalization length). When text is already NFC (common case — FieldWorks stores NFC), new overloads return identity offsets, completely skipping COM and ICU calls.mixed-styles-27.1%,lex-deep-24.9%,lex-extreme-24.3%Bug Fixes
VwPropertyStore Crash Fix
Src/views/VwPropertyStore.cppAssert(m_chrp.ws)andAssertPtr(qws)at line 1336 fired whenws=0reached VwPropertyStore during PropChanged-driven box tree rebuilds. Fixed by guardingget_EngineOrNullwithif (m_chrp.ws)check and usingAssertPtrN(qws)to allow null engine pointer. The code already defaulted to LTR directionality when qws was null.VwDrawRootBuffered Coordinate Fixes
Src/views/VwRootBox.cppVwDrawRootBuffered.Other Fixes
Final Benchmark Results
Files Changed (Key Production Files)
Src/views/VwRootBox.hSrc/views/VwRootBox.cppSrc/views/Views.idhSrc/views/VwPropertyStore.cppSrc/views/lib/VwGraphics.hSrc/views/lib/VwGraphics.cppSrc/views/lib/UniscribeSegment.hSrc/views/lib/UniscribeSegment.cppSrc/views/lib/UniscribeEngine.cppSrc/Common/SimpleRootSite/SimpleRootSite.csSrc/Common/Controls/DetailControls/DataTree.csSrc/Common/Controls/DetailControls/Slice.csFuture Optimization Paths
The remaining opportunity for significant cold render improvement is PATH-V1: Lazy VwLazyBox Construction — deferring off-screen paragraph construction until scrolled into view. This is the only path that can achieve the 50% cold reduction target, as the engine is fundamentally bottlenecked by
ScriptShape/ScriptPlacekernel calls that cannot be made faster, only avoided.See layout-optimization-paths.md for full analysis.
This change is