UI automation: multi-window inspect, cross-window search, screenshot compositing#419
WebView2 controls expose base64 data URIs as element names, which bloat inspect output with hundreds of characters per element. Truncate displayed names to 80 chars and values to 60 chars with '…' suffix. JSON output is unaffected (full data preserved). Applied to both inspect and search text output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
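The display-only truncation described above can be sketched as follows. This is a minimal illustration, not the PR's code: `DisplayText.Truncate` is a hypothetical helper, and it assumes the limit counts the kept characters with the '…' suffix appended on top.

```csharp
using System;

public static class DisplayText
{
    // Hypothetical helper: shortens text for console display only.
    // Assumption: maxChars counts kept characters; the '…' suffix is extra.
    public static string Truncate(string text, int maxChars)
    {
        if (string.IsNullOrEmpty(text) || text.Length <= maxChars)
        {
            return text ?? string.Empty;
        }

        return text.Substring(0, maxChars) + "…";
    }
}
```

Text output would call something like `DisplayText.Truncate(name, 80)` and `DisplayText.Truncate(value, 60)`, while the JSON path serializes the original strings untouched.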
All UI commands now transparently find and interact with elements
across popup windows, flyout menus, and cross-process dialogs.
Part 1 - FindSingleElementAsync fallback:
When element not found on main window, automatically searches all
app-related windows (same PID + cross-process owned via GW_OWNER).
Covers flyout MenuBar items, file picker dialogs, system dialogs.
SourceWindowHandle tracked on UiElement for correct HWND routing.
Part 2 - inspect spans all app windows:
Full tree inspect shows popup/owned window contents with separator:
--- HWND 1840448: "View" (popup, Xaml_WindowedPopupClass) ---
mnu-splitview-5211 MenuItem "Split View"
Part 3 - ResolveComElement uses element source HWND:
When an element came from a popup/dialog, action methods
(invoke, click, set-value, etc.) resolve against the correct
window HWND instead of the session's main window.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
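Parts 1 and 3 above can be sketched together as a fallback lookup followed by HWND routing. Only `SourceWindowHandle` is named in the PR; `ElementSketch`, `CrossWindowSketch`, and the `findInWindow` delegate are hypothetical stand-ins, and the real window enumeration (same PID plus `GW_OWNER` ownership) is assumed to happen elsewhere.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the UiElement model.
public sealed class ElementSketch
{
    public string Selector { get; set; } = "";

    // Set only when the element was found outside the main window.
    public IntPtr? SourceWindowHandle { get; set; }
}

public static class CrossWindowSketch
{
    // findInWindow is a hypothetical lookup that returns null when
    // the given window contains no match for the selector.
    public static ElementSketch Find(
        string selector,
        IntPtr mainWindow,
        IEnumerable<IntPtr> relatedWindows,
        Func<IntPtr, string, ElementSketch> findInWindow)
    {
        // Try the session's main window first.
        var hit = findInWindow(mainWindow, selector);
        if (hit != null)
        {
            return hit;
        }

        // Fall back to popups, flyouts, and owned dialogs.
        foreach (var hwnd in relatedWindows)
        {
            hit = findInWindow(hwnd, selector);
            if (hit != null)
            {
                hit.SourceWindowHandle = hwnd; // remember where it came from
                return hit;
            }
        }

        return null;
    }

    // Actions resolve against the element's source window when one was recorded.
    public static IntPtr ResolveTargetWindow(ElementSketch element, IntPtr sessionMainWindow)
        => element.SourceWindowHandle ?? sessionMainWindow;
}
```

Recording the source HWND on the element keeps action commands stateless: they do not need to repeat the cross-window search to know which window to target.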
When multiple windows are detected (dialogs, popups), compose all captures side-by-side into a single PNG image instead of separate files. Each window gets a label bar showing HWND, type, and title. Better for agents: one image to analyze instead of multiple files. Dark background, 8px gap between windows, 28px label bars. Uses SkiaSharp canvas compositing (already a dependency for PNG encoding). Single-window behavior unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
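The side-by-side layout arithmetic can be sketched without any drawing code. The 8px gap and 28px label bar come from the commit message; everything else here is an assumption for illustration: `CompositeLayoutSketch` is a hypothetical name, captures are top-aligned, the label bar sits above each capture, and there is no outer padding.

```csharp
using System;
using System.Collections.Generic;

public static class CompositeLayoutSketch
{
    public const int Gap = 8;       // px between adjacent windows
    public const int LabelBar = 28; // px label bar above each capture

    // Returns the canvas size and the X offset at which each capture is drawn.
    public static (int Width, int Height, int[] Offsets) Compute(
        IReadOnlyList<(int Width, int Height)> captures)
    {
        var offsets = new int[captures.Count];
        int x = 0;
        int tallest = 0;

        for (int i = 0; i < captures.Count; i++)
        {
            if (i > 0)
            {
                x += Gap; // gap only between windows, not at the edges
            }

            offsets[i] = x;
            x += captures[i].Width;
            tallest = Math.Max(tallest, captures[i].Height);
        }

        int height = captures.Count == 0 ? 0 : tallest + LabelBar;
        return (x, height, offsets);
    }
}
```

With these assumptions, an 800x600 main window next to a 300x200 popup gives a 1108x628 canvas, with the popup drawn at X offset 808.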
With cross-window element search, all commands transparently find elements across popups/dialogs. The verbose multi-window warning is no longer needed — inspect shows all windows inline, and action commands resolve elements across windows automatically. Auto-selection still happens (foreground → largest), just silently. Logged at debug level for troubleshooting. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
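The silent auto-selection order (foreground, then largest) might look like the sketch below. `WindowCandidate` and `AutoSelectSketch` are hypothetical names, and treating "largest" as largest pixel area is an assumption; the PR text does not say how size is measured.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical description of a candidate top-level window.
public sealed class WindowCandidate
{
    public IntPtr Handle { get; set; }
    public int Width { get; set; }
    public int Height { get; set; }
    public bool IsForeground { get; set; }
}

public static class AutoSelectSketch
{
    // Prefers the foreground window; otherwise picks the largest by area.
    public static WindowCandidate Pick(IReadOnlyList<WindowCandidate> windows)
    {
        if (windows.Count == 0)
        {
            return null;
        }

        var foreground = windows.FirstOrDefault(w => w.IsForeground);
        if (foreground != null)
        {
            return foreground;
        }

        // Assumption: "largest" means largest pixel area.
        return windows.OrderByDescending(w => (long)w.Width * w.Height).First();
    }
}
```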
- Add header separator for main window when multiple windows exist
- Show owner HWND in popup/dialog separator lines (e.g., owner: HWND 133306)
- Add blank line before footer, show 'Use -w <HWND>' hint for multi-window
- Preserve window separator elements through --interactive/--hide-* filters
- Change default inspect depth from 5 to 4 (--interactive still bumps to 8)
- Deduplicate windows already in main UIA tree (modal dialogs)
- Filter internal system windows (PseudoConsoleWindow, IME, MSCTFIME UI)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When --interactive filters to only interactive elements, non-interactive parent containers (Window, Pane, Group, etc.) are now shown as collapsed grey breadcrumb lines like '… Window > Pane > MenuBar' to preserve tree context. Breadcrumbs only appear when the ancestor path changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
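The "only when the ancestor path changes" rule can be sketched as below. `BreadcrumbSketch` is a hypothetical name, and the input shape (each interactive element paired with its non-interactive ancestor types) is an assumption; colour markup is omitted.

```csharp
using System;
using System.Collections.Generic;

public static class BreadcrumbSketch
{
    // Each item pairs an interactive element's output line with the
    // control types of its non-interactive ancestors, outermost first.
    public static List<string> Render(IEnumerable<(string[] Ancestors, string Line)> items)
    {
        var output = new List<string>();
        string lastBreadcrumb = "";

        foreach (var (ancestors, line) in items)
        {
            string breadcrumb = string.Join(" > ", ancestors);

            // Emit the breadcrumb only when the ancestor path changed.
            if (breadcrumb.Length > 0 && breadcrumb != lastBreadcrumb)
            {
                output.Add("… " + breadcrumb);
                lastBreadcrumb = breadcrumb;
            }

            output.Add(line);
        }

        return output;
    }
}
```

Sibling elements under the same container share one breadcrumb line, so the context costs one extra line per container rather than one per element.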
Pull request overview
This PR enhances the winapp ui command set to better handle multi-window applications by improving inspect output, enabling cross-window element resolution for interactions, and compositing multi-window screenshots into a single image.
Changes:
- Extend UIA inspect/search to detect and traverse additional popup/owned windows and route interactions via a new per-element `SourceWindowHandle`.
- Update `ui screenshot` multi-window behavior to composite captures side-by-side into one PNG (and adjust JSON output accordingly).
- Improve CLI output clarity (window separators, interactive breadcrumbs) and adjust defaults (inspect depth, output truncation).
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/winapp-CLI/WinApp.Cli/Services/UiSessionService.cs | Removes verbose multi-window console warning; exposes window class name helper for cross-window filtering. |
| src/winapp-CLI/WinApp.Cli/Services/UiAutomationService.cs | Adds multi-window inspect traversal, cross-window fallback element search, and source-HWND-based element resolution. |
| src/winapp-CLI/WinApp.Cli/Models/UiElement.cs | Introduces SourceWindowHandle for correct cross-window interaction routing. |
| src/winapp-CLI/WinApp.Cli/Commands/UiSearchCommand.cs | Truncates long element names/values in search output to reduce noise. |
| src/winapp-CLI/WinApp.Cli/Commands/UiScreenshotCommand.cs | Captures multiple windows and composites into a single PNG; changes JSON output shape/contents. |
| src/winapp-CLI/WinApp.Cli/Commands/UiInspectCommand.cs | Preserves window separators through filters and adds breadcrumb context rendering in --interactive mode; truncates long fields. |
| src/winapp-CLI/WinApp.Cli/Commands/SharedUiOptions.cs | Changes default inspect depth from 5 to 4. |
| docs/fragments/skills/winapp-cli/ui-automation.md | Updates inspect depth example wording. |
| .github/plugin/skills/winapp-cli/ui-automation/SKILL.md | Updates documented default depth to 4 and adjusts inspect example wording. |
    ansiConsole.WriteLine();
    ansiConsole.MarkupLine($"[grey]--- {EscapeMarkup(el.Name ?? "")} ---[/]");
    lastBreadcrumb = "";
    Array.Clear(ancestorTypes);
Array.Clear(ancestorTypes) relies on the single-argument overload, which exists only on .NET 6 and later, so it does not compile on older target frameworks. Use Array.Clear(ancestorTypes, 0, ancestorTypes.Length) (or ancestorTypes.AsSpan().Clear()) when resetting breadcrumb state.
Suggested change:

    - Array.Clear(ancestorTypes);
    + Array.Clear(ancestorTypes, 0, ancestorTypes.Length);
Build Metrics Report
- Test Results: ✅ 718 passed out of 718 tests in 372.2s (+26.0s vs. baseline)
- Test Coverage: ❌ 20.5% line coverage, 35.4% branch coverage
- CLI Startup Time: 40ms median (x64)
Updated 2026-04-10 00:53:24 UTC
- Fix Array.Clear to use 3-arg overload for compat
- Update AutoSelectWindow doc to say 'silently' (warning was removed)
- Set composite Width/Height in multi-window screenshot JSON output
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Improvements to `winapp ui` commands based on agent trial feedback. Focuses on multi-window handling, output clarity, and interactive mode.

Changes

Multi-window inspect output
- Footer shows a `Use -w <HWND> to target a specific window` hint
- Window separators are preserved through `--interactive`/`--hide-*` filters

Cross-window element search
- `FindSingleElementAsync` falls back to popup/owned windows when the element is not found on the main window
- `SourceWindowHandle` on `UiElement` enables correct HWND routing for interactions

Screenshot compositing
- Owned windows are detected via `GetWindow(GW_OWNER)`

Interactive mode breadcrumbs
- `--interactive` now shows collapsed grey breadcrumb lines (e.g., `Window > Pane > MenuBar`) for non-interactive ancestor containers

Other improvements