UI automation: multi-window inspect, cross-window search, screenshot compositing #419

Open
nmetulev wants to merge 9 commits into main from nm/ui-fixes

nmetulev commented Apr 9, 2026

Summary

Improvements to winapp ui commands based on agent trial feedback. Focuses on multi-window handling, output clarity, and interactive mode.

Changes

Multi-window inspect output

  • Adds header separators for each window with HWND, title, type, class name, and owner
  • Deduplicates windows already present in the main UIA tree (modal dialogs)
  • Filters internal system windows (PseudoConsoleWindow, IME, MSCTFIME UI)
  • Footer shows the element count plus a hint: Use -w <HWND> to target a specific window
  • Preserves window separators through --interactive/--hide-* filters
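
The filtering and deduplication steps above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the PR's C# code: `WindowInfo`, `select_extra_windows`, and `separator` are hypothetical names, exact-match filtering on class names is an assumption, and the placement of the owner HWND inside the separator line is guessed from the examples in the commit messages.

```python
from dataclasses import dataclass
from typing import Optional

# Class names come from the PR description; matching them exactly
# (rather than by prefix or pattern) is an assumption of this sketch.
INTERNAL_CLASSES = {"PseudoConsoleWindow", "IME", "MSCTFIME UI"}


@dataclass(frozen=True)
class WindowInfo:
    """Hypothetical record for one top-level window (not the CLI's real type)."""
    hwnd: int
    title: str
    kind: str            # e.g. "popup" or "dialog"
    class_name: str
    owner: Optional[int] = None


def select_extra_windows(candidates, hwnds_in_main_tree):
    """Drop internal system windows and windows already present in the main UIA tree."""
    return [
        w for w in candidates
        if w.class_name not in INTERNAL_CLASSES
        and w.hwnd not in hwnds_in_main_tree
    ]


def separator(w):
    """Format a header line in the style shown in the commit message."""
    owner = f", owner: HWND {w.owner}" if w.owner is not None else ""
    return f'--- HWND {w.hwnd}: "{w.title}" ({w.kind}, {w.class_name}{owner}) ---'
```

For example, a modal dialog whose HWND already appears in the main tree would be dropped by `select_extra_windows`, while a flyout popup would be kept and printed under its own separator.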

Cross-window element search

  • FindSingleElementAsync falls back to popup/owned windows when the element is not found on the main window
  • Searches same-PID windows + cross-process owned windows (file pickers via GW_OWNER)
  • SourceWindowHandle on UiElement enables correct HWND routing for interactions
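
A minimal sketch of the fallback order described above, written in Python for illustration only. The `search_window` callback, the simplified `UiElement` stand-in, and `resolve_target_hwnd` are hypothetical; the sketch also assumes the source handle is recorded for every hit, including hits on the main window, which the PR does not state.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class UiElement:
    """Simplified stand-in for the PR's UiElement model."""
    selector: str
    source_window_handle: Optional[int] = None


def find_single_element(
    selector: str,
    main_hwnd: int,
    related_hwnds: Iterable[int],
    search_window: Callable[[int, str], Optional[UiElement]],
) -> Optional[UiElement]:
    """Search the main window first, then each related window in order."""
    for hwnd in [main_hwnd, *related_hwnds]:
        element = search_window(hwnd, selector)
        if element is not None:
            # Remember where the element was found so later actions
            # (invoke, click, set-value) can target the right window.
            element.source_window_handle = hwnd
            return element
    return None


def resolve_target_hwnd(element: UiElement, session_main_hwnd: int) -> int:
    """Prefer the element's own window; fall back to the session's main window."""
    if element.source_window_handle is not None:
        return element.source_window_handle
    return session_main_hwnd
```

The point of carrying the handle on the element is that action commands do not need to repeat the search: they can resolve against the window the element actually came from.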

Screenshot compositing

  • Multi-window screenshots compose all captures side-by-side into a single PNG
  • Label bars show HWND, window type, and title for each capture
  • Cross-process dialog detection via GetWindow(GW_OWNER)
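
The side-by-side layout can be worked out with simple arithmetic. The sketch below is Python, not the SkiaSharp code in the PR. It uses the 8px gap and 28px label bar quoted in the commit message, and assumes captures are top-aligned under their label bars with no outer padding; `composite_layout` is a hypothetical helper name.

```python
GAP = 8         # px between adjacent captures (from the commit message)
LABEL_BAR = 28  # px label bar height (from the commit message)


def composite_layout(sizes):
    """Compute canvas size and image positions for a side-by-side composite.

    sizes: list of (width, height) per capture, in left-to-right order.
    Returns (canvas_width, canvas_height, placements), where each placement
    is the (x, y) of a capture's top-left corner, below its label bar.
    """
    placements = []
    x = 0
    for index, (width, _height) in enumerate(sizes):
        if index > 0:
            x += GAP
        placements.append((x, LABEL_BAR))
        x += width
    canvas_width = x
    canvas_height = LABEL_BAR + max((h for _, h in sizes), default=0)
    return canvas_width, canvas_height, placements
```

Under these assumptions, an 800x600 main window next to a 300x200 popup yields a 1108x628 canvas, with the popup drawn at x = 808.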

Interactive mode breadcrumbs

  • --interactive now shows collapsed grey breadcrumb lines (e.g., Window > Pane > MenuBar) for non-interactive ancestor containers
  • Preserves tree context without the noise of full container elements
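
One way to picture the breadcrumb behavior is the Python sketch below. The row shape, the two-space indent, and the `render_interactive` name are hypothetical; grey styling is omitted. It prints a breadcrumb only when the ancestor path differs from the previous element's, as the commit message describes.

```python
def render_interactive(rows):
    """Render interactive elements with collapsed ancestor breadcrumbs.

    rows: list of (ancestor_types, line), where ancestor_types is a tuple of
    the non-interactive container types above an interactive element.
    """
    output = []
    last_breadcrumb = None
    for ancestors, line in rows:
        breadcrumb = " > ".join(ancestors)
        # Emit the breadcrumb only when the ancestor path changes.
        if breadcrumb and breadcrumb != last_breadcrumb:
            output.append(f"… {breadcrumb}")
        last_breadcrumb = breadcrumb
        output.append(f"  {line}")
    return output
```

Two menu items under the same MenuBar therefore share a single breadcrumb line instead of repeating it.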

Other improvements

  • Default inspect depth changed from 5 to 4
  • Truncates long element names (80 chars) and values (60 chars) — fixes WebView base64 data URI noise
  • Silent multi-window auto-select (foreground → largest) replaces verbose warning
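
The truncation rule can be sketched in a few lines of Python. The 80/60 limits and the '…' suffix come from the commit message. Whether the suffix counts toward the limit is not stated, so this sketch assumes it is appended after the limit; `truncate` is a hypothetical helper name.

```python
NAME_LIMIT = 80   # max displayed characters for element names
VALUE_LIMIT = 60  # max displayed characters for element values


def truncate(text, limit):
    """Shorten text for display; strings at or under the limit pass through unchanged."""
    if len(text) <= limit:
        return text
    return text[:limit] + "…"
```

Per the commit message, this applies to text output only; JSON output keeps the full strings.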

nmetulev and others added 6 commits April 8, 2026 23:30
WebView2 controls expose base64 data URIs as element names, which
bloat inspect output with hundreds of characters per element.

Truncate displayed names to 80 chars and values to 60 chars with
'…' suffix. JSON output is unaffected (full data preserved).

Applied to both inspect and search text output.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

All UI commands now transparently find and interact with elements
across popup windows, flyout menus, and cross-process dialogs.

Part 1 - FindSingleElementAsync fallback:
When element not found on main window, automatically searches all
app-related windows (same PID + cross-process owned via GW_OWNER).
Covers flyout MenuBar items, file picker dialogs, system dialogs.
SourceWindowHandle tracked on UiElement for correct HWND routing.

Part 2 - inspect spans all app windows:
Full tree inspect shows popup/owned window contents with separator:
  --- HWND 1840448: "View" (popup, Xaml_WindowedPopupClass) ---
    mnu-splitview-5211 MenuItem "Split View"

Part 3 - ResolveComElement uses element source HWND:
When an element came from a popup/dialog, action methods
(invoke, click, set-value, etc.) resolve against the correct
window HWND instead of the session's main window.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When multiple windows are detected (dialogs, popups), compose all
captures side-by-side into a single PNG image instead of separate
files. Each window gets a label bar showing HWND, type, and title.

Better for agents: one image to analyze instead of multiple files.
Dark background, 8px gap between windows, 28px label bars.

Uses SkiaSharp canvas compositing (already a dependency for PNG
encoding). Single-window behavior unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

With cross-window element search, all commands transparently find
elements across popups/dialogs. The verbose multi-window warning
is no longer needed — inspect shows all windows inline, and
action commands resolve elements across windows automatically.

Auto-selection still happens (foreground → largest), just silently.
Logged at debug level for troubleshooting.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

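
The "foreground → largest" selection order mentioned in this commit can be illustrated as below. This is a Python sketch, not the CLI's AutoSelectWindow code: the tuple shape and the `auto_select_window` name are hypothetical, and "largest" is assumed to mean largest area.

```python
def auto_select_window(windows, foreground_hwnd):
    """Pick a window silently: the foreground window if it is ours, else the largest.

    windows: list of (hwnd, width, height) tuples for the app's candidate windows.
    """
    if not windows:
        return None
    for hwnd, _width, _height in windows:
        if hwnd == foreground_hwnd:
            return hwnd
    # Fall back to the window with the largest area (assumed meaning of "largest").
    return max(windows, key=lambda w: w[1] * w[2])[0]
```
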
- Add header separator for main window when multiple windows exist
- Show owner HWND in popup/dialog separator lines (e.g., owner: HWND 133306)
- Add blank line before footer, show 'Use -w <HWND>' hint for multi-window
- Preserve window separator elements through --interactive/--hide-* filters
- Change default inspect depth from 5 to 4 (--interactive still bumps to 8)
- Deduplicate windows already in main UIA tree (modal dialogs)
- Filter internal system windows (PseudoConsoleWindow, IME, MSCTFIME UI)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When --interactive filters to only interactive elements, non-interactive
parent containers (Window, Pane, Group, etc.) are now shown as collapsed
grey breadcrumb lines like '… Window > Pane > MenuBar' to preserve
tree context. Breadcrumbs only appear when the ancestor path changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings April 9, 2026 08:12

Copilot AI left a comment

Pull request overview

This PR enhances the winapp ui command set to better handle multi-window applications by improving inspect output, enabling cross-window element resolution for interactions, and compositing multi-window screenshots into a single image.

Changes:

  • Extend UIA inspect/search to detect and traverse additional popup/owned windows and route interactions via a new per-element SourceWindowHandle.
  • Update ui screenshot multi-window behavior to composite captures side-by-side into one PNG (and adjust JSON output accordingly).
  • Improve CLI output clarity (window separators, interactive breadcrumbs) and adjust defaults (inspect depth, output truncation).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/winapp-CLI/WinApp.Cli/Services/UiSessionService.cs | Removes verbose multi-window console warning; exposes window class name helper for cross-window filtering. |
| src/winapp-CLI/WinApp.Cli/Services/UiAutomationService.cs | Adds multi-window inspect traversal, cross-window fallback element search, and source-HWND-based element resolution. |
| src/winapp-CLI/WinApp.Cli/Models/UiElement.cs | Introduces SourceWindowHandle for correct cross-window interaction routing. |
| src/winapp-CLI/WinApp.Cli/Commands/UiSearchCommand.cs | Truncates long element names/values in search output to reduce noise. |
| src/winapp-CLI/WinApp.Cli/Commands/UiScreenshotCommand.cs | Captures multiple windows and composites into a single PNG; changes JSON output shape/contents. |
| src/winapp-CLI/WinApp.Cli/Commands/UiInspectCommand.cs | Preserves window separators through filters and adds breadcrumb context rendering in --interactive mode; truncates long fields. |
| src/winapp-CLI/WinApp.Cli/Commands/SharedUiOptions.cs | Changes default inspect depth from 5 to 4. |
| docs/fragments/skills/winapp-cli/ui-automation.md | Updates inspect depth example wording. |
| .github/plugin/skills/winapp-cli/ui-automation/SKILL.md | Updates documented default depth to 4 and adjusts inspect example wording. |

ansiConsole.WriteLine();
ansiConsole.MarkupLine($"[grey]--- {EscapeMarkup(el.Name ?? "")} ---[/]");
lastBreadcrumb = "";
Array.Clear(ancestorTypes);

Copilot AI Apr 9, 2026

Array.Clear(ancestorTypes) does not compile (there is no overload that takes only the array). Use Array.Clear(ancestorTypes, 0, ancestorTypes.Length) (or ancestorTypes.AsSpan().Clear()) when resetting breadcrumb state.

Suggested change
- Array.Clear(ancestorTypes);
+ Array.Clear(ancestorTypes, 0, ancestorTypes.Length);

github-actions bot commented Apr 9, 2026

Build Metrics Report

Binary Sizes

| Artifact | Baseline | Current | Delta |
| --- | --- | --- | --- |
| CLI (ARM64) | 30.42 MB | 30.45 MB | 📈 +26.0 KB (+0.08%) |
| CLI (x64) | 30.79 MB | 30.82 MB | 📈 +27.5 KB (+0.09%) |
| MSIX (ARM64) | 12.84 MB | 12.85 MB | 📈 +14.4 KB (+0.11%) |
| MSIX (x64) | 13.63 MB | 13.65 MB | 📈 +19.5 KB (+0.14%) |
| NPM Package | 26.70 MB | 26.73 MB | 📈 +21.9 KB (+0.08%) |
| NuGet Package | 26.79 MB | 26.81 MB | 📈 +19.3 KB (+0.07%) |
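
The percentage in the Delta column appears to be the size increase divided by the baseline, with 1 MB treated as 1024 KB. That reading is inferred from the reported numbers, not stated by the report. A small Python check under that assumption (`delta_percent` is a hypothetical helper):

```python
KB_PER_MB = 1024  # assumption: the report uses binary units


def delta_percent(baseline_mb, delta_kb):
    """Size increase as a percentage of the baseline, rounded to two decimals."""
    return round(delta_kb / (baseline_mb * KB_PER_MB) * 100, 2)
```

With decimal units (1000 KB per MB) the first row would round to 0.09% instead of the reported 0.08%, which is why the binary interpretation seems more likely.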

Test Results

718 passed out of 718 tests in 372.2s (+26.0s vs. baseline)

Test Coverage

20.5% line coverage, 35.4% branch coverage · ⚠️ -0.1% vs. baseline

CLI Startup Time

40ms median (x64, winapp --version) · ✅ no change vs. baseline


Updated 2026-04-10 00:53:24 UTC · commit aabe690 · workflow run

nmetulev and others added 2 commits April 9, 2026 09:14
- Fix Array.Clear to use 3-arg overload for compat
- Update AutoSelectWindow doc to say 'silently' (warning was removed)
- Set composite Width/Height in multi-window screenshot JSON output

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>