Tools for automation testing of desktop applications, with a structural diff baked into every action

Most existing playbooks list the same nine tools and stop. None of them describe a testing primitive that hands you the accessibility tree before and after each click, with element ids and pixel coordinates already stripped, ready to commit as a snapshot. Terminator does, and the stripping logic is 215 lines of Rust at a path you can read without a login.

ui_tree_diff.rs · remove_ids:7 · regex bounds remover:40 · simple_ui_tree_diff:58 · uiDiff on every action · MIT

Matthew Diakonov, Written with AI

Published April 24, 2026 · 13 min read

ui_tree_diff.rs is 215 lines, MIT licensed, on GitHub

uiDiff field on ClickResult and ActionResult

Same engine drives Windows UIAutomation and macOS AX

Plain-text diffs commit cleanly to git

Diff as a testing primitive

The stripping logic is the testing tool

click() returns ClickResult.uiDiff

remove_ids strips id and element_id

regex strips pixel bounds

what is left is what your test cares about

snapshot it. commit it. move on.

What every other guide on this leaves out

The shape of those articles is predictable. A short paragraph on why desktop testing is hard. A nine-row table comparing TestComplete, Ranorex, Squish, WinAppDriver, Appium, AutoIT, Coded UI, TestArchitect, and Sikuli on price, license, supported languages, and recorder mode. A vague closer on choosing the right tool for your team. None of them describe what an actual desktop test step records, or how that recording survives a window resize between two runs of the same suite. The diff primitive in Terminator does both, and the implementation is short enough to read.

TestComplete, Ranorex, Squish, WinAppDriver, Appium Desktop, AutoIT, Coded UI, TestArchitect, Sikuli, FlaUI, WinTaskBot, Pywinauto

the usual roster. none of them ship a structural-diff snapshot primitive in their default API.

The 215 lines that change the testing model

crates/terminator/src/ui_tree_diff.rs holds the entire mechanism. Two strippers, one diff, a format detector, and the test cases that prove the strippers work. That is it. Below is the part that does the work.

crates/terminator/src/ui_tree_diff.rs

remove_ids is recursive over arbitrary JSON. Strings, numbers, and booleans pass through as-is. Objects rebuild without the offending keys. Arrays map element-wise. Compact YAML hits a different path: two regex sweeps. The bounds remover is the regex that matters. Without it, every screen shift would dominate the diff and the primitive would be useless.
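To make the JSON path concrete, here is a minimal TypeScript sketch of the same idea. It is not the Rust source: the names `Json`, `VOLATILE_KEYS`, and `removeIds` are illustrative, and only the two key names (`id` and `element_id`) come from the description above.

```typescript
// Illustrative TypeScript re-implementation of the idea; the real remove_ids
// is Rust code in crates/terminator/src/ui_tree_diff.rs.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Keys treated as volatile, per the article: "id" and "element_id".
const VOLATILE_KEYS = ["id", "element_id"];

function removeIds(value: Json): Json {
  if (Array.isArray(value)) {
    // Arrays map element-wise.
    return value.map((item) => removeIds(item));
  }
  if (value !== null && typeof value === "object") {
    // Objects are rebuilt without the volatile keys.
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      if (VOLATILE_KEYS.indexOf(key) === -1) {
        out[key] = removeIds(value[key]);
      }
    }
    return out;
  }
  // Strings, numbers, booleans, and null pass through unchanged.
  return value;
}

// Hypothetical tree fragment, invented for this example.
const before: Json = {
  id: "42.7.19",
  role: "Window",
  name: "Checkout",
  children: [{ element_id: "e-1", role: "Button", name: "Submit", enabled: true }],
};

const stripped = removeIds(before);
console.log(JSON.stringify(stripped, null, 2));
```

Everything a test cares about (role, name, state, nesting) survives; only the handles that change on every tree rebuild are gone.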

215 lines in ui_tree_diff.rs

2 regex sweeps for compact YAML

1500 ms default settle delay

0 diff lines on a 118px shift

What the strippers keep, and what they throw away

A diff that includes everything is noise. A diff that strips too much misses real regressions. The exact split below is what Terminator settled on after watching real desktop apps emit real accessibility events.

id keys

remove_ids() at line 7 walks the JSON tree recursively and drops any key literally named 'id'. UIAutomation runtime IDs change every time the tree is rebuilt; keeping them would make every diff a wall of noise.

element_id keys

Same recursion, same drop. element_id is the internal handle Terminator assigns to a UIElement; useful at runtime, useless in a snapshot.

#suffix tags

Compact YAML serializes ids as ' #abc-def-123' suffixes on each line. The regex ' #[\w\-]+' deletes them in one pass.

pixel bounds

'bounds: [x,y,w,h]' clauses are stripped by 'bounds: \[[^\]]+\],?\s*'. A button that moved 118 pixels left after a window resize produces zero diff output. This is the single most important rule.

kept: role + name

[Button] Submit, [Text] Total: $42.18, [CheckBox] Remember me. The semantic axis. If this changes, the test should care.

kept: state flags

(enabled), (disabled), (focused), (selected), (expanded). The behavioral axis. These are the assertions you would have written by hand.

kept: tree shape

Indentation is preserved. A new modal appearing as a sibling of the main window shows up as a fresh '+ [Window]' subtree. A row added to a list shows up at the right depth.
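The keep-and-discard split above can be tried out with the two regexes quoted in the cards. The sketch below applies them in TypeScript to invented tree lines; `stripVolatile` and the sample trees are assumptions made for illustration, not Terminator code.

```typescript
// The two patterns quoted in the article, written as JavaScript regex literals.
const ID_SUFFIX = / #[\w\-]+/g;
const BOUNDS = /bounds: \[[^\]]+\],?\s*/g;

// Hypothetical helper name; strips id suffixes first, then pixel bounds.
function stripVolatile(compactYaml: string): string {
  return compactYaml.replace(ID_SUFFIX, "").replace(BOUNDS, "");
}

// Two invented runs of the same screen. Between them the window was resized,
// the button moved 118 pixels left, and every id was regenerated.
const run1 = [
  "- [Window] Checkout #w-1 (bounds: [0,0,1280,720], focused)",
  "  - [Button] Submit #b-9f2 (bounds: [540,610,96,32], enabled)",
].join("\n");

const run2 = [
  "- [Window] Checkout #w-7 (bounds: [0,0,1162,720], focused)",
  "  - [Button] Submit #b-a01 (bounds: [422,610,96,32], enabled)",
].join("\n");

console.log(stripVolatile(run1));
console.log(stripVolatile(run1) === stripVolatile(run2));
```

Role, name, state flags, and indentation are all that remain, so the two runs compare equal even though every id and rectangle differs.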

Where the diff comes from, where it goes

How a single click becomes a snapshot

The async function execute_with_ui_diff in crates/terminator/src/lib.rs is the orchestrator. The eight steps below are what runs every time you pass uiDiff=true.

Capture before-tree

self.get_window_tree(pid, None, Some(tree_config)) at lib.rs line 1923. PropertyLoadingMode is Complete by default so the snapshot includes Value, IsEnabled, IsExpanded, and accessible name. Tree capture is skipped if pid is 0; the action runs without diff and uiDiff is None.

Format as compact YAML

format_ui_node_as_compact_yaml(&tree_before, 0).formatted at lib.rs line 1934. Each node becomes a single line of '- [Role] Name #id (bounds: [..], flags)'. Indentation encodes parent-child structure.

Run the action

The async closure executes the user-requested click, type, press, or scroll. Result is captured but the after-snapshot has not started yet.

Sleep for settle_delay_ms

tokio::time::sleep(Duration::from_millis(settle_ms)) at lib.rs line 1942. Default 1500ms. The window for accessibility events to fire and the tree to stabilize. Fast modals can drop this to 200ms; slow data-bound views can raise it to 4000ms.

Capture after-tree

Second call to get_window_tree with the same TreeBuildConfig. If this fails (the app crashed, the window closed) the diff is suppressed and the original action result is returned with uiDiff=None.

Strip volatile fields

is_yaml = old_tree_str.trim_start().starts_with('- [') decides the path. YAML hits the regex stripper, JSON hits the recursive remove_ids. Either way, ids and bounds are gone before the diff runs.

Line-diff with the similar crate

TextDiff::from_lines(&old_processed, &new_processed) at ui_tree_diff.rs line 81. Iterate iter_all_changes(), keep ChangeTag::Insert as '+ ' lines and ChangeTag::Delete as '- ' lines, drop ChangeTag::Equal entirely.

Return UiDiffResult

Some(diff) becomes UiDiffResult { diff, has_changes: true }. None becomes UiDiffResult { diff: 'No UI changes detected', has_changes: false }. The TypeScript SDK exposes both on ClickResult.uiDiff and ActionResult.uiDiff.
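The eight steps above can be sketched end to end in TypeScript. This is a simplified model, not the Rust orchestrator: `executeWithUiDiff`, `CaptureTree`, `Outcome`, and `naiveDiff` are hypothetical names, the tree capture is injected so the sketch runs without a desktop, and the diff is a naive membership check standing in for the similar crate.

```typescript
// Hypothetical shapes for illustration; not the Terminator SDK types.
interface UiDiffResult { diff: string; hasChanges: boolean; }
interface Outcome<T> { result: T; uiDiff?: UiDiffResult; }
type CaptureTree = (pid: number) => Promise<string>; // compact YAML; may reject

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

function stripVolatile(tree: string): string {
  return tree.replace(/ #[\w\-]+/g, "").replace(/bounds: \[[^\]]+\],?\s*/g, "");
}

// Naive stand-in for a real line diff: reports lines present on one side only.
function naiveDiff(oldText: string, newText: string): string | null {
  const oldLines = oldText.split("\n");
  const newLines = newText.split("\n");
  const removed = oldLines.filter((l) => newLines.indexOf(l) === -1).map((l) => "- " + l);
  const added = newLines.filter((l) => oldLines.indexOf(l) === -1).map((l) => "+ " + l);
  const changes = removed.concat(added);
  return changes.length === 0 ? null : changes.join("\n");
}

async function executeWithUiDiff<T>(
  pid: number,
  captureTree: CaptureTree,
  action: () => Promise<T>,
  settleDelayMs: number = 1500,
): Promise<Outcome<T>> {
  // Steps 1-2: no pid means no tree capture; the action runs without a diff.
  if (pid === 0) return { result: await action() };
  const before = await captureTree(pid);

  // Steps 3-4: run the action, then wait for the tree to settle.
  const result = await action();
  await sleep(settleDelayMs);

  // Step 5: if the after-capture fails, keep the action result and drop the diff.
  let after: string;
  try {
    after = await captureTree(pid);
  } catch (_err) {
    return { result };
  }

  // Step 6: format detection. Only the compact YAML path is modelled here.
  const isCompactYaml = before.replace(/^\s+/, "").indexOf("- [") === 0;
  const clean = isCompactYaml ? stripVolatile : (text: string) => text;

  // Steps 7-8: diff the cleaned trees and wrap the outcome.
  const diff = naiveDiff(clean(before), clean(after));
  return {
    result,
    uiDiff: diff === null
      ? { diff: "No UI changes detected", hasChanges: false }
      : { diff, hasChanges: true },
  };
}

// Fake app for the demo: the click opens a confirmation dialog.
let dialogOpen = false;
const fakeCapture: CaptureTree = async (_pid) =>
  ["- [Window] Checkout #w-1 (bounds: [0,0,1280,720], enabled)"]
    .concat(dialogOpen ? ["  - [Window] Confirm Submission #d-2 (bounds: [400,200,480,240], focused)"] : [])
    .join("\n");

const demo = executeWithUiDiff(1234, fakeCapture, async () => { dialogOpen = true; return "clicked"; }, 10);
demo.then((outcome) => console.log(outcome.uiDiff ? outcome.uiDiff.diff : "no diff captured"));
```

Injecting the capture function keeps the sketch runnable anywhere; in the real engine the equivalent call goes to the OS accessibility API.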

One click, two ways to write the test

    // classic desktop assertion-based test
    // "click Submit, then verify the modal opened"
    submitButton.click();

    // you have to know in advance what to check
    expect(modal.isVisible()).toBe(true);
    expect(modal.getTitle()).toBe("Confirm Submission");
    expect(yesButton.isEnabled()).toBe(true);
    expect(noButton.isEnabled()).toBe(true);

    // did anything else change? you would never know.

Have to know in advance what to check
Misses unexpected side effects
One change ships as many lines
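For contrast, a diff-based version of the same step might look like the sketch below. The interfaces are hypothetical and model only the fields the article names (`uiDiff`, `diff`, `hasChanges`); the `submitButton` object is a stand-in that returns an invented diff so the example runs without a desktop.

```typescript
// Hypothetical minimal shapes, based only on the field names in the article.
interface UiDiff { diff: string; hasChanges: boolean; }
interface ClickResult { uiDiff?: UiDiff; }
interface Clickable { click(options?: { uiDiff?: boolean }): Promise<ClickResult>; }

// The single expectation for this step: the whole change, as text.
const EXPECTED_DIFF = [
  "+   - [Window] Confirm Submission (focused)",
  "+     - [Button] Yes (enabled)",
  "+     - [Button] No (enabled)",
].join("\n");

// Stand-in element with canned output; a real test would obtain the element
// from the SDK instead.
const submitButton: Clickable = {
  async click(options) {
    if (!options || !options.uiDiff) return {};
    return { uiDiff: { hasChanges: true, diff: EXPECTED_DIFF } };
  },
};

async function clickSubmitAndCheck(button: Clickable): Promise<string> {
  const result = await button.click({ uiDiff: true });
  if (!result.uiDiff) throw new Error("diff was not captured");
  // One comparison covers the modal, both buttons, and anything unexpected.
  if (result.uiDiff.diff !== EXPECTED_DIFF) {
    throw new Error("unexpected UI change:\n" + result.uiDiff.diff);
  }
  return result.uiDiff.diff;
}

const checked = clickSubmitAndCheck(submitButton);
checked.then((diff) => console.log(diff));
```

Any extra line in the diff, such as an unrelated control becoming disabled, fails the comparison without a dedicated assertion.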

What it looks like in your terminal

The MCP agent and the SDK both log this same shape. The [ui_diff] info lines come from the tracing spans inside execute_with_ui_diff. The diff itself is the unified-style block you commit to your repo as a snapshot.

$ node test-checkout.js

What the test code actually looks like

The opt-in is one extra option on the action. The diff lives on the result. Snapshot it once, run the suite for a year, never write another assertion for that step.

tests/checkout.spec.ts
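What follows is not the original listing for this file but a sketch of the comparison half of such a spec: record the diff on the first run, compare on every later run. `SnapshotStore` and `checkSnapshot` are hypothetical names, and the in-memory object stands in for whatever snapshot file or matcher the test runner provides.

```typescript
// In-memory stand-in for snapshot files committed to the repo.
type SnapshotStore = { [name: string]: string };
type SnapshotOutcome = "recorded" | "match" | "mismatch";

function checkSnapshot(store: SnapshotStore, name: string, diff: string): SnapshotOutcome {
  if (!Object.prototype.hasOwnProperty.call(store, name)) {
    // First run: the current diff becomes the baseline.
    store[name] = diff;
    return "recorded";
  }
  // Later runs: plain string equality against the baseline.
  return store[name] === diff ? "match" : "mismatch";
}

// Invented diffs for one step, as they might come back from three runs.
const firstRun = "+   - [Window] Confirm Submission (focused)";
const secondRun = "+   - [Window] Confirm Submission (focused)";
const regressedRun = "+   - [Window] Confirm Order (focused)";

const store: SnapshotStore = {};
const outcomes = [
  checkSnapshot(store, "checkout/click-submit", firstRun),
  checkSnapshot(store, "checkout/click-submit", secondRun),
  checkSnapshot(store, "checkout/click-submit", regressedRun),
];
console.log(outcomes.join(", "));
```

Because the baseline is a short string, a failing run can print both versions side by side and a reviewer can read the difference directly.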

0 binary diffs in PRs

“The whole snapshot is a few hundred bytes of text per action and any reviewer can read it.”

commit-time review of a Terminator-driven desktop suite

How this compares to the usual roster

Cells are written narrow on purpose; the row labels carry the argument. The competitor column is the median behavior across the tools listed in the marquee above. The ours column is what Terminator does today, with the file or function reference where applicable.

Feature	Typical desktop test tool	Terminator
Verifying that something changed after a click	Write an explicit assertion per step. assertEnabled(saveButton, false). assertVisible(confirmDialog, true). One change = many lines.	result.uiDiff.hasChanges. If false, nothing in the accessibility tree moved; that itself is the assertion.
Catching unexpected side effects	Caught only by assertions you wrote in advance. A button that secretly disabled a third unrelated control will not show up.	Anything that changed shows up as a + or - line. Unexpected disables, ghost modals, late-arriving toasts, all visible in the diff.
Stability under layout shift	Pixel-based image diffs flag every antialiasing change, every font hinting variation. Coordinate selectors break when the window resizes.	remove_ids_and_bounds_from_compact_yaml() strips pixel rectangles before diffing. A 118px button shift produces zero diff lines.
Storing snapshots in git	Binary screenshots, .png files, blob diffs in PRs. Reviewers cannot tell what changed without opening both images side by side.	Plain-text diff, a few hundred bytes per action. toMatchInlineSnapshot, git diff, code review by reading.
Tree capture timing under animation	Up to you. Most tools spin a sleep(500) into the test code itself, which then leaks into every other step.	settle_delay_ms on UiDiffOptions. Default 1500ms, applied automatically between the action and the after-snapshot at lib.rs line 1942.
Cross-platform support	Most desktop test tools are Windows-only or require a separate license per OS. Some require a paid agent on the SUT.	Same diff logic across UIAutomation on Windows and AX on macOS. terminator-mcp-agent re-exports terminator::ui_tree_diff so both adapters share one implementation.
Driving the test from an AI coding assistant	Recorder UI generates VBScript or proprietary script. Hard to wire into Claude Code or Cursor without an extra abstraction layer.	MCP server. claude mcp add terminator. Click and type tools return uiDiff in their response so the model can see what its action did.

Hard requirements this satisfies for a regression suite

Snapshot is plain text, not binary
Identifiers and pixel coordinates are stripped before diffing
Wait-for-settle is configurable per call, not baked into test code
hasChanges: false means 'no diff'; an absent uiDiff means the diff was suppressed by an error or never requested
Same engine on Windows UIAutomation and macOS AX
MCP-shaped, so an AI coding assistant can drive the suite
MIT licensed, no paid agent on the desktop under test

How to put this on a real desktop suite

Three install paths. Pick whichever matches the runner you already have.

npm install @mediar-ai/terminator if your test code is TypeScript or JavaScript. Construct a Desktop, locate with selectors like role:Button|name:Submit, call .click({ uiDiff: true }), read result.uiDiff.
pip install terminator if your test code is Python. The Desktop class shape is identical and uiDiff is the same field.
claude mcp add terminator "npx -y terminator-mcp-agent@latest" if you want Claude Code or Cursor to drive the suite. Every click_element / type_into_element / press_key tool response carries uiDiff so the model sees what its own action did.

Frequently asked questions

Why does a desktop test tool need a UI tree diff at all?

Because the alternative is writing one or two assertions per step (verify field X has value Y, verify button Z is now disabled), which collapses under volume. Once a regression suite covers a real desktop app, the assertions outnumber the actions and the suite becomes a maintenance pit. A structural diff inverts the model: every action records what changed in the accessibility tree, and the test only fails when the change does not match. The fact that Terminator computes the diff inside the agent (crates/terminator/src/ui_tree_diff.rs) instead of leaving it to user code is what lets you snapshot a UI flow with zero asserts.

How does the diff stay stable when the UI shifts a few pixels between runs?

Through aggressive preprocessing before the line diff runs. For JSON trees, remove_ids at ui_tree_diff.rs line 7 walks the tree and drops every key named 'id' or 'element_id'. For compact YAML trees, remove_ids_and_bounds_from_compact_yaml at line 40 applies two regexes: ' #[\w\-]+' to strip identifier suffixes and 'bounds: \[[^\]]+\],?\s*' to strip pixel rectangles. After both trees pass through that filter, only the structurally meaningful lines remain. A button that moved from x=412 to x=540 produces no diff. A button whose name changed from Submit to Save produces one '-' line and one '+' line. That distinction is the entire point.

Where does the diff actually surface in the API I write tests against?

On the result of every action. In the TypeScript SDK (docs/TERMINATOR_JS_API.md lines 470 to 500), ClickResult and ActionResult both expose an optional uiDiff field of shape { diff: string, treeBefore?: string, treeAfter?: string, hasChanges: boolean }. You opt in per call: element.click({ uiDiff: true }). The MCP agent surfaces the same shape on click_element, type_into_element, press_key, and a handful of other tools, so an AI coding assistant testing a desktop app sees the diff in its tool response and can decide whether the change matched intent. Without the flag the action runs without any tree capture and the field is undefined.
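The shape quoted above maps onto a small TypeScript type. The field names below follow that description; the `classify` helper and the `DiffState` labels are hypothetical, added only to show the three states a test can observe.

```typescript
// Field names as quoted in the answer above.
interface UiDiffResult {
  diff: string;
  treeBefore?: string;
  treeAfter?: string;
  hasChanges: boolean;
}

interface ActionResult {
  uiDiff?: UiDiffResult;
}

// Hypothetical helper: the three situations a test can distinguish.
type DiffState = "not-captured" | "unchanged" | "changed";

function classify(result: ActionResult): DiffState {
  // Field absent: the flag was not passed, or the capture was suppressed.
  if (result.uiDiff === undefined) return "not-captured";
  return result.uiDiff.hasChanges ? "changed" : "unchanged";
}

console.log(classify({}));
console.log(classify({ uiDiff: { diff: "No UI changes detected", hasChanges: false } }));
console.log(classify({ uiDiff: { diff: "+ - [Window] Confirm Submission", hasChanges: true } }));
```

Treating "not-captured" separately from "unchanged" keeps a failed tree capture from passing as a clean run.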

What format is the diff in, and why does that matter for snapshot testing?

Unified-style line diff. Lines added in the after-tree are prefixed with '+ ', lines removed are prefixed with '- ', and equal lines are filtered out (ui_tree_diff.rs lines 86 to 96). The output is what Rust's similar crate produces from TextDiff::from_lines, which mirrors the behavior of Python's difflib.ndiff. That format matters because it is git-native: you can commit a baseline diff to your repo, diff again on the next run, and assert string equality. No image diffing library, no antialiasing tolerance, no font hinting workaround. The whole snapshot is a few hundred bytes of text per action and any reviewer can read it.
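To show the output format on a small input, here is a line diff in TypeScript that emits the same '+ ' and '- ' prefixes and drops equal lines. It uses a textbook longest-common-subsequence table, which is an assumption made for brevity and not the algorithm inside the similar crate; `lineDiff` and the sample trees are invented for the example.

```typescript
// LCS-based line diff: a simple stand-in, not the similar crate's algorithm.
function lineDiff(oldText: string, newText: string): string | null {
  const a = oldText.split("\n");
  const b = newText.split("\n");

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs: number[][] = [];
  for (let i = 0; i <= a.length; i++) {
    const row: number[] = [];
    for (let j = 0; j <= b.length; j++) row.push(0);
    lcs.push(row);
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both sides, emitting deletions and insertions and skipping equal lines.
  const out: string[] = [];
  let oi = 0;
  let ni = 0;
  while (oi < a.length && ni < b.length) {
    if (a[oi] === b[ni]) {
      oi++;
      ni++;
    } else if (lcs[oi + 1][ni] >= lcs[oi][ni + 1]) {
      out.push("- " + a[oi++]);
    } else {
      out.push("+ " + b[ni++]);
    }
  }
  while (oi < a.length) out.push("- " + a[oi++]);
  while (ni < b.length) out.push("+ " + b[ni++]);

  return out.length === 0 ? null : out.join("\n");
}

// Already-stripped trees: the button was renamed from Submit to Save.
const oldTree = [
  "- [Window] Checkout",
  "  - [Button] Submit (enabled)",
  "  - [Text] Total: $42.18",
].join("\n");
const newTree = [
  "- [Window] Checkout",
  "  - [Button] Save (enabled)",
  "  - [Text] Total: $42.18",
].join("\n");

console.log(lineDiff(oldTree, newTree));
```

A rename yields exactly one removed line and one added line, and identical trees yield no diff at all.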

How long does Terminator wait for the UI to settle before it captures the after-tree?

1500 milliseconds by default, configurable per call. The timing lives at lib.rs line 1942 inside execute_with_ui_diff: tokio::time::sleep(Duration::from_millis(settle_ms)) runs after the action and before the after-snapshot. UiDiffOptions exposes settle_delay_ms so a fast modal can cut it to 200ms and a slow data-bound view can stretch it to 4000ms. Tree capture itself uses TreeBuildConfig with timeout_per_operation_ms=100 and yield_every_n_elements=25 to avoid stalling the accessibility thread, which matters because UIAutomation on Windows is notoriously deadlock-prone if you query it while it is still mutating.

What if the diff is wrong because the accessibility tree is itself unreliable?

Then you fall back to the same primitives the diff is built on. The same accessibility-tree dump that feeds the diff is also exposed through get_window_tree, so a test that suspects a flaky snapshot can capture the raw tree before and after manually, run the same regex preprocessor, and inspect any line. Because Terminator goes through the OS accessibility APIs (UIAutomation on Windows, AX on macOS) and not through pixel scraping, the tree reflects what assistive tech sees, not what a screenshot library guesses. You still get false positives when the app emits transient roles during animation, which is what the settle delay is for.

Does this replace tools like TestComplete, Ranorex, Squish, or WinAppDriver?

It replaces them when your test runner is an AI coding assistant or a TypeScript script and the app is anything reachable through accessibility. Commercial desktop test tools were built for a recorder/playback workflow and a single-OS stack. Terminator is a developer framework, MIT licensed, with a Playwright-shaped API, an MCP server, and the diff-as-primitive design described above. If you have an existing TestComplete suite that drives a single Windows app and a team trained on its IDE, switching is a project. If you are starting a regression suite from scratch in 2026, or pointing Claude Code at a desktop app, Terminator is closer to the shape your test code actually wants.

How do I install it for a desktop test suite?

Three ways. For TypeScript: npm install @mediar-ai/terminator, then construct a Desktop, locate elements with selectors that look like role:Button|name:Submit, and call .click({ uiDiff: true }) to read result.uiDiff. For Python: pip install terminator, identical shape. For an AI coding assistant: claude mcp add terminator 'npx -y terminator-mcp-agent@latest', and the assistant gets click_element / type_into_element / press_key tools that all carry the uiDiff flag. The Rust crate is published as terminator on crates.io if you want to embed the engine directly in a Rust test harness.

Other parts of the Terminator engine that show up in test runs.

Related guides

Selectors

Test automation for desktop applications, with a 10ms grace window

Sister page on the parallel selector race in utils.rs. The mechanism that keeps the diff stable across reruns by keeping selector choice deterministic.

Snapshots

UI automation testing that survives an 118px layout shift

Goes deeper on the bounds regex and how a moved button produces zero diff output. Bounds-agnostic snapshots in practice.

Windows

Automation tools for UI testing on Windows

Where Terminator sits on the Windows accessibility stack: UIAutomation, COM, the apartment-model deadlock traps, and how the agent avoids them.
