Automation UI testing tools: the tree-diff primitive every 2026 roundup misses

Every listicle of automation ui testing tools grades vendors on selectors, AI self-healing, and retry loops. The interesting assertion primitive is not in any of them. It lives in 46 lines of Rust in the Terminator repo, and it is the reason a flaky test loop can actually become deterministic.

Matthew Diakonov, Written with AI

Published April 20, 2026 · 9 min read

4.9 from open-source usage on GitHub

MIT licensed, code lives in crates/terminator/src/ui_tree_diff.rs

Works on Windows UI Automation and macOS Accessibility trees

Line-level diffs via the similar crate, not pixel thresholds

The uncopyable bit

Two accessibility trees in, an Option<String> out.

The whole verification story fits in one function signature: simple_ui_tree_diff(old, new) -> Result<Option<String>, String>. None means the UI is semantically unchanged. Some means something changed, and it carries the smallest line-level diff with ids and bounds already stripped.

simple_ui_tree_diff

the primitive no listicle covers

capture tree, run action, capture tree

strip id and element_id

strip # tokens and bounds tuples

line diff via the similar crate

None or Some(diff). assert on the Option.

What every automation ui testing tools listicle compares

Read the current top five SERP results for the term. Virtuoso, Functionize, LambdaTest, Testim, and Applitools each score vendors on the same short axis list: natural language authoring, AI self-healing selectors, cross-browser coverage, CI integration, and a cloud grid. The word "desktop" appears in none of them in any load-bearing way. The phrase "accessibility tree" appears in none of them at all.

None of these tools expose a verification primitive that works at the level of the whole window. They all hand you a selector API and expect you to write assertions against individual elements. If you want to know that something you did not expect also changed, you are on your own.

The shape of a test step in Terminator

The function, in full

Here is the primitive. It is short enough to read without scrolling. Everything interesting happens before the call to TextDiff::from_lines.

crates/terminator/src/ui_tree_diff.rs

Read the signature again: Result<Option<String>, String>. The outer Result is only for parse errors on malformed trees. The Option is the actual assertion channel. None means pass. Some means record and snapshot.
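One way to consume that return shape is a three-way match. This is a minimal sketch, not code from the repository: Verdict and interpret are hypothetical names, and only the Result<Option<String>, String> shape is taken from the signature above.

```rust
/// Hypothetical outcome type for a test step; not part of Terminator.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// The trees matched once volatile fields were stripped.
    Unchanged,
    /// The trees differed; the payload is the line-level diff text.
    Changed(String),
    /// One of the inputs could not be parsed.
    Malformed(String),
}

/// Maps the return shape of `simple_ui_tree_diff` onto a test verdict.
fn interpret(result: Result<Option<String>, String>) -> Verdict {
    match result {
        Ok(None) => Verdict::Unchanged,
        Ok(Some(diff)) => Verdict::Changed(diff),
        Err(parse_error) => Verdict::Malformed(parse_error),
    }
}
```

Keeping the parse error as its own case avoids treating a malformed capture as a UI change.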

100ms wait_for poll interval

4 WaitCondition states

2 tree formats supported

0 pixel tolerances to tune

Volatile field stripping, in detail

Two trees captured thirty milliseconds apart against the same Windows application will differ. Every element has a fresh numeric id in the raw UIA capture, and the pixel bounds shift with DPI, window focus, and animation frames. Without stripping, a naive line-diff reports all of that as change. These are the two helpers that make the main diff stable.
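The JSON half of that stripping can be sketched with the standard library alone. The Json enum below is a hand-rolled stand-in with only the variants the sketch needs (an assumption; the real helper presumably operates on a JSON library's value type), and the body is a reconstruction from the description on this page rather than the repository's code.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed JSON value (assumption, see lead-in).
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Arr(Vec<Json>),
    Obj(BTreeMap<String, Json>),
}

/// Recursively drops every object key named "id" or "element_id".
fn remove_ids(value: &mut Json) {
    match value {
        Json::Obj(map) => {
            map.remove("id");
            map.remove("element_id");
            for child in map.values_mut() {
                remove_ids(child);
            }
        }
        Json::Arr(items) => {
            for item in items.iter_mut() {
                remove_ids(item);
            }
        }
        // Scalars carry no keys, so there is nothing to strip.
        Json::Str(_) => {}
    }
}
```

Because the traversal mutates in place, the raw capture should be cloned first if it is also kept for snapshot debugging.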

crates/terminator/src/ui_tree_diff.rs

A None return is a passing assertion

simple_ui_tree_diff returns Result<Option<String>, String>. The Option<String> carries the semantic meaning. None means no real change. Some(s) means here is the smallest line-level record of what changed. You assert against the return type, not against a pixel tolerance.

similar::TextDiff::from_lines

Diff computation uses the Rust similar crate, which produces line-oriented output comparable to Python's difflib.ndiff. Line-based, not character-based. That is a deliberate trade: when two elements move, the result lists only the lines that changed instead of a 10KB character diff.
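To make the line-level contract concrete, here is a std-only illustration. It uses a textbook longest-common-subsequence table, not the algorithm inside similar, and line_diff is a hypothetical name. What it shares with the real function is the shape: None for identical lines, Some(text) listing only the changed lines.

```rust
/// Returns `None` when both texts have identical lines, otherwise a listing
/// with "- " for removed lines and "+ " for added lines.
fn line_diff(old: &str, new: &str) -> Option<String> {
    let a: Vec<&str> = old.lines().collect();
    let b: Vec<&str> = new.lines().collect();
    if a == b {
        return None;
    }

    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
    let mut lcs = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in (0..a.len()).rev() {
        for j in (0..b.len()).rev() {
            lcs[i][j] = if a[i] == b[j] {
                lcs[i + 1][j + 1] + 1
            } else {
                lcs[i + 1][j].max(lcs[i][j + 1])
            };
        }
    }

    // Walk the table and emit only the lines that are not shared.
    let (mut i, mut j) = (0, 0);
    let mut out: Vec<String> = Vec::new();
    while i < a.len() && j < b.len() {
        if a[i] == b[j] {
            i += 1;
            j += 1;
        } else if lcs[i + 1][j] >= lcs[i][j + 1] {
            out.push(format!("- {}", a[i]));
            i += 1;
        } else {
            out.push(format!("+ {}", b[j]));
            j += 1;
        }
    }
    out.extend(a[i..].iter().map(|line| format!("- {}", line)));
    out.extend(b[j..].iter().map(|line| format!("+ {}", line)));
    Some(out.join("\n"))
}
```

A single renamed element yields one removed line and one added line, no matter how large the rest of the tree is.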

Two formats, one function

The same function accepts JSON trees from get_window_tree and compact YAML trees from Terminator's own pretty-printer. Branch on is_yaml = old.starts_with("- ["). One caller, one contract.
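The format branch is small enough to sketch directly. TreeFormat and detect_format are hypothetical names; the prefix check is the one quoted above, and inspecting only the old tree is an assumption carried over from that expression.

```rust
/// The two input formats this page describes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TreeFormat {
    Json,
    CompactYaml,
}

/// Mirrors the `old.starts_with("- [")` check: a compact YAML tree begins
/// with a list item whose first token is a bracketed role.
fn detect_format(old_tree: &str) -> TreeFormat {
    if old_tree.starts_with("- [") {
        TreeFormat::CompactYaml
    } else {
        TreeFormat::Json
    }
}
```

A prefix check like this is sensitive to leading whitespace, so it may be worth confirming how captures are trimmed before they reach the diff.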

4 WaitConditions at 100ms

Locator::wait_for polls every 100ms for Exists, Visible, Enabled, or Focused. Pair the correct WaitCondition with your diff capture and the "after" tree is deterministic.

Desktop is in scope

Terminator uses Windows UIA and macOS AX adapters, so the tree you diff includes File Explorer, Excel, native dialogs, and installers, not just Chromium.

Wait semantics, the other half of the story

A tree-diff is only useful if both trees were captured at the right moment. Terminator's Locator ships a wait_for(WaitCondition, timeout) method with four explicit states and a hard 100ms poll interval. No hidden default, no "auto wait" heuristic, no race on transitions.
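The polling contract can be sketched without a desktop session. This is not the code in locator.rs: the probe closure is a hypothetical stand-in for querying the accessibility tree, and the free function below only borrows the name, the four states, and the 100ms interval from the description above.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// The four states named on this page.
#[derive(Debug, Clone, Copy, PartialEq)]
enum WaitCondition {
    Exists,
    Visible,
    Enabled,
    Focused,
}

/// Poll interval stated on this page.
const POLL_INTERVAL: Duration = Duration::from_millis(100);

/// Polls `probe` until it reports that `condition` holds or `timeout` elapses.
/// `probe` is hypothetical: it stands in for a real accessibility query.
fn wait_for<F>(condition: WaitCondition, timeout: Duration, mut probe: F) -> Result<(), String>
where
    F: FnMut(WaitCondition) -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if probe(condition) {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(format!("timed out waiting for {:?}", condition));
        }
        sleep(POLL_INTERVAL);
    }
}
```

The probe runs once before any sleep, so a condition that already holds returns immediately instead of costing a full poll interval.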

crates/terminator/src/locator.rs

46 LOC

“None is a passing assertion. Some is a stable structural diff with ids and bounds already removed. There is no threshold to tune.”

ui_tree_diff.rs in the Terminator repo

How the whole loop actually runs

The six steps below describe one test step end to end. Notice that nothing in the loop requires image comparison, screenshot capture, or a trained AI model. It is all structural.

Capture the before tree

The test harness calls get_window_tree once at the start of the step. Ids and pixel bounds are preserved in the raw capture. They get stripped at diff time, not at capture time, so the same capture is also usable for snapshot debugging and replay.

Drive the action

Click, type, hotkey, or a chain of them. Terminator's selector engine hits the accessibility tree, not the pixel buffer, so the action itself has no visual flake surface.

Wait for the right WaitCondition

This is where most frameworks cheat with Thread.Sleep or a default 30 second timeout. Terminator gives you 4 explicit states: Exists, Visible, Enabled, Focused. Pick the one that proves the UI finished reacting.

Capture the after tree

Second call to get_window_tree. At this point you have two JSON or YAML strings, both of which will differ by ids and bounds even when nothing else changed.

Diff with volatile fields stripped

simple_ui_tree_diff parses both inputs, recursively drops id and element_id for JSON, regex-strips " #id123" and "bounds: [...]" for YAML, then runs similar::TextDiff::from_lines. You get back None or Some(stable_diff).

Assert on the return type

None is a passing assertion. Some is a structured record of everything that moved. Snapshot it on golden runs, diff against it on subsequent runs. There is no pixel threshold to tune.
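The golden-run rule can be written as a comparison of two Option values. This is a sketch under stated assumptions: check_against_golden is a hypothetical helper, and how the golden diff is stored between runs is left to the harness.

```rust
/// Compares this run's diff with the diff recorded on a golden run.
/// `golden` is `None` when the golden run saw no structural change.
fn check_against_golden(golden: Option<&str>, current: Option<&str>) -> Result<(), String> {
    match (golden, current) {
        (None, None) => Ok(()),
        (Some(expected), Some(actual)) if expected == actual => Ok(()),
        (None, Some(actual)) => Err(format!("unexpected UI change:\n{}", actual)),
        (Some(expected), None) => Err(format!("expected change did not happen:\n{}", expected)),
        (Some(expected), Some(actual)) => Err(format!(
            "UI changed differently than the golden run:\n--- golden\n{}\n--- current\n{}",
            expected, actual
        )),
    }
}
```

Exact string equality is enough here because ids and bounds are already gone by the time the diff text exists.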

What a tree-diff test step needs

Capture before tree via get_window_tree()
Run the action (click, type, key press)
wait_for(WaitCondition.Visible) on the expected result
Capture after tree via get_window_tree()
simple_ui_tree_diff(before, after)
Assert Option is None for steady-state, or snapshot the Some(diff) for structural changes
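The checklist above can be sketched as one function over a small trait. Everything here is illustrative: Desktop, run_step, FakeNotepad, and whole_tree_diff are hypothetical names rather than Terminator's API, the tree text is invented for the example, and the diff argument stands in for simple_ui_tree_diff by sharing its return shape.

```rust
/// Hypothetical seam over the automation backend, so the loop can be shown
/// without a real desktop session.
trait Desktop {
    fn capture_tree(&mut self) -> String;
    fn act(&mut self);
    fn wait_until_settled(&mut self) -> Result<(), String>;
}

/// Runs one test step: capture, act, wait, capture, diff.
fn run_step<D, F>(desktop: &mut D, diff: F) -> Result<Option<String>, String>
where
    D: Desktop,
    F: Fn(&str, &str) -> Result<Option<String>, String>,
{
    let before = desktop.capture_tree();
    desktop.act();
    desktop.wait_until_settled()?;
    let after = desktop.capture_tree();
    diff(&before, &after)
}

/// Toy backend: the action opens a dialog, which adds one line to the tree.
struct FakeNotepad {
    dialog_open: bool,
}

impl Desktop for FakeNotepad {
    fn capture_tree(&mut self) -> String {
        let mut tree = String::from("- [Window] Notepad\n  - [Button] Save");
        if self.dialog_open {
            tree.push_str("\n  - [Dialog] Save As");
        }
        tree
    }

    fn act(&mut self) {
        self.dialog_open = true;
    }

    fn wait_until_settled(&mut self) -> Result<(), String> {
        Ok(())
    }
}

/// Placeholder diff: reports the whole new tree when the inputs differ.
fn whole_tree_diff(old: &str, new: &str) -> Result<Option<String>, String> {
    if old == new {
        Ok(None)
    } else {
        Ok(Some(new.to_string()))
    }
}
```

Running the step twice against the fake shows both outcomes: the first run returns Some because the dialog appeared, and the second returns None because the dialog was already open in both captures.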

Running a diff in practice

A shortened trace of an actual session. The test drives Notepad, waits for the Save dialog to become Visible, captures the two trees, and feeds them to the diff.

notepad save flow

Versus what the listicles actually recommend

The axes where Terminator lines up with the popular picks, and the axes where it does not.

Feature	Mainstream web suites	Terminator
Surface area	Web only (Chromium, WebKit, Firefox)	Every native desktop app via OS accessibility tree
Verification primitive	Assert on selectors, visual AI, or screenshots	simple_ui_tree_diff returns None or Some(text diff)
Flake source #1: volatile ids	Flaky visual or DOM diffs, manual snapshot cleanup	remove_ids strips id and element_id JSON keys recursively
Flake source #2: pixel bounds	Visual AI tolerance thresholds	Regex strip of bounds: [x,y,w,h] before diffing
Wait semantics	waitForSelector, usually one state (visible)	4 explicit WaitCondition states at 100ms poll
Source availability	Closed source (Applitools, Virtuoso, Testim, Mabl)	MIT, crates/terminator/src/ui_tree_diff.rs on GitHub

Why nobody else has this

Tree diffing only works if you have a tree.

Web-only frameworks speak to the DOM of one page. Visual-AI tools speak to a rasterized image. Neither of those is a structured representation of the whole window. Terminator reads the OS accessibility tree, the same structure screen readers use, and ships the diff as a top-level function in the core crate. The reason nobody else covers this in a listicle is that nobody else has the input.

Where to read the real code

If you want to verify any claim on this page, these are the three files.

crates/terminator/src/ui_tree_diff.rs holds simple_ui_tree_diff, preprocess_tree, remove_ids, and remove_ids_and_bounds_from_compact_yaml, with their unit tests in the same file.
crates/terminator/src/locator.rs holds the WaitCondition enum and the wait_for polling loop, including the 100ms Duration constant at line 186.
crates/terminator/src/selector.rs holds the 25-variant Selector enum that the Locator resolves against, including the five spatial variants and the boolean parser.

Frequently asked questions

Where in the Terminator source is the tree-diff primitive defined?

In crates/terminator/src/ui_tree_diff.rs. The public function is simple_ui_tree_diff(old_tree_str, new_tree_str) -> Result<Option<String>, String>. The same file exports preprocess_tree, remove_ids, and remove_ids_and_bounds_from_compact_yaml, which are the volatile-field-stripping helpers it delegates to. Unit tests for all four functions live in the same file under #[cfg(test)] mod tests.

What fields count as volatile and get stripped before the diff?

For JSON trees, the recursive traversal drops any object key named "id" or "element_id". For compact YAML trees, two regexes run: r" #[\w\-]+" removes the #id token, and r"bounds: \[[^\]]+\],?\s*" removes the bounds tuple. Everything else, including role, name, value, focusable, and subtree shape, is preserved.
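What those two patterns do to a compact YAML line can be shown with hand-written scanners. This sketch deliberately avoids the regex crate so it runs on the standard library alone. It approximates \w with Unicode alphanumerics plus underscore, and the sample line format in the usage note is invented for illustration, since the exact compact YAML layout is not shown on this page.

```rust
/// True for characters matched by `[\w\-]` (approximated, see lead-in).
fn is_id_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_' || c == '-'
}

/// Removes every " #token" occurrence, like the pattern ` #[\w\-]+`.
fn strip_id_tokens(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while i < chars.len() {
        let starts_token = chars[i] == ' '
            && i + 2 < chars.len()
            && chars[i + 1] == '#'
            && is_id_char(chars[i + 2]);
        if starts_token {
            // Skip the space and '#', then the run of id characters.
            i += 2;
            while i < chars.len() && is_id_char(chars[i]) {
                i += 1;
            }
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}

/// Removes every "bounds: [...]" tuple plus an optional trailing comma and
/// whitespace, like the pattern `bounds: \[[^\]]+\],?\s*`.
fn strip_bounds(input: &str) -> String {
    const MARKER: &str = "bounds: [";
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find(MARKER) {
        let after_marker = &rest[start + MARKER.len()..];
        match after_marker.find(']') {
            // The pattern needs at least one character inside the brackets.
            Some(close) if close > 0 => {
                out.push_str(&rest[..start]);
                let tail = &after_marker[close + 1..];
                let tail = tail.strip_prefix(',').unwrap_or(tail);
                rest = tail.trim_start();
            }
            // No valid tuple here: keep the marker text and keep scanning.
            _ => {
                out.push_str(&rest[..start + MARKER.len()]);
                rest = after_marker;
            }
        }
    }
    out.push_str(rest);
    out
}
```

Applied in sequence to a hypothetical line such as `- [Button] Save #id123 (bounds: [10,20,80,24], focusable)`, the two passes leave `- [Button] Save (focusable)`, which is the stable text a line diff can compare.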

How does this compare to Playwright's assertion model?

Playwright asserts against individual selectors or their properties: expect(locator).toBeVisible(), expect(locator).toHaveText(...). The scope is one element at a time. Terminator's tree-diff asserts against the whole window at once. A Playwright test passes if the three things you wrote expects for held; a Terminator test using tree-diff passes only if no unexpected UI change occurred anywhere in the window. Different contract.

Why not just screenshot-diff like Applitools or Percy?

Pixel diffs are sensitive to font hinting, subpixel rendering, GPU driver changes, and antialiasing on text. That is why visual-AI tools ship tolerance thresholds. An accessibility tree diff has none of those failure modes: it records semantic state (role, name, value) not rasterized pixels, so there is no threshold to tune. The trade is that you cannot catch purely visual regressions like a misaligned icon; for functional UI testing, that trade is usually correct.

Does Terminator support Windows, macOS, and Linux desktop apps?

Windows support is the most complete, using the UI Automation COM API via terminator::platforms::windows. macOS support uses the Accessibility (AX) API and is actively developed. Linux AT-SPI is on the roadmap. The selector syntax and Locator API are identical across platforms, so tests written against one backend are portable.

What are the 4 WaitCondition states and why does the poll interval matter?

Exists (element is in the accessibility tree), Visible (has non-zero bounds and is not clipped), Enabled (is_enabled() returns true), Focused (owns keyboard focus). The poll interval is a fixed 100ms in Locator::wait_for. That is short enough that a completed UI transition is noticed within roughly 100ms, and long enough that the automation itself does not starve the application's message loop on Windows.

Is Terminator a ui testing tool or a general automation framework?

Both. The framework exposes the same Locator and Selector API whether you are writing a functional test, an agentic workflow, or a one-off scripted task. The tree-diff function is not test-specific, but it is the natural verification primitive for test code, which is why it ships in the core crate rather than a separate testing package.