Automation tools for UI testing that prove they can see the UI first

Every 2026 roundup of automation tools for UI testing scores vendors on self-healing selectors, AI authoring, and cloud device grids. None of them answer the question that matters in a farm of Windows VMs: can this worker actually see a UI right now? Terminator's MCP agent ships an HTTP /ready endpoint that boots UIAutomation, grabs the desktop root, enumerates its children, and returns HTTP 200, 206, or 503 under a hard 5 second timeout. Before the orchestrator dispatches a test, it can ask.

GET /health · GET /ready · 200 / 206 / 503 · tokio::time::timeout(5s) · MIT
Matthew Diakonov
11 min read
4.9 from teams running desktop UI test fleets
  • Cross-platform PlatformHealthCheck trait in terminator::health
  • Healthy 200, Degraded 206, Unhealthy 503 (health.rs line 25)
  • UIA probe inside tokio::time::timeout(Duration::from_secs(5))
  • Wired on the same axum server your MCP client talks to

The gap every ranked list of automation tools for UI testing leaves open

Nearly every top-ranked 2026 guide compares the same dimensions: selector quality, self-healing, AI authoring, parallel grids, CI/CD plugins, analytics. All of these assume the worker executing the test is capable of seeing a UI when the dispatcher hands it a job. In a browser-only world that assumption mostly holds, because Chromium is a subprocess of the test runner. The moment you move to desktop automation it stops holding. A Windows VM can boot cleanly, the test harness can report as online, and yet UIAutomation can return no children because the RDP session dropped, the virtual display driver de-registered, or a GPO lock disabled the accessibility stack.

That is the blind spot Terminator's health and readiness endpoints close. The liveness endpoint tells the load balancer the process is up. The readiness endpoint tells the dispatcher the process can actually see a UI right now, as of two hundred milliseconds ago.

Two endpoints, one axum server

Both endpoints are wired in the same router as the MCP protocol handler. A single npx terminator-mcp-agent --transport http --port 3000 gives you /mcp, /health, /ready, /status, and /mode on the same port. The split is intentional: cheap liveness for load balancers, expensive readiness for dispatchers and diagnostics.

/ready exercises the whole desktop automation stack

[Flow: a CI dispatcher, a Kubernetes readinessProbe, or an Azure Load Balancer probes the Terminator MCP agent, which in turn exercises UIAutomation COM, grabs the desktop root, and enumerates TreeScope::Children.]

The anchor fact: a 10-line status map

The mapping from internal health state to HTTP status code is ten lines in crates/terminator/src/health.rs. Healthy is 200. Degraded is 206 Partial Content. Unhealthy is 503 Service Unavailable. 206 is the interesting one: it is what the probe returns when the accessibility API is alive but cannot enumerate a single desktop window. That state has no meaningful analogue in browser automation, which is why no browser-first tool exposes it.
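The mapping described above can be sketched in a few lines of plain Rust. This is an illustrative reconstruction, not the actual contents of health.rs; the real enum and its `to_http_status()` live in crates/terminator/src/health.rs.

```rust
// Hedged sketch of the Healthy/Degraded/Unhealthy -> HTTP mapping the article
// describes. Names mirror the described API; the real definitions live in
// crates/terminator/src/health.rs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}

impl HealthStatus {
    fn to_http_status(self) -> u16 {
        match self {
            HealthStatus::Healthy => 200,   // full UIA probe succeeded
            HealthStatus::Degraded => 206,  // UIA alive, desktop unreachable or empty
            HealthStatus::Unhealthy => 503, // COM or UIAutomation init failed
        }
    }
}

fn main() {
    assert_eq!(HealthStatus::Degraded.to_http_status(), 206);
}
```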

If the probe exceeds its hard 5 s budget, it returns a specific message: 'UIAutomation health check timed out after 5 seconds - UIAutomation API may be unresponsive.' (crates/terminator/src/platforms/windows/health.rs, line 53)

Three states, three HTTP codes, one grid

The status enum is intentionally coarse. Fine-grained diagnostics travel in the JSON body. The status code is the signal a dispatcher needs to decide whether to send work.

Healthy → 200 OK

COM initialized, UIAutomation available, desktop reachable, at least one child window found. The dispatcher sends work.

Degraded → 206 Partial Content

UIA is alive but the desktop is unreachable or has zero children. Common on disconnected RDP or headless VM with misconfigured virtual display.

Unhealthy → 503 Service Unavailable

CoInitializeEx failed or UIAutomation::new_direct returned an error. The worker is unusable until the host is fixed.

Hard 5 s timeout

tokio::time::timeout(Duration::from_secs(5), check_future). A stuck COM call never ties up a probe worker.

Diagnostics blob

com_initialized, desktop_child_count, is_headless, display_width, display_height, desktop_name. Read directly in a dashboard.

Shared library

check_automation_health() lives in terminator::health and dispatches to a per-OS PlatformHealthCheck trait. Your own tool can depend on it.

What the Windows probe actually does

The Windows implementation is the canonical one today. macOS and Linux stubs return Healthy with a diagnostic note that deep AX and AT-SPI probes are not yet implemented. The five steps below are the order of operations inside perform_sync_health_check. Each step can short-circuit the probe with a specific diagnostic.

1. CoInitializeEx(COINIT_MULTITHREADED)

Bootstraps the COM apartment for this thread. Silently tolerates RPC_E_CHANGED_MODE (0x80010106) so the probe does not fail when COM was already initialized in another mode elsewhere in the process. Records com_initialized: true|false in the diagnostics blob.

2. UIAutomation::new_direct()

Creates a direct IUIAutomation COM object, bypassing the wrapper cache. If this call fails, api_available stays false, error_message records the exact failure string, and the probe short-circuits. This is the single most common failure on locked-down hosts where UIA was disabled by group policy.

3. get_root_element() and virtual_display::is_headless_environment()

Grabs the desktop root element. On headless VMs (TERMINATOR_HEADLESS=1, or detected virtual display), logs 'Cannot access desktop: ... This typically indicates RDP disconnection or virtual display issues'. Sets desktop_accessible accordingly. Also captures display_width, display_height, display_x, display_y from the root's bounding rectangle so a zero-sized display surfaces in the JSON immediately.

4. find_all(TreeScope::Children, TrueCondition)

Enumerates direct children of the desktop, the operation every selector call ultimately depends on. A zero child count flips can_enumerate_elements to false and attaches display_warning: 'Desktop has no child windows'. That is the signal a browser-only test tool has no equivalent for.

5. update_status() and to_http_status()

Collapses the three boolean checks into the overall HealthStatus enum. All three true produces Healthy; api_available alone produces Degraded; otherwise Unhealthy. to_http_status() converts those to 200, 206, or 503 and the axum handler in main.rs returns the matching StatusCode with the same body regardless.
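The collapse in step 5 is easy to sketch: three booleans fold into one status. The free function below is hypothetical (the real logic sits inside update_status()), but it matches the rule the article states.

```rust
// Hedged sketch of step 5: three probe booleans collapse into one status.
// All three true -> Healthy; api_available alone -> Degraded; else Unhealthy.
#[derive(Debug, PartialEq)]
enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}

// Hypothetical free function standing in for update_status().
fn collapse(api_available: bool, desktop_accessible: bool, can_enumerate: bool) -> HealthStatus {
    if api_available && desktop_accessible && can_enumerate {
        HealthStatus::Healthy
    } else if api_available {
        // The UIA COM object exists, but the desktop is unreachable or empty.
        HealthStatus::Degraded
    } else {
        HealthStatus::Unhealthy
    }
}

fn main() {
    assert_eq!(collapse(true, false, false), HealthStatus::Degraded);
}
```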

The timeout guard

A COM call that hangs is worse than a COM call that fails. Every production UI automation tool has been bitten by a stuck UIA thread that holds a reference past a test's wall clock. The readiness probe refuses to hang: it wraps the synchronous check in tokio::task::spawn_blocking and then wraps the join handle in a five-second timeout. If the timeout fires, the probe returns Unhealthy with a specific timeout message.
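The real agent uses tokio::task::spawn_blocking plus tokio::time::timeout; the dependency-free sketch below shows the same guard pattern with a std thread and a channel. probe() is a stand-in for the real synchronous UIA check.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Stand-in for the blocking UIA health check. Imagine a COM call here that
// could hang; the guard below caps how long the caller waits for it.
fn probe() -> &'static str {
    "healthy"
}

// Run the probe on its own thread and bound the wait. If the deadline passes,
// the worker thread's eventual result is simply discarded.
fn guarded_probe(limit: Duration) -> Result<&'static str, &'static str> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let _ = tx.send(probe());
    });
    rx.recv_timeout(limit)
        .map_err(|_| "health check timed out - UIAutomation API may be unresponsive")
}

fn main() {
    println!("{:?}", guarded_probe(Duration::from_secs(5)));
}
```

The key property is the same as in the tokio version: the probe thread may stay stuck, but the caller always gets an answer within the deadline.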

crates/terminator/src/platforms/windows/health.rs

Probe in three commands

Start the MCP agent in HTTP mode with npx -y terminator-mcp-agent@latest --transport http --port 3000, then curl -s http://localhost:3000/ready. Pull the display cable, kill the virtual display driver, or drop the RDP session, and a second curl flips from 200 to 503 with a specific error_message.


The Degraded body in full

206 Partial Content is the state that lives in the middle: UIAutomation is alive, but the desktop has no children. The JSON body names exactly why. A real dashboard can parse this and route an alert.

/ready response on a disconnected display
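The exact response shape is not reproduced here; the sketch below is assembled from the diagnostics fields this article lists, with illustrative values for a disconnected-display worker. Field names follow the article; the real payload may differ.

```json
{
  "status": "degraded",
  "com_initialized": true,
  "api_available": true,
  "desktop_accessible": false,
  "can_enumerate_elements": false,
  "desktop_child_count": 0,
  "is_headless": true,
  "display_width": 0,
  "display_height": 0,
  "display_warning": "Desktop has no child windows",
  "error_message": null
}
```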

What the diagnostics catch

  • desktop_child_count: 0 — display disconnected or session locked.
  • is_headless: true — TERMINATOR_HEADLESS=1 or a virtual display is in play.
  • display_width: 0 — the monitor reports zero dimensions; GPU reload or driver in a bad state.
  • com_initialized: false — COM is wedged; this usually means the host needs a reboot before it can serve UI tests.
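The triage rules above are mechanical enough to encode. The struct and function below are hypothetical, not part of Terminator's API; the field names follow the article's diagnostics list, and the routing mirrors the bullets above.

```rust
// Hedged sketch: map /ready diagnostics to a human-readable action. The
// Diagnostics struct and triage() are hypothetical; only the field names and
// the routing rules come from the article.
struct Diagnostics {
    com_initialized: bool,
    desktop_child_count: u32,
    is_headless: bool,
    display_width: u32,
}

fn triage(d: &Diagnostics) -> &'static str {
    if !d.com_initialized {
        "COM is wedged: reboot the host before serving UI tests"
    } else if d.display_width == 0 {
        "monitor reports zero dimensions: reload the GPU or display driver"
    } else if d.desktop_child_count == 0 && d.is_headless {
        "headless desktop has no windows: check the virtual display config"
    } else if d.desktop_child_count == 0 {
        "display disconnected or session locked: reattach the session"
    } else {
        "no action needed"
    }
}

fn main() {
    let d = Diagnostics {
        com_initialized: true,
        desktop_child_count: 0,
        is_headless: false,
        display_width: 1920,
    };
    println!("{}", triage(&d));
}
```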

Measured defaults

The probe is designed to fit inside the readinessProbe periodSeconds a sensible Kubernetes operator will pick. Three numbers worth memorising.

5 s

Hard timeout on the Windows probe

3

Health states (Healthy, Degraded, Unhealthy)

206

HTTP code when UIA lives but the desktop has no children

The full handler

The readiness handler itself is fifty lines. It calls into terminator::health::check_automation_health(), inlines the extension bridge health alongside it, and translates the enum to an axum StatusCode. Zero vendor lock-in; anything that speaks HTTP can probe it.

crates/terminator-mcp-agent/src/main.rs

The dispatch loop this enables

A test dispatcher does not need to understand UIAutomation to use this. It just needs to curl a worker, read the status, and route work. The six steps below replace every piece of glue custom UI test farms normally ship.

Pre-dispatch flow

  • Orchestrator enqueues a UI regression run.
  • Dispatcher picks a worker from the pool.
  • Dispatcher curls worker's /ready endpoint.
  • Status 200: dispatch the test to this worker.
  • Status 206 or 503: mark worker as draining, try the next one.
  • Alert fires on sustained 503s with diagnostics JSON attached.
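The routing decision in the flow above fits in one function. This sketch mocks the HTTP call: each worker is paired with the status code its /ready endpoint would return. The names are illustrative, not part of Terminator's API.

```rust
// Hedged sketch of the pre-dispatch flow: walk the pool, send work to the
// first worker whose /ready probe returned 200, skip 206/503 workers.
// The (name, status) pairs stand in for real HTTP probes.
fn pick_ready_worker<'a>(workers: &[(&'a str, u16)]) -> Option<&'a str> {
    for &(name, ready_status) in workers {
        match ready_status {
            200 => return Some(name), // healthy: dispatch here
            206 | 503 => continue,    // degraded/unhealthy: mark draining, try next
            _ => continue,            // anything else: treat as not ready
        }
    }
    None // every probe failed: alert instead of dispatching
}

fn main() {
    let pool = [("vm-01", 503), ("vm-02", 206), ("vm-03", 200)];
    println!("{:?}", pick_ready_worker(&pool));
}
```

A real dispatcher would curl each worker's /ready endpoint and fall back to an alert path when pick_ready_worker returns None.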

Terminator vs a typical browser-only tool

What you get for treating the worker as a probeable service instead of a black box.

Feature | Browser-only tools | Terminator
Worker exposes an HTTP readiness endpoint | No. You dispatch the test and hope. | Yes. /ready returns 200, 206, or 503 per a live UIA probe.
Liveness and readiness separated | n/a. The test itself doubles as a probe. | /health (cheap, always 200 if up) and /ready (deep UIA check).
Detects RDP disconnect before dispatch | No. Test fails partway through and pollutes the report. | Yes. desktop_child_count: 0 flips status to degraded.
Hard timeout on the probe | n/a. | 5-second tokio::time::timeout. No frozen probe threads.
Diagnostics include display width and height | Surfaces as a broken screenshot inside a failed run. | display_width, display_height in JSON. Zero values raise an alert.
Headless/virtual display detection | No explicit signal. | is_headless flag plus display_warning when bounds are zero.
Wired for Kubernetes readinessProbe | Ad hoc shell scripts per vendor. | Inline doc comment: 'Kubernetes readiness probes (less frequent)'.
License | Closed SaaS in several ranked entries. | MIT, mediar-ai/terminator on GitHub.

What the other ranked entries give you instead

Playwright and Cypress do not need this, because they are browser-only. Virtuoso QA, Mabl, testRigor, Functionize, and TestSprite sell AI authoring and self-healing selectors as the answer to flakiness, which they are, partially, for scripts that run inside an evergreen browser. None of them expose an HTTP probe that asserts the OS accessibility API is functional before the test fires. Ranorex and Katalon, which do speak desktop, leave worker health to the ambient monitoring stack. If you are building a fleet, you write that glue yourself.

Terminator pushes that glue into the product. The same binary that runs your tests answers the probe. No parallel service to deploy, no agent-of-the-agent, no Nagios plugin. One axum router; two endpoints.

Probing a UI test fleet you cannot see into?

Book a 20 minute call with the team. We will walk through wiring /ready into your dispatcher and reading the diagnostics JSON in Grafana.

Frequently asked questions

What do automation tools for UI testing usually miss?

They assume the worker executing the test is always capable of seeing a UI. That is not true in practice. RDP sessions disconnect. Azure load-balanced VMs shift the desktop session. Virtual display drivers fail to register a headless monitor. The browser automation tool has no opinion about any of this because the browser runs in-process inside the test runner. Desktop UI automation runs against the OS accessibility API, and that API can silently lose the desktop out from under you. Terminator's MCP agent ships an HTTP /ready endpoint that actively probes UIAutomation, grabs the desktop root, and enumerates its children before the job dispatcher sends work. If any step fails, the worker returns 503 Service Unavailable and the orchestrator pulls it out of rotation. Source: crates/terminator-mcp-agent/src/main.rs line 793.

What HTTP status codes does /ready return and why?

Healthy maps to 200 OK. Degraded maps to 206 Partial Content. Unhealthy maps to 503 Service Unavailable. The mapping lives in crates/terminator/src/health.rs line 25 inside HealthStatus::to_http_status(). Degraded is a real state in the middle: api_available is true but either the desktop cannot be accessed or the element enumeration returned zero children. That covers the common case where the UIAutomation COM object initialized fine but the display was hot-unplugged or RDP dropped the session. Browsers do not have this state because a headless browser always reports itself as having a page; for desktop automation it is the failure mode that bites the most often.

How long does /ready take and what happens if the probe hangs?

On Windows the probe runs inside tokio::task::spawn_blocking wrapped by tokio::time::timeout(Duration::from_secs(5), check_future). Five seconds is the ceiling. If CoInitializeEx hangs, UIAutomation::new_direct() hangs, or get_root_element() hangs past five seconds, the probe returns Unhealthy with error_message set to 'Health check timed out after 5 seconds - UIAutomation API may be unresponsive' and the HTTP layer returns 503. The timeout is in crates/terminator/src/platforms/windows/health.rs line 35. No dangling COM reference, no stuck readiness probe, no flaky test dispatched against a frozen UIA thread.

What is different between /health and /ready?

/health is a liveness probe: it confirms the process is alive and the HTTP stack responds. It always returns 200 if the server is up. It does not touch UIAutomation. You point Azure Load Balancer at it and probe every five seconds without interfering with in-flight tests. /ready is a readiness probe that actually exercises the accessibility stack. It is expensive (500ms to 5s on Windows), so you probe it less often: once at worker startup, once per test before dispatch, once in a Kubernetes readinessProbe initialDelaySeconds, or on demand from a diagnostics page. Both endpoints are wired in main.rs lines 792-794. Mixing them up is a common ops footgun with any UI test automation worker fleet.

What diagnostics does the /ready payload actually include?

The JSON payload on Windows includes: com_initialized (bool), api_available, desktop_accessible, can_enumerate_elements, check_duration_ms, desktop_child_count (integer), is_headless (bool), display_width, display_height, display_x, display_y, desktop_name, plus error_message if any step failed. The writes happen in crates/terminator/src/platforms/windows/health.rs lines 62-205. For example, if the probe succeeds but reports desktop_child_count: 0 with a display_warning of 'Desktop has no child windows', you know the worker booted UIA cleanly but the display is disconnected. That is different from api_available: false which means COM itself is broken. Two different fixes, two different alert routes, both visible in one JSON response.

Why do browser-only UI test tools not need this?

Because the browser is the runtime. Chromium starts in the same process tree as Playwright, exposes a DOM over DevTools Protocol, and there is no external accessibility service between the test and the page. If Chromium starts, Playwright can query the DOM. Desktop UI automation is the opposite: the accessibility tree is produced by a separate Windows service, depends on a functioning display subsystem, and can be silently degraded by session switches, GPU driver reloads, or certain GPO-restricted hosts. The health probe exists because the gap between 'process is alive' and 'UIA can enumerate a button' is wider than any of the ranked automation tools for UI testing wants to admit.

Can I use /ready inside a Kubernetes pod?

Yes, that is one of its stated design targets. The inline comment at main.rs line 898 lists 'Pre-deployment validation, Diagnostics and troubleshooting, Kubernetes readiness probes (less frequent)'. A readinessProbe with periodSeconds: 30 and failureThreshold: 2 is reasonable. Pair it with a livenessProbe hitting /health at periodSeconds: 10. The readiness probe will flip the pod to NotReady when UIA becomes unreachable, the livenessProbe will not kill the pod unless the whole axum server dies. This is the pattern Kubernetes expects and Terminator models it directly.
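A minimal sketch of the manifest fragment this answer implies, assuming the agent listens on port 3000 inside the pod; the probe values come from the paragraph above, everything else is placeholder:

```yaml
# Hedged sketch: readiness hits the deep /ready probe, liveness hits /health.
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  periodSeconds: 30
  failureThreshold: 2
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  periodSeconds: 10
```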

Does the /ready endpoint fire only on Windows?

The HTTP endpoint is defined cross-platform. The underlying check_automation_health() call dispatches to WindowsHealthChecker, MacOSHealthChecker, or LinuxHealthChecker depending on target_os. macOS and Linux checkers currently return Healthy with a diagnostic note that deep AX and AT-SPI checks are not yet implemented (health.rs lines 167-200). So on Windows you get real probe semantics today; on macOS you get a liveness-equivalent readiness response. That is an honest gap that most automation tools for UI testing do not even document because they never differentiated the two endpoints.

How do I install and probe it in under a minute?

npx -y terminator-mcp-agent@latest --transport http --port 3000. Then curl -s http://localhost:3000/ready | jq .status. You will see 'ready', 'degraded', or 'not_ready'. The HTTP status code mirrors it. Pipe it into your CI pre-flight: before the pipeline dispatches a test, it curls the worker, checks the status, and skips or retries if the worker is not ready. This replaces the 'it worked on my VM' pattern with a binary signal. The shell command, the endpoint, and the response shape are all visible in main.rs line 870-886.

terminator · Desktop automation SDK
© 2026 terminator. All rights reserved.