Automation on Windows is slow because of IPC

Every tutorial on automation on Windows teaches you how to click a button. None of them tells you why the button took two seconds to resolve. The answer is not the accessibility API; it is the number of times your process asked another process a question. This page is about the single Windows UI Automation call that took Terminator from 6.5 seconds to 200 milliseconds on a 245-element window, and why that call is what lets Claude Code drive desktop apps in real time.

Matthew Diakonov, Written with AI

Published April 20, 2026, 9 min read

Single UIA CacheRequest with TreeScope::Subtree pre-fetches the whole tree

Seven UIProperty values batched in one find_first_build_cache call

6.5s to 200ms on 245 elements, verified in tree_builder.rs line 388

Automation on Windows, without the IPC tax

One CacheRequest. Seven properties. Zero per-element round trips.

A 245-node window costs 3,675 IPC calls the naive way

Every get_cached_* read is in-process, zero COM traffic

The thing every automation on Windows guide gets wrong

Search the top ten results for automation on Windows and you will find a lot of Task Scheduler screenshots, a couple of AutoHotkey snippets, and a marketing page for Power Automate Desktop. Not one of them mentions UI Automation by name. The API is treated as a black box labeled "find the button."

Under the hood, the tools that inspect controls through the accessibility layer talk to the same COM interface, IUIAutomation, introduced in Windows 7 and stable across every version since. And every one of them pays the same cost: each property you read on a UI element is a cross-process function call, because the target app lives in its own process and your process does not share memory with it.

The naive way to walk a window tree is to recurse from the root, and for each element read its control type, name, bounds, enabled state, focus state, and a few more. That is roughly fifteen COM calls per element. On a 245-element window (your average Office dialog), that is 3,675 round trips across the process boundary before you have even found the Save button. Measured cost on a real machine: 6.5 seconds.

Where the wall-clock time actually goes

The accessibility API is fast. The IPC is not. Here is the math, out loud, for a 245-element window.

245 elements in a mid-size window

15 IPC calls per element, naive walk

3,675 total cross-process round trips

6,500 ms wall-clock for the naive walk
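The arithmetic behind those figures can be written down as a few lines of Rust. This is only a sketch of the math stated on this page; the function names are hypothetical and nothing here talks to UI Automation.

```rust
use std::time::Duration;

/// Total cross-process round trips for a naive walk:
/// every element costs a fixed number of COM calls.
fn naive_round_trips(elements: u32, calls_per_element: u32) -> u32 {
    elements * calls_per_element
}

/// Average cost of a single round trip implied by a measured wall-clock time.
fn implied_cost_per_call(total: Duration, round_trips: u32) -> Duration {
    total / round_trips
}

/// Ratio between the uncached and cached wall-clock times.
fn speedup(before: Duration, after: Duration) -> f64 {
    before.as_secs_f64() / after.as_secs_f64()
}
```

With the page's numbers, naive_round_trips(245, 15) is 3,675, the implied cost per call for 6.5 seconds is a little under 1.8 milliseconds, and the speedup from 6.5 seconds to 200 milliseconds is 32.5x, which sits inside the quoted 30-50x range.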

The single call that collapses the tax

UIA has a feature most tutorials skip: IUIAutomationCacheRequest. You tell UIA which properties you want ahead of time, you set the tree scope once, and you issue a single find_first_build_cache call. UIA walks the target process's tree in that process, assembles a snapshot with every property you asked for, and hands it back in one hop. From that point forward, every get_cached_* read is an in-process lookup against the snapshot. The boundary is crossed exactly once.

The cache pipeline, one IPC call

30-50x

“Performance improvement: ~30-50x faster for large trees (e.g., 6.5s -> 200ms for 245 elements)”

Comment at the top of build_tree_with_cache in crates/terminator/src/platforms/windows/tree_builder.rs at line 386

The anchor: build_tree_with_cache, seven properties, one call

This is the core of Terminator's Windows backend. It lives in crates/terminator/src/platforms/windows/tree_builder.rs, starting at line 388. Every word inside this function is verifiable with grep.

crates/terminator/src/platforms/windows/tree_builder.rs

The exact seven properties

Adding more properties to a CacheRequest costs nothing measurable; the single IPC call dominates. These are the seven Terminator asks for, in order.

UIProperty::ControlType, UIProperty::Name, UIProperty::BoundingRectangle, UIProperty::IsEnabled, UIProperty::IsKeyboardFocusable, UIProperty::HasKeyboardFocus, UIProperty::AutomationId

Each one maps to a field on UIElementAttributes. The tree builder reads them through get_cached_control_type, get_cached_name, get_cached_bounding_rectangle, and friends. Nothing in that loop crosses a process boundary.

Side by side: naive versus cached

The difference between Windows automation that feels instant and automation that freezes the agent is the difference between the top half and the bottom half of this file.

naive-vs-cache.rs
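The original listing is not reproduced here. As a stand-in, the contrast can be modeled without touching Windows at all: the sketch below simulates the process boundary with a counter and compares a per-property walk against a walk over a one-shot snapshot. Every type and function in it (RemoteElement, Boundary, naive_walk, cached_walk, sample_window) is hypothetical and is not Terminator's code or the UI Automation API; the model also reads only two properties per element, so it costs three hops per element rather than the roughly fifteen described above.

```rust
use std::cell::Cell;

/// One UI element as it exists inside the target app's process (model only).
#[derive(Clone)]
struct RemoteElement {
    name: String,
    enabled: bool,
    children: Vec<RemoteElement>,
}

/// Stand-in for the process boundary: counts every simulated round trip.
struct Boundary {
    round_trips: Cell<u32>,
}

impl Boundary {
    fn new() -> Self {
        Boundary { round_trips: Cell::new(0) }
    }

    fn cross(&self) {
        self.round_trips.set(self.round_trips.get() + 1);
    }

    // Naive path: each property read and each child listing is its own hop.
    fn live_name(&self, e: &RemoteElement) -> String {
        self.cross();
        e.name.clone()
    }

    fn live_enabled(&self, e: &RemoteElement) -> bool {
        self.cross();
        e.enabled
    }

    fn live_children<'a>(&self, e: &'a RemoteElement) -> &'a [RemoteElement] {
        self.cross();
        &e.children
    }

    // Cached path: a single hop copies the whole subtree into our process.
    fn snapshot(&self, root: &RemoteElement) -> RemoteElement {
        self.cross();
        root.clone()
    }
}

/// Collects the names of enabled elements, paying three hops per element.
fn naive_walk(b: &Boundary, e: &RemoteElement, out: &mut Vec<String>) {
    let name = b.live_name(e);
    if b.live_enabled(e) {
        out.push(name);
    }
    for child in b.live_children(e) {
        naive_walk(b, child, out);
    }
}

/// Same traversal over a local snapshot: plain iteration, no boundary at all.
fn cached_walk(e: &RemoteElement, out: &mut Vec<String>) {
    if e.enabled {
        out.push(e.name.clone());
    }
    for child in &e.children {
        cached_walk(child, out);
    }
}

/// Builds a flat window: one root plus `buttons` enabled children.
fn sample_window(buttons: usize) -> RemoteElement {
    let children = (0..buttons)
        .map(|i| RemoteElement {
            name: format!("Button {}", i),
            enabled: true,
            children: Vec::new(),
        })
        .collect();
    RemoteElement { name: "Window".to_string(), enabled: true, children }
}
```

On a 245-element sample (sample_window(244)), the naive walk in this model crosses the boundary 735 times, while taking a snapshot and walking it crosses once, and both produce the same list of names. The shape of the result, not the exact counts, is the point.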

How the IPC call actually travels

The sequence is short because we want it to be. One hop out, one hop back, the whole tree comes with it.

build_tree_with_cache on a 245-element window

Watch the walk in four frames

Inside a single build_tree_with_cache call

Frame 1: the request

Terminator builds a CacheRequest, adds the seven UIProperty values, sets TreeScope::Subtree once, and creates a true_condition that matches every element.

Verify against the source

The claims on this page are grep-verifiable. Clone the repo and run these commands. If any line returns something different, the page is wrong; file an issue.

zsh

The six functions that make the cache work

Each card is a real symbol in the Windows UI Automation crate or Terminator's tree builder. Search for it in your copy of the repo.

find_first_build_cache

The single UIA method call that does all the heavy lifting. Accepts a TreeScope, a condition, and a CacheRequest, and returns a root element with every descendant pre-fetched.

create_cache_request

Allocates the request. Cheap. You add one UIProperty at a time; the cost of the request scales with element count, not property count.

set_tree_scope(Subtree)

Scope value 7 = Element | Children | Descendants. Without this, get_cached_children returns nothing and recursion falls back to live COM calls.

get_cached_control_type

In-process read from the snapshot. Zero IPC. Every get_cached_* accessor is the same shape.

get_cached_children

Returns the pre-loaded child array. No COM traversal, no waiting on a remote process. The recursion is plain Rust iteration.

build_node_from_cached_element

The recursive function that walks the snapshot and produces the final UINode. Every field on UIElementAttributes comes from a get_cached_* read.

Three numbers that matter

1

COM round trip from Terminator to the target process, no matter how many elements live inside the window.

7

UIProperty values pre-fetched on every node in the subtree, set once at CacheRequest construction.

200ms

Observed wall-clock for a 245-element window after caching. Same tree, no cache: 6.5 seconds.

Terminator versus naive automation on Windows

Feature	Traditional automation on Windows	Terminator
Reads UI Automation tree via single CacheRequest	No, per-property COM calls	Yes, 7 properties in one find_first_build_cache
Tree scope set once	Per-node scope on every descent	TreeScope::Subtree at request construction
Wall-clock for a 245-node window	~6.5 seconds	~200 milliseconds
Falls back gracefully when cache fails	Retries same slow path	Logs and calls build_ui_node_tree_configurable
Exposes the primitive to AI coding assistants	UI only	MCP tool get_window_tree
Code-first SDKs on top	Drag-and-drop canvas	Rust, TypeScript, Python, MCP
Open source license	Proprietary	MIT on GitHub at mediar-ai/terminator

The five-step version of everything above

The app lives in its own process

Windows accessibility is cross-process by design. Every property read on a UI element crosses a COM boundary. This is the root cause of slow automation on Windows.

UIA ships a batching primitive called CacheRequest

You add properties, you set scope, you call find_first_build_cache once. The server walks its own tree locally and returns a serialized snapshot. Documented since Windows 7.

Terminator wraps it in build_tree_with_cache

Seven properties, TreeScope::Subtree, one find_first_build_cache. The function is at crates/terminator/src/platforms/windows/tree_builder.rs line 388.

Every read after that is in-process

get_cached_control_type, get_cached_name, get_cached_bounding_rectangle all read from the snapshot. build_node_from_cached_element walks it recursively in pure Rust.

The agent turns the tree into action

When an AI coding assistant calls the get_window_tree MCP tool, this cached path runs. If caching fails on a weird app, the engine falls back to the recursive path at engine.rs line 3978.

Want automation on Windows that finishes before the agent times out?

Book 20 minutes and we will wire Terminator's cached UIA tree into your editor on a real workflow of your choice.

Frequently asked questions

Why is automation on Windows slower than automation in a browser?

Browser automation does most of its work inside the browser. A script evaluated through the DevTools Protocol reads the DOM locally, so every getBoundingClientRect is an in-process call. Windows UI Automation is the opposite: every target app is a separate process, and every property read (ControlType, Name, BoundingRectangle, IsEnabled) is a COM call across a process boundary. A naive walk of a 245-element window issues roughly 3,675 of those calls, and the cost is real. Terminator measured 6.5 seconds for that shape of tree without caching.

What is the specific optimization Terminator uses?

A single UI Automation CacheRequest with TreeScope::Subtree. The function is build_tree_with_cache in crates/terminator/src/platforms/windows/tree_builder.rs at line 388. It adds seven properties to the cache (ControlType, Name, BoundingRectangle, IsEnabled, IsKeyboardFocusable, HasKeyboardFocus, AutomationId), sets the scope to Subtree (which is Element plus Children plus Descendants, value 7), then calls find_first_build_cache once. After that, every get_cached_control_type, get_cached_name, or get_cached_bounding_rectangle call is a pure in-process lookup with zero COM traffic.

How much does the cached approach actually save?

The comment at the top of build_tree_with_cache in tree_builder.rs is specific: '30-50x faster for large trees (e.g., 6.5s -> 200ms for 245 elements)'. The engine's tree builder tries the cached path first and only falls back to the recursive per-property approach if caching fails. See crates/terminator/src/platforms/windows/engine.rs at line 3966 for the fallback branch.

Why do other Windows automation tools not use CacheRequest?

Most consumer Windows automation tools are not performance-bottlenecked on tree reads because they do not walk the tree. Power Automate Desktop opens a dedicated UIA connection per selector and memoizes the result; AutoHotkey mostly cares about a single control under the cursor; RPA canvases re-scan only the recorded region. Terminator is different because AI agents need the full tree to choose a target, and they need it fast enough that the agent does not time out. That puts the IPC batching problem on the critical path.

Which UIProperty values are in the cache request?

Exactly seven: UIProperty::ControlType, UIProperty::Name, UIProperty::BoundingRectangle, UIProperty::IsEnabled, UIProperty::IsKeyboardFocusable, UIProperty::HasKeyboardFocus, UIProperty::AutomationId. This is the minimum set the tree builder needs to produce a UINode with role, name, bounds, enabled state, focus state, and a stable element ID. Adding more properties to the cache request is cheap; the single IPC call cost scales with element count, not property count.

What is TreeScope::Subtree and why does it matter?

TreeScope is a UIA enum that controls how deep a search runs. Element is value 1 (just this node), Children is 2, Descendants is 4, and Subtree is 7 (the bitwise OR of all three). Terminator sets Subtree on the cache request, which tells UIA to pre-load every descendant of the root window. Without that, get_cached_children would return an empty iterator and every recursion would have to cross the COM boundary again.
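The flag arithmetic is easy to check by hand. The sketch below mirrors the four values named in this answer in a small local type; Scope is a hypothetical stand-in for illustration, not the TreeScope type from the UI Automation crate.

```rust
use std::ops::BitOr;

/// Local mirror of the TreeScope flag values quoted above (illustration only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Scope(u32);

impl Scope {
    const ELEMENT: Scope = Scope(1);
    const CHILDREN: Scope = Scope(2);
    const DESCENDANTS: Scope = Scope(4);
    const SUBTREE: Scope = Scope(1 | 2 | 4);

    /// True when every bit of `other` is also set in `self`.
    fn contains(self, other: Scope) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Scope {
    type Output = Scope;

    fn bitor(self, rhs: Scope) -> Scope {
        Scope(self.0 | rhs.0)
    }
}
```

Combining ELEMENT, CHILDREN and DESCENDANTS with | yields SUBTREE, whose raw value is 7. A scope of CHILDREN alone does not contain DESCENDANTS, which is why a narrower scope leaves deeper levels out of the snapshot.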

Does the cache go stale during a long automation?

Yes, the cache is a snapshot. Once a UI mutation happens (a dialog opens, a control changes value), the cached nodes no longer reflect the live tree. Terminator's action tools rebuild the tree before and after each mutation when you pass ui_diff_before_after:true. The cost is amortized: one IPC call per action, not 15 per element per action.
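To make the before-and-after idea concrete, here is a minimal sketch of comparing two snapshots. It assumes a snapshot can be flattened into a map from automation ID to displayed name; that shape, the Change enum, and the diff and snapshot functions are all hypothetical and are not how Terminator represents or reports its UI diff.

```rust
use std::collections::BTreeMap;

/// Flattened snapshot: automation ID -> displayed name (assumed shape).
type Snapshot = BTreeMap<String, String>;

#[derive(Debug, PartialEq)]
enum Change {
    Added(String),
    Removed(String),
    Renamed { id: String, before: String, after: String },
}

/// Helper that builds a snapshot from (id, name) pairs.
fn snapshot(pairs: &[(&str, &str)]) -> Snapshot {
    pairs
        .iter()
        .map(|(id, name)| (id.to_string(), name.to_string()))
        .collect()
}

/// Reports what changed between the tree captured before an action
/// and the tree captured after it. Both inputs are local data, so the
/// comparison itself involves no cross-process calls.
fn diff(before: &Snapshot, after: &Snapshot) -> Vec<Change> {
    let mut changes = Vec::new();
    for (id, old_name) in before {
        match after.get(id) {
            None => changes.push(Change::Removed(id.clone())),
            Some(new_name) if new_name != old_name => changes.push(Change::Renamed {
                id: id.clone(),
                before: old_name.clone(),
                after: new_name.clone(),
            }),
            Some(_) => {}
        }
    }
    for id in after.keys() {
        if !before.contains_key(id) {
            changes.push(Change::Added(id.clone()));
        }
    }
    changes
}
```

The design point is that each snapshot costs one round trip to capture, and everything after that, including the comparison, is ordinary in-process work.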

What happens if the cache request fails?

The engine falls back to the recursive per-property path. See crates/terminator/src/platforms/windows/engine.rs at line 3978: on Err, the code logs 'Cached approach failed, falling back to recursive' and proceeds with build_ui_node_tree_configurable. The fallback path uses batched children reads with a configurable timeout_per_operation_ms (default 50ms) and yields the CPU every N elements to keep the host responsive.
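The control flow described here is a plain try-then-fall-back on a Result. The sketch below shows that shape only: build_cached and build_recursive are hypothetical stand-ins for build_tree_with_cache and build_ui_node_tree_configurable, the cache_ok flag simulates whether the cache request succeeds, and the log is a Vec rather than a real logger.

```rust
/// Which builder produced the tree (illustration only).
#[derive(Debug, PartialEq)]
enum BuildPath {
    Cached,
    Recursive,
}

#[derive(Debug, PartialEq)]
struct Tree {
    nodes: usize,
    built_by: BuildPath,
}

/// Stand-in for the cached builder; fails when `cache_ok` is false.
fn build_cached(nodes: usize, cache_ok: bool) -> Result<Tree, String> {
    if cache_ok {
        Ok(Tree { nodes, built_by: BuildPath::Cached })
    } else {
        Err("cache request rejected by target app".to_string())
    }
}

/// Stand-in for the slower per-property builder; always succeeds here.
fn build_recursive(nodes: usize) -> Tree {
    Tree { nodes, built_by: BuildPath::Recursive }
}

/// Tries the cached path first and falls back on any error,
/// recording the reason so the slow path is visible in logs.
fn build_tree(nodes: usize, cache_ok: bool, log: &mut Vec<String>) -> Tree {
    match build_cached(nodes, cache_ok) {
        Ok(tree) => tree,
        Err(reason) => {
            log.push(format!(
                "Cached approach failed, falling back to recursive: {}",
                reason
            ));
            build_recursive(nodes)
        }
    }
}
```

Callers get a tree either way; the only observable differences are the log line and how long the call takes.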

Does this matter for short scripts or only for agent workflows?

Both. A one-shot script that reads a single edit field makes one IPC call with or without caching. But any script that needs to find a specific control (searching by role and name, iterating siblings, validating a workflow) is walking the tree. The moment you walk more than 50 elements, the difference between caching and not caching shows up as real wall-clock seconds. For agent workflows where the model inspects the tree on every turn, caching is what makes sub-second turns possible.