Accessibility API desktop automation, without the mouse.

The accessibility API on Windows is not just a read-only inspection surface. UIA also exposes Control Patterns (Invoke, Toggle, ExpandCollapse, Value, Selection) that act on an element without moving the cursor. Terminator's invoke() calls UIInvokePattern.invoke() in nine lines of Rust at element.rs lines 838 to 859. No SendInput. No focus check. No visibility check. That is the real unlock for agent-ready desktop automation.

UIInvokePattern · UITogglePattern · UIExpandCollapsePattern · UIValuePattern · UISelectionItemPattern
Matthew Diakonov
10 min read
  • Invoke pattern fires inside the target process, no cursor motion
  • Five UIA Control Patterns wired into one locator grammar
  • Typed fallback error at element.rs:850 when the pattern is missing
  • MIT-licensed Rust, with NAPI, PyO3, and MCP surfaces on top

The half of the accessibility API nobody writes about

Every guide about desktop automation through the accessibility API covers the same slice: UIA succeeded MSAA, it exposes a tree of elements, you walk it with TreeWalker and filter with PropertyCondition. Read, read, read. Microsoft's own overview devotes most of its prose to tree navigation.

The write half is where real automation happens. UIA attaches a Control Pattern to each element that can be acted upon. A button gets InvokePattern. A checkbox gets TogglePattern. A combo box gets ExpandCollapsePattern and ValuePattern. A radio button gets SelectionItemPattern. Each pattern is a COM interface with methods like Invoke, Toggle, Expand, Collapse, SetValue, Select. You call those methods directly and the target app handles the action on its own UI thread.

This is how automation at CPU speed is possible. Not by injecting mouse events faster, but by skipping mouse events altogether.

Two code paths, one verb called "click"

In Terminator there are two methods on a UIElement that look like they do the same thing. They do not. One calls a UIA pattern. The other animates a cursor.

invoke() vs click()

click() is the SendInput path. Resolve the element's BoundingRectangle. Compute a click point in screen coordinates. Convert to MOUSEEVENTF_ABSOLUTE's 0..=65535 range. Build INPUT structs for MOVE, LEFTDOWN, LEFTUP. Call SendInput. Pray the right window is foreground.

  • Moves the physical cursor
  • Requires the element to sit on a monitor
  • Depends on Z-order and foreground activation
  • Has a race against pointer animations and OS toasts

invoke_pat.invoke() crosses the UIA bridge into the target process and fires the control's default action. The cursor never moves. SendInput is never called.

element.rs line 856, terminator-rs

The nine lines that matter

Here is the full implementation of invoke(). Every single line is in this block. Open the file on GitHub at mediar-ai/terminator, crates/terminator/src/platforms/windows/element.rs, and jump to line 838.

crates/terminator/src/platforms/windows/element.rs

Three things to notice. First, there is no call to SendInput or any mouse API anywhere in this function. Second, the error path is typed: the substring match at line 848 distinguishes "pattern unavailable" from "COM failure", so callers can catch the first and gracefully fall back to click(). Third, the suggested fallback is named by string literal: "Try using 'click_element' instead." That is the entire fallback protocol.
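That fallback protocol reads naturally as a try/catch. Here is a minimal TypeScript sketch with local stand-in types; UnsupportedOperationError and Pressable are illustrative names, not the SDK's exported API:

```typescript
// Stand-in for the typed error the Rust core surfaces through its bindings.
class UnsupportedOperationError extends Error {}

interface Pressable {
  invoke(): Promise<void>; // pattern path: no cursor, no focus, no visibility check
  click(): Promise<void>;  // SendInput path: the fallback the error message names
}

async function press(el: Pressable): Promise<"invoked" | "clicked"> {
  try {
    await el.invoke();
    return "invoked";
  } catch (e) {
    if (e instanceof UnsupportedOperationError) {
      await el.click(); // the only rung where SendInput enters the picture
      return "clicked";
    }
    throw e; // COM failures and everything else propagate unchanged
  }
}
```

The point of the typed error is exactly this: an agent branches on the error type, not on a stack trace.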

What click actually does on Windows

For contrast, here is the other code path. This is send_mouse_click in input.rs. It is shared between desktop.click_at_coordinates and element.click(). If you have ever wondered why automated clicks sometimes land on the wrong element, this is the function to stare at.

crates/terminator/src/platforms/windows/input.rs

That is every click: three INPUT structs, one SendInput call, and an optional SetCursorPos to restore the cursor afterwards. It works, it is portable across UIA-compliant and legacy surfaces, and it is the right tool when no pattern is available. But it is strictly a fallback.
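The MOUSEEVENTF_ABSOLUTE conversion is the easiest part of that path to get wrong. A sketch of the normalization, assuming the common formulation that maps pixel 0 to 0 and the last pixel to 65535 (the exact divisor varies between implementations, so treat this as illustrative):

```typescript
// Normalize a screen-pixel coordinate into the 0..=65535 range that
// SendInput expects when MOUSEEVENTF_ABSOLUTE is set.
function toAbsolute(
  x: number, y: number,
  screenW: number, screenH: number,
): [number, number] {
  // 0 is the left/top edge, 65535 the right/bottom edge of the display.
  const nx = Math.round((x * 65535) / (screenW - 1));
  const ny = Math.round((y * 65535) / (screenH - 1));
  return [nx, ny];
}
```

Note that the input is a pixel the automation *computed* from a bounding rectangle; if the window moved between resolution and SendInput, the normalized point is already stale. The pattern path has no equivalent failure mode.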

The pattern hub, one diagram

On the left, the kinds of elements you reach for in a typical workflow. In the middle, Terminator's pattern resolver. On the right, the UIA pattern that actually fires. The mouse path sits below as a single grayed-out fallback, reserved for elements with no actionable pattern.

element type to Control Pattern

  • role:Button → InvokePattern
  • role:CheckBox → TogglePattern
  • role:ComboBox → ExpandCollapsePattern
  • role:RadioButton → SelectionItemPattern
  • role:Edit → ValuePattern

What talks to what, in the order it happens

Your call lands in element.rs::invoke. That function asks the uiautomation crate for a UIInvokePattern. The crate proxies to the Windows UIA COM surface, which in turn talks to the target process over the accessibility bridge. The app's own UI thread fires the button's default action. Nothing on your side of the bridge touches SendInput.

invoke() against a role:Button

Participants, left to right: your code, element.rs, the uiautomation crate, the UIA COM surface, the target app.

1. Your code calls element.invoke().
2. element.rs asks the uiautomation crate for the pattern: get_pattern::<UIInvokePattern>().
3. The crate calls GetCurrentPattern(UIA_InvokePatternId) on the UIA COM surface.
4. UIA COM returns an IUIAutomationInvokePattern*, surfaced as a UIInvokePattern handle.
5. element.rs calls invoke_pat.invoke(), which routes to IUIAutomationInvokePattern::Invoke.
6. The target app raises the control's default action on its own UI thread.
7. Action completed; Ok(()) propagates back to your code.

The five patterns Terminator wires up

UIA exposes more patterns than this (Scroll, RangeValue, Window, Transform, Text). These five are the ones the action dispatcher at element.rs lines 1490 to 1560 resolves by name. Each maps a high-level verb to a specific UIA interface.

UIInvokePattern

Default action for buttons, hyperlinks, menu items. Terminator calls invoke_pat.invoke() at element.rs line 856. No mouse motion, no focus, no visibility check.

UITogglePattern

Checkboxes and switches. Dispatched through perform_action("toggle") at element.rs line 1498. Flips state without a click event.

UIExpandCollapsePattern

Tree items, dropdowns, combo boxes. Dispatched through perform_action("expand_collapse") at element.rs line 1517. Opens the subtree directly.

UISelectionItemPattern

Radio buttons and list items. Use setSelected(true) instead of click(). The llms.txt pitfalls list calls this out explicitly.

UIValuePattern

Edit controls that expose a Value property. typeText routes through ValuePattern.SetValue when available, falling back to SendInput for controls that accept keystrokes only.
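That routing can be sketched in a few lines. TextTarget and its members are illustrative stand-ins, not the SDK's API:

```typescript
// Pattern-first text entry: write through the value surface when the control
// exposes one, otherwise fall back to simulated keystrokes.
interface TextTarget {
  setValue?: (text: string) => void; // present iff the control exposes ValuePattern
  sendKeys: (text: string) => void;  // keystroke simulation (SendInput on Windows)
}

function typeTextSketch(el: TextTarget, text: string): "value" | "keys" {
  if (el.setValue) {
    el.setValue(text); // fires inside the target process; no focus dance
    return "value";
  }
  el.sendKeys(text);   // last resort for keystroke-only controls
  return "keys";
}
```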

Inside a single invoke() call

Four steps, no mouse

1. Locate: desktop.locator('process:notepad >> role:Button && name:Save') walks the UIA tree with PropertyCondition and returns a single UIElement. No action yet.

2. Ask for the pattern: invoke() calls get_pattern::<UIInvokePattern>() on the element. UIA either hands you the pattern interface or returns UIA_E_ELEMENTNOTAVAILABLE.

3. Fire the pattern: invoke_pat.invoke() crosses the accessibility bridge into the target process and triggers the control's default action on the target's own UI thread.

4. Fall back only if needed: if the pattern is missing, element.rs:850 returns an UnsupportedOperation error that names click_element as the fallback. That is the only time SendInput enters the picture.

Numbers from the source

  • 9 lines of Rust for invoke() at element.rs:838-859
  • 5 UIA patterns wired into the action dispatcher
  • >95% success rate Terminator cites in its README
  • 0 mouse events fired by invoke()

0 mouse events is the one that matters. Pattern invocation is the reason Terminator can claim background execution without lying: your cursor does not jump, your focused window does not steal input, your streaming screen share does not show the automation flailing.

Same selector, different verbs

The Node.js surface mirrors the Rust surface one-to-one. Here is what picking the right verb looks like in practice. The selector strings are identical to what the MCP agent passes, so anything you test from the SDK works the same from a Claude Code or Cursor tool call.

example.ts
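The original example.ts is not reproduced here, so the sketch below rebuilds the idea with local stubs that only record calls. The selector grammar and verb names (locator, first, invoke, setSelected, typeText) are the ones this article uses; StubDesktop, the Landscape radio button, and the recorded log are illustrative inventions. With the real SDK you would import Desktop from @mediar-ai/terminator instead of stubbing it:

```typescript
// Element surface, shaped like the verbs named in this article.
interface UIElement {
  invoke(): Promise<void>;                 // UIInvokePattern
  setSelected(on: boolean): Promise<void>; // UISelectionItemPattern
  typeText(text: string): Promise<void>;   // UIValuePattern (or keystroke fallback)
}

// Stand-in desktop that records which verb ran against which selector.
class StubDesktop {
  log: string[] = [];
  locator(selector: string) {
    const push = (verb: string) => { this.log.push(`${verb} ${selector}`); };
    const el: UIElement = {
      invoke: async () => push("invoke"),
      setSelected: async () => push("setSelected"),
      typeText: async (t) => push(`typeText(${t})`),
    };
    return { first: async (_timeoutMs: number) => el };
  }
}

async function run(desktop: StubDesktop) {
  // Same selector grammar, different verb per role:
  const save = await desktop.locator("process:notepad >> role:Button && name:Save").first(3000);
  await save.invoke();                     // pattern, not a click

  const radio = await desktop.locator("role:RadioButton && name:Landscape").first(3000);
  await radio.setSelected(true);           // SelectionItemPattern, not a click

  const edit = await desktop.locator("role:Edit").first(3000);
  await edit.typeText("hello from the pattern path"); // ValuePattern first
}
```

Because the MCP agent passes the same selector strings, anything verified this way from the SDK carries over to a tool call unchanged.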

What a run looks like

terminal

invoke() vs click(), side by side

  • How a button press is fired: click() sends MOUSEEVENTF_LEFTDOWN / LEFTUP through SendInput; invoke() calls UIInvokePattern.invoke() through the UIA surface.
  • Cursor movement: click() warps the cursor to the click point; invoke() leaves it exactly where the human left it.
  • Foreground requirement: click() typically needs the window foreground for input routing to hit; invoke() fires inside the target process regardless.
  • Visibility requirement: click() needs the bounds to land on a monitor; invoke() is position-independent.
  • Determinism: click() depends on focus, Z-order, and animation state; invoke() either fires or reports that the element does not support the pattern.
  • Fallback when the pattern is unavailable: click() has none, SendInput is the primitive; invoke() returns a typed error at element.rs:850 that suggests click_element.
  • Bindings: click() means wiring SendInput yourself per binding; invoke() ships from one Rust core through NAPI-RS and PyO3, plus MCP.

The fallback ladder, in order

Six rules for picking the verb

  • If the element is a button, hyperlink, or menu item, reach for invoke() first. It resolves UIInvokePattern and bypasses SendInput.
  • If the element is a checkbox, switch, or toggle button, use toggle (or typeText for a value-pattern edit control). Click works, but the pattern is the deterministic path.
  • If the element is a radio button or list item, use setSelected(true). This routes through SelectionItemPattern.Select and avoids the label-click ambiguity.
  • If the element is a combo box, tree item, or expander, use perform_action('expand_collapse'). The subtree opens without scrolling or focusing.
  • Only fall back to click() when get_pattern returns UIA_E_ELEMENTNOTAVAILABLE or 'not support'. Terminator surfaces that error at element.rs:850 with a direct suggestion to switch to click_element.
  • Reserve click_at_coordinates for true pixel cases (native overlays, games, old Win32 apps with no accessible surface). Every other path should resolve an element and call a pattern.
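The six rules collapse into one decision function. Everything below (the role names, the hasPattern probe) is an illustrative stand-in, not the SDK's API:

```typescript
// Pick the pattern verb for an element's role, falling back to click()
// only when no actionable pattern is available (rule five).
function chooseVerb(role: string, hasPattern: (p: string) => boolean): string {
  if (["Button", "Hyperlink", "MenuItem"].includes(role) && hasPattern("Invoke"))
    return "invoke";
  if (["CheckBox", "Switch", "ToggleButton"].includes(role) && hasPattern("Toggle"))
    return "toggle";
  if (["RadioButton", "ListItem"].includes(role) && hasPattern("SelectionItem"))
    return "setSelected";
  if (["ComboBox", "TreeItem", "Expander"].includes(role) && hasPattern("ExpandCollapse"))
    return "expand_collapse";
  return "click"; // no actionable pattern: the SendInput fallback
}
```

Note the availability probe: a role alone is not enough, because a custom control can report role:Button without exposing InvokePattern, which is exactly the case the typed error at element.rs:850 covers.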

Why this matters for an AI coding agent

A Claude Code or Cursor agent running against a desktop cannot afford to hijack the human's cursor every time it wants to press Save. Pattern invocation makes the agent polite: the human keeps the mouse, the foreground window stays foreground, and the automation runs in parallel with the human's keyboard. It also makes the agent reliable: each action has one of two outcomes, not a spectrum of "clicked somewhere".

One install: claude mcp add terminator "npx -y terminator-mcp-agent@latest". The MCP tool calls route straight through element.rs:838.

Where Control Patterns show up in the wild

UIA Control Patterns are the same primitive every other accessibility-aware tool consumes. The pattern-first approach is not a Terminator invention; what is unique is wrapping it in a Playwright-shaped locator grammar and shipping it through MCP.

Inspect.exe · AccEvent · FlaUI · FlaUInspect · pywinauto · Python-UIAutomation-for-Windows · WinAppDriver · Appium Windows Driver · UIAutomationClient (.NET) · screen readers (NVDA, Narrator) · Terminator

Trying to wire an agent into a desktop without kidnapping the cursor?

Book 20 minutes with the maintainers. We will walk through picking patterns over clicks for the controls in your app.

Frequently asked questions

What does the accessibility API actually let you do besides inspect elements?

Two things. First, it exposes a tree of UIElement nodes with role, name, AutomationId, BoundingRectangle, and other semantic properties. Second, it exposes Control Patterns (Invoke, Toggle, ExpandCollapse, Value, Selection, Scroll, RangeValue, Window, Transform, Text) that represent what you can actually do to each element. Calling a pattern is a write operation. It fires inside the target process, talks to the app's UI thread through the accessibility bridge, and returns without ever generating a WM_MOUSEMOVE or a SendInput call. Tree enumeration is the read path. Patterns are the write path.

What is the difference between calling invoke() and calling click() in Terminator?

invoke() calls `get_pattern::<patterns::UIInvokePattern>()` on the element and then `invoke_pat.invoke()`. No cursor motion, no focus check, no visibility check, no monitor bounds math. The pattern runs inside the UIA surface in the target process. click() resolves the element's bounding rectangle, computes a click point in absolute screen coordinates, converts to normalized (0 to 65535) coordinates, and calls SendInput with MOUSEEVENTF_ABSOLUTE, MOUSEEVENTF_MOVE, MOUSEEVENTF_LEFTDOWN, and MOUSEEVENTF_LEFTUP. That requires the element to be visible on a monitor and the window to accept foreground input. Both methods exist because not every control exposes InvokePattern; for the ones that do, invoke() is faster, quieter, and does not fight the user's cursor.

Where is the actual implementation I can read?

crates/terminator/src/platforms/windows/element.rs lines 838 to 859. Nine lines of Rust: grab the UIInvokePattern from the element, branch on two distinct error modes ('not support' and 'UIA_E_ELEMENTNOTAVAILABLE' become an UnsupportedOperation error pointing the caller at click_element, everything else becomes a PlatformError), then call invoke_pat.invoke(). The mouse path lives in crates/terminator/src/platforms/windows/input.rs starting at line 38 in the send_mouse_click function. That function is the entire truth of what a click means on Windows.

Which UIA Control Patterns does Terminator wire up?

The action dispatcher at element.rs lines 1490 to 1560 handles 'invoke' via UIInvokePattern, 'toggle' via UITogglePattern (for checkboxes and switches), 'expand_collapse' via UIExpandCollapsePattern (for tree items and dropdowns), and the standard 'click', 'double_click', 'right_click' which fall back to SendInput. setSelected for radio buttons and list items uses SelectionItemPattern. typeText uses ValuePattern when the element is a real edit control, otherwise falls back to keyboard simulation via SendInput. Pattern-first, input-second.

Why do radio buttons and checkboxes sometimes not register on click()?

Because a radio button in UIA is backed by SelectionItemPattern, not InvokePattern. Sending a mouse LEFTDOWN/LEFTUP over its bounding rectangle is ambiguous: Windows might route the click to the label, to the control's hit-test region, or to a parent group box that swallows the event. The deterministic path is to call setSelected(true) which resolves SelectionItemPattern.Select() inside the target process. The ambiguity never comes up. Same story for checkboxes and TogglePattern. This is documented directly in the llms.txt pitfalls list: 'Radio button clicks do not register: use setSelected(true) instead of click()'.

Does this work while my cursor is doing other work?

Yes. Pattern invocation does not touch the cursor. Terminator markets this as 'does not take over your cursor or keyboard', and that statement is literally true for every action that resolves to a UIA pattern: invoke(), toggle(), expand/collapse, setSelected, and typeText against a ValuePattern edit control. It is only true for click() when the target window can be activated in the background without SetForegroundWindow; for foreground-dependent apps, Terminator still restores cursor position if you pass restore_cursor=true to send_mouse_click (input.rs line 45 through 50).

What happens when an element does not support the pattern I asked for?

You get a typed error with a specific fallback suggestion. element.rs line 848 detects the substring 'not support' or 'UIA_E_ELEMENTNOTAVAILABLE' in the underlying uiautomation crate error and rewrites it into AutomationError::UnsupportedOperation with the message 'Element does not support InvokePattern. This typically happens with custom controls, groups, or non-standard buttons. Try using click_element instead.' The ExpandCollapse and Toggle paths do the same routing. That means your agent can catch the error type and fall back to click() without parsing a stack trace.

Is this Windows only, or does it cover macOS and Linux?

Terminator ships Windows UIA today. The core trait AccessibilityEngine in platforms/mod.rs is designed to accept future adapters, but the mod.rs file contains `#[cfg(not(target_os = "windows"))] compile_error!("Terminator only supports Windows. Linux and macOS are not supported.");`. That is the current truth. macOS has AXUIElement with its own AXPress and AXActions surface that maps to the same idea (fire an action, skip the mouse); Linux has AT-SPI2 with Action interfaces. The pattern-first approach is portable in concept, and adapters for the other platforms are a roadmap item.

How does an MCP-based AI agent benefit from this?

Three ways. First, background execution: the agent can act on elements in a window that is not focused, without stealing the cursor from the human. Second, determinism: pattern invocation has a binary outcome (pattern fired or element does not support it) where SendInput has a continuous outcome (depends on where the cursor landed, which window has focus, whether an OS notification popped up mid-click). Third, speed: no animation, no debouncing, no 'wait for the cursor to arrive' delay. In practice that means an agent can execute hundreds of UI actions per second instead of one every few hundred ms, which is how Terminator claims >95% success rate at CPU speed rather than LLM-inference speed.

What is the shortest way to try this?

`npm install @mediar-ai/terminator`, then `const desktop = new Desktop(); await desktop.locator('process:notepad >> role:Edit').first(3000).then(el => el.typeText('hi'));`. Or hook it straight into Claude Code as an MCP server with `claude mcp add terminator "npx -y terminator-mcp-agent@latest"`. Both paths land on the same Rust core and the same element.rs:838 invoke() implementation.

terminator · Desktop automation SDK
© 2026 terminator. All rights reserved.