Accessibility API desktop automation, without the mouse.
The accessibility API on Windows is not just a read-only inspection surface. UIA also exposes Control Patterns (Invoke, Toggle, ExpandCollapse, Value, Selection) that act on an element without moving the cursor. Terminator's invoke() calls UIInvokePattern.invoke() in nine lines of Rust at element.rs lines 838 to 859. No SendInput. No focus check. No visibility check. That is the real unlock for agent-ready desktop automation.
The half of the accessibility API nobody writes about
Every guide about desktop automation through the accessibility API covers the same slice: UIA succeeded MSAA, it exposes a tree of elements, you walk it with TreeWalker and filter with PropertyCondition. Read, read, read. Microsoft's own overview devotes most of its prose to tree navigation.
The write half is where real automation happens. UIA attaches a Control Pattern to each element that can be acted upon. A button gets InvokePattern. A checkbox gets TogglePattern. A combo box gets ExpandCollapsePattern and ValuePattern. A radio button gets SelectionItemPattern. Each pattern is a COM interface with methods like Invoke, Toggle, Expand, Collapse, SetValue, Select. You call those methods directly and the target app handles the action on its own UI thread.
This is how automation at CPU speed is possible. Not by injecting mouse events faster, but by skipping mouse events altogether.
Two code paths, one verb called "click"
In Terminator there are two methods on a UIElement that look like they do the same thing. They do not. One calls a UIA pattern. The other animates a cursor.
invoke() vs click()
click() is the mouse path: resolve the element's BoundingRectangle, compute a click point in screen coordinates, convert to MOUSEEVENTF_ABSOLUTE's 0..=65535 range, build INPUT structs for MOVE, LEFTDOWN, and LEFTUP, call SendInput, and pray the right window is foreground. The cost:
- Moves the physical cursor
- Requires the element to sit on a monitor
- Depends on Z-order and foreground activation
- Has a race against pointer animations and OS toasts
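The coordinate conversion in that path is mechanical but easy to get subtly wrong. A minimal sketch of the math, assuming the common pixel-to-normalized formula SendInput expects with MOUSEEVENTF_ABSOLUTE (the exact rounding in Terminator's input.rs may differ):

```typescript
// Map a screen pixel to the 0..=65535 normalized range that SendInput
// expects when MOUSEEVENTF_ABSOLUTE is set. Assumption: the common
// (coord * 65535) / (extent - 1) rounding; not copied from input.rs.
function toAbsolute(pixel: number, screenExtent: number): number {
  return Math.round((pixel * 65535) / (screenExtent - 1));
}

// The click point is derived from the element's BoundingRectangle,
// typically its center.
function clickPoint(rect: { x: number; y: number; w: number; h: number }) {
  return { x: rect.x + rect.w / 2, y: rect.y + rect.h / 2 };
}
```

On a 1920-wide monitor, pixel 0 maps to 0 and pixel 1919 maps to 65535; an element that sits off every monitor has no valid mapping at all, which is exactly why click() requires visible bounds and invoke() does not.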
“invoke_pat.invoke() crosses the UIA bridge into the target process and fires the control's default action. The cursor never moves. SendInput is never called.”
element.rs line 856, terminator-rs
The nine lines that matter
The full implementation of invoke() fits on one screen. Open the file on GitHub at mediar-ai/terminator, crates/terminator/src/platforms/windows/element.rs, and jump to line 838.
Three things to notice. First, there is no call to SendInput or any mouse API anywhere in this function. Second, the error path is typed: the substring match at line 848 distinguishes "pattern unavailable" from "COM failure", so callers can catch the first and gracefully fall back to click(). Third, the suggested fallback is named by string literal: "Try using 'click_element' instead." That is the entire fallback protocol.
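That error routing is simple enough to sketch. A minimal illustration of the substring match described above, assuming string-shaped errors from the underlying crate (the function and type names here are illustrative, not Terminator's actual identifiers):

```typescript
type InvokeError =
  | { kind: "UnsupportedOperation"; suggestion: string } // catchable: fall back to click
  | { kind: "PlatformError"; detail: string };           // genuine COM failure: surface it

// Mirrors the branch at element.rs line 848: "pattern unavailable"
// becomes a typed, catchable error with a named fallback; anything
// else is treated as a real platform failure.
function classifyInvokeError(raw: string): InvokeError {
  if (raw.includes("not support") || raw.includes("UIA_E_ELEMENTNOTAVAILABLE")) {
    return { kind: "UnsupportedOperation", suggestion: "Try using 'click_element' instead." };
  }
  return { kind: "PlatformError", detail: raw };
}
```

The point of the typed split is that a caller can branch on `kind` instead of parsing a message: one branch retries with the mouse, the other aborts.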
What click actually does on Windows
For contrast, the other code path is send_mouse_click in input.rs. It is shared between desktop.click_at_coordinates and element.click(). If you have ever wondered why automated clicks sometimes land on the wrong element, this is the function to stare at.
That is every click: three INPUT structs, one SendInput call, and an optional SetCursorPos to restore the cursor afterwards. It works, it is portable across UIA-compliant and legacy surfaces, and it is the right tool when no pattern is available. But it is strictly a fallback.
The pattern hub, one diagram
On the left, the kinds of elements you reach for in a typical workflow. In the middle, Terminator's pattern resolver. On the right, the UIA pattern that actually fires: element type to Control Pattern. The mouse path sits below as a single grayed-out fallback, reserved for elements with no actionable pattern.
What talks to what, in the order it happens
Your call lands in element.rs::invoke. That function asks the uiautomation crate for a UIInvokePattern. The crate proxies to the Windows UIA COM surface, which in turn talks to the target process over the accessibility bridge. The app's own UI thread fires the button's default action. Nothing on your side of the bridge touches SendInput.
That is the entire chain for invoke() against a role:Button.
The five patterns Terminator wires up
UIA exposes more patterns than this (Scroll, RangeValue, Window, Transform, Text). These five are the ones the action dispatcher at element.rs lines 1490 to 1560 resolves by name. Each maps a high-level verb to a specific UIA interface.
UIInvokePattern
Default action for buttons, hyperlinks, menu items. Terminator calls invoke_pat.invoke() at element.rs line 856. No mouse motion, no focus, no visibility check.
UITogglePattern
Checkboxes and switches. Dispatched through perform_action("toggle") at element.rs line 1498. Flips state without a click event.
UIExpandCollapsePattern
Tree items, dropdowns, combo boxes. Dispatched through perform_action("expand_collapse") at element.rs line 1517. Opens the subtree directly.
UISelectionItemPattern
Radio buttons and list items. Use setSelected(true) instead of click(). The llms.txt pitfalls list calls this out explicitly.
UIValuePattern
Edit controls that expose a Value property. typeText routes through ValuePattern.SetValue when available, falling back to SendInput for controls that accept keystrokes only.
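The dispatch itself is a name-to-interface lookup. The real dispatcher at element.rs lines 1490 to 1560 is Rust; the map below is only a sketch of the routing, and the verb strings for selection and value are assumptions, not confirmed Terminator action names:

```typescript
// High-level verb -> the UIA Control Pattern it resolves to.
// "click" is deliberately absent: it routes to SendInput, not a pattern.
const PATTERN_FOR_VERB: Record<string, string> = {
  invoke: "UIInvokePattern",
  toggle: "UITogglePattern",
  expand_collapse: "UIExpandCollapsePattern",
  select: "UISelectionItemPattern",   // assumed verb name
  set_value: "UIValuePattern",        // assumed verb name
};

// A verb with no pattern entry falls through to the SendInput path.
function resolvePattern(verb: string): string | null {
  return PATTERN_FOR_VERB[verb] ?? null;
}
```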
Inside a single invoke() call
Four steps, no mouse
Locate
desktop.locator('process:notepad >> role:Button && name:Save') walks the UIA tree with PropertyCondition and returns a single UIElement. No action yet.
Ask for the pattern
invoke() calls get_pattern::<UIInvokePattern>() on the element. UIA either hands you the pattern interface or returns UIA_E_ELEMENTNOTAVAILABLE.
Fire the pattern
invoke_pat.invoke() crosses the accessibility bridge into the target process and triggers the control's default action on the target's own UI thread.
Fall back only if needed
If the pattern is missing, element.rs:850 returns an UnsupportedOperation error that names click_element as the fallback. That is the only time SendInput enters the picture.
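The four steps above imply a small fallback protocol on the caller's side. A sketch against a minimal element interface (the interface and helper are illustrative; the real SDK surfaces the same flow through invoke(), click(), and the UnsupportedOperation error):

```typescript
interface Actionable {
  invoke(): void; // throws an Error whose message names the failure mode
  click(): void;  // SendInput path, always available
}

// Try the pattern first; fall back to the mouse only on the specific
// "pattern unavailable" error, never on a genuine COM failure.
function invokeOrClick(el: Actionable): "invoked" | "clicked" {
  try {
    el.invoke();
    return "invoked";
  } catch (e) {
    const msg = (e as Error).message;
    if (msg.includes("not support") || msg.includes("UIA_E_ELEMENTNOTAVAILABLE")) {
      el.click();
      return "clicked";
    }
    throw e; // real platform failure: let it surface
  }
}
```

The discipline worth copying is the rethrow: swallowing every error into a click retry would turn COM disconnects into mystery misclicks.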
Numbers from the source
The number that matters is 0 mouse events. Pattern invocation is the reason Terminator can claim background execution without lying: your cursor does not jump, your focused window does not steal input, and your streaming screen share does not show the automation flailing.
Same selector, different verbs
The Node.js surface mirrors the Rust surface one-to-one: the same locator resolves an element, and the same verbs (invoke, toggle, setSelected, typeText) hang off it. The selector strings are identical to what the MCP agent passes, so anything you test from the SDK works the same from a Claude Code or Cursor tool call.
invoke() vs click(), side by side
| Feature | click() — SendInput | invoke() — UIA pattern |
|---|---|---|
| how a button press is fired | SendInput with MOUSEEVENTF_LEFTDOWN / LEFTUP | UIInvokePattern.invoke() through the UIA surface |
| does the cursor move | yes, cursor is warped to the click point | no, cursor stays exactly where the human left it |
| requires the window to be foreground | typically yes, so input routing hits the target | no, pattern fires inside the target process |
| requires the element to be visible on screen | yes, bounds must land on a monitor | no, pattern is position-independent |
| deterministic outcome | depends on focus, Z-order, animation state | pattern fires or the element does not support it |
| fallback when the pattern is not available | no fallback, SendInput is the primitive | typed error at element.rs:850, suggests click_element |
| works from JS, Python, Rust, and MCP | you wire SendInput yourself per binding | one Rust core, NAPI-RS and PyO3 bindings ship it |
The fallback ladder, in order
1. Resolve the element and ask for the UIA pattern that matches the verb.
2. If the pattern is missing, catch the UnsupportedOperation error from element.rs:850.
3. Fall back to click(), the SendInput path against the element's bounds.
4. As a last resort, click_at_coordinates for surfaces with no accessible element at all.
Six rules for picking the verb
- If the element is a button, hyperlink, or menu item, reach for invoke() first. It resolves UIInvokePattern and bypasses SendInput.
- If the element is a checkbox, switch, or toggle button, use toggle(). Click works, but the pattern is the deterministic path. (For edit controls that expose ValuePattern, typeText is the equivalent write path.)
- If the element is a radio button or list item, use setSelected(true). This routes through SelectionItemPattern.Select and avoids the label-click ambiguity.
- If the element is a combo box, tree item, or expander, use perform_action('expand_collapse'). The subtree opens without scrolling or focusing.
- Only fall back to click() when get_pattern returns UIA_E_ELEMENTNOTAVAILABLE or 'not support'. Terminator surfaces that error at element.rs:850 with a direct suggestion to switch to click_element.
- Reserve click_at_coordinates for true pixel cases (native overlays, games, old Win32 apps with no accessible surface). Every other path should resolve an element and call a pattern.
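The six rules collapse into a small lookup. A sketch, assuming the role names from the selector grammar used above; this restates the rules, it is not Terminator code:

```typescript
// First-choice verb per UIA role, per the rules above.
// Roles not listed here fall through to click().
const VERB_FOR_ROLE: Record<string, string> = {
  Button: "invoke",
  Hyperlink: "invoke",
  MenuItem: "invoke",
  CheckBox: "toggle",
  RadioButton: "setSelected",
  ListItem: "setSelected",
  ComboBox: "expand_collapse",
  TreeItem: "expand_collapse",
  Edit: "typeText",
};

function pickVerb(role: string): string {
  return VERB_FOR_ROLE[role] ?? "click"; // last resort: SendInput
}
```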
Why this matters for an AI coding agent
A Claude Code or Cursor agent running against a desktop cannot afford to hijack the human's cursor every time it wants to press Save. Pattern invocation makes the agent polite: the human keeps the mouse, the foreground window stays foreground, and the automation runs in parallel with the human's keyboard. It also makes the agent reliable: each action has one of two outcomes, not a spectrum of "clicked somewhere".
One install: claude mcp add terminator "npx -y terminator-mcp-agent@latest". The MCP tool calls route straight through element.rs:838.
Where Control Patterns show up in the wild
UIA Control Patterns are the same primitive every other accessibility-aware tool consumes. The pattern-first approach is not a Terminator invention; what is unique is wrapping it in a Playwright-shaped locator grammar and shipping it through MCP.
Trying to wire an agent into a desktop without kidnapping the cursor?
Book 20 minutes with the maintainers. We will walk through picking patterns over clicks for the controls in your app.
Frequently asked questions
What does the accessibility API actually let you do besides inspect elements?
Two things. First, it exposes a tree of UIElement nodes with role, name, AutomationId, BoundingRectangle, and other semantic properties. Second, it exposes Control Patterns (Invoke, Toggle, ExpandCollapse, Value, Selection, Scroll, RangeValue, Window, Transform, Text) that represent what you can actually do to each element. Calling a pattern is a write operation. It fires inside the target process, talks to the app's UI thread through the accessibility bridge, and returns without ever generating a WM_MOUSEMOVE or a SendInput call. Tree enumeration is the read path. Patterns are the write path.
What is the difference between calling invoke() and calling click() in Terminator?
invoke() calls `get_pattern::<patterns::UIInvokePattern>()` on the element and then `invoke_pat.invoke()`. No cursor motion, no focus check, no visibility check, no monitor bounds math. The pattern runs inside the UIA surface in the target process. click() resolves the element's bounding rectangle, computes a click point in absolute screen coordinates, converts to normalized (0 to 65535) coordinates, and calls SendInput with MOUSEEVENTF_ABSOLUTE, MOUSEEVENTF_MOVE, MOUSEEVENTF_LEFTDOWN, and MOUSEEVENTF_LEFTUP. That requires the element to be visible on a monitor and the window to accept foreground input. Both methods exist because not every control exposes InvokePattern; for the ones that do, invoke() is faster, quieter, and does not fight the user's cursor.
Where is the actual implementation I can read?
crates/terminator/src/platforms/windows/element.rs lines 838 to 859. Nine lines of Rust: grab the UIInvokePattern from the element, branch on two distinct error modes ('not support' and 'UIA_E_ELEMENTNOTAVAILABLE' become an UnsupportedOperation error pointing the caller at click_element, everything else becomes a PlatformError), then call invoke_pat.invoke(). The mouse path lives in crates/terminator/src/platforms/windows/input.rs starting at line 38 in the send_mouse_click function. That function is the entire truth of what a click means on Windows.
Which UIA Control Patterns does Terminator wire up?
The action dispatcher at element.rs lines 1490 to 1560 handles 'invoke' via UIInvokePattern, 'toggle' via UITogglePattern (for checkboxes and switches), 'expand_collapse' via UIExpandCollapsePattern (for tree items and dropdowns), and the standard 'click', 'double_click', 'right_click' which fall back to SendInput. setSelected for radio buttons and list items uses SelectionItemPattern. typeText uses ValuePattern when the element is a real edit control, otherwise falls back to keyboard simulation via SendInput. Pattern-first, input-second.
Why do radio buttons and checkboxes sometimes not register on click()?
Because a radio button in UIA is backed by SelectionItemPattern, not InvokePattern. Sending a mouse LEFTDOWN/LEFTUP over its bounding rectangle is ambiguous: Windows might route the click to the label, to the control's hit-test region, or to a parent group box that swallows the event. The deterministic path is to call setSelected(true) which resolves SelectionItemPattern.Select() inside the target process. The ambiguity never comes up. Same story for checkboxes and TogglePattern. This is documented directly in the llms.txt pitfalls list: 'Radio button clicks do not register: use setSelected(true) instead of click()'.
Does this work while my cursor is doing other work?
Yes. Pattern invocation does not touch the cursor. Terminator markets this as 'does not take over your cursor or keyboard', and that statement is literally true for every action that resolves to a UIA pattern: invoke(), toggle(), expand/collapse, setSelected, and typeText against a ValuePattern edit control. It is only true for click() when the target window can be activated in the background without SetForegroundWindow; for foreground-dependent apps, Terminator still restores the cursor position if you pass restore_cursor=true to send_mouse_click (input.rs lines 45 through 50).
What happens when an element does not support the pattern I asked for?
You get a typed error with a specific fallback suggestion. element.rs line 848 detects the substring 'not support' or 'UIA_E_ELEMENTNOTAVAILABLE' in the underlying uiautomation crate error and rewrites it into AutomationError::UnsupportedOperation with the message 'Element does not support InvokePattern. This typically happens with custom controls, groups, or non-standard buttons. Try using click_element instead.' The ExpandCollapse and Toggle paths do the same routing. That means your agent can catch the error type and fall back to click() without parsing a stack trace.
Is this Windows only, or does it cover macOS and Linux?
Terminator ships Windows UIA today. The core trait AccessibilityEngine in platforms/mod.rs is designed to accept future adapters, but the mod.rs file contains `#[cfg(not(target_os = "windows"))] compile_error!("Terminator only supports Windows. Linux and macOS are not supported.");`. That is the current truth. macOS has AXUIElement with its own AXPress and AXActions surface that maps to the same idea (fire an action, skip the mouse); Linux has AT-SPI2 with Action interfaces. The pattern-first approach is portable in concept, and adapters for the other platforms are a roadmap item.
How does an MCP-based AI agent benefit from this?
Three ways. First, background execution: the agent can act on elements in a window that is not focused, without stealing the cursor from the human. Second, determinism: pattern invocation has a binary outcome (pattern fired or element does not support it) where SendInput has a continuous outcome (depends on where the cursor landed, which window has focus, whether an OS notification popped up mid-click). Third, speed: no animation, no debouncing, no 'wait for the cursor to arrive' delay. In practice that means an agent can execute hundreds of UI actions per second instead of one every few hundred ms, which is how Terminator claims >95% success rate at CPU speed rather than LLM-inference speed.
What is the shortest way to try this?
`npm install @mediar-ai/terminator`, then `const desktop = new Desktop(); await desktop.locator('process:notepad >> role:Edit').first(3000).then(el => el.typeText('hi'));`. Or hook it straight into Claude Code as an MCP server with `claude mcp add terminator "npx -y terminator-mcp-agent@latest"`. Both paths land on the same Rust core and the same element.rs:838 invoke() implementation.
Other Terminator pieces about driving UIA from code
Keep reading
Microsoft UI Automation has no spatial selectors
The companion piece about the read side of UIA. Terminator layers rightof, leftof, above, below, near over BoundingRectangle in 82 lines.
What is UI Automation, from the agent's perspective
Tree enumeration, PropertyCondition, and Control Patterns, explained in the order you actually need them when building an AI-driven desktop agent.
Claude computer use vs pattern-first desktop automation
Screenshot-driven agents vs accessibility-tree-driven agents. Why the latter hits >95% success while the former plateaus at pixel reliability.