Task automation for Windows that knows what "right of the Email field" means
Every other Windows task automation tool makes you choose. Read the accessibility tree by role and name, or drop to pixel coordinates when the tree does not help. Terminator adds a third option. You write spatial selectors, and the Windows backend walks the UIA tree, reads bounding boxes, and returns the element that sits where a human would point.
The gap the top SERP results all leave open
Search for task automation for Windows and the top ten results will sort the same short list of tools in slightly different orders. Windows Task Scheduler for cron. AutoHotkey for keystrokes. Power Automate Desktop for the low-code canvas. RoboTask and Automation Workshop for the no-code alternative. Every review compares them on the same axes: triggers, actions, licensing, UI.
None of them describe how any of these tools actually finds a control on the screen. That matters because the bulk of real desktop automation failure is selector failure. The button you wanted got renamed. The form got a new field. The third row grew a second Browse... button. Every recorder breaks on that and every coordinate script breaks on a DPI change.
Terminator's answer is a selector grammar that combines accessibility metadata with geometry. The most distinctive piece of that grammar is five spatial prefixes that nobody else in the SERP mentions.
“const NEAR_THRESHOLD: f64 = 50.0;”
crates/terminator/src/platforms/windows/engine.rs line 1815
Five prefixes, one grammar
Each spatial prefix wraps an anchor selector. The parser at crates/terminator/src/selector.rs maps the prefix onto a variant of the Selector enum. The anchor can be any other selector, including combinators like &&, ||, and process scope.
rightof:
Returns elements whose left edge is at or past the anchor's right edge, sharing vertical overlap with the anchor. Sorted by distance from the anchor center. Ideal for 'the input right of the Email label'.
leftof:
Symmetric to rightof:. Candidate's right edge at or before the anchor's left edge, with vertical overlap. Use for breadcrumb selectors and back/forward pair buttons.
above:
Candidate sits above the anchor with horizontal overlap. Use to find the header label that describes the focused input, or the toolbar button stacked directly on top of a cell.
below:
Candidate sits below the anchor with horizontal overlap. Use to walk down a column of rows under a header, or to find the dropdown list that opens under a combobox.
near:
Candidate is within 50 pixels of the anchor center by Euclidean distance. The only spatial selector with a numeric threshold. Use when direction is unimportant and proximity is the whole signal.
Composable with the rest of the grammar
The inner selector of a spatial prefix can be any selector: role, name, id, chain, and, or, not. Spatial selectors chain with process:, >>, and the rest of the language without special syntax.
How a spatial selector resolves
The call flow for a single rightof:name:Email lookup. The SDK posts one selector. Under the hood the Windows engine does four things, each in a specific order.
rightof:name:Email under the hood
The selector grammar, verbatim
The parser is nineteen lines of pattern matching on lowercase prefix strings. Every entry produces a typed variant of Selector, so the rest of the engine does not have to do string sniffing later.
The geometry filter, line by line
Inside the Windows engine, the filter is a single match on the selector variant. Each arm is a pair of inequalities. The near: arm is the only one with a numeric constant, and that constant is the anchor fact of this whole page.
The overlap rule is why rightof: does not return the wrong row
A plain "candidate is to the right of anchor" test would return every element on the screen whose x coordinate is larger than the anchor's, including buttons four rows down the form. That is useless. The fix is two lines of geometry.
vertical_overlap is defined as candidate_top < anchor_bottom && candidate_bottom > anchor_top. The candidate's vertical range must overlap the anchor's. That is the interval-overlap test, applied to y coordinates. Only when that passes do we check whether the candidate's left edge sits at or past the anchor's right edge.
The horizontal overlap rule is the mirror image, used by above: and below:. Together the two rules let you say "the label directly above this input" without having to measure coordinates yourself.
Anchor, filter, rank, return
An animated beam diagram of the four stages. Inputs flow from the SDK call into the Windows engine hub, and the hub emits results.
What the code looks like
The spatial selectors live in the same string grammar you already use for role, name, and id. That means every SDK (TypeScript, Python, Rust) and every MCP tool call inherits them automatically.
Where every prefix earns its keep
Real workflows that collapse to one spatial selector
- Fill the Email field in an Outlook compose window by anchoring on the To label
- Click the Browse button on the right Log directory row of a settings grid
- Toggle the radio button that sits within 50 pixels of its descriptive label
- Read the value displayed directly below the Total header on a report grid
- Select the dropdown entry that opens beneath the currency combo box
- Target the header label sitting above the focused cell in a spreadsheet
Spatial prefixes in the Terminator selector grammar. Each compiles to one variant of the Selector enum.
The hardcoded NEAR_THRESHOLD. Tight enough to reject distant siblings, loose enough to catch adjacent labels.
Tree depth the candidate scan walks before the geometric filter runs, set in the spatial-selector arm of find_elements.
Why this beats coordinates, not just competes with them
Coordinate scripts break on DPI changes, display swaps, and any layout adjustment. Screenshot agents take seconds per action and cost real money. Tree-only selectors miss controls that share a name with a dozen siblings. Spatial selectors use the same UIA tree that is already in memory, the same bounding rectangles every accessible element already exposes, and a constant-time geometric filter. The Terminator Windows engine does not add a visual subsystem. It adds nine lines of math.
Feature comparison against traditional task automation for Windows
| Feature | Coordinate or tree-only tools | Terminator spatial selectors |
|---|---|---|
| Addresses a control by the label sitting next to it | No, use coordinates or record pixel location | rightof:, leftof:, above:, below:, near: |
| Runs without pixel matching or OCR on the happy path | Often falls back to image matching | UIA tree plus BoundingRectangle only |
| Resolves ties by geometric distance | Returns the first or last tree hit | Sorted by Euclidean distance from anchor center |
| Requires vertical or horizontal overlap for right/left/above/below | Plain bounding-box compare, no overlap rule | Strict overlap filter in engine.rs |
| Spatial selector errors loudly when the anchor is missing | Silent skip or nearest-pixel best guess | AutomationError::ElementNotFound surfaced |
| Chains with process scope and role filters | Hotkey script DSL, canvas widgets | process:EXCEL >> rightof:name:Total && role:Edit |
The tools this page is not about
Each of these is a fine product on its own axis. None of them expose a spatial selector primitive that combines the accessibility tree with bounding-box geometry. If one of these fits your problem, use it. If the selectors keep breaking, that is what this page is about.
Verify the anchor facts against source
This page is grep-verifiable. The NEAR_THRESHOLD and the five selector prefixes are in the same commit as the Rust core. Clone the repository and reproduce every line.
Turn your clunkiest Windows workflow into one spatial selector
Thirty minutes on a call, one of your real forms, and we rewrite the selector so it survives the next redesign.
Frequently asked questions
What is a spatial selector in Terminator's task automation for Windows?
A spatial selector is a prefix keyword that turns another selector into a geometric query. Five prefixes exist: rightof:, leftof:, above:, below:, and near:. Each wraps an anchor selector. rightof:name:Email resolves by first finding the one element whose accessible name matches Email, reading its bounding box from UIA, then filtering all visible elements by the rule 'candidate_left is greater than or equal to anchor_right, and the two rectangles share vertical overlap'. The resulting set is sorted by Euclidean distance from the anchor center and the closest match is returned. The parsing happens in crates/terminator/src/selector.rs lines 419 to 437. The geometry lives in crates/terminator/src/platforms/windows/engine.rs lines 1754 to 1836.
Why would I use rightof: instead of writing a role and name selector?
Plenty of Windows controls have no meaningful accessible name, or share a name with a dozen other buttons in the same window. Think of a settings grid where every row has a Browse... button. You cannot pick by name, because Browse... is not unique. You can pick by position because each row has a distinct label, and the button you want is always rightof:name:Log directory. The spatial selector lets you anchor on the label the user actually reads and ignore the tree shape entirely.
How is the near: threshold defined?
There is a single hardcoded constant: NEAR_THRESHOLD: f64 = 50.0, declared inside the filter match arm in crates/terminator/src/platforms/windows/engine.rs at line 1815. The engine computes Euclidean distance between the anchor bounding box center and each candidate center, and keeps candidates whose distance is strictly less than 50 pixels. 50 is a deliberate choice. Most Windows controls are 20 to 40 pixels tall, and adjacent labels sit about 12 to 30 pixels away from the control they describe. 50 is tight enough to reject distant siblings but loose enough to catch a label that hugs its input.
Does rightof: only return the closest candidate, or can I get all matches?
Both shapes are supported. Calling find_element with the spatial selector returns a single element, the one with the smallest Euclidean distance from the anchor center, after the geometric filter. Calling find_elements returns every candidate that passed the filter, in no guaranteed order. The two code paths live in the same match arm in engine.rs. find_elements uses depth 100 when collecting candidates to ensure deeply nested elements are still reachable.
How does Terminator decide what 'right of' actually means?
Strict geometry, not heuristics. For rightof:, a candidate must satisfy two conditions. First, the candidate's left edge must be greater than or equal to the anchor's right edge. Second, the candidate's vertical range must overlap the anchor's vertical range, meaning candidate_top is less than anchor_bottom and candidate_bottom is greater than anchor_top. leftof:, above:, and below: are symmetric versions of the same rule. The overlap requirement is what stops rightof: from returning the button three rows down the form just because it sits to the right of the anchor's x axis.
Can I combine spatial selectors with role: and process: in one expression?
Yes. The selector grammar lets you nest any selector as the anchor. below:role:Label && name:First Name works. So does process:chrome >> rightof:role:Edit && name:Email. The combinator && inside the anchor is evaluated first, then the spatial filter wraps it. A side effect of this is that the spatial prefix inherits process scope from its surrounding chain, which matters because Terminator's Windows engine refuses to run a selector that does not eventually include a process: scope.
How is this different from Playwright's getByRole with name regex on a web page?
Playwright's DOM selectors work inside a single document where the tree already reflects layout. Windows UIA trees do not. The z-ordering and tab-order of a native Win32 form is not necessarily the spatial order of its controls. So a Playwright-shaped selector like role:Button && nth:3 can hand you the fourth button in the tree which is not the fourth button on screen. Terminator's spatial selectors sidestep the tree order entirely by measuring bounding boxes. That is what makes them the right primitive for legacy desktop UIs where the accessibility tree is shaped by the framework, not the visual layout.
What happens when the anchor element cannot be found?
The spatial selector errors the same way a plain selector does. Inside find_element, the engine calls self.find_element(inner_selector, root, timeout) before any geometric work. If the anchor resolves to zero matches, you get AutomationError::ElementNotFound with the inner selector in the message. No fallback, no guessing. That keeps the spatial primitive honest: if the anchor is gone, the workflow fails loudly instead of clicking some unrelated button that happened to sit in the right neighborhood.
Why not just use ui-vision or a screenshot agent for the same thing?
Speed, determinism, and cost. The spatial selector runs entirely at CPU speed against the UIA tree. Filtering a few hundred bounding boxes is microseconds of work. A screenshot agent takes an OCR or multimodal LLM pass on every attempt, which is two to five seconds of wall-clock time per action and a per-call API bill. Terminator reserves AI for recovery. The happy path is all deterministic geometry.
Does this work on Windows 10 and Windows 11?
Both. The UIA COM API (IUIAutomation) has shipped in every Windows release since Windows 7, and the bounding-box reads that power spatial selectors use the standard BoundingRectangle property that every UIA-conformant element exposes. The prebuilt npm, pip, and MCP binaries target Windows 10 and 11 on x64 and ARM64.