macOS accessibility UI tree automation: the write path nobody warns you about

Reading the macOS accessibility tree is the easy half. AXUIElementCopyAttributeValue walks the tree, Accessibility Inspector renders it for you, and every Mac automation guide you find online stops there. The write path is where the abstraction leaks. AXPress and AXClick return success on Chrome, Safari, Arc, Firefox, Edge, Brave, Opera, and Vivaldi web views and do absolutely nothing. Any automation engine that lasts more than a weekend ships a 3-tier click fallback and a hand-coded browser bypass list. This page is what 4,368 lines of macOS accessibility code looked like up close, before we deleted it.

AXUIElement · AXPress · CGEvent · ApplicationServices · MIT
Matthew Diakonov
13 min read
8 browser names bypass AXPress / AXClick on sight
3 click strategies tried per element before failure
Manual unsafe Send + Sync wrapper for AXUIElement
4,368 lines of recoverable Rust to read against

What every other guide on this gets right, and where they stop

Open any explainer about Mac accessibility automation and you get the same recipe. Open Accessibility Inspector. Note the role and label of the element you want. Call AXUIElementCopyAttributeValue with the right key. Walk AXChildren recursively. Maybe pretty-print the result. The recipe is correct, and it gets you a tree dump on screen in about 60 lines of Swift or Python. That's the read path.
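To make the shape of that recipe concrete, here is a minimal sketch of the walk-and-print step. It runs against a hypothetical in-memory `AxNode` rather than live AXUIElement handles, so the struct, its fields, and `dump_tree` are illustration-only names; a real engine would fill them from AXRole, AXTitle, and AXChildren.

```rust
/// Hypothetical in-memory stand-in for one accessibility node.
/// A real engine would populate it from AXRole, AXTitle, and AXChildren.
#[derive(Debug, Clone)]
pub struct AxNode {
    pub role: String,
    pub title: Option<String>,
    pub children: Vec<AxNode>,
}

/// Depth-first walk that appends one indented line per node to `out`.
pub fn dump_tree(node: &AxNode, depth: usize, out: &mut String) {
    // Two spaces of indentation per tree level.
    out.push_str(&"  ".repeat(depth));
    out.push_str(&node.role);
    if let Some(title) = &node.title {
        out.push_str(&format!(" \"{}\"", title));
    }
    out.push('\n');
    for child in &node.children {
        dump_tree(child, depth + 1, out);
    }
}
```

Everything here is a read. Nothing in this sketch can fail silently, which is exactly why the read path feels finished so quickly.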

What the existing playbooks don't tell you is what happens the second you try to drive the tree. Send an AX action, set an AX value, focus an element, type into a web input, click the close button on a window. Apple's API gives you the function names. It does not tell you which controls in which apps actually honor them, and on macOS there is no central registry of which AX actions are real and which are decoration. You find out by shipping it.

4,368 lines of Rust in the deleted macos.rs adapter
8 browsers explicitly bypassing AXPress / AXClick
3 click strategies tried before reporting failure
130+ distinct AXAttribute names referenced in the engine

The eight-browser bypass list, and why it exists

The single piece of code that captures the write-path problem best is the first 35 lines of click_auto in the Terminator macOS adapter. Before it tries any AX action, it asks: what application owns this element? If the answer contains any of eight browser substrings, jump straight to synthetic mouse events. Don't even attempt AXPress.

The browsers that always skip the AX action path:
chrome, safari, arc, firefox, edge, brave, opera, vivaldi, microsoft edge

Why is the list hardcoded? Because the failure mode is the worst kind. AXPress on a Chrome web view returns kAXErrorSuccess, the engine thinks the click landed, the test passes, and the page never changes. There is no error to log, no exception to catch. The only way to know the click was lost is to compare the tree before and after, notice nothing moved, and bail to a synthetic event. By that point you've burned a tree snapshot you didn't need to take. So the engine front-loads the decision: if it looks like a browser, skip the AX action layer entirely.
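A minimal sketch of that front-loaded check, assuming the match is a case-insensitive substring test on the owning application's name. The function and constant names are hypothetical, and this is not the original click_auto code.

```rust
/// Substrings from the list above. "microsoft edge" is left out of this
/// sketch because "edge" already matches it.
pub const BROWSER_SUBSTRINGS: [&str; 8] = [
    "chrome", "safari", "arc", "firefox", "edge", "brave", "opera", "vivaldi",
];

/// Returns true when the owning application looks like a browser, meaning
/// the AX action tiers should be skipped in favor of synthetic mouse events.
/// Assumption: matching is case-insensitive on the application name.
pub fn should_skip_ax_actions(app_name: &str) -> bool {
    let lowered = app_name.to_lowercase();
    BROWSER_SUBSTRINGS
        .iter()
        .any(|needle| lowered.contains(needle))
}
```

One caveat with plain substring matching: a short needle like "arc" also matches unrelated names such as "Archive Utility". That false positive only costs a synthetic click where AXPress might have worked, but matching on bundle identifiers instead would be a tighter design if you need it.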

crates/terminator/src/platforms/macos.rs

What "click an element" actually means on macOS

Once you accept that AXPress can lie, the click implementation stops being a one-liner and becomes a flowchart. You try the cheap path first, fall through to a slightly less cheap path, and land on synthetic input as a last resort. That last resort is the only path that physically moves the cursor.

[Diagram: click strategy fan-in. AXPress action, AXClick action, and CGEvent mouse all feed into click_auto, which reports ClickResult: AXPress, ClickResult: AXClick, or ClickResult: CGEvent.]

The 3-tier click in execution order

Tier 1: AXPress

element.perform_action(AXAttribute::new(&CFString::new("AXPress"))). The canonical accessibility action. Roughly 70% of native AppKit controls accept it. Buttons, menu items, links. Returns kAXErrorSuccess on success, kAXErrorActionUnsupported on controls that don't expose press semantics.

Tier 2: AXClick

Same shape with AXClick instead of AXPress. Some older AppKit views prefer AXClick. Many custom Cocoa controls expose both. The cost of trying is one FFI call, the upside is avoiding a synthetic input event.

Tier 3: CGEvent mouse simulation

Read AXPosition and AXSize. Compute center = (x + w/2, y + h/2). Create CGEventSource on HIDSystemState. Post MouseMoved, LeftMouseDown, LeftMouseUp at that point. This is the path browsers and Electron apps take. It moves the cursor visibly and bypasses any AX shim that swallowed AXPress.

When to skip tiers 1 and 2 entirely

Browser allowlist. If the host application name contains chrome, safari, arc, firefox, edge, brave, opera, or vivaldi, jump straight to Tier 3. The cost of trying AXPress on a web view is one FFI call that returns success and does nothing — the worst of both worlds, because the engine thinks the click landed.
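The tiers above can be sketched as one function over a small trait. Everything named here (`Clickable`, `Rect`, `FakeElement`, the `String` error type) is a hypothetical stand-in so the control flow can be read and tested without the real AX and CGEvent FFI; only the order of attempts and the center formula come from the text.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClickResult { AxPress, AxClick, CgEvent }

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect { pub x: f64, pub y: f64, pub w: f64, pub h: f64 }

impl Rect {
    /// center = (x + w/2, y + h/2), computed from AXPosition and AXSize.
    pub fn center(&self) -> (f64, f64) {
        (self.x + self.w / 2.0, self.y + self.h / 2.0)
    }
}

/// Hypothetical abstraction over one element; a real engine would back
/// these methods with AXUIElementPerformAction and CGEvent posting.
pub trait Clickable {
    fn perform_action(&self, action: &str) -> Result<(), String>;
    fn bounds(&self) -> Result<Rect, String>;
    fn post_mouse_click(&self, x: f64, y: f64) -> Result<(), String>;
}

/// Tries AXPress, then AXClick, then a synthetic click at the element center.
/// Browsers skip the two AX tiers because a "success" there proves nothing.
pub fn click_auto<E: Clickable>(element: &E, host_is_browser: bool) -> Result<ClickResult, String> {
    if !host_is_browser {
        if element.perform_action("AXPress").is_ok() {
            return Ok(ClickResult::AxPress);
        }
        if element.perform_action("AXClick").is_ok() {
            return Ok(ClickResult::AxClick);
        }
    }
    let (x, y) = element.bounds()?.center();
    element.post_mouse_click(x, y)?;
    Ok(ClickResult::CgEvent)
}

/// Test double: accepts only the listed actions, and any mouse click.
pub struct FakeElement {
    pub accepted_actions: Vec<&'static str>,
    pub rect: Rect,
}

impl Clickable for FakeElement {
    fn perform_action(&self, action: &str) -> Result<(), String> {
        if self.accepted_actions.iter().any(|a| *a == action) {
            Ok(())
        } else {
            Err("kAXErrorActionUnsupported".to_string())
        }
    }
    fn bounds(&self) -> Result<Rect, String> { Ok(self.rect) }
    fn post_mouse_click(&self, _x: f64, _y: f64) -> Result<(), String> { Ok(()) }
}
```

Returning which tier succeeded, rather than a bare success flag, is the useful part of the design: callers and logs can tell a real AX action from a cursor-moving fallback.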

AXUIElement isn't Send. AXUIElement isn't Sync. You wrap it.

Every async runtime, MCP server, or background daemon you build on top of macOS accessibility has the same first problem. The Rust accessibility crate does not implement Send or Sync for AXUIElement, so you can't store one in a struct that crosses await points or share one across threads. The fix is a manual wrapper with two unsafe impl blocks and a SAFETY comment that points at Apple's thread-safety guarantee for Core Foundation objects. You only have to write this once, and the rest of the engine depends on it.


If you skip this step, you discover it the first time you try to run an AX query inside a tokio task and the compiler tells you the future isn't Send. Most projects discover it about an hour into building, write the wrapper, and never think about it again. It is one of the genuinely transferable patterns from reading any AX automation engine source.
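Here is a sketch of the wrapper pattern using a stand-in type, since the real AXUIElement only exists on macOS. `RawElement` is hypothetical and holds a raw pointer purely so that it is neither Send nor Sync, like the real handle. Whether the two unsafe impls are sound for an actual AXUIElement rests on the thread-safety claim described above, which this sketch takes as an assumption rather than proves.

```rust
use std::ffi::c_void;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for AXUIElement. The raw pointer field makes the
/// type !Send and !Sync, mirroring the real handle. It is never dereferenced.
pub struct RawElement {
    ptr: *mut c_void,
}

impl RawElement {
    pub fn from_addr(addr: usize) -> Self {
        RawElement { ptr: addr as *mut c_void }
    }
    pub fn addr(&self) -> usize {
        self.ptr as usize
    }
}

/// Wrapper that opts in to cross-thread use.
#[derive(Clone)]
pub struct ThreadSafeElement(Arc<RawElement>);

// SAFETY: assumption carried over from the article, not verified here:
// the wrapped handle is treated as safe to move and share between threads.
unsafe impl Send for ThreadSafeElement {}
unsafe impl Sync for ThreadSafeElement {}

impl ThreadSafeElement {
    pub fn new(raw: RawElement) -> Self {
        ThreadSafeElement(Arc::new(raw))
    }
    pub fn addr(&self) -> usize {
        self.0.addr()
    }
}

/// Moves the wrapper into another thread, which would not compile
/// for a bare Arc<RawElement>.
pub fn addr_on_worker_thread(element: ThreadSafeElement) -> usize {
    thread::spawn(move || element.addr())
        .join()
        .expect("worker thread panicked")
}
```

Note that the closure calls a method on the wrapper instead of reaching into its inner field. Under Rust 2021 closure capture rules, touching the inner field directly would capture the non-Send Arc and bring the compiler error right back.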

Typing into a web input: three attribute names in a loop

The same trap shows up on the value side. A native AppKit text field accepts AXValue via AXUIElementSetAttributeValue and the text appears. A Chrome input exposed through the accessibility shim might honor AXValue, or might only respond to AXValueAttribute, or might only accept AXText depending on which Chromium build it's running. The deleted type_text path tries all three in sequence and takes whichever one returns a zero status code.
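The loop can be sketched as follows. The `set_attribute` closure is a hypothetical stand-in for an AXUIElementSetAttributeValue call that returns the raw status code; the three candidate names and the zero-means-success rule are the ones described above.

```rust
/// Candidate attribute names, in the order described above.
pub const VALUE_ATTRIBUTE_CANDIDATES: [&str; 3] = ["AXValue", "AXValueAttribute", "AXText"];

/// kAXErrorSuccess is zero.
pub const AX_ERROR_SUCCESS: i32 = 0;

/// Tries each candidate attribute until one setter call reports success.
/// `set_attribute(name, text)` is a hypothetical stand-in for the FFI call
/// and returns the raw AXError status code.
/// Returns the attribute name that worked, or None if all three failed.
pub fn set_text_with_fallback<F>(text: &str, mut set_attribute: F) -> Option<&'static str>
where
    F: FnMut(&str, &str) -> i32,
{
    for name in VALUE_ATTRIBUTE_CANDIDATES {
        if set_attribute(name, text) == AX_ERROR_SUCCESS {
            return Some(name);
        }
    }
    None
}
```

A `None` result is the signal to drop down to synthetic key events. Keep in mind that a zero status on a web input carries the same caveat as AXPress: it says the call was accepted, not that the text appeared.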


Permission gate: AXIsProcessTrustedWithOptions

Before any of the above runs, the user has to grant your binary the Accessibility entitlement. There's exactly one function for this and one option key: AXIsProcessTrustedWithOptions with a CFDictionary mapping kAXTrustedCheckOptionPrompt to kCFBooleanTrue. That triggers the system prompt the first time. Until the user toggles your binary in System Settings, every AX call returns kAXErrorAPIDisabled. There is no programmatic way around this; you can only ask.
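A sketch of gating engine construction on that check. The `is_trusted` closure is a hypothetical stand-in for the real AXIsProcessTrustedWithOptions call, with its bool argument standing in for the prompt option. `Engine` and the `String` payload on the error are also assumptions made for illustration.

```rust
/// Typed error surfaced when the process has not been granted access.
#[derive(Debug, PartialEq, Eq)]
pub enum AutomationError {
    PermissionDenied(String),
}

/// Hypothetical engine handle; a real one would hold AX state.
#[derive(Debug, PartialEq, Eq)]
pub struct Engine;

impl Engine {
    /// `is_trusted(prompt)` stands in for AXIsProcessTrustedWithOptions.
    /// Passing `true` asks the system to show its prompt the first time.
    pub fn new<F>(is_trusted: F) -> Result<Self, AutomationError>
    where
        F: FnOnce(bool) -> bool,
    {
        if is_trusted(true) {
            Ok(Engine)
        } else {
            Err(AutomationError::PermissionDenied(
                "Accessibility access has not been granted to this process".to_string(),
            ))
        }
    }
}
```

Failing at construction keeps the error in one place, instead of letting every later call discover kAXErrorAPIDisabled on its own.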


The operational shape of an AX automation engine

Outside the marquee fallback chains, an AX engine is a collection of small, opinionated decisions. Which attribute names to read first. Where to gate on permissions. Which controls deserve their own named lookup instead of a tree walk. None of these are hard individually; in aggregate they are why the file ends up four thousand lines long.

AXIsProcessTrustedWithOptions is the only gate

No accessibility entitlement, no AX calls. Pass kAXTrustedCheckOptionPrompt = kCFBooleanTrue at engine construction time to trigger the system prompt the first time. Without the bit set, every AXUIElementCopyAttributeValue returns kAXErrorAPIDisabled.

AXUIElement is Core Foundation, not Rust-safe

The accessibility crate doesn't mark it Send + Sync. You wrap it manually in struct ThreadSafeAXUIElement(Arc<AXUIElement>) with two unsafe impl blocks. Skip this and you can't run AX queries from a tokio task or an MCP server.

AXValue, AXValueAttribute, AXText

Three attribute names tried in sequence for web inputs. AXValue handles native AppKit. AXValueAttribute is the legacy WebKit key. AXText catches some Electron and CEF builds. The first call that returns zero (kAXErrorSuccess) wins.

Browsers ack actions and do nothing

AXPress on a Chrome web view returns kAXErrorSuccess and the page never changes. The browser's AX shim acknowledges the action without forwarding it to Blink. The only fix is a per-app bypass list and synthetic CGEvents.

AXMinimizeButton, AXZoomButton, AXCloseButton

Window chrome lives at known attribute keys. Don't search the tree for a Close button — read AXCloseButton off the window element directly and AXPress that. The deleted engine used this at lines 1715 to 1721.

The 130+ AXAttribute string fan-out

Every property you read or write is an AXAttribute::new(&CFString::new(name)). The 4,368-line file references AXTitle, AXLabel, AXDescription, AXValue, AXPosition, AXSize, AXRole, AXSubrole, AXFocused, AXSelected, AXChildren, AXParent, AXFocusedApplication, AXFocusedUIElement, plus a long tail of role-specific keys. Every one is one FFI call.

Reads vs writes: same API, different surface

If you treat AX reads and AX writes as symmetric you will spend most of your engineering budget on the writes. The two surfaces look similar in the headers but behave very differently in production.

| Feature | Read path | Write path |
| --- | --- | --- |
| primary call | AXUIElementCopyAttributeValue | AXUIElementPerformAction or AXUIElementSetAttributeValue |
| idempotent | yes, repeat reads return the same snapshot | no, every call may mutate UI state or fail half-way |
| browser behavior | tree exposes elements with role, name, bounds | AXPress and AXClick return success and do nothing |
| thread safety | safe to fan out across tokio tasks once wrapped | needs per-element focus and per-process activation up front |
| fallback strategy | retry the read, walk a different ancestor | AXPress → AXClick → CGEvent mouse simulation, plus per-app allowlists |
| Apple docs coverage | well-documented, sample code in Accessibility Inspector | documented at the API level, silent on which apps lie about supported actions |
| what an automation engine spends time on | tree formatting, selector grammar, caching | fallback strategies, browser detection, key event composition, focus dance |

If you're building this yourself, here's the punch list

Every macOS automation engine that lasts past a prototype ends up with some version of the items below. The exact spelling varies by language and runtime. The behavior doesn't.

What a production AX engine ends up shipping

  • Check AXIsProcessTrustedWithOptions with kAXTrustedCheckOptionPrompt = kCFBooleanTrue at engine construction time. Refuse to start if the bit isn't set.
  • Wrap AXUIElement in your own struct with manual unsafe impl Send + Sync. Cite the Core Foundation thread-safety guarantee in a SAFETY comment.
  • Maintain a host-application allowlist of browsers that bypass AXPress and AXClick entirely. Today the list is at least chrome, safari, arc, firefox, edge, brave, opera, vivaldi.
  • For clicks, ship 3 tiers in order: AXPress, AXClick, CGEvent mouse simulation. For browsers, jump directly to tier 3.
  • For text input on web inputs, try AXValue, then AXValueAttribute, then AXText. The first call that returns zero (kAXErrorSuccess) wins. Fall back to synthetic CGEventKeyboard events if all three fail.
  • Read window chrome from named attributes (AXMinimizeButton, AXZoomButton, AXCloseButton) instead of searching the tree by role + name.
  • Detect web role (role contains 'web' or 'generic') before deciding which value-setting strategy to use. The role lookup itself is one AX call.
  • Budget for permission errors at every level. AXIsProcessTrusted can flip back to false at any time if the user revokes the entitlement, and your engine should surface that as a typed error rather than a panic.
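For the last item, one way to surface a revoked permission as a typed error is to classify raw status codes at a single choke point. This is a sketch: the enum and function names are hypothetical, and the numeric values are the ones listed for these constants in Apple's AXError.h.

```rust
/// Raw AXError values as listed in AXError.h.
pub const AX_ERROR_SUCCESS: i32 = 0;
pub const AX_ERROR_ACTION_UNSUPPORTED: i32 = -25206;
pub const AX_ERROR_API_DISABLED: i32 = -25211;

/// Hypothetical typed error for a single AX call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AxCallError {
    /// The process is not (or is no longer) trusted for accessibility.
    PermissionRevoked,
    /// The element does not support the requested action.
    ActionUnsupported,
    /// Any other non-zero status, kept for logging.
    Other(i32),
}

/// Converts a raw status code into a Result instead of panicking.
pub fn check_status(code: i32) -> Result<(), AxCallError> {
    match code {
        AX_ERROR_SUCCESS => Ok(()),
        AX_ERROR_API_DISABLED => Err(AxCallError::PermissionRevoked),
        AX_ERROR_ACTION_UNSUPPORTED => Err(AxCallError::ActionUnsupported),
        other => Err(AxCallError::Other(other)),
    }
}
```

Routing every FFI return value through one function like this means a mid-session revocation shows up as `PermissionRevoked` wherever it happens, and callers can decide whether to re-prompt or shut down.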

Where Terminator lands on macOS today

Honest update on the state of the project. Terminator is a developer framework for building desktop automation. It gives existing AI coding assistants the ability to drive your whole OS through native accessibility APIs, not just write code. On Windows it ships full UIA support: the Node.js, Python, and MCP server packages are Windows binaries today. On macOS, the adapter you've been reading about lived at crates/terminator/src/platforms/macos.rs for several months and got deleted on 2025-12-16 in commit 0c11011c so the team could put its weight behind the Windows path where it has the most depth.

The full 4,368-line file is still recoverable from git history with git show 0c11011c~1:crates/terminator/src/platforms/macos.rs. If you're writing a macOS AX engine right now, that file is one of the more complete MIT-licensed examples available. Read click_auto, type_text, the ThreadSafeAXUIElement wrapper, and the permission check at lines 121 to 144. The rest mostly composes from there.

For a working macOS automation tool today, look at Hammerspoon's axuielement Lua module, MacPaw's macapptree for tree dumps, or Nudge for an MCP server. Each of them ships its own click and value-setting strategy, and each of them has hit the same set of write-path edge cases this page describes.

Other AX-using tools you'll see in the wild

None of these solve the entire problem; each picks a slice. All of them wrap AXUIElementCopyAttributeValue and friends underneath.

Hammerspoon, Accessibility Inspector, macapptree, Nudge, Sikuli, PyAutoGUI, Hammerspoon axuielement, ApplicationServices.py, ax-element-rs, atomacos

Questions developers ask after reading the file

What is the macOS accessibility UI tree?

Every native macOS app exposes its on-screen elements as a tree of AXUIElement objects. Each node carries a role (AXButton, AXTextField, AXMenuItem), an accessible name (AXTitle / AXLabel / AXDescription), a position (AXPosition), a size (AXSize), and a list of supported actions (AXPress, AXClick, AXShowMenu, AXRaise). The tree is rooted at AXUIElement::system_wide(), and you walk into it by reading AXFocusedApplication or by getting the element for a specific PID via AXUIElement::application(pid). The tree is what Accessibility Inspector shows you, what VoiceOver speaks, and what every macOS automation tool ultimately reads.

Why does AXPress return success but nothing happens when I target a Chrome or Safari window?

Because the AX action lands on the browser's accessibility shim, which acknowledges it and never forwards it to the underlying web view. WebKit, Blink, and Gecko all build their own internal accessibility trees on top of the DOM and only partially re-export them through the platform AX API. AXPress and AXClick are no-ops on those re-exported elements. The element's role looks right, the name looks right, the action returns kAXErrorSuccess, and the page does not change. The fix is to detect the browser by application name and switch to synthetic CGEvent mouse clicks at the element's bounds. The deleted Terminator engine kept an explicit allowlist (chrome, safari, arc, firefox, edge, brave, opera, vivaldi, microsoft edge) for exactly this reason, at lines 415 to 424 of crates/terminator/src/platforms/macos.rs in commit 0c11011c~1.

How many lines of code does it take to wrap the macOS Accessibility API into a usable automation engine?

Roughly 4,368 lines of Rust, based on the deleted Terminator implementation in crates/terminator/src/platforms/macos.rs. That's not the user-facing API, that's the platform adapter alone: tree walking, element lookup by selector, click strategies, key event composition, focus and value setters, monitor enumeration, screenshot, OCR plumbing, and unsafe FFI for the AX functions accessibility-rs doesn't expose. About 130 distinct AXAttribute strings appear in that file. The size is what you sign up for when you commit to making AX work across browsers, Electron, native AppKit, Catalyst-on-macOS, and the long tail of half-instrumented apps.

Why does AXUIElement need a manual unsafe Send + Sync wrapper?

AXUIElement is a Core Foundation type. Apple says CF objects are thread-safe in practice, but the Rust accessibility crate doesn't mark AXUIElement as Send + Sync, so you can't share it across threads or store it in an async runtime without a manual wrapper. The pattern in the deleted file at lines 76 to 102 is a struct ThreadSafeAXUIElement(Arc<AXUIElement>) with two unsafe impl blocks: unsafe impl Send for ThreadSafeAXUIElement {} and unsafe impl Sync for ThreadSafeAXUIElement {}. The safety comment cites Apple's thread-safety guarantee for Core Foundation objects. Without this you can't run AX queries from a tokio task, which means you can't ship a server, an MCP agent, or any concurrent automation harness.

What's the right click strategy for the macOS Accessibility API?

Three tiers, in order. First, attempt AXPress via element.perform_action("AXPress"). About 70% of native AppKit elements respond. Second, if that fails, attempt AXClick via element.perform_action("AXClick"). Some controls (mostly buttons in older AppKit views) prefer this. Third, if both fail, drop to CGEvent mouse simulation: read AXPosition and AXSize, compute the center, create a CGEvent for MouseMoved + LeftMouseDown + LeftMouseUp on the HID source, and post each one. The deleted Terminator engine called these click_press, click_accessibility_click, and click_mouse_simulation respectively. Browsers always skipped tiers one and two by name and went straight to tier three, because tiers one and two return success but do nothing on browser-rendered content.

Can I read the macOS accessibility tree without granting full Accessibility permissions?

No. AXIsProcessTrustedWithOptions(options) is the gate, and options must contain { kAXTrustedCheckOptionPrompt: kCFBooleanTrue } to surface the system prompt the first time. Until the user goes to System Settings → Privacy & Security → Accessibility and toggles your binary on, every AXUIElementCopyAttributeValue call returns kAXErrorAPIDisabled. The deleted engine checked this at construction time at lines 121 to 144 and returned AutomationError::PermissionDenied immediately if the bit wasn't set. There is no programmatic way to grant the permission; you can only ask. For TCC-managed installations and CI, you set the bit in /Library/Application Support/com.apple.TCC/TCC.db with sudo, which is its own production rabbit hole.

Why do web inputs need three different attribute names tried before AX text input works?

Because Chrome, Safari, Firefox, and friends export web text inputs through the AX API with inconsistent attribute keys. The deleted type_text path at lines 1303 to 1320 tries AXValue first (the canonical key), then AXValueAttribute (legacy, still used by some WebKit views), then AXText (some Electron and CEF builds expose only this). For each candidate, it does an AXUIElementSetAttributeValue call and checks for a zero return code. Whichever one succeeds wins, and on a fully native AppKit text field the first attempt closes the deal in a single FFI call. On browsers, you usually fall through to the third or end up using synthetic key events instead.

Is Terminator a tool I can install on my Mac to automate Mac apps today?

Not today. Terminator's Node.js, Python, and MCP server packages currently ship Windows-only binaries. The macOS adapter existed at the core Rust level for several months but was deleted on 2025-12-16 in commit 0c11011c, alongside the Linux adapter, to focus on the Windows UIA path where the team has the most depth. The full macos.rs is still recoverable via `git show 0c11011c~1:crates/terminator/src/platforms/macos.rs` and is one of the more thorough open MIT-licensed examples of an AX-based automation engine you can read end to end. If you're building macOS automation yourself, that file is a useful reference for the operational gaps the Apple docs don't cover.

What should I use right now if I need a working macOS accessibility automation tool?

Three honest answers depending on what you're building. For AppleScript-style scripting and quick UI hooks, Hammerspoon's axuielement Lua module is mature and ships out of the box. For dumping the AX tree to JSON for analysis, MacPaw's macapptree is a focused Python package. For an MCP-style agent server, Nudge runs on macOS today. Each of them ships its own click and value-setting strategy, and each of them has run into some version of the browser-action no-op and the Send/Sync wrapper described above. If you read their source you'll find the same fallbacks, just spelled differently.

How is reading the AX tree different from sending an action through it?

Reads are pure: AXUIElementCopyAttributeValue, AXUIElementCopyAttributeValues, AXUIElementCopyActionDescription. They're idempotent, they're cheap, they're safe to call from any thread once you have your Send/Sync wrapper, and they give you a clean snapshot of the current UI. Writes are a different surface entirely: AXUIElementPerformAction (AXPress, AXClick, AXShowMenu) plus AXUIElementSetAttributeValue (AXValue, AXFocused, AXSelected). Writes are where the abstraction leaks. Some elements lie about supported actions, some accept the action and do nothing, some need a synthetic event instead, and the only way to know which case you're in is to detect the host application up front and have a fallback ready. Every AX automation tool that ships gets this right or gets returned as broken.
