Claude skills for desktop automation: the two that ship inside Terminator.
Most articles on Claude skills walk through the first-party document set: xlsx, pdf, docx, csv. Useful, but it is the same skill format reading and writing files. Terminator goes one layer down. Two skill.md files ship in the repo at `.claude/skills/`. Both pair Anthropic's skill markdown with a 35-tool MCP server, so a phrase like 'connect to machine' becomes a deterministic walk through one router that fronts Windows UIA, macOS AXUIElement, or Linux AT-SPI2. This page is about the wiring between those two layers and what it lets you ship.
The gap most writeups about this leave open
Read the existing material on Claude skills. You will find solid walkthroughs of the document set (one skill that reads PDFs, one that writes docx, one that diffs spreadsheets) and a lot of generic advice on writing skill.md frontmatter. What you will not find is a skill that actually drives the operating system. A skill that opens a remote VM, launches an installer, types into an internal admin tool, and screenshots the result. That is the workflow class skills are best suited for, and almost no public example shows it.
The reason is plumbing. To do desktop work, the skill needs OS hands. Bash and Read and Write are not enough. You need a router that speaks UIAutomation on Windows, AXUIElement on macOS, and AT-SPI2 on Linux, with a stable selector grammar on top so Claude can call the same `role:Button && name:Save` on every platform. That router is what an MCP server like Terminator's gives you. The skill on top is the recipe. Both pieces ship together in the Terminator repo, in two specific files.
What the two skill files contain
Both skills are plain markdown. No build step, no Python helper, no embedded JavaScript. The work each one does lives entirely in the workflow Claude follows when the trigger phrase matches.
terminator-issue-reporter / skill.md
197 lines. Auto-activates on 'terminator issue', 'terminator bug', 'create issue', 'report bug'. Reads `~/.local/share/claude-cli-nodejs/Cache/*/mcp-logs-terminator-mcp-agent/*.txt` (or the LOCALAPPDATA path on Windows), extracts the last 100 log lines, drafts a GitHub issue body with env, repro YAML, and an immediate workaround.
remote-mcp / skill.md
212 lines. Frontmatter declares `allowed-tools: Bash, Read, Write`. Auto-activates on 'remote MCP', 'connect to machine', 'execute on remote'. Body holds a tool reference table mapping seven OS actions to `terminator mcp exec --url http://<IP>:8080/mcp <tool> '<json>'` calls.
35 #[tool(...)] macros
Defined inside `crates/terminator-mcp-agent/src/server.rs` (10,912 lines). Confirmed by `grep -c '#\[tool(' server.rs` returning 35. Each macro's `description` string is what Claude actually sees when picking a tool.
Selector grammar, in the skill
remote-mcp/skill.md teaches the operators inline. `>>` chain (scope into process). `&&` AND. `||` OR. `!` NOT. Examples like `process:chrome >> role:Button && name:Submit` are listed so Claude does not have to invent them mid-conversation.
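The operators compose mechanically, which a few lines of Python can illustrate. This is a minimal sketch: the helper names `all_of`, `any_of`, and `within` are hypothetical and not part of Terminator, and the code only builds selector strings; it does not validate them against the real parser. The `!` operator is left out because the page does not show where it attaches.

```python
def all_of(*parts: str) -> str:
    """Join selector parts with && (AND)."""
    return " && ".join(parts)


def any_of(*parts: str) -> str:
    """Join selector parts with || (OR)."""
    return " || ".join(parts)


def within(scope: str, target: str) -> str:
    """Chain a scope and a target with >> (scope into process)."""
    return f"{scope} >> {target}"


# Rebuild the example the skill lists inline.
selector = within("process:chrome", all_of("role:Button", "name:Submit"))
print(selector)
```

Keeping the pieces as plain strings means whatever the helpers produce can be pasted straight into a tool call's JSON arguments.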
JIT-loaded markdown, not code
Both files are plain markdown with YAML frontmatter. No build step. Drop them into `.claude/skills/<name>/skill.md`, Claude picks them up on next session, and the trigger-phrase list at the top is what activates them.
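To make "plain markdown with YAML frontmatter" concrete, here is a rough sketch of reading the frontmatter keys out of a skill file. It is not how Claude loads skills; it is a hand-rolled reader for flat `key: value` lines only, and the sample `description` text is a placeholder rather than the shipped wording.

```python
def read_frontmatter(text: str) -> dict[str, str]:
    """Return the flat key/value pairs between the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no frontmatter


# Placeholder skill text; only name and allowed-tools come from the page.
SAMPLE = """---
name: remote-mcp
description: Placeholder description, not the shipped text.
allowed-tools: Bash, Read, Write
---
## Workflow
"""

meta = read_frontmatter(SAMPLE)
allowed = [tool.strip() for tool in meta["allowed-tools"].split(",")]
print(meta["name"], allowed)
```

A real YAML parser would be the safer choice for anything beyond flat keys; this version only exists to show how little structure the file carries.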
The remote-mcp frontmatter and the MCP CLI it dials
The remote-mcp skill is the one to study first if you want to clone the pattern. Its frontmatter tells Claude when to load it and what tools it is allowed to invoke. Its body is mostly a reference table mapping intent to a single CLI shape.
The shape that matters is `terminator mcp exec --url <URL> <tool> '<json>'`. Every action the skill knows about is one of those calls. The CLI hands the tool name and the JSON args to the MCP server. The server resolves the name to one of 35 `#[tool(...)]` macros in `crates/terminator-mcp-agent/src/server.rs`. That count is verifiable: run `grep -c "#\[tool(" server.rs` and you get 35.
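Because every action reduces to that one CLI shape, building the call is a small exercise in quoting. The sketch below assembles the argument list and prints the shell form. The function name `build_exec_command` is hypothetical, and the IP is a documentation placeholder address, not a real host.

```python
import json
import shlex


def build_exec_command(url: str, tool: str, args: dict) -> list[str]:
    """Assemble `terminator mcp exec --url <URL> <tool> '<json>'` as argv."""
    payload = json.dumps(args, separators=(",", ":"))  # compact JSON
    return ["terminator", "mcp", "exec", "--url", url, tool, payload]


# Placeholder endpoint (203.0.113.0/24 is reserved for documentation).
cmd = build_exec_command(
    "http://203.0.113.10:8080/mcp",
    "get_desktop_elements",
    {"depth": 1},
)

# shlex.join adds the single quotes the shell needs around the JSON.
print(shlex.join(cmd))
```

Passing the list form to `subprocess.run(cmd)` sidesteps shell quoting entirely, which matters once selectors containing `&&` or `||` appear inside the JSON.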
“The skill itself is a tiny, version-controlled prompt. The 35-tool MCP server is what gives it desktop hands. Both pieces ship in one repo, both are reviewable as PRs.”
terminator/.claude/skills + crates/terminator-mcp-agent/src/server.rs
Document skill vs desktop skill, side by side
The shapes look almost identical at the markdown level. What changes is what the workflow ends up calling. The first-party document skills bottom out at Read and Write. The Terminator skills bottom out at the 35-tool MCP router that drives the OS.
The same skill format, different I/O surface
# .claude/skills/draft-release-note/skill.md
# A first-party-style "document" skill, for contrast.
---
name: draft-release-note
description: Draft a release note from a CHANGELOG entry.
---
## Workflow
1. Read CHANGELOG.md (Read tool).
2. Find the most recent ## section.
3. Rewrite as marketing copy.
4. Write release-note.md (Write tool).
# Tools used: Read, Write. The OS is never touched.
How a trigger phrase becomes an MCP tool call
The chain has five steps. None of them are magic. The thing to notice is that the skill never executes anything itself: it just shapes the tool call Claude is going to make next.
Phrase to OS, in one round trip
The five-step chain
User message lands
The user says something like 'open Outlook on the test VM and reply to the latest unread message.' Claude has not loaded any skill body yet, only the names and descriptions.
Trigger match
Claude scans the loaded skill manifests. The phrase 'on the test VM' lines up with the remote-mcp description ('connect to machine', 'execute on remote'). Match wins, skill body is JIT-loaded into context.
Workflow scaffolds the call
The skill body holds a tool reference table. Claude picks `open_application` for the launch and queues `wait_for_element` for the window. Both invocations are wrapped in the canonical `terminator mcp exec --url <URL> <tool> '<json>'` shape the skill defines.
MCP router resolves the tool
The CLI dials `http://<IP>:8080/mcp`. The terminator-mcp-agent server matches the tool name to one of the 35 `#[tool(...)]` arms in server.rs and runs the work against the OS accessibility API: UIAutomation on Windows, AXUIElement on macOS, AT-SPI2 on Linux.
Result feeds back into the skill loop
The MCP server returns structured JSON: status, action, optional `ui_tree_diff`, optional screenshot path. The skill workflow tells Claude how to interpret it (e.g., on element-not-found, broaden the selector and retry once before surfacing). Loop continues until the user's request is satisfied.
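The "broaden the selector and retry once" rule in the last step can be sketched as a small wrapper. Several things here are assumptions made for illustration: the `selector` argument name, the `status` and `error` fields in the result, and the broadening strategy of dropping the last `&&` clause. The `fake_runner` stands in for a real call through the CLI.

```python
from typing import Callable

ToolRunner = Callable[[str, dict], dict]


def broaden(selector: str) -> str:
    """Drop the last && clause; one possible way to loosen a selector."""
    head, sep, _ = selector.rpartition(" && ")
    return head if sep else selector


def call_with_one_retry(run_tool: ToolRunner, tool: str, args: dict) -> dict:
    """Run a tool; on an element-not-found result, retry once, broadened."""
    result = run_tool(tool, args)
    # Assumed failure shape: {"status": "error", "error": "ElementNotFound"}.
    not_found = (
        result.get("status") == "error"
        and result.get("error") == "ElementNotFound"
    )
    if not_found:
        wider = broaden(args["selector"])
        if wider != args["selector"]:
            result = run_tool(tool, {**args, "selector": wider})
    return result  # still an error here means: surface it to the user


def fake_runner(tool: str, args: dict) -> dict:
    """Stand-in for the MCP call: only the broad selector matches."""
    if args["selector"] == "role:Button":
        return {"status": "success", "action": tool}
    return {"status": "error", "error": "ElementNotFound"}


result = call_with_one_retry(
    fake_runner, "click_element", {"selector": "role:Button && name:Sumbit"}
)
print(result)
```

Capping the loop at a single retry keeps a mistyped selector from turning into an open-ended search across the UI tree.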
What a real session looks like
With the MCP server registered and `.claude/skills/remote-mcp/skill.md` in place, a conversation that controls a remote box needs no manual tool selection. Claude picks the skill on its own, loads the body, and emits tool calls in the same shape the skill teaches.
Anatomy of the issue-reporter skill
The second skill that ships is a different shape. It does not drive the OS in a forward direction; it captures evidence after something failed. It auto-activates on 'terminator issue', 'create issue', and 'report bug', then walks Claude through pulling the latest MCP log, isolating a minimal repro, and generating a GitHub issue body.
The interesting part is how it teaches Claude where the logs actually live. Most agents will guess at log paths. This skill embeds both the macOS/Linux glob (`~/.local/share/claude-cli-nodejs/Cache/*/mcp-logs-terminator-mcp-agent/*.txt`) and the Windows path (`%LOCALAPPDATA%\claude-cli-nodejs\Cache\*\mcp-logs-terminator-mcp-agent\*.txt`) inline, with a PowerShell snippet for the latest 100 lines. The skill is a runbook with executable hooks; Claude does not have to invent the runbook on the fly.
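The log-pulling step is easy to picture in code. What follows is a sketch in Python, not the PowerShell snippet the skill ships: it expands the macOS/Linux glob quoted above, picks the most recently modified file, and keeps the last 100 lines. The function name `latest_log_tail` is hypothetical.

```python
import glob
import os
from collections import deque

# macOS/Linux glob as quoted on this page.
LOG_GLOB = (
    "~/.local/share/claude-cli-nodejs/Cache/*/"
    "mcp-logs-terminator-mcp-agent/*.txt"
)


def latest_log_tail(pattern: str = LOG_GLOB, lines: int = 100) -> list[str]:
    """Return the last `lines` lines of the newest file matching pattern."""
    paths = glob.glob(os.path.expanduser(pattern))
    if not paths:
        return []
    newest = max(paths, key=os.path.getmtime)
    with open(newest, encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the tail while streaming the file.
        return list(deque((line.rstrip("\n") for line in handle), maxlen=lines))


tail = latest_log_tail()
print(f"{len(tail)} log lines collected")
```

On Windows the same function should work with the `%LOCALAPPDATA%` pattern passed through `os.path.expandvars` first.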
Two skill shapes, one MCP, in one table
Compared against a typical document-handling skill, the desktop skill keeps the same surface and changes the I/O. The trick is entirely in what tool the skill is allowed to invoke.
| Feature | Document-handling skills | Terminator desktop skills |
|---|---|---|
| I/O surface the skill targets | documents (pdf, docx, xlsx, csv) | the entire OS via UIA, AXUIElement, AT-SPI2 |
| tool count behind the skill | Read / Write / Bash plus a few local scripts | 35 #[tool(...)] macros in server.rs |
| where the skill ships | Anthropic's skill marketplace | .claude/skills/ checked into the product repo |
| trigger style | implicit, based on file types in the conversation | explicit phrase list at the top of skill.md |
| remote control | not applicable | remote-mcp skill routes through `terminator mcp exec --url` |
| logs and repro extraction | manual paste | issue-reporter pulls the latest MCP log, last 100 lines, with paths for both OSes |
| permissions model | skill-local scripts run with full conversation perms | allowed-tools: Bash, Read, Write declared in frontmatter; MCP gates OS access |
A short stat row, just so the numbers are explicit
Most of the values on this page are countable from the source. These are the four worth keeping in your head.
skill files
2
.claude/skills/terminator-issue-reporter/skill.md and .claude/skills/remote-mcp/skill.md
tool macros
35
#[tool(...)] arms in crates/terminator-mcp-agent/src/server.rs
OS APIs routed
3
UIAutomation, AXUIElement, AT-SPI2, all behind the same selector grammar
build steps
0
Drop skill.md in, restart Claude, the trigger phrase works
If you are writing your own
Use the Terminator pair as the template. The full file paths are listed above. The minimum viable skill is a single markdown file with a tight description, a short trigger list, three to five worked examples, and a fallback section for the common failure modes. Keep it under 250 lines; both shipped skills do.
Checklist for a desktop-class skill
- Decide what your skill triggers on. Pick three to six phrases that a real user would actually say. The remote-mcp skill leans on 'remote MCP', 'connect to machine', 'execute on remote'. Generic verbs ('automate', 'do this') match too broadly and stomp on other skills.
- Declare the smallest tool set that works in `allowed-tools`. For desktop skills that means Bash (to run the terminator CLI), Read (to inspect logs and config files), and Write (to capture artifacts). Do not list every tool just in case.
- Wire the MCP server first, before the skill. Run `claude mcp add terminator "npx -y terminator-mcp-agent@latest"`. Verify with `terminator mcp exec --url <URL> get_desktop_elements '{"depth":1}'` from a shell. The skill assumes the router is up.
- Embed selector grammar inline in the skill body. The remote-mcp skill teaches `>>`, `&&`, `||`, `!` with worked examples. Claude does not have to guess across sessions; the recipe is right there in the markdown.
- Cover the failure modes. The issue-reporter skill ends with a 'Common Issue Patterns & Solutions' table (ElementNotFound, Browser script errors, MCP connection issues, workflow state issues) so Claude has a fallback when the happy path breaks.
- Version it like code. Both Terminator skills sit under `.claude/skills/` in the public repo. PRs to a skill are PRs to the workflow itself, which means review, history, and rollback are all on git.
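The first two checklist items and the 250-line guideline can be turned into a small scaffold generator. This is a sketch under stated assumptions: `render_skill` and the example skill name `vm-smoke-test` are hypothetical, the body layout borrows the 'Auto-activates when user mentions:' line described on this page, and the limits are this page's advice rather than anything Claude enforces.

```python
MAX_LINES = 250  # this page's suggested ceiling, not an enforced limit


def render_skill(
    name: str,
    description: str,
    triggers: list[str],
    allowed_tools: tuple[str, ...] = ("Bash", "Read", "Write"),
    workflow: str = "",
) -> str:
    """Render a minimal skill.md: frontmatter, trigger list, workflow."""
    if not 3 <= len(triggers) <= 6:
        raise ValueError("pick three to six trigger phrases")
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        f"allowed-tools: {', '.join(allowed_tools)}",
        "---",
        "",
        "Auto-activates when user mentions:",
        *[f"- {phrase}" for phrase in triggers],
        "",
        "## Workflow",
        *workflow.splitlines(),
    ]
    if len(lines) > MAX_LINES:
        raise ValueError(f"skill is {len(lines)} lines; keep it under {MAX_LINES}")
    return "\n".join(lines) + "\n"


# Hypothetical skill, for illustration only.
skill_md = render_skill(
    "vm-smoke-test",
    "Run a smoke test on a remote VM through the Terminator MCP server.",
    ["smoke test the VM", "check the test VM", "verify the install"],
    workflow="1. Open the application.\n2. Wait for the main window.",
)
print(skill_md)
```

Writing the result to `.claude/skills/vm-smoke-test/skill.md` and committing it gives the skill the same review and rollback path as the rest of the repo.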
The set of tools sitting underneath
Most skills lean on six to eight of the 35 tools. The remote-mcp skill explicitly names seven in its tool reference table. It is worth scanning the `#[tool(...)]` descriptions in `server.rs` so you know what is on offer when you write your own.
Questions developers ask before they ship a skill
What is a Claude skill, in one paragraph?
A skill is a markdown file with YAML frontmatter that lives in `.claude/skills/<name>/skill.md`. The frontmatter declares a name and a description; Claude loads the file lazily when the user's request matches the description, then follows whatever instructions are inside as a workflow. A skill can list its allowed tools (`allowed-tools: Bash, Read, Write`) and reference any MCP server already wired into Claude. The point is the skill itself is a tiny, version-controlled prompt and permissions config; the heavy lifting (file IO, network, OS calls) is whatever set of tools the skill is allowed to invoke. For desktop automation, those tools come from a process-control MCP server, not from a doc-handling library.
Where in the Terminator repo do the actual skills live?
Two files. `.claude/skills/terminator-issue-reporter/skill.md` is a 197-line bug-reporting workflow: it auto-activates on phrases like 'terminator issue', 'create issue', 'report bug'; pulls the latest MCP server log from `~/.local/share/claude-cli-nodejs/Cache/*/mcp-logs-terminator-mcp-agent/*.txt` (Windows path is `%LOCALAPPDATA%\claude-cli-nodejs\Cache\*\mcp-logs-terminator-mcp-agent\*.txt`); extracts the last 100 lines around the error; and drafts a GitHub issue body with environment, repro steps, and a minimal YAML workflow. `.claude/skills/remote-mcp/skill.md` is a 212-line remote-control workflow: it auto-activates on 'remote MCP', 'connect to machine', 'execute on remote'; declares `allowed-tools: Bash, Read, Write` in its frontmatter; and routes the user's intent through a `terminator mcp exec --url http://<IP>:8080/mcp` CLI call into the 35-tool router.
Why is desktop automation a good fit for the skill format specifically?
Three reasons. First, skills are JIT-loaded by trigger phrase, so a user can say 'open Notepad on the test VM and type the report' without selecting a tool by hand; the skill matches the phrase and Claude takes the wheel. Second, skills can declare scoped permissions (`allowed-tools`), which is the right shape for OS access where you do not want every conversation to be able to fire `run_command` against a remote Windows host. Third, the workflow itself (find element, validate, act, screenshot, retry) is repeatable enough to encode once and reuse, which is exactly what skills are for. The MCP server gives Claude desktop hands; the skill gives Claude a recipe for what to do with them.
What does the MCP server expose that the skill leans on?
Thirty-five tool definitions in `crates/terminator-mcp-agent/src/server.rs`, declared as Rust `#[tool(...)]` macros. The exact count is confirmed by `grep -c "#\[tool(" server.rs` (returns 35). The shape that matters for skills is the action set: `open_application`, `click_element`, `type_into_element`, `press_key`, `press_key_global`, `mouse_drag`, `scroll_element`, `select_option`, `set_selected`, `set_value`, `invoke_element`, `validate_element`, `wait_for_element`, `navigate_browser`, `execute_browser_script`, `capture_screenshot`, `run_command`, `execute_sequence`, plus read-only inspectors like `get_window_tree`, `get_desktop_elements`, `find_elements_by_property`. The remote-mcp skill names a working subset of seven of these in its tool reference table; the issue-reporter skill leans on `get_window_tree`, `validate_element`, and `take_screenshot` for repro extraction.
How does the trigger phrase actually become an MCP tool call?
The chain is: user message hits Claude. Claude matches the request against every skill's `description` field and the heuristics inside the body (the auto-activates list at the top of `skill.md`). Match wins, skill body loads into context. Skill instructs Claude to format a tool call (for remote-mcp, that is a `Bash` call to `terminator mcp exec --url <URL> <tool> '<json args>'`; for issue-reporter, a `Read` of the latest MCP log file plus a `Bash` call to `git rev-parse HEAD` for the commit). The terminator CLI inside the bash call dials the MCP endpoint, the MCP server resolves the tool by name (one of the 35 `#[tool(...)]` arms), runs the work against the OS accessibility API, and returns a structured result. Claude reads the result, plans the next step, repeats.
Is this only for Claude Code, or does it work in Claude Desktop and the API too?
Skills load anywhere Claude reads `.claude/skills/`, which is currently Claude Code, Claude Desktop with the Claude apps integration, and the Claude API when you pass them as part of the prompt context. The MCP server itself is independent of the skill: it advertises tools over the standard MCP protocol, so Claude Desktop's MCP client, the Claude API's MCP client, Cursor, Windsurf, and any other MCP-aware client can call into it directly. The skill is a polish on top: it gives the conversation a known recipe for when to call which tool. Without the skill you still have the 35 tools; you just don't have the trigger-phrase shortcut and the workflow scaffolding.
What does the remote-mcp skill actually let me do that vanilla Claude cannot?
Run any of the 35 desktop tools against a remote Windows or macOS box without writing the curl payloads yourself. The skill's tool reference table maps every action you'd want to do (run shell command, get UI tree, take screenshot, click, type, press key, wait for element, open application) to a `terminator mcp exec` invocation with the right argument shape, including selector grammar examples (`process:chrome >> role:Button && name:Submit`, with `>>` as descendant chain, `&&` as AND, `||` as OR, `!` as NOT). Without the skill, Claude has to infer all of that from cold. With it, the conversation goes straight from intent ('install this MSI on the test VM and verify the icon appeared') to a sequence of tool calls.
How do I write my own skill that uses Terminator's MCP?
Three steps. Add the MCP server to your Claude config (`claude mcp add terminator "npx -y terminator-mcp-agent@latest"` for Claude Code, or the JSON block for Claude Desktop). Create `.claude/skills/<your-skill>/skill.md` with frontmatter `name`, `description`, and optionally `allowed-tools: Bash, Read, Write`. In the body, list a few trigger phrases under 'Auto-activates when user mentions:' and write the workflow as plain markdown headings, calling tools by name where you would call them as a human. The remote-mcp skill is a clean template: copy its structure, swap the tool reference table for the action set your workflow actually uses, and you have a desktop skill that ships alongside your code.
How is this different from Anthropic's first-party skills (pdf, docx, xlsx)?
Different I/O surface. The first-party skills bundle Python helpers and Node libraries to read and write document formats; their tools are `Read`, `Write`, `Bash`, and skill-local scripts. Terminator's skills don't touch the document layer at all. They drive the OS. The pdf skill turns a PDF into structured text; the remote-mcp skill turns 'launch this installer on the test VM and click through the wizard' into a sequence of UIA-backed tool calls. They are complementary: a real workflow might use the docx skill to draft a release note, then the remote-mcp skill to publish it through the company's internal Windows admin tool that has no API.
What's the minimum I need installed to clone the pattern?
Node 18+ (so `npx` can fetch `terminator-mcp-agent`), one of Claude Code or Claude Desktop, and the Terminator MCP wired into your Claude client config. On Windows you also want the Microsoft UI Automation runtime, which ships with Windows 10/11; on macOS you want to grant your terminal Accessibility access in System Settings → Privacy & Security; on Linux you need AT-SPI2 enabled (default on most desktop distros). After that, `claude mcp add terminator "npx -y terminator-mcp-agent@latest"`, drop your `skill.md` in `.claude/skills/<name>/`, and Claude picks it up on next launch.