
Playwright MCP server, then off the page

Microsoft's playwright-mcp gave language models a browser tab. Excellent inside that tab, useless outside it. The moment your agent needs to open Excel, paste the value it just scraped, hit Cmd+S, and drag the file into a desktop uploader, it has left the universe playwright-mcp can see. This page is about what an MCP server looks like when its dispatch function does not stop at the browser. The server in question is Terminator; the anchor is its own Chrome extension on ws://127.0.0.1:17373.

Matthew Diakonov, Written with AI

Published April 19, 2026 · 11 min read

Open-source, MIT

Same shape as playwright-mcp: accessibility tree + click/type by selector

Dispatches navigate_browser AND open_application from one match block

Ships its own MV3 Chrome extension (manifest.json v0.24.32) on ws://127.0.0.1:17373

31 tools total, including run_command, mouse_drag, press_key_global


Playwright MCP is browser-only. Terminator MCP is the same shape, OS-wide.

One dispatch function, two surfaces, one execute_sequence call.

Both speak MCP; both prefer accessibility trees over pixels

playwright-mcp tools end at the browser tab

Terminator dispatches navigate_browser AND open_application from one match block

Its own MV3 Chrome extension bridges DOM eval over ws://127.0.0.1:17373

The same workflow can scrape a page and then paste into Excel


What playwright-mcp is, and where it stops

Microsoft's playwright-mcp is the reference MCP server for browser automation. It wraps Playwright, exposes a small set of typed tools (navigate, click, type, snapshot, screenshot), and speaks the Model Context Protocol so any MCP-aware client (Claude Code, Cursor, VS Code, the GitHub Copilot Coding Agent) can drive a real browser. It uses the page's accessibility tree, not pixels, which is why the LLM sees structured data and not screenshots.

That last detail is the interesting one. Accessibility trees are not a browser concept. Windows has shipped UI Automation since the mid-2000s, and macOS has had its AX accessibility API since the early 2000s. Every well-behaved desktop app already exposes one. So if the right shape for an LLM to drive a UI is "accessibility tree plus selector-based click and type," that shape generalizes past the browser. Terminator is what happens when you take that shape and apply it to the whole OS.

Scope of an MCP call, side by side

playwright-mcp: browser-tab universe. Tools speak DOM. Anything outside the tab requires a second MCP server, a separate runtime, and a hand-off you write yourself.

browser_navigate, browser_click, browser_type
Cannot launch Excel, Notepad, or your CLI
Cannot send Ctrl+S to the OS focus
Cannot drag a file from Finder into a desktop app

Terminator's MCP server, by the numbers

Counts pulled from the open-source repo at the commit this page was written against. The first is match arms in the dispatch function. The second is total OS-level tool surface. The third is the version of the Chrome extension that ships in-tree. The fourth is the local WebSocket port the extension talks to.

31 tool match arms in dispatch_tool

29 tools playwright-mcp does not have

v0.24.32 Chrome extension version

17373 local WebSocket port for the bridge

The anchor: Terminator ships its own Chrome extension

Inside the Terminator repo, alongside the MCP agent crate, there is a directory called crates/terminator/browser-extension/. It contains an MV3 Chrome extension named "Terminator Bridge", currently at version 0.24.32. This is the part no playwright-mcp page mentions, because no other MCP server does this. Terminator does not depend on Playwright; it bridges into Chrome through its own extension.

crates/terminator/browser-extension/manifest.json

Two permissions are doing the work. The debugger permission lets the extension attach to the active tab via Chrome's DevTools Protocol and run arbitrary JavaScript without opening a DevTools window. The host permission <all_urls> lets it do that on any site. The MV3 service worker (worker.js) opens a WebSocket connection to ws://127.0.0.1:17373 the moment Chrome boots, and waits for the local MCP server to send eval frames.

crates/terminator/browser-extension/worker.js

Verify in 10 seconds: clone mediar-ai/terminator and run cat crates/terminator/browser-extension/manifest.json. You will see the version, the debugger permission, and the manifest_version 3 declaration. Then run grep -n 17373 crates/terminator/browser-extension/worker.js to find the WebSocket URL the worker connects to. The same port appears in the MCP agent's execute_browser_script handler, at the other end of that socket. That is the entire bridge: a manifest, a worker, a port.

One MCP server, two surfaces

Same MCP host (your editor), same dispatch function, two surfaces it can reach. The browser-DOM path uses the Chrome extension. The native-app path uses Windows UIA or macOS AX. The LLM picks tool names; the dispatcher picks targets.

What dispatch_tool can reach

The match block, with the browser arms next to the OS arms

This is the shape of Terminator's dispatch function. Browser tools and native tools live in the same match, dispatched by the LLM's tool call name. There is no second server, no protocol switch, no glue you have to write.

crates/terminator-mcp-agent/src/server.rs
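A simplified sketch of that shape, not the actual contents of server.rs. Only the tool names come from this page. The Surface enum, the grouping of arms, and the error string are assumptions made for illustration, and the real arms presumably call handlers rather than return a label.

```rust
/// Which surface a tool call ends up on (hypothetical enum for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Surface {
    /// Browser-facing tools.
    Browser,
    /// Native apps via Windows UIA or macOS AX, plus OS keys and shell.
    Desktop,
}

/// Route a tool call by name. Browser arms and OS arms sit in one match,
/// so the caller never switches servers or protocols.
pub fn dispatch_tool(tool_name: &str) -> Result<Surface, String> {
    match tool_name {
        // Browser arms.
        "navigate_browser" | "execute_browser_script" => Ok(Surface::Browser),
        // A handful of the OS arms; the real match has many more.
        "open_application" | "click_element" | "type_into_element"
        | "press_key_global" | "mouse_drag" | "run_command" => Ok(Surface::Desktop),
        // Anything the match does not name is rejected.
        other => Err(format!("unknown tool: {other}")),
    }
}
```

Calling dispatch_tool("navigate_browser") and dispatch_tool("open_application") goes through the same function and the same match; only the arm that fires differs.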

A workflow playwright-mcp cannot express

Open Excel. Type a header row. Open a browser to an internal report. Scrape the rows out of the page DOM. Paste them back into Excel. Save. Six steps, two completely different surfaces, one execute_sequence call.

examples/mixed.yml

Step 4 returns through the Chrome extension on ws://127.0.0.1:17373. Step 5 reads ${{rows_result}} from the workflow env, which auto-populated because step 4 had an id. The data passing is documented in the MCP Agent README under "Tool Result Storage".
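The data passing can be sketched in a few lines of Rust. This is an illustration of the idea, not Terminator's implementation: the HashMap env, the function names store_result and substitute, and the choice to leave unknown placeholders untouched are all assumptions. Only the <id>_result key and the ${{...}} syntax come from the text above.

```rust
use std::collections::HashMap;

/// Store a step's output under "<step_id>_result" (key format as described above).
pub fn store_result(env: &mut HashMap<String, String>, step_id: &str, result: &str) {
    env.insert(format!("{step_id}_result"), result.to_string());
}

/// Replace every `${{name}}` in `template` with the value stored for `name`.
/// Assumption: placeholders with no stored value are left as written.
pub fn substitute(template: &str, env: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("${{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 3..];
        match after.find("}}") {
            Some(end) => {
                let key = after[..end].trim();
                match env.get(key) {
                    Some(value) => out.push_str(value),
                    // Unknown key: copy the whole placeholder through unchanged.
                    None => out.push_str(&rest[start..start + 3 + end + 2]),
                }
                rest = &after[end + 2..];
            }
            None => {
                // Unterminated placeholder: keep the remainder verbatim.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

With a step whose id is rows, store_result(&mut env, "rows", ...) makes ${{rows_result}} resolvable in any later step's arguments.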

How an execute_browser_script call physically reaches the page

Five hops. The interesting one is hop 4: the bridge extension uses Chrome's debugger permission, which is how it can run JS in a tab without opening DevTools and without spawning a second browser the way Playwright does.

LLM script eval, end to end

LLM emits tool call

Claude or Cursor picks execute_browser_script and ships JSON-RPC over stdio to the local terminator-mcp-agent process.

dispatch_tool matches

server.rs line 9953 has one match arm per tool; execute_browser_script dispatches to the handler that talks to the Chrome extension.

Server frames eval msg

The handler sends { id, action: "eval", code, awaitPromise: true } over the WebSocket bridge on ws://127.0.0.1:17373, the socket the extension's service worker connected to at startup.

Extension runs JS

The MV3 service worker (worker.js) attaches to the active tab via the debugger permission and runs the JS through the DevTools Protocol. No DevTools window opens.

Result returns

Worker posts { id, ok: true, result } back. Server stores it in the workflow env as <step_id>_result, so the next step (which can be open_application or type_into_element) can read it as ${{<step_id>_result}}.
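The frame in hop 3 is small enough to build by hand. The sketch below serializes it with the standard library only; the field names are taken from the frame shown above, while the numeric type of id, the field order, and the function names json_escape and eval_frame are assumptions. The real handler may well use a JSON library instead.

```rust
/// Escape a string so it can sit inside a JSON string literal.
pub fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters need the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the eval frame: { id, action: "eval", code, awaitPromise: true }.
/// Assumption: `id` is a number the server uses to match the reply.
pub fn eval_frame(id: u64, code: &str) -> String {
    format!(
        r#"{{"id":{},"action":"eval","code":"{}","awaitPromise":true}}"#,
        id,
        json_escape(code)
    )
}
```

The escaping matters because the code field carries arbitrary JavaScript, which routinely contains quotes and newlines.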

playwright-mcp vs Terminator MCP, on real capabilities

Apples-to-apples on the things an LLM agent actually needs to do. Browser-only is the right answer when your task is browser-only. Most real automations are not.

Feature	playwright-mcp	Terminator MCP
Scope	Browser tab only (Chromium, Firefox, WebKit, Edge)	Whole OS via accessibility APIs (Windows UIA, macOS AX) plus Chrome via its own extension
Native app control	Cannot launch or click into Excel, Notepad, VS Code, legacy Win32	open_application, get_window_tree, click_element, type_into_element work on any accessible app
OS hotkeys	Sends keys into the tab	press_key_global sends Ctrl+S, Win+R, Cmd+Tab to the active OS focus
Drag and drop across windows	DOM drag inside the page only	mouse_drag accepts (start_x, start_y, end_x, end_y) in screen coordinates
Mix browser and native steps	Must hand off to a second tool	execute_sequence YAML interleaves navigate_browser, open_application, run_command in one call
Chrome integration	Driven by the playwright runtime, separate browser instance	Custom MV3 extension in browser-extension/manifest.json, WebSocket bridge ws://127.0.0.1:17373, uses chrome.debugger to eval in the active tab without DevTools UI
Workflow recording	playwright codegen for the browser	terminator-workflow-recorder captures real OS event streams and turns them into YAML
Headless / VM mode	headless Chromium	TERMINATOR_HEADLESS=true uses a virtual display so Windows UIA works without RDP

The 29 tools you get past the browser

Every name below is in Terminator's dispatch match block. None map to anything in playwright-mcp's tool list. They cover native UI, OS keys, shell, files, workflow control, and vision fallback.

open_application, get_applications_and_windows_list, get_window_tree, click_element, type_into_element, press_key, press_key_global, validate_element, wait_for_element, activate_element, scroll_element, select_option, set_selected, set_value, invoke_element, highlight_element, stop_highlighting, mouse_drag, delay, run_command, execute_sequence, stop_execution, gemini_computer_use, read_file, write_file, edit_file, copy_content, glob_files, grep_files

For comparison, the two tool names that overlap with playwright-mcp are navigate_browser and execute_browser_script. Every name in the list above is what makes the dispatch surface OS-wide.

Install in three steps

Step 1 wires the Rust binary into your editor. Step 2 enables the browser-DOM bridge by loading the unpacked extension. Step 3 confirms 31 tools are visible to your agent.

install

When playwright-mcp is the right call

If your agent's entire job lives in a browser, playwright-mcp is leaner. It is the official Playwright project, it has the test framework around it (codegen, traces, expect), and it knows cross-browser quirks because Playwright already does. Use it for web QA, browser-scoped scrapers, anything that does not need the OS.

Use playwright-mcp if

Your agent only ever drives a web app and never leaves the tab
You want Playwright's full test ergonomics (fixtures, traces, expect) backing your MCP
You need cross-browser parity (Chromium, Firefox, WebKit, Edge)
You are happy isolating the agent to a fresh browser context per run

Use Terminator if

The work crosses the browser boundary (Excel, legacy app, file system, shell)
You want one MCP server with a Playwright-shaped API for the whole OS
You want the LLM to use your real Chrome session, real cookies, real DOM, via the bundled MV3 extension
You need OS-level keys, drag-and-drop across windows, or native window management
You want tool discovery to stay in sync with the source automatically (the build.rs trick)


“Every file path on this page (manifest.json, worker.js, server.rs:9953) is grep-able in a fresh clone of mediar-ai/terminator.”

github.com/mediar-ai/terminator

The shape Playwright proved, applied to the OS

Playwright proved that an accessibility-tree-based, selector-driven, typed-action API is the right interface for an LLM to drive a UI. playwright-mcp packages that interface as an MCP server for the browser. Terminator takes the same shape and applies it to the entire desktop, including a custom Chrome extension so you do not have to leave your real browser to get the DOM eval path.

If you only ever needed the browser, you would use playwright-mcp. If you needed the browser and Excel and the shell and a legacy Win32 app your team still depends on, you would use a server whose dispatch function does not stop at the tab. That is what Terminator is.

Wire Terminator MCP into your editor

One npx command for stdio under Claude Code, Cursor, or VS Code. Add the unpacked Chrome extension if you want the DOM bridge on ws://127.0.0.1:17373. MIT-licensed; the dispatch function is one file you can read end to end.


17373: the local WebSocket port the Chrome bridge uses

Defined in browser-extension/worker.js. Loopback only; never leaves the machine.

Frequently asked questions

What is the Playwright MCP server?

Playwright MCP is Microsoft's reference Model Context Protocol server for browser automation. It runs Playwright under the hood, exposes MCP tools like browser_navigate, browser_click, browser_type, browser_snapshot, and browser_take_screenshot, and lets an LLM client (Claude Code, Cursor, VS Code) drive a real browser through structured accessibility snapshots instead of pixels. It is excellent at what it does. The catch is that the universe of what it can do ends at the browser tab. The moment your agent needs to open Excel, drag a file from Finder into a desktop app, or run a shell command, it is out of room and you need a second tool.

How is Terminator's MCP server related to Playwright MCP?

Same shape, larger surface. Both are MCP servers. Both expose typed tools the LLM discovers via list_tools. Both prefer accessibility-tree snapshots over screenshots so the model sees structured data instead of pixels. The difference is what the dispatch function calls into. Playwright MCP calls Playwright, which drives a browser over its automation protocol (the Chrome DevTools Protocol in the case of Chromium). Terminator's dispatch function (crates/terminator-mcp-agent/src/server.rs line 9953) calls into Windows UI Automation, macOS Accessibility, and (for the browser case) its own MV3 Chrome extension on ws://127.0.0.1:17373. So Terminator gives the same Playwright-shaped API for the entire desktop, and it ships browser support as one slice of a larger toolset.

What is the Chrome extension Terminator ships and why does it exist?

Look at crates/terminator/browser-extension/manifest.json in the repo: it is an MV3 extension named "Terminator Bridge", currently version 0.24.32. The background service worker (worker.js) opens a WebSocket connection to ws://127.0.0.1:17373 the moment Chrome is started. The MCP server's execute_browser_script tool sends frames like { id, action: "eval", code, awaitPromise: true } down that socket; the extension uses Chrome's debugger permission to run the JS in the active tab via the DevTools Protocol and return the result. The point of the extension is that it lets the same MCP server that drives native Windows or macOS apps also reach into Chrome without launching a second browser instance via Playwright. You keep your real cookies, your real session, your real DOM.

Can Terminator do everything Playwright MCP can?

For the browser-DOM case, yes, and through a different mechanism. Terminator's navigate_browser drives whatever browser is the user's default; execute_browser_script executes arbitrary JavaScript inside the active tab through the Terminator Bridge extension. You can scrape the DOM, fill forms via document queries, await promises, return JSON. What you do not get is Playwright's full test runner ergonomics (fixtures, expect, traces). Terminator is an automation framework, not a test framework. If your goal is end-to-end test authoring, use playwright-mcp. If your goal is an LLM agent that needs to do real work that crosses the browser boundary, use Terminator.

What does a workflow that mixes browser and native steps actually look like?

It is a single execute_sequence call with steps that interleave tool names. A real example from the repo's pattern: open_application excel.exe, then type_into_element to write a header row into the active sheet, then navigate_browser to an internal report, then execute_browser_script to scrape rows out of the page DOM (the result lands in env as rows_result), then type_into_element back into Excel pasting that string, then press_key_global Ctrl+S to save. One MCP server. One match block. Six dispatched tool calls. Two completely different surfaces. This is the workflow shape playwright-mcp cannot express because steps 1, 2, 5, and 6 happen outside the browser.

Is Terminator just a wrapper around Playwright?

No. Terminator is a Rust crate that wraps the OS accessibility APIs directly (terminator-rs on crates.io). The browser-DOM evaluation path uses Chrome's own debugger protocol via a custom MV3 extension, not Playwright. There is no playwright dependency in the MCP agent. The architectural similarity is intentional: Playwright proved that an accessibility-tree-based, selector-driven, typed-action API is the right shape for LLMs to drive a UI. Terminator applies that same shape to the OS. The README phrases it as Playwright but for every app on your desktop. That is the lineage, not the dependency.

How many tools does Terminator's MCP server expose, and where can I see the list?

Thirty-one. The list is generated at compile time. crates/terminator-mcp-agent/build.rs reads server.rs, walks until it finds the line containing let result = match tool_name, collects every quoted name from the match arms, and bakes the comma-separated list into the binary as the MCP_TOOLS env var via rustc-env. crates/terminator-mcp-agent/src/prompt.rs reads env!("MCP_TOOLS") and pastes the names into the system prompt the server announces on initialize. So the LLM sees the same list the dispatch function dispatches against, automatically, every build. To see the list yourself: clone mediar-ai/terminator and run grep -n MCP_TOOLS crates/terminator-mcp-agent/build.rs crates/terminator-mcp-agent/src/prompt.rs, or open server.rs at line 9953 and read the arms.
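The extraction step described above can be sketched as follows. This is a reading of the description, not the contents of build.rs: the function names, the rule that the wildcard arm ends the scan, and the line-by-line parsing are assumptions. The marker string and the cargo:rustc-env directive format are the parts taken from the text and from Cargo's build-script conventions.

```rust
/// Collect the quoted tool names from the arms of the dispatch match in `source`.
pub fn extract_tool_names(source: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut in_match = false;
    for line in source.lines() {
        if !in_match {
            // Skip everything until the marker line.
            in_match = line.contains("let result = match tool_name");
            continue;
        }
        let trimmed = line.trim_start();
        // Assumption: the wildcard arm closes the list of named tools.
        if trimmed.starts_with("_ =>") {
            break;
        }
        // Only lines that begin with a quoted pattern are treated as arms.
        if !trimmed.starts_with('"') {
            continue;
        }
        let Some((pattern, _body)) = trimmed.split_once("=>") else {
            continue;
        };
        // Splitting on quotes leaves the names at odd positions,
        // which also covers arms written as "a" | "b" => ...
        for name in pattern.split('"').skip(1).step_by(2) {
            names.push(name.to_string());
        }
    }
    names
}

/// Format the directive a build script prints to set MCP_TOOLS at compile time.
pub fn rustc_env_directive(names: &[String]) -> String {
    format!("cargo:rustc-env=MCP_TOOLS={}", names.join(","))
}
```

A build script would read server.rs with std::fs::read_to_string, pass the text to extract_tool_names, and println! the directive, after which env!("MCP_TOOLS") resolves in the crate being built.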

How do I install Terminator's MCP server in Claude Code, Cursor, or VS Code?

One line for Claude Code: claude mcp add terminator "npx -y terminator-mcp-agent@latest" -s user. For Cursor and VS Code there are deep-link install buttons in the Terminator MCP Agent README. The agent is published as an npm package that wraps a Rust binary; it works over stdio by default (the editor spawns it as a child process) or over HTTP if you pass -t http (POST /mcp for calls, GET /status returns 503 when busy so a load balancer can drain). For the browser-DOM eval channel, additionally load the unpacked extension at crates/terminator/browser-extension/ in chrome://extensions with Developer Mode enabled.
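The busy signal on GET /status is easy to picture with one atomic flag. The sketch below is hypothetical: the AgentStatus type and its methods are invented for illustration, and the 200 returned when idle is an assumption. Only the 503-when-busy behavior comes from the description above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Tracks whether the agent is in the middle of a tool call (hypothetical type).
pub struct AgentStatus {
    busy: AtomicBool,
}

impl AgentStatus {
    pub fn new() -> Self {
        Self { busy: AtomicBool::new(false) }
    }

    /// Try to claim the agent for one call; false means it is already busy.
    pub fn try_begin(&self) -> bool {
        self.busy
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Release the agent once the call has finished.
    pub fn finish(&self) {
        self.busy.store(false, Ordering::Release);
    }

    /// Status code for GET /status: 503 while busy, 200 otherwise (200 is assumed).
    pub fn status_code(&self) -> u16 {
        if self.busy.load(Ordering::Acquire) { 503 } else { 200 }
    }
}
```

A load balancer polling that endpoint can stop routing to an instance while it reports 503 and resume once it reports ready again.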

Does Terminator work on macOS and Linux, or only Windows?

The MCP server runs on macOS and Windows. Per the repo's README, the most complete coverage is on Windows (Windows UIA gives the richest accessibility tree); macOS has the AX adapter and works for many apps. Linux desktop support depends on AT-SPI and is best-effort. The Chrome extension portion (the part that overlaps with playwright-mcp) is browser-driven so it works the same everywhere Chrome runs. If your use case is purely browser-DOM and you need Linux, playwright-mcp is the safer bet. If your use case is Windows desktop automation specifically, Terminator is built for it.

What is the catch?

Two real catches. First, accessibility trees are only as good as the apps that expose them. A well-built Win32 or UWP app gives you a clean tree; an Electron app or a custom-rendered canvas gives you mush, and you fall back to coordinate-based clicks or vision (Terminator has a gemini_computer_use tool for that, but it is slower and less reliable than tree-based selectors). Second, the Chrome extension requires loading an unpacked extension (developer mode) the first time, which is a manual step. If you cannot or will not enable developer extensions in Chrome, the navigate_browser path still works (it drives the OS-level browser window) but execute_browser_script does not have a DOM bridge. Most teams either enable developer extensions on the automation machine or use a headless Chrome path under run_command.