Desktop app automation where your browser tab is just another app on the selector tree.

Every other guide on this subject picks a side. RPA tools stay in the Win32 and UIA layer, then hand off to a human when the workflow enters a browser. Playwright and Selenium stay in a headless browser they launched themselves, then hand off to a human when the workflow exits the browser. Terminator collapses the seam by running a WebSocket bridge on 127.0.0.1:17373 that promotes the browser tab the user is already using into the same selector tree as the native apps around it.

Matthew Diakonov, Founder, Mediar

Published April 23, 2026 · 12 min read

4.9 from open-source contributors

MIT-licensed Rust core, npm and pip SDKs on top

One transport for native UIA and live browser tabs

Runs at CPU speed, no model in the inner loop

One bridge. Two worlds.

127.0.0.1:17373 is the whole transport layer

Click a Win32 button in SAP

Then run JS in your logged-in Gmail tab

Same process. No reauth.

Chrome, Edge, Firefox, Brave, Opera.

The anchor: one constant, one port, one supervisor

The transport lives behind a single constant in the source tree. Open crates/terminator/src/extension_bridge.rs and look at line 32, where DEFAULT_WS_ADDR is set to "127.0.0.1:17373".

Further down the same file, eval_in_browser normalizes the browser name with a match statement. That match statement is the seam between native app automation and browser automation. Five arms map aliases on the left to canonical names on the right, and a default arm passes through the lower-cased process name. When a workflow calls eval_in_browser("Google Chrome", ...), the bridge knows which extension client holds the other end of the socket, and the JavaScript runs inside the tab that is already open in front of the user.
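Here is a minimal sketch of what that normalization could look like, rebuilt from the aliases listed in this article rather than copied from extension_bridge.rs. The function name normalize_browser_name is a placeholder, and the real code may be structured differently.

```rust
/// Loopback address the article gives for the bridge's default listener.
pub const DEFAULT_WS_ADDR: &str = "127.0.0.1:17373";

/// Hypothetical helper (not the name used in the Terminator source):
/// lower-cases the caller's browser name and maps known display names
/// to the canonical name the bridge uses to pick an extension client.
pub fn normalize_browser_name(target: &str) -> String {
    let lowered = target.to_lowercase();
    match lowered.as_str() {
        "google chrome" => "chrome".to_string(),
        "msedge" | "edge" | "microsoft edge" => "msedge".to_string(),
        "mozilla firefox" => "firefox".to_string(),
        "brave browser" => "brave".to_string(),
        "opera" => "opera".to_string(),
        // Anything else passes through as the lower-cased process name.
        other => other.to_string(),
    }
}
```

With this shape, "Google Chrome", "google chrome", and "chrome" all end up at the same client key, and an unknown browser is still routable if an extension announced itself under that lower-cased name.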

:17373

“The listener binds to 127.0.0.1 only. Remote peers cannot see the socket. The port is pinned so the Chrome extension and subprocess proxy can find it without configuration.”

DEFAULT_WS_ADDR, extension_bridge.rs line 32

The shape of a request

The protocol is small enough to fit on a napkin. The extension opens a WebSocket to ws://127.0.0.1:17373 and sends a hello. The bridge stores the browser name. When a caller wants to run JavaScript, it sends an EvalRequest. The extension runs the code and replies on the same id. Console logs and runtime exceptions ride the same connection as typed incoming events so you never need a second debugger attach.
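To make the round trip concrete, here is a rough, dependency-free model of the hello message, the eval request, and the id matching. The JSON key names, the PendingEvals type, and the hand-rolled escaping are all assumptions for illustration; the real bridge's wire format may differ, and it works with serde_json values rather than string formatting.

```rust
use std::collections::HashSet;

/// Minimal JSON string escaping, enough for this sketch.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Assumed shape of the extension's greeting: it declares the browser name.
pub fn hello_message(browser: &str) -> String {
    format!(r#"{{"type":"hello","browser":"{}"}}"#, json_escape(browser))
}

/// Fields the article names for an eval request; key names are assumed.
pub struct EvalRequest {
    pub id: String,
    pub code: String,
    pub await_promise: bool,
}

impl EvalRequest {
    pub fn to_json(&self) -> String {
        format!(
            r#"{{"action":"eval","id":"{}","code":"{}","await_promise":{}}}"#,
            json_escape(&self.id),
            json_escape(&self.code),
            self.await_promise
        )
    }
}

/// Hypothetical bookkeeping: remembers which ids are still awaiting a reply.
#[derive(Default)]
pub struct PendingEvals {
    waiting: HashSet<String>,
}

impl PendingEvals {
    /// Records the id as in flight and returns the frame to send.
    pub fn send(&mut self, req: &EvalRequest) -> String {
        self.waiting.insert(req.id.clone());
        req.to_json()
    }

    /// Accepts a result only if its id matches a request still in flight.
    pub fn on_result(&mut self, id: &str, value: &str) -> Option<String> {
        if self.waiting.remove(id) {
            Some(value.to_string())
        } else {
            None
        }
    }
}
```

The id is what lets several evals share one socket: a reply for an unknown or already-completed id is simply ignored in this model.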

A realistic cross-boundary script

This is the shape of automation most real businesses need but most frameworks cannot express. One process. Three targets. Two of them are browser tabs in different browsers; the third is a native window.

The transport is tiny. The implications are not.

A single WebSocket listener on a loopback port unlocks a shape of automation that neither Playwright-style browser control nor Win32-first RPA can replicate without gluing two stacks together.

Everything routes through :17373

17373 WebSocket port, hardcoded default

5 browsers routable from one call

1 process, not two (browser stays yours)

10 second retry window for eval_in_browser

What this buys you over the two common stacks

The conventional answer to cross-boundary desktop app automation is "use Playwright for the browser, use UIA for the rest, and bolt them together." That bolt-on fails at two points: authentication (Playwright launches a clean browser by default, so it does not see the user's cookies) and synchronization (nothing coordinates focus between the two tools). Terminator collapses both problems into one selector engine and one transport.

Feature	Playwright + RPA bolt-on	Terminator
Connection transport	Chrome DevTools Protocol over a new browser instance	WebSocket on 127.0.0.1:17373, user's existing browser profile
Cookies and session	Empty by default, imported per test run	Inherited from the running browser, every request
Target browsers	Chromium, Firefox, and WebKit builds that Playwright launches itself	Chrome, Edge, Firefox, Brave, Opera, from one call site
Native app control	Out of scope	Same process, via UIA selectors (role:Button|name:Save)
Subprocess child	Each child spawns its own browser	Children proxy through the parent bridge (TERMINATOR_PARENT_BRIDGE_PORT=17373)
Return channel	DOM snapshot, no native signal	Click returns window_title_changed and bounds_changed; eval returns JSON value

Five things the bridge does that most pages about this topic never mention

One selector grammar

role:Button|name:Save hits a native button. The same locator syntax reaches into the DOM when the target is a browser tab. The engine picks the right adapter based on which app the selector resolves inside.

Five browsers, one call

eval_in_browser takes target_browser as a string. The normalization table accepts 'google chrome', 'microsoft edge', 'mozilla firefox', 'brave browser', and 'opera', then routes the eval to whichever extension client matched on the last hello message.

No headless process

The extension lives inside the browser the user already has open. There is no second Chrome, no profile copy, no cookie jar merge. If the user is logged in, the script is logged in.

Console and exceptions included

console.log, console.error, and runtime exceptions in the page surface as typed incoming messages on the same socket (TypedIncoming::ConsoleEvent, ExceptionEvent, LogEvent). You do not need a second transport to debug the JavaScript you just ran.

Self-healing bridge

A supervisor holds a reference to the server task and restarts it if the task dies. If the port is already taken by an ancestor process, the new process becomes a proxy client instead of failing (line 217 of extension_bridge.rs).
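The fallback from server to proxy client can be sketched with nothing but the standard library. This version treats any "address in use" error as the cue to become a proxy. The check that the port holder is actually a Terminator ancestor process is left out, and BridgeRole and claim_bridge are hypothetical names.

```rust
use std::io;
use std::net::TcpListener;

/// Hypothetical outcome of trying to claim the bridge address.
#[derive(Debug)]
pub enum BridgeRole {
    /// We own the listener and will serve extension clients ourselves.
    Server(TcpListener),
    /// Someone else already holds the port; forward requests to them.
    ProxyClient,
}

/// Tries to bind `addr`. An "address in use" error means another process
/// got there first, so this process steps down to proxy mode instead of
/// failing. Any other bind error is returned to the caller.
pub fn claim_bridge(addr: &str) -> io::Result<BridgeRole> {
    match TcpListener::bind(addr) {
        Ok(listener) => Ok(BridgeRole::Server(listener)),
        Err(e) if e.kind() == io::ErrorKind::AddrInUse => Ok(BridgeRole::ProxyClient),
        Err(e) => Err(e),
    }
}
```

Calling claim_bridge("127.0.0.1:17373") twice on the same machine, while the first listener is still alive, would give a Server the first time and a ProxyClient the second.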

Watch it cross the boundary

Script fires eval_in_browser

Your Rust or TS workflow asks Terminator to run JavaScript in the 'chrome' browser target.

Getting a workflow that crosses the seam

Install Terminator and the MCP agent

npx -y terminator-mcp-agent@latest registers the MCP server. The first run binds 127.0.0.1:17373 and starts the extension bridge supervisor. Nothing happens yet because no browser extension is connected.

Install the Chrome extension

The extension is signed and distributed from the Terminator repo. On install, the background service worker opens a WebSocket to ws://127.0.0.1:17373 and sends a hello message declaring the browser name. Edge, Brave, and Opera share the same Chromium extension. Firefox has its own build.

Write a workflow that crosses the boundary

In your Rust, TypeScript, or Python script, or in a YAML workflow, interleave native UIA actions (click_element, type_into_element) with eval_in_browser calls. You do not launch a second browser, and you do not authenticate again. The user's session is already there.

Spawn subprocess helpers if you need to

Scripts run through the MCP scripting engine inherit TERMINATOR_PARENT_BRIDGE_PORT=17373 in their environment. When the child imports the terminator SDK, it finds the port held by its parent and registers as a proxy client (ClientType::Subprocess) instead of rebinding.

Handle browser switching

Pass the browser process name as the first argument to eval_in_browser. The normalization layer accepts the display name the user would read ('Google Chrome', 'Microsoft Edge'), maps it to the bridge's canonical name, and routes to the matching client. If the target browser has no extension connected, the caller gets Ok(None) after a bounded retry window, not a silent fallback to the wrong browser.
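The "bounded retry, then Ok(None)" behavior is easy to model. The sketch below is a generic helper, not Terminator's code: eval_with_retry, its parameters, and the polling interval are assumptions, and the closure stands in for "look up the client and run the eval".

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Calls `attempt` until it yields a value or `window` has elapsed.
/// - Ok(Some(v)): the target answered.
/// - Ok(None): no client showed up inside the window (no fallback is tried).
/// - Err(e): a real failure, returned immediately without further retries.
pub fn eval_with_retry<T, E>(
    window: Duration,
    interval: Duration,
    mut attempt: impl FnMut() -> Result<Option<T>, E>,
) -> Result<Option<T>, E> {
    let deadline = Instant::now() + window;
    loop {
        if let Some(value) = attempt()? {
            return Ok(Some(value));
        }
        if Instant::now() >= deadline {
            return Ok(None);
        }
        sleep(interval);
    }
}
```

Callers should treat Ok(None) as "that browser has no extension connected" and decide what to do about it themselves, which is the point of not falling back silently.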

The terminal view on a first run

What actually happens when you run the MCP agent for the first time on a clean machine with the extension installed.

Why this is a developer framework, not a shrink-wrapped RPA app

Terminator is not a recorder you hand to a business analyst. It is a Rust library, a TypeScript SDK, a Python module, a CLI, and an MCP server. The bridge is a single 1,423-line file in the core crate rather than a polished UI because the primary audience is people writing code: developers already using Cursor, Claude Code, Windsurf, or VS Code who want their AI assistant to reach outside the editor and drive the desktop. The same npx command, wrapped as claude mcp add terminator "npx -y terminator-mcp-agent@latest", wires the entire bridge into Claude Code, and your assistant gains the ability to click, type, and evaluate JS in the user's live tabs.

The programming model is Playwright-shaped

locator("role:Button|name:Save").click() is the same call whether the target is a native window or a DOM element inside a browser tab. If you have written Playwright tests, the API surface will feel familiar. The difference is the scope, not the grammar.
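For a feel of the grammar, here is a toy parser for selectors of the form key:value|key:value. It only covers the shape shown in this article (role:Button|name:Save). The real selector engine may accept more syntax, and parse_selector is a hypothetical name.

```rust
/// Splits a selector such as "role:Button|name:Save" into (key, value)
/// pairs. Segments are separated by '|'; each segment is split at its
/// first ':' so values may themselves contain colons.
pub fn parse_selector(selector: &str) -> Result<Vec<(String, String)>, String> {
    selector
        .split('|')
        .map(|part| -> Result<(String, String), String> {
            let (key, value) = part
                .split_once(':')
                .ok_or_else(|| format!("segment {:?} has no ':'", part))?;
            let (key, value) = (key.trim(), value.trim());
            if key.is_empty() || value.is_empty() {
                return Err(format!("segment {:?} has an empty key or value", part));
            }
            Ok((key.to_string(), value.to_string()))
        })
        .collect()
}
```

The same parsed pairs can then be handed to whichever adapter owns the target, a UIA lookup for a native window or a DOM query for a tab, which is what lets one locator string serve both.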

Pre-trained deterministic workflows

The README claims "100x faster than ChatGPT Agents, Claude, Perplexity Comet, BrowserBase, BrowserUse" not because the models are faster but because most of the workflow runs as deterministic code. The bridge is only invoked when the workflow actually needs to reach into a browser. Everything else fires at CPU speed with no model round trip.

The five browsers the bridge already knows about

Name you pass	Canonical name
Chrome	"chrome"
Edge	"msedge"
Firefox	"firefox"
Brave	"brave"
Opera	"opera"
Google Chrome	"chrome"
Microsoft Edge	"msedge"
Mozilla Firefox	"firefox"
Brave Browser	"brave"

Any name you pass to eval_in_browser is lower-cased and matched against this table before the bridge picks a client.

When you should not reach for this

The bridge is Windows-first today. The feature matrix in the README is honest about that: macOS and Linux are marked "No" across the board for both core automation and the advanced features that depend on UIA. If your desktop app automation target is Safari on a Mac or GNOME on Linux, Terminator is not the right choice right now. It is also not the right choice if you need a sandboxed browser session that is explicitly separate from the user's profile; the whole point of the bridge is that it inherits the profile. For test automation in CI, a vanilla Playwright setup will serve you better. For automation that has to drive SAP, Outlook, a native Electron app, and a browser tab in one script, without logging in again, the bridge is the answer.

The numbers in the count-ups above, 17373 and five, come from one constant and one match statement in one Rust file. Under 60 lines of Rust pick the client, retry for up to 10 seconds if the target browser has no extension connected yet, and serialize the return value back across the socket. That is the entire cost of making a browser tab behave like a first-class citizen on the desktop automation tree.

Questions about the bridge that do not have good answers elsewhere

What is desktop app automation when the app in question is actually a browser tab?

It is programmatic control of any application that happens to be running on a user's desktop, including the browser and the tabs inside it. Most business software is now SaaS, which means the thing labelled 'desktop app' in a task manager is often just Chrome hosting a CRM, a ticketing tool, or an email client. A realistic desktop app automation script has to treat the native Windows app and the page inside a browser tab as the same kind of target. Terminator does this by exposing UIA elements for native controls and shipping a Chrome extension that surfaces DOM elements through the same bridge. The transport in both cases is a WebSocket on 127.0.0.1:17373.

Why port 17373 specifically, and can it be changed?

It is a const on line 32 of crates/terminator/src/extension_bridge.rs: DEFAULT_WS_ADDR = "127.0.0.1:17373". The bridge tries to bind that address at startup; if the port is already bound by another terminator-mcp-agent process in the same ancestor chain, the new process falls into proxy client mode instead of failing. The port can only be changed by constructing the extension bridge with a different address. The Chrome extension and the subprocess proxy both default to 17373, so changing it also means rebuilding the extension.

Which browsers can the bridge actually drive?

The extension normalization table is explicit. eval_in_browser on lines 1070 to 1090 lower-cases the target, then maps 'google chrome' to 'chrome'; 'msedge', 'edge', and 'microsoft edge' to 'msedge'; 'mozilla firefox' to 'firefox'; 'brave browser' to 'brave'; and 'opera' to 'opera'. Anything else passes through as the lower-cased process name. The extension itself sends a typed 'hello' message on connect declaring its browser, and the bridge tracks which client speaks for which browser. With two browsers open at once, each receives its own eval requests and the results come back on the right channel.

Does the bridge actually execute JavaScript, or does it fake it with clicks?

It runs real JavaScript inside the target tab via the extension's scripting permissions. An EvalRequest with action 'eval', an id, the code string, and an optional await_promise flag is serialized as JSON, sent over the WebSocket, executed in the page context, and the return value is shipped back as a serde_json::Value on the matching id. Console messages, runtime exceptions, and Log.entryAdded events are surfaced as typed incoming messages so the caller can see console.error output from the page without a second transport.
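One way to picture "typed incoming messages on the same socket" is an enum with one variant per message kind. The variant names ConsoleEvent, ExceptionEvent, and LogEvent come from this article. The EvalResult variant, every field, and the split_incoming helper are assumptions made for the sketch.

```rust
/// Rough model of messages arriving from the extension. Only the three
/// *Event variant names are taken from the article; the rest is assumed.
#[derive(Debug, Clone, PartialEq)]
pub enum TypedIncoming {
    EvalResult { id: String, value: String },
    ConsoleEvent { level: String, text: String },
    ExceptionEvent { text: String },
    LogEvent { text: String },
}

/// Separates the reply for `wanted_id` from page diagnostics that arrived
/// on the same connection, so the caller gets both without a second
/// transport.
pub fn split_incoming(
    messages: Vec<TypedIncoming>,
    wanted_id: &str,
) -> (Option<String>, Vec<String>) {
    let mut result = None;
    let mut diagnostics = Vec::new();
    for msg in messages {
        match msg {
            TypedIncoming::EvalResult { id, value } if id == wanted_id => {
                result = Some(value);
            }
            // A result for some other in-flight request: not ours.
            TypedIncoming::EvalResult { .. } => {}
            TypedIncoming::ConsoleEvent { level, text } => {
                diagnostics.push(format!("console.{}: {}", level, text));
            }
            TypedIncoming::ExceptionEvent { text } => {
                diagnostics.push(format!("exception: {}", text));
            }
            TypedIncoming::LogEvent { text } => {
                diagnostics.push(format!("log: {}", text));
            }
        }
    }
    (result, diagnostics)
}
```

A caller that evals a snippet containing console.error("boom") would, in this model, get the return value and the line "console.error: boom" from the same stream.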

What happens to auth and cookies when automation spawns a new process?

Nothing. The extension runs inside the user's existing browser profile, which means every request that originates from automation carries the same cookies, session storage, and localStorage that the user already has. There is no separate browser context, no headless instance, and no handshake. If the user is logged into Gmail, Terminator is logged into Gmail. This is the reason the README lists 'uses your browser session, no need to relogin, keeps all your cookies and auth' as feature number one of Terminator MCP. The handoff happens through the extension, not through a cookie file copy or a CDP attach.

How does this compare to Playwright or Selenium for desktop app automation?

By default, Playwright and Selenium only see the DOM inside a controlled browser process that they spawned. They cannot click a Win32 File menu, cannot interact with a Windows Open File dialog that covers the browser, cannot drive SAP GUI, and cannot drive a native Electron-shell app like the Slack desktop client through its menus. Terminator's bridge inverts the model: it attaches to the browser the user is already running, drives every native app via UIA, and lets a single script cross the boundary. A real workflow looks like 'open SAP, export a CSV, pick up the file in Outlook desktop, paste values into a Gmail tab already open in Chrome', which neither Playwright nor any pure-desktop RPA tool can cover end to end.

What is subprocess proxy mode, and when does it kick in?

Workflows in Terminator can run scripts (Node, Python, JS in browser) as child processes. Those children are started with TERMINATOR_PARENT_BRIDGE_PORT=17373 in their environment (scripting_engine.rs line 1865). When a child process imports the terminator SDK and the extension_bridge module tries to bind 127.0.0.1:17373, it finds the port in use by its own parent, detects the terminator-mcp-agent ancestor via find_terminator_ancestor, and switches to proxy client mode (line 217 of extension_bridge.rs). The child does not need its own extension; its eval requests travel up to the parent, out to the extension, and the result comes back. This is how a Node worker spawned inside a workflow can still talk to your logged-in Chrome.
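The environment-variable half of that handshake is simple to model. The variable name below is the one this article gives. BridgeMode and both functions are hypothetical, and the sketch decides from the variable alone, whereas the flow described above also involves the failed bind and the ancestor check.

```rust
/// Variable the article says the scripting engine sets for child processes.
pub const PARENT_BRIDGE_PORT_VAR: &str = "TERMINATOR_PARENT_BRIDGE_PORT";

/// Hypothetical mode a process picks for itself at startup.
#[derive(Debug, PartialEq)]
pub enum BridgeMode {
    /// No parent announced: bind the port and serve the extension directly.
    OwnServer,
    /// A parent holds the bridge: send eval requests up to it.
    ProxyToParent { port: u16 },
}

/// Pure decision logic, kept separate from the environment for testing.
pub fn mode_from_value(value: Option<&str>) -> Result<BridgeMode, String> {
    match value {
        None => Ok(BridgeMode::OwnServer),
        Some(raw) => raw
            .trim()
            .parse::<u16>()
            .map(|port| BridgeMode::ProxyToParent { port })
            .map_err(|e| format!("invalid {}={:?}: {}", PARENT_BRIDGE_PORT_VAR, raw, e)),
    }
}

/// Reads the variable from the current process environment.
pub fn mode_from_env() -> Result<BridgeMode, String> {
    mode_from_value(std::env::var(PARENT_BRIDGE_PORT_VAR).ok().as_deref())
}
```

A child started with TERMINATOR_PARENT_BRIDGE_PORT=17373 would resolve to ProxyToParent { port: 17373 } here, while a process started without the variable would try to own the server.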

Is the bridge a security risk? A local WebSocket sounds noisy.

The listener binds to 127.0.0.1 and nowhere else, so no network peer outside the loopback interface can see it. Messages have to arrive from a client that speaks the exact 'hello/eval/result' protocol or they are dropped. The Chrome extension has to be installed by the user, and the subprocess proxy requires the TERMINATOR_PARENT_BRIDGE_PORT env var to be set, which only Terminator's own scripting engine does. That said, any local process that knows the port number and the protocol can connect. If you are building automation that touches sensitive tabs, treat the bridge the same way you would treat a local debugger socket: do not expose it, do not run untrusted code under the same OS user, and rely on the OS boundary.
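If you do construct the bridge with a custom address, a small guard can refuse anything that is not loopback before binding. This check is an illustration, not something the article says Terminator does, and is_loopback_only is a hypothetical name.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Returns Ok(true) only when `addr` is an IP:port pair on the loopback
/// interface (127.0.0.0/8 or ::1). Hostnames are rejected as parse errors
/// so that nothing here depends on DNS resolution.
pub fn is_loopback_only(addr: &str) -> Result<bool, AddrParseError> {
    let parsed: SocketAddr = addr.parse()?;
    Ok(parsed.ip().is_loopback())
}
```

Under this check "127.0.0.1:17373" passes, while "0.0.0.0:17373" fails because it would listen on every interface, which is exactly the exposure the loopback bind is meant to avoid.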