Guide · Desktop test telemetry · Windows Named Pipe · MCP notifications/progress

Automation testing for desktop applications, watched live over a pipe

Every guide on this subject covers how to send a click. Very few cover how the run reports back. Terminator gives each TypeScript desktop test its own Windows Named Pipe at \\.\pipe\mcp-workflow-events-{execution_id} and a tagged JSON line protocol. Eight structured event variants. No stderr scraping. No reporter plugin. The AI coding assistant that kicked the run off watches every step land in real time.

Matthew Diakonov
14 min read
Open-source, MIT
  • One pipe per execution, name contains the execution_id
  • JSON line protocol, tagged with "__mcp_event__": true
  • Eight event variants: Progress, StepStarted, StepCompleted, StepFailed, Screenshot, Status, Log, Data
  • 866 lines in event_pipe.rs, 458 lines in log_pipe.rs, both MIT

The tag that makes a line a test event

The whole distinction between "this is test telemetry" and "this is a stray console.log" sits in a single field. If the JSON carries "__mcp_event__": true at the top level, it becomes a typed WorkflowEvent and gets forwarded to the MCP client. Any line without that tag is ignored by the event reader. That means your test can still scatter console.log calls anywhere it likes; they end up in the log stream instead of the event stream, and the two never collide.
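
The split can be sketched in a few lines of TypeScript. This is not the agent's code (the real gate lives in Rust, in try_parse_event); routeLine is a hypothetical helper illustrating the rule the paragraph describes:

```typescript
// Tagged JSON lines become events; everything else, including malformed
// JSON and plain text, falls through to the log stream.
function routeLine(line: string): "event" | "log" {
  if (!line.startsWith("{")) return "log"; // fast reject, no parse attempt
  try {
    const value = JSON.parse(line);
    return value && value.__mcp_event__ === true ? "event" : "log";
  } catch {
    return "log"; // malformed JSON is just a chatty console.log
  }
}
```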

crates/terminator-mcp-agent/src/event_pipe.rs
8 WorkflowEvent variants
866 lines in event_pipe.rs
458 lines in log_pipe.rs
1 pipe per execution

A pipe per run, a run per pipe

The usual failure mode for concurrent desktop test runs on a single runner is log interleaving: two reporter plugins racing on stderr. Terminator sidesteps the race by giving each run its own named pipe. The server is created with first_pipe_instance(true), which makes a second writer with the same name a hard error instead of a silent merge. Add the execution_id suffix and two parallel runs cannot collide by accident.

crates/terminator-mcp-agent/src/event_pipe.rs

Eight events travel through a single channel

The WorkflowEvent enum has exactly eight variants. Every one maps to a concrete thing a desktop test run cares about. Nothing is reserved for "user data" except the explicit Data escape hatch.

Progress · StepStarted · StepCompleted · StepFailed · Screenshot · Status · Log · Data

Progress

current, total, message. Maps 1:1 to MCP notifications/progress so the client can render a real progress bar. The MCP spec treats this as a first-class notification, not a log level. One of the reasons the assistant can show a live status strip while a desktop test runs.

StepStarted

stepId, stepName, stepIndex, totalSteps, timestamp. Stable stepId lets the client join with StepCompleted or StepFailed later. totalSteps lets it compute a percentage without a separate Progress call.

StepCompleted

Adds duration (ms). Feeds a built-in per-step timing chart on the MCP client. Aggregates into regression tables across CI runs.

StepFailed

error + duration. The error string is the typed AutomationError variant from the SDK (ElementObscured, ElementNotStable, Timeout). No parsing. No regex. The retry policy matches on the variant string directly.

Screenshot

path OR base64, plus annotation and element. Attached inline to whichever step is in flight. The MCP client decides whether to cache to disk or render in the chat window.

Status

text, durationMs, position. Short-lived on-screen banner that can render over the app under test. Used for 'waiting 5 s for enabled Refund button' when the run is interactive.

Log

level (debug, info, warn, error), message, data. Routes through log_pipe.rs and forward_log_to_tracing so the same level reaches the MCP agent's tracing subscriber. One log line, zero stderr.

Data

key + value (any JSON). The catch-all for custom test telemetry that is not a step or a log. Durations, network traces, memory snapshots, whatever your runner wants to reason about later.
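
Putting the field lists above together, the wire shapes look plausibly like the following. This is a sketch: the only keys the article guarantees are __mcp_event__ and a type discriminator; the exact casing of the other keys is an assumption.

```typescript
// Illustrative JSON lines for three of the eight variants.
const progressLine = JSON.stringify({
  __mcp_event__: true,
  type: "progress",
  current: 2,
  total: 4,
  message: "Filling refund form",
});

const stepFailedLine = JSON.stringify({
  __mcp_event__: true,
  type: "step_failed",
  stepId: "refund-submit",
  error: "ElementObscured", // typed AutomationError variant, no regex needed
  duration: 420,
});

const dataLine = JSON.stringify({
  __mcp_event__: true,
  type: "data",
  key: "network_trace",
  value: { requests: 17, slowestMs: 840 },
});
```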

Producers on the left, consumers on the right

A single pipe multiplexes test-side producers into several agent-side consumers. The reader loop deserializes once, fans out to whichever destinations are subscribed at this moment. Tracing, the MCP transport, the on-disk replay log, and product analytics all read off the same stream.

event_pipe.rs routing

Producers: emit.stepStarted · emit.progress · emit.screenshot · console.log
        ↓  mcp-workflow-events-{id}
Consumers: tracing subscriber · MCP client · execution_logger.rs · PostHog tool timings

What the test looks like on the author's side

The test uses the same Desktop locator API as a normal Terminator test. The only thing added is an emit helper that writes tagged JSON to the pipe. When the run is not launched through MCP (you ran it as a plain bun script locally), MCP_EVENT_PIPE_PATH is unset and every emit.* call is a no-op. One test file covers both modes.

tests/checkout.spec.ts
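
The file itself is not reproduced here, but the dual-mode behavior can be sketched. makeEmit and its writer callback are hypothetical stand-ins for the @mediar-ai/terminator helper; only the emit.* method names and the MCP_EVENT_PIPE_PATH no-op contract come from the text above.

```typescript
type WriteLine = (line: string) => void;

// When MCP_EVENT_PIPE_PATH is unset there is no writer, so every call no-ops.
function makeEmit(writeLine?: WriteLine) {
  const send = (event: Record<string, unknown>) => {
    if (!writeLine) return; // plain `bun run` locally: silent no-op
    writeLine(JSON.stringify({ __mcp_event__: true, ...event }) + "\n");
  };
  return {
    stepStarted: (f: { stepId: string; stepName: string; stepIndex: number; totalSteps: number }) =>
      send({ type: "step_started", ...f }),
    progress: (f: { current: number; total: number; message: string }) =>
      send({ type: "progress", ...f }),
  };
}
```

Run under the MCP agent, the writer targets the named pipe from MCP_EVENT_PIPE_PATH; run as a plain bun script, makeEmit() gets no writer and every call silently returns, so one test file covers both modes.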

And what the agent sees as it runs

Log output from a local run with RUST_LOG=info. The first line is the pipe creation. Every DEBUG RawEvent line is a deserialized event on its way out the MCP transport. The close line is the child process exiting, which tears down the reader.

RUST_LOG=info cargo run -p terminator-mcp-agent

A single conversation between four parties

The full handshake. The assistant kicks the run off with an MCP tool call; the Rust agent creates the pipe; the TS test connects and streams events; the agent converts each event into the right MCP notification and forwards it back. A failure variant is just one more event type on the same wire.

mcp -> rust agent -> named pipe -> ts test

Participants: AI assistant ↔ Rust MCP agent ↔ Named pipe ↔ TS test

1. Rust MCP agent → TS test: MCP_EVENT_PIPE_PATH env var (unique per run)
2. Rust MCP agent: ServerOptions::new().create(pipe_name)
3. TS test → Named pipe: connect() + write {"__mcp_event__":true,"type":"step_started",...}
4. Rust MCP agent: BufReader::lines() -> try_parse_event -> WorkflowEvent
5. Rust MCP agent → AI assistant: notifications/progress { progressToken, progress, total, message }
6. TS test → Named pipe: {"__mcp_event__":true,"type":"step_failed","error":"ElementObscured"}
7. Rust MCP agent: WorkflowEvent::StepFailed { error: Some("ElementObscured"), duration: 420 }
8. Rust MCP agent → AI assistant: notifications/message (ERROR level)
9. TS test: pipe close (EOF)
10. Rust MCP agent: Ok(None) from next_line; drop PipeServerHandle

Where this sits compared to the usual playbook

The common approach to telemetry in desktop test frameworks is a reporter plugin that scrapes stdout at the end of a run, plus a JUnit XML written to disk. That works for CI but not for a live caller. Here is how the pipe approach differs on eight concrete questions.

| Feature | Typical desktop test reporter | Terminator (event_pipe.rs) |
| --- | --- | --- |
| Where test telemetry goes | Stdout and stderr, scraped by a reporter plugin. Vendor reporters for each CI platform. | A dedicated Windows Named Pipe per execution. JSON lines tagged `__mcp_event__: true`. No scraping. |
| Isolation between parallel runs | File locks or process env vars. Log interleaving when two workers share the same runner. | Pipe name includes the execution_id. Two runs = two pipes. first_pipe_instance(true) enforces one writer. |
| Protocol the test speaks | Proprietary reporter API, or JUnit XML written at end of run. | MCP notifications/progress. The same protocol the AI coding assistant already speaks to the agent. |
| Step lifecycle as first-class events | beforeEach + afterEach hooks that log lines. Reporter infers the shape from them. | Four structured variants: StepStarted, StepCompleted, StepFailed, and an out-of-band Progress. Typed in Rust, Zod-shaped in TS. |
| Screenshot attached to a step | Save file, log path, hope the reporter correlates by filename pattern. | Screenshot variant with path or base64, plus the annotation + element fields. Attached inline to the in-flight step. |
| Live vs. post-mortem | Usually post-mortem. Reporter compiles the final report after the run exits. | Live. The MCP agent forwards events as they arrive. The assistant sees step 2 of 4 while step 3 is executing. |
| Stderr pollution | High. Every log line competes with the reporter format. | Zero. Events go to the pipe, logs go through forward_log_to_tracing. Stderr stays clean for panics only. |
| License of the protocol | Proprietary or copy-left depending on the vendor. | MIT. event_pipe.rs + log_pipe.rs are 866 + 458 lines you can fork. |
__mcp_event__: true

The only field a line needs to graduate from console output to an MCP notifications/progress event. No schema version, no reporter registration, no plugin.

event_pipe.rs, RawEvent at line 106, try_parse_event at line 207

Wiring it in, one step at a time

The five steps that turn a plain TypeScript desktop test into a live-telemetry MCP workflow. Each step corresponds to a real file and function in the repo, not pseudocode.

1

Spawn the TypeScript test under the MCP agent

When the assistant calls the execute_sequence tool, the Rust server creates an EventPipeServer with the run's execution_id, passes the pipe path to the TS child process via MCP_EVENT_PIPE_PATH, then awaits on connect(). The child runs under bun (preferred) or node.

2

Emit tagged JSON lines from the test

The @mediar-ai/terminator package ships an emit helper. Every emit.* writes a JSON line with __mcp_event__:true to the pipe path from the env var. If the env var is missing (you are running the file with plain bun locally, no MCP), the helper no-ops. Your test works in both modes without a flag.

3

Rust parses each line with try_parse_event

try_parse_event inspects the leading byte (must be {), deserializes into RawEvent, checks is_mcp_event, then matches event_type into one of the eight WorkflowEvent variants. Unknown types fall through to a debug log and keep the pipe alive.
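
The parsing rules can be mirrored in TypeScript. parseEventLine is a hypothetical re-creation of the Rust logic described above, not the agent's code:

```typescript
// The eight event_type strings the article lists for WorkflowEvent::try_from.
const KNOWN_TYPES = new Set([
  "progress", "step_started", "step_completed", "step_failed",
  "screenshot", "status", "log", "data",
]);

// Leading '{' check, then the __mcp_event__ gate, then the type match.
// Unknown types return null instead of throwing, so one odd line never
// kills the pipe (the real agent debug-logs and keeps reading).
function parseEventLine(line: string): { type: string; [k: string]: unknown } | null {
  if (!line.startsWith("{")) return null;
  let raw: any;
  try { raw = JSON.parse(line); } catch { return null; }
  if (raw?.__mcp_event__ !== true) return null;
  if (!KNOWN_TYPES.has(raw.type)) return null;
  return raw;
}
```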

4

Forward events as MCP notifications

Progress variants become notifications/progress on the in-flight tool call token. StepFailed and error-level Log variants become notifications/message at ERROR. Screenshot variants can be surfaced as structuredContent on the final tool result, or streamed inline depending on the client capability.

5

Close the pipe when the TS run exits

The child's end of the pipe closing triggers next_line() returning Ok(None); the reader loop exits, PipeServerHandle is dropped, and the server tears down. The next execute_sequence creates a fresh pipe with a fresh execution_id. No leaks, no reuse across runs.

What you can actually do once this is in place

Eight concrete capabilities this primitive unlocks. Each one is a check you can run against the current codebase. None of them require a reporter plugin or a CI-specific integration.

Capabilities this unlocks

  • Render a live per-step progress bar in the assistant while a Windows desktop test runs, without a reporter plugin
  • Fail a step with a typed AutomationError variant (ElementObscured, ElementNotStable) and have the assistant retry based on the variant
  • Attach a screenshot to a specific step by stepId, not by filename correlation
  • Run two desktop tests in parallel on one runner without log interleaving or reporter races
  • Emit an arbitrary Data event with custom JSON so a downstream dashboard can aggregate it across runs
  • Keep stderr clean for real panics only. console.log does not flood the MCP transport.
  • Correlate events to the final ActionResult by execution_id because the pipe name already contains it
  • Write the test once and have it run with or without the MCP agent attached; emit is a no-op if MCP_EVENT_PIPE_PATH is unset

Anchor fact

Terminator's Rust MCP agent ships an event pipe at crates/terminator-mcp-agent/src/event_pipe.rs. 866 lines. MIT licensed. Eight WorkflowEvent variants (Progress, StepStarted, StepCompleted, StepFailed, Screenshot, Data, Status, Log) defined at lines 30 through 101. One pipe per execution, name generated by generate_pipe_name at line 243. The tag that separates telemetry from console noise lives on the is_mcp_event bool at line 106. Unit tests round-trip every variant against the exact JSON shape the TypeScript emit helper produces; run cargo test -p terminator-mcp-agent event_pipe to see them pass.

event_pipe.rs · log_pipe.rs · execution_logger.rs · server_sequence.rs

A number for every part of the story

The sizes you can verify from the repo without running the code. Each figure is a wc -l or a line-count on an enum match arm.

866

lines in event_pipe.rs

458

lines in log_pipe.rs

8

WorkflowEvent variants

7

day replay retention

Wire your desktop test suite into a live MCP event stream

Bring a test you already have. We will wrap it with the emit helper, point the named pipe at your MCP client, and have you watching step_completed events land in real time before the call ends.

Frequently asked

Frequently asked questions

Why is live test telemetry even a problem on the desktop? CI tools handle it for browsers.

Browser tests talk to a headless Chromium in the same process tree. Playwright hands you an internal events API (test.beforeEach, reporter, onStepBegin) and the reporter runs in the same node process. Desktop test runs do not get that for free. The test process drives Win32 UI via UIAutomation, which lives in a separate COM service; the test framework is a second process; and if an AI coding assistant is watching over MCP, that is a third boundary. Stdout and stderr become the lowest common denominator, which is why most desktop test tools end up with a reporter plugin that scrapes lines. Terminator replaces that scraping surface with a dedicated Windows Named Pipe per execution and a JSON line protocol tagged __mcp_event__:true. The pipe is created by the Rust MCP agent on the fly, its path is handed to the test child via the MCP_EVENT_PIPE_PATH env var, and the test emits with a tiny helper from @mediar-ai/terminator. That is the design choice the other playbooks skip.

What distinguishes a test event from an ordinary log line?

The string "__mcp_event__":true as a required field on the JSON object, deserialized into the is_mcp_event bool on the RawEvent struct at lines 106 and 143 of event_pipe.rs. try_parse_event at line 207 first rejects anything that does not start with {, then deserializes, then rejects anything where is_mcp_event is false. Only survivors reach the WorkflowEvent::try_from conversion at line 139 and become typed events. Plain console.log output stays stdout, gets forwarded as log_pipe.rs LogEntry records, and ends up in the tracing subscriber at a normal log level. The two pathways never mix. That is how the pipe stays free of structured noise even when the test is chatty.

How are pipes isolated across parallel test runs?

generate_pipe_name at line 243 of event_pipe.rs formats the full pipe path as \\.\pipe\mcp-workflow-events-{execution_id}, where execution_id is a UUID-shaped string the Rust agent allocates when execute_sequence is invoked. That string is unique per run, so two concurrent runs get two distinct pipe names. The server-side ServerOptions call at line 277 sets first_pipe_instance(true), which on Windows makes it an error for a second writer to attach to the same name. Combined, those two decisions make the pipe a single-writer, single-reader channel scoped to exactly one workflow. No mutex, no reference counting, no cleanup worker. When the child process exits and its end of the pipe closes, the reader loop breaks on next_line returning Ok(None) and the whole server tears down with its PipeServerHandle.
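
The naming scheme described here fits in one function. A sketch mirroring what generate_pipe_name is said to do (how the agent allocates the UUID-shaped id is not shown; the suffix is the entire isolation mechanism):

```typescript
// \\.\pipe\ is the Win32 named-pipe namespace; the per-run execution_id
// suffix guarantees two concurrent runs get two distinct pipe names.
function generatePipeName(executionId: string): string {
  return `\\\\.\\pipe\\mcp-workflow-events-${executionId}`;
}
```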

Which event types does the pipe accept today?

Eight. Defined as the WorkflowEvent enum at lines 30-101 of event_pipe.rs: Progress, StepStarted, StepCompleted, StepFailed, Screenshot, Data, Status, Log. The mapping from event_type strings to enum variants lives in WorkflowEvent::try_from at line 139 ('progress', 'step_started', 'step_completed', 'step_failed', 'screenshot', 'data', 'status', 'log'). Adding a new variant is a three-line change: add the enum variant, add the matching string arm in try_from, and re-expose it on the emit helper. There is no versioning; the pipe is consumed by the same repo that produces it, and MCP itself does not require a stable event catalog beyond the notifications it already spells out.

How does Terminator map these events to MCP notifications?

Progress events map straight to MCP notifications/progress on the outer tool call's progress token. The JSON payload already carries current, total, and message, which is the exact shape the protocol expects. Step lifecycle events (StepStarted, StepCompleted, StepFailed) and Log events with error level map to notifications/message with matching level fields. The screenshot variant is more flexible: it can be streamed as an in-flight notifications/message with an embedded image resource, or held for the final CallToolResult.structuredContent depending on the client's declared capabilities. That negotiation happens on the agent side, which means a test run does not need to know whether the client can render images mid-call; the pipe protocol stays flat.
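
The routing described here can be sketched as a small dispatch. The notification method names come from the MCP spec; mapToNotification itself is illustrative, not the agent's code:

```typescript
type Mapped = { method: string; level?: "debug" | "info" | "warn" | "error" };

function mapToNotification(event: { type: string; level?: Mapped["level"] }): Mapped {
  switch (event.type) {
    case "progress":
      // Rides the outer tool call's progress token as notifications/progress.
      return { method: "notifications/progress" };
    case "step_failed":
      return { method: "notifications/message", level: "error" };
    case "log":
      return { method: "notifications/message", level: event.level ?? "info" };
    default:
      // step_started, step_completed, status, screenshot, data: surfaced as
      // ordinary messages (or held for the final result, per client capability).
      return { method: "notifications/message", level: "info" };
  }
}
```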

Does this replace JUnit XML and reporter plugins?

It replaces them for the in-flight phase. Post-mortem JUnit XML is still the right format for external CI systems like GitHub Actions or Buildkite that expect a well-known file path at the end of the run. The pattern is to write a small sink on top of the event stream: keep a buffer of StepStarted/StepCompleted/StepFailed records per stepId, render them to testsuite XML at the end. That sink lives in 100 lines of Rust or TypeScript and does not need to be a vendor plugin. The advantage of the Terminator approach is that the sink is optional. The AI coding assistant gets live feedback directly from the event pipe; the CI adapter is just one more consumer of the same stream.
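
A sink of the kind described might look like the following sketch: buffer the step lifecycle events, render a testsuite element at the end. The element names follow the common JUnit XML convention; the types and function names are illustrative, and real step names would need XML escaping.

```typescript
type StepEvent =
  | { type: "step_started"; stepId: string; stepName: string }
  | { type: "step_completed"; stepId: string; duration: number }
  | { type: "step_failed"; stepId: string; duration: number; error: string };

function renderJUnit(events: StepEvent[]): string {
  const names = new Map<string, string>(); // stepId -> human name, joined later
  const cases: string[] = [];
  for (const e of events) {
    if (e.type === "step_started") names.set(e.stepId, e.stepName);
    else {
      const name = names.get(e.stepId) ?? e.stepId;
      const time = (e.duration / 1000).toFixed(3); // JUnit times are seconds
      cases.push(
        e.type === "step_failed"
          ? `<testcase name="${name}" time="${time}"><failure message="${e.error}"/></testcase>`
          : `<testcase name="${name}" time="${time}"/>`
      );
    }
  }
  const failures = events.filter((e) => e.type === "step_failed").length;
  return `<testsuite tests="${cases.length}" failures="${failures}">${cases.join("")}</testsuite>`;
}
```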

What happens if the MCP client does not support progress notifications?

Nothing breaks. The agent checks the peer's declared capabilities at initialization. If notifications/progress is not supported, it downgrades Progress events to notifications/message at info level, which every MCP client accepts. Screenshot events without image-resource support become log entries that reference the path. The test itself never sees that negotiation: emit.progress() and emit.screenshot() both write the same JSON line regardless. You do not branch your test code on client capability; the Rust agent does it once at the transport edge.

Can I use the pipe protocol without writing a TypeScript test?

Yes, the pipe is language-agnostic. Any program that can open a named pipe on Windows and write newline-delimited JSON with the __mcp_event__:true tag can be a producer. That includes Python (pywin32), PowerShell (System.IO.Pipes), Go (winio/npipe), or a C# NUnit test running via dotnet. The env-var handshake (MCP_EVENT_PIPE_PATH) is the only coordination needed. If you are bridging an existing test framework (TestComplete, Ranorex, pywinauto) into an MCP-driven AI assistant, writing a small adapter that forwards the framework's native hooks to the pipe is usually under a hundred lines. The source of truth for the wire format is the RawEvent struct in event_pipe.rs and its try_from at line 139.
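
As one concrete non-test producer, Node itself can open a Windows named pipe through the built-in net module. buildEventLine is a hypothetical helper showing the only contract a foreign producer must honor: newline-delimited JSON carrying the tag.

```typescript
import net from "node:net";

// The whole wire contract: one JSON object per line, tagged __mcp_event__:true.
function buildEventLine(type: string, fields: Record<string, unknown>): string {
  return JSON.stringify({ __mcp_event__: true, type, ...fields }) + "\n";
}

// Under the real handshake the agent supplies the pipe path via env var;
// guarding on it makes this sketch a no-op outside an MCP-launched run.
const pipePath = process.env.MCP_EVENT_PIPE_PATH;
if (pipePath) {
  const socket = net.connect(pipePath, () => {
    socket.write(buildEventLine("log", { level: "info", message: "hello from a non-test producer" }));
    socket.end();
  });
}
```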

Is any of this useful if I am not using MCP or an AI coding assistant?

Yes, for two reasons. First, the same pipe protocol and execution_logger.rs combination gives you a clean per-step replay artifact on disk: a .json record, a regenerated .ts snippet, and a .png screenshot, all correlated by execution_id, all under %LOCALAPPDATA%\mediar\executions\ with 7-day retention. That is a post-mortem debugging primitive even if you never attach a live consumer. Second, forward_log_to_tracing in log_pipe.rs writes your test's console output to the Rust tracing subscriber at the correct level, which means you get structured logging out of plain console.log calls. Both behaviors kick in the moment you run your test through the MCP agent, even if you immediately disconnect the MCP client.

Where in the repo can I read this and prove it runs?

crates/terminator-mcp-agent/src/event_pipe.rs, 866 lines, MIT licensed. The tests at the bottom of the file (#[cfg(test)] mod tests, lines 361 onward) round-trip the eight event types against the exact JSON shape the emit helper produces. Run cargo test -p terminator-mcp-agent event_pipe to see them pass. The companion file, crates/terminator-mcp-agent/src/log_pipe.rs, 458 lines, handles the structured logging half of the same architecture. Both files are small enough to read in under thirty minutes end-to-end. The integration into the main server is in main.rs and server_sequence.rs where execute_sequence_impl creates the pipe server, attaches it to the child process, and drains events into the MCP transport.

terminator — Desktop automation SDK
© 2026 terminator. All rights reserved.