
The ui testing automation tools recorder most listicles never describe

Every roundup of ui testing automation tools says the same thing about recorders. "It captures your actions and plays them back." Nobody shows the format. Nobody describes the data shape. The recorder output is treated as an opaque artifact, which is why replay is brittle and why tests built from recordings break when the UI shifts a pixel. Terminator's recorder emits a typed, open, 14-variant event stream defined in crates/terminator-workflow-recorder/src/events.rs. This page is a walkthrough of that enum and why the variant choices matter.

Terminator, desktop automation framework

Published April 20, 2026. 12 min read

Open-source, MIT

14 variants in WorkflowEvent enum (events.rs lines 475-517)

5 TextInputMethod variants: Typed, Pasted, AutoFilled, Suggestion, Mixed

6 ApplicationSwitchMethod variants including AltTab and TaskbarClick

FileOpened emits ranked candidate_paths with High/Medium/Low confidence

Semantic events, not keystrokes

what Terminator's recorder actually emits

Other recorders: raw key-down, key-up, mouse-move streams

Terminator: 14 WorkflowEvent variants, 8 semantic

TextInputCompleted tells you Typed vs Pasted vs AutoFilled

ApplicationSwitch tells you AltTab vs TaskbarClick vs StartMenu

FileOpened resolves the real path, with confidence


Recorders are where ui testing automation tools quietly differ

Open any 2026 listicle. You will see the same phrase applied to a dozen products: "records your actions and replays them." The sentence is correct. It is also useless. Every recorder records something. What separates the brittle ones from the durable ones is the shape of what ends up on disk. If a recorder saves a list of mouse coordinates, your test fails when a window opens at a slightly different position. If it saves a stream of keydown events, your test cannot tell a typed value from a pasted one. If it saves only what is visible on screen, it cannot tell whether the user switched applications with Alt+Tab or with a taskbar click.

Terminator's recorder writes out something different: a stream of semantic events. An Alt+Tab is an ApplicationSwitch with switch_method=AltTab. A pasted email address is a TextInputCompleted with input_method=Pasted and the full text value in one field. A double-click on an Excel file is a FileOpened with a ranked candidate path list.

14 WorkflowEvent variants

5 TextInputMethod values

6 ApplicationSwitchMethod values

7 TabAction values

The 14-variant WorkflowEvent enum, verbatim

This is the actual enum. The first six are low-level raw events (disabled by default in most recording configs, kept for edge cases). The last eight are high-level semantic events. Every one of them is a struct with explicit fields, not a blob of bytes.

crates/terminator-workflow-recorder/src/events.rs
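A condensed sketch of the enum's shape, using only the variant names given on this page. Payloads are dropped here for brevity, which is an assumption of the sketch: the real variants carry structs with explicit fields. `WorkflowEventKind`, `ALL`, and `is_low_level` are hypothetical names, not identifiers from events.rs.

```rust
/// Variant names as listed on this page. Payload structs are omitted
/// (an assumption for brevity); the real variants carry explicit fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkflowEventKind {
    // Low-level raw events (the first six).
    Mouse,
    Keyboard,
    Clipboard,
    TextSelection,
    DragDrop,
    Hotkey,
    // High-level semantic events (the last eight).
    TextInputCompleted,
    ApplicationSwitch,
    BrowserTabNavigation,
    Click,
    BrowserClick,
    BrowserTextInput,
    FileOpened,
    PendingAction,
}

impl WorkflowEventKind {
    /// All fourteen kinds, in the order this page lists them.
    pub const ALL: [WorkflowEventKind; 14] = [
        Self::Mouse,
        Self::Keyboard,
        Self::Clipboard,
        Self::TextSelection,
        Self::DragDrop,
        Self::Hotkey,
        Self::TextInputCompleted,
        Self::ApplicationSwitch,
        Self::BrowserTabNavigation,
        Self::Click,
        Self::BrowserClick,
        Self::BrowserTextInput,
        Self::FileOpened,
        Self::PendingAction,
    ];

    /// True for the six raw kinds, false for the eight semantic ones.
    pub fn is_low_level(self) -> bool {
        matches!(
            self,
            Self::Mouse
                | Self::Keyboard
                | Self::Clipboard
                | Self::TextSelection
                | Self::DragDrop
                | Self::Hotkey
        )
    }
}
```

A consumer that only cares about semantic events could filter on `!kind.is_low_level()` before doing anything else.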

Fourteen variants, one per class of intent

You never produce these yourself. The recorder emits them. But the set is the vocabulary your test code will read when you consume a recording, so it helps to see all fourteen at once.

Mouse, Keyboard, Clipboard, TextSelection, DragDrop, Hotkey, TextInputCompleted, ApplicationSwitch, BrowserTabNavigation, Click, BrowserClick, BrowserTextInput, FileOpened, PendingAction

What each semantic variant actually tells you

Fourteen variants collapse into a small number of test-authoring decisions. Each card below is one variant (or group) and the field shape it gives you.

TextInputCompleted

Fires once per field after the user stops typing for ~500ms. Stores text_value, field_name, field_type, typing_duration_ms, keystroke_count, and the input_method from {Typed, Pasted, AutoFilled, Suggestion, Mixed}. This is the single most useful event in the enum for test authoring.

ApplicationSwitch

Records a focus change between two processes. Includes from_process_name, to_process_name, switch_method ({AltTab, TaskbarClick, WindowsKeyShortcut, StartMenu, WindowClick, Other}), and dwell_time_ms in the previous app. Six methods, not one.

FileOpened

Fires when a new window title appears to reference a file. The recorder searches recent-access paths, ranks by LastAccessTime, and emits a FilePathConfidence of High, Medium, or Low. Your test gets a real path, not a title fragment.

BrowserTabNavigation

Chrome-extension-bridged event with to_url, from_url, to_title, from_title, browser, tab_index, total_tabs, is_back_forward. Action is one of {Created, Switched, Closed, Moved, Duplicated, Pinned, Refreshed}. Method tracks how the navigation happened.

Click and BrowserClick

Two click variants. Click uses accessibility role plus name. BrowserClick additionally carries DomElementInfo with the CSS path and up to 5 ranked SelectorCandidate entries, so the downstream replay can pick the selector that actually survived the last page update.

BrowserTextInput

DOM-aware text input. Emitted via the extension bridge for inputs inside a browser tab. Carries the field's DomElementInfo so replay does not need to walk the accessibility tree, which is lossy for web forms.

Hotkey, Clipboard, TextSelection, DragDrop

The four semantic layer-helpers. Hotkey pattern matches against a small known-shortcut list (save, copy, close tab, etc.). Clipboard stores a hash plus action. TextSelection records the selected substring. DragDrop captures the source, target, and payload.

Mouse, Keyboard, PendingAction

Low-level leftovers. Mouse and Keyboard are the raw event streams (disabled by default in the config). PendingAction is an internal bookkeeping event emitted right before a capture completes, so consumers can block on UI-tree refresh before reading the next event.
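For the BrowserClick card above, here is a minimal sketch of how a replay layer might use the ranked candidates. The fields of `SelectorCandidate` shown here (`selector`, `rank`) and the `pick_selector` helper are assumptions for illustration; the page only states that up to 5 ranked entries are recorded.

```rust
/// Hypothetical shape: the page names SelectorCandidate but not its fields.
#[derive(Debug, Clone, PartialEq)]
pub struct SelectorCandidate {
    pub selector: String,
    /// Lower rank means "preferred" in this sketch.
    pub rank: u8,
}

/// Return the best-ranked candidate that still matches the live page.
/// `still_matches` stands in for a real DOM query.
pub fn pick_selector<'a, F>(
    candidates: &'a [SelectorCandidate],
    still_matches: F,
) -> Option<&'a SelectorCandidate>
where
    F: Fn(&str) -> bool,
{
    let mut ordered: Vec<&SelectorCandidate> = candidates.iter().collect();
    ordered.sort_by_key(|c| c.rank);
    ordered
        .into_iter()
        .find(|c| still_matches(c.selector.as_str()))
}
```

If the top-ranked selector disappeared in the last page update, the replay falls through to the next one instead of failing outright.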

The single most useful event: TextInputCompleted

Most of the interesting information in a UI workflow is: what did the user put in which field, and how did it get there? The event below captures that in one struct per field session. The input_method is the distinguishing field. Five values. No other mainstream recorder I know of exposes this.

crates/terminator-workflow-recorder/src/events.rs
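A sketch of the struct's shape, built from the field names this page mentions. The Rust types, the `ms_per_keystroke` helper, and the `"Edit"` field type in the example are assumptions; check events.rs for the real definition.

```rust
/// How the text arrived in the field (the five values named on this page).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TextInputMethod {
    Typed,
    Pasted,
    AutoFilled,
    Suggestion,
    Mixed,
}

/// One completed field session. Field names follow the page;
/// the types are assumptions of this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct TextInputCompletedEvent {
    pub text_value: String,
    pub field_name: Option<String>,
    pub field_type: String,
    pub input_method: TextInputMethod,
    pub typing_duration_ms: u64,
    pub keystroke_count: u32,
}

impl TextInputCompletedEvent {
    /// Average milliseconds per keystroke, or None when no key was pressed.
    /// Hypothetical helper, not part of the recorder.
    pub fn ms_per_keystroke(&self) -> Option<f64> {
        if self.keystroke_count == 0 {
            None
        } else {
            Some(self.typing_duration_ms as f64 / f64::from(self.keystroke_count))
        }
    }
}

/// The pasted-email example from this page's FAQ.
pub fn pasted_email_example() -> TextInputCompletedEvent {
    TextInputCompletedEvent {
        text_value: "finance@acme.com".to_string(),
        field_name: Some("To".to_string()),
        field_type: "Edit".to_string(), // assumed value
        input_method: TextInputMethod::Pasted,
        typing_duration_ms: 720,
        keystroke_count: 0,
    }
}
```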

How the recorder tells Typed from Pasted

The detection is timing and keystroke arithmetic, not a heuristic based on field type. Walk through what the recorder does when a user pastes.

Classifying a 'finance@acme.com' entry into TextInputMethod


Stage 1: Field focus

The user Tab-focuses an Edit element. An InputTextAccumulator is created with start_time=now, keystroke_count=0, initial_text=current field value.

FileOpened: window title to ranked path list

When a user opens a document, the window title usually shows the filename, not the full path. Every recorder I have seen before Terminator just saves the title verbatim. This one tries harder. The struct below is what you get.

crates/terminator-workflow-recorder/src/events.rs
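A sketch of the event's shape plus one way the confidence tiers could be assigned from last-access times. The field names `primary_path`, `candidate_paths`, and the `FilePathConfidence` values come from this page. The `filename` field, the `build_file_opened` helper, and the `CLEAR_GAP_MS` threshold are assumptions, as is the exact rule for what counts as "clearly the most recent".

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilePathConfidence {
    High,
    Medium,
    Low,
}

#[derive(Debug, Clone, PartialEq)]
pub struct FileOpenedEvent {
    pub filename: String,
    pub primary_path: Option<String>,
    pub candidate_paths: Vec<String>,
    pub confidence: FilePathConfidence,
}

/// Gap that counts as "clearly the most recent". Placeholder value.
pub const CLEAR_GAP_MS: u64 = 60_000;

/// Build the event from (path, last_access_ms) matches.
pub fn build_file_opened(filename: &str, mut found: Vec<(String, u64)>) -> FileOpenedEvent {
    // Most recently accessed first.
    found.sort_by(|a, b| b.1.cmp(&a.1));
    let confidence = match found.as_slice() {
        // A single match is a clear winner.
        [_] => FilePathConfidence::High,
        // Several matches, but the newest leads by a wide margin.
        [first, second, ..] if first.1 - second.1 >= CLEAR_GAP_MS => FilePathConfidence::Medium,
        // No match at all, or access times too close to call.
        _ => FilePathConfidence::Low,
    };
    let candidate_paths: Vec<String> = found.into_iter().map(|(path, _)| path).collect();
    FileOpenedEvent {
        filename: filename.to_string(),
        primary_path: candidate_paths.first().cloned(),
        candidate_paths,
        confidence,
    }
}
```

Downstream code can then branch on `confidence` and, for example, ask a human to confirm anything below High.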

From a recorded event to a replayable MCP tool step

Each variant of WorkflowEvent translates to an McpToolStep with a tool name, arguments, and an optional expected-change diff. That is how a recording becomes a replayable test, and how an LLM can read and rewrite the same file.

recorded event to replay step

Six stages from a user click to a JSON event

The recorder is ~4,000 lines of Windows-specific plumbing in crates/terminator-workflow-recorder/src/recorder/windows. Each stage below corresponds to real code in that directory.

Event capture in a dedicated thread

The Windows recorder runs a UI Automation event subscription on a background thread. Every focus change, click, and text-change notification lands in a bounded channel. A second thread owns clipboard polling. A third owns the browser-extension bridge for Chrome-specific signals.

Semantic aggregation, not stream dumping

Instead of saving each event raw, the aggregator maintains small state machines. For text input, it holds an InputTextAccumulator per focused element that tracks keystroke_count, start_time, and whether the user has been idle long enough to emit. For application switching, it holds an ApplicationState with a start-time stamp so dwell_time_ms comes out correct.
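The two state holders can be sketched in a few lines. The struct names come from this page; the fields beyond `keystroke_count` and the start time, the method names, and the use of plain `u64` millisecond timestamps are assumptions. The 500 ms idle gap is the approximate figure quoted earlier for TextInputCompleted.

```rust
/// Idle gap after which a field session is emitted (the page cites ~500 ms).
pub const IDLE_EMIT_MS: u64 = 500;

/// Per-field state. Timestamps are plain u64 milliseconds in this sketch.
#[derive(Debug)]
pub struct InputTextAccumulator {
    pub start_time_ms: u64,
    pub last_activity_ms: u64,
    pub keystroke_count: u32,
}

impl InputTextAccumulator {
    /// Created when the field gains focus.
    pub fn new(now_ms: u64) -> Self {
        Self { start_time_ms: now_ms, last_activity_ms: now_ms, keystroke_count: 0 }
    }

    pub fn on_keystroke(&mut self, now_ms: u64) {
        self.keystroke_count += 1;
        self.last_activity_ms = now_ms;
    }

    /// True once the user has been idle for at least IDLE_EMIT_MS.
    pub fn ready_to_emit(&self, now_ms: u64) -> bool {
        now_ms.saturating_sub(self.last_activity_ms) >= IDLE_EMIT_MS
    }

    pub fn typing_duration_ms(&self) -> u64 {
        self.last_activity_ms.saturating_sub(self.start_time_ms)
    }
}

/// Tracks when the current application gained focus.
#[derive(Debug)]
pub struct ApplicationState {
    pub process_name: String,
    pub start_time_ms: u64,
}

impl ApplicationState {
    /// Time spent in this application when the user switches away.
    pub fn dwell_time_ms(&self, switch_time_ms: u64) -> u64 {
        switch_time_ms.saturating_sub(self.start_time_ms)
    }
}
```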

Method detection via timing and modifiers

Paste detection is timing-based: if the text length jumps by more than N characters in under M milliseconds without matching keystroke count, the event is classified as Pasted. Suggestion is detected by an autocomplete-list click interacting with the focused field. AutoFilled is inferred from text appearing without any keystroke burst at all.
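A rough sketch of that classification, under stated assumptions. The page leaves N and M unspecified, so `JUMP_MIN_CHARS` and `JUMP_MAX_MS` are placeholder values. The `TextChange` struct, its `paste_signal` flag (standing in for a clipboard or Ctrl+V observation), and the rule that any two different per-change results yield Mixed are guesses at one reasonable reading, not the recorder's code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TextInputMethod {
    Typed,
    Pasted,
    AutoFilled,
    Suggestion,
    Mixed,
}

/// Placeholder thresholds for "more than N characters in under M ms".
pub const JUMP_MIN_CHARS: usize = 3;
pub const JUMP_MAX_MS: u64 = 100;

/// One observed change to the field's text (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct TextChange {
    pub chars_added: usize,
    pub elapsed_ms: u64,
    pub keystrokes: u32,
    pub paste_signal: bool,
    pub suggestion_click: bool,
}

pub fn classify_change(c: &TextChange) -> TextInputMethod {
    if c.suggestion_click {
        return TextInputMethod::Suggestion;
    }
    // A jump: many characters, very quickly, with too few keystrokes to explain them.
    let is_jump = c.chars_added > JUMP_MIN_CHARS
        && c.elapsed_ms < JUMP_MAX_MS
        && (c.keystrokes as usize) < c.chars_added;
    if !is_jump {
        TextInputMethod::Typed
    } else if c.keystrokes == 0 && !c.paste_signal {
        // Text appeared with no keys and no sign of a paste.
        TextInputMethod::AutoFilled
    } else {
        TextInputMethod::Pasted
    }
}

/// Typed is the default; two different methods in one session give Mixed.
pub fn classify_session(changes: &[TextChange]) -> TextInputMethod {
    let mut seen: Option<TextInputMethod> = None;
    for change in changes {
        let method = classify_change(change);
        match seen {
            None => seen = Some(method),
            Some(prev) if prev == method => {}
            Some(_) => return TextInputMethod::Mixed,
        }
    }
    seen.unwrap_or(TextInputMethod::Typed)
}
```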

File-path resolution against the OS

When a new window title is detected, the recorder parses a filename candidate out of it (handles 'file.txt - App', 'file.txt * - App', 'App - file.txt', and similar). Then it walks the filesystem's recent-access index, collects matches, and ranks them by LastAccessTime to assign FilePathConfidence.
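The title-parsing half of that step might look like the sketch below. It covers only the three patterns quoted above; the function names and the rule for what "looks like a filename" (a non-empty stem plus a 1 to 5 character alphanumeric extension) are assumptions, and the real parser likely handles more cases.

```rust
/// Assumed rule: non-empty stem, then a dot, then a 1-5 char alphanumeric extension.
fn looks_like_filename(s: &str) -> bool {
    match s.rsplit_once('.') {
        Some((stem, ext)) => {
            !stem.is_empty()
                && (1..=5).contains(&ext.len())
                && ext.chars().all(|c| c.is_ascii_alphanumeric())
        }
        None => false,
    }
}

/// Pull a filename candidate out of a window title such as
/// "file.txt - App", "file.txt * - App", or "App - file.txt".
pub fn filename_from_title(title: &str) -> Option<String> {
    title
        .split(" - ")
        // Drop surrounding whitespace and a trailing "unsaved" asterisk.
        .map(|part| part.trim().trim_end_matches('*').trim_end())
        .find(|part| looks_like_filename(part))
        .map(str::to_owned)
}
```

A title with no filename-like segment returns `None`, in which case no FileOpened event would be emitted in this sketch.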

Write out as SerializableWorkflowEvent

At recording end, each live event is converted to its Serializable counterpart (UIElement becomes SerializableUIElement, timestamps stay as u64 millis, enums become string tags). The whole workflow serializes through serde_json::to_string_pretty. The resulting .json file is the recording.

Replay as MCP tool calls, with oracles

On replay, each event maps to an McpToolStep with a tool_name, arguments, and optional expected_ui_changes / expected_dom_changes fields. A test runner can call each step, diff the UI after each action, and fail with a structured reason instead of a silent 'element not found'.
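A minimal sketch of that mapping and of the oracle check. The `McpToolStep` field names follow this page; their types, the reduced `RecordedEvent` enum, the tool names `type_into_field` and `activate_application`, and the string form of the UI diff are all hypothetical placeholders, not the names the Terminator MCP server actually uses.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct McpToolStep {
    pub tool_name: String,
    pub arguments: Vec<(String, String)>,
    pub description: String,
    /// Tree diff captured at record time; a plain string in this sketch.
    pub expected_ui_changes: Option<String>,
}

/// Reduced stand-in for two of the recorded event variants.
pub enum RecordedEvent {
    TextInputCompleted { field_name: String, text_value: String },
    ApplicationSwitch { to_process_name: String },
}

pub fn to_step(event: &RecordedEvent, expected_ui_changes: Option<String>) -> McpToolStep {
    match event {
        RecordedEvent::TextInputCompleted { field_name, text_value } => McpToolStep {
            tool_name: "type_into_field".to_string(), // hypothetical tool name
            arguments: vec![
                ("field".to_string(), field_name.clone()),
                ("text".to_string(), text_value.clone()),
            ],
            description: format!("Enter text into '{}'", field_name),
            expected_ui_changes,
        },
        RecordedEvent::ApplicationSwitch { to_process_name } => McpToolStep {
            tool_name: "activate_application".to_string(), // hypothetical tool name
            arguments: vec![("process".to_string(), to_process_name.clone())],
            description: format!("Switch to {}", to_process_name),
            expected_ui_changes,
        },
    }
}

/// Compare the observed diff with the recorded one and fail with a reason.
pub fn verify(step: &McpToolStep, observed_diff: &str) -> Result<(), String> {
    match &step.expected_ui_changes {
        Some(expected) if expected.as_str() != observed_diff => Err(format!(
            "step '{}': expected UI change '{}', observed '{}'",
            step.tool_name, expected, observed_diff
        )),
        _ => Ok(()),
    }
}
```

A step with no recorded expectation always passes `verify`, which matches the "optional" wording above.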

What a recording actually looks like on disk

A real workflow JSON. Six events cover what in a keystroke-based recorder would be a few hundred. Every field here is one line in the enum definition you saw above.

invoice-flow.json

Running the recorder, live

The CLI prints each semantic event as it is committed to the log. Read this output and the matching JSON above side by side. One line of terminal, one object in the file.

terminator-workflow-recorder live

The method tags, all in one place

Each tag below is a string value that shows up in a recording. If your replay logic needs to handle an "autofilled email" path differently from a "typed email" path, the branch key is event.input_method.

method=Typed, method=Pasted, method=AutoFilled, method=Suggestion, method=Mixed

switch_method=AltTab, switch_method=TaskbarClick, switch_method=WindowsKeyShortcut, switch_method=StartMenu, switch_method=WindowClick

confidence=High, confidence=Medium, confidence=Low

action=Switched, action=Created, action=Closed
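Branching on the input-method tag can be sketched as a small parser plus a match. The five tag strings are the ones listed above. The `ReplayAction` enum and the choice of which method maps to which action are illustrative assumptions, not Terminator's replay policy.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TextInputMethod {
    Typed,
    Pasted,
    AutoFilled,
    Suggestion,
    Mixed,
}

impl FromStr for TextInputMethod {
    type Err = String;

    /// Parse the string tag as it appears in a recording.
    fn from_str(tag: &str) -> Result<Self, Self::Err> {
        match tag {
            "Typed" => Ok(Self::Typed),
            "Pasted" => Ok(Self::Pasted),
            "AutoFilled" => Ok(Self::AutoFilled),
            "Suggestion" => Ok(Self::Suggestion),
            "Mixed" => Ok(Self::Mixed),
            other => Err(format!("unknown input_method tag: {}", other)),
        }
    }
}

/// Hypothetical replay strategies, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum ReplayAction {
    TypeKeys,
    SetValue,
    ExpectPrefilled,
}

pub fn replay_action(tag: &str) -> Result<ReplayAction, String> {
    Ok(match tag.parse::<TextInputMethod>()? {
        TextInputMethod::Typed | TextInputMethod::Mixed => ReplayAction::TypeKeys,
        TextInputMethod::Pasted | TextInputMethod::Suggestion => ReplayAction::SetValue,
        // An autofilled value is asserted rather than re-entered in this sketch.
        TextInputMethod::AutoFilled => ReplayAction::ExpectPrefilled,
    })
}
```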

Terminator’s recorder vs a keystroke-dump recorder

Ten differences. The left column is the shape of the recording produced by the typical ui testing automation tools recorder (Selenium IDE, vendor-specific RPA tools, most browser codegen). The right column is what Terminator's workflow recorder produces.

Feature	Keystroke-dump recorder	Terminator
Stores 'user pasted john@example.com' as ONE event	Stores it as ~20 keydown/keyup pairs plus a clipboard event. No semantic link between them.	One TextInputCompleted event. input_method=Pasted. text_value is the full string.
Distinguishes typed input from paste from autofill	No. A paste and a fast type look identical in a keystroke log.	Yes. TextInputMethod has 5 variants: Typed, Pasted, AutoFilled, Suggestion, Mixed.
Captures how the user switched applications	Usually not recorded at all. A focus change is inferred from the next click location.	ApplicationSwitchMethod records AltTab, TaskbarClick, WindowsKeyShortcut, StartMenu, WindowClick, Other.
Detects a browser tab switch (not a page load)	No. Browser tab state is invisible to OS-level recorders.	Chrome extension bridge emits BrowserTabNavigation with to_url, from_url, method, is_back_forward.
Resolves the actual file path of an opened document	No. Window title is saved as-is. If the title is 'Q2-invoices.xlsx - Excel', the full path is lost.	FileOpenedEvent searches recent-access paths, emits primary_path plus ranked candidate_paths.
Records how long the user spent in a field	Derivable from keydown timestamps, not stored as a single value.	TextInputCompletedEvent.typing_duration_ms is one field per completion.
Records time spent in previous application (dwell)	No.	ApplicationSwitchEvent.dwell_time_ms is one field per switch.
Output is replayable as typed MCP tool calls	Replay requires the same OS, same resolution, often the same screen layout.	Each recorded event maps to an McpToolStep (tool_name, arguments, description) that any MCP client can run.
Expected UI change stored alongside the action	No.	McpToolStep.expected_ui_changes is a tree diff snapshot, used as a validation oracle on replay.
Output format is a typed Rust/TypeScript schema	Usually a proprietary binary or a screenshot reel.	SerializableWorkflowEvent is a serde enum. Full JSON schema is derivable from the source.

14 WorkflowEvent variants in events.rs lines 475-517

Eight are high-level semantic events. Six are low-level raw events. Together they are the full vocabulary the recorder uses to describe what a user did on a running OS.

“Every enum variant, method name, and struct field on this page is grep-able in a fresh clone of mediar-ai/terminator. The 14 count is not marketing. It is the number of arms in pub enum WorkflowEvent in events.rs lines 475-517.”

github.com/mediar-ai/terminator

Why the recording format decides everything

Tests built from recordings fail for one of three reasons: the UI shifted, the input method changed, or the application context changed. A recording format that only stores mouse coordinates loses to all three. A format that stores raw keystrokes loses to input-method and context changes. A semantic format captures enough invariants at record time that the replay can adapt.

Terminator's recorder is not a replacement for your test runner. It is a way to generate the first draft of a test from a real user flow, in a format that reads well enough to hand-edit and that replays across machines with non-identical screen geometry. You install it with cargo install terminator-workflow-recorder or drive it from the MCP server the same repo ships.

If you are evaluating ui testing automation tools for an app that is not purely web, ask the vendor for a sample recording file. If the answer is "it is binary" or "it is a screenshot reel," you are about to buy a brittle recorder. Terminator's answer is a readable, typed JSON with fourteen variants defined in one open-source file. That is the spec.

Read events.rs on GitHub


Frequently asked questions

What does Terminator record that mainstream ui testing automation tools recorders do not?

A semantic event stream instead of a raw input stream. When a user pastes an email address into a To: field, Selenium IDE or a low-code browser recorder saves a clipboard paste plus a focus change plus a change event. Terminator saves one TextInputCompleted event with text_value='finance@acme.com', input_method=Pasted, keystroke_count=0, typing_duration_ms=720, field_name='To'. The rest of the state machine is in the recorder, not the log file. This is how the recording stays readable when the workflow is thirty actions long.

Where is the 14-variant WorkflowEvent enum?

In the open-source Terminator repo at crates/terminator-workflow-recorder/src/events.rs, lines 475 to 517. Clone github.com/mediar-ai/terminator and grep for 'pub enum WorkflowEvent'. The fourteen variants are: Mouse, Keyboard, Clipboard, TextSelection, DragDrop, Hotkey, TextInputCompleted, ApplicationSwitch, BrowserTabNavigation, Click, BrowserClick, BrowserTextInput, FileOpened, PendingAction. Six are low-level (raw input, typically disabled in production configs). Eight are high-level semantic events. This is the surface area you build tests against.

How does the recorder tell Typed from Pasted from AutoFilled?

Timing and keystroke arithmetic. The Windows recorder keeps an InputTextAccumulator per focused element. Every key press increments keystroke_count. Every change-event on the element updates the observed text. If the text length jumps by many characters in a window where almost no keystrokes fired, the event is classified as Pasted. If text appears with zero keystrokes and no paste timing, it is AutoFilled. If the user clicks an autocomplete dropdown item that commits a value into the field, it is Suggestion. If more than one of those paths triggered inside the same field session, it is Mixed. Typed is the default. You can see the completion logic in crates/terminator-workflow-recorder/src/recorder/windows/structs.rs around line 200.

What is the FileOpened event for? Is it a hook into the filesystem?

It is not a filesystem hook. It is a window-title heuristic followed by a filesystem lookup. When a new window becomes foreground and the title looks like it contains a filename (patterns like 'name.ext - AppName' or 'AppName - name.ext'), the recorder searches the OS recent-access index for files with that name. The results are ranked by LastAccessTime and returned as candidate_paths. If one clear winner emerges, confidence is High. If multiple files compete but one is clearly the most recent, it is Medium. If the access times are ambiguous, it is Low. Your downstream tooling sees a typed confidence level, not a raw name string.

Can I replay a recording as a browser-only test?

Only if the recording was browser-only. If the recording crosses into a native app (the user opens Excel, the user hits Alt+Tab to Outlook), the replay has to also cross. That is why Terminator's runtime is desktop-native at the bottom and uses a Chrome extension bridge for DOM access at the top, in the same process. A recording that starts in Chrome, opens Excel, pastes a value, switches back, and clicks Send replays as one test file with one Desktop() instance.

How does the replay work on a machine where the UI has shifted slightly since the recording?

Two mechanisms. First, the recorded event has a UIElement with selector-relevant attributes (role, name, native id, class). The runtime re-resolves a selector against the live accessibility tree every time, so small coordinate shifts do not matter. Second, McpToolStep stores expected_ui_changes as a tree-diff snapshot, so after each action the runtime can verify the UI changed the way it did during recording. If the diff does not match, the step fails with a structured reason instead of a silent mismatch downstream.

Does this work on macOS and Linux, or only Windows?

The recorder is first-class on Windows today. The Windows implementation lives in crates/terminator-workflow-recorder/src/recorder/windows and is 3,500+ lines of UIA event plumbing. macOS support is in progress. Linux AT-SPI2 is experimental. If your target is cross-platform UI testing automation that includes native Windows apps, this is the right tool right now. If your team is macOS-first, the recorder side is less mature today, but the selector engine and locator API under crates/terminator already work on both Windows and macOS.

Is the recorder privacy-safe? Does it capture passwords?

The recorder respects field_type. When a field is classified as PasswordBox (or equivalent on macOS), the TextInputCompletedEvent is emitted with the keystroke_count and typing_duration_ms populated but text_value elided. Clipboard events can be length-capped via config (max_clipboard_content_length). Screenshot capture is optional, off by default, and has a configurable blur-on-sensitive-field mode. Recordings are local files by default; no telemetry leaves the machine unless you opt in.
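The password rule can be sketched as a single pass over the event before it is written. How the recorder represents an elided `text_value` is not stated on this page, so the `Option<String>` type here is an assumption, as are the reduced struct and the `redact_if_password` helper.

```rust
/// Reduced event with only the fields the privacy rule touches.
#[derive(Debug, Clone, PartialEq)]
pub struct TextInputCompletedEvent {
    /// None means "elided" in this sketch (assumed representation).
    pub text_value: Option<String>,
    pub field_type: String,
    pub keystroke_count: u32,
    pub typing_duration_ms: u64,
}

/// Drop the text of password fields; keep the timing and keystroke metadata.
pub fn redact_if_password(mut event: TextInputCompletedEvent) -> TextInputCompletedEvent {
    if event.field_type == "PasswordBox" {
        event.text_value = None;
    }
    event
}
```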

How does this compare to Playwright's codegen?

Playwright's codegen is a browser-only recorder that emits Playwright-API code. It works well when every action is inside a Chromium or WebKit tab. Terminator's recorder is OS-wide. It records across native apps and browser tabs in a single session and emits events that replay as MCP tool calls, not just Playwright function calls. For testing a pure web app, Playwright's codegen is excellent and simpler. For testing a workflow that leaves the browser, Terminator is the answer, and the semantic event format keeps the recording readable at scale.

Can AI coding assistants consume a recording directly?

Yes. A Terminator recording is a JSON file of SerializableWorkflowEvent values. Because the recorder emits semantic events (not raw keystrokes), the file reads like an annotated transcript. Claude Code, Cursor, or any MCP-capable agent can ingest that file, map each event to the matching MCP tool, and execute the replay through the crates/terminator-mcp-agent server. This is why the recorder's output format matters: the event names and fields are the prompt surface that the LLM sees.