Automation in Windows that keeps your caret exactly where it was
You are mid-sentence in a Slack reply when a background automation fires. With Power Automate Desktop, AutoHotkey, or any other Windows automation tool, the automation's keystrokes land in your Slack reply. Terminator does something different. Before the automation touches anything, it asks Windows UI Automation for your focused element and caret range, runs the action, and puts both back the way they were.
The foreground problem nobody talks about
Automation in Windows is almost always foreground automation. The tool picks a window, brings it to the front, and sends keystrokes or mouse events through the OS. That works because SendInput injects events into the system input stream, and Windows delivers keyboard input to whichever window is active. It also means any automation you trigger evicts whatever you were doing.
This is fine when automation is the only thing happening. It is not fine when an AI agent, a scheduled refresh, or a coworker script fires while you are typing. Your half-written reply absorbs the first few characters. Your caret jumps to the new window. The automation window flashes. You lose your place.
The Windows UIA API exposes everything you need to avoid this. You can ask for the currently focused element, ask that element for a caret range, and later reinstate both. Almost no tool uses these APIs. Terminator wires them directly into every type and key-press call.
Step 1: cache what the user was doing
Before the automation runs, Terminator fetches three things from the OS: the focused element, the caret range (if the element supports text), and the cursor position (if requested). All three go into a FocusState struct.
The non-obvious line is the pattern lookup. Not every UIA element speaks text. A button, window chrome, or an image has no caret and does not implement TextPattern2. The GetCurrentPatternAs::<IUIAutomationTextPattern2>(UIA_TextPattern2Id) call returns Err for those elements, and Terminator stores None for the caret range instead of failing. When you later restore, it restores focus only.
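The fallback described above can be sketched without touching Windows at all. This is a minimal model, not the Terminator source: ElementId, CaretRange, FocusSource, PatternNotSupported, and MockDesktop are hypothetical stand-ins for the UI Automation types, and the FocusState fields are assumed. It only shows how a failed pattern lookup becomes None rather than an error.

```rust
// Hypothetical stand-ins for UI Automation types; none of these names
// are taken from the Terminator source.
#[derive(Debug, Clone, PartialEq)]
pub struct ElementId(pub u32);

#[derive(Debug, Clone, PartialEq)]
pub struct CaretRange {
    pub start: usize,
    pub end: usize,
}

/// Returned when an element does not implement a text pattern.
#[derive(Debug, PartialEq)]
pub struct PatternNotSupported;

/// The two queries the save step needs from the OS.
pub trait FocusSource {
    fn focused_element(&self) -> ElementId;
    fn caret_range(&self, element: &ElementId) -> Result<CaretRange, PatternNotSupported>;
}

/// Assumed shape of the cached state (the real struct may differ).
#[derive(Debug, PartialEq)]
pub struct FocusState {
    pub focused: ElementId,
    pub caret: Option<CaretRange>,
}

pub fn save_focus_state(source: &dyn FocusSource) -> FocusState {
    let focused = source.focused_element();
    // A failed lookup is expected for buttons and images, so it is folded
    // into None instead of being propagated to the caller.
    let caret = source.caret_range(&focused).ok();
    FocusState { focused, caret }
}

/// Mock desktop: the focused element either has a caret or it does not.
pub struct MockDesktop {
    pub focused: ElementId,
    pub caret: Option<CaretRange>,
}

impl FocusSource for MockDesktop {
    fn focused_element(&self) -> ElementId {
        self.focused.clone()
    }

    fn caret_range(&self, _element: &ElementId) -> Result<CaretRange, PatternNotSupported> {
        self.caret.clone().ok_or(PatternNotSupported)
    }
}
```

Calling save_focus_state on a MockDesktop whose caret is None yields a FocusState with caret: None, which is the "restore focus only" case.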
“thread::sleep(Duration::from_millis(50));”
crates/terminator/src/platforms/windows/input.rs, line 281, inside restore_focus_state between SetFocus and range.Select()
Step 2: restore, in the right order
SetFocus first. Sleep 50ms. Then Select. The order matters because the focus change has to propagate through the message loop before the text range is reliable.
If you swap the order or drop the sleep, the Select call either no-ops or lands in the stale target element. You end up with focus in the right window and the caret in the wrong place. Terminator was written with the sleep in place from the start; removing it breaks caret restoration on any app that repaints on focus change.
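The ordering can be made explicit with a small mock that records calls. This is a sketch under stated assumptions: FocusSink, Recorder, SETTLE, and the tuple-based caret are hypothetical names, and skipping the sleep when no caret was saved is a choice made here, not something the source confirms.

```rust
use std::thread;
use std::time::Duration;

/// Settle delay, mirroring the 50 ms quoted above.
pub const SETTLE: Duration = Duration::from_millis(50);

/// Hypothetical stand-in for the two calls the restore step makes.
pub trait FocusSink {
    fn set_focus(&mut self, element: u32);
    fn select(&mut self, caret: (usize, usize));
}

/// Assumed shape of the saved state; the caret is a (start, end) pair.
pub struct FocusState {
    pub focused: u32,
    pub caret: Option<(usize, usize)>,
}

pub fn restore_focus_state(sink: &mut dyn FocusSink, state: &FocusState) {
    // 1. Focus first: the range only means something once its element has focus.
    sink.set_focus(state.focused);
    if let Some(caret) = state.caret {
        // 2. Give the target window time to process the focus change.
        thread::sleep(SETTLE);
        // 3. Only then re-select the saved range.
        sink.select(caret);
    }
}

/// Mock sink that records the order of calls instead of touching the OS.
#[derive(Default)]
pub struct Recorder {
    pub calls: Vec<String>,
}

impl FocusSink for Recorder {
    fn set_focus(&mut self, element: u32) {
        self.calls.push(format!("set_focus({})", element));
    }

    fn select(&mut self, caret: (usize, usize)) {
        self.calls.push(format!("select({}..{})", caret.0, caret.1));
    }
}
```

With a saved caret, the recorder sees set_focus followed by select, separated by at least the settle delay. Without one, it sees set_focus only.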
The full round trip
This is what the pipeline looks like when an SDK caller invokes typeText(text, { restoreFocus: true }).
type_text with restore_focus = true
With focus restore vs without
Background automation fires while you type in Slack
The tool calls SendInput targeting the active window. Your Slack reply is the active window. The first characters of the automation's payload land inside your reply. The automation's own window grabs focus mid-stream, and your caret is gone.
- SendInput goes to whichever window has focus
- No save of focused element before action
- No save of caret range before action
- No restoration after the action completes
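The failure mode in the list above can be reproduced with a toy model. Nothing here calls SendInput: Desktop, send_input, naive_automation, and the raced parameter are hypothetical, and the model simply assumes that input goes to whichever window is active and that a few keystrokes are sent before the target window reaches the foreground.

```rust
/// Toy desktop: each window is a text buffer, and one of them is active.
pub struct Desktop {
    pub windows: Vec<String>,
    pub active: usize,
}

impl Desktop {
    /// Models keystroke injection: text lands in the active window.
    pub fn send_input(&mut self, text: &str) {
        self.windows[self.active].push_str(text);
    }

    /// Naive foreground automation. `raced` is the number of characters
    /// sent before the target window is actually in front.
    pub fn naive_automation(&mut self, target: usize, payload: &str, raced: usize) {
        let early: String = payload.chars().take(raced).collect();
        let rest: String = payload.chars().skip(raced).collect();
        // Early keystrokes go to whatever the user was typing in.
        self.send_input(&early);
        // Foreground switch: the user's window loses focus.
        self.active = target;
        self.send_input(&rest);
        // Nothing was saved beforehand, so nothing is restored here:
        // `active` stays on the automation target.
    }
}
```

Starting with a half-written reply in window 0 and running naive_automation(1, "invoice-42", 3) leaves "inv" appended to the reply, the rest of the payload in window 1, and window 1 still active.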
Tools that do not preserve your caret
Every common approach to automation in Windows takes over the foreground and sends keystrokes to whatever has focus. None of them reach into UIA for TextPattern2.
Feature by feature
| Feature | Traditional Windows automation | Terminator |
|---|---|---|
| Saves focused element before action | No | GetFocusedElement() into FocusState |
| Saves caret position inside text fields | No | TextPattern2::GetCaretRange() |
| Restores focus after action | Manual workaround at best | SetFocus() on saved element |
| Re-selects original caret range | No | range.Select() after 50ms settle |
| Optional cursor-position restore | No | GetCursorPos + SetCursorPos flag |
| Crosses threads safely | Single-threaded only | COINIT_MULTITHREADED, Send + Sync |
What flows through the save/restore pipeline
Each call site in element.rs pulls three pieces of state from Windows, caches them in a FocusState, and reapplies them after the underlying action returns.
save_focus_state -> action -> restore_focus_state
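One way to picture that pipeline is a wrapper that brackets any action with save and restore. The sketch below is an assumption about shape, not the code in element.rs: Desktop is a toy model, with_focus_restore is a hypothetical helper, and the settle delay is left out to keep the example short.

```rust
/// Assumed shape of the cached state; the caret is a (start, end) pair.
#[derive(Debug, Clone, PartialEq)]
pub struct FocusState {
    pub focused: u32,
    pub caret: Option<(usize, usize)>,
}

/// Toy desktop holding the user's current focus and caret.
pub struct Desktop {
    pub focused: u32,
    pub caret: Option<(usize, usize)>,
}

impl Desktop {
    pub fn save_focus_state(&self) -> FocusState {
        FocusState { focused: self.focused, caret: self.caret }
    }

    pub fn restore_focus_state(&mut self, state: &FocusState) {
        self.focused = state.focused;
        self.caret = state.caret;
    }
}

/// Runs `action`, bracketing it with save and restore when asked to.
pub fn with_focus_restore<T>(
    desktop: &mut Desktop,
    restore_focus: bool,
    action: impl FnOnce(&mut Desktop) -> T,
) -> T {
    // Save only when the caller opted in.
    let saved = if restore_focus {
        Some(desktop.save_focus_state())
    } else {
        None
    };
    // The action is free to move focus wherever it needs to.
    let result = action(desktop);
    // Reapply the cached state after the action returns.
    if let Some(state) = saved {
        desktop.restore_focus_state(&state);
    }
    result
}
```

With restore_focus set to true, an action that moves focus to another element leaves the desktop exactly as it found it. With false, the new focus sticks.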
Exactly what gets preserved
Anything the OS exposes through UI Automation survives the round trip. Anything the OS does not expose is best-effort, and Terminator documents it as such.
Restored by restore_focus_state
- The focused IUIAutomationElement (window + element)
- Caret position inside a text field (zero-length range)
- Active text selection (non-zero range)
- Mouse cursor position (when restore_cursor = true)
- Keyboard modifier state (unchanged, never pressed)
- Undo/redo stack of the target app (untouched)
Opt in from the SDK
The Rust flag is plumbed through the TypeScript and Python SDKs so you can turn focus restoration on for a single call. You do not need to subclass or monkey-patch anything.
How to wire it into your own automation
Five short steps. The last three happen inside Terminator. The first two are the only ones you write.
Install the SDK
npm i @mediar-ai/terminator, or pip install terminator-py, or cargo add terminator-rs. The flag shape is the same on all three.
Call typeText or pressKey with restoreFocus: true
This is the entire API surface. Terminator routes the call through element.rs::type_text or element.rs::press_key.
save_focus_state fires
input.rs calls GetFocusedElement(), then attempts GetCurrentPatternAs<IUIAutomationTextPattern2> on the result. If the element speaks text, GetCaretRange() returns the caret range and it gets cached. FocusState is returned to the caller.
The action runs
type_text focuses the target element, sends the text via send_text or send_text_by_clipboard, then returns. Meanwhile the cached FocusState sits on the stack.
restore_focus_state fires
SetFocus on the saved element, sleep 50ms to let the message loop settle, then range.Select() to reinstate the caret. Your text field ends up byte-identical to how you left it.
Verify it against the source
The claim is two functions in one file plus four call sites. Clone the repo and grep.
- 50 ms: settle delay between SetFocus and range.Select(). Long enough for the window message to deliver, short enough to feel instant.
- 2: functions in input.rs that implement the entire feature, save_focus_state and restore_focus_state.
- 4: call sites across element.rs, two in type_text and two in press_key.
Install
Three flavors, same Rust core. The restoreFocus flag is exposed in all three.
Want your AI agent to automate Windows without stealing the user's caret?
Book 20 minutes and we will walk through save_focus_state and restore_focus_state against a workflow you already run.
Frequently asked questions
Why does most automation in Windows interrupt whatever you are typing?
Traditional tools for automation in Windows (Power Automate Desktop, AutoHotkey, UiPath, PowerShell GUI scripts) drive the foreground by sending keyboard and mouse input to the active window. The OS gives the input to whatever has focus. If your Slack reply has focus, the automation types into your Slack reply. The fix is to save focus before the action and restore it after. Terminator does this automatically via save_focus_state() and restore_focus_state() in crates/terminator/src/platforms/windows/input.rs.
What exactly does Terminator save before every type_text or press_key call?
Three things. First, the currently focused IUIAutomationElement, retrieved by calling GetFocusedElement() on the UI Automation instance. Second, if the focused element supports IUIAutomationTextPattern2, the caret range returned by GetCaretRange(). Third, the mouse cursor position from GetCursorPos when the caller passes restore_cursor=true to send_mouse_click. All three live in the FocusState struct at input.rs line 155.
Why is there a 50ms sleep between SetFocus and range.Select() in restore_focus_state?
When you call SetFocus on a UI Automation element, Windows routes a focus change message through the message loop. The receiving window processes the message, fires its own focus handlers, and typically repaints. If you try to select a text range before that sequence completes, the Select call either no-ops or lands in a stale element. The 50ms sleep in restore_focus_state at input.rs line 281 lets the focus change settle, then the saved IUIAutomationTextRange is reselected with range.Select().
Does this work across threads or does the whole pipeline need to stay on one thread?
Across threads. The FocusState struct has unsafe impl Send and unsafe impl Sync at input.rs lines 164 and 165, and the crate initializes COM with COINIT_MULTITHREADED. Under the MTA model, UI Automation COM objects can be accessed from any thread in the apartment, so you can save focus on one thread, let an async runtime schedule your automation elsewhere, and restore focus on a different thread without marshalling.
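The role of the two unsafe impl lines can be illustrated with a stand-in. In this sketch the raw pointer plays the part of a COM interface pointer, and the leaked integer plays the part of the element. All names and the soundness argument are specific to the sketch. In Terminator the justification is the MTA model described above.

```rust
use std::thread;

/// Hypothetical stand-in for a struct that holds a COM interface pointer.
/// Raw pointers are neither Send nor Sync, so this struct is not either
/// unless it opts in explicitly.
pub struct FocusState {
    element: *const u32,
}

// SAFETY (for this sketch only): the pointee is leaked, never mutated, and
// lives for the rest of the program, so any thread may read it.
unsafe impl Send for FocusState {}
unsafe impl Sync for FocusState {}

pub fn save_focus_state(element_id: u32) -> FocusState {
    // Leaking is deliberate here: it keeps the pointer valid forever.
    let leaked: &'static u32 = Box::leak(Box::new(element_id));
    FocusState { element: leaked as *const u32 }
}

pub fn restore_focus_state(state: &FocusState) -> u32 {
    // SAFETY: the pointer was created from a leaked Box and is always valid.
    unsafe { *state.element }
}

/// Saves on the calling thread and restores on a worker thread.
pub fn save_here_restore_elsewhere(element_id: u32) -> u32 {
    let state = save_focus_state(element_id);
    // `move` hands the FocusState to the worker. Without the
    // `unsafe impl Send` above, this spawn does not compile.
    thread::spawn(move || restore_focus_state(&state))
        .join()
        .expect("worker thread panicked")
}
```

Removing the unsafe impl Send line turns the thread::spawn call into a compile error, which is the guarantee the real FocusState has to opt into by hand.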
Which actions in element.rs actually respect the restore_focus flag?
type_text at line 1144 and press_key at line 1232. Both accept a restore_focus: bool parameter. When true, save_focus_state() runs before the action and restore_focus_state() runs after. The flag is plumbed through the TypeScript SDK as an option on typeText and pressKey so you can turn it on per call without subclassing.
Can I verify this in source without building the crate?
Yes. The file is crates/terminator/src/platforms/windows/input.rs in mediar-ai/terminator on GitHub. Lines 171-244 define save_focus_state, lines 250-294 define restore_focus_state. The IUIAutomationTextPattern2::GetCaretRange call lives at lines 215-218 inside save_focus_state. The four call sites in element.rs are at lines 1156, 1225, 1243, and 1301 inside type_text and press_key.