Automation in Windows that keeps your caret exactly where it was

You are mid-sentence in a Slack reply when a background automation fires. With Power Automate Desktop, AutoHotkey, or most other Windows automation tools, the automation's keystrokes land in your Slack reply. Terminator does something different. Before the automation touches anything, it asks Windows UI Automation for your focused element and caret range, runs the action, and puts both back the way they were.

Matthew Diakonov
9 min read
4.9 from dozens of design partners
One file: crates/terminator/src/platforms/windows/input.rs
Uses IUIAutomationTextPattern2::GetCaretRange to cache the caret
50ms settle delay before range.Select() after SetFocus

The foreground problem nobody talks about

Automation in Windows is almost always foreground automation. The tool picks a window, brings it to the front, and sends keystrokes or mouse events through the OS. That works because SendInput targets the active window. It also means any automation you trigger evicts whatever you were doing.

This is fine when automation is the only thing happening. It is not fine when an AI agent, a scheduled refresh, or a coworker script fires while you are typing. Your half-written reply absorbs the first few characters. Your caret jumps to the new window. The automation window flashes. You lose your place.

The Windows UIA API exposes everything you need to avoid this. You can ask for the currently focused element, ask that element for a caret range, and later reinstate both. Almost no tool uses these APIs. Terminator wires them directly into every type and key-press call.

Step 1: cache what the user was doing

Before the automation runs, Terminator fetches three things from the OS: the focused element, the caret range (if the element supports text), and the cursor position (if requested). All three go into a FocusState struct.

input.rs

The non-obvious line is the pattern lookup. Not every UIA element speaks text. A button, window chrome, or an image has no caret and does not implement TextPattern2. The GetCurrentPatternAs::<IUIAutomationTextPattern2>(UIA_TextPattern2Id) call returns Err for those elements, and Terminator stores None for the caret range. On restore, only focus is reinstated for such elements.
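The Err-to-None handling can be illustrated without Windows at all. The following is a minimal sketch, not the code in input.rs: ElementId, CaretRange, the FocusSource trait, the FakeUi test double, and the save_cursor parameter are all hypothetical stand-ins for the COM interfaces and flags the real function works with.

```rust
/// Hypothetical stand-ins for the UIA types; the real code holds COM interfaces.
#[derive(Debug, Clone, PartialEq)]
struct ElementId(String);

#[derive(Debug, Clone, Copy, PartialEq)]
struct CaretRange {
    start: usize,
    end: usize,
}

#[derive(Debug, PartialEq)]
struct FocusState {
    focused_element: Option<ElementId>,
    caret_range: Option<CaretRange>,
    cursor_pos: Option<(i32, i32)>,
}

/// The three queries the save step needs, abstracted so the sketch runs anywhere.
trait FocusSource {
    fn focused_element(&self) -> Result<ElementId, String>;
    /// Err models an element that does not implement the text pattern.
    fn caret_range(&self, element: &ElementId) -> Result<CaretRange, String>;
    fn cursor_pos(&self) -> (i32, i32);
}

fn save_focus_state(ui: &dyn FocusSource, save_cursor: bool) -> FocusState {
    let focused_element = ui.focused_element().ok();
    // A failed pattern lookup is not an error for the caller: it becomes None.
    let caret_range = focused_element
        .as_ref()
        .and_then(|element| ui.caret_range(element).ok());
    let cursor_pos = if save_cursor { Some(ui.cursor_pos()) } else { None };
    FocusState { focused_element, caret_range, cursor_pos }
}

/// Test double: one focused element that may or may not support text.
struct FakeUi {
    name: &'static str,
    caret: Option<CaretRange>,
}

impl FocusSource for FakeUi {
    fn focused_element(&self) -> Result<ElementId, String> {
        Ok(ElementId(self.name.to_string()))
    }
    fn caret_range(&self, _element: &ElementId) -> Result<CaretRange, String> {
        self.caret.ok_or_else(|| "text pattern not supported".to_string())
    }
    fn cursor_pos(&self) -> (i32, i32) {
        (640, 360)
    }
}
```

With a FakeUi whose caret is None (a button, say), save_focus_state still returns the focused element and simply leaves caret_range empty, which is the behavior the paragraph above describes.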

50ms

thread::sleep(Duration::from_millis(50));

crates/terminator/src/platforms/windows/input.rs, line 281, inside restore_focus_state between SetFocus and range.Select()

Step 2: restore, in the right order

SetFocus first. Sleep 50ms. Then Select. The order matters because the focus change has to propagate through the message loop before the text range is reliable.

input.rs

If you swap the order or drop the sleep, the Select call either no-ops or lands in the stale target element. You end up with focus in the right window and the caret in the wrong place. Terminator was written with the sleep in place from the start; removing it breaks caret restoration on any app that repaints on focus change.
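To make the ordering concrete, here is a small sketch that records the calls instead of talking to UIA. The FocusSink trait, SavedFocus struct, and CallLog recorder are hypothetical names for illustration. Whether the real function skips the sleep when no caret was saved is an assumption of this sketch, not something the source text confirms.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical sink for the two restore calls; the real code calls into UIA.
trait FocusSink {
    fn set_focus(&mut self, element: &str);
    fn select_range(&mut self, start: usize, end: usize);
}

struct SavedFocus {
    element: Option<String>,
    caret: Option<(usize, usize)>,
}

fn restore_focus_state(ui: &mut dyn FocusSink, saved: &SavedFocus, settle: Duration) {
    if let Some(element) = &saved.element {
        // 1. Focus first, so the element that owns the range is active again.
        ui.set_focus(element);
        if let Some((start, end)) = saved.caret {
            // 2. Give the focus change time to propagate (assumed to be
            //    skipped here when there is no caret to restore).
            thread::sleep(settle);
            // 3. Only now reselect the saved range.
            ui.select_range(start, end);
        }
    }
}

/// Recorder used to observe the call order.
#[derive(Default)]
struct CallLog(Vec<String>);

impl FocusSink for CallLog {
    fn set_focus(&mut self, element: &str) {
        self.0.push(format!("set_focus({})", element));
    }
    fn select_range(&mut self, start: usize, end: usize) {
        self.0.push(format!("select_range({}..{})", start, end));
    }
}
```

Passing Duration::from_millis(50) as settle matches the delay quoted above. Making it a parameter is a choice of this sketch so the order can be tested with a shorter delay.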

The full round trip

What the pipeline looks like when an SDK caller invokes typeText(text, { restoreFocus: true }).

type_text with restore_focus = true

Participants: Caller, type_text, UIA, Target element

Messages, in order:

  • typeText(text, { restoreFocus: true })
  • save_focus_state()
  • FocusState { element, caret_range }
  • focus() + send_text(text)
  • chars typed
  • SetFocus on saved element
  • sleep 50ms
  • range.Select() on caret_range
  • Ok(())
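The same round trip can be sketched as a save, act, restore wrapper around a toy desktop model. Everything here is illustrative: the Desktop struct and its fields are hypothetical, and the real type_text works against UIA elements rather than strings. Only the restore_focus flag and the two function names come from the text.

```rust
/// Hypothetical desktop model: which element has focus and where its caret is.
#[derive(Debug, Clone, PartialEq)]
struct Desktop {
    focused: String,
    caret: usize,
    events: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
struct FocusState {
    element: String,
    caret: usize,
}

fn save_focus_state(desktop: &mut Desktop) -> FocusState {
    desktop.events.push("save_focus_state".to_string());
    FocusState { element: desktop.focused.clone(), caret: desktop.caret }
}

fn restore_focus_state(desktop: &mut Desktop, saved: &FocusState) {
    desktop.events.push("restore_focus_state".to_string());
    desktop.focused = saved.element.clone();
    desktop.caret = saved.caret;
}

fn type_text(
    desktop: &mut Desktop,
    target: &str,
    text: &str,
    restore_focus: bool,
) -> Result<(), String> {
    // Save only when the caller opted in.
    let saved = if restore_focus { Some(save_focus_state(desktop)) } else { None };

    // The action itself still takes focus; that part is unchanged.
    let typed = text.chars().count();
    desktop.focused = target.to_string();
    desktop.caret = typed;
    desktop.events.push(format!("typed {} chars into {}", typed, target));

    // Put the user's focus and caret back if we saved them.
    if let Some(state) = &saved {
        restore_focus_state(desktop, state);
    }
    Ok(())
}
```

With restore_focus set to true the model ends with the original element focused and the original caret offset. With false, focus stays on the automation target.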

With focus restore vs without

Background automation fires while you type in Slack

The tool calls SendInput targeting the active window. Your Slack reply is the active window. The first characters of the automation's payload land inside your reply. The automation's own window grabs focus mid-stream, and your caret is gone.

  • SendInput goes to whichever window has focus
  • No save of focused element before action
  • No save of caret range before action
  • No restoration after the action completes

Concrete numbers from the source

50ms sleep between SetFocus and Select
3 fields in the FocusState struct
4 call sites in element.rs
65535 SendInput absolute coord range
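For context on the last figure: when SendInput is given absolute mouse coordinates, Windows expects them normalized to the range 0 to 65535 rather than in pixels. The sketch below shows one common way to do that mapping. It is an assumption for illustration, not a copy of how Terminator converts coordinates, and to_absolute is a hypothetical helper name.

```rust
/// Upper bound of the normalized absolute range documented for SendInput.
const ABSOLUTE_MAX: i64 = 65_535;

/// Maps a pixel coordinate on an axis of `extent` pixels to 0..=65535.
/// Returns None for coordinates outside the axis or a degenerate extent.
fn to_absolute(pixel: i32, extent: i32) -> Option<i32> {
    if extent < 2 || pixel < 0 || pixel >= extent {
        return None;
    }
    // Scale so that pixel 0 maps to 0 and the last pixel maps to 65535.
    Some((pixel as i64 * ABSOLUTE_MAX / (extent as i64 - 1)) as i32)
}
```

On a 1920-pixel-wide screen this maps x = 0 to 0 and x = 1919 to 65535. Other rounding conventions exist, so treat the exact formula as one option among several.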

Tools that do not preserve your caret

Every common approach to automation in Windows takes over the foreground and sends keystrokes to whatever has focus. None of them reach into UIA for TextPattern2.

Power Automate Desktop, AutoHotkey, PowerShell GUI scripts, UiPath, Task Scheduler, RoboTask, AutoIt, WinAutomation, pyautogui, SendKeys

Feature by feature

Feature | Traditional Windows automation | Terminator
Saves focused element before action | No | GetFocusedElement() into FocusState
Saves caret position inside text fields | No | TextPattern2::GetCaretRange()
Restores focus after action | Manual workaround at best | SetFocus() on saved element
Re-selects original caret range | No | range.Select() after 50ms settle
Optional cursor-position restore | No | GetCursorPos + SetCursorPos flag
Crosses threads safely | Single-threaded only | COINIT_MULTITHREADED, Send + Sync

What flows through the save/restore pipeline

Each call site in element.rs pulls three pieces of state from Windows, caches them in a FocusState, and reapplies them after the underlying action returns.

save_focus_state -> action -> restore_focus_state

GetFocusedElement, GetCaretRange, GetCursorPos -> FocusState -> SetFocus, range.Select(), SetCursorPos

Exactly what gets preserved

Anything the OS exposes through UI Automation survives the round trip. Anything it does not expose is best-effort, and Terminator documents it as such.

Restored by restore_focus_state

  • The focused IUIAutomationElement (window + element)
  • Caret position inside a text field (zero-length range)
  • Active text selection (non-zero range)
  • Mouse cursor position (when restore_cursor = true)
  • Keyboard modifier state (unchanged, never pressed)
  • Undo/redo stack of the target app (untouched)

Opt in from the SDK

The Rust flag is plumbed through the TypeScript and Python SDKs so you can turn focus restoration on for a single call. You do not need to subclass or monkey-patch anything.

background-refresh.ts

How to wire it into your own automation

Five short steps. The first two are the only ones you write; the last three happen inside Terminator.

1

Install the SDK

npm i @mediar-ai/terminator, or pip install terminator-py, or cargo add terminator-rs. The flag shape is the same on all three.

2

Call typeText or pressKey with restoreFocus: true

This is the entire API surface. Terminator routes the call through element.rs::type_text or element.rs::press_key.

3

save_focus_state fires

input.rs calls GetFocusedElement(), then attempts GetCurrentPatternAs<IUIAutomationTextPattern2> on the result. If the element speaks text, GetCaretRange() returns the caret range and it gets cached. FocusState is returned to the caller.

4

The action runs

type_text focuses the target element, sends the text via send_text or send_text_by_clipboard, then returns. Meanwhile the cached FocusState sits on the stack.

5

restore_focus_state fires

SetFocus on the saved element, sleep 50ms to let the message loop settle, then range.Select() to reinstate the caret. Your text field ends up byte-identical to how you left it.

Verify it against the source

The claim is two functions in one file plus four call sites. Clone the repo and grep.

zsh
50ms

Settle delay between SetFocus and range.Select(). Long enough for the window message to deliver, short enough to feel instant.

2

Functions in input.rs that implement the entire feature: save_focus_state and restore_focus_state.

4

Call sites across element.rs: two in type_text, two in press_key.

Install

Three flavors, same Rust core. The restoreFocus flag is exposed in all three.

install

Want your AI agent to automate Windows without stealing the user's caret?

Book 20 minutes and we will walk through save_focus_state and restore_focus_state against a workflow you already run.

Frequently asked questions

Why does most automation in Windows interrupt whatever you are typing?

Traditional tools for automation in Windows (Power Automate Desktop, AutoHotkey, UiPath, PowerShell GUI scripts) drive the foreground by sending keyboard and mouse input to the active window. The OS gives the input to whatever has focus. If your Slack reply has focus, the automation types into your Slack reply. The fix is to save focus before the action and restore it after. Terminator does this automatically via save_focus_state() and restore_focus_state() in crates/terminator/src/platforms/windows/input.rs.

What exactly does Terminator save before every type_text or press_key call?

Three things. First, the currently focused IUIAutomationElement, retrieved by calling GetFocusedElement() on the UI Automation instance. Second, if the focused element supports IUIAutomationTextPattern2, the caret range returned by GetCaretRange(). Third, the mouse cursor position from GetCursorPos when the caller passes restore_cursor=true to send_mouse_click. All three live in the FocusState struct at input.rs line 155.

Why is there a 50ms sleep between SetFocus and range.Select() in restore_focus_state?

When you call SetFocus on a UI Automation element, Windows routes a focus change message through the message loop. The receiving window processes the message, fires its own focus handlers, and typically repaints. If you try to select a text range before that sequence completes, the Select call either no-ops or lands in a stale element. The 50ms sleep in restore_focus_state at input.rs line 281 lets the focus change settle, then the saved IUIAutomationTextRange is reselected with range.Select().

Does this work across threads or does the whole pipeline need to stay on one thread?

Across threads. The FocusState struct has unsafe impl Send and unsafe impl Sync at input.rs lines 164 and 165, and the crate initializes COM with COINIT_MULTITHREADED. Under the MTA model, UI Automation COM objects can be accessed from any thread in the apartment, so you can save focus on one thread, let an async runtime schedule your automation elsewhere, and restore focus on a different thread without marshalling.
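Why the explicit opt-in is needed can be shown with plain Rust. A struct that holds a raw pointer is neither Send nor Sync by default, so the compiler refuses to move it into another thread until the author vouches for it. The sketch below uses a hypothetical FakeInterfacePtr in place of a real UIA interface pointer. The safety argument in its comments applies to this toy only and is not a statement about COM.

```rust
use std::thread;

/// Stand-in for a COM interface pointer. Raw pointers are not Send or Sync,
/// so any struct containing one inherits that restriction.
struct FakeInterfacePtr(*mut String);

impl FakeInterfacePtr {
    fn new(name: &str) -> Self {
        FakeInterfacePtr(Box::into_raw(Box::new(name.to_string())))
    }

    fn name(&self) -> String {
        // SAFETY: the pointer came from Box::into_raw in `new` and is freed
        // only in Drop, so it is valid for as long as `self` exists.
        let name: &String = unsafe { &*self.0 };
        name.clone()
    }
}

impl Drop for FakeInterfacePtr {
    fn drop(&mut self) {
        // SAFETY: reclaims the allocation made in `new` exactly once.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

struct FocusState {
    focused_element: Option<FakeInterfacePtr>,
}

impl FocusState {
    fn focused_name(&self) -> Option<String> {
        self.focused_element.as_ref().map(|element| element.name())
    }
}

// SAFETY (for this toy only): the pointee is uniquely owned and never mutated
// after construction, so sharing or moving it across threads is sound.
unsafe impl Send for FocusState {}
unsafe impl Sync for FocusState {}

/// Saves on the calling thread, "restores" on a worker thread.
fn restore_on_worker(state: FocusState) -> Option<String> {
    thread::spawn(move || state.focused_name())
        .join()
        .expect("worker thread panicked")
}
```

Deleting the two unsafe impl lines makes thread::spawn fail to compile, which is the check the real struct has to satisfy before an async runtime can move a saved FocusState between threads.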

Which actions in element.rs actually respect the restore_focus flag?

type_text at line 1144 and press_key at line 1232. Both accept a restore_focus: bool parameter. When true, save_focus_state() runs before the action and restore_focus_state() runs after. The flag is plumbed through the TypeScript SDK as an option on typeText and pressKey so you can turn it on per call without subclassing.

Can I verify this in source without building the crate?

Yes. The file is crates/terminator/src/platforms/windows/input.rs in mediar-ai/terminator on GitHub. Lines 171-244 define save_focus_state, lines 250-294 define restore_focus_state. The IUIAutomationTextPattern2::GetCaretRange call lives at lines 215-218 inside save_focus_state. The four call sites in element.rs are at lines 1156, 1225, 1243, and 1301 inside type_text and press_key.

terminator: Desktop automation SDK
© 2026 terminator. All rights reserved.