The best MCP server is the one whose concurrency shape matches the resource it controls.
Every top result for this query is a listicle that ranks MCP servers by tool count, category coverage, or GitHub stars. None of them ask the one question that decides whether the server will survive a weekend of real use: does the server's concurrency shape match the shape of the resource behind it? Terminator is the only MCP server I have read that answers this question in code, by defaulting MCP_MAX_CONCURRENT to 1 because a desktop has exactly one mouse. This page shows the exact lines of main.rs that enforce it and walks through what the same idea looks like applied to other resources.
The question every listicle skips
Nine different articles show up on the first page of Google for "best MCP server". Taskade, Firecrawl, K2view, Skyvia, MCP Bundles, Effloow, Fungies, FastMCP, mcpmanager. I read all of them. They disagree on which servers belong in the top ten, but they agree on what makes a server "best": breadth of tool coverage, category popularity, GitHub stars, and the author's personal taste. None of them ask whether the server's concurrency default is correct for the thing it controls.
That omission is the whole game. An MCP server is a wrapper over a resource. The resource has a correct concurrency shape. If the wrapper disagrees with the shape, the wrapper corrupts state. The number of tools the wrapper exposes is completely irrelevant to that failure mode.
“Default value of MCP_MAX_CONCURRENT in Terminator's MCP agent. Set higher only if you can prove that two concurrent tool calls never contend for the same input device, which on a real desktop is almost never true.”
crates/terminator-mcp-agent/src/main.rs line 516
Resource shape decides everything
Before you pick an MCP server, you have to decide what shape the resource behind it actually has. There are three shapes, and the correct default concurrency is different for each.
The desktop has one mouse
If two agents try to click at once, one of them loses the race and the LLM gets told its click succeeded while a different app actually received focus. The only correct concurrency for a physical device is 1. Terminator encodes that in the default.
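The shape of that default is small enough to sketch. The env var name is Terminator's real one; the function around it is illustrative, not the actual code in main.rs:

```rust
use std::env;

// A minimal sketch of the serial-by-default pattern, assuming only that the
// server reads MCP_MAX_CONCURRENT from the environment with a fallback of 1.
fn max_concurrent() -> usize {
    env::var("MCP_MAX_CONCURRENT")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(1) // one mouse, one pointer: serial unless proven otherwise
}

fn main() {
    println!("max_concurrent = {}", max_concurrent());
}
```

The point is where the 1 lives: in the binary, not in a README, so an operator who never reads the docs still gets the safe behaviour.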
A search API has N
Read-only fan-out is fine. A search MCP server can serve 50 requests at once and nothing breaks. That is a different resource with a different correct answer, which is exactly why 'best MCP server' is the wrong question.
A checkout flow has 1
You do not want two agents both adding items to the same cart, racing on the same Stripe checkout page. Serialised per session is the safe default. Most business MCP servers silently ignore this.
The 503 body is a contract
busy, activeRequests, maxConcurrent, lastActivity. A load balancer reads the same four fields whether it hit POST /mcp with work or GET /status with a probe. One parser, two code paths.
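Those four field names come from the documented body; the hand-rolled formatting below is a sketch for illustration (the real server presumably serialises with a JSON crate):

```rust
// Build the shared busy body. Field names match the documented contract;
// everything else here is illustrative.
fn busy_body(active_requests: usize, max_concurrent: usize, last_activity: &str) -> String {
    format!(
        "{{\"busy\":{},\"activeRequests\":{},\"maxConcurrent\":{},\"lastActivity\":\"{}\"}}",
        active_requests >= max_concurrent,
        active_requests,
        max_concurrent,
        last_activity
    )
}

fn main() {
    // Same body whether it left the server as a 503 from POST /mcp or a probe
    // response from GET /status.
    println!("{}", busy_body(1, 1, "2026-04-01T12:00:00Z"));
    // → {"busy":true,"activeRequests":1,"maxConcurrent":1,"lastActivity":"2026-04-01T12:00:00Z"}
}
```

One function feeding two endpoints is the whole trick: the contract cannot drift between the busy path and the probe path.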
The /status endpoint is a health probe, not a dashboard
Returning 503 from /status while the VM is busy is on purpose. Azure's Standard Load Balancer reads that as 'take me out of rotation', and the next client gets routed to an idle VM. No retry logic in the client needed.
stop_execution hits tokio::select
The cancellation is not a polite request. Every tool handler in server.rs is awaited inside a tokio::select against a CancellationToken from the request manager. Calling stop_execution drops the handler and the mouse is free on the next frame.
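Terminator's real cancellation is tokio::select racing a CancellationToken, which drops the in-flight future outright. The std-only sketch below only approximates that with a polled flag, but the shape is the same: one shared token, flipped once by the canceller, observed by the worker:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

// Cooperative stand-in for a CancellationToken: the worker loops on
// "tool work" until the token flips.
fn run_until_cancelled(token: Arc<AtomicBool>) -> u32 {
    let mut steps = 0;
    while !token.load(Ordering::SeqCst) {
        steps += 1; // one unit of in-flight work per iteration
        thread::sleep(Duration::from_millis(1));
    }
    steps
}

fn main() {
    let token = Arc::new(AtomicBool::new(false));
    let worker_token = Arc::clone(&token);
    let worker = thread::spawn(move || run_until_cancelled(worker_token));
    thread::sleep(Duration::from_millis(20));
    token.store(true, Ordering::SeqCst); // the stop_execution moment
    println!("worker stopped after {} steps", worker.join().unwrap());
}
```

The async version is strictly stronger: tokio::select does not wait for the worker to notice, it drops the pending future at the next await point, which is why the pointer is free within a tick.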
The diagram view of the contract
On the left, the clients. In the middle, the gate. On the right, the physical machine. The gate is where Terminator makes its bet. If the machine is already doing something, the gate says 503 before the request ever lands on the dispatch function, and the client can either retry against a different VM or wait.
POST /mcp routing under MCP_MAX_CONCURRENT=1
The exact code that enforces the default
This is the part you cannot copy from another "best MCP server" page. Open crates/terminator-mcp-agent/src/main.rs. Two blocks of code define the contract. The first sets the default. The second enforces it.
Notice the order of operations inside mcp_gate. The active-requests counter is checked before the cancellation token is registered, before the timeout clock starts, before any work is scheduled. A busy VM rejects the request at the transport boundary, not at the handler. There is no path where the rejected request touches the shared desktop state.
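That check-before-anything ordering can be captured in a few lines. This is a hypothetical gate with made-up names, not a copy of mcp_gate, but it shows why the rejected path never touches shared state:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative admission gate. The busy check happens before any token,
// timeout, or handler is involved.
struct Gate {
    active: AtomicUsize,
    max: usize,
}

impl Gate {
    /// Returns false (the 503 path) without side effects.
    fn try_admit(&self) -> bool {
        let mut current = self.active.load(Ordering::SeqCst);
        loop {
            if current >= self.max {
                return false; // busy: reject at the transport boundary
            }
            match self.active.compare_exchange(
                current, current + 1, Ordering::SeqCst, Ordering::SeqCst,
            ) {
                Ok(_) => return true, // admitted: now safe to register cancellation, start clocks
                Err(observed) => current = observed, // lost a race, re-check
            }
        }
    }

    fn release(&self) {
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}

fn main() {
    let gate = Gate { active: AtomicUsize::new(0), max: 1 };
    assert!(gate.try_admit());  // first caller owns the desktop
    assert!(!gate.try_admit()); // second caller gets the 503 path
    gate.release();
    assert!(gate.try_admit()); // idle again
    println!("gate is serial");
}
```

The compare-exchange loop matters: a load followed by a separate increment would let two requests both observe 0 and both get admitted, which is exactly the race the gate exists to prevent.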
The matching status endpoint returns the same body the 503 does. Four fields: busy, activeRequests, maxConcurrent, lastActivity. That is the whole LB contract.
What it looks like from a terminal
You do not have to take my word for it. Install the server, hit the endpoints, and watch the flip from 200 to 503 when a tool call is in flight.
Five criteria that actually separate production MCP servers from demos
If you are going to rank MCP servers, rank them against real production behaviour. Here are five filters to use instead of tool count. Terminator was designed to pass all five; most of the servers on the top listicles pass one or two.
Concurrency shape must match the resource
A desktop is serial. A search API is parallel. A checkout is per-session serial. The 'best' server is the one that picks the right shape and enforces it in code, not the one with the longest tool list.
Busy behaviour must be legible to infrastructure
503 with a documented JSON body, same shape from /status and from /mcp, so an ops team can write one parser and route accordingly. Silently queuing is worse than returning 503.
Cancellation must be cheap and synchronous
stop_execution returning after the action finishes is not cancellation. Terminator wires every handler through tokio::select on a per-request CancellationToken so the abort is effective within one tick.
Schema must be derived from code, not hand-written
Terminator's build.rs reads the match arms in server.rs and bakes the tool names into the binary via env!("MCP_TOOLS"). The system prompt cannot drift from what the server will actually accept.
Observability must be on by default, not opt-in
Every tool call is wrapped in log_request and log_response_with_logs, with screenshots before and after UI actions. Executions land on disk in executions/. If you cannot audit the last tool call, the server is not production ready.
Terminator vs the generic "top ten" server
The column on the right is not every MCP server in the world. It is the composite behaviour of the servers that show up highest on the listicles, mostly stateless HTTP wrappers around SaaS APIs, where these production questions either do not apply or have not been answered.
| Feature | Typical listicle pick | Terminator MCP |
|---|---|---|
| Default concurrency | Unbounded or whatever the host sets | MCP_MAX_CONCURRENT defaults to 1 (main.rs:516) |
| What happens when busy | Request queues, blocks, or races the active one | POST /mcp returns 503 with a JSON body shaped for a load balancer |
| GET /status behaviour | No such endpoint, or 200 even while a request runs | 200 when idle, 503 when busy, same body shape as the 503 from /mcp |
| Cancellation | No way to stop a long action once it starts | stop_execution tool + tokio::select on a per-request CancellationToken |
| State ownership | Stateless, relies on the caller to be serial | Long-lived process owns the machine, focus, and pointer for the duration |
| Schema drift protection | Prompt engineer keeps the tool list in sync by hand | build.rs bakes the dispatch_tool match arms into env!("MCP_TOOLS") at compile time |
| Horizontal scale story | Run more replicas of the same process | Run more VMs, each with one mouse; LB picks an idle one via /status |
How to pick the right MCP server for your actual resource
The best MCP server for your problem is the one whose defaults you agree with before you look at the tool list. Here is the order of questions I ask myself.
Pick a resource
Search API. Postgres. A GitHub account. Your entire desktop. A Salesforce tenant. Each one has a different answer to the question 'how many concurrent operations can touch this without corrupting state'.
Write that answer down
If it is one, set MCP_MAX_CONCURRENT=1 in your env and make 1 the default in the binary. If it is ten, set it to ten and document why. If it is unbounded, say so and explain which invariants protect you. This is not an optimisation, it is correctness.
Enforce it at the transport boundary, not the handler
Terminator's gate is middleware on POST /mcp. Reject with 503 before the request manager registers the cancellation token. Doing it inside the handler is too late; by then you have already touched shared state.
Expose /status with the same shape as the 503
One JSON shape, four fields, two code paths that return it. A load balancer can poll /status and read 503 from /mcp with the same parser. The client sees a useful error, not a timeout.
Scale by replication, not by thread count
If you need more throughput for a serial resource, run more replicas behind an LB. Each one owns one copy of the resource. The LB picks an idle replica via /status. This is how Terminator scales across VMs.
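A toy version of what the load balancer does with those /status results, with hypothetical VM names: each replica reports busy or idle, and traffic goes to the first idle one. The real mechanism is HTTP health probes, not an in-process list, but the routing decision reduces to this:

```rust
// Route to the first replica whose /status came back 200 (busy == false).
fn pick_idle<'a>(replicas: &'a [(&'a str, bool)]) -> Option<&'a str> {
    replicas
        .iter()
        .find(|(_, busy)| !busy)
        .map(|(addr, _)| *addr)
}

fn main() {
    // true == that VM's /status returned 503
    let fleet = [("vm-1", true), ("vm-2", false), ("vm-3", false)];
    println!("route to {:?}", pick_idle(&fleet)); // → route to Some("vm-2")
}
```

When every replica is busy, the function returns None, which maps to the client seeing a fast 503 instead of a slow timeout: the same legible-failure principle, one level up.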
A checklist you can screenshot
Use this as a filter on any MCP server you are evaluating. Terminator passes every line. Most of the servers you see on top listicles pass the last two.
Production MCP server checklist
- Concurrency default matches the resource shape, encoded in the binary.
- POST /mcp returns 503 with a documented JSON body when busy.
- GET /status returns the same body with the same fields.
- stop_execution cancels in-flight work via a per-request CancellationToken.
- Tool schemas derived from source at compile time, not hand-written.
- Every tool call is logged with request, response, and duration by default.
- Transport is either stdio or HTTP framed as JSON-RPC 2.0, not a bespoke protocol.
The honest take on "best"
I will not tell you Terminator is the best MCP server for your Notion database. Of course it is not; Notion publishes their own server and they know their invariants better than anyone. Terminator is the best MCP server I have read for the specific job of letting an AI coding assistant drive a whole desktop, because it treats the desktop as what it is: a single-session, single-focus, single-pointer resource that breaks loudly when two callers touch it at once. That awareness shows up in the defaults, in the HTTP contract, and in the cancellation plumbing. If you are evaluating MCP servers for anything that touches physical devices, logged-in UI sessions, or exclusive resources, steal those criteria and apply them to whichever server you end up picking.
If you are here because you want to try the server itself, install it with one command, npx -y terminator-mcp-agent@latest, and point your editor at it.
Evaluating an MCP server for a real deployment?
We will walk you through Terminator's concurrency model, the /status contract, and how we scale it across a fleet of VMs. 20 minutes, live.
Frequently asked questions
What is the actual 'best MCP server' as of April 2026?
There is no single answer, and any list that gives you one is selling you something. The question is structurally wrong. An MCP server is a wrapper over a resource, and the right answer depends on the shape of that resource. For a desktop, the best server I have read is Terminator, specifically because it defaults MCP_MAX_CONCURRENT=1 and has a busy-aware /status endpoint that plays nicely with a load balancer. For a search engine, Exa is the obvious pick because search is read-only and fan-out friendly. For a CRM, HubSpot and Salesforce each publish their own. The correct question is: which MCP server is best for the resource I actually want an AI to control.
Why does Terminator default MCP_MAX_CONCURRENT to 1?
Because a desktop has exactly one mouse, one keyboard, and one window with focus at any time. If two tool calls run at once and both try to click, one of them either races the other or hits the wrong window. The LLM has no way to know which happened, because the accessibility tree the tool reads is a snapshot of shared state. Terminator makes this explicit in crates/terminator-mcp-agent/src/main.rs line 516: the default is 1. You can set it higher with an env var, but only if you know that your specific workflow never contends for the same input device at the same time, which for GUI automation is almost never true.
What happens when the Terminator MCP server is busy and a new request arrives?
POST /mcp returns 503 Service Unavailable immediately, with a JSON body shaped {"busy": true, "activeRequests": 1, "maxConcurrent": 1, "lastActivity": "<ISO-8601>"}. The 503 is intentional. A load balancer probing GET /status gets the same body with the same four fields and pulls that VM out of rotation until the active request finishes. The next client request lands on an idle VM. This is in main.rs lines 664 through 684 for the gate middleware and 537 through 553 for the status handler. It is documented in the crate's README under the HTTP Endpoints heading.
Is any other MCP server designed around this concurrency contract?
Not that I have read. The vast majority of MCP servers I have audited are HTTP wrappers around stateless APIs where concurrency is the remote service's problem. GitHub MCP, Slack MCP, Exa, Firecrawl, Notion MCP: all fine to run unbounded because the upstream has its own rate limiting and the operations are mostly idempotent. The 'best MCP server' listicles never bring up concurrency because for most of the servers in those lists, the right answer is 'whatever, the API is stateless anyway'. Desktop, browser control, and anything running against a physical device or a single logged-in session are the servers where concurrency shape genuinely matters, and they are underrepresented in the top-ranked articles.
How do I verify the concurrency behaviour myself?
Run the server with HTTP transport: npx -y terminator-mcp-agent@latest -t http. In one terminal, curl http://127.0.0.1:3000/status and confirm you get 200 with busy=false. Fire a long-running tool call like click_element against a slow-to-render window. In a second terminal, curl the same URL while the first is running. You will get 503 with busy=true, the same body shape, activeRequests=1, maxConcurrent=1. When the first call finishes, /status flips back to 200. You can also set MCP_MAX_CONCURRENT=2 as an env var, rerun the experiment, and watch two concurrent clicks actually step on each other, which is an excellent education in why the default is 1.
Does this mean Terminator does not scale?
It scales by replication, not by thread count. Each VM owns exactly one mouse, so the throughput story is not about making one VM serve 100 concurrent tool calls; it is about running 100 VMs behind a load balancer and letting the LB pick an idle one on every request. The 503-from-/status behaviour is the mechanism: Azure Load Balancer, Google Cloud Load Balancer, and any HTTP health check that understands 503 will take a busy VM out of rotation and route traffic elsewhere. This is the same scaling pattern you would use for any resource that is physically serial, like a 3D printer, a hardware test bench, or a phone farm.
What does 'schema drift protection' actually mean in Terminator's code?
crates/terminator-mcp-agent/build.rs reads src/server.rs at compile time, walks until it hits the line containing 'let result = match tool_name', then collects every subsequent match arm that starts with a quoted string followed by '=>'. That list becomes the MCP_TOOLS environment variable via rustc-env. prompt.rs reads env!("MCP_TOOLS") and pastes it into the system prompt the server returns on initialize. The consequence: the list of tool names the LLM is told about is literally compiled from the list of tool names the dispatch function knows how to route. They cannot disagree. Every other MCP server I have read maintains the prompt and the dispatch by hand.
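The scan described above is simple enough to sketch. This is the algorithm as the paragraph describes it, not a copy of build.rs, and the real code may differ in detail:

```rust
// Find the dispatch line, then collect every subsequent match arm of the
// form "tool_name" => handler(...). A sketch of the described build.rs scan.
fn extract_tools(source: &str) -> Vec<String> {
    let mut tools = Vec::new();
    let mut in_dispatch = false;
    for line in source.lines() {
        if line.contains("let result = match tool_name") {
            in_dispatch = true;
            continue;
        }
        if !in_dispatch {
            continue;
        }
        let trimmed = line.trim();
        if let Some(rest) = trimmed.strip_prefix('"') {
            if let Some(close) = rest.find('"') {
                if rest[close + 1..].trim_start().starts_with("=>") {
                    tools.push(rest[..close].to_string());
                }
            }
        }
    }
    tools
}

fn main() {
    let src = r#"
        let result = match tool_name {
            "click_element" => click(),
            "stop_execution" => stop(),
            _ => unknown(),
        };
    "#;
    println!("{:?}", extract_tools(src)); // → ["click_element", "stop_execution"]
}
```

Because the list is produced from the same text the compiler dispatches on, adding a tool to the match block and forgetting the prompt is not a mistake you can make: the env var is regenerated on the next build.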
How is stop_execution different from a regular cancellation?
stop_execution is a first-class tool in the MCP server. The LLM can call it like any other tool, and every handler in server.rs awaits its work inside a tokio::select against a CancellationToken pulled from the request manager at the start of the request. When stop_execution fires, the token flips and the tokio::select drops the in-flight work. It is synchronous in the sense that the desktop pointer is free within one scheduler tick. Contrast with HTTP-only cancellation where you can drop the connection but the server-side handler keeps running to completion and corrupting state. In practice, this means you can say 'stop' to Claude in the middle of a runaway automation and the mouse actually stops.
Does the 'best MCP server' question even make sense when Anthropic, OpenAI, and Google all have native tool use?
Yes, for a specific reason. MCP lets one tool server serve every client that speaks the protocol. If you hand-code tools into Claude's native tool-use API, you cannot reuse them with Cursor, VS Code, a custom agent, or an MCP-aware chat app without re-implementing the wrapper. The value of MCP is the single implementation, shared contract. For a product like Terminator, where the tool set is 30+ handlers, that reuse is worth the protocol overhead. For a product where the 'tool' is one HTTP call, native tool use is usually the right choice.
Where do I look in the Terminator repo to verify everything on this page?
crates/terminator-mcp-agent/src/main.rs for the concurrency default at line 516, the status handler at lines 537 to 553, and the mcp_gate middleware at lines 664 to 684. crates/terminator-mcp-agent/src/server.rs for the 31 tool handlers (grep for '#[tool('), the dispatch_tool match block at line 9953, and the stop_execution handler at line 8587. crates/terminator-mcp-agent/build.rs for the compile-time tool extraction at line 31, and crates/terminator-mcp-agent/src/prompt.rs for the env!("MCP_TOOLS") injection at line 10. Finally, crates/terminator-mcp-agent/README.md lines 36 through 45 document the HTTP endpoints and the 503 behaviour as the public contract.
Pages that unpack the same MCP server from different angles
More from the Terminator guides
What is an MCP server?
An MCP server, opened in the editor. Not the abstract definition; the actual dispatch_tool function that 31 tools flow through.
What are MCP servers, really?
The 150 lines of startup code nobody shows you. Process lifecycle, stdio framing, panic hooks, and why the protocol starts before the handshake.
MCP server list
A concrete list of Terminator's MCP tools, not a generic directory of servers. Names, signatures, and what each one actually touches.