The best MCP server is the one whose concurrency shape matches the resource it controls.
Every top result for this query is a listicle that ranks MCP servers by tool count, category coverage, or GitHub stars. None of them ask the one question that decides whether the server will survive a weekend of real use: does the server's concurrency shape match the shape of the resource behind it? Terminator is the only MCP server I have read that answers this question in code, by defaulting MCP_MAX_CONCURRENT to 1 because a desktop has exactly one mouse. This page shows the exact lines of main.rs that enforce it and walks through what the same idea looks like applied to other resources.
The question every listicle skips
Nine different articles show up on the first page of Google for "best MCP server". Taskade, Firecrawl, K2view, Skyvia, MCP Bundles, Effloow, Fungies, FastMCP, mcpmanager. I read all of them. They disagree on which servers belong in the top ten, but they agree on what makes a server "best": breadth of tool coverage, category popularity, GitHub stars, and the author's personal taste. None of them ask whether the server's concurrency default is correct for the thing it controls.
That omission is the whole game. An MCP server is a wrapper over a resource. The resource has a correct concurrency shape. If the wrapper disagrees with the shape, the wrapper corrupts state. The number of tools the wrapper exposes is completely irrelevant to that failure mode.
“Default value of MCP_MAX_CONCURRENT in Terminator's MCP agent. Set higher only if you can prove that two concurrent tool calls never contend for the same input device, which on a real desktop is almost never true.”
crates/terminator-mcp-agent/src/main.rs line 516
Resource shape decides everything
Before you pick an MCP server, you have to decide what shape the resource behind it actually has. There are three shapes, and the correct default concurrency is different for each.
The desktop has one mouse
If two agents try to click at once, one of them loses the race and the LLM gets told its click succeeded while a different app actually received focus. The only correct concurrency for a physical device is 1. Terminator encodes that in the default.
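The shape of that default is small enough to sketch. The env var name is Terminator's real one; the function around it is illustrative, not the actual code in main.rs:

```rust
use std::env;

// A minimal sketch of the serial-by-default pattern, assuming only that the
// server reads MCP_MAX_CONCURRENT from the environment with a fallback of 1.
fn max_concurrent() -> usize {
    env::var("MCP_MAX_CONCURRENT")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(1) // one mouse, one pointer: serial unless proven otherwise
}

fn main() {
    println!("max_concurrent = {}", max_concurrent());
}
```

The point is where the 1 lives: in the binary, not in a README, so an operator who never reads the docs still gets the safe behaviour.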
A search API has N
Read-only fan-out is fine. A search MCP server can serve 50 requests at once and nothing breaks. That is a different resource with a different correct answer, which is exactly why 'best MCP server' is the wrong question.
A checkout flow has 1
You do not want two agents both adding items to the same cart, racing on the same Stripe checkout page. Serialised per session is the safe default. Most business MCP servers silently ignore this.
The 503 body is a contract
busy, activeRequests, maxConcurrent, lastActivity. A load balancer reads the same four fields whether it hit POST /mcp with work or GET /status with a probe. One parser, two code paths.
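Those four field names come from the documented body; the hand-rolled formatting below is a sketch for illustration (the real server presumably serialises with a JSON crate):

```rust
// Build the shared busy body. Field names match the documented contract;
// everything else here is illustrative.
fn busy_body(active_requests: usize, max_concurrent: usize, last_activity: &str) -> String {
    format!(
        "{{\"busy\":{},\"activeRequests\":{},\"maxConcurrent\":{},\"lastActivity\":\"{}\"}}",
        active_requests >= max_concurrent,
        active_requests,
        max_concurrent,
        last_activity
    )
}

fn main() {
    // Same body whether it left the server as a 503 from POST /mcp or a probe
    // response from GET /status.
    println!("{}", busy_body(1, 1, "2026-04-01T12:00:00Z"));
    // → {"busy":true,"activeRequests":1,"maxConcurrent":1,"lastActivity":"2026-04-01T12:00:00Z"}
}
```

One function feeding two endpoints is the whole trick: the contract cannot drift between the busy path and the probe path.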
The /status endpoint is a health probe, not a dashboard
Returning 503 from /status while the VM is busy is on purpose. Azure's Standard Load Balancer reads that as 'take me out of rotation', and the next client gets routed to an idle VM. No retry logic in the client needed.
stop_execution hits tokio::select
The cancellation is not a polite request. Every tool handler in server.rs is awaited inside a tokio::select against a CancellationToken from the request manager. Calling stop_execution drops the handler and the mouse is free on the next frame.
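Terminator's real cancellation is tokio::select racing a CancellationToken, which drops the in-flight future outright. The std-only sketch below only approximates that with a polled flag, but the shape is the same: one shared token, flipped once by the canceller, observed by the worker:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;
use std::time::Duration;

// Cooperative stand-in for a CancellationToken: the worker loops on
// "tool work" until the token flips.
fn run_until_cancelled(token: Arc<AtomicBool>) -> u32 {
    let mut steps = 0;
    while !token.load(Ordering::SeqCst) {
        steps += 1; // one unit of in-flight work per iteration
        thread::sleep(Duration::from_millis(1));
    }
    steps
}

fn main() {
    let token = Arc::new(AtomicBool::new(false));
    let worker_token = Arc::clone(&token);
    let worker = thread::spawn(move || run_until_cancelled(worker_token));
    thread::sleep(Duration::from_millis(20));
    token.store(true, Ordering::SeqCst); // the stop_execution moment
    println!("worker stopped after {} steps", worker.join().unwrap());
}
```

The async version is strictly stronger: tokio::select does not wait for the worker to notice, it drops the pending future at the next await point, which is why the pointer is free within a tick.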
The diagram view of the contract
On the left, the clients. In the middle, the gate. On the right, the physical machine. The gate is where Terminator makes its bet. If the machine is already doing something, the gate says 503 before the request ever lands on the dispatch function, and the client can either retry against a different VM or wait.
POST /mcp routing under MCP_MAX_CONCURRENT=1
The exact code that enforces the default
This is the part you cannot copy from another "best MCP server" page. Open crates/terminator-mcp-agent/src/main.rs. Two blocks of code define the contract. The first sets the default. The second enforces it.
Notice the order of operations inside mcp_gate. The active-requests counter is checked before the cancellation token is registered, before the timeout clock starts, before any work is scheduled. A busy VM rejects the request at the transport boundary, not at the handler. There is no path where the rejected request touches the shared desktop state.
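That check-before-anything ordering can be captured in a few lines. This is a hypothetical gate with made-up names, not a copy of mcp_gate, but it shows why the rejected path never touches shared state:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustrative admission gate. The busy check happens before any token,
// timeout, or handler is involved.
struct Gate {
    active: AtomicUsize,
    max: usize,
}

impl Gate {
    /// Returns false (the 503 path) without side effects.
    fn try_admit(&self) -> bool {
        let mut current = self.active.load(Ordering::SeqCst);
        loop {
            if current >= self.max {
                return false; // busy: reject at the transport boundary
            }
            match self.active.compare_exchange(
                current, current + 1, Ordering::SeqCst, Ordering::SeqCst,
            ) {
                Ok(_) => return true, // admitted: now safe to register cancellation, start clocks
                Err(observed) => current = observed, // lost a race, re-check
            }
        }
    }

    fn release(&self) {
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}

fn main() {
    let gate = Gate { active: AtomicUsize::new(0), max: 1 };
    assert!(gate.try_admit());  // first caller owns the desktop
    assert!(!gate.try_admit()); // second caller gets the 503 path
    gate.release();
    assert!(gate.try_admit()); // idle again
    println!("gate is serial");
}
```

The compare-exchange loop matters: a load followed by a separate increment would let two requests both observe 0 and both get admitted, which is exactly the race the gate exists to prevent.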
The matching status endpoint returns the same body the 503 does. Four fields: busy, activeRequests, maxConcurrent, lastActivity. That is the whole LB contract.
What it looks like from a terminal
You do not have to take my word for it. Install the server, hit the endpoints, and watch the flip from 200 to 503 when a tool call is in flight.
Five criteria that actually separate production MCP servers from demos
If you are going to rank MCP servers, rank them against real production behaviour. Here are five filters to use instead of tool count. Terminator was designed to pass all five; most of the servers on the top listicles pass one or two.
Concurrency shape must match the resource
A desktop is serial. A search API is parallel. A checkout is per-session serial. The 'best' server is the one that picks the right shape and enforces it in code, not the one with the longest tool list.
Busy behaviour must be legible to infrastructure
503 with a documented JSON body, same shape from /status and from /mcp, so an ops team can write one parser and route accordingly. Silently queuing is worse than returning 503.
Cancellation must be cheap and synchronous
stop_execution returning after the action finishes is not cancellation. Terminator wires every handler through tokio::select on a per-request CancellationToken so the abort is effective within one tick.
Schema must be derived from code, not hand-written
Terminator's build.rs reads the match arms in server.rs and bakes the tool names into the binary via env!("MCP_TOOLS"). The system prompt cannot drift from what the server will actually accept.
Observability must be on by default, not opt-in
Every tool call is wrapped in log_request and log_response_with_logs, with screenshots before and after UI actions. Executions land on disk in executions/. If you cannot audit the last tool call, the server is not production ready.
Terminator vs the generic "top ten" server
The column on the right is not every MCP server in the world. It is the composite behaviour of the servers that show up highest on the listicles, mostly stateless HTTP wrappers around SaaS APIs, where these production questions either do not apply or have not been answered.
| Feature | Typical listicle pick | Terminator MCP |
|---|---|---|
| Default concurrency | Unbounded or whatever the host sets | MCP_MAX_CONCURRENT defaults to 1 (main.rs:516) |
| What happens when busy | Request queues, blocks, or races the active one | POST /mcp returns 503 with a JSON body shaped for a load balancer |
| GET /status behaviour | No such endpoint, or 200 even while a request runs | 200 when idle, 503 when busy, same body shape as the 503 from /mcp |
| Cancellation | No way to stop a long action once it starts | stop_execution tool + tokio::select on a per-request CancellationToken |
| State ownership | Stateless, relies on the caller to be serial | Long-lived process owns the machine, focus, and pointer for the duration |
| Schema drift protection | Prompt engineer keeps the tool list in sync by hand | build.rs bakes the dispatch_tool match arms into env!("MCP_TOOLS") at compile time |
| Horizontal scale story | Run more replicas of the same process | Run more VMs, each with one mouse; LB picks an idle one via /status |
How to pick the right MCP server for your actual resource
The best MCP server for your problem is the one whose defaults you agree with before you look at the tool list. Here is the order of questions I ask myself.
Pick a resource
Search API. Postgres. A GitHub account. Your entire desktop. A Salesforce tenant. Each one has a different answer to the question 'how many concurrent operations can touch this without corrupting state'.
Write that answer down
If it is one, set MCP_MAX_CONCURRENT=1 in your env and make 1 the default in the binary. If it is ten, set it to ten and document why. If it is unbounded, say so and explain which invariants protect you. This is not an optimisation, it is correctness.
Enforce it at the transport boundary, not the handler
Terminator's gate is middleware on POST /mcp. Reject with 503 before the request manager registers the cancellation token. Doing it inside the handler is too late; by then you have already touched shared state.
Expose /status with the same shape as the 503
One JSON shape, four fields, two code paths that return it. A load balancer can poll /status and read 503 from /mcp with the same parser. The client sees a useful error, not a timeout.
Scale by replication, not by thread count
If you need more throughput for a serial resource, run more replicas behind an LB. Each one owns one copy of the resource. The LB picks an idle replica via /status. This is how Terminator scales across VMs.
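A toy version of what the load balancer does with those /status results, with hypothetical VM names: each replica reports busy or idle, and traffic goes to the first idle one. The real mechanism is HTTP health probes, not an in-process list, but the routing decision reduces to this:

```rust
// Route to the first replica whose /status came back 200 (busy == false).
fn pick_idle<'a>(replicas: &'a [(&'a str, bool)]) -> Option<&'a str> {
    replicas
        .iter()
        .find(|(_, busy)| !busy)
        .map(|(addr, _)| *addr)
}

fn main() {
    // true == that VM's /status returned 503
    let fleet = [("vm-1", true), ("vm-2", false), ("vm-3", false)];
    println!("route to {:?}", pick_idle(&fleet)); // → route to Some("vm-2")
}
```

When every replica is busy, the function returns None, which maps to the client seeing a fast 503 instead of a slow timeout: the same legible-failure principle, one level up.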
A checklist you can screenshot
Use this as a filter on any MCP server you are evaluating. Terminator passes every line. Most of the servers you see on top listicles pass the last two.
Production MCP server checklist
- Concurrency default matches the resource shape, encoded in the binary.
- POST /mcp returns 503 with a documented JSON body when busy.
- GET /status returns the same body with the same fields.
- stop_execution cancels in-flight work via a per-request CancellationToken.
- Tool schemas derived from source at compile time, not hand-written.
- Every tool call is logged with request, response, and duration by default.
- Transport is either stdio or HTTP framed as JSON-RPC 2.0, not a bespoke protocol.
The honest take on "best"
I will not tell you Terminator is the best MCP server for your Notion database. Of course it is not; Notion publishes their own server and they know their invariants better than anyone. Terminator is the best MCP server I have read for the specific job of letting an AI coding assistant drive a whole desktop, because it treats the desktop as what it is: a single-session, single-focus, single-pointer resource that breaks loudly when two callers touch it at once. That awareness shows up in the defaults, in the HTTP contract, and in the cancellation plumbing. If you are evaluating MCP servers for anything that touches physical devices, logged-in UI sessions, or exclusive resources, steal those criteria and apply them to whichever server you end up picking.
If you are here because you want to try the server itself, install it with one command, npx -y terminator-mcp-agent@latest, and point your editor at it.
Evaluating an MCP server for a real deployment?
We will walk you through Terminator's concurrency model, the /status contract, and how we scale it across a fleet of VMs. 20 minutes, live.
Frequently asked questions
What is the actual 'best MCP server' as of April 2026?
There is no single answer, and any list that gives you one is selling you something. The question is structurally wrong. An MCP server is a wrapper over a resource, and the right answer depends on the shape of that resource. For a desktop, the best server I have read is Terminator, specifically because it defaults MCP_MAX_CONCURRENT=1 and has a busy-aware /status endpoint that plays nicely with a load balancer. For a search engine, Exa is the obvious pick because search is read-only and fan-out friendly. For a CRM, HubSpot and Salesforce each publish their own. The correct question is: which MCP server is best for the resource I actually want an AI to control.
Why does Terminator default MCP_MAX_CONCURRENT to 1?
Because a desktop has exactly one mouse, one keyboard, and one window with focus at any time. If two tool calls run at once and both try to click, one of them either races the other or hits the wrong window. The LLM has no way to know which happened, because the accessibility tree the tool reads is a snapshot of shared state. Terminator makes this explicit in crates/terminator-mcp-agent/src/main.rs line 516: the default is 1. You can set it higher with an env var, but only if you know that your specific workflow never contends for the same input device at the same time, which for GUI automation is almost never true.
What happens when the Terminator MCP server is busy and a new request arrives?
POST /mcp returns 503 Service Unavailable immediately, with a JSON body shaped {"busy": true, "activeRequests": 1, "maxConcurrent": 1, "lastActivity": "<ISO-8601>"}. The 503 is intentional. A load balancer probing GET /status gets the same body with the same four fields and pulls that VM out of rotation until the active request finishes. The next client request lands on an idle VM. This is in main.rs lines 664 through 684 for the gate middleware and 537 through 553 for the status handler. It is documented in the crate's README under the HTTP Endpoints heading.
Is any other MCP server designed around this concurrency contract?
Not that I have read. The vast majority of MCP servers I have audited are HTTP wrappers around stateless APIs where concurrency is the remote service's problem. GitHub MCP, Slack MCP, Exa, Firecrawl, Notion MCP: all fine to run unbounded because the upstream has its own rate limiting and the operations are mostly idempotent. The 'best MCP server' listicles never bring up concurrency because for most of the servers in those lists, the right answer is 'whatever, the API is stateless anyway'. Desktop, browser control, and anything running against a physical device or a single logged-in session are the servers where concurrency shape genuinely matters, and they are underrepresented in the top-ranked articles.
How do I verify the concurrency behaviour myself?
Run the server with HTTP transport: npx -y terminator-mcp-agent@latest -t http. In one terminal, curl http://127.0.0.1:3000/status and confirm you get 200 with busy=false. Fire a long-running tool call like click_element against a slow-to-render window. In a second terminal, curl the same URL while the first is running. You will get 503 with busy=true, the same body shape, activeRequests=1, maxConcurrent=1. When the first call finishes, /status flips back to 200. You can also set MCP_MAX_CONCURRENT=2 as an env var, rerun the experiment, and watch two concurrent clicks actually step on each other, which is an excellent education in why the default is 1.
Does this mean Terminator does not scale?
It scales by replication, not by thread count. Each VM owns exactly one mouse, so the throughput story is not about making one VM serve 100 concurrent tool calls; it is about running 100 VMs behind a load balancer and letting the LB pick an idle one on every request. The 503-from-/status behaviour is the mechanism: Azure Load Balancer, Google Cloud Load Balancer, and any HTTP health check that understands 503 will take a busy VM out of rotation and route traffic elsewhere. This is the same scaling pattern you would use for any resource that is physically serial, like a 3D printer, a hardware test bench, or a phone farm.
What does 'schema drift protection' actually mean in Terminator's code?
crates/terminator-mcp-agent/build.rs reads src/server.rs at compile time, walks until it hits the line containing 'let result = match tool_name', then collects every subsequent match arm that starts with a quoted string followed by '=>'. That list becomes the MCP_TOOLS environment variable via rustc-env. prompt.rs reads env!("MCP_TOOLS") and pastes it into the system prompt the server returns on initialize. The consequence: the list of tool names the LLM is told about is literally compiled from the list of tool names the dispatch function knows how to route. They cannot disagree. Every other MCP server I have read maintains the prompt and the dispatch by hand.
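The scan described above is simple enough to sketch. This is the algorithm as the paragraph describes it, not a copy of build.rs, and the real code may differ in detail:

```rust
// Find the dispatch line, then collect every subsequent match arm of the
// form "tool_name" => handler(...). A sketch of the described build.rs scan.
fn extract_tools(source: &str) -> Vec<String> {
    let mut tools = Vec::new();
    let mut in_dispatch = false;
    for line in source.lines() {
        if line.contains("let result = match tool_name") {
            in_dispatch = true;
            continue;
        }
        if !in_dispatch {
            continue;
        }
        let trimmed = line.trim();
        if let Some(rest) = trimmed.strip_prefix('"') {
            if let Some(close) = rest.find('"') {
                if rest[close + 1..].trim_start().starts_with("=>") {
                    tools.push(rest[..close].to_string());
                }
            }
        }
    }
    tools
}

fn main() {
    let src = r#"
        let result = match tool_name {
            "click_element" => click(),
            "stop_execution" => stop(),
            _ => unknown(),
        };
    "#;
    println!("{:?}", extract_tools(src)); // → ["click_element", "stop_execution"]
}
```

Because the list is produced from the same text the compiler dispatches on, adding a tool to the match block and forgetting the prompt is not a mistake you can make: the env var is regenerated on the next build.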
How is stop_execution different from a regular cancellation?
stop_execution is a first-class tool in the MCP server. The LLM can call it like any other tool, and every handler in server.rs awaits its work inside a tokio::select against a CancellationToken pulled from the request manager at the start of the request. When stop_execution fires, the token flips and the tokio::select drops the in-flight work. It is synchronous in the sense that the desktop pointer is free within one scheduler tick. Contrast with HTTP-only cancellation where you can drop the connection but the server-side handler keeps running to completion and corrupting state. In practice, this means you can say 'stop' to Claude in the middle of a runaway automation and the mouse actually stops.
Does the 'best MCP server' question even make sense when Anthropic, OpenAI, and Google all have native tool use?
Yes, for a specific reason. MCP lets one tool server serve every client that speaks the protocol. If you hand-code tools into Claude's native tool-use API, you cannot reuse them with Cursor, VS Code, a custom agent, or an MCP-aware chat app without re-implementing the wrapper. The value of MCP is the single implementation, shared contract. For a product like Terminator, where the tool set is 30+ handlers, that reuse is worth the protocol overhead. For a product where the 'tool' is one HTTP call, native tool use is usually the right choice.
Where do I look in the Terminator repo to verify everything on this page?
crates/terminator-mcp-agent/src/main.rs for the concurrency default at line 516, the status handler at lines 537 to 553, and the mcp_gate middleware at lines 664 to 684. crates/terminator-mcp-agent/src/server.rs for the 31 tool handlers (grep for '#[tool('), the dispatch_tool match block at line 9953, and the stop_execution handler at line 8587. crates/terminator-mcp-agent/build.rs for the compile-time tool extraction at line 31, and crates/terminator-mcp-agent/src/prompt.rs for the env!("MCP_TOOLS") injection at line 10. Finally, crates/terminator-mcp-agent/README.md lines 36 through 45 document the HTTP endpoints and the 503 behaviour as the public contract.
Pages that unpack the same MCP server from different angles
More from the Terminator guides
What is an MCP server?
An MCP server, opened in the editor. Not the abstract definition; the actual dispatch_tool function that 31 tools flow through.
What are MCP servers, really?
The 150 lines of startup code nobody shows you. Process lifecycle, stdio framing, panic hooks, and why the protocol starts before the handshake.
MCP server list
A concrete list of Terminator's MCP tools, not a generic directory of servers. Names, signatures, and what each one actually touches.