Description
MCP tools from a stdio-transport server intermittently vanish mid-session (within 5-10 minutes of normal use). The server process remains running — it did not crash. Once the tools disappear, they never come back until OpenCode is restarted.
The model sees:
Model tried to call unavailable tool 'seadev_run'.
Available tools: bash, read, glob, grep, edit, write, task, webfetch, ...
seadev_* tools are completely absent from the available list — not erroring, just gone. Tools from a second MCP server (f5-confluence_*) remain available in the same session.
Root Cause (verified from source)
In packages/opencode/src/mcp/index.ts, the MCP.tools() function (line 609) is called on every LLM step via resolveTools() in prompt.ts:836. It calls client.listTools() for each connected MCP client.
Lines 621-635:
const toolsResults = await Promise.all(
  connectedClients.map(async ([clientName, client]) => {
    const toolsResult = await client.listTools().catch((e) => {
      log.error("failed to get tools", { clientName, error: e.message })
      const failedStatus = {
        status: "failed" as const,
        error: e instanceof Error ? e.message : String(e),
      }
      s.status[clientName] = failedStatus
      delete s.clients[clientName] // ← permanently removes client
      return undefined
    })
    return { clientName, client, toolsResult }
  }),
)
Three problems
| # | Problem | Impact |
|---|---|---|
| 1 | delete s.clients[clientName] permanently removes the MCP client from the singleton state | Client is gone for the lifetime of the process; tools vanish permanently |
| 2 | No retry logic | A single transient failure (timeout, pipe hiccup, GC pause) permanently evicts a healthy server |
| 3 | No reconnection and no onclose/onerror handlers on MCP clients | Nothing ever recreates a deleted client |

Compare with create() at startup (line 509), where listTools() is wrapped with withTimeout(). The runtime tools() call has no such timeout wrapper: it relies on the MCP SDK's internal timeout and, on failure, permanently deletes the client rather than retrying.
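The runtime call could be given the same kind of explicit bound that create() uses. The following is a minimal sketch of such a wrapper, assuming a promise-plus-milliseconds signature. OpenCode's actual withTimeout() may be shaped differently, and `ToolLister`, `listToolsBounded`, and the 5000 ms default are hypothetical names and values chosen for illustration.

```typescript
// Hypothetical stand-in for OpenCode's withTimeout(); the real signature may differ.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`operation timed out after ${ms}ms`)), ms)
  })
  try {
    // Whichever promise settles first wins, so a slow call surfaces as a rejection.
    return await Promise.race([promise, timeout])
  } finally {
    // Always clear the timer so it cannot fire after the race is decided.
    clearTimeout(timer)
  }
}

// Assumed minimal client shape: only listTools() matters for this sketch.
interface ToolLister {
  listTools(): Promise<{ tools: { name: string }[] }>
}

// Bound the runtime call explicitly instead of relying on the SDK's internal timeout.
// The default of 5000 ms is an arbitrary illustrative value.
function listToolsBounded(client: ToolLister, ms = 5000) {
  return withTimeout(client.listTools(), ms)
}
```

A timeout on its own only makes the failure faster and more predictable; it still needs to be paired with a non-destructive failure handler to avoid the eviction described in the table.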
Evidence
| Check | Result |
|---|---|
| MCP server process alive? | Yes — ps aux confirms the Python process is still running with the same PID hours after tools vanished |
| Crash in OpenCode logs? | No — zero error/disconnect entries for the affected MCP server in ~/.local/share/opencode/log/ |
| Other MCP server affected? | No — f5-confluence_* tools remained available in the same session |
| Restart fixes it? | Yes — restarting OpenCode re-creates the client via create() in the Instance.state() initializer |
Code path
prompt.ts:604 → resolveTools() called every LLM loop iteration
prompt.ts:836 → MCP.tools() called
mcp/index.ts:621 → client.listTools() called per-client with .catch()
mcp/index.ts:630 → catch handler: delete s.clients[clientName] ← permanent eviction
(no retry, no reconnect, no onclose handler)
State is a singleton (Instance.state) — the deletion persists for the entire process lifetime.
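The effect of this path can be reproduced with a small model. The sketch below is not OpenCode's code: `State`, `FakeClient`, `listAllTools`, `flakyOnce`, and the `f5-confluence_example` tool name are hypothetical and only mirror the catch-and-delete pattern quoted from mcp/index.ts.

```typescript
type FakeClient = { listTools: () => Promise<string[]> }
type State = { clients: Record<string, FakeClient>; status: Record<string, string> }

// Mirrors the quoted pattern: any listTools() rejection deletes the client.
async function listAllTools(s: State): Promise<string[]> {
  const perClient = await Promise.all(
    Object.entries(s.clients).map(([name, client]) =>
      client.listTools().catch(() => {
        s.status[name] = "failed"
        delete s.clients[name] // eviction: nothing ever re-adds this entry
        return [] as string[]
      }),
    ),
  )
  return perClient.reduce((all, tools) => all.concat(tools), [] as string[])
}

// A client that rejects exactly once and then behaves normally.
function flakyOnce(tools: string[]): FakeClient {
  let failed = false
  return {
    listTools: async () => {
      if (!failed) {
        failed = true
        throw new Error("transient timeout")
      }
      return tools
    },
  }
}

async function demo() {
  const s: State = {
    clients: {
      seadev: flakyOnce(["seadev_run"]),
      // Placeholder tool name for the second server.
      "f5-confluence": { listTools: async () => ["f5-confluence_example"] },
    },
    status: {},
  }
  const first = await listAllTools(s) // seadev rejects once and is evicted
  const second = await listAllTools(s) // seadev would now succeed, but is never asked
  return { first, second, remaining: Object.keys(s.clients) }
}
```

In this model the second call returns only the second server's tools even though the first server has recovered, which matches the symptom reported above: one server's tools disappear while the other's stay available.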
Suggested Fix
Option A (Minimal): Retry listTools() 2-3 times with short backoff before evicting:
const toolsResult = await retry(() => client.listTools(), { attempts: 3, delay: 1000 })
  .catch((e) => {
    // only evict after all retries exhausted
  })
Option B (Robust): On failure, attempt to create() a new client for the same MCP config. Mark as "reconnecting" rather than "failed".
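The `retry` helper used in Option A is not defined in this issue. A minimal sketch of what it might look like, assuming the options shape from the snippet (`attempts` as the total number of tries, `delay` as a base wait in milliseconds) and a delay that doubles after each failure; the doubling is an assumption, not something the codebase prescribes.

```typescript
interface RetryOptions {
  attempts: number // total tries, including the first one
  delay: number // base wait in milliseconds before the second try
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Hypothetical helper matching the call shape used in Option A.
async function retry<T>(fn: () => Promise<T>, { attempts, delay }: RetryOptions): Promise<T> {
  if (attempts < 1) throw new RangeError("attempts must be at least 1")
  let lastError: unknown
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn()
    } catch (e) {
      lastError = e
      // Wait before the next try; the wait doubles after every failure.
      if (attempt < attempts) await sleep(delay * Math.pow(2, attempt - 1))
    }
  }
  // All tries failed: surface the last error so the caller can decide what to do.
  throw lastError
}
```

With `{ attempts: 3, delay: 1000 }` this waits 1 s and then 2 s between tries, so a persistent failure is reported after roughly 3 s plus the time spent in the calls themselves. Because tools() runs on every LLM step, a shorter base delay may be preferable in practice.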
Option C (Defensive): Register client.onclose handlers after creating MCP clients (near line 476) to trigger automatic reconnection:
client.onclose = () => {
  log.warn("MCP client closed, reconnecting", { key })
  // trigger reconnection
}
At minimum, delete s.clients[clientName] on line 630 should be removed or guarded by a retry counter: a single transient listTools() failure should not permanently kill an otherwise healthy MCP connection.
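A failure counter and a reconnect step can be combined so that a client is never silently dropped. The sketch below is one possible shape under stated assumptions: `manageClient`, `Client`, and `Status` are hypothetical names, `connect` stands in for whatever create() does for a given MCP config, and the threshold of 3 is arbitrary.

```typescript
interface Client {
  listTools(): Promise<string[]>
}

type Status = "connected" | "reconnecting" | "failed"

// Hypothetical wrapper: tolerates transient failures and rebuilds the client
// instead of deleting it. `connect` is a stand-in for OpenCode's create().
function manageClient(initial: Client, connect: () => Promise<Client>, maxFailures = 3) {
  let client = initial
  let failures = 0
  let status: Status = "connected"

  async function listTools(): Promise<string[]> {
    try {
      const tools = await client.listTools()
      failures = 0
      status = "connected"
      return tools
    } catch {
      failures++
      // Below the threshold: report no tools for this step but keep the client.
      if (failures < maxFailures) return []
      // Threshold reached: try a fresh connection rather than evicting.
      status = "reconnecting"
      try {
        client = await connect()
        const tools = await client.listTools()
        failures = 0
        status = "connected"
        return tools
      } catch {
        // The entry is retained, so a later call can attempt recovery again.
        status = "failed"
        return []
      }
    }
  }

  return { listTools, getStatus: (): Status => status }
}
```

The trade-off is that a step which hits a transient failure still sees no tools from that server, but the next step can recover without restarting OpenCode.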
Related Issues
- [FEATURE]: MCP server startup timeout should fallback to lazy-load, not fail #13672 — MCP startup timeout should fallback to lazy-load (covers startup path, not runtime eviction)
- Issue: MCP Client disconnects immediately if Server returns a Prompt with an empty name {"name":""} #11816 — MCP client disconnects on invalid prompt name (covers schema error, not transient failure)
Plugins
opencode-beads
OpenCode version
1.2.24
Steps to reproduce
- Configure two MCP servers in opencode.json: one fast/local and one that wraps subprocess calls (e.g. a custom FastMCP server that runs SSH commands)
- Start OpenCode and verify both MCP servers connect and all tools are available
- Use the session normally for 5-10 minutes, invoking tools from both servers
- At some point, tools from one server silently vanish and the model reports "Model tried to call unavailable tool". The other server's tools remain available.
- Verify the MCP server process is still running (ps aux | grep <server-name>)
- Restart OpenCode; the tools reappear immediately
Screenshot and/or share link
No response
Operating System
macOS Tahoe 26.3
Terminal
iTerm2