Skip to content

fix: prevent orphaned processes via stdin/ppid/onclose lifecycle guards#77

Merged
PatrickSys merged 2 commits intomasterfrom
fix/process-lifecycle-hardening
Mar 19, 2026
Merged

fix: prevent orphaned processes via stdin/ppid/onclose lifecycle guards#77
PatrickSys merged 2 commits intomasterfrom
fix/process-lifecycle-hardening

Conversation

@PatrickSys
Copy link
Owner

Summary

  • Adds 4 missing lifecycle guards to main() in src/index.ts that caused orphaned node processes to accumulate when MCP clients exit without clean shutdown
  • A consumer reported 41 orphaned processes consuming ~27GB RAM on a 32GB machine across 28 Claude Code sessions on a large NX monorepo
  • Root cause: StdioServerTransport never listens for stdin end/close, no parent-death detection, no SIGHUP handler

Changes (~27 lines added, no new dependencies)

Guard What it catches
process.stdin.on('end'/'close') Client exit (normal or crash) — stdin pipe breaks
server.onclose Graceful MCP protocol-level disconnect
PPID polling (5s, unref'd) Parent SIGKILL, terminal close, npx/cmd.exe wrapper chains (cross-platform)
process.once('SIGHUP') Terminal close (Unix), console window close (Windows)

Why not transport.onclose?

The MCP SDK's Protocol.connect() overwrites transport.onclose with its own wrapper. Setting it after server.connect(transport) would break SDK internal cleanup. server.onclose is the correct hook — it fires after SDK cleanup completes.

Why no singleton/PID-file guard?

Would prevent legitimate multi-session use (two Claude Code tabs on the same project). PPID polling already catches orphans within 5 seconds, making a singleton redundant.

Test plan

  • pnpm build — no type errors
  • pnpm test — all 312 tests pass
  • Manual: start MCP server, kill parent terminal → server exits within 5s (PPID poll)
  • Manual: pipe stdin from a process, kill that process → server exits immediately (stdin end)
  • Verify CLI subcommands still work (they don't call main())

Adds 4 missing lifecycle guards to main() that caused orphaned node
processes to accumulate (~27GB RAM from 41 orphans on a 32GB machine):

- stdin end/close listeners: detect client exit via pipe closure
- server.onclose: handle graceful MCP protocol disconnect
- PPID polling (5s, unref'd): cross-platform parent death detection
- SIGHUP handler: terminal close on Unix + Windows console close
@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@greptile-apps
Copy link

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR adds four lifecycle guards to main() in src/index.ts to prevent orphaned Node.js processes when MCP clients exit without a clean shutdown — addressing a real-world report of 41 accumulated processes consuming ~27 GB of RAM. The approach (PPID polling, stdin pipe closure detection, server.onclose, and SIGHUP) is sound and well-motivated, but there are two issues worth addressing before merge.

Key changes:

  • PPID polling every 5 s (unref'd) to detect parent SIGKILL / crash on all platforms
  • stdin end/close listeners to detect broken stdio pipe immediately
  • server.onclose hook for graceful MCP protocol-level disconnect
  • process.once('SIGHUP', ...) handler consistent with the existing SIGINT/SIGTERM pattern

Issues found:

  • The bare catch {} in the PPID guard swallows EPERM (parent alive, permission denied), which can cause the server to exit spuriously — only ESRCH (process not found) should trigger an exit
  • The new stdin/server.onclose exit paths fire process.exit(0) before process.once('exit', stopAllWatchers) is registered, creating a narrow window where active file watchers may not be cleaned up if the connection drops during initProject()

Confidence Score: 3/5

  • Safe to merge after fixing the EPERM false-positive in the PPID guard; the watcher-cleanup race is low-risk but worth addressing.
  • The PPID guard's bare catch catching EPERM is a correctness bug that could cause the server to exit when the parent process is still alive, which would undermine the purpose of the fix. The watcher cleanup race is low-probability but real. Both issues are straightforward to fix with a few lines of change.
  • src/index.ts — specifically the parentGuard catch block (line ~1622) and the ordering of exit-handler registration relative to the new stdin/server.onclose handlers.

Important Files Changed

Filename Overview
src/index.ts Adds four orphan-process lifecycle guards to main(): PPID polling, stdin end/close, server.onclose, and SIGHUP. The PPID guard has a correctness bug (bare catch swallows EPERM, making the server exit even when the parent is alive with a permission mismatch), and the stdin/server.onclose handlers have a narrow race where watcher cleanup may be skipped.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[MCP Client spawns server] --> B[main starts]
    B --> C[Setup PPID guard - setInterval 5s unrefed]
    B --> D[server.connect transport]
    D --> E[Register stdin end and close handlers]
    D --> F[Set server.onclose handler]
    D --> G[Register SIGHUP handler]

    C -->|every 5s| H{process.kill parentPid 0}
    H -->|success - parent alive| C
    H -->|throws ESRCH - parent dead| EXIT[process.exit 0]
    H -->|throws EPERM - bug - parent alive no permission| EXIT

    E -->|stdin pipe breaks| EXIT
    F -->|graceful MCP disconnect| EXIT
    G -->|terminal closed| EXIT2[stopAllWatchers then process.exit 0]

    EXIT -->|relies on exit listener registered later| WATCH[stopAllWatchers via exit event]
    EXIT2 --> WATCH
Loading

Last reviewed commit: "fix: prevent orphane..."

Comment on lines +1619 to +1625
const parentGuard = setInterval(() => {
try {
process.kill(parentPid, 0);
} catch {
process.exit(0);
}
}, 5_000);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Bare catch swallows EPERM, falsely kills server

process.kill(pid, 0) throws two distinct errors:

  • ESRCH — process does not exist → correct to exit
  • EPERM — process exists but we lack permission to signal it → incorrect to exit

The bare catch block catches both, meaning if the parent process is still alive but running in a slightly different permission context (e.g., after a setuid call, or on certain Linux configurations where a parent can have a different UID), the server will exit spuriously every 5 seconds.

The fix is to only exit on ESRCH:

Suggested change
const parentGuard = setInterval(() => {
try {
process.kill(parentPid, 0);
} catch {
process.exit(0);
}
}, 5_000);
const parentGuard = setInterval(() => {
try {
process.kill(parentPid, 0);
} catch (err: unknown) {
if ((err as NodeJS.ErrnoException).code === 'ESRCH') {
process.exit(0);
}
}
}, 5_000);

Comment on lines +1632 to +1639
// Detect stdin pipe closure — the primary signal that the MCP client is gone.
// StdioServerTransport only listens for 'data'/'error', never 'end'.
process.stdin.on('end', () => process.exit(0));
process.stdin.on('close', () => process.exit(0));

// Handle graceful MCP protocol-level disconnect.
// Fires after SDK internal cleanup when transport.close() is called.
server.onclose = () => process.exit(0);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 New exit paths bypass watcher cleanup window

The stdin and server.onclose handlers call process.exit(0) directly, relying on process.once('exit', stopAllWatchers) (registered at line 1668) to clean up watchers. However, stopAllWatchers is a const that is not yet declared at this point in main().

If either handler fires after initProject() starts a watcher (around line 1648) but before process.once('exit', stopAllWatchers) is registered (line 1668), the exit listener won't be in place and watchers will be abandoned rather than stopped gracefully. The window is small but real — initProject() is awaited, so the event loop can yield during that call.

One safe pattern is to hoist the stopAllWatchers registration to right after server.connect(transport), before these handlers are set, so the exit listener is always in place when the handlers could possibly fire.

- PPID poll: only exit on ESRCH (process gone), ignore EPERM (process
  alive but different UID — setuid, Linux security modules, etc.)
- Hoist stopAllWatchers + exit/signal listeners to before stdin/onclose
  handlers, closing the race window where process.exit() could fire
  during initProject() before the cleanup listener was registered
@PatrickSys PatrickSys merged commit 0b49d7a into master Mar 19, 2026
@PatrickSys PatrickSys deleted the fix/process-lifecycle-hardening branch March 19, 2026 22:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant