Rejected Proposal: Sleep(INF) Process Termination

Status: rejected.

Concern

Unix-style zombies are a poor fit for capOS. A terminated child should not keep its address space, cap table, endpoint state, or other authority alive merely because a parent has not waited yet. The remaining observable state should be a small, capability-scoped completion record, and only holders of the corresponding ProcessHandle should be able to observe it.

The current ProcessHandle.wait() -> exitCode :Int64 shape is also too weak for future lifecycle semantics. Raw numeric status cannot distinguish normal application exit from abandon, kill, fault, startup failure, runtime panic, or supervisor policy actions without inventing process-wide magic numbers.

Proposal

Introduce a system sleep operation and treat Sleep(INF) as a special terminal operation. The argument for this spelling is that a process that never wants to run again can enter an infinite sleep instead of becoming a zombie. The kernel would recognize the infinite case and handle it specially:

  • finite Sleep(duration) blocks the process and wakes it later;
  • Sleep(INF) never wakes, so the kernel tears down the process;
  • the process’s authority is released as if it had exited;
  • parent-visible process completion is either omitted or reported as a special status.

A variant also removes the dedicated sys_exit syscall and makes Sleep(INF) the only user-visible process termination primitive.

Candidate Semantics

Sleep(INF) as Exit(0)

The simplest version maps Sleep(INF) to normal successful exit.

This is rejected because it lies about intent. A program that completed successfully, a program that intentionally detached, and a program that chose to disappear without status are not the same lifecycle event. Supervisors would see the same status for all of them.

Sleep(INF) as Abandoned

A less lossy version gives Sleep(INF) a distinct terminal status:

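struct ProcessStatus {
  # Typed terminal status returned by ProcessHandle.wait().
  # KillReason, FaultInfo, and StartupFailure are not defined in this document.
  union {
    exited @0 :ApplicationExit;       # normal application termination
    abandoned @1 :Void;               # Sleep(INF) under this variant
    killed @2 :KillReason;
    faulted @3 :FaultInfo;
    startupFailed @4 :StartupFailure;
  }
}

struct ApplicationExit {
  code @0 :Int64;                     # application-chosen exit code
}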

ProcessHandle.wait() would return status :ProcessStatus instead of a bare exitCode :Int64. Normal application termination returns exited(code), while Sleep(INF) returns abandoned.

This fixes the type problem but leaves the operation name wrong. Sleep normally means the process remains alive and keeps its authority until a wake condition occurs. The infinite special case would instead release authority, reclaim memory, cancel endpoint state, complete process handles, and make the process impossible to wake. That is termination, not sleep.
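Independent of the naming objection, the value of the typed status can be illustrated with a small Rust sketch. The enum below is a hypothetical mirror of the ProcessStatus schema, not capOS code: KillReason, FaultInfo, and StartupFailure are opaque placeholders because the proposal does not define them, and describe() with its strings is an assumption made only for illustration.

```rust
// Hypothetical Rust mirror of the ProcessStatus schema; not capOS code.
// The three payload types are opaque placeholders because the proposal
// does not define their contents.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct KillReason;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FaultInfo;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StartupFailure;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProcessStatus {
    Exited { code: i64 },
    Abandoned,
    Killed(KillReason),
    Faulted(FaultInfo),
    StartupFailed(StartupFailure),
}

/// Illustrative supervisor-side summary. The strings are assumptions;
/// the point is that every lifecycle event gets its own match arm
/// instead of sharing a range of magic exit codes.
pub fn describe(status: &ProcessStatus) -> &'static str {
    match status {
        ProcessStatus::Exited { code: 0 } => "completed successfully",
        ProcessStatus::Exited { .. } => "exited with an application error",
        ProcessStatus::Abandoned => "abandoned without status",
        ProcessStatus::Killed(_) => "killed",
        ProcessStatus::Faulted(_) => "faulted",
        ProcessStatus::StartupFailed(_) => "failed to start",
    }
}
```

With a bare Int64, the Exited and Abandoned cases would have to share one number space. Here the compiler rejects a supervisor that forgets to handle a variant.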

Sleep(INF) as Detached No-Status Termination

Another version treats Sleep(INF) as detached termination and gives parents no status. That avoids inventing an exit code, but it weakens supervision. Init and future service supervisors need a definite terminal event to implement restart policy, diagnostics, dependency failure reporting, and “wait for all children” flows. A missing status is not a useful status.
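The "wait for all children" flow shows why a definite terminal event matters. The sketch below is a toy model under stated assumptions: WaitAll, Terminal, and the use of u32 child identifiers are hypothetical names invented here, not part of capOS.

```rust
use std::collections::BTreeSet;

/// Hypothetical, reduced terminal status used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Terminal {
    Exited(i64),
    Abandoned,
}

/// Toy "wait for all children" tracker. A child leaves the pending set
/// only when a terminal event is delivered for it.
pub struct WaitAll {
    pending: BTreeSet<u32>,
}

impl WaitAll {
    pub fn new(children: &[u32]) -> Self {
        Self { pending: children.iter().copied().collect() }
    }

    /// Record a terminal event for one child. The status is ignored here;
    /// a real supervisor would use it for restart policy and diagnostics.
    pub fn on_terminal(&mut self, child: u32, _status: Terminal) {
        self.pending.remove(&child);
    }

    /// True once every tracked child has reported a terminal event.
    pub fn done(&self) -> bool {
        self.pending.is_empty()
    }
}
```

If a child terminated without any status event, on_terminal would never be called for it and done() would stay false indefinitely. The supervisor could not tell a detached child from one that is still running.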

Remove sys_exit Through a Typed Lifecycle Capability

Removing the dedicated sys_exit syscall is a separate, plausible future direction. The cleaner version is not Sleep(INF), but an explicit lifecycle operation:

interface ProcessSelf {
  terminate @0 (status :ProcessStatus) -> ();
  abandon @1 () -> ();
}

interface ProcessHandle {
  wait @0 () -> (status :ProcessStatus);
}

The process would receive ProcessSelf only for itself. Calling terminate would be non-returning in practice: the kernel would process the request, release process authority, complete any ProcessHandle waiter with the typed status, and not post an ordinary success completion back to the dying process.

The transport shape needs care. A generic Cap’n Proto call normally expects a completion CQE, but a self-termination operation cannot safely rely on the dying process to consume one. Viable implementations include:

  • a dedicated ring operation such as CAP_OP_EXIT targeting a self-lifecycle cap;
  • a ProcessSelf.terminate call whose method is explicitly non-returning and never posts a CQE to the caller;
  • keeping sys_exit temporarily until ring-level non-returning operations have explicit ABI and runtime support.
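The non-returning behaviour shared by the first two options can be modelled in a few lines. This is a toy simulation, not the capOS ring ABI: Op, Cqe, ToyKernel, and their fields are hypothetical names chosen for the sketch, and the terminal status is reduced to a bare code for brevity.

```rust
/// Hypothetical ring operations; only the shape matters here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Op {
    /// An ordinary operation that completes with a CQE.
    Nop { user_data: u64 },
    /// A non-returning self-termination request.
    Terminate { code: i64 },
}

/// Hypothetical completion queue entry delivered to the caller.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Cqe {
    pub user_data: u64,
}

/// Toy kernel state for a single process and a single handle holder.
#[derive(Debug)]
pub struct ToyKernel {
    pub caller_cq: Vec<Cqe>,
    pub handle_status: Option<i64>,
    pub alive: bool,
}

impl ToyKernel {
    pub fn new() -> Self {
        Self { caller_cq: Vec::new(), handle_status: None, alive: true }
    }

    /// Process one submission from the process.
    pub fn submit(&mut self, op: Op) {
        if !self.alive {
            return; // a terminated process can no longer submit work
        }
        match op {
            Op::Nop { user_data } => self.caller_cq.push(Cqe { user_data }),
            Op::Terminate { code } => {
                // No CQE is posted to the dying process; the status is
                // delivered to the handle side instead.
                self.alive = false;
                self.handle_status = Some(code);
            }
        }
    }
}
```

In this model the only observable result of Terminate is on the handle side, which is the property the transport has to guarantee whichever option is chosen.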

This path removes the ambient exit syscall without overloading sleep. It also forces terminal status to become typed before kill, abandon, restart policy, or fault reporting are added.

Rationale For Rejection

Sleep(INF) solves the wrong abstraction problem. The zombie problem is not that a process needs a forever-blocked state; it is that process resources stay allocated after execution has ended. capOS should solve that by separating process lifetime from process-status observation:

  • process termination immediately releases authority and reclaims process resources;
  • a ProcessHandle is only observation authority, not ownership of the live process;
  • if a handle exists, a small completion record may remain until the holder waits on it or releases the handle;
  • if no handle exists, terminal status can be discarded;
  • no ambient parent process table is needed.
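The separation described in this list can be sketched as a toy process table. Everything here is an assumption made for illustration: ToyProcessTable, Slot, Status, and the u32 process identifiers are hypothetical, and real resource reclamation is reduced to removing a map entry.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced terminal status used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Exited(i64),
    Abandoned,
}

#[derive(Debug)]
enum Slot {
    /// A running process; handle_held records whether a ProcessHandle exists.
    Live { handle_held: bool },
    /// Only the small completion record remains.
    Completed(Status),
}

#[derive(Debug, Default)]
pub struct ToyProcessTable {
    slots: HashMap<u32, Slot>,
}

impl ToyProcessTable {
    pub fn spawn(&mut self, pid: u32, handle_held: bool) {
        self.slots.insert(pid, Slot::Live { handle_held });
    }

    /// Termination drops the live state at once. A completion record is
    /// kept only when some handle could still observe it.
    pub fn terminate(&mut self, pid: u32, status: Status) {
        match self.slots.remove(&pid) {
            Some(Slot::Live { handle_held: true }) => {
                self.slots.insert(pid, Slot::Completed(status));
            }
            Some(Slot::Live { handle_held: false }) | None => {}
            Some(done) => {
                self.slots.insert(pid, done); // already terminated: keep first status
            }
        }
    }

    /// Waiting consumes the completion record if there is one.
    pub fn wait(&mut self, pid: u32) -> Option<Status> {
        if !matches!(self.slots.get(&pid), Some(Slot::Completed(_))) {
            return None;
        }
        match self.slots.remove(&pid) {
            Some(Slot::Completed(status)) => Some(status),
            _ => None,
        }
    }

    /// Releasing the handle discards any record without observing it.
    pub fn release_handle(&mut self, pid: u32) {
        if matches!(self.slots.get(&pid), Some(Slot::Completed(_))) {
            self.slots.remove(&pid);
        } else if let Some(Slot::Live { handle_held }) = self.slots.get_mut(&pid) {
            *handle_held = false;
        }
    }

    /// Number of completion records currently retained.
    pub fn retained_records(&self) -> usize {
        self.slots.values().filter(|s| matches!(s, Slot::Completed(_))).count()
    }
}
```

In this model nothing resembling a zombie survives: a process with no handle leaves no trace after terminate, and a process with a handle leaves only a Status value until wait or release_handle removes it.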

Under that model, a sleeping process remains alive and authoritative, while a terminated process does not. Special-casing Sleep(INF) to perform teardown would make the name actively misleading and would create a hidden terminal operation with different semantics from finite sleep.

The accepted direction is therefore:

  • keep explicit process termination semantics;
  • replace raw exitCode :Int64 with typed ProcessStatus before adding more lifecycle states;
  • keep exit(code) as the current minimal ABI until a typed self-lifecycle capability or ring operation can replace it cleanly;
  • add future Timer.sleep(duration) only for real sleep, where the process remains alive and may wake.

Sleep(INF) remains rejected as a termination primitive. The concern it raises is valid, but the solution is typed terminal status plus status-record cleanup, not infinite sleep.