Proposal: libcapos-service

Define a userspace service framework above capos-rt for long-running capOS services. The library should provide common lifecycle, endpoint, readiness, shutdown, context, metrics, and budgeting mechanics without adding a generic kernel Service capability or a kernel-level phase machine.

The immediate target is terminal/networking lifecycle: byte-stream terminal hosting, Telnet/TLS/SSH gateway plumbing, listener accept loops, shell launch, proxying, cleanup, and observable shutdown. HTTP/fetch services come later.

Problem

Current services duplicate the same shape:

  • discover bootstrap caps;
  • wait for dependencies;
  • mark readiness through log output or implicit behavior;
  • run accept or endpoint receive loops;
  • spawn children or proxy byte streams;
  • release result caps and temporary state;
  • log or count failures;
  • shut down after EOF, error, process exit, or supervisor request.

Duplicating that lifecycle is tolerable for proofs, but it is a poor foundation for production gateway, storage, agent, monitoring, and network services. Repeated hand-rolled loops are also where capability leaks, stuck children, incorrect close ordering, and hidden unbounded work appear.

Layering Decision

The stack remains:

schema/capos.capnp
  stable authority-bearing interfaces

capos-rt
  raw CapSet/ring/typed-handle transport, result caps, release, exceptions

libcapos-service
  userspace lifecycle and endpoint/service helpers

domain libraries
  terminal host, HTTP/fetch, storage, supervisor, agent tools

init/supervisors
  compose services by passing capabilities

libcapos-service is not a new authority source. It wraps and narrows capabilities the process already holds. The kernel still sees ordinary typed capability calls and ordinary process lifecycle.
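As a rough illustration of "wraps and narrows", the sketch below wraps a stream the process already holds and exposes only its read side. `ByteStream`, `ReadOnly`, and `MemStream` are hypothetical names invented for this example; they are not capos-rt or libcapos-service types, and a real wrapper would sit on typed capability handles rather than an in-memory buffer.

```rust
/// Hypothetical stand-in for a byte-stream capability the process already holds.
pub trait ByteStream {
    fn read(&mut self, buf: &mut [u8]) -> usize;
    fn write(&mut self, data: &[u8]) -> usize;
}

/// Narrowed view: forwards reads and exposes nothing else.
/// It owns the original handle and cannot mint authority the caller lacked.
pub struct ReadOnly<S: ByteStream>(S);

impl<S: ByteStream> ReadOnly<S> {
    pub fn new(inner: S) -> Self {
        ReadOnly(inner)
    }

    pub fn read(&mut self, buf: &mut [u8]) -> usize {
        self.0.read(buf)
    }
}

/// In-memory stream used only to exercise the sketch.
pub struct MemStream {
    pub data: Vec<u8>,
    pub pos: usize,
    pub written: Vec<u8>,
}

impl ByteStream for MemStream {
    fn read(&mut self, buf: &mut [u8]) -> usize {
        // Copy as many unread bytes as fit into the caller's buffer.
        let n = buf.len().min(self.data.len() - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        n
    }

    fn write(&mut self, data: &[u8]) -> usize {
        self.written.extend_from_slice(data);
        data.len()
    }
}
```

Code that receives a `ReadOnly<MemStream>` can read but has no path to `write`, which is the narrowing direction the layering intends.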

Core Surface

Initial framework pieces:

  • Service lifecycle: initialize, dependency wait, ready, run, drain, shutdown, and final cleanup.
  • Endpoint serve loops: generated or handwritten helpers for RECV, decode, dispatch, RETURN, exception return, cancellation, and release.
  • Readiness handles: typed local handles or service-exported readiness caps, not global service names.
  • Shutdown and drain: cancellable waits, child/process-handle cleanup, listener stop, in-flight request drain, bounded force-close.
  • Background tasks: timers, periodic health checks, metrics export, and discovery loops with explicit cancellation.
  • Request/session context: owned context object per request or session containing caller-session metadata, derived policy, resource reservations, transfer state, timing, and audit correlation.
  • Metrics hooks: bounded counters and summaries; no unbounded per-user, per-cap-id, or per-method labels by default.
  • Resource budgeting: reservation/donation hooks that call into the relevant ledger owner; the framework records what was reserved and releases it on every exit path.
  • Error boundary: preserve the error-handling split from error-handling-proposal.md: CQE status for transport/kernel dispatch failure, CapException for capability infrastructure failure, and schema result unions for normal domain outcomes.
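The resource-budgeting item says reservations are released on every exit path. One way to get that property in Rust is a drop guard, sketched below. `Ledger`, `Reservation`, and `handle_request` are hypothetical names; a real ledger owner would be reached through a capability call rather than a shared counter, and `std` is used here only to keep the example runnable on a host.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical ledger owner with a fixed budget.
pub struct Ledger {
    reserved: Cell<u64>,
    limit: u64,
}

impl Ledger {
    pub fn new(limit: u64) -> Rc<Self> {
        Rc::new(Ledger { reserved: Cell::new(0), limit })
    }

    pub fn reserved(&self) -> u64 {
        self.reserved.get()
    }
}

/// Records one reservation and returns it to the ledger when dropped.
pub struct Reservation {
    ledger: Rc<Ledger>,
    amount: u64,
}

impl Reservation {
    pub fn acquire(ledger: &Rc<Ledger>, amount: u64) -> Result<Self, &'static str> {
        let next = ledger.reserved.get().checked_add(amount).ok_or("overflow")?;
        if next > ledger.limit {
            return Err("budget exceeded");
        }
        ledger.reserved.set(next);
        Ok(Reservation { ledger: Rc::clone(ledger), amount })
    }
}

impl Drop for Reservation {
    fn drop(&mut self) {
        // Runs on normal return, early return, and unwinding alike.
        self.ledger.reserved.set(self.ledger.reserved.get() - self.amount);
    }
}

/// A handler that may fail after reserving; the guard releases either way.
pub fn handle_request(ledger: &Rc<Ledger>, bytes: u64, fail: bool) -> Result<u64, &'static str> {
    let _held = Reservation::acquire(ledger, bytes)?;
    if fail {
        return Err("domain failure");
    }
    Ok(bytes)
}
```

Because the release lives in `Drop`, a handler author cannot forget it on an error branch; the framework only has to make sure the guard is owned by the request context.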

First Target: Terminal And Networking

The first useful slice should be:

  1. TerminalSessionFromByteStream / byte-stream terminal host.
  2. Lifecycle wrapper around accept, session minting, proxying, and cleanup.
  3. Request/session context and metrics hooks.
  4. Network service container for listener-backed services.
  5. HTTP/fetch lifecycle only after terminal/networking proves the cleanup and authority model.

This ordering deliberately exercises the hard lifecycle edges before adding HTTP convenience: authenticated session creation, shell spawn, bidirectional byte proxying, EOF/close/error ordering, repeated connect/disconnect, and release of terminal/session/process result caps.
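A minimal sketch of the cleanup edge, assuming a guard object owns every handle minted for one session. The names `SessionGuard`, `Handle`, `SessionEnd`, and `run_session` are hypothetical, the release order shown (process, then session, then terminal) is an assumption rather than a decided policy, and `release` stands in for whatever capos-rt release call the real code would make.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared log so the example can observe release order.
pub type ReleaseLog = Rc<RefCell<Vec<&'static str>>>;

/// How a proxied session ended. Variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SessionEnd {
    Eof,
    Error,
    Shutdown,
}

/// Hypothetical result-cap handle.
pub struct Handle {
    name: &'static str,
    log: ReleaseLog,
}

impl Handle {
    pub fn new(name: &'static str, log: &ReleaseLog) -> Self {
        Handle { name, log: Rc::clone(log) }
    }

    fn release(&self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Owns the handles for one session and releases them in a fixed order.
pub struct SessionGuard {
    process: Handle,
    session: Handle,
    terminal: Handle,
}

impl Drop for SessionGuard {
    fn drop(&mut self) {
        // Assumed order: child process first, then session, then terminal.
        self.process.release();
        self.session.release();
        self.terminal.release();
    }
}

/// Proxies chunks until EOF (empty chunk), an error, or the input runs out,
/// which this sketch treats as a supervisor-requested shutdown.
pub fn run_session(log: &ReleaseLog, chunks: &[Result<&[u8], ()>]) -> SessionEnd {
    let _guard = SessionGuard {
        process: Handle::new("process", log),
        session: Handle::new("session", log),
        terminal: Handle::new("terminal", log),
    };
    for chunk in chunks {
        match chunk {
            Ok(bytes) if bytes.is_empty() => return SessionEnd::Eof,
            Ok(_) => {} // a real host would forward the bytes here
            Err(_) => return SessionEnd::Error,
        }
    }
    SessionEnd::Shutdown
}
```

Every return path in `run_session` drops the guard, so repeated connect/disconnect cycles release the same three handles in the same order regardless of how the session ended.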

Authority Rules

  • The framework must not accept ambient service names, raw global handles, or stringly typed service discovery.
  • Hooks receive only the caps explicitly passed to that service or request.
  • Request contexts are lifecycle-owned and must be dropped deterministically.
  • Background tasks are budgeted and cancellable during shutdown.
  • Retry policy is domain-specific and requires idempotency or operation ids.
  • Pool keys for reusable resources include every authority and identity field that changes policy: target, protocol, TLS identity, cap/object epoch, caller/session reference, namespace, tenant, and transformation policy.
  • Readiness means the service can actually accept authorized work; config parse success is not enough.
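The pool-key rule can be made concrete with a key type that carries every listed field and derives equality and hashing over all of them. The sketch below is illustrative only: `PoolKey` and `Pool` are hypothetical names, and the field types (strings and integers) are placeholders for whatever typed references the real framework would use.

```rust
use std::collections::HashMap;

/// Hypothetical pool key; one field per item in the authority rule.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PoolKey {
    pub target: String,
    pub protocol: String,
    pub tls_identity: Option<String>,
    pub cap_epoch: u64,
    pub session_ref: u64,
    pub namespace: String,
    pub tenant: String,
    pub transform_policy: String,
}

/// Idle-resource pool keyed by the full authority and identity tuple.
pub struct Pool<R> {
    idle: HashMap<PoolKey, Vec<R>>,
}

impl<R> Pool<R> {
    pub fn new() -> Self {
        Pool { idle: HashMap::new() }
    }

    /// Return a resource to the pool under the key it was created with.
    pub fn put(&mut self, key: PoolKey, resource: R) {
        self.idle.entry(key).or_default().push(resource);
    }

    /// Reuse a resource only when every key field matches exactly.
    pub fn take(&mut self, key: &PoolKey) -> Option<R> {
        self.idle.get_mut(key)?.pop()
    }
}
```

With derived `Eq` and `Hash`, a request that differs in any single field, such as `tenant` or `cap_epoch`, misses the pool and gets a fresh resource instead of one created under different policy.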

Non-Goals

  • No generic kernel Service capability.
  • No kernel callback registry or phase machine.
  • No plugin ABI that passes phase_id and bytes through a single generic cap.
  • No global service discovery namespace.
  • No HTTP-first framework that delays terminal/networking lifecycle cleanup.
  • No replacement for capos-rt transport primitives.

Implementation Sequence

  1. Draft shared ServiceMain/ServiceRuntime shape for one process.
  2. Factor byte-stream terminal host lifecycle around TerminalSessionFromByteStream.
  3. Convert a focused terminal or gateway proof to use the lifecycle wrapper.
  4. Add request/session context and bounded metrics hooks.
  5. Add readiness and shutdown/drain helpers.
  6. Add endpoint serve-loop helpers that preserve typed schema authority.
  7. Add resource reservation/donation hooks.
  8. Consider HTTP/fetch domain library only after terminal/networking proofs pass.
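For step 1, one possible shape is a trait with a hook per lifecycle stage and a single driver function that owns the ordering. This is a sketch under assumptions, not the proposed API: the hook names, the `run_service` driver, and the choice to run drain and cleanup even when startup fails are all hypothetical, and `Recorder` exists only to make the call order visible.

```rust
/// Hypothetical per-service hooks, one per lifecycle stage.
pub trait ServiceMain {
    type Error;
    fn initialize(&mut self) -> Result<(), Self::Error>;
    fn wait_dependencies(&mut self) -> Result<(), Self::Error>;
    fn ready(&mut self);
    fn run(&mut self) -> Result<(), Self::Error>;
    fn drain(&mut self);
    fn cleanup(&mut self);
}

/// Drives one service in one process.
/// Assumption: drain and cleanup always run, even if startup failed.
pub fn run_service<S: ServiceMain>(svc: &mut S) -> Result<(), S::Error> {
    let started = svc.initialize().and_then(|()| svc.wait_dependencies());
    let result = match started {
        Ok(()) => {
            // Readiness is signalled only after dependencies are available.
            svc.ready();
            svc.run()
        }
        Err(e) => Err(e),
    };
    svc.drain();
    svc.cleanup();
    result
}

/// Test double that records which hooks ran.
pub struct Recorder {
    pub calls: Vec<&'static str>,
    pub fail_deps: bool,
}

impl ServiceMain for Recorder {
    type Error = &'static str;

    fn initialize(&mut self) -> Result<(), Self::Error> {
        self.calls.push("initialize");
        Ok(())
    }

    fn wait_dependencies(&mut self) -> Result<(), Self::Error> {
        self.calls.push("wait");
        if self.fail_deps { Err("dependency unavailable") } else { Ok(()) }
    }

    fn ready(&mut self) {
        self.calls.push("ready");
    }

    fn run(&mut self) -> Result<(), Self::Error> {
        self.calls.push("run");
        Ok(())
    }

    fn drain(&mut self) {
        self.calls.push("drain");
    }

    fn cleanup(&mut self) {
        self.calls.push("cleanup");
    }
}
```

Keeping the ordering in one driver means the phase machine stays in userspace, and a service that fails its dependency wait never reports ready but still passes through cleanup.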

Verification

Initial proof gates:

make docs
make run-terminal
make run-telnet or qemu-telnet-harness
focused close/reconnect proof
hidden password behavior remains byte-identical
child shell receives no raw network/spawn/listener authority
gateway cleanup releases terminal/session/process handles on EOF/error/shutdown

Later endpoint-helper gates should add targeted tests for exception return, result-cap release, cancellation, and resource rollback.