IPC and Endpoints
Endpoints let one process serve capability calls to another process without adding a separate IPC syscall surface. The same ring transport carries ordinary kernel capability calls and cross-process endpoint calls.
Current Behavior
An Endpoint is a kernel capability object with queues for pending client
calls, pending server receives, and in-flight calls awaiting RETURN. A service
that owns the raw endpoint can receive and return. Importers receive a
ClientEndpoint facet that can CALL but cannot RECV or RETURN.
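The raw-versus-facet split can be pictured as narrowing by capability shape rather than by a rights bitmask. The following is a minimal sketch only: `EndpointCap`, `EndpointOp`, `Denied`, and `check_op` are hypothetical names chosen for illustration, not the kernel's actual types, and whether a raw holder may also CALL is an assumption of the sketch.

```rust
/// Endpoint operations a ring submission can request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EndpointOp {
    Call,
    Recv,
    Return,
}

/// Hypothetical view of the two endpoint capability shapes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EndpointCap {
    /// Raw endpoint, held by the owning service.
    Raw,
    /// Narrowed ClientEndpoint facet handed to importers.
    Client,
}

/// Returned when a capability shape does not permit the operation.
#[derive(Debug, PartialEq, Eq)]
pub struct Denied;

/// Decide whether `op` is permitted on `cap`.
pub fn check_op(cap: EndpointCap, op: EndpointOp) -> Result<(), Denied> {
    match (cap, op) {
        // Server-side operations require the raw endpoint.
        (EndpointCap::Raw, EndpointOp::Recv | EndpointOp::Return) => Ok(()),
        // Assumption: this sketch lets a raw holder CALL as well;
        // the text above only states that it can receive and return.
        (EndpointCap::Raw, EndpointOp::Call) => Ok(()),
        // The client facet can CALL and nothing else.
        (EndpointCap::Client, EndpointOp::Call) => Ok(()),
        (EndpointCap::Client, EndpointOp::Recv | EndpointOp::Return) => Err(Denied),
    }
}
```

Because the facet is a distinct shape, there is no flag a client could set to regain RECV or RETURN.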
```mermaid
sequenceDiagram
    participant Client
    participant ClientRing as Client ring
    participant Endpoint
    participant ServerRing as Server ring
    participant Server
    Server->>ServerRing: submit RECV on raw endpoint
    Client->>ClientRing: submit CALL on client facet
    ClientRing->>Endpoint: deliver params and caller result target
    Endpoint->>ServerRing: complete RECV with EndpointMessageHeader and params
    ServerRing-->>Server: cap_enter returns completion
    Server->>ServerRing: submit RETURN with call_id and result
    ServerRing->>Endpoint: take in-flight target
    Endpoint->>ClientRing: post caller CQE with result and receiver metadata
    ClientRing-->>Client: wait returns matching completion
```
If a CALL arrives before a RECV, the endpoint queues bounded params. If a RECV arrives before a CALL, the endpoint queues the receive request. Delivered calls move into the in-flight queue until the server returns or cleanup cancels them.
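The queueing rules in the paragraph above can be sketched as a small state machine. This is an illustrative model, not the code in kernel/src/cap/endpoint.rs: the `Endpoint` layout, the `u32` caller and receive tokens, and the three `MAX_*` bounds are all assumptions, and the per-call and total param-byte bounds are left out for brevity.

```rust
use std::collections::VecDeque;

// Hypothetical bounds; the real limits are not given in this document.
const MAX_PENDING_CALLS: usize = 4;
const MAX_PENDING_RECVS: usize = 4;
const MAX_IN_FLIGHT: usize = 4;

#[derive(Debug, PartialEq, Eq)]
pub enum QueueError { CallQueueFull, RecvQueueFull, InFlightFull, UnknownCallId }

/// A call handed to the server; it stays in flight until RETURN.
#[derive(Debug, PartialEq, Eq)]
pub struct Delivered { pub call_id: u64, pub params: Vec<u8>, pub recv_token: u32 }

pub struct Endpoint {
    pending_calls: VecDeque<(u32, Vec<u8>)>, // (caller result target, params)
    pending_recvs: VecDeque<u32>,            // server receive tokens
    in_flight: Vec<(u64, u32)>,              // (call_id, caller result target)
    next_call_id: u64,
}

impl Endpoint {
    pub fn new() -> Self {
        // Call IDs start at 1 so that every assigned ID is non-zero.
        Endpoint { pending_calls: VecDeque::new(), pending_recvs: VecDeque::new(),
                   in_flight: Vec::new(), next_call_id: 1 }
    }

    fn deliver(&mut self, caller: u32, params: Vec<u8>, recv_token: u32) -> Delivered {
        let call_id = self.next_call_id;
        self.next_call_id += 1;
        self.in_flight.push((call_id, caller));
        Delivered { call_id, params, recv_token }
    }

    /// CALL: deliver to a waiting RECV, otherwise queue the copied params.
    pub fn call(&mut self, caller: u32, params: &[u8]) -> Result<Option<Delivered>, QueueError> {
        let owned = params.to_vec(); // copy into endpoint-owned storage
        if !self.pending_recvs.is_empty() && self.in_flight.len() >= MAX_IN_FLIGHT {
            return Err(QueueError::InFlightFull);
        }
        if let Some(token) = self.pending_recvs.pop_front() {
            return Ok(Some(self.deliver(caller, owned, token)));
        }
        if self.pending_calls.len() >= MAX_PENDING_CALLS {
            return Err(QueueError::CallQueueFull);
        }
        self.pending_calls.push_back((caller, owned));
        Ok(None)
    }

    /// RECV: take a queued CALL, otherwise queue the receive request.
    pub fn recv(&mut self, recv_token: u32) -> Result<Option<Delivered>, QueueError> {
        if !self.pending_calls.is_empty() && self.in_flight.len() >= MAX_IN_FLIGHT {
            return Err(QueueError::InFlightFull);
        }
        if let Some((caller, params)) = self.pending_calls.pop_front() {
            return Ok(Some(self.deliver(caller, params, recv_token)));
        }
        if self.pending_recvs.len() >= MAX_PENDING_RECVS {
            return Err(QueueError::RecvQueueFull);
        }
        self.pending_recvs.push_back(recv_token);
        Ok(None)
    }

    /// RETURN: remove the in-flight entry and yield the caller's result target.
    pub fn ret(&mut self, call_id: u64) -> Result<u32, QueueError> {
        let idx = self.in_flight.iter().position(|(id, _)| *id == call_id)
            .ok_or(QueueError::UnknownCallId)?;
        Ok(self.in_flight.swap_remove(idx).1)
    }
}
```

In this model a second RETURN for the same `call_id` fails, because the in-flight entry is consumed by the first one.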
Design
Endpoint IPC is capability-oriented. The manifest can export a raw endpoint from one service; importers get a narrowed client facet. This keeps server-only authority out of clients without introducing rights bitmasks.
CALL and RETURN may carry sideband transfer descriptors. Copy transfers insert a new cap into the receiver while preserving the sender. Move transfers reserve the sender slot, insert the destination, then remove the source on commit. RETURN-side transfers append result-cap records after the normal result payload.

Cross-session delivery is additionally checked against the cap hold transfer scope: same-session caps fail closed, cross-session-shareable caps may cross, and service-regrant-only caps need a trusted fixed-session regrant path.

CALL SQEs may also request field-granular session disclosure. The kernel intersects that request with the invoked cap’s disclosure scope before delivering any subject fields, so a request without scope or scope without a request exposes only the default opaque caller-session metadata.
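The cross-session transfer check and the disclosure intersection are both small decision rules, sketched below. Every name here is hypothetical: `TransferScope`, `may_cross_session`, `SubjectField` (including its three example fields), and `disclosed_fields` are not taken from the kernel, and the document does not list the real subject fields.

```rust
use std::collections::BTreeSet;

/// Hypothetical transfer scopes recorded on a cap hold.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransferScope {
    SameSession,
    CrossSessionShareable,
    ServiceRegrantOnly,
}

/// Whether a cap with `scope` may be delivered into a different session.
pub fn may_cross_session(scope: TransferScope, via_trusted_regrant: bool) -> bool {
    match scope {
        // Same-session caps fail closed on any cross-session delivery.
        TransferScope::SameSession => false,
        TransferScope::CrossSessionShareable => true,
        // Only a trusted fixed-session regrant path may carry these.
        TransferScope::ServiceRegrantOnly => via_trusted_regrant,
    }
}

/// Hypothetical subject fields a caller could ask to disclose.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SubjectField {
    UserName,
    Locale,
    Groups,
}

/// Fields delivered to the server beyond the default opaque metadata.
/// An empty set means only the opaque caller-session metadata is exposed.
pub fn disclosed_fields(
    requested: Option<&BTreeSet<SubjectField>>,
    scope: Option<&BTreeSet<SubjectField>>,
) -> BTreeSet<SubjectField> {
    match (requested, scope) {
        // Both sides must opt in; the result is their intersection.
        (Some(r), Some(s)) => r.intersection(s).copied().collect(),
        // A request without scope, or scope without a request, discloses nothing.
        _ => BTreeSet::new(),
    }
}
```

Treating a missing request and a missing scope identically keeps the default path free of subject fields regardless of which side omitted its half.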
Legacy receiver metadata is stored on cap-table hold edges and delivered to
servers with endpoint invocation metadata, so one endpoint can distinguish
transitional callers without one object per caller. Some ABI structs still name
this field badge; that name is compatibility state, not the normal
shared-service authority model. Session-bound invocation context is the
replacement model for normal workload paths: every normal process has one
immutable session context, endpoint calls expose privacy-preserving
caller-session metadata by default, and shared services derive user-facing
state from broker-granted capabilities plus service-scoped session references.
See Session Context.
Delegated Client Relabeling Containment
The Gate 0 containment rule is narrow: a process that holds an imported
ClientEndpoint may delegate that same client identity, but it may not mint a
sibling identity by setting another legacy badge during spawn. Endpoint owners
and explicit trusted mint paths remain transitional mechanisms for low-level
tests. Normal shared services use broker-granted roots/facets plus
session-bound invocation context instead of service-object badges.
Normal capos-shell help and smoke expectations must therefore omit arbitrary
badge N launch examples. Omitted shell badge syntax preserves the source
identity instead of selecting badge zero. Legacy badge syntax may remain
reachable only as a debug or hostile-test input, and QEMU coverage for the
Telnet blocker must prove both explicit client @name badge N and low-level
legacy badge-zero relabel encodings from a nonzero delegated client facet fail
closed.
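The containment rule can be read as a single decision over who is delegating and whether a badge was explicitly requested. The sketch below is illustrative only: `Holder`, `RelabelDenied`, and `delegated_badge` are hypothetical names, and rejecting every explicit badge from an imported client (even one equal to the source badge) is an assumption of this sketch; the text above only requires that relabel attempts fail closed.

```rust
/// Who is performing the delegation (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Holder {
    EndpointOwner,
    TrustedMint,
    ImportedClient,
}

#[derive(Debug, PartialEq, Eq)]
pub struct RelabelDenied;

/// Compute the legacy badge carried by a delegated client facet.
///
/// `requested` is `None` when the spawn request omits badge syntax.
pub fn delegated_badge(
    holder: Holder,
    source_badge: u64,
    requested: Option<u64>,
) -> Result<u64, RelabelDenied> {
    match (holder, requested) {
        // Omitted syntax preserves the source identity; it does not mean badge zero.
        (_, None) => Ok(source_badge),
        // Owners and explicit trusted mint paths may create sibling identities.
        (Holder::EndpointOwner | Holder::TrustedMint, Some(badge)) => Ok(badge),
        // Assumption: any explicit badge from an imported client is rejected,
        // including an explicit zero.
        (Holder::ImportedClient, Some(_)) => Err(RelabelDenied),
    }
}
```

Modeling the omitted case as `None` rather than as `0` is what keeps an explicit badge-zero encoding distinguishable from "no relabel requested".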
Shell-serviced stdio bridges now bind the active child wait to the first opaque
live caller-session reference seen on the bridge endpoint. A later call from a
different live caller session is answered with an empty result and the child is
terminated; transferred caps are released before either normal transfer
rejection or caller-session rejection returns. Normal StdIO.close is treated
as a clean child close rather than a security rejection.
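The first-caller binding on a stdio bridge can be sketched as follows. `SessionRef`, `StdioBridge`, and `BridgeOutcome` are hypothetical names for illustration; releasing transferred caps and the clean StdIO.close path are not modeled.

```rust
/// Opaque caller-session reference (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SessionRef(pub u64);

#[derive(Debug, PartialEq, Eq)]
pub enum BridgeOutcome {
    /// The call comes from the bound session and is serviced.
    Serve,
    /// A different session called: answer with an empty result and stop the child.
    RejectAndTerminateChild,
}

pub struct StdioBridge {
    bound: Option<SessionRef>,
    child_alive: bool,
}

impl StdioBridge {
    pub fn new() -> Self {
        StdioBridge { bound: None, child_alive: true }
    }

    /// Handle one call arriving on the bridge endpoint.
    pub fn on_call(&mut self, caller: SessionRef) -> BridgeOutcome {
        match self.bound {
            // The first live caller session seen becomes the binding.
            None => {
                self.bound = Some(caller);
                BridgeOutcome::Serve
            }
            Some(bound) if bound == caller => BridgeOutcome::Serve,
            // Any later call from another session ends the child wait.
            Some(_) => {
                self.child_alive = false;
                BridgeOutcome::RejectAndTerminateChild
            }
        }
    }

    pub fn child_alive(&self) -> bool {
        self.child_alive
    }
}
```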
Future IPC should add notification objects for lightweight signaling and promise pipelining for Cap’n Proto-style dependent calls.
Invariants
- Only raw endpoint holders may RECV or RETURN.
- Imported endpoint caps are ClientEndpoint facets and must reject RECV and RETURN from userspace.
- Delegating an imported client facet must preserve its server-visible object identity. Only endpoint owners or explicit trusted mint paths may create sibling client identities, and normal services should not treat that identity as user/session authority.
- Endpoint queues are bounded by call count, receive count, in-flight count, per-call params, and total queued params.
- Each in-flight call has a kernel-assigned non-zero call_id.
- CALL delivery copies params into kernel-owned queued storage before the caller can resume.
- Move transfer commit must not leave both source and destination live.
- Transfer rollback must preserve source authority if destination insertion or result delivery fails.
- Process exit must cancel queued state involving that pid and wake affected peers when possible.
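The two move-transfer invariants (no double-live cap after commit, source preserved on rollback) can be illustrated with a reserve, insert, commit sequence. This is a simplified model under stated assumptions, not the primitives in capos-lib/src/cap_table.rs: `Slot`, `CapTable`, `move_cap`, and the `deliver_ok` flag standing in for result delivery are all hypothetical.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Slot {
    Empty,
    Live(u32),
    /// Source slot held during a move; not usable until commit or rollback.
    Reserved(u32),
}

#[derive(Debug, PartialEq, Eq)]
pub enum TransferError { SourceNotLive, DestinationFull, DeliveryFailed }

pub struct CapTable { slots: Vec<Slot> }

impl CapTable {
    pub fn with_slots(n: usize) -> Self {
        CapTable { slots: vec![Slot::Empty; n] }
    }

    /// Insert into the first empty slot, if any.
    pub fn insert(&mut self, cap: u32) -> Option<usize> {
        let idx = self.slots.iter().position(|s| *s == Slot::Empty)?;
        self.slots[idx] = Slot::Live(cap);
        Some(idx)
    }

    pub fn get(&self, idx: usize) -> &Slot {
        &self.slots[idx]
    }
}

/// Move a cap from `src[src_slot]` into `dst`, rolling back on any failure.
pub fn move_cap(
    src: &mut CapTable,
    src_slot: usize,
    dst: &mut CapTable,
    deliver_ok: bool, // stand-in for "result delivery succeeded"
) -> Result<usize, TransferError> {
    // 1. Reserve the source slot.
    let cap = match src.slots.get(src_slot) {
        Some(Slot::Live(c)) => *c,
        _ => return Err(TransferError::SourceNotLive),
    };
    src.slots[src_slot] = Slot::Reserved(cap);

    // 2. Insert into the destination; restore the source if that fails.
    let dst_slot = match dst.insert(cap) {
        Some(i) => i,
        None => {
            src.slots[src_slot] = Slot::Live(cap);
            return Err(TransferError::DestinationFull);
        }
    };

    // 3. If delivery fails, undo the insertion and restore the source.
    if !deliver_ok {
        dst.slots[dst_slot] = Slot::Empty;
        src.slots[src_slot] = Slot::Live(cap);
        return Err(TransferError::DeliveryFailed);
    }

    // 4. Commit: the source is removed, so only the destination stays live.
    src.slots[src_slot] = Slot::Empty;
    Ok(dst_slot)
}
```

On every return path exactly one of the two tables holds the cap as `Live`, which is the property both invariants describe.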
Code Map
- kernel/src/cap/endpoint.rs - endpoint queues, client facet, call IDs, cancellation by pid.
- kernel/src/cap/ring.rs - endpoint CALL/RECV/RETURN dispatch, result copying, deferred cancellation CQEs.
- kernel/src/cap/transfer.rs - transfer descriptor loading and transaction preparation.
- capos-lib/src/cap_table.rs - cap-table transfer primitives and rollback.
- kernel/src/cap/mod.rs - manifest export resolution and client-facet construction.
- capos-config/src/ring.rs - EndpointMessageHeader, transfer descriptors, transfer result records, endpoint opcodes.
- demos/capos-demo-support/src/lib.rs - endpoint, IPC, transfer, and hostile IPC smoke routines.
- demos/endpoint-roundtrip, demos/ipc-server, demos/ipc-client - QEMU smoke binaries.
- demos/ipc-zerocopy-producer, demos/ipc-zerocopy-consumer - QEMU smoke for the multi-message shared-buffer zero-copy IPC pattern.
Validation
- make run-smoke validates same-process endpoint RECV/RETURN, cross-process IPC, endpoint exit cleanup, legacy badged calls, transfer success/failure paths, and clean halt.
- make run-spawn validates init-spawned endpoint-roundtrip, server, and client processes.
- make run-memoryobject-shared validates a one-shot shared-buffer handoff over an endpoint cap transfer.
- make run-ipc-zerocopy validates the multi-message zero-copy IPC pattern at the substrate level: the producer transfers one MemoryObject to the consumer and then exchanges four record payloads through the shared mapping while endpoint CALLs carry only sequence numbers and checksums. The demo drives raw SQE/CQE construction through capos-demo-support rather than a typed runtime client and uses an ad-hoc seq+checksum framing because the typed SharedBuffer ABI, ring-shaped producer/consumer metadata, and notification primitives are still pending; production services (File.readBuf, BlockDevice.readBlocks, NIC RX/TX rings) will reuse the same MemoryObject substrate through that future surface, not the demo’s framing.
- cargo test-lib covers cap-table transfer preflight, provisional insertion, commit, rollback, stale generation, and slot exhaustion cases.
- cargo test-ring-loom covers ring queue behavior that endpoint IPC depends on for completion delivery.
Open Work
- Add notification objects for signal-style events.
- Add Cap’n Proto promise pipelining after endpoint routing can resolve dependent answers.
- Add a typed SharedBuffer capability surface (ring-shaped producer/consumer metadata, completion signaling, lifetime/quota rules) on top of the raw MemoryObject substrate exercised by make run-ipc-zerocopy.
- Add epoch-based revocation if broad authority invalidation becomes necessary.