Userspace Runtime
The userspace runtime owns the repeated mechanics that every service needs: bootstrap validation, heap initialization, typed capability lookup, ring submission, completion matching, application exception decoding, and handle lifetime.
Related
- Go VirtualMemory Contract defines the caller-buffer reserve, commit, and decommit methods allocator paths need.
- Programming Languages summarizes current native Rust support and planned language-runtime tracks.
- Memory Management documents the implemented kernel VirtualMemory and MemoryObject behavior.
- Go Runtime is the owning language runtime proposal; LLVM Target records the Go runtime OS hooks that drive this work.
Current Behavior
Runtime-owned _start receives (ring_addr, pid, capset_addr), initializes a
fixed heap, validates the ring address, reads the read-only CapSet page, installs
an emergency Console panic path when available, calls capos_rt_main(runtime),
and exits with the returned code.
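The validation step in that entry sequence can be sketched as below. This is a minimal illustration, not the code in capos-rt/src/entry.rs: the constant values, the CapSetHeader layout, and the BootstrapError type are all assumptions made so the example compiles on a host.

```rust
// All constants below are hypothetical placeholders, not the real capOS ABI values.
const RING_VADDR: u64 = 0x0000_7000_0000_0000;
const CAPSET_MAGIC: u32 = 0x4350_5345;
const CAPSET_VERSION: u16 = 1;

/// Leading fields of the read-only CapSet page (layout assumed for illustration).
struct CapSetHeader {
    magic: u32,
    version: u16,
}

#[derive(Debug, PartialEq)]
enum BootstrapError {
    BadRingAddress(u64),
    BadCapSetMagic(u32),
    BadCapSetVersion(u16),
}

/// Reject a bad process environment before any capability lookup happens.
fn validate_bootstrap(ring_addr: u64, header: &CapSetHeader) -> Result<(), BootstrapError> {
    if ring_addr != RING_VADDR {
        return Err(BootstrapError::BadRingAddress(ring_addr));
    }
    if header.magic != CAPSET_MAGIC {
        return Err(BootstrapError::BadCapSetMagic(header.magic));
    }
    if header.version != CAPSET_VERSION {
        return Err(BootstrapError::BadCapSetVersion(header.version));
    }
    Ok(())
}
```

Checking the ring address first means a misconfigured process fails before the runtime touches either shared page.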
The Runtime lends out at most one RuntimeRingClient at a time. The client
wraps the raw ring page, keeps request buffers alive until completions are
matched, handles out-of-order completions, packs copy-transfer descriptors, and
parses result-cap records. Owned runtime handles queue CAP_OP_RELEASE when the
last local reference is dropped; the release queue flushes when a ring client is
borrowed or dropped, or when code calls Runtime::flush_releases() explicitly.
Promise placeholders are currently bookkeeping only; their future SQE
coordinates map AnswerId.raw() to pipeline_dep and a result-cap record index
to pipeline_field.
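Out-of-order completion matching can be illustrated with a small host-side sketch. The Cqe, PendingCall, and PendingCalls types here are hypothetical, and the HashMap is a convenience for the example; the real ring client in capos-rt/src/ring.rs may track pending calls differently.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a completion queue entry.
struct Cqe {
    user_data: u64,
    result: i32,
}

/// Buffers the client owns until the matching CQE arrives.
struct PendingCall {
    params: Vec<u8>,
    result_buf: Vec<u8>,
}

#[derive(Default)]
struct PendingCalls {
    next_user_data: u64,
    in_flight: HashMap<u64, PendingCall>,
}

impl PendingCalls {
    /// Record a call and return the user_data tag to place in its SQE.
    fn submit(&mut self, params: Vec<u8>, result_capacity: usize) -> u64 {
        let tag = self.next_user_data;
        self.next_user_data += 1;
        let call = PendingCall { params, result_buf: vec![0; result_capacity] };
        self.in_flight.insert(tag, call);
        tag
    }

    /// Match a CQE to its call in any arrival order; unknown tags yield None.
    fn complete(&mut self, cqe: &Cqe) -> Option<(PendingCall, i32)> {
        self.in_flight.remove(&cqe.user_data).map(|call| (call, cqe.result))
    }
}
```

Because the pending table owns both buffers, they cannot be freed before the kernel has finished with them, whatever order completions arrive in.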
Design
The runtime separates non-owning bootstrap references from owned local handles.
CapSet entries produce typed Capability<T> values only when the interface ID
matches the requested type, and the same manifest-order CapSet entries remain
available for diagnostic and shell surfaces that need to list or inspect what a
process was actually granted. Result-cap adoption performs the same interface
check before producing OwnedCapability<T>.
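A rough sketch of the interface check follows. The Interface trait, the interface ID values, the CapSetEntry fields, and lookup by name are assumptions for illustration; only the rule that a typed value is produced solely on an interface ID match comes from the design above.

```rust
use std::marker::PhantomData;

/// Marker trait tying a Rust type to an interface ID (IDs here are made up).
trait Interface {
    const INTERFACE_ID: u64;
}

struct Console;
impl Interface for Console {
    const INTERFACE_ID: u64 = 0xC0;
}

struct Timer;
impl Interface for Timer {
    const INTERFACE_ID: u64 = 0x71;
}

/// One manifest-order CapSet entry (fields assumed for illustration).
struct CapSetEntry {
    name: &'static str,
    cap_id: u32,
    interface_id: u64,
}

/// Non-owning typed reference to a bootstrap capability.
struct Capability<T: Interface> {
    cap_id: u32,
    _interface: PhantomData<T>,
}

#[derive(Debug, PartialEq)]
enum LookupError {
    NotFound,
    InterfaceMismatch { expected: u64, found: u64 },
}

/// Produce a typed capability only when the entry's interface ID matches T.
fn get<T: Interface>(entries: &[CapSetEntry], name: &str) -> Result<Capability<T>, LookupError> {
    let entry = entries.iter().find(|e| e.name == name).ok_or(LookupError::NotFound)?;
    if entry.interface_id != T::INTERFACE_ID {
        return Err(LookupError::InterfaceMismatch {
            expected: T::INTERFACE_ID,
            found: entry.interface_id,
        });
    }
    Ok(Capability { cap_id: entry.cap_id, _interface: PhantomData })
}
```

The untyped entry slice stays usable alongside the typed lookup, which is what lets diagnostic and shell code list grants in manifest order.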
Typed clients are thin wrappers over the ring client. They encode Cap’n Proto
params, submit CALL SQEs, wait for a matching CQE, decode transport errors, and
decode kernel-produced CapException payloads into client errors. Endpoint
servers can use submit_endpoint_return_exception() to return a serialized
CapException to the original caller over the same endpoint RETURN path.
The handwritten TimerClient exposes monotonic now reads and sleep calls
over the same completion-matching path.
The handwritten VirtualMemoryClient exposes map, reserve, commit, decommit,
unmap, and protect calls for runtime heap/arena allocation over anonymous user
pages. It has both the ordinary allocation-backed async methods and synchronous
caller-buffer methods for allocator growth paths that cannot allocate while
asking the kernel for more memory. This matches the reserve/commit/decommit
surface specified in
Go VirtualMemory Contract.
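To show why an allocation-free path matters, here is a sketch of a bump arena that commits pages on demand inside an already reserved range. The VirtualMemory trait below is a hypothetical stand-in with an invented signature; the actual caller-buffer method signatures are those defined in the Go VirtualMemory Contract. Alignment and decommit are left out to keep the example short.

```rust
const PAGE_SIZE: usize = 4096;

/// Hypothetical stand-in for a synchronous caller-buffer commit call.
trait VirtualMemory {
    fn commit(&mut self, addr: usize, len: usize) -> Result<(), ()>;
}

/// Bump arena over a reserved range. All state is inline, so growing the
/// arena never needs a heap allocation. `reserved` is assumed page-aligned.
struct Arena {
    base: usize,
    reserved: usize,
    committed: usize,
    used: usize,
}

impl Arena {
    fn new(base: usize, reserved: usize) -> Self {
        Arena { base, reserved, committed: 0, used: 0 }
    }

    /// Bump-allocate `size` bytes, committing whole pages only when needed.
    fn alloc(&mut self, vm: &mut impl VirtualMemory, size: usize) -> Option<usize> {
        let end = self.used.checked_add(size)?;
        if end > self.reserved {
            return None;
        }
        if end > self.committed {
            let grow_to = ((end + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE).min(self.reserved);
            vm.commit(self.base + self.committed, grow_to - self.committed).ok()?;
            self.committed = grow_to;
        }
        let addr = self.base + self.used;
        self.used = end;
        Some(addr)
    }
}

/// Test double that records commit calls in a fixed array (no allocation).
struct RecordingVm {
    calls: [(usize, usize); 4],
    count: usize,
}

impl VirtualMemory for RecordingVm {
    fn commit(&mut self, addr: usize, len: usize) -> Result<(), ()> {
        if self.count == self.calls.len() {
            return Err(());
        }
        self.calls[self.count] = (addr, len);
        self.count += 1;
        Ok(())
    }
}
```

If the commit call itself had to allocate, a heap that is out of committed pages could not grow, which is the circular dependency the caller-buffer methods avoid.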
The handwritten ThreadControlClient exposes current-process FS-base reads and
updates for runtimes that need to swap a language-managed TLS base after process
startup.
The 7.1.0 threading contract keeps one process ring and the runtime’s
single-owner ring-client invariant for the first in-process threading
implementation. Future multi-threaded runtimes must serialize blocking ring
entry through capos-rt until a runtime reactor or Ring v2 lands. The reactor
bridge uses one runtime-owned CQ drainer plus ParkSpace-backed wait records;
the full-SMP kernel target is per-thread rings, where cap_enter waits on the
current thread’s CQ. After 7.2, the existing ThreadControlClient methods apply
to the current thread’s FS base rather than to a process-wide saved FS base.
ThreadControl.exitThread and the raw exit(code) syscall both terminate the
current thread; the process exits when its last live thread exits.
The 7.2.3 park slice adds a process-local ParkSpace marker type and compact
CAP_OP_PARK / CAP_OP_UNPARK operations. capos-rt should expose
those operations as runtime synchronization primitives in a later slice; the
current thread-lifecycle proof uses raw SQEs so the runtime does not
prematurely claim the park user_data namespace. Blocking park wait is not an
ordinary RuntimeRingClient call: the wait SQE must be thread-owned for the
current thread, and the runtime must reserve park user_data values,
write the wait SQE under its ring-submission lock, release that lock before
cap_enter, and demultiplex park CQEs into runtime-owned wait slots so a
sibling thread can still submit the wake. The temporary single-thread park
fallback remains only as the pre-thread runtime checkpoint proof.
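The lock ordering and the user_data demultiplexing described for park waits might look like the sketch below. It is a host-side model only: the PARK_TAG bit, the Route type, and the use of a Mutex around a Vec as the submission queue are assumptions, and the closure passed as enter stands in for cap_enter.

```rust
use std::sync::Mutex;

/// Hypothetical tag bit reserving part of the user_data space for park waits.
const PARK_TAG: u64 = 1 << 63;

fn park_user_data(slot: usize) -> u64 {
    PARK_TAG | slot as u64
}

#[derive(Debug, PartialEq)]
enum Route {
    ParkSlot(usize),
    Call(u64),
}

/// Demultiplex a CQE by its user_data: park wake-ups go to wait slots,
/// everything else goes to ordinary completion matching.
fn route(user_data: u64) -> Route {
    if (user_data & PARK_TAG) != 0 {
        Route::ParkSlot((user_data & !PARK_TAG) as usize)
    } else {
        Route::Call(user_data)
    }
}

/// Write the wait SQE under the submission lock, then release the lock
/// before blocking. `enter` stands in for cap_enter.
fn park_wait(submission: &Mutex<Vec<u64>>, slot: usize, enter: impl FnOnce()) {
    {
        let mut sq = submission.lock().unwrap();
        sq.push(park_user_data(slot));
    } // guard dropped here, so a sibling thread can submit the wake
    enter();
}
```

Holding the submission lock across the blocking call would deadlock the waker, since the sibling thread needs the same lock to write its unpark SQE.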
Future generated clients should preserve this split: transport lifetime and completion matching belong in the runtime, while interface-specific encoding belongs in generated or handwritten client wrappers.
Invariants
- ring_addr must equal RING_VADDR; runtime bootstrap rejects any other address.
- The CapSet header magic/version must validate before lookup.
- CapSet handles are non-owning unless explicitly adopted.
- Only one runtime ring client may be live at a time for a process.
- Until Ring v2, multithreaded generic client waits must flow through a runtime reactor/demux path rather than letting multiple threads consume the process CQ directly.
- Park wait must not hold the live runtime ring client while the kernel parks the current thread.
- Request params and result buffers must outlive their matching CQE.
- A result cap can be consumed only once and only with the expected interface ID.
- Promise placeholders must map to sideband result-cap record indexes, not schema field paths.
- Dropping the final owned handle queues exactly one local CAP_OP_RELEASE; Runtime::flush_releases() forces queued releases and reports rejected kernel release results.
- Release flushing treats stale or already-removed caps as non-fatal cleanup.
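The release invariant, one queued release per final drop, can be modelled with reference counting. This is a sketch under assumptions: CapId, ReleaseQueue, and OwnedHandle are illustrative names, and Rc with RefCell is used for host convenience rather than mirroring the reference counting in capos-rt/src/lib.rs.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type CapId = u32;

/// Cap IDs waiting to be turned into CAP_OP_RELEASE submissions.
#[derive(Default)]
struct ReleaseQueue {
    pending: RefCell<Vec<CapId>>,
}

impl ReleaseQueue {
    /// Drain the queue; a real flush would submit one release SQE per ID.
    fn flush(&self) -> Vec<CapId> {
        std::mem::take(&mut *self.pending.borrow_mut())
    }
}

/// Shared state behind every clone of one owned handle.
struct HandleInner {
    cap_id: CapId,
    queue: Rc<ReleaseQueue>,
}

impl Drop for HandleInner {
    /// Runs once, when the last clone goes away, so exactly one release is queued.
    fn drop(&mut self) {
        self.queue.pending.borrow_mut().push(self.cap_id);
    }
}

#[derive(Clone)]
struct OwnedHandle(Rc<HandleInner>);

impl OwnedHandle {
    fn new(cap_id: CapId, queue: &Rc<ReleaseQueue>) -> Self {
        OwnedHandle(Rc::new(HandleInner { cap_id, queue: Rc::clone(queue) }))
    }
}
```

Queueing in Drop instead of submitting directly keeps drop paths free of ring access, which fits the rule that only one ring client may be live at a time.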
Code Map
- capos-rt/src/entry.rs - _start, Runtime, bootstrap validation, single-owner ring token, release queue flushing.
- capos-rt/src/alloc.rs - fixed userspace heap initialization.
- capos-rt/src/capset.rs - typed CapSet lookup and manifest-order iteration wrappers.
- capos-rt/src/ring.rs - ring client, pending calls, completion matching, copy-transfer packing, result-cap parsing.
- capos-rt/src/client.rs - Console, TerminalSession, BootPackage, ProcessSpawner, ProcessHandle, VirtualMemory, Timer, ThreadControl, ThreadSpawner, and ThreadHandle clients, and exception decoding.
- capos-rt/src/lib.rs - typed capability marker types and owned handle reference counting.
- capos-rt/src/panic.rs - emergency Console output path.
- capos-rt/src/syscall.rs - raw syscall instructions and public syscall wrappers, including the hostile smoke probe for the removed ambient write syscall.
- targets/x86_64-unknown-capos.json - userspace target specification.
- tools/check-userspace-runtime-surface.sh - source check that keeps runtime primitives owned by capos-rt.
- init/src/main.rs, capos-rt/src/bin/smoke.rs, and shell/src/main.rs - current runtime users.
Validation
- make capos-rt-check builds the runtime smoke binary against targets/x86_64-unknown-capos.json, matching the booted userspace target.
- make init-capos-build, make demos-capos-build, make shell-capos-build, and make capos-rt-capos-build expose focused custom-target build wrappers for the current userspace crates and runtime smoke binary.
- tools/check-userspace-runtime-surface.sh verifies init, demos, and shell do not define _start, panic handlers, global allocators, raw syscall instructions, or entry-point macros outside capos-rt.
- make run-smoke validates runtime entry, typed Console calls, exception decoding, owned handle release, result-cap parsing through IPC, and clean process exit.
- make run-spawn validates ProcessSpawnerClient, ProcessHandleClient, VirtualMemoryClient, TimerClient, ThreadControlClient, ThreadSpawnerClient, ThreadHandleClient, result-cap adoption, and release behavior under init spawning. The single-thread-runtime child proves the first runtime-shaped checkpoint over caller-buffer VirtualMemory calls and Timer; the thread-lifecycle child proves in-process create, self-join rejection, join, detach, last-thread exitThread, and private ParkSpace wait/wake correctness.
- make run-shell validates CapSet iteration, capability inspection, typed application-error decoding, guest session metadata, exact-grant spawning, ProcessHandle waits, and stale-handle release behavior in the focused shell-launch proof manifest.
- make run-terminal validates TerminalSessionClient writes, bounded line reads, hidden-echo input handling, and structured cancellation in the focused terminal proof manifest.
- cd capos-rt && cargo test --lib --target x86_64-unknown-linux-gnu covers host-testable runtime invariants when run explicitly.
Open Work
- Add generated client bindings after the schema surface stabilizes.
- Implement promise/answer transport semantics beyond current placeholders.
- Add typed ParkSpace clients with runtime-owned user_data demultiplexing.
- Define release behavior for queued handles when a process exits before the release queue flushes.