What capOS Is

capOS is a research kernel that boots on x86_64 QEMU. The rest of this page explains why it looks the way it does, that is, the specific design bets behind the code, rather than giving a feature inventory. For the feature-by-feature matrix, see Current Status.

What Makes capOS Different

capOS is a research vehicle for a few specific design bets. Each is unusual on its own; the combination is the point.

  • Everything is a typed capability. System resources are accessed through Cap’n Proto interfaces defined in schema/capos.capnp. There is no ambient authority: no global path namespace, no open-by-name, no implicit inheritance. A process can only invoke objects present in its local capability table. See Capability Model and the schema/repo map.
  • The interface IS the permission. Instead of a parallel READ/WRITE/EXEC rights bitmask (Zircon, seL4), attenuation is a narrower capability: a wrapper CapObject exposing fewer methods, or an Endpoint client facet that cannot RECV/RETURN. The kernel just dispatches; policy lives in interfaces. See Capability Model, IPC and Endpoints, and the prior-art notes on Zircon and seL4.
  • Identity metadata is not authority. In prose, a user is the human-facing actor, a principal is identity metadata, an account is planned durable local record state, and policy/resource profiles select bundles and quotas. Sessions receive capabilities; none of those labels become kernel subjects or bypass cap-table authority. See the local users backlog, User Identity and Policy, and Resource Accounting and Quotas.
  • io_uring-style shared-memory ring for every call. Every process owns a submission/completion queue page. Userspace writes SQEs with a normal memory store; the kernel processes them through cap_enter. New operations are SQE opcodes (CALL, RECV, RETURN, RELEASE, NOP), not new syscalls. The remaining syscall surface is cap_enter and exit; the accepted threading contract keeps current-thread exit as a ThreadControl capability operation. See Capability Ring, Userspace Runtime, and In-Process Threading.
  • Release is transport, not an application method. Dropping the last owned handle in capos-rt queues one local CAP_OP_RELEASE; acquiring or dropping a runtime ring client flushes the queue, and long-running code can call Runtime::flush_releases() explicitly. No close() method on every interface, no mutable table self-reference during dispatch. See Userspace Runtime and Capability Ring.
  • Capability transfer is first-class. Copy and move descriptors ride sideband on CALL/RETURN SQEs. Move reserves the sender slot until the receiver accepts and preflight checks pass, then commits or rolls back atomically — no lost, duplicated, or half-inserted authority. See Authority Accounting and IPC and Endpoints.
  • Cap’n Proto wire format end-to-end. The same encoding describes the boot manifest, runtime method calls, and future persistence/remote transparency. The debug tap records fixed, bounded SQE/CQE metadata today; authorized payload capture, replay, audit, and migration remain future transport work. See Manifest and Service Startup, Error Handling, and Storage and Naming.
  • Host-testable pure logic. Cap-table, frame-bitmap, ELF parser, frame ledger, lazy buffers, small ABI constants, and the ring model live in capos-lib, capos-abi, and capos-config, and run under cargo test-lib, Miri, Loom, Kani, and proptest without any kernel scaffolding. Kernel glue stays thin. See Verification Workflow and Repository Map.
  • Schema-first boot. system.cue is compiled to a Cap’n Proto SystemManifest embedded as the single Limine boot module. The kernel validates only the kernel-owned boot boundary and launches initConfig.init; mkmanifest and init validate the service graph under initConfig.services as structured data, not shell scripts or baked environment variables. See Boot Flow, Manifest and Service Startup, and Build, Boot, and Test.
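
The "interface IS the permission" bet above can be illustrated with a small sketch. This is not capOS code: the `CapObject` trait shape, the `Counter` object, the `ReadOnly` wrapper, the method ids, and `dispatch` are all hypothetical names chosen for illustration. The only idea taken from the text is that attenuation is a wrapper exposing fewer methods, and that the dispatcher consults no rights bitmask.

```rust
// Hypothetical sketch: names and signatures are illustrative, not the capOS API.

/// Errors a call can produce in this toy model.
#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    /// The object does not expose the requested method.
    NoSuchMethod,
}

/// A capability object: the only thing a holder can do is call a method by id.
pub trait CapObject {
    fn call(&mut self, method: u16, arg: u64) -> Result<u64, CallError>;
}

pub const METHOD_READ: u16 = 0;
pub const METHOD_ADD: u16 = 1;

/// Full-authority object: supports both READ and ADD.
pub struct Counter {
    value: u64,
}

impl Counter {
    pub fn new(value: u64) -> Self {
        Counter { value }
    }
}

impl CapObject for Counter {
    fn call(&mut self, method: u16, arg: u64) -> Result<u64, CallError> {
        match method {
            METHOD_READ => Ok(self.value),
            METHOD_ADD => {
                self.value = self.value.wrapping_add(arg);
                Ok(self.value)
            }
            _ => Err(CallError::NoSuchMethod),
        }
    }
}

/// Attenuated view: forwards READ and rejects every other method.
pub struct ReadOnly<T: CapObject> {
    inner: T,
}

impl<T: CapObject> ReadOnly<T> {
    pub fn new(inner: T) -> Self {
        ReadOnly { inner }
    }
}

impl<T: CapObject> CapObject for ReadOnly<T> {
    fn call(&mut self, method: u16, arg: u64) -> Result<u64, CallError> {
        match method {
            METHOD_READ => self.inner.call(method, arg),
            _ => Err(CallError::NoSuchMethod),
        }
    }
}

/// Stand-in for the dispatcher: it checks no rights bits, it only forwards.
pub fn dispatch(obj: &mut dyn CapObject, method: u16, arg: u64) -> Result<u64, CallError> {
    obj.call(method, arg)
}
```

Handing a holder `ReadOnly::new(counter)` instead of the counter itself is the whole policy decision in this sketch. `dispatch` treats both objects identically, and a call to `METHOD_ADD` through the wrapper fails with `NoSuchMethod` without mutating the counter.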

Execution Model

Each process owns an address space, a local capability table, a mapped capability-ring page, and a read-only CapSet page that enumerates its bootstrap handles. The kernel enters Ring 3 with iretq and returns through cap_enter or the timer. Ordinary capability calls progress only via cap_enter; timer-side polling handles non-CALL ring work and call targets that are explicitly safe for interrupt dispatch. Details in Process Model, Capability Ring, In-Process Threading, and Scheduling.
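
As a rough mental model of the ring described above, the following sketch shows a submission queue drained by two entry points. Everything here is an assumption made for illustration: the `Sqe` and `Cqe` field names, the `Ring` type, the use of `VecDeque` in place of a shared-memory page, and the choice to have timer-side polling stop at the first CALL so ordering is preserved. It also ignores the exception for call targets that are explicitly safe for interrupt dispatch. Only the opcode names and the rule that ordinary CALLs progress through `cap_enter` come from the text.

```rust
// Hypothetical sketch: a toy ring model, not the capOS ring layout or ABI.
use std::collections::VecDeque;

/// Opcode names mirror the ones listed in the text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Opcode {
    Call,
    Recv,
    Return,
    Release,
    Nop,
}

/// Submission entry; field names are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Sqe {
    pub opcode: Opcode,
    pub cap_slot: u32,
    pub user_data: u64,
}

/// Completion entry; `user_data` is echoed back so callers can match replies.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Cqe {
    pub user_data: u64,
    pub result: i32,
}

pub struct Ring {
    sq: VecDeque<Sqe>,
    cq: VecDeque<Cqe>, // unbounded in this toy; a real ring page is fixed-size
    capacity: usize,
}

impl Ring {
    pub fn new(capacity: usize) -> Self {
        Ring { sq: VecDeque::new(), cq: VecDeque::new(), capacity }
    }

    /// "Userspace" side: queue an entry, or hand it back if the ring is full.
    pub fn submit(&mut self, sqe: Sqe) -> Result<(), Sqe> {
        if self.sq.len() >= self.capacity {
            return Err(sqe);
        }
        self.sq.push_back(sqe);
        Ok(())
    }

    /// Drain from the head; stop at the first CALL when calls are not allowed.
    fn drain(&mut self, allow_calls: bool) -> usize {
        let mut done = 0;
        while let Some(head) = self.sq.front().copied() {
            if head.opcode == Opcode::Call && !allow_calls {
                break;
            }
            self.sq.pop_front();
            // Every processed entry completes with result 0 in this toy model.
            self.cq.push_back(Cqe { user_data: head.user_data, result: 0 });
            done += 1;
        }
        done
    }

    /// Syscall path: processes everything, including CALL.
    pub fn cap_enter(&mut self) -> usize {
        self.drain(true)
    }

    /// Timer path: processes non-CALL work only.
    pub fn timer_poll(&mut self) -> usize {
        self.drain(false)
    }

    pub fn pop_cqe(&mut self) -> Option<Cqe> {
        self.cq.pop_front()
    }
}
```

With a NOP, a CALL, and a RELEASE queued in that order, `timer_poll` in this model completes only the NOP, and a later `cap_enter` completes the remaining two. Adding a new operation would mean adding an `Opcode` variant rather than a new entry point, which is the property the ring design is after.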

Boot Flow

The kernel receives exactly one Limine module, a Cap’n Proto SystemManifest compiled from system.cue. It validates the kernel-owned boot boundary, loads only initConfig.init.binary, builds that process’s bootstrap capability table and CapSet page from initConfig.init.caps, and starts the scheduler. The default manifest now boots the standalone init ELF, and init validates the service graph before spawning the foreground capos-shell, the remote-session CapSet gateway, and the resident demo services. The shell mints an anonymous UserSession when it starts, and the user runs login or setup as ordinary shell commands to upgrade to an operator session. Focused shell-led manifests such as system-smoke.cue and system-shell.cue still boot capos-shell directly as initConfig.init until the run-target/init-policy cleanup migrates them. Full walkthrough in Boot Flow and Manifest and Service Startup.

Authority Boundaries

Authority is carried by cap-table hold edges with generation-tagged CapIds. The boundaries reviewers care about are Ring 0 ↔ Ring 3, capability table ↔ kernel object, endpoint IPC, copy/move transfer, manifest/boot-package, and process spawn; each one fails closed on hostile input. See Trust Boundaries for the boundary table and Authority Accounting for the transfer and quota invariants.
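
To make "generation-tagged CapIds" concrete, here is a minimal sketch of a slot table that rejects stale or forged handles. It is an illustration under assumptions, not the capos-lib cap table: the `CapId` layout (index plus generation), the `CapError` variants, and the method names are all hypothetical. The idea it demonstrates is that a handle kept after its slot was released stops resolving even if the slot is later reused.

```rust
// Hypothetical sketch: not the capos-lib cap table; layout and names are assumed.

/// A handle: slot index plus the generation the slot had when it was issued.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CapId {
    index: u32,
    generation: u32,
}

impl CapId {
    /// Builds a handle from raw parts, e.g. to model untrusted input.
    pub fn from_raw(index: u32, generation: u32) -> Self {
        CapId { index, generation }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum CapError {
    /// Index is outside the table.
    OutOfRange,
    /// Slot is empty or has been reused since the handle was issued.
    Stale,
    /// No free slot is available.
    Full,
}

struct Slot<T> {
    generation: u32,
    object: Option<T>,
}

pub struct CapTable<T> {
    slots: Vec<Slot<T>>,
}

impl<T> CapTable<T> {
    pub fn with_capacity(n: usize) -> Self {
        let slots = (0..n).map(|_| Slot { generation: 0, object: None }).collect();
        CapTable { slots }
    }

    /// Places the object in the first free slot and returns its handle.
    pub fn insert(&mut self, object: T) -> Result<CapId, CapError> {
        let index = self
            .slots
            .iter()
            .position(|slot| slot.object.is_none())
            .ok_or(CapError::Full)?;
        let slot = &mut self.slots[index];
        slot.object = Some(object);
        Ok(CapId { index: index as u32, generation: slot.generation })
    }

    /// Resolves a handle; any mismatch is an error, never a fallback.
    pub fn get(&self, id: CapId) -> Result<&T, CapError> {
        let slot = self.slots.get(id.index as usize).ok_or(CapError::OutOfRange)?;
        if slot.generation != id.generation {
            return Err(CapError::Stale);
        }
        slot.object.as_ref().ok_or(CapError::Stale)
    }

    /// Releases a slot and bumps its generation so old handles go stale.
    pub fn remove(&mut self, id: CapId) -> Result<T, CapError> {
        let slot = self.slots.get_mut(id.index as usize).ok_or(CapError::OutOfRange)?;
        if slot.generation != id.generation {
            return Err(CapError::Stale);
        }
        let object = slot.object.take().ok_or(CapError::Stale)?;
        slot.generation = slot.generation.wrapping_add(1);
        Ok(object)
    }
}
```

In this sketch, failing closed means that every lookup path ends in either a matching live slot or an explicit error. A 32-bit generation that wraps could in principle collide after very many reuses of one slot; how capOS handles that case is not stated in the text, so the sketch does not attempt it.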

What capOS Is Not

capOS is not a POSIX clone, a microkernel-shaped Linux replacement, or a production OS. It is a place to try the above choices and see which ones survive contact with real workloads. See Build, Boot, and Test to run it.