Verification Workflow
This page maps capOS claims to the commands, QEMU smokes, fuzz targets, proof tools, and review documents that currently support them.
Local Command Set
Use the repo aliases and Makefile targets instead of bare host commands. The
workspace default Cargo target is x86_64-unknown-none, so host tests rely on
aliases that set the host target explicitly.
| Scope | Command | What it checks |
|---|---|---|
| Formatting | make fmt-check | Rust formatting across kernel, shared crates, standalone userspace crates, and demos. |
| Config and manifest logic | cargo test-config | Cap’n Proto manifest encode/decode, CUE value handling, CapSet layout, and config validation. |
| Ring concurrency model | cargo test-ring-loom | Bounded SQ/CQ producer-consumer invariants and corrupted-SQ recovery behavior. |
| Deferred-completion concurrency model | make model-dma-deferred-completion-loom | Bounded Loom over the kernel DeferredCompletionQueue reservation budget and the multi-CPU TLB shootdown generation re-read (kernel/src/arch/x86_64/tlb.rs); budget never exceeded, no completion dropped/double-popped, no retire ahead of a covering flush. |
| DMA authority lifecycle model | make model-dma-tla | Pinned TLC bounded check of models/dma/dma_authority.tla: allocate->map->publish->complete->revoke->scrub->reuse ordering plus generation-keyed stale completion, record-before-PTE-install split, drive-pin/quarantine, and queue-enable epoch-fence interleavings. Fails closed on any invariant violation, deadlock, or analyzer error. |
| DMA assurance aggregate | make dma-assurance-model-check | Local aggregate over the DMA Alloy, TLA+, Loom, and Kani gates. Requires installed cargo-kani; GitHub CI splits the same evidence across the DMA Assurance Models and Kani Proofs jobs. |
| Shared library logic | cargo test-lib | ELF parser, frame bitmap, frame ledger, capability table, and property-test coverage. |
| Manifest tool | cargo test-mkmanifest | Host-side manifest conversion and validation behavior. |
| Userspace runtime | tools/check-userspace-runtime-surface.sh; make capos-rt-check; make init-capos-build demos-capos-build shell-capos-build | Runtime primitive ownership, custom-target boot build path, entry ABI, typed clients, ring helpers, and no_std constraints. |
| Kernel build | cargo build --features qemu | Kernel build with the QEMU exit feature enabled. |
| Generated code | make generated-code-check | Cap’n Proto compiler path/version, schema binding output equality, no_std patch anchors, Adventure and Paperclips content freshness, locked generator dependencies, and checked-in generated-output drift. |
| Dependency policy | make dependency-policy-check | cargo-deny and cargo-audit policy across root and standalone Cargo lockfiles, plus npm lockfile validation and audit checks for the docs toolchain. |
| Mandatory Kani gate | make kani-lib | Bounded capos-lib harness set for frame allocation, stale-handle rejection, frame-grant and cap-slot fail-closed accounting, and transfer-origin fail-closed behavior. |
| DMA-authority core Kani gate | make kani-dma-authority | Bounded Kani over the extracted pure DMA-authority core (capos_lib::dma_authority): a recycled slot’s generation strictly increases and never aliases a live handle, a stale-generation completion is rejected without mutating completion/free/reuse state, and a buffer cannot be re-exposed until its in-flight completion is observed. Faithful model of the kernel/src/device_dma.rs authority arithmetic; the kernel call-through is a tracked follow-up. |
| Full image build | make | Kernel, userspace demos, runtime smoke binaries, manifest, Limine artifacts, and ISO packaging. |
| Default interactive boot | make run | Operator-facing default init-owned boot path from layered system.cue: standalone init starts the foreground shell, resident demo services, and the remote-session CapSet gateway, forwards only the remote CapSet endpoint on loopback, and keeps console/debug output logged separately. |
| Default QEMU smoke | make run-smoke | Scripted focused shell-led boot from system-smoke.cue: kernel boot-launches capos-shell as init, grants the shell bootstrap cap bundle, then proves anonymous-session bootstrap, login failed-auth redaction, successful password auth, broker upgrade to operator bundle, terminal isolation, and clean halt. |
| Focused spawn QEMU smoke | make run-spawn | Narrower init-owned ProcessSpawner graph: kernel boot-launches only standalone init with Console, BootPackage, and ProcessSpawner; init validates BootPackage metadata, spawns endpoint/IPC/VirtualMemory/Timer/FrameAllocator children, waits for them, exercises hostile spawn checks, and halts cleanly. |
| Shell, terminal, and local-auth smokes | make run-shell; make run-terminal; make run-credential; make run-login; make run-login-setup | Anonymous shell behavior, TerminalSession line input and cancellation, CredentialStore verifier behavior, username-aware password login, broker-issued operator bundle upgrade, volatile first-boot setup credential creation, terminal isolation, and stale-handle release. |
| Focused service smokes | make run-chat; make run-adventure; make run-paperclips; make run-revocable-read; make run-memoryobject-shared; make run-ringtap-failing-call | Resident-service demos, the clean-room Paperclips terminal demo, revocation behavior, MemoryObject sharing, and debug-tap viewer behavior. |
| Networking smoke | make run-net; make qemu-net-harness | QEMU virtio-net attachment, kernel PCI/device-discovery path, descriptor-accounting guard evidence, ARP, and ICMP. TCP/UDP socket proof lives under the Phase C userspace network-stack gates. |
| SSH gateway proof smokes | make run-ssh-host-key; make run-ssh-authorized-key; make run-ssh-public-key-session; make run-ssh-public-key-auth; make run-ssh-feature-policy; make run-restricted-shell-launcher | Development host-key fixture validation, authorized-key mapping, public-key session minting, public-key authentication failure privacy, unsupported SSH feature policy, and restricted shell launch authority. The bounded host-local socket-to-TerminalSession wiring proof is retired with the kernel socket owner. |
Do not claim full verification unless the relevant command actually ran in the
current change. For doc-only changes, use an appropriately narrower check such
as mdbook build.
Review Workflow
- Identify the changed trust boundary or state that the change is docs-only.
- Read
REVIEW.md(at the repository root) for the applicable security, unsafe, memory, performance, capability, and emergency-path checklist. - Read the relevant review-finding task records under
docs/tasks/before judging correctness so known open findings are not treated as solved behavior. - For system-design work, list the concrete design and research files read; reviewers should reject vague grounding such as “docs” or “research”.
- Run the smallest command set that exercises the changed behavior, then add QEMU proof for user-visible kernel or runtime behavior.
- Record unresolved non-critical findings as task records under
docs/tasks/ordocs/tasks/on-hold/with concrete remediation context before treating the task as reviewed.
Evidence by Claim
| Claim type | Required evidence |
|---|---|
| Parser or manifest validation | Host tests for valid and malformed input; fuzz target when arbitrary bytes can reach the parser. |
| Kernel/user pointer safety | QEMU hostile-pointer smoke plus code review of address, length, permissions, and validation-to-use windows. |
| Ring or IPC transport behavior | Host model/property tests where possible, plus QEMU process output proving success and failure paths. |
| Userspace runtime primitive ownership | tools/check-userspace-runtime-surface.sh plus review of capos-rt/src/entry.rs, alloc.rs, panic.rs, and syscall.rs. |
| Capability transfer or release | Rollback tests for copy/move/release failure, cap-slot exhaustion, stale caps, and process-exit cleanup. A release-only proof shows local cleanup only; any claim that peers, children, sessions, or delegated holders lose authority needs a separate explicit revoke, session-expiry, object-epoch, or service-specific invalidation proof. |
| Resource accounting | Tests that prove quota rejection, matched release on success and failure, and process-exit cleanup. |
| Generated code, schema, or generated content changes | make generated-code-check and a checked-in baseline diff generated by the pinned compiler or pinned CUE/generator path. |
| Dependency or toolchain changes | Dependency-class review plus make dependency-policy-check; update Trusted Build Inputs when trust assumptions change. |
| Device or DMA work | make run-net or a targeted QEMU smoke; no userspace-driver transition without the gates in DMA Isolation. |
| Panic-surface hardening | Updated Panic Surface Inventory when reachability or classification changes. |
| Authentication and session work | Host tests for TerminalSession line-input bounds, secret-mode echo suppression, cancellation behavior, exclusive terminal handoff, non-inheritance without an explicit grant, verifier encoding, entropy-unavailable fail-closed behavior, bootstrap-plus-RAM-overlay credential handling, volatile credential/disable-state disclosure, bounded single-use setup-token/challenge first-consume/expiry/replay semantics, generic failure/backoff policy, and audit redaction with opaque record IDs plus pre-auth serial-safe failure events; QEMU proof for setup/login, failed auth, successful capos-shell launch through TerminalSession/CredentialStore/SessionManager/AuthorityBroker, lack of terminal access for an ungranted child, absence of broad BootPackage/raw ProcessSpawner caps in the shell, and fail-closed behavior when the secure-randomness path is unavailable. |
Fuzzing and Proof Tracks
The current fuzz corpus lives under fuzz/ and covers manifest Cap’n Proto
input, exported JSON conversion for mkmanifest, arbitrary ELF parser input,
Telnet IAC filtering, terminal line discipline, and ring SQE wire validation.
Run fuzzers when a change alters those parsers, schema shape, terminal/network
byte-stream handling, SQE validation, or related validation rules.
Kani coverage is intentionally narrow and lives in capos-lib, where pure
logic can be bounded without hardware state. Add or refresh Kani harnesses for
ledger, cap-table, bitmap, and parser invariants when those invariants become
part of a security claim. The required local/CI gate is make kani-lib. The
extracted DMA-authority core (capos_lib::dma_authority) has its own bounded
gate, make kani-dma-authority, which proves ownership-generation bump on
recycle, stale-handle rejection without mutation, and no-re-expose before
completion — a faithful model of the kernel/src/device_dma.rs arithmetic whose
kernel call-through is a tracked follow-up.
Loom coverage belongs in shared ring logic. Extend cargo test-ring-loom when
SQ/CQ ownership, ordering, corruption recovery, or wake semantics change.
DMA assurance model files live under models/dma/ and are bounded checked
evidence for device and cloud-backend claims. The Alloy relational authority graph
(models/dma/dma_authority.als) is now an analyzer-checked gate:
make model-dma-alloy runs the pinned Alloy Analyzer 6.2.0 headless at scope
for 4 and fails on any counterexample (free-page reachability, same-domain
IOVA uniqueness, and the ownership-generation stale-handle gate), with the
checked verdict table recorded in models/dma/README.md. The focused Loom gate
for the DeferredCompletionQueue reservation budget and the multi-CPU
generation re-read is also checked (make model-dma-deferred-completion-loom,
pinned loom 0.7.2). The TLA+ lifecycle model (models/dma/dma_authority.tla) is
now a model-checked gate as well: make model-dma-tla runs the pinned TLC
2.19 (tla2tools 1.7.4) over the bounded configuration (2 devices / 2 domains /
2 pages / 2 iovas, generations 0..1) and fails closed on any invariant
violation, deadlock, or analyzer error, with the checked result recorded in
models/dma/README.md. It covers the lifecycle ordering plus the landed
generation-keyed stale completion, record-before-PTE-install split, drive-pin/
quarantine, and queue-enable epoch-fence interleavings. The extracted pure
DMA-authority core is checked by Kani as well (make kani-dma-authority, pinned
kani-verifier 0.67.0): ownership-generation bump on recycle, stale-handle
rejection without mutation, and no-re-expose before completion over
capos_lib::dma_authority.
The DMA checked gates are wired into CI. The GitHub dma-assurance-models job
runs make model-dma-alloy, make model-dma-tla, and
make model-dma-deferred-completion-loom; the kani-proofs job runs
make kani-dma-authority after the mandatory make kani-lib gate. The local
make dma-assurance-model-check aggregate runs all four when cargo-kani is
installed. Do not claim Verus evidence – or any Alloy/TLC/Loom/Kani
DMA-authority result beyond what these targets actually check – unless the
exact command, checker version, configuration, model bounds, and output are
recorded in the task closeout.
For DMA work, map claims through
DMA Assurance Model:
TLA+ for lifecycle ordering and races, Alloy for authority topology, Kani for
pure Rust validators/accounting, and Loom for atomic or queue interleavings such
as DeferredCompletionQueue. The model supplements the required QEMU or cloud
evidence; it does not replace hardware-facing smokes.
Documentation Sources
REVIEW.md(at the repository root): rules for security, unsafe code, capability invariants, resource accounting, and emergency paths.docs/tasks/: open remediation backlog, review-finding task records, and latest verification task records.- Trusted Build Inputs: trusted compiler, generated-code, dependency, bootloader, manifest, and host-tool inputs.
- Panic Surface Inventory: classified panic-like surfaces and commands used to generate the inventory.
- Authority Accounting: authority graph, quota, transfer, rollback, and ProcessSpawner accounting invariants.