Verification Workflow

This page maps capOS claims to the commands, QEMU smokes, fuzz targets, proof tools, and review documents that currently support them.

Local Command Set

Use the repo aliases and Makefile targets instead of bare host commands. The workspace default Cargo target is x86_64-unknown-none, so host tests rely on aliases that set the host target explicitly.

Scope	Command	What it checks
Formatting	`make fmt-check`	Rust formatting across kernel, shared crates, standalone userspace crates, and demos.
Config and manifest logic	`cargo test-config`	Cap’n Proto manifest encode/decode, CUE value handling, CapSet layout, and config validation.
Ring concurrency model	`cargo test-ring-loom`	Bounded SQ/CQ producer-consumer invariants and corrupted-SQ recovery behavior.
Deferred-completion concurrency model	`make model-dma-deferred-completion-loom`	Bounded Loom over the kernel `DeferredCompletionQueue` reservation budget and the multi-CPU TLB shootdown generation re-read (`kernel/src/arch/x86_64/tlb.rs`); budget never exceeded, no completion dropped/double-popped, no retire ahead of a covering flush.
DMA authority lifecycle model	`make model-dma-tla`	Pinned TLC bounded check of `models/dma/dma_authority.tla`: allocate->map->publish->complete->revoke->scrub->reuse ordering plus generation-keyed stale completion, record-before-PTE-install split, drive-pin/quarantine, and queue-enable epoch-fence interleavings. Fails closed on any invariant violation, deadlock, or analyzer error.
DMA assurance aggregate	`make dma-assurance-model-check`	Local aggregate over the DMA Alloy, TLA+, Loom, and Kani gates. Requires installed `cargo-kani`; GitHub CI splits the same evidence across the DMA Assurance Models and Kani Proofs jobs.
Shared library logic	`cargo test-lib`	ELF parser, frame bitmap, frame ledger, capability table, and property-test coverage.
Manifest tool	`cargo test-mkmanifest`	Host-side manifest conversion and validation behavior.
Userspace runtime	`tools/check-userspace-runtime-surface.sh`; `make capos-rt-check`; `make init-capos-build demos-capos-build shell-capos-build`	Runtime primitive ownership, custom-target boot build path, entry ABI, typed clients, ring helpers, and no_std constraints.
Kernel build	`cargo build --features qemu`	Kernel build with the QEMU exit feature enabled.
Generated code	`make generated-code-check`	Cap’n Proto compiler path/version, schema binding output equality, no_std patch anchors, Adventure and Paperclips content freshness, locked generator dependencies, and checked-in generated-output drift.
Dependency policy	`make dependency-policy-check`	cargo-deny and cargo-audit policy across root and standalone Cargo lockfiles, plus npm lockfile validation and audit checks for the docs toolchain.
Mandatory Kani gate	`make kani-lib`	Bounded `capos-lib` harness set for frame allocation, stale-handle rejection, frame-grant and cap-slot fail-closed accounting, and transfer-origin fail-closed behavior.
DMA-authority core Kani gate	`make kani-dma-authority`	Bounded Kani over the extracted pure DMA-authority core (`capos_lib::dma_authority`): a recycled slot’s generation strictly increases and never aliases a live handle, a stale-generation completion is rejected without mutating completion/free/reuse state, and a buffer cannot be re-exposed until its in-flight completion is observed. Faithful model of the `kernel/src/device_dma.rs` authority arithmetic; the kernel call-through is a tracked follow-up.
Full image build	`make`	Kernel, userspace demos, runtime smoke binaries, manifest, Limine artifacts, and ISO packaging.
Default interactive boot	`make run`	Operator-facing default init-owned boot path from layered `system.cue`: standalone init starts the foreground shell, resident demo services, and the remote-session CapSet gateway, forwards only the remote CapSet endpoint on loopback, and keeps console/debug output logged separately.
Default QEMU smoke	`make run-smoke`	Scripted focused shell-led boot from `system-smoke.cue`: kernel boot-launches `capos-shell` as `init`, grants the shell bootstrap cap bundle, then proves anonymous-session bootstrap, `login` failed-auth redaction, successful password auth, broker upgrade to operator bundle, terminal isolation, and clean halt.
Focused spawn QEMU smoke	`make run-spawn`	Narrower init-owned ProcessSpawner graph: kernel boot-launches only standalone `init` with Console, BootPackage, and ProcessSpawner; init validates BootPackage metadata, spawns endpoint/IPC/VirtualMemory/Timer/FrameAllocator children, waits for them, exercises hostile spawn checks, and halts cleanly.
Shell, terminal, and local-auth smokes	`make run-shell`; `make run-terminal`; `make run-credential`; `make run-login`; `make run-login-setup`	Anonymous shell behavior, TerminalSession line input and cancellation, CredentialStore verifier behavior, username-aware password login, broker-issued operator bundle upgrade, volatile first-boot setup credential creation, terminal isolation, and stale-handle release.
Focused service smokes	`make run-chat`; `make run-adventure`; `make run-paperclips`; `make run-revocable-read`; `make run-memoryobject-shared`; `make run-ringtap-failing-call`	Resident-service demos, the clean-room Paperclips terminal demo, revocation behavior, MemoryObject sharing, and debug-tap viewer behavior.
Networking smoke	`make run-net`; `make qemu-net-harness`	QEMU virtio-net attachment, kernel PCI/device-discovery path, descriptor-accounting guard evidence, ARP, and ICMP. TCP/UDP socket proof lives under the Phase C userspace network-stack gates.
SSH gateway proof smokes	`make run-ssh-host-key`; `make run-ssh-authorized-key`; `make run-ssh-public-key-session`; `make run-ssh-public-key-auth`; `make run-ssh-feature-policy`; `make run-restricted-shell-launcher`	Development host-key fixture validation, authorized-key mapping, public-key session minting, public-key authentication failure privacy, unsupported SSH feature policy, and restricted shell launch authority. The bounded host-local socket-to-TerminalSession wiring proof is retired with the kernel socket owner.

Do not claim full verification unless the relevant command actually ran in the current change. For doc-only changes, use an appropriately narrower check such as mdbook build.

Review Workflow

Identify the changed trust boundary or state that the change is docs-only.
Read REVIEW.md (at the repository root) for the applicable security, unsafe, memory, performance, capability, and emergency-path checklist.
Read the relevant review-finding task records under docs/tasks/ before judging correctness so known open findings are not treated as solved behavior.
For system-design work, list the concrete design and research files read; reviewers should reject vague grounding such as “docs” or “research”.
Run the smallest command set that exercises the changed behavior, then add QEMU proof for user-visible kernel or runtime behavior.
Record unresolved non-critical findings as task records under docs/tasks/ or docs/tasks/on-hold/ with concrete remediation context before treating the task as reviewed.

Evidence by Claim

Claim type	Required evidence
Parser or manifest validation	Host tests for valid and malformed input; fuzz target when arbitrary bytes can reach the parser.
Kernel/user pointer safety	QEMU hostile-pointer smoke plus code review of address, length, permissions, and validation-to-use windows.
Ring or IPC transport behavior	Host model/property tests where possible, plus QEMU process output proving success and failure paths.
Userspace runtime primitive ownership	`tools/check-userspace-runtime-surface.sh` plus review of `capos-rt/src/entry.rs`, `alloc.rs`, `panic.rs`, and `syscall.rs`.
Capability transfer or release	Rollback tests for copy/move/release failure, cap-slot exhaustion, stale caps, and process-exit cleanup. A release-only proof shows local cleanup only; any claim that peers, children, sessions, or delegated holders lose authority needs a separate explicit revoke, session-expiry, object-epoch, or service-specific invalidation proof.
Resource accounting	Tests that prove quota rejection, matched release on success and failure, and process-exit cleanup.
Generated code, schema, or generated content changes	`make generated-code-check` and a checked-in baseline diff generated by the pinned compiler or pinned CUE/generator path.
Dependency or toolchain changes	Dependency-class review plus `make dependency-policy-check`; update Trusted Build Inputs when trust assumptions change.
Device or DMA work	`make run-net` or a targeted QEMU smoke; no userspace-driver transition without the gates in DMA Isolation.
Panic-surface hardening	Updated Panic Surface Inventory when reachability or classification changes.
Authentication and session work	Host tests for `TerminalSession` line-input bounds, secret-mode echo suppression, cancellation behavior, exclusive terminal handoff, non-inheritance without an explicit grant, verifier encoding, entropy-unavailable fail-closed behavior, bootstrap-plus-RAM-overlay credential handling, volatile credential/disable-state disclosure, bounded single-use setup-token/challenge first-consume/expiry/replay semantics, generic failure/backoff policy, and audit redaction with opaque record IDs plus pre-auth serial-safe failure events; QEMU proof for setup/login, failed auth, successful `capos-shell` launch through `TerminalSession`/`CredentialStore`/`SessionManager`/`AuthorityBroker`, lack of terminal access for an ungranted child, absence of broad `BootPackage`/raw `ProcessSpawner` caps in the shell, and fail-closed behavior when the secure-randomness path is unavailable.

Fuzzing and Proof Tracks

The current fuzz corpus lives under fuzz/ and covers manifest Cap’n Proto input, exported JSON conversion for mkmanifest, arbitrary ELF parser input, Telnet IAC filtering, terminal line discipline, and ring SQE wire validation. Run fuzzers when a change alters those parsers, schema shape, terminal/network byte-stream handling, SQE validation, or related validation rules.

Kani coverage is intentionally narrow and lives in capos-lib, where pure logic can be bounded without hardware state. Add or refresh Kani harnesses for ledger, cap-table, bitmap, and parser invariants when those invariants become part of a security claim. The required local/CI gate is make kani-lib. The extracted DMA-authority core (capos_lib::dma_authority) has its own bounded gate, make kani-dma-authority, which proves ownership-generation bump on recycle, stale-handle rejection without mutation, and no-re-expose before completion — a faithful model of the kernel/src/device_dma.rs arithmetic whose kernel call-through is a tracked follow-up.

Loom coverage belongs in shared ring logic. Extend cargo test-ring-loom when SQ/CQ ownership, ordering, corruption recovery, or wake semantics change.

DMA assurance model files live under models/dma/ and are bounded checked evidence for device and cloud-backend claims. The Alloy relational authority graph (models/dma/dma_authority.als) is now an analyzer-checked gate: make model-dma-alloy runs the pinned Alloy Analyzer 6.2.0 headless at scope for 4 and fails on any counterexample (free-page reachability, same-domain IOVA uniqueness, and the ownership-generation stale-handle gate), with the checked verdict table recorded in models/dma/README.md. The focused Loom gate for the DeferredCompletionQueue reservation budget and the multi-CPU generation re-read is also checked (make model-dma-deferred-completion-loom, pinned loom 0.7.2). The TLA+ lifecycle model (models/dma/dma_authority.tla) is now a model-checked gate as well: make model-dma-tla runs the pinned TLC 2.19 (tla2tools 1.7.4) over the bounded configuration (2 devices / 2 domains / 2 pages / 2 iovas, generations 0..1) and fails closed on any invariant violation, deadlock, or analyzer error, with the checked result recorded in models/dma/README.md. It covers the lifecycle ordering plus the landed generation-keyed stale completion, record-before-PTE-install split, drive-pin/ quarantine, and queue-enable epoch-fence interleavings. The extracted pure DMA-authority core is checked by Kani as well (make kani-dma-authority, pinned kani-verifier 0.67.0): ownership-generation bump on recycle, stale-handle rejection without mutation, and no-re-expose before completion over capos_lib::dma_authority.

The DMA checked gates are wired into CI. The GitHub dma-assurance-models job runs make model-dma-alloy, make model-dma-tla, and make model-dma-deferred-completion-loom; the kani-proofs job runs make kani-dma-authority after the mandatory make kani-lib gate. The local make dma-assurance-model-check aggregate runs all four when cargo-kani is installed. Do not claim Verus evidence – or any Alloy/TLC/Loom/Kani DMA-authority result beyond what these targets actually check – unless the exact command, checker version, configuration, model bounds, and output are recorded in the task closeout.

For DMA work, map claims through DMA Assurance Model: TLA+ for lifecycle ordering and races, Alloy for authority topology, Kani for pure Rust validators/accounting, and Loom for atomic or queue interleavings such as DeferredCompletionQueue. The model supplements the required QEMU or cloud evidence; it does not replace hardware-facing smokes.

Documentation Sources

REVIEW.md (at the repository root): rules for security, unsafe code, capability invariants, resource accounting, and emergency paths.
docs/tasks/: open remediation backlog, review-finding task records, and latest verification task records.
Trusted Build Inputs: trusted compiler, generated-code, dependency, bootloader, manifest, and host-tool inputs.
Panic Surface Inventory: classified panic-like surfaces and commands used to generate the inventory.
Authority Accounting: authority graph, quota, transfer, rollback, and ProcessSpawner accounting invariants.