Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Current Status

This page describes current repository behavior, not the full long-term design.

Current Snapshot

capOS boots on x86_64 QEMU, starts a standalone init process from the default manifest, and runs the native shell plus resident demo services through typed capabilities. The current operator path starts as an anonymous shell session; login prompts for username> and hidden password>, validates the selected bootstrap account through SessionManager and CredentialStore, and upgrades to a broker-issued operator bundle. Default password-authenticated local operator sessions do not expire by wall-clock timestamp; they remain intended to end through logout, terminal/connection/process-tree close, or administrator revocation. Manifests can still set a non-default operator lifetime for focused expiry proofs. setup can create a volatile local operator credential and then follows the same login upgrade path.

The implemented baseline includes isolated processes, user-mode ELF loading, shared-memory capability rings, endpoint IPC, copy/move capability transfer, thread and park primitives, init-owned service spawning, local shell login, focused Telnet shell demo, resident chat/adventure services, and the Paperclips terminal demo. The current selected milestone is GCE Self-Hosted Web UI: the next visible path is serving the remote-session Web UI through the Phase C userspace network stack and proving private GCE reachability before any public endpoint. Installable System is the completed previous selected milestone for the bounded local/QEMU contract: data-region mount, config-overlay merge, generation/rollback machinery, integrated bootable disk, install, first-boot provision, update/rollback, and structural proposal/body wording reconcile have landed. Device Driver Foundation is also complete; its GCP-first provider rollup has live operator-access, selected NIC raw-frame, selected storage I/O, and gVNIC portability evidence. Durable multi-account credential storage, broader account policy, production SSH/WebShell ingress, public GCE ingress/TLS, AWS/Azure providers, broader storage variants, high-throughput NIC work, direct-remapping DMA, and persistence beyond the landed installable data-region and generation paths remain future work.

Recent Status Notes

Update 2026-06-11 19:21 UTC: the GCE Self-Hosted Web UI local readiness wave is reconciled here. Since the local Web UI L4 proof, the following landed as local QEMU/cloudboot or no-spend harness evidence only: legacy GCE virtio-net Web UI serving (make run-cloud-gce-legacy-virtio-webui-serving proves a host HTTP peer fetching the byte-verified UI bundle over the kernel-brokered legacy virtio 0.9 runtime that backs the typed Nic cap), a browser-facing hardening set proved on the L4 gate (single public-origin policy, IAP-aware SameSite cookie policy, JSON content-type guard, security response headers with a strict CSP, GFE-range-pinned X-Forwarded-Proto trust, the public /healthz health-check contract, and in-guest login peer-gate/failure-backoff hardening), and a no-spend provider-harness fixture set (the private-proof --preflight-only mode, private and public proof-evidence validators, the public ingress resource plan gate, the journal-driven teardown engine, and the provider-command allowlist gate), each fixture gate driving recording stub provider CLIs only, with no real provider invocation or mutation on any path. None of this proves private GCE reachability, public exposure, TLS custody, or production readiness; the live private and public proofs stay on hold.

The selected GCE Self-Hosted Web UI evidence ladder is:

Update 2026-06-08 10:21 UTC: local bounded ICMPv4 Echo Reply diagnostics have landed for the Phase C userspace network stack. make run-cloud-prod-icmp-echo-reply first reruns the served TcpListenAuthority local proof, then boots the ICMP manifest, acquires the QEMU SLIRP DHCP lease 10.0.2.15/24, proves same-subnet ARP and ICMP Echo Request / Echo Reply preservation for identifier 0x04d2, sequence 9, and a 23-byte payload, and rejects bad-checksum, invalid-code, truncated, address-family, oversized-payload, and oversized-frame controls. This answers bounded local ping for diagnostics only; it is not Web UI readiness and does not open public ICMP or change GCE firewall posture.

Update 2026-06-08 08:24 UTC: Phase C slice 7c-ii(b) has a local serve-from-userspace proof, and the legacy kernel socket-path grant is retired for non-qemu production manifests. make run-cloud-prod-userspace-network-stack-smoltcp boots the non-qemu cloudboot manifest, starts a userspace smoltcp network-stack service, grants an application client only Console plus a served TcpListenAuthority, and completes one hostfwd TCP request/response through served TcpListener/TcpSocket caps while preserving host_physical_user_visible=0. Non-qemu manifests that request kernel network_manager or tcp_listen_authority now fail closed instead of reaching kernel/src/virtio_stub.rs; remaining kernel socket grants are qemu-only fixtures. DHCP/IPv4 configuration, Web UI L4, private GCE reachability, public ingress/TLS, and kernel smoltcp/virtio-net cleanup remain separate tasks. The current GCE Self-Hosted Web UI evidence ladder lives in the 2026-06-11 19:21 UTC update above.

Update 2026-06-07 18:20 UTC (through commit 12b8334a, committed 2026-06-07 18:19 UTC): Installable System is closed for the bounded local/QEMU contract. The closeout reconciles the proposal, backlog, proposal index, roadmap, and status page to the landed data-region, overlay, generation, install, provision, and update/rollback proofs while preserving the RAM-only Namespace caveat and leaving secure boot/signing, production release authority, public ingress, broader provider support, full userspace smoltcp/L4 readiness, and full durable account policy as future work. The selected milestone is now GCE Self-Hosted Web UI.

Update 2026-06-07 08:23 UTC (commit ef8d98c2): Device Driver Foundation production-authority closeout is recorded by ddf-production-authority-closeout. The closeout ties together the landed provider-driver, interrupt, audit, and DMA-policy prerequisites and keeps public ingress, AWS/Azure support, direct-remapping hardware, device-autonomous MSI-X, high-throughput NIC, and userspace smoltcp/L4 readiness as future follow-up work.

Update 2026-06-07 05:26 UTC: the GCP-first usable-instance provider rollup is closed by cloud-usable-instance-provider-nic-storage. The rollup cites real GCE serial-console operator access (1779868872-2424), live legacy virtio-net raw-frame provider-nic-bound (1780412056-e1cb), live NVMe Persistent Disk brokered READ (1780806087-bf69), and separate live gVNIC raw-frame / typed-Nic portability evidence (1780794927-1aa9, 1780796615-decc). This is not a public L4 ingress, SSH/WebShell, AWS/Azure, high-throughput NIC, broader storage, direct-remapping DMA, or production cloud-image release claim.

Update 2026-05-23 16:51 UTC (commit c86374f8): make run-ddf-provider-consumer now has a stable userspace virtio-net provider closeout proof line. The line ties together selected queue 1 TX descriptor/avail/doorbell/used-ring/CQ ownership across the full QEMU TX queue depth, bounded queue 0 RX synthetic-token CQ identity, selected TX/RX MSI-X/LAPIC wait/ack/EOI, selected-route reset/reassignment, teardown, stale-handle blocking, and explicit no-silent-provider-fallback boundaries. This remains local bounded provider evidence over manager-owned bounce buffers: live hardware RX used-ring ownership, full virtio-net ownership, direct DMA/IOMMU, cloud NIC/storage readiness, and virtio block/storage drivers remain future work.

Update 2026-05-23 13:36 UTC (commit e248d42b): make run-ddf-provider-consumer now exercises selected userspace virtio-net TX CQ ownership across the full eight-entry TX queue depth used by the smoke. Eight manager-owned bounce buffers can be live before the first completion, the ninth allocation fails closed, wrong-order completion of descriptor 7 preserves descriptor 0, CQ identity is delivered and acknowledged in order for descriptors 0 through 7, release drains seven incomplete descriptors as teardown-only, and provider TX release retires seven delivered but unacknowledged CQ events. This remains bounded selected-queue evidence over bounce buffers: live hardware RX used-ring ownership, direct DMA/IOMMU, full userspace virtio-net ownership, and cloud NIC/storage readiness remain open.

Update 2026-05-12 16:40 UTC: make run-ddf-provider-consumer now exercises the selected userspace virtio-net TX path through bounded queue 1 descriptor/avail publication, exactly one selected notify doorbell, and a runtime-visible tx_interrupt completion event tied to the used-ring handoff. The selected submit path validates the live DMABuffer record, scrubs the bounce page, consumes the live no-write notify_mmio policy, publishes the stored descriptor/avail entry, and rings the notify doorbell only after those gates pass. DMABuffer.completeDescriptor then observes the real TX used-ring entry for the stored software descriptor generation, clears the manager in-flight record, and delivers a bounded selected-tx-used-ring-completeDescriptor event to a live tx_interrupt.wait for the same route. This is still selected-route proof coverage rather than full userspace virtio-net ownership: arbitrary doorbells, production NIC or storage migration, cloud readiness, hardware IRQ ownership, hardware acknowledgement/mask/unmask, direct DMA, IOMMU programming, broader CQ ownership, and grantable device ownership remain open.

Update 2026-05-11 14:39 UTC, commit f04a14f4: make run-ddf-provider-consumer now turns the selected userspace virtio-net TX doorbell gate into an explicit staged claimed-notify-offset admission proof. The selected queue 1 provider entry reports accepted notify-offset policy, blocked wrong-queue policy, blocked wrong-offset policy, and no_doorbell=true after descriptor authority validation and submit scrub. The same smoke submits queue 0 first and proves it remains neutral rather than selected/backend doorbell-capable. This remains bounded manager-owned bounce-buffer evidence only: no virtio-net notify BAR handle is granted, no notify register is written, no real virtio-net descriptor ring is mutated, and production userspace NIC or cloud readiness is not claimed.

Update 2026-05-11 12:01 UTC: make run-ddf-provider-consumer now extends the bounded provider-visible submit effect with a selected provider-owned queue entry. Accepted DMABuffer.submitDescriptor still validates descriptor authority and scrubs the manager-owned bounce page first, then writes queue magic, queue id, tail, descriptor id, submitted length, and flags before the submit marker. The focused smoke maps the buffer after completion and proves the queue entry and marker remain visible outside the completed byte range; it also rejects submits shorter than the full 24-byte provider-effect footprint with zero in-flight accounting and no provider mutation. This is bounded bounce-buffer evidence only: no hardware descriptor ring or CQ is published, no direct DMA or IOMMU/remapping is enabled, host physical/IOVA values stay hidden, no MMIO doorbell is written, and provider-driver IRQ consumption plus cloud NIC/storage readiness remain future work.

Update 2026-05-11 11:22 UTC: make run-ddf-provider-consumer now proves a bounded descriptor-ring-equivalent provider side effect after DMABuffer.submitDescriptor authority validation. The accepted submit path scrubs the manager-owned bounce page, writes a provider-visible shadow descriptor entry with magic, queue, descriptor id, submitted length, and flags, and then writes the existing submit marker before the in-flight record is committed. The same process maps the buffer after DMABuffer.completeDescriptor and proves the shadow descriptor and marker remain visible outside the completed byte range. A follow-up boundary check rejects submits shorter than the 24-byte provider-effect footprint as dmabuffer-provider-effect-too-short, preserving zero in-flight accounting and blocking side effects. This is still bounded bounce-buffer evidence: no hardware descriptor ring or CQ is published, no direct DMA or IOMMU/remapping is enabled, host physical/IOVA values stay hidden, arbitrary MMIO doorbells remain blocked, and provider-driver IRQ consumption plus cloud NIC/storage readiness remain future work.

Update 2026-05-11 10:46 UTC: make run-ddf-provider-consumer now proves the first bounded provider-visible DMA side effect in the four-cap provider consumer. On accepted DMABuffer.submitDescriptor, the manager-owned bounce page is scrubbed, then a submit marker is written only after descriptor authority validation succeeds. The same process maps the buffer after DMABuffer.completeDescriptor and proves the completion pattern is limited to the completed byte range while the submit marker outside that range remains visible. This is still bounded bounce-buffer evidence: no descriptor ring or CQ is published, no direct DMA or IOMMU/remapping is enabled, host physical/IOVA values stay hidden, arbitrary MMIO doorbells remain blocked, and provider-driver IRQ consumption plus cloud NIC/storage readiness remain future work.

Update 2026-05-11 10:06 UTC: commit c52064c0 extends the same make run-ddf-provider-consumer four-cap provider-consumer smoke beyond bounded DMABuffer submit/complete accounting. The service now also calls brokered DeviceMmio.read32, the existing claimed-register DeviceMmio.write32 path, and brokered readback before DeviceMmio.unmap. The interrupt half proves one async Interrupt.wait completes as delivered after route unmask, a second async wait stays pending for a kernel turn, and Interrupt.mask completes that second waiter as cancelled. This remains bounded provider-authority composition evidence only: DMA is still the manager-owned bounce-buffer path, MMIO writes remain limited to the claimed register policy with no arbitrary doorbell, IRQ behavior is bounded route-generation-checked waiter delivery/cancellation, and production NIC/storage migration, IOMMU/remapping, descriptor-ring mutation, completion-queue publication, provider-driver interrupt consumption, and cloud readiness remain future work.

Update 2026-05-11 09:25 UTC: make run-ddf-provider-consumer now extends the same four-cap provider-consumer smoke across the bounded DMABuffer submit/complete descriptor-accounting path. After allocating and unmapping one manager-owned bounce buffer, the service calls typed DMABuffer.submitDescriptor and DMABuffer.completeDescriptor, asserts manager-inflight-recorded then manager-inflight-completed, and checks DMAPool.info reports live_inflight=1 after submit and live_inflight=0 after completion before freeing the buffer. This remains bounded provider-authority composition evidence only: no descriptor ring is mutated, no CQ is published, direct DMA stays blocked, host physical/IOVA exposure stays hidden, arbitrary MMIO writes and doorbells remain blocked, and production NIC/block migration remains future work.

Update 2026-05-11 03:09 UTC: commit 9c0a5183 carries the manager-owned fixed bounce-buffer DMAPool budget ledger into DMAPool.allocateBuffer. The manifest-granted three-slot pool now attaches a device-manager-owned budget policy with three live buffers/pages, 12288 bytes, four queues, eight descriptors per queue, one in-flight descriptor per live slot, zero MMIO mappings/bytes, and zero interrupt holds. With all three fixed slots live, a fourth valid 4096-byte allocation returns no result cap and reports result=dmapool-budget-exceeded, reason=over-buffer-budget, sideEffect=side-effect-blocked, and bufferPresent=false before slot selection, frame allocation, generation allocation, cap minting, or manager ledger mutation. Imported live virtio-net proof records continue to use the kernel-owned device_dma:virtio-net budget policy. This remains the bounded manager-owned bounce-buffer path: direct DMA stays blocked, host physical addresses and IOVAs stay hidden, descriptor rings and completion queues are not mutated, and IOMMU/remapping plus production driver consumption remain future work.

Update 2026-05-11 00:20 UTC: make run-ddf-provider-consumer now boots one focused service that receives console, DMAPool, DeviceMmio, and Interrupt in the same CapSet. The smoke uses existing typed runtime clients and unchanged ABI surfaces: it validates DMAPool.info, allocates one manager-owned bounce-buffer DMABuffer, maps and unmaps it through the existing userspace bounce-buffer path, frees it with scrub-before-frame-free evidence, maps and unmaps a boot-preseeded DeviceMmio BAR page read-only, and exercises bounded Interrupt.wait, acknowledge, unmask, and mask on the manager-attached route. The harness asserts the four-cap service spawn, one manager-grant-source acquire/release for each authority family, and the stable provider-consumer proof line with authorities=dmapool,device_mmio,interrupt, no direct DMA, no host physical/IOVA exposure, no arbitrary MMIO write or doorbell, no real interrupt delivery requirement, and no production NIC/block migration. This is bounded composition evidence only; it does not add schema/generated/runtime changes, IOMMU programming, provider-driver interrupt consumption, or production storage/network driver migration.

Update 2026-05-10 20:06 UTC: manifest-granted DeviceMmio.write32(offset, value) now performs a bounded kernel-side volatile MMIO write after validating the active manager-attached handle, owner/state, region and policy binding, pure DeviceMmioOperation::Write authority, a dword-aligned in-BAR range, and the single PCI MSI-X metadata-derived provider claim, including BDF, BAR, BAR base, offset, and value. The effect uses only the boot-preseeded kernel MMIO mapping cache already used by DeviceMmio.read32; it does not install a post-userspace kernel mapping, and the userspace BAR VMA remains read-only. The focused smoke uses the claimed virtio-rng MSI-X entry-0 vector-control mask dword, reports side_effect=mmio-write-performed and register_write=performed, then reads the same value back through brokered read32 and the read-only userspace VMA. It also attempts an unclaimed message-address dword write and reads back the original value unchanged. Invalid range and unclaimed-register paths remain typed side-effect-blocked results, while stale or released handles fail closed before any write and do not return a write32 result payload. This does not add writable userspace MMIO, arbitrary register writes, doorbells, host physical/IOVA exposure, IOMMU programming, or a production provider-driver consumer.

Update 2026-05-10 19:29 UTC: DMABuffer.completeDescriptor now produces a bounded userspace-visible completion effect on the manager-owned bounce-buffer page. On the existing valid matching manager-inflight-completed path, after active owner/epoch/slot validation and submitted-length checks pass, the manager writes a deterministic byte pattern into the first completionLength bytes of that slot’s bounce page before clearing the in-flight record. The focused DMAPool smoke maps the same slot after completion and proves byte 0 and the last completed byte match the pattern while the next byte remains unchanged. Invalid, stale, no-inflight, mismatched, length-exceeded, mapped-live, and after-free paths keep their fail-closed labels and do not write. This is bounce-buffer completion data only: no direct DMA, descriptor ring mutation, CQ publication, host physical/IOVA exposure, IOMMU programming, or production driver consumer is added.

Update 2026-05-10 16:47 UTC: the manifest-granted Interrupt.wait path now has a fixed-table deferred waiter object for the current manager-attached route. A masked wait still fails closed synchronously as stale-pending-irq-masked / route-masked / side-effect-blocked, but an unmasked wait now returns pending. The focused interrupt smoke submits that wait nonblocking, drives the kernel once, observes it remains pending, calls Interrupt.mask, then finishes the original wait as interrupt-waiter-cancelled / route-masked / waiter-completed-no-irq with wake_blocked=false, matching source/route generations, unchanged delivery counts, and no real IRQ delivery. This is no-IRQ cancellation waiter behavior only; hardware acknowledgement, MSI/MSI-X programming, IRQ-delivery userspace waiters, and production interrupt dispatch remain future work.

Update 2026-05-10 15:33 UTC: the manifest-granted DeviceMmio.map path now maps a boot-preseeded MMIO BAR page into the caller’s userspace address space as read-only, user-accessible, no-execute, no-cache PTEs. Accepted requests return userspace-mmio-bar-mapped, boot-preseeded-read-only-bar-page, user-vma-mapped, and a nonzero page-aligned userspace address; writable, executable, unknown-protection, zero-size, unaligned, out-of-BAR, overflow, and duplicate active map requests return typed no-side-effect results. The focused smoke reads the same QEMU BAR value through the returned userspace address and brokered DeviceMmio.read32, then exercises explicit DeviceMmio.unmap, a no-op second unmap, remap after unmap, and stale unmap failure after cap release. Release, drop, driver-crash, and reset-disable cleanup revoke any borrowed user VMA before the manager record is detached. This does not add writable MMIO, doorbells, volatile register writes, host physical/IOVA exposure, post-userspace kernel MMIO mappings, IOMMU programming, or a production provider-driver consumer.

Update 2026-05-10 14:12 UTC: DMABuffer.submitDescriptor / DMABuffer.completeDescriptor now keep in-flight descriptor identity on each live DMABuffer slot instead of one pool-global descriptor. The focused DMAPool grant smoke proves slot 0 and slot 1 can both be in flight, duplicate submit on the same slot still fails closed, mismatched completion preserves both live slot descriptors, completing slot 0 decrements aggregate DMAPool.info live_inflight from 2 to 1, explicit freeBuffer of the remaining in-flight slot fails closed, and cap release of an in-flight slot drains only that slot while preserving another slot’s in-flight accounting. This remains bounded manager accounting: no descriptor ring is mutated, no CQ entry is published, no direct DMA is attempted, no IOVA or host physical address is exposed, and no IOMMU programming or production driver consumer is added.

Update 2026-05-10 13:45 UTC: commit 3bbeb3d4 makes DMABuffer.unmap explicitly remove the manager-owned bounce-buffer userspace VMA for the calling process without freeing or scrubbing the bounce page, detaching the DMABuffer record, changing DMAPool.info live buffer/page/in-flight accounting, or touching real DMA state. The typed result reports userspace-bounce-buffer-unmapped / single-page-bounce-buffer / user-vma-unmapped when a live mapping is removed, and dmabuffer-mapping-absent / no-user-mapping with no side effect on a second unmap. The focused smoke maps slot 0 read-only, rejects a writable remap while the read-only mapping remains live, unmaps it, proves the second unmap is a typed no-op, remaps the same VMA writable, and verifies stale unmap after freeBuffer fails closed like info, map, submitDescriptor, and completeDescriptor. While VMA teardown is in progress, the cap records an in-progress mapping state so concurrent map/free/release paths fail closed instead of observing an absent mapping before the page-table unmap and TLB wait complete.

Update 2026-05-10 12:49 UTC: commit 28e16431 extends the manifest-granted DMAPool bounce-buffer allocator from two live result caps to three fixed manager-owned slots. The focused smoke now proves slot 0, slot 1, and slot 2 can be live at the same time, DMAPool.info reports live_buffers=3 live_pages=3 live_bytes=12288, a fourth allocation fails closed as dmapool-already-attached / active-buffer-attached, and freeing slot 0 while slots 1 and 2 remain live leaves two committed/resident/unswappable pages. The same smoke reallocates slot 0 with a fresh generation while slots 1 and 2 remain live, verifies the old cap stays revoked, explicitly frees the reused slot 0 and slot 2 buffers, and releases slot 1 while one bounded descriptor submission remains in flight so parent-first DMAPool release completes only after the last live result buffer detaches and drains manager-owned accounting. This is a fixed three-slot bounce-buffer allocator; it still does not expose direct DMA, IOVA or host physical addresses, descriptor-ring mutation, CQ publication, IOMMU programming, hostile isolation coverage, or a production driver consumer.

Update 2026-05-10 11:44 UTC: commit 75beeeb8 extends the manifest-granted DMAPool bounce-buffer allocator from one live result cap to two fixed manager-owned slots. The focused smoke now proves slot 0 and slot 1 can be live at the same time, DMAPool.info reports live_buffers=2 live_pages=2 live_bytes=8192, a third allocation fails closed as dmapool-already-attached / active-buffer-attached, and freeing slot 0 while slot 1 remains live leaves one committed/resident/unswappable page. The same smoke reallocates slot 0 with a fresh generation while slot 1 remains live, verifies the old cap stays revoked, and releases the parent DMAPool before the remaining buffers so the staged pool detach completes only after the final DMABuffer release drains bounded in-flight accounting. This is a fixed two-slot bounce-buffer allocator; it still does not expose direct DMA, IOVA or host physical addresses, descriptor-ring mutation, CQ publication, IOMMU programming, hostile isolation coverage, or a production driver consumer.

Update 2026-05-10 10:56 UTC: commit 9659763e adds typed DeviceMmio.write32(offset, value) admission on the manifest-granted DeviceMmio cap. The kernel validates the active manager-attached handle, owner/state, region and policy binding, DeviceMmioOperation::Write authority, and 32-bit aligned in-BAR range before returning admission-check-only, real-mmio-write-not-programmed, and side-effect-blocked with register_write=blocked. The focused smoke asserts accepted-shaped admission, unaligned/out-of-BAR/overflow mmio-write32-range-invalid denials, stale-after-release failure, and two sequential grant-cycle runs. This is an admission proof only; no volatile register write, userspace BAR mapping, doorbell, host physical exposure, IOMMU programming, or production driver consumer is added.

Update 2026-05-10 10:20 UTC: commit 3777a50d gives the manifest-granted proof-buffer DMABuffer descriptor path an explicit single in-flight descriptor identity. After one valid submitDescriptor, the focused grant smoke now proves a duplicate submit for the same queue/descriptor returns dmabuffer-descriptor-already-inflight with side-effect-blocked, and a valid-shaped completion for a different live descriptor returns dmabuffer-inflight-descriptor-mismatch with side-effect-blocked. Both refusals preserve DMAPool.info live_inflight=1; the matching completion still restores it to 0. This is still bounded manager accounting only, not descriptor-ring mutation, CQ publication, direct DMA, IOVA export, or a production driver consumer.

Update 2026-05-10 05:03 UTC: unsupported-protection DeviceMmio.map requests now stay on the typed admission result path. The focused grant smoke decodes range_result=mmio-map-prot-invalid, range_reason=unsupported-map-prot, range_side_effect=side-effect-blocked, addr=0, and unchanged manager identity fields instead of treating executable or missing-read protections as a capability exception. This remains admission-only evidence; no real BAR mapping, register access, doorbell write, or host physical address exposure is added.

Update 2026-05-10 09:36 UTC: a harness-hardening follow-up, commit 7dfa1d65, strengthens make run-dmapool-grant around that bounded userspace bounce-buffer map. The current focused smoke maps the first slot generation read-only, proves the zeroed page is readable, and asserts a same-cap writable remap attempt fails while that read-only mapping remains live. It then writes and reads a marker through the second slot generation’s read-write mapping while preserving the existing typed partial-range, protection, and free/release cleanup assertions. This is a stronger proof of the existing mapping permission contract, not new direct DMA, IOVA, host physical, descriptor-ring, CQ, IOMMU, hostile-isolation, or production-driver authority.

Update 2026-05-10 08:53 UTC: the manifest-granted DMABuffer.map path now maps the single manager-owned bounce-buffer page into the caller’s userspace VMA. Accepted readable full-page requests return userspace-bounce-buffer-mapped, single-page-bounce-buffer, and user-vma-mapped with a nonzero page-aligned userspace address while still reporting real_dma_mapping=not-programmed, direct_dma=blocked, and host_physical_user_visible=false. The feature slice proved the mapped page could be reached from userspace, kept typed partial-range/protection denials at addr=0, and proved DMABuffer.freeBuffer / cap release revoke the user mapping before the bounce page is scrubbed and freed. This is userspace access to the kernel-managed bounce buffer only; it does not expose IOVA, host physical addresses, direct DMA, descriptor-ring mutation, CQ publication, IOMMU programming, or a production driver consumer.

Update 2026-05-10 06:37 UTC: the manifest-granted DMAPool.allocateBuffer and DMABuffer.freeBuffer path now uses production bounce-buffer labels for the single-page userspace allocation/free authority. A valid 4096-byte request still mints exactly one same-session DMABuffer result cap, but the cap surfaces now report userspace_dmapool=manager-issued-bounce-buffer, allocation=single-bounce-buffer-page, record_pool=userspace-bounce-buffer-live, and free_buffer=bounce-buffer-page; zero-live records report zero-live-dmapool-bounce-buffer, and oversized requests fail as size-exceeds-bounce-buffer. The backing frame remains ledger-owned, resident, unswappable, scrubbed before frame free, and hidden from userspace. That slice advanced allocation/free authority; the later DMABuffer.map slice above adds userspace bounce-buffer VMA access while descriptor methods still only update bounded manager accounting, and no IOVA, host physical address, CQ publication, descriptor-ring mutation, IOMMU programming, or production driver consumer is added.

Update 2026-05-10 04:40 UTC: invalid-size DMAPool.allocateBuffer requests now use the same typed no-result-cap rejection shape as duplicate-active requests. Zero-size and over-bounce-buffer calls return result=dmapool-allocation-request-invalid, the exact request reason, side_effect=side-effect-blocked, buffer_present=false, no result cap, and no page mutation instead of relying on a capability exception string.

Update 2026-05-10 04:27 UTC: the manifest-granted Interrupt.mask and Interrupt.unmask methods perform bounded route-state control over the manager-attached dispatch slot. unmask changes claimed-masked to driver-unmasked, mask changes it back to claimed-masked, both preserve delivery counts, and release masks retained manager-grant routes before detaching. The 2026-05-10 16:47 UTC status above supersedes the original post-unmask wait placeholder with deferred no-IRQ cancellation behavior.

Update 2026-05-10 03:23 UTC: DMAPool.allocateBuffer now reports the duplicate-active bounded proof-buffer rejection as typed result data instead of requiring userspace to infer the label from a capability exception string. When the first proof DMABuffer is still live, the second valid-size allocation returns no result cap and reports result=dmapool-already-attached, reason=active-buffer-attached, side_effect=side-effect-blocked, and buffer_present=false; the smoke then proves DMAPool.info still reports one live 4096-byte proof frame. This remains bounded proof-buffer guard evidence, not real multi-buffer DMA allocation or production userspace DMA authority.

Update 2026-05-10 02:52 UTC: the focused DMAPool grant smoke now proves active duplicate allocation is refused while the single proof buffer is still attached. The smoke calls DMAPool.allocateBuffer a second time after the first result DMABuffer becomes live, requires the failure path, then re-reads DMAPool.info to prove the record still reports exactly one live 4096-byte proof frame. This is a bounded proof-buffer guard only; it does not add real multi-buffer allocation, DMA mappings, IOVA/physical exposure, or production driver authority.

Update 2026-05-10 02:21 UTC: DMAPool.info now reports the live manager record accounting for the bounded proof-buffer path. The manifest-granted DMAPool starts as zero-live-dmapool-proof, moves to synthetic-live-dmapool-proof with one live 4096-byte page while the manager-attached DMABuffer result cap is active, and returns to zero-live after typed DMABuffer.freeBuffer scrubs/releases the proof frame. The focused grant smoke asserts the live and after-free accounting lines through the typed runtime client. This remains bounded proof-buffer accounting only; it does not implement real device-visible DMA mappings, IOVA/physical address exposure, production descriptor side effects, or production page lifecycle.

Update 2026-05-09 23:52 UTC: the manifest-granted Interrupt skeleton now also has bounded Interrupt.mask and Interrupt.unmask admission methods. They validate the current manager-attached route through the existing Mask/Unmask authority paths and return typed no-side-effect labels while proving route state and delivery counts stay unchanged. This does not implement real route mask/unmask mutation, hardware acknowledgement, blocking userspace waiters, MSI/MSI-X table programming, or real interrupt delivery.

Update 2026-05-09 23:21 UTC: the manifest-granted Interrupt skeleton now also has a bounded Interrupt.acknowledge admission method. It validates the current manager-attached route through the existing Acknowledge authority path and returns typed no-side-effect labels without acknowledging hardware, waking waiters, or changing delivery counts. This does not implement real interrupt acknowledgement, blocking userspace waiters, real mask/unmask route mutation, or real interrupt delivery.

Update 2026-05-09 19:18 UTC: the manifest-granted Interrupt skeleton now has a bounded Interrupt.wait admission method. It validates the current manager-attached route, delegates to the shared pending-IRQ token validator, and returns typed masked-route labels without waking a waiter or advancing delivery counts. This does not implement blocking userspace waiters, hardware acknowledgement, real route mask/unmask mutation, or real interrupt delivery.

Update 2026-04-28 22:02 UTC: normal shell client @... grants reject explicit badge N selector syntax and preserve delegated client endpoint identity when the selector is omitted. Low-level and hostile-path tests still carry explicit selector fixtures.

Update 2026-04-29 05:59 UTC: the focused chat manifest now routes the kernel singleton chat_endpoint through init to the resident chat server, and the focused chat shell no longer receives a manifest-forwarded chat service export. Its normal chat authority comes from the broker-issued operator shell bundle, matching the default and remote shell paths while the resident bot keeps its manifest service grant.

Update 2026-04-29 07:35 UTC: Session-Bound Invocation Context core gates are landed. Implemented pieces include the process-session invariant, endpoint caller-session metadata, stale normal endpoint rejection, transfer scopes, field-granular disclosure gating, session expiry for broker-issued shell bundle caps, guest bundle narrowing, chat session-keyed membership, and Aurelian player state keyed by live endpoint caller-session metadata.

Update 2026-05-01 08:47 UTC: default password-authenticated local operator sessions now mint with no wall-clock expiration. Short-expiry operator proofs remain available by setting a non-default sessionLifetimes.operatorMs in the manifest.

Update 2026-04-29 09:44 UTC: later Gate 4 cleanup put terminal output behind live caller-session dispatch, bound shell-serviced stdio bridge waits to opaque live caller-session metadata, removed remaining badge-facing service-common handler APIs from normal chat paths, and widened non-adventure endpoint caller-session opaque references to 128 bits while preserving the scoped_ref ABI field as the low half.

Update 2026-04-29 10:20 UTC: non-adventure endpoint caller-session references now use an entropy-backed boot secret and HMAC-SHA256 over a non-reused endpoint service-scope id plus kernel session id. The ABI layout is unchanged, but scoped_ref is no longer value-compatible with the old unkeyed hash. References rotate on reboot and endpoint object replacement.

Update 2026-04-29 21:40 UTC: Gate 4 of the Session-Bound Invocation Context milestone is implemented and verified on mainline. Commit faeff80 at 2026-04-29 21:39 UTC records the final closeout: normal chat, adventure, terminal, and stdio paths no longer expose caller-selected receiver identity, focused make run-adventure passed with session-bound Adventure/chat service grants, and focused make docs passed after docs PDF render hardening. The paper/status alignment now records the session-bound shared-service evidence as landed. Remaining work in this area is future stable service-audit identity across upgrades, not additional shared-service migration.

Implemented

Visible Milestone Proofs

  • First Packet: commit b56a5c1 at 2026-04-24 15:37 UTC.
  • First HTTP: commit a4f1722 at 2026-04-24 16:47 UTC.
  • The Unprivileged Stranger: commit d4016ab at 2026-04-22 16:35 UTC.
  • Native Cap Shell: commit f554e88 at 2026-04-23 08:41 UTC.
  • Boot to Shell: commit e5adafb at 2026-04-23 13:39 UTC.
  • The Revocable Read: commit 7f19af2 at 2026-04-23 16:15 UTC.
  • First Chat: commit 2cd85a8 at 2026-04-24 00:13 UTC.
  • Local MUD: commit add7f9b at 2026-04-24 01:40 UTC.
  • Verified Core: commit d43b691 at 2026-04-23 22:09 UTC.
  • Ring as Black Box: commit da5f5e9 at 2026-04-24 03:13 UTC.
  • First AP Scheduler: commit d88bca7 at 2026-04-25 11:31 UTC.
  • Telnet Shell Demo: commit 2834bfc at 2026-04-25 20:25 UTC. Demo scope: plaintext, loopback-only research demo proving the TerminalSession/SessionManager/AuthorityBroker/ RestrictedShellLauncher boundary over a real TCP socket; not a shippable Telnet service. Production remote shell tracks the SSH Shell Gateway in SSH Shell Gateway.
  • Multi-Process SMP Concurrency: commit 3fb89923 at 2026-04-30 09:45 UTC.
  • Default Run Telnet Wiring Retired: commit 367117be at 2026-05-01 16:54 UTC removes default host-local Telnet gateway forwarding and the default manifest service. Current make run starts the foreground shell, chat service, and remote-session CapSet gateway, and forwards only the remote CapSet endpoint. The plaintext demo was later retired with qemu-only kernel TCP listener removal; make run-telnet now exits before QEMU with a retirement diagnostic. Earlier commit 7a155f4 at 2026-04-26 21:02 UTC moves Telnet IAC filtering into the kernel socket terminal (best-effort silent swallow for the BSD/netkit clients we test against; no WONT/DONT replies) so a normal Telnet client lands at the shell prompt without a userspace pre-handoff recv, and refactors the gateway to loop accept/handoff/launch_shell/wait per connection so repeated host Telnet connections succeed.
  • Service Object Routing/Lifecycle: commit a4655f0 at 2026-04-28 14:10 UTC; make run-service-object-routing proves trusted service-object minting, receiver-cookie dispatch, payload-spoof rejection, copy/move IPC transfer, nested spawn delegation, generation-checked receiver cookies, close/revoke rejection, and stale-cookie rejection after record reuse. This is now historical low-level coverage: the implemented Session-Bound Invocation Context baseline gives normal workload processes one immutable session context, and endpoint subject disclosure is private by default.
  • Session Context Invariant: commit 3edee90 at 2026-04-28 16:26 UTC adds make run-session-context, proving every spawned process has one immutable session context, raw child spawns inherit the caller context, a copied UserSession cap cannot relabel invocation context, and a broker-issued launcher can select a validated child context while mismatched profile requests fail closed. Commit 3469c27 at 2026-04-28 16:54 UTC extends the proof so expired guest-session bundle refreshes fail closed at the broker. Commit 687511a at 2026-04-28 17:43 UTC adds privacy-preserving endpoint caller-session metadata and stale normal endpoint rejection: endpoint servers receive only a service-scoped opaque caller-session reference, epoch, and live/stale flags by default, spoofed user/session/role payload labels do not affect the delivered invocation context, and calls after the process session expires fail before transfer preparation or enqueue. Commit f0cb74b at 2026-04-28 18:38 UTC adds session-aware cap transfer scopes: same-session-only caps cannot cross into another session, explicitly shareable caps may cross and then invoke under the receiver session, and service-regrant-only caps require a trusted fixed-session broker/launcher path. Commit 0f92d77 at 2026-04-28 19:33 UTC adds explicit endpoint subject disclosure gating: request without scope and scope without request expose no subject fields, request plus matching scope exposes only allowed fields, and broader requests are narrowed. Commit dc7ece4 at 2026-04-28 20:06 UTC migrates the chat demo to session-keyed membership: chat member state is keyed by the endpoint caller-session reference, the focused chat manifest no longer assigns static chat badges, and make run-chat proves operator-session chat clients plus rejected delegated endpoint relabeling. A follow-up review fix keeps join handles as request data and uses service-assigned visible member labels.
  • 2026-04-28 20:48 UTC narrows guest shell bundles: guest sessions require an explicit manifest guest seed, guest bundles receive no default chat/adventure service endpoint caps, and guest launcher policy comes from resource-profile launcherProfile rather than the full manifest binary list.

Boot and Kernel Baseline

  • Limine boots the x86_64 kernel in QEMU.
  • The kernel initializes dual UART output, GDT, IDT, LAPIC, syscall MSRs, memory management, page tables, heap allocation, and the global capability registry. The legacy PIC/PIT path remains as a fallback when LAPIC timer setup or PIT-based calibration is unavailable.
  • User page-table map, unmap, and protect operations are routed through a TLB shootdown helper keyed by address-space CPU residency. Remote targets get pending full-TLB flush generations plus vector-49 IPIs, and the sender waits for observed target completion after ring dispatch releases address-space, cap-table, and scratch locks. Deferred queue slots are reserved before page-table mutation, and drains flush the current CPU before waiting. Delayed maskable interrupt delivery is covered by syscall-entry and flush-before-user-return hooks. Scheduler CR3 handoff marks the current CPU resident, including AP cpu=1 during the first AP scheduler-owner proof, so remote shootdown targets become active when an address space has run on more than one CPU.
  • AP cpu=1 can own scheduler/user execution under -smp 2: APs register their PerCpu records, program LAPIC timers from the BSP calibration, update AP TSS.RSP0 during context switches, and enter the scheduler from the AP idle loop when AP timer setup succeeds. This proof keeps one scheduler owner; when AP cpu=1 is online with a programmed timer, the BSP stays in kernel idle so the process-wide capability ring is not executed concurrently.
  • The Multi-Process SMP Concurrency milestone is complete at commit 3fb89923 (2026-04-30 09:45 UTC). The scheduler tree includes a narrow reschedule-IPI wake path for halted scheduler-owner loops, and make run-smp-process-scale builds capos-smp-process-scale.iso from system-smp-process-scale.cue, runs repeated -smp 1, -smp 2, and best-effort 4-vCPU QEMU cases, parses compact verified timing lines, stores raw serial logs under target/smp-process-scale/<timestamp>/, and enforces the 1.6x median speedup threshold when KVM-backed evidence is available. The accepted run in target/smp-process-scale/cycle-balanced-default/ recorded 1.608x 1-to-2 speedup. A later capos-bench nested-QEMU/KVM rerun on GCE n2-highcpu-8 at commit 0d89a91b (2026-04-30 11:09 UTC) pinned QEMU to host CPUs 0,1,2,3 and recorded 1.873x capOS 1-to-2 speedup; the matching Linux guest baseline under the same CPU pinning recorded 1.934x. The same run recorded capOS smp4=1111 scaled cycles, or 1.475x from the 1-vCPU baseline but slower than the 2-vCPU median, while Linux recorded 3.774x 1-to-4 speedup; capOS therefore still claims only the 1-to-2 milestone gate. The closeout also reran ordinary run-smoke and run-spawn under -smp 2, with logs in target/smp2-smokes/, covering the default manifest, ring, thread lifecycle, park cleanup, generic child waits, and process exit.
  • The kernel creates its own page tables with per-section permissions and keeps the higher-half direct map for physical memory access.
  • SMEP/SMAP are enabled when the QEMU CPU advertises support.

Code: kernel/src/main.rs, kernel/src/arch/x86_64/, kernel/src/mem/.

Validation: cargo build --features qemu, make run-smoke.

Process and Userspace Runtime

  • Processes have isolated address spaces, one or more internal Thread records with per-thread kernel stacks and saved CPU context, CapSet bootstrap pages, capability rings, and local capability tables.
  • ELF loading supports static no_std userspace binaries and TLS setup.
  • capos-rt owns the userspace entry path, allocator initialization, ring-client access, typed clients, result-cap parsing, and owned-handle release.
  • capos-rt is the only source owner for the userspace _start, panic, global allocator, raw syscall, and capos_rt_main handoff surfaces; a source check guards this split.
  • targets/x86_64-unknown-capos.json defines the capOS userspace target for booted init, demos, shell, and capos-rt runtime builds; the kernel default remains x86_64-unknown-none.
  • The 7.1.0 in-process threading contract defines the split between process-owned address-space/capability state and thread-owned execution state, plus thread/kernel-stack quotas and generation-checked waiter identity. 7.2.0 moved saved context, kernel stack, FS base, and block state into Thread records; 7.2.1 schedules and wakes generation-checked ThreadRef values; 7.2.2 adds process-local ThreadSpawner and ThreadHandle caps plus ThreadControl.exitThread for create, join, detach, self-join rejection, exit-code observation, and last-thread process exit; and 7.2.3 adds private ParkSpace wait/wake with timeout, wake, and reserved waiter completion semantics. SharedParkSpace park-words remain future work.

Code: kernel/src/spawn.rs, kernel/src/process.rs, capos-rt/src/, init/src/main.rs, demos/, shell/, targets/x86_64-unknown-capos.json, tools/check-userspace-runtime-surface.sh.

Design: In-Process Threading, Park Authority.

Validation: tools/check-userspace-runtime-surface.sh, make capos-rt-check, make init-capos-build, make demos-capos-build, make shell-capos-build, make capos-rt-capos-build, make run-smoke, make run-spawn.

Programming Language Support

  • Native capOS Rust is the only implemented booted Rust language path. It uses #![no_std], alloc, capos-rt, static ELF binaries, and the targets/x86_64-unknown-capos.json custom target.
  • Native C boots through the libcapos C-substrate (Phase 0; make run-c-hello exercises Console + Timer + EntropySource + a 4 KiB anonymous VM roundtrip) and through the POSIX adapter (Phase P1.2 Phase B; the historical make run-posix-dns-smoke resolved example.com over the qemu-only kernel UdpSocket cap via QEMU slirp DNS at 10.0.2.3:53, but that target is retired after kernel socket-owner removal; make run-posix-pipe-smoke and make run-posix-spawn-smoke exercise pipe, fork-for-exec, direct posix_spawn, minimal file actions, read, and waitpid over ProcessSpawner / Pipe; the Console-backed stdio proof landed at commit aa6a56d7 (2026-05-13 11:03 UTC) and make run-posix-stdio-smoke exercises write(1, ...) and write(2, ...) over a granted Console while proving read(0, ...) stays closed without a stdin grant; the file/directory fd closeout landed at commit f97d9833 (2026-05-23 06:23 UTC) and make run-posix-file exercises open(), write(), lseek(), read(), opendir(), readdir(), and closedir() over a granted root Directory; make run-posix-printf exercises the focused printf/string subset: formatted output, string/mem, numeric conversion, and ctype helpers; make run-posix-signal-time exercises Timer-backed time, nanosleep, and sleep plus the documented fail-closed signal-delivery stubs). Both bypass WASI – they are static ELF binaries linked against libcapos.a and, for POSIX smokes, libcapos_posix.a. POSIX posix_spawn() accepts argv/envp for source compatibility but does not deliver them until LaunchParameters / environment support lands. Broader C/libcapos surface and full POSIX adapter scope remain future design.
  • Sandboxed wasm32-wasi is the first booted WASI-hosted language path. Phase W.5 (filesystem; capos-wasm/src/wasi/fs.rs) closed and is exercised by make run-wasi-fs: the wasm-host installs the manifest-granted root Directory cap as a preopened fd, the WASI payload writes and reads back a file through path_open / fd_write / fd_close / re-open / fd_read, and the preopen sandbox refuses absolute paths and parent-escape .. segments. The WASI host adapter closed Phase W.4 at commit b0f6939f (2026-05-07 20:09 UTC); Phase W.3 closed at commit ca41ecc1 (2026-05-07 18:29 UTC; the surrounding W.3 narrative stamps from 2026-05-07 18:25 UTC predate the feat commit by a few minutes); Phase W.2 closed at commit 7bfcb1d8 (2026-05-07 10:53 UTC): the wasm-host userspace binary (capos-wasm/ standalone crate over vendored wasmi 1.0.9) hosts WebAssembly modules whose wasi_snapshot_preview1 imports are backed by typed capOS capabilities (Console + Timer + BootPackage, the per-instance argv text grant from W.3, the 2026-05-13 bounded environment text grant through initConfig.init.wasiEnv, and the optional W.4 EntropySource cap looked up from the per-instance CapSet under the well-known name random). make run-wasi-hello-rust, make run-wasi-hello-c, make run-wasi-cli-args, and make run-wasi-env are the regression smokes; make run-wasi-random is the W.4 granted gate (the payload reads N=64 bytes through random_get and prints [wasi-random] entropy_bytes=64 entropy_bound_ok=true) and make run-wasi-random-ungranted is the matching refusal gate (the same payload observes ERRNO_NOSYS = 52 from the closed-fail branch when the manifest withholds the grant). A 2026-05-13 authority-free compatibility slice adds make run-wasi-stdio-fd, whose direct-import payload proves clock_res_get(MONOTONIC), sched_yield, fd_fdstat_get(1/2), and fd_seek(1/2) no longer return ERRNO_NOSYS; make run-wasi-env proves one granted environment value reaches a WASI payload through environ_get / environ_sizes_get; make run-wasi-preview1-refusals remains the storage/socket fail-closed gate for path_open, fd_read, sock_send, and sock_recv. Wall-clock support stays deferred until capOS has a typed WallClock/RealTimeClock cap; clock_time_get(CLOCKID_REALTIME) keeps returning ERRNO_NOSYS until that cap lands.
  • Rust std, C++, Go, Python, JavaScript/TypeScript, and full POSIX shell/utilities are not implemented as supported capOS runtime paths.
  • Lua has a Phase 0 in-tree capability-aware Lua-subset interpreter under demos/lua-smoke/ (gated by make run-lua-smoke); it validates the long-term capability-userdata host API design but is NOT a PUC Lua dialect-compatible runner. Dialect compatibility waits on the future C/libcapos PUC port.
  • The planned compatibility story is split by adapter type rather than one generic “compatibility layer”: native runtime adapters for languages such as Rust and Go, capability-native bindings over Cap’n Proto interfaces, POSIX compatibility adapters over scoped file/socket/process caps, and WASI host adapters backed by capabilities.

Design: Programming Languages, Userspace Runtime, Userspace Binaries, Go Runtime, Lua Scripting.

Validation: current native Rust validation uses tools/check-userspace-runtime-surface.sh, custom-target userspace builds, and the runtime QEMU smokes listed above. Native C/POSIX validation is through the focused make run-c-* and make run-posix-* smokes named in this section, including make run-posix-file for the File/Directory fd surface. WASI filesystem validation uses make run-wasi-fs (Phase W.5 preopened-directory round-trip and sandbox proof).

Capability Ring and IPC

  • The shared ring ABI supports CALL, RECV, RETURN, RELEASE, CANCEL, NOP, and compact ParkSpace PARK/UNPARK transport operations.
  • cap_enter processes submissions and can block until completions arrive or a timeout expires.
  • Endpoints route ring-native IPC between processes.
  • Direct IPC handoff lets a blocked receiver run before unrelated round-robin work after a matching CALL arrives.
  • Transport errors and application exceptions are surfaced through CQEs and typed runtime client errors.
  • Ordinary capability implementation errors, revoked ordinary/endpoint use, live endpoint target errors after endpoint identification, and endpoint RETURN application failures use serialized CapException payloads when a caller result buffer can safely receive one. No-payload application failures report CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED; malformed transport metadata and unsafe result-buffer paths remain transport errors.
  • Endpoint RETURN can propagate a serialized CapException from a userspace endpoint server to the original cross-process caller.
  • debug_tap builds export metadata-only ringtap: records for observed SQEs and posted CQEs on the QEMU/debug UART. The format is fixed, bounded, and deliberately records payload_len = 0 until a separate payload-capture authority lands.
  • tools/ringtap-viewer/ parses ringtap: logs into SQE/CQE summaries and can decode authorized Cap’n Proto payloads for CapException, TerminalSession.readLine params, and ProcessHandle.wait results when future tap output includes payload_schema and payload_hex fields.
  • make run-ringtap-failing-call boots the default shell smoke with debug_tap, drives the known typed-call method-99 launcher failure, runs the viewer over the captured kernel log, and leaves offline inspection logs in target/ringtap-failing-call-*.log.

Code: capos-config/src/ring.rs, kernel/src/cap/ring.rs, kernel/src/cap/endpoint.rs, kernel/src/debug_tap.rs, capos-rt/src/ring.rs, capos-rt/src/client.rs, tools/ringtap-failing-call-smoke.sh, tools/ringtap-viewer/.

Validation: cargo test-ring-loom, make run-smoke, make run-spawn, make run-smoke CARGO_FLAGS='--features debug_tap', cd tools/ringtap-viewer && cargo test, make run-ringtap-failing-call.

Capabilities

Implemented kernel capabilities include:

  • Console for debug UART output.
  • TerminalSession for the separate session UART with line input/output, bounded readLine, visible/hidden echo, structured cancellation, and a single move-only foreground holder.
  • BootPackage for read-only, chunked boot manifest reads from init.
  • FrameAllocator for typed MemoryObject frame ownership grants.
  • MemoryObject for owned physical frame ranges, caller-local map/unmap/protect, and final backing release after cap/mapping teardown.
  • Endpoint for IPC rendezvous.
  • VirtualMemory for anonymous user page map, unmap, and protect operations.
  • Timer for monotonic tick/time reads and bounded sleep completions through the capability ring.
  • ThreadControl for runtime-owned FS-base get/set and current-thread exitThread on the current thread.
  • ThreadSpawner and ThreadHandle for process-local in-process thread creation, one-shot join, exit-code observation, detach-on-release, and retained-status cleanup.
  • ParkSpace for process-local private park wait/wake on 32-bit userspace words, with per-thread blocking and reserved waiter CQE credits.
  • ProcessSpawner and ProcessHandle for init-driven child process creation and wait semantics.
  • Retired NetworkManager, TcpListener, and TcpSocket qemu-only kernel socket capabilities. Their entry points now fail closed; the active TCP/UDP socket authority shape is the Phase C userspace network-stack path.

MemoryObject holders and anonymous VirtualMemory mappings charge the same per-process ResourceLedger::frame_grant_pages quota. Mapping a held MemoryObject records borrowed address-space pages and reserves mapping quota until unmap so backing frames cannot stay pinned after the cap charge is released.

Code: kernel/src/cap/console.rs, kernel/src/cap/terminal_session.rs, kernel/src/cap/boot_package.rs, kernel/src/cap/frame_alloc.rs, kernel/src/cap/endpoint.rs, kernel/src/cap/virtual_memory.rs, kernel/src/cap/timer.rs, kernel/src/cap/thread_control.rs, kernel/src/cap/thread_handle.rs, kernel/src/cap/process_spawner.rs, kernel/src/cap/network.rs.

Validation: make run-smoke, make run-memoryobject-shared, make run-spawn, make run-shell, make run-terminal, make run-net, cargo test-lib.

Capability Transfer and Release

  • IPC CALL and RETURN support sideband transfer descriptors.
  • Copy and move transfer are implemented.
  • Move transfer reserves the sender slot until destination insertion and commit.
  • Transfer result caps carry interface ids to userspace.
  • CAP_OP_RELEASE removes local capability-table slots. Runtime owned-handle drop queues one local release, and Runtime::flush_releases() forces queued releases when code cannot wait for the next ring-client acquisition/drop.

Code: kernel/src/cap/transfer.rs, kernel/src/cap/ring.rs, capos-lib/src/cap_table.rs, capos-rt/src/ring.rs.

Validation: cargo test-lib, make run-smoke.

Manifest Tooling and Smokes

  • tools/mkmanifest turns system.cue into a Cap’n Proto boot manifest.
  • The build uses repo-pinned Cap’n Proto and CUE tool paths through the Makefile; direct mkmanifest invocation also rejects missing, unpinned, or version-mismatched CUE compilers. mkmanifest cue-to-capnp extends the same pinned-tool policy to general CUE-authored data messages: it exports CUE as JSON, validates CAPOS_CAPNP, and delegates arbitrary specified schema-rooted struct serialization to capnp convert json:binary.
  • Default scripted QEMU smoke still uses the focused shell-led system-smoke.cue path: anonymous session on boot, login prompting for username before hidden password entry, generic failed-auth output on a wrong password, successful operator login, broker upgrade to the operator bundle, child terminal isolation, stale-handle release, single-capos-shell init boot, and clean halt. The default operator-facing system.cue path is init-owned and is exercised by make run.
  • system.cue is now the default init-owned manifest. The kernel starts only the first init service, and init starts capos-shell, the remote-session CapSet gateway, and the default demo services from the manifest service graph. The shell receives terminal/creds/sessions/audit/broker caps and mints its own anonymous session.
  • system-shell.cue is the focused anonymous-shell proof (no verifier), which exercises the shell in its anonymous bundle and asserts that the anonymous launcher rejects spawns because its allowlist is empty.
  • system-chat.cue is the focused First Chat prototype proof. It starts a resident Chat endpoint service on the kernel singleton chat_endpoint, a resident bot participant, and the shell; make run-chat drives run "chat-client" with explicit StdIO plus the broker-issued chat endpoint grant, sends one line, and checks that the bot reply is printed by the foreground client.
  • system-adventure.cue is the focused adventure prototype proof. It keeps adventure out of shell builtins and drives run "adventure-client" through explicit StdIO, adventure, and chat endpoint grants. See the Aurelian Frontier (proof slice) page for the current mission, commands, and transcript coverage.
  • system-paperclips.cue is the focused clean-room Paperclips-style terminal demo proof. As of commit 532207c1 (2026-04-30 20:54 UTC), it boots Paperclips server services plus a terminal client. The server owns generated content, game state, regular timer cadence, unlock checks, game-rule mutation, and proof-command gating; the terminal client receives explicit StdIO plus a PaperclipsGame endpoint and renders server-mode help from the server’s structured command specs. Commit e9ae4e97 (2026-04-30 22:02 UTC) adds structured plain-status snapshots, so server-mode plain status is rendered from the server’s structured PaperclipsStatusSnapshot. Commit 32462e9f (2026-04-30 22:32 UTC) adds the structured project-list follow-up: server-provided project entries for terminal-rendered plain projects, while project <id> remains a raw text request that mutates server-owned game state. make run-paperclips first proves that normal server authority rejects run <ms> fast-forward plus rejection of a forged proof_accelerator: @timer grant, then relaunches against the proof server endpoint with the focused manifest’s explicit proof_accelerator cap for transcript acceleration. The accelerated proof drives one-at-a-time manual production, locked-purchase and insufficient-funds refusal output, bulk-manual rejection, high-price zero-demand sale refusal, no-wire manual production refusal, explicit sales, immediate repeat-sale cooldown refusal, repeatable marketing, autoclipper unlock, real-time automation, generated Cap’n Proto content loading, first project completion, scaled business-phase production, the design-search and forecast-engine project chain, survey-drones, the visible == autonomous phase == transition, then representative autonomous drone/factory scaling with local-matter conversion and additional clip production, mesh-coordination, seed-probes, the visible == cosmic phase == transition, one bounded probe interval with cosmic matter conversion, probe replication, and additional production, then asserts a compact status --json machine-readable status line and verifies final-conversion remains locked before clean process exit through the native shell. Active schema, content, rules, and smoke sources use clean-room Strategy internals, and host tests reject explicit zero-count purchases without mutating state. Host tests cover the one-real-time-hour non-completion property under a generous normal-play creativity upper bound.
  • demos/service-common/ holds the shared caller-session endpoint loop and chat actor bootstrap/polling helpers used by the chat/adventure resident services, chat bot, and adventure NPC processes. New shared endpoint loop code uses EndpointUserData; the old badge-named user-data alias remains only for compatibility while peer branches migrate. Shared event queues remain deferred until another service has queue needs matching chat history/inbox behavior.
  • system-spawn.cue remains the focused ProcessSpawner smoke for endpoint, IPC, VirtualMemory, Timer, ThreadControl, FrameAllocator cleanup, and hostile spawn inputs. make run-spawn asserts that the kernel boot-launches only the standalone init, that init validates BootPackage metadata, and that the init-owned manifest executor spawns and waits for every focused child service, including the timer-smoke monotonic now/sleep proof, timer-flood per-process Timer sleep quota proof, runtime-fs-base runtime-owned FS-base proof, single-thread-runtime VirtualMemory plus Timer runtime checkpoint, and thread-lifecycle in-process thread/park proof.

Code: tools/mkmanifest/, system.cue, system-chat.cue, system-adventure.cue, system-paperclips.cue, system-spawn.cue, demos/, capos-rt/.

Validation: cargo test-mkmanifest, make generated-code-check, make run-smoke, make run-chat, make run-adventure, make run-paperclips, make run-spawn.

Partially Implemented

Login Boot and Init-Owned Spawn

Default make run now uses the init-owned default manifest. The kernel validates the kernel-owned boot boundary, boot-launches standalone init, and leaves the service graph plus login/session/broker flow in userspace. Init starts the foreground capos-shell service, resident demo services, and the host-local remote-session CapSet gateway; make run forwards host-local TCP to guest port 2327 for the remote CapSet path only. The foreground shell mints its own anonymous UserSession on boot; login and setup commands drive CredentialStore/SessionManager/AuthorityBroker to upgrade the session in place. Local password login is username-aware on the ordinary foreground shell path, while durable multi-account credential storage remains future work. The plaintext Telnet gateway was only a focused make run-telnet / system-telnet.cue research demo. That target is retired after qemu-only kernel TCP listener removal, and the gateway demo, its manifest, and the kernel SocketTerminalSession shim are removed; use the in-guest login smokes for current shell coverage and rebuild any socket-backed terminal proof on the Phase C userspace network stack before using it as validation.

The focused init-owned spawn path remains under make run-spawn. There the kernel boot-launches init with Console, BootPackage, and ProcessSpawner. Parent endpoint facets used for later service-sourced imports are returned by ProcessSpawner during child spawn, not granted at boot. init performs metadata-only manifest validation, resolves kernel and service cap sources, spawns children through ProcessSpawner, records exports, waits for children, and reports failures through Console output. The QEMU target now asserts the single-init boot markers, the three-cap init bundle, BootPackage validation, child exit records, manifest child waits, spawn-loop completion, and clean halt.

Measurement startup now follows the same boundary. make run-measure uses a focused system-measure.cue manifest where the kernel boots standalone init with Console, BootPackage, and ProcessSpawner, and init spawns ring-nop with Console, FrameAllocator, the measurement-only NullCap, and the measurement-only ParkBench cap through ProcessSpawner grants. It also spawns thread-lifecycle with ThreadControl, ThreadSpawner, ParkSpace, and a measurement marker cap. The demos print compact versus generic park-shaped failed-wait/empty-wake cycle averages plus real ParkSpace blocked/resume cycle averages before the measure-feature kernel prints segmented dispatch counts, total cycles, and averages for SQE processing, validation, cap lookup, capnp decode, method body dispatch, CQE posting, and waiter wake/check. Kernel bootstrap now loads only initConfig.init and validates only the kernel-owned manifest boundary; mkmanifest and init own initConfig.services graph validation for focused BootPackage executor manifests.

SSH Shell Gateway

The SSH Shell Gateway proof targets are implemented, covering the authority prerequisites and fixture authentication path that precede an encrypted SSH transport. Bounded QEMU smokes exist for:

  • Host-key fixture signing (make run-ssh-host-key): a development-only non-production SshHostKey cap returns public metadata, signs bounded fixture exchange hashes for QEMU proof, fails wrong-algorithm requests closed, and does not leak the private host-key seed.
  • Authorized-key lookup (make run-ssh-authorized-key): a manifest-seeded AuthorizedKeyStore cap accepts configured ssh-ed25519 public keys mapped to seed-account principals, denies unknown, disabled, and unsupported-algorithm keys, and does not leak private key material.
  • Public-key session minting (make run-ssh-public-key-session, make run-ssh-public-key-auth): SessionManager.sshPublicKey rechecks configured key records, verifies a bounded ssh-ed25519 signature over fixture authentication bytes, mints a publicKey UserSession only after the signature succeeds, and logs stable audit reason codes for each denial path without leaking principal or profile metadata. UserSession.auditContext fails closed after logout through the same ensure_session_live guard as info().
  • Unsupported feature policy (make run-ssh-feature-policy): a capos-config::ssh_policy surface classifies password auth, exec requests, SFTP, direct-tcpip, agent/X11 forwarding, env import, and multiple session or shell channels into stable audit reason codes; all denied paths produce event=session result=denied reason=policy audit records.
  • Restricted shell launcher (make run-restricted-shell-launcher): a manifest-declared RestrictedShellLauncher cap launches only capos-shell, injects supplied terminal/session caps plus child-local stdio, rejects session/profile mismatch and kernel-sourced or dangerous pass-through grant attempts, and strips hidden process-supervision result caps.
  • Bounded terminal-host proof (retired): make run-ssh-gateway-terminal-host wired scoped TcpListenAuthority listen, authorized-key lookup, public-key UserSession minting, broker profile matching, socket-to-TerminalSession conversion, and restricted shell launch over a host-local plain TCP connection. It sat on the qemu-only kernel socket owner and the kernel SocketTerminalSession, both of which are retired with the userspace network-stack migration; the smoke now exits with a retirement diagnostic and a future terminal host must target the userspace network stack.

Encrypted SSH packet transport, OpenSSH-compatible key exchange and channel handling, full SSH userauth transcript validation, channel binding, TerminalSessionFromByteStream terminal-factory wiring, a terminal host over the userspace network stack, and a production OpenSSH harness remain open. The landed proofs use development/fixture key material; they are not a production SSH service and are not safe for non-loopback deployment.

Design: SSH Shell Gateway.

Code: kernel/src/cap/ssh_host_key.rs, kernel/src/cap/authorized_key_store.rs, kernel/src/cap/restricted_launcher.rs, capos-config/src/ (ssh_policy), demos/ssh-*/, tools/qemu-ssh-*-smoke.sh.

Validation: make run-ssh-host-key, make run-ssh-authorized-key, make run-ssh-public-key-session, make run-ssh-public-key-auth, make run-ssh-feature-policy.

Hardware and Networking

The hardware bring-up path has bounded ACPI RSDP/RSDT/XSDT, MADT, MCFG, DMAR, and IVRS diagnostics plus reusable PCI config-space access through legacy I/O ports and Q35 PCIe ECAM, and the x86 path programs masked MADT-backed I/O APIC routes for legacy IRQs while honoring source overrides. IOMMU reporting is policy-only: malformed DMAR/IVRS structures fail closed, DMAR DRHD include-all or single-hop PCI endpoint device-scope metadata can mark retained DMA-capable PCI functions as IOMMU-attached/covered; bridge and multi-hop scopes remain diagnostic-only until PCI topology traversal exists, and include-all fallback fails closed when retained DMAR coverage metadata is capped. Direct DMA remains blocked with zero trusted domains, and every retained DMA-capable prototype function requires bounce buffering. The current staged domain-policy proof also reports that future claimed DMA-capable devices use a device-manager-owned per-device domain or trusted sharing group, exported device addresses are IOVA-only, host physical addresses are not user-visible, remapping tables are not programmed, and production userspace hardware authority is still blocked. That blocked-direct-DMA admission decision now runs through the host-tested capos-lib::device_authority helper used for device-authority validation, so the PCI proof line and diagnostics mirrors share the same fail-closed labels for absent, malformed, unsupported, or retained-capped remapping metadata. Active device-manager DMAPool policy records also carry a software remapping-domain ledger staging record. The QEMU lifecycle/imported-live proofs bind it to the active record and matching handle with diagnostics-only static ACPI/PCI coverage, remapping_domain_owner=device-manager, remapping_domain_ready=false, remapping_tables=not-programmed, iova_export=disabled-future-only, direct_dma=blocked, and host_physical_user_visible=0; no remapping tables, direct-DMA trusted domains, host physical addresses, or IOVAs are exposed. Bounded manifest grants now exist for DMAPool, DeviceMmio, and Interrupt: DeviceMmio exposes bounded .info, read-only userspace .map / .unmap over boot-preseeded BAR pages, brokered read-only .read32 backed by the same boot-preseeded 64-page kernel mapping cache, and bounded brokered claimed-register .write32; Interrupt exposes bounded .info plus admission-only .wait, .acknowledge, .mask, and .unmask, and DMAPool reports conservative .info status and can mint eight fixed manager-attached bounce-buffer DMABuffer result caps via request-shaped allocateBuffer. Valid one-page bounce-buffer requests report requested bytes, allocated bytes, page count, and request labels; zero-size and over-bounce-buffer requests fail closed as dmapool-allocation-request-invalid before result-cap or page mutation. The same DMAPool.info result now exposes the attached manager record’s owner/pool labels, live buffer/page/byte counts, in-flight submissions, and committed/resident/unswappable/scrub-before-release flags. The bounded manifest smoke proves zero-live accounting before allocation, one, two, three, and eight live 4096-byte bounce pages while result DMABuffers are active, a ninth-allocation full-pool rejection, restoration to the existing four-buffer descriptor working set after freeing slots 4 through 7, three-live accounting after freeing one descriptor-test slot, slot-0 reuse while the other three descriptor-test slots remain live, and zero-live again after all typed DMABuffer releases complete. That DMABuffer now supports typed .freeBuffer, single-page userspace bounce-buffer .map and .unmap, and bounded manager-accounted .submitDescriptor / .completeDescriptor. The .map path validates the live bounce-buffer epoch, accepts readable full-page requests, maps the manager-owned bounce page into the caller’s userspace address space, returns userspace-bounce-buffer-mapped, single-page-bounce-buffer, user-vma-mapped, and a nonzero userspace address, rejects zero-size, partial-page/out-of-range, and executable/unknown protections with typed range/protection labels, and still reports real_dma_mapping=not-programmed, direct DMA blocked, and no host physical address exposure. The .unmap path validates the live buffer record first, removes only that cap-owned borrowed VMA for the caller process, reports a typed no-op when no mapping is present, and leaves page lifetime plus pool and descriptor accounting unchanged. The descriptor paths carry request labels plus the bounded proof counts (queue_count=4, descriptor_count=8, buffer_bytes=4096): valid submits return manager-inflight-recorded and raise the attached DMAPool.info live_inflight count to 1, valid completions return manager-inflight-completed and restore it to 0, and valid completions with no outstanding submission return dmabuffer-no-inflight-submission with side-effect-blocked. The manager record also tracks the single live descriptor identity: duplicate submits for that live queue/descriptor return dmabuffer-descriptor-already-inflight, and valid-shaped completions for a different descriptor return dmabuffer-inflight-descriptor-mismatch; both paths leave live_inflight=1 until the matching completion arrives. Out-of-range queues/descriptors, zero submit lengths, submit lengths beyond the bounce buffer, and completion lengths beyond the bounce buffer still fail closed as dmabuffer-descriptor-request-invalid without mutating the counter. The default manager-accounting descriptor path also preflights result serialization before mutating accounting, and cap-table release drains bounded in-flight accounting before detaching so a removed userspace cap cannot strand the bounce buffer. The selected provider-TX exception for make run-ddf-provider-consumer is narrower and runtime-visible: queue 1 submits may publish the selected eight-entry TX queue depth, descriptors 0..7, into the existing kernel-owned virtio-net TX ring after the same DMABuffer authority, bounce-scrub, and live notify_mmio policy gates; that selected path then rings exactly one notify doorbell per accepted provider descriptor and lets DMABuffer.completeDescriptor consume the stored software descriptor generation from the real TX used ring before clearing each manager in-flight record. Live tx_interrupt.wait calls over that selected route can observe the ordered bounded completion events, and provider tx_interrupt release proves bounded teardown by draining seven incomplete descriptor handoffs or retiring seven delivered-but-unacked completion events with no pending provider waiters. Wrong queue, stale DMABuffer, stale notify policy, inflight publication, duplicate completion, and stale tx_interrupt issue paths still fail closed before their guarded side effects. DMABuffer.freeBuffer, cap release, driver-crash, reset-disable, and drop cleanup still revoke any remaining user mapping before scrub/free. None of these methods program direct DMA, publish arbitrary CQ entries, transfer full virtio-net ownership, or expose host physical addresses. Allocations beyond the eight fixed bounce-buffer slots, DMA map/submit/complete side effects outside the selected provider-TX proof, writable userspace BAR mappings, arbitrary MMIO writes and doorbells, unbrokered register access, blocking IRQ wait beyond the bounded selected-route completion waiter, real hardware acknowledgement, hardware IRQ ownership, hardware mask/unmask, hardware MSI/MSI-X programming, and general IRQ delivery remain blocked; parent-first DMAPool release defers until all live DMABuffer slots release, or successful DMABuffer driver-crash/reset-disable cleanup frees the remaining bounce pages and completes the staged zero-live pool detach. make run-hardware-grant-cycle proves sequential DeviceMmio/Interrupt skeleton grants can release and reacquire fresh DeviceMmio mapping generations while the Interrupt grant retains its source generation and refreshes only the route generation, and its read-only HardwareAuditLog.snapshot check decodes those two-cycle audit records through the current volatile unsigned audit surface. make run-hardware-audit-interrupt-waiter also decodes recent boot-time DmaBuffer, DmaPool, and Interrupt driver-crash / reset-disable lifecycle records through that typed volatile snapshot path, and its cursor snapshot requests from the first older retained DeviceMmio lifecycle sequence to decode rows outside the default latest 16-record tail. make run-hardware-audit also proves below-oldest cursor clamping and past-end empty cursor metadata on the overflowed volatile ring, and the QEMU-only local-ring proof now checks those same cursor edges without mutating live audit records. Unsafe retained metadata fails closed, and prototype devices remain kernel-owned bounce-buffer-only. The device-manager DMAPool attachment path now stores that explicit bounce-buffer policy on the attached pool record and the QEMU lifecycle/imported-live proofs read it through the active manager record and matching DmaPoolHandle; no new cross-manager lock was added, so the existing PCI_DEVICE_MANAGER before DEVICE_INTERRUPT_ROUTES order remains unchanged. PCI memory-BAR subregions are validated and mapped through a shared kernel helper before in-kernel drivers use device MMIO, and PCI capability walking reports non-programming MSI/MSI-X metadata for the QEMU virtio-net function. make run-pci-nvme now applies the same metadata-only PCI path to a QEMU NVMe controller: class/subclass/programming-interface, memory BAR, capability, and MSI-X metadata are visible, while userspace device authority, DMAPool, DeviceMmio, Interrupt, controller init, admin queues, I/O queues, MMIO doorbells, and direct DMA remain not started or blocked. make run-diagnostics now boots a feature-gated COM1 early-boot diagnostics prompt before capability, scheduler, timer, manifest, or userspace startup, with bounded commands for status, CPU, memory, ACPI, PCI, IRQ, timers, devices, logs, reboot placeholder, and halt. The ACPI and PCI diagnostics commands now also print bounded MADT/MCFG/DMAR/IVRS record details, PCI function/config-header summaries, BAR summaries, capability counts, MSI/MSI-X summaries, and bounded PCI DMA-attachment policy counters/details when present; devices reports PCI totals plus network/storage/display/bridge class counts and mirrors the current DMA-domain policy without owner identity: direct DMA is blocked, trusted-domain and ready-domain counts are zero, remapping tables are not programmed, future exported device addresses are IOVA-only, userspace device authority is not started, and prototype devices remain kernel-owned bounce-buffer-only. The current runtime-state diagnostics slice also attaches QEMU virtio-net plus the second virtio-rng proof device before the prompt and reports virtio_net=ready, bounded RX/TX virtqueue ring state, MSI-X route/vector/counter state, live buffer state, the kernel-owned DMA owner/pool ledger, and device interrupt route aggregates plus per-route delivery counters. Future driver extensions, production teardown/lifecycle diagnostics, IOMMU remapping table programming, production DMAPool, and userspace driver authority remain planned. The QEMU virtio-net path has a make run-net boot target, modern virtio PCI transport discovery for the common, notify, ISR, and device-specific MMIO regions, feature negotiation, and RX/TX split-virtqueue initialization, a TX descriptor completion proof, minimal Ethernet ARP resolution, and ICMP echo validation against the QEMU user-mode gateway. It no longer wraps the virtio driver in a kernel smoltcp interface or performs a kernel TCP HTTP GET; the remaining run-net evidence is a lower-layer QEMU fixture, and TCP/UDP socket proof lives under the Phase C userspace network-stack gates. QEMU currently exposes a transitional 1af4:1000 virtio-net function with modern vendor capabilities; capOS accepts that shape only through the modern capability layout and now selects a usable MSI-X capability for config/RX/TX table entries, records kernel-owned MSI-X sources for config/RX/TX in the device interrupt dispatch table, programs those entries through the typed PCI MSI-X table helper using a bounded first-fit LAPIC device MSI vector pool, lets the in-kernel virtio-net owner claim and unmask only its routes, assigns the virtio common/config and queue MSI-X vector fields, and keeps descriptor, ARP, and ICMP fixture evidence in make run-net after the kernel L4 owner is retired. Device-autonomous virtio-net MSI-X delivery is covered by the dedicated userspace-provider gates. A masked lifecycle probe on the unused virtio-net MSI-X table entry proves claimed-route reassignment, stale old-route rejection, old-vector unregistered delivery, reassigned-vector masked delivery, unsupported-vector delivery, and release before the live routes are registered. The QEMU virtio-rng second-device metadata path also exercises the device-manager ownership model, the claimed MSI-X route handoff, and a bounded teardown-trigger contract for cap release, process exit, driver crash, reset/disable escalation, interrupt waiter, future DeviceMmio, and future DMAPool trigger labels through the same claim/transfer/revoke/release transaction. It also proves a bounded manager-owned DeviceMmio record lifecycle: active-owner attach, stale and owner-mismatch rejection, duplicate attach rejection, active RingDoorbell validation through capos-lib::device_authority, binding to the first decoded PCI memory BAR region from the tested PciDevice, region_source=pci-decoded-memory-bar, region_bound_to_manager=true, bar_present=true, bar_memory=true, bar_base, bar_length, fail-closed wrong-BDF, wrong-BAR, and zero-length region metadata as devicemmio-region-invalid, no invalid mapping created, negative side-effect blocking, stale-after-revoke rejection as devicemmio-stale-handle with stale-owner-generation and side-effect-blocked, the RevokingHandles -> MmioRevoked transition blocking while attached, bounded detach, and no userspace handle, real BAR mapping, or doorbell write. For the bounded userspace DeviceMmio.map path, the shared pure capos-lib::device_authority validator now accepts only page-aligned in-BAR read-only requests and denies writable, executable, unknown-protection, zero-size, unaligned, offset-size overflow, and out-of-BAR requests before the kernel maps anything. Accepted map requests install a caller-owned borrowed read-only userspace VMA over boot-preseeded BAR pages only, and DeviceMmio.unmap removes that borrowed VMA after checking the active manager record and caller address space. Duplicate map, second unmap, stale release, drop, driver-crash, and reset-disable paths all preserve fail-closed VMA revocation and no-side-effect labels. For the brokered userspace DeviceMmio.read32 path, the same authority layer validates active handle/state/policy and in-BAR dword-aligned offsets before a single kernel-side volatile read from the boot-preseeded cache. The cap call path never installs new kernel mappings, returns typed mmio-read32-range-invalid denials for unaligned, overflowing, and out-of-BAR offsets, reports register_read=performed only for accepted reads, and fails closed after cap release. For the brokered userspace DeviceMmio.write32 path, the kernel runs the same manager-attached identity, policy, and range checks before one volatile dword write through the boot-preseeded kernel MMIO mapping cache, and it accepts only the single PCI MSI-X metadata-derived provider-scoped masked vector-control claim. The focused proof uses that idempotent dword on the virtio-rng BAR, confirms it through both read32 and the read-only userspace VMA, and proves an unclaimed message-address dword write leaves the original value unchanged. Unclaimed, unaligned, overflowing, and out-of-BAR calls report typed blocked labels before any write; stale or released handles fail closed before a write and do not return a write32 result payload. The same path proves a bounded zero-live DMAPool record lifecycle: active-owner attach, stale and owner-mismatch rejection, duplicate attach rejection, generation invalidation on revocation, DmaMappingsRemoved blocking while attached, teardown detach, and terminal release. It now also issues and records a bounded manager-attached DMA buffer handle under that attached pool, validates active SubmitDescriptor through the pure DMA-buffer validator, and records stale-after-revoke (stale-owner-generation), freed-buffer (freed), and reused-slot (stale-slot-generation) rejection with side-effect-blocked; the attached buffer record now also blocks zero-live pool teardown as dmapool-buffer-attached, rejects a stale same-slot proof-scoped FreeBuffer as dmabuffer-stale-handle with stale-slot-generation and side-effect-blocked while preserving the manager-owned buffer record, then validates an active FreeBuffer and manager-owned buffer-record detach as ok, after which the existing DMAPool detach succeeds. This still exposes no userspace handle and attempts no real DMA. The pending IRQ token path now also delegates the source/route-generation, masked, unregistered, invalid-owner, and malformed identity decision to a host-tested capos-lib::device_authority validator after snapshotting the live dispatch slot, while preserving the existing stale-pending-irq-* QEMU labels. This remains validator/adapter evidence, not production userspace interrupt waiter authority. The virtio-net smoke now also derives an imported live-accounting DMAPool record from the authoritative kernel-owned DMA ledger, records live buffer/page count, live bytes, in-flight submissions, committed/resident/unswappable flags, and scrub-before-release policy, and proves both teardown detach and DmaMappingsRemoved fail closed while that ledger is live. The live proof now consumes the device_dma teardown-evidence API, records the expected authoritative-ledger-live block with matching imported live accounting, and defers completion because this slice has no authoritative zero-live/scrubbed evidence for the live virtio-net ledger and does not attempt real DMA teardown, scrub, DmaMappingsRemoved, terminal Dead, or release for the live virtio-net record. A separate scratch-ledger proof reaches authoritative-ledger-zero-live only after both quiesce and scrub markers are set, without touching live virtio-net DMA. Another scratch proof covers stale DMA page handles by generation-tagging same-phys reuse and rejecting stale, wrong-queue, wrong-label, and duplicate-free attempts without mutating the active ledger. Userspace DMAPool/DeviceMmio/Interrupt authority, real lifecycle hook plumbing, real page quiesce/scrub/release cleanup, and broader driver interrupt dispatch remain planned. The kernel negotiates VIRTIO_F_VERSION_1 plus MAC when safe and VIRTIO_NET_F_MRG_RXBUF for QEMU’s merged-buffer virtio-net header, maps the virtio MMIO regions after kernel paging is live, allocates kernel-owned DMA pages for RX/TX descriptor, available, used rings, RX packet buffers, and one-shot TX buffers, submits a descriptor proof frame, sends an ARP request from 10.0.2.15 to 10.0.2.2, and observes the ARP reply in make run-net. Those current DMA pages now pass through a bounded kernel-owned device_dma pool ledger that proves live pool bytes, page counts, page-rounded MMIO mapping bytes, config/RX/TX interrupt holds, RX/TX ring depths, and RX/TX descriptor submission/completion accounting while no userspace DMA/MMIO/interrupt handles are exposed. The net smoke also proves the current kernel-owned budget/OOM policy with a scratch ledger: page and byte allocation over budget, overlarge queue depth, duplicate and over-budget MMIO holds, MMIO byte over budget, duplicate and over-budget interrupt holds, and descriptor submission beyond queue depth all fail closed while the live virtio-net ledger still validates normally, and the live device-manager record proof is derived from that same ledger without zeroing the copied record as a stand-in for real cleanup. The device-manager DMAPool record now also carries that budget profile, and the lifecycle/imported-live proofs read it through the active manager record plus matching DmaPoolHandle before checking zero-live accounting or live aggregate in-flight accounting against derived total budgets. The same scratch-proof pattern now covers zero-live teardown evidence and stale DMA page handles. Production userspace DMAPool, DeviceMmio, and Interrupt handles, production userspace DMA-buffer handles, real page cleanup/reuse, real DeviceMmio mapping objects, cache attributes/write policy enforcement, hostile stale-MMIO/DMA smokes, S.11.2 hostile smokes, and real doorbell writes remain unavailable; the current malformed-region and manager-attached buffer proofs are only bounded fail-closed metadata evidence in the manager proof path. The same smoke sends an IPv4 ICMP echo request to 10.0.2.2, validates the echo reply identifier, sequence, payload, IPv4/ICMP checksums, and addresses, and prints an icmp echo ok proof line. The former kernel smoltcp TCP HTTP smoke, scheduler-polled smoltcp runtime, Phase B NetworkManager/TcpListener/TcpSocket qemu-only cap objects, socket-backed Telnet terminal handoff, and POSIX DNS UdpSocket smoke are retired. The kernel no longer depends on smoltcp; qemu-only kernel TCP/UDP socket entry points fail closed; and the corresponding Make targets exit before QEMU with retirement diagnostics. Phase C (Networking Part 3) moves TCP/IP behavior into a userspace network stack process and keeps the kernel production surface focused on DMAPool/DeviceMmio/Interrupt device capabilities. The local serve-from-userspace proof now boots a non-qemu cloudboot manifest where a userspace smoltcp service grants an application client a TcpListenAuthority and serves TcpListener/TcpSocket caps for one hostfwd TCP round trip. A later local DHCP/IPv4 proof now lands the first lease/default-route/ARP configuration evidence on that userspace stack. Local bounded ICMPv4 Echo Reply diagnostics are also proved through a local cloudboot manifest, but remain diagnostic-only and outside the Web UI readiness ladder. For the selected GCE Self-Hosted Web UI milestone, the evidence order is local served TcpListenAuthority, local DHCP/IPv4, local Web UI L4, private GCE reachability, then the separately authorized public ingress/TLS proof. The legacy kernel socket owner no longer accepts non-qemu production manifest grants; qemu-only fixtures keep their explicit kernel socket sources until the broader Phase C exit cleanup removes that path.

Code: kernel/src/acpi.rs, kernel/src/diagnostics.rs, kernel/src/pci.rs, kernel/src/device_interrupt.rs, kernel/src/device_manager/, kernel/src/device_dma.rs, kernel/src/virtio.rs, kernel/src/cap/network.rs, kernel/src/cap/ring.rs, kernel/src/sched.rs, kernel/src/mem/paging.rs, kernel/src/arch/x86_64/pci_config.rs, Makefile, tools/qemu-diagnostics-smoke.sh, tools/qemu-iommu-acpi-smoke.sh, tools/qemu-net-smoke.sh, tools/qemu-net-harness.sh.

Validation: make run-diagnostics, make run-iommu-acpi, make run-net, make qemu-net-harness.

Security and Verification Track

The repo has Miri, proptest, fuzz, Loom, Kani, generated-code, dependency policy, trusted-build-input, panic-surface, and DMA-isolation work. CI now runs a bounded Kani gate for capos-lib bitmap, cap-table stale-handle, transfer preflight, transfer rollback split between source-visible rollback and destination-ledger restoration, and frame-grant accounting invariants. The heavier prepare-copy to provisional-destination seam proof passed in the high-memory make kani-lib-full Cloud Build gate, but coverage is not complete for every trust boundary.

References: Trusted Build Inputs, Panic Surface Inventory, DMA Isolation, and Security and Verification Proposal.

Future Work

Future architecture includes service restart policy, capability-scoped system monitoring, notification objects, promise pipelining, service-facing SharedBuffer APIs on top of the MemoryObject substrate, scheduling-context donation, session quotas, SMP, storage and naming, userspace networking, cloud boot support, user identity, policy enforcement, multi-front-end terminal hosts, richer native command surfaces, and broader language/runtime support.

Design references: