Hardware, Boot, And Storage Backlog
Detailed decompositions for hardware, boot packaging, block devices, and local
storage. docs/tasks/README.md links here but should not inline these subtasks.
This is a forward-decomposition reservoir: it carries the open frontier
(explicit DDF follow-up tasks, cloud/network next gaps, and the DMA-authority
invariants that constrain new slices). Landed proof-by-proof chronology lives in
docs/tasks/done/,
docs/changelog.md, and git history; this file keeps only one-line “Landed:”
pointers to it where a reader needs to know a capability exists.
DDF Dispatch Budget
Device Driver Foundation was the previously selected milestone and its
production-authority closeout is recorded for the current brokered-bounce path.
Future DDF slices should not reopen the retained review finding as a generic
blocker; they should advance explicit follow-up tasks such as
direct-remapping/vIOMMU production hardware support, device-autonomous MSI-X
delivery, broader writable-DeviceMmio region selection, or follow-on
provider/device variants. Harness-only updates should protect one of those
authority steps rather than add another standalone proof layer.
Landed: the IOMMU/remapping groundwork and its disabled scaffold (DRHD/source/
domain records, MMIO-status diagnostics, disabled IOVA ledger, mapping-lifecycle
preflight) through the bounded QEMU Intel path; see the IOMMU section below and
docs/tasks/done/2026-05-12/ .. done/2026-05-23/.
docs/proposals/device-manager-refactor-proposal.md core refactor has landed:
the device manager is the kernel/src/device_manager/ module tree. Remaining
refactor work is optional risk reduction only: run behavior-preserving
registry, ledger, or proof-internal splits when they reduce the risk of
upcoming DeviceMmio, Interrupt, or DMAPool authority work or unblock that
work’s review. Those slices remain subordinate to behavior-moving DDF authority
slices and to scheduler SMP/nohz prerequisites.
Landed local follow-up: multi-PRP brokered NVMe BlockDevice windows
(ddf-nvme-multiprp-blockdevice-window-local-proof).
Landed local follow-up: the read-cap reply-scratch fail-closed clamp
(storage-file-read-reply-scratch-clamp).
Landed local follow-up: DeviceMmio map/unmap stale-generation proof
(ddf-devicemmio-map-unmap-stale-generation-local-proof).
Landed local follow-up: production DeviceMmio teardown transaction manager
hold proof
(ddf-devicemmio-production-teardown-transaction-local-proof).
Landed local follow-up: production DMAPool buffer lifecycle over the manager
ledger
(ddf-dmapool-production-buffer-lifecycle-local-proof).
Landed local follow-up: manager-owned DMABuffer free/reuse generation
(ddf-dmabuffer-free-reuse-generation-local-proof).
Landed local follow-up: Interrupt waiter reset-generation
(ddf-interrupt-waiter-reset-generation-local-proof).
Landed local follow-up: production Interrupt routed waiter / deferred-EOI
lifecycle over the manager ledger
(ddf-interrupt-production-waiter-lifecycle-local-proof).
Landed local follow-up: provider IRQ/MSI stale-notification hostile lifecycle
proof
(ddf-provider-interrupt-stale-notification-hostile-local-proof).
Landed closeout: the retained DDF production-authority review finding is closed
by ddf-production-authority-closeout.
Keep direct-remapping/vIOMMU and broad umbrella tasks blocked until their named
gates are actually satisfied.
Growing the inline AttachedDmaPoolRecord::proof_buffers slot count beyond
three slots is blocked on a prerequisite refactor: boot-time proof emissions
pass AttachedDmaPoolRecord by value through nested paths starting at
validate_dmapool_budget_policy_for_record
(kernel/src/device_manager/dma_pool.rs) and the descriptor lifecycle
emissions in kernel/src/device_manager/proofs.rs. A direct slot bump to four
double-faulted make run-net with the BSP boot stack exhausted. Prerequisite:
ddf-attached-dmapool-record-by-ref
is done; any future proof-buffer growth should verify it still avoids by-value
stack expansion before increasing the inline slot count.
Device Manager Refactor Track
The refactor keeps the kernel device manager as the single authoritative
ledger for claimed devices. It must preserve the same ownership transactions
across DMAPool, DMABuffer, DeviceMmio, and Interrupt; it should not
create independent managers or move authority decisions into userspace.
Landed: proof split, handles/errors split, domain modules
(mmio.rs/dma_pool.rs/dma_buffer.rs/interrupt.rs), and the
transaction-helper cleanup, all while PciDeviceRecord remains the aggregate
ledger owner. See
ddf-device-manager-proof-split-closeout,
ddf-device-manager-handles-errors-split,
ddf-device-manager-domain-modules,
and
ddf-device-manager-transaction-helper-cleanup.
Open:
- Optional follow-up splits. Further registry, ledger, or proof-internal
splits may run when they are behavior-preserving and reduce near-term DDF
review risk. They must preserve cap semantics, audit labels, proof
labels, QEMU smoke output, lock ordering, and the single aggregate
PciDeviceRecord ownership ledger.
Conflict guidance: treat this as part of the DDF kernel-core serial surface.
It owns kernel/src/device_manager/ and overlaps with any DDF slice touching
kernel/src/cap/device_mmio.rs, kernel/src/cap/interrupt.rs,
kernel/src/cap/dma_pool.rs, kernel/src/cap/dma_buffer.rs,
kernel/src/device_dma.rs, kernel/src/device_interrupt.rs, or DDF QEMU smoke
assertions. Do not run it in parallel with scheduler SMP/nohz kernel slices
that need kernel/src/process.rs or kernel/src/sched.rs review capacity if
those prerequisites are the selected blocker.
Bootable Disk Image
Landed (complete track): make image raw hybrid BIOS+UEFI disk image, make run-disk (OVMF) and make run-disk-bios boot proofs, and provider packaging
helpers (make package-cloud-image / package-gcp-image / package-aws-image)
plus the import notes in docs/backlog/cloud-image-import.md. See
docs/tasks/done/ (disk-image-*, closed 2026-05-25). Cloud NIC/storage driver
ownership remains a separate, blocked track below.
Serial Diagnostics Console
Visible outcome: before cloud NIC/storage drivers are trusted, a cloud VM can boot to a COM1 diagnostics prompt and expose enough state to debug ACPI, PCI, interrupt, DMA, storage, and NIC bring-up through the provider serial console.
Landed: the COM1 diagnostics mode (no network/disk), the bounded command set
(help/status/reboot/halt/cpu/mem/acpi/pci/irq/timers/
devices/logs; reboot is a recognized placeholder), the ACPI/PCI and
virtio-net/DMA-ledger/interrupt-route dump slices, and scripted QEMU coverage.
Open:
- Keep the serial path for command/control and bounded diagnostics only. Do not require large binary upload, in-place kernel replacement, or high-volume tracing over provider serial consoles.
ACPI And PCIe Discovery
Landed: Limine RSDP map, MADT LAPIC/I/O APIC enumeration, and MCFG parse with PCIe ECAM config-space access beside legacy QEMU I/O-port access.
Interrupt Infrastructure
Depends on ACPI and SMP Phase C LAPIC timer/IPI.
The MSI-X proof is kernel-owned: virtio-net config/RX/TX sources are recorded in
the device interrupt registry against a bounded first-fit LAPIC device MSI
vector pool, programmed through the typed PCI MSI-X table helper, claimed and
unmasked by the in-kernel virtio-net owner, assigned to virtio vector registers,
and proved by the TX source’s dispatch counter. A metadata-only QEMU
virtio-rng function reuses the same path with a distinct claimed-masked owner.
That virtio-rng function is a QEMU-only proof fixture, not a production
driver and backs no userspace-facing capability (see
virtio-rng); the entropy service
is the separate RDRAND-backed EntropySource cap.
Legacy I/O APIC routes have a bounded QEMU proof through the same registry.
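The bounded first-fit vector pool described above can be illustrated with a minimal sketch. The type names, the four-slot capacity, and the error variants are hypothetical and are not taken from the capOS kernel; the sketch only shows first-fit allocation, an exhaustion result, and owner-checked release.

```rust
/// Hypothetical bounded pool of device MSI vectors. Names, capacity, and the
/// base vector are illustrative assumptions, not the kernel's definitions.
pub struct VectorPool {
    base: u8,                 // first vector number covered by the pool
    owners: [Option<u32>; 4], // slot index -> owning interrupt source id
}

#[derive(Debug, PartialEq, Eq)]
pub enum PoolError {
    Exhausted,
    NotOwner,
}

impl VectorPool {
    pub const fn new(base: u8) -> Self {
        Self { base, owners: [None; 4] }
    }

    /// First fit: hand out the lowest free vector, or fail when the pool is full.
    pub fn allocate(&mut self, source: u32) -> Result<u8, PoolError> {
        for (i, slot) in self.owners.iter_mut().enumerate() {
            if slot.is_none() {
                *slot = Some(source);
                return Ok(self.base + i as u8);
            }
        }
        Err(PoolError::Exhausted)
    }

    /// Release a vector only if `source` is still its recorded owner.
    pub fn release(&mut self, source: u32, vector: u8) -> Result<(), PoolError> {
        let idx = vector
            .checked_sub(self.base)
            .map(usize::from)
            .ok_or(PoolError::NotOwner)?;
        match self.owners.get_mut(idx) {
            Some(slot) if *slot == Some(source) => {
                *slot = None;
                Ok(())
            }
            _ => Err(PoolError::NotOwner),
        }
    }
}
```

Returning an explicit Exhausted error, rather than reusing a live vector, keeps the exhaustion policy visible to the caller.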
Landed (kernel-side proof evidence, docs/tasks/done/2026-05-* / done/2026/;
also make run-net, make run-interrupt-grant, make run-hardware-audit*):
masked I/O APIC routing foundation, MSI/MSI-X capability discovery, the static
and registry-backed virtio-net source-route proofs, the device MSI vector pool +
exhaustion policy, claimed-route lifecycle / vector reassignment / stale-route
rejection, driver-owned mask/unmask, the second-device (virtio-rng) proof, the
first device-manager ownership and interrupt-source handoff proofs, the bounded
teardown-trigger contract (seven object-backed rows), cap-specific
release/process-exit/driver-crash/reset-disable/interrupt-waiter teardown hooks
for DeviceMmioCap/InterruptCap/DmaPoolCap/DmaBufferCap, the read-side
HardwareAuditLog.snapshot coverage, pending-IRQ token validation through
capos-lib::device_authority, and bounded Interrupt
wait/acknowledge/mask/unmask admission promoted to bounded route-state
control plus one manager-grant-source routed waiter / deferred-EOI lifecycle
proof (make run-interrupt-grant).
Open:
- Continue real interrupt-source teardown beyond the manager-grant-source routed waiter proof: provider-driver IRQ/MSI waiters now have a local hostile stale-notification proof for reset/release/provider-death/waiter-cancel boundaries, but broader process-exit/driver-crash/reset-disable smoke coverage must keep using the proven ownership lifecycle rather than a separate route cleanup path.
- Expose userspace Interrupt authority only after source ownership, generation checks, broader stale-notification lifecycle wiring, and the S.11.2 hostile IRQ smokes are implemented.
- Add a selected-mode x2APIC QEMU proof over the landed x2APIC MSR backend (kernel/src/arch/x86_64/lapic.rs): make run-interrupt-grant-x2apic boots with -cpu qemu64,+smep,+smap,+rdrand,+x2apic, asserts LapicMode::X2Apic, and reuses the routed Interrupt.wait/Interrupt.acknowledge proof. This remains a bounded local proof, not a high-core hardware readiness claim.
PCI/PCIe Infrastructure
Promotes PCI enumeration from a networking substep to a reusable subsystem consumed by all device drivers.
Landed: PCI config access via legacy I/O ports and PCIe ECAM, the ECAM function
mapping cache/ledger, full Q35 bus enumeration (scanned_buses=256), BAR
parsing + reusable kernel MMIO subregion mapping, MSI/MSI-X metadata discovery,
the second-device (virtio-rng) PCI proof, and the metadata-only QEMU NVMe PCI
proof (make run-pci-nvme). See docs/tasks/done/.
NVMe userspace-bind chain (forward-relevant; landed steps with their successor gaps preserved):
- Landed: the Model B kernel on-notify DMA validator (nvme-doorbell-dma-validator, kernel/src/cap/nvme_doorbell_validator.rs, validate_doorbell_scan/completion_wakes_waiter): provider-writes / kernel-validates, fails closed outside the owner’s granted DMA window. Synthetic owner windows stand in for the live grant ledger; wiring the validator into a live NVMe DeviceMmio doorbell claim is valid only on a verified direct-remapping/vIOMMU or synthetic-address lane. The current no-IOMMU QEMU/GCP lane must use brokered queue-base/PRP materialization instead. Design: docs/proposals/nvme-model-b-doorbell-dma-validator.md; provenance: docs/devices/nvme.md; reconciliation: docs/dma-isolation-design.md (Provider-Written Addresses And No-IOMMU Brokered Bounce).
- Landed: the read-only userspace NVMe bind (nvme-bind-claimed-mmio-read), userspace NVMe controller reset (nvme-controller-reset-selected-write, CC-scoped fail-closed selected write), and the no-IOMMU brokered controller enable (DeviceMmio.brokeredNvmeControllerEnable, schema @6; kernel-authored AQA/ASQ/ACQ from the live DMAPool ledger, no provider-supplied CC bits or host-physical/device-visible address). Proof make run-pci-nvme; provenance docs/devices/nvme.md §§5-6.
- The provider-written Model B enable (nvme-userspace-bind-and-controller-bringup) remains a separate direct-remapping/vIOMMU lane (still open / blocked).
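The "fails closed outside the owner’s granted DMA window" rule for provider-written addresses can be sketched as a pure range check. This is a minimal illustration under assumed names (DmaWindow, Rejected, validate_in_window); it is not the signature of validate_doorbell_scan, which this document does not show.

```rust
/// Hypothetical owner DMA window covering [base, base + len) in
/// device-visible address space. All names here are illustrative assumptions.
#[derive(Clone, Copy)]
pub struct DmaWindow {
    pub base: u64,
    pub len: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Rejected {
    Empty,
    Overflow,
    OutsideWindow,
}

/// Accept a provider-written (addr, len) pair only when the whole range lies
/// inside the granted window. Every other case is an explicit rejection.
pub fn validate_in_window(w: DmaWindow, addr: u64, len: u64) -> Result<(), Rejected> {
    if len == 0 {
        return Err(Rejected::Empty);
    }
    // Checked arithmetic so a wrapping end address can never pass the bounds test.
    let end = addr.checked_add(len).ok_or(Rejected::Overflow)?;
    let window_end = w.base.checked_add(w.len).ok_or(Rejected::Overflow)?;
    if addr >= w.base && end <= window_end {
        Ok(())
    } else {
        Err(Rejected::OutsideWindow)
    }
}
```

The checked additions matter: without them an address near u64::MAX could wrap to a small end value and slip past the upper bound comparison.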
Open:
- Add userspace DeviceMmio authority and ownership boundaries for out-of-kernel drivers only after the device-manager and DMAPool gates below are in place.
- Extend beyond metadata-only discovery to virtio and NVMe driver binding as those reusable driver paths land.
Device Authority And Userspace Driver Gate
Ordered after the generic MSI/MSI-X dispatch table and second-device proof.
The current brokered-bounce provider paths have landed their local/GCE evidence;
future direct-remapping/vIOMMU, provider-written-address, hostile-hardware, or
broader device-owner paths remain gated by the selected backend contract and
Security Verification Track S.11.2 in docs/dma-isolation-design.md.
DMA authority invariants (settled; these constrain every new slice — do not
weaken them). Per docs/dma-isolation-design.md (accepted): backend selection
is a runtime, fail-closed kernel decision — direct IOMMU remapping only when a
probe verifies usable hardware, otherwise kernel-owned bounce buffers. On the
no-IOMMU lane the manager is the single owner of every bounce page’s
host-physical address and IOVA: host_physical_user_visible=0,
direct_dma=blocked, iova_export=disabled-future-only, real DMA
not-attempted. Pool/buffer/handle lifecycle is generation-checked and
fail-closed on stale/freed/wrong-owner/wrong-state; pages stay committed,
resident, and unswappable while device-visible and are scrubbed before release;
quiesce + scrub precede free; stale completions and stale IRQs after reset must
not wake a waiter or mutate accounting. The device-manager ledger is the single
record of DMA pool bytes, buffer count, descriptor/ring depth, page-rounded MMIO
mappings, interrupt holds, in-flight DMA submissions/completions, ownership
generations, budget/OOM policy, and teardown state.
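As an illustration of the generation-checked, fail-closed handle rule in the invariants above, here is a minimal sketch. The Slot, Handle, and HandleError types are hypothetical, and the quiesce and scrub steps that must precede a real free are deliberately left out.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum SlotState {
    Free,
    Live,
}

#[derive(Debug, PartialEq, Eq)]
pub enum HandleError {
    Stale,
    WrongOwner,
    WrongState,
}

/// Hypothetical ledger slot. Field names are assumptions for illustration.
pub struct Slot {
    pub generation: u32,
    pub owner: u32,
    pub state: SlotState,
}

/// Hypothetical handle carrying the generation it was minted under.
#[derive(Clone, Copy)]
pub struct Handle {
    pub generation: u32,
    pub owner: u32,
}

impl Slot {
    /// Every use of a handle is validated; any mismatch is an error, never a
    /// best-effort fallback.
    pub fn check(&self, h: Handle) -> Result<(), HandleError> {
        if h.generation != self.generation {
            return Err(HandleError::Stale);
        }
        if h.owner != self.owner {
            return Err(HandleError::WrongOwner);
        }
        if self.state != SlotState::Live {
            return Err(HandleError::WrongState);
        }
        Ok(())
    }

    /// Freeing bumps the generation so handles minted earlier become stale.
    /// (Quiesce and scrub are omitted from this sketch.)
    pub fn free(&mut self, h: Handle) -> Result<(), HandleError> {
        self.check(h)?;
        self.state = SlotState::Free;
        self.generation = self.generation.wrapping_add(1);
        Ok(())
    }
}
```

Because free bumps the generation, a double free or a late completion carrying the old handle is rejected as Stale rather than mutating accounting.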
Landed (prerequisite proofs and the first production userspace surface;
docs/tasks/done/, make run-net / make run-dmapool-grant /
make run-dmapool-grant-exit / make run-devicemmio-grant /
make run-interrupt-grant / make run-hardware-grant-cycle /
make run-hardware-audit* / make run-ddf-provider-consumer /
make run-iommu-remapping):
- the in-kernel device-manager object model, interrupt-source attach/detach, the kernel-owned DMAPool accounting / budget / OOM / tamper / over-budget proofs bound to attached records, the imported-live-accounting record over the device_dma ledger, and the device_dma zero-live / stale-handle / stale-completion / publication scratch proofs routed through the pure capos-lib::device_authority validators;
- the documented production handle epoch invariants plus their pure validator and host tests; the manager-attached DMA-buffer record proof;
- the production DMAPool.allocateBuffer result-cap method and its manifest grant, plus admission/typed DMABuffer submitDescriptor/completeDescriptor/map coverage, the userspace-VMA bounce-buffer map + protection hardening, the shared descriptor validator and manager-inflight accounting, the userspace-visible completion effect, and the provider-visible shadow-descriptor / selected-queue-entry side effects feeding the provider-consumer gate; the selected virtio-net TX backend + notify-offset claim policy;
- the bounded sequential DeviceMmio/Interrupt grant-cycle reuse proof; admission + shared-validator + real-effect DeviceMmio map/read32/write32 coverage; cursorable/edge read-side audit snapshots; and the first production DeviceMmioCap/InterruptCap cap-release + process-exit hooks; the exposed DeviceMmio user-map path records a manager-owned user hold (borrowed VMA, page-rounded BAR window, mapping generation, and selected-write policy label) and explicit unmap, cap release, and process exit clear it before detaching the reusable mapping generation; driver-crash and reset/disable hook markers remain bounded no-userspace-MMIO proofs that assert no user hold is live before detach;
- the manager-handle identity fields carried into the result-only DMAPool/DMABuffer/DeviceMmio/Interrupt info surfaces;
- the real pinned-page DMAPool page-lifecycle slice (ddf-real-dmapool-pinned-page-realness, done 2026-05-26): the kernel ledger owns real scrubbed frame::alloc_frame_zeroed pages and the manager imports a live snapshot on the honest bounce-buffer run-net path;
- the S.11.2 hostile smokes (stale DMA handles, descriptor abuse, revoke/reset races, stale IRQ after reset, stale DMA completion after reset, exit-under-DMA; S.11.2.7/8 over real free/realloc on make run-net; the IOMMU-backed production matrix on make run-iommu-remapping); see docs/tasks/done/2026-05-26/ and the IOMMU section;
- the first exposed userspace DeviceMmio + Interrupt surface (ddf-userspace-writable-devicemmio-interrupt, done 2026-05-26): read-only BAR map + brokered read32 + a real write32 on a claimed register, manager-cap wait/mask/unmask with deferred delivery and no-stale-wake-after-revoke, and real-route userspace wait/acknowledge with deferred LAPIC EOI through the provider tx_interrupt/rx_interrupt caps driven by a userspace process (make run-ddf-provider-consumer), plus the non-implication negative-authority assertions on both grant smokes.
Open:
- Require DDF authority-surface hazard preflight before new behavior slices. The slice handoff/review prompt should state the relevant paging/MMIO, DMA, IRQ, ABI, and docs-authority invariants before code changes start. This is a workflow gate for avoiding bounded-proof overclaims and late review discovery of known infrastructure hazards.
- Broader writable-DeviceMmio region selection remains out of scope until a separate manager-selected register-window design lands.
- Direct-remapping/vIOMMU, provider-written device addresses, and hostile bus-mastering hardware isolation remain future work. The current no-IOMMU cloud path stays on brokered bounce-buffer authority.
- Physical Store-backed hardware-audit local persistence, keyed segment seals, and runtime subscriber refusal are closed by hardware-audit-physical-persistence-signing-local-proof: the QEMU proof reuses one persistent_store disk across two boots, recovers pass-1 audit segment blobs through Store inventory before pass-2 drain, verifies development-source RAM-local HMAC segment seals, reports key lifecycle caveats, and refuses runtime reader admission until an authority-broker path exists. External verifier key custody, production rotation/revocation, rollback resistance, and broader runtime admission remain future; audit is observer evidence and does not grant DMA/MMIO/IRQ authority.
- Device-autonomous MSI-X local APIC delivery is closed by cloud-prod-qemu-kvm-virtio-net-msix-apic-delivery-resolution and the dependent RX waiter proof cloud-prod-virtio-net-rx-device-autonomous-msix-raise-local-proof. The current provider path can still use polled completion when interrupt delivery is not required, and live-GCE device-autonomous interrupt evidence remains future work.
IOMMU/DMAR/AMD-Vi Staging
Deferred-with-known-dependency planning gate. capOS has a bounded QEMU Intel
remapping implementation for the selected smoke path, not a general hardware
isolation claim for production NIC or storage ownership. The selected QEMU Intel
path programs manager-owned per-device domains for two claimed DMA-capable
functions, exports only domain-scoped IOVAs, hides host physical addresses, and
fails closed for stale or wrong-owner domain assignment; it emits an honest
direct-DMA posture (real_dma=attempted, direct_dma=enabled,
remapping_tables=programmed) over the real ledger, with mappings installed
before the doorbell and invalidated/IOTLB-flushed before reuse, while
hostile_hardware_isolation stays not-claimed (QEMU-emulator evidence).
Current no-IOMMU cloud/user-provider paths use brokered bounce-buffer authority,
not direct DMA. Direct-remapping/vIOMMU work, trusted sharing groups, and
hostile-hardware isolation remain blocked on their own future gates in
docs/dma-isolation-design.md.
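The runtime, fail-closed backend choice (direct remapping only when a probe verifies usable hardware, brokered bounce buffers otherwise) can be sketched as a single total function. The IommuProbe fields below are hypothetical stand-ins for whatever the real probe records; only the shape of the decision is the point.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum DmaBackend {
    DirectRemapping,
    BrokeredBounce,
}

/// Hypothetical probe result. Field names are assumptions for illustration.
#[derive(Clone, Copy, Default)]
pub struct IommuProbe {
    pub tables_present: bool,
    pub translation_verified: bool,
    pub invalidation_verified: bool,
}

/// Direct remapping is chosen only when every probe check passed. A missing
/// probe or any failed check falls back to kernel-owned bounce buffers.
pub fn select_backend(probe: Option<IommuProbe>) -> DmaBackend {
    match probe {
        Some(p) if p.tables_present && p.translation_verified && p.invalidation_verified => {
            DmaBackend::DirectRemapping
        }
        _ => DmaBackend::BrokeredBounce,
    }
}
```

Making the bounce path the default arm means a new, unhandled probe outcome can never silently enable direct DMA.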
Landed (umbrella + children, docs/tasks/done/2026-05-12/ ..
done/2026-05-26/; make run-iommu-acpi, make run-iommu-remapping):
the IOMMU dependency record, bounded Intel DMAR / AMD-Vi IVRS ACPI discovery,
DMA-capable-function attach + uncovered marking, the per-device DMA domain
policy and its pure fail-closed admission helper, the COM1 diagnostics mirror,
the disabled table scaffold + MMIO-status diagnostics + disabled IOVA ledger +
mapping-lifecycle preflight, the first real QEMU Intel table-programming smoke
(real VT-d table programming, hardware-DMA translation, two-phase
invalidation/IOTLB-flush revocation, IOMMU-backed hostile stale-DMA smokes),
production DMAPool ledger integration, domain-scoped IOVA export discipline,
fault recording/diagnostics, per-device domain granularity, the no-usable-IOMMU
fallback policy, the IOMMU-production teardown/bounce-buffer S.11.2 matrix, and
the honest direct-DMA posture line
(ddf-iommu-remapping-production-closeout).
Open (future, not on the bounce-buffer critical path): AMD-Vi programming,
scalable-mode / interrupt-remapping / device-IOTLB, aw-bits=48 4-level tables,
trusted multi-device sharing groups, and production cloud NIC/storage driver
ownership remain separate future tasks. kernel/src/iommu.rs stays
cfg(feature = "qemu")-gated as a separate verified-remapping lane.
Reusable Block-Device Path
Landed: the device-generic virtio queue/transport helpers factored into
kernel/src/virtio.rs pub(crate) mod transport
(ddf-virtio-transport-helper-factor), the device-agnostic VirtqueueDma
DMA/notify seam + seam-driven Virtqueue/DmaPage + parameterized
discover_modern_transport (ddf-virtio-driver-foundation-boundary), the
virtio-blk sector read/write smoke (make run-virtio-blk,
ddf-blockdevice-boundary-virtio-blk-smoke), the first BlockDevice
trait/CapObject boundary (kernel/src/cap/block_device.rs), and multi-device
virtio-blk support + a target-disk grant source (make run-multi-virtio-blk,
KernelCapSource.blockDeviceTarget @44, ddf-multi-virtio-blk-device-support).
Landed: block_device_target now resolves by manifest PCI
segment:bus:device:function identity and fails closed when the selector is
absent, mismatched, or names the resolved boot disk; proof
make run-blockdevice-target-identity.
See docs/tasks/done/2026-05-25/, done/2026-05-26/, and
done/2026-06-05/.
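The block_device_target resolution rule above (resolve by PCI identity; fail closed when the selector is absent, mismatched, or names the boot disk) can be sketched as follows. The types and function are hypothetical and do not mirror the kernel's actual manifest or grant-source code.

```rust
/// PCI segment:bus:device:function identity.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct PciAddress {
    pub segment: u16,
    pub bus: u8,
    pub device: u8,
    pub function: u8,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TargetError {
    SelectorAbsent,
    NoMatch,
    IsBootDisk,
}

/// Hypothetical resolver: returns the selected disk only when the manifest
/// names it exactly and it is not the resolved boot disk.
pub fn resolve_target(
    selector: Option<PciAddress>,
    disks: &[PciAddress],
    boot_disk: PciAddress,
) -> Result<PciAddress, TargetError> {
    let want = selector.ok_or(TargetError::SelectorAbsent)?;
    if want == boot_disk {
        return Err(TargetError::IsBootDisk);
    }
    disks
        .iter()
        .copied()
        .find(|d| *d == want)
        .ok_or(TargetError::NoMatch)
}
```

There is deliberately no "pick the first non-boot disk" fallback: an absent or mismatched selector is an error.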
Open:
- Add storage services behind userspace ownership: storage-userspace-persistent-store-namespace-service-local-proof moved Store/Namespace serving onto a persistent userspace service (make run-storage-persist-service), and storage-userspace-directory-file-service-local-proof followed with Directory/File serving and result-cap transfer from userspace (make run-userspace-directory-file-smoke).
- Retire the ambiguous kernel-owned Store/Namespace/Directory/File production storage routes: storage-legacy-kernel-storage-cap-backer-retirement gated the RAM-backed file/directory/store/namespace kernel grant sources behind qemu (fail-closed in the default production kernel, joining the already-gated virtio read_only_fs_root/persistent_store/writable_fs_root mount sources) and named all remaining kernel storage backers as proof/fixture surface in code and docs. Production storage is userspace-served; the default system.cue boot grants no kernel storage caps.
- Retire the transitional kernel virtio-blk production owner: storage-legacy-kernel-virtio-blk-path-retirement ratified that the kernel-owned virtio-blk driver, its BlockDevice cap arm (BlockDeviceBackend::Virtio), and its PCI discovery (diagnose_qemu_virtio_blk) are all qemu-feature-gated; the default production kernel never binds virtio-blk and resolves block_device to the userspace-brokered NVMe arm (BlockDeviceBackend::NvmeBrokered, fail-closed without a verified controller and a live device_mmio grant), with block_device_target fail-closed (requires the qemu feature). virtio-blk is named as a qemu fixture / regression in the device doc, smoke scripts, and fixture manifests; the production-storage gate is the run-cloud-provider-nvme-blockdevice-* chain. The kernel broker responsibilities (PCI claim arbitration, MMIO/IRQ/DMA admission, bounce/IOMMU isolation, stale-generation rejection, and revocation) stay kernel-owned and are the same surfaces the userspace storage driver binds into.
Local Disk Storage Milestone
Visible outcome: default storage-focused QEMU boots from a disk image, exposes a read-only directory from local disk, and proves one capnp object can be persisted and read back after reboot. Milestone complete.
Landed (docs/tasks/done/2026-05-14/ .. done/2026-05-25/): the
Store/Namespace + file-I/O schema slices and RAM-backed naming round-trip
proof (make run-storage-naming); virtio-blk wired into BlockDevice
(make run-virtio-blk); the read-only filesystem service over BlockDevice
(kernel/src/cap/readonly_fs.rs, CAPOSRO1, make run-storage-fs); and the
disk-backed persistent Store with a two-boot reboot proof
(kernel/src/cap/persistent_store.rs, CAPOSST1, make run-storage-persist).
Disk-backed delete tombstones entries in place; a later put that would hit
the entry-table or data-cursor limit now compacts live CAPOSST1 store entries
through a shadow generation before recommitting the canonical front generation
(make run-storage-persist). Store/persistent durability across passes
rests on host page-cache coherence; a virtio FLUSH for write-back-cache media
durability is deferred to the Writable milestone.
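The tombstone-then-compact behaviour described above can be modelled in memory as a rough sketch. This is not the CAPOSST1 on-disk format: the Table type, its Vec-backed entry list, and the generation counter are assumptions chosen only to show in-place tombstones plus compaction into a shadow copy when a put would hit the entry limit.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Entry {
    pub key: u32,
    pub live: bool,
}

/// Hypothetical in-memory stand-in for a bounded entry table.
pub struct Table {
    pub entries: Vec<Entry>,
    pub capacity: usize,
    pub generation: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PutError {
    Full,
}

impl Table {
    /// Delete marks matching entries dead in place; no space is reclaimed yet.
    pub fn delete(&mut self, key: u32) {
        for e in self.entries.iter_mut().filter(|e| e.key == key) {
            e.live = false;
        }
    }

    /// A put that would exceed the entry limit first copies live entries into
    /// a shadow table, then swaps it in as the new front generation.
    pub fn put(&mut self, key: u32) -> Result<(), PutError> {
        if self.entries.len() == self.capacity {
            let shadow: Vec<Entry> = self.entries.iter().filter(|e| e.live).cloned().collect();
            if shadow.len() == self.capacity {
                // Nothing to reclaim: leave the current generation untouched.
                return Err(PutError::Full);
            }
            self.entries = shadow;
            self.generation += 1;
        }
        self.entries.push(Entry { key, live: true });
        Ok(())
    }
}
```

Building the shadow copy before touching the current entries mirrors the idea that the old generation stays intact until the new one is complete.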
Writable Local Storage Milestone
Visible outcome: a storage-focused QEMU image can create, overwrite, truncate,
rename, and remove files through capability-scoped Directory/File caps,
persist both file and store mutations across reboot, and recover to a
consistent state after an unclean shutdown test. Milestone complete.
Landed (docs/tasks/done/2026-05-26/; make run-storage-writable,
make run-storage-writable-recovery): the fail-closed single-writer policy
(documented in the storage proposal); directory mutation
(create/mkdir/remove/rename, additive Directory.create @5/rename @6)
and writable File paths (overwrite/append/truncate/sync/close, bounded by
MAX_FILE_BYTES 64 KiB) over kernel/src/cap/writable_fs.rs; disk-backed
write-through persistence of the CAPOSWF1 sub-volume co-located with the
CAPOSST1 Store in one combined image (now produced by
tools/mkstore-image --writable); real File.stat created/modified
timestamps with internal ClockProvenance labels carried from the same
WallClock source in the CAPOSWF1 node record;
and one forced-poweroff unclean-shutdown recovery proof (proof-only
storage_writable_recovery feature) verifying the superblock-commit-ordering
invariant.
Bounded-proof caveat: the recovery proof exercises one record-vs-commit window
under host-page-cache durability (no VIRTIO_BLK_F_FLUSH; kill -9 preserves
the host page cache); it proves the kernel’s superblock-commit-ordering
invariant, not general media crash-consistency against host power loss. The
co-located CAPOSST1 Store now has bounded tombstone reclamation through
make run-storage-persist; writable-file extent reclamation remains future
work.
Managed Cloud Store Bridge
Visible outcome: application services can persist bounded Cap’n Proto records through a cloud-backed capability while local QEMU tests exercise the same semantics through a fake bridge.
Open gates:
- Define a provider-neutral CloudStoreBridge or app-specific SaveStore interface with put/get/compare-and-set/append operations, explicit size limits, profile or tenant scoping, schema version, and stale-write rejection.
- Add a local fake-cloud bridge used by host tests and QEMU smokes. It must reject wrong-profile loads, stale mutable writes, oversized records, and ledger rewrites.
- Add a GCP deployment note for Cloud Run bridge service, Firestore Native mode mutable indexes/profile summaries, Cloud Storage versioned blobs, and Secret Manager credentials.
- Add Cloud KMS keying notes for managed game-world storage: key ring/key per world or shard, narrow encrypt/decrypt IAM authority, rotation, retired world revocation, and audit logging.
- Keep provider credentials outside ordinary capOS clients. Only the bridge service receives cloud credentials; game/storage clients receive narrow capabilities.
- Add lifecycle/retention/cost controls before writing real snapshots or evidence blobs to Cloud Storage.
- Treat local disk-backed Store as the offline/QEMU baseline even when cloud persistence is available.
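To make the fake-bridge gate above concrete, here is a minimal host-side sketch of a fake bridge that rejects wrong-profile loads, stale writes, and oversized records. The FakeBridge type, its method names, and the version-counter scheme are hypothetical; no CloudStoreBridge interface has been defined yet, and append and ledger-rewrite rejection are not modelled.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
pub enum BridgeError {
    WrongProfile,
    StaleWrite,
    Oversized,
    NotFound,
}

/// Hypothetical in-memory fake: key -> (owning profile, version, bytes).
pub struct FakeBridge {
    max_bytes: usize,
    records: HashMap<String, (String, u64, Vec<u8>)>,
}

impl FakeBridge {
    pub fn new(max_bytes: usize) -> Self {
        Self { max_bytes, records: HashMap::new() }
    }

    /// Write succeeds only if the caller names the current version
    /// (0 for a record that does not exist yet). Returns the new version.
    pub fn compare_and_set(
        &mut self,
        profile: &str,
        key: &str,
        expected_version: u64,
        data: &[u8],
    ) -> Result<u64, BridgeError> {
        if data.len() > self.max_bytes {
            return Err(BridgeError::Oversized);
        }
        match self.records.get_mut(key) {
            Some((owner, version, bytes)) => {
                if owner.as_str() != profile {
                    return Err(BridgeError::WrongProfile);
                }
                if *version != expected_version {
                    return Err(BridgeError::StaleWrite);
                }
                *version += 1;
                *bytes = data.to_vec();
                Ok(*version)
            }
            None => {
                if expected_version != 0 {
                    return Err(BridgeError::StaleWrite);
                }
                self.records
                    .insert(key.to_string(), (profile.to_string(), 1, data.to_vec()));
                Ok(1)
            }
        }
    }

    /// Load is scoped to the owning profile.
    pub fn get(&self, profile: &str, key: &str) -> Result<(u64, &[u8]), BridgeError> {
        let (owner, version, bytes) = self.records.get(key).ok_or(BridgeError::NotFound)?;
        if owner.as_str() != profile {
            return Err(BridgeError::WrongProfile);
        }
        Ok((*version, bytes.as_slice()))
    }
}
```

A fake with this shape lets host tests pin the rejection semantics before any provider-specific adapter exists.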
User-Owned Browser Save Transport
Visible outcome: private user data can be backed up through the user’s browser to Google Drive or Firebase as encrypted capsules while capOS never receives provider tokens.
Landed (policy + host-test gates): the provider-neutral browser transport
policy for opaque encrypted save capsules / opaque provider handles / capsule +
wrapped-DEK metadata; fake Drive and fake Firebase host-test adapters modeling
deletion / duplicate writes / stale versions / rollback / missing network /
non-opaque handles / authenticated-user mismatch / Firebase auth UID path
injection; the Drive appDataFolder (drive.appdata) and Firebase/Firestore
per-user-capsule notes; and the KMS / token / key-capability boundary records
(browser transports ciphertext + handles only).
Open (future real-provider integration):
- Implement real Google Drive and Firebase browser-companion adapters after the provider-token boundary is exercised outside ordinary capOS clients.
- Reuse the existing save-capsule restore rejection tests as the acceptance gate for real provider adapters: tampered, wrong-profile, stale, oversized, unknown-content, and unsigned capsules must still fail before provider bytes can mutate save state.
- Add real-provider failure-mode coverage for deletion, duplicate writes, stale versions, rollback attempts, offline cache/sync replay, and missing network using the same semantics as the fake adapters.
Boot Binary ISO Layout
Move ELF payloads out of the Cap’n Proto manifest blob and into explicit boot
package sources. The CD-ROM path uses ISO 9660 files read on demand through a
minimal kernel ISO driver; the raw disk and cloudboot paths use Limine-loaded
modules staged on the FAT ESP. Both keep the manifest as topology and decouple
ordinary service binary bytes from NamedBlob.data. capOS remains Limine-backed
for the current boot line; Limine supports FAT and ISO9660/CD-ROM media, so
CD-ROM/ISO is a planned boot/install variant rather than a path to delete.
Landed (docs/tasks/done/2026-05-24/; make run-boot-iso-read,
make run-boot-iso; producer guard added 2026-06-06): the minimal ATA PIO
CD-ROM read_sectors reader (boot_iso_read), the read-only ISO 9660 driver
(open_file(name) -> (lba, size), fail-closed bounds), mkmanifest --copy-bins (names-only manifest, empty NamedBlob.data) with producer-side
rejection for names whose ISO 9660 d-character form exceeds the level-3
31-character limit or collides after normalization, the opt-in make capos-name-only.iso, the kernel
run_init() on-demand-read switch + BootBinary registry behind the boot_iso
feature (make run-boot-iso), and the BOOT_MANIFEST_MAX_BYTES doc + the
-iso-level 3 name-only ISO build recipe. Landed 2026-06-07 21:36 UTC in
commits 22320411 and f0695442: the default make image raw disk and
make capos-cloudboot-image cloudboot targets now use a name-only manifest plus
Limine module payloads staged under /boot/bins/; see
boot-limine-disk-boot-binary-source-local-proof.
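The producer-side name guard mentioned above (reject names that exceed 31 characters in d-character form or collide after normalization) might look roughly like the sketch below. The normalization rule used here (uppercase ASCII letters and digits kept, dots kept, everything else mapped to an underscore) is an assumption for illustration; the actual mkmanifest rule is not given in this document.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum NameError {
    TooLong(String),
    Collision(String),
}

/// Assumed normalization: uppercase ASCII letters, digits, and '.' are kept;
/// every other character becomes '_'. This is a guess at the rule, not the
/// documented mkmanifest behaviour.
pub fn normalize(name: &str) -> String {
    name.chars()
        .map(|c| {
            let u = c.to_ascii_uppercase();
            if u.is_ascii_uppercase() || u.is_ascii_digit() || u == '.' {
                u
            } else {
                '_'
            }
        })
        .collect()
}

/// Reject the whole name set if any normalized name is longer than 31
/// characters or two names normalize to the same identifier.
pub fn check_names(names: &[&str]) -> Result<Vec<String>, NameError> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for n in names {
        let norm = normalize(n);
        if norm.len() > 31 {
            return Err(NameError::TooLong(n.to_string()));
        }
        if !seen.insert(norm.clone()) {
            return Err(NameError::Collision(n.to_string()));
        }
        out.push(norm);
    }
    Ok(out)
}
```

Checking at image-build time means a colliding or over-long name fails the build instead of surfacing as a missing binary at boot.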
Landed 2026-06-07 21:59 UTC: the default make, make run, and
make run-smoke ISO paths now use name-only manifests plus boot_iso
on-demand reads from /boot/bins/, so ordinary service ELF bytes no longer ride
in NamedBlob.data for the default bootable ISO paths. The generic embedded ISO
rule remains available for focused fixtures that have not moved to a name-only
boot source.
Closed:
- After that source is proven, boot-embedded-data-retirement-and-atapi-userspace-serving retires the embedded-data branch for ordinary service binaries. The retained ATAPI/ISO Directory/File cap is explicitly a QEMU install-source fixture over the early boot reader, not a general kernel filesystem service; broader post-bootstrap package browsing remains a userspace-service concern outside this fixture.
Cloud Device Tracks
These are portability notes, not implementation evidence. The first cloud
milestone is imported-image serial-console boot; provider NIC/storage drivers
are later usable-instance work and remain blocked by cloud-provider binding,
DMA/IOMMU or explicitly accepted bounce-buffer policy, interrupt, teardown, and
network/storage evidence gates above. Local implementation and *-local-proof
records in this track run under host tests, QEMU, or local cloudboot-image QEMU
unless their acceptance explicitly says otherwise; they must not be blocked on
cloud access. The local bounded provider-consumer closeout does not implement a
cloud-ready userspace virtio-net, virtio-blk, virtio-scsi, NVMe, gVNIC, ENA, or
cloud storage/NIC driver. The GCP-first usable-instance provider rollup is
closed by
cloud-usable-instance-provider-nic-storage;
future public ingress, AWS, Azure, broader storage, high-throughput NIC, and
direct-remapping lanes remain separate work.
Access correction (2026-05-27, updated 2026-06-06). The GCP cloud tracks are
NOT prefix-blocked on cloud access. Local implementation and *-local-proof
tasks stay dispatchable once their local prerequisites are satisfied, including
tasks named cloud-prod-* that only boot the production cloudboot kernel under
QEMU. Only live/billable proof tasks that cross a provider API, provider
hardware, public ingress, public CA/DNS, or explicit make cloudboot-test
acceptance require access authorization. GCE access is provisioned for the
configured cloud sandbox project: tools/cloudboot/run-test.sh is hardcoded to
it (no public IP, no service account/scopes), the 2026-05-24 GCE live probes
recorded
n1-standard-1, e2-small, c3-standard-4, and n2d-standard-2
Confidential shapes (IOMMU disabled → SWIOTLB → labeled bounce-buffer) in
Cloud DMA Provider Evidence Inventory,
and Cloud Build runs Kani proofs (tools/cloudbuild-kani.yaml). The local QEMU
virtio-net/NVMe foundations and the local production cloudboot bind markers
exist. The GCP live NVMe Persistent Disk read proof is now closed by
cloud-gcp-storage-driver; remaining live driver slices are blocked only by
their own local authority, product-scope, and real-provider evidence gates.
Slices that only need structured serial evidence from already-production code
(for example cloud-network-terminal-access path 1) are runnable on real GCE.
Cloud-Leg Decomposition Track (2026-05-24)
The cloud-usable-instance-provider-nic-storage umbrella was decomposed into
discrete slices and is now closed as the GCP-first provider rollup.
Landed foundation (docs/tasks/done/2026-05-24/ .. done/2026-05-30/):
the cloud DMA provider-evidence inventory, the runtime fail-closed DMA backend
selection mechanism (cloud-dma-backend-selection: probe → fail-closed select →
manifest override; authoritative contract in the “Cloud DMA Backend” section of
docs/dma-isolation-design.md), the local-QEMU GCP virtio-net binding
precursor + cloud-shape classification, the production (non-qemu)
cloudboot-evidence: dma-backend/device-class/device-inventory markers, the
minimal read-only production PCI enumeration surface, and the production
DDF/PCI bind-stack decomposition.
Landed production bind-stack children (terminal local-bind markers settled with
a kernel-side dispatch-slot proxy where the userspace driver authority surface
is cfg(feature = "qemu")-gated out of the non-qemu build):
cloud-prod-pci-claim-inventory, the DeviceMmio BAR-readback grant
(make run-cloud-devicemmio-grant), the DMAPool bounce-buffer grant
(make run-cloud-dmapool-grant), the interrupt route-alloc + live-delivery
proofs, the terminal provider-nic-bound / storage-bound proxy markers, the
three production userspace-provider grant-source proofs
(DeviceMmio/DMAPool/Interrupt), the aggregate grant-surface closeout, and
the real provider-cap-side Interrupt.wait/acknowledge cap-waiter proof
(cloud_provider_cap_waiter_proof, make run-cloud-provider-cap-waiter). See
docs/tasks/done/2026-05-28/ and the task-graph reconcile
cloud-live-driver-task-graph-reconcile.
Landed virtio-net userspace-provider chain (the stale parent is closed by its
child sequence, make run-cloud-provider-virtio-net*;
docs/tasks/done/2026-05-28/ .. done/2026-06-07/): the
non-qemu-buildable virtio modern-transport host surface
(kernel/src/virtio_transport.rs), the device bring-up proof, the same-BDF
DeviceMmio+DMAPool+Interrupt authority bundle, TX and RX queue
materialization, MSI-X function-enable, TX submit/doorbell + polled completion,
the userspace DMABuffer map/submit live-publish path, TX and RX MSI-X
wait/ack, RX userspace-submit, the production-IDT real-interrupt-gate dispatch
wiring, the RX polled-completion-no-inject proof, the always-built polled
provider graduated off the per-proof feature, the real-polled-driver
provider-nic-bound re-point (removing the proxy as source), the polled
teardown + driver-death/process-exit stale-authority discipline, the
legacy/transitional virtio 0.9 PIO + INTx local bind, and the real-GCE
legacy-polled provider-nic-bound run.
Landed NVMe brokered userspace-provider chain (the parent is closed by its child
sequence; make run-cloud-provider-nvme-*;
docs/tasks/done/2026-05-29/ .. done/2026-06-05/): read-only bind →
controller reset (selected CC-clear write) → admin queue materialization →
brokered controller enable (manager-op DeviceMmio.brokeredNvmeControllerEnable
@6, manager-authored AQA/ASQ/ACQ; raw CC.EN-set fails closed) → admin
IDENTIFY (@7, then split SUBMIT @8 / COMPLETE @9) with the admin-completion
Interrupt.wait/acknowledge handoff over the cap-waiter MSI-X route → I/O
queue-pair create (@10/@11) → I/O READ (@12/@13) → WRITE (@14/@15,
read-back match) → arbitrary/second LBA (@16/@17) and multiblock
(@18/@19) → single-call synchronous poll-read (@20/@21, no
Interrupt.wait on the data path) and inline read-bytes (@22) → the
BlockDevice.readBlocks-shaped fixed-LBA then arbitrary-LBA read arm
(BlockDeviceBackend::{Virtio,NvmeBrokered}) → readonly_fs over the NVMe
BlockDevice (single-file then multi-file dir-walk) → writeBlocks @1
durability + real FLUSH @3 (opcode 0x00) + clean-reboot persistence +
forced-poweroff crash-consistency → persistent_store and writable_fs (plus
recovery) over the NVMe write arm → File.sync/Store-commit routed to a real
NVMe FLUSH → the capstone read-arm graduation into always-built production
(fail-closed runtime capability probe kernel/src/nvme_storage_backend.rs) and
the always-built device_manager::nvme_sync_io_state sync-I/O state seam →
dedicated data-path completion interrupts for BlockDevice.writeBlocks @1 and
readBlocks @0 (make run-cloud-provider-nvme-io-completion-interrupt).
All brokered NVMe steps hold the no-IOMMU discipline: PRP1/queue-base addresses
are manager-owned bounce buffers, never exported; no provider-written
queue-base/PRP/SGL address, no host-physical or IOVA export, no direct-DMA
claim, no cloud/guest IOMMU assumption. QEMU caveat: “an unflushed write rolls
back” is not provable under QEMU’s -device nvme cache=writeback model
(unflushed_rollback=not-provable-under-qemu-nvme-model).
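To make the PRP bookkeeping behind the READ/WRITE and multiblock steps concrete, the sketch below classifies a transfer by how many PRP entries it needs. It assumes a 4 KiB memory page, a page-aligned manager-owned bounce buffer, and a one-based block count; the names `PrpShape` and `prp_shape` are hypothetical, and PRP-list chaining across more than one list page is deliberately ignored.

```rust
/// Assumed memory page size for PRP entries.
const PAGE_SIZE: u64 = 4096;

#[derive(Debug, PartialEq, Eq)]
pub enum PrpShape {
    /// Transfer fits in one page: only PRP1 is used.
    Prp1Only,
    /// Transfer spans exactly two pages: PRP1 and PRP2 are data pointers.
    Prp1AndPrp2,
    /// Transfer spans more than two pages: PRP2 points at a PRP list that
    /// holds one entry for every page after the first.
    PrpList { list_entries: u64 },
}

/// Classify a page-aligned transfer of `block_count` blocks of `block_size`
/// bytes. Returns None for empty or overflowing requests (fail closed).
pub fn prp_shape(block_count: u64, block_size: u64) -> Option<PrpShape> {
    let bytes = block_count.checked_mul(block_size)?;
    if bytes == 0 {
        return None;
    }
    // Ceiling division without risking overflow near u64::MAX.
    let pages = bytes / PAGE_SIZE + u64::from(bytes % PAGE_SIZE != 0);
    Some(match pages {
        1 => PrpShape::Prp1Only,
        2 => PrpShape::Prp1AndPrp2,
        n => PrpShape::PrpList { list_entries: n - 1 },
    })
}
```

With 512-byte blocks, eight blocks fill exactly one page, so any count above eight needs more than PRP1. Under the brokered discipline described above, every address written into PRP1, PRP2, or a PRP list would be authored by the manager from its own bounce buffers, never supplied by the provider.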
Open / blocked:
- cloud-usable-instance-provider-nic-storage (done 2026-06-07): closeout-only
  rollup over the landed GCE evidence: serial-console operator access
  (1779868872-2424), live legacy virtio-net raw-frame provider-nic-bound
  (1780412056-e1cb), live NVMe Persistent Disk brokered READ
  (1780806087-bf69), and the separate gVNIC raw-frame / typed-Nic portability
  runs (1780794927-1aa9, 1780796615-decc). This closes the GCP-first provider
  NIC/storage bar without claiming public L4 ingress, AWS/Azure, broader
  storage variants, direct DMA/remapping, or high-throughput NIC readiness.
- cloud-gcp-nic-enumeration-evidence (blocked/decomposed 2026-05-27): coupled
  honest production-path enumeration markers to a provider-nic-bound +
  --require-provider-nic-proof gate the harness reserves for the driver slice,
  plus a billable real-GCE run an autonomous worker cannot self-authorize. The
  honest production-marker slice landed; the provider-nic-bound + real-GCE
  proof folds into cloud-gcp-virtio-net-nic-driver.
- cloud-prod-virtio-net-userspace-provider-local-proof (done/closed 2026-06-07
  02:54 UTC): this stale parent is closed by the landed child chain above. The
  local non-qemu cloudboot/QEMU path has the modern TX/RX provider proofs,
  always-built polled provider, honest provider-nic-bound marker sourced from
  real polled TX+RX progress, and clean-release plus process-exit teardown.
  The GCE-compatible legacy-polled path also passed real GCE through the
  billable cloud-prod-gce-billable-boot-real-polled-nic-bound run. Remaining
  future lanes are L4 socket/smoltcp relocation, literal system.cue provider
  fold, reusable full-NIC/multiqueue readiness, and live-provider
  device-autonomous MSI-X evidence.
- cloud-prod-nvme-brokered-userspace-provider-local-proof (done/closed
  2026-06-07 02:08 UTC): this stale parent is closed by the landed child chain
  above. The local non-qemu cloudboot/QEMU path has the brokered
  controller/admin/I/O provider proof, BlockDevice read/write/flush and
  filesystem consumers, dedicated data-path completion interrupts, and
  NLB > 8 multi-PRP windows with manager-authored PRP lists. Remaining future
  lanes are a second namespace, FUA/DSM, live GCP evidence, device-autonomous
  MSI-X completion delivery, and any
  direct-remapping/vIOMMU/provider-written-address model.
Production Bind-Stack Port (qemu-gate dissolution)
The cloud-prod-*-local-proof chain proved each behavior behind a focused
per-proof Cargo feature (cloud_*_proof) that compiles a kernel-side
cap::*_proof module into the non-qemu build only when its feature is on.
Those proofs are correct but do not graduate the underlying device surface to
always-built production code. The qemu feature conflates three jobs: (1)
test-harness affordances (isa-debug-exit shutdown, self-tests,
diagnostics/measure/debug_tap/boot_iso/storage_writable_recovery,
the VT-d smoke) that must stay compile-gated; (2) unproven-on-hardware device
surface kept dormant; (3) genuine host capabilities that should be
runtime-probed, not compile-gated. The unlock is neither removing the cfg nor
adding an “am-I-QEMU” runtime branch: either would link unproven MMIO/DMA into
production, which is fail-open against the brokered-DMA discipline and forfeits
dead-code elimination as a TCB property. The unlock is to dissolve the gate
per-piece: port each dormant capability into always-built production code
as it is proven, fronting hardware-dependent behavior with a fail-closed
runtime capability probe (the kernel/src/dma_backend.rs
probe → fail-closed → manifest-override pattern). Hard caveat: the no-IOMMU
bounce-buffer discipline is preserved (host_physical_user_visible=0,
direct_dma=blocked, iova_export=disabled-future-only), and
kernel/src/iommu.rs stays cfg(feature = "qemu")-gated as a separate future
verified-remapping lane.
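The probe, fail-closed, manifest-override shape named above can be illustrated with a small selection function. This is a sketch under stated assumptions, not the contents of kernel/src/dma_backend.rs: the enum names, the three probe outcomes, and the override semantics are all hypothetical. The property it tries to show is that uncertainty never enables a backend by default and that no input selects direct DMA.

```rust
/// Hypothetical result of a runtime capability probe.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeOutcome {
    /// The probe positively established that bounce-buffer DMA is usable.
    BounceCapable,
    /// The probe ran but could not establish the capability either way.
    Unknown,
    /// The probe itself failed (malformed tables, read error, ...).
    Failed,
}

/// Hypothetical manifest-level override.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ManifestOverride {
    Absent,
    /// Operator explicitly disables the backend.
    Disable,
    /// Operator explicitly accepts the bounce-buffer policy.
    AcceptBounce,
}

/// There is deliberately no direct-DMA variant to select.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Selection {
    Bounce,
    Disabled,
}

pub fn select_backend(probe: ProbeOutcome, ov: ManifestOverride) -> Selection {
    match (ov, probe) {
        // An explicit disable always wins.
        (ManifestOverride::Disable, _) => Selection::Disabled,
        // A failed probe is never overridden into an enabled state.
        (_, ProbeOutcome::Failed) => Selection::Disabled,
        // A positive probe enables the bounce backend.
        (_, ProbeOutcome::BounceCapable) => Selection::Bounce,
        // Unknown needs an explicit operator acceptance...
        (ManifestOverride::AcceptBounce, ProbeOutcome::Unknown) => Selection::Bounce,
        // ...otherwise it fails closed.
        (ManifestOverride::Absent, ProbeOutcome::Unknown) => Selection::Disabled,
    }
}
```

Keeping the decision in one exhaustive `match` means a new probe outcome or override variant fails to compile until its fail-closed behavior is spelled out.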
Umbrella: cloud-prod-ddf-bindstack-qemu-gate-dissolution
(done 2026-05-30).
Landed children (docs/tasks/done/2026-05-29/ / done/2026-05-30/): the RX
MSI-X waiter-determinism fix (the provider-consumer flake was a
synthetic-RX-dispatch delivery-ordering race; the fix gates injection on the
waiter thread being parked in cap_enter, 28/28 green), grant-source despecialization
(stage_with_class + ProdGrantClass), ECAM/MCFG enumeration graduation
(fail-closed runtime MCFG probe), MSI-X programming graduation
(cap::interrupt_programmed::program_attach_arm_unmask +
device_interrupt::wait_kernel_injected_dispatch now always-built), the
device-manager backend port (always-built ProductionDeviceTable device-record
/ bounce-DMA / interrupt-route backend replacing the device_manager::stub
slot), and the qemu/test_harness feature split.
Open:
- [~] ddf-provider-consumer-dmabuffer-page-fault-baseline
  (blocked/premise-refuted): the reported deterministic DDF/QEMU
  DMAPool/DMABuffer PAGE FAULT did not reproduce (0/28 on d2a342d2,
  byte-identical kernel to 45c4beb9). Keep historical unless new evidence
  re-establishes the original fault.
The local virtio-net and NVMe userspace-provider parents are both closed by their child chains, so the live provider tasks now sit behind their own real-cloud evidence and product-scope gates rather than stale local-parent blockers. The cloud/GCP track stays brokered bounce-buffer authority; this does not reopen direct DMA, guest IOMMU, or direct-remapping assumptions.
- cloud-gcp-virtio-net-nic-driver (DONE/superseded 2026-06-02 by the slice-6
  billable run, see the GCE Polling Path track below): the live legacy virtio
  0.9 NIC was bound through the kernel-brokered legacy polled path, passing
  --require-provider-nic-proof. Honest scope:
  userspace_driver_authority=kernel-brokered-legacy-polled, so this closes the
  real-GCE bind bar without claiming L4 socket reachability, reusable
  multiqueue/full NIC readiness, or live-provider device-autonomous MSI-X
  delivery.
- cloud-gcp-storage-driver (done 2026-06-07): the live GCE NVMe Persistent
  Disk path passed make cloudboot-gcp-storage-nvme-io-read-test on run
  1780806087-bf69 at source commit 28518165518c29a48633682f4a6d9b5844c43335.
  Evidence identified storage_interface=nvme, vendor.1ae0, device.001f,
  c3-standard-4, europe-west3-a, one brokered 512-byte READ, no public IP, no
  service account, and complete teardown. The selected GCP path remains
  brokered-bounce queue-base/PRP materialization; provider-written Model B is
  reserved for a direct-remapping/vIOMMU or synthetic-address lane. This does
  not claim the older virtio-scsi PD path, Local SSD, a gVNIC datapath, or
  full filesystem integration.
- cloud-network-terminal-access (done 2026-05-27; path 1, serial-console
  shell, needs no NIC driver): proved a reviewed cloud operator access path
  beyond capos kernel starting over the GCE serial console
  (cloudboot-evidence: access-path serial-console-shell; real-GCE run
  1779868872-2424, no public IP, no service account). Paths 2/3 (TCP/Telnet)
  depend on cloud-gcp-virtio-net-nic-driver; path 4 (SSH) is a separate
  milestone.
- cloud-launch-teardown-policy-hardening (done): hardened the cloudboot
  harness into the usable-instance gate: --require-provider-nic-proof,
  structured provider.json evidence, fail-closed launch-policy read-back, and
  nonzero exit on teardown failure or incomplete evidence.
Future provider slices (not required for the initial GCP usable-instance gate).
The AWS and Azure tracks are split by proof surface: standard storage
controllers (NVMe / virtio-scsi) are QEMU-emulable now, while the vendor-custom
NICs (ENA, MANA) get host-conformance gates plus a deferred live proof because
QEMU does not emulate them. The NVMe path’s shared GCP storage-provider
foundation has landed via nvme-io-queue-and-read, so the NVMe-only AWS (Nitro
EBS) and Azure (managed-disk) tracks re-scoped to a small cloud-shape
classification delta and landed (both done 2026-05-28). The virtio-scsi
alternative is not a shortcut: capOS has no userspace virtio-scsi provider
driver, and make run-virtio-blk proves the kernel-owned virtio-blk driver,
which leaves the hidden kernel DMA ownership the provider-authority acceptance
forbids — so the older-family SCSI path stays out of scope.
AWS:
- cloud-aws-nvme-storage-driver (done): the AWS Nitro EBS NVMe cloud-shape
  classification delta on the shared NVMe foundation (make run-pci-nvme;
  docs/devices/aws-nvme.md). Live AWS EBS evidence is the deferred
  cloud-aws-storage-live-proof.
- cloud-aws-ena-nic-protocol-conformance (done): ENA protocol encode/decode in
  capos-lib/src/ena.rs with a host conformance suite vetted against the ENA
  spec / Linux driver headers. Gate: cargo test-lib (deliberate
  QEMU-exception; QEMU has no ENA device).
- cloud-aws-ena-nic-live-proof (blocked on conformance +
  cloud-gcp-virtio-net-nic-driver; deferred until AWS access): end-to-end ENA
  bind/send/receive/teardown on real AWS hardware.
Azure:
- cloud-azure-disk-storage-driver (done): the Azure Boost managed-disk NVMe
  cloud-shape classification delta on the shared NVMe foundation
  (make run-pci-nvme; docs/devices/azure-disk.md). The older-family
  Hyper-V/virtio-scsi path is out of scope
  (azure_scsi_path=no-userspace-provider-driver-out-of-scope). Live Azure
  evidence is the deferred cloud-azure-storage-live-proof.
- cloud-azure-mana-nic-protocol-conformance (done): MANA/GDMA protocol
  encode/decode in capos-lib/src/mana.rs with a host conformance suite vetted
  against the MANA Linux driver headers; provenance
  docs/devices/azure-mana.md. Gate: cargo test-lib (QEMU has no MANA device).
- cloud-azure-mana-nic-live-proof (blocked on conformance +
  cloud-gcp-virtio-net-nic-driver; deferred until Azure access): end-to-end
  MANA bind/send/receive/teardown on real Azure hardware, including SR-IOV VF
  revocation with fallback-to-synthetic.
Superseded umbrella records (do not dispatch):
- cloud-aws-ena-nvme-driver: umbrella pointer to the three AWS slices above.
- cloud-azure-mana-driver: umbrella pointer to the three Azure slices above.
Cloud milestones and per-provider paths:
- First cloud milestone: imported-image serial-console boot. Closed for GCP
  by run 1778230874-715a (2026-05-08) against source commit 3951e275:
  make cloudboot-test imported the capos-cloudboot-image tarball, started an
  e2-small with no public IP and no service account, observed
  capos kernel starting on serial, and tore down cleanly. Does not require or
  prove cloud NIC/block-device drivers beyond the boot path.
- Second cloud milestone: GCP-first usable instance provider rollup. The
  selected operator path, provider storage, and provider NIC data path are
  closed by cloud-usable-instance-provider-nic-storage: serial-console shell
  access on real GCE, live legacy virtio-net raw-frame provider-nic-bound,
  live NVMe Persistent Disk brokered READ, and separate live gVNIC raw-frame /
  typed-Nic portability evidence. Scope split (decided 2026-06-02,
  network-reachable-datapath-scope-decision): the network data-path
  reachability sub-requirement is raw-frame TX/RX over the live NIC (GCE
  polling-path slices 1-4 + slice 6); the SSH/WebShell / network terminal
  access sub-requirement is L4 and is deferred to networking-proposal Phase C.
- Add NVMe controller init (brokered admin queue pair + identify on
  no-IOMMU). Closed by the brokered enable / admin / IDENTIFY /
  interrupt-wake child chain ending 2026-05-28.
- Add NVMe I/O queue pair (submission/completion rings + doorbell writes).
  Closed by nvme-io-queue-and-read on 2026-05-28.
- [~] Add NVMe read/write commands with PRP-based DMA transfers; no-IOMMU PRPs
  are manager-materialized from live buffer authority. READ and WRITE are
  done (see the NVMe chain above); multi-block PRP-list (count > 8) remains.
- Implement BlockDevice for NVMe. Done via the
  BlockDeviceBackend::NvmeBrokered read/write/flush arms (still
  per-proof-feature-gated for activation pending the capstone graduation).
- Add QEMU NVMe metadata-only PCI testing via -device nvme.
- [~] Extend QEMU NVMe testing to cover controller init, queues, PRP DMA, and
  BlockDevice behavior. Controller/admin, I/O queue, READ/WRITE/FLUSH, and
  BlockDevice read/write/flush plus dedicated data-completion interrupts over
  -device nvme are covered; NLB>1 PRP-list and always-built graduation remain.
- [~] GCP storage path: NVMe Persistent Disk on a third-generation GCE shape has
  one live brokered READ proof (cloud-gcp-storage-driver, run
  1780806087-bf69). The older virtio-scsi Persistent Disk path, Local SSD, and
  reusable filesystem-backed storage provider remain future work. Keep
  virtio-blk as a local/QEMU block-driver proof only unless a provider target
  explicitly exposes it.
- GCP NIC path: virtio-net first where supported, then gVNIC for newer
  machine families, Confidential VM paths, generation-3-or-later shapes, and
  higher network performance tiers. The virtio-net raw-frame provider gate
  passed on live GCE, and the gVNIC portability lane below now has live
  raw-frame and typed Nic evidence. High-throughput, multiqueue, public
  ingress, and first-public-Web-UI productization remain future tasks.
- AWS storage path: NVMe on Nitro-backed EBS instances. Treat AWS Nitro as an
  NVMe storage dependency rather than a virtio-blk path.
- AWS NIC path: ENA driver, including ENA queue setup, MSI-X routing, and
  Nitro generation/version expectations. Do not claim AWS network support
  from QEMU virtio-net evidence.
- Azure NIC path: MANA driver and Mellanox mlx4/mlx5 accelerated-networking
  fallback awareness where Azure exposes SR-IOV VFs. Driver lifecycle must
  tolerate dynamic VF binding and revocation by falling back to the synthetic
  interface rather than assuming the VF is permanent.
Cloud Benchmark Reruns
Visible outcome: once capOS reaches a first real cloud-VM boot, rerun the current benchmark profiles on that boot path and separate cloud evidence from local QEMU/KVM evidence.
Open gates:
- Define the first supported cloud benchmark profile after the booted cloud
  hardware surface is known. At minimum, rerun boot/session smokes and any
  CPU-only benchmark such as run-smp-process-scale, and later
  run-thread-scale, that does not depend on missing cloud NIC or block
  drivers. A GCE n2-highcpu-8-class nested-KVM host is a reasonable first
  CPU-only benchmark target if /dev/kvm is usable by the benchmark user.
- Record provider, region, instance type, CPU topology, cloud image id,
  firmware/device model, nested-KVM state, QEMU CPU pinning/isolation policy,
  and serial-console collection method in the benchmark artifact.
- Retain provenance for the exact disk/cloud image, kernel, manifest, embedded
  binaries, host toolchain, and cloud image import path.
- Compare cloud-VM results with local QEMU/KVM results only as separate
  environments; do not replace the selected local proof gate with a cloud
  result unless the milestone explicitly changes.
Cloud Device Tracks – Real GCE Polling Path (decoupled from MSI-X)
Decision (2026-06-01): the real-GCE-boot milestone (userspace virtio-net driver
binding a real GCE NIC plus a reachable network data path) is decoupled from
device-autonomous MSI-X interrupt delivery. The production data path uses
polling the used ring, which already works on the non-qemu cloud kernel:
the landed cloud_virtio_net_rx_userspace_submit_proof does a real device->host
RX DMA (used_len=76) with zero interrupts, via the always-built
virtio_transport + poll_used_idx. Every TX/RX data movement and completion
in the repo is already polled; device-autonomous MSI-X remains a parallel
efficiency follow-up, not a boot blocker. The local MSI-X track is now closed:
the missing precondition was explicit PCI COMMAND memory-space/bus-master
enablement in the proof path. With pci_command=0x0107, local QEMU/KVM delivers
virtio-net RX MSI-X vector 0x50 through the guest IDT path with
int_injected=0, idt_handler_observed=true, and one deferred-EOI
acknowledgement. Live-GCE interrupt evidence remains outside the polling-path
critical path.
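The polled completion model described above relies on the virtio used ring's index being a free-running 16-bit counter, so progress is the wrapping difference between the current index and the last one observed. The sketch below shows that arithmetic with a bounded spin budget. It is illustrative only: `UsedRingPoller` is a hypothetical name, and an `AtomicU16` stands in for the device-shared `used.idx` field, which a real driver would read from ring memory with the appropriate volatile access and barriers.

```rust
use std::sync::atomic::{AtomicU16, Ordering};

/// Tracks the last used-ring index the driver has consumed.
pub struct UsedRingPoller {
    last_seen: u16,
}

impl UsedRingPoller {
    pub fn new() -> Self {
        Self { last_seen: 0 }
    }

    /// Returns how many new used entries appeared since the previous poll.
    /// The counter wraps at 2^16, so the difference must wrap too.
    pub fn poll(&mut self, used_idx: &AtomicU16) -> u16 {
        let idx = used_idx.load(Ordering::Acquire);
        let new_entries = idx.wrapping_sub(self.last_seen);
        self.last_seen = idx;
        new_entries
    }

    /// Polls up to `max_iters` times and gives up (None) when nothing
    /// completes, so a silent device cannot hang the caller.
    pub fn poll_bounded(&mut self, used_idx: &AtomicU16, max_iters: u32) -> Option<u16> {
        for _ in 0..max_iters {
            let n = self.poll(used_idx);
            if n != 0 {
                return Some(n);
            }
            std::hint::spin_loop();
        }
        None
    }
}
```

No interrupt is involved at any point: completion is observed purely by re-reading the index, which is why this path does not depend on MSI-X delivery.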
Production-kernel ground truth (verified): PCI/ECAM enumeration, device_manager,
the bounce-buffer DMA backend, MSI-X programming, and all three DDF grant
sources are already always-built. Still cfg(feature = "qemu")-stubbed in
production (the real gap): kernel/src/virtio.rs (legacy driver + smoltcp +
cap/network.rs TCP/UDP socket caps) → virtio_stub.rs returns
DeviceUnavailable.
Ordered slices (only the last is billable; none require interrupt delivery). Slices 1-5d are done; the legacy real-GCE blockers found in flight are all closed locally:
1. RX polled-completion-no-inject local proof (done 2026-06-01): flipped the
   RX-submit proof’s completion observation from the kernel-injected dispatch
   proxy to the already-latched polled used-ring state
   (make run-cloud-provider-virtio-net-rx-polled-completion).
2. Polled provider default manifest (done 2026-06-01): graduated the polled
   RX+TX provider off the per-proof feature into always-built
   cap::virtio_net_polled_provider, staged by a manifest-observable condition
   (make run-cloud-provider-virtio-net-polled-provider-default).
3. Real-polled-driver provider-nic-bound (done 2026-06-02): re-pointed
   cap::provider_nic_bind_proof::report so the marker fires only after the
   real polled provider completes a TX+RX over the live function, removing the
   kernel-side dispatch-slot proxy as the source. The literal system.cue fold
   remains the open remainder
   (make run-cloud-provider-nic-bound-real-polled-driver).
4. Polled teardown / stale-authority (done 2026-06-02): ported the S.11.2
   hostile-smoke discipline (DMA/MMIO/IRQ stale-authority rejection,
   release/reset/driver-death teardown, no host-physical export) to the real
   polled production provider.
5. Network-reachable-datapath scope decision (done 2026-06-02): Option A. The
   milestone’s “reachable network stack” bar means raw-frame TX/RX
   reachability over the live NIC, because the billable make cloudboot-test
   gate checks no L4 socket round-trip. Slices 1-4 + slice 6 close that bar.
   L4 sockets (smoltcp + cap/network.rs socket caps off cfg(qemu) virtio.rs)
   are a separate future track (networking-proposal Phase C). Decision doc:
   network-reachable-datapath-scope-decision.

   5b. [x] Legacy/transitional virtio 0.9 bind (decomposed 2026-06-02): the
   real GCE NIC is a legacy/transitional virtio 0.9 device (PIO config BAR,
   INTx, no MMIO BAR, no MSI-X); the modern-only production polled provider
   returned no candidate on real GCE. Both decomposition slices landed
   2026-06-02, so the local-proof acceptance is closed; the later billable
   slice-6 re-run also passed.
   cloud-prod-virtio-net-legacy-transitional-bind-local-proof.
   - 5b.1 [x] Legacy PIO select (done 2026-06-02): kernel-brokered legacy PIO
     config access (pci::LegacyIoBar / pci::io_bar, scoped to the claimed I/O
     BAR, no ambient port authority) + legacy candidate selection with no
     MSI-X precondition (make run-cloud-provider-virtio-net-legacy-select,
     virtio-net-pci,disable-modern=on,vectors=0).
   - 5b.2 [x] Legacy datapath bind (done 2026-06-02): single-PFN contiguous
     virtqueue materialization (frame::alloc_contiguous, reusing the modern
     ring helpers), legacy PIO notify, 10-byte legacy net header, polled TX
     (ARP) + RX over the legacy device with no MSI-X route
     (make run-cloud-provider-nic-bound-legacy). Sources exactly one
     provider-nic-bound from report_real_completion_legacy.

   5c. [x] Legacy GCE-viable RX stimulus (done 2026-06-02): the landed legacy
   proof’s RX stimulus was QEMU-SLIRP-only (spoofed ARP to 10.0.2.2); replaced
   by a broadcast DHCP DISCOVER from the device’s real MAC (legacy config
   0x14), an accept-any inbound frame completion model, and a wall-clock
   (monotonic_ns) RX budget with an iteration-ceiling backstop. Marker carries
   rx_stimulus=dhcp-discover-broadcast, eth_src=device-mac, -srcmac.<12hex>
   (make run-cloud-provider-nic-bound-legacy).

   5d. [x] Legacy large-queue-size (landed 2026-06-02): live GCE legacy
   virtio-net advertises a 4096-entry virtqueue, exceeding the proof’s
   defensive MAX_LEGACY_QUEUE_SIZE = 1024. Raised to the virtio spec max 32768
   (power-of-two enforced; non-power-of-two / over-bound / zero reject
   cleanly; alloc_contiguous fails closed without panic). QEMU caps queue size
   at 1024 and locks tx_queue_size at 256 for the non-vhost SLIRP legacy
   device, so the largest local shape is rx_queue_size=1024 (8-page RX
   single-PFN vring); the full 4096-entry materialization is a real-GCE
   attestation (make run-cloud-provider-nic-bound-legacy-large-queue).
6. cloud-gcp-virtio-net-nic-driver (reopen), DONE 2026-06-02 (run
   1780412056-e1cb, e2-small, europe-west3-a, source commit 1fb65683): the
   real GCE boot bound the live legacy virtio 0.9 NIC (00:04.0, 1af4:1000)
   through the kernel-brokered legacy polled path and passed
   --require-provider-nic-proof. The full 4096-entry vring materialized on
   real hardware for the first time (rx_vring_pages=28 contiguous), the real
   GCE device MAC was read (src_mac=42:01:0a:c8:00:12), a broadcast DHCP
   DISCOVER was transmitted, and a real device->host RX DMA completed within
   the TSC-governed wall-clock budget (rx_used_len=532 ethertype=0x0800).
   Closes the GCE Polling Path track and retires the
   cloud-gcp-virtio-net-nic-driver blocker. The billable run was authorized on
   2026-05-27 and recorded at commit 2aaeaa53; durable evidence is summarized
   in the completed task entry below. Dispatched as
   cloud-prod-gce-billable-boot-real-polled-nic-bound. To re-run the billable
   bind: build the cloudboot image from the legacy manifest
   system-cloud-provider-virtio-net-legacy-datapath.cue (not the modern
   system-cloud-provider-nic-bound-real-polled-driver.cue; the literal
   system.cue stages no provider), confirm
   make run-cloud-provider-nic-bound-legacy green on the build commit, then
   tools/cloudboot/run-test.sh --require-provider-nic-proof.
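The queue-size rule and the page counts quoted in the legacy slices can be cross-checked with the legacy virtio vring layout: a descriptor table of 16 bytes per entry followed by the available ring, padded to a 4096-byte boundary, then the used ring with 8 bytes per entry. The sketch below is a worked calculation under that layout, not the proof's code; `QUEUE_SIZE_MAX`, `QueueSizeError`, `validate_queue_size`, and `legacy_vring_pages` are names chosen for the example.

```rust
/// Legacy vring alignment between the available and used rings.
const VRING_ALIGN: usize = 4096;
/// Upper bound on queue size stated in the text above.
const QUEUE_SIZE_MAX: u32 = 32768;

#[derive(Debug, PartialEq, Eq)]
pub enum QueueSizeError {
    Zero,
    NotPowerOfTwo,
    TooLarge,
}

/// Reject zero, over-bound, and non-power-of-two sizes instead of panicking.
pub fn validate_queue_size(n: u32) -> Result<usize, QueueSizeError> {
    if n == 0 {
        return Err(QueueSizeError::Zero);
    }
    if n > QUEUE_SIZE_MAX {
        return Err(QueueSizeError::TooLarge);
    }
    if !n.is_power_of_two() {
        return Err(QueueSizeError::NotPowerOfTwo);
    }
    Ok(n as usize)
}

fn align_up(x: usize) -> usize {
    (x + VRING_ALIGN - 1) & !(VRING_ALIGN - 1)
}

/// Number of contiguous 4 KiB pages a legacy vring of `n` entries occupies.
pub fn legacy_vring_pages(n: u32) -> Result<usize, QueueSizeError> {
    let n = validate_queue_size(n)?;
    // Descriptor table (16 bytes each) + available ring (flags, idx, ring[n],
    // used_event: 2 bytes each).
    let desc_and_avail = 16 * n + 2 * (3 + n);
    // Used ring: flags, idx, avail_event (2 bytes each) + 8 bytes per element.
    let used = 2 * 3 + 8 * n;
    Ok((align_up(desc_and_avail) + align_up(used)) / VRING_ALIGN)
}
```

Under this layout a 1024-entry queue needs 8 pages and a 4096-entry queue needs 28, which is consistent with the 8-page local RX vring and the rx_vring_pages=28 figure recorded for the real-GCE run.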
Real-Filesystem Track (2026-06-02)
The real-filesystem direction is decided in
Real-Filesystem Decision:
a role-split, not one on-disk format. capOS-managed state stays capnp-native
(CAPOSWF1/CAPOSST1, evolved not replaced; crash-consistency already proven by
make run-storage-writable-recovery); host-populated/interop images gain
read-only FAT32 via the fatfs no_std crate; a single host capnp image tool
retires the per-format tools/mkstorage-*.py byte-offset hazard. ext4-read is
deferred behind an explicit trigger (“must read a disk capOS did not format”);
FAT write is rejected (no crash-consistency story).
Landed: read-only FAT32 over virtio-blk (kernel/src/cap/fat_fs.rs, vendored
vendor/fatfs-no_std/, make run-storage-fat-read, storage_fat_read feature
on the existing read_only_fs_root source; provenance docs/devices/fat32.md),
and read-only FAT32 over the graduated NVMe read arm (the Nvme BlockSource
arm + deferred FatMount, cloud_fat_read_over_nvme_proof,
make run-cloud-provider-fat-read-over-nvme). See docs/tasks/done/2026-06-02/
and done/2026-06-03/.
Open (next): the real-FS slice chain continues with FAT-over-NVMe follow-ups
and timestamps/provenance on CAPOSST1/CAPOSRO1 where those layouts expose
time metadata. FAT32 now surfaces valid host-authored directory-entry timestamps
over both virtio-blk and NVMe through schema-stable File.stat values, with
proof logs labeling the source as FAT metadata rather than trusted wall-clock
custody. The capnp-native storage smokes and installable-system seeded
variants now use the Rust host capnp image tool as the maintained fixture path;
the retired Python capnp-layout fixture scripts are no longer referenced by the
local proofs. The FAT image path stays on real mkfs.fat / mcopy tooling.
ext4-read stays deferred behind its explicit trigger.
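For the FAT directory-entry timestamps mentioned above, the on-disk encoding packs the date and time into two 16-bit fields. The sketch below decodes them and rejects out-of-range values, which is one way "valid" could be interpreted before a value is surfaced through File.stat. It is a standalone illustration: `FatTimestamp` and `decode_fat_timestamp` are hypothetical names, not the vendored fatfs API.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct FatTimestamp {
    pub year: u16,
    pub month: u8,
    pub day: u8,
    pub hour: u8,
    pub minute: u8,
    pub second: u8,
}

/// Decode a FAT directory-entry date/time pair.
/// Date: bits 15-9 = years since 1980, bits 8-5 = month, bits 4-0 = day.
/// Time: bits 15-11 = hours, bits 10-5 = minutes, bits 4-0 = seconds / 2.
/// Returns None when any field is out of range. Days-per-month and leap
/// years are not checked in this sketch.
pub fn decode_fat_timestamp(date: u16, time: u16) -> Option<FatTimestamp> {
    let year = 1980 + (date >> 9);
    let month = ((date >> 5) & 0x0F) as u8;
    let day = (date & 0x1F) as u8;
    let hour = (time >> 11) as u8;
    let minute = ((time >> 5) & 0x3F) as u8;
    let second = ((time & 0x1F) * 2) as u8;

    let valid = (1..=12).contains(&month)
        && (1..=31).contains(&day)
        && hour <= 23
        && minute <= 59
        && second <= 58;
    if !valid {
        return None;
    }
    Some(FatTimestamp { year, month, day, hour, minute, second })
}
```

The decoded value carries only what the host wrote into the image, with two-second resolution and no time zone, which matches the proof logs labeling the source as FAT metadata rather than trusted wall-clock custody.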
Phase C / L4 Track Opened (relocation, post raw-frame GCE proof) (2026-06-02; refreshed 2026-06-07)
The L4 socket reachability track — relocating the virtio-net driver and
smoltcp into userspace processes (networking-proposal Phase C), sequenced after
the cloud milestone per the
network-reachable-datapath scope decision
(Option A) — is designed in
Phase C Userspace NIC Driver Relocation.
It is no longer waiting on a new security ruling: the selected-write
common-config and DMA-address export pieces landed through the bounded Phase C
slices, reusing the accepted notify-doorbell discipline and the landed
bounce/IOVA-export DMA isolation posture. The lower-layer blocker for Web UI on
a GCE instance is production L4 plus live IPv4 configuration. The full
boot-resource UI bundle is separate parallel work: it is ready and should close
before claiming a useful public Web UI, but it is not the raw NIC/L4 blocker.
Current task chain:
- cloud-prod-nic-driver-userspace-clean-tx-rx-split-local-proof is Phase C
  slice 6 (DONE 2026-06-03). It removed the last coupled raw-frame
  Nic.receive self-stimulus.
- cloud-prod-userspace-network-stack-smoltcp-local-proof is Phase C slice
  7c-ii(b) (DONE 2026-06-07). It locally proves the selected
  serve-from-userspace architecture: the non-qemu cloudboot manifest starts a
  userspace smoltcp network-stack service, the service spawns an application
  client with only Console plus a served TcpListenAuthority, and the client
  completes one hostfwd TCP request/response through a served TcpListener and
  TcpSocket. The armed path now receives socket authority from the userspace
  smoltcp service for this proof rather than extending the legacy kernel
  cap/network.rs/virtio_stub.rs socket owner. The selected design is recorded
  in the Phase C proposal’s 7c-ii Mechanism and Decomposition section.
- cloud-prod-legacy-kernel-network-socket-path-retirement is done. Non-qemu
  production manifests now reject legacy kernel
  network_manager/tcp_listen_authority grants, so the armed socket route stays
  behind the userspace network-stack service; remaining kernel socket grants
  are qemu-only fixtures.
- cloud-prod-phase-c-kernel-smoltcp-virtio-net-removal is done. It removes the
  kernel smoltcp dependency, retires the qemu-only kernel TCP/UDP runtime
  behind fail-closed socket entry points, and leaves the remaining virtio-net
  code as lower-layer QEMU fixture evidence rather than production cloud
  socket ownership.
- cloud-prod-network-stack-dhcp-ipv4-config-local-proof is done. It follows
  the served-socket proof and locally proves DHCP/IPv4 lease acquisition,
  default-route installation, ARP/neighbor resolution, and userspace-served
  NetworkManager.getConfig status needed by a GCE-hosted listener.
- Network Usability and Post-smoltcp decomposes the follow-on usability lanes:
  operator status tooling, DHCPv4 renewal/rebind/expiry/status beyond the
  first config proof, system DnsResolver, POSIX getaddrinfo, ping/ping6
  diagnostics, socket readiness/cancel/backpressure, packet trace authority,
  and transport policy/status. These are not first public Web UI blockers
  except for the already-listed DHCP/IPv4 config proof.
- remote-session-self-served-full-ui-bundle is done and provides the reviewed
  fixed-name boot-resource operator bundle for follow-on Web UI proofs.
- cloud-prod-remote-session-web-ui-l4-local-proof now consumes the done
  userspace L4 and DHCP/IPv4 config proofs; it proves remote-session-web-ui
  locally on the non-qemu cloudboot socket path.
- cloud-gce-legacy-virtio-webui-serving-local-proof is done (2026-06-11 04:26
  UTC), proved by make run-cloud-gce-legacy-virtio-webui-serving. It closes
  the local legacy-datapath serving gap: a persistent kernel-brokered legacy
  virtio 0.9 polled runtime
  (cap::virtio_net_legacy_datapath_proof::legacy_nic_runtime, kernel feature
  cloud_gce_legacy_virtio_webui_serving_proof) backs the same typed Nic cap
  the modern path serves, and the Phase C userspace network stack plus
  remote-session-web-ui serve the fixed UI bundle to a host HTTP peer over the
  GCE NIC shape (disable-modern=on, no MSI-X), byte-verified against the
  committed bundle pin with a single
  cloudboot-evidence: legacy-virtio-webui-serving marker. PIO/vring ownership
  stays kernel-side; no host-physical, IOVA, queue, or port-I/O authority
  crosses the cap boundary. This closes only the LOCAL serving story; it does
  not claim private GCE reachability.
- cloud-gce-private-self-hosted-webui-proof is on hold (2026-06-09). Its local
  prerequisites are done, and the legacy-datapath Web UI serving story is now
  locally proven (2026-06-11 04:26 UTC, above), but it still shares the
  missing firewall IAM / default-deny ingress blocker recorded on
  cloud-gce-private-icmp-echo-proof: the cloudtest credential cannot create
  firewall rules, so a private probe cannot reach the instance. It keeps the
  current no-public-IP cloudboot posture and requires a private probe that
  crosses the live GCE NIC under an explicit billable-run authorization.
- cloud-gce-public-webui-ingress-tls-policy-design is done and records the
  selected ingress, TLS/certificate, firewall/source, browser session, and
  teardown policy for public exposure work.
- cloud-gce-public-self-hosted-webui-ingress-tls is blocked on the private
  proof; public operator access is a separate exposure slice that implements
  the recorded ingress/TLS policy.
- cloud-prod-phase-c-kernel-smoltcp-virtio-net-removal is the done Phase C
  exit cleanup after userspace L4 was proven. It is not the first GCE Web UI
  proof, and it does not claim private GCE reachability, public ingress, or
  TLS.
Networking diagnostics and stack-completeness follow-ups:
- cloud-prod-icmp-echo-reply-local-proof is done (2026-06-08). It consumes the
  done userspace L4 and DHCP/IPv4 config proofs, acquires a local DHCP lease,
  proves a same-subnet ARP plus ICMP Echo Request / Echo Reply exchange that
  preserves identifier, sequence, and payload, and rejects malformed or
  oversized requests with a bounded per-poll budget. This is diagnostics, not
  Web UI readiness.
- cloud-prod-icmp-echo-reply-real-nic-datapath-local-proof is done
  (2026-06-08), proved by
  make run-cloud-prod-icmp-echo-reply-real-nic-datapath. The done local
  responder proof above runs smoltcp over an in-process QueuePhyDevice: it
  injects the inbound Echo Request in-process and uses the real Nic cap only
  for the DHCP lease and ARP probe, so no inbound ICMP traverses
  Nic.receivePoll/Nic.transmit. The live GCE NIC is legacy virtio 0.9 (no
  userspace driver authority), so an inbound Echo Reply over the real NIC
  needs a kernel-owned responder on the legacy datapath. This task built that
  responder
  (cap::virtio_net_legacy_datapath_proof::run_icmp_echo_reply_real_nic_datapath)
  and locally proved it: a host peer over a QEMU socket netdev (not SLIRP,
  which drops inbound host->guest ICMP Echo) drives DHCP, ARP, multiple
  malformed Echo Requests (rejected; icmp_malformed_drops>=1), then a valid
  one, and the kernel answers an RFC 792 Echo Reply over the real RX/TX
  vrings, emitting
  cloudboot-evidence: icmp-echo-reply-real-nic-datapath <token> with
  rx_inbound_provenance=real-nic-rx-vring/in_process_queuephydevice=absent.
- cloud-gce-private-icmp-echo-proof is blocked (2026-06-09) on GCP firewall
  IAM. Its harness, GCE-importable image
  (make capos-gce-private-icmp-echo-cloudboot-image), and probe orchestration
  (tools/cloudboot/run-test.sh --require-private-icmp-proof) are implemented
  and pre-spend-validated locally, and a real billable run
  (1780962265-4a2e) proved the GCE datapath: capOS DHCP-leased the exact
  GCE-assigned IP 10.200.0.38 over the live legacy virtio 0.9 NIC and emitted
  cloudboot-evidence: icmp-echo-reply-real-nic-datapath-ready, with the probe
  pinging that IP during capOS’s responder window. The pings showed 100% loss
  because GCE default-denies ingress and the cloudtest service-account
  credential lacks compute.firewalls.create/.delete/.list, so no temporary
  ICMP rule could be created; all resources tore down cleanly. Unblock by
  granting those firewall permissions to the cloudtest credential or
  pre-provisioning a persistent allow-ICMP rule in the cloudtest VPC network,
  then re-running make cloudboot-gce-private-icmp-echo-test. It proves private
  same-VPC ping over the live NIC with no public ICMP exposure and should not
  become a public HTTPS Web UI closeout condition unless a later ingress
  policy explicitly chooses ICMP health checks.
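The Echo Request / Echo Reply contract named in the responder proofs (preserve
identifier, sequence, and payload; reject malformed or oversized requests) can
be sketched as a pure function over the ICMP message bytes. This is a minimal
illustration under stated assumptions, not the capOS responder: the names
build_echo_reply, EchoError, and MAX_ICMP_LEN are hypothetical, the size bound
is an arbitrary example value, and the per-poll budget is out of scope.

```rust
/// Hypothetical upper bound on an ICMP message this sketch will answer.
const MAX_ICMP_LEN: usize = 1472;
const ICMP_ECHO_REQUEST: u8 = 8; // RFC 792 type for Echo Request
const ICMP_ECHO_REPLY: u8 = 0; // RFC 792 type for Echo Reply

/// Hypothetical rejection classes for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum EchoError {
    TooShort,
    Oversized,
    NotEchoRequest,
    BadChecksum,
}

/// RFC 1071 Internet checksum: one's-complement sum of big-endian 16-bit
/// words, with a trailing odd byte padded with zero.
fn internet_checksum(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    let mut words = data.chunks_exact(2);
    for pair in &mut words {
        sum += u32::from(u16::from_be_bytes([pair[0], pair[1]]));
    }
    if let [last] = words.remainder() {
        sum += u32::from(*last) << 8;
    }
    // Fold the carries back into the low 16 bits.
    while (sum >> 16) != 0 {
        sum = (sum & 0xffff) + (sum >> 16);
    }
    !(sum as u16)
}

/// Turn a validated Echo Request into an Echo Reply. Bytes 4.. (identifier,
/// sequence, payload) are copied unchanged; only type and checksum change.
fn build_echo_reply(request: &[u8]) -> Result<Vec<u8>, EchoError> {
    if request.len() < 8 {
        return Err(EchoError::TooShort);
    }
    if request.len() > MAX_ICMP_LEN {
        return Err(EchoError::Oversized);
    }
    if request[0] != ICMP_ECHO_REQUEST || request[1] != 0 {
        return Err(EchoError::NotEchoRequest);
    }
    // A message with a correct checksum field sums to zero.
    if internet_checksum(request) != 0 {
        return Err(EchoError::BadChecksum);
    }
    let mut reply = request.to_vec();
    reply[0] = ICMP_ECHO_REPLY;
    reply[2] = 0;
    reply[3] = 0;
    let checksum = internet_checksum(&reply);
    reply[2..4].copy_from_slice(&checksum.to_be_bytes());
    Ok(reply)
}
```

The length checks run before the checksum so that an oversized input is
rejected without touching its body, which keeps the work per request bounded.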
IPv6 Support Lane, Non-Blocking For First Public Web UI
The current Web UI cloud path is deliberately IPv4-first: Phase C userspace L4,
DHCP/IPv4 configuration, ARP, private GCE reachability, and reviewed public
HTTPS ingress remain the required blockers for the first public proof. IPv6 is
not a reason to hold that path. It is a separate network-stack capability lane
because the old qemu-only runtime remains IPv4-only and the legacy
kernel-owned non-qemu socket fallback is retired; the Phase C userspace
service path now carries the explicit address-family ABI. Private GCE IPv6
reachability and public IPv6 ingress policy remain unproven. Local ICMPv6 Echo
Reply, GCE-style DHCPv6 configuration, and IPv6 TCP listener/connect behavior
now have bounded local proofs.
The task chain is:
- cloud-prod-ipv6-architecture-status-grounding is done (2026-06-03). It
  recorded the explicit current-state audit and the non-blocking decision,
  then unblocked the address-ABI task.
- cloud-prod-network-address-abi-ipv6 is done (2026-06-03, the lane’s entry
  point). The socket/interface address ABI now represents IPv4 and IPv6
  explicitly through IpAddressFamily and a documented address-length
  contract: getConfig reports the family plus an ipv6Supported flag, and the
  IPv4-only stack rejects IPv6 with a distinct ipv6Unsupported class and
  malformed lengths with malformedAddress, source-compatible for existing
  4-byte IPv4 callers. Proof: make run-cloud-prod-network-address-abi-ipv6.
- cloud-prod-ipv6-link-local-nd-local-proof is done (2026-06-08). It enables
  the local smoltcp IPv6 feature set, installs a link-local address, verifies
  all-nodes plus solicited-node multicast joins, and proves a bounded Neighbor
  Solicitation / Neighbor Advertisement exchange plus cached-peer UDP egress
  locally. Proof: make run-cloud-prod-ipv6-link-local-nd.
- cloud-prod-ipv6-ra-slaac-local-proof is done (2026-06-08). It proves Router
  Solicitation, Router Advertisement acceptance, SLAAC address configuration,
  default-route installation, invalid-RA rejection, and prefix/default-route
  expiry locally. Proof: make run-cloud-prod-ipv6-ra-slaac.
- cloud-prod-ipv6-dhcpv6-gce-config-local-proof is done (2026-06-08). It
  proves a local GCE-shaped DHCPv6 Solicit / Advertise / Request / Reply
  exchange, installs the assigned /128, keeps default-route provenance tied to
  Router Advertisement, and rejects wrong source, wrong port, transaction-id,
  identifier, oversized-option, lease-lifetime/timer, and timeout cases.
  Proof: make run-cloud-prod-ipv6-dhcpv6-gce-config.
- cloud-prod-icmpv6-echo-reply-local-proof is done (2026-06-08). It proves
  bounded local ICMPv6 Echo Request / Echo Reply handling through the Phase C
  userspace smoltcp substrate, including identifier, sequence, payload
  preservation and checksum, type/code, address-family, and oversized-input
  rejection. It is diagnostics and stack completeness, not Web UI readiness.
- network-ping6-diagnostics-tool-local-proof is done (2026-06-08). It proves a
  bounded local ping6-style diagnostic over the smoltcp ICMP socket path,
  including link-local scope reporting, configured global address status,
  malformed-reply drop, timeout/unreachable classification, one bounded retry
  after the neighbor-discovery timer, payload bounds, and
  one-outstanding-request enforcement. It remains diagnostics only and does
  not change the IPv4-first Web UI critical path or authorize public IPv6
  ingress.
- cloud-prod-ipv6-tcp-l4-local-proof is done (2026-06-08). It proves TCP
  listener and connect behavior through the production socket contract with
  IPv6 endpoints. Proof: make run-cloud-prod-ipv6-tcp-l4.
- cloud-prod-ipv6-real-nic-datapath-local-proof is ready. The done IPv6 proofs
  above run smoltcp over an in-process HarnessPhyDevice peer (markers
  self-declare metadata_only=true/public_ingress=not-attempted) and use the
  real Nic cap only for MAC/link status; the real-NIC TX/RX datapath exists
  today only for IPv4 (cloud-prod-network-stack-dhcp-ipv4-config-smoke). This
  task builds the IPv6 DHCPv6/RA + probe datapath over the real bound NIC and
  proves it locally, emitting
  cloudboot-evidence: ipv6-real-nic-datapath <token>.
- cloud-gce-private-ipv6-reachability-proof is on-hold on missing GCP IAM
  access. The real-NIC IPv6 datapath proof above is now done, but its live-GCE
  acceptance fundamentally requires a dual-stack subnet (so the GCE NIC
  receives an IPv6 assignment at all) plus an IPv6 ingress firewall rule for
  the same-VPC probe. The cloudtest service-account credential lacks
  compute.networks.create/compute.subnetworks.* (the only existing cloudtest
  subnet is IPv4-only) and compute.firewalls.create/.delete/.list, so neither
  can be provisioned. Unblock by granting those permissions, or by
  pre-provisioning a dual-stack subnet plus an IPv6 ingress rule in the
  cloudtest VPC scoped to the probe. See the on-hold record for the
  consolidated blocker analysis and the parked
  codex/cloud-gce-private-ipv6-reachability-proof harness checkpoint.
- cloud-gce-public-ipv6-ingress-tls-policy-update is blocked on the private
  IPv6 proof, then updates the selected public Web UI ingress/TLS policy for
  DNS/AAAA, IPv6 firewall, TLS coverage, and teardown before any public IPv6
  exposure.
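The address-length contract described for cloud-prod-network-address-abi-ipv6
can be illustrated with a small validator. This is a sketch under stated
assumptions rather than the capOS ABI: only the IpAddressFamily name and the
ipv6Unsupported / malformedAddress classes come from the text above, while the
validate_address function, the AddressError type, and the choice to report
ipv6Unsupported before checking the IPv6 length are hypothetical.

```rust
/// Address family carried explicitly alongside the address bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IpAddressFamily {
    Ipv4,
    Ipv6,
}

/// Hypothetical error type mirroring the two rejection classes in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AddressError {
    Ipv6Unsupported,
    MalformedAddress,
}

/// Check an address against the family/length contract:
/// IPv4 addresses are exactly 4 bytes, IPv6 addresses exactly 16 bytes.
/// `ipv6_supported` stands in for the stack's ipv6Supported status flag.
fn validate_address(
    family: IpAddressFamily,
    bytes: &[u8],
    ipv6_supported: bool,
) -> Result<(), AddressError> {
    match family {
        // Existing 4-byte IPv4 callers keep working unchanged.
        IpAddressFamily::Ipv4 if bytes.len() == 4 => Ok(()),
        // Assumption: an IPv4-only stack reports the unsupported family
        // before it looks at the length.
        IpAddressFamily::Ipv6 if !ipv6_supported => Err(AddressError::Ipv6Unsupported),
        IpAddressFamily::Ipv6 if bytes.len() == 16 => Ok(()),
        // Any family/length mismatch is a malformed address.
        _ => Err(AddressError::MalformedAddress),
    }
}
```

Keeping the two error classes distinct lets a caller tell "this stack cannot
do IPv6" apart from "this address is the wrong size for its family".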
Non-blocking GCE gVNIC portability lane:
- cloud-gce-gvnic-protocol-grounding-device-map is done. It landed the GCE
  gVNIC provenance map from the Google Cloud gVNIC docs and the Google/Linux
  GVE driver documentation: PCI identity (0x1ae0:0x0042),
  BAR/admin-queue/MSI-X wire subset, GQI/DQO formats, QPL/RDA addressing, and
  the planned DDF (DeviceMmio/DMAPool/DMABuffer/Interrupt) authority mapping.
  No capOS gVNIC driver or QEMU model exists yet.
- cloud-gce-gvnic-image-launch-inventory-proof is done. It requests GVNIC
  image/instance launch posture, reads the GCE image/instance policy back,
  proves serial PCI inventory for the 1ae0:0042 function with BAR and MSI-X
  metadata, and records that no gVNIC driver bind was claimed. The live run
  used a private no-public-IP/no-service-account VM and completed teardown.
- cloud-gce-gvnic-adminq-register-proof is done. It builds a proof-only
  cloudboot image, maps the live GCE gVNIC BAR0 through DeviceMmio, allocates
  manager-owned bounce-buffer DMA pages for the admin queue and descriptor,
  issues one DESCRIBE_DEVICE command, releases the admin queue, and checks
  stale DeviceMmio/DMAPool/DMABuffer handles. The live private GVNIC run
  completed teardown and recorded no userspace host-physical/IOVA export and
  no provider NIC bind.
- cloud-gce-gvnic-raw-frame-tx-rx-proof is done. It builds a proof-only
  cloudboot image, configures one GQI/QPL TX queue and one RX queue over the
  live GCE gVNIC, sends one DHCP DISCOVER raw Ethernet frame from the device
  MAC, receives one inbound IPv4 frame, destroys queues, unregisters QPLs,
  deconfigures resources, releases/resets the admin queue, and records no
  provider Nic bind claim.
- cloud-gce-gvnic-nic-cap-adaptation-proof is done. It adapts the proven
  GQI/QPL queue path behind the existing typed Nic semantics and emits
  gvnic-nic-cap-adaptation evidence with inline-frame TX/RX, MAC/link
  metadata, hidden queue addresses, no host-physical or IOVA export, and no
  provider-nic-bound claim. It remains a portability/future-machine-family
  lane; the first public Web UI proof can stay on the already-proven GCE
  virtio-net path.
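The PCI identity check behind the inventory proof can be sketched as follows.
This is an illustration only: the 0x1ae0:0x0042 identity comes from the text
above and the register layout (vendor ID in the low 16 bits and device ID in
the high 16 bits of configuration dword 0) is standard PCI, but the function
names are hypothetical and do not describe the capOS inventory code.

```rust
/// gVNIC PCI identity as recorded in the grounding task.
const GVNIC_VENDOR_ID: u16 = 0x1ae0;
const GVNIC_DEVICE_ID: u16 = 0x0042;

/// Split PCI configuration dword 0 into (vendor ID, device ID).
/// Vendor ID is the low 16 bits, device ID the high 16 bits.
fn split_pci_id(config_dword0: u32) -> (u16, u16) {
    ((config_dword0 & 0xffff) as u16, (config_dword0 >> 16) as u16)
}

/// True when the function at this config dword is a gVNIC.
fn is_gvnic(config_dword0: u32) -> bool {
    split_pci_id(config_dword0) == (GVNIC_VENDOR_ID, GVNIC_DEVICE_ID)
}

/// Render the identity in the `vvvv:dddd` form used in the inventory text.
fn inventory_tag(config_dword0: u32) -> String {
    let (vendor, device) = split_pci_id(config_dword0);
    format!("{:04x}:{:04x}", vendor, device)
}
```

Matching on identity alone only establishes inventory; it does not imply a
driver bind, which is the distinction the inventory proof records.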