# Cloud Driver Foundation: Gap Analysis

## Premise Correction

A prior framing held that "capOS has no userspace device-driver foundation."
That is **wrong**. The userspace virtio driver foundation exists and is proven in
QEMU across a month of landed DDF work. This document establishes precisely what
the foundation covers and reduces each blocked cloud-driver task to its narrow
real remaining gap, so no one re-implements a foundation that already exists.

## What The Foundation Already Provides (proven, in `docs/tasks/done/`)

- **Device-agnostic virtio DMA/notify seam + relocated queue/discovery**
  (`ddf-virtio-driver-foundation-boundary`, 2026-05-25). The split-ring
  `Virtqueue` and `discover_modern_transport` live in
  `kernel/src/virtio.rs mod transport`, driven through the `VirtqueueDma` seam
  (preflight/register/allocate/free/record-submission/record-completion over the
  `device_dma` ledger). virtio-net is *one* caller of the seam, not the only
  possible caller -- a non-net virtio device (e.g. virtio-blk) can drive the same
  bounded ledger semantics. Proofs: `make run-net`, `make run-ddf-provider-consumer`.
- **Userspace provider owns the selected virtio-net TX queue end-to-end**
  (`ddf-provider-virtio-net-driver-closeout`, 2026-05-23). A userspace process
  publishes real selected-queue TX descriptors, rings the doorbell through a
  `DeviceMmio` notify-write claim, consumes the TX used-ring completion, and
  exposes CQ identity -- all through user-mode `DMAPool`/`DMABuffer`/`DeviceMmio`/
  `Interrupt` authority, with no silent fallback to the in-kernel virtio-net TX
  helper while the provider owns TX. RX is bounded synthetic-token CQ identity
  (kernel RX cohabitation explicit). DMA backend is manager-owned bounce buffers.
- **Manager-granted provider/consumer authority lifecycle**
  (`ddf-userspace-driver-provider-consumer`, 2026-05-11). A userspace provider
  consumes manifest-granted DMAPool/DeviceMmio/Interrupt authority; stale-authority
  rejection, revoke, and release/reset/driver-death teardown are proven.
- **GCP virtio-net function bound through the gate locally in QEMU**
  (`cloud-gcp-virtio-net-local-qemu-binding`, 2026-05-26). The enumerated/bound
  function matches the documented GCP 1st/2nd-gen virtio-net surface (vendor
  `0x1af4`), and the resolved DMA backend is the labeled bounce-buffer path.
  Proofs: `make run-net`, `make run-ddf-provider-consumer`.
- **DMA backend selection** (`cloud-dma-backend-selection`, 2026-05-24): boot probe
  -> fail-closed select -> manifest override; GCE resolves to bounce-buffer.
- **Production IOMMU remapping closeout** (`ddf-iommu-remapping-production-closeout`,
  2026-05-23): the direct-remapping domain path for IOMMU shapes
  (`make run-iommu-remapping`).
- **First `BlockDevice` CapObject** (`ddf-blockdevice-boundary-virtio-blk-smoke`,
  2026-05-25): a bounded sector write/read-back over virtio-blk
  (`make run-virtio-blk`). **Note: this `BlockDevice` is kernel-side**, over
  manager-owned bounce buffers -- it is not a userspace storage provider.
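
The "boot probe -> fail-closed select -> manifest override" ordering from the
DMA backend selection item can be sketched as below. This is a minimal
illustration, not the landed code: `ProbeResult`, `DmaBackend`, `SelectError`,
and `select_backend` are hypothetical names, and the rule that an override may
not request direct remapping without a verified IOMMU is an assumption about
what "fail-closed" means here.

```rust
/// Outcome of the boot-time probe (hypothetical shape, not the kernel's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProbeResult {
    /// A remapping IOMMU was found and verified.
    IommuVerified,
    /// No IOMMU is visible (the GCE shape described above).
    NoIommu,
    /// The probe could not reach a conclusion.
    Inconclusive,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DmaBackend {
    BounceBuffer,
    DirectRemapping,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SelectError {
    /// No override was given and the probe was inconclusive: refuse to guess.
    ProbeInconclusive,
    /// The override asked for remapping that the probe did not verify.
    OverrideNotVerified,
}

/// Select a DMA backend. A manifest override wins over the probed default,
/// but (assumption) it can never enable direct remapping on its own.
pub fn select_backend(
    probe: ProbeResult,
    manifest_override: Option<DmaBackend>,
) -> Result<DmaBackend, SelectError> {
    match manifest_override {
        Some(DmaBackend::DirectRemapping) if probe != ProbeResult::IommuVerified => {
            Err(SelectError::OverrideNotVerified)
        }
        Some(backend) => Ok(backend),
        None => match probe {
            ProbeResult::IommuVerified => Ok(DmaBackend::DirectRemapping),
            ProbeResult::NoIommu => Ok(DmaBackend::BounceBuffer),
            // Fail closed: an unknown platform gets no backend at all.
            ProbeResult::Inconclusive => Err(SelectError::ProbeInconclusive),
        },
    }
}
```

Under these assumptions, `select_backend(ProbeResult::NoIommu, None)` yields
the bounce-buffer backend, which mirrors the GCE resolution stated above.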

## Boundary Of The Foundation (where userspace ownership stops today)

- **NIC: userspace owns virtio-net TX; RX is synthetic/cohabited.** No live
  hardware RX used-ring ownership, no direct DMA/IOMMU on the provider path, no
  cloud enumeration.
- **Storage: there is no userspace storage provider of any device class.** The
  `BlockDevice` cap is kernel-side; NVMe is metadata-only
  (`kernel/src/pci.rs` enumerates the controller and emits a `no-authority/
  no-driver ... controller_init=not-started` line, no register/queue/IDENTIFY/IO
  code). The NIC userspace driver does not transfer to storage: NVMe is a
  different device class (admin/IO submission+completion queue pairs, doorbells,
  PRP/SGL), and even userspace virtio-blk/virtio-scsi has no provider driver --
  the foundation seam makes it *possible*, but no slice has built it.
- **Production grant sources stage an arbitrary function through one
  device-agnostic entry point (done 2026-05-30).** The non-`qemu`
  `{dmapool,devicemmio,interrupt}_grant_source_prod` statics previously inferred
  their candidate function from a hardcoded selection rule narrowed by
  `#[cfg(feature = "cloud_*")]` blocks scattered through each `pick_candidate`
  body. `cloud-prod-grant-source-despecialization` replaced that with one
  `stage_with_class` entry point per source that takes an explicit
  `ProdGrantClass` device-class descriptor (`cap::prod_grant_source_class`):
  `AnyFunction` (plain BAR / first usable function), `DmaCapable` (virtio or
  NVMe), or `NvmeController` (NVMe only); the DeviceMmio source additionally
  takes the explicit mapped-window length (one page for the plain/virtio-net
  notify family, two pages for the NVMe `CC`/admin-register selected-write
  region). The no-arg `init()` wrappers select the build's descriptor and
  delegate, so a non-virtio-net function is staged by passing the matching
  descriptor rather than by reaching virtio-net-specific code. The transitional
  in-kernel `qemu`-path grant sources still carry the per-function
  `init_*_for_device` / `init_provider_*` variants; those follow the virtio
  transport into userspace under Phase C of the networking proposal rather than
  through this slice.
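
As a rough illustration of staging by explicit class descriptor, the sketch
below models the three `ProdGrantClass` variants named above and picks the
first matching function. Only the variant names and the one-page/two-page
window lengths come from this document; `PciFunction`, `StagedWindow`,
`stage_with_class_sketch`, the 4 KiB page size, and the exact matching rules
are hypothetical stand-ins, not the real `cap::prod_grant_source_class`
signatures.

```rust
/// Device-class descriptor (variant names from the text; semantics sketched).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProdGrantClass {
    AnyFunction,
    DmaCapable,
    NvmeController,
}

/// Hypothetical summary of one enumerated PCI function.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PciFunction {
    pub vendor_id: u16,
    pub class_code: u8,
    pub subclass: u8,
    pub prog_if: u8,
    pub has_usable_bar: bool,
}

const VIRTIO_VENDOR: u16 = 0x1af4;
const PAGE_SIZE: usize = 4096; // assumption: 4 KiB pages

impl PciFunction {
    fn is_virtio(&self) -> bool {
        self.vendor_id == VIRTIO_VENDOR
    }
    /// PCI class 0x01 (mass storage), subclass 0x08 (NVM), prog-if 0x02 (NVMe).
    fn is_nvme(&self) -> bool {
        self.class_code == 0x01 && self.subclass == 0x08 && self.prog_if == 0x02
    }
}

impl ProdGrantClass {
    /// Does this descriptor accept the given function?
    pub fn matches(self, f: &PciFunction) -> bool {
        if !f.has_usable_bar {
            return false;
        }
        match self {
            ProdGrantClass::AnyFunction => true,
            ProdGrantClass::DmaCapable => f.is_virtio() || f.is_nvme(),
            ProdGrantClass::NvmeController => f.is_nvme(),
        }
    }
}

/// What a DeviceMmio-style source would stage: which function, how many bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StagedWindow {
    pub function_index: usize,
    pub window_len: usize,
}

/// Hypothetical single entry point: the caller names the class and the
/// mapped-window length explicitly instead of relying on cfg-gated inference.
pub fn stage_with_class_sketch(
    functions: &[PciFunction],
    class: ProdGrantClass,
    window_pages: usize,
) -> Option<StagedWindow> {
    if window_pages == 0 {
        return None; // an empty window is never a valid grant
    }
    functions
        .iter()
        .position(|f| class.matches(f))
        .map(|function_index| StagedWindow {
            function_index,
            window_len: window_pages * PAGE_SIZE,
        })
}
```

In this sketch, an NVMe bring-up caller would pass
`ProdGrantClass::NvmeController` with two pages, while the virtio-net notify
path would pass `AnyFunction` or `DmaCapable` with one page; neither reaches
code specific to the other device.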

## Per-Task Gap (the narrow real Y)

### `cloud-gcp-virtio-net-nic-driver` -> runnable-now claim is superseded

The 2026-05-27 version of this document concluded that the GCP virtio-net live
driver task was runnable as a cloud-evidence slice. That conclusion is now
stale. The local production cloudboot bind markers have landed, but
`cloud-prod-provider-nic-bound-local-proof` deliberately settled its completion
boundary with a kernel-side dispatch-slot proxy because the production
userspace-provider grant/waiter surface is still not available in the
non-`qemu` cloudboot build.

The current local production chain is therefore still implementation work, not
just billable evidence capture. The
`cloud-prod-provider-devicemmio-grant-source-local-proof`,
`cloud-prod-provider-dmapool-grant-source-local-proof`, and
`cloud-prod-provider-interrupt-grant-source-local-proof` children are **done**
(2026-05-28): the non-`qemu` cloudboot kernel can deliver `DeviceMmio`,
`DMAPool`, and `Interrupt` grants to small userspace provider services through
manifest/process-spawner delivery, each with its own local-QEMU proof and
bounded caveats. The aggregate docs-status closeout
`cloud-prod-provider-grant-surface-local-proof` is also **done** (2026-05-28):
it records those landed children as one provider grant-surface boundary
without adding new behavior. The remaining local production work is
`cloud-prod-provider-cap-waiter-local-proof`, then
`cloud-prod-virtio-net-userspace-provider-local-proof` (and the brokered NVMe
sibling). Only after those local production userspace-provider tasks land does
the live-GCE NIC task reduce to a cloud evidence/harness run.

The access and spend corrections still stand: GCE access is provisioned and the
operator authorized billable runs on 2026-05-27. The blocker is local
production userspace-provider authority, not cloud access.

### Storage tasks -> gap is a userspace NVMe-class storage provider

`cloud-gcp-storage-driver`, `cloud-gcp-storage-local-qemu-binding`,
`cloud-aws-nvme-storage-driver`, and `cloud-azure-disk-storage-driver` all
reduce to the same genuine missing piece: **a userspace storage provider
driver**. virtio-net TX ownership does not carry over to storage. Two real
sub-gaps:

1. **No userspace storage provider driver.** The options are (a) a userspace
   virtio-blk/virtio-scsi provider over the existing virtio seam (the kernel
   `BlockDevice` is kernel-side and does not satisfy the "no hidden kernel DMA
   ownership" acceptance criterion), or (b) a userspace NVMe-class driver
   (controller bring-up + admin/IO queue pairs + doorbells + PRP DMA) over the
   bounce-buffer/IOMMU backend.
   NVMe is the strategic target: GCP 3rd-gen+, AWS Nitro EBS, and Azure Boost are
   all NVMe, so one NVMe foundation unblocks all three providers' storage legs.
2. **The no-IOMMU `run-pci-nvme` proof gate and the DMA-address ownership model.**
   A real provider-driven NVMe completion + "no hidden kernel DMA ownership" +
   "no host-physical exposure" must all hold under the no-IOMMU bounce-buffer
   shape. The 2026-05-27 Model B override (provider writes queue-base/PRP
   addresses, kernel validates on notify) does **not** satisfy those constraints
   on the current no-IOMMU gate: there the device-visible address equals the
   host-physical address, and the reviewed IOVA export discipline intentionally
   returns no usable device address to userspace.

   The correction is to split the lanes. Model B remains valid for a verified
   direct-remapping/vIOMMU gate, or a future synthetic address namespace
   translated by trusted code. The GCP/no-IOMMU lane must use brokered bounce:
   the provider owns NVMe protocol state and buffer/command capabilities, while
   the kernel or device manager materializes `ASQ`/`ACQ`, I/O queue-base, and
   PRP/SGL device-visible fields from the live `DMAPool` ledger. That is the
   only current path that preserves no-host-physical-exposure on GCP.
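
The brokered-bounce split described above can be sketched as follows: the
provider names a buffer by capability handle plus offset, and only the trusted
side resolves that to a device-visible address. This is a hosted, simplified
model under stated assumptions. `DmaLedger`, `BufferHandle`, `ReadRequest`,
`MaterializedCommand`, and `materialize` are hypothetical names; the real
`DMAPool` ledger, its generation scheme, and PRP list handling are not shown
in this document, and the sketch fills a single PRP entry only.

```rust
use std::collections::HashMap;

/// Opaque handle the provider holds; it carries no address.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufferHandle {
    pub id: u32,
    pub generation: u32,
}

/// Trusted-side record; `device_addr` never leaves the broker.
struct LedgerEntry {
    generation: u32,
    live: bool,
    device_addr: u64,
    len: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BrokerError {
    UnknownBuffer,
    StaleHandle,
    OutOfBounds,
}

#[derive(Default)]
pub struct DmaLedger {
    entries: HashMap<u32, LedgerEntry>,
}

impl DmaLedger {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record a bounce buffer; re-registering an id bumps its generation.
    pub fn register(&mut self, id: u32, device_addr: u64, len: u64) -> BufferHandle {
        let generation = self
            .entries
            .get(&id)
            .map_or(0, |e| e.generation.wrapping_add(1));
        self.entries
            .insert(id, LedgerEntry { generation, live: true, device_addr, len });
        BufferHandle { id, generation }
    }

    /// Revoke a buffer; outstanding handles to it become stale.
    pub fn revoke(&mut self, id: u32) {
        if let Some(e) = self.entries.get_mut(&id) {
            e.live = false;
        }
    }

    /// Broker-only: turn (handle, offset, len) into a device-visible address.
    fn resolve(&self, h: BufferHandle, offset: u64, len: u64) -> Result<u64, BrokerError> {
        let e = self.entries.get(&h.id).ok_or(BrokerError::UnknownBuffer)?;
        if !e.live || e.generation != h.generation {
            return Err(BrokerError::StaleHandle);
        }
        let end = offset.checked_add(len).ok_or(BrokerError::OutOfBounds)?;
        if len == 0 || end > e.len {
            return Err(BrokerError::OutOfBounds);
        }
        e.device_addr.checked_add(offset).ok_or(BrokerError::OutOfBounds)
    }
}

/// What the provider submits: protocol intent plus a capability, no addresses.
pub struct ReadRequest {
    pub buffer: BufferHandle,
    pub offset: u64,
    pub len: u64,
    pub lba: u64,
}

/// What the trusted side writes toward the device.
#[derive(Debug, PartialEq, Eq)]
pub struct MaterializedCommand {
    pub prp1: u64,
    pub lba: u64,
    pub len: u64,
}

pub fn materialize(ledger: &DmaLedger, req: &ReadRequest) -> Result<MaterializedCommand, BrokerError> {
    let prp1 = ledger.resolve(req.buffer, req.offset, req.len)?;
    Ok(MaterializedCommand { prp1, lba: req.lba, len: req.len })
}
```

The design point is that `ReadRequest` has no address field at all, so a
provider cannot learn or forge a host-physical value even when the
device-visible and host-physical address spaces coincide.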

The ordered NVMe work therefore splits into:

- no-IOMMU brokered lane: `nvme-no-iommu-brokered-controller-enable` (landed
  2026-05-27 21:38 UTC, commit `11b86568`) -> `nvme-admin-queue-identify`
  (landed 2026-05-27 22:34 UTC, commit `cede5257`) ->
  `nvme-admin-interrupt-delivery` (landed 2026-05-27 23:07 UTC, commit
  `18fd25c7`) -> `nvme-io-queue-and-read` (ready brokered I/O/read);
- direct-remapping lane: `nvme-doorbell-dma-validator` (landed mechanism) ->
  provider-written enable/admin/I/O slices on a verified IOMMU/vIOMMU gate.

Those are the real storage Y for the NVMe path; the virtio-scsi path is an
alternative userspace provider of comparable size. None of this is "build a
foundation" -- it is "build a storage device-class provider on the existing
foundation."

### AWS / Azure storage -> consume the GCP NVMe foundation + provider delta

`cloud-aws-nvme-storage-driver` and `cloud-azure-disk-storage-driver` already
re-scope themselves to a small provider delta once the shared NVMe foundation
lands. No new driver decomposition is needed; both are blocked until the GCP
NVMe child chain lands. Their AWS/Azure NIC siblings (ENA, MANA) are
vendor-custom and out of GCP-first scope.

## What This Document Changes

1. **Supersedes the `cloud-gcp-virtio-net-nic-driver` runnable-now claim.** The
   QEMU userspace virtio foundation remains useful grounding, but the live GCP
   NIC task stays blocked until the local production userspace-provider
   grant-source, waiter, and userspace virtio-net provider chain lands.
2. **Decomposes the storage gap GCP-first** into a no-IOMMU brokered-bounce
   userspace NVMe lane for GCP and a separate direct-remapping Model B lane for
   IOMMU/vIOMMU proofs.
3. **Re-points AWS/Azure storage** at the GCP NVMe child chain.

## Design Grounding

- `docs/tasks/done/2026-05-25/ddf-virtio-driver-foundation-boundary.md`
- `docs/tasks/done/2026-05-23/ddf-provider-virtio-net-driver-closeout.md`
- `docs/tasks/done/2026-05-11/ddf-userspace-driver-provider-consumer.md`
- `docs/tasks/done/2026-05-26/cloud-gcp-virtio-net-local-qemu-binding.md`
- `docs/tasks/done/2026-05-25/ddf-blockdevice-boundary-virtio-blk-smoke.md`
- `docs/tasks/done/2026-05-24/cloud-dma-backend-selection.md`
- `docs/tasks/done/2026-05-23/ddf-iommu-remapping-production-closeout.md`
- `docs/proposals/nvme-model-b-doorbell-dma-validator.md` (conditional Model B
  validator for direct-remapping/synthetic-address lanes)
- `docs/research/dma-userspace-driver-isolation.md`
- `docs/dma-isolation-design.md` (Cloud DMA Backend; IOVA export discipline)
- `kernel/src/virtio.rs` (`transport::VirtqueueDma`, `transport::Virtqueue`),
  `kernel/src/cap/{dma_pool,dma_buffer,device_mmio,interrupt,block_device}.rs`,
  `kernel/src/device_dma.rs`, `kernel/src/device_interrupt.rs`, `kernel/src/pci.rs`
- `docs/proposals/cloud-deployment-proposal.md`,
  `docs/backlog/hardware-boot-storage.md#cloud-device-tracks`
