Cloud DMA Provider Evidence Inventory
This note is the research substrate for the cloud DMA backend decision. It records official AWS, Azure, and Google Compute Engine device-surface facts, defines the evidence-matrix schema that the backend policy fills, specifies the live guest-probe checklist a later credentialed cloud-run task captures, and fixes the classification rules that separate a DMA-capable surface from guest-programmable remapping authority.
It makes no backend selection and no per-VM-shape safety claim. It does not
launch a cloud VM, require provider credentials, or assert that any instance
shape is safe for direct DMA. Selecting a backend and asserting bounce-buffer
safety or IOMMU coverage for a specific shape require attended sign-off and are
out of scope here; that work is cloud-dma-backend-selection. The model this
note feeds is docs/proposals/dma-assurance-model-proposal.md; the local
QEMU/IOMMU grounding it builds on is docs/research/iommu-remapping.md.
How These Facts Were Collected
Provider facts are from official provider documentation and API/CLI references only, retrieved on the dates recorded below. A “fact” here is a statement the provider document makes directly. Where a property is read from an API field rather than stated in prose, it is marked as an inference from API field. No provider fact in this note comes from running a cloud instance; live guest observations are recorded separately under GCE Live Probe Results. The live-probe checklist exists precisely because a guest cannot prove provider-side isolation from documentation alone.
Provider Official Facts
AWS EC2
Source: ec2:DescribeInstanceTypes API reference (InstanceTypeInfo, NetworkInfo,
EbsInfo), retrieved 2026-05-24. The matching CLI is
aws ec2 describe-instance-types --instance-types <type>.
- Network surface. networkInfo.enaSupport reports Elastic Network Adapter (ENA) support with values unsupported | supported | required. networkInfo.efaSupported (boolean) and networkInfo.efaInfo report Elastic Fabric Adapter presence. networkInfo.enaSrdSupported (boolean) reports ENA Express (Scalable Reliable Datagram). networkInfo.encryptionInTransitSupported (boolean) reports automatic in-transit encryption between instances.
- EBS/NVMe surface. ebsInfo.nvmeSupport reports NVMe support for EBS with values unsupported | supported | required. ebsInfo.ebsOptimizedSupport reports EBS-optimized behavior (unsupported | supported | default).
- Instance store. instanceStorageSupported (boolean) and instanceStorageInfo report local instance-store NVMe disks.
- Accelerators. gpuInfo, fpgaInfo, inferenceAcceleratorInfo, neuronInfo, and mediaAcceleratorInfo describe GPU/FPGA/inference/Neuron/media accelerator surfaces when present.
- Hypervisor. hypervisor reports nitro | xen. Modern Nitro instances report nitro; the Nitro system is where ENA and NVMe EBS exposure originate.
Inference from API field: an instance type with enaSupport=required and
ebsInfo.nvmeSupport=required exposes a DMA-capable NIC and NVMe block surface.
This identifies a DMA-capable surface; it is not evidence of guest-programmable
remapping authority.
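The inference above can be sketched as a small filter over one DescribeInstanceTypes record. This is a minimal illustration, not capOS code: it assumes the PascalCase keys that the AWS CLI prints in JSON output (NetworkInfo.EnaSupport, EbsInfo.NvmeSupport, and so on) correspond to the camelCase names used in this note, and the sample record and the summarize_instance_type helper are hypothetical.

```python
from typing import Any, Dict


def summarize_instance_type(record: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce one DescribeInstanceTypes record to documented DMA surfaces.

    Only `required` is treated as "exposes the surface", mirroring the
    inference in this note; `supported` would need a per-launch check.
    """
    net = record.get("NetworkInfo", {})
    ebs = record.get("EbsInfo", {})
    return {
        "dma_capable_nic": net.get("EnaSupport") == "required",
        "dma_capable_nvme_ebs": ebs.get("NvmeSupport") == "required",
        "instance_store": bool(record.get("InstanceStorageSupported")),
        "hypervisor": record.get("Hypervisor"),
        # API fields never show guest-programmable remapping authority;
        # only a live guest probe can, so this is always False here.
        "remapping_evidence": False,
    }


# Hypothetical record shaped like one element of the CLI's "InstanceTypes" list.
SAMPLE_RECORD = {
    "InstanceType": "example.large",
    "Hypervisor": "nitro",
    "NetworkInfo": {"EnaSupport": "required", "EfaSupported": False},
    "EbsInfo": {"NvmeSupport": "required"},
    "InstanceStorageSupported": False,
}

if __name__ == "__main__":
    print(summarize_instance_type(SAMPLE_RECORD))
```

Keeping remapping_evidence hard-wired to False makes the surface/authority distinction explicit: nothing in the API response can flip it.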
Azure Virtual Machines
Source: Azure Accelerated Networking overview (page ms.date 2026-02-05, last
updated 2026-05-05) and az vm list-skus, retrieved 2026-05-24.
- Network surface. Accelerated Networking enables single-root I/O virtualization (SR-IOV) on supported VM sizes, providing a host-bypass data path. The underlying SR-IOV hardware is one of NVIDIA/Mellanox ConnectX-3, ConnectX-4 Lx, ConnectX-5, or the Microsoft Azure Network Adapter (MANA).
- Capability query. A VM size’s Accelerated Networking capability is read from az vm list-skus as the AcceleratedNetworkingEnabled capability value. Most general-purpose and compute-optimized sizes with two or more vCPUs support it (four or more on hyperthreaded sizes); NC and NV sizes appear in output but do not support it.
- VF dynamic binding and revocation. The document states the SR-IOV virtual function (VF) is dynamically revoked and restored across host maintenance and live migration. Guest images must bind to the synthetic hv_netvsc device, not the VF, to keep connectivity, and must mark mana | mlx4_core | mlx5_core SR-IOV devices unmanaged so the synthetic/VF bond is transparent.
- Driver delivery. Azure does not update the Mellanox or MANA in-guest drivers; the guest kernel/distribution provides them.
Inference from API field: AcceleratedNetworkingEnabled=True identifies a
DMA-capable SR-IOV NIC surface whose VF can appear and disappear at runtime. The
documented VF revoke/restore behavior is a driver-lifecycle constraint, not
remapping evidence.
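The capability query can be sketched as a filter over az vm list-skus JSON output. This is a hedged illustration: it assumes each SKU entry carries a capabilities list of name/value pairs with string values such as "True", and the SKU names in the sample are hypothetical placeholders, not real VM sizes.

```python
import json
from typing import List


def accelerated_networking_sizes(skus_json: str) -> List[str]:
    """Return VM size names whose AcceleratedNetworkingEnabled value is true."""
    names = []
    for sku in json.loads(skus_json):
        if sku.get("resourceType") != "virtualMachines":
            continue  # list-skus also returns disks and other resource types
        caps = {c["name"]: c["value"] for c in sku.get("capabilities") or []}
        if str(caps.get("AcceleratedNetworkingEnabled", "")).lower() == "true":
            names.append(sku["name"])
    return names


# Hypothetical excerpt shaped like `az vm list-skus --output json`.
SAMPLE_SKUS = json.dumps([
    {"name": "Example_Size_A", "resourceType": "virtualMachines",
     "capabilities": [{"name": "AcceleratedNetworkingEnabled", "value": "True"}]},
    {"name": "Example_Size_B", "resourceType": "virtualMachines",
     "capabilities": [{"name": "AcceleratedNetworkingEnabled", "value": "False"}]},
    {"name": "Example_Disk", "resourceType": "disks", "capabilities": []},
])

if __name__ == "__main__":
    print(accelerated_networking_sizes(SAMPLE_SKUS))
```

A size returned here is only a candidate SR-IOV surface; because the VF can be revoked at runtime, the result says nothing about what the guest sees at any given moment.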
Google Compute Engine
Source: Use Google Virtual NIC (gVNIC) and About Local SSD disks, retrieved 2026-05-24.
- Network surface. Third-generation and later machine series (excluding bare metal) support only gVNIC for the virtual network interface (no virtio-net). First- and second-generation machines must use gVNIC when on Arm CPU platforms, when configured as Confidential VM, or when requiring network speeds between 50 and 100 Gbps, and otherwise still support VirtIO-Net. Custom images declare gVNIC support through the GVNIC guest OS feature (--guest-os-features=GVNIC, or guestOsFeatures:[{type:"GVNIC"}]).
- Local SSD surface. Local SSD is attached over either the NVMe or SCSI interface; the NVMe interface is required for peak performance, and some machine series support only one of the two interfaces. The interface is chosen by the disk interface field (NVME or SCSI).
- Storage transport. Persistent Disk attaches as virtio-scsi on machine families that expose it, while newer families expose NVMe; the exact transport is a per-machine-family property to be captured per shape rather than assumed.
Inference from API field: a third-generation-or-later GCE machine type exposes a gVNIC NIC surface and may expose NVMe Local SSD/Persistent Disk. This identifies DMA-capable NIC/storage surfaces; it is not remapping evidence.
Evidence-Matrix Schema
The backend policy fills one row per observed (provider, shape, image) tuple. Provider-fact columns come from documentation/API; observation columns come from the live-probe checklist; the last two columns are derived classifications, not provider claims.
| Column | Meaning |
|---|---|
| Provider | aws / azure / gcp. |
| Region/zone | The region or zone the observation was taken in. |
| Instance type | Provider instance type / VM size / machine type. |
| Image/kernel | Boot image identifier and guest kernel version. |
| Source command or URL | The exact API/CLI command or official doc URL. |
| Retrieval date | Date the source was read or the probe was captured. |
| Visible PCI/storage/network devices | Devices the guest enumerates (lspci, block/net inventory). |
| Visible IOMMU tables/groups | ACPI DMAR/IVRS/IORT presence and /sys/kernel/iommu_groups. |
| Provider-side isolation notes | Documented host-side isolation (support-policy assumption, not proof). |
| Guest-programmable remapping observations | Whether the guest can discover, program, and validate a remapping authority. |
| Runtime backend inferred by capOS | The backend capOS would select from observations (see classification rules). |
| Support-policy status | Coarse advertised-target roll-up: Direct-remapping / Labeled-bounce-buffer / Unsupported, pending attended sign-off. |
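
One way to hold a matrix row in code is a plain record type with one field per column. The sketch below is illustrative only: the EvidenceRow name, the field names, and the choice to default the support-policy status to Unsupported until observations are filled in are assumptions made for this example, not part of the backend policy.

```python
from dataclasses import dataclass, asdict
from typing import Optional

PROVIDERS = ("aws", "azure", "gcp")


@dataclass
class EvidenceRow:
    # Provider-fact and provenance columns.
    provider: str
    region_zone: str
    instance_type: str
    image_kernel: str
    source: str            # exact API/CLI command or official doc URL
    retrieval_date: str    # ISO date the source was read or the probe captured
    # Observation columns; None means "not yet probed".
    visible_devices: Optional[str] = None
    iommu_tables_groups: Optional[str] = None
    provider_isolation_notes: Optional[str] = None
    remapping_observations: Optional[str] = None
    # Derived classifications; fail closed until evidence exists (assumption).
    inferred_backend: Optional[str] = None
    support_policy_status: str = "Unsupported"

    def __post_init__(self) -> None:
        if self.provider not in PROVIDERS:
            raise ValueError(f"unknown provider: {self.provider!r}")


if __name__ == "__main__":
    row = EvidenceRow(
        provider="gcp",
        region_zone="<zone>",
        instance_type="c3-standard-4",
        image_kernel="<image> / <kernel>",
        source="<command or URL>",
        retrieval_date="2026-05-24",
    )
    print(asdict(row))
```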
Seed Rows (docs/API-derived, no safety claim)
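These rows are seeded from documentation and API fields. The AWS and Azure observation and backend columns are intentionally left as placeholders because no instance on those providers was probed; a later credentialed cloud-run task fills them. The GCE rows carry the 2026-05-24 live-probe observations and the backend inferred from them (see GCE Live Probe Results). No row asserts that any shape is safe for direct DMA.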
| Provider | Example shape | Documented NIC surface | Documented storage surface | Remapping observation | Backend |
|---|---|---|---|---|---|
| aws | Nitro instance, enaSupport=required, nvmeSupport=required | ENA (SR-IOV) | NVMe EBS + optional instance-store NVMe | not yet probed | not yet selected |
| azure | Size with AcceleratedNetworkingEnabled=True | SR-IOV VF (MANA/ConnectX) bonded to synthetic hv_netvsc | Managed disk (transport per shape) | not yet probed | not yet selected |
| gcp | 3rd-gen+ machine type (e.g. C3) | gVNIC only | NVMe Local SSD / PD per family | probed 2026-05-24: IOMMU disabled, SWIOTLB (see GCE Live Probe Results) | labeled bounce-buffer |
| gcp | 1st/2nd-gen, x86, non-Confidential, under 50 Gbps | VirtIO-Net or gVNIC | virtio-scsi PD / Local SSD (NVMe or SCSI) | probed 2026-05-24: IOMMU disabled, SWIOTLB (see GCE Live Probe Results) | labeled bounce-buffer |
GCE Live Probe Results (2026-05-24)
These rows replace the GCE “not yet probed” placeholders with live guest
observations. Four representative shapes were booted on Google Compute Engine
(stock Debian 12, kernel 6.1.0-47-cloud-amd64) in a dedicated test project,
each running a /sys- and /proc-only probe delivered through instance
metadata and read back over the serial console. Every instance booted with no
external IP, no service account, and was deleted immediately after its probe
output was captured.
| Machine type | Class | NIC driver | Storage | Guest IOMMU / DMAR | DMA path |
|---|---|---|---|---|---|
| n1-standard-1 | 1st-gen | virtio_net | virtio-scsi (sda) | intel_iommu=off, DMAR: IOMMU disabled, no DMAR table, empty iommu_groups | SWIOTLB software bounce buffering |
| e2-small | 2nd-gen | virtio_net | virtio-scsi (sda) | same: IOMMU disabled, no DMAR, no groups | SWIOTLB |
| c3-standard-4 | 3rd-gen Intel | gvnic | nvme Local SSD (Google vendor 0x1ae0) | same | SWIOTLB |
| n2d-standard-2 Confidential | AMD SEV | gvnic | nvme | same; additionally Memory Encryption Features active: AMD SEV | SWIOTLB forced (512 MB) |
Verbatim kernel evidence common to all four shapes:
- the boot command line carries intel_iommu=off;
- DMAR: IOMMU disabled;
- PCI-DMA: Using software bounce buffering for IO (SWIOTLB);
- /sys/kernel/iommu_groups is empty, and no DMAR, IVRS, or IORT table is present under /sys/firmware/acpi/tables/.
The Confidential (SEV) shape additionally logs software IO TLB: Memory encryption is active and system is using DMA bounce buffers, confirming that
bounce buffering is enforced by memory encryption, not merely by configuration.
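
The kernel evidence quoted above lends itself to a simple string scan over captured log text. The sketch below is a hypothetical helper, not the probe that was actually run; it matches only the message fragments quoted in this section and assumes they appear verbatim (case aside) in the captured output.

```python
from typing import Dict


def summarize_kernel_log(log_text: str) -> Dict[str, bool]:
    """Flag the IOMMU/SWIOTLB evidence lines quoted in this note."""
    lower = log_text.lower()
    return {
        "cmdline_iommu_off": "intel_iommu=off" in lower,
        "dmar_disabled": "dmar: iommu disabled" in lower,
        "swiotlb_bounce": "using software bounce buffering for io (swiotlb)" in lower,
        # Present only on the Confidential (SEV) shape in the probe results.
        "encryption_forced_bounce": (
            "memory encryption is active and system is using dma bounce buffers"
            in lower
        ),
    }


# Sample assembled from the fragments quoted in this section; a real capture
# would carry timestamps and surrounding lines.
SAMPLE_LOG = "\n".join([
    "intel_iommu=off",
    "DMAR: IOMMU disabled",
    "PCI-DMA: Using software bounce buffering for IO (SWIOTLB)",
])

if __name__ == "__main__":
    print(summarize_kernel_log(SAMPLE_LOG))
```

The summary is a convenience for filling matrix rows; the verbatim log remains the evidence of record.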
Classification. No probed GCE shape – neither the older virtio surface nor
the modern gVNIC/NVMe surface – exposes a guest-programmable IOMMU that capOS
could discover, program, and validate. By the
classification rules this rules out the direct-remapping
backend and selects the labeled bounce-buffer fallback for the cloud path on
these shapes. On the Confidential VM the bounce-buffer path is a hardware
invariant: the device cannot reach encrypted guest memory directly. This is a
fail-closed observation, not a hostile-hardware isolation claim; the binding
backend selection and any “supported shape” advertisement remain attended
sign-off work in cloud-dma-backend-selection.
Design implication for GCP storage/NIC drivers. A provider-side or
hypervisor-side IOMMU may still protect Google infrastructure, but that is not
guest-programmable remapping authority for capOS. On the probed GCE shapes a
capOS userspace storage or NIC provider must therefore be planned as a
no-IOMMU, brokered-bounce design: userspace receives buffer capabilities,
grant IDs, or typed commands, while the kernel or device manager materializes
the device-visible queue-base, descriptor, PRP/SGL, or virtqueue address fields.
The direct-remapping lane remains valid for QEMU run-iommu-remapping and for
future cloud/hardware shapes that expose a guest-programmable remapping unit;
it is not a GCP premise today. The generic design consequences are recorded in
DMA User-Space Driver Isolation.
Runtime Probe Protocol
A later credentialed cloud-run task captures the following from the guest, with the region/zone, image, kernel, and retrieval date recorded for each command. Capture the verbatim command output as evidence; do not summarize it.
- lspci -nnk -D – PCI topology with full domain:bus:device.function, vendor/device IDs, and bound kernel driver per function (NIC, storage controller, accelerator identity).
- ls /sys/kernel/iommu_groups (and per-group devices/) – whether the guest sees IOMMU groups at all, and how devices are grouped.
- ACPI table presence: DMAR (Intel VT-d), IVRS (AMD-Vi), IORT (Arm SMMU) under /sys/firmware/acpi/tables/. Absence is itself evidence.
- Kernel log IOMMU/SWIOTLB lines (dmesg | grep -iE 'iommu|dmar|ivrs|iort|swiotlb') – whether the kernel enabled an IOMMU, fell back to software bounce (SWIOTLB), or found no remapping unit.
- Network driver identity: ethtool -i <iface> and the bound driver (ena, mana/mlx5_core, gve, virtio_net).
- Block transport identity: lsblk -o NAME,TRAN,MODEL and controller driver (nvme, virtio_blk, virtio_scsi).
- NVMe inventory: nvme list and nvme id-ctrl <dev> for controller identity where NVMe is present.
A probe result is only usable evidence if capOS could perform the equivalent discovery from its own ACPI/PCI enumeration; the Linux commands above stand in for that discovery during the research phase.
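
The checklist could be driven by a small capture script that stores each command's verbatim output. This is a sketch under assumptions: the PROBE_COMMANDS list, the run_command and capture_all helpers, and the 30-second timeout are illustrative choices; the per-interface (ethtool -i <iface>) and per-device (nvme id-ctrl <dev>) commands are left out because their arguments depend on what the earlier commands enumerate.

```python
import subprocess
from typing import Any, Callable, Dict, List

# Commands from the checklist that take no shape-specific argument.
PROBE_COMMANDS: List[List[str]] = [
    ["lspci", "-nnk", "-D"],
    ["ls", "/sys/kernel/iommu_groups"],
    ["ls", "/sys/firmware/acpi/tables/"],
    ["sh", "-c", "dmesg | grep -iE 'iommu|dmar|ivrs|iort|swiotlb'"],
    ["lsblk", "-o", "NAME,TRAN,MODEL"],
    ["nvme", "list"],
]


def run_command(cmd: List[str]) -> Dict[str, Any]:
    """Run one probe command and keep its output verbatim."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # A missing tool or a hang is recorded, not silently dropped.
        return {"cmd": cmd, "returncode": None, "stdout": "", "stderr": repr(exc)}
    return {"cmd": cmd, "returncode": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}


def capture_all(
    runner: Callable[[List[str]], Dict[str, Any]] = run_command,
) -> List[Dict[str, Any]]:
    """Run every probe command in order; `runner` is injectable for dry runs."""
    return [runner(cmd) for cmd in PROBE_COMMANDS]


if __name__ == "__main__":
    for result in capture_all():
        print(result["cmd"], result["returncode"])
```

Recording failures alongside successes matters here because an absent table or an empty directory is itself evidence.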
Classification Rules
These rules are deliberately fail-closed and feed the
runtime backend inferred by capOS and support-policy status columns.
- SR-IOV, a virtual NIC (ENA, gVNIC, MANA, virtio-net), a GPU, an accelerator, or local NVMe identifies a DMA-capable or DMA-adjacent surface. This is the presence of a device that does or could bus-master; it is not a safety claim.
- A direct-remapping classification requires guest-programmable remapping authority that capOS can discover, program, and validate – a usable Intel VT-d, AMD-Vi, or Arm SMMU unit the guest controls, with translation, fault, and invalidation behavior matching docs/research/iommu-remapping.md. A DMA-capable surface alone never implies this.
- Provider-side isolation facts (host-enforced VPC isolation, Nitro/host data-path bypass, hypervisor-side IOMMU) are support-policy assumptions, not proof that capOS can safely use direct DMA from inside the guest.
- Ambiguous, contradictory, or unvalidated observations select Unsupported. This matches the assurance model: unknown or contradictory observations select Unsupported, not an optimistic default.
These map onto the three backend candidates in the assurance model
(docs/proposals/dma-assurance-model-proposal.md): a direct remapping domain, a
labeled bounce-buffer fallback (direct_dma=blocked, all device-visible memory
manager-owned, no host physical address exposed, hostile-hardware isolation not
claimed), or Unsupported.
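
The rules above can be read as a small fail-closed decision function. The sketch below is one possible encoding, not the assurance model's definition: the Observation fields and the classify function are hypothetical, and the threshold chosen for the bounce-buffer fallback (remapping definitively absent and a bounce path positively observed) is an assumption for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Backend(Enum):
    DIRECT_REMAPPING = "direct-remapping"
    LABELED_BOUNCE_BUFFER = "labeled-bounce-buffer"
    UNSUPPORTED = "unsupported"


@dataclass
class Observation:
    # True/False are validated observations; None means unknown.
    remapping_discovered: Optional[bool]
    remapping_programmed: Optional[bool]
    remapping_validated: Optional[bool]
    bounce_path_confirmed: Optional[bool]
    contradictory: bool = False
    # Recorded for the matrix, deliberately ignored by classify():
    dma_capable_surface: bool = False
    provider_side_isolation_documented: bool = False


def classify(obs: Observation) -> Backend:
    """Fail-closed mapping from guest observations to a backend candidate."""
    if obs.contradictory:
        return Backend.UNSUPPORTED
    # Direct remapping needs all three: discover, program, validate.
    if (obs.remapping_discovered is True
            and obs.remapping_programmed is True
            and obs.remapping_validated is True):
        return Backend.DIRECT_REMAPPING
    # Fallback only when remapping is known absent and bounce is observed.
    if obs.remapping_discovered is False and obs.bounce_path_confirmed is True:
        return Backend.LABELED_BOUNCE_BUFFER
    # Anything unknown or partially validated is not given an optimistic default.
    return Backend.UNSUPPORTED


if __name__ == "__main__":
    gce_like = Observation(False, False, False, True, dma_capable_surface=True)
    print(classify(gce_like))
```

Neither dma_capable_surface nor provider_side_isolation_documented influences the result, which mirrors the rule that a DMA-capable surface or a provider-side fact never implies direct remapping.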
Relationship to Backend Selection
cloud-dma-backend-selection consumes this inventory: it maps each backend
candidate to the assurance-model invariants, fills the evidence matrix per cloud
VM shape, and drafts the downstream-contract scaffolding (which device-manager
policy fields a driver declares – direct_dma, trusted_domain,
bounce_buffer – and which
stale-handle/stale-completion/teardown/no-host-physical-exposure gates each
candidate must satisfy). That task already
declares this inventory as a dependency. The binding backend selection and any
per-shape safety assertion remain attended-sign-off work and are not made here.
Relevant Research and Grounding
- docs/research/iommu-remapping.md – primary-source Intel VT-d/AMD-Vi/QEMU remapping grounding the direct-DMA classification depends on.
- docs/proposals/dma-assurance-model-proposal.md – the model objects, invariants, and backend-candidate matrix this evidence feeds.
- docs/dma-isolation-design.md – the manager-owned DMA isolation contract and bounce-buffer fallback the labeled-fallback candidate must satisfy.
- docs/proposals/cloud-deployment-proposal.md – the cloud deployment context for the usable-instance milestone.
- docs/tasks/cloud-dma-backend-selection.md – the backend decision that consumes this inventory.