virtio-rng (modern PCI entropy device)
This is a provenance map for the in-tree virtio-rng path: it cites the spec, summarizes only the wire-format subset the code actually implements, and points into the implementation. It is not a re-spec – where the spec is implemented unchanged it links rather than transcribes.
Unlike virtio-net and virtio-blk, the
virtio-rng device does not back a userspace-facing capability. It is a
QEMU-only proof fixture, not a production driver, and not forward DDF
production evidence: the entropy the device produces is consumed only by
in-kernel proofs, never handed to a process. The capOS EntropySource
capability is a separate, RDRAND-backed path
(kernel/src/cap/entropy_source.rs, fill_random / rdrand64 / has_rdrand;
per-call bound MAX_ENTROPY_FILL_BYTES) and does not touch this device. This
classification is asserted, not just documented: on every cfg(qemu) boot
diagnose_qemu_virtio_rng emits a deterministic marker
(virtio-rng: classification=qemu-only-proof-fixture userspace_capability=none production_driver=no ...) that make run-iommu-remapping
(tools/qemu-iommu-remapping-smoke.sh) requires, so a regression that promoted
this path into a production-driver claim would fail the smoke. virtio-rng exists
in the tree for two reasons:
- A DDF metadata-diagnostics path that exercises modern-transport
  discovery, MSI-X metadata selection, and the device-manager
  ownership/teardown/grant-source hooks against a real PCI function on every
  cfg(qemu) boot (kernel/src/virtio.rs diagnose_virtio_rng_metadata, driven
  from kernel/src/pci.rs diagnose_qemu_virtio_rng).
- An IOMMU VT-d second-level remapping hardware-DMA proof vehicle (the
  Slice A2/B/C proofs in kernel/src/iommu.rs, driven through
  kernel/src/virtio.rs prove_iommu_rng_mapped_dma /
  prove_iommu_rng_unmapped_dma / prove_iommu_rng_stale_dma). This is the
  minimal real virtqueue driver QEMU’s entropy device lets us stand up to
  prove a device DMA actually walks the programmed translation tables.
The rng path reuses the modern split-ring transport seam introduced for
virtio-net (see the virtio-net page); this page covers only the rng-specific
usage.
1. Spec basis
- Device: virtio entropy device, modern (virtio 1.x) PCI transport. PCI vendor
  0x1af4; device 0x1044 (modern) / 0x1005 (transitional). IDs at
  kernel/src/pci.rs (VIRTIO_VENDOR_ID, VIRTIO_RNG_MODERN_DEVICE_ID,
  VIRTIO_RNG_TRANSITIONAL_DEVICE_ID; matched by PciDevice::is_virtio_rng).
  QEMU exposes it as virtio-rng-pci-non-transitional (see §3).
- Authoritative spec: Virtual I/O Device (VIRTIO) Version 1.2, OASIS Committee
  Specification 01 (2022-07-01). Source:
  https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html. Relevant
  sections: 4.1 (virtio over PCI bus), 2.7 (split virtqueues), 5.4 (entropy
  device).
- Reference: cross-checked against the Linux virtio_rng driver for the
  single-request-queue model and the virtio_pci_modern modern-transport
  handshake.
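As an illustration of the ID match described above, here is a minimal sketch. The constant names and PciDevice::is_virtio_rng come from this page; the two-field PciDevice struct is a hypothetical stand-in, and the real record and matcher in kernel/src/pci.rs may be shaped differently.

```rust
/// PCI vendor id shared by all virtio devices.
const VIRTIO_VENDOR_ID: u16 = 0x1af4;
/// Modern (virtio 1.x) entropy device id: 0x1040 + virtio device type 4.
const VIRTIO_RNG_MODERN_DEVICE_ID: u16 = 0x1044;
/// Transitional entropy device id.
const VIRTIO_RNG_TRANSITIONAL_DEVICE_ID: u16 = 0x1005;

/// Hypothetical stand-in for the kernel's PCI function record
/// (assumption: only the two id fields matter for this match).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PciDevice {
    pub vendor_id: u16,
    pub device_id: u16,
}

impl PciDevice {
    /// True when the function is a virtio entropy device in either id form.
    pub fn is_virtio_rng(&self) -> bool {
        self.vendor_id == VIRTIO_VENDOR_ID
            && matches!(
                self.device_id,
                VIRTIO_RNG_MODERN_DEVICE_ID | VIRTIO_RNG_TRANSITIONAL_DEVICE_ID
            )
    }
}
```

Matching both ids only identifies the function; whether a transitional id is actually drivable through the modern path depends on the capabilities it exposes, which is decided later during transport discovery.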
2. Wire format (implemented subset)
The modern PCI capability parsing, common-config register map, split-ring
descriptor layout, and feature-negotiation handshake are the shared transport
seam documented in virtio-net §2
(kernel/src/virtio.rs transport module, ModernTransport, the COMMON_*
register offsets, VIRTQ_DESC_F_WRITE). The rng path discovers that transport
through discover_virtio_rng_metadata_transport and maps regions with
map_region. Only the rng-specific subset is summarized here.
- Device shape: the transport discovery reports whether the function is a
  modern device id or a transitional id that still exposes modern capabilities
  (DeviceShape::Modern / DeviceShape::TransitionalWithModernCaps); both are
  driven through the modern path. The entropy device has no device-specific
  config space and no device-specific feature bits.
- Single request queue: the entropy device exposes one virtqueue, the requestq
  (VIRTIO_RNG_REQUEST_QUEUE = queue 0). The IOMMU proof drives it at a
  deliberately small VIRTIO_RNG_PROOF_QUEUE_SIZE (2) – a single in-flight
  descriptor is enough to prove a DMA through translation, and a power of two
  keeps the ring layout legal. The per-queue notify address is computed from
  notify_off_multiplier like any modern virtio queue.
- Request framing: each request is a single device-writable descriptor
  (VIRTQ_DESC_F_WRITE) pointing at a buffer the device fills with entropy. The
  proof requests VIRTIO_RNG_PROOF_REQUEST_LEN (64) bytes
  (rng_publish_descriptor_and_notify writes the 16-byte descriptor and bumps
  the available ring; completion is read from the used ring’s
  { id: u32, len: u32 } entry). There is no request header or status byte –
  the entropy device just writes bytes into the supplied buffer.
- Feature negotiation: virtio-rng offers no device-specific features. The
  metadata path negotiates nothing; the IOMMU hardware-DMA proof requires both
  VIRTIO_F_VERSION_1 (modern transport) and VIRTIO_F_ACCESS_PLATFORM – the
  latter is what makes QEMU route the device’s DMA through the platform IOMMU
  and consume the IOVAs the driver programs into the ring registers, rather
  than treating them as host-physical addresses. A device that does not offer
  both fails the proof closed (rng-missing-access-platform-feature).
- Completion: the proof polls the used ring (hhdm_read_u16 of used.idx,
  bounded by VIRTIO_RNG_USED_POLL_LIMIT) rather than waiting on the request
  interrupt; the MSI-X path is exercised at the metadata level only (see §3).
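The request framing, required-feature check, and bounded completion poll above can be sketched as follows. This is an illustration under stated assumptions, not the in-tree code: VIRTQ_DESC_F_WRITE, VIRTIO_RNG_PROOF_REQUEST_LEN, and the failure reason string are taken from this page, the descriptor layout and feature bit numbers follow the virtio spec, and VirtqDesc, rng_request_descriptor, required_proof_features, and poll_used_idx are hypothetical names. The poll takes a closure in place of the kernel's hhdm_read_u16 so the sketch stays self-contained.

```rust
/// Split-ring descriptor flag: the device writes into this buffer.
const VIRTQ_DESC_F_WRITE: u16 = 2;
/// Feature bit 32: the device speaks the modern (virtio 1.x) interface.
const VIRTIO_F_VERSION_1: u64 = 1 << 32;
/// Feature bit 33: device DMA is translated by the platform IOMMU.
const VIRTIO_F_ACCESS_PLATFORM: u64 = 1 << 33;
/// Bytes of entropy the proof requests per descriptor.
const VIRTIO_RNG_PROOF_REQUEST_LEN: u32 = 64;

/// 16-byte split-ring descriptor, in the field order the spec defines.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VirtqDesc {
    pub addr: u64,  // buffer address as the device sees it (an IOVA in the proof)
    pub len: u32,   // buffer length in bytes
    pub flags: u16, // VIRTQ_DESC_F_* bits
    pub next: u16,  // unused here: the request is a single descriptor
}

/// Builds the single device-writable descriptor for one entropy request.
pub fn rng_request_descriptor(buffer_iova: u64) -> VirtqDesc {
    VirtqDesc {
        addr: buffer_iova,
        len: VIRTIO_RNG_PROOF_REQUEST_LEN,
        flags: VIRTQ_DESC_F_WRITE,
        next: 0,
    }
}

/// Returns the feature set to accept, or a failure reason when the device
/// does not offer both required bits.
pub fn required_proof_features(offered: u64) -> Result<u64, &'static str> {
    let required = VIRTIO_F_VERSION_1 | VIRTIO_F_ACCESS_PLATFORM;
    if offered & required == required {
        Ok(required)
    } else {
        Err("rng-missing-access-platform-feature")
    }
}

/// Polls `read_used_idx` until it differs from `last_seen` or `poll_limit`
/// reads have been made; `None` means the bound was hit without a completion.
pub fn poll_used_idx(
    mut read_used_idx: impl FnMut() -> u16,
    last_seen: u16,
    poll_limit: u32,
) -> Option<u16> {
    for _ in 0..poll_limit {
        let idx = read_used_idx();
        if idx != last_seen {
            return Some(idx);
        }
        std::hint::spin_loop();
    }
    None
}
```

Comparing used.idx for inequality rather than "greater than" keeps the poll correct across the 16-bit wraparound the ring index is allowed to make.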
3. capOS mapping
- Binding (transitional, in-kernel, no userspace cap): virtio-rng is driven
  entirely in the kernel and is not exposed to userspace at all – there is no
  RandomNumberGenerator/EntropySource-style cap routed to this device. The
  metadata-diagnostics path runs on every cfg(qemu) boot from
  kernel/src/pci.rs diagnose_qemu_virtio_rng; the hardware-DMA proofs run
  under the run-iommu-remapping target only.
- Device-manager authority (metadata path): diagnose_virtio_rng_metadata binds
  authority through the kernel device_manager against DeviceOwner::VirtioRng –
  it proves QEMU ownership (prove_qemu_ownership), teardown triggers, and the
  DeviceMmio / DMAPool / DMABuffer cap release / driver-crash / reset-disable
  hooks, then logs the devicemmio / dmapool / interrupt grant-source status
  (devicemmio_grant_source::log_status and the dmapool / interrupt
  equivalents). This is the same DDF ledger the cloud-NIC and block drivers
  bind through; virtio-rng is the function the bring-up hooks are proved
  against.
- MMIO: the modern-transport common/notify/ISR/device-config regions are
  mapped from the device BARs (map_region over pci::map_bar_region) into the
  device-uncacheable (NO_CACHE) window; the metadata path additionally logs
  each decoded region (log_device_region). Doorbell (queue-notify) writes are
  scoped to the per-queue notify address computed from notify_off_multiplier.
- Interrupt: MSI-X is handled at the metadata level – the request queue uses
  VIRTIO_RNG_MSIX_METADATA_ENTRY (0) and requires
  VIRTIO_RNG_MSIX_REQUIRED_ENTRIES (1) usable table entries; the plan is
  selected by select_virtio_rng_msix_plan and the route programming is proved
  by prove_virtio_rng_msix_metadata_route. The hardware-DMA proof completes by
  polling the used ring, so it does not arm a completion-IRQ waiter.
- DMA: the IOMMU proof’s descriptor table, available ring, used ring, and
  request buffer are placed at programmed IOVAs carried in the
  iommu::IommuRngDmaVehicle, never at host-physical addresses; once GCMD.TE is
  set every DMA the device issues must walk the second-level table the IOMMU
  module installed. The ring pages are zeroed through the HHDM before their
  IOVAs are handed to the device so a stale reading can never be mistaken for
  a completion. No host physical address or IOVA leaves the kernel boundary.
- Fail-closed / validation rules: the proof fails closed at every step –
  transport discovery, bus-master enable, MMIO map, reset handshake, the
  required-feature check, notify-offset/map-length overflow, queue-size floor,
  and queue-enable rejection each return a distinct failed(...) reason rather
  than proceeding. A page whose invalidation never completes is not freed (a
  page freed before invalidation completes would be a stale-DMA hole). The
  unmapped-IOVA and stale-IOVA re-drives must fault in the IOMMU
  (FSTS.PPF / FRCD[0].F) instead of reaching memory.
- QEMU-emulable vs hardware-only: fully QEMU-emulable. QEMU provides
  virtio-rng-pci-non-transitional (the shared QEMU_SECOND_DEVICE default);
  make run-iommu-remapping overrides it with iommu_platform=on behind an
  intel-iommu device and is the end-to-end proof of the mapped-IOVA hardware
  DMA, the unmapped-IOVA fault, and the Slice C two-phase revocation /
  stale-DMA fault. The DDF metadata diagnostics emit on every cfg(qemu) boot.
  No hardware-only path.
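To make the fail-closed rules above concrete, here is a minimal sketch of two of the checks: the notify-address computation with overflow and bounds validation, and the queue-size floor. Only VIRTIO_RNG_PROOF_QUEUE_SIZE and notify_off_multiplier are names from this page. ProofFailure, notify_address, and choose_queue_size are hypothetical, and reading "queue-size floor" as "the device's maximum queue size must be at least the proof size" is an assumption, not something this page states.

```rust
/// Ring size the proof drives; a power of two, as the split ring requires.
const VIRTIO_RNG_PROOF_QUEUE_SIZE: u16 = 2;

/// Hypothetical distinct failure reasons, one per validation step.
#[derive(Debug, PartialEq, Eq)]
pub enum ProofFailure {
    NotifyOffsetOverflow,
    NotifyOutsideMappedRegion,
    QueueSizeBelowFloor,
}

/// Computes the doorbell address for a queue inside the mapped notify region.
/// Per the modern PCI transport, the doorbell sits at
/// `queue_notify_off * notify_off_multiplier` bytes into the region.
pub fn notify_address(
    region_base: u64,
    region_len: u64,
    queue_notify_off: u16,
    notify_off_multiplier: u32,
) -> Result<u64, ProofFailure> {
    // u16 * u32 always fits in u64, so the product itself cannot overflow.
    let offset = u64::from(queue_notify_off) * u64::from(notify_off_multiplier);
    // The doorbell is a 16-bit write; both bytes must lie inside the mapping.
    let end = offset + 2;
    if end > region_len {
        return Err(ProofFailure::NotifyOutsideMappedRegion);
    }
    // Reject a base that would wrap the address space instead of proceeding.
    region_base
        .checked_add(end)
        .ok_or(ProofFailure::NotifyOffsetOverflow)?;
    Ok(region_base + offset)
}

/// Picks the proof queue size, failing closed when the device's maximum
/// is smaller than the size the proof needs (assumed meaning of the floor).
pub fn choose_queue_size(device_max: u16) -> Result<u16, ProofFailure> {
    if device_max < VIRTIO_RNG_PROOF_QUEUE_SIZE {
        Err(ProofFailure::QueueSizeBelowFloor)
    } else {
        Ok(VIRTIO_RNG_PROOF_QUEUE_SIZE)
    }
}
```

Returning a separate variant per step, rather than a single generic error, is what lets a smoke test tell which validation rejected the device.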
Related
- kernel/src/virtio.rs – the rng metadata diagnostics
  (diagnose_virtio_rng_metadata), the IOMMU hardware-DMA proof driver
  (prove_iommu_rng_mapped_dma / prove_iommu_rng_unmapped_dma /
  prove_iommu_rng_stale_dma), and the shared modern split-ring transport.
- kernel/src/iommu.rs – the VT-d Slice A2/B/C remapping, fault, and revocation
  proofs that drive this device.
- kernel/src/cap/entropy_source.rs – the separate RDRAND-backed EntropySource
  capability (this device backs no capability).
- docs/dma-isolation-design.md – the DMA backend and isolation model the IOMMU
  remapping proofs validate.