# virtio-rng (modern PCI entropy device)

This is a provenance map for the in-tree virtio-rng path: it cites the spec,
summarizes only the wire-format subset the code actually implements, and points
into the implementation. It is not a re-spec -- where the spec is implemented
unchanged it links rather than transcribes.

Unlike [`virtio-net`](virtio-net.md) and [`virtio-blk`](virtio-blk.md), the
virtio-rng device does **not** back a userspace-facing capability. It is a
**QEMU-only proof fixture, not a production driver, and not forward DDF
production evidence**: the entropy the device produces is consumed only by
in-kernel proofs, never handed to a process. The capOS `EntropySource`
capability is a **separate, RDRAND-backed** path
(`kernel/src/cap/entropy_source.rs`, `fill_random` / `rdrand64` / `has_rdrand`;
per-call bound `MAX_ENTROPY_FILL_BYTES`) and does not touch this device. This
classification is asserted, not just documented: on every `cfg(qemu)` boot
`diagnose_qemu_virtio_rng` emits a deterministic marker
(`virtio-rng: classification=qemu-only-proof-fixture userspace_capability=none
production_driver=no ...`) that `make run-iommu-remapping`
(`tools/qemu-iommu-remapping-smoke.sh`) requires, so a regression that promoted
this path into a production-driver claim would fail the smoke test. virtio-rng
exists in the tree for two reasons:

1. A **DDF metadata-diagnostics** path that exercises modern-transport
   discovery, MSI-X metadata selection, and the device-manager
   ownership/teardown/grant-source hooks against a real PCI function on every
   `cfg(qemu)` boot (`kernel/src/virtio.rs` `diagnose_virtio_rng_metadata`,
   driven from `kernel/src/pci.rs` `diagnose_qemu_virtio_rng`).
2. An **IOMMU VT-d second-level remapping hardware-DMA** proof vehicle (the
   Slice A2/B/C proofs in `kernel/src/iommu.rs`, driven through
   `kernel/src/virtio.rs` `prove_iommu_rng_mapped_dma` /
   `prove_iommu_rng_unmapped_dma` / `prove_iommu_rng_stale_dma`). QEMU's entropy
   device supports the smallest real virtqueue driver we can stand up, which
   makes it the minimal vehicle for proving that a device DMA actually walks
   the programmed translation tables.

It reuses the modern split-ring transport seam introduced for
[`virtio-net`](virtio-net.md); this page covers only the rng-specific usage.

## 1. Spec basis

- **Device**: virtio entropy device, modern (virtio 1.x) PCI transport.
  PCI vendor `0x1af4`; device `0x1044` (modern) / `0x1005` (transitional).
  IDs at `kernel/src/pci.rs` (`VIRTIO_VENDOR_ID`, `VIRTIO_RNG_MODERN_DEVICE_ID`,
  `VIRTIO_RNG_TRANSITIONAL_DEVICE_ID`; matched by `PciDevice::is_virtio_rng`).
  QEMU exposes it as `virtio-rng-pci-non-transitional` (see §3).
- **Authoritative spec**: *Virtual I/O Device (VIRTIO) Version 1.2*, OASIS
  Committee Specification 01 (2022-07-01).
  Source: <https://docs.oasis-open.org/virtio/virtio/v1.2/virtio-v1.2.html>.
  Relevant sections: 4.1 (virtio over PCI bus), 2.7 (split virtqueues),
  5.4 (entropy device).
- **Reference**: cross-checked against the Linux `virtio_rng` driver for the
  single-request-queue model and the `virtio_pci_modern` modern-transport
  handshake.
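
As a rough illustration of the ID match listed above, the following sketch
classifies a PCI function by vendor and device ID. The constant names mirror
the ones cited from `kernel/src/pci.rs`; `PciIds`, `RngId`, and
`classify_virtio_rng` are hypothetical stand-ins for illustration and are not
the in-tree `PciDevice::is_virtio_rng`.

```rust
// Illustrative sketch only; the real constants live in kernel/src/pci.rs.
const VIRTIO_VENDOR_ID: u16 = 0x1af4;
// Modern virtio PCI device IDs are 0x1040 + device type; the entropy device is type 4.
const VIRTIO_RNG_MODERN_DEVICE_ID: u16 = 0x1044;
const VIRTIO_RNG_TRANSITIONAL_DEVICE_ID: u16 = 0x1005;

/// Hypothetical stand-in for the in-tree PCI device record.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct PciIds {
    vendor: u16,
    device: u16,
}

/// Hypothetical classification of which ID matched.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RngId {
    Modern,
    Transitional,
}

/// Returns which virtio-rng ID the function carries, or `None` if it is
/// not a virtio entropy device at all.
fn classify_virtio_rng(ids: PciIds) -> Option<RngId> {
    if ids.vendor != VIRTIO_VENDOR_ID {
        return None;
    }
    match ids.device {
        VIRTIO_RNG_MODERN_DEVICE_ID => Some(RngId::Modern),
        VIRTIO_RNG_TRANSITIONAL_DEVICE_ID => Some(RngId::Transitional),
        _ => None,
    }
}
```

Matching the ID alone does not say whether a transitional function exposes
modern capabilities; that is decided later by transport discovery.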

## 2. Wire format (implemented subset)

The modern PCI capability parsing, common-config register map, split-ring
descriptor layout, and feature-negotiation handshake are the **shared transport
seam** documented in [`virtio-net` §2](virtio-net.md#2-wire-format-implemented-subset)
(`kernel/src/virtio.rs` `transport` module, `ModernTransport`, the `COMMON_*`
register offsets, `VIRTQ_DESC_F_WRITE`). The rng path discovers that transport
through `discover_virtio_rng_metadata_transport` and maps regions with
`map_region`. Only the rng-specific subset is summarized here.

- **Device shape**: the transport discovery reports whether the function is a
  modern device id or a transitional id that still exposes modern capabilities
  (`DeviceShape::Modern` / `DeviceShape::TransitionalWithModernCaps`); both are
  driven through the modern path. The entropy device has **no device-specific
  config space** and no device-specific feature bits.
- **Single request queue**: the entropy device exposes one virtqueue, the
  `requestq` (`VIRTIO_RNG_REQUEST_QUEUE` = queue 0). The IOMMU proof drives it
  at a deliberately small `VIRTIO_RNG_PROOF_QUEUE_SIZE` (2) -- a single
  in-flight descriptor is enough to prove a DMA through translation, and a power
  of two keeps the ring layout legal. The per-queue notify address is computed
  from `notify_off_multiplier` like any modern virtio queue.
- **Request framing**: each request is a single device-writable descriptor
  (`VIRTQ_DESC_F_WRITE`) pointing at a buffer the device fills with entropy. The
  proof requests `VIRTIO_RNG_PROOF_REQUEST_LEN` (64) bytes
  (`rng_publish_descriptor_and_notify` writes the 16-byte descriptor and bumps
  the available ring; completion is read from the used ring's
  `{ id:u32, len:u32 }` entry). There is no request header or status byte --
  the entropy device just writes bytes into the supplied buffer.
- **Feature negotiation**: virtio-rng offers no device-specific features. The
  metadata path negotiates nothing; the IOMMU hardware-DMA proof requires both
  `VIRTIO_F_VERSION_1` (modern transport) and `VIRTIO_F_ACCESS_PLATFORM` -- the
  latter is what makes QEMU route the device's DMA through the platform IOMMU
  and consume the IOVAs the driver programs into the ring registers, rather than
  treating them as host-physical addresses. A device that does not offer both
  fails the proof closed (`rng-missing-access-platform-feature`).
- **Completion**: the proof **polls the used ring** (`hhdm_read_u16` of
  `used.idx`, bounded by `VIRTIO_RNG_USED_POLL_LIMIT`) rather than waiting on the
  request interrupt; the MSI-X path is exercised at the metadata level only (see
  §3).
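
The request framing, required-feature check, and bounded used-ring poll
described above can be sketched roughly as follows. This is a minimal sketch
under stated assumptions, not the in-tree code: the constant names and the
failure reason come from this page, the descriptor layout and feature-bit
positions come from the virtio spec, and `VirtqDesc`, `rng_request_descriptor`,
`check_proof_features`, `poll_used_idx`, and the closure standing in for
`hhdm_read_u16` are hypothetical names chosen for illustration.

```rust
/// Split-ring descriptor layout from the virtio spec (16 bytes).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct VirtqDesc {
    addr: u64,  // buffer address as the device sees it (an IOVA in the proof)
    len: u32,   // buffer length in bytes
    flags: u16, // VIRTQ_DESC_F_* bits
    next: u16,  // unused here: the request is a single descriptor
}

const VIRTQ_DESC_F_WRITE: u16 = 2; // device writes into the buffer
const VIRTIO_F_VERSION_1: u64 = 1 << 32;
const VIRTIO_F_ACCESS_PLATFORM: u64 = 1 << 33;
const VIRTIO_RNG_PROOF_REQUEST_LEN: u32 = 64;

/// One device-writable descriptor; no header, no status byte.
fn rng_request_descriptor(buffer_iova: u64) -> VirtqDesc {
    VirtqDesc {
        addr: buffer_iova,
        len: VIRTIO_RNG_PROOF_REQUEST_LEN,
        flags: VIRTQ_DESC_F_WRITE,
        next: 0,
    }
}

/// Fail closed unless the device offers both required features.
/// Returns exactly the feature set the driver would accept.
fn check_proof_features(offered: u64) -> Result<u64, &'static str> {
    let required = VIRTIO_F_VERSION_1 | VIRTIO_F_ACCESS_PLATFORM;
    if offered & required == required {
        Ok(required)
    } else {
        Err("rng-missing-access-platform-feature")
    }
}

/// Poll `used.idx` at most `poll_limit` times; `None` means no completion
/// was observed within the bound. `read_used_idx` is an assumed stand-in
/// for the HHDM read of the used ring index.
fn poll_used_idx(
    mut read_used_idx: impl FnMut() -> u16,
    start_idx: u16,
    poll_limit: u32,
) -> Option<u16> {
    for _ in 0..poll_limit {
        let idx = read_used_idx();
        if idx != start_idx {
            return Some(idx);
        }
        std::hint::spin_loop();
    }
    None
}
```

Bounding the poll keeps a device that never completes from hanging the boot;
the caller can treat `None` as a proof failure rather than waiting forever.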

## 3. capOS mapping

- **Binding (transitional, in-kernel, no userspace cap)**: virtio-rng is driven
  entirely **in the kernel** and is not exposed to userspace at all -- there is
  no `RandomNumberGenerator`/`EntropySource`-style cap routed to this device.
  The metadata-diagnostics path runs on every `cfg(qemu)` boot from
  `kernel/src/pci.rs` `diagnose_qemu_virtio_rng`; the hardware-DMA proofs run
  under the `run-iommu-remapping` target only.
- **Device-manager authority (metadata path)**: `diagnose_virtio_rng_metadata`
  binds authority through the kernel `device_manager` against
  `DeviceOwner::VirtioRng` -- it proves QEMU ownership
  (`prove_qemu_ownership`), teardown triggers, and the
  `DeviceMmio`/`DMAPool`/`DMABuffer` cap release / driver-crash / reset-disable
  hooks, then logs the `devicemmio` / `dmapool` / `interrupt` grant-source
  status (`devicemmio_grant_source::log_status` and the `dmapool` / `interrupt`
  equivalents). This is the same DDF ledger the cloud-NIC and block drivers
  bind through; virtio-rng is the function the bring-up hooks are proved against.
- **MMIO**: the modern-transport common/notify/ISR/device-config regions are
  mapped from the device BARs (`map_region` over `pci::map_bar_region`) into the
  device-uncacheable (`NO_CACHE`) window; the metadata path additionally logs
  each decoded region (`log_device_region`). Doorbell (queue-notify) writes are
  scoped to the per-queue notify address computed from `notify_off_multiplier`.
- **Interrupt**: MSI-X is handled at the **metadata level** -- the request queue
  uses `VIRTIO_RNG_MSIX_METADATA_ENTRY` (0) and requires
  `VIRTIO_RNG_MSIX_REQUIRED_ENTRIES` (1) usable table entries; the plan is
  selected by `select_virtio_rng_msix_plan` and the route programming is proved
  by `prove_virtio_rng_msix_metadata_route`. The hardware-DMA proof completes by
  polling the used ring, so it does not arm a completion-IRQ waiter.
- **DMA**: the IOMMU proof's descriptor table, available ring, used ring, and
  request buffer are placed at **programmed IOVAs** carried in the
  `iommu::IommuRngDmaVehicle`, never at host-physical addresses; once
  `GCMD.TE` is set every DMA the device issues must walk the second-level table
  the IOMMU module installed. The ring pages are zeroed through the HHDM before
  their IOVAs are handed to the device so stale memory contents can never be
  mistaken for a completion. No host physical address or IOVA leaves the kernel
  boundary.
- **Fail-closed / validation rules**: the proof fails closed at every step --
  transport discovery, bus-master enable, MMIO map, reset handshake, the
  required-feature check, notify-offset/map-length overflow, queue-size floor,
  and queue-enable rejection each return a distinct `failed(...)` reason rather
  than proceeding. A page whose invalidation never completes is **not** freed
  (a page freed before invalidation completes would be a stale-DMA hole). The
  unmapped-IOVA and stale-IOVA re-drives must fault in the IOMMU
  (`FSTS.PPF` / `FRCD[0].F`) instead of reaching memory.
- **QEMU-emulable vs hardware-only**: fully QEMU-emulable. QEMU provides
  `virtio-rng-pci-non-transitional` (the shared `QEMU_SECOND_DEVICE` default);
  `make run-iommu-remapping` overrides it with `iommu_platform=on` behind an
  `intel-iommu` device and is the end-to-end proof of the mapped-IOVA
  hardware DMA, the unmapped-IOVA fault, and the Slice C two-phase revocation /
  stale-DMA fault. The DDF metadata diagnostics emit on every `cfg(qemu)` boot.
  No hardware-only path.
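
Two of the fail-closed checks above, the per-queue notify address and the
queue-size floor, can be sketched as follows. This is an illustrative sketch
only: `VIRTIO_RNG_PROOF_QUEUE_SIZE` is the constant named on this page, while
`ProofSetupError`, its variants, `queue_notify_address`, and
`select_proof_queue_size` are hypothetical names, and the in-tree `failed(...)`
reasons may be worded differently. The sketch assumes the doorbell is a 16-bit
write, as for a modern queue without notification data.

```rust
const VIRTIO_RNG_PROOF_QUEUE_SIZE: u16 = 2;

/// Hypothetical error type; each variant maps to one distinct failure reason.
#[derive(Debug, PartialEq, Eq)]
enum ProofSetupError {
    NotifyOffsetOverflow,
    NotifyOutsideRegion,
    QueueTooSmall,
}

/// Compute the doorbell address for one queue inside the mapped notify region.
/// `region_base` and `region_len` describe the mapped notify capability window.
fn queue_notify_address(
    region_base: u64,
    region_len: u64,
    queue_notify_off: u16,
    notify_off_multiplier: u32,
) -> Result<u64, ProofSetupError> {
    // A 16-bit value times a 32-bit value always fits in 64 bits.
    let offset = u64::from(queue_notify_off) * u64::from(notify_off_multiplier);
    // The 2-byte doorbell write must land entirely inside the mapped window.
    if offset + 2 > region_len {
        return Err(ProofSetupError::NotifyOutsideRegion);
    }
    region_base
        .checked_add(offset)
        .ok_or(ProofSetupError::NotifyOffsetOverflow)
}

/// Reject a device whose maximum queue size is below the proof's floor;
/// otherwise drive the queue at the small fixed proof size.
fn select_proof_queue_size(device_max: u16) -> Result<u16, ProofSetupError> {
    if device_max < VIRTIO_RNG_PROOF_QUEUE_SIZE {
        return Err(ProofSetupError::QueueTooSmall);
    }
    debug_assert!(VIRTIO_RNG_PROOF_QUEUE_SIZE.is_power_of_two());
    Ok(VIRTIO_RNG_PROOF_QUEUE_SIZE)
}
```

Returning a distinct error per check mirrors the rule that each step reports
its own reason instead of proceeding with a partially validated queue.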

## Related

- `kernel/src/virtio.rs` -- the rng metadata diagnostics
  (`diagnose_virtio_rng_metadata`), the IOMMU hardware-DMA proof driver
  (`prove_iommu_rng_mapped_dma` / `prove_iommu_rng_unmapped_dma` /
  `prove_iommu_rng_stale_dma`), and the shared modern split-ring transport.
- `kernel/src/iommu.rs` -- the VT-d Slice A2/B/C remapping, fault, and
  revocation proofs that drive this device.
- `kernel/src/cap/entropy_source.rs` -- the **separate** RDRAND-backed
  `EntropySource` capability (this device backs no capability).
- `docs/dma-isolation-design.md` -- the DMA backend and isolation model the
  IOMMU remapping proofs validate.
