IOMMU Remapping Grounding
This note records primary-source facts for future real IOMMU/remapping work.
It is not an implementation design and does not change current capOS behavior.
Today capOS performs diagnostics-only DMAR/IVRS parsing and PCI attachment
reporting, including retained DMAR metadata, include-all coverage, and direct
endpoint scopes. DMAPool has manager-owned domain identity and
mapping-lifecycle preflight records, but the active labels remain
remapping_tables=not-programmed, invalidation/IOTLB/stale-cleanup
not-installed, and direct_dma=blocked.
Sources
- Intel, Intel Virtualization Technology for Directed I/O Architecture Specification, content ID 671081. Intel page metadata on 2026-05-12 listed Date 2022-06-02 and Version 5.1 (Latest). Sections used: 6.2.2 “Context-Cache”, 6.2.4 “IOTLB”, 6.5.1 “Register-based Invalidation Interface”, 6.5.2 “Queued Invalidation Interface”, 6.5.3 “IOTLB Invalidation Considerations”, 6.6 “Set Root Table Pointer Operation”, 6.8 “Write Buffer Flushing”, 7.10 “Software Steps to Drain Page Requests & Responses”, 8.3 “DMA Remapping Hardware Unit Definition Structure”, 8.3.1 “Device Scope Structure”, 9.1 “Root Entry”, 9.3 “Context Entry”, 9.4 “Scalable-Mode Context-Entry”, and 11.4.5-11.4.9 covering the root-table-address, invalidation, fault, protected-memory-range, and invalidation-queue registers.
- AMD, AMD I/O Virtualization Technology (IOMMU) Specification 48882, 48882-PUB Rev 3.10, February 2025. Sections used: 2.2 device table, device-table entry, I/O page table, and interrupt-remapping material; 2.4 “Commands”; 2.5 “Event Logging”; 3.4 “IOMMU MMIO Registers”; IVRS/device-table/page-table, command-buffer, completion-wait, invalidation, and event-log material.
- QEMU, qemu-manpage entries for -device intel-iommu, -device amd-iommu, and -device virtio-iommu-pci; and QEMU PCI developer documentation for PCI IOMMU and IOTLB notifier APIs. These are current-master QEMU docs, not a frozen release manual; the qemu-manpage and PCI developer pages observed on 2026-05-12 were generated for QEMU version 11.0.50.
Intel VT-d Grounding
Intel VT-d identifies DMA request sources through PCI requester/source IDs and resolves them through DMA remapping hardware units described by DMAR DRHD structures. The table path is rooted at a root table and context tables. Root entries select context tables; context entries bind a source to a translation type, domain identifier, address width, and second-level page-table root; and scalable-mode context entries extend that context format. A future capOS table builder therefore needs explicit data for the DRHD unit, PCI segment and BDF/source ID, domain ID, address-width choice, and second-level page-table root before any IOVA can be exported as a device address.
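The inputs listed above can be sketched as a plain record. This is a minimal illustration, not capOS code: `pack_source_id`, `VtdBindingRequest`, and all field names are hypothetical, and the root/context index helpers assume legacy-mode indexing (root entry by bus number, context entry by device/function byte). Scalable mode would need a different split.

```python
from dataclasses import dataclass


def pack_source_id(bus: int, device: int, function: int) -> int:
    """Pack a PCI bus/device/function triple into a 16-bit requester ID."""
    if not (0 <= bus <= 0xFF and 0 <= device <= 0x1F and 0 <= function <= 0x7):
        raise ValueError("BDF component out of range")
    return (bus << 8) | (device << 3) | function


@dataclass(frozen=True)
class VtdBindingRequest:
    """Hypothetical record of what a table builder must know up front."""

    drhd_index: int          # which DRHD unit covers the device (hypothetical index)
    segment: int             # PCI segment number
    source_id: int           # 16-bit requester ID, e.g. from pack_source_id()
    domain_id: int
    address_width_bits: int  # address-width choice, e.g. 39 or 48
    page_table_root: int     # physical address of the second-level root

    def validate(self) -> None:
        if not 0 <= self.source_id <= 0xFFFF:
            raise ValueError("source ID must fit in 16 bits")
        if self.page_table_root % 4096 != 0:
            raise ValueError("page-table root must be 4 KiB aligned")

    def root_index(self) -> int:
        """Bus number (assumed legacy-mode root-table index)."""
        return self.source_id >> 8

    def context_index(self) -> int:
        """Device/function byte (assumed legacy-mode context-table index)."""
        return self.source_id & 0xFF
```

Keeping the record immutable and free of hardware access means it can exist while remapping_tables=not-programmed remains the active label.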
Invalidation is part of the mapping lifetime, not a diagnostic detail. Intel’s register-based and queued invalidation interfaces cover context-cache, IOTLB, device-TLB, interrupt-entry-cache, and wait/completion descriptors. Future capOS page reuse must not proceed merely because a software ledger drops a mapping; it must account for the relevant context-cache/IOTLB invalidation, queued-invalidation ordering, completion observation, and write-buffer flushing required by the selected hardware mode. Fault-reporting registers are the minimum diagnostic surface for translation failures and protection faults.
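The page-reuse rule above can be modelled as a small state machine. This is a sketch under assumptions: the state names and the `MappingRecord` class are hypothetical, and the model collapses the mode-specific steps (context-cache versus IOTLB invalidation, write-buffer flushing) into a single "queued" and a single "completion observed" step.

```python
from enum import Enum, auto


class MappingState(Enum):
    MAPPED = auto()
    LEDGER_DROPPED = auto()       # software ledger no longer lists the mapping
    INVALIDATION_QUEUED = auto()  # invalidation submitted to the hardware
    COMPLETION_OBSERVED = auto()  # wait/completion seen by software


# Only these forward transitions are allowed; nothing may be skipped.
_NEXT_STATE = {
    MappingState.MAPPED: MappingState.LEDGER_DROPPED,
    MappingState.LEDGER_DROPPED: MappingState.INVALIDATION_QUEUED,
    MappingState.INVALIDATION_QUEUED: MappingState.COMPLETION_OBSERVED,
}


class MappingRecord:
    def __init__(self, iova: int, length: int) -> None:
        self.iova = iova
        self.length = length
        self.state = MappingState.MAPPED

    def advance(self, new_state: MappingState) -> None:
        """Move to the next lifecycle state, rejecting skipped steps."""
        if _NEXT_STATE.get(self.state) is not new_state:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state

    def backing_pages_reusable(self) -> bool:
        """Pages may be reused only after completion has been observed."""
        return self.state is MappingState.COMPLETION_OBSERVED
```

The point of the model is that dropping the ledger entry alone never makes `backing_pages_reusable()` return true.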
QEMU’s intel-iommu documentation is useful for focused emulator smokes but
should not be treated as hardware coverage. The device is q35-only in QEMU current
master. Relevant options include intremap, caching-mode, device-iotlb,
and aw-bits=39|48; QEMU documents 39-bit IOVA space for 3-level IOMMU page
tables and 48-bit IOVA space for 4-level tables.
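The 39/48-bit figures follow from simple arithmetic if one assumes 4 KiB pages (12 offset bits) and 9 index bits per page-table level: 12 + 3 × 9 = 39 and 12 + 4 × 9 = 48. The sketch below makes that explicit; the function names are hypothetical and the constants are the stated assumptions, not values read from QEMU.

```python
from typing import List, Tuple

PAGE_SHIFT = 12  # assumed 4 KiB base page
INDEX_BITS = 9   # assumed 512 entries per table level


def levels_for_width(aw_bits: int) -> int:
    """Number of page-table levels implied by an IOVA width."""
    span = aw_bits - PAGE_SHIFT
    if span <= 0 or span % INDEX_BITS != 0:
        raise ValueError("width is not page offset plus whole levels")
    return span // INDEX_BITS


def split_iova(iova: int, aw_bits: int) -> Tuple[List[int], int]:
    """Return (per-level indices from the top level down, page offset)."""
    levels = levels_for_width(aw_bits)
    if iova < 0 or iova >> aw_bits:
        raise ValueError("IOVA does not fit in the chosen width")
    mask = (1 << INDEX_BITS) - 1
    indices = [
        (iova >> (PAGE_SHIFT + INDEX_BITS * level)) & mask
        for level in reversed(range(levels))
    ]
    return indices, iova & ((1 << PAGE_SHIFT) - 1)
```

For example, `levels_for_width(39)` gives 3 and `levels_for_width(48)` gives 4, matching the QEMU documentation quoted above.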
AMD-Vi Grounding
AMD-Vi uses a different vocabulary and table root. Device requests are keyed by DeviceID and resolved through a Device Table Entry. A DTE carries validity, translation, interrupt-remapping, DomainID, mode/page-table-depth, and page-table-root information. Future shared capOS abstractions can name the logical domain and IOVA lifetime generically, but AMD-specific code should not pretend it is programming Intel root/context tables.
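One way to keep the shared and vendor-specific vocabularies apart is to let a generic domain record be referenced by an AMD-named record. The sketch below is illustrative only: `LogicalDomain` and `AmdDteRecord` are hypothetical names, the fields are logical values rather than DTE bit positions, and the 1..6 range for the paging mode is an assumption to confirm against the AMD specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalDomain:
    """Vendor-neutral domain identity shared by Intel and AMD paths."""

    name: str
    domain_id: int


@dataclass(frozen=True)
class AmdDteRecord:
    """Logical view of a Device Table Entry; not a bit-level layout."""

    device_id: int          # 16-bit DeviceID that keys the device table
    domain: LogicalDomain
    valid: bool
    translation_valid: bool
    interrupt_remap: bool
    paging_mode: int        # page-table depth (assumed 1..6 when translating)
    page_table_root: int    # physical address of the I/O page-table root

    def validate(self) -> None:
        if not 0 <= self.device_id <= 0xFFFF:
            raise ValueError("DeviceID must fit in 16 bits")
        if self.translation_valid and not 1 <= self.paging_mode <= 6:
            raise ValueError("translation needs a paging mode in 1..6")
        if self.page_table_root % 4096 != 0:
            raise ValueError("page-table root must be 4 KiB aligned")

    def describe(self) -> str:
        """Diagnostic line that keeps the AMD names visible."""
        return (
            f"amd-vi DeviceID={self.device_id:#06x} "
            f"DomainID={self.domain.domain_id} mode={self.paging_mode}"
        )
```

Diagnostics built from `describe()` would print DeviceID and DomainID rather than Intel's source ID and context-entry terms.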
AMD invalidation and completion are command-buffer operations. The future mapping lifetime must include command-buffer invalidation commands, completion wait, and event-log handling. The event log is the basic hardware-facing diagnostic record for malformed requests, page faults, and table errors; the MMIO register set covers control/status, command and event pointers, event-log state, alternate event-log buffers, device-table segment bases, and extended features.
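The command-buffer ordering can be illustrated with a software-only ring model in which software advances a tail index and the modelled IOMMU advances a head index. Everything here is a hypothetical teaching model: the command names are placeholder strings, not AMD command encodings, and `hardware_step` stands in for the device consuming one entry.

```python
from typing import List, Optional, Set, Tuple

Command = Tuple[str, int]


class CommandRing:
    def __init__(self, capacity: int) -> None:
        if capacity < 2:
            raise ValueError("ring needs at least two slots")
        self._slots: List[Optional[Command]] = [None] * capacity
        self.head = 0  # advanced by the modelled IOMMU
        self.tail = 0  # advanced by software
        self.completed_tokens: Set[int] = set()

    def is_full(self) -> bool:
        return (self.tail + 1) % len(self._slots) == self.head

    def push(self, command: Command) -> None:
        if self.is_full():
            raise BufferError("command ring full")
        self._slots[self.tail] = command
        self.tail = (self.tail + 1) % len(self._slots)

    def hardware_step(self) -> bool:
        """Consume one command; return False when the ring is empty."""
        if self.head == self.tail:
            return False
        kind, argument = self._slots[self.head]
        if kind == "completion-wait":
            self.completed_tokens.add(argument)
        self.head = (self.head + 1) % len(self._slots)
        return True


def invalidate_then_wait(ring: CommandRing, domain_id: int, token: int) -> None:
    """Queue an invalidation followed by the completion wait that fences it."""
    ring.push(("invalidate-pages", domain_id))
    ring.push(("completion-wait", token))
```

A caller would treat the mapping's pages as reusable only once its token appears in `completed_tokens`, which cannot happen before the preceding invalidation entry has been consumed.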
QEMU’s amd-iommu device is also documented as q35-only in current master. The
documented options include dma-remap for DMA address translation and
permission checking and intremap for interrupt remapping. Treat these as
emulator smoke inputs until capOS has separate hardware or provider evidence.
QEMU Test Surface
QEMU can provide useful negative and smoke tests for a future remapping path:
- intel-iommu on q35 with explicit aw-bits, optional interrupt remapping, and caching/device-IOTLB options selected deliberately for each test.
- amd-iommu on q35 with DMA remapping enabled when testing translated DMA.
- virtio-iommu-pci on q35 x86_64 or virt ARM for a virtio-IOMMU model, if a later portable IOMMU frontend is selected.
- PCI IOMMU/IOTLB notifier APIs in QEMU developer docs for understanding how emulated devices observe translation changes, not as guest architectural requirements.
Because the QEMU citations are current-master documentation, tests should pin
the local qemu-system-x86_64 --version, machine type, device options, and
expected diagnostic labels when real smokes are added.
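A pinned test shape could be captured as data alongside the command line it implies. The sketch below is an assumption-laden illustration: `intel_iommu_args` and `PinnedSmokeShape` are hypothetical helpers, the option names are the ones listed in this note, and pairing intremap=on with kernel-irqchip=split is an assumption that should be checked against the pinned QEMU version.

```python
from dataclasses import dataclass
from typing import List, Tuple


def intel_iommu_args(
    aw_bits: int, intremap: bool, caching_mode: bool, device_iotlb: bool
) -> List[str]:
    """Build the machine/device arguments for one explicit test shape."""
    if aw_bits not in (39, 48):
        raise ValueError("aw-bits must be 39 or 48")

    def onoff(flag: bool) -> str:
        return "on" if flag else "off"

    # Assumption: interrupt remapping wants a split irqchip on q35.
    machine = "q35,kernel-irqchip=split" if intremap else "q35"
    device = (
        f"intel-iommu,intremap={onoff(intremap)},"
        f"caching-mode={onoff(caching_mode)},"
        f"device-iotlb={onoff(device_iotlb)},aw-bits={aw_bits}"
    )
    return ["-machine", machine, "-device", device]


@dataclass(frozen=True)
class PinnedSmokeShape:
    qemu_version: str                 # exact --version text recorded by the test
    qemu_args: Tuple[str, ...]
    expected_labels: Tuple[str, ...]  # diagnostic labels the guest log must show

    def mismatches(self, observed_version: str, observed_log: str) -> List[str]:
        """List every way the observed run differs from the pin."""
        problems = []
        if observed_version != self.qemu_version:
            problems.append("qemu version differs from pin")
        for label in self.expected_labels:
            if label not in observed_log:
                problems.append(f"missing label: {label}")
        return problems
```

Every option is spelled out, including the ones set to off, so a changed QEMU default cannot silently alter what the test exercises.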
Possible Implementation Slices
The following are neutral decomposition boundaries for later implementation work. They are not an ordering recommendation.
- Source-grounding refresh: update this note when a real branch selects exact Intel, AMD, or QEMU features beyond the sections above.
- Table-builder data structures: represent Intel DRHD/root/context/domain data and AMD IVRS/DeviceID/DTE/DomainID data without programming hardware.
- Diagnostics-only MMIO/fault/status: read bounded status and fault surfaces while keeping direct DMA blocked.
- Disabled IOVA allocator: build domain-scoped IOVA allocation records with no exported device addresses until table programming and invalidation exist.
- QEMU-only remapping smoke: prove one minimal translated mapping and teardown path under a pinned QEMU shape, with labels that do not claim hardware isolation.
- AMD compatibility naming: keep generic capOS “domain”/“IOVA”/“mapping” terms while preserving AMD-specific DTE, DeviceID, command-buffer, and event-log names in implementation and diagnostics.
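As an illustration of the disabled IOVA allocator slice, the sketch below hands out domain-scoped IOVA records but refuses to export any of them as a device address unless the caller asserts that both table programming and invalidation exist. All names are hypothetical, the bump-pointer policy and 4 KiB granularity are assumptions chosen for brevity, and freeing is deliberately omitted.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 4096  # assumed allocation granularity


class ExportBlocked(Exception):
    """Raised when an IOVA would leave the allocator as a device address."""


@dataclass(frozen=True)
class IovaRecord:
    domain_id: int
    iova: int
    length: int


class DisabledIovaAllocator:
    def __init__(self, domain_id: int, base: int, limit: int) -> None:
        if base % PAGE_SIZE or limit % PAGE_SIZE or base >= limit:
            raise ValueError("window must be page aligned and non-empty")
        self.domain_id = domain_id
        self._next = base
        self._limit = limit
        self.records: List[IovaRecord] = []

    def allocate(self, length: int) -> IovaRecord:
        """Reserve a page-rounded IOVA range and record it."""
        if length <= 0:
            raise ValueError("length must be positive")
        rounded = -(-length // PAGE_SIZE) * PAGE_SIZE  # round up to a page
        if self._next + rounded > self._limit:
            raise MemoryError("IOVA window exhausted")
        record = IovaRecord(self.domain_id, self._next, rounded)
        self._next += rounded
        self.records.append(record)
        return record

    def export_device_address(
        self,
        record: IovaRecord,
        tables_programmed: bool = False,
        invalidation_installed: bool = False,
    ) -> int:
        """Return the IOVA only when both prerequisites are asserted."""
        if record.domain_id != self.domain_id:
            raise ValueError("record belongs to another domain")
        if not (tables_programmed and invalidation_installed):
            raise ExportBlocked("direct DMA is blocked")
        return record.iova
```

With the defaults, every export attempt raises `ExportBlocked`, which mirrors the direct_dma=blocked label while still letting allocation bookkeeping be tested.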