Azure MANA (Microsoft Azure Network Adapter)

This is a provenance map for the MANA / GDMA wire logic in capos-lib/src/mana.rs: it cites the spec basis, summarizes only the wire-format subset the code actually implements, and points into the implementation by symbol name. It is not a re-spec.

Maturity caveat. This page documents protocol encode/decode logic with a host-side conformance suite, not a bound driver. QEMU has no MANA device model, so this logic is a deliberate QEMU exception: it is gated by cargo test-lib plus a warning-free cargo build --features qemu, not by a make run-* smoke test. End-to-end MANA bind / send / receive / teardown on real Azure hardware – including SR-IOV VF revocation with fallback-to-synthetic and DMA/MMIO/IRQ teardown – is future work (tracked as cloud-azure-mana-nic-live-proof), blocked until Azure access is provisioned. Section 3 (capOS mapping) below therefore describes the planned binding, not landed authority.

1. Spec basis

  • Device: Microsoft Azure Network Adapter (MANA), the modern Azure NIC for Dv5/Ev5 and later VM families. Exposed to the guest as a PCI SR-IOV Virtual Function. PCI vendor 0x1414 (Microsoft); device 0x00ba (VF, the guest-bound function) / 0x00b9 (PF). IDs at capos-lib/src/mana.rs (MANA_PCI_VENDOR_ID, MANA_VF_DEVICE_ID, MANA_PF_DEVICE_ID). The device is fronted by GDMA (Generic DMA), Microsoft’s queue/DMA abstraction; MANA is the network client riding on GDMA queues.
  • Authoritative spec: MANA has no freely published register specification. The basis of record is the upstream open-source MANA Linux driver, whose “HW DATA” structures are the documented wire contract:
    • include/net/mana/gdma.h – GDMA registers, doorbells, message headers, WQE/CQE/EQE, request-type space, device/queue enums.
    • include/net/mana/mana.h – MANA TX/RX OOB descriptors, completion OOBs, mana_cqe_type, mana_command_code.
    • include/net/mana/hw_channel.h, include/net/mana/shm_channel.h – the HWC management channel and the shared-memory bootstrap aperture.
    • Reference snapshot: torvalds/linux master at commit d60ec36cab338dfe2ae40d73e9c8d6c4af70d2b8 (the gdma.h structures are stable across recent kernels).
  • Reference driver: the same MANA Linux driver (drivers/net/ethernet/microsoft/mana/) is the behavior cross-check; mana_gd_init_req_hdr defines the standard request-header construction mirrored by GdmaReqHdr::standard.
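As a rough illustration of the standard request-header construction mentioned above, the following sketch fills both the request and the expected-response message headers from one request code. The type names MsgHdr and ReqHdr are hypothetical stand-ins, not the capos-lib GdmaMsgHdr / GdmaReqHdr API. The field order and the two constants are assumptions taken from a reading of upstream gdma.h and mana_gd_init_req_hdr, and should be checked against the pinned snapshot.

```rust
// Assumed values from upstream gdma.h; verify against the pinned snapshot.
const GDMA_STANDARD_HEADER_TYPE: u32 = 0;
const GDMA_MESSAGE_V1: u16 = 1;

/// Hypothetical mirror of `struct gdma_msg_hdr` (16 bytes, naturally aligned).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MsgHdr {
    pub hdr_type: u32,
    pub msg_type: u32,
    pub msg_version: u16,
    pub hwc_msg_id: u16,
    pub msg_size: u32,
}

impl MsgHdr {
    pub const SIZE: usize = 16;

    /// Serialize every field as little-endian at its natural offset.
    pub fn encode(&self) -> [u8; Self::SIZE] {
        let mut out = [0u8; Self::SIZE];
        out[0..4].copy_from_slice(&self.hdr_type.to_le_bytes());
        out[4..8].copy_from_slice(&self.msg_type.to_le_bytes());
        out[8..10].copy_from_slice(&self.msg_version.to_le_bytes());
        out[10..12].copy_from_slice(&self.hwc_msg_id.to_le_bytes());
        out[12..16].copy_from_slice(&self.msg_size.to_le_bytes());
        out
    }
}

/// Hypothetical mirror of `struct gdma_req_hdr`: request header, expected
/// response header, device id (type + instance), activity id.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ReqHdr {
    pub req: MsgHdr,
    pub resp: MsgHdr,
    pub dev_type: u16,
    pub dev_instance: u16,
    pub activity_id: u32,
}

impl ReqHdr {
    pub const SIZE: usize = 40;

    /// Same code in both headers; only the sizes differ. The device id and
    /// activity id are left zero here for the caller to fill in.
    pub fn standard(code: u32, req_size: u32, resp_size: u32) -> Self {
        let hdr = |msg_size| MsgHdr {
            hdr_type: GDMA_STANDARD_HEADER_TYPE,
            msg_type: code,
            msg_version: GDMA_MESSAGE_V1,
            hwc_msg_id: 0,
            msg_size,
        };
        ReqHdr {
            req: hdr(req_size),
            resp: hdr(resp_size),
            dev_type: 0,
            dev_instance: 0,
            activity_id: 0,
        }
    }

    pub fn encode(&self) -> [u8; Self::SIZE] {
        let mut out = [0u8; Self::SIZE];
        out[0..16].copy_from_slice(&self.req.encode());
        out[16..32].copy_from_slice(&self.resp.encode());
        out[32..34].copy_from_slice(&self.dev_type.to_le_bytes());
        out[34..36].copy_from_slice(&self.dev_instance.to_le_bytes());
        out[36..40].copy_from_slice(&self.activity_id.to_le_bytes());
        out
    }
}
```

Calling ReqHdr::standard(code, req_size, resp_size).encode() yields a 40-byte buffer under these layout assumptions; the request code value passed in is whatever the caller's request-type enum maps to and is not fixed by this sketch.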

2. Wire format (implemented subset)

All multi-byte words are little-endian; GDMA “HW DATA” structures are naturally aligned (not packed). Every decoder validates buffer length, rejects unknown enum members, and enforces must-be-zero (MBZ) reserved fields; every encoder range-checks its bitfields. Symbols below are in capos-lib/src/mana.rs.

  • Registers / BAR: single register BAR (BAR0). The VF doorbell-page and shared-memory aperture offsets (upstream GDMA_REG_*), the PF offsets (upstream GDMA_PF_REG_*), the SR-IOV config base, the fixed CQE/EQE/WQE-BU sizes, and the maximum SQE/RQE sizes live in the regs module (REG_DB_PAGE_OFFSET, REG_SHM_OFFSET, PF_REG_*, SRIOV_REG_CFG_BASE_OFF, CQE_SIZE, EQE_SIZE, MAX_SQE_SIZE, MAX_RQE_SIZE, WQE_BU_SIZE).
  • Doorbells: the four-variant union gdma_doorbell_entry is modeled by the DoorbellEntry enum (Cq/Rq/Sq/Eq), encoding the 24- or 16-bit queue id, the 31- or 32-bit tail pointer, the RQ wqe_cnt, and the CQ/EQ arm bit, with kind-specific reserved MBZ enforcement on decode.
  • Admin (HWC) messages: GdmaMsgHdr (gdma_msg_hdr), GdmaDevId (gdma_dev_id), GdmaReqHdr (gdma_req_hdr, whose standard constructor mirrors mana_gd_init_req_hdr), and GdmaRespHdr (gdma_resp_hdr, reserved-word MBZ). The request-type space is GdmaRequestType (gdma_request_type, fail-closed). The GDMA admin status is the open GdmaStatus space (success / MoreEntries / CmdUnsupported / preserved Other): it stays open because GDMA status is a firmware error space, not a closed enum, so unrecognized values are preserved rather than rejected.
  • Work queue: GdmaSge (gdma_sge, 16-byte SGE with 64-bit address) and GdmaWqeHeader (gdma_wqe, the 8-byte WQE header: num_sge, inline_oob_size_div4, client_oob_in_sgl, client_data_unit, with reserved MBZ). MANA TX OOB descriptors that prepend the SGL: ManaTxShortOob (mana_tx_short_oob, checksum-offload + completion-CQ + vSQ-frame selection) and ManaTxLongOob (mana_tx_long_oob, encapsulation / VLAN / inner-offset fields).
  • Completion / event: GdmaCqeInfo (gdma_cqe.cqe_info: wq_num, is_sq, 3-bit owner_bits) and GdmaEqeInfo (union gdma_eqe_info: event type via GdmaEqeType, client_id, owner_bits). MANA completion OOBs: ManaCqeHeader (mana_cqe_header, cqe_type via the fail-closed ManaCqeType enum), ManaRxcompOob (mana_rxcomp_oob, RX flags + MANA_RXCOMP_OOB_NUM_PPI per-packet ManaRxcompPerpktInfo + RX WQE offset), and ManaTxCompOob (mana_tx_comp_oob, TX data/SGL/WQE offsets + reserved-padding MBZ).
  • Capability / feature negotiation: the verify-version surface (GdmaRequestType::VerifyVfDriverVersion, GDMA_PROTOCOL_V1, GdmaOsType) and the MANA control command space ManaCommandCode (mana_command_code, fail-closed) including QueryDevConfig / QueryVportConfig / ConfigVportTx/Rx / CreateWqObj.
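To make the encoder range checks and the kind-specific MBZ enforcement described above concrete, here is a minimal sketch covering two of the four doorbell variants. The Doorbell and WireError types are hypothetical and are not the capos-lib DoorbellEntry API. The bit positions (id in the low 24 bits, bits 24..32 reserved for CQ or carrying wqe_cnt for RQ, tail pointer from bit 32, CQ arm bit at bit 63) are an assumption based on a reading of union gdma_doorbell_entry in upstream gdma.h.

```rust
/// Hypothetical two-variant subset of the doorbell union.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Doorbell {
    Cq { id: u32, tail_ptr: u32, arm: bool },
    Rq { id: u32, wqe_cnt: u8, tail_ptr: u32 },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireError {
    FieldOutOfRange,
    ReservedNotZero,
}

const ID24_MAX: u32 = (1 << 24) - 1;
const TAIL31_MAX: u32 = (1 << 31) - 1;

impl Doorbell {
    /// Encode to the 64-bit doorbell value, rejecting any field that does
    /// not fit its bitfield width instead of silently truncating it.
    pub fn encode(self) -> Result<u64, WireError> {
        match self {
            Doorbell::Cq { id, tail_ptr, arm } => {
                if id > ID24_MAX || tail_ptr > TAIL31_MAX {
                    return Err(WireError::FieldOutOfRange);
                }
                Ok(u64::from(id) | (u64::from(tail_ptr) << 32) | (u64::from(arm) << 63))
            }
            Doorbell::Rq { id, wqe_cnt, tail_ptr } => {
                if id > ID24_MAX {
                    return Err(WireError::FieldOutOfRange);
                }
                Ok(u64::from(id) | (u64::from(wqe_cnt) << 24) | (u64::from(tail_ptr) << 32))
            }
        }
    }

    /// Decode a CQ doorbell. The kind is not self-describing on the wire,
    /// so the caller supplies it by choosing the decoder. Bits 24..32 are
    /// reserved for this kind and must be zero.
    pub fn decode_cq(raw: u64) -> Result<Self, WireError> {
        if (raw >> 24) & 0xff != 0 {
            return Err(WireError::ReservedNotZero);
        }
        Ok(Doorbell::Cq {
            id: (raw & 0x00ff_ffff) as u32,
            tail_ptr: ((raw >> 32) & 0x7fff_ffff) as u32,
            arm: (raw >> 63) != 0,
        })
    }
}
```

The same bits that are reserved in a CQ doorbell carry wqe_cnt in an RQ doorbell, which is why the reserved-field check has to be per kind rather than a single mask over the whole value.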

3. capOS mapping (planned – not yet implemented)

MANA is a vendor-custom cloud NIC behind SR-IOV. The intended binding, when the live-proof work is unblocked, follows the same userspace-driver authority gate the other DDF device classes use; none of the grants below are exercised by the host conformance logic.

  • Authority gate: the MANA VF would be enumerated over PCI, claimed through the reviewed userspace-driver hardware-authority gate, and tracked in the device-manager ownership ledger, exactly as the cloud NIC/storage drivers are planned to bind. The current implementation grants nothing.
  • DeviceMmio: BAR0 (the GDMA register block, doorbell page, and SHM aperture) would be mapped device-uncacheable / NX, with doorbell writes scoped to the owning driver’s BAR window. The 64-bit DoorbellEntry values are the writes that path would emit.
  • Interrupt: GDMA EQs deliver completions via MSI-X; the live driver would bind one Interrupt per EQ vector and arm it through the EQ doorbell arm bit. The owner_bits phase mechanism (GdmaCqeInfo/GdmaEqeInfo) is how the driver detects new entries without a tail register.
  • DMAPool: GDMA queues and TX/RX buffers would be allocated from a labeled DMA pool through the selected DMA backend (cloud-dma-backend-selection: direct IOMMU vs labeled bounce buffer), with quiesce/scrub-before-reuse and host-physical-address / IOVA non-exposure. The GdmaSge address fields are IOVAs from that pool; the current implementation does not allocate or program any DMA.
  • Fail-closed / validation rules: the encode/decode logic is the fail-closed boundary capOS implements today – unknown request/queue/event/completion types and command codes are rejected, reserved fields are MBZ-enforced, and bitfields are range-checked. Stale-generation rejection, BAR bounds, doorbell scoping, and release/reset/VF-revocation teardown are the live driver’s responsibility and are future work.
  • QEMU-emulable vs hardware-only: none of MANA is QEMU-emulable – QEMU has no MANA device model. The wire logic here is provable only by the host conformance suite (cargo test-lib); SR-IOV VF revocation/hot-remove semantics in particular cannot be reproduced even by a hypothetical QEMU MANA device model and remain a live-hardware concern.
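The owner_bits phase mechanism named in the Interrupt item can be sketched as follows. This is an illustration of how a future live driver might poll a completion ring, not code that exists in capos-lib; CqeInfo, decode_cqe_info, PollState, and classify are hypothetical names. The cqe_info bit layout (24-bit wq_num, is_sq at bit 24, four reserved bits, owner_bits in the top three bits) and the comparison against the previous and current ring pass are assumptions drawn from a reading of the upstream reference driver, which also appears to start the head counter at the ring size so that a zero-filled ring reads as empty.

```rust
const OWNER_BITS: u32 = 3;
const OWNER_MASK: u32 = (1 << OWNER_BITS) - 1;

/// Hypothetical decoded view of the 32-bit `cqe_info` word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CqeInfo {
    pub wq_num: u32,
    pub is_sq: bool,
    pub owner_bits: u32,
}

/// Decode `cqe_info` from little-endian bytes. Returns None on a short
/// buffer or when the assumed reserved bits 25..29 are not zero.
pub fn decode_cqe_info(buf: &[u8]) -> Option<CqeInfo> {
    let bytes: [u8; 4] = buf.get(0..4)?.try_into().ok()?;
    let v = u32::from_le_bytes(bytes);
    if (v >> 25) & 0xf != 0 {
        return None;
    }
    Some(CqeInfo {
        wq_num: v & 0x00ff_ffff,
        is_sq: (v >> 24) & 1 != 0,
        owner_bits: v >> 29,
    })
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PollState {
    /// Entry still carries the previous pass's phase: nothing new.
    Empty,
    /// Entry carries the current pass's phase: safe to consume.
    Ready,
    /// Any other phase: the producer has lapped the consumer.
    Overflow,
}

/// Classify the entry at `head` by comparing its owner bits with the phase
/// of the previous and current pass over a ring of `num_cqe` entries.
/// Returns None if `num_cqe` is zero.
pub fn classify(owner_bits: u32, head: u32, num_cqe: u32) -> Option<PollState> {
    let pass = head.checked_div(num_cqe)?;
    let old_bits = pass.wrapping_sub(1) & OWNER_MASK;
    let new_bits = pass & OWNER_MASK;
    Some(if owner_bits == old_bits {
        PollState::Empty
    } else if owner_bits == new_bits {
        PollState::Ready
    } else {
        PollState::Overflow
    })
}
```

Because the phase is carried in each entry, the consumer only needs its own head counter and the ring size to decide whether an entry is new, which is what lets the driver poll without reading a device tail register.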