x2APIC and APIC Virtualization

Research note for the SMP Phase C LAPIC/IPI decision. The goal is to decide how x2APIC should fit after the current LAPIC/IPI implementation work and to record which virtualization facts affect that choice.

Status note (2026-06-06): The x2APIC backend has landed in kernel/src/arch/x86_64/lapic.rs. The BSP checks CPUID.01H:ECX.x2APIC at boot and prefers x2APIC MSR access when available, falling back to xAPIC MMIO otherwise. AP initialization follows the BSP-selected mode. The selected-mode QEMU proof is make run-interrupt-grant-x2apic, which forces +x2apic, asserts LapicMode::X2Apic, and reuses the routed Interrupt.wait / Interrupt.acknowledge path. That proof is bounded to QEMU backend selection; it is not evidence of high-core hardware readiness.

Existing Local Research

Before adding this note, docs/research/ contained:

  • capnp-error-handling.md
  • completion-ring-threading.md
  • eros-capros-coyotos.md
  • genode.md
  • ix-on-capos-hosting.md
  • llvm-target.md
  • os-error-handling.md
  • out-of-kernel-scheduling.md
  • pingora.md
  • plan9-inferno.md
  • sel4.md
  • small-llm-survey.md
  • zircon.md

None of those files directly cover APIC/x2APIC or KVM APIC virtualization.

Sources Checked

Local verification:

  • Host command qemu-system-x86_64 --version reported QEMU 8.2.2.
  • Host command qemu-system-x86_64 -cpu help listed x2apic as a recognized CPUID feature.
  • The current capOS LAPIC implementation has both xAPIC MMIO and x2APIC MSR backends. The BSP selects x2APIC when CPUID or firmware state makes it available and otherwise falls back to xAPIC MMIO.
  • make run-interrupt-grant-x2apic uses -cpu qemu64,+smep,+smap,+rdrand,+x2apic, asserts the selected LapicMode::X2Apic backend, and proves the routed interrupt waiter / deferred-EOI acknowledgement path still works in that mode.

x2APIC Findings

x2APIC is still the forward-looking LAPIC backend for later hardware and VM coverage:

  • It avoids mapping the local APIC MMIO page and uses architectural MSRs for local APIC register access.
  • It supports wider APIC IDs than xAPIC’s 8-bit destination model, which keeps the CPU-id/LAPIC-id split introduced by the SMP proposal relevant on larger systems and VMs.
  • Intel’s current public guidance says x2APIC is required above 255 cores, newer Intel client families default to x2APIC, and legacy xAPIC can become unavailable or locked out after firmware or system software enters x2APIC.
  • The local capOS dependency set already has x86_64 MSR access, and the implemented x2APIC backend covers EOI, ICR/IPI, spurious vector, LVT timer, timer initial count, divide config, and current APIC ID without adding another architecture crate.
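To make the register list above concrete, here is a minimal sketch of how the xAPIC MMIO offsets map onto x2APIC MSR indices, and how the ICR destination field differs between the two modes. The constant and function names are hypothetical and are not taken from lapic.rs; the offsets, the 0x800 MSR base, and the ICR layouts follow the architectural definitions. The sketch only computes values, so it runs in user space without touching MSRs or MMIO.

```rust
// Hypothetical names; not the identifiers used in capOS's lapic.rs.
// xAPIC MMIO register offsets, relative to the local APIC base page.
const XAPIC_ID: u32 = 0x020;
const XAPIC_EOI: u32 = 0x0B0;
const XAPIC_SVR: u32 = 0x0F0; // spurious interrupt vector register
const XAPIC_ICR_LOW: u32 = 0x300;
const XAPIC_LVT_TIMER: u32 = 0x320;
const XAPIC_TIMER_INITIAL_COUNT: u32 = 0x380;
const XAPIC_TIMER_DIVIDE: u32 = 0x3E0;

/// ICR bit 14 (level = assert), set here for a fixed-delivery IPI.
const ICR_LEVEL_ASSERT: u32 = 1 << 14;

/// x2APIC MSR index for an xAPIC MMIO offset: 0x800 + (offset / 16).
const fn x2apic_msr(mmio_offset: u32) -> u32 {
    0x800 + (mmio_offset >> 4)
}

/// xAPIC ICR as a (high, low) pair of 32-bit MMIO writes.
/// The 8-bit destination sits in bits 31:24 of the high word.
fn xapic_icr(dest: u8, vector: u8) -> (u32, u32) {
    ((dest as u32) << 24, ICR_LEVEL_ASSERT | vector as u32)
}

/// x2APIC ICR as a single 64-bit MSR value.
/// The full 32-bit destination sits in bits 63:32.
fn x2apic_icr(dest: u32, vector: u8) -> u64 {
    ((dest as u64) << 32) | (ICR_LEVEL_ASSERT | vector as u32) as u64
}

fn main() {
    let regs = [
        ("APIC ID", XAPIC_ID),
        ("EOI", XAPIC_EOI),
        ("spurious vector", XAPIC_SVR),
        ("ICR", XAPIC_ICR_LOW),
        ("LVT timer", XAPIC_LVT_TIMER),
        ("timer initial count", XAPIC_TIMER_INITIAL_COUNT),
        ("timer divide config", XAPIC_TIMER_DIVIDE),
    ];
    for (name, offset) in regs {
        println!("{name}: MMIO {offset:#05x} -> MSR {:#05x}", x2apic_msr(offset));
    }
    // An APIC ID of 0x100 does not fit the 8-bit xAPIC destination field.
    println!("x2APIC ICR for dest 0x100, vector 49: {:#018x}", x2apic_icr(0x100, 49));
    println!("xAPIC ICR for dest 1, vector 49: {:x?}", xapic_icr(1, 49));
}
```

The destination types carry the point from the second bullet: u8 in xapic_icr versus u32 in x2apic_icr is why a separate CPU-id to LAPIC-id mapping stays useful once APIC IDs exceed eight bits.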

The implementation shape is:

  1. Keep the xAPIC MMIO LAPIC timer/IPI foundation as the fallback for older hardware and VM configurations that only expose xAPIC.
  2. Select x2APIC when CPUID.01H:ECX.x2APIC is available or when firmware has already enabled/locked x2APIC.
  3. Keep TLB shootdown, timer, EOI, and device-vector paths on the architectural LAPIC interface rather than on KVM paravirtual APIC helpers.
  4. Treat larger-APIC-ID and high-core hardware validation as future hardware evidence; the current selected-mode QEMU proof covers backend selection and the routed waiter/ack path only.
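The selection rule in steps 1 and 2 can be sketched as a pure function over the two inputs the BSP would read: CPUID.01H:ECX and the IA32_APIC_BASE MSR. This is an illustration under assumptions, not the code in lapic.rs. LapicMode::X2Apic is the name the proof asserts, but the XApic variant name, the function name, and the choice to pass raw register values as parameters are hypothetical. The bit positions (CPUID.01H:ECX bit 21, IA32_APIC_BASE bits 10 and 11) are architectural.

```rust
/// Backend choice; `X2Apic` matches the name asserted by the QEMU proof,
/// `XApic` is a hypothetical name for the MMIO fallback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LapicMode {
    XApic,
    X2Apic,
}

/// CPUID.01H:ECX bit 21 advertises x2APIC support.
const CPUID_01_ECX_X2APIC: u32 = 1 << 21;
/// IA32_APIC_BASE bit 10 (EXTD): x2APIC mode is enabled.
const APIC_BASE_EXTD: u64 = 1 << 10;
/// IA32_APIC_BASE bit 11 (EN): the local APIC is globally enabled.
const APIC_BASE_EN: u64 = 1 << 11;

/// Prefer x2APIC when CPUID exposes it or firmware already entered it;
/// otherwise fall back to xAPIC MMIO.
fn select_lapic_mode(cpuid_01_ecx: u32, ia32_apic_base: u64) -> LapicMode {
    let cpuid_x2apic = (cpuid_01_ecx & CPUID_01_ECX_X2APIC) != 0;
    let both = APIC_BASE_EN | APIC_BASE_EXTD;
    let firmware_x2apic = (ia32_apic_base & both) == both;
    if cpuid_x2apic || firmware_x2apic {
        LapicMode::X2Apic
    } else {
        LapicMode::XApic
    }
}

fn main() {
    // CPUID exposes x2APIC, firmware left the APIC in xAPIC mode.
    println!("{:?}", select_lapic_mode(CPUID_01_ECX_X2APIC, APIC_BASE_EN));
    // No CPUID bit and no firmware x2APIC state: MMIO fallback.
    println!("{:?}", select_lapic_mode(0, APIC_BASE_EN));
}
```

Keeping the decision free of side effects lets the same rule be unit-tested on the host, while the kernel supplies the real CPUID and MSR values; APs would then reuse the BSP's result rather than re-running the check.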

Virtualization Findings

Virtualization is relevant to validation and future performance, not to the guest-visible correctness contract:

  • QEMU/KVM can expose x2APIC through CPU model feature selection. capOS tests should make that explicit by extending the current QEMU model to -cpu qemu64,+smep,+smap,+rdrand,+x2apic, or by using another named CPU model with +x2apic, instead of relying on the host or accelerator default.
  • KVM exposes APIC state through its own API and has x2APIC-specific handling for 32-bit APIC IDs. That matters to the VMM, but a capOS guest should use the architectural x2APIC interface.
  • QEMU/KVM paravirtual features such as kvm-pv-eoi, kvm-pv-ipi, and kvm-pv-tlb-flush are optional accelerations. They should not be part of the first LAPIC/IPI or TLB-shootdown proof because they would make correctness depend on a Linux/KVM-specific host contract.
  • APIC virtualization features such as APICv or AMD AVIC are VMM-side acceleration mechanisms. capOS should not require or detect them before it has a stable architectural x2APIC path.

The practical QEMU proof targets are therefore:

  1. Boot the current xAPIC MMIO LAPIC implementation with -smp 2.
  2. Prove LAPIC timer ticks on vector 48 and IPI delivery on vector 49.
  3. Keep KVM paravirtual APIC/TLB/IPI features disabled or ignored for the first correctness proof.
  4. Run make run-interrupt-grant-x2apic as the selected-mode x2APIC proof, using -cpu qemu64,+smep,+smap,+rdrand,+x2apic and asserting the selected backend plus the routed interrupt wait/ack path.
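As an illustration of keeping the CPU model explicit across these targets, the sketch below builds the -cpu and -smp arguments for an xAPIC run and an x2APIC run. The helper names are hypothetical and this is not how the capOS Makefile is structured. The baseline model string is inferred from the x2APIC command line quoted above with +x2apic removed, and the CPU count for the x2APIC proof is left as a parameter because the note does not state it.

```rust
/// Build the `-cpu` value. The baseline features are inferred from the
/// x2APIC proof's command line; `+x2apic` is added only when forced.
fn qemu_cpu_arg(force_x2apic: bool) -> String {
    let mut parts = vec!["qemu64", "+smep", "+smap", "+rdrand"];
    if force_x2apic {
        parts.push("+x2apic");
    }
    parts.join(",")
}

/// Build the CPU-related QEMU arguments for one proof run.
fn qemu_cpu_args(force_x2apic: bool, cpus: u32) -> Vec<String> {
    vec![
        "-cpu".to_string(),
        qemu_cpu_arg(force_x2apic),
        "-smp".to_string(),
        cpus.to_string(),
    ]
}

fn main() {
    // Targets 1 and 2: xAPIC MMIO fallback with two CPUs.
    println!("{}", qemu_cpu_args(false, 2).join(" "));
    // Target 4: selected-mode x2APIC proof (CPU count is an assumption here).
    println!("{}", qemu_cpu_args(true, 2).join(" "));
}
```

Spelling the feature list out in one place means neither run depends on the host or accelerator default for x2APIC exposure.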

capOS Recommendation

Keep x2APIC as the preferred backend when CPUID or firmware state exposes it, with xAPIC MMIO as the fallback. Keep correctness on the architectural LAPIC timer, IPI, EOI, and device-vector paths; KVM paravirtual APIC/TLB/IPI features remain optional accelerations rather than proof dependencies. Do not treat the selected-mode QEMU proof as evidence of high-core hardware readiness.