Security And Verification Backlog
Detailed decompositions for security and verification work. docs/tasks/README.md
links here but should not inline these subtasks.
Stage-6 Trust-Boundary Refresh
- Refresh trust-boundary docs after Stage 6 IPC/capability-transfer work.
Untrusted-Service Hardening Pass
Cover unmapped pointers, kernel-half pointers, invalid capability IDs, corrupted rings, SQ/CQ overflow behavior, and a service without Console authority. Audit manifest, ELF, SQE, params, and result-buffer paths so untrusted input fails closed instead of reaching kernel panic paths.
Completed context:
- Panic-surface inventory: audited
panic!,assert!,unwrap, andexpectreachable from manifest, ELF, SQE, params, result-buffer, IPC, and spawn inputs. - Ring/user-pointer hostile demos: added unmapped params/result-pointer,
kernel-half params-path, invalid-capability-ID, corrupted RETURN
call_id, corrupted SQ/CQ head, undersized-params, undersized-result, and SQ/CQ overflow coverage. - No-authority smoke: empty-CapSet service verifies expected cap lookups
fail and invalid-cap CALLs return controlled CQEs; after removal of
syscall 0, it proves a no-authority process cannot write and can only
exit/cap_enter.
Remaining decomposition:
- Quota and exhaustion smokes (
make run-untrusted-exhaustion, two QEMU passes; covered 2026-05-25 06:42 EEST):- Cap-table and endpoint-queue exhaustion fail closed without corrupting
existing calls. Endpoint-queue is proven by the small-scratch core pass
(per-owner queue ceiling ->
Overloaded, then a held console call still completes). Cap-table is proven by the small-scratch core pass and the default-profile*-captablecompanion pass: single-frameMemoryObjectallocations first return boundedFrameAllocatorsuccess replies, then continue until the per-process cap-slot ledger fails closed (Overloaded: failed to reserve MemoryObject cap slot); a held console call still completes after the boundary. - Scratch/result-buffer pressure returns controlled errors and later
valid calls still complete (core pass: ring-scratch oversize CALL
rejected with
CAP_ERR_INVALID_REQUEST, reply-scratch clamp returns a serialized exception, then a valid console write completes). - Repeated invalid submissions stay bounded: each structurally invalid
SQE returns a controlled per-SQE error CQE and the ring stays usable (a
recovery NOP completes). Note: the per-key token-bucket log aggregation
in
docs/authority-accounting-transfer-design.md§3 (D1/D2 suppressed- count summary line) is still a design target, not implemented; the smoke asserts bounded per-SQE rejection, not the summary line. - Frame-grant-page exhaustion: not cleanly reachable from a smoke. For
single-page allocations the cap-slot ceiling (
PROCESS_CAP_SLOT_LIMIT, 256) is reached far before the frame-grant ledger (PROCESS_FRAME_GRANT_PAGE_LIMIT, 4096 pages), and reaching 4096 grant pages needs large contiguous allocations whose failure mode is physical fragmentation, not the grant ledger. The cap-table pass exercises the same fail-closed preflight-reserve path. Remaining gap.
- Cap-table and endpoint-queue exhaustion fail closed without corrupting
existing calls. Endpoint-queue is proven by the small-scratch core pass
(per-owner queue ceiling ->
- Fail-closed cleanup: the
FrameAllocatorsuccess-path result serialization now honors the caller’s effective reply-scratch capacity, so small-scratch processes can receive boundedMemoryObjectresult caps before cap-slot exhaustion fails closed. Closed bysecurity-reply-scratch-success-path-limit-local-proof.
Kani Harness Bounds Refresh
- Revisit Kani harness bounds and proof shape once capability transfer,
resource accounting, or user-buffer validation has more concrete proof
obligations. Keep current bounds practical for
make kani-lib; expand only when the added verifier cost buys a specific kernel invariant.
DMA Assurance Model Operationalization
dma-assurance-model-v0 (2026-05-24) landed the accepted proposal
(docs/proposals/dma-assurance-model-proposal.md) and inspectable-only TLA+/Alloy
skeletons (models/dma/), but stopped there: no run target, no CI gate, no
reconciliation with DMA code landed since. Kickoff task:
dma-assurance-model-operationalization
(decomposition — reconciled the v0 model with landed code and emitted the
per-tool slices below).
- Reconcile
models/dma/with landed invariants (ownership-generation on recycle, map-record-before-PTE-install ordering, drive-pin, epoch fence, scrub-before-free): gap table inmodels/dma/README.mdgrounded against the landed symbols, done 2026-06-04. -
make model-dma-tla— bounded TLC run ofdma_authority.tla(pinned TLC 2.19 / tla2tools 1.7.4 + pinned Temurin JRE 17.0.19), lifecycle ordering plus generation-keyed stale completion, record-before-PTE-install split, drive-pin/quarantine, and queue-enable epoch-fence interleavings, checked clean at 2 devices / 2 domains / 2 pages / 2 iovas, generations 0..1, done 2026-06-04:dma-assurance-model-tla-checked-gate. -
make model-dma-alloy— Alloy analysis ofdma_authority.als(pinned Alloy Analyzer 6.2.0), device/domain/IOVA/page/alias authority graph plus the ownership-generation stale-handle gate, checked at scopefor 4, done 2026-06-04:dma-assurance-model-alloy-checked-gate. -
make kani-dma-authority— bounded Kani over an extracted pure DMA-authority core (capos_lib::dma_authority: ownership-generation bump on recycle, stale-handle rejection without mutation, no-re-expose before completion),make kani-libstyle, done 2026-06-04. Faithful extraction of thedevice_dma.rsauthority arithmetic; routing the kernel call site through the core is a tracked follow-up (kernel isno_std/no_main, not host-built):dma-assurance-model-kani-authority-core. -
make model-dma-deferred-completion-loom— focused Loom (pinned 0.7.2) over theDeferredCompletionQueuereservation budget and the multi-CPU TLB shootdown generation re-read (deferred-EOI / completion concurrency the ring Loom does not cover), done 2026-06-04:dma-assurance-model-deferred-completion-loom. - CI wiring into the GitHub gate and local aggregate now that each target has
a checked result.
make dma-assurance-model-checkruns Alloy/TLA+/Loom/ Kani locally whencargo-kaniis installed; GitHub CI runsmodel-dma-alloy,model-dma-tla, andmodel-dma-deferred-completion-loomindma-assurance-models, andkani-dma-authorityinkani-proofs. Done 2026-06-05:dma-assurance-model-ci-wiring.
Scheduler & IRQ Assurance Models
The scheduler is the densest unmodeled concurrency surface in the kernel
(per-CPU atomics read lock-free from ISR context while another path holds the
scheduler mutex via try_lock, plus IPI cross-CPU activation) and has zero
formal coverage today (smoke + measured suppressed-tick counters only). The IRQ
MSI-X waiter race was fixed by reproduction, not a model. Mirrors the DMA
operationalization pattern; tasks reuse the TLC/Alloy/Kani pins that track lands.
- S1
scheduler-nohz-activation-model(done 2026-06-04 09:00 UTC) – TLA+/TLC for the nohz activation/rollback lifecycle + a focused Loom for the lock-freeNOHZ_ACTIVE_CPUSbit vs lockednohz_activation[slot]record race.make model-scheduler-nohz-tlachecks no timer-less CPU (NoTimerlessStall+EventuallyReArmed), bit/record agreement (EventuallyConsistent), and that a staled remote activation is dropped not applied to a newer lease (NoStaleActivation+StaledRecordEventuallyCleared);make model-scheduler-nohz-loomchecks the lock-free-bit ↔ locked-record reconciliation keeps the timer armed. Checked results + mutation/non-vacuity evidence inmodels/scheduler/README.md. - S2
scheduler-lapic-oneshot-timer-model(done 2026-06-04) – Kani over the extracted pure count/clamp arithmetic (capos_lib::clockevent) + a TLA+ mode-transition lemma pinning the halt-first reprogram ordering.make kani-lapic-oneshotproves the clamp window is well-formed, the armed count is in[1, u32::MAX]with nou128overflow, and the count round-trips to the request within one LAPIC count (3/3 SUCCESSFUL).make model-scheduler-lapic-oneshot-tlachecks that after the halt-first reprogram the next fire is the one-shot at the armed count, never the periodic reload (OneshotModeBoundedCount+HaltedDisarmed+ theperiodicFiredInOneshotModesentinel), and that every fire path (user- and kernel-mode consumption) restores a timer source (NoTimerlessStall+EventuallyReArmed/FiredEventuallyRestoredliveness). Checked results + mutation/non-vacuity evidence inmodels/scheduler/README.md. - S3/S4
scheduler-cpu-isolation-lease-authority-model(done 2026-06-04 07:04 UTC) – Alloy for the lease/grant relational invariants + TLA+ for the two-lock teardown and the documented non-atomic createLease-vs-revokeGrant SMP window.make model-scheduler-lease-alloychecks the unforgeable grant->lease binding, no live lease through a revoked grant outside the explicitly modeled bounded window, capacity never undercounting a live lease, and the stale-handle generation gate;make model-scheduler-lease-tlachecks generation advances exactly once per termination, no capacity double-free, the single chokepoint always runs unregister + SQPOLL-stop + nohz-rollback before recycle, no stranded generation (liveness), and that the renew deadline branch never resurrects. Checked results + non-vacuity evidence inmodels/scheduler/README.md. - IRQ
irq-msix-waiter-determinism-model(done 2026-06-04 06:10 UTC) – TLA+/TLC for the waiter <-> delivery <-> deferred-EOI ordering the RX MSI-X waiter fix established.make model-irq-waiter-tlachecks no spurious/early injection (NoCompletedEarly), exactly-once delivery/EOI/completion accounting, EOI drain before route re-arm (EpochDrainSound), and theNoLostWakeliveness property; checked result + mutation evidence inmodels/irq/README.md.
Preserved Completed Security Context
These are completed and should not be re-read by default. They remain here so
future work can find their design context without bloating docs/tasks/README.md.
- Authority graph and resource accounting design for transfer model:
docs/authority-accounting-transfer-design.md. - Supply-chain and generated-code TCB hardening: pinned Limine and
external build downloads, generated-code drift checks, dependency policy,
pinned Cap’n Proto compiler, shared
tools/capnp-build, and deterministic generated-binding comparisons. - DMA isolation model before PCI/virtio/user-driver work:
docs/dma-isolation-design.mddefines short-term QEMU bounce-buffer decision,DMAPool,DeviceMmio,Interruptinvariants, and the userspace-driver transition gate. - ELF parser arbitrary-input coverage: proptest coverage plus a bounded
cargo-fuzztarget. - Telnet IAC filter fuzz coverage:
TelnetFilterextracted tocapos-lib::telnet,fuzz/fuzz_targets/telnet_filter.rsexercises the state machine with structural assertions (Normal/AfterIac emission rules, monotonic emit count). Will travel with the parser when networking moves to userspace perdocs/proposals/networking-proposal.md. - Telnet IAC filter differential round-trip fuzzing
(
fuzz/fuzz_targets/telnet_filter_roundtrip.rs): synthesize arbitrary RFC 854 event streams (Data, WILL/WONT/DO/DONT, SB blocks with payload), encode to wire bytes, and assert that filter output equals the concatenatedDatapayloads. Found a real EXOPL handling bug in the original filter – the option byte right afterIAC SBwas being interpreted as the start of anIAC IACescape, leaving the filter stuck in the subnegotiation state and silently dropping all subsequent data bytes. Fixed via a newAfterSbstate that consumes the option byte unconditionally before entering payload parsing. - Line discipline extraction and fuzz coverage: pure
LineDisciplinelives incapos-lib::line_discipline, returningLineStep { outcome, echo }descriptions. The kernel transport drives it and translatesEcho::Byte/Echo::Backspace/Cancelled/ Submitted/Reprompt into the existingsend_*_track_crcalls. Backed byfuzz/fuzz_targets/line_discipline.rswith structural invariants (line_len <= max_bytes, ±1 line_len delta per Pending step, Cancelled clears, echo only when buffer grows/shrinks). - Future: differential fuzzing against an external Telnet library (libtelnet or a Rust port) to catch RFC conformance bugs the structural and round-trip targets cannot express. Tracked as a follow-up to Track S.14.
- Ring SQE wire validation extraction and fuzz coverage: lifted the
per-opcode
*_sqe_has_unsupported_fieldspredicates from the kernel intocapos_config::ring, exposed a unifiedsqe_wire_validation_errorentry point, and reroute the kernel through it. Addedfuzz/fuzz_targets/sqe_validation.rsplus 12 host unit tests covering the classification boundaries each opcode imposes. Closes the originally planned three-parser fuzz set (elf::parse,manifest::decode, ring SQE decoder). - Well-formed SQE generator oracle
(
docs/tasks/done/2026-06-06/security-sqe-well-formed-generator-fuzz-local-proof.md): the test/fuzz-onlysqe-validation-oraclefeature exposescapos_config::ring::sqe_oracle, which generates validator-accepted SQEs for every opcode accepted by the current build, plus one-field rejecting mutations for reserved fields, unsupported flags, constrained cap/pointer/size fields, opcode-specific constraints, and session-disclosure reserved bits. The existingsqe_validationfuzz target keeps arbitrary-byte coverage and runs the positive/negative oracle on each input. This closes the shared wire-validator oracle gap only; it does not claim cap-table lookup, userspace pointer mapping, transfer-descriptor loading, or full kernel ring semantic coverage. - Track S.17 – sanitizers on host tests – partially landed.
make sanitizer-host-testsruns AddressSanitizer over thecapos-libandcapos-confighost suites (crate set / features mirror thetest-lib/test-configaliases). Outcome: zero findings – both suites pass clean under ASan, including the namedunsafesuspects (FrameBitmap slot indexing, CapTable generation counters,lazy_bufferraw&mut [u8]). The “cheap to add” claim holds for ASan only: it needs no-Zbuild-stdbecause its libc interceptors cover the uninstrumented precompiled std. - ThreadSanitizer (make sanitizer-host-tests-tsan) is blocked upstream, not by a capOS defect. TSan changes the crate ABI, so rustc refuses to link sanitized code against the uninstrumented precompiled std (mixing -Zsanitizer will cause an ABI mismatch). Instrumenting std needs-Zbuild-std, which then fails withduplicate lang item in crate core: sizedfor build-script-bearing dependencies (typenum / libc / cfg-if / subtle) when the sanitizer target equals the host triple – reproduced four ways (plain-Zbuild-std, renamed--targetJSON spec,target-applies-to-host=false, and-Zhost-config). The TSan target is kept wired so it starts passing once the upstream-Zbuild-std+ build-script issue is fixed. capOS concurrency invariants are meanwhile covered by the dedicated Loom model (cargo test-ring-loom); these host unit tests spawn no threads, so TSan’s marginal value here is low.