Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Whitepaper Plan

This document plans a future capOS whitepaper / technical report. Its job is not to draft the paper, but to keep the paper’s evidence requirements visible so that ongoing capOS development can be sequenced to produce them.

Treat this file the way WORKPLAN.md is treated: keep it current with user direction, prune as items close, and cross-link to docs/roadmap.md, WORKPLAN.md, and the relevant proposals rather than duplicating them.

Companion files:

  • outline.md – section-by-section sketch of the future paper.
  • evidence-gaps.md – live tracking of which paper claims still lack artifact-level evidence and which workplan slice will close each gap.

Format and Release Metadata

  • Format: whitepaper / technical report. No conference page limit.
  • Release metadata is intentionally out of scope for this planning page.
  • Deadline: none. Quality drives sequencing; the paper does not freeze before Tier-1 evidence is in place.

Working Title

Schema-as-ABI: Typed Capabilities and Ring-Transport Dispatch in capOS.

Thesis

A research OS where Cap’n Proto schemas serve simultaneously as

  1. the kernel ABI,
  2. the only access-control mechanism (the interface is the permission),
  3. the IPC wire format, and
  4. the persistence and network-transparency substrate,

replacing the parallel rights bitmasks, untyped IPC primitives, and C-header syscall conventions that prior capability OSes still carry.

Position relative to prior art:

  • seL4 – formally verified, but rights are kernel-managed bitmasks and IPC payloads are untyped message registers.
  • Zircon – typed kernel objects, but the typing lives in C handle conventions and a fixed syscall surface, not a schema language.
  • EROS / CapROS – persistent capabilities, but no schema-as-wire layer on top of the capability primitives.
  • Genode – component-typed RPC, but RPC is built on top of kernel IPC primitives that are not themselves the wire format.
  • Plan 9 – “everything is a file” universal interface, but file I/O is a single weak type compared to schema-typed methods.

The unique claim under evaluation is that one schema layer subsumes the rights, ABI, IPC, persistence, and network-transparency mechanisms that those systems each provide separately.

Core Claims and Required Evidence

The paper stands on five claims. Each must be backed by an artifact, not by description alone.

ClaimRequired artifactStatus
C1. Schema-typed methods can replace parallel rights bitmasks.Working capability dispatch where authority narrowing happens via wrapper caps, not flag bits. Session-bound invocation context replaces caller-selected endpoint identity without generic rights flags.Partial. Delegated-relabel containment, transitional representation substrate, the historical service-object routing proof, and the Gate 1 process-session invariant proof are done; endpoint caller-session metadata and shared-service migration remain open.
C2. A shared-memory ring with one progress syscall is a sufficient kernel boundary.Implemented cap_enter + SQ/CQ + opcodes covering CALL/RECV/RETURN/RELEASE/NOP. Loom model. Real latency/throughput numbers vs. a syscall-per-call baseline.Implementation done. Numbers missing.
C3. The wire format enables persistence.A capability whose state is serialized via its schema and restored across kernel reboot. Even a RAM-backed Store proof-of-concept is sufficient.Missing.
C4. The wire format enables network transparency.One capability invocation crossing TCP between two capOS instances (or between QEMU and a host stub) using the same schema.Missing.
C5. The kernel can be developed and verified at this size with practical tooling.Kani/Loom/Miri/panic-inventory artifacts plus the workplan/review discipline.Strong; needs distillation.

The paper should not promise more than the artifacts deliver. Limitations go in their own section, written before claims tighten.

Tier-1 Development Pre-requisites (load-bearing)

Without these, the corresponding claim is bluff.

  1. Session-Bound Invocation Context. Closes the C1 gap. Active in WORKPLAN.md; detailed execution lives in docs/backlog/session-bound-invocation-context.md.
  2. Measurement harness producing real numbers. Closes the C2 gap. Extends make run-measure to record:
    • cap_enter round-trip latency (warm and cold ring, single CPU)
    • SQE dispatch throughput in ops/sec, single-process baseline
    • Endpoint IPC: ordinary scheduler path vs. direct-switch handoff
    • Schema dispatch overhead vs. the existing NullCap baseline
    • Memory cost per capability slot Output should be reproducible from a single make target and saved under a tracked path so the paper can cite specific commits.
  3. Persistence proof-of-concept. Closes the C3 gap. Minimum viable shape: a RAM-backed Store that serializes one non-trivial capability via its capnp schema, dumps it, restores it after kernel reboot in QEMU, and proves a method call still works. This is explicitly narrower than the full storage proposal; treat it as paper evidence, not the production storage track.
  4. Network-transparency proof-of-concept. Closes the C4 gap. Minimum viable shape: a RemoteCap shim that forwards one capnp method call over an established TCP capability between two capOS QEMU guests (or between QEMU and a host capnp stub), using the existing schema. Naturally rides on the SSH/Telnet networking work already in flight.
  5. At least one of {promise pipelining, notification objects}. Promotes the schema-as-ABI claim from synchronous CALL/RECV to capnp-rpc-shaped composition. Listed as remaining Stage 6 theme; promote to selected work after session-bound invocation context.

Tier-2 Pre-requisites (strengthening)

These do not block the thesis but materially raise paper quality.

  1. Ring protocol Kani proof. Currently Loom-only. Promotes “verified core” claim from “we run tools” to “we have proofs.”
  2. Full concurrent SMP scheduling (SMP Phase C beyond first AP scheduler-owner proof). Removes a single-CPU asterisk from the evaluation.
  3. SSH Shell Gateway end-to-end. Paused behind the selected Session-Bound Invocation Context milestone. When resumed, it gives the system-composition section one concrete external-facing artifact rather than only smoke binaries.
  4. One non-toy demo beyond Adventure / First Chat. Either a storage-backed service or the federated-chat sketch, depending on which composes better with the persistence and network PoCs.

Tier-3 (out of scope for first paper)

Acknowledge in Future Work; do not block on:

  • aarch64 port
  • GPU / cloud
  • live upgrade
  • formal MAC/MIC modeling
  • Go runtime / WASI / POSIX
  • production volume encryption

Paper-driven priority order, mapped onto the existing roadmap rather than parallel to it:

  1. Finish current selected milestone (Session-Bound Invocation Context). Closes C1 and anchors §5.5.
  2. Measurement harness. Closes C2. Parallelizable with (1); touches the measure feature only.
  3. Ring Kani proof. Parallelizable; security/verification track owns it.
  4. Persistence PoC. Closes C3. Scope explicitly to paper evidence, not the full storage proposal.
  5. Network-transparency PoC. Closes C4. Rides on existing network work.
  6. Promise pipelining or notifications. Pick whichever composes better with the persistence and network PoCs.
  7. SMP concurrent scheduling. Lifts evaluation quality.

When any of these is selected as a workplan slice, the slice description should reference this file so the paper-evidence motivation stays visible alongside the system motivation.

Honest Framing Notes

  • The collaboration methodology section should describe the actual workflow: dedicated worktrees per task, mandatory verification gates, the WORKPLAN.md / REVIEW.md / REVIEW_FINDINGS.md discipline, and the task-branch merge-back-to-main rule. This is itself a contribution: it shows what process structure makes draft-to-review flow tractable on a kernel-shaped artifact.
  • Verification claims must state exactly what Kani / Loom / Miri / panic inventory cover and what they do not. No formal-methods overstatement.
  • Capability-OS novelty against seL4 / EROS / Genode must be specific about what the schema-as-wire layer adds beyond what those systems already provide. Generic capability-OS claims are not novel.
  • Persistence and network-transparency claims must lean only on the actual PoCs once they exist, not on the shape of the future storage and networking proposals.

Process

  • This file lives on its own task branches under paper/....
  • Edits to this file follow the same dedicated-worktree rule as everything else in the repo.
  • The paper sources live at papers/schema-as-abi/ (Typst). docs/paper/ remains the proposal/reference home: plan.md, outline.md, and evidence-gaps.md describe what the paper needs and why, while the paper itself (main.typ, references.bib) lives next to other buildable artifacts under papers/. Build via make paper from the repository root; output goes to target/papers/schema-as-abi/main.pdf.
  • This plan is read alongside docs/roadmap.md and WORKPLAN.md whenever workplan priorities are revisited, so paper-driven motivation does not silently drift out of sequencing.