# Whitepaper Plan

This document plans a future capOS whitepaper / technical report. Its job is
**not** to draft the paper, but to keep the paper's evidence requirements
visible so that ongoing capOS development can be sequenced to produce them.

Treat this file the way `WORKPLAN.md` is treated: keep it current with user
direction, prune as items close, and cross-link to `docs/roadmap.md`,
`WORKPLAN.md`, and the relevant proposals rather than duplicating them.

Companion files:
- `outline.md` -- section-by-section sketch of the future paper.
- `evidence-gaps.md` -- live tracking of which paper claims still lack
  artifact-level evidence and which workplan slice will close each gap.

## Format and Release Metadata

- Format: whitepaper / technical report. No conference page limit.
- Release metadata is intentionally out of scope for this planning page.
- Deadline: none. Quality drives sequencing; the paper does not freeze before
  Tier-1 evidence is in place.

## Working Title

*Schema-as-ABI: Typed Capabilities and Ring-Transport Dispatch in capOS.*

## Thesis

A research OS where Cap'n Proto schemas serve simultaneously as

1. the kernel ABI,
2. the only access-control mechanism (the interface *is* the permission),
3. the IPC wire format, and
4. the persistence and network-transparency substrate,

replacing the parallel rights bitmasks, untyped IPC primitives, and C-header
syscall conventions that prior capability OSes still carry.

Position relative to prior art:
- **seL4** -- formally verified, but rights are kernel-managed bitmasks and
  IPC payloads are untyped message registers.
- **Zircon** -- typed kernel objects, but the typing lives in C handle
  conventions and a fixed syscall surface, not a schema language.
- **EROS / CapROS** -- persistent capabilities, but no schema-as-wire layer
  on top of the capability primitives.
- **Genode** -- component-typed RPC, but RPC is built on top of kernel IPC
  primitives that are not themselves the wire format.
- **Plan 9** -- "everything is a file" universal interface, but file I/O is
  a single weak type compared to schema-typed methods.

The unique claim under evaluation is that one schema layer subsumes the
rights, ABI, IPC, persistence, and network-transparency mechanisms that
those systems each provide separately.

## Core Claims and Required Evidence

The paper stands on five claims. Each must be backed by an artifact, not by
description alone.

| Claim | Required artifact | Status |
|---|---|---|
| C1. Schema-typed methods can replace parallel rights bitmasks. | Working capability dispatch where authority narrowing happens via wrapper caps, not flag bits. Session-bound invocation context replaces caller-selected endpoint identity without generic rights flags. | Partial. Delegated-relabel containment, transitional representation substrate, the historical service-object routing proof, and the Gate 1 process-session invariant proof are done; endpoint caller-session metadata and shared-service migration remain open. |
| C2. A shared-memory ring with one progress syscall is a sufficient kernel boundary. | Implemented `cap_enter` + SQ/CQ + opcodes covering CALL/RECV/RETURN/RELEASE/NOP. Loom model. Real latency/throughput numbers vs. a syscall-per-call baseline. | Implementation done. **Numbers missing.** |
| C3. The wire format enables persistence. | A capability whose state is serialized via its schema and restored across kernel reboot. Even a RAM-backed `Store` proof-of-concept is sufficient. | **Missing.** |
| C4. The wire format enables network transparency. | One capability invocation crossing TCP between two capOS instances (or between QEMU and a host stub) using the same schema. | **Missing.** |
| C5. The kernel can be developed and verified at this size with practical tooling. | Kani/Loom/Miri/panic-inventory artifacts plus the workplan/review discipline. | Strong; needs distillation. |

The paper should not promise more than the artifacts deliver. Limitations
go in their own section, written before claims tighten.

## Tier-1 Development Pre-requisites (load-bearing)

Without these, the corresponding claim is bluff.

1. **Session-Bound Invocation Context.** Closes the C1 gap. Active in
   `WORKPLAN.md`; detailed execution lives in
   `docs/backlog/session-bound-invocation-context.md`.
2. **Measurement harness producing real numbers.** Closes the C2 gap.
   Extends `make run-measure` to record:
   - `cap_enter` round-trip latency (warm and cold ring, single CPU)
   - SQE dispatch throughput in ops/sec, single-process baseline
   - Endpoint IPC: ordinary scheduler path vs. direct-switch handoff
   - Schema dispatch overhead vs. the existing `NullCap` baseline
   - Memory cost per capability slot
   Output should be reproducible from a single `make` target and saved
   under a tracked path so the paper can cite specific commits.
3. **Persistence proof-of-concept.** Closes the C3 gap.
   Minimum viable shape: a RAM-backed `Store` that serializes one
   non-trivial capability via its capnp schema, dumps it, restores it after
   kernel reboot in QEMU, and proves a method call still works. This is
   explicitly narrower than the full storage proposal; treat it as paper
   evidence, not the production storage track.
4. **Network-transparency proof-of-concept.** Closes the C4 gap.
   Minimum viable shape: a `RemoteCap` shim that forwards one capnp method
   call over an established TCP capability between two capOS QEMU guests
   (or between QEMU and a host capnp stub), using the existing schema. Naturally
   rides on the SSH/Telnet networking work already in flight.
5. **At least one of {promise pipelining, notification objects}.** Promotes
   the schema-as-ABI claim from synchronous CALL/RECV to capnp-rpc-shaped
   composition. Listed as remaining Stage 6 theme; promote to selected work
   after session-bound invocation context.

## Tier-2 Pre-requisites (strengthening)

These do not block the thesis but materially raise paper quality.

6. **Ring protocol Kani proof.** Currently Loom-only. Promotes "verified
   core" claim from "we run tools" to "we have proofs."
7. **Full concurrent SMP scheduling** (SMP Phase C beyond first AP
   scheduler-owner proof). Removes a single-CPU asterisk from the evaluation.
8. **SSH Shell Gateway end-to-end.** Paused behind the selected
   Session-Bound Invocation Context milestone. When resumed, it gives the system-composition
   section one concrete external-facing artifact rather than only smoke
   binaries.
9. **One non-toy demo beyond Adventure / First Chat.** Either a
   storage-backed service or the federated-chat sketch, depending on
   which composes better with the persistence and network PoCs.

## Tier-3 (out of scope for first paper)

Acknowledge in Future Work; do not block on:

- aarch64 port
- GPU / cloud
- live upgrade
- formal MAC/MIC modeling
- Go runtime / WASI / POSIX
- production volume encryption

## Recommended Sequencing

Paper-driven priority order, mapped onto the existing roadmap rather than
parallel to it:

1. Finish current selected milestone (**Session-Bound Invocation Context**).
   Closes C1 and anchors §5.5.
2. **Measurement harness.** Closes C2. Parallelizable with (1); touches the
   `measure` feature only.
3. **Ring Kani proof.** Parallelizable; security/verification track owns it.
4. **Persistence PoC.** Closes C3. Scope explicitly to paper evidence, not
   the full storage proposal.
5. **Network-transparency PoC.** Closes C4. Rides on existing network work.
6. **Promise pipelining or notifications.** Pick whichever composes better
   with the persistence and network PoCs.
7. **SMP concurrent scheduling.** Lifts evaluation quality.

When any of these is selected as a workplan slice, the slice description
should reference this file so the paper-evidence motivation stays visible
alongside the system motivation.

## Honest Framing Notes

- The collaboration methodology section should describe the actual workflow:
  dedicated worktrees per task, mandatory verification gates, the
  `WORKPLAN.md` / `REVIEW.md` / `REVIEW_FINDINGS.md` discipline, and the
  task-branch merge-back-to-main rule. This is itself a contribution: it
  shows what process structure makes draft-to-review flow tractable on a
  kernel-shaped artifact.
- Verification claims must state exactly what Kani / Loom / Miri / panic
  inventory cover and what they do not. No formal-methods overstatement.
- Capability-OS novelty against seL4 / EROS / Genode must be specific
  about what the schema-as-wire layer adds beyond what those systems
  already provide. Generic capability-OS claims are not novel.
- Persistence and network-transparency claims must lean only on the actual
  PoCs once they exist, not on the shape of the future storage and
  networking proposals.

## Process

- This file lives on its own task branches under `paper/...`.
- Edits to this file follow the same dedicated-worktree rule as everything
  else in the repo.
- The paper sources live at `papers/schema-as-abi/` (Typst). `docs/paper/`
  remains the proposal/reference home: `plan.md`, `outline.md`, and
  `evidence-gaps.md` describe what the paper needs and why, while the
  paper itself (`main.typ`, `references.bib`) lives next to other
  buildable artifacts under `papers/`. Build via `make paper` from the
  repository root; output goes to `target/papers/schema-as-abi/main.pdf`.
- This plan is read alongside `docs/roadmap.md` and `WORKPLAN.md` whenever
  workplan priorities are revisited, so paper-driven motivation does not
  silently drift out of sequencing.
