# Proposal: mdBook Documentation Site

Turn the existing Markdown documentation into a navigable mdBook site that
explains capOS as a working system, while keeping proposals and research as
deep reference material.

The current docs are useful for agents and maintainers who already know what
they are looking for. They are weaker as a reader path: a new contributor has
to jump between `README.md`, `docs/roadmap.md`, `docs/tasks/README.md`, proposal files,
research reports, and source code before they can form an accurate model of
the system. The mdBook site should fix that by adding a concise, current
system manual above the existing archive.


## Goals

- Make the first reading path obvious: what capOS is, how to build it, what
  works today, and where the important subsystems live.
- Separate implemented behavior from future design, rejected ideas, and
  research background.
- Preserve existing long-form proposal and research documents instead of
  rewriting them prematurely.
- Give architecture pages a repeatable structure so future edits do not turn
  into ad hoc status notes.
- Make validation visible: each architecture page should name the host tests,
  QEMU smokes, fuzz targets, Kani proofs, Loom models, or manual checks that
  support its claims.
- Keep the docs useful from a local clone, without requiring hosted services,
  databases, or custom frontend code.

## Non-Goals

- Replacing `docs/tasks/README.md`. Task records remain operational planning
  documents; `REVIEW_FINDINGS.md` is only a tombstone for older links, and
  `docs/roadmap.md` is now part of the book while still owning long-range
  planning.
- Turning proposals into user manuals by bulk editing every existing document.
  Long proposal files stay as references until a subsystem needs a targeted
  refresh.
- Building a marketing site, blog, changelog, or public product page.
- Adding MDX, React, Vue, custom components, or a JavaScript application layer.
- Automatically generating API reference documentation from Rust or Cap'n
  Proto. That can be evaluated later as a separate documentation track.

## Audience

The site should serve three readers:

- **New contributor:** wants to build the ISO, boot QEMU, understand the
  current architecture, and find the right files to edit.
- **Reviewer:** wants to verify whether a change preserves the intended
  ownership, authority, lifecycle, and validation rules.
- **Future agent:** wants current project context without having to infer the
  system from stale proposals or source code alone.

The primary audience is maintainers and agents, not end users. This matters:
accuracy, status labels, and code maps are more important than a polished
external landing page.

## Current State

The repository already has a substantial Markdown corpus:

- `README.md` explains the project and core commands.
- `docs/roadmap.md` describes long-range stages and visible milestones.
- `docs/tasks/state.toml` tracks the selected milestone; task records under
  `docs/tasks/` track active implementation order.
- `docs/tasks/**` tracks open remediation, review-finding work, and verification
  history.
- `docs/capability-model.md` is a real architecture reference.
- `docs/proposals/` contains accepted, future, exploratory, and rejected
  design material.
- `docs/research/` contains prior-art analysis (the
  `capability-systems-survey.md` synthesis plus per-system deep-dive reports).
- `docs/*-design.md` and inventory files capture targeted design/security
  decisions.

The weakness is not a lack of content. The hard part is keeping the current
manual visibly separate from archival planning, proposal, and research material.

## Site Shape

The mdBook site should be structured as a book, not as a mirror of the file
tree. The current hierarchy is:

- Start Here: reader orientation and commands.
- Runnable Demos: current user-visible proofs.
- System Architecture: current implementation, with code maps and invariants.
- Security and Verification: threat boundaries, validation workflow, and
  security inventories.
- Planning: roadmap, changelog, and backlog links.
- Design Archive: proposal index plus nested active, future, and rejected
  long-form design documents.
- Research Archive: research index plus nested prior-art reports.

All proposal and research files should remain reachable through the sidebar so
mdBook builds them, but they should be nested under their indexes rather than
listed as peer pages beside the current system manual. Sidebar folding should be
enabled so the default reader path stays compact.

## Page Standard

Every architecture page should use this shape:

```md
---
status: "Partially implemented."
last_reviewed: "2026-04-27 10:00 UTC"
description: "Page description."
topics:
  - { key: "capabilities-ipc-and-authority", reason: "Explains authority or invocation behavior." }
---

# Page Title

What problem this subsystem solves and why a reader should care.
```

The preprocessor strips front matter from rendered page content and uses the
metadata to regenerate `docs/topics.md`. A post-build agent asset pass patches
final rendered HTML so `status`, `description`, and `last_reviewed` appear as
page-head metadata without adding visible status blocks to each page. The same
pass adds HTML head discovery links for `llms.txt` and each page's Markdown
mirror.
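
As a rough illustration of the front-matter step, the sketch below splits a page into metadata and rendered body. It is not the project's preprocessor: `split_front_matter` is a hypothetical helper, and it assumes flat `key: "value"` scalars, so the nested `topics` entries are skipped rather than parsed.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Return (scalar metadata, body) for a page that may start with front matter."""
    lines = text.splitlines(keepends=True)
    # Front matter must be the first block and open with a '---' line.
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing '---' delimiter.
    for end in range(1, len(lines)):
        if lines[end].strip() == "---":
            break
    else:
        return {}, text  # unterminated block: treat the whole file as body
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        # Skip indented or list lines such as the `topics` entries.
        if line[:1] in (" ", "\t", "-") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip().strip('"')
        if value:
            meta[key.strip()] = value
    body = "".join(lines[end + 1:]).lstrip("\n")
    return meta, body


page = '''---
status: "Partially implemented."
last_reviewed: "2026-04-27 10:00 UTC"
topics:
  - { key: "capabilities-ipc-and-authority", reason: "Example." }
---

# Page Title
'''
meta, body = split_front_matter(page)
```

A real implementation would more likely use a YAML parser so that `topics` can feed `docs/topics.md`; the hand-rolled parsing here only keeps the example free of dependencies.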

The docs build also emits agent-facing static assets in `target/docs-site`:
`llms.txt`, Markdown source mirrors for pages listed in `docs/SUMMARY.md`,
`sitemap.xml`, `robots.txt`, and a Cloudflare Pages `_headers` file with
discovery links. `robots.txt` includes a comment pointing agents to
`llms.txt`; crawler rules stay in standard `User-agent`, `Allow`, `Disallow`,
`Sitemap`, and `Content-Signal` fields.

The body of the page then uses these sections:

```md
## Current Behavior
What exists in the repo today.

## Design
How it works, with concrete data flow.

## Invariants
Security, lifetime, ownership, ordering, or failure rules.

## Code Map
Important files and entry points.

## Validation
Relevant host tests, QEMU smokes, fuzz/Kani/Loom checks.

## Open Work
Concrete known gaps, linked to task ledger records when relevant.
```

Architecture pages should normally stay between 100 and 300 lines. Longer
background belongs in proposals or research reports.

## Status Vocabulary

Use explicit status labels only where a reader could reasonably confuse
implemented behavior, accepted design, future design, or rejected material.
Status belongs on the page itself only when the page role is not already
obvious from the page type or nearby index. Put this information in YAML front
matter (`status`, `last_reviewed`, `description`, `topics`) as the first block
in the file.

Canonical page-level form:

```md
---
status: "Partially implemented."
last_reviewed: "2026-04-25 11:36 UTC"
description: "Canonical page-level metadata layout."
topics:
  - { key: "capabilities-ipc-and-authority", reason: "Describes authority and invocation behavior." }
---
```

`last_reviewed` is hand-maintained and uses the same minute-precision,
timezone-aware format as status updates in `docs/tasks/README.md`,
`docs/roadmap.md`, and task records. Get it from
`date '+%Y-%m-%d %H:%M %Z'`; do not infer or round from memory. Update the
field only after substantial content edits, the kind that should reset a
reader's trust in the page.
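
A lint step could reject timestamps that do not match this format. The following is a minimal sketch, not an existing project check: `is_valid_last_reviewed` is a hypothetical name, and the pattern assumes the timezone is an uppercase abbreviation such as `UTC`, which is what the examples in this proposal use.

```python
import re
from datetime import datetime

# Assumed shape: "YYYY-MM-DD HH:MM TZ", where TZ is an abbreviation such as UTC.
_STAMP = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}) ([A-Z]{2,5})")


def is_valid_last_reviewed(value: str) -> bool:
    """True when value has minute precision and a timezone abbreviation."""
    match = _STAMP.fullmatch(value)
    if match is None:
        return False
    try:
        # Reject impossible dates and times such as month 13 or 25:00.
        datetime.strptime(match.group(1), "%Y-%m-%d %H:%M")
    except ValueError:
        return False
    return True
```

The timezone is checked only for shape, not against a zone database, because `%Z` output depends on the host that ran `date`.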

Use one of these labels:

- **Implemented:** behavior exists in the mainline code and has validation.
- **Partially implemented:** some behavior exists, but the page also describes
  missing work.
- **Accepted design:** intended direction, not fully implemented.
- **Future design:** plausible direction, not selected for near-term work.
- **Rejected:** explicitly not the chosen direction.
- **Research note:** background used to inform design, not a direct plan.
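
Because the vocabulary is closed, a front-matter `status` value can be checked mechanically. The sketch below assumes the value is exactly one of the six labels, optionally with the trailing period shown in the canonical form; `STATUS_LABELS` and `is_known_status` are illustrative names, not part of the docs tooling.

```python
# The six labels listed above, lowercased for comparison.
STATUS_LABELS = frozenset({
    "implemented",
    "partially implemented",
    "accepted design",
    "future design",
    "rejected",
    "research note",
})


def is_known_status(status: str) -> bool:
    """Accept a label with or without the trailing period used in front matter."""
    return status.strip().rstrip(".").lower() in STATUS_LABELS
```

If pages are allowed to append qualifying text after the label, the check would need to match on a prefix instead.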

Add a page-level status label to:

- proposal pages whose content could be mistaken for current behavior
- architecture or design pages that mix implemented facts with future or
  partial behavior
- design-gate documents whose role is to define an accepted implementation
  contract before the implementation is complete
- research pages that would otherwise read like selected design rather than
  background

Do not add a page-level status label to:

- orientation, index, command-reference, and workflow pages where the page type
  already makes the role obvious
- reader-orientation overview pages whose role is to explain *why* the design
  looks the way it does (design bets, project framing) rather than catalogue
  *what* is implemented. These pages must point at `status.md` or the relevant
  architecture page for implementation state; a mixed "Partially implemented"
  label on them is misleading because each bullet it covers has its own,
  different status
- status summary pages that already classify other documents
- pages whose content is purely operational and only describes current,
  validated behavior

When only one section differs from the rest of the page, keep the page-level
status for the dominant role of the document and add a local sentence in that
section such as `Current implementation status:` or `Current status:`. Do not
replace the page-level label with timestamped prose unless the timestamp itself
is the point.

Avoid ambiguous language like "planned" without a stage, dependency, or status
label. When a page mixes current and future behavior heavily, split those
sections instead of relying on status text alone.

## Content Rules

The docs-scoped authoring contract lives in [`docs/AGENTS.md`](../AGENTS.md);
the rules below extend it with site-shape conventions specific to the mdBook
manual. Apply the AGENTS.md rules first when editing any file under `docs/`,
then layer the site-shape rules from this proposal.

- Start with operational facts, not motivation.
- Prefer concrete nouns: process, cap table, ring, endpoint, manifest, init,
  QEMU smoke.
- Name source files when a claim depends on implementation.
- State authority and ownership rules explicitly.
- State failure behavior explicitly.
- Link to proposals and research instead of duplicating long rationale.
- Keep `docs/roadmap.md` and `docs/tasks/README.md` as planning sources, not as content to
  paste into the book.
- Do not describe behavior as implemented unless validation exists or the code
  map makes the claim directly checkable.
- Do not bury current limitations at the bottom of a long proposal.

## Proposal Index

`docs/proposals/index.md` should classify proposal files instead of listing
them alphabetically. A useful classification:

- Active or near-term:
  - service architecture
  - service object capabilities
  - storage and naming
  - error handling
  - security and verification
  - SMP
  - Ring v2 for full SMP
- Future architecture:
  - networking
  - userspace binaries
  - shell
  - SSH shell gateway
  - boot to shell
  - user identity and policy
  - cryptography and key management
  - certificates and TLS
  - OIDC and OAuth2
  - volume encryption
  - cloud metadata
  - cloud deployment
  - live upgrade
  - GPU capability
  - formal MAC/MIC
  - browser/WASM
- Rejected or superseded:
  - rejected Cap'n Proto ring SQE envelope

Each proposal entry should have a one-sentence purpose and a status label.

## Research Index

`docs/research/index.md` is the top-level research index, and the
capability/microkernel survey lives at
`docs/research/capability-systems-survey.md` with a "Design consequences for
capOS" section near the top. Readers should not need to read every long report
to learn which ideas were accepted.

Each long research report should eventually end with:

```md
## Used By

- Architecture or proposal page that relies on this research.
- Concrete design decision influenced by this report.
```

## Diagrams

Use Mermaid only where it clarifies flow or authority:

- boot flow: firmware, Limine, kernel, manifest, init
- capability ring: SQE submission, `cap_enter`, CQE completion
- endpoint IPC: client CALL, server RECV, server RETURN
- manifest startup: boot package, init, ProcessSpawner, child caps

Avoid diagrams that duplicate file layout or become stale when a function is
renamed. Every diagram should have nearby text that states the same key
invariant in prose.

## Migration Plan

### Phase 1: Skeleton and Reader Path

- Add `book.toml` with `docs` as the source directory and output under
  `target/docs-site`.
- Add `docs/SUMMARY.md`.
- Add `docs/index.md`.
- Add `docs/overview.md`.
- Add `docs/status.md`.
- Add `docs/build-run-test.md`.
- Add `docs/repo-map.md`.

Acceptance criteria:

- `mdbook build` succeeds.
- The first section explains what capOS is, how to build it, how to boot it,
  and where to find the major code areas.
- Existing proposal and research files are reachable through the sidebar.
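
The reachability criterion can be checked with a short script. This is a sketch under assumptions: it treats every `](path.md)` link in `SUMMARY.md` as a sidebar entry, resolves targets relative to the `docs` directory, and uses a hypothetical helper named `unlisted_pages`. The demonstration tree and its file names are made up.

```python
import re
import tempfile
from pathlib import Path

# Matches the target of a Markdown link such as [Title](proposals/index.md).
_LINK = re.compile(r"\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def unlisted_pages(docs: Path, archive_dirs=("proposals", "research")) -> list[str]:
    """Return archive pages under docs/ that SUMMARY.md does not link to."""
    summary = (docs / "SUMMARY.md").read_text(encoding="utf-8")
    listed = {(docs / target).resolve() for target in _LINK.findall(summary)}
    missing = []
    for name in archive_dirs:
        for page in sorted((docs / name).rglob("*.md")):
            if page.resolve() not in listed:
                missing.append(page.relative_to(docs).as_posix())
    return missing


# Demonstration on a throwaway tree.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "proposals").mkdir()
    (docs / "research").mkdir()
    (docs / "proposals" / "index.md").write_text("# Proposals\n")
    (docs / "proposals" / "example.md").write_text("# Example\n")
    (docs / "SUMMARY.md").write_text("- [Proposals](proposals/index.md)\n")
    demo_missing = unlisted_pages(docs)
```

In the demonstration, `demo_missing` contains only `proposals/example.md`, the one page that the sample `SUMMARY.md` does not list.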

### Phase 2: Current Architecture Pages

- Add the first architecture pages:
  - boot flow
  - process model
  - capability ring
  - IPC and endpoints
  - userspace runtime
  - manifest and service startup
  - memory management
  - scheduling
- Keep `docs/capability-model.md` as a first-class architecture page.

Acceptance criteria:

- Each architecture page has status, current behavior, design, invariants, code
  map, validation, and open work.
- Each page distinguishes implemented behavior from future design.
- At least boot flow, capability ring, IPC, and manifest startup include a
  concise Mermaid diagram.
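
The section criterion lends itself to a mechanical check. The sketch below looks for the level-2 headings from the page standard; `missing_sections` is a hypothetical helper, the `status` field lives in front matter and is not covered here, and headings that appear inside fenced code blocks would be counted as if they were real.

```python
import re

# Section headings from the page standard, in the order they should appear.
REQUIRED_SECTIONS = (
    "Current Behavior",
    "Design",
    "Invariants",
    "Code Map",
    "Validation",
    "Open Work",
)


def missing_sections(markdown: str) -> list[str]:
    """Return required level-2 headings that the page lacks."""
    found = set(re.findall(r"^## +(.+?)\s*$", markdown, flags=re.MULTILINE))
    return [name for name in REQUIRED_SECTIONS if name not in found]
```

An empty result means every required heading is present; the function does not enforce heading order or the 100 to 300 line guideline.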

### Phase 3: Security and Verification Pages

- Add `docs/security/trust-boundaries.md`.
- Add `docs/security/verification-workflow.md`.
- Link existing inventories and designs from the security section.
- Make each security page name the relevant validation commands and review
  documents.

Acceptance criteria:

- A reviewer can find the hostile-input boundaries, trusted inputs, and
  verification workflow without reading all proposals.
- The security section links to `REVIEW.md`, `docs/tasks/README.md`,
  `docs/trusted-build-inputs.md`, and `docs/panic-surface-inventory.md`.

### Phase 4: Proposal and Research Curation

- Add `docs/proposals/index.md`.
- Keep proposal and research documents reachable through `SUMMARY.md`, but nest
  them under archive groups so they do not dominate the default sidebar.
- Add status labels to proposal files as they are touched.
- Add "Used By" sections to research files incrementally.

Acceptance criteria:

- Proposal status is visible before a reader opens a long document.
- Rejected and future proposals are not confused with implemented behavior.
- Research pages point back to the architecture or proposal pages they
  influence.
- The default sidebar presents the current manual before backlog, proposal, and
  research archives.

## Maintenance Rules

- When implementation changes a subsystem, update the corresponding
  architecture page in the same change when the page would otherwise become
  misleading.
- When a proposal is accepted, rejected, or partially implemented, update its
  status and the proposal index.
- When `docs/tasks/state.toml` changes the selected milestone, update
  `docs/status.md` only if the public current-system summary changes. Do not
  mirror every operational task into the docs site.
- When validation commands change, update `docs/build-run-test.md` and the
  affected architecture page.

## Tooling Follow-Up

The content proposal continues to assume mdBook because it matches the repo's
Rust toolchain and plain Markdown corpus. The current tooling baseline is:

- `book.toml`
- `make docs`
- `make docs-serve`
- `make cloudflare-pages-build`
- pinned `mdbook` and `mdbook-mermaid` downloads in `Makefile`, with version
  and SHA-256 inputs catalogued in
  [`docs/trusted-build-inputs.md`](../trusted-build-inputs.md) under the
  mdBook documentation tools row. `make docs` and `make cloudflare-pages-build`
  verify those checksums and the executable versions before rendering the
  book. `mdbook-mermaid` supplies the pinned `mermaid.min.js` browser bundle
  used by both mdBook HTML rendering and docs-PDF Mermaid rasterization.
- a small local stylesheet for readability and sidebar spacing
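
The checksum verification mentioned in the list above happens in the `Makefile`. The Python sketch below only illustrates the idea of comparing a downloaded file against a pinned SHA-256 value. The function name, file name, and bytes are placeholders, not the real tool archives or digests.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's SHA-256 digest with a pinned hex value."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large release archives are not loaded all at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()


# Demonstration with a throwaway file standing in for a downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "tool.tar.gz"
    archive.write_bytes(b"example bytes")
    pinned = hashlib.sha256(b"example bytes").hexdigest()
    demo_ok = sha256_matches(archive, pinned)
    demo_bad = sha256_matches(archive, "0" * 64)
```

A build should stop on a mismatch rather than fall back to an unverified binary, which matches the verify-before-render order described in the list.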

Do not add a frontend package manager, theme framework, or generated site
assets unless the content structure proves insufficient. If mdBook becomes too
limited after the sidebar, index, metadata, and styling cleanup, the preferred
replacement candidate is Astro Starlight because it supports Markdown/MDX,
content collections, structured sidebars, built-in docs components, and static
Cloudflare Pages output. Docusaurus is better only if versioned public docs,
blogging, and a larger external project site become requirements. VitePress is
reasonable only if the project wants Vue-oriented customization.

## Open Questions

- Should `docs/tasks/README.md` remain outside the book and linked from
  `status.md`, or should redacted public summaries be generated later?
- Should long proposal files keep their current filenames, or should accepted
  designs eventually move from `docs/proposals/` into `docs/architecture/`?
- Should `docs/status.md` be manually maintained, or generated from a smaller
  checked-in status data file later?
- Should Cap'n Proto schema documentation be generated into the book once the
  interface surface stabilizes?
- Should proposal and research indexes eventually be generated from structured
  front matter instead of hand-maintained Markdown tables?

## Recommended First Commit

The first implementation commit should be deliberately small:

1. Add mdBook config.
2. Add `SUMMARY.md`.
3. Add the Start Here pages.
4. Link existing proposal and research files without rewriting them.
5. Verify `mdbook build`.

That gives the project a usable docs site quickly, without blocking on a full
architecture rewrite.
