Roadmap

Long-term direction for capOS. Keep this file outcome-oriented. Detailed task decomposition belongs in docs/backlog/; current execution order belongs in WORKPLAN.md; completed milestone/review reports belong in docs/changelog.md.

Current Direction

Current selected milestone: Session-Bound Invocation Context.

The milestone is in cleanup: the core gates have landed, and the remaining work is to finish the peer-owned adventure shared-service migration and run final full-gate verification. Implemented pieces include the process-session invariant, endpoint caller-session metadata, stale normal endpoint rejection, transfer scopes, field-granular disclosure gating, session expiry for broker-issued shell bundle caps, guest bundle narrowing, chat session-keyed membership, terminal and stdio bridge live-session guards, keyed service-scoped caller references, and the session-context proof that distinct endpoint service scopes derive distinct opaque caller-session reference tuples. The milestone still replaces caller-selected service-visible identity, but it does so without continuing the superseded service-object identity migration.

The prior core service-object routing/lifecycle subproof landed in commit a4655f0 at 2026-04-28 14:10 UTC: it proves trusted service-object minting, generation-checked receiver cookies, copy/move IPC transfer, nested spawn delegation, close/revoke rejection, and stale-cookie rejection after record reuse. That proof remains historical low-level coverage. The active milestone does not continue subject/proof root opening or shared-service service-object migration.

This milestone intentionally precedes more remote shell work. The SSH Shell Gateway remains a planned Stage 7 shell/networking milestone, but safe network-backed shell delegation depends on the same one-session-per-process and privacy-preserving endpoint session model. The SSH version-exchange checkpoint lives on workplan/ssh-version-exchange-recovery and still requires QEMU harness review before merge.

Details:

  • WORKPLAN.md
  • docs/backlog/session-bound-invocation-context.md
  • docs/backlog/service-object-identity-migration.md (superseded)
  • docs/backlog/stage-6-capability-semantics.md
  • docs/proposals/session-bound-invocation-context-proposal.md
  • docs/proposals/service-object-capabilities-proposal.md (superseded)
  • docs/proposals/user-identity-and-policy-proposal.md
  • docs/proposals/oidc-and-oauth2-proposal.md
  • docs/backlog/local-users-management.md

Whitepaper Track

A future capOS whitepaper / technical report consumes, rather than duplicates, work from the other tracks. The plan, outline, and live evidence-gap log remain in docs/paper/ (plan.md, outline.md, evidence-gaps.md). The paper itself is a Typst project at papers/schema-as-abi/ and is built via make paper.

The paper’s Tier-1 evidence requirements pull these existing items into explicit paper-supporting roles. They are not new tracks; they are the selection lens this track applies:

  • Stage 6 session-bound invocation context migration (closes the “interface IS the permission” claim).
  • A measurement harness over make run-measure producing reproducible ring throughput, cap_enter latency, IPC handoff, and schema-dispatch numbers (closes the ring-as-sufficient-boundary claim).
  • A paper-scoped persistence proof-of-concept narrower than the storage proposal (closes the wire-format-enables-persistence claim).
  • A paper-scoped network-transparency proof-of-concept narrower than the general networking proposal (closes the wire-format-enables-network-transparency claim).
  • At least one of {promise pipelining, notification objects} (closes capnp-rpc-shaped composition beyond CALL/RECV).

Tier-2 strengtheners: ring-protocol Kani proof, full concurrent SMP scheduling, end-to-end SSH Shell Gateway, one non-toy demo beyond Adventure or First Chat.

Out of scope for the first paper (acknowledge in Future Work only): aarch64, GPU, live upgrade, formal MAC/MIC, Go/WASI, cloud metadata, production volume encryption.

When workplan slices close a paper-evidence gap, they should reference docs/paper/evidence-gaps.md and update it in the same task, including the matching #todo block in papers/schema-as-abi/main.typ. A structural pre-evidence draft already exists at papers/schema-as-abi/main.typ; the abstract, the Evaluation section, the Conclusion, and any contribution claim that depends on missing Tier-1 evidence stay deferred until that evidence lands. New paper content that does not depend on missing artifacts may be drafted at any time and lives next to the existing #todo blocks.

Completed Foundation

  • Stage 0: Foundations: bitmap physical frame allocator, heap for alloc, IDT exception handling, and initial Cap’n Proto schema scaffolding.
  • Stage 1: Virtual Memory: kernel and per-process address spaces, page table abstraction, HHDM preservation, and user-half cleanup.
  • Stage 2: User-Space Transition: GDT/TSS/syscall setup and Ring 3 round-trip path.
  • Stage 3: Process Abstraction: ELF loading, process ownership of address spaces and cap tables, process exit cleanup, and the current exit / cap_enter syscall surface.
  • Stage 4: Capability Syscalls / Ring Transport: Console capability, shared-memory submission/completion rings, cap_enter, CQE transport errors, and alloc-free dispatch paths.
  • Stage 5: Scheduling Core: PIT/PIC timer preemption, round-robin scheduler, context switching, generation-tagged caps, and VirtualMemory cap.
  • Kernel Networking Smoke: in-kernel QEMU virtio-net + smoltcp proof for ARP, ICMP, and TCP HTTP.
  • Boot To Shell / Native Shell: shell-led boot flow, split debug/terminal UARTs, local setup/login, anonymous/operator sessions, and shell REPL.
  • Verified Core: bounded local/GitHub Kani gate plus high-memory proof gate for selected cap-table, frame-bitmap, transfer rollback, and resource accounting invariants.
  • Shared-Service Demo Base: chat, adventure, NPC-as-process, and shared service harness prototypes.

Historical completion reports live in docs/changelog.md.
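
The generation-tagged caps listed under Stage 5 can be pictured with a minimal sketch. It assumes a fixed-size slot table in which each slot carries a generation counter that is bumped on removal, so a handle minted before the slot was reused no longer matches. All names here (CapId, CapTable, CapError) are hypothetical illustrations and are not taken from the capOS source tree.

```rust
// Hypothetical sketch: none of these names come from the capOS source tree.

/// A handle packs a slot index with the generation it was minted under.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CapId {
    index: u32,
    generation: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CapError {
    OutOfRange,
    Stale,
    TableFull,
}

struct Slot<T> {
    generation: u32,
    value: Option<T>,
}

pub struct CapTable<T> {
    slots: Vec<Slot<T>>,
}

impl<T> CapTable<T> {
    pub fn with_capacity(n: usize) -> Self {
        let slots = (0..n).map(|_| Slot { generation: 0, value: None }).collect();
        CapTable { slots }
    }

    /// Install a value in the first free slot and return its handle.
    pub fn insert(&mut self, value: T) -> Result<CapId, CapError> {
        let index = self
            .slots
            .iter()
            .position(|slot| slot.value.is_none())
            .ok_or(CapError::TableFull)?;
        let slot = &mut self.slots[index];
        slot.value = Some(value);
        Ok(CapId { index: index as u32, generation: slot.generation })
    }

    /// Look up a handle; a generation mismatch means the handle is stale.
    pub fn get(&self, id: CapId) -> Result<&T, CapError> {
        let slot = self.slots.get(id.index as usize).ok_or(CapError::OutOfRange)?;
        if slot.generation != id.generation {
            return Err(CapError::Stale);
        }
        slot.value.as_ref().ok_or(CapError::Stale)
    }

    /// Remove a value and bump the generation so old handles stop matching.
    pub fn remove(&mut self, id: CapId) -> Result<T, CapError> {
        let slot = self.slots.get_mut(id.index as usize).ok_or(CapError::OutOfRange)?;
        if slot.generation != id.generation {
            return Err(CapError::Stale);
        }
        let value = slot.value.take().ok_or(CapError::Stale)?;
        slot.generation = slot.generation.wrapping_add(1);
        Ok(value)
    }
}

fn main() {
    let mut table = CapTable::with_capacity(1);
    let old = table.insert("console").unwrap();
    table.remove(old).unwrap();
    // The only slot is reused, but under a new generation.
    let new = table.insert("endpoint").unwrap();
    assert_eq!(table.get(old), Err(CapError::Stale));
    assert_eq!(table.get(new), Ok(&"endpoint"));
    println!("stale handle rejected after slot reuse");
}
```

The design point is that reuse of a slot index is harmless as long as the generation is checked on every lookup; a real table would also need a policy for generation wraparound, which this sketch ignores.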

Stage 6: IPC And Capability Transfer

Outcome: cross-process capability calls, capability transfer, revocation, and process spawning are capability-shaped and usable by init-owned service graphs. Caller-selected service-visible identity is being replaced by session-bound invocation context: each normal process has one immutable session context, endpoint calls expose privacy-preserving caller-session metadata, and broker-granted service roots/facets carry service access.
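
As a rough illustration of the one-immutable-session-per-process rule described above, the following sketch uses a write-once cell for the session slot. It assumes that a second bind attempt is rejected rather than overwriting, and that a child inherits the parent session unless a trusted broker supplies another one. The types Process, SessionContext, and SessionError are hypothetical and do not reflect the actual capOS kernel structures.

```rust
use std::sync::OnceLock;

// Hypothetical names throughout; this is not the capOS kernel API.

#[derive(Clone, Debug, PartialEq, Eq)]
struct SessionContext {
    session_id: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum SessionError {
    AlreadyBound,
}

struct Process {
    // Written at most once, then read-only for the life of the process.
    session: OnceLock<SessionContext>,
}

impl Process {
    fn new() -> Self {
        Process { session: OnceLock::new() }
    }

    /// Bind the session; a second attempt is rejected, never overwritten.
    fn bind_session(&self, ctx: SessionContext) -> Result<(), SessionError> {
        self.session.set(ctx).map_err(|_| SessionError::AlreadyBound)
    }

    fn session(&self) -> Option<&SessionContext> {
        self.session.get()
    }

    /// Children inherit the parent's session unless a trusted broker
    /// supplies a different one at spawn time.
    fn spawn_child(&self, broker_selected: Option<SessionContext>) -> Process {
        let child = Process::new();
        if let Some(ctx) = broker_selected.or_else(|| self.session().cloned()) {
            // Cannot fail: the child was created unbound just above.
            let _ = child.session.set(ctx);
        }
        child
    }
}

fn main() {
    let shell = Process::new();
    shell.bind_session(SessionContext { session_id: 1 }).unwrap();

    // Second-session injection fails and leaves the original in place.
    assert_eq!(
        shell.bind_session(SessionContext { session_id: 2 }),
        Err(SessionError::AlreadyBound)
    );

    let inherited = shell.spawn_child(None);
    let brokered = shell.spawn_child(Some(SessionContext { session_id: 9 }));
    println!("{:?} {:?}", inherited.session(), brokered.session());
}
```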

Implemented:

  • cap_enter blocking wait
  • Endpoint kernel object
  • RECV/RETURN ring opcodes
  • cross-process IPC
  • direct-switch IPC handoff
  • legacy endpoint receiver metadata as transitional IPC machinery
  • copy/move capability transfer
  • CAP_OP_RELEASE
  • runtime handle release integration
  • epoch revocation and Revocable Read proof
  • MemoryObject substrate – the kernel-level mapping mechanism that backs zero-copy IPC. Demonstrated end-to-end by make run-memoryobject-shared (single-shot transfer) and make run-ipc-zerocopy (multi-message shared point-to-point buffer with metadata-only endpoint CALLs). The typed SharedBuffer surface and service APIs that consume it (File.readBuf, BlockDevice.readBlocks, NIC RX/TX rings) are still pending.
  • ProcessSpawner / ProcessHandle
  • init-owned manifest execution and boot package boundary cleanup
  • immutable per-process SessionContext ownership, default child-session inheritance, and trusted broker-selected child sessions, demonstrated by make run-session-context
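
The epoch revocation item above can be sketched in a few lines, under the assumption that each derived capability records the epoch in which it was minted and that revocation simply advances the object's epoch. Object, ReadCap, mint, and revoke_all are hypothetical names chosen for illustration; the actual capOS revocation path may be structured differently.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical types; the real capOS revocation path may differ.

#[derive(Debug, PartialEq, Eq)]
enum ReadError {
    Revoked,
}

/// The backing object owns the current revocation epoch.
struct Object {
    epoch: AtomicU64,
    contents: &'static str,
}

/// A derived read capability remembers the epoch it was minted in.
struct ReadCap {
    object: Arc<Object>,
    minted_epoch: u64,
}

fn mint(object: &Arc<Object>) -> ReadCap {
    ReadCap {
        object: Arc::clone(object),
        minted_epoch: object.epoch.load(Ordering::Acquire),
    }
}

/// Revoke every outstanding capability at once by advancing the epoch.
fn revoke_all(object: &Object) {
    object.epoch.fetch_add(1, Ordering::AcqRel);
}

impl ReadCap {
    /// A read succeeds only while the minted epoch is still current.
    fn read(&self) -> Result<&'static str, ReadError> {
        if self.object.epoch.load(Ordering::Acquire) == self.minted_epoch {
            Ok(self.object.contents)
        } else {
            Err(ReadError::Revoked)
        }
    }
}

fn main() {
    let object = Arc::new(Object { epoch: AtomicU64::new(0), contents: "hello" });
    let cap = mint(&object);
    assert_eq!(cap.read(), Ok("hello"));

    revoke_all(&object);
    assert_eq!(cap.read(), Err(ReadError::Revoked));

    // A capability minted after revocation sees the new epoch and works.
    assert_eq!(mint(&object).read(), Ok("hello"));
    println!("old capability revoked, fresh capability works");
}
```

The appeal of this shape is that revocation does not need to find and visit every outstanding handle; the cost moves to a single comparison on each use.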

Remaining themes:

  • typed SharedBuffer capability and consuming service APIs (storage, block, network, GPU) on top of the existing MemoryObject substrate
  • notification objects (so zero-copy producers/consumers can signal each other without per-record endpoint CALLs)
  • promise pipelining
  • CapabilityManager list/grant interface
  • remaining session-keyed shared-service migration for adventure and terminal bridges, including service use of bounded disclosure where needed
  • scheduling context and resource donation
  • init ELF embedding

Details:

  • docs/backlog/session-bound-invocation-context.md
  • docs/backlog/service-object-identity-migration.md (superseded)
  • docs/backlog/stage-6-capability-semantics.md
  • docs/proposals/service-architecture-proposal.md
  • docs/proposals/storage-and-naming-proposal.md
  • docs/proposals/error-handling-proposal.md

Stage 7: SMP, Runtime, Networking, And Shell

Outcome: capOS moves from single-CPU scheduling and local-only shell access to multi-CPU execution, thread-aware runtime behavior, socket-shaped network capabilities, and agent/web shell entry points.

SMP status:

  • Phase A complete: BSP per-CPU syscall stack/current-thread state and unified kernel-entry stack hook.
  • Phase B complete: APs start through Limine MP, switch to capOS kernel paging/stacks, initialize AP-local CPU state, and park.
  • Phase C selected AP scheduler-owner proof complete: GS/swapgs, LAPIC timer/IPI, TLB shootdown, and the first AP scheduler-owner proof have landed. Commit d88bca7 at 2026-04-25 11:31 UTC proves AP cpu=1 can run scheduler-owned user contexts under -smp 2 while a scheduler-owner latch keeps the BSP in kernel idle. Full concurrent scheduling remains future work; it requires per-CPU scheduler ownership, reschedule IPIs, and process-ring-safe concurrent scheduler-owned work.
  • The next visible SMP milestone is Multi-Process SMP Concurrency. Its technical prerequisite is full concurrent SMP scheduling: multiple CPUs must own scheduler work at the same time through reviewed per-CPU ownership, runnable handoff, and cross-CPU wakeup paths. The visible proof is that independent worker processes improve wall-clock runtime on a deterministic CPU-bound workload.
  • A separate later milestone is In-Process Threading Scalability. It proves sibling threads in one process can run on different CPUs and scale the same class of workload after per-thread ring/completion routing removes the current process-wide capability-ring bottleneck.

Runtime/network/shell themes:

  • reconcile in-process threading implementation status and any follow-on work
  • Telnet Shell Demo as first TCP-backed TerminalSession proof. Plaintext, loopback-only research demo; not a shippable Telnet service.
  • Tickless idle as the near-term timer cleanup: split clocksource from clockevent, convert timeout waiters to absolute deadlines, replace the user-mode idle process with kernel/per-CPU idle, then stop the periodic tick only when no runnable work exists. Generic full-nohz remains deferred; SQPOLL nohz belongs behind Ring v2, per-CPU scheduler ownership, housekeeping, CPU accounting, and CPU-isolation authority. See docs/proposals/tickless-realtime-scheduling-proposal.md and docs/research/nohz-sqpoll-realtime.md.
  • SSH Shell Gateway as the production remote CLI successor to Telnet after host-key, authorized-key, audit, and persistence prerequisites exist
  • decomposed userspace NIC/network-stack milestone after driver authority gates
  • native shell agent runner
  • WebShellGateway using the same broker-issued shell/agent authority model
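
The tickless-idle item above combines two ideas: timeout waiters are stored as absolute deadlines, and the periodic tick is armed only while runnable work exists. A minimal sketch of that decision follows, assuming nanosecond timestamps on a monotonic clocksource. TimerQueue and next_event are hypothetical names, and the sketch does not model the clocksource/clockevent split or per-CPU idle.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Hypothetical sketch in nanoseconds on a monotonic clocksource.

/// Timeout waiters stored as absolute deadlines, earliest first.
struct TimerQueue {
    deadlines: BinaryHeap<Reverse<u64>>,
}

impl TimerQueue {
    fn new() -> Self {
        TimerQueue { deadlines: BinaryHeap::new() }
    }

    /// Convert a relative timeout to an absolute deadline once, at enqueue.
    fn add_timeout(&mut self, now_ns: u64, timeout_ns: u64) {
        self.deadlines.push(Reverse(now_ns.saturating_add(timeout_ns)));
    }

    fn earliest(&self) -> Option<u64> {
        self.deadlines.peek().map(|d| d.0)
    }
}

/// Decide when the clockevent should fire next.
/// `None` means no timer is armed and the CPU can idle until an interrupt.
fn next_event(now_ns: u64, tick_ns: u64, runnable: usize, queue: &TimerQueue) -> Option<u64> {
    let earliest = queue.earliest();
    if runnable > 0 {
        // Runnable work still needs the preemption tick.
        let tick = now_ns.saturating_add(tick_ns);
        Some(earliest.map_or(tick, |d| d.min(tick)))
    } else {
        // Idle: skip the periodic tick, wake only for a real deadline.
        earliest
    }
}

fn main() {
    let mut queue = TimerQueue::new();
    println!("idle, no waiters: {:?}", next_event(0, 1_000_000, 0, &queue));
    queue.add_timeout(0, 5_000_000);
    println!("idle, one waiter: {:?}", next_event(0, 1_000_000, 0, &queue));
    println!("busy, one waiter: {:?}", next_event(0, 1_000_000, 2, &queue));
}
```

Storing absolute deadlines means a waiter's wake time does not drift when ticks are skipped, which is why the conversion is listed before stopping the tick.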

Details:

  • docs/backlog/smp-phase-c.md
  • docs/backlog/runtime-network-shell.md
  • docs/proposals/smp-proposal.md
  • docs/proposals/tickless-realtime-scheduling-proposal.md
  • docs/proposals/networking-proposal.md
  • docs/proposals/shell-proposal.md
  • docs/proposals/llm-and-agent-proposal.md
  • docs/proposals/boot-to-shell-proposal.md

Hardware, Boot, And Storage

Outcome: capOS boots beyond the current ISO/QEMU manifest path, discovers real hardware, supports block devices, and exposes local persistent storage through typed capabilities.

Tracks:

  • bootable GPT/EFI disk image and make run-disk
  • ACPI/MADT/MCFG discovery
  • reusable interrupt and PCI/PCIe infrastructure
  • virtio-blk and NVMe block-device paths
  • boot binary ISO layout that moves ELF payloads out of the manifest blob
  • RAM-backed Store/Namespace
  • read-only local filesystem proof
  • writable local storage with recovery policy
  • cloud device tracks for GCP/AWS/Azure NICs

Details:

  • docs/backlog/hardware-boot-storage.md
  • docs/proposals/cloud-deployment-proposal.md
  • docs/proposals/storage-and-naming-proposal.md
  • docs/dma-isolation-design.md

User Identity, Sessions, And Policy

Outcome: shell, service, and future web sessions receive narrow capability bundles based on explicit identity, freshness, policy, and audit context.

Implemented base:

  • anonymous/operator shell sessions
  • password setup/login proof
  • broker-issued shell bundles
  • redacted auth/session audit records

Remaining themes:

  • manifest-seeded local accounts, recovery identities, service identities, and initial role/resource profiles
  • disk-backed local account store over capability-native storage
  • default per-account, guest, anonymous, external, and service-account resource bundles
  • explicit external identity bindings for OIDC/passkey/cloud/certificate principals
  • durable verifier/passkey records
  • WebAuthn and passkey-only setup path
  • broader AuditLog completion
  • ABAC context such as auth freshness, session age, source, and claims
  • mandatory-policy labels and wrapper caps
  • guest and anonymous workload demos
  • POSIX profile adapter metadata
  • OIDC/OAuth2 integration

Details:

  • docs/proposals/user-identity-and-policy-proposal.md
  • docs/backlog/local-users-management.md
  • docs/proposals/oidc-and-oauth2-proposal.md
  • docs/proposals/certificates-and-tls-proposal.md
  • docs/proposals/cryptography-and-key-management-proposal.md
  • docs/security/trust-boundaries.md

Security And Verification

Outcome: trust boundaries fail closed, proof gates stay practical, and trusted build inputs remain review-visible.

Implemented base:

  • host tests for pure logic
  • Loom ring model
  • Miri/proptest/Kani paths
  • dependency policy checks
  • pinned Limine and Cap’n Proto tooling
  • DMA isolation design gate
  • panic-surface inventory

Remaining themes:

  • Stage-6 trust-boundary refresh
  • untrusted-service hardening and quota/exhaustion smokes
  • Kani harness bounds refresh when new proof obligations are concrete

Details:

  • docs/backlog/security-verification.md
  • REVIEW.md
  • REVIEW_FINDINGS.md
  • docs/proposals/security-and-verification-proposal.md
  • docs/security/verification-workflow.md
  • docs/trusted-build-inputs.md

Shared-Service Demos

Outcome: multi-process demos prove resident services, shell-spawned clients, session-bound invocation context, shared harnesses, and eventually network-transparent federation.

Implemented:

  • First Chat MVP
  • Local MUD/adventure prototype
  • NPC-as-process fleet
  • shared service harness extraction

Remaining themes:

  • session-keyed service state replacing legacy receiver-selected chat/adventure identity
  • per-principal chat state and audit
  • Aurelian Frontier game-depth work after the first deterministic mission slice
  • native command-surface replacement for prototype StdIO
  • federated chat after network transparency

Details:

  • docs/backlog/shared-service-demos.md
  • docs/backlog/aurelian-frontier.md
  • docs/demos/adventure.md
  • docs/proposals/aurelian-frontier-proposal.md
  • docs/proposals/interactive-command-surface-proposal.md

aarch64 Support

Outcome: port the architecture layer after x86_64 hardware abstraction stabilizes.

Shared code expected to carry over:

  • capability model and schema
  • ring structs and transport contracts
  • userspace runtime model
  • process/capability abstractions above arch/

Architecture-specific work:

  • EL0/EL1 syscall entry/exit
  • GICv3 interrupts
  • ARM generic timer
  • PL011 UART
  • TTBR0/TTBR1 MMU setup
  • TPIDR_EL1 per-CPU data
  • kernel/linker-aarch64.ld

Future Tracks

These are not selected unless WORKPLAN.md or user direction pulls them into active scope:

  • regular Rust runtime support
  • C libcapos
  • Go GOOS=capos
  • Lua scripting
  • POSIX compatibility
  • WASI runtime
  • C++ experiments
  • GPU/CUDA capability integration
  • system monitoring
  • network transparency
  • process persistence/checkpoint-restore
  • live upgrade
  • cloud metadata
  • volume encryption
  • formal MAC/MIC modeling
  • browser/WASM support
  • robotics realtime control

Use proposal files under docs/proposals/ and research notes under docs/research/ before promoting any future track into WORKPLAN.md. Lua scripting should arrive as an ordinary capability-scoped userspace runner, not as kernel scripting or ambient shell authority.

Observable Milestones

Completed visible milestones:

  • 2026-04-22 16:35 UTC, commit d4016ab: Unprivileged Stranger
  • 2026-04-23 08:41 UTC, commit f554e88: Native Cap Shell
  • 2026-04-23 13:39 UTC, commit e5adafb: Boot to Shell
  • 2026-04-23 16:15 UTC, commit 7f19af2: Revocable Read
  • 2026-04-23 16:34 UTC, commit 8b66c13: split UART shell session
  • 2026-04-23 22:09 UTC, commit d43b691: Verified Core
  • 2026-04-24 00:13 UTC, commit 2cd85a8: First Chat MVP
  • 2026-04-24 01:40 UTC, commit add7f9b: Local MUD/adventure prototype
  • 2026-04-24 03:13 UTC, commit da5f5e9: Ring as Black Box
  • 2026-04-24 15:37 UTC, commit b56a5c1: First Packet
  • 2026-04-24 16:47 UTC, commit a4f1722: First HTTP
  • 2026-04-25 05:36 UTC, commit 0b79054: SMP Phase A: per-CPU data on BSP
  • 2026-04-25 06:59 UTC, commit d3c30c6: SMP Phase B: APs running
  • 2026-04-25 11:31 UTC, commit d88bca7: First AP Scheduler
  • 2026-04-25 20:25 UTC, commit 2834bfc: Telnet Shell Demo

Visible demo follow-ups:

  • Adventure/shared-service follow-ups after the Local MUD prototype: 73d83aa, da51dc7, 353c8bc, e20cf07, 948c96e, and ca6300c. These refine discoverability, room context, expedition map, relic custody, explicit resume, and chat-only named actors; detailed reports live in commit history.
  • 2026-04-26 04:10 UTC, commit 5480304: Scoped Telnet Gateway Authority. telnet-gateway now uses manifest-forwarded scoped listener authority plus RestrictedShellLauncher; detailed verification history lives in commit history.
  • 2026-04-26 20:12 UTC (23:12 EEST), commit 4304b0e: Default run Telnet wiring. The default manifest starts telnet-gateway, and make run attaches host-local 127.0.0.1:2323 -> guest :23 forwarding.
  • 2026-04-26 21:02 UTC (2026-04-27 00:02 EEST), commit 7a155f4: Telnet IAC handoff fix and repeat-connect support. Telnet handoff no longer consumes raw socket input before intoTerminalSession, repeated host connections succeed, and the harness drives two consecutive sessions.
  • 2026-04-28 17:46 UTC, commit d09243d: Aurelian Phase 9 competency gates. The adventure proof now has host-testable rank/star/circle policy, status output for rank marks and standing, signifer skill gates, first-mission spell gates, and QEMU assertions for rank denial plus debrief reward.
  • 2026-04-28 18:12 UTC, commit 47dbfc5: Aurelian Phase 10 market logistics. Adventure now has typed quote/buy/sell/trade/repair calls, bounded market roles, a deterministic Maro route purchase, and QEMU assertions for market quote, successful exchange, and clean-custody trade refusal.
  • 2026-04-28 19:36 UTC, commit e204454: Aurelian Phase 11a calendar foundation. Generated content now carries fixed-smoke season/day/weather and hazard state plus bounded seasonal resources, Adventure status prints that state, and the real scenario process asserts it through Adventure.status.
  • 2026-04-28 20:08 UTC, commit 48c62db: Aurelian Phase 11b regional foundation. Generated content now carries settlement, outpost, and route metadata with validation and stable ordering; Adventure status prints a regional summary, and the real scenario process asserts it through Adventure.status.
  • 2026-04-28 21:08 UTC, commit 0b7db05: Aurelian Phase 11c construction foundation. Generated content now carries material, facility, blueprint, artifact, and enchantment-slot metadata with pure Rust validation and deterministic property derivation; Adventure status prints a construction summary, and the real scenario process asserts it through Adventure.status. Construction jobs, material reservation, escrow, completion/release, and full artifact crafting gameplay remain future work.
  • 2026-04-28 21:36 UTC, commit f53d044: Aurelian Phase 11d agent NPC budget foundation. Generated content now carries disabled-by-default optional NPC agent budget metadata with model profiles, per-session/day input/output token limits, tool-call limits, cooldown, fatigue, sleep, refusal, and audit visibility. Pure Rust fake-model tests cover spending, refusals, disabled transcript stability, bounded output, and no authority mutation from model text; Adventure status prints an aggregate budget line asserted through Adventure.status. Live LLM integration, hosted-agent execution, durable memory, autonomous NPC actions, and authority mutation from model output remain future work.
  • 2026-04-28 22:22 UTC, commit 335a9ee: Aurelian Phase 12 party foundation. Adventure now has typed local party create/invite/accept/leave/delegate calls and assist, keyed by service-created local player labels derived from live caller-session keys. The server uses the unit-tested adventure-content party transition state for invite, accept, scoped delegation, assist, and leave cleanup; the scenario process asserts the one-client cap surface and party status line. Two-client QEMU proof, transfer escrow, duel/spar/contest authority, and cross-device multiplayer remain future work.
  • 2026-04-29 06:43 UTC, commit ac49375: Aurelian Phase 12 physical-item transfer foundation. Adventure adds typed transfer for same-party service-local player labels, with ordinary inventory mutation kept atomic inside the existing service and backed by pure Rust transfer tests. The scenario process asserts one-client refusal paths without faking a second live session. Currency escrow, broad market/trade coordination, and successful two-client QEMU transfer proof remain future work.
  • Pending branch feature/paperclips-demo: Paperclips Terminal Demo. The default manifest advertises the clean-room paperclips terminal game, and system-paperclips.cue plus make run-paperclips provide the focused QEMU proof for production, sale, automation, simulation ticks, project listing, and clean shell exit. The demo is intentionally outside the active Session-Bound Invocation Context milestone because it exercises a standalone StdIO terminal process rather than shared-service caller identity.

Active visible milestone:

  • Session-Bound Invocation Context: normal workload processes have exactly one immutable live session context, endpoint calls reveal only privacy-preserving caller-session metadata by default, and shared services stop deriving caller identity from caller-selected service-visible metadata. Commit 3edee90 at 2026-04-28 16:26 UTC lands the first proof for child session inheritance, failed second-session injection, and trusted broker-selected child contexts; commit 3469c27 at 2026-04-28 16:54 UTC adds broker-side expired-session rejection; commit 687511a at 2026-04-28 17:43 UTC adds endpoint caller-session metadata, payload-spoof rejection for invocation context, and stale normal endpoint rejection; commit f0cb74b at 2026-04-28 18:38 UTC adds transfer-scope enforcement for endpoint IPC, endpoint returns, and spawn grants; commit 0f92d77 at 2026-04-28 19:33 UTC adds explicit endpoint subject disclosure gating by request and scope; commit dc7ece4 at 2026-04-28 20:06 UTC migrates chat membership to endpoint caller-session keys. Later Gate 4 slices retired normal shell badge selection, bound terminal and stdio bridge authority to live caller sessions, keyed the 128-bit opaque caller reference with a non-reused endpoint service-scope id, and commit 5e9dc4e at 2026-04-29 11:05 UTC proves one child process/session receives distinct opaque caller-session reference tuples across two endpoint service scopes. Remaining selected-milestone work is the peer-owned adventure migration, final full-gate verification, and any documentation alignment needed after that migration lands.
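
The service-scoped opaque caller reference described above can be illustrated with a small sketch. It assumes the reference is derived from a kernel-private secret, the caller's session, and the endpoint service-scope id, so one session yields unrelated references in different scopes. Every name below is hypothetical, and the standard library's DefaultHasher is only a stand-in: it is not a cryptographic keyed function, and a real implementation would need one.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical sketch. DefaultHasher is a stand-in for a real keyed PRF
// and must not be treated as cryptographically secure.

/// Kernel-private values; a service never sees these directly.
#[derive(Clone, Copy)]
struct SessionId(u64);
#[derive(Clone, Copy)]
struct ServiceScopeId(u64);

/// Opaque 128-bit reference a service may use as a map key.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct CallerRef([u64; 2]);

/// Derive one 64-bit half; `domain` separates the two halves.
fn half(secret: u64, session: SessionId, scope: ServiceScopeId, domain: u8) -> u64 {
    let mut hasher = DefaultHasher::new();
    (secret, session.0, scope.0, domain).hash(&mut hasher);
    hasher.finish()
}

/// Same session and scope give the same reference; changing the scope
/// gives an unrelated one, so services cannot correlate callers.
fn caller_ref(secret: u64, session: SessionId, scope: ServiceScopeId) -> CallerRef {
    CallerRef([
        half(secret, session, scope, 0),
        half(secret, session, scope, 1),
    ])
}

fn main() {
    let secret = 0x5eed_u64; // placeholder; a real secret would be random
    let session = SessionId(42);
    let chat = caller_ref(secret, session, ServiceScopeId(1));
    let adventure = caller_ref(secret, session, ServiceScopeId(2));

    assert_eq!(chat, caller_ref(secret, session, ServiceScopeId(1)));
    assert_ne!(chat, adventure);
    println!("chat={:x?} adventure={:x?}", chat, adventure);
}
```

Because the scope id is described as non-reused, a reference derived for a retired scope would never collide with one derived for a later scope under this scheme.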

Paused visible milestone:

  • SSH Shell Gateway: ssh reaches the capOS login/native shell flow through an SSH-backed TerminalSession in QEMU, using host-local forwarding, public-key authentication, denied unsupported SSH features, and the same child shell capability boundary proven by Telnet. This remains planned Stage 7 work, but network-backed shell delegation should wait for the active session-bound invocation context migration to settle.

Candidate next visible milestones:

  • Multi-Process SMP Concurrency: implement full concurrent SMP scheduling, then have make run-smp-process-scale boot QEMU with multiple CPUs, run a deterministic CPU-bound workload split across independent worker processes, print verified output plus 1/2/4-process timing, and record near-linear 1-to-2 CPU speedup under repeated KVM-backed runs.
  • In-Process Threading Scalability: after per-thread capability rings and completion routing exist, have make run-thread-scale run the same class of workload across sibling threads in one process, verify the result, and record 1/2/4-thread timing without relying on a process-wide ring waiter.
  • Agent Shell
  • WebShellGateway
  • bootable disk image
  • local disk storage
  • federated chat
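
The 1/2/4-process and 1/2/4-thread timing goals above reduce to simple arithmetic: speedup is the single-worker time divided by the N-worker time, and efficiency is speedup divided by N. The sketch below assumes repeated runs are summarized by their median. The helper names are hypothetical, and the sample timings in main are placeholders rather than measured capOS results.

```rust
// Hypothetical helpers; the timings in `main` are placeholders, not measurements.

/// Median of repeated wall-clock samples (seconds); damps outlier runs.
fn median(samples: &mut [f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort_by(|a, b| a.total_cmp(b));
    let mid = samples.len() / 2;
    Some(if samples.len() % 2 == 1 {
        samples[mid]
    } else {
        (samples[mid - 1] + samples[mid]) / 2.0
    })
}

/// Speedup of an N-worker run relative to the single-worker run.
fn speedup(t1: f64, tn: f64) -> f64 {
    t1 / tn
}

/// Parallel efficiency: 1.0 means perfectly linear scaling.
fn efficiency(t1: f64, tn: f64, workers: u32) -> f64 {
    speedup(t1, tn) / f64::from(workers)
}

fn main() {
    // Placeholder samples only, to show the call shape.
    let mut one_worker = [8.1, 8.0, 8.3];
    let mut two_workers = [4.2, 4.1, 4.4];
    let t1 = median(&mut one_worker).unwrap();
    let t2 = median(&mut two_workers).unwrap();
    println!(
        "speedup {:.2}x, efficiency {:.0}%",
        speedup(t1, t2),
        100.0 * efficiency(t1, t2, 2)
    );
}
```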

Select the next milestone in WORKPLAN.md only after the current selected milestone is achieved and recorded, or when the user explicitly changes the selected milestone.