# Runtime, Networking, And Shell Backlog

Detailed decompositions for runtime, networking, shell, agent, and web shell
work. `WORKPLAN.md` links here but should not inline these subtasks.

## Scheduler/Park Measurement

The pre-thread dispatch instrumentation and the compact-vs-generic ParkBench
comparison are historical context. In-process threading later closed the first
blocked/resume measurement path with QEMU samples for private ParkSpace
wait/wake. Tie future measurement work to a concrete runtime or SMP change,
especially per-thread/per-CPU ring behavior.

## In-Process Threading Implementation

The old workplan marked every implementation subgate below complete but left
the parent task unchecked. Before starting follow-up work, reconcile this
status against the code, `docs/roadmap.md`, and `docs/changelog.md`.

Completed subgates retained for context:
- [x] Add `Thread` state with per-thread kernel stack, registers, and FS base.
- [x] Change scheduling from process-level to thread-level while preserving
      process-owned address spaces and cap tables.
- [x] Add `ThreadSpawner`/`ThreadHandle` and basic join/exit smoke.
- [x] Implement the first park authority capability and contended-path
      measurements.

## Runtime Ring Reactor Bridge

The current kernel ABI still exposes one process-owned capability ring. A
multithreaded runtime therefore needs a compatibility bridge until per-thread
kernel rings land.

Ordered gates:
- [ ] Add one runtime-owned process-ring CQ drainer.
- [ ] Map `user_data` completions back to ParkSpace-backed per-thread wait
      records.
- [ ] Prove sibling threads can issue ordinary calls and receive out-of-order
      completions without both draining the process CQ.
- [ ] Retire the bridge when per-thread capability rings and completion routing
      by generation-checked `ThreadRef` become the kernel ABI.
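
The bridge gates above can be sketched in ordinary Rust. This is a minimal
illustration under stated assumptions, not the capOS runtime: `Completion`,
`WaitRecord`, and `CompletionRouter` are hypothetical names, a
`std::sync::Condvar` stands in for a ParkSpace wait/wake, and the process CQ
itself is not modelled. It shows only how a single drainer could hand
out-of-order completions to sibling threads keyed by `user_data`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Hypothetical shape of one entry read from the process-owned CQ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Completion {
    pub user_data: u64,
    pub result: i64,
}

/// Per-thread wait record. The Condvar is a stand-in for ParkSpace wait/wake.
#[derive(Default)]
pub struct WaitRecord {
    slot: Mutex<Option<i64>>,
    wake: Condvar,
}

impl WaitRecord {
    /// Block the calling thread until the drainer publishes a result.
    pub fn wait(&self) -> i64 {
        let mut slot = self.slot.lock().unwrap();
        loop {
            if let Some(result) = slot.take() {
                return result;
            }
            slot = self.wake.wait(slot).unwrap();
        }
    }

    fn complete(&self, result: i64) {
        *self.slot.lock().unwrap() = Some(result);
        self.wake.notify_one();
    }
}

/// Maps `user_data` values to the wait record of the submitting thread.
#[derive(Default)]
pub struct CompletionRouter {
    next_user_data: Mutex<u64>,
    pending: Mutex<HashMap<u64, Arc<WaitRecord>>>,
}

impl CompletionRouter {
    /// Called by a submitting thread before it enqueues its call.
    pub fn register(&self) -> (u64, Arc<WaitRecord>) {
        let mut next = self.next_user_data.lock().unwrap();
        let user_data = *next;
        *next += 1;
        let record = Arc::new(WaitRecord::default());
        self.pending
            .lock()
            .unwrap()
            .insert(user_data, Arc::clone(&record));
        (user_data, record)
    }

    /// Called only by the single drainer. Returns false for unknown `user_data`.
    pub fn route(&self, completion: Completion) -> bool {
        let record = self.pending.lock().unwrap().remove(&completion.user_data);
        match record {
            Some(record) => {
                record.complete(completion.result);
                true
            }
            None => false,
        }
    }
}
```

In this sketch a submitting thread calls `register()`, attaches the returned
`user_data` to its ring submission, and blocks in `wait()`. Only the drainer
calls `route()`, so sibling threads never touch the CQ directly.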

## Telnet Shell Demo

Visible outcome: `make run-telnet` boots capOS in QEMU with
`hostfwd=tcp:127.0.0.1:2323-:23`, a `telnet-gateway` boot service listens on
guest port 23 through the kernel TCP capability surface, and a scripted host
smoke runs `telnet 127.0.0.1 2323`, logs in through the existing credential
flow, issues one shell command, and sees a clean disconnect. The proof should
include a console UART line and a host-side transcript.

Ordered gates:
- [x] Add the Phase B TCP interfaces to the canonical shared schema:
      `NetworkManager`, `TcpListener`, and `TcpSocket`. Keep this milestone
      TCP-only; `UdpSocket`, `DeviceMmio`, `DMAPool`, and `Interrupt` are
      decomposed-NIC / userspace-driver scope.
- [x] Replace the synthetic 10 ms smoltcp clock with scheduler-driven polling
      on real `TICK_COUNT`; the HTTP proof now persists as a retained smoltcp
      runtime polled from scheduler ticks. Depends on `Timer`.
- [x] Close the delegated endpoint relabeling gap before exposing shell launch
      over Telnet. A remote shell user must not be able to type an arbitrary
      endpoint identity such as `badge 200` and spawn a child that acts as a
      different chat/adventure participant. Omitted shell syntax now preserves
      the delegated source identity, and the low-level spawn hardening proof
      keeps the legacy badge-zero encoding covered. The containment gates in
      `docs/backlog/stage-6-capability-semantics.md` are complete; do not
      let a future regression re-expose badge selection through Telnet shell
      launch.
      Normal shell help and smoke-help expectations no longer advertise badge
      syntax.
- [x] Implement `NetworkManager`, `TcpListener`, and `TcpSocket` as kernel
      `CapObject`s wrapping the existing smoltcp smoke path. Reuse ring
      dispatch; do not add syscalls. `accept` and `recv` may be blocking calls
      for this milestone, with bounded result buffers and explicit close
      behavior. Initial implementation landed in commit `7446e04` at
      `2026-04-25 14:48 UTC`; follow-up review fixes removed timer-path
      allocation from deferred completion, hardened result-cap cleanup, and
      added `make qemu-network-client-harness` coverage for userspace
      `NetworkManagerClient`, `TcpListenerClient.accept`, and `TcpSocketClient`
      send/recv/close.
- [x] Complete the next endpoint-identity containment transition before
      unrelated Telnet gateway work: Gate 1 representation plus the minimum
      trusted mint path landed as the historical service-object routing proof.
      The selected follow-on is now Session-Bound Invocation Context: keep
      production remote shell launch blocked until one-session-per-process,
      privacy-preserving endpoint caller-session metadata, and shared-service
      migration settle.
- [x] Add the socket-backed terminal handoff needed by the demo. `capos-shell`
      must still receive a cap named `terminal` with `TerminalSession`
      interface id, backed by the accepted TCP socket. Do not pass raw
      `TcpSocket`, `ByteStream`, or `StdIO` as a replacement for the login
      terminal boundary. Satisfy this either by adding typed service-export /
      grant support so a userspace `telnet-gateway` endpoint can be presented
      as a `TerminalSession`, or by implementing a real kernel
      socket-backed `TerminalSession` `CapObject`.
      Implemented as `TcpSocket.intoTerminalSession`, which consumes a
      connected socket cap and returns a move-only `TerminalSession` result
      cap. `make qemu-network-client-harness` proves output, prompt, visible
      echo, and submitted line handling over an accepted TCP connection.
- [x] Add a `telnet-gateway` demo binary and `system-telnet.cue` manifest. The
      trusted demo gateway gets bootstrap `NetworkManager` and `ProcessSpawner`
      authority, plus pass-through `creds`, `sessions`, `audit`, and `broker`
      caps needed to spawn `capos-shell` with the same login/session semantics
      as the UART shell. The spawned shell must not receive raw network or broad
      process-spawn authority.
- [x] Add `make run-telnet` and a scripted `qemu-telnet-harness` host smoke
      that drives the full login/command/exit sequence and requires a proof
      line.
- [x] Document in `docs/proposals/networking-proposal.md` and
      `docs/proposals/shell-proposal.md` that telnet is demo-only plaintext,
      binds only to host loopback in the QEMU harness, preserves the
      `TerminalSession` boundary, and will be replaced by the SSH gateway
      once host-key, user-key, account, audit, and persistence prerequisites
      land.
      Implemented by branch commit `5d11b12` at `2026-04-25 20:06 UTC`.
      `make qemu-telnet-harness` proves `127.0.0.1:2323 -> guest :23`,
      password login, `caps`, the `session` command, and clean exit with no
      password, raw `NetworkManager`, raw `ProcessSpawner`, raw TCP, or
      unknown-cap leakage in the host transcript. Replacing the gateway's
      factory network/spawn authority with scoped listener and shell-launch
      caps is tracked in `REVIEW_FINDINGS.md`; it is not required for the
      host-local visible demo.
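
The transcript assertions described for `make qemu-telnet-harness` can be
illustrated with a small check. This is a sketch only: the real harness is not
shown in this document, and `find_leaks`, `has_proof_line`, the forbidden
marker list, and the proof-line text are all hypothetical. It assumes the
harness has already captured the host-side session as a string.

```rust
/// Returns every forbidden marker that appears in the transcript,
/// compared case-insensitively. The marker list is caller-supplied.
pub fn find_leaks<'a>(transcript: &str, forbidden: &[&'a str]) -> Vec<&'a str> {
    let haystack = transcript.to_ascii_lowercase();
    forbidden
        .iter()
        .copied()
        .filter(|marker| haystack.contains(&marker.to_ascii_lowercase()))
        .collect()
}

/// True when some transcript line equals the expected proof line.
/// Trailing whitespace is trimmed because Telnet lines end in CR LF.
pub fn has_proof_line(transcript: &str, proof: &str) -> bool {
    transcript.lines().any(|line| line.trim_end() == proof)
}
```

A harness built this way would fail the run when `find_leaks` returns a
non-empty list or `has_proof_line` returns false.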

## SSH Shell Gateway

Visible outcome: `make run-ssh-shell` boots capOS in QEMU with a host-local
forward to guest SSH, an `ssh-gateway` service authenticates a normal OpenSSH
client with a configured public key, launches `capos-shell` with an
SSH-backed `TerminalSession`, runs one shell command, and disconnects cleanly.
The shell must see the same terminal/session/broker boundary as the Telnet
demo, not raw TCP or SSH protocol authority.

Blocked by: Telnet Shell Demo for socket-backed `TerminalSession`,
cryptography/key-management for sign-only host keys, local account/key records
for authorized SSH keys, audit records for remote authentication decisions,
and persistent storage before production host or authorized keys are treated
as durable.

Closeout prerequisite: before this milestone closes, reconcile its target
name and host-harness placement with the run-target/init-mandate policy in
`docs/backlog/run-targets-and-init-policy.md` (Gate A naming split, Gate B
init mandate, Gate C test split, Gate D default-`make run` integration).
The current `make run-ssh-shell` working name and any scripted host harness
may need to become `test-ssh-shell` and be relocated, and default-run
exposure has to be addressed there, not as another `run-ssh-*` recipe.

Ordered gates:
- [x] Document the first SSH gateway contract in
      `docs/proposals/ssh-shell-proposal.md`: gateway authority, host-key
      custody, authorized-key mapping, accepted channel set, denied SSH
      features, terminal handoff, audit, resource limits, and teardown.
- [x] Close or explicitly preserve the scoped gateway authority gap for SSH
      before implementation: the gateway must receive a manifest-declared
      scoped listener or listener factory for only the configured SSH port, and
      the spawned shell must receive no raw `NetworkManager`, `TcpListener`,
      `TcpSocket`, or transport protocol authority. A temporary host-local demo
      compromise must stay documented in `REVIEW_FINDINGS.md` and the harness
      must prove the child boundary with `caps`.
      - [x] Scoped listener authority sub-slice: `tcp_listen_authority`
            manifest grants use the cap badge as a validated TCP port and mint
            a one-shot `TcpListenAuthority` that can create only that listener;
            `make run-tcp-listen-authority` proves generic init can forward
            the scoped cap to a child without raw `NetworkManager`.
- [x] Terminal-host wiring sub-slice: `ssh-gateway-terminal-host` now uses
      manifest-scoped `TcpListenAuthority` on the SSH development port and
      `RestrictedShellLauncher` to hand a socket-backed `TerminalSession` to
      `capos-shell` while proving the child lacks raw network, TCP, spawn,
      key-store, host-key, SSH gateway, terminal-factory, and launcher
      authority. This closes the scoped gateway authority gap for the bounded
      host-local proof; the final OpenSSH transport and channel harness remain
      separate gates.
- [x] Add manifest-declared shell launch authority for the gateway. Prefer a
      shell-only launcher or supervisor grant that can start only
      `capos-shell` with reviewed pass-through caps; do not grant broad
      `ProcessSpawner` authority to the SSH gateway unless it is explicitly
      recorded as a host-local development compromise.
      - [x] Restricted shell launcher sub-slice: `restricted_shell_launcher`
            manifest grants forward an init-held `RestrictedShellLauncher` cap
            to a child service. `make run-restricted-shell-launcher` proves the
            child service has no raw `ProcessSpawner`, `launchShell` has no
            binary selector and launches only `capos-shell`, session/profile
            mismatch and dangerous grant attempts fail closed, and the spawned
            shell uses the supplied session while lacking raw network, TCP,
            host-key, authorized-key-store, SSH gateway, and
            restricted-shell-launcher authority.
- [x] Add schema/design stubs for the minimum SSH support objects:
      `SshGateway` or equivalent service contract, sign-only `SshHostKey`
      wrapper around a `KeyVault`/`PrivateKey`, `AuthorizedKeyStore`, and
      SSH-backed `TerminalSession` construction. Do not expose private-key
      bytes, raw authorized-key storage, or vault administration to the spawned
      shell. Implemented as schema/type-surface stubs for `SshGateway`,
      `SshHostKey`, `AuthorizedKeyStore`, `SshTerminalFactory`,
      `TcpListenAuthority`, and `RestrictedShellLauncher`; no bootable kernel
      or userspace implementation is implied by this gate.
- [x] Add a development host-key path. Manifest-seeded keys may be used only
      for QEMU proof and must be labeled non-production; production host keys
      require the key-management and storage path. Implemented as
      `kernelParams.sshDevelopmentHostKey` plus the narrow
      `ssh_development_host_key` kernel source. The focused proof is
      `make run-ssh-host-key`; the development cap signs bounded
      `ssh-ed25519` exchange hashes from the manifest seed, verifies against
      the configured public key in QEMU, denies wrong algorithms, and remains
      explicitly non-production. Persistent production host-key storage,
      rotation, and key management remain future work.
- [ ] Add public-key user authentication. Accepted SSH keys map to principals
      and allowed shell profiles; `SessionManager` mints the session only
      after signature verification, and `AuthorityBroker` still decides the
      actual shell bundle.
      - [x] Public-key session bridge sub-slice: `SessionManager.sshPublicKey`
            checks a configured `AuthorizedKeyStore` record plus bounded
            fixture auth bytes/signature, mints a `UserSession` with the
            accepted principal/profile and `publicKey` auth strength, and
            `make run-ssh-public-key-auth` proves unknown, disabled,
            unsupported, and bad-signature paths fail closed before broker
            bundle minting. This is not full SSH transport authentication or
            shell launch wiring.
      - [x] AccountStore-bound session sub-slice:
            `SessionManager.sshPublicKey` consults the bootstrap
            `RamAccountStore` after signature verification
            (`lookup_by_principal`), so non-`Active` account statuses
            (Disabled, Locked, RecoveryOnly) and missing principals fail
            closed before a session is minted. Each denial cause maps to a
            stable, principal-blanked `auth=` audit code
            (`ssh-key-unknown`, `ssh-key-disabled`,
            `ssh-key-profile-not-allowed`, `ssh-bad-signature`,
            `ssh-account-missing`, `ssh-account-disabled`,
            `ssh-account-locked`, `ssh-account-recovery-only`,
            `ssh-account-lookup-failed`, `ssh-profile-kind-invalid`,
            `ssh-profile-not-interactive`, `ssh-auth-bytes-invalid`).
            `make run-ssh-public-key-auth` covers the non-account-status
            codes; the `ssh-account-*` codes need an
            `AccountStoreManagerCap` kernel cap source for runtime-mutated
            QEMU proofs (tracked in
            `docs/backlog/local-users-management.md` Gate 2).
- [ ] Reject unsupported SSH features with protocol failures and audit reason
      codes: password auth when disabled, `exec`, SFTP/subsystems, port
      forwarding, agent forwarding, X11 forwarding, arbitrary environment
      import, and multiple active shell channels.
      - [x] Policy-surface sub-slice: `capos-config::ssh_policy` returns
            allowed/denied decisions, SSH protocol failure classes, and stable
            audit reason codes for the narrow allowed path and the denied
            feature set, including second session-channel opens before any
            shell request. Password auth remains fail-closed until a real
            verifier/backoff path is part of the gateway policy.
            `make run-ssh-feature-policy` proves the table in QEMU. The full
            gateway item remains open until this policy is invoked by
            `ssh-gateway`.
- [ ] Implement the gateway as a terminal host. It owns SSH packet/channel
      state and gives `capos-shell` only a cap named `terminal` plus the
      normal scoped launch grants. The child must not receive raw network,
      host-key, authorized-key-store, key-vault, or broad spawn authority.
      - [x] Bounded terminal-host wiring sub-slice:
            `make run-ssh-gateway-terminal-host` proves a generic-init child
            service can combine scoped `TcpListenAuthority`,
            `AuthorizedKeyStore`, `SessionManager`, `AuthorityBroker`, and
            `RestrictedShellLauncher` grants to deny an unknown key, mint a
            `publicKey` session from a configured key, reject a mismatched
            broker profile, accept the matching broker profile, convert one
            host-local TCP socket into a `TerminalSession`, and launch
            `capos-shell` without giving the shell raw network,
            process-spawner, TCP listener/socket, host-key,
            authorized-key-store, SSH gateway, SSH terminal-factory, or
            restricted-shell-launcher authority. This remains a bounded
            plain-TCP proof and does not complete full SSH packet/channel
            ownership or the OpenSSH harness gate.
- [ ] Add `system-ssh-shell.cue`, `make run-ssh-shell`, and a host harness
      using `ssh` against the forwarded port. The harness must prove one
      successful public-key login, one shell command, clean exit, unknown-key
      denial, disabled-password denial, denied forwarding/subsystem requests,
      and cleanup after client disconnect.
      - [ ] OpenSSH version-exchange slice: add a real `ssh-gateway` service
            and `system-ssh-shell.cue` skeleton that accepts one host-local
            OpenSSH TCP connection, exchanges RFC 4253 identification strings,
            records the client software/version in bounded audit/proof output,
            and disconnects before key exchange without launching a shell.
            The normal compatibility harness should use `/usr/bin/ssh`; a
            separate low-level hostile TCP/banner fixture should prove
            malformed banners plus overlong identification strings fail
            closed.
      - [ ] KEXINIT and algorithm-selection slice: parse the unencrypted
            KEXINIT binary-packet exchange far enough to negotiate a pinned
            development algorithm set, reject unsupported algorithms with SSH
            disconnects, and keep the negotiated algorithm names out of any
            authority decision. The initial reviewed set should be exactly one
            modern KEX, `ssh-ed25519` host keys, one AEAD cipher/MAC pair, and
            `none` compression until rekey and broader algorithm policy exist.
      - [ ] Development key-exchange slice: complete the negotiated KEX,
            derive traffic keys from the shared secret, exchange hash, and
            session id per RFC 4253, call `SshHostKey.signExchangeHash` for the
            SSH exchange hash, and complete the OpenSSH handshake without
            exposing private host-key bytes or raw entropy to the gateway's
            child shell. Entropy is input for ephemeral KEX material, padding,
            and challenges; this remains non-production until host keys are
            durable and the entropy source has a reviewed production-quality
            policy.
      - [ ] OpenSSH public-key userauth slice: bind the OpenSSH userauth
            transcript to `SessionManager.sshPublicKey` so the accepted key
            maps to the configured principal/profile, unknown keys are denied
            generically, and disabled password auth returns the expected SSH
            failure without invoking `CredentialStore`.
      - [ ] Channel policy slice: invoke `capos-config::ssh_policy` for
            session-channel open, PTY, window-change, shell, exec, subsystem,
            forwarding, agent, X11, environment, and second-channel requests.
            The harness must prove the allowed shell path plus the denied
            feature requests with protocol-visible failures and sanitized audit
            reason codes.
      - [ ] SSH terminal launch slice: replace the plain-TCP terminal-host
            driver with the SSH channel-backed terminal path, launch
            `capos-shell` through `RestrictedShellLauncher`, run `session`,
            `caps`, and `exit` over OpenSSH, and prove disconnect cleanup for
            both client-close-before-shell and shell-exit-before-client-close.
- [ ] Update `docs/proposals/shell-proposal.md`,
      `docs/proposals/boot-to-shell-proposal.md`,
      `docs/security/trust-boundaries.md`, and `docs/proposals/index.md` when
      implementation begins so remote SSH login policy, terminal authority,
      and audit records stay aligned with the code.
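
For the OpenSSH version-exchange slice, the fail-closed banner check can be
sketched as below. RFC 4253 section 4.2 defines the identification string as
`SSH-protoversion-softwareversion SP comments CR LF`, limits it to 255
characters including CR LF, and excludes whitespace and `-` from
`softwareversion`. The names `ClientIdent`, `IdentError`, and
`parse_client_ident` are hypothetical, and the sketch assumes the caller has
already read one bounded line from the socket.

```rust
/// RFC 4253 section 4.2 limit, including the trailing CR LF.
pub const MAX_IDENT_LEN: usize = 255;

#[derive(Debug, PartialEq, Eq)]
pub struct ClientIdent {
    pub software: String,
    pub comments: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum IdentError {
    TooLong,
    MissingCrLf,
    NotPrintableAscii,
    BadPrefix,
    UnsupportedProtocol,
    BadSoftwareVersion,
}

/// Validate one client identification line, rejecting anything malformed.
pub fn parse_client_ident(raw: &[u8]) -> Result<ClientIdent, IdentError> {
    if raw.len() > MAX_IDENT_LEN {
        return Err(IdentError::TooLong);
    }
    let body = raw.strip_suffix(b"\r\n").ok_or(IdentError::MissingCrLf)?;
    if !body.iter().all(|b| (0x20..=0x7e).contains(b)) {
        return Err(IdentError::NotPrintableAscii);
    }
    // Every byte is printable ASCII, so this conversion cannot fail in practice.
    let text = std::str::from_utf8(body).map_err(|_| IdentError::NotPrintableAscii)?;
    let rest = text.strip_prefix("SSH-").ok_or(IdentError::BadPrefix)?;
    let (proto, tail) = rest.split_once('-').ok_or(IdentError::BadPrefix)?;
    if proto != "2.0" {
        return Err(IdentError::UnsupportedProtocol);
    }
    let (software, comments) = match tail.split_once(' ') {
        Some((software, comments)) => (software, Some(comments.to_string())),
        None => (tail, None),
    };
    if software.is_empty() || software.contains('-') {
        return Err(IdentError::BadSoftwareVersion);
    }
    Ok(ClientIdent {
        software: software.to_string(),
        comments,
    })
}
```

The parsed `software` field is the kind of value the slice would record in
bounded audit/proof output. A real gateway would also need to cap how many
bytes it reads while waiting for the line terminator, which this sketch does
not cover.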

## Decomposed NIC Milestone

Move the NIC driver and TCP/IP stack out of the kernel into dedicated
userspace processes after the Telnet Shell Demo has made the socket interfaces
capability-shaped. `make run-telnet` must still pass end-to-end with zero
change to the shell or gateway.

Blocked by: Telnet Shell Demo and the userspace-driver transition gate.

- [ ] Define first `DeviceMmio`, `DMAPool`, and `Interrupt` schemas.
- [ ] Move virtio-net ownership into a userspace driver process holding only
      `DeviceMmio`, `Interrupt`, and `DMAPool` caps.
- [ ] Split smoltcp into a separate userspace network-stack process that holds
      the `Nic` cap from the driver and re-exports the Phase B socket
      interfaces.
- [ ] Confirm `make run-telnet` still passes with the decomposed topology and
      the kernel no longer depends on `smoltcp` or virtio-net.

## Agent Shell / Agent Runner

The native shell's agent mode must land before exposing the shell through a
browser. The shell remains the trusted runner and session-cap holder. The model
service receives prompts and returns structured tool calls, but never receives
session caps, terminal caps, launcher authority, raw tokens, or secrets. Use a
deterministic test model for the first proof.

Visible outcome: `make run-agent-shell` boots capOS in QEMU, grants
`capos-shell` a broker-issued `LanguageModel` cap plus a per-tool permission
map, enters agent mode, exposes the current session bundle as typed tool
descriptors, executes one read-only tool call automatically, requires consent
or step-up for a mutating/admin-shaped call, handles user cancellation, and
records redacted audit output.

Ordered gates:
- [ ] Add the first agent-runner schema/interfaces: `LanguageModel`,
      `ModelInfo`, `ToolDescriptor`, `ToolCall`, `ToolResult`, permission mode
      metadata, and bounded streaming/cancel semantics. Keep tool calls
      structured; do not parse model text as shell commands.
- [ ] Extend `AuthorityBroker` session profiles so an operator shell can
      receive a `LanguageModel` cap and a per-tool permission map without
      receiving model-admin, model-catalog, or provider-token authority.
- [ ] Add a deterministic in-tree `LanguageModel` test service that emits
      scripted tool calls for QEMU proofs. Do not block this milestone on
      large local model weights, remote providers, GPU, or storage.
- [ ] Implement native shell agent mode: build the tool table from granted
      session caps and schema metadata, stream model turns, gate each tool call
      through `auto` / `consent` / `stepUp` / `forbidden`, invoke only the
      capabilities held by the shell runner, and feed outcomes back into the
      loop.
- [ ] Wire consent, step-up, cancellation, timeout, quota, and audit behavior.
      User interrupts beat model momentum; denied or cancelled tool calls
      become ordinary tool outcomes instead of hidden control flow.
- [ ] Add `make run-agent-shell` and a scripted QEMU harness that proves
      read-only auto execution, denied forbidden/admin tool exposure, one
      consent or step-up prompt, cancellation, and redacted audit records.
- [ ] Update `docs/proposals/llm-and-agent-proposal.md`,
      `docs/proposals/shell-proposal.md`, and `WORKPLAN.md` to record that
      WebShellGateway hosts this agent-capable shell/runner instead of
      defining a separate browser-side agent authority model.
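
The `auto` / `consent` / `stepUp` / `forbidden` gating described above could
look roughly like the following. This is a sketch under assumptions, not the
planned schema: `PermissionMode`, `UserDecision`, `ToolOutcome`, and
`gate_tool_call` are hypothetical names, tools are identified by plain strings,
and the consent or step-up prompt is reduced to a callback. It illustrates two
points from the gates: unknown tools fail closed, and denial or cancellation is
returned as an ordinary outcome.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PermissionMode {
    Auto,
    Consent,
    StepUp,
    Forbidden,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UserDecision {
    Approved,
    Denied,
    Cancelled,
}

/// Every path yields a value the agent loop can feed back to the model.
#[derive(Debug, PartialEq, Eq)]
pub enum ToolOutcome {
    Executed(String),
    Denied(&'static str),
    Cancelled,
}

/// Gate one structured tool call through the per-tool permission map.
pub fn gate_tool_call(
    permissions: &HashMap<String, PermissionMode>,
    tool: &str,
    ask_user: &mut dyn FnMut(&str, PermissionMode) -> UserDecision,
    invoke: &mut dyn FnMut(&str) -> String,
) -> ToolOutcome {
    // Tools missing from the map are treated as forbidden (fail closed).
    let mode = permissions
        .get(tool)
        .copied()
        .unwrap_or(PermissionMode::Forbidden);
    match mode {
        PermissionMode::Forbidden => ToolOutcome::Denied("forbidden"),
        PermissionMode::Auto => ToolOutcome::Executed(invoke(tool)),
        PermissionMode::Consent | PermissionMode::StepUp => match ask_user(tool, mode) {
            UserDecision::Approved => ToolOutcome::Executed(invoke(tool)),
            UserDecision::Denied => ToolOutcome::Denied("user-denied"),
            UserDecision::Cancelled => ToolOutcome::Cancelled,
        },
    }
}
```

Because the shell runner owns both callbacks, the model side only ever sees
the resulting `ToolOutcome`, never the capability that was invoked.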

## WebShellGateway

Add the browser-hosted terminal and authentication gateway after both the
remote `TerminalSession` proof and the agent shell are in place. The gateway
owns HTTP/WebSocket or equivalent transport, TLS/origin/RP-ID validation,
WebAuthn challenge/response, terminal rendering, and session teardown. It
launches the same agent-capable native shell with the same broker-issued
session profile.

Blocked by: Telnet Shell Demo for socket-backed `TerminalSession`, Agent
Shell / Agent Runner, passkey challenge/credential support in auth/session
services, and TLS/origin/RP-ID policy. OIDC is a follow-up path on the same
gateway, not a prerequisite for the first WebAuthn shell.

Visible outcome: `make run-webshell` boots capOS in QEMU with host-local
forwarding to the web gateway, a headless browser harness opens the terminal
UI with a virtual WebAuthn authenticator, authenticates, runs one shell or
agent command, logs out or closes the tab, and verifies clean
shell/process/session teardown plus a recorded transcript/proof line.

Ordered gates:
- [ ] Define the web terminal stream protocol over WebSocket or an equivalent
      browser transport: input, output, resize, paste, close, cancellation,
      flow control, session IDs, and bounded buffering.
- [ ] Add WebAuthn/passkey credential and challenge support: public-credential
      records, single-use bounded challenges, entropy fail-closed behavior,
      origin/RP-ID binding, user-presence/user-verification policy,
      sign-count handling, rate limiting, and redacted audit events.
- [ ] Add TLS and browser origin policy for QEMU and deployment modes. The
      first harness may use a local development trust path, but the gateway
      must have explicit Host/Origin/RP-ID checks and no production plaintext
      mode.
- [ ] Implement `WebShellGateway` as a terminal host service: accept browser
      sessions, authenticate, request the narrow shell/agent bundle from
      `AuthorityBroker`, create or wrap a web-backed `TerminalSession`, spawn
      `capos-shell`, proxy terminal events, and release all session resources
      on logout, tab close, timeout, or shell exit.
- [ ] Add `system-webshell.cue` and manifest/grant wiring. The gateway gets
      only listen/TLS/auth/session/broker/restricted-launch grants needed for
      the job; the spawned shell does not receive raw network, raw auth
      material, model-provider tokens, or broad process-spawn authority.
- [ ] Add `make run-webshell` and `qemu-webshell-harness` with a virtual
      WebAuthn authenticator in a headless browser, transcript capture,
      login/command proof, logout/close proof, and assertions that failed auth
      and stale browser sessions do not leave a live shell.
- [ ] Add optional OIDC authorization-code + PKCE login on the same gateway
      after the OAuth/OIDC service exists. ID-token verification and
      `acr`/`amr` mapping feed `SessionManager`/`AuthorityBroker`; raw tokens
      do not enter the shell or browser terminal transcript.
- [ ] Update `docs/proposals/boot-to-shell-proposal.md`,
      `docs/proposals/shell-proposal.md`,
      `docs/proposals/llm-and-agent-proposal.md`, and security trust-boundary
      docs with WebShellGateway authority, auth, terminal, audit, and teardown
      rules.
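
The "single-use bounded challenges" and "entropy fail-closed" requirements in
the WebAuthn gate can be sketched as a small store. Everything here is
illustrative: `ChallengeStore`, `ChallengeError`, the 32-byte challenge size,
and the `entropy` callback are assumptions rather than the planned capOS
interfaces, and origin/RP-ID binding is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq, Eq)]
pub enum ChallengeError {
    NoEntropy,
    StoreFull,
    UnknownOrReused,
    Expired,
}

/// Bounded set of outstanding challenges, each usable at most once.
pub struct ChallengeStore {
    capacity: usize,
    ttl: Duration,
    pending: HashMap<[u8; 32], Instant>,
}

impl ChallengeStore {
    pub fn new(capacity: usize, ttl: Duration) -> Self {
        Self {
            capacity,
            ttl,
            pending: HashMap::new(),
        }
    }

    /// Issue a challenge. `entropy` returns None when randomness is
    /// unavailable, in which case no challenge is issued.
    pub fn issue(
        &mut self,
        now: Instant,
        entropy: &mut dyn FnMut() -> Option<[u8; 32]>,
    ) -> Result<[u8; 32], ChallengeError> {
        let ttl = self.ttl;
        // Drop expired entries first so they do not count against the bound.
        self.pending
            .retain(|_, issued| now.saturating_duration_since(*issued) < ttl);
        if self.pending.len() >= self.capacity {
            return Err(ChallengeError::StoreFull);
        }
        let challenge = entropy().ok_or(ChallengeError::NoEntropy)?;
        self.pending.insert(challenge, now);
        Ok(challenge)
    }

    /// Consume a challenge. It is removed before any other check, so a
    /// second attempt with the same bytes always fails.
    pub fn consume(&mut self, now: Instant, challenge: &[u8; 32]) -> Result<(), ChallengeError> {
        let issued = self
            .pending
            .remove(challenge)
            .ok_or(ChallengeError::UnknownOrReused)?;
        if now.saturating_duration_since(issued) >= self.ttl {
            return Err(ChallengeError::Expired);
        }
        Ok(())
    }
}
```

Passing `now` explicitly keeps the sketch deterministic under test; a gateway
would supply its own clock and a reviewed entropy source.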