Remote Session CapSet Client Backlog

Detailed decomposition for the remote host app path described in Remote Session CapSet Clients. docs/tasks/README.md should point here when selecting implementation slices; it should not inline the details.

Visible Outcome

make run-remote-session-capset-interop boots capOS in QEMU, starts a loopback-scoped remote session gateway, and runs a regular Rust client on the host. The client authenticates or exercises an explicitly configured guest/anonymous denial path, obtains a RemoteSession, lists a broker-issued RemoteCapSet, gets typed capabilities by name/interface ID, calls at least two granted capabilities, proves missing/wrong-interface denials, logs out or disconnects, and observes stale proxy calls fail closed.

The first harness can be a small CLI because it is easy to script. The product shape should also support a native desktop GUI, a Tauri app whose Rust backend holds the remote CapSet, or a webapp whose trusted server/gateway holds the remote CapSet and exposes only UI frames, command descriptors, or bounded tool requests to browser JavaScript. The UI path can be bidirectional: the host UI may grant a narrow UI-surface capability back to capOS-side services or agents so they can propose task-specific panes, command palettes, visualizations, theme hints, and layout changes without receiving arbitrary host UI authority.

The ordinary operator run story is: start capOS with make run, note the printed remote CapSet: tcp 127.0.0.1 <port> -> guest :2327 line, then start one of the host clients against that endpoint. make run injects the host USER as the default operator account name on the capOS side; the CLI may take --user (or CAPOS_REMOTE_SESSION_USER) as an explicit operator override, but the web bridge keeps the login username field empty by default to avoid leaking host identity hints into the page before authentication. The current repo-local commands are:

make run
cargo run --manifest-path tools/remote-session-client/Cargo.toml \
  --target x86_64-unknown-linux-gnu \
  --bin remote-session-client -- --host 127.0.0.1 --port <printed-port>
CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-ui

The CLI also accepts --launch-adventure for the default-manifest proof that starts the Adventure service graph through serviceLaunch and requires a running status. --adventure-status follows a successful Adventure launch with bounded Adventure.status, Adventure.look, and Adventure.inventory calls through the session-bound worker; --adventure-go <direction> adds the first mutable typed DTO call by invoking bounded Adventure.go(direction) and checking the returned text/room response. The same CLI path now accepts bounded --adventure-take <item>, --adventure-use <item>, and --adventure-drop <item> controls for simple item interactions. The focused positive proof is make run-remote-session-adventure-interop; the existing make run-remote-session-capset-interop fixture remains a launch-denial proof shared with the browser UI smoke path.

The CLI and trusted local web bridge are development tools in this repo. The repo-local Tauri path reuses the same Rust backend boundary by loading the loopback remote-session-ui surface in a desktop webview:

CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-tauri

By default that target first runs a policy preflight over the reviewed check/dev scaffold, then checks Tauri CLI and Linux build prerequisites, reports dependency/scaffold status, and runs a deterministic wrapper cargo check when the host has those prerequisites. Set CAPOS_REMOTE_SESSION_TAURI_MODE=dev to launch cargo tauri dev. Missing host Tauri packages fail with explicit diagnostics and point operators back to make remote-session-ui; the Tauri wrapper is not a different authority model. CAPOS_REMOTE_SESSION_TAURI_MODE=policy tools/remote-session-tauri.sh runs only the scaffold guardrail and does not need Tauri system packages or a desktop session. package and automation modes are intentionally blocked until distributable packaging and desktop automation receive reviewed designs.
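
The mode selection described above can be sketched as a small parser. This is a minimal illustration, not the wrapper's actual logic (which lives in tools/remote-session-tauri.sh): the TauriMode enum, the parse_tauri_mode function, and the acceptance of a literal "check" value for the default mode are assumptions of this sketch.

```rust
/// Modes the sketch treats as runnable. The variant names are hypothetical.
#[derive(Debug, PartialEq, Eq)]
pub enum TauriMode {
    /// Default: policy preflight, prerequisite checks, wrapper `cargo check`.
    Check,
    /// Launches `cargo tauri dev`.
    Dev,
    /// Runs only the scaffold guardrail.
    Policy,
}

/// Map the raw CAPOS_REMOTE_SESSION_TAURI_MODE value to a mode.
/// Treating an unset or empty value as the default mode, and accepting a
/// literal "check", are assumptions of this sketch.
pub fn parse_tauri_mode(raw: Option<&str>) -> Result<TauriMode, String> {
    match raw.map(str::trim).unwrap_or("") {
        "" | "check" => Ok(TauriMode::Check),
        "dev" => Ok(TauriMode::Dev),
        "policy" => Ok(TauriMode::Policy),
        // Packaging and desktop automation are intentionally blocked.
        blocked @ ("package" | "automation") => Err(format!(
            "mode `{}` is blocked until a reviewed design exists",
            blocked
        )),
        other => Err(format!("unknown mode `{}`", other)),
    }
}
```

A caller could feed it the environment value with parse_tauri_mode(std::env::var("CAPOS_REMOTE_SESSION_TAURI_MODE").ok().as_deref()). Rejecting blocked modes in the parser keeps the fail-closed behavior in one place.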

The first visible proof keeps QEMU host forwarding and a development transport. The current implementation uses length-prefixed schema-framed Cap’n Proto DTOs for remote login, session summary, CapSet list/get, calls, denials, and logout. Standard capnp-rpc framing and live object proxies remain the transport direction, but the first proxy slice is now explicitly dual-stack: host-backend-only capnp-rpc proxy objects over the existing DTO gateway first, then guest-wire replacement after the capOS userspace runtime decision.
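
To make the framing concrete, here is a minimal host-side sketch of length-prefixed frame I/O. It assumes a 4-byte length header (the partial-frame probe in this document sends a 4-byte header); the little-endian byte order, the MAX_FRAME_LEN bound, and the function names write_frame and read_frame are assumptions of the sketch, not the gateway's confirmed wire format.

```rust
use std::io::{self, Read, Write};

/// Assumed upper bound on a payload; the real gateway limit is not stated here.
const MAX_FRAME_LEN: usize = 64 * 1024;

/// Write one frame: a 4-byte length header followed by the payload bytes.
/// Little-endian header order is an assumption of this sketch.
pub fn write_frame<W: Write>(out: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    out.write_all(&(payload.len() as u32).to_le_bytes())?;
    out.write_all(payload)
}

/// Read one frame, rejecting an oversized declared length before allocating.
pub fn read_frame<R: Read>(input: &mut R) -> io::Result<Vec<u8>> {
    let mut header = [0u8; 4];
    input.read_exact(&mut header)?;
    let len = u32::from_le_bytes(header) as usize;
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "declared length too large",
        ));
    }
    let mut payload = vec![0u8; len];
    // A peer that closes mid-payload surfaces here as UnexpectedEof.
    input.read_exact(&mut payload)?;
    Ok(payload)
}
```

The payload would carry one serialized Cap'n Proto DTO message; the sketch leaves serialization out and only shows the frame boundary.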

Implementation Status

Implemented and active slices:

  • The capnp-rpc transport DTO surface is pinned in schema/capos.capnp ahead of the transport rewrite: RemoteAuthStart, RemoteAuthStep, RemoteServiceGrantRequirement, RemoteServiceExport, RemoteServiceProfile, plus the RemoteSessionGateway, RemoteAuthFlow, RemoteSession, RemoteCapSet, RemoteServiceCatalog, and RemoteServiceRunner interfaces. Round-trip coverage lives in capos-config/tests/remote_capnp_rpc_dto_roundtrip.rs.

  • Runtime placement decision: capnp-rpc v0.25 is std-only and needs a futures executor, while demos/remote-session-capset-gateway/src/main.rs is a #![no_std] #![no_main] gateway with a synchronous accept/recv/handle/send loop. Therefore the first proxy implementation is host-backend-only. The trusted Linux Rust backend may host a local capnp-rpc facade/proxy layer for chat or Adventure and translate those calls into the existing RemoteGatewayRequest / RemoteGatewayResponse DTO transport. The gateway, schema, generated bindings, kernel services, browser API, and browser view models stay unchanged for that slice. This is a temporary dual-stack period: the backend proves proxy semantics and exception mapping over the DTO wire, but it must not be documented as live standard capnp-rpc support inside the capOS guest. The gateway wire replacement remains gated on a reviewed capOS userspace async runtime or a reviewed sync-friendly Cap’n Proto RPC adapter. The completed task file docs/tasks/done/2026/remote-session-host-backend-capnp-rpc-facade.md records the implementation metadata and validation for the host-backend slice.

  • Host-backend capnp-rpc facade for Chat landed 2026-05-13 08:29 UTC. tools/remote-session-client/src/rpc_facade.rs creates a local capnp-rpc Chat client/server object in the trusted Linux backend and translates join, leave, send, who, and poll calls into the existing synchronous RemoteGatewayRequest / RemoteGatewayResponse DTO operations. The CLI client and trusted web bridge now route chat calls through the same backend-only facade. Browser JavaScript still receives only view models, typed results, typed denial envelopes, and redacted transcript rows; it does not receive raw capOS caps, local cap ids, endpoint owner handles, result-cap slots, process handles, or proxy table positions. Denials remain DTO/domain results at the browser boundary, and transport disconnects keep the existing reconnect-required mapping. This proves backend proxy semantics over the DTO transport, not live standard capnp-rpc support inside capOS or on the guest gateway wire.

  • The capOS SDK transitional RemoteTransport now uses the same trusted host DTO backend as a host-side std transport for shared typed clients. The first proof maps a forwarded system_info cap obtained through CapSetGet to a synthetic host-side cap id and drives SystemInfoClient::motd_wait through the current systemMotd DTO. This is still backend proxying over the length-prefixed DTO gateway, not live guest-wire capnp-rpc.

  • make run starts the remote-session CapSet gateway in the default manifest and forwards guest port 2327 to a host-local loopback port. The helper prefers 127.0.0.1:2327 but selects a free fallback when another QEMU run or developer process already occupies the port, unless the port is explicitly configured.

  • make run-remote-session-capset-interop boots a focused manifest, runs a Linux Rust host client, authenticates as the configured operator by default, lists the broker-shaped remote CapSet, calls session, system_info, and the first endpoint-backed chat service through a per-session worker proxy, proves wrong-interface/unknown/stale denials, and records a redacted transcript.

  • make run-remote-session-adventure-interop uses a focused manifest with the Adventure server, companion NPC binaries, and remote-session-adventure-worker embedded. The operator client launches the Adventure graph, calls adventureStatus, adventureLook, adventureInventory, and the first mutable adventureGo(direction) DTO plus bounded adventureTake(item), adventureUse(item), and adventureDrop(item) controls, proves stale failure after logout, and preserves the same transcript authority-leak checks.

  • RemoteAuthMethod advertises password and anonymous as enabled methods plus disabled public-key, OIDC, and passkey/WebAuthn entries so the protocol and client are not password-shaped.

  • The capOS gateway uses manifest-scoped TcpListenAuthority on guest port 2327, plus SessionManager and AuthorityBroker. It does not receive raw NetworkManager, TcpListener, or TcpSocket authority, and the manifest does not grant service endpoint caps directly to the gateway. The gateway asks the broker for a narrower remote-client bundle, exposes broker-held service endpoints such as adventure and chat as remote CapSet descriptors, and starts the first chat endpoint proxy through a session-bound worker when the client calls chatSend. Adventure status, look, inventory, bounded go(direction), and bounded take/drop/use item calls now have matching service-specific worker/client slices after the Adventure graph is launched. Other mutable Adventure methods and Paperclips direct methods still wait for their service-specific worker/client slices. Login source metadata is derived by the gateway from the accepted socket and a gateway-generated connection event id rather than from client-supplied fields.

  • The host Rust client crate is UI-neutral and can back a CLI, native GUI, Tauri backend, or trusted web gateway.

  • A first trusted local web bridge now exists as remote-session-ui. It serves a loopback-only browser UI whose Rust backend holds the TCP connection and remote session state. make run-remote-session-capset-ui boots the focused gateway-only fixture, drives every visible button in the browser UI, and captures a screenshot plus a redacted transcript. The current web UI uses a dedicated full-window sign-in view with compact endpoint/auth controls and no full persistent technical header. Login includes a visible username field that is empty by default – the bridge does not pre-fill from CAPOS_REMOTE_SESSION_USER, host USER, or any other host-side identity hint, because pre-filling would leak operator/account hints to anything observing the page before authentication. The browser sends only username/password for password login; operator and other resource-profile names are not user-typed system details. For the current legacy DTO protocol, the trusted Rust backend maps an omitted password profile to the default operator profile before calling the gateway; gateway-side profile policy/picker support remains future work for manifests with multiple user-meaningful choices. Authenticated users land in a Services-first SPA workspace with Services, CapSet, Diagnostics, Transcript, and Session views rather than seeing every technical panel at once. The UI smoke tracks visible buttons across login and workspace states and fails when any visible button is not exercised.

  • The current UI slice makes Services the task-oriented SPA action hub for the default-manifest service surface. It should use the catalog and launcher view models to show runnable profiles, required grants, launch status, denials, and generic/simple service panels without moving capOS authority into browser JavaScript.

  • A read-only DTO service catalog now advertises currently available remote DTO services (session, system_info) plus backend-held endpoint services such as chat and adventure when the broker returns them for the authenticated profile. A companion launcher catalog describes service-runner profiles, required grants, and exported service descriptors. Adventure is the active default-manifest launch profile; Paperclips remains a future profile until its authoritative server path is available to the default remote session. The catalogs are browser-safe view models only: no raw ProcessSpawner, process handle, endpoint owner, local cap id, or result-cap slot is exposed.

  • The launch DTO/probe slice is complete. It exposes the remote-safe serviceLaunch request/status path for cataloged profiles. The request carries only a profile id plus explicit grant names; the status reports support state, accepted grant names, a message, and exported or planned service descriptors. The completed probe contract does not call spawn, create/own endpoint receivers, return process handles, or attach new service caps to the remote CapSet.

  • The current Adventure serviceLaunch slice implements the actual restricted backend launch for the default make run manifest. The trusted backend/gateway starts adventure-server plus simple NPC companion processes through an approved service-runner profile and attaches or retains backend-held descriptors/caps for the Adventure/chat-facing services. Browser JavaScript still receives only view models, launch status, service descriptors, denial diagnostics, and typed results. Real direct Chat.send now runs through the first per-session worker/proxy proof; Adventure status, look, inventory, bounded go(direction), and bounded take/drop/use item actions use the same pattern after launch, while richer Adventure controls remain later client layers over the same backend-held capability boundary.

  • The launch-denial proof is implemented for the currently exposed remote gateway paths. Focused CLI and UI QEMU harnesses drive operator missing-grant, wrong-interface, and disallowed-binary serviceLaunch denials; the CLI QEMU harness also drives stale-session and anonymous/no-runner serviceLaunch denials. Smoke checks require explicit error codes/messages, backend teardown, no Adventure server or companion process spawn in the denial-only fixture, and no raw process-handle, endpoint-owner, local-cap, result-cap, capability-manager, process-spawner, terminal-authority, or network-authority markers in browser-visible envelopes, UI reports, or redacted transcripts. The separate run-remote-session-adventure-interop fixture embeds the Adventure binaries, requires the Adventure process graph to spawn, and verifies direct Adventure.status, Adventure.look, Adventure.inventory, mutable Adventure.go(direction), and bounded item take/use/drop responses through the worker. Guest admission shipped on 2026-05-08 03:59 UTC as RemoteAuthMode::Guest plus the RemoteGatewayRequest.guestLogin @24 union arm; the gateway routes it through start_guest_session and the shared validate_guest_admission lib-level helper, which refuses any attempt to acquire a non-guest profile (e.g. operator, anonymous) via the guest method and any session whose minted principal is not Guest. The QEMU interop harness now exercises a guest happy-path proof and a guest-profile-mismatch denial; the RemoteErrorCode::DisabledAuthMethod path is covered through the bridge host-test layer (a manifest with no guest seed makes the kernel SessionManager.guest() return failure, which the gateway maps to that code).

  • Rust-level backend/account-store denial coverage now proves inactive accounts (disabled, locked, and recovery-only), unknown principals, and missing or retired resource profiles cannot produce remote-client bundle plans. Focused SessionManager account-selection coverage records that unknown, inactive, non-operator, or no-console-password account paths do not become password-login candidates suitable for later broker use. The live CLI QEMU gateway proof now drives failed password proof, unknown account, wrong password requested profile, and anonymous profile mismatch cases; each denied client completes as auth-denied with no session start, CapSet list/get, session info, or service-launch activity. Denied re-login clears prior per-connection gateway state plus cached host-client and web-bridge session view state instead of leaving stale authority usable after denial.

  • Kernel-backed remote logout is implemented for the DTO gateway. Each SessionManager-minted UserSession registers a kernel-private liveness cell keyed by the minted session id. Reconstructed broker and launcher SessionContext values resolve that existing cell and fail closed if it is absent or logged out; they do not create fresh live state from SessionInfo bytes. Explicit remote logout calls UserSession.logout, and connection teardown logs out the owned live remote session before dropping the backend session cap. UserSession.info, session-bound SystemInfo, endpoint call admission, and normal service-cap dispatch go stale after logout; UserSession.auditContext remains available for audit attribution. Endpoint returns now recheck the caller session at the return commit point: if the caller logged out, expired, or otherwise went stale after admission, the kernel rolls back prepared result-cap move sources, cancels the in-flight call instead of restoring it, posts an invoke-failed caller completion when the caller CQ can accept it, and rejects the server RETURN without copying result bytes, application-exception payloads, result-cap records, or returned caps into the stale caller.

  • Gateway idle-disconnect bug fixed (operator-reported regression on the trusted web bridge). Symptom: after some time of using make remote-session-ui against make run, the next routine action – often a periodic or user-driven sessionInfo refresh – failed with gatewayDisconnected carrying the message “remote gateway closed the connection during sessionInfo; retry login to reconnect”, forcing the operator to log in again. Root cause was gateway-side: the per-frame TCP recv on the accepted remote-session socket used a 5-second timeout (WAIT_NS = 5_000_000_000) inside recv_exact / recv_frame. Routine inter-request idleness on the bridge – which is reactive, not driven by a background poller – exceeded the 5 s budget, the gateway treated the timeout as a fatal recv failure, exited the per-connection loop, ran close_remote_session_state (issuing UserSession.logout and the “remote session stale” / “connection teardown” audit lines) and dropped the TCP connection, then accepted the next host TCP attempt fresh. The bridge’s next request hit the closed socket and surfaced the disconnect through gateway_io_error. Fix: use RECV_FRAME_WAIT_NS = CAP_ENTER_WAIT_FOREVER for the per-frame recv loop. The kernel-side TCP recv waiter still resolves on data arrival, on clean peer FIN as a 0-byte completion (treated as graceful peer teardown), and on transport-level errors (treated as fatal recv failure); only the spurious 5-second idle timeout is removed. Regression test: recv_frame_wait_is_forever_to_survive_idle_remote_clients in demos/remote-session-capset-gateway/src/lib.rs pins the policy constant. A const _: () = assert!(...) in the gateway main keeps the lib constant and the runtime CAP_ENTER_WAIT_FOREVER sentinel in lockstep so the value cannot drift back to a finite timeout. 
    The short-lived smoke harnesses (make run-remote-session-capset-interop, make run-remote-session-capset-ui) finish well within the previous 5 s budget and so did not catch this – the bug only fires under realistic interactive operator pacing. Future work: when SSH Shell Gateway lands, audit the equivalent recv-loop policy on that path before borrowing the shape from this gateway.

  • Partial-frame DoS proof closed 2026-05-07 08:37 UTC. The forever-wait fix above survives quiet remote peers but, taken alone, also lets a peer that sends a frame header and then stalls (or dribbles a few bytes per minute) keep the gateway accept loop pinned on a single connection. The gateway recv now uses a two-phase wait policy: byte 1 of an idle frame waits forever (RECV_FRAME_WAIT_NS = CAP_ENTER_WAIT_FOREVER) with up to TCP_RETRY_ATTEMPTS = 1024 EAGAIN retries, while bytes 2..N of an already-started frame use the bounded WAIT_NS = 5_000_000_000 (5 s) wait with no EAGAIN retry, and the per-frame recv-call count is capped at MAX_FRAME_COMPLETION_RECVS = 64, bounding a slow-dribble peer at roughly 5 minutes per frame before the gateway closes the connection. Proven by run_partial_frame_probe in tools/qemu-remote-session-capset-harness.sh, which opens a TCP connection, sends a 4-byte header declaring an 8192-byte payload followed by only 4096 payload bytes, and observes the gateway closing the connection within 20 seconds; the QEMU smoke (tools/qemu-remote-session-capset-smoke.sh) asserts the proof line remote-session partial-frame proof: started payload closed after bounded wait.
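
The two-phase wait policy from the last two bullets can be sketched on the host as follows. The constant names and values come from the text above, except CAP_ENTER_WAIT_FOREVER, whose u64::MAX value is an assumption. The Socket trait, RecvOutcome, FrameProgress, FrameError, and this recv_exact signature are hypothetical stand-ins for the gateway's no_std recv path.

```rust
pub const WAIT_NS: u64 = 5_000_000_000; // bounded wait once a frame has started
pub const CAP_ENTER_WAIT_FOREVER: u64 = u64::MAX; // assumed sentinel value
pub const RECV_FRAME_WAIT_NS: u64 = CAP_ENTER_WAIT_FOREVER;
pub const TCP_RETRY_ATTEMPTS: usize = 1024;
pub const MAX_FRAME_COMPLETION_RECVS: usize = 64;

/// Hypothetical result of one recv call.
#[derive(Debug)]
pub enum RecvOutcome {
    Data(usize), // Data(0) models a clean peer FIN
    WouldBlock,  // EAGAIN
    TimedOut,
    Failed,
}

/// Hypothetical socket abstraction so the policy can run without a kernel.
pub trait Socket {
    fn recv(&mut self, buf: &mut [u8], wait_ns: u64) -> RecvOutcome;
}

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    PeerClosed,
    Stalled,
    Failed,
}

/// Per-frame state shared by the header read and the payload read.
#[derive(Debug, Default)]
pub struct FrameProgress {
    pub started: bool,
    pub completion_recvs: usize,
    pub idle_retries: usize,
}

pub fn recv_exact<S: Socket>(
    sock: &mut S,
    buf: &mut [u8],
    progress: &mut FrameProgress,
) -> Result<(), FrameError> {
    let mut filled = 0;
    while filled < buf.len() {
        // Phase 1 (idle frame): wait forever. Phase 2 (started): bounded wait.
        let wait_ns = if progress.started { WAIT_NS } else { RECV_FRAME_WAIT_NS };
        if progress.started {
            if progress.completion_recvs == MAX_FRAME_COMPLETION_RECVS {
                return Err(FrameError::Stalled); // slow-dribble cap reached
            }
            progress.completion_recvs += 1;
        }
        match sock.recv(&mut buf[filled..], wait_ns) {
            RecvOutcome::Data(0) => return Err(FrameError::PeerClosed),
            RecvOutcome::Data(n) if n <= buf.len() - filled => {
                filled += n;
                progress.started = true;
            }
            // EAGAIN is retried only while the frame has not started.
            RecvOutcome::WouldBlock
                if !progress.started && progress.idle_retries < TCP_RETRY_ATTEMPTS =>
            {
                progress.idle_retries += 1;
            }
            RecvOutcome::WouldBlock | RecvOutcome::TimedOut => {
                return Err(FrameError::Stalled)
            }
            RecvOutcome::Data(_) | RecvOutcome::Failed => return Err(FrameError::Failed),
        }
    }
    Ok(())
}
```

With these constants the worst case for a started frame is 64 recv calls of 5 s each, or 320 s, which matches the "roughly 5 minutes per frame" bound. A caller would pass the same FrameProgress to the header read and then the payload read so that the payload never gets the forever wait.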

Default Run And Game Server Story

The default operator manifest is system.cue, layered on cue/defaults/defaults.cue. Today it boot-launches standalone init; init starts chat-server, remote-session-capset-gateway, remote-session-web-ui, and the foreground shell. The default binary catalog embeds Adventure server, Adventure NPC, Adventure client, and the terminal Paperclips binary. Adventure is not boot-started automatically, but the current remote-session slice makes the default-manifest serviceLaunch path start adventure-server plus simple NPC companions through a restricted backend service-runner profile and attach or retain backend-held Adventure/chat-facing service descriptors/caps. Paperclips launch remains future. The default remote-session gateway receives only console, scoped TCP listen authority for guest port 2327, SessionManager, AuthorityBroker, and narrowly approved backend launch authority; it does not expose raw ProcessSpawner, raw network-manager/socket authority, endpoint owner caps, process handles, local cap ids, or result-cap slots. The remote-session-web-ui service receives scoped TCP listen authority for guest port 8080, SessionManager, AuthorityBroker, console, and the read-only system manual cap. make run forwards guest port 8080 to a loopback host port and prints remote self-served UI: tcp 127.0.0.1 <port> -> guest :8080 so the operator can open the self-served UI in a browser directly from the default operator run.

Current game-server proofs live in focused manifests:

  • make run-adventure uses system-adventure.cue, which starts chat-server, adventure-server, Adventure NPC companion processes, an adventure-scenario-test, and the shell. The Adventure server exports the adventure endpoint, consumes a client facet of chat, owns room/player state, and keys player access by the live caller-session reference.
  • make run-paperclips uses system-paperclips.cue, which starts paperclips-server and paperclips-proof-server services exporting PaperclipsGame endpoints, then launches the terminal paperclips client with explicit StdIO, game endpoint, timer, and optional proof_accelerator grants. The server owns generated content, game state, timer cadence, command descriptors, status snapshots, project entries, unlock checks, and game-rule mutation.

The remote UI direction is therefore not “open a terminal and type the MOTD commands.” The completed DTO/probe slice can describe and probe runnable game-server profiles without side effects. The current Adventure implementation gate is the real restricted service-runner/catalog surface for the default manifest: it starts the approved Adventure server graph and attaches or retains the capabilities those processes export or receive to the backend-held remote CapSet. The service-panel UI can expose this as launch state, descriptors, denials, and generic/simple surfaces. Chat now has the first worker-backed method proof; Adventure status, look, inventory, bounded go(direction), and bounded take/drop/use item calls have a service-specific per-session worker/client context after launch. Paperclips stays future until the server-owned Paperclips profile is available to the default remote session.

Host UI UX Direction

For the high-level synthesis of UI scope, invariants, and architecture, read docs/proposals/remote-session-capset-client-proposal.md -> “UI Scope And Architecture”. This section keeps the operator-story guidance for day-to-day UX work.

The host UI should optimize for the ordinary operator stories instead of mirroring protocol objects one-for-one:

  • Connect and sign in: start with a dedicated OS-like authentication view. The username field is visible and empty by default – the web bridge does not pre-fill from CAPOS_REMOTE_SESSION_USER, host USER, or any other host-side identity hint, because pre-filling leaks operator/account hints to anything observing the page before authentication. The CLI may take --user as an explicit operator override; the web UI does not. Endpoint/auth method controls remain available but secondary; retryable login/transport errors stay in the login view without losing the configured endpoint. Resource-profile names such as operator are not requested from the user during password login; they are filled only by the trusted Rust backend for the current legacy DTO. A gateway-side policy choice or post-auth profile picker should appear only when multiple manifest-published profiles are meaningful to the user.
  • Auth method advertising: the gateway forwards the auth methods the system supports, narrowed only by explicit manifest policy. Disabled methods stay listed and clearly marked (so the protocol is not password-shaped); the gateway does not silently hide methods the system supports.
  • Understand session health: after login, keep the active profile, principal, expiry, recent result, and logout in a Session view so common service work does not start on a protocol summary.
  • Use granted services: make Services the action hub for runnable profiles and remote-proxyable service descriptors. It should show availability, required grants, denial reasons, launch status, and generic command/status forms. When a descriptor is not directly callable yet, the panel should say so instead of implying method success. Service-specific rich clients (real Chat panel, Adventure rich client, Paperclips client, future agent-shell services) layer on top of the same backend-held caps.
  • Terminal panels are allowed when granted: the CapSet UI is not defined as a terminal emulator and works without one, but when the broker grants a TerminalSession cap (for native shell, POSIX shell, or any StdIO-based service expecting a terminal on the other side), the UI may host a terminal panel for that cap. Terminal bytes flow through a backend-held TerminalSession; the browser renders frames it receives, never opens a raw shell or holds a ProcessSpawner.
  • Agent-shell-exposed capabilities are first-class: the CapSet UI does not contain the LLM loop, model client, or tool-execution runner, but agent-shell-exposed services (e.g. “send message to running agent”, “approve queued action”, “audio stream to/from agent”) are services the broker can bundle, exposed through the same per-session worker / typed view-model pattern as Chat or Adventure. Whether some of those agent surfaces should themselves be layered on Chat rather than distinct caps is the cross-cutting refinement task tracked in docs/tasks/.
  • Inspect capabilities: keep CapSet as an explicit inspection view for users who need names, interface IDs, policies, and descriptor selection.
  • Diagnose calls: isolate low-level probes, stale-session proofs, MOTD, and raw result JSON in Diagnostics so common service use is not buried under transport details. The session-summary diff control belongs in Session/Diagnostics, not in the main Services flow.
  • Audit and export: keep transcript review/export in its own view, with redaction status visible and raw authority material absent.
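
The auth-method advertising rule in the list above (disabled methods stay listed and marked rather than hidden) can be sketched as a pure function. The AuthMethod enum, AdvertisedMethod struct, and advertise function are hypothetical names for this sketch; reading "narrowed by manifest policy" as "marked disabled" rather than "removed" is also an assumption.

```rust
/// Hypothetical mirror of the advertised method kinds named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMethod {
    Password,
    Anonymous,
    PublicKey,
    Oidc,
    Passkey,
}

pub const ALL_METHODS: [AuthMethod; 5] = [
    AuthMethod::Password,
    AuthMethod::Anonymous,
    AuthMethod::PublicKey,
    AuthMethod::Oidc,
    AuthMethod::Passkey,
];

/// One row of the advertised list shown to the client.
#[derive(Debug, PartialEq, Eq)]
pub struct AdvertisedMethod {
    pub method: AuthMethod,
    pub enabled: bool,
}

/// List every known method; a method is enabled only when the system
/// implements it and the manifest policy does not deny it.
pub fn advertise(
    implemented: &[AuthMethod],
    manifest_denied: &[AuthMethod],
) -> Vec<AdvertisedMethod> {
    ALL_METHODS
        .iter()
        .map(|&method| AdvertisedMethod {
            method,
            enabled: implemented.contains(&method) && !manifest_denied.contains(&method),
        })
        .collect()
}
```

Because the output always has one row per known method, a client that renders the list cannot end up password-shaped by accident.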

Modernization should build on that navigation shape: no full persistent technical header on the login view, a compact authenticated app shell, clear loading and denial states, empty states with next actions, searchable service/capability lists, command forms generated from typed descriptors, side panels for details, keyboard-friendly controls, responsive layouts, and service-specific rich clients layered over the same backend-held capabilities. Adventure and Paperclips should eventually have rich client views, but the minimum viable UI must still expose their available server capabilities through simple generic forms first.

Service-Runner And Catalog Path

Staged path:

  1. The first reader-facing service catalog is implemented in the DTO gateway and UI. It lists available DTO calls plus service-runner profiles and exported capability descriptors for the current session.
  2. The remote-safe launch DTO/probe contract for those profiles is complete. The request names a catalog profile and explicit grants; the probe/status result reports support state, accepted grant names, a message, and planned exported descriptors. This slice is intentionally side-effect-free: it does not start a process, allocate endpoint owners, return process handles, or attach caps.
  3. The current Adventure slice implements a restricted service-runner surface behind the broker for the default make run manifest. It may use local spawn authority internally, but the remote session receives only catalog descriptors, launch requests, launch status, and returned remote capability descriptors. Raw ProcessSpawner, process owner handles, endpoint owner caps, local cap IDs, result-cap slots, and process handles stay inside capOS or the trusted backend.
  4. The CLI and remote-session-ui backend can call the runner and attach or retain the returned backend-held descriptors/caps. Browser JavaScript receives view models, launch forms, progress, denials, command/status descriptors, and call results for methods that are actually callable through the current DTO path; it does not receive raw capOS capability objects.
  5. Start with simple generic panels. Adventure now exposes launch plus status/look/inventory, bounded mutable go(direction), and simple bounded take/drop/use item controls over the backend-held Adventure endpoint and chat-facing descriptors. The first direct chat call and these Adventure controls run through session-bound worker proxies; broader Adventure verbs and Paperclips calls still need service-specific worker/client layers before richer clients sit on top of the same backend-held CapSet. Paperclips can expose PaperclipsGame.commands, status, projects, and command once the server profile is available to the default remote session.
  6. Keep hardening the repo-local Tauri wrapper. The current make remote-session-tauri command policy-checks, dependency-checks, or launches a scaffolded desktop wrapper over the same Rust/backend authority boundary as the web bridge and uses the printed make run remote CapSet port. The policy check fails closed if bundling, window URLs, default capabilities, app-specific invoke handlers, Tauri commands, or tauri-plugin-* usage drift from the reviewed check/dev scaffold. Distributable packaging and desktop automation remain future polish.
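
Step 2 of the staged path, the side-effect-free launch probe, can be sketched as a pure function over a catalog. All type and field names here (CatalogProfile, LaunchRequest, LaunchStatus, SupportState, probe_launch) are hypothetical, as is ignoring requested grants that the profile does not require; the real DTOs are defined in schema/capos.capnp.

```rust
/// Hypothetical catalog entry for a service-runner profile.
pub struct CatalogProfile {
    pub id: &'static str,
    pub required_grants: &'static [&'static str],
    pub planned_exports: &'static [&'static str],
}

/// The request carries only a profile id plus explicit grant names.
pub struct LaunchRequest {
    pub profile_id: String,
    pub grants: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SupportState {
    Supported,
    UnknownProfile,
    MissingGrant,
}

#[derive(Debug)]
pub struct LaunchStatus {
    pub state: SupportState,
    pub accepted_grants: Vec<String>,
    pub message: String,
    pub planned_descriptors: Vec<String>,
}

/// Probe a launch without spawning, allocating endpoints, or attaching caps.
pub fn probe_launch(catalog: &[CatalogProfile], req: &LaunchRequest) -> LaunchStatus {
    let profile = match catalog.iter().find(|p| p.id == req.profile_id.as_str()) {
        Some(profile) => profile,
        None => {
            return LaunchStatus {
                state: SupportState::UnknownProfile,
                accepted_grants: Vec::new(),
                message: format!("profile `{}` is not in the catalog", req.profile_id),
                planned_descriptors: Vec::new(),
            }
        }
    };
    // Only grants the profile actually requires are accepted.
    let accepted: Vec<String> = req
        .grants
        .iter()
        .filter(|g| profile.required_grants.iter().any(|needed| *needed == g.as_str()))
        .cloned()
        .collect();
    let missing = profile
        .required_grants
        .iter()
        .copied()
        .find(|needed| !req.grants.iter().any(|g| g.as_str() == *needed));
    match missing {
        Some(name) => LaunchStatus {
            state: SupportState::MissingGrant,
            accepted_grants: accepted,
            message: format!("missing required grant `{}`", name),
            planned_descriptors: Vec::new(),
        },
        None => LaunchStatus {
            state: SupportState::Supported,
            accepted_grants: accepted,
            message: String::from("launch is supported for this session"),
            planned_descriptors: profile.planned_exports.iter().map(|d| d.to_string()).collect(),
        },
    }
}
```

Keeping the probe a pure function of catalog and request is what lets the UI show required grants and denial reasons before any process is started.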

Remaining major gaps:

  • Continue expanding the first host UI beyond the current session, system_info, and worker-backed chat proof while still reusing the Rust backend boundary and DTO gateway. A later Tauri package can wrap the same backend when the goal is a distributable desktop app.
  • The first richer service client is a session-summary diff. The pure Rust helper lives in tools/remote-session-client/src/session_diff.rs and compares two snapshots of the remote session view (CapSet plus SessionInfoSummary) into CapSetDiff / SessionSummaryFieldDiff records keyed on (name, interface_id) and visible session fields. The trusted web bridge stores the raw snapshots backend-side and exposes /api/call/session-diff-refresh, which returns a redacted SessionSummaryDiffVm. The browser renders the diff in a dedicated “Last refresh diff” pane on the Session view, with the new session-diff-refresh button exercised twice by the focused UI smoke (first call captures a baseline with hasBaseline=false; the second call reports the diff against the previous snapshot with hasBaseline=true). Backend host tests cover the baseline + no-change path and an added-cap + expiry-change path.
  • Make the remote UI capable of discovering and presenting the full remote-proxyable functionality granted to the authenticated session in the default make run manifest. The first pass may use generic/simple panels for demo services such as chat, Adventure, and Paperclips, but users should not have to switch tools merely because a capability is part of their default remote session bundle. Rich game-specific clients are a later UI layer on top of the same backend-held CapSet, not a reason to narrow the first UI to only session and system_info.
  • Extend the implemented Adventure service-runner slice beyond the first mutable control. The current host backend can start the allowed default-manifest Adventure server graph through the restricted launch path, discover the resulting descriptor in the backend-held remote CapSet, and call Adventure.status, Adventure.look, Adventure.inventory, bounded Adventure.go(direction), and bounded Adventure.take/Adventure.drop/Adventure.use through a per-session worker. The next work is broader Adventure command coverage and richer game-specific clients on top of that same worker-held boundary.
  • Keep Paperclips launch as future work until the authoritative Paperclips server profile is available to the default remote session. The UI may show Paperclips as planned/not remote-proxyable rather than claiming launch support.
  • Replace the DTO transport with standard capnp-rpc framing and live typed remote proxy objects.
  • Expand auth adapters beyond password and anonymous.
  • Use the generalized per-session worker lifecycle manager for future endpoint-backed services. Chat send and Adventure status/look/inventory/go(direction)/take/drop/use now share worker spawn validation, logout/close teardown, graceful shutdown, forced termination fallback, and release flushing; broader Adventure controls, Paperclips worker/client protocol, and live-proxy lifecycle hardening remain future work.
  • Gateway response writes now fail closed per connection: a send-side host disconnect or invalid send byte count breaks the connection loop, then drops backend-held session state and terminates any session-started Adventure processes instead of aborting the gateway process. Direct Chat.send is no longer called from the gateway process; it runs through the first session-bound worker proxy. Adventure status, look, inventory, bounded go(direction), and bounded take/drop/use item methods now receive the same treatment; broader Adventure methods remain later.
  • Add resource limits, TLS/mTLS, renewal, revocation, and UI-composition surfaces.
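
The session-summary diff described in the list above compares two snapshots keyed on (name, interface_id). A minimal sketch of the CapSet half of that comparison follows. CapKey, CapSetDiffSketch, and diff_capsets are hypothetical names for illustration, not the contents of session_diff.rs, and the 64-bit interface ID type is an assumption based on Cap'n Proto interface IDs.

```rust
use std::collections::BTreeSet;

/// Hypothetical key for one CapSet entry: (name, interface_id).
pub type CapKey = (String, u64);

/// Hypothetical diff record; the real CapSetDiff may carry more fields.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct CapSetDiffSketch {
    /// False on the first refresh, when there is nothing to compare against.
    pub has_baseline: bool,
    pub added: Vec<CapKey>,
    pub removed: Vec<CapKey>,
}

/// Compare an optional previous snapshot with the current one.
/// With no previous snapshot the call only records a baseline.
pub fn diff_capsets(
    previous: Option<&BTreeSet<CapKey>>,
    current: &BTreeSet<CapKey>,
) -> CapSetDiffSketch {
    match previous {
        None => CapSetDiffSketch::default(),
        Some(prev) => CapSetDiffSketch {
            has_baseline: true,
            // Entries present now but not before.
            added: current.difference(prev).cloned().collect(),
            // Entries present before but not now.
            removed: prev.difference(current).cloned().collect(),
        },
    }
}
```

Calling diff_capsets(None, &snapshot) mirrors the first button press (baseline only); a second call with Some(&previous) reports added and removed entries in sorted key order.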

Design Constraints

  • Do not serialize local capOS cap IDs, cap-table slots, endpoint receiver selectors, endpoint generations, result-cap indexes, server cookies, or global session identifiers as portable authority.
  • Do not treat password auth as the only remote path. The schema and docs must leave room for public key, OIDC, passkey/WebAuthn, mTLS, guest/anonymous, and service/workload admission.
  • Keep the session-bound invocation invariant. Remote post-auth calls run under the remote session’s capOS worker context or an equivalent reviewed context.
  • Keep default remote bundles narrower than operator shell bundles.
  • Keep browser JavaScript and model providers away from raw capOS caps. Browser and agent paths use gateway-side tool/cap proxies.
  • Keep the first CapSet UI distinct from WebShell. It can inspect and call currently implemented remote session capabilities without launching a shell, terminal emulator, shell-runner policy engine, or model agent.
  • Treat raw ProcessSpawner and browser-held capOS capabilities as explicit non-goals for the remote UI path. A service-runner may hold launch authority inside capOS, but browser and webview code see only catalog entries, launch forms, service descriptors, view models, and typed results.
  • Service launch from the remote UI must go through a restricted, session-bound launcher or broker service-runner profile. The browser must not receive raw process handles, local cap ids, endpoint owner handles, or a raw ProcessSpawner; it receives only view models, launch plans, service descriptors, and typed call forms/results.
  • Keep UI composition declarative and bounded. A capOS service may propose layout/theme/view updates only through an explicit UI capability; it cannot inject arbitrary JavaScript/CSS, spoof trusted chrome, or persist UI state without a settings/profile cap.
  • Keep listener and transport authority scoped; no raw NetworkManager or broad ProcessSpawner in the long-term gateway.
  • Preserve the error split: transport/CQE errors, capability infrastructure exceptions, and domain result unions remain distinct.
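
One way a host client could keep the error split from the last constraint visible in its types is sketched below. RemoteCallError and ChatSendError are hypothetical names, not capOS schema types; the point is only that the three layers stay separate variants instead of collapsing into one error string.

```rust
/// Hypothetical client-side error type that keeps the three layers distinct.
#[derive(Debug, PartialEq, Eq)]
pub enum RemoteCallError<D> {
    /// Transport failure: the connection or frame never delivered a reply.
    Transport(String),
    /// Capability infrastructure exception, such as a stale or revoked proxy.
    CapException(String),
    /// Domain result: the service ran the call and returned its own error.
    Domain(D),
}

impl<D> RemoteCallError<D> {
    /// Name the layer that produced the error, e.g. for transcript rows.
    pub fn layer(&self) -> &'static str {
        match self {
            RemoteCallError::Transport(_) => "transport",
            RemoteCallError::CapException(_) => "capability",
            RemoteCallError::Domain(_) => "domain",
        }
    }
}

/// Hypothetical domain error for a chat send call.
#[derive(Debug, PartialEq, Eq)]
pub enum ChatSendError {
    MessageTooLong,
}
```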

The planning update that introduced this backlog aligned these documents:

  • remote-session-capset-client-proposal.md: owning design.
  • shell-proposal.md: remote clients are peer clients of broker-issued bundles, not shell transports.
  • boot-to-shell-proposal.md: web/remote login feeds the same session manager and broker, and must support non-password admission.
  • ssh-shell-proposal.md: SSH remains a terminal transport, while public-key auth records can also feed non-shell remote clients through a domain-separated protocol.
  • user-identity-and-policy-proposal.md: broker bundles need a remote-client profile shape in addition to shell bundles.
  • browser-capability-proposal.md, llm-and-agent-proposal.md, and interactive-command-surface-proposal.md: UI composition, browser/agent front ends, and typed command surfaces remain capability-mediated rather than raw browser or shell authority.
  • roadmap.md and docs/tasks/README.md: the old chat-only interop item is reframed as remote session CapSet interop without changing the selected threading milestone.

Grounding Files

Relevant design and research grounding:

  • docs/proposals/session-bound-invocation-context-proposal.md
  • docs/proposals/user-identity-and-policy-proposal.md
  • docs/proposals/boot-to-shell-proposal.md
  • docs/proposals/shell-proposal.md
  • docs/proposals/ssh-shell-proposal.md
  • docs/proposals/certificates-and-tls-proposal.md
  • docs/proposals/oidc-and-oauth2-proposal.md
  • docs/proposals/capos-service-proposal.md
  • docs/proposals/interactive-command-surface-proposal.md
  • docs/proposals/browser-capability-proposal.md
  • docs/proposals/llm-and-agent-proposal.md
  • docs/research/cloudflare-capnproto-workers.md
  • docs/research/spritely-captp-ocapn.md

Ordered Gates

Gate 0: Rename The Target

  • Rename the planning target from chat interop to remote session CapSet interop while preserving the existing chat proof as a historical transport slice.
  • Add docs that say the remote client is a regular host app and does not use capos-rt, the capOS ring page, or the local CapSet page.
  • Keep the existing make run-capnp-chat-interop target until a successor proof exists; do not remove useful evidence.

Gate 1: Host Rust Cap’n Proto RPC Client

  • Add a host-built Rust client crate or tool using generated schema bindings. The first slice uses length-prefixed schema-framed Cap’n Proto DTOs; standard capnp-rpc remains open.
  • Keep the client library UI-neutral so it can back a CLI harness, a native GUI, or a Tauri backend without changing the capOS protocol.
  • Connect through QEMU host forwarding to the capOS gateway.
  • Verify schema version/interface ID mismatches fail with explicit diagnostics.
  • Add a host-side transcript that records successful connect, bootstrap, session info, CapSet list, calls, denials, and logout.
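
The length-prefixed framing mentioned for the first slice can be sketched as follows. The backlog does not state the prefix width, byte order, or size cap, so the 4-byte little-endian prefix and MAX_FRAME_LEN below are assumptions, and write_frame/read_frame are hypothetical helper names rather than the client crate's API. The payload would be the serialized Cap'n Proto DTO bytes.

```rust
use std::io::{self, Read, Write};

/// Assumed cap on one frame; the real gateway limit is not stated here.
pub const MAX_FRAME_LEN: u32 = 64 * 1024;

/// Write one frame: 4-byte little-endian length, then the payload bytes.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME_LEN as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_le_bytes())?;
    w.write_all(payload)
}

/// Read one frame, rejecting oversized lengths before allocating.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len_bytes = [0u8; 4];
    r.read_exact(&mut len_bytes)?;
    let len = u32::from_le_bytes(len_bytes);
    if len > MAX_FRAME_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Checking the declared length before allocating keeps a hostile or corrupted prefix from forcing a large allocation on the reader.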

Gate 1A: First Host UI Client

  • Build a thin Tauri or trusted-local-web UI over tools/remote-session-client, without changing the capOS gateway protocol. Prefer Tauri when the goal is a distributable desktop app whose Rust backend can hold the remote session; prefer a local web bridge when browser iteration speed matters more than app packaging.
  • Document and support the repo-local operator paths: make run for capOS/QEMU, cargo run --manifest-path tools/remote-session-client/Cargo.toml --target x86_64-unknown-linux-gnu --bin remote-session-client -- --host 127.0.0.1 --port <printed-port> for the CLI, and CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-ui for the trusted local web bridge. The Makefile target wraps the same remote-session-ui Rust backend and defaults to http://127.0.0.1:3337/. The Tauri wrapper layers over the same backend, not a separate authority model.
  • Add a bounded repo-local Tauri wrapper command: CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-tauri. It checks Tauri CLI and Linux build prerequisites, including xdo and openssl pkg-config modules, reports dependency/scaffold status, and either runs a deterministic wrapper check or launches cargo tauri dev when requested. Missing prerequisites fail with explicit diagnostics and point operators back to make remote-session-ui.
  • Add the actual repo-local Tauri wrapper over the existing backend. The wrapper shares the same tools/remote-session-client backend boundary by loading the loopback remote-session-ui surface; webview code receives view models and user events, not replayable capOS handles. Distributable package bundling remains disabled until the sidecar/backend lifecycle is reviewed.
  • Add a policy-only Tauri wrapper preflight: CAPOS_REMOTE_SESSION_TAURI_MODE=policy tools/remote-session-tauri.sh. The guardrail proves the current wrapper remains check/dev only: bundle.active=false, the Tauri devUrl and single main window URL stay pinned to http://127.0.0.1:3337, default permissions stay exactly ["core:default"], and app-specific invoke_handler, generate_handler, #[tauri::command], and tauri-plugin-* drift is rejected. This does not prove distributable packaging or desktop automation.
  • Keep capOS authority in the backend. Browser/webview JavaScript receives session summaries, auth-method descriptors, CapSet entries, capability call forms, transcript rows, and denial diagnostics, but no replayable capOS handles.
  • Implement the first UI views for endpoint configuration, auth-method inventory, password/anonymous login, session summary, CapSet list/get, sessionInfo, systemMotd, denied-chat probe, logout, stale-call proof, and redacted transcript export. The first web bridge now uses a dedicated full-window sign-in view and authenticated SPA navigation so the common workflow is not a single technical page.
  • Implement selectable remote UI themes based on the committed concept assets in tools/remote-session-client/ui/assets/: a space login theme using bg-space.2k.webp and design-mockup-space-login.webp, a mountain login theme using bg-mountain.2k.webp and design-mockup-mountain-login.webp, a light login theme using design-mockup-light-login.webp, and a hacker terminal theme using design-mockup-operator-console.webp. The hacker theme should use a black/deep-teal background, phosphor-green monospace typography, thin terminal-grid borders, subdued binary side texture, bracketed primary action text, and a footer status line such as “Secure connection established” with a lock indicator, without keeping a persistent global header above the login or workspace views. Treat the mockups as visual references, not runtime screenshots. The implementation should expose a bounded theme selector in the trusted local web UI, persist the selected theme locally, keep browser JavaScript limited to UI state and backend view models, preserve the existing authenticated SPA workflow, and prove contrast, focus, small-screen layout, and screenshot coverage for every theme. The trusted web UI now serves only the committed theme assets by fixed name, stores theme choice in browser-local UI state, drives the selector in both login and workspace modes, and captures desktop plus mobile screenshots for both login and workspace views of each theme in the focused UI smoke. The login view is styled as a focused OS-style sign-in surface without a persistent header; endpoint configuration, auth method inventory, anonymous login, and theme choice remain accessible as compact secondary controls.
  • Ensure the UI discovers every granted remote CapSet entry in the default make run operator session and offers at least a generic/simple surface for each remote-proxyable service exposed by that bundle. Call forms are only for methods the current DTO/proxy path can actually invoke. The first endpoint-backed chat call is now callable through the session-bound worker proxy, and Adventure status, look, inventory, bounded go(direction), and bounded take/drop/use item actions are callable after the Adventure service graph is launched. Game surfaces can start with a simple chat send/probe form, a generic Adventure panel when the service is callable remotely, and Paperclips status/command panels when its server capabilities are exposed. Rich game clients remain a later layer over those same capability bindings. The gateway now lists broker-held endpoint descriptors from service_endpoints, so operator sessions include session, system_info, adventure, and chat; the focused QEMU proof asserts those CapSet entries and the web UI exposes them through CapSet and Services surfaces.
  • Add a task-oriented “Services” view for default-manifest operator sessions: list broker/launcher-advertised runnable services, show which grants are required, start allowed game server processes through a remote-safe restricted launcher/service-runner API, and attach or retain the returned exported descriptors/caps in the backend-held remote CapSet. The first Adventure flow should be able to start adventure-server plus required NPC/server companion processes with their manifest-shaped grants, then show the resulting Adventure/chat descriptors through generic/simple panels. Chat method success now runs under the authenticated session through the first per-session worker proxy; direct Adventure status, look, and inventory now have matching service-specific worker/client paths, and bounded mutable go(direction) plus take/drop/use item paths use the same worker. Broader Adventure method success remains later work. The Paperclips flow may stay simple until the authoritative Paperclips server backlog lands, but the UI direction is server-owned game state and remotely callable game capabilities, not terminal text scraping. The web bridge refreshes CapSet, service catalog, and launcher catalog view models after a successful serviceLaunch so the SPA reflects post-launch descriptors immediately; the focused UI fixture still treats missing Adventure binaries as an explicit planned/denied state.
  • Add browser/UI automation for the chosen client: start a gateway-only QEMU fixture, such as run-remote-session-capset-interop-vm with explicit hostfwd/pid/log handling or a new focused UI fixture target, then drive login, CapSet inspection, capability calls, denials, logout, and transcript redaction, and capture screenshots or traces for review. Do not drive the UI against make run-remote-session-capset-interop because that wrapper starts the scripted CLI client and shuts QEMU down.
  • Keep WebShell-specific work out of this gate. No terminal emulator, shell process delegation, shell-runner policy, agent tool execution, or UI-composition cap is required for the first CapSet UI.
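
The policy-only Tauri preflight in this gate is a shell script; the Rust sketch below only illustrates the fail-closed shape of its checks under stated assumptions. WrapperPolicyInput and check_policy are hypothetical names, and the inputs are assumed to have been extracted from the Tauri configuration and wrapper sources by some earlier step that is not shown.

```rust
/// URL the wrapper must stay pinned to, per the preflight description.
const PINNED_URL: &str = "http://127.0.0.1:3337";

/// Source markers the preflight rejects as drift.
const FORBIDDEN_MARKERS: [&str; 4] = [
    "invoke_handler",
    "generate_handler",
    "#[tauri::command]",
    "tauri-plugin-",
];

/// Hypothetical, already-extracted view of the wrapper scaffold.
pub struct WrapperPolicyInput<'a> {
    pub bundle_active: bool,
    pub dev_url: &'a str,
    pub permissions: &'a [&'a str],
    pub rust_sources: &'a str,
}

/// Return Ok only when every pinned property still holds.
pub fn check_policy(input: &WrapperPolicyInput) -> Result<(), Vec<String>> {
    let mut violations = Vec::new();
    if input.bundle_active {
        violations.push("bundle.active must be false".to_string());
    }
    if input.dev_url != PINNED_URL {
        violations.push(format!("devUrl must be {}", PINNED_URL));
    }
    // Exactly one permission, and it must be core:default.
    if !(input.permissions.len() == 1 && input.permissions[0] == "core:default") {
        violations.push("permissions must be exactly [\"core:default\"]".to_string());
    }
    for marker in FORBIDDEN_MARKERS.iter() {
        if input.rust_sources.contains(*marker) {
            violations.push(format!("forbidden marker: {}", marker));
        }
    }
    if violations.is_empty() { Ok(()) } else { Err(violations) }
}
```

Collecting all violations instead of stopping at the first one gives the operator a complete diagnostic in a single run while still failing closed.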

Gate 1B: Self-Served capOS Web UI

Gate 1A is host-served bridge work: make remote-session-ui serves the browser UI from the trusted host Rust backend while capOS exposes the remote CapSet gateway over QEMU host forwarding. Gate 1B adds the first self-served capOS web UI proof: a capOS-side service serves the browser UI entry point and same-origin backend path itself.

Task records:

  • remote-session-self-served-web-ui-design selected the capOS-side hosting boundary, listener authority, asset source, session/admission path, asset integrity/update story, and browser-safe view model boundary.
  • remote-session-self-served-web-ui implemented the first self-served proof with a focused immutable UI shell and browser automation against the capOS-served origin.
  • remote-session-self-served-web-ui-default-run integrated the self-served path into ordinary make run. The default manifest now auto-starts remote-session-web-ui and make run prints remote self-served UI: tcp 127.0.0.1 <port> -> guest :8080. Completed 2026-05-14 09:07 UTC.
  • remote-session-self-served-full-ui-bundle replaces the immutable proof shell with the reviewed fixed-name boot-resource UI bundle. The capOS service now serves /, /app.js, /styles.css, /feature-flags.js, /themes/retro.css, the icon/background/logo assets, /ui-config.js, and /bundle/manifest.json from the capOS-owned origin with explicit content types, no directory traversal, and a build-time digest pinned in demos/remote-session-web-ui/ui-bundle.digest. The focused proof verifies every served asset byte-for-byte against the manifest and then drives the operator workspace views, logout, stale failure, transcript redaction, and system-manual view models.
  • cloud-prod-remote-session-web-ui-l4-local-proof consumed the landed Phase C userspace L4 and DHCP/IPv4 config proofs. It proves remote-session-web-ui through the non-qemu cloudboot socket path locally with the full fixed-name UI bundle, password login, backend-held SystemInfo, logout/stale failure, manual viewer, and browser-boundary checks. Completed 2026-06-09 01:49 UTC (ff769a5c) as local QEMU/cloudboot evidence only; it does not claim private GCE reachability, public ingress, TLS, or production browser readiness.
  • cloud-prod-network-stack-web-ui-slow-client-bounds hardened the userspace network-stack server that backs the L4 Web UI listener (Review C medium: a single-writer accept loop and fatal recv/accept/send budgets let one idle or held-open unauthenticated client crash the network stack or block every other connection). The server now keeps a bounded multi-socket listen backlog, hands out only data-ready connections (idle held-open ones are left for the reaper), reaps idle/half-closed backlog connections after a short idle window, and treats every budget expiry as non-fatal (abandon the offending connection and re-arm instead of exiting). make run-cloud-prod-remote-session-web-ui-l4 adds a slow-client bound proof in two phases: several idle held-open clients that send no request bytes (kept out of the serving path by the reaper) and one partial-request (Slowloris) client that sends incomplete headers then stalls (served, then abandoned when the recv budget expires). In both phases a concurrent /healthz keeps completing and the server survives, and the kernel log shows the backlog config, idle reaping, and the recv-budget abandon. Serving is still serial, so a data-ready partial-request client adds a bounded head-of-line delay (one recv budget) to the next connection rather than blocking it indefinitely; that bound is the accepted limit for this research demo. This is the server-side prerequisite for remote-session-web-ui-connection-bounds, which layers per-connection deadlines in the remote-session-web-ui RPC client on top.
  • remote-session-web-ui-connection-bounds completed the client side of that boundary (Review C medium). The remote-session-web-ui service replaced its retry-count spin budgets with per-connection wall-clock deadlines on the monotonic clock: a request-read deadline (6 s, anchored at accept, covering request line, headers, and body together) and a response-send deadline (30 s, anchored conservatively at request dispatch, before routing), neither of which resets on byte progress, so total accept-loop occupancy per connection is bounded regardless of client pacing. Deadline expiry abandons only the offending connection fail-closed with an explicit console evidence line. This closes the case the server-side per-call budgets cannot see: a drip-feed client that delivers one header byte at a time keeps every server recv budget fresh while never completing the request. make run-cloud-prod-remote-session-web-ui-l4 adds a third slow-client phase driving exactly that drip-feed client and asserts the web-ui abandons it at the read deadline and /healthz still completes afterwards, alongside the existing held-open-vs-concurrent-/healthz and Slowloris phases. Connection admission limits (the bounded listen backlog and idle reaping, which cap all pre-login connections) remain server-owned in the network-stack listener layer.
  • remote-session-web-ui-session-hardening closed Review C high (predictable capos_remote_session tokens and missing browser-session enforcement). The remote-session-web-ui service now mints an unpredictable, opaque server-side session id (one-way SHA-256 over the kernel-CSPRNG backend session id, base64url, never the accept counter) and a domain-separated per-session double-submit CSRF token; rotates both on login, re-login, and logout (clearing the browser cookies and failing closed on a replayed rotated-out id); enforces idle and absolute lifetime bounds before request dispatch; validates Host (DNS-rebinding) and Origin and requires the X-CSRF-Token double-submit cookie/header on state-changing requests; and marks the session cookie Secure when X-Forwarded-Proto: https reports HTTPS ingress (the plaintext loopback proof stays explicitly non-Secure). This aligns the in-capOS server with the committed operator-bundle and host-bridge CSRF contract (tools/remote-session-client/{ui/app.js,src/web_security.rs}). make run-cloud-prod-remote-session-web-ui-l4 extends the self-served proof with stale-token, CSRF (missing/mismatch), Origin (missing/cross-site), Host, and idle/absolute expiry denial paths plus a login/re-login rotation check, all failing closed before any backend-held capability call. Local QEMU/cloudboot evidence only; it does not claim private GCE reachability, public ingress, or TLS.
  • The public-ingress browser hardening set is done on the same make run-cloud-prod-remote-session-web-ui-l4 gate (all local QEMU/cloudboot evidence, no public exposure): in-guest login peer-gate and failure-backoff hardening, the single public-origin policy (one manifest-granted public_origin.<host> marker fixes the only accepted public origin on the trusted forwarded-scheme HTTPS path), the IAP-aware SameSite cookie policy (Strict by default, Lax only under the manifest IAP marker with a cross-site GET provenance gate), the JSON content-type guard (typed 415 on every state-changing /api/* POST before backend dispatch), the security response headers and strict CSP (uniform header set plus a no-unsafe-inline CSP proved violation-free in a real Chromium), the GFE-range-pinned forwarded-scheme trust (X-Forwarded-Proto authoritative only from 130.211.0.0/22 / 35.191.0.0/16, implementing the firewall-bounded forwarded-scheme trust rule below), and the public /healthz health-check contract (bounded anonymous JSON body, no session state, Host-allowlist exempt for by-IP provider health checkers).
  • Two browser-boundary local proofs remain dispatchable task records under docs/tasks/, not landed: the public-deployment loopback gate (reject loopback Host/Origin/Referer acceptance and loopback-shaped source hints under the configured public-origin load-balancer posture while preserving the local QEMU loopback proof) and the consolidated browser-visible forbidden-marker matrix proof across success, denial, health, manual, and error response classes, including hostile browser-supplied authority fields. Both extend make run-cloud-prod-remote-session-web-ui-l4 locally and do not authorize private GCE reachability or public exposure.
  • cloud-gce-legacy-virtio-webui-serving-local-proof closed the legacy-virtio serving gap locally (2026-06-11): a persistent kernel-brokered legacy virtio 0.9 runtime backs the typed Nic cap, and make run-cloud-gce-legacy-virtio-webui-serving proves a host HTTP peer fetching the byte-verified UI bundle under disable-modern=on. Local serving evidence for the GCE NIC shape only, not live GCE reachability.
  • The no-spend provider-harness gates are done as recording-stub fixture evidence — provider CLIs resolve only to the stubs, with no real provider invocation or mutation on any path: the private-proof harness --preflight-only mode, the private and public proof-evidence validators, the public ingress resource plan gate, the journal-driven teardown engine, and the provider-command allowlist gate. They bound the future private/public runs’ evidence, resource graph, teardown, and provider-command surfaces; they are not reachability, exposure, or spend authorization. A matching public-harness no-spend preflight task is dispatchable future work, not landed.
  • cloud-gce-private-self-hosted-webui-proof follows the local Web UI L4 proof and proves private GCE reachability over the live NIC without public IP or public firewall exposure. It remains on hold for two reasons: the firewall IAM needed to open GCE default-deny ingress is missing, and per-run billable authorization has not been granted. The legacy-virtio serving gap is closed locally.
  • cloud-gce-public-webui-ingress-tls-policy-design selected the public ingress, TLS/certificate, firewall, browser-session, and teardown policy before exposure work starts (see “Selected public ingress and TLS policy” below).
  • cloud-gce-public-self-hosted-webui-ingress-tls is blocked on the private proof and on explicit public-exposure approval. With the policy design closed, it is the first public operator-access step, builds against the selected provider-terminated-HTTPS policy, and does not permit raw public HTTP as the closeout proof. The local plan/teardown/evidence/allowlist gates above bound this future run without authorizing it.
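
The deadline rule from remote-session-web-ui-connection-bounds (anchor once, never reset on byte progress) can be illustrated with a minimal sketch. ConnectionDeadline and drip_feed_completes are hypothetical names, not the service's code. Only the 6 s request-read budget comes from the task record, and time is simulated here rather than read from a socket.

```rust
use std::time::{Duration, Instant};

/// Deadline anchored once; later byte progress never extends it.
pub struct ConnectionDeadline {
    expires_at: Instant,
}

impl ConnectionDeadline {
    pub fn anchored_at(anchor: Instant, budget: Duration) -> Self {
        ConnectionDeadline { expires_at: anchor + budget }
    }

    /// Time left at `now`, or None once the deadline has been reached.
    pub fn remaining(&self, now: Instant) -> Option<Duration> {
        self.expires_at
            .checked_duration_since(now)
            .filter(|left| !left.is_zero())
    }
}

/// Simulate a client that delivers one byte every `gap`.
/// Returns true if all bytes arrive in time, false if the connection
/// would be abandoned at the deadline.
pub fn drip_feed_completes(
    accepted: Instant,
    budget: Duration,
    gap: Duration,
    total_bytes: u32,
) -> bool {
    let deadline = ConnectionDeadline::anchored_at(accepted, budget);
    let mut now = accepted;
    for _ in 0..total_bytes {
        now += gap;
        if deadline.remaining(now).is_none() {
            return false;
        }
    }
    true
}
```

With a 6 s budget, a client sending one byte per second is abandoned even though every individual read makes progress, which is the case a per-call budget cannot see.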

IPv6 is a separate network-stack capability lane, not a Gate 1B blocker for the first public Web UI proof. The IPv4 path above still owns the first useful GCE Web UI closeout; the IPv6 scope decision cloud-prod-ipv6-architecture-status-grounding is done and the lane is tracked in Hardware, Boot, and Storage. The broader network usability lane is Network Usability and Post-smoltcp: DNS resolver, POSIX getaddrinfo, ping/ping6, packet tracing, socket readiness, and transport policy are follow-on usability work. They do not block Gate 1B or the first IPv4 public Web UI proof unless a later ingress policy explicitly promotes one; the local DHCP/IPv4 configuration gate is done and now feeds the Web UI L4 and private GCE proof gates.

Selected public ingress and TLS policy:

  • The first public exposure of remote-session-web-ui on GCE terminates HTTPS at a GCP external Application Load Balancer (Google front end, provider-managed certificate). capOS serves only plain HTTP/1.1 on its UI backend port; the operator browser reaches the UI exclusively through the load balancer’s HTTPS origin, and capOS never holds the TLS private key.
  • This is the bootstrap shape chosen because capOS does not yet have TLS termination and private-key custody. The Phase-1 certificate verifier has landed, but TlsServerConfig, key custody, and the userspace L4 TcpSocket relocation have not landed. The ACME/Let’s Encrypt path is now decomposed in Certificates / TLS as a capability-native successor: minimal PrivateKey / KeyVault / KeySource custody, TLS client/server support, RFC 8555 account/order, scoped http-01, CertificateStore.watch renewal, and then a separate public GCE direct-termination proof with explicit public-ingress and CA authorization. That successor does not replace the provider-managed first public proof.
  • Raw public HTTP is rejected as closeout evidence; any port-80 listener is a 301 redirect to HTTPS at the load balancer and never reaches capOS.
  • Browser session rules add a single public HTTPS origin, firewall-bounded trust of the load balancer’s forwarded-scheme header, Secure/HttpOnly/SameSite session cookies, HSTS, anti-CSRF tokens with an origin check, bounded session/idle lifetime, and server-side logout — over the unchanged Gate 1B view-model boundary.
  • Firewall ingress to the UI backend port is restricted to Google load-balancer/health-check ranges (130.211.0.0/22, 35.191.0.0/16) and, if IAP fronts the door, the IAP range (35.235.240.0/20); never 0.0.0.0/0.
  • The full firewall, certificate-custody, evidence, and teardown policy lives in the “Public Web UI Ingress Policy” section of Cloud Deployment, and the TLS-termination/key-custody decision in the “Bootstrap TLS for the First Public GCE Web UI” section of Certificates and TLS.
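
The firewall-bounded trust of the forwarded-scheme header can be sketched as a peer-range check. The two ranges are the load-balancer/health-check ranges quoted above; the function names are hypothetical, the optional IAP range is left out, and this is an illustration of the rule rather than the service's implementation.

```rust
use std::net::Ipv4Addr;

/// (network, prefix length) pairs for the Google load-balancer ranges.
const LB_RANGES: [(Ipv4Addr, u32); 2] = [
    (Ipv4Addr::new(130, 211, 0, 0), 22),
    (Ipv4Addr::new(35, 191, 0, 0), 16),
];

/// True when `addr` falls inside `network`/`prefix`.
fn in_cidr(addr: Ipv4Addr, network: Ipv4Addr, prefix: u32) -> bool {
    let mask = if prefix == 0 { 0 } else { u32::MAX << (32 - prefix) };
    (u32::from(addr) & mask) == (u32::from(network) & mask)
}

/// True when the accepted socket's peer is inside a load-balancer range.
pub fn is_load_balancer_peer(addr: Ipv4Addr) -> bool {
    LB_RANGES
        .iter()
        .any(|(network, prefix)| in_cidr(addr, *network, *prefix))
}

/// Treat X-Forwarded-Proto as authoritative only from a load-balancer peer.
/// Any other peer is handled as plain HTTP regardless of the header value.
pub fn request_is_https(peer: Ipv4Addr, forwarded_proto: Option<&str>) -> bool {
    is_load_balancer_peer(peer) && forwarded_proto == Some("https")
}
```

The peer address must come from the accepted socket, not from a request header, so a direct client cannot claim HTTPS ingress by sending the header itself.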

Selected design:

  • Add a capOS userspace service named remote-session-web-ui for the first proof. It is a sibling of remote-session-capset-gateway, not a replacement for the gateway and not the host remote-session-ui bridge running inside capOS. The service owns the web listener, static assets, authenticated web sessions, remote-session backend state, per-session worker proxies, and browser-facing view-model projection.
  • Static assets live as a checked-in, fixed-name UI bundle embedded in the capOS boot package and served by remote-session-web-ui. The service serves only fixed files, /bundle/manifest.json, and same-origin JSON API routes; it does not expose a general filesystem, asset directory traversal, host path, or development hot-reload surface. The full-bundle proof is remote-session-self-served-full-ui-bundle.
  • The first listener is HTTP/1.1 on a manifest-scoped TcpListenAuthority for a dedicated UI port, for example guest port 8080 under QEMU host forwarding. The service serves static GET assets and same-origin JSON API routes. WebSocket, server-sent events, and streaming terminal/media paths are later extensions that require separate per-route authority and resource bounds; the first self-served proof does not need them.
  • Manifest grants authorize the listener and backend work: scoped TcpListenAuthority for the UI port, SessionManager, AuthorityBroker, a named immutable UI asset bundle, and only the same narrow remote-client service-runner/backend-launch authority already allowed for the remote session path. The service does not receive raw NetworkManager, raw TcpListener factories, broad storage roots, raw ProcessSpawner, shell launcher authority, endpoint owner caps, or arbitrary endpoint creation authority.
  • remote-session-web-ui is the trusted backend and holds the remote session CapSet/proxy state server-side. Browser JavaScript receives only browser-safe view models, launch forms, user-event commands, typed results, denial diagnostics, and redacted transcript rows. It never receives raw capOS caps, raw ProcessSpawner, process handles, endpoint owner authority, local cap IDs, result-cap slots, session-global identifiers, remote CapSet handles, host usernames, host environment variables, host paths, or QEMU-forwarding identity hints.
  • Authentication remains gateway/session-manager shaped. The browser sends credentials or guest/anonymous intent to the capOS-served JSON endpoint; the service derives connection/source metadata from its accepted socket and its own event id, asks SessionManager for a UserSession, asks AuthorityBroker for the remote-client bundle, and projects only the disclosed session and service fields into browser-safe view models. The browser cannot choose a principal, profile, worker session context, or backend cap holder by replaying a request field.
  • Cloudboot-local authority inventory for the completed cloud-prod-remote-session-web-ui-l4-local-proof: the non-qemu proof manifest grants remote-session-web-ui only console, a scoped UI TcpListenAuthority for guest port 8080 served by the Phase C userspace network-stack path, SessionManager, AuthorityBroker, the read-only manual cap, the timer cap used by the HTTP/backend loop, and the fixed-name boot-resource UI bundle. It does not satisfy the UI listener from a kernel tcp_listen_authority source in the non-qemu cloudboot path, and does not grant raw NetworkManager, TcpListener/TcpSocket factories, broad storage roots, raw ProcessSpawner, shell launcher authority, endpoint-owner caps, arbitrary endpoint creation authority, host filesystem paths, or provider/cloud mutation authority. Backend launch/service-runner authority remains available only through the same broker-approved remote-client bundle policy described above.
  • The local cloudboot proof should assert the same browser boundary as the self-served QEMU proof while proving the different listener substrate: browser-visible envelopes, DOM state, diagnostics, transcripts, and JSON responses must not contain raw capOS caps, raw process authority, endpoint-owner authority, local cap ids, result-cap slots, NetworkManager, TcpListenAuthority, TcpListener, TcpSocket, host usernames, host environment variables, host paths, QEMU-forwarding identity hints, provider resource identifiers, public IPs, firewall rules, or TLS key material. Login/source metadata must come from the accepted socket plus a service-generated event id; browser requests cannot supply the trusted principal, profile, source address, worker-session context, or backend cap holder.
  • Expected local cloudboot proof markers are the existing service-side lines that show the narrow service capset, scoped listener, fixed-name bundle, backend-held login/session, backend-held SystemInfo call, browser-safe workspace view models, redacted transcript, backend-held manual view-model projection, and stale-call failure, followed by exactly one cloudboot-evidence: remote-session-web-ui-l4 <token> marker after all forbidden-authority and browser-visible marker checks pass. That marker is local QEMU/cloudboot evidence only; it does not prove private GCE reachability, public ingress, HTTPS/TLS custody, firewall policy, or browser production readiness.
  • Proof marker triage:
| Missing or failed marker class | Likely failed invariant | Owning lane | Blocks local Web UI L4 proof? |
| --- | --- | --- | --- |
| Narrow service capset, scoped UI listener, or trusted listener/source metadata is absent, or the listener is satisfied by the non-cloudboot qemu kernel socket path | remote-session-web-ui is not bound to the manifest-scoped TcpListenAuthority served by the Phase C userspace network-stack path | Listener substrate | Yes. The local L4 proof cannot close without the non-qemu cloudboot listener source. |
| Fixed-name bundle, byte-for-byte asset, content-type, /ui-config.js, or /bundle/manifest.json marker is absent or mismatched | The capOS-served origin is not serving the reviewed immutable boot-resource UI bundle | Fixed-bundle serving | Yes. A health-only service marker is not a self-served Web UI proof. |
| Backend-held login/session, SystemInfo, manual view-model, or workspace view-model marker is absent, or a browser request supplies trusted principal/source/backend holder fields | The service is not deriving authority from server-side session state and broker-approved backend caps | Authenticated backend call | Yes. The proof must exercise at least one backend-held cap path after login. |
| Logout/stale-call failure marker is absent, stale requests keep dispatching, or result-cap/session table identifiers leak into client-visible state | Backend session teardown does not fail closed before later public or provider promotion | Stale/logout failure | Yes. The first local L4 proof needs the stale-call denial; later session-hardening work may add stricter lifetime controls. |
| Browser-visible envelopes, DOM, diagnostics, transcripts, or JSON contain raw caps, cap ids, process/socket/network authority, host identity, provider resource ids, public IPs, firewall rules, or TLS material | The browser-safe view-model boundary leaked trusted authority or out-of-scope provider/exposure state | Browser-visible forbidden marker leak | Yes for local-service leaks. Provider, public-ingress, and TLS material also route to their later proof lanes before promotion. |
| All service-side markers pass but the final cloudboot-evidence: remote-session-web-ui-l4 <token> marker is missing, duplicated, or emitted before forbidden-authority checks finish | The harness has not produced a single closeout marker tied to the completed local cloudboot proof | Evidence-class boundary | Yes. The local proof is incomplete without exactly one final local L4 marker. |
| Private GCE probe, public HTTPS, DNS, certificate, firewall, load-balancer, or operator-exposure markers are absent | The run did not attempt a later evidence class, or correctly kept provider/public exposure out of the local proof | Evidence-class boundary | No. Those belong to cloud-gce-private-self-hosted-webui-proof or the on-hold public ingress/TLS task, not the local L4 closeout. |
  • The first implementation gate was remote-session-self-served-web-ui: boot a focused manifest, load the UI from the capOS-owned HTTP endpoint, log in, exercise at least one granted capability call through the service-held backend state, prove logout/stale failure remains closed, and run browser automation against that capOS-served origin. That pre-Phase-C target used the qemu-only kernel tcp_listen_authority socket owner and is no longer current selected-milestone evidence after the kernel L4 owner was retired. The replacement gate is make run-cloud-prod-remote-session-web-ui-l4, owned by cloud-prod-remote-session-web-ui-l4-local-proof.
  • Validation targets: make run-cloud-prod-remote-session-web-ui-l4 clearly distinguishes the self-served origin from the host development bridge and asserts forbidden browser-visible markers are absent. The current make remote-session-ui bridge remains a development tool, and make run-remote-session-capset-ui keeps its existing host-bridge smoke coverage while the self-served path evolves. Ordinary make run remains a remote CapSet forwarding path, not a self-served UI proof, unless the default-run integration task closes with reviewed manifest, forwarding, and operator-instruction changes.
  • Rollback path: remove the self-served focused manifest/target and stop granting remote-session-web-ui its UI TcpListenAuthority and asset bundle, while leaving the host-served make remote-session-ui path and the remote-session CapSet gateway unchanged. Because the static assets are boot-package resources and the listener is manifest-granted, rollback is a manifest/build-target selection change rather than a downgrade of the gateway authority model.
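
The closeout rule above (exactly one cloudboot-evidence: remote-session-web-ui-l4 <token> line, emitted only after the forbidden-authority checks finish) could be checked along these lines. This is a minimal sketch, not the harness's actual code: the CHECKS_DONE line, the CloseoutError type, and the check_closeout function are hypothetical names introduced for illustration; only the closeout marker prefix comes from this document.

```rust
/// Prefix of the single closeout marker described in this section.
const CLOSEOUT_PREFIX: &str = "cloudboot-evidence: remote-session-web-ui-l4 ";

/// Hypothetical line a harness might print once forbidden-authority checks
/// finish. The real harness output is not specified in this document.
const CHECKS_DONE: &str = "forbidden-authority-checks: done";

#[derive(Debug, PartialEq, Eq)]
pub enum CloseoutError {
    /// No closeout marker was found.
    Missing,
    /// More than one closeout marker was found (count attached).
    Duplicated(usize),
    /// The marker appeared before the checks-done line, or that line is absent.
    BeforeChecksFinished,
}

/// Returns the marker token when the log holds exactly one closeout marker
/// and it appears after the checks-done line.
pub fn check_closeout(log: &str) -> Result<&str, CloseoutError> {
    let mut checks_done_at: Option<usize> = None;
    let mut closeouts: Vec<(usize, &str)> = Vec::new();
    for (idx, line) in log.lines().enumerate() {
        let line = line.trim();
        if line == CHECKS_DONE {
            // Remember only the first checks-done line.
            checks_done_at.get_or_insert(idx);
        } else if let Some(token) = line.strip_prefix(CLOSEOUT_PREFIX) {
            closeouts.push((idx, token));
        }
    }
    match closeouts.as_slice() {
        [] => Err(CloseoutError::Missing),
        [(idx, token)] => match checks_done_at {
            Some(done) if done < *idx => Ok(*token),
            _ => Err(CloseoutError::BeforeChecksFinished),
        },
        many => Err(CloseoutError::Duplicated(many.len())),
    }
}
```

Returning a typed error per failure class keeps the triage table's "missing, duplicated, or emitted before checks finish" cases distinguishable in harness output.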

Acceptance for the implementation gate:

  • The browser retrieves UI assets or the UI backend entry point from a capOS-owned service path, not from the host remote-session-ui development bridge.
  • Browser JavaScript receives browser-safe view models and user-event commands only; raw caps, raw ProcessSpawner, endpoint owner authority, result-cap slots, and host-local identity hints stay out of browser-visible state.
  • The proof uses browser automation against the self-served path and exercises login plus at least one granted capability call.
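
A browser-automation proof could assert the second acceptance item with a simple deny-list scan over every browser-visible payload. The sketch below is illustrative only: the FORBIDDEN_MARKERS strings and the forbidden_markers_in function are assumptions, since this document does not list the exact marker strings the real smokes reject.

```rust
/// Hypothetical deny-list of substrings that must never appear in
/// browser-visible envelopes, DOM text, diagnostics, or transcripts.
const FORBIDDEN_MARKERS: &[&str] = &[
    "ProcessSpawner",
    "TcpListenAuthority",
    "result_cap_slot",
    "local_cap_id",
];

/// Returns every forbidden marker found in `payload`, in deny-list order.
/// An empty result means the payload passed this (purely textual) check.
pub fn forbidden_markers_in(payload: &str) -> Vec<&'static str> {
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| payload.contains(*marker))
        .collect()
}
```

A substring scan only catches known marker spellings; it complements, and does not replace, keeping raw authority out of the view-model types in the first place.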

Gate 2: Gateway Bootstrap And Auth Method Inventory

  • Add RemoteSessionGateway.authMethods and a policy-shaped method list.
  • Support explicit denial for disabled methods so the harness can prove password-only assumptions are not baked into the protocol.
  • Record gateway-derived source metadata, method kind, requested profile, and protocol binding in audit-shaped output.
  • Keep first-remote-client setup disabled unless a manifest explicitly grants a local setup authority path.
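
One way to picture the policy-shaped method list and the explicit denial for disabled methods is the following sketch. All names here (AuthMethodKind, AuthMethodEntry, AuthPolicy, AdmissionError) are hypothetical and are not the capOS schema; the point is only that a disabled method yields a distinct typed denial instead of being silently absent.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMethodKind {
    Password,
    PublicKey,
    Guest,
    Anonymous,
}

/// One row of the inventory a gateway could return from authMethods.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AuthMethodEntry {
    pub kind: AuthMethodKind,
    pub enabled: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AdmissionError {
    /// The method is known to policy but switched off.
    DisabledAuthMethod(AuthMethodKind),
    /// The method is not part of this gateway's policy at all.
    UnknownAuthMethod(AuthMethodKind),
}

pub struct AuthPolicy {
    methods: Vec<AuthMethodEntry>,
}

impl AuthPolicy {
    pub fn new(methods: Vec<AuthMethodEntry>) -> Self {
        Self { methods }
    }

    /// The inventory is listed as-is, including disabled entries.
    pub fn auth_methods(&self) -> &[AuthMethodEntry] {
        &self.methods
    }

    /// Decides whether a requested method may proceed to proof checking.
    pub fn admit(&self, requested: AuthMethodKind) -> Result<(), AdmissionError> {
        match self.methods.iter().find(|m| m.kind == requested) {
            Some(entry) if entry.enabled => Ok(()),
            Some(_) => Err(AdmissionError::DisabledAuthMethod(requested)),
            None => Err(AdmissionError::UnknownAuthMethod(requested)),
        }
    }
}
```

With a policy that lists Password as disabled and PublicKey as enabled, a harness can prove that a password attempt is denied explicitly while the protocol still works, which is the "not password-only" property this gate asks for.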

Gate 3: First Auth Adapter

  • Choose one bounded first adapter for the proof. Acceptable first choices are public-key fixture auth, password via existing SessionManager.login under explicit policy, or guest/anonymous admission under a narrow profile. Do not design the schema as password-only.
  • Map the accepted proof into SessionManager and mint a real UserSession.
  • Add Rust-level backend/account-store proof coverage that disabled, locked, and recovery-only accounts, unknown principals, and missing or retired resource profiles cannot yield remote-client bundle plans, and that SessionManager password-account selection rejects unknown or inactive account records before a UserSession can be minted.
  • Prove failed proof, wrong requested profile, and unknown principal in the live host/QEMU remote-gateway path before the broker returns a CapSet. The proof also covers anonymous profile mismatch and asserts denied re-login clears previous per-connection/session view state.

Gate 4: Broker Remote Bundle

  • Add an AuthorityBroker path for remote-client bundles, or a temporary clearly named wrapper around the existing shell bundle that does not imply terminal authority.
  • Bundle at least session and systemInfo; add one demo service cap such as chat or paperclips for behavior proof.
  • Add a remote-client bundle shape that preserves the useful default-operator service surface without becoming an operator shell bundle. It should include a restricted launcher/service-runner descriptor for allowed service binaries, broker-held or remote-proxyable service endpoints such as chat and adventure, and enough metadata for the UI to construct launch plans for server processes. It must not grant a raw shell launcher, terminal authority, raw ProcessSpawner, raw network factories, or endpoint owner authority to browser code.
  • Ensure anonymous/guest/default remote bundles do not receive operator shell launcher or broad service endpoints unless policy explicitly grants them.
  • Add wrong-name and wrong-interface tests for RemoteCapSet.get.
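
The wrong-name and wrong-interface cases for RemoteCapSet.get can be modeled without any transport. The sketch below is a host-side stand-in under stated assumptions: RemoteCapSetModel, CapEntry, GetError, and the numeric interface IDs are all hypothetical and do not reflect the generated bindings.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CapEntry {
    pub interface_id: u64,
    /// Opaque gateway-side proxy index; never a local cap slot.
    pub proxy_id: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum GetError {
    UnknownName,
    WrongInterface,
}

/// In-memory model of a broker-issued cap set, keyed by grant name.
#[derive(Default)]
pub struct RemoteCapSetModel {
    entries: BTreeMap<String, CapEntry>,
}

impl RemoteCapSetModel {
    /// Only the broker side would call this; clients cannot add grants.
    pub fn grant(&mut self, name: &str, interface_id: u64, proxy_id: u32) {
        self.entries
            .insert(name.to_string(), CapEntry { interface_id, proxy_id });
    }

    /// Lists grant names with their interface IDs, sorted by name.
    pub fn list(&self) -> Vec<(&str, u64)> {
        self.entries
            .iter()
            .map(|(name, entry)| (name.as_str(), entry.interface_id))
            .collect()
    }

    /// Succeeds only when both the name and the expected interface match.
    pub fn get(&self, name: &str, interface_id: u64) -> Result<&CapEntry, GetError> {
        let entry = self.entries.get(name).ok_or(GetError::UnknownName)?;
        if entry.interface_id != interface_id {
            return Err(GetError::WrongInterface);
        }
        Ok(entry)
    }
}
```

Because lookup never falls back to a default entry, guessing a name yields UnknownName and nothing else, which matches the first item of the security review checklist.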

Gate 4A: Remote Service Catalog, Launch DTO, Adventure Launch, And Game Server Caps

  • Define a remote service catalog DTO or capnp-rpc object. It should list policy-approved service profiles, runnable binaries, companion processes, required grant names/interfaces/transfer modes, exported capability descriptors, attach/start/stop policy, and whether each grant is backend-held, service-owned, or a client facet. The current DTO catalog describes available DTO services plus Adventure/Paperclips launch profiles. Adventure start/attach is the current restricted-runner slice; Paperclips attach/start/stop policy remains future runner work.
  • Define the restricted service-runner launch request/status/probe DTO shape: submit a catalog profile plus explicit named grants, then return side-effect-free support state, accepted grant names, a message, and planned remote descriptors for exported or broker-held capabilities. This slice intentionally does not start processes, create endpoint owners, attach returned caps, or expose raw ProcessSpawner, process owner handles, endpoint owner caps, local cap IDs, result-cap slots, or browser-held capOS caps.
  • Implement the actual restricted service-runner behind the serviceLaunch contract for Adventure in the default make run manifest. The service runner may use local spawn authority internally, but the remote/browser-facing contract must still expose only launch request/status DTOs and remote capability descriptors, never raw spawn authority or local handles.
  • Implement the first game-server flow for Adventure. The backend should use the remote session’s restricted launcher/service-runner to start adventure-server and simple NPC companion processes with the remote-safe endpoint grant shape: the Adventure endpoint owner and Chat client facet are passed to child processes, while the gateway’s system Console cap is not regranted across the operator-session boundary. The backend then attaches or retains backend-held Adventure and chat-facing service descriptors/caps. Chat now uses a per-session worker endpoint proxy for Chat.send; Adventure status, look, inventory, go(direction), and bounded take/drop/use item calls use the same pattern after launch. Broader Adventure endpoint calls and rich client controls remain later.
  • Implement the Paperclips direction as soon as the server-owned Paperclips server profile is available in the remote catalog: start or attach to the authoritative Paperclips server, read structured status/project/command descriptors, and submit commands through server-owned capabilities. Until then, the UI may show Paperclips as “terminal-only/not remote-proxyable yet” rather than scraping terminal text.
  • Prove launch denials are explicit: disallowed binaries, missing required grants, wrong-interface grants, stale sessions, and anonymous/guest profiles without service-runner authority all fail before any process is started or any returned cap is exposed. The live remote-gateway proof covers stale sessions and anonymous/no-runner sessions in the CLI QEMU path; guest admission now has a dedicated RemoteGatewayRequest.guestLogin arm, and guest sessions go through the broker/account-store remote-client bundle policy with the same no-runner constraint.
  • Prove process handles and endpoint owner caps stay backend-local or are withheld entirely from the browser. Browser-visible state is limited to launch status, service descriptors, command/status view models, denial diagnostics, and redacted transcript rows. CLI and UI smoke checks reject raw authority markers in transcripts, reports, and API envelopes.
  • Add a focused guest remote-gateway login proof once the wire protocol and gateway expose a concrete guest auth adapter, then repeat the same no-runner serviceLaunch denial assertions for guest sessions. Landed 2026-05-08 03:59 UTC. The QEMU interop harness ships three proofs: a guest admission happy proof (the manifest seeds a guest principal and the gateway accepts the requestedProfile = "guest" request); a guest launch-denial proof (successfully admitted guest sessions repeat the service-launch denial matrix; in the Adventure interop manifest this proves the guest bundle still lacks service-runner authority even when the operator path can launch); and an auth-denial guest profile mismatch proof (the gateway refuses requestedProfile = "operator" through the guest method with the redacted "guest login denied" message). The bridge host-tests additionally pin the RemoteErrorCode::DisabledAuthMethod denial that fires when the manifest has no guest seed.
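
The launch-denial ordering described in this gate (every check fails before any process is started) can be sketched as a pure validation step that runs ahead of the runner. The types below (LaunchProfile, LaunchRequest, SessionView, LaunchDenial) and the interface ID values are hypothetical illustrations, not the serviceLaunch DTOs.

```rust
/// A catalog entry: one allowed binary and the grants it requires,
/// given as (grant name, interface ID) pairs.
pub struct LaunchProfile {
    pub binary: &'static str,
    pub required_grants: &'static [(&'static str, u64)],
}

/// What the remote client submits: a binary name plus explicit named grants.
pub struct LaunchRequest<'a> {
    pub binary: &'a str,
    pub grants: &'a [(&'a str, u64)],
}

/// Backend-held facts about the session; never client-supplied.
pub struct SessionView {
    pub live: bool,
    pub has_service_runner: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LaunchDenial {
    StaleSession,
    NoServiceRunner,
    DisallowedBinary,
    MissingGrant(&'static str),
    WrongInterfaceGrant(&'static str),
}

/// Side-effect-free validation. A caller would start processes only after
/// this returns Ok(()), so every denial happens before any spawn.
pub fn validate_launch(
    session: &SessionView,
    catalog: &[LaunchProfile],
    req: &LaunchRequest<'_>,
) -> Result<(), LaunchDenial> {
    if !session.live {
        return Err(LaunchDenial::StaleSession);
    }
    if !session.has_service_runner {
        return Err(LaunchDenial::NoServiceRunner);
    }
    let profile = catalog
        .iter()
        .find(|p| p.binary == req.binary)
        .ok_or(LaunchDenial::DisallowedBinary)?;
    for &(name, interface_id) in profile.required_grants {
        match req.grants.iter().find(|(n, _)| *n == name) {
            None => return Err(LaunchDenial::MissingGrant(name)),
            Some(&(_, found)) if found != interface_id => {
                return Err(LaunchDenial::WrongInterfaceGrant(name));
            }
            Some(_) => {}
        }
    }
    Ok(())
}
```

A guest or anonymous session in this model is simply a SessionView with has_service_runner set to false, so it hits NoServiceRunner regardless of what the request contains.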

Gate 5: Per-Session Worker And Proxy Lifetime

  • Host the first post-auth endpoint-backed remote cap, Chat.send, in a per-session worker/proxy context instead of calling it from the gateway process.
  • Associate the first chat proxied calls with the live remote session context; the focused QEMU proof shows the spawned chat worker running with the operator session context.
  • Drop/release the chat worker holds when logout is called, the connection closes, or the worker exits; teardown now asks the worker to shut down through its control endpoint and falls back to termination only if that path fails.
  • Generalize the worker/proxy lifecycle infrastructure for the currently supported endpoint-backed calls. Chat send and Adventure status/look/inventory now share worker spawn validation, exactly-one parent control endpoint validation, graceful shutdown, forced termination fallback, logout/close teardown, and release flushing.
  • Add the first richer Adventure worker/client protocol slice on top of the shared lifecycle manager: read-only Adventure.look and Adventure.inventory now share the same per-session Adventure worker as Adventure.status.
  • Add the first service-specific mutable Adventure worker/client protocol slice: bounded Adventure.go(direction) now runs through the same per-session Adventure worker and returns bounded movement text plus room state.
  • Add the first item-oriented Adventure worker/client protocol slice: bounded Adventure.take(item), Adventure.drop(item), and Adventure.use(item) run through the same per-session Adventure worker, validate transcript-safe item tokens, and return bounded text or room state to the CLI and web bridge.
  • Add service-specific worker/client protocol slices for broader mutable Adventure calls and future Paperclips service calls on top of the shared lifecycle manager.
  • Treat send-side disconnects while replying as connection close, then release gateway-held state through the existing per-connection teardown path instead of failing the whole gateway process.
  • Prove stale proxy calls after logout/disconnect fail closed.
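
The fail-closed property in the last item can be illustrated with proxies that share a liveness flag with their session. This is a sketch of the idea only, assuming hypothetical Session, ChatProxy, and CallError types; the real worker/proxy lifecycle involves control endpoints and process teardown that are not modeled here.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    StaleSession,
}

struct SessionState {
    live: AtomicBool,
}

/// Stands in for the per-connection session held by the gateway.
pub struct Session {
    state: Arc<SessionState>,
}

/// Stands in for a per-session proxy handed to the host client.
pub struct ChatProxy {
    state: Arc<SessionState>,
}

impl Session {
    pub fn login() -> Self {
        Self { state: Arc::new(SessionState { live: AtomicBool::new(true) }) }
    }

    pub fn chat_proxy(&self) -> ChatProxy {
        ChatProxy { state: Arc::clone(&self.state) }
    }

    /// Explicit logout marks every derived proxy stale.
    pub fn logout(&self) {
        self.state.live.store(false, Ordering::SeqCst);
    }
}

impl Drop for Session {
    /// Dropping the session (for example on connection close) behaves
    /// like logout, so proxies cannot outlive it as usable objects.
    fn drop(&mut self) {
        self.logout();
    }
}

impl ChatProxy {
    /// Checks liveness before doing any work. Returns the byte count
    /// "sent" as a placeholder for a real Chat.send result.
    pub fn send(&self, text: &str) -> Result<usize, CallError> {
        if !self.state.live.load(Ordering::SeqCst) {
            return Err(CallError::StaleSession);
        }
        Ok(text.len())
    }
}
```

The check sits at the top of every call, so a proxy retained by a client after logout or disconnect produces a typed StaleSession denial and never reaches the service.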

Host-client/backend coverage now includes pre-session bootstrap reset and zero-byte read-timeout retry, repeated DTO calls, repeated post-logout stale-call probes, authenticated gateway close during a call, and oversized gateway response frames. The scripted CLI retries authMethods connection resets before login so QEMU host-forwarding races do not look like real session loss. The trusted web backend also retries a pre-session authMethods bootstrap disconnect or no-byte read timeout before any auth inventory or session state exists, clears backend-held session state for disconnect/oversized response failures, and returns user-facing gatewayDisconnected / reconnectRequired guidance without exposing raw frame errors to browser JavaScript. Kernel deferred TCP recv waiters now fail closed with an error CQE on terminal runtime/transport errors instead of dropping the pending call without completion; WouldBlock still requeues, and socket close still returns zero-byte EOF. The gateway now uses a connection frame-read wait instead of the short service-call wait, so an idle TCP remote session remains open past the former five-second read window and tears down only when the peer closes or the transport actually fails.
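
The pre-session retry described above is safe only because no auth inventory or session state exists yet. A bounded retry wrapper in that spirit might look like the sketch below; bootstrap_with_retry is a hypothetical helper, and the choice of retryable std::io::ErrorKind values is an assumption and not the client's actual list.

```rust
use std::io;

/// Runs `attempt` up to `max_attempts` times, retrying only on errors
/// that look like a host-forwarding race before any session exists.
/// Any other error is returned immediately.
pub fn bootstrap_with_retry<T>(
    max_attempts: u32,
    mut attempt: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut last: Option<io::Error> = None;
    for _ in 0..max_attempts {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(e)
                if matches!(
                    e.kind(),
                    // Read timeouts surface as WouldBlock or TimedOut
                    // depending on the platform.
                    io::ErrorKind::ConnectionReset
                        | io::ErrorKind::TimedOut
                        | io::ErrorKind::WouldBlock
                ) =>
            {
                last = Some(e);
            }
            Err(e) => return Err(e),
        }
    }
    Err(last.unwrap_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "max_attempts must be > 0")
    }))
}
```

The same wrapper should not be used after login: a reset on an authenticated connection is real session loss and needs the gatewayDisconnected / reconnectRequired path instead.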

Gate 6: Capability Calls Beyond Chat

  • Call at least two granted capabilities through generated host bindings. The current proof covers UserSession.info/session, SystemInfo.motd/system_info, the first worker-backed Chat.send, and the worker-backed Adventure methods: Adventure.status, Adventure.look, Adventure.inventory, the mutable Adventure.go(direction), and the bounded item controls Adventure.take/Adventure.drop/Adventure.use. Broader Adventure controls and PaperclipsGame.status wait for later service-specific proxy/client gates.
  • Prove a service-specific domain denial remains a schema result rather than a transport failure. The focused chat proof asks the per-session worker to call Chat.send without first joining the proof channel and requires chatSent(false) in the CLI/UI API smokes, not RemoteError or a gateway disconnect.
  • Prove target service sees session-bound caller metadata rather than a caller-selected identity field. The remote-client chat facet now grants only the existing bounded disclosure fields to the per-session worker, the worker explicitly requests those fields on Chat.join/Chat.send, and chat-server logs a target-service proof only after it sees a live opaque caller-session reference with operator principal class, password auth strength, and operator profile class. Browser/client-visible DTOs still do not expose raw scoped refs, local cap handles, or process handles.
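
The distinction between a domain denial and a transport failure in the chat proof can be expressed in the return type. In this sketch, ChatSent, RemoteError, and ChatWorkerModel are hypothetical stand-ins: a send without a prior join is an ordinary successful response carrying false, while only transport-level problems use the error channel.

```rust
/// Schema-level result: the service answered, and says whether it sent.
#[derive(Debug, PartialEq, Eq)]
pub struct ChatSent(pub bool);

/// Transport or session-level failure; the service never answered.
#[derive(Debug, PartialEq, Eq)]
pub enum RemoteError {
    GatewayDisconnected,
}

/// Minimal model of a per-session chat worker.
pub struct ChatWorkerModel {
    pub connected: bool,
    pub joined: bool,
}

impl ChatWorkerModel {
    pub fn join(&mut self) {
        self.joined = true;
    }

    /// A send before join is a domain denial, reported as ChatSent(false),
    /// and not as RemoteError.
    pub fn send(&self, _text: &str) -> Result<ChatSent, RemoteError> {
        if !self.connected {
            return Err(RemoteError::GatewayDisconnected);
        }
        Ok(ChatSent(self.joined))
    }
}
```

Keeping the two channels separate lets the CLI and UI smokes require chatSent(false) exactly, so a gateway disconnect cannot be mistaken for a passing denial test.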

Gate 7: Transport Security And Non-Password Auth Expansion

  • Add capOS-terminated TLS server config once certificate/TLS primitives exist. Until then, the first public Web UI ingress terminates HTTPS at the provider load balancer (see “Selected public ingress and TLS policy” under Gate 1B); this checklist item is the capability-native successor, not the first public proof.
  • Add mTLS client identity admission when certificate policy and account bindings exist.
  • Add public-key auth with protocol-domain-separated challenge bytes.
  • Add OIDC device-code and browser-assisted PKCE flows when OAuth/OIDC token capabilities exist.
  • Add passkey/WebAuthn through the web gateway path when authenticator primitives exist.
  • Add service/workload credential admission for non-human automation.
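
For the public-key item, "protocol-domain-separated challenge bytes" means the bytes a client signs are bound to this protocol, so a signature cannot be replayed against another protocol that uses the same key. The sketch below shows one common construction with length-prefixed fields; DOMAIN_TAG, the field layout, and the connection-binding input are assumptions for illustration, and the signing step itself is deliberately omitted.

```rust
/// Hypothetical domain tag; the real protocol string is not defined here.
pub const DOMAIN_TAG: &[u8] = b"capos-remote-session-auth-v1";

/// Appends a big-endian u32 length followed by the field bytes, so that
/// adjacent fields cannot be re-split into a different valid message.
fn push_field(out: &mut Vec<u8>, field: &[u8]) {
    out.extend_from_slice(&(field.len() as u32).to_be_bytes());
    out.extend_from_slice(field);
}

/// Builds the byte string a client would sign: domain tag, gateway nonce,
/// and a per-connection binding value, each length-prefixed.
pub fn challenge_bytes(
    domain: &[u8],
    gateway_nonce: &[u8; 32],
    connection_binding: &[u8],
) -> Vec<u8> {
    let mut out = Vec::with_capacity(12 + domain.len() + 32 + connection_binding.len());
    push_field(&mut out, domain);
    push_field(&mut out, gateway_nonce);
    push_field(&mut out, connection_binding);
    out
}
```

Including a fresh gateway nonce and a connection binding in the signed bytes also supports the checklist item that a proof cannot be replayed on another connection.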

Gate 8: Renewal, Revocation, And Resource Bounds

  • Wire kernel-backed UserSession.logout and gateway/connection close propagation for the current DTO remote-session gateway.
  • Reject already-admitted endpoint returns after caller logout/session death before result bytes, exception payloads, or result caps are installed in the stale caller.
  • Extend logout/revocation cleanup to live remote proxy objects once standard RPC framing lands.
  • Add renewal only through a narrow session-manager/broker path that does not revive stale ordinary grants by accident.
  • Add resource limits for connections, remote refs, in-flight calls, queued promises, result sizes, and per-session CPU/memory/network accounting. Initial four classes landed 2026-05-03 16:21 UTC: transcript ring (6d855c01), backend cap-holders + catalog mirrors (5ec0e456), outstanding worker calls per session (0f82528c), and gateway concurrent logins per principal (99955d59). Bound choices and the exhaustion-as-typed-denial contract are documented in the proposal’s “Resource and revocation bounds” section. Per-session CPU/memory/network accounting and remote-ref limits remain future work tied to the capnp-rpc rewrite.
  • Add explicit CapException/RPC exception tests for the currently representable Gate 8 failure classes: transport breakage, worker/proxy failure, stale sessions after logout, and oversized messages. Host coverage now checks that the backend-only capnp-rpc Chat facade maps DTO transport breakage to capnp::ErrorKind::Disconnected, maps DTO denials and unexpected worker/proxy responses to Failed CapException-like errors, and does not expose raw proxy positions, local cap ids, result-cap labels, session ids, or socket hints in exception text. The trusted web bridge coverage now drives worker-targeted Chat.send disconnect, oversized worker response, and post-logout stale-session paths; each fails closed as gatewayDisconnected or staleSession, decrements outstanding worker-call accounting, clears or preserves backend state according to the existing lifetime contract, and keeps redacted transcript export free of raw socket errors, frame sizes, local cap ids, proxy positions, raw session id hex, passwords, and host endpoint hints. Revoked-lease coverage remains blocked rather than faked: the current DTO surface has lease timestamps in RemoteCapEntry, but no explicit revoke/lease-expired request path or RemoteErrorCode variant that can distinguish a revoked lease from the existing staleSession / methodDenied denials. Add the revoked-lease proof when the standard RPC object lifetime path or a reviewed DTO denial code makes it observable.
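
The exhaustion-as-typed-denial contract for outstanding worker calls can be illustrated with a small per-session counter. This is a sketch under assumptions: WorkerCallBudget and Denial are hypothetical names, and the actual bound values live in the proposal's "Resource and revocation bounds" section, not here.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Denial {
    /// The per-session bound was hit; the call was never dispatched.
    ResourceExhausted,
}

/// Tracks in-flight worker calls for one session.
pub struct WorkerCallBudget {
    max: usize,
    outstanding: usize,
}

impl WorkerCallBudget {
    pub fn new(max: usize) -> Self {
        Self { max, outstanding: 0 }
    }

    /// Reserves one slot, or returns a typed denial when the bound is hit.
    pub fn begin(&mut self) -> Result<(), Denial> {
        if self.outstanding >= self.max {
            return Err(Denial::ResourceExhausted);
        }
        self.outstanding += 1;
        Ok(())
    }

    /// Releases one slot. Call this on success and on every failure path
    /// (disconnect, oversized response, stale session) so slots cannot leak.
    pub fn finish(&mut self) {
        self.outstanding = self.outstanding.saturating_sub(1);
    }

    pub fn outstanding(&self) -> usize {
        self.outstanding
    }
}
```

The failure-path release mirrors the coverage described above, where disconnect, oversized-response, and stale-session paths each decrement outstanding worker-call accounting.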

Gate 9: Bidirectional UI Composition

  • Keep this separate from Gate 1A. Gate 1A is a host-rendered UI over the existing client; Gate 9 lets capOS-side services propose bounded UI surfaces back to that host UI through explicit capabilities.
  • Add a proposal-level RemoteUiHost / RemoteUiSurface schema slice or equivalent typed DTOs for declarative UI patches and typed user events.
  • Keep the first UI proof behind a separate granted UI-surface cap, not implicit in RemoteSession or RemoteCapSet.
  • Prove a capOS service can open/update one bounded surface and receive one typed user event from the host UI.
  • Prove the same service cannot spoof login/permission chrome, inject raw JavaScript/CSS, persist layout or theme state, or exceed update/size quotas without explicit authority.
  • Add a host-app reset/close path that releases the UI surface and leaves underlying service caps intact.
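
A bounded, declarative UI surface with update and size quotas might be modeled as follows. The UiPatch variants, UiSurface, and UiDenial are hypothetical and much smaller than any real RemoteUiHost / RemoteUiSurface schema; the sketch only shows that patches are data, that quotas are enforced before a patch is applied, and that closing the surface is independent of other caps.

```rust
/// Declarative patches only: no scripts, styles, or raw markup.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum UiPatch {
    SetTitle(String),
    SetRows(Vec<String>),
}

#[derive(Debug, PartialEq, Eq)]
pub enum UiDenial {
    SurfaceClosed,
    PatchTooLarge,
    UpdateQuotaExceeded,
}

/// Host-side state for one granted surface.
pub struct UiSurface {
    max_updates: u32,
    max_patch_bytes: usize,
    updates: u32,
    open: bool,
    title: String,
    rows: Vec<String>,
}

impl UiSurface {
    pub fn new(max_updates: u32, max_patch_bytes: usize) -> Self {
        Self { max_updates, max_patch_bytes, updates: 0, open: true, title: String::new(), rows: Vec::new() }
    }

    /// Validates quotas first, then applies the patch.
    pub fn apply(&mut self, patch: UiPatch) -> Result<(), UiDenial> {
        if !self.open {
            return Err(UiDenial::SurfaceClosed);
        }
        let size = match &patch {
            UiPatch::SetTitle(title) => title.len(),
            UiPatch::SetRows(rows) => rows.iter().map(String::len).sum::<usize>(),
        };
        if size > self.max_patch_bytes {
            return Err(UiDenial::PatchTooLarge);
        }
        if self.updates >= self.max_updates {
            return Err(UiDenial::UpdateQuotaExceeded);
        }
        self.updates += 1;
        match patch {
            UiPatch::SetTitle(title) => self.title = title,
            UiPatch::SetRows(rows) => self.rows = rows,
        }
        Ok(())
    }

    /// Host-app reset/close: releases the surface only.
    pub fn close(&mut self) {
        self.open = false;
    }

    pub fn title(&self) -> &str {
        &self.title
    }

    pub fn rows(&self) -> &[String] {
        &self.rows
    }
}
```

Because the patch type has no variant for scripts, styles, or login chrome, the "cannot inject raw JavaScript/CSS" property holds by construction in this model; spoofing and persistence limits would need additional host-side rules.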

Verification Targets

Initial documentation/planning check:

make docs
git diff --check

First implementation check:

cargo test --manifest-path tools/remote-session-client/Cargo.toml --target x86_64-unknown-linux-gnu
make run-remote-session-capset-interop
make run-remote-session-adventure-interop
make run-capnp-chat-interop

Security review checklist:

  • Remote client cannot obtain authority by guessing a cap name.
  • Remote client cannot replay a session or grant identifier on another connection.
  • Remote client cannot ask for a local cap slot, endpoint selector, or receiver metadata.
  • Logout/close/revocation tears down all session-bound proxies.
  • Guest/anonymous profiles receive only explicitly policy-granted caps.
  • Browser/agent paths never receive raw capOS capability objects client-side.
  • GUI/Tauri/web front ends keep capOS caps in the Rust/backend/gateway side of the trust boundary; UI code receives typed view models, command descriptors, or tool requests.
  • UI composition is capability-gated, declarative, quota-bound, and reversible by the user.