Proposal: Remote Session CapSet Clients

Let a regular host application connect to a capOS instance, authenticate through the same session machinery as shells and gateways, receive a broker-issued remote view of its CapSet, and invoke the granted capabilities over standard Cap’n Proto RPC. The first proof can be a Linux Rust CLI because it is easy to script, but the design is for host applications generally: native GUI apps, Tauri apps with Rust backends, server-side webapp gateways, desktop tools, and agent runners can all consume the same remote session CapSet model.

The important correction is that this is not a special “remote chat client” and not another shell transport. Chat, Paperclips, Adventure, system-info, command surfaces, and future service APIs should be ordinary capabilities in a remote session bundle. A shell is one possible client of that bundle; it is not the universal protocol.

Current State

The tree has several local interop and UI proofs:

  • demos/capnp-chat-interop runs inside capOS, accepts one scoped TCP connection, decodes a schema-framed Chat.send parameter message, calls the resident chat endpoint, returns a schema-framed result, and exits.
  • The host harness uses a Linux Python script plus the pinned capnp tool to encode/decode request and result messages.
  • demos/remote-session-capset-gateway runs inside capOS, listens through a manifest-scoped TcpListenAuthority on guest port 2327, authenticates a remote session through SessionManager, returns a broker-shaped remote CapSet view, calls session/system-info DTO operations, and proves wrong-interface, unknown-cap, and stale-session denials. It derives login source metadata from the accepted socket and a gateway-generated connection event id.
  • tools/remote-session-client is a regular Linux Rust client crate. Its library is UI-neutral so the same client logic can back a CLI harness, native GUI, Tauri backend, or trusted web gateway.
  • remote-session-ui is a trusted loopback web bridge in that crate. Its Rust backend holds the TCP connection and remote session state, serves a browser UI, and exposes only view models, call results, denial diagnostics, and redacted transcript rows to browser JavaScript. The focused make run-remote-session-capset-ui harness drives that UI against a gateway-only QEMU fixture.
  • remote-session-web-ui is a capOS-served browser UI backend. Default make run starts it on guest port 8080 with loopback host forwarding, and make run-remote-session-self-served-web-ui proves the full boot-resource UI bundle is served from the capOS-owned origin while preserving the same browser-safe view-model boundary. This remains local/QEMU evidence; the cloudboot L4, private GCE, and public ingress proofs are separate tasks.

Those proofs are useful because they show external Cap’n Proto data can cross the QEMU TCP boundary and reach capOS-hosted services through narrowed listener caps. The remote-session proof is the first target-shaped slice, but it is not the final RPC API. It still lacks:

  • standard capnp-rpc message transport;
  • live typed RPC proxy objects rather than DTO-mediated gateway operations;
  • live endpoint-backed proxy objects beyond the current authenticated per-session DTO worker slices for Chat.send, Adventure status/look/inventory/go(direction), and the Paperclips Path B bridge-internal initial/command/status/projects synthesis from cached serviceLaunch state;
  • Paperclips service-runner launch on the default make run manifest (Path B wires the gateway worker, bridge dispatch, UI launch slot, and the system-remote-session-paperclips.cue focused manifest now declares its AuthorityBroker launch policy, but default-manifest Paperclips launch wiring remains future work);
  • the on-wire Paperclips control-plane (Path C): extending RemoteGatewayRequest/RemoteGatewayResponse with paperclips arms so the bridge no longer synthesizes responses from cached launch state and the gateway worker drives PaperclipsGameClient over a real DTO arm rather than the manifest-static game endpoint fallback;
  • rich Adventure/Paperclips client controls and broader service-specific worker/client implementations beyond the current Chat, Adventure, and Path B Paperclips slices;
  • complete object lifetime and exception behavior;
  • broader revocation and object-drop propagation beyond the current kernel-backed DTO logout and connection-teardown path;
  • TLS/mTLS and expanded auth adapters beyond password, anonymous, and guest;
  • resource accounting for remote references, in-flight calls, and result sizes.

Goals

  • Support a normal host client built and run outside capOS. A Linux Rust CLI is the smallest harness; native GUI and Tauri/webapp-backed clients should not need a different capOS protocol.
  • Authenticate through capOS session/admission services, not through an application-specific service secret.
  • Support multiple admission methods: local password where policy enables it, public-key signatures, OIDC/OAuth browser or device flows, passkey/WebAuthn through the web gateway path, mTLS client identity, guest/anonymous profiles where explicitly enabled, and future service/workload credentials.
  • Return a live remote CapSet view whose entries are typed RPC client objects, not serialized local cap-table slots.
  • Let the client call any granted remote-proxyable capability by name and expected interface ID.
  • Let a host UI discover broker-approved service profiles, start allowed game server processes through a restricted service-runner, and attach the capabilities those processes export or receive without exposing local spawn authority.
  • Support bidirectional session UI composition: a host UI can call capOS capabilities, and capOS-side services or agents can propose bounded changes to the host session’s panes, command palette, visualizations, density, theme, and workflow-specific controls through explicit UI capabilities.
  • Keep local-only authority local: cap IDs, endpoint generations, receiver selectors, session-global identifiers, and kernel result-cap indexes never become portable remote authority.
  • Preserve session-bound invocation context. Remote calls run under the gateway/worker session created for that remote client.
  • Make logout, disconnect, transport breakage, session expiry, policy revocation, and object drop observable and fail closed.
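The lookup-by-name-and-interface goal, together with the fail-closed goal, can be sketched as a small host-side view type. This is an illustrative sketch only: `RemoteCapSetView`, `RemoteCapEntry`, `Denial`, and the string `proxy` placeholder are hypothetical names, not the types in tools/remote-session-client. The three denial variants mirror the wrong-interface, unknown-cap, and stale-session denials the gateway proof already exercises.

```rust
use std::collections::HashMap;

/// Denials modelled on the ones the gateway proof exercises (names assumed).
#[derive(Debug, PartialEq, Eq)]
pub enum Denial {
    UnknownCap,
    WrongInterface,
    StaleSession,
}

/// One broker-approved entry. `proxy` stands in for a typed RPC client object.
pub struct RemoteCapEntry {
    pub interface_id: u64,
    pub proxy: String,
}

/// Host-side view of the granted bundle; it holds no local cap-table slots.
pub struct RemoteCapSetView {
    live: bool,
    entries: HashMap<String, RemoteCapEntry>,
}

impl RemoteCapSetView {
    pub fn new() -> Self {
        Self { live: true, entries: HashMap::new() }
    }

    /// Records an entry the broker granted to this session.
    pub fn grant(&mut self, name: &str, interface_id: u64, proxy: &str) {
        let entry = RemoteCapEntry { interface_id, proxy: proxy.to_string() };
        self.entries.insert(name.to_string(), entry);
    }

    /// Logout, disconnect, expiry, and revocation all end up here.
    pub fn invalidate(&mut self) {
        self.live = false;
    }

    /// Looks an entry up by name and expected interface id, failing closed.
    pub fn get(&self, name: &str, expected_interface_id: u64) -> Result<&RemoteCapEntry, Denial> {
        if !self.live {
            return Err(Denial::StaleSession);
        }
        let entry = self.entries.get(name).ok_or(Denial::UnknownCap)?;
        if entry.interface_id != expected_interface_id {
            return Err(Denial::WrongInterface);
        }
        Ok(entry)
    }
}
```

The staleness check runs before the name lookup, so a logged-out view cannot be used to probe which names were granted.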

Non-Goals

  • General network transparency across arbitrary capOS hosts.
  • OCapN compatibility or third-party handoffs.
  • Browser JavaScript receiving capOS capability objects directly. A webapp may be a front end, but a trusted server, gateway, or Tauri Rust backend holds the remote CapSet.
  • Letting capOS services execute arbitrary host UI code, inject unreviewed JavaScript/CSS, spoof trusted browser/desktop chrome, or persist UI changes outside the granted session UI scope.
  • Replacing SSH, WebShellGateway, native shell, or interactive command surfaces.
  • Exposing raw ProcessSpawner, raw process handles, endpoint owner caps, local cap ids, result-cap slots, raw network factories, broad storage roots, key material, or browser-held capOS capability objects as a default remote bundle. Process handles stay backend-local.
  • Treating a browser or webview as a capOS capability host. Browser code sees view models, launch forms, command descriptors, user events, diagnostics, and rendered results; the trusted Rust/backend side holds the remote session and any remote capability proxies.
  • Treating password authentication as the only or preferred remote path.
  • Serializing the kernel CapSet page or local cap table to the client.

UI Scope And Architecture

This section is the single-page synthesis future contributors should read before changing anything in tools/remote-session-client/ or the gateway. The detailed mechanics live in the rest of this proposal, the backlog (docs/backlog/remote-session-capset-client.md), and the plan (docs/backlog/remote-session-capset-client.md); this section captures what the UI is for, what it must hold, and how the pieces decompose.

Goal

A remote operator, after authenticating to a capOS gateway, can drive every remote-proxyable capability the broker grants their session – directly, with typed UI, without a shell, without webview-held capOS handles, without leaking session-id hex, cap slots, or process handles to the browser. The CapSet UI is not a shell, not a generic API explorer, and not a browser; it is a peer client of the same broker bundle a shell would consume, over TCP/RPC instead of the ring page, with a backend-held authority boundary and a typed UI on top.

What the UI is for

Grouped by intent, not by panel. Each item is constrained by the corresponding section later in this proposal.

  • Sign in to a remote capOS host. OS-style login surface with a visible username field, secondary endpoint/auth controls, no full persistent technical header. The gateway advertises the auth methods the system makes available (narrowed only by explicit manifest policy); disabled methods stay listed and clearly marked so the protocol is not password-shaped. The web UI’s username field is empty by default – the bridge does not pre-fill from CAPOS_REMOTE_SESSION_USER, host USER, or any other host-side identity hint, because a pre-fill leaks operator/account hints to anything observing the page before authentication. The CLI may take --user as an explicit operator override; the web UI does not. Denials surface with explicit codes, never as silent transport errors.
  • Understand who/what the operator is. Session view: principal, profile, auth method, auth strength, freshness/expiry, logout. Redacted session-id only. Observable lifecycle states: live and logged_out today; expired, revoked, and recovery_only in future. Stale-call attempts must visibly fail closed. (See ## Invocation Context and docs/proposals/session-bound-invocation-context-proposal.md.)
  • Discover what was granted. CapSet view as the inspection surface (name, interface id, transfer policy, lease expiry, get-by-name+id); service catalog view as the task-oriented surface (broker- and launcher-advertised runnable profiles, required grants, exported descriptors, launch/probe/status). See ## Service Catalog And Game Server Launch.
  • Use what was granted. For every cap the broker bundles, the UI must offer at least a generic invocable form – not just inspection. Service-specific rich clients (Adventure rich client, real Chat panel, Paperclips client, future agent-shell-services) layer on top of the same backend-held caps. Where a service exposes a typed CommandSurface (see docs/proposals/interactive-command-surface-proposal.md), the UI renders typed buttons/inputs/selectors driven by that surface’s metadata rather than hand-coded controls. Where a service exposes text/audio/video surfaces, the UI consumes them through the Chat substrate (docs/proposals/chat-multimedia-substrate-proposal.md): listener caps for incoming text/audio/video, capnp -> stream methods for outgoing media, capability-mediated peer/channel granting, and a WebRTC mapping for the browser-to-backend audio/video path. The CapSet UI never holds the listener caps directly; the trusted Rust backend owns them and emits redacted view-model events plus WebRTC handles for the browser.
  • Host a terminal panel when granted. The CapSet UI is not defined as a terminal emulator and works without one. But when the broker grants a TerminalSession cap – for a native shell, a POSIX shell, or any StdIO-based service that expects a terminal on the other side – the UI may host a terminal panel for that cap. The boundary stays: terminal bytes flow through a backend-held TerminalSession; the browser renders frames it receives, never opens a raw shell or holds a ProcessSpawner.
  • Surface agent-shell-exposed capabilities as first-class. The CapSet UI does not contain the LLM loop, model client, or tool-execution runner – those live in the agent shell process (see docs/proposals/llm-and-agent-proposal.md). But agent-shell-exposed services (e.g. “send message to running agent”, “approve queued action”, “audio stream to/from agent”) are services the broker can bundle. When bundled, the CapSet UI exposes them through the same per-session worker / typed view-model pattern as Chat or Adventure. Action-approval queues are the canonical capability-driven UI surface here – the policy engine asks, the operator sees a queue and approves/denies per item.
  • Launch services where policy allows. Service-runner launch flow: select profile → see required grants → side-effect-free probe → confirm → backend launches restricted server graph (e.g. adventure-server + NPC companions) → backend attaches/retains exported descriptors in the backend-held remote CapSet. Browser sees launch form, status, denials, descriptors – never raw ProcessSpawner or process handles.
  • Diagnose / audit. Low-level probes (denied-chat, stale-call, system MOTD, session-summary diff) live in a Diagnostics or Session panel, not interleaved with normal service use. Redacted transcript export in its own view; redaction status visible; raw authority material absent. UI smoke checks for forbidden markers (processhandle, capabilitymanager, capslot, …).
  • Bidirectional UI composition (later). A capOS service may, only when granted a RemoteUiSurface cap, propose bounded layout/theme/command/visualization patches and receive typed user events back. Cannot inject JS/CSS, spoof login chrome, persist UI state without a separate settings cap, or exceed quota/size bounds. See ## Bidirectional UI Composition.
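The forbidden-marker smoke check from the Diagnose / audit item can be sketched as a scan over a browser-visible envelope. The three markers below are the ones named above; the source list is open-ended, so the constant is deliberately incomplete. The function name and the case-insensitive matching are assumptions for illustration, not the harness's actual behavior.

```rust
/// Markers named in the proposal; the real list continues beyond these three.
const FORBIDDEN_MARKERS: [&str; 3] = ["processhandle", "capabilitymanager", "capslot"];

/// Returns every forbidden marker found in a browser-visible envelope.
/// Matching is ASCII case-insensitive (an assumption of this sketch).
pub fn forbidden_markers_in(envelope: &str) -> Vec<&'static str> {
    let lowered = envelope.to_ascii_lowercase();
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| lowered.contains(*marker))
        .collect()
}
```

A smoke test would call this on each JSON envelope and each exported transcript row and fail on any non-empty result.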

Design invariants the UI must hold

The proposals don’t specify pixel layout; they specify a small number of hard invariants. Every UI design choice has to fit these:

  1. Authority boundary. Trusted Rust backend holds: TCP connection, remote session state, per-session worker proxies, capOS cap references, broker bundle policy, raw snapshots used to compute view models. Browser holds: view models, command descriptors, launch forms, redacted transcript rows, theme state.
  2. Session-bound invocation. Every post-auth call runs under the immutable SessionContext of the per-session worker. The browser cannot select identity by request field; the backend cannot construct a fresh SessionContext from request bytes. Logout, disconnect, expiry, revocation must break all session-bound proxies and fail closed before result bytes reach the caller.
  3. Privacy-preserving disclosure. Default endpoint metadata is opaque (scoped_ref + freshness). Subject fields (principal, profile, auth strength) appear in the UI only because the broker policy explicitly disclosed them for that service.
  4. Capability = invoke gate; UI surface = render gate. A button on the screen is not what authorizes a call. The cap held in the backend is. UI controls that aren’t currently invocable must say “planned / not remote-proxyable yet” rather than imply they work.
  5. Interface = permission. Method-level access lives in the schema, not in a per-cap rights bitmask. Narrowing what a remote client can do means a narrower wrapper cap from the broker – not a flag on the same cap.
  6. Side-effect-free probes are real. A probe response that says “supported / required grants accepted / message” did not spawn anything, allocate endpoint owners, or attach caps.
  7. Redaction is structural, not after-the-fact. Sensitive fields are dropped or redacted on the way into view models, not stripped from logs after the fact. Backend tests assert browser envelopes never contain raw session-id hex or password material.
  8. The UI smoke test fails if any visible button is unexercised. This prevents the UI from accumulating decorative controls.
  9. Theme/layout state is local UI state, not capOS state. Persistence requires an explicit settings cap.
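Invariants 3 and 7 can be illustrated with a view-model type that has no field able to carry the raw session id, so redaction happens when the view model is built rather than when logs are written. This is a minimal sketch under assumed names: `RawSessionSnapshot`, `SessionViewModel`, `to_view_model`, the 16-byte id width, and the fixed redaction label are all hypothetical.

```rust
/// Backend-only snapshot; never serialized to the browser.
pub struct RawSessionSnapshot {
    pub session_id: [u8; 16],
    pub principal: String,
    /// Whether broker policy disclosed the principal for this service.
    pub principal_disclosed: bool,
}

/// Browser-facing view model. No field here can hold the raw id.
#[derive(Debug, PartialEq, Eq)]
pub struct SessionViewModel {
    pub session_label: String,
    pub principal: Option<String>,
}

/// Builds the view model, dropping sensitive fields on the way in.
pub fn to_view_model(raw: &RawSessionSnapshot) -> SessionViewModel {
    SessionViewModel {
        session_label: "session [redacted]".to_string(),
        principal: if raw.principal_disclosed {
            Some(raw.principal.clone())
        } else {
            None
        },
    }
}

/// Lowercase hex, used by tests to assert the raw id never leaks.
pub fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

A backend test in this style renders the view model and asserts that the hex form of the raw id is absent, which is the kind of assertion invariant 7 asks for.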

Architecture decomposition

flowchart LR
  subgraph host[Host machine]
    subgraph browser[Browser / webview / Tauri webview]
      js[Browser JS - view models, forms, results, redacted transcript, theme state]
    end
    subgraph rust[Trusted Rust backend - tools/remote-session-client]
      bridge[HTTP bridge - /api/* endpoints]
      app[AppState - session VM, caps VM, snapshots, transcript, automation]
      tcp[Gateway TCP connection - schema-framed DTOs today, capnp-rpc planned]
      lib[remote-session-client lib - protocol, frame, session_diff, transcript]
    end
    cli[CLI binary - same lib backend]
  end

  subgraph capos[capOS guest in QEMU or future hardware]
    subgraph gw[Remote-session gateway process]
      tcplisten[TcpListenAuthority on guest port 2327]
      authflow[Auth flow - password, anonymous, future adapters]
      sm[SessionManager.login -> UserSession]
      broker[AuthorityBroker.remoteClientBundle]
    end
    subgraph workers[Per-session RPC workers]
      chatw[Chat worker - holds Chat client facet]
      advw[Adventure worker - holds Adventure endpoint]
      futurew[Future workers per service - terminal, agent, voice...]
    end
    subgraph services[Backing services]
      cs[chat-server]
      ad[adventure-server + NPCs]
      pc[paperclips-server - future]
    end
    kernel[Kernel - SessionManager, CapTable, Endpoints, ring, audit]
  end

  js -- HTTP JSON --> bridge
  bridge --> app --> lib --> tcp
  cli --> lib
  tcp -- TCP / DTO today / capnp-rpc planned --> tcplisten
  tcplisten --> authflow --> sm --> broker
  broker -- backend-held descriptors / caps --> app
  app -- worker spawn requests --> broker
  broker --> workers
  chatw --> cs
  advw --> ad
  workers <--> kernel

Key seams:

  • Gateway boundary (demos/remote-session-capset-gateway/): scoped TcpListenAuthority, SessionManager, AuthorityBroker, narrowly approved backend launch authority. No raw NetworkManager, raw ProcessSpawner, broad endpoint authority.

  • Per-session worker boundary (demos/remote-session-chat-worker/, demos/remote-session-adventure-worker/, future workers): each endpoint-backed remote method runs in a worker that holds the live session-bound caller context. Worker spawn is validated; logout/connection-close tears down workers; release flushing happens on shutdown.

  • Trusted Rust backend boundary (tools/remote-session-client/src/): the AppState keeps gateway: Option<GatewayConnection>, current_snapshot: RemoteSessionSnapshot (raw), and view-model fields (redacted). The HTTP bridge’s /api/* surface is the only path the browser has into capOS authority.

  • Browser boundary (tools/remote-session-client/ui/): pure client of /api/state view models, /api/call/* typed calls, /api/capset/*, /api/probe/*, /api/transcript/*. JS state is presentation: theme, active tab, login form values, click coverage report.

  • Transport evolution. Today: bespoke schema-framed Cap’n Proto DTOs, length-prefixed frames, request/response sequence numbers. Planned: standard capnp-rpc with live proxy objects, exception mapping, release/drop, promise pipelining. The backend boundary stays the same; the wire shape changes.

    Standard capnp-rpc (the capnp-rpc Rust crate, v0.25 at the time of writing) is std-only and requires a futures executor; the QEMU-side gateway is #![no_std] #![no_main] with a synchronous loop { accept; loop { recv_frame; handle; send_frame } } shape (demos/remote-session-capset-gateway/src/main.rs). The wire-level replacement is therefore gated on either bringing an async runtime to capOS userspace or shipping a sync-friendly capnp-rpc adapter. Until then, transport-lifetime / exception behavior carries the contract documented next, which the eventual rewrite must preserve.

    Runtime decision for the first proxy layer: use a temporary dual-stack. The Linux host backend now has a local capnp-rpc Chat facade/proxy layer because that side already has std and can run a futures executor. The facade translates backend-held typed proxy calls into the existing RemoteGatewayRequest / RemoteGatewayResponse DTO transport, so the guest gateway remains synchronous and #![no_std]. This proves host-backend proxy semantics, denial/disconnect mapping, and browser-safe view-model integration; it does not claim standard capnp-rpc framing or live RPC vats inside capOS. Gateway-wire replacement waits for the userspace runtime decision above, and the dual-stack must be removed after the reviewed guest-side RPC path carries live service traffic.
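The "length-prefixed frames, request/response sequence numbers" shape of today's transport can be sketched as follows. The byte layout here is an assumption made for illustration (a 4-byte little-endian body length, then an 8-byte little-endian sequence number, then an opaque payload), as are `MAX_FRAME_BYTES` and both function names; the real gateway framing may differ.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound on one frame body (sequence number + payload).
pub const MAX_FRAME_BYTES: usize = 64 * 1024;

/// Writes one frame: u32 LE body length, u64 LE sequence number, payload.
pub fn write_frame<W: Write>(w: &mut W, seq: u64, payload: &[u8]) -> io::Result<()> {
    let body_len = 8 + payload.len();
    if body_len > MAX_FRAME_BYTES {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(body_len as u32).to_le_bytes())?;
    w.write_all(&seq.to_le_bytes())?;
    w.write_all(payload)
}

/// Reads one frame and checks that it answers the expected request.
pub fn read_frame<R: Read>(r: &mut R, expected_seq: u64) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let body_len = u32::from_le_bytes(len_buf) as usize;
    // Reject oversized or impossibly short bodies before allocating.
    if body_len < 8 || body_len > MAX_FRAME_BYTES {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "bad frame length"));
    }
    let mut body = vec![0u8; body_len];
    r.read_exact(&mut body)?;
    let mut seq_buf = [0u8; 8];
    seq_buf.copy_from_slice(&body[..8]);
    if u64::from_le_bytes(seq_buf) != expected_seq {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "sequence mismatch"));
    }
    // Everything after the sequence number is the schema-framed payload.
    Ok(body.split_off(8))
}
```

Because both functions are generic over `Read` and `Write`, the same code runs against a `TcpStream` in the bridge and against in-memory buffers in tests. A truncated stream surfaces as `UnexpectedEof`, which a caller can treat as a disconnect.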

Transport lifetime and exception contract

The bespoke transport’s lifetime contract is what the future capnp-rpc proxy layer has to preserve. The host-side test module in tools/remote-session-client/src/bin/remote_session_ui.rs pins each rule end-to-end:

  • Connection close mid-call clears state, returns gatewayDisconnected. A TCP FIN observed during a request surfaces as 503 gatewayDisconnected with view.lastResult.code = "gatewayDisconnected", view.connected = false, session = null, empty caps / services / launchers, and a disconnect transcript row scoped to the operation that failed. Covered by authenticated_gateway_close_during_call_clears_view_with_reconnect_guidance, oversized_gateway_response_during_call_clears_view_with_reconnect_guidance, password_denial_then_closed_tcp_resets_before_retry, http_password_denial_then_closed_tcp_preserves_backend_error_and_clears_view.
  • Half-open transport (write succeeds, read stalls) times out cleanly. The bridge’s read_timeout (endpoint.io_timeout()) must fire and surface the same gatewayDisconnected shape; no hang or partial-state leak. Both the post-request stall case and the partial-frame-header stall case are covered: half_open_response_read_times_out_as_disconnect, partial_response_header_then_stall_treated_as_disconnect.
  • Protocol-level decode errors (sequence mismatch, malformed payload) yield 500 internal without tearing down the connection. This documents current behavior; the future capnp-rpc rewrite is expected to tighten this to a connection-level abort once the proxy layer is in place. Covered by response_with_wrong_seq_yields_internal_error, malformed_response_payload_yields_internal_error.
  • Immediate re-login after transport failure succeeds. No retry / cooldown gate; the recovered session must not echo the prior call’s failure as lastResult. Covered by immediate_relogin_after_mid_call_close_succeeds.
  • disconnect rows survive into the operator-visible exported transcript (GET /api/transcript/redacted) scoped to the operation that failed and free of stream-level metadata (peer addresses, frame sizes, raw os error strings, secrets). Covered by disconnect_recorded_in_exported_transcript_after_mid_call_close.
  • Gateway-side teardown calls kernel UserSession.logout on both the explicit-logout DTO path and the connection-close path. Verified by the QEMU-driven harness in tools/qemu-remote-session-capset-smoke.sh, which asserts that the log lines “UserSession.logout cap call succeeded; remote session stale” and “connection teardown UserSession.logout cap call succeeded” both appear during the multi-cycle interop run.
  • Post-logout calls fail closed. The bridge keeps the gateway socket alive after logout so a stale-call probe gets an explicit staleSession denial rather than a transport failure. Covered by repeated_stale_calls_after_logout_remain_fail_closed and the worker-targeted stale_chat_proxy_after_logout_returns_typed_denial.
  • Worker/proxy lifetime failures preserve the same split. Worker-targeted Chat.send transport loss and oversized worker responses clear backend gateway/session state and surface gatewayDisconnected with reconnect guidance, while post-logout worker calls remain typed staleSession denials on the still-open gateway socket. The backend-only capnp-rpc facade maps transport breakage to ErrorKind::Disconnected, and maps DTO denials or unexpected worker/proxy responses to Failed CapException-like errors rather than panics or silent broader authority. Covered by chat_worker_transport_breakage_clears_state_and_redacts_export, oversized_chat_worker_response_maps_to_disconnect_without_frame_leak, generated_chat_client_transport_breakage_maps_to_disconnected_exception, generated_chat_client_dto_denial_maps_to_failed_cap_exception_like_error, and generated_chat_client_unexpected_worker_response_maps_to_failed_exception.
  • Revoked leases are not yet separately observable. The current DTO surface carries leaseExpiresAtMs on cap entries, but it has no explicit revoke/lease-expired call path or denial code that can distinguish a revoked lease from staleSession or methodDenied. Tests must not fake this coverage; add it with the standard RPC object lifetime path or a reviewed DTO denial shape.
  • Redacted transcript export does not expose exception/lifetime internals. Worker-targeted disconnect, oversized response, and stale-session exports are asserted free of raw socket addresses, OS error strings, frame-size diagnostics, local cap ids, result-cap labels, proxy table positions, raw session-id hex, passwords, and host endpoint hints.
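The split the rules above describe (transport loss clears backend state, decode errors do not tear down the connection, post-logout calls get a typed denial on a live socket) can be summarized as one classification function. This is a reading of the contract, not the bridge's code: `CallFailure`, `Outcome`, and `classify` are hypothetical names, and `http_status` is `None` where the text above does not state a status.

```rust
/// Failure conditions named in the lifetime contract (variant names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallFailure {
    ConnectionClosedMidCall,
    ReadTimedOut,
    OversizedResponse,
    SequenceMismatch,
    MalformedPayload,
    CalledAfterLogout,
}

/// What the operator-facing side observes for a failure.
#[derive(Debug, PartialEq, Eq)]
pub struct Outcome {
    pub code: &'static str,
    /// Only filled in where the contract states a status.
    pub http_status: Option<u16>,
    /// True when the backend drops gateway and session state.
    pub clears_gateway_state: bool,
}

pub fn classify(failure: CallFailure) -> Outcome {
    use CallFailure::*;
    match failure {
        // Transport loss: one shape for close, stall, and oversized responses.
        ConnectionClosedMidCall | ReadTimedOut | OversizedResponse => Outcome {
            code: "gatewayDisconnected",
            http_status: Some(503),
            clears_gateway_state: true,
        },
        // Current behavior: decode errors do not tear down the connection.
        SequenceMismatch | MalformedPayload => Outcome {
            code: "internal",
            http_status: Some(500),
            clears_gateway_state: false,
        },
        // The socket stays open so the denial is explicit, not a transport error.
        CalledAfterLogout => Outcome {
            code: "staleSession",
            http_status: None,
            clears_gateway_state: false,
        },
    }
}
```

Keeping the mapping in one exhaustive `match` means any new failure variant fails to compile until it has been assigned a code, which suits a contract that a later transport rewrite has to preserve.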

Resource and revocation bounds

Each per-session resource class has an explicit named ceiling and maps over-cap conditions to a typed denial diagnostic that reuses the transport-error envelope from above. Operators tuning these bounds should re-audit the per-session memory budget and the operator-multitool scenario before changing them; raw observed counters are not exposed to browser-facing view models.

| Resource | Constant | Default | Where enforced | Denial code |
| --- | --- | --- | --- | --- |
| Outstanding worker calls per session | MAX_OUTSTANDING_WORKER_CALLS_PER_SESSION | 4 | tools/remote-session-client/src/bin/remote_session_ui.rs::transact (gates Adventure / Chat-shaped requests before submission) | tooManyWorkerCalls (HTTP 503) |
| Transcript ring per session | TRANSCRIPT_ROWS_CAP (4096), TRANSCRIPT_DETAIL_BYTES_CAP (1 MiB) | row + byte caps | AppState::push_transcript / enforce_transcript_caps in the same file | drop-oldest plus a single audit "transcript truncated; ..." row |
| Backend cap holders per session | MAX_BACKEND_CAP_HOLDERS_PER_SESSION (64), MAX_BACKEND_SERVICE_CATALOG_ENTRIES (64), MAX_BACKEND_LAUNCHER_CATALOG_ENTRIES (32) | per-Vec entry caps | capset_list / service_catalog / launcher_catalog in the same file | tooManyCapHolders (mirrors transport-error envelope) |
| Browser-session owner slot | one tentative or authenticated owner | first-wins bridge owner | login-route preflight reserves before gateway authentication; success finalizes on cookie rotation, failure releases the reservation | sessionAlreadyInUse (HTTP 409) |
| Local HTTP request parser | request line 8 KiB, header line 8 KiB, 96 headers, aggregate headers 32 KiB, body 64 KiB, fixed read/write timeout | loopback bridge input bounds | read_http_request and handle_connection reject before route dispatch, JSON parsing, auth, or gateway I/O | httpLineTooLong, tooManyHeaders, headersTooLarge, requestBodyTooLarge, requestTimeout |
| Local HTTP handler slots | MAX_HTTP_HANDLER_THREADS (32) | concurrent request handlers | accept loop acquires a bounded slot before spawning a handler thread | handlerLimitExceeded (HTTP 503) |
| Concurrent gateway logins per principal | MAX_CONCURRENT_LOGINS_PER_PRINCIPAL (4), PRINCIPAL_TABLE_SLOTS (32) | per-principal counter, distinct-principal table ceiling | demos/remote-session-capset-gateway/src/lib.rs::PrincipalLoginTable::try_admit, called from both password and anonymous login paths | serviceUnavailable with “per-principal concurrent-session cap reached…” |

The bridge-side bounds are exercised by host tests in remote_session_ui.rs::tests (transcript_row_count_cap_drops_oldest_with_truncation_marker, transcript_byte_cap_drops_oldest_with_truncation_marker, transcript_at_exact_row_cap_does_not_truncate, capset_list_at_max_holders_bound_stores_all_entries, capset_list_over_max_holders_returns_typed_denial, service_catalog_at_max_entries_bound_stores_all_entries, service_catalog_over_max_entries_returns_typed_denial, launcher_catalog_at_max_entries_bound_stores_all_entries, launcher_catalog_over_max_entries_returns_typed_denial, outstanding_worker_calls_at_bound_still_allow_one_more_after_completion, outstanding_worker_calls_over_bound_returns_typed_denial, concurrent_first_wins_login_reservations_allow_one_post_login_owner, failed_login_reservation_releases_for_later_owner, http_parser_rejects_oversized_request_line_before_route_work, http_parser_rejects_oversized_header_line, http_parser_rejects_too_many_headers, http_parser_rejects_aggregate_headers_too_large, http_parser_rejects_oversized_body_from_content_length, http_parser_times_out_incomplete_request_line, handler_slots_bound_concurrent_request_threads). The gateway-side bound is exercised by host tests in demos/remote-session-capset-gateway/src/lib.rs::tests (admits_up_to_max_concurrent_logins_per_principal, rejects_over_cap_admission_with_typed_denial, release_reopens_a_slot_for_the_same_principal, distinct_principals_have_independent_counters, release_to_zero_drops_the_slot, release_unknown_principal_is_a_noop, table_full_admission_does_not_grow_past_slot_ceiling).
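The per-principal admission bound can be sketched as a small counter table. The constant names and values, the type name, and `try_admit` come from the table above; everything else is an assumption. In particular, the real gateway is `#![no_std]` and likely uses fixed storage, whereas this sketch uses `Vec` and `String` for brevity, and the `AdmitError` variants and `release` signature are hypothetical.

```rust
pub const MAX_CONCURRENT_LOGINS_PER_PRINCIPAL: u32 = 4;
pub const PRINCIPAL_TABLE_SLOTS: usize = 32;

/// Hypothetical error type; the gateway maps over-cap to serviceUnavailable.
#[derive(Debug, PartialEq, Eq)]
pub enum AdmitError {
    PerPrincipalCapReached,
    TableFull,
}

/// Tracks live logins per principal with a ceiling on distinct principals.
#[derive(Default)]
pub struct PrincipalLoginTable {
    slots: Vec<(String, u32)>,
}

impl PrincipalLoginTable {
    /// Admits one login, or denies without changing any counter.
    pub fn try_admit(&mut self, principal: &str) -> Result<(), AdmitError> {
        if let Some((_, count)) = self.slots.iter_mut().find(|slot| slot.0 == principal) {
            if *count >= MAX_CONCURRENT_LOGINS_PER_PRINCIPAL {
                return Err(AdmitError::PerPrincipalCapReached);
            }
            *count += 1;
            return Ok(());
        }
        // A new principal needs a free slot; the table never grows past the ceiling.
        if self.slots.len() >= PRINCIPAL_TABLE_SLOTS {
            return Err(AdmitError::TableFull);
        }
        self.slots.push((principal.to_string(), 1));
        Ok(())
    }

    /// Releases one login; unknown principals are a no-op.
    pub fn release(&mut self, principal: &str) {
        if let Some(i) = self.slots.iter().position(|slot| slot.0 == principal) {
            self.slots[i].1 -= 1;
            if self.slots[i].1 == 0 {
                // Dropping the slot at zero frees it for another principal.
                self.slots.swap_remove(i);
            }
        }
    }

    pub fn active_principals(&self) -> usize {
        self.slots.len()
    }
}
```

The behaviors exercised by the gateway tests listed above (admit up to the cap, typed rejection over it, release reopening a slot, independent counters, drop at zero, no-op release, bounded table) each correspond to one branch here.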

Two contracts the future capnp-rpc rewrite must preserve: fail-closed bound exhaustion never panics or leaks raw counters into browser envelopes (only typed denial codes plus a backend audit row); and operator-visible audit material (bound-exhausted transcript rows, drop-oldest truncation markers) is recorded backend-side through the existing redacted-transcript path, not surfaced through new untyped error channels.
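The transcript ring's drop-oldest behavior with a single truncation marker can be sketched as below. Caps are constructor parameters so the example stays small; the bridge uses TRANSCRIPT_ROWS_CAP and TRANSCRIPT_DETAIL_BYTES_CAP. Two points are assumptions of this sketch: the marker is synthesized at export time rather than stored as a row, and the marker text is kept as the elided form quoted in the table because its full wording is not given here. `TranscriptRing` and its methods are hypothetical names.

```rust
use std::collections::VecDeque;

/// Marker text as quoted in the bounds table; the remainder is elided there.
pub const TRUNCATION_MARKER: &str = "transcript truncated; ...";

pub struct TranscriptRing {
    rows: VecDeque<String>,
    detail_bytes: usize,
    rows_cap: usize,
    bytes_cap: usize,
    truncated: bool,
}

impl TranscriptRing {
    pub fn new(rows_cap: usize, bytes_cap: usize) -> Self {
        Self { rows: VecDeque::new(), detail_bytes: 0, rows_cap, bytes_cap, truncated: false }
    }

    /// Appends a row, then drops the oldest rows until both caps hold.
    pub fn push(&mut self, detail: &str) {
        self.detail_bytes += detail.len();
        self.rows.push_back(detail.to_string());
        while self.rows.len() > self.rows_cap || self.detail_bytes > self.bytes_cap {
            match self.rows.pop_front() {
                Some(old) => {
                    self.detail_bytes -= old.len();
                    self.truncated = true;
                }
                None => break,
            }
        }
    }

    /// Exports rows oldest-first, led by one marker if anything was dropped.
    pub fn export(&self) -> Vec<String> {
        let mut out = Vec::new();
        if self.truncated {
            out.push(TRUNCATION_MARKER.to_string());
        }
        out.extend(self.rows.iter().cloned());
        out
    }
}
```

Filling the ring exactly to its row cap leaves `truncated` false, which matches the intent of the exact-cap test named above; only the push that exceeds a cap records truncation.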

Layer map for future iterations

| Layer | Owner | Today | Heading toward |
| --- | --- | --- | --- |
| Wire | gateway ↔ backend | length-prefixed schema-framed DTOs | standard capnp-rpc over TCP, then TLS/mTLS |
| Auth | gateway | password, anonymous, guest; disabled methods advertised | + public key, OIDC (device-code + PKCE), passkey, mTLS, service credential |
| Bundle | broker | shell-bundle-shaped wrapper for remote | first-class remoteClientBundle profile shape |
| Worker | per-session | Chat.send, Adventure status/look/inventory/go | broader Adventure verbs, real Chat panel, Paperclips worker, generalized lifecycle, terminal-session host, agent-shell services |
| Backend (Rust) | trusted | AppState, snapshot, view models, transcript, automation, first-wins BrowserSession ownership, local HTTP parser/handler bounds, per-session resource bounds (worker-calls, transcript rows + bytes, cap holders, gateway logins per principal) | live RPC proxy state, RemoteUiHost cap holder |
| Browser | untrusted UI | login + Services / CapSet / Diagnostics / Transcript / Session SPA | richer service-specific clients, generic CommandSurface-driven forms, agent-shell mode, terminal panel for granted TerminalSession, RemoteUiSurface rendering |
| Host packaging | trusted | CLI, make remote-session-ui, make remote-session-tauri check/dev wrapper | distributable Tauri package sharing the same Rust backend |

Self-served capOS web UI boundary

The first self-served browser UI is a capOS-hosted application service, not the host remote-session-ui development bridge moved into the guest. A new capOS userspace service, remote-session-web-ui, owns the HTTP listener, serves the UI bundle, runs the authenticated web-session backend, holds the remote session CapSet/proxy state, and projects browser-safe view models.

Static assets are boot-package resources. The implementation should reuse the reviewed host UI asset source or a smaller reviewed subset, but the served copy is an immutable, fixed-name bundle embedded in the capOS boot package and granted by manifest resource name with a pinned digest or equivalent build-time integrity label. remote-session-web-ui serves only that bundle and a small generated bootstrap document; it does not expose a host directory, capOS storage root, asset traversal, or development hot-reload path.

The first listener surface is HTTP/1.1 on a manifest-scoped TcpListenAuthority for a dedicated UI port such as guest port 8080. HTTP serves static assets plus same-origin JSON API routes. WebSocket, server-sent events, and terminal/media streaming remain later extensions that need separate route-level bounds; the first proof should avoid them so the authority and validation surface is small.

The manifest grants for remote-session-web-ui are narrow: scoped TcpListenAuthority for the UI port, SessionManager, AuthorityBroker, the immutable UI asset bundle, and the same restricted remote-client service-runner/backend-launch authority needed to expose approved service descriptors. It must not receive raw NetworkManager, raw socket factories, broad storage roots, raw ProcessSpawner, shell launcher authority, endpoint owner caps, or arbitrary endpoint creation authority.

The service is the trusted backend and holds remote CapSet/proxy state server-side. Browser JavaScript receives only view models, launch forms, user-event commands, typed results, denial diagnostics, and redacted transcript rows. It never receives raw capOS caps, raw ProcessSpawner, process handles, endpoint owner authority, local cap IDs, result-cap slots, session-global identifiers, remote CapSet handles, host usernames, host environment variables, host paths, or QEMU-forwarding identity hints.
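A proof harness could check that boundary with a simple scan over serialized browser-visible envelopes. The sketch below assumes a hypothetical marker list (`FORBIDDEN_MARKERS`) and helper name (`forbidden_markers_in`); a real list would be derived from the schema and reviewed, and substring matching is only a coarse first filter.

```rust
/// Hypothetical marker strings that should never appear in browser-visible
/// output. These names are illustrative, not the project's actual list.
pub const FORBIDDEN_MARKERS: &[&str] = &[
    "ProcessSpawner",
    "capId",
    "resultCapSlot",
    "endpointOwner",
];

/// Return every forbidden marker found in a serialized envelope.
/// An empty result means the envelope passed this coarse check.
pub fn forbidden_markers_in(envelope: &str) -> Vec<&'static str> {
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| envelope.contains(*marker))
        .collect()
}
```

A test would run this over every JSON response and transcript row captured during browser automation and fail if any result is non-empty.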

Login remains session-manager shaped. The browser submits credentials or guest/anonymous intent to the capOS-served JSON endpoint; the service derives source metadata from its accepted socket and service-generated event id, asks SessionManager for a UserSession, asks AuthorityBroker for the remote-client bundle, and only then exposes disclosed session/service fields as browser-safe models. The browser cannot select principal, profile, worker session context, or backend cap holder by replaying a request field.
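One way to picture the source-metadata rule is a login form type with no source field at all, paired with a service-side allocator for event ids. All names here (`LoginForm`, `LoginSource`, `EventIds`) are hypothetical and the shapes are assumptions for illustration only.

```rust
use std::net::SocketAddr;

/// Fields a browser may submit. There is deliberately no source, principal,
/// worker-context, or cap-holder field for a client to fill in or replay.
pub struct LoginForm {
    pub username: String,
    pub proof: Vec<u8>,
}

/// Source metadata the service records itself (hypothetical shape).
#[derive(Debug, PartialEq)]
pub struct LoginSource {
    pub peer: SocketAddr,
    pub event_id: u64,
}

/// Allocates one service-generated event id per login attempt.
pub struct EventIds {
    next: u64,
}

impl EventIds {
    pub fn new() -> Self {
        Self { next: 1 }
    }

    /// Build source metadata from the accepted socket's peer address only.
    /// The form is taken by reference to show that none of its fields are read.
    pub fn derive_source(&mut self, _form: &LoginForm, peer: SocketAddr) -> LoginSource {
        let event_id = self.next;
        self.next += 1;
        LoginSource { peer, event_id }
    }
}
```

Because the form type cannot carry source data, there is nothing for a replayed request to forge; the only inputs are the socket the service accepted and a counter it owns.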

Gate 1B is now an evidence ladder rather than a single proof name. The landed local/QEMU layer is:

  • remote-session-self-served-web-ui: a focused manifest boots remote-session-web-ui, browser automation loads assets from the capOS-owned origin, logs in, calls at least one granted capability through the service-held backend state, proves logout/stale failure stays closed, and checks forbidden authority markers are absent from browser-visible envelopes and transcripts.
  • remote-session-self-served-web-ui-default-run: default make run starts the capOS-served UI on guest port 8080 and forwards it to a loopback host port for local operator use.
  • remote-session-self-served-full-ui-bundle: the capOS service now serves the reviewed fixed-name boot-resource bundle, including the operator workspace assets and /bundle/manifest.json, with explicit content types, no directory traversal, and digest-pinned build evidence.

Those proofs do not close the selected GCE Web UI path by themselves. The local service proof cloud-prod-remote-session-web-ui-l4-local-proof is done: it runs remote-session-web-ui through the non-qemu cloudboot socket path using the Phase C userspace network stack and configured IPv4 route, not the older QEMU-only kernel socket fixture or the host remote-session-ui bridge. After that, cloud-gce-private-self-hosted-webui-proof proves private GCE reachability over the live NIC without public IP or public firewall exposure. cloud-gce-public-self-hosted-webui-ingress-tls is the later public operator-access task; it remains on hold for explicit public-ingress/TLS authorization even though the ingress policy design is recorded.

Rollback is manifest/build-target selection: remove the focused target and the remote-session-web-ui listener/asset grants while keeping the host-served make remote-session-ui bridge and the remote-session CapSet gateway unchanged.

Architecture

flowchart TD
    Client[Host app: CLI, GUI, Tauri, or web gateway] -->|TCP/TLS + capnp-rpc| Gateway[RemoteSessionGateway]
    Gateway --> Auth[Auth adapters]
    Auth --> Sessions[SessionManager]
    Gateway --> Broker[AuthorityBroker]
    Broker --> Worker[Per-session RPC worker]
    Broker --> Catalog[Remote service catalog]
    Catalog --> Runner[Restricted service runner]
    Runner --> GameServers[Game server processes]
    Worker --> RemoteCapSet[RemoteCapSet]
    RemoteCapSet --> Proxies[Remote capability proxies]
    GameServers --> Proxies
    Proxies --> LocalCaps[capOS capabilities]
    Worker --> Audit[AuditLog]

The remote listener is a trusted gateway. In the final RPC shape it accepts the transport, performs or delegates authentication, obtains a UserSession, asks the broker for a remote-client bundle, and hosts a per-session RPC vat. That vat exports a RemoteSession object and remote proxy objects for capabilities in the broker-issued bundle. During the temporary dual-stack period, the guest side still accepts DTO frames and the Linux host backend hosts the first local proxy facade over those DTO calls.

For the first implementation the per-session worker may be an ordinary capOS service process. That shape matches the session-bound invariant: one workload process has one immutable session context. A single long-lived gateway may handle pre-auth connection state, but post-auth capability invocation should run inside a worker whose session context is the authenticated remote session, or through an equivalently reviewable dispatch path that cannot mix unrelated user sessions as ambient authority.

Bootstrap Interfaces

The DTO surface below is now pinned in schema/capos.capnp: RemoteAuthStart, RemoteAuthStep, RemoteServiceGrantRequirement, RemoteServiceExport, RemoteServiceProfile, plus the RemoteSessionGateway, RemoteAuthFlow, RemoteSession, RemoteCapSet, RemoteServiceCatalog, and RemoteServiceRunner interfaces. Round-trip coverage for the new structs lives in capos-config/tests/remote_capnp_rpc_dto_roundtrip.rs. The transport that consumes them is still gated on the userspace async-runtime decision (capnp-rpc v0.25 is std-only and needs a futures executor). The first proxy slice is host-backend-only and dual-stack: it uses capnp-rpc locally in the trusted Linux backend for Chat while translating to the legacy RemoteGatewayRequest/RemoteGatewayResponse DTO union on the gateway wire. The schema and generated bindings do not change for that slice, and browser JavaScript still receives only view models, typed results, typed denials, and redacted transcript rows.

enum RemoteAuthKind {
  password @0;
  publicKey @1;
  oidcDeviceCode @2;
  oidcAuthorizationCodePkce @3;
  passkey @4;
  mtlsClientCert @5;
  guest @6;
  anonymous @7;
  serviceCredential @8;
}

struct RemoteAuthMethod {
  kind @0 :RemoteAuthKind;
  label @1 :Text;
  profileHints @2 :List(Text);
  interactive @3 :Bool;
  enabled @4 :Bool;
}

struct RemoteAuthStart {
  kind @0 :RemoteAuthKind;
  selector @1 :LoginSelector;
  requestedProfile @2 :Text;
  clientNonce @3 :Data;
  # Source metadata is intentionally not a client-supplied field.
  # The gateway derives LoginSourceMetadata from the accepted socket
  # and its own connection event id before calling
  # SessionManager.login. A client-supplied source field would let
  # remote callers forge audit metadata downstream services depend on.
}

struct RemoteAuthStep {
  prompt @0 :Text;
  redaction @1 :Bool;
  url @2 :Text;
  userCode @3 :Text;
  challenge @4 :Data;
  expiresAtMs @5 :UInt64;
}

interface RemoteSessionGateway {
  authMethods @0 () -> (methods :List(RemoteAuthMethod));
  start @1 (request :RemoteAuthStart) -> (flow :RemoteAuthFlow);
  guest @2 (requestedProfile :Text) -> (session :RemoteSession);
  anonymous @3 (requestedProfile :Text) -> (session :RemoteSession);
}

interface RemoteAuthFlow {
  next @0 (response :Data) -> (step :RemoteAuthStep, done :Bool,
      session :RemoteSession);
  cancel @1 () -> ();
}

struct RemoteCapEntry {
  name @0 :Text;
  interfaceId @1 :UInt64;
  transferPolicy @2 :Text;
  leaseExpiresAtMs @3 :UInt64;
}

interface RemoteSession {
  info @0 () -> (info :SessionInfo);
  capSet @1 () -> (caps :RemoteCapSet);
  renew @2 (proof :Data, requestedDurationMs :UInt64)
      -> (session :RemoteSession);
  logout @3 () -> ();
}

interface RemoteCapSet {
  list @0 () -> (entries :List(RemoteCapEntry));
  get @1 (name :Text, expectedInterfaceId :UInt64) -> (cap :AnyPointer);
}

struct RemoteServiceGrantRequirement {
  name @0 :Text;
  interfaceId @1 :UInt64;
  transferMode @2 :Text;
  holder @3 :Text;  # backendHeld, serviceOwned, or clientFacet
}

struct RemoteServiceExport {
  name @0 :Text;
  interfaceId @1 :UInt64;
  transferPolicy @2 :Text;
}

struct RemoteServiceProfile {
  id @0 :Text;
  label @1 :Text;
  processGraph @2 :List(Text);
  requirements @3 :List(RemoteServiceGrantRequirement);
  exports @4 :List(RemoteServiceExport);
  state @5 :Text;  # unavailable, attachable, startable, running
}

struct RemoteServiceLaunchRequest {
  profileId @0 :Text;
  grantNames @1 :List(Text);
}

struct RemoteServiceCatalogEntry {
  id @0 :Text;
  label @1 :Text;
  summary @2 :Text;
  capName @3 :Text;
  transportInterfaceId @4 :UInt64;
  schemaInterface @5 :Text;
  proxyStatus @6 :Text;
  methods @7 :List(Text);
  notes @8 :List(Text);
}

struct RemoteServiceLaunchStatus {
  profileId @0 :Text;
  status @1 :Text;         # notLaunched, unsupported, denied, ready, running
  launchSupported @2 :Bool;
  message @3 :Text;
  acceptedGrantNames @4 :List(Text);
  exportedServices @5 :List(RemoteServiceCatalogEntry);
}

interface RemoteServiceCatalog {
  list @0 () -> (profiles :List(RemoteServiceProfile));
}

interface RemoteServiceRunner {
  probe @0 (request :RemoteServiceLaunchRequest)
      -> (status :RemoteServiceLaunchStatus);
  start @1 (request :RemoteServiceLaunchRequest)
      -> (status :RemoteServiceLaunchStatus);
  attach @2 (profileId :Text) -> (status :RemoteServiceLaunchStatus);
}

The AnyPointer result is proposal shorthand for an ordinary Cap’n Proto capability pointer whose expected interface ID was already checked by the gateway. Generated client helpers should immediately cast it to the requested typed client. The remote client does not receive a numeric local capId, endpoint selector, result-cap index, or session identifier it can replay somewhere else.
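The lookup rule for `get` can be modeled without any RPC machinery. The following is a sketch under stated assumptions: `CapSetView`, `Proxy`, and `GetDenied` are hypothetical names, the interface IDs are made-up values, and `Proxy` stands in for what would be a live typed RPC client.

```rust
use std::collections::HashMap;

/// Observable denials for a lookup; none of them fall back to another cap.
#[derive(Debug, PartialEq)]
pub enum GetDenied {
    UnknownCap,
    WrongInterface,
    StaleSession,
}

/// Stand-in for a live proxy object; a real client would hold a typed RPC client.
#[derive(Debug, PartialEq)]
pub struct Proxy {
    pub name: String,
    pub interface_id: u64,
}

/// Hypothetical model of a session's remote CapSet: name -> interface ID.
pub struct CapSetView {
    entries: HashMap<String, u64>,
    live: bool,
}

impl CapSetView {
    pub fn new(entries: &[(&str, u64)]) -> Self {
        let entries = entries
            .iter()
            .map(|(name, id)| (name.to_string(), *id))
            .collect();
        Self { entries, live: true }
    }

    /// After logout every lookup is denied as stale.
    pub fn logout(&mut self) {
        self.live = false;
    }

    /// Succeed only when the name exists and the expected interface ID matches.
    pub fn get(&self, name: &str, expected_interface_id: u64) -> Result<Proxy, GetDenied> {
        if !self.live {
            return Err(GetDenied::StaleSession);
        }
        match self.entries.get(name) {
            None => Err(GetDenied::UnknownCap),
            Some(&id) if id != expected_interface_id => Err(GetDenied::WrongInterface),
            Some(&id) => Ok(Proxy { name: name.to_string(), interface_id: id }),
        }
    }
}
```

The caller states the interface it expects up front, so a name collision or a mistyped client cannot silently bind to the wrong service.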

The catalog and runner sketches are also proposal-level. They describe the remote-facing contract, not the internal implementation. The completed launch DTO/probe slice uses a serviceLaunch request/response arm for the side-effect-free probe. RemoteServiceLaunchRequest carries only a profile id plus explicit grant names. RemoteServiceLaunchStatus reports a status value (notLaunched, unsupported, denied, ready, or running), whether launch is supported, the accepted grant names, and the exported or planned service descriptors. The current Adventure slice makes that serviceLaunch path a real restricted backend launch for the default make run manifest, so Adventure may report running and launchSupported=true after the approved server graph starts. Paperclips remains a future launch profile. A capOS service runner may use local spawn authority, BootPackage data, or broker-held service caps inside capOS, but the remote client and browser/webview code receive only service descriptors, launch requests, status results, denials, and remote capability descriptors. Raw ProcessSpawner, process handles, endpoint owner caps, local cap ids, and result-cap slots are not exposed.

Service Catalog And Game Server Launch

The default make run story and the focused game proofs are intentionally different:

  • system.cue imports cue/defaults/defaults.cue, boot-launches standalone init, and lets init start chat-server, remote-session-capset-gateway, remote-session-web-ui, and the foreground shell. The default binary set includes Adventure server/NPC/client binaries and the terminal Paperclips binary. make run forwards guest port 8080 to a loopback host port and prints remote self-served UI: tcp 127.0.0.1 <port> -> guest :8080; the make run-default-web-ui target proves the capOS-served endpoint with browser automation. Adventure is not boot-started automatically; the current remote-session serviceLaunch slice starts adventure-server plus simple NPC companions through a restricted backend runner when requested. Paperclips landed in Path A + Path B as described below; the default make run manifest reports launchSupported=false / status=missingBinary for the paperclips launcher until Path C (the kernel-side AuthorityBroker allowlist extension and the on-wire DTO arm) lands.
  • The default remote-session gateway is narrow. It has console, scoped TCP-listen authority for guest port 2327, SessionManager, and AuthorityBroker, plus narrowly approved backend launch authority for the Adventure profile; it does not expose raw ProcessSpawner, raw network manager/socket authority, endpoint owner handles, process handles, local cap ids, result-cap slots, or game service endpoint owner caps. The remote-session-web-ui service separately receives scoped TCP-listen authority for guest port 8080, SessionManager, AuthorityBroker, and console.
  • make run-adventure uses system-adventure.cue to start chat-server, adventure-server, Adventure NPC companion processes, an adventure-scenario-test, and the shell. The Adventure server exports an Adventure endpoint, consumes a client facet of Chat, and owns room/player state keyed by live caller-session references.
  • make run-paperclips uses system-paperclips.cue to start paperclips-server and paperclips-proof-server services exporting PaperclipsGame endpoints. The terminal client is then launched with explicit StdIO, game endpoint, timer, and optional proof_accelerator grants. The server owns generated content, game state, timer cadence, command specs, status snapshots, project entries, unlock checks, and game-rule mutation.

The remote UI should not treat those terminal transcripts as the product boundary. The staged path is:

  1. The broker advertises a remote service catalog for the authenticated session. The catalog is derived from manifest/default profiles and policy, and includes only services the remote profile may inspect, attach to, or start.
  2. The launch DTO/probe slice is complete. It defines a remote-safe launch request, status, and probe contract for cataloged profiles. It can report unsupported launch state, accepted grant names, a message, planned exported service descriptors, and denial status without side effects: no process starts and no new capabilities are attached.
  3. The current Adventure slice implements the restricted service-runner path for the default manifest. It starts the Adventure server plus simple NPC companion processes with explicit named grants, then returns launch status and remote descriptors for exported or broker-held caps. Process handles stay backend-local.
  4. The trusted Rust backend attaches those descriptors to the backend-held RemoteCapSet and drives typed calls. Browser JavaScript or a Tauri webview receives view models, launch/status forms, service descriptors, denials, and results, not raw capOS handles. The implemented DTO worker slices cover Chat.send plus Adventure status, look, inventory, and first mutable bounded go(direction).
  5. The first UI panels can be generic: service list, start/attach, status, read-only Adventure controls, a bounded movement control, transcript, and denial details. Purpose-built Adventure and Paperclips clients can layer richer rendering and broader mutable game actions over the same service-runner and remote CapSet backend later; Paperclips does not have default-manifest remote launch support yet.
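The side-effect-free probe in step 2 can be sketched as a pure function over a read-only catalog. This is an assumption-laden illustration: `Profile`, `ProbeStatus`, `LaunchState`, and `probe` are hypothetical names, the grant names in the usage are invented, and the real DTO carries more fields than shown.

```rust
#[derive(Debug, PartialEq)]
pub enum LaunchState {
    Unsupported,
    Denied,
    Ready,
}

/// Hypothetical catalog entry: which grants a profile may be launched with.
pub struct Profile {
    pub id: &'static str,
    pub launch_supported: bool,
    pub allowed_grants: &'static [&'static str],
}

#[derive(Debug, PartialEq)]
pub struct ProbeStatus {
    pub state: LaunchState,
    pub accepted_grant_names: Vec<String>,
    pub message: &'static str,
}

fn status(state: LaunchState, message: &'static str) -> ProbeStatus {
    ProbeStatus { state, accepted_grant_names: Vec::new(), message }
}

/// Answer "what would happen" using shared references only:
/// no process starts and no capability is attached.
pub fn probe(catalog: &[Profile], profile_id: &str, grant_names: &[&str]) -> ProbeStatus {
    let Some(profile) = catalog.iter().find(|p| p.id == profile_id) else {
        return status(LaunchState::Denied, "profile not in catalog");
    };
    if !profile.launch_supported {
        return status(LaunchState::Unsupported, "launch not supported");
    }
    let all_approved = grant_names
        .iter()
        .all(|requested| profile.allowed_grants.iter().any(|allowed| allowed == requested));
    if !all_approved {
        return status(LaunchState::Denied, "grant not approved for profile");
    }
    ProbeStatus {
        state: LaunchState::Ready,
        accepted_grant_names: grant_names.iter().map(|g| g.to_string()).collect(),
        message: "ready",
    }
}
```

Taking the catalog by shared reference makes the no-side-effects property visible in the signature; an actual start call would need separate, mutable launch authority.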

Operator commands should stay explicit:

make run
cargo run --manifest-path tools/remote-session-client/Cargo.toml \
  --target x86_64-unknown-linux-gnu \
  --bin remote-session-client -- --host 127.0.0.1 --port <printed-port>
CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-ui

Add --launch-adventure to the CLI command to start the default-manifest Adventure graph through the restricted serviceLaunch path and require a running status. Add --adventure-status after --launch-adventure to require read-only Adventure status, look, and inventory responses through the session-bound worker path. Add --adventure-go east after --launch-adventure to require the first bounded mutable Adventure go(direction) response through that same worker path.

The Tauri wrapper runs from this repository and reuses the same backend boundary by loading the loopback remote-session-ui surface in a desktop webview:

CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-tauri

That target checks Tauri CLI and Linux build prerequisites, reports dependency/scaffold status, and runs a deterministic wrapper check by default. Set CAPOS_REMOTE_SESSION_TAURI_MODE=dev to launch cargo tauri dev. Missing host Tauri packages fail with explicit diagnostics and point operators back to make remote-session-ui. The webview receives the same browser-safe view models, events, denials, typed results, and redacted transcript rows as the trusted local web bridge; the backend keeps the remote session and caps.

Bidirectional UI Composition

A conventional GUI program opens a window and owns the controls inside it. A remote capOS session does not need to be that limited. The host app can expose a session-scoped UI host capability to capOS, and capOS-side services or agents can use that capability to propose a better interface for the current task:

  • Paperclips can ask for counters, project controls, and status charts instead of printing lines.
  • Chat can ask for a channel list, unread badges, and a message pane.
  • Adventure can ask for a map pane, inventory slots, command buttons, and room transcript.
  • A diagnostics agent can open log, metric, and trace panes side by side, highlight the relevant capability calls, and change density for a debugging session.
  • A teaching or accessibility agent can request larger type, simplified controls, or a guided task layout for a particular session.

The authority is explicit and separate from service authority. Holding Chat does not let a service rewrite the user’s UI. Holding RemoteUiHost or a narrow UiSurface facet lets the service propose bounded UI changes for the current remote session. The host app remains the compositor and policy enforcer.

Conceptual shape:

enum UiPatchKind {
  openSurface @0;
  closeSurface @1;
  updateModel @2;
  setLayoutHint @3;
  setThemeHint @4;
  addCommand @5;
  removeCommand @6;
}

struct UiSurfaceSpec {
  surfaceId @0 :Data;
  title @1 :Text;
  kind @2 :Text;
  safetyClass @3 :Text;
  modelSchema @4 :UInt64;
}

struct UiPatch {
  kind @0 :UiPatchKind;
  surfaceId @1 :Data;
  payload @2 :Data;
  expiresAtMs @3 :UInt64;
}

struct UiEvent {
  surfaceId @0 :Data;
  command @1 :Text;
  payload @2 :Data;
  userInitiated @3 :Bool;
}

interface RemoteUiHost {
  open @0 (spec :UiSurfaceSpec) -> (surface :RemoteUiSurface);
  theme @1 (scope :Text, hints :Data) -> ();
}

interface RemoteUiSurface {
  apply @0 (patch :UiPatch) -> ();
  poll @1 (maxEvents :UInt16) -> (events :List(UiEvent));
  close @2 () -> ();
}

The payloads above should become typed structs before implementation. They are shown as Data only to keep the sketch short. The important boundary is that UI updates are declarative patches and typed view models, not arbitrary host code. The host validates the requested surface kind, model schema, command set, theme tokens, data size, update rate, and safety class before rendering anything.
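Host-side validation of a patch might look like the sketch below. It covers only a few of the checks named above (surface id, data size, expiry, and one policy-gated kind). `HostPolicy`, `Rejected`, and `validate` are hypothetical names, and treating `expires_at_ms` as a hard deadline is an assumption rather than a documented rule.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PatchKind {
    OpenSurface,
    CloseSurface,
    UpdateModel,
    SetLayoutHint,
    SetThemeHint,
    AddCommand,
    RemoveCommand,
}

/// Borrowed view of a patch, mirroring the sketch struct's fields.
pub struct Patch<'a> {
    pub kind: PatchKind,
    pub surface_id: &'a [u8],
    pub payload: &'a [u8],
    pub expires_at_ms: u64,
}

/// Hypothetical host policy knobs for one remote session.
pub struct HostPolicy {
    pub max_payload_bytes: usize,
    pub allow_theme_hints: bool,
}

#[derive(Debug, PartialEq)]
pub enum Rejected {
    MissingSurfaceId,
    PayloadTooLarge,
    Expired,
    KindNotAllowed,
}

/// Check a patch before anything is rendered; the first failed check wins.
pub fn validate(policy: &HostPolicy, patch: &Patch<'_>, now_ms: u64) -> Result<(), Rejected> {
    if patch.surface_id.is_empty() {
        return Err(Rejected::MissingSurfaceId);
    }
    if patch.payload.len() > policy.max_payload_bytes {
        return Err(Rejected::PayloadTooLarge);
    }
    // Assumption: a patch whose deadline has been reached is dropped.
    if patch.expires_at_ms <= now_ms {
        return Err(Rejected::Expired);
    }
    if patch.kind == PatchKind::SetThemeHint && !policy.allow_theme_hints {
        return Err(Rejected::KindNotAllowed);
    }
    Ok(())
}
```

Schema, command-set, update-rate, and safety-class checks would slot into the same function once the payloads are typed structs.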

This is still a remote CapSet client model:

host UI holds RemoteSession + RemoteCapSet
host UI grants a narrow RemoteUiHost/RemoteUiSurface cap to a trusted worker
capOS service or agent sends declarative UI patches through that cap
host UI renders and sends typed user events back
service effects still require ordinary service caps

The direction is therefore bidirectional but not symmetric. The host app can call capOS service caps. capOS can shape the session UI only through UI caps the host granted. Neither side gains ambient authority over the other.

Safety rules:

  • Host chrome, login prompts, origin indicators, permission prompts, and emergency reset controls are reserved. capOS-rendered surfaces cannot spoof them.
  • UI patches are session-scoped. Persistent layout/theme changes require an explicit profile/settings cap or user confirmation.
  • Theme and look/feel changes use bounded tokens or validated design-system variables, not raw CSS injection.
  • UI command descriptors are data; executing a command still calls a typed capability under the current session policy.
  • The user can close, reset, or pin surfaces against agent rearrangement.
  • UI updates are quota-bound and auditable when they materially affect workflow, consent, disclosure, or action execution.
  • Browser front ends keep raw capOS caps server-side or in a Tauri/native Rust backend. Browser JavaScript receives rendered state and sends user events; it does not hold RemoteCapSet entries.

This is the broader version of the WebShell idea. A web shell can be more than a terminal emulator: it can be a session workspace whose composition is negotiated by the capabilities present in the session. The terminal remains one surface in that workspace, not the only surface.

Authentication And Admission

Authentication adapters all produce the same output: a UserSession plus profile inputs for the broker. They differ only in how the proof is obtained and verified.

  • Password: maps to the existing SessionManager.login(method, selector, proof, source) path when remote password login is enabled by policy. It must use the existing credential failure/backoff/audit rules and must not be the only supported remote method.
  • Public key: maps to SessionManager.sshPublicKey or a generalized signature-auth method. SSH userauth and raw remote RPC public-key auth can share account/key records, but the transcript bytes must be domain-separated by protocol and channel binding.
  • OIDC/OAuth: device-code flow fits headless or CLI clients; authorization code + PKCE fits browser-assisted clients. The OAuth/OIDC service verifies ID tokens and maps external subjects through the user-identity admission model before SessionManager mints a session.
  • Passkey/WebAuthn: belongs behind the web-authenticator path. A remote native client may open a browser or use a platform authenticator, but raw authenticator secrets never become capOS app data.
  • mTLS client certificate: TLS client-auth can identify a principal or pseudonymous subject through certificate policy. Certificate identity is an admission input; the resulting CapSet still comes from the broker.
  • Guest and anonymous: explicit policy profiles. They are not fallbacks for missing credentials and should receive short leases and narrow bundles. Guest admission is currently surfaced through the bridge as an explicit AuthMode::Guest option (/api/login/guest, CLI --guest); the gateway enforces the requestedProfile == "guest" and principal.kind == Guest invariants before broker dispatch via the validate_guest_admission helper, and refuses with RemoteErrorCode::AuthenticationDenied and the redacted "guest login denied" message regardless of which policy branch fired. When the manifest has no guest seed account the gateway returns RemoteErrorCode::DisabledAuthMethod so the bridge can distinguish a manifest-disabled method from a credential failure. Guest sessions surface only the configured display name ("Remote Guest") and principal_kind enum label to the bridge; the seeded principal id bytes are never disclosed through the bridge transcript or API envelope.
  • Service/workload credentials: future non-human clients can authenticate with OAuth client credentials, token exchange, mTLS, or signed workload assertions. They receive service-profile bundles, not human shell bundles.

Every method must record source metadata and protocol/channel binding appropriate to its transport. A successful proof selects a principal and session; it does not directly grant service authority.
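The guest admission invariants described in the list can be illustrated with a small check. This is not the gateway's validate_guest_admission helper; `check_guest_admission`, `PrincipalKind`, and `AdmissionError` are hypothetical stand-ins that only mirror the stated behavior: a missing seed account reports the method as disabled, and every other mismatch collapses into the same denial.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PrincipalKind {
    User,
    Guest,
    Anonymous,
}

#[derive(Debug, PartialEq)]
pub enum AdmissionError {
    /// The manifest has no guest seed account, so the method is off.
    DisabledAuthMethod,
    /// Any invariant failure; the caller learns nothing about which one.
    AuthenticationDenied,
}

/// `seed` is the kind of the manifest's guest seed principal, if one exists.
pub fn check_guest_admission(
    seed: Option<PrincipalKind>,
    requested_profile: &str,
) -> Result<(), AdmissionError> {
    let kind = seed.ok_or(AdmissionError::DisabledAuthMethod)?;
    if requested_profile != "guest" || kind != PrincipalKind::Guest {
        return Err(AdmissionError::AuthenticationDenied);
    }
    Ok(())
}
```

Returning one denial value for both invariant failures matches the redaction goal: the bridge can tell "disabled" from "denied" but cannot probe which policy branch fired.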

Remote CapSet Semantics

A local process starts with a read-only CapSet page plus local cap-table entries. A remote client instead receives a live RemoteCapSet object:

  • list returns names, interface IDs, display metadata, and lease summaries.
  • get returns a typed RPC capability pointer only if the name exists and the expected interface ID matches.
  • The returned object is a proxy owned by the remote-session worker.
  • Dropping the remote object releases the worker’s hold edge when no other remote references remain.
  • Logout, expiry, revocation, disconnect, or worker shutdown breaks all session-bound proxy objects. The current DTO gateway implements kernel-backed explicit logout and owned-session connection teardown; full live proxy object-drop/revocation behavior remains future work.

This is still an actual session bundle. It is not a copy of the kernel’s local CapSet ABI. The remote representation exists because a Linux process has no capOS ring page, no capOS CapSet mapping, and no local cap table.

Invocation Context

Remote capability calls should look like ordinary calls to the target service:

remote client call
  -> capnp-rpc message
  -> per-session worker proxy
  -> local capOS capability call
  -> target service sees the worker's live session context

The remote client cannot choose service-visible subject identity. Request fields are ordinary data. If a service needs subject details, it uses the existing subject-disclosure policy: explicit request plus a matching service-scoped disclosure grant. By default it receives only the opaque service-scoped caller-session reference used by the session-bound invocation model.

Error And Lifetime Model

The remote path keeps the existing error split:

  • Cap’n Proto RPC transport errors and broken connections become RPC exceptions or disconnected promises.
  • Proxy/worker infrastructure failures become CapException-like capability exceptions.
  • Domain outcomes remain schema result fields or unions.
  • A missing cap name, interface mismatch, denied profile, stale session, or revoked lease is an observable denial, not a silent fallback to a broader service.

Open promises must fail when the remote session logs out or the connection is closed. The worker must release local caps on every close path.
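On the client side, the error split can be kept visible in the types. The sketch below is a hypothetical host-library shape (`CallFailure`, `Denial`, `ChatSent`, and `describe` are invented names): failures of transport, infrastructure, and admission travel in the error arm, while a service declining a request remains an ordinary successful result.

```rust
/// Denials are observable and never widen to a broader service.
#[derive(Debug, PartialEq)]
pub enum Denial {
    UnknownCap,
    WrongInterface,
    DeniedProfile,
    StaleSession,
    RevokedLease,
}

/// Everything that is not a domain outcome.
#[derive(Debug, PartialEq)]
pub enum CallFailure {
    /// Broken connection or RPC-level exception.
    Transport(String),
    /// Proxy or worker failure inside the gateway.
    Infrastructure(String),
    Denied(Denial),
}

/// Domain result for a chat send; `accepted == false` is still a success
/// at the RPC layer.
#[derive(Debug, PartialEq)]
pub struct ChatSent {
    pub accepted: bool,
}

/// Classify a call result the way a UI transcript might label it.
pub fn describe(result: &Result<ChatSent, CallFailure>) -> &'static str {
    match result {
        Ok(ChatSent { accepted: true }) => "domain outcome: sent",
        Ok(ChatSent { accepted: false }) => "domain outcome: service declined",
        Err(CallFailure::Transport(_)) => "transport failure",
        Err(CallFailure::Infrastructure(_)) => "worker failure",
        Err(CallFailure::Denied(_)) => "denied: fail closed",
    }
}
```

Keeping the declined send in the `Ok` arm means retry and reconnect logic only ever inspects the error arm, and service-level refusals are never mistaken for broken sessions.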

Relationship To Shells And Gateways

Remote session CapSet clients are a peer of shell transports:

  • Native shell: a local capOS process that uses its local CapSet and ring. It can later expose a schema-aware REPL over the same capabilities a remote client sees, but the remote client does not need to spawn a shell.
  • SSH shell: a production CLI terminal transport. It authenticates and launches capos-shell with a TerminalSession. It should not become the only way for external programs to call typed services.
  • WebShellGateway: browser terminal, webapp, and agent UI transport. Browser JavaScript must not receive raw capOS caps; the gateway can use the remote session CapSet model server-side and expose terminal frames, view models, command descriptors, or bounded tool requests to the browser. This is close to the same mental model as a “web shell”, except the shell is not the required protocol. The web UI can present service-specific controls over the same session CapSet, and capOS-side services can adjust the session workspace through UI composition caps. A remote CapSet web UI can be built before the full WebShellGateway by omitting terminal delegation, shell-runner policy, and agent execution; it is just another host client of the remote session bundle.
  • Tauri or desktop GUI: the Rust/native backend may hold the remote RemoteSession and typed capability clients, while the UI layer receives rendered state, command descriptors, and user-intent events. The UI layer should not receive replayable capOS authority as data. The backend may grant narrow UI-surface caps back to capOS services so they can propose adaptive layouts without gaining arbitrary desktop control.
  • Agent shell: the agent runner holds session caps server-side and presents tool descriptors to the model. A hosted agent can use the same remote session bundle shape as long as actual capOS invocations remain in the trusted worker.
  • Interactive command surfaces: command metadata can be one of the granted capabilities. A remote client can render command specs directly instead of scripting text through a shell.

Authority Rules

  • The gateway receives scoped listener/TLS/auth/session/broker/audit authority, not raw broad network or spawn authority.
  • Post-auth workers receive only the broker-issued remote-client bundle plus proxy lifecycle authority.
  • Default remote bundles should be narrower than operator shell bundles.
  • Raw ProcessSpawner, unrestricted NetworkManager, key-vault, credential store, broad account store, broad storage root, and host debug caps require explicit elevated policy.
  • Remote proxyable caps must declare transfer/lifetime policy. Local-only caps may appear in a local shell CapSet without being exportable through RemoteCapSet.
  • Capability names are lookup conveniences. Interface ID and broker policy define whether a returned object is usable for the requested type.
  • Replayable handles are forbidden. Session IDs, grant IDs, endpoint metadata, object epochs, and proxy table positions are not bearer tokens.

Design Grounding

  • Session-Bound Invocation Context defines the one-session-per-process invariant and privacy-preserving endpoint caller-session metadata.
  • User Identity and Policy defines principals, sessions, profiles, admission sources, renewal, and brokered CapSet minting.
  • Boot to Shell defines the existing CredentialStore/SessionManager/AuthorityBroker path and non-password login directions.
  • SSH Shell Gateway, Certificates and TLS, and OIDC and OAuth2 define public-key, TLS/mTLS, and federated admission inputs.
  • capos-service defines the service lifecycle shape needed for listener loops, per-session context, shutdown, drain, and metrics.
  • Capability-Based Service Architecture defines the broader service taxonomy, capability layering, and init/spawn boundary the gateway, per-session workers, and restricted service runner reuse. The default make run gateway, the Adventure service-runner path, and the Paperclips Path B worker plumbing inherit the process-startup, attenuation, and HTTP-capability rules described there; Path C will extend the broker allowlist surface in the same authority frame.
  • Remote Session UI Security defines the web-security posture for the loopback remote-session-ui bridge and its Tauri desktop wrapper – per-browser BrowserSession cookies, CSRF/CSP/cookie discipline, first-wins ownership, local HTTP parser bounds, and Tauri capability minimization – that the trusted Rust backend in this proposal exposes to the browser. Both proposals reference each other; this proposal owns the upstream remote-session CapSet wire and host-client shape, while that proposal owns the browser-facing authority boundary.
  • R17 (“Remote-session UI bridge and Tauri wrapper are research-only”) routes long-horizon residual risk (distributable packaging, desktop automation, non-loopback exposure) back to this proposal and the remote-session UI security proposal. Non-loopback remote-session UI exposure must remain blocked until that production posture is accepted by the corresponding review-finding task.
  • Interactive Command Surfaces defines typed command sessions that can be rendered by remote clients.
  • Browser Capability and Agent Web Sessions defines browser-side authority boundaries and gateway mediation for web UI sessions.
  • Language Models and Agent Runtime defines agent runners, tool proxies, and browser-agent UI orchestration boundaries.
  • Cloudflare, Cap’n Proto, Workers RPC, and Cap’n Web grounds production object-capability RPC, live object bindings, and remote resource-exhaustion discipline.
  • Spritely, OCapN, and CapTP grounds distributed object-capability lifetime, promise, reference, and handoff questions while staying non-binding for capOS wire compatibility.
  • Cap’n Proto Error Handling grounds the exception-versus-domain-result split that the host-backend facade and eventual gateway RPC transport must preserve.

Implementation Shape

The first implementation is deliberately small:

  1. Keep the existing capnp-chat-interop service and harness as the transport starting point, but rename the target outcome in planning docs to remote session CapSet interop. Done.
  2. Add generated Linux Rust bindings for the relevant schema subset. Done.
  3. Add a host client library that connects through QEMU user TCP. Done with a schema-framed DTO transport; replacing it with standard capnp-rpc framing and live proxy objects remains the next transport step.
  4. Add a capOS gateway that supports one policy-enabled auth method plus explicit guest/anonymous behavior. Done for password, anonymous, and guest, with disabled public-key, OIDC, and passkey/WebAuthn method entries advertised. Guest admission ships with a dedicated RemoteGatewayRequest.guestLogin arm, the validate_guest_admission broker-side enforcement helper that pins the requestedProfile == "guest" plus principal.kind == Guest invariants, and a RemoteErrorCode::DisabledAuthMethod path so the bridge can distinguish a manifest-disabled method from a credential failure.
  5. Return remote session summary, CapSet list, and typed get metadata. Done as DTOs.
  6. Call at least two capabilities from the bundle. Done for session, system_info, the worker-backed Chat.send path, and Adventure status/look/inventory/go(direction) after serviceLaunch. The focused chat proof also shows a service-domain denial remains a schema chatSent(false) result and that chat-server sees bounded session-bound caller metadata through disclosure policy. Broader Adventure methods, Paperclips methods, live proxy objects, and object-level release/drop lifecycle remain future work.
  7. Prove a missing cap, wrong interface ID, wrong profile, stale session, and logout path fail closed. Done for the focused proof, including a kernel-backed UserSession.logout call and owned-session disconnect propagation in the DTO gateway; full release, live proxy object-drop, renewal, and revocation propagation remains future work.
  8. Add a first host UI client over the current UI-neutral Rust client. Done for a trusted local web bridge with a loopback browser UI and Rust backend that holds the remote session state. It covers endpoint configuration, auth methods, login, session summary, CapSet inspection, sessionInfo, systemMotd, denial probes, logout, stale-call proof, redacted transcript export, and a focused browser automation proof. The repo-local Tauri wrapper now checks or launches the same loopback backend/webview boundary; distributable packaging remains later. The UI remains separate from WebShell and does not include a terminal emulator, shell-runner policy, or agent execution.
  9. Define the launch DTO/probe shape after the read-only remote service catalog. Done: this slice defines a remote-safe launch request, launch status, and side-effect-free probe so the CLI/web backend can render forms and denials for Adventure/Paperclips profiles. It deliberately does not start processes, create endpoint owners, attach caps, or expose raw ProcessSpawner, process handles, endpoint owner handles, local cap ids, result-cap slots, or browser-held capOS capabilities.
  10. Implement the actual restricted Adventure service-runner path. Done: the default-manifest Adventure profile starts adventure-server plus simple NPC companion processes and attaches or retains the resulting Adventure/chat descriptors/caps in the backend-held remote CapSet. Paperclips landed in two halves: Path A added the read-side RemotePaperclips* DTO schema (RemotePaperclipsCommandResult, RemotePaperclipsCommandList, RemotePaperclipsProjectList, RemotePaperclipsStatusSnapshot, RemotePaperclipsEvent, RemotePaperclipsProjectStatus, RemotePaperclipsEventKind, and the single-command RemotePaperclipsCommand input DTO) in schema/capos.capnp, with bounded wire-roundtrip coverage in capos-config/tests/remote_paperclips_dto_roundtrip.rs; Path B added the dedicated demos/remote-session-paperclips-worker/ crate mirroring the Adventure worker shape, the gateway SessionWorkerKind::Paperclips enum variant with matching SessionWorkerSet arms and spawn_paperclips_graph/build_paperclips_service_launch/fill_paperclips_launcher/paperclips_catalog_status helpers, a manifest-static game endpoint slot on the gateway capset, bridge RequestKind::PaperclipsInitial/Command/Status/Projects synthesis from cached serviceLaunch state (the on-wire control plane lands in Path C), UI launch slot plus status chip with paired smoke automation (paperclipsLaunchVisible/paperclipsStatus/paperclipsStatusObserved), the system-remote-session-paperclips.cue focused manifest, and the make run-remote-session-paperclips-vm / make run-remote-session-paperclips-ui gates. Raw ProcessSpawner, process owner handles, endpoint owner caps, local cap ids, result-cap slots, and browser-held capOS capabilities stay out of the remote contract. Process handles stay backend-local. Adventure status/look/inventory controls and first mutable bounded go(direction) use the session-bound worker pattern; Paperclips Path B uses the same worker shape with bridge-side response synthesis until Path C lands the wire-level DTO arm and the broker allowlist grants for the default manifest. Broader Adventure controls, Path C wire/broker extension, and rich Paperclips client implementations remain later.
  11. Replace the bounded make remote-session-tauri preflight with the actual repo-local Tauri wrapper over the same Rust backend. Done for check/dev mode: CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-tauri validates the wrapper scaffold and host prerequisites, and CAPOS_REMOTE_SESSION_TAURI_MODE=dev launches the wrapper through cargo tauri dev. Distributable packaging remains gated on reviewed sidecar/backend lifecycle handling.
  12. Add the first typed proxy layer as a host-backend-only temporary dual-stack. Done for Chat: tools/remote-session-client/ hosts a local capnp-rpc facade that translates backend-held proxy calls to the existing DTO gateway protocol while keeping schema/generated bindings, gateway wire shape, and browser authority unchanged. The later gateway rewrite must provide standard capnp-rpc framing, typed remote proxy objects, exception mapping, release/drop handling, and resource bounds before the bespoke DTO service path can be retired.
  13. Layer richer service clients on top of the same backend boundary. The first richer client is a session-summary diff: a pure Rust helper in tools/remote-session-client/src/session_diff.rs compares two snapshots of the session view (CapSet entries plus SessionInfoSummary) and returns typed CapSetDiff / SessionSummaryFieldDiff records keyed on (name, interface_id) and on visible session fields. Renewals or policy rebinding surface as policy_changed rather than removed + added. The trusted web bridge stores the raw snapshots backend-side and exposes /api/call/session-diff-refresh, which returns a redacted SessionSummaryDiffVm. Browser JavaScript receives only that view model: added/removed cap entries by (name, interfaceIdHex, transferPolicy, leaseExpiresAtMs), policy/lease changes, redacted session-id changes, and a summary string. The first call after login captures a baseline (hasBaseline=false); subsequent calls return the diff against the previous snapshot. The browser renders the diff in a dedicated “Last refresh diff” pane on the Session view; raw session_id_hex, replayable cap handles, and kernel session ids stay backend-side. The focused UI smoke clicks “Refresh & Show Diff” twice and asserts both the no-baseline and post-baseline shapes. Two backend host tests cover the baseline + no-change path and the added-cap + expiry-change path.
  14. Add a separate UI-composition proof only after the basic session proof: grant a narrow test RemoteUiSurface, accept one declarative patch, send one typed user event back, and prove the service cannot spoof trusted chrome or persist layout state without the relevant cap.
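
The session-summary diff in step 13 can be illustrated with a minimal sketch. It is not the code in tools/remote-session-client/src/session_diff.rs: the field layout of CapEntry, the DiffState holder, the function names diff_capsets and refresh, and the assumption that each (name, interface_id) key appears at most once per snapshot are all hypothetical. Only the keying on (name, interface_id), the policy_changed classification, and the no-baseline first call come from the description above.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified CapSet entry; the real record may carry more fields.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CapEntry {
    pub name: String,
    pub interface_id: u64,
    pub transfer_policy: String,
    pub lease_expires_at_ms: u64,
}

/// Diff between two CapSet snapshots, keyed on (name, interface_id).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct CapSetDiff {
    pub added: Vec<CapEntry>,
    pub removed: Vec<CapEntry>,
    /// (before, after) pairs whose key matched but whose policy or lease changed.
    pub policy_changed: Vec<(CapEntry, CapEntry)>,
}

type Key = (String, u64);

fn key_of(entry: &CapEntry) -> Key {
    (entry.name.clone(), entry.interface_id)
}

/// Compare two snapshots. An entry with the same key but different policy or
/// lease is reported once as policy_changed, never as removed + added.
pub fn diff_capsets(before: &[CapEntry], after: &[CapEntry]) -> CapSetDiff {
    let old: BTreeMap<Key, &CapEntry> = before.iter().map(|e| (key_of(e), e)).collect();
    let new: BTreeMap<Key, &CapEntry> = after.iter().map(|e| (key_of(e), e)).collect();
    let mut diff = CapSetDiff::default();
    for entry in after {
        match old.get(&key_of(entry)) {
            None => diff.added.push(entry.clone()),
            Some(&prev) if prev != entry => {
                diff.policy_changed.push((prev.clone(), entry.clone()));
            }
            Some(_) => {}
        }
    }
    for entry in before {
        if !new.contains_key(&key_of(entry)) {
            diff.removed.push(entry.clone());
        }
    }
    diff
}

/// Hypothetical backend-side holder for the previous snapshot.
#[derive(Default)]
pub struct DiffState {
    previous: Option<Vec<CapEntry>>,
}

impl DiffState {
    /// Returns (has_baseline, diff). The first call only captures a baseline.
    pub fn refresh(&mut self, snapshot: Vec<CapEntry>) -> (bool, CapSetDiff) {
        let result = match &self.previous {
            None => (false, CapSetDiff::default()),
            Some(prev) => (true, diff_capsets(prev, &snapshot)),
        };
        self.previous = Some(snapshot);
        result
    }
}
```

In this sketch the raw snapshots never leave DiffState, which mirrors the rule that the browser receives only a redacted view model derived from the diff rather than the snapshots themselves.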

Later slices can add more auth adapters, TLS, renewal, browser-assisted auth, service credentials, UI composition surfaces, promise pipelining, and distributed GC.
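
The guest-admission invariants from step 4 can be sketched as a small fail-closed check. This is an illustration under stated assumptions, not the broker's validate_guest_admission helper: the function name check_guest_admission, its signature, the guest_enabled flag, the PrincipalKind variants other than Guest, and the AdmissionDenied variant are hypothetical. Only the requestedProfile == "guest" and principal.kind == Guest invariants and the separate DisabledAuthMethod outcome come from the text above.

```rust
/// Hypothetical principal kinds; only Guest is named in the proposal.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PrincipalKind {
    Guest,
    Anonymous,
    User,
}

/// Simplified error codes. DisabledAuthMethod is named in the proposal;
/// AdmissionDenied is an assumed placeholder for an invariant violation.
#[derive(Debug, PartialEq, Eq)]
pub enum RemoteErrorCode {
    DisabledAuthMethod,
    AdmissionDenied,
}

/// Fail-closed guest admission: every invariant must hold, and a
/// manifest-disabled method is reported separately from other denials
/// so a bridge can tell the two apart.
pub fn check_guest_admission(
    guest_enabled: bool,
    requested_profile: &str,
    principal_kind: PrincipalKind,
) -> Result<(), RemoteErrorCode> {
    if !guest_enabled {
        return Err(RemoteErrorCode::DisabledAuthMethod);
    }
    if requested_profile != "guest" {
        return Err(RemoteErrorCode::AdmissionDenied);
    }
    if principal_kind != PrincipalKind::Guest {
        return Err(RemoteErrorCode::AdmissionDenied);
    }
    Ok(())
}
```

Checking the disabled-method case first is a design choice in this sketch: it keeps the "method is off in the manifest" outcome independent of whatever profile or principal the caller presented.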

Visual Design Handoff

The host UI visual language is anchored on two Claude Design handoffs:

  • The original capOS Login bundle (delivered 2026-05-02 13:26 UTC). Only the CSS tokens and design intent were ported into the production UI; the prototype is not kept in-tree.
  • The capOS Workspace bundle (delivered 2026-05-02, see tools/remote-session-client/ui/design-bundle/). Covers the post-login workspace shell, chat list, active group chat with embedded approval cards, active DM with E2E lock + fingerprint card, active call (collapsed banner + full-pane), stage room, and a “start sheet” with the four ocap-clean entry flows (open DM from contact card, redeem invite, browse directory, start ephemeral chat). This bundle IS kept in-tree as reference at tools/remote-session-client/ui/design-bundle/ and includes conversation transcripts, HTML prototypes, JSX components, and the unique theme assets. See its CAPOS-INTEGRATION.md for the bundle-to-live-UI mapping and the iteration-7 prerequisite (CSP refactor + per-browser BrowserSession cookie before any inline scripts/styles from the prototype reach production).

Both bundles ship four themes (Space, Mountain, Light, Operator) and a consistent token system (themes.jsx in either bundle is authoritative for palette / typography / radii / blur). The branding assets actually shipped under branding/ were copied into tools/remote-session-client/ui/assets/ for the bridge to serve; the prototype’s reference imagery is kept only in the in-tree design-bundle directory.

What landed in tools/remote-session-client/ui/:

  • Vanilla CSS rewrite of styles.css around the design’s theme tokens. No React, no Babel, no third-party CDN script tags. Trust boundary stays intact: the loopback bridge serves only static assets.
  • index.html restructured to the design’s hero + auth-card + footer layout with mobile responsiveness, an Operator dashed inner frame (capos://auth label), and the original data-test surface fully preserved so make run-remote-session-capset-ui still passes.
  • A trusted-static feature flag block (window.CAPOS_UI_FEATURES, overridable via ?features=) gates surfaces that are scaffolded but not yet backed by the Rust gateway. Default flags match what the current backend honours.

Surfaces scaffolded but flag-gated off by default (no functional support in capOS yet; future tracks will wire them):

  • Passkey sign-in (?features=passkey). Tracks docs/proposals/boot-to-shell-proposal.md (passkey/WebAuthn, credential setup) and docs/proposals/cryptography-and-key-management-proposal.md.
  • OIDC / SSO providers (?features=sso for Google/GitHub/Okta). Tracks docs/proposals/oidc-and-oauth2-proposal.md. The trusted Rust backend must own the provider integration; browser JavaScript must continue to receive only view models, results, and denials.
  • MFA second-factor step (?features=mfa). Tracks docs/proposals/boot-to-shell-proposal.md. The 6-digit input animates end-to-end as a UI demo today; production wiring is a future slice.
  • Success step (?features=successStep). The current Rust backend transitions straight to the workspace on session start; the success card is design-parity scaffolding for a future mid-step surface.
  • Capability-grant consent strip. Removed from the design itself during iteration (the user concluded it demonstrated the wrong thing for capOS); kept in the deferred list because a future consent-on-grant flow for OAuth-style external identities would re-use the same visual language.

Surfaces flag-gated on by default but UI-only today (decorative state without a backend round-trip):

  • System status pill, Region pill, Language pill, Footer, Hero panel, Remember-device checkbox, Forgot-password link, Password show/hide toggle.

Constraints the visual layer must keep across future slices:

  • Login is a dedicated OS-like screen with a visible username field and no full persistent technical header. Resource profile names such as operator are system details, not something the user types.
  • Browser login sends username/password only. The username field is empty by default: the browser UI does not pre-fill from CAPOS_REMOTE_SESSION_USER, host USER, or any other host-local identity hint, because that would disclose account hints before authentication.
  • Authenticated users land in a compact Services-first workspace where Session, CapSet, Diagnostics, and Transcript are separate views. The UI smoke harness must continue to fail if any visible button is not exercised; new flag-gated buttons must stay hidden by default so the smoke surface does not grow without paired automation coverage.
  • No third-party CDN script tags or runtime frameworks are added to the trusted UI. Theme switching uses the existing data-theme attribute on <html>/<body>; CSS variables flip the design tokens.