Proposal: Remote Session CapSet Clients
Let a regular host application connect to a capOS instance, authenticate through the same session machinery as shells and gateways, receive a broker-issued remote view of its CapSet, and invoke the granted capabilities over standard Cap’n Proto RPC. The first proof can be a Linux Rust CLI because it is easy to script, but the design is for host applications generally: native GUI apps, Tauri apps with Rust backends, server-side webapp gateways, desktop tools, and agent runners can all consume the same remote session CapSet model.
The important correction is that this is not a special “remote chat client” and not another shell transport. Chat, Paperclips, Adventure, system-info, command surfaces, and future service APIs should be ordinary capabilities in a remote session bundle. A shell is one possible client of that bundle; it is not the universal protocol.
Current State
The tree has several local interop and UI proofs:
- demos/capnp-chat-interop runs inside capOS, accepts one scoped TCP connection, decodes a schema-framed Chat.send parameter message, calls the resident chat endpoint, returns a schema-framed result, and exits.
- The host harness uses a Linux Python script plus the pinned capnp tool to encode/decode request and result messages.
- demos/remote-session-capset-gateway runs inside capOS, listens through a manifest-scoped TcpListenAuthority on guest port 2327, authenticates a remote session through SessionManager, returns a broker-shaped remote CapSet view, calls session/system-info DTO operations, and proves wrong-interface, unknown-cap, and stale-session denials. It derives login source metadata from the accepted socket and a gateway-generated connection event id.
- tools/remote-session-client is a regular Linux Rust client crate. Its library is UI-neutral so the same client logic can back a CLI harness, native GUI, Tauri backend, or trusted web gateway.
- remote-session-ui is a trusted loopback web bridge in that crate. Its Rust backend holds the TCP connection and remote session state, serves a browser UI, and exposes only view models, call results, denial diagnostics, and redacted transcript rows to browser JavaScript. The focused make run-remote-session-capset-ui harness drives that UI against a gateway-only QEMU fixture.
- remote-session-web-ui is a capOS-served browser UI backend. Default make run starts it on guest port 8080 with loopback host forwarding, and make run-remote-session-self-served-web-ui proves the full boot-resource UI bundle is served from the capOS-owned origin while preserving the same browser-safe view-model boundary. This remains local/QEMU evidence; the cloudboot L4, private GCE, and public ingress proofs are separate tasks.
Those proofs are useful because they show external Cap’n Proto data can cross the QEMU TCP boundary and reach capOS-hosted services through narrowed listener caps. The remote-session proof is the first target-shaped slice, but it is not the final RPC API. It still lacks:
- standard capnp-rpc message transport;
- live typed RPC proxy objects rather than DTO-mediated gateway operations;
- live endpoint-backed proxy objects beyond the current authenticated per-session DTO worker slices for Chat.send, Adventure status/look/inventory/go(direction), and the Paperclips Path B bridge-internal initial/command/status/projects synthesis from cached serviceLaunch state;
- Paperclips service-runner launch on the default make run manifest (Path B wires the gateway worker, bridge dispatch, and UI launch slot, and the focused system-remote-session-paperclips.cue manifest now declares its AuthorityBroker launch policy, but default-manifest Paperclips launch wiring remains future work);
- the on-wire Paperclips control-plane (Path C): extending RemoteGatewayRequest/RemoteGatewayResponse with paperclips arms so the bridge no longer synthesizes responses from cached launch state and the gateway worker drives PaperclipsGameClient over a real DTO arm rather than the manifest-static game endpoint fallback;
- rich Adventure/Paperclips client controls and broader service-specific worker/client implementations beyond the current Chat, Adventure, and Path B Paperclips slices;
- complete object lifetime and exception behavior;
- broader revocation and object-drop propagation beyond the current kernel-backed DTO logout and connection-teardown path;
- TLS/mTLS and expanded auth adapters beyond password, anonymous, and guest;
- resource accounting for remote references, in-flight calls, and result sizes.
Goals
- Support a normal host client built and run outside capOS. A Linux Rust CLI is the smallest harness; native GUI and Tauri/webapp-backed clients should not need a different capOS protocol.
- Authenticate through capOS session/admission services, not through an application-specific service secret.
- Support multiple admission methods: local password where policy enables it, public-key signatures, OIDC/OAuth browser or device flows, passkey/WebAuthn through the web gateway path, mTLS client identity, guest/anonymous profiles where explicitly enabled, and future service/workload credentials.
- Return a live remote CapSet view whose entries are typed RPC client objects, not serialized local cap-table slots.
- Let the client call any granted remote-proxyable capability by name and expected interface ID.
- Let a host UI discover broker-approved service profiles, start allowed game server processes through a restricted service-runner, and attach the capabilities those processes export or receive without exposing local spawn authority.
- Support bidirectional session UI composition: a host UI can call capOS capabilities, and capOS-side services or agents can propose bounded changes to the host session’s panes, command palette, visualizations, density, theme, and workflow-specific controls through explicit UI capabilities.
- Keep local-only authority local: cap IDs, endpoint generations, receiver selectors, session-global identifiers, and kernel result-cap indexes never become portable remote authority.
- Preserve session-bound invocation context. Remote calls run under the gateway/worker session created for that remote client.
- Make logout, disconnect, transport breakage, session expiry, policy revocation, and object drop observable and fail closed.
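The "call by name and expected interface ID" goal, together with the fail-closed goal, can be illustrated with a small lookup sketch. This is not the gateway's code: `RemoteCapSet`, `RemoteCapEntry`, and `Denial` are hypothetical names, and the three denial variants simply mirror the wrong-interface, unknown-cap, and stale-session denials the current proof already exercises.

```rust
use std::collections::HashMap;

/// Hypothetical denial codes mirroring the proofs named in Current State.
#[derive(Debug, PartialEq, Eq)]
pub enum Denial {
    UnknownCap,
    WrongInterface,
    StaleSession,
}

/// Hypothetical granted entry: a name plus the Cap'n Proto interface ID.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RemoteCapEntry {
    pub name: String,
    pub interface_id: u64,
}

/// Hypothetical backend-held view of the broker-issued bundle.
pub struct RemoteCapSet {
    live: bool,
    entries: HashMap<String, RemoteCapEntry>,
}

impl RemoteCapSet {
    pub fn new(entries: Vec<RemoteCapEntry>) -> RemoteCapSet {
        let entries = entries.into_iter().map(|e| (e.name.clone(), e)).collect();
        RemoteCapSet { live: true, entries }
    }

    /// Called on logout, disconnect, expiry, or revocation.
    pub fn mark_stale(&mut self) {
        self.live = false;
    }

    /// Lookup succeeds only for a live session, a granted name, and a
    /// matching interface ID; every other case is a typed denial.
    pub fn get(&self, name: &str, expected_interface: u64) -> Result<&RemoteCapEntry, Denial> {
        if !self.live {
            return Err(Denial::StaleSession);
        }
        let entry = self.entries.get(name).ok_or(Denial::UnknownCap)?;
        if entry.interface_id != expected_interface {
            return Err(Denial::WrongInterface);
        }
        Ok(entry)
    }
}
```

The staleness check comes first so that a logged-out session cannot even probe which names were granted.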
Non-Goals
- General network transparency across arbitrary capOS hosts.
- OCapN compatibility or third-party handoffs.
- Browser JavaScript receiving capOS capability objects directly. A webapp may be a front end, but a trusted server, gateway, or Tauri Rust backend holds the remote CapSet.
- Letting capOS services execute arbitrary host UI code, inject unreviewed JavaScript/CSS, spoof trusted browser/desktop chrome, or persist UI changes outside the granted session UI scope.
- Replacing SSH, WebShellGateway, native shell, or interactive command surfaces.
- Exposing raw ProcessSpawner, raw process handles, endpoint owner caps, local cap ids, result-cap slots, raw network factories, broad storage roots, key material, or browser-held capOS capability objects as a default remote bundle. Process handles stay backend-local.
- Treating a browser or webview as a capOS capability host. Browser code sees view models, launch forms, command descriptors, user events, diagnostics, and rendered results; the trusted Rust/backend side holds the remote session and any remote capability proxies.
- Treating password authentication as the only or preferred remote path.
- Serializing the kernel CapSet page or local cap table to the client.
UI Scope And Architecture
This section is the single-page synthesis future contributors should read
before changing anything in tools/remote-session-client/ or the gateway.
The detailed mechanics live in the rest of this proposal, the backlog
(docs/backlog/remote-session-capset-client.md), and the plan
(docs/backlog/remote-session-capset-client.md); this section captures
what the UI is for, what it must hold, and how the pieces decompose.
Goal
A remote operator, after authenticating to a capOS gateway, can drive every remote-proxyable capability the broker grants their session – directly, with typed UI, without a shell, without webview-held capOS handles, without leaking session-id hex, cap slots, or process handles to the browser. The CapSet UI is not a shell, not a generic API explorer, and not a browser; it is a peer client of the same broker bundle a shell would consume, over TCP/RPC instead of the ring page, with a backend-held authority boundary and a typed UI on top.
What the UI is for
Grouped by intent, not by panel. Each item is constrained by the corresponding section later in this proposal.
- Sign in to a remote capOS host. OS-style login surface with a visible username field, secondary endpoint/auth controls, no full persistent technical header. The gateway advertises the auth methods the system makes available (narrowed only by explicit manifest policy); disabled methods stay listed and clearly marked so the protocol is not password-shaped. The web UI’s username field is empty by default – the bridge does not pre-fill from CAPOS_REMOTE_SESSION_USER, host USER, or any other host-side identity hint, because a pre-fill leaks operator/account hints to anything observing the page before authentication. The CLI may take --user as an explicit operator override; the web UI does not. Denials surface with explicit codes, never as silent transport errors.
- Understand who/what the operator is. Session view: principal, profile, auth method, auth strength, freshness/expiry, logout. Redacted session-id only. Lifecycle states observable: live / logged_out / future expired / revoked / recovery_only. Stale-call attempts must visibly fail closed. (See ## Invocation Context and docs/proposals/session-bound-invocation-context-proposal.md.)
- Discover what was granted. CapSet view as the inspection surface (name, interface id, transfer policy, lease expiry, get-by-name+id); service catalog view as the task-oriented surface (broker- and launcher-advertised runnable profiles, required grants, exported descriptors, launch/probe/status). See ## Service Catalog And Game Server Launch.
- Use what was granted. For every cap the broker bundles, the UI must offer at least a generic invocable form – not just inspection. Service-specific rich clients (Adventure rich client, real Chat panel, Paperclips client, future agent-shell-services) layer on top of the same backend-held caps. Where a service exposes a typed CommandSurface (see docs/proposals/interactive-command-surface-proposal.md), the UI renders typed buttons/inputs/selectors driven by that surface’s metadata rather than hand-coded controls. Where a service exposes text/audio/video surfaces, the UI consumes them through the Chat substrate (docs/proposals/chat-multimedia-substrate-proposal.md): listener caps for incoming text/audio/video, capnp -> stream methods for outgoing media, capability-mediated peer/channel granting, and a WebRTC mapping for the browser-to-backend audio/video path. The CapSet UI never holds the listener caps directly; the trusted Rust backend owns them and emits redacted view-model events plus WebRTC handles for the browser.
- Host a terminal panel when granted. The CapSet UI is not defined as a terminal emulator and works without one. But when the broker grants a TerminalSession cap – for a native shell, a POSIX shell, or any StdIO-based service that expects a terminal on the other side – the UI may host a terminal panel for that cap. The boundary stays: terminal bytes flow through a backend-held TerminalSession; the browser renders frames it receives, never opens a raw shell or holds a ProcessSpawner.
- Surface agent-shell-exposed capabilities as first-class. The CapSet UI does not contain the LLM loop, model client, or tool-execution runner – those live in the agent shell process (see docs/proposals/llm-and-agent-proposal.md). But agent-shell-exposed services (e.g. “send message to running agent”, “approve queued action”, “audio stream to/from agent”) are services the broker can bundle. When bundled, the CapSet UI exposes them through the same per-session worker / typed view-model pattern as Chat or Adventure. Action-approval queues are the canonical capability-driven UI surface here – the policy engine asks, the operator sees a queue and approves/denies per item.
- Launch services where policy allows. Service-runner launch flow: select profile → see required grants → side-effect-free probe → confirm → backend launches restricted server graph (e.g. adventure-server + NPC companions) → backend attaches/retains exported descriptors in the backend-held remote CapSet. Browser sees launch form, status, denials, descriptors – never raw ProcessSpawner or process handles.
- Diagnose / audit. Low-level probes (denied-chat, stale-call, system MOTD, session-summary diff) live in a Diagnostics or Session panel, not interleaved with normal service use. Redacted transcript export in its own view; redaction status visible; raw authority material absent. UI smoke checks for forbidden markers (processhandle, capabilitymanager, capslot, …).
- Bidirectional UI composition (later). A capOS service may, only when granted a RemoteUiSurface cap, propose bounded layout/theme/command/visualization patches and receive typed user events back. Cannot inject JS/CSS, spoof login chrome, persist UI state without a separate settings cap, or exceed quota/size bounds. See ## Bidirectional UI Composition.
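The forbidden-marker smoke check from the Diagnose / audit item can be sketched as a plain substring scan over a browser-visible envelope. This is a minimal sketch, not the harness's code: `FORBIDDEN_MARKERS` holds only the three markers the list names (the source list is elided and the real check may cover more), and `forbidden_markers_in` is a hypothetical helper name.

```rust
/// Only the markers named in the proposal text; the source list is elided.
pub const FORBIDDEN_MARKERS: &[&str] = &["processhandle", "capabilitymanager", "capslot"];

/// Returns every forbidden marker that appears in a browser-visible
/// envelope, compared case-insensitively. An empty result means the
/// envelope passes this check.
pub fn forbidden_markers_in(envelope: &str) -> Vec<&'static str> {
    let lowered = envelope.to_lowercase();
    FORBIDDEN_MARKERS
        .iter()
        .copied()
        .filter(|marker| lowered.contains(*marker))
        .collect()
}
```

A smoke test would run this over every JSON response and exported transcript row and fail on any non-empty result.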
Design invariants the UI must hold
The proposals don’t specify pixel layout; they specify a small number of hard invariants. Every UI design choice has to fit these:
- Authority boundary. Trusted Rust backend holds: TCP connection, remote session state, per-session worker proxies, capOS cap references, broker bundle policy, raw snapshots used to compute view models. Browser holds: view models, command descriptors, launch forms, redacted transcript rows, theme state.
- Session-bound invocation. Every post-auth call runs under the immutable SessionContext of the per-session worker. The browser cannot select identity by request field; the backend cannot construct a fresh SessionContext from request bytes. Logout, disconnect, expiry, revocation must break all session-bound proxies and fail closed before result bytes reach the caller.
- Privacy-preserving disclosure. Default endpoint metadata is opaque (scoped_ref + freshness). Subject fields (principal, profile, auth strength) appear in the UI only because the broker policy explicitly disclosed them for that service.
- Capability = invoke gate; UI surface = render gate. A button on the screen is not what authorizes a call. The cap held in the backend is. UI controls that aren’t currently invocable must say “planned / not remote-proxyable yet” rather than imply they work.
- Interface = permission. Method-level access lives in the schema, not in a per-cap rights bitmask. Narrowing what a remote client can do means a narrower wrapper cap from the broker – not a flag on the same cap.
- Side-effect-free probes are real. A probe response that says “supported / required grants accepted / message” did not spawn anything, allocate endpoint owners, or attach caps.
- Redaction is structural, not after-the-fact. Sensitive fields are dropped or redacted on the way into view models, not stripped from logs after the fact. Backend tests assert browser envelopes never contain raw session-id hex or password material.
- UI smoke fails if any visible button is unexercised. This prevents the UI from accumulating decorative controls.
- Theme/layout state is local UI state, not capOS state. Persistence requires an explicit settings cap.
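The "redaction is structural" invariant can be illustrated by building the browser-facing view model from the raw snapshot, so that sensitive fields have no slot to leak into. All type and field names below (`RawSessionSnapshot`, `RawCap`, `SessionView`, `disclose_principal`) are assumptions for illustration, not the crate's real types.

```rust
/// Hypothetical raw backend-held entry; `local_cap_id` must stay backend-local.
pub struct RawCap {
    pub name: String,
    pub local_cap_id: u32,
}

/// Hypothetical raw snapshot held only by the trusted backend.
pub struct RawSessionSnapshot {
    pub session_id: [u8; 16],
    pub principal: String,
    /// Stands in for the broker policy's disclosure decision.
    pub disclose_principal: bool,
    pub caps: Vec<RawCap>,
}

/// Browser-safe view model: there is no field that could carry the raw
/// session id or a local cap id, so nothing needs stripping afterwards.
#[derive(Debug, PartialEq, Eq)]
pub struct SessionView {
    pub session_label: &'static str,
    pub principal: Option<String>,
    pub cap_names: Vec<String>,
}

impl SessionView {
    pub fn from_snapshot(raw: &RawSessionSnapshot) -> SessionView {
        SessionView {
            session_label: "[redacted]",
            principal: if raw.disclose_principal {
                Some(raw.principal.clone())
            } else {
                None
            },
            cap_names: raw.caps.iter().map(|cap| cap.name.clone()).collect(),
        }
    }
}

/// Test helper: lowercase hex, used to assert the id never appears.
pub fn hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

A backend test in this style would serialize the view and assert that `hex(&raw.session_id)` is absent, in the spirit of the envelope assertions the invariant describes.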
Architecture decomposition
flowchart LR
subgraph host[Host machine]
subgraph browser[Browser / webview / Tauri webview]
js[Browser JS - view models, forms, results, redacted transcript, theme state]
end
subgraph rust[Trusted Rust backend - tools/remote-session-client]
bridge[HTTP bridge - /api/* endpoints]
app[AppState - session VM, caps VM, snapshots, transcript, automation]
tcp[Gateway TCP connection - schema-framed DTOs today, capnp-rpc planned]
lib[remote-session-client lib - protocol, frame, session_diff, transcript]
end
cli[CLI binary - same lib backend]
end
subgraph capos[capOS guest in QEMU or future hardware]
subgraph gw[Remote-session gateway process]
tcplisten[TcpListenAuthority on guest port 2327]
authflow[Auth flow - password, anonymous, future adapters]
sm[SessionManager.login -> UserSession]
broker[AuthorityBroker.remoteClientBundle]
end
subgraph workers[Per-session RPC workers]
chatw[Chat worker - holds Chat client facet]
advw[Adventure worker - holds Adventure endpoint]
futurew[Future workers per service - terminal, agent, voice...]
end
subgraph services[Backing services]
cs[chat-server]
ad[adventure-server + NPCs]
pc[paperclips-server - future]
end
kernel[Kernel - SessionManager, CapTable, Endpoints, ring, audit]
end
js -- HTTP JSON --> bridge
bridge --> app --> lib --> tcp
cli --> lib
tcp -- TCP / DTO today / capnp-rpc planned --> tcplisten
tcplisten --> authflow --> sm --> broker
broker -- backend-held descriptors / caps --> app
app -- worker spawn requests --> broker
broker --> workers
chatw --> cs
advw --> ad
workers <--> kernel
Key seams:

- Gateway boundary (demos/remote-session-capset-gateway/): scoped TcpListenAuthority, SessionManager, AuthorityBroker, narrowly approved backend launch authority. No raw NetworkManager, raw ProcessSpawner, broad endpoint authority.

- Per-session worker boundary (demos/remote-session-chat-worker/, demos/remote-session-adventure-worker/, future workers): each endpoint-backed remote method runs in a worker that holds the live session-bound caller context. Worker spawn is validated; logout/connection-close tears down workers; release flushing happens on shutdown.

- Trusted Rust backend boundary (tools/remote-session-client/src/): the AppState keeps gateway: Option<GatewayConnection>, current_snapshot: RemoteSessionSnapshot (raw), and view-model fields (redacted). The HTTP bridge’s /api/* surface is the only path the browser has into capOS authority.

- Browser boundary (tools/remote-session-client/ui/): pure client of /api/state view models, /api/call/* typed calls, /api/capset/*, /api/probe/*, /api/transcript/*. JS state is presentation: theme, active tab, login form values, click coverage report.

- Transport evolution. Today: bespoke schema-framed Cap’n Proto DTOs, length-prefixed frames, request/response sequence numbers. Planned: standard capnp-rpc with live proxy objects, exception mapping, release/drop, promise pipelining. The backend boundary stays the same; the wire shape changes.

  Standard capnp-rpc (the capnp-rpc Rust crate, v0.25 at the time of writing) is std-only and requires a futures executor; the QEMU-side gateway is #![no_std] #![no_main] with a synchronous loop { accept; loop { recv_frame; handle; send_frame } } shape (demos/remote-session-capset-gateway/src/main.rs). The wire-level replacement is therefore gated on either bringing an async runtime to capOS userspace or shipping a sync-friendly capnp-rpc adapter. Until then, transport-lifetime / exception behavior carries the contract documented next, which the eventual rewrite must preserve.

  Runtime decision for the first proxy layer: use a temporary dual-stack. The Linux host backend now has a local capnp-rpc Chat facade/proxy layer because that side already has std and can run a futures executor. The facade translates backend-held typed proxy calls into the existing RemoteGatewayRequest/RemoteGatewayResponse DTO transport, so the guest gateway remains synchronous and #![no_std]. This proves host-backend proxy semantics, denial/disconnect mapping, and browser-safe view-model integration; it does not claim standard capnp-rpc framing or live RPC vats inside capOS. Gateway-wire replacement waits for the userspace runtime decision above, and the dual-stack must be removed after the reviewed guest-side RPC path carries live service traffic.
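To make "length-prefixed frames, request/response sequence numbers" concrete, here is a minimal host-side sketch. The header layout (4-byte little-endian payload length, then 4-byte little-endian sequence number), the `MAX_FRAME_PAYLOAD` value, and both function names are assumptions for illustration; the proposal does not specify the actual byte layout.

```rust
use std::io::{self, Read};

/// Illustrative payload bound; the real limit is not stated in this proposal.
pub const MAX_FRAME_PAYLOAD: usize = 64 * 1024;

/// Assumed layout: [len: u32 LE][seq: u32 LE][payload bytes].
/// Returns None if the payload cannot be described by a u32 length.
pub fn encode_frame(seq: u32, payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > u32::MAX as usize {
        return None;
    }
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&seq.to_le_bytes());
    out.extend_from_slice(payload);
    Some(out)
}

/// Reads one frame, rejecting an oversized length before allocating
/// the payload buffer. A short read surfaces as `UnexpectedEof`.
pub fn read_frame<R: Read>(reader: &mut R, max_payload: usize) -> io::Result<(u32, Vec<u8>)> {
    let mut header = [0u8; 8];
    reader.read_exact(&mut header)?;
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let seq = u32::from_le_bytes([header[4], header[5], header[6], header[7]]);
    if len > max_payload {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "frame exceeds payload bound",
        ));
    }
    let mut payload = vec![0u8; len];
    reader.read_exact(&mut payload)?;
    Ok((seq, payload))
}
```

A caller would compare the returned sequence number against the request it sent and treat a mismatch as a protocol-level decode error rather than a transport failure.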
Transport lifetime and exception contract
The bespoke transport’s lifetime contract is what the future
capnp-rpc proxy layer has to preserve. The host-side test module
in tools/remote-session-client/src/bin/remote_session_ui.rs pins
each rule end-to-end:
- Connection close mid-call clears state, returns gatewayDisconnected. A TCP FIN observed during a request surfaces as 503 gatewayDisconnected with view.lastResult.code = "gatewayDisconnected", view.connected = false, session = null, empty caps/services/launchers, and a disconnect transcript row scoped to the operation that failed. Covered by authenticated_gateway_close_during_call_clears_view_with_reconnect_guidance, oversized_gateway_response_during_call_clears_view_with_reconnect_guidance, password_denial_then_closed_tcp_resets_before_retry, http_password_denial_then_closed_tcp_preserves_backend_error_and_clears_view.
- Half-open transport (write succeeds, read stalls) times out cleanly. The bridge’s read_timeout(endpoint.io_timeout()) must fire and surface the same gatewayDisconnected shape; no hang or partial-state leak. Both the post-request stall case and the partial-frame-header stall case are covered: half_open_response_read_times_out_as_disconnect, partial_response_header_then_stall_treated_as_disconnect.
- Protocol-level decode errors (sequence mismatch, malformed payload) yield 500 internal without tearing down the connection. This documents current behavior; the future capnp-rpc rewrite is expected to tighten this to a connection-level abort once the proxy layer is in place. Covered by response_with_wrong_seq_yields_internal_error, malformed_response_payload_yields_internal_error.
- Immediate re-login after transport failure succeeds. No retry / cooldown gate; the recovered session must not echo the prior call’s failure as lastResult. Covered by immediate_relogin_after_mid_call_close_succeeds.
- disconnect rows survive into the operator-visible exported transcript (GET /api/transcript/redacted) scoped to the operation that failed and free of stream-level metadata (peer addresses, frame sizes, raw os error strings, secrets). Covered by disconnect_recorded_in_exported_transcript_after_mid_call_close.
- Gateway-side teardown calls kernel UserSession.logout on both the explicit-logout DTO path and the connection-close path. Verified by the QEMU-driven harness in tools/qemu-remote-session-capset-smoke.sh, which asserts that "UserSession.logout cap call succeeded; remote session stale" and "connection teardown UserSession.logout cap call succeeded" both appear during the multi-cycle interop run.
- Post-logout calls fail closed. The bridge keeps the gateway socket alive after logout so a stale-call probe gets an explicit staleSession denial rather than a transport failure. Covered by repeated_stale_calls_after_logout_remain_fail_closed and the worker-targeted stale_chat_proxy_after_logout_returns_typed_denial.
- Worker/proxy lifetime failures preserve the same split. Worker-targeted Chat.send transport loss and oversized worker responses clear backend gateway/session state and surface gatewayDisconnected with reconnect guidance, while post-logout worker calls remain typed staleSession denials on the still-open gateway socket. The backend-only capnp-rpc facade maps transport breakage to ErrorKind::Disconnected, and maps DTO denials or unexpected worker/proxy responses to FailedCapException-like errors rather than panics or silent broader authority. Covered by chat_worker_transport_breakage_clears_state_and_redacts_export, oversized_chat_worker_response_maps_to_disconnect_without_frame_leak, generated_chat_client_transport_breakage_maps_to_disconnected_exception, generated_chat_client_dto_denial_maps_to_failed_cap_exception_like_error, and generated_chat_client_unexpected_worker_response_maps_to_failed_exception.
- Revoked leases are not yet separately observable. The current DTO surface carries leaseExpiresAtMs on cap entries, but it has no explicit revoke/lease-expired call path or denial code that can distinguish a revoked lease from staleSession or methodDenied. Tests must not fake this coverage; add it with the standard RPC object lifetime path or a reviewed DTO denial shape.
- Redacted transcript export does not expose exception/lifetime internals. Worker-targeted disconnect, oversized response, and stale-session exports are asserted free of raw socket addresses, OS error strings, frame-size diagnostics, local cap ids, result-cap labels, proxy table positions, raw session-id hex, passwords, and host endpoint hints.
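The three-way split in the rules above (transport failure, protocol decode error, typed denial) can be summarized as a small classification function. This is a sketch under assumptions: `GatewayFailure`, `Outcome`, and `classify` are hypothetical names, and the HTTP status for a `staleSession` denial is left as `None` because this section does not state it.

```rust
/// Hypothetical failure classes observed by the bridge.
#[derive(Debug, PartialEq, Eq)]
pub enum GatewayFailure {
    ConnectionClosed,
    ReadTimeout,
    OversizedResponse,
    SequenceMismatch,
    MalformedPayload,
    StaleSession,
}

/// Hypothetical browser-facing outcome of a failed call.
#[derive(Debug, PartialEq, Eq)]
pub struct Outcome {
    /// None where this proposal does not state the status.
    pub http_status: Option<u16>,
    pub code: &'static str,
    /// True when gateway/session state must be cleared before replying.
    pub clear_backend_state: bool,
}

pub fn classify(failure: &GatewayFailure) -> Outcome {
    match failure {
        // Transport-level loss: same shape for close, stall, and oversize.
        GatewayFailure::ConnectionClosed
        | GatewayFailure::ReadTimeout
        | GatewayFailure::OversizedResponse => Outcome {
            http_status: Some(503),
            code: "gatewayDisconnected",
            clear_backend_state: true,
        },
        // Decode errors: current behavior keeps the connection up.
        GatewayFailure::SequenceMismatch | GatewayFailure::MalformedPayload => Outcome {
            http_status: Some(500),
            code: "internal",
            clear_backend_state: false,
        },
        // Typed denial on the still-open socket after logout.
        GatewayFailure::StaleSession => Outcome {
            http_status: None,
            code: "staleSession",
            clear_backend_state: false,
        },
    }
}
```

Keeping the mapping in one function makes it harder for a new failure class to slip through as a silent transport error.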
Resource and revocation bounds
Each per-session resource class has an explicit named ceiling and maps over-cap conditions to a typed denial diagnostic that reuses the transport-error envelope from above. Operators tuning these bounds should re-audit the per-session memory budget and the operator-multitool scenario before changing them; raw observed counters are not exposed to browser-facing view models.
| Resource | Limit (constant and default) | Bound kind | Where enforced | Denial code or over-cap behavior |
|---|---|---|---|---|
| Outstanding worker calls per session | MAX_OUTSTANDING_WORKER_CALLS_PER_SESSION (4) | in-flight call count | tools/remote-session-client/src/bin/remote_session_ui.rs::transact (gates Adventure / Chat-shaped requests before submission) | tooManyWorkerCalls (HTTP 503) |
| Transcript ring per session | TRANSCRIPT_ROWS_CAP (4096), TRANSCRIPT_DETAIL_BYTES_CAP (1 MiB) | row + byte caps | AppState::push_transcript / enforce_transcript_caps in the same file | drop-oldest plus a single audit "transcript truncated; ..." row |
| Backend cap holders per session | MAX_BACKEND_CAP_HOLDERS_PER_SESSION (64), MAX_BACKEND_SERVICE_CATALOG_ENTRIES (64), MAX_BACKEND_LAUNCHER_CATALOG_ENTRIES (32) | per-Vec entry caps | capset_list / service_catalog / launcher_catalog in the same file | tooManyCapHolders (mirrors transport-error envelope) |
| Browser-session owner slot | one tentative or authenticated owner | first-wins bridge owner | login-route preflight reserves before gateway authentication; success finalizes on cookie rotation, failure releases the reservation | sessionAlreadyInUse (HTTP 409) |
| Local HTTP request parser | request line 8 KiB, header line 8 KiB, 96 headers, aggregate headers 32 KiB, body 64 KiB, fixed read/write timeout | loopback bridge input bounds | read_http_request and handle_connection reject before route dispatch, JSON parsing, auth, or gateway I/O | httpLineTooLong, tooManyHeaders, headersTooLarge, requestBodyTooLarge, requestTimeout |
| Local HTTP handler slots | MAX_HTTP_HANDLER_THREADS (32) | concurrent request handlers | accept loop acquires a bounded slot before spawning a handler thread | handlerLimitExceeded (HTTP 503) |
| Concurrent gateway logins per principal | MAX_CONCURRENT_LOGINS_PER_PRINCIPAL (4), PRINCIPAL_TABLE_SLOTS (32) | per-principal counter, distinct-principal table ceiling | demos/remote-session-capset-gateway/src/lib.rs::PrincipalLoginTable::try_admit, called from both password and anonymous login paths | serviceUnavailable with “per-principal concurrent-session cap reached…” |
The bridge-side bounds are exercised by host tests in
remote_session_ui.rs::tests (transcript_row_count_cap_drops_oldest_with_truncation_marker,
transcript_byte_cap_drops_oldest_with_truncation_marker,
transcript_at_exact_row_cap_does_not_truncate,
capset_list_at_max_holders_bound_stores_all_entries,
capset_list_over_max_holders_returns_typed_denial,
service_catalog_at_max_entries_bound_stores_all_entries,
service_catalog_over_max_entries_returns_typed_denial,
launcher_catalog_at_max_entries_bound_stores_all_entries,
launcher_catalog_over_max_entries_returns_typed_denial,
outstanding_worker_calls_at_bound_still_allow_one_more_after_completion,
outstanding_worker_calls_over_bound_returns_typed_denial,
concurrent_first_wins_login_reservations_allow_one_post_login_owner,
failed_login_reservation_releases_for_later_owner,
http_parser_rejects_oversized_request_line_before_route_work,
http_parser_rejects_oversized_header_line,
http_parser_rejects_too_many_headers,
http_parser_rejects_aggregate_headers_too_large,
http_parser_rejects_oversized_body_from_content_length,
http_parser_times_out_incomplete_request_line,
handler_slots_bound_concurrent_request_threads).
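The transcript row and byte caps with drop-oldest truncation can be sketched as a bounded queue. This is not the `AppState::push_transcript` implementation: the `Transcript` type is hypothetical, the caps are constructor parameters so small values can be tested, and rendering the audit row as a single leading marker on export is an assumption. The marker text keeps the elision exactly as it appears in the table.

```rust
use std::collections::VecDeque;

/// Marker text as quoted in the bounds table; the remainder is elided there.
pub const TRUNCATION_MARKER: &str = "transcript truncated; ...";

/// Hypothetical bounded transcript with row and detail-byte caps.
pub struct Transcript {
    rows: VecDeque<String>,
    detail_bytes: usize,
    row_cap: usize,
    byte_cap: usize,
    truncated: bool,
}

impl Transcript {
    pub fn new(row_cap: usize, byte_cap: usize) -> Transcript {
        Transcript {
            rows: VecDeque::new(),
            detail_bytes: 0,
            row_cap,
            byte_cap,
            truncated: false,
        }
    }

    /// Appends a row, then drops oldest rows until both caps hold again.
    pub fn push(&mut self, detail: String) {
        self.detail_bytes += detail.len();
        self.rows.push_back(detail);
        while self.rows.len() > self.row_cap || self.detail_bytes > self.byte_cap {
            match self.rows.pop_front() {
                Some(old) => {
                    self.detail_bytes -= old.len();
                    self.truncated = true;
                }
                None => break,
            }
        }
    }

    /// Exported rows, with one marker row in front if anything was dropped.
    pub fn export(&self) -> Vec<String> {
        let mut out = Vec::new();
        if self.truncated {
            out.push(TRUNCATION_MARKER.to_string());
        }
        out.extend(self.rows.iter().cloned());
        out
    }
}
```

Filling the transcript exactly to the row cap leaves `truncated` false, which matches the at-exact-cap test case named above.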
The gateway-side bound is exercised by host tests in
demos/remote-session-capset-gateway/src/lib.rs::tests
(admits_up_to_max_concurrent_logins_per_principal,
rejects_over_cap_admission_with_typed_denial,
release_reopens_a_slot_for_the_same_principal,
distinct_principals_have_independent_counters,
release_to_zero_drops_the_slot,
release_unknown_principal_is_a_noop,
table_full_admission_does_not_grow_past_slot_ceiling).
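The per-principal admission bound can be sketched as a small counted table. The constants and the `PrincipalLoginTable::try_admit` name come from the table above; everything else is an assumption. In particular, the real gateway is `#![no_std]` and will not use `Vec` and `String` like this, and the `AdmitDenial` variants are hypothetical stand-ins for the `serviceUnavailable` denial.

```rust
pub const MAX_CONCURRENT_LOGINS_PER_PRINCIPAL: usize = 4;
pub const PRINCIPAL_TABLE_SLOTS: usize = 32;

/// Hypothetical typed denials; no raw counters are exposed.
#[derive(Debug, PartialEq, Eq)]
pub enum AdmitDenial {
    PerPrincipalCapReached,
    TableFull,
}

/// std-based sketch of the admission table (the gateway itself is no_std).
#[derive(Default)]
pub struct PrincipalLoginTable {
    slots: Vec<(String, usize)>,
}

impl PrincipalLoginTable {
    pub fn try_admit(&mut self, principal: &str) -> Result<(), AdmitDenial> {
        if let Some(slot) = self.slots.iter_mut().find(|slot| slot.0 == principal) {
            if slot.1 >= MAX_CONCURRENT_LOGINS_PER_PRINCIPAL {
                return Err(AdmitDenial::PerPrincipalCapReached);
            }
            slot.1 += 1;
            return Ok(());
        }
        // New principal: never grow past the distinct-principal ceiling.
        if self.slots.len() >= PRINCIPAL_TABLE_SLOTS {
            return Err(AdmitDenial::TableFull);
        }
        self.slots.push((principal.to_string(), 1));
        Ok(())
    }

    /// Releasing an unknown principal is a no-op; releasing the last
    /// session for a principal frees its slot.
    pub fn release(&mut self, principal: &str) {
        if let Some(idx) = self.slots.iter().position(|slot| slot.0 == principal) {
            self.slots[idx].1 -= 1;
            if self.slots[idx].1 == 0 {
                self.slots.swap_remove(idx);
            }
        }
    }

    pub fn active(&self, principal: &str) -> usize {
        self.slots
            .iter()
            .find(|slot| slot.0 == principal)
            .map_or(0, |slot| slot.1)
    }
}
```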
Two contracts the future capnp-rpc rewrite must preserve:
fail-closed bound exhaustion never panics or leaks raw counters into
browser envelopes (only typed denial codes plus a backend audit row);
and operator-visible audit material (bound-exhausted transcript
rows, drop-oldest truncation markers) is recorded backend-side
through the existing redacted-transcript path, not surfaced through
new untyped error channels.
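The first contract (bound exhaustion yields only a typed denial code, with no panic and no counter in the reply) can be illustrated with a sketch of the outstanding-worker-call gate. The constant and the `tooManyWorkerCalls` code come from the bounds table; `CallGate`, `CallPermit`, and the RAII release are assumptions about one possible shape, not a description of `transact`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

pub const MAX_OUTSTANDING_WORKER_CALLS_PER_SESSION: usize = 4;

/// Hypothetical per-session gate checked before a request is submitted.
pub struct CallGate {
    in_flight: Arc<AtomicUsize>,
}

/// Hypothetical permit; dropping it frees the slot, even on error paths.
pub struct CallPermit {
    in_flight: Arc<AtomicUsize>,
}

impl Drop for CallPermit {
    fn drop(&mut self) {
        self.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}

impl CallGate {
    pub fn new() -> CallGate {
        CallGate { in_flight: Arc::new(AtomicUsize::new(0)) }
    }

    /// Returns a permit, or only the typed denial code on exhaustion.
    /// The observed counter value never leaves this function.
    pub fn try_begin(&self) -> Result<CallPermit, &'static str> {
        let mut current = self.in_flight.load(Ordering::Acquire);
        loop {
            if current >= MAX_OUTSTANDING_WORKER_CALLS_PER_SESSION {
                return Err("tooManyWorkerCalls");
            }
            match self.in_flight.compare_exchange(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return Ok(CallPermit { in_flight: Arc::clone(&self.in_flight) }),
                Err(observed) => current = observed,
            }
        }
    }
}
```

Because the permit releases on drop, a call that completes or fails frees its slot either way, which is the behavior the at-bound test name above describes.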
Layer map for future iterations
| Layer | Owner | Today | Heading toward |
|---|---|---|---|
| Wire | gateway ↔ backend | length-prefixed schema-framed DTOs | standard capnp-rpc over TCP, then TLS/mTLS |
| Auth | gateway | password, anonymous, guest; disabled methods advertised | + public key, OIDC (device-code + PKCE), passkey, mTLS, service credential |
| Bundle | broker | shell-bundle-shaped wrapper for remote | first-class remoteClientBundle profile shape |
| Worker | per-session | Chat.send, Adventure status/look/inventory/go | broader Adventure verbs, real Chat panel, Paperclips worker, generalized lifecycle, terminal-session host, agent-shell services |
| Backend (Rust) | trusted | AppState, snapshot, view models, transcript, automation, first-wins BrowserSession ownership, local HTTP parser/handler bounds, per-session resource bounds (worker-calls, transcript rows + bytes, cap holders, gateway logins per principal) | live RPC proxy state, RemoteUiHost cap holder |
| Browser | untrusted UI | login + Services / CapSet / Diagnostics / Transcript / Session SPA | richer service-specific clients, generic CommandSurface-driven forms, agent-shell mode, terminal panel for granted TerminalSession, RemoteUiSurface rendering |
| Host packaging | trusted | CLI, make remote-session-ui, make remote-session-tauri check/dev wrapper | distributable Tauri package sharing the same Rust backend |
Self-served capOS web UI boundary
The first self-served browser UI is a capOS-hosted application service, not the
host remote-session-ui development bridge moved into the guest. A new
capOS userspace service, remote-session-web-ui, owns the HTTP listener,
serves the UI bundle, runs the authenticated web-session backend, holds the
remote session CapSet/proxy state, and projects browser-safe view models.
Static assets are boot-package resources. The implementation should reuse the
reviewed host UI asset source or a smaller reviewed subset, but the served copy
is an immutable, fixed-name bundle embedded in the capOS boot package and
granted by manifest resource name with a pinned digest or equivalent build-time
integrity label. remote-session-web-ui serves only that bundle and a small
generated bootstrap document; it does not expose a host directory, capOS
storage root, asset traversal, or development hot-reload path.
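One way to get "fixed-name bundle, no asset traversal" by construction is to resolve request paths only through exact matches against a compiled-in table, with no path normalization or filesystem access at all. The sketch below assumes that approach; `Asset`, `BUNDLE`, `lookup_asset`, and the placeholder bodies are hypothetical, and only `/bundle/manifest.json` is a path named elsewhere in this proposal.

```rust
/// Hypothetical immutable bundle entry embedded at build time.
pub struct Asset {
    pub path: &'static str,
    pub content_type: &'static str,
    pub body: &'static [u8],
}

/// Placeholder contents; a real bundle would come from the boot package.
pub static BUNDLE: &[Asset] = &[
    Asset {
        path: "/index.html",
        content_type: "text/html; charset=utf-8",
        body: b"<!doctype html>",
    },
    Asset {
        path: "/bundle/manifest.json",
        content_type: "application/json",
        body: b"{}",
    },
];

/// Exact, case-sensitive match only. Because nothing is normalized or
/// joined onto a directory, `..` segments and unknown names simply miss.
pub fn lookup_asset(request_path: &str) -> Option<&'static Asset> {
    BUNDLE.iter().find(|asset| asset.path == request_path)
}
```

A miss maps to a plain not-found response; there is no fallback that consults a host directory or storage root.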
The first listener surface is HTTP/1.1 on a manifest-scoped
TcpListenAuthority for a dedicated UI port such as guest port 8080.
HTTP serves static assets plus same-origin JSON API routes. WebSocket,
server-sent events, and terminal/media streaming remain later extensions that
need separate route-level bounds; the first proof should avoid them so the
authority and validation surface is small.
The manifest grants for remote-session-web-ui are narrow: scoped
TcpListenAuthority for the UI port, SessionManager, AuthorityBroker, the
immutable UI asset bundle, and the same restricted remote-client
service-runner/backend-launch authority needed to expose approved service
descriptors. It must not receive raw NetworkManager, raw socket factories,
broad storage roots, raw ProcessSpawner, shell launcher authority, endpoint
owner caps, or arbitrary endpoint creation authority.
The service is the trusted backend and holds remote CapSet/proxy state
server-side. Browser JavaScript receives only view models, launch forms,
user-event commands, typed results, denial diagnostics, and redacted transcript
rows. It never receives raw capOS caps, raw ProcessSpawner, process handles,
endpoint owner authority, local cap IDs, result-cap slots, session-global
identifiers, remote CapSet handles, host usernames, host environment variables,
host paths, or QEMU-forwarding identity hints.
Login remains session-manager shaped. The browser submits credentials or
guest/anonymous intent to the capOS-served JSON endpoint; the service derives
source metadata from its accepted socket and service-generated event id, asks
SessionManager for a UserSession, asks AuthorityBroker for the
remote-client bundle, and only then exposes disclosed session/service fields
as browser-safe models. The browser cannot select principal, profile, worker
session context, or backend cap holder by replaying a request field.
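One way to picture that rule is a request type with no slot for source, principal, or cap-holder data, paired with source metadata built only from server-side facts. This is a sketch under assumptions: `BrowserLogin`, `LoginSource`, `EventIds`, and `derive_source` are hypothetical names, and a plain counter stands in for whatever event-id scheme the service actually uses.

```rust
use std::net::SocketAddr;

/// Fields the browser may submit. There is deliberately no principal,
/// source, worker-session, or cap-holder field to replay.
pub struct BrowserLogin {
    pub username: String,
    pub proof: Vec<u8>,
    pub requested_profile: String,
}

/// Source metadata assembled only from facts the service observed itself.
#[derive(Debug, PartialEq)]
pub struct LoginSource {
    pub peer: SocketAddr,
    pub connection_event_id: u64,
}

/// Hypothetical stand-in for the service-generated event id source.
pub struct EventIds {
    next: u64,
}

impl EventIds {
    pub fn new() -> Self {
        EventIds { next: 1 }
    }

    /// Hand out the next id; ids are never taken from request data.
    pub fn allocate(&mut self) -> u64 {
        let id = self.next;
        self.next += 1;
        id
    }
}

/// Build source metadata from the accepted socket's peer address and a
/// fresh service-side event id. The `BrowserLogin` value is not consulted.
pub fn derive_source(peer: SocketAddr, ids: &mut EventIds) -> LoginSource {
    LoginSource {
        peer,
        connection_event_id: ids.allocate(),
    }
}
```

The design point is structural: since the request type cannot carry source metadata, there is no field for a handler to trust by mistake.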
Gate 1B is now an evidence ladder rather than a single proof name. The landed local/QEMU layer is:
- remote-session-self-served-web-ui: a focused manifest boots remote-session-web-ui, browser automation loads assets from the capOS-owned origin, logs in, calls at least one granted capability through the service-held backend state, proves logout/stale failure stays closed, and checks forbidden authority markers are absent from browser-visible envelopes and transcripts.
- remote-session-self-served-web-ui-default-run: default make run starts the capOS-served UI on guest port 8080 and forwards it to a loopback host port for local operator use.
- remote-session-self-served-full-ui-bundle: the capOS service now serves the reviewed fixed-name boot-resource bundle, including the operator workspace assets and /bundle/manifest.json, with explicit content types, no directory traversal, and digest-pinned build evidence.
Those proofs do not close the selected GCE Web UI path by themselves. The local
service proof
cloud-prod-remote-session-web-ui-l4-local-proof
is done: it runs remote-session-web-ui through the non-QEMU cloudboot socket
path using the Phase C userspace network stack and configured IPv4 route, not
the older QEMU-only kernel socket fixture or the host remote-session-ui
bridge.
After that, cloud-gce-private-self-hosted-webui-proof
proves private GCE reachability over the live NIC without public IP or public
firewall exposure.
cloud-gce-public-self-hosted-webui-ingress-tls
is the later public operator-access task; it remains on hold for explicit
public-ingress/TLS authorization even though the ingress policy design is
recorded.
Rollback is manifest/build-target selection: remove the focused target and the
remote-session-web-ui listener/asset grants while keeping the host-served
make remote-session-ui bridge and the remote-session CapSet gateway
unchanged.
Architecture
flowchart TD
Client[Host app: CLI, GUI, Tauri, or web gateway] -->|TCP/TLS + capnp-rpc| Gateway[RemoteSessionGateway]
Gateway --> Auth[Auth adapters]
Auth --> Sessions[SessionManager]
Gateway --> Broker[AuthorityBroker]
Broker --> Worker[Per-session RPC worker]
Broker --> Catalog[Remote service catalog]
Catalog --> Runner[Restricted service runner]
Runner --> GameServers[Game server processes]
Worker --> RemoteCapSet[RemoteCapSet]
RemoteCapSet --> Proxies[Remote capability proxies]
GameServers --> Proxies
Proxies --> LocalCaps[capOS capabilities]
Worker --> Audit[AuditLog]
The remote listener is a trusted gateway. In the final RPC shape it accepts
the transport, performs or delegates authentication, obtains a UserSession,
asks the broker for a remote-client bundle, and hosts a per-session RPC vat.
That vat exports a RemoteSession object and remote proxy objects for
capabilities in the broker-issued bundle. During the temporary dual-stack
period, the guest side still accepts DTO frames and the Linux host backend
hosts the first local proxy facade over those DTO calls.
For the first implementation the per-session worker may be an ordinary capOS service process. That shape matches the session-bound invariant: one workload process has one immutable session context. A single long-lived gateway may handle pre-auth connection state, but post-auth capability invocation should run inside a worker whose session context is the authenticated remote session, or through an equivalently reviewable dispatch path that cannot mix unrelated user sessions as ambient authority.
Bootstrap Interfaces
The DTO surface below is now pinned in schema/capos.capnp:
RemoteAuthStart, RemoteAuthStep, RemoteServiceGrantRequirement,
RemoteServiceExport, RemoteServiceProfile, plus the
RemoteSessionGateway, RemoteAuthFlow, RemoteSession,
RemoteCapSet, RemoteServiceCatalog, and RemoteServiceRunner
interfaces. Round-trip coverage for the new structs lives in
capos-config/tests/remote_capnp_rpc_dto_roundtrip.rs. The transport
that consumes them is still gated on the userspace async-runtime
decision (capnp-rpc v0.25 is std-only and needs a futures
executor). The first proxy slice is host-backend-only and dual-stack:
it uses capnp-rpc locally in the trusted Linux backend for Chat while
translating to the legacy RemoteGatewayRequest/RemoteGatewayResponse DTO
union on the gateway wire. The schema and generated bindings do not change for
that slice, and browser JavaScript still receives only view models, typed
results, typed denials, and redacted transcript rows.
enum RemoteAuthKind {
password @0;
publicKey @1;
oidcDeviceCode @2;
oidcAuthorizationCodePkce @3;
passkey @4;
mtlsClientCert @5;
guest @6;
anonymous @7;
serviceCredential @8;
}
struct RemoteAuthMethod {
kind @0 :RemoteAuthKind;
label @1 :Text;
profileHints @2 :List(Text);
interactive @3 :Bool;
enabled @4 :Bool;
}
struct RemoteAuthStart {
kind @0 :RemoteAuthKind;
selector @1 :LoginSelector;
requestedProfile @2 :Text;
clientNonce @3 :Data;
# Source metadata is intentionally not a client-supplied field.
# The gateway derives LoginSourceMetadata from the accepted socket
# and its own connection event id before calling
# SessionManager.login. A client-supplied source field would let
# remote callers forge audit metadata downstream services depend on.
}
struct RemoteAuthStep {
prompt @0 :Text;
redaction @1 :Bool;
url @2 :Text;
userCode @3 :Text;
challenge @4 :Data;
expiresAtMs @5 :UInt64;
}
interface RemoteSessionGateway {
authMethods @0 () -> (methods :List(RemoteAuthMethod));
start @1 (request :RemoteAuthStart) -> (flow :RemoteAuthFlow);
guest @2 (requestedProfile :Text) -> (session :RemoteSession);
anonymous @3 (requestedProfile :Text) -> (session :RemoteSession);
}
interface RemoteAuthFlow {
next @0 (response :Data) -> (step :RemoteAuthStep, done :Bool,
session :RemoteSession);
cancel @1 () -> ();
}
struct RemoteCapEntry {
name @0 :Text;
interfaceId @1 :UInt64;
transferPolicy @2 :Text;
leaseExpiresAtMs @3 :UInt64;
}
interface RemoteSession {
info @0 () -> (info :SessionInfo);
capSet @1 () -> (caps :RemoteCapSet);
renew @2 (proof :Data, requestedDurationMs :UInt64)
-> (session :RemoteSession);
logout @3 () -> ();
}
interface RemoteCapSet {
list @0 () -> (entries :List(RemoteCapEntry));
get @1 (name :Text, expectedInterfaceId :UInt64) -> (cap :AnyPointer);
}
struct RemoteServiceGrantRequirement {
name @0 :Text;
interfaceId @1 :UInt64;
transferMode @2 :Text;
holder @3 :Text; # backendHeld, serviceOwned, or clientFacet
}
struct RemoteServiceExport {
name @0 :Text;
interfaceId @1 :UInt64;
transferPolicy @2 :Text;
}
struct RemoteServiceProfile {
id @0 :Text;
label @1 :Text;
processGraph @2 :List(Text);
requirements @3 :List(RemoteServiceGrantRequirement);
exports @4 :List(RemoteServiceExport);
state @5 :Text; # unavailable, attachable, startable, running
}
struct RemoteServiceLaunchRequest {
profileId @0 :Text;
grantNames @1 :List(Text);
}
struct RemoteServiceCatalogEntry {
id @0 :Text;
label @1 :Text;
summary @2 :Text;
capName @3 :Text;
transportInterfaceId @4 :UInt64;
schemaInterface @5 :Text;
proxyStatus @6 :Text;
methods @7 :List(Text);
notes @8 :List(Text);
}
struct RemoteServiceLaunchStatus {
profileId @0 :Text;
status @1 :Text; # notLaunched, unsupported, denied, ready, running
launchSupported @2 :Bool;
message @3 :Text;
acceptedGrantNames @4 :List(Text);
exportedServices @5 :List(RemoteServiceCatalogEntry);
}
interface RemoteServiceCatalog {
list @0 () -> (profiles :List(RemoteServiceProfile));
}
interface RemoteServiceRunner {
probe @0 (request :RemoteServiceLaunchRequest)
-> (status :RemoteServiceLaunchStatus);
start @1 (request :RemoteServiceLaunchRequest)
-> (status :RemoteServiceLaunchStatus);
attach @2 (profileId :Text) -> (status :RemoteServiceLaunchStatus);
}
The AnyPointer result is proposal shorthand for an ordinary Cap’n Proto
capability pointer whose expected interface ID was already checked by the
gateway. Generated client helpers should immediately cast it to the requested
typed client. The remote client does not receive a numeric local capId,
endpoint selector, result-cap index, or session identifier it can replay
somewhere else.
The catalog and runner sketches are also proposal-level. They describe the
remote-facing contract, not the internal implementation. The completed launch
DTO/probe slice uses a serviceLaunch request/response arm for the
side-effect-free probe: RemoteServiceLaunchRequest carries only a profile id
plus explicit grant names, and RemoteServiceLaunchStatus reports a status
(notLaunched, unsupported, denied, ready, or running), launch support,
accepted grant names, and exported or planned service descriptors.
The current Adventure slice makes that serviceLaunch path a real restricted
backend launch for the default make run manifest, so Adventure may report
running and launchSupported=true after the approved server graph starts.
Paperclips remains a future launch profile. A capOS service runner may use
local spawn authority, BootPackage data, or broker-held service caps inside
capOS, but the remote client and browser/webview code receive only service
descriptors, launch requests, status results, denials, and remote capability
descriptors. Raw ProcessSpawner, process handles, endpoint owner caps, local
cap ids, and result-cap slots are not exposed.
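The probe contract can be sketched as a pure function over catalog policy: it reads an allowlist, starts nothing, and attaches nothing. The types below are simplified stand-ins for the schema structs, and `ProfilePolicy`, its allowlist field, and the grant names used in any call are assumptions made for illustration rather than the gateway's real policy model.

```rust
/// Launch states mirroring the status strings listed in the schema comment.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum LaunchState {
    NotLaunched,
    Unsupported,
    Denied,
    Ready,
    Running,
}

/// Remote-safe request: a profile id plus explicit grant names, nothing else.
pub struct LaunchRequest<'a> {
    pub profile_id: &'a str,
    pub grant_names: &'a [&'a str],
}

/// Hypothetical policy row for one cataloged profile.
pub struct ProfilePolicy {
    pub id: &'static str,
    pub launch_supported: bool,
    pub allowed_grants: &'static [&'static str],
}

#[derive(Debug, PartialEq)]
pub struct LaunchStatus {
    pub state: LaunchState,
    pub launch_supported: bool,
    pub accepted_grant_names: Vec<String>,
}

/// Side-effect-free probe: it only inspects policy and never reports Running.
pub fn probe(catalog: &[ProfilePolicy], request: &LaunchRequest) -> LaunchStatus {
    let status = |state, launch_supported, accepted: Vec<String>| LaunchStatus {
        state,
        launch_supported,
        accepted_grant_names: accepted,
    };
    let policy = match catalog.iter().find(|p| p.id == request.profile_id) {
        Some(policy) => policy,
        // Unknown profiles are denied rather than mapped to a broader default.
        None => return status(LaunchState::Denied, false, Vec::new()),
    };
    if !policy.launch_supported {
        return status(LaunchState::Unsupported, false, Vec::new());
    }
    // Every requested grant must appear on the profile's allowlist.
    if request.grant_names.iter().any(|g| !policy.allowed_grants.contains(g)) {
        return status(LaunchState::Denied, true, Vec::new());
    }
    let accepted = request.grant_names.iter().map(|g| g.to_string()).collect();
    status(LaunchState::Ready, true, accepted)
}
```

A real start call would share the same validation and only then hand the accepted grant names to the restricted runner, which keeps the probe and the launch from disagreeing about what is allowed.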
Service Catalog And Game Server Launch
The default make run story and the focused game proofs are intentionally
different:
- system.cue imports cue/defaults/defaults.cue, boot-launches standalone init, and lets init start chat-server, remote-session-capset-gateway, remote-session-web-ui, and the foreground shell. The default binary set includes Adventure server/NPC/client binaries and the terminal Paperclips binary. make run forwards guest port 8080 to a loopback host port and prints remote self-served UI: tcp 127.0.0.1 <port> -> guest :8080; the make run-default-web-ui target proves the capOS-served endpoint with browser automation. Adventure is not boot-started automatically; the current remote-session serviceLaunch slice starts adventure-server plus simple NPC companions through a restricted backend runner when requested. Paperclips landed in Path A + Path B as described below; the default make run manifest reports launchSupported=false / status=missingBinary for the paperclips launcher until Path C (the kernel-side AuthorityBroker allowlist extension and the on-wire DTO arm) lands.
- The default remote-session gateway is narrow. It has console, scoped TCP-listen authority for guest port 2327, SessionManager, and AuthorityBroker, plus narrowly approved backend launch authority for the Adventure profile; it does not expose raw ProcessSpawner, raw network manager/socket authority, endpoint owner handles, process handles, local cap ids, result-cap slots, or game service endpoint owner caps. The remote-session-web-ui service separately receives scoped TCP-listen authority for guest port 8080, SessionManager, AuthorityBroker, and console.
- make run-adventure uses system-adventure.cue to start chat-server, adventure-server, Adventure NPC companion processes, an adventure-scenario-test, and the shell. The Adventure server exports an Adventure endpoint, consumes a client facet of Chat, and owns room/player state keyed by live caller-session references.
- make run-paperclips uses system-paperclips.cue to start paperclips-server and paperclips-proof-server services exporting PaperclipsGame endpoints. The terminal client is then launched with explicit StdIO, game endpoint, timer, and optional proof_accelerator grants. The server owns generated content, game state, timer cadence, command specs, status snapshots, project entries, unlock checks, and game-rule mutation.
The remote UI should not treat those terminal transcripts as the product boundary. The staged path is:
- The broker advertises a remote service catalog for the authenticated session. The catalog is derived from manifest/default profiles and policy, and includes only services the remote profile may inspect, attach to, or start.
- The launch DTO/probe slice is complete. It defines a remote-safe launch request, status, and probe contract for cataloged profiles. It can report unsupported launch state, accepted grant names, a message, planned exported service descriptors, and denial status without side effects: no process starts and no new capabilities are attached.
- The current Adventure slice implements the restricted service-runner path for the default manifest. It starts the Adventure server plus simple NPC companion processes with explicit named grants, then returns launch status and remote descriptors for exported or broker-held caps. Process handles stay backend-local.
- The trusted Rust backend attaches those descriptors to the backend-held RemoteCapSet and drives typed calls. Browser JavaScript or a Tauri webview receives view models, launch/status forms, service descriptors, denials, and results, not raw capOS handles. The implemented DTO worker slices cover Chat.send plus Adventure status, look, inventory, and the first mutable bounded go(direction).
- The first UI panels can be generic: service list, start/attach, status, read-only Adventure controls, a bounded movement control, transcript, and denial details. Purpose-built Adventure and Paperclips clients can layer richer rendering and broader mutable game actions over the same service-runner and remote CapSet backend later; Paperclips does not have default-manifest remote launch support yet.
Operator commands should stay explicit:
make run
cargo run --manifest-path tools/remote-session-client/Cargo.toml \
--target x86_64-unknown-linux-gnu \
--bin remote-session-client -- --host 127.0.0.1 --port <printed-port>
CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-ui
Add --launch-adventure to the CLI command to start the default-manifest
Adventure graph through the restricted serviceLaunch path and require a
running status.
Add --adventure-status after --launch-adventure to require read-only
Adventure status, look, and inventory responses through the
session-bound worker path.
Add --adventure-go east after --launch-adventure to require the first
bounded mutable Adventure go(direction) response through that same worker
path.
The Tauri wrapper runs from this repository and reuses the same backend
boundary by loading the loopback remote-session-ui surface in a desktop
webview:
CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-tauri
That target checks Tauri CLI and Linux build prerequisites, reports
dependency/scaffold status, and runs a deterministic wrapper check by default.
Set CAPOS_REMOTE_SESSION_TAURI_MODE=dev to launch cargo tauri dev.
Missing host Tauri packages fail with explicit diagnostics and point operators
back to make remote-session-ui. The webview receives the same browser-safe
view models, events, denials, typed results, and redacted transcript rows as
the trusted local web bridge; the backend keeps the remote session and caps.
Bidirectional UI Composition
A conventional GUI program opens a window and owns the controls inside it. A remote capOS session does not need to be that limited. The host app can expose a session-scoped UI host capability to capOS, and capOS-side services or agents can use that capability to propose a better interface for the current task:
- Paperclips can ask for counters, project controls, and status charts instead of printing lines.
- Chat can ask for a channel list, unread badges, and a message pane.
- Adventure can ask for a map pane, inventory slots, command buttons, and room transcript.
- A diagnostics agent can open log, metric, and trace panes side by side, highlight the relevant capability calls, and change density for a debugging session.
- A teaching or accessibility agent can request larger type, simplified controls, or a guided task layout for a particular session.
The authority is explicit and separate from service authority. Holding Chat
does not let a service rewrite the user’s UI. Holding RemoteUiHost or a
narrow UiSurface facet lets the service propose bounded UI changes for the
current remote session. The host app remains the compositor and policy
enforcer.
Conceptual shape:
enum UiPatchKind {
openSurface @0;
closeSurface @1;
updateModel @2;
setLayoutHint @3;
setThemeHint @4;
addCommand @5;
removeCommand @6;
}
struct UiSurfaceSpec {
surfaceId @0 :Data;
title @1 :Text;
kind @2 :Text;
safetyClass @3 :Text;
modelSchema @4 :UInt64;
}
struct UiPatch {
kind @0 :UiPatchKind;
surfaceId @1 :Data;
payload @2 :Data;
expiresAtMs @3 :UInt64;
}
struct UiEvent {
surfaceId @0 :Data;
command @1 :Text;
payload @2 :Data;
userInitiated @3 :Bool;
}
interface RemoteUiHost {
open @0 (spec :UiSurfaceSpec) -> (surface :RemoteUiSurface);
theme @1 (scope :Text, hints :Data) -> ();
}
interface RemoteUiSurface {
apply @0 (patch :UiPatch) -> ();
poll @1 (maxEvents :UInt16) -> (events :List(UiEvent));
close @2 () -> ();
}
The payloads above should become typed structs before implementation. They are
shown as Data only to keep the sketch short. The important boundary is that
UI updates are declarative patches and typed view models, not arbitrary host
code. The host validates the requested surface kind, model schema, command
set, theme tokens, data size, update rate, and safety class before rendering
anything.
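A small part of that host-side validation (known surface, payload size, expiry, and update rate) might look like the sketch below. It covers only a subset of the checks named above, and `HostLimits`, `SurfaceState`, `Patch`, `PatchRejection`, and the rule that a patch whose expiry is not in the future is rejected are all assumptions for illustration.

```rust
/// Host-side limits; the concrete numbers are chosen by the host app.
pub struct HostLimits {
    pub max_payload_bytes: usize,
    pub max_patches_per_window: u32,
}

/// Per-surface bookkeeping kept by the host compositor.
pub struct SurfaceState {
    pub surface_id: Vec<u8>,
    pub patches_in_window: u32,
}

/// Simplified stand-in for the UiPatch struct.
pub struct Patch<'a> {
    pub surface_id: &'a [u8],
    pub payload: &'a [u8],
    pub expires_at_ms: u64,
}

#[derive(Debug, PartialEq)]
pub enum PatchRejection {
    UnknownSurface,
    PayloadTooLarge,
    Expired,
    RateLimited,
}

/// Validate a patch before rendering. Checks run from cheapest to most
/// stateful, and the rate counter only advances for accepted patches.
pub fn validate_patch(
    limits: &HostLimits,
    surfaces: &mut [SurfaceState],
    patch: &Patch,
    now_ms: u64,
) -> Result<(), PatchRejection> {
    let surface = surfaces
        .iter_mut()
        .find(|s| s.surface_id.as_slice() == patch.surface_id)
        .ok_or(PatchRejection::UnknownSurface)?;
    if patch.payload.len() > limits.max_payload_bytes {
        return Err(PatchRejection::PayloadTooLarge);
    }
    if patch.expires_at_ms <= now_ms {
        return Err(PatchRejection::Expired);
    }
    if surface.patches_in_window >= limits.max_patches_per_window {
        return Err(PatchRejection::RateLimited);
    }
    surface.patches_in_window += 1;
    Ok(())
}
```

Schema, command-set, theme-token, and safety-class checks would slot in as further early returns once the payloads are typed structs rather than Data.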
This is still a remote CapSet client model:
host UI holds RemoteSession + RemoteCapSet
host UI grants a narrow RemoteUiHost/RemoteUiSurface cap to a trusted worker
capOS service or agent sends declarative UI patches through that cap
host UI renders and sends typed user events back
service effects still require ordinary service caps
The direction is therefore bidirectional but not symmetric. The host app can call capOS service caps. capOS can shape the session UI only through UI caps the host granted. Neither side gains ambient authority over the other.
Safety rules:
- Host chrome, login prompts, origin indicators, permission prompts, and emergency reset controls are reserved. capOS-rendered surfaces cannot spoof them.
- UI patches are session-scoped. Persistent layout/theme changes require an explicit profile/settings cap or user confirmation.
- Theme and look/feel changes use bounded tokens or validated design-system variables, not raw CSS injection.
- UI command descriptors are data; executing a command still calls a typed capability under the current session policy.
- The user can close, reset, or pin surfaces against agent rearrangement.
- UI updates are quota-bound and auditable when they materially affect workflow, consent, disclosure, or action execution.
- Browser front ends keep raw capOS caps server-side or in a Tauri/native Rust backend. Browser JavaScript receives rendered state and sends user events; it does not hold RemoteCapSet entries.
This is the broader version of the WebShell idea. A web shell can be more than a terminal emulator: it can be a session workspace whose composition is negotiated by the capabilities present in the session. The terminal remains one surface in that workspace, not the only surface.
Authentication And Admission
Authentication adapters all produce the same output: a UserSession plus
profile inputs for the broker. They differ only in how the proof is obtained
and verified.
- Password: maps to the existing SessionManager.login(method, selector, proof, source) path when remote password login is enabled by policy. It must use the existing credential failure/backoff/audit rules and must not be the only supported remote method.
- Public key: maps to SessionManager.sshPublicKey or a generalized signature-auth method. SSH userauth and raw remote RPC public-key auth can share account/key records, but the transcript bytes must be domain-separated by protocol and channel binding.
- OIDC/OAuth: device-code flow fits headless or CLI clients; authorization code + PKCE fits browser-assisted clients. The OAuth/OIDC service verifies ID tokens and maps external subjects through the user-identity admission model before SessionManager mints a session.
- Passkey/WebAuthn: belongs behind the web-authenticator path. A remote native client may open a browser or use a platform authenticator, but raw authenticator secrets never become capOS app data.
- mTLS client certificate: TLS client-auth can identify a principal or pseudonymous subject through certificate policy. Certificate identity is an admission input; the resulting CapSet still comes from the broker.
- Guest and anonymous: explicit policy profiles. They are not fallbacks for missing credentials and should receive short leases and narrow bundles. Guest admission is currently surfaced through the bridge as an explicit AuthMode::Guest option (/api/login/guest, CLI --guest); the gateway enforces the requestedProfile == "guest" and principal.kind == Guest invariants before broker dispatch via the validate_guest_admission helper, and refuses with RemoteErrorCode::AuthenticationDenied and the redacted "guest login denied" message regardless of which policy branch fired. When the manifest has no guest seed account the gateway returns RemoteErrorCode::DisabledAuthMethod so the bridge can distinguish a manifest-disabled method from a credential failure. Guest sessions surface only the configured display name ("Remote Guest") and principal_kind enum label to the bridge; the seeded principal id bytes are never disclosed through the bridge transcript or API envelope.
- Service/workload credentials: future non-human clients can authenticate with OAuth client credentials, token exchange, mTLS, or signed workload assertions. They receive service-profile bundles, not human shell bundles.
Every method must record source metadata and protocol/channel binding appropriate to its transport. A successful proof selects a principal and session; it does not directly grant service authority.
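The guest admission rules described in the list above can be sketched as a single check that returns the same redacted denial for every policy branch. This is an illustrative reconstruction, not the real validate_guest_admission helper: the function name `check_guest_admission`, its signature, the local `PrincipalKind` and `RemoteErrorCode` stand-ins, and the "guest login disabled" message are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PrincipalKind {
    User,
    Guest,
    Anonymous,
}

/// Local stand-in for the two error codes named in the text.
#[derive(Debug, PartialEq)]
pub enum RemoteErrorCode {
    AuthenticationDenied,
    DisabledAuthMethod,
}

#[derive(Debug, PartialEq)]
pub struct AdmissionDenial {
    pub code: RemoteErrorCode,
    pub message: &'static str,
}

/// `guest_seed` is the principal kind of the manifest's guest seed account,
/// or None when the manifest defines no guest seed account.
pub fn check_guest_admission(
    guest_seed: Option<PrincipalKind>,
    requested_profile: &str,
) -> Result<(), AdmissionDenial> {
    let kind = match guest_seed {
        Some(kind) => kind,
        // Manifest-disabled method: distinguishable from a credential failure.
        None => {
            return Err(AdmissionDenial {
                code: RemoteErrorCode::DisabledAuthMethod,
                message: "guest login disabled", // hypothetical wording
            })
        }
    };
    // Both invariants collapse into one redacted denial so the caller
    // cannot learn which policy branch fired.
    if requested_profile != "guest" || kind != PrincipalKind::Guest {
        return Err(AdmissionDenial {
            code: RemoteErrorCode::AuthenticationDenied,
            message: "guest login denied",
        });
    }
    Ok(())
}
```

Keeping the two failing branches byte-identical is what lets the bridge show a denial without leaking whether the profile or the principal kind was at fault.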
Remote CapSet Semantics
A local process starts with a read-only CapSet page plus local cap-table
entries. A remote client instead receives a live RemoteCapSet object:
- list returns names, interface IDs, display metadata, and lease summaries.
- get returns a typed RPC capability pointer only if the name exists and the expected interface ID matches.
- The returned object is a proxy owned by the remote-session worker.
- Dropping the remote object releases the worker’s hold edge when no other remote references remain.
- Logout, expiry, revocation, disconnect, or worker shutdown breaks all session-bound proxy objects. The current DTO gateway implements kernel-backed explicit logout and owned-session connection teardown; full live proxy object-drop/revocation behavior remains future work.
This is still an actual session bundle. It is not a copy of the kernel’s local CapSet ABI. The remote representation exists because a Linux process has no capOS ring page, no capOS CapSet mapping, and no local cap table.
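The lookup semantics can be modeled worker-side as a name table plus a liveness flag, where every failure is an explicit denial. This is a simplified sketch: `RemoteCapSetSketch`, `Proxy`, and `GetDenial` are hypothetical stand-ins for the worker's real proxy table, and lease metadata is omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum GetDenial {
    StaleSession,
    UnknownCap,
    WrongInterface,
}

/// Worker-owned proxy stand-in. It carries no local cap id, endpoint
/// selector, or table position that a client could replay elsewhere.
#[derive(Debug, PartialEq)]
pub struct Proxy {
    pub interface_id: u64,
}

pub struct RemoteCapSetSketch {
    live: bool,
    entries: HashMap<String, u64>,
}

impl RemoteCapSetSketch {
    pub fn new(entries: &[(&str, u64)]) -> Self {
        let entries = entries
            .iter()
            .map(|(name, id)| (name.to_string(), *id))
            .collect();
        RemoteCapSetSketch { live: true, entries }
    }

    /// Names and interface ids, sorted for stable display.
    pub fn list(&self) -> Vec<(String, u64)> {
        let mut out: Vec<_> = self
            .entries
            .iter()
            .map(|(name, id)| (name.clone(), *id))
            .collect();
        out.sort();
        out
    }

    /// Return a proxy only if the session is live, the name exists, and
    /// the expected interface id matches. There is no fallback cap.
    pub fn get(&self, name: &str, expected_interface_id: u64) -> Result<Proxy, GetDenial> {
        if !self.live {
            return Err(GetDenial::StaleSession);
        }
        match self.entries.get(name) {
            None => Err(GetDenial::UnknownCap),
            Some(id) if *id != expected_interface_id => Err(GetDenial::WrongInterface),
            Some(id) => Ok(Proxy { interface_id: *id }),
        }
    }

    /// Logout breaks the whole bundle; later lookups see a stale session.
    pub fn logout(&mut self) {
        self.live = false;
        self.entries.clear();
    }
}
```

In the RPC shape the `Ok` arm would hand back a live capability pointer rather than a struct, but the three denial arms are the same ones the focused proof exercises (unknown cap, wrong interface, stale session).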
Invocation Context
Remote capability calls should look like ordinary calls to the target service:
remote client call
-> capnp-rpc message
-> per-session worker proxy
-> local capOS capability call
-> target service sees the worker's live session context
The remote client cannot choose service-visible subject identity. Request fields are ordinary data. If a service needs subject details, it uses the existing subject-disclosure policy: explicit request plus a matching service-scoped disclosure grant. By default it receives only the opaque service-scoped caller-session reference used by the session-bound invocation model.
Error And Lifetime Model
The remote path keeps the existing error split:
- Cap’n Proto RPC transport errors and broken connections become RPC exceptions or disconnected promises.
- Proxy/worker infrastructure failures become CapException-like capability exceptions.
- Domain outcomes remain schema result fields or unions.
- A missing cap name, interface mismatch, denied profile, stale session, or revoked lease is an observable denial, not a silent fallback to a broader service.
Open promises must fail when the remote session logs out or the connection is closed. The worker must release local caps on every close path.
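One way to keep the error split visible in host code is to keep domain results on the `Ok` side and reserve the `Err` side for transport, infrastructure, and denial failures. The sketch below assumes hypothetical names (`ChatSent`, `CallFailure`, `Denial`, `classify`, `fail_open_calls`) and assumes that a logout surfaces as a stale-session denial while a dropped connection surfaces as a disconnect.

```rust
/// Domain result: a refusal by the service is still a completed call.
#[derive(Debug, PartialEq)]
pub struct ChatSent(pub bool);

/// Observable denials; none of them falls back to a broader service.
#[derive(Debug, PartialEq)]
pub enum Denial {
    MissingCap,
    InterfaceMismatch,
    DeniedProfile,
    StaleSession,
    RevokedLease,
}

#[derive(Debug, PartialEq)]
pub enum CallFailure {
    /// Transport layer: broken connection or disconnected promise.
    Disconnected,
    /// Proxy/worker infrastructure failure.
    CapException(String),
    /// Policy or lifetime denial.
    Denied(Denial),
}

/// Name the layer an outcome belongs to, for transcripts or diagnostics.
pub fn classify(outcome: &Result<ChatSent, CallFailure>) -> &'static str {
    match outcome {
        Ok(ChatSent(true)) => "domain: sent",
        Ok(ChatSent(false)) => "domain: refused by service",
        Err(CallFailure::Disconnected) => "transport: disconnected",
        Err(CallFailure::CapException(_)) => "infrastructure: capability exception",
        Err(CallFailure::Denied(_)) => "denial: fail closed",
    }
}

/// When the session ends, every still-open call resolves to a failure
/// instead of hanging.
pub fn fail_open_calls(open_call_ids: &[u32], closed_by_logout: bool) -> Vec<(u32, CallFailure)> {
    open_call_ids
        .iter()
        .map(|id| {
            let failure = if closed_by_logout {
                CallFailure::Denied(Denial::StaleSession)
            } else {
                CallFailure::Disconnected
            };
            (*id, failure)
        })
        .collect()
}
```

Under this shape a `chatSent(false)` result never travels through the error channel, which matches the focused chat proof's treatment of service-domain denials as schema results.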
Relationship To Shells And Gateways
Remote session CapSet clients are a peer of shell transports:
- Native shell: a local capOS process that uses its local CapSet and ring. It can later expose a schema-aware REPL over the same capabilities a remote client sees, but the remote client does not need to spawn a shell.
- SSH shell: a production CLI terminal transport. It authenticates and launches capos-shell with a TerminalSession. It should not become the only way for external programs to call typed services.
- WebShellGateway: browser terminal, webapp, and agent UI transport. Browser JavaScript must not receive raw capOS caps; the gateway can use the remote session CapSet model server-side and expose terminal frames, view models, command descriptors, or bounded tool requests to the browser. This is close to the same mental model as a “web shell”, except the shell is not the required protocol. The web UI can present service-specific controls over the same session CapSet, and capOS-side services can adjust the session workspace through UI composition caps. A remote CapSet web UI can be built before the full WebShellGateway by omitting terminal delegation, shell-runner policy, and agent execution; it is just another host client of the remote session bundle.
- Tauri or desktop GUI: the Rust/native backend may hold the remote RemoteSession and typed capability clients, while the UI layer receives rendered state, command descriptors, and user-intent events. The UI layer should not receive replayable capOS authority as data. The backend may grant narrow UI-surface caps back to capOS services so they can propose adaptive layouts without gaining arbitrary desktop control.
- Agent shell: the agent runner holds session caps server-side and presents tool descriptors to the model. A hosted agent can use the same remote session bundle shape as long as actual capOS invocations remain in the trusted worker.
- Interactive command surfaces: command metadata can be one of the granted capabilities. A remote client can render command specs directly instead of scripting text through a shell.
Authority Rules
- The gateway receives scoped listener/TLS/auth/session/broker/audit authority, not raw broad network or spawn authority.
- Post-auth workers receive only the broker-issued remote-client bundle plus proxy lifecycle authority.
- Default remote bundles should be narrower than operator shell bundles.
- Raw ProcessSpawner, unrestricted NetworkManager, key-vault, credential store, broad account store, broad storage root, and host debug caps require explicit elevated policy.
- Remote proxyable caps must declare transfer/lifetime policy. Local-only caps may appear in a local shell CapSet without being exportable through RemoteCapSet.
- Capability names are lookup conveniences. Interface ID and broker policy define whether a returned object is usable for the requested type.
- Replayable handles are forbidden. Session IDs, grant IDs, endpoint metadata, object epochs, and proxy table positions are not bearer tokens.
Design Grounding
- Session-Bound Invocation Context defines the one-session-per-process invariant and privacy-preserving endpoint caller-session metadata.
- User Identity and Policy defines principals, sessions, profiles, admission sources, renewal, and brokered CapSet minting.
- Boot to Shell defines the existing CredentialStore/SessionManager/AuthorityBroker path and non-password login directions.
- SSH Shell Gateway, Certificates and TLS, and OIDC and OAuth2 define public-key, TLS/mTLS, and federated admission inputs.
- capos-service defines the service lifecycle shape needed for listener loops, per-session context, shutdown, drain, and metrics.
- Capability-Based Service Architecture defines the broader service taxonomy, capability layering, and init/spawn boundary the gateway, per-session workers, and restricted service runner reuse. The default make run gateway, the Adventure service-runner path, and the Paperclips Path B worker plumbing inherit the process-startup, attenuation, and HTTP-capability rules described there; Path C will extend the broker allowlist surface in the same authority frame.
- Remote Session UI Security defines the web-security posture for the loopback remote-session-ui bridge and its Tauri desktop wrapper – per-browser BrowserSession cookies, CSRF/CSP/cookie discipline, first-wins ownership, local HTTP parser bounds, and Tauri capability minimization – that the trusted Rust backend in this proposal exposes to the browser. Both proposals reference each other; this proposal owns the upstream remote-session CapSet wire and host-client shape, while that proposal owns the browser-facing authority boundary.
- R17 – Remote-session UI bridge and Tauri wrapper are research-only routes long-horizon residual risk (distributable packaging, desktop automation, non-loopback exposure) back to this proposal and the remote-session UI security proposal. Non-loopback remote-session UI exposure must remain blocked until that production posture is accepted by the corresponding review-finding task.
- Interactive Command Surfaces defines typed command sessions that can be rendered by remote clients.
- Browser Capability and Agent Web Sessions defines browser-side authority boundaries and gateway mediation for web UI sessions.
- Language Models and Agent Runtime defines agent runners, tool proxies, and browser-agent UI orchestration boundaries.
- Cloudflare, Cap’n Proto, Workers RPC, and Cap’n Web grounds production object-capability RPC, live object bindings, and remote resource-exhaustion discipline.
- Spritely, OCapN, and CapTP grounds distributed object-capability lifetime, promise, reference, and handoff questions while staying non-binding for capOS wire compatibility.
- Cap’n Proto Error Handling grounds the exception-versus-domain-result split that the host-backend facade and eventual gateway RPC transport must preserve.
Implementation Shape
The first implementation is deliberately small:
- Keep the existing capnp-chat-interop service and harness as the transport starting point, but rename the target outcome in planning docs to remote session CapSet interop. Done.
- Add generated Linux Rust bindings for the relevant schema subset. Done.
- Add a host client library that connects through QEMU user TCP. Done with a
schema-framed DTO transport; replacing it with standard
capnp-rpcframing and live proxy objects remains the next transport step. - Add a capOS gateway that supports one policy-enabled auth method plus
explicit guest/anonymous behavior. Done for password, anonymous, and
guest, with disabled public-key, OIDC, and passkey/WebAuthn method
entries advertised. Guest admission ships with a dedicated
RemoteGatewayRequest.guestLoginarm, thevalidate_guest_admissionbroker-side enforcement helper that pins therequestedProfile == "guest"plusprincipal.kind == Guestinvariants, and aRemoteErrorCode::DisabledAuthMethodpath so the bridge can distinguish a manifest-disabled method from a credential failure. - Return remote session summary, CapSet list, and typed
`get` metadata. Done as DTOs.
- Call at least two capabilities from the bundle. Done for
`session`, `system_info`, the worker-backed `Chat.send` path, and Adventure `status`/`look`/`inventory`/`go(direction)` after `serviceLaunch`. The focused chat proof also shows that a service-domain denial remains a schema `chatSent(false)` result and that `chat-server` sees bounded session-bound caller metadata through disclosure policy. Broader Adventure methods, Paperclips methods, live proxy objects, and object-level release/drop lifecycle remain future work.
- Prove a missing cap, wrong interface ID, wrong profile, stale session, and
logout path fail closed. Done for the focused proof, including a
kernel-backed
`UserSession.logout` call and owned-session disconnect propagation in the DTO gateway; full release, live proxy object-drop, renewal, and revocation propagation remains future work.
- Add a first host UI client over the current UI-neutral Rust client. Done for
a trusted local web bridge with a loopback browser UI and Rust backend that
holds the remote session state. It covers endpoint configuration, auth
methods, login, session summary, CapSet inspection,
`sessionInfo`, `systemMotd`, denial probes, logout, stale-call proof, redacted transcript export, and a focused browser automation proof. The repo-local Tauri wrapper now checks or launches the same loopback backend/webview boundary; distributable packaging remains later. The UI remains separate from WebShell and does not include a terminal emulator, shell-runner policy, or agent execution.
- Define the launch DTO/probe shape after the read-only remote service
catalog. Done: this slice defines a remote-safe launch request, launch
status, and side-effect-free probe so the CLI/web backend can render forms
and denials for Adventure/Paperclips profiles. It deliberately does not
start processes, create endpoint owners, attach caps, or expose raw
`ProcessSpawner`, process handles, endpoint owner handles, local cap ids, result-cap slots, or browser-held capOS capabilities.
- Implement the actual restricted Adventure service-runner path. Done: the
default-manifest Adventure profile starts
`adventure-server` plus simple NPC companion processes and attaches or retains the resulting Adventure/chat descriptors/caps in the backend-held remote CapSet. Paperclips landed in two halves. Path A added the read-side `RemotePaperclips*` DTO schema (`RemotePaperclipsCommandResult`, `RemotePaperclipsCommandList`, `RemotePaperclipsProjectList`, `RemotePaperclipsStatusSnapshot`, `RemotePaperclipsEvent`, `RemotePaperclipsProjectStatus`, `RemotePaperclipsEventKind`, and the single-command `RemotePaperclipsCommand` input DTO) in `schema/capos.capnp`, with bounded wire-roundtrip coverage in `capos-config/tests/remote_paperclips_dto_roundtrip.rs`. Path B added the dedicated `demos/remote-session-paperclips-worker/` crate mirroring the Adventure worker shape, the gateway `SessionWorkerKind::Paperclips` enum variant with matching `SessionWorkerSet` arms and `spawn_paperclips_graph`/`build_paperclips_service_launch`/`fill_paperclips_launcher`/`paperclips_catalog_status` helpers, a manifest-static `game` endpoint slot on the gateway capset, bridge `RequestKind::PaperclipsInitial/Command/Status/Projects` synthesis from cached `serviceLaunch` state (the on-wire control plane lands in Path C), UI launch slot plus status chip with paired smoke automation (`paperclipsLaunchVisible`/`paperclipsStatus`/`paperclipsStatusObserved`), the `system-remote-session-paperclips.cue` focused manifest, and the `make run-remote-session-paperclips-vm`/`make run-remote-session-paperclips-ui` gates. Raw `ProcessSpawner`, process owner handles, endpoint owner caps, local cap ids, result-cap slots, and browser-held capOS capabilities stay out of the remote contract. Process handles stay backend-local. Adventure `status`/`look`/`inventory` controls and the first mutable bounded `go(direction)` use the session-bound worker pattern; Paperclips Path B uses the same worker shape with bridge-side response synthesis until Path C lands the wire-level DTO arm and the broker allowlist grants for the default manifest.
Broader Adventure controls, Path C wire/broker extension, and rich Paperclips client implementations remain later.
- Replace the bounded
`make remote-session-tauri` preflight with the actual repo-local Tauri wrapper over the same Rust backend. Done for check/dev mode: `CAPOS_REMOTE_SESSION_PORT=<printed-port> make remote-session-tauri` validates the wrapper scaffold and host prerequisites, and `CAPOS_REMOTE_SESSION_TAURI_MODE=dev` launches the wrapper through `cargo tauri dev`. Distributable packaging remains gated on reviewed sidecar/backend lifecycle handling.
- Add the first typed proxy layer as a host-backend-only temporary
dual-stack. Done for
`Chat`: `tools/remote-session-client/` hosts a local `capnp-rpc` facade that translates backend-held proxy calls to the existing DTO gateway protocol while keeping schema/generated bindings, gateway wire shape, and browser authority unchanged. The later gateway rewrite must provide standard `capnp-rpc` framing, typed remote proxy objects, exception mapping, release/drop handling, and resource bounds before the bespoke DTO service path can be retired.
- Layer richer service clients on top of the same backend boundary. The
first richer client is a session-summary diff: a pure Rust helper in
`tools/remote-session-client/src/session_diff.rs` compares two snapshots of the session view (CapSet entries plus `SessionInfoSummary`) and returns typed `CapSetDiff`/`SessionSummaryFieldDiff` records keyed on `(name, interface_id)` and on visible session fields. Renewals or policy rebinding surface as `policy_changed` rather than `removed` + `added`. The trusted web bridge stores the raw snapshots backend-side and exposes `/api/call/session-diff-refresh`, which returns a redacted `SessionSummaryDiffVm`. Browser JavaScript receives only that view model: added/removed cap entries by `(name, interfaceIdHex, transferPolicy, leaseExpiresAtMs)`, policy/lease changes, redacted session-id changes, and a summary string. The first call after login captures a baseline (`hasBaseline=false`); subsequent calls return the diff against the previous snapshot. The browser renders the diff in a dedicated “Last refresh diff” pane on the Session view; raw `session_id_hex`, replayable cap handles, and kernel session ids stay backend-side. The focused UI smoke clicks “Refresh & Show Diff” twice and asserts both the no-baseline and post-baseline shapes. Two backend host tests cover the baseline + no-change path and the added-cap + expiry-change path.
- Add a separate UI-composition proof only after the basic session proof:
grant a narrow test
`RemoteUiSurface`, accept one declarative patch, send one typed user event back, and prove the service cannot spoof trusted chrome or persist layout state without the relevant cap.
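The session-summary diff described in the list above keys CapSet entries on `(name, interface_id)` so that a renewal or policy rebinding shows up as one change record instead of a removal plus an addition. A minimal sketch of that keying rule follows. It is not the code in `session_diff.rs`: `CapEntry`, `CapChange`, `diff_capsets`, and the field names are assumptions made for the example, and the real `CapSetDiff` records may carry different fields.

```rust
use std::collections::BTreeMap;

/// Hypothetical snapshot row; field names are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CapEntry {
    pub name: String,
    pub interface_id: u64,
    pub transfer_policy: String,
    pub lease_expires_at_ms: Option<u64>,
}

/// Hypothetical change record, one per differing `(name, interface_id)` key.
#[derive(Debug, PartialEq, Eq)]
pub enum CapChange {
    Added(CapEntry),
    Removed(CapEntry),
    PolicyChanged { before: CapEntry, after: CapEntry },
}

/// Compare two CapSet snapshots keyed on `(name, interface_id)`.
/// Removals and policy changes come first in key order, then additions.
pub fn diff_capsets(before: &[CapEntry], after: &[CapEntry]) -> Vec<CapChange> {
    let key = |e: &CapEntry| (e.name.clone(), e.interface_id);
    let old: BTreeMap<_, _> = before.iter().map(|e| (key(e), e)).collect();
    let new: BTreeMap<_, _> = after.iter().map(|e| (key(e), e)).collect();

    let mut changes = Vec::new();
    for (k, &o) in &old {
        match new.get(k) {
            // Key vanished: the cap was removed.
            None => changes.push(CapChange::Removed(o.clone())),
            // Same key, different policy or lease: one change record.
            Some(&n) if n != o => changes.push(CapChange::PolicyChanged {
                before: o.clone(),
                after: n.clone(),
            }),
            // Same key, identical entry: nothing to report.
            Some(_) => {}
        }
    }
    for (k, &n) in &new {
        if !old.contains_key(k) {
            changes.push(CapChange::Added(n.clone()));
        }
    }
    changes
}
```

Because the key ignores policy and lease fields, a lease renewal on `session` yields a single `PolicyChanged` record, and diffing a snapshot against itself yields an empty list.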
Later slices can add more auth adapters, TLS, renewal, browser-assisted auth, service credentials, UI composition surfaces, promise pipelining, and distributed GC.
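The fail-closed denials listed in the implementation steps (missing cap, wrong interface ID, stale session after logout) can be illustrated with a small backend-side lookup. This is a sketch under stated assumptions rather than the gateway's code: `RemoteCapSetView`, `Denial`, `resolve`, and the numeric interface ids are hypothetical, and the wrong-profile case is left out.

```rust
use std::collections::HashMap;

/// Illustrative denial reasons; every failure path returns one of these
/// instead of falling through to a call.
#[derive(Debug, PartialEq, Eq)]
pub enum Denial {
    StaleSession,
    UnknownCap,
    WrongInterface,
}

/// Hypothetical backend-held view of a granted CapSet: name -> interface id.
pub struct RemoteCapSetView {
    session_live: bool,
    caps: HashMap<String, u64>,
}

impl RemoteCapSetView {
    pub fn new(entries: &[(&str, u64)]) -> Self {
        let caps = entries
            .iter()
            .map(|(name, id)| (name.to_string(), *id))
            .collect();
        RemoteCapSetView { session_live: true, caps }
    }

    /// After logout, every later lookup must be denied.
    pub fn logout(&mut self) {
        self.session_live = false;
    }

    /// Resolve a named cap for a requested interface, denying by default.
    pub fn resolve(&self, name: &str, interface_id: u64) -> Result<u64, Denial> {
        if !self.session_live {
            return Err(Denial::StaleSession);
        }
        let granted = *self.caps.get(name).ok_or(Denial::UnknownCap)?;
        if granted != interface_id {
            return Err(Denial::WrongInterface);
        }
        Ok(granted)
    }
}
```

The session check comes first so that a logged-out client learns nothing about which names exist in the former CapSet.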
Visual Design Handoff
The host UI visual language is anchored on two Claude Design handoffs:
- The original
`capOS Login` bundle (delivered 2026-05-02 13:26 UTC). Only the CSS tokens and design intent were ported into the production UI; the prototype is not kept in-tree.
- The
`capOS Workspace` bundle (delivered 2026-05-02, see `tools/remote-session-client/ui/design-bundle/`). Covers the post-login workspace shell, chat list, active group chat with embedded approval cards, active DM with E2E lock + fingerprint card, active call (collapsed banner + full-pane), stage room, and a “start sheet” with the four ocap-clean entry flows (open DM from contact card, redeem invite, browse directory, start ephemeral chat). Unlike the Login prototype, this bundle is kept in-tree as reference at `tools/remote-session-client/ui/design-bundle/` and includes conversation transcripts, HTML prototypes, JSX components, and the unique theme assets. See its `CAPOS-INTEGRATION.md` for the bundle-to-live-UI mapping and the iteration-7 prerequisite (CSP refactor + per-browser `BrowserSession` cookie before any inline scripts/styles from the prototype reach production).
Both bundles ship four themes (Space, Mountain, Light, Operator)
and a consistent token system (themes.jsx in either bundle is
authoritative for palette / typography / radii / blur). The
branding assets actually shipped under branding/ were copied
into tools/remote-session-client/ui/assets/ for the bridge to
serve; the prototype’s reference imagery is kept only in the
in-tree design-bundle directory.
What landed in tools/remote-session-client/ui/:
- Vanilla CSS rewrite of
`styles.css` around the design’s theme tokens. No React, no Babel, no third-party CDN script tags. Trust boundary stays intact: the loopback bridge serves only static assets.
- `index.html` restructured to the design’s hero + auth-card + footer layout with mobile responsiveness, an Operator dashed inner frame (`capos://auth` label), and the original `data-test` surface fully preserved so `make run-remote-session-capset-ui` still passes.
- A trusted-static feature flag block (
`window.CAPOS_UI_FEATURES`, overridable via `?features=`) gates surfaces that are scaffolded but not yet backed by the Rust gateway. Default flags match what the current backend honours.
Surfaces scaffolded but flag-gated off by default (no functional support in capOS yet; future tracks will wire them):
- Passkey sign-in (
`?features=passkey`). Tracks `docs/proposals/boot-to-shell-proposal.md` (passkey/WebAuthn, credential setup) and `docs/proposals/cryptography-and-key-management-proposal.md`.
- OIDC / SSO providers (
`?features=sso` for Google/GitHub/Okta). Tracks `docs/proposals/oidc-and-oauth2-proposal.md`. The trusted Rust backend must own the provider integration; browser JavaScript must continue to receive only view models, results, and denials.
- MFA second-factor step (
`?features=mfa`). Tracks `docs/proposals/boot-to-shell-proposal.md`. The 6-digit input animates end-to-end as a UI demo today; production wiring is a future slice.
- Success step (
`?features=successStep`). The current Rust backend transitions straight to the workspace on session start; the success card is design-parity scaffolding for a future mid-step surface.
- Capability-grant consent strip. Removed from the design itself during iteration (the user concluded it demonstrated the wrong thing for capOS); kept in the deferred list because a future consent-on-grant flow for OAuth-style external identities would re-use the same visual language.
Surfaces flag-gated on by default but UI-only today (decorative state without a backend round-trip):
- System status pill, Region pill, Language pill, Footer, Hero panel, Remember-device checkbox, Forgot-password link, Password show/hide toggle.
Constraints the visual layer must keep across future slices:
- Login is a dedicated OS-like screen with a visible username field and
no full persistent technical header. Resource profile names such as
`operator` are not user-typed system details.
- Browser login sends username/password only. The username field is empty
by default: the browser UI does not pre-fill from
`CAPOS_REMOTE_SESSION_USER`, host `USER`, or any other host-local identity hint, because that would disclose account hints before authentication.
- Authenticated users land in a compact Services-first workspace where Session, CapSet, Diagnostics, and Transcript are separate views. The UI smoke harness must continue to fail if any visible button is not exercised; new flag-gated buttons must stay hidden by default so the smoke surface does not grow without paired automation coverage.
- No third-party CDN script tags or runtime frameworks are added to the
trusted UI. Theme switching uses the existing
`data-theme` attribute on `<html>`/`<body>`; CSS variables flip the design tokens.