Proposal: Remote Session CapSet Clients

Let a regular host application connect to a capOS instance, authenticate through the same session machinery as shells and gateways, receive a broker-issued remote view of its CapSet, and invoke the granted capabilities over standard Cap’n Proto RPC. The first proof can be a Linux Rust CLI because it is easy to script, but the design is for host applications generally: native GUI apps, Tauri apps with Rust backends, server-side webapp gateways, desktop tools, and agent runners can all consume the same remote session CapSet model.

The important correction is that this is not a special “remote chat client” and not another shell transport. Chat, Paperclips, Adventure, system-info, command surfaces, and future service APIs should be ordinary capabilities in a remote session bundle. A shell is one possible client of that bundle; it is not the universal protocol.

Current State

The tree has two interop proofs, listed here with their host-side tooling:

  • demos/capnp-chat-interop runs inside capOS, accepts one scoped TCP connection, decodes a schema-framed Chat.send parameter message, calls the resident chat endpoint, returns a schema-framed result, and exits.
  • The host harness uses a Linux Python script plus the pinned capnp tool to encode/decode request and result messages.
  • demos/remote-session-capset-gateway runs inside capOS, listens through a manifest-scoped TcpListenAuthority on guest port 2327, authenticates a remote session through SessionManager, returns a broker-shaped remote CapSet view, calls session/system-info DTO operations, and proves wrong-interface, unknown-cap, and stale-session denials. It derives login source metadata from the accepted socket and a gateway-generated connection event id.
  • tools/remote-session-client is a regular Linux Rust client crate. Its library is UI-neutral so the same client logic can back a CLI harness, native GUI, Tauri backend, or trusted web gateway.

Those proofs are useful because they show external Cap’n Proto data can cross the QEMU TCP boundary and reach capOS-hosted services through narrowed listener caps. The remote-session proof is the first target-shaped slice, but it is not the final RPC API. It still lacks:

  • standard capnp-rpc message transport;
  • live typed RPC proxy objects rather than DTO-mediated gateway operations;
  • endpoint-backed service proxy calls running in an authenticated per-session worker context;
  • complete object lifetime and exception behavior;
  • kernel/user-session logout, disconnect, and revocation propagation beyond the gateway-local stale flag;
  • TLS/mTLS and expanded auth adapters beyond password and anonymous;
  • resource accounting for remote references, in-flight calls, and result sizes.

Goals

  • Support a normal host client built and run outside capOS. A Linux Rust CLI is the smallest harness; native GUI and Tauri/webapp-backed clients should not need a different capOS protocol.
  • Authenticate through capOS session/admission services, not through an application-specific service secret.
  • Support multiple admission methods: local password where policy enables it, public-key signatures, OIDC/OAuth browser or device flows, passkey/WebAuthn through the web gateway path, mTLS client identity, guest/anonymous profiles where explicitly enabled, and future service/workload credentials.
  • Return a live remote CapSet view whose entries are typed RPC client objects, not serialized local cap-table slots.
  • Let the client call any granted remote-proxyable capability by name and expected interface ID.
  • Support bidirectional session UI composition: a host UI can call capOS capabilities, and capOS-side services or agents can propose bounded changes to the host session’s panes, command palette, visualizations, density, theme, and workflow-specific controls through explicit UI capabilities.
  • Keep local-only authority local: cap IDs, endpoint generations, receiver selectors, session-global identifiers, and kernel result-cap indexes never become portable remote authority.
  • Preserve session-bound invocation context. Remote calls run under the gateway/worker session created for that remote client.
  • Make logout, disconnect, transport breakage, session expiry, policy revocation, and object drop observable and fail closed.

Non-Goals

  • General network transparency across arbitrary capOS hosts.
  • OCapN compatibility or third-party handoffs.
  • Browser JavaScript receiving capOS capability objects directly. A webapp may be a front end, but a trusted server, gateway, or Tauri Rust backend holds the remote CapSet.
  • Letting capOS services execute arbitrary host UI code, inject unreviewed JavaScript/CSS, spoof trusted browser/desktop chrome, or persist UI changes outside the granted session UI scope.
  • Replacing SSH, WebShellGateway, native shell, or interactive command surfaces.
  • Exposing raw ProcessSpawner, raw network factories, broad storage roots, or key material as a default remote bundle.
  • Treating password authentication as the only or preferred remote path.
  • Serializing the kernel CapSet page or local cap table to the client.

Architecture

flowchart TD
    Client[Host app: CLI, GUI, Tauri, or web gateway] -->|TCP/TLS + capnp-rpc| Gateway[RemoteSessionGateway]
    Gateway --> Auth[Auth adapters]
    Auth --> Sessions[SessionManager]
    Gateway --> Broker[AuthorityBroker]
    Broker --> Worker[Per-session RPC worker]
    Worker --> RemoteCapSet[RemoteCapSet]
    RemoteCapSet --> Proxies[Remote capability proxies]
    Proxies --> LocalCaps[capOS capabilities]
    Worker --> Audit[AuditLog]

The remote listener is a trusted gateway. It accepts the transport, performs or delegates authentication, obtains a UserSession, asks the broker for a remote-client bundle, and hosts a per-session RPC vat. The vat exports a RemoteSession object and remote proxy objects for capabilities in the broker-issued bundle.

For the first implementation the per-session worker may be an ordinary capOS service process. That shape matches the session-bound invariant: one workload process has one immutable session context. A single long-lived gateway may handle pre-auth connection state, but post-auth capability invocation should run inside a worker whose session context is the authenticated remote session, or through an equivalently reviewable dispatch path that cannot mix unrelated user sessions as ambient authority.
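The split between pre-auth connection state and a post-auth worker can be sketched with a consuming transition. This is a minimal model, not the gateway's actual code: `PreAuthConnection`, `SessionWorker`, `SessionLabel`, `AdmitError`, and the `proof_ok` flag are all hypothetical names, and `proof_ok` stands in for whatever the auth adapter and SessionManager decide.

```rust
/// Opaque label for the authenticated session (hypothetical; not a bearer token).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SessionLabel(String);

impl SessionLabel {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Pre-auth connection state held by the long-lived gateway.
pub struct PreAuthConnection {
    pub peer: String,
}

/// Post-auth worker. The session context is set once at construction
/// and no method replaces it afterwards.
pub struct SessionWorker {
    session: SessionLabel,
    granted: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AdmitError {
    AuthFailed,
}

impl PreAuthConnection {
    /// Consumes the pre-auth state, so one connection yields at most one worker.
    /// `proof_ok` is an assumption standing in for the real admission result.
    pub fn admit(
        self,
        proof_ok: bool,
        session: &str,
        bundle: &[&str],
    ) -> Result<SessionWorker, AdmitError> {
        if !proof_ok {
            return Err(AdmitError::AuthFailed);
        }
        Ok(SessionWorker {
            session: SessionLabel(session.to_string()),
            granted: bundle.iter().map(|name| name.to_string()).collect(),
        })
    }
}

impl SessionWorker {
    pub fn session(&self) -> &SessionLabel {
        &self.session
    }

    /// Only names in the broker-issued bundle are invocable from this worker.
    pub fn can_invoke(&self, cap_name: &str) -> bool {
        self.granted.iter().any(|granted| granted == cap_name)
    }
}
```

Because `admit` takes `self` by value, the type system rules out reusing one pre-auth connection for a second session, which is one way to keep unrelated sessions from mixing in a single dispatch path.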

Bootstrap Interfaces

These schema sketches are proposal-level. Ordinals must be assigned from the checked-in schema when implemented.

enum RemoteAuthKind {
  password @0;
  publicKey @1;
  oidcDeviceCode @2;
  oidcAuthorizationCodePkce @3;
  passkey @4;
  mtlsClientCert @5;
  guest @6;
  anonymous @7;
  serviceCredential @8;
}

struct RemoteAuthMethod {
  kind @0 :RemoteAuthKind;
  label @1 :Text;
  profileHints @2 :List(Text);
  interactive @3 :Bool;
}

struct RemoteAuthStart {
  kind @0 :RemoteAuthKind;
  selector @1 :LoginSelector;
  requestedProfile @2 :Text;
  clientNonce @3 :Data;
  source @4 :LoginSourceMetadata;
}

struct RemoteAuthStep {
  prompt @0 :Text;
  redaction @1 :Bool;
  url @2 :Text;
  userCode @3 :Text;
  challenge @4 :Data;
  expiresAtMs @5 :UInt64;
}

interface RemoteSessionGateway {
  authMethods @0 () -> (methods :List(RemoteAuthMethod));
  start @1 (request :RemoteAuthStart) -> (flow :RemoteAuthFlow);
  guest @2 (requestedProfile :Text, source :LoginSourceMetadata)
      -> (session :RemoteSession);
  anonymous @3 (requestedProfile :Text, source :LoginSourceMetadata)
      -> (session :RemoteSession);
}

interface RemoteAuthFlow {
  next @0 (response :Data) -> (step :RemoteAuthStep, done :Bool,
      session :RemoteSession);
  cancel @1 () -> ();
}

struct RemoteCapEntry {
  name @0 :Text;
  interfaceId @1 :UInt64;
  transferPolicy @2 :Text;
  leaseExpiresAtMs @3 :UInt64;
}

interface RemoteSession {
  info @0 () -> (info :SessionInfo);
  capSet @1 () -> (caps :RemoteCapSet);
  renew @2 (proof :Data, requestedDurationMs :UInt64)
      -> (session :RemoteSession);
  logout @3 () -> ();
}

interface RemoteCapSet {
  list @0 () -> (entries :List(RemoteCapEntry));
  get @1 (name :Text, expectedInterfaceId :UInt64) -> (cap :AnyPointer);
}

The AnyPointer result is proposal shorthand for an ordinary Cap’n Proto capability pointer whose expected interface ID was already checked by the gateway. Generated client helpers should immediately cast it to the requested typed client. The remote client does not receive a numeric local capId, endpoint selector, result-cap index, or session identifier it can replay somewhere else.
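The gateway-side check behind `get` can be modelled in a few lines. This is a sketch under assumptions: `RemoteCapSetModel`, `ProxyHandle`, and `GetDenied` are illustrative names, and `ProxyHandle` stands in for the typed RPC client a real helper would return after the cast.

```rust
use std::collections::HashMap;

/// Why a lookup was refused. Variant names are illustrative.
#[derive(Debug, PartialEq, Eq)]
pub enum GetDenied {
    UnknownCap,
    WrongInterface,
}

/// Stand-in for a live proxy object; it carries no local capId or selector.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProxyHandle {
    pub interface_id: u64,
}

/// In-memory model of the broker-issued bundle behind RemoteCapSet.
pub struct RemoteCapSetModel {
    entries: HashMap<String, ProxyHandle>,
}

impl RemoteCapSetModel {
    pub fn new() -> Self {
        RemoteCapSetModel { entries: HashMap::new() }
    }

    /// Records a granted capability under a lookup name.
    pub fn grant(&mut self, name: &str, interface_id: u64) {
        self.entries
            .insert(name.to_string(), ProxyHandle { interface_id });
    }

    /// Returns a proxy only when the name exists and the interface ID matches.
    pub fn get(&self, name: &str, expected_interface_id: u64) -> Result<ProxyHandle, GetDenied> {
        let proxy = self.entries.get(name).ok_or(GetDenied::UnknownCap)?;
        if proxy.interface_id != expected_interface_id {
            return Err(GetDenied::WrongInterface);
        }
        Ok(proxy.clone())
    }
}
```

Both failure cases return a distinct denial rather than a default object, matching the wrong-interface and unknown-cap denials the current gateway proof already exercises.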

Bidirectional UI Composition

A conventional GUI program opens a window and owns the controls inside it. A remote capOS session does not need to be that limited. The host app can expose a session-scoped UI host capability to capOS, and capOS-side services or agents can use that capability to propose a better interface for the current task:

  • Paperclips can ask for counters, project controls, and status charts instead of printing lines.
  • Chat can ask for a channel list, unread badges, and a message pane.
  • Adventure can ask for a map pane, inventory slots, command buttons, and room transcript.
  • A diagnostics agent can open log, metric, and trace panes side by side, highlight the relevant capability calls, and change density for a debugging session.
  • A teaching or accessibility agent can request larger type, simplified controls, or a guided task layout for a particular session.

The authority is explicit and separate from service authority. Holding Chat does not let a service rewrite the user’s UI. Holding RemoteUiHost or a narrow UiSurface facet lets the service propose bounded UI changes for the current remote session. The host app remains the compositor and policy enforcer.

Conceptual shape:

enum UiPatchKind {
  openSurface @0;
  closeSurface @1;
  updateModel @2;
  setLayoutHint @3;
  setThemeHint @4;
  addCommand @5;
  removeCommand @6;
}

struct UiSurfaceSpec {
  surfaceId @0 :Data;
  title @1 :Text;
  kind @2 :Text;
  safetyClass @3 :Text;
  modelSchema @4 :UInt64;
}

struct UiPatch {
  kind @0 :UiPatchKind;
  surfaceId @1 :Data;
  payload @2 :Data;
  expiresAtMs @3 :UInt64;
}

struct UiEvent {
  surfaceId @0 :Data;
  command @1 :Text;
  payload @2 :Data;
  userInitiated @3 :Bool;
}

interface RemoteUiHost {
  open @0 (spec :UiSurfaceSpec) -> (surface :RemoteUiSurface);
  theme @1 (scope :Text, hints :Data) -> ();
}

interface RemoteUiSurface {
  apply @0 (patch :UiPatch) -> ();
  poll @1 (maxEvents :UInt16) -> (events :List(UiEvent));
  close @2 () -> ();
}

The payloads above should become typed structs before implementation. They are shown as Data only to keep the sketch short. The important boundary is that UI updates are declarative patches and typed view models, not arbitrary host code. The host validates the requested surface kind, model schema, command set, theme tokens, data size, update rate, and safety class before rendering anything.
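A host-side validator for incoming patches might look like the following. It is a sketch only: `HostUiPolicy`, `PatchRejected`, and the rule that a patch is stale once `expires_at_ms` is not in the future are assumptions, and it covers only kind, surface, size, and expiry, not model schema, theme tokens, update rate, or safety class.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UiPatchKind {
    OpenSurface,
    CloseSurface,
    UpdateModel,
    SetLayoutHint,
    SetThemeHint,
    AddCommand,
    RemoveCommand,
}

/// Mirrors the UiPatch sketch; the payload stays opaque bytes here.
pub struct UiPatch {
    pub kind: UiPatchKind,
    pub surface_id: Vec<u8>,
    pub payload: Vec<u8>,
    pub expires_at_ms: u64,
}

/// Hypothetical per-session policy held by the host compositor.
pub struct HostUiPolicy {
    pub allowed_kinds: Vec<UiPatchKind>,
    pub open_surfaces: Vec<Vec<u8>>,
    pub max_payload_bytes: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PatchRejected {
    KindNotAllowed,
    UnknownSurface,
    PayloadTooLarge,
    Expired,
}

impl HostUiPolicy {
    /// Validates a patch before anything is rendered; the first failing rule wins.
    pub fn check(&self, patch: &UiPatch, now_ms: u64) -> Result<(), PatchRejected> {
        if !self.allowed_kinds.contains(&patch.kind) {
            return Err(PatchRejected::KindNotAllowed);
        }
        if !self.open_surfaces.contains(&patch.surface_id) {
            return Err(PatchRejected::UnknownSurface);
        }
        if patch.payload.len() > self.max_payload_bytes {
            return Err(PatchRejected::PayloadTooLarge);
        }
        // Assumption: a patch whose expiry is not in the future is dropped.
        if patch.expires_at_ms <= now_ms {
            return Err(PatchRejected::Expired);
        }
        Ok(())
    }
}
```

The validator returns a reason rather than silently ignoring a patch, so a rejected proposal can be audited or reported back to the service.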

This is still a remote CapSet client model:

host UI holds RemoteSession + RemoteCapSet
host UI grants a narrow RemoteUiHost/RemoteUiSurface cap to a trusted worker
capOS service or agent sends declarative UI patches through that cap
host UI renders and sends typed user events back
service effects still require ordinary service caps

The direction is therefore bidirectional but not symmetric. The host app can call capOS service caps. capOS can shape the session UI only through UI caps the host granted. Neither side gains ambient authority over the other.

Safety rules:

  • Host chrome, login prompts, origin indicators, permission prompts, and emergency reset controls are reserved. capOS-rendered surfaces cannot spoof them.
  • UI patches are session-scoped. Persistent layout/theme changes require an explicit profile/settings cap or user confirmation.
  • Theme and look/feel changes use bounded tokens or validated design-system variables, not raw CSS injection.
  • UI command descriptors are data; executing a command still calls a typed capability under the current session policy.
  • The user can close, reset, or pin surfaces against agent rearrangement.
  • UI updates are quota-bound and auditable when they materially affect workflow, consent, disclosure, or action execution.
  • Browser front ends keep raw capOS caps server-side or in a Tauri/native Rust backend. Browser JavaScript receives rendered state and sends user events; it does not hold RemoteCapSet entries.

This is the broader version of the WebShell idea. A web shell can be more than a terminal emulator: it can be a session workspace whose composition is negotiated by the capabilities present in the session. The terminal remains one surface in that workspace, not the only surface.

Authentication And Admission

Authentication adapters all produce the same output: a UserSession plus profile inputs for the broker. They differ only in how the proof is obtained and verified.

  • Password: maps to the existing SessionManager.login(method, selector, proof, source) path when remote password login is enabled by policy. It must use the existing credential failure/backoff/audit rules and must not be the only supported remote method.
  • Public key: maps to SessionManager.sshPublicKey or a generalized signature-auth method. SSH userauth and raw remote RPC public-key auth can share account/key records, but the transcript bytes must be domain-separated by protocol and channel binding.
  • OIDC/OAuth: device-code flow fits headless or CLI clients; authorization code + PKCE fits browser-assisted clients. The OAuth/OIDC service verifies ID tokens and maps external subjects through the user-identity admission model before SessionManager mints a session.
  • Passkey/WebAuthn: belongs behind the web-authenticator path. A remote native client may open a browser or use a platform authenticator, but raw authenticator secrets never become capOS app data.
  • mTLS client certificate: TLS client-auth can identify a principal or pseudonymous subject through certificate policy. Certificate identity is an admission input; the resulting CapSet still comes from the broker.
  • Guest and anonymous: explicit policy profiles. They are not fallbacks for missing credentials and should receive short leases and narrow bundles.
  • Service/workload credentials: future non-human clients can authenticate with OAuth client credentials, token exchange, mTLS, or signed workload assertions. They receive service-profile bundles, not human shell bundles.

Every method must record source metadata and protocol/channel binding appropriate to its transport. A successful proof selects a principal and session; it does not directly grant service authority.
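The "same output, different proof" shape of the adapters can be expressed as a trait. Everything in this sketch is hypothetical: `AuthAdapter`, `AdmissionOutput`, `FixedProofAdapter`, `broker_bundle`, and the profile names are invented for illustration, and `FixedProofAdapter` is a toy stand-in for real verification (it does a plain byte comparison and is not a credential check).

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct UserSession {
    pub principal: String,
}

/// What every adapter hands to the broker: a session plus profile input.
#[derive(Debug, PartialEq, Eq)]
pub struct AdmissionOutput {
    pub session: UserSession,
    pub profile: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    Disabled,
    Rejected,
}

pub trait AuthAdapter {
    fn admit(&self, proof: &[u8]) -> Result<AdmissionOutput, AuthError>;
}

/// Guest admission is an explicit policy switch, not a fallback.
pub struct GuestAdapter {
    pub enabled: bool,
}

impl AuthAdapter for GuestAdapter {
    fn admit(&self, _proof: &[u8]) -> Result<AdmissionOutput, AuthError> {
        if !self.enabled {
            return Err(AuthError::Disabled);
        }
        Ok(AdmissionOutput {
            session: UserSession { principal: "guest".to_string() },
            profile: "guest".to_string(),
        })
    }
}

/// Toy verifier for illustration only; real adapters verify signatures,
/// tokens, or certificates and apply backoff and audit rules.
pub struct FixedProofAdapter {
    pub principal: String,
    pub expected_proof: Vec<u8>,
}

impl AuthAdapter for FixedProofAdapter {
    fn admit(&self, proof: &[u8]) -> Result<AdmissionOutput, AuthError> {
        if proof != self.expected_proof.as_slice() {
            return Err(AuthError::Rejected);
        }
        Ok(AdmissionOutput {
            session: UserSession { principal: self.principal.clone() },
            profile: "operator".to_string(),
        })
    }
}

/// The bundle is chosen by broker policy from the profile, never by the proof.
pub fn broker_bundle(admission: &AdmissionOutput) -> Vec<&'static str> {
    match admission.profile.as_str() {
        "guest" => vec!["session"],
        "operator" => vec!["session", "system_info"],
        _ => Vec::new(),
    }
}
```

Keeping `broker_bundle` separate from the adapters mirrors the rule that a successful proof selects a principal and session but does not directly grant service authority.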

Remote CapSet Semantics

A local process starts with a read-only CapSet page plus local cap-table entries. A remote client instead receives a live RemoteCapSet object:

  • list returns names, interface IDs, display metadata, and lease summaries.
  • get returns a typed RPC capability pointer only if the name exists and the expected interface ID matches.
  • The returned object is a proxy owned by the remote-session worker.
  • Dropping the remote object releases the worker’s hold edge when no other remote references remain.
  • Logout, expiry, revocation, disconnect, or worker shutdown breaks all session-bound proxy objects.

This is still an actual session bundle. It is not a copy of the kernel’s local CapSet ABI. The remote representation exists because a Linux process has no capOS ring page, no capOS CapSet mapping, and no local cap table.
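The drop and logout rules can be illustrated with a single-process model built on reference counting. This is a sketch under assumptions: `Worker`, `RemoteProxy`, `LocalHold`, and `CallError` are hypothetical names, and in the real design the references live on the far side of an RPC connection rather than in one address space.

```rust
use std::cell::Cell;
use std::rc::{Rc, Weak};

/// Shared liveness flag for one remote session.
pub struct SessionState {
    alive: Cell<bool>,
}

/// Stands for the worker's hold edge on one local capOS capability.
pub struct LocalHold {
    pub cap_name: String,
}

/// Model of a remote reference; clones model additional remote references.
#[derive(Clone)]
pub struct RemoteProxy {
    session: Rc<SessionState>,
    hold: Rc<LocalHold>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    SessionBroken,
}

impl RemoteProxy {
    /// Every call re-checks session liveness, so a broken session fails closed.
    pub fn call(&self) -> Result<String, CallError> {
        if !self.session.alive.get() {
            return Err(CallError::SessionBroken);
        }
        Ok(format!("called {}", self.hold.cap_name))
    }
}

pub struct Worker {
    session: Rc<SessionState>,
    holds: Vec<Weak<LocalHold>>,
}

impl Worker {
    pub fn new() -> Self {
        Worker {
            session: Rc::new(SessionState { alive: Cell::new(true) }),
            holds: Vec::new(),
        }
    }

    /// Exports a proxy and tracks its hold edge weakly.
    pub fn export(&mut self, cap_name: &str) -> RemoteProxy {
        let hold = Rc::new(LocalHold { cap_name: cap_name.to_string() });
        self.holds.push(Rc::downgrade(&hold));
        RemoteProxy { session: Rc::clone(&self.session), hold }
    }

    /// A hold edge stays live while at least one remote reference exists.
    pub fn live_holds(&self) -> usize {
        self.holds.iter().filter(|hold| hold.strong_count() > 0).count()
    }

    /// Logout breaks every proxy bound to this session at once.
    pub fn logout(&self) {
        self.session.alive.set(false);
    }
}
```

In this model the hold edge is released only when the last clone is dropped, and a proxy that outlives logout still fails on its next call.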

Invocation Context

Remote capability calls should look like ordinary calls to the target service:

remote client call
  -> capnp-rpc message
  -> per-session worker proxy
  -> local capOS capability call
  -> target service sees the worker's live session context

The remote client cannot choose service-visible subject identity. Request fields are ordinary data. If a service needs subject details, it uses the existing subject-disclosure policy: explicit request plus a matching service-scoped disclosure grant. By default it receives only the opaque service-scoped caller-session reference used by the session-bound invocation model.
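A small model makes the "request fields are ordinary data" rule concrete. The names here (`WorkerSession`, `InvocationContext`, `CallerSessionRef`, `claimed_subject`) are assumptions made for the sketch; the point is only that the context a service sees is derived from the worker's session, never from the request body.

```rust
/// Opaque, service-scoped reference to the caller's session.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CallerSessionRef(pub u64);

/// A request as sent by the remote client. Any identity it claims is just data.
pub struct Request {
    pub claimed_subject: Option<String>,
    pub body: String,
}

/// What the target service observes for one call.
#[derive(Debug, PartialEq, Eq)]
pub struct InvocationContext {
    pub caller: CallerSessionRef,
    pub subject: Option<String>,
}

/// Per-session worker state fixed at admission time.
pub struct WorkerSession {
    caller_ref: CallerSessionRef,
    principal: String,
    disclosure_granted: bool,
}

impl WorkerSession {
    pub fn new(caller_ref: u64, principal: &str, disclosure_granted: bool) -> Self {
        WorkerSession {
            caller_ref: CallerSessionRef(caller_ref),
            principal: principal.to_string(),
            disclosure_granted,
        }
    }

    /// Builds the service-visible context. The request is deliberately unused:
    /// subject details require both a service request and a disclosure grant.
    pub fn context_for(
        &self,
        _request: &Request,
        service_requests_subject: bool,
    ) -> InvocationContext {
        let subject = if service_requests_subject && self.disclosure_granted {
            Some(self.principal.clone())
        } else {
            None
        };
        InvocationContext { caller: self.caller_ref, subject }
    }
}
```

A client that sets `claimed_subject` to another user's name changes nothing about the resulting context in this model.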

Error And Lifetime Model

The remote path keeps the existing error split:

  • Cap’n Proto RPC transport errors and broken connections become RPC exceptions or disconnected promises.
  • Proxy/worker infrastructure failures become CapException-like capability exceptions.
  • Domain outcomes remain schema result fields or unions.
  • A missing cap name, interface mismatch, denied profile, stale session, or revoked lease is an observable denial, not a silent fallback to a broader service.

Open promises must fail when the remote session logs out or the connection is closed. The worker must release local caps on every close path.
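One way to picture the error split and the "open promises must fail" rule is a client-side table of in-flight calls. This is a hedged sketch: `RemoteCallError`, `Denial`, `CloseReason`, and `PendingCalls` are hypothetical names, and mapping logout to a stale-session denial and disconnect to a transport error is an assumption about how the two close paths would surface.

```rust
/// Observable denials; none of them fall back to a broader service.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Denial {
    MissingCap,
    InterfaceMismatch,
    DeniedProfile,
    StaleSession,
    RevokedLease,
}

/// Non-domain failures. Domain outcomes stay inside the Ok result type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RemoteCallError {
    Transport(String),
    Infrastructure(String),
    Denied(Denial),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CloseReason {
    Logout,
    Disconnected,
}

/// Tracks in-flight call ids so every close path can fail them.
pub struct PendingCalls {
    open: Vec<u32>,
}

impl PendingCalls {
    pub fn new() -> Self {
        PendingCalls { open: Vec::new() }
    }

    pub fn start(&mut self, call_id: u32) {
        self.open.push(call_id);
    }

    pub fn open_count(&self) -> usize {
        self.open.len()
    }

    /// Resolves every open call with an error and leaves nothing pending.
    pub fn fail_all(&mut self, reason: CloseReason) -> Vec<(u32, RemoteCallError)> {
        let error = match reason {
            CloseReason::Logout => RemoteCallError::Denied(Denial::StaleSession),
            CloseReason::Disconnected => {
                RemoteCallError::Transport("connection closed".to_string())
            }
        };
        self.open
            .drain(..)
            .map(|call_id| (call_id, error.clone()))
            .collect()
    }
}
```

Draining the table in one place gives each close path the same guarantee: no call is left waiting on a session that no longer exists.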

Relationship To Shells And Gateways

Remote session CapSet clients are a peer of shell transports:

  • Native shell: a local capOS process that uses its local CapSet and ring. It can later expose a schema-aware REPL over the same capabilities a remote client sees, but the remote client does not need to spawn a shell.
  • SSH shell: a production CLI terminal transport. It authenticates and launches capos-shell with a TerminalSession. It should not become the only way for external programs to call typed services.
  • WebShellGateway: browser terminal, webapp, and agent UI transport. Browser JavaScript must not receive raw capOS caps; the gateway can use the remote session CapSet model server-side and expose terminal frames, view models, command descriptors, or bounded tool requests to the browser. This is close to the “web shell” mental model, except that the shell is not the required protocol. The web UI can present service-specific controls over the same session CapSet, and capOS-side services can adjust the session workspace through UI composition caps. A remote CapSet web UI can be built before the full WebShellGateway by omitting terminal delegation, shell-runner policy, and agent execution; it is just another host client of the remote session bundle.
  • Tauri or desktop GUI: the Rust/native backend may hold the RemoteSession and typed capability clients, while the UI layer receives rendered state, command descriptors, and user-intent events. The UI layer should not receive replayable capOS authority as data. The backend may grant narrow UI-surface caps back to capOS services so they can propose adaptive layouts without gaining arbitrary desktop control.
  • Agent shell: the agent runner holds session caps server-side and presents tool descriptors to the model. A hosted agent can use the same remote session bundle shape as long as actual capOS invocations remain in the trusted worker.
  • Interactive command surfaces: command metadata can be one of the granted capabilities. A remote client can render command specs directly instead of scripting text through a shell.

Authority Rules

  • The gateway receives scoped listener/TLS/auth/session/broker/audit authority, not raw broad network or spawn authority.
  • Post-auth workers receive only the broker-issued remote-client bundle plus proxy lifecycle authority.
  • Default remote bundles should be narrower than operator shell bundles.
  • Raw ProcessSpawner, unrestricted NetworkManager, key-vault, credential store, broad account store, broad storage root, and host debug caps require explicit elevated policy.
  • Remote proxyable caps must declare transfer/lifetime policy. Local-only caps may appear in a local shell CapSet without being exportable through RemoteCapSet.
  • Capability names are lookup conveniences. Interface ID and broker policy define whether a returned object is usable for the requested type.
  • Replayable handles are forbidden. Session IDs, grant IDs, endpoint metadata, object epochs, and proxy table positions are not bearer tokens.
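The export rules above can be read as a filter from a local bundle to its remote view. This sketch assumes a three-way `TransferPolicy` enum and a boolean `elevated_policy` switch, neither of which appears in the schema sketches (where `transferPolicy` is Text); the entry names are examples only.

```rust
/// Hypothetical transfer classes a capability could declare.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransferPolicy {
    /// Usable in a local shell CapSet, never exported remotely.
    LocalOnly,
    /// Exportable through RemoteCapSet under default policy.
    RemoteProxyable,
    /// Exportable only when explicit elevated policy is in force.
    RemoteElevated,
}

pub struct BundleEntry {
    pub name: &'static str,
    pub interface_id: u64,
    pub policy: TransferPolicy,
}

/// Computes which names may appear in the remote view of a bundle.
pub fn remote_view(bundle: &[BundleEntry], elevated_policy: bool) -> Vec<&'static str> {
    bundle
        .iter()
        .filter(|entry| match entry.policy {
            TransferPolicy::LocalOnly => false,
            TransferPolicy::RemoteProxyable => true,
            TransferPolicy::RemoteElevated => elevated_policy,
        })
        .map(|entry| entry.name)
        .collect()
}
```

With this shape, a default remote bundle is narrower than the local one by construction, and widening it requires flipping an explicit policy input rather than adding a special case.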

Design Grounding

Implementation Shape

The first implementation is deliberately small:

  1. Keep the existing capnp-chat-interop service and harness as the transport starting point, but rename the target outcome in planning docs to remote session CapSet interop. Done.
  2. Add generated Linux Rust bindings for the relevant schema subset. Done.
  3. Add a host client library that connects through QEMU user TCP. Done with a schema-framed DTO transport; replacing it with standard capnp-rpc framing and live proxy objects remains the next transport step.
  4. Add a capOS gateway that supports one policy-enabled auth method plus explicit guest/anonymous behavior. Done for password and anonymous, with disabled public-key, OIDC, and passkey/WebAuthn method entries advertised.
  5. Return remote session summary, CapSet list, and typed get metadata. Done as DTOs.
  6. Call at least two capabilities from the bundle. Done for session and system_info; endpoint-backed services such as chat remain blocked on the per-session worker proxy.
  7. Prove a missing cap, wrong interface ID, wrong profile, stale session, and logout path fail closed. Done for the focused proof; full disconnect, release, and revocation propagation remains future work.
  8. Add a first host UI client over the current UI-neutral Rust client. This can be a Tauri app or a trusted local web bridge, and should cover endpoint configuration, auth methods, login, session summary, CapSet inspection, sessionInfo, systemMotd, denial probes, logout, stale-call proof, and redacted transcript export. It is separate from WebShell and does not need a terminal emulator, shell-runner policy, or agent execution.
  9. Replace the DTO transport with standard capnp-rpc, typed remote proxy objects, exception mapping, release/drop handling, and resource bounds.
  10. Add a separate UI-composition proof only after the basic session proof: grant a narrow test RemoteUiSurface, accept one declarative patch, send one typed user event back, and prove the service cannot spoof trusted chrome or persist layout state without the relevant cap.

Later slices can add more auth adapters, TLS, renewal, browser-assisted auth, service credentials, UI composition surfaces, promise pipelining, and distributed GC.