# Proposal: Remote Session CapSet Clients

Let a regular host application connect to a capOS instance, authenticate
through the same session machinery as shells and gateways, receive a
broker-issued remote view of its CapSet, and invoke the granted capabilities
over standard Cap'n Proto RPC. The first proof can be a Linux Rust CLI because
it is easy to script, but the design is for host applications generally:
native GUI apps, Tauri apps with Rust backends, server-side webapp gateways,
desktop tools, and agent runners can all consume the same remote session
CapSet model.

The important correction is that this is not a special "remote chat client" and
not another shell transport. Chat, Paperclips, Adventure, system-info, command
surfaces, and future service APIs should be ordinary capabilities in a remote
session bundle. A shell is one possible client of that bundle; it is not the
universal protocol.

## Current State

The tree has two interop proofs, each with a capOS-side service and a host-side
client:

- `demos/capnp-chat-interop` runs inside capOS, accepts one scoped TCP
  connection, decodes a schema-framed `Chat.send` parameter message, calls the
  resident chat endpoint, returns a schema-framed result, and exits.
- The host harness uses a Linux Python script plus the pinned `capnp` tool to
  encode/decode request and result messages.
- `demos/remote-session-capset-gateway` runs inside capOS, listens through a
  manifest-scoped `TcpListenAuthority` on guest port `2327`, authenticates a
  remote session through `SessionManager`, returns a broker-shaped remote
  CapSet view, calls session/system-info DTO operations, and proves
  wrong-interface, unknown-cap, and stale-session denials. It derives login
  source metadata from the accepted socket and a gateway-generated connection
  event id.
- `tools/remote-session-client` is a regular Linux Rust client crate. Its
  library is UI-neutral so the same client logic can back a CLI harness, native
  GUI, Tauri backend, or trusted web gateway.

Those proofs are useful because they show external Cap'n Proto data can cross
the QEMU TCP boundary and reach capOS-hosted services through narrowed listener
caps. The remote-session proof is the first target-shaped slice, but it is not
the final RPC API. It still lacks:

- standard `capnp-rpc` message transport;
- live typed RPC proxy objects rather than DTO-mediated gateway operations;
- endpoint-backed service proxy calls running in an authenticated per-session
  worker context;
- complete object lifetime and exception behavior;
- kernel/user-session logout, disconnect, and revocation propagation beyond the
  gateway-local stale flag;
- TLS/mTLS and expanded auth adapters beyond password and anonymous;
- resource accounting for remote references, in-flight calls, and result
  sizes.

## Goals

- Support a normal host client built and run outside capOS. A Linux Rust CLI is
  the smallest harness; native GUI and Tauri/webapp-backed clients should not
  need a different capOS protocol.
- Authenticate through capOS session/admission services, not through an
  application-specific service secret.
- Support multiple admission methods: local password where policy enables it,
  public-key signatures, OIDC/OAuth browser or device flows, passkey/WebAuthn
  through the web gateway path, mTLS client identity, guest/anonymous profiles
  where explicitly enabled, and future service/workload credentials.
- Return a live remote CapSet view whose entries are typed RPC client objects,
  not serialized local cap-table slots.
- Let the client call any granted remote-proxyable capability by name and
  expected interface ID.
- Support bidirectional session UI composition: a host UI can call capOS
  capabilities, and capOS-side services or agents can propose bounded
  changes to the host session's panes, command palette, visualizations,
  density, theme, and workflow-specific controls through explicit UI
  capabilities.
- Keep local-only authority local: cap IDs, endpoint generations, receiver
  selectors, session-global identifiers, and kernel result-cap indexes never
  become portable remote authority.
- Preserve session-bound invocation context. Remote calls run under the
  gateway/worker session created for that remote client.
- Make logout, disconnect, transport breakage, session expiry, policy
  revocation, and object drop observable and fail closed.

## Non-Goals

- General network transparency across arbitrary capOS hosts.
- OCapN compatibility or third-party handoffs.
- Browser JavaScript receiving capOS capability objects directly. A webapp may
  be a front end, but a trusted server, gateway, or Tauri Rust backend holds
  the remote CapSet.
- Letting capOS services execute arbitrary host UI code, inject unreviewed
  JavaScript/CSS, spoof trusted browser/desktop chrome, or persist UI changes
  outside the granted session UI scope.
- Replacing SSH, WebShellGateway, native shell, or interactive command
  surfaces.
- Exposing raw `ProcessSpawner`, raw network factories, broad storage roots, or
  key material as a default remote bundle.
- Treating password authentication as the only or preferred remote path.
- Serializing the kernel CapSet page or local cap table to the client.

## Architecture

```mermaid
flowchart TD
    Client[Host app: CLI, GUI, Tauri, or web gateway] -->|TCP/TLS + capnp-rpc| Gateway[RemoteSessionGateway]
    Gateway --> Auth[Auth adapters]
    Auth --> Sessions[SessionManager]
    Gateway --> Broker[AuthorityBroker]
    Broker --> Worker[Per-session RPC worker]
    Worker --> RemoteCapSet[RemoteCapSet]
    RemoteCapSet --> Proxies[Remote capability proxies]
    Proxies --> LocalCaps[capOS capabilities]
    Worker --> Audit[AuditLog]
```

The remote listener is a trusted gateway. It accepts the transport, performs
or delegates authentication, obtains a `UserSession`, asks the broker for a
remote-client bundle, and hosts a per-session RPC vat. The vat exports a
`RemoteSession` object and remote proxy objects for capabilities in the
broker-issued bundle.

For the first implementation the per-session worker may be an ordinary capOS
service process. That shape matches the session-bound invariant: one workload
process has one immutable session context. A single long-lived gateway may
handle pre-auth connection state, but post-auth capability invocation should
run inside a worker whose session context is the authenticated remote session,
or through an equivalently reviewable dispatch path that cannot mix unrelated
user sessions as ambient authority.

## Bootstrap Interfaces

These schema sketches are proposal-level. Ordinals must be assigned from the
checked-in schema when implemented.

```capnp
enum RemoteAuthKind {
  password @0;
  publicKey @1;
  oidcDeviceCode @2;
  oidcAuthorizationCodePkce @3;
  passkey @4;
  mtlsClientCert @5;
  guest @6;
  anonymous @7;
  serviceCredential @8;
}

struct RemoteAuthMethod {
  kind @0 :RemoteAuthKind;
  label @1 :Text;
  profileHints @2 :List(Text);
  interactive @3 :Bool;
}

struct RemoteAuthStart {
  kind @0 :RemoteAuthKind;
  selector @1 :LoginSelector;
  requestedProfile @2 :Text;
  clientNonce @3 :Data;
  source @4 :LoginSourceMetadata;
}

struct RemoteAuthStep {
  prompt @0 :Text;
  redaction @1 :Bool;
  url @2 :Text;
  userCode @3 :Text;
  challenge @4 :Data;
  expiresAtMs @5 :UInt64;
}

interface RemoteSessionGateway {
  authMethods @0 () -> (methods :List(RemoteAuthMethod));
  start @1 (request :RemoteAuthStart) -> (flow :RemoteAuthFlow);
  guest @2 (requestedProfile :Text, source :LoginSourceMetadata)
      -> (session :RemoteSession);
  anonymous @3 (requestedProfile :Text, source :LoginSourceMetadata)
      -> (session :RemoteSession);
}

interface RemoteAuthFlow {
  next @0 (response :Data) -> (step :RemoteAuthStep, done :Bool,
      session :RemoteSession);
  cancel @1 () -> ();
}

struct RemoteCapEntry {
  name @0 :Text;
  interfaceId @1 :UInt64;
  transferPolicy @2 :Text;
  leaseExpiresAtMs @3 :UInt64;
}

interface RemoteSession {
  info @0 () -> (info :SessionInfo);
  capSet @1 () -> (caps :RemoteCapSet);
  renew @2 (proof :Data, requestedDurationMs :UInt64)
      -> (session :RemoteSession);
  logout @3 () -> ();
}

interface RemoteCapSet {
  list @0 () -> (entries :List(RemoteCapEntry));
  get @1 (name :Text, expectedInterfaceId :UInt64) -> (cap :AnyPointer);
}
```

The `AnyPointer` result is proposal shorthand for an ordinary Cap'n Proto
capability pointer whose expected interface ID was already checked by the
gateway. Generated client helpers should immediately cast it to the requested
typed client. The remote client does not receive a numeric local `capId`,
endpoint selector, result-cap index, or session identifier it can replay
somewhere else.
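
The name-plus-interface check behind `get` can be sketched in Rust as follows.
This is a minimal in-memory model under stated assumptions, not the gateway's
code: `RemoteCapSetModel`, `ProxyEntry`, `GetDenied`, and the example interface
IDs are hypothetical, and a string label stands in for the live proxy object a
real worker would export.

```rust
use std::collections::HashMap;

/// Reasons a `get` is refused. Both are observable denials.
#[derive(Debug, PartialEq, Eq)]
pub enum GetDenied {
    UnknownName,
    InterfaceMismatch,
}

/// Hypothetical worker-side record for one granted capability.
pub struct ProxyEntry {
    pub interface_id: u64,
    /// Stand-in for the live proxy object. A real worker would hold a
    /// capability here, never a number the client could replay.
    pub proxy_label: &'static str,
}

/// Hypothetical model of the broker-issued bundle behind `RemoteCapSet`.
#[derive(Default)]
pub struct RemoteCapSetModel {
    entries: HashMap<String, ProxyEntry>,
}

impl RemoteCapSetModel {
    /// Records a broker grant under a lookup name.
    pub fn grant(&mut self, name: &str, interface_id: u64, proxy_label: &'static str) {
        self.entries
            .insert(name.to_string(), ProxyEntry { interface_id, proxy_label });
    }

    /// Mirrors `get(name, expectedInterfaceId)`: the entry is returned only
    /// when the name exists and the expected interface ID matches.
    pub fn get(&self, name: &str, expected_interface_id: u64) -> Result<&ProxyEntry, GetDenied> {
        let entry = self.entries.get(name).ok_or(GetDenied::UnknownName)?;
        if entry.interface_id != expected_interface_id {
            return Err(GetDenied::InterfaceMismatch);
        }
        Ok(entry)
    }
}
```

With a single grant of `system_info` under interface ID `0xA1`, a lookup with
`0xA1` succeeds, a lookup with any other ID yields `InterfaceMismatch`, and a
lookup for an ungranted name yields `UnknownName`. Neither denial falls back to
a different capability.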

## Bidirectional UI Composition

A conventional GUI program opens a window and owns the controls inside it. A
remote capOS session does not need to be that limited. The host app can expose
a session-scoped UI host capability to capOS, and capOS-side services or agents
can use that capability to propose a better interface for the current task:

- Paperclips can ask for counters, project controls, and status charts instead
  of printing lines.
- Chat can ask for a channel list, unread badges, and a message pane.
- Adventure can ask for a map pane, inventory slots, command buttons, and room
  transcript.
- A diagnostics agent can open log, metric, and trace panes side by side,
  highlight the relevant capability calls, and change density for a debugging
  session.
- A teaching or accessibility agent can request larger type, simplified
  controls, or a guided task layout for a particular session.

The authority is explicit and separate from service authority. Holding `Chat`
does not let a service rewrite the user's UI. Holding `RemoteUiHost` or a
narrow `RemoteUiSurface` facet lets the service propose bounded UI changes for the
current remote session. The host app remains the compositor and policy
enforcer.

Conceptual shape:

```capnp
enum UiPatchKind {
  openSurface @0;
  closeSurface @1;
  updateModel @2;
  setLayoutHint @3;
  setThemeHint @4;
  addCommand @5;
  removeCommand @6;
}

struct UiSurfaceSpec {
  surfaceId @0 :Data;
  title @1 :Text;
  kind @2 :Text;
  safetyClass @3 :Text;
  modelSchema @4 :UInt64;
}

struct UiPatch {
  kind @0 :UiPatchKind;
  surfaceId @1 :Data;
  payload @2 :Data;
  expiresAtMs @3 :UInt64;
}

struct UiEvent {
  surfaceId @0 :Data;
  command @1 :Text;
  payload @2 :Data;
  userInitiated @3 :Bool;
}

interface RemoteUiHost {
  open @0 (spec :UiSurfaceSpec) -> (surface :RemoteUiSurface);
  theme @1 (scope :Text, hints :Data) -> ();
}

interface RemoteUiSurface {
  apply @0 (patch :UiPatch) -> ();
  poll @1 (maxEvents :UInt16) -> (events :List(UiEvent));
  close @2 () -> ();
}
```

The payloads above should become typed structs before implementation. They are
shown as `Data` only to keep the sketch short. The important boundary is that
UI updates are declarative patches and typed view models, not arbitrary host
code. The host validates the requested surface kind, model schema, command
set, theme tokens, data size, update rate, and safety class before rendering
anything.
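
A host-side gate for incoming patches might look like the sketch below. It
covers only three of the checks listed above (allowed patch kind, data size,
update rate) plus patch expiry. `PatchGate`, `PatchRejected`, the quota window,
and the rule that a patch is stale once the host clock reaches `expiresAtMs`
are assumptions made for illustration, not part of the proposal.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UiPatchKind {
    OpenSurface,
    CloseSurface,
    UpdateModel,
    SetLayoutHint,
    SetThemeHint,
    AddCommand,
    RemoveCommand,
}

/// Reduced form of the `UiPatch` sketch; `surfaceId` is omitted here.
pub struct UiPatch {
    pub kind: UiPatchKind,
    pub payload: Vec<u8>,
    pub expires_at_ms: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PatchRejected {
    KindNotAllowed,
    PayloadTooLarge,
    Expired,
    RateLimited,
}

/// Hypothetical host policy for one granted surface.
pub struct PatchGate {
    allowed_kinds: Vec<UiPatchKind>,
    max_payload_bytes: usize,
    max_patches_per_window: u32,
    accepted_in_window: u32,
}

impl PatchGate {
    pub fn new(allowed_kinds: Vec<UiPatchKind>, max_payload_bytes: usize, max_patches_per_window: u32) -> Self {
        Self { allowed_kinds, max_payload_bytes, max_patches_per_window, accepted_in_window: 0 }
    }

    /// Driven by the host's own timer; the service has no way to call this.
    pub fn start_new_window(&mut self) {
        self.accepted_in_window = 0;
    }

    /// Validates a patch before anything is rendered.
    pub fn check(&mut self, patch: &UiPatch, now_ms: u64) -> Result<(), PatchRejected> {
        if !self.allowed_kinds.contains(&patch.kind) {
            return Err(PatchRejected::KindNotAllowed);
        }
        if patch.payload.len() > self.max_payload_bytes {
            return Err(PatchRejected::PayloadTooLarge);
        }
        // Assumption: a patch is stale once the host clock reaches its expiry.
        if now_ms >= patch.expires_at_ms {
            return Err(PatchRejected::Expired);
        }
        if self.accepted_in_window >= self.max_patches_per_window {
            return Err(PatchRejected::RateLimited);
        }
        self.accepted_in_window += 1;
        Ok(())
    }
}
```

Rejected patches do not consume quota in this sketch; whether they should is a
policy question the proposal leaves open.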

This is still a remote CapSet client model:

```text
host UI holds RemoteSession + RemoteCapSet
host UI grants a narrow RemoteUiHost/RemoteUiSurface cap to a trusted worker
capOS service or agent sends declarative UI patches through that cap
host UI renders and sends typed user events back
service effects still require ordinary service caps
```

The relationship is therefore bidirectional but not symmetric. The host app can
call capOS service caps. capOS can shape the session UI only through UI caps
the host granted. Neither side gains ambient authority over the other.

Safety rules:

- Host chrome, login prompts, origin indicators, permission prompts, and
  emergency reset controls are reserved. capOS-rendered surfaces cannot spoof
  them.
- UI patches are session-scoped. Persistent layout/theme changes require an
  explicit profile/settings cap or user confirmation.
- Theme and look/feel changes use bounded tokens or validated design-system
  variables, not raw CSS injection.
- UI command descriptors are data; executing a command still calls a typed
  capability under the current session policy.
- The user can close, reset, or pin surfaces against agent rearrangement.
- UI updates are quota-bound and auditable when they materially affect
  workflow, consent, disclosure, or action execution.
- Browser front ends keep raw capOS caps server-side or in a Tauri/native Rust
  backend. Browser JavaScript receives rendered state and sends user events; it
  does not hold `RemoteCapSet` entries.

This is the broader version of the WebShell idea. A web shell can be more than
a terminal emulator: it can be a session workspace whose composition is
negotiated by the capabilities present in the session. The terminal remains
one surface in that workspace, not the only surface.

## Authentication And Admission

Authentication adapters all produce the same output: a `UserSession` plus
profile inputs for the broker. They differ only in how the proof is obtained
and verified.

- **Password:** maps to the existing `SessionManager.login(method, selector,
  proof, source)` path when remote password login is enabled by policy. It must
  use the existing credential failure/backoff/audit rules and must not be the
  only supported remote method.
- **Public key:** maps to `SessionManager.sshPublicKey` or a generalized
  signature-auth method. SSH userauth and raw remote RPC public-key auth can
  share account/key records, but the transcript bytes must be domain-separated
  by protocol and channel binding.
- **OIDC/OAuth:** device-code flow fits headless or CLI clients; authorization
  code + PKCE fits browser-assisted clients. The OAuth/OIDC service verifies
  ID tokens and maps external subjects through the user-identity admission
  model before `SessionManager` mints a session.
- **Passkey/WebAuthn:** belongs behind the web-authenticator path. A remote
  native client may open a browser or use a platform authenticator, but raw
  authenticator secrets never become capOS app data.
- **mTLS client certificate:** TLS client-auth can identify a principal or
  pseudonymous subject through certificate policy. Certificate identity is an
  admission input; the resulting CapSet still comes from the broker.
- **Guest and anonymous:** explicit policy profiles. They are not fallbacks for
  missing credentials and should receive short leases and narrow bundles.
- **Service/workload credentials:** future non-human clients can authenticate
  with OAuth client credentials, token exchange, mTLS, or signed workload
  assertions. They receive service-profile bundles, not human shell bundles.

Every method must record source metadata and protocol/channel binding
appropriate to its transport. A successful proof selects a principal and
session; it does not directly grant service authority.
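
The "same output, different proof" shape can be sketched as a Rust trait. All
names here (`AuthAdapter`, `Admission`, `ToyProofAdapter`, `GuestAdapter`,
`admit`) are hypothetical, and the toy adapter compares raw bytes only so the
example stays self-contained; real adapters would verify a password hash,
signature, token, or certificate.

```rust
/// Hypothetical stand-in for the capOS `UserSession`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UserSession {
    pub principal: String,
}

/// What every adapter hands onward: a session plus profile inputs for the
/// broker. It carries no service authority.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Admission {
    pub session: UserSession,
    pub profile_inputs: Vec<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    MethodDisabled,
    BadProof,
}

pub trait AuthAdapter {
    /// Whether policy enables this method for remote clients.
    fn enabled(&self) -> bool;
    fn verify(&self, selector: &str, proof: &[u8]) -> Result<Admission, AuthError>;
}

/// Toy adapter for illustration only: the proof is a fixed byte string.
pub struct ToyProofAdapter {
    pub policy_enabled: bool,
    pub expected_proof: Vec<u8>,
}

impl AuthAdapter for ToyProofAdapter {
    fn enabled(&self) -> bool {
        self.policy_enabled
    }

    fn verify(&self, selector: &str, proof: &[u8]) -> Result<Admission, AuthError> {
        if proof != self.expected_proof.as_slice() {
            return Err(AuthError::BadProof);
        }
        Ok(Admission {
            session: UserSession { principal: selector.to_string() },
            profile_inputs: vec!["authenticated".to_string()],
        })
    }
}

/// Guest admission is its own policy-gated method, not a fallback.
pub struct GuestAdapter {
    pub policy_enabled: bool,
}

impl AuthAdapter for GuestAdapter {
    fn enabled(&self) -> bool {
        self.policy_enabled
    }

    fn verify(&self, _selector: &str, _proof: &[u8]) -> Result<Admission, AuthError> {
        Ok(Admission {
            session: UserSession { principal: "guest".to_string() },
            profile_inputs: vec!["guest".to_string()],
        })
    }
}

/// Runs exactly the adapter the client selected. A failed or disabled
/// method never falls through to another adapter.
pub fn admit(adapter: &dyn AuthAdapter, selector: &str, proof: &[u8]) -> Result<Admission, AuthError> {
    if !adapter.enabled() {
        return Err(AuthError::MethodDisabled);
    }
    adapter.verify(selector, proof)
}
```

A bad proof against the toy adapter returns `BadProof`, and a guest request
while the guest profile is disabled returns `MethodDisabled`; neither case
silently produces a guest session.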

## Remote CapSet Semantics

A local process starts with a read-only CapSet page plus local cap-table
entries. A remote client instead receives a live `RemoteCapSet` object:

- `list` returns names, interface IDs, transfer policy, and lease expiry for
  each entry.
- `get` returns a typed RPC capability pointer only if the name exists and the
  expected interface ID matches.
- The returned object is a proxy owned by the remote-session worker.
- Dropping the remote object releases the worker's hold edge when no other
  remote references remain.
- Logout, expiry, revocation, disconnect, or worker shutdown breaks all
  session-bound proxy objects.
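
The hold-edge and session-break rules can be modeled in a few lines of
single-threaded Rust. This is an illustrative sketch, not the worker design:
`Session`, `HoldEdge`, `Proxy`, and `CallError` are hypothetical names, cloning
a `Proxy` stands in for a second remote reference to the same object, and a
counter stands in for the worker's held local capabilities.

```rust
use std::cell::Cell;
use std::rc::Rc;

#[derive(Debug, PartialEq, Eq)]
pub enum CallError {
    SessionEnded,
}

/// Hypothetical per-session worker state shared by every exported proxy.
pub struct Session {
    live: Cell<bool>,
    held_local_caps: Cell<u32>,
}

/// One hold edge on a local capability. It is released when the last
/// proxy clone that refers to it is dropped.
struct HoldEdge {
    session: Rc<Session>,
}

impl Drop for HoldEdge {
    fn drop(&mut self) {
        let held = self.session.held_local_caps.get();
        self.session.held_local_caps.set(held - 1);
    }
}

/// Stand-in for a remote reference handed to the client.
#[derive(Clone)]
pub struct Proxy {
    name: &'static str,
    hold: Rc<HoldEdge>,
}

impl Session {
    pub fn start() -> Rc<Session> {
        Rc::new(Session { live: Cell::new(true), held_local_caps: Cell::new(0) })
    }

    /// Exports one capability and records the hold edge it creates.
    pub fn export(session: &Rc<Session>, name: &'static str) -> Proxy {
        session.held_local_caps.set(session.held_local_caps.get() + 1);
        Proxy { name, hold: Rc::new(HoldEdge { session: Rc::clone(session) }) }
    }

    /// Logout, expiry, revocation, disconnect, and worker shutdown all
    /// funnel into this one switch in the sketch.
    pub fn end(&self) {
        self.live.set(false);
    }

    pub fn held_local_caps(&self) -> u32 {
        self.held_local_caps.get()
    }
}

impl Proxy {
    /// Every call checks session liveness first and fails closed.
    pub fn call(&self) -> Result<String, CallError> {
        if !self.hold.session.live.get() {
            return Err(CallError::SessionEnded);
        }
        Ok(format!("called {}", self.name))
    }
}
```

Dropping one of two clones leaves the hold count unchanged; dropping the last
clone releases it. After `end`, every surviving proxy returns `SessionEnded`
regardless of which capability it names.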

This is still an actual session bundle. It is not a copy of the kernel's local
CapSet ABI. The remote representation exists because a Linux process has no
capOS ring page, no capOS CapSet mapping, and no local cap table.

## Invocation Context

Remote capability calls should look like ordinary calls to the target service:

```text
remote client call
  -> capnp-rpc message
  -> per-session worker proxy
  -> local capOS capability call
  -> target service sees the worker's live session context
```

The remote client cannot choose service-visible subject identity. Request
fields are ordinary data. If a service needs subject details, it uses the
existing subject-disclosure policy: explicit request plus a matching
service-scoped disclosure grant. By default it receives only the opaque
service-scoped caller-session reference used by the session-bound invocation
model.
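
As a rough illustration of that rule, the sketch below has the worker attach
its own caller-session reference to every forwarded call while treating the
client's bytes purely as payload. `SessionWorker`, `ServiceCall`, and
`CallerSessionRef` are hypothetical names, and the byte-string payload stands
in for a decoded Cap'n Proto request.

```rust
/// Opaque, service-scoped caller-session reference (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CallerSessionRef(pub String);

/// What the target service observes for one forwarded call.
pub struct ServiceCall {
    pub caller: CallerSessionRef,
    pub payload: Vec<u8>,
}

/// Hypothetical per-session worker: the session reference is fixed when the
/// worker is created and there is no method to change it afterwards.
pub struct SessionWorker {
    session_ref: CallerSessionRef,
}

impl SessionWorker {
    pub fn for_session(opaque_ref: &str) -> Self {
        Self { session_ref: CallerSessionRef(opaque_ref.to_string()) }
    }

    /// Forwards a client request. The caller identity always comes from the
    /// worker; anything identity-like inside `payload` is ordinary data.
    pub fn dispatch(&self, payload: Vec<u8>) -> ServiceCall {
        ServiceCall { caller: self.session_ref.clone(), payload }
    }
}
```

A request whose payload claims `user=root` still reaches the service tagged
with the worker's own reference; the claim travels only as uninterpreted data.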

## Error And Lifetime Model

The remote path keeps the existing error split:

- Cap'n Proto RPC transport errors and broken connections become RPC
  exceptions or disconnected promises.
- Proxy/worker infrastructure failures become `CapException`-like capability
  exceptions.
- Domain outcomes remain schema result fields or unions.
- A missing cap name, interface mismatch, denied profile, stale session, or
  revoked lease is an observable denial, not a silent fallback to a broader
  service.

Open promises must fail when the remote session logs out or the connection is
closed. The worker must release local caps on every close path.
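
One way to picture the error split and the fail-closed rule for open promises
is a small table of in-flight calls. The sketch below is an assumption-laden
model: `CallOutcome`, `Denial`, and `InFlightCalls` are hypothetical names,
strings stand in for schema results and exception details, and choosing
`StaleSession` as the denial for calls started after close is an illustrative
choice.

```rust
use std::collections::BTreeSet;

/// Observable denials named in the list above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Denial {
    MissingCap,
    InterfaceMismatch,
    DeniedProfile,
    StaleSession,
    RevokedLease,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CallOutcome {
    /// Domain result; in practice a schema result struct or union.
    Domain(String),
    /// Transport error or broken connection.
    Disconnected,
    /// Proxy or worker infrastructure failure.
    Infrastructure(String),
    /// Explicit denial, never a fallback to a broader service.
    Denied(Denial),
}

/// Hypothetical table of calls whose promises are still open.
pub struct InFlightCalls {
    open: bool,
    next_id: u64,
    pending: BTreeSet<u64>,
}

impl InFlightCalls {
    pub fn new() -> Self {
        Self { open: true, next_id: 0, pending: BTreeSet::new() }
    }

    /// Registers a new call, or denies it if the session already closed.
    pub fn begin(&mut self) -> Result<u64, CallOutcome> {
        if !self.open {
            return Err(CallOutcome::Denied(Denial::StaleSession));
        }
        let id = self.next_id;
        self.next_id += 1;
        self.pending.insert(id);
        Ok(id)
    }

    /// Resolves a pending call. Returns `None` if the call was already
    /// resolved or failed, so a late result cannot overwrite a failure.
    pub fn complete(&mut self, id: u64, outcome: CallOutcome) -> Option<CallOutcome> {
        if self.pending.remove(&id) {
            Some(outcome)
        } else {
            None
        }
    }

    /// Logout or connection close: every open promise fails with `reason`.
    pub fn close(&mut self, reason: CallOutcome) -> Vec<(u64, CallOutcome)> {
        self.open = false;
        let ids = std::mem::take(&mut self.pending);
        ids.into_iter().map(|id| (id, reason.clone())).collect()
    }
}
```

After `close`, the table is empty, late completions are ignored, and new calls
are denied rather than queued.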

## Relationship To Shells And Gateways

Remote session CapSet clients are peers of shell transports:

- **Native shell:** a local capOS process that uses its local CapSet and ring.
  It can later expose a schema-aware REPL over the same capabilities a remote
  client sees, but the remote client does not need to spawn a shell.
- **SSH shell:** a production CLI terminal transport. It authenticates and
  launches `capos-shell` with a `TerminalSession`. It should not become the
  only way for external programs to call typed services.
- **WebShellGateway:** browser terminal, webapp, and agent UI transport.
  Browser JavaScript must not receive raw capOS caps; the gateway can use the
  remote session CapSet model server-side and expose terminal frames, view
  models, command descriptors, or bounded tool requests to the browser. The
  mental model is close to a "web shell", except that the shell is not the
  required protocol. The web UI can present service-specific controls over the
  same session CapSet, and capOS-side services can adjust the session
  workspace through UI composition caps. A remote CapSet web UI can be built
  before the full WebShellGateway by omitting terminal delegation, shell-runner
  policy, and agent execution; it is just another host client of the remote
  session bundle.
- **Tauri or desktop GUI:** the Rust/native backend may hold the
  `RemoteSession` and typed capability clients, while the UI layer receives
  rendered state, command descriptors, and user-intent events. The UI layer
  should not receive replayable capOS authority as data. The backend may grant
  narrow UI-surface caps back to capOS services so they can propose adaptive
  layouts without gaining arbitrary desktop control.
- **Agent shell:** the agent runner holds session caps server-side and presents
  tool descriptors to the model. A hosted agent can use the same remote
  session bundle shape as long as actual capOS invocations remain in the
  trusted worker.
- **Interactive command surfaces:** command metadata can be one of the granted
  capabilities. A remote client can render command specs directly instead of
  scripting text through a shell.

## Authority Rules

- The gateway receives scoped listener/TLS/auth/session/broker/audit authority,
  not raw broad network or spawn authority.
- Post-auth workers receive only the broker-issued remote-client bundle plus
  proxy lifecycle authority.
- Default remote bundles should be narrower than operator shell bundles.
- Raw `ProcessSpawner`, unrestricted `NetworkManager`, key-vault, credential
  store, broad account store, broad storage root, and host debug caps require
  explicit elevated policy.
- Remote proxyable caps must declare transfer/lifetime policy. Local-only caps
  may appear in a local shell CapSet without being exportable through
  `RemoteCapSet`.
- Capability names are lookup conveniences. Interface ID and broker policy
  define whether a returned object is usable for the requested type.
- Replayable handles are forbidden. Session IDs, grant IDs, endpoint metadata,
  object epochs, and proxy table positions are not bearer tokens.
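
The export rule for local-only and elevated caps could be expressed as a filter
over the broker's bundle. The sketch below is illustrative only: the
`TransferPolicy` enum is an assumed typed form of the `transferPolicy` text
field, and `GrantedCap`, `remote_view`, and the bundle entry names are
hypothetical.

```rust
/// Assumed typed form of a cap's declared transfer policy.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TransferPolicy {
    /// May appear in a local shell CapSet but is never exported.
    LocalOnly,
    /// May be proxied through `RemoteCapSet` under the default policy.
    RemoteProxyable,
    /// Proxied only when explicit elevated policy applies.
    RemoteElevatedOnly,
}

/// Hypothetical broker-side view of one granted capability.
pub struct GrantedCap {
    pub name: &'static str,
    pub interface_id: u64,
    pub policy: TransferPolicy,
}

/// Returns the names a remote client may see. Anything not explicitly
/// exportable is left out rather than downgraded or substituted.
pub fn remote_view(bundle: &[GrantedCap], elevated: bool) -> Vec<&'static str> {
    bundle
        .iter()
        .filter(|cap| match cap.policy {
            TransferPolicy::LocalOnly => false,
            TransferPolicy::RemoteProxyable => true,
            TransferPolicy::RemoteElevatedOnly => elevated,
        })
        .map(|cap| cap.name)
        .collect()
}
```

For a bundle holding a proxyable `system_info`, an elevated-only
`process_spawner`, and a local-only entry, the default remote view contains
only `system_info`; the elevated view adds `process_spawner`, and the
local-only entry never appears.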

## Design Grounding

- [Session-Bound Invocation Context](session-bound-invocation-context-proposal.md)
  defines the one-session-per-process invariant and privacy-preserving endpoint
  caller-session metadata.
- [User Identity and Policy](user-identity-and-policy-proposal.md) defines
  principals, sessions, profiles, admission sources, renewal, and brokered
  CapSet minting.
- [Boot to Shell](boot-to-shell-proposal.md) defines the existing
  `CredentialStore`/`SessionManager`/`AuthorityBroker` path and non-password
  login directions.
- [SSH Shell Gateway](ssh-shell-proposal.md), [Certificates and TLS](certificates-and-tls-proposal.md),
  and [OIDC and OAuth2](oidc-and-oauth2-proposal.md) define public-key,
  TLS/mTLS, and federated admission inputs.
- [libcapos-service](libcapos-service-proposal.md) defines the service
  lifecycle shape needed for listener loops, per-session context, shutdown,
  drain, and metrics.
- [Interactive Command Surfaces](interactive-command-surface-proposal.md)
  defines typed command sessions that can be rendered by remote clients.
- [Browser Capability and Agent Web Sessions](browser-capability-proposal.md)
  defines browser-side authority boundaries and gateway mediation for web UI
  sessions.
- [Language Models and Agent Runtime](llm-and-agent-proposal.md) defines
  agent runners, tool proxies, and browser-agent UI orchestration boundaries.
- [Cloudflare, Cap'n Proto, Workers RPC, and Cap'n Web](../research/cloudflare-capnproto-workers.md)
  grounds production object-capability RPC, live object bindings, and remote
  resource-exhaustion discipline.
- [Spritely, OCapN, and CapTP](../research/spritely-captp-ocapn.md) grounds
  distributed object-capability lifetime, promise, reference, and handoff
  questions while staying non-binding for capOS wire compatibility.

## Implementation Shape

The first implementation is deliberately small:

1. Keep the existing `capnp-chat-interop` service and harness as the transport
   starting point, but rename the target outcome in planning docs to remote
   session CapSet interop. Done.
2. Add generated Linux Rust bindings for the relevant schema subset. Done.
3. Add a host client library that connects through QEMU user TCP. Done with a
   schema-framed DTO transport; replacing it with standard `capnp-rpc` framing
   and live proxy objects remains the next transport step.
4. Add a capOS gateway that supports one policy-enabled auth method plus
   explicit guest/anonymous behavior. Done for password and anonymous, with
   disabled public-key, OIDC, and passkey/WebAuthn method entries advertised.
5. Return remote session summary, CapSet list, and typed `get` metadata. Done
   as DTOs.
6. Call at least two capabilities from the bundle. Done for `session` and
   `system_info`; endpoint-backed services such as `chat` remain blocked on
   the per-session worker proxy.
7. Prove a missing cap, wrong interface ID, wrong profile, stale session, and
   logout path fail closed. Done for the focused proof; full disconnect,
   release, and revocation propagation remains future work.
8. Add a first host UI client over the current UI-neutral Rust client. This can
   be a Tauri app or a trusted local web bridge, and should cover endpoint
   configuration, auth methods, login, session summary, CapSet inspection,
   `sessionInfo`, `systemMotd`, denial probes, logout, stale-call proof, and
   redacted transcript export. It is separate from WebShell and does not need a
   terminal emulator, shell-runner policy, or agent execution.
9. Replace the DTO transport with standard `capnp-rpc`, typed remote proxy
   objects, exception mapping, release/drop handling, and resource bounds.
10. Add a separate UI-composition proof only after the basic session proof:
    grant a narrow test `RemoteUiSurface`, accept one declarative patch, send
    one typed user event back, and prove the service cannot spoof trusted chrome
    or persist layout state without the relevant cap.

Later slices can add more auth adapters, TLS, renewal, browser-assisted auth,
service credentials, UI composition surfaces, promise pipelining, and
distributed GC.