# Proposal: Interactive Command Surfaces

Typed command surfaces for native interactive applications without moving
application parsing into `StdIO` text streams.


## Current Target Versus Future Design

The immediate target is deliberately narrower than this proposal:

- `capos-shell` exposes generic process control commands, including `spawn`
  for asynchronous launch and `run` for launch-and-wait.
- Chat and adventure clients are ordinary spawned commands, not shell
  builtins.
- Interactive child I/O uses an explicit `StdIO` endpoint client with
  stdin/stdout/stderr-shaped semantics while the shell keeps ownership of its
  `TerminalSession`.
- Focused QEMU smoke tests prove the resident-service plus
  shell-spawned-client path before the native command protocol hardens.

The future native design is the `CommandSession`/`CommandSurface` protocol
below. It should replace semantic command parsing inside chat/adventure
clients once the prototype has proved the process, grant, wait, and terminal
bridging mechanics.

## Problem

The current chat/adventure worktree moved application commands out of
`capos-shell` builtins and into ordinary shell-spawned clients. That fixes one
bad boundary, but it leaves another one: the clients read lines from `StdIO`
and parse command text such as `go north`, `take key`, `/join #lobby`, and
`say hello` themselves.

That is still too stringly typed for capOS. The kernel and services already
expose typed capabilities. Native interactive applications should not receive
their primary operation as an unstructured terminal line and then rebuild an
ad hoc parser. `StdIO` is useful for textual programs, logs, compatibility
layers, and simple smoke harnesses. It is not the right semantic boundary for
a native application command language.

The other design pressure is terminal reuse. The same native shell should work
from a local UART, GUI pane, web terminal, or test harness. That argues for a
terminal host process that owns terminal transport and rendering separately
from the shell process that owns command routing and capability context.

## Goals

- Keep application-specific verbs out of `capos-shell`.
- Keep application command semantics out of unstructured `StdIO` text parsing.
- Let a user type familiar command forms such as `go north` or `chat join
  #lobby` while the executable representation is a typed invocation.
- Support nested subcommands without hardcoding app grammar into the shell.
- Let terminal hosts provide line editing, completion, history, resize, and
  GUI/web rendering from the same command metadata.
- Preserve typed service authority: parsing a command never grants access, and
  every effect still requires the right capability.

## Non-Goals

- POSIX shell compatibility.
- A global command namespace.
- Making terminal text a security boundary.
- Removing `StdIO`; it remains the byte/text stream adapter for programs whose
  interface really is textual.

## Layering

```mermaid
flowchart TD
    Uart[UART TerminalHost] --> Terminal[Terminal entity]
    Web[Web TerminalHost] --> Terminal
    Gui[GUI TerminalHost] --> Terminal
    Terminal --> Shell[Native shell session]
    Shell --> Cmd[Interactive CommandSession]
    Cmd --> Adventure[Adventure service cap]
    Cmd --> Chat[Chat service cap]
    Shell --> Launcher[Restricted launcher]
    Shell --> Broker[AuthorityBroker]
```

The terminal host owns raw input/output, line discipline, presentation state,
history, paste handling, resize events, and later GUI/web affordances. The
terminal entity is the session object the host exposes to a foreground shell or
application view. `TerminalSession` remains the capability boundary for a
foreground text session, but it does not have to be implemented inside the
shell.

The native shell owns the command namespace, the current capability context,
spawn/wait state, and policy-mediated bundle changes. It can run from any
terminal host because it talks to the terminal entity, not to a particular
UART.

An interactive application owns a `CommandSession`. It exposes a command
surface and receives structured invocations. The application may be a thin
adapter over service capabilities, as the adventure client should be, or a
resident service may expose the command session directly.

## Command Pattern

`command <args>` is acceptable as user-facing syntax, but it must not become
the application ABI. It is a parseable notation for a declared command surface.
The shell or terminal host parses text into a `CommandInvocation`; the
application receives typed fields.

Conceptual schema:

```capnp
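# Conceptual sketch only. CommandFlag, CommandFlagValue, CommandValue,
# CommandValueKind, CompletionSource, RedactionClass, CommandResult, and
# CommandEvent are referenced here but not defined in this proposal.
struct CommandSurface {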
  revision @0 :UInt64;
  prompt @1 :Text;
  commands @2 :List(CommandSpec);
}

struct CommandSpec {
  path @0 :List(Text);
  summary @1 :Text;
  args @2 :List(CommandArg);
  flags @3 :List(CommandFlag);
  redaction @4 :List(RedactionClass);
}

struct CommandArg {
  name @0 :Text;
  kind @1 :CommandValueKind;
  required @2 :Bool;
  variadic @3 :Bool;
  restOfLine @4 :Bool;
  completions @5 :CompletionSource;
}

struct CommandInvocation {
  surfaceRevision @0 :UInt64;
  path @1 :List(Text);
  args @2 :List(CommandValue);
  flags @3 :List(CommandFlagValue);
}

interface CommandSession {
  describe @0 () -> (surface :CommandSurface);
  invoke @1 (command :CommandInvocation) -> (result :CommandResult);
  poll @2 (maxEvents :UInt16) -> (events :List(CommandEvent));
  close @3 () -> ();
}
```

The parser is generic:

- Match the longest declared command path.
- Parse arguments according to the declared shapes.
- Treat ambiguous prefixes as errors with alternatives.
- Treat `restOfLine` as one text argument; do not split it again in the app.
- Attach redaction metadata before audit or transcript recording.
- Re-read `CommandSurface` through `describe` when a command returns a new
  revision.

The application can still reject a typed invocation if the command is no longer
valid. That is ordinary semantic validation, not text parsing.
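The parser rules above can be sketched in Rust. This is a minimal illustration under stated assumptions, not the capOS implementation: `ArgSpec`, `CommandSpec`, `Invocation`, `ParseError`, `next_token`, and `parse` are hypothetical names, values are kept as plain strings instead of typed `CommandValue`s, arguments are positional to match `args :List(CommandValue)`, and flags, completion, and redaction are left out.

```rust
/// Hypothetical in-memory mirror of a few `CommandArg` fields.
#[derive(Debug, Clone)]
pub struct ArgSpec {
    pub name: &'static str,
    pub required: bool,
    pub rest_of_line: bool,
}

/// Hypothetical in-memory mirror of a few `CommandSpec` fields.
#[derive(Debug, Clone)]
pub struct CommandSpec {
    pub path: Vec<&'static str>,
    pub args: Vec<ArgSpec>,
}

/// Simplified stand-in for `CommandInvocation` (string values only).
#[derive(Debug, PartialEq)]
pub struct Invocation {
    pub surface_revision: u64,
    pub path: Vec<String>,
    pub args: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum ParseError {
    UnknownCommand,
    MissingArg(&'static str),
    TooManyArgs,
}

/// Split off the next whitespace-delimited token: (token, remainder).
fn next_token(s: &str) -> Option<(&str, &str)> {
    let s = s.trim_start();
    if s.is_empty() {
        return None;
    }
    let end = s.find(char::is_whitespace).unwrap_or(s.len());
    Some((&s[..end], &s[end..]))
}

pub fn parse(revision: u64, specs: &[CommandSpec], line: &str) -> Result<Invocation, ParseError> {
    // Longest declared path whose tokens prefix the line wins.
    let mut best: Option<(&CommandSpec, &str)> = None;
    for spec in specs {
        let mut rest = line;
        let mut matched = true;
        for want in &spec.path {
            match next_token(rest) {
                Some((tok, tail)) if tok == *want => rest = tail,
                _ => {
                    matched = false;
                    break;
                }
            }
        }
        if matched && best.map_or(true, |(b, _)| spec.path.len() > b.path.len()) {
            best = Some((spec, rest));
        }
    }
    let (spec, mut rest) = best.ok_or(ParseError::UnknownCommand)?;

    // Parse arguments according to the declared shapes.
    let mut args = Vec::new();
    for arg in &spec.args {
        if arg.rest_of_line {
            // The remainder is one text argument; the app never re-splits it.
            let text = rest.trim();
            if !text.is_empty() {
                args.push(text.to_string());
            } else if arg.required {
                return Err(ParseError::MissingArg(arg.name));
            }
            rest = "";
            continue;
        }
        match next_token(rest) {
            Some((tok, tail)) => {
                args.push(tok.to_string());
                rest = tail;
            }
            None if arg.required => return Err(ParseError::MissingArg(arg.name)),
            None => {}
        }
    }
    if next_token(rest).is_some() {
        return Err(ParseError::TooManyArgs);
    }
    Ok(Invocation {
        surface_revision: revision,
        path: spec.path.iter().map(|s| s.to_string()).collect(),
        args,
    })
}
```

With specs declared for `chat`, `chat join <channel>`, and `say <text...>`, the line `chat join #lobby` resolves to the two-token path rather than to `chat`, and `say hello there` produces a single `hello there` argument. The application only ever sees the resulting `Invocation`.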

## Subcommand Nesting

Nested subcommands work if the command path is represented as a token list
rather than a single string. Examples:

```text
go north
take brass-key
say hello there
chat join #lobby
chat who
inventory equip lantern
admin npc spawn wanderer room=atrium
```

Those become:

```text
path=["go"], args={direction:"north"}
path=["take"], args={item:"brass-key"}
path=["say"], args={text:"hello there"}
path=["chat","join"], args={channel:"#lobby"}
path=["chat","who"], args={}
path=["inventory","equip"], args={item:"lantern"}
path=["admin","npc","spawn"], args={kind:"wanderer", room:"atrium"}
```

The shell does not need adventure-specific code for any of these. It needs a
generic command tree, longest-prefix matching, value parsers, and completion
hooks. The same mechanism can describe shell commands such as `spawn`, `wait`,
`login`, and `caps`, even if the implementations remain inside the shell for
now.
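A generic command tree with prefix resolution might look like the following sketch. It assumes that a unique prefix of a subcommand name expands to that subcommand, that an exact name always wins, and that ambiguous prefixes are reported with their alternatives; `CommandNode`, `Resolve`, and `resolve` are hypothetical names, not existing capOS types.

```rust
use std::collections::BTreeMap;

/// Hypothetical command tree node; `runnable` marks a declared command path.
#[derive(Default, Debug)]
pub struct CommandNode {
    pub children: BTreeMap<String, CommandNode>,
    pub runnable: bool,
}

impl CommandNode {
    /// Declare a command path such as ["chat", "join"].
    pub fn insert(&mut self, path: &[&str]) {
        let mut node = self;
        for seg in path {
            node = node.children.entry(seg.to_string()).or_default();
        }
        node.runnable = true;
    }
}

#[derive(Debug, PartialEq)]
pub enum Resolve {
    /// Resolved command path; any remaining tokens are arguments.
    Path(Vec<String>),
    /// Token at index `at` is a prefix of several subcommands.
    Ambiguous { at: usize, alternatives: Vec<String> },
    /// Token at index `at` names no declared command.
    Unknown { at: usize },
}

pub fn resolve(root: &CommandNode, tokens: &[&str]) -> Resolve {
    let mut node = root;
    let mut path = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        // An exact name always wins over prefix expansion.
        let name = if node.children.contains_key(*tok) {
            tok.to_string()
        } else {
            let hits: Vec<&String> = node
                .children
                .keys()
                .filter(|k| k.starts_with(*tok))
                .collect();
            match hits.len() {
                // No subcommand matches: the rest of the line is arguments.
                0 if node.runnable => break,
                0 => return Resolve::Unknown { at: i },
                1 => hits[0].clone(),
                _ => {
                    return Resolve::Ambiguous {
                        at: i,
                        alternatives: hits.into_iter().cloned().collect(),
                    }
                }
            }
        };
        node = &node.children[&name];
        path.push(name);
    }
    if node.runnable {
        Resolve::Path(path)
    } else {
        Resolve::Unknown { at: path.len() }
    }
}
```

Nothing in this tree is adventure-specific. The same structure can hold `go`, `chat join`, or `admin npc spawn`, and the alternatives returned for an ambiguous prefix double as completion candidates.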

Subcommand nesting is also a better fit for GUI/web sessions than raw `StdIO`.
A terminal host can render `chat join` as a command palette entry, offer
direction completions for `go`, or show buttons for zero-argument commands such
as `look`, all from the same metadata.
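As a rough illustration of metadata-driven rendering, a GUI or web host could derive its widgets from the declared commands alone. `SurfaceEntry`, `Widget`, and `widgets` are hypothetical names, and the choice of "button for zero arguments, palette entry otherwise" is an assumption made for the example.

```rust
/// Hypothetical slice of a `CommandSpec` that a host needs for rendering.
pub struct SurfaceEntry {
    pub path: Vec<&'static str>,
    pub summary: &'static str,
    pub arg_names: Vec<&'static str>,
}

/// Hypothetical presentation choices a GUI or web host might make.
#[derive(Debug, PartialEq)]
pub enum Widget {
    /// Zero-argument command: one click invokes it.
    Button { label: String, tooltip: String },
    /// Command with arguments: shown in a palette with an input hint.
    PaletteEntry { label: String, placeholder: String },
}

pub fn widgets(entries: &[SurfaceEntry]) -> Vec<Widget> {
    entries
        .iter()
        .map(|entry| {
            // The visible label is the command path joined with spaces.
            let label = entry.path.join(" ");
            if entry.arg_names.is_empty() {
                Widget::Button { label, tooltip: entry.summary.to_string() }
            } else {
                let placeholder = entry
                    .arg_names
                    .iter()
                    .map(|a| format!("<{}>", a))
                    .collect::<Vec<_>>()
                    .join(" ");
                Widget::PaletteEntry { label, placeholder }
            }
        })
        .collect()
}
```

A UART host would ignore this function and keep using the same entries for line completion, which is the point: one description, several presentations.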

## Adventure Shape

The adventure command session should own only the caps it needs:

```text
adventure       Adventure or Endpoint client cap
chat            Chat or Endpoint client cap
session         optional UserSession metadata cap
```

It should expose a dynamic surface derived from current player state:

- `look`
- `go <direction>` with room-specific direction completions
- `take <item>` with visible item completions
- `drop <item>` with inventory completions
- `inventory`
- `say <text...>` with `restOfLine=true`
- `chat join <channel>`
- `chat who`
- `quit`

The shell or terminal host parses those forms. The adventure command session
turns the resulting invocation into typed `Adventure` and `Chat` calls. The
adventure service still validates the session-bound caller identity, room,
exits, items, and chat channel authority. Dynamic completions are convenience,
not authority.
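The split between convenience and authority can be shown with a small sketch. It assumes a toy world model and a session that bumps its surface revision whenever the room changes and rejects invocations built against a stale revision; `Room`, `AdventureSession`, and `InvokeError` are hypothetical, and the real adventure service would sit behind a capability rather than a local struct.

```rust
#[derive(Debug, PartialEq)]
pub enum InvokeError {
    /// The caller parsed against an outdated surface and should re-describe.
    StaleSurface { current: u64 },
    NoSuchExit,
    UnknownCommand,
}

/// Toy room: exits are (direction, target room index).
pub struct Room {
    pub name: &'static str,
    pub exits: Vec<(&'static str, usize)>,
}

/// Hypothetical command session over a toy world.
pub struct AdventureSession {
    rooms: Vec<Room>,
    here: usize,
    revision: u64,
}

impl AdventureSession {
    pub fn new(rooms: Vec<Room>) -> Self {
        Self { rooms, here: 0, revision: 1 }
    }

    /// Analogue of `describe`: current revision plus direction completions.
    pub fn describe(&self) -> (u64, Vec<&'static str>) {
        let dirs = self.rooms[self.here].exits.iter().map(|exit| exit.0).collect();
        (self.revision, dirs)
    }

    /// Analogue of `invoke`: returns the revision after the command ran.
    pub fn invoke(
        &mut self,
        surface_revision: u64,
        path: &[&str],
        args: &[&str],
    ) -> Result<u64, InvokeError> {
        if surface_revision != self.revision {
            return Err(InvokeError::StaleSurface { current: self.revision });
        }
        match (path, args) {
            (["go"], [direction]) => {
                // Completions were only a hint; the exit is checked again here.
                let target = self.rooms[self.here]
                    .exits
                    .iter()
                    .find(|exit| exit.0 == *direction)
                    .map(|exit| exit.1)
                    .ok_or(InvokeError::NoSuchExit)?;
                self.here = target;
                // Moving changes the valid directions, so the surface changes.
                self.revision += 1;
                Ok(self.revision)
            }
            (["look"], []) => Ok(self.revision),
            _ => Err(InvokeError::UnknownCommand),
        }
    }
}
```

A well-formed `go west` still fails with `NoSuchExit` when the room has no such exit, and an invocation parsed before the last move is rejected so the caller re-reads the surface. Neither check involves parsing text.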

This is the balance capOS wants: generic shell integration, app-owned command
metadata, typed service calls, and no application-specific shell builtins.

## Role of StdIO

`StdIO` remains useful, but it should be demoted to a transport and
compatibility interface:

- output streams for simple textual programs,
- test harnesses that script input and check transcript output,
- file-descriptor emulation for a POSIX personality,
- applications whose real protocol is text.

For capOS-native interactive applications, `StdIO.read()` should not be the
primary command interface. A command session can still emit render events that
the shell forwards to a terminal host, and a compatibility adapter can expose
the same session as text when necessary.
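One way to picture such a compatibility adapter is a function that flattens session events into stdout/stderr-shaped text. The `CommandEvent` variants below are invented for the sketch, since the proposal does not define them, and `TextStreams` and `flatten` are likewise hypothetical.

```rust
/// Hypothetical event variants; the proposal leaves `CommandEvent` undefined.
#[derive(Debug, Clone, PartialEq)]
pub enum CommandEvent {
    Output(String),
    Error(String),
    SurfaceChanged { revision: u64 },
}

/// Text-only view of a session, shaped like stdout and stderr.
#[derive(Debug, Default, PartialEq)]
pub struct TextStreams {
    pub stdout: String,
    pub stderr: String,
}

/// Flatten structured events into text for a `StdIO`-style consumer.
pub fn flatten(events: &[CommandEvent]) -> TextStreams {
    let mut out = TextStreams::default();
    for ev in events {
        match ev {
            CommandEvent::Output(text) => {
                out.stdout.push_str(text);
                out.stdout.push('\n');
            }
            CommandEvent::Error(text) => {
                out.stderr.push_str(text);
                out.stderr.push('\n');
            }
            // Structural events have no textual form; a text client never sees them.
            CommandEvent::SurfaceChanged { .. } => {}
        }
    }
    out
}
```

The adapter is lossy by design: a text consumer gets a transcript, while a native host keeps the structured events.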

## Terminal Host Separation

The shell should not permanently own the terminal implementation. A separate
terminal host process gives the system one shell that can be reused across
different front ends:

- local UART host for QEMU and early hardware,
- web host for browser terminal sessions,
- GUI host for a desktop pane or command palette,
- test host for smoke scripts.

Each host owns a terminal entity and grants a foreground `TerminalSession` or
equivalent view to the shell. The shell runs command sessions and returns
render/update events. The host decides how to display them.

This also avoids a future false choice between "shell owns the terminal" and
"child process receives the terminal." The terminal entity can support a
foreground lease, shell-mediated command sessions, and later split panes or GUI
widgets without making every child process a terminal driver.
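The host/shell split can be sketched as a trait boundary. This is an in-process analogy only, assuming hypothetical names (`RenderEvent`, `TerminalHost`, `UartHost`, `TestHost`, `shell_turn`); in capOS the boundary would be a capability such as `TerminalSession`, not a Rust trait object.

```rust
/// Hypothetical render events the shell hands back to a host.
#[derive(Debug, Clone, PartialEq)]
pub enum RenderEvent {
    Line(String),
    Prompt(String),
}

/// What the shell needs from any host; transport details stay behind it.
pub trait TerminalHost {
    fn render(&mut self, event: &RenderEvent);
}

/// UART-style host: writes bytes with CRLF line endings.
#[derive(Default)]
pub struct UartHost {
    pub wire: Vec<u8>,
}

impl TerminalHost for UartHost {
    fn render(&mut self, event: &RenderEvent) {
        match event {
            RenderEvent::Line(text) => {
                self.wire.extend_from_slice(text.as_bytes());
                self.wire.extend_from_slice(b"\r\n");
            }
            RenderEvent::Prompt(text) => self.wire.extend_from_slice(text.as_bytes()),
        }
    }
}

/// Test host: keeps the structured events for assertions in smoke scripts.
#[derive(Default)]
pub struct TestHost {
    pub transcript: Vec<RenderEvent>,
}

impl TerminalHost for TestHost {
    fn render(&mut self, event: &RenderEvent) {
        self.transcript.push(event.clone());
    }
}

/// Shell side: emits events without knowing which host displays them.
pub fn shell_turn(host: &mut dyn TerminalHost, output: &[&str], prompt: &str) {
    for line in output {
        host.render(&RenderEvent::Line(line.to_string()));
    }
    host.render(&RenderEvent::Prompt(prompt.to_string()));
}
```

`shell_turn` is identical for both hosts. Only the host decides whether a line becomes bytes on a wire or an entry in a test transcript.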

## Migration Plan

1. Land the current shell-spawned `StdIO` clients as an explicit prototype:
   no app-specific shell builtins, no terminal-cap delegation to children, and
   `run` available for blocking command execution.
2. Add focused QEMU smoke tests for chat and adventure against that prototype
   so the resident service, exact grants, wait path, and terminal bridge have
   a stable regression target.
3. Add a userspace `CommandSession` DTO/protocol in the shared demo/runtime
   layer, carried over ordinary `Endpoint` until a manifest-visible interface
   is worth committing.
4. Teach `capos-shell` a generic command-surface parser and command-provider
   registry. Do not add `chat`, `play adventure`, `go`, `take`, or similar
   application verbs as hardcoded shell matches.
5. Move adventure command parsing out of `demos/adventure-client/` and into
   command descriptors plus typed `Adventure`/`Chat` invocations.
6. Split terminal hosting from the shell when the local UART path needs to
   support a second front end or when the web terminal work starts. Until then,
   keep the current terminal implementation constrained to the `TerminalSession`
   boundary so the split is mechanical.