Proposal: Interactive Command Surfaces

Typed command surfaces for native interactive applications without moving application parsing into StdIO text streams.

Current Target Versus Future Design

The immediate target is deliberately narrower than this proposal:

  • capos-shell exposes generic process control commands, including spawn for asynchronous launch and run for launch-and-wait.
  • Chat and adventure clients are ordinary spawned commands, not shell builtins.
  • Interactive child I/O uses an explicit StdIO endpoint client with stdin/stdout/stderr-shaped semantics while the shell keeps ownership of its TerminalSession.
  • Focused QEMU smokes prove the resident-service plus shell-spawned-client path before the native command protocol hardens.

The future native design is the CommandSession/CommandSurface protocol below. It should replace semantic command parsing inside chat/adventure clients once the prototype has proved the process, grant, wait, and terminal bridging mechanics.

Problem

The current chat/adventure worktree moved application commands out of capos-shell builtins and into ordinary shell-spawned clients. That fixes one bad boundary, but it leaves another one: the clients read lines from StdIO and parse command text such as go north, take key, /join #lobby, and say hello themselves.

That is still too stringly typed for capOS. The kernel and services already expose typed capabilities. Native interactive applications should not receive their primary operation as an unstructured terminal line and then rebuild an ad hoc parser. StdIO is useful for textual programs, logs, compatibility layers, and simple smoke harnesses. It is not the right semantic boundary for a native application command language.

The other design pressure is terminal reuse. The same native shell should work from a local UART, GUI pane, web terminal, or test harness. That argues for a terminal host process that owns terminal transport and rendering separately from the shell process that owns command routing and capability context.

Goals

  • Keep application-specific verbs out of capos-shell.
  • Keep application command semantics out of unstructured StdIO text parsing.
  • Let a user type familiar command forms such as go north or chat join #lobby while the executable representation is a typed invocation.
  • Support nested subcommands without hardcoding app grammar into the shell.
  • Let terminal hosts provide line editing, completion, history, resize, and GUI/web rendering from the same command metadata.
  • Preserve typed service authority: parsing a command never grants access, and every effect still requires the right capability.

Non-Goals

  • POSIX shell compatibility.
  • A global command namespace.
  • Making terminal text a security boundary.
  • Removing StdIO; it remains the byte/text stream adapter for programs whose interface really is textual.

Layering

flowchart TD
    Uart[UART TerminalHost] --> Terminal[Terminal entity]
    Web[Web TerminalHost] --> Terminal
    Gui[GUI TerminalHost] --> Terminal
    Terminal --> Shell[Native shell session]
    Shell --> Cmd[Interactive CommandSession]
    Cmd --> Adventure[Adventure service cap]
    Cmd --> Chat[Chat service cap]
    Shell --> Launcher[Restricted launcher]
    Shell --> Broker[AuthorityBroker]

The terminal host owns raw input/output, line discipline, presentation state, history, paste handling, resize events, and later GUI/web affordances. The terminal entity is the session object the host exposes to a foreground shell or application view. TerminalSession remains the capability boundary for a foreground text session, but it does not have to be implemented inside the shell.

The native shell owns command namespace, current capability context, spawn/wait state, and policy-mediated bundle changes. It can run from any terminal host because it talks to the terminal entity, not to a particular UART.

An interactive application owns a CommandSession. It exposes a command surface and receives structured invocations. The application may be a thin adapter over service capabilities, as the adventure client should be, or a resident service may expose the command session directly.

Command Pattern

command <args> is acceptable as user-facing syntax, but it must not become the application ABI. It is a parseable notation for a declared command surface. The shell or terminal host parses text into a CommandInvocation; the application receives typed fields.

Conceptual schema:

struct CommandSurface {
  revision @0 :UInt64;
  prompt @1 :Text;
  commands @2 :List(CommandSpec);
}

struct CommandSpec {
  path @0 :List(Text);
  summary @1 :Text;
  args @2 :List(CommandArg);
  flags @3 :List(CommandFlag);
  redaction @4 :List(RedactionClass);
}

struct CommandArg {
  name @0 :Text;
  kind @1 :CommandValueKind;
  required @2 :Bool;
  variadic @3 :Bool;
  restOfLine @4 :Bool;
  completions @5 :CompletionSource;
}

struct CommandInvocation {
  surfaceRevision @0 :UInt64;
  path @1 :List(Text);
  args @2 :List(CommandValue);
  flags @3 :List(CommandFlagValue);
}

interface CommandSession {
  describe @0 () -> (surface :CommandSurface);
  invoke @1 (command :CommandInvocation) -> (result :CommandResult);
  poll @2 (maxEvents :UInt16) -> (events :List(CommandEvent));
  close @3 () -> ();
}

The parser is generic:

  • Match the longest declared command path.
  • Parse arguments according to the declared shapes.
  • Treat ambiguous prefixes as errors with alternatives.
  • Treat restOfLine as one text argument; do not split it again in the app.
  • Attach redaction metadata before audit or transcript recording.
  • Re-read CommandSurface when a command returns a new revision.
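A minimal sketch of the longest-path and restOfLine rules, written in Python for brevity. The names ArgSpec, CommandSpec, Invocation, ParseError, and parse_line are hypothetical stand-ins that loosely mirror the conceptual schema; they are not capOS APIs. Value kinds, flags, prefix abbreviation, and redaction are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgSpec:
    name: str
    required: bool = True
    rest_of_line: bool = False


@dataclass(frozen=True)
class CommandSpec:
    path: tuple
    args: tuple = ()


@dataclass(frozen=True)
class Invocation:
    surface_revision: int
    path: tuple
    args: tuple  # (name, value) pairs, in declaration order


class ParseError(Exception):
    pass


def _pop_token(text):
    """Split off the first whitespace-delimited token; return (token, rest)."""
    parts = text.strip().split(None, 1)
    if not parts:
        return None, ""
    return parts[0], parts[1] if len(parts) > 1 else ""


def parse_line(line, specs, revision):
    """Turn one line of user text into a typed Invocation."""
    tokens = line.split()
    # Every declared path that prefixes the token list is a candidate;
    # the longest one wins.
    matches = [s for s in specs if tuple(tokens[:len(s.path)]) == s.path]
    if not matches:
        raise ParseError(f"unknown command: {line!r}")
    spec = max(matches, key=lambda s: len(s.path))

    # Drop the path tokens but keep the remaining text untouched.
    rest = line
    for _ in spec.path:
        _, rest = _pop_token(rest)

    values = []
    for arg in spec.args:
        if arg.rest_of_line:
            # The whole remainder becomes one text value; it is not split.
            value, rest = rest.strip(), ""
        else:
            value, rest = _pop_token(rest)
        if not value:
            if arg.required:
                raise ParseError(f"missing argument: {arg.name}")
            continue
        values.append((arg.name, value))
    if rest.strip():
        raise ParseError(f"unexpected trailing text: {rest!r}")
    return Invocation(revision, spec.path, tuple(values))
```

With inventory and inventory equip both declared, the line inventory equip lantern resolves to the two-token path, while a bare inventory resolves to the one-token path. A say command declared with a single restOfLine argument receives hello there as one value.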

The application can still reject a typed invocation if the command is no longer valid. That is ordinary semantic validation, not text parsing.
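As an illustration of that split, the sketch below shows a session rejecting a well-formed invocation for semantic reasons. AdventureSession, Result, and the rule that a stale surfaceRevision is refused are assumptions made for this example; the proposal does not fix how CommandResult reports revisions or errors.

```python
from dataclasses import dataclass


@dataclass
class Result:
    ok: bool
    message: str
    surface_revision: int  # lets the caller notice that the surface changed


class AdventureSession:
    """Hypothetical in-memory session with one room and one exit."""

    def __init__(self):
        self.revision = 1
        self.exits = {"north"}

    def invoke(self, surface_revision, path, args):
        # Assumed policy: an invocation parsed against an old surface is refused.
        if surface_revision != self.revision:
            return Result(False, "stale surface, describe again", self.revision)
        if path == ("go",):
            direction = args.get("direction")
            # Semantic validation on typed fields; no text is parsed here.
            if direction not in self.exits:
                return Result(False, f"no exit: {direction}", self.revision)
            # Moving changes the available exits, so the surface revision bumps.
            self.exits = {"south"}
            self.revision += 1
            return Result(True, f"you walk {direction}", self.revision)
        return Result(False, "unknown command", self.revision)
```

A caller that sees a surface_revision different from the one it sent would call describe again before parsing the next line.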

Subcommand Nesting

Nested subcommands work if the command path is represented as a token list rather than a single string. Examples:

go north
take brass-key
say hello there
chat join #lobby
chat who
inventory equip lantern
admin npc spawn wanderer room=atrium

Those become:

path=["go"], args={direction:"north"}
path=["take"], args={item:"brass-key"}
path=["say"], args={text:"hello there"}
path=["chat","join"], args={channel:"#lobby"}
path=["chat","who"], args={}
path=["inventory","equip"], args={item:"lantern"}
path=["admin","npc","spawn"], args={kind:"wanderer", room:"atrium"}

The shell does not need adventure-specific code for any of these. It needs a generic command tree, longest-prefix matching, value parsers, and completion hooks. The same mechanism can describe shell commands such as spawn, wait, login, and caps, even if the implementations remain inside the shell for now.
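One possible shape for the generic command tree is sketched below in Python. It reads "ambiguous prefixes" as abbreviated command tokens, which is an interpretation rather than something the proposal states, and the names build_tree, resolve_token, resolve_path, and AmbiguousCommand are hypothetical.

```python
class AmbiguousCommand(Exception):
    def __init__(self, token, alternatives):
        super().__init__(f"{token!r} is ambiguous: {', '.join(alternatives)}")
        self.alternatives = alternatives


def build_tree(paths):
    """Build a nested dict keyed by path token from declared command paths."""
    tree = {}
    for path in paths:
        node = tree
        for part in path:
            node = node.setdefault(part, {})
    return tree


def resolve_token(token, candidates):
    """Resolve one typed token against the names declared at this level."""
    if token in candidates:
        return token  # an exact match always wins
    hits = sorted(name for name in candidates if name.startswith(token))
    if len(hits) == 1:
        return hits[0]
    if not hits:
        raise KeyError(token)
    raise AmbiguousCommand(token, hits)


def resolve_path(tokens, tree):
    """Walk the tree while tokens resolve; return (path, remaining tokens)."""
    path, node, rest = [], tree, list(tokens)
    while rest and node:
        try:
            name = resolve_token(rest[0], node)
        except KeyError:
            break  # the token is not a subcommand, so arguments start here
        path.append(name)
        node = node[name]
        rest.pop(0)
    return tuple(path), rest
```

Because the tree is built only from declared paths, adding spawn or caps next to say and chat needs no new shell code. It only makes the abbreviations s and c ambiguous, and the error reports the alternatives.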

Subcommand nesting is also a better fit for GUI/web sessions than raw StdIO. A terminal host can render chat join as a command palette entry, offer direction completions for go, or show buttons for zero-argument commands such as look, all from the same metadata.
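A small sketch of how a host might derive such affordances from command metadata. The render_hints function and the "button" and "palette" widget labels are invented for illustration; a real host would work from CommandSpec records rather than plain tuples.

```python
def render_hints(specs):
    """Map declared commands to front-end affordances.

    specs is an iterable of (path, arg_names) pairs.
    """
    hints = []
    for path, arg_names in specs:
        label = " ".join(path)
        if not arg_names:
            # Zero-argument commands can be fired directly from a button.
            hints.append({"widget": "button", "label": label})
        else:
            # Commands with arguments need an input step, so they are
            # offered as palette entries with placeholders.
            placeholder = " ".join(f"<{name}>" for name in arg_names)
            hints.append({"widget": "palette", "label": f"{label} {placeholder}"})
    return hints
```

For example, look with no arguments yields a button, and chat join with a channel argument yields a palette entry labelled chat join <channel>.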

Adventure Shape

The adventure command session should own only the caps it needs:

adventure       Adventure or Endpoint client cap
chat            Chat or Endpoint client cap
session         optional UserSession metadata cap

It should expose a dynamic surface derived from current player state:

  • look
  • go <direction> with room-specific direction completions
  • take <item> with visible item completions
  • drop <item> with inventory completions
  • inventory
  • say <text...> with restOfLine=true
  • chat join <channel>
  • chat who
  • quit

The shell or terminal host parses those forms. The adventure command session turns the resulting invocation into typed Adventure and Chat calls. The adventure service still validates the session-bound caller identity, room, exits, items, and chat channel authority. Dynamic completions are convenience, not authority.
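The difference between completion and authority can be sketched as follows. The room and inventory structures, build_surface, and service_take are hypothetical; the real Adventure service interface is not defined in this proposal.

```python
def build_surface(revision, room, inventory):
    """Derive a command surface (with completions) from player state."""
    return {
        "revision": revision,
        "commands": {
            ("look",): {},
            ("go",): {"direction": sorted(room["exits"])},
            ("take",): {"item": sorted(room["items"])},
            ("drop",): {"item": sorted(inventory)},
            ("inventory",): {},
        },
    }


def service_take(room, inventory, item):
    """Service-side check; it never assumes the caller used the completions."""
    if item not in room["items"]:
        return False
    room["items"].remove(item)
    inventory.add(item)
    return True
```

A client that sends take lantern when no lantern is visible is still refused by the service, whatever the completion list contained. After a successful take, a surface rebuilt with a new revision lists the item under drop instead.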

This is the balance capOS wants: generic shell integration, app-owned command metadata, typed service calls, and no application-specific shell builtins.

Role of StdIO

StdIO remains useful, but it should be demoted to a transport and compatibility interface:

  • output streams for simple textual programs,
  • test harnesses that script input and check transcript output,
  • POSIX personality descriptor emulation,
  • applications whose real protocol is text.

For capOS-native interactive applications, StdIO.read() should not be the primary command interface. A command session can still emit render events that the shell forwards to a terminal host, and a compatibility adapter can expose the same session as text when necessary.
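A compatibility adapter of that kind could look roughly like the sketch below. It assumes a session object whose invoke method returns printable event strings and a parse callable that raises ValueError on bad input; both are simplifications invented here, as are DemoSession and demo_parse.

```python
import io


def run_text_adapter(session, parse, stdin, stdout):
    """Drive a command session from a line-oriented text stream."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        try:
            path, args = parse(line)
        except ValueError as err:
            # Parse errors are reported as text and never reach the session.
            stdout.write(f"error: {err}\n")
            continue
        for event in session.invoke(path, args):
            stdout.write(event + "\n")


class DemoSession:
    """Stand-in session that reports what it was asked to run."""

    def invoke(self, path, args):
        return [f"ran {' '.join(path)} {args}"]


def demo_parse(line):
    """Toy parser: the first token is the path, the rest are arguments."""
    tokens = line.split()
    if tokens[0] not in {"look", "say"}:
        raise ValueError(f"unknown command {tokens[0]!r}")
    return (tokens[0],), tokens[1:]
```

The adapter is the only component that touches text on both sides. The session sees the same typed path and arguments it would receive from the shell, so a smoke harness can script it through io.StringIO without the application growing a parser.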

Terminal Host Separation

The shell should not permanently own the terminal implementation. A separate terminal host process gives the system one shell that can be reused across different front ends:

  • local UART host for QEMU and early hardware,
  • web host for browser terminal sessions,
  • GUI host for a desktop pane or command palette,
  • test host for smoke scripts.

Each host owns a terminal entity and grants a foreground TerminalSession or equivalent view to the shell. The shell runs command sessions and returns render/update events. The host decides how to display them.

This also avoids a future false choice between “shell owns the terminal” and “child process receives the terminal.” The terminal entity can support a foreground lease, shell-mediated command sessions, and later split panes or GUI widgets without making every child process a terminal driver.
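The foreground lease idea can be sketched as a small state machine. TerminalEntity and ForegroundLease below are hypothetical Python stand-ins, not the TerminalSession capability, and the single-holder rule is an assumption made for the example.

```python
class TerminalEntity:
    """Hypothetical terminal entity with at most one foreground lease."""

    def __init__(self):
        self._active = None
        self.screen = []

    def acquire(self, holder):
        if self._active is not None:
            raise RuntimeError(f"foreground already leased to {self._active.holder}")
        self._active = ForegroundLease(self, holder)
        return self._active

    def _write(self, lease, text):
        # Only the lease currently in the foreground may draw.
        if lease is not self._active:
            raise PermissionError("lease is no longer in the foreground")
        self.screen.append(f"{lease.holder}: {text}")

    def _release(self, lease):
        if lease is self._active:
            self._active = None


class ForegroundLease:
    """Handle given to whoever currently owns the foreground."""

    def __init__(self, terminal, holder):
        self._terminal = terminal
        self.holder = holder

    def write(self, text):
        self._terminal._write(self, text)

    def release(self):
        self._terminal._release(self)
```

In this sketch the lease object itself is the authority: once released, an old handle can no longer write, even if the same holder name acquires the foreground again later.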

Migration Plan

  1. Land the current shell-spawned StdIO clients as an explicit prototype: no app-specific shell builtins, no terminal-cap delegation to children, and run available for blocking command execution.
  2. Add focused QEMU smokes for chat and adventure against that prototype so the resident service, exact grants, wait path, and terminal bridge have a stable regression target.
  3. Add a userspace CommandSession DTO/protocol in the shared demo/runtime layer, carried over ordinary Endpoint until a manifest-visible interface is worth committing.
  4. Teach capos-shell a generic command-surface parser and command-provider registry. Do not add chat, play adventure, go, take, or similar application verbs as hardcoded shell matches.
  5. Move adventure command parsing out of demos/adventure-client/ and into command descriptors plus typed Adventure/Chat invocations.
  6. Split terminal hosting from the shell when the local UART path needs to support a second front end or when the web terminal work starts. Until then, keep the current terminal implementation constrained to the TerminalSession boundary so the split is mechanical.