Proposal: SSH Shell Gateway

Production remote shell access for capOS using SSH as a terminal transport while preserving the native shell’s capability boundaries.

Status Split

Implemented:

SSH-shaped authority prerequisites and fixture authentication proof: development-only sign-only host key, manifest-seeded authorized-key lookup, public-key session minting over fixture authentication bytes, unsupported feature policy/audit classification, restricted shell launcher, and a bounded host-local plain-TCP terminal-host proof.
UserSession.auditContext fails closed after logout (same ensure_session_live guard as info()); test-ssh-public-key-session proves pre-logout success, post-logout failure, idempotent second logout, and continued closed state.

Not implemented:

encrypted SSH packet transport;
OpenSSH-compatible key exchange and channel handling;
full SSH userauth transcript validation;
channel binding;
TerminalSessionFromByteStream terminal-factory wiring;
OpenSSH harness.

Do not infer OpenSSH-compatible remote login from the current “partially implemented” status.

Remote and non-loopback deployment is blocked. The current proof uses development/fixture key material and host-local plaintext wiring for bounded authority checks; it is not a production SSH service. Before exposure beyond loopback, the implementation must have encrypted SSH transport, production host-key storage, durable authorized-key/account storage, full userauth transcript validation, channel binding, audit records for auth and shell launch, and a reviewed pre-auth/post-auth isolation story.

Problem

The Telnet Shell Demo described in Networking proves that a remote TCP connection can become a TerminalSession without granting the shell raw network authority. That is the right capability boundary, but Telnet is intentionally not a production remote access path. It has no encryption, no host authentication, no replay protection, no key-based user authentication, and no deployable security story beyond “host loopback in QEMU.” This proposal is the production remote-shell successor to that loopback-only research demo; the demo’s TerminalSession boundary survives, but its plaintext transport does not.

capOS needs a production-oriented CLI remote shell that works with normal SSH clients while avoiding the Unix mistake of treating an SSH login as a raw remote root shell, ambient user id, inherited file descriptor set, or global filesystem entry point.

The SSH path should be a terminal host and session authenticator. It should not become a general-purpose privilege broker, TCP proxy, process supervisor, or substitute for the native shell’s capability model.

Relationship To Telnet

SSH reuses the Telnet Shell Demo’s core contract – the same TerminalSession boundary Shell requires for any terminal-backed capos-shell, and the same broker-issued shell bundle Boot to Shell mints for a fresh session:

A gateway accepts TCP connections.
The gateway owns transport framing and terminal-host behavior.
The spawned capos-shell receives a cap named terminal implementing TerminalSession.
The shell receives the normal broker-issued shell bundle for the authenticated session.
The shell does not receive raw TcpSocket, NetworkManager, listener, broad process-spawn, private-key, authorized-key-store, or host-key authority.

The transport changes. Telnet handles plaintext option negotiation over a host-loopback QEMU forwarding rule. SSH handles version exchange, key exchange, host-key proof, encrypted packet framing, user authentication, session channels, PTY requests, window changes, shell requests, and clean channel teardown.

The security boundary does not change. The shell still sees only a terminal session and a scoped capability bundle.

SSH is not the only remote client model. It is the production terminal/CLI transport for operators who want an interactive shell. Programmatic clients should use the remote session CapSet path instead: authenticate through a session/admission method, receive a broker-issued remote CapSet view, and call provided capabilities over Cap’n Proto RPC without creating a shell. Public-key account records may feed both paths, but the authentication transcript bytes must be domain-separated by protocol and channel binding. See Remote Session CapSet Clients.
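
One way to read the domain-separation requirement is sketched below. It is a minimal illustration, not the capOS transcript format: the AuthDomain enum, the tag strings, and the length-prefixed layout are all assumptions made for this example. The point it shows is that the same key and the same challenge produce different signed bytes for the shell path and the CapSet path, and for different channels.

```rust
/// Hypothetical protocol domains; the proposal does not fix these labels.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum AuthDomain {
    SshShell,
    RemoteCapSet,
}

impl AuthDomain {
    /// Assumed tag strings, chosen only for illustration.
    fn tag(self) -> &'static str {
        match self {
            AuthDomain::SshShell => "capos-auth/ssh-shell/v1",
            AuthDomain::RemoteCapSet => "capos-auth/remote-capset/v1",
        }
    }
}

/// Append a 32-bit big-endian length and then the bytes, so a byte cannot
/// move from one field into its neighbour without changing the output.
/// The sketch assumes inputs far smaller than 4 GiB.
fn put_field(out: &mut Vec<u8>, field: &[u8]) {
    out.extend_from_slice(&(field.len() as u32).to_be_bytes());
    out.extend_from_slice(field);
}

/// Build the bytes a user key would sign for one protocol and one channel.
pub fn signed_transcript(domain: AuthDomain, channel_binding: &[u8], challenge: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    put_field(&mut out, domain.tag().as_bytes());
    put_field(&mut out, channel_binding);
    put_field(&mut out, challenge);
    out
}
```

With this shape, a signature captured on the SSH shell path does not verify on the remote CapSet path, because the verifier rebuilds the transcript with its own domain tag and channel binding.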

The first SSH implementation milestone is still host-local development. It should not silently inherit the Telnet demo’s trusted gateway compromise. Before implementation, the SSH path must either close the gateway authority gap with scoped listener and shell-only launcher grants, or explicitly preserve that gap in a task record as a host-local-only compromise while still proving that the spawned shell has no raw network, spawn, key, or SSH transport authority.

For production exposure, pre-auth and post-auth shell flows must not share broad process or address-space authority. Either split the authentication gateway and post-auth shell launcher into separate processes with narrow handoff caps, or produce a reviewable proof that the shared process cannot use pre-auth network, key, listener, or parser state as post-auth shell authority.

Scope

Initial SSH support is deliberately narrow:

SSH-2 only, following the RFC 4251-4254 family at the protocol level.
One interactive session channel per connection for the first proof.
pty-req, window-change, shell, EOF, close, and disconnect handling.
Public-key user authentication first.
Fresh random material for key exchange, rekey, padding, session identifiers, and authentication challenges comes from EntropySource or a narrowed SSH transport-crypto service that owns EntropySource; it is never ambient process state.
Password authentication only if it is wired to the existing CredentialStore failure/backoff path and policy explicitly enables it.
No port forwarding, agent forwarding, X11 forwarding, SFTP, SCP, subsystem requests, exec requests, direct TCP forwarding, or arbitrary environment import in the first milestone.

Those excluded SSH features are not harmless defaults. In capOS they require their own capabilities, policy, accounting, and audit records before exposure.

Components

flowchart TD
    Client[SSH client] -->|TCP 22| Gateway[SshGateway]
    Gateway --> HostKey[SshHostKey cap]
    Gateway --> Keys[AuthorizedKeyStore]
    Gateway --> Sessions[SessionManager]
    Gateway --> Broker[AuthorityBroker]
    Gateway --> Launcher[RestrictedShellLauncher]
    Gateway --> Listen[TcpListenAuthority]
    Gateway --> Audit[AuditLog]

    Keys --> Sessions
    Sessions --> Broker
    Broker --> Bundle[Scoped shell bundle]
    Gateway --> Terminal[SSH-backed TerminalSession]
    Launcher --> Shell[capos-shell]
    Terminal --> Shell
    Bundle --> Shell

SshGateway is the only component exposed to the network. It is an ordinary userspace service once the socket capability path can support it. During an early implementation it may wrap the same in-kernel TCP capabilities used by Telnet; a later decomposed-network stack should not change the shell contract. The schema-level gateway contract is intentionally small: status and shutdown methods identify the service surface without granting child shell authority.

SshHostKey is a sign-only private-key capability. It should be backed by the PrivateKey/KeyVault model from Cryptography and Key Management: the gateway can sign the SSH exchange hash but cannot export private key material, enumerate unrelated keys, or administer the vault.

AuthorizedKeyStore maps an SSH public key to a principal and authentication policy. It stores public key material and policy metadata, not shell authority. OpenSSH-format public keys are bytes imported into a verifier path, matching the crypto proposal’s PublicKeyFormat.opensshWire escape hatch for public material. The initial schema returns an SshAuthorizedKeyDecision with principal/profile metadata and an audit reason; actual shell authority still comes from SessionManager and AuthorityBroker.

TerminalSession is backed by the SSH channel. The gateway translates channel data, EOF, close, PTY mode, and window-size events into the terminal host contract. The schema names this construction surface SshTerminalFactory; it returns a result-cap index for the SSH-backed TerminalSession. Password prompts, hidden echo, cancellation, and teardown stay at that boundary.

TcpListenAuthority is the scoped listener grant shape for this milestone. It can mint only the configured TcpListener rather than exposing raw NetworkManager.createTcpListener for arbitrary ports.

RestrictedShellLauncher is narrower than the transitional RestrictedLauncher: it launches only the native shell against a supplied terminal/session context instead of accepting an arbitrary binary name. The current kernel source is manifest-declared as restricted_shell_launcher; it adds the child terminal, session, and stdio grants itself and accepts only named capability-sourced pass-through grants for the reviewed shell startup bundle (creds, sessions, audit, broker, and optional system_info). Before spawn it verifies the supplied UserSession profile matches the requested profile, and the focused proof shows the spawned native shell running under that supplied session.
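
The two pre-spawn checks described above (profile match and named pass-through grants) can be sketched as follows. This is an illustration under stated assumptions: the real launcher operates on capabilities rather than strings, and ShellLaunchRequest, LaunchDenied, and check_launch are hypothetical names. Only the five grant names come from this proposal.

```rust
/// Pass-through grant names from the reviewed shell startup bundle.
const ALLOWED_PASS_THROUGH: [&str; 5] = ["creds", "sessions", "audit", "broker", "system_info"];

#[derive(Debug, PartialEq)]
pub enum LaunchDenied {
    ProfileMismatch,
    GrantNotAllowed(String),
}

/// Hypothetical request shape used only for this sketch.
pub struct ShellLaunchRequest<'a> {
    /// Profile recorded in the supplied UserSession.
    pub session_profile: &'a str,
    /// Profile the caller asked the launcher to run under.
    pub requested_profile: &'a str,
    /// Names of the grants the caller wants passed through to the shell.
    pub pass_through: &'a [&'a str],
}

/// Run both checks before any spawn; the first failure denies the launch.
pub fn check_launch(req: &ShellLaunchRequest) -> Result<(), LaunchDenied> {
    if req.session_profile != req.requested_profile {
        return Err(LaunchDenied::ProfileMismatch);
    }
    for name in req.pass_through {
        if !ALLOWED_PASS_THROUGH.contains(name) {
            return Err(LaunchDenied::GrantNotAllowed((*name).to_string()));
        }
    }
    Ok(())
}
```

Note that the request carries no binary name at all: the launcher's target is fixed, so there is nothing for a caller to vary except the session context and the allow-listed grants.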

Authority Model

The gateway receives only the capabilities required for its job:

TCP listen authority for the configured SSH port, preferably as a manifest-declared TcpListener handoff or scoped listener factory rather than raw NetworkManager.
Sign-only SshHostKey authority for configured host-key algorithms.
Narrow EntropySource authority, or an SshTransportCrypto cap that owns entropy and exposes only SSH key-exchange, rekey, cipher/MAC, and random padding operations.
Read or verify authority over AuthorizedKeyStore.
SessionManager authority to mint a session after successful SSH authentication.
AuthorityBroker authority to request the normal remote shell profile.
Restricted shell launch authority scoped to capos-shell.
Pass-through grants required by the current shell startup path, such as creds, sessions, audit, and broker, where policy permits them.
AuditLog append authority for connection, authentication, launch, and teardown records.

In the production-shaped authority model, it does not receive:

Broad ProcessSpawner authority.
Raw NetworkManager, outbound connectTcp, or an arbitrary listener factory.
Key export or KeyVault administrative authority.
Storage namespace authority except the narrow public-key records required by AuthorizedKeyStore.
SSH agent, port-forward, or subsystem authority unless later proposals add explicit caps for those surfaces.

A host-local development checkpoint may temporarily preserve raw NetworkManager, arbitrary listener factory, or broad ProcessSpawner authority in the gateway only if a task record captures the compromise and the harness proves it does not cross the shell boundary. The spawned shell must never receive raw NetworkManager, TcpListener, TcpSocket, ProcessSpawner, SSH transport, host-key, authorized-key-store, key-vault, or general-purpose entropy authority.

Identity metadata is not authority. A login name, SSH username, key fingerprint, source IP, principal id, or profile label only becomes useful after a trusted service returns a capability bundle.

Authentication

Host authentication

The host key should be a narrow wrapper around a PrivateKey cap, constrained to SSH host-key signing. Host keys are generated or imported through KeyVault, opened through an explicit SealPolicy, and rotated through a versioned host identity record. The gateway can sign the key exchange hash but cannot export private material.

SSH transport keys are separate from the host key. Key exchange must use fresh entropy and the algorithm policy selected for the deployment. The baseline standards are RFC 4251-4254; extension negotiation and modern algorithm recommendations come from later SSH RFCs such as RFC 8308, RFC 8709, RFC 9142, and other updates recorded by the RFC Editor for the 4251-4254 family. The first implementation should pin a small reviewed algorithm set rather than accepting every algorithm a library exposes.

For development, a manifest-seeded host key may be acceptable only when the manifest field, docs, and harness mark it as non-production. The current development path uses kernelParams.sshDevelopmentHostKey with the required label capos-development-only-ssh-host-key and the kernel source ssh_development_host_key; the resulting cap exposes only public metadata and signs bounded ssh-ed25519 exchange hashes with the manifest seed for QEMU proof. make test-ssh-host-key verifies the signature against the configured public key, proves wrong-algorithm denial, and checks that the development seed and raw signature are not printed to proof logs. For deployment, host keys need persistent storage, rotation policy, key-management-backed signing, and audit.
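
The development-path gates can be pictured as two small checks, sketched below. The function names, the error enum, and the 64-byte upper bound on the exchange hash are assumptions for illustration; the required label and the ssh-ed25519 algorithm name are taken from the text above. Signing itself is omitted because it needs an Ed25519 implementation.

```rust
const REQUIRED_LABEL: &str = "capos-development-only-ssh-host-key";
const HOST_KEY_ALGORITHM: &str = "ssh-ed25519";
/// Assumed bound for this sketch; the real bound is not stated in this proposal.
const MAX_EXCHANGE_HASH_LEN: usize = 64;

#[derive(Debug, PartialEq)]
pub enum HostKeyDenied {
    MissingDevelopmentLabel,
    WrongAlgorithm,
    ExchangeHashOutOfBounds,
}

/// Refuse to build the development host-key cap unless the manifest
/// carries the exact non-production label.
pub fn check_manifest_label(label: &str) -> Result<(), HostKeyDenied> {
    if label == REQUIRED_LABEL {
        Ok(())
    } else {
        Err(HostKeyDenied::MissingDevelopmentLabel)
    }
}

/// Gate a signing request before any key material is touched.
pub fn check_sign_request(algorithm: &str, exchange_hash: &[u8]) -> Result<(), HostKeyDenied> {
    if algorithm != HOST_KEY_ALGORITHM {
        return Err(HostKeyDenied::WrongAlgorithm);
    }
    if exchange_hash.is_empty() || exchange_hash.len() > MAX_EXCHANGE_HASH_LEN {
        return Err(HostKeyDenied::ExchangeHashOutOfBounds);
    }
    Ok(())
}
```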

User public keys

Public-key login maps an accepted SSH public key to a principal record and authentication strength. The key record should include:

AuthorizedSshKey {
  keyId
  principalId
  publicKey
  algorithm
  fingerprint
  allowedProfiles
  sourcePolicy
  createdAtMs
  disabledAtMs
  comment
}

The current manifest-seeded prerequisites implement public key record loading, generic authorization decisions, and a bounded session-mint bridge. The AuthorizedKeyStore accepts ssh-ed25519 records with 32-byte public keys and SHA-256 fingerprints, rejects duplicate ids and fingerprints, maps principals to existing seed accounts, and denies disabled records. SessionManager accepts bounded fixture authentication bytes/signatures for configured keys and mints UserSession metadata with publicKey authentication strength; the focused make test-ssh-public-key-auth proof also shows AuthorityBroker denying a mismatched shell profile.
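
The loading and decision rules in the paragraph above can be sketched as a small in-memory store. This is an assumption-laden illustration, not the capOS implementation: the Rust field types, LoadError, KeyDecision, and the method names are hypothetical, the fingerprint is taken as a precomputed string (SHA-256 is outside the standard library), and the mapping of principals to seed accounts is left out.

```rust
use std::collections::HashSet;

/// Subset of the AuthorizedSshKey record; field types are assumed.
#[derive(Clone, Debug)]
pub struct AuthorizedSshKey {
    pub key_id: String,
    pub principal_id: String,
    pub public_key: Vec<u8>,
    pub algorithm: String,
    pub fingerprint: String,
    pub disabled_at_ms: Option<u64>,
}

#[derive(Debug, PartialEq)]
pub enum LoadError { UnsupportedAlgorithm, BadKeyLength, DuplicateKeyId, DuplicateFingerprint }

#[derive(Debug, PartialEq)]
pub enum KeyDecision { Accepted { principal_id: String }, Unknown, Disabled }

#[derive(Default)]
pub struct AuthorizedKeyStore {
    records: Vec<AuthorizedSshKey>,
    ids: HashSet<String>,
    fingerprints: HashSet<String>,
}

impl AuthorizedKeyStore {
    /// Validate one record and add it; any violation rejects the record.
    pub fn load(&mut self, rec: AuthorizedSshKey) -> Result<(), LoadError> {
        if rec.algorithm != "ssh-ed25519" {
            return Err(LoadError::UnsupportedAlgorithm);
        }
        if rec.public_key.len() != 32 {
            return Err(LoadError::BadKeyLength);
        }
        if self.ids.contains(&rec.key_id) {
            return Err(LoadError::DuplicateKeyId);
        }
        if self.fingerprints.contains(&rec.fingerprint) {
            return Err(LoadError::DuplicateFingerprint);
        }
        self.ids.insert(rec.key_id.clone());
        self.fingerprints.insert(rec.fingerprint.clone());
        self.records.push(rec);
        Ok(())
    }

    /// Return a decision only; no shell authority is attached to it.
    pub fn decide(&self, public_key: &[u8]) -> KeyDecision {
        match self.records.iter().find(|r| r.public_key.as_slice() == public_key) {
            None => KeyDecision::Unknown,
            Some(r) if r.disabled_at_ms.is_some() => KeyDecision::Disabled,
            Some(r) => KeyDecision::Accepted { principal_id: r.principal_id.clone() },
        }
    }
}
```

The decision type deliberately carries only a principal id: an accepted key still has to go through SessionManager and AuthorityBroker before any capability is handed out.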

SessionManager.sshPublicKey consults the bootstrap RamAccountStore after signature verification using lookup_by_principal. Non-Active account statuses (Disabled, Locked, RecoveryOnly) and missing principals fail closed before a session is minted, so a runtime account-store mutation cannot be ignored by the SSH path even though authorized-key records carry their own disabledAtMs flag. The bootstrap fallback (no account store wired) keeps the seed-account validation contract: manifest validation guarantees every authorized-key principal binds to an active seed account. The test-ssh-public-key-session smoke also proves UserSession.auditContext returns principal metadata before logout and fails closed with ensure_session_live after explicit logout(), matching the same fail-closed contract as info().
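
The ordering and fail-closed behaviour can be sketched as a single gate function. The names AccountStatus, Lookup, and mint_gate are hypothetical, the account lookup is modelled as a closure so the sketch can show that it runs only after signature verification, and the bootstrap fallback without an account store is not modelled. The returned strings are the audit codes this proposal defines for these outcomes.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum AccountStatus {
    Active,
    Disabled,
    Locked,
    RecoveryOnly,
}

/// Result of a principal lookup: `Ok(None)` is a missing principal,
/// `Err(())` models a store failure.
pub type Lookup = Result<Option<AccountStatus>, ()>;

/// Decide whether a session may be minted. Every branch except an
/// Active account returns a denial code.
pub fn mint_gate<F>(signature_ok: bool, lookup_by_principal: F) -> Result<(), &'static str>
where
    F: FnOnce() -> Lookup,
{
    if !signature_ok {
        // The account store is not consulted for an unverified signature.
        return Err("ssh-bad-signature");
    }
    match lookup_by_principal() {
        Ok(Some(AccountStatus::Active)) => Ok(()),
        Ok(Some(AccountStatus::Disabled)) => Err("ssh-account-disabled"),
        Ok(Some(AccountStatus::Locked)) => Err("ssh-account-locked"),
        Ok(Some(AccountStatus::RecoveryOnly)) => Err("ssh-account-recovery-only"),
        Ok(None) => Err("ssh-account-missing"),
        Err(()) => Err("ssh-account-lookup-failed"),
    }
}
```

Because the match is exhaustive over AccountStatus, adding a new status later forces an explicit decision here instead of silently falling through to success.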

Each denial path emits a stable auth= audit code (no schema variant change). The codes form the SSH gateway’s operator-visible audit contract: ssh-public-key for success, ssh-key-unknown, ssh-key-disabled, ssh-key-profile-not-allowed, ssh-bad-signature, ssh-account-missing, ssh-account-disabled, ssh-account-locked, ssh-account-recovery-only, ssh-account-lookup-failed, ssh-profile-kind-invalid, ssh-profile-not-interactive, ssh-auth-bytes-invalid. Failed records keep principal and profile blank by policy: the auth= code is the only discriminator, so failed-auth lines cannot be used as a side channel to probe for valid principal IDs.
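
The blanking rule can be sketched as a record constructor that drops identity fields for every code other than the success code. AuthAuditRecord and the rendered line format are assumptions for illustration; only the auth= codes and the blank-on-failure policy come from this proposal.

```rust
#[derive(Debug, PartialEq)]
pub struct AuthAuditRecord {
    pub auth: &'static str,
    pub principal: String,
    pub profile: String,
}

impl AuthAuditRecord {
    /// Keep principal and profile only for the success code; every other
    /// code produces a record with both fields blank.
    pub fn new(auth: &'static str, principal: &str, profile: &str) -> Self {
        if auth == "ssh-public-key" {
            AuthAuditRecord {
                auth,
                principal: principal.to_string(),
                profile: profile.to_string(),
            }
        } else {
            AuthAuditRecord {
                auth,
                principal: String::new(),
                profile: String::new(),
            }
        }
    }

    /// Assumed line layout, used here only to make the effect visible.
    pub fn render(&self) -> String {
        format!("auth={} principal={} profile={}", self.auth, self.principal, self.profile)
    }
}
```

Two failed attempts with the same code render identically whatever principal the client claimed, which is the property that keeps failed-auth lines from leaking which principal ids exist.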

This is still not a complete SSH public-key authentication exchange: no SSH transport transcript, channel binding, or terminal factory is wired end-to-end. A bounded plain-TCP terminal-host proof now reuses the configured key fixture to mint a public-key session and launch capos-shell through RestrictedShellLauncher, but that proof is not an encrypted SSH transport or OpenSSH userauth exchange. End-to-end QEMU proof of the ssh-account-disabled/ssh-account-locked paths requires an AccountStoreManagerCap kernel cap source so a demo can mutate account state at runtime; that is tracked in the local-users management backlog and is not required by the bounded host-local SSH gateway proofs.

Cloud metadata may seed initial authorized keys through the cloud-bootstrap path, but those keys are input to AuthorizedKeyStore, not ambient login authority. A metadata-provided key still needs an account/profile mapping and should be auditable as cloud-seeded material.

Passwords and step-up

Password authentication over SSH is optional and should stay disabled unless CredentialStore can enforce the same generic failure text, bounded backoff, rate limits, and audit behavior as the local shell. Keyboard-interactive can later drive step-up prompts, but it should not be part of the first implementation unless a concrete policy needs it.

SSH Channel Policy

The first gateway accepts only session channels that request an interactive shell. It rejects:

exec requests.
subsystem requests such as SFTP.
agent forwarding.
TCP forwarding and reverse forwarding.
X11 forwarding.
environment variables except a small reviewed allow-list, if any.
more than one active shell channel per connection.

Each rejected request should produce an SSH protocol failure plus an audit record with a reason code. The audit record should not include command lines, environment dumps, key material, or terminal content.

The current bounded policy surface is capos-config::ssh_policy. It allows public-key auth, one session channel, PTY, window-change, and a first shell request. It denies password auth, exec, subsystem/SFTP, direct TCP/IP, TCP/IP forwarding and cancellation, agent forwarding, X11 forwarding, environment import, second session-channel opens, and second shell channels. Password auth has no policy allow path in this proof; it stays denied until a real CredentialStore verifier, backoff, and audit path is wired into the gateway. Denials return only a protocol failure class and a stable audit reason code; request payloads such as command text and environment values are not part of the decision data.
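
A decision function of this shape might look like the sketch below. It is not the capos-config::ssh_policy source: the Request and Decision enums, the ConnectionPolicy state, and every reason-code string are hypothetical. The request type deliberately has no payload fields, so command text and environment values cannot reach the decision or the audit reason.

```rust
/// Request kinds only; no command text, environment values, or addresses.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Request {
    PublicKeyAuth, PasswordAuth, SessionOpen, PtyReq, WindowChange, Shell,
    Exec, Subsystem, DirectTcpip, TcpipForward, CancelTcpipForward,
    AgentForward, X11Forward, EnvImport,
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    /// Stable reason code for the audit record (strings are assumed).
    Deny(&'static str),
}

/// Per-connection state needed to enforce "one session, one shell".
#[derive(Default)]
pub struct ConnectionPolicy {
    session_open: bool,
    shell_started: bool,
}

impl ConnectionPolicy {
    pub fn decide(&mut self, req: Request) -> Decision {
        use Request::*;
        match req {
            PublicKeyAuth | PtyReq | WindowChange => Decision::Allow,
            SessionOpen if self.session_open => Decision::Deny("second-session-channel"),
            SessionOpen => { self.session_open = true; Decision::Allow }
            Shell if self.shell_started => Decision::Deny("second-shell"),
            Shell => { self.shell_started = true; Decision::Allow }
            PasswordAuth => Decision::Deny("password-auth-disabled"),
            Exec => Decision::Deny("exec-denied"),
            Subsystem => Decision::Deny("subsystem-denied"),
            DirectTcpip => Decision::Deny("direct-tcpip-denied"),
            TcpipForward => Decision::Deny("tcpip-forward-denied"),
            CancelTcpipForward => Decision::Deny("cancel-tcpip-forward-denied"),
            AgentForward => Decision::Deny("agent-forward-denied"),
            X11Forward => Decision::Deny("x11-forward-denied"),
            EnvImport => Decision::Deny("env-import-denied"),
        }
    }
}
```

The sketch simplifies ordering: it does not check that PTY and shell requests arrive on an already opened session channel, which a real gateway would also have to enforce.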

Implementation Slices

The final OpenSSH proof should not land as one opaque SSH server commit. Keep the implementation reviewable by landing these slices in order:

Version exchange. A bootable ssh-gateway service accepts one host-local OpenSSH TCP connection, exchanges RFC 4253 identification strings, records only sanitized client software/version metadata, and disconnects before key exchange without launching a shell. The compatibility harness uses /usr/bin/ssh; malformed and overlong client identification strings are covered by a separate low-level hostile TCP/banner fixture.
KEXINIT and algorithm selection. Parse KEXINIT, select exactly one reviewed development algorithm set, and disconnect on unsupported algorithms. Algorithm names are transport policy inputs, not authority.
Development key exchange. Complete the host-local encrypted transport by deriving traffic keys from the negotiated KEX shared secret, exchange hash, and session id per RFC 4253. Entropy supplies ephemeral KEX material, padding, and challenges, not direct session-key bytes. Call SshHostKey.signExchangeHash and prove no private host-key or raw entropy material reaches logs or child shell grants.
Public-key userauth. Bind the OpenSSH public-key userauth transcript to SessionManager.sshPublicKey, accept the configured key, deny unknown keys generically, and keep password auth disabled until a real verifier/backoff path is wired.
Channel policy. Route session open, PTY, window-change, shell, exec, subsystem, forwarding, agent, X11, environment, and second-channel requests through capos-config::ssh_policy, producing protocol-visible failures and sanitized audit reason codes for denied features.
SSH-backed terminal launch. Replace the plain-TCP terminal-host proof with an SSH channel-backed TerminalSession, launch capos-shell through RestrictedShellLauncher, run session, caps, and exit via OpenSSH, and prove cleanup for both client disconnect and shell exit.
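
For the version-exchange slice, a minimal parser for the client identification string could look like the following sketch. It follows the RFC 4253 section 4.2 shape (SSH-protoversion-softwareversion, optional comment after a space, CR LF terminator, at most 255 bytes including CR LF); the ClientIdent and IdentError types and the function name are assumptions, and the sketch accepts only protocol version 2.0.

```rust
#[derive(Debug, PartialEq)]
pub enum IdentError { TooLong, MissingCrLf, NotSsh, UnsupportedProtocol, BadSoftwareVersion }

/// Sanitized metadata only: the optional comment field is never kept.
#[derive(Debug, PartialEq)]
pub struct ClientIdent {
    pub software: String,
}

/// RFC 4253 section 4.2 limit, counting the trailing CR LF.
const MAX_IDENT_LEN: usize = 255;

pub fn parse_identification(line: &[u8]) -> Result<ClientIdent, IdentError> {
    if line.len() > MAX_IDENT_LEN {
        return Err(IdentError::TooLong);
    }
    let body = line.strip_suffix(b"\r\n").ok_or(IdentError::MissingCrLf)?;
    let rest = body.strip_prefix(b"SSH-").ok_or(IdentError::NotSsh)?;
    let rest = rest.strip_prefix(b"2.0-").ok_or(IdentError::UnsupportedProtocol)?;
    // softwareversion ends at the first space; the comment after it is dropped.
    let software = rest.split(|b| *b == b' ').next().unwrap_or(&[]);
    // Printable US-ASCII only, excluding whitespace and the minus sign.
    let valid = !software.is_empty()
        && software.iter().all(|b| b.is_ascii_graphic() && *b != b'-');
    if !valid {
        return Err(IdentError::BadSoftwareVersion);
    }
    Ok(ClientIdent {
        software: String::from_utf8_lossy(software).into_owned(),
    })
}
```

A caller is expected to read at most 255 bytes before invoking the parser, so an overlong banner is rejected without buffering unbounded client input.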

Resource And Teardown Rules

SSH exposes several resource boundaries before the shell even starts: handshake CPU, pending connections, packet buffers, channels, PTY state, terminal buffers, authentication attempts, and live shell processes.

The gateway must have fixed per-connection bounds and fail closed when they are exceeded. Disconnect, TCP close, SSH channel close, failed authentication, session expiration, shell exit, and gateway teardown must all release the same resources:

accepted socket,
SSH connection state,
terminal session object,
spawned shell handle,
broker-issued grants,
authentication challenge state,
audit correlation record.

Shell exit should close the SSH channel. Client disconnect should close the terminal and let the shell observe the normal TerminalSession close path.
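
One way to make every exit path release the same resources is a single idempotent teardown that also runs on drop, sketched below. ConnectionResources and ReleaseLog are hypothetical, the log stands in for the real release calls, and the release order shown is an assumption; this proposal lists the resources but does not fix an order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical event log standing in for the real release calls.
pub type ReleaseLog = Rc<RefCell<Vec<&'static str>>>;

pub struct ConnectionResources {
    log: ReleaseLog,
    released: bool,
}

impl ConnectionResources {
    pub fn new(log: ReleaseLog) -> Self {
        ConnectionResources { log, released: false }
    }

    /// Single release path shared by disconnect, failed authentication,
    /// session expiry, shell exit, and gateway teardown. Calling it more
    /// than once has no further effect.
    pub fn teardown(&mut self) {
        if self.released {
            return;
        }
        self.released = true;
        // Assumed order: shell-facing objects first, socket and audit last.
        for name in [
            "spawned-shell-handle",
            "terminal-session",
            "broker-issued-grants",
            "auth-challenge-state",
            "ssh-connection-state",
            "accepted-socket",
            "audit-correlation-record",
        ] {
            self.log.borrow_mut().push(name);
        }
    }
}

impl Drop for ConnectionResources {
    /// A path that forgets to call teardown still releases everything.
    fn drop(&mut self) {
        self.teardown();
    }
}
```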

Exit Criteria

The first SSH milestone is complete when:

SshGateway, host-key, authorized-key, and SSH-backed terminal contracts are documented in schema/design form.
The development host-key path is available only through an explicitly non-production manifest field and a narrow SshHostKey cap; production signing remains blocked on key management and persistent storage.
A manifest can start an SSH gateway with only scoped TCP listen, host-key, authorized-key, session, broker, audit, and restricted shell-launch grants, or the remaining host-local demo compromise is explicitly preserved in a task record.
The gateway accepts a normal OpenSSH client on a host-local QEMU forwarded port, authenticates one public key, spawns capos-shell with a TerminalSession, runs one command, and disconnects cleanly.
The harness proves denied password login when disabled, denied port forwarding, denied subsystem requests, rejected unknown keys, and cleanup after client disconnect.
The harness proves unavailable entropy or disabled KEX algorithms fail closed before authentication or shell launch.
Documentation states which parts are development-only and which are acceptable for production deployment.

Dependencies

Telnet Shell Demo from Networking for the socket-backed TerminalSession proof this gateway succeeds.
TerminalSessionFromByteStream as a shared prerequisite for SSH channel and TLS/mTLS-backed remote terminals. SSH channel data is not a connected TcpSocket; it must enter the same terminal factory used by Telnet-over-TLS – whose certificate, trust store, ACME, and pinning model lives in Certificates and TLS – so line discipline, echo policy, IAC handling where relevant, close semantics, and hidden password behavior do not fork by transport.
Cryptography and key-management primitives for sign-only host keys.
EntropySource or a narrowed SSH transport-crypto service for key exchange, rekey, packet padding, and challenge freshness.
User identity, account, and session policy records for AuthorizedKeyStore principal/profile mapping.
System-monitoring audit records for remote authentication, denied SSH features, launch decisions, and teardown.
Resource accounting for connection, channel, and shell-process limits.
Persistent storage before production host keys and authorized keys can survive reboot safely.

Remote-shell ingress should land in this order:

TerminalSessionFromByteStream and shared terminal line/echo/hidden-input discipline.
A transport-neutral byte-stream terminal factory used by both SSH channel data and TLS/mTLS cleartext byte streams.
Either Telnet-over-TLS or SSH may land first, but neither should fork terminal semantics.
The production deployment profile chooses SSH for familiar operator CLI access, and TLS/mTLS (configured through Certificates and TLS) for PKI-integrated service/operator environments.

No more SSH terminal transport work should land until the shared prerequisite exists and has proof coverage for byte-identical hidden password behavior, line/IAC factoring, and repeated close/reconnect behavior.

Grounding

This proposal relies on these in-tree design documents and research notes:

Networking for the Telnet Shell Demo this gateway succeeds and the TCP capability path the SSH listener reuses.
Shell for the native capos-shell and the TerminalSession boundary every remote-shell transport must preserve.
Boot to Shell for CredentialStore, SessionManager, AuthorityBroker, RestrictedShellLauncher, and EntropySource, including the bounded SSH terminal-host proof that already lands inside that flow.
Certificates, TLS, and Certificate Transparency for the TLS/mTLS counterpart transport profile and the shared certificate, trust-store, and pinning model the Telnet-over-TLS factory consumes.
Cryptography and Key Management for PrivateKey, PublicKeyFormat.opensshWire, KeyVault, and SealPolicy.
User Identity and Policy for principal/account/session/profile semantics.
Resource Accounting and Quotas for listener, socket, channel, packet-buffer, and shell-process bounds.
System Monitoring for audit record shape and retention boundaries.
Storage and Naming for the capability-native storage model needed before production host keys and authorized keys become durable.
Trust Boundaries for remote-shell ingress review criteria.
Local Users Management Backlog for account, role, and RAM-store sequencing that feeds authorized-key principal mapping.
Genode Research for the session-factory precedent: clients request narrowed sessions from authority-bearing components instead of receiving broad factories directly.
Pingora Research for the listener/service/runtime split that informs keeping TCP listener setup separate from application shell authority.

External standards grounding starts from RFC 4251, RFC 4252, RFC 4253, and RFC 4254. Later SSH algorithm and extension updates, including RFC 8308, RFC 8709, and RFC 9142, must be checked when choosing the implementation’s accepted algorithm set.

Non-Goals

Replacing the native shell with a POSIX shell.
Treating SSH username or Unix UID as authority.
Ambient home directories, inherited file descriptors, or global paths.
SSH agent forwarding as a shortcut to key authority.
SFTP/SCP as a storage API before scoped file/storage capabilities exist.
Port forwarding before explicit network-proxy capabilities and policy exist.
