Proposal: Chat As Multimedia Substrate
How capOS should design Chat as a unified text + audio + video transport
interface for human-to-human, human-to-agent, and service-driven channels –
mapped cleanly to WebRTC for browser participants – so that adding a new
messaging surface (operator chat, agent prompt input, audio call, video call,
file drop) does not require a new top-level capability or a new gateway DTO.
This proposal is the resolution of the “Chat as messaging substrate” research
task in WORKPLAN.md. It does not replace the existing Chat interface in
schema/capos.capnp directly; it specifies the shape the next iteration of
that interface should take, and it states what stays separate (notably:
approvals).
Problem
The existing Chat interface (schema/capos.capnp:372-378) is a text-only,
poll-based room: join, leave, send(text), who, poll(maxEvents) -> List(ChatEvent) where ChatEvent.kind is one of
message|joined|left|system|history. That works for the chat-server demo and
for a denial probe, but it cannot carry:
- incoming events without polling (every browser tab paying for a poll loop is the wrong end-state);
- audio frames (low-latency, lossy, ordered);
- video frames (high-bandwidth, key-frame-aware);
- file/binary attachments (bounded, integrity-checked);
- structured non-text payloads that other surfaces want to share, e.g. agent prompts with tool-call hints, presence beacons, typing indicators, reactions.
Adjacent proposals each invent their own transport for what is fundamentally the same shape:
- realtime-voice-agent-shell-proposal.md defines VoiceSession with openCapture/openPlayback and a RealtimeModelSession with RealtimeInputEvent/RealtimeOutputEvent. Audio frames flow on MemoryObject-backed media rings rather than capnp payloads.
- llm-and-agent-proposal.md defines tool-call records and a per-tool permission gate, but never says how the operator talks to a running agent (send a prompt, get a partial response stream, push audio, receive audio).
- remote-session-capset-client-proposal.md exposes one chatSend DTO method per chat, with no audio/video path at all.
Each proposal independently arrives at “we need a stream-of-events transport with capability-mediated subscription”. The right design is to share one substrate. Chat is already the user-facing name; the substrate should be Chat, extended.
WebRTC is the existing browser-side abstraction that solves the same problem
(text via DataChannel, audio via audio tracks, video via video tracks, all
under one peer connection with negotiated codecs and ICE-managed
connectivity). A capOS Chat channel should map onto a WebRTC peer
connection cleanly enough that a browser participant can be implemented as a
WebRTC peer talking to a capOS-side gateway, without translation gymnastics.
Goals
- Carry text, audio, video, and bounded binary attachments on the same chat cap, with capability-gated subscription per kind.
- Replace poll with listener caps the channel calls back, so capnp-rpc participants do not poll. Keep poll available as a transport-stopgap for DTO clients during the migration to capnp-rpc.
- Carry low-latency frames (audio, video) without copying them through capnp message payloads on the hot path – use MemoryObject-backed media rings or shared frame buffers, with the chat cap conveying control and frame metadata only.
- Map cleanly to WebRTC for browser participants so the gateway can act as a signalling and ICE-relay endpoint without leaking raw browser handles to capOS code.
- Preserve the existing capability model: capability = invoke gate; channel membership = render gate. A subscriber cap is required to receive text events; a separate audio-subscriber cap is required to receive audio frames; a separate video-subscriber cap is required to receive video.
- Preserve session-bound invocation: the chat-cap holder’s session is the caller; channel servers see the live opaque session-scoped reference and may be granted disclosure scopes per session-bound-invocation-context-proposal.md.
- Strict ocap discipline. Every Chat capability is granted explicitly by a holder that already has it. There is no protocol-level “request permission to write to me” flow: until a recipient (or a chain authorized by the recipient) shares a peer cap with the sender, the sender has no path. Rephrased: capabilities flow forward only, by deliberate sharing.
- Cap lineage and transitive revocation are substrate-level invariants,
enforced by the Chat service with kernel support. Lineage is a
service concern, not a kernel one (per capOS’s “prefer userspace
capability wrappers over kernel-side policy checks” principle). The
root of every chat-cap lineage tree is the Chat service’s own root
cap – the cap chat-server holds for “I run this Chat service”. The
manifest is Chat service configuration, not kernel or broker
configuration: chat-server reads it at startup and uses its root cap
to materialize the configured groups and channels. Every cap
chat-server hands out is parented somewhere in chat-server’s internal
tree; ultimately every chain terminates at chat-server’s root.
Cross-principal sharing goes through a chat-server method
(GroupMember.invite, DiscoverableGroupJoin.join,
DiscoverableChannelTextSubscribe.subscribe, etc.), which mints a fresh
derived cap and records its parent. Raw bearer transfer of chat caps
is blocked by the kernel via transfer_policy enforcement (see Open
Questions). Revocation walks the tree and rotates the kernel-level cap
epoch of every descendant in the revoked branch; subsequent dispatch
fails closed at the kernel site (epoch rotation is an existing
kernel-level mechanism). This is what makes “a member started inviting
spam bots into the group” recoverable: revoke the spammer’s branch;
their downstream invitees go with them; unrelated siblings – and
unrelated branches under the same group – are untouched.
- Chat session sees callers via session-bound identity, not via a
user-info cap. Per session-bound-invocation-context-proposal.md, the
kernel attaches an opaque session-scoped reference to every
invocation. Chat-server uses that reference to route messages,
populate sender fields per its disclosure policy, and identify who
joined which group, without holding any “look up user X” cap.
- Telegram-shaped channel categories. Groups (with nested topics, owner
+ admin role hierarchy, extensible permissions), broadcast channels
(read-only for subscribers), DMs, and end-to-end-encrypted DMs as a
distinct cap layer. There is no special “system room” category –
system-managed channels are just channels owned by service principals or
designated admin principals (capOS already treats services as
principals; see user-identity-and-policy-proposal.md PrincipalKind
including service).
- Keep backpressure tractable: outgoing media uses capnp -> stream for flow-controlled writes; incoming media listener caps may indicate drop-vs-queue policy in the subscription request.
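The lineage and transitive-revocation goal can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not chat-server's implementation: `LineageTree`, `CapHandle`, `derive`, and `revoke_branch` are hypothetical names, and the per-node `epoch` counter only stands in for the kernel-level cap epoch. A handle is live while the epoch it was minted with still matches the node's current epoch; revoking a branch rotates the epoch of the node and every descendant.

```rust
use std::collections::HashMap;

/// Hypothetical id of a node in chat-server's lineage tree.
pub type NodeId = u32;

/// Hypothetical cap handle: valid only while its epoch matches the node's.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CapHandle {
    node: NodeId,
    epoch: u64,
}

impl CapHandle {
    pub fn node(&self) -> NodeId {
        self.node
    }
}

struct Node {
    parent: Option<NodeId>,
    children: Vec<NodeId>,
    /// Stand-in for the kernel-level cap epoch (an assumption of this sketch).
    epoch: u64,
}

pub struct LineageTree {
    nodes: HashMap<NodeId, Node>,
    next_id: NodeId,
}

impl LineageTree {
    /// Build a tree whose only node models the Chat service's root cap.
    pub fn new() -> (Self, CapHandle) {
        let mut nodes = HashMap::new();
        nodes.insert(0, Node { parent: None, children: Vec::new(), epoch: 0 });
        (LineageTree { nodes, next_id: 1 }, CapHandle { node: 0, epoch: 0 })
    }

    /// Dispatch check: fails closed for unknown nodes and stale epochs.
    pub fn is_live(&self, cap: CapHandle) -> bool {
        self.nodes.get(&cap.node).map_or(false, |n| n.epoch == cap.epoch)
    }

    /// Mint a derived cap and record its parent. A revoked parent cannot mint.
    pub fn derive(&mut self, parent: CapHandle) -> Option<CapHandle> {
        if !self.is_live(parent) {
            return None;
        }
        let id = self.next_id;
        self.next_id += 1;
        self.nodes.insert(id, Node { parent: Some(parent.node), children: Vec::new(), epoch: 0 });
        self.nodes.get_mut(&parent.node)?.children.push(id);
        Some(CapHandle { node: id, epoch: 0 })
    }

    /// Rotate the epoch of `node` and every descendant; returns how many
    /// nodes were rotated. Siblings and other branches are not visited.
    pub fn revoke_branch(&mut self, node: NodeId) -> usize {
        let mut stack = vec![node];
        let mut rotated = 0;
        while let Some(id) = stack.pop() {
            if let Some(n) = self.nodes.get_mut(&id) {
                n.epoch += 1;
                rotated += 1;
                stack.extend(n.children.iter().copied());
            }
        }
        rotated
    }

    /// Walk parent links upward; every chain should end at the root node.
    pub fn root_of(&self, mut node: NodeId) -> Option<NodeId> {
        while let Some(n) = self.nodes.get(&node) {
            match n.parent {
                Some(p) => node = p,
                None => return Some(node),
            }
        }
        None
    }
}
```

In the spam-bot scenario, revoking the spammer's node invalidates the spammer's handle and every handle minted beneath it, while a sibling member derived from the same parent stays live.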
Non-Goals
- Replacing WebRTC for browser-to-browser P2P. capOS is the gateway; the browser still uses WebRTC primitives. We map them onto the gateway-held Chat cap, not the other way around.
- Replacing RealtimeModelSession (realtime-voice-agent-shell-proposal.md) for agent-runtime ↔ model-provider transport. That session is a different layer: it carries provider-specific events (RealtimeInputEvent/RealtimeOutputEvent) between the runner and an external model API. The operator-facing surface (operator talks to the running agent, agent speaks back) is a chat; the agent runner bridges the two.
- Replacing ApprovalClient/ApprovalGrant (shell-proposal.md:407-427). Action approvals are a separate capability. A chat may surface an approval request as a message event with a payload referencing an ApprovalGrant, but the cap holding the approval state stays distinct. See ## Approvals Stay Separate below.
- Carrying raw on-the-wire codec bytes inside capnp payloads in the hot path. Frame metadata travels on capnp; frame bodies travel via shared memory or provider-owned handles.
- Defining a global chat name registry. Channels are scoped: a chat cap hands you a specific server-owned room; how rooms get named lives in the hosting service (chat-server, adventure-server, agent runner, etc.).
- File-transfer protocol design (resume, integrity, deduplication). Bounded attachments are in scope; large-file transfer reuses a separate File or ContentStore cap, with Chat carrying only the reference.
Architecture
flowchart LR
subgraph capos[capOS]
chatsrv[chat-server / agent-runner / adventure-server]
ch[chat cap - per chat]
chatsrv --> ch
end
subgraph rust[Trusted Rust backend]
wrk[Per-session worker holds chat cap]
listeners[ChatListener, AudioSink, VideoSink listener caps]
wrk -- subscribe(listener) --> ch
ch -- listener.post(event) --> listeners
listeners --> appstate[AppState - text history buffer, audio ring, video ring]
end
subgraph browser[Browser]
js[Browser JS - text view models, WebRTC peer for audio/video]
end
appstate -- text events as view models --> js
appstate <-- WebRTC SDP/ICE signalling via /api/chat/webrtc --> js
appstate <-- audio frames via WebRTC audio track --> js
appstate <-- video frames via WebRTC video track --> js
Three layers, three transports:
- capnp-rpc, between capOS and the trusted Rust backend. Listener caps for incoming text events. -> stream methods for outgoing audio/video frames. Frame metadata on capnp; frame bodies on MemoryObject-backed rings shared between the worker process and the gateway.
- Trusted Rust backend bookkeeping. The backend holds the chat cap, buffers a bounded text history, and owns the audio/video media rings. Browser-visible state stays in view models.
- HTTP + WebRTC, between the trusted Rust backend and the browser. Text events flow as JSON view models on the existing /api/* HTTP surface. Audio and video flow through a WebRTC peer connection: the browser does the SDP offer; the backend produces an answer using a small capOS-side WebRTC adapter (or relays SDP to a capOS-side WebRTC service); audio/video tracks carry the frames the backend got via the media rings.
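The backend's bounded text history, together with the drop-vs-queue choice a subscriber may request, can be sketched as a fixed-capacity buffer that a listener callback posts into. Everything named here (`BoundedHistory`, `OverflowPolicy`, `post`, `snapshot`) is a hypothetical illustration; the proposal does not fix the policy names or the buffer API.

```rust
use std::collections::VecDeque;

/// Hypothetical overflow policy a subscriber might request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OverflowPolicy {
    /// Evict the oldest buffered event to make room for the new one.
    DropOldest,
    /// Refuse the new event so the caller observes backpressure.
    RejectNew,
}

/// Fixed-capacity event buffer, standing in for the backend's text history.
pub struct BoundedHistory<T> {
    events: VecDeque<T>,
    capacity: usize,
    policy: OverflowPolicy,
    dropped: u64,
}

impl<T> BoundedHistory<T> {
    pub fn new(capacity: usize, policy: OverflowPolicy) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        BoundedHistory {
            events: VecDeque::with_capacity(capacity),
            capacity,
            policy,
            dropped: 0,
        }
    }

    /// Where a listener cap's post(event) callback would land.
    /// Returns Err(event) only when the buffer is full under RejectNew.
    pub fn post(&mut self, event: T) -> Result<(), T> {
        if self.events.len() == self.capacity {
            match self.policy {
                OverflowPolicy::DropOldest => {
                    self.events.pop_front();
                    self.dropped += 1;
                }
                OverflowPolicy::RejectNew => return Err(event),
            }
        }
        self.events.push_back(event);
        Ok(())
    }

    /// Oldest-to-newest view of what is currently buffered.
    pub fn snapshot(&self) -> Vec<&T> {
        self.events.iter().collect()
    }

    /// Number of events evicted under DropOldest.
    pub fn dropped(&self) -> u64 {
        self.dropped
    }
}
```

Under `DropOldest` the buffer always holds the most recent `capacity` events, which suits a history view; under `RejectNew` the sender gets the event back and can decide whether to retry.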
Schema Sketch
This is a sketch, not the final wire shape. Field numbers, exact param names, and struct nesting will be finalized when the implementation iteration starts; what matters here is the shape.
The substrate is not one interface. Role caps, discovery caps,
contact caps, DM peer caps, listener caps, and outgoing-media caps
are distinct interfaces because they have distinct authorities.
Possessing a cap is the authority; calling a method that returns
a derived cap is just a normal method call (no separate “redeem”
step exists). The cap class’s transfer_policy (kernel-enforced)
forbids raw bearer transfer between principals; sharing must go
through chat-server’s derive*-shaped methods.
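The idea that distinct interfaces carry distinct authorities has a direct analogue in Rust traits. The following is only an analogy, with hypothetical names (`TextReader`, `AudioReader`, `TextOnlyMember`, `VoiceMember`); it is not generated capnp binding code. A role that does not implement the audio trait has no audio method to call, so code that needs audio must name that authority in its signature.

```rust
/// Hypothetical stand-in for a text read facet.
pub trait TextReader {
    fn subscribe_text(&self) -> &'static str;
}

/// Hypothetical stand-in for an audio read facet.
pub trait AudioReader {
    fn subscribe_audio(&self) -> &'static str;
}

/// A text-only role: it implements TextReader and nothing else.
pub struct TextOnlyMember;

impl TextReader for TextOnlyMember {
    fn subscribe_text(&self) -> &'static str {
        "text subscription"
    }
}

/// A role that was granted both text and audio facets.
pub struct VoiceMember;

impl TextReader for VoiceMember {
    fn subscribe_text(&self) -> &'static str {
        "text subscription"
    }
}

impl AudioReader for VoiceMember {
    fn subscribe_audio(&self) -> &'static str {
        "audio subscription"
    }
}

/// Any holder of the text facet can read text.
pub fn read_text<C: TextReader>(cap: &C) -> &'static str {
    cap.subscribe_text()
}

/// Only holders of the audio facet can be passed here.
/// `start_listening(&TextOnlyMember)` is rejected by the compiler because
/// TextOnlyMember does not implement AudioReader; no runtime check exists.
pub fn start_listening<C: AudioReader>(cap: &C) -> &'static str {
    cap.subscribe_audio()
}
```

The design choice this mirrors is that the gate is the shape of the type, so there is no "are you allowed?" branch to get wrong at runtime.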
Naming convention (Telegram-aligned). Three concrete chat categories:
- Group – multi-party two-way chat. Roles: GroupOwner, GroupAdmin, GroupMember. Supports nested topics.
- Channel – broadcast (read-only for subscribers). Roles: ChannelOwner, ChannelAdmin, ChannelPublisher, plus the per-media-facet subscriber caps ChannelTextSubscriber / ChannelAudioSubscriber / ChannelVideoSubscriber. The substrate has no type-erased generic ChannelSubscriber; the result type of a subscribe path tells the caller exactly which media facets it grants (see schema below).
- DM – direct message between two principals. Caps: DmPeer, E2EDmPeer. Established via ContactCap.
The unqualified word “channel” in this proposal only refers to a
Telegram-style broadcast Channel. Any generic “stream of events” or
“thing you can subscribe to” is called a chat (the substrate-level
term). Base interfaces use the Chat prefix (ChatEndpoint,
ChatWriter, ChatDirectory, ChatInfo, ChatKind); concrete
roles use the category prefix (Group*, Channel*, Dm*).
# Identity / describe surface every chat-cap embeds (except pure
# listener caps and revokers). Holding ChatEndpoint alone grants
# nothing beyond inspecting metadata.
interface ChatEndpoint {
describe @0 () -> (info :ChatInfo);
}
# ============================================================
# Per-kind read facets. The interface IS the permission: holding
# ChatTextReader grants subscribeText authority and ONLY that.
# Audio and video are separate caps. A text-only role does not
# expose subscribeAudio / subscribeVideo at all -- there is no
# runtime check for "are you allowed to read audio"; the absence
# of the method is the gate.
# ============================================================
interface ChatTextReader extends(ChatEndpoint) {
subscribeText @0 (listener :TextListener,
options :SubscribeOptions) -> (sub :Subscription);
}
interface ChatAudioReader extends(ChatEndpoint) {
subscribeAudio @0 (listener :AudioSink,
options :AudioSubscribeOptions) -> (sub :Subscription);
}
interface ChatVideoReader extends(ChatEndpoint) {
subscribeVideo @0 (listener :VideoSink,
options :VideoSubscribeOptions) -> (sub :Subscription);
}
# ============================================================
# Per-kind write facets. Each writer extends the corresponding
# reader (a writer is also a reader of the same kind). Concrete
# roles compose the kinds they need.
# ============================================================
interface ChatTextWriter extends(ChatTextReader) {
send @0 (event :ChatOutboundEvent) -> ();
postAttachment @1 (descriptor :AttachmentDescriptor) -> ();
}
interface ChatAudioWriter extends(ChatAudioReader) {
openAudioOut @0 (format :AudioFormat) -> (track :AudioOut);
}
interface ChatVideoWriter extends(ChatVideoReader) {
openVideoOut @0 (format :VideoFormat) -> (track :VideoOut);
}
# Convenience: full-multimedia writer. Most roles in this proposal
# extend this one; a "text-only group member" role would extend
# only ChatTextWriter, exposing strictly fewer methods.
interface ChatWriter extends(ChatTextWriter, ChatAudioWriter, ChatVideoWriter) {}
# ============================================================
# Group: multi-party two-way chat with topics + voice/stage rooms
# and an Owner/Admin/Member role hierarchy. Roles inherit upward:
# Owner is an Admin is a Member is a ChatWriter is a ChatEndpoint.
# ============================================================
interface GroupMember extends(ChatWriter) {
rooms @0 () -> (rooms :List(RoomInfo));
# Each per-room accessor returns a kind-specific facet so
# joining a text topic does not grant audio/video subscribe.
textRoom @1 (roomId :Text) -> (writer :ChatTextWriter);
voiceRoom @2 (roomId :Text) -> (room :VoiceRoom);
stageRoom @3 (roomId :Text) -> (room :StageRoom);
callSurface @4 () -> (calls :CallSurface);
# `invite` returns the bearer token (handed to the invitee via
# chat-server-mediated cap delivery), an issuer-held revoker,
# AND the GroupCapRef of the issuance lineage node so the
# caller can pass it to `GroupAdmin.describeBranch` /
# `revokeBranch` later without having to walk the lineage to
# find it. Splitting token from revoker prevents the invitee
# or any downstream holder from revoking their own invite --
# the InviteToken interface has no revoke method.
invite @5 (forSubject :PrincipalRef, lifetime :UInt64)
-> (token :InviteToken,
revoker :InviteRevoker,
inviteRef :GroupCapRef);
# Out-of-band invite path. Returns BEARER-SECRET bytes the
# issuer delivers via paper / QR / non-chat channel, the
# issuer-side `revoker`, AND the `inviteRef` GroupCapRef
# naming the issuance lineage node (analogous to `invite`).
# The bytes name a distinct lineage node in chat-server's
# tree (the issuance entry); any holder plus a Self cap can
# redeem them via Self.acceptInviteCode(code). Treat them
# with the same care as any bearer secret: do not log, do
# not include in transcripts, do not expose to untrusted
# observers, prefer bounded lifetimes and one-time-use
# semantics. The `inviteRef` is non-secret and safe to log.
inviteCode @6 (lifetime :UInt64)
-> (code :Data,
revoker :InviteRevoker,
inviteRef :GroupCapRef);
acceptInvite @7 (token :InviteToken) -> (member :GroupMember);
leave @8 () -> ();
}
interface GroupAdmin extends(GroupMember) {
removeMember @0 (memberRef :Data) -> ();
# Both `revokeBranch` and `describeBranch` accept any lineage
# node ref -- a member cap, an admin cap, an inviteCode lineage
# node, or a transformation operation node (from
# mergeIntoGroupAsTopic / moveTopicHere / extractTopicAsGroup).
# Revoking a transformation node epochs the entire grafted
# subtree; revoking a member cap epochs that member and the
# invitees they admitted. See the BranchInfo schema for the
# node kinds chat-server may return.
revokeBranch @1 (node :GroupCapRef) -> ();
setMemberInvitePolicy @2 (policy :MemberInvitePolicy) -> ();
createRoom @3 (config :RoomConfig) -> (info :RoomInfo);
removeRoom @4 (roomId :Text) -> ();
setRoomPolicy @5 (roomId :Text, policy :RoomPolicy) -> ();
# Per-principal ban list (deny-list for FUTURE mints only).
# `banPrincipal` only adds the principal to the group's
# ban list, so subsequent `DiscoverableGroupJoin.join()`,
# `Self.acceptInvite` / `acceptInviteCode`, and
# admin-mint paths fail closed with `principalBanned` for
# this principal. It does NOT kick the principal's existing
# caps; that's `revokeBranch`'s job. Without the deny-list,
# a previously-revoked principal who still holds a
# `DiscoverableGroupJoin` cap or a session bundle hook
# could simply re-join and mint a fresh chain. The full
# "kick + ban" workflow is the admin pairing
# `GroupAdmin.revokeBranch(node :GroupCapRef)` with
# `banPrincipal(principalRef :PrincipalRef)` in a single UI
# step. The branch ref comes from one of the typed sources
# (the `inviteRef` returned by the original
# `GroupMember.invite(...)` tuple if the admin issued the
# invite themselves; otherwise
# `GroupAdmin.lookupByPrincipal(principal)` or
# `describeRoot()` to walk the lineage tree). Raw transfer
# of the target's bearer member cap is forbidden by
# `transfer_policy`. The schema keeps the two concerns
# separate so each is idempotent and individually meaningful.
banPrincipal @6 (principalRef :PrincipalRef) -> ();
unbanPrincipal @7 (principalRef :PrincipalRef) -> ();
# Admin-only stage facet. Returns a StageRoomAdmin cap whose
# promoteToSpeaker / closeStage methods are not reachable from
# an ordinary GroupMember.stageRoom() accessor.
stageRoomAdmin @8 (roomId :Text) -> (admin :StageRoomAdmin);
# Lineage inspection used during spam-bot triage and audit. The
# caller passes a node reference; chat-server returns the
# subtree rooted at that node (the member or operation, the
# invitees/grafted members under it, sub-invitees, etc.) plus
# enough metadata to drive a UI before calling `revokeBranch`.
# Read-only.
describeBranch @9 (node :GroupCapRef) -> (info :BranchInfo(GroupCapRef));
# Top-down lineage walker. Returns the group's whole lineage
# tree (subject to chat-server's truncation policy) so an
# admin can locate a `GroupCapRef` for somebody else's
# invitee, public-joined member, or transformation-grafted
# member without already holding a ref. Together with
# `lookupByPrincipal`, this closes the obtain path for
# `describeBranch` / `revokeBranch` -- the caller does not
# need a pre-existing ref. Read-only.
describeRoot @10 () -> (info :BranchInfo(GroupCapRef));
# Convenience lookup: find the lineage nodes a given principal
# holds in this group. May return multiple refs if the
# principal joined via multiple paths (e.g. a manifest-bundled
# GroupMember plus a public-join chain from a different
# session). Returns an empty list for principals not in this
# group. Read-only; the cap returned is by-ref handle, not the
# principal's bearer cap.
lookupByPrincipal @11 (principalRef :PrincipalRef)
-> (refs :List(GroupCapRef));
}
# Reference to a node inside this group's lineage tree. Opaque to
# the caller; chat-server uses it to look up the node. Names BOTH
# cap-bearing nodes (members/admins/etc.) AND transformation
# operation nodes (mergeIntoGroupAsTopic / moveTopicHere /
# extractTopicAsGroup), so revokeBranch / describeBranch can
# operate on the entire-graft case as well as the per-member case
# discussed under Chat-graph transformations.
struct GroupCapRef {
nodeRef @0 :Data; # chat-server-internal handle id
}
# Snapshot of a lineage subtree returned by describeBranch /
# describeRoot. Holds enough to render "this is who would be
# revoked" UI for both per-member kicks and entire-graft
# revocations of a transformation node. Generic over the ref
# kind so the same shape serves Group lineage (RefT =
# GroupCapRef) and broadcast-Channel lineage (RefT =
# ChannelCapRef) without losing the type-level distinction
# between Group and Channel refs.
struct BranchInfo(RefT) {
root @0 :LineageNode(RefT);
totalMembers @1 :UInt32; # cap nodes in subtree (excludes
# transformation op nodes)
truncated @2 :Bool; # chat-server may cap deep trees
}
# Lineage nodes come in three flavours:
# - cap-bearing nodes (member / admin / publisher / subscriber
# caps held by a principal),
# - transformation operation nodes (mergeIntoGroupAsTopic /
# moveTopicHere / extractTopicAsGroup; no principal of their
# own; just a graft point), and
# - issuance nodes (a `ContactCap` issuance, an `InviteToken` /
# `inviteCode` issuance, a `contactCode` issuance, or any
# other "the issuer minted this so they can revoke its
# downstream subtree" entry). Issuance nodes have a non-empty
# descendants subtree once their token is redeemed.
# The shared envelope carries the ref, timestamp, parentage
# classification, and recursive children; the union arm carries
# the kind-specific data. Generic over RefT for the Group /
# Channel split.
#
# capnp generics constrain the ref type but cannot constrain the
# union arm by RefT (no dependent types in capnp). Soundness of
# "Group lineage trees only contain Group roles, Channel lineage
# trees only contain Channel roles" is therefore enforced
# at the chat-server boundary (it never emits a mismatched arm,
# and consumers may treat a mismatched arm as a chat-server
# implementation bug); the type system narrows the ref kind but
# the role kind is a documented invariant rather than a
# capnp-checked one.
struct LineageNode(RefT) {
ref @0 :RefT;
joinedAtMs @1 :UInt64;
parentage @2 :BranchParentage;
children @3 :List(LineageNode(RefT));
union {
capNode @4 :CapNodeInfo;
operationNode @5 :OperationNodeInfo;
issuanceNode @6 :IssuanceNodeInfo;
}
}
# Issuance lineage node: an entry chat-server adds to its tree
# when an issuer mints a bearer-cap or bearer-secret handle whose
# downstream descendants the issuer wants to be able to revoke
# transitively. Examples: `Self.contact` / `Self.contactCode`
# (DmPeer / E2EDmPeer descendants), `GroupMember.invite` /
# `inviteCode` (GroupMember descendants), and any future
# bearer-issuance pattern. The issuer holds either a typed
# revoker cap (`InviteRevoker`, `SpeakerRevoker`) or a non-secret
# ref handle (`ContactCapRef`, `inviteRef :GroupCapRef`,
# `codeId :Data`); revoking via that handle epochs the issuance
# node and every descendant.
struct IssuanceNodeInfo {
issuer @0 :PrincipalRef; # who minted the issuance
kind @1 :IssuanceKind;
expiresAtMs @2 :UInt64; # 0 = unbounded
}
enum IssuanceKind {
contactCap @0; # Self.contact -> ContactCap (cap form)
contactCode @1; # Self.contactCode -> bytes (code form)
inviteToken @2; # GroupMember.invite -> InviteToken (cap form)
inviteCode @3; # GroupMember.inviteCode -> bytes (code form)
speakerToken @4; # StageRoomAdmin.promoteToSpeaker -> SpeakerToken delivered via roster
groupAdminGrant @5; # GroupOwner.makeAdmin -> GroupAdmin delivered via Self.subscribeIncoming
channelPublisherGrant @6; # ChannelAdmin.makePublisher -> ChannelPublisher delivered via Self.subscribeIncoming
channelAdminGrant @7; # ChannelOwner.makeAdmin -> ChannelAdmin delivered via Self.subscribeIncoming
callHostGrant @8; # CallHost.promoteHost -> CallHost delivered via CallRosterDelta
e2eCallHostGrant @9; # E2ECallHost.promoteHost -> E2ECallHost delivered via CallRosterDelta
}
struct CapNodeInfo {
principal @0 :PrincipalRef;
role @1 :ChatNodeRole; # narrowed to the chat kind
# of the enclosing
# BranchInfo
}
# Per-chat-kind role discriminator inside lineage nodes. capnp
# generics narrow the ref type (`RefT`) but cannot narrow the
# role-union arm to match it (capnp has no dependent types).
# Documented invariant, enforced at the chat-server boundary:
# a `BranchInfo(GroupCapRef)` only emits the `group` arm, a
# `BranchInfo(ChannelCapRef)` only emits the `channel` arm.
# Consumers walking either tree may treat a mismatched arm as a
# chat-server implementation bug (return `unexpectedRoleKind`)
# rather than as caller-induced data.
struct ChatNodeRole {
union {
group @0 :GroupRole;
channel @1 :ChannelRole;
}
}
enum GroupRole {
owner @0;
admin @1;
member @2;
}
# `ChatRole` is retained as an alias for `GroupRole` for any
# audit / lineage prose that referred to "the chat role" without
# distinguishing Group from Channel (e.g. older descriptions of
# manifest-bundle entries). New schema methods use `GroupRole`
# or `ChannelRole` directly; do not introduce new uses of
# `ChatRole`.
using ChatRole = GroupRole;
enum ChannelRole {
owner @0;
admin @1;
publisher @2;
textSubscriber @3;
audioSubscriber @4;
videoSubscriber @5;
}
struct OperationNodeInfo {
operation @0 :TransformationOp;
initiator @1 :PrincipalRef; # caller-side admin that issued
consent @2 :OperationConsent; # who provided the second
# authority that authorized
# the graft
sourceTopicId @3 :Text; # may be empty for full-graft ops
targetTopicId @4 :Text;
}
# The two-cap proof consumed by chat-graph transformations is not
# always two admins. mergeIntoGroupAsTopic and moveTopicHere need
# the *other* group's admin role; extractTopicAsGroup needs the
# initiator's own Self cap (creation-quota authority), since the
# new group has no other-side admin yet. The variant tells audit
# UIs which authority shape was checked.
struct OperationConsent {
union {
partnerAdmin @0 :PrincipalRef; # mergeIntoGroupAsTopic /
# moveTopicHere: the
# other-group admin who
# consented in the same call
selfCreation @1 :PrincipalRef; # extractTopicAsGroup: the
# initiator's Self cap
# principal proving creation
# quota; same principal as
# `initiator` above
}
}
enum TransformationOp {
mergeIntoGroupAsTopic @0;
moveTopicHere @1;
extractTopicAsGroup @2;
}
enum BranchParentage {
manifestBundle @0;
publicJoin @1; # via DiscoverableGroupJoin.join()
invitedCap @2; # via Self.acceptInvite(token)
invitedCode @3; # via Self.acceptInviteCode(code)
ownerMint @4; # GroupOwner.makeAdmin / similar
transformation @5; # parented to a TransformationOp node
issuance @6; # this node IS an issuance entry
# (Self.contact, Self.contactCode,
# GroupMember.invite, inviteCode,
# StageRoomAdmin.promoteToSpeaker, etc.).
# The node's parent in the tree is its
# *issuer* (Self cap or role cap); the
# `issuance` tag distinguishes the node
# itself from a redeemed descendant.
}
interface GroupOwner extends(GroupAdmin) {
# Promote a member to admin. Same delivery shape as
# `GroupMember.invite` / `StageRoomAdmin.promoteToSpeaker`:
# chat-server records a *promotion issuance node* in the
# group's lineage tree (parented to the calling Owner cap)
# and delivers the freshly minted `GroupAdmin` cap to the
# promoted principal via that principal's `Self.subscribeIncoming`
# (`groupAdminGranted :GroupAdmin` arm), parented under the
# promotion node. The Owner gets back only an
# issuer-side `RolePromotionRevoker` (revokes the promotion --
# epoching the promoted GroupAdmin and any descendants the
# promotee minted) plus a non-secret `promotionRef
# :GroupCapRef` for `describeBranch` / `revokeBranch`. The
# caller does NOT receive the target's GroupAdmin cap; raw
# cross-principal cap delivery would violate
# `transfer_policy`.
makeAdmin @0 (memberRef :Data, perms :AdminPermissions)
-> (revoker :RolePromotionRevoker,
promotionRef :GroupCapRef);
setGroupPolicy @1 (policy :GroupPolicy) -> ();
# Discoverable join is always Member-typed. There is no
# `joinRole` argument because `DiscoverableGroupJoin.join()`
# is fixed to return `GroupMember`. Admin roles are minted via
# `GroupOwner.makeAdmin` (which produces a GroupAdmin, not an
# Owner); new Owners come only from the manifest,
# `Self.startGroup`, or `extractTopicAsGroup`. Neither role is
# ever minted via public join. Removing the parameter
# eliminates the prior mismatch where `joinRole=admin` could be
# advertised but `.join()` would still mint only a member.
publishDiscoverable @2 (scope :ChatDirectoryScopeRef)
-> (entry :ChatDirectoryEntryHandle);
closePublicJoin @3 (entry :ChatDirectoryEntryHandle) -> ();
disband @4 () -> ();
}
# Issuer-held companion to a role-promotion. Parallel to
# InviteRevoker / SpeakerRevoker. Calling `revoke()` epochs the
# promoted role cap AND every descendant the promotee minted
# under it; the promoted principal falls back to whatever role
# they held before the promotion (the substrate does not auto-
# kick them from the chat). Promoter retains this revoker
# alongside the non-secret `promotionRef` for the cap-clean
# describeBranch / revokeBranch path.
interface RolePromotionRevoker {
describe @0 () -> (info :RolePromotionInfo);
revoke @1 () -> ();
}
# Bearer cap. Holding it lets the recipient call
# `Self.acceptInvite(token) -> GroupMember` (or
# `GroupMember.acceptInvite(token)` when joining via an existing
# group context). The token has NO revoke method -- bearers do
# not revoke their own invites. Revocation lives on the issuer's
# InviteRevoker cap.
interface InviteToken {
describe @0 () -> (info :InviteInfo);
}
# Issuer-held companion to InviteToken. The InviteRevoker is
# parented to the issuer's role cap in chat-server's lineage tree.
interface InviteRevoker {
describe @0 () -> (info :InviteInfo);
revoke @1 () -> ();
}
# ============================================================
# Channel (Telegram-strict: BROADCAST, not the generic word).
# Subscribers read; Publishers/Admins/Owner write. Subscribers
# do NOT extend ChatWriter -- the type system enforces RO at
# compile time.
# ============================================================
# Per-kind subscriber types. The interface IS the permission:
# a ChannelTextSubscriber holder cannot call subscribeAudio /
# subscribeVideo, regardless of runtime policy. Each variant
# composes only the readers it grants. Discovery yields the
# variant chat-server's configuration says applies to the
# scope's policy for this caller; the result type tells the
# caller exactly what they got.
interface ChannelTextSubscriber extends(ChatTextReader) {
unsubscribe @0 () -> ();
}
interface ChannelAudioSubscriber extends(ChatTextReader, ChatAudioReader) {
unsubscribe @0 () -> ();
}
interface ChannelVideoSubscriber extends(ChatTextReader, ChatAudioReader, ChatVideoReader) {
unsubscribe @0 () -> ();
}
# Publisher writes; lifecycle (close the whole channel) is NOT
# here. A non-admin publisher should be able to post but not
# tear down the channel. closeChannel lives on ChannelAdmin
# below.
interface ChannelPublisher extends(ChatWriter) {}
interface ChannelAdmin extends(ChannelPublisher) {
# Same delivery shape as `GroupOwner.makeAdmin`: chat-server
# records a promotion issuance node parented to the calling
# ChannelAdmin cap, delivers the freshly minted
# `ChannelPublisher` to the promoted principal via
# `Self.subscribeIncoming` (`channelPublisherGranted :ChannelPublisher`
# arm), and returns only the issuer-side revoker plus a
# non-secret promotionRef to the caller. Cross-principal
# role-cap delivery to the promoter is forbidden.
makePublisher @0 (subjectRef :PrincipalRef)
-> (revoker :RolePromotionRevoker,
promotionRef :ChannelCapRef);
removePublisher @1 (publisherRef :Data) -> ();
revokeBranch @2 (node :ChannelCapRef) -> ();
# Per-principal ban list (deny-list for FUTURE mints only).
# Same semantics as `GroupAdmin.banPrincipal`: `banPrincipal`
# only updates the broadcast Channel's deny-list; existing
# caps held by the principal are not epoched. Pair with
# `revokeBranch` for "kick + ban".
banPrincipal @3 (principalRef :PrincipalRef) -> ();
unbanPrincipal @4 (principalRef :PrincipalRef) -> ();
closeChannel @5 () -> (); # close the whole broadcast
# channel (not just the
# publisher's own stream)
# Lineage queries parallel to GroupAdmin. Same purpose: an
# admin needs `ChannelCapRef` handles to call `revokeBranch`
# for somebody else's publisher/subscriber chain, but the
# ChannelAdmin doesn't hold those caps. `describeBranch`
# accepts a known node ref and returns its subtree;
# `describeRoot` returns the whole channel lineage tree
# (truncated per policy); `lookupByPrincipal` returns refs
# for a given principal's caps in this channel. All
# read-only.
describeBranch @6 (node :ChannelCapRef) -> (info :BranchInfo(ChannelCapRef));
describeRoot @7 () -> (info :BranchInfo(ChannelCapRef));
lookupByPrincipal @8 (principalRef :PrincipalRef)
-> (refs :List(ChannelCapRef));
}
interface ChannelOwner extends(ChannelAdmin) {
# Same delivery shape as the `makePublisher` and
# `GroupOwner.makeAdmin` promotions: chat-server records a
# promotion issuance node, delivers the freshly minted
# `ChannelAdmin` to the promoted principal via
# `Self.subscribeIncoming` (`channelAdminGranted :ChannelAdmin` arm),
# and returns only the revoker plus promotionRef.
makeAdmin @0 (publisherRef :Data, perms :AdminPermissions)
-> (revoker :RolePromotionRevoker,
promotionRef :ChannelCapRef);
setChannelPolicy @1 (policy :ChannelPolicy) -> ();
publishDiscoverable @2 (scope :ChatDirectoryScopeRef)
-> (entry :ChatDirectoryEntryHandle);
closePublicJoin @3 (entry :ChatDirectoryEntryHandle) -> ();
}
# Reference to a node inside this broadcast Channel's lineage
# tree. Same shape as `GroupCapRef` but a distinct nominal type
# so a Group ref cannot be passed to `ChannelAdmin.revokeBranch`
# (and vice versa) at the type level. Names BOTH cap-bearing
# nodes (Channel{Owner,Admin,Publisher,*Subscriber}) AND any
# operation node a Channel might gain in the future. Opaque to
# the caller; chat-server resolves via its internal lineage table.
struct ChannelCapRef {
nodeRef @0 :Data;
}
# ============================================================
# Rooms within a Group. Three kinds: text topics, persistent
# voice rooms (Discord-style), broadcast stage rooms (Discord
# stage / Twitter Spaces). Per-room permission overrides are
# out of scope for the first slice (extensible via RoomPolicy).
# ============================================================
enum RoomKind {
textTopic @0;
voiceRoom @1;
stageRoom @2;
}
struct RoomInfo {
roomId @0 :Text;
kind @1 :RoomKind;
displayName @2 :Text;
topology @3 :CallTopology; # for voice/stage; ignored for text
capacity @4 :UInt32; # 0 = unbounded (per chat-server policy)
}
# Persistent voice room (always alive while the room exists).
# Joining means entering the call already in progress in this room.
interface VoiceRoom {
describe @0 () -> (info :VoiceRoomInfo);
subscribeRoster @1 (listener :CallRosterListener,
options :RosterSubscribeOptions)
-> (sub :Subscription);
describeRoster @2 () -> (snapshot :CallRosterSnapshot);
join @3 () -> (participant :CallParticipant);
}
# Stage room (broadcast voice within a Group). Subscribers listen;
# Speakers publish; admins promote a hand-raiser to speaker by
# minting a SpeakerToken (handed to the listener) plus a
# SpeakerRevoker (kept admin-side).
#
# StageRoom (member-reachable via GroupMember.stageRoom) does NOT
# carry promote authority -- ordinary members can listen, speak
# (with a token), and raise their hand, but cannot mint speaker
# tokens. Promotion lives on StageRoomAdmin, which is reached only
# through GroupAdmin (see below).
interface StageRoom {
describe @0 () -> (info :StageRoomInfo);
subscribeRoster @1 (listener :CallRosterListener,
options :RosterSubscribeOptions)
-> (sub :Subscription);
joinAsListener @2 () -> (participant :StageListener);
# On redemption, chat-server mints `StageSpeaker` with
# `parent = the SpeakerToken's lineage node`. The companion
# `SpeakerRevoker` therefore epochs both the unredeemed token
# AND any active StageSpeaker descendant; admin pulling the
# floor back kills live mic, not just future redemptions.
joinAsSpeaker @3 (token :SpeakerToken)
-> (participant :StageSpeaker);
raiseHand @4 () -> ();
}
# Admin-only stage facet. Reached via GroupAdmin.stageRoomAdmin
# (added to GroupAdmin earlier in the schema sketch); not
# obtainable from a plain GroupMember's stageRoom() accessor.
# `promoteToSpeaker` does NOT return the bearer SpeakerToken to
# the admin. The token is bound to listenerRef on the chat-server
# side and delivered directly to that listener via their existing
# StageRoom.subscribeRoster stream as a "you-are-now-a-speaker"
# event carrying the SpeakerToken cap reference. The admin keeps
# only the SpeakerRevoker. This avoids the cross-principal
# bearer-cap handoff problem (raw transfer is forbidden; chat
# events on the stage roster are the chat-server-mediated
# delivery path the substrate already provides).
interface StageRoomAdmin {
describe @0 () -> (info :StageRoomInfo);
promoteToSpeaker @1 (listenerRef :Data)
-> (revoker :SpeakerRevoker);
closeStage @2 () -> ();
}
interface StageListener extends(ChatTextReader, ChatAudioReader) {
leave @0 () -> ();
}
# Stage speakers are broadcast-voice only: no `publishVideo` and
# no `subscribeVideo` because the stage-room model has no video.
# Possession of `SpeakerToken` mints exactly this audio-only cap.
interface StageSpeaker extends(AudioCallParticipant) {
yieldFloor @0 () -> ();
}
# Bearer cap held by a hand-raised listener after promotion.
# Has NO revoke method -- the admin's promotion is undone via
# the issuer-held SpeakerRevoker, parallel to InviteToken/Revoker.
interface SpeakerToken {
describe @0 () -> (info :SpeakerTokenInfo);
}
interface SpeakerRevoker {
describe @0 () -> (info :SpeakerTokenInfo);
revoke @1 () -> (); # admin pulls the floor back
}
# ============================================================
# Ephemeral Call. Distinct from VoiceRoom: a Call has explicit
# start/end and lives within a chat (Group or DM). Use Call for
# "let's hop on a quick conference"; use VoiceRoom for "Discord
# voice channel always there". Both can coexist in a Group.
# ============================================================
interface CallSurface {
current @0 () -> (info :ActiveCallInfo); # may be empty
subscribeState @1 (listener :CallStateListener,
options :SubscribeOptions)
-> (sub :Subscription);
startCall @2 (config :CallStartConfig) -> (host :CallHost);
joinCall @3 () -> (participant :CallParticipant);
# Roster delivery for ad-hoc calls. Same shape as
# VoiceRoom.subscribeRoster / StageRoom.subscribeRoster, but
# bound to whatever ad-hoc call is currently active on this
# surface (or to the next call if none is active yet -- the
# subscription persists across start/end transitions of the
# surface's call until cancelled). This is the only delivery
# path for the cap-bearing roster variants
# (`hostGranted :CallHost`, `speakerGranted :SpeakerToken`),
# so a participant who needs to receive a host-promotion in
# an ad-hoc call must hold a Subscription minted here.
subscribeRoster @4 (listener :CallRosterListener,
options :RosterSubscribeOptions)
-> (sub :Subscription);
}
# Audio-only call participation facet. Lifts every call method
# that does not pull in video authority. Used by both the full
# A/V `CallParticipant` and the audio-only `StageSpeaker`.
# Stage rooms are broadcast voice (no stage video in the model),
# so a `SpeakerToken` redemption must mint a stage participant
# that does NOT expose `publishVideo` / `subscribeVideo` -- the
# split lives at the type level here.
interface AudioCallParticipant extends(ChatAudioReader) {
publishAudio @0 (format :AudioFormat) -> (track :AudioOut);
unpublishAudio @1 () -> ();
raiseHand @2 (raised :Bool) -> ();
setMyMuteState @3 (muted :Bool) -> ();
leave @4 () -> ();
}
# Full A/V plaintext participant. Adds video publish/unpublish on
# top of the audio facet, plus inherits subscribeVideo via
# `ChatVideoReader`. Returned by every Group plaintext call
# entry point: ad-hoc `CallSurface.startCall` / `joinCall`
# AND persistent `VoiceRoom.join` (group voice rooms are
# plaintext multi-party voice, so they share this cap shape).
# DM calls do NOT use this cap: they go through a separate
# `E2ECallSurface` that returns the cipher-only
# `E2ECallParticipant` (see the End-To-End Encrypted DMs section
# below) so the keyless-host invariant holds for DM media.
# `CallParticipant` must NOT be plumbed through any DM path.
# Text-during-call goes through the parent chat's
# `ChatTextWriter`, not through the call participant cap;
# that's why `ChatTextReader` is absent here.
interface CallParticipant extends(AudioCallParticipant, ChatVideoReader) {
publishVideo @0 (format :VideoFormat, purpose :VideoPurpose)
-> (track :VideoOut);
unpublishVideo @1 (purpose :VideoPurpose) -> ();
}
interface CallHost extends(CallParticipant) {
mute @0 (participantRef :Data) -> ();
unmute @1 (participantRef :Data) -> ();
eject @2 (participantRef :Data) -> ();
# Same cross-principal-cap-delivery rule as the chat
# role-promotion methods. The promoted participant is already
# listening on the call's roster subscription, so chat-server
# delivers the new `CallHost` cap to the bound participant via
# the existing `CallRosterDelta` stream
# (`hostGranted :CallHost` arm) rather than minting it back to
# the calling host. Caller keeps only the issuer-side
# `RolePromotionRevoker`. Parallels the SpeakerToken delivery
# pattern.
promoteHost @3 (participantRef :Data) -> (revoker :RolePromotionRevoker);
setRoutingMode @4 (mode :CallRoutingMode) -> ();
end @5 () -> ();
}
enum VideoPurpose { camera @0; screenShare @1; virtualScene @2; externalFeed @3; }
enum CallRoutingMode { sfu @0; mesh @1; mcu @2; }
enum CallTopology { peerToPeer @0; serverForwarded @1; serverMixed @2; }
interface CallRosterListener {
update @0 (delta :CallRosterDelta) -> ();
}
# Tagged union of roster events. Most variants carry plain data;
# `speakerGranted` carries a `SpeakerToken` cap, which is the
# substrate's only delivery path for the cross-principal bearer
# cap minted by `StageRoomAdmin.promoteToSpeaker(listenerRef)`.
# Delivery is listener-bound: chat-server only emits this variant
# to the roster subscription of the listener named in
# `listenerRef` -- other listeners on the same stage roster do
# NOT see this variant for that promotion. That listener then
# calls `StageRoom.joinAsSpeaker(token)` with the cap reference
# extracted from the delta.
struct CallRosterDelta {
union {
participantJoined @0 :ParticipantInfo;
participantLeft @1 :Data; # participantRef
muteChanged @2 :MuteUpdate;
activeSpeaker @3 :Data; # participantRef
handRaised @4 :HandRaiseUpdate;
screenShareStarted @5 :ScreenShareInfo;
screenShareEnded @6 :Data; # participantRef
connectionQuality @7 :QualityUpdate;
# Stage-specific cap-bearing variants.
speakerGranted @8 :SpeakerToken;
speakerRevoked @9 :Data; # participantRef
# Call-host promotion cap-bearing variants. Delivered
# listener-bound (only the listener named in
# `CallHost.promoteHost(participantRef)` /
# `E2ECallHost.promoteHost(participantRef)` sees the
# variant; other roster subscribers do NOT). Parallels the
# speakerGranted pattern.
hostGranted @10 :CallHost;
e2eHostGranted @11 :E2ECallHost;
hostRevoked @12 :Data; # participantRef
}
}
# The substrate is RECORDING-BLIND -- there is no "recording
# state" field, no "recording started" delta, and no
# protocol-level recording authority. Whoever holds a
# participant cap may locally record what they receive; a
# "shared recording" of a meeting is modeled by inviting a
# recorder principal into the call as a regular participant.
# Discovery surface owned by chat-server. Each session holds a
# ChatDirectory cap (or none) according to chat-server config.
# Search-based, not list-based: scopes can grow large, and the
# results visible to a session depend on chat-server policy that
# tests the calling session's identity. The unbounded "give me
# everything" shape is wrong; the right shape is "give me the
# entries matching this query, bounded".
#
# Note: this is *not* the filesystem `Directory` cap defined in
# `storage-and-naming-proposal.md`. The two interfaces share the
# dictionary meaning of "directory" (an enumerable namespace) but
# nothing else: filesystem `Directory` opens files; chat
# `ChatDirectory` returns join handles for chats. The
# names are deliberately disambiguated.
interface ChatDirectory {
search @0 (query :ChatDirectoryQuery)
-> (page :ChatDirectoryPage);
describe @1 () -> (info :ChatDirectoryScopeInfo);
}
struct ChatDirectoryQuery {
namePattern @0 :Text; # optional substring/glob
chatKind @1 :ChatKind; # optional kind filter
ownerKind @2 :PrincipalKind; # optional principal-kind filter
limit @3 :UInt32; # bounded page size; chat-server
# may further clamp
cursor @4 :Data; # opaque pagination cursor
# returned by a previous search
}
struct ChatDirectoryPage {
entries @0 :List(ChatDirectoryEntry);
nextCursor @1 :Data; # empty when no more pages
}
struct ChatDirectoryEntry {
chatInfo @0 :ChatInfo;
# Each entry carries a kind-specific join cap. The interface IS
# the permission: a Group entry hands you a DiscoverableGroupJoin
# whose .join() returns GroupMember, a Channel entry hands you
# one of the per-kind subscribe caps whose .subscribe() returns
# the matching subscriber. A caller never has to downcast.
union {
groupJoin @1 :DiscoverableGroupJoin;
channelTextSubscribe @2 :DiscoverableChannelTextSubscribe;
channelAudioSubscribe @3 :DiscoverableChannelAudioSubscribe;
channelVideoSubscribe @4 :DiscoverableChannelVideoSubscribe;
}
}
# Possessing one of these caps IS the policy gate. Calling the
# join/subscribe method mints a fresh role cap parented to the
# per-call join event (a fresh chain root in chat-server's lineage
# tree) -- not parented to this discoverable cap itself. So
# revoking one joiner's branch leaves siblings intact, and closing
# the discoverable route epochs the discoverable cap class without
# touching existing members.
interface DiscoverableGroupJoin {
join @0 () -> (member :GroupMember);
}
# Each Channel directory entry yields a per-kind subscribe cap so
# the result type tells the caller exactly which media they may
# read. chat-server config decides which variant fits the calling
# session's policy.
interface DiscoverableChannelTextSubscribe {
subscribe @0 () -> (subscriber :ChannelTextSubscriber);
}
interface DiscoverableChannelAudioSubscribe {
subscribe @0 () -> (subscriber :ChannelAudioSubscriber);
}
interface DiscoverableChannelVideoSubscribe {
subscribe @0 () -> (subscriber :ChannelVideoSubscriber);
}
# ============================================================
# DM (host plaintext-aware text; host-blind A/V) and E2E DM
# (host-blind everything).
#
# DmPeer extends only ChatTextWriter, NOT full ChatWriter. The
# plaintext audio/video write methods (openAudioOut /
# openVideoOut) and the plaintext audio/video subscribe methods
# (subscribeAudio / subscribeVideo from ChatAudioReader /
# ChatVideoReader) are absent at the type level. All DM media
# flows through `callSurface() -> E2ECallSurface` only -- the
# SFU-forward-only end-to-end-encrypted call surface. A
# plaintext-text DM cannot accidentally route media through a
# host-readable plaintext path because no method to do so
# exists on the cap.
# ============================================================
interface DmPeer extends(ChatTextWriter) {
remoteFingerprint @0 () -> (info :PeerFingerprint);
# DM calls are ALWAYS end-to-end encrypted, even when the DM
# text is not. chat-server forwards encrypted media; key
# exchange (DTLS-SRTP or equivalent) runs between the two peers
# at call start.
callSurface @1 () -> (calls :E2ECallSurface);
closeDm @2 () -> ();
}
# Each principal holds a Self cap that lets them produce a contact
# cap, accept incoming invites, accept incoming DMs, revoke contact
# caps they issued, and start new groups (subject to chat-server
# config-gated quota per principal class).
interface Self {
# Cap-form contact issuance. Returns BOTH the bearer
# `ContactCap` (handed via chat-server-mediated cap delivery to
# whoever should be able to DM the issuer) AND a stable
# `ContactCapRef` -- a non-secret, issuer-side handle the issuer
# keeps so they can later call `revokeContact(ref)`. Without a
# separate handle the issuer would have to retain the bearer
# cap itself to revoke it, and bearer caps go to the recipient.
contact @0 (lifetime :UInt64)
-> (contact :ContactCap, ref :ContactCapRef);
# Code-form contact issuance. Returns BOTH the BEARER-SECRET
# `code` bytes (suitable for paper / QR / out-of-band handoff;
# any holder plus a Self cap can redeem via openDmFromCode /
# openE2EDmFromCode) AND a stable `codeId` -- the non-secret
# issuer-side handle for `revokeContactCode(codeId)`. The
# `code` bytes embed the codeId so chat-server can find the
# issuance lineage node without exposing the secret in the
# revocation API. Treat the `code` with bearer-secret hygiene:
# do not log, do not include in transcripts, prefer bounded
# lifetimes, rate-limit redemption attempts. The codeId is a
# plain identifier safe to store in audit logs.
contactCode @1 (lifetime :UInt64)
-> (code :Data, codeId :Data);
revokeContact @2 (ref :ContactCapRef) -> ();
revokeContactCode @3 (codeId :Data) -> ();
openDm @4 (contact :ContactCap) -> (peer :DmPeer);
openE2EDm @5 (contact :ContactCap) -> (peer :E2EDmPeer);
# Out-of-band redemption paths. Take Data, not a cap, because
# paper/QR handoff cannot produce a cap when raw bearer
# transfer is forbidden by `transfer_policy`. The bytes are
# *bearer secrets* that name a distinct lineage node in
# chat-server's tree (the issuance entry created by
# `Self.contactCode` / `GroupMember.inviteCode`). chat-server
# consumes the code byte-for-byte, validates it against that
# lineage node, and mints the derived role/peer cap with
# `parent = the code's lineage node` -- NOT directly with
# parent = the issuer's role cap. So `Self.revokeContactCode`
# and the invite-code's `InviteRevoker` epoch only that
# specific code's descendants.
openDmFromCode @6 (code :Data) -> (peer :DmPeer);
openE2EDmFromCode @7 (code :Data) -> (peer :E2EDmPeer);
acceptInvite @8 (token :InviteToken) -> (member :GroupMember);
acceptInviteCode @9 (code :Data) -> (member :GroupMember);
startGroup @10 (config :GroupCreateConfig) -> (owner :GroupOwner);
describe @11 () -> (info :SelfInfo);
# Inbound-DM notification surface. When some other principal
# opens a DM to this Self via `openDm` / `openDmFromCode` /
# `openE2EDm` / `openE2EDmFromCode`, chat-server delivers the
# receiver's own peer cap for that DM (`DmPeer(self->other)` /
# `E2EDmPeer(self->other)`) here so the receiving principal
# can subscribe and reply. The listener is minted by the receiver
# and carries the same lifetime as any other listener cap
# (drop / Subscription.cancel revokes locally). The listener
# also fires for redeemed code-form DMs (so the issuer learns
# who claimed a `contactCode` they handed out) and for new
# group invites accepted via `Self.acceptInvite` /
# `acceptInviteCode` if the issuer subscribes -- the typed
# event lets the issuer attribute incoming chains to the
# specific contact / invite they issued.
subscribeIncoming @12 (listener :SelfIncomingListener,
options :SubscribeOptions)
-> (sub :Subscription);
}
# Listener for chat-server-mediated cap deliveries TO a Self.
# Chat-server fires `delivered` once per inbound peer / member
# cap; the listener's owning principal extracts the cap and
# decides what to do with it (subscribe, archive, ignore, etc.).
interface SelfIncomingListener {
delivered @0 (event :SelfIncomingEvent) -> ();
}
# Tagged union of inbound chat-server-mediated deliveries.
# The union arm discriminates the delivery flavour; `source` identifies
# WHICH issuance the delivery is parented under so the issuer
# can attribute the event to a specific contact / code / invite
# they handed out, drive a UI ("Bob just opened a DM via the
# contactCode I posted last week"), or call the matching
# revoke method.
#
# Cross-principal cap delivery rule: dmOpened / e2eDmOpened
# carry the *receiver's* peer cap (the listener owner is the
# contact issuer; the chat-server-minted cap belongs to that
# same principal, so this is NOT cross-principal delivery).
# inviteAccepted is the inviter notification arm. It carries
# *no live cap*: the issuance is identified by the envelope's
# `source.inviteRef :GroupCapRef` (the inviter already holds
# this from their original `GroupMember.invite(...)` tuple),
# and the redeemed branch is identified by
# `InviteAcceptedNotice.acceptedRef :GroupCapRef` (a NEW ref
# naming the redeemed `GroupMember` lineage node, distinct
# from the issuance node). Keeping the two refs distinct lets
# the inviter both attribute the event to its issuance entry
# AND drive `GroupAdmin.describeBranch(acceptedRef)` /
# `revokeBranch(acceptedRef)` on the specific redeemed member
# without conflating it with the issuance node.
# inviteOffered is the *invitee* notification arm and carries
# the InviteToken cap chat-server re-mints for the invitee
# under the original issuance node (same lineage rule as the
# chat-event delivery path), so the invitee can call
# Self.acceptInvite(token) -> GroupMember.
struct SelfIncomingEvent {
receivedAtMs @0 :UInt64;
source @1 :IssuanceSource; # which issuance the
# delivery is parented
# under
union {
dmOpened @2 :DmPeer;
e2eDmOpened @3 :E2EDmPeer;
inviteOffered @4 :InviteToken;
inviteAccepted @5 :InviteAcceptedNotice;
# Role-promotion delivery arms. Chat-server fires one of
# these on the promoted principal's Self listener after
# `GroupOwner.makeAdmin` / `ChannelAdmin.makePublisher` /
# `ChannelOwner.makeAdmin`. The cap is parented under the
# promotion issuance node (a chat-server-owned lineage
# entry); revoking via the issuer's
# `RolePromotionRevoker` epochs the cap delivered here.
groupAdminGranted @6 :GroupAdmin;
channelPublisherGranted @7 :ChannelPublisher;
channelAdminGranted @8 :ChannelAdmin;
# Listener-bound delivery of a fresh GroupMember cap to a
# principal auto-grafted into a group by mergeIntoGroupAsTopic
# / moveTopicHere / extractTopicAsGroup. The cap is parented
# under the transformation operation node; revoking via the
# entire-graft path (`revokeBranch(transformationRef)`)
# epochs every grafted cap.
transformationGrafted @9 :GroupMember;
}
}
# Typed identifier for the issuance an incoming delivery is
# parented under. Lets a listener match an event to the
# specific issuance call that produced the delivery (contact /
# code / invite / role promotion / transformation). capOS sends
# the variant that fits the delivery flavour: contact-cap
# deliveries carry `contactRef`, code redemptions carry `codeId`,
# invite deliveries carry `inviteRef`, group role-promotion
# deliveries carry `groupPromotionRef`, channel role-promotion
# deliveries carry `channelPromotionRef`, and transformation
# grafts carry `transformationRef`.
struct IssuanceSource {
union {
contactRef @0 :ContactCapRef;
codeId @1 :Data;
inviteRef @2 :GroupCapRef;
groupPromotionRef @3 :GroupCapRef;
channelPromotionRef @4 :ChannelCapRef;
transformationRef @5 :GroupCapRef; # mergeIntoGroupAsTopic /
# moveTopicHere /
# extractTopicAsGroup
# operation node
}
}
# Inviter-side notification when the invitee redeems a
# previously-issued InviteToken / inviteCode. Carries no live
# bearer cap (the redeemed `GroupMember` belongs to the
# invitee, and `transfer_policy` forbids handing it to the
# inviter); instead carries the issuance ref the inviter
# already holds (`source.inviteRef` on the enclosing
# `SelfIncomingEvent`) plus the redeemed branch's
# `acceptedRef :GroupCapRef` so the inviter can call
# `GroupAdmin.describeBranch(acceptedRef)` /
# `revokeBranch(acceptedRef)` if needed.
struct InviteAcceptedNotice {
invitee @0 :PrincipalRef;
acceptedRef @1 :GroupCapRef; # the redeemed GroupMember
# branch root in the
# group's lineage tree
}
# Issuer-held, non-secret revocation handle returned alongside a
# bearer `ContactCap` from `Self.contact()`. Opaque to the
# caller; chat-server uses it to look up the contact's issuance
# lineage node so `Self.revokeContact(ref)` can epoch that node
# and any DmPeer / E2EDmPeer chains parented under it. Unlike
# the bearer `code` returned by `Self.contactCode`, this handle
# is safe to log in audit, persist in the issuer's "contacts I
# issued" UI list, etc. Distinct from `GroupCapRef` to avoid
# accidentally reusing the same opaque ref across different
# substrates' revocation surfaces.
struct ContactCapRef {
refId @0 :Data; # chat-server-internal handle id
}
# ============================================================
# Group lifetime policy + creation config. A Group is persistent
# by default; ephemeral variants auto-disband when their lifetime
# trigger fires. The substrate exposes lifetime as a Group-level
# property; topics and rooms inherit the parent group's lifetime.
# ============================================================
struct GroupLifetime {
union {
persistent @0 :Void;
ephemeralOnEmpty @1 :Void; # auto-disband when no member is
# present in any room of the
# group (text idle + voice idle
# + stage idle), not just when
# the roster goes empty
deadline @2 :UInt64; # absolute disband time, ms since epoch
ephemeralOnIdle @3 :UInt64; # disband after N ms with no activity
}
}
struct GroupCreateConfig {
displayName @0 :Text;
lifetimePolicy @1 :GroupLifetime;
initialInvites @2 :List(ContactCap); # ocap-clean: must already
# have ContactCap for each
# invitee. NO cold-call admit.
}
# ============================================================
# Chat-graph transformations. Every transformation that crosses
# group boundaries is a TWO-CAP operation: caller proves authority
# on one side, receiver-of-method on the other. chat-server
# validates both before mutating its internal lineage tree.
# ============================================================
enum MergeMemberPolicy {
autoInvite @0; # mint fresh GroupMember(target) for source
# members not already in target; deliver
# listener-bound to each principal via
# `Self.subscribeIncoming`
# (`transformationGrafted :GroupMember`
# arm, `source.transformationRef` carrying
# the operation node's `GroupCapRef`,
# whichever transformation invoked the
# policy: mergeIntoGroupAsTopic /
# moveTopicHere / extractTopicAsGroup).
# The source-group event stream only
# carries non-cap "you have been grafted"
# presence; cap delivery stays
# per-recipient.
dropNonMembers @1; # source members not in target lose access
}
# Methods added to Group role caps for lifetime + transformations.
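# Real capnp doesn't have `extend X { add methods }` syntax; these
# methods are appended to the existing GroupOwner / GroupAdmin
# interfaces declared earlier in this schema sketch. Shown here in
# their own block for readability. The @100 / @101 ordinals are
# placeholders: capnp ordinals must be contiguous, so the real
# methods take the next free ordinal in each interface.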
#
# GroupOwner (in addition to its existing methods) gains:
#
# setLifetimePolicy @100 (policy :GroupLifetime) -> ();
# # Promote ephemeral -> persistent or set a new ephemeral
# # trigger. Same group identity, same caps stay valid; only
# # the auto-disband watcher changes.
#
# mergeIntoGroupAsTopic
# @101 (target :GroupAdmin,
# topicId :Text,
# memberPolicy :MergeMemberPolicy)
# -> (topic :ChatWriter);
# # `this` group becomes a topic under `target` group. The caller
# # must hold both the source GroupOwner cap (this) and the
# # target GroupAdmin cap (passed as argument). Source members
# # not already in target are handled per `memberPolicy`. Source
# # role caps go stale (or transparently re-bind; see Open
# # Question).
#
# GroupAdmin (in addition to its existing methods) gains:
#
# moveTopicHere
# @100 (sourceGroupAdmin :GroupAdmin,
# sourceTopicId :Text,
# destinationTopicId :Text,
# memberPolicy :MergeMemberPolicy) -> ();
# # Move topic from source to destination (this) group. Caller
# # holds destination admin via `this`; sourceGroupAdmin proves
# # authority on the source group.
#
# extractTopicAsGroup
# @101 (topicId :Text,
# lifetime :GroupLifetime,
# displayName :Text,
# creator :Self)
# -> (owner :GroupOwner);
# # Inverse: pull a topic out of `this` group into a brand-new
# # standalone Group. The `creator` Self cap proves the calling
# # principal has group-creation authority; chat-server's
# # `Self.startGroup` policy applies here too (so a guest who
# # cannot create groups cannot bypass the quota by extracting
# # a topic). Caller becomes Owner of the new group; topic
# # members auto-migrate as Members, parented to the extract
# # operation.
# A contact cap is a chat-server-issued cap that says "any holder
# may open a DM to the issuing principal." The issuer can revoke at
# any time. Contact caps may be public (broadly shared) or narrow
# (handed to one specific principal); both shapes are the same cap
# kind, the difference is in how the issuer chose to share it.
interface ContactCap {
describe @0 () -> (info :ContactInfo);
}
# Listener-side. Held by the receiver; minted locally.
interface Subscription { cancel @0 () -> (); }
interface TextListener { post @0 (event :ChatInboundEvent) -> (); }
interface AudioSink { frame @0 (meta :AudioFrameMeta) -> (); }
interface VideoSink { frame @0 (meta :VideoFrameMeta) -> (); }
# Outgoing media. Flow-controlled via `-> stream`.
interface AudioOut {
writeFrame @0 (meta :AudioFrameMeta) -> stream;
close @1 ();
}
interface VideoOut {
writeFrame @0 (meta :VideoFrameMeta) -> stream;
close @1 ();
}
enum ChatPayloadKind {
text @0;
presence @1; # joined / left / typing / status
reactionRef @2; # reference to another event id
approvalRef @3; # reference to an ApprovalGrant; payload is the
# grant's audit-safe descriptor, not the grant
attachment @4; # see AttachmentDescriptor
custom @5; # service-defined; opaque to the substrate
}
struct ChatOutboundEvent {
kind @0 :ChatPayloadKind;
text @1 :Text; # optional, for kind=text and convenience
data @2 :Data; # optional structured payload
inReplyTo @3 :Data; # optional event id
redactionClass @4 :Text; # audit redaction class
}
struct ChatInboundEvent {
eventId @0 :Data;
chatId @1 :Text; # opaque per-chat identifier; renamed
# from the earlier `channel` field
# because "channel" is reserved for
# Telegram-style broadcast Channels.
# Holds equally for Groups, broadcast
# Channels, and DMs.
sender @2 :Text; # disclosure-policy-redacted display name
kind @3 :ChatPayloadKind;
text @4 :Text;
data @5 :Data;
inReplyTo @6 :Data;
receivedAtMs @7 :UInt64;
}
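The lineage rule that recurs in the schema comments (a derived cap is parented to its issuance node, and revoking a node invalidates everything beneath it while siblings stay intact) can be illustrated with a small Rust sketch. All names here (`Lineage`, `NodeId`, `mint`, `revoke`, `is_live`) are hypothetical, and the sketch deliberately swaps in a plain ancestor walk on each liveness check instead of the epoch counters the proposal refers to; chat-server's real table is not specified in this section.

```rust
use std::collections::HashMap;

/// Hypothetical node id; the schema's `GroupCapRef` / `ChannelCapRef`
/// carry an opaque `nodeRef :Data` instead.
type NodeId = u32;

struct Node {
    parent: Option<NodeId>,
    revoked: bool,
}

/// Illustrative lineage table (an assumption, not chat-server's design).
#[derive(Default)]
pub struct Lineage {
    nodes: HashMap<NodeId, Node>,
    next: NodeId,
}

impl Lineage {
    /// Records a new issuance or cap node. `None` models a fresh chain
    /// root, e.g. the per-call join event of a discoverable join.
    pub fn mint(&mut self, parent: Option<NodeId>) -> NodeId {
        let id = self.next;
        self.next += 1;
        self.nodes.insert(id, Node { parent, revoked: false });
        id
    }

    /// Marks one node revoked. Descendants are not touched here; they
    /// become dead because `is_live` consults every ancestor.
    pub fn revoke(&mut self, id: NodeId) {
        if let Some(node) = self.nodes.get_mut(&id) {
            node.revoked = true;
        }
    }

    /// A cap is live only if neither it nor any ancestor is revoked.
    /// Unknown ids are treated as dead.
    pub fn is_live(&self, id: NodeId) -> bool {
        let mut cursor = Some(id);
        while let Some(current) = cursor {
            match self.nodes.get(&current) {
                Some(node) if !node.revoked => cursor = node.parent,
                _ => return false,
            }
        }
        true
    }
}
```

Under these assumptions, minting two contact codes beneath one issuer node and revoking only the first kills the peer cap derived from the first code while the peer cap derived from the second stays live, which is the behaviour the `Self.revokeContactCode` comment describes.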
Notes:
- ChatEvent (the existing struct in capos.capnp) becomes ChatInboundEvent. Listener caps replace poll, but poll may stay as a deprecated, transport-stopgap method during the capnp-rpc migration.
- AudioFrameMeta / VideoFrameMeta carry timestamps, codec hints, and a ring-buffer slot reference. Frame bodies live in MemoryObject-backed rings shared between the producer and consumer.
- approvalRef is the only tie between this proposal and the approval surface: it lets an approval request appear in a chat as a structured message that links to an ApprovalGrant cap. The grant cap travels by capnp-rpc cap reference, not as bytes inside the message data.
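To make the split between frame metadata and frame bodies concrete, here is a minimal Rust sketch of a media ring. It is an illustration under stated assumptions, not the proposal's design: `FrameMeta` stands in for AudioFrameMeta (whose exact fields this section does not list), a `Vec<u8>` stands in for the MemoryObject-backed shared ring, and the `seq` field used to detect overwritten slots is an invented detail.

```rust
/// Hypothetical codec hint; the real AudioFormat is defined elsewhere.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Codec {
    Opus,
    Pcm16,
}

/// Hypothetical stand-in for AudioFrameMeta: timestamp, codec hint,
/// and a ring-buffer slot reference. No frame bytes travel in it.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct FrameMeta {
    pub timestamp_us: u64,
    pub codec: Codec,
    pub slot: u32,
    pub len: u32,
    pub seq: u64, // assumption: lets the consumer detect overwrites
}

/// Vec-backed ring standing in for a MemoryObject-backed shared ring.
pub struct MediaRing {
    slot_size: usize,
    slots: Vec<u8>,
    seqs: Vec<u64>, // sequence last written to each slot (0 = never)
    next_seq: u64,
}

impl MediaRing {
    pub fn new(slot_count: usize, slot_size: usize) -> Self {
        assert!(slot_count > 0 && slot_size > 0);
        Self {
            slot_size,
            slots: vec![0; slot_count * slot_size],
            seqs: vec![0; slot_count],
            next_seq: 1,
        }
    }

    /// Producer side: copy the body into the next slot and return the
    /// metadata that would be sent via `writeFrame`. Oversized bodies
    /// are rejected with `None`.
    pub fn write(&mut self, timestamp_us: u64, codec: Codec, body: &[u8]) -> Option<FrameMeta> {
        if body.len() > self.slot_size {
            return None;
        }
        let seq = self.next_seq;
        let slot = ((seq - 1) % self.seqs.len() as u64) as usize;
        let start = slot * self.slot_size;
        self.slots[start..start + body.len()].copy_from_slice(body);
        self.seqs[slot] = seq;
        self.next_seq += 1;
        Some(FrameMeta { timestamp_us, codec, slot: slot as u32, len: body.len() as u32, seq })
    }

    /// Consumer side: resolve metadata to bytes. Returns `None` when
    /// the slot has been reused, which models lossy media delivery.
    pub fn read(&self, meta: &FrameMeta) -> Option<&[u8]> {
        let slot = meta.slot as usize;
        if slot >= self.seqs.len()
            || self.seqs[slot] != meta.seq
            || meta.len as usize > self.slot_size
        {
            return None;
        }
        let start = slot * self.slot_size;
        Some(&self.slots[start..start + meta.len as usize])
    }
}
```

The point of the shape is that only the small `FrameMeta` value needs to cross the RPC boundary, while a slow consumer simply misses frames whose slots were reused instead of stalling the producer. A real ring shared across processes would also need memory-ordering guarantees that this single-owner sketch does not model.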
WebRTC Mapping
Browser-side participants use WebRTC. The trusted Rust backend (or a capOS-side WebRTC adapter the gateway delegates to) implements the peer at the capOS end. The mapping is symmetric enough that no additional abstraction layer is needed in either direction.
| Chat substrate | WebRTC equivalent | Notes |
|---|---|---|
| subscribeText(listener) + send(event) | RTCDataChannel (reliable, ordered) | Text events are JSON view models on the HTTP path; the WebRTC data channel may carry the same JSON for browser peers that want lower-latency text without HTTP polling. |
| openAudioOut, subscribeAudio(sink) | RTCPeerConnection audio track (addTrack, ontrack) | Codec negotiation via SDP; capOS-side adapter exposes the agreed AudioFormat. |
| openVideoOut, subscribeVideo(sink) | RTCPeerConnection video track | Same as audio with codec/resolution negotiation. |
| postAttachment(descriptor) | RTCDataChannel reliable chunk transfer or HTTP file fetch | Bounded attachments only; large transfers go through a separate File/ContentStore cap. |
| presence payload kind | RTCPeerConnection connectionstatechange events + custom data-channel messages | capOS surfaces presence as ChatInboundEvent kind=presence. |
| approvalRef payload kind | data channel message with structured payload | The approval cap stays on the capnp-rpc side; the data channel only carries the audit-safe descriptor. |
| ICE / SDP negotiation | gateway endpoint /api/chat/webrtc/* | Browser sends offer; backend produces answer; ICE candidates traded via the same endpoint. The browser never receives capOS caps through this path – only WebRTC handles. |
| DTLS / SRTP keys | WebRTC default | TLS for the browser ↔ backend signalling channel must be configured separately (see certificates-and-tls-proposal.md). |
The gateway boundary stays the same: the browser receives WebRTC handles and view models. The trusted backend holds the chat cap, the listener caps, the media rings, and the WebRTC peer connection. No capOS authority object crosses to the browser.
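A minimal Python sketch of the view-model boundary follows. The to_view_model helper, the allow-list, the base64 encoding of Data fields, and the grantCap key are all assumptions made for illustration; the proposal only fixes that the browser sees view models and never a capOS authority object.

```python
import base64
import json

# Fields of ChatInboundEvent that may appear in the browser-facing JSON.
ALLOWED_FIELDS = (
    "eventId", "chatId", "sender", "kind",
    "text", "data", "inReplyTo", "receivedAtMs",
)


def to_view_model(event: dict) -> str:
    """Serialize an event for a browser peer using an explicit allow-list."""
    view = {}
    for key in ALLOWED_FIELDS:
        if key not in event:
            continue
        value = event[key]
        if isinstance(value, (bytes, bytearray)):
            # Data fields are not JSON-safe; base64 is an assumed encoding.
            value = base64.b64encode(bytes(value)).decode("ascii")
        view[key] = value
    return json.dumps(view, sort_keys=True)


event = {
    "eventId": b"\x01\x02",
    "chatId": "ops",
    "sender": "agent-7",
    "kind": "approvalRef",
    "data": b'{"summary":"delete volume"}',  # audit-safe descriptor only
    "receivedAtMs": 1700000000000,
    "grantCap": object(),  # hypothetical backend-side handle; never serialized
}
decoded = json.loads(to_view_model(event))
```

An allow-list rather than a deny-list means a field added on the backend later stays invisible to the browser until someone opts it in.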
Approvals Stay Separate
Approvals are a different surface from “may I write to you”. They
already have a designed capability: ApprovalClient / ApprovalGrant
(shell-proposal.md:407-427, also referenced in
user-identity-and-policy-proposal.md:812). Per-tool permission modes
are defined in llm-and-agent-proposal.md:105-114
(auto|consent|stepUp|forbidden). The remote CapSet UI’s
“action-approval queue” is the canonical UI surface
(remote-session-capset-client-proposal.md § UI Scope And Architecture).
What ApprovalClient is for: a principal that already has authority
to attempt some action wants confirmation before exercising it (or
the policy engine demands a step-up). Examples: agent runtime asks
the operator before invoking a consent-mode tool; a destructive
operation needs WebAuthn step-up; a queued write awaits
human-in-the-loop sign-off.
What ApprovalClient is not for: cold-call admission. There is no
flow where principal A asks the system “may I please write to B”.
That request requires a cap A does not have. The substrate’s answer
is: B issues a contact cap (via Self.contact()) or invites A to a
shared Group via GroupMember.invite(...) (or, if B holds the
broadcast Channel role, ChannelAdmin.makePublisher(...)).
Without an existing
cap from B’s chain, A has no protocol-level path. See
“Capability Granting” above.
Chat ties to ApprovalClient in exactly one place: an approvalRef
payload kind lets a chat thread display an approval request as a
structured message linking to a live ApprovalGrant cap. The grant
cap travels by capnp-rpc cap reference; the bytes inside the message
data carry only an audit-safe descriptor. The grant state machine,
the broker call, the policy check, the step-up mechanics, and the
audit trail all remain on the existing ApprovalClient /
AuthorityBroker.request path.
Approvals-side gaps that are still open (and tracked separately in
WORKPLAN.md):
- Detailed ActionPlan and CapRequest schema. Both are referenced in the existing ApprovalClient sketch but not fully specified.
- Durable approval queue / inbox shape. Today the flow is synchronous (ApprovalClient.request returns a grant cap directly); the remote CapSet UI’s queue surface implies persistence and listing. A queue cap layered on top of ApprovalClient (e.g. ApprovalQueue.list() -> List(Pending), next() -> ApprovalGrant) is a natural follow-up.
These should land in a follow-up update to shell-proposal.md /
user-identity-and-policy-proposal.md, not in this Chat proposal.
Chat Categories
Telegram-aligned naming. Three concrete chat categories plus an E2E
variant of DMs. Distinct cap types because they have distinct
authorities; all of them sit on top of the unified
ChatEndpoint / ChatWriter base interfaces.
- Group – multi-participant, two-way. Has an Owner, zero-or-more Admins, and Members. Supports nested rooms of three kinds: text topics (sub-channels for text), voice rooms (Discord-style persistent always-on voice rooms), stage rooms (Discord-stage / Twitter-Spaces broadcast voice within the group with raise-hand to speak). Per-room permission overrides are out of scope for the first slice; RoomPolicy leaves the door open.
- Channel (Telegram-strict: BROADCAST) – read-only for subscribers. Owner/Admin/Publisher post; Subscribers receive only. Useful for system announcements, agent status feeds, log streams, one-to-many broadcasts.
- DM – two-participant chat. No group-level role hierarchy. Each peer holds an asymmetric DmPeer cap.
- E2E DM – two-participant DM where the chat host carries ciphertext only. Distinct cap layer (E2EDmPeer) because key exchange, AEAD, forward-secrecy ratchets, and out-of-band fingerprint verification are concerns the unencrypted DM does not have. See “End-To-End Encrypted DMs” below.
In addition, both Groups and DMs expose an ephemeral Call surface for voice/video conferences – but with a kind-specific narrowing:
- Groups use GroupMember.callSurface() -> CallSurface for multi-party calls; CallSurface.startCall allows setRoutingMode(sfu / mesh / mcu) so server-side mixing is available when text/audio aren’t end-to-end-encrypted.
- DMs (both plain DmPeer and E2EDmPeer) use callSurface() -> E2ECallSurface – the SFU-forward-only surface with no setRoutingMode. Direct calls between two principals are end-to-end-encrypted at the media layer regardless of whether DM text is host-readable.
A Call has explicit start/end, distinct from the persistent VoiceRoom: use Call for “let’s hop on a quick conference”, use VoiceRoom for “Discord voice channel always there”.
There is no special “system room” category. A system-managed
chat is just a chat whose Owner principal is a service principal or
a designated admin principal. capOS already treats services as
principals (PrincipalKind.service in
user-identity-and-policy-proposal.md:91-98); a service-owned chat
applies the same role/lineage rules as any other.
Naming convention. The unqualified word “channel” in this
proposal refers only to the broadcast category (Telegram-style
Channel). Anything generic – a stream of events, a subscription
target, an A/V flow – is called a chat (the substrate-level
term). Base interfaces use the Chat prefix (ChatEndpoint,
ChatWriter, ChatDirectory, ChatInfo, ChatKind); concrete
roles use the category prefix (Group*, Channel*, Dm*).
Substrate is recording-blind. No protocol-level “start recording” / “consent to recording” / “recording state” surface exists. Server-side recording with consent is consent theater anyway – a phone next to the speakers or a screen recorder on the recipient’s own device defeats it instantly. Recording is purely a client-side concern: whoever holds a participant cap may locally record bytes they receive. A “shared meeting recording” is modeled by inviting a recorder principal into the call – it shows up in the roster like any other participant, the social contract carries the rest.
Lifetime And Transformations
Groups have a lifetime policy chosen at creation, and the chat graph supports a small set of structure-preserving transformations.
Group lifetime
GroupLifetime is one of:
- persistent (default): the group lives until an owner calls disband() or transforms it into something else. Manifest-created groups default to persistent.
- ephemeralOnEmpty: chat-server auto-disbands when the last member leaves. “Spin up a quick chat with these three people; it goes away when everyone closes the tab.”
- deadline: chat-server auto-disbands at an absolute time. “This pickup-call thread auto-archives Friday at 17:00.”
- ephemeralOnIdle: chat-server auto-disbands after N ms with no message activity. “Self-cleanup if nobody says anything for an hour.”
Owners can change the policy at runtime via setLifetimePolicy.
Going from ephemeral to persistent is “promote this ephemeral chat
to a permanent one”; the same group identity persists, no caps
rotate, no auto-invite happens. Going the other way (persistent ->
ephemeral) is also valid – the auto-disband watcher just starts.
Lifetime applies at the Group level. Topics and rooms inherit the parent group’s lifetime; they don’t have separate auto-disband clocks. This is the right scope: rooms are sub-spaces of a group, not independent chats.
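The four policies reduce to one predicate the auto-disband watcher could evaluate per group. The Python below is a sketch under assumptions: the GroupLifetime dataclass, its field names, and should_disband are hypothetical, and time is modelled as plain millisecond integers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GroupLifetime:
    kind: str                         # persistent | ephemeralOnEmpty | deadline | ephemeralOnIdle
    deadline_ms: Optional[int] = None  # absolute time, used by "deadline"
    idle_ms: Optional[int] = None      # N ms of silence, used by "ephemeralOnIdle"


def should_disband(policy: GroupLifetime, now_ms: int,
                   present_members: int, last_activity_ms: int) -> bool:
    """Return True when the watcher should auto-disband the group."""
    if policy.kind == "persistent":
        return False  # only an explicit disband() or a transformation ends it
    if policy.kind == "ephemeralOnEmpty":
        return present_members == 0
    if policy.kind == "deadline":
        return now_ms >= policy.deadline_ms
    if policy.kind == "ephemeralOnIdle":
        return now_ms - last_activity_ms >= policy.idle_ms
    raise ValueError(f"unknown lifetime kind: {policy.kind}")


HOUR_MS = 60 * 60 * 1000
idle_policy = GroupLifetime("ephemeralOnIdle", idle_ms=HOUR_MS)
quiet_for_an_hour = should_disband(idle_policy, now_ms=2 * HOUR_MS,
                                   present_members=3, last_activity_ms=HOUR_MS)
```

Because the policy is a value on the group rather than state in the caps, promoting an ephemeral group to persistent only swaps the policy the watcher reads; nothing about the members' caps changes.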
For DMs the same GroupLifetime shape can be reused (an
ephemeralOnIdle DM is the natural shape for “self-destructing
chat” if you ever want it), via a lifetime field on the
Self.openDm config. Out of scope for this slice; the schema
leaves room.
Ad-hoc group creation
Self.startGroup(config :GroupCreateConfig) -> (owner :GroupOwner)
lets any principal whose Self cap permits it create a new group.
chat-server policy gates this per principal class – operators
typically have a creation quota; guests/anonymous don’t have
Self.startGroup at all (cap absent from their bundle).
Initial invitees are passed as a List(ContactCap). This is the
ocap-clean rule: you can only invite people you already have a
ContactCap for. No cold-call admit. Want to spin up a Group with
strangers? You can’t; you have to first arrange contact via
existing channels (someone vouches by sharing your contact card,
you publish a public ContactCap, etc.).
Each initial invite is delivered through the existing Self
notification surface of the invitee, who can Self.acceptInvite
to join. If invites are declined, the group still exists with
just the creator as Owner.
Transformations
Four transformations of the chat graph. Promotion is single-cap; the three cross-group mutations (merge, move, extract) are each a two-cap operation: the cap the method is invoked on proves authority on one side, and a cap passed as an argument proves authority on the other. chat-server validates both before mutating its lineage tree.
Promote ephemeral to persistent. GroupOwner.setLifetimePolicy({persistent}).
Single-cap (just the Owner of the ephemeral group). No member
migration; same caps stay valid.
Merge a group into another as a topic. GroupOwner of the
source calls mergeIntoGroupAsTopic(target :GroupAdmin, topicId, memberPolicy). After success:
- Source group ceases to exist as a top-level group; its identity becomes a topic under target.
- Source members not already members of target are handled per memberPolicy: autoInvite mints fresh GroupMember(target) caps for them (parented to the merge operation), and chat-server delivers each cap LISTENER-BOUND to the recipient principal via that principal’s Self.subscribeIncoming – the transformationGrafted :GroupMember arm, with source.transformationRef carrying the merge-op GroupCapRef. The fan-out source-group event stream only carries non-cap presence (a “you have been grafted into target via merge” notice) so cap delivery stays on the listener-bound surface required by transfer_policy. The alternative dropNonMembers lets the source caps go stale without minting new ones.
- The merge operation is a node in chat-server’s lineage tree; every cap minted as part of it is parented to that node, so “revoke everything that came in via this merge” is one operation.
Move a topic between groups. GroupAdmin.moveTopicHere(sourceGroupAdmin, sourceTopicId, destinationTopicId, memberPolicy). Same two-cap
shape: caller’s this is the destination admin; sourceGroupAdmin
is the source. Topic members not in the destination are handled
per memberPolicy. The topic-as-namespace identity moves; the
topic’s history (text events, attachments) carries over.
Extract a topic into a standalone group.
GroupAdmin.extractTopicAsGroup(topicId, lifetime, displayName, creator :Self). Inverse of merge – but unlike the
single-extract-cap shape that would let any group admin mint a
top-level Group regardless of group-creation authority, this
method takes a creator :Self cap as a second argument.
chat-server applies the same policy it applies to
Self.startGroup (per principal class quota, ban-list checks,
etc.) to the calling principal before minting the new
GroupOwner. A guest or admin who is not allowed to create
groups cannot bypass the quota by extracting a topic. Caller
becomes Owner of the new group; topic members auto-migrate as
Members; their caps are parented to the extract operation.
Authority rules
All three cross-group operations share these invariants:
- Two-cap proof. Methods that move structure across groups take the other authority as an argument. For mergeIntoGroupAsTopic / moveTopicHere that’s the other group’s admin role cap (the partnerAdmin arm of OperationConsent in lineage queries). For extractTopicAsGroup there is no other-side group yet, so the second authority is the initiator’s own Self cap proving group-creation quota (the selfCreation arm of OperationConsent); chat-server applies the same per-principal quota / ban-list checks it applies to Self.startGroup before minting the new GroupOwner. chat-server rejects with incompatibleChatKind if the cross-group caps reference chats with incompatible kind/policy (e.g. you can’t merge an E2E DM into a non-E2E group).
- Lineage continuity. The transformation operation is itself a node in chat-server’s tree (OperationNodeInfo arm of LineageNode returned by describeBranch); new caps minted as part of it record parent = the operation (the transformation arm of BranchParentage). Both entire-graft revocation (revokeBranch(operationNodeRef)) and per-member revocation (revokeBranch(memberCapRef)) work, and either ref kind passes through the same GroupCapRef envelope.
- No cold-call sneak path. autoInvite looks like it might be a way to drag people into a group they didn’t agree to, but it requires both the source-group owner (who has authority over those members because they’re already in the source group) AND the target-group admin (who has authority to admit) to consent in the same call. A single party can never drag people into a group on their own; the two-cap pattern is the consent.
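The invariants above can be exercised on a toy lineage tree. This Python sketch is illustrative only: the Lineage class, the string refs, and the choice to parent the merge operation node under the target admin cap are assumptions (the proposal fixes only that grafted caps are parented to the operation node).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Node:
    parent: Optional[str]
    revoked: bool = False
    children: List[str] = field(default_factory=list)


class Lineage:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {"root": Node(parent=None)}

    def mint(self, ref: str, parent: str) -> str:
        if self.nodes[parent].revoked:
            raise PermissionError(f"{parent} is revoked")
        self.nodes[ref] = Node(parent=parent)
        self.nodes[parent].children.append(ref)
        return ref

    def is_live(self, ref: str) -> bool:
        return not self.nodes[ref].revoked

    def revoke_branch(self, ref: str) -> None:
        # Revocation cascades to every descendant of the node.
        stack = [ref]
        while stack:
            node = self.nodes[stack.pop()]
            node.revoked = True
            stack.extend(node.children)


def merge_into_group_as_topic(lineage: Lineage, source_owner: str, target_admin: str,
                              source_members: Set[str],
                              target_members: Set[str]) -> Tuple[str, Dict[str, str]]:
    # Two-cap proof: both authorities must be live in the same call.
    if not (lineage.is_live(source_owner) and lineage.is_live(target_admin)):
        raise PermissionError("merge needs both authorities")
    op = lineage.mint("op:merge", parent=target_admin)  # assumed parentage
    grafted = {p: lineage.mint(f"member:dst:{p}", parent=op)
               for p in sorted(source_members - target_members)}
    return op, grafted


lineage = Lineage()
lineage.mint("owner:src", "root")
lineage.mint("admin:dst", "root")
lineage.mint("member:dst:bob", "admin:dst")
op, grafted = merge_into_group_as_topic(
    lineage, "owner:src", "admin:dst",
    source_members={"bob", "carol"}, target_members={"bob"})
lineage.revoke_branch(op)
```

Revoking the operation node removes only what the merge brought in: carol's grafted cap dies, while bob's pre-existing membership and the admin cap are untouched.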
Lifetime interaction with conferencing
A subtle thing worth flagging: ephemeralOnEmpty interacts
oddly with VoiceRooms. If a Group has a VoiceRoom and the last
text-chat member leaves but two people are still connected to
the voice room, the group should not auto-disband. Definition:
“empty” means “no member is present in any room of the group”
– text idle, voice idle, stage idle. Detail for the
implementation iteration.
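Read literally, the definition of “empty” is a check over every room of the group rather than over the text roster alone. A small Python sketch, assuming a hypothetical mapping from room id to the set of principals currently present:

```python
from typing import Dict, Set


def group_is_empty(presence_by_room: Dict[str, Set[str]]) -> bool:
    """A group is empty only when no member is present in any of its rooms."""
    return all(len(present) == 0 for present in presence_by_room.values())


presence = {
    "topic:general": set(),           # last text-chat member has left
    "voice:standup": {"alice", "bob"},  # two people still connected
    "stage:all-hands": set(),
}
still_in_use = not group_is_empty(presence)
```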
A merged-into-topic source group’s lifetime policy does not
survive the merge. The topic now lives under the target group’s
lifetime; if the source was on ephemeralOnIdle and the target
is persistent, the topic becomes persistent. Worth surfacing
in the merge confirmation UX. Substrate behavior:
lifetimePolicy is a Group-level field; topics inherit.
Cap continuity at the holder (Open Question)
When a group merges into another as a topic, members hold caps that used to mean “send to the top of source group” and now mean “send to topic X under target group”. Three viable strategies; the substrate proposal does not lock one in:
- Transparent redirect. Old caps keep working; chat-server’s dispatch routes calls to the new topic. describe() reveals the new identity. Pros: zero client code change. Cons: leaks “this used to be a separate group” history; may surprise users.
- Forwarding denial. Old caps go stale with a chatMerged denial that includes a forwarding hint (event id and a reference the client can fetch to obtain the new topic cap). Pros: clean break; auditable. Cons: every client across every member needs to handle the forwarded-redirect at the call site.
- Holder-driven re-bind. chat-server delivers a presence event to every affected member carrying the new cap; the old cap stays usable for a grace window after the merge, then goes stale. Lets clients re-bind without disruption; the eventual stale flip ensures no permanent dual identity.
The third strategy reads cleanest to me, but it benefits from prototyping. Implementation iteration will pick one.
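To make the third strategy concrete for prototyping, here is a Python sketch of the dispatch decision for a pre-merge cap. MergedCapState, StaleCap, dispatch_old_cap, and the grace-window length are hypothetical; the proposal does not fix any of them.

```python
from dataclasses import dataclass


class StaleCap(Exception):
    """Raised when a pre-merge cap is used after its grace window."""


@dataclass(frozen=True)
class MergedCapState:
    merged_at_ms: int
    grace_ms: int
    new_topic_ref: str


def dispatch_old_cap(state: MergedCapState, now_ms: int) -> str:
    """Route a call made on a pre-merge cap: redirect in grace, else go stale."""
    if now_ms - state.merged_at_ms < state.grace_ms:
        return state.new_topic_ref
    raise StaleCap(state.new_topic_ref)


state = MergedCapState(merged_at_ms=1_000, grace_ms=60_000, new_topic_ref="topic:x")
routed = dispatch_old_cap(state, now_ms=30_000)
try:
    dispatch_old_cap(state, now_ms=61_000)
    went_stale = False
except StaleCap:
    went_stale = True
```

During the window the behaviour matches the transparent redirect; afterwards it matches the forwarding denial, so a prototype of this strategy exercises both client code paths.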
Capability Granting
The current Chat interface in schema/capos.capnp is open-by-default:
holding the system Chat cap lets a process join any channel by name and
send to any channel. That is the wrong model. This section defines an
ocap-disciplined replacement: every Chat capability is granted
explicitly by a holder that already has it, every derived cap has a
recorded parent, and revocation cascades through the derivation tree.
Cap flavours
The substrate defines four kinds of caps. The exact schema is part of the implementation iteration; the shape is what matters.
- Chat service root cap. Held by chat-server itself, never handed to user code. The root authority from which every other chat cap ultimately derives. Manifest configuration tells chat-server which groups and channels to materialize at startup; chat-server uses its root cap to do so. The root cap is the lineage root; it is not “ambient authority handed out by the broker” – it is service authority held by the service that runs Chat.
- Role caps. A role on a specific chat is a cap. Roles inherit upward; concrete role caps embed the unified ChatEndpoint / ChatWriter base interfaces.
  - GroupOwner(group) extends GroupAdmin extends GroupMember extends ChatWriter. Full authority on the group: appoint admins, create/remove rooms (text topics + voice rooms + stage rooms), change group settings, kick members, issue invites, open public-join routes, disband.
  - GroupAdmin(group) adds member/branch/room moderation and invite-policy management. Per-permission DSL (can-pin, can-invite, can-create-room, …) is future work; first slice ships a single Admin role.
  - GroupMember(group) – read and write all rooms under the group’s default policy. Members may invite others if the group’s policy allows. Members access voice/stage rooms via voiceRoom(id) / stageRoom(id) and ephemeral conferences via callSurface().
  - ChannelOwner(channel) extends ChannelAdmin extends ChannelPublisher extends ChatWriter. Full broadcast authority. Per-kind subscribers – ChannelTextSubscriber(channel) extends ChatTextReader only, ChannelAudioSubscriber(channel) extends ChatTextReader + ChatAudioReader, ChannelVideoSubscriber(channel) extends all three readers – are read-only at the type level. Promotion to publisher goes through ChannelAdmin.makePublisher.
  - DmPeer(dmId, direction) extends only ChatTextWriter (NOT full ChatWriter). DM text is host-readable; DM media is NOT – audio/video flows only through DmPeer.callSurface() -> E2ECallSurface, where chat-server forwards already-encrypted frames between peers. A→B peer cap gives A the right to push text to B; it is not symmetric.
  - E2EDmPeer is the analogous cap for end-to-end-encrypted DMs (does not extend ChatWriter because its payloads are CipherEnvelope, not ChatOutboundEvent).
  - CallParticipant / CallHost – ephemeral conference participation; held while a Call is live, parented to the joiner’s chat role cap. Voice/stage variants have their own concrete role caps (StageListener, StageSpeaker). StageListener is parented to the joiner’s GroupMember role cap (joinAsListener is a normal accessor on the member’s stage facet); StageSpeaker is the exception – see below.
  - SpeakerToken / SpeakerRevoker – a stage-room admin’s grant of speak authority for a specific listener. Holding SpeakerToken lets that listener call StageRoom.joinAsSpeaker(token) -> StageSpeaker, and chat-server mints the resulting StageSpeaker with parent = the SpeakerToken's lineage node. The admin holds the companion SpeakerRevoker (parented to the admin’s StageRoomAdmin cap); revoker.revoke() epochs both the unredeemed token and any active StageSpeaker redeemed from it, so pulling the floor back actually kills the live speaker cap rather than just blocking future redemptions.
- Listener-side caps. Held by the receiver. Minted locally; never issued by anyone else. The receiver hands a listener cap to a chat role cap (Group, broadcast Channel, DM, voice/stage room) when subscribing; that role cap calls back per event. Dropping the listener (or cancelling the returned Subscription) is the receiver’s instant revocation tool.
  - TextListener
  - AudioSink
  - VideoSink
- Discovery / join caps.
  - ChatDirectory(scope) – read-only access to the discoverable chats (Groups and broadcast Channels) chat-server’s configuration exposes for this scope. Bundled to sessions per chat-server config (e.g. operator-class sessions get ChatDirectory(operator-scope)). Holding it lets the session call ChatDirectory.search(query) -> ChatDirectoryPage and filter by chat-server-defined criteria. Not a global index – each scope is whatever chat-server’s config carves out.
  - DiscoverableGroupJoin(group) – “you are allowed to join this group”. Returned by ChatDirectory.search(query) entries that the scope’s policy says the caller may join, or bundled directly to a session by chat-server config. Possessing it is the authority; calling DiscoverableGroupJoin.join() -> GroupMember mints a fresh role cap. There is no separate “redeem” step; possession is authority, the method just produces the derived cap.
  - DiscoverableChannelTextSubscribe(channel) / DiscoverableChannelAudioSubscribe(channel) / DiscoverableChannelVideoSubscribe(channel) – analogous for broadcast Channels. Each returns the matching per-kind ChannelTextSubscriber / ChannelAudioSubscriber / ChannelVideoSubscriber cap; the result type tells the caller exactly which media facets they hold.
  - InviteToken – a one-shot or n-shot bearer token an admin or policy-permitted member produces via GroupMember.invite(forSubject, lifetime) -> (token, revoker, inviteRef). The invitee calls Self.acceptInvite(token) -> GroupMember. The token interface has NO revoke method; revocation lives on the issuer-held companion InviteRevoker cap, parented to the issuer’s role cap in chat-server’s lineage tree. The issuer also keeps the non-secret inviteRef :GroupCapRef for the cap-clean GroupAdmin.describeBranch / revokeBranch path. (For paper / QR / out-of-band handoff where the recipient cannot receive a cap, the issuer uses GroupMember.inviteCode(lifetime) -> (code :Data, revoker, inviteRef) instead, and the recipient calls Self.acceptInviteCode(code). The bytes are bearer secrets that name a distinct lineage node in chat-server’s tree – the issuance entry created by inviteCode. On redemption chat-server mints the resulting GroupMember cap with parent = the inviteCode lineage node, NOT directly with parent = the inviter's role cap. Revoking via the companion InviteRevoker therefore epochs only that code’s descendants. See How bearer caps cross principal boundaries below for the full redemption-parent contract, and treat the bytes with bearer-secret hygiene – do not log, prefer bounded lifetimes and rate-limited redemption.)
  - SpeakerToken / SpeakerRevoker – analogous shape for stage-room speak grants. Bearer holds SpeakerToken (no revoke method); admin holds SpeakerRevoker minted via StageRoomAdmin.promoteToSpeaker(listenerRef).
  - Self.contact() – a cap a principal produces to advertise “you may DM me”. The method returns BOTH the bearer ContactCap (handed to whoever should be able to DM the issuer) AND a non-secret ContactCapRef the issuer keeps for Self.revokeContact(ref). A holder of the bearer cap calls Self.openDm(contactCap) -> DmPeer (or Self.openE2EDm(contactCap) -> E2EDmPeer). The contact-issuing principal sees the resulting DM via their own Self cap’s notification surface. Equivalent to a Telegram contact card or a published @handle; the substrate’s only guarantee is that you needed a contact cap (or its bytes form via Self.contactCode, which similarly returns both the bearer-secret code and a non-secret codeId revocation handle) to initiate.
There is no IntroCap primitive. What I formerly called
“redeem an intro” is just calling a method on a DiscoverableGroupJoin / DiscoverableChannel*Subscribe,
InviteToken, or contact cap that returns a derived role cap.
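The token/revoker split can be sketched in a few lines of Python. Everything here is a stand-in: _Issuance, invite, and accept_invite are hypothetical names, the shot count is an assumed parameter, and the sketch models only the bearer side (it does not model the cascade to already-redeemed member caps).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class _Issuance:
    """Server-side issuance entry shared by the token and its revoker."""
    group: str
    uses_left: int
    revoked: bool = False


class InviteToken:
    """Bearer side: deliberately exposes no revoke method."""
    def __init__(self, issuance: _Issuance) -> None:
        self._issuance = issuance


class InviteRevoker:
    """Issuer side: the only handle that can revoke the issuance."""
    def __init__(self, issuance: _Issuance) -> None:
        self._issuance = issuance

    def revoke(self) -> None:
        self._issuance.revoked = True


def invite(group: str, shots: int = 1) -> Tuple[InviteToken, InviteRevoker]:
    issuance = _Issuance(group, shots)
    return InviteToken(issuance), InviteRevoker(issuance)


def accept_invite(principal: str, token: InviteToken) -> str:
    issuance = token._issuance
    if issuance.revoked or issuance.uses_left <= 0:
        raise PermissionError("invite is revoked or exhausted")
    issuance.uses_left -= 1
    return f"GroupMember({issuance.group}) for {principal}"


token, revoker = invite("lobby", shots=2)
first = accept_invite("bob", token)
revoker.revoke()
try:
    accept_invite("carol", token)
    second_ok = True
except PermissionError:
    second_ok = False
```

Holding the token gives no way to cancel it for others, and holding the revoker gives no way to join; the two authorities stay with the parties that should have them.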
How bearer caps cross principal boundaries
The substrate forbids raw bearer transfer of chat caps via
kernel-enforced transfer_policy. But a flow like “Alice creates
an InviteToken and gives it to Bob” inherently means a cap moves
from Alice’s process to Bob’s. The same applies to ContactCap
sharing.
These chat-class cap transfers go through chat-server itself,
never through raw IPC IPC_TRANSFER_CAP. Two paths:
- Cap reference inside a chat event. ChatOutboundEvent.data may carry chat-server-recognized chat-class cap references (an InviteToken, a ContactCap). When a holder sends such an event with ChatTextWriter.send, chat-server inspects the payload, sees the cap reference, and on delivery to each recipient re-mints a fresh derived cap. The lineage parent for the re-minted recipient cap is the original issuance node, NOT the sender’s chat cap, so that the issuer-held revoker (e.g. ContactCapRef from Self.contact, InviteRevoker from GroupMember.invite) reaches every recipient copy and every downstream descendant when the issuer revokes. If chat-server instead parented under the sender’s chat cap, only the sender’s branch would be killed on revoke; recipient copies and the DmPeer / GroupMember caps minted from them would survive, defeating the issuer-side revocation contract. The original bearer cap stays in the sender’s table; the recipient receives a fresh cap of the same kind, parented under the issuance node. Lineage is preserved; raw bearer transfer never happens.
- Out-of-band delivery + recipient redeem. Bytes can be exchanged through a non-chat path (paper handoff, QR code, manifest entry in a test fixture). Issuers produce the bytes through Self.contactCode / GroupMember.inviteCode; recipients redeem them via Self.openDmFromCode(code), Self.openE2EDmFromCode(code), or Self.acceptInviteCode(code).

  The bytes are bearer secrets – any holder who also has a Self cap can redeem them – so chat-server treats each issued code as a distinct lineage node in its tree, not as a transparent identifier collapsed onto the issuer’s cap. When the issuer mints a code via inviteCode / contactCode, the code’s lineage entry has parent = the issuing role/Self cap and the issuer holds the matching InviteRevoker (for inviteCode) or revokes via Self.revokeContactCode(codeId) (for contactCode). When a recipient redeems, chat-server mints the derived cap with parent = the code's lineage node, NOT directly with parent = the issuer’s cap. So:
  - Revoking a single contactCode epochs only that code’s descendants; other contact caps and codes the same issuer has handed out are unaffected.
  - Revoking an InviteToken’s revoker (or its companion inviteCode) kills the redeemed Member cap and any sub-invitees that Member produced, without affecting other invites the same admin issued.
  - The issuer-held revoker / revokeContactCode is the only way to revoke that specific handoff. Bearer copies that have not yet redeemed simply fail closed once revoked.

  Bearer-secret hygiene applies: codes have lifetimes, are bound to a single issuance entry, and chat-server may rate-limit redemption attempts per code to bound brute-force guessing.
The kernel’s transfer_policy rejection of raw IPC-cap-transfer
is what closes the loophole. chat-server’s typed delivery methods
(or the byte-form code paths above) are the only ways a chat-class
cap reaches a new principal; lineage is recorded at chat-server
side in either case.
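The redemption-parent contract for byte-form codes can be checked on a toy registry. This Python sketch is an assumption-laden illustration: CodeRegistry and its method names mirror but are not the schema methods, and storing only a SHA-256 digest of each code is a hygiene choice made here, not something the proposal specifies.

```python
import hashlib
import secrets
from typing import Dict, Optional, Set, Tuple


class CodeRegistry:
    def __init__(self) -> None:
        self._code_ids: Dict[bytes, str] = {}        # sha256(code) -> code node
        self.parent: Dict[str, Optional[str]] = {}   # lineage: node -> parent
        self.revoked: Set[str] = set()
        self._next = 0

    def contact_code(self, self_cap: str) -> Tuple[bytes, str]:
        """Issue a bearer code; its lineage node is parented to the issuer's cap."""
        code = secrets.token_bytes(16)
        code_id = f"code:{self._next}"
        self._next += 1
        self._code_ids[hashlib.sha256(code).digest()] = code_id
        self.parent.setdefault(self_cap, None)
        self.parent[code_id] = self_cap
        return code, code_id

    def open_dm_from_code(self, redeemer: str, code: bytes) -> str:
        code_id = self._code_ids.get(hashlib.sha256(code).digest())
        if code_id is None or code_id in self.revoked:
            raise PermissionError("unknown or revoked code")  # fail closed
        dm = f"dm:{redeemer}:{code_id}"
        self.parent[dm] = code_id  # parent is the code node, not the issuer cap
        return dm

    def revoke_contact_code(self, code_id: str) -> None:
        self.revoked.add(code_id)

    def is_live(self, node: Optional[str]) -> bool:
        # A cap is live only if no ancestor on its chain has been revoked.
        while node is not None:
            if node in self.revoked:
                return False
            node = self.parent.get(node)
        return True


registry = CodeRegistry()
code_a, id_a = registry.contact_code("self:alice")
code_b, id_b = registry.contact_code("self:alice")
dm_bob = registry.open_dm_from_code("bob", code_a)
dm_carol = registry.open_dm_from_code("carol", code_b)
registry.revoke_contact_code(id_a)
```

After the revoke, bob's DM is dead and code_a can no longer be redeemed, while carol's DM, minted from the other code of the same issuer, is unaffected.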
Approval grants are NOT chat caps and are not re-minted through
chat lineage. approvalRef is a payload kind that lets a chat
event display an approval request, but the live ApprovalGrant
cap travels by ordinary capnp-rpc cap reference between the
approval service and its caller – the same way it would without
chat. chat-server only forwards the audit-safe descriptor for
display; if the recipient needs the actual ApprovalGrant cap,
it comes from AuthorityBroker.request / ApprovalClient, not
from a chat-server re-mint. Approvals stay separate (see the
“Approvals Stay Separate” section).
Per-principal ban list
Rotating a member’s branch (revokeBranch(memberCap)) kicks
their current chain. But if the principal still holds a
DiscoverableGroupJoin (or DiscoverableChannel*Subscribe) cap,
or has a session bundle hook that hands one out at login, they
can call .join() / .subscribe() and mint a fresh chain. For
real ban semantics, chat-server tracks a per-chat ban list:
- Group ban. GroupAdmin.banPrincipal(principalRef) adds the principal to the group’s ban list; chat-server checks it on every Group-side mint path that could attach a fresh role cap to that principal:
  - public-join redemption: DiscoverableGroupJoin.join;
  - cap-form invite redemption from outside the group: Self.acceptInvite(token);
  - cap-form invite redemption from inside an existing group context: GroupMember.acceptInvite(token) (same wire as the Self-form, but invokable when the invitee already holds a member cap in another group and chat-server forwarded the InviteToken through that group’s chat event);
  - byte-form invite redemption: Self.acceptInviteCode(code);
  - admin-mint paths on the Group role hierarchy: GroupOwner.makeAdmin, plus any other future role-promotion methods chat-server adds to GroupOwner / GroupAdmin (Channel-side methods like ChannelAdmin.makePublisher are NOT in this list – those belong to the Channel ban below);
  - every manifest-driven session bundle hook that attaches a Group role cap at login (GroupOwner / GroupAdmin / GroupMember); and
  - every transformation-driven auto-mint path (mergeIntoGroupAsTopic / moveTopicHere with memberPolicy=autoInvite, and the per-topic-member auto-migration step inside extractTopicAsGroup).

  Without the transformation check, a source-owner plus target-admin pair could graft a banned principal back into a group via merge or move; without the login-bundle check, a banned operator who has the lobby group attached by their session profile would receive a fresh GroupMember(lobby) (or GroupAdmin(lobby)) cap on their next login and bypass the ban. Banned principals caught in a transformation are dropped from the autoInvite set with a principalBanned audit event; the transformation itself still completes for non-banned members.
- Channel ban. ChannelAdmin.banPrincipal(principalRef) adds the principal to the broadcast Channel’s ban list; chat-server checks it when minting via DiscoverableChannelTextSubscribe.subscribe / Audio / Video, on ChannelAdmin.makePublisher, on ChannelOwner.makeAdmin, and on any Channel role cap (ChannelOwner / ChannelAdmin / ChannelPublisher / Channel{Text,Audio,Video}Subscriber) attached by manifest-driven session bundles at login (same reason as the Group case).
- Self-creation ban via Self.startGroup. A globally banned principal whose chat-server policy disallows new groups (e.g. manifest sets Self.startGroup per principal class) cannot bypass by including a banned ContactCap in initialInvites; chat-server validates each contact against its issuer’s bans before minting auto-invites.
Banned principals get a typed principalBanned denial.
unbanPrincipal removes the entry. Banning is independent of
revokeBranch: revoke kicks the active chain; ban prevents new
chains; an admin typically does both as a single workflow (“kick + ban”).
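A Python sketch of how one ban list could guard both the direct mint paths and the transformation auto-mint path. GroupBanList, check_mint, filter_auto_invite, and the audit tuple shape are hypothetical; only banPrincipal / unbanPrincipal, the principalBanned denial, and the drop-and-audit behaviour come from the text above.

```python
from typing import Iterable, List, Set, Tuple


class PrincipalBanned(Exception):
    """Typed denial returned to a banned principal on a mint path."""


class GroupBanList:
    def __init__(self) -> None:
        self.banned: Set[str] = set()
        self.audit: List[Tuple[str, str, str]] = []

    def ban_principal(self, principal: str) -> None:
        self.banned.add(principal)

    def unban_principal(self, principal: str) -> None:
        self.banned.discard(principal)

    def check_mint(self, principal: str) -> None:
        # Called on every path that would attach a fresh role cap.
        if principal in self.banned:
            raise PrincipalBanned(principal)

    def filter_auto_invite(self, principals: Iterable[str], path: str) -> List[str]:
        # Transformations still complete; banned principals are dropped and audited.
        allowed = []
        for principal in principals:
            if principal in self.banned:
                self.audit.append(("principalBanned", principal, path))
            else:
                allowed.append(principal)
        return allowed


bans = GroupBanList()
bans.ban_principal("mallory")
grafted = bans.filter_auto_invite(["alice", "mallory", "bob"], "mergeIntoGroupAsTopic")
try:
    bans.check_mint("mallory")  # e.g. a DiscoverableGroupJoin.join attempt
    rejoined = True
except PrincipalBanned:
    rejoined = False
```

Routing every mint path through the same check is what keeps a ban from being bypassed by whichever path was forgotten; the list in this section is effectively the set of call sites.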
Where caps come from
The chain always terminates at chat-server’s own root cap. There is no broker-side ambient minting; the broker’s role is to hand out chat-server-issued caps that chat-server’s config has already authored for sessions matching certain profiles.
| Cap | Originating issuer | How a session first holds it |
|---|---|---|
| Self | chat-server, once per session at login from the caller’s authenticated identity | parent is chat-server’s root, exactly one Self cap per (principal, session) tuple; chat-server creates it the first time the broker hands a session to chat-server. All ContactCap / contactCode / Self-driven group-creation chains terminate at this Self node, which terminates at chat-server’s root, satisfying the lineage invariant. The Self cap is never delivered cross-principal; its lifetime is the session’s lifetime. |
| GroupOwner (manifest-bundled) | chat-server, when the manifest declares the group | bundled to the configured Owner principal’s session at login; parent is chat-server’s root, the manifest entry is its own chain |
| GroupOwner (Self.startGroup) | chat-server, on Self.startGroup(config) | parent is the calling principal’s Self cap; minting is gated by chat-server’s per-principal-class group-creation quota |
| GroupOwner (extractTopicAsGroup) | chat-server, on GroupAdmin.extractTopicAsGroup(..., creator :Self) | parent is the extract-operation lineage node (OperationNodeInfo with selfCreation consent); the extract op is itself a child of the source group’s root |
| GroupAdmin (manifest-bundled) | chat-server, when the manifest bundles admin to a profile (e.g. the test fixture’s chat.groups.X.admins entry) | parent is chat-server’s root, the manifest entry is its own chain |
| GroupAdmin (Owner-minted) | chat-server, on GroupOwner.makeAdmin(memberRef); delivered to the promoted principal via Self.subscribeIncoming.groupAdminGranted | parent is the promotion issuance lineage node (IssuanceNodeInfo with kind groupAdminGrant); the issuance node parents to the calling GroupOwner cap. Revoking via the issuer-held RolePromotionRevoker epochs the issuance node and the promoted GroupAdmin under it. |
| GroupMember (manifest-bundled) | chat-server, when the manifest bundles membership to a profile | parent is chat-server’s root, the join is its own chain |
| GroupMember (public-joined) | chat-server, on DiscoverableGroupJoin.join() | parent is the joiner’s own root within the group (each public join is its own distinct chain) |
| GroupMember (invited, cap form, Self redemption) | chat-server, on Self.acceptInvite(token) | parent is the InviteToken issuance lineage node, which itself parents to the inviter’s role cap |
| GroupMember (invited, cap form, in-context redemption) | chat-server, on GroupMember.acceptInvite(token) (the in-context redemption used when the invitee already holds a GroupMember cap in another group through which the inviter forwarded the InviteToken) | same parent semantics as the Self-form: the InviteToken issuance lineage node, which parents to the inviter’s role cap |
| GroupMember (invited, code form) | chat-server, on Self.acceptInviteCode(code) | parent is the inviteCode lineage node, which itself parents to the inviter’s role cap |
GroupMember (transformation-grafted, merge/move autoInvite) | chat-server, on mergeIntoGroupAsTopic / moveTopicHere with memberPolicy=autoInvite | parent is the transformation operation node (OperationNodeInfo arm of LineageNode with partnerAdmin consent); revoking the op node epochs every grafted member |
GroupMember (transformation-grafted, extractTopicAsGroup) | chat-server, on GroupAdmin.extractTopicAsGroup(..., creator :Self) for each existing topic member auto-migrated into the new group | parent is the extract operation node (OperationNodeInfo arm of LineageNode with selfCreation consent); revoking the op node epochs every auto-migrated member of the extracted group |
ChannelOwner (manifest-bundled) | chat-server, when the manifest declares the channel | bundled to the configured Owner principal’s session at login; parent is chat-server’s root, the manifest entry is its own chain |
ChannelTextSubscriber (public) | chat-server, on DiscoverableChannelTextSubscribe.subscribe() | parent is the subscriber’s own root within the channel |
ChannelAudioSubscriber (public) | chat-server, on DiscoverableChannelAudioSubscribe.subscribe() | parent is the subscriber’s own root within the channel |
ChannelVideoSubscriber (public) | chat-server, on DiscoverableChannelVideoSubscribe.subscribe() | parent is the subscriber’s own root within the channel |
ChannelTextSubscriber / ChannelAudioSubscriber / ChannelVideoSubscriber (manifest-bundled) | chat-server, when the manifest bundles a per-kind subscriber to a profile | parent is chat-server’s root, the manifest entry is its own chain |
ChannelPublisher (Admin-minted) | chat-server, on ChannelAdmin.makePublisher(subjectRef); delivered to the promoted principal via Self.subscribeIncoming.channelPublisherGranted | parent is the promotion issuance lineage node (kind channelPublisherGrant); the issuance node parents to the calling ChannelAdmin cap. Revoking via RolePromotionRevoker epochs the issuance node and descendants. |
ChannelPublisher (manifest-bundled) | chat-server, when the manifest bundles publisher to a profile | parent is chat-server’s root, the manifest entry is its own chain |
ChannelAdmin (manifest-bundled) | chat-server, when the manifest bundles admin to a profile | parent is chat-server’s root, the manifest entry is its own chain |
ChannelAdmin (Owner-minted) | chat-server, on ChannelOwner.makeAdmin(...); delivered to the promoted principal via Self.subscribeIncoming.channelAdminGranted | parent is the promotion issuance lineage node (kind channelAdminGrant); the issuance node parents to the calling ChannelOwner cap. Revoking via RolePromotionRevoker epochs the issuance node and descendants. |
DmPeer (cap form) | chat-server, on Self.openDm(contactCap) | parent = the ContactCap lineage node |
DmPeer (code form) | chat-server, on Self.openDmFromCode(code) | parent = the contactCode lineage node |
E2EDmPeer (cap form) | chat-server, on Self.openE2EDm(contactCap) | parent = the ContactCap lineage node |
E2EDmPeer (code form) | chat-server, on Self.openE2EDmFromCode(code) | parent = the contactCode lineage node |
ChatDirectory(scope) | chat-server, configured per scope in the manifest | bundled to sessions matching the scope’s policy |
DiscoverableGroupJoin / DiscoverableChannel{Text,Audio,Video}Subscribe | chat-server, on ChatDirectory.search(query) for entries the scope policy allows | parent is the directory-scope’s policy entry |
InviteToken (cap form) | chat-server, on GroupMember.invite(...) | parent is the issuing role cap (admin or member depending on policy) |
inviteCode (code form, lineage node) | chat-server, on GroupMember.inviteCode(...) | parent is the issuing role cap |
ContactCap (cap form) | chat-server, on Self.contact(lifetime) | parent is the issuing principal’s Self cap |
contactCode (code form, lineage node) | chat-server, on Self.contactCode(lifetime) | parent is the issuing principal’s Self cap |
InviteRevoker / SpeakerRevoker | chat-server, returned alongside the matching token / promotion | parent is the issuing role cap |
SpeakerToken | chat-server, on StageRoomAdmin.promoteToSpeaker(listenerRef) | delivered to the bound listener via stage roster events; parent is the admin cap |
listener caps (TextListener, AudioSink, VideoSink) | minted locally by the receiver | not in any lineage chain; revocation is local drop |
The manifest is Chat service configuration, not kernel or broker configuration. It declares the initial groups/channels, who owns them, who appears in which discovery scope, and which sessions are auto-bundled with which caps. chat-server reads it at boot and acts on its own root cap. The kernel only manages cap epochs and dispatch.
The broker’s role is to bundle initial caps a session needs to
use what it already has – e.g. a manifest can configure that
“chat-server starts with operator-lobby already created and
GroupMember(operator-lobby) bundled to operator-class sessions”. The
broker hands those session bundles out at login; chat-server is the
issuer.
Granting flows
Operator joins the operator-lobby at boot (manifest bundle). The
manifest declares chat-server’s startup config: create
operator-lobby with chat-server’s own service principal as Owner;
bundle GroupMember(operator-lobby) to every session whose profile
is operator. At login, the broker hands the operator session a
chat-server-issued GroupMember(operator-lobby) cap. The cap’s
parent in chat-server’s lineage tree is “this session’s join entry”
– a fresh chain root specific to this session, not shared with
other operators. No approval step.
Operator joins a discoverable chat at runtime. Sessions hold a
ChatDirectory(operator-scope) cap. Operator calls
ChatDirectory.search(query) -> ChatDirectoryPage; chat-server
returns entries matching the scope’s policy. Each entry carries
a kind-specific discoverable cap depending on the chat’s kind:
DiscoverableGroupJoin for a Group, or one of
DiscoverableChannelTextSubscribe /
DiscoverableChannelAudioSubscribe /
DiscoverableChannelVideoSubscribe for a broadcast Channel.
Operator picks one and calls the matching method:
- DiscoverableGroupJoin.join() -> GroupMember(group) for a Group entry.
- DiscoverableChannelTextSubscribe.subscribe() -> ChannelTextSubscriber(channel) (or the matching audio/video variant) for a broadcast Channel entry.
The new role cap’s parent in chat-server’s lineage is “this
session’s join event” – a fresh chain root for this join, not
shared with other joiners. Possession of the discoverable cap
is the policy gate; calling .join() / .subscribe() mints
the role cap. There is no separate “redeem” step.
An admin invites a specific person to a group. Admin holds
GroupAdmin(group) (which extends GroupMember). They call
GroupMember.invite(forSubject=PrincipalRef, lifetime=...) -> (token, revoker, inviteRef) (cap-form, used when the invitee
can receive a chat-server-mediated cap delivery – e.g. via an
existing DM) or GroupMember.inviteCode(lifetime=...) -> (code :Data, revoker, inviteRef) (byte-form, used when the
invitee can only receive bearer-secret bytes through paper
handoff, QR code, or non-chat channels). Both calls now also
return the issuance lineage node’s inviteRef :GroupCapRef,
which the issuer keeps alongside revoker for cap-clean
per-branch revocation later via GroupAdmin.describeBranch /
revokeBranch. The byte-form is the issuance entry described
under How bearer caps cross principal boundaries: a distinct
lineage node, not a transparent identifier collapsed onto the
inviter. chat-server records InviteToken.parent = the calling admin role cap (cap form), or inviteCode.parent = the calling admin role cap (byte form, naming the issuance
entry). The invitee calls Self.acceptInvite(token) -> GroupMember for the cap-form, or Self.acceptInviteCode(code) -> GroupMember for the byte-form; chat-server mints the
member cap with parent = the InviteToken/inviteCode lineage node. Lineage is Member -> InviteToken/inviteCode -> Admin -> ... -> chat-server root. The admin’s InviteRevoker
revokes that specific handoff (invalidates pre-redemption
bearer copies, epochs the redeemed member’s branch).
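The two effects of the issuer-held revoker (invalidating unredeemed bearer copies and cutting off already-redeemed members) can be sketched as follows. This is an illustrative model under stated assumptions: the `Invite` and `Member` classes are hypothetical stand-ins for the issuance lineage node and the redeemed GroupMember cap, and a boolean flag stands in for the kernel epoch rotation.

```python
class StaleCap(Exception):
    """Typed denial for dispatch through a revoked lineage node."""


class Invite:
    """Issuance lineage node sitting between the inviter and redeemed members."""

    def __init__(self, inviter):
        self.inviter = inviter
        self.revoked = False
        self.members = []

    def accept(self, invitee):
        # Redemption fails closed once the issuer has revoked the handoff.
        if self.revoked:
            raise StaleCap("invite revoked before redemption")
        member = Member(invitee, self)
        self.members.append(member)
        return member

    def revoke(self):
        # Stands in for InviteRevoker.revoke(): covers unredeemed bearer
        # copies and every member already minted under this node.
        self.revoked = True


class Member:
    def __init__(self, principal, invite):
        self.principal = principal
        self.invite = invite

    def send(self, text):
        if self.invite.revoked:
            raise StaleCap(self.principal)
        return f"{self.principal}: {text}"


first = Invite("admin")
second = Invite("admin")
bob = first.accept("bob")
carol = second.accept("carol")
first.revoke()   # only the first handoff is affected
```

After `first.revoke()`, `bob.send(...)` and any later `first.accept(...)` raise `StaleCap`, while `carol`, admitted through a different issuance node, keeps working.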
A member invites someone (if group policy allows). Same shape as
admin-invite, but the invite policy may restrict member-issued
invites (single-use, n-shot, or disabled). The invitee’s resulting
GroupMember cap is parented to the inviting member’s role cap,
not to the admin’s; this is the per-member chain that makes
spam-bot recovery work.
Spam-bot recovery (per-branch revoke). A member M used their
member cap’s invite authority to admit five spam bots. Owner or
admin obtains a GroupCapRef for M’s branch – without holding
M’s bearer cap, since transfer_policy forbids raw bearer
transfer. Two cap-clean obtain paths:
- GroupAdmin.lookupByPrincipal(M.principal) -> List(GroupCapRef) if the admin is starting from M’s PrincipalRef. The ChatInboundEvent.sender field is a disclosure-redacted display name (text), not a PrincipalRef, so the admin gets M.principal from one of the typed surfaces that actually carry a PrincipalRef: an audit-log entry, a user-search / identity-broker UI, or by inspecting a known lineage node via describeBranch – the returned BranchInfo.root is a LineageNode, and when its union arm is capNode, the capNode.principal :PrincipalRef is the unredacted owner (issuance / operation arms have no principal of their own; walk to a capNode descendant). The redacted sender field is for display only.
- GroupAdmin.describeRoot() -> BranchInfo(GroupCapRef) for a full top-down walk when starting from “show me the whole group’s lineage tree” (recurse into LineageNode.children; each capNode arm carries the unredacted principal :PrincipalRef for admins).
Then optionally GroupAdmin.describeBranch(node) -> BranchInfo(GroupCapRef) to render “this is who would be
revoked” UI before pulling the trigger, and
GroupAdmin.revokeBranch(node) to commit. chat-server rotates
the kernel-level cap epoch on M’s role cap and every descendant
of it – the five bots’ caps and any further sub-invitees.
Subsequent dispatch through any of them fails closed. Other
members of the same group, including operators who joined via
the same public DiscoverableGroupJoin(group) route, are
untouched because each public join produced its own distinct
chain rooted at that joiner’s join event.
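The lookup, preview, and commit steps of this recovery can be sketched over a toy lineage forest. The names below (`CapNode`, `walk`, and the snake_case functions) are hypothetical and only loosely mirror lookupByPrincipal, describeBranch, and revokeBranch; a `live` flag stands in for the kernel epoch, and issuance nodes are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class CapNode:
    principal: str
    children: list = field(default_factory=list)
    live: bool = True

    def invite(self, principal):
        # Invitees are parented to the inviting member's node.
        child = CapNode(principal)
        self.children.append(child)
        return child


def walk(node):
    """Yield the node and every descendant, depth-first."""
    yield node
    for child in node.children:
        yield from walk(child)


def lookup_by_principal(roots, principal):
    return [n for root in roots for n in walk(root) if n.principal == principal]


def describe_branch(node):
    # Read-only preview of who would be revoked.
    return [n.principal for n in walk(node)]


def revoke_branch(node):
    for n in walk(node):
        n.live = False


m = CapNode("M")
bots = [m.invite(f"bot-{i}") for i in range(5)]
operator = CapNode("operator")          # separate public join, separate chain
roots = [m, operator]

[ref] = lookup_by_principal(roots, "M")
preview = describe_branch(ref)
revoke_branch(ref)
```

The preview lists M and the five bots, and after the commit the `operator` node is still live because it was never a descendant of M.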
Closing a public-join route without kicking existing members. Two parallel cases by chat kind:
- Group. Owner calls GroupOwner.closePublicJoin(entry) with the entry handle minted by publishDiscoverable. chat-server marks the public-join entry inactive and rotates the epoch on the shared DiscoverableGroupJoin cap class that every directory result handed out. Subsequent DiscoverableGroupJoin.join() calls fail closed; existing GroupMember(group) caps are unaffected because the discoverable cap is not in their lineage (the route is the policy that minted them, not their parent). To later re-open, owner publishes a fresh DiscoverableGroupJoin – a new cap with a fresh epoch.
- Channel (broadcast). Owner calls ChannelOwner.closePublicJoin(entry). chat-server rotates the epoch on whichever DiscoverableChannel{Text,Audio,Video}Subscribe cap class was associated with the entry (one epoch rotation can cover all three kinds for a single Channel route or carve them separately; chat-server config decides). Existing Channel{Text,Audio,Video}Subscriber caps are unaffected – the discoverable cap is again not in their lineage.
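A small sketch of why closing the route leaves members alone: the discoverable cap is modeled as a snapshot of the route's epoch, and members are recorded independently of it. `PublicRoute`, `publish`, and `JoinClosed` are hypothetical names for illustration only.

```python
class JoinClosed(Exception):
    """Raised when a join is attempted through a closed (stale) route."""


class PublicRoute:
    def __init__(self):
        self.epoch = 0
        self.members = []   # member chains are not children of the route

    def publish(self):
        # A discoverable cap is modeled as the epoch it was published under.
        return self.epoch

    def join(self, discoverable_epoch, principal):
        if discoverable_epoch != self.epoch:
            raise JoinClosed(principal)
        self.members.append(principal)

    def close_public_join(self):
        # Rotating the route epoch invalidates every handed-out discoverable
        # cap; the member list is untouched.
        self.epoch += 1


route = PublicRoute()
old_cap = route.publish()
route.join(old_cap, "op-1")
route.close_public_join()
try:
    route.join(old_cap, "op-2")
    late_join_denied = False
except JoinClosed:
    late_join_denied = True
fresh_cap = route.publish()     # re-opening means publishing a new cap
route.join(fresh_cap, "op-3")
```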
Two principals open a DM (contact-cap path). Alice wants to be reachable. She has two options depending on how the recipient will receive the contact:
- Cap form – Self.contact(lifetime=...) -> (contact :ContactCap, ref :ContactCapRef). The bearer ContactCap is shared via chat-server-mediated cap delivery (e.g. attached to a chat event in an existing group Alice is in, where chat-server re-mints it for each recipient). The ContactCapRef is Alice’s non-secret revocation handle; she keeps it locally (alongside whatever metadata her UI shows in a “contacts I’ve issued” list) and later calls Self.revokeContact(ref) if she wants to retract this contact. Use this form when the recipient already has a cap-bearing channel to Alice.
- Code form – Self.contactCode(lifetime=...) -> (code :Data, codeId :Data). The bearer-secret code bytes are shared out-of-band (pinned in Alice’s public-profile post, printed on a business card, encoded as a QR, sent over an unrelated channel); the codeId is the non-secret revocation handle Alice keeps and later passes to Self.revokeContactCode(codeId). Use this form when the recipient cannot receive a cap (no shared chat yet, or out-of-band handoff).
Bob, holding Self for his own session, calls one of the
recipient methods: Self.openDm(contactCap) -> DmPeer for cap
form, or Self.openDmFromCode(code) -> DmPeer for code form.
chat-server mints Bob’s DmPeer(B->A) with parent = the ContactCap or contactCode lineage node, and delivers Alice’s
side DmPeer(A->B) to Alice via Self.subscribeIncoming –
specifically, the dmOpened :DmPeer arm of the
SelfIncomingEvent union, with source :IssuanceSource
carrying the contactRef :ContactCapRef Alice retained from
her earlier Self.contact(...) issuance (cap-form path) or
the codeId :Data from Self.contactCode(...) (code-form
path). Alice’s UI matches the event to its issuance entry
through that ref.
Either party drops their listener subscription to stop receiving
(instant); Alice may call Self.revokeContact(ref) (cap form)
or Self.revokeContactCode(codeId) (code form), passing the
issuer-side handle she retained from the earlier issuance call,
to revoke just that contact’s branch and any DM chains derived
from it, without affecting DMs Alice established via different
contact caps.
Sending to an agent the operator owns. Manifest configures: when
operator session starts an agent, chat-server creates a fresh
agent-prompt group with operator as Owner and the agent runner’s
session as a Member. Operator already holds GroupOwner(agent-prompt)
because chat-server made them Owner at group creation time. No
approval step. Tool consent inside the agent runner remains a
separate concern handled by ApprovalClient.
Sending to an agent the operator does not own. The agent’s
owner controls reachability. They publish a DiscoverableGroupJoin (or per-kind channel-subscribe)
in their scope’s directory, or hand out a contact cap to a specific
operator, or invite to a specific group. There is no protocol-level
way to write the agent without already holding such a cap.
Listener-side filter (soft mute). Subscribers may pass options
on subscribeText/Audio/Video that filter inbound events by
sender lineage, e.g. muteSenderBranch(parentCapId). Sender caps
may have been validly minted; filter is a soft mute, not a
revocation. For hard revocation, the owner must call
revokeBranch.
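As a rough sketch of the soft mute, the filter can be modeled as a wrapper around the listener that drops events whose sender lineage passes through a muted cap. The `sender_lineage` field and `make_filtered_listener` helper are assumptions made for this example; the proposal only names the muteSenderBranch(parentCapId) option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundEvent:
    text: str
    # Cap ids from the sender's cap up to its chain root (assumed field).
    sender_lineage: tuple


def make_filtered_listener(deliver, muted_branches):
    """Wrap `deliver` so events from muted branches are silently dropped."""
    muted = frozenset(muted_branches)

    def on_event(event):
        # Soft mute: the sender's cap stays valid; the event is just not delivered.
        if muted.isdisjoint(event.sender_lineage):
            deliver(event)

    return on_event


received = []
listener = make_filtered_listener(received.append, {"cap-M"})
listener(InboundEvent("hello", ("cap-alice",)))
listener(InboundEvent("spam", ("cap-bot-1", "cap-M")))   # invited under M
```

Only the first event reaches `received`. Nothing here touches the sender's cap, which is the difference from revokeBranch.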
Worked examples
These ground the abstract granting flows in concrete scenarios that will appear in implementation iterations.
Public/system channel: making lobby reachable to all operators.
Two valid paths, both expressed as Chat-service configuration:
- Manifest-bundled membership. The chat-server manifest declares the group and the auto-bundle policy:

      chat:
        groups:
          lobby:
            owner: principal:chat-server   # the service runs as Owner
        bundles:
          - profile: operator
            attach: GroupMember(lobby)

  At startup, chat-server creates lobby and prepares the attach-on-login behavior. When an operator session logs in, the broker invokes chat-server’s per-session bundle hook; chat-server mints a fresh GroupMember(lobby) cap for that session with parent = chat-server root (specifically: a per-session chain root). No two operators share the same chain. To remove one operator, the admin runs the kick plus the deny-list-only ban as a pair of calls: GroupAdmin.revokeBranch(theirMemberRef) to epoch the current chain (the operator’s active session fails closed on the next dispatch) AND GroupAdmin.banPrincipal(theirPrincipal) to add them to the deny-list so the bundle hook does NOT mint a fresh GroupMember(lobby) on their next login. Either step alone is meaningful but incomplete: revokeBranch alone leaves the bundle hook open, and banPrincipal alone leaves the current session running. Other operators’ chains are unaffected by either step.
- Discoverable join via ChatDirectory. chat-server’s manifest declares the lobby visible in the operator scope:

      chat:
        groups:
          lobby:
            owner: principal:chat-server
        directories:
          operator-scope:
            bundle-to: { profile: operator }
            entries:
              - group: lobby            # the entry references the
                                        # Group above; the manifest
                                        # key uses `group:` rather
                                        # than the reserved
                                        # `channel:` since lobby is
                                        # a Group not a broadcast
                                        # Channel.
                join-policy: any-holder # anyone holding the
                                        # DiscoverableGroupJoin(lobby) entry
                                        # may call .join()

  Operator sessions get ChatDirectory(operator-scope) bundled at login. The operator calls ChatDirectory.search(query), sees the lobby entry with a DiscoverableGroupJoin(lobby) cap, and calls DiscoverableGroupJoin(lobby).join() -> GroupMember(lobby). Each public join is its own distinct chain: the new member’s parent is the per-session join event, not the shared per-kind discoverable cap. Kicking member M with GroupAdmin.revokeBranch(M) epochs M’s chain (and anyone M invited to the group) but leaves all other public-joined members intact. Because the public-join route is still open, M could re-join through it and mint a fresh chain unless the admin also calls GroupAdmin.banPrincipal(M.principal) – the deny-list-only ban primitive that blocks future mints for that principal. The full “kick + ban M” workflow is therefore the pair revokeBranch(M) + banPrincipal(M.principal); either step alone is meaningful (kick without banning lets a contrite member re-join; banning a not-currently-active principal blocks future mints without epoching anything). To stop accepting new joins from anyone, the owner calls GroupOwner.closePublicJoin(entry) with the ChatDirectoryEntryHandle returned by the matching publishDiscoverable call; chat-server epochs the DiscoverableGroupJoin(lobby) cap class. Existing members are unaffected.
The first path is right for “every operator should be in the lobby the moment they log in”; the second is right for “operators choose whether to join, and we want a single knob to stop accepting new joins without kicking existing members”. Both are configurations of the same Chat service, both produce per-member distinct chains, and neither requires a registry service outside Chat.
Cross-session messaging test (group case).
Iteration 4’s primary cross-session test exercises the default case: two sessions message each other through a shared group, which is how humans actually message each other in a Telegram-shaped system. The DM path is exercised separately because its cap-derivation chain is different.
Test fixture, in pseudo-CUE chat-server config:
chat:
  groups:
    test-lobby:
      owner: principal:chat-server
      # The DM negative-test case (case C below) needs an admin
      # cap to call GroupAdmin.revokeBranch on a misbehaving
      # invitee's chain. Manifest grants the console-tester
      # profile a GroupAdmin cap on test-lobby so that test
      # is implementable without changing the substrate; the
      # default group-test path uses only the GroupMember
      # subset of methods.
      admins: [ principal:console-tester ]
  bundles:
    - profile: console-tester
      attach: GroupAdmin(test-lobby)   # extends GroupMember
    - profile: ui-tester
      attach: GroupMember(test-lobby)
sessions:
  console:
    profile: console-tester
  ui:
    profile: ui-tester
Test flow:
- chat-server creates test-lobby at boot and registers the per-session bundle behavior. At login, the broker invokes chat-server’s bundle hook for each session; chat-server mints a fresh GroupAdmin(test-lobby) cap for the console session (which inherits all GroupMember methods so the group-test path below works unchanged) and a fresh GroupMember(test-lobby) cap for the UI session, each its own chain root in chat-server’s lineage tree. The admin cap is what enables Negative case C in the DM flow.
- Console session opens its bundled member cap, mints a TextListener, calls groupMemberCap.subscribeText(listener).
- UI session does the same through the trusted Rust backend.
- Console session calls groupMemberCap.send(event{kind=text, text="hi from console"}).
- UI session’s listener receives the inbound event; UI backend surfaces it as a view-model row in the browser’s chat panel.
- UI session sends a reply; console session’s listener receives it.
- Test asserts both directions of the round-trip and asserts that the redacted transcript contains kind=text events from both senders without leaking session-id hex or raw cap handles.
This proves: default capset distribution works; subscribe/send round-trip works; cross-session listener delivery works.
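The shape of the assertions in this test can be sketched against a toy in-memory group. This is a minimal model, not the real fixture: `Group`, the session-id strings, and the dict-shaped events are made up for the example, and the model simplifies by requiring a session to subscribe before it sends.

```python
class Group:
    """Toy in-memory group: fan-out to every subscribed listener except the sender."""

    def __init__(self):
        self._subs = {}        # session id -> (display name, listener callback)
        self.transcript = []   # redacted rows: display names only

    def subscribe_text(self, session_id, display_name, listener):
        self._subs[session_id] = (display_name, listener)

    def send(self, session_id, text):
        sender, _ = self._subs[session_id]
        event = {"kind": "text", "sender": sender, "text": text}
        self.transcript.append(event)
        for other_id, (_, listener) in self._subs.items():
            if other_id != session_id:
                listener(event)


CONSOLE_SID, UI_SID = "a3f9c2e1", "7b44d0aa"   # made-up session-id hex
group = Group()
console_inbox, ui_inbox = [], []
group.subscribe_text(CONSOLE_SID, "console", console_inbox.append)
group.subscribe_text(UI_SID, "ui", ui_inbox.append)
group.send(CONSOLE_SID, "hi from console")
group.send(UI_SID, "hi from ui")

# Round-trip in both directions, and no session-id hex in the transcript.
leaked = [row for row in group.transcript
          if CONSOLE_SID in repr(row) or UI_SID in repr(row)]
```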
Cross-session messaging test (DM case).
Same fixture extended with a Self cap on each session (the cap
that lets a principal produce a contact cap and accept incoming
DMs). Both sessions are also members of test-lobby from the group
test, which is the substrate “out-of-band” channel through which
the contact cap travels.
- Console session calls console.contact() and binds the tuple result (contactCap, ref). The bearer contactCap is a chat-server-issued cap that says “any holder may open a DM to console”; the ref :ContactCapRef is the issuer-side revocation handle Console retains (Negative case B uses it). Console session sends ONLY the bearer contactCap to the UI session through the existing test-lobby group chat (the group’s send() accepts cap references in events for exactly this purpose); it does NOT send the ref. The contact cap’s parent in chat-server’s lineage is “console session’s contact-issuance event” – a fresh chain root.
- UI session receives the chat event carrying the contact cap, extracts it, and calls ui.openDm(contactCap) -> DmPeer(UI->Console). chat-server mints both directions: UI’s DmPeer(UI->Console) with parent = contactCap, and Console’s own DmPeer(Console->UI) delivered via Console’s Self notification surface, with the same parent.
- Both sides subscribeText, exchange messages, assert round-trip.
- Negative case A: a third session that did not receive the contact cap cannot construct one (it has no Self.contact() path bound to the console principal). The test does not even need a denial assertion – the third session has no cap to call.
- Negative case B: console calls Self.revokeContact(ref), passing the ContactCapRef it retained from the earlier Self.contact(...) call. chat-server epochs the contact cap and the DmPeer chains derived from it. UI’s subsequent DmPeer.send fails closed with staleCap. The test asserts the typed denial.
- Negative case C: console invites a hostile third party to test-lobby via GroupMember.invite(forSubject=hostilePrincipal, lifetime=...), binding the result tuple (token :InviteToken, revoker :InviteRevoker, inviteRef :GroupCapRef). console keeps both revoker (issuer-side revocation handle, parented to console’s admin role cap) AND inviteRef (the issuance lineage node, an IssuanceNodeInfo with kind inviteToken); both are non-secret and stored in the fixture’s “outstanding invitations” record. console delivers only token to the hostile party through chat-server’s normal cap-delivery path. The hostile party redeems with Self.acceptInvite(token) -> GroupMember(test-lobby) and uses the cap badly. console (holding GroupAdmin(test-lobby) per the fixture) has two cap-clean revocation paths:
  - revoker.revoke() – the simplest path: console already holds the issuer-side handle and does not need any new ref. chat-server epochs the InviteToken’s lineage node and any descendants (the hostile member’s GroupMember cap and any sub-invitees they admitted).
  - General per-branch path: console obtains a GroupCapRef for the hostile branch through one of the typed sources declared on GroupAdmin:
    - the inviteRef returned by the original GroupMember.invite(...) tuple (if console issued the invite itself);
    - GroupAdmin.lookupByPrincipal(hostilePrincipal) if console did NOT issue the invite (e.g. when revoking somebody else’s invitee or a public-join chain);
    - or GroupAdmin.describeRoot() for a full top-down walk.

    Then GroupAdmin.describeBranch(node) to inspect the subtree before pulling the trigger, and GroupAdmin.revokeBranch(node) to epoch it. Raw transfer of the hostile party’s bearer GroupMember cap is NOT how console gets the ref; transfer_policy forbids that, and chat-server’s lineage queries are the cap-clean substitute.

  The test asserts the third party is gone (staleCap on the next dispatch through the revoked branch) and that UI’s DM with console is not affected (different lineage chain).
This proves: contact-cap-driven DM works; DM peer caps are direction-bound (asymmetric); revoking a contact cap propagates to derived DMs without touching unrelated caps; per-branch revocation isolates spam without cascading to siblings; no cold-call path exists.
Cap lineage and transitive revocation
Each chat host maintains an internal cap-derivation tree:
- Every cap minted by a derive method has a recorded parent.
- A cap’s active descendants are reachable by tree walk.
- revokeBranch(cap) rotates the kernel cap-epoch for the cap and all its active descendants. Subsequent dispatch through any of those caps fails closed.
- The kernel does not need to know about lineage; it only sees per-cap epochs (already an existing mechanism). Lineage tracking is the chat host’s job. The kernel enforces the cap’s transfer_policy, which forbids raw bearer transfer for chat caps – so the only way for a cap to reach a new principal is through a derive method, which records lineage.
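The division of labor in the list above can be sketched with two toy classes: a kernel that knows only per-cap epochs, and a host that owns the derivation tree and drives epoch rotation. `Kernel`, `ChatHost`, and the `(cap_id, epoch)` handle shape are assumptions made for illustration; the real kernel mechanism is not specified here.

```python
class StaleCap(Exception):
    """Dispatch through a handle whose epoch no longer matches."""


class Kernel:
    """Knows only per-cap epochs; has no notion of lineage."""

    def __init__(self):
        self._epochs = {}

    def register(self, cap_id):
        self._epochs[cap_id] = 0
        return (cap_id, 0)          # handle = (cap id, epoch at mint time)

    def rotate(self, cap_id):
        self._epochs[cap_id] += 1

    def dispatch(self, handle):
        cap_id, epoch = handle
        if self._epochs[cap_id] != epoch:
            raise StaleCap(cap_id)
        return "ok"


class ChatHost:
    """Owns the derivation tree and drives the kernel's epoch mechanism."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.children = {}          # cap id -> list of child cap ids

    def derive(self, cap_id, parent=None):
        self.children[cap_id] = []
        if parent is not None:
            self.children[parent].append(cap_id)
        return self.kernel.register(cap_id)

    def revoke_branch(self, cap_id):
        # Iterative tree walk: rotate the cap and every descendant.
        stack = [cap_id]
        while stack:
            current = stack.pop()
            self.kernel.rotate(current)
            stack.extend(self.children[current])


kernel = Kernel()
host = ChatHost(kernel)
admin = host.derive("admin")
member = host.derive("member", parent="admin")
invitee = host.derive("invitee", parent="member")
sibling = host.derive("sibling", parent="admin")
host.revoke_branch("member")
```

After the revoke, dispatch through `member` and `invitee` raises `StaleCap`, while `admin` and `sibling` still dispatch; the kernel never saw the tree.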
Why service-side bookkeeping rather than kernel-tracked lineage.
capOS’s stated principle (docs/capability-model.md,
CLAUDE.md) is to “prefer userspace capability wrappers over
kernel-side policy checks.” Lineage has a domain-specific shape per
service (a chat group vs a file share vs a credential vault all want
different revocation semantics), and putting it in the kernel forces
every cap to carry lineage overhead even when its service does not
need it. The service-side approach lets each host implement the
semantics it actually needs, while leaning on existing kernel
mechanisms (cap epoch, transfer policy) for enforcement.
Revocation primitives
Three independent revocation paths, all observable as typed denials:
- Listener-side instant drop. Receiver cancel()s the Subscription cap or drops the listener. No further pushes from anyone reach that listener. This is the receiver’s primary tool for “leave me alone right now”.
- Branch revocation by lineage. Admin calls GroupAdmin.revokeBranch(node :GroupCapRef) / ChannelAdmin.revokeBranch(node :ChannelCapRef), passing a typed lineage-node ref obtained from describeRoot / lookupByPrincipal / the inviteRef returned by an earlier invite / the various *Ref fields on SelfIncomingEvent – never a raw bearer cap (transfer_policy forbids cross-principal cap transfer; chat-server’s lineage queries are the cap-clean substitute). Issuer-held revoker caps cover the analogous bearer flows: Self.revokeContact(ref) / Self.revokeContactCode(codeId) for contact-driven DMs; InviteRevoker.revoke() for an outstanding invite; SpeakerRevoker.revoke() for stage-room speak grants; RolePromotionRevoker.revoke() for role promotions. In every case chat-server rotates the kernel epoch on the named branch. Used for “remove a misbehaving admin and everything they admitted”, “kill a contact cap that fell into spammer hands”, “shut down a topic and everyone who joined via it”. A separate operation, GroupOwner.closePublicJoin(entry) / ChannelOwner.closePublicJoin(entry), stops new joins through a DiscoverableGroupJoin / DiscoverableChannel*Subscribe route without kicking existing members (the route is the policy that minted them, not their parent in the lineage tree).
- Chat-wide invalidation. A GroupOwner.disband / ChannelAdmin.closeChannel call invalidates the whole chat (or the room is closed, or the agent shut down). Subsequent calls return staleChannel.
Revocation is not silent. All three paths surface as typed
staleCap / staleChannel denials at the next call site, with the
remote CapSet UI reflecting them as kind=presence chat events
(“you were removed from this group”, “this channel has closed”) or
on the next operator action.
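One way a client might turn a typed denial into the presence event described above is sketched here. The `Denial` enum and the pairing of each denial with one of the quoted messages are assumptions for illustration; the proposal names the denial kinds and the example messages but not this mapping.

```python
from enum import Enum


class Denial(Enum):
    STALE_CAP = "staleCap"
    STALE_CHANNEL = "staleChannel"


# Assumed pairing of denial kind to the user-facing text quoted in the prose.
PRESENCE_TEXT = {
    Denial.STALE_CAP: "you were removed from this group",
    Denial.STALE_CHANNEL: "this channel has closed",
}


def to_presence_event(denial):
    """Render a typed denial as a kind=presence chat event for the UI."""
    return {"kind": "presence", "text": PRESENCE_TEXT[denial]}


event = to_presence_event(Denial("staleChannel"))
```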
Audit
Every derive and every revocation is auditable. The host’s lineage
tree is itself the audit substrate: for any cap, “who derived this,
when, from which parent, with what method” is a tree query. The
audit log records the caller’s session-scoped reference per
session-bound-invocation-context-proposal.md. Listener
subscribe/unsubscribe is auditable from the receiver’s session.
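The “who derived this, when, from which parent, with what method” query can be sketched as a walk over derive records. `DeriveRecord`, its field names, and the sample data are hypothetical; the only thing taken from the text is that each derive records a parent, a method, a time, and the caller’s session-scoped reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeriveRecord:
    cap: str
    parent: Optional[str]
    method: str
    caller_session_ref: str
    at: int   # logical timestamp


def provenance(records, cap):
    """Return the derive records from `cap` up toward the root, nearest first."""
    by_cap = {r.cap: r for r in records}
    trail = []
    current = cap
    # Stop at the first ancestor with no record (here, chat-server's root).
    while current in by_cap:
        record = by_cap[current]
        trail.append(record)
        current = record.parent
    return trail


records = [
    DeriveRecord("admin", "root", "manifest bundle", "session-ref-1", 1),
    DeriveRecord("invite", "admin", "GroupMember.invite", "session-ref-1", 2),
    DeriveRecord("member", "invite", "Self.acceptInvite", "session-ref-2", 3),
]
trail = provenance(records, "member")
```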
What this proposal does NOT decide
- The exact role-permission DSL for GroupAdmin (Telegram allows per-admin granular permissions: can-pin, can-invite, can-edit; capOS’s first slice can ship a single Admin role and refine later). Schema must leave room.
- Per-topic permission overrides within a group. First slice is group-wide policy; topics are sub-channels under the same membership.
- Group DMs (multi-recipient DMs). Likely modeled as a Group with Owner=initiator, Members=invited principals; no fan-out DmPeer. Details in a follow-up.
- The kernel feature for per-cap transfer_policy to forbid raw bearer transfer specifically for chat-cap-classes. capOS’s CapInfo.transfer_policy already exists as a string field; the exact policy values live in a kernel/auth follow-up. Until then, channel-host lineage tracking can still work but with a soft invariant: derive methods are the intended path; raw bearer transfer is not blocked at kernel level. The implementation iteration must close this gap before the substrate is treated as hardened.
- The exact ActionPlan and CapRequest schemas referenced from ApprovalClient. They are an approvals-side gap, not a chat-side one.
End-To-End Encrypted DMs
End-to-end-encrypted DMs are a distinct cap layer sitting on top of
the regular DM substrate, not a flag on DmPeer. Reasons to keep them
separate:
- The chat host carries ciphertext only and never sees plaintext. That is a strong invariant; making it a flag risks a code path where plaintext leaks under “encryption disabled” conditions.
- Key exchange, authenticated encryption (AEAD), forward-secrecy ratchets (e.g. Signal-style double ratchet), and out-of-band fingerprint verification are concerns the unencrypted DM does not have. They need their own cap surface so the policy can be reasoned about per-DM.
- Auditing differs: an unencrypted DM’s host can audit message contents per disclosure policy; an encrypted DM’s host audits metadata only (sender, recipient, timestamp, ciphertext size).
Cap shape
The E2E peer cap is routing-only. It carries opaque ciphertext
between two endpoints; it never has access to plaintext or to the
AEAD ratchet keys. The KeyContext lives strictly in the principal’s
own process (held client-side via
cryptography-and-key-management-proposal.md primitives), is never
serialized into a chat-server-minted cap, and never crosses to
chat-server in any method argument or return.
# E2E DM peer cap. Minted by chat-server, but holds NO key state.
# It is a pure routing endpoint: it accepts opaque ciphertext for
# delivery, and routes opaque ciphertext to a listener.
interface E2EDmPeer extends(ChatEndpoint) {
send @0 (envelope :CipherEnvelope) -> ();
subscribeCipher @1 (listener :CipherListener,
options :SubscribeOptions) -> (sub :Subscription);
# Outgoing media: still flow-controlled, but the bytes have
# already been encrypted client-side by the holder. The peer cap
# does not see the plaintext frame, nor does it accept a key
# context as an argument.
openCipherOut @2 (format :CipherStreamFormat) -> (track :CipherOut);
remoteFingerprint @3 () -> (info :PeerFingerprint);
callSurface @4 () -> (calls :E2ECallSurface);
closeDm @5 () -> ();
}
# Listener and outgoing-media caps for E2E. Both carry opaque
# bytes; decrypt/encrypt happens in the holder's own process.
interface CipherListener {
cipher @0 (envelope :CipherEnvelope) -> ();
}
interface CipherOut {
writeCipherFrame @0 (envelope :CipherEnvelope) -> stream;
close @1 ();
}
struct CipherEnvelope {
ciphertext @0 :Data; # AEAD output; opaque to chat-server
associatedData @1 :Data; # AEAD AAD (e.g. sequence number,
# ratchet header) -- routing
# metadata only, no plaintext
receivedAtMs @2 :UInt64;
}
# E2E call surface. Narrower than CallSurface: NO setRoutingMode,
# because chat-server cannot mix or transcode (it doesn't have the
# keys), so SFU-forward is the only viable mode. The constraint is
# enforced at the type level -- the method simply doesn't exist.
interface E2ECallSurface {
current @0 () -> (info :ActiveCallInfo);
subscribeState @1 (listener :CallStateListener,
options :SubscribeOptions) -> (sub :Subscription);
startCall @2 (config :E2ECallStartConfig) -> (host :E2ECallHost);
joinCall @3 () -> (participant :E2ECallParticipant);
# Roster delivery for E2E (DM) calls. Required for
# `e2eHostGranted :E2ECallHost` delivery on
# E2ECallHost.promoteHost.
subscribeRoster @4 (listener :CallRosterListener,
options :RosterSubscribeOptions)
-> (sub :Subscription);
}
# E2ECallParticipant mirrors CallParticipant but accepts only
# already-encrypted CipherOut tracks; the participant cap does
# not handle key state. Receive is via subscribeCipher: the
# listener gets one fan-out stream of CipherEnvelope frames
# covering all participants' audio and video tracks; the
# receiver's process discriminates kind/track via the envelope's
# associatedData / sequence-id metadata and decrypts locally.
# There is no plaintext-receive method on this cap.
interface E2ECallParticipant extends(ChatEndpoint) {
publishCipherAudio @0 (format :CipherStreamFormat) -> (track :CipherOut);
publishCipherVideo @1 (format :CipherStreamFormat,
purpose :VideoPurpose) -> (track :CipherOut);
unpublishAudio @2 () -> ();
unpublishVideo @3 (purpose :VideoPurpose) -> ();
raiseHand @4 (raised :Bool) -> ();
setMyMuteState @5 (muted :Bool) -> ();
leave @6 () -> ();
subscribeCipher @7 (listener :CipherListener,
options :SubscribeOptions)
-> (sub :Subscription);
}
# Note the deliberate absence of setRoutingMode: an E2ECallHost
# cannot select mesh/MCU because chat-server is keyless and can
# only forward.
interface E2ECallHost extends(E2ECallParticipant) {
mute @0 (participantRef :Data) -> ();
unmute @1 (participantRef :Data) -> ();
eject @2 (participantRef :Data) -> ();
# Same delivery pattern as `CallHost.promoteHost`: the new
# `E2ECallHost` cap is delivered to the bound participant via
# CallRosterDelta (`e2eHostGranted :E2ECallHost` arm), not
# returned to the caller.
promoteHost @3 (participantRef :Data) -> (revoker :RolePromotionRevoker);
end @4 () -> ();
}
Key exchange
E2E DM establishment piggybacks on the contact-cap path. The critical invariant: chat-server only ever sees ciphertext.
- Alice’s Self.contact() produces a contact cap whose ContactInfo includes Alice’s long-term identity public key (or a fingerprint resolvable through her published profile). Where the contact cap is shared is out-of-band relative to chat-server.
- Bob, holding Alice’s contact cap, calls Self.openE2EDm(contact). chat-server mints E2EDmPeer(B->A) for Bob (a routing cap with NO key state) and delivers Alice’s side E2EDmPeer(A->B) to Alice via Self.subscribeIncoming (e2eDmOpened :E2EDmPeer arm of SelfIncomingEvent).
- Bob and Alice run a key-exchange handshake (X3DH or similar) in their own processes. The handshake ciphertexts travel over the E2E DM channel itself; chat-server is an opaque carrier. Bob’s KeyContext is built in Bob’s process from his identity PrivateKey and Alice’s identity public key; ditto for Alice. Neither key context is ever passed to a chat-server method or stored in a chat-server-minted cap.
- After handshake, each side holds a KeyContext locally. To send: encrypt(plaintext, KeyContext) -> CipherEnvelope, then peer.send(envelope). To receive: peer’s listener delivers CipherEnvelope, the listener’s owning principal calls decrypt(envelope, KeyContext) -> plaintext locally.
- Either party may rotate keys by performing a fresh ratchet step in their own process and exchanging the new ratchet header through normal send() – no special method is required because key state never lived on the peer cap.
- Out-of-band fingerprint verification compares peer.remoteFingerprint() (a public-key digest, safe to expose; it is NOT the AEAD secret) with what each side knows from their contact cap.
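The send/receive split in the steps above can be sketched as follows. This is a minimal model under loud assumptions: RoutingPeer stands in for the chat-server side of E2EDmPeer, KeyContext here is a toy type, and the XOR transform is only a placeholder that marks where a real AEAD ratchet would sit. It is NOT encryption and provides no security. All Rust names are hypothetical.

```rust
use std::collections::VecDeque;

/// Mirrors the schema's CipherEnvelope (receivedAtMs omitted for brevity).
#[derive(Debug, Clone, PartialEq)]
pub struct CipherEnvelope {
    pub ciphertext: Vec<u8>,
    pub associated_data: Vec<u8>,
}

/// Client-side only. Stand-in for real AEAD/ratchet state.
pub struct KeyContext {
    key: Vec<u8>,
    next_seq: u64,
}

impl KeyContext {
    pub fn new(key: Vec<u8>) -> Self {
        assert!(!key.is_empty(), "placeholder key must be non-empty");
        Self { key, next_seq: 0 }
    }

    // PLACEHOLDER transform: XOR with a repeating key. Not secure;
    // a real client would call an AEAD primitive here.
    fn xor(&self, data: &[u8]) -> Vec<u8> {
        data.iter()
            .zip(self.key.iter().cycle())
            .map(|(d, k)| d ^ k)
            .collect()
    }

    /// Runs in the sender's process; the sequence number travels as associated data.
    pub fn encrypt(&mut self, plaintext: &[u8]) -> CipherEnvelope {
        let seq = self.next_seq;
        self.next_seq += 1;
        CipherEnvelope {
            ciphertext: self.xor(plaintext),
            associated_data: seq.to_be_bytes().to_vec(),
        }
    }

    /// Runs in the receiver's process.
    pub fn decrypt(&self, envelope: &CipherEnvelope) -> Vec<u8> {
        self.xor(&envelope.ciphertext)
    }
}

/// Host-side routing endpoint: holds envelopes, never a KeyContext.
#[derive(Default)]
pub struct RoutingPeer {
    queue: VecDeque<CipherEnvelope>,
}

impl RoutingPeer {
    pub fn send(&mut self, envelope: CipherEnvelope) {
        self.queue.push_back(envelope);
    }

    pub fn deliver(&mut self) -> Option<CipherEnvelope> {
        self.queue.pop_front()
    }
}
```

The point of the sketch is structural: RoutingPeer has no field or method that mentions KeyContext, so nothing on the host side can reach plaintext.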
Why this firewalls plaintext from the host
- E2EDmPeer.send(CipherEnvelope) accepts ciphertext only. chat-server has no method to obtain the plaintext or the key context from the peer cap.
- subscribeCipher delivers CipherEnvelope to a CipherListener; decryption happens in the listener’s owning process.
- openCipherOut produces a CipherOut that accepts already-encrypted frames. chat-server forwards them without ever seeing plaintext.
- The KeyContext cap is held client-side, never serialized into a chat-server-minted cap, never passed as an argument to a chat-server method. (This is enforced by the cryptography-and-key-management-proposal.md KeyContext cap’s transfer policy: not transferable to chat-server.)
- E2E calls cannot mix/transcode because chat-server has no keys. The E2ECallSurface/E2ECallHost interfaces simply do not have setRoutingMode; the SFU-forward-only constraint is a type-level invariant rather than a runtime check.
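The "method simply doesn't exist" invariant has a direct analogue in Rust traits. The sketch below is an analogy, not generated capnp bindings: the trait and type names (CallControl, SetRouting, PlainCallHost, E2eCallHost) are hypothetical, and the default routing mode for the plain host is an assumption.

```rust
/// Routing modes named in the proposal: SFU-forward, mesh, MCU.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RoutingMode {
    SfuForward,
    Mesh,
    Mcu,
}

/// Capability shared by every call host: reading the current mode.
pub trait CallControl {
    fn routing_mode(&self) -> RoutingMode;
}

/// Extra capability that only plaintext-aware hosts implement.
pub trait SetRouting: CallControl {
    fn set_routing_mode(&mut self, mode: RoutingMode);
}

pub struct PlainCallHost {
    mode: RoutingMode,
}

impl PlainCallHost {
    pub fn new() -> Self {
        // Assumed default for illustration.
        Self { mode: RoutingMode::SfuForward }
    }
}

impl CallControl for PlainCallHost {
    fn routing_mode(&self) -> RoutingMode {
        self.mode
    }
}

impl SetRouting for PlainCallHost {
    fn set_routing_mode(&mut self, mode: RoutingMode) {
        self.mode = mode;
    }
}

/// Keyless host: forwarding is the only mode it can ever report.
pub struct E2eCallHost;

impl CallControl for E2eCallHost {
    fn routing_mode(&self) -> RoutingMode {
        RoutingMode::SfuForward
    }
}

// There is deliberately no `impl SetRouting for E2eCallHost`.
// Calling `set_routing_mode` on an `E2eCallHost` is a compile error,
// so no runtime check is needed to reject mesh or MCU.
```

A holder of E2eCallHost has no code path to request mixing, which matches the schema-level intent of omitting setRoutingMode from E2ECallSurface and E2ECallHost.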
What stays in vs out of scope here
In scope: end-to-end-encrypted DM voice/video calls. Both
plain DmPeer.callSurface() and E2EDmPeer.callSurface()
return E2ECallSurface. Direct calls between two principals are
end-to-end-encrypted at the media layer regardless of whether
the DM’s text is host-readable: chat-server forwards encrypted
RTP frames (via CipherOut-style tracks), and a DTLS-SRTP-style
key exchange runs between the peers at call start. The
SFU-forward-only constraint is enforced at the type level on
E2ECallSurface (no setRoutingMode).
Out of scope:
- E2E for the text of a regular DmPeer. That text stays plaintext-aware on chat-server. If you want host-blind text, use E2EDmPeer (which is a distinct cap layer with its own CipherEnvelope-shaped send/subscribe).
- Group E2E (multi-party MLS-style ratcheting). First slice is pairwise only. Group E2E is a future iteration once pairwise is proved.
- Cross-device synchronization (the “I want my E2E messages on a second device” problem). Out of scope.
- Server-side recording or transcoding for E2E media. The substrate is recording-blind everywhere; for E2E media, chat-server cannot mix or transcode anyway because it has no keys – this is a direct consequence, not a separate rule.
Backpressure And Quotas
Hot-path media (audio frames at 50 Hz, video frames at 30 Hz) does not fit on a synchronous request/response model.
- Outgoing audio/video uses -> stream so the caller can pipeline frame writes without each one waiting for an ACK; the framework applies backpressure when the buffer fills.
- Incoming audio/video listener caps publish a bounded ring; when the consumer falls behind, the substrate drops oldest frames and reports drop count via AudioFrameMeta.dropsSinceLast (or equivalent) so the consumer can detect liveness gaps without reconstructing full frame history.
- Per-chat quotas live in the chat cap itself (constructed by the hosting service). Per-session quotas live in the broker bundle. Two natural axes: max concurrent subscriptions per kind, max outgoing bandwidth per chat.
- Text history buffering is bounded by the trusted Rust backend’s AppState; browser view models receive at most the last N events. The chat-cap holder may also subscribeText with a since(eventId) option to fetch a bounded backlog.
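The drop-oldest behaviour for incoming media can be sketched as a bounded ring that counts evictions. This is a plain in-process model, assuming a VecDeque rather than the MemoryObject-backed rings the proposal expects; FrameRing and its methods are hypothetical names, and drops_since_last plays the role of AudioFrameMeta.dropsSinceLast.

```rust
use std::collections::VecDeque;

/// Bounded frame ring: when full, the oldest frame is evicted and counted.
pub struct FrameRing<T> {
    frames: VecDeque<T>,
    capacity: usize,
    drops_since_last: u32,
}

impl<T> FrameRing<T> {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "ring capacity must be non-zero");
        Self {
            frames: VecDeque::with_capacity(capacity),
            capacity,
            drops_since_last: 0,
        }
    }

    /// Producer side: never blocks; evicts the oldest frame if the ring is full.
    pub fn push(&mut self, frame: T) {
        if self.frames.len() == self.capacity {
            self.frames.pop_front();
            self.drops_since_last = self.drops_since_last.saturating_add(1);
        }
        self.frames.push_back(frame);
    }

    /// Consumer side: returns the oldest retained frame together with the
    /// number of frames dropped since the previous successful pop.
    pub fn pop(&mut self) -> Option<(T, u32)> {
        let frame = self.frames.pop_front()?;
        let drops = std::mem::take(&mut self.drops_since_last);
        Some((frame, drops))
    }

    pub fn len(&self) -> usize {
        self.frames.len()
    }
}
```

With capacity 3 and five frames pushed before the consumer runs, the first pop returns the third frame with a drop count of 2, which is enough for the consumer to notice the gap without any frame history.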
Privacy And Disclosure
Senders are surfaced through ChatInboundEvent.sender. Per
session-bound-invocation-context-proposal.md, the channel server sees the
caller’s opaque session-scoped reference plus freshness; it does not
see raw principal/profile/account fields by default. The chat-server-side
disclosure policy decides whether a sender’s display name, principal
class, or profile class is included in events visible to other
subscribers; default is “display name only”.
The remote CapSet UI’s redacted-transcript export rule applies here too: audio/video metadata (codec, timestamps, frame counts) may appear in transcripts; frame bodies do not.
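A disclosure policy of this kind can be sketched as a filter between what chat-server knows about a sender and what other subscribers see. The field and type names below (SenderRecord, DisclosurePolicy, VisibleSender, disclose) are hypothetical; only the three disclosable attributes and the "display name only" default come from the text above.

```rust
/// What chat-server may know about a sender (illustrative fields).
pub struct SenderRecord {
    pub session_ref: String, // opaque session-scoped reference
    pub display_name: String,
    pub principal_class: String,
    pub profile_class: String,
}

/// Which sender attributes are included in events shown to other subscribers.
pub struct DisclosurePolicy {
    pub display_name: bool,
    pub principal_class: bool,
    pub profile_class: bool,
}

impl Default for DisclosurePolicy {
    /// The proposal's default: display name only.
    fn default() -> Self {
        Self {
            display_name: true,
            principal_class: false,
            profile_class: false,
        }
    }
}

/// The redacted view attached to an outgoing event.
#[derive(Debug, PartialEq)]
pub struct VisibleSender {
    pub display_name: Option<String>,
    pub principal_class: Option<String>,
    pub profile_class: Option<String>,
}

/// Apply the policy; undisclosed attributes are simply absent.
pub fn disclose(sender: &SenderRecord, policy: &DisclosurePolicy) -> VisibleSender {
    VisibleSender {
        display_name: policy.display_name.then(|| sender.display_name.clone()),
        principal_class: policy.principal_class.then(|| sender.principal_class.clone()),
        profile_class: policy.profile_class.then(|| sender.profile_class.clone()),
    }
}
```

The session reference never appears in VisibleSender in this sketch; whether other subscribers should see any stable sender handle is a policy question the proposal leaves to the chat-server side.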
Migration From The Existing Chat Schema
The current Chat interface (text, poll-based, single struct) stays
callable during the migration. Steps in approximate order:
- Add the listener-cap surface (subscribeText, TextListener, the new ChatInboundEvent struct) alongside poll. Keep poll working.
- Migrate the chat-server demo and the per-session chat worker to push events through the listener cap. Mark poll deprecated for capnp-rpc clients but keep it for DTO clients during the remote-session transport migration (docs/plans/remote-session-capset-client.md Task 1).
- Add the audio surface (subscribeAudio, AudioSink, openAudioOut, AudioOut) once MemoryObject-backed media rings exist. The realtime voice proposal’s VoiceSession becomes the browser-side adapter that maps WebRTC tracks into Chat audio subscriptions.
- Add the video surface analogously. Video is feasible only after audio is proved end-to-end and the gateway-side WebRTC adapter exists.
- Once all subscribers are listener-cap-driven, remove poll from the substrate-level interface; service-specific shims may keep it.
Each step is a separate iteration with its own QEMU smoke and host-side proof. The first iteration on top of this proposal is the text-only listener-cap rebuild, which is also iteration 4 of the remote-session plan (real Chat panel + cross-session messaging test).
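One way a service-specific shim could keep poll alive on top of the listener cap is to buffer pushed events and drain them on demand. The sketch below assumes a simplified event type and a bounded buffer that evicts the oldest entry; PollShim, on_event, and the ChatEvent fields shown are hypothetical, not the schema's definitions.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for a text chat event.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatEvent {
    pub event_id: u64,
    pub text: String,
}

/// Adapter that receives listener pushes and serves legacy poll(maxEvents).
pub struct PollShim {
    pending: VecDeque<ChatEvent>,
    capacity: usize,
}

impl PollShim {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "shim capacity must be non-zero");
        Self { pending: VecDeque::new(), capacity }
    }

    /// Listener side: called whenever the substrate pushes an event.
    /// The buffer is bounded; the oldest event is evicted when full.
    pub fn on_event(&mut self, event: ChatEvent) {
        if self.pending.len() == self.capacity {
            self.pending.pop_front();
        }
        self.pending.push_back(event);
    }

    /// Legacy side: return up to max_events buffered events, oldest first.
    pub fn poll(&mut self, max_events: usize) -> Vec<ChatEvent> {
        let n = max_events.min(self.pending.len());
        self.pending.drain(..n).collect()
    }
}
```

A shim like this lets DTO clients keep polling during the transport migration while the substrate itself only pushes.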
Open Questions
- Per-cap transfer_policy enforcement at kernel level. Today CapInfo.transfer_policy is a string field on every cap (values like "stable", "session-proxy"); it is descriptive, not enforced. Cap transfer between processes happens via the SQE IPC_TRANSFER_CAP flag, which the kernel implements by copying the cap entry from sender’s CapTable into receiver’s. Today that copy succeeds regardless of transfer_policy. The substrate’s lineage invariant relies on: the only path for a chat cap to reach a new principal is through chat-server’s invite/acceptInvite/Self.openDm/etc. methods (which record lineage). But if a principal holds GroupMember(lobby) and passes that cap as a payload in an SQE to any other service via raw IPC_TRANSFER_CAP, the kernel hands a copy to that service – bypassing chat-server entirely. The lineage tree silently grows a copy with no recorded parent, and chat-server cannot revoke it. The kernel enforcement gap to close: extend SQE cap-transfer dispatch to consult transfer_policy and reject transfers whose policy class forbids cross-principal copy (chat-class caps would carry such a policy). Sharing then must go through chat-server’s typed methods, which is where lineage gets recorded. Until this gap is closed, the substrate’s lineage invariant is enforced only by convention; no implementation iteration should treat the substrate as hardened without it.
- Cross-channel reference of contact caps. This proposal has contact caps travel “through some channel the principals already share” – e.g. a contact cap is delivered via a group chat the giver and recipient both belong to. Chat events therefore need a way to carry cap references inline (the data field on ChatOutboundEvent plus a typed payload kind, or a separate cap-attachment field on the event). The first iteration may use the existing capnp cap-passing on the outbound event; details belong with iteration 1 schema refinement.
- Multi-modal AI agents. When the agent runtime is a Chat peer, it receives audio frames and emits audio frames. The agent runner bridges RealtimeModelSession to the relevant per-kind chat facets – typically GroupMember for an agent-prompt group, or a DmPeer/E2EDmPeer if the agent is a DM peer. Should the bridge live in the agent runner (clean) or be a generic adapter cap (RealtimeChatBridge)? The realtime-voice proposal already has the agent runner doing the bridging; this proposal preserves that.
- Cross-session media sharing. A chat may have subscribers from multiple sessions. Does each subscription have its own session-scoped reference (yes, per session-bound-invocation-context-proposal.md), and does the chat cap retain owner-session metadata for moderation / kick? Likely yes; details in a follow-up.
- Approval queue cap shape. Whether the queue lives on AuthorityBroker, on a new ApprovalQueue cap, or on a Notifications cap that carries approvals as one of its event kinds. Out of scope here; tracked in the approvals follow-up note above.
- Voice barge-in semantics with WebRTC. Existing realtime-voice-agent-shell-proposal.md defines barge-in within RealtimeModelSession; mapping that onto the Chat substrate (interrupt the outgoing audio track when a presence typing event or a fresh inbound audio frame arrives) needs design before the voice iteration.
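The kernel-side check described in the first open question could look roughly like the following. This is a sketch of the decision only, not kernel code: the policy value "host-mediated" is invented for illustration (the text names only "stable" and "session-proxy" as existing values), and CapEntry, TransferError, and check_raw_transfer are hypothetical names.

```rust
/// Hypothetical policy value for caps that must be shared through the
/// issuing service's typed methods. Not an existing capOS value.
pub const HOST_MEDIATED: &str = "host-mediated";

/// Minimal view of a cap table entry for this check.
pub struct CapEntry {
    pub cap_id: u64,
    pub transfer_policy: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TransferError {
    /// The cap's policy forbids a raw cross-principal copy.
    PolicyForbidsRawTransfer,
}

/// Decide whether a raw IPC_TRANSFER_CAP copy may proceed.
/// Principals are modelled as plain ids for illustration.
pub fn check_raw_transfer(
    cap: &CapEntry,
    sender_principal: u64,
    receiver_principal: u64,
) -> Result<(), TransferError> {
    let cross_principal = sender_principal != receiver_principal;
    if cross_principal && cap.transfer_policy == HOST_MEDIATED {
        // Sharing must go through the host's typed methods so that
        // lineage is recorded and the copy remains revocable.
        return Err(TransferError::PolicyForbidsRawTransfer);
    }
    // Other policy values keep today's behaviour: the copy succeeds.
    Ok(())
}
```

Under this sketch, a chat-class cap tagged with the restrictive policy could still move between processes of the same principal, while a raw copy to another principal is rejected before any cap table entry is written.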
Relationship To Existing Proposals
- realtime-voice-agent-shell-proposal.md: VoiceSession becomes the browser-side adapter into the Chat audio surface. RealtimeModelSession stays unchanged (agent runtime ↔ provider). The agent runner bridges the two when the agent is part of a chat.
- llm-and-agent-proposal.md: “operator sends a prompt to a running agent” is a Chat text event over a channel the operator already holds (e.g. GroupOwner of an agent-prompt group the operator created, or a contact cap the agent’s owner shared). “Agent emits a partial response” is a Chat text event with inReplyTo. “Agent requests a tool with consent required” emits an approvalRef event referencing an ApprovalGrant from the existing ApprovalClient surface; ApprovalClient is not used to grant cross-principal write authority – that is always invite- or contact-cap-driven.
- user-identity-and-policy-proposal.md: the principal model (PrincipalKind including service) is the basis for service principals owning system channels and for chat-server’s bundle and directory-scope predicates that test principal kind/profile.
- remote-session-capset-client-proposal.md: the remote CapSet UI’s “real Chat panel” target (iteration 4 of the plan) consumes the text-only slice of this substrate first; audio/video panels are follow-up iterations on the same backend boundary.
- shell-proposal.md: ApprovalClient/ApprovalGrant stay as defined; this proposal references them via approvalRef.
- session-bound-invocation-context-proposal.md: subscription identity is the session-scoped reference; Chat servers honour disclosure scopes.
- interactive-command-surface-proposal.md: typed command palettes remain a separate concern; a chat may surface a command-palette proposal as a structured message, but the command surface itself is not Chat.
- browser-capability-proposal.md: if a future browser tab sits inside a Chat-served pane (screen-share scenario), the browser cap rules still apply; Chat carries reference handles, not browser authority.
References
- WebRTC API specifications: RTCPeerConnection, RTCDataChannel, audio and video tracks, SDP, ICE candidates, DTLS/SRTP. See https://webrtc.org/.
- Cap’n Proto streaming RPC (-> stream method annotation) and listener-cap patterns: https://capnproto.org/news/2020-04-23-capnproto-0.8.html (introduces flow control), and the capnp Rust crate at v0.25 used in this repository.
- Existing capOS proposals as cross-referenced above.