# Proposal: capOS-Hosted Agent Swarms

capOS should eventually host OpenClaw-like personal agents and multi-agent
workflows as ordinary capability-scoped services. The existing
[Language Models and Agent Runtime](llm-and-agent-proposal.md) proposal defines
the model capability surface and the single-session tool-use loop. This
proposal covers the layer above it: long-lived hosted agents, workspace and
memory layout, swarm orchestration, agent-to-agent coordination, and harness
controls.

The first credible implementation is not a general "AI computer". It is a
controlled service graph:

- user-facing ingress through native shell, SSH/WebShellGateway, chat channels,
  webhooks, or scheduled triggers;
- a trusted capOS runner that owns session capabilities and enforces tool
  gates;
- narrow agent workers that receive only task-local workspace, retrieval, and
  tool caps;
- explicit memory and wiki services instead of hidden prompt state;
- durable task records, review gates, and attribution for multi-agent work.

This belongs outside the shell proposal. Shell mode remains one interactive
runner surface. Hosted agents need persistent service state, remote ingress,
work queues, memory compaction, swarm scheduling, and audit rules that would
make the shell proposal too broad.

## Research Baseline

Sources reviewed for this design:

- capOS research note, Hosted Agent Harnesses:
  <../research/hosted-agent-harnesses.md>
- OpenAI, Harness engineering:
  <https://openai.com/index/harness-engineering/>
- OpenAI, Agents SDK sandbox and model-native harness direction:
  <https://openai.com/index/the-next-evolution-of-the-agents-sdk/>
- OpenClaw documentation: home, agent runtime, workspace, memory, exec,
  browser, and multi-agent controls:
  <https://openclawlab.com/en/>,
  <https://openclawlab.com/en/docs/concepts/agent/>,
  <https://openclawlab.com/en/docs/concepts/agent-workspace/>,
  <https://openclawlab.com/en/docs/concepts/memory/>,
  <https://openclawlab.com/en/docs/tools/exec/>,
  <https://openclawlab.com/en/docs/tools/browser/>,
  <https://openclawlab.com/en/docs/concepts/multi-agent/>
- DeepWiki secondary project summaries for OpenClaw, OpenClaw skills,
  OpenManus, Microsoft Agent Framework, and AutoGen:
  <https://deepwiki.com/openclaw/openclaw>,
  <https://deepwiki.com/openclaw/skills/2.2-agent-memory-persistence-pattern>,
  <https://deepwiki.com/openclaw/docs/6.3-web-search-and-browser-tools>,
  <https://deepwiki.com/FoundationAgents/OpenManus>,
  <https://deepwiki.com/microsoft/agent-framework>,
  <https://deepwiki.com/microsoft/ai-agents-for-beginners/3.1-autogen-framework>
- Karpathy, LLM Wiki:
  <https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f>
- Abdullin, Schema-Guided Reasoning:
  <https://abdullin.com/schema-guided-reasoning/>
- MetaGPT:
  <https://arxiv.org/abs/2308.00352>
- Generative Agents / Smallville:
  <https://arxiv.org/abs/2304.03442>
- Gas Town documentation:
  <https://docs.gastownhall.ai/>,
  <https://docs.gastownhall.ai/usage/>
- Model Context Protocol:
  <https://modelcontextprotocol.io/docs/getting-started/intro>,
  <https://modelcontextprotocol.io/docs/learn/architecture>
- Agent2Agent Protocol:
  <https://github.com/a2aproject/A2A>,
  <https://a2a-protocol.org/latest/specification/>
- Microsoft AutoGen and Microsoft Agent Framework:
  <https://www.microsoft.com/en-us/research/project/autogen/overview/>,
  <https://learn.microsoft.com/en-us/agent-framework/overview/>
- LangGraph durable execution:
  <https://docs.langchain.com/oss/python/langgraph/durable-execution>
- CrewAI:
  <https://docs.crewai.com/>
- CAMEL-AI:
  <https://docs.camel-ai.org/get_started/introduction>

There is substantial low-quality agent SEO around OpenClaw and related systems.
This proposal relies on primary docs, official project pages, and arXiv papers;
DeepWiki pages are used only as secondary codebase summaries. News and social
reports may motivate later risk research, but they are not treated as design
authority.

## What Current Agent Harnesses Actually Do

The useful pattern is not "model plus tools". It is a harness that controls
what the model can inspect, what it can change, how work survives context loss,
and where human approval enters the loop.

OpenAI's harness engineering writeup is the cleanest framing for capOS:
repository-local, versioned artifacts are what the agent can reason about;
knowledge in chat threads, documents, and people's heads is effectively absent
unless compiled into files, schemas, tests, and executable plans. The same post
argues for mechanically enforced architecture, validated boundaries, and
agent-legible systems over ad-hoc documentation. The 2026 Agents SDK direction
adds an explicit model-native harness, controlled workspaces, sandbox execution,
filesystem tools, MCP, skills, AGENTS.md-style instructions, shell execution,
and structured patch tools.

OpenClaw shows the personal-agent product shape:

- local-first channel ingress through chat apps, webhooks, cron, and a gateway;
- a gateway security boundary for channels and tool execution;
- an agent runtime with a workspace as the default tool cwd;
- injected bootstrap files such as `AGENTS.md`, `TOOLS.md`, `USER.md`, and
  identity/persona files;
- built-in read, exec, edit/write, browser, web, process, memory, and skill
  surfaces;
- a browser harness with managed profiles, snapshots, screenshots, action refs,
  CDP routing, and optional arbitrary JavaScript evaluation;
- an exec harness with host selection (`sandbox`, `gateway`, `node`), security
  modes (`deny`, `allowlist`, `full`), approval prompts, timeouts, background
  sessions, PTY support, process polling, and path/env restrictions;
- markdown memory where files are the source of truth, plus semantic search,
  line-range reads, SQLite indexes, local/remote embeddings, and hybrid search;
- per-agent workspaces, sandbox settings, and tool allow/deny lists.

The important negative lesson is also explicit in OpenClaw's docs: a workspace
is not automatically a sandbox. If sandboxing is off, absolute paths and host
tools can still reach outside the workspace. capOS should not reproduce that
ambiguity. A capOS agent workspace must be a capability namespace by default,
not a convention over a host filesystem.

DeepWiki's accessible summaries add useful implementation-level signals:

- OpenClaw exposes tools as functional capabilities and skills as modular
  `SKILL.md` extensions, with a personal-assistant trust model, security audit,
  and sandboxing options.
- OpenClaw memory skills converge on durable, retrievable, self-maintaining
  memory because a single growing `MEMORY.md` overflows context and loses
  structure.
- OpenClaw web/browser docs describe dedicated managed browser profiles, CDP
  control through the gateway, SSRF checks, provider-backed web search, fetch
  normalization, and active memory integration.
- OpenManus uses a think-act cycle with tool execution, multi-provider LLMs,
  MCP integration, and sandboxed code/browser automation.
- Microsoft Agent Framework and AutoGen emphasize graph/workflow orchestration,
  checkpointing, human-in-the-loop, event-driven actor-style communication,
  distributed runtimes, tools, memory, observability, and MCP/A2A integrations.

For this repository itself, applying OpenAI-style harness engineering means
turning capOS's docs, workplans, run targets, QEMU proofs, proposal statuses,
research notes, and schema authority semantics into mechanically navigable
agent inputs. That repository-local work is owned by
[capOS Repository Harness Engineering](capos-repo-harness-engineering-proposal.md),
with source grounding in
[Hosted agent harnesses](../research/hosted-agent-harnesses.md).

## Product Goal

The visible milestone is:

`make run-hosted-agent` boots capOS in QEMU, starts a resident hosted-agent
service graph, accepts a scripted user request, creates a task-local workspace,
runs one or more bounded agent workers through a deterministic model service,
uses retrieval/wiki context, executes one read-only tool automatically, requires
approval for one mutating tool, records attributed audit output, and shuts down
without leaking session, model, or host authority to the worker.

Later milestones add real model backends, web ingress, chat ingress, browser
automation, multi-agent swarms, and remote/provider interoperability.

## Design Principles

1. **Harness first, model second.** The hosted-agent service is primarily a
   control plane for workspaces, tools, memory, approvals, lifecycle, and audit.
   Model selection is a replaceable backend decision.

2. **Agents are processes with caps, not identities with ambient power.** An
   agent worker has exactly the caps minted for one session, task, and phase.
   It does not inherit the operator's whole world.

3. **All tool execution is mediated.** The model proposes structured tool calls.
   The runner validates descriptors, arguments, turn binding, policy, budget,
   and approval before invocation.

4. **Memory is an artifact, not a hidden model property.** Durable facts,
   summaries, task logs, and wiki pages live in capability-scoped files or
   services with provenance, review status, and retention policy.

5. **Swarm work is durable structured data.** Tasks, assignments, handoffs,
   reviews, votes, failures, and merge decisions must outlive any model context
   window.

6. **Human review is a capability gate.** The system should support both
   high-autonomy local demos and conservative operator policy, but destructive
   or authority-widening actions require explicit fresh consent or step-up.

7. **Remote agent interoperability is data-plane only at first.** MCP- and
   A2A-style bridges may expose descriptors and messages, but they do not carry
   raw capOS authority.

8. **capOS should be stricter than desktop harnesses.** Browser profiles,
   shell execution, provider credentials, memory stores, and file workspaces are
   separate capabilities with narrow lifetime and auditable grants.

9. **Shared resources need coordination objects.** A git repo, task queue,
   wiki, browser profile, or shared todo list is not just a file path. The
   agent harness must expose owners, leases, versions, watches, and conflict
   reports before workers mutate shared state.

10. **Incoming agent messages are untrusted work items.** A chat message from
    another agent can carry status, questions, handoffs, artifacts, or requests.
    It must not directly alter prompt state, execute tools, widen caps, or
    override task policy.

## System Topology

```mermaid
flowchart LR
    User[User / channel / cron / webhook] --> Gateway[Ingress Gateway]
    Gateway --> Broker[AuthorityBroker]
    Broker --> Host[HostedAgentService]

    Host --> Task[AgentTask<br/>durable state]
    Host --> Runner[AgentRunner<br/>trusted tool gate]
    Host --> Memory[AgentMemory<br/>wiki + logs + search]
    Host --> Model[LanguageModel<br/>local or remote backend]
    Host --> Scheduler[SwarmScheduler]

    Scheduler --> W1[Worker process<br/>task workspace caps]
    Scheduler --> W2[Worker process<br/>task workspace caps]
    Scheduler --> R[Reviewer process<br/>read + critique caps]

    Runner --> Tools[Typed capOS tools]
    Runner --> Approval[ApprovalClient]
    Runner --> Audit[AuditLog]
    Memory --> Store[(Workspace / Wiki / Vector Index)]
```

The kernel does not need agent semantics. It needs process isolation, endpoint
invocation metadata, MemoryObject/file-backed storage, capability transfer, and
resource accounting. The agent system is a userspace service graph.

## Core Capabilities

### HostedAgentService

Owns hosted-agent lifecycle for one broker policy domain:

- create a task from a user request, webhook, schedule, or shell command;
- allocate a task workspace and memory scope;
- select a model profile and runner policy;
- start workers with exact-grant capsets;
- enforce task budgets and cancellation;
- publish task status to shell, web, or chat surfaces;
- close, archive, or purge task state.

### AgentTask

Durable task record:

- request, normalized objective, requester session reference, and ingress
  provenance;
- workspace root cap, memory scope cap, allowed tools, and budgets;
- model profile and harness version;
- worker assignments and state transitions;
- links to artifacts, audit records, approvals, and review results;
- lifecycle status (`open`, `blocked`, `needsApproval`, `reviewing`, `done`,
  `failed`, `cancelled`, `expired`).
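
One way to make the status set mechanically checkable is an explicit
transition table. The sketch below is illustrative only: the status names come
from the list above, but which states are terminal and which transitions are
allowed are assumptions of this example, not decisions made by this proposal.

```python
from enum import Enum


class TaskStatus(Enum):
    OPEN = "open"
    BLOCKED = "blocked"
    NEEDS_APPROVAL = "needsApproval"
    REVIEWING = "reviewing"
    DONE = "done"
    FAILED = "failed"
    CANCELLED = "cancelled"
    EXPIRED = "expired"


# Assumed terminal states: once reached, the task accepts no more transitions.
TERMINAL = {TaskStatus.DONE, TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.EXPIRED}

# Hypothetical transition table; the real policy is a design decision.
ALLOWED = {
    TaskStatus.OPEN: {TaskStatus.BLOCKED, TaskStatus.NEEDS_APPROVAL, TaskStatus.REVIEWING},
    TaskStatus.BLOCKED: {TaskStatus.OPEN},
    TaskStatus.NEEDS_APPROVAL: {TaskStatus.OPEN},
    TaskStatus.REVIEWING: {TaskStatus.OPEN, TaskStatus.DONE},
}


def can_transition(current: TaskStatus, target: TaskStatus) -> bool:
    """Return True if the assumed policy allows current -> target."""
    if current in TERMINAL:
        return False
    # Any live task may fail, be cancelled, or expire.
    if target in {TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.EXPIRED}:
        return True
    return target in ALLOWED.get(current, set())
```

In this sketch a task can only reach `done` through `reviewing`, which mirrors
the review-gate principle; a real table would be owned by `HostedAgentService`.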

### AgentRunner

Trusted loop executor:

- builds tool descriptors from held caps and broker policy;
- calls `LanguageModel.stream` or `complete`;
- validates structured tool calls;
- applies schema-guided reasoning templates for planner/reviewer tasks;
- runs guard checks before and after tool execution;
- truncates and redacts tool results;
- appends conversation and action records;
- handles cancellation, timeout, retry, and model failure.

### AgentMemory

Information organization layer:

- append-only daily task log;
- curated long-term project memory;
- source store for immutable raw inputs;
- LLM-maintained wiki pages with source citations;
- index and log files for cheap navigation;
- optional BM25/vector hybrid search and reranking;
- stale/contradiction/orphan-page lint;
- per-session and per-project visibility controls.

### SwarmScheduler

Multi-agent orchestration:

- decomposes work into durable sub-tasks;
- assigns workers by role, available caps, model profile, and track record;
- creates task-local worktrees or equivalent namespace forks for code work;
- supervises handoff and timeout;
- asks reviewer workers for critique under read-only or constrained write caps;
- emits merge/release requests only after gates pass.

## Workspace Model

Desktop harnesses commonly treat a workspace as a cwd convention. capOS should
treat a workspace as a capability namespace:

- `WorkspaceRoot`: scoped directory-like cap for a task.
- `SourceMount`: read-only cap to immutable sources.
- `Scratch`: writeable temporary storage with quota and TTL.
- `ArtifactOutbox`: explicit export path for user-visible artifacts.
- `PatchSet`: structured edit proposal, not arbitrary writes by default.
- `SecretsView`: normally absent; if present, returns typed opaque handles, not
  strings.

Default policy:

- read-only source mounts unless the task explicitly asks for edits;
- no absolute path escape because there is no global filesystem path;
- generated artifacts are quarantined until reviewed or explicitly released;
- tool outputs are capped and stored with provenance;
- workspaces expire unless promoted to project memory.

This makes OpenClaw-style `sandbox` versus `host` ambiguity unnecessary.
Authority is not inferred from where a command happens to run.
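
A toy model can show why path escape has nothing to resolve against when the
workspace is a namespace rather than a directory convention. The
`WorkspaceRoot` class and `CapError` below are hypothetical stand-ins written
for this example, not capOS interfaces.

```python
class CapError(Exception):
    """Raised when a lookup or write is not backed by a held capability."""


class WorkspaceRoot:
    """Toy capability namespace: a tree of named entries with no global root."""

    def __init__(self, entries: dict, writable: bool = False):
        self._entries = entries      # name -> bytes (file) or WorkspaceRoot (subdir)
        self._writable = writable

    def open(self, path: str):
        """Resolve a relative path one component at a time."""
        node = self
        for part in path.split("/"):
            # There is no parent pointer and no global root, so these
            # components are rejected instead of being resolved.
            if part in ("", ".", ".."):
                raise CapError(f"invalid path component: {part!r}")
            if not isinstance(node, WorkspaceRoot) or part not in node._entries:
                raise CapError(f"no capability for {path!r}")
            node = node._entries[part]
        return node

    def write(self, name: str, data: bytes) -> None:
        if not self._writable:
            raise CapError("read-only workspace")
        self._entries[name] = data
```

A task workspace is then just a root whose entries are the granted caps, for
example a read-only `src` mount next to a writable `scratch` subtree.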

## Shared Resource Coordination

Agent swarms fail in ordinary repositories and shared task lists when every
worker believes it is alone. capOS should model shared resources explicitly:

- `SharedResource`: git repository, task list, wiki page tree, browser profile,
  memory store, package cache, or external service account.
- `ResourceLease`: exclusive or shared claim with owner, task, phase, scope,
  expiry, renewal policy, and release reason.
- `ResourceVersion`: observed revision, generation, branch head, page hash, or
  compare-and-swap token.
- `ResourceWatch`: subscription to resource updates, lease changes, conflicts,
  and merge/release queue events.
- `ConflictReport`: structured notice that two tasks touched the same file,
  todo item, wiki page, browser profile, credential scope, or external object.

Minimum policy:

- leases are coordination metadata, not write authority; mutation still
  requires the relevant workspace, patch, tool, or service cap;
- every mutating task declares the resource scopes it expects to touch;
- exclusive resources reject overlapping leases unless a supervisor approves a
  shared mode;
- shared resources require versioned writes or patch sets;
- stale leases expire and emit events instead of silently blocking work;
- workers receive conflict reports as structured context, not as informal chat;
- merge/release queues serialize publication to user-visible state;
- audit records include resource scope, observed version, write version, and
  approving actor.
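
The lease rules above can be sketched as a small in-memory table. This is a
minimal illustration under stated assumptions: scopes match only when equal,
time is a caller-supplied number, and the `Lease` and `LeaseTable` names are
invented for the example. Holding a lease here grants no write authority; it
only records a claim.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    owner: str
    scope: str          # e.g. a path prefix or todo item id
    exclusive: bool
    expires_at: float   # caller-supplied clock, seconds


@dataclass
class LeaseTable:
    leases: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def _expire(self, now: float) -> None:
        live = []
        for lease in self.leases:
            if lease.expires_at <= now:
                # Stale leases emit an event instead of silently blocking work.
                self.events.append(("expired", lease.owner, lease.scope))
            else:
                live.append(lease)
        self.leases = live

    def acquire(self, owner, scope, exclusive, now, ttl):
        """Return a Lease, or None after recording a conflict event."""
        self._expire(now)
        for held in self.leases:
            overlaps = held.scope == scope  # assumption: exact-match scopes only
            if overlaps and held.owner != owner and (exclusive or held.exclusive):
                self.events.append(("conflict", owner, scope, held.owner))
                return None
        lease = Lease(owner, scope, exclusive, now + ttl)
        self.leases.append(lease)
        return lease
```

The `events` list stands in for what a `ResourceWatch` subscriber would
receive; supervisor-approved shared mode is left out of the sketch.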

Concrete resource policies:

- **Git repositories:** one task worktree and branch per worker, path/subsystem
  claims for high-conflict areas, merge queue before mainline publication, and
  conflict reports when another task changes claimed paths.
- **Shared todo lists:** item-level claims, item generation numbers,
  compare-and-swap updates, and supervisor escalation for duplicate ownership.
- **Wiki and memory pages:** page leases or patch sets, source citations,
  contradiction checks, and freshness labels before compiled memory becomes
  trusted context.
- **Browser profiles:** exclusive lease by default because cookies, local
  storage, downloads, and screenshots collapse many unrelated authorities.
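
For the todo-list case, item generation numbers plus compare-and-swap can be
illustrated in a few lines. The `TodoItem` shape and the `claim` helper are
assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TodoItem:
    text: str
    owner: Optional[str] = None
    generation: int = 0


class StaleWrite(Exception):
    """The caller's observed generation no longer matches the stored item."""


def claim(item: TodoItem, worker: str, observed_generation: int) -> None:
    """Claim an item only if nobody changed it since the caller read it."""
    if item.generation != observed_generation:
        raise StaleWrite(f"expected gen {observed_generation}, found {item.generation}")
    if item.owner is not None and item.owner != worker:
        raise StaleWrite(f"already owned by {item.owner}")
    item.owner = worker
    item.generation += 1
```

A worker that loses the race gets a structured failure it can report to the
supervisor, rather than silently overwriting another worker's claim.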

For capOS repository work specifically, this maps to the existing requirement
that each agent uses a dedicated branch and worktree. A future harness should
make that visible through an active-work registry, claimed resource scopes,
review findings, and merge-queue state instead of relying on each agent to
infer it from git state and chat history.

## Agent Inboxes and Inter-Agent Messages

Free-form peer chat is useful for coordination, but it is a poor authority
boundary. capOS should deliver messages through an explicit `AgentInbox`
capability owned by the runner or task, not by direct prompt injection.

An incoming message should be a structured `AgentMessage` event:

```yaml
id: msg-...
sender: agent-or-peer-id
sender_task: task-...
recipient_task: task-...
kind: status
# status | question | handoff | reviewFinding | resourceEvent |
# artifactReady | approvalRequest | interrupt
causal_parent: msg-or-task-event-id
body: bounded markdown or structured payload
artifact_refs:
  - artifact-...
requested_actions:
  - proposed action descriptor
requested_authority:
  - capability descriptor, never a raw cap
expires_at_unix_ms: 1893456000000
```

Delivery rules:

- the runner validates sender identity, task relationship, size, schema, expiry,
  and policy before the model sees the message;
- message ids are deduplicated per sender and task within a bounded replay
  window;
- old causal parents, duplicate approval requests, and duplicate interrupts are
  quarantined instead of redelivered;
- per-sender and per-task quotas cap message count, queued bytes, delivery rate,
  and model-visible inbox bytes;
- peers that exceed quota or trigger repeated quarantine are rate-limited or
  muted until supervisor review;
- unknown senders, stale tasks, malformed payloads, and policy-incompatible
  requests are quarantined for supervisor review;
- artifact references require separate artifact caps before content is read;
- requested actions become proposed tool calls or task changes, never automatic
  execution;
- requested authority becomes an approval request, never ambient delegation;
- interrupts and approval requests may receive priority, but still pass through
  policy and audit;
- every delivered message carries sender, task, and causal-parent metadata so a
  worker can distinguish user intent, supervisor instruction, peer status, and
  untrusted external input.
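
A simplified sketch of the runner-side checks follows. It covers only a few
of the delivery rules (deduplication, sender and task checks, expiry, size,
and a per-sender quota); the `Inbox` class, the 4096-byte bound, and the
return values are assumptions of the example, and the replay window is
unbounded here where a real one would be bounded.

```python
from dataclasses import dataclass, field

MAX_BODY_BYTES = 4096  # assumed per-message bound


@dataclass
class Inbox:
    task_id: str
    known_senders: set
    per_sender_quota: int = 3
    seen: set = field(default_factory=set)        # (sender, id) replay window
    counts: dict = field(default_factory=dict)    # sender -> delivered count
    delivered: list = field(default_factory=list)
    quarantined: list = field(default_factory=list)

    def offer(self, msg: dict, now_ms: int) -> str:
        """Classify one message as 'delivered', 'duplicate', or 'quarantined'."""
        sender = msg.get("sender")
        key = (sender, msg.get("id"))
        if key in self.seen:
            return "duplicate"
        self.seen.add(key)
        reason = None
        if sender not in self.known_senders:
            reason = "unknown sender"
        elif msg.get("recipient_task") != self.task_id:
            reason = "wrong task"
        elif msg.get("expires_at_unix_ms", 0) <= now_ms:
            reason = "expired"
        elif len(str(msg.get("body", "")).encode()) > MAX_BODY_BYTES:
            reason = "oversized"
        elif self.counts.get(sender, 0) >= self.per_sender_quota:
            reason = "quota exceeded"
        if reason:
            self.quarantined.append((msg.get("id"), reason))
            return "quarantined"
        self.counts[sender] = self.counts.get(sender, 0) + 1
        # Delivery only queues the message; requested actions stay proposals.
        self.delivered.append(msg)
        return "delivered"
```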

This gives agents the useful parts of chat messages from other agents without
making chat an authority channel. It also gives the scheduler a place to
surface shared-resource events such as "another worker claimed this path",
"your todo item changed", or "merge queue rejected your patch".

## Tool Harness Controls

capOS should support the same classes of controls as current harnesses, but
with capability-native semantics:

| Tool class | Desktop harness pattern | capOS target |
| --- | --- | --- |
| File read | workspace-relative reads, memory reads | directory/file caps with line-range and byte-budget policy |
| File write/edit | direct edits or patch tool | `PatchSet` plus approval, or write cap scoped to scratch/outbox |
| Shell/exec | host/sandbox/node, allowlist/full, approvals | `CommandRunner` cap with binary caps, argv schema, cwd cap, env cap, PTY cap, timeout, output cap |
| Browser | CDP profile, snapshots, action refs, screenshots | `BrowserSession` cap with profile isolation, origin policy, JS-eval deny by default, screenshot/snapshot separation |
| Web/fetch | provider-specific tool | `HttpEndpoint` / `Fetch` caps scoped by origin, method, headers, and data labels |
| Model | provider API key or local model | `LanguageModel` cap from broker, no provider secret strings |
| Memory | markdown files plus search plugin | `AgentMemory` cap with source/wiki/index/search subcaps |
| Agent-to-agent | session send/spawn, A2A-like messages | `AgentPeer` endpoint with message schema, no implicit authority transfer |

Execution policy modes should reuse the LLM proposal's `auto`, `consent`,
`stepUp`, and `forbidden` modes, but attach them to typed capability methods
and task phases. A tool may be `auto` during read-only research and `consent`
when called from a mutating phase.
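
As a sketch of phase-sensitive policy, the table below keys the mode on a
(method, phase) pair and fails closed for anything unlisted. The method names,
phase names, and the reading of `stepUp` as "approval plus a stronger
authentication step" are assumptions of this example; the authoritative mode
semantics live in the LLM proposal.

```python
from enum import Enum


class Mode(Enum):
    AUTO = "auto"
    CONSENT = "consent"
    STEP_UP = "stepUp"
    FORBIDDEN = "forbidden"


# Hypothetical policy table keyed by (capability method, task phase).
POLICY = {
    ("File.read", "research"): Mode.AUTO,
    ("File.read", "mutate"): Mode.CONSENT,
    ("PatchSet.apply", "mutate"): Mode.CONSENT,
    ("CommandRunner.run", "mutate"): Mode.STEP_UP,
}


def decide(method: str, phase: str, approved: bool = False, stepped_up: bool = False) -> bool:
    """Return True if the call may proceed. Unlisted pairs are forbidden."""
    mode = POLICY.get((method, phase), Mode.FORBIDDEN)
    if mode is Mode.AUTO:
        return True
    if mode is Mode.CONSENT:
        return approved
    if mode is Mode.STEP_UP:
        return approved and stepped_up
    return False
```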

## Browser Harness

Browser automation is high-risk because logged-in web state, screenshots, and
page JavaScript collapse many trust boundaries. A capOS browser harness should:

- launch a dedicated browser profile per task or per approved long-lived agent;
- keep personal/operator browser profiles out of scope by default;
- expose snapshots and screenshots as separate capabilities;
- require explicit policy for JavaScript evaluation;
- bind every action to a prior snapshot ref when possible;
- treat page text, DOM, screenshots, downloads, and clipboard data as hostile;
- block private-network and metadata-service fetches unless broker policy
  grants them;
- isolate cookies and credentials by profile cap;
- make remote CDP-style control a future bridge, never the baseline.
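
The private-network and metadata-service rule can be approximated with the
Python standard library. This is a minimal sketch that assumes the caller has
already resolved the hostname and will connect to exactly that address; the
function name and the `allow_private` flag (standing in for a broker grant)
are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def fetch_allowed(url: str, resolved_ip: str, allow_private: bool = False) -> bool:
    """Decide whether a browser or fetch tool may contact resolved_ip for url."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    ip = ipaddress.ip_address(resolved_ip)
    internal = (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
    )
    # 169.254.169.254, a common cloud metadata address, is link-local.
    return allow_private or not internal
```

Checking the resolved address rather than the hostname matters because a
hostname that looks public can resolve to an internal address.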

The first QEMU proof should use a deterministic fake browser tool, not a full
Chromium port.

## Exec Harness

The first exec surface should not be a Unix shell. It should be a command
capability with explicit shape:

```capnp
interface CommandRunner {
  run @0 (req :CommandRequest) -> (result :CommandResult);
}
```

The request should name a pre-granted program or command class, not arbitrary
shell text. If a POSIX layer later exists, shell execution can be a separate
high-risk tool with parsing, approval, and audit.

Minimum controls:

- allowed program identity is resolved before execution;
- argv is structured, not interpolated;
- environment is built from allowlisted variables and typed secret handles;
- working directory is a `WorkspaceRoot` or subdirectory cap;
- output byte and line limits are mandatory;
- timeout and kill semantics are mandatory;
- background processes require an explicit `ProcessSession` cap;
- PTY is a separate grant;
- network access is absent unless the child receives a network cap;
- mutating commands require approval unless the task owns the target scratch
  or patch workspace.
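
To make several of the minimum controls concrete, here is a sketch of a
structured request and a validator that reports policy violations before
anything runs. The field names, the allowlists, and the string-based `granted`
set are assumptions for illustration; the proposal does not yet define
`CommandRequest`.

```python
from dataclasses import dataclass

ALLOWED_PROGRAMS = {"fmt-check", "unit-tests"}   # hypothetical pre-granted ids
ALLOWED_ENV = {"LANG", "TZ"}                     # hypothetical allowlist


@dataclass
class CommandRequest:
    program: str                 # pre-granted program identity, not a path
    argv: tuple                  # structured arguments, never shell text
    env: dict
    timeout_ms: int
    max_output_bytes: int
    wants_pty: bool = False
    wants_network: bool = False


def validate(req: CommandRequest, granted: set) -> list:
    """Return a list of policy violations; an empty list means runnable."""
    errors = []
    if req.program not in ALLOWED_PROGRAMS:
        errors.append("program not granted")
    if not all(isinstance(a, str) for a in req.argv):
        errors.append("argv must be a sequence of strings")
    if set(req.env) - ALLOWED_ENV:
        errors.append("env variable not allowlisted")
    if req.timeout_ms <= 0 or req.max_output_bytes <= 0:
        errors.append("timeout and output cap are mandatory")
    if req.wants_pty and "pty" not in granted:
        errors.append("PTY requires a separate grant")
    if req.wants_network and "network" not in granted:
        errors.append("network cap absent")
    return errors
```

Because `argv` is a sequence of strings handed to a named program, a value
such as `"rm -rf /"` is just one argument and is never parsed as shell text.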

## Memory, Wiki, and Retrieval

Karpathy's LLM Wiki pattern is a better fit for capOS than an unstructured
vector database as the primary memory. The design has three layers:

- immutable raw sources;
- an LLM-maintained markdown wiki of summaries, entity pages, concept pages,
  comparisons, and synthesis;
- a schema/instruction file that defines page layout, ingest, query, lint, and
  update conventions.

The useful operations are:

- **Ingest:** read a source, write or update wiki pages, update index, append
  log.
- **Query:** read the index, inspect relevant pages, synthesize an answer with
  citations, optionally file useful answers back into the wiki.
- **Lint:** find contradictions, stale claims, orphan pages, missing links,
  weak citations, and data gaps.

capOS should implement this as a service rather than only as files:

- `SourceCorpus`: immutable source handles with digest, label, owner, and TTL.
- `WikiPage`: generated markdown plus source citations and confidence status.
- `WikiIndex`: content-oriented page catalog, cheap enough for the agent to
  read first.
- `WikiLog`: append-only operation timeline.
- `WikiLint`: typed findings for contradictions, missing citations, stale
  pages, orphan pages, and access-label drift.
- `SearchIndex`: optional BM25/vector hybrid index over approved pages and
  source chunks.
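
A small part of `WikiLint` can be sketched over plain dictionaries. The page
representation and finding names below are assumptions for the example; it
covers only missing citations, orphan pages, and missing links, and treats a
page as an orphan when nothing links to it and the index does not list it.

```python
def lint_wiki(pages: dict, index: set) -> list:
    """Lint pages shaped as name -> {"links": [...], "citations": [...]}."""
    findings = []
    linked = {target for page in pages.values() for target in page["links"]}
    for name, page in sorted(pages.items()):
        if not page["citations"]:
            findings.append(("missingCitation", name))
        if name not in linked and name not in index:
            findings.append(("orphanPage", name))
        for target in page["links"]:
            if target not in pages:
                findings.append(("missingLink", name, target))
    return findings
```

Returning typed tuples rather than prose keeps findings usable as structured
context for a reviewer worker.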

OpenClaw's memory docs are a practical baseline: markdown is the source of
truth, daily logs and curated `MEMORY.md` are separate, semantic search returns
bounded snippets with file and line ranges, indexes are per-agent, and local
embeddings can avoid remote leakage. capOS should add hard provenance, labels,
and write authority.

### Retrieval Rules

- Retrieval returns bounded snippets, not whole private files by default.
- Every synthesized claim that leaves the task should carry source links or be
  marked uncited.
- Wiki pages inherit the maximum confidentiality label of their sources unless
  a trusted redaction step lowers it.
- Memory writes require a policy decision: transient task log, project wiki,
  user memory, or rejected.
- Cross-agent memory access is explicit. A reviewer can read task artifacts
  without inheriting private user memory.
- Remote embedding backends are denied for high-label memory.
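
Three of these rules reduce to small pure functions. The four-level label
order, the `internal` ceiling for remote embeddings, and the 20-line snippet
bound are all assumptions chosen for this sketch.

```python
LABEL_ORDER = ["public", "internal", "confidential", "secret"]  # assumed ordering


def inherited_label(source_labels: list) -> str:
    """A wiki page takes the highest label among its sources."""
    if not source_labels:
        return LABEL_ORDER[-1]  # fail closed when provenance is missing
    return max(source_labels, key=LABEL_ORDER.index)


def remote_embedding_allowed(label: str, ceiling: str = "internal") -> bool:
    """Deny remote embedding backends above the assumed ceiling."""
    return LABEL_ORDER.index(label) <= LABEL_ORDER.index(ceiling)


def bounded_snippet(lines: list, start: int, end: int, max_lines: int = 20) -> dict:
    """Return at most max_lines lines with a 1-based inclusive line range."""
    end = min(end, start + max_lines - 1, len(lines))
    return {"range": (start, end), "text": lines[start - 1:end]}
```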

## Schema-Guided Reasoning

Abdullin's Schema-Guided Reasoning pattern is directly useful for capOS:
force the model to fill typed intermediate structures in a known order,
validate them, and test them. It is not a substitute for capability policy, but
it is a good harness technique for bounded agent roles.

Use SGR for:

- task intake: classify objective, risk, needed capabilities, and missing
  clarifications;
- plan decomposition: produce sub-tasks, dependencies, verification gates, and
  rollback paths;
- tool-call review: explain why a call is necessary and what authority it
  touches before approval;
- source ingest: extract claims, citations, contradictions, and affected pages;
- code review: enumerate behavioral risks, security risks, tests, and residual
  uncertainty;
- final handoff: summarize artifacts, verification, open risks, and memory
  updates.

Each schema should be a Cap'n Proto or JSON-schema-like type with versioning,
test fixtures, and guardrails. The runner should validate the structure before
any action, and failures should become ordinary tool results rather than hidden
prompt retries.
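
For task intake, the validation step might look like the following sketch.
The `TaskIntake` fields follow the intake bullet above, while the risk levels
and the result dictionary shape are assumptions of the example. Validation
failures are returned as data so the runner can hand them back as an ordinary
tool result.

```python
from dataclasses import dataclass

RISK_LEVELS = {"low", "medium", "high"}  # assumed vocabulary


@dataclass
class TaskIntake:
    """Hypothetical intake schema; fields are filled in this order."""
    objective: str
    risk: str
    needed_capabilities: list
    missing_clarifications: list


def parse_intake(raw: dict) -> dict:
    """Validate model output and return an ordinary tool result either way."""
    try:
        intake = TaskIntake(**raw)
    except TypeError as exc:  # missing or unexpected fields
        return {"ok": False, "error": f"schema mismatch: {exc}"}
    if not isinstance(intake.objective, str) or not intake.objective.strip():
        return {"ok": False, "error": "objective is empty"}
    if intake.risk not in RISK_LEVELS:
        return {"ok": False, "error": f"unknown risk level: {intake.risk!r}"}
    return {"ok": True, "intake": intake}
```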

## Swarm Patterns

### MetaGPT / Role Pipelines

MetaGPT's useful contribution is not the specific software-company metaphor.
It encodes standard operating procedures into prompt sequences and assigns
roles so intermediate artifacts can be verified. capOS should borrow the
artifact gates:

- product/task brief;
- requirements and constraints;
- design sketch;
- implementation plan;
- implementation;
- tests and verification;
- review;
- release/handoff.

Do not hard-code "PM", "architect", and "engineer" as kernel concepts. They
are runner roles backed by schemas, caps, and task state.

### Smallville / Generative Agents

The Generative Agents paper is useful for long-lived NPCs, companion agents,
and simulations. Its memory stream, reflection, and planning loop explain how
agents can appear coherent over time. capOS should use it cautiously:

- good for adventure NPCs, training simulations, social workflows, and
  explainable daily plans;
- bad as a direct authority model because believable behavior is not safe
  behavior;
- memory/reflection outputs must be low-authority data until reviewed or
  compiled into a scoped wiki.

### Gas Town / Durable Agent Work

Gas Town's useful pattern is persistent orchestration: roles, durable work
objects, attribution, worker lifecycles, worktrees, convoys, merge queues, and
supervision. capOS should borrow:

- one task object per unit of work;
- explicit worker lifecycle classes: persistent worker, ephemeral worker,
  reviewer, supervisor;
- task-local worktrees or namespace forks;
- merge/release queues;
- per-action attribution and track record;
- handoff records when an agent loses context or is recycled.

capOS should not borrow the role vocabulary or assume git is the only state
substrate. For code work, git/worktrees are excellent. For OS services, the same
pattern should map to `AgentTask`, `PatchSet`, `Artifact`, and `ReviewFinding`
capabilities.

## Interoperability

### MCP

MCP is a useful external compatibility layer for tools, resources, and prompts.
Its architecture is JSON-RPC over stdio or HTTP, with client/server capability
negotiation and primitives for tools, resources, prompts, sampling,
elicitation, logging, and experimental tasks.

capOS should treat MCP as an adapter boundary:

- an MCP server can be hosted as a low-authority process behind a capOS tool
  proxy;
- an MCP client can import external tools only after broker review;
- MCP tool descriptors are translated into capOS `ToolDescriptor` values;
- MCP tool calls execute through runner policy, not directly from the model;
- stdio MCP servers run without ambient filesystem/network unless granted caps;
- remote MCP uses `HttpEndpoint` plus explicit auth/token caps;
- MCP sampling/elicitation must not bypass runner approval or user-presence
  policy.
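
Descriptor translation could look like the sketch below. It reads the `name`,
`description`, and `inputSchema` fields of an MCP tool entry; the
`ToolDescriptor` fields, the `mcp.<server>.<tool>` naming scheme, the
512-character description cap, and the default modes are assumptions made for
this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDescriptor:
    """Hypothetical capOS-side descriptor; field names are assumptions."""
    name: str
    description: str
    input_schema: dict
    origin: str
    mode: str


def import_mcp_tool(tool: dict, server_id: str, pinned_version: str,
                    reviewed: bool) -> ToolDescriptor:
    """Translate one MCP tool entry; unreviewed imports start as forbidden."""
    schema = tool.get("inputSchema")
    if not isinstance(tool.get("name"), str) or not isinstance(schema, dict):
        raise ValueError("malformed MCP tool entry")
    return ToolDescriptor(
        # Namespacing by server avoids collisions between similarly named tools.
        name=f"mcp.{server_id}.{tool['name']}",
        description=str(tool.get("description", ""))[:512],
        input_schema=schema,
        origin=f"{server_id}@{pinned_version}",
        mode="consent" if reviewed else "forbidden",
    )
```

Recording the server and pinned version in `origin` gives audit records a
provenance handle for every imported tool.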

The risk is tool-marketplace sprawl: tools with similar names, hidden network
behavior, local process execution, and prompt-injection-sensitive resources.
capOS should require provenance, signing, version pinning, permission review,
and sandboxed execution for imported MCP servers.

### A2A / Agent-to-Agent

A2A is the right primary protocol reference for cross-agent interoperability:
agent cards, peer discovery, modality negotiation, task collaboration, text,
files, structured data, and streaming or push delivery. The first capOS bridge
should still be narrower than the full protocol surface:

- `AgentPeer.describe()` returns identity, capabilities, cost, labels, and
  accepted task/message schemas.
- `AgentPeer.send()` imports a task or message into `AgentInbox` with no
  authority transfer.
- `AgentPeer.artifact()` returns content only through an explicit export cap.
- Authentication and authorization are broker-mediated.
- Remote agents are untrusted services, not session principals.

Raw capOS caps should not cross an A2A bridge. A remote agent receives data,
message events, and artifact references, not authority. Agent-card capabilities
map to descriptors that the broker can review; they do not imply tool access
inside capOS.

## Security Model

Primary threats:

- prompt injection through web pages, tool results, logs, email, chat, or
  memory pages;
- malicious or compromised tools, skills, MCP servers, browser extensions, and
  model adapters;
- workspace escape through shell, filesystem, browser profile, CDP, downloads,
  or path tricks;
- secret exposure through prompts, tool results, screenshots, logs, memory, or
  remote embeddings;
- authority widening through agent-to-agent delegation;
- stale or poisoned memory becoming trusted context;
- runaway cost, process count, token use, or network use;
- false completion: agent claims work is done without verifying artifacts;
- review capture: same model/harness family produces work and review without
  independent checks.

Controls:

- exact-grant worker capsets;
- task-local workspaces and quotas;
- no ambient filesystem, network, process, browser, or secret access;
- structured tool descriptors and argument validation;
- per-tool `auto` / `consent` / `stepUp` / `forbidden` policy;
- fresh user presence for mutating/destructive calls;
- audit for every authority-touching action;
- source labels and memory provenance;
- deterministic verification tools where possible;
- independent reviewer roles with read-only caps;
- expiry and revocation for tasks, workers, browser profiles, model streams,
  and provider tokens.

## Resource Accounting

Hosted agents need first-class quotas:

- model input/output tokens;
- remote provider spend;
- wall-clock runtime;
- process and thread counts;
- memory and workspace bytes;
- source corpus bytes;
- vector index bytes;
- browser sessions and tabs;
- network requests and egress bytes;
- tool-call count by risk class;
- inbox message count, queued bytes, delivery rate, and replay-window entries;
- quarantined peer-message count by sender and task;
- approval prompt count to prevent consent fatigue.

Budgets belong to `AgentTask` and are enforced by the runner, broker, and
resource ledgers. A worker cannot extend its own budget. Budget extension is a
broker or user action.
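
One way to picture the "worker cannot extend its own budget" rule is to split the budget into two facets, so the worker simply never holds the object that can raise a limit. This is a sketch under that assumption; `WorkerMeter`, `BrokerControl`, and `new_task_budget` are hypothetical names, and Python does not enforce the separation the way real capabilities would.

```python
class BudgetExceeded(Exception):
    pass


class _Ledger:
    """Shared state behind both facets."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.used = {name: 0 for name in limits}


class WorkerMeter:
    """Facet handed to the worker: it can spend, not extend."""

    def __init__(self, ledger):
        self._ledger = ledger

    def charge(self, resource, amount):
        if amount < 0:
            raise ValueError("amount must be non-negative")
        limit = self._ledger.limits[resource]  # KeyError for unknown resources
        if self._ledger.used[resource] + amount > limit:
            # Nothing is recorded when the charge would exceed the limit.
            raise BudgetExceeded(resource)
        self._ledger.used[resource] += amount

    def remaining(self, resource):
        return self._ledger.limits[resource] - self._ledger.used[resource]


class BrokerControl:
    """Facet kept by the broker or user: it can raise a limit."""

    def __init__(self, ledger):
        self._ledger = ledger

    def extend(self, resource, amount):
        if amount <= 0:
            raise ValueError("extension must be positive")
        self._ledger.limits[resource] += amount


def new_task_budget(limits):
    """Create one ledger and return (worker facet, broker facet)."""
    ledger = _Ledger(limits)
    return WorkerMeter(ledger), BrokerControl(ledger)
```

For example, `meter, control = new_task_budget({"output_tokens": 100})` gives the runner a `meter` to pass to the worker, while `control` stays with the broker.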

## Implementation Phases

### Phase 0 - Research and design grounding

- Write targeted research notes for OpenClaw harness controls, MCP security,
  A2A, Gas Town orchestration, LLM Wiki memory, and browser automation risk.
- Decide which parts belong in capOS core versus a sibling
  `capos-agent-shell` repository.
- Define the minimum QEMU-hosted deterministic model and fake browser/exec
  tools needed for proof.

### Phase 1 - Single hosted task, deterministic model

- Add `HostedAgentService`, `AgentTask`, `AgentRunner`, and a deterministic
  `LanguageModel` test service.
- Create task workspace caps over existing storage primitives or a temporary
  in-memory substitute.
- Implement a read-only tool and a mutating fake tool with approval.
- Add a `make run-hosted-agent` QEMU proof.

### Phase 2 - Memory and wiki substrate

- Add `AgentMemory` with source, wiki, index, log, and lint concepts.
- Implement markdown-backed storage first.
- Add bounded retrieval by page and line range.
- Add source citations and label inheritance.
- Prove ingest, query, lint, and memory write rejection under policy.
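
A rough sketch of bounded retrieval with citations and label inheritance follows. The `Snippet` shape, the `pages` mapping, and the 40-line default cap are assumptions chosen for illustration; the proposal does not fix a line budget or a storage layout.

```python
from dataclasses import dataclass

MAX_LINES = 40  # hypothetical per-read cap


@dataclass(frozen=True)
class Snippet:
    page: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    text: str
    labels: frozenset

    def citation(self):
        return f"{self.page}#L{self.start}-L{self.end}"


def read_range(pages, page, start, end, max_lines=MAX_LINES):
    """Read at most max_lines lines from a page, carrying the page's labels.

    pages maps a page name to a (lines, labels) pair.
    """
    lines, labels = pages[page]
    if start < 1 or end < start or start > len(lines):
        raise ValueError("invalid line range")
    # Clamp to the page length and to the retrieval budget.
    end = min(end, len(lines), start + max_lines - 1)
    text = "\n".join(lines[start - 1:end])
    return Snippet(page, start, end, text, frozenset(labels))


def combined_labels(snippets):
    """A prompt built from several snippets inherits the union of their labels."""
    result = frozenset()
    for snippet in snippets:
        result |= snippet.labels
    return result
```

The returned range reflects what was actually read after clamping, so the citation stays accurate even when the request asked for more.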

### Phase 3 - Tool harnesses

- Add structured `CommandRunner` without arbitrary shell.
- Add `PatchSet` for file edits.
- Add a fake browser harness first, then real browser integration outside the
  kernel path.
- Add MCP import behind a tool-proxy policy review.
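
To make "structured `CommandRunner` without arbitrary shell" concrete, here is a validation-only sketch. `CommandSpec`, `build_invocation`, and the descriptor fields are hypothetical; the point is that the request is a program name plus an argument vector checked against a descriptor, and no shell ever parses the arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandSpec:
    """Hypothetical descriptor for one allowed program."""
    program: str
    allowed_flags: frozenset
    max_positional: int


def build_invocation(specs, program, argv):
    """Validate a program+argv request and return the exact argument vector."""
    spec = specs.get(program)
    if spec is None:
        raise PermissionError(f"program not allowed: {program!r}")
    positional = 0
    for arg in argv:
        if not isinstance(arg, str) or "\x00" in arg:
            raise ValueError("arguments must be strings without NUL bytes")
        if arg.startswith("-"):
            # Flags must be listed in the descriptor.
            if arg not in spec.allowed_flags:
                raise PermissionError(f"flag not allowed: {arg!r}")
        else:
            positional += 1
    if positional > spec.max_positional:
        raise ValueError("too many positional arguments")
    # Each argument is passed through as a literal string. Characters such
    # as ';' or '|' have no special meaning because no shell interprets them.
    return [spec.program, *argv]
```

This sketch stops at validation. How the validated vector is spawned, and under which capset, is left to the runner.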

### Phase 4 - Swarm scheduling

- Add durable subtask records and worker assignment.
- Add ephemeral worker processes with exact-grant capsets.
- Add reviewer workers with constrained caps.
- Add merge/release queue semantics for artifacts.
- Prove cancellation, worker timeout, handoff, and review failure.
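
The exact-grant and constrained-reviewer items can be sketched as a single grant check. `Cap`, `grant_capset`, the `"ro"` / `"rw"` modes, and the role strings are assumptions for this example. A worker receives exactly the requested caps, each of which must already belong to the task, and a reviewer is limited to read-only caps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cap:
    name: str
    mode: str  # "ro" or "rw" (hypothetical modes)


def grant_capset(task_caps, requested, *, role):
    """Return the exact capset for a new worker, or refuse the grant."""
    task_caps = frozenset(task_caps)
    requested = frozenset(requested)
    # Nothing is inherited implicitly: every cap must come from the task.
    missing = requested - task_caps
    if missing:
        names = sorted(cap.name for cap in missing)
        raise PermissionError(f"caps not held by task: {names}")
    if role == "reviewer" and any(cap.mode != "ro" for cap in requested):
        raise PermissionError("reviewer capsets must be read-only")
    return requested
```

The returned set is the whole of the worker's authority; anything the task holds but the request omits is simply absent.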

### Phase 5 - External ingress and providers

- Wire WebShellGateway agent task submission.
- Add webhook and scheduled trigger caps.
- Add provider-token caps and remote model backend policy.
- Add remote MCP/A2A adapters.
- Add browser direct-provider mode only after server-side tool execution and
  provider-session revocation/audit are implemented.

### Phase 6 - Applications

- Hosted coding assistant over capOS repository worktrees.
- Agent-assisted first-boot setup.
- Agent-maintained operator/project wiki.
- Aurelian Frontier NPCs and story-world workers.
- Monitoring/log investigation assistant.
- Personal assistant over approved chat/email/calendar adapters.

## Open Questions

- Should hosted agents live in this repository or a sibling
  `capos-agent-shell` repository once the capability interfaces stabilize?
- What is the minimum storage substrate for `AgentMemory` before persistence
  and file-backed `MemoryObject` are complete?
- Should the first command harness support any shell syntax, or only structured
  program+argv invocations?
- How should capOS represent browser state: as a task-local profile cap,
  service-owned profile cap, or user-owned delegated profile cap?
- Which memory writes require human review before becoming long-term memory?
- How should labels propagate from raw sources through wiki summaries,
  embeddings, and model prompts?
- What is the right review independence policy when the same model provider is
  used for implementation and review?
- How should an agent's track record be measured without overfitting to easy
  tasks or encouraging unsafe autonomy?
- How should A2A/MCP-imported tools be signed, pinned, reviewed, and revoked?
- What should be exposed in audit by default when prompts or tool outputs carry
  private content?
- How should hosted agents behave when session context expires while a task is
  mid-run?
- Can capOS use promise pipelining or notification objects to reduce tool-call
  latency without weakening approval gates?
- What formal properties should be specified for "model cannot acquire new
  authority except through broker-approved tool calls"?
- Which local embedding model is good enough for offline wiki search without
  adding unacceptable ISO size or trusted-build-input burden?
- What research is still needed for secure, deterministic browser automation in
  a capability OS?

## Relationship to Existing Proposals

- [Shell](shell-proposal.md): defines the native shell and agent mode as one
  interactive runner surface. This proposal defines long-lived hosted agents
  and swarms that may be launched from shell but are not part of shell itself.
- [Language Models and Agent Runtime](llm-and-agent-proposal.md): defines
  `LanguageModel`, `TextEmbedder`, model backends, and the basic tool-use loop.
  This proposal layers hosted task state, workspaces, memory, swarms, and
  external interoperability on top.
- [Realtime Voice Agent Shell](realtime-voice-agent-shell-proposal.md): voice
  sessions can submit hosted-agent tasks or control a live runner, but media
  transport remains separate.
- [Repository Composition](repository-composition-proposal.md): the runtime,
  providers, browser harnesses, and skills may eventually belong in a sibling
  repository; the capOS core keeps capability interfaces and authority policy.
- [System Monitoring](system-monitoring-proposal.md): hosted agents need audit,
  trace, status, and cost views.
- [Resource Accounting and Quotas](resource-accounting-proposal.md): hosted
  agents are a forcing function for token, provider, workspace, process, and
  network ledgers.
- [User Identity and Policy](user-identity-and-policy-proposal.md): session
  profile, guest/operator policy, step-up, and expiry decide agent authority.

## Research Still Needed

- OpenClaw threat model from primary advisories, not news summaries: gateway
  exposure, node hosts, skills, browser profiles, exec approvals, memory, and
  provider credentials.
- MCP security: stdio process spawning, remote auth, tool poisoning, prompt
  injection, marketplace signing, and per-tool permission descriptions.
- A2A security and identity: authentication, authorization, task provenance,
  artifact integrity, and non-transfer of authority.
- Browser automation containment: CDP risks, extension relays, logged-in
  profiles, downloads/uploads, arbitrary JS evaluation, clipboard, screenshots,
  and private-network access.
- Agent memory correctness: citation fidelity, contradiction detection, stale
  summaries, label propagation, hallucinated links, and human review workflow.
- Retrieval architecture: index-first wiki navigation versus vector RAG,
  hybrid search, reranking, snippet budgets, local embeddings, and remote
  embedding denial for high-label data.
- Swarm orchestration: when parallel agents improve throughput, when they
  create coordination debt, how to assign work, and how to prevent review
  capture.
- Evals: deterministic task harnesses for tool calls, memory ingest, prompt
  injection, browser tasks, code edits, review quality, and resource budget
  enforcement.
- Local model viability: smallest model that can follow schemas/tool calls,
  local embedding model choice, quantization, context budget, and ISO/storage
  impact.
- Provider policy: data-retention settings, regional routing, ephemeral
  credentials, revocation, spend controls, and audit of remote inference.
- Formal authority model: prove that model text, memory text, remote agent
  messages, and MCP descriptors cannot mint capOS authority.
- UX for approvals: avoiding consent fatigue while preserving fresh user
  presence for dangerous actions.
- Agent-maintained docs: how capOS should use its own proposals, backlog,
  research notes, and wiki artifacts as agent-legible harness inputs without
  making stale generated docs authoritative.