Proposal: capOS-Hosted Agent Swarms
capOS should eventually host OpenClaw-like personal agents and multi-agent workflows as ordinary capability-scoped services. The existing Language Models and Agent Runtime proposal defines the model capability surface and the single-session tool-use loop. This proposal covers the layer above it: long-lived hosted agents, workspace and memory layout, swarm orchestration, agent-to-agent coordination, and harness controls.
The first credible implementation is not a general “AI computer”. It is a controlled service graph:
- user-facing ingress through native shell, SSH/WebShellGateway, chat channels, webhooks, or scheduled triggers;
- a trusted capOS runner that owns session capabilities and enforces tool gates;
- narrow agent workers that receive only task-local workspace, retrieval, and tool caps;
- explicit memory and wiki services instead of hidden prompt state;
- durable task records, review gates, and attribution for multi-agent work.
This belongs outside the shell proposal. Shell mode remains one interactive runner surface. Hosted agents need persistent service state, remote ingress, work queues, memory compaction, swarm scheduling, and audit rules that would make the shell proposal too broad.
Research Baseline
Sources reviewed for this design:
- capOS research note, Hosted Agent Harnesses: <../research/hosted-agent-harnesses.md>
- OpenAI, Harness engineering: https://openai.com/index/harness-engineering/
- OpenAI, Agents SDK sandbox and model-native harness direction: https://openai.com/index/the-next-evolution-of-the-agents-sdk/
- OpenClaw documentation: home, agent runtime, workspace, memory, exec, browser, and multi-agent controls: https://openclawlab.com/en/, https://openclawlab.com/en/docs/concepts/agent/, https://openclawlab.com/en/docs/concepts/agent-workspace/, https://openclawlab.com/en/docs/concepts/memory/, https://openclawlab.com/en/docs/tools/exec/, https://openclawlab.com/en/docs/tools/browser/, https://openclawlab.com/en/docs/concepts/multi-agent/
- DeepWiki secondary project summaries for OpenClaw, OpenClaw skills, OpenManus, Microsoft Agent Framework, and AutoGen: https://deepwiki.com/openclaw/openclaw, https://deepwiki.com/openclaw/skills/2.2-agent-memory-persistence-pattern, https://deepwiki.com/openclaw/docs/6.3-web-search-and-browser-tools, https://deepwiki.com/FoundationAgents/OpenManus, https://deepwiki.com/microsoft/agent-framework, https://deepwiki.com/microsoft/ai-agents-for-beginners/3.1-autogen-framework
- Karpathy, LLM Wiki: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f
- Abdullin, Schema-Guided Reasoning: https://abdullin.com/schema-guided-reasoning/
- MetaGPT: https://arxiv.org/abs/2308.00352
- Generative Agents / Smallville: https://arxiv.org/abs/2304.03442
- Gas Town documentation: https://docs.gastownhall.ai/, https://docs.gastownhall.ai/usage/
- Model Context Protocol: https://modelcontextprotocol.io/docs/getting-started/intro, https://modelcontextprotocol.io/docs/learn/architecture
- Agent2Agent Protocol: https://github.com/a2aproject/A2A, https://a2a-protocol.org/latest/specification/
- Microsoft AutoGen and Microsoft Agent Framework: https://www.microsoft.com/en-us/research/project/autogen/overview/, https://learn.microsoft.com/en-us/agent-framework/overview/
- LangGraph durable execution: https://docs.langchain.com/oss/python/langgraph/durable-execution
- CrewAI: https://docs.crewai.com/
- CAMEL-AI: https://docs.camel-ai.org/get_started/introduction
There is substantial low-quality agent SEO around OpenClaw and related systems. This proposal relies on primary docs, official project pages, and arXiv papers, and treats DeepWiki pages only as secondary codebase summaries. News and social reports may motivate later risk research, but they are not treated as design authority.
What Current Agent Harnesses Actually Do
The useful pattern is not “model plus tools”. It is a harness that controls what the model can inspect, what it can change, how work survives context loss, and where human approval enters the loop.
OpenAI’s harness engineering writeup is the cleanest framing for capOS: repository-local, versioned artifacts are what the agent can reason about; knowledge in chat threads, documents, and people’s heads is effectively absent unless compiled into files, schemas, tests, and executable plans. The same post argues for mechanically enforced architecture, validated boundaries, and agent-legible systems over ad-hoc documentation. The 2026 Agents SDK direction adds an explicit model-native harness, controlled workspaces, sandbox execution, filesystem tools, MCP, skills, AGENTS.md-style instructions, shell execution, and structured patch tools.
OpenClaw shows the personal-agent product shape:
- local-first channel ingress through chat apps, webhooks, cron, and a gateway;
- a gateway security boundary for channels and tool execution;
- an agent runtime with a workspace as the default tool cwd;
- injected bootstrap files such as `AGENTS.md`, `TOOLS.md`, `USER.md`, and identity/persona files;
- built-in read, exec, edit/write, browser, web, process, memory, and skill surfaces;
- a browser harness with managed profiles, snapshots, screenshots, action refs, CDP routing, and optional arbitrary JavaScript evaluation;
- an exec harness with host selection (`sandbox`, `gateway`, `node`), security modes (`deny`, `allowlist`, `full`), approval prompts, timeouts, background sessions, PTY support, process polling, and path/env restrictions;
- markdown memory where files are the source of truth, plus semantic search, line-range reads, SQLite indexes, local/remote embeddings, and hybrid search;
- per-agent workspaces, sandbox settings, and tool allow/deny lists.
The important negative lesson is also explicit in OpenClaw’s docs: a workspace is not automatically a sandbox. If sandboxing is off, absolute paths and host tools can still reach outside the workspace. capOS should not reproduce that ambiguity. A capOS agent workspace must be a capability namespace by default, not a convention over a host filesystem.
DeepWiki’s accessible summaries add useful implementation-level signals:
- OpenClaw exposes tools as functional capabilities and skills as modular `SKILL.md` extensions, with a personal-assistant trust model, security audit, and sandboxing options.
- OpenClaw memory skills converge on durable, retrievable, self-maintaining memory because a single growing `MEMORY.md` overflows context and loses structure.
- OpenClaw web/browser docs describe dedicated managed browser profiles, CDP control through the gateway, SSRF checks, provider-backed web search, fetch normalization, and active memory integration.
- OpenManus uses a think-act cycle with tool execution, multi-provider LLMs, MCP integration, and sandboxed code/browser automation.
- Microsoft Agent Framework and AutoGen emphasize graph/workflow orchestration, checkpointing, human-in-the-loop, event-driven actor-style communication, distributed runtimes, tools, memory, observability, and MCP/A2A integrations.
For this repository itself, applying OpenAI-style harness engineering means turning capOS’s docs, workplans, run targets, QEMU proofs, proposal statuses, research notes, and schema authority semantics into mechanically navigable agent inputs. That repository-local work is owned by capOS Repository Harness Engineering, with source grounding in Hosted agent harnesses.
Product Goal
The visible milestone is:
`make run-hosted-agent` boots capOS in QEMU, starts a resident hosted-agent service graph, accepts a scripted user request, creates a task-local workspace, runs one or more bounded agent workers through a deterministic model service, uses retrieval/wiki context, executes one read-only tool automatically, requires approval for one mutating tool, records attributed audit output, and shuts down without leaking session, model, or host authority to the worker.
Later milestones add real model backends, web ingress, chat ingress, browser automation, multi-agent swarms, and remote/provider interoperability.
Design Principles
- **Harness first, model second.** The hosted-agent service is primarily a control plane for workspaces, tools, memory, approvals, lifecycle, and audit. Model selection is a replaceable backend decision.
- **Agents are processes with caps, not identities with ambient power.** An agent worker has exactly the caps minted for one session, task, and phase. It does not inherit the operator’s whole world.
- **All tool execution is mediated.** The model proposes structured tool calls. The runner validates descriptors, arguments, turn binding, policy, budget, and approval before invocation.
- **Memory is an artifact, not a hidden model property.** Durable facts, summaries, task logs, and wiki pages live in capability-scoped files or services with provenance, review status, and retention policy.
- **Swarm work is durable structured data.** Tasks, assignments, handoffs, reviews, votes, failures, and merge decisions must outlive any model context window.
- **Human review is a capability gate.** The system should support both high-autonomy local demos and conservative operator policy, but destructive or authority-widening actions require explicit fresh consent or step-up.
- **Remote agent interoperability is data-plane only at first.** MCP and A2A style bridges may expose descriptors and messages, but they do not carry raw capOS authority.
- **capOS should be stricter than desktop harnesses.** Browser profiles, shell execution, provider credentials, memory stores, and file workspaces are separate capabilities with narrow lifetime and auditable grants.
- **Shared resources need coordination objects.** A git repo, task queue, wiki, browser profile, or shared todo list is not just a file path. The agent harness must expose owners, leases, versions, watches, and conflict reports before workers mutate shared state.
- **Incoming agent messages are untrusted work items.** A chat message from another agent can carry status, questions, handoffs, artifacts, or requests. It must not directly alter prompt state, execute tools, widen caps, or override task policy.
System Topology
```mermaid
flowchart LR
User[User / channel / cron / webhook] --> Gateway[Ingress Gateway]
Gateway --> Broker[AuthorityBroker]
Broker --> Host[HostedAgentService]
Host --> Task[AgentTask<br/>durable state]
Host --> Runner[AgentRunner<br/>trusted tool gate]
Host --> Memory[AgentMemory<br/>wiki + logs + search]
Host --> Model[LanguageModel<br/>local or remote backend]
Host --> Scheduler[SwarmScheduler]
Scheduler --> W1[Worker process<br/>task workspace caps]
Scheduler --> W2[Worker process<br/>task workspace caps]
Scheduler --> R[Reviewer process<br/>read + critique caps]
Runner --> Tools[Typed capOS tools]
Runner --> Approval[ApprovalClient]
Runner --> Audit[AuditLog]
Memory --> Store[(Workspace / Wiki / Vector Index)]
```
The kernel does not need agent semantics. It needs process isolation, endpoint invocation metadata, MemoryObject/file-backed storage, capability transfer, and resource accounting. The agent system is a userspace service graph.
Core Capabilities
HostedAgentService
Owns hosted-agent lifecycle for one broker policy domain:
- create a task from a user request, webhook, schedule, or shell command;
- allocate a task workspace and memory scope;
- select a model profile and runner policy;
- start workers with exact-grant capsets;
- enforce task budgets and cancellation;
- publish task status to shell, web, or chat surfaces;
- close, archive, or purge task state.
AgentTask
Durable task record:
- request, normalized objective, requester session reference, and ingress provenance;
- workspace root cap, memory scope cap, allowed tools, and budgets;
- model profile and harness version;
- worker assignments and state transitions;
- links to artifacts, audit records, approvals, and review results;
- terminal status (`open`, `blocked`, `needsApproval`, `reviewing`, `done`, `failed`, `cancelled`, `expired`).
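The terminal statuses imply a small task state machine. A minimal Python sketch follows; the allowed-transition table is an illustrative assumption, not part of the proposal:

```python
# Illustrative AgentTask state machine. The transition table below is an
# assumption for demonstration; the real policy would live in the service.
ALLOWED = {
    "open": {"blocked", "needsApproval", "reviewing", "done", "failed", "cancelled", "expired"},
    "blocked": {"open", "failed", "cancelled", "expired"},
    "needsApproval": {"open", "failed", "cancelled", "expired"},
    "reviewing": {"open", "done", "failed", "cancelled"},
    # terminal states admit no further transitions
    "done": set(), "failed": set(), "cancelled": set(), "expired": set(),
}

def transition(task: dict, new_state: str) -> dict:
    """Return an updated task record, refusing illegal transitions."""
    cur = task["status"]
    if new_state not in ALLOWED.get(cur, set()):
        raise ValueError(f"illegal transition {cur} -> {new_state}")
    return {**task, "status": new_state, "history": task["history"] + [new_state]}

task = {"id": "task-1", "status": "open", "history": ["open"]}
task = transition(task, "needsApproval")
task = transition(task, "open")
task = transition(task, "done")
```

Returning a new record instead of mutating in place keeps the durable task log append-friendly.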
AgentRunner
Trusted loop executor:
- builds tool descriptors from held caps and broker policy;
- calls `LanguageModel.stream` or `complete`;
- validates structured tool calls;
- applies schema-guided reasoning templates for planner/reviewer tasks;
- runs guard checks before and after tool execution;
- truncates and redacts tool results;
- appends conversation and action records;
- handles cancellation, timeout, retry, and model failure.
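The runner loop can be condensed into a validation pipeline around the model call. This is an illustrative Python sketch, not the capOS API; the descriptor, policy, and model shapes are assumed:

```python
# Sketch of one AgentRunner turn: the model proposes structured tool calls,
# and the runner validates each against held descriptors and policy before
# invoking. Failures become ordinary tool results, never silent retries.

def run_turn(model, descriptors, policy, invoke, transcript):
    reply = model(transcript, descriptors)            # model proposes tool calls
    for call in reply.get("tool_calls", []):
        desc = descriptors.get(call["tool"])
        if desc is None:
            result = {"error": "unknown tool"}        # call outside held caps
        elif not policy(call):
            result = {"error": "denied by policy"}
        else:
            raw = invoke(call)
            # truncate tool output to the descriptor's byte budget
            result = {"output": raw[:desc["max_output_bytes"]]}
        transcript.append({"role": "tool", "call": call, "result": result})
    return transcript
```

Appending denied calls to the transcript keeps the model's view honest and leaves an audit trail.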
AgentMemory
Information organization layer:
- append-only daily task log;
- curated long-term project memory;
- source store for immutable raw inputs;
- LLM-maintained wiki pages with source citations;
- index and log files for cheap navigation;
- optional BM25/vector hybrid search and reranking;
- stale/contradiction/orphan-page lint;
- per-session and per-project visibility controls.
SwarmScheduler
Multi-agent orchestration:
- decomposes work into durable sub-tasks;
- assigns workers by role, available caps, model profile, and track record;
- creates task-local worktrees or equivalent namespace forks for code work;
- supervises handoff and timeout;
- asks reviewer workers for critique under read-only or constrained write caps;
- emits merge/release requests only after gates pass.
Workspace Model
Desktop harnesses commonly treat a workspace as a cwd convention. capOS should treat a workspace as a capability namespace:
- `WorkspaceRoot`: scoped directory-like cap for a task.
- `SourceMount`: read-only cap to immutable sources.
- `Scratch`: writeable temporary storage with quota and TTL.
- `ArtifactOutbox`: explicit export path for user-visible artifacts.
- `PatchSet`: structured edit proposal, not arbitrary writes by default.
- `SecretsView`: normally absent; if present, returns typed opaque handles, not strings.
Default policy:
- read-only source mounts unless the task explicitly asks for edits;
- no absolute path escape because there is no global filesystem path;
- generated artifacts are quarantined until reviewed or explicitly released;
- tool outputs are capped and stored with provenance;
- workspaces expire unless promoted to project memory.
This makes OpenClaw-style sandbox versus host ambiguity unnecessary.
Authority is not inferred from where a command happens to run.
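A sketch of what "no absolute path escape" means in practice, assuming a simple workspace-relative resolver (purely illustrative; real capOS paths would be capability lookups, not strings):

```python
# Sketch: path resolution inside a workspace capability namespace. There is
# no global filesystem path, so absolute prefixes have no meaning and ".."
# cannot climb above the workspace root. Illustrative only.
from pathlib import PurePosixPath

def resolve(workspace_root: str, requested: str) -> str:
    parts = []
    for part in PurePosixPath(requested).parts:
        if part == "/":
            continue                     # absolute markers are ignored, not obeyed
        if part == "..":
            if not parts:
                raise PermissionError("path escapes workspace")
            parts.pop()
        else:
            parts.append(part)
    return f"{workspace_root}/" + "/".join(parts)
```

Note that `/etc/passwd` resolves *inside* the workspace rather than on the host, which is the behavioral difference from a cwd convention.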
Shared Resource Coordination
Agent swarms fail in ordinary repositories and shared task lists when every worker believes it is alone. capOS should model shared resources explicitly:
- `SharedResource`: git repository, task list, wiki page tree, browser profile, memory store, package cache, or external service account.
- `ResourceLease`: exclusive or shared claim with owner, task, phase, scope, expiry, renewal policy, and release reason.
- `ResourceVersion`: observed revision, generation, branch head, page hash, or compare-and-swap token.
- `ResourceWatch`: subscription to resource updates, lease changes, conflicts, and merge/release queue events.
- `ConflictReport`: structured notice that two tasks touched the same file, todo item, wiki page, browser profile, credential scope, or external object.
Minimum policy:
- leases are coordination metadata, not write authority; mutation still requires the relevant workspace, patch, tool, or service cap;
- every mutating task declares the resource scopes it expects to touch;
- exclusive resources reject overlapping leases unless a supervisor approves a shared mode;
- shared resources require versioned writes or patch sets;
- stale leases expire and emit events instead of silently blocking work;
- workers receive conflict reports as structured context, not as informal chat;
- merge/release queues serialize publication to user-visible state;
- audit records include resource scope, observed version, write version, and approving actor.
Concrete resource policies:
- Git repositories: one task worktree and branch per worker, path/subsystem claims for high-conflict areas, merge queue before mainline publication, and conflict reports when another task changes claimed paths.
- Shared todo lists: item-level claims, item generation numbers, compare-and-swap updates, and supervisor escalation for duplicate ownership.
- Wiki and memory pages: page leases or patch sets, source citations, contradiction checks, and freshness labels before compiled memory becomes trusted context.
- Browser profiles: exclusive lease by default because cookies, local storage, downloads, and screenshots collapse many unrelated authorities.
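The lease-plus-versioned-write split can be sketched as follows. The object shapes are illustrative assumptions; real capOS resources would be capability services, not in-process objects:

```python
# Sketch: a lease is coordination metadata, not write authority. The write
# itself still requires a compare-and-swap against the observed version, so
# a stale lease or stale observation produces a conflict, not a lost update.

class SharedResource:
    def __init__(self):
        self.version = 0
        self.lease = None                # (task_id, expiry) or None

    def acquire(self, task_id, now, ttl):
        """Claim an exclusive lease; returns the observed version, or None."""
        if self.lease and self.lease[1] > now and self.lease[0] != task_id:
            return None                  # overlapping exclusive lease rejected
        self.lease = (task_id, now + ttl)
        return self.version

    def write(self, task_id, observed_version):
        """Versioned write: succeeds only if the observation is current."""
        if observed_version != self.version:
            return {"conflict": True, "current": self.version}
        self.version += 1
        return {"conflict": False, "current": self.version}
```

Expired leases are simply ignored on the next acquire, matching the "stale leases expire and emit events" policy rather than blocking work forever.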
For capOS repository work specifically, this maps to the existing requirement that each agent uses a dedicated branch and worktree. A future harness should make that visible through an active-work registry, claimed resource scopes, review findings, and merge-queue state instead of relying on each agent to infer it from git state and chat history.
Agent Inboxes and Inter-Agent Messages
Free-form peer chat is useful for coordination, but it is a poor authority boundary. capOS should deliver messages through an explicit `AgentInbox` capability owned by the runner or task, not by direct prompt injection.
An incoming message should be a structured `AgentMessage` event:

```yaml
id: msg-...
sender: agent-or-peer-id
sender_task: task-...
recipient_task: task-...
kind: status
# status | question | handoff | reviewFinding | resourceEvent |
# artifactReady | approvalRequest | interrupt
causal_parent: msg-or-task-event-id
body: bounded markdown or structured payload
artifact_refs:
  - artifact-...
requested_actions:
  - proposed action descriptor
requested_authority:
  - capability descriptor, never a raw cap
expires_at_unix_ms: 1893456000000
```
Delivery rules:
- the runner validates sender identity, task relationship, size, schema, expiry, and policy before the model sees the message;
- message ids are deduplicated per sender and task within a bounded replay window;
- old causal parents, duplicate approval requests, and duplicate interrupts are quarantined instead of redelivered;
- per-sender and per-task quotas cap message count, queued bytes, delivery rate, and model-visible inbox bytes;
- peers that exceed quota or trigger repeated quarantine are rate-limited or muted until supervisor review;
- unknown senders, stale tasks, malformed payloads, and policy-incompatible requests are quarantined for supervisor review;
- artifact references require separate artifact caps before content is read;
- requested actions become proposed tool calls or task changes, never automatic execution;
- requested authority becomes an approval request, never ambient delegation;
- interrupts and approval requests may receive priority, but still pass through policy and audit;
- every delivered message carries sender, task, and causal-parent metadata so a worker can distinguish user intent, supervisor instruction, peer status, and untrusted external input.
This gives agents the useful parts of chat messages from other agents without making chat an authority channel. It also gives the scheduler a place to surface shared-resource events such as “another worker claimed this path”, “your todo item changed”, or “merge queue rejected your patch”.
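The delivery rules above can be condensed into a small validation sketch. Field names follow the `AgentMessage` example; the quota, replay window, and size limits are illustrative assumptions:

```python
# Sketch of AgentInbox delivery validation: unknown senders and oversized
# payloads are quarantined, duplicate ids within the replay window are
# dropped, and per-sender quotas rate-limit before anything reaches the model.

def deliver(inbox, msg, known_senders, seen_ids, quota, max_body=4096):
    sender = msg.get("sender")
    if sender not in known_senders:
        return "quarantined"                 # unknown sender -> supervisor review
    if len(msg.get("body", "")) > max_body:
        return "quarantined"                 # oversized payload
    if msg["id"] in seen_ids.get(sender, set()):
        return "dropped-duplicate"           # replay within the dedup window
    if quota.get(sender, 0) <= 0:
        return "rate-limited"                # sender exhausted its message quota
    seen_ids.setdefault(sender, set()).add(msg["id"])
    quota[sender] -= 1
    inbox.append(msg)                        # only now does it become model-visible
    return "delivered"
```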
Tool Harness Controls
capOS should support the same classes of controls as current harnesses, but with capability-native semantics:
| Tool class | Desktop harness pattern | capOS target |
|---|---|---|
| File read | workspace-relative reads, memory reads | directory/file caps with line-range and byte-budget policy |
| File write/edit | direct edits or patch tool | PatchSet plus approval, or write cap scoped to scratch/outbox |
| Shell/exec | host/sandbox/node, allowlist/full, approvals | CommandRunner cap with binary caps, argv schema, cwd cap, env cap, PTY cap, timeout, output cap |
| Browser | CDP profile, snapshots, action refs, screenshots | BrowserSession cap with profile isolation, origin policy, JS-eval deny by default, screenshot/snapshot separation |
| Web/fetch | provider-specific tool | HttpEndpoint / Fetch caps scoped by origin, method, headers, and data labels |
| Model | provider API key or local model | LanguageModel cap from broker, no provider secret strings |
| Memory | markdown files plus search plugin | AgentMemory cap with source/wiki/index/search subcaps |
| Agent-to-agent | session send/spawn, A2A-like messages | AgentPeer endpoint with message schema, no implicit authority transfer |
Execution policy modes should reuse the LLM proposal’s `auto`, `consent`, `stepUp`, and `forbidden` modes, but attach them to typed capability methods and task phases. A tool may be `auto` during read-only research and `consent` when called from a mutating phase.
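A minimal sketch of phase-dependent policy lookup, assuming a default-deny table keyed by tool and task phase (the table contents are invented for illustration):

```python
# Sketch: execution mode as a function of (tool, phase). Anything not
# explicitly granted is forbidden, so a missing entry can never widen
# authority. Tool names and the table itself are illustrative assumptions.
POLICY = {
    ("file.read", "research"):   "auto",
    ("file.read", "mutate"):     "auto",
    ("patch.apply", "research"): "forbidden",
    ("patch.apply", "mutate"):   "consent",
    ("secrets.view", "mutate"):  "stepUp",
}

def mode_for(tool: str, phase: str) -> str:
    # default-deny: unknown (tool, phase) pairs are forbidden
    return POLICY.get((tool, phase), "forbidden")
```

The same tool carrying different modes in different phases is what makes research cheap without making mutation ambient.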
Browser Harness
Browser automation is high-risk because logged-in web state, screenshots, and page JavaScript collapse many trust boundaries. A capOS browser harness should:
- launch a dedicated browser profile per task or per approved long-lived agent;
- keep personal/operator browser profiles out of scope by default;
- expose snapshots and screenshots as separate capabilities;
- require explicit policy for JavaScript evaluation;
- bind every action to a prior snapshot ref when possible;
- treat page text, DOM, screenshots, downloads, and clipboard data as hostile;
- block private-network and metadata-service fetches unless broker policy grants them;
- isolate cookies and credentials by profile cap;
- make remote CDP-style control a future bridge, never the baseline.
The first QEMU proof should use a deterministic fake browser tool, not a full Chromium port.
Exec Harness
The first exec surface should not be a Unix shell. It should be a command capability with explicit shape:
```capnp
interface CommandRunner {
  run @0 (req :CommandRequest) -> (result :CommandResult);
}
```
The request should name a pre-granted program or command class, not arbitrary shell text. If a POSIX layer later exists, shell execution can be a separate high-risk tool with parsing, approval, and audit.
Minimum controls:
- allowed program identity is resolved before execution;
- argv is structured, not interpolated;
- environment is built from allowlisted variables and typed secret handles;
- working directory is a `WorkspaceRoot` or subdirectory cap;
- output byte and line limits are mandatory;
- timeout and kill semantics are mandatory;
- background processes require an explicit `ProcessSession` cap;
- PTY is a separate grant;
- network access is absent unless the child receives a network cap;
- mutating commands require approval unless the task owns the target scratch or patch workspace.
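These controls can be sketched as a request validator that runs before any execution. All field names and limits are illustrative assumptions, not the real `CommandRequest` schema:

```python
# Sketch of CommandRunner request validation: pre-granted program identity,
# structured argv (never interpolated shell text), allowlisted environment,
# and a mandatory bounded timeout. Returns a list of policy errors.

def validate_command(req, granted_programs, env_allowlist, max_timeout_ms=60_000):
    errors = []
    if req.get("program") not in granted_programs:
        errors.append("program not granted")
    argv = req.get("argv", [])
    if not all(isinstance(a, str) for a in argv):
        errors.append("argv must be a list of strings, not shell text")
    for key in req.get("env", {}):
        if key not in env_allowlist:
            errors.append(f"env var not allowlisted: {key}")
    t = req.get("timeout_ms")
    if not isinstance(t, int) or not (0 < t <= max_timeout_ms):
        errors.append("timeout is mandatory and bounded")
    return errors          # empty list means the request may proceed
```

Collecting all errors instead of failing fast gives the model one structured result to react to rather than a sequence of retries.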
Memory, Wiki, and Retrieval
Karpathy’s LLM Wiki pattern is a better fit for capOS than an unstructured vector database as the primary memory. The design has three layers:
- immutable raw sources;
- an LLM-maintained markdown wiki of summaries, entity pages, concept pages, comparisons, and synthesis;
- a schema/instruction file that defines page layout, ingest, query, lint, and update conventions.
The useful operations are:
- Ingest: read a source, write or update wiki pages, update index, append log.
- Query: read the index, inspect relevant pages, synthesize an answer with citations, optionally file useful answers back into the wiki.
- Lint: find contradictions, stale claims, orphan pages, missing links, weak citations, and data gaps.
capOS should implement this as a service rather than only as files:
- `SourceCorpus`: immutable source handles with digest, label, owner, and TTL.
- `WikiPage`: generated markdown plus source citations and confidence status.
- `WikiIndex`: content-oriented page catalog, cheap enough for the agent to read first.
- `WikiLog`: append-only operation timeline.
- `WikiLint`: typed findings for contradictions, missing citations, stale pages, orphan pages, and access-label drift.
- `SearchIndex`: optional BM25/vector hybrid index over approved pages and source chunks.
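As a concreteness check, two of the `WikiLint` findings can be sketched over a tiny in-memory wiki; the page shape (links, citations) is an assumption for illustration:

```python
# Sketch of WikiLint over an in-memory wiki: flag pages without citations
# and pages nothing links to (orphans). The page schema is illustrative.

def lint(wiki: dict) -> list:
    findings = []
    linked = {t for page in wiki.values() for t in page.get("links", [])}
    for title, page in wiki.items():
        if not page.get("citations"):
            findings.append(("missing-citations", title))
        if title not in linked and title != "index":
            findings.append(("orphan-page", title))
    return findings
```

Typed findings like these become ordinary tool results that an agent can act on, rather than free-text complaints.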
OpenClaw’s memory docs are a practical baseline: markdown is the source of truth, daily logs and curated `MEMORY.md` are separate, semantic search returns bounded snippets with file and line ranges, indexes are per-agent, and local embeddings can avoid remote leakage. capOS should add hard provenance, labels, and write authority.
Retrieval Rules
- Retrieval returns bounded snippets, not whole private files by default.
- Every synthesized claim that leaves the task should carry source links or be marked uncited.
- Wiki pages inherit the maximum confidentiality label of their sources unless a trusted redaction step lowers it.
- Memory writes require a policy decision: transient task log, project wiki, user memory, or rejected.
- Cross-agent memory access is explicit. A reviewer can read task artifacts without inheriting private user memory.
- Remote embedding backends are denied for high-label memory.
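The label-inheritance rule can be sketched as a maximum over a small label lattice; the lattice itself (public < internal < secret) is an illustrative assumption:

```python
# Sketch of confidentiality-label inheritance: a derived wiki page takes the
# maximum label of its sources unless a trusted redaction step lowers it.
LEVEL = {"public": 0, "internal": 1, "secret": 2}

def page_label(source_labels, redacted_to=None):
    inherited = max(source_labels, key=lambda l: LEVEL[l])
    if redacted_to is not None and LEVEL[redacted_to] < LEVEL[inherited]:
        return redacted_to      # only a trusted redaction step may lower the label
    return inherited
```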
Schema-Guided Reasoning
Abdullin’s Schema-Guided Reasoning pattern is directly useful for capOS: force the model to fill typed intermediate structures in a known order, validate them, and test them. It is not a substitute for capability policy, but it is a good harness technique for bounded agent roles.
Use SGR for:
- task intake: classify objective, risk, needed capabilities, and missing clarifications;
- plan decomposition: produce sub-tasks, dependencies, verification gates, and rollback paths;
- tool-call review: explain why a call is necessary and what authority it touches before approval;
- source ingest: extract claims, citations, contradictions, and affected pages;
- code review: enumerate behavioral risks, security risks, tests, and residual uncertainty;
- final handoff: summarize artifacts, verification, open risks, and memory updates.
Each schema should be a Cap’n Proto or JSON-schema-like type with versioning, test fixtures, and guardrails. The runner should validate the structure before any action, and failures should become ordinary tool results rather than hidden prompt retries.
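A task-intake schema of this kind might look like the following Python sketch; the field names and risk vocabulary are illustrative assumptions, not a capOS schema:

```python
# Sketch of schema-guided task intake: the model fills a typed structure
# and the runner validates it before any action. Validation failures become
# ordinary results the model can correct, not hidden prompt retries.
from dataclasses import dataclass, field

RISKS = {"readOnly", "mutating", "destructive"}

@dataclass
class TaskIntake:
    objective: str
    risk: str
    needed_capabilities: list = field(default_factory=list)
    clarifications: list = field(default_factory=list)

def validate_intake(raw: dict) -> TaskIntake:
    if not raw.get("objective", "").strip():
        raise ValueError("objective is required")
    if raw.get("risk") not in RISKS:
        raise ValueError("risk must be one of " + ", ".join(sorted(RISKS)))
    return TaskIntake(
        objective=raw["objective"],
        risk=raw["risk"],
        needed_capabilities=list(raw.get("needed_capabilities", [])),
        clarifications=list(raw.get("clarifications", [])),
    )
```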
Swarm Patterns
MetaGPT / Role Pipelines
MetaGPT’s useful contribution is not the specific software-company metaphor. It encodes standard operating procedures into prompt sequences and assigns roles so intermediate artifacts can be verified. capOS should borrow the artifact gates:
- product/task brief;
- requirements and constraints;
- design sketch;
- implementation plan;
- implementation;
- tests and verification;
- review;
- release/handoff.
Do not hard-code “PM”, “architect”, and “engineer” as kernel concepts. They are runner roles backed by schemas, caps, and task state.
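The artifact gates can be sketched as an ordered approval pipeline; the short gate names compress the list above, and the strict ordering rule is an assumption for illustration:

```python
# Sketch: artifact-gate ordering for a role pipeline. A phase may start only
# after every earlier artifact is approved; the first unapproved gate is the
# current blocker.
GATES = ["brief", "requirements", "design", "plan", "implementation",
         "verification", "review", "release"]

def next_gate(approved):
    for gate in GATES:
        if gate not in approved:
            return gate          # first unapproved gate blocks the pipeline
    return None                  # all gates passed

def may_start(gate, approved):
    idx = GATES.index(gate)
    return all(g in approved for g in GATES[:idx])
```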
Smallville / Generative Agents
The Generative Agents paper is useful for long-lived NPCs, companion agents, and simulations. Its memory stream, reflection, and planning loop explains how agents can appear coherent over time. capOS should use it cautiously:
- good for adventure NPCs, training simulations, social workflows, and explainable daily plans;
- bad as a direct authority model because believable behavior is not safe behavior;
- memory/reflection outputs must be low-authority data until reviewed or compiled into a scoped wiki.
Gas Town / Durable Agent Work
Gas Town’s useful pattern is persistent orchestration: roles, durable work objects, attribution, worker lifecycles, worktrees, convoys, merge queues, and supervision. capOS should borrow:
- one task object per unit of work;
- explicit worker lifecycle classes: persistent worker, ephemeral worker, reviewer, supervisor;
- task-local worktrees or namespace forks;
- merge/release queues;
- per-action attribution and track record;
- handoff records when an agent loses context or is recycled.
capOS should not borrow the role vocabulary or assume git is the only state substrate. For code work, git/worktrees are excellent. For OS services, the same pattern should map to `AgentTask`, `PatchSet`, `Artifact`, and `ReviewFinding` capabilities.
Interoperability
MCP
MCP is a useful external compatibility layer for tools, resources, and prompts. Its architecture is JSON-RPC over stdio or HTTP, with client/server capability negotiation and primitives for tools, resources, prompts, sampling, elicitation, logging, and experimental tasks.
capOS should treat MCP as an adapter boundary:
- an MCP server can be hosted as a low-authority process behind a capOS tool proxy;
- an MCP client can import external tools only after broker review;
- MCP tool descriptors are translated into capOS `ToolDescriptor` values;
- MCP tool calls execute through runner policy, not directly from the model;
- stdio MCP servers run without ambient filesystem/network unless granted caps;
- remote MCP uses `HttpEndpoint` plus explicit auth/token caps;
- MCP sampling/elicitation must not bypass runner approval or user-presence policy.
The risk is tool-marketplace sprawl: tools with similar names, hidden network behavior, local process execution, and prompt-injection-sensitive resources. capOS should require provenance, signing, version pinning, permission review, and sandboxed execution for imported MCP servers.
A2A / Agent-to-Agent
A2A is the right primary protocol reference for cross-agent interoperability: agent cards, peer discovery, modality negotiation, task collaboration, text, files, structured data, and streaming or push delivery. The first capOS bridge should still be narrower than the full protocol surface:
- `AgentPeer.describe()` returns identity, capabilities, cost, labels, and accepted task/message schemas.
- `AgentPeer.send()` imports a task or message into `AgentInbox` with no authority transfer.
- `AgentPeer.artifact()` returns content only through an explicit export cap.
- Authentication and authorization are broker-mediated.
- Remote agents are untrusted services, not session principals.
Raw capOS caps should not cross an A2A bridge. A remote agent receives data, message events, and artifact references, not authority. Agent-card capabilities map to descriptors that the broker can review; they do not imply tool access inside capOS.
Security Model
Primary threats:
- prompt injection through web pages, tool results, logs, email, chat, or memory pages;
- malicious or compromised tools, skills, MCP servers, browser extensions, and model adapters;
- workspace escape through shell, filesystem, browser profile, CDP, downloads, or path tricks;
- secret exposure through prompts, tool results, screenshots, logs, memory, or remote embeddings;
- authority widening through agent-to-agent delegation;
- stale or poisoned memory becoming trusted context;
- runaway cost, process count, token use, or network use;
- false completion: agent claims work is done without verifying artifacts;
- review capture: same model/harness family produces work and review without independent checks.
Controls:
- exact-grant worker capsets;
- task-local workspaces and quotas;
- no ambient filesystem, network, process, browser, or secret access;
- structured tool descriptors and argument validation;
- per-tool `auto`/`consent`/`stepUp`/`forbidden` policy;
- fresh user presence for mutating/destructive calls;
- audit for every authority-touching action;
- source labels and memory provenance;
- deterministic verification tools where possible;
- independent reviewer roles with read-only caps;
- expiry and revocation for tasks, workers, browser profiles, model streams, and provider tokens.
Resource Accounting
Hosted agents need first-class quotas:
- model input/output tokens;
- remote provider spend;
- wall-clock runtime;
- process count and threads;
- memory and workspace bytes;
- source corpus bytes;
- vector index bytes;
- browser sessions and tabs;
- network requests and egress bytes;
- tool-call count by risk class;
- inbox message count, queued bytes, delivery rate, and replay-window entries;
- quarantined peer-message count by sender and task;
- approval prompt count to prevent consent fatigue.
Budgets belong to `AgentTask` and are enforced by the runner, broker, and resource ledgers. A worker cannot extend its own budget. Budget extension is a broker or user action.
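A budget ledger with this property can be sketched directly; the categories, limits, and actor check are illustrative assumptions:

```python
# Sketch of runner-enforced task budgets: workers draw budgets down and can
# never extend them. Extension is reserved for broker or user actors.

class TaskBudget:
    def __init__(self, limits: dict):
        self._limits = dict(limits)
        self._used = {k: 0 for k in limits}

    def charge(self, category: str, amount: int) -> bool:
        """Record usage; return False once the category budget is exhausted."""
        if self._used[category] + amount > self._limits[category]:
            return False                  # runner blocks the action
        self._used[category] += amount
        return True

    def extend(self, category: str, amount: int, actor: str):
        if actor not in ("broker", "user"):
            raise PermissionError("workers cannot extend their own budget")
        self._limits[category] += amount
```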
Implementation Phases
Phase 0 - Research and design grounding
- Write targeted research notes for OpenClaw harness controls, MCP security, A2A, Gas Town orchestration, LLM Wiki memory, and browser automation risk.
- Decide which parts belong in capOS core versus a sibling `capos-agent-shell` repository.
- Define the minimum QEMU-hosted deterministic model and fake browser/exec tools needed for proof.
Phase 1 - Single hosted task, deterministic model
- Add `HostedAgentService`, `AgentTask`, `AgentRunner`, and a deterministic `LanguageModel` test service.
- Create task workspace caps over existing storage primitives or a temporary in-memory substitute.
- Implement a read-only tool and a mutating fake tool with approval.
- Add a `make run-hosted-agent` QEMU proof.
Phase 2 - Memory and wiki substrate
- Add `AgentMemory` with source, wiki, index, log, and lint concepts.
- Implement markdown-backed storage first.
- Add bounded retrieval by page and line range.
- Add source citations and label inheritance.
- Prove ingest, query, lint, and memory write rejection under policy.
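Bounded retrieval by page and line range might look like the following sketch, where a clamp keeps any single call from returning an entire corpus. The function shape and the `max_lines` default are assumptions for illustration.

```python
def retrieve(pages: dict, page: str, start: int, end: int, max_lines: int = 40):
    """Return at most max_lines numbered lines from one named wiki page."""
    lines = pages.get(page)
    if lines is None:
        raise KeyError(page)
    start = max(1, start)
    # clamp to the requested end, the per-call budget, and the page length
    end = min(end, start + max_lines - 1, len(lines))
    return [(n, lines[n - 1]) for n in range(start, end + 1)]
```

Returning numbered lines also gives the agent stable coordinates to cite, which supports the source-citation control above.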
Phase 3 - Tool harnesses
- Add a structured `CommandRunner` without arbitrary shell.
- Add `PatchSet` for file edits.
- Add a fake browser harness, then later real browser integration outside the kernel path.
- Add MCP import behind a tool-proxy policy review.
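A command harness without arbitrary shell can be sketched as allowlisted program+argv invocation, so metacharacters in arguments are plain data rather than syntax. The allowlist contents and function name here are hypothetical.

```python
import subprocess

ALLOWED_PROGRAMS = {"ls", "git", "cargo"}  # hypothetical per-task capset

def run_command(argv: list, cwd: str = ".", timeout_s: int = 30):
    """Run an allowlisted program with explicit argv and no shell in the path."""
    if not argv or argv[0] not in ALLOWED_PROGRAMS:
        raise PermissionError(f"program not in capset: {argv[:1]}")
    # shell=False is the subprocess.run default: argv goes straight to exec,
    # so `; rm -rf ~` inside an argument is just an argument
    return subprocess.run(argv, cwd=cwd, capture_output=True,
                          text=True, timeout=timeout_s)
```

The timeout and captured output also give the runner deterministic points to audit and to debit against the task's runtime budget.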
Phase 4 - Swarm scheduling
- Add durable subtask records and worker assignment.
- Add ephemeral worker processes with exact-grant capsets.
- Add reviewer workers with constrained caps.
- Add merge/release queue semantics for artifacts.
- Prove cancellation, worker timeout, handoff, and review failure.
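The cancellation, timeout, handoff, and review-failure cases above suggest a small durable state machine for subtask records. The states and transition table are illustrative only, not a proposed schema.

```python
from enum import Enum

class Subtask(Enum):
    QUEUED = "queued"
    ASSIGNED = "assigned"
    IN_REVIEW = "inReview"
    MERGED = "merged"
    CANCELLED = "cancelled"
    FAILED = "failed"

# legal transitions; worker timeout and review failure requeue the work
# rather than silently completing or discarding it
TRANSITIONS = {
    Subtask.QUEUED: {Subtask.ASSIGNED, Subtask.CANCELLED},
    Subtask.ASSIGNED: {Subtask.IN_REVIEW, Subtask.QUEUED,
                       Subtask.CANCELLED, Subtask.FAILED},
    Subtask.IN_REVIEW: {Subtask.MERGED, Subtask.QUEUED, Subtask.CANCELLED},
    Subtask.MERGED: set(),
    Subtask.CANCELLED: set(),
    Subtask.FAILED: set(),
}

def advance(state: Subtask, new: Subtask) -> Subtask:
    if new not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.value} -> {new.value}")
    return new
```

Making `MERGED` reachable only from `IN_REVIEW` encodes the review gate: no worker can mark its own artifact merged, which is one half of the review-capture defense.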
Phase 5 - External ingress and providers
- Wire WebShellGateway agent task submission.
- Add webhook and scheduled trigger caps.
- Add provider-token caps and remote model backend policy.
- Add remote MCP/A2A adapters.
- Add browser direct-provider mode only after server-side tool execution and provider-session revocation/audit are implemented.
Phase 6 - Applications
- Hosted coding assistant over capOS repository worktrees.
- Agent-assisted first-boot setup.
- Agent-maintained operator/project wiki.
- Aurelian Frontier NPCs and story-world workers.
- Monitoring/log investigation assistant.
- Personal assistant over approved chat/email/calendar adapters.
Open Questions
- Should hosted agents live in this repository or a sibling `capos-agent-shell` repository once the capability interfaces stabilize?
- What is the minimum storage substrate for `AgentMemory` before persistence and file-backed `MemoryObject` are complete?
- Should the first command harness support any shell syntax, or only structured program+argv invocations?
- How should capOS represent browser state: as a task-local profile cap, service-owned profile cap, or user-owned delegated profile cap?
- Which memory writes require human review before becoming long-term memory?
- How should labels propagate from raw sources through wiki summaries, embeddings, and model prompts?
- What is the right review independence policy when the same model provider is used for implementation and review?
- How should an agent's track record be measured without overfitting to easy tasks or encouraging unsafe autonomy?
- How should A2A/MCP imported tools be signed, pinned, reviewed, and revoked?
- What should be exposed in audit by default when prompts or tool outputs carry private content?
- How should hosted agents behave when session context expires while a task is mid-run?
- Can capOS use promise pipelining or notification objects to reduce tool-call latency without weakening approval gates?
- What formal properties should be specified for “model cannot acquire new authority except through broker-approved tool calls”?
- Which local embedding model is good enough for offline wiki search without adding unacceptable ISO size or trusted-build-input burden?
- What should be researched for secure, deterministic browser automation in a capability OS?
Relationship to Existing Proposals
- Shell: defines the native shell and agent mode as one interactive runner surface. This proposal defines long-lived hosted agents and swarms that may be launched from shell but are not part of shell itself.
- Language Models and Agent Runtime: defines `LanguageModel`, `TextEmbedder`, model backends, and the basic tool-use loop. This proposal layers hosted task state, workspaces, memory, swarms, and external interoperability on top.
- Realtime Voice Agent Shell: voice sessions can submit hosted-agent tasks or control a live runner, but media transport remains separate.
- Repository Composition: the runtime, providers, browser harnesses, and skills may eventually belong in a sibling repository; the capOS core keeps capability interfaces and authority policy.
- System Monitoring: hosted agents need audit, trace, status, and cost views.
- Resource Accounting and Quotas: hosted agents are a forcing function for token, provider, workspace, process, and network ledgers.
- User Identity and Policy: session profile, guest/operator policy, step-up, and expiry decide agent authority.
Research Still Needed
- OpenClaw threat model from primary advisories, not news summaries: gateway exposure, node hosts, skills, browser profiles, exec approvals, memory, and provider credentials.
- MCP security: stdio process spawning, remote auth, tool poisoning, prompt injection, marketplace signing, and per-tool permission descriptions.
- A2A security and identity: authentication, authorization, task provenance, artifact integrity, and non-transfer of authority.
- Browser automation containment: CDP risks, extension relays, logged-in profiles, downloads/uploads, arbitrary JS evaluation, clipboard, screenshots, and private-network access.
- Agent memory correctness: citation fidelity, contradiction detection, stale summaries, label propagation, hallucinated links, and human review workflow.
- Retrieval architecture: index-first wiki navigation versus vector RAG, hybrid search, reranking, snippet budgets, local embeddings, and remote embedding denial for high-label data.
- Swarm orchestration: when parallel agents improve throughput, when they create coordination debt, how to assign work, and how to prevent review capture.
- Evals: deterministic task harnesses for tool calls, memory ingest, prompt injection, browser tasks, code edits, review quality, and resource budget enforcement.
- Local model viability: smallest model that can follow schemas/tool calls, local embedding model choice, quantization, context budget, and ISO/storage impact.
- Provider policy: data-retention settings, regional routing, ephemeral credentials, revocation, spend controls, and audit of remote inference.
- Formal authority model: prove that model text, memory text, remote agent messages, and MCP descriptors cannot mint capOS authority.
- UX for approvals: avoiding consent fatigue while preserving fresh user presence for dangerous actions.
- Agent-maintained docs: how capOS should use its own proposals, backlog, research notes, and wiki artifacts as agent-legible harness inputs without making stale generated docs authoritative.