Proposal: capOS As A Robot Brain

How capOS should grow into a capability-oriented robot brain for manufacturing robots, mobile robots, RC cars, drones, and autonomous-vehicle research without collapsing safety, realtime, perception, planning, and operator control into one trusted process.

Purpose

capOS has the right architectural ingredients for robotics: isolated processes, explicit capabilities, typed IPC, revocation, memory objects, service composition, audit direction, and future scheduling contexts. Robotics is a useful forcing function because it combines physical authority with mixed-criticality timing:

a camera pipeline can drop frames;
a local planner can miss a cycle and recover;
a wheel command must expire safely;
a robot arm must obey limits;
an e-stop must not depend on a model, network, shell, or log service.

The proposal is not “run every control loop in the kernel.” It is a staged robotics architecture where capOS owns authority routing, service isolation, telemetry, update, planning, and eventually admitted realtime islands, while the tightest safety loops remain on certified controllers or MCUs until capOS has evidence to replace them.

Goals

Define a capability-native robot service graph.
Separate safety, realtime control, perception, planning, operator UI, simulation, manufacturing integration, and agents.
Make actuator authority explicit, revocable, logged, and bounded by mode, safety state, command freshness, and limits.
Support compatibility bridges for ROS 2, micro-ROS, MAVLink, OPC UA, and simulation tooling without turning them into ambient authority tunnels.
Provide a path from simulation to small physical robots before industrial or vehicle safety claims.
Reuse MemoryObject rings, notification/futex paths, and future scheduling contexts for sensor streams and control loops.

Non-Goals

Replacing certified safety PLCs, flight controllers, servo drives, or vehicle safety controllers in the near term.
Claiming IEC 61508, ISO 13849, ISO 10218, or ISO 26262 compliance.
Putting model inference or natural-language agents in direct control of actuators.
Making ROS 2 an ambient compatibility layer with implicit access to every capOS service.
Copying large sensor frames through Cap’n Proto payloads in the data path.

Architecture

flowchart LR
    Operator[Operator UI / shell / teleop] --> Mission[Mission and behavior]
    Agent[Agent runner] --> Mission
    Mission --> Planner[Planner]
    Planner --> Controller[Realtime controller island]
    Controller --> Actuator[Actuator gateway]
    Actuator --> Hardware[MCU / PLC / drive / autopilot]

    SensorHW[Camera / lidar / IMU / encoders] --> SensorSvc[Sensor services]
    SensorSvc --> Perception[Perception]
    Perception --> World[World model]
    World --> Planner

    Safety[Safety monitor] --> Mission
    Safety --> Controller
    Safety --> Actuator

    Bridges[ROS 2 / MAVLink / OPC UA bridges] --> Mission
    Bridges --> SensorSvc
    Bridges --> Actuator

    Audit[Audit and telemetry] --- Mission
    Audit --- Controller
    Audit --- Actuator

Principal split:

Sensor services own device-facing capture authority and publish typed streams or snapshots.
Perception consumes sensor streams and emits world-model updates.
Mission and behavior chooses tasks, modes, and goals.
Planner computes paths, trajectories, or setpoints within policy.
Realtime controller island turns admitted inputs into cyclic commands.
Actuator gateway is the only holder of hardware command authority.
Safety monitor observes independent safety state and can force stop, neutral, disarm, or mode degradation.
Agent runner may propose or explain actions but does not hold actuator caps.
Compatibility bridges receive narrow imported/exported caps.

Core Rule

No process gets both broad interpretation authority and raw physical authority.

Examples:

A language model may emit a structured proposal; it does not receive ActuatorCommand.
A ROS bridge may publish odometry and accept a velocity command cap; it does not receive the whole capOS service graph.
A planner may receive a goal and produce a trajectory; it does not directly program motor registers.
An actuator gateway may command hardware; it does not fetch network content or run operator scripts.
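The core rule can be checked mechanically against a grant table. The following is a minimal sketch, assuming a hypothetical manifest that maps each process to the capability names it holds. The two authority classes and every capability name other than ActuatorCommand, DeviceMmio, DMAPool, and Interrupt are illustrative assumptions, not capOS identifiers.

```python
# Hypothetical authority classes; only ActuatorCommand, DeviceMmio, DMAPool,
# and Interrupt appear in this proposal. The rest are illustrative names.
INTERPRETATION = {"ModelInference", "NetworkFetch", "ScriptRunner", "ServiceDiscovery"}
PHYSICAL = {"ActuatorCommand", "DeviceMmio", "DMAPool", "Interrupt"}


def core_rule_violations(grants):
    """Return {process: (interpretation caps, physical caps)} for each offender.

    `grants` maps a process name to the set of capability names it holds.
    """
    violations = {}
    for process, caps in grants.items():
        broad = caps & INTERPRETATION
        raw = caps & PHYSICAL
        if broad and raw:
            violations[process] = (sorted(broad), sorted(raw))
    return violations


grants = {
    "agent-runner": {"ModelInference", "MissionProposal"},
    "planner": {"WorldModelRead", "TrajectoryOut"},
    "actuator-gateway": {"ActuatorCommand", "SafetyStateRead"},
    # Misconfigured on purpose: a gateway that can also run operator scripts.
    "gateway-with-scripts": {"ActuatorCommand", "ScriptRunner"},
}
violations = core_rule_violations(grants)
```

A check like this could run wherever the service graph is assembled, so a misconfigured grant is caught before any process starts.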

Robot Capabilities

The first schema should stay small and control-plane oriented. Bulk sensor data uses MemoryObject rings.

interface RobotDescription {
  describe @0 () -> (description :RobotDescriptionSnapshot);
  readFrameTree @1 () -> (frames :FrameTreeSnapshot);
}

interface SensorStream {
  describe @0 () -> (info :SensorInfo);
  openRing @1 (config :StreamConfig) -> (ring :MemoryObject);
  readStatus @2 () -> (status :StreamStatus);
}

interface ActuatorCommand {
  describe @0 () -> (info :ActuatorInfo);
  submit @1 (frame :CommandFrame) -> (accepted :Bool);
  neutral @2 (reason :Text) -> ();
}

interface SafetyState {
  read @0 () -> (state :SafetySnapshot);
  subscribe @1 () -> (events :SensorStream);
}

interface ControlLoop {
  describe @0 () -> (info :LoopInfo);
  start @1 () -> ();
  stop @2 (reason :Text) -> ();
  readTelemetry @3 () -> (telemetry :LoopTelemetry);
}

CommandFrame must carry:

sequence number;
monotonic timestamp;
deadline;
command mode;
coordinate frame;
limit profile;
typed payload;
source identity;
optional safety-envelope revision.

Command freshness is mandatory. If the frame is stale, the actuator gateway rejects it or transitions to a neutral/safe state according to policy.
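The freshness rule can be illustrated with a small gateway model. This is a sketch only: the field names, the `Verdict` enum, the `Gateway` class, and the replay check on sequence numbers are assumptions made for illustration, and the real CommandFrame would be a Cap'n Proto struct rather than a Python dataclass.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class CommandFrame:
    sequence: int
    timestamp_ns: int          # monotonic clock at the source
    deadline_ns: int           # monotonic time after which the frame is stale
    mode: str
    coordinate_frame: str
    limit_profile: str
    payload: tuple             # typed payload, e.g. (linear m/s, angular rad/s)
    source: str
    safety_envelope_rev: Optional[int] = None


class Gateway:
    """Hypothetical gateway front end that applies a configurable stale policy."""

    def __init__(self, stale_policy=Verdict.NEUTRAL):
        self.stale_policy = stale_policy
        self.last_sequence = -1

    def submit(self, frame, now_ns):
        # Assumption: replayed or reordered frames are rejected outright.
        if frame.sequence <= self.last_sequence:
            return Verdict.REJECT
        # A frame past its deadline is never applied to hardware.
        if now_ns > frame.deadline_ns:
            return self.stale_policy
        self.last_sequence = frame.sequence
        return Verdict.ACCEPT
```

The caller passes the current monotonic time in explicitly, which keeps the check deterministic and easy to exercise in a simulation-only smoke test.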

Data Plane

Cap’n Proto is the control plane. Sensor and actuator streams need fixed-layout shared rings whose entries carry:

sequence
capture_time_ns
deadline_ns
frame_id
format
offset
length
flags
source_epoch

The ring can carry camera frames, lidar scans, IMU batches, encoder samples, audio-like streams, or command telemetry. Payload bytes live in MemoryObject backing storage. Producers and consumers coordinate through notification or futex-like wakeups. Slow consumers drop or skip according to policy; they do not backpressure a guaranteed control island.
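One way to picture the ring entry is as a packed, fixed-width record. The sketch below assumes field widths (64-bit sequence, timestamps, offset, and epoch; 32-bit frame_id, format, length, and flags), a little-endian layout, and a four-slot ring. None of those choices come from this proposal. It also shows a consumer that skips ahead when it has been lapped instead of making the producer wait.

```python
import struct

FIELDS = ("sequence", "capture_time_ns", "deadline_ns", "frame_id",
          "format", "offset", "length", "flags", "source_epoch")
# Assumed widths: u64 u64 u64 u32 u32 u64 u32 u32 u64, little-endian, unpadded.
DESC = struct.Struct("<QQQIIQIIQ")
SLOTS = 4


def write_descriptor(ring, **fields):
    """Producer side: overwrite the slot for this sequence, never wait."""
    slot = fields["sequence"] % SLOTS
    DESC.pack_into(ring, slot * DESC.size, *(fields[name] for name in FIELDS))


def read_descriptor(ring, sequence):
    """Consumer side: decode the slot that should hold `sequence`."""
    values = DESC.unpack_from(ring, (sequence % SLOTS) * DESC.size)
    return dict(zip(FIELDS, values))


def next_readable(cursor, head):
    """Return (sequence to read next, number of entries dropped).

    `head` is the count of entries published so far; a lapped consumer
    skips to the oldest entry still in the ring.
    """
    oldest = max(0, head - SLOTS)
    if cursor < oldest:
        return oldest, oldest - cursor
    return cursor, 0


ring = bytearray(DESC.size * SLOTS)
for seq in range(10):
    write_descriptor(ring, sequence=seq, capture_time_ns=seq * 1000,
                     deadline_ns=seq * 1000 + 500, frame_id=1, format=0,
                     offset=(seq * 64) % 256, length=64, flags=0, source_epoch=1)
```

A real consumer would also compare the decoded `sequence` with the one it expected, to detect a slot that was overwritten mid-read.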

Realtime Islands

The robot-control equivalent of the media graph’s guaranteed realtime island is an admitted control loop:

flowchart LR
    Sense[read sensors] --> Snapshot[input snapshot]
    Snapshot --> Update[controller update]
    Update --> Clamp[limit and safety clamp]
    Clamp --> Write[write actuator command]
    Write --> Telemetry[non-RT telemetry export]

Admission requires:

fixed period and deadline;
scheduling context with budget;
preallocated input, output, and telemetry buffers;
no allocation in the cycle;
no blocking endpoint calls in the cycle;
no credential checks, logging, service discovery, or model inference;
bounded data-age policy;
command-limit and clamp policy;
stale-command watchdog;
overrun behavior.

Failure behavior is part of the contract. An overrun, stale input, revoked cap, or failed write should produce a deterministic result: hold, neutral, stop, drop, degrade mode, or fault the island. It should not build an unbounded queue of late commands.
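The failure contract can be expressed as a small state machine. The sketch below is an assumption-laden model, not the capOS design: the `Island` class, the `Outcome` names, the choice that the deadline equals the period, and the specific policy (stale input goes neutral, a late result is dropped and the previous output held, repeated overruns fault the island) are all illustrative.

```python
from enum import Enum


class Outcome(Enum):
    WROTE = "wrote"
    HOLD = "hold"
    NEUTRAL = "neutral"
    FAULT = "fault"


class Island:
    """Hypothetical admitted loop; all state is allocated once, up front."""

    def __init__(self, period_ns, max_input_age_ns, limit, max_overruns):
        self.period_ns = period_ns
        self.max_input_age_ns = max_input_age_ns
        self.limit = limit
        self.max_overruns = max_overruns
        self.overruns = 0
        self.faulted = False
        self.last_output = 0.0

    def cycle(self, release_ns, finish_ns, input_time_ns, setpoint):
        """One cycle released at release_ns whose update finished at finish_ns."""
        if self.faulted:
            return Outcome.FAULT, 0.0
        if release_ns - input_time_ns > self.max_input_age_ns:
            self.last_output = 0.0          # stale input: go neutral
            return Outcome.NEUTRAL, 0.0
        if finish_ns - release_ns > self.period_ns:
            self.overruns += 1              # late result is dropped, not queued
            if self.overruns >= self.max_overruns:
                self.faulted = True
                self.last_output = 0.0
                return Outcome.FAULT, 0.0
            return Outcome.HOLD, self.last_output
        self.last_output = max(-self.limit, min(self.limit, setpoint))
        return Outcome.WROTE, self.last_output
```

Every path returns exactly one output for the cycle, so there is no place for late commands to accumulate.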

Compatibility Bridges

ROS 2 Bridge

The ROS 2 bridge should map selected topics, services, and actions to capOS capabilities. It must be configured from a manifest or broker policy:

which ROS topics can be imported;
which capOS sensor streams can be exported;
which commands can reach an actuator gateway;
freshness and rate limits;
whether messages are best-effort, reliable, latched, or deadline-bound;
how frames and transforms are mapped.

The bridge is not a general “ROS graph has all caps” adapter.
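A manifest-driven bridge policy might look like the following sketch. The `TopicRule` and `BridgePolicy` names, the manifest shape, and the numeric limits are assumptions for illustration. The topic names are only conventional ROS examples, and no ROS 2 client library is used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicRule:
    direction: str       # "import" (ROS -> capOS) or "export" (capOS -> ROS)
    max_rate_hz: float
    max_age_ns: int
    reliability: str     # e.g. "best_effort", "reliable", "deadline"


# Hypothetical manifest: anything not listed never crosses the bridge.
MANIFEST = {
    "/scan": TopicRule("export", 20.0, 100_000_000, "best_effort"),
    "/odom": TopicRule("export", 50.0, 50_000_000, "best_effort"),
    "/cmd_vel": TopicRule("import", 20.0, 100_000_000, "deadline"),
}


class BridgePolicy:
    def __init__(self, manifest):
        self.manifest = manifest
        self.last_ns = {}    # topic -> time of last admitted message

    def admit(self, topic, direction, stamp_ns, now_ns):
        rule = self.manifest.get(topic)
        if rule is None or rule.direction != direction:
            return False                      # unlisted topic or wrong direction
        if now_ns - stamp_ns > rule.max_age_ns:
            return False                      # too old to forward
        min_gap_ns = int(1e9 / rule.max_rate_hz)
        last = self.last_ns.get(topic)
        if last is not None and now_ns - last < min_gap_ns:
            return False                      # over the rate limit
        self.last_ns[topic] = now_ns
        return True
```

Denying by default keeps the bridge from becoming an ambient tunnel: adding a topic is an explicit manifest change that can be reviewed.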

micro-ROS / MCU Bridge

For small robots, the MCU bridge is the first practical hardware path:

MCU closes motor PID, bumper debounce, watchdog, and current limits;
capOS sends bounded velocity/setpoint frames;
MCU publishes encoder, IMU, battery, bumper, and fault streams;
stale capOS commands force neutral behavior.
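The division of labor above can be modeled on the host before any firmware exists. The following is a host-side sketch of the MCU behavior, assuming a hypothetical `BaseController` with a receive timeout and a per-wheel speed clamp. Real firmware would likely be written in C and would also close the motor PID loop, which is omitted here.

```python
class BaseController:
    """Host-side model of the MCU: clamp setpoints, stop when commands go stale."""

    def __init__(self, timeout_ms, max_speed):
        self.timeout_ms = timeout_ms
        self.max_speed = max_speed
        self.setpoint = (0.0, 0.0)
        self.last_rx_ms = None      # no command received yet

    def _clamp(self, value):
        return max(-self.max_speed, min(self.max_speed, value))

    def on_frame(self, now_ms, left, right):
        """Called when a setpoint frame arrives from capOS."""
        self.setpoint = (self._clamp(left), self._clamp(right))
        self.last_rx_ms = now_ms

    def wheel_targets(self, now_ms, bumper_pressed=False):
        """Targets for the motor loop; neutral on stale input or bumper contact."""
        stale = (self.last_rx_ms is None
                 or now_ms - self.last_rx_ms > self.timeout_ms)
        if stale or bumper_pressed:
            return (0.0, 0.0)
        return self.setpoint
```

Because the timeout lives on the MCU side, the base stops even if capOS itself hangs or loses the link.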

MAVLink / Autopilot Bridge

For drones and some rovers:

autopilot owns arming, stabilization, failsafe, and flight termination;
capOS consumes telemetry and sends high-level setpoints or missions;
bridge enforces geofence, mode, rate, and authority limits;
direct actuator override is absent or gated behind stronger policy.
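The bridge-side limits can be sketched as a pure check applied before any setpoint is forwarded to the autopilot. Everything here is an assumption for illustration: the `Fence` type, a circular fence around the home position in local north/east meters, an altitude ceiling, and a mode allowlist. No MAVLink library calls are shown.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Fence:
    radius_m: float            # horizontal radius around home
    max_alt_m: float           # altitude ceiling above home
    allowed_modes: frozenset   # autopilot modes in which capOS may send setpoints


def check_setpoint(fence, mode, north_m, east_m, alt_m):
    """Return (allowed, reason) for a position setpoint relative to home."""
    if mode not in fence.allowed_modes:
        return False, "mode not permitted"
    if math.hypot(north_m, east_m) > fence.radius_m:
        return False, "outside geofence"
    if not 0.0 <= alt_m <= fence.max_alt_m:
        return False, "altitude out of range"
    return True, "ok"


# "GUIDED" is used only as an example mode name.
fence = Fence(radius_m=50.0, max_alt_m=30.0, allowed_modes=frozenset({"GUIDED"}))
```

The autopilot keeps its own failsafes, so this check narrows what capOS may request rather than replacing them.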

OPC UA / Manufacturing Bridge

For industrial cells:

OPC UA gateway imports cell, robot, fixture, and job state;
capOS exposes typed job/status/alarm caps;
robot program selection and start/stop are separate authorities;
safety state is read independently and cannot be overridden by job logic.

Product-Level Targets

Simulation Robot

The first milestone should be visible without hardware: boot capOS, launch a simulated differential-drive robot, publish fake lidar/odometry, run a behavior service, send bounded drive commands, and log telemetry. This proves the capability graph and stale-command behavior.
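A simulated differential-drive base needs very little code. The sketch below uses the standard unicycle model with forward-Euler integration, clamps each command to assumed limits, and zeroes the command once it is older than a timeout. The function names, the limit values, and the timeout policy are illustrative assumptions.

```python
import math


def clamp(value, limit):
    return max(-limit, min(limit, value))


def step(pose, v, w, dt_s, v_max=0.5, w_max=1.0):
    """Advance (x, y, theta) by one Euler step with bounded linear/angular speed."""
    x, y, theta = pose
    v, w = clamp(v, v_max), clamp(w, w_max)
    return (x + v * math.cos(theta) * dt_s,
            y + v * math.sin(theta) * dt_s,
            theta + w * dt_s)


def simulate(pose, command, sent_ms, timeout_ms, dt_ms, steps):
    """Apply one (v, w) command sent at sent_ms; it expires after timeout_ms."""
    for k in range(steps):
        now_ms = k * dt_ms
        fresh = 0 <= now_ms - sent_ms <= timeout_ms
        v, w = command if fresh else (0.0, 0.0)   # stale command: stop
        pose = step(pose, v, w, dt_ms / 1000.0)
    return pose
```

For example, a 0.2 m/s command with a 500 ms timeout stepped at 100 ms moves the robot for six steps and then leaves it stationary, which is the stale-command behavior the milestone is meant to prove.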

Vacuum / Indoor Mobile Robot

Next target: capOS on an SBC with an MCU base controller.

capOS runs mapping, local planning, cleaning behavior, docking, UI, and logs.
MCU runs wheel control, bumper/cliff protection, and motor watchdog.
BaseDrive accepts velocity commands with deadlines.
Loss of capOS or command authority stops motion.

RC Car / Rover

RC-car class demo:

camera/IMU/GPS sensor services;
teleop and autonomous mode caps;
steering/throttle gateway with watchdog;
geofence and speed envelope;
logs for every actuator-affecting command.

Manufacturing Cell Supervisor

Industrial demo:

OPC UA or mock PLC gateway;
robot program selection as a typed capability;
cell-state and alarm streams;
operator approval for mutating actions;
no attempt to replace certified safety functions.

Autonomous Vehicle Research Host

Autoware-like demo:

perception, localization, planning, control, and vehicle-interface services;
simulator or closed-course interface;
independent safety gateway;
command envelopes and audit.

This remains a research host, not a road-certified system.

Security Invariants

Actuator gateways are narrow and mode-limited.
Safety monitor authority is independent from planner and agent authority.
Model processes never receive actuator, safety, or raw device caps.
Operator UI receives consent and status caps, not raw hardware caps.
Bridges do not receive ambient service discovery authority.
Every actuator-affecting command is auditable by source, mode, limits, safety-state revision, timestamp, and result.
Revoking command authority causes stale handles and future commands to fail closed.
Device-facing services obey the DeviceMmio, DMAPool, and Interrupt authority model before userspace drivers touch physical hardware.
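The audit and revocation invariants in this list can be modeled together. The following sketch assumes a hypothetical epoch-based scheme in which each handle records the epoch it was granted in and revocation bumps the epoch. capOS revocation may work differently. The `AuditRecord` fields follow the invariant above, while the class and method names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    source: str
    mode: str
    limits: str
    safety_rev: int
    timestamp_ns: int
    result: str


class CommandAuthority:
    """Hypothetical gateway front end: epoch-checked handles plus an audit log."""

    def __init__(self):
        self.epoch = 0
        self.log = []

    def grant(self, source):
        # A handle is only valid for the epoch in which it was issued.
        return (source, self.epoch)

    def revoke_all(self):
        self.epoch += 1

    def submit(self, handle, mode, limits, safety_rev, now_ns):
        source, epoch = handle
        result = "accepted" if epoch == self.epoch else "rejected:revoked"
        # Rejected attempts are recorded too, so the log shows who tried what.
        self.log.append(AuditRecord(source, mode, limits, safety_rev, now_ns, result))
        return result == "accepted"
```

A stale handle fails closed without the holder being notified, and the attempt still lands in the log with its source and safety-state revision.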

Scheduling Dependencies

This proposal depends on future scheduling work:

per-thread rings for full-SMP ownership;
notification objects for low-overhead wakeups;
scheduling contexts with period/budget/priority;
CPU affinity and isolation for admitted loops;
TLB shootdown and SMP-safe address-space migration;
timing telemetry and overrun events;
eventually WCET evidence for hard-realtime claims.

Until those exist, docs and demos must say “bounded soft realtime” or “supervised external controller”, not “hard realtime.”

Implementation Sequence

Add simulation-only robot services and typed fake sensor/actuator caps.
Add RobotDescription, SensorStream, ActuatorCommand, SafetyState, and ControlLoop draft schemas.
Add a QEMU/host smoke test that proves stale drive commands fail closed.
Add a differential-drive MCU bridge design and host-side simulator.
Add ROS 2 bridge proposal detail for selected topics/actions and transforms.
Add control-loop telemetry counters: period, execution time, overrun, data age, command age, clamp, neutral, and safety fault.
Bind a local controller to scheduling contexts once the scheduler supports budgeted realtime islands.
Add manufacturing gateway design over OPC UA or a mock PLC protocol.
Add hardware-in-loop criteria before any real actuator demo is treated as a milestone.
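The control-loop telemetry counters named in the sequence above could start as a fixed set of integers updated once per cycle. This is a sketch under assumptions: the `LoopCounters` name, the choice to track maxima rather than histograms, and the definition of an overrun as execution time exceeding the nominal period are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class LoopCounters:
    cycles: int = 0
    overruns: int = 0
    clamps: int = 0
    neutrals: int = 0
    safety_faults: int = 0
    max_exec_ns: int = 0
    max_period_error_ns: int = 0
    max_data_age_ns: int = 0
    max_command_age_ns: int = 0

    def record(self, nominal_period_ns, period_ns, exec_ns, data_age_ns,
               command_age_ns, clamped=False, neutral=False, safety_fault=False):
        """Fold one cycle's measurements into the counters."""
        self.cycles += 1
        self.overruns += int(exec_ns > nominal_period_ns)
        self.clamps += int(clamped)
        self.neutrals += int(neutral)
        self.safety_faults += int(safety_fault)
        self.max_exec_ns = max(self.max_exec_ns, exec_ns)
        self.max_period_error_ns = max(self.max_period_error_ns,
                                       abs(period_ns - nominal_period_ns))
        self.max_data_age_ns = max(self.max_data_age_ns, data_age_ns)
        self.max_command_age_ns = max(self.max_command_age_ns, command_age_ns)
```

A fixed-size counter block like this can be exported by a non-realtime reader without the loop ever formatting or logging anything itself.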

Open Questions

Should the first visible milestone be simulation-only or a small physical differential-drive base?
Should robot schemas live in schema/capos.capnp or a separate robotics schema compiled by the same build pipeline?
Which transform-tree representation fits capOS best: immutable snapshots, streaming deltas, or both?
How should command envelopes compose when operator, planner, safety monitor, and actuator gateway all impose limits?
What is the minimum useful ROS 2 bridge: topics only, or topics plus actions for Nav2-style navigation?
Does SensorStream generalize the media-ring design, or should robotics get a distinct stream ABI?
