Proposal: capOS As A Robot Brain
How capOS should grow into a capability-oriented robot brain for manufacturing robots, mobile robots, RC cars, drones, and autonomous-vehicle research without collapsing safety, realtime, perception, planning, and operator control into one trusted process.
Purpose
capOS has the right architectural ingredients for robotics: isolated processes, explicit capabilities, typed IPC, revocation, memory objects, service composition, audit direction, and future scheduling contexts. Robotics is a useful forcing function because it combines physical authority with mixed-criticality timing:
- a camera pipeline can drop frames;
- a local planner can miss a cycle and recover;
- a wheel command must expire safely;
- a robot arm must obey limits;
- an e-stop must not depend on a model, network, shell, or log service.
The proposal is not “run every control loop in the kernel.” It is a staged robotics architecture where capOS owns authority routing, service isolation, telemetry, update, planning, and eventually admitted realtime islands, while the tightest safety loops remain on certified controllers or MCUs until capOS has evidence to replace them.
Goals
- Define a capability-native robot service graph.
- Separate safety, realtime control, perception, planning, operator UI, simulation, manufacturing integration, and agents.
- Make actuator authority explicit, revocable, logged, and bounded by mode, safety state, command freshness, and limits.
- Support compatibility bridges for ROS 2, micro-ROS, MAVLink, OPC UA, and simulation tooling without turning them into ambient authority tunnels.
- Provide a path from simulation to small physical robots before industrial or vehicle safety claims.
- Reuse MemoryObject rings, notification/futex paths, and future scheduling contexts for sensor streams and control loops.
Non-Goals
- Replacing certified safety PLCs, flight controllers, servo drives, or vehicle safety controllers in the near term.
- Claiming IEC 61508, ISO 13849, ISO 10218, or ISO 26262 compliance.
- Putting model inference or natural-language agents in direct control of actuators.
- Making ROS 2 an ambient compatibility layer with implicit access to every capOS service.
- Copying large sensor frames through Cap’n Proto payloads in the data path.
Architecture
flowchart LR
Operator[Operator UI / shell / teleop] --> Mission[Mission and behavior]
Agent[Agent runner] --> Mission
Mission --> Planner[Planner]
Planner --> Controller[Realtime controller island]
Controller --> Actuator[Actuator gateway]
Actuator --> Hardware[MCU / PLC / drive / autopilot]
SensorHW[Camera / lidar / IMU / encoders] --> SensorSvc[Sensor services]
SensorSvc --> Perception[Perception]
Perception --> World[World model]
World --> Planner
Safety[Safety monitor] --> Mission
Safety --> Controller
Safety --> Actuator
Bridges[ROS 2 / MAVLink / OPC UA bridges] --> Mission
Bridges --> SensorSvc
Bridges --> Actuator
Audit[Audit and telemetry] --- Mission
Audit --- Controller
Audit --- Actuator
Principal split:
- Sensor services own device-facing capture authority and publish typed streams or snapshots.
- Perception consumes sensor streams and emits world-model updates.
- Mission and behavior chooses tasks, modes, and goals.
- Planner computes paths, trajectories, or setpoints within policy.
- Realtime controller island turns admitted inputs into cyclic commands.
- Actuator gateway is the only holder of hardware command authority.
- Safety monitor observes independent safety state and can force stop, neutral, disarm, or mode degradation.
- Agent runner may propose or explain actions but does not hold actuator caps.
- Compatibility bridges receive narrow imported/exported caps.
Core Rule
No process gets both broad interpretation authority and raw physical authority.
Examples:
- A language model may emit a structured proposal; it does not receive ActuatorCommand.
- A ROS bridge may publish odometry and accept a velocity command cap; it does not receive the whole capOS service graph.
- A planner may receive a goal and produce a trajectory; it does not directly program motor registers.
- An actuator gateway may command hardware; it does not fetch network content or run operator scripts.
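The core rule can be checked mechanically against a manifest of cap grants. The following is a minimal sketch, not capOS code: the two cap classes, the cap names other than ActuatorCommand and DeviceMmio, and the process names are all hypothetical and chosen only for illustration.

```python
# Hypothetical cap classes; capOS does not define these sets today.
INTERPRETATION_CAPS = {"ModelInference", "NetworkFetch", "OperatorScript"}
PHYSICAL_CAPS = {"ActuatorCommand", "DeviceMmio"}


def violations(grants: dict[str, set[str]]) -> list[str]:
    """Return the processes that hold both interpretation and physical authority."""
    return sorted(
        name
        for name, caps in grants.items()
        if caps & INTERPRETATION_CAPS and caps & PHYSICAL_CAPS
    )


# Example manifest: process name -> set of granted cap names (all illustrative).
grants = {
    "agent-runner": {"ModelInference", "MissionProposal"},
    "actuator-gateway": {"ActuatorCommand", "SafetyState"},
    "bad-gateway": {"NetworkFetch", "ActuatorCommand"},
}

for name in violations(grants):
    print(f"core-rule violation: {name}")
```

A check like this would fit at manifest load or broker policy time, so a misconfigured graph is rejected before any process starts.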
Robot Capabilities
The first schema should stay small and control-plane oriented. Bulk sensor data
uses MemoryObject rings.
interface RobotDescription {
  describe @0 () -> (description :RobotDescriptionSnapshot);
  readFrameTree @1 () -> (frames :FrameTreeSnapshot);
}

interface SensorStream {
  describe @0 () -> (info :SensorInfo);
  openRing @1 (config :StreamConfig) -> (ring :MemoryObject);
  readStatus @2 () -> (status :StreamStatus);
}

interface ActuatorCommand {
  describe @0 () -> (info :ActuatorInfo);
  submit @1 (frame :CommandFrame) -> (accepted :Bool);
  neutral @2 (reason :Text) -> ();
}

interface SafetyState {
  read @0 () -> (state :SafetySnapshot);
  subscribe @1 () -> (events :SensorStream);
}

interface ControlLoop {
  describe @0 () -> (info :LoopInfo);
  start @1 () -> ();
  stop @2 (reason :Text) -> ();
  readTelemetry @3 () -> (telemetry :LoopTelemetry);
}
CommandFrame must carry:
- sequence number;
- monotonic timestamp;
- deadline;
- command mode;
- coordinate frame;
- limit profile;
- typed payload;
- source identity;
- optional safety-envelope revision.
Command freshness is mandatory. If the frame is stale, the actuator gateway rejects it or transitions to neutral/safe state according to policy.
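The freshness rule above can be sketched as a host-side model. This is an illustration under assumptions, not the gateway implementation: the field names, the Verdict enum, the Gateway class, and the choice to reject replayed sequence numbers are all hypothetical; only the list of fields a frame must carry comes from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class CommandFrame:
    sequence: int
    timestamp_ns: int          # monotonic time at which the source built the frame
    deadline_ns: int           # monotonic time after which the frame must not apply
    mode: str
    coordinate_frame: str
    limit_profile: str
    payload: tuple[float, ...]
    source: str
    safety_envelope_rev: Optional[int] = None


class Gateway:
    """Toy actuator gateway that enforces ordering and freshness only."""

    def __init__(self, neutral_on_stale: bool) -> None:
        self.neutral_on_stale = neutral_on_stale
        self.last_sequence = -1

    def submit(self, frame: CommandFrame, now_ns: int) -> Verdict:
        if frame.sequence <= self.last_sequence:
            return Verdict.REJECT  # replayed or reordered frame (assumed policy)
        if now_ns > frame.deadline_ns or frame.timestamp_ns > now_ns:
            # Past its deadline, or stamped in the future: policy picks the outcome.
            return Verdict.NEUTRAL if self.neutral_on_stale else Verdict.REJECT
        self.last_sequence = frame.sequence
        return Verdict.ACCEPT
```

Passing `now_ns` in explicitly keeps the check deterministic and testable; a real gateway would read its own monotonic clock rather than trust the caller.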
Data Plane
Cap’n Proto is the control plane. Sensor and actuator streams need fixed-layout shared rings:
sequence
capture_time_ns
deadline_ns
frame_id
format
offset
length
flags
source_epoch
The ring can carry camera frames, lidar scans, IMU batches, encoder samples,
audio-like streams, or command telemetry. Payload bytes live in MemoryObject
backing storage. Producers and consumers coordinate through notification or
futex-like wakeups. Slow consumers drop or skip according to policy; they do
not backpressure a guaranteed control island.
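A minimal sketch of such a descriptor ring follows, with a bytearray standing in for MemoryObject backing storage. The field widths, the little-endian layout, and the DescriptorRing class are assumptions for illustration, not a capOS ABI. A real shared-memory ring would also need atomic publication and memory ordering, which this single-threaded model omits.

```python
import struct

# Assumed layout: widths and byte order are illustrative only.
DESCRIPTOR = struct.Struct("<QQQIIQIIQ")
FIELDS = ("sequence", "capture_time_ns", "deadline_ns", "frame_id", "format",
          "offset", "length", "flags", "source_epoch")


class DescriptorRing:
    """Single-producer descriptor ring; the producer never waits for consumers."""

    def __init__(self, slots: int) -> None:
        self.slots = slots
        self.buf = bytearray(slots * DESCRIPTOR.size)
        self.head = 0  # total number of descriptors ever published

    def publish(self, **fields: int) -> None:
        values = [fields[name] for name in FIELDS]
        slot = self.head % self.slots  # overwrites the oldest slot when full
        DESCRIPTOR.pack_into(self.buf, slot * DESCRIPTOR.size, *values)
        self.head += 1

    def read(self, cursor: int) -> tuple[dict, int, int]:
        """Return (descriptor, next_cursor, dropped) for a consumer at `cursor`."""
        oldest = max(0, self.head - self.slots)
        dropped = max(0, oldest - cursor)  # entries overwritten before we read them
        cursor = max(cursor, oldest)
        if cursor >= self.head:
            raise LookupError("no new descriptor")
        raw = DESCRIPTOR.unpack_from(self.buf, (cursor % self.slots) * DESCRIPTOR.size)
        return dict(zip(FIELDS, raw)), cursor + 1, dropped
```

The consumer owns its cursor, so a slow reader only loses its own data: it sees a `dropped` count and skips forward, while the producer's publish path stays constant-time.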
Realtime Islands
The robot-control equivalent of the media graph’s guaranteed realtime island is an admitted control loop:
flowchart LR
Sense[read sensors] --> Snapshot[input snapshot]
Snapshot --> Update[controller update]
Update --> Clamp[limit and safety clamp]
Clamp --> Write[write actuator command]
Write --> Telemetry[non-RT telemetry export]
Admission requires:
- fixed period and deadline;
- scheduling context with budget;
- preallocated input, output, and telemetry buffers;
- no allocation in the cycle;
- no blocking endpoint calls in the cycle;
- no credential checks, logging, service discovery, or model inference in the cycle;
- bounded data-age policy;
- command-limit and clamp policy;
- stale-command watchdog;
- overrun behavior.
Failure behavior is part of the contract. An overrun, stale input, revoked cap, or failed write should produce a deterministic result: hold, neutral, stop, drop, degrade mode, or fault the island. It should not build an unbounded queue of late commands.
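The deterministic-failure contract can be illustrated with a single-cycle model. This is a sketch under assumptions: LoopPolicy, Telemetry, run_cycle, the proportional update, and the choice of neutral as the outcome for both stale input and overrun are hypothetical. Execution time is passed in as a measured value so the example stays deterministic.

```python
from dataclasses import dataclass

NEUTRAL = 0.0


@dataclass
class LoopPolicy:
    period_ns: int
    max_input_age_ns: int
    limit: float  # symmetric command clamp


@dataclass
class Telemetry:
    cycles: int = 0
    overruns: int = 0
    stale_inputs: int = 0
    clamps: int = 0


def run_cycle(policy, telemetry, now_ns, input_value, input_time_ns,
              exec_time_ns, gain=1.0):
    """One cycle: snapshot, update, clamp, command, with fixed failure outcomes."""
    telemetry.cycles += 1
    if now_ns - input_time_ns > policy.max_input_age_ns:
        telemetry.stale_inputs += 1
        return NEUTRAL  # stale input: go neutral rather than act on old data
    if exec_time_ns > policy.period_ns:
        telemetry.overruns += 1
        return NEUTRAL  # overrun: the late command is dropped, never queued
    command = gain * input_value  # placeholder controller update
    clamped = max(-policy.limit, min(policy.limit, command))
    if clamped != command:
        telemetry.clamps += 1
    return clamped
```

The counters are plain integers updated in place, so the cycle itself allocates nothing and logs nothing; exporting them belongs to the non-realtime telemetry step.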
Compatibility Bridges
ROS 2 Bridge
The ROS 2 bridge should map selected topics, services, and actions to capOS capabilities. It must be configured from a manifest or broker policy:
- which ROS topics can be imported;
- which capOS sensor streams can be exported;
- which commands can reach an actuator gateway;
- freshness and rate limits;
- whether messages are best-effort, reliable, latched, or deadline-bound;
- how frames and transforms are mapped.
The bridge is not a general “ROS graph has all caps” adapter.
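A default-deny bridge policy along these lines could look like the sketch below. TopicRule, BridgePolicy, and the example topic names are hypothetical; the model covers only direction, freshness, and rate limits, not QoS mapping or transforms.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TopicRule:
    direction: str      # "import" (ROS 2 to capOS) or "export" (capOS to ROS 2)
    max_rate_hz: float
    max_age_ms: int


@dataclass
class BridgePolicy:
    rules: dict = field(default_factory=dict)         # topic name -> TopicRule
    last_seen_ms: dict = field(default_factory=dict)  # topic name -> last accepted time

    def allow(self, topic, direction, now_ms, stamp_ms):
        rule = self.rules.get(topic)
        if rule is None or rule.direction != direction:
            return False  # default deny: unlisted topics never cross the bridge
        if now_ms - stamp_ms > rule.max_age_ms:
            return False  # freshness limit
        last = self.last_seen_ms.get(topic)
        if last is not None and now_ms - last < 1000.0 / rule.max_rate_hz:
            return False  # rate limit
        self.last_seen_ms[topic] = now_ms
        return True


# Illustrative manifest: one command import and one odometry export.
policy = BridgePolicy(rules={
    "/cmd_vel": TopicRule("import", max_rate_hz=20.0, max_age_ms=100),
    "/odom": TopicRule("export", max_rate_hz=50.0, max_age_ms=200),
})
```

Because the rule table is the only source of authority, adding a topic to the ROS graph does nothing until the manifest or broker policy names it.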
micro-ROS / MCU Bridge
For small robots, the MCU bridge is the first practical hardware path:
- MCU closes motor PID, bumper debounce, watchdog, and current limits;
- capOS sends bounded velocity/setpoint frames;
- MCU publishes encoder, IMU, battery, bumper, and fault streams;
- stale capOS commands force neutral behavior.
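The MCU side of this split can be modeled on the host before any firmware exists. The sketch below assumes a hypothetical BaseWatchdog class, a per-wheel speed setpoint, and a latch-to-neutral timeout policy; real firmware would be written for the MCU and would also close the motor PID loop.

```python
class BaseWatchdog:
    """Host-side model of the MCU: apply a setpoint only while it is fresh."""

    def __init__(self, timeout_ms: int, max_speed: float) -> None:
        self.timeout_ms = timeout_ms
        self.max_speed = max_speed
        self.setpoint = (0.0, 0.0)  # (left, right) wheel speeds
        self.last_rx_ms = None

    def _clamp(self, value: float) -> float:
        return max(-self.max_speed, min(self.max_speed, value))

    def on_command(self, now_ms: int, left: float, right: float) -> None:
        # The MCU enforces its own limit regardless of what capOS sent.
        self.setpoint = (self._clamp(left), self._clamp(right))
        self.last_rx_ms = now_ms

    def output(self, now_ms: int) -> tuple[float, float]:
        if self.last_rx_ms is None or now_ms - self.last_rx_ms > self.timeout_ms:
            self.setpoint = (0.0, 0.0)  # latch neutral until a new command arrives
        return self.setpoint
```

The same model can serve as the reference for a stale-command smoke test: stop sending commands, advance time past the timeout, and check that the output is neutral.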
MAVLink / Autopilot Bridge
For drones and some rovers:
- autopilot owns arming, stabilization, failsafe, and flight termination;
- capOS consumes telemetry and sends high-level setpoints or missions;
- bridge enforces geofence, mode, rate, and authority limits;
- direct actuator override is absent or gated behind stronger policy.
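The bridge-side limits can be sketched as a pure filter applied before any setpoint is forwarded to the autopilot. Everything here is an assumption for illustration: the Envelope type, the circular geofence around a home point, the local north/east offsets in metres, and the mode names. No MAVLink messages are constructed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    allowed_modes: frozenset
    fence_radius_m: float   # circular geofence centred on the home point
    max_speed_mps: float


def filter_setpoint(env, mode, north_m, east_m, speed_mps):
    """Return (ok, reason) for a setpoint given as offsets from home."""
    if mode not in env.allowed_modes:
        return False, "mode"
    if math.hypot(north_m, east_m) > env.fence_radius_m:
        return False, "geofence"
    if speed_mps > env.max_speed_mps:
        return False, "speed"
    return True, "ok"


env = Envelope(allowed_modes=frozenset({"guided"}), fence_radius_m=50.0,
               max_speed_mps=5.0)
```

The filter narrows what capOS may request; it does not replace the autopilot's own failsafe, which still owns arming and flight termination.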
OPC UA / Manufacturing Bridge
For industrial cells:
- OPC UA gateway imports cell, robot, fixture, and job state;
- capOS exposes typed job/status/alarm caps;
- robot program selection and start/stop are separate authorities;
- safety state is read independently and cannot be overridden by job logic.
Product-Level Targets
Simulation Robot
The first milestone should be visible without hardware: boot capOS, launch a simulated differential-drive robot, publish fake lidar/odometry, run a behavior service, send bounded drive commands, and log telemetry. This proves the capability graph and stale-command behavior.
Vacuum / Indoor Mobile Robot
Next target: capOS on an SBC with an MCU base controller.
- capOS runs mapping, local planning, cleaning behavior, docking, UI, and logs.
- MCU runs wheel control, bumper/cliff protection, and motor watchdog.
- BaseDrive accepts velocity commands with deadlines.
- Loss of capOS or command authority stops motion.
RC Car / Rover
RC-car class demo:
- camera/IMU/GPS sensor services;
- teleop and autonomous mode caps;
- steering/throttle gateway with watchdog;
- geofence and speed envelope;
- logs for every actuator-affecting command.
Manufacturing Cell Supervisor
Industrial demo:
- OPC UA or mock PLC gateway;
- robot program selection as a typed capability;
- cell-state and alarm streams;
- operator approval for mutating actions;
- no attempt to replace certified safety functions.
Autonomous Vehicle Research Host
Autoware-like demo:
- perception, localization, planning, control, and vehicle-interface services;
- simulator or closed-course interface;
- independent safety gateway;
- command envelopes and audit.
This remains a research host, not a road-certified system.
Security Invariants
- Actuator gateways are narrow and mode-limited.
- Safety monitor authority is independent from planner and agent authority.
- Model processes never receive actuator, safety, or raw device caps.
- Operator UI receives consent and status caps, not raw hardware caps.
- Bridges do not receive ambient service discovery authority.
- Every actuator-affecting command is auditable by source, mode, limits, safety-state revision, timestamp, and result.
- Revoking command authority causes stale handles and future commands to fail closed.
- Device-facing services obey the DeviceMmio, DMAPool, and Interrupt authority model before userspace drivers touch physical hardware.
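Two of these invariants, fail-closed revocation and per-command audit, can be modeled together. The sketch below is hypothetical throughout: CommandAuthority, CommandHandle, the epoch scheme, and the audit tuple layout are illustrative and say nothing about how capOS implements revocation.

```python
class Revoked(Exception):
    """Raised when a command is submitted through a revoked handle."""


class CommandAuthority:
    """Owner side: mints handles and revokes them by bumping an epoch."""

    def __init__(self) -> None:
        self.epoch = 0
        self.audit = []  # (time_ns, source, value, result) records

    def mint(self, source: str) -> "CommandHandle":
        return CommandHandle(self, source, self.epoch)

    def revoke_all(self) -> None:
        self.epoch += 1  # every outstanding handle is now stale


class CommandHandle:
    def __init__(self, authority: CommandAuthority, source: str, epoch: int) -> None:
        self._authority = authority
        self.source = source
        self._epoch = epoch

    def submit(self, value: float, now_ns: int) -> None:
        ok = self._epoch == self._authority.epoch
        # Record the attempt first, so rejected commands are auditable too.
        result = "accepted" if ok else "revoked"
        self._authority.audit.append((now_ns, self.source, value, result))
        if not ok:
            raise Revoked(self.source)
```

A stale handle fails on every later use without the holder's cooperation, and the audit trail shows both the accepted command and the rejected attempt.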
Scheduling Dependencies
This proposal depends on future scheduling work:
- per-thread rings for full-SMP ownership;
- notification objects for low-overhead wakeups;
- scheduling contexts with period/budget/priority;
- CPU affinity and isolation for admitted loops;
- TLB shootdown and SMP-safe address-space migration;
- timing telemetry and overrun events;
- eventually WCET evidence for hard-realtime claims.
Until those exist, docs and demos must say “bounded soft realtime” or “supervised external controller”, not “hard realtime.”
Implementation Sequence
- Add simulation-only robot services and typed fake sensor/actuator caps.
- Add RobotDescription, SensorStream, ActuatorCommand, SafetyState, and ControlLoop draft schemas.
- Add a QEMU/host smoke that proves stale drive commands fail closed.
- Add a differential-drive MCU bridge design and host-side simulator.
- Add ROS 2 bridge proposal detail for selected topics/actions and transforms.
- Add control-loop telemetry counters: period, execution time, overrun, data age, command age, clamp, neutral, and safety fault.
- Bind a local controller to scheduling contexts once the scheduler supports budgeted realtime islands.
- Add manufacturing gateway design over OPC UA or a mock PLC protocol.
- Add hardware-in-loop criteria before any real actuator demo is treated as a milestone.
Open Questions
- Should the first visible milestone be simulation-only or a small physical differential-drive base?
- Should robot schemas live in schema/capos.capnp or a separate robotics schema compiled by the same build pipeline?
- Which transform-tree representation fits capOS best: immutable snapshots, streaming deltas, or both?
- How should command envelopes compose when operator, planner, safety monitor, and actuator gateway all impose limits?
- What is the minimum useful ROS 2 bridge: topics only, or topics plus actions for Nav2-style navigation?
- Does SensorStream generalize the media-ring design, or should robotics get a distinct stream ABI?