Research: Robotics Realtime Control

Survey of robotics realtime-control practice and the consequences for using capOS as a robot brain for industrial robots, vacuum/mobile robots, RC cars, drones, and autonomous vehicles.

Scope

This note is about the operating-system and middleware boundary, not robot kinematics or control theory. The capOS question is whether a capability OS can be a credible robot brain without pretending that every perception, planning, networking, and actuator path has the same timing or safety requirements.

The answer is conditional:

  • capOS is a plausible high-level robot brain and isolation substrate.
  • capOS should eventually host bounded realtime control islands.
  • capOS should not claim certified hard-realtime safety-controller status until scheduling contexts, driver isolation, timing analysis, fault containment, and certification evidence exist.
  • For early physical robots, capOS should supervise and coordinate while microcontrollers, PLCs, motor controllers, or flight controllers close the tightest safety loops.

Source Snapshot

External source observations below were checked on 2026-04-25. Local docs/research/ contents checked before adding this note were:

capnp-error-handling.md
completion-ring-threading.md
eros-capros-coyotos.md
genode.md
ix-on-capos-hosting.md
llvm-target.md
multimedia-pipeline-latency.md
os-error-handling.md
out-of-kernel-scheduling.md
pingora.md
plan9-inferno.md
realtime-multimodal-agent-apis.md
sel4.md
small-llm-survey.md
x2apic-and-virtualization.md
zircon.md

Related local grounding:

External Findings

ROS 2 Realtime Direction

ROS 2 documentation frames realtime computing as central to autonomous vehicles, spacecraft, and industrial manufacturing. Its realtime programming guide emphasizes periodic loops, bounded jitter, and avoiding page faults, dynamic allocation, and indefinitely blocking synchronization on the realtime path.

The ROS 2 design background makes a sharper point: an OS can provide deterministic services, but application code must still avoid nondeterministic behavior. It recommends separating startup/preallocation, realtime-safe loop, and teardown phases. This maps directly to capOS admission: graph setup may use ordinary capability calls, but the admitted realtime cycle must run over preallocated buffers and pre-authorized work.

ros2_control

The ros2_control controller manager is a useful concrete precedent. It owns a periodic hardware-control loop whose shape is read state, update controllers, and write commands. It attempts to run the main controller thread under SCHED_FIFO and reports controller/hardware periodicity and execution-time diagnostics. Its documentation warns that normal Linux is throughput-oriented rather than ideal for hardware control.

Consequences for capOS:

  • The robot-control API should make the cyclic read/update/write loop explicit.
  • Controller activation, hardware claiming, fallback, and limits are safety policy, not incidental plugin mechanics.
  • Periodicity, execution time, overruns, and command-limit enforcement need to be first-class telemetry.
  • A controller state query or lifecycle transition that is not realtime-safe must be prohibited inside the admitted control loop.
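A minimal sketch of one explicit read/update/write cycle with first-class telemetry is shown below. The names `LoopTelemetry` and `run_cycle`, the injected `clock`, and the symmetric `limit` are assumptions made for illustration; they do not describe the ros2_control API or any existing capOS interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopTelemetry:
    cycles: int = 0
    overruns: int = 0
    clamped: int = 0
    max_exec_s: float = 0.0


def run_cycle(read_state: Callable[[], float],
              update: Callable[[float], float],
              write_command: Callable[[float], None],
              clock: Callable[[], float],
              period_s: float,
              limit: float,
              telemetry: LoopTelemetry) -> None:
    """Run one read/update/write cycle and record its timing."""
    start = clock()
    state = read_state()                      # read
    command = update(state)                   # update
    clamped = max(-limit, min(limit, command))
    if clamped != command:
        telemetry.clamped += 1                # command-limit enforcement is counted
    write_command(clamped)                    # write
    elapsed = clock() - start
    telemetry.cycles += 1
    telemetry.max_exec_s = max(telemetry.max_exec_s, elapsed)
    if elapsed > period_s:
        telemetry.overruns += 1               # execution exceeded the period
```

Passing the clock in as a parameter keeps the cycle testable with a fake time source; a real loop would read a monotonic clock.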

micro-ROS Executor

micro-ROS documents why the default ROS 2 executor is problematic for deterministic robotic control: timer precedence, non-preemptive round-robin callback execution, no explicit callback priority, and only one input per handle can each cause priority inversion and weak latency bounds. Its rclc Executor adds static sequential execution, trigger conditions, optional multi-thread scheduling configuration, and Logical Execution Time (LET) semantics. It also allocates memory for callback handles during configuration, not at runtime.

Consequences for capOS:

  • A robot graph should have an explicit execution plan, not generic event-loop fairness.
  • Sense-plan-act phases should be expressible as a timed DAG with trigger conditions.
  • LET-style input/output boundaries are useful for sensor fusion and multi-rate control where lower jitter is worth one controlled period of latency.
  • Runtime graph mutation belongs outside the realtime cycle.
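The LET idea can be illustrated with a small sketch. `LetExecutor` is a hypothetical class, not the rclc Executor: it runs a fixed plan in a fixed order, latches inputs at the start of each period, and makes each step's output visible only at the next period boundary.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, Callable[[Dict[str, float]], float]]


class LetExecutor:
    def __init__(self, plan: List[Step], initial: Dict[str, float]):
        self.plan = list(plan)          # execution order fixed at configuration
        self.published = dict(initial)  # values visible to readers
        self._pending = dict(initial)   # values computed during this period

    def run_period(self, sensor_inputs: Dict[str, float]) -> None:
        # Period boundary: publish the previous period's outputs.
        self.published.update(self._pending)
        # Latch inputs; steps read this snapshot, never each other's fresh output.
        snapshot = dict(self.published)
        snapshot.update(sensor_inputs)
        for name, fn in self.plan:
            self._pending[name] = fn(snapshot)
```

With a plan of `fused = imu + odom` followed by `cmd = 2 * fused`, the fused value appears one period after its inputs and the command one period after that. That is the controlled latency traded for lower jitter.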

Current Research Trend

A 2026 ROS 2 realtime survey reports that recent work focuses on executor analysis, DDS communication delays, response time, reaction time, data age, message filters, profiling tools, and micro-ROS. That confirms that the hard part is not merely “use ROS 2”; it is making callback scheduling, data age, and communication delays analyzable.

ReDAG-RT, submitted in March 2026, is a recent example of the same pressure. It adds a user-space global scheduler for ROS 2 callback DAGs using rate-priority ordering and per-DAG concurrency bounds. The result is relevant even if capOS does not run ROS 2 unchanged: robot workloads want graph-level scheduling policy with bounded interference, not only thread priorities.

A UAV PREEMPT_RT paper submitted in April 2026 studies a 250 Hz flight-control loop on Raspberry Pi 5 and isolates timing effects from deferred Linux activation paths versus direct realtime activation. The useful warning for capOS is that multicore SoC shared-resource contention can dominate nominal loop frequency. Capability isolation is not sufficient without temporal and cache/bus interference accounting.

seL4 MCS And Timing Work

seL4 MCS exposes scheduling contexts as kernel-managed objects, including periodic threads and passive servers. The Trustworthy Systems timing work emphasizes deadline guarantees, temporal isolation, and WCET analysis for kernel paths.

Consequences for capOS:

  • Processor time should become explicit authority. A process that holds the authority to command a motor still needs separate budget authority to do so at a fixed period.
  • Passive-server and scheduling-context donation semantics fit robot services: a controller can run on the caller’s admitted budget when that is the intended timing contract.
  • Hard realtime claims require bounded kernel paths and timing evidence, not only a priority scheduler.
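To make "processor time as authority" concrete, here is a deliberately simplified budget/period model. It is not seL4's sporadic-server algorithm and not a capOS interface: `SchedulingContext`, `charge`, and `passive_server_call` are hypothetical, and the refill rule (full budget at each period boundary) is an assumption chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class SchedulingContext:
    budget_us: int
    period_us: int
    remaining_us: int = 0
    next_replenish_us: int = 0

    def charge(self, now_us: int, cost_us: int) -> bool:
        """Bill cost_us to this context; False means the work is not admitted."""
        if now_us >= self.next_replenish_us:
            # Skip any whole periods that elapsed, then refill once.
            elapsed = (now_us - self.next_replenish_us) // self.period_us + 1
            self.next_replenish_us += elapsed * self.period_us
            self.remaining_us = self.budget_us
        if cost_us > self.remaining_us:
            return False
        self.remaining_us -= cost_us
        return True


def passive_server_call(caller_sc: SchedulingContext,
                        now_us: int, cost_us: int) -> bool:
    # A passive server has no budget of its own; work is billed to the caller.
    return caller_sc.charge(now_us, cost_us)
```

A caller with 200 µs per 1000 µs can spend 150 µs itself, but a further 100 µs server call in the same period is refused until the next refill.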

Linux PREEMPT_RT And Xenomai

The Linux kernel now documents PREEMPT_RT internals, including priority inheritance, threaded interrupts, and differences from non-RT kernels. Xenomai remains a strong precedent for systems that split stringent realtime work into a co-kernel or companion core while keeping Linux services available for ordinary work.

Consequences for capOS:

  • There is a practical ladder: normal scheduling, soft realtime with telemetry, admitted realtime islands, and hard device deadlines.
  • If capOS cannot yet provide hard bounds, it should make that status visible instead of hiding it behind a “realtime” label.
  • A future capOS robotics platform may still delegate the smallest motor or flight-control loop to an MCU/RTOS while capOS owns capability isolation, planning, perception, logging, updates, and operator control.

Orocos

Orocos is a long-running robotics control precedent: portable C++ libraries for advanced machine and robot control, with the Real-Time Toolkit as a component framework for realtime components.

Consequence for capOS: robotics developers need component lifecycle, deployment, ports, and runtime introspection. capOS should not expose only raw actuator writes; it needs a component/graph model where a realtime component can be admitted, activated, monitored, and deactivated without granting broad device authority.

Mobile Robots, Drones, And Cars

Nav2 presents a production-grade ROS 2 navigation framework for mobile and surface robots, with perception, planning, control, localization, behaviors, collision monitoring, docking, and teleoperation. It is the right class of software for vacuum cleaners, warehouse robots, rovers, and small RC-car autonomy, but it is not itself a hard-safety controller.

PX4 recommends ROS 2 for companion-computer integration when low latency and Linux libraries matter, while the autopilot remains the flight controller. ArduPilot documents the same split: companion computers consume MAVLink telemetry and make higher-level decisions while the autopilot owns the hard vehicle-control loop.

Autoware is the comparable open-source autonomous-driving stack. It is built on ROS and presents perception, localization, planning, control, and vehicle interface modules for autonomous driving. That is the right architectural shape for a capOS self-driving-car prototype: capOS can isolate and supervise modules, but a safety-certified vehicle interface and independent safety controller remain mandatory.

Manufacturing Interoperability

OPC UA Companion Specifications exist to define industry/device-specific information models and environment profiles. OPC UA is designed to scale from field-level devices to enterprise management. For manufacturing robots, this matters because the robot brain rarely talks only to motors; it must also exchange state, jobs, alarms, and audit data with PLCs, MES/SCADA systems, and vendor controllers.

Consequence for capOS: industrial integration should use typed gateway services. A capOS robot brain should expose and consume narrow manufacturing capabilities such as RobotCellStatus, JobQueue, SafetyState, ProgramSelector, and AlarmLog, not ambient network sockets or filesystem paths.

Timing Classes

Robots mix several timing classes:

| Class | Typical loop | Examples | capOS stance |
|---|---|---|---|
| Hard safety | microseconds to milliseconds | e-stop chain, torque disable, flight stabilization | external certified controller first; future capOS only with evidence |
| Cyclic motion control | 250 Hz to 4 kHz or higher | joint servo, wheel velocity, PWM/ESC updates, EtherCAT cycle | future admitted realtime island; early offload to MCU/PLC |
| Local autonomy | 10 Hz to 100 Hz | obstacle avoidance, local planner, odometry fusion | plausible early capOS target with deadline/drop telemetry |
| Perception and mapping | 1 Hz to 60 Hz | camera/lidar processing, SLAM, object detection | capOS service graph, GPU/NPU caps later |
| Mission behavior | event-driven to 10 Hz | route plan, behavior tree, job dispatch, teleop mode | strong capOS fit |
| Fleet/cloud integration | seconds and slower | logs, updates, digital twin, MES/SCADA | strong capOS fit |

The mistake would be to put all of these on one generic executor and call it a robot brain. The capOS advantage is that each row can have different authority, budget, telemetry, and failure policy.

Domain Consequences

Manufacturing Robots

capOS can plausibly supervise a robot cell:

  • isolate vendor robot gateways, PLC gateways, camera/lidar services, planning services, operator UI, audit, and update agents;
  • hold explicit capabilities for cell state, job selection, robot program invocation, fixtures, safety-state observation, and logs;
  • run non-safety planning and perception near the robot;
  • bridge OPC UA, fieldbus, and vendor APIs through narrow service caps.

capOS should not initially replace:

  • certified safety PLCs;
  • e-stop and guarding;
  • servo drives’ inner control loops;
  • vendor-certified robot-controller safety functions.

Vacuum Cleaners And Indoor Mobile Robots

capOS is a better early fit here:

  • high-level mapping, route planning, room segmentation, cleaning policy, docking, telemetry, and operator control are natural services;
  • wheel PID, bumper debounce, cliff sensors, battery protection, and motor current cutoffs can stay on a small MCU;
  • Nav2-like navigation concepts can map to capOS graph services and typed actuator/sensor caps.

The first useful physical demo could be a small differential-drive base with capOS running on an SBC and an MCU exposing a typed BaseDrive cap.

RC Cars And Rovers

An RC-car-class platform is a good capOS autonomy test because it is simple enough to instrument and unsafe enough to require strict boundaries:

  • capOS can run teleop, camera perception, local planning, logging, and a geofenced mission controller;
  • PWM/ESC steering and throttle should be mediated by a microcontroller or device service with a watchdog;
  • command caps should carry speed, steering, freshness deadline, and mode;
  • stale or revoked command authority should force neutral throttle and safe steering.
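The stale-or-revoked rule can be sketched as a small gateway that sits between the planner and the PWM/ESC outputs. `DriveCommand`, `DriveGateway`, the `"auto"`/`"teleop"` mode names, and the choice of centered steering as the safe steering value are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

NEUTRAL: Tuple[float, float] = (0.0, 0.0)  # (speed, steering); centered assumed safe


@dataclass(frozen=True)
class DriveCommand:
    speed: float
    steering: float
    deadline_s: float   # freshness deadline on the gateway's monotonic clock
    mode: str


class DriveGateway:
    def __init__(self, max_speed: float, allowed_modes=("auto", "teleop")):
        self.max_speed = max_speed
        self.allowed_modes = tuple(allowed_modes)
        self.revoked = False
        self._latest: Optional[DriveCommand] = None

    def submit(self, cmd: DriveCommand) -> bool:
        if self.revoked or cmd.mode not in self.allowed_modes:
            return False
        self._latest = cmd
        return True

    def revoke(self) -> None:
        self.revoked = True
        self._latest = None

    def output(self, now_s: float) -> Tuple[float, float]:
        """Called every watchdog tick; decides what reaches the actuators."""
        cmd = self._latest
        if self.revoked or cmd is None or now_s > cmd.deadline_s:
            return NEUTRAL
        speed = max(-self.max_speed, min(self.max_speed, cmd.speed))
        return (speed, cmd.steering)
```

Because `output` is evaluated on every tick rather than on every submit, a planner that simply stops sending commands still results in neutral outputs once the last deadline passes.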

Drones

capOS should be a companion computer first:

  • consume MAVLink/uORB-like telemetry through a typed autopilot bridge;
  • run perception, mapping, object tracking, mission planning, and logging;
  • send high-level setpoints only through a FlightSetpoint cap with mode, envelope, rate, and geofence limits;
  • never bypass the flight controller’s arming, failsafe, and stabilization logic in early stages.

Self-Driving Cars

capOS is a research host for autonomous-driving software, not a near-term safety-certified vehicle OS. In that role it should:

  • isolate perception, localization, prediction, planning, map, and vehicle interface modules;
  • make every actuator-affecting path explicit and auditable;
  • use a safety gateway that clamps commands to an envelope and can degrade to minimal-risk behavior;
  • keep independent safety monitors and hardware controls outside the model or planner process.

The useful capOS contribution is not “the LLM drives the car.” It is a capability and timing architecture that prevents perception, model, network, UI, or update components from accidentally gaining actuator or safety authority.

capOS Design Consequences

Robot Brain Means Authority Router, Not Monolith

The robot brain should be a composed service graph:

flowchart LR
    Sensors[Sensor services] --> Perception[Perception]
    Perception --> World[World model]
    World --> Planner[Planner / behavior]
    Planner --> Control[Controller island]
    Control --> Actuators[Actuator gateway]

    Safety[Safety monitor] --> Control
    Safety --> Actuators
    Operator[Operator UI / teleop] --> Planner
    Audit[Audit / telemetry] --- Sensors
    Audit --- Control

The security boundary is the capability graph. The timing boundary is the admitted realtime island. Both must be visible in documentation and telemetry.

Control-Loop Admission

A future ControlLoopManager should admit a loop only once the loop has:

  • fixed period and deadline;
  • declared worst-case execution budget;
  • preallocated command/state buffers;
  • reserved scheduling context;
  • pinned or registered memory for device I/O;
  • bounded input data age policy;
  • actuator command clamp policy;
  • overrun policy;
  • watchdog/freshness behavior;
  • audit/telemetry route outside the realtime path.

No Cap’n Proto allocation, service discovery, logging, credential lookup, model inference, network fetch, filesystem access, or policy prompt belongs in the admitted loop.
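The admission checklist above could be expressed as a pure validation function, sketched below. `LoopRequest` and `admission_errors` are hypothetical names, and the ordering constraints (budget within deadline, deadline within period) are assumptions that a real admission test might relax or tighten.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LoopRequest:
    period_us: int
    deadline_us: int
    wcet_budget_us: int
    max_input_age_us: int
    buffers_preallocated: bool
    scheduling_context_reserved: bool
    io_memory_pinned: bool
    has_clamp_policy: bool
    has_overrun_policy: bool
    has_watchdog: bool
    telemetry_off_rt_path: bool


def admission_errors(req: LoopRequest) -> List[str]:
    """Return the reasons a loop cannot be admitted; empty means admissible."""
    errors = []
    if req.period_us <= 0:
        errors.append("period must be positive")
    if not 0 < req.deadline_us <= req.period_us:
        errors.append("deadline must lie within the period")
    if not 0 < req.wcet_budget_us <= req.deadline_us:
        errors.append("execution budget must fit inside the deadline")
    if req.max_input_age_us <= 0:
        errors.append("input data age must be bounded")
    required = {
        "preallocated buffers": req.buffers_preallocated,
        "reserved scheduling context": req.scheduling_context_reserved,
        "pinned I/O memory": req.io_memory_pinned,
        "command clamp policy": req.has_clamp_policy,
        "overrun policy": req.has_overrun_policy,
        "watchdog behavior": req.has_watchdog,
        "telemetry route off the realtime path": req.telemetry_off_rt_path,
    }
    errors.extend(f"missing {name}" for name, ok in required.items() if not ok)
    return errors
```

Returning a list of reasons rather than a boolean makes a refused admission auditable, which matches the emphasis on telemetry elsewhere in this note.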

Capability Shapes

Likely future interfaces:

interface SensorStream {
  describe @0 () -> (info :SensorInfo);
  openRing @1 (config :StreamConfig) -> (ring :MemoryObject);
  readStatus @2 () -> (status :StreamStatus);
}

interface ActuatorCommand {
  describe @0 () -> (info :ActuatorInfo);
  submit @1 (command :CommandFrame) -> (accepted :Bool);
  neutral @2 (reason :Text) -> ();
}

interface ControlLoop {
  describe @0 () -> (info :LoopInfo);
  start @1 () -> ();
  stop @2 (reason :Text) -> ();
  readTelemetry @3 () -> (telemetry :LoopTelemetry);
}

interface SafetyState {
  read @0 () -> (state :SafetySnapshot);
  subscribe @1 () -> (events :SensorStream);
}

CommandFrame should include sequence, monotonic timestamp, deadline, coordinate frame, mode, limit profile, and typed payload. A stale command is a failed command.
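A sketch of how an actuator gateway might check the frame header before looking at the payload is given below. The field names follow the list above, but the concrete types, the nanosecond units, and the `FrameValidator` class with its result strings are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandFrame:
    sequence: int
    timestamp_ns: int     # monotonic time at which the command was produced
    deadline_ns: int      # the command must not be applied after this time
    coordinate_frame: str
    mode: str
    limit_profile: str
    payload: tuple


class FrameValidator:
    def __init__(self) -> None:
        self.last_sequence = -1

    def check(self, frame: CommandFrame, now_ns: int) -> str:
        if frame.sequence <= self.last_sequence:
            return "out_of_order"      # replayed or reordered frame
        if frame.timestamp_ns > now_ns:
            return "from_future"       # clock mismatch; do not trust the frame
        if now_ns > frame.deadline_ns:
            return "stale"             # a stale command is a failed command
        self.last_sequence = frame.sequence
        return "accepted"
```

Only accepted frames advance the sequence counter in this sketch, so a stale frame does not block a later fresh frame with a higher sequence number.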

Robot Description And Frames

capOS needs a typed robot description model rather than an ambient URDF file path. A robot description service should expose:

  • kinematic tree;
  • named frames and transforms;
  • joint limits and command interfaces;
  • sensors, actuators, and calibration;
  • safety envelopes and operating modes;
  • firmware/controller identity;
  • simulation twins.

The description is read-only to most services. Mutating calibration or limits requires a separate authority and should produce audit records.
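The split between read access and mutation authority can be sketched as two separately handed-out objects. Python cannot enforce capabilities, so this only illustrates the shape: `RobotDescription`, `read_view`, and `calibration_authority` are hypothetical names, and joint limits stand in for the full description.

```python
from types import MappingProxyType
from typing import Callable, Dict, List, Mapping, Optional, Tuple

AuditRecord = Tuple[str, str, Optional[float], float]


class RobotDescription:
    def __init__(self, joint_limits: Dict[str, float]):
        self._joint_limits = dict(joint_limits)
        self.audit_log: List[AuditRecord] = []

    def read_view(self) -> Mapping[str, float]:
        # Most services get this: a live view that rejects item assignment.
        return MappingProxyType(self._joint_limits)

    def calibration_authority(self, holder: str) -> Callable[[str, float], None]:
        # Handed only to a calibration tool; every change is recorded.
        def set_limit(joint: str, limit: float) -> None:
            old = self._joint_limits.get(joint)
            self._joint_limits[joint] = limit
            self.audit_log.append((holder, joint, old, limit))
        return set_limit
```

A holder of the read view sees updated limits immediately but cannot change them; only the holder of the separately granted setter can, and each change leaves an audit record.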

ROS 2 Compatibility

capOS should not try to replace the robotics ecosystem in the first pass. It should host compatibility bridges:

  • ROS 2 graph bridge for topics/actions/services;
  • micro-ROS/MCU bridge for embedded controllers;
  • MAVLink bridge for autopilots;
  • OPC UA bridge for manufacturing cells;
  • simulation bridge for Gazebo/Isaac/Webots-like tools.

Each bridge receives only the caps it needs. A ROS bridge should not become an ambient authority tunnel from the ROS graph to every actuator.
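One way to picture a bridge that is not an ambient authority tunnel is a fixed table from topic names to the specific capabilities granted at construction. `RosBridge`, `on_ros_message`, and the topic names in the usage note are hypothetical; no real ROS 2 client library is involved.

```python
from typing import Callable, Dict, List


class RosBridge:
    """Forwards messages only for topics that were explicitly granted a cap."""

    def __init__(self, grants: Dict[str, Callable[[float], bool]]):
        self._grants = dict(grants)      # fixed at bridge creation
        self.rejected: List[str] = []    # telemetry for refused topics

    def on_ros_message(self, topic: str, value: float) -> bool:
        handler = self._grants.get(topic)
        if handler is None:
            # No capability was granted for this topic, so nothing is reachable.
            self.rejected.append(topic)
            return False
        return handler(value)
```

A bridge built with only a `/cmd_vel` grant can forward velocity commands to one drive capability, while a message on `/gripper/open` is refused and counted, because the bridge was never handed a gripper capability.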

Models And Agents

Language or vision-language models can help with:

  • operator command interpretation;
  • diagnostics and log summarization;
  • task planning under human approval;
  • visual inspection;
  • code/config generation in simulation.

They must not hold actuator caps. Model output is untrusted. A planner or agent may propose a mission step, but a trusted runner must validate it against tool descriptors, safety state, geofence, mode, and command limits before any actuator-affecting capability is invoked.
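The trusted-runner check might look like the sketch below. `ProposedStep`, `SafetyContext`, and `validate_step` are hypothetical, the rectangular geofence is a simplification, and the `"autonomous"` mode name is an assumption. The model only ever produces a `ProposedStep`; it never receives the actuator capability.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple


@dataclass(frozen=True)
class ProposedStep:
    tool: str
    target_xy: Tuple[float, float]
    speed: float


@dataclass(frozen=True)
class SafetyContext:
    allowed_tools: FrozenSet[str]
    safety_ok: bool
    mode: str
    geofence: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
    max_speed: float


def validate_step(step: ProposedStep, ctx: SafetyContext) -> List[str]:
    """Return rejection reasons; the runner acts only when the list is empty."""
    reasons = []
    if step.tool not in ctx.allowed_tools:
        reasons.append("unknown tool")
    if not ctx.safety_ok:
        reasons.append("safety state not ok")
    if ctx.mode != "autonomous":
        reasons.append("mode does not permit autonomous steps")
    xmin, ymin, xmax, ymax = ctx.geofence
    x, y = step.target_xy
    if not (xmin <= x <= xmax and ymin <= y <= ymax):
        reasons.append("target outside geofence")
    if not 0.0 <= step.speed <= ctx.max_speed:
        reasons.append("speed outside command limits")
    return reasons
```

Rejecting an over-limit step outright, rather than silently clamping it, keeps the model's proposal and the executed action distinguishable in the audit trail; clamping can still happen later at the actuator gateway.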

Safety And Certification Gap

capOS currently has no certification story for:

  • IEC 61508 / ISO 13849 / ISO 10218 / ISO 26262 style evidence;
  • bounded interrupt latency on target hardware;
  • WCET for kernel paths;
  • IOMMU-backed driver isolation for physical devices;
  • independent safety monitor authority;
  • safe boot/update rollback for robots;
  • fault-injection and hardware-in-loop test evidence.

Therefore the honest position is:

  • research/simulation: capOS can be the main robot OS;
  • hobby mobile robot: capOS can be the SBC brain with MCU safety;
  • industrial cell: capOS can supervise and integrate, not replace safety PLCs;
  • self-driving car: capOS can host research autonomy modules behind a safety gateway, not claim road-safety control.

Implementation Path

  1. Simulation-only robot graph: fake sensors, fake actuators, behavior service, and audit, all over typed capabilities.
  2. Differential-drive demo: BaseDrive MCU bridge, encoder/IMU sensor stream, watchdog, stale-command neutral behavior, and QEMU/host simulation proof.
  3. ROS 2/Nav2 bridge: import/export selected topics/actions with explicit caps and no broad graph authority.
  4. Control-loop telemetry: deadline, data age, overrun, stale command, clamp, watchdog reset, and safety-state event counters.
  5. Realtime island prototype: fixed-period local controller over preallocated rings once scheduling contexts and notification objects exist.
  6. Device authority integration: fieldbus/CAN/EtherCAT/serial through DeviceMmio, DMAPool, Interrupt, or userspace driver caps after the DMA isolation gate.
  7. Manufacturing gateway: OPC UA/PLC bridge exposing cell status, job dispatch, alarms, and robot-program selection as typed caps.
  8. Autonomy stack: perception/planning/control services with explicit timing and safety envelopes.

Open Questions

  • Should capOS define a native robot-description schema or import URDF/SDF into a normalized capability service?
  • Should the first physical demo target a differential-drive base, RC car, or manipulator simulator?
  • What is the smallest useful scheduling-context API for a 50-100 Hz mobile robot controller?
  • How should transform-tree state be represented: service, shared snapshot ring, or both?
  • Where should command-limit enforcement live: actuator gateway, controller, safety monitor, or all three with different authority?
  • Can the same media graph ring shape support camera/lidar frames and audio, or does robot perception need a distinct sensor-stream ABI?

References