Proposal: Lua Scripting
How capOS should add Lua as a small capability-aware scripting environment without turning scripts into ambiently privileged shell fragments.
Problem
capOS needs a lightweight scripting path for operator workflows, demos, service glue, and eventually interactive shell automation. The native shell already exposes typed capabilities and explicit child grants, but a shell REPL is not a full programming language. Lua is attractive because it is small, embeddable, and designed to let a host provide the domain API.
The risk is predictable: “system scripting” often becomes an escape hatch
around the operating system model. A script runner that receives broad
ProcessSpawner, BootPackage, filesystem, network, or terminal authority
and then exposes io, os, package.loadlib, or raw handle integers would
recreate the ambient authority capOS is trying to avoid.
The target is not “make Lua root.” The target is:
- Lua as ordinary userspace code.
- Capabilities as the only authority.
- Host-provided Lua libraries that map to typed capOS interfaces.
- Exact grants for script processes, with no default filesystem, network, process, terminal, or debug authority.
Scope
In scope:
- A capos-lua userspace runner for trusted operator and service scripts.
- A small Lua host API over capos-rt typed clients.
- A policy for standard Lua libraries on capOS.
- Script packaging and shell launch shape.
- Validation through QEMU scripts that prove granted and ungranted paths.
Out of scope for the first implementation:
- LuaJIT.
- Dynamic native Lua C modules.
- A POSIX-compatible Lua environment.
- Treating in-process Lua sandboxing as the isolation boundary for hostile scripts.
- Kernel awareness of Lua.
Current Manual Pages
- Programming Languages is the language-status index. The Lua row tracks the in-tree demos/lua-smoke/ runner against the Rust, Python, Go, C/C++, WASI, and POSIX adapter rows and is the page to update whenever the runtime label or phase status changes.
- Userspace Runtime documents the implemented capos-rt surface (entry, allocator, syscall, CapSet lookup, typed ConsoleClient/TimerClient/VirtualMemoryClient) that the Lua runner consumes today through host::Host::register_console, register_timer, and register_memory. Any new Lua binding starts by identifying the matching typed client on this page, not by reaching into raw ring SQEs or method IDs.
- Shell proposal defines the spawn-plan shape that the shell uses to launch ordinary userspace processes with exact grants. The Lua runner is a launched workload in that model, not a shell-embedded interpreter; future lua scripts/admin/inspect.lua with { ... } sugar must desugar to the same explicit spawn plan rather than inheriting the shell’s current CapSet.
- Userspace Binaries proposal owns the userspace runtime, language-support, and compatibility-adapter plan that the Lua runner sits inside. Its “Future: Lua” section names this proposal as the authoritative design for capos-lua, and the Lua runner must keep matching its rules for unforgeable capability userdata, exact grants, curated standard libraries, no raw CapIds, and the C/libcapos dependency for the upstream PUC Lua port.
Research Grounding
Relevant research:
- Capability research survey: keep typed Cap’n Proto interfaces as the permission boundary and avoid parallel rights flags.
- Genode: route service access structurally; sessions are typed and resource-accounted.
- Plan 9 and Inferno: per-process namespaces are useful precedent, but capOS should not turn scripts into path-global clients.
- EROS, CapROS, and Coyotos: confinement depends on constructing the subject with only the capabilities it may use.
- seL4: keep the privileged kernel surface small and let userspace policy build higher-level systems.
External Lua references:
- The official Lua 5.5 manual describes Lua as an embeddable C library with a host program that registers C functions callable from Lua.
- The official Lua version history says Lua 5.5.0 was released on 2025-12-22, while Lua 5.4.8 is the current 5.4 bug-fix release from 2025-06-04. It also says different x.y versions have different APIs and virtual machines, and precompiled chunks are not portable between versions.
- The official Lua 5.5 readme says Lua is distributed as pure ISO C and normally builds into lua, luac, and liblua.a. That makes Lua a plausible native port once capOS has the C userspace and libcapos substrate; it does not make Lua runnable on today’s no-std Rust-only userspace by itself.
Rust implementation candidates checked:
- mlua is a mature Rust binding layer for PUC Lua, LuaJIT, and Luau. It is not a pure-Rust VM. Its vendored path still builds C/C++ Lua-family sources through mlua-sys, cc, and lua-src/luajit-src, and the public crate uses std, libc, parking_lot, panic catching, and host linker/module assumptions. It is a useful API reference, but it does not avoid the native C/libcapos port.
- piccolo is the only inspected pure-Rust implementation that looks like a credible capOS bootstrap candidate. It has a stackless VM, fuel-based stepping, memory tracking through gc-arena, safe userdata downcasting, and most core language behavior. The current crate is still std-based, depends on anyhow, thiserror, rand, ahash, and a git-pinned gc-arena, and its built-in I/O path writes to host stdout. Porting it to capOS would require a no_std + alloc fork plus host-library replacement, but that is likely less work than bringing up C Lua before libcapos.
- silt-lua, hematita, and luar were also inspected. They are pure Rust in varying degrees, but their own READMEs/code show early, incomplete, or CLI-oriented implementations. They are not good foundations for capOS runtime work today.
Design Principles
- Lua is not a kernel feature. The kernel sees a normal process with a CapSet and a capability ring.
- The runner’s CapSet is the authority. Script text, module names, global variables, and Lua tables are data. They cannot create authority.
- In-process sandboxing is defense in depth, not confinement. A trusted service may embed Lua for local configuration or small trusted extensions. Untrusted user scripts must run in a separate process with a narrow CapSet, quotas, and no access to the host service’s private caps.
- The standard libraries are curated. Base, coroutine, table, string, math, and utf8 are reasonable starting points. io, os, package, debug, dynamic loading, and process execution are absent by default or replaced by capOS-specific libraries backed by explicit caps.
- No raw CapIds in Lua. A Lua capability value is host-owned userdata with a hidden metatable. Scripts can call methods exposed by the wrapper, but they cannot forge a handle by guessing an integer.
- Lua version is part of the runtime contract. Precompiled chunks, language behavior, and C API details are series-specific. capOS should pin the runner to a declared Lua series and expose that in manifests and smoke output.
- C module loading waits. Dynamic native modules need loader, linker, symbol, and authority policy. The first runner should statically link the selected Lua implementation and capOS host libraries.
Architecture
flowchart TD
Shell[capos-shell] --> Launcher[RestrictedLauncher]
Launcher --> Runner[capos-lua process]
Runner --> Lua[PUC Lua VM]
Runner --> Rt[capos-rt / libcapos host API]
Rt --> Ring[capability ring]
Ring --> Kernel[kernel CapObject dispatch]
Ring --> Services[userspace services]
ScriptPkg[ScriptPackage or Namespace cap] --> Runner
Terminal[TerminalSession cap] --> Runner
OtherCaps[Exact service caps] --> Runner
capos-lua is just another binary launched by the shell or init-owned
service graph, matching the “language runtime as ordinary process” rule from
Userspace Binaries. The parent chooses the
script source and the exact caps. The runner creates one Lua state, installs
selected libraries, wraps granted caps as userdata, loads the script with a
controlled environment, executes it in protected mode, flushes queued
releases, and exits with a normal process status.
The initial implementation should be a standalone runner, not Lua embedded in
capos-shell. Keeping the runner as a child process prevents script bugs,
Lua VM bugs, and accidental infinite loops from corrupting the interactive
shell state. It also gives QEMU smokes a clear process boundary to inspect.
Version Choice
Use PUC Lua, not LuaJIT, for the first runner.
As of 2026-05-13, Lua 5.5.0 (released 2025-12-22) is still the current upstream series and Lua 5.4.8 (released 2025-06-04) is still the latest 5.4 bug-fix release. Lua 5.5 has features that fit capOS scripting: explicit global declarations, compact arrays, and static fixed binaries. It is the right default target for new capOS-native scripts.
Keep a narrow compatibility option open for Lua 5.4.8 if imported scripts or libraries require it. Do not mix bytecode or native modules between Lua series. A script package should declare:
language = "lua"
series = "5.5"
entry = "main.lua"
Source scripts are preferable to precompiled chunks for reviewability. If precompiled chunks are allowed later, they must be tied to the exact runtime series and treated as trusted build inputs.
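The admission rule above can be sketched as a small check that a runner might apply before loading an entry file. This is only an illustration under stated assumptions: ScriptManifest, ManifestError, RUNNER_SERIES, and admit are hypothetical names that mirror the manifest keys, not capOS API. The only external fact used is that Lua binary chunks begin with the bytes ESC "Lua".

```rust
// Hypothetical sketch: the types and the admit() check are assumptions
// for illustration; only the manifest keys come from the proposal.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ScriptManifest {
    pub language: String,
    pub series: String,
    pub entry: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ManifestError {
    WrongLanguage,
    SeriesMismatch { wanted: String, runner: String },
    PrecompiledChunk,
}

/// Series this hypothetical runner was built against.
pub const RUNNER_SERIES: &str = "5.5";

/// Lua binary chunks start with ESC followed by "Lua".
const LUA_SIGNATURE: &[u8] = b"\x1bLua";

/// Accepts only source scripts whose declared series matches the runner.
pub fn admit(manifest: &ScriptManifest, entry_bytes: &[u8]) -> Result<(), ManifestError> {
    if manifest.language != "lua" {
        return Err(ManifestError::WrongLanguage);
    }
    if manifest.series != RUNNER_SERIES {
        return Err(ManifestError::SeriesMismatch {
            wanted: manifest.series.clone(),
            runner: RUNNER_SERIES.to_string(),
        });
    }
    // Source-only policy: refuse anything that looks like a binary chunk.
    if entry_bytes.starts_with(LUA_SIGNATURE) {
        return Err(ManifestError::PrecompiledChunk);
    }
    Ok(())
}
```

A later policy that admits precompiled chunks would replace the last check with a digest and series match rather than simply removing it.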
There is one practical sequencing exception: a piccolo-based
capos-lua-smoke may be the fastest way to prove the capOS host API before C
userspace support exists. That should be treated as an implementation
bootstrap, not as a promise of exact PUC Lua compatibility. If capOS takes that
route, the smoke should declare the runtime as piccolo rather than lua-5.5.
Host API
The first host API should be explicit and boring:
local capos = require("capos")
local terminal = capos.require_cap("terminal", "TerminalSession")
terminal:write_line("hello from Lua")
local now = capos.require_cap("timer", "Timer"):now()
terminal:write_line("now_ns=" .. tostring(now))
capos.require_cap(name, interface) looks up a bootstrap cap by manifest name
and checks the expected interface metadata before returning userdata. It fails
closed if the cap is absent or has the wrong interface.
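On the host side, the fail-closed lookup could look roughly like the following. It is a sketch, not the capos-rt CapSet API: BootstrapCaps, CapEntry, and LookupError are hypothetical names, interface metadata is modeled as a plain string, and std is used so the example runs on a development host rather than in the no-std runner.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: these types stand in for the runner's view of its
// bootstrap CapSet. None of the names come from capos-rt.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CapEntry {
    pub interface: &'static str,
    pub slot: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum LookupError {
    Missing { name: String },
    WrongInterface { name: String, expected: String, found: &'static str },
}

pub struct BootstrapCaps {
    by_name: BTreeMap<&'static str, CapEntry>,
}

impl BootstrapCaps {
    pub fn new(entries: &[(&'static str, CapEntry)]) -> Self {
        Self { by_name: entries.iter().copied().collect() }
    }

    /// Returns the entry only when both the manifest name and the expected
    /// interface match. Every other case is an error, never a fallback.
    pub fn require_cap(&self, name: &str, interface: &str) -> Result<CapEntry, LookupError> {
        let entry = self
            .by_name
            .get(name)
            .ok_or_else(|| LookupError::Missing { name: name.to_string() })?;
        if entry.interface != interface {
            return Err(LookupError::WrongInterface {
                name: name.to_string(),
                expected: interface.to_string(),
                found: entry.interface,
            });
        }
        Ok(*entry)
    }
}
```

In a real binding the returned entry would be wrapped in userdata before it reaches Lua, so the slot number never becomes a script-visible integer.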
Generated or handwritten bindings should expose method names, not method
numbers. The binding owns Cap’n Proto serialization through capos-rt or
libcapos; scripts should not construct raw SQEs, raw method IDs, transfer
descriptors, or cap_enter calls.
Transferred result caps become owned Lua userdata. Release is deterministic when possible:
do
  local h <close> = launcher:spawn({
    name = "child",
    binary = "timer-smoke",
    grants = { terminal = terminal },
  })
  local code = h:wait()
end
Finalizers may queue cleanup, but they are not the primary lifetime contract. The runner must flush owned-handle releases at script return and process exit.
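One way to picture that lifetime contract is a host-side ledger with a deterministic close path, a queue-only finalizer path, and a flush that runs at script return and process exit. The sketch below is an assumption-laden model: OwnedCaps and its methods are hypothetical, slots are plain integers, and "release" is recorded in a vector instead of being sent over the capability ring.

```rust
// Hypothetical sketch: OwnedCaps models bookkeeping only. A real runner
// would issue release calls through capos-rt instead of recording slots.
#[derive(Debug, Default)]
pub struct OwnedCaps {
    live: Vec<u32>,     // result caps adopted as Lua userdata
    pending: Vec<u32>,  // finalized by the GC, release not yet sent
    released: Vec<u32>, // release order, kept so tests can observe it
}

impl OwnedCaps {
    pub fn adopt(&mut self, slot: u32) {
        self.live.push(slot);
    }

    fn take_live(&mut self, slot: u32) -> bool {
        match self.live.iter().position(|s| *s == slot) {
            Some(pos) => {
                self.live.swap_remove(pos);
                true
            }
            None => false,
        }
    }

    /// Deterministic path (`<close>` or an explicit release): release now.
    /// Returns false for a slot that is not owned or was already closed.
    pub fn close(&mut self, slot: u32) -> bool {
        let owned = self.take_live(slot);
        if owned {
            self.released.push(slot);
        }
        owned
    }

    /// Finalizer path: only queue the release, never call out from the GC.
    pub fn finalize(&mut self, slot: u32) {
        if self.take_live(slot) {
            self.pending.push(slot);
        }
    }

    /// Run at script return and process exit: drain the finalizer queue,
    /// then release anything the script left open. Returns the count.
    pub fn flush(&mut self) -> usize {
        let mut batch: Vec<u32> = self.pending.drain(..).collect();
        batch.extend(self.live.drain(..));
        let count = batch.len();
        self.released.extend(batch);
        count
    }

    pub fn released(&self) -> &[u32] {
        &self.released
    }
}
```

Releasing still-live handles in flush is a choice made for this sketch; the proposal only requires that queued releases are flushed.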
Standard Library Policy
Initial allowed libraries:
| Library | Policy |
|---|---|
| base | Load selected safe functions. load is allowed only with text mode and a supplied environment. |
| coroutine | Allowed for cooperative script structure. It does not map to OS threads. |
| table, string, math, utf8 | Allowed. |
| debug | Denied by default. It pierces ordinary Lua abstraction and should require an explicit developer-profile cap. |
| io | Denied by default. Replace with capos wrappers over TerminalSession, future File, ByteStream, or Namespace caps. |
| os | Denied by default. Replace time, exit, and process operations with cap-backed methods. |
| package | Restricted. require searches a script package or namespace cap, not host paths or environment variables. |
| dynamic C modules | Denied until native module loading has a reviewed authority model. |
Lua _ENV is useful for presenting a small global namespace, but it is not a
security boundary by itself. The security boundary is the process plus its
CapSet.
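The policy table can be encoded as data the runner consults when it opens libraries. The following is a hedged sketch: LibPolicy, library_policy, and default_libraries are hypothetical names, and treating every unknown library name as denied is an assumption that follows the fail-closed spirit of the table.

```rust
// Hypothetical sketch: an encoding of the table above. The enum and
// function names are illustrative and not part of any capOS crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LibPolicy {
    /// Loaded as shipped.
    Allowed,
    /// Loaded with only a reviewed subset of functions (base).
    Selected,
    /// Loaded with capOS-specific behavior (package/require).
    Restricted,
    /// Not loaded in the default profile.
    Denied,
}

pub fn library_policy(name: &str) -> LibPolicy {
    match name {
        "base" => LibPolicy::Selected,
        "coroutine" | "table" | "string" | "math" | "utf8" => LibPolicy::Allowed,
        "package" => LibPolicy::Restricted,
        "debug" | "io" | "os" => LibPolicy::Denied,
        // Unknown names, including dynamic C modules, fail closed.
        _ => LibPolicy::Denied,
    }
}

const KNOWN: [&str; 10] = [
    "base", "coroutine", "table", "string", "math", "utf8", "package", "debug", "io", "os",
];

/// Libraries a default-profile runner would open, in a fixed order.
pub fn default_libraries() -> Vec<&'static str> {
    KNOWN
        .iter()
        .copied()
        .filter(|name| library_policy(name) != LibPolicy::Denied)
        .collect()
}
```

Keeping the decision in one function makes the denied-API proof in the smoke easier to state: the runner opens exactly what default_libraries returns.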
Script Sources
The current ProcessSpawner.spawn shape names a binary and grants caps; it
does not yet pass arbitrary argument vectors or script blobs. That creates an
implementation dependency for useful Lua scripting.
Near-term options, in order:
- Smoke-only compiled script: capos-lua-smoke statically embeds one script string in .rodata and proves the host API. This is not the general product, but it verifies the Lua VM, allocator, CapSet lookup, and terminal output without new startup ABI.
- Runner config cap: init or the shell grants a read-only ScriptPackage or ConfigBlob cap to capos-lua. The runner asks that cap for main.lua and module bytes. This keeps script data out of the kernel and fits the existing capability model.
- Storage-backed scripts: after Store/Namespace exists, scripts live under a granted namespace. require searches only that namespace and only through a read-only script-package view unless the script also receives a writable namespace cap.
Do not add a Lua-specific boot manifest field or kernel cap. Script packaging belongs to init, shell, storage, or a userspace package service.
Shell Integration
The launch shape comes from the Shell proposal; Lua adds no new spawn primitive. The shell should treat Lua as a launched workload:
run "capos-lua" with {
  terminal: @terminal
  timer: @timer
  scripts: @home.sub("scripts/admin")
}
Later, the shell can add sugar such as:
lua scripts/admin/inspect.lua with { terminal: @terminal, timer: @timer }
That sugar must compile to the same explicit spawn plan. There is no implicit inheritance of the shell’s full current CapSet.
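The "no implicit inheritance" rule can be stated as a property of how a child CapSet is computed from a spawn plan. The following sketch is a model, not the shell's implementation: CapSet is reduced to a name-to-interface map, and SpawnPlan, PlanError, and child_capset are hypothetical names.

```rust
use std::collections::BTreeMap;

// Hypothetical sketch: a CapSet is modeled as name -> interface. The real
// shell and ProcessSpawner types are not shown here.
pub type CapSet = BTreeMap<String, String>;

#[derive(Debug, PartialEq, Eq)]
pub enum PlanError {
    UnknownSource(String),
    DuplicateGrant(String),
}

pub struct SpawnPlan {
    pub binary: String,
    /// (name seen by the child, name of the cap in the shell's CapSet)
    pub grants: Vec<(String, String)>,
}

/// Builds the child's CapSet from the plan alone. Caps the shell holds but
/// the plan does not name never reach the child.
pub fn child_capset(shell: &CapSet, plan: &SpawnPlan) -> Result<CapSet, PlanError> {
    let mut child = CapSet::new();
    for (child_name, source) in &plan.grants {
        let interface = shell
            .get(source)
            .ok_or_else(|| PlanError::UnknownSource(source.clone()))?;
        if child.insert(child_name.clone(), interface.clone()).is_some() {
            return Err(PlanError::DuplicateGrant(child_name.clone()));
        }
    }
    Ok(child)
}
```

Whatever surface syntax the shell accepts, the sugar is correct only if it produces a plan whose child_capset result equals the explicit form's result.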
Agent mode can also use Lua, but Lua should be a tool target rather than the model itself. The agent runner may advertise “run this approved Lua script” as a consent-gated tool. The model still does not receive session caps.
Adventure Game Use
The adventure game is a good later demonstration target because it needs both strict authority and authorable behavior. The kernel and service capabilities still enforce authority; Lua should only express deterministic scenario logic over the caps granted to the script runner.
Suitable Lua-owned behavior:
- mission beat selection,
- deterministic NPC dialogue state machines,
- quest-board text,
- hint selection,
- debrief variants,
- scripted reactions that call typed game APIs through granted object caps.
Unsuitable Lua-owned behavior:
- deciding whether a player has authority,
- mutating relic custody without a typed service call,
- applying combat damage outside the game service,
- minting or transferring caps,
- holding broad spawn, debug, filesystem, or network authority by default.
The useful proof is language independence: a Rust adventure service and a Lua scenario script should both demonstrate proper capability use, including bounded failures when a script lacks a required cap.
Blocking, Async, and Coroutines
The first runner can use synchronous typed client calls over the existing single-owner ring client. A blocking Lua method blocks the runner process, which is acceptable for the first operator-script use case.
Coroutines provide script-local cooperative structure, not OS scheduling. A future runtime reactor can resume Lua coroutines when capability completions arrive, but that should wait until the capOS runtime has a general demux path for threaded and async clients. Do not design Lua-specific CQ demultiplexing.
Security Model
Threat boundaries:
- Script source is untrusted input until parsed and loaded in protected mode.
- Script packages are trusted build or storage inputs only when their source, digest, author, and runtime series are review-visible.
- The Lua VM is not trusted to confine hostile code inside a privileged host process.
- Capability wrappers must validate method parameters, buffer sizes, transfer counts, and result-cap interface IDs before translating Lua values into ring calls.
- Terminal and audit output must not print secrets. Lua error rendering should use bounded messages and avoid dumping arbitrary cap userdata internals.
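Bounded error rendering, the last item above, is simple enough to sketch. The limit value, the function name render_error, and the choice to replace control characters with spaces are all assumptions made for this example.

```rust
/// Hypothetical limit; the proposal does not fix a number.
pub const MAX_ERROR_BYTES: usize = 120;

/// Renders a script-controlled error message for terminal or audit output.
/// Control characters become spaces, and messages longer than
/// MAX_ERROR_BYTES - 3 bytes are cut on a character boundary and marked
/// with "...", so the result never exceeds MAX_ERROR_BYTES bytes.
pub fn render_error(msg: &str) -> String {
    let limit = MAX_ERROR_BYTES - 3;
    let mut out = String::new();
    for c in msg.chars() {
        let c = if c.is_control() { ' ' } else { c };
        if out.len() + c.len_utf8() > limit {
            out.push_str("...");
            return out;
        }
        out.push(c);
    }
    out
}
```

The function takes only the message text, which keeps cap userdata internals out of the rendering path by construction.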
Default deny list for untrusted scripts:
- no debug,
- no dynamic module loading,
- no raw os/io,
- no broad ProcessSpawner,
- no broad network manager,
- no boot package,
- no mutable namespace unless that is the explicit script purpose,
- no host environment variables.
Quotas matter. The first useful quota is process memory. CPU budgets, timer budgets, and capability-call quotas should follow the normal capOS scheduling and resource-accounting path rather than special Lua hooks.
Implementation Phases
Phase 0: Contract and Host Surface (in tree)
- Proposal landed and docs/programming-languages.md records the Phase 0 status.
- Initial runtime label is capos-lua-subset, not lua-5.x. Bytecode portability is explicitly out of scope.
- Phase 0 ships a tiny hand-written tree-walking interpreter under demos/lua-smoke/ that exists to validate the long-term capability-aware host API design without committing capOS to a particular Lua dialect. Piccolo was investigated and not adopted: upstream does not compile no_std and the swap surface (anyhow, thiserror, std::io, std::sync, ahash::RandomState entropy) is large enough that the maintenance cost of a fork was judged to outweigh the benefit at this stage. The hand-written interpreter is replaced or kept as a research-grade sandbox once the C/libcapos PUC port lands.
- Host surface in tree:
  - typed userdata over capos-rt::ConsoleClient and capos-rt::TimerClient,
  - obj:method(args) dispatch through host::Host::call_method,
  - errors flow back as Lua runtime errors via EvalError::Lua, never Rust panics on script-controlled inputs,
  - bounded execution via a per-run step counter (MAX_STEPS).
- Future Phase 0 items (still open):
  - generalised capos.require_cap lookup,
  - capos.interfaces reflection for typed errors,
  - owned-cap release semantics for granted result handles.
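The step-counter and error-flow items above can be illustrated with a toy evaluator. Only the names MAX_STEPS and EvalError::Lua come from the text. The value of MAX_STEPS, the Op instruction set, and the use of a small stack machine in place of the in-tree tree-walking interpreter are assumptions chosen to keep the sketch short.

```rust
/// Hypothetical budget; the in-tree value is not stated in this proposal.
pub const MAX_STEPS: u64 = 10_000;

#[derive(Debug, PartialEq, Eq)]
pub enum EvalError {
    /// Script-visible runtime error with a message.
    Lua(String),
}

/// Toy instruction set standing in for interpreter work.
pub enum Op {
    Push(i64),
    Add,
    Jump(usize),
    Halt,
}

/// Runs a program, charging one step per instruction. Every failure that a
/// script can trigger is returned as EvalError::Lua instead of panicking.
pub fn run(program: &[Op]) -> Result<Option<i64>, EvalError> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0usize;
    let mut steps = 0u64;
    loop {
        steps += 1;
        if steps > MAX_STEPS {
            return Err(EvalError::Lua("step limit exceeded".to_string()));
        }
        let op = program
            .get(pc)
            .ok_or_else(|| EvalError::Lua("pc out of range".to_string()))?;
        match op {
            Op::Push(v) => {
                stack.push(*v);
                pc += 1;
            }
            Op::Add => {
                let (b, a) = (stack.pop(), stack.pop());
                match (a, b) {
                    (Some(a), Some(b)) => stack.push(a.wrapping_add(b)),
                    _ => return Err(EvalError::Lua("stack underflow".to_string())),
                }
                pc += 1;
            }
            Op::Jump(target) => pc = *target,
            Op::Halt => return Ok(stack.pop()),
        }
    }
}
```

A program consisting of a single Jump(0) never halts on its own, yet run returns after MAX_STEPS iterations with a Lua-level error, which is the behavior the smoke relies on.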
Phase 1: Native Runner Smoke (in tree)
- demos/lua-smoke/ builds as capos-demo-lua-smoke, gets embedded in system-lua-smoke.cue, and runs under make run-lua-smoke with QEMU’s isa-debug-exit to gate cleanly on script success or failure.
- The smoke loads no Lua standard library at all (no io, os, package, debug, string, table, math); the only callable surface is the typed cap bindings registered in host::Host::register_*.
- Iteration L.1 (2026-05-04 18:42 EEST, merge 050ac735) shipped the initial console:write_line and timer:now bindings.
- Iteration L.2 (2026-05-05 19:30 UTC) added the third host binding, memory, wrapping capos-rt::VirtualMemoryClient. The Lua surface is memory:alloc(size) -> userdata, memory:write(buf, off, byte), memory:read(buf, off) -> int, memory:size(buf) -> int. The host binding owns the kernel-mapped address and the page-aligned size; the Lua side only ever sees an opaque userdata id and the byte values that came back through the typed binding. Each read/write is bounds-checked host-side before the single-byte volatile_* access. Per-call (MAX_MEMORY_ALLOC_BYTES = 64 KiB), aggregate (MAX_MEMORY_TOTAL_BYTES = 256 KiB), and buffer-count (MAX_MEMORY_BUFFERS = 64) ceilings, with violations rejected as typed Lua errors, keep hostile scripts from exhausting the per-process virtual-memory quota before the kernel does. The smoke proof lines ([lua-smoke] memory:alloc size=4096, [lua-smoke] memory roundtrip 65,66,67, [lua-smoke] memory sum=198) are gated by tools/qemu-lua-smoke.sh.
- Iteration L.3 (2026-05-13 09:28 EEST, commit 430ccd0e) added deterministic memory:release(buf) for the same smoke-only host binding. The host calls VirtualMemory.unmap with the exact mapped (addr, size) pair stored for the opaque buffer userdata, marks that buffer dead after the unmap succeeds, credits the live byte budget, and rejects later read, write, size, or release calls on that stale userdata as Lua runtime errors. The proof line [lua-smoke] memory:release size=4096 is gated by tools/qemu-lua-smoke.sh. This remains language-support behavior only: Lua receives no broader memory authority, raw address, raw cap id, or new kernel behavior.
- Expected QEMU output is asserted by tools/qemu-lua-smoke.sh: the smoke produces [lua-smoke] hello from lua-smoke v0, an elapsed_ns= measurement through timer:now, the L.2 memory round-trip lines, the L.3 release line, and a [lua-smoke] script ok proof line; init exits via exitWhenServiceExits.
- Future Phase 1 items (still open):
  - typed wrong-interface and missing-cap failure modes returned as Lua runtime errors,
  - explicit denied-API proof (currently denied by construction because no Lua stdlib is loaded at all),
  - TerminalSession.writeLine parity in addition to the current Console.writeLine binding,
  - the next typed cap binding (process spawning or endpoint IPC).
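The L.2 and L.3 memory rules (ceilings, bounds checks, stale-handle rejection) can be modeled on a development host as follows. This is a sketch under stated assumptions rather than the in-tree binding: the three constants and their values come from the text above, while MemoryHost, BufId, MemError, the 4096-byte page size, and the Vec<u8> that stands in for a kernel mapping are hypothetical. Reporting the page-rounded size from size and release is also an assumption.

```rust
// Constants as stated in the Phase 1 notes.
pub const MAX_MEMORY_ALLOC_BYTES: usize = 64 * 1024;
pub const MAX_MEMORY_TOTAL_BYTES: usize = 256 * 1024;
pub const MAX_MEMORY_BUFFERS: usize = 64;
// Hypothetical page size for the model.
const PAGE: usize = 4096;

/// Opaque id handed to the script; the field is private to the host.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufId(usize);

#[derive(Debug, PartialEq, Eq)]
pub enum MemError { TooLarge, OverBudget, TooManyBuffers, Stale, OutOfBounds }

/// Hypothetical host state: Vec<u8> stands in for a mapped region.
#[derive(Default)]
pub struct MemoryHost {
    buffers: Vec<Option<Vec<u8>>>, // None marks a released buffer
    live_bytes: usize,
}

impl MemoryHost {
    pub fn alloc(&mut self, size: usize) -> Result<BufId, MemError> {
        if size == 0 || size > MAX_MEMORY_ALLOC_BYTES {
            return Err(MemError::TooLarge);
        }
        let mapped = (size + PAGE - 1) / PAGE * PAGE;
        if self.live_bytes + mapped > MAX_MEMORY_TOTAL_BYTES {
            return Err(MemError::OverBudget);
        }
        if self.buffers.iter().filter(|b| b.is_some()).count() >= MAX_MEMORY_BUFFERS {
            return Err(MemError::TooManyBuffers);
        }
        self.buffers.push(Some(vec![0; mapped]));
        self.live_bytes += mapped;
        Ok(BufId(self.buffers.len() - 1))
    }

    fn live(&self, id: BufId) -> Result<&Vec<u8>, MemError> {
        self.buffers.get(id.0).and_then(|b| b.as_ref()).ok_or(MemError::Stale)
    }

    pub fn size(&self, id: BufId) -> Result<usize, MemError> {
        Ok(self.live(id)?.len())
    }

    pub fn read(&self, id: BufId, off: usize) -> Result<u8, MemError> {
        self.live(id)?.get(off).copied().ok_or(MemError::OutOfBounds)
    }

    pub fn write(&mut self, id: BufId, off: usize, byte: u8) -> Result<(), MemError> {
        let buf = self.buffers.get_mut(id.0).and_then(|b| b.as_mut()).ok_or(MemError::Stale)?;
        let cell = buf.get_mut(off).ok_or(MemError::OutOfBounds)?;
        *cell = byte;
        Ok(())
    }

    /// Marks the buffer dead, credits the live byte budget, and returns
    /// the number of bytes released.
    pub fn release(&mut self, id: BufId) -> Result<usize, MemError> {
        let slot = self.buffers.get_mut(id.0).ok_or(MemError::Stale)?;
        let buf = slot.take().ok_or(MemError::Stale)?;
        self.live_bytes -= buf.len();
        Ok(buf.len())
    }
}
```

Writing 65, 66, 67 into the first three bytes of a 4096-byte buffer and summing the reads reproduces the 198 from the smoke's proof line, and every call after release fails as Stale.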
Phase 2: Script Package Input
- Add a userspace-owned script source cap or startup-config path.
- Let shell/init launch capos-lua with a selected package and exact grants.
- Implement restricted require over the package.
- Add QEMU proof for a granted TerminalSession call and a denied ungranted cap lookup.
Phase 3: Generated Capability Bindings
- Generate Lua binding metadata from schema/capos.capnp or from the same interface registry used by the native shell.
- Expose method names and structured params/results.
- Add transfer-result cap adoption and deterministic release tests.
- Keep raw Cap’n Proto builders out of script code unless a separate developer diagnostic cap grants that power.
Phase 4: Shell and Service Use
- Add shell sugar for script execution after the exact spawn plan exists.
- Permit trusted services to embed Lua only when they can prove the embedded state holds no extra authority beyond what the script should use.
- Add audit records for script launch, script package digest, grants, exit status, and authority-touching cap calls when audit caps are available.
Validation
The first implementation is not complete until it has QEMU evidence:
- A Lua script prints through a granted TerminalSession.
- The same script cannot use io, os.execute, debug, or an ungranted cap.
- A missing or wrong-interface cap lookup returns a bounded Lua error.
- An owned result cap is released deterministically.
- The runner exits cleanly and does not wedge the shell.
Host tests should cover Lua value conversion and binding generation once those pieces are pure enough to test outside QEMU. Do not claim “Lua scripting works” from host tests alone; the useful behavior is authority-shaped process execution in capOS.
Open Questions
- Whether the initial implementation should wait for libcapos C support or use a temporary Rust Lua VM to prove the host API earlier.
- The exact startup-config mechanism for selecting main.lua before storage and general process arguments exist.
- Whether Lua 5.5 should be the only supported series or whether a 5.4 runner is worth carrying for ecosystem compatibility.
- How much schema reflection the Lua binding should expose before the native shell’s generic call surface lands.
- Which audit fields belong in AuditLog once script launch becomes an operator workflow rather than a smoke.