Rejected Proposal: Cap’n Proto SQE Envelope
Proposal
Replace the fixed C-layout CapSqe descriptor with a fixed-size padded
Cap’n Proto message. Each SQ slot would contain a serialized single-segment
Cap’n Proto struct with a union for call, recv, return, release, and
finish, then zero padding to the chosen SQE size.
The live ring currently pins each SQ slot to 64 bytes (SQE_SIZE in
capos-config/src/ring.rs), so any Cap’n Proto envelope would either have to
fit inside that budget or motivate a slot-size bump. For a hypothetical 128-byte
slot, the rough layout would be:
    +0x00  u32   segment_count_minus_one
    +0x04  u32   segment0_word_count
    +0x08  word  root pointer
    +0x10        RingSqe data words, including union discriminant
    +0x??        zero padding to 128 bytes
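As a rough illustration of the layout above, the following sketch frames a block of struct data words as a single-segment message and zero-pads it to the hypothetical 128-byte slot. The names `ENVELOPE_BYTES`, `ENVELOPE_FRAMING_BYTES`, and `encode_slot` are invented for this example, and the struct is assumed to have no pointer section; none of this is code from the capos tree.

```rust
/// Hypothetical slot size from the layout above; not the live SQE_SIZE.
pub const ENVELOPE_BYTES: usize = 128;
/// Segment table (two u32 values) plus the one-word root pointer.
pub const ENVELOPE_FRAMING_BYTES: usize = 16;

/// Frames `data_words` as a single-segment message holding one struct with
/// no pointer section, then zero-pads to ENVELOPE_BYTES.
/// Returns None if the struct data does not fit in the slot.
pub fn encode_slot(data_words: &[u64]) -> Option<[u8; ENVELOPE_BYTES]> {
    let data_bytes = data_words.len().checked_mul(8)?;
    if ENVELOPE_FRAMING_BYTES.checked_add(data_bytes)? > ENVELOPE_BYTES {
        return None;
    }
    let mut slot = [0u8; ENVELOPE_BYTES];
    // +0x00: segment count minus one (a single segment, so zero).
    slot[0..4].copy_from_slice(&0u32.to_le_bytes());
    // +0x04: segment 0 length in words: root pointer plus data section.
    let segment_words = 1 + data_words.len() as u32;
    slot[4..8].copy_from_slice(&segment_words.to_le_bytes());
    // +0x08: struct pointer with kind 0 and offset 0; the data section size
    // in words sits in bits 32..48 and the pointer count (bits 48..64) is 0.
    let root = (data_words.len() as u64) << 32;
    slot[8..16].copy_from_slice(&root.to_le_bytes());
    // +0x10: struct data words, little-endian; the remainder stays zero.
    for (i, word) in data_words.iter().enumerate() {
        let at = ENVELOPE_FRAMING_BYTES + i * 8;
        slot[at..at + 8].copy_from_slice(&word.to_le_bytes());
    }
    Some(slot)
}
```

Even in this minimal form, the reader has to check the segment table and the root pointer before it can trust any offset into the struct data.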
A compact schema would need to keep fields flat to avoid pointer-heavy nested payload structs:
    struct RingSqe {
      userData @0 :UInt64;
      capId @1 :UInt32;
      methodId @2 :UInt16;
      flags @3 :UInt16;
      addr @4 :UInt64;
      len @5 :UInt32;
      resultAddr @6 :UInt64;
      resultLen @7 :UInt32;
      callId @8 :UInt32;
      union {
        call @9 :Void;
        recv @10 :Void;
        return @11 :Void;
        release @12 :Void;
        finish @13 :Void;
      }
    }
Potential Benefits
A Cap’n Proto SQE envelope would make the ring operation shape schema-defined instead of Rust-struct-defined. That has some real advantages:
- The ABI documentation would live in schema/capos.capnp next to the capability interfaces.
- Future userspace runtimes in Rust, C, Go, or another language could use generated accessors instead of hand-mirroring a packed descriptor layout.
- The operation choice could be represented as a schema union, making it clear that fields meaningful for CALL are not meaningful for RECV or RETURN.
- Cap’n Proto defaulting gives a familiar path for adding optional fields while letting older readers ignore fields they do not understand.
- Ring dumps and traces could be decoded with generic Cap’n Proto tooling.
- A single “everything crossing this boundary is Cap’n Proto” rule is architecturally simpler to explain.
Those benefits are mostly about schema uniformity, generated bindings, and tooling. They do not remove the need for an operation discriminator; they move it from an explicit fixed descriptor field to a Cap’n Proto union tag.
Rationale For Rejection
The SQE is the fixed control-plane descriptor for a hostile kernel boundary. It should be cheap to classify and validate before any operation-specific payload parsing. A Cap’n Proto SQE envelope would still have a discriminator, but would move it into generated reader state and require Cap’n Proto message validation before the kernel even knows whether the entry is a CALL, RECV, or RETURN.
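A minimal sketch of what "cheap to classify" means for a fixed descriptor, assuming a one-byte opcode at a fixed offset in the slot. The opcode values, the `OP_OFFSET` position, and the `classify` helper are hypothetical and are not the actual CapSqe layout; only the 64-byte slot size comes from this document.

```rust
/// Slot size as stated for the live ring in this document.
pub const SQE_SIZE: usize = 64;
/// Assumed for illustration: the opcode is the first byte of the slot.
pub const OP_OFFSET: usize = 0;

/// The five ring operations named in the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op {
    Call,
    Recv,
    Return,
    Release,
    Finish,
}

/// Classifies a slot with a single bounds-free byte load and a match.
/// Unknown opcodes are rejected before any other field is examined.
pub fn classify(slot: &[u8; SQE_SIZE]) -> Option<Op> {
    // Hypothetical numbering; the real encoding may differ.
    match slot[OP_OFFSET] {
        0 => Some(Op::Call),
        1 => Some(Op::Recv),
        2 => Some(Op::Return),
        3 => Some(Op::Release),
        4 => Some(Op::Finish),
        _ => None,
    }
}
```

With a Cap'n Proto envelope, the equivalent decision would only be available after the segment table and root pointer had been validated and the union tag located through them.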
The current shape concentrates that hostile-input validation in one place:
sqe_wire_validation_error in capos-config/src/ring.rs is the single source
of truth shared by the kernel dispatch path and the sqe_validation fuzzer
under fuzz/fuzz_targets/. Replacing the descriptor with a Cap’n Proto
message would push some of that validation into generated reader state and
split the fuzz surface across the framing parser and the per-opcode predicates.
Cap’n Proto framing also consumes slot space: a single-segment message needs a segment table and root pointer before the struct data. The live 64-byte slot would not fit a Cap’n Proto envelope without either dropping fields or growing the slot; a 128-byte envelope would spend much of the slot on framing and padding. Nested payload structs are worse because they add pointers inside the ring descriptor.
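The slot-space cost can be put in numbers with a small sketch. It assumes the standard single-segment framing (a two-entry u32 segment table plus a one-word root pointer, 16 bytes in total); `FRAMING_BYTES`, `struct_budget`, and `framing_percent` are names made up for this example.

```rust
/// Single-segment framing: two u32 segment-table entries plus the root pointer word.
pub const FRAMING_BYTES: usize = 16;

/// Bytes left for struct data in a slot of `slot_bytes`, rounded down to whole words.
pub const fn struct_budget(slot_bytes: usize) -> usize {
    if slot_bytes < FRAMING_BYTES {
        0
    } else {
        (slot_bytes - FRAMING_BYTES) / 8 * 8
    }
}

/// Share of the slot consumed by framing, as a whole percentage rounded down.
pub const fn framing_percent(slot_bytes: usize) -> usize {
    if slot_bytes == 0 {
        0
    } else {
        FRAMING_BYTES * 100 / slot_bytes
    }
}
```

Under these assumptions a 64-byte slot leaves 48 bytes (six words) for struct data, with a quarter of the slot spent on framing, and a 128-byte slot leaves 112 bytes.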
The accepted split is:
- fixed #[repr(C)] ring descriptors for SQ/CQ control state;
- Cap’n Proto for capability method params, results, and higher-level transport payloads where schema evolution is valuable;
- endpoint delivery metadata in a small fixed EndpointMessageHeader followed by opaque params bytes.
EndpointMessageHeader is concretely 56 bytes today (see the static-size
assertion in capos-config/src/ring.rs), which keeps the endpoint delivery
header within a single 64-byte cache line while leaving payload bytes opaque
to the kernel.
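For readers unfamiliar with the pattern, a static-size assertion on a #[repr(C)] header can be sketched as follows. `ExampleHeader` and every field in it are hypothetical, chosen only so the sizes add up to 56 bytes; this is not the real EndpointMessageHeader layout.

```rust
/// Hypothetical fixed header; field names and order are illustrative only.
#[repr(C)]
pub struct ExampleHeader {
    pub user_data: u64,     // offset 0
    pub call_id: u64,       // offset 8
    pub badge: u64,         // offset 16
    pub cap_id: u32,        // offset 24
    pub params_len: u32,    // offset 28
    pub method_id: u16,     // offset 32
    pub flags: u16,         // offset 34
    pub reserved: [u8; 20], // offset 36, pads the header to 56 bytes
}

// Compile-time checks: the build fails if the layout drifts.
const _: () = assert!(core::mem::size_of::<ExampleHeader>() == 56);
const _: () = assert!(core::mem::size_of::<ExampleHeader>() <= 64);
```

Because the checks run at compile time, a field change that alters the header size breaks the build instead of silently changing the ABI.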
There is also a layering issue. The capability ring is part of the local Cap’n Proto transport implementation: it is the mechanism that moves capnp calls, returns, and eventually release/finish/promise bookkeeping between a process and the kernel. The SQE itself is therefore below ordinary Cap’n Proto message usage. Making the transport substrate depend on parsing Cap’n Proto messages to discover which transport operation to perform would couple the transport implementation to the protocol it is supposed to carry. Method params and results are proper Cap’n Proto messages; the ring descriptor is the framing/control structure that gets the transport to the point where those messages can be interpreted.
This keeps queue geometry simple, preserves bounded hostile-input handling, and avoids running a Cap’n Proto parser on the hot descriptor path.
Related Documents
- Ring v2 SMP Proposal – forward path for ring geometry that keeps the fixed-layout descriptor and negotiates sqe_size rather than wrapping each slot in a Cap’n Proto message.
- ABI Evolution Policy – how non-capnp ring ABIs (including SQE/CQE layouts) evolve alongside the Cap’n Proto schema.
- Error Handling Proposal – where Cap’n Proto does sit on the dispatch path: CapException payloads carried in SQE result buffers.