# Error Handling

capOS uses three error layers for capability invocation. Keeping the layers
separate prevents malformed transport state from looking like a service-domain
decision, and prevents ordinary business outcomes from becoming generic kernel
exceptions.

## Current Model

| Layer | Carrier | Use |
| --- | --- | --- |
| Transport status | Negative `CapCqe.result` codes | Ring, opcode, lookup, buffer, transfer, and dispatch failures where no safe typed payload boundary exists. |
| Capability exception | Serialized `CapException` plus `CAP_ERR_APPLICATION_EXCEPTION` or `CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED` | Capability-level infrastructure failures after a target capability or accepted endpoint relationship exists. |
| Schema result union | Interface-specific result payload | Expected service or domain outcomes such as not-found, denied-by-policy, conflict, invalid domain input, or accepted/rejected business results. |

Transport failures are intentionally small and mechanical. Examples include a
bad SQE layout, an invalid params or result buffer, an unsupported opcode, a
malformed transfer descriptor, or a capability lookup that fails before a live
target object is identified.

Capability exceptions are for infrastructure failures at a valid capability
boundary: target gone, target overloaded, method unimplemented, argument value
rejected by the documented capability contract, or a target-side invariant
failure. The exception message is diagnostic and must not carry kernel pointers,
secret bytes, or unrelated process-private state.

Schema result unions are the normal application surface. A filesystem
`notFound`, service-level `permissionDenied`, ordinary conflict, or accepted
conditional rejection belongs in the interface result, not in `CapException`.
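
The split can be pictured as a nested outcome type on the caller side. The following is a minimal sketch, not the capOS runtime API: `OpenResult`, `CallError`, `layer_of`, and the `CapException` field names are hypothetical and exist only for illustration. The `ExceptionType` variants mirror the kinds this document lists.

```rust
/// Exception kinds as listed in this document for `schema/capos.capnp`.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ExceptionType {
    Failed,
    Overloaded,
    Disconnected,
    Unimplemented,
    InvalidArgument,
}

/// Simplified stand-in for a decoded `CapException` (field names are assumed).
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
struct CapException {
    kind: ExceptionType,
    message: String,
}

/// Hypothetical schema result union for a file-open method.
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
enum OpenResult {
    Opened { handle: u32 },
    NotFound,
    PermissionDenied,
}

/// Hypothetical error type covering the two non-schema layers.
#[derive(Debug, PartialEq, Eq)]
enum CallError {
    /// Negative `CapCqe.result` with no typed payload.
    Transport(i32),
    /// Infrastructure failure at a valid capability boundary.
    Exception(CapException),
}

/// Names the layer an outcome belongs to.
fn layer_of(outcome: &Result<OpenResult, CallError>) -> &'static str {
    match outcome {
        Err(CallError::Transport(_)) => "transport status",
        Err(CallError::Exception(_)) => "capability exception",
        // Domain outcomes such as NotFound are ordinary successful results.
        Ok(_) => "schema result union",
    }
}
```

In this sketch a `NotFound` is an `Ok` value, so callers handle it in the same match as success and never confuse it with a disconnected or overloaded target.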

## Current Transport Namespace

The ring transport uses signed 32-bit completion results. Non-negative values
are opcode-specific successes. Negative values are defined in
`capos-config/src/ring.rs`:

| Code | Name | Meaning |
| --- | --- | --- |
| `-1` | `CAP_ERR_INVALID_REQUEST` | Malformed request metadata or a non-reserved opcode value. |
| `-2` | `CAP_ERR_INVALID_PARAMS_BUFFER` | Params buffer is unmapped, out of range, or unreadable. |
| `-3` | `CAP_ERR_INVALID_RESULT_BUFFER` | Result buffer is unmapped, out of range, or unwritable. |
| `-4` | `CAP_ERR_INVOKE_FAILED` | Lookup or dispatch failed before a successful typed result was produced. |
| `-5` | `CAP_ERR_UNSUPPORTED_OPCODE` | Opcode is reserved but not dispatched by this kernel. |
| `-6` | `CAP_ERR_TRANSFER_NOT_SUPPORTED` | Transfer mode or descriptor layout is recognized but unsupported. |
| `-7` | `CAP_ERR_INVALID_TRANSFER_DESCRIPTOR` | Transfer descriptor layout is malformed or carries reserved bits. |
| `-8` | `CAP_ERR_TRANSFER_ABORTED` | Transfer transaction failed without committing partial capability state. |
| `-9` | `CAP_ERR_APPLICATION_EXCEPTION` | A structured `CapException` was written to the result buffer. |
| `-10` | `CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED` | An exception occurred, but no complete detail fit in the result buffer. |
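
As an illustration, a consumer of the completion queue might classify a raw result as follows. The constant names and values are copied from the table above; the `Completion` enum and the `error_name` and `classify` helpers are hypothetical and are not claimed to exist in `capos-config/src/ring.rs` or `capos-rt`.

```rust
const CAP_ERR_INVALID_REQUEST: i32 = -1;
const CAP_ERR_INVALID_PARAMS_BUFFER: i32 = -2;
const CAP_ERR_INVALID_RESULT_BUFFER: i32 = -3;
const CAP_ERR_INVOKE_FAILED: i32 = -4;
const CAP_ERR_UNSUPPORTED_OPCODE: i32 = -5;
const CAP_ERR_TRANSFER_NOT_SUPPORTED: i32 = -6;
const CAP_ERR_INVALID_TRANSFER_DESCRIPTOR: i32 = -7;
const CAP_ERR_TRANSFER_ABORTED: i32 = -8;
const CAP_ERR_APPLICATION_EXCEPTION: i32 = -9;
const CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED: i32 = -10;

/// Returns the symbolic name of a known negative completion code.
fn error_name(result: i32) -> Option<&'static str> {
    match result {
        CAP_ERR_INVALID_REQUEST => Some("CAP_ERR_INVALID_REQUEST"),
        CAP_ERR_INVALID_PARAMS_BUFFER => Some("CAP_ERR_INVALID_PARAMS_BUFFER"),
        CAP_ERR_INVALID_RESULT_BUFFER => Some("CAP_ERR_INVALID_RESULT_BUFFER"),
        CAP_ERR_INVOKE_FAILED => Some("CAP_ERR_INVOKE_FAILED"),
        CAP_ERR_UNSUPPORTED_OPCODE => Some("CAP_ERR_UNSUPPORTED_OPCODE"),
        CAP_ERR_TRANSFER_NOT_SUPPORTED => Some("CAP_ERR_TRANSFER_NOT_SUPPORTED"),
        CAP_ERR_INVALID_TRANSFER_DESCRIPTOR => Some("CAP_ERR_INVALID_TRANSFER_DESCRIPTOR"),
        CAP_ERR_TRANSFER_ABORTED => Some("CAP_ERR_TRANSFER_ABORTED"),
        CAP_ERR_APPLICATION_EXCEPTION => Some("CAP_ERR_APPLICATION_EXCEPTION"),
        CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED => Some("CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED"),
        _ => None,
    }
}

/// Hypothetical classification of a signed 32-bit completion result.
#[derive(Debug, PartialEq, Eq)]
enum Completion {
    /// Non-negative, opcode-specific success value.
    Success(u32),
    /// A serialized `CapException` is in the result buffer.
    Exception,
    /// An exception occurred but its detail did not fit.
    ExceptionTruncated,
    /// A known transport failure with no typed payload.
    Transport(i32),
    /// A negative code this sketch does not recognize.
    Unknown(i32),
}

fn classify(result: i32) -> Completion {
    match result {
        r if r >= 0 => Completion::Success(r as u32),
        CAP_ERR_APPLICATION_EXCEPTION => Completion::Exception,
        CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED => Completion::ExceptionTruncated,
        r if error_name(r).is_some() => Completion::Transport(r),
        r => Completion::Unknown(r),
    }
}
```

Only the `Exception` case implies that the result buffer holds something worth decoding. Every other negative code is self-describing.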

## Capability Exceptions

`schema/capos.capnp` defines `ExceptionType` and `CapException`. The current
exception kinds are `Failed`, `Overloaded`, `Disconnected`,
`Unimplemented`, and the capOS-specific `InvalidArgument`.

The kernel serializes ordinary capability implementation errors through
`kernel/src/cap/ring.rs`. `capos-rt/src/client.rs` decodes application-exception
CQEs into `ClientError::Application(ApplicationException)`. The runtime treats
`Disconnected` as a broken local handle.

A path should produce `CapException` only when all of these are true:

- a live target capability was identified, or an endpoint operation is acting
  on an already accepted call, receive, or return relationship;
- the failure is attributable to capability semantics rather than malformed
  ring metadata;
- the affected caller supplied a result buffer large enough to receive the
  serialized exception. If the buffer is too small, the completion carries
  `CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED` instead.
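
The conditions above can be read as a small decision function. This is a sketch under stated assumptions rather than the logic in `kernel/src/cap/ring.rs`: the `Report` enum, the `choose_report` function, and its boolean inputs are hypothetical simplifications of state the kernel would derive itself.

```rust
/// Hypothetical summary of how a failure is reported to the caller.
#[derive(Debug, PartialEq, Eq)]
enum Report {
    /// A plain negative transport code, no payload.
    TransportStatus,
    /// `CAP_ERR_APPLICATION_EXCEPTION` with a serialized `CapException`.
    Exception,
    /// `CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED`, detail did not fit.
    ExceptionTruncated,
}

/// Chooses the reporting layer for a failure.
///
/// `target_identified`: a live target or accepted endpoint relationship exists.
/// `capability_semantic`: the failure is not caused by malformed ring metadata.
/// `result_buf_len` / `serialized_len`: sizes in bytes (assumed units).
fn choose_report(
    target_identified: bool,
    capability_semantic: bool,
    result_buf_len: usize,
    serialized_len: usize,
) -> Report {
    if !target_identified || !capability_semantic {
        return Report::TransportStatus;
    }
    if result_buf_len >= serialized_len {
        Report::Exception
    } else {
        Report::ExceptionTruncated
    }
}
```

A caller that passes no result buffer at all falls into the truncated case in this sketch, because a zero-length buffer cannot hold any serialized exception.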

## Endpoint RETURN

Endpoint RETURN is asymmetric because the result belongs to the original caller,
not the returning receiver. A server can set
`CAP_SQE_RETURN_APPLICATION_EXCEPTION` on `CAP_OP_RETURN` to return a serialized
`CapException` to the caller. The server's own RETURN completion reports only
whether the return transport succeeded.

A RETURN on a revoked endpoint also reports `Disconnected` to the original
caller when that caller supplied a result buffer. Receiver-side lookup and
CQ-space failures that cannot be tied to the caller's result buffer remain
transport failures.
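
The asymmetry is easier to see when both completions are modelled side by side. The sketch below is illustrative only: `ReturnCompletions` and `return_with_exception` are hypothetical names, and using `0` as the server's success value is an assumption (the document says only that non-negative results are opcode-specific successes). It covers a RETURN with `CAP_SQE_RETURN_APPLICATION_EXCEPTION` set and a transport that succeeds.

```rust
const CAP_ERR_APPLICATION_EXCEPTION: i32 = -9;
const CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED: i32 = -10;

/// The two completions produced by one exception-carrying RETURN.
#[derive(Debug, PartialEq, Eq)]
struct ReturnCompletions {
    /// Seen by the returning server: transport outcome only.
    server_result: i32,
    /// Seen by the original caller: the exception status.
    caller_result: i32,
}

/// Models a RETURN whose transport succeeds and which carries an exception.
fn return_with_exception(serialized_len: usize, caller_buf_len: usize) -> ReturnCompletions {
    let caller_result = if caller_buf_len >= serialized_len {
        CAP_ERR_APPLICATION_EXCEPTION
    } else {
        CAP_ERR_APPLICATION_EXCEPTION_TRUNCATED
    };
    ReturnCompletions {
        // Assumed success value; the server never sees the exception code.
        server_result: 0,
        caller_result,
    }
}
```

The server's result stays non-negative even though the caller observes a negative code, which is the point of the asymmetry described above.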

## Code Map

- `capos-config/src/ring.rs` - transport error constants, SQE/CQE layout, and
  endpoint transport flags.
- `schema/capos.capnp` - `ExceptionType`, `CapException`, and per-interface
  result unions.
- `kernel/src/cap/ring.rs` - exception serialization, ring dispatch, endpoint
  RETURN exception handling, and `InvalidArgument` sentinel mapping.
- `kernel/src/cap/endpoint.rs` - endpoint queue, in-flight call, and revoked
  endpoint state.
- `capos-rt/src/client.rs` - runtime decoding into `ClientError`.
- `docs/architecture/capability-ring.md` - ring ABI and opcode dispatch rules.
- `docs/architecture/ipc-endpoints.md` - endpoint CALL/RECV/RETURN transport.

## Validation

- `make run-spawn` covers cross-process endpoint RETURN propagation for
  `Failed`, `Overloaded`, and `Unimplemented`, plus reserved opcode and
  no-result-buffer exception paths.
- `make run-smoke` covers same-process endpoint use and revoked-cap behavior.
- `cargo test-lib` covers cap-table stale-slot and transfer rollback behavior
  that the transport error paths depend on.
- `cargo test-ring-loom` covers ring queue behavior that completion delivery
  depends on.

## Open Work

- Promise pipelining and future multishot/link/drain ring behavior must carry
  the same three-layer error split.
- Long-lived services should prefer stable result-union variants over generic
  text errors for ordinary domain outcomes.
- Future external clients need compatibility rules for exception taxonomy
  evolution once the ABI is treated as cross-version or separately released.

## Design Grounding

The archival decision record is
[`docs/proposals/error-handling-proposal.md`](../proposals/error-handling-proposal.md).
Relevant research notes are
[`docs/research/capnp-error-handling.md`](../research/capnp-error-handling.md)
and [`docs/research/os-error-handling.md`](../research/os-error-handling.md).
