
Proposal: Cloud Instance Bootstrap

Picking up instance-specific configuration — SSH keys, hostname, network config, user-supplied payload — from cloud provider metadata sources, without porting the Canonical cloud-init stack.

Problem

A capOS ISO built once has to boot on any cloud VM and adapt to its environment: different instance IDs, different public IPs, different operator-supplied SSH keys, different user-data payloads. Without this, every instance needs a custom-baked ISO, and the content-addressed-boot story (“same hash boots identically on N machines”) loses its value at exactly the point where it would matter most for operations.

The Linux convention is cloud-init: a Python daemon that reads metadata from provider-specific sources and applies it by writing files under /etc, invoking systemctl, creating users, and running shell scripts. Porting it is a non-starter:

  • Python, POSIX, systemd-dependent.
  • Runs as root with ambient authority: parses untrusted user-data as shell scripts, mutates arbitrary system state.
  • ~100k lines covering hundreds of rarely-used modules (chef, puppet, seed_random, phone_home).
  • Assumes a package manager and init system that do not exist on capOS.

capOS needs the pattern — consume provider metadata, use it to bootstrap the instance — reshaped to the capability model.

Metadata Sources

All major clouds expose instance metadata through one or more of:

  • HTTP IMDS. 169.254.169.254. AWS IMDSv2 requires a PUT token-exchange handshake; GCP and Azure accept direct GET. Paths differ per provider. Needs a running network stack.
  • ConfigDrive. An ISO9660 filesystem attached as a block device, containing meta_data.json (or equivalent) and optional user-data file. OpenStack, older Azure. Needs a block driver and filesystem reader, no network.
  • SMBIOS / DMI. Vendor, product, serial-number, UUID fields populated by the hypervisor. Good for provider detection before networking comes up.
  • NoCloud. Seed files baked into the image or on an attached FAT disk. Useful for development and bare-metal.

The bootstrap service should read from whichever source is present rather than hardcoding one. Provider detection via SMBIOS runs first (no dependencies), then the appropriate transport is initialized.
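
The probe-then-transport order can be sketched as a simple fallback chain, cheapest dependency first. The `Probe` struct, `Source` enum, and probing flags below are illustrative stand-ins, not capOS APIs:

```rust
// Hypothetical sketch: pick a metadata source by probing in dependency order.
#[derive(Debug, PartialEq)]
enum Source {
    NoCloud,     // seed blob in the initial manifest: no probing needed
    ConfigDrive, // ISO9660 seed disk: needs a block driver, no network
    HttpImds,    // 169.254.169.254: needs the network stack
}

struct Probe {
    nocloud_seed_present: bool,
    config_drive_present: bool,
    network_up: bool,
}

fn select_source(p: &Probe) -> Option<Source> {
    // Cheapest dependencies first: a baked-in seed beats a block device,
    // which beats waiting for userspace networking to come up.
    if p.nocloud_seed_present {
        Some(Source::NoCloud)
    } else if p.config_drive_present {
        Some(Source::ConfigDrive)
    } else if p.network_up {
        Some(Source::HttpImds)
    } else {
        None // log and continue boot without cloud config
    }
}
```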

CloudMetadata Capability

A single capnp interface; one or more implementations:

interface CloudMetadata {
    # Instance identity
    instanceId    @0 () -> (id :Text);
    instanceType  @1 () -> (type :Text);
    hostname      @2 () -> (name :Text);
    region        @3 () -> (region :Text);

    # Network configuration (primary interface addresses, gateway, DNS)
    networkConfig @4 () -> (config :NetworkConfig);

    # Authentication material
    sshKeys       @5 () -> (keys :List(Text));

    # User-supplied payload. Opaque to the metadata provider.
    userData      @6 () -> (data :Data, contentType :Text);

    # Vendor-supplied payload. Separate from userData so the
    # bootstrap policy can trust them differently.
    vendorData    @7 () -> (data :Data, contentType :Text);
}

struct NetworkConfig {
    interfaces @0 :List(Interface);

    struct Interface {
        macAddress @0 :Text;
        ipv4       @1 :List(IpAddress);
        ipv6       @2 :List(IpAddress);
        gateway    @3 :Text;
        dnsServers @4 :List(Text);
        mtu        @5 :UInt16;
    }
}

Implementations:

  • HttpMetadata — fetches from 169.254.169.254; one variant per provider because paths and auth handshakes differ (AWS IMDSv2 token, GCP Metadata-Flavor: Google, Azure API version).
  • ConfigDriveMetadata — reads an ISO9660 seed disk.
  • NoCloudMetadata — reads a seed blob from the initial manifest.
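
The AWS IMDSv2 handshake is the one transport step that is not a plain GET: a PUT to /latest/api/token returns a session token that must accompany every subsequent request. A sketch of the two requests as raw HTTP/1.1 text, independent of any particular HTTP client (the header names are the real IMDSv2 ones; the helper functions are illustrative):

```rust
const IMDS: &str = "169.254.169.254";

/// Step 1: PUT a token request; the response body is the session token.
fn token_request(ttl_seconds: u32) -> String {
    format!(
        "PUT /latest/api/token HTTP/1.1\r\n\
         Host: {IMDS}\r\n\
         X-aws-ec2-metadata-token-ttl-seconds: {ttl_seconds}\r\n\
         \r\n"
    )
}

/// Step 2: GET a metadata path, presenting the token from step 1.
fn metadata_request(path: &str, token: &str) -> String {
    format!(
        "GET {path} HTTP/1.1\r\n\
         Host: {IMDS}\r\n\
         X-aws-ec2-metadata-token: {token}\r\n\
         \r\n"
    )
}
```

GCP and Azure need only a static header (Metadata-Flavor: Google, Metadata: true respectively), so a HttpMetadata design that treats “headers to send, optional token pre-step” as per-provider data covers all three.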

Detection lives in a small probe service that inspects SMBIOS (System Manufacturer: Google, Amazon EC2, Microsoft Corporation, …) and grants the cloud-bootstrap service the appropriate CloudMetadata implementation as part of a manifest delta.
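
That manufacturer-string match might look like the following; substring matching and the `Provider` enum are illustrative choices, since hypervisors vary the exact DMI field contents:

```rust
#[derive(Debug, PartialEq)]
enum Provider { Gcp, Aws, Azure, Unknown }

// The manufacturer strings are the ones named above; matching is
// case-insensitive and substring-based to tolerate field variations.
fn detect(system_manufacturer: &str) -> Provider {
    let m = system_manufacturer.to_ascii_lowercase();
    if m.contains("google") {
        Provider::Gcp
    } else if m.contains("amazon") {
        Provider::Aws
    } else if m.contains("microsoft") {
        Provider::Azure
    } else {
        Provider::Unknown
    }
}
```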

Bootstrap Service

A single service — cloud-bootstrap — runs once per boot:

cloud-bootstrap:
  caps:
    - metadata: CloudMetadata        # from probe service
    - manifest: ManifestUpdater      # narrow authority to extend the graph
    - network:  NetworkConfigurator  # apply interface addresses
    - ssh_keys: KeyStore             # target store for authorized keys
  user_data_handlers:
    - application/x-capos-manifest: ManifestDeltaHandler
    # operator-installed handlers for other content types

Sequence:

  1. Gather identity and declarative config (instanceId, hostname, networkConfig, sshKeys), apply through the narrow caps above.
  2. (data, ct) = metadata.userData() — dispatch by content type. If no handler is registered, log and skip.
  3. Exit.
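
The sequence can be sketched against hypothetical narrow caps. None of the trait names below are real capOS APIs, the cap set is trimmed to keys and user-data for brevity, and `Fixture` exists only to make the sketch self-contained:

```rust
use std::collections::HashMap;

// Stand-ins for the caps in the manifest entry above (illustrative).
trait CloudMetadata {
    fn hostname(&self) -> String;
    fn ssh_keys(&self) -> Vec<String>;
    fn user_data(&self) -> (Vec<u8>, String); // (payload, content type)
}

trait KeyStore {
    fn add_key(&mut self, key: &str);
}

type Handler = fn(&[u8]) -> Result<(), String>;

fn bootstrap(
    md: &dyn CloudMetadata,
    keys: &mut dyn KeyStore,
    handlers: &HashMap<String, Handler>,
    log: &mut Vec<String>,
) {
    // 1. Declarative config through narrow caps (trimmed here to a log
    //    line for hostname and key insertion into the KeyStore).
    log.push(format!("hostname: {}", md.hostname()));
    for k in md.ssh_keys() {
        keys.add_key(&k);
    }

    // 2. User-data dispatch by content type. No handler means log-and-skip,
    //    never "try to run it as shell".
    let (data, ct) = md.user_data();
    match handlers.get(&ct) {
        Some(h) => { let _ = h(&data); }
        None => log.push(format!("no handler for {ct}, skipping")),
    }
    // 3. Exit: the service runs once per boot.
}

// Test fixture only; not part of the design.
struct Fixture;
impl CloudMetadata for Fixture {
    fn hostname(&self) -> String { "vm-1".into() }
    fn ssh_keys(&self) -> Vec<String> { vec!["ssh-ed25519 AAAA".into()] }
    fn user_data(&self) -> (Vec<u8>, String) {
        (b"#!/bin/sh".to_vec(), "text/x-shellscript".into())
    }
}

struct VecKeys(Vec<String>);
impl KeyStore for VecKeys {
    fn add_key(&mut self, key: &str) { self.0.push(key.to_string()); }
}
```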

The service never holds ProcessSpawner directly. It holds ManifestUpdater, a wrapper that accepts capnp-encoded ManifestDelta messages and applies them through the existing init spawn path. The decoder and apply path are shared with the build-time pipeline (same capos-config crate, same spawn loop). The precise shape of ManifestDelta is an open question — see “Open Questions” below — but at minimum it covers hostname, network config, SSH keys, and authorized application-level service additions:

struct ManifestDelta {
    addServices      @0 :List(ServiceEntry);
    addBinaries      @1 :List(NamedBlob);
    setHostname      @2 :Text;
    setNetworkConfig @3 :NetworkConfig;
}
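
One plausible shape for the merge routine in capos-config, with plain Rust structs standing in for the capnp-generated types and the field set trimmed to services and hostname:

```rust
#[derive(Clone, Debug, PartialEq)]
struct ServiceEntry { name: String, binary: String }

struct SystemManifest { services: Vec<ServiceEntry>, hostname: String }

struct ManifestDelta {
    add_services: Vec<ServiceEntry>,
    set_hostname: Option<String>,
}

/// Applies a delta to the base manifest and returns only the newly added
/// entries, so the spawn loop can start them without touching services
/// that are already running.
fn merge(base: &mut SystemManifest, delta: ManifestDelta) -> Vec<ServiceEntry> {
    if let Some(h) = delta.set_hostname {
        base.hostname = h;
    }
    let new: Vec<ServiceEntry> = delta
        .add_services
        .into_iter()
        // The delta augments the base graph; it cannot replace an entry
        // that is already inside the content-addressed hash.
        .filter(|s| base.services.iter().all(|b| b.name != s.name))
        .collect();
    base.services.extend(new.iter().cloned());
    new
}
```

Filtering out name collisions (rather than overwriting) is one possible policy for the augment-not-replace rule; rejecting the whole delta on collision would be an equally defensible choice.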

Relationship to the Build-Time Manifest Pipeline

The existing build-time pipeline (system.cue → tools/mkmanifest → manifest.bin → Limine boot module → capos-config decoder → init spawn loop) and the cloud-metadata bootstrap path are not two parallel systems. They are the same pipeline with different transports and different trust scopes.

| Stage       | Build-time (baked ISO)       | Runtime (cloud metadata)                     |
|-------------|------------------------------|----------------------------------------------|
| Authoring   | system.cue in the repo       | user-data.cue on the operator’s host         |
| Compile     | mkmanifest (CUE → capnp)     | same tool, same output                       |
| Transport   | Limine boot module           | HTTP IMDS / ConfigDrive / NoCloud disk       |
| Wire format | capnp-encoded SystemManifest | capnp-encoded ManifestDelta                  |
| Decoder     | capos-config                 | capos-config                                 |
| Apply       | init spawn loop              | same spawn loop, invoked via ManifestUpdater |

Three practical consequences:

  • CUE is a host-side authoring convenience, not an on-wire format. Neither kernel nor init evaluates CUE. An operator supplying user-data writes user-data.cue, runs `mkmanifest user-data.cue user-data.bin` on their host, and ships the capnp bytes (base64 into `--metadata user-data=@user-data.bin` for GCP/AWS, or as a file on a ConfigDrive ISO).

  • NoCloud is a Limine boot module by another name. A NoCloud seed blob is the same bytes as a baked-in manifest.bin, attached via a disk or bundled into the ISO instead of handed over by the bootloader. The only difference is who hands the bytes to the parser.
  • No new schema surface. ManifestDelta is defined alongside SystemManifest in schema/capos.capnp, and sharing the decoder means ManifestUpdater’s apply path is a thin merge-and-spawn on top of code that already boots the base system.

The trust model stays clean precisely because ManifestDelta is not SystemManifest. The base manifest is inside the content-addressed ISO hash (fully trusted, reproducible). The runtime delta is applied by a narrowly-permitted service whose caps define which fields of the delta can actually take effect — the content-addressed-boot story is preserved because cloud metadata augments the base graph; it cannot replace it.

User-Data Model

User-data on the wire is a capnp blob, not a shell script. Content type application/x-capos-manifest identifies the canonical case: the payload is a ManifestDelta message produced by mkmanifest on the operator’s host and consumed directly by the bootstrap service.

For cross-cloud-vendor compatibility, operators can install user-data dispatcher services for other content types (YAML, other capnp schemas, signed manifests, etc.). The bootstrap service holds a handler cap per content type; unknown types are logged and ignored, not executed.

Shell-script user-data — the Linux default — has nowhere to run on capOS because there is no shell and no ambient-authority process to execute it under. An operator who insists on this can install a shell service and a handler that routes text/x-shellscript to it, but that is a deliberate choice, not a default fallback.

Trust Model

The capability angle earns its keep here.

  • The metadata endpoint is assumed to be as trustworthy as the hypervisor running the VM — the same assumption Linux cloud-init makes.
  • The bootstrap service holds narrow caps (ManifestUpdater, NetworkConfigurator, KeyStore), not ambient root. A bug or a malicious metadata response can at most spawn services the ManifestUpdater accepts, set network config the NetworkConfigurator accepts, and drop keys into the KeyStore. It cannot reach for arbitrary system state.
  • vendorData and userData are separated on the wire. A policy that trusts the cloud provider but not the operator (e.g., apply vendorData as-is, route userData through a signature check) is expressible by granting different handler caps to each.
  • User-data content-type dispatch is capability-mediated: the bootstrap service cannot execute a content type it wasn’t given a handler for. There is no fallback “try to run it as shell.”
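
The vendorData/userData split might look like this in policy terms; `apply_if_signed` is a placeholder for a real signature check against an operator-provisioned key, and all names are illustrative:

```rust
type Apply = fn(&[u8]) -> Result<(), String>;

struct Policy {
    vendor_handler: Apply, // trusts the cloud provider
    user_handler: Apply,   // does not trust the operator's payload as-is
}

fn apply_direct(_payload: &[u8]) -> Result<(), String> {
    Ok(()) // …decode and apply the delta here
}

fn apply_if_signed(payload: &[u8]) -> Result<(), String> {
    // Placeholder check: a real policy would verify a detached signature,
    // not a magic prefix.
    if payload.starts_with(b"SIGNED:") {
        Ok(())
    } else {
        Err("unsigned user-data rejected".into())
    }
}

// "Trust the provider but not the operator" expressed as two handler caps.
fn policy() -> Policy {
    Policy { vendor_handler: apply_direct, user_handler: apply_if_signed }
}
```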

Phased Implementation

Most of the manifest-handling machinery already exists from the build-time pipeline (capos-config, mkmanifest, init’s spawn loop). The new work is transports, provider detection, and the ManifestDelta merge semantics.

  1. ManifestDelta schema and ManifestUpdater cap. Add the delta type to schema/capos.capnp alongside SystemManifest, extend capos-config with a merge routine (SystemManifest + ManifestDelta → new services to spawn), and expose ManifestUpdater as a cap in init. NoCloudMetadata seeded from a test fixture is enough to demo the apply path end-to-end without any cloud dependency.
  2. Provider detection via SMBIOS. Kernel-side primitive or capability that reads SMBIOS DMI tables and exposes manufacturer / product strings. No network required.
  3. ConfigDrive support. ISO9660 reader plus ConfigDriveMetadata. Gives a working real-transport metadata source with no dependency on userspace networking. QEMU can attach one via -drive file=configdrive.iso,if=virtio for local testing.
  4. HttpMetadata per provider. Requires the userspace network stack (Stage 6+). GCP first (simplest auth), then AWS (IMDSv2 token flow), then Azure.
  5. Cross-provider Cloud Metadata demo. Same ISO hash boots under QEMU, GCP, AWS, and Azure; the only difference is the SMBIOS manufacturer string, which the probe service uses to pick the right HttpMetadata variant. This is the Cloud Metadata observable milestone.

Open Questions

Which fields of system.cue are runtime-modifiable?

system.cue today is a handful of service entries with kernel Console cap grants encoded as structured source variants. That will grow. Plausible additions as capOS matures: driver process definitions (virtio-net, virtio-blk, NVMe) with device MMIO, interrupt, and frame allocator grants; scheduler tuning (priority, budget, CPU pinning); filesystem driver services; memory-policy hooks; ACPI/SMBIOS consumers.

Most of those are either fragile (kernel-adjacent; a bad value bricks the instance), sensitive (granting kernel:frame_allocator to a user-data-declared service is effectively root), or both. A ManifestDelta with full SystemManifest equivalence hands every such knob to whoever controls user-data.

The narrowing has to happen somewhere, but there are several places it could live:

  1. Different schema. ManifestDelta is not structurally a subset of SystemManifest — it omits driver entries, scheduler config, and kernel cap sources entirely. Schema-level guarantee; rigid but unambiguous.
  2. Shared schema, policy-narrowing cap. ManifestUpdater accepts a full delta but validates at apply time: kernel source variants are rejected unless explicitly allow-listed by the cap’s parameters; additions that touch driver-level service entries fail. Flexible, but the narrowing logic is code that has to be audited, not a schema that is self-documenting.
  3. Tiered deltas. PrivilegedDelta (drivers, scheduler) and ApplicationDelta (hostname, SSH keys, app services), minted by different caps. An operator supervisor holds PrivilegedManifestUpdater; cloud-bootstrap holds only ApplicationManifestUpdater. Compositional; matches the capability-model grain but doubles the schema surface.
  4. Tag-based field permissions. Fields in ServiceEntry carry a privilege tag; ManifestUpdater is parameterized with a permitted-tag set. One schema, orthogonal policy.
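
Option 4 could be sketched as follows; the tag names and the `ManifestUpdater` shape are illustrative, not a committed design:

```rust
use std::collections::HashSet;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum Tag { Application, Driver, Scheduler }

struct TaggedEntry { name: String, tag: Tag }

// The cap is parameterized with the tag set it may apply; minting an
// application-only updater vs. a privileged one is just a different set.
struct ManifestUpdater { permitted: HashSet<Tag> }

impl ManifestUpdater {
    fn apply(&self, entries: &[TaggedEntry]) -> Result<(), String> {
        for e in entries {
            if !self.permitted.contains(&e.tag) {
                return Err(format!(
                    "entry {} carries tag {:?}, not permitted by this cap",
                    e.name, e.tag
                ));
            }
        }
        Ok(()) // …merge and spawn here
    }
}
```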

Picking one prematurely would either over-constrain the cloud path (option 1 before we know what apps legitimately need) or under-constrain it (option 2 without clarity on what to check against). This proposal commits only to the shared pipeline (decoder, spawn loop, authoring tool). The shape of the public type(s) the cap accepts is deferred until system.cue has grown enough that the privileged vs. application split is visible in concrete form.

Related open question: whether kernel cap sources should be expressible in system.cue at all, or whether the build-time manifest should also declare them through a narrower mechanism so that the same discipline that protects cloud user-data also protects the baked-in manifest from accidental over-grants. If they remain expressible, they should be structured enum/union variants, not free-form strings; the associated interface TYPE_ID is only a schema compatibility check and does not identify the authority being granted.

Non-Goals

  • cloud-init compatibility. No parsing of #cloud-config YAML, no #!/bin/bash execution, no include-url, no MIME multipart handling. Operators who need these install their own dispatcher services; the base system does not.
  • Runtime package installation. The capOS equivalent of “install nginx on boot” is “include nginx in the manifest.” User-data can add services to the manifest; it cannot install packages (there is no package manager to install into).
  • Re-running on every boot. cloud-init distinguishes per-boot, per-instance, and per-once modules. The capOS bootstrap service runs once per boot; the manifest it produces is cached under the instance ID, and subsequent boots read the cache and skip the metadata round-trip. A full mode matrix is future work.
  • IPv6-only bring-up in the first iteration. Many clouds expose both; the schema supports both; the first implementations do whichever is easier per provider (typically IPv4).
  • Automatic secret rotation. Metadata often exposes short-lived credentials (IAM role tokens on AWS, service-account tokens on GCP). Refresh logic belongs to the service that consumes the credential, not to cloud-bootstrap.

Prior Art

  • cloud-init (Canonical). The Linux reference. Huge scope, shell-script-centric, assumes root and POSIX. The capOS design intentionally takes the pattern and drops everything that depends on ambient authority.
  • ignition (CoreOS/Flatcar). Runs once in initramfs, consumes a JSON spec, fails-fast if the spec can’t be applied. Closer in spirit to the capOS design — small, single-pass, declarative. Worth studying for its rollback and error-handling approach.
  • AWS IMDSv2. The token-exchange handshake is the one thing the HTTP client needs to handle that is not plain GETs. Designing the HttpMetadata interface without accounting for it up front leads to a rewrite later.