Scope Decision: Real-GCE “Reachable Network Stack” – Raw-Frame TX/RX vs L4 Sockets
Decision
Option A. For the second cloud milestone (“usable cloud instance”,
docs/backlog/hardware-boot-storage.md), the network data-path reachability
bar – “a reachable network data path” / “reachable network stack” – means
raw-frame (ethernet) TX/RX reachability over the live GCE NIC: the
production polled userspace virtio-net provider exchanging frames over the real
function is the reachability proof. Slices 1-4 of the GCE polling-path track
plus the slice-6 billable boot close that data-path reachability bar.
L4 sockets (TCP/UDP reachable from a userspace application) are a separate future track – networking-proposal Phase C – and are explicitly not a real-GCE-boot data-path blocker. This decision does not start that track; it records that the track exists, is sequenced after the milestone, and is gated by its own Phase C prerequisites rather than by the cloud usable-instance data-path bar.
Scope boundary: data-path reachability vs L4 terminal access
The milestone bullet (docs/backlog/hardware-boot-storage.md, “Second cloud
milestone: usable cloud instance”) states two network requirements, not one: add
network drivers and “prove SSH/WebShell or other network terminal access
over the cloud NIC.” SSH and WebShell are inherently L4 (TCP) – a raw frame
cannot carry an SSH session. Option A therefore disambiguates only the first
requirement (the network data path / “reachable network stack”, which is also
what the billable gate checks). It does not claim that raw frames satisfy
the SSH/WebShell terminal-access requirement. L4 network terminal access
(SSH/WebShell) is deferred to Phase C and is tracked there; the operator
access path demonstrated today is the serial-console shell (cloudboot
access-path serial-console-shell marker), not a network terminal. Option A is
thus a deliberate re-scoping of the milestone’s network-reachability gate down
to the raw-frame data path, with L4 terminal access sequenced after the
milestone – not a claim that the milestone delivers SSH/WebShell.
Rationale
The decisive principle for the data-path bar: the milestone’s automatically
gated network proof is whatever the billable harness actually checks. The
billable gate is make cloudboot-test (tools/cloudboot/run-test.sh). Reading
that harness directly settles the ambiguity in the “reachable network stack”
phrasing in one observation – it never checks an L4 socket round-trip. (The
milestone’s separate SSH/WebShell terminal-access requirement is not
harness-gated today and is handled under “Scope boundary” above: deferred to
Phase C.)
What the cloudboot harness actually gates on
run-test.sh has exactly two success gates over kernel network behavior, and
both are below the L4 layer:
- Boot landmark. run-test.sh:BOOT_LANDMARK is the literal string “capos kernel starting”; main’s step 5 polls the serial port until that landmark appears (run-test.sh:main, the grep -q "${BOOT_LANDMARK}" poll loop). No TCP, no UDP, no handshake.
- Provider-NIC proof (optional, raw-frame). Under --require-provider-nic-proof (run-test.sh:REQUIRE_NIC_PROOF), the run fails unless the serial output contains the run-test.sh:NIC_PROOF_MARKER line (cloudboot-evidence: provider-nic-bound <token>). The gate is pure marker presence (serial_marker_tokens "${NIC_PROOF_MARKER}" non-empty); it parses no socket state and performs no connect/send/recv against the instance.
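The marker-presence gate described above can be modelled in a few lines. This is a minimal illustrative sketch in Rust, not the harness itself (the real gate is shell code in run-test.sh). The function names marker_tokens and nic_proof_gate are hypothetical, and the token-extraction rule (first whitespace-separated word after the marker) is an assumption. The point it illustrates is that the gate consumes only serial text and never touches a socket.

```rust
/// Marker prefix as quoted in the gate description above.
const NIC_PROOF_MARKER: &str = "cloudboot-evidence: provider-nic-bound";

/// Collect the token that follows the marker on each serial line carrying it.
/// Hypothetical stand-in for the harness's serial_marker_tokens step; the
/// "first word after the marker" rule is an assumption of this sketch.
fn marker_tokens<'a>(serial_log: &'a str, marker: &str) -> Vec<&'a str> {
    serial_log
        .lines()
        .filter_map(|line| {
            let start = line.find(marker)?;
            let rest = &line[start + marker.len()..];
            // Require whitespace after the marker so a longer word that merely
            // begins with the marker text does not count as a match.
            if !rest.starts_with(|c: char| c.is_whitespace()) {
                return None;
            }
            rest.split_whitespace().next()
        })
        .collect()
}

/// The gate passes when the proof is not required, or when at least one
/// marker line with a token is present. No socket state is inspected.
fn nic_proof_gate(serial_log: &str, required: bool) -> bool {
    !required || !marker_tokens(serial_log, NIC_PROOF_MARKER).is_empty()
}
```

Under this model, a serial log containing only the boot landmark fails the gate when the proof is required, and passes it when the proof is optional.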
The provider-nic-bound marker is, by its own documented contract
(tools/cloudboot/README.md, “Serial evidence-marker contract”), a
raw-frame bind proof: the non-qemu kernel composes the DeviceMmio +
DMAPool/DMABuffer + MSI-X Interrupt grant proofs over one virtio function,
programs the MSI-X table entry, and tears down with stale-handle assertions. It
explicitly does NOT write any virtio common-config register, does NOT activate
the device, and emits a summary line recording
device_autonomous_raise=not-attempted. There is no IP address, no socket, and
no L4 protocol anywhere in the marker contract. The harness’s structured
provider.json schema (tools/cloudboot/README.md, “provider.json schema”)
likewise has no TCP/UDP/socket/L4 field – the network-facing fields are
provider_nic_proof, enumerated_device_classes,
enumerated_device_inventory, dma_pool_grant, interrupt_route_allocated,
interrupt_route_delivered, and storage_bind_proof, all device/frame-level.
Choosing Option B would mean adopting a milestone acceptance bar (an L4 socket round-trip) that the billable gate does not enforce, and blocking the milestone on a large Phase C chain that the milestone’s own proof substrate never exercises. That is not an honest reading of the gate.
What the production polled path can and cannot reach today
Can reach (raw frame): kernel/src/cap/virtio_net_polled_provider.rs is the
always-built (non-qemu) production provider. It exercises raw-frame DMABuffer
movement over the live virtio function: the provider submits the brokered RX
receive buffer and observes its completion by polling the used ring
(InterruptCapVirtioNetPolledProvider::invoke_wait reads the latched
PublishedRx used.idx/used[0] captured in attempt_rx_submit), with zero
interrupts – no device_interrupt::wait_kernel_injected_dispatch, no
inject_real_lapic_int_for_proof on the wait/ack path. The TX leg is a
kernel-half SLIRP stimulus (a manager-owned broadcast-ARP frame authored on
queue 1 to elicit the inbound reply, attempt_rx_submit “Stimulus” step), not a
provider-submitted frame. One real device->host RX DMA of used_len=76 (an
ethernet frame, ethertype 0x0806 ARP) has been observed this way. This is the
ethernet-frame level: frames traverse the live function in both directions, with
the provider owning the RX receive path.
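As a rough illustration of what “observes its completion by polling the used ring” means, the following host-side sketch models a virtio split-ring used ring and a bounded poll loop. It is not the provider’s code: UsedRing, poll_used, ethertype, the ring size of 4, and the spin bound are all hypothetical, and a real driver would read device-written DMA memory rather than a plain struct. The ethertype helper assumes the slice starts at the ethernet header, with any virtio-net header already stripped.

```rust
use std::sync::atomic::{AtomicU16, Ordering};

/// One used-ring element in the virtio split-ring layout:
/// descriptor head id plus the number of bytes the device wrote.
#[derive(Clone, Copy, Debug, PartialEq)]
struct UsedElem {
    id: u32,
    len: u32,
}

/// Host-side model of a used ring (hypothetical; queue size 4 for brevity).
struct UsedRing {
    idx: AtomicU16,
    ring: [UsedElem; 4],
}

/// Poll used.idx until it moves past `last_seen` or the spin budget runs out.
/// No interrupt is involved: completion is detected purely by reading memory.
fn poll_used(ring: &UsedRing, last_seen: u16, max_spins: u32) -> Option<UsedElem> {
    for _ in 0..max_spins {
        // Acquire pairs with the writer's Release store of idx, so the
        // element contents are visible once the new idx is observed.
        if ring.idx.load(Ordering::Acquire) != last_seen {
            let slot = (last_seen as usize) % ring.ring.len();
            return Some(ring.ring[slot]);
        }
        std::hint::spin_loop();
    }
    None
}

/// Read the big-endian ethertype at bytes 12..14 of an ethernet frame.
fn ethertype(frame: &[u8]) -> Option<u16> {
    let bytes = frame.get(12..14)?;
    Some(u16::from_be_bytes([bytes[0], bytes[1]]))
}
```

In this model, an RX completion shows up as used.idx advancing by one with the element’s len set to the received byte count, and an ARP frame is recognised by ethertype 0x0806.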
Cannot reach (L4): there is no TCP/UDP socket layer in the production data
path. The entire L4 surface is cfg(feature = "qemu")-gated and replaced in
the cloud kernel by kernel/src/virtio_stub.rs, whose socket entry points all
fail closed:
- virtio_stub.rs:create_tcp_listener -> NetworkError::DeviceUnavailable
- virtio_stub.rs:connect_tcp_ipv4 -> NetworkError::DeviceUnavailable
- virtio_stub.rs:create_udp_socket -> NetworkError::DeviceUnavailable
- virtio_stub.rs:send_tcp/recv_tcp -> NetworkError::InvalidSocket
- virtio_stub.rs:accept_tcp -> NetworkError::InvalidListener
- virtio_stub.rs:network_config -> all-zero addr/netmask/gateway
- virtio_stub.rs:poll_scheduler -> no-op
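The fail-closed pattern listed above can be sketched as follows. Only the function names and the returned error variants come from the list. The parameter lists, the ListenerId/SocketId handle types, and the NetworkConfig field types are assumptions made for illustration, not the actual virtio_stub.rs signatures.

```rust
#[derive(Debug, PartialEq, Eq)]
enum NetworkError {
    DeviceUnavailable,
    InvalidSocket,
    InvalidListener,
}

// Hypothetical handle types; the real stub's types are not shown in this document.
#[derive(Debug, PartialEq, Eq)]
struct ListenerId(u32);
#[derive(Debug, PartialEq, Eq)]
struct SocketId(u32);

/// Field names follow the list above; the [u8; 4] representation is assumed.
#[derive(Debug, Default, PartialEq, Eq)]
struct NetworkConfig {
    addr: [u8; 4],
    netmask: [u8; 4],
    gateway: [u8; 4],
}

// Creation and connect paths: there is no device behind the stub.
fn create_tcp_listener(_port: u16) -> Result<ListenerId, NetworkError> {
    Err(NetworkError::DeviceUnavailable)
}
fn connect_tcp_ipv4(_addr: [u8; 4], _port: u16) -> Result<SocketId, NetworkError> {
    Err(NetworkError::DeviceUnavailable)
}
fn create_udp_socket(_port: u16) -> Result<SocketId, NetworkError> {
    Err(NetworkError::DeviceUnavailable)
}

// Data paths: no socket or listener can exist, so every handle is invalid.
fn send_tcp(_socket: SocketId, _data: &[u8]) -> Result<usize, NetworkError> {
    Err(NetworkError::InvalidSocket)
}
fn recv_tcp(_socket: SocketId, _buf: &mut [u8]) -> Result<usize, NetworkError> {
    Err(NetworkError::InvalidSocket)
}
fn accept_tcp(_listener: ListenerId) -> Result<SocketId, NetworkError> {
    Err(NetworkError::InvalidListener)
}

/// All-zero configuration: no address, netmask, or gateway is ever assigned.
fn network_config() -> NetworkConfig {
    NetworkConfig::default()
}

/// Nothing to drive, so polling the stack is a no-op.
fn poll_scheduler() {}
```

The design point is that every entry point returns an explicit error or an inert value instead of panicking or blocking, so a caller in the cloud kernel sees a deterministic failure.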
The cap/network.rs TCP/UDP socket CapObject family
(TcpListener/TcpSocket/UdpSocket, deferred accept/recv waiters, the
socket-terminal handoff) is wired to crate::virtio::poll_scheduler – i.e. to
the stub in production – so in the cloud kernel a userspace caller holding a
socket cap gets DeviceUnavailable/InvalidSocket, not a connection. The
in-kernel smoltcp stack, TCP listeners, accepted-socket state, the cooked-mode
line discipline, and the Telnet IAC filter live only in the cfg(qemu)
kernel/src/virtio.rs build.
Why Option B is genuinely a separate, larger track
Option B is networking-proposal Part 3: Userspace Decomposition (Phase C):
relocating smoltcp and the cap/network.rs socket caps out of the cfg(qemu)
kernel/src/virtio.rs into a userspace NIC-driver process (holding
DeviceMmio/Interrupt/DMAPool) and a userspace network-stack process
(holding the Nic cap + Timer), with applications holding socket caps. Its
declared exit criterion is “the kernel contains no smoltcp dependency and no
virtio-net code on the hot path.” Its prerequisite table (networking-proposal
“Phase C prerequisites”) requires production grantable
DMAPool/DeviceMmio/Interrupt lifecycles, real provider-driver
interrupt wait/ack/mask/unmask consumption, durable audit consumption, an IOMMU
domain or explicit production bounce-buffer policy, and full driver ownership
handoff – and the proposal itself states current DDF evidence is “narrower than
these Phase C prerequisites.” This is a multi-slice chain, not a finishing touch
on the milestone.
Sequencing it after the milestone is also consistent with the GCE polling-path decision already recorded in the backlog (2026-06-01): the production data path is polled, device-autonomous MSI-X is a parallel efficiency follow-up, and the milestone is deliberately decoupled from interrupt delivery. Raw-frame reachability is the layer that decision already commits to; L4 sits above it.
Consequence
- Slices 1-4 (the real polled provider, its default-manifest graduation, the real provider-nic-bound source, and the polled-provider stale-authority teardown) plus slice 6 (the billable make cloudboot-test --require-provider-nic-proof boot) close the usable-cloud-instance milestone’s network data-path reachability bar – the requirement the billable gate actually checks.
- The milestone’s separate SSH/WebShell / network terminal access requirement is not closed by these slices; it is L4 and is deferred to Phase C as future work. The access path demonstrated on the current cloud kernel is the serial-console shell, not a network terminal.
- L4 sockets remain future work under networking-proposal Phase C, gated by
the Phase C prerequisites, not by the data-path bar. No child task chain is
created by this decision; Phase C is tracked where it already lives (the
networking proposal and the DDF Task 5/6 prerequisites in
docs/backlog/hardware-boot-storage.md).
2026-06-08 Follow-Up: Phase C Web UI Chain
The later Phase C serve-from-userspace proof does not reopen the 2026-06-02 raw-frame-vs-L4 decision above. That decision remains the historical scope record for the closed usable-cloud-instance raw-frame data-path bar. The selected milestone has since moved to GCE Self-Hosted Web UI, whose proof chain owns L4 and Web UI reachability through separate task records.
The relevant Phase C design home is “Phase C Userspace NIC Driver Relocation”.
Its local 7c proof has now landed in
cloud-prod-userspace-network-stack-smoltcp-local-proof:
the non-qemu cloudboot manifest starts the userspace smoltcp network-stack
process, serves a scoped TcpListenAuthority, and completes one local
host-forwarded TCP request/response through served TcpListener/TcpSocket
caps. That is local cloudboot L4 evidence, not private GCE reachability and not
public operator ingress.
The current Web UI ladder is task-owned:
- cloud-prod-network-stack-dhcp-ipv4-config-local-proof is done and owns the local DHCP IPv4 configuration, default route, and ARP/neighbor proof for the Phase C userspace stack.
- cloud-prod-remote-session-web-ui-l4-local-proof owns the local cloudboot proof that remote-session-web-ui listens through the Phase C L4 path after the done DHCP/IPv4 configuration proof.
- cloud-gce-private-self-hosted-webui-proof owns the private GCE Web UI proof over the live NIC and remains gated on the local Web UI L4 path plus Web UI hardening tasks: server-side session hardening is done (remote-session-web-ui-session-hardening), and connection bounds are done (remote-session-web-ui-connection-bounds: per-connection request-read/response-send deadlines in the Web UI client over the bounded network-stack listener).
- cloud-gce-public-self-hosted-webui-ingress-tls is the separate public ingress/TLS step and remains on hold pending private GCE proof and explicit public-exposure authorization.
This follow-up changes documentation scope only. It does not change any remaining task status, selected milestone, cloud resource posture, public ingress authority, TLS custody, or production release authority.
Inputs weighed
- tools/cloudboot/run-test.sh (BOOT_LANDMARK, NIC_PROOF_MARKER, REQUIRE_NIC_PROOF, main, PROVIDER_JSON_REQUIRED_KEYS) and tools/cloudboot/README.md (“Serial evidence-marker contract”, “provider.json schema”, “Gate semantics”) – the billable gate, and the single most decisive input.
- kernel/src/virtio_stub.rs – the production L4 surface (all socket entry points fail closed).
- kernel/src/cap/network.rs – the L4 socket CapObject contract, wired to the stubbed poll_scheduler in production.
- kernel/src/cap/virtio_net_polled_provider.rs – the always-built raw-frame polled provider (real device->host RX DMA used_len=76, zero interrupts).
- docs/proposals/networking-proposal.md, Part 3 (Phase C architecture, prerequisites, exit criteria) – the scope of Option B.
- docs/backlog/hardware-boot-storage.md, “Cloud Device Tracks – Real GCE Polling Path (decoupled from MSI-X)” – the track this decision is slice 5 of.