Proposal: Userspace TCP/IP Networking

How capOS gets from “kernel boots” to “userspace process opens a TCP connection.”

This document has two parts: a kernel-internal smoke test (actionable now) and a userspace networking architecture (blocked on Stages 4-6).


Part 1: Kernel-Internal Networking (Phase A)

Prove that capOS can send and receive TCP/IP traffic. Everything runs in-kernel — no IPC, no capability syscalls, no multiple processes needed.

What’s Needed

  1. PCI enumeration — scan config space, find virtio-net device. Uses the standalone PCI/PCIe subsystem described in cloud-deployment-proposal.md Phase 4 (~200 lines of glue code on top of the shared PCI infrastructure)
  2. virtio-net driver — init virtqueues, send/receive raw Ethernet frames. Use virtio-drivers crate or implement manually (~600-800 lines)
  3. Timer — PIT or LAPIC timer for smoltcp’s poll loop (retransmit timeouts, Instant::now() support). Not a full scheduler — just a monotonic clock (~50-100 lines)
  4. smoltcp integration — implement phy::Device trait over the in-kernel driver, create an Interface with static IP, ICMP ping, then TCP
  5. QEMU flags — add -netdev user,id=n0 -device virtio-net-pci,netdev=n0 to the Makefile
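Step 1's config-space scan comes down to encoding bus/device/function/offset into the legacy PCI address written to I/O port 0xCF8 before reading data from 0xCFC. A minimal sketch of that encoding (the function name is an assumption, not the capOS API; the kernel version would be no_std and pair this with port I/O):

```rust
/// Build a legacy PCI configuration-space address (access mechanism #1).
/// Layout: enable bit [31], bus [23:16], device [15:11], function [10:8],
/// dword-aligned register offset [7:0].
fn pci_config_address(bus: u8, device: u8, function: u8, offset: u8) -> u32 {
    0x8000_0000
        | ((bus as u32) << 16)
        | (((device as u32) & 0x1F) << 11)
        | (((function as u32) & 0x07) << 8)
        | ((offset as u32) & 0xFC)
}

fn main() {
    // Probing bus 0, device 3, function 0, vendor/device ID register (offset 0),
    // a typical slot for QEMU's virtio-net-pci device.
    let addr = pci_config_address(0, 3, 0, 0);
    assert_eq!(addr, 0x8000_1800);
    println!("{:#010x}", addr);
}
```

In the kernel this address is written to 0xCF8 and the register value read back from 0xCFC, looping over all bus/device/function combinations until the virtio-net vendor/device ID (0x1AF4:0x1000) turns up.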

Milestones

  • Ping: ICMP echo to QEMU gateway (10.0.2.2 with default user-mode net)
  • HTTP: TCP connection to a host-side server, send GET, receive response

Estimated Scope

~1000-1500 lines of new kernel code. ~200 more for TCP on top of ping.

Crate Dependencies

| Crate | Purpose | no_std |
|---|---|---|
| smoltcp | TCP/IP stack | yes (features: medium-ethernet, proto-ipv4, socket-tcp) |
| virtio-drivers | virtio device abstraction | yes (optional — can implement manually) |

Timer Source Decision

Resolved: PIT is already configured at 100 Hz from Stage 5. A monotonic TICK_COUNT (AtomicU64 in kernel/src/arch/x86_64/context.rs) increments on each timer interrupt, providing ~10ms resolution — sufficient for TCP timeouts. Switch to LAPIC timer when SMP lands (see smp-proposal.md Phase A).
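A userspace sketch of the tick-counter shape described above. The kernel version is no_std (core::sync::atomic), lives in kernel/src/arch/x86_64/context.rs, and is bumped from the timer interrupt handler; the helper names here are illustrative:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Monotonic tick counter, incremented once per PIT interrupt.
static TICK_COUNT: AtomicU64 = AtomicU64::new(0);

/// PIT programmed to 100 Hz from Stage 5, i.e. 10 ms per tick.
const TICK_HZ: u64 = 100;

/// Simulates the timer interrupt handler bumping the counter.
fn on_timer_interrupt() {
    TICK_COUNT.fetch_add(1, Ordering::Relaxed);
}

/// Milliseconds since boot, the value smoltcp's poll loop needs
/// to construct an Instant for retransmit timeouts.
fn uptime_ms() -> u64 {
    TICK_COUNT.load(Ordering::Relaxed) * (1000 / TICK_HZ)
}

fn main() {
    for _ in 0..250 {
        on_timer_interrupt();
    }
    // 250 ticks at 10 ms each.
    assert_eq!(uptime_ms(), 2500);
    println!("{} ms since boot", uptime_ms());
}
```

The ~10 ms granularity is coarse but well within what TCP retransmission timers need (typical minimum RTO is on the order of hundreds of milliseconds).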

QEMU Network Config

| Config | Use case |
|---|---|
| -netdev user,id=n0 -device virtio-net-pci,netdev=n0 | Default: NAT, guest reaches host |
| Add hostfwd=tcp::5555-:80 to the netdev | Forward host port 5555 to guest port 80 |
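A sketch of how those flags might slot into the Makefile (the QEMU_FLAGS variable and run target are assumptions about the existing build setup):

```make
# User-mode networking with a virtio-net NIC. hostfwd lets a host-side
# `curl localhost:5555` reach a guest server listening on port 80.
QEMU_NET := -netdev user,id=n0,hostfwd=tcp::5555-:80 \
            -device virtio-net-pci,netdev=n0

run:
	qemu-system-x86_64 $(QEMU_FLAGS) $(QEMU_NET)
```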

Part 2: Userspace Networking Architecture (Phases B+C)

Blocked on: Stage 4 (Capability Syscalls), Stage 5 (Scheduling), Stage 6 (IPC + Capability Transfer).

Architecture

+--------------------------------------------------+
|  Application Process                             |
|    holds: TcpSocket cap, UdpSocket cap, ...      |
|    calls: connect(), send(), recv() via capnp    |
+---------------------------+----------------------+
                            | IPC (capnp messages)
+---------------------------v----------------------+
|  Network Stack Process (userspace)               |
|    smoltcp TCP/IP stack                          |
|    holds: NIC cap (from driver), Timer cap       |
|    implements: TcpSocket, UdpSocket, Dns caps    |
+---------------------------+----------------------+
                            | IPC (capnp messages)
+---------------------------v----------------------+
|  NIC Driver Process (userspace)                  |
|    virtio-net driver                             |
|    holds: DeviceMmio cap, Interrupt cap          |
|    implements: Nic cap                           |
+---------------------------+----------------------+
                            | capability syscalls
+---------------------------v----------------------+
|  Kernel                                          |
|    DeviceMmio cap: maps BAR into driver process  |
|    Interrupt cap: routes virtio IRQ to driver    |
|    Timer cap: provides monotonic clock           |
+--------------------------------------------------+

Three separate processes, each with minimal authority:

  1. NIC driver – only has access to the specific virtio-net device registers and its interrupt line. Implements the Nic interface.
  2. Network stack – holds the Nic capability from the driver. Runs smoltcp. Implements higher-level socket interfaces.
  3. Application – holds socket capabilities from the network stack. Cannot touch the NIC or raw packets directly.
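The authority split can be illustrated with plain traits and a loopback stand-in for the driver. Everything here is a sketch of the shape, not the actual capOS interfaces: the network stack sees only a Nic capability, never device registers:

```rust
use std::collections::VecDeque;

/// What the network-stack process is granted: frame-level access, nothing more.
trait Nic {
    fn transmit(&mut self, frame: &[u8]);
    fn receive(&mut self) -> Option<Vec<u8>>;
}

/// Stand-in for the driver process: a loopback NIC that echoes frames back.
struct LoopbackNic {
    queue: VecDeque<Vec<u8>>,
}

impl Nic for LoopbackNic {
    fn transmit(&mut self, frame: &[u8]) {
        self.queue.push_back(frame.to_vec());
    }
    fn receive(&mut self) -> Option<Vec<u8>> {
        self.queue.pop_front()
    }
}

/// Stand-in for the network-stack process: holds only a Nic capability,
/// so it cannot reach MMIO registers or the interrupt line.
struct NetStack<N: Nic> {
    nic: N,
}

impl<N: Nic> NetStack<N> {
    fn poll(&mut self) -> Option<Vec<u8>> {
        self.nic.receive()
    }
}

fn main() {
    let mut stack = NetStack { nic: LoopbackNic { queue: VecDeque::new() } };
    stack.nic.transmit(b"ping");
    assert_eq!(stack.poll().as_deref(), Some(&b"ping"[..]));
    println!("frame round-tripped through the Nic capability");
}
```

In the real Phase C layout the trait boundary becomes an IPC boundary: NetStack's calls travel as capnp messages to the driver process, and the application sits one more capability hop above.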

Prerequisites

| Prerequisite | Roadmap Stage | Why |
|---|---|---|
| Capability syscalls | Stage 4 (sync path done) | All resource access via cap invocations |
| Scheduling + preemption | Stage 5 (core done) | Network I/O requires blocking/waking |
| IPC + capability transfer | Stage 6 | Cross-process cap calls |
| Interrupt routing to userspace | New kernel primitive | NIC driver receives IRQs |
| MMIO mapping capability | New kernel primitive | NIC driver accesses device registers |

Phase B: Capability Interfaces

  • Define networking schema (Nic, TcpSocket, etc.) in schema/net.capnp
  • Implement Nic and NetworkManager as kernel-internal CapObjects wrapping the Phase A code
  • Verify capability-based invocation works end-to-end in kernel

Phase C: Userspace Decomposition

  • Move NIC driver into a userspace process
  • Move network stack into a separate userspace process
  • Application process uses socket capabilities via IPC
  • Full capability isolation achieved

Cap’n Proto Schema (draft — will evolve with IPC implementation)

```capnp
interface Nic {
    transmit @0 (frame :Data) -> ();
    receive @1 () -> (frame :Data);
    macAddress @2 () -> (addr :Data);
    linkStatus @3 () -> (up :Bool);
}

interface DeviceMmio {
    map @0 (bar :UInt8) -> (virtualAddr :UInt64, size :UInt64);
    unmap @1 (virtualAddr :UInt64) -> ();
}

interface Interrupt {
    wait @0 () -> ();
    ack @1 () -> ();
}

interface Timer {
    now @0 () -> (ns :UInt64);
    sleep @1 (ns :UInt64) -> ();
}

interface TcpSocket {
    connect @0 (addr :Data, port :UInt16) -> ();
    send @1 (data :Data) -> (bytesSent :UInt32);
    recv @2 (maxLen :UInt32) -> (data :Data);
    close @3 () -> ();
}

interface NetworkManager {
    createTcpListener @0 () -> (listener :TcpListener);
    createUdpSocket @1 () -> (socket :UdpSocket);
    getConfig @2 () -> (addr :Data, netmask :Data, gateway :Data);
}
```

Open Questions

  1. DMA memory management. Dedicated DmaAllocator capability vs extending FrameAllocator with allocDma?
  2. Blocking model. Kernel blocks caller on IPC channel vs return “would block” vs both?
  3. Buffer ownership. Copy into IPC message vs shared memory vs capability lending?
