# Proposal: Go Language Support via Custom GOOS

Running Go programs natively on capOS by implementing a `GOOS=capos` target
in the Go runtime.


## Current Manual Pages

- [Go VirtualMemory Contract](../backlog/go-virtual-memory-contract.md)
  freezes the current allocator-facing memory contract for this proposal.
- [Programming Languages](../programming-languages.md) summarizes the current
  language support matrix and the distinction between native runtime adapters,
  POSIX compatibility adapters, and WASI host adapters. The Go row points back
  here for the native `GOOS=capos` track and to the WASI host adapter's
  Phase W.8 TinyGo / upstream `GOOS=wasip1` CUE evaluator path.
- [Userspace Binaries](userspace-binaries-proposal.md) holds the overall
  language-runtime track. Its "Future: Go (`GOOS=capos`)" section delegates
  the native plan to this proposal, and its "Phase W.8 (TinyGo / Go-on-WASI
  CUE evaluator, blocked)" entry tracks the WASI-side interim path.
- [WASI Host Adapter](wasi-host-adapter-proposal.md) documents the in-tree
  wasmi-backed host. Phase W.8 there is the TinyGo / upstream Go
  (`GOOS=wasip1`) CUE evaluator slice that runs inside the host adapter and
  bridges to the native Go track described in this proposal. The detailed
  plan lives in [`docs/proposals/wasi-host-adapter-proposal.md`](wasi-host-adapter-proposal.md)
  Task 9.
- [In-Process Threading](../architecture/threading.md) freezes the
  thread/process ownership contract that Phase 2 of this proposal builds on.
- [Park Authority](../architecture/park.md) freezes the compact
  `CAP_OP_PARK` / `CAP_OP_UNPARK` ABI that the Go runtime's futex glue must
  target instead of a Linux-style futex syscall namespace.
- [Memory Management](../architecture/memory.md) documents the implemented
  kernel memory and baseline `VirtualMemory` behavior.
- [Userspace Runtime](../architecture/userspace-runtime.md) documents the
  `capos-rt` client surface that a future Go runtime port will call.
- [LLVM Target](../research/llvm-target.md) is the main research grounding for
  Go runtime and target-triple work.

## Motivation

Go is the implementation language of CUE, the configuration language planned
for system manifests. Beyond CUE, Go has a large ecosystem of systems
software (container runtimes, network tools, observability agents) that would
be valuable to run on capOS without rewriting.

The userspace-binaries proposal keeps Go as a dedicated future runtime track.
This proposal explores the native path: a custom `GOOS=capos` that lets Go
programs run directly on capOS hardware, without a WASM interpreter in between.
Go through WASI remains a narrower option for CPU-bound tools such as CUE
evaluation before the native runtime port exists.

## Why Go is Hard

Go's runtime is a userspace operating system. It manages its own:

- **Goroutine scheduler** — M:N threading (M OS threads, N goroutines),
  work-stealing, preemption via signals or cooperative yield points
- **Garbage collector** — concurrent, tri-color mark-sweep, requires
  write barriers, stop-the-world pauses, and memory management syscalls
- **Stack management** — segmented/copying stacks with guard pages,
  grow/shrink on demand
- **Network poller** — epoll/kqueue-based async I/O for `net.Conn`
- **Memory allocator** — mmap-based, spans, mcache/mcentral/mheap hierarchy
- **Signal handling** — goroutine preemption, crash reporting, profiling

Each of these assumes a specific OS interface. The Go runtime calls ~40
distinct syscalls on Linux. capOS currently has 2.

## Syscall Surface Required

The Go runtime's Linux syscall usage, grouped by subsystem:

### Memory Management (critical, blocks everything)

| Go runtime needs | Linux syscall | capOS equivalent |
|---|---|---|
| Heap allocation | `mmap(MAP_ANON)` | `VirtualMemory.reserve` + `commit`, or compatibility `map` |
| Heap deallocation | `munmap` | `VirtualMemory.unmap` releases reservations and committed frames |
| Stack guard pages | `mmap(PROT_NONE)` + `mprotect` | Reserve uncommitted guard pages; use committed `VM_PROT_NONE` only when contents must be retained |
| GC needs contiguous arenas | `mmap` with hints | Contiguous virtual reservations; physical frames are committed sparsely |
| Commit/decommit pages | `madvise(DONTNEED)` | `VirtualMemory.commit` / `decommit` within reserved ranges |

**capOS needs:** A `sys_mmap`-like capability or syscall that can:
- Map anonymous pages at arbitrary user addresses
- Set per-page permissions (R, W, X, none)
- Allocate contiguous virtual ranges without requiring contiguous physical frames
- Decommit without unmapping (for GC arena management)

This could be a `VirtualMemory` capability:

```capnp
interface VirtualMemory {
    # Map anonymous pages at hint address (0 = kernel chooses)
    map @0 (hint :UInt64, size :UInt64, prot :UInt32) -> (addr :UInt64);
    # Unmap pages
    unmap @1 (addr :UInt64, size :UInt64) -> ();
    # Change permissions on mapped range
    protect @2 (addr :UInt64, size :UInt64, prot :UInt32) -> ();
    # Reserve virtual address space without physical frames
    reserve @3 (hint :UInt64, size :UInt64) -> (addr :UInt64);
    # Commit physical frames inside a reserved range
    commit @4 (addr :UInt64, size :UInt64, prot :UInt32) -> ();
    # Decommit physical frames while keeping the range reserved
    decommit @5 (addr :UInt64, size :UInt64) -> ();
}
```

The exact Go allocator contract is frozen in
[Go VirtualMemory Contract](../backlog/go-virtual-memory-contract.md): `map`
stays a compatibility operation, while `reserve`, `commit`, and `decommit`
separate virtual address reservation from physical frame commitment and make
guard-page behavior explicit.
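As a rough illustration of how the runtime's memory hooks might sit on top of this interface, the sketch below exercises the reserve/commit/decommit sequence against an in-memory fake on a development host. The Go `VirtualMemory` interface, `fakeVM`, the `protRW` encoding, and the simplified `sysReserve`/`sysMap`/`sysUnused` signatures are all assumptions for illustration; they are not the real runtime or `capos-rt` signatures.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	pageSize = 4096
	protRW   = 0x3 // hypothetical read|write encoding for the prot argument
)

// VirtualMemory is a hypothetical Go-side view of the capability; the real
// capos-rt client surface may differ.
type VirtualMemory interface {
	Reserve(hint, size uint64) (addr uint64, err error)
	Commit(addr, size uint64, prot uint32) error
	Decommit(addr, size uint64) error
}

// fakeVM tracks per-page state in memory so the call sequence can be tried
// on a host. It does not map any real memory.
type fakeVM struct {
	next      uint64
	reserved  map[uint64]bool
	committed map[uint64]bool
}

func newFakeVM() *fakeVM {
	return &fakeVM{
		next:      0x1000_0000,
		reserved:  map[uint64]bool{},
		committed: map[uint64]bool{},
	}
}

func (v *fakeVM) Reserve(hint, size uint64) (uint64, error) {
	if size == 0 || size%pageSize != 0 {
		return 0, errors.New("size must be a non-zero multiple of the page size")
	}
	addr := v.next // the fake ignores hint and picks the address itself
	v.next += size
	for p := addr; p < addr+size; p += pageSize {
		v.reserved[p] = true
	}
	return addr, nil
}

func (v *fakeVM) Commit(addr, size uint64, prot uint32) error {
	for p := addr; p < addr+size; p += pageSize {
		if !v.reserved[p] {
			return errors.New("commit outside a reserved range")
		}
	}
	for p := addr; p < addr+size; p += pageSize {
		v.committed[p] = true
	}
	return nil
}

func (v *fakeVM) Decommit(addr, size uint64) error {
	for p := addr; p < addr+size; p += pageSize {
		delete(v.committed, p) // the page stays reserved
	}
	return nil
}

// The wrappers loosely mirror the roles of the runtime's memory hooks.
func sysReserve(vm VirtualMemory, size uint64) (uint64, error) { return vm.Reserve(0, size) }
func sysMap(vm VirtualMemory, addr, size uint64) error         { return vm.Commit(addr, size, protRW) }
func sysUnused(vm VirtualMemory, addr, size uint64) error      { return vm.Decommit(addr, size) }

func main() {
	vm := newFakeVM()
	base, err := sysReserve(vm, 64<<20) // reserve one 64 MiB arena
	if err != nil {
		panic(err)
	}
	_ = sysMap(vm, base, 2*pageSize)  // commit two pages
	_ = sysUnused(vm, base, pageSize) // hand one back, keep it reserved
	fmt.Printf("reserved pages: %d, committed pages: %d\n", len(vm.reserved), len(vm.committed))
}
```

The point of the split is visible in the fake: reserving changes only the reservation map, and decommitting never shrinks it.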

### Threading (critical for goroutines)

| Go runtime needs | Linux syscall | capOS equivalent |
|---|---|---|
| Create OS thread | `clone(CLONE_THREAD)` | Thread capability / in-process thread lifecycle |
| Thread-local storage | `arch_prctl(SET_FS)` | `ThreadControl.setFsBase`; per-ThreadRef TLS ownership for Go integration |
| Block thread | `futex(WAIT)` | `ParkSpace` compact `CAP_OP_PARK` |
| Wake thread | `futex(WAKE)` | `ParkSpace` compact `CAP_OP_UNPARK` |
| Thread exit | `exit(thread)` | `ThreadControl.exitThread` capability operation |

**capOS baseline:** process-local thread lifecycle and private `ParkSpace`
wait/wake exist as the kernel substrate. The remaining Go work is runtime
integration: capos-rt clients, `newosproc` glue, per-ThreadRef TLS ownership,
and GC/runtime coordination across those kernel threads.

`ThreadControl.setFsBase` is a current-`ThreadRef` operation, not a
process-global mutation. Go integration must allocate a distinct TLS block and
FS base for each runtime M/OS thread, and context switch must preserve FS base
as per-thread state before true multi-threaded Go is treated as supported.
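
A minimal sketch of the per-thread TLS rule, assuming a hypothetical Go-side `ThreadControl` interface whose `SetFsBase` acts on the calling thread. `threadRef`, `fakeThreadControl`, the simplified `m` struct, and the six-slot TLS array are illustrative assumptions, not the real runtime layout.

```go
package main

import (
	"fmt"
	"unsafe"
)

// threadRef is a hypothetical handle for a capOS thread.
type threadRef uint64

// ThreadControl is a hypothetical view of the capability: SetFsBase changes
// the FS base of the calling thread only.
type ThreadControl interface {
	SetFsBase(base uintptr)
}

// fakeThreadControl keeps one FS base per thread, the way the kernel would
// keep it as per-thread saved state.
type fakeThreadControl struct {
	current threadRef             // which thread is "running" in this model
	fsBase  map[threadRef]uintptr // per-thread FS base
}

func (c *fakeThreadControl) SetFsBase(base uintptr) { c.fsBase[c.current] = base }

// m loosely mirrors the runtime's per-OS-thread structure: each one carries
// its own TLS slots, so no two threads share a block. The slot count here is
// arbitrary.
type m struct {
	tls [6]uintptr
}

// settls is what a newly started thread would run first: point its own FS
// base at its own TLS block.
func settls(tc ThreadControl, mp *m) {
	tc.SetFsBase(uintptr(unsafe.Pointer(&mp.tls[0])))
}

func main() {
	tc := &fakeThreadControl{fsBase: map[threadRef]uintptr{}}
	ms := []*m{{}, {}} // one TLS owner per thread, kept alive by this slice
	for i, mp := range ms {
		tc.current = threadRef(i + 1) // model a switch to that thread
		settls(tc, mp)
	}
	fmt.Println("distinct FS bases:", tc.fsBase[1] != tc.fsBase[2])
}
```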

Design alternatives considered:

**Option A: Kernel threads.** The kernel manages threads (multiple execution
contexts sharing one address space). Each thread has its own stack, register
state, and FS base, but shares page tables and cap table with the process.
This is what Linux does and what Go expects.

**Option B: User-level threading.** The process manages its own threads (like
green threads). The kernel only sees one execution context per process. Go's
scheduler already does M:N threading, so it could work with a single OS
thread per process — but the GC's stop-the-world relies on being able to
stop other OS threads, and the network poller blocks an OS thread.

Option A is the selected substrate for Go compatibility. Option B is more
capability-aligned (threads are a process-internal concern), but it requires
larger Go runtime modifications and does not fit the current kernel-thread
checkpoint.

### Synchronization

| Go runtime needs | Linux syscall | capOS equivalent |
|---|---|---|
| Park wait | `futex(FUTEX_WAIT)` | `ParkSpace` compact `CAP_OP_PARK` |
| Park wake | `futex(FUTEX_WAKE)` | `ParkSpace` compact `CAP_OP_UNPARK` |
| Atomic compare-and-swap | CPU instructions | Already available (no kernel support needed) |

Linux futexes are a kernel primitive (block/wake on a userspace address). capOS
exposes park authority through a `ParkSpace` capability from the start. Go
futex glue should target the compact capability-authorized park operations
defined in the ParkSpace architecture rather than introducing a Linux-style
futex syscall namespace or routing failed wait / empty wake through generic
Cap'n Proto method dispatch. Block/resume latency still needs to be measured
under Go's runtime workload, but that does not change the authority or key
model.
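
The shape of that glue can be sketched on a development host. The `ParkSpace` Go interface and `fakePark` below are hypothetical stand-ins for the compact park operations, and `futexsleep`/`futexwake` take an extra `ParkSpace` argument (and `futexwake` returns a count) purely so the sketch is testable; the real hooks and ABI may look different.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// ParkSpace is a hypothetical Go-side view of park authority.
type ParkSpace interface {
	// Park blocks while *key == expected; timeout < 0 means wait forever.
	Park(key *uint32, expected uint32, timeout time.Duration) (woken bool)
	// Unpark wakes up to count waiters on key and reports how many it woke.
	Unpark(key *uint32, count int) (woken int)
}

// fakePark is a hosted stand-in that queues one channel per waiter.
type fakePark struct {
	mu      sync.Mutex
	waiters map[*uint32][]chan struct{}
}

func newFakePark() *fakePark {
	return &fakePark{waiters: map[*uint32][]chan struct{}{}}
}

func (p *fakePark) Park(key *uint32, expected uint32, timeout time.Duration) bool {
	p.mu.Lock()
	if atomic.LoadUint32(key) != expected {
		p.mu.Unlock()
		return false // failed wait: the word already changed, so never block
	}
	ch := make(chan struct{})
	p.waiters[key] = append(p.waiters[key], ch)
	p.mu.Unlock()

	var expired <-chan time.Time // nil channel: never fires when there is no timeout
	if timeout >= 0 {
		expired = time.After(timeout)
	}
	select {
	case <-ch:
		return true
	case <-expired:
	}
	// Timed out: withdraw from the queue unless a wake already claimed us.
	p.mu.Lock()
	defer p.mu.Unlock()
	q := p.waiters[key]
	for i, w := range q {
		if w == ch {
			p.waiters[key] = append(q[:i], q[i+1:]...)
			return false
		}
	}
	return true
}

func (p *fakePark) Unpark(key *uint32, count int) int {
	p.mu.Lock()
	defer p.mu.Unlock()
	q := p.waiters[key]
	n := 0
	for n < count && n < len(q) {
		close(q[n])
		n++
	}
	p.waiters[key] = q[n:]
	return n
}

// futexsleep parks while *addr == val; ns < 0 means no timeout.
func futexsleep(ps ParkSpace, addr *uint32, val uint32, ns int64) {
	ps.Park(addr, val, time.Duration(ns))
}

// futexwake wakes up to cnt sleepers on addr.
func futexwake(ps ParkSpace, addr *uint32, cnt uint32) int {
	return ps.Unpark(addr, int(cnt))
}

func main() {
	ps := newFakePark()
	var word uint32
	done := make(chan struct{})
	go func() {
		futexsleep(ps, &word, 0, -1) // parks only while word is still 0
		close(done)
	}()
	atomic.StoreUint32(&word, 1) // publish the new value first
	futexwake(ps, &word, 1)      // then wake; an empty wake is harmless
	<-done
	fmt.Println("sleeper resumed, word =", atomic.LoadUint32(&word))
}
```

Because the expected-value check and the enqueue happen under one lock, a waker that stores first and wakes second cannot lose the wakeup, whichever side runs first.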

### Time

| Go runtime needs | Linux syscall | capOS equivalent |
|---|---|---|
| Monotonic clock | `clock_gettime(MONOTONIC)` | Timer cap `.now()` |
| Wall clock | `clock_gettime(REALTIME)` | Timer cap or RTC driver |
| Sleep | `nanosleep` or `futex` with timeout | Timer cap `.sleep()` or park timeout |
| Timer events | `timer_create` / `timerfd` | Timer cap with callback or poll |

Timer cap `now` and `sleep` are implemented for monotonic time and bounded
sleep. Wall-clock time and timerfd-style event sources remain future work.
ThreadControl `getFsBase` and `setFsBase` are implemented for current-process
runtime FS-base ownership; making FS base per-thread remains part of kernel
threading.

### I/O

| Go runtime needs | Linux syscall | capOS equivalent |
|---|---|---|
| Network I/O | `epoll_create`, `epoll_ctl`, `epoll_wait` | Async cap invocation or poll cap |
| File I/O | `read`, `write`, `open`, `close` | Directory/File or Namespace/Store caps through Go's OS adapter |
| Stdout/stderr | `write(1, ...)`, `write(2, ...)` | Console cap |
| Pipe (runtime internal) | `pipe2` | IPC caps or in-process channel |

Go's network poller (`netpoll`) is pluggable per-OS — each GOOS provides
its own implementation. For capOS, it would use async capability invocations
or a polling interface over socket caps.

### Signals (for preemption)

| Go runtime needs | Linux syscall | capOS equivalent |
|---|---|---|
| Goroutine preemption | `tgkill` + `SIGURG` | Thread preemption mechanism |
| Crash handling | `sigaction(SIGSEGV)` | Page fault notification |
| Profiling | `sigaction(SIGPROF)` + `setitimer` | Profiling cap (optional) |

Go 1.14+ uses asynchronous preemption: the runtime sends `SIGURG` to a
thread to interrupt a long-running goroutine. On capOS, alternatives:

- **Cooperative preemption only.** Go checks for preemption at function
  prologues, piggybacking on the stack-growth check. This works but means
  tight loops without function calls won't yield. Acceptable for initial
  support.
- **Timer interrupt notification.** The kernel notifies the process (via a
  cap invocation or a signal-like mechanism) when a time quantum expires.
  The notification handler in the Go runtime triggers goroutine preemption.
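
Both alternatives reduce to the same pattern: something sets a request flag, and the running code observes it at a yield point. A minimal hosted sketch follows; `onQuantumExpired`, `yieldPoint`, and `sumUntilPreempted` are hypothetical names, and the notification is simulated inline rather than delivered by a kernel.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// preemptRequested is set by a (hypothetical) time-quantum notification.
var preemptRequested uint32

// onQuantumExpired models the kernel telling the process its quantum is up.
func onQuantumExpired() { atomic.StoreUint32(&preemptRequested, 1) }

// yieldPoint is the check a cooperative scheme performs: it consumes a
// pending request and reports whether the caller should yield.
func yieldPoint() bool {
	return atomic.CompareAndSwapUint32(&preemptRequested, 1, 0)
}

// sumUntilPreempted adds up data but stops at the first yield point that
// sees a pending request. notifyAt is the index at which the simulated
// notification arrives; a negative value means it never does.
func sumUntilPreempted(data []int, notifyAt int) (sum, stoppedAt int) {
	for i, v := range data {
		if i == notifyAt {
			onQuantumExpired()
		}
		if yieldPoint() {
			return sum, i
		}
		sum += v
	}
	return sum, len(data)
}

func main() {
	data := []int{1, 2, 3, 4, 5}
	sum, stoppedAt := sumUntilPreempted(data, 3)
	fmt.Println("partial sum:", sum, "yielded at index:", stoppedAt)
}
```

A loop with no yield point would never observe the flag, which is the limitation of the cooperative-only option.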

## Implementation Strategy

### Phase 1: Minimal GOOS (single-threaded, cooperative)

Fork the Go toolchain, add `GOOS=capos GOARCH=amd64`. Implement the minimum
`runtime` changes:

**What to implement:**
- `osinit()` — read Timer cap from CapSet for monotonic clock
- `sysAlloc/sysFree/sysReserve/sysMap` — translate to VirtualMemory cap
- `settls()` — translate Go's FS-base install to ThreadControl
- `newosproc()` — stub (single OS thread, M:N scheduler still works with M=1)
- `futexsleep/futexwake` — spin-based fallback (no real futex yet)
- `nanotime/walltime` — Timer cap
- `write()` (for runtime debug output) — Console cap
- `exit` — sys_exit for current-thread termination; the process exits when its
  last live thread exits
- `exitThread` — terminal `ThreadControl.exitThread` capability operation
- `netpoll` — stub returning "nothing ready" (no async I/O)

**What to stub/disable:**
- Signals (no SIGURG preemption, cooperative only)
- Multi-threaded GC (single-thread STW is fine initially)
- CGo (no C interop)
- Profiling
- Core dumps

**Deliverable:** `GOOS=capos go build ./cmd/hello` produces an ELF that
runs on capOS, prints "Hello, World!", and exits.

Current capOS status: the `single-thread-runtime` QEMU demo proves the
capability-side checkpoint for this phase without a Go fork yet. It maps,
protects, and frees heap pages through `VirtualMemoryClient`, uses `TimerClient`
for monotonic `now` and sleep, keeps `newosproc` unsupported, and exercises the
temporary park fallback path locally.
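
The spin-based `futexsleep`/`futexwake` fallback listed for this phase could be sketched as below. The `Timer` interface, `stepTimer` fake clock, and the `futexsleepSpin`/`futexwakeSpin` names and signatures are assumptions for illustration; the real hooks take no timer or yield arguments.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Timer is a hypothetical Go-side view of the Timer cap's monotonic clock,
// in nanoseconds.
type Timer interface {
	Now() int64
}

// stepTimer is a fake clock that advances by a fixed step on every read.
type stepTimer struct{ now, step int64 }

func (t *stepTimer) Now() int64 {
	t.now += t.step
	return t.now
}

// futexsleepSpin polls *addr until it no longer equals val or until ns
// nanoseconds have passed. ns < 0 means no timeout. It reports whether the
// value changed. yield is called between polls so other work can run.
func futexsleepSpin(t Timer, addr *uint32, val uint32, ns int64, yield func()) bool {
	var deadline int64
	if ns >= 0 {
		deadline = t.Now() + ns
	}
	for atomic.LoadUint32(addr) == val {
		if ns >= 0 && t.Now() >= deadline {
			return false
		}
		yield()
	}
	return true
}

// futexwakeSpin is a no-op: a spinning sleeper observes the store itself.
func futexwakeSpin(addr *uint32, cnt uint32) {}

func main() {
	var word uint32
	clock := &stepTimer{step: 10}
	yields := 0
	changed := futexsleepSpin(clock, &word, 0, 100, func() { yields++ })
	fmt.Println("changed:", changed, "yields:", yields)
}
```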

**Estimated effort:** ~2000-3000 lines of Go runtime code (mostly in
`runtime/os_capos.go`, `runtime/sys_capos_amd64.s`,
`runtime/mem_capos.go`). For the OS glue file alone, `runtime/os_js.go` (WASM
target) is ~400 lines and `runtime/os_linux.go` is ~700 lines; `os_capos.go`
should sit between these.

### Phase 2: In-Process Threading + Park

Build on implemented kernel support for:
- multiple threads per process on the single-CPU scheduler first;
- private `ParkSpace` compact wait/wake;
- current-thread FS-base updates through `ThreadControl`.

Update Go runtime:
- `newosproc()` creates a real kernel thread
- `futexsleep/futexwake` use the `ParkSpace` compact park ABI
- thread creation allocates and owns distinct TLS state per `ThreadRef`
- GC can coordinate across multiple kernel threads in one process
- Enable real blocking instead of the temporary single-thread park fallback

**Deliverable:** Go programs can create multiple in-process kernel threads and
block/wake through `ParkSpace` park/unpark on one CPU. Multiple CPU-core
execution remains a later SMP milestone after the threading/park contract is
settled.

The 7.1.0 thread/process ownership contract is now frozen in
[In-Process Threading](../architecture/threading.md). It keeps address space,
cap table, CapSet, and the capability ring process-owned; makes saved context,
kernel stack, block state, and FS base thread-owned; charges thread records and
kernel stacks to process-owned ledgers; and preserves a single process ring
waiter until a later ring-sharding design exists.

The 7.1.1 park authority contract is frozen in
[Park Authority](../architecture/park.md). It defines process-local
ParkSpace authority for private park keys, a future MemoryObject-derived
SharedParkSpace model for shared park-words, and compact `CAP_OP_PARK` /
`CAP_OP_UNPARK` operations as the starting ABI for the Go runtime
synchronization path.

### Phase 3: Network Poller

Implement `runtime/netpoll_capos.go`:
- Register socket caps with the poller
- Use an async notification mechanism (capability-based `poll()` or
  notification cap)
- `net.Dial()`, `net.Listen()`, `http.Get()` work

This depends on the networking stack being available as capabilities.
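
A rough hosted sketch of the register/notify/drain flow, assuming readiness arrives as queued notifications. The method names echo the runtime's per-OS poller hooks, but the signatures are simplified, and `capID`, `event`, `poller`, and `notify` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// capID is a hypothetical handle for a socket capability.
type capID uint32

// event is one readiness report for a socket.
type event struct {
	sock               capID
	readable, writable bool
}

// poller keeps the set of registered sockets and a queue of pending
// readiness events; the channel stands in for a notification cap.
type poller struct {
	mu         sync.Mutex
	registered map[capID]bool
	pending    chan event
}

func netpollinit() *poller {
	return &poller{registered: map[capID]bool{}, pending: make(chan event, 64)}
}

func (p *poller) netpollopen(sock capID) {
	p.mu.Lock()
	p.registered[sock] = true
	p.mu.Unlock()
}

func (p *poller) netpollclose(sock capID) {
	p.mu.Lock()
	delete(p.registered, sock)
	p.mu.Unlock()
}

// notify models the network stack reporting readiness. It returns false if
// the queue is full instead of blocking the reporter.
func (p *poller) notify(ev event) bool {
	select {
	case p.pending <- ev:
		return true
	default:
		return false
	}
}

// netpoll drains whatever is ready without blocking and drops events for
// sockets that are not (or no longer) registered.
func (p *poller) netpoll() []event {
	var ready []event
	for {
		select {
		case ev := <-p.pending:
			p.mu.Lock()
			ok := p.registered[ev.sock]
			p.mu.Unlock()
			if ok {
				ready = append(ready, ev)
			}
		default:
			return ready
		}
	}
}

func main() {
	p := netpollinit()
	p.netpollopen(7)
	p.notify(event{sock: 7, readable: true})
	p.notify(event{sock: 9, writable: true}) // never registered, so dropped
	for _, ev := range p.netpoll() {
		fmt.Printf("socket cap %d ready (readable=%v writable=%v)\n", ev.sock, ev.readable, ev.writable)
	}
}
```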

**Deliverable:** Go HTTP client/server runs on capOS.

### Phase 4: CUE on capOS

With Go working, CUE runs natively. This enables:
- Runtime manifest evaluation (not just build-time)
- Dynamic service reconfiguration via CUE expressions
- CUE-based policy enforcement in the capability layer

## Kernel Prerequisites

| Prerequisite | Roadmap Stage | Why |
|---|---|---|
| Capability syscalls | Stage 4 (sync path done) | Go runtime invokes caps (VirtualMemory, Timer, Console) |
| Scheduling | Stage 5 (core done) | Go needs timer interrupts for goroutine preemption fallback |
| IPC + cap transfer | Stage 6 | Go programs are service processes that export/import caps |
| VirtualMemory capability | Stage 5 | mmap equivalent for Go's memory allocator and GC |
| ThreadControl capability | Extends Stage 5 | `settls` equivalent before full in-process threads |
| Thread lifecycle | Extends Stage 5 | Implemented substrate for multiple execution contexts per process; Go integration remains |
| `ParkSpace` capability | Extends Stage 5 | Go runtime synchronization through compact park/unpark |

### VirtualMemory Capability

This is the biggest new kernel primitive. Go's allocator requires:

1. **Reserve** large virtual ranges without committing physical memory
   (on 64-bit systems Go's heap address space spans up to 256 TiB, which it
   reserves incrementally in arena-sized chunks rather than up front)
2. **Commit** pages within reserved ranges (back with physical frames)
3. **Decommit** pages (release frames, keep virtual range reserved)
4. **Set permissions** (RW for data, none for committed inaccessible pages;
   pure guard pages should stay reserved but uncommitted)

The existing page table code (`kernel/src/mem/paging.rs`) supports mapping
and unmapping individual pages. It needs to be extended with:
- Virtual range reservation (mark ranges as reserved in some bitmap/tree)
- Lazy commit (map as `PROT_NONE` initially, page fault handler commits
  on demand — or explicit commit via cap call)
- Permission changes on existing mappings

The concrete ABI for the first explicit-commit path is in
[Go VirtualMemory Contract](../backlog/go-virtual-memory-contract.md). It
chooses explicit `commit`/`decommit` before demand paging, permits
`VM_PROT_NONE` through reservation metadata plus non-present user PTEs, and
requires separate virtual-reservation and physical-commit quota ledgers.
Committed `VM_PROT_NONE` intentionally retains allocated frames and page
contents for later protection restore. Pure guard pages should use reserved
uncommitted pages so they consume virtual quota but no physical commit budget.
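
The two-ledger accounting can be illustrated with a small model. This is a sketch under assumed names (`quota`, `ledgers`, `allocStack`) and an assumed one-page guard; the real ledger structure is defined by the contract, not by this code.

```go
package main

import (
	"errors"
	"fmt"
)

const pageSize = 4096

// quota is a simple used/limit counter in bytes.
type quota struct{ used, limit uint64 }

func (q *quota) charge(n uint64) error {
	if n > q.limit-q.used {
		return errors.New("quota exceeded")
	}
	q.used += n
	return nil
}

func (q *quota) release(n uint64) {
	if n > q.used {
		n = q.used
	}
	q.used -= n
}

// ledgers keeps virtual reservation and physical commit budgets apart.
type ledgers struct {
	virtual  quota // charged by reserve
	physical quota // charged by commit
}

// allocStack reserves the stack plus one guard page but commits only the
// stack, so the guard page costs virtual quota and no physical budget.
func allocStack(l *ledgers, stackSize uint64) error {
	total := stackSize + pageSize
	if err := l.virtual.charge(total); err != nil {
		return err
	}
	if err := l.physical.charge(stackSize); err != nil {
		l.virtual.release(total) // roll back the reservation
		return err
	}
	return nil
}

func main() {
	l := &ledgers{virtual: quota{limit: 1 << 30}, physical: quota{limit: 64 << 10}}
	if err := allocStack(l, 32<<10); err != nil {
		panic(err)
	}
	fmt.Printf("virtual used: %d bytes, physical used: %d bytes\n", l.virtual.used, l.physical.used)
}
```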

### Thread Support

Extending the process model (`kernel/src/process.rs`) now follows the contract
in [In-Process Threading](../architecture/threading.md). See the
[SMP proposal](smp-proposal.md) for the `PerCpu` struct layout (per-CPU
kernel stack, saved registers, FS base); `Thread` extends this for
multi-thread-per-process. See also the In-Process Threading section in
[`docs/roadmap.md`](../roadmap.md) for the roadmap-level view.

```rust
struct Process {
    pid: u64,
    address_space: AddressSpace,  // shared by all threads
    caps: CapTable,               // shared by all threads
    threads: Vec<Thread>,
}

struct Thread {
    tid: u64,
    state: ThreadState,
    kernel_stack: VirtAddr,
    saved_regs: RegisterState,    // rsp, rip, etc.
    fs_base: u64,                 // for thread-local storage
}
```

The scheduler (Stage 5) schedules threads, not processes. Each thread gets
its own kernel stack and register save area. Context switch saves/restores
thread state. Page table switch only happens when switching between threads
of different processes.
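
The last rule can be checked with a small hosted Go model (not kernel code). `thread` and `countSwitches` are hypothetical names; the model only counts which context switches also need a page-table switch.

```go
package main

import "fmt"

// thread identifies a schedulable thread by owning process and thread id.
type thread struct{ pid, tid uint64 }

// countSwitches walks a run order and counts context switches, and how many
// of them cross a process boundary and so also switch page tables.
func countSwitches(order []thread) (contextSwitches, pageTableSwitches int) {
	for i := 1; i < len(order); i++ {
		prev, next := order[i-1], order[i]
		if prev == next {
			continue // the same thread keeps running
		}
		contextSwitches++
		if prev.pid != next.pid {
			pageTableSwitches++
		}
	}
	return contextSwitches, pageTableSwitches
}

func main() {
	order := []thread{{1, 1}, {1, 2}, {2, 3}, {2, 3}, {1, 1}}
	cs, pts := countSwitches(order)
	fmt.Println("context switches:", cs, "page-table switches:", pts)
}
```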

## Alternative: Go via WASI

For comparison, the WASI path from the userspace-binaries proposal:

| | Native GOOS | WASI |
|---|---|---|
| Performance | Native speed | ~2-5x overhead (wasm interpreter/JIT) |
| Go compatibility | Full (after Phase 3) | Limited (WASI Go support is experimental) |
| Goroutines | Real M:N scheduling | Single-threaded (WASI has no threads yet) |
| Net I/O | Native async via poller | Blocking only (WASI sockets are sync) |
| Kernel work | VirtualMemory, threads, park | None (wasm runtime handles it) |
| Go runtime fork | Yes (maintain a fork) | No (upstream `GOOS=wasip1`) |
| GC | Full concurrent GC | Runs on one thread; upstream Go keeps its precise collector, TinyGo defaults to a conservative one |
| Maintenance burden | High (track Go releases) | Low (upstream supported) |

**WASI is easier but limited.** Go on WASI (`GOOS=wasip1`) is officially
supported but experimental — no goroutine parallelism, no async I/O, limited
stdlib. For running CUE (which is CPU-bound evaluation, no I/O, single
goroutine), WASI might be sufficient.

**Native GOOS is harder but complete.** Full Go with goroutines, concurrent
GC, network I/O, and the entire stdlib. Required for Go network services
or anything using `net/http`.

**Recommendation:** Start with WASI for CUE evaluation. The in-tree path is
[WASI Host Adapter](wasi-host-adapter-proposal.md) Phase W.8 (and Task 9 of
[`docs/proposals/wasi-host-adapter-proposal.md`](wasi-host-adapter-proposal.md)): a CUE
evaluator binary built against TinyGo or upstream Go's `GOOS=wasip1`, loaded
through the host adapter against a future `ScriptPackage` cap. Phase W.8 is
blocked on the same std-userspace decision as W.7 today, but it is the
smaller-step bridge to running Go logic on capOS before the native runtime
port exists. If Go network services or full goroutine/GC semantics become a
goal, invest in the native `GOOS=capos` track described here; the
[Userspace Binaries](userspace-binaries-proposal.md) "Phase W.8" entry keeps
both paths sequenced from the language-track view.

## Relationship to Other Proposals

- **[Userspace Binaries](userspace-binaries-proposal.md)** — owns the overall
  language-runtime track. This proposal adds concrete Go implementation
  details to the "Future: Go (`GOOS=capos`)" branch there. The POSIX
  compatibility adapter is not sufficient for native Go because Go does not
  use libc on Linux; it makes raw syscalls. The GOOS approach bypasses POSIX
  entirely. The same userspace-binaries doc tracks Phase W.8 as the
  Go-on-WASI interim path.
- **[Programming Languages](../programming-languages.md)** — the matrix entry
  for Go points here for the native track and to the WASI host adapter's
  Phase W.8 for the TinyGo / `GOOS=wasip1` interim. Any change to the
  sequencing between native Go and Go-on-WASI must keep that row in sync.
- **[WASI Host Adapter](wasi-host-adapter-proposal.md)** — Phase W.8 of the
  WASI host adapter ships a TinyGo or upstream Go `GOOS=wasip1` CUE
  evaluator binary that runs inside the in-tree wasmi-backed host. That
  slice is blocked on the same std-userspace decision as W.7 today and
  bridges to the native Go track described here once it lands. The detailed
  plan lives in [`docs/proposals/wasi-host-adapter-proposal.md`](wasi-host-adapter-proposal.md)
  Task 9.
- **[Service Architecture](service-architecture-proposal.md)** — Go services
  participate in the capability graph like any other process. The Go net
  poller (Phase 3) uses TcpSocket/UdpSocket caps from the network stack.
- **[Storage and Naming](storage-and-naming-proposal.md)** — Go's
  `os.Open()` / `(*os.File).Read()` map to Namespace + Store caps via the GOOS
  file I/O implementation. Go doesn't use POSIX for this; it has its own
  `runtime/os_capos.go` with direct cap invocations.
- **[SMP](smp-proposal.md)** — later multi-core scaling for Go after
  Phase 2. The first Phase 2 target is single-CPU in-process threads plus
  parking; per-CPU scheduling belongs to the later SMP milestone.

## Open Questions

1. **Fork maintenance.** A `GOOS=capos` fork must track upstream Go releases.
   How much drift is acceptable? Could the capOS-specific code eventually be
   upstreamed, or would it stay out of tree the way Fuchsia's Go port has?

2. **CGo support.** Go's FFI to C (`cgo`) requires a C toolchain and a C
   runtime to link against (dynamically, by default, on Linux). Should capOS
   support cgo, or is pure Go sufficient? CUE doesn't use cgo, but some Go
   libraries do.

3. **GOROOT on capOS.** Some stdlib features read files under `$GOROOT` at
   runtime (for example, `time` can load `$GOROOT/lib/time/zoneinfo.zip`).
   Where does this live on capOS? In the Store? Embedded in the binary, as
   the `time/tzdata` package does for the time zone database?

4. **Go module proxy.** `go get` needs HTTP access. On capOS, this would
   use a `Fetch` cap. But cross-compilation on the host is more practical
   than building Go on capOS itself.

5. **Debugging.** Go's `runtime/debug` and `pprof` expect signals and
   `/proc` access. What debugging capabilities should capOS expose?

6. **GC tuning.** Go's GC is tuned for Linux's mmap semantics (decommit is
   cheap, virtual space is nearly free). capOS's VirtualMemory cap needs to
   match these assumptions or the GC will need retuning. The first matching
   point is the reserve/commit/decommit contract in
   [Go VirtualMemory Contract](../backlog/go-virtual-memory-contract.md).

## Estimated Scope

| Phase | New kernel code | Go runtime changes | Dependencies |
|---|---|---|---|
| Phase 1: Minimal GOOS | ~200 (VirtualMemory cap) | ~2000-3000 | Stages 4-5 |
| Phase 2: Threading | ~500 (threads, park) | ~500 | In-process threading/park (7.1/7.2) |
| Phase 3: Net poller | ~100 (async notification) | ~300 | Networking, Stage 6 |
| Phase 4: CUE on capOS | 0 | 0 | Phase 1 (or WASI) |
| **Total** | **~800** | **~2800-3800** | |

Plus ongoing maintenance to track Go upstream releases.
