# KubeSolo OS — Boot Flow This document describes the boot sequence from power-on to a running Kubernetes node. ## Overview ``` BIOS/UEFI → Bootloader (isolinux) → Linux Kernel → initramfs → /sbin/init → Stage 00: Mount virtual filesystems → Stage 10: Parse boot parameters → Stage 20: Mount persistent storage → Stage 30: Load kernel modules → Stage 40: Apply sysctl settings → Stage 50: Configure networking → Stage 60: Set hostname → Stage 70: Set system clock → Stage 80: Prepare containerd prerequisites → Stage 90: exec KubeSolo (becomes PID 1) ``` ## Stage Details ### Bootloader (isolinux/syslinux) The ISO uses isolinux with several boot options: | Label | Description | |-------|-------------| | `kubesolo` | Normal boot (default, 3s timeout) | | `kubesolo-debug` | Boot with verbose init logging + serial console | | `kubesolo-shell` | Drop to emergency shell immediately | | `kubesolo-nopersist` | Run fully in RAM, no persistent mount | Kernel command line always includes `kubesolo.data=LABEL=KSOLODATA` to specify the persistent data partition. ### Stage 00 — Early Mount (`00-early-mount.sh`) Mounts essential virtual filesystems before anything else can work: - `/proc` — process information - `/sys` — sysfs (device/driver info) - `/dev` — devtmpfs (block devices) - `/tmp`, `/run` — tmpfs scratch space - `/dev/pts`, `/dev/shm` — pseudo-terminals, shared memory - `/sys/fs/cgroup` — cgroup v2 unified hierarchy (v1 fallback if unavailable) ### Stage 10 — Parse Cmdline (`10-parse-cmdline.sh`) Reads `/proc/cmdline` and sets environment variables: | Boot Parameter | Variable | Description | |---------------|----------|-------------| | `kubesolo.data=` | `KUBESOLO_DATA_DEV` | Block device for persistent data | | `kubesolo.debug` | `KUBESOLO_DEBUG` | Enables `set -x` for trace logging | | `kubesolo.shell` | `KUBESOLO_SHELL` | Drop to shell after this stage | | `kubesolo.nopersist` | `KUBESOLO_NOPERSIST` | Skip persistent mount | | `kubesolo.cloudinit=` | `KUBESOLO_CLOUDINIT` | Cloud-init config file path | | `kubesolo.flags=` | `KUBESOLO_EXTRA_FLAGS` | Extra flags for KubeSolo binary | If `kubesolo.data` is not specified, auto-detects a partition with label `KSOLODATA` via `blkid`. If none found, falls back to RAM-only mode. ### Stage 20 — Persistent Mount (`20-persistent-mount.sh`) If not in RAM-only mode: 1. Waits up to 30s for the data device to appear (handles slow USB, virtio) 2. Mounts the ext4 data partition at `/mnt/data` 3. Creates directory structure on first boot 4. Bind-mounts persistent directories: | Source (data partition) | Mount Point | Content | |------------------------|-------------|---------| | `/mnt/data/kubesolo` | `/var/lib/kubesolo` | K8s state, certs, SQLite DB | | `/mnt/data/containerd` | `/var/lib/containerd` | Container images + layers | | `/mnt/data/etc-kubesolo` | `/etc/kubesolo` | Node configuration | | `/mnt/data/log` | `/var/log` | System + K8s logs | | `/mnt/data/usr-local` | `/usr/local` | User binaries | In RAM-only mode, these directories are backed by tmpfs and lost on reboot. ### Stage 30 — Kernel Modules (`30-kernel-modules.sh`) Loads kernel modules listed in `/usr/lib/kubesolo-os/modules.list`: - `br_netfilter`, `bridge`, `veth`, `vxlan` — K8s pod networking - `ip_tables`, `iptable_nat`, `nf_nat`, `nf_conntrack` — service routing - `overlay` — containerd storage driver - `ip_vs`, `ip_vs_rr`, `ip_vs_wrr`, `ip_vs_sh` — optional IPVS mode Modules that fail to load are logged as warnings (may be built into the kernel). ### Stage 40 — Sysctl (`40-sysctl.sh`) Applies kernel parameters from `/etc/sysctl.d/k8s.conf`: - `net.bridge.bridge-nf-call-iptables = 1` — K8s requirement - `net.ipv4.ip_forward = 1` — pod-to-pod routing - `fs.inotify.max_user_watches = 524288` — kubelet/containerd watchers - `net.netfilter.nf_conntrack_max = 131072` — service connection tracking - `vm.swappiness = 0` — no swap (K8s requirement) ### Stage 50 — Network (`50-network.sh`) Priority order: 1. **Saved config** — `/mnt/data/network/interfaces.sh` (from previous boot) 2. **Cloud-init** — parsed from `cloud-init.yaml` (Phase 2: Go parser) 3. **DHCP fallback** — `udhcpc` on first non-virtual interface Brings up loopback, finds the first physical interface (skipping lo, docker, veth, br, cni), and runs DHCP. ### Stage 60 — Hostname (`60-hostname.sh`) Priority order: 1. Saved hostname from data partition 2. Generated from MAC address of primary interface (`kubesolo-XXXXXX`) Writes to `/etc/hostname` and appends to `/etc/hosts`. ### Stage 70 — Clock (`70-clock.sh`) Best-effort time synchronization: 1. Try `hwclock -s` (hardware clock) 2. Try NTP in background (non-blocking) via `ntpd` or `ntpdate` 3. Log warning if no time source available Non-blocking because NTP failure shouldn't prevent boot. ### Stage 80 — Containerd (`80-containerd.sh`) Ensures containerd prerequisites: - Creates `/run/containerd`, `/var/lib/containerd` - Creates CNI directories (`/etc/cni/net.d`, `/opt/cni/bin`) - Loads custom containerd config if present KubeSolo manages the actual containerd lifecycle internally. ### Stage 90 — KubeSolo (`90-kubesolo.sh`) Final stage — **exec replaces the init process**: 1. Verifies `/usr/local/bin/kubesolo` exists 2. Builds command line: `--path /var/lib/kubesolo --local-storage true` 3. Adds hostname as extra SAN for API server certificate 4. Appends any extra flags from boot params or config file 5. `exec kubesolo $ARGS` — KubeSolo becomes PID 1 After this, KubeSolo starts containerd, kubelet, API server, and all K8s components. The node should reach Ready status within 60-120 seconds. ## Failure Handling If any stage returns non-zero, `/sbin/init` calls `emergency_shell()` which: 1. Logs the failure to serial console 2. Drops to `/bin/sh` for debugging 3. User can type `exit` to retry the boot sequence If `kubesolo.shell` is passed as a boot parameter, the system drops to shell immediately after Stage 10 (cmdline parsing). ## Debugging ### Serial Console All init stages log to stderr with the prefix `[kubesolo-init]`. Boot with `console=ttyS0,115200n8` (default in debug mode) to see output on serial. ### Boot Markers Test scripts look for these markers in the serial log: - `[kubesolo-init] [OK] Stage 90-kubesolo.sh complete` — full boot success - `[kubesolo-init] [ERROR]` — stage failure ### Emergency Shell From the emergency shell: ```sh dmesg | tail -50 # Kernel messages cat /proc/cmdline # Boot parameters cat /proc/mounts # Current mounts blkid # Block devices and labels ip addr # Network interfaces ls /usr/lib/kubesolo-os/init.d/ # Available init stages ```