Complete Phase 1 implementation of KubeSolo OS — an immutable, bootable Linux distribution built on Tiny Core Linux for running KubeSolo single-node Kubernetes.

Build system:
- Makefile with fetch, rootfs, initramfs, iso, disk-image targets
- Dockerfile.builder for reproducible builds
- Scripts to download Tiny Core, extract rootfs, inject KubeSolo, pack initramfs, and create bootable ISO/disk images

Init system (10 POSIX sh stages):
- Early mount (proc/sys/dev/cgroup2), cmdline parsing, persistent mount with bind-mounts, kernel module loading, sysctl, DHCP networking, hostname, clock sync, containerd prep, KubeSolo exec

Shared libraries:
- functions.sh (device wait, IP lookup, config helpers)
- network.sh (static IP, config persistence, interface detection)
- health.sh (containerd, API server, node readiness checks)
- Emergency shell for boot failure debugging

Testing:
- QEMU boot test with serial log marker detection
- K8s readiness test with kubectl verification
- Persistence test (reboot + verify state survives)
- Workload deployment test (nginx pod)
- Local storage test (PVC + local-path provisioner)
- Network policy test
- Reusable run-vm.sh launcher

Developer tools:
- dev-vm.sh (interactive QEMU with port forwarding)
- rebuild-initramfs.sh (fast iteration)
- inject-ssh.sh (dropbear SSH for debugging)
- extract-kernel-config.sh + kernel-audit.sh

Documentation:
- Full design document with architecture research
- Boot flow documentation covering all 10 init stages
- Cloud-init examples (DHCP, static IP, Portainer Edge, air-gapped)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

# KubeSolo OS — Bootable Immutable Kubernetes Distribution

## Design Research: KubeSolo + Tiny Core Linux

---

## 1. Executive Summary

This document outlines the architecture for **KubeSolo OS** — an immutable, bootable Linux distribution purpose-built to run KubeSolo (Portainer's single-node Kubernetes distribution) with atomic updates. The design combines the minimal footprint of Tiny Core Linux with KubeSolo's single-binary K8s packaging to create an appliance-like Kubernetes node that boots directly into a production-ready cluster.

**Target use cases:** IoT/IIoT edge devices, single-node K8s appliances, air-gapped deployments, embedded systems, kiosk/POS systems, and resource-constrained hardware.

---

## 2. Component Analysis

### 2.1 KubeSolo

**Source:** https://github.com/portainer/kubesolo

KubeSolo is Portainer's production-ready, ultra-lightweight single-node Kubernetes distribution designed for edge and IoT scenarios.

**Architecture highlights:**

- **Single binary** — all K8s components bundled into one executable
- **SQLite backend** — uses Kine to replace etcd, eliminating cluster coordination overhead
- **Bundled runtime** — ships containerd, runc, CoreDNS, and CNI plugins
- **No scheduler** — replaced by a custom `NodeSetter` admission webhook (single-node, no scheduling decisions needed)
- **Dual libc support** — detects and supports both glibc and musl (Alpine) environments
- **Offline-ready** — designed for air-gapped deployments; all images can be preloaded
- **Portainer Edge integration** — optional remote management via `--portainer-edge-*` flags

**Runtime requirements:**

| Requirement | Minimum | Recommended |
|---|---|---|
| RAM | 512 MB | 1 GB+ |
| Kernel | 3.10+ (legacy) | 5.8+ (cgroup v2) |
| Storage | ~500 MB (binary) | 2 GB+ (with workloads) |

**Key kernel dependencies:**

- cgroup v2 (kernel 5.8+) — `CONFIG_CGROUPS`, `CONFIG_CGROUP_CPUACCT`, `CONFIG_CGROUP_DEVICE`, `CONFIG_CGROUP_FREEZER`, `CONFIG_CGROUP_SCHED`, `CONFIG_CGROUP_PIDS`, `CONFIG_CGROUP_NET_CLASSID`
- Namespaces — `CONFIG_NAMESPACES`, `CONFIG_NET_NS`, `CONFIG_PID_NS`, `CONFIG_USER_NS`, `CONFIG_UTS_NS`, `CONFIG_IPC_NS`
- Networking — `CONFIG_BRIDGE`, `CONFIG_NETFILTER`, `CONFIG_VETH`, `CONFIG_VXLAN`, `CONFIG_IP_NF_IPTABLES`, `CONFIG_IP_NF_NAT`
- Filesystem — `CONFIG_OVERLAY_FS`, `CONFIG_SQUASHFS`
- Modules required at runtime: `br_netfilter`, `overlay`, `ip_tables`, `iptable_nat`, `iptable_filter`

**Installation & operation:**

```bash
# Standard install
curl -sfL https://get.kubesolo.io | sudo sh -

# Kubeconfig location
export KUBECONFIG=/var/lib/kubesolo/pki/admin/admin.kubeconfig

# Key flags (passed to the kubesolo binary):
#   --path /var/lib/kubesolo     config directory
#   --apiserver-extra-sans       additional TLS SANs
#   --local-storage true         enable local-path provisioner
#   --portainer-edge-id          Portainer Edge agent ID
#   --portainer-edge-key         Portainer Edge agent key
```

### 2.2 Tiny Core Linux

**Source:** http://www.tinycorelinux.net

Tiny Core Linux is an ultra-minimal Linux distribution (11–17 MB) that runs entirely in RAM.

**Architecture highlights:**

- **Micro Core** — 11 MB: kernel + root filesystem + basic kernel modules (no GUI)
- **RAM-resident** — entire OS loaded into memory at boot; disk only needed for persistence
- **SquashFS root** — read-only compressed filesystem, inherently immutable
- **Extension system** — `.tcz` packages (SquashFS-compressed) mounted or copied at boot
- **Three operational modes:**
  1. **Cloud/Default** — pure RAM, nothing persists across reboots
  2. **Mount mode** — extensions stored in `/tce` directory, loop-mounted at boot
  3. **Copy mode** — extensions copied into RAM from persistent storage

**Key concepts for this design:**

- `/tce` directory on persistent storage holds extensions and configuration
- `onboot.lst` — list of extensions to auto-mount at boot
- `filetool.sh` + `/opt/.filetool.lst` — backup/restore mechanism for persistent files
- Boot codes control behavior: `tce=`, `base`, `norestore`, `noswap`, etc.
- Custom remastering: extract `core.gz` → modify → repack → create bootable image
- Frugal install: `vmlinuz` + `core.gz` + bootloader + `/tce` directory

**Kernel:** Ships a modern Linux kernel (6.x series in v17.0) and supports x86, x86_64, and ARM.

---

## 3. Competitive Landscape — Existing Immutable K8s OSes

### 3.1 Comparison Matrix

| Feature | Talos Linux | Bottlerocket | Flatcar Linux | Kairos | **KubeSolo OS** (proposed) |
|---|---|---|---|---|---|
| **Footprint** | ~80 MB | ~500 MB | ~700 MB | Varies (base distro) | **~50–80 MB** |
| **Immutability** | Radical (12 binaries) | Strong (read-only root) | Moderate (read-only /usr) | Strong (overlayFS) | **Strong (SquashFS root)** |
| **SSH access** | None (API only) | Disabled (container shell) | Yes | Optional | **Optional (extension)** |
| **Update model** | A/B partitions | A/B partitions | A/B partitions (ChromeOS) | A/B partitions (OCI) | **A/B partitions** |
| **K8s variants** | Multi-node, HA | Multi-node (EKS) | Multi-node (any) | Multi-node (any) | **Single-node only** |
| **Management** | talosctl (mTLS API) | API (localhost) | Ignition + SSH | Cloud-init, K8s CRDs | **API + cloud-init** |
| **Base OS** | Custom (Go userland) | Custom (Bottlerocket) | Gentoo-derived | Any Linux (meta-distro) | **Tiny Core Linux** |
| **Target** | Cloud + Edge | AWS (primarily) | Cloud + Bare metal | Edge + Bare metal | **Edge + IoT** |
| **Configuration** | Machine config YAML | TOML settings | Ignition JSON | Cloud-init YAML | **Cloud-init + boot codes** |

### 3.2 Key Lessons from Each

**From Talos Linux:**

- API-only management is powerful but aggressive — provide as optional mode
- 12-binary minimalism is aspirational; KubeSolo's single binary aligns well
- System extensions as SquashFS overlays in initramfs = directly applicable to Tiny Core's `.tcz` model
- A/B partition with GRUB fallback counter for automatic rollback

**From Bottlerocket:**

- Bootstrap containers for customization — useful pattern for pre-deploying workloads
- Host containers for privileged operations (debugging, admin access)
- Tightly coupled OS+K8s versions simplify compatibility testing

**From Flatcar Linux:**

- Ignition for first-boot declarative config — consider cloud-init equivalent
- ChromeOS-style update engine is battle-tested
- Dynamic kernel module loading — Tiny Core's extension system provides similar flexibility

**From Kairos:**

- Container-based OS distribution (OCI images) — enables `docker pull` for OS updates
- P2P mesh clustering via libp2p — interesting for edge fleet bootstrapping
- Meta-distribution approach: don't reinvent, augment
- Static kernel+initrd shipped in container image = truly atomic full-stack updates

---

## 4. Architecture Design

### 4.1 High-Level Architecture

```
┌──────────────────────────────────────────────────────┐
│                      BOOT MEDIA                      │
│  ┌──────────┐  ┌──────────┐  ┌────────────────────┐  │
│  │ GRUB/    │  │ Partition│  │ Partition B        │  │
│  │ Syslinux │  │ A        │  │ (passive)          │  │
│  │ (EFI/    │  │ (active) │  │                    │  │
│  │ BIOS)    │  │          │  │ vmlinuz            │  │
│  │          │  │ vmlinuz  │  │ kubesolo-os.gz     │  │
│  │ Fallback │  │ kubesolo-│  │ extensions.tcz     │  │
│  │ counter  │  │ os.gz    │  │                    │  │
│  │          │  │ ext.tcz  │  │                    │  │
│  └──────────┘  └──────────┘  └────────────────────┘  │
│                                                      │
│  ┌──────────────────────────────────────────────────┐│
│  │ Persistent Data Partition                        ││
│  │ /var/lib/kubesolo/   (K8s state, SQLite DB)      ││
│  │ /var/lib/containerd/ (container images/layers)   ││
│  │ /etc/kubesolo/       (node config)               ││
│  │ /var/log/            (logs, optional)            ││
│  │ /usr/local/          (user data)                 ││
│  └──────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────┘

                       BOOT FLOW
                           │
              ┌────────────▼────────────┐
              │ GRUB loads vmlinuz +    │
              │ kubesolo-os.gz from     │
              │ active partition        │
              └────────────┬────────────┘
                           │
              ┌────────────▼────────────┐
              │ Kernel boots, mounts    │
              │ SquashFS root (ro)      │
              │ in RAM                  │
              └────────────┬────────────┘
                           │
              ┌────────────▼────────────┐
              │ init: mount persistent  │
              │ partition, bind-mount   │
              │ writable paths          │
              └────────────┬────────────┘
                           │
              ┌────────────▼────────────┐
              │ Load kernel modules:    │
              │ br_netfilter, overlay,  │
              │ ip_tables, veth         │
              └────────────┬────────────┘
                           │
              ┌────────────▼────────────┐
              │ Configure networking    │
              │ (cloud-init or static)  │
              └────────────┬────────────┘
                           │
              ┌────────────▼────────────┐
              │ Start KubeSolo          │
              │ (single binary)         │
              └────────────┬────────────┘
                           │
              ┌────────────▼────────────┐
              │ K8s API available       │
              │ Node ready for          │
              │ workloads               │
              └─────────────────────────┘
```

### 4.2 Partition Layout

```
Disk Layout (minimum 8 GB recommended):

┌──────────────────────────────────────────────────────────┐
│ Partition 1: EFI/Boot (256 MB, FAT32)                    │
│   /EFI/BOOT/bootx64.efi (or /boot/grub for BIOS)         │
│   grub.cfg with A/B logic + fallback counter             │
├──────────────────────────────────────────────────────────┤
│ Partition 2: System A (512 MB, SquashFS image, read-only)│
│   vmlinuz                                                │
│   kubesolo-os.gz (initramfs: core.gz + KubeSolo ext)     │
├──────────────────────────────────────────────────────────┤
│ Partition 3: System B (512 MB, SquashFS image, read-only)│
│   (passive — receives updates, swaps with A)             │
├──────────────────────────────────────────────────────────┤
│ Partition 4: Persistent Data (remaining space, ext4)     │
│   /var/lib/kubesolo/   → K8s state, certs, SQLite        │
│   /var/lib/containerd/ → container images & layers       │
│   /etc/kubesolo/       → node configuration              │
│   /etc/network/        → network config                  │
│   /var/log/            → system + K8s logs               │
│   /usr/local/          → user extensions                 │
└──────────────────────────────────────────────────────────┘
```

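The partition arithmetic implied by this layout can be sketched as a small shell function. This is only an illustration: `partition_plan` is a hypothetical helper name, and the 1 MiB of alignment reserved at the start and end of the disk is an assumption, not something the layout specifies.

```bash
# Hypothetical helper: print start offset and size (in MiB) of each partition
# for a disk of the given size. Fixed sizes come from the layout above; the
# 1 MiB reserved at both ends of the disk is an assumed alignment margin.
partition_plan() {
    disk_mib="$1"
    align=1
    efi=256
    sys=512
    data=$((disk_mib - align - efi - sys - sys - align))
    if [ "$data" -le 0 ]; then
        echo "disk too small: ${disk_mib} MiB" >&2
        return 1
    fi
    off=$align
    for entry in "efi:$efi" "system-a:$sys" "system-b:$sys" "data:$data"; do
        name=${entry%%:*}
        size=${entry##*:}
        echo "$name start=${off}MiB size=${size}MiB"
        off=$((off + size))
    done
}

partition_plan 8192
```

On an 8 GiB disk the three fixed partitions take 1280 MiB, so nearly all remaining space goes to the persistent data partition.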
### 4.3 Filesystem Mount Strategy

At boot, the init system constructs the runtime filesystem:

```bash
# Root: SquashFS from initramfs (read-only, in RAM)
/                    → tmpfs (RAM) + SquashFS overlay (ro)

# Persistent bind mounts from data partition
/var/lib/kubesolo    → /mnt/data/kubesolo (rw)
/var/lib/containerd  → /mnt/data/containerd (rw)
/etc/kubesolo        → /mnt/data/etc-kubesolo (rw)
/etc/resolv.conf     → /mnt/data/resolv.conf (rw)
/var/log             → /mnt/data/log (rw)
/usr/local           → /mnt/data/usr-local (rw)

# Everything else: read-only or tmpfs
/tmp                 → tmpfs
/run                 → tmpfs
```

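As a rough sketch, the directory mappings above could be applied by one loop over source/target pairs. The function name `bind_persistent` and the `DRY_RUN` switch are illustrative assumptions; the `/etc/resolv.conf` file mount is left out because it binds a single file rather than a directory.

```bash
# Hypothetical helper: bind-mount each persistent directory from the data
# partition onto its runtime path. With DRY_RUN=1 the commands are printed
# instead of executed (real bind mounts require root).
bind_persistent() {
    data_root="$1"
    # "subdirectory-on-data-partition:runtime-path" pairs from the mapping above
    for pair in \
        kubesolo:/var/lib/kubesolo \
        containerd:/var/lib/containerd \
        etc-kubesolo:/etc/kubesolo \
        log:/var/log \
        usr-local:/usr/local
    do
        src="$data_root/${pair%%:*}"
        dst="${pair#*:}"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "mount --bind $src $dst"
        else
            mkdir -p "$src" "$dst"
            mount --bind "$src" "$dst"
        fi
    done
}

DRY_RUN=1 bind_persistent /mnt/data
```

Keeping the mapping in one list means the first-boot directory creation and the bind mounts cannot drift apart.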
### 4.4 Custom Initramfs (kubesolo-os.gz)

The initramfs is the core of the distribution — a remastered Tiny Core `core.gz` with KubeSolo baked in:

```
kubesolo-os.gz (cpio+gzip archive)
├── bin/                          # BusyBox symlinks
├── sbin/
│   └── init                      # Custom init script (see §4.5)
├── lib/
│   └── modules/                  # Kernel modules (br_netfilter, overlay, etc.)
├── usr/
│   └── local/
│       └── bin/
│           └── kubesolo          # KubeSolo binary
├── opt/
│   ├── containerd/               # containerd + runc + CNI plugins
│   │   ├── bin/
│   │   │   ├── containerd
│   │   │   ├── containerd-shim-runc-v2
│   │   │   └── runc
│   │   └── cni/
│   │       └── bin/              # CNI plugins (bridge, host-local, loopback, etc.)
│   └── kubesolo-os/
│       ├── cloud-init.yaml       # Default cloud-init config
│       └── update-agent          # Atomic update agent binary
├── etc/
│   ├── os-release                # KubeSolo OS identification
│   ├── kubesolo/
│   │   └── config.yaml           # Default KubeSolo config
│   └── sysctl.d/
│       └── k8s.conf              # Kernel parameters for K8s
└── var/
    └── lib/
        └── kubesolo/             # Mount point (bind-mounted to persistent)
```

### 4.5 Init System

A custom init script replaces Tiny Core's default init to implement the appliance boot flow:

```bash
#!/bin/sh
# /sbin/init — KubeSolo OS init
set -e

# 1. Mount essential filesystems
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
mount -t tmpfs tmpfs /tmp
mount -t tmpfs tmpfs /run
mkdir -p /dev/pts /dev/shm
mount -t devpts devpts /dev/pts
mount -t tmpfs tmpfs /dev/shm

# 2. Parse boot parameters
PERSISTENT_DEV=""
for arg in $(cat /proc/cmdline); do
    case "$arg" in
        kubesolo.data=*) PERSISTENT_DEV="${arg#kubesolo.data=}" ;;
        kubesolo.debug)  set -x ;;
        kubesolo.shell)  exec /bin/sh ;;  # Emergency shell
    esac
done

# 3. Mount persistent data partition
if [ -n "$PERSISTENT_DEV" ]; then
    mkdir -p /mnt/data
    # Wait for device (USB, slow disks)
    for i in $(seq 1 30); do
        [ -b "$PERSISTENT_DEV" ] && break
        sleep 1
    done
    mount -t ext4 "$PERSISTENT_DEV" /mnt/data

    # Create directory structure on first boot
    for dir in kubesolo containerd etc-kubesolo log usr-local network; do
        mkdir -p "/mnt/data/$dir"
    done

    # Bind mount persistent paths
    mount --bind /mnt/data/kubesolo /var/lib/kubesolo
    mount --bind /mnt/data/containerd /var/lib/containerd
    mount --bind /mnt/data/etc-kubesolo /etc/kubesolo
    mount --bind /mnt/data/log /var/log
    mount --bind /mnt/data/usr-local /usr/local
fi

# 4. Load required kernel modules
modprobe br_netfilter
modprobe overlay
modprobe ip_tables
modprobe iptable_nat
modprobe iptable_filter
modprobe veth
modprobe vxlan

# 5. Set kernel parameters
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
sysctl -w net.ipv4.ip_forward=1
sysctl -w fs.inotify.max_user_instances=1024
sysctl -w fs.inotify.max_user_watches=524288

# 6. Configure networking
# Priority: saved persistent config > cloud-init (first boot) > DHCP fallback
# configure_network and apply_cloud_init are helper functions not shown here.
if [ -f /mnt/data/network/interfaces ]; then
    # Apply saved network config
    configure_network /mnt/data/network/interfaces
elif [ -f /mnt/data/etc-kubesolo/cloud-init.yaml ]; then
    # First boot: apply cloud-init
    apply_cloud_init /mnt/data/etc-kubesolo/cloud-init.yaml
else
    # Fallback: DHCP on first interface
    udhcpc -i eth0 -s /usr/share/udhcpc/default.script
fi

# 7. Set hostname
if [ -f /mnt/data/etc-kubesolo/hostname ]; then
    hostname "$(cat /mnt/data/etc-kubesolo/hostname)"
else
    # Last six hex digits of the MAC; tail -c 7 also counts the trailing newline
    hostname "kubesolo-$(tr -d ':' < /sys/class/net/eth0/address | tail -c 7)"
fi

# 8. Start containerd
containerd --config /etc/kubesolo/containerd-config.toml &
sleep 2  # Wait for socket

# 9. Start KubeSolo
exec /usr/local/bin/kubesolo \
    --path /var/lib/kubesolo \
    --local-storage true \
    $(cat /etc/kubesolo/extra-flags 2>/dev/null || true)
```

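Because a failing command in PID 1 is hard to debug, it can help to factor the pure string-handling steps of the init script into functions that can be exercised on a development machine. The sketch below does this for boot-parameter parsing and the MAC-derived fallback hostname. The function names and the `KS_*` variable names are illustrative assumptions, not part of the init script above.

```bash
# Parse a kernel command line string (normally the contents of /proc/cmdline).
# Sets KS_DATA_DEV, KS_DEBUG and KS_SHELL; unrecognized arguments are ignored.
parse_cmdline() {
    KS_DATA_DEV=""
    KS_DEBUG=0
    KS_SHELL=0
    # $1 is intentionally unquoted so the shell splits it into arguments
    for arg in $1; do
        case "$arg" in
            kubesolo.data=*) KS_DATA_DEV="${arg#kubesolo.data=}" ;;
            kubesolo.debug)  KS_DEBUG=1 ;;
            kubesolo.shell)  KS_SHELL=1 ;;
        esac
    done
}

# Build the fallback hostname from a MAC address:
# "kubesolo-" followed by the last six hex digits.
hostname_from_mac() {
    hex=$(printf '%s' "$1" | tr -d ':')
    # printf emits no trailing newline, so tail -c 6 yields exactly six digits
    echo "kubesolo-$(printf '%s' "$hex" | tail -c 6)"
}

parse_cmdline "quiet kubesolo.data=/dev/sda4 kubesolo.debug"
echo "data=$KS_DATA_DEV debug=$KS_DEBUG shell=$KS_SHELL"
hostname_from_mac "52:54:00:12:34:56"
```

In the real init the same functions would be fed `"$(cat /proc/cmdline)"` and `"$(cat /sys/class/net/eth0/address)"`.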
### 4.6 Atomic Update System

#### Update Flow

```
                    UPDATE PROCESS
                           │
             ┌─────────────▼──────────────┐
             │ 1. Download new OS image   │
             │    (kubesolo-os-v2.img)    │
             │    Verify checksum + sig   │
             └─────────────┬──────────────┘
                           │
             ┌─────────────▼──────────────┐
             │ 2. Write image to PASSIVE  │
             │  partition (B if A active) │
             └─────────────┬──────────────┘
                           │
             ┌─────────────▼──────────────┐
             │ 3. Update GRUB:            │
             │    - Set next boot → B     │
             │    - Set boot_counter = 3  │
             └─────────────┬──────────────┘
                           │
             ┌─────────────▼──────────────┐
             │ 4. Reboot                  │
             └─────────────┬──────────────┘
                           │
             ┌─────────────▼──────────────┐
             │ 5. GRUB boots partition B  │
             │    Decrements boot_counter │
             └─────────────┬──────────────┘
                           │
                ┌──────────┴──────────┐
                │                     │
          ┌─────▼─────┐         ┌─────▼─────┐
          │ Boot OK   │         │ Boot FAIL │
          │           │         │           │
          │ Health    │         │ Counter   │
          │ check OK  │         │ hits 0    │
          │           │         │           │
          │ Mark B as │         │ GRUB auto │
          │ default   │         │ rollback  │
          │ Clear     │         │ to A      │
          │ counter   │         │           │
          └───────────┘         └───────────┘
```

#### GRUB Configuration for A/B Boot

```grub
# /boot/grub/grub.cfg

set default=0
set timeout=3

# Saved environment variables:
#   active_slot  = A or B
#   boot_counter = 3 (decremented each boot, 0 = rollback)
#   boot_success = 0 (set to 1 by health check)

load_env

# If last boot failed and counter expired, swap slots
if [ "${boot_success}" != "1" ]; then
    if [ "${boot_counter}" = "0" ]; then
        if [ "${active_slot}" = "A" ]; then
            set active_slot=B
        else
            set active_slot=A
        fi
        save_env active_slot
        set boot_counter=3
        save_env boot_counter
    else
        # Decrement counter by exactly one. The branches are chained with elif
        # because independent if statements would cascade 3 -> 2 -> 1 -> 0
        # within a single boot.
        if [ "${boot_counter}" = "3" ]; then
            set boot_counter=2
        elif [ "${boot_counter}" = "2" ]; then
            set boot_counter=1
        elif [ "${boot_counter}" = "1" ]; then
            set boot_counter=0
        fi
        save_env boot_counter
    fi
fi

set boot_success=0
save_env boot_success

# Boot from active slot
if [ "${active_slot}" = "A" ]; then
    set root=(hd0,gpt2)
else
    set root=(hd0,gpt3)
fi

menuentry "KubeSolo OS" {
    linux /vmlinuz kubesolo.data=/dev/sda4 quiet
    initrd /kubesolo-os.gz
}

menuentry "KubeSolo OS (emergency shell)" {
    linux /vmlinuz kubesolo.data=/dev/sda4 kubesolo.shell
    initrd /kubesolo-os.gz
}
```

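The slot-selection rules are easier to reason about, and to unit-test, when modeled outside the bootloader. The following shell sketch simulates the intended per-boot state transition: one decrement per boot that was not marked successful, and a slot swap once the counter reaches 0. `grub_boot_step` is a hypothetical name used only for this illustration; it is not a GRUB command.

```bash
# Simulate one pass through the boot-time A/B logic.
# Arguments: active_slot (A|B), boot_counter (0-3), boot_success (0|1).
# Prints the state GRUB would save: "<slot> <counter> <success>".
grub_boot_step() {
    slot="$1"
    counter="$2"
    success="$3"
    if [ "$success" != "1" ]; then
        if [ "$counter" -le 0 ]; then
            # Counter expired without a healthy boot: roll back to the other slot
            if [ "$slot" = "A" ]; then slot=B; else slot=A; fi
            counter=3
        else
            counter=$((counter - 1))
        fi
    fi
    # boot_success is always reset; the health check must set it again
    echo "$slot $counter 0"
}

# A freshly written slot B that never becomes healthy:
state="B 3 0"
for boot in 1 2 3 4; do
    state=$(grub_boot_step $state)
    echo "boot $boot -> $state"
done
```

Under this model a bad image in slot B is attempted three times (counter 2, 1, 0) before the fourth boot swaps back to slot A.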
#### Update Agent

A lightweight Go binary that runs as a Kubernetes CronJob or DaemonSet:

```
kubesolo-update-agent responsibilities:
  1. Poll update server (HTTPS) or watch OCI registry for new tags
  2. Download + verify new system image (SHA256 + optional GPG signature)
  3. Write to passive partition (dd or equivalent)
  4. Update GRUB environment (grub-editenv)
  5. Trigger reboot (via Kubernetes node drain → reboot)
  6. Post-boot health check:
     - KubeSolo API reachable?
     - containerd healthy?
     - Node Ready in kubectl?
     If all pass → set boot_success=1
     If any fail → leave boot_success=0 (auto-rollback on next reboot)
```

**Update distribution models:**

1. **HTTP/S server** — host images on a simple file server; agent polls for `latest.json`
2. **OCI registry** — tag system images as container images; agent pulls new tags
3. **USB drive** — for air-gapped: plug USB with new image, agent detects and applies
4. **Portainer Edge** — leverage existing Portainer Edge infrastructure for fleet updates

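Although the agent is planned as a Go binary, two of its steps are simple enough to sketch in shell for prototyping: choosing the passive slot and checking the SHA-256 digest of a downloaded image. The function names are assumptions made for this example, and GPG signature verification is left out.

```bash
# Return the slot that should receive the update (the one that is not active).
passive_slot() {
    case "$1" in
        A) echo B ;;
        B) echo A ;;
        *) echo "unknown slot: $1" >&2; return 1 ;;
    esac
}

# Compare the SHA-256 digest of an image file with the expected hex digest.
# Returns 0 on a match and non-zero on a mismatch or an unreadable file.
verify_image() {
    actual=$(sha256sum "$1" 2>/dev/null | cut -d' ' -f1)
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}

# Example with a throwaway file standing in for a downloaded image
img=$(mktemp)
printf 'example image payload\n' > "$img"
expected=$(sha256sum "$img" | cut -d' ' -f1)
if verify_image "$img" "$expected"; then
    echo "checksum OK, would write to slot $(passive_slot A)"
fi
rm -f "$img"
```

After a verified image has been written, the GRUB environment could be updated with something like `grub-editenv /boot/grub/grubenv set active_slot=B boot_counter=3 boot_success=0`; the `grubenv` path shown here is an assumption about where the environment block lives.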
### 4.7 Configuration System

#### First Boot (cloud-init)

The system uses a simplified cloud-init compatible with Tiny Core's environment:

```yaml
# /etc/kubesolo/cloud-init.yaml (placed on data partition before first boot)
#cloud-config

hostname: edge-node-001

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses:
        - 192.168.1.100/24
      gateway4: 192.168.1.1
      nameservers:
        addresses:
          - 8.8.8.8
          - 1.1.1.1

kubesolo:
  extra-sans:
    - edge-node-001.local
    - 192.168.1.100
  local-storage: true
  portainer:
    edge-id: "your-edge-id"
    edge-key: "your-edge-key"

ssh:
  enabled: false  # Set true to enable SSH extension
  authorized_keys:
    - "ssh-rsa AAAA..."

ntp:
  servers:
    - pool.ntp.org
```

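Tiny Core's base image has no YAML parser, so a "simplified cloud-init" most likely has to read the file with BusyBox tools. As a minimal sketch of that idea, the function below extracts a top-level scalar such as `hostname`. The name `yaml_top_level` is hypothetical, and the approach deliberately ignores nested keys, lists, and multi-line values.

```bash
# Print the value of an unindented "key: value" line from a YAML file.
# Trailing comments and surrounding double quotes are stripped.
yaml_top_level() {
    key="$1"
    file="$2"
    sed -n "s/^${key}:[[:space:]]*//p" "$file" \
        | sed -e 's/[[:space:]]\{1,\}#.*$//' \
              -e 's/[[:space:]]*$//' \
              -e 's/^"\(.*\)"$/\1/' \
        | head -n 1
}

# Example against a cut-down config
cfg=$(mktemp)
cat > "$cfg" << 'EOF'
#cloud-config
hostname: edge-node-001  # node name
network:
  version: 2
EOF
yaml_top_level hostname "$cfg"
rm -f "$cfg"
```

Nested settings such as the static address under `network.ethernets.eth0` need an indentation-aware parser, which is one reason a small dedicated binary may be preferable to shell for the full implementation.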
#### Runtime Configuration

Post-boot configuration changes are made via the Kubernetes API:

```bash
# Access from the node (kubeconfig is at known path)
export KUBECONFIG=/var/lib/kubesolo/pki/admin/admin.kubeconfig
kubectl get nodes
kubectl apply -f workload.yaml

# Remote access via Portainer Edge or direct API
# (if apiserver-extra-sans includes remote IP/DNS)
```

---

## 5. Build Process

### 5.1 Build Pipeline

```
                        BUILD PIPELINE

┌─────────────────────┐     ┌──────────────────────┐
│ 1. Fetch Tiny Core  │────▶│ 2. Extract core.gz   │
│    Micro Core ISO   │     │    (cpio -idmv)      │
└─────────────────────┘     └──────────┬───────────┘
                                       │
                            ┌──────────▼───────────┐
                            │ 3. Inject KubeSolo   │
                            │    binary + deps     │
                            │    (containerd, runc,│
                            │    CNI, modules)     │
                            └──────────┬───────────┘
                                       │
                            ┌──────────▼───────────┐
                            │ 4. Replace /sbin/init│
                            │    with custom init  │
                            └──────────┬───────────┘
                                       │
                            ┌──────────▼───────────┐
                            │ 5. Repack initramfs  │
                            │    (find . | cpio -o │
                            │    | gzip > ks-os.gz)│
                            └──────────┬───────────┘
                                       │
                            ┌──────────▼───────────┐
                            │ 6. Verify kernel has │
                            │    required configs  │
                            │    (cgroup v2, ns,   │
                            │    netfilter, etc.)  │
                            └──────────┬───────────┘
                                       │
              ┌────────────────────────┼────────────────────┐
              │                        │                    │
    ┌─────────▼─────────┐    ┌─────────▼─────────┐   ┌──────▼───────┐
    │ 7a. Create ISO    │    │ 7b. Create raw    │   │ 7c. Create   │
    │     (bootable     │    │     disk image    │   │     OCI      │
    │     media)        │    │     (dd to disk)  │   │     image    │
    └───────────────────┘    └───────────────────┘   └──────────────┘
```

### 5.2 Build Script (Skeleton)

```bash
#!/bin/bash
# build-kubesolo-os.sh
# Must run as root: loop-mounting the ISO requires it.
set -euo pipefail

VERSION="${1:?Usage: $0 <version>}"
WORK_DIR="$(mktemp -d)"
# Absolute path, because the script changes directory into the rootfs below
OUTPUT_DIR="$(pwd)/output"
trap 'rm -rf "$WORK_DIR"' EXIT

# --- 1. Download components ---
echo "==> Downloading Tiny Core Micro Core..."
wget -q "http://www.tinycorelinux.net/17.x/x86_64/release/CorePure64-17.0.iso" \
    -O "$WORK_DIR/core.iso"

echo "==> Downloading KubeSolo..."
curl -sfL https://get.kubesolo.io -o "$WORK_DIR/install-kubesolo.sh"
# Or: download specific release binary from GitHub

# --- 2. Extract Tiny Core ---
mkdir -p "$WORK_DIR/iso" "$WORK_DIR/rootfs"
mount -o loop "$WORK_DIR/core.iso" "$WORK_DIR/iso"
cp "$WORK_DIR/iso/boot/vmlinuz64" "$WORK_DIR/vmlinuz"
cd "$WORK_DIR/rootfs"
zcat "$WORK_DIR/iso/boot/corepure64.gz" | cpio -idmv 2>/dev/null
umount "$WORK_DIR/iso"

# --- 3. Inject KubeSolo + dependencies ---
# KubeSolo binary
mkdir -p usr/local/bin
cp /path/to/kubesolo usr/local/bin/kubesolo
chmod +x usr/local/bin/kubesolo

# containerd + runc + CNI (extracted from KubeSolo bundle or downloaded separately)
# Directory layout follows the initramfs tree in §4.4
mkdir -p opt/containerd/bin opt/containerd/cni/bin
# ... copy containerd, runc, CNI plugins

# Required kernel modules (if not already in core.gz)
# ... may need to compile or extract from Tiny Core extensions

# --- 4. Custom init ---
cat > sbin/init << 'INIT'
#!/bin/sh
# ... (init script from §4.5)
INIT
chmod +x sbin/init

# --- 5. Sysctl + OS metadata ---
mkdir -p etc/sysctl.d
cat > etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
fs.inotify.max_user_instances = 1024
fs.inotify.max_user_watches = 524288
EOF

cat > etc/os-release << EOF
NAME="KubeSolo OS"
VERSION="$VERSION"
ID=kubesolo-os
VERSION_ID=$VERSION
PRETTY_NAME="KubeSolo OS $VERSION"
HOME_URL="https://github.com/portainer/kubesolo"
EOF

# --- 6. Repack initramfs ---
# The kernel only accepts initramfs archives in the "newc" cpio format
find . | cpio -o -H newc 2>/dev/null | gzip -9 > "$WORK_DIR/kubesolo-os.gz"

# --- 7. Create disk image ---
# create_disk_image is a placeholder for the partitioning/bootloader step (not shown)
mkdir -p "$OUTPUT_DIR"
create_disk_image "$WORK_DIR/vmlinuz" "$WORK_DIR/kubesolo-os.gz" \
    "$OUTPUT_DIR/kubesolo-os-${VERSION}.img"

echo "==> Built: $OUTPUT_DIR/kubesolo-os-${VERSION}.img"
```

### 5.3 Alternative: Kairos-based Build (Container-first)

For faster iteration, leverage the Kairos framework to get A/B updates, P2P mesh, and OCI distribution for free:

```dockerfile
# Dockerfile.kubesolo-os
FROM quay.io/kairos/core-alpine:latest

# Install KubeSolo
RUN curl -sfL https://get.kubesolo.io | sh -

# Pre-configure
COPY kubesolo-config.yaml /etc/kubesolo/config.yaml
COPY cloud-init-defaults.yaml /system/oem/

# Kernel modules
RUN apk add --no-cache \
    linux-lts \
    iptables \
    iproute2 \
    conntrack-tools

# Sysctl
COPY k8s-sysctl.conf /etc/sysctl.d/

# Build: docker build -t kubesolo-os:v1 .
# Flash: Use AuroraBoot or Kairos tooling to convert OCI → bootable image
```

**Advantages of Kairos approach:**

- A/B atomic updates with rollback — built-in
- OCI-based distribution — `docker push` your OS
- P2P mesh bootstrapping — nodes find each other
- Kubernetes-native upgrades — `kubectl apply` to upgrade the OS
- Proven in production edge deployments

**Trade-offs:**

- Larger footprint than pure Tiny Core remaster (~200–400 MB vs ~50–80 MB)
- Dependency on Kairos project maintenance
- Less control over boot process internals

---

## 6. Kernel Considerations

### 6.1 Tiny Core Kernel Audit

Tiny Core 17.0 ships a modern 6.x kernel, which should have cgroup v2 support compiled in. However, the **kernel config must be verified** for these critical options:

```
# MANDATORY for KubeSolo
CONFIG_CGROUPS=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
CONFIG_MEMCG=y
CONFIG_CGROUP_BPF=y

CONFIG_NAMESPACES=y
CONFIG_NET_NS=y
CONFIG_PID_NS=y
CONFIG_USER_NS=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y

CONFIG_OVERLAY_FS=y        # or =m (module)
CONFIG_BRIDGE=y            # or =m
CONFIG_NETFILTER=y
CONFIG_NF_NAT=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_FILTER=y
CONFIG_VETH=y              # or =m
CONFIG_VXLAN=y             # or =m

CONFIG_SQUASHFS=y          # For Tiny Core's own extension system
CONFIG_BLK_DEV_LOOP=y      # For SquashFS mounting

# RECOMMENDED
CONFIG_BPF_SYSCALL=y       # For modern CNI plugins
CONFIG_CRYPTO_SHA256=y     # For image verification
CONFIG_SECCOMP=y           # Container security
CONFIG_AUDIT=y             # Audit logging
```

If the stock Tiny Core kernel lacks any of these, options are:

1. **Load as modules** — if compiled as `=m`, load via `modprobe` in init
2. **Recompile kernel** — use Tiny Core's kernel build process with custom config
3. **Use a different kernel** — e.g., pull the kernel from Alpine Linux or build from mainline

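The audit itself is mechanical and can be scripted. The sketch below checks a kernel config file for a list of options and treats both `=y` (built in) and `=m` (module) as present. `audit_kernel_config` is an illustrative name, and the example assumes a plain-text config file is available, for instance extracted from `/proc/config.gz` on kernels built with `CONFIG_IKCONFIG_PROC`.

```bash
# Report which of the given options are missing from a kernel config file.
# An option counts as present when it is set to =y or =m.
# Returns 0 if everything is present, non-zero otherwise.
audit_kernel_config() {
    config="$1"
    shift
    missing=0
    for opt in "$@"; do
        if ! grep -Eq "^${opt}=(y|m)$" "$config"; then
            echo "MISSING: $opt"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}

# Example against a tiny stand-in config
cfg=$(mktemp)
cat > "$cfg" << 'EOF'
CONFIG_CGROUPS=y
CONFIG_OVERLAY_FS=m
# CONFIG_VXLAN is not set
EOF
audit_kernel_config "$cfg" CONFIG_CGROUPS CONFIG_OVERLAY_FS CONFIG_VXLAN \
    || echo "kernel audit failed"
rm -f "$cfg"
```

Options reported as missing but available as `=m` in a Tiny Core extension fall under option 1 above; anything else points to a kernel rebuild.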
### 6.2 Custom Kernel Build (if needed)

```bash
# On a Tiny Core build system
tce-load -wi compiletc linux-6.x-source
cd /usr/src/linux-6.x
cp /path/to/kubesolo-kernel.config .config
make oldconfig
make -j$(nproc) bzImage modules
# Extract vmlinuz and required modules
```

---

## 7. Security Model

### 7.1 Layered Security

```
┌─────────────────────────────────────────┐
│ APPLICATION LAYER                       │
│   Kubernetes RBAC + Network Policies    │
│   Pod Security Standards                │
│   Seccomp / AppArmor profiles           │
├─────────────────────────────────────────┤
│ CONTAINER RUNTIME LAYER                 │
│   containerd with default seccomp       │
│   Read-only container rootfs            │
│   User namespace mapping (optional)     │
├─────────────────────────────────────────┤
│ OS LAYER                                │
│   SquashFS root (read-only, in RAM)     │
│   No package manager                    │
│   No SSH by default                     │
│   Minimal userland (BusyBox only)       │
│   No compiler, no debugger              │
├─────────────────────────────────────────┤
│ BOOT LAYER                              │
│   Signed images (GPG verification)      │
│   Secure Boot (optional, UEFI)          │
│   A/B rollback on tamper/failure        │
│   TPM-based attestation (optional)      │
└─────────────────────────────────────────┘
```

### 7.2 Attack Surface Comparison

| Attack Vector | Traditional Linux | KubeSolo OS |
|---|---|---|
| Package manager exploit | Possible (apt/yum) | **Eliminated** (no pkg manager) |
| SSH brute force | Common | **Eliminated** (no SSH default) |
| Writable system files | Yes (/etc, /usr) | **Eliminated** (SquashFS ro) |
| Persistent rootkit | Survives reboot | **Eliminated** (RAM-only root) |
| Kernel module injection | Possible | **Mitigated** (only preloaded modules) |
| Local privilege escalation | Various paths | **Reduced** (minimal binaries) |

---

## 8. Implementation Roadmap

### Phase 1 — Proof of Concept (2–3 weeks)

**Goal:** Boot Tiny Core + KubeSolo, validate K8s functionality.

1. Download Tiny Core Micro Core 17.0 (x86_64)
2. Extract `core.gz`, inject KubeSolo binary
3. Create custom init that starts KubeSolo
4. Verify kernel has required configs (cgroup v2, namespaces, netfilter)
5. Build bootable ISO, test in QEMU/KVM
6. Deploy a test workload (nginx pod)
7. Validate: `kubectl get nodes` shows Ready

**Success criteria:** Single ISO boots to functional K8s node in < 30 seconds.

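One way to check the 30-second criterion automatically is to boot the ISO in QEMU with the serial console redirected to a file (for example with `-serial file:serial.log`) and wait for a readiness marker to appear in that log. The helper below sketches only the waiting part. `wait_for_marker` and the marker text are assumptions, since the document does not define what the init prints when the node is ready.

```bash
# Wait until MARKER appears in LOGFILE or TIMEOUT seconds have elapsed.
# Returns 0 if the marker was seen, 1 on timeout.
wait_for_marker() {
    logfile="$1"
    marker="$2"
    timeout="$3"
    elapsed=0
    while :; do
        # -F: fixed string, -q: quiet; a missing log file simply means "not yet"
        if grep -Fq -- "$marker" "$logfile" 2>/dev/null; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Example with a pre-written log standing in for QEMU serial output
log=$(mktemp)
echo "[hypothetical marker] node ready" > "$log"
if wait_for_marker "$log" "node ready" 30; then
    echo "boot test passed"
fi
rm -f "$log"
```

A CI job would start QEMU in the background, call the helper with a 30-second timeout, and then kill the VM regardless of the result.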
### Phase 2 — Persistence + Immutability (2–3 weeks)

**Goal:** Persistent K8s state across reboots, immutable root.

1. Implement persistent data partition with bind mounts
2. Verify K8s state survives reboot (pods, services, PVCs)
3. Verify SQLite DB integrity across unclean shutdowns
4. Lock down root filesystem (verify read-only enforcement)
5. Test: corrupt system files → verify RAM-only root is unaffected

### Phase 3 — Atomic Updates + Rollback (3–4 weeks)

**Goal:** A/B partition updates with automatic rollback.

1. Implement GRUB A/B boot configuration
2. Build update agent (Go binary)
3. Implement health check + `boot_success` flag
4. Test update cycle: A → B → verify → mark good
5. Test rollback: A → B → fail → auto-revert to A
6. Test: pull power during update → verify clean state

### Phase 4 — Production Hardening (2–3 weeks)

**Goal:** Production-ready security and manageability.

1. Image signing and verification (GPG or sigstore/cosign)
2. Cloud-init implementation for first-boot config
3. Portainer Edge integration testing
4. Optional SSH extension (`.tcz`)
5. Optional management API (lightweight, mTLS-authenticated)
6. Performance benchmarking (boot time, memory usage, disk I/O)
7. Documentation and deployment guides

### Phase 5 — Distribution + Fleet Management (ongoing)

**Goal:** Scale to fleet deployments.

1. CI/CD pipeline for automated image builds
2. OCI registry distribution (optional)
3. Fleet update orchestration (rolling updates across nodes)
4. Monitoring integration (Prometheus metrics endpoint)
5. USB provisioning tool for air-gapped deployments
6. ARM64 support (Raspberry Pi, Jetson, etc.)

---

## 9. Open Questions & Decisions

| # | Question | Options | Recommendation |
|---|---|---|---|
| 1 | **Build approach** | Pure Tiny Core remaster vs. Kairos framework | Start with pure remaster for minimal footprint; evaluate Kairos if update complexity becomes unmanageable |
| 2 | **Kernel** | Stock Tiny Core kernel vs. custom build | Audit stock kernel first; only custom-build if missing critical configs |
| 3 | **Management interface** | SSH / API / Portainer Edge only | Portainer Edge primary; optional SSH extension for debugging |
| 4 | **Update distribution** | HTTP server / OCI registry / USB | HTTP for simplicity; OCI if leveraging container infrastructure |
| 5 | **Init system** | Custom shell script vs. BusyBox init vs. s6 | Custom shell script for PoC; evaluate s6 for supervision |
| 6 | **Networking** | DHCP only / Static / cloud-init | Cloud-init with DHCP fallback |
| 7 | **Architecture support** | x86_64 only vs. multi-arch | x86_64 first; ARM64 in Phase 5 |
| 8 | **Container images** | Preloaded in initramfs vs. pull at boot | Preload core workloads; pull additional at runtime |

---

## 10. References

- KubeSolo: https://github.com/portainer/kubesolo
- Tiny Core Linux: http://www.tinycorelinux.net
- Tiny Core Wiki (Remastering): http://wiki.tinycorelinux.net/doku.php?id=wiki:remastering
- Talos Linux: https://www.talos.dev
- Kairos: https://kairos.io
- Bottlerocket: https://github.com/bottlerocket-os/bottlerocket
- Flatcar Linux: https://www.flatcar.org
- Kubernetes Node Requirements: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- cgroup v2: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html