feat: initial Phase 1 PoC scaffolding for KubeSolo OS
Complete Phase 1 implementation of KubeSolo OS, an immutable, bootable Linux distribution built on Tiny Core Linux for running KubeSolo single-node Kubernetes.

Build system:
- Makefile with fetch, rootfs, initramfs, iso, disk-image targets
- Dockerfile.builder for reproducible builds
- Scripts to download Tiny Core, extract rootfs, inject KubeSolo, pack initramfs, and create bootable ISO/disk images

Init system (10 POSIX sh stages):
- Early mount (proc/sys/dev/cgroup2), cmdline parsing, persistent mount with bind-mounts, kernel module loading, sysctl, DHCP networking, hostname, clock sync, containerd prep, KubeSolo exec

Shared libraries:
- functions.sh (device wait, IP lookup, config helpers)
- network.sh (static IP, config persistence, interface detection)
- health.sh (containerd, API server, node readiness checks)
- Emergency shell for boot failure debugging

Testing:
- QEMU boot test with serial log marker detection
- K8s readiness test with kubectl verification
- Persistence test (reboot + verify state survives)
- Workload deployment test (nginx pod)
- Local storage test (PVC + local-path provisioner)
- Network policy test
- Reusable run-vm.sh launcher

Developer tools:
- dev-vm.sh (interactive QEMU with port forwarding)
- rebuild-initramfs.sh (fast iteration)
- inject-ssh.sh (dropbear SSH for debugging)
- extract-kernel-config.sh + kernel-audit.sh

Documentation:
- Full design document with architecture research
- Boot flow documentation covering all 10 init stages
- Cloud-init examples (DHCP, static IP, Portainer Edge, air-gapped)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
945
docs/design/kubesolo-os-design.md
Normal file
@@ -0,0 +1,945 @@
|
||||
# KubeSolo OS — Bootable Immutable Kubernetes Distribution
|
||||
|
||||
## Design Research: KubeSolo + Tiny Core Linux
|
||||
|
||||
---
|
||||
|
||||
## 1. Executive Summary
|
||||
|
||||
This document outlines the architecture for **KubeSolo OS** — an immutable, bootable Linux distribution purpose-built to run KubeSolo (Portainer's single-node Kubernetes distribution) with atomic updates. The design combines the minimal footprint of Tiny Core Linux with KubeSolo's single-binary K8s packaging to create an appliance-like Kubernetes node that boots directly into a production-ready cluster.
|
||||
|
||||
**Target use cases:** IoT/IIoT edge devices, single-node K8s appliances, air-gapped deployments, embedded systems, kiosk/POS systems, and resource-constrained hardware.
|
||||
|
||||
---
|
||||
|
||||
## 2. Component Analysis
|
||||
|
||||
### 2.1 KubeSolo
|
||||
|
||||
**Source:** https://github.com/portainer/kubesolo
|
||||
|
||||
KubeSolo is Portainer's production-ready, ultra-lightweight single-node Kubernetes distribution designed for edge and IoT scenarios.
|
||||
|
||||
**Architecture highlights:**
|
||||
|
||||
- **Single binary** — all K8s components bundled into one executable
|
||||
- **SQLite backend** — uses Kine to replace etcd, eliminating cluster coordination overhead
|
||||
- **Bundled runtime** — ships containerd, runc, CoreDNS, and CNI plugins
|
||||
- **No scheduler** — replaced by a custom `NodeSetter` admission webhook (single-node, no scheduling decisions needed)
|
||||
- **Dual libc support** — detects and supports both glibc and musl (Alpine) environments
|
||||
- **Offline-ready** — designed for air-gapped deployments; all images can be preloaded
|
||||
- **Portainer Edge integration** — optional remote management via `--portainer-edge-*` flags
|
||||
|
||||
**Runtime requirements:**
|
||||
|
||||
| Requirement | Minimum | Recommended |
|---|---|---|
| RAM | 512 MB | 1 GB+ |
| Kernel | 3.10+ (legacy) | 5.8+ (cgroup v2) |
| Storage | ~500 MB (binary) | 2 GB+ (with workloads) |
|
||||
|
||||
**Key kernel dependencies:**
|
||||
|
||||
- cgroup v2 (kernel 5.8+): `CONFIG_CGROUPS`, `CONFIG_CGROUP_CPUACCT`, `CONFIG_CGROUP_DEVICE`, `CONFIG_CGROUP_FREEZER`, `CONFIG_CGROUP_SCHED`, `CONFIG_CGROUP_PIDS`, `CONFIG_CGROUP_NET_CLASSID`
- Namespaces: `CONFIG_NAMESPACES`, `CONFIG_NET_NS`, `CONFIG_PID_NS`, `CONFIG_USER_NS`, `CONFIG_UTS_NS`, `CONFIG_IPC_NS`
- Networking: `CONFIG_BRIDGE`, `CONFIG_NETFILTER`, `CONFIG_VETH`, `CONFIG_VXLAN`, `CONFIG_IP_NF_IPTABLES`, `CONFIG_IP_NF_NAT`
- Filesystem: `CONFIG_OVERLAY_FS`, `CONFIG_SQUASHFS`
- Modules required at runtime: `br_netfilter`, `overlay`, `ip_tables`, `iptable_nat`, `iptable_filter`
|
||||
|
||||
**Installation & operation:**
|
||||
|
||||
```bash
# Standard install
curl -sfL https://get.kubesolo.io | sudo sh -

# Kubeconfig location
/var/lib/kubesolo/pki/admin/admin.kubeconfig

# Key flags
--path /var/lib/kubesolo     # config directory
--apiserver-extra-sans       # additional TLS SANs
--local-storage true         # enable local-path provisioner
--portainer-edge-id          # Portainer Edge agent ID
--portainer-edge-key         # Portainer Edge agent key
```
|
||||
|
||||
### 2.2 Tiny Core Linux
|
||||
|
||||
**Source:** http://www.tinycorelinux.net
|
||||
|
||||
Tiny Core Linux is an ultra-minimal Linux distribution (11–17 MB) that runs entirely in RAM.
|
||||
|
||||
**Architecture highlights:**
|
||||
|
||||
- **Micro Core** — 11 MB: kernel + root filesystem + basic kernel modules (no GUI)
|
||||
- **RAM-resident** — entire OS loaded into memory at boot; disk only needed for persistence
|
||||
- **SquashFS root** — read-only compressed filesystem, inherently immutable
|
||||
- **Extension system** — `.tcz` packages (SquashFS-compressed) mounted or copied at boot
|
||||
- **Three operational modes:**
|
||||
1. **Cloud/Default** — pure RAM, nothing persists across reboots
|
||||
2. **Mount mode** — extensions stored in `/tce` directory, loop-mounted at boot
|
||||
3. **Copy mode** — extensions copied into RAM from persistent storage
|
||||
|
||||
**Key concepts for this design:**
|
||||
|
||||
- `/tce` directory on persistent storage holds extensions and configuration
|
||||
- `onboot.lst` — list of extensions to auto-mount at boot
|
||||
- `filetool.sh` + `/opt/.filetool.lst` — backup/restore mechanism for persistent files
|
||||
- Boot codes control behavior: `tce=`, `base`, `norestore`, `noswap`, etc.
|
||||
- Custom remastering: extract `core.gz` → modify → repack → create bootable image
|
||||
- Frugal install: `vmlinuz` + `core.gz` + bootloader + `/tce` directory
|
||||
|
||||
**Kernel:** Ships modern Linux kernel (6.x series in v17.0), supports x86, x86_64, ARM.
|
||||
|
||||
---
|
||||
|
||||
## 3. Competitive Landscape — Existing Immutable K8s OSes
|
||||
|
||||
### 3.1 Comparison Matrix
|
||||
|
||||
| Feature | Talos Linux | Bottlerocket | Flatcar Linux | Kairos | **KubeSolo OS** (proposed) |
|---|---|---|---|---|---|
| **Footprint** | ~80 MB | ~500 MB | ~700 MB | Varies (base distro) | **~50–80 MB** |
| **Immutability** | Radical (12 binaries) | Strong (read-only root) | Moderate (read-only /usr) | Strong (overlayFS) | **Strong (SquashFS root)** |
| **SSH access** | None (API only) | Disabled (container shell) | Yes | Optional | **Optional (extension)** |
| **Update model** | A/B partitions | A/B partitions | A/B partitions (ChromeOS) | A/B partitions (OCI) | **A/B partitions** |
| **K8s variants** | Multi-node, HA | Multi-node (EKS) | Multi-node (any) | Multi-node (any) | **Single-node only** |
| **Management** | talosctl (mTLS API) | API (localhost) | Ignition + SSH | Cloud-init, K8s CRDs | **API + cloud-init** |
| **Base OS** | Custom (Go userland) | Custom (Bottlerocket) | Gentoo-derived | Any Linux (meta-distro) | **Tiny Core Linux** |
| **Target** | Cloud + Edge | AWS (primarily) | Cloud + Bare metal | Edge + Bare metal | **Edge + IoT** |
| **Configuration** | Machine config YAML | TOML settings | Ignition JSON | Cloud-init YAML | **Cloud-init + boot codes** |
|
||||
|
||||
### 3.2 Key Lessons from Each
|
||||
|
||||
**From Talos Linux:**
|
||||
- API-only management is powerful but aggressive — provide as optional mode
|
||||
- 12-binary minimalism is aspirational; KubeSolo's single binary aligns well
|
||||
- System extensions as SquashFS overlays in initramfs = directly applicable to Tiny Core's `.tcz` model
|
||||
- A/B partition with GRUB fallback counter for automatic rollback
|
||||
|
||||
**From Bottlerocket:**
|
||||
- Bootstrap containers for customization — useful pattern for pre-deploying workloads
|
||||
- Host containers for privileged operations (debugging, admin access)
|
||||
- Tightly coupling OS and K8s versions simplifies compatibility testing
|
||||
|
||||
**From Flatcar Linux:**
|
||||
- Ignition for first-boot declarative config — consider cloud-init equivalent
|
||||
- ChromeOS-style update engine is battle-tested
|
||||
- Dynamic kernel module loading — Tiny Core's extension system provides similar flexibility
|
||||
|
||||
**From Kairos:**
|
||||
- Container-based OS distribution (OCI images) — enables `docker pull` for OS updates
|
||||
- P2P mesh clustering via libp2p — interesting for edge fleet bootstrapping
|
||||
- Meta-distribution approach: don't reinvent, augment
|
||||
- Static kernel+initrd shipped in container image = truly atomic full-stack updates
|
||||
|
||||
---
|
||||
|
||||
## 4. Architecture Design
|
||||
|
||||
### 4.1 High-Level Architecture
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────┐
|
||||
│ BOOT MEDIA │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌────────────────────┐ │
|
||||
│ │ GRUB/ │ │ Partition│ │ Partition B │ │
|
||||
│ │ Syslinux │ │ A │ │ (passive) │ │
|
||||
│ │ (EFI/ │ │ (active) │ │ │ │
|
||||
│ │ BIOS) │ │ │ │ vmlinuz │ │
|
||||
│ │ │ │ vmlinuz │ │ kubesolo-os.gz │ │
|
||||
│ │ Fallback │ │ kubesolo-│ │ extensions.tcz │ │
|
||||
│ │ counter │ │ os.gz │ │ │ │
|
||||
│ │ │ │ ext.tcz │ │ │ │
|
||||
│ └──────────┘ └──────────┘ └────────────────────┘ │
|
||||
│ │
|
||||
│ ┌──────────────────────────────────────────────────┐│
|
||||
│ │ Persistent Data Partition ││
|
||||
│ │ /var/lib/kubesolo/ (K8s state, SQLite DB) ││
|
||||
│ │ /var/lib/containerd/ (container images/layers) ││
|
||||
│ │ /etc/kubesolo/ (node config) ││
|
||||
│ │ /var/log/ (logs, optional) ││
|
||||
│ │ /usr/local/ (user data) ││
|
||||
│ └──────────────────────────────────────────────────┘│
|
||||
└──────────────────────────────────────────────────────┘
|
||||
|
||||
BOOT FLOW
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ GRUB loads vmlinuz + │
|
||||
│ kubesolo-os.gz from │
|
||||
│ active partition │
|
||||
└────────────┬────────────┘
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ Kernel boots, mounts │
|
||||
│ SquashFS root (ro) │
|
||||
│ in RAM │
|
||||
└────────────┬────────────┘
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ init: mount persistent │
|
||||
│ partition, bind-mount │
|
||||
│ writable paths │
|
||||
└────────────┬────────────┘
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ Load kernel modules: │
|
||||
│ br_netfilter, overlay, │
|
||||
│ ip_tables, veth │
|
||||
└────────────┬────────────┘
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ Configure networking │
|
||||
│ (cloud-init or static) │
|
||||
└────────────┬────────────┘
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ Start KubeSolo │
|
||||
│ (single binary) │
|
||||
└────────────┬────────────┘
|
||||
│
|
||||
┌────────────▼────────────┐
|
||||
│ K8s API available │
|
||||
│ Node ready for │
|
||||
│ workloads │
|
||||
└─────────────────────────┘
|
||||
```
|
||||
|
||||
### 4.2 Partition Layout
|
||||
|
||||
```
Disk Layout (minimum 8 GB recommended):

┌──────────────────────────────────────────────────────────┐
│ Partition 1: EFI/Boot (256 MB, FAT32)                    │
│   /EFI/BOOT/bootx64.efi (or /boot/grub for BIOS)         │
│   grub.cfg with A/B logic + fallback counter             │
├──────────────────────────────────────────────────────────┤
│ Partition 2: System A (512 MB, SquashFS image, read-only)│
│   vmlinuz                                                │
│   kubesolo-os.gz (initramfs: core.gz + KubeSolo ext)     │
├──────────────────────────────────────────────────────────┤
│ Partition 3: System B (512 MB, SquashFS image, read-only)│
│   (passive; receives updates, swaps with A)              │
├──────────────────────────────────────────────────────────┤
│ Partition 4: Persistent Data (remaining space, ext4)     │
│   /var/lib/kubesolo/   → K8s state, certs, SQLite        │
│   /var/lib/containerd/ → container images & layers       │
│   /etc/kubesolo/       → node configuration              │
│   /etc/network/        → network config                  │
│   /var/log/            → system + K8s logs               │
│   /usr/local/          → user extensions                 │
└──────────────────────────────────────────────────────────┘
```
|
||||
|
||||
### 4.3 Filesystem Mount Strategy
|
||||
|
||||
At boot, the init system constructs the runtime filesystem:
|
||||
|
||||
```bash
# Root: SquashFS from initramfs (read-only, in RAM)
/                     → tmpfs (RAM) + SquashFS overlay (ro)

# Persistent bind mounts from data partition
/var/lib/kubesolo     → /mnt/data/kubesolo      (rw)
/var/lib/containerd   → /mnt/data/containerd    (rw)
/etc/kubesolo         → /mnt/data/etc-kubesolo  (rw)
/etc/resolv.conf      → /mnt/data/resolv.conf   (rw)
/var/log              → /mnt/data/log           (rw)
/usr/local            → /mnt/data/usr-local     (rw)

# Everything else: read-only or tmpfs
/tmp                  → tmpfs
/run                  → tmpfs
```
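
The directory half of this mapping can be driven from a single table so that directory creation and bind mounts stay in sync. The following is a minimal sketch under stated assumptions, not the project's actual init code: `bind_persistent`, `PERSISTENT_PATHS`, and the `DRY_RUN` switch are hypothetical names, and the file-level `/etc/resolv.conf` mount is left out because it needs separate handling.

```bash
#!/bin/sh
# Sketch: bind-mount persistent directories from one source:target table.
# bind_persistent, PERSISTENT_PATHS and DRY_RUN are hypothetical names.

# One "subdir-on-data-partition:mount-target" pair per line.
PERSISTENT_PATHS='kubesolo:/var/lib/kubesolo
containerd:/var/lib/containerd
etc-kubesolo:/etc/kubesolo
log:/var/log
usr-local:/usr/local'

bind_persistent() {
    data_root="$1"
    # With DRY_RUN=1 the commands are printed instead of executed.
    run=""
    [ "${DRY_RUN:-0}" = "1" ] && run="echo"
    echo "$PERSISTENT_PATHS" | while IFS=: read -r sub target; do
        $run mkdir -p "$data_root/$sub" "$target"
        $run mount --bind "$data_root/$sub" "$target"
    done
}
```

Running `DRY_RUN=1 bind_persistent /mnt/data` prints the `mkdir` and `mount --bind` commands without touching the system, which makes the table easy to review before it is wired into init.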
|
||||
|
||||
### 4.4 Custom Initramfs (kubesolo-os.gz)
|
||||
|
||||
The initramfs is the core of the distribution — a remastered Tiny Core `core.gz` with KubeSolo baked in:
|
||||
|
||||
```
kubesolo-os.gz (cpio+gzip archive)
├── bin/                         # BusyBox symlinks
├── sbin/
│   └── init                     # Custom init script (see §4.5)
├── lib/
│   └── modules/                 # Kernel modules (br_netfilter, overlay, etc.)
├── usr/
│   └── local/
│       └── bin/
│           └── kubesolo         # KubeSolo binary
├── opt/
│   ├── containerd/              # containerd + runc + CNI plugins
│   │   ├── bin/
│   │   │   ├── containerd
│   │   │   ├── containerd-shim-runc-v2
│   │   │   └── runc
│   │   └── cni/
│   │       └── bin/             # CNI plugins (bridge, host-local, loopback, etc.)
│   └── kubesolo-os/
│       ├── cloud-init.yaml      # Default cloud-init config
│       └── update-agent         # Atomic update agent binary
├── etc/
│   ├── os-release               # KubeSolo OS identification
│   ├── kubesolo/
│   │   └── config.yaml          # Default KubeSolo config
│   └── sysctl.d/
│       └── k8s.conf             # Kernel parameters for K8s
└── var/
    └── lib/
        └── kubesolo/            # Mount point (bind-mounted to persistent)
```
|
||||
|
||||
### 4.5 Init System
|
||||
|
||||
A custom init script replaces Tiny Core's default init to implement the appliance boot flow:
|
||||
|
||||
```bash
#!/bin/sh
# /sbin/init: KubeSolo OS init
set -e

# 1. Mount essential filesystems
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev
mount -t tmpfs tmpfs /tmp
mount -t tmpfs tmpfs /run
mkdir -p /dev/pts /dev/shm
mount -t devpts devpts /dev/pts
mount -t tmpfs tmpfs /dev/shm

# 2. Parse boot parameters
PERSISTENT_DEV=""
for arg in $(cat /proc/cmdline); do
    case "$arg" in
        kubesolo.data=*) PERSISTENT_DEV="${arg#kubesolo.data=}" ;;
        kubesolo.debug)  set -x ;;
        kubesolo.shell)  exec /bin/sh ;;  # Emergency shell
    esac
done

# 3. Mount persistent data partition
if [ -n "$PERSISTENT_DEV" ]; then
    mkdir -p /mnt/data
    # Wait for device (USB, slow disks)
    for i in $(seq 1 30); do
        [ -b "$PERSISTENT_DEV" ] && break
        sleep 1
    done
    mount -t ext4 "$PERSISTENT_DEV" /mnt/data

    # Create directory structure on first boot
    for dir in kubesolo containerd etc-kubesolo log usr-local network; do
        mkdir -p "/mnt/data/$dir"
    done

    # Bind mount persistent paths
    mount --bind /mnt/data/kubesolo /var/lib/kubesolo
    mount --bind /mnt/data/containerd /var/lib/containerd
    mount --bind /mnt/data/etc-kubesolo /etc/kubesolo
    mount --bind /mnt/data/log /var/log
    mount --bind /mnt/data/usr-local /usr/local
fi

# 4. Load required kernel modules
modprobe br_netfilter
modprobe overlay
modprobe ip_tables
modprobe iptable_nat
modprobe iptable_filter
modprobe veth
modprobe vxlan

# 5. Set kernel parameters
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
sysctl -w net.ipv4.ip_forward=1
sysctl -w fs.inotify.max_user_instances=1024
sysctl -w fs.inotify.max_user_watches=524288

# 6. Configure networking
# Priority: saved persistent config > cloud-init (first boot) > DHCP fallback
if [ -f /mnt/data/network/interfaces ]; then
    # Apply saved network config
    configure_network /mnt/data/network/interfaces
elif [ -f /mnt/data/etc-kubesolo/cloud-init.yaml ]; then
    # First boot: apply cloud-init
    apply_cloud_init /mnt/data/etc-kubesolo/cloud-init.yaml
else
    # Fallback: DHCP on first interface
    udhcpc -i eth0 -s /usr/share/udhcpc/default.script
fi

# 7. Set hostname
if [ -f /mnt/data/etc-kubesolo/hostname ]; then
    hostname "$(cat /mnt/data/etc-kubesolo/hostname)"
else
    # tail -c 7 keeps the last 6 hex digits plus the trailing newline
    hostname "kubesolo-$(tr -d ':' < /sys/class/net/eth0/address | tail -c 7)"
fi

# 8. Start containerd
containerd --config /etc/kubesolo/containerd-config.toml &
sleep 2  # Wait for socket

# 9. Start KubeSolo
exec /usr/local/bin/kubesolo \
    --path /var/lib/kubesolo \
    --local-storage true \
    $(cat /etc/kubesolo/extra-flags 2>/dev/null || true)
```
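
The script waits in two places: a 30-second loop for the data device and a fixed `sleep 2` for the containerd socket. Both could share one polling helper that reports a timeout instead of continuing blindly. This is a sketch only; `wait_for_path` is a hypothetical name, and the socket path mentioned in the usage note is containerd's usual default, which the real config may override.

```bash
#!/bin/sh
# Sketch: poll for a path instead of sleeping for a fixed time.
# wait_for_path is a hypothetical helper; the timeout is in whole seconds.

wait_for_path() {
    path="$1"
    timeout="${2:-30}"
    elapsed=0
    while [ ! -e "$path" ]; do
        # Give up once the timeout is reached and let the caller decide.
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

In init this might look like `wait_for_path "$PERSISTENT_DEV" 30 || exec /bin/sh` for the data device and `wait_for_path /run/containerd/containerd.sock 10` in place of `sleep 2`, so that a missing device drops to the emergency shell rather than failing inside `mount`.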
|
||||
|
||||
### 4.6 Atomic Update System
|
||||
|
||||
#### Update Flow
|
||||
|
||||
```
|
||||
UPDATE PROCESS
|
||||
│
|
||||
┌─────────────▼──────────────┐
|
||||
│ 1. Download new OS image │
|
||||
│ (kubesolo-os-v2.img) │
|
||||
│ Verify checksum + sig │
|
||||
└─────────────┬──────────────┘
|
||||
│
|
||||
┌─────────────▼──────────────┐
|
||||
│ 2. Write image to PASSIVE │
|
||||
│ partition (B if A active) │
|
||||
└─────────────┬──────────────┘
|
||||
│
|
||||
┌─────────────▼──────────────┐
|
||||
│ 3. Update GRUB: │
|
||||
│ - Set next boot → B │
|
||||
│ - Set boot_counter = 3 │
|
||||
└─────────────┬──────────────┘
|
||||
│
|
||||
┌─────────────▼──────────────┐
|
||||
│ 4. Reboot │
|
||||
└─────────────┬──────────────┘
|
||||
│
|
||||
┌─────────────▼──────────────┐
|
||||
│ 5. GRUB boots partition B │
|
||||
│ Decrements boot_counter │
|
||||
└─────────────┬──────────────┘
|
||||
│
|
||||
┌──────────┴──────────┐
|
||||
│ │
|
||||
┌─────▼─────┐ ┌─────▼─────┐
|
||||
│ Boot OK │ │ Boot FAIL │
|
||||
│ │ │ │
|
||||
│ Health │ │ Counter │
|
||||
│ check OK │ │ hits 0 │
|
||||
│ │ │ │
|
||||
│ Mark B as │ │ GRUB auto │
|
||||
│ default │ │ rollback │
|
||||
│ Clear │ │ to A │
|
||||
│ counter │ │ │
|
||||
└───────────┘ └───────────┘
|
||||
```
|
||||
|
||||
#### GRUB Configuration for A/B Boot
|
||||
|
||||
```grub
# /boot/grub/grub.cfg

set default=0
set timeout=3

# Saved environment variables:
#   active_slot  = A or B
#   boot_counter = 3 (decremented each boot, 0 = rollback)
#   boot_success = 0 (set to 1 by health check)

load_env

# If last boot failed and counter expired, swap slots
if [ "${boot_success}" != "1" ]; then
    if [ "${boot_counter}" = "0" ]; then
        if [ "${active_slot}" = "A" ]; then
            set active_slot=B
        else
            set active_slot=A
        fi
        save_env active_slot
        set boot_counter=3
        save_env boot_counter
    else
        # Decrement counter. The elif chain ensures only one step per boot;
        # separate if statements would cascade 3 -> 2 -> 1 -> 0 in one pass.
        if [ "${boot_counter}" = "3" ]; then
            set boot_counter=2
        elif [ "${boot_counter}" = "2" ]; then
            set boot_counter=1
        elif [ "${boot_counter}" = "1" ]; then
            set boot_counter=0
        fi
        save_env boot_counter
    fi
fi

set boot_success=0
save_env boot_success

# Boot from active slot
if [ "${active_slot}" = "A" ]; then
    set root=(hd0,gpt2)
else
    set root=(hd0,gpt3)
fi

menuentry "KubeSolo OS" {
    linux /vmlinuz kubesolo.data=/dev/sda4 quiet
    initrd /kubesolo-os.gz
}

menuentry "KubeSolo OS (emergency shell)" {
    linux /vmlinuz kubesolo.data=/dev/sda4 kubesolo.shell
    initrd /kubesolo-os.gz
}
```
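
Because GRUB script is awkward to unit-test, the intended slot logic (one decrement per unconfirmed boot, slot swap once the counter has reached 0) can be modelled in plain shell and exercised on a build host. The sketch below is an illustration, not part of GRUB or of the project: `next_boot_state` is a hypothetical helper, and it assumes `boot_counter` is always between 0 and 3.

```bash
#!/bin/sh
# Sketch: shell model of the A/B slot logic, for testing only.
# next_boot_state is a hypothetical helper, not part of GRUB.
# Input:  active_slot boot_counter boot_success (as loaded from grubenv)
# Output: "active_slot boot_counter" as they stand when the kernel is loaded.

next_boot_state() {
    slot="$1"; counter="$2"; success="$3"
    if [ "$success" != "1" ]; then
        if [ "$counter" = "0" ]; then
            # Counter expired: fall back to the other slot and re-arm.
            if [ "$slot" = "A" ]; then slot=B; else slot=A; fi
            counter=3
        else
            # One decrement per unconfirmed boot.
            counter=$((counter - 1))
        fi
    fi
    echo "$slot $counter"
}
```

Starting from `B 3 0` after an update, successive unconfirmed boots yield `B 2`, `B 1`, `B 0`, and then `A 3`: slot B gets three attempts before the fourth boot rolls back to A.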
|
||||
|
||||
#### Update Agent
|
||||
|
||||
A lightweight Go binary that runs as a Kubernetes CronJob or DaemonSet:
|
||||
|
||||
```
kubesolo-update-agent responsibilities:
  1. Poll update server (HTTPS) or watch OCI registry for new tags
  2. Download + verify new system image (SHA256 + optional GPG signature)
  3. Write to passive partition (dd or equivalent)
  4. Update GRUB environment (grub-editenv)
  5. Trigger reboot (via Kubernetes node drain → reboot)
  6. Post-boot health check:
     - KubeSolo API reachable?
     - containerd healthy?
     - Node Ready in kubectl?
     If all pass → set boot_success=1
     If any fail → leave boot_success=0 (auto-rollback on next reboot)
```
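
Step 6 can be sketched in shell before the Go agent exists. The example below runs a list of check commands and only then records success in the GRUB environment block with `grub-editenv`. The helper name `mark_boot_result`, the `GRUBENV` path, and the decision to re-arm `boot_counter` to 3 on success are assumptions made for illustration.

```bash
#!/bin/sh
# Sketch: record the post-boot health result in the GRUB environment block.
# mark_boot_result is a hypothetical helper; GRUBENV and GRUB_EDITENV are
# assumptions and should be adjusted to the real image layout.

GRUBENV="${GRUBENV:-/boot/grub/grubenv}"
GRUB_EDITENV="${GRUB_EDITENV:-grub-editenv}"

# Each argument is one check command line; all of them must succeed.
mark_boot_result() {
    for check in "$@"; do
        if ! sh -c "$check" >/dev/null 2>&1; then
            echo "health check failed: $check" >&2
            return 1  # boot_success stays 0, so GRUB keeps counting down
        fi
    done
    # Assumption: "clear counter" means re-arming it to its initial value.
    "$GRUB_EDITENV" "$GRUBENV" set boot_success=1 boot_counter=3
}
```

A call might pass checks such as `"kubectl --kubeconfig /var/lib/kubesolo/pki/admin/admin.kubeconfig get --raw /readyz"` and `"ctr version"`; which probes are sufficient is a design decision this sketch leaves open.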
|
||||
|
||||
**Update distribution models:**
|
||||
|
||||
1. **HTTP/S server** — host images on a simple file server; agent polls for `latest.json`
|
||||
2. **OCI registry** — tag system images as container images; agent pulls new tags
|
||||
3. **USB drive** — for air-gapped: plug USB with new image, agent detects and applies
|
||||
4. **Portainer Edge** — leverage existing Portainer Edge infrastructure for fleet updates
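
Whichever channel delivers the image, it should be checked before anything is written to the passive partition. A minimal sketch of the SHA-256 step follows; `verify_image` is a hypothetical helper, `sha256sum` is assumed to be present (both GNU coreutils and BusyBox provide it), and signature verification is deliberately out of scope here.

```bash
#!/bin/sh
# Sketch: verify a downloaded image against an expected SHA-256 digest.
# verify_image is a hypothetical helper name.

verify_image() {
    image="$1"
    expected="$2"
    # A missing file is a verification failure, not an error to ignore.
    [ -f "$image" ] || return 1
    actual="$(sha256sum "$image" | cut -d' ' -f1)"
    [ "$actual" = "$expected" ]
}
```

The agent would call `verify_image "$downloaded" "$digest_from_manifest"` and abort the update on a non-zero exit status, leaving the passive partition untouched.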
|
||||
|
||||
### 4.7 Configuration System
|
||||
|
||||
#### First Boot (cloud-init)
|
||||
|
||||
The system uses a simplified cloud-init compatible with Tiny Core's environment:
|
||||
|
||||
```yaml
# /etc/kubesolo/cloud-init.yaml (placed on data partition before first boot)
#cloud-config

hostname: edge-node-001

network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      addresses:
        - 192.168.1.100/24
      gateway4: 192.168.1.1
      nameservers:
        addresses:
          - 8.8.8.8
          - 1.1.1.1

kubesolo:
  extra-sans:
    - edge-node-001.local
    - 192.168.1.100
  local-storage: true
  portainer:
    edge-id: "your-edge-id"
    edge-key: "your-edge-key"

ssh:
  enabled: false  # Set true to enable SSH extension
  authorized_keys:
    - "ssh-rsa AAAA..."

ntp:
  servers:
    - pool.ntp.org
```
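
Since the image ships only BusyBox, a first version of the parser may have to work without a YAML library. The sketch below shows one way to read a top-level scalar such as `hostname`; it is an assumption about the approach, not the project's parser. `cloud_init_value` is a hypothetical helper that handles only simple `key: value` lines starting in column 0, and nested sections such as `network:` would need a real parser.

```bash
#!/bin/sh
# Sketch: read a top-level scalar from the cloud-init file without a YAML
# parser. cloud_init_value is a hypothetical helper; it only understands
# "key: value" lines that start in column 0.

cloud_init_value() {
    file="$1"
    key="$2"
    # 1) keep the text after "key:", 2) drop trailing " # comments",
    # 3) strip one pair of surrounding double quotes, 4) first match wins.
    sed -n "s/^${key}:[[:space:]]*//p" "$file" \
        | sed -e 's/[[:space:]][[:space:]]*#.*$//' -e 's/^"\(.*\)"$/\1/' \
        | head -n 1
}
```

For the file above, `cloud_init_value /mnt/data/etc-kubesolo/cloud-init.yaml hostname` prints `edge-node-001`, which init could write to the persistent hostname file it reads on later boots.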
|
||||
|
||||
#### Runtime Configuration
|
||||
|
||||
Post-boot configuration changes are made through the Kubernetes API:
|
||||
|
||||
```bash
|
||||
# Access from the node (kubeconfig is at known path)
|
||||
export KUBECONFIG=/var/lib/kubesolo/pki/admin/admin.kubeconfig
|
||||
kubectl get nodes
|
||||
kubectl apply -f workload.yaml
|
||||
|
||||
# Remote access via Portainer Edge or direct API
|
||||
# (if apiserver-extra-sans includes remote IP/DNS)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 5. Build Process
|
||||
|
||||
### 5.1 Build Pipeline
|
||||
|
||||
```
|
||||
BUILD PIPELINE
|
||||
|
||||
┌─────────────────────┐ ┌──────────────────────┐
|
||||
│ 1. Fetch Tiny Core │────▶│ 2. Extract core.gz │
|
||||
│ Micro Core ISO │ │ (cpio -idmv) │
|
||||
└─────────────────────┘ └──────────┬───────────┘
|
||||
│
|
||||
┌──────────▼───────────┐
|
||||
│ 3. Inject KubeSolo │
|
||||
│ binary + deps │
|
||||
│ (containerd, runc,│
|
||||
│ CNI, modules) │
|
||||
└──────────┬───────────┘
|
||||
│
|
||||
┌──────────▼───────────┐
|
||||
│ 4. Replace /sbin/init│
|
||||
│ with custom init │
|
||||
└──────────┬───────────┘
|
||||
│
|
||||
┌──────────▼───────────┐
|
||||
│ 5. Repack initramfs │
|
||||
│ (find . | cpio -o │
|
||||
│ | gzip > ks-os.gz│
|
||||
└──────────┬───────────┘
|
||||
│
|
||||
┌──────────▼───────────┐
|
||||
│ 6. Verify kernel has │
|
||||
│ required configs │
|
||||
│ (cgroup v2, ns, │
|
||||
│ netfilter, etc.) │
|
||||
└──────────┬───────────┘
|
||||
│
|
||||
┌────────────────────────┼────────────────────┐
|
||||
│ │ │
|
||||
┌─────────▼─────────┐ ┌─────────▼─────────┐ ┌──────▼───────┐
|
||||
│ 7a. Create ISO │ │ 7b. Create raw │ │ 7c. Create │
|
||||
│ (bootable │ │ disk image │ │ OCI │
|
||||
│ media) │ │ (dd to disk) │ │ image │
|
||||
└───────────────────┘ └───────────────────┘ └──────────────┘
|
||||
```
|
||||
|
||||
### 5.2 Build Script (Skeleton)
|
||||
|
||||
```bash
#!/bin/bash
# build-kubesolo-os.sh
set -euo pipefail

VERSION="${1:?Usage: $0 <version>}"
WORK_DIR="$(mktemp -d)"
# Absolute path: the script changes directory into the rootfs below
OUTPUT_DIR="$(pwd)/output"

# --- 1. Download components ---
echo "==> Downloading Tiny Core Micro Core..."
wget -q "http://www.tinycorelinux.net/17.x/x86_64/release/CorePure64-17.0.iso" \
    -O "$WORK_DIR/core.iso"

echo "==> Downloading KubeSolo..."
curl -sfL https://get.kubesolo.io -o "$WORK_DIR/install-kubesolo.sh"
# Or: download specific release binary from GitHub

# --- 2. Extract Tiny Core ---
mkdir -p "$WORK_DIR/iso" "$WORK_DIR/rootfs"
mount -o loop "$WORK_DIR/core.iso" "$WORK_DIR/iso"  # requires root
cp "$WORK_DIR/iso/boot/vmlinuz64" "$WORK_DIR/vmlinuz"
cd "$WORK_DIR/rootfs"
zcat "$WORK_DIR/iso/boot/corepure64.gz" | cpio -idmv 2>/dev/null
umount "$WORK_DIR/iso"

# --- 3. Inject KubeSolo + dependencies ---
# KubeSolo binary
mkdir -p usr/local/bin
cp /path/to/kubesolo usr/local/bin/kubesolo
chmod +x usr/local/bin/kubesolo

# containerd + runc + CNI (extracted from KubeSolo bundle or downloaded separately)
# Paths follow the initramfs layout in §4.4
mkdir -p opt/containerd/bin opt/containerd/cni/bin
# ... copy containerd, runc, CNI plugins

# Required kernel modules (if not already in core.gz)
# ... may need to compile or extract from Tiny Core extensions

# --- 4. Custom init ---
cat > sbin/init << 'INIT'
#!/bin/sh
# ... (init script from §4.5)
INIT
chmod +x sbin/init

# --- 5. Sysctl + OS metadata ---
mkdir -p etc/sysctl.d
cat > etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
fs.inotify.max_user_instances = 1024
fs.inotify.max_user_watches = 524288
EOF

cat > etc/os-release << EOF
NAME="KubeSolo OS"
VERSION="$VERSION"
ID=kubesolo-os
VERSION_ID=$VERSION
PRETTY_NAME="KubeSolo OS $VERSION"
HOME_URL="https://github.com/portainer/kubesolo"
EOF

# --- 6. Repack initramfs ---
find . | cpio -o -H newc 2>/dev/null | gzip -9 > "$WORK_DIR/kubesolo-os.gz"

# --- 7. Create disk image ---
# create_disk_image is a placeholder; this skeleton does not define it
mkdir -p "$OUTPUT_DIR"
create_disk_image "$WORK_DIR/vmlinuz" "$WORK_DIR/kubesolo-os.gz" \
    "$OUTPUT_DIR/kubesolo-os-${VERSION}.img"

echo "==> Built: $OUTPUT_DIR/kubesolo-os-${VERSION}.img"
```
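
The skeleton creates its work directory with `mktemp -d` and never removes it, so failed builds leave extracted root filesystems behind. One possible remedy, sketched here as an assumption rather than the project's approach, is to run build steps inside a subshell whose `EXIT` trap deletes the directory. `with_workdir` is a hypothetical helper name.

```bash
#!/bin/sh
# Sketch: run a command with a temporary WORK_DIR that is always removed.
# with_workdir is a hypothetical helper name.

with_workdir() {
    (
        WORK_DIR="$(mktemp -d)" || exit 1
        export WORK_DIR
        # The EXIT trap fires when this subshell ends, on success or failure.
        trap 'rm -rf "$WORK_DIR"' EXIT
        "$@"
        status=$?
        exit "$status"
    )
}
```

A build could then be started as `with_workdir ./build-steps.sh "$VERSION"`, where the hypothetical `build-steps.sh` reads `$WORK_DIR` from its environment; any loop mount inside the directory still has to be unmounted before the trap runs.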
|
||||
|
||||
### 5.3 Alternative: Kairos-based Build (Container-first)
|
||||
|
||||
For faster iteration, leverage the Kairos framework to get A/B updates, P2P mesh, and OCI distribution for free:
|
||||
|
||||
```dockerfile
# Dockerfile.kubesolo-os
FROM quay.io/kairos/core-alpine:latest

# Install KubeSolo
RUN curl -sfL https://get.kubesolo.io | sh -

# Pre-configure
COPY kubesolo-config.yaml /etc/kubesolo/config.yaml
COPY cloud-init-defaults.yaml /system/oem/

# Kernel modules
RUN apk add --no-cache \
    linux-lts \
    iptables \
    iproute2 \
    conntrack-tools

# Sysctl
COPY k8s-sysctl.conf /etc/sysctl.d/

# Build: docker build -t kubesolo-os:v1 .
# Flash: Use AuroraBoot or Kairos tooling to convert OCI → bootable image
```
|
||||
|
||||
**Advantages of Kairos approach:**
|
||||
- A/B atomic updates with rollback — built-in
|
||||
- OCI-based distribution — `docker push` your OS
|
||||
- P2P mesh bootstrapping — nodes find each other
|
||||
- Kubernetes-native upgrades — `kubectl apply` to upgrade the OS
|
||||
- Proven in production edge deployments
|
||||
|
||||
**Trade-offs:**
|
||||
- Larger footprint than pure Tiny Core remaster (~200–400 MB vs ~50–80 MB)
|
||||
- Dependency on Kairos project maintenance
|
||||
- Less control over boot process internals
|
||||
|
||||
---
|
||||
|
||||
## 6. Kernel Considerations
|
||||
|
||||
### 6.1 Tiny Core Kernel Audit
|
||||
|
||||
Tiny Core 17.0 ships a modern 6.x kernel, which should have cgroup v2 support compiled in. However, the **kernel config must be verified** for these critical options:
|
||||
|
||||
```
# MANDATORY for KubeSolo
CONFIG_CGROUPS=y
CONFIG_CGROUP_CPUACCT=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_SCHED=y
CONFIG_CGROUP_PIDS=y
CONFIG_MEMCG=y
CONFIG_CGROUP_BPF=y

CONFIG_NAMESPACES=y
CONFIG_NET_NS=y
CONFIG_PID_NS=y
CONFIG_USER_NS=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y

CONFIG_OVERLAY_FS=y        # or =m (module)
CONFIG_BRIDGE=y            # or =m
CONFIG_NETFILTER=y
CONFIG_NF_NAT=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_FILTER=y
CONFIG_VETH=y              # or =m
CONFIG_VXLAN=y             # or =m

CONFIG_SQUASHFS=y          # For Tiny Core's own extension system
CONFIG_BLK_DEV_LOOP=y      # For SquashFS mounting

# RECOMMENDED
CONFIG_BPF_SYSCALL=y       # For modern CNI plugins
CONFIG_CRYPTO_SHA256=y     # For image verification
CONFIG_SECCOMP=y           # Container security
CONFIG_AUDIT=y             # Audit logging
```
|
||||
|
||||
If the stock Tiny Core kernel lacks any of these, options are:
|
||||
|
||||
1. **Load as modules** — if compiled as `=m`, load via `modprobe` in init
|
||||
2. **Recompile kernel** — use Tiny Core's kernel build process with custom config
|
||||
3. **Use a different kernel** — e.g., pull the kernel from Alpine Linux or build from mainline
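
The audit itself is easy to automate once the kernel config is available as a text file. The sketch below lists every required option that is neither built in nor available as a module. `missing_kernel_options` is a hypothetical helper, and it assumes the config has already been extracted, for example with `zcat /proc/config.gz` on kernels that expose it.

```bash
#!/bin/sh
# Sketch: report which required options are missing from a kernel config.
# missing_kernel_options is a hypothetical helper; the config is assumed to
# be a plain-text file in the usual "CONFIG_FOO=y" format.

missing_kernel_options() {
    config="$1"
    shift
    for opt in "$@"; do
        # Accept built-in (=y) or module (=m); print anything else.
        grep -Eq "^${opt}=(y|m)$" "$config" || echo "$opt"
    done
}
```

Empty output means every listed option is present. Options reported here that exist as `=m` elsewhere are not a problem, but anything printed needs one of the three remedies above.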
|
||||
|
||||
### 6.2 Custom Kernel Build (if needed)
|
||||
|
||||
```bash
|
||||
# On a Tiny Core build system
|
||||
tce-load -wi compiletc linux-6.x-source
|
||||
cd /usr/src/linux-6.x
|
||||
cp /path/to/kubesolo-kernel.config .config
|
||||
make oldconfig
|
||||
make -j$(nproc) bzImage modules
|
||||
# Extract vmlinuz and required modules
|
||||
```
|
||||
|
||||
---

## 7. Security Model

### 7.1 Layered Security

```
┌─────────────────────────────────────────┐
│  APPLICATION LAYER                      │
│  Kubernetes RBAC + Network Policies     │
│  Pod Security Standards                 │
│  Seccomp / AppArmor profiles            │
├─────────────────────────────────────────┤
│  CONTAINER RUNTIME LAYER                │
│  containerd with default seccomp        │
│  Read-only container rootfs             │
│  User namespace mapping (optional)      │
├─────────────────────────────────────────┤
│  OS LAYER                               │
│  SquashFS root (read-only, in RAM)      │
│  No package manager                     │
│  No SSH by default                      │
│  Minimal userland (BusyBox only)        │
│  No compiler, no debugger               │
├─────────────────────────────────────────┤
│  BOOT LAYER                             │
│  Signed images (GPG verification)       │
│  Secure Boot (optional, UEFI)           │
│  A/B rollback on tamper/failure         │
│  TPM-based attestation (optional)       │
└─────────────────────────────────────────┘
```

### 7.2 Attack Surface Comparison

| Attack Vector | Traditional Linux | KubeSolo OS |
|---|---|---|
| Package manager exploit | Possible (apt/yum) | **Eliminated** (no package manager) |
| SSH brute force | Common | **Eliminated** (no SSH by default) |
| Writable system files | Yes (/etc, /usr) | **Eliminated** (read-only SquashFS) |
| Persistent rootkit | Survives reboot | **Eliminated** (RAM-only root) |
| Kernel module injection | Possible | **Mitigated** (only preloaded modules) |
| Local privilege escalation | Various paths | **Reduced** (minimal binaries) |

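
Several rows of this table depend on certain binaries being absent from the image, which can be checked at build time before the initramfs is packed. The sketch below is one way to do that under stated assumptions: `audit_rootfs` is a hypothetical helper, the list of directories searched is illustrative, and the binary names passed in (for example `apt yum sshd dropbear gcc gdb`) are chosen by the caller rather than taken from the project. It returns non-zero if any named binary is found.

```bash
# Sketch: fail if any of the named binaries exist in an extracted rootfs.
# Usage: audit_rootfs <rootfs-dir> <binary-name>...
audit_rootfs() {
    rootfs=$1
    shift
    found=0
    for name in "$@"; do
        # Illustrative set of binary directories; adjust to the image layout.
        for dir in bin sbin usr/bin usr/sbin usr/local/bin usr/local/sbin; do
            if [ -e "$rootfs/$dir/$name" ]; then
                echo "FOUND $dir/$name"
                found=$((found + 1))
            fi
        done
    done
    [ "$found" -eq 0 ]
}
```
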
---

## 8. Implementation Roadmap

### Phase 1 — Proof of Concept (2–3 weeks)

**Goal:** Boot Tiny Core + KubeSolo, validate K8s functionality.

1. Download Tiny Core Micro Core 17.0 (x86_64)
2. Extract `core.gz`, inject the KubeSolo binary
3. Create a custom init that starts KubeSolo
4. Verify the kernel has the required configs (cgroup v2, namespaces, netfilter)
5. Build a bootable ISO, test in QEMU/KVM
6. Deploy a test workload (nginx pod)
7. Validate: `kubectl get nodes` shows `Ready`

**Success criteria:** A single ISO boots to a functional K8s node in under 30 seconds.

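
The validation step can be scripted by polling `kubectl get nodes --no-headers` until every node reports `Ready`. This is a rough sketch rather than the project's test harness: `all_nodes_ready` and `wait_for_ready` are hypothetical names, and the timer starts when the script starts, not at power-on, so it only approximates the 30-second criterion.

```bash
# Reads `kubectl get nodes --no-headers` output on stdin.
# Succeeds only if there is at least one node and every STATUS is exactly "Ready".
all_nodes_ready() {
    awk 'NF >= 2 { n++; if ($2 != "Ready") bad++ }
         END { if (n > 0 && bad == 0) exit 0; exit 1 }'
}

# Polls once per second until the node is Ready or the timeout (seconds) expires.
wait_for_ready() {
    timeout=${1:-30}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if kubectl get nodes --no-headers 2>/dev/null | all_nodes_ready; then
            echo "node Ready after ${elapsed}s"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "node not Ready within ${timeout}s" >&2
    return 1
}
```

Calling `wait_for_ready 30` from the host, with `kubectl` pointed at the VM's API server, returns non-zero if the node never becomes `Ready`. A status such as `Ready,SchedulingDisabled` is deliberately treated as not ready.
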
### Phase 2 — Persistence + Immutability (2–3 weeks)

**Goal:** Persistent K8s state across reboots, immutable root.

1. Implement a persistent data partition with bind mounts
2. Verify K8s state survives reboot (pods, services, PVCs)
3. Verify SQLite DB integrity across unclean shutdowns
4. Lock down the root filesystem (verify read-only enforcement)
5. Test: corrupt system files at runtime, then reboot and verify the RAM-only root comes back unmodified

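
Read-only enforcement (step 4) can be partly verified by inspecting the mount table. The sketch below assumes the root filesystem shows up in `/proc/mounts` with the `ro` option, which depends on how the final root is assembled and may not hold if the root is a plain writable initramfs. `root_is_readonly` is a hypothetical helper name.

```bash
# Sketch: succeed if the effective mount for "/" carries the "ro" option.
# Usage: root_is_readonly [mounts-file]   (defaults to /proc/mounts)
root_is_readonly() {
    mounts_file=${1:-/proc/mounts}
    # Fields: device mountpoint fstype options dump pass.
    # "/" can appear more than once; the last entry is the one in effect.
    opts=$(awk '$2 == "/" { o = $4 } END { print o }' "$mounts_file")
    case ",$opts," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

A mount-option check is necessary but not sufficient. A fuller test would also attempt a write (for example `touch /usr/.probe`) and expect it to fail.
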
### Phase 3 — Atomic Updates + Rollback (3–4 weeks)

**Goal:** A/B partition updates with automatic rollback.

1. Implement GRUB A/B boot configuration
2. Build update agent (Go binary)
3. Implement health check + `boot_success` flag
4. Test update cycle: A → B → verify → mark good
5. Test rollback: A → B → fail → auto-revert to A
6. Test: pull power during update → verify clean state

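
The update and rollback cycles in steps 4 and 5 reduce to one decision at each boot: keep the active slot or fall back to the other. The following sketch illustrates that decision as a pure function. It is an assumption-laden model, not the project's GRUB configuration: the slot names `A`/`B`, the retry counter, and the function names `other_slot` and `select_slot` are all hypothetical, and only the `boot_success` flag comes from the roadmap itself.

```bash
# Sketch of the boot-time slot decision for an A/B scheme.

other_slot() {
    case $1 in
        A) echo B ;;
        B) echo A ;;
        *) return 1 ;;
    esac
}

# select_slot <active_slot> <boot_success: 0|1> <tries_left>
select_slot() {
    active=$1
    success=$2
    tries=$3
    if [ "$success" = 1 ]; then
        echo "$active"            # last boot was marked good: keep it
    elif [ "$tries" -gt 0 ]; then
        echo "$active"            # not yet confirmed, attempts remain: retry
    else
        other_slot "$active"      # not confirmed, no attempts left: roll back
    fi
}
```

Under this model, `select_slot B 0 0` prints `A` (the rollback case in step 5), while `select_slot B 1 0` prints `B` (the "mark good" case in step 4). In a real setup the three inputs would be read from the bootloader's persistent environment rather than passed as arguments.
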
### Phase 4 — Production Hardening (2–3 weeks)

**Goal:** Production-ready security and manageability.

1. Image signing and verification (GPG or sigstore/cosign)
2. Cloud-init implementation for first-boot config
3. Portainer Edge integration testing
4. Optional SSH extension (`.tcz`)
5. Optional management API (lightweight, mTLS-authenticated)
6. Performance benchmarking (boot time, memory usage, disk I/O)
7. Documentation and deployment guides

### Phase 5 — Distribution + Fleet Management (ongoing)

**Goal:** Scale to fleet deployments.

1. CI/CD pipeline for automated image builds
2. OCI registry distribution (optional)
3. Fleet update orchestration (rolling updates across nodes)
4. Monitoring integration (Prometheus metrics endpoint)
5. USB provisioning tool for air-gapped deployments
6. ARM64 support (Raspberry Pi, Jetson, etc.)

---

## 9. Open Questions & Decisions

| # | Question | Options | Recommendation |
|---|---|---|---|
| 1 | **Build approach** | Pure Tiny Core remaster vs. Kairos framework | Start with pure remaster for minimal footprint; evaluate Kairos if update complexity becomes unmanageable |
| 2 | **Kernel** | Stock Tiny Core kernel vs. custom build | Audit stock kernel first; only custom-build if missing critical configs |
| 3 | **Management interface** | SSH / API / Portainer Edge only | Portainer Edge primary; optional SSH extension for debugging |
| 4 | **Update distribution** | HTTP server / OCI registry / USB | HTTP for simplicity; OCI if leveraging container infrastructure |
| 5 | **Init system** | Custom shell script vs. BusyBox init vs. s6 | Custom shell script for PoC; evaluate s6 for supervision |
| 6 | **Networking** | DHCP only / Static / cloud-init | Cloud-init with DHCP fallback |
| 7 | **Architecture support** | x86_64 only vs. multi-arch | x86_64 first; ARM64 in Phase 5 |
| 8 | **Container images** | Preloaded in initramfs vs. pull at boot | Preload core workloads; pull additional at runtime |

---

## 10. References

- KubeSolo: https://github.com/portainer/kubesolo
- Tiny Core Linux: http://www.tinycorelinux.net
- Tiny Core Wiki (Remastering): http://wiki.tinycorelinux.net/doku.php?id=wiki:remastering
- Talos Linux: https://www.talos.dev
- Kairos: https://kairos.io
- Bottlerocket: https://github.com/bottlerocket-os/bottlerocket
- Flatcar Linux: https://www.flatcar.org
- Kubernetes Node Requirements: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- cgroup v2: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html