# KubeSolo OS

An immutable, bootable Linux distribution purpose-built for [KubeSolo](https://github.com/portainer/kubesolo) — Portainer's ultra-lightweight single-node Kubernetes.

> **Status (v0.3.1):** First fully validated generic ARM64 release. x86_64 and ARM64 (UEFI / virtio / mainline kernel) both build and boot end-to-end. v0.3.1 closes the dual-glibc, nftables address-family, and kube-proxy expression-module gaps that kept v0.3.0 from reaching a Ready node on ARM64. ARM64 is validated end-to-end under QEMU virt + HVF on Apple Silicon: `kubectl get nodes` reports `Ready`, and CoreDNS, local-path-provisioner, and an nginx test workload are all `Running`. The update agent has an explicit state machine, OCI registry distribution alongside HTTP, channel + maintenance-window + version-stepping-stone gates, and auto-rollback. Raspberry Pi support remains paused pending physical hardware. See [CHANGELOG.md](CHANGELOG.md) for the full v0.3.1 changelog and [docs/release-notes-0.3.0.md](docs/release-notes-0.3.0.md) for the v0.3.0 milestone summary.

## What is this?

KubeSolo OS combines **Tiny Core Linux** (~11 MB) with **KubeSolo** (single-binary Kubernetes) to create an appliance-like K8s node that:

- Boots to a functional Kubernetes cluster in ~35 seconds
- Runs entirely from RAM with a read-only SquashFS root
- Persists K8s state across reboots via a dedicated data partition
- Uses a custom kernel (6.18.2-tinycore64) optimized for containers
- Supports first-boot configuration via cloud-init YAML
- Performs atomic A/B updates with automatic GRUB-based rollback
- Signs update images with Ed25519 for integrity verification
- Exposes Prometheus metrics for monitoring
- Integrates with Portainer Edge for fleet management
- Ships as ISO, raw disk image, or OCI container
- Requires no SSH, no package manager, no writable system files

**Target use cases:** IoT/IIoT edge, air-gapped deployments, single-node K8s appliances, kiosk/POS systems, resource-constrained hardware.

## Quick Start

### x86_64 ISO

```bash
make fetch           # Tiny Core ISO + KubeSolo binary
make kernel          # Custom kernel (first time only, ~25 min, cached)
make build-cloudinit build-update-agent
make rootfs initramfs iso
make dev-vm
```

### Generic ARM64 disk image (v0.3.0+)

For Graviton / Ampere / generic UEFI ARM64 hosts:

```bash
make kernel-arm64           # Mainline 6.12 LTS kernel (first time only, ~30-60 min)
make rootfs-arm64           # Mainline kernel modules + KubeSolo arm64
make disk-image-arm64       # UEFI-bootable A/B GPT image
make test-boot-arm64-disk   # boot smoke test under qemu-system-aarch64
```

### Raspberry Pi (work in progress)

The build path lives in `make kernel-rpi` and `make rpi-image`, but it still
needs physical hardware to validate the firmware and `autoboot.txt` path. See
[docs/arm64-architecture.md](docs/arm64-architecture.md) for the two-track
build layout.

Instead of running the individual targets, you can build everything at once inside Docker:

```bash
make docker-build
```

After boot, retrieve the kubeconfig and manage your cluster from the host:

```bash
mkdir -p ~/.kube     # make sure the target directory exists
curl -s http://localhost:8080 > ~/.kube/kubesolo-config
export KUBECONFIG=~/.kube/kubesolo-config
kubectl get nodes
```

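Right after boot, the kubeconfig endpoint and the API server may not answer yet. The following is a minimal sketch of a polling helper, assuming `curl` and `kubectl` are installed on the host. The function name `retry`, the attempt count, and the delay are illustrative and not part of KubeSolo OS.

```bash
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds or ATTEMPTS runs out.
# Hypothetical helper, not shipped with KubeSolo OS.
retry() {
    attempts=$1; delay=$2; shift 2
    i=1
    while :; do
        "$@" && return 0                      # success: stop retrying
        [ "$i" -ge "$attempts" ] && return 1  # out of attempts: give up
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example use (commented out so sourcing this snippet has no side effects):
# retry 30 2 curl -sf -o ~/.kube/kubesolo-config http://localhost:8080
# export KUBECONFIG=~/.kube/kubesolo-config
# retry 30 2 kubectl wait --for=condition=Ready node --all --timeout=10s
```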
### Portainer Edge Agent

Pass Edge credentials via boot parameters:

```bash
./hack/dev-vm.sh --edge-id=YOUR_EDGE_ID --edge-key=YOUR_EDGE_KEY
```

Or configure via [cloud-init YAML](cloud-init/examples/portainer-edge.yaml).

## Requirements

**Build host:**
- Linux x86_64 with root/sudo (for loop mounts)
- Go 1.22+ (for cloud-init and update agent)
- Tools: `cpio`, `gzip`, `wget`, `curl`, `syslinux` (or use `make docker-build`)

**Runtime:**
- x86_64 hardware or VM, or a generic UEFI ARM64 VM (real-hardware ARM64 validation is planned for v0.3.2)
- 512 MB RAM minimum (1 GB+ recommended)
- 8 GB disk (for persistent data partition)

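The build-host tool list can be checked before starting a long build. This is a small sketch, not a script from the repository: `missing_tools` is a hypothetical name, and binary names may differ from package names on some distributions.

```bash
# missing_tools TOOL...: print every tool that is not found on PATH, one per line.
# Hypothetical helper for illustration only.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example: list which of the required build tools still need installing.
# missing_tools cpio gzip wget curl syslinux go
```

An empty result means every listed tool was found.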
## Architecture

```
Boot Media (ISO or Disk Image)
  │
  ├── GRUB 2 bootloader (A/B slot selection, rollback counter)
  │
  └── Kernel + Initramfs (kubesolo-os.gz)
        │
        ├── switch_root → SquashFS root (read-only, in RAM)
        ├── Persistent data partition (ext4, bind-mounted)
        │     ├── /var/lib/kubesolo (K8s state, certs, SQLite)
        │     ├── /var/lib/containerd (container images)
        │     └── /etc/kubesolo (node configuration)
        ├── Custom init (POSIX sh, staged boot 00→90)
        │     └── Stage 45: cloud-init (Go binary)
        ├── containerd (bundled with KubeSolo)
        └── KubeSolo (single-binary K8s)
```

### Partition Layout (Disk Image)

```
GPT Disk (minimum 8 GB):
  Part 1: EFI/Boot  (256 MB, FAT32)    — GRUB + A/B boot logic
  Part 2: System A  (512 MB, ext4)     — vmlinuz + kubesolo-os.gz (active)
  Part 3: System B  (512 MB, ext4)     — vmlinuz + kubesolo-os.gz (passive)
  Part 4: Data      (remaining, ext4)  — persistent K8s state
```

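To see roughly how much space the layout above leaves for the data partition, the arithmetic can be sketched as below. It assumes "8 GB" means 8192 MB and ignores GPT headers and alignment padding, so the real figure is slightly lower. `data_partition_mb` is a hypothetical name.

```bash
# data_partition_mb DISK_MB: MB left for the data partition after the
# EFI partition (256 MB) and the two system slots (512 MB each).
data_partition_mb() {
    disk_mb=$1
    efi_mb=256
    system_mb=512
    echo $((disk_mb - efi_mb - 2 * system_mb))
}

data_partition_mb 8192   # prints 6912
```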
See [docs/design/kubesolo-os-design.md](docs/design/kubesolo-os-design.md) for the full architecture document.

## Custom Kernel

The stock Tiny Core 17.0 kernel lacks several configs required for containers. KubeSolo OS builds a custom kernel (6.18.2-tinycore64) that adds:

- `CONFIG_CGROUP_BPF` — cgroup v2 device control via BPF
- `CONFIG_DEVTMPFS` / `CONFIG_DEVTMPFS_MOUNT` — automatic /dev node creation
- `CONFIG_MEMCG` — memory cgroup controller
- `CONFIG_CFS_BANDWIDTH` — CPU bandwidth throttling

Unnecessary subsystems (sound, GPU, wireless, Bluetooth, etc.) are stripped to keep the kernel minimal. Build is cached in `build/cache/custom-kernel/`.

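The options listed above can be checked against a plain-text kernel `.config`. This is a sketch rather than a script from the repository: `missing_kernel_configs` is a hypothetical name, and it only accepts options built in as `=y`.

```bash
# missing_kernel_configs FILE: print each required option that is not set
# to =y in FILE (a plain-text kernel .config). Hypothetical helper.
missing_kernel_configs() {
    config_file=$1
    for opt in CONFIG_CGROUP_BPF CONFIG_DEVTMPFS CONFIG_DEVTMPFS_MOUNT \
               CONFIG_MEMCG CONFIG_CFS_BANDWIDTH; do
        # Anchor both ends so CONFIG_DEVTMPFS does not match CONFIG_DEVTMPFS_MOUNT.
        grep -q "^${opt}=y$" "$config_file" || printf '%s\n' "$opt"
    done
}

# On a running system this works only if the kernel exposes /proc/config.gz
# (CONFIG_IKCONFIG_PROC), which this README does not confirm:
# zcat /proc/config.gz > /tmp/running.config && missing_kernel_configs /tmp/running.config
```

An empty result means all five options are present.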
## Cloud-Init

First-boot configuration via a simple YAML schema. All [documented KubeSolo flags](https://www.kubesolo.io/documentation#install) are supported:

```yaml
hostname: edge-node-01
network:
  mode: static
  address: 192.168.1.100/24
  gateway: 192.168.1.1
  dns:
    - 8.8.8.8
kubesolo:
  local-storage: true
  local-storage-shared-path: "/mnt/shared"
  apiserver-extra-sans:
    - edge-node-01.local
  debug: false
  pprof-server: false
  portainer-edge-id: "your-edge-id"
  portainer-edge-key: "your-edge-key"
  portainer-edge-async: true
```

See [docs/cloud-init.md](docs/cloud-init.md) and the [examples](cloud-init/examples/).

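When provisioning several nodes, the per-node values in the example above can be templated. The sketch below uses only keys that appear in that example. `render_cloud_init` is a hypothetical helper, and the nesting follows the example as reconstructed here, so check it against docs/cloud-init.md before relying on it.

```bash
# render_cloud_init HOSTNAME ADDRESS_CIDR GATEWAY: print a static-network
# cloud-init document on stdout. Hypothetical helper for illustration.
render_cloud_init() {
    cat <<EOF
hostname: $1
network:
  mode: static
  address: $2
  gateway: $3
  dns:
    - 8.8.8.8
kubesolo:
  local-storage: true
EOF
}

# Example: render_cloud_init edge-node-02 192.168.1.101/24 192.168.1.1 > edge-node-02.yaml
```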
## Atomic Updates

An A/B partition scheme with a GRUB boot counter provides automatic rollback:

1. The update agent downloads the new image to the passive partition
2. GRUB boots the new partition with `boot_counter=3`
3. A health check verifies containerd + K8s API + node Ready, then sets `boot_success=1`
4. After 3 consecutive boot failures, GRUB automatically rolls back to the previous slot

Updates can be signed with Ed25519 for integrity verification. A K8s CronJob checks for updates every 6 hours.

See [docs/update-flow.md](docs/update-flow.md).

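The counting in steps 2 to 4 can be modelled as a small pure function. This is a sketch under assumptions, not the project's GRUB logic: `next_boot`, the one-decrement-per-unconfirmed-boot behaviour, and leaving the counter at 0 after rollback are all illustrative choices. The authoritative description is in docs/update-flow.md.

```bash
# next_boot SLOT BOOT_COUNTER BOOT_SUCCESS
# Prints "SLOT COUNTER": the slot that boots next and the counter afterwards.
# Illustrative model only; slot names A/B follow the partition layout.
next_boot() {
    slot=$1; counter=$2; success=$3
    if [ "$success" -eq 1 ]; then
        echo "$slot $counter"              # health check passed: stay on this slot
    elif [ "$counter" -gt 0 ]; then
        echo "$slot $((counter - 1))"      # unconfirmed boot: spend one attempt
    elif [ "$slot" = "A" ]; then
        echo "B 0"                         # attempts exhausted: roll back to B
    else
        echo "A 0"                         # attempts exhausted: roll back to A
    fi
}

next_boot B 3 0   # prints "B 2"
next_boot B 0 0   # prints "A 0"
```

In this model, starting from `boot_counter=3` allows three unconfirmed boots of the new slot before the fourth boot falls back to the previous one.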
## Monitoring

The update agent exposes Prometheus metrics on port 9100:

```bash
kubesolo-update metrics --listen :9100
```

Metrics include: `kubesolo_os_info`, `boot_success`, `boot_counter`, `uptime_seconds`, `update_available`, `memory_total_bytes`, `memory_available_bytes`.

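For a quick check without a Prometheus server, a single value can be pulled out of the text exposition format. This is a sketch that assumes the agent serves the conventional `/metrics` path and that metric names appear exactly as listed above, neither of which this README confirms. `metric_value` is a hypothetical helper.

```bash
# metric_value NAME: read Prometheus text-format lines on stdin and print the
# value of the first sample whose name (ignoring any {labels}) equals NAME.
metric_value() {
    awk -v name="$1" '
        /^#/ { next }                      # skip HELP/TYPE comment lines
        {
            metric = $1
            sub(/\{.*/, "", metric)        # drop a label set if present
            if (metric == name) { print $NF; exit }
        }'
}

# Example against a running agent (path is an assumption):
# curl -s http://localhost:9100/metrics | metric_value boot_success
```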
## Project Structure

```
├── Makefile                       # Build orchestration
├── build/                         # Build scripts, kernel config, rootfs overlays
│   └── scripts/
│       ├── build-kernel.sh        # Custom kernel compilation
│       ├── fetch-components.sh    # Download components
│       ├── create-iso.sh          # Bootable ISO
│       ├── create-disk-image.sh   # A/B partition disk image
│       └── create-oci-image.sh    # OCI container image
├── init/                          # Custom init system (POSIX sh)
│   ├── init.sh                    # Main init + switch_root
│   └── lib/                       # Staged boot scripts (00-90)
├── cloud-init/                    # Go cloud-init parser
├── update/                        # Go atomic update agent
├── test/                          # QEMU-based automated tests + benchmarks
├── hack/                          # Developer utilities (dev-vm, SSH, USB)
├── docs/                          # Documentation
│   ├── design/                    # Architecture design document
│   ├── boot-flow.md               # Boot sequence reference
│   ├── update-flow.md             # A/B update reference
│   ├── cloud-init.md              # Cloud-init configuration reference
│   └── deployment-guide.md        # Deployment and operations guide
└── .gitea/workflows/              # CI/CD (Gitea Actions)
```

## Make Targets

| Target | Description |
|--------|-------------|
| `make fetch` | Download Tiny Core ISO + KubeSolo binary |
| `make kernel` | Build custom kernel (cached) |
| `make build-cloudinit` | Compile cloud-init Go binary |
| `make build-update-agent` | Compile update agent Go binary |
| `make rootfs` | Extract Tiny Core + inject KubeSolo |
| `make initramfs` | Pack initramfs (kubesolo-os.gz) |
| `make iso` | Create bootable ISO |
| `make disk-image` | Create A/B partition disk image |
| `make oci-image` | Package as OCI container |
| `make build-cross` | Cross-compile for amd64 + arm64 |
| `make docker-build` | Build everything in Docker |
| `make quick` | Fast rebuild (re-inject + repack + ISO) |
| `make dev-vm` | Launch QEMU dev VM (Linux + macOS) |
| `make test-all` | Run all tests |

## Documentation

- [Architecture Design](docs/design/kubesolo-os-design.md) — full research and technical specification
- [Boot Flow](docs/boot-flow.md) — boot sequence from GRUB to K8s Ready
- [Update Flow](docs/update-flow.md) — A/B atomic update mechanism
- [Cloud-Init](docs/cloud-init.md) — first-boot configuration reference
- [Deployment Guide](docs/deployment-guide.md) — installation, operations, troubleshooting

## Roadmap

| Phase | Scope | Status |
|-------|-------|--------|
| 1 | PoC: boot Tiny Core + KubeSolo, verify K8s | Complete (x86_64) |
| 2 | Cloud-init Go parser, network, hostname | Complete |
| 3 | A/B atomic updates, GRUB, rollback agent | Complete (x86_64) |
| 4 | Ed25519 signing, Portainer Edge, SSH extension | Complete |
| 5 | CI/CD, OCI distribution, Prometheus metrics, ARM64 cross-compile | Complete |
| 6 | Security hardening, AppArmor | Complete |
| - | Custom kernel build for container runtime fixes | Complete (x86_64) |
| 7 | ARM64 generic (mainline kernel, UEFI, virtio) | Complete (v0.3.1, K8s Ready under QEMU virt+HVF) |
| 8 | Update engine v2 (state machine, channels, OCI, pre-flight gates) | Complete (v0.3.0) |
| - | ARM64 Raspberry Pi (custom kernel, firmware, SD card image) | Paused — needs hardware |
| - | OCI cosign signature verification | Planned for v0.3.2 |
| - | LABEL=KSOLODATA on ARM64 (replace blkid/findfs path) | Planned for v0.3.2 |
| - | Real-hardware ARM64 validation (Graviton / Ampere) | Planned for v0.3.2 |

## License

MIT License — see [LICENSE](LICENSE) for details.