# KubeSolo OS

An immutable, bootable Linux distribution purpose-built for [KubeSolo](https://github.com/portainer/kubesolo) — Portainer's ultra-lightweight single-node Kubernetes.

> **Status (v0.3.0):** x86_64 and generic ARM64 (UEFI / virtio / mainline kernel) both build and boot end-to-end. Update agent has an explicit state machine, OCI registry distribution alongside HTTP, channel + maintenance-window + version-stepping-stone gates, and auto-rollback. ARM64 Raspberry Pi support remains paused pending physical hardware. See [docs/release-notes-0.3.0.md](docs/release-notes-0.3.0.md) for the full v0.3.0 changelog.

## What is this?

KubeSolo OS combines **Tiny Core Linux** (~11 MB) with **KubeSolo** (single-binary Kubernetes) to create an appliance-like K8s node that:

- Boots to a functional Kubernetes cluster in ~35 seconds
- Runs entirely from RAM with a read-only SquashFS root
- Persists K8s state across reboots via a dedicated data partition
- Uses a custom kernel (6.18.2-tinycore64) optimized for containers
- Supports first-boot configuration via cloud-init YAML
- Performs atomic A/B updates with automatic GRUB-based rollback
- Signs update images with Ed25519 for integrity verification
- Exposes Prometheus metrics for monitoring
- Integrates with Portainer Edge for fleet management
- Ships as ISO, raw disk image, or OCI container
- Requires no SSH, no package manager, no writable system files

**Target use cases:** IoT/IIoT edge, air-gapped deployments, single-node K8s appliances, kiosk/POS systems, resource-constrained hardware.

## Quick Start

### x86_64 ISO

```bash
make fetch                    # Tiny Core ISO + KubeSolo binary
make kernel                   # Custom kernel (first time only, ~25 min, cached)
make build-cloudinit build-update-agent
make rootfs initramfs iso
make dev-vm
```

### Generic ARM64 disk image (v0.3.0+)

For Graviton / Ampere / generic UEFI ARM64 hosts:

```bash
make kernel-arm64             # Mainline 6.12 LTS kernel (first time only, ~30-60 min)
make rootfs-arm64             # Mainline kernel modules + KubeSolo arm64
make disk-image-arm64         # UEFI-bootable A/B GPT image
make test-boot-arm64-disk     # boot smoke test under qemu-system-aarch64
```

### Raspberry Pi (work in progress)

The build path lives at `make kernel-rpi` / `make rpi-image`, but it needs physical
hardware to validate the firmware + `autoboot.txt` path. See
[docs/arm64-architecture.md](docs/arm64-architecture.md) for the two-track
build layout.

Alternatively, build everything at once inside Docker:

```bash
make docker-build
```

After boot, retrieve the kubeconfig and manage your cluster from the host:

```bash
curl -s http://localhost:8080 > ~/.kube/kubesolo-config
export KUBECONFIG=~/.kube/kubesolo-config
kubectl get nodes
```

### Portainer Edge Agent

Pass Edge credentials via boot parameters:

```bash
./hack/dev-vm.sh --edge-id=YOUR_EDGE_ID --edge-key=YOUR_EDGE_KEY
```

Or configure via [cloud-init YAML](cloud-init/examples/portainer-edge.yaml).

## Requirements

**Build host:**
- Linux x86_64 with root/sudo (for loop mounts)
- Go 1.22+ (for cloud-init and update agent)
- Tools: `cpio`, `gzip`, `wget`, `curl`, `syslinux` (or use `make docker-build`)

**Runtime:**
- x86_64 hardware or VM, or a generic UEFI ARM64 VM (real-hardware ARM64 validation is planned for v0.3.1)
- 512 MB RAM minimum (1 GB+ recommended)
- 8 GB disk (for persistent data partition)

## Architecture

```
Boot Media (ISO or Disk Image)
  │
  ├── GRUB 2 bootloader (A/B slot selection, rollback counter)
  │
  └── Kernel + Initramfs (kubesolo-os.gz)
        │
        ├── switch_root → SquashFS root (read-only, in RAM)
        ├── Persistent data partition (ext4, bind-mounted)
        │     ├── /var/lib/kubesolo (K8s state, certs, SQLite)
        │     ├── /var/lib/containerd (container images)
        │     └── /etc/kubesolo (node configuration)
        ├── Custom init (POSIX sh, staged boot 00→90)
        │     └── Stage 45: cloud-init (Go binary)
        ├── containerd (bundled with KubeSolo)
        └── KubeSolo (single-binary K8s)
```

### Partition Layout (Disk Image)

```
GPT Disk (minimum 8 GB):
  Part 1: EFI/Boot  (256 MB, FAT32)   — GRUB + A/B boot logic
  Part 2: System A  (512 MB, ext4)    — vmlinuz + kubesolo-os.gz (active)
  Part 3: System B  (512 MB, ext4)    — vmlinuz + kubesolo-os.gz (passive)
  Part 4: Data      (remaining, ext4) — persistent K8s state
```

See [docs/design/kubesolo-os-design.md](docs/design/kubesolo-os-design.md) for the full architecture document.

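As a rough sanity check on the partition layout, the space left for the data partition is the disk size minus the three fixed partitions. The sketch below treats the sizes as MiB and ignores GPT metadata and alignment padding; `data_partition_mb` is a hypothetical helper, not part of the build scripts.

```bash
# Hypothetical helper: estimate the data partition size in MiB.
# Fixed sizes come from the partition layout; GPT headers and
# alignment padding are ignored, so the real figure is slightly smaller.
EFI_MB=256
SYSTEM_A_MB=512
SYSTEM_B_MB=512

data_partition_mb() {
    disk_mb=$1
    fixed_mb=$((EFI_MB + SYSTEM_A_MB + SYSTEM_B_MB))
    if [ "$disk_mb" -le "$fixed_mb" ]; then
        echo "disk too small: need more than ${fixed_mb} MiB" >&2
        return 1
    fi
    echo $((disk_mb - fixed_mb))
}

data_partition_mb 8192   # 8 GiB disk: prints 6912
```

On the minimum 8 GB disk this leaves roughly 6.7 GiB for persistent K8s state and container images.
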
## Custom Kernel

The stock Tiny Core 17.0 kernel lacks several configs required for containers. KubeSolo OS builds a custom kernel (6.18.2-tinycore64) that adds:

- `CONFIG_CGROUP_BPF` — cgroup v2 device control via BPF
- `CONFIG_DEVTMPFS` / `CONFIG_DEVTMPFS_MOUNT` — automatic /dev node creation
- `CONFIG_MEMCG` — memory cgroup controller
- `CONFIG_CFS_BANDWIDTH` — CPU bandwidth throttling

Unnecessary subsystems (sound, GPU, wireless, Bluetooth, etc.) are stripped to keep the kernel minimal. Build is cached in `build/cache/custom-kernel/`.

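One way to confirm that a kernel build carries these options is to grep its `.config`. The sketch below is illustrative only: `check_kernel_config` is a hypothetical helper, and the location of the `.config` file is an assumption, since the exact path under the build cache is not stated here.

```bash
# Hypothetical helper: verify that a kernel .config enables the
# container-related options listed in this section.
# Usage: check_kernel_config /path/to/.config   (the path is an assumption)
REQUIRED_OPTS="CONFIG_CGROUP_BPF CONFIG_DEVTMPFS CONFIG_DEVTMPFS_MOUNT CONFIG_MEMCG CONFIG_CFS_BANDWIDTH"

check_kernel_config() {
    config_file=$1
    missing=0
    for opt in $REQUIRED_OPTS; do
        # Built-in boolean options appear as "CONFIG_FOO=y" in a kernel .config
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "missing: ${opt}"
            missing=1
        fi
    done
    # Exit status 0 when every option is present, 1 otherwise
    return "$missing"
}
```

The function prints one `missing:` line per absent option, so it can be dropped into a CI step that fails when the exit status is non-zero.
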
## Cloud-Init

First-boot configuration via a simple YAML schema. All [documented KubeSolo flags](https://www.kubesolo.io/documentation#install) are supported:

```yaml
hostname: edge-node-01
network:
  mode: static
  address: 192.168.1.100/24
  gateway: 192.168.1.1
  dns:
    - 8.8.8.8
kubesolo:
  local-storage: true
  local-storage-shared-path: "/mnt/shared"
  apiserver-extra-sans:
    - edge-node-01.local
  debug: false
  pprof-server: false
  portainer-edge-id: "your-edge-id"
  portainer-edge-key: "your-edge-key"
  portainer-edge-async: true
```

See [docs/cloud-init.md](docs/cloud-init.md) and the [examples](cloud-init/examples/).

## Atomic Updates

A/B partition scheme with GRUB boot counter for automatic rollback:

1. Update agent downloads new image to passive partition
2. GRUB boots new partition with `boot_counter=3`
3. Health check verifies containerd + K8s API + node Ready → sets `boot_success=1`
4. On 3 consecutive boot failures, GRUB auto-rolls back to previous slot

Updates can be signed with Ed25519 for integrity verification. A K8s CronJob checks for updates every 6 hours.

See [docs/update-flow.md](docs/update-flow.md).

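The rollback rule in steps 2 to 4 can be modelled as a small decision function. This is an illustrative sketch, not the actual GRUB script or update agent: the name `next_boot_action` and its output strings are hypothetical, and it assumes the counter starts at 3 and is decremented once per failed boot.

```bash
# Illustrative model of the boot-counter rule (not the real GRUB logic).
# Arguments: boot_success (0 or 1) and the remaining boot_counter.
# Prints "keep" (slot is healthy), "retry" (try the new slot again),
# or "rollback" (attempts exhausted, return to the previous slot).
next_boot_action() {
    boot_success=$1
    boot_counter=$2
    if [ "$boot_success" -eq 1 ]; then
        echo "keep"
    elif [ "$boot_counter" -gt 0 ]; then
        echo "retry"
    else
        echo "rollback"
    fi
}

# Simulate three failed boots of a freshly written slot.
counter=3
while [ "$(next_boot_action 0 "$counter")" = "retry" ]; do
    counter=$((counter - 1))
done
next_boot_action 0 "$counter"   # prints "rollback" once the counter reaches 0
```

A successful health check short-circuits the counter entirely: `next_boot_action 1 3` prints `keep` regardless of how many attempts remain.
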
## Monitoring

The update agent exposes Prometheus metrics on port 9100:

```bash
kubesolo-update metrics --listen :9100
```

Metrics include: `kubesolo_os_info`, `boot_success`, `boot_counter`, `uptime_seconds`, `update_available`, `memory_total_bytes`, `memory_available_bytes`.

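Once the exporter is running, individual values can be pulled out of its text output. The sketch below assumes the metrics are served in the Prometheus text exposition format on the conventional `/metrics` path, which this README does not confirm; `metric_value` is a hypothetical helper, and the exact exported metric names may carry a prefix not shown here.

```bash
# Hypothetical helper: print the value of a label-free metric from
# Prometheus text exposition output read on stdin.
metric_value() {
    name=$1
    # "# HELP" / "# TYPE" lines never match because their first field is "#".
    awk -v n="$name" '$1 == n { print $2; exit }'
}

# Assumed usage (the /metrics path and the exact metric name are assumptions):
#   curl -s http://localhost:9100/metrics | metric_value boot_success
```

The helper prints nothing when the metric is absent, so callers can test for an empty result.
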
## Project Structure

```
├── Makefile                         # Build orchestration
├── build/                           # Build scripts, kernel config, rootfs overlays
│   └── scripts/
│       ├── build-kernel.sh          # Custom kernel compilation
│       ├── fetch-components.sh      # Download components
│       ├── create-iso.sh            # Bootable ISO
│       ├── create-disk-image.sh     # A/B partition disk image
│       └── create-oci-image.sh      # OCI container image
├── init/                            # Custom init system (POSIX sh)
│   ├── init.sh                      # Main init + switch_root
│   └── lib/                         # Staged boot scripts (00-90)
├── cloud-init/                      # Go cloud-init parser
├── update/                          # Go atomic update agent
├── test/                            # QEMU-based automated tests + benchmarks
├── hack/                            # Developer utilities (dev-vm, SSH, USB)
├── docs/                            # Documentation
│   ├── design/                      # Architecture design document
│   ├── boot-flow.md                 # Boot sequence reference
│   ├── update-flow.md               # A/B update reference
│   ├── cloud-init.md                # Cloud-init configuration reference
│   └── deployment-guide.md          # Deployment and operations guide
└── .gitea/workflows/                # CI/CD (Gitea Actions)
```

## Make Targets

| Target | Description |
|--------|-------------|
| `make fetch` | Download Tiny Core ISO + KubeSolo binary |
| `make kernel` | Build custom kernel (cached) |
| `make build-cloudinit` | Compile cloud-init Go binary |
| `make build-update-agent` | Compile update agent Go binary |
| `make rootfs` | Extract Tiny Core + inject KubeSolo |
| `make initramfs` | Pack initramfs (kubesolo-os.gz) |
| `make iso` | Create bootable ISO |
| `make disk-image` | Create A/B partition disk image |
| `make oci-image` | Package as OCI container |
| `make build-cross` | Cross-compile for amd64 + arm64 |
| `make docker-build` | Build everything in Docker |
| `make quick` | Fast rebuild (re-inject + repack + ISO) |
| `make dev-vm` | Launch QEMU dev VM (Linux + macOS) |
| `make test-all` | Run all tests |

## Documentation

- [Architecture Design](docs/design/kubesolo-os-design.md) — full research and technical specification
- [Boot Flow](docs/boot-flow.md) — boot sequence from GRUB to K8s Ready
- [Update Flow](docs/update-flow.md) — A/B atomic update mechanism
- [Cloud-Init](docs/cloud-init.md) — first-boot configuration reference
- [Deployment Guide](docs/deployment-guide.md) — installation, operations, troubleshooting

## Roadmap

| Phase | Scope | Status |
|-------|-------|--------|
| 1 | PoC: boot Tiny Core + KubeSolo, verify K8s | Complete (x86_64) |
| 2 | Cloud-init Go parser, network, hostname | Complete |
| 3 | A/B atomic updates, GRUB, rollback agent | Complete (x86_64) |
| 4 | Ed25519 signing, Portainer Edge, SSH extension | Complete |
| 5 | CI/CD, OCI distribution, Prometheus metrics, ARM64 cross-compile | Complete |
| 6 | Security hardening, AppArmor | Complete |
| - | Custom kernel build for container runtime fixes | Complete (x86_64) |
| 7 | ARM64 generic (mainline kernel, UEFI, virtio) | Complete (v0.3.0, QEMU validated) |
| 8 | Update engine v2 (state machine, channels, OCI, pre-flight gates) | Complete (v0.3.0) |
| - | ARM64 Raspberry Pi (custom kernel, firmware, SD card image) | Paused — needs hardware |
| - | OCI cosign signature verification | Planned for v0.3.1 |
| - | LABEL=KSOLODATA on ARM64 (replace blkid/findfs path) | Planned for v0.3.1 |
| - | Real-hardware ARM64 validation (Graviton / Ampere) | Planned for v0.3.1 |

## License

MIT License — see [LICENSE](LICENSE) for details.