# KubeSolo OS v0.3.0 Release Notes

**Released:** 2026-05-14

v0.3.0 is the second feature release after v0.2.0 and the first release that ships a generic ARM64 build alongside x86_64. The update agent grew up: it now has an explicit on-disk lifecycle, OCI registry distribution, and a fleet-friendly set of policy gates (channels, maintenance windows, version-stepping-stones, pre-flight checks, auto-rollback).

This document is the operator-facing summary. The full per-phase changelog lives in [CHANGELOG.md](../CHANGELOG.md).

## What's new

### Generic ARM64 build

The image you build with `make disk-image-arm64` now targets any UEFI-capable ARM64 host: AWS Graviton, Oracle Ampere, generic ARM64 servers, future SBCs with UEFI-compatible firmware. The kernel comes from kernel.org mainline LTS (6.12.10 by default, configurable via `MAINLINE_KERNEL_VERSION` in `build/config/versions.env`).

This is **distinct** from the Raspberry Pi build path. RPi keeps its specialised kernel from `raspberrypi/linux` with bcm-defconfig + custom DTBs; the generic ARM64 path uses mainline + arm64-defconfig + UEFI/virtio. See [docs/arm64-architecture.md](arm64-architecture.md) for the file-by-file split.

KubeSolo bumped to **v1.1.5** (was v1.1.0). New flags surfaced via cloud-init:

- `kubesolo.full`: disable edge-optimised k8s overrides
- `kubesolo.disable-ipv6`: disable IPv6 cluster-wide
- `kubesolo.db-wal-repair`: recover from unclean shutdowns

### Update lifecycle is now observable

The update agent writes a `state.json` at `/var/lib/kubesolo/update/state.json` recording where the current attempt is in the lifecycle:

```
idle → checking → downloading → staged → activated → verifying → success ↘ rolled_back ↘ failed
```

`kubesolo-update status --json` emits the full state for orchestration tooling.

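For orchestration tooling that polls the status output, it can help to model the nine phase names from the diagram and decide when an attempt has finished. The following is a minimal sketch, not the agent's own code: the `phases` slice and `isTerminal` helper are hypothetical names, and treating `success`, `rolled_back`, and `failed` as end states is an assumption read off the diagram.

```go
package main

import "fmt"

// phases lists the nine lifecycle phase names shown in the diagram.
var phases = []string{
	"idle", "checking", "downloading", "staged", "activated",
	"verifying", "success", "rolled_back", "failed",
}

// isTerminal reports whether an attempt in this phase has finished.
// Assumption: success, rolled_back and failed are the end states.
func isTerminal(phase string) bool {
	switch phase {
	case "success", "rolled_back", "failed":
		return true
	}
	return false
}

func main() {
	// Print each phase and whether a poller could stop waiting on it.
	for _, p := range phases {
		fmt.Printf("%-12s terminal=%v\n", p, isTerminal(p))
	}
}
```

A poller built on this would keep reading the status until `isTerminal` returns true for the reported phase.
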
The Prometheus metrics endpoint gains three new series:

- `kubesolo_update_phase{phase="..."}`: 1 for the current phase, 0 for the others (all 9 are always emitted)
- `kubesolo_update_attempts_total`
- `kubesolo_update_last_attempt_timestamp_seconds`

### OCI registry distribution

Update artifacts can now be pulled from any OCI-compliant registry alongside the existing HTTP `latest.json` protocol:

```bash
# HTTP, unchanged from v0.2:
kubesolo-update apply --server https://updates.example.com

# New: OCI from ghcr.io (or quay.io, harbor, zot, ...)
kubesolo-update apply --registry ghcr.io/yourorg/kubesolo-os --tag stable
```

Multi-arch is handled transparently: the same `stable` tag points at a manifest index, and the agent picks the manifest matching its `runtime.GOARCH`.

Publish your own artifacts with `build/scripts/push-oci-artifact.sh`. See the script's header comment for the full publishing flow.

### Policy gates

`apply` now enforces five gates before destroying the passive slot:

1. **Maintenance window** (configurable, e.g. `03:00-05:00`; wrapping midnight supported)
2. **Node-block-label**: refuses if the K8s node carries `updates.kubesolo.io/block=true` (workload-author kill switch)
3. **Channel**: `stable` / `beta` / `edge` must match between the artifact metadata and the local channel
4. **Architecture**: refuses cross-arch artifacts via a `runtime.GOARCH` check
5. **Min compatible version**: stepping-stone enforcement; refuses an upgrade that bypasses a required intermediate version

`--force` bypasses the maintenance window and node-block label (channel / arch / min-version are non-negotiable). Failures are recorded in `state.json` with a clear `LastError` field.

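The maintenance-window gate, including windows that wrap midnight, can be illustrated with a short sketch. This is not the agent's implementation: `parseWindow` and `inWindow` are hypothetical helpers, and treating the window as half-open (start inclusive, end exclusive) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseWindow parses "HH:MM-HH:MM" into start and end minutes since midnight.
func parseWindow(s string) (start, end int, err error) {
	parts := strings.SplitN(s, "-", 2)
	if len(parts) != 2 {
		return 0, 0, fmt.Errorf("invalid window %q: want HH:MM-HH:MM", s)
	}
	a, err := time.Parse("15:04", strings.TrimSpace(parts[0]))
	if err != nil {
		return 0, 0, fmt.Errorf("invalid window start: %w", err)
	}
	b, err := time.Parse("15:04", strings.TrimSpace(parts[1]))
	if err != nil {
		return 0, 0, fmt.Errorf("invalid window end: %w", err)
	}
	return a.Hour()*60 + a.Minute(), b.Hour()*60 + b.Minute(), nil
}

// inWindow reports whether minute-of-day m falls inside [start, end).
// When start > end the window wraps midnight (e.g. 23:00-01:00).
// Assumption: start == end is treated as an empty window.
func inWindow(start, end, m int) bool {
	if start <= end {
		return m >= start && m < end
	}
	return m >= start || m < end
}

func main() {
	start, end, err := parseWindow("23:00-01:00")
	if err != nil {
		panic(err)
	}
	now := time.Now()
	fmt.Println("inside window:", inWindow(start, end, now.Hour()*60+now.Minute()))
}
```

Comparing minutes since midnight keeps the wrap-around case to a single branch: a window whose start is later than its end matches any time after the start or before the end.
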
### Healthcheck deepening + auto-rollback

`kubesolo-update healthcheck` grew three optional probes:

- **Kube-system pods** must hold Running for ≥ N seconds before passing
- **Operator probe URL**: GET an operator-supplied endpoint; 200 = pass
- **Disk smoke test**: write/fsync/read/delete a probe file under `/var/lib/kubesolo` to catch a wedged data partition

Plus auto-rollback: with `--auto-rollback-after N` (or `auto_rollback_after=` in `update.conf`), after N consecutive post-activation failures, the agent calls `ForceRollback()` and the operator/init is expected to reboot. The counter resets on a clean pass.

### Persistent configuration via `/etc/kubesolo/update.conf`

Cloud-init writes this file on first boot from a new `updates:` block; you can also hand-edit it. Recognised keys:

```
server = https://updates.example.com   # or omit if using registry
registry =                             # OCI registry ref (alt to server)
channel = stable
maintenance_window = 03:00-05:00
pubkey = /etc/kubesolo/update-pubkey.hex
healthcheck_url = http://localhost:8000/ready
auto_rollback_after = 3
```

The full cloud-init reference is at [cloud-init/examples/full-config.yaml](../cloud-init/examples/full-config.yaml).

## Migration from v0.2.x

This is a non-breaking release for live systems. v0.2.x → v0.3.0 changes:

- **`state.json` will appear** at `/var/lib/kubesolo/update/state.json` the first time a v0.3 agent runs `apply`. Pre-existing v0.2 deployments without this file are fine: the agent treats a missing file as fresh Idle state.
- **`update.conf` is optional**. v0.2 deployments that pass everything via CLI flags keep working unchanged.
- **HTTP `latest.json` protocol unchanged**. Existing update servers don't need a rebuild.
- **GRUB env (boot counter, active slot)** unchanged. The bootloader's rollback behaviour is the same.
- **No new mandatory kernel command-line parameters**.

To opt into the new lifecycle, transports, and gates, drop in an `update.conf` (or update cloud-init) and switch to `--registry` if you want OCI distribution.

## Known limitations

These shipped intentionally with v0.3.0 and are explicitly tracked for v0.3.1+:

- **OCI signature verification**: the OCI transport is digest-verified end-to-end via oras-go, but does not yet consume cosign-style referrer attestations. The HTTP transport still honours `--pubkey` for `.sig` files.
- **ARM64 LABEL=KSOLODATA** resolution doesn't work yet: piCore's `blkid`/`findfs` crash on QEMU virt under our mainline kernel; the static `busybox-static` we ship doesn't include those applets. `build/grub/grub-arm64.cfg` hardcodes `kubesolo.data=/dev/vda4` as a workaround. On real ARM64 hardware the device path may differ.
- **Real-hardware ARM64 validation** is pending. The image builds and boots end-to-end under QEMU virt; production certification waits on a Graviton / Ampere run.
- **AppArmor profile load fails on ARM64** (`apparmor_parser` ABI mismatch). Init reports the failure; boot continues without AppArmor enforcement.
- **QEMU TCG performance** can trigger KubeSolo's first-boot image-import deadline. Not an OS defect; real hardware and KVM-accelerated QEMU complete the import in seconds.

## How to upgrade your build host

```bash
git pull
make distclean      # optional: drops the build cache; full rebuild takes ~30 min
make iso            # or disk-image, or disk-image-arm64
```

The Docker-based builder (`make docker-build`) regenerates its own image from `build/Dockerfile.builder` on next invocation; oras 1.2.3 and busybox-static are now included.

## Acknowledgements

v0.3.0 work was driven by a single multi-week pair-programming session working through Phases 0–9 of the v0.3 roadmap. The Odroid self-hosted Gitea Actions runner (`odroid.local`, arm64-linux) carried every ARM64 build during development.