KubeSolo OS v0.3.0: Release Notes
Released: 2026-05-14
v0.3.0 is the second feature release after v0.2.0 and the first release that
ships a generic ARM64 build alongside x86_64. The update agent grew up: it
now has an explicit on-disk lifecycle, OCI registry distribution, and a
fleet-friendly set of policy gates (channels, maintenance windows,
version stepping stones, pre-flight checks, auto-rollback).

This document is the operator-facing summary. The full per-phase changelog
lives in CHANGELOG.md.

What's new
Generic ARM64 build
The image you build with make disk-image-arm64 now targets any UEFI-capable
ARM64 host: AWS Graviton, Oracle Ampere, generic ARM64 servers, and future SBCs
with UEFI-compatible firmware. The kernel comes from kernel.org mainline LTS
(6.12.10 by default, configurable via MAINLINE_KERNEL_VERSION in
build/config/versions.env).

This is distinct from the Raspberry Pi build path. RPi keeps its
specialised kernel from raspberrypi/linux with bcm-defconfig + custom DTBs;
the generic ARM64 path uses mainline + arm64-defconfig + UEFI/virtio. See
docs/arm64-architecture.md for the file-by-file split.

KubeSolo bumped to v1.1.5 (was v1.1.0). New flags surfaced via cloud-init:

- kubesolo.full: disable edge-optimised k8s overrides
- kubesolo.disable-ipv6: disable IPv6 cluster-wide
- kubesolo.db-wal-repair: recover from unclean shutdowns
Update lifecycle is now observable
The update agent writes a state file at /var/lib/kubesolo/update/state.json
recording where the current attempt is in the lifecycle:

    idle → checking → downloading → staged → activated → verifying → success
                                                                   ↘ rolled_back
                                                                   ↘ failed

kubesolo-update status --json emits the full state for orchestration tooling.
The Prometheus metrics endpoint gains three new series:

- kubesolo_update_phase{phase="..."}: 1 for the current phase, 0 for the
  others (all 9 are always emitted)
- kubesolo_update_attempts_total
- kubesolo_update_last_attempt_timestamp_seconds
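
As an illustration of the "all 9 always emitted" behaviour, the sketch below renders one sample per lifecycle phase in Prometheus text format. It is not the agent's actual code: the function name phaseSamples and the phases slice are hypothetical, and only the metric name and the nine phase names come from these notes.

```go
package main

import "fmt"

// phases lists the nine lifecycle phases named in the release notes.
var phases = []string{
	"idle", "checking", "downloading", "staged", "activated",
	"verifying", "success", "rolled_back", "failed",
}

// phaseSamples (hypothetical helper) returns one text-format sample per
// phase: value 1 for the current phase and 0 for every other phase, so a
// scraper always sees all nine series.
func phaseSamples(current string) []string {
	samples := make([]string, 0, len(phases))
	for _, p := range phases {
		v := 0
		if p == current {
			v = 1
		}
		samples = append(samples, fmt.Sprintf("kubesolo_update_phase{phase=%q} %d", p, v))
	}
	return samples
}
```

Emitting every phase on each scrape lets a dashboard or alert rule select on a single label value without having to handle absent series.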
OCI registry distribution
Update artifacts can now be pulled from any OCI-compliant registry alongside
the existing HTTP latest.json protocol:

    # HTTP, unchanged from v0.2:
    kubesolo-update apply --server https://updates.example.com

    # New: OCI from ghcr.io (or quay.io, harbor, zot, ...)
    kubesolo-update apply --registry ghcr.io/yourorg/kubesolo-os --tag stable

Multi-arch is handled transparently: the same stable tag points at a
manifest index, and the agent picks the manifest matching its runtime.GOARCH.

Publish your own artifacts with build/scripts/push-oci-artifact.sh. See
the script's header comment for the full publishing flow.

Policy gates
apply now enforces five gates before destroying the passive slot:

- Maintenance window (configurable, e.g. 03:00-05:00; wrapping midnight
  supported)
- Node-block label: refuses if the K8s node carries
  updates.kubesolo.io/block=true (workload-author kill switch)
- Channel: stable/beta/edge must match between the artifact metadata and
  the local channel
- Architecture: refuses cross-arch artifacts via a runtime.GOARCH check
- Min compatible version: stepping-stone enforcement; refuses an upgrade
  that bypasses a required intermediate version
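
The midnight-wrapping maintenance window is the least obvious of these gates, so here is a minimal sketch of how such a check could work. The helper names parseWindow and inWindow are hypothetical, and treating the end time as exclusive is an assumption; the notes only specify the HH:MM-HH:MM format and that wrapping windows are supported.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// parseWindow (hypothetical helper) parses "HH:MM-HH:MM" into start and end
// expressed as minutes since midnight.
func parseWindow(s string) (int, int, error) {
	from, to, ok := strings.Cut(s, "-")
	if !ok {
		return 0, 0, fmt.Errorf("window %q: want HH:MM-HH:MM", s)
	}
	a, err := time.Parse("15:04", from)
	if err != nil {
		return 0, 0, fmt.Errorf("window start: %w", err)
	}
	b, err := time.Parse("15:04", to)
	if err != nil {
		return 0, 0, fmt.Errorf("window end: %w", err)
	}
	return a.Hour()*60 + a.Minute(), b.Hour()*60 + b.Minute(), nil
}

// inWindow reports whether minute-of-day now lies in [start, end).
// Assumption: the end bound is exclusive. When start > end the window
// wraps midnight (e.g. 23:00-01:00), so either side of midnight matches.
func inWindow(start, end, now int) bool {
	if start <= end {
		return now >= start && now < end
	}
	return now >= start || now < end
}
```

With a window of 23:00-01:00, both 23:30 and 00:30 fall inside it while 12:00 does not; with 03:00-05:00, 04:00 is inside and 05:00 is outside under the exclusive-end assumption.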
--force bypasses the maintenance window and node-block label (channel /
arch / min-version are non-negotiable). Failures are recorded in state.json
with a clear LastError field.

Healthcheck deepening + auto-rollback
kubesolo-update healthcheck grew three optional probes:

- Kube-system pods must hold Running for ≥ N seconds before passing
- Operator probe URL: GET an operator-supplied endpoint; 200 = pass
- Disk smoke test: write/fsync/read/delete a probe file under
  /var/lib/kubesolo to catch a wedged data partition
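
The disk smoke test follows a write/fsync/read/delete sequence that is easy to sketch. The version below is an illustration only, not the agent's implementation: the function name diskSmokeTest, the probe file naming, and the payload are all assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// diskSmokeTest (hypothetical helper) writes a probe file in dir, fsyncs
// it, reads it back, and deletes it. Any failing step returns an error.
func diskSmokeTest(dir string) error {
	payload := []byte("kubesolo-disk-probe")

	f, err := os.CreateTemp(dir, ".probe-*")
	if err != nil {
		return fmt.Errorf("create: %w", err)
	}
	path := f.Name()
	defer os.Remove(path) // best-effort cleanup if a later step fails

	if _, err := f.Write(payload); err != nil {
		f.Close()
		return fmt.Errorf("write: %w", err)
	}
	// Sync forces the data to the device; a wedged partition is expected
	// to fail or stall at this step.
	if err := f.Sync(); err != nil {
		f.Close()
		return fmt.Errorf("fsync: %w", err)
	}
	if err := f.Close(); err != nil {
		return fmt.Errorf("close: %w", err)
	}

	got, err := os.ReadFile(path)
	if err != nil {
		return fmt.Errorf("read: %w", err)
	}
	if !bytes.Equal(got, payload) {
		return fmt.Errorf("read back %d bytes, content mismatch", len(got))
	}
	if err := os.Remove(path); err != nil {
		return fmt.Errorf("delete: %w", err)
	}
	return nil
}
```

A caller would pass the data directory, for example diskSmokeTest("/var/lib/kubesolo"). Because a hung device can block rather than fail, a real probe would probably also run this under a timeout.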
Plus auto-rollback: with --auto-rollback-after N (or auto_rollback_after=
in update.conf), after N consecutive post-activation failures, the agent
calls ForceRollback() and the operator/init is expected to reboot. The
counter resets on a clean pass.

Persistent configuration via /etc/kubesolo/update.conf

Cloud-init writes this file on first boot from a new updates: block; you
can also hand-edit it. Recognised keys:

    server = https://updates.example.com   # or omit if using registry
    registry =                             # OCI registry ref (alt to server)
    channel = stable
    maintenance_window = 03:00-05:00
    pubkey = /etc/kubesolo/update-pubkey.hex
    healthcheck_url = http://localhost:8000/ready
    auto_rollback_after = 3

The full cloud-init reference is at cloud-init/examples/full-config.yaml.

Migration from v0.2.x
This is a non-breaking release for live systems. v0.2.x → v0.3.0 changes:

- state.json will appear at /var/lib/kubesolo/update/state.json the first
  time a v0.3 agent runs apply. Pre-existing v0.2 deployments without this
  file are fine: the agent treats a missing file as fresh Idle state.
- update.conf is optional. v0.2 deployments that pass everything via CLI
  flags keep working unchanged.
- HTTP latest.json protocol unchanged. Existing update servers don't need a
  rebuild.
- GRUB env (boot counter, active slot) unchanged. The bootloader's rollback
  behaviour is the same.
- No new mandatory kernel command-line parameters.
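
The "missing file means fresh Idle" rule in the first item can be pictured as follows. This is a sketch under stated assumptions, not the agent's code: the State struct, its JSON keys, and the loadState name are hypothetical; only the LastError field and the idle phase are named in these notes.

```go
package main

import (
	"encoding/json"
	"errors"
	"io/fs"
	"os"
)

// State is a hypothetical subset of state.json. The JSON key names are
// assumptions made for this example.
type State struct {
	Phase     string `json:"phase"`
	LastError string `json:"last_error,omitempty"`
}

// loadState reads the state file at path. A missing file is not an error:
// it yields a fresh idle state, which is what a v0.2 deployment upgrading
// to v0.3 would start from.
func loadState(path string) (State, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, fs.ErrNotExist) {
		return State{Phase: "idle"}, nil
	}
	if err != nil {
		return State{}, err
	}
	var s State
	if err := json.Unmarshal(data, &s); err != nil {
		return State{}, err
	}
	return s, nil
}
```

Only the not-exist case maps to idle here; a permission error or corrupt JSON is still surfaced so that a damaged state file is not silently treated as a clean start.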
To opt into the new lifecycle, transports, and gates, drop in an
update.conf (or update cloud-init) and switch to --registry if you want
OCI distribution.

Known limitations
These shipped intentionally with v0.3.0 and are explicitly tracked for
v0.3.1+:

- OCI signature verification: the OCI transport is digest-verified
  end-to-end via oras-go, but does not yet consume cosign-style referrer
  attestations. The HTTP transport still honours --pubkey for .sig files.
- ARM64 LABEL=KSOLODATA resolution doesn't work yet: piCore's blkid/findfs
  crash on QEMU virt under our mainline kernel; the static busybox-static
  we ship doesn't include those applets. build/grub/grub-arm64.cfg
  hardcodes kubesolo.data=/dev/vda4 as a workaround. On real ARM64 hardware
  the device path may differ.
- Real-hardware ARM64 validation is pending. The image builds and boots
  end-to-end under QEMU virt; production certification waits on a
  Graviton / Ampere run.
- AppArmor profile load fails on ARM64 (apparmor_parser ABI mismatch).
  Init reports the failure; boot continues without AppArmor enforcement.
- QEMU TCG performance can trigger KubeSolo's first-boot image-import
  deadline. Not an OS defect; real hardware and KVM-accelerated QEMU
  complete the import in seconds.
How to upgrade your build host
    git pull
    make distclean   # optional: drops the build cache; full rebuild takes ~30 min
    make iso         # or disk-image, or disk-image-arm64

The Docker-based builder (make docker-build) regenerates its own image
from build/Dockerfile.builder on next invocation; oras 1.2.3 and
busybox-static are now included.

Acknowledgements
v0.3.0 work was driven by a single multi-week pair-programming session
working through Phases 0–9 of the v0.3 roadmap. The Odroid self-hosted
Gitea Actions runner (odroid.local, arm64-linux) carried every ARM64
build during development.