Three changes that should have happened pre-v0.3.0:
1. Add a build-disk-arm64 job that runs on the arm64-linux runner (Odroid),
building kernel + rootfs + disk-image then xz-compressing the .arm64.img.
The previous release.yaml shipped x86_64 only.
2. Replace softprops/action-gh-release@v2 with a direct curl against Gitea's
/api/v1/repos/<owner>/<repo>/releases endpoint. The softprops action
hard-codes api.github.com instead of honouring ${{ github.api_url }},
so on Gitea's act_runner it succeeds silently without creating a
release. The curl path uses the auto-populated ${{ secrets.GITHUB_TOKEN }}
for auth; doc note in ci-runners.md covers the GITEA_TOKEN fallback.
3. Downgrade actions/upload-artifact and actions/download-artifact from
@v4 to @v3 to match Gitea act_runner v1.0.x's compatibility — same fix
we applied to ci.yaml in 0c6e200.
Also compress the x86 disk image with xz before uploading (parity with
the arm64 path, saves ~95% on bandwidth), and emit SHA256SUMS over all
attached artifacts.
docs/ci-runners.md gains a "Workflows in this repo" table, a per-job
breakdown of the release pipeline, the rationale for direct-curl over
the marketplace action, and a "manually re-running a release" section
warning against force-updating published tags.
This commit fixes the workflow but does not retroactively rebuild v0.3.0.
v0.3.0's release page already has the manually-uploaded arm64 image and
SHA256SUMS; x86 users who want the v0.3.0 artifact build from source
(documented in the release body). v0.3.1 will be the first tag that
exercises the fixed workflow end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promote VERSION from 0.3.0-dev to 0.3.0. Finalise CHANGELOG entry with
phases 5-8 work (state machine + metrics, channels + maintenance windows,
OCI multi-arch distribution, pre-flight gates + deeper healthcheck +
auto-rollback). Refresh README quick-start to show both x86_64 and generic
ARM64 paths; update the roadmap status table to mark all v0.3 phases
complete and explicitly track the v0.3.1 follow-ups (OCI cosign,
LABEL=KSOLODATA on ARM64, real-hardware validation).
Add docs/release-notes-0.3.0.md as the operator-facing summary, including a
v0.2.x -> v0.3.0 migration section (non-breaking on live systems) and the
known-limitations list copied from CHANGELOG.
All tests green: cloud-init module, all 10 update-module packages,
shellcheck across init / build / test / hack scripts under the v0.3
severity policy.
Tagging is intentionally NOT done from this commit — that's a manual step
so the operator can decide when v0.3.0 is final. After tagging:
git tag -a v0.3.0 -m "KubeSolo OS v0.3.0"
git push origin v0.3.0
The push triggers .gitea/workflows/build-arm64.yaml which runs the full
ARM64 build on the Odroid runner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove [KSOLO-DBG] per-step echos from init.sh. The /dev/console redirect
stays — it's load-bearing for early-boot visibility on QEMU virt.
Add docs/arm64-status.md capturing the end-of-Phase-3 state:
- What works (full boot through 14 stages, KubeSolo + containerd start)
- Known limitations of the dev setup (QEMU TCG perf, /dev/vda4 hardcode,
busybox-static gaps)
- What's needed to ship v0.3 ARM64 as production-ready
Real-hardware validation (Graviton, Ampere, or similar) is the next gating
step before we can call ARM64 generic done.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 audit finding: existing ARM64 build code is mostly already generic.
Only build-kernel-arm64.sh and rpi-kernel-config.fragment are misnamed (the
former is RPi-only, the latter is actually arch-agnostic). The QEMU virt
harness, modules-arm64.list, extract-core arm64 branch, and inject-kubesolo
arm64 branch are all generic.
This document records the target two-track layout for v0.3.0:
- Generic ARM64: mainline kernel, UEFI, GRUB, virtio, GPT 4-part image
- Raspberry Pi: raspberrypi/linux fork, autoboot.txt, MBR 4-part image
- Shared: init, cloud-init, update agent, modules list, kernel-container fragment
Phases 2 and 3 will execute the migration (rename build-kernel-arm64.sh ->
build-kernel-rpi.sh, write a new mainline build-kernel-arm64.sh, etc.).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add missing flags (--local-storage-shared-path, --debug, --pprof-server,
--portainer-edge-id, --portainer-edge-key, --portainer-edge-async) so all
10 documented KubeSolo parameters can be configured via cloud-init YAML.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement atomic OS updates via A/B partition scheme with automatic
rollback. GRUB bootloader manages slot selection with a 3-attempt
boot counter that auto-rolls back on repeated health check failures.
GRUB boot config:
- A/B slot selection with boot_counter/boot_success env vars
- Automatic rollback when counter reaches 0 (3 failed boots)
- Debug, emergency shell, and manual slot-switch menu entries
Disk image (refactored):
- 4-partition GPT layout: EFI + System A + System B + Data
- GRUB EFI/BIOS installation with graceful fallbacks
- Both system partitions populated during image creation
Update agent (Go, zero external deps):
- pkg/grubenv: read/write GRUB env vars (grub-editenv + manual fallback)
- pkg/partition: find/mount/write system partitions by label
- pkg/image: HTTP download with SHA256 verification
- pkg/health: post-boot checks (containerd, API server, node Ready)
- 6 CLI commands: check, apply, activate, rollback, healthcheck, status
- 37 unit tests across all 4 packages
Deployment:
- K8s CronJob for automatic update checks (every 6 hours)
- ConfigMap for update server URL
- Health check Job for post-boot verification
Build pipeline:
- build-update-agent.sh compiles static Linux binary (~5.9 MB)
- inject-kubesolo.sh includes update agent in initramfs
- Makefile: build-update-agent, test-update-agent, test-update targets
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement a lightweight cloud-init system for first-boot configuration:
- Go parser for YAML config (hostname, network, KubeSolo settings)
- Static/DHCP network modes with DNS override
- KubeSolo extra flags and API server SAN configuration
- Portainer Edge Agent and air-gapped deployment support
- New init stage 45-cloud-init.sh runs before network/hostname stages
- Stages 50/60 skip gracefully when cloud-init has already applied
- Build script compiles static Linux/amd64 binary (~2.7 MB)
- 17 unit tests covering parsing, validation, and example files
- Full documentation at docs/cloud-init.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>