Files
kubesolo-os/docs/ci-runners.md
Adolfo Delorenzo f8c308d9b7
Some checks failed
ARM64 Build / Build generic ARM64 disk image (push) Failing after 3s
CI / Go Tests (push) Successful in 1m40s
CI / Shellcheck (push) Successful in 55s
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Successful in 1m16s
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Successful in 1m21s
ci: fix release.yaml so v0.3.1+ auto-publishes a complete release
Three changes that should have happened pre-v0.3.0:

1. Add a build-disk-arm64 job that runs on the arm64-linux runner (Odroid),
   building kernel + rootfs + disk-image then xz-compressing the .arm64.img.
   The previous release.yaml shipped x86_64 only.

2. Replace softprops/action-gh-release@v2 with a direct curl against Gitea's
   /api/v1/repos/<owner>/<repo>/releases endpoint. The softprops action
   hard-codes api.github.com instead of honouring ${{ github.api_url }},
   so on Gitea's act_runner it succeeds silently without creating a
   release. The curl path uses the auto-populated ${{ secrets.GITHUB_TOKEN }}
   for auth; doc note in ci-runners.md covers the GITEA_TOKEN fallback.

3. Downgrade actions/upload-artifact and actions/download-artifact from
   @v4 to @v3 to match Gitea act_runner v1.0.x's compatibility — same fix
   we applied to ci.yaml in 0c6e200.

Also compress the x86 disk image with xz before uploading (parity with
the arm64 path, saves ~95% on bandwidth), and emit SHA256SUMS over all
attached artifacts.

docs/ci-runners.md gains a "Workflows in this repo" table, a per-job
breakdown of the release pipeline, the rationale for direct-curl over
the marketplace action, and a "manually re-running a release" section
warning against force-updating published tags.

This commit fixes the workflow but does not retroactively rebuild v0.3.0.
v0.3.0's release page already has the manually-uploaded arm64 image and
SHA256SUMS; x86 users who want the v0.3.0 artifact build from source
(documented in the release body). v0.3.1 will be the first tag that
exercises the fixed workflow end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 20:18:41 -06:00

223 lines
7.9 KiB
Markdown

# CI Runners
KubeSolo OS is built and tested on Gitea Actions runners. This document records the
runners currently in service and how to register a new one if a host is wiped.
## Active runners
| Name | Host | Arch | OS | Labels | Notes |
|------|------|------|-----|--------|-------|
| `odroid-arm64` | `odroid.local` | aarch64 | Ubuntu 22.04 LTS | `arm64-linux`, `ubuntu-latest`, `ubuntu-24.04`, `ubuntu-22.04` | Native ARM64 builder; 6 cores, 1.8 GB RAM + 4 GB swap; runs as systemd service `act_runner` |
## Workflow targeting
ARM64-specific jobs target the Odroid via the `arm64-linux` label:
```yaml
jobs:
build-arm64:
runs-on: arm64-linux
steps:
- uses: actions/checkout@v4
- run: make rootfs-arm64
```
Generic ubuntu jobs that don't care about arch fall through to whichever runner picks
them up first; on the Odroid they run in Docker via the `ubuntu-latest` /
`ubuntu-22.04` / `ubuntu-24.04` labels.
## Workflows in this repo
| Workflow file | Trigger | Where it runs | What it produces |
|---|---|---|---|
| `.gitea/workflows/ci.yaml` | push / PR to main | ubuntu-latest | Go tests, cross-arch binary build, shellcheck |
| `.gitea/workflows/build-arm64.yaml` | push to main, tags `v*`, manual | `arm64-linux` (Odroid) | ARM64 kernel + rootfs + disk image; uploads as workflow artifact only |
| `.gitea/workflows/release.yaml` | tags `v*` | mix: ubuntu-latest + `arm64-linux` | Full release: x86 ISO + disk, ARM64 disk, Go binaries, SHA256SUMS — posted to Gitea Releases via API |
### Release workflow specifics
`release.yaml` is what fires when you `git push origin vX.Y.Z`. The pipeline:
1. **test**`go test` cloud-init + update modules (ubuntu-latest).
2. **build-binaries** — cross-compiles `kubesolo-cloudinit` and
`kubesolo-update` for linux-amd64 + linux-arm64 with the version baked
in via `-X main.version=…`.
3. **build-iso-amd64** — runs `make iso disk-image` on ubuntu-latest;
produces the x86_64 ISO and a `.img.xz` compressed disk image.
4. **build-disk-arm64** — runs the same flow on the Odroid (`arm64-linux`
label); produces `.arm64.img.xz`.
5. **release** — downloads everything, computes `SHA256SUMS`, calls
Gitea's `POST /api/v1/repos/<owner>/<repo>/releases` to create the
release, then `POST .../releases/<id>/assets?name=…` once per asset.
Authentication uses Gitea's built-in `${{ secrets.GITHUB_TOKEN }}` — the
runner auto-populates that secret with repo-write scope. If your runner
is configured without that automatic token (e.g. an older `act_runner`),
generate a personal access token with `repo:write` scope, add it as an
org secret named `GITEA_TOKEN`, and swap the `TOKEN: ${{ secrets.GITHUB_TOKEN }}`
line in `release.yaml` for `TOKEN: ${{ secrets.GITEA_TOKEN }}`.
### Why not the GitHub Marketplace release actions?
`release.yaml` used to call `softprops/action-gh-release@v2`. That action
hard-codes calls to `api.github.com` instead of using `${{ github.api_url }}`
(which Gitea sets to its own API). On Gitea's act_runner the action fails
silently — the job reports green but no release is created. We replaced
it with a direct `curl` so the behaviour is explicit and debuggable.
Similarly, `actions/upload-artifact@v4` and `@download-artifact@v4` are not
fully implemented by act_runner v1.0.x. Pin to `@v3` until upstream
support catches up.
### Manually re-running a release
Releases are immutable once published, but you can:
- **Delete and recreate the release** through the Gitea UI on the
`releases/tag/vX.Y.Z` page, then push the tag again (Gitea reuses the
existing tag), and re-trigger the workflow via the Actions UI.
- **Trigger the build-arm64 workflow manually** for a one-off arm64
artifact: Gitea UI → Actions → ARM64 Build → Run workflow.
Don't force-update a published tag — anyone who already fetched it (or
downloaded an asset) sees a checksum mismatch. Prefer cutting a new patch
release (vX.Y.Z+1) over rewriting a published one.
## Registering a new runner
### Prerequisites
- Linux host (Ubuntu / Debian preferred; the install instructions below use Ubuntu
22.04+ paths).
- Outbound HTTPS to the Gitea instance.
- Root access on the runner host (the runner needs to create loop devices and run
`mkfs.ext4` for disk-image builds).
- A Gitea Actions runner registration token. Get it from:
- **Repo-scoped:** `<repo>/settings/actions/runners` → "Create new Runner"
- **Org-scoped (preferred for this project):** `<org>/-/settings/actions/runners`
"Create new Runner"
- **Site-scoped:** `/-/admin/actions/runners` → "Create new Runner"
### Step 1 — Add swap if the host has <4 GB RAM
Kernel builds in later phases need ~2 GB resident; tight hosts will OOM-kill `cc1`
without swap.
```bash
sudo fallocate -l 4G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```
### Step 2 — Install the gitea-runner binary
Pinned to a known-good version. Check
<https://gitea.com/gitea/runner/releases> for the current stable tag before
bumping.
```bash
sudo -i
mkdir -p /opt/act_runner && cd /opt/act_runner
# Bump VERSION to the current stable release as needed
VERSION=1.0.3
ARCH=$(uname -m | sed 's/aarch64/arm64/; s/x86_64/amd64/')
curl -fL "https://gitea.com/gitea/runner/releases/download/v${VERSION}/gitea-runner-${VERSION}-linux-${ARCH}" \
-o act_runner
chmod +x act_runner
./act_runner --version
```
> The upstream project was renamed `act_runner` → `gitea-runner` at the v1.0.0
> release. The release asset filenames use `gitea-runner-*` even though we keep the
> local binary named `act_runner` to match this systemd unit. The CLI surface
> (`register`, `daemon`, `generate-config`) is unchanged.
### Step 3 — Register against Gitea
```bash
./act_runner register --no-interactive \
--instance https://git.oe74.net \
--token PASTE_TOKEN_HERE \
--name <hostname> \
--labels arm64-linux # adjust label for amd64 hosts
```
This creates a `.runner` file with the registration credentials.
### Step 4 — Generate and tune config
```bash
./act_runner generate-config > config.yaml
```
In `config.yaml`, confirm the `runner.labels:` block includes the labels you want.
The `:host` suffix routes jobs directly to the host (no Docker wrapper) — required
for disk-image builds that need loop devices and `mkfs`.
Example labels for an arm64 host:
```yaml
runner:
labels:
- "arm64-linux:host"
- "ubuntu-latest:docker://docker.gitea.com/runner-images:ubuntu-latest"
- "ubuntu-24.04:docker://docker.gitea.com/runner-images:ubuntu-24.04"
- "ubuntu-22.04:docker://docker.gitea.com/runner-images:ubuntu-22.04"
```
### Step 5 — Install as a systemd service
```bash
cat > /etc/systemd/system/act_runner.service << 'EOF'
[Unit]
Description=Gitea Actions runner
After=network-online.target
Wants=network-online.target
[Service]
ExecStart=/opt/act_runner/act_runner daemon --config /opt/act_runner/config.yaml
WorkingDirectory=/opt/act_runner
User=root
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload
systemctl enable --now act_runner
systemctl status act_runner --no-pager
```
### Step 6 — Verify in Gitea UI
Visit the runners page at the scope you registered against. The runner should appear
as `Idle` with the labels you configured.
## Removing a runner
On the host:
```bash
systemctl disable --now act_runner
rm -rf /opt/act_runner /etc/systemd/system/act_runner.service
systemctl daemon-reload
```
Then delete the runner entry from the Gitea Actions UI so Gitea stops trying to
schedule against it.
## Operational notes
- The runner stores in-progress job working directories under `/tmp/act_runner` by
default. Large disk-image builds may need that path moved to a larger volume —
edit `host.workdir_parent:` in `config.yaml`.
- Logs are visible via `journalctl -u act_runner -f`.
- If a job is interrupted (e.g. host reboot mid-build), the Gitea UI will mark it as
failed/cancelled. Re-run from the Actions UI.