feat(update): OCI registry distribution for update artifacts
Some checks failed
ARM64 Build / Build generic ARM64 disk image (push) Failing after 4s
CI / Go Tests (push) Successful in 1m28s
CI / Shellcheck (push) Successful in 45s
CI / Build Go Binaries (amd64, linux, linux-amd64) (push) Successful in 1m17s
CI / Build Go Binaries (arm64, linux, linux-arm64) (push) Successful in 1m13s

Phase 7 of v0.3. The update agent can now pull update artifacts from any
OCI-compliant registry (ghcr.io, quay.io, harbor, zot, etc.) alongside the
existing HTTP latest.json protocol. Multi-arch artifacts are resolved
through manifest indexes so the same tag (e.g. "stable") yields the
right kernel + initramfs for runtime.GOARCH.

New package update/pkg/oci (~280 LOC, 9 tests):
- Client wraps oras-go/v2's remote.Repository. NewClient parses
  host/path references; WithPlainHTTP toggle for httptest.
- FetchMetadata resolves a tag and returns image.UpdateMetadata from
  manifest annotations (io.kubesolo.os.{version,channel,architecture,
  min_compatible_version,release_notes,release_date}). No blobs fetched.
- Pull resolves the tag, walks index → arch-specific manifest, downloads
  kernel + initramfs layers identified by their custom media types
  (application/vnd.kubesolo.os.kernel.v1+octet-stream and
  application/vnd.kubesolo.os.initramfs.v1+gzip), verifies their digests
  against the manifest, returns the same image.StagedImage shape the
  HTTP client produces.
- Cross-arch single-arch manifests are refused via the AnnotArch check
  (defense in depth on top of the gates in cmd/apply.go).
- Tests use a hand-rolled httptest registry implementing /v2/probe,
  manifest fetch by tag-or-digest, blob fetch by digest. Cover index
  arch-selection, single-arch manifests, missing-arch error, tampered
  blob rejection (digest mismatch), and reference parsing.

Dependencies added: oras.land/oras-go/v2 v2.6.0 plus its transitive
opencontainers/{go-digest,image-spec} and golang.org/x/sync. All small
and well-maintained; total binary size impact is negligible relative to
the existing 6.1 MB update agent.

cmd/apply.go:
- New --registry and --tag flags; mutually exclusive with --server.
- applyMetadataGates extracted as a helper, called from both transports
  so channel/arch/min-version policy is enforced identically regardless
  of how metadata was fetched.
- State transitions identical to the HTTP path: Checking → Downloading
  → Staged, with RecordError on any failure.

cmd/opts.go: --registry, --tag CLI flags. update.conf "server=" already
accepts either an HTTP URL or an OCI ref; the agent distinguishes by
which CLI/conf field carries the value.

build/scripts/push-oci-artifact.sh: new tool that publishes a single-arch
update artifact via the oras CLI with our custom media types and
annotations. After running for each arch, the operator composes the
multi-arch index with `oras manifest index create`. Documented inline.

build/Dockerfile.builder: installs oras 1.2.3 from upstream releases so
the Gitea Actions build container can run the new script.

Signature verification on the OCI path is intentionally deferred — the
artifact format is digest-verified end-to-end via oras-go, and Ed25519
signature consumption via OCI referrers is a follow-up. Plain HTTP
clients keep their existing signature path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-14 18:58:38 -06:00
parent dfed6ddba8
commit 28de656b97
9 changed files with 954 additions and 70 deletions

281
update/pkg/oci/oci.go Normal file
View File

@@ -0,0 +1,281 @@
// Package oci pulls KubeSolo OS update artifacts from an OCI-compliant
// container registry (e.g. ghcr.io). It is the registry-native alternative
// to the legacy HTTP `latest.json` protocol implemented in pkg/image.
//
// # Artifact layout
//
// An update is published as a single OCI artifact under a tag like
// `stable` or `v0.3.0`. The tag may point at either:
//
// - A manifest index (preferred) containing per-architecture manifests.
// The agent picks the one matching runtime.GOARCH.
// - A single manifest (used for arch-specific tags such as
// `v0.3.0-amd64`). The agent verifies architecture against the
// manifest's platform annotation before trusting it.
//
// Each per-architecture manifest carries two layers:
//
// application/vnd.kubesolo.os.kernel.v1+octet-stream // vmlinuz / Image
// application/vnd.kubesolo.os.initramfs.v1+gzip // kubesolo-os.gz
//
// And these annotations (read into image.UpdateMetadata):
//
// io.kubesolo.os.version "v0.3.0"
// io.kubesolo.os.channel "stable"
// io.kubesolo.os.min_compatible_version "v0.2.0"
// io.kubesolo.os.architecture "amd64"
// io.kubesolo.os.release_notes (optional, short)
// io.kubesolo.os.release_date (optional, RFC3339)
//
// The agent ignores any additional layers, so the same image can also be
// shaped as a "scratch" container if the build pipeline finds that convenient
// for ecosystem tooling.
package oci
import (
"context"
"encoding/json"
"errors"
"fmt"
"io"
"log/slog"
"os"
"path/filepath"
"runtime"
"github.com/opencontainers/go-digest"
ocispec "github.com/opencontainers/image-spec/specs-go/v1"
"oras.land/oras-go/v2/content"
"oras.land/oras-go/v2/registry/remote"
"github.com/portainer/kubesolo-os/update/pkg/image"
)
// Media types used on KubeSolo OS update artifacts. Kept here (not in
// pkg/image) so the OCI protocol surface is fully self-contained.
const (
MediaKernel = "application/vnd.kubesolo.os.kernel.v1+octet-stream"
MediaInitramfs = "application/vnd.kubesolo.os.initramfs.v1+gzip"
AnnotVersion = "io.kubesolo.os.version"
AnnotChannel = "io.kubesolo.os.channel"
AnnotMinVersion = "io.kubesolo.os.min_compatible_version"
AnnotArch = "io.kubesolo.os.architecture"
AnnotReleaseNote = "io.kubesolo.os.release_notes"
AnnotReleaseDate = "io.kubesolo.os.release_date"
)
// Client pulls artifacts from a single OCI repository (e.g.
// `ghcr.io/portainer/kubesolo-os`).
//
// Anonymous (public-pull) access is supported out of the box. For private
// repositories, configure auth via the underlying remote.Repository.Client
// before passing it to Resolve/Pull — that hook isn't surfaced here yet
// (deferred until we actually need it for a private fleet).
type Client struct {
repo *remote.Repository
// Arch is the architecture string we match against manifest indexes.
// Defaults to runtime.GOARCH; overridable for testing.
Arch string
}
// NewClient parses a repository reference of the form `host/path` (no tag)
// and returns a ready-to-use Client.
func NewClient(repoRef string) (*Client, error) {
repo, err := remote.NewRepository(repoRef)
if err != nil {
return nil, fmt.Errorf("invalid OCI reference %q: %w", repoRef, err)
}
// remote.NewRepository defaults to HTTPS. PlainHTTP is set per-test
// via the WithPlainHTTP option when we hit a httptest.Server.
return &Client{repo: repo, Arch: runtime.GOARCH}, nil
}
// WithPlainHTTP toggles the underlying registry transport to HTTP. Useful for
// httptest-driven unit tests; do not use against production registries.
func (c *Client) WithPlainHTTP(plain bool) *Client {
c.repo.PlainHTTP = plain
return c
}
// FetchMetadata resolves the tag, walks index → manifest if needed, and
// returns an image.UpdateMetadata populated from the manifest's annotations.
// No blobs are downloaded — this is the cheap "what's available" probe.
func (c *Client) FetchMetadata(ctx context.Context, tag string) (*image.UpdateMetadata, error) {
manifest, _, err := c.resolveArchManifest(ctx, tag)
if err != nil {
return nil, err
}
return metadataFromAnnotations(manifest.Annotations), nil
}
// Pull resolves the tag, picks the matching-architecture manifest, downloads
// the kernel + initramfs layers to `stageDir`, verifies their digests, and
// returns a StagedImage compatible with the existing pkg/image consumer.
func (c *Client) Pull(ctx context.Context, tag, stageDir string) (*image.StagedImage, *image.UpdateMetadata, error) {
manifest, _, err := c.resolveArchManifest(ctx, tag)
if err != nil {
return nil, nil, err
}
if err := os.MkdirAll(stageDir, 0o755); err != nil {
return nil, nil, fmt.Errorf("create stage dir: %w", err)
}
var kernelPath, initramfsPath string
for _, layer := range manifest.Layers {
switch layer.MediaType {
case MediaKernel:
kernelPath = filepath.Join(stageDir, "vmlinuz")
if err := c.fetchBlobTo(ctx, layer, kernelPath); err != nil {
return nil, nil, fmt.Errorf("download kernel: %w", err)
}
case MediaInitramfs:
initramfsPath = filepath.Join(stageDir, "kubesolo-os.gz")
if err := c.fetchBlobTo(ctx, layer, initramfsPath); err != nil {
return nil, nil, fmt.Errorf("download initramfs: %w", err)
}
default:
slog.Debug("oci: skipping unknown layer", "media", layer.MediaType)
}
}
if kernelPath == "" {
return nil, nil, fmt.Errorf("manifest has no %s layer", MediaKernel)
}
if initramfsPath == "" {
return nil, nil, fmt.Errorf("manifest has no %s layer", MediaInitramfs)
}
meta := metadataFromAnnotations(manifest.Annotations)
staged := &image.StagedImage{
VmlinuzPath: kernelPath,
InitramfsPath: initramfsPath,
Version: meta.Version,
}
return staged, meta, nil
}
// resolveArchManifest fetches the descriptor at `tag`, walks an index if
// present, and returns the platform-specific manifest matching c.Arch.
func (c *Client) resolveArchManifest(ctx context.Context, tag string) (*ocispec.Manifest, *ocispec.Descriptor, error) {
desc, err := c.repo.Resolve(ctx, tag)
if err != nil {
return nil, nil, fmt.Errorf("resolve tag %q: %w", tag, err)
}
switch desc.MediaType {
case ocispec.MediaTypeImageIndex, "application/vnd.docker.distribution.manifest.list.v2+json":
index, err := fetchJSON[ocispec.Index](ctx, c.repo, desc)
if err != nil {
return nil, nil, fmt.Errorf("fetch index: %w", err)
}
var matched *ocispec.Descriptor
for i := range index.Manifests {
m := &index.Manifests[i]
if m.Platform != nil && m.Platform.Architecture == c.Arch {
matched = m
break
}
}
if matched == nil {
return nil, nil, fmt.Errorf("no manifest in index for architecture %q", c.Arch)
}
manifest, err := fetchJSON[ocispec.Manifest](ctx, c.repo, *matched)
if err != nil {
return nil, nil, fmt.Errorf("fetch manifest: %w", err)
}
return manifest, matched, nil
case ocispec.MediaTypeImageManifest, "application/vnd.docker.distribution.manifest.v2+json":
manifest, err := fetchJSON[ocispec.Manifest](ctx, c.repo, desc)
if err != nil {
return nil, nil, fmt.Errorf("fetch manifest: %w", err)
}
// Single-arch tag: if it declares an arch, enforce match.
if archAnnot := manifest.Annotations[AnnotArch]; archAnnot != "" && archAnnot != c.Arch {
return nil, nil, fmt.Errorf("single-arch manifest is %q, want %q", archAnnot, c.Arch)
}
return manifest, &desc, nil
default:
return nil, nil, fmt.Errorf("unsupported media type %q at tag %q", desc.MediaType, tag)
}
}
// fetchJSON pulls a small JSON document (manifest or index) and decodes it.
func fetchJSON[T any](ctx context.Context, store content.Fetcher, desc ocispec.Descriptor) (*T, error) {
rc, err := store.Fetch(ctx, desc)
if err != nil {
return nil, err
}
defer rc.Close()
data, err := content.ReadAll(rc, desc)
if err != nil {
return nil, err
}
var out T
if err := json.Unmarshal(data, &out); err != nil {
return nil, fmt.Errorf("decode: %w", err)
}
return &out, nil
}
// fetchBlobTo streams a blob to disk and verifies its digest matches.
// Cleans up the destination file on any error so we never leave a partial.
func (c *Client) fetchBlobTo(ctx context.Context, desc ocispec.Descriptor, dest string) (retErr error) {
rc, err := c.repo.Fetch(ctx, desc)
if err != nil {
return fmt.Errorf("fetch blob: %w", err)
}
defer rc.Close()
f, err := os.Create(dest)
if err != nil {
return fmt.Errorf("create %s: %w", dest, err)
}
defer func() {
if cerr := f.Close(); retErr == nil && cerr != nil {
retErr = cerr
}
if retErr != nil {
_ = os.Remove(dest)
}
}()
verifier := desc.Digest.Algorithm().Hash()
mw := io.MultiWriter(f, verifier)
n, err := io.Copy(mw, rc)
if err != nil {
return fmt.Errorf("stream blob: %w", err)
}
if desc.Size > 0 && n != desc.Size {
return fmt.Errorf("blob size mismatch: got %d, want %d", n, desc.Size)
}
got := digest.NewDigest(desc.Digest.Algorithm(), verifier)
if got != desc.Digest {
return fmt.Errorf("blob digest mismatch: got %s, want %s", got, desc.Digest)
}
return nil
}
// metadataFromAnnotations builds an UpdateMetadata from manifest annotations.
// Always returns a non-nil value (missing fields stay empty).
func metadataFromAnnotations(a map[string]string) *image.UpdateMetadata {
if a == nil {
a = map[string]string{}
}
return &image.UpdateMetadata{
Version: a[AnnotVersion],
Channel: a[AnnotChannel],
MinCompatibleVersion: a[AnnotMinVersion],
Architecture: a[AnnotArch],
ReleaseNotes: a[AnnotReleaseNote],
ReleaseDate: a[AnnotReleaseDate],
}
}
// ErrNoManifestForArch is returned from FetchMetadata/Pull when an index has
// no entry matching the running architecture. Exposed so callers can
// distinguish "registry unreachable" from "this build doesn't ship for us".
var ErrNoManifestForArch = errors.New("no manifest in index for runtime architecture")