~bigbes/ci-cacher

0ad1486d8a663154fb7e73cc694ffa2a20c01869 — Eugene Blikh 2 days ago 891eddc
Add BSD-2-Clause license and README

Copyright header matches the convention used in go-luarocks.
README walks through install, init/doctor, single-file and docker
caching, the new directory caching, key derivation, config
precedence, Garage compatibility, and exit codes.
2 files changed, 240 insertions(+), 0 deletions(-)

A LICENSE
A README.md
A LICENSE => LICENSE +23 -0
@@ 0,0 1,23 @@
Copyright (c) 2026, Eugene Blikh <bigbes@gmail.com>

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
   this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

A README.md => README.md +217 -0
@@ 0,0 1,217 @@
# cacher

S3-backed CI cache helper. A single static Go binary that downloads,
uploads, lists, and invalidates cached build artifacts in any S3-compatible
bucket. Built for [builds.sr.ht](https://builds.sr.ht) but works anywhere
you can run a binary and reach an S3 endpoint.

It replaces the typical CI cache shell snippet:

```sh
# before — install awscli, write ~/.aws/config, then in every task:
if aws s3api head-object --bucket "$B" --key "$K" >/dev/null 2>&1; then
  aws s3 cp "s3://$B/$K" "$out"
else
  curl -sSL "$url" -o "$out"
  aws s3 cp "$out" "s3://$B/$K"
fi
```

```sh
# after — one binary, one config, one command:
cacher download "$key" "$out" --url "$url"
```

## Install

```sh
# Pre-built linux-amd64 binary (create the target directory if it is missing):
mkdir -p ~/.local/bin
wget https://git.srht.bigb.es/~bigbes/ci-cacher/refs/v0.1.0/cacher-linux-amd64 -O ~/.local/bin/cacher
chmod +x ~/.local/bin/cacher

# From source:
go install go.bigb.es/cacher@latest
```

## Setup

`cacher init` writes `~/.config/cacher/config.toml` and runs a smoke test
to fail fast on bad credentials. Run it once per CI job after secrets are
mounted:

```sh
cacher init \
  --endpoint    https://s3.example.com \
  --region      us-east-1 \
  --bucket      ci-cache \
  --prefix      my-project \
  --key-file    ~/.s3-cache-key-id \
  --secret-file ~/.s3-cache-key-secret
```

Credentials resolve as **`CACHER_S3_KEY_ID` / `CACHER_S3_SECRET` env
vars > `--key-file` / `--secret-file` > files recorded in config**. Use the
env vars for local-dev runs; use the files in CI where secrets are mounted.

`cacher doctor` repeats the smoke test (HEAD bucket + 1-byte
write/read/delete canary) and prints credential-length diagnostics
without leaking the secret.

## Usage

### Single file — fetch-or-download

```sh
cacher download go-1.26.3.tar.gz /tmp/go.tar.gz \
  --url https://go.dev/dl/go1.26.3.linux-amd64.tar.gz \
  --sha256 abc123…
```

Cache HIT: pulls from S3. Cache MISS: GETs the URL, verifies the optional
sha256, writes to the destination, **and uploads to S3** so the next run
hits.

```sh
cacher upload my-key /path/to/artifact            # skip if present
cacher upload my-key /path/to/artifact --force    # overwrite
cacher exists my-key                              # exit 0 hit, 1 miss
cacher list   my-prefix                           # debug
cacher delete my-key                              # invalidate
```

### Docker images — streamed save/load

```sh
# Build, cache, and reuse a docker image keyed by its Dockerfile content:
KEY=$(cacher key "images/{hash}.tar.zst" --hash-from Dockerfile)

if ! cacher docker exists "$KEY"; then
  docker build -t myimage:latest .
  cacher docker upload "$KEY" myimage:latest
else
  cacher docker download "$KEY" myimage:latest
fi
```

The save/load pipeline is fully streamed — `docker save | zstd | s3` and
the inverse, no on-disk tempfile. The zstd codec is pure Go
([klauspost/compress](https://github.com/klauspost/compress)), so no
external `zstd` binary is needed.

### Directory caching — the real CI speedup

Cache resolved trees keyed by a lockfile hash. Skip resolution entirely
when nothing changed:

```sh
# Go module cache, keyed by go.sum:
KEY=$(cacher key "go-mod/{hash}.tar.zst" --hash-from go.sum)
cacher dir download "$KEY" ~/go/pkg/mod 2>/dev/null || {
  go mod download
  cacher dir upload "$KEY" ~/go/pkg/mod
}

# Lua rocks tree, keyed by rockspec:
KEY=$(cacher key "rocks/{hash}.tar.zst" --hash-from project-scm-1.rockspec)
cacher dir download "$KEY" .rocks 2>/dev/null || {
  tt rocks install ...
  cacher dir upload "$KEY" .rocks
}
```

### Key derivation

`--hash-from <path>` is repeatable. Files are hashed by content; directories
are hashed by recursively walking entries in sorted relative-path order.
The resulting hex digest is truncated to `--hash-length` characters
(default 16, matching the `sha256sum file | cut -c1-16` convention).

For a single file path at the default length, the hash equals the output of
`sha256sum file | head -c 16` exactly, so existing keys can be migrated
without recomputing them.
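
The single-file equivalence can be checked with coreutils alone. The sketch below is an illustration, not part of `cacher`: `short_hash` is a helper name introduced here, and it assumes `sha256sum` is on the `PATH`.

```sh
# Hypothetical helper: first N hex characters of a file's SHA-256 digest.
# N defaults to 16, mirroring the documented --hash-length default.
short_hash() {
  sha256sum "$1" | cut -c1-"${2:-16}"
}

# Demo on a throwaway file.
tmp=$(mktemp)
printf 'module example\n' > "$tmp"

a=$(short_hash "$tmp")
b=$(sha256sum "$tmp" | head -c 16)

# Both pipelines take the same 16-character prefix of the digest.
[ "$a" = "$b" ] && echo "match: $a"
rm -f "$tmp"
```

Directory inputs and multiple `--hash-from` paths are not covered by this check.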

Substitution into the key template:

| Template                       | --hash-from   | Result                              |
|--------------------------------|---------------|-------------------------------------|
| `img/{hash}.tar.zst`           | `Dockerfile`  | `img/abcd1234….tar.zst`             |
| `img/build.tar.zst`            | `Dockerfile`  | `img/build-abcd1234….tar.zst`       |
| `bin/cacher` + `--arch-suffix` | (none)        | `bin/cacher-linux-amd64`            |

`cacher key <template> [--hash-from ...]` resolves and prints the key
without doing anything else — handy when the same key feeds multiple
subsequent commands.
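
The first two rows of the table can be mimicked in plain shell. This is a hedged sketch, not `cacher`'s implementation: `render_key` is a name introduced here, and the rule for templates without `{hash}` (insert `-<hash>` before the first dot of the last path component) is an assumption inferred from the single example in the table. The `--arch-suffix` row is left out.

```sh
# Hypothetical re-implementation of the first two table rows; not cacher's code.
render_key() (
  template=$1
  hash=$2
  placeholder='{hash}'

  case $template in
    *"$placeholder"*)
      # Row 1: replace the first {hash} with the digest.
      printf '%s%s%s\n' \
        "${template%%"$placeholder"*}" "$hash" "${template#*"$placeholder"}"
      ;;
    *)
      # Row 2 (assumed rule): insert "-<hash>" before the first dot of the
      # last path component.
      case $template in
        */*) dir=${template%/*}/ ;;
        *)   dir= ;;
      esac
      base=${template##*/}
      stem=${base%%.*}
      ext=${base#"$stem"}
      printf '%s%s-%s%s\n' "$dir" "$stem" "$hash" "$ext"
      ;;
  esac
)

render_key 'img/{hash}.tar.zst' abcd1234   # img/abcd1234.tar.zst
render_key 'img/build.tar.zst'  abcd1234   # img/build-abcd1234.tar.zst
```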

## Configuration

`~/.config/cacher/config.toml`:

```toml
endpoint    = "https://s3.example.com"
region      = "us-east-1"
bucket      = "ci-cache"
prefix      = "my-project"
arch_suffix = false
key_file    = "~/.s3-cache-key-id"
secret_file = "~/.s3-cache-key-secret"
```

Every field is overridable, with precedence **flag > `CACHER_<UPPER>`
env var > config file > built-in default**.
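
The precedence chain amounts to "first non-empty value wins". A minimal sketch, assuming an empty string means "not set" and that `CACHER_<UPPER>` expands to `CACHER_BUCKET` for the `bucket` field; `resolve_setting` is a hypothetical helper, not a `cacher` command:

```sh
# Hypothetical helper mirroring the documented order:
# flag > CACHER_<UPPER> env var > config file > built-in default.
# Arguments: <flag value> <env value> <config value> <default>; empty = unset.
resolve_setting() {
  if   [ -n "$1" ]; then printf '%s\n' "$1"
  elif [ -n "$2" ]; then printf '%s\n' "$2"
  elif [ -n "$3" ]; then printf '%s\n' "$3"
  else                   printf '%s\n' "$4"
  fi
}

# No flag given, env var set, config file says "ci-cache": the env var wins.
CACHER_BUCKET=scratch
resolve_setting "" "${CACHER_BUCKET:-}" "ci-cache" ""   # scratch
```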

`prefix` is joined to every key, so callers refer to keys relative to
their project namespace.

`arch_suffix = true` appends `-<goos>-<goarch>` to every key — useful if
you build the same project on multiple architectures. Off by default
because turning it on invalidates existing keys.

## Garage / MinIO compatibility

`cacher` is configured for [Garage](https://garagehq.deuxfleurs.fr/) by
default and works identically on MinIO and AWS S3:

- Path-style addressing (`https://endpoint/bucket/key`, not
  `https://bucket.endpoint/key`).
- Signature v4.
- `RequestChecksumCalculation` / `ResponseChecksumValidation` set to
  `when_required` (Garage doesn't implement the trailing CRC32 checksums
  that newer AWS SDKs send by default, e.g. boto3 1.36+).
- Multipart upload for objects above 5 MiB (handled by
  `manager.Uploader`).
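
For comparison, a shell-based setup that talks to Garage through the AWS CLI or boto3 needs the same checksum and addressing tweaks in the AWS shared config file. The sketch below is an illustration only and is not read by `cacher`: `write_garage_aws_config` is a name introduced here, and the keys are the shared-config spellings of those settings, which should be checked against the CLI version in use.

```sh
# Hypothetical helper: write an AWS shared-config file with the Garage tweaks.
# Usage: write_garage_aws_config "$HOME/.aws/config"
write_garage_aws_config() {
  mkdir -p "$(dirname "$1")"
  cat > "$1" <<'EOF'
[default]
request_checksum_calculation = when_required
response_checksum_validation = when_required
s3 =
  addressing_style = path
EOF
}
```

The helper overwrites the target file, so point it at a scratch path when experimenting.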

## Exit codes

| Code | Meaning                                                   |
|------|-----------------------------------------------------------|
| `0`  | Success                                                   |
| `1`  | `exists` returned false / key missing                     |
| `2`  | Operational error (credentials, network, permission, …)  |
| `3`  | `download` cache miss with no `--url` fallback            |

This lets a shell script branch on `exists` cleanly:

```sh
if cacher exists "$key"; then
  echo "cached"
else
  echo "missing — build it"
fi
```
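
The `download` codes can be handled the same way when a fallback should run only on a genuine miss and operational errors should stay visible. A minimal sketch, assuming the exit codes in the table above; `fetch_or_build` is a hypothetical wrapper, not a `cacher` command:

```sh
# Hypothetical wrapper, not a cacher command. Assumes the documented codes:
# 0 = hit, 3 = miss with no --url fallback, anything else = operational error.
fetch_or_build() {
  status=0
  cacher download "$1" "$2" || status=$?
  case $status in
    0) echo "hit" ;;
    3) echo "miss: rebuild and upload" ;;
    *) echo "cacher failed with exit code $status" >&2
       return "$status" ;;
  esac
}

# Usage: fetch_or_build my-key /path/to/artifact
```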

## Why not just call `aws s3` from shell?

The shell version this replaced repeated five things in every CI task:

1. Install AWS CLI v2 (~50 MB download per build).
2. Write `~/.aws/config` with the Garage tweaks.
3. Compute the cache key from file content with `sha256sum | cut`.
4. Branch HIT/MISS by hand.
5. For docker, pipe `docker save | zstd | aws s3 cp -` (and the inverse).

`cacher` replaces (1) with a single 14 MB static binary fetched with one
`wget`, and provides (2)-(5) as built-in commands. Directory caching is
new: the shell version only handled single files.

## License

BSD-2-Clause. See `LICENSE`.