# lethe

Personal AI assistant log aggregator.

`lethe` is a small, single-binary Go service that ingests turn-level NDJSON from AI assistant collectors (Claude Code, opencode, etc.), stores it in SQLite, and exposes a JSON API for listing and reading sessions. Search and the collector binary live in sibling repos / tasks (`lethe-collector-claude-code`, `lethe-search-and-opencode`); this repo is just the server.

## Purpose

- A single private store for all my AI assistant transcripts across hosts and tools.
- Per-user data isolation so each authenticated user only sees their own sessions; admins can override with `?owner=`.
- Simple deploy: one Go binary, one SQLite file, all schema changes via embedded migrations, `127.0.0.1` bind behind a reverse proxy.

## Quickstart

```bash
cp config.example.yaml config.yaml
# edit config.yaml: at minimum set auth.allowed_users
just run
# or, for hot reload during development:
just air
```

The server reads `config.yaml` by default. Pass `-config` to override.

Once the server is up (default bind `127.0.0.1:8080`), exercise the API:

```bash
# Public probes (no auth):
curl http://127.0.0.1:8080/healthz
curl http://127.0.0.1:8080/readyz
curl http://127.0.0.1:8080/metrics

# Forward-auth (the reverse proxy normally injects this header):
curl -H 'Remote-User: bigbes' http://127.0.0.1:8080/api/v1/sessions

# Ingest one NDJSON line as that user:
printf '%s\n' '{"tool":"claude-code","host":"laptop","session_id":"s1","turn_id":"t1","seq":0,"role":"user","timestamp":1700000000,"content":"hi","session_meta":{"source_file":"/tmp/s1.jsonl"}}' \
  | curl -H 'Remote-User: bigbes' -H 'Content-Type: application/x-ndjson' \
      --data-binary @- http://127.0.0.1:8080/api/v1/ingest

# Read the session back:
curl -H 'Remote-User: bigbes' http://127.0.0.1:8080/api/v1/sessions/claude-code/laptop/s1

# OIDC bearer (when auth.oidc.enabled: true):
curl -H "Authorization: Bearer $JWT" http://127.0.0.1:8080/api/v1/sessions
```

## Trust model

`lethe` always binds `127.0.0.1`. It is not safe to expose it to the network directly. All authentication assumes a trusted reverse proxy on the same host (e.g. Caddy or Traefik) that terminates TLS and either injects auth headers or relays an OIDC bearer.

```
client ──TLS──▶ Caddy ──forward_auth──▶ Authelia
                  │                       │
                  │ ◀── 200 + headers ────┘
                  │     (Remote-User, Remote-Groups, ...)
                  │
                  └─HTTP─▶ lethe (127.0.0.1:8080)
                              │
                              └─▶ SQLite (./data/lethe.db)
```

The proxy MUST strip any `Remote-User` (and the other `Remote-*` headers) that arrive on the inbound request before it issues the forward-auth sub-request; otherwise a hostile client can spoof an identity by setting the header itself. Authelia's documentation calls this out. Caddy's `copy_headers` directive only adds Authelia's response headers and does not clear inbound ones, so add an explicit request-side removal when you pass requests through (for example `request_header -Remote-*` ordered before `forward_auth`, or equivalent). Note that `header_down` is not suitable here: it rewrites response headers on their way back to the client, not inbound request headers.

The implicit assumption is that nothing else on the host binds `127.0.0.1` and forges auth headers.
That is the whole trust boundary for path (a) below.

There are two independent auth paths, each gated by the `auth.allowed_users` allowlist as a defense-in-depth check. If both are enabled and a request carries both an `Authorization` header and the configured forward-auth header, the bearer is validated first; the header is only consulted if bearer validation fails.

### (a) Forward-auth: reverse proxy + Authelia + `Remote-User`

The reverse proxy runs Authelia forward-auth on the lethe vhost. On success Authelia returns the user identity in response headers (`Remote-User`, `Remote-Email`, `Remote-Groups`, `Remote-Name`), which the proxy copies onto the request it forwards to lethe. `lethe` reads the configured `auth.forward_auth.user_header` (default `Remote-User`), checks the allowlist, and responds 403 on a miss.

Sample Caddy snippet (`/etc/caddy/Caddyfile`):

```caddy
lethe.example.com {
    forward_auth http://127.0.0.1:9091 {
        uri /api/verify?rd=https://auth.example.com
        copy_headers Remote-User Remote-Email Remote-Groups Remote-Name
    }
    reverse_proxy 127.0.0.1:8080
}
```

`lethe` config:

```yaml
auth:
  allowed_users: ["bigbes"]
  forward_auth:
    enabled: true
    user_header: "Remote-User"
```

Sample request (the proxy injects the headers on your behalf; this curl shows what reaches lethe, not what you'd send from a browser):

```bash
curl -H 'Remote-User: bigbes' http://127.0.0.1:8080/api/v1/sessions
```

### (b) OIDC bearer: Authelia issues, lethe validates

`lethe` is registered as an OIDC client of Authelia. `lethe` only validates tokens; it never issues them. The bearer must already be obtained out-of-band by whatever client wants to call the API (the collector, scripts, etc.).

Sample Authelia client entry (`identity_providers.oidc.clients` in `configuration.yml`):

```yaml
identity_providers:
  oidc:
    clients:
      - client_id: lethe
        client_name: Lethe
        client_secret: ''
        public: false
        authorization_policy: two_factor
        redirect_uris:
          - https://lethe.example.com/oauth2/callback
        scopes:
          - openid
          - profile
          - email
        userinfo_signing_algorithm: none
        token_endpoint_auth_method: client_secret_basic
```

`lethe` config (note: only the issuer URL, audience, and the username claim; no client secret on the lethe side, since lethe never starts a flow):

```yaml
auth:
  allowed_users: ["bigbes"]
  oidc:
    enabled: true
    issuer: "https://auth.example.com"
    audience: "lethe"
    username_claim: "preferred_username"
```

Sample request:

```bash
curl -H "Authorization: Bearer $TOKEN" \
  https://lethe.example.com/api/v1/sessions
```

### Public endpoints

`/healthz`, `/readyz`, and `/metrics` are mounted outside the auth middleware so the host can scrape them locally without going through the proxy. Everything under `/api/v1/*` is authed.

## API surface

| Method | Path | Auth | Notes |
|--------|------|------|-------|
| POST | `/api/v1/ingest` | yes | `Content-Type: application/x-ndjson`; one `TurnEvent` per line. Idempotent at the turn level per owner. |
| GET | `/api/v1/sessions` | yes | Paginated. Filters: `tool`, `host`, `since`, `until`, `limit`, `offset`. Admins may pass `?owner=` or `?owner=*`. |
| GET | `/api/v1/sessions/{tool}/{host}/{session_id}` | yes | Full session with turns inline. Admins may pass `?owner=`. |
| GET | `/healthz` | no | Liveness: constant 200 `ok`. |
| GET | `/readyz` | no | Readiness: runs every registered probe; 200 with `{"checks":{...}}` or 503. |
| GET | `/metrics` | no | Prometheus exposition. |

Response shapes (success):

- **POST `/api/v1/ingest`** → `200` with `{"accepted": , "errors": [{"line": , "error": }]}`.
  The endpoint always returns 200 when the body is well-formed enough to read; per-line failures are reported in `errors` and the client uses `accepted` to advance its offset bookkeeping.
- **GET `/api/v1/sessions`** → `200` with `{"sessions": [Session, ...], "limit": , "offset": }`. The echoed `limit`/`offset` reflect the (clamped) effective values.
- **GET `/api/v1/sessions/{tool}/{host}/{session_id}`** → `200` with the `Session` columns flattened at the top level plus `"turns": [Turn, ...]` ordered by `seq` ascending.

The `owner` field is server-derived; the wire format has no `owner`. A non-admin caller passing `?owner=` gets 403.

Errors are RFC 7807 `application/problem+json` with the lethe-specific `code` extension carrying the machine-readable culpa code (e.g. `UNAUTHORIZED`, `NOT_FOUND`, `INVALID`, `DB_OPEN`).

## Production deployment

`lethe` binds `127.0.0.1` only; non-loopback binds are rejected at startup with a `CONFIG_INVALID` error. Run it on the same host as a reverse proxy (Caddy or Traefik) that terminates TLS and injects auth headers (or relays an OIDC bearer). Never publish `:8080` directly.

When run via the bundled `docker-compose.yml`, the service does not publish a port to the host; it only `expose`s `8080` on the compose network. The reverse proxy reaches lethe through that network. The SQLite file lives at `./data/lethe.db` on the host (mounted into the container at `/data`); take backups against the host path.

## Operational notes

- **Health checks**: point your orchestrator at `GET /healthz` (constant 200, no DB touch) for liveness and `GET /readyz` (DB ping with a 5s budget) for readiness.
- **Metrics**: `GET /metrics` exposes Prometheus series prefixed with `lethe_*` plus the standard process and Go runtime collectors. Scrape it locally; do not expose `:8080` to the network.
- **Lifecycle**: on `SIGINT`/`SIGTERM` lethe runs a 15s graceful stop (in-flight requests drain, then the listener closes), then closes the database.
  If init fails before `Start` succeeds, lethe walks the partially-initialized assets in reverse and calls `Destroy` on each so no resource leaks.
- **Logs**: set `logging.format: json` for structured production logging; `tint` is the friendly developer format. `request_id` and `user` are stamped on every record originating from the HTTP layer.

## Backup

The SQLite DB is the only state. Take consistent online backups with `sqlite3 .backup`. Sample cron with 14-day retention:

```cron
# /etc/cron.d/lethe-backup
# Daily online backup of the lethe SQLite database, 14-day retention.
# cron does not support backslash line continuation, so the job stays on one line.
PATH=/usr/bin:/bin
15 03 * * * lethe ts=$(date +\%Y\%m\%d-\%H\%M\%S); sqlite3 /var/lib/lethe/lethe.db ".backup '/var/backups/lethe/lethe-$ts.db'" && find /var/backups/lethe -name 'lethe-*.db' -mtime +14 -delete
```

`/var/backups/lethe` should live on a different volume from the live DB.

## Layout

```
.
├── cmd/lethe/              # main, thin shell + e2e smoke
├── internal/
│   ├── config/             # YAML+env loader, validators, defaults
│   ├── domain/
│   │   ├── ingest/         # NDJSON ingest: handler, service, repo
│   │   └── session/        # sessions read API: handler, repo
│   ├── pkg/
│   │   ├── apierror/       # culpa-error → RFC 7807 problem renderer
│   │   └── httputil/       # NDJSON scanner, JSON write helper
│   ├── platform/
│   │   ├── database/       # sqlx wrapper + embed.FS migrations
│   │   ├── health/         # Checker + steward-aggregated Set
│   │   └── observability/  # slog logger + Prometheus metrics
│   ├── server/             # chi router, middleware, route mount
│   │   └── auth/           # forward-auth + OIDC verifier
│   └── shared/wire/        # locked NDJSON contract (collector-shared)
├── config.example.yaml
├── Justfile
├── .air.toml
├── Dockerfile
├── docker-compose.yml
└── .golangci.yml
```