
#lethe

Personal AI assistant log aggregator. lethe is a small, single-binary Go service that ingests turn-level NDJSON from AI assistant collectors (Claude Code, opencode, etc.), stores it in SQLite, and exposes a JSON API for listing and reading sessions.

Search and the collector binary live in sibling repos / tasks (lethe-collector-claude-code, lethe-search-and-opencode); this repo is just the server.

#Purpose

  • A single private store for all my AI assistant transcripts across hosts and tools.
  • Per-user data isolation so each authenticated user only sees their own sessions; admins can override with ?owner=.
  • Simple deploy: one Go binary, one SQLite file, all schema changes via embedded migrations, 127.0.0.1 bind behind a reverse proxy.

#Quickstart

cp config.example.yaml config.yaml
# edit config.yaml — at minimum set auth.allowed_users
just run
# or, for hot reload during development:
just air

The server reads config.yaml by default. Pass -config <path> to override.

Once the server is up (default bind 127.0.0.1:8080), exercise the API:

# Public probes (no auth):
curl http://127.0.0.1:8080/healthz
curl http://127.0.0.1:8080/readyz
curl http://127.0.0.1:8080/metrics

# Forward-auth (the reverse proxy normally injects this header):
curl -H 'Remote-User: bigbes' http://127.0.0.1:8080/api/v1/sessions

# Ingest one NDJSON line as that user:
printf '%s\n' '{"tool":"claude-code","host":"laptop","session_id":"s1","turn_id":"t1","seq":0,"role":"user","timestamp":1700000000,"content":"hi","session_meta":{"source_file":"/tmp/s1.jsonl"}}' \
  | curl -H 'Remote-User: bigbes' -H 'Content-Type: application/x-ndjson' \
         --data-binary @- http://127.0.0.1:8080/api/v1/ingest

# Read the session back:
curl -H 'Remote-User: bigbes' http://127.0.0.1:8080/api/v1/sessions/claude-code/laptop/s1

# OIDC bearer (when auth.oidc.enabled: true):
curl -H "Authorization: Bearer $JWT" http://127.0.0.1:8080/api/v1/sessions
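A collector-side view of the same ingest line, as a minimal Go sketch. The TurnEvent and SessionMeta structs here are hypothetical and mirror only the fields visible in the sample curl above; the authoritative contract lives in internal/shared/wire and may carry more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SessionMeta and TurnEvent are illustrative types that mirror only the
// fields shown in the sample ingest line; they are not the real wire package.
type SessionMeta struct {
	SourceFile string `json:"source_file"`
}

type TurnEvent struct {
	Tool        string      `json:"tool"`
	Host        string      `json:"host"`
	SessionID   string      `json:"session_id"`
	TurnID      string      `json:"turn_id"`
	Seq         int         `json:"seq"`
	Role        string      `json:"role"`
	Timestamp   int64       `json:"timestamp"`
	Content     string      `json:"content"`
	SessionMeta SessionMeta `json:"session_meta"`
}

// encodeNDJSON renders each event as one JSON object followed by a newline,
// which is the body shape expected with Content-Type: application/x-ndjson.
func encodeNDJSON(events []TurnEvent) (string, error) {
	var out []byte
	for _, ev := range events {
		line, err := json.Marshal(ev)
		if err != nil {
			return "", err
		}
		out = append(out, line...)
		out = append(out, '\n')
	}
	return string(out), nil
}

func main() {
	body, err := encodeNDJSON([]TurnEvent{{
		Tool: "claude-code", Host: "laptop", SessionID: "s1", TurnID: "t1",
		Seq: 0, Role: "user", Timestamp: 1700000000, Content: "hi",
		SessionMeta: SessionMeta{SourceFile: "/tmp/s1.jsonl"},
	}})
	if err != nil {
		panic(err)
	}
	fmt.Print(body) // one line, ready to POST to /api/v1/ingest
}
```

The output is byte-for-byte the line piped into curl in the ingest example, so the same body can be sent with any HTTP client.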

#Trust model

lethe always binds 127.0.0.1. It is not safe to expose it to the network directly. All authentication assumes a trusted reverse proxy on the same host (e.g. Caddy or Traefik) that terminates TLS and either injects auth headers or relays an OIDC bearer.

 client ──TLS──▶ Caddy ──forward_auth──▶ Authelia
                  │                       │
                  │ ◀── 200 + headers ────┘
                  │   (Remote-User, Remote-Groups, ...)
                  │
                  └─HTTP─▶ lethe (127.0.0.1:8080)
                              │
                              └─▶ SQLite (./data/lethe.db)

The proxy MUST strip any Remote-User (and the other Remote-* headers) that arrive on the inbound request before it issues the forward-auth sub-request; otherwise a hostile client can spoof an identity by setting the header itself. Authelia's documentation calls this out. Caddy's copy_headers directive only copies Authelia's response headers onto the request; it does not clear inbound ones, so delete them explicitly on the request side, for example with request_header -Remote-* ordered ahead of forward_auth (or equivalent). header_down is not the right tool here: it rewrites response headers on the way back to the client, not the request headers that reach lethe.
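What "strip the inbound Remote-* headers" amounts to can be sketched in Go, for instance if the fronting proxy were itself a Go program. stripRemoteHeaders is a hypothetical helper, not part of lethe or Caddy.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// stripRemoteHeaders deletes every header whose canonical name starts with
// "Remote-", so a client-supplied identity never survives to the backend.
// Hypothetical helper for illustration only.
func stripRemoteHeaders(h http.Header) {
	for name := range h {
		if strings.HasPrefix(http.CanonicalHeaderKey(name), "Remote-") {
			delete(h, name) // deleting during range is safe in Go
		}
	}
}

func main() {
	h := http.Header{}
	h.Set("Remote-User", "mallory") // forged by the client
	h.Set("Remote-Groups", "admins")
	h.Set("Accept", "application/json")

	stripRemoteHeaders(h)
	fmt.Println(len(h), h.Get("Remote-User") == "")
}
```

The strip has to happen before the auth sub-request result is copied onto the request; doing it afterwards would also remove the trusted values.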

The implicit assumption is that no other process on the host connects to 127.0.0.1:8080 and forges auth headers. That is the whole trust boundary for path (a) below.

There are two independent auth paths, each gated by the auth.allowed_users allowlist as a defense-in-depth check. If both are enabled and a request carries both an Authorization header and the configured forward-auth header, the bearer is validated first; the header is only consulted if bearer validation fails.
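The precedence rule can be sketched as a small decision function. This is not lethe's middleware: the credentials type and resolveUser are hypothetical. Two behaviours are assumptions of the sketch rather than statements from this README: a request with no usable credential gets 401, and a valid bearer whose user is not allowlisted gets 403 without falling back to the header.

```go
package main

import (
	"fmt"
	"net/http"
)

// credentials is a hypothetical, already-parsed view of one request.
type credentials struct {
	BearerUser string // username claim from a bearer that passed validation
	BearerOK   bool   // true only if a bearer was present and valid
	HeaderUser string // value of the forward-auth header, "" if absent
}

// resolveUser applies the documented order: bearer first, forward-auth
// header only when bearer validation did not succeed, then the allowlist.
func resolveUser(c credentials, allowed map[string]bool) (string, int) {
	var user string
	switch {
	case c.BearerOK:
		user = c.BearerUser
	case c.HeaderUser != "":
		user = c.HeaderUser
	default:
		return "", http.StatusUnauthorized // assumption: nothing usable -> 401
	}
	if !allowed[user] {
		return "", http.StatusForbidden // allowlist miss
	}
	return user, http.StatusOK
}

func main() {
	allowed := map[string]bool{"bigbes": true}

	// Invalid bearer plus a proxy-injected header: the header is consulted.
	fmt.Println(resolveUser(credentials{HeaderUser: "bigbes"}, allowed))
	// Valid bearer for a user outside the allowlist.
	fmt.Println(resolveUser(credentials{BearerUser: "eve", BearerOK: true, HeaderUser: "bigbes"}, allowed))
}
```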

#(a) Forward-auth: reverse proxy + Authelia + Remote-User

The reverse proxy runs Authelia forward-auth on the lethe vhost. On success Authelia returns the user identity in response headers (Remote-User, Remote-Email, Remote-Groups), which the proxy copies onto the request it forwards to lethe. lethe reads the configured auth.forward_auth.user_header (default Remote-User), checks the allowlist, and 403s on miss.

Sample Caddy snippet (/etc/caddy/Caddyfile):

lethe.example.com {
    forward_auth http://127.0.0.1:9091 {
        uri /api/verify?rd=https://auth.example.com
        copy_headers Remote-User Remote-Email Remote-Groups Remote-Name
    }

    reverse_proxy 127.0.0.1:8080
}

lethe config:

auth:
  allowed_users: ["bigbes"]
  forward_auth:
    enabled: true
    user_header: "Remote-User"

Sample request (the proxy injects the headers on your behalf — this curl shows what reaches lethe, not what you'd send from a browser):

curl -H 'Remote-User: bigbes' http://127.0.0.1:8080/api/v1/sessions

#(b) OIDC bearer: Authelia issues, lethe validates

lethe is registered as an OIDC client of Authelia. lethe only validates tokens; it never issues them. The bearer must already be obtained out-of-band by whatever client wants to call the API (the collector, scripts, etc.).

Sample Authelia client entry (identity_providers.oidc.clients in configuration.yml):

identity_providers:
  oidc:
    clients:
      - client_id: lethe
        client_name: Lethe
        client_secret: '<argon2 hash>'
        public: false
        authorization_policy: two_factor
        redirect_uris:
          - https://lethe.example.com/oauth2/callback
        scopes:
          - openid
          - profile
          - email
        userinfo_signing_algorithm: none
        token_endpoint_auth_method: client_secret_basic

lethe config (note: only the issuer URL, audience, and the username claim — no client secret on the lethe side, since lethe never starts a flow):

auth:
  allowed_users: ["bigbes"]
  oidc:
    enabled: true
    issuer: "https://auth.example.com"
    audience: "lethe"
    username_claim: "preferred_username"

Sample request:

curl -H "Authorization: Bearer $TOKEN" \
  https://lethe.example.com/api/v1/sessions

#Public endpoints

/healthz, /readyz, and /metrics are mounted outside the auth middleware so the host can scrape them locally without going through the proxy. Everything under /api/v1/* is authed.

#API surface

| Method | Path | Auth | Notes |
|--------|------|------|-------|
| POST | /api/v1/ingest | yes | Content-Type: application/x-ndjson; one TurnEvent per line. Idempotent at the turn level per owner. |
| GET | /api/v1/sessions | yes | Paginated. Filters: tool, host, since, until, limit, offset. Admins may pass ?owner=<user> or ?owner=*. |
| GET | /api/v1/sessions/{tool}/{host}/{session_id} | yes | Full session with turns inline. Admins may pass ?owner=<user>. |
| GET | /healthz | no | Liveness: constant 200 ok. |
| GET | /readyz | no | Readiness: runs every registered probe; 200 with {"checks":{...}} or 503. |
| GET | /metrics | no | Prometheus exposition. |
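A client paging through GET /api/v1/sessions might build its URLs and advance its offset roughly like this. sessionsURL and nextOffset are hypothetical helpers; treating a short page as the end of the listing is an assumption of the sketch, and the since/until filters are left out because their value format is not specified here.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// sessionsURL builds a GET /api/v1/sessions URL. Empty filters are skipped.
func sessionsURL(base, tool, host string, limit, offset int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/sessions"
	q := url.Values{}
	if tool != "" {
		q.Set("tool", tool)
	}
	if host != "" {
		q.Set("host", host)
	}
	q.Set("limit", strconv.Itoa(limit))
	q.Set("offset", strconv.Itoa(offset))
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}

// nextOffset advances using the limit and offset the server echoed back,
// since those are the effective (possibly clamped) values.
// Assumption: a page shorter than the echoed limit means no more results.
func nextOffset(echoedOffset, echoedLimit, returned int) (int, bool) {
	if returned == 0 || returned < echoedLimit {
		return 0, false
	}
	return echoedOffset + returned, true
}

func main() {
	u, err := sessionsURL("http://127.0.0.1:8080", "claude-code", "laptop", 50, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)

	// Suppose the response echoed limit=50, offset=0 and held 50 sessions.
	next, more := nextOffset(0, 50, 50)
	fmt.Println(next, more)
}
```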

Response shapes (success):

  • POST /api/v1/ingest: 200 with {"accepted": <int>, "errors": [{"line": <int>, "error": <string>}]}. The endpoint always returns 200 when the body is well-formed enough to read; per-line failures are reported in errors and the client uses accepted to advance its offset bookkeeping.
  • GET /api/v1/sessions: 200 with {"sessions": [Session, ...], "limit": <int>, "offset": <int>}. The echoed limit/offset reflect the (clamped) effective values.
  • GET /api/v1/sessions/{tool}/{host}/{session_id}: 200 with the Session columns flattened at the top level plus "turns": [Turn, ...] ordered by seq ascending.
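Because ingest answers 200 even when individual lines fail, a client has to inspect the body rather than the status code. A minimal decoding sketch follows; the struct names are hypothetical, and the sample error text is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lineError and ingestResponse mirror the documented response shape:
// {"accepted": <int>, "errors": [{"line": <int>, "error": <string>}]}.
type lineError struct {
	Line  int    `json:"line"`
	Error string `json:"error"`
}

type ingestResponse struct {
	Accepted int         `json:"accepted"`
	Errors   []lineError `json:"errors"`
}

// parseIngestResponse decodes the body of a 200 from POST /api/v1/ingest.
func parseIngestResponse(body []byte) (ingestResponse, error) {
	var r ingestResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	// Illustrative body; the error message is invented for this example.
	raw := []byte(`{"accepted":2,"errors":[{"line":3,"error":"missing turn_id"}]}`)
	r, err := parseIngestResponse(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("accepted=%d rejected=%d\n", r.Accepted, len(r.Errors))
	for _, e := range r.Errors {
		fmt.Printf("line %d: %s\n", e.Line, e.Error)
	}
}
```

How a collector maps accepted onto its own file offsets is up to the collector; this sketch only surfaces the two fields it needs for that decision.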

The owner field is server-derived; the wire format has no owner. A non-admin caller passing ?owner= gets 403.

Errors are RFC 7807 application/problem+json with the lethe-specific code extension carrying the machine-readable culpa code (e.g. UNAUTHORIZED, NOT_FOUND, INVALID, DB_OPEN).
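On the client side, error handling can key off the code extension instead of parsing titles. A minimal sketch, assuming the extension member is literally named "code" as described above; the problem struct and parseProblem are hypothetical, and the sample body is illustrative rather than captured from a real response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"mime"
)

// problem holds the standard RFC 7807 members plus lethe's "code" extension.
type problem struct {
	Type     string `json:"type"`
	Title    string `json:"title"`
	Status   int    `json:"status"`
	Detail   string `json:"detail"`
	Instance string `json:"instance"`
	Code     string `json:"code"` // machine-readable code, e.g. NOT_FOUND
}

// parseProblem decodes body only if contentType is application/problem+json.
func parseProblem(contentType string, body []byte) (*problem, error) {
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return nil, err
	}
	if mediaType != "application/problem+json" {
		return nil, fmt.Errorf("not a problem document: %s", mediaType)
	}
	var p problem
	if err := json.Unmarshal(body, &p); err != nil {
		return nil, err
	}
	return &p, nil
}

func main() {
	// Illustrative body, not captured from a real lethe response.
	raw := []byte(`{"type":"about:blank","title":"Not Found","status":404,"code":"NOT_FOUND"}`)
	p, err := parseProblem("application/problem+json; charset=utf-8", raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.Status, p.Code)
}
```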

#Production deployment

lethe binds 127.0.0.1 only — non-loopback binds are rejected at startup with a CONFIG_INVALID error. Run it on the same host as a reverse proxy (Caddy or Traefik) that terminates TLS and injects auth headers (or relays an OIDC bearer). Never publish :8080 directly.
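A startup check of this kind can be sketched with the standard net package. validateBind is a hypothetical function, not lethe's validator; it accepts any loopback IP literal (including ::1), which may be broader than what lethe actually allows.

```go
package main

import (
	"fmt"
	"net"
)

// validateBind accepts only host:port addresses whose host part is a
// loopback IP literal. An empty host (":8080") means "all interfaces"
// and is rejected. Hypothetical sketch of a CONFIG_INVALID check.
func validateBind(addr string) error {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return fmt.Errorf("CONFIG_INVALID: %w", err)
	}
	ip := net.ParseIP(host)
	if ip == nil || !ip.IsLoopback() {
		return fmt.Errorf("CONFIG_INVALID: bind address %q is not loopback", addr)
	}
	return nil
}

func main() {
	for _, addr := range []string{"127.0.0.1:8080", "0.0.0.0:8080", ":8080"} {
		if err := validateBind(addr); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", addr)
	}
}
```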

When run via the bundled docker-compose.yml, the service does not publish a port to the host — it only exposes 8080 on the compose network. The reverse proxy reaches lethe through that network. The SQLite file lives at ./data/lethe.db on the host (mounted into the container at /data); take backups against the host path.

#Operational notes

  • Health checks — point your orchestrator at GET /healthz (constant 200, no DB touch) for liveness and GET /readyz (DB ping with a 5s budget) for readiness.
  • Metrics: GET /metrics exposes Prometheus series prefixed with lethe_* plus the standard process and Go runtime collectors. Scrape it locally; do not expose :8080 to the network.
  • Lifecycle — on SIGINT/SIGTERM lethe runs a 15s graceful stop (in-flight requests drain, then the listener closes), then closes the database. If init fails before Start succeeds, lethe walks the partially-initialized assets in reverse and calls Destroy on each so no resource leaks.
  • Logs — set logging.format: json for structured production logging; tint is the friendly developer format. request_id and user are stamped on every record originating from the HTTP layer.
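The rollback-on-failed-init behaviour from the Lifecycle note can be sketched as follows. The asset interface and fake type are hypothetical stand-ins, not lethe's own types, and the sketch assumes that only assets whose Start already succeeded are destroyed.

```go
package main

import (
	"errors"
	"fmt"
)

// asset is a hypothetical lifecycle interface for illustration.
type asset interface {
	Name() string
	Start() error
	Destroy() error
}

// startAll starts assets in order. If one fails, the assets that already
// started are destroyed in reverse order before the error is returned.
func startAll(assets []asset) error {
	for i, a := range assets {
		if err := a.Start(); err != nil {
			for j := i - 1; j >= 0; j-- {
				_ = assets[j].Destroy() // best effort; keep unwinding
			}
			return fmt.Errorf("start %s: %w", a.Name(), err)
		}
	}
	return nil
}

// fake records lifecycle calls so the unwind order is visible.
type fake struct {
	name   string
	fail   bool
	events *[]string
}

func (f fake) Name() string { return f.name }

func (f fake) Start() error {
	if f.fail {
		return errors.New("boom")
	}
	*f.events = append(*f.events, "start "+f.name)
	return nil
}

func (f fake) Destroy() error {
	*f.events = append(*f.events, "destroy "+f.name)
	return nil
}

func main() {
	var events []string
	err := startAll([]asset{
		fake{"db", false, &events},
		fake{"metrics", false, &events},
		fake{"http", true, &events}, // fails, triggering the unwind
	})
	fmt.Println(err)
	fmt.Println(events)
}
```

Unwinding in reverse means a later asset that depends on an earlier one (for example, the HTTP layer on the database) is always torn down first.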

#Backup

The SQLite DB is the only state. Take consistent online backups with sqlite3 .backup. Sample cron with 14-day retention:

# /etc/cron.d/lethe-backup
# Daily online backup of the lethe SQLite database, 14-day retention.
PATH=/usr/bin:/bin

# cron has no backslash line continuation, so the whole job stays on one line.
15 03 * * * lethe ts=$(date +\%Y\%m\%d-\%H\%M\%S); sqlite3 /var/lib/lethe/lethe.db ".backup '/var/backups/lethe/lethe-$ts.db'" && find /var/backups/lethe -name 'lethe-*.db' -mtime +14 -delete

/var/backups/lethe should live on a different volume from the live DB.

#Layout

.
├── cmd/lethe/                       # main, thin shell + e2e smoke
├── internal/
│   ├── config/                      # YAML+env loader, validators, defaults
│   ├── domain/
│   │   ├── ingest/                  # NDJSON ingest: handler, service, repo
│   │   └── session/                 # sessions read API: handler, repo
│   ├── pkg/
│   │   ├── apierror/                # culpa-error → RFC 7807 problem renderer
│   │   └── httputil/                # NDJSON scanner, JSON write helper
│   ├── platform/
│   │   ├── database/                # sqlx wrapper + embed.FS migrations
│   │   ├── health/                  # Checker + steward-aggregated Set
│   │   └── observability/           # slog logger + Prometheus metrics
│   ├── server/                      # chi router, middleware, route mount
│   │   └── auth/                    # forward-auth + OIDC verifier
│   └── shared/wire/                 # locked NDJSON contract (collector-shared)
├── config.example.yaml
├── Justfile
├── .air.toml
├── Dockerfile
├── docker-compose.yml
└── .golangci.yml