Status: Execute (verify pending)
Module: sourcecraft.dev/bigbes/lethe
Branch: master
Worktree: none
Parent RFC: Personal AI Assistant Log Aggregator (2026-04-25)
Sibling tasks (deferred): lethe-collector-claude-code.md (#2), lethe-search-and-opencode.md (#3)
Stand up the lethe server binary: SQLite-backed ingestion endpoint, session list/detail JSON API, forward-auth header trust against an Authelia-protected reverse proxy, ready to deploy on phoebe behind Caddy/Traefik+Authelia. Search, stats, any HTML/UI, and any collector code are explicitly deferred to siblings or a later UI task.
A successful end state for this task: I can curl -X POST a fixture NDJSON file at /api/v1/ingest, then curl the sessions list and a single session detail through the reverse proxy (with either an Authelia session forwarding Remote-User or an Authelia-issued OIDC bearer) and get the expected JSON back.
In:
lethe (cmd/lethe/main.go) — JSON API server only.embed.FS + golang-migrate/v4, applied on startup.POST /api/v1/ingest — NDJSON, turn-only protocol (server upserts session rows from turn data).GET /api/v1/sessions — paginated list with filters (tool, host, since, until).GET /api/v1/sessions/{tool}/{host}/{session_id} — full session with turns inline.auth.allowed_users allowlist:
Remote-User from an upstream reverse proxy (Caddy/Traefik) gated by Authelia.Authorization: Bearer <jwt> against Authelia's OIDC issuer (JWKS lookup, signature, iss/aud/exp); take user from preferred_username (fallback sub).auth.admins list can see all owners' data and override via ?owner=<user> (or ?owner=*) on read endpoints./healthz, /readyz, /metrics (Prometheus).scribe, structured errors via culpa rendered as RFC 7807 application/problem+json.Justfile, .air.toml, Dockerfile, docker-compose.yml, config.example.yaml, .golangci.yml per go-selfhosted-backend skill conventions.internal/shared/wire/ so #2's collector imports them directly..backup is documented in the README (cron + sqlite3 .backup); no code in this task.Out:
goldmark/bluemonday). Deferred to a later UI task; the JSON API is the only consumer surface in this task./api/v1/search + search UI + tool_outputs_fts query path — #3 (table + triggers exist; queries don't)./api/v1/stats + rollups view — #3./debug/pprof — defer until something forces the issue.parentUuid session chaining (Claude Code resume semantics) — parser concern, lands with #2.Storage: option B (locked). SQLite (modernc.org/sqlite, no CGO) with WAL, single DB file. Schema includes turns_fts (FTS5 over prose content) and tool_outputs_fts (FTS5 over tool_calls text). Both indexes populated by INSERT/UPDATE/DELETE triggers from the start — #3 only wires queries.
Wire protocol: option B (locked). Collector emits turn-only NDJSON. Server upserts the session row on first-seen turn from session_meta carried on the turn; subsequent turns extend the session's ended_at. No separate session events on the wire. Reduces collector state and makes outbox replay trivially idempotent.
Repo layout: option A (locked). Monorepo. cmd/lethe/ (this task) and cmd/lethe-collector/ (placeholder dir for #2) under one go.mod. Shared types in internal/shared/wire/.
Auth model. Server binds 127.0.0.1 only. A reverse proxy on phoebe (Caddy or Traefik) terminates TLS and forwards to localhost. Two independent auth paths inside lethe, each enable-able via config:
Forward-auth (header trust). Reverse proxy runs Authelia forward-auth on the lethe vhost, and on success injects Remote-User (and Remote-Email, Remote-Groups) headers. Middleware reads Remote-User (header name configurable for non-Authelia setups), checks allowlist, 403 on miss. Used by browsers and any other tool that already has an Authelia session cookie.
OIDC bearer. Lethe is registered as an OIDC client of Authelia (client_id/client_secret configured at the Authelia side, client_id only at lethe — the server validates tokens, never issues them). Middleware accepts Authorization: Bearer <jwt>, validates against Authelia's published JWKS (discovered via /.well-known/openid-configuration), enforces iss, aud, exp, then resolves user from preferred_username falling back to sub. Used by the collector and any scripted client. No code-flow / callback / cookie machinery in lethe — the bearer must already be obtained out-of-band (Authelia issues it via its OIDC flow to whatever client got it).
Both paths drop the user identity into the request context so handlers see the same shape regardless of how the user was authenticated. Both paths are gated by the same auth.allowed_users allowlist as a defense-in-depth check. If both are enabled and a request carries both an Authorization header and a Remote-User header, the bearer is validated first; the proxy header is ignored unless bearer validation fails open (configurable). Health and metrics endpoints are mounted outside the auth middleware so phoebe can scrape them locally without going through the proxy. The implicit assumption — that nothing else on phoebe binds 127.0.0.1 and forges the header — is the whole trust model for path 1; documented in the README.
Layered design (from go-selfhosted-backend):
internal/config — Viper, strict mode, validator tags, fails on unknown keys. Top-level Config struct exposes substructs (Server, Database, Auth, …) via config-section:"" tags so steward can inject them by type into individual services.internal/platform/database — sqlx connection, migration runner over embed.FS, transaction helper. Implements steward Init (open + migrate) and Destroy (close).internal/platform/observability — scribe logger service (Init sets slog.SetDefault), Prometheus registry singleton.internal/platform/health — Checker interface, Set aggregator service with steward multi-inject ([]Checker \inject:""``), DB check service registered as a Checker. Adding a new check = registering a new asset; no edits to the set.internal/server — chi router, middleware stack (request-id, logging, metrics, recovery, auth), Start (listen) and Stop (graceful shutdown). Marked steward.Root() so it's always started.internal/domain/ingest — handler + service for POST /api/v1/ingest, both as steward services. Service owns the upsert-session-then-upsert-turn logic in a single transaction per batch.internal/domain/session — handler + repository for the list/detail JSON API, both as steward services.internal/shared/wire — TurnEvent, SessionMeta types. Imported by both server and (eventually) collector.internal/pkg/httputil + internal/pkg/apierror — JSON helpers, RFC 7807 problem rendering with culpa code → status mapping. Pure libraries, not steward components.Wiring & lifecycle. All wiring goes through steward.Manager. cmd/lethe/main.go is a thin shell: parse -config, load Config, register configuration + service assets, conditionally register OIDCVerifier only when cfg.Auth.OIDC.Enabled, call Inject → Init → Start, wait on signal, call Stop → Destroy. Dependency direction is enforced by struct tags (config:"", inject:"", inject:"" optional:"true"); the manager solves the topological order. No global state outside slog.Default (set by the logger service in Init).
Wire type (locked contract for #2):
package wire
type TurnEvent struct {
Tool string `json:"tool"`
Host string `json:"host"`
SessionID string `json:"session_id"`
TurnID string `json:"turn_id"`
Seq int64 `json:"seq"`
Role string `json:"role"` // user | assistant | tool | system
Timestamp int64 `json:"timestamp"` // unix epoch seconds
Content string `json:"content"`
Model *string `json:"model,omitempty"`
TokensIn *int64 `json:"tokens_in,omitempty"`
TokensOut *int64 `json:"tokens_out,omitempty"`
CostUSD *float64 `json:"cost_usd,omitempty"`
ToolCalls json.RawMessage `json:"tool_calls,omitempty"`
SessionMeta SessionMeta `json:"session_meta"`
Metadata json.RawMessage `json:"metadata,omitempty"`
}
type SessionMeta struct {
WorkingDir *string `json:"working_dir,omitempty"`
SourceFile string `json:"source_file"`
StartedAt *int64 `json:"started_at,omitempty"` // optional; server falls back to MIN(turn.timestamp)
Metadata json.RawMessage `json:"metadata,omitempty"`
}
tool_calls is json.RawMessage: the server doesn't interpret it, just persists it and feeds its serialized text to tool_outputs_fts. Carrying session_meta on every turn is redundant (~100 bytes/turn) and intentional: collector replay never has to ask "did I already send the session header?"
Ingest semantics.
TurnEvent per line. Hard cap on body size (configurable, default 16 MiB). Per-turn content soft-capped at 4 MiB (configurable); a single oversize turn is a LineError, not a 413.tool, host, session_id, turn_id, seq, role, timestamp, content. role ∈ {user, assistant, tool, system}. source_file (in SessionMeta) capped at 1024 bytes. Validation runs per-line before DB; failure → LineError.owner is server-derived from the authenticated user (request context), never read from the wire. The wire format in internal/shared/wire/ deliberately has no owner field — collectors cannot impersonate other owners.INTEGER (unix epoch seconds). No TEXT timestamps anywhere in the schema.INSERT INTO sessions ... ON CONFLICT (owner, tool, host, session_id) DO UPDATE SET ended_at = MAX(ended_at, excluded.ended_at). started_at, working_dir, source_file, metadata are first-write-wins (preserved on conflict); only ended_at extends.INSERT INTO turns ... ON CONFLICT (owner, tool, host, session_id, turn_id) DO UPDATE SET <all non-key columns>. Last-write-wins on the turn row. Triggers keep both FTS tables in sync.200 {"accepted": N, "errors": [{"line": N, "error": "..."}]} where accepted is the count of lines successfully committed in a previous chunk. Practically, the server processes and commits in chunks (e.g. every 500 lines) so a single bad line near the end of a batch doesn't lose 499 good ones. Collector advances its offset by exactly accepted.accepted, collector retries the whole batch from the same offset.Schema (initial migration). As specified in the RFC §4.2, with these adjustments:
owner TEXT NOT NULL to sessions and turns. Composite PKs become sessions(owner, tool, host, session_id) and turns(owner, tool, host, session_id, turn_id) with FK (owner, tool, host, session_id) → sessions. Owner-leading PK gives a free index for "list my sessions" and makes per-user isolation a schema property.owner UNINDEXED column to both FTS tables so #3's search can WHERE turns_fts MATCH ? AND owner = ? cheaply; triggers carry it from the source row.turns_fts_update trigger so an UPSERT on turns keeps the FTS row current (the RFC has insert/delete only).tool_outputs_fts table with insert/update/delete triggers, indexing the tool_calls column when non-NULL.schema_migrations (managed by golang-migrate) — replaces the RFC's hand-rolled schema_version.sessions(owner, started_at DESC) for the timeline/list query.CLI. This binary has one mode (run server). flag package, no cobra. Single arg: -config <path>. (Cobra will land with the collector binary in #2.)
Tradeoffs that settled it.
snippet/highlight and tokenizer are better than tsvector for this workload, single-user means no write contention. Migrating to PG later is mechanical if I'm wrong; the cost of being wrong is an afternoon.session_meta per turn costs nothing at this scale.Unknowns that remain.
~/.claude/projects/. If the FTS index for tool_outputs_fts grows pathologically, #3 has the option to add a size cap or move that table to a separate attached DB.go-oidc defaults usually fine; revisit if validation latency or 401-storms appear). Not blocking #1.Greenfield. Empty repo, no consumers, nothing to break. The only forward-compat concern is the wire format, which is locked into internal/shared/wire/ and versioned implicitly via the /api/v1/ path prefix. Future breaking changes get /api/v2/.
TDD: yes (reason: ingest idempotency, the upsert-session-from-turn semantics, the chunked-commit-with-partial-accept response shape, the auth middleware allowlist, and migration application on startup are all deterministic, regression-prone surfaces.)
source_file from incoming turns is stored as opaque string only.embed.FS migrations applied on startup. No ad-hoc DDL, no startup-time conditional CREATE TABLE.sessions keyed on (owner, tool, host, session_id); turns keyed on (owner, tool, host, session_id, turn_id). No surrogate IDs anywhere.POST /api/v1/ingest is idempotent at the turn level per owner: re-POST of identical (tool, host, session_id, turn_id) by the same authenticated user produces the same final state regardless of how many times it's sent. Two different users posting the same (tool, host, session_id, turn_id) produce two distinct rows.owner is set from the authenticated user on every ingest write. The wire format has no owner field; the server never reads owner from the request body.GET /api/v1/sessions, GET /api/v1/sessions/{tool}/{host}/{session_id}) return only rows where owner = <current user>, except when the current user is in auth.admins and supplies ?owner=<user> (specific owner) or ?owner=* (all owners). Non-admin requests with ?owner= are 403./api/v1/* validates the configured user header (default Remote-User) against auth.allowed_users. Only /healthz, /readyz, /metrics are unauthenticated.127.0.0.1 only. Binding any other interface is a config error and fails fast at startup._busy_timeout configured. Foreign keys are enforced (PRAGMA foreign_keys = ON).turns_fts and tool_outputs_fts tables are never written to directly outside triggers.application/problem+json with the culpa code mapped to status; internal (5xx) errors are logged with full stacktrace via scribe.Err before being sanitized for the response._unused parameters. If something turns out wrong, rewrite the file.metadata JSON column on the relevant row; new SQL columns require justification.chi + sqlx + golang-migrate + modernc.org/sqlite + go-oidc/v3 (JWT/JWKS validation only — no auth-code flow) + go.bigb.es/auxilia (steward for DI/lifecycle, culpa for errors, scribe for logs, async only if a background task surfaces). No ORM, no template engine, no UI dependencies in this task.steward.Manager. Adding a new component is registering an asset; main.go does not grow.config:"", inject:"", optional + multi-injection where it earns its keep). Constructors are the zero value; setup happens in Init.steward.Manager to assemble a real graph against a :memory: DB.culpa errors with codes; HTTP layer translates once at the boundary.internal/shared/wire/ is treated as a published API even though it isn't published — changes ripple into the collector and need to be obvious in diff.Approach: build bottom-up — wire types → config → DB+schema → platform → HTTP foundation → auth → ingest → read API → main. Each phase is one commit; tests land with the phase that introduces the behavior. Greenfield, so no compat shims.
go.mod (create) — module sourcecraft.dev/bigbes/lethe, Go 1.22+. Direct deps stub: chi/v5, sqlx, modernc.org/sqlite, golang-migrate/v4, viper, validator/v10, prometheus/client_golang, coreos/go-oidc/v3, go.bigb.es/auxilia/{steward,culpa,scribe}.Justfile, .air.toml, Dockerfile, docker-compose.yml, .golangci.yml, .gitignore, config.example.yaml (create) — per go-selfhosted-backend skill conventions; SQLite volume mount, no CGO.README.md (create) — purpose, quickstart, trust model section documenting both auth paths: (a) the 127.0.0.1 + reverse-proxy + Authelia forward-auth + Remote-User chain (with a sample Caddy forward_auth snippet), and (b) the OIDC bearer flow against Authelia (sample Authelia identity_providers.oidc.clients entry + sample lethe auth.oidc config). Backup section with the sqlite3 .backup cron snippet.cmd/lethe/main.go (create, ~30 lines) — flag.String("config", ...), prints version and exits. Real wiring in Phase 9.internal/shared/wire/wire.go (create, ~40 lines) — TurnEvent, SessionMeta exactly as specified in Design. No methods; pure data. Locked contract for #2.internal/shared/wire/ published-API-discipline (Principles).feat: bootstrap lethe server skeleton + wire contractinternal/config/config.go (create, ~190 lines) — Config struct with Server, Database, Auth, Logging, Ingest substructs. Each substruct has mapstructure, validate, and config-section:"" tags so steward can inject them by type into individual services.
Database substruct: Path string (sqlite file path), BusyTimeout time.Duration (default 5s).Auth substruct: AllowedUsers []string, Admins []string (subset of allowed users; may be empty), ForwardAuth ForwardAuthConfig{ Enabled bool; UserHeader string (default "Remote-User") }, OIDC OIDCConfig{ Enabled bool; Issuer string (URL); Audience string; UsernameClaim string (default "preferred_username") }.Ingest substruct: MaxBodyBytes int64 (default 16 MiB), MaxTurnContentBytes int64 (default 4 MiB), ChunkSize int (default 500).Server substruct: Bind string, ShutdownGrace time.Duration (default 10s).func Load(path string) (*Config, error) — viper strict mode, validator, env-var overrides via viper.SetEnvPrefix("LETHE") + viper.AutomaticEnv() + viper.SetEnvKeyReplacer(NewReplacer(".", "_")), returns culpa error on failure.func MustLoad(path string) *Config — wraps Load, panics on error. Used by main.go.Server.Bind must equal 127.0.0.1 or 127.0.0.1:<port> (regex); Database.Path required; at least one of Auth.ForwardAuth.Enabled or Auth.OIDC.Enabled must be true (custom validator: auth_at_least_one); Auth.AllowedUsers min=1; every entry in Auth.Admins must also appear in Auth.AllowedUsers (custom validator: admins_subset_of_allowed); if OIDC enabled, Auth.OIDC.Issuer url and Auth.OIDC.Audience required; Ingest.MaxBodyBytes gt=0, Ingest.MaxTurnContentBytes gt=0,ltefield=MaxBodyBytes, Ingest.ChunkSize gt=0.internal/config/config_test.go (create, TDD) — tests for: empty allowlist rejected; non-loopback bind rejected; both auth modes disabled rejected; OIDC enabled without issuer rejected; OIDC enabled with non-URL issuer rejected; admin not in allowed_users rejected; empty admins list accepted; missing Database.Path rejected; MaxTurnContentBytes > MaxBodyBytes rejected; unknown YAML key rejected (strict mode); env override works (LETHE_AUTH_ALLOWED_USERS overrides YAML); valid forward-auth-only config loads; valid OIDC-only config loads; valid both-enabled config loads; defaults applied (UserHeader="Remote-User", UsernameClaim="preferred_username", MaxBodyBytes=16MiB, MaxTurnContentBytes=4MiB, ChunkSize=500, BusyTimeout=5s, ShutdownGrace=10s).feat(config): viper-loaded config with fail-fast validationinternal/platform/database/database.go (create, ~110 lines) — Database is a steward service.
type Database struct { Cfg config.DatabaseConfig \config:""`; DB *sqlx.DB }—DBpopulated inInit`.func (d *Database) Init(ctx context.Context) error — opens via modernc.org/sqlite with _journal_mode=WAL, _busy_timeout=5000, _foreign_keys=on, _synchronous=NORMAL, cache=shared; then runs Migrate(d.DB).func (d *Database) Destroy(ctx context.Context) error — closes the DB.func Migrate(db *sqlx.DB) error — runs embed.FS migrations via golang-migrate/v4 iofs source + sqlite driver. Pure function so tests can call it directly.func InTx(ctx, db, fn func(*sqlx.Tx) error) error — transaction helper, rollback on error. Pure function.inject:"" and read .DB.internal/platform/database/migrations/0001_init.up.sql + .down.sql (create) — sessions, turns, turns_fts (FTS5 over content + owner UNINDEXED), tool_outputs_fts (FTS5 over tool_calls + owner UNINDEXED), insert/update/delete triggers for both FTS tables (triggers carry owner from source row). Composite PKs: sessions(owner, tool, host, session_id); turns(owner, tool, host, session_id, turn_id) with FK (owner, tool, host, session_id) → sessions. All timestamps (started_at, ended_at, turns.timestamp) are INTEGER NOT NULL (unix epoch seconds). tool_calls, metadata are TEXT storing JSON. Index on sessions(owner, started_at DESC) for timeline.internal/platform/database/migrations.go (create, ~10 lines) — //go:embed migrations/*.sql var FS embed.FS.internal/platform/database/database_test.go (create, TDD) — tests with :memory: DB: migrate is idempotent on second run; turn insert populates turns_fts with correct owner; turn update updates turns_fts; turn delete removes from turns_fts; same for tool_outputs_fts when tool_calls non-NULL; FK rejects orphan turn; two owners with same (tool, host, session_id) coexist as distinct sessions; FTS query with owner = ? filter returns only that owner's rows.feat(db): SQLite schema with FTS5 + migration runnerinternal/platform/observability/logger.go (create, ~110 lines) — Logger steward service.
type Logger struct { Cfg config.LoggingConfig \config:""`; L *slog.Logger }`func (l *Logger) Init(ctx) error — builds scribe.NewTintHandler (or JSON handler per cfg), applies WithLevel, WithMaskKeys("password","token","authorization","secret","cookie"), wraps with a small contextHandler that pulls request_id and user from r.Context() and adds them to every record. Sets slog.SetDefault(l.L).func WithRequestID(ctx, id string) context.Context, func RequestIDFrom(ctx) string — context helpers used by the request-id middleware in Phase 5.internal/platform/observability/metrics.go (create, ~80 lines) — Metrics steward service.
type Metrics struct { Registry *prometheus.Registry; HTTPRequests *prometheus.CounterVec; HTTPDuration *prometheus.HistogramVec; IngestLinesAccepted, IngestLinesErrored, IngestChunksCommitted prometheus.Counter }func (m *Metrics) Init(ctx) error — prometheus.NewRegistry(); register collectors.NewProcessCollector + collectors.NewGoCollector; register HTTP histograms with labels {method, route, status} (route from chi.RouteContext(r.Context()).RoutePattern() — never raw path, to keep cardinality bounded); register ingest counters.Metrics; ingest service in Phase 7 increments the ingest counters.internal/platform/health/health.go (create, ~90 lines)
type Checker interface { Name() string; Check(ctx context.Context) error }type DBCheck struct { DB *database.Database \inject:""` }— implementsChecker; registered as a steward service tagged for multi-injection. CheckrunsSELECT 1`.type Set struct { Checks []Checker \inject:""` }— steward multi-injects every registeredChecker`.func (s *Set) Run(ctx) (results map[string]error, allOK bool) — applies a per-check 2s timeout via context.WithTimeout. Empty Checks slice → returns allOK = true (intentional: no checks means nothing has declared a readiness signal yet, not an error).Checker. No edits to Set.internal/platform/health/health_test.go (create, TDD) — Set returns aggregate failure when any check errors; passes when all OK; empty Checks returns allOK=true; per-check timeout enforced. Uses fake Checker implementations (no steward needed for unit test).internal/platform/steward_unwind_test.go (create, TDD, throwaway after Phase 4) — confirms steward calls Destroy on already-init'd siblings when a later component's Init errors; if it doesn't, Database.Destroy won't run on partial-init failures and we need to add an explicit guard in main. Verifies the assumption underpinning the lifecycle design.feat(platform): scribe logger, prometheus registry, health checker setinternal/pkg/apierror/apierror.go (create, ~80 lines)
type Problem struct { Type, Title, Status, Detail, Code, Instance, Errors } — RFC 7807 shape.func Render(w, r, err error) — extracts culpa.Code from err, maps to HTTP status (NotFound→404, Invalid→400, Unauthorized→401, Forbidden→403, Conflict→409, Internal→500), writes application/problem+json. 5xx logs full stacktrace via scribe.Err before sanitizing.internal/pkg/httputil/httputil.go (create, ~50 lines) — ReadJSON, WriteJSON, ReadNDJSONLines(r io.Reader, maxBytes int64) iter.Seq2[[]byte, error].internal/server/server.go (create, ~150 lines) — Server is the steward root service.
type Server struct { Cfg config.ServerConfig \config:""`; Log *observability.Logger `inject:""`; Metrics *observability.Metrics `inject:""`; Health *health.Set `inject:""`; Auth *auth.Authenticator `inject:""`; Ingest *ingest.Handler `inject:""`; Sessions *session.Handler `inject:""`; httpSrv *http.Server }`func (s *Server) Init(ctx) error — builds chi router, mounts middleware stack:
observability.WithRequestID, echo as X-Request-ID response header.contextHandler from Phase 4.1. Body never logged.Metrics.HTTPRequests and observes Metrics.HTTPDuration using chi.RouteContext(r.Context()).RoutePattern() as the route label.GET /healthz (process up), GET /readyz (calls s.Health.Run with 5s timeout, 503 on any failure), GET /metrics (promhttp.HandlerFor(s.Metrics.Registry, promhttp.HandlerOpts{}))./api/v1/* group with s.Auth.Middleware then s.Ingest.Mount(r) and s.Sessions.Mount(r) (paths inside Mount are relative to the /api/v1 group).Cfg.Bind resolves to a loopback IP — error otherwise.func (s *Server) Start(ctx) error — spawns http.Server.ListenAndServe in a goroutine; returns nil immediately. Errors propagate via stop channel.func (s *Server) Stop(ctx) error — httpSrv.Shutdown(ctx) with Cfg.ShutdownGrace (default 10s) drain budget. In-flight ingest chunks finish their commit; partially-processed batches return their Accepted count truthfully.steward.Root() so it's always started even if no other component injects it.internal/pkg/apierror/apierror_test.go (create, TDD) — each culpa code maps to expected status; problem JSON has all required fields; internal-error response detail is sanitized (no stack trace in body).internal/server/server_test.go (create, TDD) — non-loopback bind returns error from Server.Init; recovery middleware turns panic into 500 problem; request-id propagates to log lines. Tests construct Server directly with hand-built deps (skip steward; unit test of router behavior).feat(http): chi server with middleware stack + RFC 7807 problem rendererinternal/server/auth/oidc.go (create, ~140 lines) — OIDCVerifier is a steward service, registered conditionally in main only when cfg.Auth.OIDC.Enabled.
type OIDCVerifier struct { Cfg config.OIDCConfig \config:""`; verifier *oidc.IDTokenVerifier; usernameClaim string }`func (v *OIDCVerifier) Init(ctx) error — builds oidc.NewProvider(ctx, Cfg.Issuer) (which fetches /.well-known/openid-configuration + JWKS) and provider.Verifier(&oidc.Config{ClientID: Cfg.Audience}). Accepts go-oidc default clock skew (no explicit option). Hard-fails at startup if Authelia unreachable; that's the chosen tradeoff (see Risks).func (v *OIDCVerifier) Verify(ctx, raw string) (user string, err error) — validates JWT, extracts username via usernameClaim, falls back to sub. Returns culpa.Unauthorized-coded error on any validation failure.internal/server/auth/middleware.go (create, ~150 lines) — Authenticator is a steward service.
type Authenticator struct { Cfg config.AuthConfig \config:""`; Log *observability.Logger `inject:""`; Verifier *OIDCVerifier `inject:"" optional:"true"`; allowed, admins map[string]struct{} }`func (a *Authenticator) Init(ctx) error — builds allowed and admins lowercase sets from Cfg.AllowedUsers / Cfg.Admins; if Cfg.OIDC.Enabled && Verifier == nil → hard error (config invariant breach).func (a *Authenticator) Middleware(next http.Handler) http.Handler — resolution order: (1) if OIDC enabled and Authorization: Bearer <token> present, call Verifier.Verify; (2) if forward-auth enabled and <UserHeader> non-empty, take it; (3) else 401 problem. After resolving user: lowercase, check allowed, 403 problem on miss, otherwise put Identity{User, IsAdmin} into request context via WithIdentity and call next.type Identity struct { User string; IsAdmin bool }func WithIdentity(ctx, Identity) context.Context, func IdentityFrom(ctx) (Identity, bool), func MustIdentity(ctx) Identity — context helpers used by handlers.Server.Init) mounts middleware on /api/v1/* only; /healthz, /readyz, /metrics unmounted (Phase 5/9).internal/server/auth/middleware_test.go (create, TDD) — table-driven against an in-memory router:
X-Forwarded-User) honored → 200.Authorization → 401; malformed bearer → 401; valid JWT signed by test JWKS, allowed user → 200; valid JWT, user not in allowlist → 403; expired JWT → 401; wrong-audience JWT → 401; preferred_username claim used; falls back to sub when preferred_username absent.Auth.Admins → IdentityFrom(ctx).IsAdmin == true; user not in admins → false; admin not in AllowedUsers rejected at config load (covered in Phase 2 tests).httptest.Server serving JWKS + OIDC discovery + signs JWTs with a generated RSA key; pointed at by the verifier under test./api/v1/* route validates auth; only /healthz, /readyz, /metrics exempt (enforced by mount point in Phase 9). Same auth.allowed_users allowlist applied regardless of which auth path resolved the user.feat(auth): forward-auth + OIDC bearer middleware with shared allowlistinternal/domain/ingest/repository.go (create, ~140 lines) — Repository is a steward service.
type Repository struct { Database *database.Database \inject:""` }`func (r *Repository) UpsertChunk(ctx, tx *sqlx.Tx, owner string, turns []wire.TurnEvent) error — single tx: per turn, INSERT … ON CONFLICT (owner,tool,host,session_id) DO UPDATE SET ended_at = MAX(ended_at, excluded.ended_at) for sessions (first-write-wins on metadata); INSERT … ON CONFLICT (owner,tool,host,session_id,turn_id) DO UPDATE SET <all non-key cols> for turns. owner is bound from the parameter on every row — never sourced from the wire payload. started_at falls back to MIN(turn.timestamp) when SessionMeta.StartedAt is nil.internal/domain/ingest/service.go (create, ~160 lines) — Service is a steward service.
type Service struct { Cfg config.IngestConfig \config:""`; Repo *Repository `inject:""`; Log *observability.Logger `inject:""`; Metrics *observability.Metrics `inject:""` }`type Result struct { Accepted int; Errors []LineError }func validateTurn(t wire.TurnEvent, maxContentBytes int64) error — required fields (tool, host, session_id, turn_id, seq, role, timestamp, content non-empty); role ∈ {user, assistant, tool, system}; len(content) ≤ maxContentBytes; len(SessionMeta.SourceFile) ≤ 1024. Returns culpa.Invalid-coded error on failure.func (s *Service) Ingest(ctx, owner string, body io.Reader, maxBytes int64) (Result, error) — reads NDJSON via httputil.ReadNDJSONLines; per line: JSON-unmarshal then validateTurn; buffers up to Cfg.ChunkSize lines; on full chunk, opens tx and calls Repo.UpsertChunk(ctx, tx, owner, chunk); on commit success, increments Metrics.IngestChunksCommitted and IngestLinesAccepted by chunk len, adds chunk len to Accepted; on parse/validation/DB error mid-chunk, rolls back chunk, increments IngestLinesErrored, logs at WARN with line number, owner, tool/host/session_id, returns with Accepted reflecting prior chunks plus a LineError for the failing line; subsequent lines are not processed.internal/domain/ingest/handler.go (create, ~70 lines) — Handler is a steward service.
type Handler struct { Cfg config.IngestConfig \config:""`; Service *Service `inject:""` }`func (h *Handler) Post(w, r) — extract auth.MustIdentity(r.Context()).User as owner, Content-Type check, body limit reader at Cfg.MaxBodyBytes, calls Service.Ingest(ctx, owner, ...), returns 200 {"accepted": N, "errors": [...]}. Hard server errors (DB down) → 5xx problem with empty accepted.func (h *Handler) Mount(r chi.Router) — r.Post("/ingest", h.Post). Called by Server.Init.internal/domain/ingest/service_test.go + repository_test.go (create, TDD) — tests against :memory: DB:
LineError; bad role value → LineError; content over MaxTurnContentBytes → LineError (not 413, body still under cap); source_file over 1024 → LineError. All other lines in the same chunk that came before the bad line still commit.ended_at extends: MAX(existing, incoming).started_at fallback to MIN turn timestamp when SessionMeta.StartedAt is nil.Accepted = 2*chunkSize, error references correct line number.MaxBodyBytes → 413 problem.owner (turns_fts contains turn content; tool_outputs_fts contains tool_calls JSON when present).owner. Modifying user B's session's ended_at does not affect user A's row.owner is ignored: a TurnEvent JSON line that includes a stray "owner": "evil" field (not in the wire struct) does not change the stored owner; ingest still attributes to the authenticated user. (Negative test against the trust model.)feat(ingest): NDJSON ingest with chunked transactions and partial-acceptinternal/domain/session/repository.go (create, ~150 lines) — Repository is a steward service.
type Repository struct { Database *database.Database \inject:""` }`type OwnerScope struct { User string; AllOwners bool; SpecificOwner *string } — resolved by handler from identity + ?owner= param.type ListFilter struct { Owner OwnerScope; Tool, Host *string; Since, Until *int64; Limit, Offset int }func (r *Repository) List(ctx, f ListFilter) ([]Session, error) — dynamic WHERE built from non-nil filters; owner clause: if AllOwners no clause, if SpecificOwner WHERE owner = ?, else WHERE owner = User; ORDER BY started_at DESC; LIMIT/OFFSET.func (r *Repository) Get(ctx, scope OwnerScope, tool, host, sessionID string) (*SessionWithTurns, error) — owner clause same as above; joined select; returns culpa.NotFound when session missing or owner mismatch (do not leak existence by returning 403 vs 404 — a non-admin asking for someone else's session must see the same 404 as if the session didn't exist).internal/domain/session/handler.go (create, ~120 lines) — Handler is a steward service.
type Handler struct { Repo *Repository \inject:""` }`limit=50 default, limit capped at 200, negative limit/offset clamped to default/0. since > until → 400 problem.func (h *Handler) resolveScope(r) (OwnerScope, error) — read auth.MustIdentity(r.Context()), parse ?owner=; if param empty → scope is current user; if param non-empty and identity is admin → scope is SpecificOwner (or AllOwners for ?owner=*); if param non-empty and identity is not admin → 403 problem.func (h *Handler) List(w, r) — resolveScope, parse other query params (tool, host, since, until, limit, offset), calls Repo.List, writes JSON.func (h *Handler) Get(w, r) — resolveScope, chi URL params, calls Repo.Get, writes JSON; 404 problem when not found (auto via apierror).func (h *Handler) Mount(r chi.Router) — r.Get("/sessions", h.List); r.Get("/sessions/{tool}/{host}/{session_id}", h.Get). Called by Server.Init.internal/domain/session/repository_test.go + handler_test.go (create, TDD) — tests against :memory: DB:
limit (e.g. max 200) and clamps negatives; ordering by started_at desc; turns inline in correct order by seq.?owner=B lists/gets B's rows. Admin with ?owner=* lists across all owners. Admin without ?owner= defaults to admin's own rows (no implicit cross-tenant view).?owner= rejected: non-admin user A passing ?owner=A (their own user!) or ?owner=B → 403 problem. (Param is admin-only; non-admins must not pass it at all.)feat(session): list and detail JSON API with filterscmd/lethe/main.go (modify, ~70 lines) — thin shell. No business logic; everything is a steward asset.
cfg := config.MustLoad(*configPath)
mgr := steward.NewManager()
mgr.AddComponent(ctx,
steward.MustConfigurationAsset(cfg),
steward.MustServiceAsset(&observability.Logger{}),
steward.MustServiceAsset(&observability.Metrics{}),
steward.MustServiceAsset(&database.Database{}),
steward.MustServiceAsset(&health.DBCheck{}), // registers as Checker
steward.MustServiceAsset(&health.Set{}),
steward.MustServiceAsset(&auth.Authenticator{}),
steward.MustServiceAsset(&ingest.Repository{}),
steward.MustServiceAsset(&ingest.Service{}),
steward.MustServiceAsset(&ingest.Handler{}),
steward.MustServiceAsset(&session.Repository{}),
steward.MustServiceAsset(&session.Handler{}),
steward.MustServiceAsset(&server.Server{}, steward.Root()),
)
if cfg.Auth.OIDC.Enabled {
mgr.AddComponent(ctx, steward.MustServiceAsset(&auth.OIDCVerifier{}))
}
must(mgr.Inject(ctx)); must(mgr.Init(ctx)); must(mgr.Start(ctx))
// wait on SIGINT/SIGTERM
must(mgr.Stop(stopCtx)); must(mgr.Destroy(ctx))
Server.Init (Phase 5.3): /healthz, /readyz, /metrics outside the auth group; /api/v1/* inside Authenticator.Middleware. Main does not touch chi.signal.NotifyContext(ctx, SIGINT, SIGTERM) → on cancel, mgr.Stop(ctx with 15s deadline) → mgr.Destroy(ctx) → exit.cmd/lethe/main_test.go (create, TDD, light) — end-to-end smoke via steward.Manager assembled in-test with a :memory: DB and a random-port Server: POST a fixture NDJSON as user A, GET sessions list as A (sees own row), GET session detail as A. Then POST a fresh batch as user B with the same (tool, host, session_id) and confirm both rows coexist; A still only sees A's. Confirms wiring + isolation reaches all the way through the steward graph.README.md (modify) — fill in real config example with all keys (including both auth.forward_auth and auth.oidc blocks), add curl commands for ingest + list + detail (one variant with Remote-User for testing forward-auth, one variant with Authorization: Bearer … for OIDC), finalize the dual-auth trust-model section and backup section./healthz, /readyz, /metrics are unauthenticated; mounting layout enforces this.feat(cmd): wire server with /healthz /readyz /metrics + authed /api/v1Linear: each phase depends on all prior phases. Phase 4 (observability/health) and Phase 5 (HTTP foundation) could parallelize but commit-coupling makes it not worth it. No phase can land before its dependencies (config → db → platform → http → auth → handlers → main).
golang-migrate/v4 + modernc.org/sqlite driver compatibility — multiple-statement migrations (FTS triggers) may need ; handling. Mitigation: Phase 3 test asserts migration applies; if it fails, swap to per-statement execution or goose.INSERT … ON CONFLICT … DO UPDATE in older versions. Mitigation: Phase 3 test specifically covers UPSERT path; if triggers don't fire, replace UPSERT with explicit SELECT then INSERT/UPDATE.NewOIDCVerifier errors propagate from main, lethe refuses to start. Acceptable because Authelia is on the same host; if Authelia is down lethe is unusable anyway. Forward-auth-only deployments are unaffected.git revert per phase commit.Greenfield — no compat surface. Wire format is the only forward-compat concern; pinned in internal/shared/wire/ and /api/v1/ URL prefix per Design. No further checks needed.
/api/v1/ingest — body cap + per-turn content cap are the only safeguards in v1.internal/shared/wire/./api/v1/sessions — offset+limit is enough for the expected volume.Authenticator to expose a per-identity bucket).go.mod directive is go 1.25.0, not go 1.22+. go get of golang-migrate/v4.19, viper 1.21, and prometheus 1.23 forced the bump (each requires ≥1.24). Plan said "Go 1.22+" so this satisfies the floor; flagging because the explicit number changed and the Dockerfile builder image was bumped to golang:1.25-alpine to match (consistency fix folded into the same Phase 1 commit).internal/deps/deps.go with blank imports of every direct dep so go mod tidy keeps them in go.mod until real packages start importing them. Transitional file; expected to shrink each phase and disappear by end of Phase 9. Without it, go mod tidy strips the dep stub the plan called for..golangci.yml uses the v2 schema (golangci-lint 2.11.4 rejects v1). Same lint set as the plan listed (errcheck, govet, staticcheck, revive, gosec, unused, gofmt, goimports).migrate-up, migrate-down, migrate-create to the Justfile alongside the migration runner so the targets aren't dead. (Done in Phase 3.)internal/deps/deps.go; Phase 9 deletes the file.auth.example.com placeholders; replace with phoebe-specific values when the production deploy lands (out of scope for this task).steward.Manager.Init returns on the first failing CallInit and does not iterate back over previously-initialized assets to call Destroy. The canary test TestStewardUnwindsOnInitFailure (in internal/platform/health/steward_unwind_test.go) is intentionally red on master to document this. Phase 9 main must compensate: track each component as it init's and, on Init error, walk the list in reverse calling Destroy directly on each (don't try mgr.Stop/mgr.Destroy — those panic unless the manager has reached Started). Once Phase 9 lands the explicit unwind, either delete the canary test or convert it to assert the new compensating behavior.3c45b48 via amend): chi's default 404/405 handlers wrote text/plain, violating the invariant "errors leaving any HTTP handler are rendered as RFC 7807". Added explicit chi.Router.NotFound/MethodNotAllowed handlers that call apierror.Render with NOT_FOUND / METHOD_NOT_ALLOWED codes. Added METHOD_NOT_ALLOWED → 405 entry to the apierror code-status map. Added two regression tests.UNSUPPORTED_MEDIA_TYPE → 415 entry to apierror.codeStatus (ingest handler enforces application/x-ndjson Content-Type). Repository simulates DB-down by closing the underlying *sql.DB (cleaner than service-faking, mirrors real driver-disconnect failure). Service-level FK test omitted because the schema makes it unreachable through the Service path (parent session is upserted in the same chunk); equivalent Repository-level test pins the wrap-and-classify code path.JSONText (sql.Scanner wrapper) for nullable TEXT-JSON columns — json.RawMessage cannot Scan NULL directly. External JSON shape unchanged. If Phase #3 (search) wants the same scan-safety, factor up to internal/pkg/sqljson.Server.Start to net.Listen first then http.Serve(listener) plus added Server.Addr() so :0 binds report the kernel-assigned port — enables the e2e smoke to bind to a random port without races. cmd/lethe/main.go uses a run() int shell so tests can drive it. Steward unwind canary internal/platform/health/steward_unwind_test.go deleted; main.go's reverse-order unwindOnError compensator is now the production guarantee. Bootstrap stderr slog handler installed before any asset registration so the unwind path always has a logger.4ca03be → 53221c9).go test ./... -race -count=1 fully green; no allowed-red exception.go vet, gofmt -l, go mod tidy, golangci-lint run ./... all clean.config.example.yaml; /healthz, /readyz, /metrics, unauthed /api/v1/sessions (401), authed /api/v1/sessions with Remote-User: bigbes (200) all behaved as designed; SIGTERM triggered clean shutdown via the steward graph.INSERT … ON CONFLICT … DO UPDATE fires the UPDATE trigger on SQLite (verified by TestUpsertFiresUpdateTriggerAndKeepsFTSCoherent). Regular FTS5 (not contentless / external content) was chosen so WHERE owner = ? works on the FTS table without a join — accepted the storage cost (content duplicated in real table + FTS shadow). Composite key order is (owner, tool, host, session_id[, turn_id]) everywhere; ingest INSERT/UPDATE/ON CONFLICT clauses must match. started_at/ended_at/source_file are NOT NULL — ingest derives started_at from MIN(turn.timestamp) when SessionMeta.StartedAt is absent.