From ab6efef20d1f15a9c5da17ded758114abb40aeb8 Mon Sep 17 00:00:00 2001
From: Eugene Blikh
Date: Sun, 3 May 2026 14:42:02 +0300
Subject: [PATCH] docs: clarify collector plan assumptions

---
 docs/tasks/lethe-collector-claude-code.md | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/docs/tasks/lethe-collector-claude-code.md b/docs/tasks/lethe-collector-claude-code.md
index 621fe1f5bf9a6d687cc575ee19548988d9dcde7a..e9820b652aa0c3f69025422a1de1e886e133445d 100644
--- a/docs/tasks/lethe-collector-claude-code.md
+++ b/docs/tasks/lethe-collector-claude-code.md
@@ -134,10 +134,17 @@ log:
 - *Parse-then-batch vs streaming POST:* batching keeps the wire protocol simple (NDJSON body, one HTTP call) and lets the server commit chunks atomically. Streaming would force the server to handle interrupted bodies — the RFC's chunked-commit response shape works because the body is bounded.
 - *Synthesize missing turn_ids vs require source IDs:* Claude Code always provides UUIDs in current versions, but the parser can't assume that holds for older fixture files or future regressions. Synthesis preserves idempotency; the rare case of a `content[:64]` collision within one session at one timestamp is acceptable.
 
-**Unknowns that remain.**
-- Whether `tailscale serve` injects `Tailscale-User-Login` for daemon HTTP clients (vs only browsers). If not, I add a `lethe-token` shared-secret fallback header in the deploy step — a 5-line server change. Confirmed empirically before declaring this task done.
-- True line-size distribution of Claude Code `.jsonl` events. If it exceeds `bufio.Scanner`'s default 64 KiB token buffer, the parser uses `Scanner.Buffer(buf, maxSize)` with maxSize = 16 MiB. Captured here so the test fixtures cover the long-line case.
-- Whether the laptop's `~/.claude/projects/` ever contains files concurrent-written from multiple Claude Code processes. If yes, the parser still works (append-only, monotonic offset), but the test plan should cover it.
+### Assumptions
+
+- AS1 — The server-side ingest contract remains the locked `internal/shared/wire.TurnEvent` over `POST /api/v1/ingest`.
+- AS2 — Claude Code transcript files are append-only for the byte ranges the collector has already read.
+- AS3 — This task has one host identity and one configured Claude Code source root per collector process.
+
+### Unknowns
+
+- UK1 — Whether `tailscale serve` injects `Tailscale-User-Login` for daemon HTTP clients.
+- UK2 — True line-size distribution of Claude Code `.jsonl` events.
+- UK3 — Whether the laptop's `~/.claude/projects/` ever contains files concurrent-written from multiple Claude Code processes.
 
 ### Backwards-compatibility check