~bigbes/lethe

ref: 6971b2d09703b22178c58427be3211f93648eaca lethe/docs/tasks/lethe-search-and-opencode.md -rw-r--r-- 12.4 KiB
6971b2d0 — Eugene Blikh web: palette — gate prefetch with enabled+staleTime per plan 4.2 a month ago

#lethe-search-and-opencode

Status: Design (hands-off) Module: sourcecraft.dev/bigbes/lethe Depends on: lethe-server.md (#1) — FTS5 tables and triggers were created in #1; this task only adds query code. lethe-collector-claude-code.md (#2) — the collector framework and Parser interface this task extends. Sibling tasks (deferred): per-tool parsers (lethe-collector-crush.md, lethe-collector-pi.md, lethe-collector-kimi.md); RFC backlog items (cost rollups for tools that report it, tagging, JSON/Markdown export).

#Design

#Purpose

Make the archive answer "what did I ask any robot last Tuesday about X" in under five seconds, and add the second tool to prove the parser interface scales. End state: a search box on the timeline that returns ranked turn snippets with click-through to session, a stats page showing token rollups, and a working opencode parser registered in the collector.

A successful end state for this task: I can type "tt bundle lockfile" into the search box, see ranked snippets across both Claude Code and opencode sessions on either machine, click into any result and land on the matching turn. The stats page shows a per-tool, per-day token bar for the last month.

#Scope

In:

  • Server-side:
    • GET /api/v1/search?q=&tool=&host=&since=&until=&include_tool_outputs=&limit=&cursor= — FTS5 query against turns_fts. With include_tool_outputs=1, also union tool_outputs_fts and merge by BM25 rank, deduped on (tool, host, session_id, turn_id).
    • GET /api/v1/stats?group_by=tool|host|day&since=&until= — token rollups (sum tokens_in, tokens_out, count of turns, count of sessions). Cost is summed only where non-NULL.
    • HTML GET /search?q=... — server-rendered results page with snippet/highlight, filters in URL.
    • HTML GET /stats — table view of the stats endpoint, simple and readable.
    • Search-as-you-type debouncing (vanilla JS, ~300 ms) on the timeline and search pages.
    • Session view gains an anchor-jump on #turn-<turn_id> and a "find on page" hint.
  • Collector-side:
    • internal/collector/parser/opencode/ — new parser implementing the same Parser interface from #2.
    • Format-discovery spike captured under docs/spikes/opencode-format.md before the parser is written; spike output is checked in.
    • One config-entry change to register opencode in the dispatcher; otherwise the collector framework is untouched.
  • Golden fixtures for the opencode parser (anonymized snippets in testdata/opencode/).

Out:

  • Cost rollups for tools that don't report cost. Stats endpoint sums what's there and reports nulls as null.
  • crush, pi, kimi parsers — separate task files when the time comes.
  • Tag system for manual session annotation (RFC backlog).
  • JSON / Markdown export endpoints (RFC backlog).
  • Faceted search UI (e.g. histogram-driven date range pickers). Filters stay as form fields and URL params.
  • Saved searches, alerts, RSS, anything subscription-shaped.
  • A second machine's deploy (the goal is to prove the parser interface with #2; running on the work PC is a deployment exercise, not a code change).

#Chosen approach

Search query (default — include_tool_outputs=0). One FTS5 MATCH against turns_fts, filtered by the optional tool, host, since, until UNINDEXED columns:

SELECT
    f.tool, f.host, f.session_id, f.turn_id, f.timestamp, f.role,
    snippet(turns_fts, 0, '<mark>', '</mark>', '…', 32) AS snippet,
    bm25(turns_fts) AS rank
FROM turns_fts AS f
WHERE turns_fts MATCH ?
  AND (? IS NULL OR f.tool = ?)
  AND (? IS NULL OR f.host = ?)
  AND (? IS NULL OR f.timestamp >= ?)
  AND (? IS NULL OR f.timestamp <  ?)
ORDER BY rank LIMIT ?;

Pagination: cursor encodes (rank, turn_id) of the last row; next page applies (rank, turn_id) > cursor in lexicographic order. Offset-based pagination over FTS5 is correct but gets pathological at depth; a tuple cursor is barely more code. For the personal scale, offset+limit would also be fine — defaulting to cursor anyway because changing pagination later is an API break.

Search query (include_tool_outputs=1). Two MATCH queries (one per FTS table), UNION ALL, then a window dedupe on (tool, host, session_id, turn_id) keeping the better-ranked match (a turn might appear in both indexes). The frontend sets include_tool_outputs=1 only when the user toggles the "include tool output" checkbox; the default URL stays clean and fast.

Stats query. Standard SQL aggregation over turns, parameterized by group_by:

  • group_by=toolGROUP BY tool returning per-tool totals.
  • group_by=hostGROUP BY host.
  • group_by=dayGROUP BY date(timestamp, 'unixepoch') returning a daily series.

The idx_turns_timestamp index from #1 covers the date-range filter; for the small data scale this stays sub-second without further tuning.

Search HTML. A new search.html template sharing base.html with the timeline and session views. The result list shows: tool icon, host, working_dir, timestamp, snippet (with <mark>), session-link to /session/{tool}/{host}/{session_id}#turn-<turn_id>. Snippets pass through bluemonday UGC policy before insertion.

JS (vanilla, no framework). One small script (internal/ui/static/lethe.js):

  • Debounced search-as-you-type: 300 ms after last keystroke, fetch /search?q=...&format=fragment, replace the results region's innerHTML. The format=fragment query param tells the server to return only the results partial template, no <html> chrome.
  • Anchor-jump highlight on session view (#turn-<id> → scroll, briefly add a .highlighted CSS class).
  • Tool-call expand/collapse uses native <details> from #1; no JS needed.

Total JS budget: under 50 lines. CSP-safe; no eval.

opencode parser — discovery first.

The format is unknown until the spike runs. Step 0 of this task is cmd/lethe-spike-opencode/main.go (deleted before the task closes), which:

  1. Walks ~/.local/share/opencode/ and ~/.config/opencode/ and ~/.cache/opencode/.
  2. Reports file types, sizes, sample contents.
  3. Identifies whether sessions are stored as JSONL, JSON-per-session, SQLite, or something else.

Spike output goes to docs/spikes/opencode-format.md. Until that spike runs, the parser implementation below is provisional. Two likely shapes:

  • If JSONL or JSON-per-session: implementation mirrors the Claude Code parser. Parse() reads from a byte offset, returns wire.TurnEvents, persists offset.
  • If SQLite-backed: Parse() opens the source DB read-only with _busy_timeout set, queries rows newer than a stored marker (rowid or (session, sequence)), maps to wire.TurnEvent. The ingestion_state.last_offset column gets reused for the marker (it's an INTEGER — works for either rowid or byte offset).

In both cases the Parser interface from #2 is unchanged. The opencode-specific code is contained in internal/collector/parser/opencode/.

If the spike reveals an opaque or encrypted format, opencode is dropped from this task and an issue is filed against opencode upstream. The task still ships search.

Tradeoffs that settled it.

  • Single-table FTS query vs always-union: unioning costs nothing on small data, but it muddies BM25 scoring (tool outputs have wildly different term distributions than prose). Default to clean prose search; let the user opt in.
  • Cursor vs offset pagination: offset+limit is fine at this scale, but changing pagination later is an API break. Cursor cost is one struct and a base64 helper.
  • Server-side fragment rendering vs client-side JSON+template: fragment rendering keeps JS at 50 lines and templates as the single source of truth. Client-side rendering would force a parallel JS template path.
  • Discovery spike vs guess-and-iterate on opencode: the spike is 30 minutes; iterating against a wrong assumption costs an evening. The spike's output is also documentation worth keeping.

Unknowns that remain.

  • opencode's actual on-disk format. Resolved by the spike before parser code is written.
  • BM25 tuning. SQLite's default BM25 (k1=1.2, b=0.75) is fine for prose. If empirical search quality is bad against real data, tune via bm25(turns_fts, 5.0, 1.0, ...) per-column weights — that's a #3-and-a-half follow-up, not a blocker.
  • Stats-page chart. v1 is a table. Adding a sparkline is a one-CSS-grid afternoon if I want it; not in scope here.

#Backwards-compatibility check

  • Server: only adds routes. /api/v1/sessions/... and the timeline/session HTML are unchanged. The internal/shared/wire/ types are untouched. No schema changes.
  • Collector: only adds a new parser package and one config entry under sources:. The Parser interface is unchanged. Existing claude-code config keeps working.
  • Database: zero migrations. The FTS tables and triggers were created in #1 and are merely queried here for the first time.

#Hands-off decisions

  • udesign: cursor pagination for /api/v1/search rather than offset — costs one helper, prevents an API break later. Stats keeps since/until only (no pagination — the result set is intrinsically small).
  • udesign: include_tool_outputs defaults to false — keeps default search latency low and result lists uncluttered. UI surfaces a checkbox; URL param drives behavior.
  • udesign: server-side fragment rendering for live-search — single template path beats a parallel JS template engine. ?format=fragment returns the partial.
  • udesign: BM25 default weights — no per-column tuning until empirical evidence forces it.
  • udesign: opencode discovery spike is mandatory before parser code — protects against an evening of guessing.
  • udesign: spike binary at cmd/lethe-spike-opencode/ deleted before task closes; spike findings preserved in docs/spikes/opencode-format.md. Spike code is throwaway; the writeup is the artifact.
  • udesign: stats UI is a table for v1 — sparkline / chart is YAGNI until I actually use the page.
  • udesign: cost rollups null-pass — sum where present, ignore otherwise. No fake "estimated cost" math. Honest data.
  • udesign: anchor-jump uses #turn-<turn_id> rather than scroll-into-view-via-JS — anchors are stable across renders, work with browser back/forward, and survive the page being reloaded from a bookmark.

TDD: yes (reason: FTS query result shape is exactly the kind of thing that silently regresses on schema or trigger changes; stats aggregation has off-by-one risks around timezone/date boundaries; opencode parser determinism mirrors the claude-code parser's TDD justification.)

#Invariants

  • All schema migrations remain in #1's migrations/. This task adds no new migration files.
  • internal/shared/wire/ types are not modified by this task. If a need arises, it's flagged and split into a separate amendment task.
  • GET /api/v1/search and GET /api/v1/stats are read-only: they execute SELECT only, with no transaction lower than BEGIN DEFERRED.
  • BM25 ranking is the default for /api/v1/search; time-ordered results are explicitly opt-in via a future ?order=time parameter (not implemented in this task; reserved).
  • Search and stats endpoints honor the same Tailscale-User-Login allowlist middleware as #1's existing routes. No exemptions.
  • Snippets and any user-supplied query reflected back into HTML pass through bluemonday before insertion.
  • The opencode parser implements parser.Parser from #2 verbatim — no special interface, no opencode-specific hooks in the ingestion loop.
  • Source-format identification for opencode is documented in docs/spikes/opencode-format.md and committed before the parser code lands.
  • tool_outputs_fts is queried only when include_tool_outputs=1; default search path never touches it.
  • All client-side JavaScript stays under 50 lines, lives in one file, and works without any build step.

#Principles

  • Search defaults are the prose-only path. Tool-output search is a power-user toggle, not the headline.
  • One template tree, server-rendered. Live updates go through ?format=fragment partials, not a parallel JS render path.
  • A new parser is one directory under internal/collector/parser/<tool>/ plus a single register.go import — opencode's job is to prove this and leave the path obvious for crush/pi/kimi.
  • Spike before commit when the format is unknown. The spike's writeup is the artifact; the spike's code is throwaway.
  • Stats are honest. Null-pass on missing fields. No estimates, no synthesis.
  • Vanilla JS, no framework, no build step, no transpiler, no bundler. The day this stops working is the day the project has changed shape enough to merit its own task.