Status: done
Branch: task/lethe-search-and-opencode
Worktree: /Users/blikh/data/home/lethe/.worktrees/lethe-search-and-opencode
Mode: hands-off
Module: sourcecraft.dev/bigbes/lethe
Depends on: lethe-server.md (#1) — FTS5 tables and triggers were created in #1; this task only adds query code. lethe-collector-claude-code.md (#2) — the collector framework and Parser interface this task extends.
Sibling tasks (deferred): per-tool parsers (lethe-collector-crush.md, lethe-collector-pi.md, lethe-collector-kimi.md); RFC backlog items (cost rollups for tools that report it, tagging, JSON/Markdown export).
Make the archive searchable by exposing the existing FTS5 indexes through /api/v1/search, and prove the collector parser boundary with a second tool: opencode.
A successful end state for this task: ingested Claude Code and opencode turns can be searched through the authenticated JSON API, with ranked snippets and session anchors that #7 can render in the existing React /search route.
In:
- GET /api/v1/search?q=&tool=&host=&since=&until=&include_tool_outputs=&limit=&cursor= — owner-scoped FTS5 query against turns_fts; opt-in union with tool_outputs_fts.
- internal/domain/search/ — repository and handler matching the existing domain package shape.
- 400 INVALID.
- /session/{tool}/{host}/{session_id}#turn-{turn_id}.
- internal/collector/parser/opencode/ — new parser implementing the same Parser interface from #2.
- docs/spikes/opencode-format.md before the parser is written; spike output is checked in.
- cmd/lethe-collector; otherwise the collector runner framework is untouched.
- testdata/opencode/).

Out:
- web/src/routes/search.tsx and any saved-search execute flow.
- turns_fts and tool_outputs_fts.

Search API. Add a search domain package, mount it under the existing authenticated /api/v1 group, and register it in the steward graph beside session, project, stats, and savedsearch.
Response shape:
{
  "results": [
    {
      "tool": "claude-code",
      "host": "laptop",
      "session_id": "...",
      "turn_id": "...",
      "timestamp": 1760000000,
      "role": "user",
      "working_dir": "/repo",
      "snippet": "...\u0002term\u0003...",
      "match_source": "turn",
      "rank": -1.23
    }
  ],
  "limit": 50,
  "next_cursor": "opaque-or-empty"
}
snippet uses marker runes instead of HTML so #7 can render highlights with React text nodes; match_source is turn or tool_output.
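For illustration, a consumer can split the marked snippet into plain and highlighted runs before rendering. The Go sketch below assumes the markers are U+0002 (start) and U+0003 (end), as in the response example above; the Segment type and the SplitSnippet function are hypothetical names, not part of this task.

```go
package main

import (
	"fmt"
	"strings"
)

// Segment is a hypothetical type: one run of snippet text, flagged when it
// sat between the start (U+0002) and end (U+0003) marker runes.
type Segment struct {
	Text      string
	Highlight bool
}

const (
	markStart = '\u0002'
	markEnd   = '\u0003'
)

// SplitSnippet walks the snippet once and cuts it at every marker rune.
// Unbalanced markers are tolerated: text after a dangling start marker is
// still reported as highlighted.
func SplitSnippet(s string) []Segment {
	var out []Segment
	var buf strings.Builder
	highlight := false
	flush := func() {
		if buf.Len() > 0 {
			out = append(out, Segment{Text: buf.String(), Highlight: highlight})
			buf.Reset()
		}
	}
	for _, r := range s {
		switch r {
		case markStart:
			flush()
			highlight = true
		case markEnd:
			flush()
			highlight = false
		default:
			buf.WriteRune(r)
		}
	}
	flush()
	return out
}

func main() {
	for _, seg := range SplitSnippet("fix the \u0002search\u0003 route") {
		fmt.Printf("%q highlight=%v\n", seg.Text, seg.Highlight)
	}
}
```

Because the markers are control characters rather than HTML, the snippet body never needs unescaping; a renderer only has to map each highlighted segment to its own element.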
Search query (default — include_tool_outputs=0). One FTS5 MATCH against turns_fts, joined back to turns/sessions by rowid and composite key so filters and result metadata come from canonical tables:
SELECT
  t.tool, t.host, t.session_id, t.turn_id, t.timestamp, t.role,
  s.working_dir,
  snippet(turns_fts, 0, char(2), char(3), '…', 32) AS snippet,
  bm25(turns_fts) AS rank,
  'turn' AS match_source
FROM turns_fts
JOIN turns AS t ON t.rowid = turns_fts.rowid
JOIN sessions AS s
  ON s.owner = t.owner AND s.tool = t.tool AND s.host = t.host AND s.session_id = t.session_id
WHERE turns_fts MATCH ?
  AND t.owner = ?
  AND (? IS NULL OR t.tool = ?)
  AND (? IS NULL OR t.host = ?)
  AND (? IS NULL OR t.timestamp >= ?)
  AND (? IS NULL OR t.timestamp < ?)
ORDER BY rank ASC, t.timestamp DESC, t.turn_id ASC
LIMIT ?;
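A sketch of how the positional arguments for the query above could be bound. Each optional filter appears twice in the SQL, so it is bound twice; a nil filter becomes NULL and the `? IS NULL` branch disables that predicate. The Filter fields and the BindArgs helper are assumptions for illustration and may not match the repository's actual types.

```go
package main

import "fmt"

// Filter is a hypothetical stand-in for the search filter. Nil pointers mean
// "no filter" and are sent to SQLite as NULL.
type Filter struct {
	Query string
	Owner string
	Tool  *string
	Host  *string
	Since *int64
	Until *int64
	Limit int
}

// nullable turns a nil pointer into an untyped nil (bound as NULL by
// database/sql) and dereferences everything else.
func nullable[T any](p *T) any {
	if p == nil {
		return nil
	}
	return *p
}

// BindArgs returns the 11 positional arguments in the order the placeholders
// appear in the query: MATCH, owner, tool x2, host x2, since x2, until x2, limit.
func BindArgs(f Filter) []any {
	tool, host := nullable(f.Tool), nullable(f.Host)
	since, until := nullable(f.Since), nullable(f.Until)
	return []any{
		f.Query, f.Owner,
		tool, tool,
		host, host,
		since, since,
		until, until,
		f.Limit,
	}
}

func main() {
	tool := "opencode"
	args := BindArgs(Filter{Query: "fts", Owner: "alice", Tool: &tool, Limit: 50})
	fmt.Println(len(args), args)
}
```

The resulting slice can be passed as the variadic arguments of a query call, for example `db.QueryContext(ctx, query, args...)`.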
The pagination cursor encodes (rank, timestamp, turn_id, match_source) of the last returned row. It is tied to the normalized query/filter tuple that produced it, so a cursor is only valid when replayed with the same q and filters.
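One way to tie a cursor to its query is to embed a fingerprint of the filter in the opaque token and reject a mismatch on decode. The sketch below keeps the EncodeCursor/DecodeCursor signatures named in this plan, but the wire format (base64url over JSON), the SHA-256 fingerprint, the Filter fields, and ErrCursorMismatch are all assumptions. It also assumes the filter has already been normalized before it reaches these functions.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// Filter is a hypothetical, already-normalized query/filter tuple.
type Filter struct {
	Query, Owner, Tool, Host string
	Since, Until             int64
	IncludeToolOutputs       bool
}

// Cursor holds the sort key of the last row on the previous page.
type Cursor struct {
	Rank        float64 `json:"rank"`
	Timestamp   int64   `json:"timestamp"`
	TurnID      string  `json:"turn_id"`
	MatchSource string  `json:"match_source"`
}

type envelope struct {
	Cursor Cursor `json:"c"`
	Filter string `json:"f"`
}

// ErrCursorMismatch is a hypothetical sentinel for a cursor replayed against
// a different query or filter set.
var ErrCursorMismatch = errors.New("cursor does not match query/filter")

// fingerprint hashes the filter. Struct fields marshal in declaration order,
// so equal filters always produce equal fingerprints.
func fingerprint(f Filter) string {
	raw, _ := json.Marshal(f) // cannot fail: Filter holds only basic types
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:8])
}

func EncodeCursor(c Cursor, f Filter) (string, error) {
	raw, err := json.Marshal(envelope{Cursor: c, Filter: fingerprint(f)})
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

func DecodeCursor(raw string, f Filter) (Cursor, error) {
	data, err := base64.RawURLEncoding.DecodeString(raw)
	if err != nil {
		return Cursor{}, err
	}
	var env envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return Cursor{}, err
	}
	if env.Filter != fingerprint(f) {
		return Cursor{}, ErrCursorMismatch
	}
	return env.Cursor, nil
}

func main() {
	f := Filter{Query: "fts", Owner: "alice"}
	tok, _ := EncodeCursor(Cursor{Rank: -1.23, Timestamp: 1760000000, TurnID: "t1", MatchSource: "turn"}, f)
	got, err := DecodeCursor(tok, f)
	fmt.Println(got, err)
}
```

A handler built on this would map both decode failures and ErrCursorMismatch to the same 400 response, since the client cannot act on the difference.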
Search query (include_tool_outputs=1). Two MATCH queries, one per FTS table, combined with UNION ALL, then deduplicated with a window function over (tool, host, session_id, turn_id). The better-ranked match is kept, and match_source reports which source won.
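The dedupe rule itself is small. The plan does it in SQL with a window function; the in-memory Go version below only illustrates the semantics, assuming bm25-style ranks where lower is better. Row and DedupeBestRank are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical, trimmed-down search hit.
type Row struct {
	Tool, Host, SessionID, TurnID string
	Timestamp                     int64
	Rank                          float64 // bm25: lower is better
	MatchSource                   string  // "turn" or "tool_output"
}

type turnKey struct{ tool, host, sessionID, turnID string }

// DedupeBestRank keeps one row per (tool, host, session_id, turn_id): the one
// with the lowest rank. On an exact tie the row seen first wins. The result
// is ordered like the SQL: rank ASC, timestamp DESC, turn_id ASC.
func DedupeBestRank(rows []Row) []Row {
	best := make(map[turnKey]Row, len(rows))
	for _, r := range rows {
		k := turnKey{r.Tool, r.Host, r.SessionID, r.TurnID}
		if cur, ok := best[k]; !ok || r.Rank < cur.Rank {
			best[k] = r
		}
	}
	out := make([]Row, 0, len(best))
	for _, r := range best {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.Rank != b.Rank {
			return a.Rank < b.Rank
		}
		if a.Timestamp != b.Timestamp {
			return a.Timestamp > b.Timestamp
		}
		return a.TurnID < b.TurnID
	})
	return out
}

func main() {
	rows := []Row{
		{Tool: "opencode", Host: "laptop", SessionID: "s1", TurnID: "t1", Timestamp: 10, Rank: -1.0, MatchSource: "turn"},
		{Tool: "opencode", Host: "laptop", SessionID: "s1", TurnID: "t1", Timestamp: 10, Rank: -2.5, MatchSource: "tool_output"},
		{Tool: "opencode", Host: "laptop", SessionID: "s1", TurnID: "t2", Timestamp: 20, Rank: -0.5, MatchSource: "turn"},
	}
	for _, r := range DedupeBestRank(rows) {
		fmt.Println(r.TurnID, r.MatchSource, r.Rank)
	}
}
```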
Query validation. Empty q is 400 INVALID; limit clamps to the existing 50/200 pattern; since/until parse as Unix seconds; invalid FTS syntax returns 400 INVALID rather than a 500.
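A minimal sketch of the parameter checks described above, using only the standard library. It assumes "50/200" means a default of 50 and a maximum of 200, that values below 1 fall back to the default, and that a non-numeric limit is rejected; Params, ErrInvalid, and ParseRawQuery are hypothetical names. Mapping FTS syntax errors to 400 happens at query time and is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
	"strings"
)

const (
	defaultLimit = 50  // assumed default
	maxLimit     = 200 // assumed ceiling
)

// ErrInvalid is a hypothetical sentinel a handler would render as 400 INVALID.
var ErrInvalid = errors.New("INVALID")

type Params struct {
	Query        string
	Since, Until *int64 // Unix seconds; nil when absent
	Limit        int
}

// parseUnix reads an optional Unix-seconds parameter.
func parseUnix(v url.Values, key string) (*int64, error) {
	raw := v.Get(key)
	if raw == "" {
		return nil, nil
	}
	n, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return nil, fmt.Errorf("%w: %s must be Unix seconds", ErrInvalid, key)
	}
	return &n, nil
}

// ParseRawQuery validates the query string of a search request.
func ParseRawQuery(rawQuery string) (Params, error) {
	v, err := url.ParseQuery(rawQuery)
	if err != nil {
		return Params{}, fmt.Errorf("%w: malformed query string", ErrInvalid)
	}
	p := Params{Query: strings.TrimSpace(v.Get("q")), Limit: defaultLimit}
	if p.Query == "" {
		return Params{}, fmt.Errorf("%w: q is required", ErrInvalid)
	}
	if p.Since, err = parseUnix(v, "since"); err != nil {
		return Params{}, err
	}
	if p.Until, err = parseUnix(v, "until"); err != nil {
		return Params{}, err
	}
	if raw := v.Get("limit"); raw != "" {
		n, err := strconv.Atoi(raw)
		if err != nil {
			return Params{}, fmt.Errorf("%w: limit must be an integer", ErrInvalid)
		}
		switch {
		case n < 1:
			p.Limit = defaultLimit // assumption: non-positive falls back to default
		case n > maxLimit:
			p.Limit = maxLimit
		default:
			p.Limit = n
		}
	}
	return p, nil
}

func main() {
	p, err := ParseRawQuery("q=fts&limit=500&since=1760000000")
	fmt.Println(p.Query, p.Limit, *p.Since, err)
}
```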
opencode parser — discovery first. The local install currently exposes ~/.local/share/opencode/opencode.db, storage/session/**/*.json, and tool-output/*; the spike decides which is canonical before parser code exists.
If session JSON files are canonical, implementation mirrors the Claude Code parser: discover session JSON files, parse from byte offset, and emit complete-turn events. If SQLite is canonical, implementation opens the DB read-only and uses ingestion_state.last_offset as a row marker. If neither source is stable, opencode leaves this task and the task still ships /api/v1/search.
Tradeoffs that settled it.
- docs/TODO.md, unblocks #7 cleanly, and avoids mixing parser discovery with frontend route work.
- dangerouslySetInnerHTML; #7 can convert them to <mark> with normal React nodes.
- /api/v1/stats, /api/v1/sessions, /api/v1/projects, /api/v1/saved-searches, and ingest behavior stay unchanged.
- claude-code sources keep the same config and parser behavior.
- turns/sessions tables.
- /search stub remains a stub until #7.
- /api/v1/search plus opencode parser — current docs/TODO.md assigns React search UI to #7, and stats already shipped in #5.
- /search stub.

TDD: yes (reason: FTS query behavior, cursor round-trips, owner scoping, FTS syntax errors, and opencode parser offsets are deterministic contracts where regressions should fail CI.)
- internal/shared/wire/ types are not modified.
- /api/v1/search is read-only and executes SELECT only.
- turns_fts only; tool_outputs_fts is read only when include_tool_outputs=1.
- 400 INVALID, not 500.
- parser.Parser unchanged.
- docs/spikes/opencode-format.md is committed before opencode parser implementation lands.
- /api/v1/stats behavior and React /stats page are not changed by this task.
- web/src/routes/search.tsx remains a stub until #7.
- turns_fts and tool_outputs_fts are kept current by #1's triggers for every ingested turn.
- turns.rowid is stable for the existing regular FTS5 tables.
- ~/.local/share/opencode/.
- opencode.db, storage/session/**/*.json, tool-output/*, or a combination.
- q through to MATCH.

Approach: ship /api/v1/search as an additive read domain first, then run the opencode storage spike before writing the parser; keep #3 API/parser-only so #7 can consume the search contract without frontend churn here.
internal/domain/search/repository.go:1-260 (create)
- type Result struct, type Row struct, type Filter struct, type Cursor struct — API/domain shapes for JSON output, filters, and pagination.
- func (r *Repository) Search(ctx context.Context, f Filter) (*Result, error) — executes default turns_fts search and optional tool_outputs_fts union with owner/tool/host/time filters.
- func EncodeCursor(c Cursor, f Filter) (string, error) / func DecodeCursor(raw string, f Filter) (Cursor, error) — opaque cursor tied to normalized query/filter tuple.

internal/domain/search/repository_test.go:1-360 (create)

search: add fts repository

internal/domain/search/handler.go:1-220 (create)
- func (h *Handler) Mount(r chi.Router) — registers GET /search under /api/v1.
- func (h *Handler) List(w http.ResponseWriter, r *http.Request) — resolves auth owner scope, parses query params, clamps limit to 50/200, renders JSON or RFC 7807 errors.
- func (h *Handler) resolveScope(r *http.Request) (session.OwnerScope, error) — mirrors session/project admin owner rules.

internal/domain/search/handler_test.go:1-260 (create)
- q, bad since, non-admin owner, admin owner=*, bad cursor, and successful response envelope.

internal/server/server.go:31-66,103-110 (modify)
- *search.Handler and mount it inside the authenticated /api/v1 group.

cmd/lethe/main.go:26-137 and cmd/lethe/main_e2e_test.go:73-92 (modify)
- search.Repository and search.Handler with steward in production and e2e graph setup.

search: expose search endpoint

cmd/lethe-spike-opencode/main.go:1-180 (create, then delete before phase commit)
- ~/.local/share/opencode/, ~/.config/opencode/, and ~/.cache/opencode/; report structural file types, counts, sizes, and redacted samples.

docs/spikes/opencode-format.md:1-160 (create)

collector: document opencode storage format

internal/collector/parser/opencode/parser.go:1-320 (create)
- func New(host string) *Parser, func (p *Parser) Tool() string, func (p *Parser) Discover(root string) ([]parser.SourceFile, error), func (p *Parser) Parse(path string, since int64) ([]wire.TurnEvent, int64, error) — implement the source shape chosen in PH3 without changing parser.Parser.
- func mapRecord(...) (wire.TurnEvent, bool) or SQLite-equivalent mapper — converts opencode session/message/tool-output records into wire.TurnEvent.

internal/collector/parser/opencode/parser_test.go:1-260 and internal/collector/parser/opencode/testdata/* (create)

cmd/lethe-collector/main.go:17-221 and cmd/lethe-collector/main_test.go:1-90 (modify)
- opencode.New(host) in buildParsers; test that both claude-code and opencode are present.

collector: add opencode parser

- internal/domain/search repository tests for FTS result shape, owner scope, filters, cursor, tool-output opt-in, and invalid query handling.
- internal/domain/search handler tests for query parsing, auth scoping, route mount, and response envelope.
- go test ./... -count=1; collector CLI smoke with an opencode source in config once PH4 lands.
- master.

- MATCH syntax can turn user input into hard SQL errors; PH1 maps those to 400 INVALID and keeps normalization isolated.
- tool-output/*; PH3 must choose a marker that PH4 can persist in last_offset without state schema changes.

- func (r *Repository) Search(ctx context.Context, f Filter) (*Result, error) — search read boundary used only by the HTTP handler.
- func (h *Handler) Mount(r chi.Router) — server mount contract matching other domain packages.
- func New(host string) *Parser — opencode parser constructor registered by the collector CLI.
- func buildParsers(host string) map[string]parser.Parser — collector parser registry remains the only dispatch point.
- docs/spikes/opencode-format.md — canonical opencode source choice consumed by the parser phase.

- internal/domain/search/
- internal/domain/search/, internal/server/, cmd/lethe/
- docs/spikes/opencode-format.md
- internal/collector/parser/opencode/, cmd/lethe-collector/

Backwards-compat: additive route and parser registration only; PH1/PH2 do not alter existing routes or schema, and PH4 does not change the parser interface, runner, or collector state schema.
Scope check: no stats work, no React search UI, no schema migration, no saved-search changes, and no parser abstraction beyond buildParsers.
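To make the "registry is the only dispatch point" contract concrete, here is a self-contained sketch. The interface methods follow the signatures listed in this plan, but SourceFile and TurnEvent are local stand-ins for parser.SourceFile and wire.TurnEvent, and stubParser replaces the real claude-code and opencode parsers, so everything except the method set and the buildParsers signature is an assumption.

```go
package main

import "fmt"

// Stand-ins for parser.SourceFile and wire.TurnEvent; the real fields are
// not shown in this plan.
type SourceFile struct{ Path string }
type TurnEvent struct{ Seq int64 }

// Parser mirrors the method set named in the plan.
type Parser interface {
	Tool() string
	Discover(root string) ([]SourceFile, error)
	Parse(path string, since int64) ([]TurnEvent, int64, error)
}

// stubParser is a placeholder for the real per-tool parsers.
type stubParser struct{ tool, host string }

func (p *stubParser) Tool() string { return p.tool }

func (p *stubParser) Discover(root string) ([]SourceFile, error) { return nil, nil }

// Parse returns no events and leaves the offset where it was.
func (p *stubParser) Parse(path string, since int64) ([]TurnEvent, int64, error) {
	return nil, since, nil
}

// buildParsers keys each parser by its own Tool() name, so adding a tool is
// one more entry in the slice and nothing else in the runner changes.
func buildParsers(host string) map[string]Parser {
	parsers := []Parser{
		&stubParser{tool: "claude-code", host: host},
		&stubParser{tool: "opencode", host: host},
	}
	out := make(map[string]Parser, len(parsers))
	for _, p := range parsers {
		out[p.Tool()] = p
	}
	return out
}

func main() {
	reg := buildParsers("laptop")
	_, hasClaude := reg["claude-code"]
	_, hasOpencode := reg["opencode"]
	fmt.Println(len(reg), hasClaude, hasOpencode)
}
```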
Result: passed
Positive:
- /api/v1/search repository and handler tests cover ranked prose search, tool-output opt-in, filters, cursors, and response envelope.
- go build ./cmd/lethe ./cmd/lethe-collector succeeds.
- go test ./... -count=1 passes.

Negative:
- INVALID.
- ?owner= on search returns FORBIDDEN.
- tool-output/ blob contents.

Invariants / assumptions:
- internal/shared/wire.
- parser.Parser, keeps collector state schema unchanged, and consumes the committed storage spike.
- /search route were not changed.

Interfaces:
- Repository.Search(ctx, Filter) is called by handler and repository tests.
- Handler.Mount(r chi.Router) registers /api/v1/search.
- opencode.New(host) is registered through buildParsers and tested by cmd/lethe-collector.
- docs/spikes/opencode-format.md records the SQLite source and message.rowid marker used by PH4.

Smoke: go test ./internal/domain/search -run TestHandler_SuccessfulResponseEnvelope -v and go test ./internal/collector/parser/opencode -run TestParse_MapsTurnsAndIdentity -v both pass.
Outcome: /api/v1/search and the opencode collector parser shipped on task/lethe-search-and-opencode through 5cc599d.
Invariants:
- internal/shared/wire/ was not modified.
- INVALID.
- parser.Parser unchanged.
- docs/spikes/opencode-format.md landed before parser implementation.
- /search route was not changed.
- turns.rowid in tests and implementation.
- ~/.local/share/opencode/.
- last_offset stores next opencode message.rowid, and TurnEvent.Seq stores current rowid.
- opencode.db is canonical for v1.
- INVALID; no stricter normalizer was needed.
- message.time_created to inclusive next-message.rowid after reviewer found skipped-row risk in partial-accept paths.
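The rowid marker noted above can be illustrated in memory. This sketch assumes the semantics stated in the outcome notes: the stored offset is the next rowid to read (inclusive), and each event's Seq is the rowid it came from. Message, ParseFrom, and OffsetAfter are hypothetical names, and the real parser reads from SQLite rather than a slice.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for a row in opencode's message table.
type Message struct {
	RowID int64
	Body  string
}

// TurnEvent is a stand-in for wire.TurnEvent; Seq carries the source rowid.
type TurnEvent struct {
	Seq     int64
	Content string
}

// ParseFrom emits every message with rowid >= since and returns the next
// offset: one past the highest rowid seen, or since if nothing was new.
func ParseFrom(msgs []Message, since int64) ([]TurnEvent, int64) {
	next := since
	var events []TurnEvent
	for _, m := range msgs {
		if m.RowID < since {
			continue
		}
		events = append(events, TurnEvent{Seq: m.RowID, Content: m.Body})
		if m.RowID+1 > next {
			next = m.RowID + 1
		}
	}
	return events, next
}

// OffsetAfter computes the offset to persist when only a prefix of the
// emitted events was accepted. Rows after the prefix stay at or above the
// stored offset, so the next run re-reads them instead of skipping them.
func OffsetAfter(accepted []TurnEvent, prev int64) int64 {
	if len(accepted) == 0 {
		return prev
	}
	return accepted[len(accepted)-1].Seq + 1
}

func main() {
	msgs := []Message{{1, "a"}, {2, "b"}, {5, "c"}}
	events, next := ParseFrom(msgs, 1)
	fmt.Println(len(events), next)

	// Only the first event was accepted: resume from rowid 2, not 6.
	resume := OffsetAfter(events[:1], 1)
	again, _ := ParseFrom(msgs, resume)
	fmt.Println(resume, len(again))
}
```

A timestamp marker cannot give the same guarantee when several rows share a timestamp, which is consistent with the switch to rowid recorded above.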