How It Works

Indexing pipeline

sessfind reads session files from each configured source (GitHub Copilot, Claude Code, OpenCode, Cursor, Codex).
User/assistant messages are paired and split into chunks (~6000 chars each).
Chunks are indexed with tantivy full-text search.
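The pairing and chunking steps above can be sketched roughly as follows. This is a minimal illustration, not sessfind's actual code: the `(role, text)` message shape, the `Exchange` type, the greedy packing strategy, and treating ~6000 characters as a hard cap are all assumptions made for the example.

```python
from dataclasses import dataclass

CHUNK_SIZE = 6000  # approximate character budget per chunk, taken from the text above


@dataclass
class Exchange:
    """One user message paired with the assistant reply that follows it (hypothetical type)."""
    user: str
    assistant: str


def pair_messages(messages):
    """Pair each user message with the next assistant message.

    `messages` is a list of (role, text) tuples; this shape is an assumption.
    """
    exchanges = []
    pending_user = None
    for role, text in messages:
        if role == "user":
            # A user message that never got a reply is kept as its own exchange.
            if pending_user is not None:
                exchanges.append(Exchange(pending_user, ""))
            pending_user = text
        elif role == "assistant":
            exchanges.append(Exchange(pending_user or "", text))
            pending_user = None
    if pending_user is not None:
        exchanges.append(Exchange(pending_user, ""))
    return exchanges


def split_into_chunks(exchanges, chunk_size=CHUNK_SIZE):
    """Greedily pack exchanges into chunks of at most `chunk_size` characters."""
    chunks = []
    current = ""
    for ex in exchanges:
        text = f"User: {ex.user}\nAssistant: {ex.assistant}\n"
        # Hard-split a single exchange that is larger than the budget.
        pieces = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        for piece in pieces:
            # Flush the current chunk if adding this piece would exceed the budget.
            if current and len(current) + len(piece) > chunk_size:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk would then be added to the full-text index as one document.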

Incremental updates

Each file's mtime and size are tracked in a local SQLite database (state.db). On subsequent runs, only new or modified session files are re-indexed, so the cost of an update depends mainly on how many files changed rather than on the total size of the session history.
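A change check of this kind could look like the sketch below. The table name `files`, its columns, and the helper names are hypothetical; the real schema of state.db is not documented here.

```python
import os
import sqlite3


def open_state(db_path):
    """Open (or create) a state database with a hypothetical `files` table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, mtime_ns INTEGER NOT NULL, size INTEGER NOT NULL)"
    )
    return conn


def needs_reindex(conn, path):
    """Return True if `path` is new or its mtime/size differ from the stored values."""
    st = os.stat(path)
    row = conn.execute(
        "SELECT mtime_ns, size FROM files WHERE path = ?", (path,)
    ).fetchone()
    return row is None or row != (st.st_mtime_ns, st.st_size)


def mark_indexed(conn, path):
    """Record the file's current mtime and size after it has been indexed."""
    st = os.stat(path)
    conn.execute(
        "INSERT OR REPLACE INTO files (path, mtime_ns, size) VALUES (?, ?, ?)",
        (path, st.st_mtime_ns, st.st_size),
    )
    conn.commit()
```

Comparing mtime and size avoids reading or hashing file contents, at the cost of missing an edit that preserves both values.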

Search (FTS + Fuzzy)

FTS mode queries the tantivy index with BM25 ranking for relevance-sorted results.
Fuzzy mode does in-memory substring matching on pre-loaded chunks across content, project name, and title.
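As an illustration of the fuzzy mode described above, here is a minimal sketch of substring matching over chunks already held in memory. The `Chunk` type and the case-insensitive comparison are assumptions, not confirmed details of sessfind.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """A pre-loaded chunk with the three searchable fields (hypothetical type)."""
    content: str
    project: str
    title: str


def fuzzy_search(chunks, query):
    """Return chunks whose content, project name, or title contains `query`.

    Matching is case-insensitive here; that choice is an assumption.
    """
    needle = query.casefold()
    if not needle:
        return []
    return [
        c for c in chunks
        if any(needle in field.casefold() for field in (c.content, c.project, c.title))
    ]
```

Unlike FTS mode, a scan like this applies no relevance ranking; results come back in whatever order the chunks were loaded.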

Semantic search pipeline

The optional sessfind-semantic plugin generates vector embeddings for each chunk using the multilingual-e5-small model (384 dimensions, via ONNX Runtime).
Embeddings are stored in a local sqlite-vec database (semantic.db).
At query time, the input is embedded and compared against stored vectors via cosine similarity.
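The comparison step can be illustrated with a brute-force cosine similarity search. This is only a sketch of the math: it does not use the sqlite-vec API or the embedding model, and the `(chunk_id, vector)` storage shape is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=5):
    """Rank stored (chunk_id, vector) pairs by similarity to `query_vec`."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id) for chunk_id, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]
```

In the real pipeline each vector would have 384 dimensions, matching the model named above.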

LLM search pipeline

sessfind detects installed AI CLI tools (claude, opencode, copilot) on PATH.
The user's natural language query is sent to the selected LLM in headless mode (e.g., claude -p).
The LLM generates optimized FTS queries (synonyms, related terms, multiple languages).
sessfind executes each generated query against the tantivy index and merges the results.
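Two of the steps above, tool detection and result merging, can be sketched as follows. The merge rule shown (keep each chunk once, with its best score) is an assumption; the text does not say how sessfind combines results, and scores from different queries are not strictly comparable.

```python
import shutil

CANDIDATE_TOOLS = ["claude", "opencode", "copilot"]  # CLI names listed in the text above


def detect_tools(candidates=CANDIDATE_TOOLS, which=shutil.which):
    """Return the candidate CLI names that resolve to an executable on PATH."""
    return [name for name in candidates if which(name) is not None]


def merge_results(result_lists):
    """Merge per-query hits, keeping each chunk once with its best score.

    Each hit is a (chunk_id, score) tuple; higher scores rank first.
    """
    best = {}
    for hits in result_lists:
        for chunk_id, score in hits:
            if chunk_id not in best or score > best[chunk_id]:
                best[chunk_id] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it is `shutil.which`, which searches PATH.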

Resume mechanism

When you select a session and press Enter, sessfind replaces the current process (exec()) with the appropriate tool's resume command:

| Source | Resume Command |
| --- | --- |
| GitHub Copilot | `copilot --resume=SESSION_ID` |
| Claude Code | `claude --resume SESSION_ID` |
| OpenCode | `opencode --session SESSION_ID` |
| Cursor | `cursor PROJECT_PATH` |
| Codex | `codex resume SESSION_ID` |

Because exec() replaces the sessfind process rather than spawning a child, the terminal is handed directly to the AI tool and no sessfind process is left running in the background.
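The hand-off can be sketched with Python's `os.execvp`, which replaces the calling process in the same way. The source keys (`"copilot"`, `"claude"`, and so on) and both function names are hypothetical; only the command shapes come from the resume commands listed above.

```python
import os


def resume_argv(source, session_id=None, project_path=None):
    """Build the argument vector for a source's resume command."""
    if source == "copilot":
        return ["copilot", f"--resume={session_id}"]
    if source == "claude":
        return ["claude", "--resume", session_id]
    if source == "opencode":
        return ["opencode", "--session", session_id]
    if source == "cursor":
        # Cursor is opened on the project path rather than a session ID.
        return ["cursor", project_path]
    if source == "codex":
        return ["codex", "resume", session_id]
    raise ValueError(f"unknown source: {source}")


def resume(source, session_id=None, project_path=None):
    """Replace the current process with the resume command.

    os.execvp does not return on success; it raises OSError if the
    program cannot be found or executed.
    """
    argv = resume_argv(source, session_id=session_id, project_path=project_path)
    os.execvp(argv[0], argv)
```

Passing the arguments as a list avoids shell quoting problems with session IDs or paths that contain spaces.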