SQLite-backed persistence with in-memory working memory, semantic search via embeddings, and tri-hybrid retrieval.
Stored fields: id, session_id, role, content (null), tool_call_id, tool_name, tool_calls_json, created_at, importance (0.5), embedding, embedding_error, consolidated_at (auto), category, key, value, source (""), updated_at.
An in-memory HashMap<String, VecDeque<Message>> keyed by session holds recent messages, with each session's queue capped at working_memory_cap (default 50). This avoids database hits for recent conversation history.
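The capped per-session queue described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `WorkingMemory` type, its `push` and `recent` methods, the trimmed-down `Message` struct, and the oldest-first eviction policy are all assumptions made for the example.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, trimmed-down message type; the real one carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub id: i64,
    pub role: String,
    pub content: String,
}

/// Per-session working memory with a fixed cap on each session's queue.
pub struct WorkingMemory {
    cap: usize,
    sessions: HashMap<String, VecDeque<Message>>,
}

impl WorkingMemory {
    pub fn new(cap: usize) -> Self {
        Self { cap, sessions: HashMap::new() }
    }

    /// Append a message, evicting the oldest entries once the cap is exceeded
    /// (assumed policy; the source only states that the queue is capped).
    pub fn push(&mut self, session_id: &str, msg: Message) {
        let queue = self.sessions.entry(session_id.to_string()).or_default();
        queue.push_back(msg);
        while queue.len() > self.cap {
            queue.pop_front();
        }
    }

    /// Recent messages for a session, oldest first; empty for unknown sessions.
    pub fn recent(&self, session_id: &str) -> Vec<Message> {
        self.sessions
            .get(session_id)
            .map(|q| q.iter().cloned().collect())
            .unwrap_or_default()
    }
}
```

A VecDeque suits this access pattern because both appending at the back and evicting from the front are constant-time operations.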
The get_context method combines three retrieval strategies:
Results are deduplicated by message ID before being included in context.
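Deduplication by message ID might look like the sketch below. The `RetrievedMessage` type, the `merge_dedup` function, and the use of an `i64` ID are assumptions for illustration; the source does not specify the ID type or which copy of a duplicate is kept. Here the first occurrence wins, so earlier result lists take priority.

```rust
use std::collections::HashSet;

/// Hypothetical retrieval result; only the ID matters for deduplication.
#[derive(Debug, Clone, PartialEq)]
pub struct RetrievedMessage {
    pub id: i64, // assumed integer ID
    pub content: String,
}

/// Merge result lists in order, keeping the first occurrence of each ID.
pub fn merge_dedup(sources: Vec<Vec<RetrievedMessage>>) -> Vec<RetrievedMessage> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for msg in sources.into_iter().flatten() {
        // `insert` returns false when the ID was already present.
        if seen.insert(msg.id) {
            merged.push(msg);
        }
    }
    merged
}
```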
A background task runs every consolidation_interval_hours (default 6). It compresses old conversations into summaries using the fast model, reducing storage and context window usage.
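A single consolidation pass could be sketched as below. Everything here is an assumption made for illustration: the `StoredMessage` type, the `max_age_secs` threshold for what counts as "old", marking rows through `consolidated_at`, and the `summarize` closure standing in for the call to the fast model. A scheduler would invoke a pass like this once per interval.

```rust
use std::time::Duration;

/// Hypothetical, trimmed-down message row.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredMessage {
    pub id: i64,
    pub content: String,
    pub created_at: u64, // Unix seconds (assumed representation)
    pub consolidated_at: Option<u64>,
}

/// Convert the configured interval in hours into a Duration for a timer.
pub fn consolidation_interval(hours: u64) -> Duration {
    Duration::from_secs(hours * 3600)
}

/// One pass: collect unconsolidated messages older than `max_age_secs`,
/// mark them as consolidated, and return a summary of their contents.
/// Returns None when no message qualified.
pub fn consolidate_pass<F>(
    messages: &mut [StoredMessage],
    now: u64,
    max_age_secs: u64,
    summarize: F, // stand-in for the fast model
) -> Option<String>
where
    F: Fn(&[String]) -> String,
{
    let cutoff = now.saturating_sub(max_age_secs);
    let mut batch = Vec::new();
    for msg in messages.iter_mut() {
        if msg.consolidated_at.is_none() && msg.created_at < cutoff {
            batch.push(msg.content.clone());
            msg.consolidated_at = Some(now);
        }
    }
    if batch.is_empty() {
        None
    } else {
        Some(summarize(&batch))
    }
}
```

Keeping the pass a pure function over a slice makes it easy to test without a database or a timer; the real task would presumably read from and write back to SQLite.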