State Management & Memory

SQLite-backed persistence with in-memory working memory, semantic search via embeddings, and tri-hybrid retrieval.

Database Schema

messages table

Key              Type     Default  Description
id               TEXT PK           UUID primary key
session_id       TEXT              Session identifier (indexed)
role             TEXT              user, assistant, or tool
content          TEXT     null     Message text content
tool_call_id     TEXT     null     ID for tool result messages
tool_name        TEXT     null     Tool name for tool messages
tool_calls_json  TEXT     null     JSON array of tool calls
created_at       TEXT              RFC3339 timestamp
importance       REAL     0.5      Importance score (0.0–1.0)
embedding        BLOB     null     JSON-encoded Vec<f32> embedding
embedding_error  TEXT     null     Error message if embedding failed
consolidated_at  TEXT     null     Memory consolidation timestamp

facts table

Key         Type        Default  Description
id          INTEGER PK  auto     Auto-incrementing primary key
category    TEXT                 Grouping category
key         TEXT                 Fact key (unique per category)
value       TEXT                 Fact content
source      TEXT        ""       Who stored it: "agent" or "user"
created_at  TEXT                 RFC3339 timestamp
updated_at  TEXT                 RFC3339 timestamp
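
The "unique per category" constraint on key suggests upsert-style writes. The following is a minimal in-memory sketch of that behaviour, not the actual storage code: FactStore, Fact, and upsert are hypothetical names, and the choice to keep created_at while refreshing updated_at on overwrite is an assumption based on the two timestamp columns.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory mirror of a row in the facts table.
#[derive(Debug, Clone, PartialEq)]
pub struct Fact {
    pub category: String,
    pub key: String,
    pub value: String,
    pub source: String,
    pub created_at: String,
    pub updated_at: String,
}

/// Hypothetical store keyed by (category, key), so a key is unique per category.
#[derive(Default)]
pub struct FactStore {
    facts: HashMap<(String, String), Fact>,
}

impl FactStore {
    /// Insert a new fact or overwrite an existing one.
    /// `now` is an RFC3339 timestamp supplied by the caller.
    /// Assumption: created_at is preserved on overwrite, updated_at is refreshed.
    pub fn upsert(&mut self, category: &str, key: &str, value: &str, source: &str, now: &str) {
        self.facts
            .entry((category.to_string(), key.to_string()))
            .and_modify(|fact| {
                fact.value = value.to_string();
                fact.source = source.to_string();
                fact.updated_at = now.to_string();
            })
            .or_insert_with(|| Fact {
                category: category.to_string(),
                key: key.to_string(),
                value: value.to_string(),
                source: source.to_string(),
                created_at: now.to_string(),
                updated_at: now.to_string(),
            });
    }

    /// Look up a fact by category and key.
    pub fn get(&self, category: &str, key: &str) -> Option<&Fact> {
        self.facts.get(&(category.to_string(), key.to_string()))
    }
}
```

Calling upsert twice with the same category and key leaves a single entry holding the second value. In SQLite the same effect would typically come from a unique index on (category, key).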

Working Memory

An in-memory HashMap<String, VecDeque<Message>> keyed by session ID, with each session's queue capped at working_memory_cap (default 50) messages. It avoids database hits for recent conversation history.
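
A capped per-session queue of this kind can be sketched as follows. This is an illustration under assumptions rather than the project's code: the WorkingMemory wrapper, the trimmed-down Message struct, and the oldest-first eviction policy are all hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, trimmed-down message type used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub id: String,
    pub role: String,
    pub content: String,
}

/// Per-session working memory; each session's queue holds at most `cap` messages.
pub struct WorkingMemory {
    cap: usize,
    sessions: HashMap<String, VecDeque<Message>>,
}

impl WorkingMemory {
    pub fn new(cap: usize) -> Self {
        Self { cap, sessions: HashMap::new() }
    }

    /// Append a message, then evict from the front until the cap is respected.
    /// Assumption: the oldest messages are the ones dropped.
    pub fn push(&mut self, session_id: &str, msg: Message) {
        let queue = self.sessions.entry(session_id.to_string()).or_default();
        queue.push_back(msg);
        while queue.len() > self.cap {
            queue.pop_front();
        }
    }

    /// Return up to `n` of the most recent messages, oldest first.
    pub fn recent(&self, session_id: &str, n: usize) -> Vec<&Message> {
        match self.sessions.get(session_id) {
            Some(queue) => {
                let skip = queue.len().saturating_sub(n);
                queue.iter().skip(skip).collect()
            }
            None => Vec::new(),
        }
    }
}
```

With WorkingMemory::new(50), the 51st message pushed to a session evicts the first one, and recent(session_id, 10) is served entirely from memory.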

Tri-Hybrid Retrieval

The get_context method combines three retrieval strategies:

Strategy   Source                    Limit  Purpose
Recency    Last N messages           10     Conversational continuity
Salience   importance ≥ 0.8          5      Critical flagged memories
Relevance  Vector similarity > 0.65  5      Semantic search via embeddings

Results are deduplicated by message ID before being included in context.
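
The merge-and-deduplicate step can be sketched as below. The merge_context function and the minimal Message struct are hypothetical, and the ordering (recency first, then salience, then relevance, keeping the first occurrence of each ID) is an assumption; the source only states that results are deduplicated by message ID.

```rust
use std::collections::HashSet;

/// Hypothetical, minimal message type used only for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub id: String,
    pub content: String,
}

/// Concatenate the three retrieval legs and keep only the first
/// occurrence of each message ID.
pub fn merge_context(
    recency: Vec<Message>,
    salience: Vec<Message>,
    relevance: Vec<Message>,
) -> Vec<Message> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for msg in recency.into_iter().chain(salience).chain(relevance) {
        // `insert` returns false when the ID was already present.
        if seen.insert(msg.id.clone()) {
            merged.push(msg);
        }
    }
    merged
}
```

A message that is both recent and highly important therefore appears once, in the position given by the first leg that returned it.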

Embedding Service

  • Model: AllMiniLML6V2 (via fastembed)
  • Runs in the background: new messages are embedded after they are appended
  • Enables the relevance leg of tri-hybrid retrieval
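
For the relevance leg, stored embeddings have to be scored against a query embedding. The sketch below assumes cosine similarity as the metric, which the source does not confirm; cosine_similarity and top_matches are hypothetical helpers, and the threshold and limit are passed in rather than hard-coded.

```rust
/// Cosine similarity between two vectors.
/// Returns None for empty, mismatched-length, or zero-norm input.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Score every candidate against the query, keep those strictly above
/// `threshold`, and return at most `limit` (index, score) pairs, best first.
pub fn top_matches(
    query: &[f32],
    candidates: &[Vec<f32>],
    threshold: f32,
    limit: usize,
) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(query, c).map(|s| (i, s)))
        .filter(|(_, s)| *s > threshold)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

Candidates whose embedding is missing or malformed (for example, rows where embedding_error is set) simply drop out of the scoring because cosine_similarity returns None for them.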

Memory Consolidation

A background task runs every consolidation_interval_hours (default 6). It compresses old conversations into summaries using the fast model, reducing storage and context window usage.

Importance Scoring

Role                Default Score
User message        0.5
Assistant response  0.5
Tool output         0.3
System message      0.1
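
The default scores above map naturally onto a small lookup function. In this sketch the Role enum and both function names are hypothetical; only the numeric defaults and the 0.0 to 1.0 range come from the tables in this section.

```rust
/// Hypothetical role enum covering the four rows of the scoring table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    User,
    Assistant,
    Tool,
    System,
}

/// Default importance assigned to a new message, by role.
pub fn default_importance(role: Role) -> f32 {
    match role {
        Role::User | Role::Assistant => 0.5,
        Role::Tool => 0.3,
        Role::System => 0.1,
    }
}

/// Keep an adjusted score inside the documented 0.0 to 1.0 range.
pub fn clamp_importance(score: f32) -> f32 {
    score.clamp(0.0, 1.0)
}
```

A message starts at default_importance(role); any later adjustment, such as flagging a memory as critical, would go through clamp_importance before being stored.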

Connection Pool

  • SQLite pool: max 5 connections
  • Journal mode: WAL (Write-Ahead Logging) for concurrent reads
  • Auto-creates database and tables if missing