Agent Loop & Error Recovery
The core agentic loop: receive message, call LLM, execute tools, iterate, respond.
Agent Loop Flow
- Persist user message → stored with importance score
- Auto-route model → classify query complexity (if not overridden)
- Build system prompt → base prompt + matched skills + known facts
- Retrieve context → tri-hybrid memory retrieval
- Iterate (up to max_iterations):
  - Collect pinned old memories + recent messages (deduplicated)
  - Build OpenAI-format message list
  - Call LLM with error-classified recovery
  - If tool calls → execute each, persist results, continue loop
  - If no tool calls OR final iteration → return text response
- Max iterations reached → return timeout message
Dynamic Iteration Budget
The agent has a built-in request_more_iterations tool that extends the loop budget when the current limit is insufficient:
- Extends budget by 10 iterations per call
- Hard cap prevents unlimited extension (typically 25 total)
- Requires a reason parameter explaining what remains to be done
- Used when the agent has a clear plan but would otherwise run out of iterations mid-task
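A minimal sketch of the budget mechanics, using the numbers from the text (10-iteration extensions, a hard cap of 25). The `IterationBudget` class is illustrative; only the tool name `request_more_iterations` and the `reason` requirement come from the source.

```python
class IterationBudget:
    """Tracks the loop's iteration limit and bounded extensions."""

    def __init__(self, initial=10, extension=10, hard_cap=25):
        self.limit = initial
        self.extension = extension
        self.hard_cap = hard_cap

    def request_more_iterations(self, reason: str) -> int:
        """Extend the budget, never past the hard cap. A reason is required."""
        if not reason:
            raise ValueError("a reason explaining remaining work is required")
        self.limit = min(self.limit + self.extension, self.hard_cap)
        return self.limit
```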
Error Recovery Strategy
The call_llm_with_recovery method classifies errors and responds accordingly:
| Error Type | Strategy |
|---|---|
| Auth / Billing | Return immediately to user; no retry |
| Rate Limit | Wait retry_after_secs (capped at 60s), retry once |
| Timeout / Network / Server Error | Wait 2s, retry once; on failure, fall back to previous model |
| Not Found (bad model) | Immediately switch to fallback model |
| Unknown | Propagate as error |
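The table above maps onto a classify-then-branch structure. The sketch below is a hedged approximation of that flow: `classify` and the string error kinds are hypothetical stand-ins, and the retry/fallback behavior follows the table rather than aidaemon's real `call_llm_with_recovery`.

```python
import time

def call_llm_with_recovery(call_llm, classify, fallback_model=None):
    """Call the LLM once, then apply the per-error-kind recovery strategy."""
    try:
        return call_llm()
    except Exception as err:
        kind = classify(err)
        if kind in ("auth", "billing"):
            raise                       # surface to the user, no retry
        if kind == "rate_limit":
            time.sleep(min(getattr(err, "retry_after_secs", 1), 60))
            return call_llm()           # retry once after the advised wait
        if kind in ("timeout", "network", "server"):
            time.sleep(2)
            try:
                return call_llm()       # retry once...
            except Exception:
                return call_llm(model=fallback_model)  # ...then fall back
        if kind == "not_found":
            return call_llm(model=fallback_model)      # bad model: switch now
        raise                           # unknown: propagate
```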
Last Known Good
After every successful LLM call, the current config is saved as config.toml.lastgood. This enables automatic recovery from bad config changes.
Message Ordering Fixup
To satisfy constraints across Gemini, Anthropic, and OpenAI providers, aidaemon runs a three-pass fixup on the message history before each LLM call:
- Pass 1: Merge consecutive same-role messages (combines tool_calls arrays)
- Pass 2: Drop orphaned tool results (no matching assistant tool_call) and strip orphaned tool_calls (no matching tool result)
- Pass 3: Merge again, since orphan removal may create new consecutive same-role messages
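The three passes can be sketched as follows, under the simplifying assumption that messages are OpenAI-style dicts with `role`, optional `content`, `tool_calls` (each with an `id`), and `tool_call_id` on tool results. This is an illustration of the pass structure, not aidaemon's implementation.

```python
def merge_consecutive(messages):
    """Passes 1 and 3: merge adjacent same-role messages, combining tool_calls."""
    merged = []
    for msg in messages:
        # Tool results carry distinct tool_call_ids, so never merge those.
        if merged and msg["role"] != "tool" and merged[-1]["role"] == msg["role"]:
            prev = merged[-1]
            prev["content"] = "\n".join(
                filter(None, [prev.get("content"), msg.get("content")]))
            prev["tool_calls"] = prev.get("tool_calls", []) + msg.get("tool_calls", [])
            if not prev["tool_calls"]:
                del prev["tool_calls"]
        else:
            merged.append(dict(msg))
    return merged

def drop_orphans(messages):
    """Pass 2: drop tool results with no matching tool_call, and vice versa."""
    call_ids = {c["id"] for m in messages for c in m.get("tool_calls", [])}
    result_ids = {m["tool_call_id"] for m in messages if m["role"] == "tool"}
    fixed = []
    for m in messages:
        if m["role"] == "tool" and m["tool_call_id"] not in call_ids:
            continue  # orphaned tool result
        m = dict(m)
        if m.get("tool_calls"):
            m["tool_calls"] = [c for c in m["tool_calls"] if c["id"] in result_ids]
            if not m["tool_calls"]:
                del m["tool_calls"]
        fixed.append(m)
    return fixed

def fixup(messages):
    return merge_consecutive(drop_orphans(merge_consecutive(messages)))
```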
Tool Execution
During the loop, each tool call receives:
- _session_id → injected automatically for session tracking
- _untrusted_source → flag set for trigger-originated sessions
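A minimal sketch of that injection step before dispatch. The key names `_session_id` and `_untrusted_source` come from the text; the function itself and its signature are hypothetical.

```python
def inject_tool_params(args: dict, session_id: str, untrusted: bool) -> dict:
    """Return a copy of the tool arguments with the injected bookkeeping keys."""
    enriched = dict(args)
    enriched["_session_id"] = session_id      # session tracking
    if untrusted:
        enriched["_untrusted_source"] = True  # trigger-originated session
    return enriched
```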
Stall & Repetition Detection
The agent loop includes safeguards against getting stuck:
- Stall detection — if the same tool is called 3+ times consecutively with similar arguments, the loop breaks
- Repetition detection — detects repeated response text and forces a break
- Hard iteration limit — default 10, extendable to 25 via request_more_iterations
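A simple form of the stall check can be sketched as below. The source compares "similar" arguments; this illustrative version uses exact equality over a sliding window of the last three calls, which is an assumption, not the real similarity test.

```python
from collections import deque

class StallDetector:
    """Flags a stall when the same tool is called threshold times in a row
    with identical arguments."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.recent = deque(maxlen=threshold)  # sliding window of recent calls

    def record(self, tool_name: str, args: dict) -> bool:
        """Record a tool call; return True if the loop should break."""
        self.recent.append((tool_name, repr(sorted(args.items()))))
        return (len(self.recent) == self.threshold
                and len(set(self.recent)) == 1)
```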
Session Types
| Session Type | Format | Trusted |
|---|---|---|
| Telegram chat | Chat ID as string | Yes |
| Slack channel | slack:{channel_id} or slack:{channel_id}:{thread_ts} | Yes |
| Discord channel | discord:{channel_id} | Yes |
| Email trigger | email_trigger | No |
| Event trigger | event_{uuid} | No |
| Sub-agent | sub-{depth}-{uuid} | Inherited |
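The table above implies a classifier over session-id strings. The sketch below maps an id to a `(type, trusted)` pair; the function name is hypothetical, "Inherited" trust is represented as `None`, and a bare id is assumed to be a Telegram chat ID as in the table.

```python
def classify_session(session_id: str):
    """Classify a session id into (session_type, trusted) per the format table."""
    if session_id == "email_trigger":
        return ("email_trigger", False)
    if session_id.startswith("event_"):
        return ("event_trigger", False)
    if session_id.startswith("sub-"):
        return ("sub_agent", None)        # trust inherited from the parent
    if session_id.startswith("slack:"):
        return ("slack", True)            # covers channel and thread forms
    if session_id.startswith("discord:"):
        return ("discord", True)
    return ("telegram", True)             # bare chat ID string
```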