Architecture
This page describes OpenOrca's internal architecture for developers who want to understand, extend, or contribute to the codebase.
Solution Structure
OpenOrca.sln
├── src/
│ ├── OpenOrca.Cli # Console app, REPL, streaming UI
│ ├── OpenOrca.Core # Domain logic (chat, config, sessions, permissions, hooks, memory)
│ └── OpenOrca.Tools # 35 tool implementations
└── tests/
  ├── OpenOrca.Cli.Tests # CLI layer unit tests
  ├── OpenOrca.Core.Tests # Core domain unit tests
  ├── OpenOrca.Tools.Tests # Tool unit tests
  └── OpenOrca.Harness # Integration tests (requires LM Studio)
Dependency Graph
OpenOrca.Cli
├── OpenOrca.Core
└── OpenOrca.Tools
    └── OpenOrca.Core
- Cli depends on both Core and Tools
- Tools depends on Core (for config, permissions)
- Core has no project dependencies (only NuGet: Microsoft.Extensions.AI, System.Text.Json, etc.)
OpenOrca.Cli
The console application, REPL loop, and all rendering/UI concerns.
Key Files
| File | Purpose |
|---|---|
| Program.cs | Entry point. Parses CLI args (--prompt, --demo), loads config, creates services, starts REPL. |
| Repl/ReplLoop.cs | Main REPL loop. Reads user input, dispatches to CommandHandler or AgentLoopRunner. |
| Repl/AgentLoopRunner.cs | Runs the agent loop: streaming, native/text tool switching, retry logic, auto-compaction, server error probing, nudge, and generation cancellation. Max 25 iterations per user message. |
| Repl/CommandHandler.cs | Handles all slash commands (/help, /clear, /model, /session, /plan, /compact, /rewind, /context, /stats, /memory, /doctor, /copy, /export, /checkpoint, custom commands). |
| Repl/ToolCallParser.cs | Parses tool calls from LLM text output. Handles <tool_call> tags, <\|tool_call\|> tags, [TOOL_CALL] tags, <function_call> tags, JSON in code fences, bare JSON, and {"function": {...}} wrappers. |
| Repl/ToolCallExecutor.cs | Executes parsed tool calls: permission checks, hook running, tool invocation, result formatting. |
| Repl/SystemPromptBuilder.cs | Constructs the system prompt from templates, substituting {{TOOL_LIST}}, {{CWD}}, {{PLATFORM}}, {{PROJECT_INSTRUCTIONS}}. Also injects auto memory. |
| Repl/ReplState.cs | Mutable state: plan mode, show thinking, session ID, last response, token counts, stopwatch. |
| Rendering/StreamingRenderer.cs | Renders streaming tokens to the terminal with markdown-like formatting. |
| Rendering/ThinkingIndicator.cs | Shows animated thinking indicator and token counter when thinking is hidden. |
| Rendering/ToolCallRenderer.cs | Renders tool call panels (name, arguments, result, timing). |
| CustomCommands/CustomCommandLoader.cs | Discovers and loads custom command templates from .orca/commands/ and ~/.openorca/commands/. |
Agent Loop Flow
The core of OpenOrca is the agent loop in AgentLoopRunner.RunAgentLoopAsync():
User sends message
│
▼
┌─── Agent Loop (max 25 iterations) ◄───────────────────┐
│ │ │
│ ▼ │
│ Auto-compact check │
│ │ │
│ ▼ │
│ Stream LLM response │
│ │ │
│ ├── Native tool calls found? │
│ │ ├── Yes ─► Execute tools ─► Add results ──────┘
│ │ │ (with permission + hooks)
│ │ └── No
│ │ │
│ │ ▼
│ ├── Text tool calls parsed? │
│ │ ├── Yes ─► Execute tools ─► Add results ──────┘
│ │ └── No
│ │ │
│ │ ▼
│ ├── Truncated <tool_call>? ─► Nudge continue ──────┘
│ │ │
│ │ ▼
│ ├── Should nudge? ─► Send nudge message ───────────┘
│ │ │
│ │ ▼
│ └── No tool calls ─► Done (break)
│
└── Retry loop detection (4 identical failures ─► break)
Streaming with Thinking Toggle
During streaming, Ctrl+O toggles thinking visibility in real-time:
- Visible: Streaming tokens appear in cyan via StreamingRenderer
- Hidden: ThinkingIndicator shows a token counter; console output is redirected to TextWriter.Null
The thinking state toggle works mid-stream: when toggled on, buffered tokens are flushed to the console.
Native Tool Auto-Downgrade
When nativeToolCalling is true:
- First attempt: send tool definitions via OpenAI function calling protocol
- If streaming returns updates but 0 content items → retry without tool definitions
- If native tool calls have missing required arguments → switch to text-based mode
- Once downgraded, stays in text-based mode for the rest of that agent loop
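The downgrade rules above can be read as a small one-way state machine. The following is a minimal sketch, not the code in AgentLoopRunner: `ToolMode`, `StreamOutcome`, and `ToolModeSelector` are hypothetical names, and it assumes that "retry without tool definitions" and "switch to text-based mode" both land in the same text-based state.

```csharp
// Hypothetical types for illustration; the real logic lives in AgentLoopRunner.
public enum ToolMode { Native, TextBased }

// One streaming attempt, reduced to the facts the downgrade rules look at.
public sealed record StreamOutcome(
    int UpdateCount,
    int ContentItemCount,
    bool NativeCallMissingRequiredArgs);

public static class ToolModeSelector
{
    // Returns the mode to use for the next attempt within the same agent loop.
    public static ToolMode Next(ToolMode current, StreamOutcome outcome)
    {
        // Once downgraded, stay in text-based mode for the rest of the loop.
        if (current == ToolMode.TextBased) return ToolMode.TextBased;

        // Updates arrived but carried no content: retry without tool definitions.
        if (outcome.UpdateCount > 0 && outcome.ContentItemCount == 0)
            return ToolMode.TextBased;

        // Native calls with missing required arguments: fall back to text parsing.
        if (outcome.NativeCallMissingRequiredArgs) return ToolMode.TextBased;

        return ToolMode.Native;
    }
}
```

Because the transition only ever goes from Native to TextBased, a caller can keep a single `ToolMode` variable per agent loop and reset it to Native when the next user message starts.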
Streaming Error Probe (ProbeForServerErrorAsync)
LM Studio returns some streaming errors as SSE event: error with HTTP 200 status. The Microsoft.Extensions.AI SDK silently drops these, producing a stream that completes with 0 tokens and 0 updates.
When AgentLoopRunner detects this (no tokens received after streaming completes), it calls ProbeForServerErrorAsync() which makes a raw HTTP POST to /chat/completions with stream: false. This non-streaming request surfaces the actual error from the server (context overflow, model crash, etc.) and displays it to the user. Without this probe, the user would only see "LLM returned an empty response" with no actionable information.
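As an illustration of the probe described above, here is a minimal sketch of a non-streaming request built with HttpClient. It is not the body of ProbeForServerErrorAsync: the class and method names, the single-message payload, and the assumption that the server reports the failure either as a non-success status or as a top-level "error" property in the JSON body are all hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public static class ServerErrorProbe
{
    // Builds a non-streaming chat completion request. "baseUrl" is assumed to be
    // the API root with a trailing slash, so the relative path resolves under it.
    public static HttpRequestMessage BuildRequest(Uri baseUrl, string model, string userMessage)
    {
        var payload = JsonSerializer.Serialize(new
        {
            model,
            stream = false, // the point of the probe: ask for a plain JSON response
            messages = new[] { new { role = "user", content = userMessage } }
        });
        return new HttpRequestMessage(HttpMethod.Post, new Uri(baseUrl, "chat/completions"))
        {
            Content = new StringContent(payload, Encoding.UTF8, "application/json")
        };
    }

    // Returns an error description, or null when the response looks healthy.
    public static string? ExtractError(bool isSuccessStatus, string body)
    {
        try
        {
            using var doc = JsonDocument.Parse(body);
            if (doc.RootElement.ValueKind == JsonValueKind.Object &&
                doc.RootElement.TryGetProperty("error", out var err))
            {
                return err.ValueKind == JsonValueKind.String ? err.GetString() : err.GetRawText();
            }
        }
        catch (JsonException)
        {
            // Body is not JSON; fall through and report it verbatim on failure.
        }
        return isSuccessStatus ? null : body;
    }

    // Sends the probe and returns the server's error, if any.
    public static async Task<string?> ProbeAsync(
        HttpClient http, HttpRequestMessage request, CancellationToken ct = default)
    {
        using var response = await http.SendAsync(request, ct);
        var body = await response.Content.ReadAsStringAsync(ct);
        return ExtractError(response.IsSuccessStatusCode, body);
    }
}
```

Bypassing the SDK for this one call is the design choice that matters: the raw response body is read directly, so nothing can be dropped between the server and the message shown to the user.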
Streaming Idle Timeout
Each streaming loop (both primary and retry) is wrapped with a resettable idle timeout via CancellationTokenSource.CancelAfter(). The timeout resets on every token/update received, so it only fires when the stream goes idle.
- Default: 120 seconds (configurable via LmStudioConfig.StreamingTimeoutSeconds)
- Fallback constant: CliConstants.StreamingIdleTimeoutSeconds
- Uses a linked CTS so generation cancellation (Ctrl+C) still works
- On timeout: displays a warning with the timeout duration and a log path hint
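The resettable idle timeout can be sketched as a wrapper around any async stream. This is an illustration of the technique, not the code in AgentLoopRunner: `IdleTimeout`, `WithIdleTimeout`, and `DemoSource` are hypothetical names. It relies on the documented behaviour that calling CancelAfter again on a CancellationTokenSource that has not yet fired restarts its countdown.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class IdleTimeout
{
    // Re-yields items from "source" and cancels it when no item arrives within "idle".
    public static async IAsyncEnumerable<T> WithIdleTimeout<T>(
        Func<CancellationToken, IAsyncEnumerable<T>> source,
        TimeSpan idle,
        [EnumeratorCancellation] CancellationToken userToken = default)
    {
        // Linked source: fires on user cancellation (Ctrl+C) or on idle timeout.
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(userToken);
        linked.CancelAfter(idle);

        await foreach (var item in source(linked.Token))
        {
            linked.CancelAfter(idle); // an item arrived, so restart the countdown
            yield return item;
        }
    }

    // Demo stream used only to exercise the wrapper: emits 0..count-1 with a delay.
    public static async IAsyncEnumerable<int> DemoSource(
        int count, TimeSpan delay, [EnumeratorCancellation] CancellationToken ct = default)
    {
        for (var i = 0; i < count; i++)
        {
            await Task.Delay(delay, ct);
            yield return i;
        }
    }
}
```

When the wrapped stream ends with an OperationCanceledException, a caller can check `userToken.IsCancellationRequested` to tell a user cancellation apart from an idle timeout. Note that in this sketch the countdown also runs while the consumer is processing a yielded item.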
Text-Based Tool Call Parsing
ToolCallParser.ParseToolCallsFromText() extracts tool calls from LLM text output using 6 pattern categories, tried in order:
| # | Pattern | Regex / Description |
|---|---|---|
| 1 | <tool_call> tags | <tool_call>{...}</tool_call> |
| 2 | <\|tool_call\|> tags | <\|tool_call\|>{...}<\|/tool_call\|> |
| 3 | [TOOL_CALL] tags | [TOOL_CALL]{...}[/TOOL_CALL] |
| 4 | <function_call> tags | <function_call>{...}</function_call> |
| 4b | JSON in code fences | ```json\n{...}\n``` |
| 5 | Bare JSON | {"name": "...", "arguments": {...}} (requires "name" before "arguments", supports one level of nested braces) |
Before pattern matching, the parser strips <think>...</think> blocks and <assistant> tags. It tries the stripped text first, falling back to the full text only if no matches are found. Pattern 5 (bare JSON) only runs if patterns 1–4b yield no results, preventing false positives on conversational JSON.
The parser also handles {"function": {"name": "...", "arguments": {...}}} wrapper format and {"tool_call": {...}} wrapper format.
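To make the order of operations concrete, here is a minimal sketch covering only the tagged patterns (1 to 4), the <think> stripping with fallback to the full text, and the {"function": {...}} wrapper. It is not ToolCallParser itself: the type names and the exact regular expressions are assumptions, and code fences, bare JSON, and the {"tool_call": {...}} wrapper are left out.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.RegularExpressions;

public sealed record ParsedToolCall(string Name, string ArgumentsJson);

public static class ToolCallTextParser
{
    private static readonly Regex ThinkBlock =
        new Regex(@"<think>.*?</think>", RegexOptions.Singleline);

    // Tagged patterns 1-4 from the table above, in the same order.
    private static readonly Regex[] TaggedPatterns =
    {
        new Regex(@"<tool_call>\s*(\{.*?\})\s*</tool_call>", RegexOptions.Singleline),
        new Regex(@"<\|tool_call\|>\s*(\{.*?\})\s*<\|/tool_call\|>", RegexOptions.Singleline),
        new Regex(@"\[TOOL_CALL\]\s*(\{.*?\})\s*\[/TOOL_CALL\]", RegexOptions.Singleline),
        new Regex(@"<function_call>\s*(\{.*?\})\s*</function_call>", RegexOptions.Singleline),
    };

    public static List<ParsedToolCall> Parse(string text)
    {
        // Try the text with <think> blocks removed first, then the full text.
        var stripped = ThinkBlock.Replace(text, string.Empty);
        var calls = ParseTagged(stripped);
        return calls.Count > 0 ? calls : ParseTagged(text);
    }

    private static List<ParsedToolCall> ParseTagged(string text)
    {
        var calls = new List<ParsedToolCall>();
        foreach (var pattern in TaggedPatterns)
        {
            foreach (Match match in pattern.Matches(text))
            {
                if (TryReadCall(match.Groups[1].Value, out var call)) calls.Add(call!);
            }
        }
        return calls;
    }

    private static bool TryReadCall(string json, out ParsedToolCall? call)
    {
        call = null;
        try
        {
            using var doc = JsonDocument.Parse(json);
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return false;

            // Unwrap {"function": {"name": ..., "arguments": ...}} if present.
            if (root.TryGetProperty("function", out var inner) &&
                inner.ValueKind == JsonValueKind.Object)
            {
                root = inner;
            }

            if (!root.TryGetProperty("name", out var name) ||
                name.ValueKind != JsonValueKind.String)
            {
                return false;
            }

            var args = root.TryGetProperty("arguments", out var a) ? a.GetRawText() : "{}";
            call = new ParsedToolCall(name.GetString()!, args);
            return true;
        }
        catch (JsonException)
        {
            return false; // malformed JSON inside the tags is ignored
        }
    }
}
```

The lazy `.*?` inside each pattern still handles nested braces, because the match is only accepted once a closing brace is followed by the closing tag.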
Nudge Mechanism
When the model outputs text that looks like an action (code blocks + action words + file paths, or code blocks with tool-like JSON) but doesn't use <tool_call> tags, the agent loop sends a nudge message (PromptConstants.NudgeMessage) asking the model to re-emit using the proper format. This gives the model a second chance without wasting context.
Nudge triggers:
- Tool-like JSON in code blocks: a ```json block containing {"name": is detected
- Action pattern: a code block + action words (create, write, save, update, etc.) + a file path pattern
Nudge is limited to 1 attempt per agent loop. Additionally, truncated <tool_call> tags (open tag without close) trigger up to 2 continuation attempts with a separate recovery message.
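The two nudge triggers can be approximated with a few regular expressions. The sketch below is a guess at the shape of the heuristic, not the project's implementation: `NudgeHeuristic` is a hypothetical name, the action-word list contains only the four words named above, and the file path pattern is a deliberately loose assumption that will produce some false positives.

```csharp
using System.Text.RegularExpressions;

public static class NudgeHeuristic
{
    // Three backticks, written as escapes so this sample can sit inside a fenced block.
    private const string Fence = "\u0060\u0060\u0060";

    // Trigger 1: a json code fence whose body contains {"name":
    private static readonly Regex JsonFenceWithName =
        new Regex(Fence + @"json[^`]*?\{\s*""name""\s*:", RegexOptions.Singleline);

    // Trigger 2 ingredients: action words and something that looks like a file path.
    private static readonly Regex ActionWord =
        new Regex(@"\b(create|write|save|update)\b", RegexOptions.IgnoreCase);
    private static readonly Regex FilePath =
        new Regex(@"[\w./\\-]+\.\w{1,8}\b"); // assumption: name.ext with optional directories

    public static bool ShouldNudge(string response, int nudgesAlreadySent)
    {
        if (nudgesAlreadySent >= 1) return false;             // limited to one nudge per loop
        if (response.Contains("<tool_call>")) return false;   // proper format already used

        if (JsonFenceWithName.IsMatch(response)) return true;

        return response.Contains(Fence)
            && ActionWord.IsMatch(response)
            && FilePath.IsMatch(response);
    }
}
```

Keeping the counter outside the heuristic lets the agent loop own the "one attempt" budget, while the function itself stays a pure check on the response text.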
25-Turn Agent Loop Limit
The agent loop runs a maximum of CliConstants.AgentMaxIterations (25) iterations per user message. This prevents runaway loops when the model keeps calling tools indefinitely. The loop also detects retry patterns — if a tool fails 4 times identically, the loop breaks. At 3 identical failures, a redirect message is injected asking the model to try a different approach.
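The thresholds above (redirect at 3 identical failures, break at 4) can be sketched with a simple counter. `RetryLoopDetector` and `RetryAction` are hypothetical names, and the definition of "identical" used here (same tool name, same arguments, same error text) is an assumption; the source does not say how failures are compared or whether they must be consecutive.

```csharp
using System.Collections.Generic;

public enum RetryAction { Continue, Redirect, Break }

public sealed class RetryLoopDetector
{
    private const int RedirectThreshold = 3;
    private const int BreakThreshold = 4;

    private readonly Dictionary<string, int> _failureCounts = new Dictionary<string, int>();

    // Records one failed tool call and reports what the agent loop should do next.
    public RetryAction RecordFailure(string toolName, string argumentsJson, string error)
    {
        // Assumption: two failures are "identical" when all three parts match.
        var key = toolName + "\n" + argumentsJson + "\n" + error;

        _failureCounts.TryGetValue(key, out var count);
        count++;
        _failureCounts[key] = count;

        if (count >= BreakThreshold) return RetryAction.Break;
        if (count == RedirectThreshold) return RetryAction.Redirect;
        return RetryAction.Continue;
    }
}
```

One detector instance per user message keeps the counts scoped to a single agent loop, alongside the 25-iteration cap.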
OpenOrca.Core
Domain logic with no UI concerns. Can be referenced independently.
Key Files
| File | Purpose |
|---|---|
| Chat/Conversation.cs | Message list management, system prompt, token estimation, compaction, turn removal. |
| Chat/ConversationManager.cs | Tracks active conversations by ID. |
| Client/LmStudioClientFactory.cs | Creates IChatClient instances configured from OrcaConfig. |
| Client/ModelDiscovery.cs | Queries the LLM server's /v1/models endpoint for available models. |
| Configuration/OrcaConfig.cs | Configuration POCO: LmStudioConfig, PermissionsConfig, ContextConfig, SessionConfig, HooksConfig. |
| Configuration/ConfigManager.cs | Loads/saves config.json. Handles ~/.openorca/ directory creation. |
| Configuration/PromptManager.cs | Loads and generates system prompt templates. Three-tier resolution: explicit profile → model-specific → default. |
| Configuration/ProjectInstructionsLoader.cs | Finds and loads ORCA.md from the project root. |
| Configuration/MemoryManager.cs | Manages auto-learned memory files in ~/.openorca/memory/ (global) and .orca/memory/ (project). Handles load, save, prune, list, and clear operations. |
| Hooks/HookRunner.cs | Runs pre/post-tool shell hooks. Pre-hooks can block tool execution (non-zero exit). Post-hooks are fire-and-forget. |
| Orchestration/AgentOrchestrator.cs | Manages sub-agent spawning and lifecycle. |
| Permissions/PermissionManager.cs | Evaluates whether a tool call should be auto-approved or requires user confirmation. Supports allow/deny glob patterns. |
| Permissions/PermissionPattern.cs | Parses and matches ToolName(argGlob) permission patterns. Case-insensitive tool name matching with wildcard argument matching. |
| Session/SessionManager.cs | Save, load, list, delete conversation sessions as JSON files. |
| Session/CheckpointManager.cs | Manages file checkpoints: automatic snapshots before file-modifying tool calls, with restore, diff, list, and cleanup operations. |
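The ToolName(argGlob) format handled by PermissionPattern.cs can be illustrated with a small parser and matcher. This is a sketch under assumptions, not the project's class: `PermissionPatternSketch` is a hypothetical name, the glob supports only `*` and `?`, argument matching is case-sensitive, and the example pattern `bash(git *)` is invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public sealed class PermissionPatternSketch
{
    public string ToolName { get; }
    public string? ArgGlob { get; } // null means "any argument"

    private PermissionPatternSketch(string toolName, string? argGlob)
    {
        ToolName = toolName;
        ArgGlob = argGlob;
    }

    // Accepts "ToolName" or "ToolName(argGlob)".
    public static PermissionPatternSketch Parse(string pattern)
    {
        var match = Regex.Match(pattern.Trim(), @"^([^()\s]+)(?:\((.*)\))?$");
        if (!match.Success)
            throw new FormatException("Expected ToolName or ToolName(argGlob): " + pattern);

        var glob = match.Groups[2].Success ? match.Groups[2].Value : null;
        return new PermissionPatternSketch(match.Groups[1].Value, glob);
    }

    public bool Matches(string toolName, string argument)
    {
        // Tool names compare case-insensitively.
        if (!string.Equals(ToolName, toolName, StringComparison.OrdinalIgnoreCase))
            return false;
        if (ArgGlob is null) return true;

        // Translate the glob into an anchored regex: * -> .*, ? -> .
        var regex = "^" + Regex.Escape(ArgGlob).Replace(@"\*", ".*").Replace(@"\?", ".") + "$";
        return Regex.IsMatch(argument, regex, RegexOptions.Singleline);
    }
}
```

For example, `PermissionPatternSketch.Parse("bash(git *)")` would match a `Bash` call whose argument is `git status` but not one whose argument is `rm -rf /`.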
OpenOrca.Tools
All 35 tool implementations, organized by category.
Directory Structure
OpenOrca.Tools/
├── Abstractions/
│ ├── IOrcaTool.cs # Tool interface
│ ├── ToolResult.cs # Result type (content + isError)
│ └── ToolRiskLevel.cs # ReadOnly, Moderate, Dangerous
├── Registry/
│ └── ToolRegistry.cs # Auto-discovery via reflection + ILogger injection
├── FileSystem/ # read_file, write_file, edit_file, multi_edit, delete_file,
│ # copy_file, move_file, mkdir, cd, glob, grep, list_directory
├── Shell/ # bash, start_background_process, get_process_output, stop_process
├── Git/ # git_status, git_diff, git_log, git_commit, git_branch,
│ # git_checkout, git_push, git_pull, git_stash
├── GitHub/ # github (gh CLI wrapper)
├── Web/ # web_fetch, web_search (with per-domain rate limiting)
├── Network/ # network_diagnostics (ping, dns_lookup, check_connection)
├── Archive/ # archive (create, extract, list zip archives)
├── Interactive/ # ask_user
├── Utility/ # think, task_list, env
└── Agent/ # spawn_agent
Tool Interface
Every tool implements IOrcaTool:
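using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public interface IOrcaTool
{
    string Name { get; }                  // Tool name used in tool calls, e.g. read_file
    string Description { get; }           // Human-readable description of the tool
    ToolRiskLevel RiskLevel { get; }      // ReadOnly, Moderate, or Dangerous
    JsonElement ParameterSchema { get; }  // Schema describing the tool's arguments
    Task<ToolResult> ExecuteAsync(JsonElement args, CancellationToken ct = default);
}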
Auto-Discovery
ToolRegistry.DiscoverTools() uses reflection to find all classes implementing IOrcaTool in the assembly. Each tool is instantiated via Activator.CreateInstance() — no manual registration or attributes needed. Just implement IOrcaTool with a parameterless constructor and the registry picks it up.
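Reflection-based discovery of this kind usually comes down to one LINQ query over the assembly's types. The sketch below uses a stand-in interface, `IDemoTool`, instead of IOrcaTool so that it compiles on its own; `DemoRegistry`, `EchoTool`, and `BaseTool` are likewise hypothetical, and the filtering rules (skip abstract types and types without a public parameterless constructor) are assumptions about what a registry like this would need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stand-in for IOrcaTool, reduced to the one member the registry keys on.
public interface IDemoTool
{
    string Name { get; }
}

public sealed class EchoTool : IDemoTool
{
    public string Name => "echo";
}

// Abstract helper base: implements the interface but must not be instantiated.
public abstract class BaseTool : IDemoTool
{
    public abstract string Name { get; }
}

public static class DemoRegistry
{
    // Finds every concrete IDemoTool in the assembly and instantiates it.
    public static IReadOnlyDictionary<string, IDemoTool> Discover(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => typeof(IDemoTool).IsAssignableFrom(t))
            .Where(t => t.IsClass && !t.IsAbstract)
            .Where(t => t.GetConstructor(Type.EmptyTypes) != null) // parameterless ctor required
            .Select(t => (IDemoTool)Activator.CreateInstance(t)!)
            .ToDictionary(tool => tool.Name, StringComparer.OrdinalIgnoreCase);
    }
}
```

Calling `DemoRegistry.Discover(typeof(EchoTool).Assembly)` returns a dictionary containing `echo`; adding another class that implements the interface is enough for it to show up on the next call.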
ILogger Property Injection
Tools can opt in to receiving a logger by declaring a public settable property:
public ILogger? Logger { get; set; }
When ToolRegistry has an ILoggerFactory (passed to its constructor), it uses reflection to find this property and injects a category-specific logger after instantiation. This avoids constructor injection (which would break Activator.CreateInstance) while still providing structured logging for tools that need it.
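Property injection by reflection can be sketched as follows. To stay self-contained, the example uses stand-ins (`IDemoLogger`, `DemoLogger`) rather than ILogger and ILoggerFactory from Microsoft.Extensions.Logging, and `LoggerInjector`, `LoggingTool`, and `SilentTool` are hypothetical names; the real ToolRegistry may check the property differently.

```csharp
using System;
using System.Reflection;

// Stand-ins for ILogger and the logger a factory would create for a category.
public interface IDemoLogger
{
    string Category { get; }
}

public sealed class DemoLogger : IDemoLogger
{
    public DemoLogger(string category) => Category = category;
    public string Category { get; }
}

// A tool that opts in by exposing a public settable Logger property.
public sealed class LoggingTool
{
    public IDemoLogger? Logger { get; set; }
}

// A tool that does not opt in; injection silently skips it.
public sealed class SilentTool
{
}

public static class LoggerInjector
{
    // Sets tool.Logger when the tool declares a compatible public settable property.
    public static void Inject(object tool, Func<string, IDemoLogger> createLogger)
    {
        var type = tool.GetType();
        var property = type.GetProperty("Logger", BindingFlags.Public | BindingFlags.Instance);

        if (property is null) return;                            // tool did not opt in
        if (property.SetMethod is null || !property.SetMethod.IsPublic) return;
        if (!property.PropertyType.IsAssignableFrom(typeof(IDemoLogger))) return;

        // Use the tool's type name as the logger category.
        property.SetValue(tool, createLogger(type.FullName ?? type.Name));
    }
}
```

Because injection happens after construction, the tool keeps its parameterless constructor and remains compatible with Activator.CreateInstance.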
Data Flow: A User Prompt
Here's the complete path a user prompt takes through the system:
1. ReplLoop reads user input → checks for /commands or the !shell shortcut
2. ReplLoop adds user message to Conversation → calls AgentLoopRunner
3. AgentLoopRunner checks context usage → may call CommandHandler.CompactConversationAsync
4. AgentLoopRunner calls IChatClient.GetStreamingResponseAsync() with messages and options
5. StreamingRenderer displays tokens as they arrive; ThinkingIndicator shows counter if thinking is hidden
6. After streaming completes, AgentLoopRunner checks for native FunctionCallContent items
7. If none, ToolCallParser searches the text for tool call patterns
8. ToolCallExecutor runs each tool call:
   a. PermissionManager checks approval
   b. HookRunner runs pre-hooks (can block)
   c. IOrcaTool.ExecuteAsync runs the tool
   d. HookRunner runs post-hooks
   e. Result is added to Conversation
9. Loop repeats from step 4 (up to 25 times)
10. When no tool calls are found, the loop ends and ReplLoop prompts for the next input
Key Design Patterns
- Separation of concerns: UI (Cli) vs logic (Core) vs tools (Tools) — clean project boundaries
- Interface-based tools: IOrcaTool with reflection-based discovery enables easy extension
- Graceful degradation: Native tool calling → text-based fallback → nudge → retry loop detection
- Linked cancellation: _generationCts linked to app CTS allows per-generation Ctrl+C without killing the app
- Fire-and-forget hooks: Post-tool hooks never block the agent loop
- Token estimation: Simple char/4 heuristic for context tracking (good enough for compaction decisions)
- Per-domain rate limiting: DomainRateLimiter throttles web requests with a 1.5s minimum delay per domain using ConcurrentDictionary + SemaphoreSlim
- Property-based logger injection: Tools opt into logging via a public Logger property, injected by ToolRegistry at discovery time, which avoids constructor injection constraints
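As an illustration of the per-domain rate limiting pattern, here is a minimal sketch that combines ConcurrentDictionary and SemaphoreSlim. It is not the project's DomainRateLimiter: the class name, the `WaitAsync` method, and the use of wall-clock timestamps are assumptions; only the 1.5s default comes from the description above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class DomainRateLimiterSketch
{
    // Per-domain state: a gate that serializes callers plus the last request time.
    private sealed class DomainState
    {
        public readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);
        public DateTime LastRequestUtc = DateTime.MinValue;
    }

    private readonly ConcurrentDictionary<string, DomainState> _domains =
        new ConcurrentDictionary<string, DomainState>(StringComparer.OrdinalIgnoreCase);
    private readonly TimeSpan _minDelay;

    public DomainRateLimiterSketch(TimeSpan? minDelay = null)
    {
        _minDelay = minDelay ?? TimeSpan.FromSeconds(1.5);
    }

    // Completes once at least _minDelay has passed since the previous request
    // to the same host. Requests to different hosts do not wait on each other.
    public async Task WaitAsync(Uri url, CancellationToken ct = default)
    {
        var state = _domains.GetOrAdd(url.Host, _ => new DomainState());

        await state.Gate.WaitAsync(ct);
        try
        {
            var remaining = state.LastRequestUtc + _minDelay - DateTime.UtcNow;
            if (remaining > TimeSpan.Zero)
            {
                await Task.Delay(remaining, ct);
            }
            state.LastRequestUtc = DateTime.UtcNow;
        }
        finally
        {
            state.Gate.Release();
        }
    }
}
```

A caller would await `WaitAsync(url)` immediately before issuing each web request. The semaphore keeps concurrent callers for the same host in a queue, so each one observes the timestamp written by the caller before it.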