Agent Core

Responsibilities

claw_core implements the on-device Agent core. Given a message that carries what the user said and which session and channel it came from, it runs in a dedicated task to assemble context, call the LLM, parse tool calls, execute capabilities through a single entry point, and iterate as configured until the model returns final text or an error occurs.

Request shape

claw_core_request_t carries metadata for one interaction, for example:

session_id: session identifier (the Console session command switches the active session; IM routing also derives ids per policy).
user_text: user message body.
source_* / target_*: source channel, chat id, message id, source cap, etc. These are passed through to claw_cap_call_from_core so capabilities can reply or correlate context.

Applications submit requests with claw_core_submit and read a claw_core_response_t via claw_core_receive / claw_core_receive_for. The response carries the assistant text (text) and an error_message.

Request flags

claw_core_request_t.flags supports combinations of the following flags:

| Flag | Description |
| --- | --- |
| `CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE` | After inference, publish the response as an `out_message` event to the Event Router |
| `CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE` | Skip the response queue; results are not returned via `claw_core_receive` |
| `CLAW_CORE_REQUEST_FLAG_USER_INTERRUPT` | Mark this request as a user interrupt, started after aborting the previous request |

The Event Router’s run_agent action sets the first two flags to enable an asynchronous submit + event-published response path.
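The flag combination described above can be sketched as plain bitmask handling. This is a minimal illustration, not the component's code: the bit positions and the two `demo_` helpers are assumptions, and the real flag values come from the claw_core header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. When the real claw_core
 * header is included first, its definitions are used instead. */
#ifndef CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE
#define CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE (1u << 0)
#define CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE (1u << 1)
#define CLAW_CORE_REQUEST_FLAG_USER_INTERRUPT      (1u << 2)
#endif

/* Hypothetical helper: the combination the text attributes to run_agent. */
uint32_t demo_run_agent_flags(void)
{
    return CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE |
           CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE;
}

/* Hypothetical helper: true when the submitter should expect a result on
 * the response queue, i.e. the skip flag is not set. */
bool demo_expects_queued_response(uint32_t flags)
{
    return (flags & CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE) == 0;
}
```

A caller that sets the skip flag should not block on claw_core_receive for that request, since in this reading no response is queued for it.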

When assembling the current-turn user prompt sent to the LLM, claw_core also appends Behavior Notes. These tell the model that, because the framework usually delivers the final assistant result to the user automatically, it generally does not need to call IM send APIs from cap_im_platform to return the same text. This avoids duplicate replies.

Context assembly

claw_core can attach multiple context providers (claw_core_context_provider_t). Each contributes a slice to claw_core_context_t, distinguished by claw_core_context_kind_t:

CLAW_CORE_CONTEXT_KIND_SYSTEM_PROMPT
CLAW_CORE_CONTEXT_KIND_MESSAGES
CLAW_CORE_CONTEXT_KIND_TOOLS

edge_agent registers multiple providers in fixed order inside app_claw_start:

Editable profile/persona: claw_memory_profile_provider
Long-term memory: claw_memory_long_term_provider (full mode) or claw_memory_long_term_lightweight_provider (lightweight mode)
Session history: claw_memory_session_history_provider
Skills catalog list: claw_skill_skills_list_provider
Current visible tools: claw_cap_tools_provider

So the final tool list depends both on registered capabilities and on claw_cap_set_llm_visible_groups.
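The fixed-order assembly can be pictured with a small stand-in model. Everything prefixed `demo_` is hypothetical, including which context kind each provider contributes and the placeholder content strings; the real provider interface is claw_core_context_provider_t and is not reproduced here.

```c
#include <stddef.h>
#include <stdio.h>

typedef enum {
    DEMO_KIND_SYSTEM_PROMPT,
    DEMO_KIND_MESSAGES,
    DEMO_KIND_TOOLS,
} demo_context_kind_t;

typedef struct {
    const char *name;          /* provider name, taken from the list above */
    demo_context_kind_t kind;  /* assumed kind, for illustration only */
    const char *content;       /* empty string: nothing to contribute */
} demo_provider_t;

/* Registration order mirrors the list above; kinds and content are guesses. */
const demo_provider_t demo_providers[] = {
    {"claw_memory_profile_provider",         DEMO_KIND_SYSTEM_PROMPT, "persona"},
    {"claw_memory_long_term_provider",       DEMO_KIND_SYSTEM_PROMPT, "facts"},
    {"claw_memory_session_history_provider", DEMO_KIND_MESSAGES,      "user: hi"},
    {"claw_skill_skills_list_provider",      DEMO_KIND_SYSTEM_PROMPT, ""},
    {"claw_cap_tools_provider",              DEMO_KIND_TOOLS,         "[]"},
};

/* Joins every non-empty slice of one kind, in registration order, separated
 * by newlines. Returns the number of contributing providers, or -1 if the
 * output buffer is too small. */
int demo_collect(const demo_provider_t *providers, size_t count,
                 demo_context_kind_t kind, char *out, size_t out_size)
{
    size_t used = 0;
    int contributed = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        const demo_provider_t *p = &providers[i];
        if (p->kind != kind || p->content[0] == '\0') {
            continue;
        }
        int n = snprintf(out + used, out_size - used, "%s%s",
                         used ? "\n" : "", p->content);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1; /* slice did not fit */
        }
        used += (size_t)n;
        contributed++;
    }
    return contributed;
}
```

Because the loop walks providers in registration order, the assembled text for a given kind is deterministic as long as the registration order and each provider's output stay the same.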

Cache Hit-Rate Optimization Design

claw_core’s context assembly order is designed to maximize LLM API cache hit rates.

Design principle

LLM APIs such as Anthropic and OpenAI use prefix matching for caching. Therefore, recent ESP-Claw versions try to reduce how often early prompt content changes.

Skill document injection changes

Earlier designs injected activated Skill documents into the system prompt through a separate context provider (claw_skill_active_skill_docs_provider). Every Skill activation or deactivation changed the system prompt content and invalidated caches.

The current design removes that provider and instead injects Skill documents into conversation history through the return value of the activate_skill tool. This keeps the system prompt relatively stable, so the cache prefix is not affected by Skill activation state.
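To see why moving Skill documents out of the system prompt helps, compare how much of a prompt survives as a shared prefix under each design. The helper below is a byte-level stand-in written for this illustration; real provider caches work on tokens and have their own granularity rules, so treat it only as a model of the idea.

```c
#include <stddef.h>

/* Length in bytes of the longest prefix shared by two prompts.
 * Hypothetical helper; not part of claw_core. */
size_t demo_common_prefix_len(const char *a, const char *b)
{
    size_t n = 0;
    while (a[n] != '\0' && a[n] == b[n]) {
        n++;
    }
    return n;
}
```

With a previous prompt of `"SYS|TOOLS|user: hi"`, inserting a Skill document into the system prompt (`"SYS+DOC|TOOLS|user: hi"`) cuts the shared prefix off inside the system prompt, whereas appending it to history (`"SYS|TOOLS|user: hi|tool: DOC"`) leaves the whole previous prompt as a reusable prefix.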

Tool calls

claw_core handles LLM tool calls through callbacks.

In edge_agent, the call_cap callback targets claw_cap_call_from_core from claw_cap. claw_core passes along only the cap name, the JSON args, and the current request context; it does not resolve or execute the underlying capability implementation. Resolution and execution live in claw_cap.

Per-turn tool rounds are capped by max_tool_iterations.
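The callback-driven loop and its cap can be sketched as follows. All `demo_` types and functions are hypothetical stand-ins, not the claw_core API. Two behaviors are assumptions of this sketch rather than documented facts: the turn ends without text when the cap is reached, and a failing capability ends the turn (the real core may instead feed the error back to the model).

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool wants_tool;         /* model asked for a tool call this round */
    const char *cap_name;    /* valid when wants_tool is true */
    const char *json_args;   /* valid when wants_tool is true */
    const char *final_text;  /* valid when wants_tool is false */
} demo_llm_reply_t;

typedef demo_llm_reply_t (*demo_llm_fn)(int round, void *ctx);
typedef int (*demo_call_cap_fn)(const char *cap_name, const char *json_args,
                                void *ctx);

/* Runs one turn. Returns the final text, or NULL when the tool budget is
 * exhausted or a capability call fails. */
const char *demo_run_turn(demo_llm_fn llm, demo_call_cap_fn call_cap,
                          void *ctx, int max_tool_iterations)
{
    for (int round = 0; round <= max_tool_iterations; round++) {
        demo_llm_reply_t reply = llm(round, ctx);
        if (!reply.wants_tool) {
            return reply.final_text;
        }
        if (round == max_tool_iterations) {
            break; /* model still wants tools but the budget is spent */
        }
        if (call_cap(reply.cap_name, reply.json_args, ctx) != 0) {
            return NULL;
        }
    }
    return NULL;
}

/* Scripted stand-ins used to exercise the loop. */
typedef struct {
    int tool_rounds_wanted;  /* rounds in which the fake model asks for a tool */
    int cap_calls;           /* capability invocations observed */
} demo_script_t;

demo_llm_reply_t demo_scripted_llm(int round, void *ctx)
{
    const demo_script_t *s = ctx;
    demo_llm_reply_t reply = {0};
    if (round < s->tool_rounds_wanted) {
        reply.wants_tool = true;
        reply.cap_name = "demo_cap";
        reply.json_args = "{}";
    } else {
        reply.final_text = "done";
    }
    return reply;
}

int demo_counting_cap(const char *cap_name, const char *json_args, void *ctx)
{
    (void)cap_name;
    (void)json_args;
    ((demo_script_t *)ctx)->cap_calls++;
    return 0;
}
```

Keeping the capability call behind a function pointer mirrors the separation described above: the loop knows only a name and JSON arguments, while resolution and execution stay on the other side of the callback.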

Context persistence

claw_core persists replayable context records through a typed record batch callback.

In edge_agent, persist_context is claw_memory_persist_context_callback from claw_memory. claw_core invokes it for plain turns, intermediate tool rounds, final assistant records, and failure notes so context records scoped by session_id are persisted through claw_memory.

Request cancellation

claw_core_cancel_request(request_id) cancels the currently executing request:

request_id == 0: cancel any current in-flight request.
request_id != 0: only cancel if the current in-flight request ID matches.
If no cancellable request exists, returns ESP_ERR_NOT_FOUND.

Cancellation is cooperative and is mainly used to interrupt an in-progress LLM HTTP request. Once the cancelled request finishes, its error is normalized to request cancelled so upper layers can handle it uniformly.
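The request_id matching rules can be modeled in a few lines. This is a sketch under assumptions: `demo_core_state_t` and `demo_cancel_request` are hypothetical, request IDs are assumed to be 32-bit, and a non-matching ID is assumed to count as "no cancellable request". The `esp_err_t` stand-ins are only there so the snippet compiles off-target; on ESP-IDF they come from esp_err.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Off-target stand-ins for esp_err.h. */
#ifndef ESP_OK
typedef int esp_err_t;
#define ESP_OK 0
#define ESP_ERR_NOT_FOUND 0x105
#endif

typedef struct {
    bool in_flight;         /* a request is currently executing */
    uint32_t request_id;    /* ID of the in-flight request */
    bool cancel_requested;  /* cooperative flag polled by the worker */
} demo_core_state_t;

/* Models the documented rules: 0 cancels whatever is in flight, a non-zero
 * ID cancels only on an exact match. */
esp_err_t demo_cancel_request(demo_core_state_t *core, uint32_t request_id)
{
    if (!core->in_flight) {
        return ESP_ERR_NOT_FOUND;
    }
    if (request_id != 0 && request_id != core->request_id) {
        return ESP_ERR_NOT_FOUND; /* assumed result for a mismatch */
    }
    core->cancel_requested = true; /* the worker decides when to stop */
    return ESP_OK;
}
```

The function only sets a flag, which is what makes the scheme cooperative: the task running the request notices the flag at its next check and unwinds on its own.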

Completion observer

claw_core_add_completion_observer lets you register a completion observer that receives claw_core_completion_summary_t after each request:

request_id / session_id: request and session identifiers.
final_text: final assistant text for this turn (may be empty).
context_providers_csv: CSV list of providers that injected non-empty context.
tool_calls_csv: CSV list of tool calls triggered in this turn.

This is useful for audit, metrics, or result-consistency checks (for example, verifying whether specific claims were accompanied by the corresponding tool calls).
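A result-consistency check of that kind might look like the helper below, which tests whether a tool name appears as a whole entry in tool_calls_csv. The helper is hypothetical, and it assumes entries are separated by bare commas with no surrounding whitespace or quoting; the tool names in the usage note are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when `name` matches one complete entry of a comma-separated list.
 * A NULL or empty list, or an empty name, never matches. */
bool demo_csv_contains(const char *csv, const char *name)
{
    if (csv == NULL || name == NULL || name[0] == '\0') {
        return false;
    }
    size_t name_len = strlen(name);
    const char *p = csv;
    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        if (len == name_len && memcmp(p, name, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break; /* last entry checked */
        }
        p = end + 1;
    }
    return false;
}
```

An observer could, for example, flag a turn whose final_text claims a message was sent while `demo_csv_contains(summary->tool_calls_csv, "send_message")` is false.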

Init error handling

app_claw_start skips claw_core_init / claw_core_start when the API key, backend_type, or model is missing, and logs that the core was not started.

Core-dependent features (ask, default routing to the Agent, image inspect, …) stay unavailable until the LLM is fully configured and the device reboots. Event routing, automation, local capabilities, and the Console REPL still work.
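The gating condition can be pictured as a simple completeness check before init. The struct and function below are hypothetical; in particular, representing all three settings as strings and treating an empty string as missing are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the three settings the text names. */
typedef struct {
    const char *api_key;
    const char *backend_type;
    const char *model;
} demo_llm_settings_t;

/* A setting counts as present only if it is non-NULL and non-empty. */
static bool demo_is_set(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* True when all three settings are present, i.e. the core may be started. */
bool demo_llm_is_configured(const demo_llm_settings_t *s)
{
    return demo_is_set(s->api_key) &&
           demo_is_set(s->backend_type) &&
           demo_is_set(s->model);
}
```

Startup code following this pattern would call the check once, skip core init and start when it returns false, and continue bringing up the subsystems that do not depend on the LLM.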

See Boot and runtime wiring for the full core init flow and conditional logic.

Configuration

Common fields in claw_core_config_t:

backend_type / model / base_url / auth_type / max_tokens_field / timeout_ms / max_tokens: routing and auth for different LLM vendors (backends live under claw_core_llm and llm/backends).
system_prompt: system prompt, required (claw_core_init validates this is non-empty).
supports_tools / supports_vision / image_remote_url_only: whether tools, vision input, and remote-URL-only images are supported.
image_max_bytes: maximum size of images/media passed to the LLM layer.
request_gate: optional request-gating callback that can intercept or reject requests before handling.
task_stack_size, task_priority, task_core: FreeRTOS resources for the background Agent task.
request_queue_len, response_queue_len: queue depths.
max_context_providers: maximum number of registered providers.