
Agent Core

Header: claw_core.h (components/claw_modules/claw_core/include/claw_core.h)

claw_core implements the on-device Agent core. Given a message that carries what the user said and which session and channel it came from, it runs in a dedicated task that assembles context, calls the LLM, parses tool calls, executes capabilities through a single entry point, and iterates per configuration until the model returns final text or an error occurs.

claw_core_request_t carries metadata for one interaction, for example:

  • session_id: session identifier (the Console session command switches the active session; IM routing also derives ids per policy).
  • user_text: user message body.
  • source_* / target_*: source channel, chat id, message id, source cap, etc. These are passed through to claw_cap_call_from_core so capabilities can reply or correlate context.

Applications submit requests with claw_core_submit and read the resulting claw_core_response_t via claw_core_receive / claw_core_receive_for. claw_core_response_t includes the assistant text (text) and error_message.

claw_core_request_t.flags supports combinations of the following flags:

Flag                                          Description
CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE    After inference, publish the response as an out_message event to the Event Router
CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE    Skip the response queue; don’t return results via claw_core_receive
CLAW_CORE_REQUEST_FLAG_USER_INTERRUPT         Mark this request as a user interrupt, started after aborting the previous request

The Event Router’s run_agent action sets the first two flags to enable an asynchronous submit + event-published response path.
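The flag combination described above can be sketched as plain bitmask logic. This is a minimal illustration, not the component's code: the bit values are placeholders (the real definitions live in claw_core.h and may differ), and the demo_* helpers are hypothetical names introduced only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* PLACEHOLDER bit values for illustration only. The real definitions
 * are in claw_core.h and may use different values. */
#define CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE (1u << 0)
#define CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE (1u << 1)
#define CLAW_CORE_REQUEST_FLAG_USER_INTERRUPT      (1u << 2)

/* Hypothetical helper: the flag combination the text attributes to the
 * Event Router's run_agent action (asynchronous submit, response
 * published as an event). */
static uint32_t demo_async_publish_flags(void)
{
    return CLAW_CORE_REQUEST_FLAG_PUBLISH_OUT_MESSAGE |
           CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE;
}

/* Hypothetical helper: true if a caller should expect a result from
 * claw_core_receive for a request submitted with these flags. */
static bool demo_expects_queued_response(uint32_t flags)
{
    return (flags & CLAW_CORE_REQUEST_FLAG_SKIP_RESPONSE_QUEUE) == 0;
}
```

A request built with demo_async_publish_flags() would not be read back through claw_core_receive; the submitter relies on the out_message event instead.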

When assembling the current-turn user prompt sent to the LLM, claw_core also appends Behavior Notes. These tell the model that, because the framework usually auto-delivers the final assistant result to the user, it generally does not need to call IM send APIs from cap_im_platform to return the same text. This avoids duplicate replies.

claw_core can attach multiple context providers (claw_core_context_provider_t). Each contributes a slice to claw_core_context_t, distinguished by claw_core_context_kind_t:

  • CLAW_CORE_CONTEXT_KIND_SYSTEM_PROMPT
  • CLAW_CORE_CONTEXT_KIND_MESSAGES
  • CLAW_CORE_CONTEXT_KIND_TOOLS

edge_agent registers multiple providers in fixed order inside app_claw_start:

  1. Editable profile/persona: claw_memory_profile_provider
  2. Long-term memory: claw_memory_long_term_provider (full mode) or claw_memory_long_term_lightweight_provider (lightweight mode)
  3. Session history: claw_memory_session_history_provider
  4. Skills catalog list: claw_skill_skills_list_provider
  5. Current visible tools: claw_cap_tools_provider

So the final tool list depends both on registered capabilities and on claw_cap_set_llm_visible_groups.

claw_core’s context assembly order is designed to maximize LLM API cache hit rates.

LLM APIs such as those from Anthropic and OpenAI use prefix matching for caching, so a cached prompt is reused only up to the first point where the new prompt differs. Therefore, recent ESP-Claw versions try to reduce how often early prompt content changes.

Earlier designs injected activated Skill documents into the system prompt through a separate context provider (claw_skill_active_skill_docs_provider). Every Skill activation or deactivation changed the system prompt content and invalidated caches.

The current design removes that provider and instead injects Skill documents into conversation history through the return value of the activate_skill tool. This keeps the system prompt relatively stable, so the cache prefix is not affected by Skill activation state.
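The effect on a prefix-matching cache can be illustrated with a small off-target sketch. The prompt layout strings below are invented for the example and do not reflect the real prompt format; demo_common_prefix_len is a hypothetical helper that measures how much of a previous prompt a prefix cache could reuse.

```c
#include <stddef.h>
#include <string.h>

/* Length in bytes of the shared leading prefix of two prompts.
 * A prefix-matching cache can reuse content only up to this point. */
static size_t demo_common_prefix_len(const char *a, const char *b)
{
    size_t n = 0;
    while (a[n] != '\0' && a[n] == b[n]) {
        n++;
    }
    return n;
}

/* Invented prompt layouts, for illustration only. */

/* Earlier design: the Skill document becomes part of the system prompt,
 * so the prompt changes near its start when a Skill is activated. */
static const char *DEMO_OLD_BEFORE = "[system] persona\n[history] hi\n";
static const char *DEMO_OLD_AFTER  = "[system] persona + skill doc\n[history] hi\n";

/* Current design: the Skill document arrives as a tool result appended
 * to the conversation history, so the earlier prompt stays a prefix. */
static const char *DEMO_NEW_BEFORE = "[system] persona\n[history] hi\n";
static const char *DEMO_NEW_AFTER  = "[system] persona\n[history] hi\n[tool] skill doc\n";
```

With these strings, demo_common_prefix_len(DEMO_NEW_BEFORE, DEMO_NEW_AFTER) equals strlen(DEMO_NEW_BEFORE), meaning the whole previous prompt remains reusable, while the old-design pair diverges inside the system prompt.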

claw_core handles LLM tool calls through callbacks.

In edge_agent, the call_cap callback targets claw_cap_call_from_core from claw_cap. claw_core only accepts “cap name + JSON args + current request context”; it does not resolve or execute the underlying capability implementation. Resolution and execution live in claw_cap.

Per-turn tool rounds are capped by max_tool_iterations.
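The callback-driven loop and its cap can be sketched as follows. Every type and function here is a hypothetical stand-in (the real callback signatures are in claw_core.h), and the behavior at the cap, where the turn fails once the model asks for one tool round too many, is an assumption made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are real claw_core types. */
typedef struct {
    bool wants_tool;        /* model asked for a tool call this round */
    const char *cap_name;   /* capability name requested by the model */
    const char *args_json;  /* JSON arguments for that capability */
    const char *final_text; /* set when the model returns final text */
} demo_llm_reply_t;

typedef demo_llm_reply_t (*demo_llm_fn)(int round, void *ctx);
typedef int (*demo_call_cap_fn)(const char *cap_name,
                                const char *args_json, void *ctx);

/* Runs LLM rounds until final text arrives. Returns the final text, or
 * NULL if a capability fails or the tool budget is exhausted. The core
 * forwards only "cap name + JSON args + context" to call_cap. */
static const char *demo_run_turn(demo_llm_fn llm, demo_call_cap_fn call_cap,
                                 int max_tool_iterations, void *ctx)
{
    for (int round = 0; round <= max_tool_iterations; round++) {
        demo_llm_reply_t reply = llm(round, ctx);
        if (!reply.wants_tool) {
            return reply.final_text;
        }
        if (round == max_tool_iterations) {
            break; /* assumed behavior: budget exhausted, give up */
        }
        if (call_cap(reply.cap_name, reply.args_json, ctx) != 0) {
            return NULL;
        }
    }
    return NULL;
}

/* Mock model and capability used to exercise the loop. */
typedef struct {
    int tool_rounds; /* rounds in which the mock model requests a tool */
    int caps_called; /* number of capability invocations observed */
} demo_ctx_t;

static demo_llm_reply_t demo_mock_llm(int round, void *ctx)
{
    demo_ctx_t *c = ctx;
    demo_llm_reply_t r = {0};
    if (round < c->tool_rounds) {
        r.wants_tool = true;
        r.cap_name = "demo_cap"; /* hypothetical capability name */
        r.args_json = "{}";
    } else {
        r.final_text = "done";
    }
    return r;
}

static int demo_mock_call_cap(const char *cap_name, const char *args_json,
                              void *ctx)
{
    (void)cap_name;
    (void)args_json;
    ((demo_ctx_t *)ctx)->caps_called++;
    return 0;
}
```

Keeping capability resolution behind the callback is what lets the loop stay unaware of how a capability is implemented; in edge_agent that role is filled by claw_cap_call_from_core.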

claw_core persists replayable context records through a typed record batch callback.

In edge_agent, persist_context is claw_memory_persist_context_callback from claw_memory. claw_core invokes it for plain turns, intermediate tool rounds, final assistant records, and failure notes so context records scoped by session_id are persisted through claw_memory.

claw_core_cancel_request(request_id) cancels the currently executing request:

  • request_id == 0: cancel any current in-flight request.
  • request_id != 0: only cancel if the current in-flight request ID matches.
  • If no cancellable request exists, returns ESP_ERR_NOT_FOUND.

Cancellation is cooperative and is mainly used to interrupt an in-progress LLM HTTP request. Once the request finishes, its error is normalized to "request cancelled" so upper layers can handle it uniformly.
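The request_id matching rules can be modeled off-target as shown below. This is a sketch under stated assumptions, not the component's implementation: the state struct and demo_* names are hypothetical, a mismatched non-zero ID is assumed to count as "no cancellable request", and 0 is assumed never to be a valid request ID. DEMO_ERR_NOT_FOUND uses the same numeric value as ESP-IDF's ESP_ERR_NOT_FOUND.

```c
#include <stdint.h>

/* Local stand-ins for esp_err_t codes so the sketch builds off-target. */
typedef int demo_err_t;
#define DEMO_OK            0
#define DEMO_ERR_NOT_FOUND 0x105 /* numeric value of ESP_ERR_NOT_FOUND */

/* Hypothetical in-flight state; in_flight_id == 0 means idle. */
typedef struct {
    uint32_t in_flight_id;
    volatile int cancel_requested; /* polled by the worker (cooperative) */
} demo_core_state_t;

/* Models the documented rules:
 *   request_id == 0 -> cancel whatever is in flight
 *   request_id != 0 -> cancel only if it matches the in-flight request
 *   nothing to cancel -> not found */
static demo_err_t demo_cancel_request(demo_core_state_t *s, uint32_t request_id)
{
    if (s->in_flight_id == 0) {
        return DEMO_ERR_NOT_FOUND;
    }
    if (request_id != 0 && request_id != s->in_flight_id) {
        return DEMO_ERR_NOT_FOUND; /* assumption: mismatch is reported as not found */
    }
    s->cancel_requested = 1; /* the worker notices this and aborts */
    return DEMO_OK;
}
```

The function only sets a flag; the running request is expected to observe it and stop, which is what makes the scheme cooperative rather than preemptive.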

claw_core_add_completion_observer lets you register a completion observer that receives claw_core_completion_summary_t after each request:

  • request_id / session_id: request and session identifiers.
  • final_text: final assistant text for this turn (may be empty).
  • context_providers_csv: CSV list of providers that injected non-empty context.
  • tool_calls_csv: CSV list of tool calls triggered in this turn.

This is useful for audit, metrics, or result-consistency checks (for example, verifying whether specific claims were accompanied by the corresponding tool calls).
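A consistency check of that kind mostly comes down to testing whether a tool name appears in tool_calls_csv. The helper below is a hypothetical sketch; it assumes entries are unquoted and carry no surrounding whitespace, which the source does not specify. The observer callback signature is not shown here, so only the check itself is sketched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `name` appears as a whole entry in a comma-separated list.
 * Assumes unquoted entries without surrounding whitespace. */
static bool demo_csv_contains(const char *csv, const char *name)
{
    if (csv == NULL || name == NULL || name[0] == '\0') {
        return false;
    }
    size_t name_len = strlen(name);
    const char *entry = csv;
    while (*entry != '\0') {
        const char *end = strchr(entry, ',');
        size_t len = end ? (size_t)(end - entry) : strlen(entry);
        /* Compare whole entries so "activate_skill_x" does not match
         * "activate_skill". */
        if (len == name_len && strncmp(entry, name, len) == 0) {
            return true;
        }
        if (end == NULL) {
            break;
        }
        entry = end + 1;
    }
    return false;
}
```

Inside an observer, a check such as demo_csv_contains(summary->tool_calls_csv, "activate_skill") could flag turns where the assistant claims to have enabled a Skill without the matching tool call; `summary` is a placeholder name for the claw_core_completion_summary_t pointer the observer receives.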

app_claw_start skips claw_core_init / claw_core_start when the API key, backend_type, or model is missing, and logs that the core was skipped.

Core-dependent features (ask, default routing to the Agent, image inspect, …) stay unavailable until the LLM is fully configured and the device reboots. Event routing, automation, local capabilities, and the Console REPL still work.
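The skip condition can be pictured as a simple completeness check. The sketch below is hypothetical: it assumes all three settings are strings and that "missing" means NULL or empty, neither of which the source confirms.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: a setting counts as present if it is a
 * non-empty string. */
static bool demo_setting_present(const char *s)
{
    return s != NULL && s[0] != '\0';
}

/* Hypothetical helper: the core is started only when all three LLM
 * settings named in the text are present. */
static bool demo_llm_configured(const char *api_key,
                                const char *backend_type,
                                const char *model)
{
    return demo_setting_present(api_key) &&
           demo_setting_present(backend_type) &&
           demo_setting_present(model);
}
```

When such a check fails, only the core-dependent features are withheld; the rest of the boot sequence continues as described above.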

See also: Boot and runtime wiring (full core init flow and conditional logic).

Common fields in claw_core_config_t:

  • backend_type / model / base_url / auth_type / max_tokens_field / timeout_ms / max_tokens: routing and auth for different LLM vendors (backends live under claw_core_llm and llm/backends).
  • system_prompt: system prompt, required (claw_core_init validates this is non-empty).
  • supports_tools / supports_vision / image_remote_url_only: whether tools, vision input, and remote-URL-only images are supported.
  • image_max_bytes: maximum size of images/media passed to the LLM layer.
  • request_gate: optional request-gating callback that can intercept or reject requests before handling.
  • task_stack_size, task_priority, task_core: FreeRTOS resources for the background Agent task.
  • request_queue_len, response_queue_len: queue depths.
  • max_context_providers: maximum number of registered providers.