Capabilities overview

A capability is the uniform abstraction for anything callable in ESP-Claw. Each cap_* component registers one or more metadata-rich descriptors (claw_cap_descriptor_t) with claw_cap, and all execution funnels through the shared dispatcher.

Capabilities can play three roles:

| Role | kind | Meaning |
| --- | --- | --- |
| Callable tool | CLAW_CAP_KIND_CALLABLE | JSON in, text out; invocable from the LLM, Console, or automation |
| Event source | CLAW_CAP_KIND_EVENT_SOURCE | Long-lived producer feeding claw_event_router |
| Hybrid | CLAW_CAP_KIND_HYBRID | Both callable and event-emitting |
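
As a rough illustration of how the three roles relate, the sketch below models them with a stand-in enum and two predicates. The `demo_*` names are hypothetical and are not part of claw_cap; the real CLAW_CAP_KIND_* values and any helpers around them may look different.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the real kind enum, which is not shown on this page. */
typedef enum {
    DEMO_KIND_CALLABLE,
    DEMO_KIND_EVENT_SOURCE,
    DEMO_KIND_HYBRID,
} demo_cap_kind_t;

/* True when the capability can be invoked as a tool (callable or hybrid). */
bool demo_kind_is_callable(demo_cap_kind_t kind)
{
    return kind == DEMO_KIND_CALLABLE || kind == DEMO_KIND_HYBRID;
}

/* True when the capability produces events (event source or hybrid). */
bool demo_kind_emits_events(demo_cap_kind_t kind)
{
    return kind == DEMO_KIND_EVENT_SOURCE || kind == DEMO_KIND_HYBRID;
}
```

A hybrid capability satisfies both predicates, which is the only property this sketch is meant to show.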

Descriptors are grouped into capability groups (claw_cap_group_t)—the smallest unit for registration, start/stop, and LLM visibility toggles.

| Family | Components | Notes |
| --- | --- | --- |
| IM ingress | cap_im_platform | Unified component for Feishu, QQ, Telegram, and WeChat; registers per-platform groups such as cap_im_tg |
| Filesystem | cap_files | Managed FATFS read/write, edit, copy, move, delete, and listing |
| Lua | cap_lua | Author, sync/async run, and track Lua jobs |
| Skill admin | cap_skill | Register/unregister and activate Skills, exposing core features as tools and returning the full document on activation |
| LLM inspect | cap_llm_inspect | Nested multimodal calls over local images |
| HTTP request | cap_http_request | Allowlisted direct HTTP and HTTPS requests |
| Web search | cap_web_search | External search APIs |
| System | cap_system | System/device inspection with optional runtime sections, current local time queries, and restart controls |
| Scheduler | cap_scheduler | Cron-like scheduling |
| Router admin | cap_router_mgr | Dynamic Event Router rule maintenance |
| MCP | cap_mcp_* | Model Context Protocol client/server |
| CLI | cap_cli | Surface caps through terminal helpers |
| Sessions | cap_session_mgr | Session state persistence |
| Agent manager | cap_agent_mgr | Root-only subagent spawn / supervise / close / delete |

Every cap_* component exposes a cap_xxx_register_group() function for the app to call during init:

// cap_files example
esp_err_t cap_files_register_group(void)
{
    if (claw_cap_group_exists(s_files_group.group_id)) {
        return ESP_OK;
    }
    return claw_cap_register_group(&s_files_group);
}

claw_cap_register_group enforces unique group ids, runs descriptor init hooks, then finishes registration. claw_cap_start_group later invokes the start hook (for example, launching the Telegram poll task in the cap_im_platform component).
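
The register-then-start sequence described above can be sketched with a toy registry. Everything named `demo_*` is hypothetical: this is not the claw_cap implementation, it attaches the hooks to the group rather than to individual descriptors for brevity, and it uses plain int return codes where the real API returns esp_err_t.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_GROUPS 8

typedef int (*demo_hook_t)(void);      /* hypothetical hook type; 0 means success */

typedef struct {
    const char *group_id;
    demo_hook_t init;                  /* run once during registration; may be NULL */
    demo_hook_t start;                 /* run later by demo_start_group; may be NULL */
} demo_group_t;

static const demo_group_t *s_groups[DEMO_MAX_GROUPS];
static size_t s_group_count;

const demo_group_t *demo_find_group(const char *group_id)
{
    for (size_t i = 0; i < s_group_count; i++) {
        if (strcmp(s_groups[i]->group_id, group_id) == 0) {
            return s_groups[i];
        }
    }
    return NULL;
}

/* Returns 0 on success, -1 for a bad, duplicate, or overflowing group, -2 if init fails. */
int demo_register_group(const demo_group_t *group)
{
    if (group == NULL || group->group_id == NULL) {
        return -1;
    }
    if (demo_find_group(group->group_id) != NULL || s_group_count == DEMO_MAX_GROUPS) {
        return -1;                     /* group ids must be unique */
    }
    if (group->init != NULL && group->init() != 0) {
        return -2;                     /* a failed init hook aborts registration */
    }
    s_groups[s_group_count++] = group;
    return 0;
}

/* Starting is a separate step so long-lived tasks only launch when asked. */
int demo_start_group(const char *group_id)
{
    const demo_group_t *group = demo_find_group(group_id);
    if (group == NULL) {
        return -1;
    }
    return group->start != NULL ? group->start() : 0;
}

/* Sample group with counting hooks, used only to exercise the sketch. */
int demo_init_calls;
int demo_start_calls;
int demo_init_hook(void)  { demo_init_calls++;  return 0; }
int demo_start_hook(void) { demo_start_calls++; return 0; }
const demo_group_t demo_files_group = { "demo_files", demo_init_hook, demo_start_hook };
```

Registering demo_files_group twice succeeds the first time and is rejected the second, which is why the cap_files example checks claw_cap_group_exists before registering.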

Not every registered group is visible to the model. claw_cap_set_llm_visible_groups applies an allow-list controlling which tool schemas enter context:

// Typical boot policy: only baseline tools
static const char *VISIBLE_GROUPS[] = { "cap_files", "cap_skill", "cap_system" };
claw_cap_set_llm_visible_groups(VISIBLE_GROUPS, 3);

Other groups remain callable from the Console or automation; activating Skills expands the LLM-facing set.
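
One way to picture the allow-list is as a simple membership check made when tool schemas are collected for the model. The sketch below is an assumption about the general shape, not the claw_cap code; the `demo_*` and `DEMO_*` names are hypothetical, and the real function may copy the list instead of keeping a pointer to it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The sketch stores the caller's array by pointer, so it must outlive the registry. */
static const char *const *s_visible_ids;
static size_t s_visible_count;

void demo_set_llm_visible_groups(const char *const *group_ids, size_t count)
{
    s_visible_ids = group_ids;
    s_visible_count = (group_ids != NULL) ? count : 0;
}

/* Only groups on the allow-list would contribute tool schemas to the LLM context. */
bool demo_group_is_llm_visible(const char *group_id)
{
    for (size_t i = 0; i < s_visible_count; i++) {
        if (strcmp(s_visible_ids[i], group_id) == 0) {
            return true;
        }
    }
    return false;
}

/* Boot policy mirroring the example above; the count is derived from the array. */
const char *const DEMO_BOOT_VISIBLE[] = { "cap_files", "cap_skill", "cap_system" };
const size_t DEMO_BOOT_VISIBLE_COUNT =
    sizeof(DEMO_BOOT_VISIBLE) / sizeof(DEMO_BOOT_VISIBLE[0]);
```

With this policy, cap_files is visible to the model while cap_scheduler is not, although in this sketch nothing prevents the latter from being called through other entry points.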

Capabilities are invoked through three primary entry points:

  1. LLM tool calls: claw_core → claw_cap_call_from_core (session/channel aware)
  2. Console: cap call <name> <json> for direct execution
  3. Event Router rules: call_cap actions (see Dataflow and automation)

claw_cap_call_context_t.caller records whether the invoker was SYSTEM, AGENT, or CONSOLE.
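
A handler or logger can use the caller field to tell invocations apart. The sketch below assumes a minimal context struct holding only the caller; the `demo_*` types are hypothetical stand-ins, and the real claw_cap_call_context_t carries more fields and may name its enum values differently.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the caller enum and call context. */
typedef enum {
    DEMO_CALLER_SYSTEM,
    DEMO_CALLER_AGENT,
    DEMO_CALLER_CONSOLE,
} demo_caller_t;

typedef struct {
    demo_caller_t caller;
} demo_call_ctx_t;

const char *demo_caller_name(demo_caller_t caller)
{
    switch (caller) {
    case DEMO_CALLER_SYSTEM:  return "SYSTEM";
    case DEMO_CALLER_AGENT:   return "AGENT";
    case DEMO_CALLER_CONSOLE: return "CONSOLE";
    }
    return "UNKNOWN";
}

/* Writes "[CALLER] cap_name" into out and returns the snprintf result. */
int demo_format_audit(const demo_call_ctx_t *ctx, const char *cap_name,
                      char *out, size_t out_len)
{
    return snprintf(out, out_len, "[%s] %s", demo_caller_name(ctx->caller), cap_name);
}
```

For a context whose caller is DEMO_CALLER_CONSOLE and a made-up capability name such as get_time, the formatted line reads "[CONSOLE] get_time".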

How to implement a capability: Authoring a `cap_*` component end-to-end
cap_im_platform: Unified IM reference: Feishu, QQ, Telegram, and WeChat
cap_skill: Core-surface reference: `claw_skill` as tools
cap_agent_mgr: Subagent reference: root-only spawn, supervise, and tear down subagents
cap_llm_inspect: LLM-interaction reference: nested inference
cap_files: Filesystem reference: managed-tree read/write, edit, copy, move, and delete
cap_system: System-state reference: inspect runtime status and support safe restart
cap_scheduler: Scheduling reference: time-based event triggering and periodic tasks
cap_http_request: Allowlist-protected HTTP/HTTPS requests
cap_web_search: Web search through external search APIs (Tavily / Brave)
cap_router_mgr: Dynamic Event Router automation rule maintenance
cap_mcp_client / cap_mcp_server: MCP client and server support for cross-device tool calls
cap_lua and Lua overview: Lua tooling plus where scripts fit in the stack
Lua extension modules: Registering `lua_module_*` / `lua_driver_*`, custom modules, `display` deep dive