What new feature/improvement
Overview
We are designing a flexible, model-agnostic tool calling mechanism for AmritaCore. The goal is to enable any LLM (whether it supports native function calling or not) to invoke external tools in a structured, extensible way, while keeping token usage efficient and avoiding technical debt.
After internal discussions, we’ve settled on an XML‑based approach with an am: namespace. We’d love to get community feedback on the design before implementation.
Core Design
1. Tool Definition
Tools are registered in code and their definitions are presented to the LLM as an XML block in the system prompt.
Example:
<am:tools>
<am:tool name="calculator" description="Add two integers">
<parameter name="a" type="integer" description="First number" required="true"/>
<parameter name="b" type="integer" description="Second number" required="true"/>
</am:tool>
<am:tool name="weather" description="Get current weather for a city">
<parameter name="city" type="string" description="City name" required="true"/>
</am:tool>
</am:tools>
- Each <am:tool> contains metadata and a list of <parameter> children.
- Parameter types can be extended later (arrays, objects) using nested elements or JSON strings.
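To make the registration side concrete, here is a minimal sketch of what code-side tool registration and rendering might look like. The `Parameter`/`Tool` dataclasses and `render_tools` helper are hypothetical names for illustration, not a settled API:

```python
from dataclasses import dataclass, field

@dataclass
class Parameter:
    name: str
    type: str
    description: str
    required: bool = True

@dataclass
class Tool:
    name: str
    description: str
    parameters: list = field(default_factory=list)

def render_tools(tools):
    """Render registered tools as the <am:tools> block for the system prompt."""
    lines = ["<am:tools>"]
    for tool in tools:
        lines.append(f'<am:tool name="{tool.name}" description="{tool.description}">')
        for p in tool.parameters:
            req = "true" if p.required else "false"
            lines.append(
                f'<parameter name="{p.name}" type="{p.type}" '
                f'description="{p.description}" required="{req}"/>'
            )
        lines.append("</am:tool>")
    lines.append("</am:tools>")
    return "\n".join(lines)

calculator = Tool("calculator", "Add two integers", [
    Parameter("a", "integer", "First number"),
    Parameter("b", "integer", "Second number"),
])
print(render_tools([calculator]))
```

Keeping definitions in code and rendering them on demand means the same registry can later serve both the pure-XML prompt and the hybrid mode's lightweight summaries.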
2. Tool Invocation
When the LLM decides to call a tool, it must output a non‑self‑closing <am:tool_call> element, with every parameter as a child element.
Example:
<am:tool_call name="calculator">
<a>5</a>
<b>3</b>
</am:tool_call>
- Why non‑self‑closing?
Self‑closing tags would prevent us from using </am:tool_call> as a reliable stop sequence. By requiring an explicit closing tag, we can set the LLM’s stop parameter to ["</am:tool_call>"] and capture the entire invocation cleanly. (Note that most completion APIs omit the stop sequence itself from the returned text, so the system re‑appends the closing tag before parsing.)
- Why parameters as child elements?
This avoids attribute escaping issues, naturally supports complex data (by nesting or embedding JSON), and keeps the format consistent and extensible. It also eliminates the risk of attribute‑based technical debt later.
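A minimal sketch of how the host could recover a tool call from model output. Two details matter: the stop sequence usually strips the closing tag, so we restore it before matching; and since the `am:` prefix is never bound with an `xmlns:am` declaration, a strict XML parser would reject the element itself, so we extract it with a regex and parse only the plain-XML parameter children. Function names here are illustrative:

```python
import re
import xml.etree.ElementTree as ET

STOP = "</am:tool_call>"

def parse_tool_call(text):
    """Extract (tool_name, params) from model output, or None if no call."""
    # The stop parameter usually removes the closing tag; restore it for parsing.
    if "<am:tool_call" in text and STOP not in text:
        text += STOP
    m = re.search(r'<am:tool_call\s+name="([^"]+)">(.*?)</am:tool_call>',
                  text, re.DOTALL)
    if m is None:
        return None
    name, body = m.group(1), m.group(2)
    # The parameter children are plain XML, so a standard parser handles them.
    params = {}
    root = ET.fromstring(f"<params>{body}</params>")
    for child in root:
        params[child.tag] = child.text
    return name, params

# Model output as returned when generation halted at the stop sequence:
output = '<am:tool_call name="calculator">\n<a>5</a>\n<b>3</b>\n'
print(parse_tool_call(output))  # ('calculator', {'a': '5', 'b': '3'})
```

Wrapping the body in a synthetic `<params>` root is what lets a standard parser handle arbitrary child elements without any namespace machinery.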
3. Result Feedback
After executing the tool, the system inserts the result wrapped in an <am:tool_result> tag (or <am:tool_error> on failure):
<am:tool_result name="calculator">8</am:tool_result>
The LLM then continues generation, seeing the result as part of the conversation history.
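The execute-and-wrap step could look roughly like the sketch below, where `registry` is a hypothetical mapping from tool names to callables (how detailed the error payload should be is one of the open questions further down):

```python
import html

def run_tool(name, params, registry):
    """Execute a registered tool and wrap the outcome for the conversation."""
    try:
        result = registry[name](**params)
        return (f'<am:tool_result name="{name}">'
                f'{html.escape(str(result))}</am:tool_result>')
    except Exception as exc:
        # Escaping keeps arbitrary result/error text from breaking the XML.
        return (f'<am:tool_error name="{name}">'
                f'{html.escape(str(exc))}</am:tool_error>')

registry = {"calculator": lambda a, b: int(a) + int(b)}
print(run_tool("calculator", {"a": "5", "b": "3"}, registry))
# <am:tool_result name="calculator">8</am:tool_result>
```

Escaping the payload matters: a tool that returns text containing `<` or `&` would otherwise corrupt the tags the LLM is trained on during the session.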
Two Operating Modes
We envision two modes of operation, allowing users to choose the trade‑off between simplicity and token efficiency.
Mode 1: Pure XML
- All tool definitions (the full <am:tools> block) are included in every conversation’s system prompt.
- The LLM invokes tools directly using <am:tool_call>.
- Pros: Model‑agnostic, simple to implement, works with any LLM.
- Cons: If you have many tools, token usage can be high.
Mode 2: Hybrid (Native Function Calling + XML)
- Step 1 – Tool selection:
Use the LLM’s native function calling capability (e.g., OpenAI tools) to let the model choose which tools it needs. Only lightweight summaries (name + short description) are sent in this phase to save tokens.
- Step 2 – Dynamic injection:
Once the model signals the required tools (via a native tool call or a text list), the system injects the full XML definitions of only those tools into the context.
- Step 3 – XML invocation:
From that point on, the conversation uses the pure XML mechanism described above.
- Pros: Drastically reduces token usage when tool sets are large; still model‑agnostic for the actual tool execution.
- Cons: Requires native function‑calling support for the selection phase (with an optional fallback to text‑based selection); slightly more complex to implement.
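The two-phase flow above can be sketched as two pure functions: one producing the lightweight phase‑1 summaries, one injecting full definitions for only the selected tools. The selection call itself (native function calling) is stubbed out here, and all names are illustrative:

```python
def summarize_tools(tools):
    """Phase 1 payload: lightweight name + description summaries only."""
    return [{"name": t["name"], "description": t["description"]} for t in tools]

def inject_selected(tools, selected_names):
    """Phase 2: full XML definitions for only the tools the model asked for."""
    chosen = [t for t in tools if t["name"] in selected_names]
    lines = ["<am:tools>"]
    for t in chosen:
        lines.append(f'<am:tool name="{t["name"]}" description="{t["description"]}">')
        for p in t.get("parameters", []):
            lines.append(f'<parameter name="{p["name"]}" type="{p["type"]}" '
                         f'description="{p["description"]}" required="true"/>')
        lines.append("</am:tool>")
    lines.append("</am:tools>")
    return "\n".join(lines)

tools = [
    {"name": "calculator", "description": "Add two integers",
     "parameters": [{"name": "a", "type": "integer", "description": "First number"},
                    {"name": "b", "type": "integer", "description": "Second number"}]},
    {"name": "weather", "description": "Get current weather for a city",
     "parameters": [{"name": "city", "type": "string", "description": "City name"}]},
]
# Pretend the native function-calling phase selected only "weather":
print(inject_selected(tools, {"weather"}))
```

With a large tool set, the token savings come from the gap between the summary list (a few tokens per tool) and the full definitions (injected only for the handful of tools actually chosen).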
Both modes will be supported, with the hybrid mode as an advanced, opt‑in feature.
Open Questions & Discussion Points
We’d love the community’s input on the following:
- Parameter representation for complex types: Should we standardize on JSON strings inside child elements for arrays/objects, or invest in fully structured XML nesting from the start? The former is simpler for LLMs to generate; the latter is more “native” to XML.
- Dynamic tool discovery: In hybrid mode, how should the model request additional tools mid‑conversation? A special tag like <am:request_tools>? Or re‑enter the selection phase?
- Error handling and retries: How detailed should error messages in <am:tool_error> be? Should we include stack traces or only user‑friendly messages?
- Tool definition granularity: Should we include parameter schemas (like JSON Schema) inside the XML, or rely on code‑side validation and only provide human‑readable descriptions? The current proposal uses simple <parameter> elements, but we could embed JSON Schema for more rigor.
- Compatibility with existing agent frameworks: Are there other tool‑calling standards (e.g., OpenAPI, MCP) we should consider aligning with?
- Streaming considerations: How should we handle tool calls when using streaming responses? Detect the closing tag on the fly and abort generation?
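On the streaming question, one possible approach is to match the closing tag against a growing buffer rather than individual chunks, since the tag can be split across chunk boundaries. A hedged sketch (the function name is illustrative, and a real client would also cancel the underlying HTTP stream):

```python
def stream_until_tool_call(chunks, stop="</am:tool_call>"):
    """Accumulate streamed chunks, aborting once the closing tag appears.

    Tags can be split across chunk boundaries, so we search the growing
    buffer instead of each chunk in isolation.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        idx = buffer.find(stop)
        if idx != -1:
            # Truncate anything generated after the closing tag.
            return buffer[: idx + len(stop)], True
    return buffer, False

# The closing tag arrives split across two chunks:
chunks = ['<am:tool_call name="calc', 'ulator">\n<a>5</a>\n<b>3</b>\n</am:tool',
          '_call> trailing text']
text, called = stream_until_tool_call(chunks)
```

When the provider supports server‑side stop sequences even in streaming mode, that is cheaper; this client‑side scan is the fallback for providers that do not.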
Next Steps
- We will implement a proof‑of‑concept in a feature branch, focusing on the pure XML mode first.
- After gathering feedback, we’ll refine the design and move toward a production‑ready implementation in AmritaCore 1.x (pure XML) and 2.x (hybrid).
Please share your thoughts, concerns, or alternative ideas! We believe this approach balances flexibility, extensibility, and token efficiency, and we’re excited to build it with the community.