Enhance MCP server integration and tool registration in OpenClaw#36
Enhance MCP server integration and tool registration in OpenClaw#36Telli merged 19 commits intoclawdotnet:mainfrom
Conversation
Co-authored-by: geffzhang <439390+geffzhang@users.noreply.github.com> Agent-Logs-Url: https://github.com/geffzhang/openclaw.net/sessions/8d16cf23-fb87-42c9-91f6-93bbb91550bd
Enhance agent delegation, bash execution, and WhatsApp/Teams integration
Add ModelContextProtocol dependency and refactor McpServerToolRegistry to use McpClient and transport implementations instead of custom McpServerSession types. Introduces transport creation, header and environment variable resolution, tool name sanitization, and centralized tool loading via the ModelContextProtocol client. Update disposal to dispose clients and simplify tool execution to use the client's CallTool API. Tests updated to exercise the new HTTP transport, verify header/secret resolution, and use an embedded MCP server with DemoMcpTools.
…gration Add MCP server plugin config and register discovered tools as native OpenClaw tools
🤖 Augment PR SummarySummary: This PR adds first-class support for integrating external MCP (Model Context Protocol) servers as native tools in OpenClaw. Changes:
Technical Notes: Uses ModelContextProtocol v1.1.0 and applies separate startup vs request timeouts; secrets in headers/env values are resolved via 🤖 Was this summary useful? React with 👍 or 👎 |
| public async Task RegisterToolsAsync(NativePluginRegistry nativeRegistry, CancellationToken ct) | ||
| { | ||
| var tools = await LoadAsync(ct); | ||
| nativeRegistry.RegisterOwnedResource(this); |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:42 RegisterToolsAsync registers this as an owned resource on every call; if RegisterToolsAsync is invoked more than once, NativePluginRegistry will dispose the same registry instance multiple times.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| var displayName = string.IsNullOrWhiteSpace(serverConfig.Name) ? serverId : serverConfig.Name!; | ||
| var pluginId = $"mcp:{serverId}"; | ||
|
|
||
| var tools = await LoadToolsFromClientAsync(client, serverId, pluginId, displayName, serverConfig, timeoutCts.Token); |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:73 passes the startup-timeout token into LoadToolsFromClientAsync, so tool listing is also bounded by StartupTimeoutSeconds (even if RequestTimeoutSeconds is larger).
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| _clients.Add(client); | ||
|
|
||
| var displayName = string.IsNullOrWhiteSpace(serverConfig.Name) ? serverId : serverConfig.Name!; | ||
| var pluginId = $"mcp:{serverId}"; | ||
|
|
||
| var tools = await LoadToolsFromClientAsync(client, serverId, pluginId, displayName, serverConfig, timeoutCts.Token); | ||
|
|
||
| foreach (var tool in tools) | ||
| { | ||
| _tools.Add(new DiscoveredMcpTool( | ||
| pluginId, | ||
| new McpNativeTool(client, tool.LocalName, tool.RemoteName, tool.Description, tool.InputSchemaText), | ||
| displayName)); | ||
| } | ||
| } | ||
| catch | ||
| { | ||
| if (client is IDisposable disposable) | ||
| disposable.Dispose(); | ||
| throw; |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:68-87: if an exception occurs after _clients.Add(client), the catch disposes client but leaves it in _clients, so Dispose() may attempt to dispose an already-disposed client later.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| try | ||
| { | ||
| (client as IDisposable)?.Dispose(); | ||
| (client as IAsyncDisposable)?.DisposeAsync().GetAwaiter().GetResult(); |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:160-162 calls both IDisposable.Dispose() and IAsyncDisposable.DisposeAsync() on the same McpClient; for types implementing both, this can cause double-dispose side effects that get silently swallowed by the empty catch.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| pluginId, detail is not null ? $" ({detail})" : ""); | ||
| } | ||
|
|
||
| public void RegisterExternalTool(ITool tool, string pluginId, string? detail = null) |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/NativePluginRegistry.cs:81-82: RegisterExternalTool uses RegisterTool, which removes the previously-registered tool on name collisions without disposing it; with MCP tools now registering dynamically, that can turn tool-name conflicts into resource leaks for displaced IDisposable tools.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| // MCP plugin servers | ||
| if (config.Plugins.Mcp.Enabled) | ||
| { | ||
| foreach (var (serverId, server) in config.Plugins.Mcp.Servers) |
There was a problem hiding this comment.
src/OpenClaw.Core/Validation/ConfigValidator.cs:172-196 validates every MCP server entry even if server.Enabled is false; this can make disabled servers fail config validation due to missing Command/Url, despite McpServerToolRegistry skipping disabled servers at runtime.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Move the MCP native tool implementation into its own OpenClaw.Agent.Tools.McpNativeTool class and clean up McpServerToolRegistry logic. Ensure nativeRegistry.RegisterOwnedResource is called only once using Interlocked.Exchange. Refactor loading to accumulate discovered clients/tools before committing, properly dispose clients on failure (supporting IAsyncDisposable), and mark _loaded appropriately so failed attempts can be retried. Pass the correct cancellation token for tool listing and add tests to validate single owned-resource registration, retry behavior after failure, and request-timeout handling for tool listing.
Ignore MCP servers that are explicitly disabled during config validation to avoid reporting missing required fields for disabled entries. Adds a guard in ConfigValidator to continue when server.Enabled is false, and a unit test (Validate_DisabledMcpServerWithMissingRequiredFields_DoesNotReturnError) to assert no validation errors are produced for disabled stdio/http servers with empty command/url.
When registering an external tool with a duplicate name, previously the old registrations were removed but not disposed. Capture displaced tools before removing them and dispose any that implement IDisposable (and aren't the same instance) to avoid resource leaks. Add a unit test (RegisterExternalTool_NameCollision_DisposesDisplacedDisposableTool) and a DisposableFakeTool helper to verify the behavior.
|
augment review |
| var description = !string.IsNullOrWhiteSpace(tool.Description) | ||
| ? $"{tool.Description} (from MCP server '{displayName}')" | ||
| : $"MCP tool '{remoteName}' from server '{displayName}'."; | ||
| var inputSchema = "{}"; |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:134 currently hard-codes inputSchema to "{}" for every MCP tool. This means ITool.ParameterSchema won’t reflect the server-provided schema (if any), which can cause incorrect tool declarations/arguments generation upstream.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| { | ||
| var tools = await LoadAsync(ct); | ||
| if (Interlocked.Exchange(ref _registeredAsOwnedResource, 1) == 0) | ||
| nativeRegistry.RegisterOwnedResource(this); |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:43 registers the DI-managed McpServerToolRegistry instance as an owned resource of NativePluginRegistry. Since both are singletons implementing IDisposable, shutdown can end up disposing this registry twice (once via NativePluginRegistry.Dispose(), again via the DI container).
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| foreach (var displaced in displacedTools) | ||
| { | ||
| if (!ReferenceEquals(displaced, tool) && displaced is IDisposable disposable) | ||
| disposable.Dispose(); |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/NativePluginRegistry.cs:77 disposes displaced tools on name collisions without guarding against Dispose() throwing. If a plugin tool’s Dispose() throws, it can abort tool registration during startup and leave the registry in a partially-updated state.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Stop treating the registry as an owned resource and preserve MCP tool JSON schemas as raw JSON text.
- Removed _ownedResources tracking, registration as an owned resource, and disposal of those resources from McpServerToolRegistry.
- Added System.Text.Json usage and ResolveInputSchemaText(JsonElement) to return '{}' for null/undefined or the raw JSON text otherwise, so tool parameter schemas are emitted as proper JSON.
- Updated tests to parse and assert the JSON schema structure and to expect the registry is not registered as an owned resource anymore (test renamed accordingly).
These changes ensure schema fidelity when exposing remote MCP tool schemas and simplify ownership semantics for the registry.
Wrap disposal of displaced native tools in a try/catch and log a warning if Dispose throws, so a failing Dispose won't abort plugin/tool registration. Adds a unit test (RegisterExternalTool_NameCollision_DisposeFailureDoesNotAbortRegistration) and a ThrowingDisposableFakeTool to verify that registration proceeds and the exception is logged when Dispose fails.
|
augment review |
|
|
||
| internal async Task<IReadOnlyList<DiscoveredMcpTool>> LoadAsync(CancellationToken ct) | ||
| { | ||
| if (_loaded) |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:47: LoadAsync gates on _loaded but doesn’t synchronize access to _loaded, _tools, or _clients; concurrent callers could run discovery twice and append duplicate clients/tools. If this can be called from multiple threads, consider guarding the load with a lock/semaphore to make it single-flight.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| public void RegisterExternalTool(ITool tool, string pluginId, string? detail = null) | ||
| => RegisterTool(tool, pluginId, detail); | ||
|
|
||
| public void RegisterOwnedResource(IDisposable resource) |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/NativePluginRegistry.cs:102: RegisterOwnedResource unconditionally appends to _ownedResources; registering the same instance twice (or passing null) will lead to double-dispose or a NullReferenceException during shutdown. This seems easy to hit as more code starts using owned resources.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| } | ||
|
|
||
| foreach (var resource in _ownedResources) | ||
| resource.Dispose(); |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/NativePluginRegistry.cs:231-232: disposing _ownedResources isn’t wrapped in any error handling, so a single Dispose() throwing will abort disposing remaining resources and may surface as a shutdown failure. This differs from the guarded disposal used for displaced tools earlier in the file.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| { | ||
| using var argsDoc = JsonDocument.Parse(string.IsNullOrWhiteSpace(argumentsJson) ? "{}" : argumentsJson); | ||
| var argsDict = new Dictionary<string, object?>(StringComparer.Ordinal); | ||
| foreach (var prop in argsDoc.RootElement.EnumerateObject()) |
There was a problem hiding this comment.
src/OpenClaw.Agent/Tools/McpNativeTool.cs:25: argsDoc.RootElement.EnumerateObject() throws if the JSON root isn’t an object (e.g., []), but that gets reported via the generic “MCP tool failed” path rather than as an argument/JSON-shape error. That can make bad-argument failures harder to diagnose for callers.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Add a private SemaphoreSlim to serialize LoadAsync to prevent concurrent initialization. LoadAsync now waits on the semaphore, checks _loaded under the lock, and releases the semaphore in a finally block; client discovery and error cleanup are preserved. Dispose waits on the semaphore to avoid races with an in-flight load, disposes clients, then releases and disposes the semaphore. Add a unit test (LoadAsync_ConcurrentCalls_LoadsToolsOnce) to verify concurrent calls only perform a single discovery.
Validate and deduplicate owned resources in NativePluginRegistry.RegisterOwnedResource: throw on null, ignore if the same instance is already registered, otherwise add. Adds tests covering null argument, duplicate registration resulting in a single dispose call, and a DisposableOwnedResource test helper.
Add a check in McpNativeTool to ensure the parsed JSON root is an object and return a clear error message if not. This avoids exceptions from calling EnumerateObject() on non-object roots (arrays or primitives) when processing tool arguments.
|
augment review |
|
|
||
| public void Dispose() | ||
| { | ||
| _loadSemaphore.Wait(); |
There was a problem hiding this comment.
| public async Task RegisterToolsAsync(NativePluginRegistry nativeRegistry, CancellationToken ct) | ||
| { | ||
| var tools = await LoadAsync(ct); | ||
| foreach (var tool in tools) |
There was a problem hiding this comment.
RegisterToolsAsync will re-register the same ITool instances on subsequent calls, which will trigger duplicate-name overwrite warnings in NativePluginRegistry (and potentially reorder the registry) even though nothing changed.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
|
|
||
|
|
||
|
|
||
| private static string NormalizeTransport(McpServerConfig config) |
There was a problem hiding this comment.
| d.Dispose(); | ||
| } | ||
|
|
||
| foreach (var resource in _ownedResources) |
There was a problem hiding this comment.
| value = v.GetString(); | ||
| break; | ||
| case JsonValueKind.Number: | ||
| value = v.TryGetInt64(out var l) ? l : v.GetDouble(); |
There was a problem hiding this comment.
Centralize MCP transport normalization into McpServerConfig.NormalizeTransport() and update callers (ConfigValidator and McpServerToolRegistry) to use the extension; remove duplicate NormalizeTransport implementation. Add a _registered guard to McpServerToolRegistry.RegisterToolsAsync to avoid double registration. Improve Dispose() to wait for the load semaphore with a 5s timeout, log a warning if acquiring fails, and only release the semaphore if it was acquired. Fix JSON number handling in McpNativeTool by cloning the JsonElement value to preserve numeric fidelity. Also add the required using for plugins namespace.
|
augment review |
| { | ||
| if (acquired) | ||
| _loadSemaphore.Release(); | ||
| _loadSemaphore.Dispose(); |
There was a problem hiding this comment.
Dispose() calls _loadSemaphore.Dispose() even when it failed to acquire the semaphore, which can race with an in-flight LoadAsync and cause ObjectDisposedException or concurrent disposal of _clients. Consider ensuring disposal only proceeds once no load is active (or otherwise coordinating shutdown) to avoid tearing down the semaphore while it’s still in use.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| var resolved = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase); | ||
| foreach (var (name, rawValue) in headers) | ||
| { | ||
| var value = SecretResolver.Resolve(rawValue) ?? rawValue; |
There was a problem hiding this comment.
SecretResolver.Resolve(rawValue) ?? rawValue means a missing env:... variable will silently fall back to sending the literal secret reference (e.g., env:TEST_AUTH_TOKEN) as a header/env value. That seems inconsistent with SecretResolver’s “null when unset” contract and could leak placeholders to the server; consider treating unresolved env: refs as missing/invalid instead.
Severity: medium
Other Locations
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:252
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
McpServerToolRegistry: strengthen Dispose to wait indefinitely if the initial semaphore wait times out, handle ObjectDisposedException and log appropriately, and ensure load completion ordering. Secret resolution for environment variables and headers now treats unresolved values that start with "env:" as errors (throws InvalidOperationException) and preserves resolved/null semantics. Tests: update multiple tests to dispose the registry via "using var registry =" to match IDisposable behavior and adjust one test to construct the registry with a provided logger. These changes tighten lifecycle handling and make secret/env resolution failures explicit.
|
augment review |
| /// </summary> | ||
| public async Task RegisterToolsAsync(NativePluginRegistry nativeRegistry, CancellationToken ct) | ||
| { | ||
| if (_registered) |
There was a problem hiding this comment.
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:42-50: _registered is read/written without synchronization, so concurrent callers of RegisterToolsAsync can both proceed and register the same discovered tools twice. If this can be invoked from multiple startup paths, it could cause duplicate registrations/name-collision churn.
Severity: low
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
src/OpenClaw.Gateway/Composition/RuntimeInitializationExtensions.cs
Outdated
Show resolved
Hide resolved
Add proper semaphore locking around tool loading and registration to prevent races, extract LoadInternalAsync to simplify load logic, and ensure clients are disposed on failure. Only register MCP tools at startup when Plugins.Mcp is enabled, and include Plugins.Mcp in the security check that forbids third-party plugins on public binds. Also remove unused usings and tidy up client/tool collection handling.
|
augment review |
| { | ||
| return $"Error: Invalid JSON arguments for MCP tool '{localName}': {ex.Message}"; | ||
| } | ||
| catch (Exception ex) |
There was a problem hiding this comment.
McpNativeTool.ExecuteAsync catches all exceptions, including OperationCanceledException; this can hide tool timeouts/cancellation from OpenClawToolExecutor (which relies on cancellation exceptions to flag timeouts and propagate shutdown cancellation). Consider letting cancellation propagate (at least when ct.IsCancellationRequested) instead of converting it into a normal string result.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
|
|
||
| try | ||
| { | ||
| foreach (var (serverId, serverConfig) in _config.Servers) |
There was a problem hiding this comment.
_config.Servers (and the Environment/Headers dictionaries used later) are assumed non-null; if config binding ever sets these to null (e.g., JSON has "Servers": null / "Headers": null), discovery will throw NullReferenceException during startup. Consider null-coalescing/normalizing these collections before iterating/reading .Count.
Severity: medium
Other Locations
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:253src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:259
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Avoid null-reference exceptions and improve cancellation handling: - Iterate safely over _config.Servers by defaulting to an empty collection when Servers may be null. - Make ResolveEnv and ResolveHeaders accept nullable Dictionary parameters and return null for null/empty inputs to reflect absence of values. - Add an explicit catch for OperationCanceledException that rethrows when the provided CancellationToken is cancelled so cancellation propagates correctly. These changes harden MCP tool registration and execution against missing configuration and ensure proper task cancellation semantics.
|
augment review |
| foreach (var (name, rawValue) in environment) | ||
| { | ||
| var value = SecretResolver.Resolve(rawValue); | ||
| if (value is null && rawValue.StartsWith("env:", StringComparison.Ordinal)) |
There was a problem hiding this comment.
ResolveEnv assumes rawValue is non-null; if configuration binding yields a null dictionary value, rawValue.StartsWith(...) will throw and prevent MCP discovery. This same null-value risk exists in ResolveHeaders too.
Severity: medium
Other Locations
src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs:292
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
| "stdio" => new StdioClientTransport(new StdioClientTransportOptions | ||
| { | ||
| Command = config.Command!, | ||
| Arguments = config.Arguments, |
There was a problem hiding this comment.
CreateTransport passes config.Arguments through directly; if config binding sets Arguments to null, transport creation (or downstream MCP client code) can fail at startup. It may be worth ensuring Arguments can’t be null by the time it reaches this point.
Severity: medium
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
Add null-safety and stronger validation for MCP plugin configuration. In McpServerToolRegistry: default stdio Arguments to an empty array, and handle null raw environment/header values (preserving null for env vars and using empty string for headers) to avoid null-ref errors during resolution. In ConfigValidator: return a validation error when Plugins.Mcp is enabled but Servers is null, and restructure the servers loop while keeping existing checks for transport type, command presence for stdio, URL validity for http, and startup/request timeouts.
Description
This pull request introduces support for integrating external MCP (Model Context Protocol) servers as native tools within the OpenClaw Agent. It adds configuration options, validation, service registration, and runtime logic to discover, register, and manage tools exposed by MCP servers, making them available alongside built-in and native plugins. The changes also ensure robust configuration validation and proper resource management.
Key changes include:
MCP Server Integration & Tool Registration
McpServerToolRegistryservice that connects to configured MCP servers, discovers their available tools, and registers them as native tools in theNativePluginRegistry. This includes logic for both HTTP and stdio transports, tool name resolution, error handling, and resource cleanup. (src/OpenClaw.Agent/Plugins/McpServerToolRegistry.cs)src/OpenClaw.Gateway/Composition/ToolServicesExtensions.cs,src/OpenClaw.Gateway/Composition/RuntimeInitializationExtensions.cs) [1] [2] [3]Configuration and Validation Enhancements
src/OpenClaw.Core/Plugins/PluginModels.cs,src/OpenClaw.Gateway/appsettings.json,src/OpenClaw.Gateway/appsettings.Production.json) [1] [2] [3]src/OpenClaw.Core/Validation/ConfigValidator.cs) [1] [2]src/OpenClaw.Tests/ConfigValidatorTests.cs)Resource Management
NativePluginRegistryto track and dispose of owned resources, including those registered by the MCP tool registry. (src/OpenClaw.Agent/Plugins/NativePluginRegistry.cs) [1] [2] [3]Dependency Updates
ModelContextProtocolpackage dependency to support MCP client functionality. (src/OpenClaw.Agent/OpenClaw.Agent.csproj)Related Issues
#35
Type of Change
Checklist
dotnet test)Screenshots (if applicable)
[Add screenshots for UI/UX changes]
Benchmarks (if applicable)
[Did this change affect performance? Provide benchmark results.]