Nice work!
Critical Issues
1. matches_uri_pattern regex injection bug (view.py:942-946)
Non-wildcard characters (., (, +, etc.) are not escaped before glob-to-regex conversion. A pattern like tool://my.namespace/tool matches tool://myXnamespace/tool because . is a regex metacharacter. This can cause false-positive URI matches in policy rules.
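One possible shape for the fix, as a minimal sketch rather than the code in view.py: escape every literal character with `re.escape` and translate only the wildcards. The helper name `glob_to_regex` is hypothetical, and the wildcard semantics (`*` stays within one path segment, `**` crosses `/`) are an assumption about the intended behavior.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a URI glob into a regex, escaping all literal text."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")  # assumed: '**' may cross '/' boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")  # assumed: '*' stays within one segment
            i += 1
        else:
            # Literal character: neutralise regex metacharacters such as '.'
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches_uri_pattern(uri: str, pattern: str) -> bool:
    """Return True only if the whole URI matches the glob."""
    return glob_to_regex(pattern).fullmatch(uri) is not None
```

With this translation, `tool://my.namespace/tool` no longer matches `tool://myXnamespace/tool`, because the dot is treated as a literal.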
2. iter_views silently drops unknown content types (view.py:1159)
Parts with a content_type not in _CONTENT_TYPE_TO_VIEW_KIND are silently skipped. A policy engine calling iter_views() could miss parts entirely, meaning a tool call could bypass all policy evaluation. Should raise or at minimum log a warning.
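A rough sketch of the fail-closed behavior, assuming parts are plain dicts and using a two-entry stand-in for `_CONTENT_TYPE_TO_VIEW_KIND`. The function name `iter_view_kinds`, the `strict` flag, and the exception class are hypothetical.

```python
import logging
from typing import Dict, Iterable, Iterator

logger = logging.getLogger(__name__)

# Hypothetical stand-in for the real mapping in view.py.
_CONTENT_TYPE_TO_VIEW_KIND: Dict[str, str] = {
    "text": "text",
    "tool_call": "tool_call",
}


class UnknownContentTypeError(ValueError):
    """Raised when a content part has no registered view kind."""


def iter_view_kinds(parts: Iterable[dict], strict: bool = True) -> Iterator[str]:
    """Yield the view kind for each part, never skipping one silently."""
    for index, part in enumerate(parts):
        content_type = part.get("content_type")
        kind = _CONTENT_TYPE_TO_VIEW_KIND.get(content_type)
        if kind is None:
            if strict:
                raise UnknownContentTypeError(
                    f"part {index}: unknown content_type {content_type!r}"
                )
            # Non-strict mode still leaves an audit trail.
            logger.warning("skipping part %d: unknown content_type %r", index, content_type)
            continue
        yield kind
```

Defaulting to `strict=True` means a policy engine fails closed unless the caller explicitly opts out.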
3. Tests
Please keep test coverage above our current 96% coverage level. Tests should be addressed before merge.
Type Safety Issues
4. extensions field typed as Any (message.py:839)
Is there a reason for typing it Any? Should be Extensions | None. As-is, no type checker can validate extension access in view.py.
5. content_type fields are not Literal types (all ContentPart subclasses)
A caller can construct TextContent(content_type=ContentType.TOOL_CALL, text="hello") (the discriminator field lies about the actual type). Standard Pydantic pattern is content_type: Literal[ContentType.TEXT] = ContentType.TEXT.
6. PromptResult.messages typed as list[Any] (message.py:357)
Should be list[Message] per the spec.
7. Media source type field is unvalidated str (ImageSource, VideoSource, etc.)
Documented as "url" | "base64" but accepts any string. Should be Literal["url", "base64"].
Validation Issues
8. _content_type_discriminator falls back to "text" on unknown input (message.py:742)
A malformed tool-call payload would silently become a TextContent, dropping its actual content.
9. Resource has no mutual-exclusion validation for content vs blob
Both can be set simultaneously with no validator.
10. ResourceReference has no range validation
range_start=100, range_end=5 passes silently.
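For issues 9 and 10, the checks could look roughly like the following. This sketch uses standard-library dataclasses as a stand-in for the Pydantic models, so the `__post_init__` hooks here would become model validators in the real code. Only the fields relevant to the checks are included, and allowing `range_start == range_end` is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    uri: str
    content: Optional[str] = None
    blob: Optional[bytes] = None

    def __post_init__(self) -> None:
        # Issue 9: text content and binary blob must not both be set.
        if self.content is not None and self.blob is not None:
            raise ValueError("content and blob are mutually exclusive")


@dataclass(frozen=True)
class ResourceReference:
    uri: str
    range_start: Optional[int] = None
    range_end: Optional[int] = None

    def __post_init__(self) -> None:
        # Issue 10: reject inverted ranges such as start=100, end=5.
        if (
            self.range_start is not None
            and self.range_end is not None
            and self.range_start > self.range_end
        ):
            raise ValueError("range_start must not exceed range_end")
```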
11. is_pre/is_post direction for Role.TOOL text content
Text from Role.TOOL is classified as is_pre=True, but tool messages are semantically post-processing responses.
Performance Issues
12. _ACTION_MAP dict rebuilt on every action property access (view.py:361-374)
Should be a module-level constant. Called per-view in policy evaluation hot paths.
13. size_bytes and to_dict redundantly compute self.content
self.content calls json.dumps() for tool calls/results, and is called twice during to_dict serialization.
Security Consideration
14. headers property returns live internal dict (view.py:693)
Pydantic's frozen=True prevents field reassignment but not dict mutation. A consumer can mutate view.headers["Authorization"] = "...", bypassing the _SENSITIVE_HEADERS stripping that only happens in to_dict. Should return a copy or MappingProxyType. Look in memory.py and manager.py; we have copy-on-write dict and list wrappers that could be used instead, and a utility wrapper that wraps any model (including models with nested dicts and lists) in copy-on-write wrappers.
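As an illustration of the MappingProxyType option only (the copy-on-write wrappers in memory.py and manager.py are not shown here), a read-only view could be returned like this. The `HeaderView` class is a hypothetical stand-in for the real view, and the lower-cased contents of `_SENSITIVE_HEADERS` are assumed.

```python
from types import MappingProxyType
from typing import Dict, Mapping

# Assumed contents; compared case-insensitively.
_SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "x-api-key"})


class HeaderView:
    """Hypothetical stand-in for the view's header handling."""

    def __init__(self, headers: Dict[str, str]) -> None:
        self._headers = dict(headers)  # private copy, detached from the caller

    @property
    def headers(self) -> Mapping[str, str]:
        # Read-only proxy: item assignment raises TypeError.
        return MappingProxyType(self._headers)

    def to_dict(self) -> Dict[str, str]:
        # Sensitive headers are stripped from serialized output.
        return {
            name: value
            for name, value in self._headers.items()
            if name.lower() not in _SENSITIVE_HEADERS
        }
```

The proxy adds no copy per access, which matters if `headers` is read on a hot path.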
Recommendations
Here is a suggested priority list to tackle the issues:
- High: Issues 1 (regex bug), 2 (silent view drop), 3 (tests)
- Medium: Issues 4-7 (type safety), Issue 14 (headers mutation)
- Low: Issues 8-13 (validation, performance)
Signed-off-by: Teryl Taylor <terylt@ibm.com>
This is an initial revision of the CMF data format for plugins.
Closes #4
Common Message Format (CMF) Specification
Status: Draft
Version: 2.0
Introduction
The Common Message Format (CMF) defines a provider-agnostic, structured representation for interactions between users, agents, tools, and language models.
CMF provides a canonical message model that supports interoperability, policy enforcement, access control, data governance, and end-to-end auditing across heterogeneous model providers and agent frameworks.
The format explicitly separates:
This separation allows transport adapters, processing pipelines, and policy engines to evolve independently while operating over a consistent enforcement-ready message model.
flowchart LR
    A([Provider Wire Format]) --> B[Adapter]
    B --> C([CMF Message])
    C --> D[Processing Pipeline]
    D --> E([CMF Message])
    E --> F[Adapter]
    F --> G([Wire Format])
1. Design Goals
The CMF is designed to satisfy the following goals:
Interoperable, canonical representation
Define a provider-agnostic message format that decouples transport wire protocols from internal processing, enabling consistent handling across LLM providers, agent frameworks, tools, and enforcement systems.
Complete and explicit enforcement surface
Represent all policy-relevant data—including identity, access control metadata, governance attributes, execution context, and provenance—as structured, typed fields on the message.
Integrity and safety by construction
Protect security-relevant data through explicit mutability tiers (immutable, monotonic, guarded, mutable) enforced by the processing pipeline, and enable safe read-only inspection for policy evaluation.
Extensibility without compromising correctness
Support structured extensions while preserving interoperability, enforcement guarantees, and message integrity.
To realize these goals, CMF defines the following abstractions:
2. Message
A Message represents a single turn in a conversation. It has four fields:
2.1 Role
Role is a closed-set enumeration type.
- system
- developer
- user
- assistant
- tool
2.2 Content
Content is a list of typed ContentParts for multimodal messages. This is the wire format, which preserves the LLM's response grouping. A single assistant message can contain text, thinking, and multiple tool calls, just as the provider API returns them.
ContentPart:
Each ContentPart must include a ContentType type discriminator.
- content_type: ContentType
ContentPart types:
- text: str
- thinking: str
- tool_call: ToolCall
- tool_result: ToolResult
- resource: Resource
- resource_ref: ResourceReference
- prompt_request: PromptRequest
- prompt_result: PromptResult
- image: ImageSource
- video: VideoSource
- audio: AudioSource
- document: DocumentSource
ToolCall:
- tool_call_id: str
- name: str
- arguments: dict[str, Any]
- namespace: str | None
ToolResult:
- tool_call_id: str
- tool_name: str
- content: JSONValue
- is_error: bool
Resource:
- resource_request_id: str
- uri: str
- name: str | None
- description: str | None
- resource_type: ResourceType (file, blob, uri, database, api, memory, artifact)
- content: str | None
- blob: bytes | None
- mime_type: str | None
- size_bytes: int | None
- annotations: dict
- version: str | None
ResourceReference:
- resource_request_id: str
- uri: str
- name: str | None
- resource_type: ResourceType
- range_start: int | None
- range_end: int | None
- selector: str | None
ResourceType:
ResourceType is a closed-set enumeration.
- file
- blob
- uri
- database
- api
- memory
- artifact
PromptRequest:
- prompt_request_id: str
- name: str
- arguments: dict[str, Any]
- server_id: str | None
PromptResult:
- prompt_request_id: str
- prompt_name: str
- messages: list[Message]
- content: str | None
- is_error: bool
- error_message: str | None
ImageSource:
- type: "url" | "base64"
- data: str
- media_type: str | None (e.g., image/jpeg)
VideoSource:
- type: "url" | "base64"
- data: str
- media_type: str | None (e.g., video/mp4)
- duration_ms: int | None
AudioSource:
- type: "url" | "base64"
- data: str
- media_type: str | None (e.g., audio/mp3)
- duration_ms: int | None
DocumentSource:
- type: "url" | "base64"
- data: str
- media_type: str | None (e.g., application/pdf)
- title: str | None
2.3 Channel
Channel is a closed-set enumeration that classifies the kind of output a message represents. It is optional, and when unset, it indicates a standard, unclassified message. Channels allow agentic frameworks and pipelines to route or filter messages by output type without inspecting content.
- analysis
- commentary
- final
3. Extensions
Extensions are the sole carrier of all contextual data. Everything a consumer or policy engine needs (e.g., identity, security classification, HTTP context, entity metadata, agent lineage, execution environment, completion info, provenance) is stored as a typed message extension with an explicit mutability tier.
3.1 Mutability Tiers
Every extension declares its mutability tier. The processing pipeline enforces these contracts during copy-on-write.
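A minimal sketch of how a pipeline might compare an extension before and after a processing step. It assumes the tier semantics described elsewhere in this spec: immutable values may not change, monotonic values are add-only, guarded values change only with a write capability, and mutable values are unrestricted. The names `Tier`, `TierViolation`, and `check_tier` are hypothetical.

```python
from enum import Enum
from typing import Any, FrozenSet, Optional


class Tier(Enum):
    IMMUTABLE = "immutable"
    MONOTONIC = "monotonic"
    GUARDED = "guarded"
    MUTABLE = "mutable"


class TierViolation(Exception):
    """Raised when a modification breaks the extension's tier contract."""


def check_tier(
    tier: Tier,
    before: Any,
    after: Any,
    capabilities: FrozenSet[str] = frozenset(),
    write_capability: Optional[str] = None,
) -> None:
    """Validate one extension value after a copy-on-write step."""
    if tier is Tier.IMMUTABLE:
        if before != after:
            raise TierViolation("immutable extension was modified")
    elif tier is Tier.MONOTONIC:
        # Add-only: everything present before must still be present.
        if not set(before) <= set(after):
            raise TierViolation("monotonic extension lost entries")
    elif tier is Tier.GUARDED:
        allowed = write_capability is not None and write_capability in capabilities
        if before != after and not allowed:
            raise TierViolation("guarded extension modified without capability")
    # Tier.MUTABLE: no restrictions.
```

For example, `check_tier(Tier.MONOTONIC, {"PII"}, {"PII", "SECRET"})` passes, while dropping `PII` raises `TierViolation`.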
3.2 Extension Types
- request
- agent
- http (readable with read_headers, writable with write_headers)
- security
- security.subject
- security.objects
- security.data
- mcp
- completion
- provenance
- llm
- framework
- custom
3.3 RequestExtension (immutable)
Execution environment and request-level timing/tracing. Available to all consumers without any capability requirement (base tier).
- environment: str | None (e.g., production, staging, dev)
- request_id: str | None
- timestamp: str | None
- trace_id: str | None
- span_id: str | None
3.4 AgentExtension (immutable)
Agent execution context — session tracking, multi-agent lineage, and the original user intent. Immutable because the user's intent and session identity must not be modifiable by processing components.
- input: str | None
- session_id: str | None
- conversation_id: str | None
- turn: int | None
- agent_id: str | None
- parent_agent_id: str | None
- conversation: ConversationContext | None
ConversationContext:
- history: list[Message] | None
- summary: str | None
- topics: list[str]
3.5 HttpExtension (guarded)
HTTP request context. Readable with read_headers, writable with write_headers. The write capability exists because consumers sometimes need to inject headers for downstream systems, e.g., adding an OAuth token for an API call, or a correlation ID for tracing.
- headers: dict[str, str] (read capability: read_headers; write capability: write_headers)
Sensitive headers (Authorization, Cookie, X-API-Key) are stripped when serialized for external policy engines. Consumers with write_headers can inject new headers; the pipeline audits all header modifications.
3.6 SecurityExtension (monotonic and immutable)
Data classification, security labels, and all security-relevant contextual data: subject identity, access control profiles, and data governance policies. The SecurityExtension labels are add-only during normal message flow. Its nested immutable fields (subject, objects, data) cannot be replaced or modified.
- labels: set[str] (PII, CONFIDENTIAL, SECRET, etc.)
- classification: str | None
- subject: SubjectExtension | None
- objects: dict[str, ObjectSecurityProfile]
- data: dict[str, DataPolicy]
Requires read_labels capability to see labels. Any consumer can add labels (through copy-on-write), but the pipeline validates that no labels were removed (before.labels ⊆ after.labels). Removal requires a privileged declassification operation that is audited separately.
3.6.1 SubjectExtension (immutable)
The authenticated entity making the request. Access to individual fields is controlled by declared capabilities.
SubjectType is a closed-set enumeration: user, agent, service, system.
- id: str (capability: read_subject)
- type: SubjectType (capability: read_subject)
- roles: set[str] (capability: read_roles; developer, admin, viewer, etc.)
- permissions: set[str] (capability: read_permissions; tools.execute, db.read, etc.)
- teams: set[str] (capability: read_teams)
- claims: dict (capability: read_claims)
3.6.2 Objects (immutable)
Access control profiles for entities referenced in the message, keyed by entity identifier (tool name, resource URI, prompt name). Each entry is an ObjectSecurityProfile declaring the entity's access requirements and data scope.
ObjectSecurityProfile (flat):
- managed_by: str ("host", "tool", or "both")
- permissions: list[str] (e.g., ["read:compensation"])
- trust_domain: str | None ("internal", "external", "privileged")
- data_scope: list[str] (e.g., ["salary", "bonus"])
For MCP/framework messages (single entity), this map has one entry. For LLM provider messages with multiple tool calls, it has one entry per entity. The host pipeline populates this map during message ingestion by looking up each entity's registered profile from an implementation-defined registry (e.g., tool registration metadata, MCP server manifests, or a policy store). Profile lookup is keyed by entity identifier: the tool name, resource URI, or prompt name that appears in the message content.
The MessageView exposes a singular accessor; each view resolves its own profile from the map:
For the full model and policy integration, see the Object Security Profile Spec.
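To illustrate the per-view lookup described in this section, here is a minimal sketch. It assumes the accessor resolves to `ext.security.objects.get(view.name)` and omits capability gating. The `View` class is a hypothetical stand-in, and the property is named `object_profile` here only to avoid shadowing Python's built-in `object`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ObjectSecurityProfile:
    managed_by: str
    permissions: List[str] = field(default_factory=list)
    trust_domain: Optional[str] = None
    data_scope: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class View:
    """Hypothetical stand-in for MessageView."""

    name: Optional[str]
    # Stands in for ext.security.objects, keyed by entity identifier.
    objects: Dict[str, ObjectSecurityProfile]

    @property
    def object_profile(self) -> Optional[ObjectSecurityProfile]:
        # Each view resolves only its own entry; unnamed views have none.
        if self.name is None:
            return None
        return self.objects.get(self.name)
```

A bundled LLM response with two tool calls would share one `objects` map, with each view reading a different key.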
3.6.3 Data (immutable)
Data governance policies for entities referenced in the message, keyed by entity identifier (tool name, resource URI, prompt name). Each entry is a DataPolicy declaring labeling, action restrictions, and retention rules for the entity's output.
DataPolicy:
- apply_labels: list[str] (e.g., ["PII", "financial"])
- allowed_actions: list[str] | None (None = unrestricted)
- denied_actions: list[str] (e.g., ["export", "forward"])
- retention: RetentionPolicy | None
RetentionPolicy:
- max_age_seconds: int | None
- policy: str ("session", "transient", "persistent", "none")
- delete_after: str | None
The data extension is evaluated on post views (tool results, resource responses, prompt results). When a tool returns data, the pipeline looks up its DataPolicy, stamps apply_labels onto the message's SecurityExtension, and enforces action restrictions downstream.
The MessageView exposes a singular accessor:
For the full model, action vocabulary, and policy integration, see the Object Security Profile Spec.
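The stamping and action checks described in this section might be sketched as follows. The function names are hypothetical, the retention field is omitted, and giving `denied_actions` precedence over `allowed_actions` is an assumption rather than something this section states.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class DataPolicy:
    apply_labels: List[str] = field(default_factory=list)
    allowed_actions: Optional[List[str]] = None  # None = unrestricted
    denied_actions: List[str] = field(default_factory=list)


def stamp_labels(labels: FrozenSet[str], policy: Optional[DataPolicy]) -> FrozenSet[str]:
    """Return the label set after applying the policy; labels are only added."""
    if policy is None:
        return labels
    return labels | frozenset(policy.apply_labels)


def action_permitted(policy: Optional[DataPolicy], action: str) -> bool:
    """Check one downstream action against the policy."""
    if policy is None:
        return True
    if action in policy.denied_actions:  # assumed: deny wins over allow
        return False
    return policy.allowed_actions is None or action in policy.allowed_actions
```

Because `stamp_labels` only unions, the result always contains the input labels, which is consistent with the monotonic label tier.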
3.7 MCPExtension (immutable)
Typed metadata about the MCP entity being processed. Gives consumers access to the schema and annotations of the tool, resource, or prompt being evaluated.
ToolMetadata (ext.mcp.metadata.tool):
- name: str
- title: str | None
- description: str | None
- input_schema: dict | None
- output_schema: dict | None
- server_id: str | None
- namespace: str | None
- annotations: dict (e.g., readOnlyHint, destructiveHint)
ResourceMetadata (ext.mcp.metadata.resource):
- uri: str (file:///path, db://table/id, etc.)
- name: str | None
- description: str | None
- mime_type: str | None (text/csv, application/json, etc.)
- server_id: str | None
- annotations: dict
PromptMetadata (ext.mcp.metadata.prompt):
- name: str
- description: str | None
- arguments: list[dict] | None (each with name, description, required)
- server_id: str | None
- annotations: dict
Note: Prompts use an argument list rather than JSON Schema for input definition, following the MCP prompt specification. There is no output schema; prompt output is always rendered messages.
3.8 CompletionExtension (immutable)
LLM completion information. Fields like model and stop_reason can drive policy decisions (e.g., "only allow gpt-4 for financial queries", "flag max_tokens responses for review").
StopReason is a closed-set enumeration: end, return, call, max_tokens, stop_sequence.
TokenUsage:
- input_tokens: int
- output_tokens: int
- total_tokens: int
CompletionExtension:
- stop_reason: StopReason | None
- tokens: TokenUsage | None
- model: str | None
- raw_format: str | None (e.g., chatml, harmony, gemini, anthropic)
- created_at: str | None
- latency_ms: int | None
3.9 ProvenanceExtension (immutable)
Origin and threading information for the message. Enables lineage tracking across multi-turn conversations and multi-agent systems.
- source: str | None (e.g., "user", "agent:xyz", "mcp-server:abc")
- message_id: str | None
- parent_id: str | None
Note: conversation_id, session_id, and agent lineage (agent_id, parent_agent_id) live on AgentExtension (section 3.4) since they are per-request context, not per-message. trace_id and span_id live on RequestExtension (section 3.3) alongside request_id.
3.10 LLMExtension (immutable)
Model identity and capability metadata. Used for routing, policy evaluation, and audit when the producing model's identity matters independently of the completion itself.
- model_id: str | None (e.g., gpt-4o, claude-sonnet-4-20250514)
- provider: str | None (e.g., openai, anthropic, google)
- capabilities: list[str] (e.g., ["vision", "tool_use", "extended_thinking"])
3.11 FrameworkExtension (immutable)
Agentic framework context. Captures the framework-level execution environment for messages that originate from or pass through agentic orchestration layers.
- framework: str | None (e.g., langgraph, crewai, autogen, a2a)
- framework_version: str | None
- node_id: str | None
- graph_id: str | None
- metadata: dict[str, Any]
3.12 Custom Extensions (mutable)
Custom extensions. Fully mutable through copy-on-write, no restrictions on modification.
4. MessageView
Message is the storage format: it preserves the wire structure exactly as the LLM sent it. MessageView is the policy and interaction surface: it decomposes a message into individually addressable parts, enriches each with computed semantics, and provides a uniform interface regardless of content type.
4.1 Message vs. MessageView
A single LLM response can contain text, reasoning, and multiple tool calls bundled together. That's one Message but potentially many things that need to be evaluated independently.
Message (one object, wire format):
{
  "role": "assistant",
  "content": [
    {"content_type": "thinking", "text": "The user wants admin users. I'll query the database..."},
    {"content_type": "text", "text": "Let me look that up for you."},
    {"content_type": "tool_call", "name": "execute_sql", "arguments": {"query": "SELECT * FROM users WHERE role='admin'"}},
    {"content_type": "tool_call", "name": "send_email", "arguments": {"to": "boss@company.com", "body": "..."}}
  ]
}
Calling message.iter_views() produces four MessageViews, each with a uniform interface:
kind | name | action | is_pre | uri | content
thinking | | generate | false | | "The user wants admin users..."
text | | send | false | | "Let me look that up for you."
tool_call | execute_sql | execute | true | tool://db-server/execute_sql | '{"query": "SELECT..."}'
tool_call | send_email | execute | true | tool://email-server/send_email | '{"to": "boss@..."}'
What the view adds that doesn't exist on the raw content parts:
- uri (tool://ns/name, prompt://server/name, file:///path)
- is_pre / is_post computed from kind + role
- action: read, write, execute, invoke, send, receive, generate
- flat accessors over extensions (roles, labels, environment, etc.)
- content always returns scannable text (serialized arguments for tool calls)
- helper methods: has_role(), has_label(), matches_uri_pattern(), get_arg(), etc.
4.2 Supporting Both LLM and Framework Formats
The CMF naturally supports two messaging patterns through the same structure:
An MCP tool invocation is simply a Message with one tool_call content part:
{"role": "assistant", "content": [{"content_type": "tool_call", "name": "get_user", "arguments": {"id": "123"}}]}
This produces one view. An OpenAI assistant response that bundles reasoning with two tool calls produces three views. The processing pipeline doesn't care: iter_views() yields the right number either way, and every view has the same interface.
This means the CMF does not force a choice between "one action per message" (MCP, A2A) and "bundled response" (LLM providers). Both are first-class, and the same policies and routing rules work across both patterns without adaptation.
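The two patterns above can be illustrated with a simplified decomposition. This is a sketch over plain dicts, not the real iter_views() implementation: the `View` class carries only four attributes, only text-like parts and tool calls are handled, and URIs, actions, and extensions are left out.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Iterator, Optional


@dataclass(frozen=True)
class View:
    """Simplified stand-in for MessageView."""

    kind: str
    role: str
    name: Optional[str]
    content: Optional[str]


def iter_views(message: Dict[str, Any]) -> Iterator[View]:
    """Yield one view per content part, whatever the bundling."""
    role = message["role"]
    for part in message["content"]:
        kind = part["content_type"]
        if kind == "tool_call":
            # Tool-call arguments are serialized so content stays scannable text.
            yield View(kind, role, part["name"], json.dumps(part["arguments"]))
        else:
            yield View(kind, role, None, part.get("text"))


mcp_message = {
    "role": "assistant",
    "content": [
        {"content_type": "tool_call", "name": "get_user", "arguments": {"id": "123"}}
    ],
}
bundled_message = {
    "role": "assistant",
    "content": [
        {"content_type": "thinking", "text": "Need two lookups."},
        {"content_type": "tool_call", "name": "get_user", "arguments": {"id": "123"}},
        {"content_type": "tool_call", "name": "get_team", "arguments": {"id": "7"}},
    ],
}
```

Here `list(iter_views(mcp_message))` has one element and `list(iter_views(bundled_message))` has three, and a policy loop over either list is identical.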
4.3 Core Attributes
ViewKind is a closed-set enumeration: text, thinking, tool_call, tool_result, resource, resource_ref, prompt_request, prompt_result, image, video, audio, document.
ViewAction is a closed-set enumeration: read, write, execute, invoke, send, receive, generate.
- kind: ViewKind
- role: Role (user, assistant, system, developer, tool)
- content: str | None
- uri: str | None (tool://ns/name, prompt://server/name, tool_result://name, file:///path)
- name: str | None
- action: ViewAction
- args: dict | None (arguments dict on tool_call and prompt_request content parts; None for other kinds)
- mime_type: str | None
- size_bytes: int | None
- properties: dict
4.4 Direction
Direction is determined by a combination of ViewKind and Role:
- tool_call, prompt_request, resource_ref: pre
- tool_result, prompt_result, resource: post
- text, thinking, media: depends on role
Related boolean properties:
- is_pre: bool
- is_post: bool
- is_tool: bool (tool_call or tool_result)
- is_prompt: bool (prompt_request or prompt_result)
- is_resource: bool (resource or resource_ref)
- is_text: bool (text or thinking)
- is_media: bool (image, video, audio, or document)
4.5 Flat Accessors (capability-gated)
MessageView provides flat accessor properties over extensions. These hide the underlying extension nesting: consumers write view.roles, not view.extensions.security.subject.roles. Availability depends on the consumer's declared capabilities; extensions for which access has not been granted are None on the view.
accessor | capability | type | source
environment | | str | None | ext.request.environment
request_id | | str | None | ext.request.request_id
subject | read_subject | SubjectExtension | None | ext.security.subject
roles | read_roles | set[str] | ext.security.subject.roles
permissions | read_permissions | set[str] | ext.security.subject.permissions
teams | read_teams | set[str] | ext.security.subject.teams
headers | read_headers | dict[str, str] | ext.http.headers
labels | read_labels | set[str] | ext.security.labels
agent_input | read_agent | str | None | ext.agent.input
session_id | read_agent | str | None | ext.agent.session_id
conversation_id | read_agent | str | None | ext.agent.conversation_id
turn | read_agent | int | None | ext.agent.turn
agent_id | read_agent | str | None | ext.agent.agent_id
parent_agent_id | read_agent | str | None | ext.agent.parent_agent_id
object | read_objects | ObjectSecurityProfile | None | ext.security.objects.get(view.name)
data_policy | read_data | DataPolicy | None | ext.security.data.get(view.name)
Helper methods:
- has_role(role)
- has_permission(perm)
- has_label(label)
- has_header(name)
- get_header(name)
- get_arg(name)
- has_arg(name)
- matches_uri_pattern(glob) (*, ** wildcards)
- has_content()
4.6 Type-Specific Properties
Each ViewKind exposes additional properties via get_property(name) or properties:
- resource: resource_type, str (file, blob, uri, database, api, memory, artifact)
- resource: version, str | None
- resource: annotations, dict | None
- tool_call: namespace, str | None
- tool_call: tool_id, str | None
- tool_result: is_error, bool
- tool_result: tool_name, str
- prompt_request: server_id, str | None
- prompt_result: is_error, bool
- prompt_result: message_count, int
4.7 Serialization
- to_dict() (include_content, include_context)
- to_opa_input(): {"input": {...view...}}
Sensitive headers (Authorization, Cookie, X-API-Key) are automatically stripped from serialized output. The serialized extensions block is assembled from capability-gated extensions, mirroring the extension hierarchy.
5. Security Properties
5.1 Label Propagation
Security labels on extensions.security are monotonically accumulating during normal message flow. A message that touches PII data carries the PII label for its lifetime. Labels propagate through the pipeline:
Removal requires explicit declassification, a privileged operation that is audited separately.
5.2 Extension Tier Enforcement
The four mutability tiers (immutable, monotonic, guarded, mutable) provide layered protection: