feat(runtime-expressions): improve ABNF grammar clarity by frankkilcommins · Pull Request #454 · OAI/Arazzo-Specification

frankkilcommins · 2026-03-23T11:21:17Z

Define CHAR to exclude curly braces for unambiguous parsing (Ambiguity in runtime expressions embedded in strings #424)
Add expression-string and embedded-expression grammar (Introduction of ABNF for expressions embedded in strings #425)
Specialize reference types by context (Introduce complete grammar for parsing expressions #426)
Add Source Description expression resolution priority (Ambiguity in $sourceDescription.* runtime expressions #427)
Require explicit component types in $components (Ambiguity in $components runtime expressions #428)
Clarify type conversion in embedded expressions (List or object usage in runtime expressions embedded in strings #437)
Fix invalid examples in spec (conditions, stepId references)
Update examples table with nested inputs and component actions

$components now requires an explicit component type (parameters/successActions/failureActions). The generic components.name pattern has been removed. Note: that pattern was already semantically invalid per the spec.

fixes: #424
fixes: #425
fixes: #426
fixes: #428
fixes: #437

resolves: #427

DmitryAnansky · 2026-03-26T10:10:46Z

examples/1.0.0/bnpl-arazzo.yaml

+          "firstName": "{$inputs.customer#/firstName}",
+          "lastName": "{$inputs.customer#/lastName}",
+          "dateOfBirth": "{$inputs.customer#/dateOfBirth}",
+          "postalCode": "{$inputs.customer#/postalCode}"


It would be great to include more complex and diverse examples demonstrating the application of ABNF syntax.

char0n · 2026-04-02T11:41:52Z

src/arazzo.md

+  component-name = identifier
+
+  ; Identifier rule
+  identifier = 1*( ALPHA / DIGIT / "." / "-" / "_" )


The PR's identifier = 1*( ALPHA / DIGIT / "." / "-" / "_" ) is used for all IDs (stepId, workflowId, sourceDescriptionName, component keys, input/output names). But the spec defines two different patterns:

stepId, workflowId, sourceDescriptionName: SHOULD conform to [A-Za-z0-9_\-]+ (no dot)

Components keys: MUST match ^[a-zA-Z0-9\.\-_]+$ (with dot)

A single shared identifier rule conflates these: it allows dots in step/workflow IDs where the spec says they shouldn't appear, and that constraint is only SHOULD-level anyway. Separate rules would be more faithful to the spec's intent.

char0n · 2026-04-02T11:55:54Z

src/arazzo.md

+  field-name = identifier
+
+  ; Source descriptions expressions
+  source-reference = source-name "." reference-id


source-reference is too restrictive with identifier

The proposed grammar uses:

source-reference = source-name "." reference-id
reference-id = identifier

The <reference> part can be an operationId from an OpenAPI description or a workflowId from
an Arazzo document. OpenAPI does not constrain operationId to any specific character set —
it's just a string. This means operationIds like get/pets, get pets, or create-user@v2 are
technically valid in OpenAPI but would be rejected by the identifier rule.

I'd suggest using a less restrictive rule for reference-id — something like 1*CHAR (any
character except { and }) — to avoid rejecting valid OpenAPI operationIds.

updated to:

; Source descriptions expressions
source-reference = source-name "." reference-id
source-name = identifier-strict
reference-id = 1*CHAR
    ; operationIds have no character restrictions in OpenAPI/AsyncAPI
    ; Resolution priority defined in spec text: (1) operationId/workflowId, (2) field names

src/arazzo.md

Restructure the ABNF grammar to use explicit, typed reference rules in the primary grammar instead of relying on secondary grammars with two-pass parsing. This improves grammar clarity and aligns with the proposed spec changes in OAI/Arazzo-Specification#454.

Key changes:
- Add $self expression support
- Add $inputs/$outputs JSON Pointer support (e.g., $inputs.customer#/firstName)
- Inline all secondary grammars into the primary grammar
- Extract shared identifier and identifier-strict rules
- Adapt json-pointer to exclude { and } from unescaped for unambiguous embedded expression parsing, fixing the body expression extract limitation
- Require explicit component types (parameters/successActions/failureActions)
- Update README with current grammar and examples

Resolves: OAI/Arazzo-Specification#424, OAI/Arazzo-Specification#425, OAI/Arazzo-Specification#426, OAI/Arazzo-Specification#428

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

src/arazzo.md

char0n · 2026-04-02T14:18:07Z

Implementation Verification

I implemented the proposed grammar changes in my ABNF parser at swaggerexpert/arazzo-runtime-expression#116 to verify the grammar is correct and parseable. All 152 tests pass. Below are the findings from the implementation.

Issue: `unescaped` in json-pointer still includes `{` and `}`

The CHAR rule correctly excludes { (%x7B) and } (%x7D) for unambiguous embedded expression parsing, but the unescaped rule in json-pointer still uses %x30-7D, which includes both characters.

This means embedded expressions containing JSON pointers — like {$request.body#/status}, {$inputs.customer#/firstName}, or {$steps.foo.outputs.bar#/0/id} — cannot be reliably parsed. The json-pointer's unescaped will consume the closing }, making it impossible to determine where the expression ends.

Suggested fix — change unescaped from:

unescaped = %x00-2E / %x30-7D / %x7F-10FFFF

to:

unescaped = %x00-2E / %x30-7A / %x7C / %x7F-10FFFF
    ; %x2F ('/'), %x7E ('~'), %x7B ('{'), %x7D ('}') are excluded

This is a minor deviation from RFC 6901, but { and } in JSON Pointer reference tokens are extremely rare in practice, and without this fix the expression-string grammar cannot work correctly for any expression containing a json-pointer.
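To make the failure mode concrete, here is a minimal Python sketch that translates both versions of unescaped into regular-expression character classes and shows how far a greedy json-pointer match runs. The names RFC6901_UNESCAPED, ADAPTED_UNESCAPED, and json_pointer_prefix are hypothetical helpers for illustration and are not part of the PR or of any parser implementation.

```python
import re

# RFC 6901 'unescaped' = %x00-2E / %x30-7D / %x7F-10FFFF (still allows '{' and '}')
RFC6901_UNESCAPED = r"[\x00-\x2E\x30-\x7D\x7F-\U0010FFFF]"
# Adapted 'unescaped' = %x00-2E / %x30-7A / %x7C / %x7F-10FFFF
ADAPTED_UNESCAPED = r"[\x00-\x2E\x30-\x7A\x7C\x7F-\U0010FFFF]"


def json_pointer_prefix(text: str, unescaped: str) -> str:
    """Return the greedy json-pointer match at the start of `text`.

    json-pointer    = *( "/" reference-token )
    reference-token = *( unescaped / escaped )
    escaped         = "~" ( "0" / "1" )
    """
    pattern = re.compile(rf"(?:/(?:{unescaped}|~[01])*)*")
    return pattern.match(text).group(0)


# The text that follows '#' in "{$inputs.customer#/firstName} is the name"
rest = "/firstName} is the name"

# The RFC 6901 class swallows the closing brace and everything after it.
print(repr(json_pointer_prefix(rest, RFC6901_UNESCAPED)))
# The adapted class stops right before '}', so the expression can be closed.
print(repr(json_pointer_prefix(rest, ADAPTED_UNESCAPED)))
```

With the adapted class the pointer ends at the first }, which is what the embedded-expression rule needs in order to find the end of the expression.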

Issue: Single `identifier` rule conflates two different spec constraints

The proposed grammar uses a single identifier = 1*( ALPHA / DIGIT / "." / "-" / "_" ) rule for everything — step IDs, workflow IDs, source description names, component keys, input/output names, and field names. However, the spec defines two different patterns:

stepId, workflowId, sourceDescriptionName: SHOULD conform to [A-Za-z0-9_\-]+ (no dot)
Components keys: MUST match ^[a-zA-Z0-9\.\-_]+$ (with dot)

Using a single shared rule allows dots in step/workflow IDs where the spec says they shouldn't be. In my implementation, I split this into two rules:

identifier        = 1*(ALPHA / DIGIT / "." / "-" / "_")   ; for field names, component keys
identifier-strict = 1*(ALPHA / DIGIT / "_" / "-")          ; for step/workflow/source-description IDs
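As a rough illustration of the split, the two rules can be written as Python regular expressions. This is only a sketch: the function names is_component_key and is_step_id are assumptions made for the example, and the sample IDs are invented.

```python
import re

# identifier        = 1*(ALPHA / DIGIT / "." / "-" / "_")
IDENTIFIER = re.compile(r"[A-Za-z0-9._-]+")
# identifier-strict = 1*(ALPHA / DIGIT / "_" / "-")
IDENTIFIER_STRICT = re.compile(r"[A-Za-z0-9_-]+")


def is_component_key(value: str) -> bool:
    """Component keys and field names: dots are allowed."""
    return IDENTIFIER.fullmatch(value) is not None


def is_step_id(value: str) -> bool:
    """Step, workflow, and source description IDs: no dots."""
    return IDENTIFIER_STRICT.fullmatch(value) is not None


print(is_component_key("pagination.v2"))  # dot accepted for component keys
print(is_step_id("login.step"))           # dot rejected for step IDs
print(is_step_id("login-step_1"))         # hyphen and underscore accepted
```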

Issue: `source-descriptions-reference` (`reference-id`) is too restrictive

The proposed grammar constrains reference-id to identifier, but this value can be an operationId from an OpenAPI description. OpenAPI does not constrain operationId to any specific character set — it's just a string. OperationIds like get/pets, get pets, or create-user@v2 are technically valid in OpenAPI but would be rejected by the identifier rule.

In my implementation, I use 1*CHAR (any character except { and }) for this rule.
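The difference can be sketched in Python using the three operationIds mentioned above. Note the simplification: the pattern `[^{}]+` only approximates 1*CHAR as "any character except { and }"; the RFC 7159-based CHAR additionally restricts control characters, bare quotation marks, and bare backslashes. The names accepted_by_identifier and accepted_by_char_run are hypothetical.

```python
import re

# identifier = 1*(ALPHA / DIGIT / "." / "-" / "_")
IDENTIFIER = re.compile(r"[A-Za-z0-9._-]+")
# Approximation of 1*CHAR: one or more characters other than '{' and '}'
CHAR_RUN = re.compile(r"[^{}]+")


def accepted_by_identifier(operation_id: str) -> bool:
    return IDENTIFIER.fullmatch(operation_id) is not None


def accepted_by_char_run(operation_id: str) -> bool:
    return CHAR_RUN.fullmatch(operation_id) is not None


for operation_id in ["get/pets", "get pets", "create-user@v2", "listPets"]:
    print(
        operation_id,
        accepted_by_identifier(operation_id),
        accepted_by_char_run(operation_id),
    )
```

Only listPets passes the identifier check, while all four pass the relaxed rule.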

Issue: Simplified `CHAR` rule diverges from OpenAPI

The PR replaces the JSON string-based CHAR definition (from RFC 7159, with escape sequences) with a simpler character range: CHAR = %x00-7A / %x7C / %x7E-10FFFF. This changes the semantics — a bare \ becomes a valid character, and JSON escape sequences like \n, \uXXXX are no longer recognized.

OpenAPI's runtime expression ABNF uses the RFC 7159-based CHAR definition. Since Arazzo builds on top of OpenAPI and shares the runtime expression concept, simplifying CHAR introduces a subtle divergence. An expression valid in one spec could behave differently in the other. I'd recommend keeping the RFC 7159-based definition for interoperability.
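The semantic difference can be sketched by translating both CHAR definitions into Python regular expressions and checking a value against *( CHAR ). The names SIMPLE_NAME and RFC7159_NAME are assumptions for this example; the RFC 7159-based pattern follows the unescape / escape rules from the reference grammar in this comment, with { and } excluded.

```python
import re

# Simplified CHAR from the PR: %x00-7A / %x7C / %x7E-10FFFF
SIMPLE_NAME = re.compile(r"[\x00-\x7A\x7C\x7E-\U0010FFFF]*")

# RFC 7159-based CHAR: unescape / escape ( '"' / "\" / "/" / b / f / n / r / t / uXXXX )
RFC7159_NAME = re.compile(
    r'(?:[\x20-\x21\x23-\x5B\x5D-\x7A\x7C\x7E-\U0010FFFF]'
    r'|\\(?:["\\/bfnrt]|u[0-9A-Fa-f]{4}))*'
)

bare_backslash = "a\\q"      # a, backslash, q: "\q" is not a JSON escape
escaped_newline = "line\\n"  # backslash followed by n: a JSON escape sequence

# A bare backslash is accepted by the simplified range only.
print(SIMPLE_NAME.fullmatch(bare_backslash) is not None)
print(RFC7159_NAME.fullmatch(bare_backslash) is not None)

# Both accept "\n", but the simplified rule sees two unrelated characters,
# while the RFC 7159-based rule recognizes a single escape sequence.
print(SIMPLE_NAME.fullmatch(escaped_newline) is not None)
print(RFC7159_NAME.fullmatch(escaped_newline) is not None)
```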

Suggestion: `name` rule is not "legacy"

The PR labels the name rule as ; Legacy 'name' rule (retained for query/path references). This rule isn't legacy — it's the correct rule for query and path parameter names, which are user-defined and can contain any valid character. The comment could be misleading and suggest future removal. A more accurate comment would be something like ; Unconstrained name rule for query/path references.

Note: Example file version mismatch

The example fixes in examples/1.0.0/bnpl-arazzo.yaml (changing $inputs.customer.firstName to $inputs.customer#/firstName) apply 1.1.0 grammar semantics to a 1.0.0 example file. This could cause confusion about backward compatibility. Consider applying these fixes only to a 1.1.0 example, or noting that the 1.0.0 example has been updated to reflect the corrected grammar.

Note: Missing comma in example payload

In bnpl-arazzo.yaml, there's a pre-existing missing comma after the postalCode line in the JSON payload template, making it invalid JSON:

"postalCode": "{$inputs.customer#/postalCode}"
  "termsAndConditionsAccepted": true

Our ABNF grammar for reference

For reference, here is the complete ABNF grammar from my implementation that addresses the issues above:

; Arazzo runtime expression ABNF syntax
expression = (
    "$url" /
    "$method" /
    "$statusCode" /
    "$request." source /
    "$response." source /
    "$inputs." inputs-reference /
    "$outputs." outputs-reference /
    "$steps." steps-reference /
    "$workflows." workflows-reference /
    "$sourceDescriptions." source-reference /
    "$components." components-reference /
    "$self"
  )
; Request/Response sources
source                  = ( header-reference / query-reference / path-reference / body-reference )
header-reference        = "header." token
query-reference         = "query." name
path-reference          = "path." name
body-reference          = "body" ["#" json-pointer ]

; Input/Output references
inputs-reference        = inputs-name ["#" json-pointer]
inputs-name             = identifier
outputs-reference       = outputs-name ["#" json-pointer]
outputs-name            = identifier

; Steps expressions
steps-reference         = steps-id ".outputs." outputs-name ["#" json-pointer]
steps-id                = identifier-strict

; Workflows expressions
workflows-reference     = workflows-id "." workflows-field "." workflows-field-name ["#" json-pointer]
workflows-id            = identifier-strict
workflows-field         = "inputs" / "outputs"
workflows-field-name    = identifier

; Source descriptions expressions
source-reference                = source-descriptions-name "." source-descriptions-reference
source-descriptions-name        = identifier-strict
source-descriptions-reference   = 1*CHAR

; Components expressions
components-reference    = components-type "." components-name
components-type         = "parameters" / "successActions" / "failureActions"
components-name         = identifier

; Unconstrained name rule for query/path references and source description references
name                    = *( CHAR )

; Grammar for parsing template strings with embedded expressions
expression-string    = *( literal-char / embedded-expression )
embedded-expression  = "{" expression "}"
literal-char         = %x00-7A / %x7C / %x7E-10FFFF  ; anything except { (%x7B) and } (%x7D)

; JSON Pointer (RFC 6901, adapted)
; { (%x7B) and } (%x7D) are excluded from 'unescaped' for unambiguous embedded expression parsing
json-pointer     = *( "/" reference-token )
reference-token  = *( unescaped / escaped )
unescaped        = %x00-2E / %x30-7A / %x7C / %x7F-10FFFF
                 ; %x2F ('/'), %x7E ('~'), %x7B ('{'), %x7D ('}') are excluded
escaped          = "~" ( "0" / "1" )
                 ; representing '~' and '/', respectively

; https://datatracker.ietf.org/doc/html/rfc7230#section-3.2.6
token          = 1*tchar
tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
               / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
               / DIGIT / ALPHA
               ; any VCHAR, except delimiters

; https://www.rfc-editor.org/rfc/rfc7159#section-7
CHAR = unescape /
    escape (
        %x22 /          ; "    quotation mark  U+0022
        %x5C /          ; \    reverse solidus U+005C
        %x2F /          ; /    solidus         U+002F
        %x62 /          ; b    backspace       U+0008
        %x66 /          ; f    form feed       U+000C
        %x6E /          ; n    line feed       U+000A
        %x72 /          ; r    carriage return U+000D
        %x74 /          ; t    tab             U+0009
        %x75 4HEXDIG )  ; uXXXX                U+XXXX
escape         = %x5C   ; \
unescape       = %x20-21 / %x23-5B / %x5D-7A / %x7C / %x7E-10FFFF
               ; %x7B ('{') and %x7D ('}') are excluded from 'unescape'

; Identifier rules
identifier        = 1*(ALPHA / DIGIT / "." / "-" / "_")
                  ; Alphanumeric with dots, hyphens, underscores
identifier-strict = 1*(ALPHA / DIGIT / "_" / "-")
                  ; Alphanumeric with hyphens, underscores (no dots)

; https://datatracker.ietf.org/doc/html/rfc5234#appendix-B.1
HEXDIG         =  DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
DIGIT          =  %x30-39   ; 0-9
ALPHA          =  %x41-5A / %x61-7A   ; A-Z / a-z
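Because neither literal-char nor any rule reachable from expression admits { or }, an expression-string can be split with a single left-to-right scan. The following Python sketch shows that idea under stated assumptions: split_expression_string is a hypothetical helper, and it only separates literals from embedded expressions. It does not validate the expression text against the expression rule.

```python
import re
from typing import List, Tuple

# embedded-expression = "{" expression "}"   -> group 1 (braces stripped)
# literal-char run    = anything but { and } -> group 2
_PART = re.compile(r"\{([^{}]*)\}|([^{}]+)")


def split_expression_string(template: str) -> List[Tuple[str, str]]:
    """Split a template into ("literal", text) and ("expression", text) parts."""
    parts: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(template):
        match = _PART.match(template, pos)
        if match is None:
            # A '{' without a closing '}' or a stray '}' cannot be a literal-char.
            raise ValueError(f"unbalanced brace at offset {pos}")
        if match.group(1) is not None:
            parts.append(("expression", match.group(1)))
        else:
            parts.append(("literal", match.group(2)))
        pos = match.end()
    return parts


print(split_expression_string("Hi {$inputs.customer#/firstName}!"))
```

A real implementation would pass each extracted expression to a parser generated from the full grammar; the scan above only works because the adapted json-pointer and CHAR rules keep braces out of expressions.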

DmitryAnansky · 2026-04-08T08:59:05Z

src/arazzo.md

-  unescaped = %x00-2E / %x30-7D / %x7F-10FFFF
-      ; %x2F ('/') and %x7E ('~') excluded from 'unescaped'
+  unescaped = %x00-2E / %x30-7A / %x7C / %x7E-10FFFF
+      ; Excludes / (%x2F), { (%x7B), } (%x7D), and ~ (%x7E)


Would it make sense to explicitly define some other literals, like boolean, undefined, and null?
They can be used in runtime expressions.

// Boolean literals for true and false
Boolean = "true" / "false"
// Null literal
Null = "null"
Undefined = "undefined"

The literals (boolean, null, undefined) are used in Criterion Object conditions alongside runtime expressions, not within runtime expressions themselves. These are already defined in the Criterion Object section as part of the condition syntax. The ABNF grammar here specifically defines the structure of runtime expressions ($inputs.foo, $statusCode, etc.), which reference values but don't contain literals. The condition evaluation syntax is separate from the runtime expression parsing syntax.

Do you have scenarios that you're thinking of? I'd tend to scope that under a different enhancement issue if warranted.

char0n · 2026-04-08T15:56:16Z

Hi @frankkilcommins,

Great progress on the grammar updates — the identifier-strict split, reference-id = 1*CHAR, and json-pointer unescaped fix all look good.

I noticed a couple of issues with the latest changes.

`input-name = name` breaks json-pointer parsing

Since name = *( CHAR ) and CHAR includes # (%x23, within the %x00-7A range), the greedy *( CHAR ) will consume the # as part of the name. This means the optional ["#" json-pointer] in rules like:

inputs-reference = input-name [ "#" json-pointer ]

would never match, because the parser consumes customer#/firstName entirely as the input-name, leaving nothing for the json-pointer.

For example, $inputs.customer#/firstName would parse with input-name = "customer#/firstName" and no json-pointer — which defeats the purpose of adding json-pointer support to inputs/outputs.

The previous version using identifier worked because identifier doesn't include #, so the parser correctly stops at # and matches the json-pointer. The same issue applies to output-name = name and field-name = name in workflows-reference.

The fix is either:

1. Keep identifier for these rules (what I use in my implementation)
2. Define a new rule like name-no-hash that is CHAR minus #, which is more permissive than identifier but still allows the json-pointer delimiter to work

Option 1 is simpler. Option 2 is more permissive but requires defining a new character class.
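The greedy behaviour described above can be reproduced with a small Python sketch. It is an approximation: `[^{}]*` stands in for name = *( CHAR ), the pointer part is matched loosely with `.*`, and split_reference is a hypothetical helper rather than code from either implementation.

```python
import re
from typing import Optional, Tuple

# inputs-reference = input-name [ "#" json-pointer ] with input-name = name
NAME_THEN_POINTER = re.compile(r"(?P<name>[^{}]*)(?:#(?P<pointer>.*))?")
# Same rule with input-name = identifier (option 1)
IDENT_THEN_POINTER = re.compile(r"(?P<name>[A-Za-z0-9._-]+)(?:#(?P<pointer>.*))?")


def split_reference(
    pattern: "re.Pattern[str]", reference: str
) -> Optional[Tuple[str, Optional[str]]]:
    """Return (name, pointer) for the text after "$inputs.", or None if no match."""
    match = pattern.fullmatch(reference)
    if match is None:
        return None
    return match.group("name"), match.group("pointer")


# With name, the greedy run swallows '#' and the pointer is never matched.
print(split_reference(NAME_THEN_POINTER, "customer#/firstName"))
# With identifier, the name stops at '#' and the pointer is captured.
print(split_reference(IDENT_THEN_POINTER, "customer#/firstName"))
```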

CHAR redefinition diverges from OpenAPI

The PR redefines CHAR as a simple character range (%x00-7A / %x7C / %x7E-10FFFF), dropping the RFC 7159 JSON string definition with escape sequences. OpenAPI's runtime expression ABNF uses the RFC 7159-based CHAR with unescape / escape rules. Since Arazzo builds on top of OpenAPI and shares the runtime expression concept, redefining CHAR introduces a divergence — a bare \ becomes valid in Arazzo but not in OpenAPI, and escape sequences like \n or \uXXXX are no longer recognized. I'd recommend keeping the RFC 7159-based definition for interoperability.

feat(runtime-expressions): improve ABNF grammar clarity

64e553f

char0n mentioned this pull request Apr 2, 2026

feat: improve ABNF grammar clarity and inline all secondary grammars swaggerexpert/arazzo-runtime-expression#116

frankkilcommins added 2 commits April 6, 2026 15:59

chore(spec): address PR comments on updated ABNF grammar

c7eb3ac

chore(spec): fix unescaped tilde exclusion.

2f2b50e

frankkilcommins requested a review from char0n April 6, 2026 15:46

chore(spec): address feedback and leverage implementation example from …

7dfadbf

frankkilcommins wants to merge 4 commits into OAI:v1.1-dev from frankkilcommins:abnf-grammer-improvements
