Skip to content

Feat/rvm infrastructure v6#652

Draft
anakrish wants to merge 1 commit intomicrosoft:mainfrom
anakrish:feat/rvm-infrastructure-v6
Draft

Feat/rvm infrastructure v6#652
anakrish wants to merge 1 commit intomicrosoft:mainfrom
anakrish:feat/rvm-infrastructure-v6

Conversation

@anakrish
Copy link
Copy Markdown
Collaborator

No description provided.

@anakrish anakrish force-pushed the feat/rvm-infrastructure-v6 branch 4 times, most recently from 6603f15 to 3cfa544 Compare April 1, 2026 14:36
@anakrish anakrish requested a review from Copilot April 1, 2026 15:28
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds Azure Policy–oriented VM infrastructure and metadata/context plumbing to the RVM, alongside new instructions and updated program serialization (v6) with expanded metadata.

Changes:

  • Introduces new VM instructions (LoadContext, LoadMetadata, ReturnUndefinedIfNotTrue, CoalesceUndefinedToNull, ArrayPushDefined) plus listing/display/test parsing support.
  • Extends VM state with host-supplied evaluation context and cached metadata, and adds Azure Policy–specific loop behavior for [*] over non-collections.
  • Refactors/extends program metadata (new module, language + annotations), adds has_host_await, and bumps binary serialization version to 6 with new/updated YAML VM suites.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
tests/rvm/vm/suites/type_errors.yaml Updates logical NOT test case description/example metadata.
tests/rvm/vm/suites/return_undefined_if_not_true.yaml New YAML suite for ReturnUndefinedIfNotTrue.
tests/rvm/vm/suites/load_context_metadata.yaml New YAML suite for LoadContext / LoadMetadata.
tests/rvm/vm/suites/coalesce_undefined_to_null.yaml New YAML suite for CoalesceUndefinedToNull and interactions.
src/rvm/vm/machine.rs Adds VM context + metadata cache + Azure Policy loop flag; changes execution-timer tick logic.
src/rvm/vm/loops.rs Implements Azure Policy non-collection [*] loop semantics via a virtual single element.
src/rvm/vm/dispatch.rs Executes new instructions and adds ArrayPushDefined.
src/rvm/vm/context.rs Extends IterationState with Single for virtual iteration.
src/rvm/vm/comprehension.rs Ensures comprehensions tolerate IterationState::Single.
src/rvm/tests/vm.rs Extends YAML harness with context + metadata fields and recomputes has_host_await.
src/rvm/tests/instruction_parser.rs Adds parser support for new instruction mnemonics.
src/rvm/program/types.rs Removes old ProgramMetadata definition (moved to new module).
src/rvm/program/serialization/json.rs Serializes/deserializes expanded metadata, annotations, and has_host_await.
src/rvm/program/serialization/binary.rs Bumps binary version handling and adjusts legacy detection.
src/rvm/program/mod.rs Exposes new metadata module and re-exports ProgramMetadata.
src/rvm/program/metadata.rs New metadata types and postcard-safe annotation serialization bridge.
src/rvm/program/listing.rs Displays metadata language/annotations and renders new instructions.
src/rvm/program/core.rs Adds has_host_await, bumps serialization version to 6, tracks HostAwait presence.
src/rvm/instructions/mod.rs Defines new instruction variants and docs.
src/rvm/instructions/display.rs Adds Display formatting for new instruction variants.
src/languages/rego/compiler/mod.rs Minor import adjustment for Value.
src/languages/rego/compiler/core.rs Tracks has_host_await when emitting HostAwait directly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 153 to +160
match version {
1..=3 => {
1..=5 => {
let mut program = Program::new();
program.needs_recompilation = true;
program.rego_v0 = Self::legacy_rego_v0(data, version).unwrap_or(false);
Ok(DeserializationResult::Partial(program))
}
4 | 5 => {
6 => {
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Binary deserialization now treats versions 1..=5 as Partial (needs_recompilation), which means previously supported v5 artifacts will no longer fully deserialize. Since Program/ProgramMetadata added fields use serde defaults, this looks like an unintended backwards-compatibility break; consider keeping v4/v5 in the full-deserialization path and only special-casing truly unsupported versions.

Copilot uses AI. Check for mistakes.
Comment on lines 419 to 430
@@ -399,7 +426,7 @@ impl RegoVM {
};

self.execution_timer
.check_now(now)
.tick(work_units, now)
.map_err(|err| match err {
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

execution_timer_tick now calls monotonic_now() on every tick when a limit is configured, even though ExecutionTimer::tick only needs the clock when the check interval is reached. This defeats the amortization design and can add significant overhead; consider restoring the previous accumulate() gate (or adding an API to query whether a check is due before reading the clock).

Copilot uses AI. Check for mistakes.
Comment on lines +144 to +146
/// number, etc.) returns `false` instead of vacuous `true`. This matches
/// Azure Policy semantics where `field[*]` on a non-array means "condition
/// not met". Automatically set from `program.metadata.language`.
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment says an Every loop over a non-collection "returns false", but the implementation creates a single virtual iteration and the result depends on the loop body outcome. Please update the comment to match the actual semantics (virtual single null element) or adjust the implementation if the intent is to force false unconditionally.

Suggested change
/// number, etc.) returns `false` instead of vacuous `true`. This matches
/// Azure Policy semantics where `field[*]` on a non-array means "condition
/// not met". Automatically set from `program.metadata.language`.
/// number, etc.) behaves as if iterating over a single virtual element whose
/// value is `null`, instead of being vacuously `true` over an empty collection.
/// This matches Azure Policy semantics where `field[*]` on a non-array produces
/// a single `null` element (which typically causes the condition to evaluate to
/// `false`). Automatically set from `program.metadata.language`.

Copilot uses AI. Check for mistakes.
Comment on lines +13 to +15
/// In standard Rego, `Every` over empty is vacuously `true`.
/// In Azure Policy mode, this code path is only reached when
/// `virtual_element_on_non_collection` is false (standard Rego).
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is misleading: non_collection_result() is still used in Azure Policy mode for Any/ForEach (objects/non-collections short-circuit to false). Please update the wording to reflect when this helper is used in each mode.

Suggested change
/// In standard Rego, `Every` over empty is vacuously `true`.
/// In Azure Policy mode, this code path is only reached when
/// `virtual_element_on_non_collection` is false (standard Rego).
/// In standard Rego mode, this helper is used for all loop modes when the
/// collection operand is not iterable: `Every` over empty is vacuously `true`,
/// and `Any`/`ForEach` over empty are `false`.
/// In Azure Policy mode, this helper is still used for `Any`/`ForEach` when an
/// object or other non-collection is encountered and short-circuits to `false`;
/// `Every` with virtual elements is handled via a different code path.

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +46
/// Used by Azure Policy's `[*]` on scalar/null fields: iterates once
/// with the original value (typically Null/Undefined) as the element.
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IterationState::Single is documented as iterating over the original scalar/null value, but setup_next_iteration currently always yields value = Null for this state. Either store the original element in IterationState::Single and return it, or update the doc comment to describe the current "virtual null element" behavior.

Suggested change
/// Used by Azure Policy's `[*]` on scalar/null fields: iterates once
/// with the original value (typically Null/Undefined) as the element.
/// Used by Azure Policy's `[*]` on scalar/null fields: presents a single
/// "virtual" element to iterate over, which is always `Null` regardless
/// of the underlying source value.

Copilot uses AI. Check for mistakes.
Comment on lines +277 to +283
/// Return undefined immediately when condition is false or undefined.
///
/// This is used by Azure Policy compilation to model "condition does not match"
/// without treating it as a VM assertion failure.
ReturnUndefinedIfNotTrue {
condition: u8,
},
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The instruction doc says this returns undefined when condition is false/undefined, but the dispatch implementation returns undefined for any value that is not exactly Bool(true) (including null, numbers, strings, etc.). Please update the doc to match the actual semantics so compiler authors don’t misuse it.

Copilot uses AI. Check for mistakes.
Comment on lines +95 to +110
let annotations: alloc::collections::BTreeMap<String, Value> = metadata
.get("annotations")
.and_then(|v| match *v {
// Deserialize JSON annotations into Value map
serde_json::Value::Object(ref map) => {
let mut result = alloc::collections::BTreeMap::new();
for (k, json_val) in map {
if let Ok(val) = serde_json::from_value::<Value>(json_val.clone()) {
result.insert(k.clone(), val);
}
}
Some(result)
}
_ => None,
})
.unwrap_or_default();
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotation deserialization silently drops entries when serde_json::from_value::<Value> fails. That can lead to hard-to-debug metadata loss; consider returning an error (fail deserialization) or preserving the key with a fallback representation instead of skipping it.

Suggested change
let annotations: alloc::collections::BTreeMap<String, Value> = metadata
.get("annotations")
.and_then(|v| match *v {
// Deserialize JSON annotations into Value map
serde_json::Value::Object(ref map) => {
let mut result = alloc::collections::BTreeMap::new();
for (k, json_val) in map {
if let Ok(val) = serde_json::from_value::<Value>(json_val.clone()) {
result.insert(k.clone(), val);
}
}
Some(result)
}
_ => None,
})
.unwrap_or_default();
let annotations: alloc::collections::BTreeMap<String, Value> =
match metadata.get("annotations") {
// Deserialize JSON annotations into Value map
Some(serde_json::Value::Object(ref map)) => {
let mut result = alloc::collections::BTreeMap::new();
for (k, json_val) in map {
let val = serde_json::from_value::<Value>(json_val.clone())
.map_err(|e| {
format!(
"Annotation '{}' deserialization failed: {}",
k, e
)
})?;
result.insert(k.clone(), val);
}
result
}
_ => alloc::collections::BTreeMap::new(),
};

Copilot uses AI. Check for mistakes.
Comment on lines +104 to +115
/// # Lossy mappings
///
/// [`Null`](crate::value::Value::Null) and [`Undefined`](crate::value::Value::Undefined)
/// are mapped to [`MetadataValue::String("")`](MetadataValue::String) because metadata
/// annotations have no need for null semantics. A round-trip through
/// `from_value` → `to_value` will therefore turn `Null`/`Undefined` into an
/// empty string.
///
/// Floating-point numbers are truncated to `i64` because metadata annotations
/// are expected to contain only integer counts, flags, and identifiers —
/// fractional values are not anticipated.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, Serialize, Deserialize)]
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetadataValue intentionally performs lossy conversions (Null/Undefined → empty string; floats truncated; non-i64 numbers become Integer(0)). Since this is used for binary serialization of program metadata, it can silently corrupt annotations across a save/load round-trip. Consider adding explicit Null/Undefined/Float variants (or rejecting unsupported values at compile time) to keep metadata serialization lossless and predictable.

Copilot uses AI. Check for mistakes.
description: NOT with non-boolean defined operand should yield false
example_rego: "not 42"
description: NOT with int operand treats non-boolean as truthy (result false)
example_rego: "!42"
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example_rego uses !42, but the Rego parser in this repo recognizes not (negation) and doesn’t appear to support ! as a unary operator. Consider switching the example back to not 42 (or another syntactically valid example) to avoid misleading readers.

Suggested change
example_rego: "!42"
example_rego: "not 42"

Copilot uses AI. Check for mistakes.
Comment on lines +611 to +636
ArrayPushDefined { arr, value } => {
let value_to_push = self.get_register(value)?.clone();

// Skip undefined values — matches Azure Policy's
// `field('alias[*].property')` collection semantics where
// absent nested properties are excluded from the collected
// array.
if value_to_push == Value::Undefined {
return Ok(InstructionOutcome::Continue);
}

let mut arr_value = self.take_register(arr)?;

if let Ok(arr_mut) = arr_value.as_array_mut() {
arr_mut.push(value_to_push);
self.set_register(arr, arr_value)?;
} else {
let offending = arr_value.clone();
self.set_register(arr, arr_value)?;
return Err(VmError::RegisterNotArray {
register: arr,
value: offending,
pc: self.pc,
});
}
Ok(InstructionOutcome::Continue)
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ArrayPushDefined introduces new behavior (skipping Undefined) but there’s no YAML VM suite coverage for it yet, while similar array ops have tests. Adding a small suite that covers (1) pushing Undefined is skipped, (2) pushing Null/value is kept, and (3) non-array target still errors would help prevent regressions.

Copilot uses AI. Check for mistakes.
@anakrish anakrish force-pushed the feat/rvm-infrastructure-v6 branch 2 times, most recently from bd0edc1 to c6110cc Compare April 1, 2026 17:56
…ization v6

Extend the RVM infrastructure with evaluation context, program metadata,
and several new instructions needed by language frontends beyond Rego.

VM & instructions:
- Add LoadContext / LoadMetadata instructions for accessing evaluation
  context and program metadata from within RVM programs
- Add ArrayPushDefined (skip-undefined variant of ArrayPush) for
  wildcard alias collection where absent properties should be excluded
- Add ReturnUndefinedIfNotTrue for early-return semantics without
  triggering VM assertion failures
- Add CoalesceUndefinedToNull for languages where missing fields are
  null rather than undefined
- Support IterationState::Single in loops and comprehensions for
  iterating over scalar values alongside collections

Program metadata:
- Extend ProgramMetadata with language identifier, typed annotations
  (BTreeMap<String, MetadataValue>), and to_value() conversion
- Add MetadataValue enum (String, Bool, Integer, Float, Array, Object)
  with full serde support
- Add has_host_await flag to Program with recompute_host_await_presence()

Serialization:
- Bump binary format version to 6
- Add JSON serialization for ProgramMetadata including annotations

Compiler:
- Track has_host_await during compilation
- Minor import cleanups

Test infrastructure:
- Extend VM test harness with context, metadata_language, and
  metadata_annotations fields
- Add instruction parser support for new instructions
- Add assembly listing support (readable, tabular, compact formats)
- Add 3 new YAML test suites: load_context_metadata,
  coalesce_undefined_to_null, return_undefined_if_not_true
@anakrish anakrish force-pushed the feat/rvm-infrastructure-v6 branch from c6110cc to 9247770 Compare April 3, 2026 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants