Denotational semantics for Laurel IR with concrete evaluator and transform preservation tests by olivier-aws · Pull Request #631 · strata-org/Strata

olivier-aws · 2026-03-20T19:52:57Z

Denotational semantics for Laurel IR with concrete evaluator and transform preservation tests

Summary

This PR adds a fuel-based denotational interpreter for Laurel IR, a concrete program evaluator, a comprehensive test suite (~130 tests), and transform-preservation infrastructure that validates the Laurel→Laurel lowering pipeline preserves semantics. It also fixes a bug in liftExpressionAssignments that broke evaluation order.

Changes

Denotational interpreter (`Strata/Languages/Laurel/`)

LaurelSemantics.lean: Shared semantic types (values, stores, heaps, outcomes) and helper functions (evalPrimOp, bindParams, getBody, etc.) used by the interpreter and evaluator.
LaurelDenote.lean: Fuel-based denotational interpreter (denoteStmt, denoteBlock, denoteArgs) — three mutually recursive functions covering all StmtExpr constructors exhaustively. Short-circuit operators (AndThen, OrElse, Implies) are handled as special cases before the general PrimitiveOp path. evalPrimOp uses explicit per-operation fallthrough (no wildcard) so adding a new Operation variant forces a build error.
LaurelDenoteMono.lean: Fuel monotonicity proof — if the interpreter succeeds with fuel n, it succeeds with any m ≥ n giving the same result.
LaurelConcreteEval.lean: Bridges denoteStmt to Laurel.Program by building ProcEnv from static + instance procedures, constructing the initial store from static fields, and running main.

Test suite (`StrataTest/Languages/Laurel/`)

ConcreteEval/: 13 test modules covering primitives, arithmetic, boolean ops (including short-circuit), control flow, variables, procedures, side effects, recursion, aliasing, heap objects, type ops, verification constructs, and edge cases. Shared TestHelper.lean with parseLaurel (parse + resolve) and programmatic AST helpers.
LaurelDenoteTest.lean, LaurelDenoteUnitTest.lean, LaurelDenoteIntegrationTest.lean, LaurelDenotePropertyTest.lean: Direct tests of the denotational interpreter using both programmatic AST and Plausible property-based testing.
LaurelConcreteEvalTest.lean: End-to-end tests of the concrete evaluator.

Transform preservation (`ConcreteEval/TransformPreservation.lean`)

Runs all 94 string-based ConcreteEval tests after the full Laurel→Laurel lowering pipeline (the exact pass sequence from LaurelToCoreTranslator.translate, stopping before Laurel→Core translation). 82/94 pass — the remaining 12 failures are all due to heapParameterization changing the calling convention for composite types.

Bug fix: `liftExpressionAssignments`

Fixed two bugs in the StaticCall case of transformExpr:

Nested call ordering: add(mul(2,3), mul(4,5)) — the outer call's temp variable was declared before inner calls' temps, causing the evaluator to reference undeclared variables.
Side-effect evaluation order: add({x:=1;x}, {x:=x+10;x}) — the lifted code didn't preserve left-to-right evaluation order. Fixed by isolating each arg's prepends, capturing side-effectful args in temporaries, and emitting prepend groups in the correct order.

Other improvements

LaurelToCoreTranslator.lean: Extracted lowerLaurelToLaurel function that runs all Laurel→Laurel passes, reusable by both translate and the test infrastructure.
Removed applyLift parameter from parseLaurel — tests validate raw Laurel semantics directly.
Removed unused simp warnings in LaurelDenoteMono.lean.

Testing

lake build — no warnings from new files
lake test — all tests pass
Transform preservation: 82/94 tests match direct-mode output

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…prehensive test suite Semantics (Strata/Languages/Laurel/): - LaurelSemantics: Shared type definitions (values, stores, heaps, outcomes) and helper functions (evalPrimOp, bindParams, store/heap operations) - LaurelDenote: Fuel-based denotational interpreter - LaurelDenoteMono: Fuel monotonicity proof for the denotational interpreter Concrete evaluator: - LaurelConcreteEval: Concrete evaluator for Laurel programs via denotational semantics Test suite (StrataTest/Languages/Laurel/): - LaurelDenoteUnitTest: Unit tests for denotational interpreter - LaurelDenoteIntegrationTest: Integration scenario tests - LaurelDenotePropertyTest: Plausible property-based tests - LaurelConcreteEvalTest: Concrete evaluator tests using Laurel parser - ConcreteEval/ module hierarchy with shared TestHelper: Primitives, Arithmetic, BooleanOps, ControlFlow, SideEffects, Procedures, Aliasing, Variables, HeapObjects, Recursion, TypeOps, Verification, EdgeCases Also fixes LiftImperativeExpressions refactoring and minor test updates.

…onstructs Remove the wildcard catch-all from evalPrimOp and replace it with explicit per-operation cases so that adding a new Operation constructor forces a build error. This prevents new operations from silently returning none. Add short-circuit handling for AndThen, OrElse, and Implies in denoteStmt instead of evalPrimOp, since these operators must not eagerly evaluate their second argument. This enables proper side-effect semantics where the second operand is only evaluated when needed. Add DivT (truncation division) and ModT (truncation modulus) cases to evalPrimOp using Int.tdiv and Int.tmod respectively. Update the fuel monotonicity proof (LaurelDenoteMono) to handle the new short-circuit cases in denoteStmt by case-splitting on the operation and argument list structure. Fix pre-existing test failures: - BooleanOps tests now pass (AndThen/OrElse have semantics) - LaurelConcreteEvalTest short-circuit tests now pass - LaurelDenoteUnitTest short-circuit tests use correct operations - DivT test updated from stuck to returning correct result Add new tests: - DivT/ModT with positive and negative operands - DivT/ModT division by zero (stuck) - evalPrimOp unit tests for DivT, ModT, AndThen, OrElse, Implies - Truncation division edge cases (negative dividend/divisor)

…exhaustive over all Laurel constructs - Add DivT/ModT to arithTotalProp property test (bug fix) - Add Implies to short-circuit ops return none section in unit tests - Add TODO for extracting shared tactic in LaurelDenoteMono.lean - Document And/Or (eager) vs AndThen/OrElse/Implies (short-circuit) distinction in evalPrimOp - Fix stale comment in BooleanOps.lean referencing And/Or instead of AndThen/OrElse for short-circuit semantics

…osition Test 3 expected `returned: 42` but got `returned: 0` because the block expression `{x := 42; x}` was wrapped in a procedure call `id(...)`. The lift pass correctly lifts the call to a temporary before the block's side effects execute, so `id(x)` was called with x still equal to 0. Rewrite Test 3 to place the block expression directly in return position (`return {x := 42; x}`) where the lift pass produces the correct order: assign x := 42, then return x. The expected output `returned: 42` remains correct.

The denotational interpreter handles blocks in expression/argument position natively, making the liftExpressionAssignments pass unnecessary for concrete evaluator tests. Remove the applyLift parameter, its conditional logic, and the LiftImperativeExpressions import. Update all 58 call sites and related doc comments.

Delete three no-op `simp only [denoteStmt] at heval ⊢` lines in the AndThen, OrElse, and Implies cases of denoteStmt_fuel_mono. The match on the following line already unfolds denoteStmt, making these simp calls redundant and triggering unused-argument warnings during build. Remove unused simp warnings in LaurelDenoteMono.lean

Add infrastructure to run every string-based ConcreteEval test through the full Laurel→Laurel lowering pipeline, exposing which tests break after the lowering passes. Changes: - Add lowerLaurelToLaurel helper to LaurelToCoreTranslator.lean that extracts the Laurel→Laurel pass sequence (stops before Laurel→Core) - Add parseLaurelTransformed to TestHelper.lean using the new helper - Create TransformPreservation.lean with 94 tests mirroring all string-based ConcreteEval tests - Update ConcreteEval.lean barrel file Results: 77/94 tests pass (output matches direct mode). 17 tests fail due to two known categories: - heapParameterization (13 tests): composite types/heap objects - liftExpressionAssignments (4 tests): nested calls and eval order Failing tests document actual (wrong) output with explanatory comments.

- Fix failure count breakdown: 12 heapParameterization / 5 liftExpressionAssignments (was incorrectly 13/4) - Fix SideEffects Test 1 comment: the lifting pass traverses arguments right-to-left creating snapshot variables, so both block expressions independently see the original x=0, yielding add(0,0)=0 - Remove duplicate resolve call in lowerLaurelToLaurel (no-op second call after modifiesClausesTransform, mirrored from translate) - Add TODO to refactor translate to call lowerLaurelToLaurel internally to avoid duplicated Laurel→Laurel pass pipeline

…tation

Two bugs in the StaticCall case of transformExpr: 1. Nested call ordering: when an imperative call had arguments that were themselves imperative calls (e.g. add(mul(2,3), mul(4,5))), the outer call's temp variable was declared before the inner calls' temps. Fix: snapshot arg prepends before adding the call's own prepends, then restore them so inner calls are declared first. 2. Side-effect evaluation order: when arguments contained blocks with side effects (e.g. add({x:=1;x}, {x:=x+10;x})), the lifted code didn't preserve left-to-right evaluation order. The right arg's side effects could clobber variables read by the left arg's result. Fix: isolate each arg's prepends, capture side-effectful args in temporaries, and emit prepend groups in left-to-right order using the cons-based stack (right-to-left groups pushed first so left groups end up on top). All 5 liftExpressionAssignments failures in TransformPreservation tests now pass (82/94 total, remaining 12 are heapParameterization).

Should not have been committed

Strata/Languages/Laurel/LaurelDenote.lean

keyboardDrummer · 2026-03-23T12:04:39Z

Strata/Languages/Laurel/LaurelDenote.lean

+      | none => none
+
+    -- Verification Constructs
+    -- The relational semantics requires assert/assume conditions to be pure


Runtime compilation might erase assertions and assumptions, and to do that safely their bodies should not have side-effects. The semantics here should check that if the heap/store changes, none is returned.

Agreed in principle. The interpreter already discards the post-condition store/heap and returns the originals, so side effects in assert/assume conditions are invisible to subsequent code. However, we cannot detect impure conditions at runtime because LaurelStore and LaurelHeap are function types (String ? Option LaurelValue and Nat ? Option (...)) which have no BEq or DecidableEq instance in Lean. There is no way to compare two closures for equality. The purity requirement must be enforced statically (e.g., by a well-formedness check on the AST before interpretation) rather than dynamically. Added documentation clarifying this design constraint.

However, we cannot detect impure conditions at runtime because LaurelStore and LaurelHeap are function types

What if you use data instead of functions to represent the store and heap? Out of scope for this PR though.

Strata/Languages/Laurel/LaurelInterpreter.lean

keyboardDrummer · 2026-03-23T12:15:41Z

Strata/Languages/Laurel/LaurelDenote.lean

+
+    -- Quantifiers (delegated to δ)
+    | .Forall name ty body =>
+      match δ σ (.Forall name ty body) with


It seems you're leaving the meaning of quantifiers unspecified. How about you enable δ to return a List of all possible values of a particular type? Then you can use the fuel to iterate over these values and check the property.

Interesting idea. The current δ callback was designed as an escape hatch for constructs the interpreter can't handle natively. Enumerating all values of a type using fuel would work for finite types (bool, bounded int ranges) but not for unbounded ones (int, string). I could add a typeValues : LaurelType → Option (List LaurelValue) field to δ that returns some for enumerable types and none otherwise.

Happy to add this in a follow-up PR.

but not for unbounded ones

Why not? If you run out of fuel then the quantifier holds, so the proof would be of the form: for whatever amount of fuel you start with, the quantifier's value does not change before and after phase P of the compilation.

keyboardDrummer · 2026-03-23T12:17:14Z

Strata/Languages/Laurel/LaurelInterpreter.lean

+      | some v => some (.normal v, σ, h)
+      | none => none
+
+    -- Specification Constructs (delegated to δ)


What does delegating to δ mean? Shouldn't these all return a not yet implemented error?

Strata/Languages/Laurel/LaurelInterpreter.lean

keyboardDrummer · 2026-03-23T12:19:32Z

Strata/Languages/Laurel/LaurelSemantics.lean

+  SPDX-License-Identifier: Apache-2.0 OR MIT
+-/
+
+import Strata.Languages.Laurel.Laurel


The separation between LaurelConcreteEval, LaurelDenote and LaurelSemantics seems arbitrary to me.

LaurelSemantics defines shared types and helpers (values, stores, heaps, evalPrimOp). LaurelInterpreter (formerly LaurelDenote) is the fuel-based interpreter that uses those types. LaurelConcreteEval bridges the interpreter to Laurel.Program by building ProcEnv from program declarations and running main. The split keeps the interpreter independent of the program-level structure. Added module-level doc comments making this layering explicit.

Happy to merge some of these if you feel strongly about it.

I don't feel strongly but IMO the "program level structure" is as much a part of the language as any other part of the AST.

I agree and I have a follow-up to this PR that adds an "interpretProgram" method to define the semantics at the program level. I'll update the PR with this part

I feel more strongly about this - I think there should be no distinction between the "interpreter" and the "evaluator" anywhere, they should be labeled as the same concept that could potentially be implemented for any elements of a dialect.

I would vote for using "interpreter" in this case, only because "evaluator" is slightly ambiguous in Lean since it has `#eval.

keyboardDrummer · 2026-03-23T13:13:37Z

Strata/Languages/Laurel/LiftImperativeExpressions.lean

    let model := (← get).model
-    let seqArgs ← args.reverse.mapM transformExpr
-    let seqCall := ⟨.StaticCall callee seqArgs.reverse, md⟩
+    -- Process arguments right-to-left (for substitution mechanism).


This seems awfully complicated. Isn't the change on line 267 and 272 sufficient? If not, can you explain why?

Investigated. The simple change (just lines 267/272) handles the basic case where an imperative call's temp variable needs to be declared after its arguments' temps. But it does NOT handle the evaluation-order bug for side-effectful arguments like add({x:=1;x}, {x:=x+10;x}). In that case we need to:

Isolate each argument's prepended statements (so arg2's side effects don't leak into arg1's scope)

Capture side-effectful results in temporaries (so the final call uses snapshot values)

Emit the prepend groups in left-to-right order (so side effects execute in program order)

The simple fix only addresses point (1). Points (2) and (3) require the full complexity. Added a detailed comment in the code explaining which test cases require each piece.

For this example add({x:=1;x}, {x:=x+10;x}), could you write down the before this PR and after this PR translation?

I thought the before would already be:

x := 1; var x1 := x; x := x + 10 add(x1, x);

Is something wrong with that, or was it not producing that?

With my change:

procedure main() returns () deterministic { var x: int := 0; x := 1; var $c_1: int; $c_1 := x; x := x + 10; var $c_0: int; $c_0 := x; var $c_2: void; $c_2 := add($c_1, $c_0); return $c_2 }

Before my change:

procedure main() returns () deterministic { var x: int := 0; var $c_0: void; $c_0 := add(x, x); x := x + 10; x := 1; return $c_0 }

I think the above code changes are not the correct solution for the bug. I made a fix that I think is correct here: #672

Critical question: did all the changes in LiftImperativeExpressions in this PR make sense to you? I think before submitting a PR the answer has to be yes. I looked at the lifting changes in this PR but I couldn't make sense of them that's why I investigated the bug separately.

What's the Laurel program that's failing for you on #672?

I also tried:

function add(x: int, y: int): int { x + y }; procedure repeatedBlockExpressions() { var x: int := 2; var y: int := { x := 1; x } + { x := x + 10; x }; var z: int := add({ x := 1; x }, { x := x + 10; x }); assert y == 1 + 11; assert z == 1 + 11 };

this passes verification, and returns:

procedure repeatedBlockExpressions () returns () { $body: { var x : int := 2; var $x_1 : int := x; x := 1; var $x_0 : int := x; x := x + 10; var y : int := $x_0 + x; var $x_3 : int := x; x := 1; var $x_2 : int := x; x := x + 10; var z : int := add($x_2, x); assert [|assert(2376)|]: y == 1 + 11; assert [|assert(2398)|]: z == 1 + 11; } };

which seems good

The proposed change does make sense to me because I reviewed several iterations

Sorry for questioning that

I integrated the change from #672 into this PR after reverting the change on LiftImperative pass. And then the test

procedure add(a: int, b: int) { return a + b }; procedure main() { var x: int := 0; return add({x := 1; x}, {x := x + 10; x}) };

would not produce the correct IR.
Maybe it is because of the return statement.

Sorry for questioning that

Don't be, it is very sound to always look for the simplest solution, especially with LLMs that tend to find too complex solutions.

#672 has some issues and I'm working on resolving them. I'll update when resolved and I'll include the above test-case as well.

#672 was still missing the fixes on lines 267/272 from this PR. With those, the program:

procedure add(a: int, b: int) { return a + b }; procedure main() { var x: int := 0; return add({x := 1; x}, {x := x + 10; x}) };

passes as well.

Thanks, one #672 is merged I will update the branch and revert these changes then.

Rename "denotational interpreter" to "interpreter" throughout: - LaurelDenote.lean -> LaurelInterpreter.lean - LaurelDenoteMono.lean -> LaurelInterpreterMono.lean - denoteStmt/denoteBlock/denoteArgs -> interpStmt/interpBlock/interpArgs - Test files renamed accordingly - All doc comments updated Rename single-character variable names per reviewer request: - h -> heap, σ -> store in function signatures and bodies - π -> procEnv, σ₀/h₀ -> initStore/initHeap in evalProgram Add documentation: - Module layering explanation in LaurelSemantics.lean - δ (LaurelEval) callback purpose in LaurelInterpreter.lean - TODO for quantifier type enumeration via δ - TODO for Old pre-state snapshot threading - TODO for DiagnosticModel in Outcome Document assert/assume purity enforcement: - Conditions are evaluated but post-condition store/heap is discarded - Original store/heap returned, matching erasable-construct semantics Add detailed comment in LiftImperativeExpressions.lean explaining why the StaticCall case requires full arg-isolation complexity (Bug 1: nested call ordering, Bug 2: evaluation order preservation) rather than the simpler reordering approach.

robin-aws

Just top level feedback so far. I'm inclined to look closer after you merge the interpreter and the evaluator, if you agree, as I believe that will cut some of the documentation and datatype complexity in half.

robin-aws · 2026-03-25T21:39:47Z

Strata/Languages/Laurel/LaurelSemantics.lean

+  SPDX-License-Identifier: Apache-2.0 OR MIT
+-/
+
+import Strata.Languages.Laurel.Laurel


I feel more strongly about this - I think there should be no distinction between the "interpreter" and the "evaluator" anywhere, they should be labeled as the same concept that could potentially be implemented for any elements of a dialect.

I would vote for using "interpreter" in this case, only because "evaluator" is slightly ambiguous in Lean since it has `#eval.

robin-aws · 2026-03-25T21:43:16Z

Strata/Languages/Laurel/LaurelConcreteEval.lean

+  | noMain
+  | noBody
+  | stuck (msg : String)


It seems simpler to just have the stuck alternative, rather than picking out noMain and noBody as more specific ways to get stuck.

I suppose you might be thinking that noMain and noBody are program errors, whereas stuck is an interpreter error? I might buy that, but in that case it would be better to encode the two categories of errors more explicitly in the result naming.

robin-aws · 2026-03-25T21:51:01Z

Strata/Languages/Laurel/LaurelConcreteEval.lean

+
+/-- Run a program and classify the result. Delegates to `evalProgram` for
+    the core evaluation, preserving the `noMain` / `noBody` distinction. -/
+def runProgram (prog : Program) (fuel : Nat := 10000) : EvalResult :=


Why not make the default fuel something like 2^64 - 1 (or whatever the right value that stops short of expensive bignumber operations)? We only have the fuel so we can still make it technically guaranteed to terminate, but in practice having the interpreter terminate by running out of fuel is never useful information. This number is probably far too low to be a reasonable "timeout" value on large examples, so I think we're better off letting the Lean process run "forever" and we can kill the process if actually necessary.

robin-aws · 2026-03-25T22:46:31Z

Strata/Languages/Laurel/laurel-denotational-semantics.md

+interpreter that:
+
+1. Can be `#eval`'d on concrete programs
+2. Is deterministic by construction (it is a function, not a relation)


Apologies for the dumb question if this is known, but is Laurel itself deterministic? I had thought not, by design, in which case how is this point true?

robin-aws · 2026-03-25T22:49:01Z

Strata/Languages/Laurel/laurel-denotational-semantics.md

+can be evaluated — any terminating program can be evaluated with
+sufficient fuel.
+
+When fuel reaches zero, the interpreter returns `none` (indistinguishable


Here I got confused because LaurelResult has stuck, but I'm again getting confused between the interpreter and the evaluator. Another point that they should be the same thing IMO.

robin-aws · 2026-03-25T22:49:28Z

Strata/Languages/Laurel/laurel-denotational-semantics.md

This file should probably go under docs instead.

robin-aws · 2026-03-25T23:00:42Z

Strata/Languages/Laurel/laurel-denotational-semantics.md

+
+## Limitations
+
+1. **Fuel exhaustion is indistinguishable from stuck.** When fuel


Does not seem true of the interpreter, since LaurelResult DOES distinguish.

Delete 4 redundant test files (~1800 lines) whose coverage is subsumed by the ConcreteEval/ end-to-end tests: - LaurelInterpreterTest.lean (66 hand-built AST tests) - LaurelInterpreterUnitTest.lean (120 hand-built AST tests) - LaurelInterpreterIntegrationTest.lean (27 hand-built AST tests) - LaurelConcreteEvalTest.lean (16 tests, mix of parsed and AST) Create LaurelInterpreterInternals.lean with tests that cannot be exercised through parsed Laurel source: - interpPrimop exhaustive coverage (all ops, type mismatches, arity) - Stuck/error states (malformed AST the parser would reject) - δ delegation (Forall, Exists, Old, Fresh, Assigned, ContractOf) - Short-circuit semantics (AndThen/OrElse/Implies) - Fuel exhaustion - Programmatic AST tests for interpProgram Also fix stale LaurelConcreteEval import in Strata.lean. Remaining test files: - ConcreteEval/ (13 modules) — primary end-to-end test suite - LaurelInterpreterInternals — internal API tests - LaurelInterpreterPropertyTest — Plausible property-based tests

*Issue #, if available:* *Description of changes:* This PR fixes a Boole lowering bug where `returns ()` procedures were silently dropped because nested `return []` statements escaped `toCoreDecls` instead of producing empty local lists. It also includes a small Boole frontend cleanup and test refresh: - add Boole string type lowering to Core - normalize a few remaining nested branch returns in `Verify.lean` from `return` to `pure` - update Boole regression tests whose expected obligations changed once unit-return procedures were actually lowered and verified - stabilize several feature-request seeds and update their comments/docs to reflect current implementation priorities - added a regression test for the bug fix The net effect is that Boole procedures with no outputs are lowered correctly, the relevant regression tests are now meaningful again, and the Boole test suite is back in sync with the frontend behavior. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Add more integration tests for PyAnalyzeLaurel By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>

Add infrastructure to run the full Python → Laurel → Core → SMT verification pipeline from inline Python source strings in Lean tests. - StrataTest/Languages/Python/TestExamples.lean: New processPythonFile function that compiles Python source to Ion via the external tool, translates through Laurel to Core with Python-specific factory functions (ReFactory), and returns verification diagnostics. - StrataTest/Util/TestDiagnostics.lean: Extend parseDiagnosticExpectations to recognize '#' comments (in addition to '//') so that Python source embedded in Lean strings can use # comment expectations. - StrataTest/Languages/Python/VerifyPythonTest.lean: Five tests exercising the new infrastructure: passing assertions, failing assertions, mixed, line-number accuracy, and annotated-style with testInputWithOffset. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>

The existing Stmt-to-GOTO translation (ToCProverGOTO.lean) interleaves control-flow lowering — emitting conditional jumps, managing forward references, patching targets — with CBMC-specific concerns like source locations and instruction encoding. This makes the code hard to extend to additional backends and difficult to reason about independently. *Description of changes:* This commit introduces a CFG-based path that separates those responsibilities: Core Stmt → CFG (via stmtsToCFG) → CProverGOTO instructions The first step (stmtsToCFG from StructuredToUnstructured) handles control-flow lowering once, producing a backend-agnostic CFG. The second step (detCFGToGotoTransform, added here) is a straightforward mapping from CFG blocks to GOTO instructions. This separation makes it easier to target additional backends — each only needs to consume the CFG — and to reason about control-flow lowering independently. Three gaps remain before this can replace the direct path: 1. Source locations on control flow: StructuredToUnstructured discards MetaData from ite/loop/block/exit statements, so transfer commands in the CFG carry no source location information. 2. Loop contracts: the direct path emits #spec_loop_invariant and #spec_decreases as named sub-expressions on the backward-edge GOTO (for CBMC DFCC). In the CFG, invariants become plain asserts and measures are discarded. 3. Core.CmdExt.call: this translation handles Imperative.Cmd only. Core procedure calls need a command translator analogous to coreStmtsToGoto in CoreToGOTOPipeline.lean. No changes to StructuredToUnstructured are needed: the existing stmtsToCFG already works with Core.Command because its CmdT parameter is generic, and Core.Command has the required HasPassiveCmds and HasBool instances. Tests cover sequential commands, if-then-else, loops (including back-edge verification), empty programs, assert/assume, and havoc. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Kiro <kiro-agent@users.noreply.github.com> Co-authored-by: Shilpi Goel <shigoel@gmail.com> Co-authored-by: Aaron Tomb <aarotomb@amazon.com> Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>

…#666) ## Summary - Add `ResultClassifier` structure with pluggable `isSuccess`/`isFailure` predicates (defaults preserve existing behavior) so callers can control what counts as a failure without touching `printPyAnalyzeSummary`. - In bug-finding mode, narrow `isFailure` to `alwaysFalseAndReachable` only — the outcome where validity confirms the property is always false on a reachable path. Other failure modes become inconclusive. - Apply the classifier consistently: both the per-VC `"Assertion failed"` prefix and the summary counts/exit-code now use the same `classifier.isFailure`. - `nInconclusive` is now the remainder (total minus success, failure, unreachable, impl-error), so narrowing `isFailure` automatically widens inconclusive. - Add `VCOutcome.hasSMTError` / `VCResult.hasSMTError` to `Verifier.lean` alongside the other nine-case predicates, and use them to simplify `nImplError` in `printPyAnalyzeSummary`. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

…#647) PySpec commands can now work with an entire directory of PySpec files using standard Python conventions: `__init__.py` for package modules and relative imports (`from .Foo import Bar`). The CLI and API take dotted module names (e.g., `servicelib.Storage`) instead of individual Ion file paths, resolved against a spec directory. - Previously, PySpec only understood absolute `from X import Y` imports. Relative imports, bare imports, and `__init__.py` packages all produced errors. This PR adds support for all of these so that real-world Python codebases with typical project layouts can be translated. - When an imported module itself contains imports, those are now resolved relative to the imported file's own directory, not the top-level search path. This matches Python's import system and is essential for packages where `__init__.py` imports from sibling modules. - `translateFile` now takes an explicit `searchPath` parameter instead of deriving it implicitly, giving callers control over where imports are resolved from. - The existing `dispatch_test` fixture is converted from a flat `servicelib.py` to a proper Python package (`servicelib/__init__.py` with sibling modules), so the test suite actually exercises package-level imports end-to-end. - A new `RelativeImportTest` suite covers basic relative imports, aliased imports, mixed absolute+relative, bare imports, dotted package paths, `__init__.py` resolution, and negative cases (unsupported level-2 imports, missing modules). ## SimpleAPI Changes **Breaking:** `pyAnalyzeLaurel` and `pyTranslateLaurel` now take **module names** instead of Ion file paths. | Parameter | Before | After | |-----------|--------|-------| | `dispatchPaths : Array String` | Ion file paths | `dispatchModules : Array String` — dotted module names (e.g., `"servicelib"`) | | `pyspecPaths : Array String` | Ion file paths | `pyspecModules : Array String` — dotted module names (e.g., `"servicelib.Storage"`) | | _(new)_ | — | `specDir : System.FilePath := "."` — directory containing compiled PySpec Ion files | Module names are resolved against `specDir` using `ModuleName.specIonPath`, which tries canonical paths (`specDir/servicelib/Storage.pyspec.st.ion`) then falls back to `__init__` layout (`specDir/servicelib/__init__.pyspec.st.ion`). `resolveAndBuildLaurelPrelude` has the same parameter changes (`dispatchPaths` → `dispatchModules`, `pyspecPaths` → `pyspecModules`). ## Strata Exe Changes **Breaking:** The `--dispatch` and `--pyspec` flags now take **module names** instead of Ion file paths for these commands: - `pyAnalyzeLaurel` - `pyTranslateLaurel` - `pyAnalyzeLaurelToGoto` New flag: `--spec-dir <dir>` specifies where compiled PySpec Ion files live (defaults to `.`). The command exits with an error if the directory does not exist. **Before:** ```sh strata pyAnalyzeLaurel --dispatch path/to/dispatch.pyspec.st.ion --pyspec path/to/Storage.pyspec.st.ion file.ion ``` **After:** ```sh strata pyAnalyzeLaurel --spec-dir path/to/pyspec --dispatch servicelib --pyspec servicelib.Storage file.ion ``` By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

*Description of changes:* Adds the `--keep-all-files dir` to pyAnalyzeLaurel that creates the dir directory and stores a text representation of the Laurel and Core programs in it, plus all the SMT queries. Manually tested. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com> Co-authored-by: Mikael Mayer <mimayere@amazon.com>

This PR implements several previously-opaque functions in the `Strata/SimpleAPI.lean` and consolidates the verification entry point. ### Changes Implemented Simple API functions: - `genericToCore` — translates a generic `Strata.Program` to` Core.Program` using `Core.getProgram` - `genericToLaurel` — translates a generic `Strata.Program` to `Laurel.Program` using `Laurel.parseProgram` - `laurelToCore` — translates a `Laurel.Program` to `Core.Program` using the existing Laurel translator - `Core.inlineProcedures` — inlines procedure calls with a configurable predicate controlling which calls to inline - `Core.loopElimUsingContract` — eliminates loops by replacing them with invariant- based assertions/assumptions - `Core.callElimUsingContract` — eliminates procedure calls by replacing them with contract-based assertions/assumptions - `Core.verifyProgram` — runs deductive verification on a Core program (replaces the standalone `verifyCore`) Structural changes: - `Core.InlineTransformOptions` is now a concrete structure with an optional `doInline` predicate, rather than an opaque type - Removed the standalone `verifyCore` function; its logic is now in `Core.verifyProgram` under the `Core` namespace - Renamed `Core.loopElimWithContract` → `Core.loopElimUsingContract` and `Core.callElimWithContract` → `Core.callElimUsingContract` - Trimmed unnecessary imports from `SimpleAPI.lean` and `StrataMain.lean` - Fixed a docstring typo (Core dialect → Laurel dialect for `genericToLaurel`) Transform module workarounds: - Added top-level aliases `coreCallElimCmd` and `coreInlineCallCmd` in `CallElim.lean` and `ProcedureInlining.lean` to work around the Core module/dialect name collision Dogfooding in `StrataMain`: - Updated `pyAnalyzeCommand` to use `Core.inlineProcedures` instead of calling the transform directly - Updated `pyAnalyzeLaurelCommand` to use `Core.verifyProgram` instead of `Strata.verifyCore` ### Motivation The Simple API was originally sketched with opaque declarations to define the intended surface. This PR makes the majority of those functions concrete, moving the API closer to being usable by external clients. The remaining `noncomputable opaque` functions depend on DDM functionality that is not yet available. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>

*Issue #, if available:* *Description of changes:* Fixes Boole’s `old(...)` lowering in `Verify.lean` so it works on compound expressions, not just bare identifiers. Boole already lowers `old` by mapping free variables to their pre-state names in `Core`. This PR extends that existing lowering to compound expressions needed by the new regression test `stack_array_based`, without changing Core’s representation of old-state values. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>

- Make pyAnalyzeLaurel only check non-prelude asserts - Add check of prelude asserts as build-time check - Add check for missing source location to CI By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Support `**dict` expansion at resolved method call sites Python code often builds a parameter dictionary and passes it to a method call using `**`, e.g. `client.start_job_run(**params)`. When the target method had a known signature (via pyspec), the translator would fail with `'method' called with unknown keyword arguments: []` because it couldn't match the unnamed `**` keyword to individual named parameters. This PR detects `**dict` at call sites of resolved methods and expands the dictionary into individual arguments: required parameters use `DictStrAny_get` (preserving the precondition that the key must exist), and optional parameters use a new prelude function `DictStrAny_get_or_none` (returns `None` if the key is absent). Two end-to-end tests were added to the dispatch test suite: one covering all-required parameters, one covering a mix of required and optional parameters where the dictionary omits the optional keys. All existing tests continue to pass.

## Summary - Appends `(set-info :final-message "...")` at the end of each generated SMTLib file, containing the "message" metadata field from the obligation. Emitted once per obligation, after all check-sat commands. - Promotes MetaData field definitions from `def` to `abbrev` with `@[match_pattern]` so they can be used in pattern matches. - Adds `MetaData.message` field definition. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@MikaelMayer

…correctness of DetToNondet with loops (#660) **Specification of Transformation**: Strata/Transform/Specification.lean - Define two notions of assertion validity (`AssertValid` via reachability, `Hoare.Triple` via partial-correctness triples) and prove them equivalent (`hoareTriple_implies_assertValid`, `assertValid_implies_hoareTriple`). - Define `Sound` and `Overapproximates` predicates for transformation correctness with `WellFormedSemanticEvalBool`/`Val` preconditions. - Prove `overapproximates_triple`, composition, identity, and block-level `overapproximates_stmts` (using `noFuncDecl` for eval preservation). - Add structural Hoare rules: `cmd`, `seq_cons`, `ite`, `skip`, block lifting. - Add `Lang P` structure abstracting statement/config types and step relation for generality. **The new `hasFailure` flag in a program state, and small-step of Imperative** - Add `Env P` record bundling store, evaluator, and cumulative `hasFailure` flag. Refactor `Config`, `StepStmt`, `EvalStmt`, `EvalBlock` to use `Env`. - Properly amend small-step semantics of Imperative; copied verbatim from #509, @MikaelMayer **Correctness proof**: DetToNondetCorrect.lean - Prove `detToNondet_overapproximates` using the `Overapproximates` predicate defined above. - Defined small-step semantics of NondetStmt. This also supports loops. - Fix DetToNondet pass to optionally return `none` if funcDecl or exit is met. *Issue #, if available:* *Description of changes:* By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

The pipeline crashes with a type unification error when a composite value flows into an `Any`-typed position (e.g., `str(e)` on an exception object in a `try/except` handler). This PR fixes the issue by: 1. Generating `$composite_to_string_<type>` and `$composite_to_string_any_<type>` functions for each composite type. These take a composite parameter, so heap parameterization naturally adds a `Heap` dependency — preventing the verifier from unsoundly assuming referential transparency for `str()` calls across heap mutations. 2. When `str()` is called on a composite-typed variable (e.g., `str(e)` where `e: PythonError`), the translation emits `$composite_to_string_any_<Type>(e)` instead of `to_string_any(e)`, keeping the composite type and routing through the heap-aware function. Existing tests pass. New test added for try/except with `str(e)` on PythonError.

## Fix `with` statement variable scoping in Python-to-Laurel translation Python `with` blocks (e.g., `with open(...) as f:`) were placing variable declarations inside a Laurel `Block`, which scopes them in Core. When a function contained multiple `with` blocks reusing the same variable name, the second block would generate a `havoc` for a variable that wasn't declared at the outer Core scope, causing a "Cannot havoc undeclared variable" type-checking error. The fix hoists the manager and `as`-variable declarations out of the `Block`, matching Python's scoping semantics where `with`-block variables remain visible after the block ends. New test added in `VerifyPythonTest.lean` exercising multiple `with` blocks with the same variable name.

…pe system, and class method emission (#664) ## Summary Three interrelated changes to the Python→Laurel→Core pipeline that enable correct handling of instance methods, void procedures, and user-defined class fields: 1. **Unambiguous `@` separator for instance methods** — `Storage@put_item` instead of `Storage_put_item`, eliminating collisions with underscored names 2. **Void procedures return Any** — all PySpec procedures now have exactly one output (void returns `Any` via `from_none`), removing special-case void handling at call sites 3. **Hybrid two-kind field type system** — builtin types (int, str, bool, float) become `Any` on Composite fields; class types (PySpec services, user classes) stay `Composite`, matching parameter semantics at call boundaries 4. **User-defined class method emission** — class method definitions (`Counter@__init__`, `Counter@increment`) are now included in the Laurel program as static procedures ## PythonToLaurel changes - `@` separator applied across PythonToLaurel, PythonToCore, PySpecPipeline, and Specs/ToLaurel - New `translateFieldType` helper maps builtins→Any, composites→Composite for field declarations - `self` parameter typed as `Composite` (not `Any`) in `__init__` and user methods - Removed `wrapFieldInAny` for `self.field` reads in method bodies — builtin fields are already Any-typed - `unsupportedConstruct` error thrown when Composite→Any coercion is attempted (e.g., returning a Storage field as str) ## LaurelToCoreTranslator changes - Statement-position calls now synthesize throwaway `$unused_N` variables to match output count, fixing Core arity checks for void-returns-Any procedures - Single-target `Assign` with `StaticCall` also synthesizes extra LHS variables for multi-output procedures ## Testing - New dispatch tests for void assign/init, non-void discard, instance call result, user class construct, class field assign, and class method return - Two negative tests for Composite↔Any kind mismatches: `test_class_composite_as_any` (Composite field returned as Any) and `test_class_any_as_composite` (Any value assigned to Composite field). The latter requires explicit pyspec paths since `IdentifyOverloads` resolves modules from complete programs with `main`, not isolated class snippets. - 7 regenerated expected_laurel files reflecting the new field type system ## Known limitations - Composite→Any coercion (e.g., returning a `Storage` field as `str`) is explicitly rejected with a user-facing error. Coercion functions are being developed separately. - Procedure names used as type annotations (e.g., `Storage_put_item` as a type) silently fall through to `Any` instead of producing an error or warning. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 and MIT licenses. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

## Problem `Any_to_bool` only accepted `bool`, `none`, `string`, and `int` in its precondition and body (returning `false` for everything else). This caused a soundness issue: After `if results:`, the solver knows `results` must satisfy `Any_to_bool`'s precondition and its body must return `true`. Since Dict and List were not handled, the solver could prove that `results` is never a Dict or List. Then `PIn`'s precondition (`requires isfrom_Dict || isfrom_ListAny`) was provably unsatisfiable: ``` ❌ always false and is reachable from declaration entry ``` This is a false positive — in Python, non-empty dicts and lists are truthy. ## Fix Add `isfrom_Dict` and `isfrom_ListAny` to `Any_to_bool`'s precondition and body: empty dict/list → `false`, non-empty → `true`. ## Additional changes - Add `# strata-args:` directive for per-file flags in test Python files - Deprecate `pyAnalyze` (non-Laurel) in CI — default to `pyAnalyzeLaurel` ## Test `test_pin_any.py`: `def test_in_on_any(results: Any): if results: assert 'key' in results` - **Before**: `assert_assert(124)_calls_PIn_0: ❌ always false and is reachable` - **After**: `assert_assert(124)_calls_PIn_0: ❓ unknown`

## Summary - Always generate `$__bv{N}` names for quantifier-bound variables instead of reusing user-provided names - This guarantees globally unique bound variable names across all quantifiers. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

PR #681 replaced human-readable bound variable names (e.g. `n`, `m`, `x`) with globally unique `$__bv{N}` identifiers in SMT output. While this guarantees uniqueness, most SMT solvers don't require it and the output becomes harder to read. This PR restores human-readable names as the default behavior, with disambiguation (e.g. `x@1`) when names clash with other bound variables or free variables. The `$__bv{N}` naming is still available via the `--unique-bound-names` CLI flag for solvers that need it. The existing clash detection already ensures generated bound variable names never conflict with free variable names passed through in the context. Existing tests pass. No new tests needed since the test expectations were restored to their pre-#681 values. Fixes #683

## Summary Fix a bug where paths were getting an extra slash (e.g., `a//b`) and this caused output paths to become absolute, failing with permission errors. - Rewrite path stripping to handle trailing-slash variations correctly - Promote internal path failures to hard errors instead of silently skipping modules Verified on pyspec files. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

…esolve modifies error (#693) This patch resolves two failures: - Add method arguments to context when translating Python to Laurel - Rename the internal variable name for the argument to add 'in_', which will resolve the 'modifies'-type checking error. These two fixes are needed to make the new test pass. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Shilpi Goel <shigoel@gmail.com>

## Summary - Add `--vc-directory <dir>` CLI flag to the `pyAnalyze` command (it was already supported by `pyAnalyzeLaurel`) - Wire the flag through `run_py_analyze.sh` so both `core` and `laurel` modes can dump SMTLib files By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

### Changes This change enables me to use the Miso installed Python on my local machine, so I can run `StrataTest/Languages/Python/AnalyzeLaurelTest.lean`. With the system Python and the Python brew installation I was having issues. ### Testing - Was locally able to run `StrataTest/Languages/Python/AnalyzeLaurelTest.lean`

Migrates the standalone `StrataVerify` executable into a `verify` subcommand of the `strata` CLI (`StrataMain.lean`), deletes the old `StrataVerify.lean`, and enhances the shared argument parser to support `--flag=value` syntax (with validation that values don't contain `=`). All references across CI, C# integration tests, documentation, and example files are updated. ### Features - The `verifyCommand` faithfully reproduces all flags and behavior from the original `StrataVerify.lean`, including backward compatibility with `--output-format=sarif` via the new `--flag=value` parsing. - The `--flag=value` enhancement to `parseArgs` is well-implemented: it splits on `=`, validates that the value doesn't itself contain `=` (clear error message), and correctly handles both `.arg` and `.repeat` flag types with fallback to the next positional argument when no inline value is present. - The diff is clean: no stale files, no dead code. `StrataVerify.lean` is deleted, no untracked files are accidentally included. - The steering doc now lists `strata` with its subcommands instead of the removed `StrataVerify` entry. - The C# integration test correctly updates both the binary path and prepends `verify` to the arguments. - All documentation (README, GettingStarted, CI, example comments) is consistently updated. - Build succeeds and smoke tests pass for `--sarif`, `--output-format=sarif`, and the `=`-in-value rejection. ### Testing Existing tests that previously used `StrataVerify` now use `strata verify`. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>

…639) *Description of changes:* This PR move all variable declarations to the beginning of the function body. This PR continues the local variable bug fix from this PR #619. PR #619 moves the variable declarations inside the body blocks of `if`, `while`, and `for` statements and attach them on the top of those statements. However, in case those `if`, `while`, and `for` statements are inside a sub-block, those variables are not accessible outside of that block. This PR resolves that issue by collecting all variables that appear inside a function body and making declarations for them at the top of the function body. Note: To accommodate this task, this PR also adds overloaded types to compositeTypeNames so that isCompositeType recognizes them during type inference. By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Shilpi Goel <shigoel@gmail.com>

…ag (#696) ## Summary - Introduce `ExitCode` namespace with documented exit code constants and a common scheme across all `strata` subcommands (1=user error, 2=failures found, 3=internal error, 4=known limitation) - Use `Core.inlineProcedures` from SimpleAPI in `pyAnalyzeLaurelCommand` and `pyAnalyzeToGotoCommand` instead of calling lower-level `Core.Transform.runProgram` directly - Add `--no-solve` flag to `pyAnalyzeLaurel` that generates SMT-Lib files without invoking the solver (requires `--vc-directory` or`--keep-all-files`) - Add `skipSolver` field to `VerifyOptions` and wire through `dischargeObligation` - Rename `alwaysRunSMT` → `alwaysGenerateSMT` for clarity - Document all `VerifyOptions` fields with grouping comments - Replace all `panic!` calls with proper `exitInternalError` (exit code 3) - Swap exit codes 2↔3 to group user-actionable codes (1,2) and tool-side codes (3,4) ## NEXT UP Continue migrating StrataMain commands to use SimpleAPI — deduplicate the shared Laurel text-file parsing across 4 commands and add Laurel verification and Python direct-pipeline functions to SimpleAPI. 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*Description of changes:* Handle f-strings in Python->Laurel translation - FormattedValue: convert inner expr to string via to_string_any - JoinedStr: concatenate all parts using PAdd (same as string +) - Interpolation/TemplateStr (Python 3.14+ t-strings): fall through to Hole - inferExprType: return Str for JoinedStr and FormattedValue - Add f-string test cases to test_strings.py - ~Bind exception variables from `except ... as e` handlers as LocalVariable declarations in Laurel, so they are in scope when referenced in f-strings or other expressions within the handler body.~ Fixed in PR 609 - test_missing_models expected file updated By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice. --------- Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>

Add `exhaustive` tracking for Python class methods. Add `@exhaustive` decorator for pyspec classes. The current settings are: - User-defined classes: always exhaustive - Pyspec classes: exhaustive controlled by ClassDef.exhaustive field (defaults to false; models should aim for `true`) - Types from annotations only: not exhaustive, unmodeled methods become Hole - Primitives (str, int, bool): not exhaustive, methods like lower()/startswith() become Hole By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

## Summary The prelude was being re-verified for every Python test file, wasting time and cluttering test output with 22 identical prelude lines per file. ## Changes **Verifier** (`Strata/Languages/Core/Verifier.lean`): When `proceduresToVerify` is set, after `CallElim` inlines contracts at call sites, drop every procedure not in the target list. Previously the second `FilterProcedures` used a stale call-graph closure and kept prelude procedures, causing them to be re-verified every time. **Expected files**: All 39 `expected_laurel` test files updated — 22 prelude lines removed from each, DETAIL counts adjusted. Prelude correctness is verified once by the dedicated `PreludeVerifyTest`. ## Testing - `lake build StrataMain` passes - `PreludeVerifyTest` passes - Core verification tests pass (`FailingAssertion`, `Axioms`, `AlwaysRunSMT`, `RemoveIrrelevantAxioms`) --------- Co-authored-by: Andrew Wells <130512013+andrewmwells-amazon@users.noreply.github.com>

Add infrastructure to run every string-based ConcreteEval test through the full Laurel→Laurel lowering pipeline, exposing which tests break after the lowering passes. Changes: - Add lowerLaurelToLaurel helper to LaurelToCoreTranslator.lean that extracts the Laurel→Laurel pass sequence (stops before Laurel→Core) - Add parseLaurelTransformed to TestHelper.lean using the new helper - Create TransformPreservation.lean with 94 tests mirroring all string-based ConcreteEval tests - Update ConcreteEval.lean barrel file Results: 77/94 tests pass (output matches direct mode). 17 tests fail due to two known categories: - heapParameterization (13 tests): composite types/heap objects - liftExpressionAssignments (4 tests): nested calls and eval order Failing tests document actual (wrong) output with explanatory comments.

olivier-aws added 14 commits March 20, 2026 19:51

Use lowerLaurelToLaurel in LaurelToCore

ee6bd65

docs(laurel): Move denotational semantics design doc next to implemen…

00058ab

…tation

Add TODO

5b2698a

Delete docs/reviews/review-liftimperativeexpressions-diff.md

2f8ee4c

Should not have been committed

Revert unwanted changes

721ba52

olivier-aws marked this pull request as ready for review March 20, 2026 20:38

olivier-aws requested a review from a team March 20, 2026 20:38

olivier-aws requested review from fabiomadge and keyboardDrummer as code owners March 20, 2026 20:38

aqjune-aws added the Laurel label Mar 20, 2026

Merge branch 'main' into feat/laurel-denotational-semantics

af4305c

keyboardDrummer reviewed Mar 23, 2026

View reviewed changes

robin-aws requested changes Mar 25, 2026

View reviewed changes

olivier-aws and others added 6 commits March 30, 2026 13:55

Rename to Interpreter rather than Semantics or Evaluator, part 1

0924043

Remove empty files

9a2ab2e

tautschnig and others added 30 commits March 30, 2026 16:37

Use lowerLaurelToLaurel in LaurelToCore

dc28756

Revert unwanted changes

1f1c19e


		## Limitations

		1. Fuel exhaustion is indistinguishable from stuck. When fuel

Conversation

olivier-aws commented Mar 20, 2026

Denotational semantics for Laurel IR with concrete evaluator and transform preservation tests

Summary

Changes

Denotational interpreter (Strata/Languages/Laurel/)

Test suite (StrataTest/Languages/Laurel/)

Transform preservation (ConcreteEval/TransformPreservation.lean)

Bug fix: liftExpressionAssignments

Other improvements

Testing

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

keyboardDrummer Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

keyboardDrummer Mar 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

keyboardDrummer Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

olivier-aws Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

robin-aws left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Denotational interpreter (`Strata/Languages/Laurel/`)

Test suite (`StrataTest/Languages/Laurel/`)

Transform preservation (`ConcreteEval/TransformPreservation.lean`)

Bug fix: `liftExpressionAssignments`

keyboardDrummer Mar 25, 2026 •

edited

Loading

keyboardDrummer Mar 25, 2026 •

edited

Loading

keyboardDrummer Mar 26, 2026 •

edited

Loading

olivier-aws Mar 26, 2026 •

edited

Loading