Skip to content

Denotational semantics for Laurel IR with concrete evaluator and transform preservation tests#631

Open
olivier-aws wants to merge 70 commits intomainfrom
feat/laurel-denotational-semantics
Open

Denotational semantics for Laurel IR with concrete evaluator and transform preservation tests#631
olivier-aws wants to merge 70 commits intomainfrom
feat/laurel-denotational-semantics

Conversation

@olivier-aws
Copy link
Copy Markdown
Contributor

Denotational semantics for Laurel IR with concrete evaluator and transform preservation tests

Summary

This PR adds a fuel-based denotational interpreter for Laurel IR, a concrete program evaluator, a comprehensive test suite (~130 tests), and transform-preservation infrastructure that validates the Laurel→Laurel lowering pipeline preserves semantics. It also fixes a bug in liftExpressionAssignments that broke evaluation order.

Changes

Denotational interpreter (Strata/Languages/Laurel/)

  • LaurelSemantics.lean: Shared semantic types (values, stores, heaps, outcomes) and helper functions (evalPrimOp, bindParams, getBody, etc.) used by the interpreter and evaluator.
  • LaurelDenote.lean: Fuel-based denotational interpreter (denoteStmt, denoteBlock, denoteArgs) — three mutually recursive functions covering all StmtExpr constructors exhaustively. Short-circuit operators (AndThen, OrElse, Implies) are handled as special cases before the general PrimitiveOp path. evalPrimOp uses explicit per-operation fallthrough (no wildcard) so adding a new Operation variant forces a build error.
  • LaurelDenoteMono.lean: Fuel monotonicity proof — if the interpreter succeeds with fuel n, it succeeds with any m ≥ n giving the same result.
  • LaurelConcreteEval.lean: Bridges denoteStmt to Laurel.Program by building ProcEnv from static + instance procedures, constructing the initial store from static fields, and running main.

Test suite (StrataTest/Languages/Laurel/)

  • ConcreteEval/: 13 test modules covering primitives, arithmetic, boolean ops (including short-circuit), control flow, variables, procedures, side effects, recursion, aliasing, heap objects, type ops, verification constructs, and edge cases. Shared TestHelper.lean with parseLaurel (parse + resolve) and programmatic AST helpers.
  • LaurelDenoteTest.lean, LaurelDenoteUnitTest.lean, LaurelDenoteIntegrationTest.lean, LaurelDenotePropertyTest.lean: Direct tests of the denotational interpreter using both programmatic AST and Plausible property-based testing.
  • LaurelConcreteEvalTest.lean: End-to-end tests of the concrete evaluator.

Transform preservation (ConcreteEval/TransformPreservation.lean)

Runs all 94 string-based ConcreteEval tests after the full Laurel→Laurel lowering pipeline (the exact pass sequence from LaurelToCoreTranslator.translate, stopping before Laurel→Core translation). 82/94 pass — the remaining 12 failures are all due to heapParameterization changing the calling convention for composite types.

Bug fix: liftExpressionAssignments

Fixed two bugs in the StaticCall case of transformExpr:

  1. Nested call ordering: add(mul(2,3), mul(4,5)) — the outer call's temp variable was declared before inner calls' temps, causing the evaluator to reference undeclared variables.

  2. Side-effect evaluation order: add({x:=1;x}, {x:=x+10;x}) — the lifted code didn't preserve left-to-right evaluation order. Fixed by isolating each arg's prepends, capturing side-effectful args in temporaries, and emitting prepend groups in the correct order.

Other improvements

  • LaurelToCoreTranslator.lean: Extracted lowerLaurelToLaurel function that runs all Laurel→Laurel passes, reusable by both translate and the test infrastructure.
  • Removed applyLift parameter from parseLaurel — tests validate raw Laurel semantics directly.
  • Removed unused simp warnings in LaurelDenoteMono.lean.

Testing

  • lake build — no warnings from new files
  • lake test — all tests pass
  • Transform preservation: 82/94 tests match direct-mode output

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…prehensive test suite

Semantics (Strata/Languages/Laurel/):
- LaurelSemantics: Shared type definitions (values, stores, heaps, outcomes)
  and helper functions (evalPrimOp, bindParams, store/heap operations)
- LaurelDenote: Fuel-based denotational interpreter
- LaurelDenoteMono: Fuel monotonicity proof for the denotational interpreter

Concrete evaluator:
- LaurelConcreteEval: Concrete evaluator for Laurel programs via
  denotational semantics

Test suite (StrataTest/Languages/Laurel/):
- LaurelDenoteUnitTest: Unit tests for denotational interpreter
- LaurelDenoteIntegrationTest: Integration scenario tests
- LaurelDenotePropertyTest: Plausible property-based tests
- LaurelConcreteEvalTest: Concrete evaluator tests using Laurel parser
- ConcreteEval/ module hierarchy with shared TestHelper:
  Primitives, Arithmetic, BooleanOps, ControlFlow, SideEffects,
  Procedures, Aliasing, Variables, HeapObjects, Recursion,
  TypeOps, Verification, EdgeCases

Also fixes LiftImperativeExpressions refactoring and minor test updates.
…onstructs

Remove the wildcard catch-all from evalPrimOp and replace it with
explicit per-operation cases so that adding a new Operation constructor
forces a build error. This prevents new operations from silently
returning none.

Add short-circuit handling for AndThen, OrElse, and Implies in
denoteStmt instead of evalPrimOp, since these operators must not
eagerly evaluate their second argument. This enables proper
side-effect semantics where the second operand is only evaluated
when needed.

Add DivT (truncation division) and ModT (truncation modulus) cases
to evalPrimOp using Int.tdiv and Int.tmod respectively.

Update the fuel monotonicity proof (LaurelDenoteMono) to handle the
new short-circuit cases in denoteStmt by case-splitting on the
operation and argument list structure.

Fix pre-existing test failures:
- BooleanOps tests now pass (AndThen/OrElse have semantics)
- LaurelConcreteEvalTest short-circuit tests now pass
- LaurelDenoteUnitTest short-circuit tests use correct operations
- DivT test updated from stuck to returning correct result

Add new tests:
- DivT/ModT with positive and negative operands
- DivT/ModT division by zero (stuck)
- evalPrimOp unit tests for DivT, ModT, AndThen, OrElse, Implies
- Truncation division edge cases (negative dividend/divisor)
…exhaustive over all Laurel constructs

- Add DivT/ModT to arithTotalProp property test (bug fix)
- Add Implies to short-circuit ops return none section in unit tests
- Add TODO for extracting shared tactic in LaurelDenoteMono.lean
- Document And/Or (eager) vs AndThen/OrElse/Implies (short-circuit)
  distinction in evalPrimOp
- Fix stale comment in BooleanOps.lean referencing And/Or instead of
  AndThen/OrElse for short-circuit semantics
…osition

Test 3 expected `returned: 42` but got `returned: 0` because the
block expression `{x := 42; x}` was wrapped in a procedure call
`id(...)`. The lift pass correctly lifts the call to a temporary
before the block's side effects execute, so `id(x)` was called with
x still equal to 0.

Rewrite Test 3 to place the block expression directly in return
position (`return {x := 42; x}`) where the lift pass produces the
correct order: assign x := 42, then return x. The expected output
`returned: 42` remains correct.
The denotational interpreter handles blocks in expression/argument
position natively, making the liftExpressionAssignments pass
unnecessary for concrete evaluator tests. Remove the applyLift
parameter, its conditional logic, and the LiftImperativeExpressions
import. Update all 58 call sites and related doc comments.
Delete three no-op `simp only [denoteStmt] at heval ⊢` lines in the
AndThen, OrElse, and Implies cases of denoteStmt_fuel_mono. The match
on the following line already unfolds denoteStmt, making these simp
calls redundant and triggering unused-argument warnings during build.

Remove unused simp warnings in LaurelDenoteMono.lean
Add infrastructure to run every string-based ConcreteEval test through
the full Laurel→Laurel lowering pipeline, exposing which tests break
after the lowering passes.

Changes:
- Add lowerLaurelToLaurel helper to LaurelToCoreTranslator.lean that
  extracts the Laurel→Laurel pass sequence (stops before Laurel→Core)
- Add parseLaurelTransformed to TestHelper.lean using the new helper
- Create TransformPreservation.lean with 94 tests mirroring all
  string-based ConcreteEval tests
- Update ConcreteEval.lean barrel file

Results: 77/94 tests pass (output matches direct mode).
17 tests fail due to two known categories:
- heapParameterization (13 tests): composite types/heap objects
- liftExpressionAssignments (4 tests): nested calls and eval order

Failing tests document actual (wrong) output with explanatory comments.
- Fix failure count breakdown: 12 heapParameterization / 5
  liftExpressionAssignments (was incorrectly 13/4)
- Fix SideEffects Test 1 comment: the lifting pass traverses arguments
  right-to-left creating snapshot variables, so both block expressions
  independently see the original x=0, yielding add(0,0)=0
- Remove duplicate resolve call in lowerLaurelToLaurel (no-op second
  call after modifiesClausesTransform, mirrored from translate)
- Add TODO to refactor translate to call lowerLaurelToLaurel internally
  to avoid duplicated Laurel→Laurel pass pipeline
Two bugs in the StaticCall case of transformExpr:

1. Nested call ordering: when an imperative call had arguments that were
   themselves imperative calls (e.g. add(mul(2,3), mul(4,5))), the outer
   call's temp variable was declared before the inner calls' temps.
   Fix: snapshot arg prepends before adding the call's own prepends,
   then restore them so inner calls are declared first.

2. Side-effect evaluation order: when arguments contained blocks with
   side effects (e.g. add({x:=1;x}, {x:=x+10;x})), the lifted code
   didn't preserve left-to-right evaluation order. The right arg's
   side effects could clobber variables read by the left arg's result.
   Fix: isolate each arg's prepends, capture side-effectful args in
   temporaries, and emit prepend groups in left-to-right order using
   the cons-based stack (right-to-left groups pushed first so left
   groups end up on top).

All 5 liftExpressionAssignments failures in TransformPreservation
tests now pass (82/94 total, remaining 12 are heapParameterization).
@olivier-aws olivier-aws marked this pull request as ready for review March 20, 2026 20:38
@olivier-aws olivier-aws requested a review from a team March 20, 2026 20:38
| none => none

-- Verification Constructs
-- The relational semantics requires assert/assume conditions to be pure
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Runtime compilation might erase assertions and assumptions, and to do that safely their bodies should not have side-effects. The semantics here should check that if the heap/store changes, none is returned.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed in principle. The interpreter already discards the post-condition store/heap and returns the originals, so side effects in assert/assume conditions are invisible to subsequent code. However, we cannot detect impure conditions at runtime because LaurelStore and LaurelHeap are function types (String ? Option LaurelValue and Nat ? Option (...)) which have no BEq or DecidableEq instance in Lean. There is no way to compare two closures for equality. The purity requirement must be enforced statically (e.g., by a well-formedness check on the AST before interpretation) rather than dynamically. Added documentation clarifying this design constraint.

Copy link
Copy Markdown
Contributor

@keyboardDrummer keyboardDrummer Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, we cannot detect impure conditions at runtime because LaurelStore and LaurelHeap are function types

What if you use data instead of functions to represent the store and heap? Out of scope for this PR though.


-- Quantifiers (delegated to δ)
| .Forall name ty body =>
match δ σ (.Forall name ty body) with
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems you're leaving the meaning of quantifiers unspecified. How about you enable δ to return a List of all possible values of a particular type? Then you can use the fuel to iterate over these values and check the property.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea. The current δ callback was designed as an escape hatch for constructs the interpreter can't handle natively. Enumerating all values of a type using fuel would work for finite types (bool, bounded int ranges) but not for unbounded ones (int, string). I could add a typeValues : LaurelType → Option (List LaurelValue) field to δ that returns some for enumerable types and none otherwise.

Happy to add this in a follow-up PR.

Copy link
Copy Markdown
Contributor

@keyboardDrummer keyboardDrummer Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but not for unbounded ones

Why not? If you run out of fuel then the quantifier holds, so the proof would be of the form: for whatever amount of fuel you start with, the quantifier's value does not change before and after phase P of the compilation.

| some v => some (.normal v, σ, h)
| none => none

-- Specification Constructs (delegated to δ)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does delegating to δ mean? Shouldn't these all return a not yet implemented error?

SPDX-License-Identifier: Apache-2.0 OR MIT
-/

import Strata.Languages.Laurel.Laurel
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The separation between LaurelConcreteEval, LaurelDenote and LaurelSemantics seems arbitrary to me.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LaurelSemantics defines shared types and helpers (values, stores, heaps, evalPrimOp). LaurelInterpreter (formerly LaurelDenote) is the fuel-based interpreter that uses those types. LaurelConcreteEval bridges the interpreter to Laurel.Program by building ProcEnv from program declarations and running main. The split keeps the interpreter independent of the program-level structure. Added module-level doc comments making this layering explicit.

Happy to merge some of these if you feel strongly about it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't feel strongly but IMO the "program level structure" is as much a part of the language as any other part of the AST.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and I have a follow-up to this PR that adds an "interpretProgram" method to define the semantics at the program level. I'll update the PR with this part

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel more strongly about this - I think there should be no distinction between the "interpreter" and the "evaluator" anywhere, they should be labeled as the same concept that could potentially be implemented for any elements of a dialect.

I would vote for using "interpreter" in this case, only because "evaluator" is slightly ambiguous in Lean since it has `#eval.

let model := (← get).model
let seqArgs ← args.reverse.mapM transformExpr
let seqCall := ⟨.StaticCall callee seqArgs.reverse, md⟩
-- Process arguments right-to-left (for substitution mechanism).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems awfully complicated. Isn't the change on line 267 and 272 sufficient? If not, can you explain why?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Investigated. The simple change (just lines 267/272) handles the basic case where an imperative call's temp variable needs to be declared after its arguments' temps. But it does NOT handle the evaluation-order bug for side-effectful arguments like add({x:=1;x}, {x:=x+10;x}). In that case we need to:

  1. Isolate each argument's prepended statements (so arg2's side effects don't leak into arg1's scope)
  2. Capture side-effectful results in temporaries (so the final call uses snapshot values)
  3. Emit the prepend groups in left-to-right order (so side effects execute in program order)

The simple fix only addresses point (1). Points (2) and (3) require the full complexity. Added a detailed comment in the code explaining which test cases require each piece.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this example add({x:=1;x}, {x:=x+10;x}), could you write down the before this PR and after this PR translation?

I thought the before would already be:

x := 1;
var x1 := x;
x := x + 10
add(x1, x);

Is something wrong with that, or was it not producing that?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my change:

procedure main() returns 
 ()
 deterministic
 {
   var x: int := 0;
   x := 1;
   var $c_1: int;
   $c_1 := x;
   x := x + 10;
   var $c_0: int;
   $c_0 := x;
   var $c_2: void;
   $c_2 := add($c_1, $c_0);
   return $c_2
 }

Before my change:

procedure main() returns 
()
deterministic
{
 var x: int := 0;
 var $c_0: void;
 $c_0 := add(x, x);
 x := x + 10;
 x := 1;
 return $c_0
}

Copy link
Copy Markdown
Contributor

@keyboardDrummer keyboardDrummer Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the above code changes are not the correct solution for the bug. I made a fix that I think is correct here: #672

Critical question: did all the changes in LiftImperativeExpressions in this PR make sense to you? I think before submitting a PR the answer has to be yes. I looked at the lifting changes in this PR but I couldn't make sense of them that's why I investigated the bug separately.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the Laurel program that's failing for you on #672?

I also tried:

function add(x: int, y: int): int {
  x + y
};
procedure repeatedBlockExpressions() {
  var x: int := 2;
  var y: int := { x := 1; x } + { x := x + 10; x };
  var z: int := add({ x := 1; x }, { x := x + 10; x });
  assert y == 1 + 11;
  assert z == 1 + 11
};

this passes verification, and returns:

procedure repeatedBlockExpressions () returns ()
{
  $body: {
    var x : int := 2;
    var $x_1 : int := x;
    x := 1;
    var $x_0 : int := x;
    x := x + 10;
    var y : int := $x_0 + x;
    var $x_3 : int := x;
    x := 1;
    var $x_2 : int := x;
    x := x + 10;
    var z : int := add($x_2, x);
    assert [|assert(2376)|]: y == 1 + 11;
    assert [|assert(2398)|]: z == 1 + 11;
    }
  };

which seems good

The proposed change does make sense to me because I reviewed several iterations

Sorry for questioning that

Copy link
Copy Markdown
Contributor Author

@olivier-aws olivier-aws Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I integrated the change from #672 into this PR after reverting the change on LiftImperative pass. And then the test

procedure add(a: int, b: int) { return a + b };
procedure main() {
  var x: int := 0;
  return add({x := 1; x}, {x := x + 10; x})
};

would not produce the correct IR.
Maybe it is because of the return statement.

Sorry for questioning that

Don't be, it is very sound to always look for the simplest solution, especially with LLMs that tend to find too complex solutions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#672 has some issues and I'm working on resolving them. I'll update when resolved and I'll include the above test-case as well.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#672 was still missing the fixes on lines 267/272 from this PR. With those, the program:

procedure add(a: int, b: int) { return a + b };
procedure main() {
  var x: int := 0;
  return add({x := 1; x}, {x := x + 10; x})
};

passes as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, one #672 is merged I will update the branch and revert these changes then.

Rename "denotational interpreter" to "interpreter" throughout:
- LaurelDenote.lean -> LaurelInterpreter.lean
- LaurelDenoteMono.lean -> LaurelInterpreterMono.lean
- denoteStmt/denoteBlock/denoteArgs -> interpStmt/interpBlock/interpArgs
- Test files renamed accordingly
- All doc comments updated

Rename single-character variable names per reviewer request:
- h -> heap, σ -> store in function signatures and bodies
- π -> procEnv, σ₀/h₀ -> initStore/initHeap in evalProgram

Add documentation:
- Module layering explanation in LaurelSemantics.lean
- δ (LaurelEval) callback purpose in LaurelInterpreter.lean
- TODO for quantifier type enumeration via δ
- TODO for Old pre-state snapshot threading
- TODO for DiagnosticModel in Outcome

Document assert/assume purity enforcement:
- Conditions are evaluated but post-condition store/heap is discarded
- Original store/heap returned, matching erasable-construct semantics

Add detailed comment in LiftImperativeExpressions.lean explaining
why the StaticCall case requires full arg-isolation complexity
(Bug 1: nested call ordering, Bug 2: evaluation order preservation)
rather than the simpler reordering approach.
Copy link
Copy Markdown

@robin-aws robin-aws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just top level feedback so far. I'm inclined to look closer after you merge the interpreter and the evaluator, if you agree, as I believe that will cut some of the documentation and datatype complexity in half.

SPDX-License-Identifier: Apache-2.0 OR MIT
-/

import Strata.Languages.Laurel.Laurel
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel more strongly about this - I think there should be no distinction between the "interpreter" and the "evaluator" anywhere, they should be labeled as the same concept that could potentially be implemented for any elements of a dialect.

I would vote for using "interpreter" in this case, only because "evaluator" is slightly ambiguous in Lean since it has `#eval.

Comment on lines +91 to +93
| noMain
| noBody
| stuck (msg : String)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems simpler to just have the stuck alternative, rather than picking out noMain and noBody as more specific ways to get stuck.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you might be thinking that noMain and noBody are program errors, whereas stuck is an interpreter error? I might buy that, but in that case it would be better to encode the two categories of errors more explicitly in the result naming.


/-- Run a program and classify the result. Delegates to `evalProgram` for
the core evaluation, preserving the `noMain` / `noBody` distinction. -/
def runProgram (prog : Program) (fuel : Nat := 10000) : EvalResult :=
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make the default fuel something like 2^64 - 1 (or whatever the right value that stops short of expensive bignumber operations)? We only have the fuel so we can still make it technically guaranteed to terminate, but in practice having the interpreter terminate by running out of fuel is never useful information. This number is probably far too low to be a reasonable "timeout" value on large examples, so I think we're better off letting the Lean process run "forever" and we can kill the process if actually necessary.

interpreter that:

1. Can be `#eval`'d on concrete programs
2. Is deterministic by construction (it is a function, not a relation)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the dumb question if this is known, but is Laurel itself deterministic? I had thought not, by design, in which case how is this point true?

can be evaluated — any terminating program can be evaluated with
sufficient fuel.

When fuel reaches zero, the interpreter returns `none` (indistinguishable
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I got confused because LaurelResult has stuck, but I'm again getting confused between the interpreter and the evaluator. Another point that they should be the same thing IMO.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should probably go under docs instead.


## Limitations

1. **Fuel exhaustion is indistinguishable from stuck.** When fuel
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not seem true of the interpreter, since LaurelResult DOES distinguish.

olivier-aws and others added 6 commits March 30, 2026 13:55
Delete 4 redundant test files (~1800 lines) whose coverage is
subsumed by the ConcreteEval/ end-to-end tests:
- LaurelInterpreterTest.lean (66 hand-built AST tests)
- LaurelInterpreterUnitTest.lean (120 hand-built AST tests)
- LaurelInterpreterIntegrationTest.lean (27 hand-built AST tests)
- LaurelConcreteEvalTest.lean (16 tests, mix of parsed and AST)

Create LaurelInterpreterInternals.lean with tests that cannot be
exercised through parsed Laurel source:
- interpPrimop exhaustive coverage (all ops, type mismatches, arity)
- Stuck/error states (malformed AST the parser would reject)
- δ delegation (Forall, Exists, Old, Fresh, Assigned, ContractOf)
- Short-circuit semantics (AndThen/OrElse/Implies)
- Fuel exhaustion
- Programmatic AST tests for interpProgram

Also fix stale LaurelConcreteEval import in Strata.lean.

Remaining test files:
- ConcreteEval/ (13 modules) — primary end-to-end test suite
- LaurelInterpreterInternals — internal API tests
- LaurelInterpreterPropertyTest — Plausible property-based tests
Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5 to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-python/releases">actions/setup-python's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade to node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1164">actions/setup-python#1164</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Enhancements:</h3>
<ul>
<li>Add support for <code>pip-version</code> by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1129">actions/setup-python#1129</a></li>
<li>Enhance reading from .python-version by <a
href="https://github.com/krystof-k"><code>@​krystof-k</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/787">actions/setup-python#787</a></li>
<li>Add version parsing from Pipfile by <a
href="https://github.com/aradkdj"><code>@​aradkdj</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1067">actions/setup-python#1067</a></li>
</ul>
<h3>Bug fixes:</h3>
<ul>
<li>Clarify pythonLocation behaviour for PyPy and GraalPy in environment
variables by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1183">actions/setup-python#1183</a></li>
<li>Change missing cache directory error to warning by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1182">actions/setup-python#1182</a></li>
<li>Add Architecture-Specific PATH Management for Python with --user
Flag on Windows by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1122">actions/setup-python#1122</a></li>
<li>Include python version in PyPy python-version output by <a
href="https://github.com/cdce8p"><code>@​cdce8p</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1110">actions/setup-python#1110</a></li>
<li>Update docs: clarification on pip authentication with setup-python
by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1156">actions/setup-python#1156</a></li>
</ul>
<h3>Dependency updates:</h3>
<ul>
<li>Upgrade idna from 2.9 to 3.7 in /<strong>tests</strong>/data by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/843">actions/setup-python#843</a></li>
<li>Upgrade form-data to fix critical vulnerabilities <a
href="https://redirect.github.com/actions/setup-python/issues/182">#182</a>
&amp; <a
href="https://redirect.github.com/actions/setup-python/issues/183">#183</a>
by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1163">actions/setup-python#1163</a></li>
<li>Upgrade setuptools to 78.1.1 to fix path traversal vulnerability in
PackageIndex.download by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1165">actions/setup-python#1165</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/1181">actions/setup-python#1181</a></li>
<li>Upgrade <code>@​actions/tool-cache</code> from 2.0.1 to 2.0.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/1095">actions/setup-python#1095</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/krystof-k"><code>@​krystof-k</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/787">actions/setup-python#787</a></li>
<li><a href="https://github.com/cdce8p"><code>@​cdce8p</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/1110">actions/setup-python#1110</a></li>
<li><a href="https://github.com/aradkdj"><code>@​aradkdj</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/1067">actions/setup-python#1067</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v6.0.0">https://github.com/actions/setup-python/compare/v5...v6.0.0</a></p>
<h2>v5.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Workflow updates related to Ubuntu 20.04 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1065">actions/setup-python#1065</a></li>
<li>Fix for Candidate Not Iterable Error by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1082">actions/setup-python#1082</a></li>
<li>Upgrade semver and <code>@​types/semver</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1091">actions/setup-python#1091</a></li>
<li>Upgrade prettier from 2.8.8 to 3.5.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1046">actions/setup-python#1046</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1081">actions/setup-python#1081</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v5.6.0">https://github.com/actions/setup-python/compare/v5...v5.6.0</a></p>
<h2>v5.5.0</h2>
<h2>What's Changed</h2>
<h3>Enhancements:</h3>
<ul>
<li>Support free threaded Python versions like '3.13t' by <a
href="https://github.com/colesbury"><code>@​colesbury</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/973">actions/setup-python#973</a></li>
<li>Enhance Workflows: Include ubuntu-arm runners, Add e2e Testing for
free threaded and Upgrade <code>@​action/cache</code> from 4.0.0 to
4.0.3 by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1056">actions/setup-python#1056</a></li>
<li>Add support for .tool-versions file in setup-python by <a
href="https://github.com/mahabaleshwars"><code>@​mahabaleshwars</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1043">actions/setup-python#1043</a></li>
</ul>
<h3>Bug fixes:</h3>
<ul>
<li>Fix architecture for pypy on Linux ARM64 by <a
href="https://github.com/mayeut"><code>@​mayeut</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1011">actions/setup-python#1011</a>
This update maps arm64 to aarch64 for Linux ARM64 PyPy
installations.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/actions/setup-python/commit/a309ff8b426b58ec0e2a45f0f869d46889d02405"><code>a309ff8</code></a>
Bump urllib3 from 2.6.0 to 2.6.3 in /<strong>tests</strong>/data (<a
href="https://redirect.github.com/actions/setup-python/issues/1264">#1264</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/bfe8cc55a7890e3d6672eda6460ef37bfcc70755"><code>bfe8cc5</code></a>
Upgrade <a href="https://github.com/actions"><code>@​actions</code></a>
dependencies to Node 24 compatible versions (<a
href="https://redirect.github.com/actions/setup-python/issues/1259">#1259</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/4f41a90a1f38628c7ccc608d05fbafe701bc20ae"><code>4f41a90</code></a>
Bump urllib3 from 2.5.0 to 2.6.0 in /<strong>tests</strong>/data (<a
href="https://redirect.github.com/actions/setup-python/issues/1253">#1253</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/83679a892e2d95755f2dac6acb0bfd1e9ac5d548"><code>83679a8</code></a>
Bump <code>@​types/node</code> from 24.1.0 to 24.9.1 and update macos-13
to macos-15-intel ...</li>
<li><a
href="https://github.com/actions/setup-python/commit/bfc4944b43a5d84377eca3cf6ab5b7992ba61923"><code>bfc4944</code></a>
Bump prettier from 3.5.3 to 3.6.2 (<a
href="https://redirect.github.com/actions/setup-python/issues/1234">#1234</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/97aeb3efb8a852c559869050c7fb175b4efcc8cf"><code>97aeb3e</code></a>
Bump requests from 2.32.2 to 2.32.4 in /<strong>tests</strong>/data (<a
href="https://redirect.github.com/actions/setup-python/issues/1130">#1130</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/443da59188462e2402e2942686db5aa6723f4bed"><code>443da59</code></a>
Bump actions/publish-action from 0.3.0 to 0.4.0 &amp; Documentation
update for pi...</li>
<li><a
href="https://github.com/actions/setup-python/commit/cfd55ca82492758d853442341ad4d8010466803a"><code>cfd55ca</code></a>
graalpy: add graalpy early-access and windows builds (<a
href="https://redirect.github.com/actions/setup-python/issues/880">#880</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/bba65e51ff35d50c6dbaaacd8a4681db13aa7cb4"><code>bba65e5</code></a>
Bump typescript from 5.4.2 to 5.9.3 and update docs/advanced-usage.md
(<a
href="https://redirect.github.com/actions/setup-python/issues/1094">#1094</a>)</li>
<li><a
href="https://github.com/actions/setup-python/commit/18566f86b301499665bd3eb1a2247e0849c64fa5"><code>18566f8</code></a>
Improve wording and &quot;fix example&quot; (remove 3.13) on testing
against pre-releas...</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/setup-python/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-python&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>
*Issue #, if available:*

*Description of changes:*

This PR fixes a Boole lowering bug where `returns ()` procedures were
silently dropped because nested `return []` statements escaped
`toCoreDecls` instead of producing empty local lists.

It also includes a small Boole frontend cleanup and test refresh:
    - add Boole string type lowering to Core
- normalize a few remaining nested branch returns in `Verify.lean` from
`return` to `pure`
- update Boole regression tests whose expected obligations changed once
unit-return procedures were actually lowered and verified
- stabilize several feature-request seeds and update their comments/docs
to reflect current implementation priorities
    - added a regression test for the bug fix

The net effect is that Boole procedures with no outputs are lowered
correctly, the relevant regression tests are now meaningful again, and
the Boole test suite is back in sync with the frontend behavior.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
Add more integration tests for PyAnalyzeLaurel


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>
tautschnig and others added 30 commits March 30, 2026 16:37
Add infrastructure to run the full Python → Laurel → Core → SMT
verification pipeline from inline Python source strings in Lean tests.

- StrataTest/Languages/Python/TestExamples.lean: New processPythonFile
function that compiles Python source to Ion via the external tool,
translates through Laurel to Core with Python-specific factory functions
(ReFactory), and returns verification diagnostics.

- StrataTest/Util/TestDiagnostics.lean: Extend
parseDiagnosticExpectations to recognize '#' comments (in addition to
'//') so that Python source embedded in Lean strings can use # comment
expectations.

- StrataTest/Languages/Python/VerifyPythonTest.lean: Five tests
exercising the new infrastructure: passing assertions, failing
assertions, mixed, line-number accuracy, and annotated-style with
testInputWithOffset.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>
The existing Stmt-to-GOTO translation (ToCProverGOTO.lean) interleaves
control-flow lowering — emitting conditional jumps, managing forward
references, patching targets — with CBMC-specific concerns like source
locations and instruction encoding. This makes the code hard to extend
to additional backends and difficult to reason about independently.

*Description of changes:*

This commit introduces a CFG-based path that separates those
responsibilities:

  Core Stmt → CFG (via stmtsToCFG) → CProverGOTO instructions

The first step (stmtsToCFG from StructuredToUnstructured) handles
control-flow lowering once, producing a backend-agnostic CFG. The second
step (detCFGToGotoTransform, added here) is a straightforward mapping
from CFG blocks to GOTO instructions. This separation makes it easier to
target additional backends — each only needs to consume the CFG — and to
reason about control-flow lowering independently.

Three gaps remain before this can replace the direct path:

1. Source locations on control flow: StructuredToUnstructured discards
MetaData from ite/loop/block/exit statements, so transfer commands in
the CFG carry no source location information.

2. Loop contracts: the direct path emits #spec_loop_invariant and
#spec_decreases as named sub-expressions on the backward-edge GOTO (for
CBMC DFCC). In the CFG, invariants become plain asserts and measures are
discarded.

3. Core.CmdExt.call: this translation handles Imperative.Cmd only. Core
procedure calls need a command translator analogous to coreStmtsToGoto
in CoreToGOTOPipeline.lean.

No changes to StructuredToUnstructured are needed: the existing
stmtsToCFG already works with Core.Command because its CmdT parameter is
generic, and Core.Command has the required HasPassiveCmds and HasBool
instances.

Tests cover sequential commands, if-then-else, loops (including
back-edge verification), empty programs, assert/assume, and havoc.


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Co-authored-by: Shilpi Goel <shigoel@gmail.com>
Co-authored-by: Aaron Tomb <aarotomb@amazon.com>
Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>
…#666)

## Summary

- Add `ResultClassifier` structure with pluggable
`isSuccess`/`isFailure` predicates (defaults preserve existing behavior)
so callers can control what counts as a failure without touching
`printPyAnalyzeSummary`.
- In bug-finding mode, narrow `isFailure` to `alwaysFalseAndReachable`
only — the outcome where validity confirms the property is always false
on a reachable path. Other failure modes become inconclusive.
- Apply the classifier consistently: both the per-VC `"Assertion
failed"` prefix and the summary counts/exit-code now use the same
`classifier.isFailure`.
- `nInconclusive` is now the remainder (total minus success, failure,
unreachable, impl-error), so narrowing `isFailure` automatically widens
inconclusive.
- Add `VCOutcome.hasSMTError` / `VCResult.hasSMTError` to
`Verifier.lean` alongside the other nine-case predicates, and use them
to simplify `nImplError` in `printPyAnalyzeSummary`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…#647)

PySpec commands can now work with an entire directory of PySpec files
using standard Python conventions: `__init__.py` for package modules and
relative imports (`from .Foo import Bar`). The CLI and API take dotted
module names (e.g., `servicelib.Storage`) instead of individual Ion file
paths, resolved against a spec directory.

- Previously, PySpec only understood absolute `from X import Y` imports.
Relative imports, bare imports, and `__init__.py` packages all produced
errors. This PR adds support for all of these so that real-world Python
codebases with typical project layouts can be translated.
- When an imported module itself contains imports, those are now
resolved relative to the imported file's own directory, not the
top-level search path. This matches Python's import system and is
essential for packages where `__init__.py` imports from sibling modules.
- `translateFile` now takes an explicit `searchPath` parameter instead
of deriving it implicitly, giving callers control over where imports are
resolved from.
- The existing `dispatch_test` fixture is converted from a flat
`servicelib.py` to a proper Python package (`servicelib/__init__.py`
with sibling modules), so the test suite actually exercises
package-level imports end-to-end.
- A new `RelativeImportTest` suite covers basic relative imports,
aliased imports, mixed absolute+relative, bare imports, dotted package
paths, `__init__.py` resolution, and negative cases (unsupported level-2
imports, missing modules).

## SimpleAPI Changes

**Breaking:** `pyAnalyzeLaurel` and `pyTranslateLaurel` now take
**module names** instead of Ion file paths.

| Parameter | Before | After |
|-----------|--------|-------|
| `dispatchPaths : Array String` | Ion file paths | `dispatchModules :
Array String` — dotted module names (e.g., `"servicelib"`) |
| `pyspecPaths : Array String` | Ion file paths | `pyspecModules : Array
String` — dotted module names (e.g., `"servicelib.Storage"`) |
| _(new)_ | — | `specDir : System.FilePath := "."` — directory
containing compiled PySpec Ion files |

Module names are resolved against `specDir` using
`ModuleName.specIonPath`, which tries canonical paths
(`specDir/servicelib/Storage.pyspec.st.ion`) then falls back to
`__init__` layout (`specDir/servicelib/__init__.pyspec.st.ion`).

`resolveAndBuildLaurelPrelude` has the same parameter changes
(`dispatchPaths` → `dispatchModules`, `pyspecPaths` → `pyspecModules`).

## Strata Exe Changes

**Breaking:** The `--dispatch` and `--pyspec` flags now take **module
names** instead of Ion file paths for these commands:
- `pyAnalyzeLaurel`
- `pyTranslateLaurel`
- `pyAnalyzeLaurelToGoto`

New flag: `--spec-dir <dir>` specifies where compiled PySpec Ion files
live (defaults to `.`). The command exits with an error if the directory
does not exist.

**Before:**
```sh
strata pyAnalyzeLaurel --dispatch path/to/dispatch.pyspec.st.ion --pyspec path/to/Storage.pyspec.st.ion file.ion
```

**After:**
```sh
strata pyAnalyzeLaurel --spec-dir path/to/pyspec --dispatch servicelib --pyspec servicelib.Storage file.ion
```

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
*Description of changes:*

Adds the `--keep-all-files dir` to pyAnalyzeLaurel that creates the dir
directory and stores a text representation of the Laurel and Core
programs in it, plus all the SMT queries.

Manually tested.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Mikaël Mayer <MikaelMayer@users.noreply.github.com>
Co-authored-by: Mikael Mayer <mimayere@amazon.com>
This PR implements several previously-opaque functions in the
`Strata/SimpleAPI.lean` and consolidates the verification entry point.

### Changes

Implemented Simple API functions:
- `genericToCore` — translates a generic `Strata.Program` to`
Core.Program` using
`Core.getProgram`
- `genericToLaurel` — translates a generic `Strata.Program` to
`Laurel.Program` using
`Laurel.parseProgram`
- `laurelToCore` — translates a `Laurel.Program` to `Core.Program` using
the existing
Laurel translator
- `Core.inlineProcedures` — inlines procedure calls with a configurable
predicate
controlling which calls to inline
- `Core.loopElimUsingContract` — eliminates loops by replacing them with
invariant-
based assertions/assumptions
- `Core.callElimUsingContract` — eliminates procedure calls by replacing
them with
contract-based assertions/assumptions
- `Core.verifyProgram` — runs deductive verification on a Core program
(replaces
the standalone `verifyCore`)

Structural changes:
- `Core.InlineTransformOptions` is now a concrete structure with an
optional
`doInline` predicate, rather than an opaque type
- Removed the standalone `verifyCore` function; its logic is now in
`Core.verifyProgram` under the `Core` namespace
- Renamed `Core.loopElimWithContract` → `Core.loopElimUsingContract` and
`Core.callElimWithContract` → `Core.callElimUsingContract`
- Trimmed unnecessary imports from `SimpleAPI.lean` and
`StrataMain.lean`
- Fixed a docstring typo (Core dialect → Laurel dialect for
`genericToLaurel`)

Transform module workarounds:
- Added top-level aliases `coreCallElimCmd` and `coreInlineCallCmd` in
`CallElim.lean`
and `ProcedureInlining.lean` to work around the Core module/dialect name
collision

Dogfooding in `StrataMain`:
- Updated `pyAnalyzeCommand` to use `Core.inlineProcedures` instead of
calling the
transform directly
- Updated `pyAnalyzeLaurelCommand` to use `Core.verifyProgram` instead
of
`Strata.verifyCore`

### Motivation

The Simple API was originally sketched with opaque declarations to
define the
intended surface. This PR makes the majority of those functions
concrete, moving
the API closer to being usable by external clients. The remaining
`noncomputable opaque` functions depend on DDM functionality that is not
yet
available.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>
*Issue #, if available:*

*Description of changes:*

Fixes Boole’s `old(...)` lowering in `Verify.lean` so it works on
compound expressions, not just bare identifiers.

Boole already lowers `old` by mapping free variables to their pre-state
names in `Core`. This PR extends that existing lowering to compound
expressions needed by the new regression test `stack_array_based`,
without changing Core’s representation of old-state values.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>
- Make pyAnalyzeLaurel only check non-prelude asserts
- Add check of prelude asserts as build-time check
- Add check for missing source location to CI



By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Support `**dict` expansion at resolved method call sites

Python code often builds a parameter dictionary and passes it to a
method
call using `**`, e.g. `client.start_job_run(**params)`. When the target
method had a known signature (via pyspec), the translator would fail
with
`'method' called with unknown keyword arguments: []` because it couldn't
match the unnamed `**` keyword to individual named parameters.

This PR detects `**dict` at call sites of resolved methods and expands
the
dictionary into individual arguments: required parameters use
`DictStrAny_get` (preserving the precondition that the key must exist),
and
optional parameters use a new prelude function `DictStrAny_get_or_none`
(returns `None` if the key is absent).

Two end-to-end tests were added to the dispatch test suite: one covering
all-required parameters, one covering a mix of required and optional
parameters where the dictionary omits the optional keys. All existing
tests
continue to pass.
## Summary
- Appends `(set-info :final-message "...")` at the end of each generated
SMTLib file, containing the "message" metadata field from the
obligation. Emitted once per obligation, after all check-sat commands.
- Promotes MetaData field definitions from `def` to `abbrev` with
`@[match_pattern]` so they can be used in pattern matches.
- Adds `MetaData.message` field definition.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…correctness of DetToNondet with loops (#660)

**Specification of Transformation**: Strata/Transform/Specification.lean

- Define two notions of assertion validity (`AssertValid` via
reachability, `Hoare.Triple` via partial-correctness triples) and prove
them equivalent (`hoareTriple_implies_assertValid`,
`assertValid_implies_hoareTriple`).
- Define `Sound` and `Overapproximates` predicates for transformation
correctness with `WellFormedSemanticEvalBool`/`Val` preconditions.
- Prove `overapproximates_triple`, composition, identity, and
block-level `overapproximates_stmts` (using `noFuncDecl` for eval
preservation).
- Add structural Hoare rules: `cmd`, `seq_cons`, `ite`, `skip`, block
lifting.
- Add `Lang P` structure abstracting statement/config types and step
relation for generality.

**The new `hasFailure` flag in a program state, and small-step of
Imperative**

- Add `Env P` record bundling store, evaluator, and cumulative
`hasFailure` flag. Refactor `Config`, `StepStmt`, `EvalStmt`,
`EvalBlock` to use `Env`.
- Properly amend small-step semantics of Imperative; copied verbatim
from #509, @MikaelMayer

**Correctness proof**: DetToNondetCorrect.lean
- Prove `detToNondet_overapproximates` using the `Overapproximates`
predicate defined above.
- Defined small-step semantics of NondetStmt. This also supports loops.
- Fix DetToNondet pass to optionally return `none` if funcDecl or exit
is met.

*Issue #, if available:*

*Description of changes:*


By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The pipeline crashes with a type unification error when a composite
value flows into an `Any`-typed position (e.g., `str(e)` on an exception
object in a `try/except` handler).

This PR fixes the issue by:

1. Generating `$composite_to_string_<type>` and
`$composite_to_string_any_<type>` functions for each composite type.
These take a composite parameter, so heap parameterization naturally
adds a `Heap` dependency — preventing the verifier from unsoundly
assuming referential transparency for `str()` calls across heap
mutations.

2. When `str()` is called on a composite-typed variable (e.g., `str(e)`
where `e: PythonError`), the translation emits
`$composite_to_string_any_<Type>(e)` instead of `to_string_any(e)`,
keeping the composite type and routing through the heap-aware function.

Existing tests pass. New test added for try/except with `str(e)` on
PythonError.
## Fix `with` statement variable scoping in Python-to-Laurel translation

Python `with` blocks (e.g., `with open(...) as f:`) were placing
variable declarations inside a Laurel `Block`, which scopes them in
Core. When a function contained multiple `with` blocks reusing the same
variable name, the second block would generate a `havoc` for a variable
that wasn't declared at the outer Core scope, causing a "Cannot havoc
undeclared variable" type-checking error.

The fix hoists the manager and `as`-variable declarations out of the
`Block`, matching Python's scoping semantics where `with`-block
variables remain visible after the block ends.

New test added in `VerifyPythonTest.lean` exercising multiple `with`
blocks with the same variable name.
…pe system, and class method emission (#664)

## Summary

Three interrelated changes to the Python→Laurel→Core pipeline that
enable correct handling of instance methods, void procedures, and
user-defined class fields:

1. **Unambiguous `@` separator for instance methods** —
`Storage@put_item` instead of `Storage_put_item`, eliminating collisions
with underscored names
2. **Void procedures return Any** — all PySpec procedures now have
exactly one output (void returns `Any` via `from_none`), removing
special-case void handling at call sites
3. **Hybrid two-kind field type system** — builtin types (int, str,
bool, float) become `Any` on Composite fields; class types (PySpec
services, user classes) stay `Composite`, matching parameter semantics
at call boundaries
4. **User-defined class method emission** — class method definitions
(`Counter@__init__`, `Counter@increment`) are now included in the Laurel
program as static procedures

## PythonToLaurel changes

- `@` separator applied across PythonToLaurel, PythonToCore,
PySpecPipeline, and Specs/ToLaurel
- New `translateFieldType` helper maps builtins→Any,
composites→Composite for field declarations
- `self` parameter typed as `Composite` (not `Any`) in `__init__` and
user methods
- Removed `wrapFieldInAny` for `self.field` reads in method bodies —
builtin fields are already Any-typed
- `unsupportedConstruct` error thrown when Composite→Any coercion is
attempted (e.g., returning a Storage field as str)

## LaurelToCoreTranslator changes

- Statement-position calls now synthesize throwaway `$unused_N`
variables to match output count, fixing Core arity checks for
void-returns-Any procedures
- Single-target `Assign` with `StaticCall` also synthesizes extra LHS
variables for multi-output procedures

## Testing

- New dispatch tests for void assign/init, non-void discard, instance
call result, user class construct, class field assign, and class method
return
- Two negative tests for Composite↔Any kind mismatches:
`test_class_composite_as_any` (Composite field returned as Any) and
`test_class_any_as_composite` (Any value assigned to Composite field).
The latter requires explicit pyspec paths since `IdentifyOverloads`
resolves modules from complete programs with `main`, not isolated class
snippets.
- 7 regenerated expected_laurel files reflecting the new field type
system

## Known limitations

- Composite→Any coercion (e.g., returning a `Storage` field as `str`) is
explicitly rejected with a user-facing error. Coercion functions are
being developed separately.
- Procedure names used as type annotations (e.g., `Storage_put_item` as
a type) silently fall through to `Any` instead of producing an error or
warning.

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 and MIT licenses.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Problem

`Any_to_bool` only accepted `bool`, `none`, `string`, and `int` in its
precondition and body (returning `false` for everything else). This
caused a soundness issue:

After `if results:`, the solver knows `results` must satisfy
`Any_to_bool`'s precondition and its body must return `true`. Since Dict
and List were not handled, the solver could prove that `results` is
never a Dict or List. Then `PIn`'s precondition (`requires isfrom_Dict
|| isfrom_ListAny`) was provably unsatisfiable:

```
❌ always false and is reachable from declaration entry
```

This is a false positive — in Python, non-empty dicts and lists are
truthy.

## Fix

Add `isfrom_Dict` and `isfrom_ListAny` to `Any_to_bool`'s precondition
and body: empty dict/list → `false`, non-empty → `true`.

## Additional changes

- Add `# strata-args:` directive for per-file flags in test Python files
- Deprecate `pyAnalyze` (non-Laurel) in CI — default to
`pyAnalyzeLaurel`

## Test

`test_pin_any.py`: `def test_in_on_any(results: Any): if results: assert
'key' in results`

- **Before**: `assert_assert(124)_calls_PIn_0: ❌ always false and is
reachable`
- **After**: `assert_assert(124)_calls_PIn_0: ❓ unknown`
## Summary
- Always generate `$__bv{N}` names for quantifier-bound variables
instead of reusing user-provided names
- This guarantees globally unique bound variable names across all
quantifiers.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR #681 replaced human-readable bound variable names (e.g. `n`, `m`,
`x`) with globally unique `$__bv{N}` identifiers in SMT output. While
this guarantees uniqueness, most SMT solvers don't require it and the
output becomes harder to read.

This PR restores human-readable names as the default behavior, with
disambiguation (e.g. `x@1`) when names clash with other bound variables
or free variables. The `$__bv{N}` naming is still available via the
`--unique-bound-names` CLI flag for solvers that need it.

The existing clash detection already ensures generated bound variable
names never conflict with free variable names passed through in the
context.

Existing tests pass. No new tests needed since the test expectations
were restored to their pre-#681 values.

Fixes #683
## Summary

Fix a bug where paths were getting an extra slash (e.g., `a//b`) and
this caused output paths to become absolute, failing with permission
errors.

- Rewrite path stripping to handle trailing-slash variations correctly
- Promote internal path failures to hard errors instead of silently
skipping modules

Verified on pyspec files.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…esolve modifies error (#693)

This patch resolves two failures:
- Add method arguments to context when translating Python to Laurel
- Rename the internal variable name for the argument to add 'in_', which
will resolve the 'modifies'-type checking error.
These two fixes are needed to make the new test pass.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Shilpi Goel <shigoel@gmail.com>
## Summary
- Add `--vc-directory <dir>` CLI flag to the `pyAnalyze` command (it was
already supported by `pyAnalyzeLaurel`)
- Wire the flag through `run_py_analyze.sh` so both `core` and `laurel`
modes can dump SMTLib files

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
### Changes
This change enables me to use the Miso installed Python on my local
machine, so I can run
`StrataTest/Languages/Python/AnalyzeLaurelTest.lean`. With the system
Python and the Python brew installation I was having issues.

### Testing
- Was locally able to run
`StrataTest/Languages/Python/AnalyzeLaurelTest.lean`
Migrates the standalone `StrataVerify` executable into a `verify`
subcommand of the `strata` CLI (`StrataMain.lean`), deletes the old
`StrataVerify.lean`, and enhances the shared argument parser to support
`--flag=value` syntax (with validation that values don't contain `=`).
All references across CI, C# integration tests, documentation, and
example files are updated.

### Features

- The `verifyCommand` faithfully reproduces all flags and behavior from
the original `StrataVerify.lean`, including backward compatibility with
`--output-format=sarif` via the new `--flag=value` parsing.
- The `--flag=value` enhancement to `parseArgs` is well-implemented: it
splits on `=`, validates that the value doesn't itself contain `=`
(clear error message), and correctly handles both `.arg` and `.repeat`
flag types with fallback to the next positional argument when no inline
value is present.
- The diff is clean: no stale files, no dead code. `StrataVerify.lean`
is deleted, no untracked files are accidentally included.
- The steering doc now lists `strata` with its subcommands instead of
the removed `StrataVerify` entry.
- The C# integration test correctly updates both the binary path and
prepends `verify` to the arguments.
- All documentation (README, GettingStarted, CI, example comments) is
consistently updated.
- Build succeeds and smoke tests pass for `--sarif`,
`--output-format=sarif`, and the `=`-in-value rejection.

### Testing

Existing tests that previously used `StrataVerify` now use `strata
verify`.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Josh Cohen <36058610+joscoh@users.noreply.github.com>
…639)

*Description of changes:* This PR move all variable declarations to the
beginning of the function body.

This PR continues the local variable bug fix from this PR
#619.
PR #619 moves the variable declarations inside the body blocks of `if`,
`while`, and `for` statements and attach them on the top of those
statements. However, in case those `if`, `while`, and `for` statements
are inside a sub-block, those variables are not accessible outside of
that block.
This PR resolves that issue by collecting all variables that appear
inside a function body and making declarations for them at the top of
the function body.

Note: To accommodate this task, this PR also adds overloaded types to
compositeTypeNames so that isCompositeType recognizes them during type
inference.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Shilpi Goel <shigoel@gmail.com>
…ag (#696)

## Summary

- Introduce `ExitCode` namespace with documented exit code constants and
a common scheme across all `strata` subcommands (1=user error,
2=failures found, 3=internal error, 4=known limitation)

- Use `Core.inlineProcedures` from SimpleAPI in `pyAnalyzeLaurelCommand`
and `pyAnalyzeToGotoCommand` instead of calling lower-level
`Core.Transform.runProgram` directly

- Add `--no-solve` flag to `pyAnalyzeLaurel` that generates SMT-Lib
files without invoking the solver (requires `--vc-directory`
or`--keep-all-files`)

- Add `skipSolver` field to `VerifyOptions` and wire through
`dischargeObligation`

- Rename `alwaysRunSMT` → `alwaysGenerateSMT` for clarity

- Document all `VerifyOptions` fields with grouping comments

- Replace all `panic!` calls with proper `exitInternalError` (exit code
3)

- Swap exit codes 2↔3 to group user-actionable codes (1,2) and tool-side
codes (3,4)

## NEXT UP
Continue migrating StrataMain commands to use SimpleAPI — deduplicate
the shared Laurel text-file parsing across 4 commands and add Laurel
verification and Python direct-pipeline functions to SimpleAPI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
*Description of changes:*

Handle f-strings in Python->Laurel translation
- FormattedValue: convert inner expr to string via to_string_any
- JoinedStr: concatenate all parts using PAdd (same as string +)
- Interpolation/TemplateStr (Python 3.14+ t-strings): fall through to
Hole
- inferExprType: return Str for JoinedStr and FormattedValue
- Add f-string test cases to test_strings.py
- ~Bind exception variables from `except ... as e` handlers as
LocalVariable declarations in Laurel, so they are in
scope when referenced in f-strings or other expressions within the
handler body.~ Fixed in PR 609
- test_missing_models expected file updated

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.

---------

Co-authored-by: Juneyoung Lee <136006969+aqjune-aws@users.noreply.github.com>
Add `exhaustive` tracking for Python class methods.

Add `@exhaustive` decorator for pyspec classes.

The current settings are:

- User-defined classes: always exhaustive
- Pyspec classes: exhaustive controlled by ClassDef.exhaustive field
(defaults to false; models should aim for `true`)
- Types from annotations only: not exhaustive, unmodeled methods become
Hole
- Primitives (str, int, bool): not exhaustive, methods like
lower()/startswith() become Hole



By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
## Summary

The prelude was being re-verified for every Python test file, wasting
time and cluttering test output with 22 identical prelude lines per
file.

## Changes

**Verifier** (`Strata/Languages/Core/Verifier.lean`): When
`proceduresToVerify` is set, after `CallElim` inlines contracts at call
sites, drop every procedure not in the target list. Previously the
second `FilterProcedures` used a stale call-graph closure and kept
prelude procedures, causing them to be re-verified every time.

**Expected files**: All 39 `expected_laurel` test files updated — 22
prelude lines removed from each, DETAIL counts adjusted.

Prelude correctness is verified once by the dedicated
`PreludeVerifyTest`.

## Testing

- `lake build StrataMain` passes
- `PreludeVerifyTest` passes
- Core verification tests pass (`FailingAssertion`, `Axioms`,
`AlwaysRunSMT`, `RemoveIrrelevantAxioms`)

---------

Co-authored-by: Andrew Wells <130512013+andrewmwells-amazon@users.noreply.github.com>
Add infrastructure to run every string-based ConcreteEval test through
the full Laurel→Laurel lowering pipeline, exposing which tests break
after the lowering passes.

Changes:
- Add lowerLaurelToLaurel helper to LaurelToCoreTranslator.lean that
  extracts the Laurel→Laurel pass sequence (stops before Laurel→Core)
- Add parseLaurelTransformed to TestHelper.lean using the new helper
- Create TransformPreservation.lean with 94 tests mirroring all
  string-based ConcreteEval tests
- Update ConcreteEval.lean barrel file

Results: 77/94 tests pass (output matches direct mode).
17 tests fail due to two known categories:
- heapParameterization (13 tests): composite types/heap objects
- liftExpressionAssignments (4 tests): nested calls and eval order

Failing tests document actual (wrong) output with explanatory comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.