feat(Core): add Str.ToLower and Str.ToUpper string operations#621

Open
shigoel wants to merge 6 commits into main from shilpi/str-to-lower

@shigoel shigoel commented Mar 19, 2026

Summary

  • Add str.tolower and str.toupper operations to Strata Core, each with concrete evaluation and SMT axioms
  • Extend the unaryOp combinator in IntBoolFactory.lean with an optional axioms parameter
  • Skip axiom-bearing factory functions in ExprEvalTest to avoid MBQI timeouts on UF-based operations
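A minimal sketch of the shape these bullets describe, written in Python rather than Lean and not taken from the PR. `FactoryFunc`, `unary_op`, `functions_for_eval_test`, and the `Int.Neg` entry are hypothetical names chosen for illustration, and the axioms are plain strings standing in for Strata expressions.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class FactoryFunc:
    """Hypothetical stand-in for a factory function entry."""
    name: str
    concrete_eval: Callable
    axioms: Tuple[str, ...] = ()  # empty means "no SMT axioms attached"


def unary_op(name, concrete_eval, axioms=()):
    """Build a unary operation; `axioms` is optional and defaults to none."""
    return FactoryFunc(name, concrete_eval, tuple(axioms))


def ascii_lower(s):
    """Lowercase only the ASCII letters A-Z (an assumption about the semantics)."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


# An existing-style operation with no axioms (illustrative entry).
int_neg = unary_op("Int.Neg", lambda n: -n)

# A new-style operation carrying axioms alongside its concrete evaluation.
str_to_lower = unary_op(
    "Str.ToLower",
    ascii_lower,
    axioms=("idempotence", "length preservation", "concat distributivity"),
)


def functions_for_eval_test(funcs):
    """Keep only functions without axioms, mirroring the skip in the test."""
    return [f for f in funcs if not f.axioms]


selected = functions_for_eval_test([int_neg, str_to_lower])
print([f.name for f in selected])
```

Making `axioms` default to empty keeps every existing caller of the combinator unchanged, and the test filter needs only to check whether the field is non-empty.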

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

shigoel and others added 2 commits March 19, 2026 17:04
- Add `unaryOp` optional `axioms` parameter in IntBoolFactory.lean
- Add `strToLowerFunc` and `strToUpperFunc` in Factory.lean, each with
  concrete evaluation (String.toLower / String.toUpper) and three SMT
  axioms: idempotence, length preservation, and concat distributivity
- Wire both ops into the DDM grammar, ASTtoCST, and Translate passes
- Skip axiom-bearing factory functions in ExprEvalTest to avoid MBQI
  timeouts on UF-based operations
- Merge StrToLowerTest and StrToUpperTest into StrCaseTest.lean
- Update ProcedureEvalTests and ProgramTypeTests expected outputs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@shigoel shigoel requested a review from a team March 19, 2026 22:14
@shigoel shigoel changed the title from "feat: add Str.ToLower and Str.ToUpper string operations" to "feat(Core): add Str.ToLower and Str.ToUpper string operations" Mar 19, 2026
shigoel and others added 2 commits March 19, 2026 17:17
@shigoel shigoel added the Core label Mar 19, 2026
(((~Str.Concat : string → string → string) %1) %0))
== (((~Str.Concat : string → string → string)
((~Str.ToUpper : string → string) %1))
((~Str.ToUpper : string → string) %0)))]

Confirmed that all these axioms and their ToLower versions are indeed valid, using CVC5.
I heard ß uppercases to SS, but ToUpper probably only deals with the English alphabet.

Could you verify that idea?

Yeah, I could test with this! :)

(set-logic QF_SLIA)

(declare-const result String)
(assert (= result (str.to_upper "\u{00DF}")))

(check-sat)
(get-value (result))

It prints:

sat <- this doesn't matter
((result "\u{df}"))

\u00df is ß.
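
For contrast, a small Python sketch (not part of the PR): Python's built-in `str.upper` applies the full Unicode case mapping, so ß becomes SS and the length changes, while an ASCII-only mapping leaves ß untouched, as in the CVC5 output above. `ascii_upper` is a hypothetical helper written for this illustration.

```python
def ascii_upper(s: str) -> str:
    """Uppercase only the ASCII letters a-z; leave everything else unchanged."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


sharp_s = "\u00df"  # ß

unicode_result = sharp_s.upper()     # full Unicode mapping
ascii_result = ascii_upper(sharp_s)  # ASCII-only mapping

print(repr(unicode_result), len(unicode_result))  # 'SS' 2
print(repr(ascii_result), len(ascii_result))      # 'ß' 1
```

The length-preservation axiom would be false under the full Unicode mapping, so it only holds if the operation is restricted in the way the solver output suggests.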

I think we should have a test with an unprovable or false goal to demonstrate that the axioms are not vacuous.

@joehendrix joehendrix left a comment

It'd be nice to see if we can reduce some of the per-operator boilerplate w.r.t. DDM integration, but that's clearly a separate PR.

knowledge for the solver:
- *Idempotence*: `f(f(s)) == f(s)`
- *Length preservation*: `len(f(s)) == len(s)`
- *Concat distributivity*: `f(s1 ++ s2) == f(s1) ++ f(s2)`

These three axioms can be trivially satisfied by the identity function. This means that having them will make "sat" no longer sound. @aqjune-aws any insight we could bring on that?
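
To make the concern concrete, here is a hedged Python sketch (not from the PR) that checks the three axioms on a small set of sample strings for an ASCII-only uppercase function, for the identity function, and for Python's Unicode-aware `str.upper`. `ascii_upper`, `satisfies_axioms`, and the sample set are assumptions made for illustration. Passing on samples is not a proof, but it shows that the axioms alone do not distinguish uppercase from identity.

```python
from itertools import product


def ascii_upper(s):
    """Uppercase only the ASCII letters a-z (assumed semantics)."""
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


def identity(s):
    return s


SAMPLES = ["", "a", "Ab", "hello", "MiXeD 123", "\u00df"]


def satisfies_axioms(f, samples=SAMPLES):
    """Check idempotence, length preservation, and concat distributivity."""
    for s in samples:
        if f(f(s)) != f(s):        # idempotence
            return False
        if len(f(s)) != len(s):    # length preservation
            return False
    for s1, s2 in product(samples, repeat=2):
        if f(s1 + s2) != f(s1) + f(s2):  # concat distributivity
            return False
    return True


print(satisfies_axioms(ascii_upper))  # True
print(satisfies_axioms(identity))     # True
print(satisfies_axioms(str.upper))    # False: "ß".upper() is "SS", length changes
```

A model that interprets the uninterpreted function as the identity is therefore consistent with these axioms on symbolic inputs, which is where a spurious "sat" could come from.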

It is true that this might lead to spurious counterexamples, but I thought it was fine to raise spurious counterexamples. There was a discussion suggesting that preventing cover from unsat -> sat was too strict.

@MikaelMayer MikaelMayer left a comment

There seem to be redundant tests.

assert [empty_lower]: str.tolower("") == "";

// Concrete idempotence (two calls on a literal)
assert [concrete_idempotent]:

These tests are about concrete values, so they don't make sense, since we already know the result of these functions. It would be more interesting to see symbolic inputs here, to check that the axioms are being picked up.

5 participants