Hypothesis: Do Bitvector Decision Diagrams (BVDDs) represent reasoning state efficiently and effectively enough to outperform the state of the art in SMT solving and bounded model checking on hardware and software verification benchmarks?
Agent-bitr is inspired by agent-sat. Agent-sat aims to demonstrate that AI agents can autonomously discover competitive solving techniques for a well-understood problem (SAT). Agent-bitr instead tests whether agents can build a competitive solver around a novel, unproven data structure encoded as an agent skill. The question is not just "can agents build a solver?" but "do BVDDs, where the decision diagram IS the complete solver state, provide a unified representation of the usually separate clause databases and assignment trails of conventional CDCL(T) and DPLL(T) solvers and bounded model checkers?" As with agent-sat, success is determined by benchmarking against existing state-of-the-art tools.
BVDDs are nested decision diagrams with 256-bit bitmask edge labels where:
- Bitmask AND is propagation (intersect feasible values)
- Bitmask OR is resolution (merge conflicting edges)
- Empty bitmask is a conflict (UNSAT)
- Operations work on bytes rather than bits
A single BVDD encodes the formula, the assignment trail, AND the learned clauses — all in one incrementally canonical structure. Algorithmic details are in an upcoming publication.
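The bitmask operations listed above can be sketched in Rust as follows. This is a minimal illustration, assuming a 256-bit mask stored as four 64-bit words with one bit per byte value; the type name `ValueSet` and its method names are hypothetical and are not taken from the bvdd crate.

```rust
/// A set of byte values (0..=255): bit v is set when value v is feasible.
/// Hypothetical type for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ValueSet([u64; 4]);

impl ValueSet {
    pub const EMPTY: ValueSet = ValueSet([0; 4]);
    pub const FULL: ValueSet = ValueSet([u64::MAX; 4]);

    /// Set containing exactly one byte value.
    pub fn singleton(v: u8) -> Self {
        let mut words = [0u64; 4];
        words[(v >> 6) as usize] |= 1u64 << (v & 63);
        ValueSet(words)
    }

    /// Propagation: keep only the values feasible in both sets.
    pub fn and(self, o: Self) -> Self {
        ValueSet([
            self.0[0] & o.0[0],
            self.0[1] & o.0[1],
            self.0[2] & o.0[2],
            self.0[3] & o.0[3],
        ])
    }

    /// Resolution: merge the values allowed by either set.
    pub fn or(self, o: Self) -> Self {
        ValueSet([
            self.0[0] | o.0[0],
            self.0[1] | o.0[1],
            self.0[2] | o.0[2],
            self.0[3] | o.0[3],
        ])
    }

    /// Conflict check: an empty set means no value is feasible.
    pub fn is_empty(self) -> bool {
        (self.0[0] | self.0[1] | self.0[2] | self.0[3]) == 0
    }

    /// Membership test for a single byte value.
    pub fn contains(self, v: u8) -> bool {
        (self.0[(v >> 6) as usize] >> (v & 63)) & 1 == 1
    }

    /// Number of feasible values in the set.
    pub fn len(self) -> u32 {
        self.0.iter().map(|w| w.count_ones()).sum()
    }
}
```

For example, intersecting `{3, 200}` with `{7, 200}` leaves only `200`, while intersecting two different singletons yields the empty set, which signals a conflict.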
The project is split into two crates:
- bvdd/: Standalone BVDD library (publishable on crates.io, C API via FFI)
- bitr/: BTOR2 solver built on the BVDD library
# Build (debug)
cargo build
# Build (release, optimized)
cargo build --release
# Run tests
cargo test
# Run on a BTOR2 file
cargo run --release -- benchmarks/tiny/simple_sat.btor2
# Run with statistics
cargo run --release -- --stats --verbose benchmarks/tiny/simple_sat.btor2

- HWMCC'24: 3,498 BTOR2 tasks (1,977 bitvector + 1,521 array) from Zenodo
- CAV'18 BTOR2 suite: 10 real-world (System)Verilog designs from JKU
- QF_BV: Quantifier-free bitvector benchmarks from SMT-LIB
- QF_ABV: Quantifier-free bitvector + array benchmarks
- bitwuzla — State of the art for QF_BV/QF_ABV in SMT-COMP
- rIC3 — HWMCC'24 gold medalist (BV tracks)
- AVR — HWMCC'24 gold in arrays track (IC3sa algorithm)
# Download HWMCC'24 benchmarks
./benchmarks/download.sh
# Run bitr on all benchmarks with 300s timeout
./scripts/run_benchmarks.sh
# Compare against reference solver
python3 scripts/compare.py results/bitr.csv results/bitwuzla.csv

Phases 0–9 complete. Core solver operational on combinational, sequential, and array benchmarks. 16/16 tiny benchmarks correct. Optimization ongoing.
| Metric | bitr | bitwuzla | rIC3 |
|---|---|---|---|
| HW BV solved (≤500K, 10s) | 36/155 | — | — |
| QF_BV (SMT-LIB2, 20s) | 10/10 | — | — |
| QF_ABV (SMT-LIB2, 20s) | 6/6 | — | — |
| HW Array solved | 0/321 | — | — |
| SW BV solved | — | — | — |
| Total time (s) | — | — | — |
The table below tracks each BVDD concept, its DPLL(T) analogue, and measured performance.
| BVDD Concept | DPLL(T) Analogue | Status | Space | Notes |
|---|---|---|---|---|
| Value sets — 256-bit bitmask edge labels | Literal watches / domain | Done | 32 B | [u64; 4]; branchless AND/OR/NOT |
| BVDD nodes — decision DAG with value-set edges | Clause database + trail | Done | 4 B id | Hash-consed unique table; arena-allocated |
| Edge merging — OR value sets of same-child edges | Clause subsumption | Done | — | Reduces branching factor at construction |
| BVCs — constrained symbolic values at terminals | Theory atoms | Done | 4 B id | (term, constraint) pairs |
| Hash-consed terms — symbolic expression DAG | Term algebra | Done | 4 B id | Memoized substitution caches |
| Constraints — Boolean formulas over predicates | Learned clauses | Done | 4 B id | Hash-consed; short-circuit Restrict |
| HSC — hierarchical 8-bit slice cascade | Bit-blasting to SAT | Done | — | MSB→LSB cascade for variables > 8 bits |
| Computed cache — memoize Solve(node, valueset) | Conflict cache | Done | 64K entries | Direct-mapped; cleared between BMC steps |
| Canonicalize/Solve — reducing BVDD to canonical form decides SAT | DPLL(T) search | Done | — | Ground check → terminal → decision traversal |
| Decide/Restrict — partition domain by predicate signatures | Decision + BCP | Done | — | Coarsest partition; short-circuit AND/OR |
| Theory resolution — 4-stage cascade when no predicates remain | Theory solver | Done | — | See cascade table below |
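The edge-merging and hash-consing rows of the table can be illustrated with a small sketch. It assumes a node is nothing more than a list of (value set, child) edges and leaves out everything else a real BVDD node would carry; `merge_edges`, `UniqueTable`, and `mk` are hypothetical names, not the bvdd crate's API.

```rust
use std::collections::{BTreeMap, HashMap};

type ValueSet = [u64; 4]; // 256-bit mask, one bit per byte value
type NodeId = u32; // 4-byte node handle

fn or(a: ValueSet, b: ValueSet) -> ValueSet {
    [a[0] | b[0], a[1] | b[1], a[2] | b[2], a[3] | b[3]]
}

/// Merge edges that lead to the same child by OR-ing their value sets.
/// The result is sorted by child id so that equal nodes get equal keys.
fn merge_edges(edges: &[(ValueSet, NodeId)]) -> Vec<(ValueSet, NodeId)> {
    let mut by_child: BTreeMap<NodeId, ValueSet> = BTreeMap::new();
    for &(label, child) in edges {
        let slot = by_child.entry(child).or_insert([0; 4]);
        *slot = or(*slot, label);
    }
    by_child.into_iter().map(|(child, label)| (label, child)).collect()
}

/// Hash-consed node store: structurally equal nodes share one id.
#[derive(Default)]
struct UniqueTable {
    nodes: Vec<Vec<(ValueSet, NodeId)>>, // arena indexed by NodeId
    index: HashMap<Vec<(ValueSet, NodeId)>, NodeId>,
}

impl UniqueTable {
    /// Return the id of the node with these edges, creating it if needed.
    fn mk(&mut self, edges: &[(ValueSet, NodeId)]) -> NodeId {
        let key = merge_edges(edges);
        if let Some(&id) = self.index.get(&key) {
            return id;
        }
        let id = self.nodes.len() as NodeId;
        self.nodes.push(key.clone());
        self.index.insert(key, id);
        id
    }
}
```

In this sketch, two edges labelled `{0}` and `{1}` that point to the same child collapse into one edge labelled `{0, 1}`, and building that node twice returns the same id, which is why a node handle can stay at 4 bytes.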
Theory resolution cascade (invoked when all constraints reduce to TRUE/FALSE):
| Stage | Strategy | Budget | Throughput |
|---|---|---|---|
| 1. Boolean decomposition | Branch on 1-bit comparison subterms | — | — |
| 2. Generalized blast | Enumerate narrowest variable first (packed bytecode evaluator) | 2^28 domain | 211M eval/s (parallel) |
| 3. Byte-blast | Split widest variable's MSB byte; enumerate 256 × LSB | depth 4; 25% bailout | — |
| 4. Theory oracle | External SMT solver (bitwuzla/z3) on residual | 5s per call | cached |
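The 2^28 budget of stage 2 can be read as a limit on the joint domain of the variables being enumerated. The sketch below assumes that reading (the sum of the variable widths must not exceed 28 bits) and assumes that "narrowest variable first" means sorting by width; `plan_blast` and `BLAST_BUDGET_BITS` are hypothetical names, not identifiers from bitr.

```rust
/// Budget from the cascade table: at most 2^28 assignments are enumerated.
const BLAST_BUDGET_BITS: u32 = 28;

/// Returns the variable widths ordered narrowest first together with the
/// size of the joint domain, or None if the domain exceeds the budget.
fn plan_blast(widths: &[u32]) -> Option<(Vec<u32>, u64)> {
    let total_bits: u32 = widths.iter().sum();
    if total_bits > BLAST_BUDGET_BITS {
        return None; // too large to enumerate; a later stage must handle it
    }
    let mut order = widths.to_vec();
    order.sort_unstable(); // narrowest variable first
    Some((order, 1u64 << total_bits))
}
```

Under this assumption, two 10-bit variables give a domain of 2^20 (about 1M) and three 8-bit variables give 2^24 (about 16M), while two 16-bit variables exceed the budget and fall through to the next stage.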
Exhaustive search performance (UNSAT x²+1 ≡ 0 mod 2^n, packed bytecode evaluator, 8-core Apple Silicon):
| Width | Domain | Wall time | Eval throughput | Parallelism |
|---|---|---|---|---|
| 12-bit | 4K | <0.01s | — | sequential |
| 20-bit | 1M | 0.04s | ~25M/s | sequential |
| 24-bit | 16M | 0.24s | ~67M/s | parallel (8 cores) |
| 28-bit | 268M | 1.27s | 211M/s | parallel (8 cores) |
| 2 × 10-bit | 1M | 0.04s | ~25M/s | sequential |
| 3 × 8-bit | 16M | 0.32s | ~50M/s | parallel (8 cores) |
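The benchmark above is UNSAT for every width of at least 2 bits because a square is congruent to 0 or 1 modulo 4, so x² + 1 is congruent to 1 or 2 modulo 4 and is never divisible by 4. The following sketch checks this by plain sequential enumeration. It is not the packed bytecode evaluator used by bitr, and `count_solutions` is a hypothetical helper written only for this illustration.

```rust
/// Counts the solutions of x*x + 1 ≡ 0 (mod 2^width) by trying every x.
/// Supports widths from 1 to 32 bits so that x*x + 1 fits in a u64.
fn count_solutions(width: u32) -> u64 {
    assert!((1..=32).contains(&width));
    let mask: u64 = (1u64 << width) - 1;
    let mut count = 0u64;
    for x in 0..=mask {
        // Reduce modulo 2^width by masking off the high bits.
        if ((x * x + 1) & mask) == 0 {
            count += 1;
        }
    }
    count
}
```

Calling `count_solutions(12)` or `count_solutions(20)` visits 4K and 1M candidates respectively and finds no solution. The throughput column follows from the same domain sizes: 2^28 candidates in 1.27 s is roughly 211M evaluations per second.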
Test suite: 91 unit tests, 40/40 benchmarks correct (16 BTOR2 + 10 QF_BV + 6 QF_ABV + 8 SW).
This solver is being built iteratively by Claude Code agents. Each agent session:
- Reads program.md for current status and next steps
- Reads .claude/commands/bitr-expert.md for algorithmic reference
- Implements the next phase
- Runs tests and benchmarks
- Updates expert.md with discoveries
- Commits progress
MIT