agent-bitr

Hypothesis: Do Bitvector Decision Diagrams (BVDDs) represent reasoning state efficiently and effectively enough to outperform the state of the art in SMT solving and bounded model checking on hardware and software verification benchmarks?

Agent-bitr is inspired by agent-sat. However, unlike agent-sat, which aims at demonstrating that AI agents may autonomously discover competitive solving techniques for a well-understood problem (SAT), agent-bitr tests whether agents can build a competitive solver around a novel, unproven data structure encoded as agent skill. The question is not just "can agents build a solver?" but "do BVDDs — where the decision diagram IS the complete solver state — provide a unified representation of the usually separate clause databases and assignment trails of conventional CDCL(T) and DPLL(T) solvers and bounded model checkers?". Similar to agent-sat, success is determined through benchmarking against existing state-of-the-art tools.

What are BVDDs?

BVDDs are nested decision diagrams with 256-bit bitmask edge labels where:

Bitmask AND is propagation (intersect feasible values)
Bitmask OR is resolution (merge conflicting edges)
Empty bitmask is a conflict (UNSAT)
Operations work on bytes rather than bits

A single BVDD encodes the formula, the assignment trail, AND the learned clauses — all in one incrementally canonical structure. Algorithmic details are in an upcoming publication.

Architecture

The project is split into two crates:

bvdd/ — Standalone BVDD library (publishable on crates.io, C API via FFI)
bitr/ — BTOR2 solver built on the BVDD library

Build & Run

# Build (debug)
cargo build

# Build (release, optimized)
cargo build --release

# Run tests
cargo test

# Run on a BTOR2 file
cargo run --release -- benchmarks/tiny/simple_sat.btor2

# Run with statistics
cargo run --release -- --stats --verbose benchmarks/tiny/simple_sat.btor2

Benchmarks

Hardware Verification

HWMCC'24: 3,498 BTOR2 tasks (1,977 bitvector + 1,521 array) from Zenodo
CAV'18 BTOR2 suite: 10 real-world (System)Verilog designs from JKU

Software Verification

QF_BV: Quantifier-free bitvector benchmarks from SMT-LIB
QF_ABV: Quantifier-free bitvector + array benchmarks

Reference Solvers

bitwuzla — State of the art for QF_BV/QF_ABV in SMT-COMP
rIC3 — HWMCC'24 gold medalist (BV tracks), github
AVR — HWMCC'24 gold in arrays track (IC3sa algorithm)

Running Benchmarks

# Download HWMCC'24 benchmarks
./benchmarks/download.sh

# Run bitr on all benchmarks with 300s timeout
./scripts/run_benchmarks.sh

# Compare against reference solver
python3 scripts/compare.py results/bitr.csv results/bitwuzla.csv

Current Status

Phases 0–9 complete. Core solver operational on combinational, sequential, and array benchmarks. 16/16 tiny benchmarks correct. Optimization ongoing.

Metric	bitr	bitwuzla	rIC3
HW BV solved (≤500K, 10s)	36/155	—	—
QF_BV (SMT-LIB2, 20s)	10/10	—	—
QF_ABV (SMT-LIB2, 20s)	6/6	—	—
HW Array solved	0/321	—	—
SW BV solved	—	—	—
Total time (s)	—	—	—

BVDD Implementation Status

The table below tracks each BVDD concept, its DPLL(T) analogue, and measured performance.

BVDD Concept	DPLL(T) Analogue	Status	Space	Notes
Value sets — 256-bit bitmask edge labels	Literal watches / domain	Done	32 B	`[u64; 4]`; branchless AND/OR/NOT
BVDD nodes — decision DAG with value-set edges	Clause database + trail	Done	4 B id	Hash-consed unique table; arena-allocated
Edge merging — OR value sets of same-child edges	Clause subsumption	Done	—	Reduces branching factor at construction
BVCs — constrained symbolic values at terminals	Theory atoms	Done	4 B id	(term, constraint) pairs
Hash-consed terms — symbolic expression DAG	Term algebra	Done	4 B id	Memoized substitution caches
Constraints — Boolean formulas over predicates	Learned clauses	Done	4 B id	Hash-consed; short-circuit Restrict
HSC — hierarchical 8-bit slice cascade	Bit-blasting to SAT	Done	—	MSB→LSB cascade for variables > 8 bits
Computed cache — memoize Solve(node, valueset)	Conflict cache	Done	64K entries	Direct-mapped; cleared between BMC steps
Canonicalize/Solve — reducing BVDD to canonical form decides SAT	DPLL(T) search	Done	—	Ground check → terminal → decision traversal
Decide/Restrict — partition domain by predicate signatures	Decision + BCP	Done	—	Coarsest partition; short-circuit AND/OR
Theory resolution — 4-stage cascade when no predicates remain	Theory solver	Done	—	See cascade table below

Theory resolution cascade (invoked when all constraints reduce to TRUE/FALSE):

Stage	Strategy	Budget	Throughput
1. Boolean decomposition	Branch on 1-bit comparison subterms	—	—
2. Generalized blast	Enumerate narrowest variable first (packed bytecode evaluator)	2^28 domain	211M eval/s (parallel)
3. Byte-blast	Split widest variable's MSB byte; enumerate 256 × LSB	depth 4; 25% bailout	—
4. Theory oracle	External SMT solver (bitwuzla/z3) on residual	5s per call	cached

Exhaustive search performance (UNSAT x²+1 ≡ 0 mod 2^n, packed bytecode evaluator, 8-core Apple Silicon):

Width	Domain	Wall time	Eval throughput	Parallelism
12-bit	4K	<0.01s	—	sequential
20-bit	1M	0.04s	~25M/s	sequential
24-bit	16M	0.24s	~67M/s	parallel (8 cores)
28-bit	268M	1.27s	211M/s	parallel (8 cores)
2 × 10-bit	1M	0.04s	~25M/s	sequential
3 × 8-bit	16M	0.32s	~50M/s	parallel (8 cores)

Test suite: 91 unit tests, 40/40 benchmarks correct (16 BTOR2 + 10 QF_BV + 6 QF_ABV + 8 SW).

Agent-Driven Development

This solver is being built iteratively by Claude Code agents. Each agent session:

Reads program.md for current status and next steps
Reads .claude/commands/bitr-expert.md for algorithmic reference
Implements the next phase
Runs tests and benchmarks
Updates expert.md with discoveries
Commits progress

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

agent-bitr

What are BVDDs?

Architecture

Build & Run

Benchmarks

Hardware Verification

Software Verification

Reference Solvers

Running Benchmarks

Current Status

BVDD Implementation Status

Agent-Driven Development

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
.claude		.claude
benchmarks		benchmarks
bitr		bitr
bvdd		bvdd
results		results
scripts		scripts
tests		tests
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
README.md		README.md
experiments.log		experiments.log
expert.md		expert.md
program.md		program.md

Folders and files

Latest commit

History

Repository files navigation

agent-bitr

What are BVDDs?

Architecture

Build & Run

Benchmarks

Hardware Verification

Software Verification

Reference Solvers

Running Benchmarks

Current Status

BVDD Implementation Status

Agent-Driven Development

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages