briangmilnes/rusticate

rusticate

Def. Rusticate: to go to, live in, or spend time in the country; or, particularly, to suspend a student from an Oxbridge university as a punishment.

Rusticating Python as a method for reviewing and fixing Rust files in LLM workflows: abstract syntax trees manipulated from Rust code instead of Python and regexps.

Python will be sent back to the family estate for not working well!


📊 Featured: Rust Stdlib Usage Analysis

What does real Rust code actually use from the standard library?

We analyzed 3,636 crates from the top 1,036 crates.io projects (3,336 of them use the stdlib) to find out: critical data for Verus verification prioritization.

Key Findings

Category   Unique Items   Top Item Coverage
--------   ------------   -----------------
Modules          84       std::result alone → 84% of crates
Types            51       Top 9 types → 70% coverage
Traits           79       Iterator + Clone + Debug → 85%
Methods       3,582       Top 9 methods → 90% coverage

Greedy set cover (minimum items to cover N% of crates):

Coverage   Modules   Types   Traits   Methods
---------  -------   -----   ------   -------
 70%          40        9       15       301
 80%          45       14       20       510
 90%          59       19       32       552
100%          84       48       79       552

Run the Analysis

# 1. Generate MIR for projects (slow: ~2-4 hours, one-time)
rusticate-mirify -C ~/projects/RustCodebases -j 6

# 2. Analyze stdlib usage (fast: ~50 seconds)
rusticate-analyze-modules-mir ~/projects/RustCodebases
# Output: analyses/rusticate-analyze-modules-mir.json

📄 Full report + JSON: analyses/

🔗 Gap analysis (what vstd wraps vs what Rust uses): veracity


Table of Contents

  1. Overview
  2. Design Principles
  3. Installation
  4. Tool Categories
  5. Standard Arguments
  6. Tool Reference
  7. Development

Overview

Rusticate is a suite of 77+ AST-based tools for analyzing and automatically fixing Rust codebases. Unlike traditional linters that rely on string manipulation or regex, Rusticate uses the Rust Analyzer syntax tree (ra_ap_syntax) for precise, reliable code transformations.

Note: While Rusticate is a general-purpose Rust analysis toolkit, some tools are specific to the APAS-AI project (Algorithm Performance Analysis System - a comprehensive Rust implementation of algorithms and data structures). These APAS-specific tools handle chapter migration (Chap18 → Chap19), module naming conventions, and parallelism patterns specific to that codebase.

Key Features

  • Zero String Hacking: All tools use proper AST parsing for accuracy
  • Automatic Fixes: Many review tools have corresponding fix tools
  • Emacs Integration: All output is Emacs-clickable for easy navigation
  • Consistent Interface: Standard arguments across all tools
  • Comprehensive Logging: Three-level logging system (stdout, per-tool logs, comprehensive log)
  • Deterministic Output: Sorted file lists and error messages for CI/CD

Use Cases

  • Code Quality Enforcement: Detect naming conventions, structure violations, and anti-patterns
  • Automated Refactoring: Bulk transformations (imports, UFCS, type bounds, etc.)
  • Test Coverage Analysis: Identify untested public functions (99.6% accuracy)
  • Parallelism Auditing: Detect inherent and transitive parallel operations
  • Codebase Migration: Automated migration between module versions (e.g., Chap18 → Chap19)
  • Stdlib Usage Analysis: Analyze which std/core/alloc items are used across 1000+ Rust projects

🔥 NEW: Rust Standard Library Usage Analysis

For Verus verification prioritization: Analyze which stdlib modules, types, methods, and traits are actually used in real-world Rust code.

Tools

Tool Description
rusticate-mirify Generate MIR files from Rust codebases (runs cargo check --emit=mir)
rusticate-analyze-modules-mir Analyze stdlib usage from MIR files with greedy set cover

Workflow

# 1. Generate MIR for 1000+ projects (slow: ~2-4 hours)
rusticate-mirify -C ~/projects/RustCodebases -j 6 --clean-artifacts

# 2. Analyze stdlib usage (fast: ~50 seconds for 1036 projects, 3636 crates)
rusticate-analyze-modules-mir ~/projects/RustCodebases

Key Results (from 3636 crates)

Finding                      Value
-------                      -----
Crates with stdlib usage     3,336 (92%)
Stdlib modules detected      84
Stdlib types detected        51
Stdlib traits detected       79
Methods/functions detected   3,582

Greedy Set Cover: Finds minimum stdlib items to cover N% of real-world crates:

  • std::result module alone covers 84% of crates
  • 9 methods cover 90% of all stdlib usage
  • Iterator has 70+ methods, but 2 cover 92%
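These cover numbers come from the standard greedy approximation: repeatedly pick the item that covers the most not-yet-covered crates until the coverage target is reached. A minimal stdlib-only sketch of that loop (the item names and crate IDs below are illustrative, not taken from the analysis):

```rust
use std::collections::HashSet;

/// Greedy set cover: pick items until `target` fraction of the `universe`
/// of crates is covered. Returns the chosen item names in selection order.
fn greedy_cover(items: &[(&str, HashSet<u32>)], universe: usize, target: f64) -> Vec<String> {
    let mut covered: HashSet<u32> = HashSet::new();
    let mut chosen = Vec::new();
    let needed = (universe as f64 * target).ceil() as usize;
    while covered.len() < needed {
        // Pick the item that adds the most uncovered crates.
        let best = items
            .iter()
            .max_by_key(|(_, crates)| crates.difference(&covered).count());
        match best {
            Some((name, crates)) if crates.difference(&covered).count() > 0 => {
                covered.extend(crates.iter().copied());
                chosen.push(name.to_string());
            }
            _ => break, // nothing adds coverage; target unreachable
        }
    }
    chosen
}

fn main() {
    // Toy data: 4 crates, 3 "stdlib items" mapped to the crates using them.
    let items = vec![
        ("std::result", HashSet::from([1, 2, 3])),
        ("std::vec", HashSet::from([1, 2])),
        ("std::sync", HashSet::from([4])),
    ];
    println!("{:?}", greedy_cover(&items, 4, 1.0)); // ["std::result", "std::sync"]
}
```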

Output

The analysis generates analyses/rusticate-analyze-modules-mir.log with:

  1. Module usage by crate count
  2. Type usage by crate count
  3. Trait usage by crate count
  4. Method/function usage by crate count
  5. Greedy cover analysis (touch and full support) for modules, types, traits, methods
  6. Per-type method breakdown
  7. Trait implementation analysis

See: docs/RusticateAnalyzeModulesMir.md for full documentation.


Design Principles

1. AST-Only Analysis

No string hacking. All code analysis uses SyntaxKind, SyntaxNode, and TextRange from ra_ap_syntax. String methods like .contains(), .find(), .replace() are forbidden for code detection.

Why? String hacking leads to:

  • False positives (matching in comments, strings, identifiers)
  • Missed cases (whitespace variations, formatting differences)
  • Incorrect transformations (byte boundary errors, scope issues)
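A toy illustration of the false-positive problem (not Rusticate code): a substring check flags a method name that only appears inside a comment, whereas an AST-based check would see a COMMENT token and skip it.

```rust
/// Naive string check: does the source "call" unwrap()?
fn string_says_unwrap(src: &str) -> bool {
    src.contains(".unwrap()")
}

fn main() {
    // The only ".unwrap()" here is inside a comment -- there is no actual call.
    let src = "fn main() {\n    // never call .unwrap() in production\n    let _x = 1;\n}\n";
    println!("{}", string_says_unwrap(src)); // prints "true" anyway
}
```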

2. Review + Fix Pairs

Most review-* tools have a corresponding fix-* tool:

  • review-grouped-imports → fix-grouped-imports
  • review-merge-imports → fix-merge-imports
  • review-min-typing → fix-min-typing

Workflow:

  1. Run review-* to identify issues
  2. Review the report
  3. Run fix-* to automatically apply fixes
  4. Run compile-and-test -c to verify correctness

3. Standard Interface

All tools support:

  • -c (codebase mode): Analyze src/, tests/, benches/
  • -d <dir>: Analyze specific directory
  • -f <file>: Analyze specific file
  • -m <module>: Search for module by name

Output format:

path/to/file.rs:LINE:  Description of issue
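This output contract (together with the deterministic-output principle below) can be sketched as a tiny formatter: one Emacs-clickable `path:LINE:` prefix per finding, sorted by path then line. The finding messages below are illustrative.

```rust
/// Format findings as Emacs-clickable "path:LINE:  message" lines,
/// sorted by path then line number for deterministic output.
fn format_findings(mut findings: Vec<(String, u32, String)>) -> Vec<String> {
    findings.sort_by(|a, b| (&a.0, a.1).cmp(&(&b.0, b.1)));
    findings
        .into_iter()
        .map(|(path, line, msg)| format!("{}:{}:  {}", path, line, msg))
        .collect()
}

fn main() {
    let out = format_findings(vec![
        ("src/lib.rs".into(), 45, "Redundant type annotation".into()),
        ("src/app.rs".into(), 7, "Grouped import".into()),
    ]);
    for line in &out {
        println!("{}", line); // src/app.rs:7: ... first, then src/lib.rs:45: ...
    }
}
```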

4. Review Tool Dispatcher

Run all review tools at once or individually through a single interface:

# Run all review tools
rusticate-review all -c

# Run specific tool
rusticate-review string-hacking -c
rusticate-review test-functions -f src/lib.rs

Dual Logging:

  1. stdout - Real-time output as tools run
  2. analyses/rusticate-review.log - Complete log with ALL output from ALL tools (tool details + timing + summary)
  3. analyses/review_<tool>.log - Individual tool logs (when tools run standalone)

The comprehensive review log captures everything so you can easily review findings from a full rusticate-review all -c run without scrolling through terminal output.

5. Deterministic Output

  • File lists are sorted
  • Error messages are sorted by file path and line number
  • Consistent for CI/CD and version control

Installation

Prerequisites

  • Rust 1.70+ (uses ra_ap_syntax for AST parsing)
  • Cargo

Build from Source

git clone https://github.com/yourusername/rusticate.git
cd rusticate
cargo build --release

Binaries will be in target/release/.

Add to PATH (Optional)

export PATH="$PATH:/path/to/rusticate/target/release"

Tool Categories

Compilation & Testing

Tools for building and testing Rust codebases.

Tool Purpose
compile Compile src/ with detailed error reporting
compile-and-test Compile src/ + run tests with colored output
compile-src-tests-benches-run-tests Full grind: compile all + run tests

Example:

compile -c                    # Compile entire codebase
compile-and-test -d src/      # Compile and test src/

Code Analysis (Review)

Tools that detect issues but don't modify code. All output to analyses/tool-name.log.

Import & Module Structure

Tool Description
review-grouped-imports Detect grouped imports (use mod::{A, B};)
review-merge-imports Find mergeable single-line imports from same module
review-import-order Check import statement ordering
review-no-extern-crate Detect deprecated extern crate usage
review-pub-mod Check for pub mod X {} declarations (general Rust)
review-non-wildcard-uses Analyze non-wildcard use statements
review-module-encapsulation Check module visibility and encapsulation

Naming Conventions

Tool Description
review-snake-case-filenames Enforce snake_case file names
review-pascal-case-filenames Enforce PascalCase file names
review-struct-file-naming Check struct name matches file name
review-variable-naming Check variable naming conventions

Type System & Traits

Tool Description
review-impl-trait-bounds Detect unnecessary trait bounds in impl blocks
review-trait-bound-mismatches Find inconsistent trait bounds
review-where-clause-simplification Identify simplifiable where clauses
review-min-typing Detect redundant type annotations
review-impl-order Check impl block ordering
review-inherent-and-trait-impl Analyze inherent vs trait implementations
review-inherent-plus-trait-impl Detect both inherent and trait impls in same file
review-public-only-inherent-impls Check public inherent impl restrictions
review-single-trait-impl Enforce one trait per impl block
review-redundant-inherent-impls Find duplicate inherent implementations
review-no-trait-method-duplication Detect method name conflicts in traits
review-trait-definition-order Check trait definition ordering
review-trait-method-conflicts Find conflicting trait method names
review-trait-self-usage Analyze Self usage in traits
review-typeclasses Analyze typeclass patterns
analyze-review-typeclasses Deep typeclass analysis

UFCS & Qualified Paths

Tool Description
review-unnecessary-ufcs-and-qualified-paths Detect simplifiable UFCS calls
review-simplifiable-ufcs Identify UFCS that could be method calls
review-minimize-ufcs-call-sites Find excessive UFCS usage
review-qualified-paths Analyze qualified path usage

Code Structure

Tool Description
review-stub-delegation Detect stub methods that delegate
review-internal-method-impls Check internal method implementations
review-duplicate-methods Find duplicate method definitions
review-comment-placement Check comment placement conventions

Testing

Tool Description
review-test-modules Analyze test module structure
review-test-functions Check test coverage for public functions (99.6% accuracy)
review-integration-test-structure Validate integration test organization
review-bench-modules Analyze benchmark module structure
review-duplicate-bench-names Find duplicate benchmark names

Parallelism & Concurrency (Mt/St Analysis)

Tool Description
review-inherent-and-transitive-mt Detect inherent/transitive parallel operations in Mt modules
review-st-mt-consistency Check Single-threaded (St) vs Multi-threaded (Mt) consistency
review-mt-per Analyze Mt/Per module relationships
review-stt-compliance Check StT trait compliance

Doctests

Tool Description
review-doctests Analyze doctest completeness and correctness

Chapter Migration (APAS-specific)

Tool Description
review-chap18-chap19 Identify Chap18/Chap19 import issues

Meta-Analysis

Tool Description
review-string-hacking Detect string manipulation that should use AST
review-summary-accuracy Verify summary counts match actual violations
review-logging Check that binaries have dual stdout+file logging

Code Fixing

Tools that automatically modify code. Always run tests after fixing!

Import Management

Tool Description
fix-grouped-imports Expand use mod::{A, B}; to individual imports
fix-merge-imports Merge single-line imports from same module
fix-import-order Sort import statements
fix-our-uses-to-wildcards Convert to wildcard imports for crate modules
fix-non-wildcard-uses Fix non-wildcard use statements
fix-test-trait-imports Fix trait imports in tests

Type System

Tool Description
fix-min-typing Remove redundant type annotations
fix-no-pub-type Fix public type visibility issues

UFCS & Qualified Paths

Tool Description
fix-unnecessary-ufcs Simplify UFCS to method calls (use cautiously!)

Code Structure

Tool Description
fix-stub-delegation Fix stub delegation patterns
fix-duplicate-methods Remove duplicate method definitions
fix-duplicate-method-call-sites Deduplicate method call sites

Doctests

Tool Description
fix-doctests Add missing use statements to doctests

Logging & Infrastructure

Tool Description
fix-logging Add dual stdout+file logging to binaries

Chapter Migration (APAS-specific)

Tool Description
fix-chap18-to-chap19 Migrate imports from Chap18 to Chap19
fix-chap18-to-chap19-per Migrate ArraySeqStPer imports
fix-chap18-chap19-both Remove Chap18 imports when Chap19 exists

Module Structure

Tool Description
fix-pub-mod Add missing pub mod X {} declarations (general Rust)

General

Tool Description
fix General-purpose fix tool (delegates to specific fixers)

Code Metrics

Tools that count or measure code properties.

Tool Description
count-as Count UFCS (Universal Function Call Syntax) usage: <Type as Trait>::method patterns
count-loc Count lines of code (excluding comments, blanks)
count-vec Count Vec usage
count-where Count where clause usage
count-for-loops Count for loops, distinguishing range-based (0..n) from iterator-based loops

Rust Standard Library Analysis

Tools for analyzing stdlib usage across large codebases. Critical for Verus verification prioritization.

Tool Description
rusticate-mirify Generate MIR files for codebases (runs cargo check --emit=mir)
rusticate-analyze-modules-mir Analyze stdlib module/type/method/trait usage from MIR files
rusticate-download-rust-codebases Clone repositories from crates.io top crates list

See: docs/RusticateAnalyzeModulesMir.md for detailed documentation.

Workflow:

# 1. Download top Rust crates
rusticate-download-rust-codebases -i analyses/top1100_unique_repos.txt -o ~/projects/RustCodebases

# 2. Generate MIR for all projects (slow: ~2-4 hours, only needed once)
rusticate-mirify -C ~/projects/RustCodebases -j 6 --clean-artifacts

# 3. Analyze stdlib usage (fast: ~50 seconds)
rusticate-analyze-modules-mir ~/projects/RustCodebases

Key Features:

  • Analyzes 1000+ projects in ~50 seconds (after MIR generation)
  • Validates symbols against stdlib module whitelist (std, core, alloc)
  • Filters out false positives (C++ symbols, internal crate modules, constants)
  • Greedy set cover: minimum stdlib items for N% crate coverage
  • Trait implementation analysis: Iterator, Clone, Debug, Display, etc.
  • Per-type method coverage: which Vec/Option/Result methods to verify first
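The whitelist validation step can be sketched as: keep only fully qualified symbols whose root path segment is std, core, or alloc, which drops C++ mangled names, internal crate paths, and bare constants. The symbol strings below are illustrative.

```rust
/// Keep only symbols rooted in the stdlib crates (std/core/alloc).
fn stdlib_symbols(symbols: &[&str]) -> Vec<String> {
    const ROOTS: [&str; 3] = ["std", "core", "alloc"];
    symbols
        .iter()
        .filter(|s| {
            // Root segment of the fully qualified path, e.g. "std" in "std::vec::Vec".
            let root = s.split("::").next().unwrap_or("");
            ROOTS.contains(&root)
        })
        .map(|s| s.to_string())
        .collect()
}

fn main() {
    let syms = ["std::result::Result", "core::option::Option",
                "mycrate::util::helper", "_ZN4cpp3fooEv"];
    println!("{:?}", stdlib_symbols(&syms)); // keeps only the first two
}
```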

Analysis Output:

  • 84 stdlib modules, 51 types, 79 traits, 3582 methods detected
  • Greedy cover shows 9 methods cover 90% of crate usage
  • Logs to analyses/rusticate-analyze-modules-mir.log

Example (code metrics tools):

count-loc -c                  # Count LOC in entire codebase
count-where -d src/Chap19/    # Count where clauses in Chap19
count-for-loops -c            # Count all for loops, show range vs iterator breakdown

Parsing & Utilities

Tool Description
parse Parse Rust file and dump AST for debugging
review General-purpose review tool (delegates to specific reviewers)
rusticate Main CLI entry point

Standard Arguments

All tools support these common arguments:

-c (Codebase Mode)

Analyze src/, tests/, and benches/ directories.

review-test-functions -c

-d <directory> (Directory Mode)

Analyze a specific directory recursively.

review-merge-imports -d src/Chap19/

-f <file> (File Mode)

Analyze a single file.

fix-grouped-imports -f src/lib.rs

-m <module> (Module Search Mode)

Search for files matching a module name pattern.

review-impl-order -m ArraySeq

Searches for files containing "ArraySeq" in their name.

-l <language> (Language Mode)

Specify the target language for analysis. Defaults to Rust.

count-loc -c                           # Rust (default)
count-loc -l Rust -d src/             # Explicit Rust
count-loc -l Verus -d ~/verus-code/   # Verus language

Supported Languages:

  • Rust (default) - Standard Rust code
  • Verus - Verus verification language (distinguishes spec/proof/exec functions)

See Tools with Verus Support for which tools support Verus analysis.

-p <project> (Project Mode)

Enable project-specific tools (e.g., APAS-specific checks).

rusticate-review all -c -p APAS   # Run all tools including APAS-specific

Currently supports: APAS (for APAS-AI project)


Tool Reference

Key Tools by Use Case

🔥 Most Used

Test Coverage Analysis

review-test-functions -c
  • Detects untested public functions
  • 99.6% accuracy on large codebases
  • Handles trait implementations (Display, Debug, PartialEq)
  • Filters out nested helper functions
  • Output: analyses/review_test_functions.txt
  • Limitations:
    • Inlining: Measures syntactic calls in AST, not actual execution. Trivial functions (e.g., new(), into_iter()) may be inlined by the compiler and show as "tested" even though llvm-cov reports 0 executions.
    • Type Disambiguation: Cannot distinguish method calls across types without a typed AST (requires full type inference from rustc). Method names like vertices() or edges() match across ALL types that define them, causing false positives when multiple types share method names. Use llvm-cov for execution-based ground truth.

Parallelism Detection

review-inherent-and-transitive-mt -c
  • Identifies Mt modules with inherent parallelism (spawn, par_iter, ParaPair!)
  • Detects transitive parallelism (calls to other parallel methods)
  • Fixed-point iteration for intra-module transitivity
  • Output: analyses/review_inherent_and_transitive_mt.log
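The fixed-point iteration can be sketched over a call graph: seed the set with inherently parallel methods, then propagate "parallel" along call edges until nothing changes. The method names below are illustrative.

```rust
use std::collections::{HashMap, HashSet};

/// Mark methods parallel if they are inherently parallel or (transitively)
/// call a parallel method. Iterates the call graph to a fixed point.
fn parallel_closure(
    calls: &HashMap<&str, Vec<&str>>, // caller -> callees
    inherent: &HashSet<&str>,         // e.g. uses spawn / par_iter / ParaPair!
) -> HashSet<String> {
    let mut parallel: HashSet<String> =
        inherent.iter().map(|s| s.to_string()).collect();
    loop {
        let mut changed = false;
        for (caller, callees) in calls {
            if !parallel.contains(*caller)
                && callees.iter().any(|c| parallel.contains(*c))
            {
                parallel.insert(caller.to_string());
                changed = true;
            }
        }
        if !changed {
            return parallel;
        }
    }
}

fn main() {
    let calls = HashMap::from([
        ("filter", vec!["scan"]), // filter is transitively parallel
        ("scan", vec![]),         // scan is inherently parallel
        ("len", vec![]),          // len is sequential
    ]);
    let inherent = HashSet::from(["scan"]);
    let mut found: Vec<_> = parallel_closure(&calls, &inherent).into_iter().collect();
    found.sort();
    println!("{:?}", found); // ["filter", "scan"]
}
```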

Import Cleanup

# 1. Find mergeable imports
review-merge-imports -c

# 2. Merge them
fix-merge-imports -c

# 3. Verify
compile-and-test -c

String Hacking Detection

review-string-hacking -c
  • Finds .contains(), .find(), .replace() on code content
  • Ensures tools use AST instead of string manipulation

Type Annotation Cleanup

# 1. Find redundant type annotations
review-min-typing -c

# 2. Remove them
fix-min-typing -c

# 3. Verify
compile-and-test -c

🧪 Testing Workflow

# Full grind cycle
compile-src-tests-benches-run-tests -c

# Or step-by-step
compile -c                    # Compile src/
compile-and-test -c           # Compile + run tests

📊 Metrics & Reporting

# Code metrics
count-loc -c                  # Lines of code
count-where -c                # Where clause usage
count-vec -c                  # Vec usage
count-for-loops -c            # For loop analysis (range vs iterator)

# Quality checks
review-summary-accuracy       # Verify report accuracy
review-test-functions -c      # Test coverage

🔧 Automated Fixes

# Import management
fix-merge-imports -c          # Merge imports
fix-grouped-imports -c        # Expand grouped imports
fix-import-order -c           # Sort imports

# Type system
fix-min-typing -c             # Remove redundant types
fix-no-pub-type -c            # Fix public type issues

# Doctests
fix-doctests -c               # Add missing use statements

Development

Adding a New Tool

  1. Create the binary:

    touch src/bin/review_my_feature.rs
  2. Add to Cargo.toml:

    [[bin]]
    name = "review-my-feature"
    path = "src/bin/review_my_feature.rs"
  3. Use StandardArgs:

    use rusticate::StandardArgs;
    use anyhow::Result;
    
    fn main() -> Result<()> {
        let args = StandardArgs::parse()?;
        // Use args.paths, args.is_module_search, args.base_dir
        Ok(())
    }
  4. Use AST parsing (never string hacking):

    use ra_ap_syntax::{Edition, SourceFile, SyntaxKind};
    
    let parsed = SourceFile::parse(&content, Edition::Edition2021);
    for node in parsed.tree().syntax().descendants() {
        if node.kind() == SyntaxKind::USE {
            // Process with AST methods
        }
    }
  5. Add logging:

    use std::fs::File;
    use std::io::Write;
    
    macro_rules! log {
        ($logger:expr, $($arg:tt)*) => {{
            let msg = format!($($arg)*);
            println!("{}", msg);
            if let Some(ref mut f) = $logger {
                let _ = writeln!(f, "{}", msg);
            }
        }};
    }
  6. Sort output:

    let mut files = rusticate::find_rust_files(&paths, base_dir, is_module_search)?;
    // Already sorted by find_rust_files()

Testing a Tool

# Build
cargo build --release --bin review-my-feature

# Test on single file
./target/release/review-my-feature -f src/lib.rs

# Test on directory
./target/release/review-my-feature -d src/

# Test on full codebase
./target/release/review-my-feature -c

# Verify compilation still works
compile-and-test -c

Code Quality Rules

  1. NO STRING HACKING: Use SyntaxKind, not .contains() or .find()
  2. Sort output: Use find_rust_files() which returns sorted paths
  3. Emacs-clickable: Format as path/to/file.rs:LINE: message
  4. Dual logging: Write to both stdout and analyses/tool-name.log
  5. Test after fixing: Always run compile-and-test -c after auto-fixes

Examples

Example 1: Find and Fix Mergeable Imports

# Step 1: Identify mergeable imports
$ review-merge-imports -d src/Chap19/

src/Chap19/ArraySeqStEph.rs:5:  Mergeable imports:
  use std::fmt::Display;
  use std::fmt::Formatter;

# Step 2: Apply fix
$ fix-merge-imports -d src/Chap19/

Fixed src/Chap19/ArraySeqStEph.rs (merged 2 imports)

# Step 3: Verify
$ compile-and-test -c
✅ All tests passed

Example 2: Check Test Coverage

$ review-test-functions -c

================================================================================
SUMMARY:
  Total public functions: 2,360
  Functions with test coverage: 2,350 (99.6%)
  Functions without test coverage: 10 (0.4%)
================================================================================

src/Chap50/Probability.rs:118:  Probability::add - NO TEST COVERAGE
src/Chap50/Probability.rs:124:  Probability::sub - NO TEST COVERAGE
...

Example 3: Detect Parallel Operations

$ review-inherent-and-transitive-mt -c

INHERENT PARALLELISM:
  Chap19/ArraySeqMtEph.rs - scan() [line 159] (uses ParaPair!)
  Chap19/ArraySeqMtEph.rs - reduce() [line 186] (uses ParaPair!)

TRANSITIVE PARALLELISM:
  Chap41/AVLTreeSetMtEph.rs - filter() calls ArraySeqMtEph::scan()
  
NON-PARALLEL Mt MODULES:
  Chap06/DirGraphMtEph.rs - No parallel operations detected

Example 4: Remove Redundant Type Annotations

# Find redundant annotations
$ review-min-typing -d src/

src/lib.rs:45:  Redundant type annotation: let x: i32 = 5;

# Fix automatically
$ fix-min-typing -d src/

Fixed src/lib.rs (removed 3 redundant type annotations)

# Verify
$ compile-and-test -c
✅ All tests passed

Tools Tested on APAS-AI Codebase

The following tools have been validated on the large-scale APAS-AI codebase (2,780+ public functions, 114K+ LOC):

Compilation & Testing

  • compile - Validated across entire codebase
  • compile-and-test - Full test suite validation
  • compile-src-tests-benches-run-tests - Complete build pipeline

Code Metrics

  • count-loc - Counted 114,561 LOC: 45,462 src + 55,213 tests + 13,886 benches (655 files total)
  • count-as - Found 1,456 UFCS patterns: 256 src + 942 tests + 258 benches
  • count-vec - Found 721 Vec usages: 453 src + 198 tests + 70 benches
  • count-where - Found 213 where clauses: 213 src + 0 tests + 0 benches
  • count-for-loops - Found 739 loops: 478 range-based (64%), 261 iterator-based (35%)

Review Tools (Extensively Tested)

  • review-test-functions - Analyzed 2,780 functions, 88.9% coverage detection
  • review-inherent-and-transitive-mt - Detected parallelism in 52+ Mt modules
  • review-string-hacking - Found violations in 11 rusticate tools
  • review-summary-accuracy - Validated all analysis reports
  • review-merge-imports - Identified mergeable imports
  • review-grouped-imports - Found grouped imports to expand
  • review-unnecessary-ufcs-and-qualified-paths - Detected simplifiable UFCS
  • review-simplifiable-ufcs - Analyzed UFCS patterns
  • review-non-wildcard-uses - Categorized import patterns
  • review-chap18-chap19 - Found module migration issues

Fix Tools (Production Tested)

  • fix-grouped-imports - Expanded 100+ grouped imports successfully
  • fix-merge-imports - Merged imports, preserving aliases
  • fix-chap18-to-chap19-per - Migrated 13 modules
  • fix-chap18-chap19-both - Resolved trait ambiguities
  • fix-logging - Added dual logging to rusticate tools
  • fix-doctests - Fixed doctest compilation issues
  • fix-binary-logging - Injected logging boilerplate
  • fix-unnecessary-ufcs - Tested but reverted (ambiguity issues)

Results Summary

  • Test Coverage Detection: 88.9% (2,471/2,780 functions)
  • Parallel Methods Detected: 156 inherent, 340+ transitive
  • Code Transformations: 500+ automated fixes applied
  • Zero Regressions: All fixes validated with compile-and-test -c

Verus Language Support

Rusticate provides specialized support for Verus, a verification-aware Rust dialect for formally verified systems programming. Verus extends Rust with specification constructs (spec fn), proof functions (proof fn), and executable code (exec fn), all within a verus! {} macro.

Available Verus Tools

All Verus tools require the -l Verus flag.

Tool                        Description                                            Output
----                        -----------                                            ------
count-loc                   Count lines of code with spec/proof/exec breakdown     Spec/Proof/Exec LOC per file + summary
review-verus-proof-holes    Detect unproven assumptions, proof holes, and axioms   Holes and axioms by type, clean/holed modules
review-verus-axiom-purity   Classify axioms as pure math vs system-dependent       Pure math vs non-pure axiom breakdown

Tool: count-loc with Verus Support

Purpose: Count lines of code in Verus projects, distinguishing between specification, proof, and executable code.

Usage:

# Single project
rusticate-count-loc -l Verus -d ~/my-verus-project/

# Multiple projects in a repository
rusticate-count-loc -l Verus -r ~/VerusCodebases/

What it Counts:

In Verus, everything inside verus! {} is executable by default unless marked otherwise:

Category   What Counts As              Examples
--------   --------------              --------
Spec       Specification-level code    spec fn, global fn, layout fn, type invariants
Proof      Proof code                  proof fn, proof { } blocks inside exec functions
Exec       Executable code (default)   Regular fn, structs, enums, impl blocks, constants, type aliases

Algorithm:

  1. Find all verus! {} macros in the file
  2. Count total lines inside each macro
  3. Walk token tree to identify spec/proof functions and proof blocks
  4. exec = total_lines - spec_lines - proof_lines
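Step 4 is plain subtraction once spec and proof lines are tallied. A one-line sketch (the 184-line total here is illustrative, chosen to match a 36-spec / 34-proof / 114-exec split):

```rust
/// exec lines = everything inside verus!{} not counted as spec or proof.
fn exec_lines(total: u32, spec: u32, proof: u32) -> u32 {
    total.saturating_sub(spec + proof)
}

fn main() {
    println!("{}", exec_lines(184, 36, 34)); // 114
}
```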

Example Output:

$ rusticate-count-loc -l Verus -d ~/verus-project/

Verus LOC (Spec/Proof/Exec)

      36/      34/     114 human_eval_001.rs
     128/      87/     342 array_list.rs
      89/     156/     287 btree_map.rs
     ...

      253/     277/     743 total
    2,489 total lines

12 files analyzed in 15ms

Tested On:

  • human-eval-verus: 264 files, 394K total lines → 60K spec / 68K proof / 138K exec
  • VerusCodebases (24 Verus projects): 1,323 files → 60,739 spec / 68,157 proof / 138,849 exec
  • APAS-VERUS: 16 files, 4K total lines → 253 spec / 277 proof / 743 exec

Implementation: Pure AST/token parsing. Walks the verus! {} macro's token tree, identifies function modifiers by looking for SyntaxKind::IDENT tokens before SyntaxKind::FN_KW. No string hacking.


Tool: review-verus-proof-holes

Purpose: Detect incomplete proofs, unverified assumptions, and trusted axioms in Verus code.

Usage:

# Analyze a directory
rusticate-review-verus-proof-holes -l Verus -d src/

# Analyze specific file
rusticate-review-verus-proof-holes -l Verus -f src/btree.rs

What it Detects:

Proof Holes (unverified gaps in proofs):

Hole Type                    Description                                        Severity
---------                    -----------                                        --------
assume(false)                Unproven assumption with false (admits anything)   🔴 Critical
assume(...)                  Unproven assumption (any argument)                 🟡 Warning
admit()                      Admitted proof obligation                          🟡 Warning
#[verifier::external_body]   Externally verified function                       🔵 Info
#[verifier::external_*]      External specifications, traits, types             🔵 Info
#[verifier::opaque]          Opaque function (body hidden from verifier)        🔵 Info

Trusted Axioms (mathematical foundations, separate from holes):

Axiom Type Description
axiom fn Trusted axiom function declarations
broadcast use ...axiom... Imported axiom groups and functions

Metrics Reported:

Per File:

  • ✓ Clean files (no holes)
  • ❌ Holed files (contains holes)
  • Hole breakdown by type
  • Clean vs holed proof functions
  • Axiom usage

Summary:

  • Holes Found: Unverified assumptions and admitted obligations
    • Breakdown by hole type (assume/admit/external/opaque)
  • Trusted Axioms: Mathematical foundations (reported separately)
    • axiom fn declarations
    • broadcast use axiom imports

Example Output:

$ rusticate-review-verus-proof-holes -l Verus -d src/

Verus Proof Hole Detection
Looking for:
  - assume(false), assume(), admit()
  - axiom fn (trusted axioms)
  - broadcast use axioms (axiom groups and functions)
  - external_body, external_fn_specification, external_trait_specification
  - external_type_specification, external_trait_extension, external
  - opaque

✓ array_list.rs
❌ btree_map.rs
   Holes: 8 total
      3 × admit()
      5 × external_body
   Proof functions: 23 total (20 clean, 3 holed)
✓ seq_lib.rs
   141 clean proof functions

═══════════════════════════════════════════════════════════════
SUMMARY
═══════════════════════════════════════════════════════════════

Modules:
   73 clean (no holes)
   12 holed (contains holes)
   85 total

Proof Functions:
   672 clean
   49 holed
   721 total

Holes Found: 321 total
   [breakdown by type]

Trusted Axioms: 189 total
   108 × axiom fn declarations
   81 × broadcast use axioms

Completed in 47ms

Use Cases:

  • Pre-publication audit: Find all unverified assumptions before releasing
  • Technical debt tracking: Monitor proof completion progress
  • Code review: Identify modules requiring proof work
  • Verification status: Quick overview of proof completeness
  • Axiom audit: Track mathematical foundations being relied upon

Implementation: Pure AST/token parsing using ra_ap_syntax. Finds both verus! and verus_! macros, walks token trees to identify:

  • proof fn and axiom fn declarations (IDENT modifiers before FN_KW)
  • Function calls to assume/admit (IDENT followed by L_PAREN)
  • #[verifier::*] attributes (POUND → L_BRACK → verifier path, handling :: tokenized as two COLON)
  • broadcast use statements with axiom references (broadcast IDENT → USE_KW)

No string hacking. Passes string-hacking detector.

Logs to: analyses/rusticate-review-verus-proof-holes.log


Tool: review-verus-axiom-purity

Purpose: Classify trusted axioms in Verus code by mathematical abstraction level: numeric math, set theoretic math, or machine-level math.

Usage:

# Analyze a directory
rusticate-review-verus-axiom-purity -l Verus -d src/

# Analyze specific file
rusticate-review-verus-axiom-purity -l Verus -f src/seq_lib.rs

Three-Tier Axiom Classification:

1. Numeric Math (✅ Numbers and arithmetic):

  • nat, int - Natural and integer axioms
  • arithmetic, div_mod, mul, power, logarithm - Arithmetic operations
  • add, sub - Numeric operations
  • Mathematical foundations for computation

2. Set Theoretic Math (✅ Mathematical abstractions):

  • seq, multiset, map, set - Collection theory axioms
  • to_multiset - Collection conversion axioms
  • Abstract mathematical structures

3. Machine Math (⚠️ Concrete data structures - requires scrutiny):

  • hash - Hash function axioms (runtime-dependent)
  • array, vec, slice - Concrete array/vector structures
  • ptr, borrow, alloc - Memory and ownership axioms
  • atomic, thread, sync - Concurrency primitives
  • layout - Memory layout axioms
  • Ties proofs to specific runtime behaviors

Why This Matters:

Numeric and set theoretic axioms are mathematical foundations (67%+ in vstd). Machine math axioms (33%) tie proofs to specific runtime behaviors (hash implementations, memory models, concurrency primitives) and require more careful review for portability and trust.

Example Output:

$ rusticate-review-verus-axiom-purity -l Verus -d src/

Verus Axiom Purity Analysis

✓ seq.rs
   Numeric math axioms: 7
      2 × axiom_seq_subrange_index
      2 × axiom_seq_subrange_len
      1 × axiom_seq_add_index1
      1 × axiom_seq_add_index2
      1 × axiom_seq_add_len
   Set theoretic math axioms: 16
      3 × group_seq_axioms
      1 × axiom_seq_empty
      [...]

⚠ hash_set.rs
   Machine math axioms: 3
      3 × group_hash_axioms

═══════════════════════════════════════════════════════════════
SUMMARY
═══════════════════════════════════════════════════════════════

Axiom Classification:
   54 numeric math (26.1%)
   85 set theoretic math (41.1%)
   68 machine math (32.9%)
   ─────────────────────
   207 total axioms

Completed in 52ms

Use Cases:

  • Trust audit: Understand mathematical vs machine-level axiom dependencies
  • Portability analysis: Identify axioms tied to specific runtime behaviors
  • Review prioritization: Focus review on machine math axioms
  • Publication: Document axiom abstraction levels for verified systems

Implementation: Pure AST/token parsing using ra_ap_syntax. Detects:

  • axiom fn declarations - extracts function names
  • broadcast use statements - extracts axiom group/function names
  • Classifies based on name patterns:
    • Numeric: arithmetic, div, mul, add, sub, nat, int
    • Set theoretic: seq, multiset, map, set
    • Machine: everything else (hash, array, ptr, thread, etc.)
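As a rough illustration of this name-pattern classification, here is a minimal sketch in Rust. The pattern lists and the check order (numeric before set theoretic, which is consistent with axiom_seq_add_len landing in the numeric tier in the example output above) are assumptions for illustration, not the tool's actual tables.

```rust
// Sketch: three-tier axiom classification by substring name patterns.
// Pattern lists and precedence are illustrative assumptions.

#[derive(Debug, PartialEq)]
enum Tier {
    Numeric,
    SetTheoretic,
    Machine,
}

fn classify(axiom_name: &str) -> Tier {
    const NUMERIC: &[&str] =
        &["arithmetic", "div", "mul", "add", "sub", "nat", "int", "power", "logarithm"];
    const SET_THEORETIC: &[&str] = &["seq", "multiset", "map", "set"];

    // Numeric patterns are checked first, so a name like
    // axiom_seq_add_len classifies as numeric despite containing "seq".
    if NUMERIC.iter().any(|p| axiom_name.contains(p)) {
        return Tier::Numeric;
    }
    if SET_THEORETIC.iter().any(|p| axiom_name.contains(p)) {
        return Tier::SetTheoretic;
    }
    // Everything else (hash, array, ptr, thread, ...) is machine math.
    Tier::Machine
}

fn main() {
    assert_eq!(classify("axiom_seq_add_len"), Tier::Numeric);
    assert_eq!(classify("group_seq_axioms"), Tier::SetTheoretic);
    assert_eq!(classify("group_hash_axioms"), Tier::Machine);
    println!("all classifications ok");
}
```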

No string hacking. Passes string-hacking detector.

Logs to: analyses/rusticate-review-verus-axiom-purity.log


Verus Implementation Notes

How Rusticate Parses Verus:

  1. Find verus! {} macros: Search for MACRO_CALL nodes with path "verus"
  2. Extract token tree: Get the TOKEN_TREE child (the contents of the macro)
  3. Walk tokens: Iterate through descendants_with_tokens() to find:
    • FN_KW tokens (function declarations)
    • IDENT tokens preceding FN_KW (modifiers: spec, proof, global, layout)
    • IDENT tokens followed by L_PAREN (function calls like assume, admit)
    • POUND tokens (start of attributes like #[verifier::external_body])
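Step 3's backward scan for modifiers can be sketched as follows. Again the Tok enum is a toy stand-in for ra_ap_syntax token kinds, and the modifier list is taken from the keywords named above; this is an illustration, not the tool's implementation.

```rust
// Sketch: detect Verus function modifiers (spec/proof/exec/...) as
// IDENT tokens immediately preceding FN_KW. Toy token model.

#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Ident(String),
    FnKw,
    Other,
}

/// For each `fn`, scan backwards over adjacent IDENT tokens to collect
/// modifiers like `spec` or `proof`.
fn fn_modifiers(tokens: &[Tok]) -> Vec<Vec<String>> {
    const MODIFIERS: &[&str] =
        &["spec", "proof", "exec", "tracked", "ghost", "global", "layout"];
    let mut out = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        if *tok != Tok::FnKw {
            continue;
        }
        let mut mods = Vec::new();
        let mut j = i;
        // Walk backwards while the preceding token is a known modifier.
        while j > 0 {
            j -= 1;
            match &tokens[j] {
                Tok::Ident(name) if MODIFIERS.contains(&name.as_str()) => {
                    mods.insert(0, name.clone());
                }
                _ => break,
            }
        }
        out.push(mods);
    }
    out
}

fn main() {
    // Roughly models: `spec fn len ...  proof fn lemma ...  fn run ...`
    let toks = vec![
        Tok::Ident("spec".into()), Tok::FnKw, Tok::Other,
        Tok::Ident("proof".into()), Tok::FnKw, Tok::Other,
        Tok::FnKw,
    ];
    let mods = fn_modifiers(&toks);
    assert_eq!(mods, vec![vec!["spec".to_string()], vec!["proof".to_string()], vec![]]);
    println!("{:?}", mods);
}
```

The plain `fn run` (no preceding modifier) yields an empty modifier list, which is how exec-by-default functions show up.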

Why Not Re-parse the Macro Contents?

Verus keywords (spec, proof, exec, global, layout, tracked, ghost) are not valid Rust syntax outside the verus! macro. Attempting to parse the macro's token tree as a new SourceFile fails because the Rust parser rejects these keywords.

Solution: Walk the token tree directly using SyntaxToken analysis, which preserves the original tokens without requiring syntactic validity.

Challenges:

  • Token Tree Structure: The verus! {} macro wraps all code in a flat token stream. We must reconstruct function boundaries by matching braces.
  • Modifier Detection: Modifiers like spec appear as IDENT tokens before fn. We look backwards from FN_KW to find modifiers.
  • Nested Scopes: Functions can contain nested functions. We track brace depth to find function boundaries.
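The brace-depth bookkeeping for function boundaries can be sketched like this, once more with a toy token model standing in for the real SyntaxToken stream; an illustration under those assumptions, not the tool's implementation.

```rust
// Sketch: reconstruct function body spans from a flat token stream by
// matching braces from each `fn` keyword. `Tok` is a toy token model.

#[derive(Debug, Clone, PartialEq)]
enum Tok {
    FnKw,   // FN_KW
    LBrace, // L_CURLY
    RBrace, // R_CURLY
    Other,
}

/// Return (start, end) token-index spans of function bodies, including
/// nested functions, by tracking brace depth from each `fn`.
fn function_body_spans(tokens: &[Tok]) -> Vec<(usize, usize)> {
    let mut spans = Vec::new();
    for (i, tok) in tokens.iter().enumerate() {
        if *tok != Tok::FnKw {
            continue;
        }
        // Find the opening brace of this function's body...
        let Some(rel) = tokens[i..].iter().position(|t| *t == Tok::LBrace) else {
            continue;
        };
        let open = i + rel;
        // ...then walk forward, tracking depth until it returns to zero.
        let mut depth = 0usize;
        for (j, t) in tokens[open..].iter().enumerate() {
            match t {
                Tok::LBrace => depth += 1,
                Tok::RBrace => {
                    depth -= 1;
                    if depth == 0 {
                        spans.push((open, open + j));
                        break;
                    }
                }
                _ => {}
            }
        }
    }
    spans
}

fn main() {
    // Roughly models: `fn outer { fn inner { } }`
    let toks = vec![
        Tok::FnKw, Tok::Other, Tok::LBrace,
        Tok::FnKw, Tok::Other, Tok::LBrace, Tok::RBrace,
        Tok::RBrace,
    ];
    let spans = function_body_spans(&toks);
    assert_eq!(spans, vec![(2, 7), (5, 6)]);
    println!("{:?}", spans);
}
```

The nested function produces its own inner span, which is what lets hole detection attribute findings to the innermost enclosing function.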

Why This Approach Works:

  • ✅ Handles all Verus syntax without string matching
  • ✅ Correctly identifies function types even with unusual formatting
  • ✅ Finds holes inside nested scopes
  • ✅ Fast: No full type inference needed, just token walking

Future Verus Tools (Planned)

  • review-verus-spec-coverage - Detect functions missing specifications
  • review-verus-invariants - Check struct invariants and ensures clauses
  • review-verus-ghost-tracking - Analyze ghost/tracked usage
  • count-verus-modes - Count open/closed/tracked mode usage

License

GPL-3.0 - See LICENSE file for details.

Copyright (C) Brian G. Milnes 2025

Development

Testing with APAS-AI

Many rusticate tools have been extensively tested on the APAS-AI codebase. To run tests against APAS-AI:

# From rusticate project root, check out APAS-AI
mkdir -p APAS-AI-copy
cd APAS-AI-copy
git clone https://github.com/briangmilnes/APAS-AI.git apas-ai
cd ..

# Run count tools on APAS-AI
target/release/rusticate-count-loc -c APAS-AI-copy/apas-ai
target/release/rusticate-count-as -c APAS-AI-copy/apas-ai
target/release/rusticate-count-vec -c APAS-AI-copy/apas-ai
target/release/rusticate-count-where -c APAS-AI-copy/apas-ai
target/release/rusticate-count-for-loops -c APAS-AI-copy/apas-ai

# Run review tools on APAS-AI
target/release/rusticate-review string-hacking -c APAS-AI-copy/apas-ai
target/release/rusticate-review test-functions -c APAS-AI-copy/apas-ai

By convention, the APAS-AI codebase is checked out into APAS-AI-copy/apas-ai/ to avoid confusion with the main rusticate project.

Contributing

Contributions welcome! Please ensure:

  1. All tools use AST parsing (no string hacking)
  2. Output is deterministic (sorted)
  3. Add corresponding fix tool for each review tool
  4. Tests pass after automatic fixes
  5. Run rusticate-review string-hacking on your changes

Authors

Brian G. Milnes
