This README focuses on the autonomous agent system. For the detailed mathematical and technical report, see TECHNICAL_REPORT.md.
A complete formalization of the Vlasov-Maxwell-Landau steady-state theorem on the 3-torus — achieved in 10 days by a centaur team of AI agents and a human mathematician.
| Metric | Value |
|---|---|
| Status | ✅ Fully Verified (0 sorry's) |
| Lean 4 Code | 10,445 lines |
| Development Time | 10 days (Mar 1–10, 2026) |
| Human Prompts | 229 |
| Assistant Turns | 27,200+ |
| Tokens Consumed | 2.8 Billion |
| API Cost | ~$6,300 |
"The goal is not to end up with 0 sorry's! The goal is to make an honest formalization of the main theorem." — Project Manifesto
This project demonstrates a new paradigm in mathematical research: Semi-Autonomous Formalization. The human steers the high-level strategy and validates the definitions, while a suite of AI agents handles the implementation, proof search, and verification.
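
For readers new to Lean, the "sorry" count cited above refers to the `sorry` placeholder, which lets a file compile while leaving a proof open. The snippet below is a minimal illustration; the theorem names are hypothetical and do not come from this repository.

```lean
-- An open goal: Lean accepts the declaration but warns that it uses `sorry`.
theorem add_zero_open (n : Nat) : n + 0 = n := by
  sorry

-- The same statement with the proof filled in, so no `sorry` remains.
theorem add_zero_closed (n : Nat) : n + 0 = n := by
  rfl
```

A project with "0 sorry's" contains only declarations of the second kind. As the manifesto stresses, this says nothing about whether the definitions and hypotheses are honest, which is why they were audited by a human.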
This result was produced by a specialized multi-agent system:
- Vasily Ilin (Human): Architect & Reviewer. Designed the proof strategy, enforced hypothesis discipline, and audited critical definitions.
- Claude Code (Agent): Engineer & Prover. Wrote 99% of the Lean code, managed the repository, and executed the /babysit loop.
- Gemini DeepThink (Reasoning Model): Mathematician. Generated the initial natural-language proof and solved complex analytical bottlenecks.
- Aristotle (ATP): Lemma Specialist. Automatically proved 111 difficult lemmas and caught 28 false conjectures via counterexample search.
The core innovation is the /babysit loop: an autonomous development cycle run by Claude Code.
- /critique: Adversarial review of the codebase (finding gaps, dead code, weak definitions).
- /plan: Prioritize work based on the critique.
- /prove: Attempt to close open sorry's using Mathlib tactics.
- /submit-aristotle: Extract hard lemmas and send them to the Aristotle cloud prover.
- /check-aristotle: Integrate successful proofs and debug failed ones.
- /simplify: Refactor code and remove redundancy.
This loop ran continuously, often overnight, turning high-level directives into verified theorems.
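
The control flow of the cycle can be sketched in a few lines of Python. This is an illustrative model only, not the project's implementation: the `babysit` function, the `handlers` mapping, the `done` predicate, and the `open_goals` counter are all hypothetical names introduced here, and the stopping rule is an assumption. Only the six step names and their order come from the list above.

```python
from typing import Any, Callable, Dict, List, Tuple

# Step order as listed above; the names mirror the slash commands.
STEPS: List[str] = [
    "critique", "plan", "prove",
    "submit-aristotle", "check-aristotle", "simplify",
]

State = Dict[str, Any]
Step = Callable[[State], State]


def babysit(
    state: State,
    handlers: Dict[str, Step],
    done: Callable[[State], bool],
    max_cycles: int = 100,
) -> Tuple[State, int]:
    """Repeat the six steps in order until done(state) holds or the budget runs out."""
    cycles = 0
    while cycles < max_cycles and not done(state):
        for name in STEPS:
            state = handlers[name](state)
        cycles += 1
    return state, cycles


def demo() -> Tuple[State, int]:
    """Toy run: every step logs its name, and 'prove' closes one open goal per cycle."""

    def record(name: str) -> Step:
        def step(state: State) -> State:
            return {**state, "log": state["log"] + [name]}
        return step

    handlers: Dict[str, Step] = {name: record(name) for name in STEPS}

    def prove(state: State) -> State:
        state = record("prove")(state)
        return {**state, "open_goals": max(0, state["open_goals"] - 1)}

    handlers["prove"] = prove

    start: State = {"open_goals": 3, "log": []}
    return babysit(start, handlers, done=lambda s: s["open_goals"] == 0)
```

Calling `demo()` runs the toy cycle until the hypothetical `open_goals` counter reaches zero. In the real system each handler would be an agent invocation rather than a pure function, and the stopping rule would reflect the critique rather than a simple counter.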
Every node represents a verified theorem. The graph flows from basic definitions to the final VML steady-state theorem.
More than 10,000 lines of verified code in 10 days. The sharp rise on Mar 7-8 represents the autonomous "sprint" to handle the Coulomb kernel singularity.
Blue bars are AI actions; red bars are human prompts. The system ran effectively autonomously for long stretches (see Mar 9-10).
Theorem (CoulombConcreteTheorem42):
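Let $(f, E, B)$ be a steady-state solution of the Vlasov-Maxwell-Landau system on the 3-torus (see TECHNICAL_REPORT.md for the precise hypotheses). Then:
- $f$ is a global Maxwellian (thermodynamic equilibrium).
- The electric field $E$ vanishes everywhere.
- The magnetic field $B$ is constant.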
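
For orientation, a global Maxwellian is commonly written in the following textbook form. The normalization (unit mass and unit Boltzmann constant) and the symbols $\rho$, $u$, $T$ are assumptions made here for illustration; the definition used in the Lean code may be stated differently.

$$
f(x, v) = \frac{\rho}{(2\pi T)^{3/2}} \exp\left(-\frac{|v - u|^2}{2T}\right), \qquad \rho > 0,\ u \in \mathbb{R}^3,\ T > 0 .
$$

"Global" means that the density $\rho$, bulk velocity $u$, and temperature $T$ are constants, independent of the position $x$ on the torus.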
This is a rigidity result fundamental to plasma physics, formalized here in full generality.
This work is licensed under CC BY-NC-SA 4.0.
