ChemReact is a sophisticated chemoinformatics and retrosynthesis automation framework designed to bridge the gap between high-level chemical reasoning and executable laboratory procedures. By leveraging the advanced molecular manipulation capabilities of RDKit and a Multi-Persona LLM Architecture, ChemReact provides a closed-loop system for molecular design, auditing, and high-fidelity visualization.
Unlike monolithic planning systems, ChemReact utilizes a "checks and balances" approach through specialized agent personas, ensuring that every synthetic route is both strategically sound and tactically executable.
| Persona | Responsibility | Focus |
|---|---|---|
| Top-Level Designer | Strategic Planning | Skeletal disconnection, convergent vs. linear strategy, core ring system construction. |
| Reaction Designer | Tactical Execution | Detailed reagent selection, solvent/catalyst optimization, selectivity control (Regio/Stereo). |
| Auditor | Quality Assurance | Mass balance verification, protecting group (PG) loop detection, safety/toxicity screening. |
| Visualization Specialist | Visual Communication | Creative direction for molecular rendering, identifying key intermediates and reaction centers for highlighting. |
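
As a rough illustration of the Auditor's mass balance verification, the sketch below compares element counts on both sides of a reaction. It is not ChemReact's implementation: the function names `parse_formula` and `atom_balance` are hypothetical, and it works on flat molecular formulas using only the standard library, whereas the real system derives its properties from RDKit.

```python
import re
from collections import Counter

# One element symbol followed by an optional count, e.g. "C2", "H", "Cl3".
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> Counter:
    """Count atoms in a flat molecular formula such as 'C2H6O' (no parentheses)."""
    counts = Counter()
    for element, number in _TOKEN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def atom_balance(reactants, products) -> dict:
    """Return {element: reactant atoms - product atoms} for unbalanced elements.

    An empty dict means every element is balanced.
    """
    left, right = Counter(), Counter()
    for formula in reactants:
        left.update(parse_formula(formula))
    for formula in products:
        right.update(parse_formula(formula))
    diff = {el: left[el] - right[el] for el in set(left) | set(right)}
    return {el: d for el, d in diff.items() if d != 0}


# Fischer esterification: acetic acid + ethanol -> ethyl acetate + water.
balanced = atom_balance(["C2H4O2", "C2H6O"], ["C4H8O2", "H2O"])
# Same reaction with the water by-product forgotten.
missing_water = atom_balance(["C2H4O2", "C2H6O"], ["C4H8O2"])
```

Here `balanced` is empty, while `missing_water` reports the two hydrogens and one oxygen that the omitted by-product would account for. A check of this kind flags the sort of small atom mismatch an audit step is meant to catch.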
The system's strength lies in its ability to translate abstract JSON data into intuitive visual artifacts. Below are representative outputs from a standard run (run_001) performing the retrosynthesis of Losartan.
The system identifies the core biphenyl-tetrazole-imidazole scaffold, providing a clean 2D representation for strategic mapping.
*Fig 1. Target Molecule (Losartan) 2D Rendering.*
A cornerstone of the ChemReact system is the orthogonal tree visualization, which clearly maps the target to its primary precursors, illustrating the convergent nature of the synthesis.
*Fig 2. Route 1&2: Strategic Disconnection Tree.*
For each step, the system generates high-fidelity reaction mappings, highlighting the transformation of functional groups and atomic changes.
*Fig 3. Detailed Reaction Mapping: Step 1 (Coupling).*
*Fig 4. Detailed Reaction Mapping: Step 2 (Coupling).*
Target Molecule: Ic1ccc(c(c1)F)Nc1c(ccc(c1F)F)C(=O)N1CC(O)(C1)[C@@H]1CCCCN1

- Core Skeleton: Biaryl-amide with fused azabicyclic system
- Complexity: Halogenated aryl rings (I, F), Chiral azabicyclo[2.2.1]heptane, Pyrrolidine moiety, Multiple amide linkages
- Strategy: Convergent
- Critical Issues: None
Step 1: Nucleophilic Aromatic Substitution
- Reagents: K2CO3, DMF
- Conditions: 80°C, 12h
Step 2: Amide Coupling
- Reagents: HATU, DIPEA, DMF
- Conditions: RT, 4h
- Critical Issues: Requires protection/deprotection
Step 1: Buchwald-Hartwig Coupling
- Reagents: Pd2(dba)3, XPhos, Cs2CO3, Toluene
- Conditions: 100°C, 16h
- Verify availability of Key Starting Materials (KSMs) for Route 1.
- Run Conformer Generation (Module 4) on late-stage intermediates to check steric hindrance.
- Review safety flags for Scale-up.
- Closed-Loop Verification: Integration with `verify_skill.py` ensures that all RDKit-derived properties (LogP, MW, fingerprints) are consistent throughout the planning process.
- Persona-Driven Prompting: Specialized prompts in `prompts_personas.py` reduce hallucination by forcing agents to focus on their specific domain (e.g., the Auditor cannot ignore PG loops).
- Automated Reporting: The `report_generator.py` module compiles JSON audit trails and PNG assets into a single, cohesive Markdown document for peer review.
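
To make the automated reporting idea concrete, here is a minimal sketch of turning a JSON route record into the step-by-step Markdown shown in the sample report. The JSON field names (`route`, `steps`, `reagents`, `conditions`, `critical_issues`) and the `render_route` function are assumptions for illustration; the actual schema and code in `report_generator.py` may differ. The sample values are copied from the report excerpt above.

```python
import json

# Hypothetical route record; the field names are illustrative assumptions.
route_json = """
{
  "route": 1,
  "steps": [
    {"name": "Nucleophilic Aromatic Substitution",
     "reagents": ["K2CO3", "DMF"],
     "conditions": "80°C, 12h"},
    {"name": "Amide Coupling",
     "reagents": ["HATU", "DIPEA", "DMF"],
     "conditions": "RT, 4h"}
  ],
  "critical_issues": ["Requires protection/deprotection"]
}
"""


def render_route(route: dict) -> str:
    """Render one route record as Markdown lines in the report's list style."""
    lines = [f"Route {route['route']}"]
    for number, step in enumerate(route["steps"], start=1):
        lines.append(f"Step {number}: {step['name']}")
        lines.append(f"- Reagents: {', '.join(step['reagents'])}")
        lines.append(f"- Conditions: {step['conditions']}")
    # A missing or empty issue list is reported as "None".
    issues = route.get("critical_issues") or ["None"]
    lines.append(f"- Critical Issues: {'; '.join(issues)}")
    return "\n".join(lines)


report = render_route(json.loads(route_json))
```

Keeping the route as structured data and rendering it at the end means the same record can feed both the audit trail and the human-readable report.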
ChemReact transforms retrosynthesis from a solo "guessing" game into an audited, visual, and documented engineering process. By combining the precision of RDKit with the flexibility of multi-persona LLMs, it offers a scalable solution for early-stage drug discovery and process chemistry.
Some problems remain:
① Small atom mismatches still occur across the retrosynthetic reactions (you can see them even in the two examples above). This may be a minor problem, and it will be fixed in the next version if I have enough time. For now, the program is more of an assistant than an independent researcher.
② Multi-step retro/forward synthesis is still a long way off, especially for new reactions and for complicated multi-compound reactions under specific conditions. I think this is also a huge problem for frontier AI researchers.
③ Accuracy is not the only thing that needs to improve; diversity does too. In reality there are always multiple routes to the same compound, which means I need to design a better strategy for proposing further routes. That may not be easy.
Released Version: v0.1.0 (the multi-step skill is coming)
Project report: PROJECT_REPORT.pdf