ChazenLi/ChemReact

PROJECT_REPORT.md

ChemReact: Integrated Retrosynthesis Planning & Visualization System

1. Project Overview

ChemReact is a cheminformatics and retrosynthesis automation framework designed to bridge the gap between high-level chemical reasoning and executable laboratory procedures. By combining RDKit's molecular manipulation capabilities with a Multi-Persona LLM Architecture, ChemReact provides a closed-loop system for molecular design, auditing, and high-fidelity visualization.

2. Core Architecture: Competitive Multi-Persona Reasoning

Unlike monolithic planning systems, ChemReact utilizes a "checks and balances" approach through specialized agent personas, ensuring that every synthetic route is both strategically sound and tactically executable.

| Persona | Responsibility | Focus |
| --- | --- | --- |
| Top-Level Designer | Strategic Planning | Skeletal disconnection, convergent vs. linear strategy, core ring system construction. |
| Reaction Designer | Tactical Execution | Detailed reagent selection, solvent/catalyst optimization, selectivity control (Regio/Stereo). |
| Auditor | Quality Assurance | Mass balance verification, protecting group (PG) loop detection, safety/toxicity screening. |
| Visualization Specialist | Visual Communication | Creative direction for molecular rendering, identifying key intermediates and reaction centers for highlighting. |
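
To make the Auditor's mass balance check concrete, here is a minimal sketch that compares element counts on both sides of a reaction. It is an illustration only, not ChemReact's actual code: the function names `element_counts` and `is_mass_balanced` are hypothetical, and the parser assumes flat molecular formulas without parentheses or charges.

```python
import re
from collections import Counter


def element_counts(formula: str) -> Counter:
    """Count atoms in a flat molecular formula such as 'C2H4O2' (no parentheses)."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def is_mass_balanced(reactants, products) -> bool:
    """Return True when both sides of the reaction contain the same atoms."""
    left = sum((element_counts(f) for f in reactants), Counter())
    right = sum((element_counts(f) for f in products), Counter())
    return left == right


# Amide formation: acetic acid + methylamine -> N-methylacetamide + water
balanced = is_mass_balanced(["C2H4O2", "CH5N"], ["C3H7NO", "H2O"])
# Forgetting the water by-product breaks the balance
unbalanced = is_mass_balanced(["C2H4O2", "CH5N"], ["C3H7NO"])
print(balanced, unbalanced)
```

A check like this flags steps where a by-product or reagent has been dropped, which is the kind of atom mismatch an auditing persona would need to catch before a route is accepted.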

3. Visual Results and Demonstrations

The system's strength lies in its ability to translate abstract JSON data into intuitive visual artifacts. Below are representative outputs from a standard run (run_001) performing the retrosynthesis of Losartan.

3.1 Target Molecule Analysis

The system identifies the core biphenyl-tetrazole-imidazole scaffold, providing a clean 2D representation for strategic mapping.

target

Fig 1. Target Molecule (Losartan) 2D Rendering.

3.2 Retrosynthesis Tree (The "Tree View")

A cornerstone of the ChemReact system is the orthogonal tree visualization, which clearly maps the target to its primary precursors, illustrating the convergent nature of the synthesis.

route_1_tree route_2_tree

Fig 2. Routes 1 and 2: Strategic Disconnection Tree.
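
The tree view described above can be pictured with a small data structure. The sketch below is an assumption about how such a tree might be modelled, not the project's implementation: the `Node` class, the `render` and `starting_materials` helpers, and the placeholder names (`fragment_A`, `sm_1`, and so on) are all hypothetical and do not correspond to real Losartan precursors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One molecule in a retrosynthesis tree; precursors are its children."""
    name: str
    precursors: List["Node"] = field(default_factory=list)


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented text lines, target first."""
    lines = ["  " * depth + ("└─ " if depth else "") + node.name]
    for child in node.precursors:
        lines.extend(render(child, depth + 1))
    return lines


def starting_materials(node: Node) -> List[str]:
    """Collect the leaves, i.e. molecules with no further disconnection."""
    if not node.precursors:
        return [node.name]
    leaves: List[str] = []
    for child in node.precursors:
        leaves.extend(starting_materials(child))
    return leaves


# Placeholder names only: a target joined from two independently made fragments.
tree = Node("target", [
    Node("fragment_A", [Node("sm_1"), Node("sm_2")]),
    Node("fragment_B"),
])
print("\n".join(render(tree)))
print(starting_materials(tree))
```

A route in which the target has two or more precursor branches, as here, corresponds to the convergent strategy the figure illustrates; a linear route would show a single chain of nodes.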

3.3 Reaction Step Detailing

For each step, the system generates high-fidelity reaction mappings, highlighting the transformation of functional groups and atomic changes.

route_1_step_1 route_1_step_2

Fig 3. Detailed Reaction Mapping: Step 1 (Coupling).

route_2_step_1 route_2_step_2

Fig 4. Detailed Reaction Mapping: Step 2 (Coupling).

4. Additional Visual Results and Demonstrations

Target Molecule: Ic1ccc(c(c1)F)Nc1c(ccc(c1F)F)C(=O)N1CC(O)(C1)[C@@H]1CCCCN1

target

4.1 Executive Summary (Deep Chemical Audit)

  • Core Skeleton: Biaryl-amide with fused azabicyclic system
  • Complexity: Halogenated aryl rings (I, F), Chiral azabicyclo[2.2.1]heptane, Pyrrolidine moiety, Multiple amide linkages
  • Strategy: Convergent

4.2 Recommended Routes

Route 1 (Score: 8.5/10 - PASS)

route_1_overview route_1_tree
Auditor's Verdict
  • Critical Issues: None
Detailed Steps

Step 1: Nucleophilic Aromatic Substitution

  • Reagents: K2CO3, DMF
  • Conditions: 80°C, 12h
route_1_step_1

Step 2: Amide Coupling

  • Reagents: HATU, DIPEA, DMF
  • Conditions: RT, 4h
route_1_step_2

Route 2 (Score: 7.8/10 - PASS)

route_2_overview route_2_tree
Auditor's Verdict
  • Critical Issues: Requires protection/deprotection

Detailed Steps

Step 1: Buchwald-Hartwig Coupling

  • Reagents: Pd2(dba)3, XPhos, Cs2CO3, Toluene
  • Conditions: 100°C, 16h
route_2_step_1

4.3 Recommended Next Steps

  1. Verify availability of Key Starting Materials (KSMs) for Route 1.
  2. Run Conformer Generation (Module 4) on late-stage intermediates to check steric hindrance.
  3. Review safety flags for Scale-up.

5. Key Technical Innovations

  • Closed-Loop Verification: Integration with verify_skill.py ensures that all RDKit-derived properties (LogP, MW, Fingerprints) are consistent throughout the planning process.
  • Persona-Driven Prompting: Specialized prompts in prompts_personas.py reduce hallucination by forcing agents to focus on their specific domain (e.g., the Auditor cannot ignore PG loops).
  • Automated Reporting: The report_generator.py compiles JSON audit trails and PNG assets into a single, cohesive Markdown document for peer review.
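
As a rough illustration of the automated reporting idea, the sketch below turns a JSON route record into report text. The JSON schema (`route_id`, `score`, `verdict`, `critical_issues`, `steps`) and the `render_route` function are assumptions made for this example and are not taken from report_generator.py; the values reuse Route 1 from Section 4.2.

```python
import json

# Hypothetical audit-trail record; field names are assumed for illustration.
ROUTE_JSON = """
{
  "route_id": 1,
  "score": 8.5,
  "verdict": "PASS",
  "critical_issues": [],
  "steps": [
    {"name": "Nucleophilic Aromatic Substitution",
     "reagents": ["K2CO3", "DMF"], "conditions": "80°C, 12h"},
    {"name": "Amide Coupling",
     "reagents": ["HATU", "DIPEA", "DMF"], "conditions": "RT, 4h"}
  ]
}
"""


def render_route(route: dict) -> str:
    """Format one route record as plain report text."""
    issues = ", ".join(route["critical_issues"]) or "None"
    lines = [
        f"Route {route['route_id']} (Score: {route['score']}/10 - {route['verdict']})",
        f"Critical Issues: {issues}",
    ]
    for index, step in enumerate(route["steps"], start=1):
        lines.append(f"Step {index}: {step['name']}")
        lines.append(f"  Reagents: {', '.join(step['reagents'])}")
        lines.append(f"  Conditions: {step['conditions']}")
    return "\n".join(lines)


report = render_route(json.loads(ROUTE_JSON))
print(report)
```

Keeping the route data in JSON and the formatting in one function means the same audit trail could feed both the Markdown report and the image-generation step without re-parsing model output.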

6. Conclusion

ChemReact transforms retrosynthesis from a solo "guessing" game into an audited, visual, and documented engineering process. By combining the precision of RDKit with the flexibility of multi-persona LLMs, it offers a scalable solution for early-stage drug discovery and process chemistry.

But there are still some problems:
① There are still some small atom mismatches in the overall retro reaction (you can even see them in the two examples above). This may be a small problem and will be fixed in the next version, if I have enough time. For now, this program is more like an assistant than an independent researcher.
② It is still a long voyage to multi-step retro/forward synthesis, especially for new reactions and for complicated, multi-compound reactions under specific conditions. I think this is also a huge problem for frontier AI researchers.
③ Accuracy is not the only thing that needs to be improved; so does diversity. In reality there are always multiple routes to the same compound, which means I need to design a better strategy for generating further routes. That may not be that easy...


Released Version: v0.1.0. The multi-step skill is coming. See PROJECT_REPORT.pdf.

About

This is a historical collection of the agent skill system for molecular reaction and retrosynthesis. It contains different versions of the whole system; you may need to read the PDF and the results in the corresponding files for details. All you need is Claude Code/Open code/Codex/Gemini CLI and a download of this skill.
