We introduce GLARE, a GRPO-based Learning framework for Active REinforced screening, designed to overcome the limitations of traditional active-learning methods and enhance large-scale virtual screening. GLARE reformulates the virtual screening process as a Markov Decision Process (MDP), enabling reinforcement learning to dynamically optimize molecule-selection strategies. By leveraging Group Relative Policy Optimization (GRPO), GLARE removes the reliance on manually designed heuristics and learns to adaptively screen large-scale chemical spaces.
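As a rough illustration (not the repository's implementation), GRPO's key trick is a group-relative advantage: each sampled action's reward is normalized against the mean and standard deviation of its sampling group, so no learned value baseline is needed. The function and variable names below are illustrative only:

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """Compute group-relative advantages: A_i = (r_i - mean) / (std + eps).

    In a screening step, a "group" could be a batch of candidate molecule
    selections sampled from the current policy; each advantage ranks a
    selection against its peers rather than against a critic's estimate.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: three sampled selections with hypothetical screening rewards
advs = group_relative_advantages([0.2, 0.5, 0.8])
```

The advantages are zero-mean within each group, so higher-reward selections are reinforced and lower-reward ones suppressed, relative to the group itself.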
Required dependencies and versions:
- Python 3.9
- PyTorch 1.13.1
- PyTorch Geometric 2.4.0
- NumPy 1.24.0
- Pandas 2.0.3
- SciPy 1.13.1
- scikit-learn 1.3.0
- RDKit 2023.3.2
Run `utils/preprocess_data.py` to preprocess ALDH1, PKM2, and VDR.
Run `utils/preprocess_data_enamine.py` to preprocess Enamine50k and EnamineHTS.
For LIT-PCBA, run this command:

```shell
python main.py -cuda $cuda -output_folder "result_VDR" -mode "a" -architecture "ginl" -strategy "grpo" -dataset "VDR" -seed 0 -start_active_num 1 -start_num 64 -batch_size 64 -max_screen_size 1000 -ensemble_size 10 -epochs 50
```

For Enamine, run this command:

```shell
python main.py -cuda $cuda -output_folder "result_Enamine50k" -mode "a" -architecture "gine" -strategy "grpo" -dataset "Enamine50k" -seed 0 -start_active_num 1 -start_num 500 -batch_size 500 -max_screen_size 3000 -ensemble_size 10 -epochs 3
```

If you find GLARE useful for your research and applications, please cite:
```bibtex
@inproceedings{chen2025reinforced,
  title={Reinforced Active Learning for Large-Scale Virtual Screening with Learnable Policy Model},
  author={Yicong Chen and Jiahua Rao and Jiancong Xie and Dahao Xu and Zhen WANG and Yuedong Yang},
  booktitle={Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025},
  url={https://neurips.cc/virtual/2025/poster/119971}
}
```
Please contact Jiahua Rao for any questions or suggestions.
