This is the official code repository for the paper Exploring Chemical Space with Score-based Out-of-distribution Generation (ICML 2023), in which we propose Molecular Out-Of-distribution Diffusion (MOOD).
- We propose a novel score-based generative model for OOD generation, which overcomes the limited explorability of previous models by leveraging an OOD-controlled reverse-time diffusion process.
- Since the exploration space extended by the OOD control contains chemically implausible molecules, we propose a molecule optimization framework that uses the gradients of a property prediction network to confine the generated molecules to a novel yet chemically meaningful space.
- We experimentally demonstrate that the proposed MOOD can generate novel molecules that are drug-like, synthesizable, and have high binding affinity for five protein targets, outperforming existing molecule generation methods.
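The interplay of the two ideas above, loosening the score to widen exploration and adding a property gradient to steer it, can be illustrated on a one-dimensional toy problem. This is a minimal sketch and not the MOOD implementation: the standard-normal "data" score, the `ood_scale` factor, the quadratic `property_grad`, and all parameter values are assumptions chosen only for illustration, and plain Langevin dynamics stands in for the reverse-time diffusion.

```python
import math
import random


def data_score(x):
    """Score (gradient of the log-density) of a standard normal.

    Stand-in for a learned score network; an assumption for this toy.
    """
    return -x


def property_grad(x, target=3.0):
    """Gradient of a hypothetical property P(x) = -(x - target)^2 / 2."""
    return -(x - target)


def guided_langevin(ood_scale, prop_weight, n_chains=500, n_steps=400,
                    step=0.01, seed=0):
    """Sample with Langevin dynamics on a combined score.

    combined score = ood_scale * data_score(x) + prop_weight * property_grad(x)
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_chains):
        x = rng.gauss(0.0, 1.0)  # start each chain from the toy data distribution
        for _ in range(n_steps):
            score = ood_scale * data_score(x) + prop_weight * property_grad(x)
            x += step * score + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)
```

In this toy, `guided_langevin(1.0, 0.0)` samples the original distribution, lowering `ood_scale` broadens the samples, and a positive `prop_weight` pulls them toward the hypothetical target.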
Run the following commands to install the dependencies:
conda create -n mood python=3.8
conda activate mood
conda install -c pytorch pytorch==1.12.0 cudatoolkit=11.3
conda install -c conda-forge rdkit=2020.09 openbabel
pip install tqdm pyyaml pandas easydict networkx==2.6.3 numpy==1.20.3
chmod u+x scorer/qvina02
MOOD utilizes GDSS as its backbone diffusion model. In our paper, we utilized the pretrained gdss_zinc250k_v2.pth GDSS checkpoint, which is in the folder checkpoints/ZINC250k.
Run the following command to preprocess the ZINC250k dataset:
python data/preprocess.py
We provide pretrained property predictor networks, one per protein target (prop_parp1.pth, prop_fa7.pth, prop_5ht1b.pth, prop_braf.pth, and prop_jak2.pth), in the folder checkpoints/ZINC250k.
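Before training or sampling, it can help to confirm that the checkpoint files named in this README are in place. The following is a small sketch, not part of the repository: the helper `missing_checkpoints` is hypothetical, and it assumes the script is run from the repository root so that checkpoints/ZINC250k resolves correctly.

```python
from pathlib import Path

# File names as listed in this README.
EXPECTED_CHECKPOINTS = [
    "gdss_zinc250k_v2.pth",
    "prop_parp1.pth",
    "prop_fa7.pth",
    "prop_5ht1b.pth",
    "prop_braf.pth",
    "prop_jak2.pth",
]


def missing_checkpoints(root="checkpoints/ZINC250k", expected=EXPECTED_CHECKPOINTS):
    """Return the expected checkpoint file names that are not present under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:", ", ".join(missing))
    else:
        print("All expected checkpoints found.")
```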
To train your own property predictor, run the following command:
CUDA_VISIBLE_DEVICES=${gpu_id} python main.py --type train --config prop_train
You can modify hyperparameters in config/prop_train.yaml.
To generate molecules, run the following command:
CUDA_VISIBLE_DEVICES=${gpu_id} python main.py --type sample --config sample
You can modify hyperparameters in config/sample.yaml.
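To script the training and sampling commands above, for example to run them back to back on a chosen GPU, a thin Python wrapper can build the same invocation. This is a sketch under assumptions: `build_run` and `run` are hypothetical helpers, not part of the repository, and they accept only the two `--type` values shown in this README.

```python
import os
import subprocess
import sys


def build_run(run_type, config, gpu_id=0):
    """Build the argv list and environment for one main.py invocation."""
    # Only the two values shown in this README are accepted here;
    # main.py itself may support others.
    if run_type not in ("train", "sample"):
        raise ValueError(f"unsupported run type: {run_type!r}")
    argv = [sys.executable, "main.py", "--type", run_type, "--config", config]
    # Copy the current environment and pin the visible GPU.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
    return argv, env


def run(run_type, config, gpu_id=0):
    """Run main.py from the repository root; raise if it exits with an error."""
    argv, env = build_run(run_type, config, gpu_id)
    subprocess.run(argv, env=env, check=True)


# Example usage (run from the repository root):
#   run("train", "prop_train", gpu_id=0)
#   run("sample", "sample", gpu_id=0)
```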
If you find this repository and our paper useful, please cite our work.
@article{lee2023MOOD,
author = {Seul Lee and Jaehyeong Jo and Sung Ju Hwang},
title = {Exploring Chemical Space with Score-based Out-of-distribution Generation},
journal = {Proceedings of the 40th International Conference on Machine Learning},
year = {2023}
}