DEVELOP - Deep Generative Design with 3D Pharmacophoric Constraints

This repository contains our implementation of Deep Generative Design with 3D Pharmacophoric Constraints (DEVELOP).

If you found DEVELOP useful, please cite our preprint:

@article{Imrie2021DEVELOP,
  title = {Deep Generative Design with 3D Pharmacophoric Constraints},
  author = {Imrie, Fergus and Hadfield, Thomas E and Bradley, Anthony R and Deane, Charlotte M},
  journal = {bioRxiv},
  doi = {10.1101/2021.04.27.441676},
  year = {2021},
  URL = {https://www.biorxiv.org/content/10.1101/2021.04.27.441676v1.full},
  eprint = {https://www.biorxiv.org/content/10.1101/2021.04.27.441676v1.full.pdf}
}

Acknowledgements

We thank the authors of Constrained Graph Variational Autoencoders for Molecule Design for releasing their code. The code in this repository is based on their source code release (link). If you find this code useful, please consider citing their work.

We thank the authors of libmolgrid, which we use for molecular gridding (link to project). If you use DEVELOP, please consider citing their work.

Requirements

This code was tested with Python 3.7, TensorFlow 1.14, and molgrid 0.2.1.

A YAML file listing all requirements (DEVELOP_env.yml) is provided. Use it to set up the environment with conda:

conda env create -f DEVELOP_env.yml
conda activate DEVELOP_env

Note that a GPU is required to run DEVELOP because of the molgrid dependency.
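Before running anything, it can help to confirm that the activated environment matches the versions listed above. The following is a minimal sketch using only the standard library; `check_environment` is a hypothetical helper, not part of DEVELOP. It only checks the Python version and whether the `tensorflow` and `molgrid` packages can be found, and it does not verify package versions or GPU availability.

```python
import importlib.util
import sys

# Python version stated in this README; the check itself is illustrative only.
EXPECTED_PYTHON = (3, 7)
# Top-level package names assumed to be importable in the conda environment.
REQUIRED_MODULES = ("tensorflow", "molgrid")


def check_environment():
    """Report whether the interpreter and key packages look as expected."""
    report = {"python_matches": sys.version_info[:2] == EXPECTED_PYTHON}
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing or different'}")
```

A `False` entry for `python_matches` does not necessarily mean DEVELOP will fail; it only indicates that the interpreter differs from the tested version.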

Data Extraction

We use three primary datasets (ZINC, CASF, PDBbind).

To preprocess these datasets for DEVELOP, go to the data directory and run prepare_data_linker_design.py or prepare_data_scaffold_elaboration.py.

For example:

python prepare_data_linker_design.py --output_dir ./linker_design --add_idx --include_pharms

Running DEVELOP

DEVELOP can be used for both linker design and scaffold elaboration. The commands below show how to train a model and generate molecules for linker design. Scaffold elaboration works analogously.

We provide two settings to generate molecules with DEVELOP. The first setting generates molecules with the same number of atoms as the reference molecule. The second setting generates elaborations with a specified number of atoms.

To train and generate molecules using the first setting, use:

python DEVELOP.py --dataset zinc --config '{"num_epochs": 10, "epoch_to_generate": 10, "train_file": "data/linker_design/molecules_zinc_train.json", "train_struct_file": "data/linker_design/zinc_train_structs.types", "valid_file": "data/linker_design/molecules_zinc_test.json", "valid_struct_file": "data/linker_design/zinc_test_structs.types", "struct_data_root": "./data/linker_design/"}'
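Because the --config value is a JSON string nested inside shell quotes, it is easy to mistype. As a sketch, the same command can be assembled in Python so that the JSON is always well formed. `build_develop_command` is a hypothetical helper, not part of the repository; the flags and configuration keys are copied from the command above.

```python
import json
import shlex


def build_develop_command(config, dataset="zinc", restore=None):
    """Return the argument list for a DEVELOP.py call with the given config."""
    args = ["python", "DEVELOP.py", "--dataset", dataset]
    if restore is not None:
        args += ["--restore", restore]
    # json.dumps guarantees valid JSON (double quotes, lowercase true/false).
    args += ["--config", json.dumps(config)]
    return args


train_config = {
    "num_epochs": 10,
    "epoch_to_generate": 10,
    "train_file": "data/linker_design/molecules_zinc_train.json",
    "train_struct_file": "data/linker_design/zinc_train_structs.types",
    "valid_file": "data/linker_design/molecules_zinc_test.json",
    "valid_struct_file": "data/linker_design/zinc_test_structs.types",
    "struct_data_root": "./data/linker_design/",
}

command = build_develop_command(train_config)
# shlex.join needs Python 3.8+; quoting each argument also works on 3.7.
print(" ".join(shlex.quote(arg) for arg in command))
```

The returned list can also be passed directly to `subprocess.run(command, check=True)`, which avoids shell quoting altogether.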

To generate molecules with a pretrained model using the first setting, use:

python DEVELOP.py --dataset zinc --restore models/linker_design/pretrained_DEVELOP_model.pickle --config '{"generation": true, "number_of_generation_per_valid": 10, "batch_size": 1, "train_file": "data/linker_design/molecules_casf_test.json", "train_struct_file": "data/linker_design/casf_test_structs.types", "valid_file": "data/linker_design/molecules_casf_test.json", "valid_struct_file": "data/linker_design/casf_test_structs.types", "struct_data_root": "./data/linker_design/"}'

To generate molecules using the second setting, first prepare the data using the --test_mode option:

python prepare_data_linker_design.py --add_idx --include_pharms --test_mode --data_path linker_design/data_casf.txt --dataset_name casf_test_setting_2 --output_dir ./linker_design

Then use:

python DEVELOP.py --dataset zinc --restore models/linker_design/pretrained_DEVELOP_model.pickle --config '{"generation": true, "number_of_generation_per_valid": 10, "batch_size": 1, "train_file": "data/linker_design/molecules_casf_test_setting_2.json", "train_struct_file": "data/linker_design/casf_test_structs.types", "valid_file": "data/linker_design/molecules_casf_test_setting_2.json", "valid_struct_file": "data/linker_design/casf_test_structs.types", "struct_data_root": "./data/linker_design/", "min_atoms": 5, "max_atoms": 6}'

In both cases, the output is of the following format:

Input fragments (SMILES) Ground truth molecule/fragments (SMILES) Generated molecule (SMILES)
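A short sketch of how such output could be read back into Python. It assumes one record per line with the three SMILES fields separated by whitespace, which this README does not state explicitly, so check it against your actual output file. `GeneratedRecord`, `parse_output_line`, and `parse_output` are hypothetical names, and the sample line is made up for illustration.

```python
from typing import List, NamedTuple


class GeneratedRecord(NamedTuple):
    fragments: str     # input fragments (SMILES)
    ground_truth: str  # ground truth molecule/fragments (SMILES)
    generated: str     # generated molecule (SMILES)


def parse_output_line(line: str) -> GeneratedRecord:
    """Split one output line into its three SMILES fields."""
    # Assumption: fields are whitespace-separated; SMILES contain no spaces.
    fields = line.split()
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    return GeneratedRecord(*fields)


def parse_output(lines) -> List[GeneratedRecord]:
    """Parse an iterable of output lines, skipping blank ones."""
    return [parse_output_line(line) for line in lines if line.strip()]


# Made-up sample line, used only to show the record layout.
sample = "[*]c1ccccc1.[*]C CCc1ccccc1 COc1ccccc1"
record = parse_output_line(sample)
print(record.generated)
```

To read a whole file, pass an open file handle to `parse_output`, since iterating over a file yields its lines.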

More configurations can be found in default_params in DEVELOP.py.

Pretrained Models and Generated Molecules

We provide pretrained models for both linker design and scaffold elaboration:

models/linker_design/pretrained_DEVELOP_model.pickle
models/scaffold_elaboration/pretrained_DEVELOP_model.pickle

Generated molecules can be obtained upon request.

Contact (Questions/Bugs/Requests)

Please submit a GitHub issue or contact the Oxford Protein Informatics Group (OPIG) at [email protected].
