paccmann_sarscov2's Introduction

DISCLAIMER:

This repository provides the TensorFlow implementation of PaccMann as described in our paper in Molecular Pharmaceutics.

PaccMann

paccmann is a package for drug sensitivity prediction and is the core component of the repo.

The package provides a toolbox of learning models for IC50 prediction that use the chemical properties of drugs and the gene expression of tissue-specific cell lines.

Citation

Please cite us as follows:

@article{oskooei2018paccmann,
  title={PaccMann: Prediction of anticancer compound sensitivity with multi-modal attention-based neural networks},
  author={Oskooei, Ali and Born, Jannis and Manica, Matteo and Subramanian, Vigneshwari and S{\'a}ez-Rodr{\'\i}guez, Julio and Mart{\'\i}nez, Mar{\'\i}a Rodr{\'\i}guez},
  journal={arXiv preprint arXiv:1811.06802},
  year={2018}
}

@article{manica2019paccmann,
author = {Manica, Matteo and Oskooei, Ali and Born, Jannis and Subramanian, Vigneshwari and Saez-Rodriguez, Julio and Rodriguez Martinez, Maria},
title = {Toward Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-Based Convolutional Encoders},
journal = {Molecular Pharmaceutics},
year = {2019},
doi = {10.1021/acs.molpharmaceut.9b00520},
note = {PMID: 31618586},
}

Installation

Setup of the virtual environment

We strongly recommend working inside a virtual environment (venv).

Create the environment:

python3 -m venv venv

Activate it:

source venv/bin/activate
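
Before installing the module, it can help to confirm that the interpreter really is running inside the activated environment. The snippet below is a minimal sketch, not part of the package: the function name `in_virtual_environment` is made up for illustration, and the check only covers environments created with the standard `venv` module.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv (hypothetical helper)."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("venv active:", in_virtual_environment())
```

If this reports that no venv is active, re-run the activation command in the current shell before calling pip3.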

Module installation

The module can be installed either in editable mode:

pip3 install -e .

Or as a normal package:

pip3 install .

Model training

Models can be trained using the script bin/training_paccmann that is installed together with the module. Check the examples for a quick start. For more details see the help of the training command by typing training_paccmann -h:

usage: training_paccmann [-h] [-save_checkpoints_steps 300]
                         [-eval_throttle_secs 60] [-model_suffix]
                         [-train_steps 10000] [-batch_size 64]
                         [-learning_rate 0.001] [-dropout 0.5]
                         [-buffer_size 20000] [-number_of_threads 1]
                         [-prefetch_buffer_size 6]
                         train_filepath eval_filepath model_path
                         model_specification_fn_name params_filepath
                         feature_names

Run training of a `paccmann` model.

positional arguments:
  train_filepath        Path to train data.
  eval_filepath         Path to eval data.
  model_path            Path where the model is stored.
  model_specification_fn_name
                        Model specification function. Pick one of the
                        following: ['dnn', 'rnn', 'scnn', 'sa', 'ca', 'mca'].
  params_filepath       Path to model params. Dictionary with parameters
                        defining the model.
  feature_names         Comma separated feature names. Select from the
                        following: ['smiles_character_tokens',
                        'smiles_atom_tokens', 'fingerprints_256',
                        'fingerprints_512', 'targets_10', 'targets_20',
                        'targets_50', 'selected_genes_10',
                        'selected_genes_20', 'cnv_min', 'cnv_max', 'disrupt',
                        'zigosity', 'ic50', 'ic50_labels'].

optional arguments:
  -h, --help            show this help message and exit
  -save_checkpoints_steps 300, --save-checkpoints-steps 300
                        Steps before saving a checkpoint.
  -eval_throttle_secs 60, --eval-throttle-secs 60
                        Throttle seconds between evaluations.
  -model_suffix , --model-suffix 
                        Suffix for the trained moedel.
  -train_steps 10000, --train-steps 10000
                        Number of training steps.
  -batch_size 64, --batch-size 64
                        Batch size.
  -learning_rate 0.001, --learning-rate 0.001
                        Learning rate.
  -dropout 0.5, --dropout 0.5
                        Dropout to be applied to set and dense layers.
  -buffer_size 20000, --buffer-size 20000
                        Buffer size for data shuffling.
  -number_of_threads 1, --number-of-threads 1
                        Number of threads to be used in data processing.
  -prefetch_buffer_size 6, --prefetch-buffer-size 6
                        Prefetch buffer size to allow pipelining.
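
To illustrate how the positional and optional arguments fit together, here is a small sketch that assembles a training_paccmann command line from Python. It only builds and prints the command and does not run it. The helper `build_training_command` and all file paths are hypothetical placeholders, and whether the chosen feature names suit the 'mca' model is an assumption; only the flag names, model names, and feature names are taken from the help text above.

```python
import shlex


def build_training_command(train_filepath, eval_filepath, model_path,
                           model_fn, params_filepath, feature_names,
                           **options):
    """Assemble the argument list for training_paccmann (hypothetical helper)."""
    command = ["training_paccmann"]
    # Optional arguments use the long form from the help text,
    # e.g. batch_size=64 becomes "--batch-size 64".
    for name, value in options.items():
        command += ["--" + name.replace("_", "-"), str(value)]
    # Positional arguments follow in the order shown in the usage line.
    command += [
        train_filepath,
        eval_filepath,
        model_path,
        model_fn,
        params_filepath,
        ",".join(feature_names),  # feature names are comma separated
    ]
    return command


# All paths below are placeholders, not files shipped with the repository.
command = build_training_command(
    "path/to/train_data",
    "path/to/eval_data",
    "path/to/model_dir",
    "mca",
    "path/to/params.json",
    ["smiles_character_tokens", "selected_genes_20", "ic50"],
    batch_size=64,
    train_steps=10000,
)
print(" ".join(shlex.quote(argument) for argument in command))
```

The resulting list can be passed to `subprocess.run` once the placeholder paths are replaced with real files.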

paccmann_sarscov2's Issues

runtime error when I ran training

Thank you for your repo.
When I ran
(paccmann_sarscov2) $ python ./code/paccmann_generator/examples/affinity/train_conditional_generator.py \
    ./models/SELFIESVAE \
    ./models/ProteinVAE \
    ./models/affinity \
    ./data/training/merged_sequence_encoding/uniprot_covid-19.csv \
    ./code/paccmann_generator/examples/affinity/conditional_generator.json \
    paccmann_sarscov2 \
    35 \
    ./data/training/unbiased_predictions \
    --tox21_path ./models/Tox21

I got the following error messages:

INFO:train_paccmann_rl:Model with name paccmann_sarscov2_35 starts.
INFO:train_paccmann_rl:Test protein is ACE2_HUMAN
WARNING:train_paccmann_rl:Model exists already. Call model.load() to restore weights.
INFO:train_paccmann_rl:Model stored at biased_models/paccmann_sarscov2_35
INFO:train_paccmann_rl:Current train protein: NS8B_CVHSA
Traceback (most recent call last):
File "./code/paccmann_generator/examples/affinity/train_conditional_generator.py", line 324, in <module>
main(parser_namespace=args)
File "./code/paccmann_generator/examples/affinity/train_conditional_generator.py", line 244, in main
protein_name, epoch, params['batch_size']
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_generator/reinforce_proteins.py", line 298, in policy_gradient
latent_z, remove_invalid=True
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_generator/reinforce.py", line 169, in get_smiles_from_latent
search=SamplingSearch(temperature=self.temperature)
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_chemistry/models/vae.py", line 640, in generate
generate_len=generate_len
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_chemistry/models/vae.py", line 457, in generate_from_latent
output, hidden, stack = self(input_token, hidden, stack)
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_chemistry/models/stack_rnn.py", line 153, in forward
gru_input, stack = self.update(embedded_input, hidden, stack)
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_chemistry/models/stack_rnn.py", line 62, in
inp, hidden, stack
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_chemistry/models/stack_rnn.py", line 195, in _stack_update
stack_input.permute(1, 0, 2), stack, stack_controls
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/paccmann_chemistry/models/stack_rnn.py", line 228, in stack_augmentation
stack_up = torch.cat((input_val, prev_stack[:, :-1]), dim=1)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking arugment for argument tensors in method wrapper__cat)

When I tried

(paccmann_sarscov2) $ python ./code/toxsmi/scripts/train_tox.py \
    ./data/pretraining/toxicity_predictor/tox21_train.csv \
    ./data/pretraining/toxicity_predictor/tox21_test.csv \
    ./data/pretraining/toxicity_predictor/tox21.smi \
    ./data/pretraining/language_models/smiles_language_tox21.pkl \
    ./models/ \
    ./code/toxsmi/params/mca.json \
    Tox21 \
    --embedding_path ./data/pretraining/toxicity_predictor/smiles_vae_embeddings.pkl

similar errors occurred:

INFO:Tox21:mca.json
INFO:Tox21:== Epoch [0/200] ==
Traceback (most recent call last):
File "./code/toxsmi/scripts/train_tox.py", line 420, in <module>
args.params_filepath, args.training_name, args.embedding_path
File "./code/toxsmi/scripts/train_tox.py", line 284, in main
loss = model.loss(y_hat, y.to(device))
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/toxsmi/models/mca.py", line 307, in loss
return self.loss_fn(yhat, y)
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/xzhang/miniconda3/envs/paccmann_sarscov2/lib/python3.7/site-packages/toxsmi/utils/wrappers.py", line 76, in forward
out = loss * weight_tensor
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
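
Both tracebacks end in an operation (torch.cat and an elementwise product) whose operands live on different devices. A common way to avoid this class of error in PyTorch is to register auxiliary tensors as module buffers so that they move together with the module. The sketch below shows that pattern under stated assumptions: `WeightedBCELoss` is a hypothetical stand-in and not the toxsmi or paccmann_chemistry code, and it is not confirmed to be the fix adopted by the maintainers.

```python
import torch
import torch.nn as nn


class WeightedBCELoss(nn.Module):
    """Hypothetical weighted loss; not the toxsmi implementation."""

    def __init__(self, class_weights):
        super().__init__()
        # A registered buffer follows the module on .to(device) or .cuda(),
        # whereas a plain tensor attribute would stay on the CPU.
        self.register_buffer(
            "class_weights",
            torch.as_tensor(class_weights, dtype=torch.float32),
        )
        self.bce = nn.BCELoss(reduction="none")

    def forward(self, yhat, y):
        loss = self.bce(yhat, y)
        # Defensive: align the weights with the device of the loss tensor.
        weights = self.class_weights.to(loss.device)
        return (loss * weights).mean()


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
loss_fn = WeightedBCELoss([1.0, 2.0]).to(device)
yhat = torch.tensor([[0.9, 0.2], [0.4, 0.7]], device=device)
y = torch.tensor([[1.0, 0.0], [0.0, 1.0]], device=device)
loss = loss_fn(yhat, y)
print(loss.item())
```

The same principle applies to the torch.cat call in the first traceback: every tensor passed to the operation has to be moved to one device before the call.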

pip conflict of VCS project urls

With pip 20.3, a dependency conflict is raised whenever the VCS project URLs of a requirement differ in any way, even if they resolve to exactly the same version.

In this repository, the conflict is reported as:

The conflict is caused by:
    The user requested pytoda 0.1.1 (from git+https://github.com/PaccMann/[email protected])
    paccmann-chemistry 0.0.4 depends on pytoda 0.1.1 (from git+https://****@github.com/PaccMann/[email protected])
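
The two requirement URLs above appear to differ only in the credentials embedded before the host name. The following sketch shows how such URLs can be compared once the credentials are stripped. It is only an illustration and not how pip itself resolves requirements; the helper `strip_credentials` and the example-org URLs are hypothetical.

```python
from urllib.parse import urlsplit


def strip_credentials(vcs_url):
    """Drop any 'user@' or 'user:password@' part from a URL (hypothetical helper)."""
    parts = urlsplit(vcs_url)
    # Everything after the last '@' in the netloc is the host (and port).
    host = parts.netloc.rpartition("@")[2]
    return parts._replace(netloc=host).geturl()


# Placeholder URLs, not the real project URLs from the report.
plain = "git+https://github.com/example-org/example-repo@1.0"
with_token = "git+https://token@github.com/example-org/example-repo@1.0"

print(plain == with_token)
print(strip_credentials(plain) == strip_credentials(with_token))
```

The raw strings differ while the stripped forms match, which is the kind of mismatch described in the report.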

Thanks to the anonymous reporter of the issue.
