experiments-binding-affinity's Issues

Add CI badge to README

We should add a CI badge so we can easily see and access the state of the nightly CI tests.

Tests/examples

The ligand-only tests are not actually ligand-based: grouping per kinase should be added to the associated Python scripts.

Training tests do not work

Both training tests in tests/test_examples.sh do not work with the current kinoml master.

They both report the same error:

papermill.exceptions.PapermillExecutionError:
---------------------------------------------------------------------------
Exception encountered at "In [10]":
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-c525cb273784> in <module>
      6     a_dataloader = dataloaders[next(iter(dataloaders.keys()))]["train"]
      7     x_sample, _ = next(iter(a_dataloader))
----> 8     MODEL_KWARGS["input_shape"] = ModelCls.estimate_input_shape(x_sample)
      9
     10 nn_model = ModelCls(**MODEL_KWARGS)

~/miniconda3/envs/experiments-binding-affinity/lib/python3.9/site-packages/kinoml/ml/torch_models.py in estimate_input_shape(input_sample)
     23         than a Tensor, please adapt this method accordingly.
     24         """
---> 25         return input_sample.shape[1]
     26
     27
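
The traceback above is cut off before the AttributeError message, so the exact cause is not shown. One possibility, consistent with the docstring fragment visible in the traceback ("than a Tensor, please adapt this method accordingly"), is that `x_sample` is not a single tensor-like object (for example a tuple or list of inputs) and therefore has no `.shape`. A minimal sketch of a more tolerant shape estimate follows; `estimate_input_shape_tolerant` and `FakeTensor` are hypothetical names used for illustration and are not part of kinoml.

```python
from collections.abc import Sequence


def estimate_input_shape_tolerant(input_sample):
    """Return the feature dimension (shape[1]) of a batch-like sample.

    Hypothetical helper, not kinoml API. If the sample has no ``shape``
    attribute but is a non-empty sequence (e.g. a tuple of inputs),
    the first element is inspected instead.
    """
    if hasattr(input_sample, "shape"):
        return input_sample.shape[1]
    is_sequence = isinstance(input_sample, Sequence) and not isinstance(
        input_sample, (str, bytes)
    )
    if is_sequence and len(input_sample) > 0:
        return estimate_input_shape_tolerant(input_sample[0])
    raise TypeError(
        f"Cannot estimate input shape from {type(input_sample).__name__}; "
        "expected an object with .shape or a sequence of such objects."
    )


class FakeTensor:
    """Stand-in for a tensor: it only carries a shape, for demonstration."""

    def __init__(self, shape):
        self.shape = shape


batch = FakeTensor((32, 512))
shape_from_tensor = estimate_input_shape_tolerant(batch)
shape_from_tuple = estimate_input_shape_tolerant((batch, FakeTensor((32, 20))))
```

If the sample really is a tuple of inputs, taking only the first element may not be the right choice for every model; that decision is left open here.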

KLIFS binding site

Features and models that use the KLIFS binding site sequence instead of the full kinase sequence.

🚧 WIP, see PR #45 in kinoml

  • morgan + composition binding site sequence
  • smiles + sequence binding site sequence

Features missing

Ligand features to be tested in notebook template

  • graph featurizer
    • check per node features
  • onehot smiles
    • check pipeline and max length padding

Cannot create environment from environment.yml

The following command fails to create the environment:

mamba env create -f environment.yml -n experiments-binding-affinity

I got the following error about missing torch modules in the pip installation step:

....
        import torch
    ModuleNotFoundError: No module named 'torch'
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/29/96/566ac314e796d4b07209a3b88cc7a8d2e8582d55819e33f72e6c0e8d8216/torch_scatter-0.3.0.tar.gz#sha256=9e5e5a6efa4ef45f584e8611f83690d799370dd122b862646751ae112b685b50 (from https://pypi.org/simple/torch-scatter/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement torch-scatter (from versions: 0.3.0, 1.0.2, 1.0.3, 1.0.4, 1.1.0, 1.1.1, 1.1.2, 1.2.0, 1.3.0, 1.3.1, 1.3.2, 1.4.0, 2.0.2, 2.0.3, 2.0.4, 2.0.5, 2.0.6, 2.0.7)
ERROR: No matching distribution found for torch-scatter

I will check what I can find about this but let me know if you have an easy fix.
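
The log shows `import torch` failing while pip runs `python setup.py egg_info` for torch-scatter 0.3.0, which suggests that this source distribution needs torch to be importable at build time. As a diagnostic, a small pre-flight check like the one below could confirm whether torch is visible to the interpreter that pip uses. `missing_modules` is a hypothetical helper, and whether installing torch before torch-scatter resolves the problem here is an assumption that has not been verified.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# 'torch' is the module the build step tried to import in the log above.
missing = missing_modules(["torch"])
if missing:
    print(f"Not importable in this environment: {', '.join(missing)}")
else:
    print("All build-time modules are importable.")
```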

Protect master branch

Should we protect the master branch? I accidentally pushed something to master this morning instead of to a new branch.

Issue with testing example for kinoml and experiments-binding-affinity

As requested by @jaimergp: running an example notebook on kinoml (following the steps in his slides) resulted in Papermill errors on both macOS and Linux systems.

In both cases the command run was:
python run_notebook.py features/featurize-template.ipynb features/example-ligand-only-chembl28-morgan512-1k-subsample.py --overwrite

Error stack trace for macOS:

(experiments-binding-affinity) 31 sukrit@pillarofautumn:~/work/postdoc/kinoml-tutorial/experiments-binding-affinity$ python run_notebook.py \
> features/featurize-template.ipynb \
> features/example-ligand-only-chembl28-morgan512-1k-subsample.py \
> 
Executing:  45%|████████████████████████▏                             | 13/29 [00:07<00:09,  1.63cell/s]
Traceback (most recent call last):
  File "run_notebook.py", line 51, in <module>
    main()
  File "run_notebook.py", line 47, in main
    pm.execute_notebook(str(nbin), str(nbout), parameters)
  File "/Users/sukrit/miniconda3/envs/experiments-binding-affinity/lib/python3.8/site-packages/papermill/execute.py", line 118, in execute_notebook
    raise_for_execution_errors(nb, output_path)
  File "/Users/sukrit/miniconda3/envs/experiments-binding-affinity/lib/python3.8/site-packages/papermill/execute.py", line 230, in raise_for_execution_errors
    raise error
papermill.exceptions.PapermillExecutionError: 
---------------------------------------------------------------------------
Exception encountered at "In [7]":
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-7-56f51f7ca191> in <module>
----> 1 dataset = import_object(DATASET_CLS).from_source(**DATASET_KWARGS)
      2 dataset

~/miniconda3/envs/experiments-binding-affinity/lib/python3.8/site-packages/kinoml/datasets/chembl.py in from_source(cls, filename, measurement_types, sample, **kwargs)
     66                     zf.extractall(tmpdir)
     67                 cached_path.parent.mkdir(parents=True, exist_ok=True)
---> 68                 shutil.copyfile(Path(tmpdir) / csv_filename, cached_path)
     69 
     70         df = pd.read_csv(cached_path)

~/miniconda3/envs/experiments-binding-affinity/lib/python3.8/shutil.py in copyfile(src, dst, follow_symlinks)
    262         os.symlink(os.readlink(src), dst)
    263     else:
--> 264         with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
    265             # macOS
    266             if _HAS_FCOPYFILE:

FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/3q/r0xmt3qn6xq02b3p08xbr54r0000gn/T/tmpcilygxcr/activities-chembl27_v0.2.csv'
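
The macOS failure happens when copying `activities-chembl27_v0.2.csv` out of the temporary directory right after `zf.extractall(tmpdir)`, which suggests the archive does not contain a file with that exact name. Note that the example is named `chembl28` while the missing file is `chembl27`; whether that mismatch is the cause is an assumption. Below is a hedged sketch of a check that lists the archive members before copying. `extract_csv` and the demo file names are hypothetical, and this is not the kinoml implementation.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_csv(zip_path, csv_filename, destination):
    """Copy one CSV out of a zip archive, with a clear error if it is absent."""
    with zipfile.ZipFile(zip_path) as zf:
        members = zf.namelist()
        if csv_filename not in members:
            raise FileNotFoundError(
                f"{csv_filename!r} not found in {zip_path}; archive contains: {members}"
            )
        with tempfile.TemporaryDirectory() as tmpdir:
            zf.extract(csv_filename, tmpdir)
            destination = Path(destination)
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(Path(tmpdir) / csv_filename, destination)
    return destination


# Demo with a throwaway archive (file names are made up for illustration).
with tempfile.TemporaryDirectory() as workdir:
    archive = Path(workdir) / "demo.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("activities-demo.csv", "a,b\n1,2\n")
    copied = extract_csv(archive, "activities-demo.csv", Path(workdir) / "cache" / "out.csv")
    copied_text = copied.read_text()
    try:
        extract_csv(archive, "activities-other.csv", Path(workdir) / "cache" / "x.csv")
        error_message = None
    except FileNotFoundError as exc:
        error_message = str(exc)
```

With a check like this, the error would name the files that are actually in the archive, which makes a version mismatch easier to spot.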

Error stack trace for Linux:

(experiments-binding-affinity) 12 singhs15@lt22:~/kinoml-workshop/openkinome/experiments-binding-affinity$ python run_notebook.py features/featurize-template.ipynb features/example-ligand-only-chembl28-morgan512-1k-subsample.py 
Executing:  28%|███████████████████▊                                                    | 8/29 [00:05<00:14,  1.48cell/s]
Traceback (most recent call last):
  File "run_notebook.py", line 51, in <module>
    main()
  File "run_notebook.py", line 47, in main
    pm.execute_notebook(str(nbin), str(nbout), parameters)
  File "/home/singhs15/miniconda3/envs/experiments-binding-affinity/lib/python3.8/site-packages/papermill/execute.py", line 118, in execute_notebook
    raise_for_execution_errors(nb, output_path)
  File "/home/singhs15/miniconda3/envs/experiments-binding-affinity/lib/python3.8/site-packages/papermill/execute.py", line 230, in raise_for_execution_errors
    raise error
papermill.exceptions.PapermillExecutionError: 
---------------------------------------------------------------------------
Exception encountered at "In [4]":
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-baec478d6de7> in <module>
     19 OUT.mkdir(parents=True, exist_ok=True)
     20 
---> 21 print(f"This notebook:           HERE = ~/{HERE.relative_to(Path.home())}")
     22 print(f"This repo:               REPO = ~/{REPO.relative_to(Path.home())}")
     23 print(f"Outputs in:               OUT = ~/{OUT.relative_to(Path.home())}")

~/miniconda3/envs/experiments-binding-affinity/lib/python3.8/pathlib.py in relative_to(self, *other)
    905         if (root or drv) if n == 0 else cf(abs_parts[:n]) != cf(to_abs_parts):
    906             formatted = self._format_parsed_parts(to_drv, to_root, to_parts)
--> 907             raise ValueError("{!r} does not start with {!r}"
    908                              .format(str(self), str(formatted)))
    909         return self._from_parsed_parts('', root if n == 1 else '',

ValueError: '/lila/home/singhs15/kinoml-workshop/openkinome/experiments-binding-affinity/features/example-ligand-only-chembl28-morgan512-1k-subsample' does not start with '/home/singhs15'
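
On the Linux machine the notebook path resolves to `/lila/home/singhs15/...` while `Path.home()` returns `/home/singhs15`, so `relative_to` raises. A symlinked or remounted home directory would produce this, though that is an assumption about this cluster. A minimal sketch of a display helper that falls back to the absolute path instead of raising; `display_path` is a hypothetical name and not part of the notebook template.

```python
from pathlib import Path


def display_path(path, home=None):
    """Return '~/...' if path lies under the home directory, else the path itself.

    Both the literal and the symlink-resolved home are tried, because
    they can differ on systems where the home directory is a symlink.
    """
    path = Path(path)
    home = Path.home() if home is None else Path(home)
    for base in (home, home.resolve()):
        try:
            return f"~/{path.relative_to(base)}"
        except ValueError:
            continue
    # Not under either form of the home directory: show the full path.
    return str(path)


inside = display_path("/home/user/project/features", home="/home/user")
outside = display_path("/lila/home/user/project", home="/home/user")
```

Since these lines in the notebook only print paths for information, falling back to the full path seems preferable to aborting the run.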

Composite tests fail

The composite tests fail with the following error:

Traceback (most recent call last):
  File "/home/david/github/experiments-binding-affinity/run_notebook.py", line 51, in <module>
    main()
  File "/home/david/github/experiments-binding-affinity/run_notebook.py", line 47, in main
    pm.execute_notebook(str(nbin), str(nbout), parameters)
  File "/home/david/miniconda3/envs/test_eba/lib/python3.9/site-packages/papermill/execute.py", line 118, in execute_notebook
    raise_for_execution_errors(nb, output_path)
  File "/home/david/miniconda3/envs/test_eba/lib/python3.9/site-packages/papermill/execute.py", line 230, in raise_for_execution_errors
    raise error
papermill.exceptions.PapermillExecutionError:
---------------------------------------------------------------------------
Exception encountered at "In [13]":
---------------------------------------------------------------------------
ArrowNotImplementedError                  Traceback (most recent call last)
/tmp/ipykernel_4232/462024357.py in <module>
      8     path = OUT / f"{'__'.join([g for g in group if g != 'valid'])}.parquet"
      9     parquets.append(path)
---> 10     ak.to_parquet(parquet, path)
     11     # TODO: Missing indices?

~/miniconda3/envs/test_eba/lib/python3.9/site-packages/awkward/operations/convert.py in to_parquet(array, where, explode_records, list_to32, string_to32, bytestring_to32, **options)
   3028         options["schema"] = first.schema
   3029
-> 3030     writer = pyarrow.parquet.ParquetWriter(**options)
   3031     writer.write_table(pyarrow.Table.from_batches([first]))
   3032

~/miniconda3/envs/test_eba/lib/python3.9/site-packages/pyarrow/parquet.py in __init__(self, where, schema, filesystem, flavor, version, use_dictionary, compression, write_statistics, use_deprecated_int96_timestamps, compression_level, use_byte_stream_split, writer_engine_version, data_page_version, use_compliant_nested_type, **options)
    653         self._metadata_collector = options.pop('metadata_collector', None)
    654         engine_version = 'V2'
--> 655         self.writer = _parquet.ParquetWriter(
    656             sink, schema,
    657             version=version,

~/miniconda3/envs/test_eba/lib/python3.9/site-packages/pyarrow/_parquet.pyx in pyarrow._parquet.ParquetWriter.__cinit__()

~/miniconda3/envs/test_eba/lib/python3.9/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()

ArrowNotImplementedError: Unhandled type for Arrow to Parquet schema conversion: dense_union<0: bool not null=0, 1: int64 not null=1>
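
The Arrow error reports a `dense_union` of `bool` and `int64`, meaning at least one field holds both booleans and integers, and this union type could not be converted to a Parquet schema. One possible workaround, sketched below in plain Python, is to find such fields and coerce the booleans to integers before building the array. The record layout and the names `mixed_bool_int_fields` and `coerce_bools_to_int` are hypothetical, and whether this coercion is acceptable for the featurized data is an assumption.

```python
def mixed_bool_int_fields(records):
    """Return the sorted field names that hold both bool and int values."""
    seen = {}
    for record in records:
        for key, value in record.items():
            # type() is used on purpose: bool is a subclass of int in Python.
            seen.setdefault(key, set()).add(type(value))
    return sorted(key for key, types in seen.items() if {bool, int} <= types)


def coerce_bools_to_int(records, fields):
    """Return copies of the records with bools in the given fields cast to int."""
    return [
        {
            key: int(value) if key in fields and isinstance(value, bool) else value
            for key, value in record.items()
        }
        for record in records
    ]


# Made-up records: 'flag' mixes bool and int, 'count' does not.
records = [{"flag": True, "count": 3}, {"flag": 0, "count": 5}]
mixed = mixed_bool_int_fields(records)
cleaned = coerce_bools_to_int(records, mixed)
```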

Models not running

The aim of this PR is to provide working scripts for the featurization schemes and their associated machine learning models on the latest ChEMBL (v28) data set.

Models

Ligand-based

To run these models, type:
(experiments-binding-affinity) $ bash tests/test_model_ligand_only.sh

  • Morgan & Fully connected neural network
  • One-hot SMILES & Convolutional neural network
  • Graph & Graph neural network

Kinase-informed

To run these models, type:
(experiments-binding-affinity) $ bash tests/test_model_kinase_informed.sh

  • morgan + hash & Fully connected neural network
  • morgan + composition & Fully connected neural network
  • smiles + sequence & Convolutional neural network

Status

  • Models Running

  • Ready to go
