helm-gpt's Introduction

HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer

Installation and running

Clone and create environment

Clone the repository, create the environment, and install the dependencies.

git clone https://github.com/charlesxu90/helm-gpt.git
cd helm-gpt

mamba env create -f environment.yml
mamba activate helm-gpt-env

conda install conda-forge::git-lfs
git lfs pull
mamba install -c conda-forge rdkit
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

pip install -r requirements.txt

Running the code

# Train prior model
python train_prior.py --train_data data/prior/chembl32/biotherapeutics_dict_prot_flt.csv --valid_data data/prior/chembl32/biotherapeutics_dict_prot_flt.csv --output_dir result/prior/chembl_5.0 --n_epochs 200 --max_len 200 --batch_size 1024

# Train agent model
python train_agent.py --prior result/prior/perm_tune/gpt_model_final_0.076.pt --output_dir result/agent/cpp/pep_perm_5.1_reinvent --batch_size 32 --n_steps 500 --sigma 60 --task permeability --max_len 140

# Generate molecules from a model
python generate.py --model_path result/prior/chembl_5.0/gpt_model_34_0.143.pt --out_file result/prior/chembl_5.0/1k_samples.csv --n_samples 1000 --max_len 200 --batch_size 128

License

This code is licensed under the MIT License.

Citation

If you're using HELM-GPT in your research or applications, please cite using this BibTeX:

@article{xu2024helm,
  title={HELM-GPT: de novo macrocyclic peptide design using generative pre-trained transformer},
  author={Xu, Xiaopeng and Xu, Chencheng and He, Wenjia and Wei, Lesong and Li, Haoyang and Zhou, Juexiao and Zhang, Ruochi and Wang, Yu and Xiong, Yuanpeng and Gao, Xin},
  journal={Bioinformatics},
  pages={btae364},
  year={2024},
  publisher={Oxford University Press}
}

helm-gpt's Issues

No such file or directory: 'data/prior/monomer_library.csv'

Traceback (most recent call last):
  File "/Users/shauseth/Desktop/helm-gpt-main/train_prior.py", line 9, in <module>
    from prior.trainer import Trainer, TrainerConfig
  File "/Users/shauseth/Desktop/helm-gpt-main/prior/trainer.py", line 16, in <module>
    from utils.helm_utils import get_validity
  File "/Users/shauseth/Desktop/helm-gpt-main/utils/helm_utils.py", line 281, in <module>
    df_monomers = pd.read_csv('data/prior/monomer_library.csv')
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1024, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 618, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1618, in __init__
    self._engine = self._make_engine(f, self.engine)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1878, in _make_engine
    self.handles = get_handle(
                   ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/pandas/io/common.py", line 873, in get_handle
    handle = open(
             ^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/prior/monomer_library.csv'
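
The path in the traceback is relative, so it is resolved against the directory the script is started from. Two possible causes, neither confirmed in this thread, are running from outside the repository root or having a checkout in which the data files were never fetched with git lfs pull. A minimal sketch for telling these cases apart is below; the helper name check_data_file is hypothetical, and whether this particular file is tracked by Git LFS is an assumption.

```python
from pathlib import Path

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def check_data_file(path):
    """Return 'missing', 'lfs_pointer', or 'ok' for the given file path.

    Hypothetical helper, not part of helm-gpt.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    # Read only as many bytes as the pointer prefix needs.
    with p.open("rb") as fh:
        head = fh.read(len(LFS_POINTER_PREFIX))
    if head == LFS_POINTER_PREFIX:
        return "lfs_pointer"
    return "ok"


# Run from the repository root, since the path is relative.
print(check_data_file("data/prior/monomer_library.csv"))
```

A result of "missing" points to the working directory or an incomplete checkout, while "lfs_pointer" suggests that git lfs pull has not replaced the placeholder with the real file yet.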

Error in training agent

Thanks for resolving #2.
I can now run the first step, but the second step gives this error:

Traceback (most recent call last):
  File "train_agent.py", line 11, in <module>
    from agent.scoring.scaffold import Scaffold
ModuleNotFoundError: No module named 'agent.scoring.scaffold'

Best,
Amin.
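
The error indicates that Python could not locate agent.scoring.scaffold on its import path, which would happen if the file is absent from the checkout or the script is started from another directory. As a rough sketch, the standard-library call importlib.util.find_spec can confirm this without running the training script; the helper name module_available is hypothetical.

```python
import importlib.util


def module_available(dotted_name):
    """Return True if the dotted module name can be located, else False.

    Hypothetical helper, not part of helm-gpt.
    """
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. 'agent.scoring') is itself missing.
        return False


# Start Python in the repository root so that the 'agent' package is importable.
print(module_available("agent.scoring.scaffold"))
```

If this prints False from the repository root, the module is most likely not in the checkout at all, rather than being an environment problem.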

Error in the training step

Hello.
Thanks for this really awesome work.
I am trying to go through the steps of training as detailed in the README.
I think the files might have changed because I get an error as follows.

(helm-gpt-env) amin@BTX-CC1:~/softwares/helm-gpt$ python train_prior.py --train_data data/prior/chembl32/biotherapeutics_dict_prot.csv --valid_data data/prior/chembl32/biotherapeutics_dict_prot.csv --output_dir result_amin/prior/chembl_5.0 --n_epochs 200 --max_len 200 --batch_size 1024
Traceback (most recent call last):
  File "train_prior.py", line 113, in <module>
    main(args)
  File "train_prior.py", line 69, in main
    df_train = pd.read_csv(args.train_data)
  File "/home/amin/mambaforge/envs/helm-gpt-env/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/amin/mambaforge/envs/helm-gpt-env/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/home/amin/mambaforge/envs/helm-gpt-env/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 488, in _read
    return parser.read(nrows)
  File "/home/amin/mambaforge/envs/helm-gpt-env/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 1047, in read
    index, columns, col_dict = self._engine.read(nrows)
  File "/home/amin/mambaforge/envs/helm-gpt-env/lib/python3.7/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 224, in read
    chunks = self._reader.read_low_memory(nrows)
  File "pandas/_libs/parsers.pyx", line 801, in pandas._libs.parsers.TextReader.read_low_memory
  File "pandas/_libs/parsers.pyx", line 857, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas/_libs/parsers.pyx", line 1925, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 58, saw 5

Note, I changed the name of the file to biotherapeutics_dict_prot.csv because biotherapeutics_dict_prot_flt.csv is not in the directory.

I would be really grateful for any suggestions.
Best,
Amin.
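
The ParserError means pandas inferred one column from the first rows and then met a row with five fields on line 58, so the substituted file probably does not have the layout that train_prior.py expects. A small diagnostic sketch using the standard csv module can show how the field counts are distributed before deciding how to read the file; the function name field_count_report is hypothetical, and the comma delimiter is an assumption.

```python
import csv
from collections import Counter


def field_count_report(path, delimiter=",", max_examples=5):
    """Count rows by number of fields and record a few line numbers for each.

    Hypothetical helper, not part of helm-gpt.
    """
    counts = Counter()
    examples = {}
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh, delimiter=delimiter)
        for row in reader:
            n = len(row)
            counts[n] += 1
            lines = examples.setdefault(n, [])
            if len(lines) < max_examples:
                # reader.line_num is the physical line where this row ended.
                lines.append(reader.line_num)
    return counts, examples
```

Calling field_count_report("data/prior/chembl32/biotherapeutics_dict_prot.csv") from the repository root would show whether line 58 is an isolated malformed row or whether the file mixes several row shapes.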
