graylab / igfold
Fast, accurate antibody structure prediction from deep learning on a massive set of natural antibodies
License: Other
How do I set the device IgFold runs on?
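If your version of IgFoldRunner does not expose a device argument, one generic workaround (an assumption on my part, not documented IgFold API) is to restrict which GPU PyTorch can see before importing torch or igfold:

```python
import os

# Hypothetical workaround: expose only physical GPU 1 to PyTorch.
# This must run before `import torch` / `from igfold import IgFoldRunner`,
# because CUDA device visibility is fixed when CUDA initializes.
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
```

After this, `cuda:0` inside the process maps to physical GPU 1.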
Hello,
I'm curious whether it is possible to run training ourselves (instead of using the checkpoint files / pretrained weights). Also, is it possible to replace AntiBERTy with a different language model? I'd like to know whether these things are possible at all before attempting them.
Thanks!
This is a good project. I'm learning end-to-end structure prediction and trying to reuse your loss calculation. Here is my understanding of it: first, extract the first four of the 'N', 'CA', 'C', 'CB', 'O' coordinates from the native PDB, along with ipa_coords (the coordinate matrix generated by the structure_ipa module); these are aligned with the kabsch_mse function to produce coords_loss, which is combined with the other losses for gradient descent.
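For reference, here is a minimal NumPy sketch of a Kabsch-aligned MSE. This is my own reconstruction of the idea; the actual kabsch_mse in the repo may differ in details such as masking and per-atom weighting:

```python
import numpy as np

def kabsch_mse(pred, native):
    """MSE between predicted and native coordinates after optimal superposition.

    pred, native: (N, 3) arrays of matched atom coordinates.
    """
    # Center both coordinate sets on their centroids
    p = pred - pred.mean(axis=0)
    q = native - native.mean(axis=0)
    # Kabsch: optimal rotation from the SVD of the covariance matrix
    u, _, vt = np.linalg.svd(p.T @ q)
    d = np.sign(np.linalg.det(u @ vt))  # guard against reflections
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    # Rotate the prediction onto the native frame and compare
    diff = p @ rot - q
    return float((diff ** 2).sum(axis=-1).mean())
```

A structure that differs from the native only by a rigid rotation and translation gives a loss of (numerically) zero, which is the point of aligning before computing the MSE.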
This is an excellent job and I would appreciate it if you could fix this for me!
Hello! I am very interested in your project, and I want to try to predict an antibody structure (PDB: 1IGT) using your model, but there are some problems in the process.
I have successfully installed the environment according to your requirements.txt, but I cannot use it. The error message is as follows:
I notice that requirements.txt does not pin CUDA or Python versions.
Here is a list of the packages in my environment:
# packages in environment at /home/LChuang/anaconda3/envs/igfold2:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main defaults
_openmp_mutex 5.1 1_gnu defaults
absl-py 1.0.0 pypi_0 pypi
aiohttp 3.8.1 pypi_0 pypi
aiosignal 1.2.0 pypi_0 pypi
antiberty 0.0.5 pypi_0 pypi
argon2-cffi 21.3.0 pyhd3eb1b0_0 defaults
argon2-cffi-bindings 21.2.0 py38h7f8727e_0 defaults
asttokens 2.0.5 pyhd3eb1b0_0 defaults
async-timeout 4.0.2 pypi_0 pypi
attrs 21.4.0 pyhd3eb1b0_0 defaults
backcall 0.2.0 pyhd3eb1b0_0 defaults
beautifulsoup4 4.11.1 py38h06a4308_0 defaults
biopython 1.79 pypi_0 pypi
bleach 4.1.0 pyhd3eb1b0_0 defaults
ca-certificates 2022.4.26 h06a4308_0 defaults
cachetools 5.1.0 pypi_0 pypi
certifi 2022.5.18.1 py38h06a4308_0 defaults
cffi 1.15.0 py38hd667e15_1 defaults
charset-normalizer 2.0.12 pypi_0 pypi
click 8.1.3 pypi_0 pypi
cycler 0.11.0 pypi_0 pypi
debugpy 1.5.1 py38h295c915_0 defaults
decorator 5.1.1 pyhd3eb1b0_0 defaults
defusedxml 0.7.1 pyhd3eb1b0_0 defaults
einops 0.3.0 pypi_0 pypi
entrypoints 0.4 py38h06a4308_0 defaults
executing 0.8.3 pyhd3eb1b0_0 defaults
filelock 3.7.0 pypi_0 pypi
frozenlist 1.3.0 pypi_0 pypi
fsspec 2022.3.0 pypi_0 pypi
future 0.18.2 pypi_0 pypi
fvcore 0.1.5.post20220512 pypi_0 pypi
ghost-py 0.2.3 pypi_0 pypi
google-auth 2.6.6 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
grpcio 1.46.1 pypi_0 pypi
huggingface-hub 0.6.0 pypi_0 pypi
idna 3.3 pypi_0 pypi
igfold 0.0.8 pypi_0 pypi
importlib-metadata 4.11.3 pypi_0 pypi
importlib_resources 5.2.0 pyhd3eb1b0_1 defaults
invariant-point-attention 0.1.4 pypi_0 pypi
iopath 0.1.9 pypi_0 pypi
ipykernel 6.9.1 py38h06a4308_0 defaults
ipython 8.3.0 py38h06a4308_0 defaults
ipython_genutils 0.2.0 pyhd3eb1b0_1 defaults
jedi 0.18.1 py38h06a4308_1 defaults
jinja2 3.0.3 pyhd3eb1b0_0 defaults
joblib 1.1.0 pypi_0 pypi
jsonschema 4.4.0 py38h06a4308_0 defaults
jupyter_client 7.2.2 py38h06a4308_0 defaults
jupyter_core 4.10.0 py38h06a4308_0 defaults
jupyterlab_pygments 0.1.2 py_0 defaults
kiwisolver 1.4.2 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1 defaults
libffi 3.3 he6710b0_2 defaults
libgcc-ng 11.2.0 h1234567_0 defaults
libgomp 11.2.0 h1234567_0 defaults
libsodium 1.0.18 h7b6447c_0 defaults
libstdcxx-ng 11.2.0 h1234567_0 defaults
markdown 3.3.7 pypi_0 pypi
markupsafe 2.0.1 py38h27cfd23_0 defaults
matplotlib 3.4.3 pypi_0 pypi
matplotlib-inline 0.1.2 pyhd3eb1b0_2 defaults
mistune 0.8.4 py38h7b6447c_1000 defaults
multidict 6.0.2 pypi_0 pypi
nb_conda 2.2.1 py38h06a4308_1 defaults
nb_conda_kernels 2.3.1 py38h06a4308_0 defaults
nbbrowserpdf 0.2.0 pypi_0 pypi
nbclient 0.5.13 py38h06a4308_0 defaults
nbconvert 6.4.4 py38h06a4308_0 defaults
nbformat 5.3.0 py38h06a4308_0 defaults
ncurses 6.3 h7f8727e_2 defaults
nest-asyncio 1.5.5 py38h06a4308_0 defaults
notebook 6.4.11 py38h06a4308_0 defaults
numpy 1.21.2 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
openssl 1.1.1o h7f8727e_0 defaults
packaging 21.3 pyhd3eb1b0_0 defaults
pandas 1.4.2 pypi_0 pypi
pandocfilters 1.5.0 pyhd3eb1b0_0 defaults
parso 0.8.3 pyhd3eb1b0_0 defaults
pexpect 4.8.0 pyhd3eb1b0_3 defaults
pickleshare 0.7.5 pyhd3eb1b0_1003 defaults
pillow 9.1.1 pypi_0 pypi
pip 21.2.4 py38h06a4308_0 defaults
portalocker 2.4.0 pypi_0 pypi
prometheus_client 0.13.1 pyhd3eb1b0_0 defaults
prompt-toolkit 3.0.20 pyhd3eb1b0_0 defaults
protobuf 3.20.1 pypi_0 pypi
ptyprocess 0.7.0 pyhd3eb1b0_2 defaults
pure_eval 0.2.2 pyhd3eb1b0_0 defaults
py3dmol 1.8.0 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pycparser 2.21 pyhd3eb1b0_0 defaults
pydeprecate 0.3.1 pypi_0 pypi
pygments 2.11.2 pyhd3eb1b0_0 defaults
pyparsing 3.0.9 pypi_0 pypi
pypdf2 1.27.12 pypi_0 pypi
python 3.8.13 h12debd9_0 defaults
python-dateutil 2.8.2 pyhd3eb1b0_0 defaults
python-fastjsonschema 2.15.1 pyhd3eb1b0_0 defaults
pytorch-lightning 1.5.10 pypi_0 pypi
pytorch-ranger 0.1.1 pypi_0 pypi
pytorch3d 0.3.0 pypi_0 pypi
pytz 2022.1 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
pyzmq 22.3.0 py38h295c915_2 defaults
readline 8.1.2 h7f8727e_1 defaults
regex 2022.4.24 pypi_0 pypi
requests 2.26.0 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.8 pypi_0 pypi
sacremoses 0.0.53 pypi_0 pypi
scipy 1.8.1 pypi_0 pypi
seaborn 0.11.2 pypi_0 pypi
send2trash 1.8.0 pyhd3eb1b0_1 defaults
setuptools 59.5.0 pypi_0 pypi
six 1.16.0 pyhd3eb1b0_1 defaults
soupsieve 2.3.1 pyhd3eb1b0_0 defaults
sqlite 3.38.3 hc218d9a_0 defaults
stack_data 0.2.0 pyhd3eb1b0_0 defaults
tabulate 0.8.9 pypi_0 pypi
tensorboard 2.9.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
terminado 0.13.1 py38h06a4308_0 defaults
testpath 0.5.0 pyhd3eb1b0_0 defaults
tk 8.6.11 h1ccaba5_1 defaults
tokenizers 0.11.6 pypi_0 pypi
torch 1.7.1 pypi_0 pypi
torch-optimizer 0.3.0 pypi_0 pypi
torchmetrics 0.8.2 pypi_0 pypi
torchvision 0.8.2 pypi_0 pypi
tornado 6.1 py38h27cfd23_0 defaults
tqdm 4.62.1 pypi_0 pypi
traitlets 5.1.1 pyhd3eb1b0_0 defaults
transformers 4.18.0 pypi_0 pypi
typing-extensions 4.2.0 pypi_0 pypi
typing_extensions 4.1.1 pyh06a4308_0 defaults
urllib3 1.26.9 pypi_0 pypi
wcwidth 0.2.5 pyhd3eb1b0_0 defaults
webencodings 0.5.1 py38_1 defaults
werkzeug 2.1.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0 defaults
xz 5.2.5 h7f8727e_1 defaults
yacs 0.1.8 pypi_0 pypi
yapf 0.31.0 pypi_0 pypi
yarl 1.7.2 pypi_0 pypi
zeromq 4.3.4 h2531618_0 defaults
zipp 3.8.0 py38h06a4308_0 defaults
zlib 1.2.12 h7f8727e_2 defaults
Hello, I tried the demo code today, but the following RuntimeError occurs. Please help me fix this, thanks.
/data/personal/yankai/packages/anaconda3/envs/igfold/lib/python3.7/site-packages/Bio/pairwise2.py:283: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
BiopythonDeprecationWarning,
The code, data, and weights for this work are made available for non-commercial use
(including at commercial entities) under the terms of the JHU Academic Software License
Agreement. For commercial inquiries, please contact dmalon11[at]jhu.edu.
License: https://github.com/Graylab/IgFold/blob/main/LICENSE.md
Loading 4 IgFold models...
Using device: cuda:0
Successfully loaded 4 IgFold models.
Loaded AntiBERTy model.
Traceback (most recent call last):
File "igbody_test.py", line 15, in
do_renum=True, # Renumber predicted antibody structure (Chothia)
File "/data/personal/yankai/LiuChang/IgFold/igfold/IgFoldRunner.py", line 119, in fold
truncate_sequences=truncate_sequences,
File "/data/personal/yankai/LiuChang/IgFold/igfold/utils/folding.py", line 184, in fold
return_attention=True,
File "/data/personal/yankai/packages/anaconda3/envs/igfold/lib/python3.7/site-packages/antiberty/AntiBERTyRunner.py", line 81, in embed
embeddings[i] = embeddings[i][:, a == 1]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
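This error means the boolean mask `a` lives on the CPU while the embedding tensor is on the GPU. A minimal illustration of the failure mode and the usual fix, moving the mask to the tensor's device before indexing (whether this is the right patch for antiberty's embed() is an assumption on my part):

```python
import torch

# Stand-ins for one AntiBERTy embedding and its attention mask;
# dimensions here are illustrative, not the real model's.
emb = torch.randn(1, 6, 8)
a = torch.tensor([1, 1, 0, 1, 1, 1])  # mask, possibly on a different device

# Fix: move the mask onto the same device as the tensor it indexes
emb = emb[:, a.to(emb.device) == 1]
```

On a CPU-only machine both tensors already share a device, so the snippet runs either way; on GPU, the `.to(emb.device)` is what prevents the RuntimeError.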
Hi,
The default notebook does not run (I just used "Run all" with all default options).
Here is a link to a copy of the notebook:
https://colab.research.google.com/drive/11bH_aFT2Wm_0r3UQk4JNFTbsa3hJcXEb?usp=sharing
No module named 'pyrosetta'
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
[<ipython-input-3-2d0e067d8df6>](https://localhost:8080/#) in <module>
4 sys.path.insert(0, f"/usr/local/lib/python{python_version}/site-packages/")
5
----> 6 from igfold.utils.visualize import *
7 from igfold import IgFoldRunner
8
12 frames
[/usr/local/lib/python3.8/site-packages/pandas/core/window/ewm.py](https://localhost:8080/#) in <module>
13
14 from pandas._libs.tslibs import Timedelta
---> 15 import pandas._libs.window.aggregations as window_aggregations
16 from pandas._typing import (
17 Axis,
ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/local/lib/python3.8/site-packages/pandas/_libs/window/aggregations.cpython-38-x86_64-linux-gnu.so)
---------------------------------------------------------------------------
Nitpicky, but typos are typos. The demo code below contains a small typo:
from igfold import IgFoldRunner
sequences = {
"H": "EVQLVQSGPEVKKPGTSVKVSCKASGFTFMSSAVQWVRQARGQRLEWIGWIVIGSGNTNYAQKFQERVTITRDMSTSTAYMELSSLRSEDTAVYYCAAPYCSSISCNDGFDIWGQGTMVTVS",
"L": "DVVMTQTPFSLPVSLGDQASISCRSSQSLVHSNGNTYLHWYLQKPGQSPKLLIYKVSNRFSGVPDRFSGSGSGTDFTLKISRVEAEDLGVYFCSQSTHVPYTFGGGTKLEIK"
}
igfold = IgFoldRunner()
emb = igfold.embed(
sequences=sequences, # Antibody sequences
)
emb.bert_embs # Embeddings from AntiBERTy final hidden layer (dim: 1, L, 512)
emb.gt_embs # Embeddings after graph transformer layers (dim: 1, L, 64)
emb.strucutre_embs # Embeddings after template incorporation IPA (dim: 1, L, 64)
the last line needs to be fixed.
emb.strucutre_embs
-> emb.structure_embs
Hello,
I am trying to run IgFold on Google Colab. It fails at the "Load IgFold models" step, apparently due to a module import issue, with the following error message:
ModuleNotFoundError Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
1092 try:
-> 1093 return importlib.import_module("." + module_name, self.name)
1094 except Exception as e:
27 frames
ModuleNotFoundError: No module named 'tokenizers.tokenizers'
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
/usr/local/lib/python3.7/site-packages/transformers/utils/import_utils.py in _get_module(self, module_name)
1093 return importlib.import_module("." + module_name, self.name)
1094 except Exception as e:
-> 1095 raise RuntimeError(
1096 f"Failed to import {self.name}.{module_name} because of the following error (look up to see its"
1097 f" traceback):\n{e}"
RuntimeError: Failed to import transformers.models.auto.tokenization_auto because of the following error (look up to see its traceback):
No module named 'tokenizers.tokenizers'
I searched on Google but found no clear help. Please advise, thank you.
Hello, I am interested in your pretrained model, AntiBERTy. I saw it was trained on OAS. I am wondering how you dealt with the training sequence length: was the model trained with a fixed sequence length, such as 512 or 256? If so, can it give accurate embeddings at inference time for sequences longer than 512?
I'm sorry, I have tried to read the source code, but I don't know how to run the model in batch mode. I just need embeddings for antibody structures. Could you give me a batch-mode example?
Thanks.
There are multiple ways to calculate RMSD. Would you share how you calculate it: is it Kabsch RMSD or something else? In addition, how do you evaluate RMSD on the CDR regions? We are trying to follow your work, so please share more details so that we can make a fair comparison. Thank you.
Best,
Zhangzhi
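For comparison, one common convention (an assumption on my part, not necessarily what the authors used) is to superimpose on all shared atoms or on the framework, then report RMSD over only the CDR residues:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def region_rmsd(pred, native, region_idx, align_idx=None):
    """RMSD over `region_idx` atoms after superimposing on `align_idx`
    (defaults to all atoms). pred/native: (N, 3) matched coordinates."""
    if align_idx is None:
        align_idx = np.arange(len(pred))
    # Superimpose using the alignment subset (framework, or everything)
    p_c = pred[align_idx].mean(axis=0)
    q_c = native[align_idx].mean(axis=0)
    rot, _ = Rotation.align_vectors(native[align_idx] - q_c,
                                    pred[align_idx] - p_c)
    pred_aln = rot.apply(pred - p_c) + q_c
    # RMSD over the region of interest only (e.g. a CDR loop)
    d = pred_aln[region_idx] - native[region_idx]
    return float(np.sqrt((d ** 2).sum(axis=-1).mean()))
```

Aligning on the framework and measuring over the CDR penalizes loop placement relative to the rest of the Fv, whereas aligning on the CDR itself only measures internal loop geometry; the choice materially changes reported numbers, which is why it is worth confirming with the authors.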
We have installed IgFold on RedHat 7 following the GitHub instructions, with Anaconda Python 3.9.13, and are testing with the example below from the repo. abnumber is definitely installed (python -c 'import abnumber' works). However, running the script below (python ig-test.py) gives the error "Error: AbNumber not installed. Please install AbNumber to use renumbering".
from igfold import IgFoldRunner
sequences = {
"H": "QVQLQESGGGLVQAGGSLTLSCAVSGLTFSNYAMGWFRQAPGKEREFVAAITWDGGNTYYTDSVKGRFTISRDNAKNTVFLQMNSLKPEDTAVYYCAAKLLGSSRYELALAGYDYWGQGTQVTVS"
}
pred_pdb = "my_nanobody.pdb"
igfold = IgFoldRunner()
igfold.fold(
pred_pdb, # Output PDB file
sequences=sequences, # Nanobody sequence
do_refine=False, # Refine the antibody structure with PyRosetta
do_renum=True, # Renumber predicted antibody structure (Chothia)
)
Thank you
This is interesting work. I saw that you use the pretrained AntiBERTy model to represent the protein sequence and that the package is installed via pip, but I couldn't find where it is used in the code. Could you give me any pointers?
Hi, the package was installed with conda, but running the test script reports an error. Detailed error information follows:
Loading 4 IgFold models...
Using device: cuda:0
Loading /home/lyw/.local/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_1.ckpt...
/home/lyw/.local/lib/python3.8/site-packages/torch/cuda/__init__.py:132: UserWarning: Found GPU0 NVIDIA GeForce GT 720 which is of cuda capability 3.5. PyTorch no longer supports this GPU because it is too old. The minimum cuda capability supported by this library is 3.7.
warnings.warn(old_gpu_warn % (d, name, major, minor, min_arch // 10, min_arch % 10))
Loading /home/lyw/.local/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_2.ckpt...
Loading /home/lyw/.local/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_3.ckpt...
Loading /home/lyw/.local/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_5.ckpt...
Successfully loaded 4 IgFold models.
Loaded AntiBERTy model.
Traceback (most recent call last):
File "test.py", line 9, in <module>
igfold.fold(
File "/home/lyw/.local/lib/python3.8/site-packages/igfold/IgFoldRunner.py", line 106, in fold
model_out = fold(
File "/home/lyw/.local/lib/python3.8/site-packages/igfold/utils/folding.py", line 182, in fold
embeddings, attentions = antiberty.embed(
File "/home/lyw/.local/lib/python3.8/site-packages/antiberty/AntiBERTyRunner.py", line 81, in embed
embeddings[i] = embeddings[i][:, a == 1]
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
The error occurs when I try to run the code; please help me fix it, thanks.
data/personal/yankai/packages/anaconda3/envs/igfold/lib/python3.7/site-packages/Bio/pairwise2.py:283: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
BiopythonDeprecationWarning,
The code, data, and weights for this work are made available for non-commercial use
(including at commercial entities) under the terms of the JHU Academic Software License
Agreement. For commercial inquiries, please contact dmalon11[at]jhu.edu.
License: https://github.com/Graylab/IgFold/blob/main/LICENSE.md
Loading 4 IgFold models...
Using device: cuda:0
Successfully loaded 4 IgFold models.
Loaded AntiBERTy model.
length of self.models: <class 'list'> 0
Traceback (most recent call last):
File "igbody_test.py", line 15, in
do_renum=True, # Renumber predicted antibody structure (Chothia)
File "/data/personal/yankai/LiuChang/IgFold/igfold/IgFoldRunner.py", line 120, in fold
truncate_sequences=truncate_sequences,
File "/data/personal/yankai/LiuChang/IgFold/igfold/utils/folding.py", line 210, in fold
best_model_i = scores.index(min(scores))
ValueError: min() arg is an empty sequence
According to the usage notes:
Note: The first time IgFoldRunner is initialized, it will download the pre-trained weights. This may take a few minutes and will require a network connection.
So the first time IgFold runs, it should download the pretrained weights, but on my machine it doesn't. How can I fix this? Is there an FTP server to download them from?
I try to run IgFold, but there is no response at this step:
Downloading checkpoint files...
Is there anywhere else to download the pre-trained weights?
Can we put multiple antibody sequences into the model at once?
I'm having difficulty installing the package on MacOS Monterey 12.5.1.
I installed via:
conda create -n env_igfold python=3.9
conda activate env_igfold
pip3 install igfold
conda install -c conda-forge openmm pdbfixer
conda install -c bioconda anarci
Running one of the given examples yields the following error:
OMP: Error #15: Initializing libomp.dylib, but found libomp.dylib already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://openmp.llvm.org/
A google search suggests "conda install nomkl" as a solution. This had worked for me previously, but now no longer works (I had to create a new environment). Any suggestions on this? As a side note, I also tried using python=3.7 in the recipe above, but this gave a different error where the network weights were downloaded but then gave a segfault.
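As the OMP error text itself notes, the unsafe, unsupported stopgap is to allow duplicate OpenMP runtimes via an environment variable, set before the conflicting libraries load:

```python
import os

# Stopgap quoted in the OMP error message itself; it may cause crashes
# or silently wrong results, so prefer fixing the environment
# (e.g. `conda install nomkl`) when possible.
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"
```

This must be set before importing torch/igfold (or exported in the shell) for the OpenMP runtime to see it.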
At my python3.7 local conda environment, I did
git clone https://github.com/Graylab/IgFold.git
cd IgFold
pip install -r requirements.txt
# yes, https://github.com/Graylab/IgFold/blob/main/requirements.txt
pip install IgFold
conda install -c conda-forge openmm pdbfixer
However, I can't run even simple.py
(e.g.
from igfold import IgFoldRunner
sequences = {
"H": "QVQLQESGGGLVQAGGSLTLSCAVSGLTFSNYAMGWFRQAPGKEREFVAAITWDGGNTYYTDSVKGRFTISRDNAKNTVFLQMNSLKPEDTAVYYCAAKLLGSSRYELALAGYDYWGQGTQVTVS"
}
pred_pdb = "my_nanobody.pdb"
igfold = IgFoldRunner()
igfold.fold(
pred_pdb, # Output PDB file
sequences=sequences, # Nanobody sequence
do_refine=False, # Refine the antibody structure with PyRosetta
do_renum=True, # Renumber predicted antibody structure (Chothia)
use_abnum=True,
)
in the README).
What PyTorch version do you use? Mine is 1.7.1 (which is expected, since requirements.txt specifies torch==1.7.1).
I use IgFold to predict the VH structure of antibodies, and here is my code:
import pandas as pd
from igfold import IgFoldRunner
from Bio.PDB.PDBExceptions import PDBConstructionException
def get_igfold_structure(inputpath):
    data = pd.read_csv(inputpath, header=None, index_col=0)
    sequences = list(data.iloc[:, 0])
    igfold = IgFoldRunner()
    # enumerate replaces the original zip(sequences, range(1, len(sequences))),
    # which skipped the last sequence and shadowed the manual counter
    for i, seq in enumerate(sequences, start=1):
        seq = {str(i): seq}
        outputpath = "feature/igfold_structure/trastuzumab_VH/VH_igfold_" + str(i) + ".pdb"
        igfold.fold(
            outputpath,       # Output PDB file
            sequences=seq,    # VH sequence
            do_refine=False,  # Refine the antibody structure with PyRosetta
            use_openmm=True,  # Use OpenMM for refinement
            do_renum=False,   # Renumber predicted antibody structure (Chothia)
        )

inputpath = "data/trastuzumab_VH/seqs2.csv"
get_igfold_structure(inputpath)
I want to use a for loop to generate multiple VH PDB files at once, but on the 10th iteration an error is raised: Bio.PDB.PDBExceptions.PDBConstructionException: Invalid or missing coordinate(s) at line 1.
Also, the PDB file generated on the 10th iteration lacks the "TER" record and the last column of element symbols compared with the previous files, as shown in the following figure:
Hi developers,
Greetings!
I report two issues running IgFold in a conda virtual env with Python 3.8.0.
Issue 1: Python version mismatch: "module was compiled for Python 3.5, but the interpreter version is incompatible: 3.8.0 (default, Nov 6 2019, 21:49:08)".
I got this message when importing the igfold module. Which Python version is recommended?
Issue 2: RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling cublasSgemmStridedBatched( handle, opa, opb, m, n, k, &alpha, a, lda, stridea, b, ldb, strideb, &beta, c, ldc, stridec, num_batches)
Ignoring issue 1 and running the steps listed on the website, I get this error and the program cannot run any further.
Please check whether any solution could help with these problems.
Thanks very much!
Best
User
Jan 12, 2023
Dear users,
Could I ask you a question? Thanks!
I installed IgFold locally and ran the antibody and nanobody tests (from GitHub) quite well.
When I ran my prediction test on the CH67 antibody, I at first mistakenly typed H as L and L as H, so of course its H and L structures were wrong. I quickly realized this mistake and changed the chain names back.
However, after my correction, IgFold behaves quite weirdly: it always predicts the H structure and misses the L structure. (I tried IgFold on other antibodies, and it worked well for both H and L.)
This confuses me; could IgFold somehow have been retrained when I made the chain-name typo above?
My script is as follows, thanks!
from igfold import IgFoldRunner, init_pyrosetta
init_pyrosetta()
sequences = { "H": "QVQLVQSGAEVRKPGASVKVSCKASGYTFTDNYIHWVRQAPGQGLEWMGWIHPNSGATKYAQKFEGWVTMTRDTSISTVYMELSRSRSDDTAVYYCARAGLEPRSVDYYFYGLDVWGQGTAVTVSS",
"L": "QSALTQPPSVSVAPGQTATITCGGNNIGRKRVDWFQQKPGQAPVLVVYERFSDSNSGTTATLTISRVEAGDEADYYCQVWDSDSDHVVFGGGTKLTVL"
}
pred_pdb = "CH67.pdb"
igfold = IgFoldRunner()
igfold.fold(
pred_pdb,
sequences=sequences,
do_refine=True,
do_renum=True,
)
Hi,
I just ran the linked Google Colab notebook without any modification and got the following error when running prediction, mainly referring to libstdc++.so.6: version `GLIBCXX_3.4.29' not found. I executed "!apt-get upgrade libstdc++6", but the same error still appeared.
Messages:
ImportError Traceback (most recent call last)
in
4 sys.path.insert(0, f"/usr/local/lib/python{python_version}/site-packages/")
5
----> 6 from igfold.utils.visualize import *
7 from igfold import IgFoldRunner
8
12 frames
/usr/local/lib/python3.9/site-packages/pandas/core/window/ewm.py in
13
14 from pandas._libs.tslibs import Timedelta
---> 15 import pandas._libs.window.aggregations as window_aggregations
16 from pandas._typing import (
17 Axis,
ImportError: /lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /usr/local/lib/python3.9/site-packages/pandas/_libs/window/aggregations.cpython-39-x86_64-linux-gnu.so)
Hi all, had no other way to try to reach out than making a github issue. It is awesome that you have a license. It specifies that a company that wants to use this software for commercial use would need a separate written license. Great! Unfortunately, it doesn't say whom to reach out to in order to get that done. So this issue can simultaneously be considered a request to add that information to this repo, and also, a request for the information itself as I represent such a company! Thank you so much in advance, this software is awesome!
OS: ubuntu 22.04.2 LTS
NVIDIA Driver Version: 525.85.12
CUDA Version: 12.0
Python 3.8.16
Ran the demo code for "Antibody structure prediction from sequence" and got the error from #30. After modifying AntiBERTyRunner.py as suggested in #30, another error occurs:
The code, data, and weights for this work are made available for non-commercial use
(including at commercial entities) under the terms of the JHU Academic Software License
Agreement. For commercial inquiries, please contact dmalon11[at]jhu.edu.
License: https://github.com/Graylab/IgFold/blob/main/LICENSE.md
Loading 4 IgFold models...
Using device: cuda:0
Loading /home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_1.ckpt...
Loading /home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_2.ckpt...
Loading /home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_3.ckpt...
Loading /home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/trained_models/IgFold/igfold_5.ckpt...
Successfully loaded 4 IgFold models.
Loaded AntiBERTy model.
Traceback (most recent call last):
File "demo.py", line 16, in
igfold.fold(
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/IgFoldRunner.py", line 106, in fold
model_out = fold(
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/utils/folding.py", line 206, in fold
model_out = model(model_in)
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/igfold/model/IgFold.py", line 248, in forward
str_nodes = self.str_node_transform(bert_feats)
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/px172/anaconda3/envs/IgFold38/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)
Investigated IgFold.py and made a modification:
move bert_feats to the same device before the str_node_transform operation.
bert_feats = bert_feats.to(self.device)
str_nodes = self.str_node_transform(bert_feats)
After the fix, the prediction is generated.
Why is my generated structure a coil?
The FASTA is below:
>7T0K_1|Chains A, C, E, G[auth L]|S25-2 Fab light chain|Mus musculus (10090)
DIVMSQSPSSLAVSAGEKVTMSCKSSQSLLNSRTRKNYLAWYQQKPGQSPKLLIYWASTRESGVPDRFTGSGSGTDFTLTITSVQAEDLAVYYCKQSYNLRTFGGGTKLEIKRADAAPTVSIFPPSSEQLTSGGASVVCFLNNFYPKDINVKWKIDGSERQNGVLNSWTDQDSKDSTYSMSSTLTLTKDEYERHNSYTCEATHKTSTSPIVKSFNRNEC
>7T0K_2|Chains B, D, F, H|S25-2 Fab heavy chain|Mus musculus (10090)
EVKLVESGGGLVQSGGSLRLSCATSGFTFTDYYMSWVRQPPGKALEWLGFIRNKANGYTTEYSPSVKGRFTISRDNSQSILYLQMNTLRAEDSATYYCARDHDGYYERFSYWGQGTLVTVSAAKTTPPSVYPLAPGSAAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVPSSTWPSETVTCNVAHPASSTKVDKKIVPRA
I was unable to install IgFold on my computer after installing Anaconda, following the installation steps you describe, and have tried several times.
Meanwhile, IgFold does not run on Google Colab either.
Here is a part of the PDB file from an example from the demo code:
ATOM 598 C SER H 122 -13.308 -3.541 25.995 1.00 0.72 C
ATOM 599 CB SER H 122 -13.603 -2.584 25.501 1.00 0.72 C
ATOM 600 O SER H 122 -13.651 -3.377 27.165 1.00 0.72 O
TER 601 SER H 122
ATOM 601 N ASP L 123 12.888 2.582 8.954 1.00 0.62 N
ATOM 602 CA ASP L 123 11.859 1.989 8.092 1.00 0.62 C
ATOM 603 C ASP L 123 11.877 3.321 7.378 1.00 0.62 C
The TER record and the following ATOM record share the same atom serial number (601).
do_refine and do_renum were set to False.
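If downstream tools choke on the duplicated serial, a small post-processing pass can renumber ATOM/HETATM/TER records sequentially. This is a generic PDB fix-up I am sketching, not part of IgFold:

```python
def renumber_serials(pdb_text):
    """Rewrite the atom serial field (columns 7-11) of ATOM/HETATM/TER
    records sequentially so no serial number is reused."""
    out, serial = [], 0
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM", "TER")):
            serial += 1
            # Record name occupies columns 1-6, serial columns 7-11
            line = f"{line[:6]}{serial:>5}{line[11:]}"
        out.append(line)
    return "\n".join(out)
```

Note that TER records consume a serial number in the standard PDB convention, which is why they are renumbered along with the atoms.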
Hi,
Please help solve the following errors, thanks!
I downloaded the weights and loaded them in IgFoldRunner, but errors appeared when executing inference:
#############################################
The code, data, and weights for this work are made available for non-commercial use
(including at commercial entities) under the terms of the JHU Academic Software License
Agreement. For commercial inquiries, please contact dmalon11[at]jhu.edu.
License: https://github.com/Graylab/IgFold/blob/main/LICENSE.md
Loading 4 IgFold models...
Using device: cuda:0
Loading /home/shanghai/RationalDesign/ToBeTest/IgFold/igfold/trained_models/IgFold/igfold_1.ckpt...
Traceback (most recent call last):
File "Inference_IgFold.py", line 38, in
igfold = IgFoldRunner(num_models=num_models)
File "/home/shanghai/RationalDesign/ToBeTest/IgFold/igfold/IgFoldRunner.py", line 71, in __init__
IgFold.load_from_checkpoint(ckpt_file).eval().to(device))
File "/home/shanghai/anaconda3/envs/IgFold/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 137, in load_from_checkpoint
return _load_from_checkpoint(
File "/home/shanghai/anaconda3/envs/IgFold/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 180, in _load_from_checkpoint
return _load_state(cls, checkpoint, strict=strict, **kwargs)
File "/home/shanghai/anaconda3/envs/IgFold/lib/python3.8/site-packages/pytorch_lightning/core/saving.py", line 238, in _load_state
keys = obj.load_state_dict(checkpoint["state_dict"], strict=strict)
File "/home/shanghai/anaconda3/envs/IgFold/lib/python3.8/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for IgFold:
Unexpected key(s) in state_dict: "bert_model.embeddings.position_ids", "bert_model.embeddings.word_embeddings.weight", "bert_model.embeddings.position_embeddings.weight", "bert_model.embeddings.token_type_embeddings.weight", "bert_model.embeddings.LayerNorm.weight", "bert_model.embeddings.LayerNorm.bias", "bert_model.encoder.layer.0.attention.self.query.weight", "bert_model.encoder.layer.0.attention.self.query.bias", "bert_model.encoder.layer.0.attention.self.key.weight", "bert_model.encoder.layer.0.attention.self.key.bias", "bert_model.encoder.layer.0.attention.self.value.weight", "bert_model.encoder.layer.0.attention.self.value.bias", "bert_model.encoder.layer.0.attention.output.dense.weight", "bert_model.encoder.layer.0.attention.output.dense.bias", "bert_model.encoder.layer.0.attention.output.LayerNorm.weight", "bert_model.encoder.layer.0.attention.output.LayerNorm.bias", "bert_model.encoder.layer.0.intermediate.dense.weight", "bert_model.encoder.layer.0.intermediate.dense.bias", "bert_model.encoder.layer.0.output.dense.weight", "bert_model.encoder.layer.0.output.dense.bias", "bert_model.encoder.layer.0.output.LayerNorm.weight", "bert_model.encoder.layer.0.output.LayerNorm.bias", "bert_model.encoder.layer.1.attention.self.query.weight", "bert_model.encoder.layer.1.attention.self.query.bias", "bert_model.encoder.layer.1.attention.self.key.weight", "bert_model.encoder.layer.1.attention.self.key.bias", "bert_model.encoder.layer.1.attention.self.value.weight", "bert_model.encoder.layer.1.attention.self.value.bias", "bert_model.encoder.layer.1.attention.output.dense.weight", "bert_model.encoder.layer.1.attention.output.dense.bias", "bert_model.encoder.layer.1.attention.output.LayerNorm.weight", "bert_model.encoder.layer.1.attention.output.LayerNorm.bias", "bert_model.encoder.layer.1.intermediate.dense.weight", "bert_model.encoder.layer.1.intermediate.dense.bias", "bert_model.encoder.layer.1.output.dense.weight", "bert_model.encoder.layer.1.output.dense.bias", 
"bert_model.encoder.layer.1.output.LayerNorm.weight", "bert_model.encoder.layer.1.output.LayerNorm.bias", "bert_model.encoder.layer.2.attention.self.query.weight", "bert_model.encoder.layer.2.attention.self.query.bias", "bert_model.encoder.layer.2.attention.self.key.weight", "bert_model.encoder.layer.2.attention.self.key.bias", "bert_model.encoder.layer.2.attention.self.value.weight", "bert_model.encoder.layer.2.attention.self.value.bias", "bert_model.encoder.layer.2.attention.output.dense.weight", "bert_model.encoder.layer.2.attention.output.dense.bias", "bert_model.encoder.layer.2.attention.output.LayerNorm.weight", "bert_model.encoder.layer.2.attention.output.LayerNorm.bias", "bert_model.encoder.layer.2.intermediate.dense.weight", "bert_model.encoder.layer.2.intermediate.dense.bias", "bert_model.encoder.layer.2.output.dense.weight", "bert_model.encoder.layer.2.output.dense.bias", "bert_model.encoder.layer.2.output.LayerNorm.weight", "bert_model.encoder.layer.2.output.LayerNorm.bias", "bert_model.encoder.layer.3.attention.self.query.weight", "bert_model.encoder.layer.3.attention.self.query.bias", "bert_model.encoder.layer.3.attention.self.key.weight", "bert_model.encoder.layer.3.attention.self.key.bias", "bert_model.encoder.layer.3.attention.self.value.weight", "bert_model.encoder.layer.3.attention.self.value.bias", "bert_model.encoder.layer.3.attention.output.dense.weight", "bert_model.encoder.layer.3.attention.output.dense.bias", "bert_model.encoder.layer.3.attention.output.LayerNorm.weight", "bert_model.encoder.layer.3.attention.output.LayerNorm.bias", "bert_model.encoder.layer.3.intermediate.dense.weight", "bert_model.encoder.layer.3.intermediate.dense.bias", "bert_model.encoder.layer.3.output.dense.weight", "bert_model.encoder.layer.3.output.dense.bias", "bert_model.encoder.layer.3.output.LayerNorm.weight", "bert_model.encoder.layer.3.output.LayerNorm.bias", "bert_model.encoder.layer.4.attention.self.query.weight", 
"bert_model.encoder.layer.4.attention.self.query.bias", "bert_model.encoder.layer.4.attention.self.key.weight", "bert_model.encoder.layer.4.attention.self.key.bias", "bert_model.encoder.layer.4.attention.self.value.weight", "bert_model.encoder.layer.4.attention.self.value.bias", "bert_model.encoder.layer.4.attention.output.dense.weight", "bert_model.encoder.layer.4.attention.output.dense.bias", "bert_model.encoder.layer.4.attention.output.LayerNorm.weight", "bert_model.encoder.layer.4.attention.output.LayerNorm.bias", "bert_model.encoder.layer.4.intermediate.dense.weight", "bert_model.encoder.layer.4.intermediate.dense.bias", "bert_model.encoder.layer.4.output.dense.weight", "bert_model.encoder.layer.4.output.dense.bias", "bert_model.encoder.layer.4.output.LayerNorm.weight", "bert_model.encoder.layer.4.output.LayerNorm.bias", "bert_model.encoder.layer.5.attention.self.query.weight", "bert_model.encoder.layer.5.attention.self.query.bias", "bert_model.encoder.layer.5.attention.self.key.weight", "bert_model.encoder.layer.5.attention.self.key.bias", "bert_model.encoder.layer.5.attention.self.value.weight", "bert_model.encoder.layer.5.attention.self.value.bias", "bert_model.encoder.layer.5.attention.output.dense.weight", "bert_model.encoder.layer.5.attention.output.dense.bias", "bert_model.encoder.layer.5.attention.output.LayerNorm.weight", "bert_model.encoder.layer.5.attention.output.LayerNorm.bias", "bert_model.encoder.layer.5.intermediate.dense.weight", "bert_model.encoder.layer.5.intermediate.dense.bias", "bert_model.encoder.layer.5.output.dense.weight", "bert_model.encoder.layer.5.output.dense.bias", "bert_model.encoder.layer.5.output.LayerNorm.weight", "bert_model.encoder.layer.5.output.LayerNorm.bias", "bert_model.encoder.layer.6.attention.self.query.weight", "bert_model.encoder.layer.6.attention.self.query.bias", "bert_model.encoder.layer.6.attention.self.key.weight", "bert_model.encoder.layer.6.attention.self.key.bias", 
"bert_model.encoder.layer.6.attention.self.value.weight", "bert_model.encoder.layer.6.attention.self.value.bias", "bert_model.encoder.layer.6.attention.output.dense.weight", "bert_model.encoder.layer.6.attention.output.dense.bias", "bert_model.encoder.layer.6.attention.output.LayerNorm.weight", "bert_model.encoder.layer.6.attention.output.LayerNorm.bias", "bert_model.encoder.layer.6.intermediate.dense.weight", "bert_model.encoder.layer.6.intermediate.dense.bias", "bert_model.encoder.layer.6.output.dense.weight", "bert_model.encoder.layer.6.output.dense.bias", "bert_model.encoder.layer.6.output.LayerNorm.weight", "bert_model.encoder.layer.6.output.LayerNorm.bias", "bert_model.encoder.layer.7.attention.self.query.weight", "bert_model.encoder.layer.7.attention.self.query.bias", "bert_model.encoder.layer.7.attention.self.key.weight", "bert_model.encoder.layer.7.attention.self.key.bias", "bert_model.encoder.layer.7.attention.self.value.weight", "bert_model.encoder.layer.7.attention.self.value.bias", "bert_model.encoder.layer.7.attention.output.dense.weight", "bert_model.encoder.layer.7.attention.output.dense.bias", "bert_model.encoder.layer.7.attention.output.LayerNorm.weight", "bert_model.encoder.layer.7.attention.output.LayerNorm.bias", "bert_model.encoder.layer.7.intermediate.dense.weight", "bert_model.encoder.layer.7.intermediate.dense.bias", "bert_model.encoder.layer.7.output.dense.weight", "bert_model.encoder.layer.7.output.dense.bias", "bert_model.encoder.layer.7.output.LayerNorm.weight", "bert_model.encoder.layer.7.output.LayerNorm.bias", "bert_model.pooler.dense.weight", "bert_model.pooler.dense.bias".
##############################
I use the sample:
from igfold import IgFoldRunner

sequences = {
    "H": "QVQLQESGGGLVQAGGSLTLSCAVSGLTFSNYAMGWFRQAPGKEREFVAAITWDGGNTYYTDSVKGRFTISRDNAKNTVFLQMNSLKPEDTAVYYCAAKLLGSSRYELALAGYDYWGQGTQVTVS"
}
pred_pdb = "my_nanobody.pdb"

igfold = IgFoldRunner()
igfold.fold(
    pred_pdb,  # Output PDB file
    sequences=sequences,  # Nanobody sequence
    do_refine=False,  # Refine the antibody structure with PyRosetta
    do_renum=True,  # Renumber predicted antibody structure (Chothia)
)
and get the error:
Warning: AbNumber not available. Provide --use_abnum to renumber with the AbNum server.
cannot import name 'clean_pdb' from 'igfold.utils.pdb' (/data/miniconda3/envs/igfold/lib/python3.9/site-packages/igfold/utils/pdb.py)
Completed folding in 0.90 seconds.
How can I solve this?
Upon running the folding algorithm, I've noticed a few of my targets fail due to an error such as the following:
RuntimeError: The expanded size of the tensor (587) must match the existing size (512) at non-singleton dimension 1. Target sizes: [1, 587]. Tensor sizes: [1, 512]
This gives the impression that there is either a sequence-length limit or an error in how the input is processed. Which of these is the case?
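My reading of this error (not confirmed by the authors) is that the AntiBERTy encoder, like most BERT-style models, has a fixed positional-embedding table of 512 positions, so a 587-residue input cannot be embedded. A traceback elsewhere in this thread suggests `fold()` accepts a `truncate_sequences` flag; failing that, sequences can be clipped manually before folding. A minimal sketch, assuming the 512-token limit:

```python
MAX_LEN = 512  # assumed positional-embedding limit of the language model

def truncate_chain_sequences(sequences, max_len=MAX_LEN):
    """Clip each chain's sequence so the encoder's position table is not exceeded."""
    return {chain: seq[:max_len] for chain, seq in sequences.items()}

sequences = {"H": "Q" * 587}          # 587 residues, as in the error above
safe = truncate_chain_sequences(sequences)
print(len(safe["H"]))                 # 512
```

Note that blind truncation discards C-terminal residues, so for real targets it is worth checking which part of the chain is lost.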
I downloaded the model weights from https://data.graylab.jhu.edu/IgFold.tar.gz, but there is no igfold_4.ckpt in the archive. Where can I find it?
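In case it helps others hitting this, you can check which checkpoints the downloaded archive actually contains before assuming one is missing. A small sketch using only the standard library:

```python
import tarfile

def list_checkpoints(tar_path):
    """Return the .ckpt member names inside a gzipped tarball."""
    with tarfile.open(tar_path, "r:gz") as tf:
        return sorted(m.name for m in tf.getmembers() if m.name.endswith(".ckpt"))

# e.g. list_checkpoints("IgFold.tar.gz") prints whatever checkpoints shipped
```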
When I run antibody structure prediction with IgFoldRunner().fold(...), I get the error "cannot import name 'functional_datapipe' from 'torch.utils.data'". Do you know why this occurs? I installed every listed package requirement and followed your instructions step by step, but the error persists. I would appreciate your help. Thanks for the paper as well; it is excellent work!
Hello, IgFold is a nice piece of work. I am a rookie in antibody prediction from sequence to 3D structure.
The example runs smoothly, but when I run my own file, the output structures are bizarre. To be exact, the amino acid chain is broken, as shown in the picture. This problem doesn't happen when I truncate the sequence. I tried different files, and it happens over and over again. I would appreciate any advice.
Hello, I want to fine-tune your pretrained AntiBERTy model, but I am missing some details. Could you please provide a relevant demo? Thank you!
Recently I tried to reproduce the evaluation of IgFold, but could not find where to download the benchmark datasets. I tried the selection method described in the paper, but the number of structures I obtain is much larger than reported. Could you provide the relevant data? Thanks a lot.
Hello, I recently used your nanobody test data to evaluate the model's performance on nanobodies, and I would like more data. Is there a way to identify whether a structure is a nanobody, or to retrieve nanobody structures from the RCSB PDB? Thanks so much.
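One rough heuristic for filtering (my own, not from the authors): a nanobody (VHH) is a single heavy-chain variable domain of roughly 110-130 residues with no paired light chain; the nanobody sample input earlier in this thread fits that pattern. A sketch of such a filter over a chain-to-sequence mapping, with hypothetical length bounds:

```python
def looks_like_nanobody(chain_seqs, min_len=100, max_len=140):
    """Heuristic: exactly one heavy chain of VHH-like length, no light chain."""
    return (
        set(chain_seqs) == {"H"}
        and min_len <= len(chain_seqs["H"]) <= max_len
    )

print(looks_like_nanobody({"H": "Q" * 125}))                   # True
print(looks_like_nanobody({"H": "Q" * 120, "L": "D" * 110}))   # False: paired light chain
```

This will misclassify isolated VH domains and scFv constructs, so for serious dataset building a proper annotation tool (e.g. an antibody numbering package) is the safer route.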
When I follow the demo, the following error occurs:
Traceback (most recent call last):
File "", line 1, in
File "/data/personal/yankai/LiuChang/IgFold/igfold/IgFoldRunner.py", line 120, in fold
truncate_sequences=truncate_sequences,
File "/data/personal/yankai/LiuChang/IgFold/igfold/utils/folding.py", line 220, in fold
do_renum=do_renum,
File "/data/personal/yankai/LiuChang/IgFold/igfold/utils/folding.py", line 134, in process_prediction
from igfold.utils.abnumber_ import renumber_pdb
File "/data/personal/yankai/LiuChang/IgFold/igfold/utils/abnumber_.py", line 2, in
from abnumber import Chain
ModuleNotFoundError: No module named 'abnumber'
So, where can I find abnumber? I only found abnumber_.py.
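To clarify what's happening here: abnumber is a separate package, not part of IgFold, and abnumber_.py is only IgFold's thin wrapper around it (the trailing underscore distinguishes the wrapper from the dependency). It appears to be installable from PyPI or conda, e.g. `pip install abnumber`, though I have not verified every platform. Until it is installed, renumbering can be skipped with a guarded import, sketched here:

```python
try:
    from abnumber import Chain  # optional dependency, used only for renumbering
    HAVE_ABNUMBER = True
except ImportError:
    HAVE_ABNUMBER = False

# Pass do_renum=HAVE_ABNUMBER to igfold.fold(...) so renumbering is
# skipped when the package is missing instead of raising ModuleNotFoundError.
print(HAVE_ABNUMBER)
```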
Hello, I installed IgFold following the installation process you provided, but after a successful installation I could not run the test case. I checked the code and found that the model weights could not be loaded. Could you tell me how to obtain the weights? Many thanks!
Hi,
After installation and running the script, it gives:
This seems to be caused by the models array being empty in igfold/utils/folding.py (verified with print(models)). In IgFoldRunner.py, although the output says 4 IgFold models were successfully loaded, the list model_ckpts is empty. model_ckpts should be loaded from trained_models, but no such directory currently exists.
Please help solve this, thanks!
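A quick way to confirm this diagnosis is to check whether any .ckpt files exist where the runner looks for them (the directory name below is taken from the report above; treat the exact path as an assumption):

```python
from pathlib import Path

def find_checkpoints(model_dir):
    """Return sorted .ckpt paths; an empty list means no weights were found."""
    return sorted(Path(model_dir).glob("*.ckpt"))

# e.g. find_checkpoints("trained_models") returning [] would reproduce
# the empty model_ckpts list described above.
```

If the list is empty, the weights archive was never downloaded or was extracted to the wrong location, which matches the "models array is empty" symptom.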
Hi @jeffreyruffolo - As I think a few others have noted, the version is missing for pytorch3d, which is causing an error in conda when we try to install:
ERROR: Could not find a version that satisfies the requirement pytorch3d (from igfold) (from versions: none)
ERROR: No matching distribution found for pytorch3d
Is there a preferred version of pytorch3d that you have tested works for igfold?