
ppi_gnn's Introduction

PPI_GNN

To replicate the results reported in the paper, follow these steps:

  1. Download Pan's human feature files and place them at ../human_features/processed/; the download link is given in PPI_GNN/Human_features/README.md. For the S. cerevisiae PPI dataset, download the input feature file and place it at ../S. cerevisiae/processed/; the link is given in PPI_GNN/S. cerevisiae/README.md.
  2. Then train the model with the command: python train.py

The steps for predicting protein interactions on a new dataset are:

  1. First, extract node features from the protein sequences with the SeqVec method (seqvec_embedding.py) and then build the protein graphs (proteins_to_graphs.py); a minimal sketch of this graph-construction step is given after this list.
  2. Next, use the command "python data_prepare.py" to generate the input features for the model.
  3. Then, use the command "python train.py" to train the model.
  4. Finally, use the command "python test.py" to evaluate the trained model on unseen data (the test set).
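
The sketch below shows, under stated assumptions, how a residue-level protein graph can be assembled from SeqVec embeddings: node features are the per-residue 1024-dimensional SeqVec vectors, and edges connect residues whose C-alpha atoms lie within a distance threshold. The function name build_protein_graph, the 8 Å cut-off, and the use of torch_geometric.data.Data are illustrative choices, not necessarily what proteins_to_graphs.py does.

    # Minimal sketch, assuming per-residue SeqVec embeddings and C-alpha coordinates
    # are already available; the threshold and names are placeholders.
    import numpy as np
    import torch
    from torch_geometric.data import Data

    def build_protein_graph(embeddings: np.ndarray, ca_coords: np.ndarray,
                            contact_threshold: float = 8.0) -> Data:
        """embeddings: (num_residues, 1024) SeqVec features; ca_coords: (num_residues, 3)."""
        # Node features: one SeqVec vector per residue.
        x = torch.tensor(embeddings, dtype=torch.float)
        # Edges: residue pairs whose C-alpha atoms are closer than the threshold.
        dists = np.linalg.norm(ca_coords[:, None, :] - ca_coords[None, :, :], axis=-1)
        src, dst = np.nonzero((dists < contact_threshold) & (dists > 0))
        edge_index = torch.tensor(np.stack([src, dst]), dtype=torch.long)
        return Data(x=x, edge_index=edge_index)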

To create the ppi_env environment, run: $ conda env create -f ppi_env.yml

ppi_gnn's People

Contributors

jhakanchan15


ppi_gnn's Issues

S. cerevisiae processed data access denied

Hi jha,

This is excellent work on PPI prediction. I want to reproduce the prediction results and compare against them in my own work. However, access to the S. cerevisiae processed data on Google Drive is denied. Could you share the data again?
Thanks very much!

Honchkrow

Unable to run

Hello there,

It would be greatly appreciated if you could also upload the environment file somewhere so that your results are reproducible.

Otherwise we have no way of knowing which versions of PyTorch and the other dependencies are needed to run your model.

Thanks!

IndexError: list index out of range for torch.load(glob.glob(prot_1)[0])

Dear Sir/Madam,

My update:
As I don't have the complete dataset, I suspect the original issue comes from the following:

  1. npy_file_new(human_dataset).npy contains 22,217 entries.
  2. The currently available human data covers only 4,444 + 1,111 = 5,555 proteins.
    This mismatch causes the problem below. Please feel free to correct me. Thanks.

Original issue:
I am running this project on Google Colab. This might not be a bug in the code, but I don't know how to solve it.
The run fails with: IndexError: list index out of range.
Part of the output:
    GCNN Loaded
    Training on 4444 samples.....
    15657
    first prot is /content/gdrive/MyDrive/PPI_GNN/PPI_GNN/human_features/processed/3AIH.pt
    []
    15657
    Second prot is /content/gdrive/MyDrive/PPI_GNN/PPI_GNN/human_features/processed/1DEV.pt
    Traceback (most recent call last):
      File "train.py", line 97, in <module>
        train(model, device, trainloader, optimizer, epoch+1)
      File "train.py", line 45, in train
        for count,(prot_1, prot_2, label) in enumerate(trainloader):
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 530, in __next__
        data = self._next_data()
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 570, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataset.py", line 471, in __getitem__
        return self.dataset[self.indices[idx]]
      File "/content/gdrive/MyDrive/PPI_GNN/PPI_GNN/data_prepare.py", line 41, in __getitem__
        prot_1 = torch.load(glob.glob(prot_1)[0])
    IndexError: list index out of range

The error comes from the code:
    def __getitem__(self, index):
        prot_1 = os.path.join(self.processed_dir, self.protein_1[index] + ".pt")
        print(index)
        print(f'first prot is {prot_1}')
        print(glob.glob('prot_1'))
        prot_2 = os.path.join(self.processed_dir, self.protein_2[index] + ".pt")
        print(index)
        print(f'Second prot is {prot_2}')
        prot_1 = torch.load(glob.glob(prot_1)[0])
        print(f'Here lies {glob.glob(prot_2)}')
        prot_2 = torch.load(glob.glob(prot_2)[0])
        print(torch.tensor(self.label[index]))
        return prot_1, prot_2, torch.tensor(self.label[index])

It seems that glob.glob(prot_1) returns an empty list. How can I solve this problem?
Thanks in advance.
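
For anyone hitting the same IndexError: a small diagnostic along the lines below (the directory path is only an assumption based on the paths in the traceback) lists which protein IDs have no processed .pt graph, which is exactly the situation in which glob.glob(prot_1) returns an empty list and indexing [0] fails.

    # Sketch: report protein IDs whose processed graph file is missing.
    import glob
    import os

    processed_dir = "human_features/processed"   # adjust to your local path

    def find_missing(protein_ids):
        """Return the IDs whose <id>.pt file is absent from processed_dir."""
        return [pid for pid in protein_ids
                if not glob.glob(os.path.join(processed_dir, f"{pid}.pt"))]

    # In the log above, the failure occurs while resolving 3AIH.pt,
    # so an incomplete download would show it here as missing.
    print(find_missing(["3AIH", "1DEV"]))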

Restricted access to original processed folder

I would like to replicate the results mentioned in the paper. However, access to the original processed folder containing all 4,188 protein graphs is restricted and requires permission. I would greatly appreciate it if you could grant me access to the folder. Thank you very much.

Runtime error

I was trying to replicate your results. I followed README.md to set up this repo. However, when I run it I get a RuntimeError, as follows:

RuntimeError: The 'data' object was created by an older version of PyG. If this error occurred while loading an already existing dataset, remove the 'processed/' directory in the dataset's root folder and try again.

I believe this error occurs because of the DataLoader in data_prepare.py. Any suggestions on how to fix it? Much appreciated.
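
As a first check, it can help to print the installed versions before deciding whether to regenerate the processed/ files or to match the PyG version that created them; a minimal sketch:

    # Print the PyTorch and PyTorch Geometric versions in the current environment.
    import torch
    import torch_geometric

    print("torch:", torch.__version__)
    print("torch_geometric:", torch_geometric.__version__)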

proteins_to_graphs

I am trying to understand the pipeline for converting proteins to graphs, and I am stuck on this line:

    ftrs = np.load("../human_features/pdb_to_seqvec_dict.npy", allow_pickle=True)
Where did you get the file pdb_to_seqvec_dict.npy, and what does it contain?
Appreciate any answer.
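
For reference, here is a hedged sketch of how a {pdb_id: per-residue SeqVec embedding} dictionary of this kind could be produced (the option/weight file names and the example sequences are placeholders; seqvec_embedding.py in the repository is the authoritative version). SeqVec computes per-residue embeddings with allennlp's ElmoEmbedder and the weights released by the SeqVec authors:

    # Sketch: build and save a PDB-id -> (L, 1024) SeqVec embedding dictionary.
    import numpy as np
    import torch
    from allennlp.commands.elmo import ElmoEmbedder

    # Paths to the SeqVec model files are placeholders.
    embedder = ElmoEmbedder("options.json", "weights.hdf5", cuda_device=-1)

    sequences = {"3AIH": "MKT...", "1DEV": "GSSG..."}   # hypothetical id -> sequence map
    pdb_to_seqvec = {}
    for pdb_id, seq in sequences.items():
        emb = embedder.embed_sentence(list(seq))        # shape (3, L, 1024)
        pdb_to_seqvec[pdb_id] = torch.tensor(emb).sum(dim=0).numpy()  # per-residue (L, 1024)

    np.save("pdb_to_seqvec_dict.npy", pdb_to_seqvec, allow_pickle=True)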
