benedekrozemberczki / clustergcn

A PyTorch implementation of "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" (KDD 2019).

License: GNU General Public License v3.0

Python 100.00%

gcn graph-convolution graph-neural-networks graph-convolutional-networks deepwalk node2vec pytorch graphsage graph2vec musae

clustergcn's Introduction

Benedek A. Rozemberczki/ Homepage / Twitter / GitHub / Google Scholar

Welcome stranger

⏰ Currently working on machine learning for drug discovery.
🤖 I would love to collaborate on the machine learning libraries ChemicalX and RexMex.

Great news

🧬 MOOMIN: Deep Molecular Omics Network for Anti-Cancer Drug Combination Therapy was accepted at CIKM 2022.
🪙 The Shapley Value in Machine Learning was accepted at IJCAI 2022.
⭐ A Unified View of Relational Deep Learning for Drug Pair Scoring was accepted at IJCAI 2022.
⚗️ ChemicalX: A Deep Learning Library for Drug Pair Scoring was accepted at KDD 2022.


clustergcn's Issues

ppi

```python
import torch
import time
import torch.nn as nn
import torch.nn.functional as F
import os.path as osp
from torch_geometric.datasets import PPI
from ppi_cluster import ClusterData, ClusterLoader
from torch_geometric.nn import SAGEConv, ChebConv
from sklearn.metrics import f1_score

# Resolve the data directory relative to this script.
path = osp.join(osp.dirname(osp.realpath(__file__)), '..', 'data', 'PPI')
dataset = PPI(path)
train_dataset = PPI(path, split='train')  # 20 graphs
val_dataset = PPI(path, split='val')  # 2 graphs
test_dataset = PPI(path, split='test')  # 2 graphs

print('Partitioning the graph... (this may take a while)')
train_dataset_list = []
val_dataset_list = []
test_dataset_list = []
dataset_list = []
train_dataset_index = test_dataset_index = val_dataset_index = 0

# Partition every training graph into clusters and wrap each in a loader.
for data in train_dataset:
    cluster_data = ClusterData(data, 'train', train_dataset_index, num_parts=2, recursive=False,
                               save_dir=dataset.processed_dir)
    loader = ClusterLoader(cluster_data, batch_size=20, shuffle=True,
                           num_workers=0)
    train_dataset_list.append(loader)
    dataset_list.append(loader)
    train_dataset_index += 1

# Same for the test graphs.
for data in test_dataset:
    cluster_data = ClusterData(data, 'test', test_dataset_index, num_parts=2, recursive=False,
                               save_dir=dataset.processed_dir)
    loader = ClusterLoader(cluster_data, batch_size=20, shuffle=True,
                           num_workers=0)
    test_dataset_list.append(loader)
    dataset_list.append(loader)
    test_dataset_index += 1

# Same for the validation graphs.
for data in val_dataset:
    cluster_data = ClusterData(data, 'val', val_dataset_index, num_parts=2, recursive=False,
                               save_dir=dataset.processed_dir)
    loader = ClusterLoader(cluster_data, batch_size=20, shuffle=True,
                           num_workers=0)
    val_dataset_list.append(loader)
    dataset_list.append(loader)
    val_dataset_index += 1

print('Done!')
```

Predicting labels with respect to each node

How can I use the same model and code to predict node-wise labels, given that the original node IDs change when the clusters are formed?
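One way to approach this is to keep, for every cluster, the list of original node IDs in the same order as the rows fed to the model, and then scatter the per-cluster predictions back through that list. The sketch below is a minimal illustration under that assumption; `scatter_predictions`, `cluster_nodes` and `cluster_predictions` are hypothetical names, not part of the repository.

```python
from typing import Dict, List


def scatter_predictions(
    cluster_nodes: Dict[int, List[int]],
    cluster_predictions: Dict[int, List[int]],
) -> Dict[int, int]:
    """Map per-cluster predictions back to original node IDs.

    cluster_nodes[c][i] is the original ID of the node stored at local
    position i of cluster c; cluster_predictions[c][i] is its predicted label.
    """
    node_to_label: Dict[int, int] = {}
    for cluster, nodes in cluster_nodes.items():
        predictions = cluster_predictions[cluster]
        if len(predictions) != len(nodes):
            raise ValueError(f"cluster {cluster}: size mismatch")
        for original_id, label in zip(nodes, predictions):
            node_to_label[original_id] = label
    return node_to_label


# Hypothetical example: two clusters covering six nodes.
cluster_nodes = {0: [4, 0, 2], 1: [5, 1, 3]}
cluster_predictions = {0: [1, 0, 1], 1: [0, 0, 1]}
labels = scatter_predictions(cluster_nodes, cluster_predictions)
print([labels[node] for node in sorted(labels)])
```

The only requirement is that the local-to-original mapping is stored at partition time; the model itself never needs to see the original IDs.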

ppi dataset

```python
import torch
import time
import torch.nn as nn
import torch.nn.functional as F
import os.path as osp
from torch_geometric.datasets import PPI
from ppi_cluster import ClusterData, ClusterLoader
from torch_geometric.nn import SAGEConv, ChebConv
from sklearn.metrics import f1_score

print('Done!')
```

AttributeError: 'Graph' object has no attribute 'node'

I encounter this error when running python src/main.py

Cannot run main.py

src/main.py --epochs 100
+-------------------+----------------------+
| Parameter | Value |
+===================+======================+
| Cluster number | 10 |
+-------------------+----------------------+
| Clustering method | metis |
+-------------------+----------------------+
| Dropout | 0.500 |
+-------------------+----------------------+
| Edge path | ./input/edges.csv |
+-------------------+----------------------+
| Epochs | 100 |
+-------------------+----------------------+
| Features path | ./input/features.csv |
+-------------------+----------------------+
| Layers | [16, 16, 16] |
+-------------------+----------------------+
| Learning rate | 0.010 |
+-------------------+----------------------+
| Seed | 42 |
+-------------------+----------------------+
| Target path | ./input/target.csv |
+-------------------+----------------------+
| Test ratio | 0.900 |
+-------------------+----------------------+

Metis graph clustering started.

Traceback (most recent call last):
  File "src/main.py", line 24, in <module>
    main()
  File "src/main.py", line 18, in main
    clustering_machine.decompose()
  File "/Users/linmiao/gits/ClusterGCN/src/clustering.py", line 38, in decompose
    self.metis_clustering()
  File "/Users/linmiao/gits/ClusterGCN/src/clustering.py", line 56, in metis_clustering
    (st, parts) = metis.part_graph(self.graph, self.args.cluster_number)
  File "/usr/local/lib/python3.7/site-packages/metis.py", line 765, in part_graph
    graph = networkx_to_metis(graph)
  File "/usr/local/lib/python3.7/site-packages/metis.py", line 574, in networkx_to_metis
    for i in H.node:
AttributeError: 'Graph' object has no attribute 'node'
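A likely cause, stated here as an assumption rather than a confirmed diagnosis, is a NetworkX version mismatch: the `Graph.node` attribute that metis.py reads was removed in NetworkX 2.4 in favor of `Graph.nodes`. The sketch below only checks a version string against that cutoff; `graph_node_attribute_removed` is a hypothetical helper.

```python
import re


def graph_node_attribute_removed(networkx_version: str) -> bool:
    """Return True if this NetworkX version no longer has `Graph.node`.

    Assumption: the attribute was removed in NetworkX 2.4.
    """
    match = re.match(r"(\d+)\.(\d+)", networkx_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {networkx_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (2, 4)


for version in ("2.3", "2.4", "3.1"):
    status = "removed" if graph_node_attribute_removed(version) else "available"
    print(f"networkx {version}: Graph.node {status}")
```

In a real environment you would pass `networkx.__version__`. If the attribute is gone, two workarounds that may help are installing a NetworkX release older than 2.4, or changing `H.node` to `H.nodes` in the local copy of metis.py.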

The error of metis, Segmentation fault (core dumped)

I found that I can use the random method to partition the graph, but with Metis the program terminates abnormally. What causes this? I changed "IDXTYPEWIDTH = os.getenv('METIS_IDXTYPEWIDTH', '32')" in metis.py (line 31) to "IDXTYPEWIDTH = os.getenv('METIS_IDXTYPEWIDTH', '64')", but it did not help.

python src/main.py
+-------------------+----------------------+
| Parameter | Value |
+===================+======================+
| Cluster number | 10 |
+-------------------+----------------------+
| Clustering method | metis |
+-------------------+----------------------+
| Dropout | 0.500 |
+-------------------+----------------------+
| Edge path | ./input/edges.csv |
+-------------------+----------------------+
| Epochs | 200 |
+-------------------+----------------------+
| Features path | ./input/features.csv |
+-------------------+----------------------+
| Layers | [16, 16, 16] |
+-------------------+----------------------+
| Learning rate | 0.010 |
+-------------------+----------------------+
| Seed | 42 |
+-------------------+----------------------+
| Target path | ./input/target.csv |
+-------------------+----------------------+
| Test ratio | 0.900 |
+-------------------+----------------------+

Metis graph clustering started.

Segmentation fault (core dumped)

Why is the Feature Matrix (X) sparse instead of the Adjacency Matrix (A)?

line 40 in utils.py
features = coo_matrix((feature_values, (node_index, feature_index)), shape=(node_count, feature_count)).toarray()
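Note that the quoted line ends in `.toarray()`, so the sparse COO matrix is only an intermediate step: the features are read as (node, feature, value) triplets and then expanded to a dense array. A pure-Python sketch of that expansion, with `triplets_to_dense` as a hypothetical stand-in for the SciPy call, might look like this:

```python
from typing import List, Sequence


def triplets_to_dense(
    node_index: Sequence[int],
    feature_index: Sequence[int],
    feature_values: Sequence[float],
    node_count: int,
    feature_count: int,
) -> List[List[float]]:
    """Expand (row, column, value) triplets into a dense row-major matrix."""
    dense = [[0.0] * feature_count for _ in range(node_count)]
    for row, col, value in zip(node_index, feature_index, feature_values):
        # Like COO-to-dense conversion, duplicate coordinates are summed.
        dense[row][col] += value
    return dense


# Hypothetical toy input: 3 nodes, 4 features, 3 non-zero entries.
dense = triplets_to_dense([0, 0, 2], [1, 3, 0], [1.0, 2.0, 5.0], 3, 4)
for row in dense:
    print(row)
```

The triplet format keeps the CSV small when most feature values are zero, while the model still receives a dense matrix.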

The implementation of the 'Stochastic Multiple Partitions' method

I appreciate your implementation of Cluster-GCN in PyTorch. However, I couldn't find the 'Stochastic Multiple Partitions' method in it. Or did I just miss it?
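For readers looking for the idea itself, here is a minimal sketch of stochastic multiple partitions as described in the paper: partition the graph into many clusters, sample several of them for each batch, and keep the edges between the sampled clusters as well as those inside them. This is an illustration under those assumptions, not code from this repository; `sample_cluster_batch` and its parameters are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def sample_cluster_batch(
    membership: Dict[int, int],
    edges: List[Edge],
    clusters_per_batch: int,
    rng: random.Random,
) -> Tuple[List[int], List[Edge]]:
    """Pick several clusters at random and return their combined subgraph.

    membership maps node ID -> cluster ID. Edges between two sampled
    clusters are kept, not only the edges inside each cluster.
    """
    cluster_ids = sorted(set(membership.values()))
    chosen = set(rng.sample(cluster_ids, clusters_per_batch))
    batch_nodes = sorted(n for n, c in membership.items() if c in chosen)
    node_set = set(batch_nodes)
    batch_edges = [(u, v) for u, v in edges if u in node_set and v in node_set]
    return batch_nodes, batch_edges


# Hypothetical toy graph: a path over six nodes split into three clusters.
membership = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
nodes, batch_edges = sample_cluster_batch(membership, edges, 2, random.Random(42))
print(nodes, batch_edges)
```

Drawing a fresh sample every step means a between-cluster edge is only lost in batches where its two clusters are not sampled together.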

About installation

Hi there:
Thank you for your great work; I've finally got the code running.
To make the installation section of README.md more precise and complete, you may want to add the following dependencies:

torch_spline_conv == 1.0.4
torch_sparse == 0.2.2
torch_scatter == 1.0.4
torch_cluster == 1.1.5 (strict)

Failed to locate Metis

Hi.
I installed metis with 'pip install metis' and an error occurred: 'RuntimeError: Could not locate METIS dll. Please set the METIS_DLL environment variable to its full path.' From https://metis.readthedocs.io/en/latest/_modules/metis.html I learned that I needed to download this: http://glaros.dtc.umn.edu/gkhome/views/metis
but I still failed to build the metis source code.

Could you please give me some advice on solving this problem? Thank you in advance!

some code is missing

There is no code for the 'Stochastic Multiple Partitions' and 'Issues of training deeper GCNs' parts of the paper...

RuntimeError: Could not locate METIS dll.

Hello, when I run main.py, this error message appears:

raise RuntimeError('Could not locate METIS dll. Please set the METIS_DLL environment variable to its full path.')
RuntimeError: Could not locate METIS dll. Please set the METIS_DLL environment variable to its full path.

Do you know how to solve it?
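The error message itself names the fix: the `METIS_DLL` environment variable must hold the full path of the compiled METIS shared library before `metis` is imported. A small sketch of doing this from Python follows; `configure_metis` is a hypothetical helper, and the path `~/.local/lib/libmetis.so` is only an example that depends on where METIS was built.

```python
import os
from pathlib import Path


def configure_metis(library_path: str) -> str:
    """Point the metis wrapper at a compiled METIS shared library.

    Must be called before `import metis`, since the wrapper looks up
    METIS_DLL when it is imported.
    """
    resolved = Path(library_path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"no METIS library at {resolved}")
    os.environ["METIS_DLL"] = str(resolved)
    return str(resolved)


if __name__ == "__main__":
    try:
        # Example location only; adjust to your own build prefix.
        print("METIS_DLL set to", configure_metis("~/.local/lib/libmetis.so"))
    except FileNotFoundError as error:
        print(error)
```

On Windows the file would be a `.dll` and on macOS a `.dylib`. Installing the Python package with pip does not build the native library, which is why the variable is needed in the first place.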

For ppi

Hello. Thanks for your work and code. It's great that Cluster-GCN achieves such strong performance on the PPI dataset. But it seems that you have not open-sourced the code for PPI node classification.

Do you first select the best model on the validation set and then test it on the unseen test set?
I notice that GraphStar is now the SOTA. However, they don't use the validation set and select the best model directly on the test set.

Can you share the PPI code with us and describe how to split the dataset in the readme file? It's important for others who want to build on your work.

ImportError: No module named 'torch_spline_conv'

I followed the installation instructions properly; however, the error above occurred.

After checking the site-packages folder, I did not find torch_spline_conv.
I will search around to find out why this is happening, but thought you might have some insights.

Any help is appreciated.

The complete trace is as follows

File "src/main.py", line 4, in <module>
    from clustergcn import ClusterGCNTrainer
  File "/media/anuj/Softwares & Study Material/Study Material/MS Stuff/RA/ClusterGCN/src/clustergcn.py", line 5, in <module>
    from layers import StackedGCN
  File "/media/anuj/Softwares & Study Material/Study Material/MS Stuff/RA/ClusterGCN/src/layers.py", line 2, in <module>
    from torch_geometric.nn import GCNConv
  File "/home/anuj/virtualenv-forest/gcn/lib/python3.5/site-packages/torch_geometric/nn/__init__.py", line 1, in <module>
    from .conv import *  # noqa
  File "/home/anuj/virtualenv-forest/gcn/lib/python3.5/site-packages/torch_geometric/nn/conv/__init__.py", line 1, in <module>
    from .spline_conv import SplineConv
  File "/home/anuj/virtualenv-forest/gcn/lib/python3.5/site-packages/torch_geometric/nn/conv/spline_conv.py", line 3, in <module>
    from torch_spline_conv import SplineConv as Conv
ImportError: No module named 'torch_spline_conv'

How to generate embeddings using your library if my input is in GraphSON (JSON) or GraphML (XML) format?
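Since the tool reads an edge list from CSV, one option is to convert the graph file to integer-indexed edges first. The sketch below handles the GraphML case with the standard library only; `graphml_to_edges` is a hypothetical helper and it assumes the standard GraphML namespace. GraphSON would need a similar pass over its JSON structure.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"

SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="a"/><node id="b"/><node id="c"/>
    <edge source="a" target="b"/>
    <edge source="b" target="c"/>
  </graph>
</graphml>"""


def graphml_to_edges(graphml_text: str) -> Tuple[Dict[str, int], List[Tuple[int, int]]]:
    """Return (original ID -> integer index, list of integer edges)."""
    root = ET.fromstring(graphml_text)
    node_ids: Dict[str, int] = {}
    # Number nodes consecutively in document order.
    for node in root.iter(f"{GRAPHML_NS}node"):
        node_ids.setdefault(node.attrib["id"], len(node_ids))
    edges: List[Tuple[int, int]] = []
    for edge in root.iter(f"{GRAPHML_NS}edge"):
        # Endpoints that were never declared as nodes get the next free index.
        source = node_ids.setdefault(edge.attrib["source"], len(node_ids))
        target = node_ids.setdefault(edge.attrib["target"], len(node_ids))
        edges.append((source, target))
    return node_ids, edges


node_ids, edges = graphml_to_edges(SAMPLE)
print(node_ids)
print(edges)
```

Keeping `node_ids` around lets you translate results back to the identifiers used in the original file.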

TypeError: object of type 'int' has no len()

Hello, when I run main.py, I get this error message:
File "D:\anaconda3.4\lib\site-packages\pymetis\__init__.py", line 44, in _prepare_graph
for i in range(len(adjacency)):
TypeError: object of type 'int' has no len()

I installed the pymetis package to work around the missing metis.dll; this error occurs in pymetis\__init__.py.
Do you know how to solve it?
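A plausible explanation, offered as an assumption, is that pymetis is not a drop-in replacement for the metis wrapper: `pymetis.part_graph` takes the number of parts first and expects an adjacency list, so calling it as `part_graph(graph, cluster_number)` hands it an integer where the adjacency list should be. The sketch below builds an adjacency list from an edge list and calls pymetis only if it is installed; `edges_to_adjacency` is a hypothetical helper and node IDs are assumed to be 0..n-1.

```python
from typing import Iterable, List, Set, Tuple


def edges_to_adjacency(edges: Iterable[Tuple[int, int]], node_count: int) -> List[List[int]]:
    """Build an undirected adjacency list; entry i lists the neighbors of node i."""
    neighbors: List[Set[int]] = [set() for _ in range(node_count)]
    for u, v in edges:
        if u == v:
            continue  # self-loops are dropped in this sketch
        neighbors[u].add(v)
        neighbors[v].add(u)
    return [sorted(adjacent) for adjacent in neighbors]


adjacency = edges_to_adjacency([(0, 1), (1, 2), (2, 0), (2, 3)], node_count=4)
print(adjacency)

try:
    import pymetis
except ImportError:
    pymetis = None  # the conversion above still works without it

if pymetis is not None:
    # Part count first, graph second: the reverse of the metis wrapper.
    _, membership = pymetis.part_graph(2, adjacency=adjacency)
    print(membership)
```

With a NetworkX graph, the same adjacency list can be built from its edge list once the nodes are relabeled to consecutive integers.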

Metis Segmentation Fault (Core Dumped)

Hi @benedekrozemberczki, thanks so much for sharing the code. After following the installation instructions, it seems that metis for Python fails to install correctly. I can run main.py with the random partition method, but I always get "aborted (core dumped)" with the metis partition method. Could you help me track down the issue? Any suggestions would be appreciated.

Download metis-5.1.0.tar.gz from http://glaros.dtc.umn.edu/gkhome/metis/metis/download and unpack it
cd metis-5.1.0
make config shared=1 prefix=~/.local/
make install
export METIS_DLL=~/.local/lib/libmetis.so

Here is the official example for checking metis. However, I always get the "aborted (core dumped)" error.

```python
import networkx as nx
import metis
G = metis.example_networkx()
(edgecuts, parts) = metis.part_graph(G, 3)
```

===== My System Information ====
Ubuntu 18, python 3.7

How do I generate the three files in the 'input' folder when I want to run a new dataset?
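As a starting point, the three files appear to be an edge list, a sparse feature table of (node, feature, value) triplets, and a per-node target table. The sketch below writes them with the standard library. The column headers used here (`id1`, `id2`, `node_id`, `feature_id`, `value`, `target`) are assumptions and should be checked against the sample files in the repository's input folder; `write_input_files` is a hypothetical helper.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Tuple


def write_input_files(
    out_dir: str,
    edges: Iterable[Tuple[int, int]],
    features: Dict[int, Dict[int, float]],
    targets: Dict[int, int],
) -> None:
    """Write edges.csv, features.csv and target.csv into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    with open(out / "edges.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id1", "id2"])  # assumed header names
        writer.writerows(edges)

    with open(out / "features.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["node_id", "feature_id", "value"])  # assumed header names
        # One row per non-zero feature entry.
        for node, row in sorted(features.items()):
            for feature, value in sorted(row.items()):
                writer.writerow([node, feature, value])

    with open(out / "target.csv", "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["node_id", "target"])  # assumed header names
        for node, label in sorted(targets.items()):
            writer.writerow([node, label])


# Hypothetical toy dataset with three nodes.
with tempfile.TemporaryDirectory() as directory:
    write_input_files(
        directory,
        edges=[(0, 1), (1, 2)],
        features={0: {1: 1.0, 3: 2.0}, 2: {0: 5.0}},
        targets={0: 0, 1: 1, 2: 0},
    )
    print((Path(directory) / "features.csv").read_text())
```

Node IDs are assumed to be consecutive integers starting at 0, so that they can double as matrix row indices.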

I would like to know why the METIS installation instructions are for Ubuntu while the commands are run on a Mac.

ppi with clusterdata

`print('Done!')`

Error: AttributeError: 'Graph' object has no attribute 'node'

File "/home/sjyjya/anaconda3/envs/tftlc/lib/python3.7/site-packages/metis.py", line 574, in networkx_to_metis
for i in H.node:
AttributeError: 'Graph' object has no attribute 'node'

ppi

import torch import time

issues about the

(st, parts) = metis.part_graph(self.graph, self.args.cluster_number)

How can I run it on Reddit? It is difficult to convert the adjacency matrix into an nx.Graph because it is so large.

Amazon2M Dataset

Hi,

Are you planning on releasing the Amazon2M dataset used in the paper?

Thanks,
Emanuele

Runtime error about metis

At the beginning of training, when the full graph is partitioned, the call "metis.part_graph(self.graph, self.args.cluster_number)" throws an error:

Traceback (most recent call last):
  File "C:/Users/xieRu/Desktop/ML/ClusterGCN/src/main.py", line 30, in <module>
    main()
  File "C:/Users/xieRu/Desktop/ML/ClusterGCN/src/main.py", line 19, in main
    clustering_machine.decompose()
  File "C:\Users\xieRu\Desktop\ML\ClusterGCN\src\clustering.py", line 38, in decompose
    self.metis_clustering()
  File "C:\Users\xieRu\Desktop\ML\ClusterGCN\src\clustering.py", line 56, in metis_clustering
    (st, parts) = metis.part_graph(self.graph, self.args.cluster_number)
  File "D:\Program\Anaconda\lib\site-packages\metis.py", line 800, in part_graph
    _METIS_PartGraphKway(*args)
  File "D:\Program\Anaconda\lib\site-packages\metis.py", line 677, in _METIS_PartGraphKway
    adjwgt, nparts, tpwgts, ubvec, options, objval, part)
OSError: exception: access violation writing 0x000001B0B9C0E000

But when I tested the metis package as follows, it worked:

```python
import metis
from networkx import karate_club_graph

zkc = karate_club_graph()
graph_clustering = metis.part_graph(zkc)
```

So, what happened?

Different partition size

Metis may produce subgraphs with unequal numbers of nodes, so the adjacency matrices will differ in size. How do you handle this during training?
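One common way to deal with this, sketched here as an assumption about the general approach and not as a description of this repository, is to never stack clusters into one tensor. Each cluster becomes its own subgraph with locally re-indexed edges and is processed in a separate step, so sizes are free to differ. `build_cluster_subgraphs` is a hypothetical helper.

```python
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[int, int]


def build_cluster_subgraphs(
    membership: Sequence[int],
    edges: Sequence[Edge],
) -> Dict[int, Tuple[List[int], List[Edge]]]:
    """Return, per cluster, its original node IDs and locally indexed edges.

    membership[i] is the cluster of node i. Edges that cross clusters are dropped.
    """
    clusters: Dict[int, List[int]] = {}
    for node, cluster in enumerate(membership):
        clusters.setdefault(cluster, []).append(node)

    subgraphs: Dict[int, Tuple[List[int], List[Edge]]] = {}
    for cluster, nodes in clusters.items():
        # Local index = position of the node inside its cluster.
        local = {node: position for position, node in enumerate(nodes)}
        local_edges = [(local[u], local[v]) for u, v in edges if u in local and v in local]
        subgraphs[cluster] = (nodes, local_edges)
    return subgraphs


# Hypothetical toy graph: a path over five nodes, clusters of size 3 and 2.
membership = [0, 0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
subgraphs = build_cluster_subgraphs(membership, edges)
for cluster, (nodes, local_edges) in subgraphs.items():
    print(cluster, len(nodes), local_edges)
```

Because graph convolution layers that take an edge list share their weights across nodes, the same model can be applied to a 3-node and a 2-node subgraph without padding.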

How to install metis? My metis module has no attribute part_graph. Help! Thanks.

Some questions about the code

It seems that your code does not consider the connections between clusters or the normalization mentioned in the paper. Will you add these two options?
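For reference, the normalization in question is, as far as I recall from the paper's section on training deeper GCNs, the diagonal enhancement below. Treat it as a recollection to verify against the paper; here $A$ is the adjacency matrix, $D$ the degree matrix, $\lambda$ a hyperparameter, and $X^{(l)}$, $W^{(l)}$ the features and weights of layer $l$.

```latex
\tilde{A} = (D + I)^{-1}(A + I), \qquad
X^{(l+1)} = \sigma\!\left( \left( \tilde{A} + \lambda\, \mathrm{diag}(\tilde{A}) \right) X^{(l)} W^{(l)} \right)
```

The extra $\lambda\, \mathrm{diag}(\tilde{A})$ term gives each node's own representation more weight than its neighbors', which the paper motivates as helping deeper networks train.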

Segmentation Fault when running code

Hi, I was running your code and encountered a segmentation fault.
The error happened at
clustering.py line 58
(st, parts) = metis.part_graph(self.graph, self.args.cluster_number)

Has anyone else encountered this issue?

I've changed the setting to
IDXTYPEWIDTH = os.getenv('METIS_IDXTYPEWIDTH', '64')
and my Python version is 3.7.

I know the version in the repo is 3.5, but I ran into trouble installing torch-scatter and torch-sparse with Python 3.7,
so I switched to Python 3.5.

Frame problem

Hello, I would like to ask: my data is fully labeled, so can I use this framework for training? I think many such frameworks are semi-supervised.

A spelling mistake in layer.py

from torch-geomteric.nn
but in this repo is "from torch_geomteric.nn"
a small mistake :)

Segmentation fault While running main.py on Ubuntu

While running main.py on Ubuntu, I get a segmentation fault.

python3 main.py --epochs 100

+-------------------+----------------------------------------------------------+
| Parameter | Value |
+===================+==========================================================+
| Cluster number | 10 |
+-------------------+----------------------------------------------------------+
| Clustering method | metis |
+-------------------+----------------------------------------------------------+
| Dropout | 0.500 |
+-------------------+----------------------------------------------------------+
| Edge path | /home/User/Desktop/ClusterGCN-master/input/edges.csv |
+-------------------+----------------------------------------------------------+
| Epochs | 100 |
+-------------------+----------------------------------------------------------+
| Features path | /home/User/Desktop/ClusterGCN- |
| | master/input/features.csv |
+-------------------+----------------------------------------------------------+
| Layers | [16, 16, 16] |
+-------------------+----------------------------------------------------------+
| Learning rate | 0.010 |
+-------------------+----------------------------------------------------------+
| Seed | 42 |
+-------------------+----------------------------------------------------------+
| Target path | /home/User/Desktop/ClusterGCN- |
| | master/input//target.csv |
+-------------------+----------------------------------------------------------+
| Test ratio | 0.900 |
+-------------------+----------------------------------------------------------+

Metis graph clustering started.

Segmentation fault

About the feature

Hi,

Thanks for your inspiring work!
I wonder what the values in features.csv represent.

Vanilla version vs Advanced version

First of all, thanks for the nice code!
This version is only the vanilla Cluster-GCN, right?
BTW, do you know what the default dataset is?

Metis hits a Segmentation fault when running _METIS_PartGraphKway

I'm using the default test input files.
I've attached a pdb screenshot taken during the run.
Environment:
Ubuntu 18.04
Anaconda (Python 3.7.3),
torch-geometric==1.3.0
torch-scatter==1.3.0
torch-sparse==0.4.0
torch-spline-conv==1.1.0
metis==0.2a.4

PDB Error

Requirements.txt

support for pytorch1.0

I cannot run the code even when I use PyTorch 0.4.1.

issues about the metis algorithm

(st, parts) = metis.part_graph(self.graph, self.args.cluster_number)
Thanks for your awesome code. Could you please tell me how metis performs the graph partition?
I ask because the self.graph here doesn't include any information about edge weights or feature attributes.
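As general background, METIS partitions on graph structure alone: with no weights supplied, every edge and node counts equally, and the goal is roughly balanced parts with as few edges as possible crossing between them. In the official example's `(edgecuts, parts)` return value, the first element is that crossing count. The sketch below does not reimplement METIS; it only computes the two quantities for a given assignment, using hypothetical helpers `edge_cut` and `part_sizes`.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def edge_cut(edges: Sequence[Tuple[int, int]], parts: Sequence[int]) -> int:
    """Count edges whose endpoints fall in different parts."""
    return sum(1 for u, v in edges if parts[u] != parts[v])


def part_sizes(parts: Sequence[int]) -> Dict[int, int]:
    """Number of nodes assigned to each part."""
    return dict(Counter(parts))


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = [0, 0, 0, 1, 1, 1]  # one part per triangle
alternating = [0, 1, 0, 1, 0, 1]  # equally balanced, but ignores structure

print(edge_cut(edges, by_triangle), part_sizes(by_triangle))
print(edge_cut(edges, alternating), part_sizes(alternating))
```

Both assignments are perfectly balanced, but the first cuts far fewer edges, which is the kind of partition a structure-based partitioner is looking for. Node features play no role in this objective.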
