fishmoon1234 / dag-gnn
License: Apache License 2.0
Hi,
When I run the code, I do not understand the meaning of `match = re.search('/([\w]+)_s([\w]+)_v([\w]+).txt', file)` and `all_data[samplesN][version] = data` here. Can I directly input my data like this: `file_pattern = data_dir + "sEMG.txt"`? Looking forward to your answer. Thanks in advance.
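(For context, a minimal sketch of what that regex appears to do, assuming data files follow a `<name>_s<samples>_v<version>.txt` naming convention; the filename below is hypothetical:)

```python
import re

# Hypothetical filename matching the loader's expected convention:
# <name>_s<num_samples>_v<version>.txt
file = "data/alarm_s1000_v1.txt"
match = re.search(r'/([\w]+)_s([\w]+)_v([\w]+).txt', file)
if match:
    name, samplesN, version = match.groups()
    # samplesN and version are then usable as dict keys, e.g.
    # all_data[samplesN][version] = data
    print(name, samplesN, version)  # alarm 1000 1
```

So the pattern extracts the sample count and version from each filename; a single fixed file such as `sEMG.txt` would not match it, which suggests the surrounding loading loop would need to be bypassed, not just the pattern swapped.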
Thank you for making this implementation available! Could you please add a license to allow reuse/modification of the code (https://choosealicense.com/)?
Without a license, the default copyright laws apply, meaning that you retain all rights to your source code and no one may reproduce, distribute, or create derivative works from it. (https://help.github.com/en/articles/licensing-a-repository)
Thanks in advance,
Alex
Without additional constraints, the learned adjacency matrix may contain some negative values. What do the negative values mean?
Hi,
I am confused about the construction of the encoder and decoder. It seems that there is no reparameterization step in your code; in other words, you directly use M_x/M_z as x/z. Is there any problem with that?
Looking forward to your reply.
Hi,
I'm trying to apply it for a dataset with discrete variables.
I noticed that there is an MLPDEncoder in modules. However, in the paper, you said that we could still use Equation (6) as the encoder. Is there any particular reason that you added a softmax in that encoder?
I'd really appreciate it if you could provide sample code for dealing with discrete variables.
Thank you very much.
To add: in MLPDiscreteDecoder, it seems like you assume all the discrete variables have the same cardinality, which does not make sense in practice. How do you deal with that? And what should be done with mixed continuous and discrete variables?
I am a little confused: why is this condition used to decide when to save the weights?
torch.save(encoder.state_dict(), encoder_file)
Thank you!
I wonder how I can use dag-gnn as a package in my project.
Thanks for sharing. Does "--data_filename" mean that you used real datasets to train the model? If so, could you please share those datasets with us?
I'm trying to refactor this code to follow the scikit-learn API for some experimentation on my own data, but I'm encountering shape issues.
I expected the required shape of X to be (n, d) based on this, but the output of the simulate_data function is 3-dimensional, i.e. (n, d, x_dims); see here.
So I have the following questions: what is the difference between x_dims and data_variable_size, and which one is the number of variables? Much thanks in advance!
Andrew
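(A small sketch of the shape question, assuming x_dims = 1 for continuous data as the repo's defaults suggest; the sizes here are made up:)

```python
import numpy as np

n, d = 100, 5  # n samples, d variables (data_variable_size)
X = np.random.default_rng(0).normal(size=(n, d))
# The simulate code emits (n, d, x_dims); for continuous data x_dims is
# typically 1, so a flat (n, d) matrix can be expanded with a new axis:
X3 = X[:, :, np.newaxis]
print(X3.shape)  # (100, 5, 1)
```

Under that assumption, data_variable_size is the number of variables d (the graph nodes), while x_dims is the per-variable feature dimension.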
Is DAG-GNN used for scene text detection?
logits = torch.matmul(adj_Aforz, x+self.Wa) - self.Wa
What is Wa? It seems that Wa doesn't appear in the paper.
Hi there, in train.py line 327, it seems like 'logits' is directly used as the latent variable z. Based on the loss function KL_gaussian_sem, it seems like that would be the log of the mean of z.
The original VAE formulation requires z to be sampled from its posterior, which in turn requires the reparameterisation trick to correctly approximate the expectation with a Monte Carlo estimate. Skipping this step makes the resulting implementation deviate quite far from the VAE framework.
I am also not sure why KL_gaussian_sem is used instead of KL_gaussian; this and the above choices only make sense if the model implemented in train.py is not meant to take the form of a VAE. Could you please confirm this?
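(For reference, the reparameterisation step being discussed is the standard VAE one; this is a generic NumPy sketch, not code from this repo:)

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # z = mu + sigma * eps with eps ~ N(0, I): z stays a sample from the
    # posterior while gradients can flow through mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

mu = np.zeros((4, 3))
log_var = np.zeros((4, 3))  # unit variance
z = reparameterize(mu, log_var)
print(z.shape)  # (4, 3)
```

Using the mean directly instead of a sample (as the question describes) turns the model into a deterministic autoencoder with a regulariser rather than a VAE in the strict sense.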
I am trying to run DAG-GNN and am new to this field; could you please help me with the data/ directory?
Hi Dear Fishmoon, good day!
Recently, I have been reading your paper "DAG-GNN: DAG Structure Learning with Graph Neural Networks" along with the code, and I found one issue that I am confused about. Please check the attached screenshot.
The code is in DAG-GNN-master/src/utils.py (lines 425-436).
I want to know why feat_train, feat_valid, and feat_test have the same values, and how you designed train_data, valid_data, and test_data.
Could you give me some explanation of these lines of code?
Thank you in advance
This is my issue:
ImportError: cannot import name 'complete_to_chordal_graph' from partially initialized module 'networkx.algorithms' (most likely due to a circular import) (D:\Program_Files\Anaconda3\lib\site-packages\networkx\algorithms\__init__.py)
I hope you can help me, thank you.
Hi @fishmoon1234 . Thanks for your code.
I would like to confirm with you about the KL loss defined in line 340 of train.py.
I am confused about why you compute the KL loss with KL_gaussian_sem when MLPs are used as the encoder and decoder, since MLP and SEM are two separate options for the encoder and decoder.
Further, it seems that the implementation of KL_gaussian_sem here doesn't align with Equation (9): it simply uses the product mu * mu of the vectors.
Looking forward to your suggestions.
Thanks a lot!
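(One possible reading, stated as an assumption: if the encoder output is treated as the posterior mean with the variance fixed at 1, then KL(N(mu, I) || N(0, I)) collapses to 0.5 * sum(mu^2), which would explain the mu * mu term. A NumPy check:)

```python
import numpy as np

def kl_full(mu, log_var):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over dimensions
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

mu = np.array([0.5, -1.0, 2.0])
# With log_var = 0 (unit variance) the general formula reduces to
# 0.5 * sum(mu * mu):
assert np.isclose(kl_full(mu, np.zeros_like(mu)), 0.5 * np.sum(mu * mu))
```

Whether the fixed unit variance is intended is exactly the question for the author.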
Dear @fishmoon1234 , thanks for your code.
I tested the code with "--encoder" configured as "sem", and it seems that the number of values returned by the encoder doesn't match the number of variables unpacked in line 324 of train.py.
Would you please help to fix this issue?
Thanks a lot!
I am looking forward to your reply.
Hi
I'm confused about line 41 in modules.py:
if torch.sum(self.adj_A != self.adj_A): print('nan error \n')
Why add this line? Under what circumstances will the 'nan error' occur?
Looking forward to your reply
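(For context: the check exploits the IEEE 754 rule that NaN is the only value not equal to itself, so summing `t != t` is nonzero exactly when t contains a NaN, e.g. after a diverged training step. A NumPy equivalent:)

```python
import numpy as np

t = np.array([1.0, np.nan, 3.0])
# NaN != NaN under IEEE 754, so (t != t) flags exactly the NaN entries.
print(int(np.sum(t != t)))  # 1
```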
Hi @fishmoon1234,
I would like to ask what the purpose of the following snippet of code is:
```python
print("Optimization Finished!")
print("Best Epoch: {:04d}".format(best_epoch))
if ELBO_loss > 2 * best_ELBO_loss:
    break

# update parameters
A_new = origin_A.data.clone()
h_A_new = _h_A(A_new, args.data_variable_size)
if h_A_new.item() > 0.25 * h_A_old:
    c_A *= 10
else:
    break

# update parameters
# h_A, adj_A are computed in loss anyway, so no need to store
h_A_old = h_A_new.item()
lambda_A += c_A * h_A_new.item()
if h_A_new.item() <= h_tol:
    break
```
Currently, I understand everything that is going on except for the first if statement after the two prints; I have no idea why it is there. If you could help me understand, I would very much appreciate it. Thank you!
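(A sketch of how that outer loop reads, assuming it is the standard augmented-Lagrangian update for the acyclicity constraint h(A) = 0 used in NOTEARS-style training; `outer_step` is a hypothetical condensation, not a function from the repo:)

```python
def outer_step(h_new, h_old, c_A, lambda_A, h_tol=1e-8):
    # If the constraint value h(A) has not shrunk to <= 25% of its
    # previous value, raise the quadratic penalty weight; otherwise
    # do a dual-ascent update on the Lagrange multiplier.
    if h_new > 0.25 * h_old:
        c_A *= 10
        return c_A, lambda_A, False       # keep iterating, larger penalty
    lambda_A += c_A * h_new               # dual update
    return c_A, lambda_A, h_new <= h_tol  # stop once h(A) is ~0

print(outer_step(0.5, 1.0, 1.0, 0.0))  # (10.0, 0.0, False)
print(outer_step(0.1, 1.0, 1.0, 0.0))  # (1.0, 0.1, False)
```

Under that reading, the `ELBO_loss > 2 * best_ELBO_loss` check looks like a separate early-exit guard: if raising c_A has blown the ELBO up to more than twice its best value, the loop gives up rather than keep tightening the penalty. That is an interpretation of the code, not a confirmed design note.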
Hello, I am confused about how the X data is generated. From my understanding it should satisfy X = A_transpose * X, where A is the ground-truth graph G randomly generated in simulate_random_dag. However, in simulate_sem, the variable "eta" seems to be 0, since X is initialized as np.zeros and eta is X.dot(W), where W is G as a NumPy array. If I don't add any noise, X is just zero. I'm wondering if there is a way to get X without noise?
Sorry, I'm not very familiar with this.
Much appreciated!!
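(On the noise question: in a linear SEM X = A^T X + Z the data solves X = (I - A^T)^{-1} Z, so with Z = 0 the only solution is X = 0; the noise is what generates the data, not an optional addition. A tiny check with a hypothetical 2-node DAG:)

```python
import numpy as np

# Linear SEM: each sample x satisfies x = A^T x + z, i.e.
# x = (I - A^T)^{-1} z. With zero noise z = 0 the only solution is
# x = 0, which is why simulate_sem yields all-zero data without noise.
A = np.array([[0.0, 1.5],  # hypothetical weighted DAG: node 0 -> node 1
              [0.0, 0.0]])
d = A.shape[0]
z = np.zeros(d)
x = np.linalg.solve(np.eye(d) - A.T, z)
print(x)  # [0. 0.]
```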
Here is line 52 in modules.py:
logits = torch.matmul(adj_Aforz, x + self.Wa) - self.Wa
Is it the same as the following line?
logits = torch.matmul(adj_Aforz, x)
I've tested both, and the results are the same. So I wonder whether there is some trick that makes the parameter Wa necessary.
Wish you could help me, thank you.
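(A possible explanation, working from adj_Aforz = I - A^T: expanding gives (I - A^T)(x + Wa) - Wa = (I - A^T)x - A^T·Wa, so the two lines coincide exactly when A^T·Wa = 0, e.g. when A is (near) zero at initialisation, which could explain identical test results. A NumPy check with made-up values:)

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3
A = rng.normal(size=(d, d)) * 0.5  # hypothetical weighted adjacency
Wa = rng.normal(size=(d, 1))
x = rng.normal(size=(d, 1))

adj_Aforz = np.eye(d) - A.T
with_Wa = adj_Aforz @ (x + Wa) - Wa
without = adj_Aforz @ x
# The gap is exactly -A^T Wa, so the two agree only when A^T Wa = 0.
assert np.allclose(with_Wa - without, -A.T @ Wa)
```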
Hi,
I would like to train DAG-GNN on my own synthetic data. Is there a Python command to do so? In the README you only seem to provide the commands to re-run your own synthetic datasets.
Cheers
Osman
First, in line 47 (MLPEncoder) in train.py, as mentioned in the paper, we want adj_Aforz = I - A^T:
adj_Aforz = preprocess_adj_new(adj_A1)
But in preprocess_adj_new etc., I cannot find any transpose in the following code:

```python
def preprocess_adj(adj):
    adj_normalized = torch.eye(adj.shape[0]).double() + adj.transpose(0, 1)
    return adj_normalized

def preprocess_adj_new(adj):
    adj_normalized = torch.eye(adj.shape[0]).double() - adj.transpose(0, 1)
    return adj_normalized

def preprocess_adj_new1(adj):
    adj_normalized = torch.inverse(torch.eye(adj.shape[0]).double() - adj.transpose(0, 1))
    return adj_normalized
```

It seems that transpose(0,1) keeps the original form of the matrix.
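(A quick check on that last claim: transpose(0,1) does swap the two dimensions, i.e. it is a real matrix transpose; it only looks like a no-op on symmetric or zero-initialised matrices. A NumPy sketch using swapaxes, which mirrors torch's transpose(0, 1):)

```python
import numpy as np

# np.swapaxes(A, 0, 1) mirrors torch's A.transpose(0, 1): it swaps the
# two dimensions, i.e. it IS the matrix transpose, not a no-op.
A = np.array([[0.0, 2.0],
              [0.0, 0.0]])
At = np.swapaxes(A, 0, 1)
assert not np.array_equal(A, At)  # differs for non-symmetric A
assert np.array_equal(At, A.T)
```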