
tgn's Introduction

TGN: Temporal Graph Networks [arXiv, YouTube, Blog Post]

(Figure: Dynamic Graph TGN.)

Introduction

Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time).

In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while at the same time being more computationally efficient.

We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

Running the experiments

Requirements

Dependencies (with python >= 3.7):

pandas==1.1.0
torch==1.6.0
scikit_learn==0.23.1
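
These can be installed with pip, for example:

pip install pandas==1.1.0 torch==1.6.0 scikit_learn==0.23.1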

Dataset and Preprocessing

Download the public data

Download the sample datasets (e.g. wikipedia and reddit) from here and store their CSV files in a folder named data/.

Preprocess the data

We use the dense npy format to save the features in binary form. If edge features or node features are absent, they are replaced by a vector of zeros.

python utils/preprocess_data.py --data wikipedia --bipartite
python utils/preprocess_data.py --data reddit --bipartite
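
After preprocessing, the outputs can be sanity-checked with numpy/pandas. A minimal sketch, assuming the script writes ml_<data>.csv (interactions), ml_<data>.npy (edge features) and ml_<data>_node.npy (node features) into data/ (check utils/preprocess_data.py for the exact file names):

import numpy as np
import pandas as pd

graph_df = pd.read_csv('data/ml_wikipedia.csv')        # interaction list (sources, destinations, timestamps, labels, edge indices)
edge_features = np.load('data/ml_wikipedia.npy')       # one feature vector per edge
node_features = np.load('data/ml_wikipedia_node.npy')  # one feature vector per node (zeros if absent)

print(graph_df.shape, edge_features.shape, node_features.shape)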

Model Training

Self-supervised learning using the link prediction task:

# TGN-attn: Self-supervised learning on the wikipedia dataset
python train_self_supervised.py --use_memory --prefix tgn-attn --n_runs 10

# TGN-attn-reddit: Self-supervised learning on the reddit dataset
python train_self_supervised.py -d reddit --use_memory --prefix tgn-attn-reddit --n_runs 10

Supervised learning on dynamic node classification (this requires a trained model from the self-supervised task, e.g. obtained by running the commands above):

# TGN-attn: supervised learning on the wikipedia dataset
python train_supervised.py --use_memory --prefix tgn-attn --n_runs 10

# TGN-attn-reddit: supervised learning on the reddit dataset
python train_supervised.py -d reddit --use_memory --prefix tgn-attn-reddit --n_runs 10

Baselines

Wikipedia Self-supervised

# Jodie
python train_self_supervised.py --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn --n_runs 10

# DyRep
python train_self_supervised.py --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn --n_runs 10


Reddit Self-supervised

# Jodie
python train_self_supervised.py -d reddit --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn_reddit --n_runs 10

# DyRep
python train_self_supervised.py -d reddit --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn_reddit --n_runs 10


Wikipedia Supervised

# Jodie
python train_supervised.py --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn --n_runs 10

# DyRep
python train_supervised.py --use_memory --memory_updater rnn --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn --n_runs 10


Reddit Supervised

# Jodie
python train_supervised.py -d reddit --use_memory --memory_updater rnn --embedding_module time --prefix jodie_rnn_reddit --n_runs 10

# DyRep
python train_supervised.py -d reddit --use_memory --memory_updater rnn  --dyrep --use_destination_embedding_in_message --prefix dyrep_rnn_reddit --n_runs 10

Ablation Study

Commands to replicate all results in the ablation study over different modules:

# TGN-2l
python train_self_supervised.py --use_memory --n_layer 2 --prefix tgn-2l --n_runs 10 

# TGN-no-mem
python train_self_supervised.py --prefix tgn-no-mem --n_runs 10 

# TGN-time
python train_self_supervised.py --use_memory --embedding_module time --prefix tgn-time --n_runs 10 

# TGN-id
python train_self_supervised.py --use_memory --embedding_module identity --prefix tgn-id --n_runs 10

# TGN-sum
python train_self_supervised.py --use_memory --embedding_module graph_sum --prefix tgn-sum --n_runs 10

# TGN-mean
python train_self_supervised.py --use_memory --aggregator mean --prefix tgn-mean --n_runs 10

General flags

optional arguments:
  -d DATA, --data DATA         Data sources to use (wikipedia or reddit)
  --bs BS                      Batch size
  --prefix PREFIX              Prefix to name checkpoints and results
  --n_degree N_DEGREE          Number of neighbors to sample at each layer
  --n_head N_HEAD              Number of heads used in the attention layer
  --n_epoch N_EPOCH            Number of epochs
  --n_layer N_LAYER            Number of graph attention layers
  --lr LR                      Learning rate
  --patience                   Patience of the early stopping strategy
  --n_runs                     Number of runs (compute mean and std of results)
  --drop_out DROP_OUT          Dropout probability
  --gpu GPU                    Idx for the gpu to use
  --node_dim NODE_DIM          Dimensions of the node embedding
  --time_dim TIME_DIM          Dimensions of the time embedding
  --use_memory                 Whether to use a memory for the nodes
  --embedding_module           Type of the embedding module
  --message_function           Type of the message function
  --memory_updater             Type of the memory updater
  --aggregator                 Type of the message aggregator
  --memory_update_at_the_end   Whether to update the memory at the end or at the start of the batch
  --message_dim                Dimension of the messages
  --memory_dim                 Dimension of the memory
  --backprop_every             Number of batches to process before performing backpropagation
  --different_new_nodes        Whether to use different unseen nodes for validation and testing
  --uniform                    Whether to sample the temporal neighbors uniformly (or instead take the most recent ones)
  --randomize_features         Whether to randomize node features
  --dyrep                      Whether to run the model as DyRep
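
For example, a single run that sets several of these flags explicitly might look like this (the module names are the ones used in the baseline and ablation commands above; the numeric values are purely illustrative):

python train_self_supervised.py -d wikipedia --use_memory --memory_updater rnn \
  --embedding_module graph_sum --aggregator mean --n_layer 1 --n_degree 10 \
  --bs 200 --n_epoch 50 --lr 0.0001 --drop_out 0.1 --gpu 0 --prefix tgn-example --n_runs 1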

TODOs

  • Make the code memory efficient: for the sake of simplicity, the memory module of the TGN model is implemented as a parameter (so that it is stored and loaded together with the model). However, this does not need to be the case, and more efficient implementations which treat the memory as plain tensors (in the same way as the input features) would be more amenable to large graphs.
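
One possible direction (a sketch, not the repository's implementation): register the memory as a buffer instead of an nn.Parameter, so it is still saved and moved with the model but never appears in model.parameters(); treating it as a completely external plain tensor, as the TODO suggests, would be even lighter.

import torch
import torch.nn as nn

class NodeMemory(nn.Module):
  """Hypothetical memory store kept as a buffer rather than an nn.Parameter."""
  def __init__(self, n_nodes, memory_dim):
    super().__init__()
    # register_buffer keeps the tensor out of model.parameters() while still
    # saving it in state_dict() and moving it with .to(device)
    self.register_buffer("memory", torch.zeros(n_nodes, memory_dim))

  def get(self, node_ids):
    return self.memory[node_ids]

  def set(self, node_ids, values):
    # in-place update, detached from the autograd graph
    with torch.no_grad():
      self.memory[node_ids] = values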

Cite us

@inproceedings{tgn_icml_grl2020,
    title={Temporal Graph Networks for Deep Learning on Dynamic Graphs},
    author={Emanuele Rossi and Ben Chamberlain and Fabrizio Frasca and Davide Eynard and Federico Monti and Michael Bronstein},
    booktitle={ICML 2020 Workshop on Graph Representation Learning},
    year={2020}
}

tgn's People

Contributors

darabos, emalgorithm, goingmyway, yule-buaa


tgn's Issues

How to make predictions with TGN?

Hi @emalgorithm ,

Thanks for the interesting work.

For each user in the training set, I want to predict the top K training items they are most likely to interact with at a future time T.

Here is what I did, please let me know if I got something wrong:

  1. Trained TGN with memory using tgn/train_self_supervised.py.
  2. Used the trained TGN to compute embeddings for all training users and all training items with tgn.embedding_module.compute_embedding, passing the memory and the time difference between each node's last update and T.
  3. For each user, used tgn.affinity_score to compute the affinity with all items and selected the top K.

Does it seem correct? I can share the code, it is short.
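
Roughly, the relevant part of my code looks like this (user_ids, item_ids, future_time and K are my own variables; I am assuming tgn.affinity_score scores (source, destination) embedding pairs the same way compute_edge_probabilities does):

import numpy as np
import torch

all_nodes = np.arange(tgn.n_nodes)
memory = tgn.memory.get_memory(list(range(tgn.n_nodes)))

# Step 2: embeddings of every node at the future time T
embeddings = tgn.embedding_module.compute_embedding(
  memory=memory,
  source_nodes=all_nodes,
  timestamps=np.full(len(all_nodes), future_time, dtype=np.float64),
  n_layers=1,  # match the --n_layer value used at training time
  n_neighbors=20)

user_emb = embeddings[user_ids]   # embeddings of the training users
item_emb = embeddings[item_ids]   # embeddings of the training items
n_users, n_items = len(user_ids), len(item_ids)

# Step 3: score every (user, item) pair and keep the top K items per user
scores = tgn.affinity_score(
  user_emb.repeat_interleave(n_items, dim=0),  # each user repeated once per item
  item_emb.repeat(n_users, 1))                 # the item block tiled for every user
scores = scores.sigmoid().view(n_users, n_items)
topk_scores, topk_idx = scores.topk(K, dim=1)  # topk_idx indexes into item_ids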

Thank you
Marc

Why are unseen nodes included in the test set for the transductive link prediction task?

Hi,

After examining the code, I observed that the test set for transductive link prediction includes both observed nodes and unseen nodes. But I think unseen nodes should be excluded from the test set in the transductive setting (please correct me if I'm wrong).

Could you please tell me why the unseen nodes are included in the test set for the transductive link prediction task? Thank you very much!

How do you ensure that a randomly selected negative sample does not happen to be a destination?

Thank you for this study and the scripts. I think it is a great contribution to the study of dynamic graphs.

But when reading the code, I found a problem that I thought about for a long time but couldn't resolve, and I suspect it may affect the validity of the model evaluation. During the generation of negatives_batch, I noticed that the RandEdgeSampler randomly selects a batch of destination nodes as negative samples. But what if some of the selected samples happen to have an edge with the source node?

Perhaps when there are enough nodes this is not a problem in practice, but I'm still curious why you don't avoid it by excluding such nodes during random selection.
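
Just to illustrate what I mean by excluding them during random selection, something like this (a sketch, not the repo's RandEdgeSampler):

import numpy as np

def sample_non_edges(src_batch, dst_pool, observed_edges, seed=0):
  """Sample one negative destination per source, resampling any pair that
  already exists. observed_edges is a set of (src, dst) tuples."""
  rng = np.random.default_rng(seed)
  negatives = rng.choice(dst_pool, size=len(src_batch))
  for i, s in enumerate(src_batch):
    while (s, negatives[i]) in observed_edges:
      negatives[i] = rng.choice(dst_pool)
  return negatives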

Thanks

Problem about update_memory

Hi @emalgorithm, I ran into some questions while reading your code.

When memory_update_at_start=True, msg_agg and msg_func are computed twice: once before compute_embedding and once after it. Before compute_embedding, the get_updated_memory function computes the memory of all nodes. After compute_embedding, the update_memory function updates the memory of the positive nodes.

if self.memory_update_at_start:
  # Persist the updates to the memory only for sources and destinations (since now we have
  # new messages for them)
  self.update_memory(positives, self.memory.messages)

  assert torch.allclose(memory[positives], self.memory.get_memory(positives), atol=1e-5), \
    "Something wrong in how the memory was updated"

  # Remove messages for the positives since we have already updated the memory using them
  self.memory.clear_messages(positives)

unique_sources, source_id_to_messages = self.get_raw_messages(source_nodes,
                                                              source_node_embedding,
                                                              destination_nodes,
                                                              destination_node_embedding,
                                                              edge_times, edge_idxs)
unique_destinations, destination_id_to_messages = self.get_raw_messages(destination_nodes,
                                                                        destination_node_embedding,
                                                                        source_nodes,
                                                                        source_node_embedding,
                                                                        edge_times, edge_idxs)
if self.memory_update_at_start:
  self.memory.store_raw_messages(unique_sources, source_id_to_messages)
  self.memory.store_raw_messages(unique_destinations, destination_id_to_messages)
else:
  self.update_memory(unique_sources, source_id_to_messages)
  self.update_memory(unique_destinations, destination_id_to_messages)

The code comment here says "Persist the updates to the memory only for sources and destinations (since now we have new messages for them)", but the messages for this batch are actually stored after the memory update, so update_memory uses the messages from the previous batch. In other words, update_memory(positives, self.memory.messages) updates the positive nodes of this batch using messages from the previous batch. I don't understand why the code does this; maybe it's a bug?

I think the code here needs to update all nodes' memory (or record the previous batch's positive nodes), or update the memory directly inside the get_updated_memory function (i.e., replace it with update_memory).

Sensitivity to different graph sizes

Dear authors,

thank you for open-sourcing your code :) I am planning to test your approach on temporal graphs that differ wildly in terms of graph size, i.e., number of nodes and edges. Do you generally think that TGNs are very sensitive to varying graph sizes (especially at initialization), or did that not come up when developing your model?

Best regards

Memory inefficient?

Hi, I was testing this model for link prediction on my own dataset (PubMed co-author). When I augment the model with node memory, it cannot even run with a batch size of 5, whereas without the memory the whole model works. Based on what I've read in your paper, the memory is where the performance really makes a difference. Sorry, I'm new to graph neural networks, but am I missing something, or is the model just extremely memory-inefficient?

AssertionError on python train_self_supervised.py --use_memory --prefix tgn-attn --n_runs 10

Following the procedure described in the README, for the step

python train_self_supervised.py --use_memory --prefix tgn-attn --n_runs 10

I get the following AssertionError

Traceback (most recent call last):
  File "train_self_supervised.py", line 221, in <module>
    timestamps_batch, edge_idxs_batch, NUM_NEIGHBORS)
  File "/home/i0325777/Loci/Infectology/vre-tgn/model/tgn.py", line 211, in compute_edge_probabilities
    source_nodes, destination_nodes, negative_nodes, edge_times, edge_idxs, n_neighbors)
  File "/home/i0325777/Loci/Infectology/vre-tgn/model/tgn.py", line 166, in compute_temporal_embeddings
    "Something wrong in how the memory was updated"
AssertionError: Something wrong in how the memory was updated

I assume this is not expected behavior and it also does not happen every time I run it. Anybody else experiencing this?

About twitter datasets

Hello dear authors, could you tell me where I can get the Twitter dataset? Thanks!

Errors while running train_self_supervised.py

When running train_self_supervised.py on the Reddit dataset, I always get the following errors:

Traceback (most recent call last):
  File "train_self_supervised.py", line 221, in <module>
    timestamps_batch, edge_idxs_batch, NUM_NEIGHBORS)
  File "/home/scg/tgn-master/model/tgn.py", line 211, in compute_edge_probabilities
    source_nodes, destination_nodes, negative_nodes, edge_times, edge_idxs, n_neighbors)
  File "/home/scg/tgn-master/model/tgn.py", line 166, in compute_temporal_embeddings
    "Something wrong in how the memory was updated"
AssertionError: Something wrong in how the memory was updated

Also, can I just comment out the line assert torch.allclose(memory[positives], self.memory.get_memory(positives), atol=1e-5), "Something wrong in how the memory was updated" in tgn.py? Thank you very much.

[Code Missing] Missing benchmarking code for the results published in your TGN paper

Hi Twitter research team of TGN,

Great work on your paper! I'm particularly interested in the model, and I am trying to reproduce your results to compare TGN with the other models you mention in the paper, such as GAE, TGAT, and JODIE.

Could you please provide the code you used to produce the results in the following table?

(Screenshot: results table from the TGN paper.)

Errors while loading saved state into a new TGN instance

I successfully trained a TGN model using my own dataset. However, when I try to load a saved model into a new TGN instance, I get the following errors:

Exception has occurred: RuntimeError
Error(s) in loading state_dict for TGN:
Missing key(s) in state_dict: "message_function.mlp.0.weight", "message_function.mlp.0.bias", "message_function.mlp.2.weight", "message_function.mlp.2.bias", "message_function.layers.0.weight", "message_function.layers.0.bias", "message_function.layers.2.weight", "message_function.layers.2.bias".
size mismatch for memory_updater.memory_updater.weight_ih: copying a param with shape torch.Size([516, 517]) from checkpoint, the shape in current model is torch.Size([516, 100]).
File "C:\Users\fwaris\source\repos\tgn-master\tgnexport.py", line 58, in
tgn.load_state_dict(model_state)

It looks like the MLP weights are missing and there is a shape mismatch for 'weight_ih'.

Node Classification

Does this repo include the code for node classification? If not, is there any plan to share the code?

Why is the node feature matrix one row larger than the largest node index?

In the preprocess code you currently only use a zero matrix, so it does not matter in your experiments.

rand_feat = np.zeros((max_idx + 1, 172))

But why is the feature matrix one row larger than the highest index of the nodes? To me that looks like it is one row larger than the number of nodes.

max_idx = max(new_df.u.max(), new_df.i.max())
rand_feat = np.zeros((max_idx + 1, 172))

In our own dataset we have actual node features, and we are trying to estimate the benefit of using a TGN to classify the nodes versus a neural network that uses only the node features. So I am wondering whether I have to add a row of all-zero features above or below my own features to complete the matrix correctly.
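
In case it matters for the answer, this is what I am currently planning to do: since node ids appear to start at 1 (id 0 seems to be used as the padding neighbor, see mask = neighbors_torch == 0 in compute_embedding), I would prepend a row of zeros so that row indices line up with node ids. A sketch (the output file name is just an example):

import numpy as np

# own_features: (n_real_nodes, feat_dim), where row k holds the features of node id k + 1
node_features = np.vstack([np.zeros((1, own_features.shape[1])), own_features])
np.save('data/ml_mydata_node.npy', node_features)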

Twitter dataset

Hi,
I'm wondering how to reproduce the future edge prediction results from your work for the Twitter dataset from the 2020 RecSys Challenge.
If I want to create a dataset similar to what you've described in the paper, should I format the data with the same columns as Reddit and Wikipedia, and then run the preprocessing script with bipartite=False?
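
Concretely, I was planning to build the CSV like this, assuming the same JODIE-style layout as Wikipedia/Reddit (user, item, timestamp, state label, then the edge feature vector); user_ids, item_ids, timestamps and edge_feature_matrix are my own arrays:

import pandas as pd

interactions = pd.DataFrame({
  'user_id': user_ids,
  'item_id': item_ids,
  'timestamp': timestamps,
  'state_label': [0] * len(user_ids)})
edge_feats = pd.DataFrame(edge_feature_matrix)  # one row of edge features per interaction
pd.concat([interactions, edge_feats], axis=1).to_csv('data/twitter.csv', index=False)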
Are there any other tips you can give me on necessary changes for training the model?
Thanks a lot.

[Question] Why isn't memory_updater updated by backpropagation when the --memory_update_at_end flag is not set?

From the code, I think that when I don't set --memory_update_at_end, the memory_updater's parameters should be updated by backpropagation, but when I actually test it, they don't get updated.

Here is my test code in train_self_supervised.py:

        parameter_dict_before_backward = {k: v.detach().clone() for k, v in tgn.memory_updater.named_parameters()}
        loss.backward()
        optimizer.step()
        parameter_dict_after_backward = {k: v.detach().clone() for k, v in tgn.memory_updater.named_parameters()}
        # compare each parameter before and after the optimizer step
        for (k1, v1), (k2, v2) in zip(parameter_dict_after_backward.items(), parameter_dict_before_backward.items()):
            if not torch.allclose(v1, v2, 1e-05):
                print(f"{k1} is updated, before {v2}, after {v1}")
                break
            print(f"{k1} all close")

Is there something I am doing wrong?

Potential problem with embeddings computing?

Hi,

let's say we are using GraphSumEmbedding as a layer in TGN.

As I understood from your paper, in the propagation part we are supposed to concatenate the previous-layer embeddings of a node's neighbors with the previous-layer embedding of the node itself.

So, for example, if we want to calculate the layer-2 embedding of a node with id "a" whose neighbors are nodes "b", "c", and "d", we need the layer-1 embeddings of neighbors "b", "c", and "d", and we also need the layer-1 embedding of node "a" itself.

But from the code it doesn't look that way to me (I could be wrong): you always pass source_node_features to the aggregate function, meaning you always use the layer-0 features of the source nodes (memory + raw features):

 source_embedding = self.aggregate(n_layers, source_node_features,
                                        source_nodes_time_embedding,
                                        neighbor_embeddings,
                                        edge_time_embeddings,
                                        edge_features,
                                        mask)

And then later, in GraphSumEmbedding.aggregate, you are doing

source_features = torch.cat([source_node_features,
                                 source_nodes_time_embedding.squeeze()], dim=1)

This is a problem because, when you calculate embeddings at layer 2, you use neighbor_embeddings from layer 1 but source_node_features from layer 0.

Why is that so? Because you set source_node_features at the beginning and never change the variable:

 source_node_features = self.node_features[source_nodes_torch, :]

    if self.use_memory:
      source_node_features = memory[source_nodes, :] + source_node_features

I think that part of calculating source_node_embeddings is completely missing.

Below is the source code of GraphEmbedding:

def compute_embedding(self, memory, source_nodes, timestamps, n_layers, n_neighbors=20, time_diffs=None,
                        use_time_proj=True):
    """Recursive implementation of curr_layers temporal graph attention layers.
    src_idx_l [batch_size]: users / items input ids.
    cut_time_l [batch_size]: scalar representing the instant of the time where we want to extract the user / item representation.
    curr_layers [scalar]: number of temporal convolutional layers to stack.
    num_neighbors [scalar]: number of temporal neighbor to consider in each convolutional layer.
    """

    assert (n_layers >= 0)

    source_nodes_torch = torch.from_numpy(source_nodes).long().to(self.device)
    timestamps_torch = torch.unsqueeze(torch.from_numpy(timestamps).float().to(self.device), dim=1)

    # query node always has the start time -> time span == 0
    source_nodes_time_embedding = self.time_encoder(torch.zeros_like(
      timestamps_torch))

    source_node_features = self.node_features[source_nodes_torch, :]

    if self.use_memory:
      source_node_features = memory[source_nodes, :] + source_node_features

    if n_layers == 0:
      return source_node_features
    else:

      neighbors, edge_idxs, edge_times = self.neighbor_finder.get_temporal_neighbor(
        source_nodes,
        timestamps,
        n_neighbors=n_neighbors)

      neighbors_torch = torch.from_numpy(neighbors).long().to(self.device)

      edge_idxs = torch.from_numpy(edge_idxs).long().to(self.device)

      edge_deltas = timestamps[:, np.newaxis] - edge_times

      edge_deltas_torch = torch.from_numpy(edge_deltas).float().to(self.device)

      neighbors = neighbors.flatten()
      neighbor_embeddings = self.compute_embedding(memory,
                                                   neighbors,
                                                   np.repeat(timestamps, n_neighbors),
                                                   n_layers=n_layers - 1,
                                                   n_neighbors=n_neighbors)

      effective_n_neighbors = n_neighbors if n_neighbors > 0 else 1
      neighbor_embeddings = neighbor_embeddings.view(len(source_nodes), effective_n_neighbors, -1)
      edge_time_embeddings = self.time_encoder(edge_deltas_torch)

      edge_features = self.edge_features[edge_idxs, :]

      mask = neighbors_torch == 0

      source_embedding = self.aggregate(n_layers, source_node_features,
                                        source_nodes_time_embedding,
                                        neighbor_embeddings,
                                        edge_time_embeddings,
                                        edge_features,
                                        mask)

      return source_embedding

And the code for GraphSumEmbedding:

 def aggregate(self, n_layer, source_node_features, source_nodes_time_embedding,
                neighbor_embeddings,
                edge_time_embeddings, edge_features, mask):
    neighbors_features = torch.cat([neighbor_embeddings, edge_time_embeddings, edge_features],
                                   dim=2)
    neighbor_embeddings = self.linear_1[n_layer - 1](neighbors_features)
    neighbors_sum = torch.nn.functional.relu(torch.sum(neighbor_embeddings, dim=1))

    source_features = torch.cat([source_node_features,
                                 source_nodes_time_embedding.squeeze()], dim=1)
    source_embedding = torch.cat([neighbors_sum, source_features], dim=1)
    source_embedding = self.linear_2[n_layer - 1](source_embedding)

    return source_embedding

Instructions contain multiple errors

The instructions and README contain multiple errors.

  1. The referenced 'reddit' and 'wikipedia' datasets don't exist. The temporal graph datasets related to these names found at the site are: 'soc-RedditHyperlinks' and 'wiki-talk-temporal'.
  2. The linked datasets are not csv files as described.
  3. The linked datasets do not have the label column required by the preprocessing script.
  4. The linked datasets have radically different node and edge counts than are listed in the paper.

Kumar et al.'s paper does not provide the datasets either. They state:
"Reddit post dataset: this public dataset consists of one month
of posts made by users on subreddits [2]. We selected the 1,000
most active subreddits as items and the 10,000 most active users.
This results in 672,447 interactions. We convert the text of each
post into a feature vector representing their LIWC categories [35]"
"Wikipedia edits: this public dataset is one month of edits made
by edits on Wikipedia pages [3]. We selected the 1,000 most edited
pages as items and editors who made at least 5 edits as users (a total
of 8,227 users). This generates 157,474 interactions. Similar to the
Reddit dataset, we convert the edit text into a LIWC-feature vector"

Please include or link to the exact datasets used in the paper for reproducibility.

Memory of some nodes is zero tensor after training

The ratio of zero tensors differs across datasets. I am confused about this because, according to the paper, the memory of each node should be updated during training. Is there a reasonable explanation for this?

Multi-GPU Training

Could you give some guidance on how to train the model on multiple GPUs? I have a very large dataset and I run out of memory (32 GB total).
Should I use PyTorch's DistributedDataParallel, or just assign different GPUs to different components of the model?

Thanks.

TGN identical to DyRep

The result for TGN is identical to DyRep on wikipedia and reddit. Could this be a bug?

Run script failed due to assertion

Hi team, when I tried to run the script on the Reddit dataset, it failed after 6 training epochs.

(Screenshot of the assertion error attached.)

I ran the script on an NVIDIA RTX 3080 Ti, which does not support CUDA 10.1, so I upgraded torch to version 2.1 with CUDA 12.1. The other requirements stated in README.md (pandas, scikit-learn, Python 3.7) are unchanged.

How could this happen?

If you need any more details about my environment, please comment and I will provide them.

paddlepaddle library

I have dependency issues while importing paddle.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
jupyter-server 1.24.0 requires anyio<4,>=3.1.0, but you have anyio 4.0.0 which is incompatible.
moviepy 1.0.3 requires decorator<5.0,>=4.0.2, but you have decorator 5.1.1 which is incompatible.
numba 0.56.4 requires numpy<1.24,>=1.18, but you have numpy 1.26.0 which is incompatible.
tensorflow 2.13.0 requires numpy<=1.24.3,>=1.22, but you have numpy 1.26.0 which is incompatible.
tensorflow-metadata 1.14.0 requires protobuf<4.21,>=3.20.3, but you have protobuf 4.24.3 which is incompatible.
Successfully installed Pillow-10.0.1 anyio-4.0.0 astor-0.8.1 certifi-2023.7.22 decorator-5.1.1 exceptiongroup-1.1.3 h11-0.14.0 httpcore-0.18.0 httpx-0.25.0 idna-3.4 numpy-1.26.0 opt-einsum-3.3.0 paddle-bfloat-0.1.7 paddlepaddle-2.5.0 protobuf-4.24.3 sniffio-1.3.0
WARNING: The following packages were previously imported in this runtime:
[PIL,certifi,decorator]
You must restart the runtime in order to use newly installed versions.
Please advise.

Time is not considered when predicting new interactions

I modified the test function to evaluate if changing time impacts the predictions, and it does not.

Here is the way I did it:

For a specific pair of src and dst, I created multiple interactions with different timestamps, and the predicted probabilities are all equal. Basically, changing the time does nothing.
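
Concretely, the probe looked roughly like this (the argument order is copied from the compute_edge_probabilities call in train_self_supervised.py, and I am assuming the two returned tensors are the positive and negative probabilities; src_id, dst_id, neg_id, e_idx and t0 are values taken from my dataset):

import numpy as np

times = np.array([t0, t0 + 10, t0 + 100, t0 + 1000, t0 + 10000], dtype=np.float64)
src = np.full(len(times), src_id)
dst = np.full(len(times), dst_id)
neg = np.full(len(times), neg_id)
edge_idxs = np.full(len(times), e_idx)

pos_prob, neg_prob = tgn.compute_edge_probabilities(src, dst, neg, times, edge_idxs, 20)
print(pos_prob.squeeze().tolist())  # the five probabilities come out identical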

I am now confused about the entire idea of the paper. What is the goal of TGN? Is the goal just to predict future destinations without time?

Why updating memory during testing phase ?

As the title suggests, I think test-set data shouldn't be used to update the model's memory. In your implementation, you update the model's memory with the positive triples' information:

# Persist the updates to the memory only for sources and destinations (since now we have
# new messages for them)
self.update_memory(positives, self.memory.messages)

But in a real-world setting, the test data is unlabeled, so we have no idea whether a triple is positive or not.

Updating memory fails for datasets that are not bipartite

Hi,

If I am not mistaken, there seems to be a bug when using the model on a unipartite dataset while updating the memory at the end of each batch (memory_update_at_start=False).

Running the model like this incorrectly triggers the AssertionError: Trying to update to time in the past of the memory_updater module. This is due to lines 185-186 in tgn.py.

def compute_temporal_embeddings(self, source_nodes, destination_nodes, negative_nodes, edge_times,
                                  edge_idxs, n_neighbors=20):
    ...
    if self.use_memory:
      if self.memory_update_at_start:
        # Update memory for all nodes with messages stored in previous batches
        memory, last_update = self.get_updated_memory(list(range(self.n_nodes)),
                                                      self.memory.messages)
      else:
        memory = self.memory.get_memory(list(range(self.n_nodes)))
        last_update = self.memory.last_update

      ...

    if self.use_memory:
      if self.memory_update_at_start:
        # Persist the updates to the memory only for sources and destinations (since now we have
        # new messages for them)
        self.update_memory(positives, self.memory.messages)

        assert torch.allclose(memory[positives], self.memory.get_memory(positives), atol=1e-5), \
          "Something wrong in how the memory was updated"

        # Remove messages for the positives since we have already updated the memory using them
        self.memory.clear_messages(positives)

      unique_sources, source_id_to_messages = self.get_raw_messages(source_nodes, source_node_embedding, destination_nodes, destination_node_embedding, edge_times, edge_idxs)
      unique_destinations, destination_id_to_messages = self.get_raw_messages(destination_nodes, destination_node_embedding, source_nodes, source_node_embedding, edge_times, edge_idxs)
      if self.memory_update_at_start:
        self.memory.store_raw_messages(unique_sources, source_id_to_messages)
        self.memory.store_raw_messages(unique_destinations, destination_id_to_messages)
      else:
        self.update_memory(unique_sources, source_id_to_messages)                  <-- 185
        self.update_memory(unique_destinations, destination_id_to_messages)        <-- 186

     ...

    return source_node_embedding, destination_node_embedding, negative_node_embedding

When source_nodes and destination_nodes contain non-overlapping node ids, this is not a problem. However, in a unipartite graph the same node id can appear in both source_nodes and destination_nodes, which causes the described issue if that node id is associated with a later timestamp on the source side than on the destination side.

This problem can be resolved by replacing:

      if self.memory_update_at_start:
        self.memory.store_raw_messages(unique_sources, source_id_to_messages)
        self.memory.store_raw_messages(unique_destinations, destination_id_to_messages)
      else:
        self.update_memory(unique_sources, source_id_to_messages)
        self.update_memory(unique_destinations, destination_id_to_messages)

with:

      self.memory.store_raw_messages(unique_sources, source_id_to_messages)
      self.memory.store_raw_messages(unique_destinations, destination_id_to_messages)

      if not self.memory_update_at_start:
        unique_node_ids = np.unique(np.concatenate((unique_sources, unique_destinations)))
        self.update_memory(unique_node_ids, self.memory.messages)
        self.memory.clear_messages(unique_node_ids)

Edit: I found an issue in the fix initially proposed and updated it to match the pull request.
