
ultra's People

Contributors

eltociear, migalkin, sanyam-2026


ultra's Issues

Cudatoolkit 11.8

Hi,
I am looking forward to running ULTRA. It looks great!

I am stuck installing the packages listed in the installation instructions. I ran the CUDA installation (I also tried with pip, and set the CUDA_HOME environment variable as you pointed out) and got an error about cudatoolkit:

LibMambaUnsatisfiableError: Encountered problems while solving:
  - nothing provides requested cudatoolkit 11.8**
  - nothing provides cuda 11.8.* needed by pytorch-cuda-11.8-h8dd9ede_2

Could not solve for environment specs
The following packages are incompatible
├─ cudatoolkit 11.8**  does not exist (perhaps a typo or a missing channel);
└─ pytorch-cuda 11.8**  is not installable because it requires
   └─ cuda 11.8.* , which does not exist (perhaps a missing channel).

Do you know whether another version of cudatoolkit works well with your stack? cudatoolkit 11.8 seems to be causing issues, at least for me. I would appreciate any help. Thanks!
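
Before retrying the install, it may help to confirm which CUDA toolchain the build step can actually see. The following is a minimal diagnostic sketch, assuming that CUDA_HOME is expected to point at a directory containing bin/nvcc (the usual toolkit layout). The helper name describe_cuda_env is hypothetical and not part of ULTRA.

```python
import os
import shutil


def describe_cuda_env(environ=None):
    """Summarize CUDA-related settings (hypothetical helper, not part of ULTRA)."""
    environ = os.environ if environ is None else environ
    cuda_home = environ.get("CUDA_HOME")

    # Assumption: a usable toolkit has the compiler at $CUDA_HOME/bin/nvcc.
    nvcc_under_home = None
    if cuda_home:
        nvcc_under_home = os.path.isfile(os.path.join(cuda_home, "bin", "nvcc"))

    return {
        "CUDA_HOME": cuda_home,
        "nvcc_under_CUDA_HOME": nvcc_under_home,
        # Full path of nvcc if it is on PATH, otherwise None.
        "nvcc_on_PATH": shutil.which("nvcc", path=environ.get("PATH")),
    }


if __name__ == "__main__":
    for key, value in describe_cuda_env().items():
        print(f"{key}: {value}")
```

If CUDA_HOME is unset, or nvcc is missing from both locations, the problem probably lies in the toolkit installation rather than in ULTRA itself.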
Jaime

ULTRA on HuggingFace Hub

Hi @clefourrier, you helped us some time ago to put LRGB on 🤗 Datasets, so you seem to be the only person from HF with any experience in Graph ML, although we know you have been working on LLM leaderboards recently :)

We tried to put ULTRA checkpoints on the Hub here https://huggingface.co/mgalkin with the basic write-up and code examples. Could you please have a look to check whether graph models are ok in this format on the Hub?

A few caveats at the moment:

  • Model repos on the Hub pack most of the code from this repo and have additional interfaces UltraConfig and UltraLinkPrediction
  • I can only wrap those into AutoConfig and AutoLinkPrediction locally. I presume that the Auto interfaces will recognize Ultra website-wide only if we implement it directly in Transformers?

Error in running the example

When I run the example you provided

python script/run.py -c config/transductive/inference.yaml --dataset CoDExSmall --epochs 0 --bpe null --gpus null --ckpt ckpts/ultra_4g.pth

I get the error below:
FileNotFoundError: [Errno 2] No such file or directory: 'ckpts/ultra_4g.pth'

The error goes away when I replace config/transductive/inference.yaml with config/transductive/pretrain_4g.yaml, but then I get a new error:

IndexError: range object index out of range
Could you please resolve this issue?

No such file or directory: 'ckpts/ultra_4g.pth'

Error

When I run script/run.py in any setting, I get the following error on Colab.

/content/ULTRA
18:57:37   Random seed: 1024
18:57:37   Config file: config/inductive/inference.yaml
18:57:37   {'checkpoint': 'ckpts/ultra_4g.pth',
 'dataset': {'class': 'FB15k237Inductive',
             'root': '~/git/ULTRA/kg-datasets/',
             'version': 'v1'},
 'model': {'class': 'Ultra',
           'entity_model': {'aggregate_func': 'sum',
                            'class': 'IndNBFNet',
                            'hidden_dims': [64, 64, 64, 64, 64, 64],
                            'input_dim': 64,
                            'layer_norm': True,
                            'message_func': 'distmult',
                            'short_cut': True},
           'relation_model': {'aggregate_func': 'sum',
                              'class': 'NBFNet',
                              'hidden_dims': [64, 64, 64, 64, 64, 64],
                              'input_dim': 64,
                              'layer_norm': True,
                              'message_func': 'distmult',
                              'short_cut': True}},
 'optimizer': {'class': 'AdamW', 'lr': 0.0005},
 'output_dir': '~/git/ULTRA/output',
 'task': {'adversarial_temperature': 1,
          'metric': ['mr', 'mrr', 'hits@1', 'hits@3', 'hits@10', 'hits@10_50'],
          'name': 'InductiveInference',
          'num_negative': 256,
          'strict_negative': True},
 'train': {'batch_per_epoch': None,
           'batch_size': 16,
           'gpus': [0],
           'log_interval': 100,
           'num_epoch': 0}}
18:57:37   FB15k237Inductive(v1) dataset
18:57:37   #train: 4245, #valid: 489, #test: 411
DEBUG
/root/git/ULTRA/output/Ultra/FB15k237Inductive/2023-10-26-18-57-37
ckpts/ultra_4g.pth
Traceback (most recent call last):
  File "/content/ULTRA/script/run.py", line 271, in <module>
    state = torch.load(cfg.checkpoint, map_location="cpu")
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 986, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 435, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 416, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'ckpts/ultra_4g.pth'

Solution

To get it running, I had to initialize the model and load the checkpoint before setting up the working directory. Creating the working directory changes the current path, so the relative checkpoint path passed via args can no longer be resolved afterwards.

if __name__ == "__main__":
    print(os.getcwd())
    args, vars = util.parse_args()
    cfg = util.load_config(args.config, context=vars)
    model = Ultra(
        rel_model_cfg=cfg.model.relation_model,
        entity_model_cfg=cfg.model.entity_model,
    )

    print("DEBUG")
    print(os.getcwd())
    print(cfg.checkpoint)
    # Load the checkpoint while the current directory is still the launch directory.
    if "checkpoint" in cfg and cfg.checkpoint is not None:
        state = torch.load(cfg.checkpoint, map_location="cpu")
        model.load_state_dict(state["model"])
    # Create the working directory only after the checkpoint has been loaded.
    working_dir = util.create_working_directory(cfg)
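
An alternative to reordering the calls is to turn the checkpoint path into an absolute path before the current directory changes. This is only a sketch under that assumption: resolve_checkpoint is a hypothetical helper, not part of ULTRA, and the temporary directory below merely stands in for whatever directory change happens in the script.

```python
import os
import tempfile


def resolve_checkpoint(path, base_dir=None):
    """Return an absolute checkpoint path (hypothetical helper, not part of ULTRA).

    Relative paths are interpreted against base_dir, which defaults to the
    directory the script was launched from.
    """
    base_dir = os.getcwd() if base_dir is None else base_dir
    path = os.path.expanduser(path)  # expand a leading "~"
    if not os.path.isabs(path):
        path = os.path.join(base_dir, path)
    return os.path.normpath(path)


if __name__ == "__main__":
    launch_dir = os.getcwd()
    ckpt = resolve_checkpoint("ckpts/ultra_4g.pth", base_dir=launch_dir)

    with tempfile.TemporaryDirectory() as working_dir:
        os.chdir(working_dir)  # stands in for the directory change in the script
        print(ckpt)            # still points below the launch directory
        os.chdir(launch_dir)   # restore before the temporary directory is removed
```

With an absolute path, the later torch.load call would no longer depend on the current directory.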

Output

For practical reasons, we need the predicted edges for the test data, but it seems we can only access the evaluation metrics. Could you please give us guidance on how to read the predicted edges?
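
To make the request concrete, here is a minimal sketch of the kind of output we are after. It assumes that, for each (head, relation) query, one score per candidate tail entity can be obtained from the model (for example, one row of a score tensor converted with .tolist()). The function top_k_tails and the example names are hypothetical and not part of ULTRA.

```python
import heapq


def top_k_tails(scores, k=3, id_to_entity=None):
    """Return the k highest-scoring candidate tails for one query.

    scores: one score per candidate tail entity, indexed by entity id.
    id_to_entity: optional mapping from entity id to a readable name.
    """
    ranked = heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])
    if id_to_entity is None:
        return ranked
    return [(id_to_entity[idx], score) for idx, score in ranked]


if __name__ == "__main__":
    # Made-up scores for a single (head, relation) query over four entities.
    scores = [0.1, 2.5, -0.3, 1.7]
    names = {0: "e0", 1: "e1", 2: "e2", 3: "e3"}
    print(top_k_tails(scores, k=2, id_to_entity=names))
```

Applying this to every test query would give a list of predicted (head, relation, tail) edges alongside the metrics.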

Integrating a language model with ULTRA

Hi @migalkin,
First of all, kudos for your work (both ULTRA and NodePiece 😄)!

I'm curious to hear your thoughts about integrating a language model (LM) with ULTRA.
Previously, with other KG models such as NodePiece, it was straightforward to integrate a language model to enrich the graph embeddings with textual embeddings.
I used to concatenate the textual and graph representations of each entity and, where needed, apply additional layers to match the desired dimensions.

example:

# Code from the PyKEEN framework, with modifications
x_e, x_r = self.entity_representations[0](), self.relation_representations[0]()
indices = torch.arange(self.text_representation.weight.data.shape[0])
x_e = self.merge_model(self.text_representation(indices), x_e)  # Concat + linear layer

# Perform message passing and get updated states
for layer in self.gnn_encoder:
    x_e, x_r = layer(
        x_e=x_e,
        x_r=x_r,
        edge_index=getattr(self, f"{mode}_edge_index"),
        edge_type=getattr(self, f"{mode}_edge_type"),
    )

So far, it worked well and boosted the model's performance by ~50% when used with TransE and by up to ~30% with NodePiece on my datasets.

With ULTRA I guess that I have some additional work to do :)...
I started with understanding how the entity representation is "generated" on the fly:
https://github.com/DeepGraphLearning/ULTRA/blob/33c6e6b8e522aed3d33f6ce5d3a1883ca9284718/ultra/models.py#L166-L174C4

I understand that from that point only the tail representations are used to feed the MLP.

I replaced the MLP with my own MLP so that its input dimension matches the concatenation of both representations. Then, I tried to concatenate the output from ULTRA with the textual entity representation. As far as I understand, due to this "late" concatenation, only the textual representation of the tail entity will be used.
When tested, I got (almost) the same results with and without the textual representation.

Not sure what I expect to hear :), but I hope you may have an idea for combining both representations.
