deepgraphlearning / ultra

A foundation model for knowledge graph reasoning

License: MIT License

Python 77.82% Cuda 11.08% C++ 11.10%

ultra's Issues

Cudatoolkit 11.8

Hi,
I am looking forward to running ULTRA. It looks great!

I am stuck installing the packages listed in the installation instructions. I ran the CUDA installation (I also tried with pip, and set the CUDA_HOME environment variable as you pointed out) and got the following error for cudatoolkit:

LibMambaUnsatisfiableError: Encountered problems while solving:
  - nothing provides requested cudatoolkit 11.8**
  - nothing provides cuda 11.8.* needed by pytorch-cuda-11.8-h8dd9ede_2

Could not solve for environment specs
The following packages are incompatible
├─ cudatoolkit 11.8**  does not exist (perhaps a typo or a missing channel);
└─ pytorch-cuda 11.8**  is not installable because it requires
   └─ cuda 11.8.* , which does not exist (perhaps a missing channel).

Do you know whether another version of cudatoolkit works well with your stack? cudatoolkit 11.8 is giving me (at least) some issues. I would appreciate any help. Thanks!
Jaime

ULTRA on HuggingFace Hub

Hi @clefourrier, you helped us some time ago to put LRGB on 🤗 Datasets, so you seem to be the only person from HF who has any remote experience with Graph ML, although we know you have been into LLM leaderboards recently :)

We tried to put ULTRA checkpoints on the Hub here https://huggingface.co/mgalkin with the basic write-up and code examples. Could you please have a look to check whether graph models are ok in this format on the Hub?

A few caveats at the moment:

- Model repos on the Hub pack most of the code from this repo and have additional interfaces, UltraConfig and UltraLinkPrediction.
- I can only wrap those into AutoConfig and AutoLinkPrediction locally. I presume that the Auto interfaces will recognize Ultra website-wide only if we implement it directly in Transformers?

Error in running the example

When I run the example you provided

python script/run.py -c config/transductive/inference.yaml --dataset CoDExSmall --epochs 0 --bpe null --gpus null --ckpt ckpts/ultra_4g.pth

I get the error below:
FileNotFoundError: [Errno 2] No such file or directory: 'ckpts/ultra_4g.pth'

The error goes away when I replace config/transductive/inference.yaml with config/transductive/pretrain_4g.yaml, but then I get a new error:

IndexError: range object index out of range
Could you please resolve this issue?

No such file or directory: 'ckpts/ultra_4g.pth'

Error

When I run script/run.py in any setting on Colab, I get the following error.

/content/ULTRA
18:57:37   Random seed: 1024
18:57:37   Config file: config/inductive/inference.yaml
18:57:37   {'checkpoint': 'ckpts/ultra_4g.pth',
 'dataset': {'class': 'FB15k237Inductive',
             'root': '~/git/ULTRA/kg-datasets/',
             'version': 'v1'},
 'model': {'class': 'Ultra',
           'entity_model': {'aggregate_func': 'sum',
                            'class': 'IndNBFNet',
                            'hidden_dims': [64, 64, 64, 64, 64, 64],
                            'input_dim': 64,
                            'layer_norm': True,
                            'message_func': 'distmult',
                            'short_cut': True},
           'relation_model': {'aggregate_func': 'sum',
                              'class': 'NBFNet',
                              'hidden_dims': [64, 64, 64, 64, 64, 64],
                              'input_dim': 64,
                              'layer_norm': True,
                              'message_func': 'distmult',
                              'short_cut': True}},
 'optimizer': {'class': 'AdamW', 'lr': 0.0005},
 'output_dir': '~/git/ULTRA/output',
 'task': {'adversarial_temperature': 1,
          'metric': ['mr', 'mrr', 'hits@1', 'hits@3', 'hits@10', 'hits@10_50'],
          'name': 'InductiveInference',
          'num_negative': 256,
          'strict_negative': True},
 'train': {'batch_per_epoch': None,
           'batch_size': 16,
           'gpus': [0],
           'log_interval': 100,
           'num_epoch': 0}}
18:57:37   FB15k237Inductive(v1) dataset
18:57:37   #train: 4245, #valid: 489, #test: 411
DEBUG
/root/git/ULTRA/output/Ultra/FB15k237Inductive/2023-10-26-18-57-37
ckpts/ultra_4g.pth
Traceback (most recent call last):
  File "/content/ULTRA/script/run.py", line 271, in <module>
    state = torch.load(cfg.checkpoint, map_location="cpu")
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 986, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 435, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 416, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'ckpts/ultra_4g.pth'

Solution

To fix it and get the script running, I had to initialize the model and load the checkpoint before setting up working_dir. Essentially, creating the working directory changes the current path, so the relative checkpoint path passed via args can no longer be resolved.

if __name__ == "__main__":
    print(os.getcwd())
    args, vars = util.parse_args()
    cfg = util.load_config(args.config, context=vars)
    # Build the model first, while the current directory is still the repo root.
    model = Ultra(
        rel_model_cfg=cfg.model.relation_model,
        entity_model_cfg=cfg.model.entity_model,
    )

    print("DEBUG")
    print(os.getcwd())
    print(cfg.checkpoint)
    # Load the checkpoint while its relative path still resolves.
    if "checkpoint" in cfg and cfg.checkpoint is not None:
        state = torch.load(cfg.checkpoint, map_location="cpu")
        model.load_state_dict(state["model"])
    # Only now create the working directory, which changes the current path.
    working_dir = util.create_working_directory(cfg)
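
An alternative to reordering the calls would be to turn the checkpoint path into an absolute one before the working directory is created. The sketch below assumes, as the log above suggests, that the working-directory step changes the process's current directory. `resolve_checkpoint` is a hypothetical helper, not part of the ULTRA repo, and the temporary directories only stand in for the repo layout:

```python
import os
import tempfile


def resolve_checkpoint(path, base_dir=None):
    """Return an absolute checkpoint path anchored at base_dir (default: current dir)."""
    base_dir = os.getcwd() if base_dir is None else base_dir
    path = os.path.expanduser(path)
    if not os.path.isabs(path):
        path = os.path.join(base_dir, path)
    return os.path.normpath(path)


# Simulate the failure: a relative path stops resolving after the directory changes.
start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as repo_dir:
    repo_dir = os.path.realpath(repo_dir)
    os.makedirs(os.path.join(repo_dir, "ckpts"))
    os.makedirs(os.path.join(repo_dir, "output"))
    # Empty file standing in for a real checkpoint.
    open(os.path.join(repo_dir, "ckpts", "ultra_4g.pth"), "wb").close()

    os.chdir(repo_dir)
    ckpt = resolve_checkpoint("ckpts/ultra_4g.pth")  # resolve BEFORE the directory changes
    os.chdir(os.path.join(repo_dir, "output"))       # stand-in for the working-directory step

    relative_found = os.path.exists("ckpts/ultra_4g.pth")
    absolute_found = os.path.exists(ckpt)
    os.chdir(start_dir)  # leave the temp dir so it can be cleaned up

print(relative_found, absolute_found)  # False True
```

With this approach the order of model creation and working-directory setup would no longer matter, since the resolved path does not depend on the current directory.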

Output

For practical reasons, we need the predicted edges for the test data, but it seems we can only access the evaluation metrics. Could you please give us guidance on how to read the predicted edges?
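
As a rough illustration of what reading the predicted edges could look like, suppose the model yields one score per candidate tail entity for each (head, relation) query. The tensor name `scores`, its `(num_queries, num_entities)` shape, and the helper `top_k_tails` are assumptions made for this sketch and are not taken from the ULTRA code. Under those assumptions, the highest-scoring tails can be extracted with `torch.topk`:

```python
import torch


def top_k_tails(heads, relations, scores, k=3):
    """Return (head, relation, tail, score) tuples for the k best tails of each query."""
    top_scores, top_tails = torch.topk(scores, k=k, dim=-1)
    predictions = []
    rows = zip(heads.tolist(), relations.tolist(), top_tails.tolist(), top_scores.tolist())
    for h, r, tails, values in rows:
        for t, s in zip(tails, values):
            predictions.append((h, r, t, s))
    return predictions


# Toy example: two queries scored against four candidate entities.
heads = torch.tensor([0, 2])
relations = torch.tensor([1, 1])
scores = torch.tensor([[0.1, 0.9, 0.3, 0.5],
                       [0.7, 0.2, 0.8, 0.4]])

predictions = top_k_tails(heads, relations, scores, k=2)
for h, r, t, s in predictions:
    print(f"({h}, {r}, {t})  score={s:.2f}")
```

Depending on the use case, tails that already appear in the training graph may need to be filtered out before taking the top k.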

Integrating a language model with ULTRA

Hi @migalkin,
First of all, kudos for your work!!!! (both ULTRA and nodepiece 😄).

I'm curious to hear your thoughts about integrating a language model (LM) with ULTRA.
Previously, with other KG models such as nodepiece, it was straightforward to integrate a language model to enrich the graph embeddings with textual embeddings.
I used to concatenate the textual and graph representations of each entity and, where needed, apply additional layers to match the desired dimensions.

example:

# code from pykeen framework + modification
x_e, x_r = self.entity_representations[0](), self.relation_representations[0]()
indices = torch.arange(self.text_representation.weight.data.shape[0])
x_e = self.merge_model(self.text_representation(indices), x_e)  # Concat + linear layer

# Perform message passing and get updated states
for layer in self.gnn_encoder:
    x_e, x_r = layer(
        x_e=x_e,
        x_r=x_r,
        edge_index=getattr(self, f"{mode}_edge_index"),
        edge_type=getattr(self, f"{mode}_edge_type"),
    )
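
The `merge_model` referenced in the snippet is not shown in the post. A minimal sketch of what a "concat + linear layer" merge could look like in PyTorch follows. The class name `ConcatMerge`, the embedding table standing in for language-model features, and all dimensions are assumptions for illustration, not the author's code:

```python
import torch
from torch import nn


class ConcatMerge(nn.Module):
    """Concatenate text and graph embeddings, then project back to the graph dimension."""

    def __init__(self, text_dim, graph_dim):
        super().__init__()
        self.proj = nn.Linear(text_dim + graph_dim, graph_dim)

    def forward(self, x_text, x_graph):
        # Both inputs are (num_entities, dim); concatenate along the feature axis.
        return self.proj(torch.cat([x_text, x_graph], dim=-1))


num_entities, text_dim, graph_dim = 5, 16, 8
# Stand-in for precomputed language-model embeddings, one row per entity.
text_representation = nn.Embedding(num_entities, text_dim)
x_graph = torch.randn(num_entities, graph_dim)

merge_model = ConcatMerge(text_dim, graph_dim)
indices = torch.arange(num_entities)
x_e = merge_model(text_representation(indices), x_graph)
print(x_e.shape)  # torch.Size([5, 8])
```

Projecting back to the graph dimension keeps the downstream message-passing layers unchanged, which is one reason to merge before the GNN encoder rather than after it.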

So far, it has worked well, improving the model's performance by ~50% with transE and by up to ~30% with nodepiece on my datasets.

With ULTRA I guess that I have some additional work to do :)...
I started with understanding how the entity representation is "generated" on the fly:
https://github.com/DeepGraphLearning/ULTRA/blob/33c6e6b8e522aed3d33f6ce5d3a1883ca9284718/ultra/models.py#L166-L174C4

I understand that from that point only the tail representations are used to feed the MLP.

I replaced the MLP with my own so that its input dimension matches the concatenation of both representations. Then I tried to concatenate the output from ULTRA with the textual entity representation. As far as I understand, due to this "late" concatenation, only the tail entity's textual representation will be used.
When tested, I got (almost) the same results with/without the textual representation.

Not sure what I expect to hear :), but I hope you may have an idea for combining both representations.

Torch install instructions may be invalid

(custom_env) me@MY-MBP ULTRA % pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu118
Looking in indexes: https://download.pytorch.org/whl/cu118
ERROR: Could not find a version that satisfies the requirement torch==2.1.0 (from versions: none)
ERROR: No matching distribution found for torch==2.1.0
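
One likely explanation, though not confirmed in the report, is the platform. The shell prompt suggests a Mac (`MY-MBP`), and the `cu118` index hosts CUDA builds, which PyTorch publishes for Linux and Windows only, so pip finds no candidate versions there. A small sketch of choosing the install command per platform, assuming the default PyPI build is acceptable on macOS (`torch_install_command` is a hypothetical helper):

```python
import platform

CUDA_INDEX = "https://download.pytorch.org/whl/cu118"


def torch_install_command(system=None, version="2.1.0"):
    """Build a pip command; CUDA wheels are only requested off macOS."""
    system = platform.system() if system is None else system
    base = f"pip install torch=={version}"
    if system == "Darwin":
        # macOS has no CUDA wheels, so fall back to the default PyPI build.
        return base
    return f"{base} --index-url {CUDA_INDEX}"


print(torch_install_command("Darwin"))
print(torch_install_command("Linux"))
```

An unsupported Python version for the requested torch release can produce the same error message, so that is worth checking as well.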
