
kge-hake's Introduction

HAKE: Hierarchy-Aware Knowledge Graph Embedding

This is the code for the paper Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction. Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, Jie Wang. AAAI 2020. arXiv

Dependencies

Results

The results of HAKE and the baseline model ModE on WN18RR, FB15k-237 and YAGO3-10 are as follows.

WN18RR

       MRR            HITS@1  HITS@3  HITS@10
ModE   0.472          0.427   0.486   0.564
HAKE   0.496 ± 0.001  0.452   0.516   0.582

FB15k-237

       MRR            HITS@1  HITS@3  HITS@10
ModE   0.341          0.244   0.380   0.534
HAKE   0.346 ± 0.001  0.250   0.381   0.542

YAGO3-10

       MRR            HITS@1  HITS@3  HITS@10
ModE   0.510          0.421   0.562   0.660
HAKE   0.546 ± 0.001  0.462   0.596   0.694

Running the code

Usage

bash runs.sh {train | valid | test} {ModE | HAKE} {wn18rr | FB15k-237 | YAGO3-10} <gpu_id> \
<save_id> <train_batch_size> <negative_sample_size> <hidden_dim> <gamma> <alpha> \
<learning_rate> <num_train_steps> <test_batch_size> [modulus_weight] [phase_weight]
  • { | }: Mutually exclusive items. Choose one from them.
  • < >: Placeholder for a value you must supply.
  • [ ]: Optional items.

Remark: [modulus_weight] and [phase_weight] are available only for the HAKE model.
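
For reference, the distance function from the paper takes the following simplified form, omitting the relation-specific bias term; treating λ as what phase_weight scales in the code (with modulus_weight as an additional weight on the modulus part) is an assumption to verify against codes/models.py:

```latex
d_r(\mathbf{h}, \mathbf{t}) =
  \underbrace{\lVert \mathbf{h}_m \circ \mathbf{r}_m - \mathbf{t}_m \rVert_2}_{\text{modulus part}}
  \; + \; \lambda \,
  \underbrace{\lVert \sin\!\big( (\mathbf{h}_p + \mathbf{r}_p - \mathbf{t}_p) / 2 \big) \rVert_1}_{\text{phase part}}
```

The final score used for ranking is presumably gamma minus this weighted distance, so smaller distances mean more plausible triples.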

To reproduce the results of HAKE and ModE, run the following commands.

HAKE

# WN18RR
bash runs.sh train HAKE wn18rr 0 0 512 1024 500 6.0 0.5 0.00005 80000 8 0.5 0.5

# FB15k-237
bash runs.sh train HAKE FB15k-237 0 0 1024 256 1000 9.0 1.0 0.00005 100000 16 3.5 1.0

# YAGO3-10
bash runs.sh train HAKE YAGO3-10 0 0 1024 256 500 24.0 1.0 0.0002 180000 4 1.0 0.5

ModE

# WN18RR
bash runs.sh train ModE wn18rr 0 0 512 1024 500 6.0 0.5 0.0001 80000 8 --no_decay

# FB15k-237
bash runs.sh train ModE FB15k-237 0 0 1024 256 1000 9.0 1.0 0.0001 100000 16

# YAGO3-10
bash runs.sh train ModE YAGO3-10 0 0 1024 256 500 24.0 1.0 0.0002 80000 4

Visualization

To plot entity embeddings on a 2D plane (Figure 4 in our paper), please refer to this issue.

Citation

If you find this code useful, please consider citing the following paper.

@inproceedings{zhang2020learning,
  title={Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction},
  author={Zhang, Zhanqiu and Cai, Jianyu and Zhang, Yongdong and Wang, Jie},
  booktitle={Thirty-Fourth {AAAI} Conference on Artificial Intelligence},
  pages={3065--3072},
  publisher={{AAAI} Press},
  year={2020}
}

Acknowledgement

Our code builds on the implementation of RotatE. Thanks to its authors for their contributions.

Other Repositories

If you are interested in our work, you may find the following paper useful.

Duality-Induced Regularizer for Tensor Factorization Based Knowledge Graph Completion. Zhanqiu Zhang, Jianyu Cai, Jie Wang. NeurIPS 2020. [paper] [code]

kge-hake's People

Contributors

jianyucai, zhanqiuzhang

kge-hake's Issues

Some questions about the code

Hello, recently I have been trying to reproduce Figures 2 and 3. How did you draw the relation diagrams in Figures 2 and 3 of the paper? Do you have reference code for this? Thank you very much!

I'd like to ask about the running environment and running steps

Hello, I am an undergraduate who has just started working with knowledge graphs, and I have just read your paper. First, does the code need to run in a Linux environment, or can it also be run under Windows? Second, what are the steps for running the code?

We look forward to your answers~

Questions

Hi,

First of all, thanks for sharing the code behind your paper! I have a few questions, mainly about how the details in the paper map to the code.

  1. My main concern is how a semantic hierarchy is translated into the embedding. How do we go from textual values in the knowledge graph to an embedding whose modulus values obey the hierarchy; for instance, "dog" and "mammal" being embedded with modulus values 2 and 1? Is it a consequence of the score function?

  2. In the code, we have the parameters modulus_weight and phase_weight that are multiplied with the modulus and phase scores. In the paper, however, we only multiply by a weight in the phase part (the lambda parameter).
    (screenshot of the paper's distance function)
    So the question is, are we missing a weight in the modulus part? If not, where does the lambda parameter map to in the code?

  3. I cannot find a description of the embedding_range parameter in the paper. Is there an explanation behind its definition? Furthermore, is the purpose of the epsilon value just to set a minimum value for the embedding range? And why 2.0?

  4. The initialisation of the HAKE and ModE classes seems to differ in how the relation embedding is initialised: HAKE initialises the modulus and bias parts to ones and zeros respectively. What is the purpose of this?

I hope the questions make sense. Thanks in advance!

I just want to ask one question about code.

I have a question about the code in "codes/models.py".
In "codes/models.py", there is an if statement, shown below.
I want to know whether there is a reason for splitting the two cases with this if statement.

        if batch_type == BatchType.HEAD_BATCH:
            phase_score = phase_head + (phase_relation - phase_tail)
        else:
            phase_score = (phase_head + phase_relation) - phase_tail

Of course, since these are floating-point operations, the two groupings can produce slightly different values.
But that is only a limitation of the arithmetic hardware.
So I wonder whether there is a logical difference.

Thank you for reading.
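
The effect the question describes can be checked directly in Python; this is a standalone illustration of floating-point non-associativity, not code from the repo:

```python
# Floating-point addition is not associative: regrouping the same three
# terms changes the intermediate rounding and hence the result.
a, b, c = 0.1, 0.2, 0.3

head_style = a + (b - c)   # like: phase_head + (phase_relation - phase_tail)
tail_style = (a + b) - c   # like: (phase_head + phase_relation) - phase_tail

# Mathematically both equal 0.0, but the rounded results differ.
print(head_style, tail_style, head_style == tail_style)
```

So the two branches are mathematically identical, and any difference comes purely from rounding order.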

Question about figure 3 visualization

Hello, I am currently a final-year student with a graduation project on knowledge graphs. I am very impressed with your approach to the problem. I have a question about Figure 3 in the paper: I don't quite understand how this plot was drawn. Could you be more specific about this?
Thank you very much; I look forward to hearing from you.

One question

Thank you very much for the code! I read your code carefully and have one question. I would be grateful if you could help me with the answer.

in data.py:131

    def collate_fn(data):
        positive_sample = torch.stack([_[0] for _ in data], dim=0)
        negative_sample = torch.stack([_[1] for _ in data], dim=0)
        subsample_weight = torch.cat([_[2] for _ in data], dim=0)
        batch_type = data[0][3]  # this line
        return positive_sample, negative_sample, subsample_weight, batch_type

This function processes one batch; does all the data in one batch share a single batch_type?
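
In RotatE-style training loops, which this code follows, the head-corrupting and tail-corrupting datasets are wrapped in separate data loaders, so a batch never mixes types. A plain-Python mock (hypothetical tuples, not the repo's actual classes) shows why reading data[0][3] is then safe:

```python
from enum import Enum

class BatchType(Enum):
    HEAD_BATCH = 0
    TAIL_BATCH = 1

# Each sample mimics the (positive, negative, weight, batch_type) tuples
# that a dataset built for a single corruption mode would yield.
batch = [((0, 1, 2), [3, 4], 0.5, BatchType.HEAD_BATCH) for _ in range(4)]

# Every sample carries the same batch_type, so the first is representative.
assert {sample[3] for sample in batch} == {BatchType.HEAD_BATCH}
```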

How to compute entity hierarchy levels

How should the modulus be computed in the code in order to judge whether two entities are at the same level of the hierarchy?
And for entities that are not at the same level, how can we determine which level each belongs to? For example, with animal, bird, and sparrow: animal is two levels above sparrow, and bird is one level above sparrow.
Could you give a concrete code example that solves this example?
I have already finished training the HAKE model.

Determining entity hierarchy levels

Hello, I obtained entity embeddings with HAKE. How can I use the entity embeddings to judge whether two entities belong to the same level? That is, how can I use the trained entity embeddings to distinguish the hierarchy levels of entities?
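
A hypothetical way to compare the hierarchy levels of two trained entities is to aggregate the modulus part of their embeddings, since the paper models depth in the hierarchy with the moduli. The sketch below is not code from this repo: the [phase | modulus] split and the mean aggregation are assumptions, and which direction (smaller vs. larger moduli) corresponds to higher levels should be verified on the trained model.

```python
import numpy as np

def modulus_level(entity_emb: np.ndarray) -> float:
    """Collapse the modulus half of a HAKE-style entity embedding into one
    scalar; entities at the same level should get similar values."""
    # Assumption: the embedding concatenates [phase | modulus] halves.
    half = entity_emb.shape[0] // 2
    return float(np.abs(entity_emb[half:]).mean())

# Toy embeddings (dim 8): phases in [-pi, pi], moduli chosen so that the
# two entities sit at clearly different radii.
rng = np.random.default_rng(0)
animal = np.concatenate([rng.uniform(-np.pi, np.pi, 4), rng.uniform(0.1, 0.3, 4)])
sparrow = np.concatenate([rng.uniform(-np.pi, np.pi, 4), rng.uniform(0.8, 1.0, 4)])

# Different mean moduli suggest different hierarchy levels.
print(modulus_level(animal), modulus_level(sparrow))
```

A tolerance on the difference of the two scalars could then decide "same level vs. different level"; choosing that tolerance is an empirical matter.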

About the hyperparameters

I find that the hyperparameters for HAKE are almost the same as RotatE's, except for the new parameters

  • MODULUS_WEIGHT

  • PHASE_WEIGHT

Could you explain how you grid searched them (such as the range of the grid search)?

Number of parameters

Hello,

I was wondering where one could find the number of parameters of ModE and HAKE on WN18RR, FB15k-237 and YAGO3-10.

Cheers
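
The counts can be computed from the embedding shapes. The sketch below assumes ModE uses one d-dimensional vector per entity and per relation, while HAKE uses 2d per entity (phase + modulus) and 3d per relation (phase + modulus + bias); these per-item factors are inferred from the model structure, so verify them against codes/models.py before relying on the numbers.

```python
def param_count(n_entities: int, n_relations: int, dim: int, model: str) -> int:
    """Rough parameter count under the assumed embedding layout."""
    if model == "ModE":
        return n_entities * dim + n_relations * dim
    if model == "HAKE":
        return n_entities * 2 * dim + n_relations * 3 * dim
    raise ValueError(f"unknown model: {model}")

# WN18RR has 40,943 entities and 11 relations; hidden_dim is 500 in the
# commands above.
print(param_count(40943, 11, 500, "HAKE"))
print(param_count(40943, 11, 500, "ModE"))
```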

About the 2D plane.

I'm very curious about the 2D plots in the paper.
Could you tell me how you plotted them?

Question about HAKE: how is hierarchy considered?

Hello, I have a question about how HAKE models hierarchy in the input graph. Specifically, does the embedding of an entity using HAKE only consider information at a 1-hop distance or is information considered at a multi-hop distance?

For example, if an entity is at the highest level of the hierarchy in the graph, does HAKE's embedding of this entity capture information all the way from its level down to the lowest level of the hierarchy, or does it only consider information at a 1-hop distance?

Thank you!

Negative sampling

Hello, have you implemented the negative sampling algorithm?
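
For reference, RotatE-style code draws uniform negatives by corrupting the head or tail of a positive triple with random entities, filtering out corruptions that reconstruct a known true triple. The sketch below is a minimal illustration of that idea, not this repo's implementation:

```python
import random

def negative_sample(triple, n_entities, size, true_triples, corrupt="tail"):
    """Corrupt one side of (h, r, t) with random entity ids, skipping
    corruptions that happen to be true triples (filtered sampling)."""
    h, r, t = triple
    negatives = []
    while len(negatives) < size:
        e = random.randrange(n_entities)
        cand = (e, r, t) if corrupt == "head" else (h, r, e)
        if cand not in true_triples:
            negatives.append(e)
    return negatives

true_triples = {(0, 0, 1), (0, 0, 2)}
negs = negative_sample((0, 0, 1), 10, 4, true_triples, corrupt="tail")
print(negs)  # four entity ids, none of which reconstruct a true triple
```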

CUDA error: device-side assert triggered

Hello, thank you so much for your excellent work! When I use my own dataset, I get the following error; could you tell me why? With the same environment configuration, the FB15k-237 dataset you provided does not produce this error. My data runs fine on TransE, and I converted it to the same format as the FB15k-237 dataset you provided. This problem has been bothering me for a long time; I hope you can help me. Below is my dataset information and the error message.
(screenshots of the dataset and error message)
