
gana-fewshotkgc's Introduction

GANA-FewShotKGC

SIGIR2021: Relational Learning with Gated and Attentive Neighbor Aggregator for Few-Shot Knowledge Graph Completion. Paper

This is our source code and data for the paper:

Guanglin Niu, Yang Li, Chengguang Tang, Ruiying Geng, Jian Dai, Qiao Liu, Hao Wang, Jian Sun, Fei Huang, Luo Si. Relational Learning with Gated and Attentive Neighbor Aggregator for Few-Shot Knowledge Graph Completion. SIGIR 2021.

Author: Dr. Guanglin Niu (beihangngl at buaa.edu.cn)

Introduction

Aiming at expanding the coverage of few-shot relations in knowledge graphs (KGs), few-shot knowledge graph completion (FKGC) has recently attracted increasing research interest. Some existing models employ a few-shot relation's multi-hop neighbor information to enhance its semantic representation. However, noisy neighbor information can be amplified when the neighborhood is excessively sparse and no neighbor is available to represent the few-shot relation. Moreover, modeling and inferring complex relations of one-to-many (1-N), many-to-one (N-1), and many-to-many (N-N) with previous knowledge graph completion approaches requires high model complexity and a large number of training instances, so inferring complex relations in the few-shot scenario is difficult for FKGC models due to limited training data. In this paper, we propose a few-shot relational learning framework with a global-local structure to address these issues. At the global stage, a novel gated and attentive neighbor aggregator accurately integrates the semantics of a few-shot relation's neighborhood, which helps filter out noisy neighbors even when a KG contains extremely sparse neighborhoods. At the local stage, a meta-learning based TransH (MTransH) method models complex relations and trains our model in a few-shot learning fashion. Extensive experiments show that our model outperforms state-of-the-art FKGC approaches on the frequently-used benchmark datasets NELL-One and Wiki-One. Compared with the strong baseline MetaR, our model improves 5-shot FKGC performance by 8.0% on NELL-One and 2.8% on Wiki-One under the Hits@10 metric.

Dataset

The dataset can be downloaded from Drive. Unzip it to the directory ./GANA-FewShotKGC.

The structure of the project is as follows:

GANA-FewShotKGC
    |--./NELL
    |--trainer_gana.py
    |--params.py
    |--models_gana.py
    |--main_gana.py
    |--hyper_embedding.py
    |--embedding.py
    |--data_loader.py

Run the code

CUDA_VISIBLE_DEVICES=0 python main_gana.py --dataset NELL-One --data_path ./NELL \
--few 5 --data_form Pre-Train \
--prefix nellone_gana5 --max_neighbor 100

gana-fewshotkgc's People

Contributors

ngl567


gana-fewshotkgc's Issues

Experimental setup does not match the paper; code and paper are inconsistent; results differ

Setting aside minor details, the main points are:

  1. Experimental setup: the paper describes the dataset under MetaR's Pre-Train setting, but line 27 of main_gana.py hard-codes the In-Train setting, so the data_form parameter only controls whether entity embeddings are initialized from pretrained embeddings or randomly.
  2. Neighbor encoding and gating: Equations 1 and 2 in the paper describe the entity neighbor encoder, but the order of the bias and ReLU in the code does not match the paper. The neighbor attention does not mask dummy neighbors, and the bias in Equation 5 is also missing; using the bias built into nn.Linear couples the bias with the gate. See model_gana.py, lines 161-170.
  3. Relation representation aggregation: the paper's equations apply an attention-weighted sum over every LSTM hidden state, whereas the code uses the final hidden state at step few+1 to attend over the first few relation outputs. Equations 11-13 do not match the code at all. See model_gana.py, lines 59-65.
  4. MTransH: Equation 14 in the paper does not match the implementation. Judging from the TransH formulation, the code is likely correct and the equation is a writing slip. However, the hyperplane gradient update described in Equation 19 is implemented at line 226 of model_gana.py using the gradient of the relation instead.

Using the settings the authors provide, I tried many code variants on the NELL dataset, with the following results:

  • Changing the TransH hyperplane update to the one described in the paper drops MRR from about 0.320 to about 0.290
  • Changing the neighbor-encoder mask and the relation aggregation to the paper's description gives MRR around 0.300-0.320
  • Strictly following MetaR's Pre-Train setting gives MRR below about 0.270

The reported results were probably obtained under the In-Train setting. According to the (reproducible) results that MetaR provides, NELL results are better under In-Train, much better than what the original MetaR paper reports. Under In-Train, MetaR on Wiki is indeed affected by noise, which is the motivation behind the gate proposed in this work. Running Wiki gives MRR around 0.290, which is indeed a significant improvement over MetaR.

Summary:

  1. Much of the paper does not correspond to the code.
  2. The dataset setting is actually In-Train, but initialization uses pretrained embeddings.
  3. The experimental results are hard to reproduce, with a significant gap.
  4. The gate does mitigate the noise on the Wiki dataset under the In-Train setting.
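For reference, the standard TransH score that point 4 appeals to (project head and tail onto the relation hyperplane, then measure translation distance) can be sketched as follows. This is a minimal standalone sketch with illustrative shapes, not the repository's code:

```python
import torch
import torch.nn.functional as F

def transh_score(h, r, t, w):
    """Standard TransH score: project h and t onto the hyperplane with unit
    normal w, then measure || h_perp + r - t_perp ||_2."""
    w = F.normalize(w, p=2, dim=-1)                  # unit hyperplane normal
    h_perp = h - (h * w).sum(-1, keepdim=True) * w   # remove component along w
    t_perp = t - (t * w).sum(-1, keepdim=True) * w
    return torch.norm(h_perp + r - t_perp, p=2, dim=-1)

h, t = torch.rand(8, 100), torch.rand(8, 100)
r, w = torch.rand(8, 100), torch.rand(8, 100)
score = transh_score(h, r, t, w)   # one score per triple, shape (8,)
```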

Dimension mismatch in the loss function arguments

Running your code, I get the error: RuntimeError: margin_ranking_loss : All input tensors should have same dimension but got sizes: input1: torch.Size([1024, 3]), input2: torch.Size([1024, 3]), target: torch.Size([1]). Reading the source of this method, all arguments are required to have the same dimension, but the call in your model seems inconsistent: y = torch.Tensor([1]).to(self.device); loss = self.loss_func(p_score, n_score, y)
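A minimal reproduction with one possible fix, assuming the intent is a target of +1 for every score entry (a sketch, not the authors' code):

```python
import torch
import torch.nn.functional as F

p_score = torch.rand(1024, 3)  # scores of positive triples
n_score = torch.rand(1024, 3)  # scores of negative triples

# torch.Tensor([1]) has shape (1,), which some PyTorch versions reject here;
# giving the target the same shape as the scores avoids the RuntimeError
y = torch.ones_like(p_score)
loss = F.margin_ranking_loss(p_score, n_score, y, margin=1.0)
```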

--data_form

What do the three values of data_form (["Pre-Train", "In-Train", "Discard"]) mean, respectively?

A question about the neighbor encoding in the code

Hi, sorry to ask again; there is a part of the code I really cannot follow and I hope you can clarify it.

In the neighbor_encoder function of model_gana.py, the fourth line is entself = connections[:,0,0].squeeze(-1). If I understand correctly, entself is meant to be the head/tail entity in the support set, but this line looks wrong to me: connections only contains neighbor information, and [:,0,0] selects a relation, not an entity.

Looking forward to your reply. Sorry to bother you again; any guidance is appreciated.
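The indexing question above can be illustrated with a small example. The layout below is an assumption for illustration, with relation ids in slot 0 and neighbor entity ids in slot 1 of a (batch, max_neighbor, 2) tensor; the repository's data_loader may differ:

```python
import torch

# assumed layout: connections[b, k, 0] = relation id of the k-th neighbor,
#                 connections[b, k, 1] = neighbor entity id
connections = torch.tensor([[[5, 101], [7, 102]],
                            [[2, 201], [9, 202]]])

relations = connections[:, :, 0]   # all neighbor relation ids
entities  = connections[:, :, 1]   # all neighbor entity ids
first_rel = connections[:, 0, 0]   # relation of the FIRST neighbor,
                                   # not the central entity itself
```

Under this layout, connections[:, 0, 0] indeed yields relation ids, which is what the issue is pointing out.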

The program is running with an error

When I run the code with "python main_gana.py --dataset NELL-One --data_path ./NELL --few 5 --data_form Pre-Train --prefix nellone_gana5 --max_neighbor 100", the following error appears.

Complex relations

Could you explain how to run the complex-relation experiments? What should be done with the 1-N files in the processed dataset?

code

Hello, there are no files in the project. Where is your code?

Some issues with the implementation of GANA

Dear Authors:

There are some issues with the implementation of GANA: Non-reproducible results, In-Train setting rather than Pre-Train setting, and inconsistencies between paper and code.

1. Reproducibility
The results are not reproducible. Dependencies such as PyTorch version, CUDA version, and versions of the other Python libraries are not provided. Also, if we directly use the environment provided in MetaR, we get the following results on NELL-One (1-shot):

Model              MRR     Hit@10   Hit@5    Hit@1
GANA (Paper)       0.307   0.483    0.409    0.211
GANA (Reproduce)   0.265   0.453    0.357    0.163
MetaR (GitHub)     0.308   0.475    0.406    0.216

We tuned GANA over the following hyperparameters:

  • max_neighbor: 50, 100, 200
  • lstm_hiddendim: 100, 200, 600, 700

The results are much lower than the results reported in MetaR.

2. Pre-Train Setting
In the paper, it is mentioned that GANA utilizes pretrained embeddings from GMatching. However, in main.py, tail is set to _in_train, which differs from the paper's description. It seems the implementation is based on the "In-Train" setting, not the "Pre-Train" setting.

3. Inconsistencies between the Paper and the Implementation

  • In the paper, the hyperplane vector is updated using the gradient of the hyperplane vector. However, in the code, the hyperplane vector is updated using the gradient of the relation embedding. (In model_gana.py, line 239)
  • The paper describes GAT, but the implementation is GATv2. (In model_gana.py, lines 172-174)
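The distinction in the second bullet is where the LeakyReLU sits relative to the attention vector. A standalone sketch with made-up dimensions (not the repository's code):

```python
import torch
import torch.nn.functional as F

d, d_out = 4, 5
h_i, h_j = torch.rand(d), torch.rand(d)

# GAT (Velickovic et al.): per-node transform W, attention vector a, THEN
# the nonlinearity: e = LeakyReLU(a^T [W h_i || W h_j])
W_gat = torch.rand(d_out, d)
a_gat = torch.rand(2 * d_out)
e_gat = F.leaky_relu(a_gat @ torch.cat([W_gat @ h_i, W_gat @ h_j]))

# GATv2 (Brody et al.): nonlinearity BEFORE the attention vector, making
# the attention dynamic: e = a^T LeakyReLU(W [h_i || h_j])
W_v2 = torch.rand(d_out, 2 * d)
a_v2 = torch.rand(d_out)
e_v2 = a_v2 @ F.leaky_relu(W_v2 @ torch.cat([h_i, h_j]))
```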

Hope to get a reply. Thank you.

Best Regards,
RapidsLNT

Discrepancies between the paper and the released code, and result issues

Hi, I have two questions. First, Equation (14) in the paper differs from the released code; after I changed the code to match the paper's formula, the results actually got worse. Second, I ran the code multiple times and never reached the results reported in the paper. Are there parameter settings or other details I may have missed? Looking forward to your answer, thanks!

code?

When will you upload the code?

1-N and N-N

Could you share the code that splits the dataset into the 1-N, N-N, and N-1 subsets?

Error at self.norm_vector = self.h_embedding[0]

The run crashes at this line; could you update the code? I fixed a few small discrepancies in the dataset myself (not sure whether correctly). Also, the settings in the code do not follow the paper but are copied from MetaR, which makes the results hard to reproduce.
