
fewrel's Introduction

FewRel Dataset, Toolkits and Baseline Models

Our benchmark website: https://thunlp.github.io/fewrel.html

FewRel is a large-scale few-shot relation extraction dataset containing more than one hundred relations and tens of thousands of annotated instances across different domains. The dataset is presented in our EMNLP 2018 paper FewRel: A Large-Scale Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation, and a follow-up version is presented in our EMNLP 2019 paper FewRel 2.0: Towards More Challenging Few-Shot Relation Classification.

Based on our dataset and designed few-shot settings, we have two different benchmarks:

  • FewRel 1.0: This is the first benchmark to combine few-shot learning with relation extraction: your model needs to handle both the few-shot challenge and extracting entity relations from plain text. The benchmark provides a training set with 64 relations and a validation set with 16 relations. Once you submit your code to our benchmark website, it is evaluated on a hidden test set with 20 relations. Each relation has 100 human-annotated instances.

  • FewRel 2.0: We found two long-neglected aspects in previous few-shot research: (1) how well models can transfer across different domains, and (2) whether few-shot models can detect instances belonging to none of the given few-shot classes. To dig deeper into these two aspects, we propose the 2.0 version of our dataset, with newly added domain adaptation (DA) and none-of-the-above (NOTA) detection challenges. Find out more in our paper and on the evaluation websites: FewRel 2.0 domain adaptation / FewRel 2.0 none-of-the-above detection.

Citing

If you use our data, toolkits or baseline models, please kindly cite our papers:

@inproceedings{han-etal-2018-fewrel,
    title = "{F}ew{R}el: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation",
    author = "Han, Xu and Zhu, Hao and Yu, Pengfei and Wang, Ziyun and Yao, Yuan and Liu, Zhiyuan and Sun, Maosong",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D18-1514",
    doi = "10.18653/v1/D18-1514",
    pages = "4803--4809"
}

@inproceedings{gao-etal-2019-fewrel,
    title = "{F}ew{R}el 2.0: Towards More Challenging Few-Shot Relation Classification",
    author = "Gao, Tianyu and Han, Xu and Zhu, Hao and Liu, Zhiyuan and Li, Peng and Sun, Maosong and Zhou, Jie",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1649",
    doi = "10.18653/v1/D19-1649",
    pages = "6251--6256"
}

If you have questions about any part of the papers, submission, leaderboard, code or data, please e-mail [email protected].

Contributions

For FewRel 1.0, Hao Zhu first proposed this problem and designed the way to build the dataset and the baseline system; Ziyuan Wang built and maintained the crowdsourcing website; Yuan Yao helped download the original data and did the preprocessing; Xu Han, Hao Zhu, Pengfei Yu and Ziyun Wang implemented the baselines and wrote the paper together; Zhiyuan Liu provided thoughtful advice and funding throughout the whole project. The order of the first four authors is determined by dice rolling.

Dataset and Pre-trained Files

The dataset is already included in this GitHub repo. However, due to their large size, the GloVe files (pre-trained word embeddings) and the BERT pre-trained checkpoint are not included. Please use the script download_pretrain.sh to download these pre-trained files.

We also provide pid2name.json, which gives the Wikidata PID, name and description of each relation.
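
For reference, here is a minimal sketch (not part of the repo) of looking up relation names from pid2name.json; it assumes each Wikidata PID maps to a [name, description] pair and that the file lives under data/, so check the actual file before relying on it.

import json

# Path and structure are assumptions; inspect the file shipped with the repo first.
with open("data/pid2name.json", encoding="utf-8") as f:
    pid2name = json.load(f)

# Assumed structure: {"P###": ["relation name", "relation description"], ...}
for pid, (name, description) in list(pid2name.items())[:5]:
    print(pid, "-", name, "-", description)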

Note: we do not release the test sets of FewRel 1.0 or FewRel 2.0, to keep the comparison fair. We recommend evaluating your model on the validation set first and then submitting it to our evaluation websites (linked above).

Training a Model

To run our baseline models, use the command

python train_demo.py

This starts the training and evaluation of Prototypical Networks in a 5-way 5-shot setting. You can use different args to start different processes. Some of them are:

  • train / val / test: Specify the training / validation / test set. For example, if you use train_wiki for train, the program will load data/train_wiki.json for training. You should always use train_wiki for training and val_wiki (FewRel 1.0 and FewRel 2.0 NOTA challenge) or val_pubmed (FewRel 2.0 DA challenge) for validation.
  • trainN: N in N-way K-shot. trainN is the specific N in training process.
  • N: N in N-way K-shot.
  • K: K in N-way K-shot.
  • Q: Sample Q query instances for each relation.
  • model: Which model to use. The default one is proto, standing for Prototypical Networks. Note that if you use the PAIR model from our paper FewRel 2.0, you should also use --encoder bert --pair.
  • encoder: Which encoder to use. You can choose cnn or bert.
  • na_rate: NA rate for FewRel 2.0 none-of-the-above (NOTA) detection. Note that na_rate specifies the ratio between Q for NOTA and Q for the positive classes. For example, na_rate=0 means the normal setting, and na_rate=1, 2 and 5 correspond to NA rates of 15%, 30% and 50% in 5-way settings (see the short sketch after this list).

There are also many other training args (like batch_size and lr); you can find more details in our code.
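
To make the na_rate setting above concrete, here is a small sketch (not part of the repo) that computes the resulting NOTA fraction, assuming each episode draws N * Q positive queries plus na_rate * Q NOTA queries (which is how total_Q is described elsewhere on this page).

def nota_fraction(N, Q, na_rate):
    # Fraction of query instances that are none-of-the-above (NOTA),
    # assuming N * Q positive queries and na_rate * Q NOTA queries per episode.
    return (na_rate * Q) / (N * Q + na_rate * Q)

# In a 5-way setting, na_rate = 1, 2, 5 give roughly the 15%, 30% and 50%
# NA rates quoted above (about 16.7%, 28.6% and 50.0% exactly).
for na_rate in (1, 2, 5):
    print(na_rate, round(nota_fraction(N=5, Q=1, na_rate=na_rate), 3))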

Inference

You can evaluate an existing checkpoint with

python train_demo.py --only_test --load_ckpt {CHECKPOINT_PATH} {OTHER_ARGS}

Here we provide a BERT-PAIR checkpoint (trained on the FewRel 1.0 dataset, 5-way 1-shot).

Reproduction

BERT-PAIR for FewRel 1.0

python train_demo.py \
    --trainN 5 --N 5 --K 1 --Q 1 \
    --model pair --encoder bert --pair --hidden_size 768 --val_step 1000 \
    --batch_size 4 --fp16

Note that --fp16 requires Nvidia's apex.

        5-way 1-shot   5-way 5-shot   10-way 1-shot   10-way 5-shot
Val     85.66          89.48          76.84           81.76
Test    88.32          93.22          80.63           87.02

BERT-PAIR for Domain Adaptation (FewRel 2.0)

python train_demo.py \
    --trainN 5 --N 5 --K 1 --Q 1 \
    --model pair --encoder bert --pair --hidden_size 768 --val_step 1000 \
    --batch_size 4 --fp16 --val val_pubmed --test val_pubmed

        5-way 1-shot   5-way 5-shot   10-way 1-shot   10-way 5-shot
Val     70.70          80.59          59.52           70.30
Test    67.41          78.57          54.89           66.85

BERT-PAIR for None-of-the-Above (FewRel 2.0)

python train_demo.py \
    --trainN 5 --N 5 --K 1 --Q 1 \
    --model pair --encoder bert --pair --hidden_size 768 --val_step 1000 \
    --batch_size 4 --fp16 --na_rate 5

        5-way 1-shot (0% NOTA)   5-way 1-shot (50% NOTA)   5-way 5-shot (0% NOTA)   5-way 5-shot (50% NOTA)
Val     74.56                    73.09                     75.01                    75.38
Test    76.73                    80.31                     83.32                    84.64

Proto-CNN + Adversarial Training for Domain Adaptation (FewRel 2.0)

python train_demo.py \
    --val val_pubmed --adv pubmed_unsupervised --trainN 10 --N {} --K {} \
    --model proto --encoder cnn --val_step 1000

        5-way 1-shot   5-way 5-shot   10-way 1-shot   10-way 5-shot
Val     48.73          64.38          34.82           50.39
Test    42.21          58.71          28.91           44.35

fewrel's People

Contributors

gaotianyu1350, prokil, rjchee, thucsthanxu13, xxcclong, yuwl798180


fewrel's Issues

Dataset problem

Where can I find the entity types in the dataset? The data only contains entity IDs.

There is something wrong in the GNN and MetaNet models

When choosing gnn or metanet, the Q parameter of forward means the number of query instances for each class. However, the framework passes Q * N_for_train + na_rate * Q, which is total_Q.

Segmentation fault (core dumped)

Unfortunately, running this command causes a segmentation fault for me:

python train_demo.py \
    --trainN 5 --N 5 --K 1 --Q 1 \
    --model pair --encoder bert --pair --hidden_size 768 --val_step 1000 \
    --batch_size 4  --fp16 --na_rate 5 \
Segmentation fault (core dumped)

Any idea how can I resolve this?

Dataset preprocessing

Hi! Which tool was used to tokenize the dataset: Stanford NLP, spaCy, or something else? Thanks!

Dataset

Why are there so many Unicode character escapes in the dataset?

bug

When training proto cnn with
python3 train_demo.py --K 1
the loss is NaN; with K = 5 it trains normally. Is this expected?

inference

Hi, may I ask for an inference demo?

Question about the forward method in proto.py

Hi, I am running the 5-way 5-shot few-shot relation classification model (model: proto, encoder: cnn), with the trainN parameter set to 10 during training.
In the proto.py code:
support = torch.mean(support, 2) # Calculate prototype for each class
logits = -self.batch_dist(support, query) # (B, total_Q, N)
minn, _ = logits.min(-1)
logits = torch.cat([logits, minn.unsqueeze(2) - 1], 2) # (B, total_Q, N + 1)
_, pred = torch.max(logits.view(-1, N+1), 1)
return logits, pred
The logits are concatenated with minn to form new logits of dimension N + 1 = 11. Since trainN is set to 10, the concatenated minn part plays no role when computing the cross-entropy. May I ask whether this minn has some other use that I am missing? Looking forward to your answer, and many thanks!

Leaderboard misses details

What is the DualGraph model? Is there a paper?
How is it helpful to the community if a model claims a score with no reference, code, details, or contact?

About the val_pubmed data

Hello again, and thanks for your last answer. Can you tell me how you constructed the val_pubmed dataset?

Hello

Hi, when you ran the human evaluation on FewRel, how did you make sure the testers did not bring in knowledge they had learned before? The paper says: "Note that these labelers are not provided the name of the relations and any extra information". Does that mean the relation names were hidden during testing and each relation was only given a numeric label?

train_demo.py doesn't work with pytorch 1.0.0

Using pytorch 1.0.0, I get the following error trying to run train_demo.py

Traceback (most recent call last):
  File "train_demo.py", line 35, in <module>
    framework.train(model, model_name, 4, 20, N, K, 5)
  File "/home/ubuntu/FewRel/fewshot_re_kit/framework.py", line 155, in train
    iter_loss += self.item(loss.data)
  File "/home/ubuntu/FewRel/fewshot_re_kit/framework.py", line 82, in item
    return x[0]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
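
The error above comes from indexing a 0-dim tensor with x[0], which newer PyTorch versions no longer allow; the message itself suggests tensor.item(). A minimal workaround (a sketch, not an official patch) is to change the item helper in fewshot_re_kit/framework.py along these lines:

# fewshot_re_kit/framework.py -- sketch of a possible fix, not the official code
def item(self, x):
    # x.item() converts a 0-dim tensor to a Python number on PyTorch >= 0.4,
    # whereas x[0] raises IndexError on PyTorch 1.0.
    return x.item()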

Results on the val set

Hello, since my GPU cannot run the BERT encoder with a long enough sentence length, could you share the validation set results of the BERT-PAIR model on the DA task (FewRel 2.0)? Thank you very much.

codalab Step 2: Add the dev set and scripts

Click here to enter commands (e.g., help, run '', rm , kill , etc.).
CodaLab> new fewrel
Worksheet with name fewrel already exists
CodaLab> cl add bundle 0x22dc32baa68f43a1ace0dd3d5921386a
User zkzhu(0xa14f1d11a7db447a93979b5308ddb856) does not have sufficient permissions on worksheet 0xb843335f02a9443995db2013fbe2e7f4 (have read, need all).
CodaLab>

I got this error.

How to understand the n-way k-shot implementation

I have only recently started with few-shot learning and have read some papers, but I have always been confused about how n-way k-shot is actually implemented. Looking at the implementation in the author's data_loader, it seems that each episode performs one round of n-way k-shot sampling. But if the number of sampling rounds is large enough and every instance has the same probability of being drawn, roughly 60% of the data would eventually be sampled; FewRel has about seventy thousand instances, which means more than forty thousand of them would be used for training. With that much data, can this still be called few-shot? Any advice would be much appreciated.

How to upload models and parameters to CodaLab?

I have re-implemented your released code, but I encountered a problem when uploading the models to CodaLab. I tried the upload button at the top left, but there is no response, and I cannot find any other way to upload them.

BERT-PAIR + PubMed

Hello, I want to train BERT-PAIR with --val val_pubmed --adv pubmed_unsupervised and use my own embeddings related to the PubMed data.
Is it possible to do this?

Can you elaborate more on the FewRel 2.0 DA dataset?

The FewRel 2.0 DA dataset looks interesting, as it must take quite a lot of effort to construct such a biomedical dataset. Can you help us understand more about this part:

  1. Can you give an example of how you built the initial test set by aligning PubMed and UMLS (e.g. how did you figure out the relations)?
  2. How much annotation effort was required to validate the initial test set (e.g., 100 annotator hours)? Is biomedical knowledge a prerequisite for the annotators?
  3. On which platform was the annotation job run (e.g., Amazon MTurk)?

Thanks,

Allen

SNAIL

In the SNAIL model the input contains only x and no y. Isn't this different from the model in the original paper?

Lack of the code of transformer

Hi, I have cloned the code from the repository, but the transformer code cannot be found when I start to reproduce BERT-PAIR. Could you provide the code?


from transformers import BertTokenizer, BertModel, BertForMaskedLM, BertForSequenceClassification


Thanks~

The method for DA seems unable to work on other datasets.

Hello, I'm trying the DA method (the basic one, not BERT-PAIR) on SciERC, a dataset for scientific relation extraction, but the results are poor. I also ran it without the DA part and compared the two; the results are similar. I made the same comparison on other datasets (NYT, SemEval, SemEval-2018 Task 7) and got the same result. I want to know why the method gains a significant improvement (about 10%) on PubMed but does not work on other datasets.

Running test_demo.py encounters "CUDA out of memory"

I replaced the test dataset with the validation dataset and ran "python train_demo.py gnn" on GPU; training the GNN model finished. Then I ran "python test_demo.py" and got "CUDA out of memory". My GPU has 10 GB of memory.

Is there a discrepancy between paper and repository?

BERT-PAIR for Domain Adaptation (FewRel 2.0) achieves 67.41 for 5-way 1-shot in the repository and on the website, but in your paper BERT-PAIR achieves 56.25 for 5-way 1-shot. Is there a discrepancy between the paper and the repository?

Generating a dataset in your data format

Hi
I want to use your code for training and testing on my own data, but I don't know how to generate a dataset in the format of train_wiki.json or val_wiki.json from raw text. Is there any code available for this?
Thanks in advance
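
There does not seem to be a conversion script mentioned on this page, but a rough sketch follows, assuming the train_wiki.json format maps each relation ID to a list of instances with "tokens", "h" and "t" fields (entity name, Wikidata ID, and token-index spans); please verify the field layout against the actual train_wiki.json before using it.

import json

# Hypothetical example of one relation with one annotated instance.
# The field layout is an assumption based on train_wiki.json; the relation ID,
# entity IDs and output filename below are made up for illustration.
dataset = {
    "P931": [
        {
            "tokens": ["Newark", "Liberty", "serves", "New", "York", "City", "."],
            "h": ["newark liberty", "Q123456", [[0, 1]]],   # head: name, ID, token spans
            "t": ["new york city", "Q654321", [[3, 4, 5]]],  # tail: name, ID, token spans
        }
    ]
}

with open("data/train_mydata.json", "w", encoding="utf-8") as f:
    json.dump(dataset, f, ensure_ascii=False)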

Test data missing

Directory data/ does not contain test.json.

train_demo.py throws the following exception:

Traceback (most recent call last):
  File "train_demo.py", line 28, in <module>
    test_data_loader = JSONFileDataLoader('./data/test.json', './data/glove.6B.50d.json', max_length=max_length)
  File "FewRel/fewshot_re_kit/data_loader.py", line 98, in __init__
    raise Exception("[ERROR] Data file doesn't exist")
Exception: [ERROR] Data file doesn't exist

When I use the code I get an error, but I do not know where I went wrong

framework.train(model, prefix, batch_size, trainN, N, K, Q,
pytorch_optim=pytorch_optim, load_ckpt=opt.load_ckpt, save_ckpt=ckpt,
na_rate=opt.na_rate, val_step=opt.val_step, fp16=opt.fp16, pair=opt.pair,
train_iter=opt.train_iter, val_iter=opt.val_iter, bert_optim=bert_optim)

------>

logits, pred = model(support, query,
N_for_train, K, Q * N_for_train + na_rate * Q)

-------> def forward(self, input):
return F.conv1d(input, self.weight, self.bias, self.stride,
self.padding, self.dilation, self.groups)

Then I get the error: RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED.

My environment: Python 3.7, PyTorch 1.0.0, CUDA 9.0.176, cuDNN 7.5.

I would appreciate it very much if you could help me. Thanks!

Bert model on a single gpu

Hi!
With the default settings, the pair batch size is far too big for a single forward pass through the BERT model. How should the parameters be set to train, or at least evaluate, such a model? (BERT + pair)

Strange evaluation results

Hi, I'm using the built-in evaluation function, but I'm getting some strange results. All results seem better than what is reported in the paper. Is there anything we should pay attention to, such as the choice of Q and val_step, to get the same behavior as the official evaluation script on CodaLab?
Right now, for proto, I get 49.87% accuracy after 3000 steps on 5-way 1-shot, and at step 30000 it reports 67.98% eval accuracy.

PRETRAIN=bert-base-uncased
VAL=val_pubmed
python train_demo.py \
    --train train_wiki\
    --val $VAL\
    --test $VAL\
    --trainN 5 \
    --N 5 \
    --K 1 \
    --Q 1 \
    --model proto \
    --encoder bert \
    --hidden_size 768 \
    --val_iter 1000 \
    --val_step 500  \
    --batch_size 2 \
    --grad_iter 2 \
    --pretrain_ckpt pretrain/$PRETRAIN

What exactly do the relations mean?

Is there a file describing each relation class, for example that P177 is the relation between a bridge and the river it crosses? Also, some relations seem quite noisy.

Load data error

Hi,

I am trying to use the model on my self-generated data. Specifically, I generated a 5-way 1-shot dataset to use as train, validation, and test. However, when I trained the model, it gave me the following error:
python fewrel_demo.py --train train_AD --val val_AD --test test_AD --trainN 5 --N 5 --K 1 --Q 1 --model pair --encoder bert --pair --hidden_size 768 --val_step 1000 --batch_size 1

5-way-1-shot Few-Shot Relation Classification
model: pair
encoder: bert
max_length: 128
Start training...
Use bert optim!
Traceback (most recent call last):
  File "fewrel_demo.py", line 216, in <module>
    main()
  File "fewrel_demo.py", line 207, in main
    train_iter=opt.train_iter, val_iter=opt.val_iter, bert_optim=bert_optim)
  File "/proj/htzhu/users/yyang96/KG/FewRel-master/fewshot_re_kit/framework.py", line 195, in train
    batch, label = next(self.train_data_loader)
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/nas/longleaf/home/yyang96/.conda/envs/kg/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/proj/htzhu/users/yyang96/KG/FewRel-master/fewshot_re_kit/data_loader.py", line 161, in __getitem__
    self.K + self.Q, False)
  File "mtrand.pyx", line 819, in numpy.random.mtrand.RandomState.choice
ValueError: Cannot take a larger sample than population when 'replace=False'

Do you have any idea what is happening here? Thanks a lot for your support!
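
The ValueError at the bottom comes from numpy.random.choice drawing K + Q instances per relation without replacement (data_loader.py line 161 in the traceback), so every relation in a custom JSON file needs at least K + Q instances. A quick sanity check, as a sketch (the file name and K/Q values below are placeholders):

import json

K, Q = 1, 1  # match the --K and --Q values passed to train_demo.py

with open("data/train_AD.json", encoding="utf-8") as f:  # hypothetical custom file
    data = json.load(f)

for relation, instances in data.items():
    if len(instances) < K + Q:
        print(relation, "has only", len(instances), "instances; need at least", K + Q)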

In CNNSentenceEncoder the mask is all ones

After the preceding processing, the length of indexed_tokens is equal to self.max_length, so with mask[:len(indexed_tokens)] = 1 the mask ends up being all ones:

# padding
while len(indexed_tokens) < self.max_length:
       indexed_tokens.append(self.word2id['[PAD]'])
indexed_tokens = indexed_tokens[:self.max_length]
# mask
mask = np.zeros((self.max_length), dtype=np.int32)
mask[:len(indexed_tokens)] = 1

The mask didn't work

Hi!
I found that the mask operation in sentence_encoder.py doesn't work.
Maybe we should swap the two steps (build the mask first and then pad).
Right?
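
A minimal sketch of the reordering these two issues suggest: remember the real (unpadded) token count before padding, then build the mask from that count. This is a possible fix, not the repository's code:

# sketch of a possible fix inside the sentence encoder (not the official code):
# record the unpadded length first, then pad, then build the mask from it
length = min(len(indexed_tokens), self.max_length)

# padding
while len(indexed_tokens) < self.max_length:
    indexed_tokens.append(self.word2id['[PAD]'])
indexed_tokens = indexed_tokens[:self.max_length]

# mask: ones over real tokens only
mask = np.zeros((self.max_length), dtype=np.int32)
mask[:length] = 1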

Where are the relation names and descriptions?

Hello, when I searched for the relation P58 on Wikipedia I could not find it. Rereading the paper, Section 2.3 says that descriptions of all relations are given in Appendix A.2; where can I get Appendix A.2? Also, when searching for some relations, for example "P105": "OpenStreetMap Relation ID", "P14": "NTU", "P101": "ISO 3166-1 alpha-2 (country code)", "P1001": "DRqrykAn", the descriptions look strange and are unlike the common relations we usually expect (place of birth, capital, etc.). Could you explain this?

How to submit for evaluation?

Hello, I see that the test submission section of the FewRel homepage has changed into a competition format, but I do not understand how to submit a model for testing. If you have time, I would appreciate an explanation.

The BERT-PAIR model does not work properly

1. Could you provide a Docker image that can run BERT-PAIR normally? After a lot of attempts, including changing the torch version and modifying parameters, there are still many errors, all due to the environment. This also hurts people's motivation to do more research on your dataset.
2. Alternatively, could you tell me which PyTorch, Python and CUDA versions you used?
