
SemBERT: Semantics-aware BERT for Language Understanding

(2020/10/07) Update: Tips for possible issues

  1. SRL prediction mismatches the provided samples

POS tags differ slightly across spaCy versions. SemBERT used spacy==2.0.18 to obtain the verbs.

Refer to allenai/allennlp#3418, cooelf/SemBERT#12 (CHN).
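
As a quick sanity check, you can verify the installed spaCy version before generating tags. A minimal sketch (the version pin comes from the tip above):

import spacy

# The provided samples were tagged with spaCy 2.0.18; other versions may
# produce different POS tags and hence different extracted verbs.
assert spacy.__version__ == "2.0.18", f"found spaCy {spacy.__version__}, expected 2.0.18"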

  2. SRL is not a registered name for Model.

Please try pip install --pre allennlp-models

  3. Issues about AllenNLP

If you encounter issues with classes or variables in AllenNLP, please try a lower version, e.g., 0.8.1.

Our experiment environment for reference:

Python 3.6+
PyTorch 1.0.0
AllenNLP 0.8.1

=========================================

Code for the paper Semantics-aware BERT for Language Understanding, AAAI 2020.

Overview

Requirements

(Our experiment environment for reference)

Python 3.6+
PyTorch 1.0.0
AllenNLP 0.8.1

Datasets

GLUE data can be downloaded by running this script; unpack it to the directory glue_data. We provide an example data sample in glue_data/MNLI to show how SemBERT works.

Instructions

This repo provides an example implementation of SemBERT for NLU tasks. We use the pre-trained uncased BERT models, so do not forget to pass the parameter --do_lower_case.

The example scripts are as follows:

Train a model

Note: please replace the sample data with labeled data (use our labeled data or annotate your data following the instructions below).

CUDA_VISIBLE_DEVICES=0 \
python run_classifier.py \
--data_dir glue_data/SNLI/ \
--task_name snli \
--train_batch_size 32 \
--max_seq_length 128 \
--bert_model bert-wwm-uncased \
--learning_rate 2e-5 \
--num_train_epochs 2 \
--do_train \
--do_eval \
--do_lower_case \
--max_num_aspect 3 \
--output_dir glue/snli_model_dir

Evaluation

Both run_classifier.py and run_snli_predict.py can be used for evaluation, where the latter is simplified for easy deployment.

The major difference is that run_classifier.py takes labeled data as input, while run_snli_predict.py integrates real-time semantic role labeling and therefore works on the original raw data.

Evaluation using labeled data

CUDA_VISIBLE_DEVICES=0 \
python run_classifier.py \
--data_dir glue_data/SNLI/ \
--task_name snli \
--eval_batch_size 128 \
--max_seq_length 128 \
--bert_model bert-wwm-uncased \
--do_eval \
--do_lower_case \
--max_num_aspect 3 \
--output_dir glue/snli_model_dir

Evaluation using raw data (with real-time semantic role labeling)

Our trained SNLI model (reaching 91.9% test accuracy) can be accessed here:

https://drive.google.com/drive/folders/1Yn-WCw1RaMxbDDNZRnoJCIGxMSAOu20_?usp=sharing

To use our trained SNLI model, put the SNLI model and the SRL model into snli_model_dir and srl_model_dir, respectively.

As shown in our example SNLI model, the snli_model_dir folder should contain three files:

  • vocab.txt and bert_config.json, copied from the BERT model folder used to train your model;

  • pytorch_model.bin, the trained SNLI model.
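
Before running the command below, a quick sanity check of the folder layout may help (an illustrative sketch; the file names are the three listed above):

import os

required = ["vocab.txt", "bert_config.json", "pytorch_model.bin"]
missing = [f for f in required
           if not os.path.isfile(os.path.join("snli_model_dir", f))]
if missing:
    raise FileNotFoundError("snli_model_dir is missing: %s" % missing)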

CUDA_VISIBLE_DEVICES=0 \
python run_snli_predict.py \
--data_dir /share03/zhangzs/glue_data/SNLI \
--task_name snli \
--eval_batch_size 128 \
--max_seq_length 128 \
--max_num_aspect 3 \
--do_eval \
--do_lower_case \
--bert_model snli_model_dir \
--output_dir snli_model_dir \
--tagger_path srl_model_dir

For prediction, pass the flag --do_predict to either run_classifier.py or run_snli_predict.py. The output prediction file can be directly used for GLUE online submission and evaluation.

Data annotation (Semantic role labeling)

We provide two kinds of semantic labeling methods:

  • online: each word sequence is passed to the labeling module to obtain its tags, which can be used for online prediction. This can be time-consuming for a large corpus. See tag_model/tagging.py (a usage sketch follows this list).

    To use the online method, specify the --tagger_path parameter when running the script (e.g., run_snli_predict.py).

  • offline: the current default, which pre-processes the datasets and saves them for later loading during training and evaluation. See tag_model/tagger_offline.py

    Our labeled data can be downloaded here for quick start.

    Google Drive: https://drive.google.com/file/d/1B-_IRWRvR67eLdvT6bM0b2OiyvySkO-x/view?usp=sharing

    Baidu Cloud:

    Link: https://pan.baidu.com/s/1EduMJAfEXet_9yCfVob9qA Password: sl7l
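
For the online method, the tagging module wraps an AllenNLP predictor. Below is a minimal sketch of what tag_model/tagging.py does, assuming an AllenNLP 0.8.1-style Predictor API and the SRL archive in srl_model_dir:

from allennlp.predictors.predictor import Predictor

# Load the SRL archive; with AllenNLP >= 1.0 you may also need
# `import allennlp_models.structured_prediction` so the "srl" model name is registered.
predictor = Predictor.from_path("srl_model_dir")

# Each sentence is labeled independently; the result holds one entry per verb.
result = predictor.predict(sentence="The new rights are nice enough")
for verb in result["verbs"]:
    print(verb["verb"], verb["tags"])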

Note that this repo is based on the offline version, so the column indices in the data processor differ slightly from the original dataset format, as follows:

text_a = line[-3]
text_b = line[-2]
label = line[-1]

If you use the original data instead of data preprocessed by tag_model/tagger_offline.py, please modify the indices according to the dataset structure.
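
For illustration, a processor reading our preprocessed *_tag_label files might index the columns like this (a hypothetical sketch, not the exact code from run_classifier.py):

import csv

def read_tagged_tsv(path):
    # Yield (text_a, text_b, label) triples from an offline-tagged TSV file.
    with open(path, "r", encoding="utf-8") as f:
        for line in csv.reader(f, delimiter="\t"):
            text_a = line[-3]  # first sentence
            text_b = line[-2]  # second sentence
            label = line[-1]   # gold label
            yield text_a, text_b, label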

SRL model

The SRL model in this implementation is the ELMo-based SRL model from AllenNLP.

More recently, a BERT-based SRL model has been released, which is a nice alternative.

Reference

Please kindly cite this paper in your publications if it helps your research:

@inproceedings{zhang2020SemBERT,
  title={Semantics-aware {BERT} for language understanding},
  author={Zhang, Zhuosheng and Wu, Yuwei and Zhao, Hai and Li, Zuchao and Zhang, Shuailiang and Zhou, Xi and Zhou, Xiang},
  booktitle={the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-2020)},
  year={2020}
}


Issues

Error: online data annotation

Hello, I want to use bert_srl for SRL tagging, so I downloaded the archive from the link you provided. It contains 3 files:
config.json vocabulary/ weights.th
I put these 3 files in a folder named bert_srl, at the path xxx/xxx/bert_srl.

Finally, when running run_classifier.py I added --tagger_path xxx/xxx/bert_srl,
but I got this error:
allennlp.common.checks.ConfigurationError: srl_bert is not a registered name for Model. You probably..
What exactly should I do?

RuntimeError: size mismatch, m1: [32 x 78], m2: [778 x 778]

Hi, @cooelf. Thanks again for providing this really interesting work! I am trying to get the code working so I can ultimately adapt it for some other work, and I'm running into some trouble. Hope you can clarify.

I have cloned the repository to my server, set up a Py3.6 virtual environment, installed PyTorch (1.0.0), AllenNLP (0.8.1), and spacy==2.0.18 (also did pip install --pre allennlp-models), and downloaded the GLUE data. I then run the following:

CUDA_VISIBLE_DEVICES=0 \
python run_classifier.py \
--data_dir glue_data/MNLI \
--task_name MNLI \
--train_batch_size 32 \
--max_seq_length 128 \
--bert_model bert-base-uncased \
--learning_rate 2e-5 \
--num_train_epochs 2 \
--do_train \
--do_eval \
--do_lower_case \
--max_num_aspect 3 \
--output_dir glue/MNLI_model_dir

(You'll notice that I replaced 'bert-wwm-uncased' with 'bert-base-uncased' because I got an error when I tried to use 'bert-wwm-uncased'. I know #8 asks about this as well but even with Google Translate I was not able to figure out how to solve the issue with 'bert-wwm-uncased'.)

I get the following output.

Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  FutureWarning)
11/03/2021 12:21:35 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
11/03/2021 12:21:35 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/richier/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
Traceback (most recent call last):
  File "run_classifier.py", line 1237, in <module>
    main()
  File "run_classifier.py", line 857, in main
    train_examples = processor.get_train_examples(args.data_dir)
  File "run_classifier.py", line 408, in get_train_examples
    self._read_tsv(os.path.join(data_dir, "train.tsv_tag_label")), "train")
  File "run_classifier.py", line 91, in _read_tsv
    with open(input_file, "r", encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'glue_data/SNLI/train.tsv_tag_label'
(py36_sembert) [richier@reslnapollo02 SemBERT]$ bash train.sh 
Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex.
/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/sklearn/utils/linear_assignment_.py:22: FutureWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  FutureWarning)
11/03/2021 12:21:54 - INFO - __main__ -   device: cuda n_gpu: 1, distributed training: False, 16-bits training: False
11/03/2021 12:21:54 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt from cache at /home/richier/.pytorch_pretrained_bert/26bc1ad6c0ac742e9b52263248f6d0f00068293b33709fae12320c0e35ccfbbb.542ce4285a40d23a559526243235df47c5f75c197f04f37d1a0c124c32c9a084
{'contradiction': 0, 'entailment': 1, 'neutral': 2}
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-0
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] conceptual ##ly cream ski ##mming has two basic dimensions - product and geography . [SEP] product and geography are what make cream ski ##mming work . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 17158 2135 6949 8301 25057 2038 2048 3937 9646 1011 4031 1998 10505 1012 102 4031 1998 10505 2024 2054 2191 6949 8301 25057 2147 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: neutral (id = 2)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-1
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] you know during the season and i guess at at your level uh you lose them to the next level if if they decide to recall the the parent team the braves decide to call to recall a guy from triple a then a double a guy goes up to replace him and a single a guy goes up to replace him [SEP] you lose the things to the following level if the people recall . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 2017 2113 2076 1996 2161 1998 1045 3984 2012 2012 2115 2504 7910 2017 4558 2068 2000 1996 2279 2504 2065 2065 2027 5630 2000 9131 1996 1996 6687 2136 1996 13980 5630 2000 2655 2000 9131 1037 3124 2013 6420 1037 2059 1037 3313 1037 3124 3632 2039 2000 5672 2032 1998 1037 2309 1037 3124 3632 2039 2000 5672 2032 102 2017 4558 1996 2477 2000 1996 2206 2504 2065 1996 2111 9131 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-2
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] one of our number will carry out your instructions minute ##ly . [SEP] a member of my team will execute your orders with immense precision . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 2028 1997 2256 2193 2097 4287 2041 2115 8128 3371 2135 1012 102 1037 2266 1997 2026 2136 2097 15389 2115 4449 2007 14269 11718 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-3
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] how do you know ? all this is their information again . [SEP] this information belongs to them . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 2129 2079 2017 2113 1029 2035 2023 2003 2037 2592 2153 1012 102 2023 2592 7460 2000 2068 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:21:54 - INFO - __main__ -   *** Example ***
11/03/2021 12:21:54 - INFO - __main__ -   guid: train-4
11/03/2021 12:21:54 - INFO - __main__ -   tokens: [CLS] yeah i tell you what though if you go price some of those tennis shoes i can see why now you know they ' re getting up in the hundred dollar range [SEP] the tennis shoes have a range of prices . [SEP]
11/03/2021 12:21:54 - INFO - __main__ -   input_ids: 101 3398 1045 2425 2017 2054 2295 2065 2017 2175 3976 2070 1997 2216 5093 6007 1045 2064 2156 2339 2085 2017 2113 2027 1005 2128 2893 2039 1999 1996 3634 7922 2846 102 1996 5093 6007 2031 1037 2846 1997 7597 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:21:54 - INFO - __main__ -   label: neutral (id = 2)
tokenizer vocab size:  22
11/03/2021 12:21:54 - INFO - pytorch_pretrained_bert.modeling -   loading archive file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz from cache at /home/richier/.pytorch_pretrained_bert/distributed_-1/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba
11/03/2021 12:21:54 - INFO - pytorch_pretrained_bert.modeling -   extracting archive file /home/richier/.pytorch_pretrained_bert/distributed_-1/9c41111e2de84547a463fd39217199738d1e3deb72d4fec4399e6e241983c6f0.ae3cef932725ca7a30cdcb93fc6e09150a55e2a130ec7af63975a16c153ae2ba to temp dir /tmp/tmppi9y9ey2
11/03/2021 12:22:00 - INFO - pytorch_pretrained_bert.modeling -   Model config {
  "attention_probs_dropout_prob": 0.1,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "type_vocab_size": 2,
  "vocab_size": 30522
}

11/03/2021 12:22:06 - INFO - pytorch_pretrained_bert.modeling -   Weights of BertForSequenceClassificationTag not initialized from pretrained model: ['cnn.char_cnn.weight', 'cnn.char_cnn.bias', 'tag_model.embed.tag_embeddings.weight', 'tag_model.embed.LayerNorm.weight', 'tag_model.embed.LayerNorm.bias', 'tag_model.fc.weight', 'tag_model.fc.bias', 'dense.weight', 'dense.bias', 'pool.weight', 'pool.bias', 'classifier.weight', 'classifier.bias']
11/03/2021 12:22:06 - INFO - pytorch_pretrained_bert.modeling -   Weights from pretrained model not used in BertForSequenceClassificationTag: ['cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.LayerNorm.bias']
11/03/2021 12:22:10 - INFO - __main__ -   ***** Running training *****
11/03/2021 12:22:10 - INFO - __main__ -     Num examples = 51
11/03/2021 12:22:10 - INFO - __main__ -     Batch size = 32
11/03/2021 12:22:10 - INFO - __main__ -     Num steps = 2
{'contradiction': 0, 'entailment': 1, 'neutral': 2}
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-0
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] the new rights are nice enough [SEP] everyone really likes the newest benefits [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 1996 2047 2916 2024 3835 2438 102 3071 2428 7777 1996 14751 6666 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: neutral (id = 2)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-1
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] this site includes a list of all award winners and a search ##able database of government executive articles . [SEP] the government executive articles housed on the website are not able to be searched . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 2023 2609 2950 1037 2862 1997 2035 2400 4791 1998 1037 3945 3085 7809 1997 2231 3237 4790 1012 102 1996 2231 3237 4790 7431 2006 1996 4037 2024 2025 2583 2000 2022 9022 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: contradiction (id = 0)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-2
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] uh i do n ' t know i i have mixed emotions about him uh sometimes i like him but at the same times i love to see somebody beat him [SEP] i like him for the most part , but would still enjoy seeing someone beat him . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 7910 1045 2079 1050 1005 1056 2113 1045 1045 2031 3816 6699 2055 2032 7910 2823 1045 2066 2032 2021 2012 1996 2168 2335 1045 2293 2000 2156 8307 3786 2032 102 1045 2066 2032 2005 1996 2087 2112 1010 2021 2052 2145 5959 3773 2619 3786 2032 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: entailment (id = 1)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-3
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] yeah i i think my favorite restaurant is always been the one closest you know the closest as long as it ' s it meets the minimum criteria you know of good food [SEP] my favorite restaurants are always at least a hundred miles away from my house . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 3398 1045 1045 2228 2026 5440 4825 2003 2467 2042 1996 2028 7541 2017 2113 1996 7541 2004 2146 2004 2009 1005 1055 2009 6010 1996 6263 9181 2017 2113 1997 2204 2833 102 2026 5440 7884 2024 2467 2012 2560 1037 3634 2661 2185 2013 2026 2160 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: contradiction (id = 0)
11/03/2021 12:22:10 - INFO - __main__ -   *** Example ***
11/03/2021 12:22:10 - INFO - __main__ -   guid: dev_matched-4
11/03/2021 12:22:10 - INFO - __main__ -   tokens: [CLS] i do n ' t know um do you do a lot of camping [SEP] i know exactly . [SEP]
11/03/2021 12:22:10 - INFO - __main__ -   input_ids: 101 1045 2079 1050 1005 1056 2113 8529 2079 2017 2079 1037 2843 1997 13215 102 1045 2113 3599 1012 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11/03/2021 12:22:10 - INFO - __main__ -   label: contradiction (id = 0)
11/03/2021 12:22:10 - INFO - __main__ -   ***** Running evaluation *****
11/03/2021 12:22:10 - INFO - __main__ -     Num examples = 51
11/03/2021 12:22:10 - INFO - __main__ -     Batch size = 8
Iteration:   0%|                                                                                                                                                 | 0/2 [00:00<?, ?it/s]
Epoch:   0%|                                                                                                                                                     | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "run_classifier.py", line 1237, in <module>
    main()
  File "run_classifier.py", line 975, in main
    loss = model(input_ids, segment_ids, input_mask, start_end_idx, input_tag_ids,  label_ids)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/richier/SemBERT/pytorch_pretrained_bert/modeling.py", line 1026, in forward
    pooled_output = self.pool(first_token_tensor)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 67, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/richier/anaconda3/envs/py36_sembert/lib/python3.6/site-packages/torch/nn/functional.py", line 1352, in linear
    ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t())
RuntimeError: size mismatch, m1: [32 x 78], m2: [778 x 778] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:266

That seems to be saying that there's a mismatch in some matrix shapes. Any idea what's causing it?

srl is not a registered name for model

When I try to run "evaluation using raw data" using the given command (changing only the data_dir to /glue_data/SNLI), I get this error:

f"{name} is not a registered name for {cls.name}. "
allennlp.common.checks.ConfigurationError: srl is not a registered name for Model. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.

QA Model?

When will the QA model be available?

Pool function dimension mismatch

Dear author,

I encountered the same error as #25, which comes from this line of code:

pooled_output = self.pool(first_token_tensor)

The solution in #25 indeed resolves the error, but I am not sure whether it is correct. I found that changing dim=-1 to dim=1 in

first_token_tensor, pool_index = torch.max(sequence_output, dim=-1)

also solves the issue. Based on my understanding, it makes more sense to fuse the features with max() than to simply take the first feature.

So I just want to confirm with you whether this is a correct solution. Please advise. Thank you.
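
For reference, the two choices of dim behave quite differently; a small shape check with sizes matching the error above:

import torch

sequence_output = torch.randn(32, 78, 778)  # [batch, seq_len, hidden]

# dim=-1 reduces over the hidden dimension -> [32, 78]; a Linear(778, 778)
# pool layer rejects this shape (the size mismatch reported above).
pooled_bad, _ = torch.max(sequence_output, dim=-1)
print(pooled_bad.shape)   # torch.Size([32, 78])

# dim=1 reduces over the sequence dimension -> [32, 778], the shape
# the pool layer expects.
pooled_good, _ = torch.max(sequence_output, dim=1)
print(pooled_good.shape)  # torch.Size([32, 778])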

Dataset download failure

Hello, the data-download script you provided in the Datasets section cannot be accessed, so the datasets cannot be downloaded. What should I do?

srl is not a registered name for Model.

I put the extracted srl_model in srl_model_dir.

> ~/SemBERT/srl_model_dir$ ls
> config.json  files_to_archive.json  fta  vocabulary  weights.th

But I get the error mentioned below.

08/06/2020 18:50:49 - INFO - __main__ -   device: cpu n_gpu: 0, distributed training: False, 16-bits training: False
08/06/2020 18:50:49 - INFO - pytorch_pretrained_bert.tokenization -   loading vocabulary file snli_model_dir/vocab.txt
08/06/2020 18:50:49 - INFO - allennlp.models.archival -   loading archive file srl_model_dir
Traceback (most recent call last):
  File "run_snli_predict.py", line 598, in <module>
    main()
  File "run_snli_predict.py", line 464, in main
    srl_predictor = SRLPredictor(args.tagger_path)
  File "/home/akib/SemBERT/tag_model/tagging.py", line 7, in __init__
    self.predictor = Predictor.from_path(SRL_MODEL_PATH)
  File "/home/akib/.local/lib/python3.8/site-packages/allennlp/predictors/predictor.py", line 275, in from_path
    load_archive(archive_path, cuda_device=cuda_device),
  File "/home/akib/.local/lib/python3.8/site-packages/allennlp/models/archival.py", line 192, in load_archive
    model = Model.load(
  File "/home/akib/.local/lib/python3.8/site-packages/allennlp/models/model.py", line 391, in load
    model_class: Type[Model] = cls.by_name(model_type)  # type: ignore
  File "/home/akib/.local/lib/python3.8/site-packages/allennlp/common/registrable.py", line 137, in by_name
    subclass, constructor = cls.resolve_class_name(name)
  File "/home/akib/.local/lib/python3.8/site-packages/allennlp/common/registrable.py", line 184, in resolve_class_name
    raise ConfigurationError(
allennlp.common.checks.ConfigurationError: srl is not a registered name for Model. You probably need to use the --include-package flag to load your custom code. Alternatively, you can specify your choices using fully-qualified paths, e.g. {"model": "my_module.models.MyModel"} in which case they will be automatically imported correctly.

This is my file system tree for reference:

├── README.md
├── SemBERT.png
├── data_process
│   ├── __init__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-38.pyc
│   │   └── datasets.cpython-38.pyc
│   ├── datasets.py
│   └── util.py
├── glue_data
│   └── MNLI
│       ├── dev_matched.tsv_tag_label
│       ├── test_matched.tsv_tag_label
│       └── train.tsv_tag_label
├── output
├── pytorch_pretrained_bert
│   ├── __init__.py
│   ├── __main__.py
│   ├── __pycache__
│   │   ├── __init__.cpython-38.pyc
│   │   ├── file_utils.cpython-38.pyc
│   │   ├── modeling.cpython-38.pyc
│   │   ├── optimization.cpython-38.pyc
│   │   └── tokenization.cpython-38.pyc
│   ├── file_utils.py
│   ├── modeling.py
│   ├── optimization.py
│   └── tokenization.py
├── run_classifier.py
├── run_scorer.py
├── run_snli_predict.py
├── snli_model_dir
│   ├── bert_config.json
│   ├── pytorch_model.bin
│   └── vocab.txt
├── srl_model_dir
│   ├── config.json
│   ├── files_to_archive.json
│   ├── fta
│   │   ├── model.text_field_embedder.elmo.options_file
│   │   └── model.text_field_embedder.elmo.weight_file
│   ├── vocabulary
│   │   ├── labels.txt
│   │   ├── non_padded_namespaces.txt
│   │   └── tokens.txt
│   └── weights.th
└── tag_model
    ├── __pycache__
    │   ├── modeling.cpython-38.pyc
    │   ├── tag_tokenization.cpython-38.pyc
    │   └── tagging.cpython-38.pyc
    ├── modeling.py
    ├── tag_tokenization.py
    ├── tagger_offline.py
    └── tagging.py
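
With AllenNLP >= 1.0, the model classes live in the separate allennlp-models package and must be imported so they register themselves. A minimal sketch (the exact module path may vary across allennlp-models versions):

# Requires: pip install --pre allennlp-models
import allennlp_models.structured_prediction  # noqa: F401  (registers "srl"/"srl_bert")
from allennlp.predictors.predictor import Predictor

predictor = Predictor.from_path("srl_model_dir")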

Use pre-trained SemBERT as a sentence encoder?

Hi, thanks for writing this very interesting paper and for providing this nice GitHub repo. 😄 I was wondering if it's possible -- and more to the point, straightforward! -- to use a SemBERT that you have trained to just get a sentence representation, which I can then use in my own neural network for classification or regression? I have a classification (or maybe ordinal regression...) task for which SRL features might be useful over and above BERT.

Missing key(s) in state_dict: "bert_model.embeddings.position_ids".

Hello!
I'm trying to run run_snli_predict.py with the parameters suggested on your main page, with real-time semantic role labeling.
I downloaded all the necessary models provided on the main page, yet I received the following error. I'll appreciate any help.

Traceback (most recent call last):
  File "C:/Users/Sajemiur/SemBERT/run_snli_predict.py", line 598, in <module>
    main()
  File "C:/Users/Sajemiur/SemBERT/run_snli_predict.py", line 464, in main
    srl_predictor = SRLPredictor(args.tagger_path)
  File "C:\Users\Sajemiur\SemBERT\tag_model\tagging.py", line 7, in __init__
    self.predictor = Predictor.from_path(SRL_MODEL_PATH)
  File "C:\Users\Sajemiur\anaconda3\envs\transformers\lib\site-packages\allennlp\predictors\predictor.py", line 275, in from_path
    load_archive(archive_path, cuda_device=cuda_device),
  File "C:\Users\Sajemiur\anaconda3\envs\transformers\lib\site-packages\allennlp\models\archival.py", line 192, in load_archive
    model = Model.load(
  File "C:\Users\Sajemiur\anaconda3\envs\transformers\lib\site-packages\allennlp\models\model.py", line 398, in load
    return model_class._load(config, serialization_dir, weights_file, cuda_device, opt_level)
  File "C:\Users\Sajemiur\anaconda3\envs\transformers\lib\site-packages\allennlp\models\model.py", line 337, in _load
    model.load_state_dict(model_state)
  File "C:\Users\Sajemiur\anaconda3\envs\transformers\lib\site-packages\torch\nn\modules\module.py", line 846, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SrlBert:
	Missing key(s) in state_dict: "bert_model.embeddings.position_ids". 

Process finished with exit code 1

Python 3.8
Transformers 3.2
allennlp 1.0.0
allennlp-models 1.1.0rc4

Use your code for textual entailment

Hi, thanks for the code!

Can I somehow use your code to get textual entailment probabilities for pairs of sentences from my own dataset? If so, please give me some hints.

Cannot re-produce the result on SNLI

I have downloaded the pre-trained model from https://drive.google.com/open?id=1Yn-WCw1RaMxbDDNZRnoJCIGxMSAOu20_ and the SRL model from https://s3-us-west-2.amazonaws.com/allennlp/models/srl-model-2018.05.25.tar.gz, and tried to reproduce the experiment on the SNLI dataset with the following environment settings:
python 3.6
allennlp 0.8.1
torch 1.8.0+cu111

When the evaluation finished, I found that the test accuracy is only 0.8563 (= 8412 / 9824) and the dev accuracy is 0.8557, which are far lower than the results reported in the paper. Although different module versions might cause some performance decrease, is a drop of about 0.06 reasonable?

Interesting contribution

Your study is an interesting contribution.
I have a shallow question: have you used BERT only as a tokenizer at the subword level?
Anyway, since the source code will give more details, I'd appreciate it if you made it available for us.

Thank you for sharing your work.

About SQuAD task

Hi, I'd like to ask how you performed semantic role labeling on the SQuAD dataset.

I noticed that the GLUE baseline tasks use fairly short, single-sentence inputs. I tried calling AllenNLP's predictor.predict_batch_json(sentence); when the input consists of multiple sentences (e.g., an entire passage), the tags are predicted sentence by sentence, and the tags for the other sentences are all padded with 0. In that case, setting max_num_aspect=3 clearly cannot capture the semantics of all the sentences.

Why is max used to process sequence_output?

Around line 1129 of modeling.py:
#first_token_tensor = sequence_output[:, 0]
first_token_tensor, pool_index = torch.max(sequence_output, dim=1)

While debugging I saw that sequence_output is a tensor of shape [8, 46, 1034]. Why is max taken over dimension 1? Doesn't this mix the hidden features of different tokens?
Isn't the CLS token usually used to obtain the semantics of the whole sentence, as in the commented-out line?

Moreover, I reproduced the results with the current code and they are excellent, which puzzles me even more: why does it work so well without using CLS?
Thanks!
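
For comparison, the two pooling strategies discussed here look like this (an illustrative sketch using the shape from the issue):

import torch

sequence_output = torch.randn(8, 46, 1034)  # [batch, seq_len, hidden]

# CLS pooling: take the first token's hidden state as the sentence vector.
cls_pooled = sequence_output[:, 0]                 # [8, 1034]

# Max pooling: element-wise max over the token dimension; each hidden
# dimension may come from a different token.
max_pooled, _ = torch.max(sequence_output, dim=1)  # [8, 1034]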

Errors while trying to run the model

Hi,

I'm trying to run the model you published on GitHub, and on one machine I'm getting the following error:

RuntimeError: CUDA out of memory. Tried to allocate 90.00 MiB (GPU 0; 6.00 GiB total capacity; 4.17 GiB already allocated; 86.27 MiB free; 330.21 MiB cached)

I tried to reduce the batch size (even to 1) and set --max_seq_length to 10, and I still get this error, exactly after 9 epochs.

FYI: I'm running the following command:

python run_classifier.py --data_dir glue_data/MNLI/ --eval_batch_size 1 --max_seq_length 10 --bert_model bert-base-uncased --do_lower_case --task_name mnli --do_train --do_eval --do_predict --output_dir glue/base_mnli --learning_rate 3e-5 --num_train_epochs 200

on other machine i always get this error before training:

'NoneType' object has no attribute 'tokenize'
pytorch_pretrained_bert.tokenization - Model name 'bert-base-uncased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt' was a path or url but couldn't find any file associated to this path or url.

I guess you may have encountered these errors yourself, so I'm reaching out to see if you can help me solve them and run the model successfully.

I'll appreciate your help.

Thanks,
Shahar

Inconsistent AllenNLP SRL prediction results

Hello, when I run SRL prediction directly on the raw data, the results do not match the test samples you provided.

For example, for "The new rights are nice enough":

The provided test sample gives {"verbs": [{"verb": "are", "description": "[ARG1: The new rights] [V: are] [ARG2: nice enough]", "tags": ["B-ARG1", "I-ARG1", "I-ARG1", "B-V", "B-ARG2", "I-ARG2"]}], "words": ["The", "new", "rights", "are", "nice", "enough"]}

while AllenNLP predicts [{'verbs': [], 'words': ['The', 'new', 'rights', 'are', 'nice', 'enough']}]

Tested with allennlp 0.8.1 and allennlp-models 1.0.0;
also tested allennlp 1.0.0 with allennlp-models 1.0.0.

Question: the code does not seem to use a BiGRU to process the tag sequences

Hello, your paper mentions that the multiple SRL annotations for a single text are first mapped and then fed into a BiGRU layer. But looking at the open-source code, it seems no BiGRU is used; instead, the first 3 of the multiple annotations are selected directly.
May I ask the reason? Or have I misunderstood something? I'd appreciate your clarification, thanks.

Errors from allennlp

Dear developers,

Unfortunately I cannot get your code to work; none of the examples from the readme run. After solving other errors, I get stuck on the assertion error "OOV token not found":

assert self._oov_token in self._token_to_index[namespace], "OOV token not found!"
AssertionError: OOV token not found!

I tried changing the line endings, running the different examples in your readme, and installing different versions of allennlp. Do you have a solution for this problem?

Thank you for your time,

Roos
