I was trying to run NER for en and de, training and testing on both languages with mBERT-cased (bert-base-multilingual-cased).
I set $LANG to 'en,de' and passed it to both train_langs and predict_langs. Training and evaluation work fine, but in the test step I get the same error as in #16.
06/04/2020 20:42:13 - INFO - transformers.modeling_utils - loading weights file /content/xtreme-dev/outputs-temp//panx/bert-base-multilingual-cased-LR2e-5-epoch-MaxLen128/checkpoint-best/pytorch_model.bin
06/04/2020 20:42:18 - INFO - __main__ - all languages = en
06/04/2020 20:42:18 - INFO - __main__ - Creating features from dataset file at /content/xtreme-dev/download//panx/panx_processed_maxlen128/en/test.bert-base-multilingual-cased in language en
06/04/2020 20:42:18 - INFO - utils_tag - lang_id=0, lang=en, lang2id=None
06/04/2020 20:42:18 - INFO - utils_tag - Writing example 0 of 10634
06/04/2020 20:42:18 - INFO - utils_tag - *** Example ***
06/04/2020 20:42:18 - INFO - utils_tag - guid: en-1
06/04/2020 20:42:18 - INFO - utils_tag - tokens: [CLS] Shortly after ##ward , an en ##cou ##raging response influenced him to go to India ; he arrived at Ad ##yar in 1884 . [SEP]
06/04/2020 20:42:18 - INFO - utils_tag - input_ids: 101 50752 10662 16988 117 10151 10110 30656 108545 21001 31377 10957 10114 11783 10114 11098 132 10261 22584 10160 25474 22953 10106 13366 119 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
06/04/2020 20:42:18 - INFO - utils_tag - input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
06/04/2020 20:42:18 - INFO - utils_tag - segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
06/04/2020 20:42:18 - INFO - utils_tag - label_ids: -100 6 6 -100 6 6 6 -100 -100 6 6 6 6 6 6 6 6 6 6 6 6 -100 6 6 6 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100 -100
06/04/2020 20:42:18 - INFO - utils_tag - langs: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
example.langs [] [] 0
ex_index 1 10634
Traceback (most recent call last):
File "/content/xtreme-dev/third_party/run_tag.py", line 698, in <module>
main()
File "/content/xtreme-dev/third_party/run_tag.py", line 637, in main
result, predictions = evaluate(args, model, tokenizer, labels, pad_token_label_id, mode="test", lang=lang, lang2id=lang2id)
File "/content/xtreme-dev/third_party/run_tag.py", line 247, in evaluate
eval_dataset = load_and_cache_examples(args, tokenizer, labels, pad_token_label_id, mode=mode, lang=lang, lang2id=lang2id)
File "/content/xtreme-dev/third_party/run_tag.py", line 358, in load_and_cache_examples
lang=lg
File "/content/xtreme-dev/third_party/utils_tag.py", line 218, in convert_examples_to_features
assert len(langs) == max_seq_length
TypeError: object of type 'NoneType' has no len()
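
Reading the traceback, `langs` appears to be `None` when the assert runs for the second example (the debug line just before it prints `example.langs` as empty), so `len(langs)` raises before the comparison is made. The sketch below reproduces that failure in isolation and shows one possible guard. It is an assumption about the failure mode, not the actual code in `utils_tag.py`: the helper `build_lang_ids` and its parameters are hypothetical names used only for illustration.

```python
from typing import List, Optional


def build_lang_ids(example_langs: Optional[List[int]],
                   lang_id: int,
                   max_seq_length: int) -> List[int]:
    """Hypothetical helper: always return exactly max_seq_length language ids."""
    if not example_langs:
        # None or an empty list: use the single language id for every position.
        return [lang_id] * max_seq_length
    # Truncate to the maximum length, then pad the remainder with lang_id.
    padded = list(example_langs[:max_seq_length])
    padded += [lang_id] * (max_seq_length - len(padded))
    return padded


# Reproduce the error from the last traceback frame: len() on None.
try:
    len(None)
except TypeError as err:
    message = str(err)

# With the guard, an empty or missing per-example list still passes the
# length check that the assert in the traceback performs.
langs = build_lang_ids([], lang_id=0, max_seq_length=128)
assert len(langs) == 128
```

Whether padding with a single language id is the right behavior for multi-language prediction is a separate question. The sketch only shows that the length check can be satisfied when the per-example list is missing.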