The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'BertTokenizer'.
The class this function is called from is 'AlbertTokenizer'.
Traceback (most recent call last):
  File "/home/efsz/localCode/Pytorch-NLU/test/tc/tet_tc_base_multi_label.py", line 73, in <module>
    lc.process()
  File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcRun.py", line 32, in process
    self.corpus = Corpus(self.config, self.logger)
  File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcData.py", line 25, in __init__
    self.tokenizer = self.load_tokenizer(self.config)
  File "/home/efsz/localCode/Pytorch-NLU/pytorch_nlu/pytorch_textclassification/tcData.py", line 206, in load_tokenizer
    tokenizer = PRETRAINED_MODEL_CLASSES[config.model_type][1].from_pretrained(config.pretrained_model_name_or_path)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained
    return cls._from_pretrained(
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2017, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/transformers/models/albert/tokenization_albert.py", line 183, in __init__
    self.sp_model.Load(vocab_file)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/home/efsz/anaconda3/envs/nlu/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
TypeError: not a string
Other models seem to work fine, but loading the ALBERT tokenizer raises this error.
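
A possible reading of the log, not a confirmed diagnosis: the warning says the checkpoint declares 'BertTokenizer', while AlbertTokenizer looks for a SentencePiece file (its default file name is spiece.model). If the checkpoint ships no such file, vocab_file is probably None by the time it reaches sp_model.Load, which would explain "TypeError: not a string". A minimal workaround sketch follows, assuming the checkpoint is a local directory and that the declared class is stored under the "tokenizer_class" key of tokenizer_config.json or config.json. The helper names declared_tokenizer_class and load_tokenizer are hypothetical and are not part of Pytorch-NLU or transformers.

```python
import json
from pathlib import Path


def declared_tokenizer_class(checkpoint_dir):
    """Return the tokenizer class name the checkpoint declares, or None.

    Assumption: the checkpoint is a local directory and the class name,
    if present, is stored under the "tokenizer_class" key.
    """
    for file_name in ("tokenizer_config.json", "config.json"):
        path = Path(checkpoint_dir) / file_name
        if path.is_file():
            with path.open(encoding="utf-8") as f:
                declared = json.load(f).get("tokenizer_class")
            if declared:
                return declared
    return None


def has_sentencepiece_model(checkpoint_dir):
    """True if the directory contains spiece.model, the file AlbertTokenizer loads."""
    return (Path(checkpoint_dir) / "spiece.model").is_file()


def load_tokenizer(checkpoint_dir, default_cls_name="AlbertTokenizer"):
    """Hypothetical replacement for the lookup in tcData.load_tokenizer.

    Prefers the class the checkpoint declares over the one implied by
    config.model_type, and falls back to default_cls_name otherwise.
    """
    import transformers  # imported lazily so the helpers above need only the stdlib

    cls_name = declared_tokenizer_class(checkpoint_dir) or default_cls_name
    return getattr(transformers, cls_name).from_pretrained(checkpoint_dir)
```

With a checkpoint like the one in this report, declared_tokenizer_class would return 'BertTokenizer' and has_sentencepiece_model would return False, so load_tokenizer would go through BertTokenizer.from_pretrained instead of the ALBERT class.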