Comments (4)
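1. The data samples are annotated with the BMES scheme, but the decoding function bioes_decode(...) is named after the BIOES scheme. Is this just loose naming, or are the annotation standards inconsistent?
2. The last line of main.py, bertForNer.predict(raw_text, model_path), calls bioes_decode() inside predict to decode the output. The argument passed in makes decode_tokens[index_] a tensor, which cannot be compared with 0 using ==, so an error is raised. I am not sure whether this is a mistake in my debugging or a problem in the code.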
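First question: nor_ent2d.json in mid_data is already annotated in BIOES format. Where did you see BMES?
Second question:
    if self.args.use_crf:
        output = logits
    else:
        output = logits.detach().cpu().numpy()
        output = np.argmax(output, axis=2)
If CRF is not used, the value passed to the decoder has already been converted to a numpy array of tag ids. If CRF is used, check what type the variable has at output = logits.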
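As a rough illustration of why the decoder needs one integer tag id per token rather than raw logits, the sketch below mimics the non-CRF branch in plain Python. The helper argmax_last_axis and the toy numbers are assumptions made for this example; the repository itself calls np.argmax(output, axis=2).

```python
def argmax_last_axis(logits):
    """Pure-Python stand-in for np.argmax(logits, axis=2).

    Expects a nested list shaped [batch][seq_len][num_tags].
    """
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in sequence]
        for sequence in logits
    ]


# Toy logits (assumed values): 1 sentence, 4 positions
# ([CLS], two tokens, [SEP]), 3 possible tags.
logits = [[
    [2.0, 0.1, -1.0],
    [0.2, 3.5, 0.0],
    [-0.5, 0.3, 1.7],
    [4.0, -2.0, 0.5],
]]
tokens = ["a", "b"]

output = argmax_last_axis(logits)
# Skip the [CLS] position and keep exactly one id per token,
# mirroring output[0][1:1 + len(tokens)] in predict.
tag_ids = output[0][1:1 + len(tokens)]
print(tag_ids)  # [1, 2]
```

Each element of tag_ids is a plain int, so a comparison such as tag_ids[0] == 0 yields a single boolean. If the argmax step is skipped, each element is still a vector of per-tag scores, which is the situation that leads to the ambiguous-boolean error.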
from pytorch_bert_bilstm_crf_ner.
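Thanks for the reply.
First question: the files train.char.bmes, dev.char.bmes and test.char.bmes under data/raw_data appear to be in BMES format.
Second question: I am running the project with its current data and source code, with lstm set to true and crf set to false, and I hit the error. Tracing the last statement shows that the value being compared is a tensor: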
tensor([ 1.2309e+00, 6.4990e-04, -1.0493e+00, -4.7460e-01, -1.6203e+00,
-3.2032e-01, 2.4996e-01, -1.7915e+00, -2.1812e+00, -7.5134e-01,
-1.8038e+00, -1.7972e+00, -2.1244e+00, -2.9960e-01, -1.9316e+00,
-1.2641e+00, -1.1646e+00, 6.9202e+00, 2.2951e-01, 4.4400e-01,
2.5265e+00, -1.3150e+00, -2.2979e+00, -1.0872e+00, -2.8844e+00,
-1.0658e+00, -6.3236e-01, -1.4389e+00, -1.4511e+00, -7.5626e-01,
-1.5736e+00, -1.5487e+00, -1.4611e+00], device='cuda:0')
Traceback (most recent call last):
  File "pytorch_bert_bilstm_crf_ner-main/pytorch_bert_bilstm_crf_ner-main/main.py", line 259, in <module>
    bertForNer.predict(raw_text, model_path)
  File "pytorch_bert_bilstm_crf_ner-main/pytorch_bert_bilstm_crf_ner-main/main.py", line 158, in predict
    pred_entities = decodeUtils.bioes_decode(output[0][1:1 + len(tokens)], "".join(tokens), self.idx2tag)
  File "pytorch_bert_bilstm_crf_ner-main\pytorch_bert_bilstm_crf_ner-main\utils\decodeUtils.py", line 106, in bioes_decode
    if decode_tokens[index_]==0:
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
I do not know where the problem is. Any guidance would be appreciated.
from pytorch_bert_bilstm_crf_ner.
(1) preprocess.py contains this section:
    if ent_start == ent_end:
        label_ids[ent_start] = ent2id['S-' + ent_type]
    else:
        label_ids[ent_start] = ent2id['B-' + ent_type]
        label_ids[ent_end] = ent2id['E-' + ent_type]
        for i in range(ent_start + 1, ent_end):
            label_ids[i] = ent2id['I-' + ent_type]
which performs the conversion to BIOES labels.
(2) In predict, change if self.args.use_crf: to if self.args.use_crf == 'True': because use_crf arrives as a string, and any non-empty string is truthy.
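For point (1), the relationship between the two schemes can be sketched as a simple tag rename. This is only an illustration: bmes_to_bioes is a hypothetical helper and the NAME and ORG entity types are assumed sample values, while the repository performs the conversion from entity spans in preprocess.py as quoted above.

```python
def bmes_to_bioes(tags):
    """Rename BMES tags to BIOES tags.

    Only the "middle" prefix differs: M- in BMES corresponds to I- in BIOES.
    B-, E-, S- and O mean the same thing in both schemes.
    """
    converted = []
    for tag in tags:
        if tag.startswith("M-"):
            converted.append("I-" + tag[2:])
        else:
            converted.append(tag)
    return converted


# Assumed sample: a three-character NAME entity, one O, one single-character ORG.
bmes = ["B-NAME", "M-NAME", "E-NAME", "O", "S-ORG"]
bioes = bmes_to_bioes(bmes)
print(bioes)  # ['B-NAME', 'I-NAME', 'E-NAME', 'O', 'S-ORG']
```

Under this mapping the raw .bmes files and a decoder named bioes_decode are compatible, since the label ids the model is trained on are already in BIOES form.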
from pytorch_bert_bilstm_crf_ner.
Thank you very much. I did overlook that crf is a str. It works now, thanks.
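The pitfall here is that a non-empty string such as 'False' is truthy in Python. An alternative to comparing against 'True' at every call site is to parse the flag into a real bool once. The sketch below assumes the option is read with argparse and is named --use_crf; str2bool is a hypothetical helper, not part of the repository.

```python
import argparse


def str2bool(value):
    """Parse common textual spellings of a boolean command-line flag."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes", "y"):
        return True
    if lowered in ("false", "0", "no", "n"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


# Any non-empty string is truthy, so `if args.use_crf:` takes the CRF
# branch even when the flag was given as the string 'False'.
assert bool("False") is True

parser = argparse.ArgumentParser()
parser.add_argument("--use_crf", type=str2bool, default=False)
args = parser.parse_args(["--use_crf", "False"])
use_crf = args.use_crf
print(use_crf)  # False
```

With the flag parsed this way, the original condition if self.args.use_crf: behaves as intended without a string comparison.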
from pytorch_bert_bilstm_crf_ner.