Comments (5)
1. Tokenize
Tokenize the sentence with the code below:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
tokenizer("姚明,男,汉族,1980年9月12日出生于上海市徐汇区。")
2. Read the OpenUE paper: https://aclanthology.org/2020.emnlp-demos.1/
The NER model outputs the BIEOS ids, and the SEQ model outputs relation ids. predicate_probabilities holds the logits over relation ids, and token_label_predictions holds the BIEOS ids for each token, including [CLS] and [SEP].
So if you want to use end-to-end relation extraction, you first need to get the relation types in the sentence, then pad the relation ids to the sentence and use the sentence with relation ids to get the NER result. Finally, combine the entities in the NER result with the relation types in the RE model's output.
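The last step (combining NER entities with a relation type) can be sketched roughly as follows. This is a minimal illustration, not OpenUE's own decoding code: the tag names `SUB` and `OBJ`, the relation name, and the helpers `bieos_to_spans` and `build_triples` are all assumptions made for the example.

```python
def bieos_to_spans(tags):
    """Collect (start, end_inclusive, type) spans from a BIEOS tag sequence."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, label = tag.partition("-")
        if prefix == "S":
            # Single-token entity
            spans.append((i, i, label))
            start, current = None, None
        elif prefix == "B":
            start, current = i, label
        elif prefix == "I" and current == label:
            continue
        elif prefix == "E" and current == label:
            spans.append((start, i, label))
            start, current = None, None
        else:
            # "O", special tokens such as [CLS]/[SEP], or an inconsistent tag:
            # drop any span that is still open
            start, current = None, None
    return spans


def build_triples(relation, tokens, tags):
    """Pair every subject span with every object span under one relation.

    Assumes the NER pass was run for this single relation and that the
    hypothetical tag types are "SUB" and "OBJ".
    """
    spans = bieos_to_spans(tags)
    subjects = ["".join(tokens[s:e + 1]) for s, e, t in spans if t == "SUB"]
    objects = ["".join(tokens[s:e + 1]) for s, e, t in spans if t == "OBJ"]
    return [(subj, relation, obj) for subj in subjects for obj in objects]
```

In a full pipeline this would run once per relation type predicted by the SEQ model, with the token and tag lists aligned after removing [CLS], [SEP], and padding.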
from openue.
Our code is based on the Huggingface Transformers example in the torchserve GitHub repo: https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers. You can find setup_config.json there.
You can also modify some hyperparameters in the JSON file.
{
"model_name":"bert-base-uncased",
"mode":"sequence_classification",
"do_lower_case":true,
"num_labels":"2",
"save_mode":"pretrained",
"max_length":"150",
"captum_explanation":true,
"embedding_name": "bert",
"FasterTransformer":false
}
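Note that `num_labels` and `max_length` are stored as strings in this file, so anything that reads it has to cast them to integers. A minimal sketch of doing so, where `load_setup_config` is a hypothetical helper written for this example and not part of torchserve:

```python
import json

# Same content as the setup_config.json shown above
SETUP_CONFIG = """
{
 "model_name": "bert-base-uncased",
 "mode": "sequence_classification",
 "do_lower_case": true,
 "num_labels": "2",
 "save_mode": "pretrained",
 "max_length": "150",
 "captum_explanation": true,
 "embedding_name": "bert",
 "FasterTransformer": false
}
"""


def load_setup_config(raw):
    """Parse the config text and cast the string-valued numeric fields to int."""
    config = json.loads(raw)
    for key in ("num_labels", "max_length"):
        config[key] = int(config[key])
    return config


config = load_setup_config(SETUP_CONFIG)
print(config["max_length"], config["num_labels"])
```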
from openue.
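Thanks! After adding setup_config.json and repackaging, the model now runs on torchserve.
Also, could you provide an example of the request parameters for calling the model once it is deployed on torchserve?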
POST http://localhost:3000/predictions/BERTForNER
{
"data":{
"body":"姚明,男,汉族,1980年9月12日出生于上海市徐汇区。"
}
}
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.6/logging/__init__.py", line 996, in emit
    stream.write(msg)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 34-41: ordinal not in range(128)
Call stack:
  File "/usr/local/lib/python3.6/dist-packages/ts/model_service_worker.py", line 183, in
    worker.run_server()
  File "/usr/local/lib/python3.6/dist-packages/ts/model_service_worker.py", line 155, in run_server
    self.handle_connection(cl_socket)
Invoking custom service failed.
  File "/usr/local/lib/python3.6/dist-packages/ts/model_service_worker.py", line 114, in handle_connection
Traceback (most recent call last):
    resp = service.predict(msg)
  File "/usr/local/lib/python3.6/dist-packages/ts/service.py", line 100, in predict
  File "/usr/local/lib/python3.6/dist-packages/ts/service.py", line 100, in predict
    ret = self._entry_point(input_batch, self.context)
    ret = self._entry_point(input_batch, self.context)
.wlm.WorkerThread - Backend response time: 5
  File "/usr/local/lib/python3.6/dist-packages/ts/torch_handler/base_handler.py", line 197, in handle
  File "/usr/local/lib/python3.6/dist-packages/ts/torch_handler/base_handler.py", line 197, in handle
    data_preprocess = self.preprocess(data)
    data_preprocess = self.preprocess(data)
  File "/home/model-server/tmp/models/de1f2c6c5bf9446f8d2d2668d4fbb68a/handler_ner.py", line 119, in preprocess
  File "/home/model-server/tmp/models/de1f2c6c5bf9446f8d2d2668d4fbb68a/handler_ner.py", line 124, in preprocess
    logger.info(f"Received text: {input_text}")
Message: "Received text: {'data': {'body': '\u59da\u660e\uff0c\u7537\uff0c\u6c49\u65cf\uff0c1980\u5e749\u670812\u
    input_ids=torch.tensor([_['input_ids'] for _ in total_inputs]).to(self.device),
Arguments: ()
  File "/home/model-server/tmp/models/de1f2c6c5bf9446f8d2d2668d4fbb68a/handler_ner.py", line 124, in
    input_ids=torch.tensor([_['input_ids'] for _ in total_inputs]).to(self.device),
from openue.
The API is determined by the handler_ner.py code. By default, the input JSON should be
{
"input_ids": List, # shape (128)
"attention_mask": List, # shape (128)
"token_type_ids": List # shape (128)
}
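One way to produce these three lists from raw text is to let the tokenizer pad and truncate to the fixed length. The sketch below assumes the `bert-base-chinese` tokenizer from the first comment and a length of 128; `build_payload`, `request_body`, and `load_tokenizer` are hypothetical helpers, and the key names should be checked against what handler_ner.py actually reads.

```python
import json

REQUEST_KEYS = ("input_ids", "attention_mask", "token_type_ids")


def build_payload(text, tokenizer, max_length=128):
    """Tokenize `text` and return the three fixed-length lists as a dict."""
    encoding = tokenizer(
        text,
        padding="max_length",  # pad shorter inputs up to max_length
        truncation=True,       # cut longer inputs down to max_length
        max_length=max_length,
    )
    return {key: list(encoding[key]) for key in REQUEST_KEYS}


def request_body(text, tokenizer, max_length=128):
    """Serialize the payload as the JSON body of the prediction request."""
    return json.dumps(build_payload(text, tokenizer, max_length))


def load_tokenizer(name="bert-base-chinese"):
    # Imported lazily so the helpers above can be used without transformers installed.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(name)
```

Calling `request_body("姚明,男,汉族,1980年9月12日出生于上海市徐汇区。", load_tokenizer())` gives a JSON string that can be sent as the POST body to the model's predictions endpoint.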
from openue.
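Thank you very much! It worked!
I still have two questions:
1. How do I convert the text to be predicted into the request parameters?
"姚明,男,汉族,1980年9月12日出生于上海市徐汇区。" -----> {"input_ids": List # shape (128), "attention_mask": List # shape (128), "token_type_ids": List # shape (128)}
2. How should the result returned by torchserve be interpreted?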
{
"outputs": {
"predicate_probabilities": [[
0.608515739440918,
0.6202450394630432,
0.7023037672042847,
-0.030435102060437202,
-0.10990876704454422,
]]
,
"token_label_predictions": [
[
[
-2.197793960571289,
-0.5554990768432617,
-0.7127630710601807,
-1.0187653303146362,
-1.2688184976577759,
1.81879723072052,
9.981047630310059,
-1.8538830280303955
]
]
]
}
The predicate_probabilities part is a list[list[]] structure,
and the token_label_predictions part is a list[list[list[]]] structure.
How do these map onto entities and relations?
Thanks! ✍
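A rough way to read the two fields, going by the maintainer's description earlier in the thread: predicate_probabilities is taken to hold one row of relation logits per sentence, and token_label_predictions one row of label scores per token per sentence. The sketch below assumes the relation head is multi-label (sigmoid with a 0.5 threshold) and that the token label is the argmax of each row; both are assumptions, and turning the resulting ids into names requires the relation and label lists used in training, which are not shown here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_relations(predicate_logits, threshold=0.5):
    """Per sentence, return the relation ids whose sigmoid score exceeds the threshold."""
    return [
        [i for i, logit in enumerate(row) if sigmoid(logit) > threshold]
        for row in predicate_logits
    ]


def decode_token_labels(token_logits):
    """Per sentence, return the argmax label id for every token."""
    return [
        [max(range(len(scores)), key=scores.__getitem__) for scores in sentence]
        for sentence in token_logits
    ]
```

The outer list in both fields is the batch dimension, so a single-sentence request gives one inner entry in each. The per-token label ids can then be mapped to BIEOS tags and grouped into entity spans for each predicted relation.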
from openue.