Comments (5)

CheaSim commented on June 7, 2024

1. Tokenize

Use the code below:

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
tokenizer("姚明,男,汉族,1980年9月12日出生于上海市徐汇区。")

2. Read the OpenUE paper: https://aclanthology.org/2020.emnlp-demos.1/

The NER model outputs BIEOS tag ids, and the SEQ model outputs relation ids.
predicate_probabilities holds the logits over the relation ids, and token_label_predictions holds the BIEOS label predictions for each token, including [CLS] and [SEP].
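
As a rough illustration of how these two outputs might be read, the sketch below applies a sigmoid with a 0.5 threshold to the relation logits and an argmax to each token's label scores. The multi-label reading of the relation head, the threshold, and both helper names are assumptions made for illustration; they are not taken from the OpenUE code.

```python
import math
from typing import List


def select_relation_ids(relation_logits: List[float], threshold: float = 0.5) -> List[int]:
    """Return indices whose sigmoid(logit) exceeds the threshold (assumed multi-label head)."""
    return [i for i, logit in enumerate(relation_logits)
            if 1.0 / (1.0 + math.exp(-logit)) > threshold]


def token_label_ids(token_logits: List[List[float]]) -> List[int]:
    """Return the argmax label id for every token position, including [CLS] and [SEP]."""
    return [max(range(len(row)), key=row.__getitem__) for row in token_logits]


# Values copied from the (already truncated) sample response posted in this thread.
relation_logits = [0.608515739440918, 0.6202450394630432, 0.7023037672042847,
                   -0.030435102060437202, -0.10990876704454422]
first_token_logits = [-2.197793960571289, -0.5554990768432617, -0.7127630710601807,
                      -1.0187653303146362, -1.2688184976577759, 1.81879723072052,
                      9.981047630310059, -1.8538830280303955]

print(select_relation_ids(relation_logits))   # [0, 1, 2]
print(token_label_ids([first_token_logits]))  # [6]
```

The resulting ids still have to be mapped to relation names and BIEOS tags with the label lists used at training time, which are not shown in this thread.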

So for end-to-end relation extraction, you first predict the relation types in the sentence, then append the relation ids to the sentence and feed that combined input to the NER model. Finally, combine the entities from the NER result with the relation types from the RE model's output.
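
The final combining step could look roughly like the sketch below, which turns a BIEOS tag sequence into entity spans and pairs subjects with objects under one predicted relation. The tag names (SUB, OBJ), the relation name, and the helper bieos_to_spans are hypothetical; the real label set comes from the trained model.

```python
from typing import List, Tuple


def bieos_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Collect (entity_type, start, end) spans, end inclusive, from BIEOS tags."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "S":                          # single-token entity
            spans.append((etype, i, i))
            start, current = None, None
        elif prefix == "B":                        # begin a new entity
            start, current = i, etype
        elif prefix == "I" and current == etype:   # inside the open entity
            continue
        elif prefix == "E" and current == etype:   # end of the open entity
            spans.append((etype, start, i))
            start, current = None, None
        else:                                      # "O" or an inconsistent tag drops any open span
            start, current = None, None
    return spans


# Hypothetical tags for a short input; position 0 is [CLS], the last position is [SEP].
tags = ["O", "B-SUB", "E-SUB", "O", "B-OBJ", "I-OBJ", "E-OBJ", "O"]
relation = "birthplace"  # hypothetical relation chosen by the SEQ model

spans = bieos_to_spans(tags)
subjects = [(s, e) for t, s, e in spans if t == "SUB"]
objects = [(s, e) for t, s, e in spans if t == "OBJ"]
triples = [(sub, relation, obj) for sub in subjects for obj in objects]
print(triples)  # [((1, 2), 'birthplace', (4, 6))]
```

The spans are token indices, so they still need to be mapped back to characters through the tokenizer before they can be shown as text.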

from openue.

CheaSim commented on June 7, 2024

Our code is based on the TorchServe example in the pytorch/serve GitHub repo: https://github.com/pytorch/serve/tree/master/examples/Huggingface_Transformers. You can find setup_config.json there.
You can also modify some hyperparameters in that JSON file.

{
 "model_name":"bert-base-uncased",
 "mode":"sequence_classification",
 "do_lower_case":true,
 "num_labels":"2",
 "save_mode":"pretrained",
 "max_length":"150",
 "captum_explanation":true,
 "embedding_name": "bert",
 "FasterTransformer":false
}

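roar090 commented on June 7, 2024

Thanks! After adding setup_config.json and repackaging, the model now runs on TorchServe.

Also, could you provide an example of the request parameters to use when calling the model once it is deployed on TorchServe?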

POST http://localhost:3000/predictions/BERTForNER
{
"data":{
"body":"姚明,男,汉族,1980年9月12日出生于上海市徐汇区。"
}
}

--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.6/logging/__init__.py", line 996, in emit
    stream.write(msg)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 34-41: ordinal not in range(128)
Call stack:
  File "/usr/local/lib/python3.6/dist-packages/ts/model_service_worker.py", line 183, in
    worker.run_server()
  File "/usr/local/lib/python3.6/dist-packages/ts/model_service_worker.py", line 155, in run_server
    self.handle_connection(cl_socket)
  File "/usr/local/lib/python3.6/dist-packages/ts/model_service_worker.py", line 114, in handle_connection
    resp = service.predict(msg)
  File "/usr/local/lib/python3.6/dist-packages/ts/service.py", line 100, in predict
    ret = self._entry_point(input_batch, self.context)
  File "/usr/local/lib/python3.6/dist-packages/ts/torch_handler/base_handler.py", line 197, in handle
    data_preprocess = self.preprocess(data)
  File "/home/model-server/tmp/models/de1f2c6c5bf9446f8d2d2668d4fbb68a/handler_ner.py", line 119, in preprocess
    logger.info(f"Received text: {input_text}")
Message: "Received text: {'data': {'body': '\u59da\u660e\uff0c\u7537\uff0c\u6c49\u65cf\uff0c1980\u5e749\u670812\u
Arguments: ()

.wlm.WorkerThread - Backend response time: 5

Invoking custom service failed.
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/ts/service.py", line 100, in predict
    ret = self._entry_point(input_batch, self.context)
  File "/usr/local/lib/python3.6/dist-packages/ts/torch_handler/base_handler.py", line 197, in handle
    data_preprocess = self.preprocess(data)
  File "/home/model-server/tmp/models/de1f2c6c5bf9446f8d2d2668d4fbb68a/handler_ner.py", line 124, in preprocess
    input_ids=torch.tensor([_['input_ids'] for _ in total_inputs]).to(self.device),
  File "/home/model-server/tmp/models/de1f2c6c5bf9446f8d2d2668d4fbb68a/handler_ner.py", line 124, in
    input_ids=torch.tensor([_['input_ids'] for _ in total_inputs]).to(self.device),

CheaSim commented on June 7, 2024

The API is determined by the handler code (handler_ner.py). By default, the input JSON should be:

{
"input_ids": List,  # shape (128)
"attention_mask": List,  # shape (128)
"token_type_ids": List  # shape (128)
}
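
One way to get from raw text to such a payload is sketched below, assuming the handler expects the standard Hugging Face key names and flat lists of length 128. The helpers pad_encoding and encode_text are illustrative names, not part of OpenUE; whether the handler wants extra nesting or batching is not confirmed here.

```python
import json
from typing import Dict, List

MAX_LENGTH = 128  # taken from the "shape (128)" comments above


def pad_encoding(encoding: Dict[str, List[int]], max_length: int = MAX_LENGTH) -> Dict[str, List[int]]:
    """Right-pad each field with 0 (BERT's [PAD] id and mask value), cutting anything longer."""
    keys = ("input_ids", "attention_mask", "token_type_ids")
    return {k: list(encoding[k])[:max_length] + [0] * max(0, max_length - len(encoding[k]))
            for k in keys}


def encode_text(text: str, max_length: int = MAX_LENGTH) -> Dict[str, List[int]]:
    """Tokenize with bert-base-chinese; needs `transformers` and downloads the vocabulary."""
    from transformers import AutoTokenizer  # lazy import keeps pad_encoding dependency-free
    tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
    # truncation=True keeps the trailing [SEP] for long inputs, unlike plain slicing
    encoded = tokenizer(text, truncation=True, max_length=max_length)
    return pad_encoding(dict(encoded), max_length)


# Demonstration with made-up token ids, so it runs without downloading a model.
fake = {"input_ids": [101, 2001, 3209, 102],
        "attention_mask": [1, 1, 1, 1],
        "token_type_ids": [0, 0, 0, 0]}
payload = pad_encoding(fake)
body = json.dumps(payload)  # JSON string to send as the POST body
print(len(payload["input_ids"]), payload["attention_mask"][:6])  # 128 [1, 1, 1, 1, 0, 0]
```

With transformers installed, encode_text("姚明,男,汉族,1980年9月12日出生于上海市徐汇区。") would produce the same three fields from real text, and the resulting JSON can be posted to the prediction endpoint.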

roar090 commented on June 7, 2024

Thank you very much! It worked!!
I still have two questions:

1. How do I convert the text to be predicted into the request parameters?

"姚明,男,汉族,1980年9月12日出生于上海市徐汇区。" -----> {"input_ids": List # shape (128), "attention_mask": List # shape (128), "token_type_ids": List # shape (128)}

2. How do the results returned by TorchServe map to the predictions?

{
  "outputs": {
    "predicate_probabilities": [[
      0.608515739440918,
      0.6202450394630432,
      0.7023037672042847,
      -0.030435102060437202,
      -0.10990876704454422
    ]],
    "token_label_predictions": [
      [
        [
          -2.197793960571289,
          -0.5554990768432617,
          -0.7127630710601807,
          -1.0187653303146362,
          -1.2688184976577759,
          1.81879723072052,
          9.981047630310059,
          -1.8538830280303955
        ]
      ]
    ]
  }
}

The predicate_probabilities part has a list[list[]] structure,
and the token_label_predictions part has a list[list[list[]]] structure.
How do these map to entities and relations?

Thanks! ✍
