zjunlp / openue

[EMNLP 2020] OpenUE: An Open Toolkit of Universal Extraction from Text

Home Page: http://openue.zjukg.org

License: MIT License

Python 75.85% Shell 2.57% Jupyter Notebook 21.58%
triple-extraction relation-extraction named-entity-recognition event-extraction intent-classification slot-filling nlp-extraction-tasks openue nlp pytorch

openue's Issues

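Loss does not change

Hello, I ran the two PyTorch scripts, run_ner.sh and run_seq.sh, and found that the loss does not change in either of them; by the end of training the evaluation metrics are all close to 0. Is some setting wrong?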

More Documentation in English

Hey, first of all, great work. Loved your work in this field.
I think Event Extraction is not working properly on the demo page (http://openue.top/index.html). Can you have a look at it?
Another request is to prepare the documentation in English. A lot of people will have questions about the dataset used, the entities the model is trained on, the pretrained model's F-scores, and so on.

Whenever I hover over the knowledge graph on the demo page, it shows something in Chinese that I cannot understand. Can you have a look at it? Secondly, how about showing the predicate instead of the predicate code?

run_seq.sh error: Can't pickle local object 'get_linear_schedule_with_warmup.<locals>.lr_lambda'

Traceback (most recent call last):
  File "main.py", line 146, in <module>
    main()
  File "main.py", line 117, in main
    if not test_only: trainer.fit(lit_model, datamodule=data)
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 458, in fit
    self._run(model)
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 756, in _run
    self.dispatch()
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 797, in dispatch
    self.accelerator.start_training(self)
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/pytorch_lightning/plugins/training_type/ddp_spawn.py", line 122, in start_training
    mp.spawn(self.new_process, **self.mp_spawn_kwargs)
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/cephfs/xqwang/openue/openue/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 179, in start_processes
    process.start()
  File "/root/miniconda3/lib/python3.7/multiprocessing/process.py", line 112, in start
    self._popen = self._Popen(self)
  File "/root/miniconda3/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/root/miniconda3/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/root/miniconda3/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/root/miniconda3/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
    reduction.dump(process_obj, fp)
  File "/root/miniconda3/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'get_linear_schedule_with_warmup.<locals>.lr_lambda'

What is causing this? I hit the same problem with run_ner.sh. All dependencies were installed according to requirements.
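The traceback passes through ddp_spawn.py and multiprocessing's spawn start method, which has to pickle everything it hands to the worker processes. A function defined inside another function (here the `lr_lambda` closure) cannot be pickled by name, which is most likely what triggers the error. The sketch below reproduces the pattern with the standard library only; `make_closure_schedule`, `LinearWarmupSchedule`, and `is_picklable` are hypothetical names for illustration and are not part of OpenUE or transformers.

```python
import pickle


def make_closure_schedule(warmup_steps, total_steps):
    """Mimic the failing pattern: the schedule is a function nested in another function."""
    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return lr_lambda


class LinearWarmupSchedule:
    """The same linear warmup/decay rule as a module-level callable class."""

    def __init__(self, warmup_steps, total_steps):
        self.warmup_steps = warmup_steps
        self.total_steps = total_steps

    def __call__(self, step):
        if step < self.warmup_steps:
            return step / max(1, self.warmup_steps)
        return max(0.0, (self.total_steps - step) / max(1, self.total_steps - self.warmup_steps))


def is_picklable(obj):
    """Return True if pickle can serialize obj, False if it refuses."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


if __name__ == "__main__":
    # The nested function is rejected; the class instance is looked up by name instead.
    print("closure picklable:", is_picklable(make_closure_schedule(10, 100)))
    print("class picklable:", is_picklable(LinearWarmupSchedule(10, 100)))
```

A callable object like this can be passed wherever a plain `lr_lambda` function is expected. Whether that is the right fix here, or whether a training strategy that does not spawn processes is preferable, depends on the setup and is not confirmed by the thread.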

ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data'

Traceback (most recent call last):
  File "main.py", line 7, in <module>
    import pytorch_lightning as pl
  File "/home/users/peikaili/.conda/envs/openue/lib/python3.8/site-packages/pytorch_lightning/__init__.py", line 20, in <module>
    from pytorch_lightning import metrics  # noqa: E402
  File "/home/users/peikaili/.conda/envs/openue/lib/python3.8/site-packages/pytorch_lightning/metrics/__init__.py", line 15, in <module>
    from pytorch_lightning.metrics.classification import (  # noqa: F401
  File "/home/users/peikaili/.conda/envs/openue/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/__init__.py", line 14, in <module>
    from pytorch_lightning.metrics.classification.accuracy import Accuracy  # noqa: F401
  File "/home/users/peikaili/.conda/envs/openue/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/accuracy.py", line 18, in <module>
    from pytorch_lightning.metrics.utils import deprecated_metrics
  File "/home/users/peikaili/.conda/envs/openue/lib/python3.8/site-packages/pytorch_lightning/metrics/utils.py", line 22, in <module>
    from torchmetrics.utilities.data import get_num_classes as _get_num_classes
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' 

I just followed the README and got this error.
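An ImportError raised inside pytorch_lightning while it imports from torchmetrics usually points to a version mismatch between the two packages. The thread does not say which versions work together, so the sketch below only reports what is installed; the package list is an assumption, and `installed_version` and `report_versions` are hypothetical helper names.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("pytorch-lightning", "torchmetrics", "torch", "transformers")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing this output with the versions pinned in the project's requirements file is a reasonable first step before changing any code.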

Missing data

Hi

Great work! I wanted to try out the model, but got stuck while running the first step:

  1. Data Preprocessing. Put the pretrained language model (e.g., BERT) in the pretrained_model folder and put all raw data (run script download_ske.sh in the benchmark folder) in the raw_data folder.

I couldn't find the benchmark folder or the script download_ske.sh.

I also could not find these step 4 scripts in any folder:
sh export_seq.sh ske
sh serving_cls.sh ske

Can you help me?

About the training corpus

Thank you for your work. Which corpora were used for training? Is the model trained on open-source corpora?

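Running run_ner/seq.sh does nothing and does not download the corresponding ./dataset

Hello! Thank you very much for releasing and maintaining this project.

I ran into two problems while trying to run it:

  1. On 64-bit Windows 10 with PyCharm, after installing the dependencies and other files, running run_ner/seq does nothing: the corresponding ./dataset is not downloaded and no directories are created.

  2. In Colab, after training with OpenUE_demo.ipynb, I tried to use interactive.sh under scripts and got this error: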

Using /usr/local/lib/python3.7/dist-packages
Finished processing dependencies for openue==0.2.5
01/23/2022 12:58:52 - INFO - openue.data.data_module - add total special tokens: 50
['[relation0]', '[relation1]', '[relation2]', '[relation3]', '[relation4]', '[relation5]', '[relation6]', '[relation7]', '[relation8]', '[relation9]', '[relation10]', '[relation11]', '[relation12]', '[relation13]', '[relation14]', '[relation15]', '[relation16]', '[relation17]', '[relation18]', '[relation19]', '[relation20]', '[relation21]', '[relation22]', '[relation23]', '[relation24]', '[relation25]', '[relation26]', '[relation27]', '[relation28]', '[relation29]', '[relation30]', '[relation31]', '[relation32]', '[relation33]', '[relation34]', '[relation35]', '[relation36]', '[relation37]', '[relation38]', '[relation39]', '[relation40]', '[relation41]', '[relation42]', '[relation43]', '[relation44]', '[relation45]', '[relation46]', '[relation47]', '[relation48]', '[relation49]']
Traceback (most recent call last):
  File "main.py", line 146, in <module>
    main()
  File "main.py", line 94, in main
    lit_model = litmodel_class(args=args, data_config=data.get_config())
  File "/usr/local/lib/python3.7/dist-packages/openue/lit_models/transformer.py", line 166, in __init__
    self._init_model()
  File "/usr/local/lib/python3.7/dist-packages/openue/lit_models/transformer.py", line 175, in _init_model
    self.model = Inference(self.args)
  File "/usr/local/lib/python3.7/dist-packages/openue/models/model.py", line 145, in __init__
    self._init_models()
  File "/usr/local/lib/python3.7/dist-packages/openue/models/model.py", line 174, in _init_models
    label2id={label: i for i, label in enumerate(self.labels_seq)},
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/auto/configuration_auto.py", line 582, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py", line 556, in get_config_dict
    local_files_only=local_files_only,
  File "/usr/local/lib/python3.7/dist-packages/transformers/configuration_utils.py", line 844, in get_configuration_file
    path_or_repo, revision=revision, use_auth_token=use_auth_token, local_files_only=local_files_only
  File "/usr/local/lib/python3.7/dist-packages/transformers/file_utils.py", line 2103, in get_list_of_files
    return list_repo_files(path_or_repo, revision=revision, token=token)
  File "/usr/local/lib/python3.7/dist-packages/huggingface_hub/hf_api.py", line 885, in list_repo_files
    repo_id, revision=revision, token=token, timeout=timeout
  File "/usr/local/lib/python3.7/dist-packages/huggingface_hub/hf_api.py", line 868, in model_info
    r.raise_for_status()
  File "/usr/local/lib/python3.7/dist-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/api/models/output/ske/seq/epoch=2-Eval

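  3. After training finishes, how should I extract a knowledge graph from a given input text (taking the Colab demo as an example)? Should I use lit_model.inference(inputs)?

Have a nice day!
I have only just started working in this area, so I apologize for the many questions and am very grateful for your help.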
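The 404 above comes from a request to huggingface.co for a name that looks like a local output path. This suggests the checkpoint directory did not exist at the expected location, so the name was treated as a remote model id. A minimal sketch of a pre-flight check, assuming the checkpoint is a directory that contains a config.json; `require_local_checkpoint` is a hypothetical helper, not an OpenUE or transformers function.

```python
from pathlib import Path


def require_local_checkpoint(path, config_name="config.json"):
    """Return the resolved checkpoint directory, or raise if it is unusable.

    Failing early with a clear message avoids a confusing network error
    when a mistyped local path is interpreted as a remote model name.
    """
    checkpoint = Path(path).expanduser().resolve()
    if not checkpoint.is_dir():
        raise FileNotFoundError(f"Checkpoint directory not found: {checkpoint}")
    if not (checkpoint / config_name).is_file():
        raise FileNotFoundError(f"{config_name} not found in: {checkpoint}")
    return checkpoint
```

Calling this on the path passed to the script, before any model is loaded, shows whether the problem is the working directory or a checkpoint name that differs from what training actually wrote.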

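Dataset format question

Is the format of the English dataset the same as that of the Chinese dataset? Could you provide an English dataset?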

Cannot find setup_config.json

    # read configs for the mode, model_name, etc. from setup_config.json
    setup_config_path = os.path.join(model_dir, "setup_config.json")
    if os.path.isfile(setup_config_path):
        with open(setup_config_path) as setup_config_file:
            self.setup_config = json.load(setup_config_file)
    else:
        logger.warning("Missing the setup_config.json file.")

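The model folder produced by running run_ner.sh does not contain a setup_config.json file. After packaging the model as a .mar archive and deploying it, I get the error: Missing the setup_config.json file.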
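Since the handler snippet only looks for setup_config.json inside the model directory, one possible workaround is to write that file before packaging. The sketch below is an assumption-heavy illustration: the key names `mode` and `model_name` come from the comment in the snippet, while the values and the helper `write_setup_config` are hypothetical placeholders, not settings confirmed by the project.

```python
import json
from pathlib import Path

# Hypothetical placeholder values; only the key names are hinted at by the handler comment.
EXAMPLE_SETUP_CONFIG = {
    "mode": "REPLACE_WITH_MODE",
    "model_name": "REPLACE_WITH_MODEL_NAME",
}


def write_setup_config(model_dir, config):
    """Write config as setup_config.json inside model_dir and return the file path."""
    path = Path(model_dir) / "setup_config.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path
```

The handler logs a warning rather than raising when the file is absent, so whether this file is actually required depends on how the rest of the handler uses `self.setup_config`.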

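Some problems when running on Google Colab

Hello, where is the final model saved? I trained it on Google Colab.
I also found that the Google Colab notebook seems to be unfinished: I wrote an entity recognition module following ske.ipynb, but got the error shown in the attached image.
Could you provide a more complete version?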

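Question about the dataset format for event extraction

As the title says: I saw in an earlier issue that OpenUE can be used for event extraction, and that the DuEE1.0 dataset was offered as a reference format. Can DuEE1.0 be used directly as model input, or does it need to be modified? Its format differs from the triple extraction dataset provided in the demo.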

AttributeError: type object 'Trainer' has no attribute 'add_argparse_args'

Running ./scripts/run_ner.sh
gives this error:
Traceback (most recent call last):
  File "main.py", line 146, in <module>
    main()
  File "main.py", line 73, in main
    parser = _setup_parser()
  File "main.py", line 33, in _setup_parser
    trainer_parser = pl.Trainer.add_argparse_args(parser)
AttributeError: type object 'Trainer' has no attribute 'add_argparse_args'
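This error typically appears when the installed PyTorch Lightning is newer than the release main.py was written for; `Trainer.add_argparse_args` is not available in the 2.x series. Installing the version pinned by the project is the simplest route. If the script has to run on a newer release, the trainer options can be declared by hand, as in the sketch below. The three options shown are an arbitrary assumption, and `build_parser` and `trainer_kwargs` are hypothetical names, not code from main.py.

```python
import argparse

# Options forwarded to the trainer; extend this tuple as needed.
TRAINER_KEYS = ("max_epochs", "accelerator", "devices")


def build_parser():
    """Declare trainer options explicitly instead of relying on add_argparse_args."""
    parser = argparse.ArgumentParser(description="Manual trainer argument setup (sketch)")
    parser.add_argument("--max_epochs", type=int, default=1)
    parser.add_argument("--accelerator", type=str, default="auto")
    parser.add_argument("--devices", type=str, default="auto")
    return parser


def trainer_kwargs(args):
    """Collect only the trainer-related options from the parsed namespace."""
    return {key: getattr(args, key) for key in TRAINER_KEYS}


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(trainer_kwargs(args))
    # With PyTorch Lightning installed, the dict would be used as:
    #     trainer = pl.Trainer(**trainer_kwargs(args))
```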

The following problem occurred during joint entity and relation extraction!

INFO:tensorflow: name = predicate_output_weights:0, shape = (49, 768)
INFO:tensorflow: name = predicate_output_bias:0, shape = (49,)
INFO:tensorflow: name = token_label_output_weights:0, shape = (10, 768)
INFO:tensorflow: name = token_label_output_bias:0, shape = (10,)
INFO:tensorflow:Error recorded from evaluation_loop: Values of eval_metric_ops must be (metric_value, update_op) tuples, given: Tensor("ArgMax:0", shape=(?,), dtype=int32) for key: predicate_prediction
INFO:tensorflow:evaluation_loop marked as finished
WARNING:tensorflow:Reraising captured error
Traceback (most recent call last):
  File "run_sequnce_labeling.py", line 883, in <module>
    tf.app.run()
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "run_sequnce_labeling.py", line 814, in main
    result = estimator.evaluate(input_fn=eval_input_fn, steps=eval_steps)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2424, in evaluate
    rendezvous.raise_errors()
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/error_handling.py", line 128, in raise_errors
    six.reraise(typ, value, traceback)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2418, in evaluate
    name=name
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 478, in evaluate
    return _evaluate()
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 460, in _evaluate
    self._evaluate_build_graph(input_fn, hooks, checkpoint_path))
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1484, in _evaluate_build_graph
    self._call_model_fn_eval(input_fn, self.config))
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1520, in _call_model_fn_eval
    features, labels, model_fn_lib.ModeKeys.EVAL, config)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2195, in _call_model_fn
    features, labels, mode, config)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/estimator/estimator.py", line 1195, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 2479, in _model_fn
    features, labels, is_export_mode=is_export_mode)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1259, in call_without_tpu
    return self._call_model_fn(features, labels, is_export_mode=is_export_mode)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 1538, in _call_model_fn
    return estimator_spec.as_estimator_spec()
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py", line 330, in as_estimator_spec
    prediction_hooks=self.prediction_hooks + hooks)
  File "/home/fanyongfeng/.conda/envs/ner-rc/lib/python3.6/site-packages/tensorflow/python/estimator/model_fn.py", line 236, in __new__
    'tuples, given: {} for key: {}'.format(value, key))
TypeError: Values of eval_metric_ops must be (metric_value, update_op) tuples, given: Tensor("ArgMax:0", shape=(?,), dtype=int32) for key: predicate_prediction

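The command I used was sh train_seq.sh ske.
All environment problems have been resolved, but this error appears after running the training command.
If you have a working, debugged version of the code, please email it to me: [email protected]. Thank you.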