Knowledge triple extraction (entity and relation extraction) and knowledge base construction for open-domain text, based on dependency parsing.
Feel free to watch, star, or fork this repository.
Example sentence:
"**国家主席***访问韩国,并在首尔大学发表演讲"
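The following knowledge triples can be extracted from this sentence (roughly, "President *** of ** visited South Korea and delivered a speech at Seoul National University"):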
- (**, 国家主席, ***)
- (***, 访问, 韩国)
- (***, 发表演讲, 首尔大学)
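To illustrate how a dependency parse can yield triples like these, here is a minimal sketch, not the repository's actual implementation. It assumes the parse is given as one `(head, relation)` pair per token, with 1-based head indices and 0 for the root, and uses the LTP-style labels `SBV` (subject), `VOB` (object) and `HED` (root). The function name `extract_svo_triples` and the hand-written parse are hypothetical, and only the simple subject-verb-object pattern is covered.

```python
from collections import defaultdict


def extract_svo_triples(words, arcs):
    """Return (subject, predicate, object) triples from one parsed sentence.

    words: list of tokens.
    arcs:  list of (head, relation) pairs, one per token; head is a 1-based
           index into words, with 0 meaning the virtual root (assumed layout).
    """
    # Group each head token's dependents by relation label.
    children = defaultdict(lambda: defaultdict(list))
    for idx, (head, relation) in enumerate(arcs):
        if head > 0:
            children[head - 1][relation].append(idx)

    triples = []
    for head_idx, deps in children.items():
        # A head with both a subject (SBV) and an object (VOB) yields a triple.
        for subj in deps.get("SBV", []):
            for obj in deps.get("VOB", []):
                triples.append((words[subj], words[head_idx], words[obj]))
    return triples


# Hand-written parse of the fragment "***访问韩国" (an assumption, not parser output).
words = ["***", "访问", "韩国"]
arcs = [(2, "SBV"), (0, "HED"), (2, "VOB")]
print(extract_svo_triples(words, arcs))
```

Triples such as (**, 国家主席, ***) or (***, 发表演讲, 首尔大学) depend on other structures (attributive modifiers, coordinated verbs, prepositional phrases) that this sketch does not handle.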
knowledge_extraction/
|-- code/ # code directory
| |-- bean/
| |-- core/
| |-- demo/ # program entry point
| |-- tool/
|-- data/ # data directory
| |-- input_text.txt # input text file
| |-- knowledge_triple.json # output knowledge triples file
|-- model/ # LTP models; download ltp_data_v3.4.0.zip from http://ltp.ai/download.html
|-- resource/ # dictionary directory
|-- requirements.txt # Python dependencies
|-- README.md # project description
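The tree lists `data/knowledge_triple.json` as the output file, but its schema is not described here. As a rough sketch of how triples could be serialized, the following writes one JSON object per line; the keys `subject`, `relation` and `object` and the function name `triples_to_json_lines` are hypothetical and may not match what the project actually produces.

```python
import json


def triples_to_json_lines(triples):
    """Serialize (subject, relation, object) tuples as one JSON object per line."""
    lines = []
    for subject, relation, obj in triples:
        # Hypothetical keys; the repository's real output schema may differ.
        record = {"subject": subject, "relation": relation, "object": obj}
        # ensure_ascii=False keeps Chinese text readable instead of \u escapes.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


triples = [("***", "访问", "韩国"), ("***", "发表演讲", "首尔大学")]
print(triples_to_json_lines(triples))
```

When writing such output to disk, opening the file with `encoding="utf-8"` avoids platform-dependent encoding problems with the Chinese text.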
This repo was tested on Python 3.5+. The requirements are:
- jieba>=0.39
- pyltp>=0.2.1
To run the demo:
cd ./code/demo/
python extract_demo.py
If you use this code, please cite the following paper:
Jia S, Li M, Xiang Y. Chinese Open Relation Extraction and Knowledge Base Establishment[J]. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 2018, 17(3): 15.