Git Product home page Git Product logo

fine_tuning_bert_for_joint_entity_and_relation_extraction_in_chinese_medical_text's Introduction

Fine-tuning BERT for joint entity and relation extraction in chinese medical text

用法

下载预训练参数

google-research/bert下载BERT-Base, Chinese解压并放入pretrain_weights文件夹下

准备数据集

将关系抽取数据集转换成如下格式

[
    {
        "text": "左前降支近段管壁不规则,最狭窄处30-40%。",
        "entities": [
            [
                0,
                4,
                "左前降支",
                "body"
            ],
            [
                4,
                6,
                "近段",
                "location"
            ],
            [
                13,
                15,
                "狭窄",
                "grade"
            ],
            [
                16,
                22,
                "30-40%",
                "num"
            ]
        ],
        "relations": [
            [
                0,
                1,
                "locate_1_2"
            ],
            [
                0,
                2,
                "grade_of_2_1"
            ],
            [
                2,
                3,
                "num_of_1_2"
            ]
        ]
    },
    ...
]

再将数据集分割保存到

  • data_set/train.json
  • data_set/dev.json
  • data_set/test.json

修改config.py中的neid2type以及reid2type

# NER 表
neid2type = {
    0: 'null',
    1: 'O',
    2: 'B-xxxx',
    3: 'I-xxx',
    4: 'E-xxxx',
    5: 'S-xxxx',
    .......
}

# RE 关系表
# 0: 'null', (个人习惯)
# 1: 'no_relation',
reid2type = {
    0: 'null',
    1: 'no_relation',
    2: 'xxxxx_of_1_2',
    3: 'xxxxx_of_2_1',
    .........
}

安装conda依赖

conda env create -f environment.yml

激活环境并运行

conda activate tf1.12
python main.py

引用

@inproceedings{xue2019fine,
  title={Fine-tuning BERT for joint entity and relation extraction in chinese medical text},
  author={Xue, Kui and Zhou, Yangming and Ma, Zhiyuan and Ruan, Tong and Zhang, Huanhuan and He, Ping},
  booktitle={2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)},
  pages={892--897},
  year={2019},
  organization={IEEE}
}

鸣谢

参考了以下代码

google-research/bert

Separius/BERT-keras

fine_tuning_bert_for_joint_entity_and_relation_extraction_in_chinese_medical_text's People

Contributors

genggui001 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.