nereval

Evaluation script for named entity recognition (NER) systems based on entity-level F1 score.

Definition

The metric implemented here was described by Nadeau and Sekine (2007) and was widely used in the Message Understanding Conferences (Grishman and Sundheim, 1996). It evaluates an NER system along two axes: whether it assigns the right type to an entity, and whether it finds the exact entity boundaries. For both axes, three counts are computed: the number of correct predictions (COR), the number of predictions the system actually made (ACT), and the number of possible predictions, i.e. the entities in the ground truth (POS). From these counts, precision and recall are derived:

precision = COR/ACT
recall = COR/POS

The final score is the micro-averaged F1 measure: the COR, ACT and POS counts of the type and boundary axes are pooled, and F1 is the harmonic mean of the resulting precision and recall.
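As a worked illustration of the formulas above, the following sketch derives precision, recall and F1 from pooled counts. The function name `f1_from_counts`, its zero-division handling and the counts passed in are assumptions made for this example; they are not part of the nereval API.

```python
def f1_from_counts(cor, act, pos):
    """F1 from COR, ACT and POS counts pooled over the type and boundary axes."""
    # Returning 0.0 on empty denominators is an assumption of this sketch.
    precision = cor / act if act else 0.0
    recall = cor / pos if pos else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 1 correct, 4 actual, 2 possible.
# precision = 1/4, recall = 1/2, so F1 = 1/3.
score = f1_from_counts(cor=1, act=4, pos=2)
print('F1-score: %.2f' % score)  # F1-score: 0.33
```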

Installation

pip install nereval

Usage

The script can be used either from within Python or, once the classification results have been written to a JSON file, from the command line.

Usage from Command Line

Assume we have the following classification results in input.json:

[
  {
    "text": "CILINDRISCHE PLUG",
    "true": [
      {
        "text": "CILINDRISCHE PLUG",
        "type": "Productname",
        "start": 0
      }
    ],
    "predicted": [
      {
        "text": "CILINDRISCHE",
        "type": "Productname",
        "start": 0
      },
      {
        "text": "PLUG",
        "type": "Productname",
        "start": 13
      }
    ]
  }
]

Then the script can be executed as follows:

python nereval.py input.json
F1-score: 0.33
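A file in this format could be produced with Python's standard `json` module. The sketch below uses only the field names shown in the example above and assumes that `start` is a character offset into `text`, which is consistent with that example. Writing to a temporary directory is a choice made for this illustration.

```python
import json
import os
import tempfile

results = [
    {
        "text": "CILINDRISCHE PLUG",
        "true": [
            {"text": "CILINDRISCHE PLUG", "type": "Productname", "start": 0},
        ],
        "predicted": [
            {"text": "CILINDRISCHE", "type": "Productname", "start": 0},
            {"text": "PLUG", "type": "Productname", "start": 13},
        ],
    }
]

# Written to a temporary directory here; any path works in practice.
path = os.path.join(tempfile.mkdtemp(), "input.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

# Sanity check (assumption: "start" is a character offset into "text").
with open(path, encoding="utf-8") as f:
    for doc in json.load(f):
        for ent in doc["true"] + doc["predicted"]:
            end = ent["start"] + len(ent["text"])
            assert doc["text"][ent["start"]:end] == ent["text"]
```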

Usage from Python

Alternatively, the evaluation metric can be invoked directly from within Python. Example:

import nereval
from nereval import Entity

# Ground-truth:
# CILINDRISCHE PLUG
# B_PROD       I_PROD
y_true = [
    Entity('CILINDRISCHE PLUG', 'Productname', 0)
]

# Prediction:
# CILINDRISCHE PLUG
# B_PROD       B_PROD
y_pred = [
    # correct type, wrong text
    Entity('CILINDRISCHE', 'Productname', 0),
    # correct type, wrong text
    Entity('PLUG', 'Productname', 13)
]

score = nereval.evaluate([y_true], [y_pred])
print('F1-score: %.2f' % score)
# Output: F1-score: 0.33

Note on Symmetry

The metric is not symmetric because of the inherent problem of word overlaps in NER: in general, evaluate(y_true, y_pred) != evaluate(y_pred, y_true). This becomes apparent in the following example (the tagger uses a BIO scheme):

# Example 1:
Input:     CILINDRISCHE PLUG     DIN908  M10X1   Foo
Truth:     B_PROD       I_PROD   B_PROD  B_DIM   O
Predicted: B_PROD       B_PROD   B_PROD  B_PROD  B_PROD

Correct Text: 2
Correct Type: 2

# Example 2 (truth and prediction swapped):
Input:     CILINDRISCHE PLUG     DIN908  M10X1   Foo
Truth:     B_PROD       B_PROD   B_PROD  B_PROD  B_PROD
Predicted: B_PROD       I_PROD   B_PROD  B_DIM   O

Correct Text: 2
Correct Type: 3
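The counts in both examples can be reproduced with a small stand-alone sketch. It is not the package's source, and its matching rules are assumptions: a text match requires identical text and start offset, a type match requires the same type and overlapping spans, and the search for a true entity stops at the first prediction that matches on either axis. `Span`, `overlaps`, `count_correct` and `f1` are hypothetical names, and offsets assume the input tokens are separated by single spaces.

```python
from collections import namedtuple

# Stand-in for an entity (hypothetical; not nereval.Entity).
Span = namedtuple('Span', ['text', 'type', 'start'])


def overlaps(a, b):
    """True if the character ranges of a and b intersect."""
    return a.start < b.start + len(b.text) and b.start < a.start + len(a.text)


def count_correct(truth, predicted):
    """Return (correct_text, correct_type) under the assumed matching rules."""
    n_text = n_type = 0
    for t in truth:
        for p in predicted:
            text_ok = t.text == p.text and t.start == p.start
            type_ok = t.type == p.type and overlaps(t, p)
            n_text += text_ok
            n_type += type_ok
            if text_ok or type_ok:
                break  # assumption: each true entity is matched at most once
    return n_text, n_type


def f1(truth, predicted):
    cor = sum(count_correct(truth, predicted))
    act, pos = 2 * len(predicted), 2 * len(truth)  # two axes per entity
    p = cor / act if act else 0.0
    r = cor / pos if pos else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


# Input: "CILINDRISCHE PLUG DIN908 M10X1 Foo"
merged = [Span('CILINDRISCHE PLUG', 'PROD', 0), Span('DIN908', 'PROD', 18),
          Span('M10X1', 'DIM', 25)]
split = [Span('CILINDRISCHE', 'PROD', 0), Span('PLUG', 'PROD', 13),
         Span('DIN908', 'PROD', 18), Span('M10X1', 'PROD', 25),
         Span('Foo', 'PROD', 31)]

print(count_correct(merged, split))  # Example 1: (2, 2)
print(count_correct(split, merged))  # Example 2: (2, 3)
print('%.3f %.3f' % (f1(merged, split), f1(split, merged)))  # 0.500 0.625
```

Under these assumptions the two directions give different scores because a long true entity can be credited at most once, whereas each of two short true entities can be credited against the same long prediction.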

Notes and References

Used in a student research project on natural language processing at the University of Twente, Netherlands.

nereval's Issues

How to distinguish the four ways

In your blog, you said that SemEval defines four ways to evaluate (type, strict, exact and partial), but I could not figure out how to use them correctly. Looking forward to your reply!
