finn-wsd-eval's Introduction

Evaluation framework

Setup

The evaluation framework is distributed together with all systems and requirements as a Docker image, which can be pulled like so:

$ docker pull registry.gitlab.com/frankier/stiff:latest

And run like so:

$ docker run -v /path/to/working/dir/:/work/ registry.gitlab.com/frankier/stiff:latest python eval.py /work/results.json /work/eurosense.eval/

For CUDA-accelerated experiments, you can use nvidia-docker. For running in shared computing environments where you don't have root access, I recommend udocker.

You can also set up requirements manually. See the Dockerfile for the list of commands to run.

Evaluation corpus

The one requirement which is not included in the Docker image is the evaluation corpus. Please follow the instructions in the STIFF README.

Filtering experiments with eval.py

You can run a subset of experiments by passing filters to eval.py, e.g.

$ python eval.py /work/results.json /work/eurosense.eval/ Knowledge 'Cross-lingual Lesk' mean=pre_sif_mean

Making tables with table.py

You can make LaTeX tables with table.py, e.g.

$ python table.py results.json --filter='Knowledge;Cross-lingual Lesk' --table='use_freq;vec:fasttext,numberbatch,double;mean expand;wn_filter'

Licenses

This project is licensed under the Apache v2 license. The code in ukb-eval is vendorized from UKB, and therefore licensed under the GPL. The scorer in support/scorer is under an unknown license, possibly public domain.

See also

  • STIFF: Automatically created sense tagged corpus of Finnish and corpus wrangling tools.
  • STIFF-explore: Some exploratory coding related to STIFF.
  • finn-man-ann: Small, Finnish language, manually annotated word sense corpus.
  • FinnTK: Simple, high-level toolkit for Finnish NLP, mainly providing convenience methods for, and gluing together, other tools.
  • extjwnl_fiwn: Java code to make extjwnl interoperate with FinnWordNet.
  • FinnLink: Link between FinnWordNet and Finnish Propbank created by joining with PredicateMatrix.
  • finn-sense-clust: Sense clusterings of FinnWordNet.

Forks/fixes

  • ItMakesSense: ItMakesSense fork to support FiWN for use by finn-wsd-eval
  • AutoExtend: AutoExtend fork to support FiWN and ConceptNet Numberbatch
  • babelnet-lookup: babelnet-lookup fork to obtain BABEL2WN_MAP.
  • FinnWordNet: Temporary fixes to FinnWordNet 2.0.
  • Eurosense: Attempted fixes to Eurosense.

finn-wsd-eval's People

Contributors

frankier


finn-wsd-eval's Issues

Allow experiments to fail

Experiments should fail at the appropriate level of granularity (but traceback should still be printed to stderr)

Can use more of a frame derived clustering with multiple synsets than just deleting the synsets entirely

E.g. if we have

laskea.02,00948071-v
laskea.02,00712556-v
laskea.02,02731632-v
laskea.02,00685081-v
laskea.03,02645839-v
laskea.03,02731632-v
laskea.03,00685081-v
laskea.03,00950431-v
laskea.06,01938426-v

we get a valid clustering by deleting the duplicates as we do now

laskea.02,00948071-v
laskea.02,00712556-v
laskea.03,02645839-v
laskea.03,00950431-v
laskea.06,01938426-v

but we could get an additional clustering by deleting the rest of the contents of the affected clusters and merging them

laskea.02or03,02731632-v
laskea.02or03,00685081-v
laskea.06,01938426-v

Alternatively, a more principled approach could be to take all possible choices of cluster for each duplicated synset, convert them to graphs, find and remove contradictions, then find cliques. I think this is a bit different, since it might refuse to say whether 02731632-v and 00685081-v are in the same cluster, and so would generate two clusterings:

laskea.02or03,02731632-v
laskea.06,01938426-v

and

laskea.02or03,00685081-v
laskea.06,01938426-v
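
The first two clusterings in this issue (deleting duplicated synsets, and merging the affected clusters down to their shared synsets) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the function names `frames_by_synset`, `drop_duplicates` and `merge_affected` are hypothetical, the input is assumed to be a list of (frame, synset) pairs in file order, and the merge labels each shared synset by the frames it appears under rather than computing connected components, so it would need extending if overlaps chained across more than two frames. The clique-based variant is not covered.

```python
from collections import defaultdict

# Example (frame, synset) pairs copied from the first listing in this issue.
PAIRS = [
    ("laskea.02", "00948071-v"),
    ("laskea.02", "00712556-v"),
    ("laskea.02", "02731632-v"),
    ("laskea.02", "00685081-v"),
    ("laskea.03", "02645839-v"),
    ("laskea.03", "02731632-v"),
    ("laskea.03", "00685081-v"),
    ("laskea.03", "00950431-v"),
    ("laskea.06", "01938426-v"),
]


def frames_by_synset(pairs):
    """Map each synset to the frames it appears under, in input order."""
    frames = defaultdict(list)
    for frame, synset in pairs:
        if frame not in frames[synset]:
            frames[synset].append(frame)
    return frames


def drop_duplicates(pairs):
    """Delete every synset that is listed under more than one frame."""
    frames = frames_by_synset(pairs)
    return [(f, s) for f, s in pairs if len(frames[s]) == 1]


def merge_affected(pairs):
    """Keep untouched frames as-is; reduce affected frames to shared synsets."""
    frames = frames_by_synset(pairs)
    # A frame is "affected" if it contains at least one duplicated synset.
    affected = {f for fs in frames.values() if len(fs) > 1 for f in fs}
    out, seen = [], set()
    for frame, synset in pairs:
        if frame not in affected:
            out.append((frame, synset))
        elif len(frames[synset]) > 1 and synset not in seen:
            seen.add(synset)
            lemma = frame.rpartition(".")[0]
            # Hypothetical label format, e.g. "laskea.02or03".
            ids = "or".join(f.rpartition(".")[2] for f in frames[synset])
            out.append((f"{lemma}.{ids}", synset))
    return out


if __name__ == "__main__":
    for name, fn in (("dedup", drop_duplicates), ("merge", merge_affected)):
        print(name)
        for frame, synset in fn(PAIRS):
            print(f"{frame},{synset}")
```

On the example input, `drop_duplicates` reproduces the five-line clustering and `merge_affected` reproduces the `laskea.02or03` clustering shown earlier in this issue.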
