
sebastianruder / nlp-progress

22.3K stars · 1.3K watchers · 3.6K forks · 1.32 MB

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Home Page: https://nlpprogress.com/

License: MIT License

Languages: HTML 8.62%, Ruby 0.48%, Python 90.90%
Topics: natural-language-processing, machine-learning, named-entity-recognition, machine-translation, nlp-tasks, dialogue


nlp-progress's People

Contributors

astariul, avisil, csarron, cwenner, fredrodrigues, gangeshwark, hossein-amirkhani, kaiqiangsong, leondz, manuelsh, mayhewsw, miguelballesteros, nirantk, oneplus, peterjliu, ramadistra, rktamplayo, sebastianruder, separius, shahbazsyed, shamilcm, stared, svjan5, takase, tomlisankie, udnet96, vncorenlp, yangheng95, yuanhetian, yuvalpinter


nlp-progress's Issues

Error in NER for WNUT2017

The SOTA model's eval score and the linked paper are incorrect. The F1 score is 49.49, as reported in the paper. More details at: https://github.com/zalandoresearch/flair

The paper will be published at NAACL 2019; the citation string is:

Pooled Contextualized Embeddings for Named Entity Recognition (to appear). Alan Akbik, Tanja Bergmann and Roland Vollgraf. 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics, NAACL 2019.

What about other languages?

Thanks for this work!

These pages seem to cover progress only for English (well, except MT). Do you have plans to include other languages?

One extreme example is POS tagging and dependency parsing: UD has 60+ languages :) For other tasks, data is likely very limited.

Unable to find the Yelp Review Dataset online

Hi,

First, thanks a lot for this GitHub repo for SOTA in NLP; it is very useful!

Also, I am not able to find the Yelp Review Dataset you mention. The version available on Yelp's website has more than 5 million data points, while the article mentions 560K for the Polarity version and 650K for the Full version. Do you know where I could find the latter, on which the SOTA models are tested?

Thanks a lot,

New task: noun compound interpretation

Noun compound interpretation

The semantic interpretation of noun compounds (NCs) deals with the detection and semantic classification of the relations between noun constituents.

Example: spoon handle => PART-WHOLE, student protest => AGENT, fee-hike protest => CAUSE.

Noun compound interpretation is a well-studied task. For more information, one may refer to Girju et al. (2005) and Nakov (2013).

Questions:

  • Where should I add this?
  • Under an existing category? If yes, which one?

Notes:

  • This is NOT semantic role labeling, as the relations are not between a verb and a noun.
  • This is NOT relationship extraction, as the relations there are based on context. Interpretation of a noun compound can be done via paraphrasing (student protest => protest by student(s)), which is not the case with relation extraction.

New task: "Resolving the Scope and Focus of Negation"

Dear team, thanks for maintaining this project!

Does anyone think we should also track the task "Resolving the Scope and Focus of Negation"?
https://www.clips.uantwerpen.be/sem2012-st-neg/index.html

Scope and focus of negation
Negation is a pervasive and intricate linguistic phenomenon present in all languages (Horn 1989). Despite this fact, computational semanticists mostly ignore it; current proposals to represent the meaning of text either dismiss negation or only treat it in a superficial manner. This shared task tackles two key steps in order to obtain the meaning of negated statements: scope and focus detection. Regardless of the semantic representation one favors (predicate calculus, logic forms, binary semantic relations, etc.), these tasks are the basic building blocks to process the meaning of negated statements.

Results (not sure if they are up to date):
https://www.clips.uantwerpen.be/sem2012-st-neg/results.html

Open source tagging tool similar to prodi.gy

Hello Sebastian,

Forgive me if this issue should not be posted here. This repo may be limited to updates regarding papers, but I was wondering whether you are aware of any open-source tools that are similar to prodi.gy in application but also applicable to named entity tagging and semantic tagging.

Thanks!

A relationship extraction data issue

For this list: https://github.com/sebastianruder/NLP-progress/blob/master/english/relationship_extraction.md

I would like to point out a data issue.

A new model for distantly supervised relationship extraction that uses the same training dataset (522,611 instances) can be compared with the results of the models (PCNN+ATT, PCNN+ONE, etc.) reported in Lin's paper (Lin et al., 2016). (The cleaned dataset was updated by Lin and can be downloaded from https://github.com/thunlp/NRE.)

The problem is that some new papers (e.g. two in EMNLP 2018 and one in AAAI 2019) used the unprocessed data (570,088 instances), which contains duplicated instances in the test set. The unclean data yields higher but unreliable results.

These issues have already been discussed in:
thunlp/NRE#16
thunlp/OpenNRE#27

The unclean data was tested and shown to affect the results.
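
A quick way to check a local copy of the data for this problem (a hedged sketch that treats each line as one instance; the actual NYT10 release format may differ):

```python
# Hedged sketch: count duplicated instances in a test split, treating
# each non-empty line as one instance (the real file format may differ).
from collections import Counter

with open("test.txt", encoding="utf-8") as f:
    instances = [line.rstrip("\n") for line in f if line.strip()]

counts = Counter(instances)
duplicated = sum(n - 1 for n in counts.values() if n > 1)
print(f"{len(instances)} instances, {duplicated} duplicated")
```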

Automatic metadata fetching via API call

Dear Sebastian, dear NLP-progress Contributors,

Thank you for creating this database!

More of a question than an issue here...

I believe I have an interesting idea for improving this resource, which I tested in my own list of papers on interpretable ML (https://github.com/lopusz/awesome-interpretable-machine-learning). I thought I might share it here as well.

The idea is to provide only the arXiv id (and scores) in the yaml files and let a script generate the title, authors, year and url via an arXiv API call. If a paper is not available on arXiv, the same can be done via the Semantic Scholar API or a DOI API.

This essentially reduces the amount of copy & paste and ensures good consistency of the metadata.
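
The core of the idea as a minimal Python sketch (this is not the actual gener_yaml.py; error handling and the non-arXiv fallbacks are omitted):

```python
# Minimal sketch: resolve an arXiv id to title/authors/year/url via the
# public arXiv API (Atom feed), as proposed above. Stdlib only.
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def fetch_arxiv_metadata(arxiv_id):
    url = f"http://export.arxiv.org/api/query?id_list={arxiv_id}"
    with urllib.request.urlopen(url) as resp:
        feed = ET.fromstring(resp.read())
    entry = feed.find(ATOM + "entry")
    return {
        "title": " ".join(entry.find(ATOM + "title").text.split()),
        "authors": [a.find(ATOM + "name").text
                    for a in entry.findall(ATOM + "author")],
        "year": int(entry.find(ATOM + "published").text[:4]),
        "url": f"https://arxiv.org/abs/{arxiv_id}",
    }
```

A script along these lines can then merge the fetched fields with the hand-maintained scores and dump the full yaml.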

I have created a little proof of concept here.
https://github.com/lopusz/NLP-progress

The sample simplified yaml "template" (with ids only) is here:
https://github.com/lopusz/NLP-progress/blob/devel/_data/dependency_parsing.yaml.template

It is processed by the gener_yaml Python script (requires the pyyaml package):
https://github.com/lopusz/NLP-progress/blob/devel/_data/gener_yaml.py

and produces the full yaml for Jekyll:
https://github.com/lopusz/NLP-progress/blob/devel/_data/dependency_parsing.yaml

The workflow can be traced in the Makefile:
https://github.com/lopusz/NLP-progress/blob/devel/Makefile

Do you think this is useful?

In addition to improving consistency, it could ease the "yamlisation" of the other components. One could also easily include more metadata (e.g. a more accurate publication date for better timeline graphing?).

If you find it interesting, we could think about how to refactor this POC so that it best fits the NLP-progress workflow...

Best regards,
Michał

NLP Progress Graph

Hi Sebastian,
I loved your idea for this repo. I was thinking we could have a graph, something like this (image omitted), showing the progress of different tasks in NLP based on the updates to their markdown files.

I have created a shell script that clones your repo locally, counts the number of commits for each file, preprocesses the result with python/pandas, creates a bar chart, and uploads it to a free image-hosting service.
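
Roughly, the commit-counting step could look like this (a simplified Python sketch, not the exact script; it assumes it is run inside a clone of the repo):

```python
# Simplified sketch: use the number of commits touching each task file
# as a rough proxy for activity on that task.
import subprocess
from pathlib import Path

def commit_count(path):
    # one line per commit that touched this file
    log = subprocess.run(
        ["git", "log", "--oneline", "--", str(path)],
        capture_output=True, text=True, check=True,
    ).stdout
    return len(log.splitlines())

counts = {p: commit_count(p) for p in Path("english").glob("*.md")}
for path, n in sorted(counts.items(), key=lambda kv: -kv[1])[:10]:
    print(f"{n:4d}  {path}")
```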

Currently it shows the count of all commits for a specific file, but if we had a guideline for distinguishing commits that add new results from commits that fix errors (maybe via different identifiers in commit messages), then we could count the number of times a new result has been added to each NLP task. This could help in visualizing the most active/improving areas of NLP research.

Currently the graph doesn't make much sense, but it will improve over time as more results are added.

Also, if you think something like this could benefit the community, I can create a cron job on my PC (I don't have a server) that updates the image URL with the latest graph, which you could show on the main page.

Unofficial links to datasets

For grammatical error correction, the two main datasets (NUCLE, Lang-8) are not available for download directly through "official" sources, but they are available in GitHub repositories. Can they be linked for educational purposes?

Recommendation systems

Hi Sebastian,
Thanks for your wonderful repository/website. I found that there is no benchmark/dataset for recommendation systems, which is definitely an important task in NLP/IR.

Include a column for papers with publicly available source code

First of all, thank you for this awesome work.

One suggestion: include a separate column with a link to the source code. This would help in finding implementations that we can easily adopt and test against.

Would this be in the scope of this project?

NLP Progress community

I am trying to learn NLP and would like to follow NLP aficionados on Twitter or Mastodon. I have a background in programming and love sharing. Let me know your Twitter and Mastodon handles. Here are mine:

Maybe we could set up a Mastodon instance dedicated to NLP? WDYT?

CoNLL-2003 incomparable results

Because of the small size of the CoNLL-2003 training set, some authors incorporated the development set into the training data after tuning their hyper-parameters. Consequently, not all results are directly comparable.

Train+dev:

Flair embeddings (Akbik et al., 2018)
Peters et al. (2017)
Yang et al. (2017)

Maybe those results should be marked with an asterisk.

Suggestion: Add basic problems like sentence segmentation and tokenization

This is a freaking amazing overview & super useful! Many thanks!

Could I suggest also adding basic tasks like:

  • sentence segmentation
  • (word) tokenization

I noticed that what I assumed would be a trivial task by now is still not perfectly solved, and the algorithms I use make basic mistakes in both problem domains.

Dialogue Act Classification as separate task?

I'm currently looking into dialogue act classification and would like to add the SOTA to this repo, but I'm not sure whether it counts as a separate task, since there is already a page for Dialogue State Tracking that lists dialogue act classification as a subtask.
Would adding it nevertheless be helpful, or is it not relevant?

Thanks in advance!

Summarization metrics

Hi guys! I've observed that the research featured for summarization mostly describes evaluating summaries using only the following metrics:

  • Rouge / variants
  • Meteor
  • Compression Ratio (CR)
  • F1 score

And recalling past research, I see that these are the most often used.

Does anyone have an idea why these are favored over the other metrics? Specifically:

  • Retention Ratio (RR) [Hassel, M., Evaluation of automatic text summarization, 2004]
  • Answer Recall Lenient (ARL) [Mani 2002, TIPSTER SUMMAC Text Summarization Evaluation]
  • Answer Recall Strict (ARS) [Mani 2002, TIPSTER SUMMAC Text Summarization Evaluation]

Among others? (I mention RR because I used it previously along with CR; a minimal ROUGE-1 sketch follows for reference.)
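
As a reference point, ROUGE-1 recall boils down to simple unigram overlap (a minimal sketch; the full metric adds options such as stemming and multiple references):

```python
# Minimal sketch of ROUGE-1 recall: clipped unigram overlap between a
# candidate summary and a reference, divided by the reference length.
from collections import Counter

def rouge1_recall(candidate: str, reference: str) -> float:
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], n) for w, n in ref.items())
    return overlap / max(sum(ref.values()), 1)

print(rouge1_recall("the cat sat on the mat", "the cat was on the mat"))
# -> 0.833... (5 of 6 reference unigrams matched)
```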

Thanks for this great Repo btw!

Entity Linking Dataset

I am not able to download the AIDA CoNLL-YAGO dataset for the entity linking task. Could someone please let me know how to do it?

Consider unifying some/many task-specific files

In response to #6:

For me: having a single README (or at least a smaller number of files than we have now) makes it easier (for me) to skim through all the numbers, papers, results, etc.

Request to open for discussion:

  • Consider re-unifying the many task-specific files back into a single README.
  • Or, possibly, into some smaller number of grouped, related files. (I don't know a good grouping offhand, but could investigate.)

Opening this for discussion at the request of @sebastianruder, polling the community to see what would be easiest.

I will concede that in the long run, either way, there may be pressure to split things back up (as tasks & papers accumulate), but my (personal!) leaning would be to keep things merged for now and split them back up at some future point.

Datasets for argument mining and storytelling?

Thank you for your precious contribution to NLP. I am a PhD candidate in China and was attracted by some of the new research topics mentioned in the accepted tutorials for ACL 2019.
I really want to know whether there are proper datasets for argument mining and storytelling.
Looking forward to your reply. Thank you.

YAML - pros and cons

I'd like to discuss the pros and cons of using YAML going forward, or whether we should stick with Markdown tables. Here are some pros and cons, mainly from @NirantK (in #116), @stared (in #43, #64) and myself.

Pros:

  • Easier trend spotting in performance improvements
  • Easy to create plots and visualizations going forward
  • Data is separated from presentation (see the sketch after this list)

Cons:

  • Hard for contributors, e.g. HTML omissions can't be spotted without setting up Jekyll locally
  • The GitHub repo becomes useless for readers, who must rely exclusively on nlpprogress.com
  • Many visualizations (e.g. bar charts) based on performance numbers are no more useful than the raw tables
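
To make the "data separated from presentation" point concrete, here is a hypothetical sketch of a results entry stored as YAML and rendered back into a Markdown row (the field names are illustrative, not the repo's actual schema):

```python
# Illustrative only: one results entry as YAML data, plus the few lines
# needed to render it as a Markdown table row.
import yaml  # pip install pyyaml

ENTRY = """
- model: Flair embeddings (Akbik et al., 2018)
  f1: 93.09
  paper: https://aclanthology.org/C18-1139/
"""

for row in yaml.safe_load(ENTRY):
    print(f"| {row['model']} | {row['f1']} | [paper]({row['paper']}) |")
```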

Other opinions are welcome.

Recommend adding a link to GERBIL in entity_linking.md

GERBIL in general is just a really good entry point for people interested in entity linking.

There are issues with inscrutability: e.g. you can't see which candidates were considered or rejected by an EL service, and you don't actually get to see the annotations returned by the EL system, only the final stats generated by GERBIL. However, it is a recognised benchmarking platform that should definitely be in the toolbox of anyone interested in EL.

Website here: http://gerbil.aksw.org/gerbil/

Paper here: http://www.www2015.it/documents/proceedings/proceedings/p1133.pdf

[feature] please add "Chinese word segmentation" into wish list

Hi, thanks for your great project. I think you should add "Chinese word segmentation" as a topic to your wish list. Regretfully, I don't have enough knowledge to contribute to the project in this area, but I really hope someone can. Thanks!

Embeddings table

Is there any reason why the universal embeddings / sentence embeddings section is left out of the table of contents and the wish list? Is the tracking for extrinsic tasks only?
