redditsota / state-of-the-art-result-for-machine-learning-problems


License: Apache License 2.0


State-of-the-art result for all Machine Learning Problems

LAST UPDATE: 20th February 2019

NEWS: I am looking for collaborators, especially researchers in NLP, computer vision, and reinforcement learning. If you are not a researcher but are willing to help, contact me. Email me: [email protected]

This repository provides state-of-the-art (SoTA) results for all machine learning problems. We do our best to keep this repository up to date. If you do find a problem's SoTA result is out of date or missing, please raise this as an issue (with this information: research paper name, dataset, metric, source code and year). We will fix it immediately.

You can also submit this Google Form if you are new to GitHub.

This is an attempt to build a one-stop reference for state-of-the-art results for all types of machine learning problems. I cannot do this alone; I need help from everyone. Please submit the Google form or raise an issue if you find a SOTA result for a dataset. Please share this on Twitter, Facebook, and other social media.

This summary is categorized into:

Supervised Learning

NLP

1. Language Modelling

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Language Models are Unsupervised Multitask Learners | PTB<br>WikiText-2 | Perplexity: 35.76<br>Perplexity: 18.34 | Tensorflow | 2019 |
| Breaking the Softmax Bottleneck: A High-Rank RNN Language Model | PTB<br>WikiText-2 | Perplexity: 47.69<br>Perplexity: 40.68 | Pytorch | 2017 |
| Dynamic Evaluation of Neural Sequence Models | PTB<br>WikiText-2 | Perplexity: 51.1<br>Perplexity: 44.3 | Pytorch | 2017 |
| Averaged Stochastic Gradient Descent with Weight Dropped LSTM or QRNN | PTB<br>WikiText-2 | Perplexity: 52.8<br>Perplexity: 52.0 | Pytorch | 2017 |
| Fraternal Dropout | PTB<br>WikiText-2 | Perplexity: 56.8<br>Perplexity: 64.1 | Pytorch | 2017 |
| Factorization tricks for LSTM networks | One Billion Word Benchmark | Perplexity: 23.36 | Tensorflow | 2017 |
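For context, the perplexity values above are the exponentiated average negative log-likelihood per token on the test set. A minimal illustrative sketch (not taken from any of the papers above, toy values only):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(average negative log-likelihood per token).

    token_log_probs: natural-log probabilities a model assigned to each
    ground-truth token of the evaluation corpus (hypothetical values below).
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Toy example: a model that assigns ~0.1 probability to every token
# has perplexity ~10.
print(perplexity([math.log(0.1)] * 5))  # -> 10.0
```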

2. Machine Translation

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Understanding Back-Translation at Scale | WMT 2014 English-to-French<br>WMT 2014 English-to-German | BLEU: 45.6<br>BLEU: 35.0 | | 2018 |
| Weighted Transformer Network for Machine Translation | WMT 2014 English-to-French<br>WMT 2014 English-to-German | BLEU: 41.4<br>BLEU: 28.9 | | 2017 |
| Attention Is All You Need | WMT 2014 English-to-French<br>WMT 2014 English-to-German | BLEU: 41.0<br>BLEU: 28.4 | | 2017 |
| Non-Autoregressive Neural Machine Translation | WMT16 Ro→En | BLEU: 31.44 | | 2017 |
| Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets | NIST02<br>NIST03<br>NIST04<br>NIST05 | 38.74<br>36.01<br>37.54<br>33.76 | | 2017 |
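The BLEU numbers above are corpus-level scores reported by the respective papers, and they depend on tokenization and the exact evaluation script used. As a rough illustration only (assuming the `sacrebleu` package is installed; the example sentences are hypothetical):

```python
import sacrebleu  # pip install sacrebleu

# Hypothetical system outputs and one reference stream.
hypotheses = ["the cat sat on the mat", "he read the book"]
references = [["the cat is on the mat", "he read a book"]]

# Corpus-level BLEU on a 0-100 scale.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)
```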

3. Text Classification

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Learning Structured Text Representations | Yelp | Accuracy: 68.6 | | 2017 |
| Attentive Convolution | Yelp | Accuracy: 67.36 | | 2017 |

4. Natural Language Inference

Leaderboards:

Stanford Natural Language Inference (SNLI)

MultiNLI

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Natural Language Inference over Interaction Space | Stanford Natural Language Inference (SNLI) | Accuracy: 88.9 | Tensorflow | 2017 |
| BERT-LARGE (ensemble) | Multi-Genre Natural Language Inference (MNLI) | Matched accuracy: 86.7<br>Mismatched accuracy: 85.9 | | 2018 |

5. Question Answering

Leaderboard:

SQuAD

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| BERT-LARGE (ensemble) | The Stanford Question Answering Dataset (SQuAD) | Exact Match: 87.4<br>F1: 93.2 | | 2018 |
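For reference, SQuAD's Exact Match and F1 metrics compare a predicted answer span against the gold answer after light normalization. Below is a simplified sketch that loosely follows the official evaluation script but is not identical to it (example strings are hypothetical):

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, ground_truth):
    return float(normalize(prediction) == normalize(ground_truth))

def f1(prediction, ground_truth):
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Eiffel Tower", "Eiffel Tower"))  # 1.0 after normalization
print(round(f1("in Paris France", "Paris"), 2))         # 0.5
```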

6. Named Entity Recognition

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Named Entity Recognition in Twitter using Images and Text | Ritter | F-measure: 0.59 | NOT FOUND | 2017 |

7. Abstractive Summarization

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization | DUC-2004<br>Gigaword | DUC-2004: ROUGE-1: 32.28, ROUGE-2: 10.54, ROUGE-L: 27.80<br>Gigaword: ROUGE-1: 36.30, ROUGE-2: 17.31, ROUGE-L: 33.88 | NOT YET AVAILABLE | 2017 |
| Convolutional Sequence to Sequence | DUC-2004<br>Gigaword | DUC-2004: ROUGE-1: 33.44, ROUGE-2: 10.84, ROUGE-L: 26.90<br>Gigaword: ROUGE-1: 35.88, ROUGE-2: 27.48, ROUGE-L: 33.29 | PyTorch | 2017 |
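The ROUGE figures above measure n-gram overlap between a system summary and a reference summary. The papers themselves use the original ROUGE-1.5.5 Perl toolkit, so the values below are only a rough illustration; this sketch assumes Google's `rouge-score` package and hypothetical example texts:

```python
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# Hypothetical reference summary and system output.
reference = "police killed the gunman"
prediction = "the gunman was shot down by police"

# score(target, prediction) returns precision/recall/F-measure per ROUGE variant.
scores = scorer.score(reference, prediction)
for name, result in scores.items():
    print(name, round(result.fmeasure, 4))
```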

8. Dependency Parsing

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Globally Normalized Transition-Based Neural Networks | Final CoNLL '09 dependency parsing | UAS accuracy: 94.08%<br>LAS accuracy: 92.15% | | 2017 |

Computer Vision

1. Classification

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Dynamic Routing Between Capsules | MNIST | Test Error: 0.25±0.005 | | 2017 |
| High-Performance Neural Networks for Visual Object Classification | NORB | Test Error: 2.53±0.40 | | 2011 |
| Giant AmoebaNet with GPipe | CIFAR-10<br>CIFAR-100<br>ImageNet-1k<br>... | Test Error: 1.0%<br>Test Error: 8.7%<br>Top-1 Error: 15.7<br>... | | 2018 |
| ShakeDrop regularization | CIFAR-10<br>CIFAR-100 | Test Error: 2.31%<br>Test Error: 12.19% | | 2017 |
| Aggregated Residual Transformations for Deep Neural Networks | CIFAR-10 | Test Error: 3.58% | | 2017 |
| Random Erasing Data Augmentation | CIFAR-10<br>CIFAR-100<br>Fashion-MNIST | Test Error: 3.08%<br>Test Error: 17.73%<br>Test Error: 3.65% | Pytorch | 2017 |
| EraseReLU: A Simple Way to Ease the Training of Deep Convolution Neural Networks | CIFAR-10<br>CIFAR-100 | Test Error: 3.56%<br>Test Error: 16.53% | Pytorch | 2017 |
| Dynamic Routing Between Capsules | MultiMNIST | Test Error: 5% | | 2017 |
| Learning Transferable Architectures for Scalable Image Recognition | ImageNet-1k | Top-1 Error: 17.3 | | 2017 |
| Squeeze-and-Excitation Networks | ImageNet-1k | Top-1 Error: 18.68 | | 2017 |
| Aggregated Residual Transformations for Deep Neural Networks | ImageNet-1k | Top-1 Error: 20.4% | | 2016 |

2. Instance Segmentation

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Mask R-CNN | COCO | Average Precision: 37.1% | | 2017 |

3. Visual Question Answering

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge | VQA | Overall score: 69 | | 2017 |

4. Person Re-identification

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Random Erasing Data Augmentation | | Rank-1: 89.13, mAP: 83.93<br>Rank-1: 84.02, mAP: 78.28<br>labeled: Rank-1: 63.93, mAP: 65.05; detected: Rank-1: 64.43, mAP: 64.75 | Pytorch | 2017 |

Speech

Speech SOTA

1. ASR

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| The Microsoft 2017 Conversational Speech Recognition System | Switchboard Hub5'00 | WER: 5.1 | | 2017 |
| The CAPIO 2017 Conversational Speech Recognition System | Switchboard Hub5'00 | WER: 5.0 | | 2017 |
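Word error rate (WER) is the word-level edit distance between hypothesis and reference transcript divided by the reference length. A minimal sketch for illustration only (the papers above score against the NIST Hub5'00 references with standard scoring tools, not this function):

```python
def wer(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("hello how are you", "hello how you"))  # 0.25 (one deletion)
```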

Semi-supervised Learning

Computer Vision

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Distributional Smoothing with Virtual Adversarial Training | SVHN<br>NORB | Test error: 24.63<br>Test error: 9.88 | Theano | 2016 |
| Virtual Adversarial Training: a Regularization Method for Supervised and Semi-supervised Learning | MNIST | Test error: 1.27 | | 2017 |
| Few Shot Object Detection | VOC2007<br>VOC2012 | mAP: 41.7<br>mAP: 35.4 | | 2017 |
| Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro | | Rank-1: 83.97, mAP: 66.07<br>Rank-1: 84.6, mAP: 87.4<br>Rank-1: 67.68, mAP: 47.13<br>Test Accuracy: 84.4 | Matconvnet | 2017 |

Unsupervised Learning

Computer Vision

1. Generative Model

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Progressive Growing of GANs for Improved Quality, Stability, and Variation | Unsupervised CIFAR-10 | Inception score: 8.80 | Theano | 2017 |
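The Inception score above is the exponential of the average KL divergence between a classifier's conditional label distribution p(y|x) on generated images and its marginal p(y). A minimal numpy sketch of that formula, given pre-computed softmax outputs (hypothetical array below; real evaluations use an Inception network and typically average over several splits):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, num_classes) softmax outputs of a classifier on generated images.

    IS = exp( E_x[ KL( p(y|x) || p(y) ) ] ).
    """
    p_y = probs.mean(axis=0, keepdims=True)                 # marginal label distribution
    kl = probs * (np.log(probs + eps) - np.log(p_y + eps))  # per-sample, per-class KL terms
    return float(np.exp(kl.sum(axis=1).mean()))

# Hypothetical classifier outputs for 4 generated images over 3 classes.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.05, 0.05, 0.90],
                  [0.20, 0.60, 0.20]])
print(inception_score(probs))
```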

NLP

Machine Translation

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Unsupervised Machine Translation Using Monolingual Corpora Only | Multi30k-Task1 (en-fr, fr-en, de-en, en-de) | BLEU: 32.76, 32.07, 26.26, 22.74 | | 2017 |
| Unsupervised Neural Machine Translation with Weight Sharing | WMT14 (en-fr, fr-en)<br>WMT16 (de-en, en-de) | BLEU: 16.97, 15.58<br>BLEU: 14.62, 10.86 | | 2018 |

Transfer Learning

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| One Model To Learn Them All | WMT EN→DE<br>WMT EN→FR<br>ImageNet | BLEU: 21.2<br>BLEU: 30.5<br>Top-5 accuracy: 86% | | 2017 |

Reinforcement Learning

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Mastering the game of Go without human knowledge | the game of Go | Elo Rating: 5185 | | 2017 |

Email: [email protected]

Contributors

bachstelze, hanpum, layumi, redditsota, rodgzilla, sshekh, taoyudong, thanhnguyentang, yichengong, zhunzhong07


Issues

State-of-the-art for 20 Newsgroups

Does anyone happen to know the state of the art for the popular 20 Newsgroups dataset? (And what are the most common train/dev/test splits people use?)

Collaborate with nlpprogress.com/ ?

Hi, thanks for maintaining this list, it's awesome!

Just wondering if you are aware of nlpprogress.com/, which is doing similar things but focuses on NLP. It would be nice to work together with them.

Time Series Classification

Time series classification is a very popular machine learning problem.
You can find a full survey and empirical study (link to paper) on 85 datasets here.
More recently, in our paper we showed that deep learning can also reach state-of-the-art performance for time series classification.

• Research paper name: The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances (link) & Deep learning for time series classification: a review (link)
• Dataset: UEA archive (link)
• Metric: Accuracy + average rank comparison over the datasets (reference)
• Source code: Time Series Classification (link) & Deep learning for time series classification (link)
• Year: 2017 for Time Series Classification & 2018 for Deep learning for time series classification

Mask R-CNN implementations - additional info

Hi, I'd like to add several models implementing Mask R-CNN.
The first one is Facebook Detectron in Caffe2. It works well.
Another one is in Tensorflow with a custom Slim library. This one is not supported by the author, but it works.
The last one is in MXNet.

About the link in the description - it is Keras on top of TensorFlow, not pure TensorFlow.
Hope it helps.

P.S. Could you add guidance on the format in which people should submit pull requests?

Your email address `[email protected]` was not found

I just sent an email to [email protected], but it seems it did not work:

    Address not found
    --
    Your message wasn't delivered to [email protected] because the address couldn't be found, or is unable to receive mail.

    The response was: The email account that you tried to reach does not exist. Please try double-checking the recipient's email address for typos or unnecessary spaces. Learn more at https://support.google.com/mail/?p=NoSuchUser r20-v6sor96536itb.73 - gsmtp

So I just post my email here ;)

Hi there,

I'm a machine learning newbie with 20 years of programming experience. I love ML, and this year I'll start my Ph.D. study on ChatBot (NLU, ChatUI) in Beijing.

I'm willing to help as a collaborator because I love your idea of making a one-stop reference for state-of-the-art results for all types of machine learning problems; it helps me a lot.

Please feel free to let me know what I could do for you at any time.

My GitHub: https://github.com/zixia
My LinkedIn: https://linkedin.com/in/zixia
My WeChat: 918999

Have a nice day!

Huan LI
[email protected]

New Topic for Computer Vision

Excellent job on this repo.

For computer vision, some other tasks will be important.
I will provide some topics and references that I am familiar with.

Instance Segmentation:
MASK-RCNN
The dataset used for evaluation is COCO.

Bounding-Box Object Detection:
MASK-RCNN

Some other metrics for evaluation might be important, such as fps for detection:
YOLO2
SSD

For bounding-box object detection, there are some other datasets:
ImageNet DET
Pascal VOC
UA-DETRAC

I have not looked at the development in speed for a while, so there might be something new.
MASK-RCNN provides the best accuracy now for sure.

Include NLG papers

NLG, the other end of NLP, is important in many fields where AI is being applied. Please include the latest NLG research as well; imo it would be very helpful.

Problem request: Dynamic pricing

Very interested in machine learning solutions to any form of dynamic pricing, including but not limited to:

Formulations

• Base case: known supply, known demand
• Retail: known supply, stochastic demand
• Consignment: stochastic supply, stochastic demand

Industries

• E-commerce
• Brick and mortar
• Airlines
• Hotels

Weighted Transformer

There is a new paper out with faster learning and a slightly better BLEU score for the transformer architecture, called Weighted Transformer Network for Machine Translation:
https://arxiv.org/abs/1711.02132

But I think there is no open-source implementation available.

NASNet

NASNet: the paper is here,
the code is here.
The top-1 error is 17.3 on ImageNet-1k.

Language Modelling: WikiText-103

Here is an update to the list for language modelling. The WikiText-103 dataset is not currently listed, but it has been a popular dataset for large vocabularies (not covered by PTB/WT2) and long-term dependencies (not covered by the billion word benchmark). It seems relevant for this list.

Paper title: Fast Parametric Learning with Activation Memorization
Dataset: WikiText-103
Metric: 29.2 (Perplexity)
Year: 2018
Link: https://arxiv.org/abs/1803.10049

Add Word Sense Disambiguation (WSD)

Past SoTA

It would be good if a track of past SoTA results could be provided, showing the path of how the techniques developed.
I may be able to help in some of the areas.

Speech section

There is an existing repo for speech SOTAs: https://github.com/syhw/wer_are_we. Perhaps you want to reference it and/or join forces with them.

Concerning the Switchboard numbers, you need to mention that they used the 2000h set for training and the Switchboard portion of Hub5'00 for testing (not the CallHome subset).

History of state-of-the-art results

It would be interesting to keep a list of previous state-of-the-art results and the related papers.
That would help in understanding the evolution of the methods used to address each problem.

For readability, I suggest listing them on a new page.
If you think this idea is worthwhile, I'll start collecting information on this topic and will submit a PR.

Thanks for this wonderful resource! 👏

Object Detection

Hi Yudong,

I didn't see anything about object detection. Is there a reason for that, or did you simply forget to add it?

Thanks
