redditsota / state-of-the-art-result-for-machine-learning-problems


License: Apache License 2.0


State-of-the-art result for all Machine Learning Problems

LAST UPDATE: 20th February 2019

NEWS: I am looking for collaborators, especially researchers in NLP, computer vision, and reinforcement learning. If you are not a researcher but are willing to help, contact me. Email me: [email protected]

This repository provides state-of-the-art (SoTA) results for all machine learning problems. We do our best to keep this repository up to date. If you do find a problem's SoTA result is out of date or missing, please raise this as an issue (with this information: research paper name, dataset, metric, source code and year). We will fix it immediately.

You can also submit this Google Form if you are new to GitHub.

This is an attempt to build a one-stop reference for state-of-the-art results for all types of machine learning problems. I cannot do this alone; I need help from everyone. Please submit the Google form or raise an issue if you find a SOTA result for a dataset. Please share this on Twitter, Facebook, and other social media.

This summary is categorized into:

Supervised Learning

NLP

1. Language Modelling

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Language Models are Unsupervised Multitask Learners | PTB<br>WikiText-2 | Perplexity: 35.76<br>Perplexity: 18.34 | Tensorflow | 2019 |
| Breaking the Softmax Bottleneck: A High-Rank RNN Language Model | PTB<br>WikiText-2 | Perplexity: 47.69<br>Perplexity: 40.68 | Pytorch | 2017 |
| Dynamic Evaluation of Neural Sequence Models | PTB<br>WikiText-2 | Perplexity: 51.1<br>Perplexity: 44.3 | Pytorch | 2017 |
| Averaged Stochastic Gradient Descent with Weight Dropped LSTM or QRNN | PTB<br>WikiText-2 | Perplexity: 52.8<br>Perplexity: 52.0 | Pytorch | 2017 |
| Fraternal Dropout | PTB<br>WikiText-2 | Perplexity: 56.8<br>Perplexity: 64.1 | Pytorch | 2017 |
| Factorization tricks for LSTM networks | One Billion Word Benchmark | Perplexity: 23.36 | Tensorflow | 2017 |
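For context, the perplexity values above are the exponentiated average negative log-likelihood per token on the test set. A minimal illustrative sketch (not taken from any of the papers above, toy values only):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(average negative log-likelihood per token).

    token_log_probs: natural-log probabilities a model assigned to each
    ground-truth token of the evaluation corpus (hypothetical values below).
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Toy example: a model that assigns ~0.1 probability to every token
# has perplexity ~10.
print(perplexity([math.log(0.1)] * 5))  # -> 10.0
```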

2. Machine Translation

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Understanding Back-Translation at Scale | WMT 2014 English-to-French<br>WMT 2014 English-to-German | BLEU: 45.6<br>BLEU: 35.0 | | 2018 |
| Weighted Transformer Network for Machine Translation | WMT 2014 English-to-French<br>WMT 2014 English-to-German | BLEU: 41.4<br>BLEU: 28.9 | | 2017 |
| Attention Is All You Need | WMT 2014 English-to-French<br>WMT 2014 English-to-German | BLEU: 41.0<br>BLEU: 28.4 | | 2017 |
| Non-Autoregressive Neural Machine Translation | WMT16 Ro→En | BLEU: 31.44 | | 2017 |
| Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets | NIST02<br>NIST03<br>NIST04<br>NIST05 | 38.74<br>36.01<br>37.54<br>33.76 | | 2017 |
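The BLEU numbers above are corpus-level scores reported by the respective papers, and they depend on tokenization and the exact evaluation script used. As a rough illustration only (assuming the `sacrebleu` package is installed; the example sentences are hypothetical):

```python
import sacrebleu  # pip install sacrebleu

# Hypothetical system outputs and one reference stream.
hypotheses = ["the cat sat on the mat", "he read the book"]
references = [["the cat is on the mat", "he read a book"]]

# Corpus-level BLEU on a 0-100 scale.
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)
```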

3. Text Classification

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Learning Structured Text Representations | Yelp | Accuracy: 68.6 | | 2017 |
| Attentive Convolution | Yelp | Accuracy: 67.36 | | 2017 |

4. Natural Language Inference

Leaderboards:

Stanford Natural Language Inference (SNLI)

MultiNLI

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Natural Language Inference over Interaction Space | Stanford Natural Language Inference (SNLI) | Accuracy: 88.9 | Tensorflow | 2017 |
| BERT-LARGE (ensemble) | Multi-Genre Natural Language Inference (MNLI) | Matched accuracy: 86.7<br>Mismatched accuracy: 85.9 | | 2018 |

5. Question Answering

Leaderboard:

SQuAD

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| BERT-LARGE (ensemble) | The Stanford Question Answering Dataset (SQuAD) | Exact Match: 87.4<br>F1: 93.2 | | 2018 |
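For reference, SQuAD's Exact Match and F1 metrics compare a predicted answer span against the gold answer after light normalization. Below is a simplified sketch that loosely follows the official evaluation script but is not identical to it (example strings are hypothetical):

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, ground_truth):
    return float(normalize(prediction) == normalize(ground_truth))

def f1(prediction, ground_truth):
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(ground_truth).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Eiffel Tower", "Eiffel Tower"))  # 1.0 after normalization
print(round(f1("in Paris France", "Paris"), 2))         # 0.5
```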

6. Named Entity Recognition

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Named Entity Recognition in Twitter using Images and Text | Ritter | F-measure: 0.59 | NOT FOUND | 2017 |

7. Abstractive Summarization

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Cutting-off Redundant Repeating Generations for Neural Abstractive Summarization | DUC-2004<br>Gigaword | DUC-2004: ROUGE-1: 32.28, ROUGE-2: 10.54, ROUGE-L: 27.80<br>Gigaword: ROUGE-1: 36.30, ROUGE-2: 17.31, ROUGE-L: 33.88 | NOT YET AVAILABLE | 2017 |
| Convolutional Sequence to Sequence | DUC-2004<br>Gigaword | DUC-2004: ROUGE-1: 33.44, ROUGE-2: 10.84, ROUGE-L: 26.90<br>Gigaword: ROUGE-1: 35.88, ROUGE-2: 27.48, ROUGE-L: 33.29 | PyTorch | 2017 |
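The ROUGE figures above measure n-gram overlap between a system summary and a reference summary. The papers themselves use the original ROUGE-1.5.5 Perl toolkit, so the values below are only a rough illustration; this sketch assumes Google's `rouge-score` package and hypothetical example texts:

```python
from rouge_score import rouge_scorer  # pip install rouge-score

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# Hypothetical reference summary and system output.
reference = "police killed the gunman"
prediction = "the gunman was shot down by police"

# score(target, prediction) returns precision/recall/F-measure per ROUGE variant.
scores = scorer.score(reference, prediction)
for name, result in scores.items():
    print(name, round(result.fmeasure, 4))
```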

8. Dependency Parsing

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Globally Normalized Transition-Based Neural Networks | Final CoNLL '09 dependency parsing | UAS accuracy: 94.08%<br>LAS accuracy: 92.15% | | 2017 |

Computer Vision

1. Classification

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Dynamic Routing Between Capsules | MNIST | Test Error: 0.25±0.005 | | 2017 |
| High-Performance Neural Networks for Visual Object Classification | NORB | Test Error: 2.53±0.40 | | 2011 |
| Giant AmoebaNet with GPipe | CIFAR-10<br>CIFAR-100<br>ImageNet-1k<br>... | Test Error: 1.0%<br>Test Error: 8.7%<br>Top-1 Error: 15.7<br>... | | 2018 |
| ShakeDrop regularization | CIFAR-10<br>CIFAR-100 | Test Error: 2.31%<br>Test Error: 12.19% | | 2017 |
| Aggregated Residual Transformations for Deep Neural Networks | CIFAR-10 | Test Error: 3.58% | | 2017 |
| Random Erasing Data Augmentation | CIFAR-10<br>CIFAR-100<br>Fashion-MNIST | Test Error: 3.08%<br>Test Error: 17.73%<br>Test Error: 3.65% | Pytorch | 2017 |
| EraseReLU: A Simple Way to Ease the Training of Deep Convolution Neural Networks | CIFAR-10<br>CIFAR-100 | Test Error: 3.56%<br>Test Error: 16.53% | Pytorch | 2017 |
| Dynamic Routing Between Capsules | MultiMNIST | Test Error: 5% | | 2017 |
| Learning Transferable Architectures for Scalable Image Recognition | ImageNet-1k | Top-1 Error: 17.3 | | 2017 |
| Squeeze-and-Excitation Networks | ImageNet-1k | Top-1 Error: 18.68 | | 2017 |
| Aggregated Residual Transformations for Deep Neural Networks | ImageNet-1k | Top-1 Error: 20.4% | | 2016 |

2. Instance Segmentation

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Mask R-CNN | COCO | Average Precision: 37.1% | | 2017 |

3. Visual Question Answering

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge | VQA | Overall score: 69 | | 2017 |

4. Person Re-identification

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Random Erasing Data Augmentation | | Rank-1: 89.13, mAP: 83.93<br>Rank-1: 84.02, mAP: 78.28<br>labeled: Rank-1: 63.93, mAP: 65.05; detected: Rank-1: 64.43, mAP: 64.75 | Pytorch | 2017 |

Speech

Speech SOTA

1. ASR

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| The Microsoft 2017 Conversational Speech Recognition System | Switchboard Hub5'00 | WER: 5.1 | | 2017 |
| The CAPIO 2017 Conversational Speech Recognition System | Switchboard Hub5'00 | WER: 5.0 | | 2017 |
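Word error rate (WER) is the word-level edit distance between hypothesis and reference transcript divided by the reference length. A minimal sketch for illustration only (the papers above score against the NIST Hub5'00 references with standard scoring tools, not this function):

```python
def wer(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("hello how are you", "hello how you"))  # 0.25 (one deletion)
```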

Semi-supervised Learning

Computer Vision

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Distributional Smoothing with Virtual Adversarial Training | SVHN<br>NORB | Test error: 24.63<br>Test error: 9.88 | Theano | 2016 |
| Virtual Adversarial Training: a Regularization Method for Supervised and Semi-supervised Learning | MNIST | Test error: 1.27 | | 2017 |
| Few Shot Object Detection | VOC2007<br>VOC2012 | mAP: 41.7<br>mAP: 35.4 | | 2017 |
| Unlabeled Samples Generated by GAN Improve the Person Re-identification Baseline in vitro | | Rank-1: 83.97, mAP: 66.07<br>Rank-1: 84.6, mAP: 87.4<br>Rank-1: 67.68, mAP: 47.13<br>Test Accuracy: 84.4 | Matconvnet | 2017 |

Unsupervised Learning

Computer Vision

1. Generative Model

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Progressive Growing of GANs for Improved Quality, Stability, and Variation | Unsupervised CIFAR-10 | Inception score: 8.80 | Theano | 2017 |
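The Inception score above is the exponential of the average KL divergence between a classifier's conditional label distribution p(y|x) on generated images and its marginal p(y). A minimal numpy sketch of that formula, given pre-computed softmax outputs (hypothetical array below; real evaluations use an Inception network and typically average over several splits):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, num_classes) softmax outputs of a classifier on generated images.

    IS = exp( E_x[ KL( p(y|x) || p(y) ) ] ).
    """
    p_y = probs.mean(axis=0, keepdims=True)                 # marginal label distribution
    kl = probs * (np.log(probs + eps) - np.log(p_y + eps))  # per-sample, per-class KL terms
    return float(np.exp(kl.sum(axis=1).mean()))

# Hypothetical classifier outputs for 4 generated images over 3 classes.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.05, 0.05, 0.90],
                  [0.20, 0.60, 0.20]])
print(inception_score(probs))
```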

NLP

Machine Translation

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Unsupervised Machine Translation Using Monolingual Corpora Only | Multi30k-Task1 (en-fr, fr-en, de-en, en-de) | BLEU: 32.76, 32.07, 26.26, 22.74 | | 2017 |
| Unsupervised Neural Machine Translation with Weight Sharing | WMT14 (en-fr, fr-en)<br>WMT16 (de-en, en-de) | BLEU: 16.97, 15.58<br>BLEU: 14.62, 10.86 | | 2018 |

Transfer Learning

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| One Model To Learn Them All | WMT EN→DE<br>WMT EN→FR<br>ImageNet | BLEU: 21.2<br>BLEU: 30.5<br>Top-5 accuracy: 86% | | 2017 |

Reinforcement Learning

| Research Paper | Datasets | Metric | Source Code | Year |
| --- | --- | --- | --- | --- |
| Mastering the game of Go without human knowledge | the game of Go | Elo Rating: 5185 | | 2017 |

Email: [email protected]

Contributors

bachstelze, hanpum, layumi, redditsota, rodgzilla, sshekh, taoyudong, thanhnguyentang, yichengong, zhunzhong07


Issues

State-of-the-art for 20 Newsgroups

Does anyone happen to know the state of the art for the popular 20 Newsgroups dataset? (And what are the most common train/dev/test splits people use?)

Collaborate with nlpprogress.com/ ?

Hi, thanks for maintaining this list, it's awesome!

Just wondering if you are aware of nlpprogress.com/, which is doing similar things but focuses on NLP. It would be nice to work together with them.

Time Series Classification

Time series classification is a very popular machine learning problem.
You can find a full survey and empirical study (link to paper) on 85 datasets here.
More recently, in our paper we showed that deep learning can also reach state-of-the-art performance for time series classification.

• Research paper name: The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances (link) & Deep learning for time series classification: a review (link)
• Dataset: UEA archive (link)
• Metric: Accuracy + average rank comparison over the datasets (reference)
• Source code: Time Series Classification (link) & Deep learning for time series classification (link)
• Year: 2017 for Time Series Classification & 2018 for Deep learning for time series classification

Mask R-CNN implementations - additional info

Hi, I'd like to add several models implementing Mask R-CNN.
The first one is Facebook Detectron in Caffe2. It works well.
Another one is in Tensorflow with a custom Slim library. This one is not supported by the author, but it works.
The last one is in MXNet.

About the link in the description - it is Keras on top of TensorFlow, not pure TensorFlow.
Hope it helps.

P.S. Could you add guidance on the format in which people should submit pull requests?

Your email address `[email protected]` was not found

I just sent an email to [email protected], but it seems it did not work:

    Address not found
    --
    Your message wasn't delivered to [email protected] because the address couldn't be found, or is unable to receive mail.

    The response was: The email account that you tried to reach does not exist. Please try double-checking the recipient's email address for typos or unnecessary spaces. Learn more at https://support.google.com/mail/?p=NoSuchUser r20-v6sor96536itb.73 - gsmtp

So I just post my email here ;)

Hi there,

I'm a machine learning newbie with 20 years of programming experience. I love ML, and this year I'll start my Ph.D. study on ChatBot (NLU, ChatUI) in Beijing.

I'm willing to help as a collaborator because I love your idea of making a one-stop reference for state-of-the-art results for all types of machine learning problems; it helps me a lot.

Please feel free to let me know what I could do for you at any time.

My GitHub: https://github.com/zixia
My LinkedIn: https://linkedin.com/in/zixia
My WeChat: 918999

Have a nice day!

Huan LI
[email protected]

New Topic for Computer Vision

Excellent job on this repo.

For computer vision, some other tasks will be important.
I will provide some topics and references that I am familiar with.

Instance Segmentation:
MASK-RCNN
The dataset used for evaluation is COCO.

Bounding-Box Object Detection:
MASK-RCNN

Some other metrics for evaluation might be important, such as fps for detection:
YOLO2
SSD

For bounding-box object detection, there are some other datasets:
ImageNet DET
Pascal VOC
UA-DETRAC

I have not looked at the development in speed for a while, so there might be something new.
MASK-RCNN provides the best accuracy now for sure.

Include NLG papers

NLG, the other end of NLP, is important in many fields where AI is being applied. Please include the latest NLG research as well; imo it would be very helpful.

Problem request: Dynamic pricing

Very interested in machine learning solutions to any form of dynamic pricing, including but not limited to:

Formulations

• Base case: known supply, known demand
• Retail: known supply, stochastic demand
• Consignment: stochastic supply, stochastic demand

Industries

• E-commerce
• Brick and mortar
• Airlines
• Hotels

Weighted Transformer

There is a new paper out with faster learning and a slightly better BLEU score for the transformer architecture, called Weighted Transformer Network for Machine Translation:
https://arxiv.org/abs/1711.02132

But I think there is no open-source implementation available.

NASNet

NASNet: the paper is here,
the code is here.
The top-1 error is 17.3 on ImageNet-1k.

Language Modelling: WikiText-103

Here is an update to the list for language modelling. The WikiText-103 dataset is not currently listed, but it has been a popular dataset for large vocabularies (not covered by PTB/WT2) and long-term dependencies (not covered by the billion word benchmark). It seems relevant for this list.

Paper title: Fast Parametric Learning with Activation Memorization
Dataset: WikiText-103
Metric: 29.2 (Perplexity)
Year: 2018
Link: https://arxiv.org/abs/1803.10049

Add Word Sense Disambiguation (WSD)

Past SoTA

It would be good if a track of past SoTA results could be provided, showing the path of how the techniques developed.
I may be able to help in some of the areas.

Speech section

There is an existing repo for speech SOTAs: https://github.com/syhw/wer_are_we. Perhaps you want to reference it and/or join forces with them.

Concerning the Switchboard numbers, you need to mention that they used the 2000h set for training and the Switchboard portion of Hub5'00 for testing (not the CallHome subset).

History of state-of-the-art results

It would be interesting to keep a list of previous state-of-the-art results and the related papers.
That would help in understanding the evolution of the methods used to address each problem.

For readability, I suggest listing them on a new page.
If you think this idea is worthwhile, I'll start collecting information on this topic and will submit a PR.

Thanks for this wonderful resource! 👏

Object Detection

Hi Yudong,

I didn't see anything about object detection. Is there a reason for that, or did you simply forget to add it?

Thanks
