- LeNet 264KB
- ResNet18 43MB
- resnet18 vanilla: 94% (teacher)
- resnet18 mixup: 95%
- lenet vanilla: 66%
- lenet kd: 75.5%
- lenet kd + temperature 2 (kd_temp2): 75%
- lenet kd + mixup (alpha=0.2): 76%
- lenet kd + mixup (alpha=1): 75%
- lenet kd + mixup (alpha=2): 75%
- lenet kd + manifold mixup (alpha=0.2): 76%
- lenet kd + manifold mixup (alpha=1):
- lenet kd + fitnet + manifold mixup
By Hongyi Zhang, Moustapha Cisse, Yann Dauphin, David Lopez-Paz.
Facebook AI Research
Mixup is a generic and straightforward data augmentation principle. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples.
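As an illustration of the idea above, here is a minimal sketch in plain Python (standard library only) rather than the repository's training code. It assumes one-hot labels, a single mixing coefficient per batch drawn from Beta(alpha, alpha), and partners chosen by shuffling the batch. The function name `mixup_batch` and its parameters are hypothetical and exist only for this example.

```python
import random


def mixup_batch(inputs, one_hot_labels, alpha=1.0, rng=None):
    """Return convex combinations of examples and labels within one batch.

    Hypothetical helper for illustration; inputs and labels are lists of
    equal-length lists of floats.
    """
    rng = rng or random.Random()
    # Draw the mixing coefficient. Treating alpha <= 0 as "no mixing"
    # (lam = 1) is an assumption of this sketch.
    lam = rng.betavariate(alpha, alpha) if alpha > 0 else 1.0
    # Pair every example with a randomly chosen partner from the same batch.
    partner = list(range(len(inputs)))
    rng.shuffle(partner)

    def blend(a, b):
        # Convex combination: lam * a + (1 - lam) * b, element by element.
        return [lam * u + (1.0 - lam) * v for u, v in zip(a, b)]

    mixed_inputs = [blend(inputs[i], inputs[j]) for i, j in enumerate(partner)]
    mixed_labels = [
        blend(one_hot_labels[i], one_hot_labels[j]) for i, j in enumerate(partner)
    ]
    return mixed_inputs, mixed_labels, lam


if __name__ == "__main__":
    inputs = [[0.0, 0.0], [1.0, 1.0], [2.0, 4.0]]
    labels = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    mixed_x, mixed_y, lam = mixup_batch(
        inputs, labels, alpha=0.2, rng=random.Random(0)
    )
    print(lam, mixed_x, mixed_y)
```

Because the inputs and the labels are blended with the same coefficient, each mixed label remains a valid probability vector. A small alpha such as 0.2 concentrates the coefficient near 0 or 1, so most mixed examples stay close to one of the two originals.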
This repository contains the implementation used for the results in our paper (https://arxiv.org/abs/1710.09412).
If you use this method or this code in your paper, then please cite it:
@article{zhang2018mixup,
title={mixup: Beyond Empirical Risk Minimization},
author={Hongyi Zhang and Moustapha Cisse and Yann N. Dauphin and David Lopez-Paz},
journal={International Conference on Learning Representations},
year={2018},
url={https://openreview.net/forum?id=r1Ddp1-Rb},
}
- A computer running macOS or Linux
- For training new models, you'll also need an NVIDIA GPU and NCCL
- Python version 3.6
- A PyTorch installation
Use python train.py to train a new model.
Here is an example setting:
$ CUDA_VISIBLE_DEVICES=0 python train.py --lr=0.1 --seed=20170922 --decay=1e-4
This project is CC-BY-NC-licensed.
The CIFAR-10 reimplementation of mixup is adapted from the pytorch-cifar repository by kuangliu.