

Note: As no changes to this codebase are planned, this repository is archived. If you have any questions, do not hesitate to contact us by email.

FMNIST-C (Corrupted Fashion-MNIST)


This repository contains the source code used to create the FMNIST-C dataset, a corrupted Fashion-MNIST benchmark for testing out-of-distribution robustness of computer vision models.

FMNIST is a drop-in replacement for MNIST. FMNIST-C is a corresponding drop-in replacement for MNIST-C.

Corruptions

The following corruptions are applied to the images, as in MNIST-C:

  • Noise (shot noise and impulse noise)
  • Blur (glass and motion blur)
  • Transformations (shear, scale, rotate, brightness, contrast, saturate, inverse)

In addition, we apply various flips and turns. For fashion images, flipping an image does not change its label, and the result is still a valid image. However, we noticed that in the nominal Fashion-MNIST dataset most images share the same orientation (e.g., most shoes point to the left). Flipped images therefore serve as valid out-of-distribution (OOD) inputs.
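The flipping and turning idea can be sketched as follows. This is a minimal illustration, assuming images are 2D NumPy arrays of shape 28x28; the helper name `flip_and_turn` and the particular set of variants are hypothetical and not taken from this repository's generation code.

```python
import numpy as np

def flip_and_turn(image: np.ndarray) -> dict:
    """Return flipped/turned variants of a 2D image (hypothetical helper)."""
    return {
        "flip_horizontal": np.fliplr(image),  # mirror left <-> right
        "flip_vertical": np.flipud(image),    # mirror top <-> bottom
        "turn_90": np.rot90(image, k=1),      # quarter turn, counter-clockwise
        "turn_180": np.rot90(image, k=2),     # half turn
    }

# Toy 28x28 "image" with a single bright pixel near the left edge
image = np.zeros((28, 28), dtype=np.uint8)
image[14, 2] = 255
variants = flip_and_turn(image)
```

After the horizontal flip, the bright pixel sits near the right edge instead, which mirrors the "shoes pointing the other way" case described above.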

Most corruptions are applied at a randomly selected severity level, so that some corrupted images are very hard to classify, whereas for others the corruption, while present, is subtle.
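To illustrate what a randomly selected severity can look like, here is a minimal sketch using impulse (salt-and-pepper) noise. The severity values in `SEVERITY_LEVELS` and the function name are assumptions made for this example; the levels actually used to build FMNIST-C may differ.

```python
import numpy as np

# Hypothetical severity levels: fraction of pixels replaced by noise
SEVERITY_LEVELS = (0.02, 0.05, 0.1, 0.2, 0.4)

def impulse_noise_random_severity(image, rng):
    """Apply impulse noise at a randomly chosen severity.

    Returns the corrupted copy and the chosen pixel fraction.
    """
    fraction = float(rng.choice(SEVERITY_LEVELS))
    corrupted = image.copy()
    mask = rng.random(image.shape) < fraction  # pixels to corrupt
    salt = rng.random(image.shape) < 0.5       # half become white, half black
    corrupted[mask & salt] = 255
    corrupted[mask & ~salt] = 0
    return corrupted, fraction

rng = np.random.default_rng(0)
image = np.full((28, 28), 128, dtype=np.uint8)
corrupted, fraction = impulse_noise_random_severity(image, rng)
```

With a low fraction only a handful of pixels change, while the highest level overwrites a large share of the image, which gives the mix of subtle and hard cases described above.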

Usage

The easiest way to use fashion-mnist-c is through Hugging Face datasets:

# Install huggingface datasets
# pip install datasets

# The next two lines are all you need to load the corrupted dataset
from datasets import load_dataset
fmnist_c = load_dataset("mweiss/fashion_mnist_corrupted")

# Convert the test set to numpy arrays (if you want)
#   You could of course do the same with the training set, but in most robustness studies,
#   you'd use corrupted data only for testing, not for training.
import numpy as np
fmnist_c_x_test = np.array([np.array(x) for x in fmnist_c['test']['image']])
fmnist_c_y_test = np.array(fmnist_c['test']['label'])
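Once the test set is available as arrays, a robustness check usually boils down to measuring accuracy on the corrupted inputs. The following is a minimal sketch, not part of this repository: `accuracy` and `always_zero` are hypothetical names, and the tiny synthetic arrays stand in for `fmnist_c_x_test` and `fmnist_c_y_test`.

```python
import numpy as np

def accuracy(predict_fn, x, y, batch_size=256):
    """Fraction of inputs for which predict_fn returns the correct label."""
    correct = 0
    for start in range(0, len(x), batch_size):
        batch = x[start:start + batch_size]
        labels = y[start:start + batch_size]
        correct += int(np.sum(predict_fn(batch) == labels))
    return correct / len(x)

# Placeholder "model" that always predicts class 0;
# swap in your own classifier's prediction function.
def always_zero(batch):
    return np.zeros(len(batch), dtype=np.int64)

# Tiny synthetic stand-in for (fmnist_c_x_test, fmnist_c_y_test)
x_demo = np.zeros((4, 28, 28), dtype=np.uint8)
y_demo = np.array([0, 1, 0, 2])
demo_accuracy = accuracy(always_zero, x_demo, y_demo)
```

Comparing this number with the accuracy on the nominal Fashion-MNIST test set gives a simple measure of how much the corruptions hurt a model.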

Otherwise, this repository contains the binaries of the datasets in two formats:

  • ./generated/npy/... Numpy arrays.
  • ./generated/ubyte/... The file format used for the original MNIST dataset. These files can thus be used as drop-in replacements in most MNIST data loaders.
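If you do not have an MNIST data loader at hand, the ubyte (IDX) format is simple enough to parse directly. The sketch below assumes an uncompressed IDX file with unsigned-byte data; `read_idx_ubyte` is a hypothetical helper, and the demo writes and reads back a small synthetic file rather than one of the repository's files, whose exact names are not listed here.

```python
import os
import struct
import tempfile

import numpy as np

def read_idx_ubyte(path):
    """Read an uncompressed IDX file holding unsigned-byte data."""
    with open(path, "rb") as f:
        # Header: two zero bytes, a type code (0x08 = unsigned byte), number of dims
        zero, dtype_code, ndim = struct.unpack(">HBB", f.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file")
        # One big-endian 32-bit size per dimension
        shape = struct.unpack(">" + "I" * ndim, f.read(4 * ndim))
        data = np.frombuffer(f.read(), dtype=np.uint8)
    return data.reshape(shape)

# Round-trip demo on a synthetic file with two 28x28 images
images = (np.arange(2 * 28 * 28) % 256).astype(np.uint8).reshape(2, 28, 28)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo-images-idx3-ubyte")
    with open(demo_path, "wb") as f:
        f.write(struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28))
        f.write(images.tobytes())
    loaded = read_idx_ubyte(demo_path)
```

The Numpy files need no custom parsing: `np.load(path)` returns the array directly.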

Examples

[Example images: Turned, Blurred, Rotated, Noise, Noise, Turned]

Citation

If you use this dataset, please cite the following paper:

@inproceedings{Weiss2022SimpleTechniques,
  title={Simple Techniques Work Surprisingly Well for Neural Network Test Prioritization and Active Learning},
  author={Weiss, Michael and Tonella, Paolo},
  booktitle={Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis},
  year={2022}
}

Also, you may want to cite FMNIST and MNIST-C.

Credits

  • FMNIST-C is inspired by Google's MNIST-C, and our repository is essentially a clone of theirs. See their paper and repo.
  • Find the nominal (i.e., non-corrupted) Fashion-MNIST dataset here.
