Randomized Smoothing of All Shapes and Sizes

Last update: July 2020.


Code to accompany our paper:

Randomized Smoothing of All Shapes and Sizes
Greg Yang*, Tony Duan*, J. Edward Hu, Hadi Salman, Ilya Razenshteyn, Jerry Li.
International Conference on Machine Learning (ICML), 2020 [Paper] [Blog Post]

Notably, we outperform existing provably $\ell_1$-robust classifiers on ImageNet and CIFAR-10.

Table of SOTA results.

Figure of SOTA results.

This library implements the algorithms in our paper for computing robust radii for different smoothing distributions against different adversaries; for example, distributions of the form $e^{-\|x\|_\infty^k}$ against the $\ell_1$ adversary.

The following summarizes the (distribution, adversary) pairs covered here.

Venn Diagram of Distributions and Adversaries.

We can compare the certified robust radius each of these distributions implies at a fixed level of $\hat\rho_\mathrm{lower}$, the lower bound on the probability that the classifier returns the top class under noise. Here all noises are instantiated for CIFAR-10 dimensionality ($d=3072$) and normalized to variance $\sigma^2 \triangleq \mathbb{E}[\|x\|_2^2]=1$. Note that the first two rows below certify for the $\ell_1$ adversary while the last row certifies for the $\ell_2$ adversary and the $\ell_\infty$ adversary. For more details see our tutorial.ipynb notebook.

Certified Robust Radii of Distributions
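A rough sketch of how such a comparison can be reproduced with this library is below. It assumes the Gaussian and Laplace noise classes accept the same constructor arguments as the Uniform example shown later in this README; the value of sigma here is illustrative.

import torch
from src.noises import Uniform, Gaussian, Laplace  # constructor signatures assumed identical

# fixed lower bound on the probability that the classifier returns the top class under noise
prob_lb = torch.tensor([0.999])

# CIFAR-10 dimensionality, all noises instantiated at the same sigma
for noise_cls in (Uniform, Gaussian, Laplace):
    noise = noise_cls(device="cpu", dim=3072, sigma=0.5)
    radius = noise.certify(prob_lb, adv=1)  # certified robust radius for the l1 adversary
    print(noise_cls.__name__, radius)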

Getting Started

Clone our repository and install dependencies:

git clone https://github.com/tonyduan/rs4a.git
conda create --name rs4a python=3.6
conda activate rs4a
conda install numpy matplotlib pandas seaborn 
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
pip install torchnet tqdm statsmodels dfply

Experiments

To reproduce our SOTA $\ell_1$ results on CIFAR-10, we need to train models over

$$
\sigma \in \{0.15, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0, 2.25, 2.5, 2.75, 3.0, 3.25, 3.5\}.
$$

For each value of $\sigma$, run the following:

python3 -m src.train \
--model=WideResNet \
--noise=Uniform \
--sigma={sigma} \
--experiment-name=cifar_uniform_{sigma}

python3 -m src.test \
--model=WideResNet \
--noise=Uniform \
--sigma={sigma} \
--experiment-name=cifar_uniform_{sigma} \
--sample-size-cert=100000 \
--sample-size-pred=64 \
--noise-batch-size=512

The training script will train the model via data augmentation for the specified noise and level of sigma, and save the model checkpoint to a directory ckpts/experiment_name.

The testing script will load the model checkpoint from the ckpts/experiment_name directory, make predictions over the entire test set using the smoothed classifier, and certify the $\ell_1, \ell_2,$ and $\ell_\infty$ robust radii of these predictions. Note that by default we make predictions with $64$ samples, certify with $100,000$ samples, and use a failure probability of $\alpha=0.001$.
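To run the full sweep over $\sigma$, a small driver script along the following lines can be used. This is a convenience sketch rather than part of the repository; it simply shells out to the two commands above for each value of $\sigma$.

import subprocess

SIGMAS = [0.15, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75,
          2.0, 2.25, 2.5, 2.75, 3.0, 3.25, 3.5]

for sigma in SIGMAS:
    name = f"cifar_uniform_{sigma}"
    # train with Uniform noise data augmentation at this sigma
    subprocess.run(["python3", "-m", "src.train",
                    "--model=WideResNet", "--noise=Uniform",
                    f"--sigma={sigma}", f"--experiment-name={name}"], check=True)
    # predict and certify with the smoothed classifier
    subprocess.run(["python3", "-m", "src.test",
                    "--model=WideResNet", "--noise=Uniform",
                    f"--sigma={sigma}", f"--experiment-name={name}",
                    "--sample-size-cert=100000", "--sample-size-pred=64",
                    "--noise-batch-size=512"], check=True)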

To draw a comparison to the benchmark noises, re-run the above replacing Uniform with Gaussian and Laplace. Then, to plot the figures and print the table of results (for the $\ell_1$ adversary), run our analysis script:

python3 -m scripts.analyze --dir=ckpts --show --adv=1

Note that other noises need to be instantiated with the appropriate arguments when the training/testing code is invoked. For example, if we want to sample noise $\propto \|x\|_\infty^{-100}e^{-\|x\|_\infty^{10}}$, we would run:

python3 -m src.train \
--noise=ExpInf \
--k=10 \
--j=100 \
--sigma=0.5 \
--experiment-name=cifar_expinf_0.5
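The same noise can also be instantiated directly in Python for sampling. The snippet below is a sketch that assumes ExpInf takes k, j, and sigma keyword arguments mirroring the flags above, analogous to the Uniform example later in this README.

import torch
from src.noises import ExpInf  # noise proportional to ||x||_inf^{-j} * exp(-||x||_inf^k)

# k and j mirror the --k and --j flags above; constructor signature assumed analogous to Uniform
noise = ExpInf(device="cpu", dim=3072, sigma=0.5, k=10, j=100)

x = torch.rand(8, 3, 32, 32)   # a dummy batch of CIFAR-10-shaped inputs
noisy_x = noise.sample(x)      # noisy samples for training-time data augmentation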

Trained Models

Our pre-trained models are available.

The following commands will download all models into the pretrain/ directory.

mkdir -p pretrain
wget --directory-prefix=pretrain http://www.tonyduan.com/resources/2020_rs4a_ckpts/cifar_all.zip
unzip -d pretrain pretrain/cifar_all.zip
wget --directory-prefix=pretrain http://www.tonyduan.com/resources/2020_rs4a_ckpts/imagenet_all.zip
unzip -d pretrain pretrain/imagenet_all.zip

ImageNet (ResNet-50): [All Models, 2.3 GB]

CIFAR-10 (Wide ResNet 40-2): [All Models, 226 MB]

By default the models above were trained with noise augmentation. We further improve upon our state-of-the-art certified accuracies using recent advances in training smoothed classifiers: (1) by using stability training (Li et al. NeurIPS 2019), and (2) by leveraging additional data using (a) pre-training on downsampled ImageNet (Hendrycks et al. NeurIPS 2019) and (b) semi-supervised self-training with data from 80 Million Tiny Images (Carmon et al. 2019). Our improved models trained with these methods are released below.

ImageNet (ResNet 50):

CIFAR-10 (Wide ResNet 40-2):

An example of pre-trained model usage is below. For a more in-depth example, see our tutorial.ipynb notebook.

import torch

from src.models import WideResNet
from src.noises import Uniform
from src.smooth import *

# load the model
model = WideResNet(dataset="cifar", device="cuda")
saved_dict = torch.load("pretrain/cifar_uniform_050.pt")
model.load_state_dict(saved_dict)
model.eval()

# instantiate the noise (CIFAR-10 dimensionality, sigma = 0.5)
noise = Uniform(device="cpu", dim=3072, sigma=0.5)

# training code, to generate noisy samples for data augmentation (x is a batch of inputs)
noisy_x = noise.sample(x)

# testing code, certify for L1 adversary
preds = smooth_predict_hard(model, x, noise, 64)
top_cats = preds.probs.argmax(dim=1)
prob_lb = certify_prob_lb(model, x, top_cats, 0.001, noise, 100000)
radius = noise.certify(prob_lb, adv=1)
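The resulting radii can then be summarized however is convenient, for example as certified accuracy at a given radius. The helper below is illustrative and not part of the repository API; labels denotes the ground-truth classes for the batch.

def certified_accuracy(top_cats, labels, radii, r):
    # fraction of points that are predicted correctly and certified at radius >= r
    return ((top_cats == labels) & (radii >= r)).float().mean().item()

# e.g. certified accuracy for the l1 adversary at radius 0.5
# acc = certified_accuracy(top_cats, labels, radius, r=0.5)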

Repository

  1. ckpts/ is used to store experiment checkpoints and results.
  2. data/ is used to store image datasets.
  3. tables/ contains caches of pre-calculated tables of certified radii.
  4. src/ contains the main source code.
  5. scripts/ contains the analysis and plotting code.

Within the src/ directory, the most salient files are:

  1. train.py is used to train models and save to ckpts/.

  2. test.py is used to test and compute robust certificates for $\ell_1,\ell_2,\ell_\infty$ adversaries.

  3. noises/test_noises.py is a unit test for the noises we include. Run the test with

    python -m unittest src/noises/test_noises.py

    Note that some tests are probabilistic and can fail occasionally. If so, rerun a few more times to make sure the failure is not persistent.

  4. noises/noises.py is a library of noises derived for randomized smoothing.
