
Official repository for our NeurIPS 2020 *oral* "Do Adversarially Robust ImageNet Models Transfer Better?"

Home Page: https://arxiv.org/abs/2007.08489

License: MIT License


Transfer Learning using Adversarially Robust ImageNet Models

This repository contains the code and models necessary to replicate the results of our paper:

Do Adversarially Robust ImageNet Models Transfer Better?
Hadi Salman*, Andrew Ilyas*, Logan Engstrom, Ashish Kapoor, Aleksander Madry
Paper: https://arxiv.org/abs/2007.08489
Blog post: https://www.microsoft.com/en-us/research/blog/adversarial-robustness-as-a-prior-for-better-transfer-learning/

    @InProceedings{salman2020adversarially,
        title={Do Adversarially Robust ImageNet Models Transfer Better?},
        author={Hadi Salman and Andrew Ilyas and Logan Engstrom and Ashish Kapoor and Aleksander Madry},
        year={2020},
        booktitle={ArXiv preprint arXiv:2007.08489}
    }

Getting started

Our code relies on the MadryLab public robustness library, which will be automatically installed when you follow the instructions below.

Clone our repo: git clone https://github.com/microsoft/robust-models-transfer.git

Install dependencies:

conda create -n robust-transfer python=3.7
conda activate robust-transfer
pip install -r requirements.txt

Running transfer learning experiments

The entry point of our code is main.py (see the file for a full description of arguments).

1- Download one of the pretrained robust ImageNet models, say an L2-robust ResNet-18 with ε = 3. For a full list of models, see the section below!

mkdir -p pretrained-models
wget -O pretrained-models/resnet-18-l2-eps3.ckpt "https://huggingface.co/madrylab/robust-imagenet-models/resolve/main/resnet18_l2_eps3.ckpt"

2- Run the following script (best parameters for each dataset and architecture can be found here)

python src/main.py --arch resnet18 \
  --dataset cifar10 \
  --data /tmp \
  --out-dir outdir \
  --exp-name cifar10-transfer-demo \
  --epochs 150 \
  --lr 0.01 \
  --step-lr 30 \
  --batch-size 64 \
  --weight-decay 5e-4 \
  --adv-train 0 \
  --model-path pretrained-models/resnet-18-l2-eps3.ckpt \
  --freeze-level -1

--freeze-level sets the transfer mode: -1 for full-network transfer, 4 for fixed-feature transfer.

3- That's it!
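
For scripted sweeps over datasets or transfer modes, the invocation from step 2 can be assembled programmatically. The sketch below only builds the argument list from the flags and values shown above; the helper name `build_transfer_command`, the `mode` labels, and the choice of defaults are illustrative assumptions and are not part of this repository.

```python
import shlex

# Freeze levels documented above: -1 = full-network, 4 = fixed-feature.
FREEZE_LEVELS = {"full-network": -1, "fixed-feature": 4}


def build_transfer_command(arch, dataset, data_dir, model_path,
                           mode="full-network", exp_name=None):
    """Return the argument list for one transfer run (hypothetical helper)."""
    if mode not in FREEZE_LEVELS:
        raise ValueError(f"mode must be one of {sorted(FREEZE_LEVELS)}")
    return [
        "python", "src/main.py",
        "--arch", arch,
        "--dataset", dataset,
        "--data", data_dir,
        "--out-dir", "outdir",
        "--exp-name", exp_name or f"{dataset}-transfer-demo",
        "--epochs", "150",
        "--lr", "0.01",
        "--step-lr", "30",
        "--batch-size", "64",
        "--weight-decay", "5e-4",
        "--adv-train", "0",
        "--model-path", model_path,
        "--freeze-level", str(FREEZE_LEVELS[mode]),
    ]


cmd = build_transfer_command(
    "resnet18", "cifar10", "/tmp",
    "pretrained-models/resnet-18-l2-eps3.ckpt", mode="fixed-feature")
# Print a copy-pasteable shell line; nothing is launched here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually start the run, the list can be handed to `subprocess.run(cmd, check=True)` from the repository root. Passing a list rather than a single string avoids shell-quoting problems with paths that contain spaces.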

Datasets that we use (see our paper for citations)

aircraft (Download)
birds (Download)
caltech101 (Download)
caltech256 (Download)
cifar10 (Automatically downloaded when you run the code)
cifar100 (Automatically downloaded when you run the code)
dtd (Download)
flowers (Download)
food (Download)
pets (Download)
stanford_cars (Download)
SUN397 (Download)

To use any of these datasets in the code:

Download (click or use wget) and extract the desired dataset somewhere, e.g.
```
tar -xvf pets.tar
mkdir /tmp/datasets
mv pets /tmp/datasets
```

Add the dataset name and path as arguments, e.g.

python src/main.py --arch resnet18  ... --dataset pets --data /tmp/datasets/pets
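
The same extraction step can be done from Python, which is convenient when preparing several datasets in one script. This is a minimal sketch using only the standard library; the helper name `extract_dataset` is hypothetical, and it assumes each archive unpacks into a single top-level directory named after the archive (pets.tar into pets/), as in the shell example above.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive, root="/tmp/datasets"):
    """Extract `archive` under `root` and return the path to pass as --data.

    Assumes the archive contains one top-level directory whose name matches
    the archive name up to the first dot (pets.tar -> pets/).
    """
    archive = Path(archive)
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive) as tar:
        tar.extractall(root)
    return root / archive.name.split(".")[0]


# Example (requires pets.tar in the current directory):
#   data_dir = extract_dataset("pets.tar")
#   then run: python src/main.py ... --dataset pets --data <data_dir>
```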

Architectures

You can choose an architecture by passing its name as an argument to the code, e.g.

python src/main.py --arch resnet50 ...

The set of possible architectures is:

archs = [resnet18, 
         resnet50, 
         wide_resnet50_2, 
         wide_resnet50_4, 
         densenet,
         mnasnet,
         mobilenet,
         resnext50_32x4d,
         shufflenet,
         vgg16_bn
         ]
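
When launching many runs, it can help to catch a mistyped architecture name before a job starts. The following pre-flight check is only a sketch: `check_arch` is a hypothetical helper, and whether main.py performs its own validation is not stated here. The names are copied from the list above.

```python
import difflib

# Architecture names accepted by --arch, copied from the list above.
ARCHS = [
    "resnet18", "resnet50", "wide_resnet50_2", "wide_resnet50_4",
    "densenet", "mnasnet", "mobilenet", "resnext50_32x4d",
    "shufflenet", "vgg16_bn",
]


def check_arch(name):
    """Return `name` if it is a listed architecture, else raise with hints."""
    if name in ARCHS:
        return name
    # Suggest up to three listed names that are textually close to the input.
    hints = difflib.get_close_matches(name, ARCHS, n=3, cutoff=0.6)
    raise ValueError(f"unknown arch {name!r}; did you mean one of {hints}?")
```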

Download our robust ImageNet models

If you find our pretrained models useful, please consider citing our work.

Standard Accuracy of L2-Robust ImageNet Models

Model	ε=0	ε=0.01	ε=0.03	ε=0.05	ε=0.1	ε=0.25	ε=0.5	ε=1.0	ε=3.0	ε=5.0
ResNet-18	69.79	69.90	69.24	69.15	68.77	67.43	65.49	62.32	53.12	45.59
ResNet-50	75.80	75.68	75.76	75.59	74.78	74.14	73.16	70.43	62.83	56.13
Wide-ResNet-50-2	76.97	77.25	77.26	77.17	76.74	76.21	75.11	73.41	66.90	60.94
Wide-ResNet-50-4	77.91	78.02	77.87	77.77	77.64	77.10	76.52	75.51	69.67	65.20

Model	ε=0	ε=3
DenseNet	77.37	66.98
MNASNET	60.97	41.83
MobileNet-v2	65.26	50.40
ResNeXt50_32x4d	77.38	66.25
ShuffleNet	64.25	43.32
VGG16_bn	73.66	57.19

Standard Accuracy of Linf-Robust ImageNet Models

Model	ε=0.5/255	ε=1/255	ε=2/255	ε=4/255	ε=8/255
ResNet-18	66.13	63.46	59.63	52.49	42.11
ResNet-50	73.73	72.05	69.10	63.86	54.53
Wide-ResNet-50-2	75.82	74.65	72.35	68.41	60.82
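
One way to read these tables is as the standard accuracy given up for a chosen robustness level. As a small worked example, the sketch below computes the drop between ε=0 and ε=3 for the four ResNet variants in the L2 table; the numbers are copied from that table, while the names `L2_ACCURACY` and `accuracy_drop` are illustrative.

```python
# Top-1 standard accuracy (%) at eps=0 and eps=3, copied from the L2 table.
L2_ACCURACY = {
    "ResNet-18":        {0.0: 69.79, 3.0: 53.12},
    "ResNet-50":        {0.0: 75.80, 3.0: 62.83},
    "Wide-ResNet-50-2": {0.0: 76.97, 3.0: 66.90},
    "Wide-ResNet-50-4": {0.0: 77.91, 3.0: 69.67},
}


def accuracy_drop(model, eps, table=L2_ACCURACY):
    """Percentage points of standard accuracy lost relative to eps = 0."""
    row = table[model]
    return round(row[0.0] - row[eps], 2)


for name in L2_ACCURACY:
    print(f"{name:18s} {accuracy_drop(name, 3.0):5.2f}")
```

For these four rows the drop shrinks as the network gets wider (16.67 points for ResNet-18 against 8.24 for Wide-ResNet-50-4).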

We also host these models on HuggingFace; check them out!

Maintainers

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.


robust-models-transfer's Issues

Pretrained models on coco

Will the pretrained models for the MS coco dataset be available?

Cannot download the dataset

Hi, thanks for providing the code. When I try to download the dataset, I get the following error:

"This request is not authorized to perform this operation. RequestId:cc37a921-a01e-0013-3ab7-c85b53000000 Time:2023-08-06T22:43:57.6153791Z"

Could you please resolve this problem?

Request about code of training models on ImageNet

Hi, I'm wondering how I could use your codebase to train a model from scratch on ImageNet, to obtain a pretrained model like the ones you provide. Could you please share the code, or let me know if I'm missing something?

Error loading Mnasnet pre-trained model

Thanks for sharing this neat codebase.

I am trying to load the MNASNET pre-trained model and get the following assertion error from pytorch. It looks related to the MNASNET version that is being loaded. I am not familiar with this architecture so let me know if this issue belongs in the pytorch forums instead.

Thank you.

AssertionError Traceback (most recent call last)
in
37
38 for m in pre_trained_model_names:
---> 39 model, _ = model_utils.make_and_restore_model(
40 arch=models.mnasnet1_0(False),
41 dataset=datasets.ImageNet(''),

~/miniconda3/envs/robust-transfer/lib/python3.8/site-packages/robustness/model_utils.py in make_and_restore_model(arch, dataset, resume_path, parallel, pytorch_pretrained, add_custom_forward, *_)
100 sd = checkpoint[state_dict_path]
101 sd = {k[len('module.'):]:v for k,v in sd.items()}
--> 102 model.load_state_dict(sd)
103 print("=> loaded checkpoint '{}' (epoch {})".format(resume_path, checkpoint['epoch']))
104 elif resume_path:

~/miniconda3/envs/robust-transfer/lib/python3.8/site-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
1207 load(child, prefix + name + '.')
1208
-> 1209 load(self)
1210 del load
1211

~/miniconda3/envs/robust-transfer/lib/python3.8/site-packages/torch/nn/modules/module.py in load(module, prefix)
1205 for name, child in module._modules.items():
1206 if child is not None:
-> 1207 load(child, prefix + name + '.')
1208
1209 load(self)

~/miniconda3/envs/robust-transfer/lib/python3.8/site-packages/torch/nn/modules/module.py in load(module, prefix)
1201 def load(module, prefix=''):
1202 local_metadata = {} if metadata is None else metadata.get(prefix[:-1], {})
-> 1203 module._load_from_state_dict(
1204 state_dict, prefix, local_metadata, True, missing_keys, unexpected_keys, error_msgs)
1205 for name, child in module._modules.items():

~/miniconda3/envs/robust-transfer/lib/python3.8/site-packages/torchvision/models/mnasnet.py in _load_from_state_dict(self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs)
169 missing_keys: List[str], unexpected_keys: List[str], error_msgs: List[str]) -> None:
170 version = local_metadata.get("version", None)
--> 171 assert version in [1, 2]
172
173 if version == 1 and not self.alpha == 1.0:

AssertionError:
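
One plausible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: PyTorch state dicts are `OrderedDict` objects that carry a `_metadata` attribute holding per-module version numbers, and rebuilding one with a dict comprehension (as on line 101 of model_utils.py above) yields a plain `dict` without that attribute. torchvision's MNASNet would then read `version = None` and fail the assertion. The sketch below reproduces the effect with the standard library only, using numbers in place of tensors; `strip_prefix` is a hypothetical helper and not part of the robustness library.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Strip `prefix` from every key while carrying `_metadata` across."""
    def rename(key):
        return key[len(prefix):] if key.startswith(prefix) else key

    out = OrderedDict((rename(k), v) for k, v in state_dict.items())
    metadata = getattr(state_dict, "_metadata", None)
    if metadata is not None:
        # Metadata is keyed by module path, so it needs the same renaming.
        out._metadata = OrderedDict((rename(k), v) for k, v in metadata.items())
    return out


# Stand-in for a saved checkpoint: one weight plus its version metadata.
sd = OrderedDict([("module.model.layers.0.weight", 0.0)])
sd._metadata = OrderedDict([("module.model", {"version": 2})])

# The dict comprehension keeps the weights but silently drops the metadata.
plain = {k[len("module."):]: v for k, v in sd.items()}
print(getattr(plain, "_metadata", None))     # None
print(strip_prefix(sd)._metadata["model"])   # {'version': 2}
```

Whether the released MNASNet checkpoint actually contains version metadata has not been verified here, so this is a starting point for debugging rather than a guaranteed fix.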

How to load ImageNet models in Tensorflow using these weights

Hi,

Could someone show how to load an ImageNet model in TensorFlow using the provided weights?
I tried the following:

passing the path of the provided weights to the TensorFlow weights argument
calling model.load_weights()
Both attempts fail with this error:
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5f.pyx in h5py.h5f.open()

OSError: Unable to open file (file signature not found)
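
The "file signature not found" message is raised by h5py, which only opens HDF5 files. The checkpoints in this repository are loaded through PyTorch elsewhere in these threads, so they are presumably files written by `torch.save` rather than Keras HDF5 weights. A quick way to check is to inspect the first bytes of the file. The helper below is an illustrative standard-library sketch (the name `sniff_checkpoint` is hypothetical); the byte signatures are those of HDF5, of zip archives (used by recent `torch.save`), and of pickle protocol 2 and later (used by the legacy `torch.save` format).

```python
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"   # signature h5py looks for
ZIP_MAGIC = b"PK\x03\x04"           # zip container, used by recent torch.save
PICKLE_MAGIC = b"\x80"              # pickle protocol >= 2, legacy torch.save


def sniff_checkpoint(path):
    """Guess the container format of a weights file from its first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        return "hdf5"
    if head.startswith(ZIP_MAGIC):
        return "zip (likely torch.save)"
    if head.startswith(PICKLE_MAGIC):
        return "pickle (likely legacy torch.save)"
    return "unknown"
```

If the result is not "hdf5", `model.load_weights()` cannot read the file directly; the weights would first have to be loaded in PyTorch and converted.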

Mismatch in accuracy

Hello!

Great work, thanks for making the code public!

I tried evaluating the L2 PGD-trained ResNet-50 (ε = 1) model on the ImageNet validation set. I got 82.5% standard top-1 accuracy versus the 70.43% reported in the paper. The robust accuracy was 69.4%.
I used the robustness library for evaluation, with the following CLI command. The attack is L2 PGD-3 with eps=1 and step size=2/3, as mentioned in the paper.

python -m robustness.main --dataset imagenet --data ../../../datasets/imagenet2012/ --adv-train 1 --arch resnet50 --eps 1. --attack-lr 0.666666667 --attack-steps 3 --constraint 2 --eval-only 1 --out-dir logs --resume ../../pretrained_models/resnet50_l2_eps1.ckpt --adv-eval 1 --batch-size 128 --mixed-precision 1

I am not sure what I am missing here. Could you please suggest which factors I should take care of when running this evaluation? Thank you.

How to load resnet models with the given checkpoints?

Can you please provide the script to load resnet model with the given checkpoints in the repo?

Robust DenseNet121 Models

Hi authors,
Thank you for releasing your code and your pre-trained models.

The Standard Accuracy section of the README benchmarks the performance of DenseNet161 on ImageNet. (The number of convolution channels for conv0 is 96 for DenseNet161, while it is 64 for DenseNet121.)

Will it be possible for you to release the adversarially trained DenseNet121 model? If your answer is No, could you please tell me which command to use to obtain an adversarially trained DenseNet121 ImageNet model?

closed

Pretrained COCO models

It looks like pretrained models on COCO are still missing? Any chance you could upload the weights?

Failed to download models

Hi there,

I tried to download some pretrained models, but failed. Could you please fix the links at https://github.com/microsoft/robust-models-transfer#download-our-robust-imagenet-models ?

AuthenticationFailed: Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature.
RequestId:95671a2b-401e-000b-0d30-ba8434000000
Time:2021-10-05T21:32:40.6862769Z
Signature not valid in the specified time frame: Start [Wed, 10 Jun 2020 07:06:23 GMT] - Expiry [Tue, 05 Oct 2021 15:06:23 GMT] - Current [Tue, 05 Oct 2021 21:32:40 GMT]

Reported ImageNet results on test set or training set?

Hi, I'm impressed by your work! Are the results reported in Table 3 of your paper on the ImageNet test set or the training set?

Issue loading VGG16_bn model

Hi, I'm trying to load the VGG16_bn model in the following manner:

OUT_DIR = '/tmp/'
NUM_WORKERS = 16
BATCH_SIZE = 512

from robustness import model_utils, datasets, train, defaults
from robustness.datasets import CIFAR, ImageNet
import torch
from cox.utils import Parameters
import cox.store
from torchvision import transforms
imagenet_ds = ImageNet('/tmp/')
densenet , _ = model_utils.make_and_restore_model(arch='vgg16_bn', dataset=imagenet_ds, 
                                                          resume_path='vgg16_bn_l2_eps0.ckpt', parallel=False)

but when I do I notice the following error

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in load_state_dict(self, state_dict, strict)
   1050         if len(error_msgs) > 0:
   1051             raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
-> 1052                                self.__class__.__name__, "\n\t".join(error_msgs)))
   1053         return _IncompatibleKeys(missing_keys, unexpected_keys)
   1054 

RuntimeError: Error(s) in loading state_dict for AttackerModel:
	Missing key(s) in state_dict: "model.features.0.weight", "model.features.0.bias", "model.features.1.weight", "model.features.1.bias", "model.features.1.running_mean", "model.features.1.running_var", "model.features.3.weight", "model.features.3.bias", "model.features.4.weight", "model.features.4.bias", "model.features.4.running_mean", "model.features.4.running_var", "model.features.7.weight", "model.features.7.bias", "model.features.8.weight", "model.features.8.bias", "model.features.8.running_mean", "model.features.8.running_var", "model.features.10.weight", "model.features.10.bias", "model.features.11.weight", "model.features.11.bias", "model.features.11.running_mean", "model.features.11.running_var", "model.features.14.weight", "model.features.14.bias", "model.features.15.weight", "model.features.15.bias", "model.features.15.running_mean", "model.features.15.running_var", "model.features.17.weight", "model.features.17.bias", "model.features.18.weight", "model.features.18.bias", "model.features.18.running_mean", "model.features.18.running_var", "model.features.20.weight", "model.features.20.bias", "model.features.21.weight", "model.features.21.bias", "model.features.21.running_mean", "model.features.21.running_var", "model.features.24.weight", "model.features.24.bias", "model.features.25.weight", "model.features.25.bias", "model.features.25.running_mean", "model.features.25.running_var", "model.features.27.weight", "model.features.27.bias", "model.features.28.weight", "m...
	Unexpected key(s) in state_dict: "model.model.features.0.weight", "model.model.features.0.bias", "model.model.features.1.weight", "model.model.features.1.bias", "model.model.features.1.running_mean", "model.model.features.1.running_var", "model.model.features.1.num_batches_tracked", "model.model.features.3.weight", "model.model.features.3.bias", "model.model.features.4.weight", "model.model.features.4.bias", "model.model.features.4.running_mean", "model.model.features.4.running_var", "model.model.features.4.num_batches_tracked", "model.model.features.7.weight", "model.model.features.7.bias", "model.model.features.8.weight", "model.model.features.8.bias", "model.model.features.8.running_mean", "model.model.features.8.running_var", "model.model.features.8.num_batches_tracked", "model.model.features.10.weight", "model.model.features.10.bias", "model.model.features.11.weight", "model.model.features.11.bias", "model.model.features.11.running_mean", "model.model.features.11.running_var", "model.model.features.11.num_batches_tracked", "model.model.features.14.weight", "model.model.features.14.bias", "model.model.features.15.weight", "model.model.features.15.bias", "model.model.features.15.running_mean", "model.model.features.15.running_var", "model.model.features.15.num_batches_tracked", "model.model.features.17.weight", "model.model.features.17.bias", "model.model.features.18.weight", "model.model.features.18.bias", "model.model.features.18.running_mean", "model.model.features....

Any thoughts on what this is? For reference I had no problem loading the ResNet50 models. Thanks much, big fan of the library!
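
Comparing the two lists in the error, the checkpoint keys carry one extra `model.` level: the file contains `model.model.features.0.weight` where the loader expects `model.features.0.weight`. One possible workaround, offered as an untested assumption and not as the maintainers' recommendation, is to rename the keys before calling `load_state_dict`. The helper `collapse_doubled_prefix` below is hypothetical and works on any key-to-value mapping, so plain numbers stand in for tensors.

```python
from collections import OrderedDict


def collapse_doubled_prefix(state_dict, doubled="model.model.", single="model."):
    """Return a copy in which the first `doubled` in each key becomes `single`."""
    return OrderedDict(
        (key.replace(doubled, single, 1), value)
        for key, value in state_dict.items()
    )


# Keys taken from the "Unexpected key(s)" list above; values are placeholders.
ckpt = OrderedDict([
    ("model.model.features.0.weight", 0),
    ("model.model.features.1.running_mean", 0),
])
print(list(collapse_doubled_prefix(ckpt)))
# ['model.features.0.weight', 'model.features.1.running_mean']
```

The renamed keys match the names reported under "Missing key(s)", which is the only evidence here that this mapping is the intended one.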

Failed to download pretrained models.

Hi, could you please help fix this link?
Thank you so much~!

Training settings for the model uploaded on RobustBench

Hi authors,

I noticed that the results on RobustBench were different for the models on the Robustness library (62.56/29.22) and for this work (64.02/34.96) for the same architecture ResNet-50, although the same code base was used for the pretraining of both on ImageNet. Could you please share the differences in the training settings of both?
I believe the defaults for the training of models on the Robustness library are the following with 7 attack steps:
datasets.ImageNet: { "epochs": 200, "batch_size":256, "weight_decay":1e-4, "step_lr": 50 },

Unexpected keys in state_dict of densenet161 model

I tried to load your checkpoint of densenet model (i.e. args.arch='densenet161'), but found unexpected keys as follows:

normalizer.new_mean
normalizer.new_std
model.model.features.conv0.weight
model.model.features.norm0.weight
model.model.features.norm0.bias
model.model.features.norm0.running_mean
model.model.features.norm0.running_var
model.model.features.norm0.num_batches_tracked
model.model.features.denseblock1.denselayer1.norm1.weight
model.model.features.denseblock1.denselayer1.norm1.bias
model.model.features.denseblock1.denselayer1.norm1.running_mean
model.model.features.denseblock1.denselayer1.norm1.running_var
model.model.features.denseblock1.denselayer1.norm1.num_batches_tracked
model.model.features.denseblock1.denselayer1.conv1.weight
model.model.features.denseblock1.denselayer1.norm2.weight
model.model.features.denseblock1.denselayer1.norm2.bias
model.model.features.denseblock1.denselayer1.norm2.running_mean
model.model.features.denseblock1.denselayer1.norm2.running_var
model.model.features.denseblock1.denselayer1.norm2.num_batches_tracked
model.model.features.denseblock1.denselayer1.conv2.weight
model.model.features.denseblock1.denselayer2.norm1.weight
......
model.model.features.denseblock4.denselayer24.conv2.weight
model.model.features.norm5.weight
model.model.features.norm5.bias
model.model.features.norm5.running_mean
model.model.features.norm5.running_var
model.model.features.norm5.num_batches_tracked
model.model.classifier.weight
model.model.classifier.bias
attacker.normalize.new_mean
attacker.normalize.new_std
attacker.model.model.features.conv0.weight
attacker.model.model.features.norm0.weight
attacker.model.model.features.norm0.bias
attacker.model.model.features.norm0.running_mean
attacker.model.model.features.norm0.running_var
attacker.model.model.features.norm0.num_batches_tracked
attacker.model.model.features.denseblock1.denselayer1.norm1.weight
attacker.model.model.features.denseblock1.denselayer1.norm1.bias
......
attacker.model.model.features.norm5.weight
attacker.model.model.features.norm5.bias
attacker.model.model.features.norm5.running_mean
attacker.model.model.features.norm5.running_var
attacker.model.model.features.norm5.num_batches_tracked
attacker.model.model.classifier.weight
attacker.model.model.classifier.bias

These keys are with prefixes model.model. and attacker.model.model. Is there anything wrong with the saved checkpoint?

Can you provide the 'robustness' package's Python code?

Hi, I didn't find the robustness package in your provided source code.

Well, I found it at https://github.com/MadryLab/robustness! Thank you!