
natural-adv-examples's Introduction

Natural Adversarial Examples

We introduce natural adversarial examples -- real-world, unmodified, and naturally occurring examples that cause machine learning model performance to significantly degrade.

Download the natural adversarial example dataset ImageNet-A for image classifiers here.

Download the natural adversarial example dataset ImageNet-O for out-of-distribution detectors here.

Natural adversarial examples from ImageNet-A and ImageNet-O. The black text is the actual class, and the red text is a ResNet-50 prediction and its confidence. ImageNet-A contains images that classifiers should be able to classify, while ImageNet-O contains anomalies of unforeseen classes which should result in low-confidence predictions. ImageNet-1K models do not train on examples from the “Photosphere” or “Verdigris” classes, so these images are anomalous. Many natural adversarial examples lead to wrong predictions despite having no adversarial modifications, as they occur naturally.

Citation

If you find this useful in your research, please consider citing:

@article{hendrycks2021nae,
  title={Natural Adversarial Examples},
  author={Dan Hendrycks and Kevin Zhao and Steven Basart and Jacob Steinhardt and Dawn Song},
  journal={CVPR},
  year={2021}
}


natural-adv-examples's Issues

What is the expected directory structure for PATH_TO_IMAGENET_VAL = "./imagenet1k/val/"?

Does this folder have the following structure (from your code, it seems it should):
-n00001
 -XXX.JPEG
 -XXX.JPEG
-n00002
 -XXX.JPEG
 -XXX.JPEG

But my standard ImageNet-1K val dataset has only pictures directly in this folder, so your code can't run correctly. Its structure is like this:
-XXX.JPEG
-XXX.JPEG
-XXX.JPEG
-XXX.JPEG
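
For anyone hitting the same issue: below is a minimal sketch of one way to reorganize a flat val/ directory into per-WNID subfolders. It assumes a file val_wnids.txt (hypothetical, not part of this repo) listing one WNID per line in the order of the sorted validation file names; the official devkit ground-truth file can be converted into that form.

import os
import shutil

# Move a flat ImageNet-1K val/ directory into the per-class layout
# (val/n00001/XXX.JPEG) that ImageFolder-style loaders expect.
val_dir = "./imagenet1k/val/"
with open("val_wnids.txt") as f:          # hypothetical mapping file
    wnids = [line.strip() for line in f]

images = sorted(p for p in os.listdir(val_dir) if p.endswith(".JPEG"))
assert len(images) == len(wnids)

for img, wnid in zip(images, wnids):
    os.makedirs(os.path.join(val_dir, wnid), exist_ok=True)
    shutil.move(os.path.join(val_dir, img), os.path.join(val_dir, wnid, img))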

Another file name convention question

Adding to #2, just to clarify, with the following path:

n01498041/0.000348_chameleon _ box turtle_0.55540705.jpg

the true class for the image is "stingray" (from the directory n01498041), but it was predicted as a "box turtle" with confidence 0.55540705. How do I interpret the 0.000348_chameleon part of the file name?
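
While waiting for an authoritative answer, the pattern itself can be split mechanically. Here is a sketch of a parser based only on the pattern visible in the examples, <float>_<label A> _ <label B>_<float>.jpg; what the first pair means is exactly what this issue asks about.

import os

def parse_filename(path):
    stem = os.path.splitext(os.path.basename(path))[0]
    left, right = stem.split(" _ ")             # " _ " separates the two halves
    first_value, label_a = left.split("_", 1)   # e.g. "0.000348", "chameleon"
    label_b, last_value = right.rsplit("_", 1)  # e.g. "box turtle", "0.55540705"
    return float(first_value), label_a, label_b, float(last_value)

print(parse_filename("n01498041/0.000348_chameleon _ box turtle_0.55540705.jpg"))
# (0.000348, 'chameleon', 'box turtle', 0.55540705)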

Is fine-tuning needed when evaluating on ImageNet-O?

Hi,

Thanks for making the two benchmarks public. When evaluating OOD detection performance on ImageNet-O, did you fine-tune the model on the 200 selected classes, or just take the MSP score for those 200 classes from a pretrained 1000-class model?
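
For reference, a minimal sketch of the second option described above: keep the pretrained 1000-class model and take the maximum softmax probability (MSP) over only the 200 selected classes, with no fine-tuning. The subset indices below are placeholders, and details such as whether the softmax is applied before or after restricting to the subset may differ from the repo's code.

import torch
import torch.nn.functional as F
from torchvision import models

subset = torch.arange(200)  # hypothetical stand-in for the 200 class indices
net = models.resnet50(pretrained=True).eval()

@torch.no_grad()
def msp_score(x):
    logits = net(x)[:, subset]        # restrict to the 200 classes
    probs = F.softmax(logits, dim=1)  # renormalize over the subset
    return probs.max(dim=1).values    # MSP; lower scores suggest OOD

print(msp_score(torch.randn(2, 3, 224, 224)))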

ResNet50 gets 0% (not ~2% as reported in the paper)

Hey,

Thanks for this great work. I am trying to run eval.py with a pretrained ResNet50 to reproduce your result in Figure 2. However, I am getting 0% accuracy for ResNet50.

Can you give me your advice?
Giang
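
A common cause of near-zero accuracy here is comparing the model's raw 1000-way argmax indices against ImageFolder's 0–199 labels. Below is a sketch of an evaluation that projects the logits onto the 200 ImageNet-A classes first; it assumes a file imagenet_wnids.txt (hypothetical) listing the 1000 ImageNet-1K WNIDs in the model's output order, and may differ in detail from the repo's eval.py.

import torch
from torchvision import datasets, transforms, models

with open("imagenet_wnids.txt") as f:     # hypothetical WNID list
    all_wnids = [line.strip() for line in f]

tf = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
data = datasets.ImageFolder("./imagenet-a/", tf)
loader = torch.utils.data.DataLoader(data, batch_size=32)

# ImageFolder labels the 200 classes 0..199 in sorted-WNID order, so map
# each of them back to its position among the 1000 model outputs.
indices_in_1k = [all_wnids.index(w) for w in data.classes]

net = models.resnet50(pretrained=True).eval()
correct = total = 0
with torch.no_grad():
    for x, y in loader:
        logits = net(x)[:, indices_in_1k]   # project onto the 200 classes
        correct += (logits.argmax(1) == y).sum().item()
        total += y.numel()
print(f"ImageNet-A top-1 accuracy: {100 * correct / total:.2f}%")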

Mismatch between the ImageNet-A classes reported in the paper and in the download?

Hi,
thanks for providing this dataset!
I downloaded ImageNet-A following the link in the README of this repository and had a look at the provided README to see which classes are available. I noticed that they are not exactly the same as the 200 classes reported in Section 8 of the paper; e.g., zebra, tiger, goldfish, and hammerhead are among the classes mentioned in the paper but not in the README.txt from the downloaded .tar file (this is not exhaustive; I only checked a few). Am I looking in the wrong place?

Thanks and best regards
Verena

Understanding AUPR95 in your paper

Hello @hendrycks ,

In your code, you use the model's confidence scores on two copies of the same ImageNet-O dataset. Can you explain why you do this? How do you compute AUPR95 from two lists of confidence scores? I am trying to improve on the AUPR95 from your paper but cannot grasp how you get the AUPR here. I did a quick check with two lists of 10,000 random floats and the result is given below. What should I expect when I run my program to improve OOD performance?

import numpy as np
import calibration_tools  # from this repository

confidence_in = np.random.rand(10000)
confidence_out = np.random.rand(10000)
in_score = -confidence_in
out_score = -confidence_out

measures = calibration_tools.get_measures(out_score, in_score)
aurocs, auprs, fprs = measures[0], measures[1], measures[2]
calibration_tools.print_measures_old(aurocs, auprs, fprs, method_name='MSP')

Output:

FPR95:	94.88
AUROC: 	50.90
AUPR:  	50.77

Thank you a lot!
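
For reference, here is a sketch of how such measures are commonly computed from two score lists, treating OOD as the positive class and higher scores as more anomalous; the repo's calibration_tools may differ in detail.

import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def ood_measures(out_score, in_score):
    scores = np.concatenate([out_score, in_score])
    labels = np.concatenate([np.ones_like(out_score), np.zeros_like(in_score)])
    auroc = roc_auc_score(labels, scores)
    aupr = average_precision_score(labels, scores)
    # FPR at 95% TPR: choose the threshold that flags 95% of the OOD
    # scores, then measure how many in-distribution scores it also flags.
    thresh = np.percentile(out_score, 5)
    fpr95 = np.mean(in_score >= thresh)
    return auroc, aupr, fpr95

# Two random score lists give chance-level results (~0.5, ~0.5, ~0.95),
# which matches the output above.
print(ood_measures(np.random.rand(10000), np.random.rand(10000)))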

Download of https://github.com/pytorch/vision/archive/master.zip fails

Hello,

I am trying to run eval_many_models.py. I commented out all models except resnet50 to reproduce the published comparison. When the loop

for net_params in models_to_test:

runs, I get:

('pytorch/vision', 'resnet50')
Downloading: "https://github.com/pytorch/vision/archive/master.zip" to ~/.models/master.zip

FileNotFoundError Traceback (most recent call last)
in
2 print(net_params)
3 if net_params[0] == "pytorch/vision":
----> 4 net = torch.hub.load(net_params[0], net_params[1], pretrained=True)
5 elif "facebookresearch/deit" in net_params[0]:
6 net = torch.hub.load(net_params[0], net_params[1], pretrained=True)

~/opt/anaconda3/lib/python3.7/site-packages/torch/hub.py in load(repo_or_dir, model, *args, **kwargs)
335
336 if source == 'github':
--> 337 repo_or_dir = _get_cache_or_reload(repo_or_dir, force_reload, verbose)
338
339 model = _load_local(repo_or_dir, model, *args, **kwargs)

~/opt/anaconda3/lib/python3.7/site-packages/torch/hub.py in _get_cache_or_reload(github, force_reload, verbose)
142 url = _git_archive_link(repo_owner, repo_name, branch)
143 sys.stderr.write('Downloading: "{}" to {}\n'.format(url, cached_file))
--> 144 download_url_to_file(url, cached_file, progress=False)
145
146 with zipfile.ZipFile(cached_file) as cached_zipfile:

~/opt/anaconda3/lib/python3.7/site-packages/torch/hub.py in download_url_to_file(url, dst, hash_prefix, progress)
406 dst = os.path.expanduser(dst)
407 dst_dir = os.path.dirname(dst)
--> 408 f = tempfile.NamedTemporaryFile(delete=False, dir=dst_dir)
409
410 try:

~/opt/anaconda3/lib/python3.7/tempfile.py in NamedTemporaryFile(mode, buffering, encoding, newline, suffix, prefix, dir, delete)
545 flags |= _os.O_TEMPORARY
546
--> 547 (fd, name) = _mkstemp_inner(dir, prefix, suffix, flags, output_type)
548 try:
549 file = _io.open(fd, mode, buffering=buffering,

~/opt/anaconda3/lib/python3.7/tempfile.py in _mkstemp_inner(dir, pre, suf, flags, output_type)
256 file = _os.path.join(dir, pre + name + suf)
257 try:
--> 258 fd = _os.open(file, flags, 0o600)
259 except FileExistsError:
260 continue # try again

FileNotFoundError: [Errno 2] No such file or directory: '/Users/gjkunde/.models/tmpm0j8p0m6'

Please advise, thank you, Gerd
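
The last frame suggests tempfile failed because the hub cache directory ~/.models did not exist when the download started. Assuming that is the cause, one possible workaround is to create the directory (or point torch.hub at one that exists) before loading:

import os
import torch

hub_dir = os.path.expanduser("~/.models")
os.makedirs(hub_dir, exist_ok=True)   # make sure the cache directory exists
torch.hub.set_dir(hub_dir)

net = torch.hub.load("pytorch/vision", "resnet50", pretrained=True)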

Naming convention of the jpg files

The jpg files seem to follow a naming convention. I'm guessing they contain the misclassified class and the probabilities assigned to that misclassified class. Can someone confirm this?

How to interpret file names?

Thank you for the dataset. What are the values in a file name?
For example, in this one, 0.000116_digital clock _ digital clock_0.865662, the last value seems to be the probability; what is the first value?

I get an accuracy of only 0.9% on your dataset when I use PVT. It's too low to believe.

I use the code from https://www.kaggle.com/paultimothymooney/starter-kernel-for-imagenet-a-adversarial-examples and load the model as follows:

net = pvt_medium()
checkpoint = torch.load("../input/pvt-medium/4f268100-d9ba-11eb-8129-547eacb8081d", map_location='cpu')
if 'model' in checkpoint:
    msg = net.load_state_dict(checkpoint['model'])
else:
    msg = net.load_state_dict(checkpoint)
net.cuda()
net.eval()

The PVT code and model are at https://github.com/whai362/PVT/tree/v2/classification.
I get the following result:

pvt_medium
Accuracy on ImageNet-A Dataset (%): 0.9333
Baseline RMS Calib Error (%): 51.24
AURRA (%): 0.96

As you can see, the result is very bad. Is there anything wrong? Could you help me fix my mistake? Thanks very much.
