borealisai / advertorch Goto Github PK

A Toolbox for Adversarial Robustness Research

License: GNU Lesser General Public License v3.0

Python 22.01% Jupyter Notebook 77.98% Shell 0.02%

adversarial-attacks adversarial-example adversarial-examples adversarial-learning adversarial-machine-learning adversarial-perturbations benchmarking machine-learning pytorch robustness security toolbox

advertorch's People

Contributors

Stargazers

Watchers

Forkers

gwding codeaudit wanglouis49 shinelinxx nadiiach chaoso pandinosaurus aiexperts envs haochange tracyjin rotorliu rajasekharponakala srk97 gzoumpourlis jhsp7 hm-li liangzhangxd byronwwang sunshine352 stjordanis cathalsmyth tianqwang dejiao2018 george1ee anirudhdagar nguyenducnhaty hunglethanh9 laurentmnr95 drmeerkat ujjawalsharma15 an1006634493 iikovalenko peterouzh jiajiongcao alexverine lea4n shaybek vaibhavsaraf fra31 scy1243 xunge tarokiritani bitterisland07 qixiuai udapy liuye6666 bruceshuyu dawkinszheng2 cosmoshua qianrenjian sjy-1995 ashutoshbsathe ningfeiwang zhangxiao96 nooneust pengchengguo samuelemarro pinglmlcv jeffreyzpan kishordgupta lazycat2018 khrulkovv liyemei neineit fugokidi dlyzhou spliew ssdxshcx joeybose cunjian lld533 araujoalexandre pbashivan hedeyu yaxinli0-0 shiyujiaaaa nagisa-eevee sihong-chen yt114 masanoriyamada shijianjian suelence-hu solidshen sheepfriend snehashischatterjee1997 domjrivera mencyhuang freegliboracle aaa6237 truskovskiyk skivenk geoffnn 978570053 nudt-lijian yangzhoufly death-rain atlas8346 harrywuhust2022 pige2nd

advertorch's Issues

Support tabular data?

Hi,

May I know will this tool support tabular data? If so, may you give an example?

L1 PGD attack issue

Hello! I got the following errors when trying to use L1PGD attacks

.local/lib/python3.6/site-packages/advertorch/attacks/utils.py", line 56, in rand_init_delta
delta.data = clamp(x.data + delta.data, clip_min, clip_max) - x.data
RuntimeError: expected type torch.cuda.FloatTensor but got torch.FloatTensor

So, I changed the line 55 and 62 from delta.data to delta.data.cuda(), and it worked. But I do not know why when I was using other L2, and LInf PGD attacks there were no such issues. Can I use L2 and LInf with this change?

Thank you!

Some confusion about 'ctx_noparamgrad_and_eval' and optimizing noise

Question 1:
In the tutorial_train_mnist.py, you designed the function ctx_noparamgrad_and_eval(), as follows:

with ctx_noparamgrad_and_eval(model):
    data = adversary.perturb(data, target)

Does it mean the same as follows:

model.eval()
with torch.no_grad():
    with torch.enable_grad():
        loss_attacker = ...
model.train()

I'm not sure if the above is correct. Is it true that with torch.no_grad() doesn't work in my code?

I guess you write this function to prevent the calculation of redundant parameter gradients and the effect of model.train() when performing an attack, and at the same time, to ensure that the noise gradient can be calculated.

In the later evaluation, since there is no need to distinguish the parameter gradient and the noise gradient, they are unified to the following form: Line 87 and Line 97

 model.eval()
with torch.no_grad():

I'm not sure if I understand correctly.

Question 2:
When attacking, your code is to optimize the noise separately (as shown below).

    delta.requires_grad_()
    for ii in range(nb_iter):
        outputs = predict(xvar + delta)
        loss = loss_fn(outputs, yvar)

I also found that there are some codes that adds the original image and noise first, and then performs overall optimization (as shown below).

    x_adv = xvar + delta                           # Calculate the gradient of x_adv, not delta
    for ii in range(nb_iter):
        outputs = predict(x_adv)
        loss = loss_fn(outputs, yvar)

Is there a difference between these two methods?
looking forward to your reply : )

Incorrect arguments for BasicIterativeAttack's superclass constructor

BasicIterativeAttack's constructor calls PGDAttack using positional arguments. For example, the L2 variant looks like this:

super(L2BasicIterativeAttack, self).__init__(
            predict, loss_fn, eps, nb_iter, eps_iter, rand_init,
            clip_min, clip_max, ord, targeted)

However, PGDAttack's constructor looks like this:

def __init__(
            self, predict, loss_fn=None, eps=0.3, nb_iter=40,
            eps_iter=0.01, rand_init=True, clip_min=0., clip_max=1.,
            ord=np.inf, l1_sparsity=None, targeted=False):

Notice l1_sparsity=None between ord and targeted. This means that the value of targeted is assigned to PGDAttack.l1_sparsity instead of PGDAttack.targeted, while PGDAttack.targeted will always be the default value (False).

请问MI-FGSM的实现中,并未考虑对初始样本的梯度,因为梯度为0吗

Attacks on Inception_v1 (googlenet)

Hi Team. Thanks for the awesome code. Can someone guide me to train googlenet as an adversarially robust model.

I have trained (pytorch implementation of googlenet) a my model on classification dataset of 18k sample with 5 categories.

Now, how to attack the model by using the pretrained weights?

Numerical Stability in Pytorch affect the effectiveness of attack?

Hello,
Thanks for the amazing work on Pytorch, since it is very similar to cleverhans, I recommend it to a lot of my friends.

I am using this lib to evaluate the robustness of my model, and I somehow see this issue from another repo
https://github.com/kleincup/DEEPSEC/issues/3

Basically, it reveals some unknown numerical stability of Pytorch. So I test whether it may also occur in this lib by comparing the output of the model with/without
logits = logits - torch.max(logits, dim=1, keepdim=True)[0]

For four-convolutional layers and 2 full-connection layer network on MNIST, the classification accuracy of the adversary on FGSM with eps=0.3 is 24.9%, but when I add the line above to fix the numerical stability, the classification accuracy of the adversary drops to 2.0%

To my best knowledge, the latter seems to be more reasonable.

I understand it is not related to the implementation of advertorch, yet what I see that the output for the models in the test_utils.py, none of them takes numerical stability into consideration. So if this issue may also occur in advertorch, may I propose to fix this in the tutorial?

Multiple inputs of model

hello, Multiple inputs of model?

ypred= model(data0, data1, data2, data3=data3)

adversary.perturb(？？？？？？？？？？？？)

What should I do?

Targeted attacks increases accuracy to 100%

If we use targeted attacks option in many attacks, it increases the accuracy of network to almost 100% which is very strange. I have observed this in cifar10, 100 and imagenet. Here is one examlpe.

multi_gpu

can i perform adversarial attack on multiple gpus? how to configure the setting on multi-gpu

Please add deepfool/DDN support

Thank you guys for your incredible work ! This toolbox boosts our research a lot.
But we found some more on this topic that DDN and DeepFool are also good attack methods.
Plaease check:
Deep Fool
Decoupling Direction and Norm(NIPS 18' Challenge Top-1)

If possible, I am willing to contribute to this repository and share my implemention of these two methods compatible with your interface later.

Adversarial training on cifar10?

Hello,

I really appreciate your work!
When I tried to perform the adversarial training on the CIFAR10 dataset by modifying the code of tutorial_train_mnist.py. I changed the get_mnist_train_loader and get_mnist_test_loader functions to get_cifar10_train_loader and get_cifar10_test_loader, and the LeNet5 model's input dimensions accordingly. But the problem is the loss doesn't decrease and clean acc and adv acc is always at 10%.
I also tried to use a larger model like Resnet. But the problem is the same.
So any ideas on why the loss doesn't decrease for cifar10?

`Train Epoch: 1 [0/50000 (0%)] Loss: 2.420085
Train Epoch: 1 [20000/50000 (40%)] Loss: 2.302042
Train Epoch: 1 [40000/50000 (80%)] Loss: 2.303400

Test set: avg cln loss: 2.3025, cln acc: 1000/10000 (10%)

Test set: avg adv loss: 2.3035, adv acc: 1000/10000 (10%)

Train Epoch: 2 [0/50000 (0%)] Loss: 2.300893
Train Epoch: 2 [20000/50000 (40%)] Loss: 2.303464
Train Epoch: 2 [40000/50000 (80%)] Loss: 2.303012

Test set: avg cln loss: 2.3026, cln acc: 1000/10000 (10%)

Test set: avg adv loss: 2.3027, adv acc: 1000/10000 (10%)

Train Epoch: 3 [0/50000 (0%)] Loss: 2.301586
Train Epoch: 3 [20000/50000 (40%)] Loss: 2.301844
Train Epoch: 3 [40000/50000 (80%)] Loss: 2.303260

Test set: avg cln loss: 2.3025, cln acc: 1000/10000 (10%)

Test set: avg adv loss: 2.3031, adv acc: 999/10000 (10%)

Train Epoch: 4 [0/50000 (0%)] Loss: 2.303174
Train Epoch: 4 [20000/50000 (40%)] Loss: 2.302358
Train Epoch: 4 [40000/50000 (80%)] Loss: 2.302135

Test set: avg cln loss: 2.3025, cln acc: 1008/10000 (10%)

Test set: avg adv loss: 2.3029, adv acc: 1000/10000 (10%)

Train Epoch: 5 [0/50000 (0%)] Loss: 2.303104
Train Epoch: 5 [20000/50000 (40%)] Loss: 2.303405
Train Epoch: 5 [40000/50000 (80%)] Loss: 2.301460

Test set: avg cln loss: 2.3023, cln acc: 1000/10000 (10%)

Test set: avg adv loss: 2.3032, adv acc: 1000/10000 (10%)

Train Epoch: 6 [0/50000 (0%)] Loss: 2.303206
Train Epoch: 6 [20000/50000 (40%)] Loss: 2.300870
Train Epoch: 6 [40000/50000 (80%)] Loss: 2.303452

Test set: avg cln loss: 2.3025, cln acc: 1000/10000 (10%)

Test set: avg adv loss: 2.3028, adv acc: 1000/10000 (10%)

Train Epoch: 7 [0/50000 (0%)] Loss: 2.302966
Train Epoch: 7 [20000/50000 (40%)] Loss: 2.302667
Train Epoch: 7 [40000/50000 (80%)] Loss: 2.303157

Test set: avg cln loss: 2.3025, cln acc: 1238/10000 (12%)

Test set: avg adv loss: 2.3028, adv acc: 724/10000 (7%)

Train Epoch: 8 [0/50000 (0%)] Loss: 2.302794
Train Epoch: 8 [20000/50000 (40%)] Loss: 2.302416
Train Epoch: 8 [40000/50000 (80%)] Loss: 2.302629

Test set: avg cln loss: 2.3025, cln acc: 1000/10000 (10%)

Test set: avg adv loss: 2.3027, adv acc: 1000/10000 (10%)

Train Epoch: 9 [0/50000 (0%)] Loss: 2.302886
Train Epoch: 9 [20000/50000 (40%)] Loss: 2.302330
Train Epoch: 9 [40000/50000 (80%)] Loss: 2.302263
`

Thanks!

How to use BPDA to attack the adversarial trained model

Hi,

Given a model (defenseNet) trained with adversarial training based on PGD attack, I want to use the BPDA to attack and evaluate this model with provided BPDAWrapper.

First I use the following code:

from advertorch.attacks import PGDAttack
from advertorch.bpda import BPDAWrapper

attacker = BPDAWrapper(defenseNet, forwardsub=defenseNet)
adversary = PGDAttack(attacker)

bpda_adv = adversary.perturb(clean_data, true_label)
bpda_adv_defended = defense(bpda_adv)

Results based on the above codes are always close to the white-box PGD attack test with defenseNet.

Then I use the following code:

def _identity(x):
return x

attacker = BPDAWrapper(defenseNet, forwardsub=_identity)
adversary = PGDAttack(attacker)

The following error reports:
RuntimeError: Mismatch in shape: grad_output[0] has a shape of torch.Size([64, 6]) and output[0] has a shape of torch.Size([64, 3, 32, 32]).

Seems forwardsub just directly outputs the input images with shape of torch.Size([64, 3, 32, 32]).

May I ask which way is correct and how to use the BPDA to attack and evaluate this model with provided BPDAWrapper?

Thanks so much.

add simple spatial transform attack

A simple spatial transform attack was proposed at ICML2019.

This is a simple attack that creates an adversarial example of image rotation and translation with a grid search (or random).

The advertorch already has a spatial transform attack ICLR2018 is implemented, but the simpler ICML I think adding 2019 would be useful.

If possible, I am willing to contribute to this repository and share my implementation of the method compatible with your interface later.

However, I only checked the accuracy compared with the paper using mnist.
Because I only tried a random attack due to memory capacity issues, and this attack relies on random numbers

If I were to contribute, I would check for accuracy with an error bar.

Gradient calculation issue about PGD attacks

First of all, I would like to thank you for this incredible work!

I suppose that the gradient of loss should be calculated w.r.t the input image instead of the perturbations (e.g. delta in the following codes) in each iteration of PGD attack. May I know why the gradient of loss is calculated w.r.t the perturbations (e.g. delta.grad.data.sign()) in each iteration?

Thanks.

if delta_init is not None:
    delta = delta_init
else:
    delta = torch.zeros_like(xvar)

delta.requires_grad_()
for ii in range(nb_iter):
    outputs = predict(xvar + delta)
    loss = loss_fn(outputs, yvar)
    if minimize:
        loss = -loss

    loss.backward()
    if ord == np.inf:
        grad_sign = delta.grad.data.sign()
        delta.data = delta.data + batch_multiply(eps_iter, grad_sign)
        delta.data = batch_clamp(eps, delta.data)
        delta.data = clamp(xvar.data + delta.data, clip_min, clip_max
                           ) - xvar.data

Printing attack adversaries in a verbose mode, for better debugging purpose

Full stacktrace:

/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [64,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [65,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [66,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [67,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [68,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
............................
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [30,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
/pytorch/aten/src/THC/THCTensorScatterGather.cu:188: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = long, Dims = 2]: block: [0,0,0], thread: [31,0,0] Assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` failed.
Traceback (most recent call last):
  File "adversarial_test_model.py", line 56, in <module>
    advdata = adversary.perturb(clndata, target)
  File "/nfshomes/smatcovi/works/pilot-project/ppenv/lib/python3.7/site-packages/advertorch/attacks/carlini_wagner.py", line 209, in perturb
    final_l2distsqs = torch.FloatTensor(final_l2distsqs).to(x.device)
RuntimeError: CUDA error: device-side assert triggered

I am trying to use the CarliniWagnerL2Attack. I am using the code tutorial_train_mnist.py but for checking robust accuracy. When I run locally (GTX1050) everything works. When I try to run on a nvidia K20 or M40 I get this error.

Best way to constrain CarliniWagnerL2Attack on epsilon

This is more a question than an issue. I'm just wondering what's the best way to put constraints on CarliniWagnerL2Attack so that the perturbation is within a norm epsilon-ball?

Thanks.

Doubts about implementation of carlini_wagner attack

Thank you so much for the wonderful library. I have one doubt about carlini_wagner attack though. The original paper talks about using 1/2 * (tanh (w) + 1) in order to ensure the delta values lie between 0 and 1. It seems here the code uses tanh only to ensure rescaling but the optimization is still done on the values of x and x + delta . In that case, why are we doing an extra clipping here? If tanh is only used to ensure values remain within the range, can we use a normal torch.clamp() function instead? Is there some reason to still use the tanh() function then?

I am a bit confused about the implementation and some pointers would be really appreciated.

JSMA never success on CIFAR10

First of all, I would like to thank you for this incredible work!
I try the following code, to attack CIFAR10 with JSMA. The attack fails all the time (the code works with other attacks).

import os
import pickle
import torch
import torchvision
import torchvision.transforms as transforms
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from tqdm import tqdm
from advertorch.utils import predict_from_logits
from advertorch_examples.utils import _imshow
from advertorch.attacks import PGDAttack, FGSM, JSMA

def get_test_loader():
transform = transforms.Compose([transforms.ToTensor()])
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=10, shuffle=False, num_workers=20)
return testloader

def get_pretrain_model():
with open('../models/resnetxt_acc_87.pkl', 'rb') as f:
net = pickle.load(f)
return net.module #net is a DataParallel object

testloader = get_test_loader()
net = get_pretrain_model()
adversary = JSMA(net, num_classes=10)

data = next(iter(testloader))
images, labels = data
cln_data, true_label = images.to('cuda'), labels.to('cuda')

adv_untargeted = adversary.perturb(cln_data, true_label)
preds = net(adv_untargeted)
estimate_prob, estimate_class = torch.max(preds .data, 1)

wrong = true_label!=estimate_class
print(wrong)

#output: tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], device='cuda:0', dtype=torch.uint8)

Attack successfully all the time?

Hello!
I learn how to use advertorch from this site, https://github.com/BorealisAI/advertorch/blob/master/advertorch_examples/tutorial_attack_defense_bpda_mnist.ipynb.
I wonder how to know if the current attack works successfully? 'Successfully' means that adversarial samples can be generated under the given constraint, and it is normal that no adversarial samples generated in some robust situation. However, I can't see any cues to judge whether the current attack works.
Correspondingly, another similar toolbox called Foolbox will prompt a None warning when the attack fails.

Failed adv training: MNIST and 2_layer_MLP

Hello, I used your code to train 2 layers of MLP and found that I couldn't adv train successfully. My modified code is as follows:

class SimpleModel(nn.Module):
    def __init__(self, dim_input=DIM_INPUT, num_classes=NUM_CLASS):
        super(SimpleModel, self).__init__()
        self.fc1 = nn.Linear(dim_input, 10)
        self.fc2 = nn.Linear(10, num_classes)

    def forward(self, x):
        x = x.view(x.size(0), -1) # MNIST input
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        return x

The result is that cln acc and adv acc are always 10%
I find that the gradient disappears and the predictions of the adversarial samples are always the same class, do you know why?(I tried adjusting the learning rate but it didn't seem to work, and the code is correct during standard training)
I also tested the MLP in your implementation, but still cannot achieve adversarial training(MNIST)
Looking forward to your reply

name 'adversary' is not defined

Hello,
I used this library well until now... and a weird problem happened.

name 'adverary' is not defined

from second data batch loop, that error comes out...
It is weird because If it is not defined, than how it works in first batch ?

`
adversary = LinfPGDAttack(
base_net, loss_fn=nn.CrossEntropyLoss(reduction="sum"),
eps=8/255., nb_iter=50, eps_iter=2/255., rand_init=True,
clip_min=0.0, clip_max=1.0, targeted=False
)
def adversarial_test(test_loader, net, adversary):

net.eval()
test_loss = test_acc = 0
iterator = tqdm(test_loader)
print(adversary) # it is printed well.  <advertorch.attacks.iterative_projected_gradient~

for data, target in iterator:
    
    data, target = data.cuda(), target.cuda()
    adv = adverary.perturb(data,target)
    output = net(adv)

smoothing only works for image with single channel

FGSM return tensor cannot compute the gradient

I want to get the fgsm_tensor.grad().clone() but got the problem about

"AttributeError: 'NoneType' object has no attribute 'clone'".

However, PGD doesn't have this problem.

I find that fgsm_tensor.is_leaf is False, so that can not compute the gradient.

In this line, one_step_gradient.py#L71 xadv.is_leaf become False.

Maybe you can change return xadv to return xadv.detach() to fix this problem, but I don't confirm that if it is necessary.

Question :Black Box substitute model implementation

Hello!
I was wondering if the framework has an implementation of the black box surrogate model :) I found it in the old version of it but not the recent.
Thanks

CarliniWagnerL2Attack on MNIST does NOT work

Thanks for this awesome toolbox.
When I try to attack MNIST using the CarliniWagnerL2Attack, the test results indicated that the attack was not successful.
Here comes the code:

    testset = torchvision.datasets.MNIST(root='./dataset', train=False, download=True, transform=transform_test)
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=False, num_workers=4)

    cw_attack = CarliniWagnerL2Attack(predict=target_model,
                                      num_classes=10,
                                      confidence=2.0,
                                      targeted=True,
                                      learning_rate=0.001,
                                      binary_search_steps=5,
                                      max_iterations=1000,
                                      abort_early=True,
                                      clip_min=0.0,
                                      clip_max=1.0)

    # construct adversarial samples
    for i, data in enumerate(testloader, 0):
        x, y = data
        x, y = x.to(device), y.to(device)
        y_pred = target_model(x).argmax(dim=1)
        print("y_pred:", y_pred)

        # Random target construction
        if y.size() != torch.Size([]):
            range_ = y.size()[0]
        else:
            range_ = 1
        targets = []
        for index in range(range_):
            target = randint(0, 9)
            while target == y[index].item():
                target = randint(0, 9)
            targets.append(target)
            attack_target = torch.tensor(targets).to(device)
        print("attack_target:", attack_target)     

        # C&W
        with ctx_noparamgrad_and_eval(target_model):
            x_adv = cw_attack.perturb(x, attack_target)
        y_pred_adv = target_model(x_adv).argmax(dim=1)
        print("y_pred_adv:", y_pred_adv)
        raise Exception

And the results were:

y_pred: tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5], device='cuda:0')
attack_target: tensor([6, 8, 3, 7, 8, 2, 5, 6, 0, 2, 2, 8, 4, 6, 2, 2], device='cuda:0')
y_pred_adv: tensor([7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5], device='cuda:0')

The pred labels after the CW attack are the same as before.
Any tips would be appreciated.

Is it possible to implement the MomentumIterativeAttack to L2

Is it possible to implement the MomentumIterativeAttack to L2? Just like the PGDAttack which both has the Linf and L2. Thanks for your reply.

How does PGDAttack deal with embedding？

Hi, It is a great open-source project! But as a beginner, I had some problems.
In Pytorch version, delta in PGDAttack as a parameter should be type FloatTensor. But my model uses embedding layer, it has to transform delta into LongTensor, so delta will lose its gradient because only floatTensor in Pytorch can grad. What should I do?

L1PGDAttack switches cuda devices

When running L1PGDAttack on a GPU that is different from cuda:0, I get the following error message:

.../lib/python3.8/site-packages/advertorch/attacks/iterative_projected_gradient.py", line 109, in perturb_iterative
    delta.data = clamp(xvar.data + delta.data, clip_min, clip_max
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0!

After investigating a bit, it seems that the following line seems to be problematic:

advertorch/advertorch/attacks/iterative_projected_gradient.py

Line 108 in a0d4e58

delta.data = delta.data.cuda()

Instead of simply setting delta.data = delta.data.cuda(), it is better to use delta.data = delta.data.to(xvar.device).
Note that you can remove the test in L107.

Error while running PGD for Object Detection

I'm trying to perform PGD attack on YOLOv3 model pretrained on PASCAL VOC dataset. As soon as i pass image and label to perturb function, I get an error AttributeError: 'tuple' object has no attribute 'size'. Now, I pass the target and labels as tensor so I'm not sure why are they being converted back to tuples. Here is code that i'm executing and attached is the error:

from advertorch.attacks import LinfPGDAttack
import torch.nn as nn
import numpy as np

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

print(use_cuda)

model.to(device)
torch.cuda.empty_cache();

model = model.eval()

x = torch.from_numpy(train_image[0].asnumpy())

y = torch.from_numpy(class_ids[0])

x = torch.Tensor(batch[0][0].asnumpy())
y = torch.Tensor(batch[6][0].asnumpy())

y = batch[6][0]

print(type(x))

print("x = ",batch[0][0])

print(type(y))

print("y = ",batch[6][0])

adversary = LinfPGDAttack(
model, loss_fn=nn.BCEWithLogitsLoss(reduction="none"), eps=0.3,
nb_iter=40, eps_iter=0.01, rand_init=False, clip_min=0.0, clip_max=1.0,
targeted=False)
#nn.CrossEntropyLoss(reduction="sum")

x = torch.unsqueeze(x,0)

adv_untargeted = adversary.perturb(x, y)

Adversarial training works well but Attacking the Network throws errors

with torch.no_grad():
        for batch_idx, (inputs, targets) in enumerate(testloader):
            inputs, targets = inputs.cuda(), targets.cuda()
            adv_untargeted = adversary.perturb(inputs, targets)
            inputs = adv_untargeted

            outputs = net(inputs)
            ....

When I run the code above I get the following errors:

line 85, in test adv_untargeted = adversary.perturb(inputs, targets)
line 189, in perturb l1_sparsity=self.l1_sparsity, 
line 70, in perturb_iterative loss.backward()
line 198, in backward torch.autograd.backward(self, gradient, retain_graph, create_graph)
line 100, in backward allow_unreachable=True)  # allow_unreachable flag
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

As you can see, when I run the code, I am not able to get past adv_untargeted = adversary.perturb(inputs, targets). I know this may be due to the way I call the function but please can you let me know what I am doing wrong? I have been stuck on this for way too long.

Thanks.

Expected object of type torch.cuda.FloatTensor

Hello,

Thanks for the awesome code. I am currently using advertorch to attack my self-trained Densenet with 8 classes. Everything works well for the attack methods without num_classes such as LinfPGDAttack. However when I use any attacks that need num_classes such as CarliniWagnerL2Attack. I got the error:

RuntimeError Traceback (most recent call last)
in
----> 1 advimg = adversary.perturb(image_tensor, label)

~/anaconda3/envs/pytorch04py37/lib/python3.7/site-packages/advertorch/attacks/carlini_wagner.py in perturb(self, x, y)
224 loss, l2distsq, output, adv_img =
225 self._forward_and_update_delta(
--> 226 optimizer, x_atanh, delta, y_onehot, loss_coeffs)
227 if self.abort_early:
228 if ii % (self.max_iterations // NUM_CHECKS or 1) == 0:

~/anaconda3/envs/pytorch04py37/lib/python3.7/site-packages/advertorch/attacks/carlini_wagner.py in _forward_and_update_delta(self, optimizer, x_atanh, delta, y_onehot, loss_coeffs)
131 output = self.predict(adv)
132 l2distsq = calc_l2distsq(adv, transimgs_rescale)
--> 133 loss = self._loss_fn(output, y_onehot, l2distsq, loss_coeffs)
134 loss.backward()
135 optimizer.step()

~/anaconda3/envs/pytorch04py37/lib/python3.7/site-packages/advertorch/attacks/carlini_wagner.py in _loss_fn(self, output, y_onehot, l2distsq, const)
101 loss2 = (l2distsq).sum()
102 loss1 = torch.sum(const * loss1)
--> 103 loss = loss1 + loss2
104 return loss
105

RuntimeError: Expected object of type torch.cuda.FloatTensor but found type torch.FloatTensor for argument #3 'other'

adversary.perturb(self, x, y)

Hello, this tool is really great! Thank you for your contribution.
I'm curious when the parameter y in this function is assigned to None?

ImportError: cannot import name 'zero_gradients' from 'torch.autograd.gradcheck'

This issue exists because this line:
from advertorch.attacks.utils import zero_gradients

which refers to the zero_gradients function in PyTorch is obsolete.

This is because zero_gradients is removed from PyTorch 1.9. It was present until PyTorch 1.7.

could i produce ensemble attacks using this code?

the parameter of attack is like MomentumIterativeAttack(model,F.nll_loss),could i pass a list of model to this???

Is batch attack supported for PGD?

delta.requires_grad_()
for ii in range(nb_iter):
    outputs = predict(xvar + delta)
    loss = loss_fn(outputs, yvar)
    if minimize:
        loss = -loss

    loss.backward()

This is part of the advertorch/advertorch/attacks/iterative_projected_gradient.py .
I wonder if this PGD attack supports batch attack. I mean, can this function get a batch of different images and generates the corresponding attacks for each images? Because of loss = loss_fn(outputs, yvar), it seems the batch is not supported. I just want to make sure I am right.

image mean and std preprocessing against (min, max) clip

Hello!
When I try to add FGSM perturbation on images in CIFAR-10 dataset, I've already done preprocessing as below, and the data range of images is no longer [0, 1].
transform_test = transforms.Compose([ transforms.ToTensor(), transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)), ])
However, FGSM needs to be defined with 'data range' arguments, which are clip_min=0.0 and clip_max=1.0 by default. I do know these two arguments should be adjusted correspondingly, but how to compute this new data range after image preprocessing?
What comes to me is to create a toy image which only contains 0.0 and 1.0 pixels, and compute the data range after image preprocessing. But this seems a little clumsy. Is there any method in advertorch to handle this problem? As far as I know, there is a 'preprocessing=(mean, std)' together with 'bounds=(0, 1)' in Foolbox toolbox.

CarliniWagnerL2Attack takes forever on CIFAR10

Thanks for this great toolbox!! :)

I am trying to use this attack to perturb a model. I am just calling it with the default parameters, and it seems to take forever:

Is this expected behaviour?

In tutorial_train_mnist.py, what is the reason for choosing 'mean' or 'sum' in F.cross_entropy

In 'tutorial_train_mnist.py', there are 4 F.cross_entropy, they are located in 4 places below.
place 1(Line 60)

    if flag_advtrain:
        from advertorch.attacks import LinfPGDAttack
        adversary = LinfPGDAttack(
            model, loss_fn=nn.CrossEntropyLoss(reduction="sum"), eps=0.3,     # Use "sum"
            nb_iter=40, eps_iter=0.01, rand_init=True, clip_min=0.0,
            clip_max=1.0, targeted=False)

place 2(Line 78)

            optimizer.zero_grad()
            output = model(data)
            loss = F.cross_entropy(
                output, target, reduction='elementwise_mean')                                       # Use "mean"
            loss.backward()
            optimizer.step()

place 3(Line 100)

        for clndata, target in test_loader:
            clndata, target = clndata.to(device), target.to(device)
            with torch.no_grad():
                output = model(clndata)
            test_clnloss += F.cross_entropy(
                output, target, reduction='sum').item()                                                           # Use "sum"
            pred = output.max(1, keepdim=True)[1]
            clncorrect += pred.eq(target.view_as(pred)).sum().item()

place 4(Line 109)

            if flag_advtrain:
                advdata = adversary.perturb(clndata, target)
                with torch.no_grad():
                    output = model(advdata)
                test_advloss += F.cross_entropy(
                    output, target, reduction='sum').item()                                                      # Use "sum"
                pred = output.max(1, keepdim=True)[1]
                advcorrect += pred.eq(target.view_as(pred)).sum().item()

Can using "mean" in place 1 affect the results? Or why use "mean"(place 2) when updating the model and "sum"(place 1) when generating noise

Problem that pytorch v1.9.0 is not supported

Advertorch uses the deprecated function torch.autograd.gradcheck.zero_gradients().
This function is no longer available in v1.9.0, so Advertorch will not work.

torch
v1.8.1: https://github.com/pytorch/pytorch/blob/v1.8.1/torch/autograd/gradcheck.py#L12
v.1.9.0https://github.com/pytorch/pytorch/blob/v1.9.0/torch/autograd/gradcheck.py

Question about normalization step in adversarial training example

Hello,

Thank you for the great work! I have a quick question.

I'm looking at the adversarial training example and there seems to be no normalization step for MNIST images. Is it on purpose?

Thanks

Inquiry for adversarial training

Hi, thanks for making this pytorch toolbox, it really helps a lot.

I'm just wondering if there are any adversarial trained models in pytorch? Or will you provide these models in the future?

Thanks again for sharing this wonderful toolbox

Why cw_loss has a 50 intercept?

In the code, I find the cw_loss has an intercept of 50.

advertorch/advertorch/loss.py

Line 76 in 98f9b41

loss = clamp(elementwise_margin(input, target) + 50, 0.)

Is there a particular reason to shift the offset to 50? That basically makes the clamp function useless?

What if predict(network) has multiple outputs?

Hi there, thanks for your awesome work!
I am now facing a problem that my network(predict function) has 2 outputs.

out1, out2 = net(input)
adversary = GradientSignAttack(net,loss_fn,eps,clip_min,clip_max)

GradientSIgnAttack does not work because 2 outputs is confusing. I was thinking about using concatenation (mentioned in another issue), but it might not be helpful. Because my network has 6 constitutional layers, out1 is extracted out after 3rd layer and out2 is the final output. What I need is the adversarial example of out2. If using concatenation, the calculated adversarial example will not be correct, right?

Do you have any idea to deal with this problem?

Error loading notebook

I tried advertorch_examples/tutorial_attack_defense_bpda_mnist.ipynb on the jupyter notebook, but got an error.

Unreadable Notebook: 
advertorch/advertorch_examples/tutorial_attack_defense_bpda_mnist.ipynb 
NotJSONError('Notebook does not appear to be JSON: \'{\\n "cells": [\\n {\\n "cell_type": "m...')

I was able to run tutorial_attack_imagenet.ipynb on the jupyter notebook without any problems.

Attack on ensemble models?

Hi, first want to say thanks for your effort to make research much easier!

Recently, some papers, among which i cites two, create adversarial images on multiple models to increase transferability and success rate.

Skip Connections Matter: On the Transferability of Adversarial Examples Generated with ResNets
Delving into Transferable Adversarial Examples and Black-box Attacks

I wonder would it be possible to add this feature to this project?

question for iterative l2 attack

I have questions for both l2 PGD attack and Momentum iterative attack methods when compared with cleverhans
as for PGD:
why using
if eps is not None:
delta.data = clamp_by_pnorm(delta.data, ord, eps)
instead of
delta.data = batch_clamp(eps, delta.data)
same as for momentum iterative attack?

I thought the iteration step should be:
delta = clamp[(delta + eps_iter * (pnorm(g))), eps]
But you are somehow doing a clamp_by_pnorm operation here. It's quite different from other's implementations. So why are you doing this for l2 attack?

And I followed your settings to run l2-pgd attack on ResNet 50 trained on ImageNet, the attack didn't work at all. But L-inf worked for both PGD and MIFSGM.

RuntimeErrorTraceback (most recent call last)
<ipython-input-51-e48501c1581e> in <module>
      6     adversary = LocalSearchAttack(model, adv_target)
      7     adversary.targeted = True
----> 8     adv_targeted = adversary.perturb(data, adv_target)
      9     print(target, model(adv_targeted).max(1, keepdim=True)[1])
     10     break;

/opt/conda/lib/python3.7/site-packages/advertorch/attacks/localsearch.py in perturb(self, x, y)
    197     def perturb(self, x, y=None):
    198         x, y = self._verify_and_process_inputs(x, y)
--> 199         return _perturb_batch(self.perturb_single, x, y)
    200 
    201     def _rescale_to_m0d5_to_0d5(self, x, vmin=0., vmax=1.):

/opt/conda/lib/python3.7/site-packages/advertorch/attacks/localsearch.py in _perturb_batch(perturb_single, x, y)
    274 def _perturb_batch(perturb_single, x, y):
    275     for ii in range(len(x)):
--> 276         temp = perturb_single(x[ii], y[ii])[None, :, :, :]
    277         if ii == 0:
    278             result = temp

/opt/conda/lib/python3.7/site-packages/advertorch/attacks/localsearch.py in perturb_single(self, x, y)
    125         best_dist = np.inf
    126         rescaled_x, lb, ub = self._rescale_to_m0d5_to_0d5(
--> 127             rescaled_x, vmin=self.clip_min, vmax=self.clip_max)
    128 
    129         if self.comply_with_foolbox is True:

/opt/conda/lib/python3.7/site-packages/advertorch/attacks/localsearch.py in _rescale_to_m0d5_to_0d5(self, x, vmin, vmax)
    200 
    201     def _rescale_to_m0d5_to_0d5(self, x, vmin=0., vmax=1.):
--> 202         x = x - (vmin + vmax) / 2
    203         x = x / (vmax - vmin)
    204         return x, -0.5, 0.5

RuntimeError: expected type torch.cuda.FloatTensor but got torch.cuda.LongTensor

And maybe these implementation are not robust enough

pxy = pxy[torch.randperm(len(pxy))[:self.max_nb_seeds]]
...
pxy_star = pxy[indices.data.cpu()]

Because I got ValueError: only one element tensors can be converted to Python scalars on the second iteration.

And have anybody attacked Mnist successfully with SinglePixelAttack or LocalSearchAttack?

borealisai / advertorch Goto Github PK

advertorch's People

Contributors

Stargazers

Watchers

Forkers

advertorch's Issues

print(use_cuda)

x = torch.from_numpy(train_image[0].asnumpy())

y = torch.from_numpy(class_ids[0])

y = batch[6][0]

print(type(x))

print("x = ",batch[0][0])

print(type(y))

print("y = ",batch[6][0])

Recommend Projects

Recommend Topics

Recommend Org