
backdoor's Issues

Is this detection method a white-box setting?

Hello, I have been studying your work on backdoor detection for neural networks. I want to confirm whether your method operates in a white-box setting, since the mask and reversed-trigger generation require the gradients of the model. However, a survey paper classifies your method as black-box (https://arxiv.org/pdf/2007.10760.pdf, page 22, Table II). I think those authors are wrong; what do you think?
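For context, here is a minimal sketch of the kind of gradient-based trigger reversal the paper describes (an illustration, not the authors' code); recovering a mask and pattern for a target label requires backpropagating through the model, which is why the setting looks white-box:

```python
import tensorflow as tf

def reverse_trigger(model, x_batch, target, steps=1000, lr=0.1, lam=1e-3):
    """Sketch of trigger reversal; x_batch: float32 images in [0, 1]."""
    h, w, c = x_batch.shape[1:]
    mask = tf.Variable(tf.zeros((h, w, 1)))
    pattern = tf.Variable(tf.zeros((h, w, c)))
    opt = tf.keras.optimizers.Adam(lr)
    y_t = tf.one_hot([target] * len(x_batch), model.output_shape[-1])
    for _ in range(steps):
        with tf.GradientTape() as tape:
            m = tf.sigmoid(mask)                      # keep mask in [0, 1]
            p = tf.sigmoid(pattern)
            x_adv = (1.0 - m) * x_batch + m * p       # stamp candidate trigger
            ce = tf.keras.losses.categorical_crossentropy(y_t, model(x_adv))
            loss = tf.reduce_mean(ce) + lam * tf.reduce_sum(tf.abs(m))  # L1 on mask
        # this gradient needs full access to the model, i.e. white-box
        grads = tape.gradient(loss, [mask, pattern])
        opt.apply_gradients(zip(grads, [mask, pattern]))
    return tf.sigmoid(mask).numpy(), tf.sigmoid(pattern).numpy()
```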

Reverse-engineered triggers: can you share them?

Can you please share the three image files of the reverse-engineered triggers for the following three models: the GTSRB model, the VGG-Face model (square trigger), and the VGG-Face model (watermark trigger)?

That would be very helpful for replicating your experiments.

Thanks,

Where is the implementation of the partial backdoor attack?

Thank you for your work on backdoor attacks!
In your paper, you mentioned conducting experiments on a partial backdoor attack. I have also noticed that in issue "Adaptation for partial backdoor attack #9", someone succeeded in implementing one.
However, I can't figure out where to configure the partial backdoor attack, or where the code implementing it lives. In gtsrb_injection_example.py, it looks like you simply choose images from arbitrary labels (according to the injection ratio) and convert them to the target label. I couldn't find where to specify a source label. Could you help me figure out where to make this setting?
Thanks a lot!
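A hypothetical sketch (not from the repo) of how the all-to-one selection could be restricted to a single source label; SOURCE_LABEL, TARGET_LABEL, and apply_trigger are stand-ins for whatever the script actually uses:

```python
import numpy as np

SOURCE_LABEL = 1    # assumed source class: only these images are poisoned
TARGET_LABEL = 33   # assumed target class
INJECT_RATIO = 0.1

def poison_partial(X, Y, apply_trigger):
    """Y is one-hot; apply_trigger stamps the trigger onto a single image."""
    src_idx = np.where(Y.argmax(axis=1) == SOURCE_LABEL)[0]
    chosen = np.random.choice(src_idx, int(len(src_idx) * INJECT_RATIO),
                              replace=False)
    X_p, Y_p = X.copy(), Y.copy()
    for i in chosen:
        X_p[i] = apply_trigger(X_p[i])
        Y_p[i] = 0                 # clear the one-hot row
        Y_p[i, TARGET_LABEL] = 1   # relabel to the target class
    return X_p, Y_p
```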

The reversed mask of the targeted label doesn't converge on the MNIST dataset

My classmate and I changed the parameters in visualizer.py and gtsrb_visualize_example.py (MNIST_visualize_example.py in our case) to reproduce the reverse-engineering process on the MNIST dataset. When it starts reversing the triggers, the mask for the targeted label does not converge, so the method cannot identify the correct target label.

Here is the result of mad_outlier_detection. The targeted label is 5; the trigger is a small white square in the bottom-right corner.
[screenshot: mad_outlier_detection output, 2021-05-27]

Here are the reversed masks for each label.
[image: reversed masks, one per label]

Here are the reversed masks for label 5 at each step.
[image: reversed masks of label 5 over the optimization steps]

Here are our parameters in MNIST_visualize_example.py:

```python
# input size
IMG_ROWS = 28
IMG_COLS = 28
IMG_COLOR = 1
IMAGE_SHAPE = (IMG_ROWS, IMG_COLS, IMG_COLOR)

CLASSES_ALL = 10  # total number of classes in the model
Y_TARGET = 5  # (optional) infected target label, used for prioritizing label scanning

PREPROCESS = 'mnist'  # preprocessing method for the task (GTSRB uses raw pixel intensities)

# parameters for optimization
BATCH_SIZE = 32  # batch size used for optimization
LR = 0.1  # learning rate
STEPS = 1000  # total optimization iterations
NB_SAMPLE = 1000  # number of samples used in each optimization step
MINI_BATCH = NB_SAMPLE // BATCH_SIZE  # number of mini batches (used for early stop)
INIT_COST = 2e-3  # initial weight balancing the two objectives
```

We have also tried adjusting the other parameters, and found that when LR was smaller and INIT_COST was a bit larger, e.g. 2e-3, the result would be somewhat closer to the proper targeted label.

Detection Ineffective on MNIST model

Dear Bolun,
Thanks so much for sharing your code.

I trained a trojaned and a clean MNIST model based on the settings of the BadNets paper. The attack success rate and the regular test accuracy are both above 98%. However, when I try to detect the trojan with your code, the computed anomaly index is 2.6 for the trojaned model and 3.6 for the clean model. I am wondering if you have any idea what the possible cause might be.

y_true and y_pred position

Hi, I'm curious why the positions of y_true and y_pred are swapped here. Is this a mistake, or is it done on purpose? Can you please explain? Thanks!

```python
self.loss_acc = categorical_accuracy(output_tensor, y_true_tensor)
```
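For what it's worth, Keras defines categorical_accuracy roughly like this (older backend-style API); since it only compares the argmax of the two tensors, the metric is symmetric in its arguments, so the swap should not change the reported accuracy, although the same swap in a loss function would matter:

```python
from keras import backend as K

def categorical_accuracy(y_true, y_pred):
    # fraction of samples whose predicted class matches the true class;
    # argmax-equality is symmetric, so the argument order is harmless here
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())
```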

Watermark pattern might be incorrect

Hello,

Can you please double-check the image vggface_watermark_pattern.png?

It seems to me that this is not the watermark pattern described in your paper.

If so, can you please upload the correct one?

Best,

About data poisoning

Hello, this is cool work. I have been reading your paper recently, and I am wondering whether you can share the file used to poison the clean dataset. It would be a great help for understanding the paper. Thanks a lot.
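In the meantime, here is a minimal, hypothetical sketch of BadNets-style poisoning (not the authors' file): stamp a small white square onto a fraction of the training images and relabel them to the target class. TARGET_LABEL and the 4x4 square are illustrative choices.

```python
import numpy as np

TARGET_LABEL = 33   # assumed target class
INJECT_RATIO = 0.1  # fraction of training images to poison

def stamp_trigger(img):
    """Stamp a 4x4 white square in the bottom-right corner (pixel range 0-255)."""
    img = img.copy()
    img[-4:, -4:, :] = 255
    return img

def poison_dataset(X, Y):
    """X: (N, H, W, C) images; Y: (N, num_classes) one-hot labels."""
    idx = np.random.choice(len(X), int(len(X) * INJECT_RATIO), replace=False)
    X_p, Y_p = X.copy(), Y.copy()
    for i in idx:
        X_p[i] = stamp_trigger(X_p[i])
        Y_p[i] = 0
        Y_p[i, TARGET_LABEL] = 1
    return X_p, Y_p
```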

The reverse-engineering output is not correct

Hi,

I want to do some experiments on MNIST data/models. I used the BadNets method. Here is my trigger image:
[image: trigger image]
I modified four pixels in the upper-left corner, changing their value from 0 to 255. I then relabeled about 2000 triggered images, whose original label is 8, to label 1, and added them to the 60000 training images.
The infected model's accuracy on the test dataset is around 99%, and the attack success rate is 100%.
But the reverse-engineering output is not correct.
This is pixel_mnist_fusion_label_1.png:
[image: pixel_mnist_fusion_label_1]
This is pixel_mnist_mask_label_1.png:

[image: gtsrb_visualize_mask_label_1]
And the output of outlier detection is:

```
10 labels found
median: 482.968628, MAD: 107.412963
anomaly index: 1.030401
flagged label list:
elapsed time 0.01 s
```
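For reference, a sketch of the MAD outlier test described in the paper (a paraphrase, not the repo's exact code); the 1.4826 constant assumes normally distributed norms, and an index below the threshold of 2 is consistent with the empty flagged-label list above:

```python
import numpy as np

def mad_anomaly_index(l1_norms):
    """l1_norms: the L1 norm of the reversed mask for each label."""
    l1_norms = np.asarray(l1_norms, dtype=float)
    median = np.median(l1_norms)
    mad = 1.4826 * np.median(np.abs(l1_norms - median))  # consistency constant
    return (median - l1_norms.min()) / mad  # > 2 suggests an infected label
```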

Here are my settings; they are the same as those used at training time:

```python
# input size
IMG_ROWS = 28
IMG_COLS = 28
IMG_COLOR = 1
INPUT_SHAPE = (IMG_ROWS, IMG_COLS, IMG_COLOR)

NUM_CLASSES = 10  # total number of classes in the model
Y_TARGET = 1  # (optional) infected target label, used for prioritizing label scanning

INTENSITY_RANGE = 'mnist'  # preprocessing method for the task (GTSRB uses raw pixel intensities)
```
Can you give me some advice on where this goes wrong?
Thank you

Implementation on other datasets?

I tried your released code on the CIFAR10 dataset, and the results are not satisfying: the reverse-engineered trigger does not resemble the actual pattern or mask (a white square). Could you release your code for the other datasets mentioned in your paper?

Thanks a lot!

Information about VGGFace models is missing: can you add it?

Hello,

I have two requests:

  1. Can you provide the code you used to reverse-engineer the VGG-Face models? It would be great if you added this code to the repo.

  2. Can you provide the information needed to apply the pruning method to the two models, the GTSRB model and the VGG-Face model? That is, which neurons did you remove, and how did you select them? The results I am getting are lower than the ones reported in the paper (one plausible interpretation is sketched below for comparison).
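A hypothetical sketch of one plausible reading of the paper's pruning mitigation (not the released code): rank neurons in a late Dense layer by their mean activation on trigger-stamped inputs and zero the outgoing weights of the most active ones. layer_name and prune_frac are assumptions.

```python
import numpy as np
import tensorflow as tf

def prune_trigger_neurons(model, x_triggered, layer_name, prune_frac=0.3):
    """Zero the outgoing weights of neurons most active on triggered inputs.

    Assumes `layer_name` is a Dense layer followed by another Dense layer.
    """
    layer = model.get_layer(layer_name)
    feature = tf.keras.Model(model.input, layer.output)
    acts = feature.predict(x_triggered).mean(axis=0)  # per-neuron mean activation
    pruned = np.argsort(acts)[-int(len(acts) * prune_frac):]
    nxt = model.layers[model.layers.index(layer) + 1]  # following Dense layer
    w, b = nxt.get_weights()
    w[pruned, :] = 0.0  # cut the pruned neurons' contribution downstream
    nxt.set_weights([w, b])
    return pruned
```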

Adaptation for partial backdoor attack

Your excellent work is much appreciated. However, I have one small question.
As mentioned in the paper, you implemented detection for the partial backdoor attack; could you share that code?
My confusion is this: for each target label, if I try every source-target pair, I may end up with several reversed triggers per target label, and I can still find a subset of reversed triggers flagged as outliers (so, for a given target label, the number of reversed triggers is likely to be more than one). Then, when running detection, which reversed trigger(s) should determine the activation profile?
Thanks for your kind reply!

reg_best does not converge

I tried to use a pretrained backdoored model on the CIFAR10 dataset, but during visualization none of the values among cost, attack, loss, ce, reg, and reg_best gets updated.
Here is the snapshot:

```
loading dataset
X_test shape (50, 32, 32, 3)
Y_test shape (50, 10)
loading model
processing label 7
resetting state
('mask_tanh', -3.672258450327029, 3.5073656483843307)
('pattern_tanh', -3.9436355297972994, 4.010161127999756)
step: 0, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 528.047913, reg_best: inf
step: 1, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 547.919495, reg_best: inf
step: 2, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
step: 3, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
step: 4, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
down cost from 0.00E+00 to 0.00E+00
step: 5, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
step: 6, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
step: 7, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
step: 8, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
step: 9, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
down cost from 0.00E+00 to 0.00E+00
step: 10, cost: 0.00E+00, attack: 0.000, loss: 16.118097, ce: 16.118097, reg: 548.243958, reg_best: inf
```
