privacytrustlab / ml_privacy_meter
Privacy Meter: An open-source library to audit data privacy in statistical and machine learning algorithms.
License: MIT License
When running tutorials/alexnet.py, line 84 tries to use the attribute 'defense', which is never defined in the codebase.
1) How can I attack these models?
2) Can I attack models in other formats, e.g. scikit-learn models serialized with joblib?
In my experiments it seems the dataset computation is not always working correctly.
In one of my latest experiments, for example, I have a dataset with 18411 input samples and a member-set of almost half that size, with 9286 samples. I am building my datahandler object with attack_percentage=75, and the member_train and member_test variables within the attack_data class are correctly initialized with 6964 and 2322 samples respectively (these sum to 9286, and that's fine).
The problem is with the nonmember_train and nonmember_test variables: they also get initialized with 6964 and 2322 samples respectively, which does not make sense, as all these datasets then sum to 18572, more than my initial dataset of 18411 samples.
So this is the situation I have:
dataset size: 18411
member-set size: 9286, member_train size: 6964, member_test size: 2322
nonmember-set size: 9286, nonmember_train size: 6964, nonmember_test size: 2322
The expected nonmember-set size, by the way, is 9125.
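As a sanity check, the expected sizes follow from simple arithmetic (a pure-Python sketch of the split as I understand it; the variable names are mine):

# Expected split, assuming attack_percentage selects the member_train share
# and non-members are everything outside the member-set.
dataset_size = 18411
member_size = 9286
attack_percentage = 75

member_train = int(member_size * attack_percentage / 100)  # 6964
member_test = member_size - member_train                   # 2322
expected_nonmember_size = dataset_size - member_size       # 9125
print(member_train, member_test, expected_nonmember_size)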
For completeness, those numbers have been extracted from the meminf.py file, right before the main training procedure begins:
print(len(list(mtrainset.unbatch().as_numpy_iterator())))
print(len(list(nmtrainset.unbatch().as_numpy_iterator())))
print(len(list(mtestset.unbatch().as_numpy_iterator())))
print(len(list(nmtestset.unbatch().as_numpy_iterator())))
Investigating the problem, it seems there are two more issues:
The first one is that those datasets are not mutually exclusive; something in the attack_data class is not working as expected, as I was able to find intersections between them:
(Those numbers keep changing a bit, probably because of the randomization of the dataset when initializing it.)
If I'm not mistaken, this should not happen.
The previous intersections, however, are not always present; the situation varies a lot depending on the member-set I use. With some member-sets I get no intersections, with others I get only some intersections between the datasets. All the member-sets, by the way, are computed in the same way, so there is no syntactic difference between them.
I still can't say why that's happening, but I'll keep investigating it.
The second issue is that the intersection method in the attack_utils class is not working as it should: it is supposed to find the common samples between nmtrainset and nmtestset, but that's not working on my system. example, in my case, is a tuple of two tf.Tensor: the first one is the array with the input features and the second one is the label. The easiest way to tell what is happening is to show you, so here's a little demo:
>>> import numpy as np
>>> import tensorflow as tf
>>> A = tf.constant([1,2,3,4])
>>> B = tf.constant([0])
>>> C = tf.constant([0])
>>> hash(bytes(np.array(B)))
-3575773765697568389
>>> hash(bytes(np.array(C)))
-3575773765697568389
>>> hash(bytes(np.array((A, B))))
-1488792138337542742
>>> hash(bytes(np.array((A, C))))
-1549983801855091178
Basically, equal samples are hashing differently.
I cannot say if that behavior is related to the Python version or something else. I am using Python 3.8.6, by the way.
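For what it's worth, a hash computed per component behaves as expected on my machine (a minimal sketch; sample_hash is a name I made up):

import numpy as np
import tensorflow as tf

def sample_hash(example):
    # Hash a (features, label) tuple of tf.Tensors by value.
    # bytes(np.array((features, label))) builds a dtype=object array whose
    # raw buffer contains object pointers, so value-equal samples hash
    # differently; hashing each component's own data buffer avoids that.
    features, label = example
    return hash(bytes(np.array(features)) + bytes(np.array(label)))

A = tf.constant([1, 2, 3, 4])
B = tf.constant([0])
C = tf.constant([0])
assert sample_hash((A, B)) == sample_hash((A, C))  # equal samples, equal hash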
Hi,
Thanks so much for providing such a useful and interesting framework. As the title says, would you please check the code for the issue? There seem to be some errors in version 1.0.1.
In order to plot the ROCCurveReport, an explanations.json file is needed. But it is never created, and therefore the report crashes. I'm trying:
ROCCurveReport.generate_report(
    metric_result=audit_results,
    inference_game_type=InferenceGame.PRIVACY_LOSS_MODEL,
    show=True
)
Hi,
When the MIA seems unable to succeed, I often get an attack accuracy that is exactly 0.500200092792511.
Did you see this value in your experiments, too?
If you did, is there a known reason why the implementation returns exactly this value?
I'm using the AlexNet tutorial Python file with the following settings:
input_shape = (32, 32, 3)
cmodelA = tf.keras.models.load_model(cprefix)
saved_path = "datasets/cifar100_train.txt.npy"
dataset_path = 'datasets/cifar100.txt'
datahandlerA = ml_privacy_meter.utils.attack_data.attack_data(dataset_path=dataset_path,
member_dataset_path=saved_path,
batch_size=100,
attack_percentage=10, input_shape=input_shape,
normalization=True)
attackobj = ml_privacy_meter.attack.meminf.initialize(
target_train_model=cmodelA,
target_attack_model=cmodelA,
train_datahandler=datahandlerA,
attack_datahandler=datahandlerA,
layers_to_exploit=[72], # last layer of my ResNet20
device=None, epochs=3, model_name=cprefix)
Hi,
I have been reading the paper in which your team studies and proposes this attack framework [1]. In [1] it is stated that the learning rate of the attack is set to 0.0001, but in this implementation it is set to 0.001 by default, which is an order of magnitude larger, and the tutorials leave this learning rate unmodified.
Could you tell me which learning rate is more appropriate?
Moreover, Appendix A of the paper [1] contains a description of the architecture of the attack model, but that description doesn't match the implementation in this repository.
Could you tell me which implementation is preferable (the one in the paper or the one given in this repository)?
[1] Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning (https://arxiv.org/abs/1812.00910)
Hello everyone,
Firstly, thanks for the great work. I am trying to use ML-Privacy-Meter to attack a target model built as in the TF example model; you may see the model code below.
However, when I try to exploit convolutional gradients, specifically those of the third convolutional layer, I get a shape mismatch error, as shown below.
Traceback (most recent call last):
File "tutorials/attack_alexnet.py", line 104, in
attackobj.train_attack()
File "/home/ml-privacy-meter/ml_privacy_meter/attack/meminf.py", line 518, in train_attack
moutputs = self.forward_pass(model, mfeatures, mlabels)
File "/home/ml-privacy-meter/ml_privacy_meter/attack/meminf.py", line 454, in forward_pass
attack_outputs = self.attackmodel(self.inputArray)
File "/home/ml-privacy-meter/venv/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 998, in call
input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
File "/home/ml-privacy-meter/venv/lib/python3.8/site-packages/tensorflow/python/keras/engine/input_spec.py", line 271, in assert_input_compatibility
raise ValueError('Input ' + str(input_index) +
ValueError: Input 3 is incompatible with layer model: expected shape=(None, 64, 3, 64), found shape=(1, 3, 3, 64, 64)
I'd like to get help from you in this regard.
Thanks,
Hi all,
I'm trying to attack my pre-trained ResNet20 model (ResNet20.zip: https://github.com/privacytrustlab/ml_privacy_meter/files/6685533/ResNet20.zip) with the following model architecture:
ResNet20_architecture.txt
For training, I used the same procedure as suggested in the tutorial.
To attack the model I use the tutorial file attack_alexnet.py with the following config:
input_shape = (32, 32, 3)
cmodelA = tf.keras.models.load_model(cprefix)
cmodelA.summary()
saved_path = "datasets/cifar100_train.txt.npy"
dataset_path = 'datasets/cifar100.txt'
datahandlerA = ml_privacy_meter.utils.attack_data.attack_data(dataset_path=dataset_path,
member_dataset_path=saved_path,
batch_size=100,
attack_percentage=10, input_shape=input_shape,
normalization=True)
attackobj = ml_privacy_meter.attack.meminf.initialize(
target_train_model=cmodelA,
target_attack_model=cmodelA,
train_datahandler=datahandlerA,
attack_datahandler=datahandlerA,
layers_to_exploit=[72],
# gradients_to_exploit=[1],
device=None, epochs=3, model_name='ResNet20')
Attacking the model without the gradients_to_exploit parameter works:
Epoch 0 over :Attack test accuracy: 0.499799907207489, Best accuracy : 0.499799907207489
But if I try to exploit the gradients of the first conv2d (Conv2D) layer, which has output shape (None, 32, 32, 16) and 448 params, is connected to input_1[0][0], and is referred to by index 1 of gradients_to_exploit=[1], this error occurs:
Traceback (most recent call last):
File "tutorials/attack_alexnet.py", line 88, in <module>
attackobj.train_attack()
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf.py", line 473, in train_attack
moutputs = self.forward_pass(model, mfeatures, mlabels)
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf.py", line 412, in forward_pass
self.get_gradients(model, features, labels)
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf.py", line 363, in get_gradients
toappend = tf.reshape(grads[g], reshaped)
File "/Users/christianstudinsky/opt/anaconda3/envs/ml_privacy_meter_36/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/Users/christianstudinsky/opt/anaconda3/envs/ml_privacy_meter_36/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 195, in reshape
result = gen_array_ops.reshape(tensor, shape, name)
File "/Users/christianstudinsky/opt/anaconda3/envs/ml_privacy_meter_36/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8373, in reshape
tensor, shape, name=name, ctx=_ctx)
File "/Users/christianstudinsky/opt/anaconda3/envs/ml_privacy_meter_36/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 8398, in reshape_eager_fallback
ctx=ctx, name=name)
File "/Users/christianstudinsky/opt/anaconda3/envs/ml_privacy_meter_36/lib/python3.6/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 432 values, but the requested shape has 9 [Op:Reshape]
Or if I try to exploit the gradients of the second conv2d_2 (Conv2D) layer, which has output shape (None, 32, 32, 16) and 272 params, is connected to re_lu[0][0], and is referred to by index 1 of gradients_to_exploit=[1], this error occurs:
Traceback (most recent call last):
File "tutorials/attack_alexnet.py", line 73, in <module>
device=None, epochs=3, model_name='ResNet20')
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf.py", line 168, in __init__
self.create_attack_components(layers)
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf.py", line 287, in create_attack_components
self.create_gradient_components(model, layers)
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf.py", line 261, in create_gradient_components
module = cnn_for_cnn_gradients(shape)
File "/Users/christianstudinsky/Documents/0_Masterarbeit/1_Experiments/ml_privacy_meter/ml_privacy_meter/attack/meminf_modules/create_cnn.py", line 116, in cnn_for_cnn_gradients
dim1 = int(input_shape[3])
File "/Users/christianstudinsky/opt/anaconda3/envs/ml_privacy_meter_36/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 889, in __getitem__
return self._dims[key].value
IndexError: list index out of range
I don't know if I'm missing something important or executing ml_privacy_meter in the wrong way.
Sunny greetings from Karlsruhe
Chris
Hello, there might be a trivial flaw in your code.
In line 39 of the file ml_privacy_meter/tutorials/attack_alexnet.py:
it is written that the means and standard deviations for normalization will be calculated if unset.
I understand that you expect this to be determined by this line in ml_privacy_meter/tutorials/attack_alexnet.py:
However, it doesn't work and instead raises an error on my machine:
AttributeError: 'attack_data' object has no attribute 'means'
The fact that you didn't declare these two attributes (means and stddevs) in the initialization method of the attack_data class is presumably how the bug crept into your program.
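A minimal sketch of the fix I have in mind (the constructor signature is abbreviated; only the relevant arguments are shown):

class attack_data:
    def __init__(self, dataset_path, member_dataset_path, batch_size=100,
                 attack_percentage=10, input_shape=None, normalization=False,
                 means=None, stddevs=None):
        self.normalization = normalization
        # Declaring the attributes up front lets later code check
        # `if self.means is None` and compute them lazily, instead of raising
        # AttributeError: 'attack_data' object has no attribute 'means'.
        self.means = means
        self.stddevs = stddevs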
Looking forward to your reply.
Hello, when I ran the code, there was an error about exceeding the GPU memory. I tried to modify the batch_size but still got the error. I would like to ask how big the GPU memory was in your experiments. Also, was this experiment actually run on a personal laptop or on a server? Thank you very much.
Hello,
First of all, I wanted to congratulate you on the impressive quality of the work you did with privacy_meter; it's really great.
I wanted to ask what the conceptual difference is between Metric and InferenceGame. It seems to me that there is a 1:1 relationship between the notion of InferenceGame and the inference game definitions (3.1-3.4) given by Ye et al.
However, it seems that this notion is not used when mounting the actual attack, which relies instead on the Metric notion; InferenceGame is then only used when generating the report. I am failing to understand exactly how these two notions are related to each other.
Thanks in advance!
I am implementing a blackbox attack against the basic binary TensorFlow classifier with tabular data below. Here is the notebook:
credit_default.ipynb.zip
It errors out due to a size incompatibility during the training of the attack object, in attackobj.train_attack(). It appears to be related to how shape is defined in ml_privacy_meter.utils.attack_data.AttackData, but I am not able to see how to set it correctly. Thank you for the help in advance (since this type of classifier is very common, adding it to the library demos may also add value).
TensorFlow version: 2.1.4
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
in
----> 1 attackobj.train_attack()
~/username/git/ml_privacy_meter/ml_privacy_meter/attack/meminf.py in train_attack(self)
443 model = self.target_train_model
444
--> 445 pred = model(nm_features)
446 acc = accuracy_score(nm_labels, np.argmax(pred, axis=1))
447 print('Target model test accuracy', acc)
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py in call(self, inputs, *args, **kwargs)
820 with base_layer_utils.autocast_context_manager(
821 self._compute_dtype):
--> 822 outputs = self.call(cast_inputs, *args, **kwargs)
823 self._handle_activity_regularization(inputs, outputs)
824 self._set_mask_metadata(inputs, outputs, input_masks)
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/sequential.py in call(self, inputs, training, mask)
265 if not self.built:
266 self._init_graph_network(self.inputs, self.outputs, name=self.name)
--> 267 return super(Sequential, self).call(inputs, training=training, mask=mask)
268
269 outputs = inputs # handle the corner case where self.layers is empty
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/network.py in call(self, inputs, training, mask)
715 return self._run_internal_graph(
716 inputs, training=training, mask=mask,
--> 717 convert_kwargs_to_constants=base_layer_utils.call_context().saving)
718
719 def compute_output_shape(self, input_shape):
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/network.py in _run_internal_graph(self, inputs, training, mask, convert_kwargs_to_constants)
889
890 # Compute outputs.
--> 891 output_tensors = layer(computed_tensors, **kwargs)
892
893 # Update tensor_dict.
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py in call(self, inputs, *args, **kwargs)
820 with base_layer_utils.autocast_context_manager(
821 self._compute_dtype):
--> 822 outputs = self.call(cast_inputs, *args, **kwargs)
823 self._handle_activity_regularization(inputs, outputs)
824 self._set_mask_metadata(inputs, outputs, input_masks)
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/keras/layers/core.py in call(self, inputs)
1129 if rank > 2:
1130 # Broadcasting is required for the inputs.
-> 1131 outputs = standard_ops.tensordot(inputs, self.kernel, [[rank - 1], [0]])
1132 # Reshape the output back to the original ndim of the input.
1133 if not context.executing_eagerly():
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py in tensordot(a, b, axes, name)
4104 b_reshape, b_free_dims, b_free_dims_static = _tensordot_reshape(
4105 b, b_axes, True)
-> 4106 ab_matmul = matmul(a_reshape, b_reshape)
4107 if isinstance(a_free_dims, list) and isinstance(b_free_dims, list):
4108 return array_ops.reshape(ab_matmul, a_free_dims + b_free_dims, name=name)
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/util/dispatch.py in wrapper(*args, **kwargs)
178 """Call target, and fall back on dispatchers if there is a TypeError."""
179 try:
--> 180 return target(*args, **kwargs)
181 except (TypeError, ValueError):
182 # Note: convert_to_eager_tensor currently raises a ValueError, not a
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/ops/math_ops.py in matmul(a, b, transpose_a, transpose_b, adjoint_a, adjoint_b, a_is_sparse, b_is_sparse, name)
2796 else:
2797 return gen_math_ops.mat_mul(
-> 2798 a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
2799
2800
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/ops/gen_math_ops.py in mat_mul(a, b, transpose_a, transpose_b, name)
5614 pass # Add nodes to the TensorFlow graph.
5615 except _core._NotOkStatusException as e:
-> 5616 _ops.raise_from_not_ok_status(e, name)
5617 # Add nodes to the TensorFlow graph.
5618 if transpose_a is None:
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/tensorflow_core/python/framework/ops.py in raise_from_not_ok_status(e, name)
6604 message = e.message + (" name: " + name if name is not None else "")
6605 # pylint: disable=protected-access
-> 6606 six.raise_from(core._status_to_exception(e.code, message), None)
6607 # pylint: enable=protected-access
6608
~/username/git/credoai_research/pythonenv3/lib/python3.6/site-packages/six.py in raise_from(value, from_value)
InvalidArgumentError: Matrix size-incompatible: In[0]: [44220,1], In[1]: [22,1] [Op:MatMul] name: sequential/layer1/Tensordot/MatMul/
Hello everyone,
I'd like to experiment with this tool in a federated learning setting for my master's thesis, but I can't achieve an accuracy better than 0.5121 with the tutorial using the blackbox config, so I think I'm overlooking something essential.
My execution of the tutorial:
attackobj = ml_privacy_meter.attack.meminf.initialize(
    target_train_model=cmodelA,
    target_attack_model=cmodelA,
    train_datahandler=datahandlerA,
    attack_datahandler=datahandlerA,
    layers_to_exploit=[26],
    exploit_loss=False,
    device=None)
Besides this execution, I had to make a few changes in the dataset files because they used Python 2 functions. I commented the original lines of code out to make my changes transparent, and I attached the files to this issue.
Also, I added the following line below the matplotlib import at line 11: matplotlib.use('TkAgg')
This was necessary to overcome an error related to macOS.
The terminal output is also attached.
create_cifar100_train.txt
preprocess_cifar100.txt
I tried the Whitebox config as well and there I achieved an accuracy of 0.7480 in the first 3 epochs.
So, I hope there is just this one little thing I'm overlooking in the blackbox setting.
Thank you for supporting this project.
All the best from Karlsruhe, Germany
Hello, ml_privacy_meter looks good; it is well encapsulated.
I'm going to apply your tool to evaluate my model, and I have some questions, as follows.
Hi, I would like to know how this library should be used in a federated learning scenario. Also, how can I reproduce the attacks on federated learning described in your paper "Comprehensive Privacy Analysis of Deep Learning: Stand-alone and Federated Learning under Passive and Active White-box Inference Attacks"? I would appreciate it a lot if you could reply promptly.
It causes some errors when trying to load "alexnet_pretrained". An HDF5 save would be appreciated.
Hi, I noticed that the detailed implementation of Attack-S and Attack-P is not given in the source code of Enhanced MIA; where can I find it?
Hi, I would like to know if there is any PyTorch implementation of this, or if any future work on this in PyTorch is planned.
I want to calculate the Privacy Leakage metric from this Usenix paper, which is simply the difference between the true positive rate (TPR) and the false positive rate (FPR) of the inference attack.
For Attack-S, it seems straightforward, since the result from audit_obj.run()[0] contains the members fp and tp. However, for Attack-R, fp and tp are lists with n+1 elements (n being the number of reference models), sorted in ascending order.
It adds to my confusion that a single roc_auc is returned, so it's not clear to me how it is computed from the lists of tp and fp, and which values from those lists I should use to calculate the Privacy Leakage metric; can you help?
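For reference, this is the computation I have in mind (a sketch with made-up counts; the set sizes and the choice of taking the maximum over thresholds are my assumptions):

import numpy as np

# Dummy per-threshold counts standing in for the Attack-R lists
# (the real lists have n+1 entries for n reference models).
tp = np.array([500, 420, 300, 150, 0])  # true positives per threshold
fp = np.array([500, 310, 180, 60, 0])   # false positives per threshold
n_members, n_nonmembers = 500, 500      # assumed set sizes

tpr = tp / n_members
fpr = fp / n_nonmembers
privacy_leakage = np.max(tpr - fpr)     # TPR - FPR, maximized over thresholds
print(privacy_leakage)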
Is the time it takes to train the shadow models normal?
First of all, thanks a lot for your open source contributions; I'm having some problems understanding the code while browsing it. You set num_datapoints (default 5000) as the number of training data points for the target model. When the attack model is trained, attack_size (default 500) samples are extracted from the training data of the target model as the member training set of the attack model. My doubt is that the population set is composed of the whole CIFAR training set (50,000 samples) and test set (10,000 samples) of the pre-trained AlexNet (target model). When we sample the same number of samples from the population set as the non-member training set of the attack model, we may draw a subset of the target model's training set, which is member data for the attack model. I think this will produce cheating.
I thought of a workaround: we should retrain the AlexNet model, and the training set of the retrained AlexNet should be the same as the target model's training set in attack_alexnet.py.
Looking forward to hearing from you, thank you!
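To make the concern concrete, here is a toy sketch with index sets (all numbers are illustrative):

import numpy as np

rng = np.random.default_rng(0)
population = np.arange(60000)      # 50k train + 10k test indices
target_train = population[:50000]  # indices the target model was trained on

# Sampling "non-members" from the full population can hit training points.
nonmember_candidates = rng.choice(population, size=500, replace=False)
print(len(np.intersect1d(nonmember_candidates, target_train)))  # usually > 0

# Workaround: sample non-members only from outside the training set.
safe_pool = np.setdiff1d(population, target_train)
nonmembers = rng.choice(safe_pool, size=500, replace=False)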
I have tried the default datasets cifar100 and purchase100; with cifar100 the attack could reach an accuracy of 75%, while with purchase100 I could only get an accuracy of 52%, which is basically random guessing. I wonder whether there are special settings needed to use a non-image dataset.
Besides, I also tried cifar10, and the accuracy is around 63%; did you try this privacy meter with cifar10, and are some adjustments needed?
Thanks.
It would be great to have a conda recipe so that it can be included with projects that have more complicated build processes (for example, using libraries that need C/C++ compilers).
There are some tutorials in the archive folder from the earlier version of the library which work for whitebox attacks (exposing gradients etc.). I don't see similar options in the restructured code. Is it still possible to audit those attacks, or have they been renamed?
Hello, this is my first issue so please bear with me.
Tldr: Should model.py use torch.tensor() instead of torch.Tensor()?
Explanation: I was adapting the reference_metric tutorial to use PyTorch when I encountered a compatibility issue between the passed tensor and the loss function when creating the Audit object. In particular, the loss function nn.CrossEntropyLoss() required a torch.int64 tensor while it was receiving a torch.float32 tensor. This is despite passing a dtype('int64') numpy array as the value of the y key in the Dataset object used in the InformationSources in the Audit constructor. I traced the error back, and it seems to stem from model.py. I noticed that model.py uses torch.Tensor() (which creates a torch.FloatTensor) instead of torch.tensor() (which infers the dtype of the tensor automatically). After replacing the code with torch.tensor(), the program ran as expected.
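A minimal repro of the dtype difference I am describing:

import numpy as np
import torch

y = np.array([0, 2, 1], dtype=np.int64)

# torch.Tensor() is an alias for torch.FloatTensor: the int64 labels are
# silently cast to float32, which nn.CrossEntropyLoss later rejects.
print(torch.Tensor(y).dtype)  # torch.float32

# torch.tensor() infers the dtype from the input array.
print(torch.tensor(y).dtype)  # torch.int64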
Is there a different way of making this work without editing the source? Or is torch.Tensor() just supposed to be torch.tensor()? Thanks!
Hello everyone,
I tried to install this project again, and the described setup doesn't work anymore.
I installed Python 3.6.13 in my Anaconda environment and installed the packages using the pip install -r requirements.txt command.
conda_list.txt attached.
But in the end it throws:
ERROR: No matching distribution found for tensorflow-gpu==2.5.0
(dt_test) Admins-MacBook-Pro:ml_privacy_meter christianstudinsky$ pip install -r requirements.txt
....
Requirement already satisfied: scipy==1.4.1 in /Users/christianstudinsky/opt/anaconda3/envs/dt_test/lib/python3.6/site-packages (from -r requirements.txt (line 39)) (1.4.1)
Collecting six==1.14.0
Using cached six-1.14.0-py2.py3-none-any.whl (10 kB)
Collecting sklearn==0.0
Using cached sklearn-0.0-py2.py3-none-any.whl
Requirement already satisfied: tensorboard==2.1.1 in /Users/christianstudinsky/opt/anaconda3/envs/dt_test/lib/python3.6/site-packages (from -r requirements.txt (line 42)) (2.1.1)
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==2.5.0 (from versions: 0.12.1, 1.0.0, 1.0.1, 1.1.0)
ERROR: No matching distribution found for tensorflow-gpu==2.5.0
Hi, I have a question regarding the implementation of the ModelIntermediateOutput class in information_source_signal.py. The class uses a dictionary with the key "layers" to determine which layer's output must be extracted as the input signal to the attack. Should the dictionary be passed as a parameter to the constructor? If yes, the class does not have an __init__ method, so I assume we need to define one; see the sketch below.
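Something like this is what I have in mind (the argument name is my own guess, not the library's API):

class ModelIntermediateOutput:
    def __init__(self, extraction_indicator):
        # e.g. {"layers": ["conv1", "fc2"]} -- which layers' outputs to extract
        self.extraction_indicator = extraction_indicator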
Also, in the get_intermediate_outputs method of PytorchModelTensor in model.py, self.intermediate_output is a dictionary that is never defined. Can you explain what is expected to be implemented here?
Thanks so much for the help!
@mihirkhandekar Hello, I'm still confused and sorry to trouble you again.
In issue #19, you replied to me that:
Model A (target_train_model) can be used as a shadow model to evaluate the performance of your membership inference model on Model B (target_attack_model).
In the tutorials you provide two examples, but in both of them you pass the same model to target_train_model and target_attack_model, which really confuses me.
In the file ml_privacy_meter/attack/meminf.py, the handling of target_attack_model stops at receiving it; there is no further operation on it.
As for the method test_attack(), I'm not sure what role it plays in your blueprint.
Is it a mistake, and should self.target_train_model be replaced with self.target_attack_model to evaluate the performance of the membership inference model on Model B (target_attack_model)?
Looking forward to your reply.
I found myself unsure about lots of concepts in privacy_meter, e.g., the way the metrics work, or what some arguments stand for.
I am wondering if there is a place where I can look these up?
I am especially uncertain about the reference metric.
Will you develop versions based on Federated Learning and Unsupervised Learning in the future? I am very interested in your articles.
Your work is excellent, providing a great verification tool for security and privacy researchers. I would like to inquire whether your method can be combined with existing differential privacy defense frameworks, such as the Opacus differential privacy framework. Is it possible to create a tutorial to demonstrate how to verify the effectiveness of differential privacy in defending against your MIA attack method? Thank you!
Hi, I'm reading the paper Enhanced Membership Inference Attacks against Machine Learning Models. It is very well written! I'm wondering if you have plan to open-source the code of "MIA via Distillation", since "MIA via Distillation" is claimed to have better performance than other MIAs already implemented in this package. I'm very interested in the algorithm/implementation details of this attack. Thanks!
In some situations there seem to be problems with how the ROC curve is calculated. This is clearly visible when there is a very high number of false positives or false negatives. I'll post an example where, IMHO, the AUC should be 0.5 (and not 1.0), as all my samples are classified as positives and there are no negatives.
The error is probably due to the following line in meminf.py:
I think it should be something like:
fpr, tpr, _ = roc_curve(target, probs > 0.5)
From what I can see, in fact, the error comes from the way sklearn sets the thresholds for computing the false positive rate and the true positive rate when it is passed an array of float values:
If your target vector is [[1], [0], [1], [1], [0]] and your predictions vector (probs) is [[0.9], [0.9], [0.9], [0.9], [0.9]], the auc(fpr, tpr) value returned by sklearn is 0.5 (meaning there are 0 false negatives and 2 false positives, and that's fine). But if your predictions vector is [[0.9], [0.8], [0.9], [0.9], [0.8]], the auc(fpr, tpr) value returned by sklearn is 1 (as if there were 0 false negatives and 0 false positives).
If I get this right, this is unwanted behavior for your use-case, as you want an AUC of 0.5 in both cases (both 0.8 and 0.9 are greater than 0.5, and a prediction greater than 0.5 should count as a false positive when the target value is 0).
Doing probs > 0.5, however, yields a vector of booleans of the same size as probs, with True where the original values were greater than 0.5 and False elsewhere; passing this vector instead of probs seems to work.
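A minimal repro of the difference, calling sklearn directly:

import numpy as np
from sklearn.metrics import auc, roc_curve

target = np.array([1, 0, 1, 1, 0])
probs = np.array([0.9, 0.8, 0.9, 0.9, 0.8])

# Thresholding on the raw scores, sklearn finds a cutoff between 0.8 and 0.9
# that separates the classes perfectly, so the AUC is 1.0 ...
fpr, tpr, _ = roc_curve(target, probs)
print(auc(fpr, tpr))  # 1.0

# ... whereas hard 0.5-thresholded decisions give the intended AUC of 0.5.
fpr, tpr, _ = roc_curve(target, probs > 0.5)
print(auc(fpr, tpr))  # 0.5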
Hi, what happened to the code/folder for Enhanced MIA?
The PyTorch model in shadow_metric.ipynb uses nn.CrossEntropyLoss, which expects unnormalized logits. However, the model outputs probabilities due to the use of nn.Softmax. This causes the model not to achieve 100% accuracy on the training set.
Additionally, criterions in PyTorch typically take arguments in the order (logits, targets). However, the code provides (targets, logits). This is not a functional concern at the moment, because targets contains class probabilities (rather than class indices), but it will probably become an issue once the bug above is fixed.
Both of these issues also exist in avg_loss_training_algo.ipynb.
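A small illustration of the first issue (the numbers are arbitrary):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[4.0, 0.5, -1.0]])
target = torch.tensor([0])

# Correct: feed raw logits; the criterion applies log-softmax internally.
print(criterion(logits, target))  # small loss, can approach 0

# Buggy pattern: applying Softmax first squashes the scores into [0, 1],
# so the loss is bounded away from 0 no matter how confident the model is.
probs = nn.Softmax(dim=1)(logits)
print(criterion(probs, target))   # noticeably larger loss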
Hi.
When I run the alexnet.py code to create the model and then run the attack_alexnet.py code, I run into an OOM (out of GPU memory) problem, so I'm asking a question here.
First of all, my environment is an RTX 2080 Ti, and I used tensorflow-gpu version 2.1.0 as in requirements.txt.
So what I want to ask is:
Thanks in advance.