lishenghui / blades
Blades: A Unified Benchmark Suite for Byzantine Attacks and Defenses in Federated Learning
License: Apache License 2.0
Hi, I am running mnist_examples.py and am confused about the parameter settings for the aggregation and attack strategies. Both the aggregator and the attacker parameters include "num_clients" and "num_byzantine". Should these be set to the same values for both, or do they have a different meaning that I have not figured out?
Also, how can we configure an aggregator with no defense strategy, so that results with and without a defense can be compared?
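For the second question, the usual undefended baseline is a plain coordinate-wise mean of the client updates. The sketch below is framework-agnostic and does not use the Blades API; `mean_aggregate` and `median_aggregate` are hypothetical helper names, and the coordinate-wise median stands in for "some defense" purely for comparison.

```python
from statistics import median


def mean_aggregate(updates):
    """Coordinate-wise mean: the undefended baseline."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


def median_aggregate(updates):
    """Coordinate-wise median: one simple robust alternative."""
    return [median(col) for col in zip(*updates)]


# Four honest updates near [1, 1] plus one Byzantine outlier (toy data).
honest = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9], [1.0, 1.0]]
byzantine = [[-100.0, 100.0]]
updates = honest + byzantine

undefended = mean_aggregate(updates)   # dragged far away by the outlier
defended = median_aggregate(updates)   # stays at the honest consensus
```

Running the same attack once with each aggregator gives the "with defense" and "without defense" curves to compare.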
Hello, I have been having problems installing blades with pip, so I can't use it. Has something gone wrong with the project recently?
Hello author. Is it your team that implemented all these defense algorithms? Because I want to add a comparison experiment to my code, but it is not compatible with Blades, so I would like to ask if there is a source code for the defense algorithms. Thank you very much!
When I specify trusted_id as 0, it throws assert len(trusted_clients) == 1, and when I specify fedavg, it throws TypeError: run() got an unexpected keyword argument 'global_model'.
It is a pleasure to find your work, but could you upload the new version or a new simulation example?
I can't run mini-example.py or the simulation on Mnist.py.
My Configuration:
python 3.8.3
CUDA 11.3
torch 1.10
Dear Prof, when I run todo_cifar10_gpu.py, it raises the error "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!" and I don't know how to debug it. Could you please give me some advice?
[New feature]
Support for configuration files for simulations
It is not possible to test non-IID data splits by editing the .yaml configuration files as the IID splitter is always used.
To replicate, run blades/tuned_examples/fedavg_cifar10_resnet_noniid.yaml
where
iid:
  grid_search: [False]
alpha:
  grid_search: [0.1, 0.5, 1, 100]
The resulting trials use the IID data splitter and have the same results for different alpha values.
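For reference, this is roughly what a Dirichlet-based non-IID split is expected to do with `alpha`. It is a standalone sketch using only the standard library, not the Blades splitter; `dirichlet_label_split` is a hypothetical name, and the Dirichlet sample is drawn by normalizing Gamma variates.

```python
import random


def dirichlet_label_split(labels, num_clients, alpha, seed=0):
    """Partition sample indices across clients, class by class.

    For each class, client proportions are drawn from Dirichlet(alpha).
    Small alpha concentrates a class on few clients (non-IID);
    large alpha spreads it almost evenly (close to IID).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    shards = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Dirichlet(alpha) sample = normalized independent Gamma(alpha, 1) draws.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights) or 1.0  # guard against numerical underflow
        start, cum = 0, 0.0
        for c, w in enumerate(weights):
            cum += w
            # The last client takes the remainder so every index is assigned.
            end = len(idxs) if c == num_clients - 1 else int(round(cum / total * len(idxs)))
            shards[c].extend(idxs[start:end])
            start = end
    return shards


labels = [i % 10 for i in range(1000)]      # toy labels, 10 balanced classes
skewed = dirichlet_label_split(labels, num_clients=4, alpha=0.1)
near_iid = dirichlet_label_split(labels, num_clients=4, alpha=100)
```

With a working non-IID splitter, per-client label histograms for alpha = 0.1 and alpha = 100 should look clearly different; identical results across alpha values are consistent with the splitter not being applied.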
When I ran Mean under the previous version of the signflipping attack, the attack was very effective. The accuracy of the global model aggregated by Mean is only 10%. Here is the code for the previous signflipping attack.
class SignflippingClient(ByzantineClient):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def local_training(self, data_batches):
        for data, target in data_batches:
            data, target = data.to(self.device), target.to(self.device)
            data, target = self.on_train_batch_begin(data=data, target=target)
            self.optimizer.zero_grad()
            output = self.model(data)
            loss = torch.clamp(self.loss_func(output, target), 0, 1e5)
            loss.backward()
            for name, p in self.model.named_parameters():
                p.grad.data = -p.grad.data
            self.optimizer.step()
But when I run the signflipping attack on the current version, Mean converges. I'm not sure what the problem is.
class SignFlipAdversary(Adversary):
    def on_algorithm_start(self, algorithm: Algorithm):
        class SignFlipCallback(ClientCallback):
            def on_backward_end(self, task):
                model = task.model
                for _, para in model.named_parameters():
                    para.grad.data = -para.grad.data

        for client in self.clients:
            client.to_malicious(callbacks_cls=SignFlipCallback, local_training=True)
My config is
fedavg_blades:
  run: FEDAVG
  stop:
    training_iteration: 200
    # train_loss: 100000
  config:
    random_seed:
      # grid_search: [122, 123]
      grid_search: [111]
      # grid_search: [111, 112, 123, 124, 125]
    dataset_config:
      type: MNIST
      num_clients: 20
      train_batch_size: 128
    evaluation_interval: 5
    num_remote_workers: 0
    num_gpus_per_worker: 0.6
    num_cpus_per_worker: 0
    num_cpus_for_driver: 8
    num_gpus_for_driver: 0.3
    global_model: mlp
    client_config:
      lr: 0.1
      momentum:
        grid_search: [0.9]
    server_config:
      aggregator:
        grid_search: [
          type: Mean,
        ]
      optimizer:
        type: SGD
        lr: 1
        # lr_schedule: [[0, 0.1], [1500, 0.1], [1501, 0.01], [2000, 0.01]]
        momentum:
          grid_search: [0.0]
          # grid_search: [0.0, 0.5, 0.9]
    num_malicious_clients:
      grid_search: [8]
    adversary_config:
      grid_search:
        - type: blades.adversaries.SignFlipAdversary
My guess is that the current version of signflipping is implemented incorrectly: it should invert the sign of the gradient for all malicious users in on_local_round_end().
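One thing that may be worth checking regardless of which hook is used: with 8 malicious clients out of 20 and a Mean aggregator, a pure sign flip without scaling does not by itself reverse the averaged direction. The sketch below is a deliberately idealized calculation that assumes every client contributes the same scalar update, which is not how either version of the attack actually trains; `mean_update` and `flip_scale` are hypothetical names.

```python
def mean_update(n_clients, n_malicious, grad, flip_scale=1.0):
    """Mean of identical honest updates and sign-flipped malicious ones."""
    honest = [grad] * (n_clients - n_malicious)
    malicious = [-flip_scale * grad] * n_malicious
    return sum(honest + malicious) / n_clients


# 12 honest and 8 flipped clients: (12 - 8) / 20 = 0.2 of the honest step.
plain_flip = mean_update(20, 8, grad=1.0)

# Only when the flipped updates are amplified does the mean change sign.
scaled_flip = mean_update(20, 8, grad=1.0, flip_scale=2.0)
```

Under these assumptions the aggregate still points in the honest direction, only shrunk, so slower convergence rather than collapse would not be surprising. The earlier version flips the gradient at every local step, so its submitted updates may be much larger than a single flipped gradient.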
When I run the example, it shows: Error: No available node types can fulfill resource request {'CPU': 1.0, 'GPU': 0.2}. Add suitable node types to this cluster to resolve this issue.
Is this a Ray problem?
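The message appears to come from Ray's scheduler rather than from Blades itself: each task's resource request has to fit on a single node, so a request that includes a GPU fraction cannot be placed if no node exposes a GPU. The toy check below only models that rule; it does not call Ray, and `fits` and `can_schedule` are hypothetical names.

```python
def fits(request, node):
    """True if one node has at least the requested amount of every resource."""
    return all(node.get(name, 0.0) >= amount for name, amount in request.items())


def can_schedule(request, nodes):
    """A request is schedulable if at least one node can hold it entirely."""
    return any(fits(request, node) for node in nodes)


request = {"CPU": 1.0, "GPU": 0.2}          # the request from the error message
cpu_only_cluster = [{"CPU": 8.0}]           # no GPU visible: request cannot fit
gpu_cluster = [{"CPU": 8.0, "GPU": 1.0}]    # one GPU visible: request fits

blocked = not can_schedule(request, cpu_only_cluster)
placed = can_schedule(request, gpu_cluster)
```

If the machine has no GPU, or the GPU is not visible to Ray, lowering the GPU settings in the example configuration to 0 is one thing to try.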
I'm not sure if you're aware, or if this situation only exists on the several operating systems I have run the simulation suite on, but the example files, and even the main code, simply do not work.
There are many references which may be Python-version specific, as they prevent the program from running at all.
For example, to import the simulator you consistently refer to blades.core.simulator, but the import only works as blades.simulator, without .core.
The import of options from args does not work either, and the file appears to be written for Python 2.
I have created brand new environments including only the files needed for this project and the problems remain. I have also tried this on various python 3 platforms, in windows and linux, and each time I am unable to have things operate in the methods you have identified.
I was desperate to use this suite as it fit the needs of my research paper, but my lack of ability to operate it, made this a frustrating exercise.
First and foremost, I would like to express my support and appreciation for your work on poisoning attack on federated learning.
Additionally, I'm curious to know whether you can reproduce the similar result as the original paper.
Best regards
Hello, your code helped me a lot, but I ran into a problem: how do I change the total number of clients? I set "num_clients" both when generating the dataset and in "aggregator_kws" (I'm using krum), but neither takes effect; there are always only 10 clients. How can I solve this problem? Thanks.
In the attack_multikrum function (Fang's adaptive attack), lines 124-126:
term_1 = min_score / ((n_benign - n_attackers - 1) * torch.sqrt(torch.Tensor([d]))[0])
should be
term_1 = min_score / ((n_benign - 2 * n_attackers - 1) * torch.sqrt(torch.Tensor([d]))[0])
based on equation 3 from the original paper
Please correct me if I am wrong :)
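To see how much the two denominators differ, here is a small numeric sketch using plain floats instead of tensors. The parameter `n` stands in for whatever count the code calls `n_benign`, and the input values are made up for illustration; `term_1` with a `corrected` flag is a hypothetical helper, not the repository's function.

```python
import math


def term_1(min_score, n, n_attackers, d, corrected=True):
    """First term of the bound, with either denominator from the issue.

    corrected=False: (n - n_attackers - 1) * sqrt(d)      (current code)
    corrected=True:  (n - 2 * n_attackers - 1) * sqrt(d)  (proposed fix)
    """
    k = 2 if corrected else 1
    return min_score / ((n - k * n_attackers - 1) * math.sqrt(d))


# Illustrative values only: n = 20, 4 attackers, d = 9, min_score = 33.
current = term_1(33.0, n=20, n_attackers=4, d=9, corrected=False)   # 33 / 45
proposed = term_1(33.0, n=20, n_attackers=4, d=9, corrected=True)   # 33 / 33
```

With these numbers the proposed denominator gives a larger first term, and the gap grows with the number of attackers.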
Would you mind helping me make a MWE for a custom attacker? I loved your code so far btw -- it is great. I think the custom aggregator seems easy -- I just pass in my aggregation object. I do not know how to add my own attack code though.
Even if your attack example is super simple, say replacing all compromised gradients with zero, I would love to see an example in a single python file.
Thank you for considering!
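As a starting point, here is a framework-agnostic single-file sketch of the zero-gradient attack described above. It does not use the Blades API, so wiring it into Blades would still need the adversary hooks of the installed version; `zero_attack` and `mean_aggregate` are hypothetical names, and updates are plain Python lists.

```python
def zero_attack(updates, malicious_ids):
    """Replace the updates of compromised clients with all-zero vectors."""
    return [
        [0.0] * len(update) if client_id in malicious_ids else list(update)
        for client_id, update in enumerate(updates)
    ]


def mean_aggregate(updates):
    """Coordinate-wise mean, standing in for the server's aggregator."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


# Four clients submit the same honest update; clients 0 and 1 are compromised.
honest_updates = [[1.0, 2.0] for _ in range(4)]
attacked_updates = zero_attack(honest_updates, malicious_ids={0, 1})

clean_result = mean_aggregate(honest_updates)       # [1.0, 2.0]
attacked_result = mean_aggregate(attacked_updates)  # [0.5, 1.0]
```

Any other attack fits the same shape: a function that takes all client updates plus the set of compromised client ids and returns the tampered list before aggregation.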
Please add LFR and UNION of Fang's paper
Love from xdu