lishenghui / blades

Blades: A Unified Benchmark Suite for Byzantine Attacks and Defenses in Federated Learning

License: Apache License 2.0

Python 94.82% Jupyter Notebook 5.18%

byzantine-fault-tolerance distributed-systems federated-learning robust-machine-learning fedavg federated federated-learning-simulator robust-optimization model-poisoning-attack robust-federated-learning

blades's Issues

example/mnist_example.py

Hi, I am running mnist_example.py and am confused about the parameter settings for the aggregation and attack strategies. Both the aggregator and the attacker parameters include "num_clients" and "num_byzantine". Should these be set to the same values for the aggregator and the attack, or do they have a different meaning that I have not figured out?
Also, how can I configure the aggregator to use no defense strategy, so that I can compare the results with and without a defense?
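
To make the "with and without defense" comparison concrete, here is a minimal, framework-agnostic sketch. It does not use the Blades API: `aggregate_mean` and `aggregate_median` are hypothetical helper names, and client updates are assumed to be plain lists of floats. Plain averaging stands in for the no-defense baseline, and coordinate-wise median stands in for a simple robust aggregator.

```python
from statistics import mean, median

def aggregate_mean(updates):
    """Coordinate-wise average of client updates (no defense)."""
    return [mean(coord) for coord in zip(*updates)]

def aggregate_median(updates):
    """Coordinate-wise median, used here as a simple robust alternative."""
    return [median(coord) for coord in zip(*updates)]

# Three honest updates and one Byzantine update (illustrative values only).
honest = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]
byzantine = [[100.0, -100.0]]
updates = honest + byzantine

no_defense = aggregate_mean(updates)      # pulled far away by the outlier
with_defense = aggregate_median(updates)  # stays close to the honest values

print("mean:  ", no_defense)
print("median:", with_defense)
```

Running the same attack once with each aggregator gives the baseline-versus-defense comparison asked about above; how to select the averaging aggregator in mnist_example.py itself is not stated here.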

How to install blades

Hello, I have been having problems installing blades with pip, and as a result I cannot use it. Has something gone wrong with the project recently?

Awesome work!!

Hello author. Did your team implement all of these defense algorithms? I want to add a comparison experiment to my code, but my code is not compatible with Blades, so I would like to ask whether standalone source code for the defense algorithms is available. Thank you very much!

How to run fltrust based on fedavg

When I specify trusted_id as 0, it fails on assert len(trusted_clients) == 1, and when I specify fedavg, it raises TypeError: run() got an unexpected keyword argument 'global_model'.

Can you upload the new version or a new simulation example?

It is a pleasure to find your work, but could you upload the new version or a new simulation example?
I can't run mini-example.py or the simulation on Mnist.py.

Can python 3.8 install blades?

My configuration:

python 3.8.3
CUDA 11.3
torch 1.10

Problem running todo_cifar10_gpu.py

Dear Prof., when I run todo_cifar10_gpu.py, it raises the error "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!" and I don't know how to debug it. Could you please give me some advice?

[Feature] Support for configuration file

[New feature]
Support for configuration files for simulations

Configuration file dataset config always uses IID splitter

It is not possible to test non-IID data splits by editing the .yaml configuration files as the IID splitter is always used.

To replicate, run blades/tuned_examples/fedavg_cifar10_resnet_noniid.yaml where

    iid:
      grid_search: [False]
    alpha:
      grid_search: [0.1, 0.5, 1, 100]

The resulting trials use the IID data splitter and have the same results for different alpha values.
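
For reference, this is roughly what a working non-IID splitter would be expected to do with `alpha`. The sketch below is a generic Dirichlet label partition written with the standard library only; `dirichlet_split` is a hypothetical helper, not the Blades splitter, and the label list is synthetic.

```python
import random

def dirichlet_split(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients using per-class Dirichlet(alpha) proportions."""
    rng = random.Random(seed)
    clients = [[] for _ in range(num_clients)]

    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet sample is a vector of Gamma(alpha, 1) draws, normalized.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        if total == 0.0:  # guard against numerical underflow for tiny alpha
            weights, total = [1.0] * num_clients, float(num_clients)
        start, cumulative = 0, 0.0
        for c in range(num_clients):
            cumulative += weights[c] / total
            # The last client takes whatever remains so every index is assigned.
            end = len(indices) if c == num_clients - 1 else round(cumulative * len(indices))
            clients[c].extend(indices[start:end])
            start = end
    return clients

# Synthetic labels: 1000 samples, 10 balanced classes.
labels = [i % 10 for i in range(1000)]
skewed = dirichlet_split(labels, num_clients=5, alpha=0.1)
near_iid = dirichlet_split(labels, num_clients=5, alpha=100.0)

for name, split in (("alpha=0.1", skewed), ("alpha=100", near_iid)):
    print(name, [len(client) for client in split])
```

With a small `alpha` the per-client class mix is typically very uneven, while a large `alpha` approaches an IID split, so trials that give identical results across `alpha` values are consistent with the splitter being ignored.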

Difference between the two versions of signflipping?

When I ran Mean under the previous version of the signflipping attack, the attack was very effective. The accuracy of the global model aggregated by Mean is only 10%. Here is the code for the previous signflipping attack.

class SignflippingClient(ByzantineClient):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
    
    def local_training(self, data_batches):
        for data, target in data_batches:
            data, target = data.to(self.device), target.to(self.device)
            data, target = self.on_train_batch_begin(data=data, target=target)
            self.optimizer.zero_grad()
            
            output = self.model(data)
            loss = torch.clamp(self.loss_func(output, target), 0, 1e5)
            loss.backward()
            for name, p in self.model.named_parameters():
                p.grad.data = -p.grad.data
            self.optimizer.step()

But when I run the signflipping attack in the current version, training with Mean converges. I'm not sure what the problem is.

class SignFlipAdversary(Adversary):
    def on_algorithm_start(self, algorithm: Algorithm):
        class SignFlipCallback(ClientCallback):
            def on_backward_end(self, task):
                model = task.model
                for _, para in model.named_parameters():
                    para.grad.data = -para.grad.data

        for client in self.clients:
            client.to_malicious(callbacks_cls=SignFlipCallback, local_training=True)

My config is

fedavg_blades:
  run: FEDAVG
  stop:
    training_iteration: 200
    # train_loss: 100000

  config:
    random_seed:
        # grid_search: [122, 123]
      grid_search: [111]
      # grid_search: [111, 112, 123, 124, 125]
    dataset_config:
      type: MNIST
      num_clients: 20
      train_batch_size: 128

    evaluation_interval: 5


    num_remote_workers: 0
    num_gpus_per_worker: 0.6
    num_cpus_per_worker: 0
    num_cpus_for_driver: 8
    num_gpus_for_driver: 0.3

    global_model: mlp

    client_config:
        lr: 0.1
        momentum:
          grid_search: [0.9]

    server_config:
      aggregator:
        grid_search: [
          type: Mean,
          ]

      optimizer:
        type: SGD
        lr: 1
        # lr_schedule: [[0, 0.1], [1500, 0.1], [1501, 0.01], [2000, 0.01]]
        momentum:
          grid_search: [0.0]
          # grid_search: [0.0, 0.5, 0.9]

    num_malicious_clients:
      grid_search: [8 ]

    adversary_config:

      grid_search:

        - type: blades.adversaries.SignFlipAdversary

My guess is that the current version of signflipping is implemented incorrectly, and that it should instead invert the sign of the gradient for all malicious clients in on_local_round_end().
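
One factor worth ruling out before concluding that the callback is wrong is simple arithmetic. The sketch below is an idealized model, not a reproduction of either implementation: it assumes every honest client computes the same gradient g and every malicious client submits exactly -scale * g. The `scale` parameter and `mean_update` helper are hypothetical.

```python
def mean_update(honest_grad, num_clients, num_malicious, scale=1.0):
    """Average of client gradients when malicious clients send -scale * g.

    Idealized assumption: all honest clients compute the same gradient g.
    """
    num_honest = num_clients - num_malicious
    return [
        (num_honest * g - num_malicious * scale * g) / num_clients
        for g in honest_grad
    ]

g = [1.0, -2.0]

# 8 of 20 clients flip the sign once: the mean keeps the honest direction,
# shrunk by the factor (12 - 8) / 20 = 0.2.
flipped_only = mean_update(g, num_clients=20, num_malicious=8)

# If the malicious contribution is effectively amplified (here by 3),
# the mean reverses direction: (12 - 24) / 20 = -0.6.
flipped_scaled = mean_update(g, num_clients=20, num_malicious=8, scale=3.0)

print(flipped_only)
print(flipped_scaled)
```

Under these assumptions, a single sign flip by a minority of clients only slows Mean down. Whether the older client's repeated gradient ascent during local training produced effectively larger malicious updates is a question for the two implementations themselves.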

runtime error

Can you list all the required Python modules with their proper versions?
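
When reporting a runtime error like this, it helps to attach the versions actually installed. The following is a small standard-library sketch for collecting them. The package names passed in are guesses at the relevant dependencies, not an official requirements list.

```python
import platform
from importlib import metadata

def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def environment_report(packages):
    """Build a short text report of the Python version and package versions."""
    lines = [f"python {platform.python_version()}"]
    for name in packages:
        lines.append(f"{name} {installed_version(name) or 'not installed'}")
    return "\n".join(lines)

# Assumed package names; adjust to whatever the project actually depends on.
report = environment_report(["blades", "torch", "ray"])
print(report)
```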

example question

When I run the example, it shows: Error: No available node types can fulfill resource request {'CPU': 1.0, 'GPU': 0.2}. Add suitable node types to this cluster to resolve this issue.
Is this a Ray problem?

Outdated Examples

I'm not sure whether you're aware, or whether this only happens on the several operating systems I have run the simulation suite on, but the example files, and even the main code, simply do not work.

Many references may be Python-version specific, and they prevent the program from running at all.
For example, to import the simulator you consistently refer to blades.core.simulator, but the import only works as blades.simulator, without .core.
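
Until the examples are updated, a script can tolerate both layouts by trying each module path in turn. This is a generic standard-library sketch; `import_first` is a hypothetical helper, and the two Blades paths are simply the ones named in this report.

```python
import importlib

def import_first(candidates):
    """Import and return the first module path in `candidates` that exists."""
    errors = []
    for path in candidates:
        try:
            return importlib.import_module(path)
        except ImportError as exc:  # also covers ModuleNotFoundError
            errors.append(f"{path}: {exc}")
    raise ImportError("none of the candidates could be imported:\n" + "\n".join(errors))

# Intended use, with the two paths mentioned above:
# simulator = import_first(["blades.core.simulator", "blades.simulator"])

# Demonstration with a standard-library module so the snippet runs anywhere.
module = import_first(["no_such_module_for_demo", "json"])
print(module.__name__)
```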

The import of options from args does not work, and args appears to be a Python 2 file.

I have created brand-new environments containing only the files needed for this project, and the problems remain. I have also tried this on various Python 3 platforms, on Windows and Linux, and each time I am unable to get things to work in the ways you describe.

I was eager to use this suite because it fits the needs of my research paper, but being unable to operate it made this a frustrating exercise.

About the experimental results

First and foremost, I would like to express my support and appreciation for your work on poisoning attacks in federated learning.
Additionally, I'm curious whether you can reproduce results similar to those in the original paper.
Best regards

Cosine Distance or Cosine Similarity?

https://github.com/bladesteam/blades/blob/master/src/blades/aggregators/clustering.py
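
For anyone checking the linked file, the two quantities differ by a sign and an offset, which matters for clustering because algorithms based on distance expect smaller values to mean "closer". The sketch below shows the standard definitions only; it makes no claim about which one clustering.py uses.

```python
import math

def cosine_similarity(u, v):
    """cos(theta) between two non-zero vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)  # undefined (division by zero) for zero vectors

def cosine_distance(u, v):
    """1 - cosine similarity, in [0, 2]; 0 means same direction."""
    return 1.0 - cosine_similarity(u, v)

same = cosine_distance([1.0, 2.0], [2.0, 4.0])        # parallel vectors
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])  # perpendicular vectors
opposite = cosine_distance([1.0, 2.0], [-1.0, -2.0])  # opposite directions

print(same, orthogonal, opposite)
```

Passing a similarity matrix to a routine that expects distances would group the least similar updates together, so the convention is worth verifying.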

How to change the number of clients? The "num_clients" parameter does not work

Hello, your code has helped me a lot, but I ran into a problem: how do I change the total number of clients? I set the "num_clients" parameter when generating the dataset and also specified "num_clients" in "aggregator_kws" (I'm using krum), but neither takes effect; there are always only 10 clients. How can I solve this problem? Thanks.

Adaptive Attacks: FangAttack

In the attack_multikrum function (Fang's adaptive attack), lines 124-126:
term_1 = min_score / ((n_benign - n_attackers - 1) * torch.sqrt(torch.Tensor([d]))[0])
should be
term_1 = min_score / ((n_benign - 2 * n_attackers - 1) * torch.sqrt(torch.Tensor([d]))[0])
based on Equation 3 of the original paper.

Please correct me if I am wrong :)
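
Whether the change is needed depends on what `n_benign` counts. The sketch below compares the candidate denominators under two labeled assumptions that should be checked against the paper and the code: that the paper's bound uses m - 2c - 1 with m the total number of workers and c the number of attackers, and that `n_benign` in the code is the number of honest clients only. The `denominators` helper is hypothetical.

```python
def denominators(n_total, n_attackers):
    """Return the three candidate denominators (without the sqrt(d) factor)."""
    # Assumption: n_benign counts honest clients only.
    n_benign = n_total - n_attackers
    as_in_code = n_benign - n_attackers - 1       # expression currently in the code
    as_proposed = n_benign - 2 * n_attackers - 1  # expression proposed above
    # Assumption: the paper's m is the total number of workers.
    from_total = n_total - 2 * n_attackers - 1
    return as_in_code, as_proposed, from_total

as_in_code, as_proposed, from_total = denominators(n_total=20, n_attackers=4)
print(as_in_code, as_proposed, from_total)
```

Under these assumptions the existing expression already equals m - 2c - 1, and the proposed one subtracts the attackers a third time. If `n_benign` actually holds the total number of clients, the proposed change is the one that matches.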

Example for Custom Attacker?

Would you mind helping me make a MWE for a custom attacker? I loved your code so far btw -- it is great. I think the custom aggregator seems easy -- I just pass in my aggregation object. I do not know how to add my own attack code though.

Even if your attack example is super simple, say replacing all compromised gradients with zero, I would love to see an example in a single python file.

Thank you for considering!
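
As a starting point, here is the attack logic itself in a single file, independent of any framework. It is only a sketch of the zero-gradient idea described above: `zero_gradient_attack` and `mean_aggregate` are hypothetical helpers, updates are assumed to be plain lists of floats, and wiring this into Blades would still require the library's own attacker interface, which is not shown here.

```python
def zero_gradient_attack(updates, byzantine_ids):
    """Return a copy of `updates` where compromised clients submit all-zero vectors."""
    attacked = []
    for client_id, update in enumerate(updates):
        if client_id in byzantine_ids:
            attacked.append([0.0] * len(update))  # replace the gradient with zeros
        else:
            attacked.append(list(update))         # honest update, unchanged
    return attacked

def mean_aggregate(updates):
    """Plain coordinate-wise average, standing in for the server's aggregator."""
    return [sum(coord) / len(updates) for coord in zip(*updates)]

# Four clients with identical honest gradients; clients 0 and 1 are compromised.
updates = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
attacked = zero_gradient_attack(updates, byzantine_ids={0, 1})
aggregated = mean_aggregate(attacked)

print(attacked)
print(aggregated)
```

Keeping the attack as a pure function over the list of updates makes it easy to unit-test before adapting it to a client or adversary class.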

Please add LFR and UNION of Fang's paper

Thanks for this AWESOME platform

Love from xdu
