federated-learning-sparsification's Introduction

Federated Learning: Sparsification

The communication efficiency of federated learning is improved by sparsifying the parameters uploaded by the clients, hence reducing the size of the upload.

Three approaches to sparsification are compared:

  • Random - Selecting k% of the parameters uniformly at random
  • Top-k - Selecting the k% of parameters with the largest absolute differences before and after local model training
  • Threshold - Selecting the parameters whose absolute difference before and after local model training exceeds a given threshold
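The three selection rules can be illustrated on a flat list of parameters. This is a minimal standard-library sketch, not the repository's implementation: the function names are hypothetical, the model is treated as a single flat vector rather than per-layer tensors, and uploading (index, value) pairs is an assumption about the upload format.

```python
import random


def delta(before, after):
    """Element-wise change in each parameter caused by local training."""
    return [a - b for b, a in zip(before, after)]


def random_mask(diff, fraction, rng):
    """Keep a uniformly random `fraction` of the positions."""
    k = max(1, round(fraction * len(diff)))
    keep = set(rng.sample(range(len(diff)), k))
    return [i in keep for i in range(len(diff))]


def topk_mask(diff, fraction):
    """Keep the `fraction` of positions with the largest absolute change."""
    k = max(1, round(fraction * len(diff)))
    order = sorted(range(len(diff)), key=lambda i: abs(diff[i]), reverse=True)
    keep = set(order[:k])
    return [i in keep for i in range(len(diff))]


def threshold_mask(diff, threshold):
    """Keep every position whose absolute change exceeds `threshold`."""
    return [abs(d) > threshold for d in diff]


def sparsify(after, mask):
    """Return the (index, value) pairs a client would upload (assumed format)."""
    return [(i, v) for i, (v, m) in enumerate(zip(after, mask)) if m]


# Toy parameters before and after one round of local training.
before = [0.10, -0.20, 0.30, 0.40, -0.50, 0.60, 0.70, -0.80, 0.90, 1.00]
after = [0.10, -0.25, 0.30, 0.48, -0.50, 0.601, 0.70, -0.80, 0.70, 1.02]
diff = delta(before, after)

topk_upload = sparsify(after, topk_mask(diff, 0.3))
threshold_upload = sparsify(after, threshold_mask(diff, 0.01))
random_upload = sparsify(after, random_mask(diff, 0.3, random.Random(0)))
```

With these toy values, Top-k at 30% keeps indices 1, 3 and 8, while a threshold of 0.01 also keeps index 9. The number of parameters selected by the threshold approach therefore depends on how much the weights moved, whereas Random and Top-k always select a fixed share.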

Two datasets (data.py) are used for the experiments:

  • CIFAR-10 - Imbalanced, non-iid partition of data between the clients
  • FEMNIST - Data distributed between the clients based on the author

The model is a basic convolutional network, CNN500k (models.py), with one variant per dataset:

  • Model for CIFAR-10 - 471,338 parameters (1.89 MB)
  • Model for FEMNIST - 369,432 parameters (1.48 MB)
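The sizes quoted above are consistent with storing each parameter as a 4-byte (32-bit) float and counting 1 MB as 1,000,000 bytes. Both of those are assumptions inferred from the numbers, not stated in the source. A quick check, which also estimates the values-only upload when 10% of the CIFAR-10 model is kept (index overhead is ignored here):

```python
BYTES_PER_PARAM = 4  # assumption: parameters are 32-bit floats


def size_mb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Size in megabytes, taking 1 MB = 1,000,000 bytes."""
    return num_params * bytes_per_param / 1_000_000


cifar_mb = size_mb(471_338)
femnist_mb = size_mb(369_432)

# Values-only size of a 10% sparsified CIFAR-10 upload (indices not counted).
kept_params = round(0.1 * 471_338)
sparse_cifar_mb = size_mb(kept_params)

print(f"CIFAR-10 model: {cifar_mb:.2f} MB")
print(f"FEMNIST model:  {femnist_mb:.2f} MB")
print(f"CIFAR-10 at 10%: {kept_params} parameters, {sparse_cifar_mb:.2f} MB")
```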

All experiments were implemented in PyTorch using Flower's virtual client engine (https://github.com/adap/flower).

Experiments

72 experiments were run, one for each combination of:

  • Dataset = [femnist, cifar]
  • Approach = [random, topk, threshold]
  • Sparsify by:
    • Random & Top-k = [0.5, 0.3, 0.1, 0.05, 0.03, 0.01]
    • Threshold = [0.001, 0.003, 0.005, 0.007, 0.0085, 0.01]
  • Keep first and last layer = [TRUE, FALSE]
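The count of 72 follows from 2 datasets × 3 approaches × 6 sparsification values × 2 settings for keeping the first and last layer. The grid can be enumerated as below. The variable names are illustrative and not taken from the repository.

```python
from itertools import product

datasets = ["femnist", "cifar"]
keep_first_last = [True, False]

# Random and Top-k take a fraction; Threshold takes an absolute threshold.
sparsify_by = {
    "random": [0.5, 0.3, 0.1, 0.05, 0.03, 0.01],
    "topk": [0.5, 0.3, 0.1, 0.05, 0.03, 0.01],
    "threshold": [0.001, 0.003, 0.005, 0.007, 0.0085, 0.01],
}

experiments = [
    (dataset, approach, value, keep)
    for dataset, keep in product(datasets, keep_first_last)
    for approach, values in sparsify_by.items()
    for value in values
]

print(len(experiments))  # 72
```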

Results from the experiments: data and figures.

Experimental set-up:

  • Number of clients = 50
  • Number of epochs = 1
  • Learning rate = 0.1
  • Optimiser = SGD
  • Regularisation = 0
  • Fraction of clients sampled each round:
    • FEMNIST = 0.25
    • CIFAR = 0.3
  • Number of federated learning rounds:
    • FEMNIST = 30
    • CIFAR = 180
  • Metric = Accuracy as measured on the test dataset

Baseline:

Federated Averaging (https://arxiv.org/abs/1602.05629) using the same set-up as above.
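For reference, the aggregation step of Federated Averaging is a weighted mean of the client models, with each client weighted by its number of local training examples. The sketch below uses plain Python lists and a hypothetical function name. It shows only that step and is not the Flower strategy used in the experiments.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local example counts."""
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(num_params)
    ]


# Three toy clients with two parameters each.
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]
global_weights = federated_average(weights, sizes)
print(global_weights)  # [4.0, 5.0]
```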

Simulation

Example command:
python simulation.py --dataset_name="cifar" --approach="topk" --sparsify_by=0.1 --num_rounds=180

Parameters:

  • --dataset_name - Can be femnist or cifar.
  • --femnist_location - Path to the location of the FEMNIST data. Must be pre-downloaded.
  • --approach - Can be random, topk or threshold.
  • --sparsify_by - Float between 0 and 1 indicating the fraction of parameters to select. For the threshold approach this is the threshold value instead.
  • --num_rounds - Number of federated learning rounds.
  • --keep_first_last - Boolean, TRUE or FALSE. Indicates whether to force the selection of all parameters in the very first and last layers of the network.
  • --epochs - Number of epochs each client trains the local model for.
  • --learning_rate - Learning rate for the model training.
  • --regularisation - Regularisation/weight decay parameter for the optimiser.
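A parser accepting these flags could look like the following argparse sketch. It is an illustration, not the code in simulation.py. The helper names are hypothetical, and the choice of required flags and the defaults (taken from the experimental set-up: 1 epoch, learning rate 0.1, regularisation 0) are assumptions.

```python
import argparse


def str_to_bool(text):
    """Parse the TRUE/FALSE strings accepted by --keep_first_last."""
    value = text.strip().upper()
    if value not in ("TRUE", "FALSE"):
        raise argparse.ArgumentTypeError(f"expected TRUE or FALSE, got {text!r}")
    return value == "TRUE"


def build_parser():
    """Build a parser for the documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Sparsified federated learning simulation")
    parser.add_argument("--dataset_name", choices=["femnist", "cifar"], required=True)
    parser.add_argument("--femnist_location", default=None)
    parser.add_argument("--approach", choices=["random", "topk", "threshold"], required=True)
    parser.add_argument("--sparsify_by", type=float, required=True)
    parser.add_argument("--num_rounds", type=int, required=True)
    parser.add_argument("--keep_first_last", type=str_to_bool, default=False)
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--learning_rate", type=float, default=0.1)
    parser.add_argument("--regularisation", type=float, default=0.0)
    return parser


# Equivalent to the example command; the shell strips the quotes around values.
args = build_parser().parse_args(
    ["--dataset_name=cifar", "--approach=topk", "--sparsify_by=0.1", "--num_rounds=180"]
)
print(args.dataset_name, args.approach, args.sparsify_by, args.num_rounds)
```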

federated-learning-sparsification's Issues

help

When I try to run python simulation.py --dataset_name="cifar" --approach="topk" --sparsify_by=0.1 --num_rounds=180, it raises an exception: 'bytes' object has no attribute 'pop'
