
Diverse Client Selection for Federated Learning via Submodular Maximization

Code for ICLR 2022 paper:

Title: Diverse Client Selection for Federated Learning via Submodular Maximization [pdf] [presentation]
Authors: Ravikumar Balakrishnan* (Intel Labs), Tian Li* (CMU), Tianyi Zhou* (UW), Nageen Himayat (Intel Labs), Virginia Smith (CMU), Jeff Bilmes (UW)
Institutions: Intel Labs, Carnegie Mellon University, University of Washington

@inproceedings{
balakrishnan2022diverse,
title={Diverse Client Selection for Federated Learning via Submodular Maximization},
author={Ravikumar Balakrishnan and Tian Li and Tianyi Zhou and Nageen Himayat and Virginia Smith and Jeff Bilmes},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=nwKXyFvaUm}
}

Abstract
In every communication round of federated learning, a random subset of clients communicate their model updates back to the server which then aggregates them all. The optimal size of this subset is not known and several studies have shown that typically random selection does not perform very well in terms of convergence, learning efficiency and fairness. We, in this paper, propose to select a small diverse subset of clients, namely those carrying representative gradient information, and we transmit only these updates to the server. Our aim is for updating via only a subset to approximate updating via aggregating all client information. We achieve this by choosing a subset that maximizes a submodular facility location function defined over gradient space. We introduce “federated averaging with diverse client selection (DivFL)”. We provide a thorough analysis of its convergence in the heterogeneous setting and apply it both to synthetic and to real datasets. Empirical results show several benefits to our approach including improved learning efficiency, faster convergence and also more uniform (i.e., fair) performance across clients. We further show a communication-efficient version of DivFL that can still outperform baselines on the above metrics.
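
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the repository's implementation: the function name `greedy_facility_location`, the use of Euclidean distance between gradient vectors, and the conversion of distances into similarities by subtracting them from the largest pairwise distance are all assumptions made for illustration. The sketch greedily adds, one at a time, the client whose gradient most increases a facility location objective, that is, the sum over all clients of their best similarity to any selected client.

```python
import numpy as np


def greedy_facility_location(grads, k):
    """Greedily select up to k rows of `grads` under a facility location objective.

    Hypothetical helper for illustration only; not taken from the DivFL code.
    """
    grads = np.asarray(grads, dtype=float)
    n = grads.shape[0]

    # Pairwise Euclidean distances between client gradients (assumed metric).
    diff = grads[:, None, :] - grads[None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    # Turn distances into nonnegative similarities (assumed conversion).
    sim = dist.max() - dist

    selected = []
    chosen = np.zeros(n, dtype=bool)
    # best_cover[i] = similarity of client i to its closest selected client.
    best_cover = np.zeros(n)

    for _ in range(min(k, n)):
        # Marginal gain of adding each candidate j to the current selection.
        gains = np.maximum(sim, best_cover[:, None]).sum(axis=0) - best_cover.sum()
        gains[chosen] = -np.inf  # never pick the same client twice
        j = int(np.argmax(gains))
        selected.append(j)
        chosen[j] = True
        best_cover = np.maximum(best_cover, sim[:, j])

    return selected
```

For example, given four gradients that form two tight clusters, `greedy_facility_location(grads, 2)` returns one index from each cluster, which is the "diverse, representative" behaviour the abstract describes. For monotone submodular objectives of this kind, greedy selection is the standard approximation strategy.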

Preparation

Dataset generation

The four synthetic datasets used in the paper are already provided in their corresponding folders. For every dataset, see the README file in its data/$dataset folder for instructions on preprocessing and/or sampling the data.

The statistics of real federated datasets are summarized as follows.

Dataset       Devices   Samples   Samples/device, mean (stdev)
MNIST         1,000     69,035    69 (106)
FEMNIST       200       18,345    92 (159)
Shakespeare   143       517,106   3,616 (6,808)
Sent140       772       40,783    53 (32)
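
The columns above are related in a simple way: the mean is the total number of samples divided by the number of devices (for MNIST, 69,035 / 1,000 ≈ 69). The sketch below shows how such a summary could be computed from a list of per-device sample counts. The helper name `summarize_devices` is hypothetical, and the choice of population standard deviation is an assumption, since the README does not say which estimator was used.

```python
import statistics


def summarize_devices(samples_per_device):
    """Summarize per-device sample counts (hypothetical helper, not from the repo)."""
    counts = list(samples_per_device)
    total = sum(counts)
    return {
        "devices": len(counts),
        "samples": total,
        "mean": total / len(counts),
        # Population standard deviation; the paper's estimator is an assumption here.
        "stdev": statistics.pstdev(counts),
    }
```

A large standard deviation relative to the mean, as in every row except Sent140, indicates that the data is spread very unevenly across devices.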

Downloading dependencies

pip3 install -r requirements.txt  

References

See our DivFL paper for more details as well as all references.

Acknowledgements

Our implementation is based on FedProx.
