Robust Federated Learning by Mixture of Experts (MoE-FL)

This repository contains the implementation of "Robust Federated Learning by Mixture of Experts". This study presents a novel weighted-average model based on the mixture of experts (MoE) concept to provide robustness in federated learning (FL) against poisoned, corrupted, or outdated local models.
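In MoE-FL, the server combines the local models with per-worker weights rather than a plain average, so that poisoned or outdated models can be down-weighted. Below is a minimal sketch of such a weighted aggregation of PyTorch state dicts; the function name weighted_average and the way the weights are supplied are illustrative assumptions, not this repository's actual API.

# Minimal sketch of weighted model aggregation (illustrative, not the repo's API).
# Assumes all workers share the same model architecture.
from typing import Dict, List
import torch

def weighted_average(state_dicts: List[Dict[str, torch.Tensor]],
                     weights: List[float]) -> Dict[str, torch.Tensor]:
    # In MoE-FL the weights come from an optimization on the server side;
    # here they are simply given as input.
    total = sum(weights)
    norm = [w / total for w in weights]
    return {key: sum(w * sd[key].float() for w, sd in zip(norm, state_dicts))
            for key in state_dicts[0]}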

Getting Started

We have tested and run the code on Ubuntu 20.04. You can follow the instructions below on Ubuntu 20.04, or install the appropriate libraries for your distribution.

Requirements

Before running the experiments, make sure you have installed the following packages system-wide:

sudo apt install \
    python3-dev build-essential libsrtp2-dev libavformat-dev \
    libavdevice-dev python3-wheel python3-venv

After installing the required tools, you may want to create a Python virtual environment for easier management of Python packages:

# Create Python virtual environment
python3 -m venv moe-fl-venv

# Activate virtual environment
source moe-fl-venv/bin/activate

# Do not forget to upgrade pip and install wheel/setuptools
pip install --upgrade pip 
pip install wheel setuptools

Install the Python packages using:

pip install -r requirements.txt

Configurations

You can change the configuration by modifying configs/defaults.yml. Some of the available parameters and their default values are:

runtime:
    epochs: 1
    rounds: 400
    lr: 0.01
    momentum: 0.5
    batch_size: 20
    random_seed: 12345
    weight_decay: 0.0
    test_batch_size: 20
    use_cuda: False # All experiments were run on CPU only; CUDA is untested

attack:
    attackers_num: 50
    attack_type: 1

server:
    data_fraction: 0.15 # Fraction of data that will be shared with the server

mnist:
    load_fraction: 1
    shards_num: 200 # With 200 shards, each shard holds 300 samples
    shards_per_worker_num: 2
    selected_users_num: 30 
    total_users_num: 100 # Total number of users to partition the data among
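For reference, here is a minimal sketch of how such a YAML file can be loaded and selectively overridden in Python; the repository's actual loader may differ, and load_config is an illustrative name.

# Illustrative sketch of loading configs/defaults.yml and applying overrides;
# requires the PyYAML package.
import yaml

def load_config(path="configs/defaults.yml", overrides=None):
    with open(path) as f:
        config = yaml.safe_load(f)
    # Merge per-section overrides, e.g. {"runtime": {"rounds": 100}}.
    for section, values in (overrides or {}).items():
        config.setdefault(section, {}).update(values)
    return config

config = load_config(overrides={"runtime": {"rounds": 100, "epochs": 2}})
print(config["runtime"]["rounds"])  # 100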

Experiments

You have two options for running the experiments: with IID or non-IID datasets. To do so, run run-study-iid-mnist.py or run-study-niid-mnist.py, respectively. You can overwrite some values from configs/defaults.yml, such as epochs and rounds, using command-line options. Options in brackets "[ ]" are optional and options in parentheses "( )" are required.

run-study-iid-mnist.py 
        (--avg | --opt) \ # average mode or optimized (moe-fl) mode
        [--epochs=NUM] \
        [--rounds=NUM] \
        [--attack-type=ID] \
        [--attackers-num=NUM] \
        [--selected-workers=NUM] \
        [--log] \ # Enable logging
        [--nep-log] \ # Enable neptune logging
        [--output-prefix=NAME] 
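For example, an illustrative IID run in the optimized (MoE-FL) mode; the parameter values here are arbitrary:

python run-study-iid-mnist.py --opt --rounds=400 --attackers-num=50 --log --output-prefix=iid-opt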

โš ๏ธ Due to a bug in PySyft version < 0.2.9, you cannot run a long experiment because of memory leakage. As of submitting this study, there was no solution to this problem. Hence, we have to manually break the experiments and save the states of each run and continue again. Therefore, there is an argument --start-round in the Non-IID experiment.

run-study-niid-mnist.py 
        (--avg | --opt) \ # average mode or optimized (moe-fl) mode
        --start-round=NUM \ # start from previously run experiment (see warning above)
        --rounds=NUM \
        [--not-pure] \ # force the experiment to use the non-pure dataset
        [--attackers-num=NUM] \
        [--log] \ # Enable logging
        [--nep-log] \ # Enable neptune logging
        [--output-prefix=NAME] 
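For example, to continue a previous optimized run from round 100 for 50 more rounds (illustrative values):

python run-study-niid-mnist.py --opt --start-round=100 --rounds=50 --log --output-prefix=niid-opt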

Results and Output

Each execution of the code will generate the following files:

  • accuracy: Accuracy for each round (see the plotting sketch after this list)
  • all_users: List of all worker names
  • attackers: List of attackers among all workers
  • configs: The configuration file in effect when the experiment started
  • mapped_dataset: Binary file used to work around the PySyft memory-leak problem; we have to manage the dataset given to each worker and maintain the same allocation across all subsequent runs
  • models: Folder containing the saved models of the workers and the server
  • seeds: Seeds used for the experiment
  • selected_workers: List of selected workers among all workers
  • server_pub_dataset: Binary file storing the server's dataset
  • train_loss: Loss for each round
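Assuming the accuracy file holds one numeric value per round (the exact on-disk format is an assumption), the learning curve can be plotted with a few lines of Python:

# Plot per-round accuracy; assumes one value per line in the generated
# accuracy file (the actual format may differ).
import matplotlib.pyplot as plt

with open("accuracy") as f:
    acc = [float(line) for line in f if line.strip()]

plt.plot(range(1, len(acc) + 1), acc)
plt.xlabel("Round")
plt.ylabel("Accuracy")
plt.savefig("accuracy.png")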

Notes

  • In order to run the experiments successfully, you need to configure the cvxpy package and provide an appropriate solver. We use MOSEK, which requires a license to work correctly (see the sketch after this list).
  • We use supervisor to run the experiments. Each experiment runs for a specified number of rounds, then exits and gets restarted by supervisor; this is necessary because PySyft does not handle memory properly and would otherwise leak memory. You can find a few examples of this approach in the supervisor folder.
  • In order to use the Neptune logging system, you have to set the required environment variables before starting the experiments. Please refer to the Neptune website for more information.
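As an illustration of the solver configuration mentioned above, the following toy cvxpy problem checks that MOSEK is installed and licensed; it is not the optimization problem used in MoE-FL.

# Toy cvxpy problem to verify that the MOSEK solver works; this is not
# the actual MoE-FL optimization.
import cvxpy as cp
import numpy as np

w = cp.Variable(3, nonneg=True)  # e.g., aggregation weights
target = np.array([0.2, 0.5, 0.3])
problem = cp.Problem(cp.Minimize(cp.sum_squares(w - target)), [cp.sum(w) == 1])
problem.solve(solver=cp.MOSEK)   # fails here if MOSEK or its license is missing
print(w.value)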

Citation

@article{parsaeefard2021robust,
  title={Robust Federated Learning by Mixture of Experts},
  author={Parsaeefard, Saeedeh and Etesami, Sayed Ehsan and Garcia, Alberto Leon},
  journal={arXiv preprint arXiv:2104.11700},
  year={2021}
}
