Antipodes of Label Differential Privacy: PATE and ALIBI

This repository is the official implementation of Antipodes of Label Differential Privacy: PATE and ALIBI.

Citation

@misc{malek2021antipodes,
      title={Antipodes of Label Differential Privacy: {PATE} and {ALIBI}},
      author={Mani Malek and Ilya Mironov and Karthik Prasad and Igor Shilov and Florian Tram{\`e}r},
      year={2021},
      eprint={2106.03408},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Requirements

To install requirements:

pip install -r requirements.txt

The above command assumes a Linux environment with CUDA support. For other environments, see https://pytorch.org/ for the matching PyTorch installation instructions.

If you're training PATE with CIFAR100, you'll also need to install apex manually:

git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install

Training

PATE

The PATE model is trained in three stages. Below are example commands with the hyperparameters we used.

Stage 1: Train teacher ensemble

Optionally, add -N 1000 for canary runs.

CIFAR10
python train_teacher.py --dataset cifar10 --n_teachers 800 --teacher-id 0 --epochs 40
CIFAR100
python train_teacher.py --dataset cifar100 --n_teachers 100 --teacher-id 0 --epochs 125 --weight_decay 0.001 --width 8 --amp true --opt_level O2
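The commands above train a single teacher (`--teacher-id 0`), so the full ensemble needs one run per teacher. A minimal sketch for generating those command lines follows. It assumes teacher ids run from 0 to `n_teachers - 1`, which the commands above do not state, and `teacher_commands` is a hypothetical helper, not part of this repository.

```python
import shlex

def teacher_commands(dataset, n_teachers, extra_args=()):
    """Build one train_teacher.py command line per teacher.

    Assumption: teacher ids are 0 .. n_teachers - 1.
    """
    commands = []
    for teacher_id in range(n_teachers):
        argv = [
            "python", "train_teacher.py",
            "--dataset", dataset,
            "--n_teachers", str(n_teachers),
            "--teacher-id", str(teacher_id),
            *extra_args,
        ]
        # shlex.join quotes arguments so each line is safe to paste into a shell.
        commands.append(shlex.join(argv))
    return commands

cifar10_cmds = teacher_commands("cifar10", 800, ["--epochs", "40"])
print(cifar10_cmds[0])
print(len(cifar10_cmds), "commands in total")
```

The resulting lines can be written to a file and handed to a job scheduler, or run one after another.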

Stage 2: Aggregate votes

Once all teachers are trained, aggregate their votes into a single file:

python aggregate_votes.py --n_teachers 800
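Conceptually, vote aggregation turns the per-teacher predictions into one histogram of class counts per sample. The sketch below illustrates the idea only. The input layout and the `aggregate_votes` function are assumptions for illustration and do not describe the file format used by `aggregate_votes.py`.

```python
from collections import Counter

def aggregate_votes(teacher_predictions, n_classes):
    """Count, for every sample, how many teachers voted for each class.

    teacher_predictions[t][i] is the class predicted by teacher t for sample i
    (a hypothetical layout chosen for this example).
    """
    n_samples = len(teacher_predictions[0])
    histograms = []
    for i in range(n_samples):
        counts = Counter(preds[i] for preds in teacher_predictions)
        histograms.append([counts.get(c, 0) for c in range(n_classes)])
    return histograms

# Three teachers, three samples, three classes.
votes = aggregate_votes([[0, 1, 2], [0, 1, 2], [0, 2, 1]], n_classes=3)
print(votes)  # [[3, 0, 0], [0, 2, 1], [0, 1, 2]]
```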

Stage 3: Train student

CIFAR10
python train_student.py --dataset cifar10 --n_samples 250 --n_teachers 800 --selection_noise 800 --result_noise 500 --noise.threshold 400 --epochs 200
CIFAR100
python train_student.py --dataset cifar100 --n_samples 1000 --n_teachers 100 --epochs 125 --weight_decay 0.001 --width 8 --amp true --opt_level O2
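The selection-noise, result-noise, and threshold arguments in the CIFAR10 command suggest a Confident-GNMax style aggregator, as used in scalable PATE: a sample is labeled only if its noisy top vote count clears a threshold, and the label is then the argmax of the noisy histogram. The sketch below assumes that reading, and assumes both noise values are Gaussian standard deviations. `noisy_label` is a hypothetical function, not the code in `train_student.py`.

```python
import random

def noisy_label(votes, selection_noise, result_noise, threshold, rng):
    """Return a noisy label for one vote histogram, or None if unanswered."""
    # Selection step: answer only when the noisy maximum count reaches the threshold.
    if max(votes) + rng.gauss(0.0, selection_noise) < threshold:
        return None
    # Result step: add fresh Gaussian noise to every count and take the argmax.
    noisy = [v + rng.gauss(0.0, result_noise) for v in votes]
    return max(range(len(noisy)), key=noisy.__getitem__)

rng = random.Random(0)
# With zero noise the behavior is deterministic and easy to check.
print(noisy_label([700, 60, 40], 0.0, 0.0, 400, rng))    # 0
print(noisy_label([300, 300, 200], 0.0, 0.0, 400, rng))  # None
# With the noise levels from the CIFAR10 command, the answer is randomized.
print(noisy_label([700, 60, 40], 800, 500, 400, rng))
```

Samples for which the function returns `None` receive no teacher label at all, which is how the selection step limits the privacy cost to confident queries.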

ALIBI

train_cifar_alibi.py implements the training and evaluation of ResNet on CIFAR. Run the following to explore the arguments (dataset, model, architecture, noising mechanism, post-processing mode, hyperparameters, training knobs, etc.) that can be set during the runs.

python train_cifar_alibi.py --help

Most notably,

  • Use --dataset "CIFAR10" or --dataset "CIFAR100" to train on CIFAR-10 or CIFAR-100, respectively.
  • Use --arch "resnet" or --arch "wide-resnet" to train ResNet-18 or Wide-Resnet18-100, respectively.
  • Use --canary 1000 to train with 1000 mislabeled "canaries".
  • In our paper, we used --seed 11337.
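To make the notion of mislabeled canaries concrete, here is a minimal sketch that picks a random subset of training examples and replaces each chosen label with a different class. The sampling scheme, the seed handling, and the `add_canaries` function are assumptions for illustration. The script's own canary logic may differ.

```python
import random

def add_canaries(labels, n_canaries, n_classes, seed=0):
    """Return a copy of labels with n_canaries entries deliberately mislabeled."""
    rng = random.Random(seed)
    noisy = list(labels)
    canary_idx = rng.sample(range(len(labels)), n_canaries)
    for i in canary_idx:
        # Draw uniformly from the n_classes - 1 labels other than the true one.
        wrong = rng.randrange(n_classes - 1)
        if wrong >= noisy[i]:
            wrong += 1
        noisy[i] = wrong
    return noisy, sorted(canary_idx)

labels = [i % 10 for i in range(5000)]  # toy stand-in for CIFAR-10 labels
noisy_labels, canaries = add_canaries(labels, n_canaries=1000, n_classes=10)
print(len(canaries), "canaries inserted")
```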

NOTE: Tables 3 and 4 of our paper summarize the hyperparameters for PATE-FM and ALIBI, respectively.
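The noising mechanism and post-processing mentioned above can be illustrated as follows: Laplace noise is added to the one-hot label, and Bayes' rule turns the noisy vector back into a distribution over classes. This is a minimal sketch that assumes a Laplace scale of 2/ε (the L1 sensitivity of a one-hot label is 2) and a fixed prior. The iterative prior update is omitted, and all function names are hypothetical rather than taken from `train_cifar_alibi.py`.

```python
import math
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def randomize_label(label, n_classes, epsilon, rng):
    """Add Laplace noise to the one-hot encoding of label."""
    scale = 2.0 / epsilon  # assumed calibration: L1 sensitivity 2
    return [(1.0 if c == label else 0.0) + laplace_noise(scale, rng)
            for c in range(n_classes)]

def posterior(noisy, epsilon, prior):
    """Posterior over the true class given the noisy one-hot vector."""
    scale = 2.0 / epsilon
    # log p(noisy | y = c) differs across classes only by (|o_c| - |o_c - 1|) / scale.
    logits = [math.log(p) + (abs(o) - abs(o - 1.0)) / scale
              for o, p in zip(noisy, prior)]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)
noisy = randomize_label(label=3, n_classes=10, epsilon=8.0, rng=rng)
soft_label = posterior(noisy, epsilon=8.0, prior=[0.1] * 10)
print([round(p, 3) for p in soft_label])
```

The resulting soft label can replace the one-hot target in a cross-entropy loss. Smaller values of ε give a larger noise scale and therefore a flatter posterior.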

Memorization attacks on trained models

To reproduce the memorization attack on our trained models, run

cd memorization_attack
python attack.py

Results

Results are summarized in Tables 1 and 2 of our paper.

License

The majority of facebookresearch/label_dp_antipodes is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: kekmodel/FixMatch-pytorch, kuangliu/pytorch-cifar, and facebookresearch/label_dp_antipodes/memorization_attack are licensed under the MIT license, and tensorflow/privacy is licensed under the Apache-2.0 license.

Acknowledgements

label_dp_antipodes's Issues

Can not reproduce the result in Table 2.

Hello.
I am trying to reproduce the results in Table 2. Specifically, to reproduce the result for $\epsilon=8$, I run

python train_cifar_alibi.py --dataset cifar100 --arch resnet --lr 0.0037 --weight-decay 0.00335 --sigma 0.35

but only obtain 51% test accuracy.
Is there something wrong with my configuration?

Inappropriate picture dimension

How can I solve the following error?
RuntimeError: The size of tensor a (32) must match the size of tensor b (3) at non-singleton dimension 2

How to choose sigma?

Hello. I can't find any guidance on how to choose sigma in either your paper (Table 4 doesn't mention it) or the code. Since sigma is important for both utility and privacy, I'm wondering:

  1. What are the sigma settings for the different privacy levels in Table 2?
  2. What are the privacy levels of the settings in Table 4?
