
PrivacyRaven's Introduction

PrivacyRaven Logo


Note: This project is on hiatus.

PrivacyRaven is a privacy testing library for deep learning systems. You can use it to determine the susceptibility of a model to different privacy attacks; evaluate privacy preserving machine learning techniques; develop novel privacy metrics and attacks; and repurpose attacks for data provenance and other use cases.

PrivacyRaven supports label-only black-box model extraction, membership inference, and (soon) model inversion attacks. We also plan to include differential privacy verification, automated hyperparameter optimization, more classes of attacks, and other features; see the GitHub issues for more information. PrivacyRaven has been featured at the OpenMined Privacy Conference, Empire Hacking, and the Trail of Bits blog.

Why use PrivacyRaven?

Deep learning systems, particularly neural networks, have proliferated in a wide range of applications, including privacy-sensitive use cases such as facial recognition and medical diagnoses. However, these models are vulnerable to privacy attacks that target both the intellectual property of the model and the confidentiality of the training data. Recent literature has seen an arms race between privacy attacks and defenses on various systems, and until now, engineers and researchers have not had the privacy analysis tools they need to keep pace. Hence, we developed PrivacyRaven, a machine learning assurance tool that aims to be:

  • Usable: Multiple levels of abstraction allow users to either automate much of the internal mechanics or directly control them, depending on their use case and familiarity with the domain.
  • Flexible: A modular design makes the attack configurations customizable and interoperable. It also allows new privacy metrics and attacks to be incorporated straightforwardly.
  • Efficient: PrivacyRaven reduces boilerplate, affording quick prototyping and fast experimentation. Each attack can be launched in fewer than 15 lines of code.

How does it work?

PrivacyRaven partitions each attack into multiple customizable and optimizable phases. Different interfaces are also provided for each attack. The interface shown below is known as the core interface. PrivacyRaven also provides wrappers around specific attack configurations found in the literature and a run-all-attacks feature.

Here is how you would launch a model extraction attack in PrivacyRaven:

# examples/extract_mnist_gpu.py
import privacyraven as pr
from privacyraven.utils.data import get_emnist_data
from privacyraven.extraction.core import ModelExtractionAttack
from privacyraven.utils.query import get_target
from privacyraven.models.victim import train_four_layer_mnist_victim
from privacyraven.models.four_layer import FourLayerClassifier

# Create a query function for a target PyTorch Lightning model
model = train_four_layer_mnist_victim()


def query_mnist(input_data):
    # PrivacyRaven provides built-in query functions
    return get_target(model, input_data, (1, 28, 28, 1))


# Obtain seed (or public) data to be used in extraction
emnist_train, emnist_test = get_emnist_data()

# Run a model extraction attack
attack = ModelExtractionAttack(
    query_mnist, # query function
    200,  # query limit
    (1, 28, 28, 1), # victim input shape
    10, # number of targets
    (3, 1, 28, 28),  # substitute input shape
    "copycat", # synthesizer name
    FourLayerClassifier, # substitute model architecture
    784,  # substitute input size
    emnist_train, # seed train data
    emnist_test, # seed test data
)

Since the main requirement placed on the victim model is a query function, PrivacyRaven can attack a wide range of models regardless of framework and distribution method; the sketch below wraps a scikit-learn classifier, for instance. The other classes of attacks can be launched in a similar fashion. See the examples folder for more information.
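
A minimal sketch of this flexibility, assuming a scikit-learn victim; the model and training data here are illustrative stand-ins, not part of PrivacyRaven:

# Any model becomes attackable once wrapped in a query function.
# The scikit-learn victim and data below are purely illustrative.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_train = rng.random((100, 784))    # stand-in training data
y_train = rng.integers(0, 10, 100)
sk_model = MLPClassifier(max_iter=50).fit(X_train, y_train)


def query_sklearn(input_data):
    # PrivacyRaven would only ever interact with this function
    return sk_model.predict(np.asarray(input_data).reshape(1, -1))[0]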

Want to use PrivacyRaven?

  1. Install poetry.
  2. Git clone this repository.
  3. Run poetry update.
  4. Run poetry install.

If you'd like to use a Jupyter Notebook environment, run poetry shell followed by jupyter notebook.

Additionally, if you'd like to run PrivacyRaven in a Docker container, run chmod +x build.sh followed by ./build.sh. Note that depending on the amount of resources you allocate to Docker, PrivacyRaven's performance may be drastically impacted.

Feel free to join our #privacyraven channel in Empire Hacking if you need help using or extending PrivacyRaven. The official pip release will arrive soon.

Please note that PrivacyRaven is still early in development and is undergoing rapid changes. Users are advised to update frequently and avoid applying PrivacyRaven to critical use cases.

Want to contribute to PrivacyRaven?

PrivacyRaven is still a work in progress. We invite you to contribute however you can, whether you want to incorporate a new synthesis technique or make an attack function more readable. Please visit CONTRIBUTING.md to get started.

Why is it called PrivacyRaven?

The raven has been associated with a variety of concepts across different cultures and eras, most commonly prophecy and insight. Naturally, we named the tool PrivacyRaven because it is designed to provide insight into the privacy of deep learning systems.

Who maintains PrivacyRaven?

The core maintainers are:

License

This library is available under the Apache License 2.0. For an exception to the terms, please contact us.

References

While PrivacyRaven was built upon a plethora of research on attacking machine learning privacy, the works most critical to its development are:

Appearances

This is a list of publications, presentations, blog posts, and other public-facing media discussing PrivacyRaven.

PrivacyRaven's People

Contributors

dependabot[bot], oldsj, pwang00, suhacker1


PrivacyRaven's Issues

Create an overlaid diverging histogram for extraction

Is your feature request related to a problem? Please describe.
It is difficult to reason about the difference between a victim and substitute model from a simple label agreement score.

Describe the solution you'd like.
Create an overlaid diverging histogram that showcases the correct and incorrect responses on a single axis.

Describe alternatives you've considered.
If a better solution is available, please comment on this issue.

Detail any additional context.
This was inspired by Figure 3 of Understanding and Visualizing Data Iteration in Machine Learning.
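
A hedged matplotlib sketch of such an overlaid diverging histogram; all agreement data below is synthetic, purely for illustration:

# Agreed responses point up and disagreed responses point down,
# sharing a single axis. The confidence scores are synthetic.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
agreed = rng.normal(0.8, 0.1, 900)
disagreed = rng.normal(0.4, 0.15, 100)

bins = np.linspace(0, 1, 31)
up, _ = np.histogram(agreed, bins=bins)
down, _ = np.histogram(disagreed, bins=bins)
centers = (bins[:-1] + bins[1:]) / 2

plt.bar(centers, up, width=0.03, label="agreed")
plt.bar(centers, -down, width=0.03, label="disagreed")  # diverge below the axis
plt.axhline(0, color="black", linewidth=1)
plt.legend()
plt.show()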

CLI

Build a CLI tool for PrivacyRaven. Details TBD.

Blocked on #14.

Create a PyTorch Lightning callback that uses model extraction

Is your feature request related to a problem? Please describe.
Have a callback perform optimized model extraction.

Detail any additional context.
This may be better suited to the PyTorch Lightning Bolts repository. A callback for #24 would also be of interest.

Create a wrapper around PyTorch Lightning Callbacks

Is your feature request related to a problem? Please describe.
PyTorch Lightning callbacks let users extend an attack's functionality after the attack has been created. In other words, this wrapper should be a function that takes in the attack output itself.

Detail any additional context.
If there are any callbacks that should be integrated within PrivacyRaven, create an issue or comment on this one.
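
A minimal sketch of the proposed wrapper; the callback signature is an assumption for illustration, not a settled interface:

def run_callbacks(attack_output, callbacks):
    # Hand the finished attack's output to each user-supplied callback
    for callback in callbacks:
        callback(attack_output)

# e.g., run_callbacks(attack, [print]) after a ModelExtractionAttack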

Determine query complexity of attacks

Is your feature request related to a problem? Please describe.
Currently, users don't have much guidance when choosing the best attack for their use case. One of the most important factors is the query complexity: how many queries are needed to launch an attack, which also serves as a relative estimate of its computational cost.

Describe the solution you'd like.
More research is needed to detail and visualize a solution. Comment on this issue with your solution before submitting a PR.

Detail any additional context.
This will affect metrics visualization issues. This paper might be useful.

Integrate TensorBoard

Is your feature request related to a problem? Please describe.
TensorBoard enables more transparency into the development of the victim and substitute models.

Describe the solution you'd like.
Make the output as clear and succinct as possible.
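
A minimal sketch of one possible integration, using PyTorch Lightning's built-in TensorBoard logger; the trainer arguments and names are illustrative, not PrivacyRaven's actual configuration:

import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger

# Log victim and substitute training runs under separate names
logger = TensorBoardLogger("lightning_logs", name="substitute")
trainer = pl.Trainer(logger=logger, max_epochs=10)
# trainer.fit(substitute_model, train_dataloader)  # hypothetical model/data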

Add "help" function

Is your feature request related to a problem? Please describe.
Currently, there is no way for users to request help on how to use this tool. Adding a "help" function would improve usability.

Blocked on #22

Create a tabular output for run-all-attacks

Is your feature request related to a problem? Please describe.
When running multiple attacks, it should be easy to compare the effectiveness of each one.

Describe the solution you'd like.
Returning a table of each attack and its respective metrics is a start; this could eventually be turned into graphs and other visuals. A pandas DataFrame is a potential format.

Detail any additional context.
TensorFlow Privacy is an excellent example.
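
A hedged sketch of such a table; the attack names and metric values are invented for illustration:

import pandas as pd

# Each row summarizes one attack from a hypothetical run-all-attacks call
results = pd.DataFrame(
    [
        {"attack": "copycat extraction", "metric": "label agreement", "value": 0.90},
        {"attack": "membership inference", "metric": "attack accuracy", "value": 0.65},
    ]
)
print(results)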

Modify the model extraction output

Is your feature request related to a problem? Please describe.
Currently, it is hard to evaluate the risk of a model extraction attack. The risk needs to be more obvious, with better visuals.

Describe the solution you'd like.
The current output should (at the minimum):

  • Print blocks to separate the different phases
  • Display a sentence describing the label agreement ("Out of 1000 data points, the target model and substitute model agreed upon 900 data points"); see the sketch after this list
  • Clearly delineate whether a metric is for accuracy or fidelity
  • Clean up the current extraction output; in particular, remove as many warnings as possible (the PyTorch Lightning progress bar may need to be changed)
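
A minimal sketch of the proposed agreement sentence; the counts are assumed to come from the attack's evaluation phase:

# Illustrative values taken from the example sentence above
agreed, total = 900, 1000
print(
    f"Out of {total} data points, the target model and substitute "
    f"model agreed upon {agreed} data points"
)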

Optimize extraction example

Is your feature request related to a problem? Please describe.
Apply hyperparameter optimization and other tuning to the extraction example to increase its performance. Display as many metrics as possible.

Describe the solution you'd like.
This would be best displayed in a Jupyter Notebook and potentially turned into a tutorial.

Detail any additional context.
If PrivacyRaven itself limits the performance of the attack, please raise an issue or fix it inside your PR. It would also be nice if the extraction unit test asserted that the attack's success exceeds a specific value/percentage instead of merely checking for failures.

Add support for Python 3.6, 3.8, and 3.9

Is your feature request related to a problem? Please describe.
Presently, PrivacyRaven supports only Python 3.7. Supporting 3.6 specifically would enable PrivacyRaven to work on Google Colab.

Describe the solution you'd like.
Minimal changes to existing code are preferred.

Describe alternatives you've considered.
Forcing Colab to run on a 3.8 kernel would increase the complexity of getting started with PrivacyRaven.

Detail any additional context.
This is in response to #51.

Add guidance for protecting against these attacks in README

Is your feature request related to a problem? Please describe.
Currently, the tool provides various ways to attack a user's model. It would also be beneficial to inform users of ways in which they can protect their models.

Describe the solution you'd like.
Consult the relevant literature and consolidate advice, such as using differential privacy.

Add PrivacyRaven-specific Jupyter Widgets

Is your feature request related to a problem? Please describe.
The title is self-explanatory. Comment and/or create a new issue for the proposed widget. Metrics visualization is the primary use case.

Allow membership inference attacks to accept extracted models

Is your feature request related to a problem? Please describe.
The membership inference attack currently performs model extraction, which is redundant if model extraction has already been performed. Allow users to pass in an extracted model and skip the extraction component of membership inference.

Create a differentially private victim model

Is your feature request related to a problem? Please describe.
This would increase the variety of victim models for quick prototyping. Presently, PrivacyRaven only provides a three-layer classifier trained on the MNIST dataset as a victim model.

Describe the solution you'd like.
A lightweight model written using Opacus is preferable; see the sketch after this issue.

Describe alternatives you've considered.
The model can also be built with TensorFlow Privacy or PySyft.

Detail any additional context.
This victim model could also be used inside of examples, reinforcing and/or extending the results from this paper among others. Multiple people can work on this issue.
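
A minimal sketch of a differentially private training setup with Opacus (DP-SGD); the stand-in model, optimizer, and data below are illustrative, not the proposed victim:

import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Tiny stand-in classifier and dataset, for illustration only
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data = TensorDataset(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,)))
loader = DataLoader(data, batch_size=8)

# Wrap training so per-sample gradients are clipped and noised
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.1,
    max_grad_norm=1.0,
)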

Edit CONTRIBUTING.md

Is your feature request related to a problem? Please describe.
Instead of a list of bullet points, it would be nice to reorganize the file into sections. It would also help to pay more attention to installation and setup.

Detail any additional context.
Make sure to include that:

  • Right now, all of the tests run on GitHub Actions and therefore must explicitly be defined as not using a GPU
  • All issues under "needs validation" (like this one) should have a comment with an overview of the solution on the issue itself
  • Commands often need to be run through Poetry, e.g., poetry run python or poetry run pytest
  • It is best to run Nox before making any commits
  • Explain how to look at the GitHub Projects. The most important issues are at the top
  • Take a look at "help wanted" issues first
  • We prefer tests using Python Hypothesis
  • Link to relevant papers
  • Code should adhere to PyTorch and PyTorch Lightning best practices (See: 1, 2, 3, 4, 5)
  • We're interested in new issues that add new synthesizers, attack configurations, strategies, victim models, and substitute model architectures (or, broadly, whatever improves the scope, efficiency, and effectiveness of PrivacyRaven)
  • Some papers are not reproducible

Metrics visualization interface

Is your feature request related to a problem? Please describe.
Instead of merely printing hard-to-understand metrics, PrivacyRaven should generate visuals and easy-to-understand metrics for this tool to be more broadly applicable. This can be solved in a multitude of ways, but user understanding should be the top priority.

Add more membership inference configurations

Is your feature request related to a problem? Please describe.
Create functions that call the membership inference class with specific configurations, as has been done for model extraction.

Detail any additional context.
Blocked on #7 and #8.

Add GPU detection within attacks

Is your feature request related to a problem? Please describe.
This should be integrated into all of the attacks. Users will no longer be required to explicitly state how many GPUs they are using. However, they should have the ability to do so. This should output something like:

GPU Available: True; GPU Used: False

This is motivated by the fact that the tests only work on a CPU due to GitHub Actions, so all tests must explicitly define that they are not using a GPU.

Describe the solution you'd like.
Explicit user statements take priority over automated detection. When GPUs are disabled, all GPU-specific behavior, especially .cuda methods, should be turned off.
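
A minimal sketch of the detection logic; the function name and defaults are assumptions for illustration:

import torch

def detect_gpus(requested=None):
    # An explicit user request takes priority over automated detection
    available = torch.cuda.is_available()
    used = requested if requested is not None else (1 if available else 0)
    print(f"GPU Available: {available}; GPU Used: {used > 0}")
    return used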

Describe alternatives you've considered.
We could have separate tests on a GPU with a different CI. Self-hosted runners are not an option. We are open to alternatives.

Detail any additional context.
If you include a test for GPU behavior, it must be ignored by GitHub Actions somehow.

Potentially interesting links:

  1. How PyTorch Lightning became the first ML framework to run continuous integration on TPUs
  2. Effective testing for machine learning systems

PrivacyRaven as a property testing tool

Is your feature request related to a problem? Please describe.
Currently, there is not an easy way to integrate PrivacyRaven as an assurance tool into an existing project.

Describe the solution you'd like.
Adjust PrivacyRaven to function as a property testing tool that can integrate with existing projects. Comment on this issue with your solution before submitting a PR.

Blocked on #22

Add more privacy metrics

Describe the solution you'd like.
I would like a new folder in src with a plethora of metrics.

Detail any additional context.
This is a list of papers. For each metric, create a new issue that links to this one, then resolve it.

Add retraining and subset sampling to extraction

Is your feature request related to a problem? Please describe.
PrivacyRaven currently trains a substitute model on synthetic data generated separately from substitute training. However, many model extraction attacks train the substitute model adaptively using subset sampling strategies.

Describe the solution you'd like.
PrivacyRaven should allow users to specify a subset sampling strategy; by default, it should be assumed that no subset sampling strategy is being used. A sketch follows this issue.

Detail any additional context.
Multiple functions in extraction/attacks.py as well as the core classes for membership inference and model extraction need to be changed in addition to the attributes of the core model extraction class.
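
A hedged sketch of an adaptive extraction loop with a simple random subset sampling strategy; every name here is illustrative, not PrivacyRaven's API:

import random

def adaptive_extraction(pool, query_victim, train_substitute, rounds=5, k=40):
    # Label a sampled subset each round, then retrain the substitute
    labeled = []
    for _ in range(rounds):
        indices = set(random.sample(range(len(pool)), k))  # subset sampling strategy
        labeled += [(pool[i], query_victim(pool[i])) for i in indices]
        substitute = train_substitute(labeled)
        pool = [x for i, x in enumerate(pool) if i not in indices]
    return substitute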

Automated hyperparameter optimization

Is your feature request related to a problem? Please describe.
Presently, users have to manually pick different hyperparameters for training a substitute model and other phases of each attack. However, constructing an effective attack often requires these hyperparameters to be optimized.

Describe the solution you'd like.
Optuna is a commonly used library for automated hyperparameter optimization. Different techniques should be tested to determine the best optimization strategy for PrivacyRaven.

Describe alternatives you've considered.
Other hyperparameter libraries can be used.

Detail any additional context.
The PyTorch Lightning parameters already contained within PrivacyRaven can be changed if needed. Additionally, the hyperparameters are stored within a dictionary but can be moved to an enum or another suitable structure.
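
A minimal sketch of what an Optuna search could look like; the search space and the train_and_evaluate helper are hypothetical, not PrivacyRaven's actual strategy:

import optuna

def objective(trial):
    # Hypothetical search space for substitute training
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    batch_size = trial.suggest_categorical("batch_size", [32, 64, 128])
    # train_and_evaluate is a hypothetical helper returning label agreement
    return train_and_evaluate(lr=lr, batch_size=batch_size)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)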

Build tests for model extraction and utils

Is your feature request related to a problem? Please describe.
No tests exist for the different model extraction interfaces and for the utilities.

Describe the solution you'd like.
Create unit tests using Pytest and/or Hypothesis.

Describe alternatives you've considered.
N/A

Detail any additional context.
Either traditional unit testing or property-based testing is appropriate as long as the decision is justified. Make sure to add any new libraries to Poetry.
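
A minimal property-based test sketch using Hypothesis; it reuses the query_mnist function from the README example, and the assertion is an illustrative assumption:

import numpy as np
from hypothesis import given
from hypothesis.extra.numpy import arrays

@given(arrays(np.float32, (1, 28, 28, 1)))
def test_query_returns_valid_label(x):
    # query_mnist is the query function from the extraction example
    label = query_mnist(x)
    assert 0 <= int(label) <= 9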

Unable to run or use PrivacyRaven in Colab

Hi,

It appears that, due to the use of Poetry to enforce dependencies and create a virtualenv for PrivacyRaven, we are unable to use it in Google Colab or Jupyter Notebook environments. Please suggest a workaround or an alternative.

Thanks and Regards,
Sasikanth

Use attrs for model extraction

Refactor src/extraction/core.py by building the class for model extraction attacks with attrs instead (as has been done in src/m_inference/core.py).
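
A hedged sketch of the attrs-based class; the attribute names mirror the core-interface arguments shown in the README example, but this is an illustration, not the actual refactor:

import attr

@attr.s
class ModelExtractionAttack:
    # Attribute names mirror the core-interface arguments above
    query = attr.ib()
    query_limit = attr.ib(default=100)
    victim_input_shape = attr.ib(default=None)
    substitute_input_shape = attr.ib(default=None)

    def __attrs_post_init__(self):
        # Launch the attack phases once the attributes are validated
        pass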

Add more model extraction attacks

Is your feature request related to a problem? Please describe.
We want every model extraction attack to be achievable in PrivacyRaven. This does not include side-channel, white-box, full or partial prediction, or explanation-based attacks.

Describe the solution you'd like.
PrivacyRaven has three interfaces for attacks:

  1. The core interface defines each attack parameter individually.
  2. The specific interface runs a predefined attack configuration.
  3. The cohesive interface runs every possible attack.

A user should be able to run the attack in every interface; this means that all the building blocks for the attack should be contained within PrivacyRaven. For example, new synthesizers or subset selection strategies for a specific attack should be added, so that it can be applied using the core interface.

If you would like to implement an attack, comment with the name of the paper. Then, create a new issue referencing this issue with the name of the paper in the title.

Detail any additional context.
This is a list of papers describing model extraction attacks that should be added to PrivacyRaven.

  1. Knockoff nets: Stealing functionality of black-box models: Blocked on #10
  2. PRADA: protecting against DNN model stealing attacks: Missing synthesizer
  3. CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples: Missing some synthesizers
  4. ACTIVETHIEF: Model Extraction Using Active Learning and Unannotated Public Data: Blocked on #10
  5. Thief, Beware of What Get You There: Towards Understanding Model Extraction Attack
  6. Special-Purpose Model Extraction Attacks: Stealing Coarse Model with Fewer Queries
  7. Model Extraction Attacks on Graph Neural Networks: Taxonomy and Realization
  8. ES Attack: Model Stealing against Deep Neural Networks without Data Hurdles
  9. Simulating Unknown Target Models for Query-Efficient Black-box Attacks
  10. Thieves on Sesame Street! Model Extraction of BERT-based APIs

Separate model-specific and data-specific hyperparameters

Is your feature request related to a problem? Please describe.
Right now, PrivacyRaven mixes all of the hyperparameters necessary to train a substitute model into a single dictionary. This should be replaced with Lightning DataModules and/or other structures to clarify the role of each parameter and make it easier to extend PrivacyRaven to different tasks. This may require automatically wrapping all synthesized data in a DataModule.

Detail any additional context.
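
A minimal sketch of moving the data-specific parameters into a LightningDataModule; the class and attribute names are illustrative:

import pytorch_lightning as pl
from torch.utils.data import DataLoader

class SyntheticDataModule(pl.LightningDataModule):
    def __init__(self, train_set, batch_size=64):
        super().__init__()
        self.train_set = train_set
        self.batch_size = batch_size  # data-specific, not model-specific

    def train_dataloader(self):
        return DataLoader(self.train_set, batch_size=self.batch_size)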

Create an aggregated embedding for membership inference hot spots

Is your feature request related to a problem? Please describe.
We need a clarifying visual to showcase the privacy risks of membership inference, especially as it varies between classes.

Describe the solution you'd like.
An aggregated embedding similar to Figure 4 of Understanding and Visualizing Data Iteration in Machine Learning would be useful. Highlight the worst case.

Describe alternatives you've considered.
Please comment with any alternatives.

Fix HopSkipJump extraction

Is your feature request related to a problem? Please describe.
With the new security updates and other changes, HopSkipJump-based extraction no longer works. We need to fix it so that using HopSkipJump to populate the synthetic dataset for model extraction is possible.
