crypten's Introduction

CrypTen is a framework for Privacy Preserving Machine Learning built on PyTorch. Its goal is to make secure computing techniques accessible to Machine Learning practitioners. It currently implements Secure Multiparty Computation as its secure computing backend and offers three main benefits to ML researchers:

  1. It is machine learning first. The framework presents the protocols via a CrypTensor object that looks and feels exactly like a PyTorch Tensor. This allows the user to use automatic differentiation and neural network modules akin to those in PyTorch.

  2. CrypTen is library-based. It implements a tensor library just as PyTorch does. This makes it easier for practitioners to debug, experiment on, and explore ML models.

  3. The framework is built with real-world challenges in mind. CrypTen does not scale back or oversimplify the implementation of the secure protocols.

Here is a bit of CrypTen code that encrypts and decrypts tensors and adds them:

import torch
import crypten

crypten.init()

x = torch.tensor([1.0, 2.0, 3.0])
x_enc = crypten.cryptensor(x) # encrypt

x_dec = x_enc.get_plain_text() # decrypt

y_enc = crypten.cryptensor([2.0, 3.0, 4.0])
sum_xy = x_enc + y_enc # add encrypted tensors
sum_xy_dec = sum_xy.get_plain_text() # decrypt sum

CrypTen is not yet production ready; its main use is as a research framework.

Installing CrypTen

CrypTen currently runs on Linux and Mac with Python 3.7. We also support computation on GPUs. Windows is not supported.

For Linux or Mac

pip install crypten

Examples

To run the examples in the examples directory, you additionally need to clone the repo and

pip install -r requirements.examples.txt

We provide examples covering a range of models in the examples directory:

  1. The linear SVM example, mpc_linear_svm, generates random data and trains an SVM classifier on encrypted data.
  2. The LeNet example, mpc_cifar, trains an adaptation of LeNet on CIFAR in cleartext and encrypts the model and data for inference.
  3. The TFE benchmark example, tfe_benchmarks, trains three different network architectures on MNIST in cleartext, and encrypts the trained model and data for inference.
  4. The bandits example, bandits, trains a contextual bandits model on encrypted data (MNIST).
  5. The imagenet example, mpc_imagenet, performs inference on pretrained models from torchvision.

For examples that train in cleartext, we also provide pre-trained models in cleartext in the model subdirectory of each example subdirectory.

You can check all example-specific command-line options as follows (shown here for tfe_benchmarks):

python examples/tfe_benchmarks/launcher.py --help

How CrypTen works

We have a set of tutorials in the tutorials directory to show how CrypTen works. These are presented as Jupyter notebooks, so please install the following in your conda environment:

conda install ipython jupyter
pip install -r requirements.examples.txt
  1. Introduction.ipynb - an introduction to Secure Multiparty Computation; CrypTen's underlying secure computing protocol; the use cases we are trying to solve and the threat model we assume.
  2. Tutorial_1_Basics_of_CrypTen_Tensors.ipynb - introduces CrypTensor, CrypTen's encrypted tensor object, and shows how to use it to perform various operations.
  3. Tutorial_2_Inside_CrypTensors.ipynb - delves deeper into CrypTensor to show its inner workings; specifically, how CrypTensor uses MPCTensor as its backend and how the two kinds of sharing, arithmetic and binary, are used for different kinds of functions. It also shows CrypTen's MPI-inspired programming model.
  4. Tutorial_3_Introduction_to_Access_Control.ipynb - shows how to train a linear model using CrypTen and covers the scenarios of data labeling, feature aggregation, dataset augmentation, and model hiding where this is applicable.
  5. Tutorial_4_Classification_with_Encrypted_Neural_Networks.ipynb - shows how CrypTen can load a pre-trained PyTorch model, encrypt it, and then do inference on encrypted data.
  6. Tutorial_5_Under_the_hood_of_Encrypted_Networks.ipynb - examines how CrypTen loads PyTorch models, how they are encrypted, and how data moves through a multilayer network.
  7. Tutorial_6_CrypTen_on_AWS_instances.ipynb - shows how to use scripts/aws_launcher.py to launch our examples on AWS. It can also work with your own CrypTen code.
  8. Tutorial_7_Training_an_Encrypted_Neural_Network.ipynb - introduces the automatic differentiation functionality of CrypTensor, which makes it easy to train neural networks in CrypTen.

Documentation and citing

CrypTen is documented here.

The protocols and design of CrypTen are described in this paper. If you want to cite CrypTen in your papers (much appreciated!), you can do so as follows:

@inproceedings{crypten2020,
  author={B. Knott and S. Venkataraman and A.Y. Hannun and S. Sengupta and M. Ibrahim and L.J.P. van der Maaten},
  title={CrypTen: Secure Multi-Party Computation Meets Machine Learning},
  booktitle={arXiv 2109.00984},
  year={2021},
}

Join the CrypTen community

Please contact us to join the CrypTen community on Slack.

See the CONTRIBUTING file for how to help out.

License

CrypTen is MIT licensed, as found in the LICENSE file.

crypten's People

Contributors

ajnovice, amyreese, bigfootjon, datumbox, dmitryvinn, ezyang, facebook-github-bot, farbdrucker, gmuraru, kernelmethod, kit1980, knottb, kurtamohler, lvdmaaten, marksibrahim, mohammad-alrubaie, r-barnes, ruiyuzhu, sayanghosh, shubho, stanislavglebik, tgamal, vibhatha, vreis, youben11


crypten's Issues

FixedPointEncoder.decode() produces incorrect results

The fixed point encoder produces correct results when round-tripping small integers:

>>> import crypten
>>> crypten.init()
>>> def round_trip(input):
...   encoder = crypten.encoder.FixedPointEncoder()
...   output = int(encoder.decode(encoder.encode(input)))
...   print(input, output, input - output)
... 
>>> round_trip(1)
1 1 0
>>> round_trip(13)
13 13 0
>>> round_trip(254)
254 254 0
>>> round_trip(2**16)
65536 65536 0
>>> round_trip(2**16 + 1)
65537 65537 0

However, larger values don't always produce correct results:

>>> round_trip(2**24)
16777216 16777216 0
>>> round_trip(2**24 + 1)
16777217 16777216 1                   <-- Wrong!
>>> round_trip(2**24 + 2)
16777218 16777218 0
>>> round_trip(2**24 + 3)
16777219 16777220 -1                  <-- Wrong!
>>> round_trip(2**47 - 2)
140737488355326 140737488355328 -2    <-- Wrong!
>>> round_trip(2**47 - 1)
140737488355327 140737488355328 -1    <-- Wrong!

Since the encoder reserves 16 bits for fractions, I would expect any integer in the range [-2^47, +2^47) to be encoded and decoded without change.

Looking at the implementation of FixedPointEncoder.encode(), I see that values are simply multiplied by the scale factor (2^16 by default) as you'd expect. However, the implementation of decode() is (I think) doing much more than it should - seems like it should simply divide by the scale factor. If I simplify the implementation to do just that, then I get the expected results.
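
For reference, here is a minimal plain-Python sketch of the round-trip behavior I would expect, assuming the default scale of 2**16 (illustrative only, not CrypTen's implementation):

SCALE = 2 ** 16  # 16 bits reserved for fractions

def encode(value: int) -> int:
    return value * SCALE       # multiply by the scale factor

def decode(encoded: int) -> int:
    return encoded // SCALE    # simply divide by the scale factor

for v in (1, 254, 2 ** 24 + 1, 2 ** 47 - 1):
    assert decode(encode(v)) == v  # integers in [-2**47, 2**47) round-trip exactly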

Can you comment on the implementation of decode()?

Many thanks,
Tim

Support for 'group' sharing

Feature

In torch.distributed, we can create a group and perform collective functions within that group rather than the default world. Can this also be added to CrypTen?
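
For reference, a sketch of the torch.distributed pattern being referred to (assumes an already-initialized process group with world size >= 2):

import torch
import torch.distributed as dist

group = dist.new_group(ranks=[0, 1])   # subgroup of the default world
t = torch.ones(1)
if dist.get_rank() in (0, 1):
    dist.all_reduce(t, group=group)    # only ranks 0 and 1 participate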

mpc.run_multiprocess decorated functions blocking

When going through tutorials 3 and 4, cells 6 and 7 respectively block (I waited for around 1 hour without results); the run_multiprocess function is waiting for child processes here. I guess something is going wrong with a child process, but I couldn't figure out what yet.

Workspace

  • python 3.7.5
  • torch 1.3.0
  • crypten is installed from the last commit
  • Linux MANJARO

How to encrypt and save shares of a pretrained pytorch model?

Hi,

Please forgive my lack of deep understanding of the topic.

I am trying to load a pretrained torch model, encrypt using crypten and save parts of the model using something like this:

First I encrypt the model and verify it is encrypted, then I would like to save the model:

@mpc.run_multiprocess(world_size=2)
def save_encrypted_model(model):
    # save this party's share of the already-encrypted model
    rank = comm.get().get_rank()
    crypten.save_from_party(model, f'private_model.encrypted.{rank}', src=rank)

Private model above is already encrypted. First of all, is this the correct way to save it?

Do I need to encrypt with a world size > 1, or world size = 1? (see update below)

Do I save one file from a single rank? Or multiple files from multiple ranks like above. If I do, how do I reload the model?

Thanks

Update:

When I try encrypting with a share like this:

@mpc.run_multiprocess(world_size=2)
def create_model_shares(model):
    rank = comm.get().get_rank()
    model.encrypt(src=rank)
        
create_model_shares(private_model)

And then try to view a particular layer's encrypted MPC tensor value according to rank, I get:

Rank 0:
 MPCTensor(
	_tensor=tensor([ 15078,  16867,      0, -41953,      0,  10576,  30083,      0,  19777,
             0,  22494,  20096, -15800,     -2,   7024,  14254,  24308, -36044,
        -41872,  37914,  20004,  37636,  32006,  21770,  12858,  12557,  10062,
          6427,  32524,   1640,  10788,  21414,  16931,  29099, -19852,  -1316,
             0,  20762,      0,  15416,  15369,  20442,  28396,  18902,  16951,
         45435,  27650,  22283,      0,  16446,  19953,  40853,  25514,  21493,
        -26507,  25037,  11895,  16515, -28068,  14005,  37782,  36900, -25656,
         15519])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)

Rank 1:
 MPCTensor(
	_tensor=tensor([ 15078,  16867,      0, -41953,      0,  10576,  30083,      0,  19777,
             0,  22494,  20096, -15800,     -2,   7024,  14254,  24308, -36044,
        -41872,  37914,  20004,  37636,  32006,  21770,  12858,  12557,  10062,
          6427,  32524,   1640,  10788,  21414,  16931,  29099, -19852,  -1316,
             0,  20762,      0,  15416,  15369,  20442,  28396,  18902,  16951,
         45435,  27650,  22283,      0,  16446,  19953,  40853,  25514,  21493,
        -26507,  25037,  11895,  16515, -28068,  14005,  37782,  36900, -25656,
         15519])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)

Aren't each of those numbers supposed to be different?

Calling crypten.nn.from_pytorch hangs

Bug

We want a parent process that calls crypten.nn.from_pytorch and child processes that do the same; however, the child processes seem to block at the first call to torch.onnx.export inside the crypten.nn.from_pytorch function.

Reproduce

This example script can help reproduce the bug:

import torch
import crypten
import multiprocessing


def _get_model():
    class ExampleNet(torch.nn.Module):
        def __init__(self):
            super(ExampleNet, self).__init__()
            self.conv1 = torch.nn.Conv2d(1, 16, kernel_size=5, padding=0)
            self.fc1 = torch.nn.Linear(16 * 12 * 12, 100)
            self.fc2 = torch.nn.Linear(
                100, 2
            )  # For binary classification, final layer needs only 2 outputs

        def forward(self, x):
            out = self.conv1(x)
            out = torch.nn.functional.relu(out)
            out = torch.nn.functional.max_pool2d(out, 2)
            out = out.view(out.size(0), -1)
            out = self.fc1(out)
            out = torch.nn.functional.relu(out)
            out = self.fc2(out)
            return out

    dummy_input = torch.empty(1, 1, 28, 28)
    example_net = ExampleNet()
    model = crypten.nn.from_pytorch(example_net, dummy_input)
    return model


def proc():
    print("\tGetting model inside proc")
    # it blocks here only when we have called crypten.nn.from_pytorch in the parent process
    model = _get_model()
    print("\tGot model inside proc")
    return model


print("[+] Start")
# it doesn't block if we call this multiple times inside the same process
model = _get_model()
print("[+] Got model")
process = multiprocessing.Process(target=proc, args=())
print("[+] Starting process")
process.start()
print("[+] Waiting process")
process.join()
print("[+] End")

Environment

$ python collect_env.py
Collecting environment information...
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Manjaro Linux
GCC version: (GCC) 9.2.0
CMake version: version 3.16.2

Python version: 3.7
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip3] numpy==1.18.0
[pip3] torch==1.3.1
[conda] torch                     1.4.0                    pypi_0    pypi
[conda] torchvision               0.5.0                    pypi_0    pypi
$ pip freeze
absl-py==0.9.0
appdirs==1.4.3
astor==0.8.1
attrs==19.3.0
backcall==0.1.0
black==19.10b0
bleach==3.1.0
certifi==2019.11.28
cffi==1.13.2
chardet==3.0.4
Click==7.0
coverage==4.5
-e git+https://github.com/facebookresearch/CrypTen.git@68e0364c66df95ddbb98422fb641382c3f58734c#egg=crypten
cryptography==2.8
decorator==4.4.1
defusedxml==0.6.0
dill==0.3.1.1
entrypoints==0.3
Flask==1.1.1
Flask-SocketIO==4.2.1
future==0.18.2
gast==0.2.2
google-pasta==0.1.8
grpcio==1.26.0
h5py==2.10.0
idna==2.8
importlib-metadata==1.3.0
ipykernel==5.1.3
ipython==7.10.2
ipython-genutils==0.2.0
ipywidgets==7.5.1
itsdangerous==1.1.0
jedi==0.15.1
Jinja2==2.10.3
joblib==0.14.1
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==6.0.0
jupyter-core==4.6.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
lz4==3.0.2
Markdown==3.1.1
MarkupSafe==1.1.1
mistune==0.8.4
more-itertools==8.0.2
msgpack==1.0.0
nbconvert==5.6.1
nbformat==4.4.0
notebook==6.0.2
numpy==1.18.1
onnx==1.6.0
opt-einsum==3.1.0
packaging==20.1
pandocfilters==1.4.2
parso==0.5.2
pathspec==0.7.0
pexpect==4.7.0
phe==1.4.0
pickleshare==0.7.5
Pillow==6.2.2
pluggy==0.13.1
prometheus-client==0.7.1
prompt-toolkit==2.0.9
protobuf==3.11.2
ptyprocess==0.6.0
pudb==2019.2
py==1.8.1
pycparser==2.19
Pygments==2.5.2
pyOpenSSL==19.1.0
pyparsing==2.4.6
pyrsistent==0.15.6
pytest==5.3.4
pytest-cov==2.8.1
python-dateutil==2.8.1
python-engineio==3.11.1
python-socketio==4.4.0
PyYAML==5.2
pyzmq==18.1.0
qtconsole==4.6.0
regex==2020.2.20
requests==2.22.0
RestrictedPython==5.0
scikit-learn==0.22
scipy==1.4.1
Send2Trash==1.5.0
six==1.13.0
sklearn==0.0
-e [email protected]:youben11/PySyft.git@1faf4400ffcce224fde347333cbfe15a5ab12660#egg=syft
syft-proto==0.2.1a1.post2
tblib==1.6.0
tensorboard==1.15.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
tf-encrypted==0.5.9
toml==0.10.0
torch==1.4.0
torchvision==0.5.0
tornado==4.5.3
traitlets==4.3.3
typed-ast==1.4.1
typing-extensions==3.7.4.1
urllib3==1.25.7
urwid==2.1.0
wcwidth==0.1.7
webencodings==0.5.1
websocket-client==0.57.0
websockets==8.1
Werkzeug==0.16.0
widgetsnbextension==3.5.1
wrapt==1.11.2
zipp==0.6.0

[BUG] mpc_autograd_cnn throws a TypeError

Environment

OS: MacOS 10.15.6
Python: Python 3.7.6

Commit Head

6dec17c

Requirement versions

crypten==0.1
torch==1.6.0
torchvision==0.5.0
visdom==0.1.8.9
numpy==1.18.1
onnx==1.7.0

Command

$ python mpc_autograd_cnn/launcher.py

Output

(screenshot of the error output attached)

Update

I had actually installed crypten 0.1. But when I switched to master at the commit head above, I got this output (screenshot attached).

About secret-sharing in Tutorial 2

For example, with
x_enc = crypten.cryptensor([1, 2, 3])
x_enc is generated by secret sharing, so it should need at least two sources. But in the code
x_enc = crypten.cryptensor(x, src=i)
how can the encryption be completed with only one source?

Filter encrypted tensor indices based on value

Feature

Filter encrypted tensor indices based on value.

import crypten
import torch
crypten.init()

from crypten.mpc import MPCTensor

x = torch.tensor([1, 0, 1, 1, -1, 1])

label = MPCTensor(1)
x_crypten = MPCTensor(x)
positive = x_crypten == label
print(positive.get_plain_text())  # tensor([1, 0, 1, 1, 0, 1])

Currently I do not see any implemented methods for retrieving indices in tensor x that are equal to the encrypted label. See below for torch implementation.

Alternatives

With torch we can perform the following operations.

label = 1
positive = x == label
indices = positive.nonzero().squeeze()
print(indices) # tensor([0,2,3,5])

Bug: AttributeError: module 'crypten' has no attribute 'init'

Followed the installation instructions on Ubuntu 18.04 and received this error when running the example code from the stable CrypTen release (latest pip):

$ python crypten.py
Traceback (most recent call last):
  File "crypten.py", line 2, in <module>
    import crypten
  File "/mnt/c/pytorch-classify/crypten.py", line 4, in <module>
    crypten.init()
AttributeError: module 'crypten' has no attribute 'init'

Example code:

import torch
import crypten

crypten.init()

x = torch.tensor([1.0, 2.0, 3.0])
x_enc = crypten.cryptensor(x) # encrypt

x_dec = x_enc.get_plain_text() # decrypt

y_enc = crypten.cryptensor([2.0, 3.0, 4.0])
sum_xy = x_enc + y_enc # add encrypted tensors
sum_xy_dec = sum_xy.get_plain_text() # decrypt sum

Add a preloaded argument to crypten.load

Feature

Add a preloaded argument to crypten.load to be able to use tensors or models that are retrieved using other mechanisms than torch.load.

Alternatives

Have a parameter for passing the function to call to retrieve the tensor/model.

Additional context

crypten.load can load objects from files by calling torch.load; however, we sometimes retrieve tensors/models through a different mechanism, and we still don't want to re-implement the dim and size exchange.
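
A hypothetical sketch of how the proposed argument might be used (neither the preloaded keyword nor fetch_model_from_database exists in CrypTen; both are illustrative):

obj = fetch_model_from_database("my_model")      # hypothetical custom retrieval mechanism
enc = crypten.load(None, preloaded=obj, src=0)   # hypothetical: skip torch.load, keep the dim/size exchange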

crypten.load reinitializing cuda in forked process?

I'm attempting to load a simple neural network on the master branch and receiving an error that CUDA is being reinitialized in a forked process when attempting to load from a checkpoint. Here's the stack trace:

Traceback (most recent call last):
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/sam/Documents/CrypTen/crypten/mpc/context.py", line 30, in _launch
    return_value = func(*func_args, **func_kwargs)
  File "/home/sam/Documents/svmvscnn/import.py", line 13, in run
    model = crypten.load('checkpoint.pth', dummy_model=model_class, src=0)
  File "/home/sam/Documents/CrypTen/crypten/__init__.py", line 337, in load
    obj = load_closure(f)
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/serialization.py", line 853, in _load
    result = unpickler.load()
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/serialization.py", line 845, in persistent_load
    load_tensor(data_type, size, key, _maybe_decode_ascii(location))
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/serialization.py", line 834, in load_tensor
    loaded_storages[key] = restore_location(storage, location)
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/serialization.py", line 175, in default_restore_location
    result = fn(storage, location)
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/serialization.py", line 157, in _cuda_deserialize
    return obj.cuda(device)
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/_utils.py", line 71, in _cuda
    with torch.cuda.device(device):
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/cuda/__init__.py", line 225, in __enter__
    self.prev_idx = torch._C._cuda_getDevice()
  File "/home/sam/miniconda3/envs/newcrypten/lib/python3.7/site-packages/torch/cuda/__init__.py", line 164, in _lazy_init
    "Cannot re-initialize CUDA in forked subprocess. " + msg)
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method

For reference, I'm running Ubuntu 20.04. Here is the code to reproduce. Both of the methods here fail. A checkpoint of the network will be attached to this issue.

import crypten
import crypten.mpc as mpc
import torch
from torch import nn
import torch.nn.functional as func

class Net1(nn.Module):
    def __init__(self):
        super(Net1, self).__init__()
        self.fc1 = nn.Linear(768, 10)
        self.fc2 = nn.Linear(10, 2)
    
    def forward(self, batch):
        x = self.fc1(batch)
        x = func.relu(x)
        x = self.fc2(x)
        return func.relu(x)

@mpc.run_multiprocess(world_size=2)
def run():
    model_class = Net1()
    #model = crypten.load_from_party('checkpoint.pth', model_class=model_class, src=0)
    model = crypten.load('checkpoint.pth', dummy_model=model_class, src=0)
    dummy_input = torch.empty((1, 1, 768))
    dummy_input.to('cuda')
    private_model = crypten.nn.from_pytorch(model, dummy_input)
    private_model.encrypt()
    private_model.eval()
    input = torch.rand((1, 1, 768))
    input = crypten.cryptensor(input, src=0)
    classification = private_model(input)
    print(classification)
    print('done')

crypten.init()
torch.set_num_threads(1)
run()

checkpoint.zip
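
A possible workaround sketch (an assumption on my part, not an official fix): force deserialization onto the CPU so the forked child never initializes CUDA, and move tensors to the GPU only after loading.

import torch

# map_location='cpu' keeps torch.load from touching CUDA in the forked child
state = torch.load('checkpoint.pth', map_location='cpu')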

Tutorial #7 throwing ValueError: Pattern not supported: expected Constant followed by Reshape node.

# We'll now set up the data for our small example below
# For illustration purposes, we will create toy data
# and encrypt all of it from source ALICE
x_small = torch.rand(100, 1, 28, 28)
y_small = torch.randint(1, (100,))

# Transform labels into one-hot encoding
label_eye = torch.eye(2)
y_one_hot = label_eye[y_small]

# Transform all data to CrypTensors
x_train = crypten.cryptensor(x_small, src=ALICE)
y_train = crypten.cryptensor(y_one_hot)

# Instantiate and encrypt a CrypTen model
model_plaintext = ExampleNet()
dummy_input = torch.empty(1, 1, 28, 28)
model = crypten.nn.from_pytorch(model_plaintext, dummy_input)
model.encrypt()

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-c6c40d1b9ef3> in <module>
     16 model_plaintext = ExampleNet()
     17 dummy_input = torch.empty(1, 1, 28, 28)
---> 18 model = crypten.nn.from_pytorch(model_plaintext, dummy_input)
     19 model.encrypt()

/anaconda2/envs/py3/lib/python3.7/site-packages/crypten/nn/onnx_converter.py in from_pytorch(pytorch_model, dummy_input)
     35     # construct CrypTen model:
     36     f = _from_pytorch_to_bytes(pytorch_model, dummy_input)
---> 37     crypten_model = from_onnx(f)
     38     f.close()
     39 

/anaconda2/envs/py3/lib/python3.7/site-packages/crypten/nn/onnx_converter.py in from_onnx(onnx_string_or_file)
    122     """Converts an onnx model to a CrypTen model"""
    123     converter = FromOnnx(onnx_string_or_file)
--> 124     crypten_model = converter.to_crypten()
    125     return crypten_model
    126 

/anaconda2/envs/py3/lib/python3.7/site-packages/crypten/nn/onnx_converter.py in to_crypten(self)
    171             crypten_model.add_module(node_output_name, crypten_module, node_input_names)
    172 
--> 173         crypten_model = self.modify_shapes_in_graph(crypten_model)
    174         crypten_model = FromOnnx._get_model_or_module(crypten_model)
    175         return crypten_model

/anaconda2/envs/py3/lib/python3.7/site-packages/crypten/nn/onnx_converter.py in modify_shapes_in_graph(crypten_model)
    304                     raise ValueError(
    305                         """Pattern not supported: expected
--> 306                         Constant followed by Reshape node."""
    307                     )
    308                 value = crypten_modules[i - 1].value.long()

ValueError: Pattern not supported: expected
                        Constant followed by Reshape node.

Add LogSoftmax, Softmax, and Dropout modules

I have used a PyTorch VGG-16 pre-trained model, added 2 more layers to fit my needs, trained the model, and saved it as a .pth file.
However, when I try to convert it to a CrypTen model using the .load() function, it gives the following error:

<ipython-input-41-8acf3ddeab54> in <module>()
      1 dummy_model = dummy_model
----> 2 plaintext_model = crypten.load('/content/drive/My Drive/vgg16-model.pth', dummy_model=dummy_model, src=ALICE)
      3 
      4 # Encrypt the model from Alice:
      5 

/usr/local/lib/python3.6/dist-packages/crypten/__init__.py in load(f, encrypted, dummy_model, src, **kwargs)
    233 
    234                 # raise error
--> 235                 raise TypeError("Unrecognized load type %s" % type(result))
    236 
    237         # Non-source party

TypeError: Unrecognized load type <class 'dict'>

I saw in the crypten/__init__.py file that this error comes from the block that checks whether result, the return value of torch.load(f, **kwargs), is of type tensor or torch.nn.Module; the error is raised when it is neither.

So to check, I ran the result = torch.load(f, **kwargs) command separately, and that gave the following error:

AttributeError                            Traceback (most recent call last)

/usr/local/lib/python3.6/dist-packages/torch/serialization.py in _check_seekable(f)
    190     try:
--> 191         f.seek(f.tell())
    192         return True

5 frames

AttributeError: 'VGG' object has no attribute 'seek'


During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)

/usr/local/lib/python3.6/dist-packages/torch/serialization.py in raise_err_msg(patterns, e)
    185                                 " Please pre-load the data into a buffer like io.BytesIO and" +
    186                                 " try to load from it instead.")
--> 187                 raise type(e)(msg)
    188         raise e
    189 

AttributeError: 'VGG' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

Is there any way to solve this issue so that I can use CrypTen with my pre-trained VGG-16 model?
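
One common PyTorch pattern, sketched under the assumption that the .pth file holds a state_dict (a plain dict) rather than a full module; build_vgg16_variant is a hypothetical stand-in for reconstructing the modified architecture:

import torch

model = build_vgg16_variant()                         # hypothetical: rebuild the same architecture
model.load_state_dict(torch.load('vgg16-model.pth'))  # populate the weights from the dict
torch.save(model, 'vgg16-module.pth')                 # save a full module that crypten.load can handle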

Some problems with the Feature Aggregation scenario

  1. If I have two parties training a model, how can I predict on new data? Concatenate them and then predict?
  2. Two parties encrypting the same data will get the same result, right? Can it be different in the Feature Aggregation scenario?

Loading each party's data across machines.

Hi,

Let's say Alice's data is already stored at '/tmp/alice_data.pth' on machine 1 and Bob's data at '/tmp/bob_data.pth' on machine 2. Assuming Alice is rank 0 and Bob is rank 1, the following code fails for Alice on the enc_bob_data line:

rank = os.getenv('RANK')
print('rank ' + rank)

crypten.init()
torch.set_num_threads(1)

num_features = 2

torch.manual_seed(1)

alice_data = torch.randn(1, num_features)
print(alice_data)

ALICE = 0
BOB = 1

filenames = {
    "alice_data": "/tmp/alice_data.pth",
    "bob_data": "/tmp/bob_data.pth",
}

crypten.save(alice_data, filenames["alice_data"], src=ALICE)

@mpc.run_multiprocess(world_size=2)
def joint_compute():

    if comm.get().get_rank() == 0:
        enc_alice_data = crypten.load(filenames["alice_data"], src=ALICE)
    else:
        enc_bob_data = crypten.load(filenames["bob_data"], src=BOB)

    enc_total = enc_alice_data + enc_bob_data
    dec_out = enc_total.get_plain_text()
    print(dec_out)

joint_compute()

The 2 machines/parties are joined by:

INIT_METHOD = 'tcp://xxxx:23456'
current_env["RENDEZVOUS"] = INIT_METHOD

Any idea? Thanks.
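
For comparison, here is a sketch of the collective pattern I would expect to work (my assumption: every rank calls load for both files, and the src party secret-shares what it reads, so both variables are defined on all ranks):

@mpc.run_multiprocess(world_size=2)
def joint_compute():
    # both ranks participate in both loads; the src rank reads the file and shares it
    enc_alice_data = crypten.load(filenames["alice_data"], src=ALICE)
    enc_bob_data = crypten.load(filenames["bob_data"], src=BOB)
    enc_total = enc_alice_data + enc_bob_data
    print(enc_total.get_plain_text())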

Support for Other Modules

Hi,

I was wondering about support for more modules such as max, min, or random tensors while constructing a neural network. Would these features be easily extendable or is there some inherent limitation? I would assume since ReLU and Dropout are supported, that functions like the ones mentioned should be possible, but I wasn't sure what the best approach would be.

Thank you!

Timeline for GPU Support

I noticed after some troubleshooting that there is no GPU support, as indicated in the README. When do you plan to support this functionality? Thanks!

mpc.run_multiprocess decorated functions can't return a torch tensor

Returning a torch.Tensor from a run_multiprocess-decorated function throws ConnectionRefusedError: [Errno 111] Connection refused. I'm not sure why this happens with torch tensors and not other Python objects; I guess PyTorch requires the processes to still be alive to fetch tensors. Might be related to this.

Workspace information

$ python collect_env.py 
Collecting environment information...
PyTorch version: 1.3.0
Is debug build: No
CUDA used to build PyTorch: 10.1.243

OS: Manjaro Linux
GCC version: (GCC) 9.2.0
CMake version: version 3.16.2

Python version: 3.7
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip3] numpy==1.18.0
[pip3] torch==1.3.1
[conda] torch                     1.3.0                    pypi_0    pypi
[conda] torchvision               0.4.1                    pypi_0    pypi
$
$ pip freeze
absl-py==0.9.0
astor==0.8.1
attrs==19.3.0
backcall==0.1.0
bleach==3.1.0
certifi==2019.11.28
chardet==3.0.4
Click==7.0
-e git+https://github.com/facebookresearch/CrypTen.git@693dd6cc9918e982963e6acc186215d0f0769080#egg=crypten
decorator==4.4.1
defusedxml==0.6.0
entrypoints==0.3
Flask==1.1.1
Flask-SocketIO==4.2.1
future==0.18.2
gast==0.2.2
google-pasta==0.1.8
grpcio==1.26.0
h5py==2.10.0
idna==2.8
importlib-metadata==1.3.0
ipykernel==5.1.3
ipython==7.10.2
ipython-genutils==0.2.0
ipywidgets==7.5.1
itsdangerous==1.1.0
jedi==0.15.1
Jinja2==2.10.3
joblib==0.14.1
jsonschema==3.2.0
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==6.0.0
jupyter-core==4.6.1
Keras-Applications==1.0.8
Keras-Preprocessing==1.1.0
lz4==2.2.1
Markdown==3.1.1
MarkupSafe==1.1.1
mistune==0.8.4
more-itertools==8.0.2
msgpack==0.6.2
nbconvert==5.6.1
nbformat==4.4.0
notebook==6.0.2
numpy==1.18.0
onnx==1.6.0
opt-einsum==3.1.0
pandocfilters==1.4.2
parso==0.5.2
pexpect==4.7.0
phe==1.4.0
pickleshare==0.7.5
Pillow==6.2.1
prometheus-client==0.7.1
prompt-toolkit==2.0.9
protobuf==3.11.2
ptyprocess==0.6.0
Pygments==2.5.2
pyrsistent==0.15.6
python-dateutil==2.8.1
python-engineio==3.11.1
python-socketio==4.4.0
PyYAML==5.2
pyzmq==18.1.0
qtconsole==4.6.0
requests==2.22.0
scikit-learn==0.22
scipy==1.4.1
Send2Trash==1.5.0
six==1.13.0
sklearn==0.0
-e [email protected]:youben11/PySyft.git@477be2e677127b924d5968bcb51fe87e3017912a#egg=syft
syft-proto==0.1.0a1.post36
tblib==1.6.0
tensorboard==1.15.0
tensorflow==1.15.0
tensorflow-estimator==1.15.1
termcolor==1.1.0
terminado==0.8.3
testpath==0.4.4
tf-encrypted==0.5.9
torch==1.3.0
torchvision==0.4.1
tornado==6.0.3
traitlets==4.3.3
typing-extensions==3.7.4.1
urllib3==1.25.7
wcwidth==0.1.7
webencodings==0.5.1
websocket-client==0.56.0
websockets==8.1
Werkzeug==0.16.0
widgetsnbextension==3.5.1
wrapt==1.11.2
zipp==0.6.0
zstd==1.4.3.2

[Question] Training on private data

First, I would like to thank you for the fantastic and well-structured tutorials.
I have a question about a security use case, but I don't know the technical term for it.

Alice wants to train a simple neural network on MNIST data,
but Bob has all the labels needed for training, and he will not share them.
Can Alice train her model on the encrypted data from Bob, and how?
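
This looks like the "data labeling" scenario from Tutorial 3. A minimal sketch, assuming Alice is rank 0 and Bob is rank 1, and that the values passed by the non-source rank are mere placeholders:

import torch
import crypten
import crypten.mpc as mpc

ALICE, BOB = 0, 1  # assumed rank assignment

@mpc.run_multiprocess(world_size=2)
def encrypt_features_and_labels():
    # each party encrypts the piece of data it holds; only the src rank's values are used
    x_enc = crypten.cryptensor(torch.rand(100, 784), src=ALICE)  # Alice's MNIST images
    y_enc = crypten.cryptensor(torch.zeros(100, 10), src=BOB)    # Bob's one-hot labels
    # training then proceeds on x_enc and y_enc, as in Tutorial 7

crypten.init()
encrypt_features_and_labels()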

Tutorial 7 doesn't run because of AutogradCryptensor

The current version of CrypTen doesn't allow using AutogradCryptensor directly; instead, it suggests using the requires_grad parameter with the cryptensor constructor. I updated Tutorial 7 as a way to test this new functionality, but cell 8 throws an error:

/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/site-packages/torch/storage.py:34: FutureWarning: pickle support for Storage will be removed in 1.5. Use `torch.save` instead
  warnings.warn("pickle support for Storage will be removed in 1.5. Use `torch.save` instead", FutureWarning)
/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/site-packages/torch/storage.py:34: FutureWarning: pickle support for Storage will be removed in 1.5. Use `torch.save` instead
  warnings.warn("pickle support for Storage will be removed in 1.5. Use `torch.save` instead", FutureWarning)
Epoch 0 in progress:
Process Process-3:
Traceback (most recent call last):
  File "/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/context.py", line 30, in _launch
    return_value = func(*func_args, **func_kwargs)
  File "<stdin>", line 35, in run_encrypted_training
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 42, in __call__
    return self.forward(*args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 373, in forward_function
    return object.__getattribute__(self, name)(*tuple(args), **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 494, in forward
    output = self._modules[node_to_compute](input)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 42, in __call__
    return self.forward(*args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 373, in forward_function
    return object.__getattribute__(self, name)(*tuple(args), **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 1453, in forward
    x = x.conv2d(self.weight, stride=self.stride, padding=self.padding)
  File "/home/youben/git-repo/CrypTen/crypten/cryptensor.py", line 261, in autograd_forward
    result = grad_fn.forward(ctx, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/gradients.py", line 1358, in forward
    return input.conv2d(kernel, padding=padding, stride=stride)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/mpc.py", line 41, in convert_wrapper
    return func(result, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/mpc.py", line 1230, in ob_wrapper_function
    result._tensor = getattr(result._tensor, name)(value, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/arithmetic.py", line 416, in conv2d
    return self._arithmetic_function(kernel, "conv2d", **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/arithmetic.py", line 254, in _arithmetic_function
    result, y, *args, **kwargs
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/beaver.py", line 62, in conv2d
    return __beaver_protocol("conv2d", x, y, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/beaver.py", line 37, in __beaver_protocol
    eps_del = ArithmeticSharedTensor.reveal_batch([x - a, y - b])
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/arithmetic.py", line 190, in reveal_batch
    return comm.get().all_reduce(shares, batched=True)
  File "/home/youben/git-repo/CrypTen/crypten/communicator/communicator.py", line 160, in logging_wrapper
    return func(self, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/communicator/distributed_communicator.py", line 181, in all_reduce
    dist.all_reduce(tensor, op=op, group=self.main_group, async_op=True)
  File "/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 919, in all_reduce
    _check_single_tensor(tensor, "tensor")
  File "/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 233, in _check_single_tensor
    "to be of type torch.Tensor.".format(param_name))
RuntimeError: Invalid function argument. Expected parameter `tensor` to be of type torch.Tensor.
Process Process-4:
Traceback (most recent call last):
  File "/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/home/youben/anaconda3/envs/om-dev-nightly/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/context.py", line 30, in _launch
    return_value = func(*func_args, **func_kwargs)
  File "<stdin>", line 35, in run_encrypted_training
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 42, in __call__
    return self.forward(*args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 373, in forward_function
    return object.__getattribute__(self, name)(*tuple(args), **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 494, in forward
    output = self._modules[node_to_compute](input)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 42, in __call__
    return self.forward(*args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 373, in forward_function
    return object.__getattribute__(self, name)(*tuple(args), **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/nn/module.py", line 1453, in forward
    x = x.conv2d(self.weight, stride=self.stride, padding=self.padding)
  File "/home/youben/git-repo/CrypTen/crypten/cryptensor.py", line 261, in autograd_forward
    result = grad_fn.forward(ctx, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/gradients.py", line 1358, in forward
    return input.conv2d(kernel, padding=padding, stride=stride)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/mpc.py", line 41, in convert_wrapper
    return func(result, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/mpc.py", line 1230, in ob_wrapper_function
    result._tensor = getattr(result._tensor, name)(value, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/arithmetic.py", line 416, in conv2d
    return self._arithmetic_function(kernel, "conv2d", **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/arithmetic.py", line 254, in _arithmetic_function
    result, y, *args, **kwargs
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/beaver.py", line 62, in conv2d
    return __beaver_protocol("conv2d", x, y, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/beaver.py", line 37, in __beaver_protocol
    eps_del = ArithmeticSharedTensor.reveal_batch([x - a, y - b])
  File "/home/youben/git-repo/CrypTen/crypten/mpc/primitives/arithmetic.py", line 190, in reveal_batch
    return comm.get().all_reduce(shares, batched=True)
  File "/home/youben/git-repo/CrypTen/crypten/communicator/communicator.py", line 160, in logging_wrapper
    return func(self, *args, **kwargs)
  File "/home/youben/git-repo/CrypTen/crypten/communicator/distributed_communicator.py", line 184, in all_reduce
    req.wait()
RuntimeError: [/opt/conda/conda-bld/pytorch_1585984172566/work/third_party/gloo/gloo/transport/tcp/pair.cc:575] Connection closed by peer [127.0.1.1]:23453
ERROR:root:One of the parties failed. Check past logs

Issues with mathematical operations

Hello, I am having issues with the math operations. My code uses small numbers, and I get incorrect results when I multiply, divide, square, or take the square root of them. I tried changing the precision, but that did not help. I then tested simple examples, and it seems these operations give incorrect results for very high or very low numbers.
I also noticed that raising the precision makes products and divisions give wrong outputs, even for integers.
Would there be a good way to avoid these issues?

Issue with division operation

Hello, I am having issues with the division operation. I get incorrect results when I do division in my code. I ran some tests and found that when the number is quite large the results are wrong, as shown in the attached figure. I want to know how I can solve this problem.

Unable to install crypten on VM

I can successfully install the package on my Ubuntu laptop, but not on a GCP VM.

In a new GCP Compute Engine VM with Debian 10, I create a virtualenv with virtualenv venv_test -p python3.

Python version is 3.7.3, pip is 20.3.1. I can run pip install crypten and it downloads all the packages, completing successfully. But when I run pip freeze I see none of those packages. The whl file's name is crypten-0.1-py3-none-any.whl. I also tried installing it with python -m pip install crypten and pip install git+https://github.com/facebookresearch/.

I haven't ever seen pip give a silent failure like this before, and I'm not sure if it's CrypTen or a problem with something in the VM. I tried downgrading to pip 20.4.2, and also tried creating an Ubuntu 20.04 VM, and had the same thing happen. The same thing happens when I pip install -r requirements.txt for CrypTen's dependencies: none of them show up in either the virtualenv lib folder or the /usr/local/lib/python3.8 folder. It's hard to know what to do, since I'm not getting any errors. Any advice on what I can check to figure out what's going on?

Module 'crypten' has no attribute 'print'

AttributeError                            Traceback (most recent call last)
<ipython-input-5-b53a8eee043c> in <module>
      9 z_enc1 = x_enc + y      # Public
     10 z_enc2 = x_enc + y_enc  # Private
---> 11 crypten.print("\nPublic  addition:", z_enc1.get_plain_text())
     12 crypten.print("Private addition:", z_enc2.get_plain_text())
     13 

AttributeError: module 'crypten' has no attribute 'print'

I tried running the tutorial notebook and got this error. Is this a problem on my system, or has CrypTen removed the print feature? If it's the latter, I would be happy to refactor the notebooks accordingly and submit a PR.

Maybe also add the notebooks to the tests, to ensure they stay in sync with codebase changes?

How does secret sharing work with one source

Hello
in the following example, x is encrypted even though we have only one source. How is that possible? Don't we need at least two sources? If not, how does it work?

x = torch.tensor([1.0, 2.0, 3.0])

x_enc = crypten.cryptensor(x)

thank you in advance
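
For intuition, here is a plain-Python sketch of additive secret sharing with a single data owner (illustrative only, not CrypTen's actual internals): the source samples the randomness itself, so one party suffices to create shares for everyone.

import random

RING = 2 ** 64
x = 3                                  # the source's secret value
share_other = random.randrange(RING)   # random share handed to the other party
share_src = (x - share_other) % RING   # the source keeps the complement
assert (share_src + share_other) % RING == x  # neither share alone reveals x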

Question: understanding fixed-point division

I'm wondering if you could give me a high-level description of what the following code is doing in crypten/mpc/primitives/arithmetic.py:324 ... I'm especially confused about why it only applies when there are 3 or more players.

Thanks in advance,
Tim

        if isinstance(y, int) or is_int_tensor(y):
            # Truncate protocol for dividing by public integers:
            if comm.get().get_world_size() > 2:
                wraps = self.wraps()
                self.share /= y
                # NOTE: The multiplication here must be split into two parts
                # to avoid long out-of-bounds when y <= 2 since (2 ** 63) is
                # larger than the largest long integer.
                self -= wraps * 4 * (int(2 ** 62) // y)
            else:
                self.share /= y
            return self
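
For intuition, a plain-Python sketch of the wrap problem that the correction above accounts for (illustrative values, not CrypTen code): when the sum of the shares wraps past 2**64, dividing each share locally is off by roughly wraps * (2**64 // y).

RING = 2 ** 64
x, y = 10, 2
share0 = 2 ** 63 + 5
share1 = (x - share0) % RING           # shares of x whose sum wraps once past RING
assert (share0 + share1) % RING == x

naive = (share0 // y + share1 // y) % RING
print(naive - x // y)                  # off by roughly RING // y, hence the correction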

KeyError: 'weight' when using resnet18 following mpc_cifar.py code

I run into this error when I try using CrypTen with a resnet18 model in my code:

KeyError                                  Traceback (most recent call last)
<ipython-input-11-8baf0b8ce013> in <module>
     16 
     17             input_size = get_input_size(val_loader, batch_size)
---> 18             private_model = construct_private_model(input_size, hosp_models[i])
     19             validate(loader=val_loader, model=private_model)
     20             print('\n')

<ipython-input-5-19f29188bd4f> in construct_private_model(input_size, model)
     15     else:
     16         model_upd = LeNet()
---> 17     private_model = crypten.nn.from_pytorch(model_upd, dummy_input).encrypt(src=0)
     18     return private_model
     19 

~/Downloads/mila project/udacity_DP_FL/CrypTen/crypten/nn/onnx_converter.py in from_pytorch(pytorch_model, dummy_input)
     35     # construct CrypTen model:
     36     f = _from_pytorch_to_bytes(pytorch_model, dummy_input)
---> 37     crypten_model = from_onnx(f)
     38     f.close()
     39 

~/Downloads/mila project/udacity_DP_FL/CrypTen/crypten/nn/onnx_converter.py in from_onnx(onnx_string_or_file)
    122     """Converts an onnx model to a CrypTen model"""
    123     converter = FromOnnx(onnx_string_or_file)
--> 124     crypten_model = converter.to_crypten()
    125     return crypten_model
    126 

~/Downloads/mila project/udacity_DP_FL/CrypTen/crypten/nn/onnx_converter.py in to_crypten(self)
    184             # add CrypTen module to graph
    185             crypten_module = crypten_class.from_onnx(
--> 186                 parameters=parameters, attributes=attributes
    187             )
    188             node_output_name = list(node.output)[0]

~/Downloads/mila project/udacity_DP_FL/CrypTen/crypten/nn/module.py in from_onnx(parameters, attributes)
   1584 
   1585         # initialize module:
-> 1586         in_channels = parameters["weight"].size(1)
   1587         out_channels = parameters["weight"].size(0)
   1588         module = Conv2d(

KeyError: 'weight'

Tutorial for running crypten in multiple hosts

Hi guys,

Is there a tutorial for running the CrypTen examples on multiple real hosts instead of simulated processes? I want to compare the performance with TFE under bandwidth and latency constraints.

Add a shape argument to crypten.load

Feature

Add a shape argument to crypten.load so that dim and size are set accordingly if known in advance. This will reduce communication between parties in case we know the shape of private tensors.

Additional context

Parties exchange dimensions of tensors in order to create the shares. We can get rid of this communication process if we know this information in advance.
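
A hypothetical usage sketch of the proposed argument (the shape keyword does not exist in crypten.load today; purely illustrative):

x_enc = crypten.load('/tmp/alice_data.pth', src=0, shape=(100, 2))  # hypothetical keyword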

No grad after backward

Hi,

I have been looking at customized CrypTen modules, but the backward pass seems to fail easily. Could you check whether this is an issue?

x = torch.rand(2, 4)
x_enc = crypten.cryptensor(x, requires_grad=True)

# y_enc = x_enc    # This line gives the grad correctly
y_enc = x_enc[:]   # The question is how to allow indexing & slicing operations to have gradients?

grad = torch.rand_like(y_enc.get_plain_text())
grad_enc = crypten.cryptensor(grad)
y_enc.backward(grad_enc)
print(x_enc.grad.get_plain_text())

Also, I have been seeing ValueError: CrypTen does not support op Mul. quite often when calling crypten.nn.from_pytorch. May I ask why op Mul is not supported?

Thanks!

Issue running Tutorial 7

I encounter the following error when running cell 8: RuntimeError: result type Float can't be cast to the desired output type Long

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-10-882a0accf550> in <module>
     12 
     13     # forward pass
---> 14     output = model(x_train)
     15     loss_value = loss(output, y_train)
     16 

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/nn/module.py in __call__(self, input)
     53 
     54     def __call__(self, input):
---> 55         return self.forward(input)
     56 
     57     def train(self, mode=True):

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/nn/module.py in wrapped_forward(*args)
     48                 """Forward function that wraps CrypTensors in AutogradCrypTensor."""
     49                 args = _to_autograd(args)
---> 50                 return object.__getattribute__(self, "forward")(*args)
     51 
     52             return wrapped_forward

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/nn/module.py in forward(self, input)
    324             if len(input) == 1:
    325                 input = input[0]  # unpack iterable if possible
--> 326             output = self._modules[node_to_compute](input)
    327             values[node_to_compute] = output
    328             _mark_as_computed(node_to_compute)

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/nn/module.py in __call__(self, input)
     53 
     54     def __call__(self, input):
---> 55         return self.forward(input)
     56 
     57     def train(self, mode=True):

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/nn/module.py in wrapped_forward(*args)
     48                 """Forward function that wraps CrypTensors in AutogradCrypTensor."""
     49                 args = _to_autograd(args)
---> 50                 return object.__getattribute__(self, "forward")(*args)
     51 
     52             return wrapped_forward

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/nn/module.py in forward(self, x)
    991 
    992     def forward(self, x):
--> 993         x = x.conv2d(self.weight, stride=self.stride, padding=self.padding)
    994         if hasattr(self, "bias"):
    995             x = x.add(self.bias.unsqueeze(-1).unsqueeze(-1))

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/autograd_cryptensor.py in autograd_forward(*args, **kwargs)
    206 
    207                 # apply correct autograd function:
--> 208                 result = grad_fn.forward(ctx, inputs, **kwargs)
    209                 if not isinstance(result, tuple):  # output may be tensor or tuple
    210                     result = (result,)

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/gradients.py in forward(ctx, input, padding, stride)
   1194             padding = (padding, padding)
   1195         ctx.save_multiple_for_backward((input, kernel, padding, stride))
-> 1196         return input.conv2d(kernel, padding=padding, stride=stride)
   1197 
   1198     @staticmethod

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/mpc/mpc.py in convert_wrapper(self, *args, **kwargs)
     40             def convert_wrapper(self, *args, **kwargs):
     41                 result = self.to(ptype)
---> 42                 return func(result, *args, **kwargs)
     43 
     44             return convert_wrapper

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/mpc/mpc.py in ob_wrapper_function(self, value, *args, **kwargs)
   1141         if isinstance(value, CrypTensor):
   1142             value = value._tensor
-> 1143         result._tensor = getattr(result._tensor, name)(value, *args, **kwargs)
   1144         return result
   1145 

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/mpc/primitives/arithmetic.py in conv2d(self, kernel, **kwargs)
    388     def conv2d(self, kernel, **kwargs):
    389         """Perform a 2D convolution using the given kernel"""
--> 390         return self._arithmetic_function(kernel, "conv2d", **kwargs)
    391 
    392     def conv_transpose2d(self, kernel, **kwargs):

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/mpc/primitives/arithmetic.py in _arithmetic_function(self, y, op, inplace, *args, **kwargs)
    244             else:  # scale by larger of self.encoder.scale and y.encoder.scale
    245                 if self.encoder.scale > 1 and y.encoder.scale > 1:
--> 246                     return result.div_(result.encoder.scale)
    247                 elif self.encoder.scale > 1:
    248                     result.encoder = self.encoder

~/anaconda3/envs/pytorch/lib/python3.7/site-packages/crypten/mpc/primitives/arithmetic.py in div_(self, y)
    310                 self -= wraps * 4 * (int(2 ** 62) // y)
    311             else:
--> 312                 self.share /= y
    313             return self
    314 

RuntimeError: result type Float can't be cast to the desired output type Long

semi-honest or dishonest majority?

I found the paper link in the CrypTen documentation, as follows.

For technical details see Damgard et al. 2012 and Beaver 1991 outlining the Beaver protocol used in our implementation.

SPDZ (Damgard et al. 2012) assumes a dishonest majority. Is CrypTen semi-honest or dishonest-majority?

CrypTen Communication costs question

Hi,

Apologies if this is a trivial question. I am using CrypTen to perform private inference in a 2-party setting and I wanted to examine the communication costs. I have increased the logging level to INFO and I am using the print_communication_stats method to do this. It seems to be working as I am getting the following output:

INFO:root:====Communication Stats====
INFO:root:====Communication Stats====
INFO:root:Rounds: 37
INFO:root:Bytes : 8289040
INFO:root:Rounds: 37
INFO:root:Bytes : 8289040

I believe it's reported twice because I have a world size of 2, so each process prints the stats. However, when it comes to interpreting this, I am slightly unsure about one thing.

Is the total communication cost 8289040 + 8289040 = 16,578,080 bytes, or is each process already reporting the total communication cost between the parties (8,289,040 bytes)?

Thanks a lot in advance!
Veneta

How does get_plain_text work

Hi!
I had a doubt about how get_plain_text() works. Does it involve a round of communication between all the processes in the process pool? If not, wouldn't it be a security flaw that a user could just call that function to get the decrypted value? Also, I wasn't able to understand how the multiply operation works, specifically how many rounds of communication between parties it requires.

Is it possible to encrypt a tensor with rank=0 but distribute the shares only to rank=1 and rank=2?

I'm thinking about situations like this.
Suppose I want to calculate 6 + 10 but I don't have enough computational capacity, so I want to use two remote PCs to calculate this value. However, I cannot fully trust these remote PCs, so I want to encrypt 6 and 10 by splitting each value, say, 6 = 2 + 4 and 10 = 3 + 7. I then distribute these values to the remote PCs (in this case, remote PC1 gets 2 & 3, and remote PC2 gets 4 & 7), and they add the shares they receive: remote PC1 calculates 2 + 3 = 5, remote PC2 calculates 4 + 7 = 11. Finally, I get the target value by adding these two results (5 + 11 = 16).
From my understanding, the code will be something like below.
ALICE = 0; BOB = 1; CAROL = 2  # in the example above, I = ALICE, remote PC1 = BOB, remote PC2 = CAROL

@mpc.run_multiprocess(world_size=3)
def simple_addition():
    a = crypten.load_from_party(filenames["a"], src=ALICE)  # this file contains 6
    b = crypten.load_from_party(filenames["b"], src=ALICE)  # this file contains 10
    crypten.print((a + b).get_plain_text())

The problem is that ALICE doesn't have the computational capacity to carry out the addition, but she has to join the calculation if I use the code above.
Is there a way to encrypt a & b with src=ALICE but distribute the shares only to BOB and CAROL, so that ALICE doesn't have to carry out the addition?

Encrypt / Decrypt data in javascript?

Feature

I would like to use CrypTen with JavaScript applications (Node.js, Vue.js, etc.). It would be nice to have JavaScript encryption/decryption: if I want to consume an encrypted web API, I cannot do it without a Python proxy, which adds complexity to real-world deployment. Do you think it would be possible to add a JavaScript encryptor/decryptor?

Support for bmm operation

Hi, I am wondering whether CrypTen supports the bmm operation as in PyTorch; I have searched for a while but did not find it. Thanks!
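
An untested workaround sketch, assuming CrypTen's matmul follows torch.matmul's batched semantics on 3-D tensors (which I have not verified):

import torch
import crypten

crypten.init()

a = crypten.cryptensor(torch.randn(4, 2, 3))
b = crypten.cryptensor(torch.randn(4, 3, 5))
c = a.matmul(b)  # would match torch.bmm(a, b) if batched matmul is supported
print(c.get_plain_text().shape)  # expected: torch.Size([4, 2, 5])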

[Bug] BCEWithLogitsLoss Import Error

Hi there,

The docstring for BCEWithLogitsLoss contains escape sequences (\t and \r) that prevent it from being imported properly. I believe converting the docstring to a raw string should fix the issue.
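
A quick illustration of the difference (hypothetical docstrings, not the actual CrypTen source):

def regular():
    """here \t and \r are interpreted as TAB and CR characters"""

def raw():
    r"""here \t and \r stay as literal backslash sequences"""

print(repr(regular.__doc__))  # escapes already applied
print(repr(raw.__doc__))      # backslashes preserved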


crypten.nn.from_pytorch throwing a TypeError

This commit introduces the enable_onnx_checker argument in the call to torch.onnx.export, which I couldn't find in the stable documentation. The call to crypten.nn.from_pytorch therefore throws a TypeError: export() got an unexpected keyword argument 'enable_onnx_checker'.

Tutorial Training SMPC on GPU with multiple workers

Hi there! Thanks for adding GPU support. Could you add a tutorial or two on developing an SMPC training algorithm that utilizes CUDA?

I seem to be having difficulty getting such an algorithm up and running. So far, when I spawn multiple workers to load the private datasets, I can't load my model to CUDA because it is in a forked subprocess (e.g. pytorch/pytorch#40403). I tried to circumvent this using https://pytorch.org/docs/stable/notes/multiprocessing.html#cuda-in-multiprocessing, but this seems to cause pickling errors.
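
The spawn-based pattern from the linked notes that I attempted looks roughly like this (a sketch only; the distributed/rendezvous setup for crypten.init() is omitted):

import torch.multiprocessing as mp

def worker(rank):
    import crypten
    crypten.init()  # assumes the rendezvous environment is configured for this rank
    # ... load this party's private dataset, move model/tensors to cuda, train ...

if __name__ == "__main__":
    # 'fork' cannot re-initialize CUDA in child processes; spawn() avoids that
    mp.spawn(worker, nprocs=2)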

Thanks!

OpenMined/PySyft#4561

Support for Different Activation Functions?

I was wondering what support there is for different activation functions. I saw that ReLU is used in all the examples, but when I tried to run an example with sigmoid, I got errors saying it is not supported. I just wanted to confirm whether other activation functions are unsupported or whether it's something on my end.
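
The quick probe I'm running looks like this (a sketch; my actual model is omitted):

import torch
import crypten

crypten.init()

x = crypten.cryptensor(torch.randn(5))
print(x.relu().get_plain_text())     # works, as in the examples
print(x.sigmoid().get_plain_text())  # raises here if sigmoid is unsupported in this setting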

Thank you!

rounding error in division operation

I tried arithmetic operations, referring to the tutorials.

import crypten
crypten.init()

x_enc = crypten.cryptensor([1.0, 2.0, 3.0])
y = 3
y_enc = crypten.cryptensor(y)

z_enc1 = x_enc / y      # division by a public value
z_enc2 = x_enc / y_enc  # division by an encrypted value
print(z_enc1.get_plain_text())
print(z_enc2.get_plain_text())
print(z_enc1.get_plain_text() == z_enc2.get_plain_text())

When dividing by 3, the results of the public and private calculations differ.
Is this the correct behavior?

tensor([0.3333, 0.6667, 1.0000])
tensor([0.3334, 0.6667, 1.0001])
tensor([False, False, False])
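
For what it's worth, my rough understanding is that small discrepancies like this come from the fixed-point encoding plus the approximate private division; a plain-Python illustration of the encoding granularity (assuming a default 16-bit fractional precision):

SCALE = 2 ** 16  # assumed fixed-point scale

x = 1.0
encoded = round(x * SCALE)         # 65536
print(encoded / SCALE / 3)         # float division of the decoded value: 0.3333...
print(round(encoded / 3) / SCALE)  # fixed-point result rounds to a multiple of 2**-16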

RuntimeError: result type Float can't be cast to the desired output type Long

While I was running Tutorial_4 in Jupyter (https://github.com/facebookresearch/CrypTen/blob/master/tutorials/Tutorial_4_Classification_with_Encrypted_Neural_Networks.ipynb), everything was fine until the "Classifying Encrypted Data with Encrypted Model" part.

import crypten.mpc as mpc
import crypten.communicator as comm

labels = torch.load('/tmp/bob_test_labels.pth').long()
count = 100 # For illustration purposes, we'll use only 100 samples for classification

@mpc.run_multiprocess(world_size=2)
def encrypt_model_and_data():
    # Load pre-trained model to Alice
    model = crypten.load('models/tutorial4_alice_model.pth', dummy_model=dummy_model, src=ALICE)
    
    # Encrypt model from Alice 
    dummy_input = torch.empty((1, 784))
    private_model = crypten.nn.from_pytorch(model, dummy_input)
    private_model.encrypt(src=ALICE)
    
    # Load data to Bob
    data_enc = crypten.load('/tmp/bob_test.pth', src=BOB)
    data_enc2 = data_enc[:count]
    data_flatten = data_enc2.flatten(start_dim=1)

    # Classify the encrypted data
    private_model.eval()
    output_enc = private_model(data_flatten)
    
    # Compute the accuracy
    output = output_enc.get_plain_text()
    accuracy = compute_accuracy(output, labels[:count])
    print("\tAccuracy: {0:.4f}".format(accuracy.item()))
    
encrypt_model_and_data()

When I ran that part, an error occurred: RuntimeError: result type Float can't be cast to the desired output type Long

Environmental Setup:
Conda 4.9.2
Cudatoolkit 11.0.221
Jupyter 1.0.0
Crypten 0.1

Steps to reproduce the error:
Run Tutorial_4 in Jupyter; the error occurs in the "Classifying Encrypted Data with Encrypted Model" part.

The attached documents are the error log, the conda environment list, and the HTML export of the Jupyter notebook with the code.
error.txt
condalist.txt
Tutorial_4_Classification_with_Encrypted_Neural_Networks.zip

OS: Ubuntu 18.04
Display Card: 1070Ti

It would be great to get help or to learn which part(s) went wrong. Thanks!

Getting Started with SMPC (Error!)

I am a newbie to CrypTen and SMPC, and I was trying to play around with the library.

import crypten
import torch

crypten.init()
torch.set_num_threads(1)

import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms
from torchvision.models import resnet18
from torch import nn

transform = transforms.ToTensor()
trainset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

model = resnet18(num_classes=10)  # MNIST has 10 classes
encrypted_model = crypten.nn.from_pytorch(model, trainset[0])

I get the following error
RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended.

How to debug this?
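
For comparison, the tutorial-style call to crypten.nn.from_pytorch passes a dummy input tensor rather than a dataset element (trainset[0] is an (image, label) tuple). A hypothetical first step, continuing from the snippet above, with the shape an assumption for resnet18's default 3-channel stem:

dummy_input = torch.empty(1, 3, 224, 224)  # a tensor, not an (image, label) tuple
encrypted_model = crypten.nn.from_pytorch(model, dummy_input)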

Create a store for pre-process data

Feature

Create a store where pre-processed data is kept (required for the online computation phase).

For example, the Beaver triples could be kept there if we know which parties might do SMPC in the near future.

In this way, we speed up the online phase of operations like mul on additively secret-shared values.
I think this can be seen as a separation between the online phase (where computation is done on shared data) and the offline phase (where the data required for the computation is generated).

Why do this?

In a production-like environment, it might be more critical to run the online phase at a faster rate.

Regardless, I also think that generating the pre-processed data in the offline phase might take some time (potentially a lot for operations like conv and mul that use the Beaver triples).

When to generate this data?

  • This data can be generated when the parties are in an "idle" state (not doing computation) -- a specific threshold could be used here (might require more investigation).
  • In an ideal scenario, we could have "pairs" for each party (P1X, P1Y), where P1X is responsible only for the offline phase (generating the pre-processed data required in the online phase) and P1Y is responsible for the online computation.
    • To be taken into consideration:
      • How much data needs to be exchanged between the parties in both the offline and online phases (bandwidth limitations)
      • How many resources are required to generate the offline data for different operations
      • There are surely more scenarios to take into consideration

Alternatives

The alternative is to generate the Beaver triples (or any other data that could be produced in the pre-processing step) in the online phase, when they are required; this introduces overhead because you have to generate/share them on demand (I think this is what happens at the moment).

I think a good idea is to fall back to generating data in the online phase when the store does not have sufficient data to perform the computation.

Plan

  • Create a class for the store (a rough interface sketch follows this list)
    Possible interface
    • retrieve/get(type_preprocess, shape)
    • store/add(type_preprocess, data)
  • Tests (written while implementing the class)
  • Benchmark simple training (measure how long the offline phase and the online phase each take)
    • offline_phase + online_phase should be (nearly) identical to running CrypTen without the store
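
A rough sketch of the proposed interface (all names hypothetical, not part of the CrypTen API):

from collections import defaultdict, deque

class PreprocessStore:
    """Holds offline-phase data (e.g. Beaver triples) keyed by type and shape."""

    def __init__(self):
        self._store = defaultdict(deque)

    def add(self, type_preprocess, shape, data):
        # store pre-processed data generated during an idle/offline period
        self._store[(type_preprocess, tuple(shape))].append(data)

    def get(self, type_preprocess, shape):
        # retrieve precomputed data; None signals the caller to fall back
        # to generating it in the online phase
        queue = self._store[(type_preprocess, tuple(shape))]
        return queue.popleft() if queue else None

A real implementation would also need the fall-back path described under Alternatives whenever get returns None.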

Fully Homomorphic Encryption

Feature

The ability to fully homomorphically encrypt tensors; to store, serialize, and reload the encrypted data; to decrypt the data (in particular for databases); and to compute operations using GPU parallelisation or in a distributed manner.

Alternatives

Secure multiparty computation and differential privacy. However, neither of these fulfills the need for trustless systems, for systems that involve a non-tech-savvy but sensitive data source, or for the long-term availability of encrypted data without risk to the data owner(s).

Additional context

There are multiple FHE libraries out there, such as HElib and MS-SEAL. However, they all require rather hacky integration with either PyTorch or TensorFlow (although TensorFlow 1 is the only one I have seen working with FHE thus far). I read that there was an aspiration to eventually integrate FHE into CrypTen; is this still the case? I just wanted to formalize an issue for FHE, because it is something I would be very interested in if it were implemented. That being said, I am more than happy to help out with pull requests if the opportunity arises, as it would be great to see this here.
