twosixlabs / armory
ARMORY Adversarial Robustness Evaluation Test Bed
License: MIT License
Not sure if we'll be able to get this ready before kickoff. Inception-ResNet-v2? It'd be nice to have an example for the demonstration.
It looks unlikely we'll be able to get the name armory. Let me know if there are any thoughts/suggestions:
armory-evaluation
armory-evaluator
armory-testbed
This would hopefully enable us to test different TensorFlow versions more quickly, for instance.
PM wants to focus early on more realistic environments.
These will require simulation of specific environments (such as impulse response for audio and rendering for images/video) to be tractable.
The goal for this issue is to set up an overall structure (probably with an initial identity function) that can be used for various downstream experiments.
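One possible shape for that structure, with every name below assumed rather than taken from armory: a transform base class whose default behavior is the identity, which later environment simulations (impulse response for audio, rendering for images/video) can subclass.

```python
class EnvironmentTransform:
    """Base class for simulated environments (all names hypothetical)."""

    def apply(self, sample):
        # identity by default; subclasses add the actual simulation
        return sample


class AudioImpulseResponse(EnvironmentTransform):
    def __init__(self, impulse=None):
        self.impulse = impulse

    def apply(self, sample):
        # placeholder: a real version would convolve with self.impulse
        if self.impulse is None:
            return sample
        raise NotImplementedError("convolution not implemented in this sketch")
```

Evaluations would then call `transform.apply(sample)` uniformly, so swapping the identity for a real simulator is a config change, not a code change.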
Features we need, but don't know where they will live yet (we can add to these):
We need to establish a dedicated AWS kubernetes cluster that can handle requests to run ARMORY in the cloud:
https://github.com/twosixlabs/armory/blob/master/armory/webapi/kubeflow_aws.py
Currently we've just stubbed pre-processing within the eval classifier:
https://github.com/twosixlabs/armory/blob/master/armory/eval/classification.py#L24-L27
This should be refactored to exist as a callable module.
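A minimal sketch of what the callable module could look like, assuming nothing about armory's actual API: pre-processing steps as plain callables, chained by a small composition helper. The step functions here are illustrative stand-ins.

```python
def compose(*steps):
    """Chain pre-processing callables into a single callable pipeline."""
    def pipeline(x):
        for step in steps:
            x = step(x)
        return x
    return pipeline


# hypothetical steps; real ones would operate on numpy arrays
to_float = lambda xs: [float(v) for v in xs]
normalize = lambda xs: [v / 255.0 for v in xs]

preprocess = compose(to_float, normalize)
```

The classifier eval would then accept any such callable, instead of hard-coding the steps inline.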
Currently, models are generated and trained in each evaluation. It would be good to have a unified interface for saving/loading them that abstracts out the underlying toolkit.
We should make it straightforward to check that everyone is using the exact same dataset and model (at least for baseline models). This may get more challenging when dealing with multiple frameworks.
Ensuring randomization seeds are properly initialized is a related issue, though that should probably be its own issue.
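One way the unified save/load interface could look, with all names assumed: an abstract base class that each toolkit implements, so evaluations never touch Keras or PyTorch serialization directly. The pickle backend below is only a stand-in.

```python
import abc
import pickle


class ModelIO(abc.ABC):
    """Toolkit-agnostic model save/load interface (names hypothetical)."""

    @abc.abstractmethod
    def save(self, model, path):
        ...

    @abc.abstractmethod
    def load(self, path):
        ...


class PickleBackend(ModelIO):
    # stand-in backend; real ones would wrap Keras's save/load or
    # torch.save/torch.load behind the same two methods
    def save(self, model, path):
        with open(path, "wb") as f:
            pickle.dump(model, f)

    def load(self, path):
        with open(path, "rb") as f:
            return pickle.load(f)
```

Checksumming the saved artifact would also give a cheap way to verify everyone is using the exact same baseline model.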
PM wants to focus early on more realistic perceptual metrics.
This will require more general usage for metrics (as many of interest are difficult to compute).
The goal for this issue is to set up an overall structure (probably with an initial identity function) that can be used for various downstream experiments.
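A sketch of that overall structure, with names invented for illustration: a metric registry with one trivial metric installed, so expensive perceptual metrics can later be added behind the same lookup without changing evaluation code.

```python
METRICS = {}


def register_metric(name):
    def decorator(fn):
        METRICS[name] = fn
        return fn
    return decorator


@register_metric("categorical_accuracy")
def categorical_accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# harder-to-compute perceptual metrics would register under the same
# interface, so evaluations only ever do METRICS[name](y_true, y_pred)
```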
We should setup CI/CD for testing python3 + lint. Likely will use Travis or GitHub actions.
Since several performers have mentioned using ensembles of models in their defenses, if possible it would be nice to showcase this ability during the kickoff.
Instead of using absolute accuracy metrics (0%, 100%) as the optimal baselines for tasks, it would be helpful to have dataset+task-specific baselines that help calibrate results to the difficulty of the particular task.
The truly optimal baseline would be to find the nearest training point (under the specified metric) to each test point and use that as an epsilon upper bound. This is computationally infeasible for larger datasets, but an approximate nearest neighbor approach could work suitably. I see this being of particular interest for certified defenses.
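The brute-force version of that baseline is easy to state; an approximate nearest-neighbor index (LSH, KD-tree, etc.) would replace the inner loop for large datasets. A sketch:

```python
import math


def nearest_train_distance(test_point, train_points):
    """L2 distance from a test point to its nearest training point.

    This exact brute-force scan is infeasible for large datasets; an
    approximate nearest-neighbor structure would stand in for the loop.
    The distance serves as the per-example epsilon upper bound.
    """
    return min(math.dist(test_point, p) for p in train_points)
```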
For the upcoming Kickoff we'll want to showcase a transfer attack using pre-generated images. This will require:
As part of our eval configs we have a field for performer_repo. We will want a way to download the repo within the container and parse the rest of the config to run code in external or armory repositories.
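A sketch of the download step, with helper names made up (armory.utils.external_repo presumably ends up owning this): a shallow git clone driven by the config field.

```python
import subprocess


def clone_command(repo_url, dest):
    # shallow clone keeps the in-container download small; a real
    # version would also pin a revision and handle authentication
    return ["git", "clone", "--depth", "1", repo_url, dest]


def download_external_repo(repo_url, dest):
    subprocess.run(clone_command(repo_url, dest), check=True)
```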
Currently any files written or modified inside the armory docker container will be written as root. Instead we should drop root privileges and write as the current user.
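One way to do this, assuming we keep launching containers through docker-py: pass the invoking user's uid:gid via the `user=` option of `containers.run`, so files written to mounted volumes are owned by the user rather than root.

```python
import os

# uid:gid of whoever launched armory (POSIX only)
user_spec = f"{os.getuid()}:{os.getgid()}"

# hypothetical call site; docker-py's containers.run accepts user=:
# client.containers.run("twosixlabs/armory:0.1", user=user_spec, ...)
```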
Bit of a weird bug: something returns as NoneType when the requests package is invoked by the docker client while stopping the container.
This runs successfully:
from armory.docker.management import ManagementInstance


def testing():
    manager = ManagementInstance()
    runner = manager.start_armory_instance()
    runner.docker_container.exec_run("python demo_log.py", stream=True)
    manager.stop_armory_instance(runner)


testing()
While this will return an error on clean-up:
from armory.docker.management import ManagementInstance
manager = ManagementInstance()
runner = manager.start_armory_instance()
runner.docker_container.exec_run("python demo_log.py", stream=True)
manager.stop_armory_instance(runner)
Exception ignored in: <bound method ArmoryInstance.__del__ of <armory.docker.management.ArmoryInstance object at 0x7f567bef2438>>
Traceback (most recent call last):
File "/home/sean-morgan/code/armory/armory/docker/management.py", line 46, in __del__
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/models/containers.py", line 432, in stop
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/api/container.py", line 1153, in stop
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/utils/decorators.py", line 46, in inner
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 226, in _post
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/sessions.py", line 581, in post
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/sessions.py", line 519, in request
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/sessions.py", line 449, in prepare_request
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/utils.py", line 172, in get_netrc_auth
TypeError: 'NoneType' object is not callable
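A likely explanation: the failing variant runs at module scope, so ArmoryInstance.__del__ fires during interpreter shutdown, after the requests module's globals have been torn down, and get_netrc_auth ends up calling a name that is now None. The function-scoped variant works because the objects are collected before shutdown. A sketch of a workaround that guarantees the container is stopped while the interpreter is still fully alive (the helper name is made up):

```python
import contextlib


@contextlib.contextmanager
def armory_instance(manager):
    # stop_armory_instance always runs before interpreter shutdown,
    # so __del__ never fires during module teardown
    runner = manager.start_armory_instance()
    try:
        yield runner
    finally:
        manager.stop_armory_instance(runner)

# usage:
# with armory_instance(ManagementInstance()) as runner:
#     runner.docker_container.exec_run("python demo_log.py", stream=True)
```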
Currently the logging info that occurs within the container is not visible to the user. An attempted fix is here:
https://github.com/twosixlabs/armory/blob/master/armory/eval/evaluator.py#L43-L44
But it needs a bit more investigation to work properly.
Currently, there are a number of places where TF 2.0 fails. Our data pipeline requires graph mode to work, so eager execution must be disabled when using 2.0:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
There are also some 1.15-specific calls, like tf.parse_single_example (tf.io.parse_single_example in 2.x), which fail:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/davidslater/git/armory/armory/data/data.py", line 201, in imagenet_adversarial
image_label_ds = adv_ds.map(lambda example_proto: _parse(example_proto))
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1211, in map
return MapDataset(self, map_func, preserve_cardinality=True)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3416, in __init__
use_legacy_function=use_legacy_function)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2695, in __init__
self._function = wrapper_fn._get_concrete_function_internal()
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1854, in _get_concrete_function_internal
*args, **kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1848, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2150, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2041, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2689, in wrapper_fn
ret = _wrapper_helper(*args)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2634, in _wrapper_helper
ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
raise e.ag_error_metadata.to_exception(e)
AttributeError: in converted code:
relative to /Users/davidslater/git/armory/armory/data:
data.py:201 None *
image_label_ds = adv_ds.map(lambda example_proto: _parse(example_proto))
data.py:175 _parse *
example = tf.parse_single_example(serialized_example, ds_features)
AttributeError: module 'tensorflow' has no attribute 'parse_single_example'
Need to improve the documentation significantly.
With numpy arrays, training the basic MNIST Keras model was really fast. With the data generator, it is now at least an order of magnitude slower to train.
Enable pip installs prior to evaluation
To make it easier to identify and keep track of requirements, I propose the following style for imports:
"""
Docstring
"""
# block of built-in imports
import json
import logging
import os
import shutil
# block of external package imports
import requests
import numpy as np
from art import attacks
# block of internal package imports
from armory.webapi.data import SUPPORTED_DATASETS
from armory.docker.management import ManagementInstance
from armory.utils.external_repo import download_and_extract
# rest of the code
log = logging.getLogger(__name__)
# ...
Thoughts?
Install a NullHandler in the base __init__.py and allow scripts to install their own handler so that they can override file-specific logging configuration.
Currently, the basicConfig level in files such as armory/docker/management.py overrides the one defined in the main script, as those files install their own coloredlogs handler.
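The standard library pattern for this: the package installs only a NullHandler at import time and never calls basicConfig or coloredlogs itself, leaving handler configuration to the entry-point script.

```python
import logging

# in armory/__init__.py: importing the library never configures
# application-wide logging, it only silences "no handler" warnings
logging.getLogger("armory").addHandler(logging.NullHandler())

# scripts then install whatever handler they want, e.g.:
# coloredlogs.install(level="INFO")
```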
I am sure that performers will want to be able to play inside the docker container, especially if things don't work. You can launch a container from python:
>>> from armory.docker.management import ManagementInstance
>>> manager = ManagementInstance()
>>> runner = manager.start_armory_instance()
Then attach to it in bash:
(armory) davidslater@aleph-5:~/armory$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ba87077ad356 twosixlabs/armory:0.1 "/bin/sh -c 'tail -f…" 18 seconds ago Up 17 seconds elegant_leavitt
(armory) davidslater@aleph-5:~/armory$ docker exec -it ba87077ad356 bash
But that's a bit clunky. Thoughts?
As part of the armory config, we would like the user to be able to swap between models wrapped with ART and CleverHans:
https://github.com/twosixlabs/armory/blob/master/examples/eval_classifier.py#L15
This requires some analysis of the differences between the wrappers and well-thought-out logic for handling each respective library.
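One shape that logic could take, with everything below hypothetical: a registry keyed by the config's wrapper name, with the library-specific wrapping isolated inside each registered function (the real versions would construct ART and CleverHans wrappers; the tuples here are placeholders).

```python
WRAPPERS = {}


def register_wrapper(name):
    def decorator(fn):
        WRAPPERS[name] = fn
        return fn
    return decorator


@register_wrapper("art")
def _wrap_art(model):
    # a real version would return an ART classifier wrapper here
    return ("art", model)


@register_wrapper("cleverhans")
def _wrap_cleverhans(model):
    # a real version would return a CleverHans-compatible wrapper here
    return ("cleverhans", model)


def wrap_model(model, config):
    try:
        return WRAPPERS[config["wrapper"]](model)
    except KeyError:
        raise ValueError(f"unknown wrapper: {config.get('wrapper')!r}")
```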
It'll be more user friendly to standardize what is and isn't allowed in the configurations. Unused fields can simply be null.
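A sketch of that standardization; the field names below are illustrative guesses, not armory's actual schema. Unknown fields are rejected, and every allowed-but-unused field comes back as null.

```python
# illustrative field names, not the real schema
ALLOWED_FIELDS = {"model", "dataset", "attack", "defense", "performer_repo"}


def validate_config(config):
    unknown = set(config) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown config fields: {sorted(unknown)}")
    # every allowed field is present in the result; unused ones are null
    return {field: config.get(field) for field in ALLOWED_FIELDS}
```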
Currently our dataset API returns the entire dataset in memory:
https://github.com/twosixlabs/armory/blob/master/armory/webapi/data.py#L13
We want to refactor this so that a data generator is returned and then handled properly by the classifier eval. This is required for larger datasets than can fit into memory.
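The refactor boils down to yielding batches lazily instead of materializing the arrays. A minimal sketch (with a real dataset, the body would read shards from disk rather than slice an in-memory sequence):

```python
def batch_generator(x, y, batch_size=32):
    """Yield (x, y) batches lazily instead of returning full arrays."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]
```

The classifier eval would then iterate the generator, so datasets larger than memory become a non-issue.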
Currently we've used some framework-specific calls in our evaluations for convenience, e.g.:
https://github.com/twosixlabs/armory/blob/master/performer_evaluation/fgm_attack.py#L44
I think this is fine as we want to promote user flexibility... but we need to mark them as framework-specific and provide some framework-agnostic evaluations, which ART enables.
Currently there is a catch-all for Docker Daemon issues when launching a container. A branch can be added to check for the nvidia runtime and provide a better warning, or revert to the ordinary runtime.
(armory) davidslater@aleph-5:~/armory$ python run_evaluation.py examples/mnist_fgm.json
2020-01-29 12:19:27 aleph-5.local armory.eval.evaluator[14763] ERROR Starting instance failed. Is Docker Daemon running?
Traceback (most recent call last):
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 261, in _raise_for_status
response.raise_for_status()
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http+docker://localhost/v1.35/containers/create
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/davidslater/git/armory/armory/eval/evaluator.py", line 56, in run_config
runner = self.manager.start_armory_instance()
File "/Users/davidslater/git/armory/armory/docker/management.py", line 67, in start_armory_instance
temp_inst = ArmoryInstance(runtime=self.runtime)
File "/Users/davidslater/git/armory/armory/docker/management.py", line 41, in __init__
"twosixlabs/armory:0.1", **container_args
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/models/containers.py", line 803, in run
detach=detach, **kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/models/containers.py", line 861, in create
resp = self.client.api.create_container(**create_kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/container.py", line 430, in create_container
return self.create_container_from_config(config, name)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/container.py", line 441, in create_container_from_config
return self._result(res, True)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 267, in _result
self._raise_for_status(response)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 263, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 400 Client Error: Bad Request ("Unknown runtime specified nvidia")
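A sketch of the proposed branch, as a plain string-matching helper so it stays independent of the docker-py exception types (the hint wording is made up):

```python
def diagnose_docker_error(message):
    # hypothetical replacement for the current catch-all: map raw
    # Docker API error text to an actionable hint
    if "Unknown runtime specified nvidia" in message:
        return ("nvidia runtime not available; install "
                "nvidia-container-runtime or revert to the ordinary runtime")
    return "Starting instance failed. Is Docker Daemon running?"
```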
Need to support GPU and GPU selection so that evaluations can leverage them.
Black will handle all of our formatting, but things like unused imports and other PEP8 violations will not be caught.
It is unclear what the best datasets and tasks are for audio examples.
When iterating through mnist data generator, I get this error repeated continuously:
2020-01-10 15:53:34.078149: W tensorflow/core/framework/model.cc:855] Failed to find a tunable parameter that would decrease the output time. This means that the autotuning optimization got stuck in a local maximum. The optimization attempt will be aborted.
Building the docker image in the main directory includes all subdirectories, including datasets. This creates a GIANT tar file to send to the docker server. Since we don't actually need the local directory files to build the container, we should either:
A) Move the container to an empty folder in the repo OR
B) Add a .dockerignore file that ignores the large directories (.git, datasets, etc.)
This should prevent unnecessary directory parsing (or modifying of data). If we assume git (which we should), we can get all tracked files with:
git ls-files .
and all untracked but not ignored files with:
git ls-files . --exclude-standard --others
The . for the path is an optional parameter.
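Option B could be as small as this .dockerignore fragment (only the entries named above; anything else would be a guess):

```
# keep the docker build context small
.git
datasets
```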