twosixlabs / armory
ARMORY Adversarial Robustness Evaluation Test Bed
License: MIT License
Not sure if we'll be able to get this ready before kickoff. Inception-ResNet-v2? It'd be nice to have an example for the demonstration.
It looks unlikely we'll be able to get the name armory. Let me know if there are any thoughts/suggestions:
armory-evaluation
armory-evaluator
armory-testbed
This would hopefully enable us to test different TensorFlow versions more quickly, for instance.
PM wants to focus early on more realistic environments.
These will require simulation of specific environments (such as impulse response for audio and rendering for images/video) to be tractable.
The goal for this issue is to set up an overall structure (probably with an initial identity function) that can be used for various downstream experiments.
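One possible shape for that structure, with every name below assumed rather than taken from armory: a transform base class whose default behavior is the identity, which later environment simulations (impulse response for audio, rendering for images/video) can subclass.

```python
class EnvironmentTransform:
    """Base class for simulated environments (all names hypothetical)."""

    def apply(self, sample):
        # identity by default; subclasses add the actual simulation
        return sample


class AudioImpulseResponse(EnvironmentTransform):
    def __init__(self, impulse=None):
        self.impulse = impulse

    def apply(self, sample):
        # placeholder: a real version would convolve with self.impulse
        if self.impulse is None:
            return sample
        raise NotImplementedError("convolution not implemented in this sketch")
```

Evaluations would then call `transform.apply(sample)` uniformly, so swapping the identity for a real simulator is a config change, not a code change.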
Features we need, but don't know where they will live yet (we can add to these):
We need to establish a dedicated AWS kubernetes cluster that can handle requests to run ARMORY in the cloud:
https://github.com/twosixlabs/armory/blob/master/armory/webapi/kubeflow_aws.py
Currently we've just stubbed pre-processing within the eval classifier:
https://github.com/twosixlabs/armory/blob/master/armory/eval/classification.py#L24-L27
This should be refactored to exist as a callable module.
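A minimal sketch of what the callable module could look like, assuming nothing about armory's actual API: pre-processing steps as plain callables, chained by a small composition helper. The step functions here are illustrative stand-ins.

```python
def compose(*steps):
    """Chain pre-processing callables into a single callable pipeline."""
    def pipeline(x):
        for step in steps:
            x = step(x)
        return x
    return pipeline


# hypothetical steps; real ones would operate on numpy arrays
to_float = lambda xs: [float(v) for v in xs]
normalize = lambda xs: [v / 255.0 for v in xs]

preprocess = compose(to_float, normalize)
```

The classifier eval would then accept any such callable, instead of hard-coding the steps inline.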
Currently, models are generated and trained in each evaluation. It would be good to have a unified interface for saving/loading them that abstracts out the underlying toolkit.
We should make it straightforward to check that everyone is using the exact same dataset and model (at least for baseline models). This may get more challenging when dealing with multiple frameworks.
Ensuring randomization seeds are properly initialized is a related issue, though that should probably be its own issue.
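One way the unified save/load interface could look, with all names assumed: an abstract base class that each toolkit implements, so evaluations never touch Keras or PyTorch serialization directly. The pickle backend below is only a stand-in.

```python
import abc
import pickle


class ModelIO(abc.ABC):
    """Toolkit-agnostic model save/load interface (names hypothetical)."""

    @abc.abstractmethod
    def save(self, model, path):
        ...

    @abc.abstractmethod
    def load(self, path):
        ...


class PickleBackend(ModelIO):
    # stand-in backend; real ones would wrap Keras's save/load or
    # torch.save/torch.load behind the same two methods
    def save(self, model, path):
        with open(path, "wb") as f:
            pickle.dump(model, f)

    def load(self, path):
        with open(path, "rb") as f:
            return pickle.load(f)
```

Checksumming the saved artifact would also give a cheap way to verify everyone is using the exact same baseline model.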
PM wants to focus early on more realistic perceptual metrics.
This will require more general usage for metrics (as many of interest are difficult to compute).
The goal for this issue is to set up an overall structure (probably with an initial identity function) that can be used for various downstream experiments.
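A sketch of that overall structure, with names invented for illustration: a metric registry with one trivial metric installed, so expensive perceptual metrics can later be added behind the same lookup without changing evaluation code.

```python
METRICS = {}


def register_metric(name):
    def decorator(fn):
        METRICS[name] = fn
        return fn
    return decorator


@register_metric("categorical_accuracy")
def categorical_accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# harder-to-compute perceptual metrics would register under the same
# interface, so evaluations only ever do METRICS[name](y_true, y_pred)
```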
We should setup CI/CD for testing python3 + lint. Likely will use Travis or GitHub actions.
Since several performers have mentioned using ensembles of models in their defenses, if possible it would be nice to showcase this ability during the kickoff.
Instead of using absolute accuracy metrics (0%, 100%) as the optimal baselines for tasks, it would be helpful to have dataset+task-specific baselines that help calibrate results to the difficulty of the particular task.
The truly optimal baseline would be to find the nearest training point (under the specified metric) to each test point and use that as an epsilon upper bound. This is computationally infeasible for larger datasets, but an approximate nearest neighbor approach could work suitably. I see this being of particular interest for certified defenses.
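The brute-force version of that baseline is easy to state; an approximate nearest-neighbor index (LSH, KD-tree, etc.) would replace the inner loop for large datasets. A sketch:

```python
import math


def nearest_train_distance(test_point, train_points):
    """L2 distance from a test point to its nearest training point.

    This exact brute-force scan is infeasible for large datasets; an
    approximate nearest-neighbor structure would stand in for the loop.
    The distance serves as the per-example epsilon upper bound.
    """
    return min(math.dist(test_point, p) for p in train_points)
```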
For the upcoming Kickoff we'll want to showcase a transfer attack using pre-generated images. This will require:
As part of our eval configs we have a field for performer_repo. We will want a way to download the repo within the container and parse the rest of the config to run code in external or armory repositories.
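A sketch of the download step, with helper names made up (armory.utils.external_repo presumably ends up owning this): a shallow git clone driven by the config field.

```python
import subprocess


def clone_command(repo_url, dest):
    # shallow clone keeps the in-container download small; a real
    # version would also pin a revision and handle authentication
    return ["git", "clone", "--depth", "1", repo_url, dest]


def download_external_repo(repo_url, dest):
    subprocess.run(clone_command(repo_url, dest), check=True)
```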
Currently any files written or modified inside the armory docker container will be written as root. Instead we should drop root privileges and write as the current user.
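One way to do this, assuming we keep launching containers through docker-py: pass the invoking user's uid:gid via the `user=` option of `containers.run`, so files written to mounted volumes are owned by the user rather than root.

```python
import os

# uid:gid of whoever launched armory (POSIX only)
user_spec = f"{os.getuid()}:{os.getgid()}"

# hypothetical call site; docker-py's containers.run accepts user=:
# client.containers.run("twosixlabs/armory:0.1", user=user_spec, ...)
```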
Bit of a weird bug: something returns as NoneType when the requests package is invoked by the docker client while stopping the container.
This runs successfully:
from armory.docker.management import ManagementInstance


def testing():
    manager = ManagementInstance()
    runner = manager.start_armory_instance()
    runner.docker_container.exec_run("python demo_log.py", stream=True)
    manager.stop_armory_instance(runner)


testing()
While this will return an error on clean-up:
from armory.docker.management import ManagementInstance
manager = ManagementInstance()
runner = manager.start_armory_instance()
runner.docker_container.exec_run("python demo_log.py", stream=True)
manager.stop_armory_instance(runner)
Exception ignored in: <bound method ArmoryInstance.__del__ of <armory.docker.management.ArmoryInstance object at 0x7f567bef2438>>
Traceback (most recent call last):
File "/home/sean-morgan/code/armory/armory/docker/management.py", line 46, in __del__
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/models/containers.py", line 432, in stop
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/utils/decorators.py", line 19, in wrapped
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/api/container.py", line 1153, in stop
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/utils/decorators.py", line 46, in inner
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 226, in _post
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/sessions.py", line 581, in post
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/sessions.py", line 519, in request
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/sessions.py", line 449, in prepare_request
File "/home/sean-morgan/miniconda3/envs/armory/lib/python3.6/site-packages/requests/utils.py", line 172, in get_netrc_auth
TypeError: 'NoneType' object is not callable
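A likely explanation: the failing variant runs at module scope, so ArmoryInstance.__del__ fires during interpreter shutdown, after the requests module's globals have been torn down, and get_netrc_auth ends up calling a name that is now None. The function-scoped variant works because the objects are collected before shutdown. A sketch of a workaround that guarantees the container is stopped while the interpreter is still fully alive (the helper name is made up):

```python
import contextlib


@contextlib.contextmanager
def armory_instance(manager):
    # stop_armory_instance always runs before interpreter shutdown,
    # so __del__ never fires during module teardown
    runner = manager.start_armory_instance()
    try:
        yield runner
    finally:
        manager.stop_armory_instance(runner)

# usage:
# with armory_instance(ManagementInstance()) as runner:
#     runner.docker_container.exec_run("python demo_log.py", stream=True)
```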
Currently the logging info that occurs within the container is not visible to the user. An attempted fix is here:
https://github.com/twosixlabs/armory/blob/master/armory/eval/evaluator.py#L43-L44
But it needs a bit more investigation to work properly.
Currently, there are a number of places where TF 2.0 fails. Our data pipeline requires graph mode to work, so eager execution must be disabled when using 2.0:
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
There are also some 1.15-specific calls, like tf.parse_single_example (tf.io.parse_single_example in 2.x), which fail:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/davidslater/git/armory/armory/data/data.py", line 201, in imagenet_adversarial
image_label_ds = adv_ds.map(lambda example_proto: _parse(example_proto))
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1211, in map
return MapDataset(self, map_func, preserve_cardinality=True)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3416, in __init__
use_legacy_function=use_legacy_function)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2695, in __init__
self._function = wrapper_fn._get_concrete_function_internal()
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1854, in _get_concrete_function_internal
*args, **kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1848, in _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2150, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2041, in _create_graph_function
capture_by_value=self._capture_by_value),
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2689, in wrapper_fn
ret = _wrapper_helper(*args)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2634, in _wrapper_helper
ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
raise e.ag_error_metadata.to_exception(e)
AttributeError: in converted code:
relative to /Users/davidslater/git/armory/armory/data:
data.py:201 None *
image_label_ds = adv_ds.map(lambda example_proto: _parse(example_proto))
data.py:175 _parse *
example = tf.parse_single_example(serialized_example, ds_features)
AttributeError: module 'tensorflow' has no attribute 'parse_single_example'
Need to improve the documentation significantly.
With numpy arrays, training the basic MNIST Keras model was really fast. With the data generator, it is now at least an order of magnitude slower to train.
Enable pip installs prior to evaluation
To make it easier to identify and keep track of requirements, I propose the following style for imports:
"""
Docstring
"""
# block of built-in imports
import json
import logging
import os
import shutil
# block of external package imports
import requests
import numpy as np
from art import attacks
# block of internal package imports
from armory.webapi.data import SUPPORTED_DATASETS
from armory.docker.management import ManagementInstance
from armory.utils.external_repo import download_and_extract
# rest of the code
log = logging.getLogger(__name__)
# ...
Thoughts?
Install a NullHandler in the base __init__.py and allow scripts to install their own handler so that they can override file-specific logging configuration.
Currently, the basicConfig level in files such as armory/docker/management.py overrides the one defined in the main script, as those files install their own coloredlogs handler.
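The standard library pattern for this: the package installs only a NullHandler at import time and never calls basicConfig or coloredlogs itself, leaving handler configuration to the entry-point script.

```python
import logging

# in armory/__init__.py: importing the library never configures
# application-wide logging, it only silences "no handler" warnings
logging.getLogger("armory").addHandler(logging.NullHandler())

# scripts then install whatever handler they want, e.g.:
# coloredlogs.install(level="INFO")
```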
I am sure that performers will want to be able to play inside the docker container, especially if things don't work. You can launch a container from python:
>>> from armory.docker.management import ManagementInstance
>>> manager = ManagementInstance()
>>> runner = manager.start_armory_instance()
Then attach to it in bash:
(armory) davidslater@aleph-5:~/armory$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ba87077ad356 twosixlabs/armory:0.1 "/bin/sh -c 'tail -f…" 18 seconds ago Up 17 seconds elegant_leavitt
(armory) davidslater@aleph-5:~/armory$ docker exec -it ba87077ad356 bash
But that's a bit clunky. Thoughts?
As part of the armory config, we would like the user to be able to swap between models wrapped with ART and CleverHans:
https://github.com/twosixlabs/armory/blob/master/examples/eval_classifier.py#L15
This requires some analysis of the differences between the wrappers and well-thought-out logic for handling each respective library.
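One shape that logic could take, with everything below hypothetical: a registry keyed by the config's wrapper name, with the library-specific wrapping isolated inside each registered function (the real versions would construct ART and CleverHans wrappers; the tuples here are placeholders).

```python
WRAPPERS = {}


def register_wrapper(name):
    def decorator(fn):
        WRAPPERS[name] = fn
        return fn
    return decorator


@register_wrapper("art")
def _wrap_art(model):
    # a real version would return an ART classifier wrapper here
    return ("art", model)


@register_wrapper("cleverhans")
def _wrap_cleverhans(model):
    # a real version would return a CleverHans-compatible wrapper here
    return ("cleverhans", model)


def wrap_model(model, config):
    try:
        return WRAPPERS[config["wrapper"]](model)
    except KeyError:
        raise ValueError(f"unknown wrapper: {config.get('wrapper')!r}")
```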
It'll be more user friendly to standardize what is and isn't allowed in the configurations. Unused fields can simply be null.
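A sketch of that standardization; the field names below are illustrative guesses, not armory's actual schema. Unknown fields are rejected, and every allowed-but-unused field comes back as null.

```python
# illustrative field names, not the real schema
ALLOWED_FIELDS = {"model", "dataset", "attack", "defense", "performer_repo"}


def validate_config(config):
    unknown = set(config) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown config fields: {sorted(unknown)}")
    # every allowed field is present in the result; unused ones are null
    return {field: config.get(field) for field in ALLOWED_FIELDS}
```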
Currently our dataset API returns the entire dataset in memory:
https://github.com/twosixlabs/armory/blob/master/armory/webapi/data.py#L13
We want to refactor this so that a data generator is returned and then handled properly by the classifier eval. This is required for larger datasets than can fit into memory.
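The refactor boils down to yielding batches lazily instead of materializing the arrays. A minimal sketch (with a real dataset, the body would read shards from disk rather than slice an in-memory sequence):

```python
def batch_generator(x, y, batch_size=32):
    """Yield (x, y) batches lazily instead of returning full arrays."""
    for start in range(0, len(x), batch_size):
        yield x[start:start + batch_size], y[start:start + batch_size]
```

The classifier eval would then iterate the generator, so datasets larger than memory become a non-issue.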
Currently we've used some framework-specific calls in our evaluations for convenience, e.g.:
https://github.com/twosixlabs/armory/blob/master/performer_evaluation/fgm_attack.py#L44
I think this is fine as we want to promote user flexibility... but we need to mark them as framework-specific and provide some framework-agnostic evaluations, which ART enables.
Currently there is a catch-all for Docker Daemon issues when launching a container. A branch can be added to check for the nvidia runtime and provide a better warning, or revert to the ordinary runtime.
(armory) davidslater@aleph-5:~/armory$ python run_evaluation.py examples/mnist_fgm.json
2020-01-29 12:19:27 aleph-5.local armory.eval.evaluator[14763] ERROR Starting instance failed. Is Docker Daemon running?
Traceback (most recent call last):
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 261, in _raise_for_status
response.raise_for_status()
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: http+docker://localhost/v1.35/containers/create
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/davidslater/git/armory/armory/eval/evaluator.py", line 56, in run_config
runner = self.manager.start_armory_instance()
File "/Users/davidslater/git/armory/armory/docker/management.py", line 67, in start_armory_instance
temp_inst = ArmoryInstance(runtime=self.runtime)
File "/Users/davidslater/git/armory/armory/docker/management.py", line 41, in __init__
"twosixlabs/armory:0.1", **container_args
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/models/containers.py", line 803, in run
detach=detach, **kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/models/containers.py", line 861, in create
resp = self.client.api.create_container(**create_kwargs)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/container.py", line 430, in create_container
return self.create_container_from_config(config, name)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/container.py", line 441, in create_container_from_config
return self._result(res, True)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 267, in _result
self._raise_for_status(response)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/api/client.py", line 263, in _raise_for_status
raise create_api_error_from_http_exception(e)
File "/Users/davidslater/opt/anaconda3/envs/armory/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 400 Client Error: Bad Request ("Unknown runtime specified nvidia")
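A sketch of the proposed branch, as a plain string-matching helper so it stays independent of the docker-py exception types (the hint wording is made up):

```python
def diagnose_docker_error(message):
    # hypothetical replacement for the current catch-all: map raw
    # Docker API error text to an actionable hint
    if "Unknown runtime specified nvidia" in message:
        return ("nvidia runtime not available; install "
                "nvidia-container-runtime or revert to the ordinary runtime")
    return "Starting instance failed. Is Docker Daemon running?"
```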
Need to support GPU and GPU selection so that evaluations can leverage them.
Black will handle all of our formatting, but things like unused imports and other PEP8 violations will not be caught.
It is unclear what the best datasets and tasks are for audio examples.
When iterating through mnist data generator, I get this error repeated continuously:
2020-01-10 15:53:34.078149: W tensorflow/core/framework/model.cc:855] Failed to find a tunable parameter that would decrease the output time. This means that the autotuning optimization got stuck in a local maximum. The optimization attempt will be aborted.
Building the docker image in the main directory includes all subdirectories, including datasets. This creates a GIANT tar file to send to the docker server. Since we don't actually need the local directory files to build the container, we should either:
A) Move the container to an empty folder in the repo OR
B) Add a .dockerignore file that ignores the large directories (.git, datasets, etc.)
This should prevent unnecessary directory parsing (or modifying of data). If we assume git (which we should), we can get all tracked files with:
git ls-files .
and all untracked but not ignored files with:
git ls-files . --exclude-standard --others
The . for the path is an optional parameter.
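Option B could be as small as this .dockerignore fragment (only the entries named above; anything else would be a guess):

```
# keep the docker build context small
.git
datasets
```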