AdvHunter: Detecting Adversarial Perturbations in Black-Box Neural Networks through Hardware Performance Counters
AdvHunter is a defense framework designed to protect Deep Neural Networks (DNNs) from adversarial examples, even in black-box scenarios where the network's internal details are unknown. It leverages Hardware Performance Counters (HPCs) to monitor the microarchitectural activities of a DNN during inference. By applying Gaussian Mixture Models to the collected HPC data, AdvHunter identifies anomalies indicating whether an input is legitimate or has been altered by adversarial perturbations.
- Operating System: Linux (Tested on Ubuntu 18.04.6 LTS).
- Processor: Intel (Tested on Intel i7-9700).
- Python Version: Python 3.10.9 (confirmed compatible).
- CUDA Toolkit: Optional for GPU acceleration during training (Tested with version 11.5, V11.5.119).
- Ensure the `perf` tool is installed on your system.
- Set up a dedicated Python virtual environment and install required dependencies:

  ```bash
  python -m venv advhunter
  source advhunter/bin/activate
  pip install -r requirements.txt
  ```
- Model Training: Train a ResNet18 model on the CIFAR10 dataset.

  ```bash
  python model_training.py
  ```

  The best-performing model is saved in `logs/best_model.pth`.
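
  Once training completes, the checkpoint can be reloaded along the following lines (a minimal sketch assuming a torchvision-style ResNet18 adapted to 10 classes and a state-dict checkpoint; the exact architecture is defined in `model_training.py`):

  ```python
  # Hypothetical reload of the saved checkpoint; match the architecture actually
  # built in model_training.py (a ResNet18 trained on CIFAR10).
  import torch
  from torchvision.models import resnet18

  model = resnet18(num_classes=10)                               # CIFAR10 has 10 classes
  state = torch.load("logs/best_model.pth", map_location="cpu")  # assumes a state-dict checkpoint
  model.load_state_dict(state)
  model.eval()
  ```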
- Adversarial Examples Generation: Generate adversarial examples by specifying various arguments.

  ```bash
  python adversarial_examples.py --model=<model> --attack_type=<attack> --attack_method=<method> --epsilon=<epsilon> --target_class=<target>
  ```

  The supported arguments are:

  - Attack Type (`--attack_type`): Specify `targeted` or `untargeted`.
  - Attack Method (`--attack_method`): Specify `fgsm`, `pgd`, or `deepfool`.
  - Perturbation Strength (`--epsilon`): Set the perturbation strength.
  - Target Class (`--target_class`): For `targeted` attacks, specify the misclassification target class.

  Outputs:

  - Benign Images: Saved in the `logs/benign` directory.
  - Predicted Benign Labels: Logged in the `logs/benign_labels.log` file.
  - Adversarial Images: Saved in the `logs/<attack_type>/<attack_method>_<epsilon>` directory.
  - Predicted Adversarial Labels: Logged in the `logs/<attack_type>/<attack_method>_<epsilon>_labels.log` file.
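
  For intuition, a minimal untargeted FGSM step (assuming a PyTorch model and inputs in the [0, 1] range; `adversarial_examples.py` may implement the attacks differently) looks like this:

  ```python
  # Minimal untargeted FGSM sketch, for illustration only.
  import torch
  import torch.nn.functional as F

  def fgsm_untargeted(model, x, y, epsilon):
      """Perturb batch x (true labels y) in the direction that increases the loss."""
      x_adv = x.clone().detach().requires_grad_(True)
      loss = F.cross_entropy(model(x_adv), y)
      loss.backward()
      x_adv = x_adv + epsilon * x_adv.grad.sign()
      return torch.clamp(x_adv, 0.0, 1.0).detach()  # keep pixels in a valid range
  ```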
- Profile Performance Counters: Profile hardware performance counters using the `perf` tool during inference of both benign and adversarial image sets. Superuser access is required to run the `perf` tool.

  - For Benign Images:

    ```bash
    ./profile_script.sh [cache]
    ```

  - For Adversarial Images:

    ```bash
    ./profile_script.sh <attack_type> <attack_method> <epsilon> [cache]
    ```

  Include the optional `cache` argument for both to collect cache-based performance counter data. Performance counter data for benign images is logged in `logs/perf_benign.log` and for adversarial images in `logs/perf_<attack_type>_<attack_method>_<epsilon>.log`.
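
  For reference, a comparable standalone measurement with `perf stat` could look like the sketch below; the event names and the inference command are assumptions for illustration, not the exact behavior of `profile_script.sh`.

  ```python
  # Illustrative wrapper around `perf stat`; profile_script.sh performs the actual profiling.
  import subprocess

  events = "instructions,branch-instructions,branch-misses"
  # The optional `cache` mode presumably adds cache-related events, e.g.:
  # events += ",cache-references,cache-misses"

  subprocess.run(
      ["sudo", "perf", "stat", "-e", events, "-o", "logs/perf_benign.log",
       "--", "python", "model_inference.py"],  # hypothetical inference command
      check=True,
  )
  ```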
- Process Performance Counters Data: Convert the logged performance counter data into a structured JSON format.

  ```bash
  python process_hpc_log.py --attack_type=<attack> --attack_method=<method> --epsilon=<epsilon> [--cache]
  ```

  Include the optional `--cache` argument to process performance counter data for cache-based events. The processed data files are saved in `logs/perf_benign.json` and `logs/perf_<attack_type>_<attack_method>_<epsilon>.json`.
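
  For a rough idea of what this conversion involves, a sketch is shown below; the actual parsing and file layout are defined in `process_hpc_log.py`, and the regular expression assumes standard `perf stat` log lines.

  ```python
  # Hypothetical conversion of a perf stat log into JSON, for illustration only.
  import json
  import re

  COUNTER_LINE = re.compile(r"^\s*([\d,]+)\s+(\S+)")  # "<count>  <event-name>"

  def perf_log_to_json(log_path, json_path):
      counters = {}
      with open(log_path) as log_file:
          for line in log_file:
              match = COUNTER_LINE.match(line)
              if match:
                  count, event = match.groups()
                  counters.setdefault(event, []).append(int(count.replace(",", "")))
      with open(json_path, "w") as json_file:
          json.dump(counters, json_file, indent=2)

  perf_log_to_json("logs/perf_benign.log", "logs/perf_benign.json")
  ```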
- Anomaly Detection and Model Evaluation: Construct Gaussian Mixture Models using performance counter data and predicted labels for benign images. Use these models, together with the predicted labels for adversarial images, to detect anomalies. The framework's detection capability is quantified using `accuracy` and `F1-score` metrics.

  ```bash
  python build_advhunter.py --attack_type=<attack> --attack_method=<method> --epsilon=<epsilon> [--cache]
  ```

  Include the optional `--cache` argument to analyze cache-based performance counter data.
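
  Conceptually, the detection step works along these lines (a minimal sketch assuming scikit-learn and one Gaussian Mixture Model per predicted class; `build_advhunter.py` implements the actual pipeline, thresholds, and evaluation):

  ```python
  # Sketch of per-class GMM anomaly detection on HPC feature vectors.
  import numpy as np
  from sklearn.mixture import GaussianMixture

  def fit_benign_models(features, labels, n_components=2):
      """features: (N, d) benign HPC vectors; labels: (N,) predicted classes."""
      models, thresholds = {}, {}
      for c in np.unique(labels):
          gmm = GaussianMixture(n_components=n_components).fit(features[labels == c])
          scores = gmm.score_samples(features[labels == c])
          models[c] = gmm
          thresholds[c] = np.percentile(scores, 5)  # illustrative cutoff
      return models, thresholds

  def is_adversarial(x, predicted_class, models, thresholds):
      """Flag a single HPC vector x as anomalous for its predicted class."""
      score = models[predicted_class].score_samples(x.reshape(1, -1))[0]
      return score < thresholds[predicted_class]
  ```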
- Reproducibility: To reproduce the results presented in the paper, the following files are included in the `reproducibility` directory:

  - `best_model.pth`
  - `benign_labels.log`
  - `untargeted/fgsm_0.1_labels.log`
  - `perf_benign.log`
  - `perf_untargeted_fgsm_0.1.log`
If you find our work interesting and use it in your research, please cite our paper:
Manaar Alam and Michail Maniatakos, "AdvHunter: Detecting Adversarial Perturbations in Black-Box Neural Networks through Hardware Performance Counters", DAC 2024.
```bibtex
@inproceedings{DBLP:conf/dac/AlamM24,
  author    = {Manaar Alam and
               Michail Maniatakos},
  title     = {{AdvHunter: Detecting Adversarial Perturbations in Black-Box Neural Networks through Hardware Performance Counters}},
  booktitle = {61st {ACM/IEEE} Design Automation Conference, {DAC} 2024, San Francisco,
               CA, USA, June 23-27, 2024},
  publisher = {{IEEE}},
  year      = {2024}
}
```
For more information or help with the setup, please contact Manaar Alam at: [email protected]