
malwaregan's Introduction

Adversarial Malware Generation Using GANs

Implementation of a Generative Adversarial Network (GAN) that can create adversarial malware examples. The work is inspired by MalGAN in the paper "Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN" by Weiwei Hu and Ying Tan.

The framework is written in PyTorch and supports CUDA.

Running the Script

The malware GAN is provided as a package in the folder malgan. A driver script is provided in main.py, which processes input arguments via argparse. The basic interface is:

python main.py Z BATCH_SIZE NUM_EPOCHS MALWARE_FILE BENIGN_FILE
  • Z -- Dimension of the latent vector. Must be a positive integer.
  • BATCH_SIZE -- Batch size for malicious examples. The benign batch size is proportional to BATCH_SIZE and the fraction of total training samples that are benign.
  • NUM_EPOCHS -- Maximum number of training epochs.
  • MALWARE_FILE -- Path to a serialized numpy or torch matrix where each row represents a single malware file's binary feature vector.
  • BENIGN_FILE -- Path to a serialized numpy or torch matrix where each row represents a single benign file's binary feature vector.
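
As a rough illustration of the interface above, the five positional arguments could be parsed with argparse along the following lines. This is a minimal sketch, not the repository's actual main.py: the helper names positive_int and build_parser are hypothetical, and requiring BATCH_SIZE and NUM_EPOCHS to be positive is an assumption (the text above states that requirement only for Z).

```python
import argparse
from pathlib import Path


def positive_int(text: str) -> int:
    """Parse a strictly positive integer (hypothetical helper)."""
    value = int(text)
    if value <= 0:
        raise argparse.ArgumentTypeError(f"expected a positive integer, got {text!r}")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the five positional arguments listed above."""
    parser = argparse.ArgumentParser(description="MalGAN driver (illustrative sketch)")
    parser.add_argument("Z", type=positive_int,
                        help="Dimension of the latent vector")
    # Assumption: batch size and epoch count are also restricted to positive integers.
    parser.add_argument("BATCH_SIZE", type=positive_int,
                        help="Batch size for malicious examples")
    parser.add_argument("NUM_EPOCHS", type=positive_int,
                        help="Maximum number of training epochs")
    parser.add_argument("MALWARE_FILE", type=Path,
                        help="Path to the serialized malware feature matrix")
    parser.add_argument("BENIGN_FILE", type=Path,
                        help="Path to the serialized benign feature matrix")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["10", "32", "100", "data/trial_mal.npy", "data/trial_ben.npy"]
)
print(args.Z, args.BATCH_SIZE, args.NUM_EPOCHS, args.MALWARE_FILE, args.BENIGN_FILE)
```

With this layout, an invalid value such as a latent dimension of 0 makes argparse print a usage message and exit before any training starts.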

For checkout purposes, we recommend calling:

python main.py 10 32 100 data/trial_mal.npy data/trial_ben.npy 

Dataset

A trial dataset is included with this implementation in the data folder. The data was published in the repository: yanminglai/Malware-GAN. This dataset should only be used for proof of concept and initial trials.

We recommend the SLEIPNIR dataset. It was published by Al-Dujaili et al. The authors requested that the dataset not be shared publicly, and we respect that request. However, researchers and students may request access directly from the authors as described on their GitHub repository. Look for the link to the Google form.
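
Whichever dataset is used, the driver script expects each input file to be a serialized matrix with one binary feature vector per row. The sketch below shows one way such a file might be written and read back with NumPy. It is illustrative only: save_feature_matrix is a hypothetical helper, the toy data is randomly generated, and storing the matrix as float32 is an assumption rather than a documented requirement of this implementation.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_feature_matrix(features: np.ndarray, path: Path) -> None:
    """Check that `features` is a 2-D 0/1 matrix, then write it as a .npy file."""
    if features.ndim != 2:
        raise ValueError("expected a 2-D matrix with one row per file")
    if not np.isin(features, (0, 1)).all():
        raise ValueError("expected binary (0/1) feature values")
    # Assumption: float32 is a convenient dtype for later conversion to tensors.
    np.save(path, features.astype(np.float32))


# Hypothetical toy data: 8 files with 16 binary features each.
rng = np.random.default_rng(seed=0)
toy = rng.integers(0, 2, size=(8, 16))

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "toy_mal.npy"
    save_feature_matrix(toy, out)
    loaded = np.load(out)
    print(loaded.shape, loaded.dtype)
```

A file produced this way has the same row-per-sample layout as the arguments MALWARE_FILE and BENIGN_FILE described above.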

CUDA Support

The implementation supports both CPU and CUDA (i.e., GPU) execution. If CUDA is detected on the system, the implementation defaults to CUDA support.
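
The usual PyTorch pattern for this kind of automatic device selection looks roughly like the sketch below. It is not taken from the malgan package; the variable names and tensor shape are illustrative assumptions.

```python
import torch

# Prefer the GPU when PyTorch reports a usable CUDA device; otherwise use the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors (and modules) can then be moved to the selected device.
# Hypothetical example: a batch of 4 latent vectors with Z = 10.
z = torch.rand(4, 10).to(device)
print(device, z.shape)
```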

Requirements

This program was tested with Python 3.6.5 on macOS and on Debian Linux. requirements.txt enumerates the exact packages used. A summary of the key requirements is below:

  • PyTorch (torch) -- Ver. 1.2.0
  • Scikit-Learn (sklearn) -- Ver. 0.20.2
  • NumPy (numpy)
  • TensorboardX -- If runtime profiling is not required, this can be removed.

malwaregan's People

Contributors

zaydh


malwaregan's Issues

Where is the Adversarial generated Binary?

Hey @ZaydH I do understand that the GAN in MalGAN takes in the feature vector for a malware binary, and then generates an adversarial malware feature vector. Would it be possible to modify this so as to pass it a malware binary (this part is easy) and then generate an adversarial malware binary (how do I generate a malware binary from the feature list)?

requirements

hello zayd! Do you have any idea if torch 1.8.1 works with your program? I have python 3.8 so torch 1.2.0 can't be downloaded.

How to run your code with SLEIPNIR dataset

Hi Zay

I have got the SLEIPNIR dataset from the authors. But your sample code uses a data format different from the SLEIPNIR dataset, which consists of several individual files. So how can I run your malGAN with the SLEIPNIR dataset?

Thanks

What are the _Saved Models_?

Hey, @ZaydH I have another one for you, and this might be a little silly, so apologies in advance.

I was wondering what exactly the Saved Models are? And how are they intended to be used?

Operation error

OSError: [Errno 22] Invalid argument: 'F:\GAN\New-MalwareGAN-master\MalwareGAN-master\logs_2020-03-14-17:48:00.920065.log'

I always get this error when I run your program. I hope you can help me solve this problem. Thank you!
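
The path in the error message is a Windows path, and the timestamp in the file name contains colons, which Windows does not allow in file names. One possible workaround is to build the timestamp without colons, as in the sketch below. The function safe_log_path is hypothetical and not part of the repository; where the log name is actually constructed in the code is not shown here.

```python
from datetime import datetime
from pathlib import Path


def safe_log_path(log_dir: Path, now: datetime) -> Path:
    """Build a timestamped log path using only characters valid on Windows."""
    # Use '-' instead of ':' between hours, minutes, and seconds.
    stamp = now.strftime("%Y-%m-%d-%H-%M-%S.%f")
    return log_dir / f"logs_{stamp}.log"


# Same timestamp as in the error message above, for comparison.
example = safe_log_path(Path("."), datetime(2020, 3, 14, 17, 48, 0, 920065))
print(example.name)  # logs_2020-03-14-17-48-00.920065.log
```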

what is "Best"?

Hello mr. Zayd, I have tried your code and it works great, but I have a question concerning the output: in the fifth column, right next to the "Discrim Train Loss", there is a "Best?" column, what does it actually mean? what does the "+"s signify?

installation+implementation

Good morning mr. Zayd, I hope my post finds you well! We are college students and MalwareGAN is our final year project so if you can help us with a step-by-step explanation on how to install it and implement it with a specific dataset and which version of Debian Linux to use. We'll be very grateful since we couldn't find anything on it on the internet. Thank you!
