tanaybhadula / malware-image-detection

A deep learning project that converts malware .bytes files into gray-scale images and uses a convolutional neural network (CNN) to classify each image and identify the malware family it belongs to.

License: MIT License

Jupyter Notebook 100.00%
classification cnn cybersecurity deep-learning keras machine-learning malware python scipy tensorflow

malware-image-detection's Introduction

🖥️ Image-based Malware Classification using CNN

Introduction

Analyzing large volumes of malware is a major burden for security analysts, and malware developers have been highly successful at evading signature-based detection. Most prevailing static analysis techniques rely on a tool that parses the executable and extracts features or signatures. Most dynamic analysis techniques require running the binary in a sandboxed environment to observe its behaviour, which malware can thwart by hiding its malicious activity when it detects a virtual environment. There is therefore a need for new approaches that overcome the limitations of static and dynamic analysis, such as time cost, resource consumption, and limited scalability.

We propose a method for visualizing and classifying malware using image processing techniques. Malware binaries are visualized as gray-scale images, based on the observation that, for many malware families, images belonging to the same family appear very similar in layout and texture. By converting the executable into an image representation, the analysis avoids the problems faced by standard static and dynamic analysis.

Dataset Used

For training and evaluating the proposed model we use the Malimg dataset, which contains 9,339 malware images belonging to 25 families (classes). The task is therefore multi-class classification of malware.

Link - https://drive.google.com/drive/folders/1CnFx26NfWfQchIU85wRNfHjqfk7Up6hl?usp=sharing

Each malware sample belongs to one of the following classes:

  • Adialer.C
  • Agent.FYI
  • Allaple.A
  • Allaple.L
  • Alueron.gen!J
  • Autorun.K
  • C2LOP.P
  • C2LOP.gen!g
  • Dialplatform.B
  • Dontovo.A
  • Fakerean
  • Instantaccess
  • Lolyda.AA1
  • Lolyda.AA2
  • Lolyda.AA3
  • Lolyda.AT
  • Malex.gen!J
  • Obfuscator.AD
  • Rbot!gen
  • Skintrim.N
  • Swizzor.gen!E
  • Swizzor.gen!I
  • VB.AT
  • Wintrim.BX
  • Yuner.A

Converting malware binaries to gray-scale images

To convert the binary files into gray-scale images, we take the hexadecimal representation of each file's binary content and write it out as a PNG image. For example, the image below is the result of converting the 0ACDbR5M3ZhBJajygTuf.bytes file into a PNG.

binary to gray scale
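
The conversion described above can be sketched as follows. This is a minimal illustration and not necessarily the notebook's code: it assumes each line of a .bytes file starts with an address followed by two-digit hex values, that `??` marks an unknown byte, and that the image width is chosen by the caller. The helper names (`parse_bytes_dump`, `to_rows`, `encode_png`) are hypothetical, and the PNG is produced with the standard library only so the sketch has no dependencies.

```python
import struct
import zlib


def parse_bytes_dump(text):
    """Turn a hex dump ('ADDRESS B0 B1 ...' per line) into a flat list of byte values."""
    values = []
    for line in text.splitlines():
        tokens = line.split()
        # Assumption: the first token on each line is an address, not data.
        for token in tokens[1:]:
            # Assumption: '??' marks an unknown byte; it is mapped to 0 (black).
            values.append(0 if token == "??" else int(token, 16))
    return values


def to_rows(values, width):
    """Split the flat byte list into rows of `width` pixels, zero-padding the last row."""
    padding = (-len(values)) % width
    padded = values + [0] * padding
    return [bytes(padded[i:i + width]) for i in range(0, len(padded), width)]


def encode_png(rows):
    """Encode rows of 8-bit gray-scale pixels as the bytes of a minimal PNG file."""
    def chunk(tag, data):
        checksum = zlib.crc32(tag + data) & 0xFFFFFFFF
        return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", checksum)

    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, color type 0 (gray-scale), no interlacing.
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + row for row in rows)
    return (b"\x89PNG\r\n\x1a\n" + chunk(b"IHDR", header)
            + chunk(b"IDAT", zlib.compress(raw)) + chunk(b"IEND", b""))


# Two made-up lines in the assumed .bytes layout; one byte becomes one pixel.
sample = "00401000 56 8D 44 24\n00401004 FF ?? 00 10\n"
rows = to_rows(parse_bytes_dump(sample), width=4)
png_bytes = encode_png(rows)
```

Writing `png_bytes` to a file opened in binary mode gives a viewable image. In practice an imaging library such as Pillow would usually replace `encode_png`.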

CNN Model Architecture

The CNN model consists of the following layers, which extract features and patterns from the images and classify the malware family.

  • Convolutional Layer : 30 filters, (3 * 3) kernel size
  • Max Pooling Layer : (2 * 2) pool size
  • Convolutional Layer : 15 filters, (3 * 3) kernel size
  • Max Pooling Layer : (2 * 2) pool size
  • Dropout Layer : drops 25% of neurons
  • Flatten Layer
  • Dense/Fully Connected Layer : 128 neurons, ReLU activation function
  • Dropout Layer : drops 50% of neurons
  • Dense/Fully Connected Layer : 50 neurons, Softmax activation function
  • Dense/Fully Connected Layer : num_class neurons, Softmax activation function

The input has shape [64 * 64 * 3] : [width * height * depth]. In our case, each malware image is supplied to the network as a three-channel RGB image.
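
As a check on the layer list above, the following sketch traces the tensor shape and parameter count through the network. It assumes stride-1 convolutions without padding, non-overlapping pooling, and num_class = 25 (the number of Malimg families). These settings and the helper names are assumptions, not taken from the notebook.

```python
def conv2d(shape, filters, kernel=3):
    """Output shape and parameter count of a stride-1, unpadded convolution."""
    height, width, depth = shape
    params = (kernel * kernel * depth + 1) * filters  # +1 for each filter's bias
    return (height - kernel + 1, width - kernel + 1, filters), params


def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (no trainable parameters)."""
    height, width, depth = shape
    return (height // pool, width // pool, depth)


def dense(inputs, units):
    """Parameter count of a fully connected layer with bias."""
    return (inputs + 1) * units


def trace(input_shape=(64, 64, 3), num_class=25):
    """Return (layer name, output shape, parameter count) for each listed layer."""
    summary = []
    shape, params = conv2d(input_shape, 30)
    summary.append(("conv 30", shape, params))
    shape = max_pool(shape)
    summary.append(("max pool", shape, 0))
    shape, params = conv2d(shape, 15)
    summary.append(("conv 15", shape, params))
    shape = max_pool(shape)
    summary.append(("max pool", shape, 0))
    # Dropout layers change neither the shape nor the parameter count.
    flat = shape[0] * shape[1] * shape[2]
    summary.append(("flatten", (flat,), 0))
    for name, units in (("dense 128", 128), ("dense 50", 50),
                        ("dense num_class", num_class)):
        summary.append((name, (units,), dense(flat, units)))
        flat = units
    return summary


for name, shape, params in trace():
    print(f"{name:16s} {str(shape):14s} {params:>8d}")
print("total parameters:", sum(params for _, _, params in trace()))
```

Under these assumptions the flattened vector has 2,940 values and the network has 389,078 trainable parameters, most of them in the 128-neuron dense layer.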

Block Diagram

Block Diagram

Future Work

  • Future work will focus on evaluating more advanced models such as Inception V3, VGG16, ResNet50, CNN-SVM, MLP-SVM, and GRU-SVM.
  • We also want to convert malware images into color RGB images before classification to improve accuracy and precision.
  • We also want to implement a web-based or GUI-based tool that converts malware binary files into images and then classifies them.

malware-image-detection's People

Contributors

tanaybhadula

malware-image-detection's Issues

I need your help

When I run your code, I get this problem.
Unknown metric function: f1_m. Please ensure this object is passed to the custom_objects argument.
Can you help me solve it?
