nheijmans / malzoo

Mass static malware analysis tool

Home Page: https://www.sans.org/reading-room/whitepapers/threathunting/automated-analysis-abuse-mailbox-employees-malzoo-37207

License: GNU General Public License v2.0

Python 48.70% Shell 1.40% YARA 48.40% Dockerfile 1.50%
python wiki-page splunk mongodb elasticsearch malware-analysis automation email-parsing

malzoo's Introduction

What is MalZoo?

MalZoo is a mass static malware analysis tool that collects information about samples in a Mongo database and moves the malware samples to a repository directory based on the first four characters of the MD5 hash. It was built as an internship project to analyze sample sets of 50 GB or more (e.g. from http://virusshare.com).
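
The repository layout described above can be sketched as follows. This is a minimal illustration and not MalZoo's actual code: the function name `repository_path` and the `storage_root` parameter are hypothetical, and using the full MD5 as the file name inside the subdirectory is an assumption.

```python
import hashlib
import os


def repository_path(sample_bytes, storage_root="storage"):
    """Return (md5_hex, destination_path) for a sample.

    Hypothetical helper: the destination directory is named after the
    first four characters of the MD5 hash, as the README describes.
    Storing the file under its full MD5 is an assumption.
    """
    md5_hex = hashlib.md5(sample_bytes).hexdigest()
    return md5_hex, os.path.join(storage_root, md5_hex[:4], md5_hex)


if __name__ == "__main__":
    digest, path = repository_path(b"example sample contents")
    print(digest, path)
```

Bucketing by a hash prefix keeps any single directory from growing too large when a sample set contains many thousands of files.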

A few examples of what it can be used for:

  • Use the collected information to visualize the results (e.g. see the most used compiler languages, packers, etc.)
  • Gather intel on large open source malware repositories (the original intent of the project)
  • Monitor a mailbox, analyze the emails and attachments

Installation information on VMs and bare metal

For more information on installation and collection of data, check out the Wiki of this repository.

Cloud Serverless deployment

For the deployment in the AWS Cloud with a Serverless architecture, check out the repository Malzoo Serverless for an auto-deployment solution.

Docker container deployment

If you would like to deploy the Malzoo project in a Docker container, you can start by pulling the image from Docker Hub:

docker pull statixs/malzoo

Then start a container from that image. More detailed instructions are given in the Docker section further below.

Information collected

See the wiki page Information collected to find out which data is collected for which sample.

Installation

See the wiki page Installation to install MalZoo. The best option is to use the automatic installation script bootstrap.sh. Once it has finished running, you only have to execute:

export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

Configuration

After the installation you need to adjust the configuration file malzoo.conf in the config directory. If you are using the Splunk functionality, create an HTTP Event Collector token in your Splunk instance and copy the token into the Splunk section of the configuration file, replacing the xxx placeholders.
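
To show what the token is used for, here is a minimal sketch of a request to Splunk's HTTP Event Collector. It does not reproduce how MalZoo itself talks to Splunk: the helper name `build_hec_request`, the host name, the `sourcetype` value, and the event fields are all placeholders chosen for illustration. The request is only built, not sent.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype="malzoo"):
    """Build (but do not send) a request for Splunk's HTTP Event Collector.

    base_url, token and sourcetype are placeholders for illustration.
    """
    payload = json.dumps({"event": event, "sourcetype": sourcetype}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=payload,
        headers={
            # HEC expects the token in this form: "Splunk <token>".
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_hec_request(
        "https://splunk.example.com:8088",        # hypothetical host
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",  # token placeholder
        {"md5": "f8908ca9e854b7afc7cd11b74dfd4c4f", "sample_type": "attachment"},
    )
    print(request.full_url)
    # urllib.request.urlopen(request) would actually send the event.
```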

Usage

See this Wiki page on how to use Malzoo as an application. Below is the description on how to use the Docker image.

Docker

Pull the image from Docker Hub with the command

docker pull statixs/malzoo:latest

Environment list

The environment list must contain two items so that Malzoo can find its Python virtual environment and the library used for calculating fuzzy hashes. The environment file (env.list in the commands below) should contain:

PYTHONPATH=/home/malzoo/malzoo
LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

The first variable makes the virtual environment the Python path, so all the dependencies are found. The second variable is for the Fuzzy hash library to be found correctly.

Start a container with persistent logs

If you want to have the logs persistently stored on the host OS, use the following command

docker container run --detach --publish 127.0.0.1:1338:1338/tcp --name malzoo_engine --env-file env.list --rm --volume=./malzoo-logs:/home/malzoo/malzoo/logs/ statixs/malzoo:latest

This links the folder malzoo-logs on the host to the Malzoo log folder in the container. The logs can then be collected in your favorite data analysis tool. Malzoo stores its data as JSON by default. If the data should be sent to one of the other receivers, such as Splunk or MongoDB, you can configure that in the Malzoo configuration file.
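
As an illustration of collecting the persisted logs, the sketch below counts records by one field. It assumes the log file holds one JSON object per line, which the README does not confirm; the function name `count_by_field` is hypothetical, and the field `sample_type` is taken from the debug log quoted in the issues further down.

```python
import json
from collections import Counter


def count_by_field(lines, field="sample_type"):
    """Count log records by one field, skipping lines that are not JSON objects."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            continue  # not valid JSON; ignore the line rather than abort
        if isinstance(record, dict):
            counts[record.get(field, "unknown")] += 1
    return counts


if __name__ == "__main__":
    # In-memory stand-in for a log file from the mounted malzoo-logs folder.
    sample_lines = [
        '{"md5": "f8908ca9e854b7afc7cd11b74dfd4c4f", "sample_type": "attachment"}',
        "this line is not JSON",
    ]
    print(count_by_field(sample_lines))
```

Passing an open file object instead of a list works the same way, since the function only iterates over lines.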

Start a container with persistent sample storage

Samples are stored by default in the $HOME/malzoo/storage/ folder. If you want those to be persistent on the host OS, use the following command

docker container run --detach --publish 127.0.0.1:1338:1338/tcp --name malzoo_engine --env-file env.list --rm --volume=./malzoo-samples:/home/malzoo/malzoo/storage/ statixs/malzoo:latest

The samples are stored within a subfolder named after the first four characters of the hash. This option lets you build a malware repository persistently while using Malzoo as the analysis engine to receive, analyze, and store samples. By combining persistent logs and persistent samples, the Malzoo engine containers can be scaled up when sample submission rates are high and stopped during quiet hours.

Credit

Special thanks go to the Viper project (http://viper.li). I learned a lot about automating malware analysis from this project. Also a big thanks to all the developers of the modules and software used, for making them available for everyone to use.

License

This project is released under the GPL 2.0 License. See the LICENSE for details.

malzoo's People

Contributors

nheijmans, tcwaddell

malzoo's Issues

POP3 Instead of IMAP - Feature Request

Are we able to use POP3 instead of IMAP for the connection to the Exchange Mailbox? Or, can we specify the IMAP connection must be Secured IMAP? From looking at the configuration file, it doesn't look like either option is available.

'Monitor' object has no attribute '_popen'

I'm getting this error when trying to use the 'monitor directory' feature. From some minor debugging, it appears that it happens at the 'monitor.daemon=True' line (malzoo.py line 99 in the current master). The other methods seem to use Process in this section rather than the object itself - perhaps this needs to be done for directory monitoring as well?
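
One common way to hit this error in Python is a `multiprocessing.Process` subclass whose `__init__` never calls the parent initializer, so the internal `_popen` attribute does not exist when `daemon` is set. Whether that is what happens in malzoo.py has not been verified here; the `Monitor` class below is a hypothetical stand-in, written for Python 3, that shows the pattern that avoids it.

```python
import multiprocessing


class Monitor(multiprocessing.Process):
    """Hypothetical stand-in for a directory monitor process."""

    def __init__(self, directory):
        # Without this call the Process internals (including _popen) are
        # never created, and setting .daemon raises AttributeError.
        super().__init__()
        self.directory = directory

    def run(self):
        # Placeholder for the real monitoring loop.
        print("monitoring", self.directory)


if __name__ == "__main__":
    monitor = Monitor("/tmp/samples")
    monitor.daemon = True  # safe because Process.__init__ has run
    monitor.start()
    monitor.join()
```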

Not parsing emails

The code is not parsing emails. The cause is that in emailworker.py, in the function "process", the variable "Email" is a dict, not a str. After I changed it as shown below, it started parsing:

if isinstance(Email['filename'], str):
    msg = email.message_from_string(Email['filename'])
else:
    msg = Email['filename']

The exception handlers also need to be improved. I spent a few hours finding where the issue was.

Recursive Directory

RFE request

It would be useful for the directory monitor to have a recursive option. This way we could monitor not only the main directory but also any directory inside it.
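
A minimal sketch of what such an option could look like, using `os.walk` from the standard library. The function name `list_files` and the `recursive` flag are hypothetical and not part of Malzoo.

```python
import os
import tempfile


def list_files(root, recursive=False):
    """Return sorted file paths under root; descend into subdirectories only if recursive."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        found.extend(os.path.join(dirpath, name) for name in filenames)
        if not recursive:
            dirnames[:] = []  # prune in place so os.walk stops descending
    return sorted(found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "nested"))
        for relative in ("top.bin", os.path.join("nested", "deep.bin")):
            with open(os.path.join(root, relative), "wb") as handle:
                handle.write(b"sample")
        print(list_files(root))                  # top-level files only
        print(list_files(root, recursive=True))  # includes nested/deep.bin
```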

Sample processing argument mismatch

Malzoo produces the following error when attempting to process an email with no attachment.

root@ubuntu:/opt/malzoo# python malzoo.py
[*] Malzoo runs in monitor mode now!
[*] Starting components...
[+] Starting API supplier!
[+] Starting mail supplier!
Process EmailWorker-9:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/opt/malzoo/malzoo/common/abstract.py", line 62, in run
    self.process(sample)
TypeError: process() takes exactly 3 arguments (2 given)

URLs from emails not getting sent to Cuckoo

I'm trying to use Malzoo to extract URLs and attachments from emails and run them through Cuckoo. Attachments work fine, whether an email has one or several, but URLs don't, and the debug log shows the following.

2017-10-25 14:05:04,387 - emailworker - {'sha1': '547c8063df4f73c12f5e90d7a22a4bc69b5a32b6', 'id_tag': 'malzoo', 'submit_date': 1508936704, 'msg_id': 'CA+wG9XrKrshrLJfaDUyzU=QSGHFx5MKeAuaQJddPC9HiR1-R0Q@mail.gmail.com', 'sample_type': 'attachment', 'filename': 'f8908ca9e854b7afc7cd11b74dfd4c4f', 'md5': 'f8908ca9e854b7afc7cd11b74dfd4c4f'} - local variable 'urls' referenced before assignment

And the URL doesn't get scanned in Cuckoo.

Any ideas? Cuckoo is the latest version (2.0.4) and Malzoo is the latest from GitHub.

Phil
