maltindex

Mal Tindex is an open source tool for indexing binaries and helping to attribute malware campaigns. It was first presented at the EuskalHack 2017 conference in Donostia (Basque Country). Linux, Windows and Mac OS X should all be supported platforms, but indexing has only been tested on Linux.

How does it work?

For now, it uses IDA and Diaphora to export a set of signatures for each function found in each indexed binary to a database. The "rarest" functions are then stored in various tables, which are used to find "rare" coincidences between malware samples that may help to attribute actors and malware campaigns.
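The idea of a "rare" coincidence can be illustrated with a minimal Python 3 sketch. It is not the tool's actual algorithm or database schema: the function name `rare_matches`, the `max_samples` threshold and the toy signature strings are all assumptions made for illustration. A signature counts as "rare" here when it is shared by at least two samples but by no more than `max_samples` of them.

```python
from collections import defaultdict
from itertools import combinations


def rare_matches(samples, max_samples=2):
    """Group rare shared function signatures by pair of samples.

    `samples` maps a sample identifier (e.g. an MD5) to the set of
    function signatures extracted from it. Both the input shape and the
    rarity rule are hypothetical, chosen only to illustrate the idea.
    """
    # Record which samples contain each signature.
    seen_in = defaultdict(set)
    for sample_id, signatures in samples.items():
        for signature in signatures:
            seen_in[signature].add(sample_id)

    # Keep signatures shared by 2..max_samples samples and index them
    # by every pair of samples that has them in common.
    matches = defaultdict(set)
    for signature, owners in seen_in.items():
        if 2 <= len(owners) <= max_samples:
            for pair in combinations(sorted(owners), 2):
                matches[pair].add(signature)
    return dict(matches)


# Toy dataset: "common" appears everywhere, so it is not rare.
samples = {
    "sample_a": {"f1", "f2", "common"},
    "sample_b": {"f2", "f3", "common"},
    "sample_c": {"f4", "common"},
}
print(rare_matches(samples))  # {('sample_a', 'sample_b'): {'f2'}}
```

Signatures present in most of the dataset (library code, compiler stubs) are filtered out by the upper bound, which is why a large and varied dataset matters for this kind of analysis.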

How to use it

First of all, you need a version of IDA supported by Diaphora (6.8, 6.9 or 6.95) and Python. Once the requirements are in place, create a *.cfg file that specifies the database settings. This first proof-of-concept only supports SQLite, but I will add support "soon" for any database supported by web.py. The *.cfg file has the following form:

########################################################################
# Example configuration for SQLite3
########################################################################
[database]
dbn=sqlite
# Database name
db=/path/to/your/to/be/created/database/db_name.sqlite
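To check a configuration file before starting a long indexing run, it can be parsed with Python's standard `configparser` module. The sketch below assumes Python 3. The helper name `load_db_settings` is hypothetical and this is not how the tool itself reads the file. Only the `[database]` section, its `dbn` and `db` keys, and the `DIAPHORA_DB_CONFIG` environment variable come from this document.

```python
import configparser
import os


def load_db_settings(cfg_path=None):
    """Return the [database] settings from a Mal Tindex style *.cfg file.

    If no path is given, fall back to the DIAPHORA_DB_CONFIG environment
    variable. This helper is an illustration, not part of the tool.
    """
    cfg_path = cfg_path or os.environ["DIAPHORA_DB_CONFIG"]
    parser = configparser.ConfigParser()
    # parser.read() returns the list of files it managed to parse.
    if not parser.read(cfg_path):
        raise FileNotFoundError(cfg_path)
    section = parser["database"]  # raises KeyError if the section is missing
    return {"dbn": section["dbn"], "db": section["db"]}
```

Calling `load_db_settings("/path/to/cfg")` on the example file above would return the `dbn` and `db` values as a dictionary. Lines starting with `#` are treated as comments by `configparser`.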

Once you have created your *.cfg file, you can run the following commands:

$ export DIAPHORA_DB_CONFIG=/path/to/cfg
$ diaphora_index_batch.py /dir/where/ida/is/installed samples_dir

The batch script then finds every executable binary in samples_dir and all of its subdirectories, launches the appropriate IDA program (i.e., idaq or idaq64) for each one, and indexes it. Once all the binaries are indexed and all tables populated, you can use the command line tool "maltindex.py" to analyse your dataset:

$ maltindex.py <database path>
MalTindex> match MD5
(...it will show all matches in the dataset for functions found in the binary with that specific MD5...)
MalTindex> match MD5_1 MD5_2
(...it will show all matches between both MD5s...)
MalTindex> unique MD5
(...it will show unique rare matches for the given MD5, if any...)

And that's it! Remember that your dataset must be large in order to get significant results. According to some friends, the minimum required number of binaries (both goodware and malware) is around 1 million samples.
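When preparing a dataset, it can help to preview which files the batch indexing step would pick up and which IDA program each would need. The Python 3 sketch below is an assumption about how this could be done, not the logic of diaphora_index_batch.py: it walks a directory recursively, recognises only ELF and PE headers, and maps 32-bit files to idaq and 64-bit files to idaq64. The function names `ida_binary_for` and `find_targets` are hypothetical.

```python
import os
import struct


def ida_binary_for(path):
    """Return 'idaq', 'idaq64', or None for files that are not ELF or PE."""
    with open(path, "rb") as f:
        head = f.read(0x40)
        # ELF: byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit files.
        if head[:4] == b"\x7fELF" and len(head) > 4:
            return {1: "idaq", 2: "idaq64"}.get(head[4])
        # PE: the DOS header stores the offset of the PE header at 0x3C.
        if head[:2] == b"MZ" and len(head) >= 0x40:
            (pe_offset,) = struct.unpack_from("<I", head, 0x3C)
            f.seek(pe_offset)
            # "PE\0\0" + 20-byte COFF header + 2-byte optional header magic.
            pe = f.read(26)
            if len(pe) == 26 and pe[:4] == b"PE\0\0":
                (magic,) = struct.unpack_from("<H", pe, 24)
                # 0x10B is PE32 (32-bit), 0x20B is PE32+ (64-bit).
                return {0x10B: "idaq", 0x20B: "idaq64"}.get(magic)
    return None


def find_targets(samples_dir):
    """Yield (path, ida_program) for every recognised binary, recursively."""
    for root, _dirs, files in os.walk(samples_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            ida = ida_binary_for(path)
            if ida is not None:
                yield path, ida
```

Iterating over `find_targets("samples_dir")` gives a quick count of how many samples of each bitness a run would cover. Mach-O and other formats are deliberately left out of this sketch.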

Contributors

joxeankoret
