EXTRACTOR

EXTRACTOR extracts system-level attack behavior from unstructured threat reports. The extracted behavior is represented as a directed graph, in which nodes represent system entities and edges represent system calls. EXTRACTOR uses Natural Language Processing (NLP) techniques to transform a raw threat report into this graph representation.
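
To make the graph representation concrete, here is a minimal sketch (not EXTRACTOR's own code) that renders a few hand-written (subject, system call, object) triples as Graphviz DOT text using only the standard library. The triples, the `quote` helper, and the `to_dot` helper are hypothetical and exist only for illustration; the DOT layout EXTRACTOR actually writes may differ.

```python
def quote(text):
    """Wrap text in double quotes for DOT, escaping embedded quotes."""
    return '"' + text.replace('"', '\\"') + '"'


def to_dot(triples, name="attack"):
    """Render (subject, syscall, object) triples as a DOT digraph string.

    Nodes are system entities; each edge is labeled with a system call.
    """
    lines = ["digraph %s {" % name]
    for subj, call, obj in triples:
        lines.append("    %s -> %s [label=%s];" % (quote(subj), quote(obj), quote(call)))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical attack behavior, invented for this example.
example_triples = [
    ("dropper.exe", "write", "/tmp/payload"),
    ("dropper.exe", "connect", "10.0.0.5"),
]

print(to_dot(example_triples))
```

The printed text can be saved to a `.dot` file and rendered with the Graphviz `dot` command.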

Instructions

Requirements

This repository uses Python 3.5+ and has the following requirements:

nltk == 3.4.5
spaCy == 2.1.0
allennlp == 0.8.4
neuralcoref == 4.0.0
graphviz == 0.13.2
textblob == 0.15.3
pattern == 3.6
numpy == 1.18.1

You can install the requirements directly with pip install -r requirements.txt.

Usage

Run EXTRACTOR with python3 main.py [-h] [--asterisk ASTERISK] [--crf CRF] [--rmdup RMDUP] [--elip ELIP] [--gname GNAME] [--input_file INPUT_FILE].

Each argument yields a different representation of the attack behavior, depending on the use case. [--asterisk true] enables abstraction: anything that is not recognized as an IOC or system entity is replaced with a wildcard. The resulting representation can be used to search audit logs.

[--crf true/false] enables or disables the co-reference resolution module.

[--rmdup true/false] enables or disables removal of duplicate nodes and edges.

[--elip true/false] chooses whether to replace ellipsis subjects (subjects omitted from a sentence) with the surrounding subject.

[--input_file path/filename.txt] passes the input text file to the application.

[--gname graph_name] specifies the name of the output graph (two files will be created, e.g., graph.pdf and graph.dot).
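
The effect of the wildcard abstraction and duplicate removal can be illustrated with a small sketch. This is one plausible reading of those two options, not EXTRACTOR's implementation: the IOC patterns, the (subject, system call, object) triple format, and the helper names are all assumptions made for this example.

```python
import re

# Hypothetical IOC patterns, for illustration only.
# EXTRACTOR's real criteria for IOCs/system entities may differ.
IOC_PATTERNS = [
    re.compile(r"\d{1,3}(\.\d{1,3}){3}"),                           # IPv4-like address
    re.compile(r"(/[^/\s]+)+"),                                     # Unix-style absolute path
    re.compile(r"[\w.-]+\.(exe|dll|sh|py|txt)", re.IGNORECASE),     # file name with a common extension
]


def looks_like_ioc(text):
    """Return True if the whole string matches one of the sample patterns."""
    return any(p.fullmatch(text) for p in IOC_PATTERNS)


def abstract_triple(triple):
    """Replace non-IOC subjects and objects with the wildcard '*'."""
    subj, call, obj = triple
    return (
        subj if looks_like_ioc(subj) else "*",
        call,
        obj if looks_like_ioc(obj) else "*",
    )


def remove_duplicates(triples):
    """Drop repeated triples while keeping first-seen order."""
    seen = set()
    unique = []
    for t in triples:
        if t not in seen:
            seen.add(t)
            unique.append(t)
    return unique


# Invented triples, as they might come out of a threat report.
triples = [
    ("the attacker", "connect", "10.0.0.5"),
    ("dropper.exe", "write", "/tmp/payload"),
    ("the malware", "connect", "10.0.0.5"),
]

abstracted = [abstract_triple(t) for t in triples]
unique = remove_duplicates(abstracted)
print(unique)
```

In this sketch, abstraction turns the first and third triples into the same wildcard edge, so duplicate removal then collapses them into one.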

Example

python3 main.py --asterisk true --crf true --rmdup true --elip true --input_file input.txt --gname mygraph
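
When processing several reports, the same command line can be assembled from Python. The sketch below only builds the argument list from the flags documented above; the `build_command` helper is hypothetical, and it assumes that every boolean flag (including `--asterisk`) accepts the literal strings `true` and `false`.

```python
def build_command(input_file, gname, asterisk=True, crf=True, rmdup=True, elip=True):
    """Build the argv list for one EXTRACTOR run (hypothetical helper)."""

    def flag(value):
        # Assumption: main.py expects lowercase "true"/"false" strings.
        return "true" if value else "false"

    return [
        "python3", "main.py",
        "--asterisk", flag(asterisk),
        "--crf", flag(crf),
        "--rmdup", flag(rmdup),
        "--elip", flag(elip),
        "--input_file", input_file,
        "--gname", gname,
    ]


cmd = build_command("input.txt", "mygraph")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.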

Summarizer

To perform prediction (text summarization), first convert the txt file to the required TSV format using python3 txt2tsv.py.

Prediction

To perform the extractive summarization, run python3 prediction.py.

Contributors

ksatvat, devdolapo
