FLASH

Welcome to the FLASH repository. Here we provide the implementation of the method introduced in our research paper "FLASH: A Comprehensive Approach to Intrusion Detection via Provenance Graph Representation Learning". The paper is available at this Link.

Prerequisites

To run FLASH, you need Jupyter Notebook installed. More detailed instructions on installing and running Jupyter notebooks can be found at this Link.
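If Jupyter is not already installed and a Python environment with pip is available, one common way to install the classic Jupyter Notebook is:

pip install notebook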

Installation

We have provided a requirements.txt file detailing the specific dependency versions. Use the following command to install the required libraries.

pip install -r requirements.txt

Datasets

FLASH is evaluated on open-source datasets from DARPA and the research community. You can access these datasets using the following links.

DARPA OpTC

https://github.com/FiveDirections/OpTC-data

DARPA E3

https://drive.google.com/drive/folders/1fOCY3ERsEmXmvDekG-LUUSjfWs6TRdp

Streamspot

https://github.com/sbustreamspot/sbustreamspot-data

Unicorn

https://github.com/margoseltzer/shellshock-apt

Code Structure

The parser for each dataset is integrated within its respective Jupyter Notebook. For every dataset there is a dedicated notebook designed for evaluation; it handles downloading, parsing, and evaluating that dataset. We provide pre-trained model weights so evaluations can be run directly. Each notebook exposes parameters that control the different components of the system (a hypothetical configuration cell is sketched below); more detailed instructions are given in the notebooks themselves. After a notebook finishes running, the results are displayed at the end of the execution.
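For illustration only, a configuration cell near the top of a notebook might look like the sketch below. The variable names here are hypothetical and do not match the notebooks' actual identifiers; refer to the notebook you are running for the real parameters.

# Hypothetical configuration cell -- names are illustrative, not the notebooks' actual variables
DATA_DIR = "./data"            # where the dataset is downloaded and parsed
USE_PRETRAINED = True          # evaluate with the provided model weights instead of retraining
TRAIN_FROM_SCRATCH = not USE_PRETRAINED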

Contributing

We welcome all feedback and contributions. If you wish to file a bug, propose an enhancement, or ask other questions, please open a GitHub Issue. If you'd like to contribute code, please open a Pull Request.

BibTeX

@inproceedings{flash2024,
  title = {FLASH: A Comprehensive Approach to Intrusion Detection via Provenance Graph Representation Learning},
  author = {Rehman, Mati Ur and Ahmadi, Hadi and Hassan, Wajih Ul},
  booktitle = {IEEE Symposium on Security and Privacy (S\&P)},
  year = {2024},
}

Issues

Suggestion to improve efficiency in the Unicorn notebook

In the Unicorn notebook, and specifically in the prepare_graph function, nodes.keys() and index() are called twice per edge. These are expensive calls, and as a result prepare_graph takes over 2 hours even on a very powerful machine:

edge_index = [[], []]
for src, dst in edges:
    # list(nodes.keys()).index(...) rebuilds the key list and does a linear scan for every edge
    src_index = list(nodes.keys()).index(src)
    dst_index = list(nodes.keys()).index(dst)
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)

An alternative would be to precompute all the indices, store them in a hash map, and build the graph in a few seconds. For example:

from tqdm import tqdm

# Build the node -> index mapping once; each subsequent lookup is O(1)
node_index_map = {node: i for i, node in enumerate(nodes.keys())}

edge_index = [[], []]
for src, dst in tqdm(edges):
    src_index = node_index_map[src]
    dst_index = node_index_map[dst]
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)
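With the precomputed map, each edge lookup is a constant-time dictionary access instead of a linear scan over all nodes, so the loop scales with the number of edges rather than with edges times nodes.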

Clarification regarding the benign data files used for training on the DARPA OpTC dataset

Hi,

I am trying to download the benign data files of DARPA OpTC to train the models from the ground up.

I want to confirm the benign data files used for training: did you use the six days of benign activity for the same hosts used for evaluation {0201, 0501, 005}?
Could you specify the exact data files for the training set, as you did for the test set?

Regards,

Clarification Regarding Experimental Results

Hello, I tried to reproduce the paper's results for the DARPA OpTC dataset using the pre-trained model weights. I got the following results:

Attack    Prec.  Rec.  F-score  TP/FP/FN/TN
Attack1   0.81   0.95  0.87     59/14/3/204,957
Attack2   0.94   0.92  0.93     378/22/32/638,829
Attack3   0.93   0.92  0.93     167/12/14/179,390
