FLASH

Welcome to the FLASH repository. Here we provide the implementation of the method introduced in our research paper "FLASH: A Comprehensive Approach to Intrusion Detection via Provenance Graph Representation Learning". The paper is available at this Link.

Prerequisites

To run FLASH, you need Jupyter Notebook installed. More detailed instructions on installing and running Jupyter notebooks can be found at this Link.
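If Jupyter is not already installed and a Python environment with pip is available, one common way to install the classic Jupyter Notebook is:

pip install notebook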

Installation

We have provided a requirements.txt file detailing the specific dependency versions. Use the following command to install the required libraries.

pip install -r requirements.txt

Datasets

FLASH is evaluated on open-source datasets from DARPA and the research community. You can access these datasets using the following links.

DARPA OpTC

https://github.com/FiveDirections/OpTC-data

DARPA E3

https://drive.google.com/drive/folders/1fOCY3ERsEmXmvDekG-LUUSjfWs6TRdp

Streamspot

https://github.com/sbustreamspot/sbustreamspot-data

Unicorn

https://github.com/margoseltzer/shellshock-apt

Code Structure

The parser for each dataset is integrated within its respective Jupyter Notebook. For every dataset there is a dedicated notebook designed for evaluation; it handles downloading, parsing, and evaluating that dataset. We provide pre-trained model weights so evaluations can be run directly. Each notebook exposes parameters that control the different components of the system (a hypothetical configuration cell is sketched below); more detailed instructions are given in the notebooks themselves. After a notebook finishes running, the results are displayed at the end of the execution.
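For illustration only, a configuration cell near the top of a notebook might look like the sketch below. The variable names here are hypothetical and do not match the notebooks' actual identifiers; refer to the notebook you are running for the real parameters.

# Hypothetical configuration cell -- names are illustrative, not the notebooks' actual variables
DATA_DIR = "./data"            # where the dataset is downloaded and parsed
USE_PRETRAINED = True          # evaluate with the provided model weights instead of retraining
TRAIN_FROM_SCRATCH = not USE_PRETRAINED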

Contributing

We welcome all feedback and contributions. If you wish to file a bug, propose an enhancement, or ask other questions, please open a GitHub Issue. If you'd like to contribute code, please open a Pull Request.

BibTeX

@inproceedings{flash2024,
  title = {FLASH: A Comprehensive Approach to Intrusion Detection via Provenance Graph Representation Learning},
  author = {Rehman, Mati Ur and Ahmadi, Hadi and Hassan, Wajih Ul},
  booktitle = {IEEE Symposium on Security and Privacy (S\&P)},
  year = {2024},
}

Issues

Suggestion to improve efficiency in the Unicorn notebook

In the Unicorn notebook, and specifically in the prepare_graph function, nodes.keys() and index() are called twice per edge. These are expensive calls, and as a result prepare_graph takes over 2 hours even on a very powerful machine:

edge_index = [[], []]
for src, dst in edges:
    # list(nodes.keys()).index(...) rebuilds the key list and does a linear scan for every edge
    src_index = list(nodes.keys()).index(src)
    dst_index = list(nodes.keys()).index(dst)
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)

An alternative would be to precompute all the indices, store them in a hash map, and build the graph in a few seconds. For example:

from tqdm import tqdm

# Build the node -> index mapping once; each subsequent lookup is O(1)
node_index_map = {node: i for i, node in enumerate(nodes.keys())}

edge_index = [[], []]
for src, dst in tqdm(edges):
    src_index = node_index_map[src]
    dst_index = node_index_map[dst]
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)
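With the precomputed map, each edge lookup is a constant-time dictionary access instead of a linear scan over all nodes, so the loop scales with the number of edges rather than with edges times nodes.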

Clarification regarding the benign data files used for training on the DARPA OpTC dataset

Hi,

I am trying to download the benign data files of DARPA OpTC to train the models from the ground up.

I want to confirm the benign data files used for training: did you use the six days of benign activity for the same hosts used for evaluation {0201, 0501, 005}?
Could you specify the exact data files for the training set, as you did for the test set?

Regards,

Clarification Regarding Experimental Results

Hello, I tried to reproduce the paper's results for the DARPA OpTC dataset using the pre-trained model weights. I got the following results:

Attack    Prec.  Rec.  F-score  TP/FP/FN/TN
Attack1   0.81   0.95  0.87     59/14/3/204,957
Attack2   0.94   0.92  0.93     378/22/32/638,829
Attack3   0.93   0.92  0.93     167/12/14/179,390
