yadd's Issues

Add Your Own Dataset

Issue Description

📊 Task: Contribute to our project by adding your own dataset for DDoS detection. Whether it's a unique collection of packet data or network traffic captures, your dataset can enhance the diversity and robustness of our analysis.

Submission Guidelines

  1. Dataset Format:

    • Ensure your dataset is well-organized and provides clear documentation on the structure and meaning of features.
    • If your dataset is too large to upload directly, consider providing a Google Drive link or any other accessible cloud storage link.
  2. Dataset Information:

    • Include information about the nature of the data, source, and any unique characteristics it possesses.
  3. Data Standardization:

    • If possible, standardize labels and features to align with the project's conventions. Use binary labels: 0 for normal traffic and 1 for DDoS attack traffic.
  4. README Update:

    • If applicable, update the README to include details about your dataset and how it can be utilized for DDoS detection.
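
The label convention above can be illustrated with a short pandas sketch. The `Label` column name and its string values are hypothetical, purely for illustration; adapt them to your dataset's actual schema:

```python
import pandas as pd

# Hypothetical example: map string labels to the project's binary convention
# (0 = normal traffic, 1 = DDoS attack). Column name and values are illustrative.
df = pd.DataFrame({"Label": ["BENIGN", "DDoS", "BENIGN", "DDoS"]})
df["Label"] = (df["Label"] != "BENIGN").astype(int)
print(df["Label"].tolist())  # [0, 1, 0, 1]
```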

Note

  • We appreciate your contribution to our growing collection of datasets. Your submission will be valuable for fostering collaboration and advancing DDoS detection research.
  • Refer to /data/CONTRIBUTING.md for guidelines on dataset contribution and /data/README.md for instructions on updating the README.

Developing the Model

DDoS Detection Model Challenge

Issue Description

🚀 Challenge: Compete to create the most accurate model for detecting DDoS attacks! Participants are tasked with developing a machine learning model that assigns labels of 1 for DDoS attacks and 0 for normal traffic based on packet features. Additionally, participants are required to maintain a list of source IPs associated with detected DDoS packets.

Evaluation Criteria

  • Model Accuracy: The accuracy of the machine learning model in correctly classifying packets.
  • False Positive Rate: Minimize false positives to enhance precision.
  • List of Source IPs: Maintain an accurate list of source IPs for all packets classified as DDoS attacks.
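
The first two criteria can be computed directly from confusion-matrix counts. A minimal sketch, assuming binary labels as defined above (the toy `y_true`/`y_pred` values are illustrative only):

```python
# Accuracy and false positive rate from true vs. predicted binary labels
# (1 = DDoS, 0 = normal). Toy data for illustration.
y_true = [1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # true positives
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))  # true negatives
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives

accuracy = (tp + tn) / len(y_true)  # fraction of packets classified correctly
fpr = fp / (fp + tn)                # fraction of normal packets flagged as DDoS
```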

Data Details

  • Three datasets are provided for training and evaluation.
  • If you wish to add another dataset, feel free to contribute to the other open-for-all issue and get your data added.
  • Features include packet metadata such as IP addresses, TCP/UDP ports, and flags.
  • Labels should be binary, with 1 denoting DDoS attacks and 0 denoting normal traffic.
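
The source-IP bookkeeping requirement can be sketched as follows. The `packets` records and the threshold-based `predict` function are hypothetical stand-ins for real packet features and a trained model's `predict` method:

```python
# Alongside binary predictions, keep the source IPs of every packet
# classified as DDoS (label 1). Data and prediction rule are illustrative.
packets = [
    {"src_ip": "10.0.0.1", "syn_rate": 950},
    {"src_ip": "10.0.0.2", "syn_rate": 3},
    {"src_ip": "10.0.0.1", "syn_rate": 1200},
]

def predict(packet):
    # Placeholder for model.predict(): flag abnormally high SYN rates.
    return 1 if packet["syn_rate"] > 100 else 0

labels = [predict(p) for p in packets]
ddos_ips = sorted({p["src_ip"] for p, y in zip(packets, labels) if y == 1})
print(labels, ddos_ips)  # [1, 0, 1] ['10.0.0.1']
```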

Submission Guidelines

  • Participants are encouraged to use a variety of machine learning algorithms and techniques.
  • Submissions should include a Jupyter notebook or Python script containing the model implementation.
  • Clearly document and comment your code for transparency and understanding.

Reward

  • The participant with the most accurate model and the best-maintained list of source IPs will be recognized, and only their PR will be merged.

Additional Information

  • Participants can discuss their approaches, findings, and seek clarifications in the project's discussions.
  • Please adhere to project coding standards and guidelines during implementation.

Note

  • Refer to /data/CONTRIBUTING.md for information on dataset usage and /scripts/model_evaluation.ipynb for existing evaluation conventions.

Analyse the sample dataset and submit a detailed analysis of the various DDoS attack signatures and vectors spotted in the data

DDoS Attack Signature Analysis

Overview

This analysis focuses on identifying DDoS attack signatures in a sample network dataset. We aim to spot anomalous patterns indicating potential DDoS attacks.

Dataset

The dataset includes packet headers, payload content, and relevant features, covering both normal and potential attack instances.

Add the analysis to the reports/data folder in PDF format.

Steps

  1. Data Exploration:

    • Understand dataset structure.
    • Identify outliers or irregularities.
  2. Feature Extraction:

    • Extract key features.
    • Include statistical measures and traffic attributes.
  3. Signature Definition:

    • Define potential DDoS attack signatures.
    • Focus on packet headers, payload content, and traffic patterns.
  4. Signature Spotting:

    • Analyze instances with detected anomalies.
    • Identify potential DDoS attack signatures.
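
As one concrete instance of steps 2–4, a volumetric (flood) signature shows up as a disproportionate packet count from a single source IP. A minimal sketch with hypothetical data and threshold:

```python
from collections import Counter

# Count packets per source IP and flag sources exceeding a threshold.
# The IP list and threshold are illustrative, not tuned values.
src_ips = ["10.0.0.5"] * 8 + ["192.168.1.2", "192.168.1.3"]
counts = Counter(src_ips)
threshold = 5
suspects = [ip for ip, n in counts.items() if n > threshold]
print(suspects)  # ['10.0.0.5']
```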

Reminder: the analysis MUST be submitted in PDF format.

Add Dataset

Fork and Add Datasets: Fork the repository, clone it locally, create a new branch, add datasets in /data/ following guidelines, and submit a pull request.

Contribution Guidelines: Refer to /data/CONTRIBUTING.md for detailed instructions on dataset format and ensure adherence to specified column structure.

Make sure your data is NOT a duplicate of data already in the repository. Duplicate-data PRs will be penalised.

Data Standardization and Cleaning

Data Standardization and Cleaning Notebook

Issue Description

📄 Task: Develop a Python notebook to facilitate the integration of three provided datasets. The goal is to standardize labels and features, ensuring consistency across all datasets. Additionally, implement data cleaning procedures to enhance data quality. Finally, store the processed data in variables X (features) and Y (labels), with labels being binary (0 or 1) to represent the presence or absence of a DDoS attack.

Implementation Steps

  1. Data Reading:

    • Create functions to read data from the three datasets, considering their respective formats.
  2. Standardization:

    • Standardize labels and features across datasets to ensure uniformity.
    • Align features to match their corresponding counterparts in other datasets.
    • Then concatenate all three datasets into a single dataset.
  3. Data Cleaning:

    • Implement cleaning procedures to handle missing values, outliers, or any inconsistencies in the datasets.
  4. Label Standardization:

    • Convert labels to binary values (0 or 1) based on the presence or absence of DDoS attacks.
  5. Variable Assignment:

    • Store the standardized and cleaned data in variables, assigning features to variable X and labels to variable Y.
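
The steps above can be sketched end to end with pandas. In-memory frames stand in for the three dataset files, and the column names (`src_port`, `label`) and label strings are illustrative assumptions, not the actual schemas:

```python
import pandas as pd

# Toy stand-ins for two of the three datasets (step 1 would read files instead).
d1 = pd.DataFrame({"src_port": [80, 443], "label": ["BENIGN", "DDoS"]})
d2 = pd.DataFrame({"src_port": [22, 53], "label": ["normal", "attack"]})

def standardize(df):
    # Steps 2 and 4: align features and map labels to the binary convention.
    df = df.copy()
    df["label"] = df["label"].str.lower().isin(["ddos", "attack"]).astype(int)
    return df

# Step 2 (concatenation) and step 3 (basic cleaning).
data = pd.concat([standardize(d) for d in (d1, d2)], ignore_index=True)
data = data.dropna()

# Step 5: assign features to X and binary labels to Y.
X = data.drop(columns=["label"])
Y = data["label"]
print(Y.tolist())  # [0, 1, 0, 1]
```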

Additional Considerations

  • Documentation:

    • Provide inline comments and documentation for each step to enhance code readability.
  • Testing:

    • Include testing steps to ensure the notebook produces the expected results.
  • Guidelines:

    • Adhere to the project's coding and documentation guidelines.

Add 3rd dataset

Fork and Add Datasets: Fork the repository, clone it locally, create a new branch, add datasets in /data/ following guidelines, and submit a pull request.

Contribution Guidelines: Refer to /data/CONTRIBUTING.md for detailed instructions on dataset format and ensure adherence to specified column structure.

Make sure your data is NOT a duplicate of data already in the repository. Duplicate-data PRs will be penalised.
