hashehri / network-traffic-classification-unsw-nb15

Binary classification for detecting network intrusion attacks, emphasizing how a network packet with certain features may have the potential to become a serious threat to the network.

machine-learning deep-learning network-analysis network-security ai sklearn xgboost-algorithm random-forest knn-classification decsion-tree pickle logistic-regression voting-classifier stacking-ensemble python3 classification-algorithims

Detecting Malicious Attacks Through The Network Traffic

By Hatim Alshehri & Fahad Alnafisa

Abstract:

The network intrusion detection system (NIDS) has become an essential tool for detecting attacks in computer networks and protecting critical information and systems. The effectiveness of an NIDS is usually measured by a high number of detected attacks and a low number of false alarms. Machine learning techniques are widely used for building robust intrusion detection systems that adapt to the continuous changes in network attacks. However, comparing such machine learning techniques needs more investigation to show their efficiency and appropriateness for detecting sophisticated malicious attacks. The goal of this project is to detect malicious attacks in network traffic by building a machine learning model that classifies each traffic record as normal or attack.
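The two effectiveness measures mentioned in the abstract are commonly expressed as a detection rate (share of attacks that are flagged) and a false alarm rate (share of normal records that are flagged). The following is a minimal sketch of that calculation, not code taken from the project notebooks; the function name `detection_metrics` and the toy labels are assumptions made for illustration.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_alarm_rate) for binary labels.

    Labels follow the data set's convention: 1 = attack, 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, not flagged

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
dr, far = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.2f}, false alarm rate = {far:.2f}")
```

A good classifier pushes the first number toward 1 and the second toward 0; comparing models on accuracy alone can hide a poor trade-off between the two.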

What is IDS (Intrusion Detection System)?

Intrusion Detection Systems (IDS) exist to help prevent attacks on, and infiltration of, networks that might affect an organization. They monitor network traffic for suspicious activity and issue alerts when such activity is found.

Types of IDS:

  • Signature-based intrusion detection: incoming traffic is compared against a pre-existing database of known attacks.
  • Anomaly-based intrusion detection: statistics are used to form a baseline of network usage at different time intervals, and deviations from it are flagged. This approach was introduced to detect unknown attacks.
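The difference between the two approaches can be sketched in a few lines. This is an illustrative toy, not how a production IDS or this project works: the signature set, the traffic samples, and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical signature database: byte patterns of known attacks.
KNOWN_SIGNATURES = {b"/etc/passwd", b"' OR 1=1"}


def signature_match(payload: bytes) -> bool:
    """Signature-based check: flag payloads containing a known pattern."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def build_baseline(history):
    """Anomaly-based check, step 1: summarize normal usage as (mean, std)."""
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    """Step 2: flag values more than `threshold` std devs from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Invented packets-per-second samples observed during normal operation.
normal_rates = [100, 104, 98, 101, 97, 103, 99, 102]
baseline = build_baseline(normal_rates)

print(signature_match(b"GET /index.html"))        # no known pattern
print(signature_match(b"GET /../../etc/passwd"))  # contains a known pattern
print(is_anomalous(101, baseline))                # close to the baseline
print(is_anomalous(500, baseline))                # far from the baseline
```

The signature check can only catch patterns already in its database, while the baseline check can flag unseen behavior at the cost of possible false alarms when normal usage shifts.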

Based on where detection takes place, IDS can be classified into:

  • Network intrusion detection (NIDS)
  • Host intrusion detection (HIDS)

Problem Statement:

With the rise of Internet usage, protecting networks has become very important. The most common risk to a network's security is an intrusion, such as a brute-force attack, a denial of service, or even an infiltration from within the network. Because patterns in network behavior keep changing, a dynamic approach is needed to detect and prevent such intrusions.

Importance of this dataset:

Although a few data sets for NIDS were available before this one, they were generated decades ago and do not provide realistic results. Nour Moustafa therefore created this data set to tackle existing problems such as unbalanced data and missing values.

Data:

This data set contains a hybrid of real modern normal activities and contemporary synthesized attack activities in network traffic. Existing and novel methods are utilised to generate the features of the UNSW-NB15 data set. This data set is available here.

  • The obtained dataset consists of over 175k network traffic records with 45 features.

  • Field Description:

| Field Name | Description |
|---|---|
| id | Unique identifier for each record |
| dur | Record total duration |
| proto | Transaction protocol |
| service | Service used, e.g. http, ftp, ssh, dns; otherwise (-) |
| state | The state and its dependent protocol, e.g. ACC, CLO; otherwise (-) |
| spkts | Source to destination packet count |
| dpkts | Destination to source packet count |
| sbytes | Source to destination bytes |
| dbytes | Destination to source bytes |
| rate | The average attack rate |
| sttl | Source to destination time to live |
| dttl | Destination to source time to live |
| sload | Source bits per second |
| dload | Destination bits per second |
| sloss | Source packets retransmitted or dropped |
| dloss | Destination packets retransmitted or dropped |
| sinpkt | Source inter-packet arrival time (ms) |
| dinpkt | Destination inter-packet arrival time (ms) |
| sjit | Source jitter (ms) |
| djit | Destination jitter (ms) |
| swin | Source TCP window advertisement |
| dwin | Destination TCP window advertisement |
| stcpb | Source TCP sequence number |
| dtcpb | Destination TCP sequence number |
| tcprtt | The sum of 'synack' and 'ackdat' of the TCP connection |
| synack | The time between the SYN and the SYN_ACK packets of the TCP connection |
| ackdat | The time between the SYN_ACK and the ACK packets of the TCP connection |
| smean | Mean of the flow packet size transmitted by the source |
| dmean | Mean of the flow packet size transmitted by the destination |
| trans_depth | The depth into the connection of the http request/response transaction |
| response_body_len | The content size of the data transferred from the server's http service |
| ct_srv_src | Number of connections that contain the same service and source address in 100 connections according to the last time |
| ct_state_ttl | Number for each state according to a specific range of values for source/destination time to live |
| ct_dst_ltm | Number of connections of the same destination address in 100 connections according to the last time |
| ct_src_dport_ltm | Number of connections of the same source address and destination port in 100 connections according to the last time |
| ct_dst_sport_ltm | Number of connections of the same destination address and source port in 100 connections according to the last time |
| ct_dst_src_ltm | Number of connections of the same source and destination address in 100 connections according to the last time |
| is_ftp_login | 1 if the ftp session is accessed by user and password, else 0 |
| ct_ftp_cmd | Number of flows that have a command in an ftp session |
| ct_flw_http_mthd | Number of flows that have methods such as GET and POST in the http service |
| ct_src_ltm | Number of connections of the same source address in 100 connections according to the last time |
| ct_srv_dst | Number of connections that contain the same service and destination address in 100 connections according to the last time |
| is_sm_ips_ports | 1 if the source and destination IP addresses are equal and the port numbers are equal, else 0 |
| attack_cat | The name of each attack category. This data set has nine categories: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode and Worms |
| label | 0 for normal and 1 for attack records |
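To make the field list concrete, here is a minimal sketch of reading records, separating the model inputs from the binary `label`, and checking the stated relationship between `tcprtt`, `synack` and `ackdat`. It uses only the standard library and a tiny invented sample with a subset of the columns; the sample values, the helper names, and the choice to drop `id` and `attack_cat` from the inputs are assumptions for illustration, not the notebooks' actual preprocessing.

```python
import csv
import io
from collections import Counter

# Tiny invented sample using a subset of the columns described above.
# The real file name and its full column set are not assumed here.
SAMPLE_CSV = """id,dur,proto,service,state,tcprtt,synack,ackdat,attack_cat,label
1,0.12,tcp,http,FIN,0.09,0.05,0.04,Normal,0
2,0.65,tcp,-,FIN,0.11,0.06,0.05,Exploits,1
3,0.000009,udp,dns,INT,0.0,0.0,0.0,Generic,1
"""

# Assumption: the record id carries no signal and attack_cat would leak
# the answer, so neither is used as a model input.
NON_FEATURE_COLUMNS = {"id", "attack_cat", "label"}


def load_records(text):
    """Parse CSV text into a list of dicts, one per traffic record."""
    return list(csv.DictReader(io.StringIO(text)))


def split_features_target(records):
    """Separate the model inputs from the binary label (0 normal, 1 attack)."""
    features = [
        {k: v for k, v in row.items() if k not in NON_FEATURE_COLUMNS}
        for row in records
    ]
    target = [int(row["label"]) for row in records]
    return features, target


def tcprtt_is_consistent(row, tol=1e-6):
    """Check that tcprtt equals synack + ackdat, as the field table states."""
    total = float(row["synack"]) + float(row["ackdat"])
    return abs(float(row["tcprtt"]) - total) <= tol


records = load_records(SAMPLE_CSV)
X, y = split_features_target(records)
print(Counter(y))  # class balance of the sample
print(all(tcprtt_is_consistent(r) for r in records))
```

Counting the labels first is a cheap way to see how unbalanced the classes are before choosing a model or an evaluation metric. Categorical fields such as `proto`, `service` and `state` would still need to be encoded numerically before training.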

Tools:

  • Python 3.9
  • pandas
  • NumPy
  • Matplotlib
  • seaborn
  • scikit-learn (sklearn)
  • PrettyTable
  • XGBoost
  • pickle
