anomalydetection's Introduction


anomalyDetection

anomalyDetection implements procedures to aid in detecting network log anomalies. By combining several multivariate analytic approaches relevant to network anomaly detection, it gives cyber analysts an efficient means of detecting suspected anomalies that require further evaluation.

Installation

You can install anomalyDetection in two ways.

  • Using the latest released version from CRAN:
install.packages("anomalyDetection")
  • Using the latest development version from GitHub:
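# Install devtools if it is missing or older than 1.6
if (!requireNamespace("devtools", quietly = TRUE) ||
    packageVersion("devtools") < "1.6") {
  install.packages("devtools")
}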

devtools::install_github("koalaverse/anomalyDetection", build_vignettes = TRUE)

Learning

To get started with anomalyDetection, read the intro vignette: vignette("Introduction", package = "anomalyDetection"). This will provide a thorough introduction to the functions provided in the package.

References

Gutierrez, R.J., Boehmke, B.C., Bauer, K.W., Saie, C.M. & Bihl, T.J. (2017) "anomalyDetection: Implementation of augmented network log anomaly detection procedures." The R Journal, 9(2), 354-365.

anomalydetection's People

Contributors

andy-mccarthy, auburngrads, bfgray3, bgreenwell, bradleyboehmke


anomalydetection's Issues

Switch to RcppArmadillo for horns_curve

Switching to RcppArmadillo makes horns_curve roughly five times faster!

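#include <RcppArmadillo.h>
using namespace Rcpp;

// Average the sorted eigenvalues of the sample covariance matrix over
// nsim simulated n x p standard-normal data sets.
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::vec horns_curve_cpp(int n, int p, int nsim) {
  arma::mat EM(nsim, p);  // one row of sorted eigenvalues per simulation
  for (int i = 0; i < nsim; ++i) {
    arma::mat M = arma::randn(n, p);  // simulated n x p data
    arma::mat C = arma::cov(M);       // p x p sample covariance
    arma::vec E = arma::eig_sym(C);   // eigenvalues in ascending order
    EM.row(i) = arma::sort(E, "descend").t();
  }
  arma::vec EV = arma::mean(EM, 0).t();  // column means, one per eigenvalue rank
  return EV;
}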


/*** R
# Load required packages
library(anomalyDetection)
library(ggplot2)
library(microbenchmark)

# Faster horns_curve function
horns_curve_2 <- function(x) {
  horns_curve_cpp(n = nrow(x), p = ncol(x), nsim = 1000)
}

# Security logs data
x <- security_logs %>%
  tabulate_state_vector(10)

# Compare values
head(cbind(horns_curve(x), horns_curve_2(x)))
plot(horns_curve(x), horns_curve_2(x))
abline(0, 1)

# Compare speed
mb <- microbenchmark(
  horns_curve(x),
  horns_curve_2(x),
  times = 100L
)
autoplot(mb)
*/
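For readers without a C++ toolchain, the simulation that horns_curve_cpp performs can be sketched in base R. This is only an illustration of the same steps (simulate standard-normal data, take the eigenvalues of its covariance matrix, average them by rank); horns_curve_r is a hypothetical name, not a function in the package, and it makes no claim about how the package's own horns_curve is implemented.

```r
# Base-R sketch of the simulation in horns_curve_cpp().
# horns_curve_r is a hypothetical name used only for this illustration.
horns_curve_r <- function(n, p, nsim = 1000) {
  # Each column holds the eigenvalues from one simulated data set
  eig <- vapply(seq_len(nsim), function(i) {
    m <- matrix(rnorm(n * p), nrow = n, ncol = p)  # n x p standard-normal data
    # eigen() returns the eigenvalues in decreasing order
    eigen(cov(m), symmetric = TRUE, only.values = TRUE)$values
  }, numeric(p))
  # Average the k-th largest eigenvalue across simulations
  rowMeans(matrix(eig, nrow = p))
}

set.seed(42)
ev <- horns_curve_r(n = 50, p = 5, nsim = 200)
ev
```

Because each simulated covariance matrix has an expected trace of p, the averaged eigenvalues should sum to roughly p, which gives a quick sanity check on either implementation.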

Introduction Markdown Page - improper data types

In the security_logs data frame, Src_Port and Dst_Port are integers, so they get summed when you create the state vector. Since "host (website) addresses and port numbers are necessarily part of the underlying TCP/IP protocols", ports should be of character type, just like IP addresses.

This changes some of the results listed in introduction.Rmd. Even after changing the data types, the boxplots produced after the multivariate analysis still show interesting outliers!

+1 for this package from me.
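The conversion suggested in this issue could look roughly like the following. It is a minimal sketch on a toy data frame with invented values; only the column names Src_Port and Dst_Port come from the issue, and the real security_logs data has a different shape and contents.

```r
# Toy stand-in for security_logs; the values are invented for illustration.
logs <- data.frame(
  Src_IP   = c("10.0.0.1", "10.0.0.2", "10.0.0.1"),
  Src_Port = c(443L, 80L, 443L),
  Dst_Port = c(22L, 22L, 8080L),
  stringsAsFactors = FALSE
)

# As integers, ports behave like quantities, so arithmetic on them is
# possible but meaningless
sum(logs$Src_Port)

# Convert the port columns to character so they are treated as labels
port_cols <- c("Src_Port", "Dst_Port")
logs[port_cols] <- lapply(logs[port_cols], as.character)

# As labels, ports can be counted per distinct value
table(logs$Src_Port)
```

Applying the same conversion to security_logs before building the state vector should make the ports count as categories, like the IP address columns.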

Interaction with SparkR?

Reviewer comment: Another area to explore would be distributable systems like spark and SparkR where such a tool would be welcome.

This is a great point. We should look at how we can leverage packages like SparkR to make anomalyDetection directly compatible with distributed systems.
