Git Product home page Git Product logo

imac's Introduction

Installation:

  • Clone the adapted MOA repository from here

  • Build on windows: The easiest way is to use an IDE such as IntelliJ

  • Build on linux:

    1. cd execmoa
    2. mkidr moa, mkdir moa/classifiers, mkdir moa/classifiers/trees
    3. javac -cp moa.jar HoeffdingTree.java
    4. mv *.class moa/classifiers/trees
    5. jar -uf moa.jar moa/classifiers/trees/*
    

    The code in execmoa comes from this GitHub

Datasets

Synthetic data

Synthetic data is all generated using the API proposed by MOA.

SEA: '-i 5000000 -f 10000 -s (generators.SEAGenerator -b)',

LED: '-i 1000000 -f 1000 -s (ConceptDriftStream -s (generators.LEDGeneratorDrift -d 1) -d (ConceptDriftStream -s (generators.LEDGeneratorDrift -d 3) -d (ConceptDriftStream -s (generators.LEDGeneratorDrift -d 5) -d (generators.LEDGeneratorDrift -d 7) -w 50000 -p 250000 ) -w 50000 -p 250000 ) -w 50000 -p 250000)'

AGR: '-i 10000000 -f 10000 -s (ConceptDriftStream -s (generators.AgrawalGenerator -f 1 -b) -d (ConceptDriftStream -s (generators.AgrawalGenerator -f 2 -b) -d (ConceptDriftStream -s (generators.AgrawalGenerator -b) -d (generators.AgrawalGenerator -f 4 -b) -w 500 -p 2500000) -w 500 -p 2500000 ) -w 500 -p 2500000)',

RBF: '-i 2000000 -f 10000 -s (generators.RandomRBFGenerator -c 15 -a 200 -n 100) ',

RTG: '-i 10000000 -f 10000 -s (generators.RandomTreeGenerator -c 25 -o 200 -u 200)',

Real-word data

Covertype. The forest covertype data set represents forest cover type for 30 x 30 meter cells obtained from the US Forest Service Region 2 Resource Information System (RIS) data.

KDD99. KDD99 dataset corresponds to a cyber-attack de- tection problem (i.e., attack or common access), an inherent streaming scenario since instances are sequentially presented as a time series.

MNIST8M. MNIST8M is the augmentation of original MNIST database by using pseudo- random deformations and translations.

Code

We have added the IMAC method to the code provided by this github.

There are four algorithms:

  • VFDT algorithm, which periodic executes split-attempts.
  • OSM algorithm, which predicts the interval of split-attempts in VFDT by assuming that heuristic measure of the second-best attribute does not change and the the heuristic measure of best attribute increases as much as possible as the data arrives.
  • CGD algorithm, which predicts the interval of split-attempts in VFDT by assuming that the difference of best attribute and second-best attribute does not change.
  • IMAC algorithm, which determines the potential split timing in VFDT with incremental information.
trees/HoeffdingTree.java        // Clean implementation
trees/HoeffdingTreeMeasure.java // can evaluate split-delay and split-attempts

run.py // evaluate VFDT, OSM, CGD and IMAC
run_measure.py // evaluate split-delay and split-attempts of VFDT, OSM, CGD and IMAC

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.