PoPPy

A Point Process Toolbox Based on PyTorch

PoPPy is a machine learning toolbox for point process models. It offers rich data operations, flexible model design, and scalable optimization.

Recent updates

  • PoPPy now supports GPU-based computation, which accelerates learning for feature-based models and for large-scale datasets with many event types.

Data operations

  • Importing event sequences and their features from CSV files
  • Random and/or feature-based event sequence stitching
  • Random and/or feature-based event sequence superposing
  • Event sequence aggregating
  • Batch sampling of event sequences
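
To make the difference between stitching and superposing concrete, here is a minimal sketch in plain Python. It is not PoPPy's API: the `Event` tuple layout and the `superpose` and `stitch` helpers (including the `gap` parameter) are hypothetical names chosen for illustration. Superposing overlays two sequences on one time axis, while stitching appends one sequence after the other by shifting its timestamps.

```python
from typing import List, Tuple

# Assumption: an event is a (timestamp, event type index) pair.
Event = Tuple[float, int]


def superpose(seq_a: List[Event], seq_b: List[Event]) -> List[Event]:
    """Overlay two sequences on a shared time axis, ordered by timestamp."""
    return sorted(seq_a + seq_b, key=lambda event: event[0])


def stitch(seq_a: List[Event], seq_b: List[Event], gap: float = 0.0) -> List[Event]:
    """Append seq_b after seq_a by shifting seq_b past seq_a's last event."""
    if not seq_a:
        return list(seq_b)
    offset = seq_a[-1][0] + gap  # hypothetical choice: start seq_b `gap` after seq_a ends
    return list(seq_a) + [(t + offset, c) for t, c in seq_b]


a = [(0.5, 0), (2.0, 1)]
b = [(1.0, 1), (1.5, 0)]
superposed = superpose(a, b)      # one sequence, same time window
stitched = stitch(a, b, gap=1.0)  # one longer sequence
```

The random or feature-based variants listed above would differ only in how the pairs of sequences to combine are chosen.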

Models

  • Poisson process
  • Linear Hawkes process
  • Nonlinear Hawkes process
  • Factorized point process
  • Feature-involved point process
  • Mixture models of point processes
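
As a rough illustration of how the linear and nonlinear Hawkes processes differ, the sketch below evaluates a multi-type intensity in plain Python. It assumes the textbook form lambda_c(t) = mu[c] + sum over past events of alpha[c][c_i] * exp(-decay * (t - t_i)), with a softplus link for the nonlinear case. The function names, the argument layout, and the fixed exponential kernel are illustrative assumptions, not PoPPy's implementation.

```python
import math
from typing import List, Sequence, Tuple

# Assumption: an event is a (timestamp, event type index) pair.
Event = Tuple[float, int]


def softplus(x: float) -> float:
    """Numerically stable softplus, log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear_intensity(
    t: float,
    history: Sequence[Event],
    mu: Sequence[float],
    alpha: Sequence[Sequence[float]],
    decay: float = 1.0,
) -> List[float]:
    """Exogenous base rate mu plus the endogenous impact of events before t."""
    rates = list(mu)
    for t_i, c_i in history:
        if t_i < t:
            k = math.exp(-decay * (t - t_i))  # exponential decay kernel
            for c in range(len(rates)):
                rates[c] += alpha[c][c_i] * k  # impact of type c_i on type c
    return rates


def nonlinear_intensity(
    t: float,
    history: Sequence[Event],
    mu: Sequence[float],
    alpha: Sequence[Sequence[float]],
    decay: float = 1.0,
) -> List[float]:
    """Same pre-activation, passed through softplus so the rate stays positive."""
    return [softplus(r) for r in linear_intensity(t, history, mu, alpha, decay)]
```

With the softplus link, entries of `alpha` may be negative (inhibition) without producing a negative intensity, which the linear form cannot guarantee.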

Loss functions

  • Maximum likelihood estimation
  • Least-square estimation
  • Conditional likelihood estimation
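
For maximum likelihood estimation, the quantity being minimized is the negative log-likelihood of the observed events. The sketch below writes it out for the simplest case, a single-type Hawkes process with kernel alpha * exp(-decay * t) observed on [0, horizon]. The function name and parameterization are assumptions made for illustration and do not mirror PoPPy's loss classes.

```python
import math
from typing import Sequence


def hawkes_nll(
    events: Sequence[float],
    horizon: float,
    mu: float,
    alpha: float,
    decay: float,
) -> float:
    """Negative log-likelihood: integral of the intensity minus sum of log-intensities.

    Assumes `events` is sorted in increasing order and lies inside [0, horizon].
    """
    log_term = 0.0
    integral = mu * horizon  # contribution of the constant base rate
    excitation = 0.0         # summed impact of earlier events at the current event
    prev = None
    for t in events:
        if prev is not None:
            # Recursion: decay the old impact and the jump added by the previous event.
            excitation = (excitation + alpha) * math.exp(-decay * (t - prev))
        log_term += math.log(mu + excitation)
        # Closed-form integral of this event's kernel from t to the horizon.
        integral += alpha / decay * (1.0 - math.exp(-decay * (horizon - t)))
        prev = t
    return integral - log_term
```

With `alpha = 0` the expression reduces to the Poisson process case, `mu * horizon - n * log(mu)` for `n` events.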

Decay kernels

For Hawkes processes, multiple decay kernels are applicable:

  • Exponential kernel
  • Rayleigh kernel
  • Gaussian kernel
  • Gate kernel
  • Power-law kernel
  • Multi-Gaussian kernel
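
The kernels above differ in how the influence of a past event changes with the elapsed time. The sketch below gives common textbook forms for five of them in plain Python. The exact parameterizations used inside PoPPy may differ, so the function names, parameter names, and normalizations here are assumptions for illustration only.

```python
import math


def exponential_kernel(dt: float, decay: float = 1.0) -> float:
    """decay * exp(-decay * dt): strongest right after the event."""
    return decay * math.exp(-decay * dt) if dt >= 0 else 0.0


def rayleigh_kernel(dt: float, scale: float = 1.0) -> float:
    """Rayleigh density: zero at dt = 0, peaks at dt = scale, then decays."""
    if dt < 0:
        return 0.0
    return (dt / scale**2) * math.exp(-(dt**2) / (2 * scale**2))


def gaussian_kernel(dt: float, center: float = 1.0, bandwidth: float = 0.5) -> float:
    """Gaussian bump centered at a delay of `center` after the event."""
    if dt < 0:
        return 0.0
    z = (dt - center) / bandwidth
    return math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2 * math.pi))


def gate_kernel(dt: float, delay: float = 0.0, width: float = 1.0) -> float:
    """Constant influence on the window [delay, delay + width], zero elsewhere."""
    return 1.0 / width if delay <= dt <= delay + width else 0.0


def powerlaw_kernel(dt: float, shift: float = 1.0, exponent: float = 2.0) -> float:
    """Heavy-tailed decay; integrates to 1 on [0, inf) when exponent > 1."""
    if dt < 0:
        return 0.0
    return (exponent - 1) * shift ** (exponent - 1) / (dt + shift) ** exponent
```

A multi-Gaussian kernel can be sketched in the same spirit as a weighted sum of several `gaussian_kernel` terms with different centers.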

Optimization

  • SGD-based learning algorithms
  • CPU- or GPU-based computation

Simulation

  • Ogata's thinning algorithm
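
For readers unfamiliar with thinning, the following is a minimal sketch of Ogata's algorithm for a single-type Hawkes process with kernel alpha * exp(-decay * t), written in plain Python. The function name and parameters are hypothetical and it is not PoPPy's simulator. It relies on the fact that, with this kernel, the intensity only decreases between events, so the current intensity is a valid upper bound for proposing the next candidate.

```python
import math
import random
from typing import List


def simulate_hawkes(
    mu: float, alpha: float, decay: float, horizon: float, seed: int = 0
) -> List[float]:
    """Simulate event times on [0, horizon) by thinning.

    Assumes mu > 0; alpha / decay < 1 keeps the process from exploding.
    """
    rng = random.Random(seed)
    events: List[float] = []
    t = 0.0
    excitation = 0.0  # summed impact of past events, evaluated at time t
    while True:
        upper = mu + excitation        # bound on the intensity until the next event
        wait = rng.expovariate(upper)  # candidate from a Poisson process with rate `upper`
        t += wait
        if t >= horizon:
            break
        excitation *= math.exp(-decay * wait)  # decay the impact to the candidate time
        # Accept the candidate with probability intensity(t) / upper.
        if rng.random() * upper <= mu + excitation:
            events.append(t)
            excitation += alpha  # each accepted event raises the intensity
    return events


sample = simulate_hawkes(mu=1.0, alpha=0.5, decay=1.0, horizon=50.0, seed=42)
```

Setting `alpha = 0` turns every candidate into an accepted event, which recovers a homogeneous Poisson process with rate `mu`.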

Platform

  • I developed and tested PoPPy on macOS >= 10.13, Ubuntu 16.04 LTS, and Windows 10 (Conda environment).

Installation

  • Step 1: Install Anaconda3 and PyTorch 1.0
  • Step 2: Download the package and unzip it
  • Step 3: Change "POPPY_PATH" in dev/util.py to the directory of the downloaded package.

Usage

More details can be found in the tutorials and the PDF files in the "docs" folder.

Citation

@article{xu2018poppy,
  title={PoPPy: A Point Process Toolbox Based on PyTorch},
  author={Xu, Hongteng},
  journal={arXiv preprint arXiv:1810.10122},
  year={2018}
}

Tricks

  • Generally, the parameters of the exogenous intensity and those of the endogenous impact are not on the same scale, e.g., mu ~ O(1/C) and alpha ~ O(1/C^2). When learning the model, different parameters therefore need different learning rates, chosen adaptively. Although all SGD optimizers in PyTorch are usable, Adam is the recommended choice.
  • When a softplus activation is applied to a model, it is better to turn the sparsity and nonnegativity constraints off.
  • When training a mixture model of Hawkes processes, a large number of epochs is needed to get meaningful clustering results. I found that in the initial phase the distribution of clusters is very imbalanced, i.e., most sequences are assigned to one cluster. Fortunately, as the number of epochs increases, the distribution rebalances and the sizes of the clusters become comparable.
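
To illustrate the first trick, the sketch below compares the first update of plain SGD with the first update of Adam for two gradients of very different magnitude. It is written in plain Python so that it runs without PyTorch. The helper names are hypothetical, and the Adam defaults (beta1 = 0.9, beta2 = 0.999, eps = 1e-8) follow the commonly used values. In PyTorch the corresponding optimizer is `torch.optim.Adam`.

```python
import math


def sgd_step(grad: float, lr: float) -> float:
    """Plain SGD: the step size is proportional to the raw gradient."""
    return lr * grad


def adam_first_step(
    grad: float, lr: float, beta1: float = 0.9, beta2: float = 0.999, eps: float = 1e-8
) -> float:
    """First Adam update for one parameter, starting from zero moment estimates."""
    m = (1 - beta1) * grad         # first-moment estimate
    v = (1 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1 - beta1)        # bias correction at step 1
    v_hat = v / (1 - beta2)
    return lr * m_hat / (math.sqrt(v_hat) + eps)


large_grad, small_grad = 1.0, 1e-4  # illustrative gradients on different scales
sgd_ratio = sgd_step(large_grad, 0.01) / sgd_step(small_grad, 0.01)
adam_ratio = adam_first_step(large_grad, 0.01) / adam_first_step(small_grad, 0.01)
```

Under SGD the two steps differ by the same factor as the gradients, while Adam's per-parameter normalization makes both steps close to the learning rate. This is the adaptive behavior the trick relies on.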

Ongoing

  • Integrate more advanced models
  • Add more examples
  • Documentation
  • Optimize the code framework and data structures for further acceleration

Contributors

  • hongtengxu
  • xiaotinghe
