Git Product home page Git Product logo

topk's Introduction

HeavyKeeper

Note

Segment has paused maintenance on this project, but may return it to an active status in the future. Issues and pull requests from external contributors are not being considered, although internal contributions may appear from time to time. The project remains available under its open source license for anyone to use.

This package implements the HeavyKeeper algorithm for efficiently finding top-K flows in an unbounded set of flows.

Paper: https://www.usenix.org/system/files/conference/atc18/atc18-gong.pdf

The reference implementation is pretty difficult to follow. I found the RedisBloom implementation to be far eaier to read, if you're interested in seeing a more battle-tested implementation. The RedisBloom implementation also has some optimizations that might be worth looking at (a decay lookup table, for example) and it supports arbitrary increments greater than one, which I've implemented here.

This implementation uses a default width and depth to simplify usage:

  • width = k * log(k) (minimum of 256)
  • height = log(k) (minimum of 3)

Usage

hk := topk.New(100, 0.9)

hk.Add("foo", 1)
hk.Add("bar", 5)
hk.Add("baz", 1)
hk.Add("baz", 1)

for _, fc := range hk.Top() {
  fmt.Printf("%s = %d\n", fc.Flow, fc.Count)
}

// bar = 5
// baz = 2
// foo = 1

count, ok := hk.Count("bar")
fmt.Println(count, ok)
// 5 true

Benchmarks

The algorithm itself is rather efficient on its own; I haven't invested any time in further optimizing things (yet).

goos: darwin
goarch: amd64
pkg: github.com/segmentio/topk
cpu: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
BenchmarkAdd/K=10-12         	17065675	        79.38 ns/op	       0 B/op	       0 allocs/op
BenchmarkAdd/K=50-12         	11193319	       106.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkAdd/K=100-12        	 9880362	       131.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkAdd/K=500-12        	 7442464	       159.6 ns/op	       0 B/op	       0 allocs/op
BenchmarkAdd/K=1000-12       	 7125268	       167.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkAdd/K=5000-12       	 5797017	       206.3 ns/op	       0 B/op	       0 allocs/op
BenchmarkAdd/K=10000-12      	 5218218	       233.2 ns/op	       0 B/op	       0 allocs/op

topk's People

Contributors

tysonmote avatar bhavanki avatar

Stargazers

 avatar  avatar Tibor Schmidt avatar Juca Da avatar abc avatar JellyTony avatar yicheny avatar Fei Chen avatar Dmitry Chuev avatar Akshay Pawar avatar Mars Hall avatar Brian O'Rourke avatar Peter DeMartini avatar  avatar

Watchers

 avatar James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.