Git Product home page Git Product logo

keap's Introduction

Keap

Maven Central Maintainability Apache License 2.0 Pure Kotlin

Keap is a heap data structure written in Kotlin similar to binary heap. It maintains separately the queue of elements and the tournament tree atop the queue.

Keap is stable, that is, it keeps initial order of equal elements.

It's faster than binary heap in terms of number of comparisons. For any kind of input (random or ordered) of size n, the heapify procedure requires exactly n - 1 comparisons. Binary heap reaches this number of comparisons only for ordered input. For random input, keap compared to binary heap does approximately 1.9 times less comparisons to heapify, 2.7 times less comparisons to offer, and 3.5 times less comparisons to poll. Though, array-backed keap on the average consumes 1.5 times more memory than array-backed binary heap.

Performance summary of keap and binary heap (both array-backed) is as follows:

Keap average Keap worst case Binary heap average Binary heap worst case
Heapify exactly n - 1 exactly n - 1 Θ(n) Θ(n)
Peek Θ(1) Θ(1) Θ(1) Θ(1)
Poll Θ(log n) Θ(log n) Θ(log n) Θ(log n)
Offer Θ(1) Θ(log n) Θ(1) Θ(log n)
Memory used Θ(n) Θ(n) exactly n exactly n

Here are two applications: keap-based PriorityQueue as a replacement of java.util.PriorityQueue and Keapsort sorting algorithm. Both might be useful in two cases:

  1. stability is a must;
  2. comparing elements is rather heavyweight operation (e.g., it requires a database access).

PriorityQueue

PriorityQueue is a keap-based stable replacement of java.util.PriorityQueue.

Keapsort

Keapsort is a Heapsort with keap used instead of binary heap. Keapsort is a comparison-based sorting algorithm with a worst-case O(n log n) runtime. As keap is a stable priority queue, Keapsort is stable. Unlike Heapsort, Keapsort doesn't have an in-place version.

Like Heapsort, Keapsort produces first output after Θ(n) comparisons. For Keapsort, this Θ(n) estimate is just equal to n - 1. To sort random input completely, Keapsort does almost two times less comparisons than Heapsort, nearly 1% less comparisons than Mergesort and 1-2% more comparisons than Timsort.

Like Heapsort, Keapsort is not a modern CPU-friendly algorithm since it has poor data cache performance.

Like Heapsort, Keapsort doesn't seem to parallelize well.

Benchmarks

JMH Gradle Plugin is used to build and run benchmarks. Benchmark results are obtained on a PC running under Windows 7 with Intel(R) Core(TM) i7-3770 3.4 GHz CPU and 64-bit JRE build 1.8.0_112-b15 with the following java parameters: -Xms1g -Xmx1g. To get results in your environment, run:

./gradlew clean jar jmh

For both java.util.PriorityQueue and keap-based PriorityQueue, there are two types of benchmarks:

  1. Benchmarks examining operations with random elements: heapify (building the queue from a collection), offer, peek, and poll. The queue elements are random strings of length ~30 characters with constant 10-characters prefix. Queue size is 10000.
  2. Benchmarks examining offering ordered elements: offerIncreasing (successively increasing elements) and offerDecreasing (successively decreasing elements). The queue elements are strings of length ~30 characters with constant 10-characters prefix. Queue size is 10000.

Current results grouped for easy review are as follows. Building the queue from collections of random elements:

Benchmark                                   Mode  Cnt   Score   Error   Units
JavaQueueRandomBenchmark.heapify           thrpt   20   1.810 ± 0.039  ops/ms
KeapQueueRandomBenchmark.heapify           thrpt   20   3.487 ± 0.013  ops/ms

Basic queue operations with random elements:

Benchmark                                   Mode  Cnt   Score   Error   Units
JavaQueueRandomBenchmark.offer             thrpt   20   3.649 ± 0.041  ops/us
JavaQueueRandomBenchmark.peek              thrpt   20  79.458 ± 0.626  ops/us
JavaQueueRandomBenchmark.poll              thrpt   20   4.250 ± 0.070  ops/us
KeapQueueRandomBenchmark.offer             thrpt   20   3.727 ± 0.073  ops/us
KeapQueueRandomBenchmark.peek              thrpt   20  84.371 ± 0.499  ops/us
KeapQueueRandomBenchmark.poll              thrpt   20  11.993 ± 0.239  ops/us

Offering ordered elements:

Benchmark                                   Mode  Cnt   Score   Error   Units
JavaQueueOrderedBenchmark.offerDecreasing  thrpt   20   5.432 ± 0.177  ops/us
JavaQueueOrderedBenchmark.offerIncreasing  thrpt   20  30.795 ± 0.588  ops/us
KeapQueueOrderedBenchmark.offerDecreasing  thrpt   20   7.959 ± 0.033  ops/us
KeapQueueOrderedBenchmark.offerIncreasing  thrpt   20   8.860 ± 0.044  ops/us

The scores above are numbers of operations per microsecond, for heapify - per millisecond. So the greater the score, the better performance.

Download

Maven Central

<!-- in Maven project -->
<dependency>
    <groupId>com.github.penemue</groupId>
    <artifactId>keap</artifactId>
    <version>0.3.0</version>
</dependency>
// in Gradle project
dependencies {
    compile 'com.github.penemue:keap:0.3.0'
}

Building from Source

Gradle is used to build, test, and run benchmarks. JDK 1.8 and Kotlin 1.3.10 are required. To build the project, run:

./gradlew

ToDo

  • Compare Keapsort to Quicksort.

  • Looks like a tree-backed version of keap could be exposed as an immutable/persistent/lock-free heap data structure. In addition, it could support heap merge operation in Θ(1) time.

keap's People

Contributors

denvned avatar penemue avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Forkers

denvned

keap's Issues

Heap property can be broken since 0.2.0

Since version 0.2.0, compacted tournament tree can lose heap property in some cases. The problem happens for enough large queue if neighbor elements (with even and odd indices) are swapped and the new even element is not sifted up to the root. For compacted tournament tree, there should not be breakable sifting up procedure, all elements should always be sifted up to the root.

Iterable<T>.keapify() for empty collection results in an exception

java.lang.IllegalArgumentException: null
	at com.github.penemue.keap.PriorityQueue$Companion.getToCapacity(PriorityQueue.kt:429) ~[keap-0.3.0.jar]
	at com.github.penemue.keap.PriorityQueue$Companion.access$getToCapacity$p(PriorityQueue.kt:419) ~[keap-0.3.0.jar]
	at com.github.penemue.keap.PriorityQueue.<init>(PriorityQueue.kt:84) ~[keap-0.3.0.jar]
	at com.github.penemue.keap.PriorityQueue.<init>(PriorityQueue.kt:106) ~[keap-0.3.0.jar]
	at com.github.penemue.keap.PriorityQueueKt.keapify(PriorityQueue.kt:29) ~[keap-0.3.0.jar]
	at com.github.penemue.keap.SortedIterable.iterator(SortedIterable.kt:27) ~[keap-0.3.0.jar]

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.