708-145 / tbsort

TreeBinSort

License: GNU General Public License v3.0

C++ 100.00%

TBSort

TreeBinSort - A general O(n * log(log(n))) sort

TB-scale sorting and beyond - an order-of-magnitude speedup over std::sort at the 10^12 element scale.

A blend of sample sort and interpolation sort, topped with the comparison sort of your choice.

Algorithm properties:

Stable: yes. Elements with the same sort key keep their order.
General: yes. No information about the distribution of data is assumed.
In-place: no. This is a field of ongoing research.
Parallel: no. This is a field of ongoing research.
Distributed: no. This is a field of ongoing research.
Time complexity: O(n * log (log n)). Achieving an even better time complexity is a field of ongoing research.
Caveat: has initial overhead. The break-even point is currently around 2M elements; further optimization can likely reduce it.
Caveat: currently limited to the int type. More types will be added.
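
Because of the break-even point, a caller may want to route small inputs to a standard comparison sort. The following is a minimal sketch of such a dispatch, not part of this repository: `sort_with_fallback`, `kBreakEven`, and the `large_sort` callable are hypothetical names, and the threshold simply reuses the "around 2M elements" figure above.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical threshold, taken from the "around 2M elements" figure above.
constexpr std::size_t kBreakEven = 2000000;

// large_sort is a placeholder for whatever entry point the library exposes;
// it is passed in as a callable because the real API is not shown here.
// Returns true if the large-input path was taken.
template <typename LargeSort>
bool sort_with_fallback(std::vector<int>& data, LargeSort large_sort,
                        std::size_t break_even = kBreakEven) {
    if (data.size() < break_even) {
        // Below the break-even point a stable comparison sort is assumed to win.
        std::stable_sort(data.begin(), data.end());
        return false;
    }
    large_sort(data);
    return true;
}
```

The threshold is a parameter so it can be re-measured on the target machine rather than hard-coded.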

Algorithm idea:

A hybrid of interpolation sort (a type of radix sort) with a final comparison sort.
The interpolation step adapts to the data: it is driven by elements sampled randomly from the unsorted input.
The size of the sampled search tree and the number of target bins are chosen such that no step uses more than O(n * log (log(n))) comparisons.
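
As a rough check of that budget, the per-phase comparison counts stated in this README can be added up. This is a sketch under the README's own assumptions (a search tree of log n sampled elements, n / log n bins of average size log n); the symbols T_tree, T_bin, and T_sort are introduced here only for illustration.

```latex
\begin{align*}
T_{\text{tree}}(n) &= O(\log n \cdot \log\log n) \\
T_{\text{bin}}(n)  &= n \cdot O(\log\log n) = O(n \log\log n) \\
T_{\text{sort}}(n) &= \frac{n}{\log n} \cdot O(\log n \cdot \log\log n) = O(n \log\log n) \\
T(n) &= T_{\text{tree}}(n) + T_{\text{bin}}(n) + T_{\text{sort}}(n) = O(n \log\log n)
\end{align*}
```

The last phase relies on bins staying close to their average size; oversized bins are the case handled by the recursive call.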

The algorithm consists of 3 phases:

Tree: log n elements are sampled randomly and sorted to form a search tree. This requires only O(log n * log (log(n))) comparisons.
Bin: for each input element, the closest elements to its left and right are looked up in the search tree. This requires O(n * log (log(n))) comparisons since the search tree contains log n elements. The distance to each side determines the target bin for the element.
Sort: each target bin is sorted using a comparison sort. The average size of each bin is log n elements. Sorting all bins requires O(n * log (log(n))) comparisons since each of the n/log n bins needs O(log n * log (log(n))) comparisons to sort. If a bin turns out to be much larger than expected, the algorithm is called recursively on that bin and a new search tree is sampled for it.
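
The three phases can be sketched as follows. This is an illustrative reading of the description above, not the repository's implementation: `tb_sort_sketch`, the small-input cutoff of 32, and the oversized-bin factor of 8 are all assumptions, a sorted array with binary search stands in for the search tree, and the global minimum and maximum are added to the sample so that every element has a left and a right neighbour.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical sketch of the Tree / Bin / Sort phases for int keys.
void tb_sort_sketch(std::vector<int>& data, std::mt19937& rng) {
    const std::size_t n = data.size();
    if (n <= 32) {  // assumed cutoff: tiny inputs go straight to a comparison sort
        std::stable_sort(data.begin(), data.end());
        return;
    }

    // Phase 1 (Tree): sample about log2(n) elements and sort them.
    const std::size_t s = static_cast<std::size_t>(std::log2(static_cast<double>(n)));
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    std::vector<int> splitters;
    for (std::size_t i = 0; i < s; ++i) splitters.push_back(data[pick(rng)]);
    // Assumption: bracket the sample with the global min and max.
    const auto mm = std::minmax_element(data.begin(), data.end());
    splitters.push_back(*mm.first);
    splitters.push_back(*mm.second);
    std::sort(splitters.begin(), splitters.end());
    splitters.erase(std::unique(splitters.begin(), splitters.end()), splitters.end());
    if (splitters.size() < 2) return;  // all elements are equal: already sorted

    // Phase 2 (Bin): locate each element between two splitters, then
    // interpolate its position inside that interval to pick a sub-bin.
    const std::size_t intervals = splitters.size() - 1;
    const std::size_t target_bins = n / s;  // about n / log n bins
    const std::size_t k = (target_bins + intervals - 1) / intervals;  // sub-bins per interval
    std::vector<std::vector<int>> bins(intervals * k);
    for (int x : data) {
        std::size_t j = static_cast<std::size_t>(
            std::upper_bound(splitters.begin(), splitters.end(), x) - splitters.begin());
        j = std::min(j - 1, intervals - 1);  // j >= 1 before the decrement since x >= min
        const double lo = splitters[j];
        const double hi = splitters[j + 1];
        const double frac = (x - lo) / (hi - lo);  // in [0, 1], computed in double to avoid overflow
        const std::size_t sub = std::min(k - 1, static_cast<std::size_t>(frac * k));
        bins[j * k + sub].push_back(x);  // appending in input order keeps the pass stable
    }

    // Phase 3 (Sort): sort each bin; recurse on bins far above the average size.
    const std::size_t oversized = 8 * s;  // assumed threshold
    std::size_t out = 0;
    for (auto& bin : bins) {
        if (bin.size() > oversized && bin.size() < n) {
            tb_sort_sketch(bin, rng);  // a fresh sample is drawn for this bin
        } else {
            std::stable_sort(bin.begin(), bin.end());
        }
        std::copy(bin.begin(), bin.end(), data.begin() + out);
        out += bin.size();
    }
    assert(out == n);  // every element was written back exactly once
}
```

A call looks like `std::mt19937 rng(42); tb_sort_sketch(values, rng);`. The vector-of-vectors layout is chosen for readability only; it is not in-place and makes no claim about the memory layout or performance of the actual code.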
