Singeli sort

Algorithms in Singeli, aiming for high performance and broad adaptivity for sorting CPU-native types (integers and floats). A secondary goal is a well-commented and readable codebase that explains what various methods are good for and how they're implemented to use hardware to its full potential. Will probably end up using SIMD if available to speed up a few things, but this is primarily intended to be a portable rather than SIMD-first sort.

Compile with (add -t cpp for C++ compatibility):

singeli src/sort.singeli -o sort.c

Or the following without a Singeli install. CBQN builds on Unix-like systems (including macOS and WSL) in under a minute; see WinBQN for Windows. There's also a pre-compiled copy of sort.c in compiled/sort.c. It may not always be up to date.

git clone https://github.com/dzaima/CBQN.git
cd CBQN && make && cd -
git clone https://github.com/mlochbaum/Singeli.git
CBQN/BQN Singeli/singeli src/sort.singeli -o sort.c

To benchmark:

gcc -O3 bench.c
./a.out

Exported functions are defined in src/sort.singeli, and their C prototypes appear at the bottom of sort.c: the arguments for sorts are array, length, aux (or scratch buffer), and possibly aux length in bytes. These are likely to change over time.

Overview

Singeli sort currently hybridizes the following algorithms; all are used for sort32 and other functions use subsets. The overall structure is that the glidesort layer may call quicksort, which calls the different base cases in various situations.

Quicksort partitioning from fluxsort and crumsort
Outer merge layer: modified glidesort (powersort rules made lazy to defer to quicksort if runs aren't found)
Merging: based on piposort
Small arrays: sorting networks as in ipn_unstable, extra merging and insertion following quadsort
Radix sort: mostly like ska_sort_copy
Counting sort: see section in rhsort
Robin Hood sort

In progress, still has various issues:

Drapesort, similar to drop-Merge sort

Other methods to consider later:

In-place partitioning with pdqsort. Slower than crumsort but it does adapt to mostly-sorted data well.
Interleaved merges and bidirectional partitioning from glidesort. These have not yet been demonstrated to improve performance relative to fluxsort, and there are indications that they slow things down on older processors in addition to bumping up code size. I'll wait for the paper explaining choices made before looking into them further.

Guide to the source

The source code is supposed to be the place to go to get full descriptions and details. I am certain it fails in this role—particularly in places where I don't expect anyone's reading, so please complain if a part you've chosen to read is not well explained!

The general-use files:

sort.singeli Main file: includes and the sorting function definitions.
base.singeli Basic definitions to be used elsewhere. This includes operators, which are all user-defined in Singeli.
common.singeli Other definitions that are more specific than base but may be used in multiple places.
arith.singeli Some log and square root code to keep it out of the way.

And specific algorithms:

quicksort.singeli
- partition.singeli Partitioning
- median.singeli Medians and pseudomedians for picking candidates
  - xorshift.singeli Pseudo-random number generator (PRNG) avoids patterns
merge.singeli Merging utilities and pisort
glide.singeli Glidesort strategy: use merges for natural runs
- (merge.singeli)
small.singeli Small array sorting
- network.singeli Sorting networks for some fixed sizes
- ins.singeli Insertion sorting
- (merge.singeli)
radix.singeli Radix sorts
- prefix.singeli Prefix sums
count.singeli Counting sort
- (prefix.singeli)
rh.singeli Robin Hood sort, for evenly distributed data
dropsort.singeli (unused) Dropsorts for nearly-sorted arrays

Some quick notes on Singeli. Everything in brackets {} is run at compile time, so a call like dist{dn}{U, minv, maxv} is all inlined (dist is called a generator). Functions are called with parentheses and are used rarely, for things that are exported, used in many places, or recursive.

All operators are user-defined, with many picked up from standard includes skin/c and skin/cext. Some of the ones that are unfamiliar relative to C are listed below.

Syntax	Meaning
`x <{dn} y`	Compare `x` and `y`, flipping ordering if `dn` is `1`
`x -> i`	Value at offset `i` from pointer `x`
`x <- v`	Store `v` at pointer `x`
`x <-{i} v`	Store `v` at offset `i` from pointer `x`
`x <-> y`	Swap values at pointers `x` and `y`
`a <~> b`	Swap variables of variables `a` and `b`
`T~~v`	Cast `v` to same-width type `T`
`T^~v`	Promote `v` to supertype `T`
`T<~v`	Narrowing conversion of `v` to type `T`

Singeli sort uses a fair amount of compile-time trickery to support lots of sorting functions while keeping the code reasonably clean. Functions all support sorting in both directions (dn is 0 for up and 1 for down), and many of them support a sort-by operation that actually passes around a tuple of pointers: one to be sorted and others to be moved in the same pattern. A related operation is "grade", which reorders indices as the data should be ordered, and leaves the data intact (it may partially or completely sort it in aux space). A few funny operators are used to support sort-by: for example *+T to turn tuple type T into a tuple of pointer types, *? to avoid loading from extra pointers until the values are needed, and >* to compare pointer values.

Syntax	Meaning
`>p`	Get first pointer only from multi-pointer
`p >*{dn} q`	Compare `p` and `q` by value at first pointer
`*?p`	Lazy load at `p`
`p *? i`	Lazy load at offset `i` from `p`
`*+T`	Tuple of pointer types

mlochbaum / singelisort Goto Github PK

singelisort's Introduction

Singeli sort

Overview

Guide to the source

singelisort's People

Contributors

Stargazers

Watchers

singelisort's Issues

sorting networks

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent