isoos / bit_array Goto Github PK
View Code? Open in Web Editor NEWBit array implementation in Dart
Home Page: https://pub.dev/packages/bit_array
License: BSD 3-Clause "New" or "Revised" License
Bit array implementation in Dart
Home Page: https://pub.dev/packages/bit_array
License: BSD 3-Clause "New" or "Revised" License
Currently it's not possible to compare two BitArray
s (and the other BitSets too).
As a workaround I cloned the repository and added the auto-generated methods.
@override
bool operator ==(Object other) =>
identical(this, other) ||
other is BitArray &&
runtimeType == other.runtimeType &&
_data == other._data;
@override
int get hashCode => _data.hashCode;
Consider the _bitMask
and _clearMask
masks:
final _bitMask = List<int>.generate(32, (i) => 1 << i);
final _clearMask = List<int>.generate(32, (i) => ~(1 << i));
We can bake them into constant lists:
// print("const _bitMask = [${List<int>.generate(32, (i) => 1 << i).join(", ")}];");
const _bitMask = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072, 262144, 524288, 1048576, 2097152, 4194304, 8388608, 16777216, 33554432, 67108864, 134217728, 268435456, 536870912, 1073741824, 2147483648];
// print("const _clearMask = [${List<int>.generate(32, (i) => ~(1 << i)).join(", ")}];");
const _clearMask = [4294967294, 4294967293, 4294967291, 4294967287, 4294967279, 4294967263, 4294967231, 4294967167, 4294967039, 4294966783, 4294966271, 4294965247, 4294963199, 4294959103, 4294950911, 4294934527, 4294901759, 4294836223, 4294705151, 4294443007, 4293918719, 4292870143, 4290772991, 4286578687, 4278190079, 4261412863, 4227858431, 4160749567, 4026531839, 3758096383, 3221225471, 2147483647];
This appears to be a clear improvement (especially when it comes to AOT):
JIT
without baking: 8.3s~
with baking: 8s~
AOT
without baking: 9.4s~
with baking: 8.1s~
I din't bother with _cardinalityBitCounts
because of #20.
@isoos what do you think?
Can i create a BitArray from string like that
111111111111111111100000001100000000000000001111
somehow? If not, it would be nice to have a method "fromBinaryString" like in this library https://www.npmjs.com/package/node-bitarray
Hey @isoos,
I think there's a simple way to improve the performance of all the set operations of BitArray
on other BitArray
s by specializing them to take a BitArray
and using that knowledge to avoid using iterators.
That is, I've observed that we can be more efficient if we replace the following:
// ... and and/andNot/xor
void or(BitSet set) {
final iter = set.asUint32Iterable().iterator;
for (var i = 0; i < _data.length && iter.moveNext(); i++) {
_data[i] |= iter.current;
}
}
with
// ... and and/andNot/xor
void or(BitArray set) {
final Uint32List data2 = set._data;
for (var i = 0; i < _data.length && i < data2.length; i++) {
_data[i] |= data2[i];
}
}
I'm not proposing to replace the existing set operations, but to add new ones that work on BitArray
s only.
What do you think?
Consider this bitset implementation by lemire is JS https://github.com/lemire/FastBitSet.js:
It appears to manually unroll some loops which might also give us a boost in performance, too, e.g.:
The or
operation is currently on my hot path and I'm curious whether manually unrolling some loops could help.
Consider the following implementation of iteration over all set bits in a bit_set:
inline bool bitset_for_each(const bitset_t *b, bitset_iterator iterator,
void *ptr) {
size_t base = 0;
for (size_t i = 0; i < b->arraysize; ++i) {
uint64_t w = b->array[i];
while (w != 0) {
uint64_t t = w & (~w + 1);
int r = roaring_trailing_zeroes(w);
if (!iterator(r + base, ptr)) return false;
w ^= t;
}
base += 64;
}
return true;
}
it uses count trailing zeroes to improve efficiency.
Blocked by #17 & dart-lang/sdk#52673, or maybe there's an efficient-enough way to simulate count trailing zeroes (e.g. https://stackoverflow.com/questions/31233609/what-is-the-most-efficient-to-count-trailing-zeroes-in-an-integer).
Currently, the cardinality implementation of BitArray walks the buffer in byte chunks, looks up the popcount for each chunk and sums them up to get the cardinality.
There's a BitSet implementation hidden in the ANTLR4 runtime implementation:
It calculates the cardinality by what appears to be a software simulated popcount that has been specialized for lists:
BitArray would probably benefit from taking a similar approach.
(#17 / dart-lang/sdk#52673 would be nice here, too.)
For future reference, here's an implementation of popcount:
int count_bits_64_popcount(
final int value_64_bits,
) {
// TODO try this:
// /// http://resnet.uoregon.edu/~gurney_j/jmpc/bitwise.html
// #define BITCOUNT(x) (((BX_(x)+(BX_(x)>>4)) & 0x0F0F0F0F) % 255)
// #define BX_(x) ((x) - (((x)>>1)&0x77777777) - (((x)>>2)&0x33333333) - (((x)>>3)&0x11111111))
// /// https://graphics.stanford.edu/~seander/bithacks.html
// int _count_bits_parallel_32_bits(
// final int value,
// ) {
// // In binary: 01010101010101010101010101010101
// const alternating_1 = 0x55555555;
// // In binary: 00110011001100110011001100110011
// const alternating_2 = 0x33333333;
// // In binary: 00001111000011110000111100001111
// const alternating_3 = 0x0F0F0F0F;
// // In binary: 00000000111111110000000011111111
// const alternating_4 = 0x00FF00FF;
// // In binary: 00000000000000001111111111111111
// const alternating_5 = 0x0000FFFF;
// final a = value >> 1;
// final b = value - (a & alternating_1);
// final c = ((b >> 2) & alternating_2) + (b & alternating_2);
// final d = ((c >> 4) + c) & alternating_3;
// final e = ((d >> 8) + d) & alternating_4;
// final f = ((e >> 16) + e) & alternating_5;
// return f;
// }
//
// int _manual_count_bits_64(
// final int value_64_bits,
// ) {
// // Shift the high 32 bits down.
// final hi = _count_bits_parallel_32_bits(value_64_bits >> 32);
// // The subroutine is assumed to ignore the bits beyond 32.
// final lo = _count_bits_parallel_32_bits(value_64_bits);
// return hi + lo;
// }
// According to wikipedia, this should be the fastest way to do popcount?
// const m1 = 0x5555555555555555; //binary: 0101...
// const m2 = 0x3333333333333333; //binary: 00110011..
// const m4 = 0x0f0f0f0f0f0f0f0f; //binary: 4 zeros, 4 ones ...
// const m8 = 0x00ff00ff00ff00ff; //binary: 8 zeros, 8 ones ...
// const m16 = 0x0000ffff0000ffff; //binary: 16 zeros, 16 ones ...
// const m32 = 0x00000000ffffffff; //binary: 32 zeros, 32 ones
// const h01 = 0x0101010101010101; //the sum of 256 to the power of 0,1,2,3...
//
// int x = value_64_bits;
// x -= (x >> 1) & m1; //put count of each 2 bits into those 2 bits
// x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
// x = (x + (x >> 4)) & m4; //put count of each 8 bits into those 8 bits
// return (x * h01) >> 56; //returns left 8 bits of x + (x<<8) + (x<<16) + (x<<24) + ...
const String lookup_table = "\x00\x01\x01\x02\x01\x02\x02\x03\x01\x02\x02\x03"
"\x02\x03\x03\x04\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04"
"\x04\x05\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05"
"\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06\x01\x02"
"\x02\x03\x02\x03\x03\x04\x02\x03\x03\x04\x03\x04\x04\x05\x02\x03\x03\x04"
"\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06\x02\x03\x03\x04\x03\x04"
"\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06\x03\x04\x04\x05\x04\x05\x05\x06"
"\x04\x05\x05\x06\x05\x06\x06\x07\x01\x02\x02\x03\x02\x03\x03\x04\x02\x03"
"\x03\x04\x03\x04\x04\x05\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05"
"\x04\x05\x05\x06\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05"
"\x05\x06\x03\x04\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07"
"\x02\x03\x03\x04\x03\x04\x04\x05\x03\x04\x04\x05\x04\x05\x05\x06\x03\x04"
"\x04\x05\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07\x03\x04\x04\x05"
"\x04\x05\x05\x06\x04\x05\x05\x06\x05\x06\x06\x07\x04\x05\x05\x06\x05\x06"
"\x06\x07\x05\x06\x06\x07\x06\x07\x07\x08";
return lookup_table.codeUnitAt(value_64_bits & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 8) & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 16) & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 24) & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 32) & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 40) & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 48) & 0xFF) +
lookup_table.codeUnitAt((value_64_bits >>> 56) & 0xFF);
}
(Some while ago I've found that the String-lookuptable-version appears to be the fastest. The comments contain some other versions. Maybe we could use that to walk a Uint64List view in 64bit chunks? Many possible approaches here.)
Since Uint8List may reside in a non 32bit aligned position in memory, it is impossible to initialize the bit array from Uint8List.
Several alternatives:
@isoos, I can issue a PR, What do you think?
Dart has some SIMD support via Int32x4. Int32x4 is supposed to support 4 bit ops for the price of one. Int32x4List is supposed to store a bunch of Int32x4 and it might make sense to use Int32x4List as the underlying buffer for BitArray to increase the performance of set operations.
I don't plan to work on this anytime soon, but I wanted to write that idea down so maybe somebody else can give it a go.
Hello @isoos!
(I hope you don't mind all the issues. Having an efficient bitset in Dart has been on my TODO list for a looong time :), and your work is the best starting point that I've been able to find so far.)
One of my BitArray-related bottlenecks appears to be related to hashCode
calculations:
What do you think about having a hashCode cache on BitArray? That is, on each mutation to the underlying BitArray buffer we'd invalidate the current hashCode (i.e., _hashCodeCache = null
) and on each invocation of hashCode
we'd set it to the current hashCode if it is not present (i.e., _hashCodeCache ??= currentHashCode
).
What do you think?
Since we can export easily to an int array, we should be able to import it as well, i.e.:
var bits = BitArray(history.keys.max)..setBits(setKeys); // would be nice to have a direct constructor for this as `BitArray(setKeys)`
var ints = bits.asUint32Iterable().toList();
expect(BitArray.parseInts(ints), bits);
This should be better optimized than #4.
Even better solution could be to serialize/deserialize from binary string representation - I need this BitArray for optimized storage in the first place.
import 'package:bit_array/bit_array.dart';
class PrimeGen {
final primeLength =100;
BitArray _numbers;
PrimeGen() {
_numbers = BitArray(primeLength);
_numbers.setAll();
_numbers.clearBits([0, 1]);
/* ----------------------------- Seive Algorihm ----------------------------- */
for (int i = 2; i <= primeLength; ++i) {
if (_numbers[i]) {
for (int j = i * i; j <= primeLength; j += i) {
_numbers.clearBit(j);
}
}
}
}
int get count => _numbers.cardinality;
List<int> get list {
List<int> primeList = [];
for (int i = 0; i < primeLength; ++i) {
if (_numbers[i]) {
primeList.add(i);
}
}
return primeList;
}
}
here the geter function count is returning 52 but in actual it should return 25
Hey, do you have any plans to support
Bitmap.fromHeadless(width, height, bytes)
and
bitmap.buildHeaded().buffer.asUint8List()
Like in the pub package bitmap
The pub bitmap package doesn't support web so bit_array can become a good alternative
For reference: dart-lang/sdk#52673
Instead of reallocating the list every time the length changes:
set length(int value) {
if (_length == value) {
return;
}
final data = Uint32List(_bufferLength32(value));
data.setRange(0, math.min(data.length, _data.length), _data);
_data = data;
_length = _data.length << 5;
}
You could check if _bufferLength32(value) has changed, and reallocate only then. This can dramatically reduce the allocations in case the list is build incrementally:
set length(int value) {
if (_bufferLength32(_length) == _bufferLength32(value)) {
return;
}
final data = Uint32List(_bufferLength32(value));
data.setRange(0, math.min(data.length, _data.length), _data);
_data = data;
_length = _data.length << 5;
}
What do you think?
I've noticed that BitArray equality could be specialized to be more efficient when two BitArray instances are being compared.
An is
check would probably be fine (i.e. #18 (comment)) which could be used to introduce a fast path that uses for loops over the underlying buffers instead of having to use iterators (as is currently being done in the base BitSet implementation).
Furthermore, I feel like inlining the fold in hashCode
might give us some easy gains, too. I'm not sure about the overhead of fold and the closure... JIT might be able to deal with it just fine, but I have doubts about AOT.
Is it possible to use “&”, “|”, “^”, “<<“, “>>”, “~”; on the bitsets?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.