positiveblue / libdori
Probabilistic Data Structures
License: MIT License
The LogLog.hpp file contains unused includes.
Remove them.
There are three basic cmd tools right now:
All three use an argument parser, but their arguments are not documented anywhere.
Implement the Recordinality algorithm.
Data Streams can be studied as random permutations. That fact allows a wealth of classical and recent results from combinatorics to be recycled as estimators for various statistics over data streams.
Recordinality estimates the number of distinct elements in a stream by counting the number of K-records occurring in it.
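The idea can be sketched as follows. This is a minimal illustration, not libDori code: it assumes the usual formulation in which the sketch keeps the k largest hash values seen so far, counts every change of that set as a k-record (R), and estimates the cardinality as k(1 + 1/k)^(R - k + 1) - 1. The names `RecordinalitySketch` and `mix64` are hypothetical, and the hash (std::hash plus a splitmix64-style mixer) is only a stand-in for a proper hash function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <string>

// Hypothetical helper: splitmix64-style finalizer applied on top of std::hash,
// because std::hash is not guaranteed to spread values uniformly.
inline std::uint64_t mix64(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Hypothetical class name; a sketch of Recordinality under the assumptions above.
class RecordinalitySketch {
public:
    explicit RecordinalitySketch(std::size_t k) : k_(k) { assert(k > 0); }

    void add(const std::string& item) {
        const std::uint64_t h = mix64(std::hash<std::string>{}(item));
        if (sample_.count(h) != 0) return;      // already among the k largest
        if (sample_.size() < k_) {              // the first k distinct values are all records
            sample_.insert(h);
            ++records_;
        } else if (h > *sample_.begin()) {      // beats the current k-th largest: a new k-record
            sample_.erase(sample_.begin());
            sample_.insert(h);
            ++records_;
        }
        // Otherwise h is below the k-th largest value and leaves the sketch unchanged.
    }

    // Exact while fewer than k distinct hashes were seen, estimated afterwards.
    double estimate() const {
        if (sample_.size() < k_) return static_cast<double>(sample_.size());
        const double k = static_cast<double>(k_);
        const double exponent = static_cast<double>(records_ - k_ + 1);
        return k * std::pow(1.0 + 1.0 / k, exponent) - 1.0;
    }

    std::size_t records() const { return records_; }

private:
    std::size_t k_;                      // sample size
    std::size_t records_ = 0;            // number of k-records observed (R)
    std::set<std::uint64_t> sample_;     // the k largest hash values seen so far
};
```

Because the minimum of the sample only grows, a repeated element can never become a record a second time, so duplicates in the stream do not affect the count. Larger values of k use more memory and reduce the variance of the estimate.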
Implement a basic (extensible) version of HyperLogLog as a cardinality estimator.
HyperLogLog is the most widely used algorithm in that domain. Paper
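A basic version could look roughly like the sketch below. It is an illustration under stated assumptions rather than the libDori implementation: `BasicHyperLogLog` and `hllMix64` are hypothetical names, the input is assumed to be a 64-bit key that is hashed with a splitmix64-style mixer, the precision p is restricted to 7..16 so that the standard bias constant for m >= 128 registers applies, and only the small-range (linear counting) correction is included.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: splitmix64-style mixer used as a stand-in hash function.
inline std::uint64_t hllMix64(std::uint64_t x) {
    x += 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

// Hypothetical class name; a basic HyperLogLog with m = 2^p registers.
class BasicHyperLogLog {
public:
    explicit BasicHyperLogLog(unsigned p)
        : p_(checked(p)), registers_(std::size_t{1} << p_, 0) {}

    void add(std::uint64_t key) {
        const std::uint64_t h = hllMix64(key);
        const std::size_t index = static_cast<std::size_t>(h >> (64 - p_)); // top p bits
        std::uint64_t rest = h << p_;                // remaining 64 - p bits, left aligned
        const unsigned maxRank = 64 - p_ + 1;        // rank used when all remaining bits are 0
        unsigned rank = 1;                           // 1-based position of the leftmost 1-bit
        while (rank < maxRank && (rest & 0x8000000000000000ULL) == 0) {
            ++rank;
            rest <<= 1;
        }
        if (rank > registers_[index]) registers_[index] = static_cast<std::uint8_t>(rank);
    }

    double estimate() const {
        const double m = static_cast<double>(registers_.size());
        double sum = 0.0;
        std::size_t zeros = 0;
        for (std::uint8_t r : registers_) {
            sum += std::ldexp(1.0, -static_cast<int>(r));   // 2^(-r)
            if (r == 0) ++zeros;
        }
        const double alpha = 0.7213 / (1.0 + 1.079 / m);    // bias constant for m >= 128
        double e = alpha * m * m / sum;                      // raw harmonic-mean estimate
        if (e <= 2.5 * m && zeros > 0) {
            e = m * std::log(m / static_cast<double>(zeros)); // linear counting for small counts
        }
        return e;
    }

private:
    static unsigned checked(unsigned p) {
        assert(p >= 7 && p <= 16);
        return p;
    }

    unsigned p_;                            // precision: number of index bits
    std::vector<std::uint8_t> registers_;   // maximum rank seen per register
};
```

With p = 12 the sketch uses 4096 one-byte registers, and the expected relative standard error is about 1.04 / sqrt(4096), roughly 1.6%. Keeping the hash function and the register storage separate from the estimator is one way to leave room for extension.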
Taking advantage of GitHub's wiki pages could be interesting for the project.
Many people know nothing about these algorithms. The wiki could point them to the right places to learn more.
A section explaining how the code is organized would also be very useful for my future self and for other contributors.
I do not know yet why Travis CI is failing to build the project, but it should not. I should take a look at it as soon as possible.
Once the Travis builds are fixed, I could compile the project with both gcc and clang to make sure everything works with each compiler (right now I only build with clang).
Robert Sedgewick from Princeton presented a new algorithm for cardinality estimation at AofA '16.
It is inspired by HyperLogLog, but it reduces the memory footprint even further.
It would be great (and easy) to implement it and use it as the default in libDori.
For more information, you can find the slides of the presentation here
I want people using my software, and often their first introduction will be through the README in the source code or on the project’s GitHub page.
I should not be lazy: the project needs a great README.
Now that the library is growing, it would be interesting to have a web page generated with Sphinx that describes the API.
The code could be commented a bit more.
Not anytime soon (I am very busy working on other stuff), but it would be great to have some basic Python bindings for libDori.
Compiling the library produces some warnings, mainly:
delete called on 'XXX' that is abstract but has non-virtual destructor [-Wdelete-non-virtual-dtor]
control reaches end of non-void function [-Wreturn-type]
private field 'XXX' is not used [-Wunused-private-field]
field 'XXX' will be initialized after field '_isGrowing' [-Wreorder]
The code needs to be fixed.
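The usual fix for each of these four warnings is sketched below. All class and function names (`Estimator`, `ExactCounter`, `compareCounts`, `selfCheck`) are hypothetical and only stand in for the affected libDori code; `_isGrowing` is the one field name taken from the warning text.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical abstract interface.
class Estimator {
public:
    // -Wdelete-non-virtual-dtor: give the abstract base a virtual destructor
    // so that deleting a derived object through an Estimator* is well defined.
    virtual ~Estimator() = default;
    virtual void add(std::size_t hash) = 0;
    virtual std::size_t count() const = 0;
};

// Hypothetical concrete class.
class ExactCounter : public Estimator {
public:
    // -Wreorder: list the initializers in the same order as the member declarations.
    ExactCounter() : _count(0), _isGrowing(false) {}

    void add(std::size_t) override {
        ++_count;
        _isGrowing = true;
    }
    std::size_t count() const override { return _count; }

    // -Wunused-private-field: either delete the field or actually read it somewhere.
    bool isGrowing() const { return _isGrowing; }

private:
    std::size_t _count;
    bool _isGrowing;
};

// -Wreturn-type: every path through a non-void function must return a value.
inline int compareCounts(const Estimator& a, const Estimator& b) {
    if (a.count() < b.count()) return -1;
    if (a.count() > b.count()) return 1;
    return 0;   // without this line the equal case would fall off the end
}

// Small self-check: the object is destroyed through the base-class pointer.
inline void selfCheck() {
    std::unique_ptr<Estimator> e(new ExactCounter());
    e->add(42);
    assert(e->count() == 1);
}
```

Building with `-Wall -Wextra` on both compilers makes these warnings visible during development, and adding `-Werror` in CI keeps them from coming back.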
The library is getting bigger, so the tests are becoming more important.
They have to: