libdori's Issues

Travis CI

I do not know yet why Travis CI is failing to build the project, but it should not be. I should look into it as soon as possible.

Once the Travis builds are fixed, I could compile the project with both gcc and clang to make sure everything works (right now I only build with clang).

Warning flags during compilation

Compiling the library produces some warnings, mainly:

  • delete called on 'XXX' that is abstract but has non-virtual destructor [-Wdelete-non-virtual-dtor]
  • control reaches end of non-void function [-Wreturn-type]
  • private field 'XXX' is not used [-Wunused-private-field]
  • field 'XXX' will be initialized after field '_isGrowing' [-Wreorder]

The code needs to be fixed.
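As a rough sketch of what the fixes could look like, the snippet below addresses each of the four warnings in one small example. The class names `Estimator` and `GrowingCounter` are made up for illustration and are not the library's real classes; only the `_isGrowing` field name comes from the warning text.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical interface standing in for one of the library's abstract classes.
class Estimator {
public:
    // A virtual destructor makes deleting through a base pointer well defined
    // (addresses -Wdelete-non-virtual-dtor).
    virtual ~Estimator() = default;
    virtual void add(std::size_t hashed) = 0;
    virtual std::size_t count() const = 0;
};

// Hypothetical concrete class.
class GrowingCounter : public Estimator {
public:
    // Initializers follow the declaration order of the fields below
    // (addresses -Wreorder).
    explicit GrowingCounter(bool growing) : _isGrowing(growing), _count(0) {}

    void add(std::size_t) override {
        if (_isGrowing) ++_count;
    }

    // Every control path returns a value (addresses -Wreturn-type).
    std::size_t count() const override {
        if (_isGrowing) return _count;
        return 0;
    }

private:
    // Both fields are read somewhere, so neither triggers
    // -Wunused-private-field.
    bool _isGrowing;     // declared first, so it is initialized first
    std::size_t _count;
};

// Deletes a derived object through a base pointer, which is the situation the
// first warning is about.
std::size_t demo() {
    std::unique_ptr<Estimator> e(new GrowingCounter(true));
    e->add(1);
    e->add(2);
    assert(e->count() == 2);
    return e->count();
}
```

Building with `-Wall -Wextra` on both compilers would help catch these again if they come back.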

CLI tools need a --help option

There are three basic command-line tools right now:

  • cardinality
  • frequency
  • sample

All three use an argument parser, but their options are not documented anywhere.
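A minimal sketch of how a shared help check could look, independent of whichever argument parser the tools use. The functions `usage` and `wants_help` are hypothetical names, and the usage text lists placeholder options rather than the tools' real ones.

```cpp
#include <cassert>
#include <cstring>
#include <sstream>
#include <string>

// Builds the usage text for one tool. The options listed here are
// placeholders; each tool would list its own.
std::string usage(const std::string& program) {
    std::ostringstream out;
    out << "Usage: " << program << " [options] <input-file>\n"
        << "Options:\n"
        << "  -h, --help    Show this message and exit\n";
    return out.str();
}

// Returns true when -h or --help appears anywhere after the program name.
bool wants_help(int argc, const char* const argv[]) {
    assert(argv != nullptr);
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "-h") == 0 ||
            std::strcmp(argv[i], "--help") == 0) {
            return true;
        }
    }
    return false;
}
```

Each tool's `main` could then start with something like `if (wants_help(argc, argv)) { std::cout << usage(argv[0]); return 0; }` before handing the arguments to the parser.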

Wiki pages on GitHub

Taking advantage of GitHub's wiki pages could be valuable for the project.

  • There are many people who know nothing about the algorithms. I could point to the right place to learn more about them.

  • A section explaining how the code is organized would be very useful for my future self and for other contributors.

Tests update

The library is getting bigger, so the tests are becoming more important.

They have to:

  • Cover all of the library's main features.
  • Build by default, and test the installation and compilation with CMake.
  • Remove dependencies on external files (such as D1.txt).
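One possible way to drop the dependency on data files is to generate the test stream in memory with a known number of distinct items. The helpers below, `make_stream` and `exact_cardinality`, are hypothetical names for this sketch and not part of the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <unordered_set>
#include <vector>

// Builds a stream of `length` items drawn from exactly `distinct` different
// strings, so the true cardinality is known without reading a file.
std::vector<std::string> make_stream(std::size_t distinct,
                                     std::size_t length,
                                     unsigned seed) {
    assert(distinct > 0 && length >= distinct);
    std::vector<std::string> stream;
    stream.reserve(length);

    // Emit every distinct item once so the cardinality is exactly `distinct`.
    for (std::size_t i = 0; i < distinct; ++i) {
        stream.push_back("item-" + std::to_string(i));
    }

    // Fill the remainder with repeats picked by a seeded generator, so each
    // run of the test binary sees the same stream.
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::size_t> pick(0, distinct - 1);
    while (stream.size() < length) {
        stream.push_back("item-" + std::to_string(pick(gen)));
    }

    // Shuffle so the first occurrences are not all at the front.
    std::shuffle(stream.begin(), stream.end(), gen);
    return stream;
}

// Exact reference count to compare an estimator's output against.
std::size_t exact_cardinality(const std::vector<std::string>& stream) {
    return std::unordered_set<std::string>(stream.begin(), stream.end()).size();
}
```

A test could feed the same stream to an estimator and check that its result stays within the expected relative error of `exact_cardinality(stream)`.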

Update README

I want people using my software, and often their first introduction will be through the README in the source code or on the project’s GitHub page.

I should not be lazy: the project needs a great README.

Implement Recordinality algorithm

Implement the Recordinality algorithm.

Data streams can be studied as random permutations. That fact allows a wealth of classical and recent results from combinatorics to be recycled as estimators for various statistics over data streams.

Recordinality estimates the number of distinct elements in a stream by counting the number of K-records occurring in it.

Implement a basic, extensible version as a cardinality estimator.
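A minimal sketch of the k-record counting idea, to be checked against the paper before it goes into the library. It keeps the k largest hash values seen so far, counts how often that set changes, and applies the estimator k(1 + 1/k)^(R - k + 1) - 1, where R is the record count. The class layout, the use of `std::hash` followed by a splitmix64-style mixing step, and returning the exact count while fewer than k distinct values have been seen are all assumptions of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <set>
#include <string>

class Recordinality {
public:
    explicit Recordinality(std::size_t k) : _k(k), _records(0) {
        assert(k > 0);
    }

    void add(const std::string& item) {
        const std::uint64_t h = mix(std::hash<std::string>()(item));
        if (_largest.count(h) != 0) return;   // already stored: not a new record
        if (_largest.size() < _k) {
            // The first k distinct hash values are all k-records.
            _largest.insert(h);
            ++_records;
        } else if (h > *_largest.begin()) {
            // Larger than the current k-th largest value: a new k-record.
            _largest.erase(_largest.begin());
            _largest.insert(h);
            ++_records;
        }
        // Values that were evicted earlier are smaller than the current
        // minimum, so repeats never count as records.
    }

    double estimate() const {
        // Assumption of this sketch: report the exact count until k distinct
        // values have been seen.
        if (_largest.size() < _k) return static_cast<double>(_largest.size());
        const double k = static_cast<double>(_k);
        const double exponent = static_cast<double>(_records - _k + 1);
        return k * std::pow(1.0 + 1.0 / k, exponent) - 1.0;
    }

    std::size_t records() const { return _records; }

private:
    // splitmix64-style finalizer, used so that hash values behave more like
    // uniform random numbers than raw std::hash output might.
    static std::uint64_t mix(std::uint64_t x) {
        x += 0x9e3779b97f4a7c15ULL;
        x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
        x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
        return x ^ (x >> 31);
    }

    std::size_t _k;
    std::size_t _records;
    std::set<std::uint64_t> _largest;  // the k largest hash values seen so far
};
```

The state depends only on the order in which distinct values first appear, so feeding the same items again leaves the record count unchanged.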

HyperBitBit

Robert Sedgewick from Princeton presented a new algorithm for cardinality estimation at AofA '16.

It is inspired by HyperLogLog, but it reduces the memory footprint even further.

It would be great (and easy) to implement it and use it as the default in libDori.

For more information, you can find the slides of the presentation here

Adding Sphinx API documentation

Now that the library is growing, it would be useful to have a web page generated with Sphinx that describes the API.

The code could be commented a bit more.

Python bindings

Not anytime soon (I am very busy with other work), but it would be great to have some basic Python bindings for libDori.

  • The Python C API or cffi would be a fast and straightforward way to write them.
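For the cffi route, the library would first need a plain C interface to bind against. The sketch below shows one possible shape, using an opaque handle. Every name in it (`dori_counter`, `dori_counter_new`, and so on) is hypothetical, and `ExactCounter` is only a stand-in for a real libDori estimator.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <unordered_set>

// Stand-in for a libDori estimator; it counts exactly, purely for illustration.
class ExactCounter {
public:
    void add(const std::string& item) { _seen.insert(item); }
    std::size_t count() const { return _seen.size(); }

private:
    std::unordered_set<std::string> _seen;
};

extern "C" {

// Opaque handle: C and Python callers only ever hold a pointer to it.
typedef struct dori_counter dori_counter;

// Returns NULL on allocation failure instead of throwing across the C boundary.
dori_counter* dori_counter_new(void) {
    return reinterpret_cast<dori_counter*>(new (std::nothrow) ExactCounter());
}

// A production wrapper would also catch exceptions here (for example
// std::bad_alloc) and turn them into an error code.
void dori_counter_add(dori_counter* c, const char* item) {
    assert(c != nullptr && item != nullptr);
    reinterpret_cast<ExactCounter*>(c)->add(item);
}

std::size_t dori_counter_count(const dori_counter* c) {
    assert(c != nullptr);
    return reinterpret_cast<const ExactCounter*>(c)->count();
}

void dori_counter_free(dori_counter* c) {
    delete reinterpret_cast<ExactCounter*>(c);
}

}  // extern "C"
```

With the four declarations placed in a header, cffi could load the shared library and call them directly, and a thin Python class could own the handle and release it with `dori_counter_free`.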
