
sigir16-topkcomplete

This is an example project for the SIGIR 2016 tutorial "Succinct Data Structures in Information Retrieval: Theory and Practice", presented by Simon Gog and Rossano Venturini.

The example shows how the Succinct Data Structure Library (SDSL) can be used to implement a space-efficient top-k query completion system. The final result is an almost state-of-the-art system implemented in fewer than 300 lines of code.
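To make the query semantics concrete, here is a minimal sketch of top-k completion: given a prefix, return the k highest-weighted strings that start with it. It deliberately uses plain standard-library containers rather than SDSL, so it is not space-efficient and is not the implementation from this project. The names `Entry`, `build_index`, and `top_k` are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical entry type: a completion string and its weight
// (for example a click count or a number of daily train stops).
struct Entry {
    std::string text;
    std::uint64_t weight;
};

// Sort entries lexicographically so that all strings sharing a prefix
// form one contiguous range.
inline void build_index(std::vector<Entry>& entries) {
    std::sort(entries.begin(), entries.end(),
              [](const Entry& a, const Entry& b) { return a.text < b.text; });
}

// Return up to k entries that start with `prefix`, highest weight first.
// `sorted` must have been prepared with build_index.
inline std::vector<Entry> top_k(const std::vector<Entry>& sorted,
                                const std::string& prefix, std::size_t k) {
    // First entry that is not lexicographically smaller than the prefix.
    auto lo = std::lower_bound(
        sorted.begin(), sorted.end(), prefix,
        [](const Entry& e, const std::string& p) { return e.text < p; });

    // First entry at or after `lo` that no longer starts with the prefix.
    auto hi = std::partition_point(lo, sorted.end(), [&](const Entry& e) {
        return e.text.compare(0, prefix.size(), prefix) == 0;
    });

    // Copy the matching range and keep only the k heaviest entries.
    std::vector<Entry> result(lo, hi);
    const std::size_t n = std::min(k, result.size());
    std::partial_sort(
        result.begin(), result.begin() + n, result.end(),
        [](const Entry& a, const Entry& b) { return a.weight > b.weight; });
    result.erase(result.begin() + n, result.end());
    return result;
}
```

Locating the prefix range takes two binary searches, but selecting the top k is linear in the number of matches. Avoiding that scan, and the space cost of storing the strings uncompressed, is where succinct data structures become useful.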

Here is an example of our final system. The index is built over titles and click counts of Wikipedia pages.

[Image: Searching Wikipedia titles]

Installation

    ./install.sh

Building the project

    cd build
    cmake ..
    make

CMake will parse the index.config file and generate binaries for each index. The index name will be the prefix of the corresponding executables.

Running the command line version

    ./index1-main ../data/stops_nl.txt

The binary generates an index, then waits for user input and answers queries (one per line) interactively. The index is stored in ../data/stops_nl.txt.index1.sdsl and a visualization of its memory consumption is available at stops_nl.txt.index1.html. In general, each executable IDX-* stores the generated index at file.IDX.sdsl and its space visualization at file.IDX.html, where file is the input file and IDX is the index name.
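The naming pattern described above can be written down as a small sketch. The helper names `index_path` and `space_report_path` are hypothetical and are not taken from the project's source; only the file.IDX.sdsl and file.IDX.html pattern comes from the text.

```cpp
#include <string>

// Hypothetical helper: where the index for `input` built by index `idx`
// is stored, following the file.IDX.sdsl pattern.
inline std::string index_path(const std::string& input, const std::string& idx) {
    return input + "." + idx + ".sdsl";
}

// Hypothetical helper: where the space visualization is written,
// following the file.IDX.html pattern.
inline std::string space_report_path(const std::string& input, const std::string& idx) {
    return input + "." + idx + ".html";
}
```

For the input ../data/stops_nl.txt and the index name index1, `index_path` yields ../data/stops_nl.txt.index1.sdsl, which matches the path given above.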

Running the webserver version

    ./index1-webserver ../data/stops_nl.txt 8000

The binary generates an index and starts a webserver that listens on the specified port.

Running the demo application

  1. Change into the build directory
  2. Download the Wikipedia titles by calling make download
  3. Build the executable by calling make index4ci-webserver
  4. Generate the index and start the webserver by calling ./index4ci-webserver ../data/enwiki-20160601-all-titles
  5. You can access the demo at http://127.0.0.1:8000

Credits

  • Thanks to Sascha Witt for preparing the example input file, which contains pairs of Dutch train stations and their number of daily train stops.

  • Thanks to all contributors to the SDSL project.
