
queryrec's Introduction

Query Recommendation System

This is a query recommendation system that uses Jaccard similarity and collaborative filtering to recommend queries to users based on the queries they have previously rated. It also provides a way to assign a general rating to a query across all users. The project was done as part of the "Data Mining" course at the University of Trento by Seyed Mohammad Mousavi and Omar Facchini. The project report is in the report folder.
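To make the idea concrete, here is a minimal sketch of user-based collaborative filtering with Jaccard similarity. It is not the project's implementation (see the notebooks and the report for that). It assumes that two users are compared by the sets of queries they have rated, and that a missing rating is predicted as a similarity-weighted average. The names `jaccard`, `predict`, and `recommend`, and the toy `ratings` data, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Assumed layout for illustration: user id -> {query id: rating}
Ratings = Dict[str, Dict[str, float]]


def jaccard(a: set, b: set) -> float:
    """Return |A ∩ B| / |A ∪ B|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(ratings: Ratings, user: str, query: str) -> Optional[float]:
    """Similarity-weighted average of other users' ratings for `query`.

    Returns None when no user with non-zero similarity has rated the query.
    """
    rated_by_user = set(ratings[user])
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or query not in other_ratings:
            continue
        sim = jaccard(rated_by_user, set(other_ratings))
        weighted_sum += sim * other_ratings[query]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


def recommend(ratings: Ratings, user: str, k: int = 2) -> List[Tuple[str, float]]:
    """Return up to k unrated queries with the highest predicted rating."""
    all_queries = {q for user_ratings in ratings.values() for q in user_ratings}
    candidates = all_queries - set(ratings[user])
    scored = []
    for q in candidates:
        score = predict(ratings, user, q)
        if score is not None:
            scored.append((q, score))
    # Highest score first; ties broken by query id for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:k]


# Toy data, invented for this example only.
ratings = {
    "u1": {"q1": 80, "q2": 60},
    "u2": {"q1": 90, "q2": 50, "q3": 70},
    "u3": {"q2": 40, "q4": 20},
}

print(recommend(ratings, "u1"))
```

With this toy data, `u1` has not rated `q3` or `q4`, so those are the only candidates. Each is scored from the one other user who rated it, weighted by how much that user's set of rated queries overlaps with `u1`'s.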

Managing the Environment and Dependencies

To start working, first install virtualenv with pip.

pip install virtualenv

Then create an empty virtual environment.

virtualenv .venv

Note that .venv is the name of the virtual environment directory. This directory is listed in the .gitignore file, so Git does not track it.

After creating the virtual environment, activate it.

UNIX based Operating Systems (GNU/Linux, macOS, etc.)

source .venv/bin/activate

Windows

.\.venv\Scripts\activate

Now you can install the required Python packages in the clean environment you just created.

pip install -r requirements.txt

Data generation

The data needed to run the system is already provided. To generate different datasets, change into the queryrec folder and run python .\src\datagen\datagenerator.py on Windows or python ./src/datagen/datagenerator.py on Linux.

Either command generates datasets of three different sizes; the sizes are currently fixed.
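The Windows and Linux commands differ only in the path separator, so a small launcher can build the script path with pathlib and work on both. The following is a sketch; the helper name `datagen_command` is hypothetical and not part of the repository, and the script path is the one given above.

```python
import sys
from pathlib import Path


def datagen_command(repo_root="."):
    """Build the command that runs the data generator on any OS.

    `repo_root` is assumed to be the queryrec folder.
    """
    script = Path(repo_root) / "src" / "datagen" / "datagenerator.py"
    # sys.executable is the interpreter currently running, i.e. the one
    # from the activated virtual environment.
    return [sys.executable, str(script)]


print(datagen_command())
```

Passing the returned list to `subprocess.run(..., check=True)` from the queryrec folder would then start the generator with the interpreter of the active environment.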

What is in the folders

The src folder contains two folders: one for data generation and one, called dataSetup, for the actual implementation of the system. The dataSetup folder contains three .ipynb notebooks that walk through the application step by step. The query_recommendation.ipynb notebook covers only the baseline, as it uses only the smallest dataset. The query_recommendation_evaluation.ipynb notebook is the main one to use when running the entire system, as it uses all the datasets. The general_utility.ipynb notebook covers part B of the project and shows how our idea of utility could be implemented. Because its main purpose is to show how well our chosen approach to utility works, it uses only the baseline dataset.

How to run the notebooks

The easiest way to run these notebooks is with Jupyter, which lets you view the code as separate cells and run each one individually. To work properly, the cells must be run in order. To install JupyterLab, run pip install jupyterlab; to start it, run jupyter lab. For more information on installation, see the Jupyter website.

Another option is to run the notebooks in Visual Studio Code with the Jupyter extension, which can be found in the editor's Extensions view.
