Top-k Recommendation

Introduction

A collection of Python code for top-k recommendation, refactored with TensorFlow.
The old repository is here.
The current implementation is purely Python-based; however, it runs slower than the old one.

The collection will consist of the following methods:

  • Bayesian Personalized Ranking (BPR)
    • BPR is the earliest of the BPR-based methods.
    • It is only applicable in the in-matrix recommendation scenario.
  • Visual Bayesian Personalized Ranking (VBPR)
    • VBPR extends BPR by incorporating visual content into the rating prediction.
    • It can recommend videos in both in-matrix and out-of-matrix recommendation scenarios.
  • DeepMusic (DPM)
    • DPM uses a multilayer perceptron (MLP) to learn the content latent vectors from MFCC features.
    • It recommends videos in both in-matrix and out-of-matrix recommendation scenarios.
  • Collaborative Topic Regression (CTR)
    • CTR uses LDA to learn the topic distribution from the textual content vectors, and then performs collaborative regression to learn the user and item latent vectors.
    • CTR can perform in-matrix and out-of-matrix recommendation, but only with textual content vectors.
    • The original code can be downloaded from here.
  • Collaborative Deep Learning (CDL)
    • CDL uses a stacked denoising auto-encoder (SDAE) to learn the content latent vectors, and then performs collaborative regression to learn the user and item latent vectors.
    • CDL can perform in-matrix and out-of-matrix recommendation.
    • The original code can be downloaded from here.
    • CDL originally supports textual content only.
    • CDL can support non-textual content by replacing the binary visible layer with a Gaussian visible layer.
  • Neural Collaborative Filtering (NCF)
  • Collaborative Embedding Regression (CER)
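
As a rough illustration of the pairwise ranking idea behind BPR, the first method listed above, the sketch below applies plain SGD updates to one (user, liked video, unliked video) triple. It is not taken from this repository: the function name bpr_sgd_step, the learning rate, the regularization weight, and the toy vectors are all assumptions made for the example, and it uses plain Python lists instead of TensorFlow.

```python
import math


def bpr_sgd_step(user_vec, pos_vec, neg_vec, lr=0.05, reg=0.01):
    """Apply one in-place SGD update for a (user, liked, unliked) triple.

    Returns the BPR loss -log(sigmoid(x_ui - x_uj)) measured before the update.
    All names and default values here are illustrative assumptions.
    """
    # Score difference between the liked and the unliked item.
    diff = sum(u * (p - n) for u, p, n in zip(user_vec, pos_vec, neg_vec))
    # Gradient weight sigmoid(-diff) = 1 / (1 + exp(diff)).
    weight = 1.0 / (1.0 + math.exp(diff))
    for f in range(len(user_vec)):
        u, p, n = user_vec[f], pos_vec[f], neg_vec[f]
        # Gradient ascent on log(sigmoid(diff)) with L2 regularization.
        user_vec[f] += lr * (weight * (p - n) - reg * u)
        pos_vec[f] += lr * (weight * u - reg * p)
        neg_vec[f] += lr * (-weight * u - reg * n)
    return math.log1p(math.exp(-diff))


# Toy latent vectors (made up for the example).
user = [0.1, -0.2, 0.05]
liked = [0.2, 0.1, -0.1]
unliked = [-0.1, 0.3, 0.2]

losses = [bpr_sgd_step(user, liked, unliked) for _ in range(200)]
final_diff = sum(u * (p - n) for u, p, n in zip(user, liked, unliked))
print("first loss:", losses[0], "last loss:", losses[-1])
```

Repeating the update on the same triple pushes the liked video's score above the unliked one's, so the loss shrinks. A real trainer would sample many triples from the rating files instead.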

Instruction

All the code in the repository is written in Python 3.
To simplify the installation of Python 3, please use Anaconda.
The dependencies are numpy, scipy, and tensorflow.
After forking, complete the following steps before running the code:

  • Use pip to install numpy, scipy, and tensorflow;
  • Download the datasets.

For training, you can run

python train.py

For evaluation, you can run

python evaluate.py -d data -m embed/cer -f 0 -sl im om

This evaluates CER's performance in both the in-matrix and out-of-matrix settings using the content feature (in our example, this is meta).
By default, the evaluation reports accuracy@5, 10, 15, 20, 25, and 30.
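
This README does not define accuracy@k, so the sketch below only illustrates one common precision-style reading of a top-k score: the fraction of each user's k highest-scored videos that the user actually liked, averaged over users. It may differ from what evaluate.py computes. The function names and the toy scores are assumptions made for the example.

```python
def top_k(scores, k):
    """Return the ids of the k highest-scoring items, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def precision_at_k(scores_by_user, liked_by_user, k):
    """Average over users of the fraction of top-k items that the user liked.

    This is an assumed, precision-style definition used for illustration only.
    """
    total = 0.0
    for user, scores in scores_by_user.items():
        liked = liked_by_user.get(user, set())
        hits = sum(1 for item in top_k(scores, k) if item in liked)
        total += hits / k
    return total / len(scores_by_user)


# Toy predicted scores and liked sets (made up for the example).
scores_by_user = {
    "u1": {"a": 0.9, "b": 0.1, "c": 0.5},
    "u2": {"a": 0.2, "b": 0.8, "c": 0.3},
}
liked_by_user = {"u1": {"a"}, "u2": {"b", "c"}}

for k in (1, 2):
    print(k, precision_at_k(scores_by_user, liked_by_user, k))
```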

Dataset

Due to file size limitations, the datasets for training and testing are hosted elsewhere.
At present, we provide two datasets derived from MovieLens 10M and Netflix:
MovieLens: ratings and features
Netflix: ratings and features
Each of them has the following data files for experiments:

  • uid:
    • User id list, where each line is a user id. The ids may not be consecutive.
  • vid:
    • Video id list, where each line is a video id. The ids may not be consecutive.
  • f?[tr|te][.|.im|.om].[idl|txt]:
    • Rating-related files, where ? is the fold index, tr denotes the training set, te denotes the testing set, im denotes in-matrix evaluation, om denotes out-of-matrix evaluation, idl denotes an id list, and txt denotes a rating file.
    • Each line in a rating file starts with a user id, followed by the corresponding video-rating pairs separated by commas. In each video-rating pair, 1 denotes like and 0 denotes dislike.
    • For instance:
      1. f2tr.txt contains the ratings in the training set of fold 2
      2. f2te.im.txt contains the ratings in the test set of fold 2 for in-matrix evaluation
      3. f2te.om.txt contains the ratings in the test set of fold 2 for out-of-matrix evaluation
  • The input data files for CTR are also provided. Their suffix is 'mfp'.
  • The feature files can be read with pickle in binary mode. The feature vectors are aligned with the id list in vid.
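
The file conventions above can be sketched as follows. This is a minimal example under stated assumptions, not code from the repository: the helper names rating_file_name and load_features are hypothetical, the feature file name features.pkl is made up, and the pickled object is assumed to be a sequence with one row per line of vid.

```python
import os
import pickle
import tempfile


def rating_file_name(fold, split, scenario=None, kind="txt"):
    """Build a name such as 'f2te.im.txt' from the naming pattern above."""
    if split not in ("tr", "te"):
        raise ValueError("split must be 'tr' or 'te'")
    if scenario not in (None, "im", "om"):
        raise ValueError("scenario must be None, 'im' or 'om'")
    if kind not in ("idl", "txt"):
        raise ValueError("kind must be 'idl' or 'txt'")
    middle = "" if scenario is None else "." + scenario
    return "f{}{}{}.{}".format(fold, split, middle, kind)


def load_features(vid_path, feature_path):
    """Map each video id to its feature vector, relying on row alignment."""
    with open(vid_path) as handle:
        video_ids = [line.strip() for line in handle if line.strip()]
    with open(feature_path, "rb") as handle:  # pickle needs binary mode
        features = pickle.load(handle)
    if len(features) != len(video_ids):
        raise ValueError("feature rows do not match the vid list")
    return dict(zip(video_ids, features))


print(rating_file_name(2, "tr"))
print(rating_file_name(2, "te", "im"))

# Round trip on throwaway files so the example runs without the datasets.
with tempfile.TemporaryDirectory() as folder:
    vid_path = os.path.join(folder, "vid")
    feature_path = os.path.join(folder, "features.pkl")  # hypothetical name
    with open(vid_path, "w") as handle:
        handle.write("7\n42\n")
    with open(feature_path, "wb") as handle:
        pickle.dump([[0.1, 0.2], [0.3, 0.4]], handle)
    features_by_vid = load_features(vid_path, feature_path)

print(features_by_vid)
```

Keeping the ids as strings avoids assuming anything about their numeric range, since the id sequence is not guaranteed to be gap-free.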

Please modify the data access paths inside the code so that it executes correctly.

The original 10380 videos can be downloaded from the link below:
Baidu Yunpan

Reference

If you use the above code or data, please cite the paper below:
@article{VCRS,
    author   = {Xingzhong Du and Hongzhi Yin and Ling Chen and Yang Wang and Yi Yang and Xiaofang Zhou},
    title = {Personalized Video Recommendation Using Rich Contents from Videos},
    journal = {TKDE},
    year = {2019}
}
