irecsys / deepcarskit

A Deep Learning Based Context-Aware Recommendation Library

Home Page: https://carskit.github.io/

License: MIT License

Shell 0.12% Python 98.76% HTML 1.12%
collaborative-filtering context-aware context-aware-recommender-system deep-learning neural-collaborative-filtering neural-network pytorch recommender-system deep-recommender-system

deepcarskit's Introduction

DeepCARSKit

A Deep Learning Based Context-Aware Recommendation Library

License Website carskit.github.io python Citation Badge DOI:10.1007/978-3-319-76207-4_15

CARSKit Website

History

  • CARSKit was released in 2015 as the first open-source library for context-aware recommendation. It has received no significant updates since 2019. It was built on Java and Librec v1.3. A Python version, CARSKit-API, is available as a Python wrapper around CARSKit.
  • Recommender systems based on deep learning have matured in recent years, and context-aware recommendation models based on traditional collaborative filtering (e.g., KNN-based CF, matrix factorization) have become outdated. We therefore developed and released DeepCARSKit, which is built upon the RecBole v1.0.0 recommendation library. DeepCARSKit is a deep learning based context-aware recommendation library that runs on Python and PyTorch once it is configured correctly.

Feature

  • Implemented Deep Context-Aware Recommendation Models. Currently, we support the CARS models built based on factorization machines (FM) and Neural Collaborative Filtering (NeuCF and NeuMF). More algorithms will be added.

  • Multiple Data Splits & Evaluation Options. We provide evaluations based on both hold-out and N-fold cross-validation.

  • Extensive and Standard Evaluation Protocols. We rewrote code in RecBole to adapt the evaluation to context-aware recommendation. In particular, item recommendations can be produced for each unique combination of user and context situation. Relevance and ranking metrics, such as precision, recall, NDCG, and MRR, can be calculated by taking context information into account.

  • Autosave Best Logs. DeepCARSKit can automatically save the best log/configuration of the models you run in the 'log/best/' folder.

  • Other Features. Other characteristics of DeepCARSKit are inherited from RecBole, such as GPU acceleration.
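
The context-aware evaluation protocol described in the list above can be illustrated with a small sketch. This is not DeepCARSKit's own evaluation code: the function name `precision_at_k_by_uc` and the toy data are hypothetical, and the only idea taken from the README is that rankings are built per (user, context situation) pair rather than per user.

```python
from collections import defaultdict


def precision_at_k_by_uc(scored, relevant, k):
    """Average precision@k over (user, context) groups.

    scored:   iterable of (uc_id, item_id, score) predictions
    relevant: dict mapping uc_id -> set of relevant item ids
    """
    by_uc = defaultdict(list)
    for uc_id, item_id, score in scored:
        by_uc[uc_id].append((score, item_id))

    precisions = []
    for uc_id, pairs in by_uc.items():
        # Rank the items of this (user, context) pair by descending score.
        ranked = sorted(pairs, key=lambda pair: pair[0], reverse=True)
        top_k = [item_id for _, item_id in ranked[:k]]
        hits = sum(1 for item_id in top_k if item_id in relevant.get(uc_id, set()))
        precisions.append(hits / k)
    return sum(precisions) / len(precisions) if precisions else 0.0


# Toy predictions: the same user in two different context situations.
scored = [
    ("u1_weekend", "i1", 0.9), ("u1_weekend", "i2", 0.4), ("u1_weekend", "i3", 0.7),
    ("u1_weekday", "i1", 0.2), ("u1_weekday", "i2", 0.8), ("u1_weekday", "i3", 0.5),
]
relevant = {"u1_weekend": {"i1"}, "u1_weekday": {"i2", "i3"}}
result = precision_at_k_by_uc(scored, relevant, k=2)
```

Grouping by the user-context key means the same user can receive a different top-k list, and a different score, in each context situation.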

News & Updates

03/19/2022: We release DeepCARSKit v1.0.0

Documents

Installation

DeepCARSKit works with the following operating systems:

  • Linux
  • Windows 10
  • macOS X

DeepCARSKit requires Python 3.7 or later, torch 1.7.0 or later, and RecBole 1.0.0 or later but earlier than 1.1 (v1.1+ is not compatible with DeepCARSKit). For more details, refer to the list of requirements. To use DeepCARSKit with a GPU, make sure that your CUDA or cudatoolkit version is 9.2 or later. This requires NVIDIA driver version >= 396.26 (for Linux) or >= 397.44 (for Windows 10).
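
The version constraints above can be checked before running anything. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that "1.0.0 or later, but not 1.1+" means `1.0.0 <= version < 1.1` for RecBole.

```python
import re
import sys


def parse_version(text):
    """Turn '1.0.1' or '1.7.0+cu110' into a tuple of three ints."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '1.0' so that tuple comparison behaves.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def recbole_is_compatible(version):
    # Assumption: compatible means 1.0.0 <= version < 1.1.
    return (1, 0, 0) <= parse_version(version) < (1, 1)


def torch_is_compatible(version):
    return parse_version(version) >= (1, 7, 0)


python_ok = sys.version_info >= (3, 7)
```

On Python 3.8 or later, the installed version strings can be read with `importlib.metadata.version("recbole")` and `importlib.metadata.version("torch")`, assuming both packages were installed under those distribution names.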

More information about installation from conda and pip will be released later. Currently, you can git clone the source code. We will publish it to PyPI and conda in the next release.

Quick-Start

With the source code, you can use the provided script for initial usage of our library:

python run.py

This script will run the NeuCMFi model on the DePaulMovie dataset.

Data Sets & Preparation

A list of available data sets for research on context-aware recommender systems can be found here. We provide two data sets (i.e., DePaulMovie and TripAdvisor) in the library. You can refer to their data format, such as depaulmovie.inter.

More specifically, you need to prepare a data set that looks like this (use 'float' and 'token' to indicate numerical and nominal variables):

  • user_id:token
  • item_id:token
  • rating:float
  • context variable 1:token
  • context variable 2:token
  • context variable N:token
  • contexts:token => a concatenation of context conditions
  • uc_id:token => a concatenation of user_id and contexts
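
The field list above can be produced from raw rating records with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names, the context column names (`time`, `companion`), and the `_` used to concatenate values are all hypothetical, so check depaulmovie.inter for the convention the library actually uses. The comma field separator matches the `field_separator` shown in the run log in this document.

```python
import csv
import io


def add_context_fields(rows, context_columns, sep="_"):
    """Add the derived 'contexts' and 'uc_id' fields to each record."""
    enriched_rows = []
    for row in rows:
        contexts = sep.join(str(row[column]) for column in context_columns)
        enriched = dict(row)
        enriched["contexts"] = contexts
        enriched["uc_id"] = f"{row['user_id']}{sep}{contexts}"
        enriched_rows.append(enriched)
    return enriched_rows


def to_inter_text(rows, context_columns):
    """Serialize records with a typed header: name:token or name:float."""
    fields = ["user_id", "item_id", "rating", *context_columns, "contexts", "uc_id"]
    header = [
        f"{name}:float" if name == "rating" else f"{name}:token" for name in fields
    ]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([row[name] for name in fields])
    return buffer.getvalue()


# Hypothetical raw records with two context variables.
raw = [
    {"user_id": "u1", "item_id": "i1", "rating": 4.0, "time": "weekend", "companion": "family"},
    {"user_id": "u2", "item_id": "i1", "rating": 2.0, "time": "weekday", "companion": "alone"},
]
context_columns = ["time", "companion"]
inter_text = to_inter_text(add_context_fields(raw, context_columns), context_columns)
```

Writing `inter_text` to a file named after the data set (for example `mydata.inter`, a hypothetical name) gives a file with the typed header on the first line and one interaction per following line.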

Algorithms in NeuCMF Framework

An extensive NeuCMF framework is included in the DeepCARSKit library. There are multiple variants of the NeuCMF models in this framework.


Hyperparameter tuning

You can tune the parameters in the configuration file, config.yaml.
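
When several values are to be tried for each parameter, it can help to enumerate the combinations first and then edit config.yaml for each run. The sketch below only builds that list. The parameter names (`learning_rate`, `embedding_size`, `dropout_prob`) appear in the run log reproduced in this document, while the candidate values and the `expand_grid` helper are hypothetical.

```python
from itertools import product


def expand_grid(grid):
    """Return one dict per combination of the candidate values."""
    names = list(grid)
    combinations = product(*(grid[name] for name in names))
    return [dict(zip(names, values)) for values in combinations]


# Hypothetical candidate values for three parameters.
search_space = {
    "learning_rate": [0.01, 0.001],
    "embedding_size": [32, 64],
    "dropout_prob": [0.1],
}
candidates = expand_grid(search_space)
```

Each entry in `candidates` is one setting to copy into config.yaml before a run.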

A more detailed user guide is on the way.

Major Releases

Releases Date
v1.0.0 03/19/2022

Cite

If you find DeepCARSKit useful for your research or development, please cite the following paper:

@article{deepcarskit,
    title={DeepCARSKit: A Deep Learning Based Context-Aware Recommendation Library},
    author={Zheng, Yong},
    journal={Software Impacts},
    volume={13},
    pages={100292},
    year={2022},
    publisher={Elsevier}
}

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change. Please make sure to update tests as appropriate.

We welcome collaborators and contributors to DeepCARSKit. Your names will be listed here.

Sponsors

The current project was supported by Google Cloud Platform. We are looking for more sponsors to support the development and distribution of this library. If you are interested in sponsorship, please let us know. Our official email is DeepCARSKit [at] gmail [dot] com.

License

MIT License


deepcarskit's Issues

Error when running run.py script

I have cloned the repository and want to test the code, so I have started following the instructions in the README file and am getting some errors. (I cloned this repo one day before posting this issue so that you can get the exact version to reproduce the error)

Steps to reproduce Error

Error is at the end of the bash area

Some more additional information about the hardware and software
Software

  • OS = Rocky Linux 8.5 (Green Obsidian)
  • Python = 3.9.9

Hardware

  • CUDA = 11.6
  • GPU = NVIDIA A2

Error

GPU availability:  True
Num of GPU:  1
NVIDIA A2
Current GPU index:  0

18 Feb 12:52    INFO  
General Hyper Parameters:
gpu_id = 0
use_gpu = True
seed = 2022
state = INFO
reproducibility = True
data_path = dataset/tripadvisor
checkpoint_dir = saved
show_progress = False
save_dataset = False
dataset_save_path = None
save_dataloaders = False
dataloaders_save_path = None
log_wandb = False

Training Hyper Parameters:
epochs = 50
train_batch_size = 500
learner = adam
learning_rate = 0.01
train_neg_sample_args = {'distribution': 'none', 'sample_num': 'none', 'alpha': 'none', 'dynamic': False, 'candidate_num': 0}
eval_step = 1
stopping_step = 10
clip_grad_norm = None
weight_decay = 0.0
loss_decimal_place = 4

Evaluation Hyper Parameters:
eval_args = {'split': {'CV': 5}, 'group_by': 'user', 'mode': 'labeled', 'order': 'RO'}
repeatable = False
metrics = ['MAE', 'RMSE', 'AUC']
topk = [10, 20, 30]
valid_metric = MAE
valid_metric_bigger = False
eval_batch_size = 409600
metric_decimal_place = 4

Dataset Hyper Parameters:
field_separator = ,
seq_separator =  
USER_ID_FIELD = user_id
ITEM_ID_FIELD = item_id
RATING_FIELD = rating
TIME_FIELD = timestamp
seq_len = None
LABEL_FIELD = label
threshold = {'rating': 0}
NEG_PREFIX = neg_
load_col = None
unload_col = None
unused_col = None
additional_feat_suffix = None
rm_dup_inter = None
val_interval = None
filter_inter_by_user_or_item = True
user_inter_num_interval = [0,inf)
item_inter_num_interval = [0,inf)
alias_of_user_id = None
alias_of_item_id = None
alias_of_entity_id = None
alias_of_relation_id = None
preload_weight = None
normalize_field = None
normalize_all = None
ITEM_LIST_LENGTH_FIELD = item_length
LIST_SUFFIX = _list
MAX_ITEM_LIST_LENGTH = 50
POSITION_FIELD = position_id
HEAD_ENTITY_ID_FIELD = head_id
TAIL_ENTITY_ID_FIELD = tail_id
RELATION_ID_FIELD = relation_id
ENTITY_ID_FIELD = entity_id
benchmark_filename = None

Other Hyper Parameters: 
worker = 0
wandb_project = recbole
shuffle = True
require_pow = False
enable_amp = False
enable_scaler = False
transform = None
numerical_features = []
discretization = None
kg_reverse_r = False
entity_kg_num_interval = [0,inf)
relation_kg_num_interval = [0,inf)
MODEL_TYPE = ModelType.CONTEXT
CONTEXT_SITUATION_FIELD = contexts
USER_CONTEXT_FIELD = uc_id
neg_sampling = None
mf_embedding_size = 64
mlp_embedding_size = 64
mlp_hidden_size = [128, 64, 32]
dropout_prob = 0.1
mf_train = True
mlp_train = True
embedding_size = 64
ranking = False
sigmoid = False
ranking_valid_metric = Recall@10
ranking_metrics = ['Precision', 'Recall', 'NDCG', 'MRR', 'MAP']
err_valid_metric = MAE
err_metrics = ['MAE', 'RMSE', 'AUC']
MODEL_INPUT_TYPE = InputType.POINTWISE
eval_type = EvaluatorType.VALUE
single_spec = True
local_rank = 0
device = cuda
eval_neg_sample_args = {'distribution': 'none', 'sample_num': 'none'}


18 Feb 12:52    INFO  tripadvisor
The number of users: 2372
Average actions of users: 5.978490088570224
The number of items: 2270
Average actions of items: 6.24724548259145
The number of inters: 14175
The sparsity of the dataset: 99.73674142529214%
Remain Fields: ['user_id', 'item_id', 'rating', 'trip', 'contexts', 'uc_id']
Context dimension - trip: 6 values: : ['BUSINESS' 'COUPLES' 'FAMILY' 'FRIENDS' 'SOLO' '[PAD]']
Traceback (most recent call last):
  File "/scratch/apeddi/DeepCARSKit/run.py", line 32, in <module>
    run(config_file_list=config_list)
  File "/scratch/apeddi/DeepCARSKit/deepcarskit/quick_start/quick_start.py", line 96, in run
    train_data, valid_data = data_preparation(config, dataset)
  File "/scratch/apeddi/DeepCARSKit/deepcarskit/data/utils.py", line 132, in data_preparation
    train_sampler, valid_sampler = create_samplers(config, dataset, built_datasets[fold])
  File "/scratch/apeddi/DeepCARSKit/deepcarskit/data/utils.py", line 301, in create_samplers
    if train_neg_sample_args['strategy'] != 'none':
KeyError: 'strategy'

@irecsys Could you please help me in resolving this error?
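
One possible reading of this traceback, offered as an assumption rather than a confirmed diagnosis: the logged `train_neg_sample_args` contains the keys `distribution`, `sample_num`, `alpha`, `dynamic` and `candidate_num`, but no `strategy` key, so the lookup in `create_samplers` fails. The README states that RecBole v1.1+ is not compatible with DeepCARSKit, so the installed RecBole version is worth checking first. The sketch below only reproduces the lookup in isolation. `has_negative_sampling` is a hypothetical helper, not part of DeepCARSKit or RecBole.

```python
# Value copied from the "Training Hyper Parameters" section of the log above.
train_neg_sample_args = {
    "distribution": "none",
    "sample_num": "none",
    "alpha": "none",
    "dynamic": False,
    "candidate_num": 0,
}

# The failing line indexes the dict directly, which raises KeyError here.
try:
    train_neg_sample_args["strategy"]
    lookup_failed = False
except KeyError:
    lookup_failed = True


def has_negative_sampling(args):
    """Hypothetical helper that tolerates both key layouts.

    Falls back to 'distribution' when 'strategy' is absent.
    """
    strategy = args.get("strategy", args.get("distribution", "none"))
    return strategy != "none"
```

Patching the library this way is only a workaround sketch. Installing a RecBole version in the range the README names is the option the README itself supports.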

getting error while running run.py

KeyError: 'strategy'
Traceback (most recent call last):
  File "/home/user/DeepCARSKit/run.py", line 32, in <module>
    run(config_file_list=config_list)
  File "/home/user/DeepCARSKit/deepcarskit/quick_start/quick_start.py", line 96, in run
    train_data, valid_data = data_preparation(config, dataset)
  File "/home/user/DeepCARSKit/deepcarskit/data/utils.py", line 132, in data_preparation
    train_sampler, valid_sampler = create_samplers(config, dataset, built_datasets[fold])
  File "/home/user/DeepCARSKit/deepcarskit/data/utils.py", line 301, in create_samplers
    if train_neg_sample_args['strategy'] != 'none':
