
kgrec's Introduction

Introduction

This is the PyTorch implementation of our KDD'23 paper: Knowledge Graph Self-Supervised Rationalization for Recommendation.

Environment Dependencies

See requirements.txt for the environment used in our experiments.

Run KGRec

Simply use:

python run_kgrec.py --dataset [dataset_name]

The hyperparameters are fixed per dataset in KGRec.py.

Baseline Models (KGCL, KGIN)

We also implement KGCL and include the original KGIN release in our repository. For example, to run KGCL, you may execute:

alibaba-ifashion

python run_kgcl.py --mu 0.7 --tau 0.2 --cl_weight 0.1

last-fm

python run_kgcl.py --mu 0.5 --tau 0.1 --cl_weight 0.1

mind

python run_kgcl.py --mu 0.6 --tau 0.2 --cl_weight 0.1
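
The three commands above differ only in their hyperparameter values, so they can be kept in one table and generated on demand. The following is a minimal sketch; the `KGCL_ARGS` table and the `kgcl_command` helper are illustrative names that are not part of the repository, and the values are copied from the commands above. No dataset flag is added because the listed commands do not show one.

```python
import shlex

# Hyperparameters copied from the KGCL commands listed above.
KGCL_ARGS = {
    "alibaba-ifashion": {"mu": 0.7, "tau": 0.2, "cl_weight": 0.1},
    "last-fm": {"mu": 0.5, "tau": 0.1, "cl_weight": 0.1},
    "mind": {"mu": 0.6, "tau": 0.2, "cl_weight": 0.1},
}


def kgcl_command(dataset: str) -> str:
    """Return the run_kgcl.py command line listed above for one dataset."""
    argv = ["python", "run_kgcl.py"]
    for name, value in KGCL_ARGS[dataset].items():
        argv += [f"--{name}", str(value)]
    # shlex.join quotes arguments safely for a POSIX shell (Python 3.8+).
    return shlex.join(argv)


if __name__ == "__main__":
    for dataset in KGCL_ARGS:
        print(f"{dataset}: {kgcl_command(dataset)}")
```

Keeping the values in one place makes it harder for a command and its dataset label to drift apart when the settings change.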

Citation

Please cite our work if you find our paper or code helpful.

@inproceedings{yang2023knowledge,
  title={Knowledge graph self-supervised rationalization for recommendation},
  author={Yang, Yuhao and Huang, Chao and Xia, Lianghao and Huang, Chunzhen},
  booktitle={Proceedings of the 29th ACM SIGKDD conference on knowledge discovery and data mining},
  pages={3046--3056},
  year={2023}
}


kgrec's Issues

Speeding up model training

Hello,
I ran run_kgrec.py with your original settings.
The running time of one epoch on each dataset is as follows:
last-fm : 15min
[screenshot]

mind-f : 2h26min
[screenshot]

alibaba-fashion : 32min
[screenshot]

Environment:
system : Ubuntu 20.04.5 LTS
CPU : Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
GPU : NVIDIA GeForce RTX 3090

I have checked that both the GPU and the CPU are working normally.
How can I get the running times you mentioned in a previous issue: 800 secs for iFashion and Last-FM, and 700 secs for MIND?
Thank you.

Reproducing baselines

How was MCCLK reproduced? The code released with the original paper performs CTR prediction.

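Environment configuration

Hello, could you share the approximate environment configuration for the code, for example the versions of Python, PyTorch, and the main libraries used? The README and the paper do not state them. Many thanks.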

Dataset

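Hello, I have some further questions about the datasets.
Taking last-fm as an example, the statistics reported in the paper are shown in this figure:
[image]
1. The number of interactions does not match
The paper reports 3034796 interactions,
but train.txt contains 1289003 interactions
and test.txt contains 423635 interactions.

2. Entities
The paper reports 58266 entities,
but n_entities in the code is 106389.

3. The dataset split is inconsistent
The paper states train/valid/test = 7:1:2,
but the code only has train/test = 1289003:423635, which is roughly 3:1.

Why do these inconsistencies occur?

4. What is the reason for the test interval and early stop step settings?
test_interval = 10 if args.dataset == 'last-fm' else 1
early_stop_step = 5 if args.dataset == 'last-fm' else 10

5. Test time
When I run KGRec on last-fm, the test time is about as long as the train time.
When I run KGRec on alibaba-fashion, the test time is 3 times the train time.
How can I speed up testing? (In the log of your earlier run on mind, the test time is much shorter than the train time.)

Thank you.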
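
The counts quoted in points 1 and 3 above can be checked with a few lines of arithmetic. This sketch uses only the numbers given in this issue; the variable names are illustrative.

```python
# Counts quoted in the issue above for last-fm.
paper_interactions = 3_034_796
train_interactions = 1_289_003
test_interactions = 423_635

# Interactions actually present in the released train/test files.
released_total = train_interactions + test_interactions  # 1712638

# Share of each split and the train:test ratio.
train_share = train_interactions / released_total
test_share = test_interactions / released_total
train_test_ratio = train_interactions / test_interactions

# Gap between the paper's reported count and the released files.
missing = paper_interactions - released_total  # 1322158

print(f"released total: {released_total}")
print(f"train share: {train_share:.3f}, test share: {test_share:.3f}")
print(f"train:test ratio: {train_test_ratio:.2f}:1")
print(f"gap to paper count: {missing}")
```

The released files therefore hold about 75% train and 25% test, and together account for roughly 56% of the interaction count reported in the paper.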

Does batch size affect the results?

Hello, does the batch size affect the results? Training with the default batch size of 1024 takes a long time, so I would like to increase the batch size to speed it up. Thank you!

model problem

Thank you very much for sharing.
I ran python run_kgrec.py --dataset last-fm once to process the dataset.
When I ran the same command (python run_kgrec.py --dataset last-fm) a second time to train,
I got this error:

NameError: name 'test_user_set' is not defined

experimental result

Thank you very much for sharing the code.

After running python run_kgrec.py --dataset last-fm,
where can I view the experimental results (NDCG or recall)?
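
For reference, the two metrics named in this question are commonly defined as below. This is a minimal sketch of the standard binary-relevance definitions, not the repository's evaluator; the function names are illustrative.

```python
import math
from typing import Sequence, Set


def recall_at_k(ranked_items: Sequence[int], relevant: Set[int], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_items: Sequence[int], relevant: Set[int], k: int) -> float:
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


if __name__ == "__main__":
    ranking = [3, 1, 7, 5]      # model's ranked item ids for one user
    ground_truth = {1, 5, 9}    # items the user interacted with in the test set
    print(recall_at_k(ranking, ground_truth, k=4))
    print(ndcg_at_k(ranking, ground_truth, k=4))
```

Per-user values are usually averaged over all test users to give the reported recall@K and NDCG@K.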

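Inconsistencies between the code and the paper's design

Hello,
After tracing through the code, I have some questions:
1. KGRec.py combines two masks, but the paper does not describe this design.
2. In the forward function of class KGRec, why is the following done?
item_attn_mean_1[item_attn_mean_1 == 0.] = 1.
item_attn_mean_2[item_attn_mean_2 == 0.] = 1.
I do not quite understand why the items with the lowest score are set to the highest.
3. When computing the contrastive loss, according to Equation 17 the denominator should contain 4 times between_sim (substituting v, v', and v'' for j in the formula), rather than the 2 times between_sim written in the code.
4. Equation 17: the losses over all items should be summed, rather than averaged as in the code.
5. Equation 14: the corresponding code should use _bi_norm_lap(adj), but the code uses _si_norm_lap(adj). Why?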
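
Question 5 turns on the difference between two adjacency normalizations. In codebases of the KGAT/KGIN family, the names `_bi_norm_lap` and `_si_norm_lap` conventionally denote the symmetric normalization D^{-1/2} A D^{-1/2} and the random-walk normalization D^{-1} A; that mapping is assumed here rather than verified against this repository. The sketch below uses small dense matrices in pure Python, and its function names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def row_degrees(adj: Matrix) -> List[float]:
    """Degree of each node, taken as the row sum of the adjacency matrix."""
    return [sum(row) for row in adj]


def symmetric_norm(adj: Matrix) -> Matrix:
    """D^{-1/2} A D^{-1/2}: scales each edge by both endpoint degrees."""
    inv_sqrt = [d ** -0.5 if d > 0 else 0.0 for d in row_degrees(adj)]
    n = len(adj)
    return [
        [inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
        for i in range(n)
    ]


def random_walk_norm(adj: Matrix) -> Matrix:
    """D^{-1} A: scales each row so that it sums to 1 (a mean over neighbors)."""
    inv = [1.0 / d if d > 0 else 0.0 for d in row_degrees(adj)]
    return [[inv[i] * value for value in adj[i]] for i in range(len(adj))]


if __name__ == "__main__":
    # Node 0 is connected to nodes 1 and 2.
    adj = [[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    print(symmetric_norm(adj))
    print(random_walk_norm(adj))
```

On a symmetric adjacency matrix, the first form stays symmetric while the second makes every non-empty row sum to 1, so the two yield different aggregation weights whenever node degrees differ.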

5 groups

How were the 5 groups of data used in the experiments divided? What is the division based on?

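Model performance

Hello, I tried running all the code you provided and have the following questions:
1.
This table shows the results reported in the paper:
[image]
This table shows the results I obtained by running the code:
[image]
Some of the numbers differ greatly, and KGRec also loses to KGCL on the mind dataset.
I ran the code entirely with the default settings and changed only the batch size.
2.
[image]
The log above says early stopping at 10, recall@20:0.0349,
but the recall in the table is 0.02996574.
Why is there this inconsistency?

Thank you.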

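Details of the dataset contents

The datasets contain only the IDs of items and entities. How can I find the name corresponding to each entity? I would like to learn more about what the KG actually contains. For example, the last-fm KG is extracted from Freebase, but starting from the org_id in the KG I have not found a way to map IDs to names.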

Figure 3, Figure 4

Could you explain how the groups G1, G2, ... in Figure 3 and Figure 4 of your KGRec paper are formed?

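Running KGCL on the mind-f dataset

Hello, could you share the exact parameters for running KGCL on the mind-f dataset? With the command given on GitHub, python run_kgcl.py --mu 0.6 --tau 0.2 --cl_weight 0.1, I cannot reproduce the results in the paper, whereas the other two datasets give the same results as the paper.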

Baseline performance

Which code and which parameter settings were used to reproduce SGL in the paper?

Rationale Masking Mechanism

Hello, there is something in the paper I do not understand:
[screenshot]

Here M𝑘 is the set of triplets with the top 𝑘-highest rationale scores,
but why does the text below say 'we remove the edges M𝑘 with low rationale scores...'?
Thank you.

IndexError: index ... is out of bounds for dimension 0 with size ...

Thank you for sharing the code!

I tried KGCL on two datasets, and both reported similar errors:

alibaba-fashion:

index 30208 is out of bounds for dimension 0 with size 30040
Traceback (most recent call last):
  File "run_kgcl.py", line 154, in <module>
    ret = evaluator.test(model, user_dict, n_params)
  File ".../KGRec-main/utils/evaluator.py", line 139, in test
    u_g_embeddings = user_gcn_emb[user_batch]
IndexError: index 30208 is out of bounds for dimension 0 with size 30040

last-fm:

index 23566 is out of bounds for dimension 0 with size 23566
Traceback (most recent call last):
  File "run_kgcl.py", line 154, in <module>
    ret = evaluator.test(model, user_dict, n_params)
  File ".../KGRec-main/utils/evaluator.py", line 152, in test
    i_g_embddings = entity_gcn_emb[item_batch]
IndexError: index 23566 is out of bounds for dimension 0 with size 23566
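
Both tracebacks report an index at or beyond the size of the table being indexed. In the last-fm case the index equals the size, which is already out of range because valid indices run from 0 to size - 1. A check like the one below, run on a batch of IDs before the lookup, would surface the offending values. It is a sketch only: `out_of_range_ids` is a hypothetical helper that is not part of the repository, and it says nothing about why the IDs exceed the table.

```python
from typing import Iterable, List


def out_of_range_ids(ids: Iterable[int], table_size: int) -> List[int]:
    """Return the sorted, de-duplicated ids that cannot index a table of this size.

    Valid ids are 0 .. table_size - 1. Negative ids are also reported, since
    user and item ids are not expected to be negative.
    """
    return sorted({i for i in ids if i < 0 or i >= table_size})


if __name__ == "__main__":
    # Sizes and indices taken from the two error messages above.
    print(out_of_range_ids([0, 23565, 23566], table_size=23566))
    print(out_of_range_ids([5, 30208], table_size=30040))
```

Comparing the largest ID in the batch with the first dimension of the embedding tensor is usually enough to tell whether the IDs or the table size is at fault.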
