
Comments (4)

lpworld commented on July 28, 2024

Hi Kangqi,

Thank you so much for your interest in our paper, and for your careful reading of our code! Your suggestions will definitely make our project better. We would like to clarify the issues you raised as follows:

(1) Regarding the training/test split. You are right that what we currently do is not strictly "chronological", because we order the user behaviors in the training set and the test set separately rather than jointly. The reason is that we want to test the performance of our proposed method on different combinations of user behavioral sequences, i.e., we randomly shuffle the dataset across multiple runs for cross-validation purposes (which is done externally). This gives us a more robust, though, as you point out, slightly "incorrect", evaluation of recommendation performance. We have followed your advice and conducted a time-based split, and we confirm that our performance is still significantly better than the baselines.
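For reference, here is a minimal sketch of one way to do such a per-user time-based split, assuming a pandas DataFrame with hypothetical `user_id`, `item_id`, and `timestamp` columns (not the exact code we used):

```python
import pandas as pd

def chronological_split(df, test_ratio=0.2):
    """Per-user time-based split: each user's earliest interactions go
    to the training set, and the most recent ones to the test set."""
    df = df.sort_values(["user_id", "timestamp"])
    train_parts, test_parts = [], []
    for _, user_df in df.groupby("user_id", sort=False):
        cutoff = int(len(user_df) * (1 - test_ratio))
        train_parts.append(user_df.iloc[:cutoff])  # earlier interactions
        test_parts.append(user_df.iloc[cutoff:])   # later interactions
    return pd.concat(train_parts), pd.concat(test_parts)
```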

(2) Regarding the calculation of the HR@10 and unexpectedness measures. Note that our dataset is very sparse: very few users in the test set have more than 10 interaction records. Therefore, computing HR@10 is equivalent to computing HR@ALL, i.e., we calculate the utility values of all possible <history_behavior, target_item> pairs in the test set and determine whether each pair is suitable for recommendation. The same logic applies to the unexpectedness measure. You are absolutely right that, ideally, we would calculate unexpectedness only for the products that are actually recommended. However, due to the sparsity of the dataset, recommending the top-10 items to each user would effectively traverse every record in the test set. That is why we chose to fix the <history_behavior, target_item> pairs for evaluation. Again, as you mention, this is not an ideal setting and could introduce biases into the evaluation process.
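To make the HR@10 = HR@ALL equivalence concrete, here is a minimal sketch of this fixed-pair evaluation, assuming a hypothetical `score(history, item)` utility function and test pairs grouped per user (the names are illustrative, not from the repository):

```python
def hr_at_k(user_pairs, score, k=10):
    """HR@K over fixed <history_behavior, target_item> pairs.

    user_pairs: dict mapping a user's history (hashable) to a list of
                (item, label) pairs with 0/1 click labels.
    score:      hypothetical utility function, score(history, item) -> float.
    """
    hits, positives = 0, 0
    for history, pairs in user_pairs.items():
        ranked = sorted(pairs, key=lambda p: score(history, p[0]), reverse=True)
        top_k = {item for item, _ in ranked[:k]}
        for item, label in pairs:
            if label == 1:
                positives += 1
                hits += item in top_k
    # When a user has fewer than k test items, top_k contains every
    # candidate pair, so HR@10 degenerates to HR@ALL.
    return hits / max(positives, 1)
```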

Sorry for the confusion, and thanks again for your valuable advice. I hope this addresses your concerns. Of course, you are more than welcome to contact me with any further comments. Have a nice day!

Best Regards,
Pan


lkq1992yeah commented on July 28, 2024

Hi Pan,

Thanks for your quick, patient and detailed explanation!

I now fully understand your intuition behind the data splitting and the evaluation metrics. For HR@10 and unexpectedness, I have a new idea: for HR@K and unexpectedness (yes, excluding AUC), even though the dataset is relatively sparse, there is no need to limit the evaluation data to the <u, i> pairs with explicit 0/1 labels. Given the current user and their behavior, we can retrieve the top-K scored items from the item pool as the final recommendation list, and then both HR@K and unexpectedness can be defined more straightforwardly:

HR@K = N_{click_item_in_top_k_results} / N_positive_data.

This measure is widely used in item retrieval (matching stage) papers, for example, MIND and ComiRec.

unexpectedness = SUM_{k, <u,i>} distance(user_behavior, recommended_item_k) / (K * N_data)

Since the novelty of a recommendation result is completely unrelated to the click label, all testing <u, i> pairs can be leveraged.

Obviously, such an evaluation method is much more computationally intensive, as we need to traverse every item in the item pool. Nevertheless, I believe these metrics would effectively strengthen your demonstration of the PURS algorithm, considering that the measurement of unexpectedness plays a truly central role in this paper.
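As a concrete illustration, here is a minimal sketch of both metrics under this retrieval-style protocol. The `score(history, item)` utility function and `embed(item)` embedding lookup are hypothetical placeholders for whatever the trained model provides:

```python
import numpy as np

def retrieval_metrics(test_users, item_pool, score, embed, k=10):
    """Retrieval-style HR@K and unexpectedness over the full item pool.

    test_users: list of (history, clicked_items) per test user.
    item_pool:  all candidate item ids (the entire pool is traversed).
    score:      hypothetical utility function, score(history, item) -> float.
    embed:      hypothetical embedding lookup, embed(item) -> np.ndarray.
    """
    hits, positives, dist_sum, dist_count = 0, 0, 0.0, 0
    for history, clicked in test_users:
        # Rank the entire item pool and keep the top-k as the final list.
        top_k = sorted(item_pool, key=lambda i: score(history, i),
                       reverse=True)[:k]
        positives += len(clicked)
        hits += len(set(top_k) & set(clicked))
        # Unexpectedness: mean distance between each recommended item
        # and the user's behavior sequence, independent of click labels.
        hist_vecs = np.stack([embed(i) for i in history])
        for item in top_k:
            dist_sum += np.linalg.norm(hist_vecs - embed(item), axis=1).mean()
            dist_count += 1
    hr_at_k = hits / max(positives, 1)          # N_clicked_in_top_k / N_positive
    unexpectedness = dist_sum / max(dist_count, 1)  # SUM distance / (K * N_users)
    return hr_at_k, unexpectedness
```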

And one more question: is there a way to evaluate the unexpectedness of different model variants under the same item embedding? Since the item embedding parameters are updated during training, it seems we cannot avoid the bias introduced by model differences. I am curious whether you have thought about this issue and whether you have any good ideas.

Best,
Kangqi


lpworld commented on July 28, 2024

Hi Kangqi,

Thank you again for the comments.

Regarding your first point: yes, I think that would be an excellent idea, and I will definitely try it in our implementation. Thank you so much for the suggestion!

Regarding your second point: it is indeed possible to do so (for example, we could train the item embeddings and the recommendation network separately). However, that would require a fundamental redesign of the currently proposed model, so I will explore it in future work.
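As a rough illustration of the "train the embeddings separately" idea, here is a minimal TF2/Keras sketch. The repository itself uses TF1-style code, so this is only the shape of the approach, with illustrative names and sizes: pretrain one item embedding table, freeze it, and reuse it in every model variant so that unexpectedness is always measured in the same embedding space.

```python
import tensorflow as tf

NUM_ITEMS, EMB_DIM = 10000, 64  # illustrative sizes

# One shared item embedding table, pretrained once and then frozen.
shared_embedding = tf.keras.layers.Embedding(
    NUM_ITEMS, EMB_DIM, name="shared_item_embedding")
# ... pretrain shared_embedding here (e.g., with a matrix-factorization
# objective), then freeze it before building the variants:
shared_embedding.trainable = False

def build_variant(hidden_units):
    """Each variant trains its own network on top of the frozen table."""
    item_ids = tf.keras.Input(shape=(1,), dtype=tf.int32)
    x = tf.keras.layers.Flatten()(shared_embedding(item_ids))
    x = tf.keras.layers.Dense(hidden_units, activation="relu")(x)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    return tf.keras.Model(item_ids, out)

variant_a = build_variant(32)  # both variants see identical,
variant_b = build_variant(64)  # non-trainable item embeddings
```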

Best Regards,
Pan


lkq1992yeah commented on July 28, 2024

Thank you for your patient response!

