sb-ai-lab / replay
A Comprehensive Framework for Building End-to-End Recommendation Systems with State-of-the-Art Models
Home Page: https://sb-ai-lab.github.io/RePlay/
License: Apache License 2.0
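Add .count() right after every cache() call
This is needed for a more honest estimate of execution time: Spark's cache() is lazy, so without an action the cost of filling the cache is charged to whichever step runs next.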
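A minimal sketch of the pattern, assuming a PySpark DataFrame (any object exposing `cache()` and `count()` works). The helper names `cache_and_materialize` and `timed` are hypothetical and not part of RePlay.

```python
import time


def cache_and_materialize(df):
    """Cache a Spark DataFrame and force evaluation right away.

    Hypothetical helper: cache() only marks the DataFrame for caching,
    while count() is an action that actually computes and stores it.
    """
    cached = df.cache()  # lazy, nothing is computed yet
    cached.count()       # action, fills the cache now
    return cached


def timed(action):
    """Run a zero-argument callable and return (result, seconds elapsed)."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start
```

With the cache filled up front, a later `timed(lambda: model_step(cached_df))` measures only the step itself (`model_step` being whatever operation is under measurement).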
Will it work on Windows 11? For example, LightFM:
LightFM
class replay.models.LightFMWrap(no_components=128, loss='warp', random_state=None)
Wrapper for LightFM.
From https://sb-ai-lab.github.io/RePlay/pages/modules/models.html#replay-recommenders
Add the UCB algorithm to the recommenders
It is needed to cover the RL family of algorithms
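For reference, a minimal sketch of the classic UCB1 bandit in plain Python. It is not RePlay's implementation or API: the class name `UCB1`, its methods, and the `exploration` parameter are hypothetical, and each item is treated as an independent arm with rewards in [0, 1].

```python
import math


class UCB1:
    """UCB1 bandit: pick the item with the highest upper confidence bound."""

    def __init__(self, n_items, exploration=2.0):
        self.counts = [0] * n_items   # times each item was shown
        self.sums = [0.0] * n_items   # total reward collected per item
        self.total = 0                # total number of rounds played
        self.exploration = exploration

    def select(self):
        # Show every item once before the bound is applied.
        for item, shown in enumerate(self.counts):
            if shown == 0:
                return item
        log_total = math.log(self.total)
        scores = [
            self.sums[i] / self.counts[i]
            + math.sqrt(self.exploration * log_total / self.counts[i])
            for i in range(len(self.counts))
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, item, reward):
        # Record the observed reward for the item that was shown.
        self.counts[item] += 1
        self.sums[item] += reward
        self.total += 1
```

A recommender built on this would call `select()` to choose an item, observe the feedback, and pass it to `update()`.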
to ensure everything is up-to-date
Linux Mint 21.2_x64 - Replay-3.2.4.deb
Even after the models have been downloaded, the "continue" button cannot be pressed
Greetings, and thanks for everything
Implement metrics with normalisation: MAP, NDCG, HitRate, ROC AUC, Precision, Recall
https://arxiv.org/pdf/1801.07030.pdf
For RS simulator
user_test_size in UserSplitter
Update the model with new data in an efficient way.
https://docs.google.com/spreadsheets/d/1oULhfyWgA3rqb3M9NzlBiWeY80xNYizUodyv0tQW9Js/edit#gid=888497155
Work with larger data volumes; update the model without re-fitting.
I have a 'user_id' column in my data, so when I try to use some of the methods or functions, I get this error:
pyspark.sql.utils.AnalysisException: cannot resolve 'user_idx'
Examples where the 'user_idx' column is required:
replay/models/base_rec.py", line 321, in _fit_wrap users = log.select("user_idx").distinct()
replay/filters.py", line 28, in min_entries entries_by_user = data_frame.groupBy("user_idx").count() # type: ignore
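One possible workaround is to rename the column before passing the data to the library. The sketch below assumes a Spark DataFrame and uses only `DataFrame.columns` and `withColumnRenamed`. The helper name `rename_columns` is hypothetical. Renaming alone may not be enough if the library also expects these columns to hold integer indices, which is an assumption not confirmed here.

```python
def rename_columns(df, mapping):
    """Rename columns on a Spark DataFrame, skipping names that are absent.

    Hypothetical helper, not part of RePlay. `mapping` maps existing
    column names to the names the library expects.
    """
    for old, new in mapping.items():
        if old in df.columns:
            df = df.withColumnRenamed(old, new)
    return df


# Example call (assumes `log` is a Spark DataFrame with a 'user_id' column):
# log = rename_columns(log, {"user_id": "user_idx"})
```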
Find out how to preprocess and pass model weights/vectors to Kafka
Model inference
Replace RDD-based operations in metric calculation with UDFs, so that they can be swapped for Scala UDFs in the future
Metrics calculation speed-up
UserSplitter works incorrectly with the user_test_size parameter
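import pandas as pd
# Import paths below are assumed for the RePlay version in use.
from replay.splitters import UserSplitter
from replay.utils import convert2spark

data_frame = pd.DataFrame({"user_idx": [1, 1, 1, 2, 2, 2],
                           "item_idx": [1, 2, 3, 1, 2, 3],
                           "relevance": [1, 2, 3, 4, 5, 6],
                           "timestamp": [1, 2, 3, 3, 2, 1]})
data_frame = convert2spark(data_frame)
# Inspect the first element of the split result as a pandas DataFrame.
UserSplitter(2, 1, seed=80083).split(data_frame)[0].toPandas()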
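To make the expected behaviour concrete, here is a plain-Python reference sketch to compare against. It rests on an assumed reading of the parameters, which the issue does not confirm: `user_test_size` users are sampled, and for each of them the last `item_test_size` interactions by timestamp go to the test set. The function `split_by_user` is hypothetical and is not RePlay code.

```python
import random
from collections import defaultdict


def split_by_user(rows, item_test_size, user_test_size, seed=None):
    """Split interaction rows into (train, test) under the assumed semantics.

    rows: list of dicts with 'user_idx', 'item_idx', 'relevance', 'timestamp'.
    """
    if item_test_size < 1 or user_test_size < 1:
        raise ValueError("sizes must be positive integers")
    by_user = defaultdict(list)
    for row in rows:
        by_user[row["user_idx"]].append(row)
    rng = random.Random(seed)
    # Sample the users whose histories contribute to the test set.
    test_users = set(rng.sample(sorted(by_user), user_test_size))
    train, test = [], []
    for user, history in by_user.items():
        history = sorted(history, key=lambda r: r["timestamp"])
        if user in test_users:
            # The last interactions by timestamp go to test, the rest to train.
            train.extend(history[:-item_test_size])
            test.extend(history[-item_test_size:])
        else:
            train.extend(history)
    return train, test
```

Under this reading, the six-row example above with sizes (2, 1) should give four training rows and two test rows, all test rows belonging to a single user.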