2019baai-zhihu-cup-findexp-4th's Introduction

2019Baai-zhihu-Cup-findexp-4th---

4th Place, 2019 Zhihu Kanshan Cup

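This repository contains the complete code for the 4th-place solution to the expert-finding task of the 2019 Zhihu Kanshan Cup. The core idea is to treat the user as the main object of study: we mine useful information from each user's historical behavior, including features that characterize the user and features that reflect how the user interacts with questions, and use it to predict the user's future behavior. We used gradient-boosted tree models such as LightGBM together with neural network models common in recommender systems such as xDeepFM. The final result was obtained by stacking the different models.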
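
As a rough illustration of the stacking step, the sketch below builds out-of-fold predictions that a second-level model could be trained on. It is a minimal sketch under stated assumptions, not the code in this repository: the function name `out_of_fold_predictions`, the round-robin fold assignment, and the two toy base learners (stand-ins for LightGBM and xDeepFM) are all hypothetical.

```python
from typing import Callable, List, Sequence

# A "trainer" fits on (X, y) and returns a function that scores one row.
Predictor = Callable[[Sequence[float]], float]
Trainer = Callable[[List[Sequence[float]], List[int]], Predictor]


def out_of_fold_predictions(X, y, trainers: List[Trainer], n_folds: int = 3):
    """Return one row per sample with one out-of-fold score per base model."""
    n = len(X)
    oof = [[0.0] * len(trainers) for _ in range(n)]
    for fold in range(n_folds):
        # Hypothetical fold assignment: round-robin by row index.
        val_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        for m, trainer in enumerate(trainers):
            predict = trainer(X_tr, y_tr)
            for i in val_idx:
                # Each sample is scored by a model that never saw it.
                oof[i][m] = predict(X[i])
    return oof


def mean_rate_trainer(X_tr, y_tr) -> Predictor:
    """Toy base learner: always predicts the training acceptance rate."""
    rate = sum(y_tr) / len(y_tr)
    return lambda row: rate


def threshold_trainer(X_tr, y_tr) -> Predictor:
    """Toy base learner: thresholds the first feature at its training mean."""
    threshold = sum(row[0] for row in X_tr) / len(X_tr)
    return lambda row: 1.0 if row[0] > threshold else 0.0


X = [[0.1], [0.8], [0.2], [0.9], [0.3], [0.7]]
y = [0, 1, 0, 1, 0, 1]
oof = out_of_fold_predictions(X, y, [mean_rate_trainer, threshold_trainer])
```

The rows of `oof` would serve as the input features of the second-level model. In practice the base learners would be the real models and the folds would more likely be split by time or at random.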

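In the invitation record file, the organizers provided users' invitation records from the past month. Each invitation is one sample, labeled by whether the user accepted it: accepted invitations are positive samples and all others are negative. Each record also includes the invitation timestamp. The task is to use users' historical invitation records to predict whether they will accept invitations during the following week. The official training and test sets are in fact what was actually pushed to users after passing through Zhihu's internal recall and ranking modules, so these invitations, including those in the test set, are themselves the output of Zhihu's internal models. Our understanding is that the training and prediction data given to participants are therefore inherently biased toward users' preferences. As a result, features built around the training samples themselves work very well in this competition, for example the number of samples per user or per question.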
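
The per-user and per-question sample counts mentioned above can be sketched as follows. This is a minimal illustration, not the repository's feature code: the record layout and the field names `user_id`, `question_id`, `user_invite_count`, and `question_invite_count` are assumptions.

```python
from collections import Counter


def add_count_features(invites):
    """Attach per-user and per-question sample counts to each invitation."""
    user_counts = Counter(r["user_id"] for r in invites)
    question_counts = Counter(r["question_id"] for r in invites)
    return [
        {
            **r,
            "user_invite_count": user_counts[r["user_id"]],
            "question_invite_count": question_counts[r["question_id"]],
        }
        for r in invites
    ]


# Hypothetical invitation records (field names are assumed).
sample_invites = [
    {"user_id": "u1", "question_id": "q1"},
    {"user_id": "u1", "question_id": "q2"},
    {"user_id": "u2", "question_id": "q1"},
]
featured = add_count_features(sample_invites)
```

Here `featured[0]` carries a user count of 2 and a question count of 2, since `u1` and `q1` each appear in two of the three records.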

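An invitation itself carries a lot of information: when a question is sent to a user as an invitation, that already suggests the user is likely to be interested in it. We therefore built many invitation count features. They worked very well offline, but the online improvement was limited. The distribution of invitation counts is basically stable within the training set, but because the organizers released only half of the validation set, count features are severely biased on the validation set. This is also an important reason why many participants in the preliminary round saw a large gap between their offline and online scores. In the preliminary round, we gave each count a weight based on the total number of samples that day and the number of ids appearing that day, and then summed the weighted counts, which kept the offline and online scores relatively consistent.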
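
The text does not give the exact weight, so the sketch below assumes one plausible form: each day's count is multiplied by (distinct ids that day) / (total samples that day), that is, divided by the day's average count per id, and the weighted counts are then summed over days. The function name `weighted_invite_counts`, the field names, and the weight formula are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def weighted_invite_counts(invites, id_key="user_id"):
    """Sum per-day counts for each id after scaling by a per-day weight."""
    per_day_counts = defaultdict(Counter)
    for r in invites:
        per_day_counts[r["day"]][r[id_key]] += 1

    weighted = defaultdict(float)
    for day, counts in per_day_counts.items():
        total_samples = sum(counts.values())
        distinct_ids = len(counts)
        # ASSUMED weight: the inverse of the day's average count per id.
        weight = distinct_ids / total_samples
        for id_, count in counts.items():
            weighted[id_] += count * weight
    return dict(weighted)


# Hypothetical records: day 1 has 3 samples over 2 ids, day 2 has 1 sample.
sample_invites = [
    {"user_id": "u1", "day": 1},
    {"user_id": "u1", "day": 1},
    {"user_id": "u2", "day": 1},
    {"user_id": "u1", "day": 2},
]
weighted = weighted_invite_counts(sample_invites)
```

With this weighting, a day on which only part of the samples is visible contributes counts on roughly the same scale as a fully observed day, which is the kind of stability the paragraph above describes.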

2019baai-zhihu-cup-findexp-4th's People

Contributors

voldemortzzz

2019baai-zhihu-cup-findexp-4th's Issues

ImportError: No module named 'optimizer'

Hello, and thank you very much for open-sourcing this code. When I run it, I get the error ImportError: No module named 'optimizer'. Would you be able to upload this module? Thanks!
