KDD_Cup2020_Debiasing

Issues

  • Memory optimization
  • Multiprocessing speed-up

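Process

  • Find debiasing_data_20200401.csv in the data folder, open the links it contains, and download the data. In the same directory, open [Official]含解压密码.md and use the matching password to unzip the data.

  • In this competition, users rarely repeat clicks and each user clicks only a few items, so the interaction space is sparse and spread out. Recall with a two-tower DSSM, word2vec, or GloVe therefore performs relatively poorly.

  • The solution mainly uses itemCF for recall and LightGBM for ranking.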
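The itemCF recall step can be illustrated with a minimal sketch. This is not the code in recall.py: the function names, the activity down-weighting, and the normalization are assumptions chosen for illustration, and the real implementation may weight co-clicks differently (for example by time or position).

```python
from collections import defaultdict
from math import log, sqrt


def item_cf_similarity(user_clicks):
    """Build item-to-item similarity from per-user click lists.

    user_clicks: dict mapping a user id to a list of clicked item ids
    (hypothetical input format, not taken from the repository).
    """
    item_count = defaultdict(int)
    co_count = defaultdict(lambda: defaultdict(float))
    for items in user_clicks.values():
        unique_items = set(items)
        # Assumption: down-weight very active users so they do not dominate.
        weight = 1.0 / log(1.0 + len(unique_items))
        for i in unique_items:
            item_count[i] += 1
            for j in unique_items:
                if i != j:
                    co_count[i][j] += weight
    # Normalize co-occurrence by the popularity of both items.
    return {
        i: {j: c / sqrt(item_count[i] * item_count[j]) for j, c in related.items()}
        for i, related in co_count.items()
    }


def recall(user_items, sim, top_k=50):
    """Return up to top_k unseen items ranked by summed similarity."""
    seen = set(user_items)
    scores = defaultdict(float)
    for i in user_items:
        for j, s in sim.get(i, {}).items():
            if j not in seen:
                scores[j] += s
    # Sort by score descending, then by item id for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_k]]
```

In a pipeline like the one described here, the recalled candidates would then be passed to the ranking model together with their similarity scores as features.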

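Directory

  • code

    - txt_img_cosine_similarity.ipynb: computes item-to-item cosine similarity from the text and image vectors given in underexpose_train/underexpose_item_feat.csv, kept as features for the ranking model

    - process.py: data loading, data processing, embedding, memory optimization, item and user clustering, session splitting, and generation of user, item, and interaction features

    - recall.py: recall with icf, ucf, and similar methods

    - model.py: LightGBM ranking model

    - metric.py: evaluation of recall results

    - main_V1.ipynb: main entry point, version V1, which produced the highest final score on leaderboard B

    - main_V2.ipynb: main entry point, version V2

  • data

    - process: intermediate data

    - underexpose_test: official test data

    - underexpose_train: official training data

    - debiasing_data_20200401.csv: official data download links

    - [Official]含解压密码.md: official unzip passwords for the data

  • result

    - submit_traceb_v1.csv: submission file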
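For reference, the cosine similarity that txt_img_cosine_similarity.ipynb is described as computing can be sketched as follows. The notebook's actual code is not shown in this README, so the function name and the plain-list vector format are assumptions, and a real run over all items would more likely use vectorized array operations.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    # Assumption: treat an all-zero vector as having no similarity.
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Applied to the text vectors and the image vectors of two items separately, this would yield two similarity values per item pair that the ranking model could use as features.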
    
