
lstm-deepfm's Introduction

Quick Tour


Contents

Data Preprocess

Download the sample dataset and put it in the dataset folder.

# A guide to binning is in datapreprocess/Kmeans_Analysis.ipynb
# A guide to permutation feature importance (PIMP) is in datapreprocess/PIMP_Tutorial.ipynb
python datapreprocess/DataGenerator.py # generate the files for training and evaluation
python datapreprocess/train_pimp.py # compute the null and actual importance distributions
python datapreprocess/visualize_pimp.py # visualize the PIMP importance distributions
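
The idea behind the null and actual importance distributions can be sketched as follows. This is only an illustration under stated assumptions, not the code in `datapreprocess/train_pimp.py`: the `importance` function (absolute least-squares coefficients) and the synthetic data are hypothetical stand-ins for whatever model and dataset the scripts actually use. The target is shuffled many times to build a null distribution of importances, and the actual importance of each feature is then compared against it.

```python
import numpy as np


def importance(X, y):
    """Hypothetical importance score: absolute least-squares coefficient
    per feature (assumes features are on comparable scales)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    coef, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return np.abs(coef)


def pimp(X, y, n_permutations=200, seed=0):
    """Return actual importances, the null importances, and empirical p-values."""
    rng = np.random.default_rng(seed)
    actual = importance(X, y)
    null = np.empty((n_permutations, X.shape[1]))
    for i in range(n_permutations):
        # Shuffling the target breaks any real feature/target relationship.
        null[i] = importance(X, rng.permutation(y))
    # Fraction of null importances at least as large as the actual one.
    p_values = (1 + (null >= actual).sum(axis=0)) / (1 + n_permutations)
    return actual, null, p_values


# Synthetic example: only feature 0 carries information about the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=300)
actual, null, p_values = pimp(X, y)
print("actual importance:", actual)
print("empirical p-values:", p_values)
```

A feature whose actual importance lies far in the tail of its null distribution (small p-value) is likely to be genuinely informative; features that fall inside the null distribution are candidates for removal.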

Pretraining

python pretrain/unsupervised_pretraining.py # for unsupervised pretraining
python pretrain/selfsupervised_pretraining.py # for self-supervised pretraining

Finetuning

python finetuning/supervised_finetuning.py # for supervised finetuning

Questions

How to compare performance with your model?

We plan to add a module that makes the proposed methods quick to use, as shown in Quick Tour.

The model does not perform well on some datasets

Some parameters of the model, such as the size of the hidden layer, have a great impact on performance. For example, on the Debutanizer dataset, a smaller hidden layer helps the model generalize well.

Related work on self-supervised learning

Self-supervised learning has achieved great success in natural language processing and computer vision, for example with BERT and MAE. It is especially useful when a large amount of unlabeled data is available, because self-supervised tasks help uncover hidden relationships in industrial big data.
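
To make the idea concrete, here is a minimal sketch of a masked-reconstruction pretext task in the spirit of BERT and MAE. It is an assumption for illustration only: the source does not state which pretext task `pretrain/selfsupervised_pretraining.py` uses, and the names `mask_window` and `masked_mse`, the mask ratio, and the window shape are all hypothetical. Random time steps of an unlabeled window are hidden, and a model would be trained to reconstruct them.

```python
import numpy as np


def mask_window(window, mask_ratio=0.25, seed=0):
    """Zero out a random subset of time steps in a (time, features) window."""
    rng = np.random.default_rng(seed)
    n_steps = window.shape[0]
    n_masked = max(1, int(round(mask_ratio * n_steps)))
    masked_idx = rng.choice(n_steps, size=n_masked, replace=False)
    mask = np.zeros(n_steps, dtype=bool)
    mask[masked_idx] = True
    corrupted = window.copy()
    corrupted[mask] = 0.0  # hidden time steps the model must reconstruct
    return corrupted, mask


def masked_mse(reconstruction, window, mask):
    """Reconstruction loss computed only on the masked time steps."""
    return float(np.mean((reconstruction[mask] - window[mask]) ** 2))


# Unlabeled window: 20 time steps, 7 process variables (synthetic).
window = np.random.default_rng(1).normal(size=(20, 7))
corrupted, mask = mask_window(window)
print("masked steps:", np.flatnonzero(mask))
```

No labels are needed for this objective, which is why it suits large unlabeled process datasets; the pretrained weights can then be finetuned on the labeled samples.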

Is the FM module useful or will it impair performance?

Our initial goal is fusion learning across the various characteristics of industrial data, so the ability of FM to extract information from discrete features is important. However, if there are few discrete features, or their feature importance is low, the FM module may reduce the performance of the model. The FM module can also play an integrating role: if fusion learning performs worse, SciPy can be used to find the optimal fusion weight. With this weighting, the performance of LSTM-DeepFM is expected to be better than that of LSTM-Deep alone.
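
One way to search for a fusion weight with SciPy is sketched below. It rests on assumptions not confirmed by the source: that the fusion is a convex combination of two prediction vectors, that the weight is chosen by minimizing mean squared error on held-out data, and that `best_fusion_weight`, `pred_lstm_deep`, and `pred_fm` are hypothetical names with synthetic values.

```python
import numpy as np
from scipy.optimize import minimize_scalar


def best_fusion_weight(pred_a, pred_b, y_true):
    """Find w in [0, 1] minimizing the MSE of w * pred_a + (1 - w) * pred_b."""
    def mse(w):
        fused = w * pred_a + (1.0 - w) * pred_b
        return float(np.mean((fused - y_true) ** 2))

    result = minimize_scalar(mse, bounds=(0.0, 1.0), method="bounded")
    return result.x, result.fun


# Synthetic validation data: two noisy predictors of the same target.
rng = np.random.default_rng(0)
y_true = rng.normal(size=500)
pred_lstm_deep = y_true + 0.3 * rng.normal(size=500)  # hypothetical LSTM-Deep output
pred_fm = y_true + 0.6 * rng.normal(size=500)         # hypothetical FM output

w, fused_mse = best_fusion_weight(pred_lstm_deep, pred_fm, y_true)
print(f"weight on LSTM-Deep: {w:.3f}, fused MSE: {fused_mse:.4f}")
```

Because `w = 1` reproduces the first predictor exactly, the fused error on the data used for the search cannot be meaningfully worse than that predictor alone. The weight should be chosen on a validation split rather than the test set.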
