PyTorch Tabular

PyTorch Tabular aims to make Deep Learning with Tabular data easy and accessible to real-world cases and research alike. The core principles behind the design of the library are:

  • Low-Resistance Usability
  • Easy Customization
  • Scalable and Easier to Deploy

It is built on the shoulders of giants like PyTorch (obviously) and PyTorch Lightning.

Installation

Although the installation includes PyTorch, the recommended approach is to install PyTorch separately first, picking the right CUDA version for your machine.

Once you have PyTorch installed, use:

 pip install pytorch_tabular[all]

to install the complete library with extra dependencies.

Or:

 pip install pytorch_tabular

for the bare essentials.

The sources for pytorch_tabular can be downloaded from the GitHub repo.

You can clone the public repository:

git clone https://github.com/manujosephv/pytorch_tabular

Once you have a copy of the source, you can install it with:

python setup.py install
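
To confirm that the installation succeeded, a quick check along these lines may help. It is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper written for this illustration and is not part of PyTorch Tabular.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report whether PyTorch and PyTorch Tabular are visible to this interpreter.
for name in ("torch", "pytorch_tabular"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, check that you are running the same Python environment that you installed into.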

Documentation

For complete documentation with tutorials, visit ReadTheDocs.

Available Models

To implement new models, see the How to implement new models tutorial. It covers basic as well as advanced architectures.

Usage

from pytorch_tabular import TabularModel
from pytorch_tabular.models import CategoryEmbeddingModelConfig
from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig, ExperimentConfig

data_config = DataConfig(
    target=['target'],  # Target should always be a list. Multi-target is supported only for regression; multi-task classification is not implemented
    continuous_cols=num_col_names,
    categorical_cols=cat_col_names,
)
trainer_config = TrainerConfig(
    auto_lr_find=True, # Runs the LRFinder to automatically derive a learning rate
    batch_size=1024,
    max_epochs=100,
    gpus=1,  # Number of GPUs to use; 0 means CPU
)
optimizer_config = OptimizerConfig()

model_config = CategoryEmbeddingModelConfig(
    task="classification",
    layers="1024-512-512",  # Number of nodes in each layer
    activation="LeakyReLU",  # Activation between layers
    learning_rate=1e-3,
)

tabular_model = TabularModel(
    data_config=data_config,
    model_config=model_config,
    optimizer_config=optimizer_config,
    trainer_config=trainer_config,
)
tabular_model.fit(train=train, validation=val)
result = tabular_model.evaluate(test)
pred_df = tabular_model.predict(test)
tabular_model.save_model("examples/basic")
loaded_model = TabularModel.load_from_checkpoint("examples/basic")
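
The snippet above assumes that `train`, `val`, and `test` already exist as pandas DataFrames and that `num_col_names` and `cat_col_names` are lists of column names. As a minimal sketch of one way to produce a reproducible split, the following uses only the standard library to shuffle row positions; `split_indices` and its default fractions are hypothetical choices made for this illustration, not part of PyTorch Tabular.

```python
import random


def split_indices(n_rows, val_frac=0.1, test_frac=0.2, seed=42):
    """Shuffle row positions and split them into train/val/test lists."""
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to < 1")
    indices = list(range(n_rows))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_frac)
    n_val = round(n_rows * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

With a pandas DataFrame `df` (an assumed name), the three frames could then be obtained with `df.iloc[train_idx]`, `df.iloc[val_idx]`, and `df.iloc[test_idx]`.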

Blogs

Future Roadmap (Contributions are Welcome)

  1. Add GaussRank as Feature Transformation
  2. Add ability to use custom activations in CategoryEmbeddingModel
  3. Add differential dropouts (layer-wise) in CategoryEmbeddingModel
  4. Add Fourier Encoding for cyclic time variables
  5. Integrate Optuna Hyperparameter Tuning
  6. Add Text and Image Modalities for mixed modal problems
  7. Add Variable Importance
  8. Integrate SHAP for interpretability
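
For readers unfamiliar with item 4, cyclic time variables such as hour of day are often encoded with a sine/cosine pair so that the end of a cycle sits next to its start. The sketch below shows this first-harmonic encoding under that assumption; `cyclic_encode` is a hypothetical helper and does not reflect how PyTorch Tabular implements or plans to implement the feature.

```python
import math


def cyclic_encode(value, period):
    """Map a cyclic value (e.g. hour 0-23 with period 24) to (sin, cos)."""
    angle = 2 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


def distance(a, b):
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# Hour 23 ends up close to hour 0, while hour 12 is on the opposite side.
midnight = cyclic_encode(0, 24)
print(distance(cyclic_encode(23, 24), midnight))
print(distance(cyclic_encode(12, 24), midnight))
```

Higher harmonics (multiples of the base angle) can be added as extra feature pairs if a single sine/cosine pair is too coarse.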

DL Models

  1. DNF-Net: A Neural Architecture for Tabular Data
  2. Attention augmented differentiable forest for tabular data
  3. XBNet : An Extremely Boosted Neural Network

Citation

If you use PyTorch Tabular for a scientific publication, we would appreciate citations to the published software and the following paper:

@misc{joseph2021pytorch,
      title={PyTorch Tabular: A Framework for Deep Learning with Tabular Data}, 
      author={Manu Joseph},
      year={2021},
      eprint={2104.13638},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
  • Zenodo Software Citation
@article{manujosephv_2021, 
    title={manujosephv/pytorch_tabular: v0.7.0-alpha}, 
    DOI={10.5281/zenodo.5359010}, 
    abstractNote={<p>Added a few more SOTA models - TabTransformer, FTTransformer
        Made improvements in the model save and load capability
        Made installation less restrictive by unfreezing some dependencies.</p>}, 
    publisher={Zenodo}, 
    author={manujosephv}, 
    year={2021}, 
    month={May}
}

Contributors

manujosephv, wsad1, fonnesbeck, jxtrbtk, actis92, yinyunie, dependabot[bot]
