
custom_preprocessors

Custom preprocessors for scikit-learn.

Encoders

  • OneHotLabelEncoder Tutorial
    Automagically runs LabelEncoder and OneHotEncoder on categorical features in a pandas DataFrame. Robust to missing and unknown labels, and has advanced options for filtering out less common labels.

  • CountEncoder source
    Encoder for reducing cardinality of categorical variables by replacement with their count in the train set.

  • MultinomialNBEncoder source
    Encoder for reducing cardinality of text data by training a Multinomial Naive Bayes classifier against target labels.

  • TargetEncoder source
    Encoder for reducing cardinality of categorical variables by replacement with their aggregated category statistic (mean, median, minimum, maximum, standard deviation, sum, or quantile) in the train set. To handle insufficient sample sizes, the aggregated statistic can optionally be smoothed with the aggregated prior of the entire train set.
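
The ideas behind CountEncoder and TargetEncoder can be sketched with plain pandas. This is a minimal illustration, not this package's API: the function names `count_encode` and `smoothed_mean_encode`, the `weight` parameter, and the additive smoothing formula are all assumptions chosen for the example, and the package's own smoothing may differ.

```python
import pandas as pd


def count_encode(train: pd.Series, test: pd.Series) -> pd.Series:
    """Replace each category with its count in the train set (0 if unseen)."""
    counts = train.value_counts()
    return test.map(counts).fillna(0).astype(int)


def smoothed_mean_encode(train_cat, train_y, test_cat, weight=10.0):
    """Mean target encoding, shrunk toward the global train mean.

    Assumed smoothing scheme (hypothetical, for illustration only):
        (n * category_mean + weight * prior) / (n + weight)
    Small categories are pulled toward the prior; unseen ones get the prior.
    """
    prior = train_y.mean()
    stats = train_y.groupby(train_cat).agg(["mean", "count"])
    smoothed = (stats["count"] * stats["mean"] + weight * prior) / (
        stats["count"] + weight
    )
    return test_cat.map(smoothed).fillna(prior)


# Toy data: "c" never appears in the train set.
train = pd.Series(["a", "a", "b"])
y = pd.Series([1, 0, 1])
test = pd.Series(["a", "b", "c"])

counts_out = count_encode(train, test)  # values: [2, 1, 0]
enc = smoothed_mean_encode(train, y, test, weight=2.0)
```

Both functions learn their mapping from the train set only and then apply it to new data, which is what keeps the encoding free of leakage from the test set.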

Feature Selection

  • ColumnFilter source
    Transformer for automatically filtering columns of a pandas DataFrame, allowing friendly handling of data to pass into scikit-learn estimators that are incompatible with text and time data.

  • GreedyForwardSelection source
    Transformer for choosing features with forward selection, an iterative method that starts with no features in the model. Each iteration adds the feature that most improves the model, and the search stops once adding a new feature has failed to improve the ROC AUC score for a specified number of rounds.
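
The forward selection loop described above can be sketched as follows. This is a rough illustration under stated assumptions, not the GreedyForwardSelection implementation: the function name `greedy_forward_selection`, the `patience` and `cv` parameters, the use of cross-validated ROC AUC, and the logistic regression model are all choices made for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def greedy_forward_selection(X, y, estimator, patience=2, cv=3):
    """Add one feature per round; stop after `patience` rounds without gain."""
    remaining = list(range(X.shape[1]))
    selected, best_selected = [], []
    best_score, stale = -np.inf, 0
    while remaining and stale < patience:
        # Score every candidate feature when added to the current subset.
        scores = {
            j: cross_val_score(
                estimator, X[:, selected + [j]], y, scoring="roc_auc", cv=cv
            ).mean()
            for j in remaining
        }
        j_best = max(scores, key=scores.get)
        selected.append(j_best)
        remaining.remove(j_best)
        if scores[j_best] > best_score:
            # Improvement: remember this subset and reset the counter.
            best_score, best_selected, stale = scores[j_best], list(selected), 0
        else:
            stale += 1
    return best_selected, best_score


# Synthetic binary classification data, for demonstration only.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=2, n_redundant=0, random_state=0
)
best_features, best_auc = greedy_forward_selection(
    X, y, LogisticRegression(max_iter=1000)
)
```

The function returns the best subset seen rather than the last one tried, so features added during the non-improving rounds are discarded.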

custom_preprocessors's People

Contributors

alvinthai