KaggleHelper

A self-written library of helpers that collects best practices for competing on Kaggle.com.
Written for my own use.

print_bold(string)

Function to display a string as Markdown in Jupyter Notebook output.
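A minimal sketch of how such a helper might work, assuming it wraps the string in Markdown bold markers. The names `to_bold_markdown` and `print_bold_sketch` are hypothetical and are not part of the library.

```python
def to_bold_markdown(string):
    """Wrap the text in Markdown bold markers."""
    return f"**{string}**"


def print_bold_sketch(string):
    """Render bold text in a notebook, or print the raw Markdown elsewhere."""
    bold = to_bold_markdown(string)
    try:
        # IPython is only importable where Jupyter/IPython is installed.
        from IPython.display import Markdown, display
    except ImportError:
        print(bold)
    else:
        display(Markdown(bold))
    return bold
```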

submit_result(df,id,target,path,name,score = 0,oof = None)

Function to form a submission for Kaggle.
Creates two folders inside the path folder: oof, with predictions on the train set, and submition, with predictions on the test set.

  • df - pandas DataFrame with id and target
  • id - id field in df
  • target - target feature in df
  • path - path where the submission is saved on the local machine
  • name - name of the submission file
  • score - CV score, if available
  • oof - pandas DataFrame with predictions on the train set, if available
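The behaviour described above could be sketched roughly as follows. The folder names come from the description. The file naming scheme (name plus score), the CSV format and the returned paths are assumptions, and `submit_result_sketch` is a hypothetical name.

```python
from pathlib import Path

import pandas as pd


def submit_result_sketch(df, id, target, path, name, score=0, oof=None):
    """Save test predictions (and optional train predictions) as CSV files."""
    root = Path(path)
    # Assumption: the CV score is embedded in the file name.
    file_name = f"{name}_{score:.5f}.csv"

    submit_dir = root / "submition"
    oof_dir = root / "oof"
    submit_dir.mkdir(parents=True, exist_ok=True)
    oof_dir.mkdir(parents=True, exist_ok=True)

    # Keep only the id and target columns in the submission file.
    submit_path = submit_dir / file_name
    df[[id, target]].to_csv(submit_path, index=False)

    oof_path = None
    if oof is not None:
        oof_path = oof_dir / file_name
        oof.to_csv(oof_path, index=False)
    return submit_path, oof_path
```

Keeping the out-of-fold file next to the submission under the same name makes it easier to match both files later, for example when blending several models.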

smoothed_aggregate(df, null_field, agg_field, alpha = 10)

Smoothed aggregation, used to reduce overfitting.

  • df - pd.DataFrame() with null_field and agg_field

  • null_field - field that is used in groupby

  • agg_field - field that is aggregated

  • alpha - smoothing coefficient

  • return - pd.Series() with the aggregated values
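One common way to smooth a group aggregate is to pull each group mean toward the global mean, with alpha acting as a number of pseudo-observations. The sketch below assumes that formula, a mean aggregate and a per-row result. The library may do this differently, and `smoothed_aggregate_sketch` is a hypothetical name.

```python
import pandas as pd


def smoothed_aggregate_sketch(df, null_field, agg_field, alpha=10):
    """Return a per-row smoothed group mean of agg_field, grouped by null_field."""
    global_mean = df[agg_field].mean()
    grouped = df.groupby(null_field)[agg_field]
    group_count = grouped.transform("count")
    group_mean = grouped.transform("mean")
    # Small groups are pulled toward the global mean; large groups keep their own mean.
    return (group_mean * group_count + global_mean * alpha) / (group_count + alpha)
```

With alpha = 0 this reduces to the plain group mean, and as alpha grows every group moves closer to the global mean.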

def reduce_mem_usage(df, verbose=True, less_data = True)

Compresses a DataFrame to lower memory usage.
!!!WARNING!!! The default is less_data = True. By using this function
with the default, you accept that compressing float values
may lose precision in the decimal places.
If you don't want that, set less_data to False.
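A minimal sketch of this kind of compression, assuming integers are downcast losslessly and floats are cast to float32 only when less_data is True. The exact dtype rules of the library are not documented here, and `reduce_mem_usage_sketch` is a hypothetical name.

```python
import numpy as np
import pandas as pd


def reduce_mem_usage_sketch(df, verbose=True, less_data=True):
    """Downcast numeric columns to smaller dtypes and return a compressed copy."""
    start_mb = df.memory_usage(deep=True).sum() / 1024 ** 2
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # Lossless: pandas picks the smallest integer type that fits the values.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]) and less_data:
            # float32 keeps roughly 7 significant digits, so precision can be lost.
            out[col] = out[col].astype(np.float32)
    end_mb = out.memory_usage(deep=True).sum() / 1024 ** 2
    if verbose:
        print(f"Memory usage: {start_mb:.2f} MB -> {end_mb:.2f} MB")
    return out
```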

def ensemble_predictions(predictions, weights=None, type_="linear")

Function to ensemble predictions.

  • predictions - array with predictions

  • weights - weight of each prediction in the array

  • type_ - type of blend

    • 'linear' - simple mean blend
    • 'harmonic' - ?
    • 'geometric' - ?
    • 'rank' - rank-based blend (vote method)
  • return - result of the blend (array)
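The four blend types might look like the sketch below. The README leaves 'harmonic' and 'geometric' undescribed, so reading them as weighted harmonic and geometric means is an assumption, as is the rank scaling. `ensemble_predictions_sketch` is a hypothetical name.

```python
import numpy as np


def ensemble_predictions_sketch(predictions, weights=None, type_="linear"):
    """Blend several prediction vectors into one array."""
    preds = np.asarray(predictions, dtype=float)  # shape: (n_models, n_samples)
    if weights is None:
        weights = np.ones(len(preds))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1

    if type_ == "linear":
        return np.average(preds, axis=0, weights=w)
    if type_ == "harmonic":
        # Assumed meaning: weighted harmonic mean (needs non-zero predictions).
        return 1.0 / np.average(1.0 / preds, axis=0, weights=w)
    if type_ == "geometric":
        # Assumed meaning: weighted geometric mean (needs positive predictions).
        return np.exp(np.average(np.log(preds), axis=0, weights=w))
    if type_ == "rank":
        # Rank each model's predictions, scale ranks to [0, 1], then average them.
        ranks = preds.argsort(axis=1).argsort(axis=1) / max(preds.shape[1] - 1, 1)
        return np.average(ranks, axis=0, weights=w)
    raise ValueError(f"Unknown type_: {type_!r}")
```

Rank blending discards the scale of each model's output, which can help when models are calibrated differently and the metric only depends on ordering, as with ROC AUC.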

def lgbm_calc(train,
              test,
              features,
              target,
              param,
              score_function = roc_auc_score,
              n_fold = 3, 
              seed = 11, 
              cat_features = []
              ):

Function for predicting with LightGBM using K-fold cross-validation.

  • train - train dataset
  • test - test dataset
  • features - features that are used for prediction
  • target - target feature
  • param - parameters for LightGBM (see the LightGBM docs)
  • score_function - score function from sklearn.metrics or your own function
  • n_fold - number of folds for KFold
  • seed - random seed
  • cat_features - categorical features, if the dataset has any (default [])

return:

  • oof_df - dataframe with predictions for the train part
  • submit - dataframe with predictions for the test part
  • fi - feature importance from training
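The K-fold pattern behind such a function can be sketched as follows. To stay dependency-light, this sketch uses scikit-learn's LogisticRegression as a stand-in for LightGBM, so it takes no param or cat_features and returns the CV score instead of feature importance. `kfold_calc_sketch` is a hypothetical name, not the library's implementation.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import KFold


def kfold_calc_sketch(train, test, features, target,
                      score_function=roc_auc_score, n_fold=3, seed=11):
    """Train one model per fold; return out-of-fold and averaged test predictions."""
    folds = KFold(n_splits=n_fold, shuffle=True, random_state=seed)
    oof = np.zeros(len(train))
    test_pred = np.zeros(len(test))
    for train_idx, valid_idx in folds.split(train):
        # Stand-in model: a gradient boosting model would be fitted here instead.
        model = LogisticRegression(max_iter=1000)
        model.fit(train.iloc[train_idx][features], train.iloc[train_idx][target])
        # Each train row is predicted by the one model that never saw it.
        oof[valid_idx] = model.predict_proba(train.iloc[valid_idx][features])[:, 1]
        # Test predictions are averaged across the fold models.
        test_pred += model.predict_proba(test[features])[:, 1] / n_fold
    score = score_function(train[target], oof)
    oof_df = pd.DataFrame({target: oof}, index=train.index)
    submit = pd.DataFrame({target: test_pred}, index=test.index)
    return oof_df, submit, score
```

Because every out-of-fold prediction comes from a model that did not train on that row, the score computed on oof_df is a fair estimate of test performance, and oof_df can later be used to tune blend weights.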

def catboost_calc(train,
                  test,
                  features,
                  target,
                  param,
                  score_function = roc_auc_score,
                  n_fold = 3,
                  seed = 11,
                  cat_features = []
                  ):

Function for predicting with CatBoost (Yandex) using K-fold cross-validation.

  • train - train dataset
  • test - test dataset
  • features - features that are used for prediction
  • target - target feature
  • param - parameters for CatBoost (see the CatBoost docs)
  • score_function - score function from sklearn.metrics or your own function
  • n_fold - number of folds for KFold
  • seed - random seed
  • cat_features - categorical features, if the dataset has any (default [])

return:

  • oof_df - dataframe with predictions for the train part
  • submit - dataframe with predictions for the test part
