
machinelearningutils's Introduction

MachineLearningUtils


Set of useful tools for machine learning projects

Teaser examples:

Plots

UsefulPlots.DataPlots.colored_scatter_matrix:

(show me the code...) UsefulPlots.colored_scatter_matrix

UsefulPlots.EvaluationPlots.confusion:

(Show me the code...) UsefulPlots.EvaluationPlots.confusion

Take a look here for more useful code examples


ModelUtils

Utils for easier scikit-learn classifier handling

classes:

LinearModelUtils

Utils for easier scikit-learn linear model handling

classes:

UsefulPlots

My toolbox of useful plot classes.

classes:

CommonFeatureEngineering

Drop a list of columns, fillna using smarter functions, map column values in one line, and more goodies.

classes:

DatasetsTools

classes:


What's inside the modules...

module ModelUtils

class ModelUtils

Utils for easier scikit-learn classifier handling.

Useful methods for classification:

  • split_and_train
  • test_model
  • but you might want to use:
    • split_data_to_train_test
    • train_model
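The split, train, and test steps named above can be sketched with plain scikit-learn calls. This is only an illustration of the workflow, not the ModelUtils API: the method signatures of ModelUtils are not shown here, and the dataset, the classifier, and the 30% test size are assumptions chosen for the demo.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small demo dataset (features X, class labels y).
X, y = load_iris(return_X_y=True)

# Step 1: split the data into train and test parts (assumed 70/30 split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Step 2: train a classifier on the training part.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Step 3: test the model on the held-out part.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.3f}")
```

A helper such as split_and_train would presumably wrap steps 1 and 2 in a single call, while split_data_to_train_test and train_model expose them separately.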

module UsefulPlots

class DataPlots

Plots for data exploration:

  • colored_scatter - Plot scatter of x vs y with color of third element
  • colored_scatter_matrix - A matrix of colored_scatter

class EvaluationPlots

Plots which help to evaluate models

  • predicted_vs_actual - This method creates a scatter plot of the predicted values vs the actual values.
  • confusion - Plots a confusion matrix. Normalization can be applied.
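To make the normalization option concrete, here is a minimal sketch in plain Python of what row normalization of a confusion matrix means. The function names and the cat/dog data are hypothetical and are not taken from EvaluationPlots; the sketch assumes that normalization divides each row (actual class) by its total.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Count (actual, predicted) pairs; rows are actual classes, columns predicted."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def normalize_rows(matrix):
    """Divide each row by its sum so every row holds per-class rates."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


# Hypothetical labels, used only for illustration.
actual = ["cat", "cat", "cat", "dog", "dog", "dog", "dog"]
predicted = ["cat", "cat", "dog", "dog", "dog", "dog", "cat"]
classes = ["cat", "dog"]

cm = confusion_matrix(actual, predicted, classes)
cm_normalized = normalize_rows(cm)
print(cm)             # raw counts
print(cm_normalized)  # row-normalized rates
```

With normalization, classes of different sizes become comparable because every row sums to 1.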

class VisPlotPlayGround

playground for visualization (color map and more...)

  • show_colormap - Show the gradient of cmap
  • grayify_cmap - Return a grayscale version of the colormap
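As a rough illustration of what a grayscale version of a colormap involves, the sketch below converts RGB colours to gray using perceived-luminance weights. The weights and the helper names are assumptions for this example; the actual grayify_cmap implementation is not shown in this README and may differ.

```python
import math

# Assumed perceived-luminance weights for the R, G and B channels.
RGB_WEIGHTS = (0.299, 0.587, 0.114)


def to_gray(rgb):
    """Map one (r, g, b) colour with channels in [0, 1] to a gray (l, l, l)."""
    luminance = math.sqrt(sum(w * c ** 2 for w, c in zip(RGB_WEIGHTS, rgb)))
    return (luminance, luminance, luminance)


def grayify(colors):
    """Convert a list of RGB colours (for example, samples of a colormap)."""
    return [to_gray(c) for c in colors]


# Hypothetical sample: black, pure red, white.
sample = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
gray_colors = grayify(sample)
print(gray_colors)
```

Viewing a colormap this way helps to check whether it still conveys order when printed in black and white.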

module DatasetsTools

mainly for unittests and demos

class DatasetsTools

Easier scikit-learn dataset exploration, mainly for unit tests and demos

  • data_as_df

machinelearningutils's People

Contributors

sagivba


machinelearningutils's Issues

better documentation

Add examples of small plots to the main readme file.
Add more examples to the example directory, with a readme file containing some short examples.

add CommonDataManipulation module

methods:
add_columns_one_to_many(self, df_master, df_details, function, master_columns_lst, details_columns_lst)
    for row in df_master[master_columns_lst]:
        take the rows of df_details[details_columns_lst] that join with row
        add columns to master_columns_lst using function(row, ...)

    returns df_master with the new columns

add to ModelUtils

from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score

add plot method confusion_matrix_plot

def plot_confusion_matrix(self,
                          cm, classes,
                          normalize=True,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):

This method should print and plot the confusion matrix.
Normalization can be applied by setting normalize=True.

add linear model utils module

class LinearModelUtils
    def get_formula
    def rmse
    def rmsle
    def mae
    def model_info_record
        return ModelInfoRecord

class ModelInfo(list)

class ModelInfoRecord
    def __init__
      • self
      • model_name
      • col_list
      • actual_lbl
      • formula
      • prediction_column
      • clf

class LinearModelInfoRecord(ModelInfoRecord)
    def __init__
      • self
      • rmse
      • rmsle
      • mae
      • formula

class ModelInfoReport
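The three error metrics in the outline above have standard definitions, sketched here in plain Python. These are free functions written for illustration; how LinearModelUtils would take its inputs (for example, from a fitted model and a DataFrame) is not specified in the issue, so the two-list signature is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmsle(actual, predicted):
    """Root mean squared logarithmic error; values must be greater than -1."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2
               for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical values, used only for illustration.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

print(rmse(actual, predicted))
print(rmsle(actual, predicted))
print(mae(actual, predicted))
```

RMSE penalizes large errors more than MAE does, while RMSLE measures relative rather than absolute differences, which suits targets that span several orders of magnitude.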

add unit test for plots

Don't test the visualization itself; test different datasets, including ones containing None and np.nan.
Test the returned objects and their attributes.
Check that the utils do not throw exceptions.

add to ColumnManipulation methods

  • split_to_columns(column, split_func, splited_columns_name): returns a tuple of columns

  • drop_columns(df, columns_list): returns df without the columns in columns_list

  • apply_by_dict(df, **kwargs)
        for key, value in kwargs.items():
            if key not in df.columns:
                warnings.warn(f"{key} is not in the columns list")
                continue
            if callable(value):
                value = [value]  # a single function becomes a one-item list
            if not isinstance(value, (list, tuple)):
                warnings.warn(f"{key}: expected a function or a list of functions")
                continue
            for f in value:
                df[key] = df[key].apply(f)
        return df

add quick exploration class

def __init__(self, df, target_name)
def plot_relationships()
    """
    Smart plot of the relationships between the target and the other columns
    """

easy feature engineering class

linear regression:
better_dummies:
splitting values by ranges or function
for example:
(
    column: hour,
    {
        rush_hours: [7, 8, 9],
        back_hour: [15, 16, 17],
        others: [1:6, 10:14, 18:24]
    }
)

will split the "hour" column in our df into 3 columns: rush_hours, back_hour, others
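One possible shape for this proposal, sketched with pandas. The function signature is hypothetical (the issue only gives the example above), and the sketch assumes that a range such as 1:6 is inclusive on both ends and that each new column is a 0/1 indicator.

```python
import pandas as pd


def better_dummies(df, column, groups):
    """Return a copy of df where `column` is replaced by one 0/1 column per group."""
    result = df.drop(columns=[column])
    for name, values in groups.items():
        # 1 where the original value belongs to this group, else 0.
        result[name] = df[column].isin(values).astype(int)
    return result


# Groups from the example; ranges are assumed to be inclusive.
groups = {
    "rush_hours": [7, 8, 9],
    "back_hour": [15, 16, 17],
    "others": [*range(1, 7), *range(10, 15), *range(18, 25)],
}

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"hour": [8, 16, 3, 22], "rides": [120, 95, 10, 40]})
encoded = better_dummies(df, "hour", groups)
print(encoded)
```

Compared with plain one-hot encoding of the hour, this yields 3 columns instead of up to 24, which keeps a linear model small when only the coarse time of day matters.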

fix boxplot out of range bug

from matplotlib import cm
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
from sklearn import datasets

from MachineLearningUtils.UsefulPlots import VisPlotPlayGround


def main():
    # Build a DataFrame from the iris data, with the target as the last column.
    iris = datasets.load_iris()
    _df = pd.DataFrame(data=np.c_[iris['data'], iris['target']],
                       columns=iris['feature_names'] + ['target'])
    _df.info()  # info() prints the summary itself

    plotter = VisPlotPlayGround(df=_df, ggplot=True, cmap=cm.jet)
    # Show every colormap, skipping the reversed ("_r") variants.
    cmaps = sorted(m for m in cm.datad if not m.endswith("_r"))
    print(cmaps)
    for c in cmaps:
        plotter.show_colormap(c)
    plt.show()


if __name__ == '__main__':
    main()
