monobinShiny 0.1.0

This is an add-on package to the monobin package that simplifies its use. The goal of monobin is to perform monotonic binning of numeric risk factors in the development of credit rating models (PD, LGD, EAD). All functions handle both binary and continuous target variables. Missing values and other possible special values are treated separately from so-called complete cases.

monobinShiny provides a shiny-based user interface (UI) to the monobin package. It can be especially handy for less experienced R users as well as for those who want to quickly scan numeric risk factors when building credit rating models. The additional functions implemented in monobinShiny that do not exist in the monobin package are descriptive statistics, special case imputation and outlier imputation. The descriptive statistics function is exported and can be used in R sessions independently of the user interface, while the special case and outlier imputation functions are written to be used with the shiny UI.

Installation

Users can install the released version of monobinShiny from CRAN by executing the following line of code:

    install.packages("monobinShiny")

Alternatively, the development version can be installed using:

    library(devtools)
    install_github("andrija-djurovic/monobinShiny")

How to start the monobinShiny application?

After installation, to start the monobinShiny application, just type:

    suppressMessages(library(monobinShiny))
    monobinShinyApp()

If the application is installed and started properly, the following should appear in the web browser:

plot

The application consists of three modules:

  1. data manager;
  2. descriptive statistics and imputation;
  3. monotonic binning.

The following sections provide a short description of each module.

ℹ️ Almost all reactive elements of the application result in a notification in the lower right corner.

DATA MANAGER MODULE

This module handles data import: either manual import by browsing for a file (only .csv files are accepted) or automatic import of the gcd data set from the monobin package (Import dummy data).

plot


During manual data import, a set of checks is performed, covering the file extension, the appropriateness of the .csv file and the number of identified numeric variables. If the data are imported successfully, the Data Import log output presents an overview of the data structure along with information about the identified numeric and categorical variables.
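The import checks described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's actual code: the helper name `check_import` and its messages are hypothetical, and the threshold of two numeric variables is taken from the list of data checks in this document.

```r
# Hypothetical helper (not part of monobinShiny): validate a file before import.
check_import <- function(path) {
  # 1. the file extension must be .csv
  if (tolower(tools::file_ext(path)) != "csv") {
    return(list(ok = FALSE, msg = "Only .csv files are accepted."))
  }
  # 2. the file must be readable as a .csv
  db <- tryCatch(read.csv(path, header = TRUE, check.names = FALSE),
                 error = function(e) NULL)
  if (is.null(db)) {
    return(list(ok = FALSE, msg = "File cannot be read as .csv."))
  }
  # 3. at least two numeric variables must be identified
  num.rf <- names(db)[sapply(db, is.numeric)]
  if (length(num.rf) < 2) {
    return(list(ok = FALSE, msg = "Fewer than two numeric variables identified."))
  }
  list(ok = TRUE, msg = "Import successful.",
       numeric = num.rf, categorical = setdiff(names(db), num.rf))
}

# Usage: write a small file to a temporary location and check it
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(age = c(25, 40), amount = c(100, 250), purpose = c("car", "tv")),
          tmp, row.names = FALSE)
res <- check_import(tmp)
res$numeric      # numeric variables, available to the other two modules
res$categorical  # identified, but not processed further
```

Returning a list with a status flag and a message mirrors the way the application reports each check through a notification rather than stopping with an error.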

⚠️ Be aware that only variables identified as numeric will be processed in the other two modules.

DESCRIPTIVE STATISTICS AND IMPUTATION

This module covers the standard steps of univariate analysis and part of bivariate analysis in model development, supplemented by simple options (mean or median) for imputation of special case values and by outlier imputation based on selected percentile thresholds.

Before running any of the imputation procedures, the target variable needs to be selected:

plot


After selecting the target variable, the imputation procedures are usually run. If imputation is not needed, the user can skip this step and move to the next module.

plot

⚠️ Be aware that the imputation procedures create a new risk factor and add it to the imported data set. The special case imputation names the new risk factor as selected risk factor + _sc_ + selected imputation method. Example: if the user selects the risk factor age and mean as the imputation method, the new risk factor will be added as age_sc_mean. The outlier imputation works the same way, naming the new risk factor as selected risk factor + _out_ + selected upper percentile + _ + selected lower percentile (e.g. age_out_0.99_0.01). Special attention should be paid when the data set contains many risk factors, because imputation can increase the final number of risk factors significantly.
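The naming rule can be illustrated with a small base R sketch. The helpers `impute_sc` and `impute_out` are hypothetical and are not the package's internal functions. In particular, replacing outliers with the percentile bounds themselves is an assumption made for this example.

```r
# Hypothetical helpers (not the package's internals) illustrating the naming rule.

# Replace special case values with the mean or median of the complete cases.
impute_sc <- function(db, rf, sc = c(NA, NaN, Inf, -Inf),
                      method = c("mean", "median")) {
  method <- match.arg(method)
  x <- db[[rf]]
  is.sc <- x %in% sc
  x[is.sc] <- if (method == "mean") mean(x[!is.sc]) else median(x[!is.sc])
  db[[paste0(rf, "_sc_", method)]] <- x   # e.g. age_sc_mean
  db
}

# Cap values outside the selected percentiles at the percentile bounds
# (assumed treatment; missing values are left untouched).
impute_out <- function(db, rf, upper = 0.99, lower = 0.01) {
  x <- db[[rf]]
  bounds <- quantile(x, probs = c(lower, upper), na.rm = TRUE)
  x <- pmin(pmax(x, bounds[[1]]), bounds[[2]])
  db[[paste0(rf, "_out_", upper, "_", lower)]] <- x   # e.g. age_out_0.99_0.01
  db
}

# Usage: each call adds one new column to the data set
db <- data.frame(age = c(21, 35, NA, 48, 62), qual = c(0, 1, 0, 0, 1))
db <- impute_sc(db, rf = "age", method = "mean")
db <- impute_out(db, rf = "age", upper = 0.99, lower = 0.01)
names(db)
```

Because every imputation run appends a column instead of overwriting the original one, running both procedures on many risk factors quickly multiplies the number of columns, which is the effect the warning above refers to.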

ℹ️ When imputation values cannot be calculated, download buttons appear that let the user download and check the risk factors for which the inputs are not properly defined (all special case values and/or special case values to be imputed). Both fields, all special case values and special case values to be imputed, should be defined as a list of numeric values (or values that can be coerced to numeric, including NA) separated by commas (,).
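As an illustration of what a properly defined input looks like, the following base R sketch parses a comma-separated field into numeric special case values. The helper `parse_sc` is hypothetical, and the application's own validation may differ in detail.

```r
# Hypothetical helper: parse a comma-separated input into numeric special case values.
parse_sc <- function(input) {
  parts <- trimws(strsplit(input, ",", fixed = TRUE)[[1]])
  vals <- suppressWarnings(as.numeric(parts))
  # "NA" and "NaN" are legitimate entries; any other NA is a failed coercion
  bad <- is.na(vals) & !is.nan(vals) & parts != "NA"
  list(ok = !any(bad), values = vals, invalid = parts[bad])
}

# Usage
a <- parse_sc("NA, NaN, Inf, -999")   # all entries can be coerced to numeric
a$ok
b <- parse_sc("NA, missing, 9999")    # "missing" cannot be coerced
b$ok
b$invalid
```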

The ultimate goal of this module is to create a report of descriptive statistics. The image below presents an example of the descriptive report. Details on the calculated metrics can be found in the help page of the function desc.stat (?desc.stat). As can be seen, the user can download the descriptive statistics report as well as the data set used for its creation (both as .csv files). If the imputation procedures were run, the data set will contain the added risk factors.

plot

MONOTONIC BINNING

The monotonic binning module reflects the main purpose of this package: providing an interface to the monobin package. As in the previous module, the user should first select the target variable, then the risk factors ready for binning (if imputation was performed, the list of available risk factors will contain the newly created ones) and finally define the arguments of the selected binning algorithm. The available binning algorithms are those implemented in the monobin package.

plot

Running the binning procedure results in a summary table of the processed risk factors and a transformed data set. Both outputs can be downloaded as .csv files.
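For readers who prefer the console, roughly the same two outputs can be obtained by calling monobin directly. The sketch below assumes that the monobin package is installed, that its gcd data set contains the columns age and qual, and that `iso.bin` returns a list whose first element is the summary table and whose second element is the binned risk factor. Please check ?iso.bin before relying on these details. The output file names are made up for the example.

```r
# Sketch only: requires the monobin package; skipped if it is not installed.
if (requireNamespace("monobin", quietly = TRUE)) {
  data(gcd, package = "monobin")
  # binning of the risk factor age against the binary target qual
  res <- monobin::iso.bin(x = gcd$age, y = gcd$qual)
  summary.tbl <- res[[1]]   # assumed: summary table of the final bins
  x.trans <- res[[2]]       # assumed: binned (transformed) risk factor
  # write both outputs to .csv files, similar to the download buttons
  write.csv(summary.tbl, file.path(tempdir(), "summary_tbl.csv"), row.names = FALSE)
  write.csv(data.frame(qual = gcd$qual, age_bin = x.trans),
            file.path(tempdir(), "trans_data.csv"), row.names = FALSE)
}
```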

plot

Data checks and notifications

As already stated, almost every reactive element of the application produces a notification (lower right corner). An example of the error notification shown when trying to import a file other than .csv is presented in the following image:

plot

Below is the list of implemented data checks:

  1. if the imported file has at least two numeric variables;
  2. if the target variable is selected when running the imputation and report procedures;
  3. if risk factors are selected when running the imputation and report procedures;
  4. if special case values are defined properly;
  5. if the percentile bounds (upper and lower) for outlier imputation are defined properly;
  6. if the numeric inputs for the binning algorithms are defined properly;
  7. if a binary target is specified as a 0/1 variable.
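Two of these checks (5 and 7) can be sketched in base R as shown below. The function names are illustrative only and are not part of the package. Ignoring missing values in the target and requiring the bounds to lie within [0, 1] are assumptions made for this example.

```r
# Hypothetical checks (names are illustrative, not the package's API).

# Check 7: a binary target must contain only 0 and 1
# (missing values are ignored here, which is an assumption).
is_binary_01 <- function(y) {
  all(unique(y[!is.na(y)]) %in% c(0, 1))
}

# Check 5: percentile bounds must be single numeric values within [0, 1],
# with the lower bound strictly below the upper bound.
valid_bounds <- function(lower, upper) {
  length(lower) == 1 && length(upper) == 1 &&
    is.numeric(lower) && is.numeric(upper) &&
    !is.na(lower) && !is.na(upper) &&
    lower >= 0 && upper <= 1 && lower < upper
}

# Usage
is_binary_01(c(0, 1, 1, 0))   # TRUE
is_binary_01(c(1, 2, 1))      # FALSE
valid_bounds(0.01, 0.99)      # TRUE
valid_bounds(0.99, 0.01)      # FALSE
```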
