ibmdevelopermea / quick-and-easy-predictiveml

In this tutorial, we will use IBM SPSS Modeler in Watson Studio to build a machine learning model that predicts whether a bank customer will default on a loan. IBM Cloud Pak® for Data is an interactive, cloud-based environment in which developers and data scientists work collaboratively, gain insight from data, and build machine learning models.

Home Page: https://developer.ibm.com/components/cloud-pak-for-data/tutorials/build-an-ai-model-visually-with-spss-modeler-flow

spss-modeler watson-studio data-science machine-learning predictive-modeling

Build a Predictive Machine Learning Model Quickly and Easily

Workshop Resources

Sign-up/Login to IBM Cloud

There are 3 steps to create your account on IBM Cloud:

  1. Enter your email address and password.

  2. You will receive a verification link at the registered email address. Click it to verify your account.

  3. Fill in the personal information fields.

Workshop

Learning objectives

After completing this tutorial, you will know how to:

  • Upload data to Watson Studio.
  • Create an SPSS® Modeler flow.
  • Use the SPSS tool to inspect data and glean insights.
  • Modify and prepare data for AI model creation using SPSS.
  • Train a machine learning model with SPSS and evaluate the results.

Prerequisites

Estimated time

Completing this tutorial should take about 30 minutes.

Steps

  1. Create a project and upload the data
  2. Create an SPSS Modeler flow
  3. Import the data
  4. Inspect the data
  5. Data preparation
  6. Train the ML model
  7. Evaluate the results

Step 1. Create a project and upload the data

If you have not already created a project for this learning path, follow the instructions below to create one. Otherwise, you can skip to Step 2, Create an SPSS Modeler flow.

Create an IBM Cloud Pak for Data project

In Cloud Pak for Data, a project collects and organizes the resources used to achieve a particular goal, such as building a solution to a problem. Project resources can include data, collaborators, and analytic assets such as notebooks and models.

  • Go to the (☰) navigation menu and, under the Projects section, click All Projects.

  • Click on the New project button on the top right.

  • Select Create an empty project.

  • Provide a name and optional description for the project and click Create.

Download the dataset for this experiment and load it into your project.

  • Download the german_credit_data.csv dataset.

  • Upload the dataset to the analytics project by clicking on Browse and selecting the downloaded file.
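
Before uploading, it can help to confirm that the download is a well-formed CSV. The following is a minimal sketch and not part of the original workshop: it reads a CSV from any file-like object with Python's standard csv module and reports the column names and the row count. The sample rows are invented for illustration, and the column names other than Risk are assumptions.

```python
import csv
import io

def summarize_csv(handle):
    """Return (column_names, row_count) for a CSV read from a file-like object."""
    reader = csv.DictReader(handle)
    row_count = sum(1 for _ in reader)
    return list(reader.fieldnames or []), row_count

# Hypothetical stand-in for german_credit_data.csv. The real file has more
# columns and rows; only the Risk column name is taken from the tutorial.
SAMPLE = io.StringIO(
    "LoanDuration,LoanAmount,Risk\n"
    "12,2500,No Risk\n"
    "36,9000,Risk\n"
)

columns, rows = summarize_csv(SAMPLE)
print(columns, rows)
```

To check the real download, pass `open("german_credit_data.csv", newline="")` instead of `SAMPLE`.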

Step 2. Create an SPSS Modeler flow

  1. From the Project home page, click Add to Project + and choose Modeler flow.

  2. Give the flow a meaningful name, such as Credit Risk Flow, then click Create.

Step 3. Import the data

  1. In the left-hand pane, expand Import, then drag and drop a Data Asset node on the canvas. Double-click on the node that was dropped on the canvas and click Change data asset.

  2. On the Assets page, open the Data Assets tab, choose the german_credit_data.csv file you previously uploaded and click Select.

  3. When the data is imported, click Save.

Step 4. Inspect the data

  1. To gain insight into your data, open the Outputs tab and drag and drop the Data Audit node onto the canvas. Hover over the Data Asset node that you placed on the canvas earlier; a blue circular icon should appear on its side. Click the icon and drag it over to the Data Audit node to connect the two nodes.

  2. Hover over the Data Audit node and click the three vertical dots to open the node's menu, or right-click the node. Then click Run.

  3. Once it is ready, the output can be viewed by opening the Outputs menu on the right. Click on the "eye" icon to open the Data Audit (Data Audit of [21 fields]) to view statistics about the data.

  4. Click X in the upper right corner to close the window.
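
As a rough illustration of the kind of per-field statistics a data audit reports, the sketch below computes valid and missing counts for every field, plus minimum, maximum, and mean for numeric fields and the number of distinct values for the rest. It is an approximation written for this tutorial, not the Data Audit node's actual implementation; the sample rows and the column names other than Risk are assumptions.

```python
import csv
import io
import statistics

def audit(rows):
    """Summarize each field: valid count, missing count, and basic statistics."""
    report = {}
    for field in rows[0].keys():
        values = [row[field] for row in rows]
        present = [v for v in values if v not in ("", None)]
        summary = {"valid": len(present), "missing": len(values) - len(present)}
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            # Non-numeric field: report the number of distinct categories.
            summary["unique"] = len(set(present))
        else:
            if numbers:
                summary.update(min=min(numbers), max=max(numbers),
                               mean=statistics.mean(numbers))
        report[field] = summary
    return report

# Hypothetical sample; the second row has a missing LoanAmount.
SAMPLE = (
    "LoanDuration,LoanAmount,Risk\n"
    "12,2500,No Risk\n"
    "36,,Risk\n"
    "24,4000,No Risk\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
report = audit(rows)
for field, summary in report.items():
    print(field, summary)
```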

Step 5. Data preparation

  1. Expand the Field Operations tab and drag and drop the Type node onto the canvas. Connect the Data Asset node with the Type node, then double-click on the Type node to make the necessary configurations.

  2. Click on Read Values. Once the read operation completes, check that the measure and role for each field are correct. Change the role of Risk from Input to Target, then click Save to close the tab.
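
Conceptually, this step attaches a measurement level and a role to every field. The sketch below mimics that idea under simplifying assumptions: fields whose values all parse as numbers are treated as continuous, two-valued fields as flags, and everything else as nominal. The function names, the inference rule, and the sample rows are hypothetical and are not taken from SPSS Modeler.

```python
def infer_measure(values):
    """Guess a measurement level from raw string values."""
    try:
        [float(v) for v in values]
        return "continuous"
    except ValueError:
        # Exactly two distinct categories form a flag; otherwise nominal.
        return "flag" if len(set(values)) == 2 else "nominal"

def build_types(rows, target):
    """Map each field to a measure and a role, marking one field as the target."""
    types = {}
    for field in rows[0]:
        values = [row[field] for row in rows]
        role = "target" if field == target else "input"
        types[field] = {"measure": infer_measure(values), "role": role}
    return types

# Hypothetical rows; only the Risk field name comes from the tutorial.
rows = [
    {"LoanDuration": "12", "LoanPurpose": "car", "Risk": "No Risk"},
    {"LoanDuration": "36", "LoanPurpose": "furniture", "Risk": "Risk"},
    {"LoanDuration": "24", "LoanPurpose": "education", "Risk": "No Risk"},
]

types = build_types(rows, target="Risk")
for field, info in types.items():
    print(field, info)
```

Marking exactly one field as the target is what tells a downstream modeling step which column to predict.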

Step 6. Train the ML model

  1. Expand the Modeling tab, then drag and drop the Random Forest node onto the canvas. Connect the Type node to the Random Forest node. The Random Forest node will automatically be renamed Risk.

  2. Right-click on the Random Forest node and click Run. When the execution is done, you will see a new golden nugget-like Risk node added to the canvas.

  3. Right-click on the new Risk golden nugget node and choose Preview to inspect the output results.
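
To give a feel for what training a random forest involves, here is a toy sketch of bootstrap aggregation (bagging) with random feature subsets. It is deliberately simplified: each "tree" is a one-split decision stump, whereas a real random forest grows deeper trees, and nothing here reflects how the SPSS Random Forest node is implemented. The feature layout (loan duration, loan amount) and the data are made up.

```python
import random
from collections import Counter

def majority(labels, default):
    """Most common label, or a fallback when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default

def fit_stump(X, y, features):
    """Find the single-feature threshold split with the fewest training errors."""
    fallback = majority(y, None)
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= threshold]
            right = [lab for row, lab in zip(X, y) if row[f] > threshold]
            l_pred, r_pred = majority(left, fallback), majority(right, fallback)
            errors = (sum(lab != l_pred for lab in left)
                      + sum(lab != r_pred for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, l_pred, r_pred)
    return best[1:]

def fit_forest(X, y, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each limited to a random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    subset_size = max(1, int(n_features ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]            # bootstrap sample
        feats = rng.sample(range(n_features), subset_size)  # random feature subset
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [l_pred if row[f] <= t else r_pred for f, t, l_pred, r_pred in forest]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical training data: [loan duration in months, loan amount].
X = [[6, 1000], [12, 1500], [12, 2000], [18, 2500],
     [36, 8000], [48, 9000], [42, 7000], [60, 12000]]
y = ["No Risk"] * 4 + ["Risk"] * 4

forest = fit_forest(X, y)
print(predict(forest, [6, 1000]), predict(forest, [60, 12000]))
```

Averaging many weak models trained on resampled data is what makes the ensemble more stable than any single split.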

Step 7. Evaluate the results

  1. Expand the Outputs tab, then drag and drop the Analysis node onto the canvas. Connect the Risk golden nugget node to the Analysis node. Right-click on the Analysis node and click Run.

  2. From the Outputs tab on the right, click on the "eye" icon next to analysis of [Risk] to gain insight into the accuracy of the results.

  3. Click on Return to flow to go back.

  4. Expand the Graphs tab, then drag and drop the Evaluation node onto the canvas. Connect the Risk golden nugget node with the Evaluation node. The Evaluation node will automatically be renamed $R-Risk. Right-click on the node and click Run.

  5. Double-click on the "eye" icon next to the $R-Risk output (evaluation of [$R-Risk]: Gains) to display the Gains chart. The chart plots the Predicted Positive Rate (the support of the classifier) against the True Positive Rate (the sensitivity of the classifier).
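
The two quantities inspected in this step, overall accuracy and the gains curve, can be approximated with a few lines of code. The sketch below assumes each record has an actual label, a predicted label, and a score for the positive class; all values are invented, and the function names are hypothetical rather than anything exposed by SPSS Modeler.

```python
def accuracy(actual, predicted):
    """Fraction of records whose predicted label matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def gains_points(actual, scores, positive):
    """Cumulative gains: (predicted positive rate, true positive rate) pairs.

    Records are ranked by descending score. After each record, the share of
    records selected so far is paired with the share of positives captured.
    Assumes at least one record carries the positive label.
    """
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(label == positive for label in actual)
    points, hits = [(0.0, 0.0)], 0
    for k, (_, label) in enumerate(ranked, start=1):
        hits += label == positive
        points.append((k / len(ranked), hits / total_pos))
    return points

# Hypothetical labels, predictions, and scores for the "Risk" class.
actual = ["Risk", "No Risk", "Risk", "No Risk"]
predicted = ["Risk", "No Risk", "No Risk", "No Risk"]
scores = [0.9, 0.2, 0.4, 0.6]

acc = accuracy(actual, predicted)
curve = gains_points(actual, scores, positive="Risk")
print(acc)
print(curve)
```

A curve that rises steeply toward a true positive rate of 1.0 means the model ranks most risky customers near the top of the list.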

Summary

This tutorial walked through a small example of building a predictive machine learning model with IBM SPSS Modeler on IBM Cloud Pak for Data. It covered importing the data into the project and the Modeler flow, preparing the data for modeling, choosing an algorithm, and training a prediction model. The last step showed how to visualize and evaluate the results of the trained model.

Authors

Scott Dangelo

Fawaz Siddiqi

Anam Mahmood
