

WARNING: This repository is no longer maintained ⚠️

This repository will not be updated. The repository will be kept available in read-only mode. For an alternative, please see https://github.com/IBM/predictive-model-on-watson-ml or the on-premises Cloud Pak for Data version of this pattern: https://github.com/IBM/telco-customer-churn-on-icp4d

Predict Customer Churn using Watson Studio and Jupyter Notebooks

In this Code Pattern, we use IBM Watson Studio to go through the whole data science pipeline to solve a business problem: predicting customer churn using a Telco customer churn dataset. Watson Studio is an interactive, collaborative, cloud-based environment where data scientists, developers, and others interested in data science can use tools (such as RStudio, Jupyter Notebooks, and Spark) to collaborate, share, and gather insight from their data, as well as build and deploy machine learning and deep learning models.

When the reader has completed this Code Pattern, they will understand how to:

  • Use Jupyter Notebooks to load, visualize, and analyze data
  • Run Notebooks in IBM Watson Studio
  • Load data from IBM Cloud Object Storage
  • Build, test and compare different machine learning models using Scikit-Learn
  • Deploy a selected machine learning model to production using Watson Studio
  • Create a front-end application that lets end users interact with and consume your deployed model

Flow

  1. Understand the business problem.
  2. Load the provided notebook into the Watson Studio platform.
  3. Load the Telco customer churn dataset into the Jupyter Notebook.
  4. Describe, analyze and visualize data in the notebook.
  5. Preprocess the data, build machine learning models and test them.
  6. Deploy a selected machine learning model into production.
  7. Interact with and consume your model through a front-end application.

Included components

  • IBM Watson Studio: Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
  • IBM Cloud Foundry: Deploy and run your applications without managing servers or clusters. Cloud Foundry automatically transforms source code into containers, scales them on demand, and manages user access and capacity.

Featured technologies

  • Jupyter Notebooks: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.
  • Pandas: An open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
  • Seaborn: A Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
  • Scikit-Learn: Machine Learning in Python. Simple and efficient tools for data mining and data analysis.
  • Watson Machine Learning Client: A library that allows working with Watson Machine Learning service on IBM Cloud. Train, test and deploy your models as APIs for application development, share with colleagues using this python library.
  • Node.js: A JavaScript runtime built on Chrome's V8 JavaScript engine, used for building full-stack JavaScript web applications.
  • ExpressJS: A minimal and flexible Node.js web application framework that provides a robust set of features for web and mobile applications.
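To make the pandas bullet above concrete, here is a tiny, self-contained sketch (the toy rows are an assumption for illustration; the "Churn" column with "Yes"/"No" labels mirrors the public Telco churn CSV):

```python
import pandas as pd

# Toy stand-in for the Telco dataset -- the real CSV has ~7,000 rows;
# the "Churn" column holds "Yes"/"No" labels, as in the public dataset.
df = pd.DataFrame({
    "customerID": ["0001", "0002", "0003", "0004"],
    "tenure": [1, 34, 2, 45],
    "Churn": ["Yes", "No", "Yes", "No"],
})

# Fraction of customers who churned.
churn_rate = (df["Churn"] == "Yes").mean()
print(f"Churn rate: {churn_rate:.0%}")  # prints "Churn rate: 50%"
```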

Watch the Video

Steps

  1. Sign up for Watson Studio

  2. Create a new Project

  3. Upload the dataset

  4. Import notebook to Watson Studio

  5. Import dataset into the notebook

  6. Create Watson Machine Learning Service instance

  7. Follow the steps in the notebook

  8. Either Deploy to IBM Cloud or Deploy locally

    8a. Deploy to IBM Cloud

    8b. Deploy locally

1. Sign up for Watson Studio

Sign up for IBM's Watson Studio.

Note: When creating your Object Storage service, select the Free storage type to avoid having to pay an upgrade fee.

2. Create a new Project

Note: By creating a project in Watson Studio a free tier Object Storage service will be created in your IBM Cloud account. Take note of your service names as you will need to select them in the following steps.

  • On Watson Studio's Welcome Page select New Project.

  • Choose the Data Science option and click Create Project.

  • Name your project, select the Cloud Object Storage service instance and click Create

3. Upload the dataset

  • From within the new project Overview panel, click Add to project on the top right, select Data asset, and upload the Telco customer churn CSV file.

4. Import notebook to Watson Studio

  • Create a New Notebook.

  • Import the notebook found in this repository inside the notebook folder by copying and pasting this URL in the relevant field https://raw.githubusercontent.com/IBM/customer-churn-prediction/master/notebooks/customer-churn-prediction.ipynb

  • Give a name to the notebook and select a Python 3.5 runtime environment, then click Create.

5. Import dataset into the notebook

To make the dataset available in the notebook, we need to refer to where it lives. Watson Studio automatically generates a connection to your Cloud Object Storage instance and gives access to your data.

  • Click in the cell below 2. Loading Our Dataset
  • Then go to the Files section to the right of the notebook and click Insert to code for the data you have uploaded. Choose Insert pandas DataFrame.
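The cell that Watson Studio generates is specific to your Cloud Object Storage credentials, but conceptually it just reads the uploaded CSV into a pandas DataFrame. A hedged, local stand-in (the in-memory CSV below replaces the object downloaded from Cloud Object Storage; `df_data_1` follows the variable naming the generated cell typically uses):

```python
import io

import pandas as pd

def load_churn_csv(source) -> pd.DataFrame:
    """Read the Telco churn CSV from a path or file-like object."""
    return pd.read_csv(source)

# Stand-in for the object streamed from Cloud Object Storage; the
# generated Watson Studio cell supplies a real file handle instead.
csv_bytes = b"customerID,tenure,Churn\n0001,1,Yes\n0002,34,No\n"
df_data_1 = load_churn_csv(io.BytesIO(csv_bytes))
print(df_data_1.head())
```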

6. Create Watson Machine Learning Service instance

  • From IBM Cloud Catalog, under the Watson category, select Machine Learning or use the Search bar to find Machine Learning.

  • Keep the settings as they are and click Create.

  • Once the service instance is created, navigate to the Service credentials tab on the left, view the credentials, and make a note of them.

Note: If you can't see any credentials available, you can create a New credential.

  • In the notebook available with this pattern, there is a cell with the WML credentials available after 14. ROC Curve and models comparisons. You will need to replace the code inside with your credentials.

  • Keep this tab open, or copy the credentials to a file to use later if you deploy the web app.
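For reference, the credentials cell expects a dict shaped like the service-credentials JSON; a sketch with placeholder values (the keys follow the classic, pre-Cloud Pak WML credentials format, which is an assumption about your service instance):

```python
# Placeholder values -- copy the real ones from the Service credentials
# tab. The keys follow the classic (pre-Cloud Pak) Watson Machine
# Learning credentials JSON; newer WML instances use an API-key format.
wml_credentials = {
    "url": "https://us-south.ml.cloud.ibm.com",
    "username": "***",
    "password": "***",
    "instance_id": "***",
}

# The notebook then builds the client roughly like this (left commented
# because it needs the watson-machine-learning-client package):
# from watson_machine_learning_client import WatsonMachineLearningAPIClient
# client = WatsonMachineLearningAPIClient(wml_credentials)
```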

7. Follow the steps in the notebook

The steps should allow you to understand the dataset, analyze and visualize it. You will then go through the preprocessing and feature engineering processes to make the data suitable for modeling. Finally, you will build some machine learning models and test them to compare their performances.
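The build-and-compare portion follows the standard scikit-learn workflow; the sketch below reproduces it on synthetic data (the notebook itself uses the real Telco features and additional model families):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed Telco features.
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit each candidate model and compare held-out accuracy.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```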

8. Either Deploy to IBM Cloud or Deploy locally

8a. Deploy to IBM Cloud

Click on the following button to clone the repo for this frontend app and create a toolchain to start deploying the app from there.

Deploy to IBM Cloud

  • Under IBM Cloud API Key: choose Create+, and then click on Deploy.

To monitor the deployment, in Toolchains click on Delivery Pipeline and view the logs while the apps are being deployed.

  • Once the app has deployed, Click on Runtime on the menu and navigate to the Environment variables tab.

  • Update the five environment variables WML_INSTANCE_NAME, USERNAME, PASSWORD, INSTANCE_ID, and URL with the values you saved at the end of Create Watson Machine Learning Service instance. Set MODEL_URL to the model's scoring endpoint (the scoring_endpoint value created in the notebook). The app will automatically restart and be ready for use.

8b. Deploy locally

For developing the UI locally and testing it:

  • cd frontend/

  • Create a .env file in frontend folder (frontend/.env) to hold your credentials.

cp env.example .env

For our purposes here, our .env file will look like the following:

WML_INSTANCE_NAME=**Enter your Watson Machine Learning service instance name**
USERNAME=**Enter your WML username found in credentials**
PASSWORD=**Enter your WML password found in credentials**
INSTANCE_ID=**Enter your WML instance_id found in credentials**
URL=**Enter your WML url found in credentials**
MODEL_URL=**Enter your model's scoring endpoint after deploying it in the notebook**

cd frontend/
npm install
npm start

You can view the application in any browser by navigating to http://localhost:3000. Feel free to test it out.
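Behind the UI, the app POSTs a scoring payload to the deployed model's MODEL_URL. A hedged sketch of assembling that payload in Python (the field names are illustrative, and the {"fields": ..., "values": ...} shape matches the classic WML online-scoring API this pattern was written against):

```python
def build_scoring_payload(fields, rows):
    """Build the JSON body for the classic WML online-scoring endpoint.

    The {"fields": [...], "values": [[...]]} shape follows the WML API
    this pattern targets; the field names here are illustrative only.
    """
    return {"fields": list(fields), "values": [list(r) for r in rows]}

payload = build_scoring_payload(
    ["tenure", "MonthlyCharges", "Contract"],
    [[12, 70.35, "Month-to-month"]],
)

# The app would then send it (auth-token handling omitted):
# import requests
# resp = requests.post(MODEL_URL, json=payload, headers=auth_headers)
print(payload)
```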

Sample output

  • Using Postman:

  • Using the UI app:

Learn more

  • Artificial Intelligence Code Patterns: Enjoyed this Code Pattern? Check out our other AI Code Patterns.
  • Data Analytics Code Patterns: Enjoyed this Code Pattern? Check out our other Data Analytics Code Patterns.
  • AI and Data Code Pattern Playlist: Bookmark our playlist with all of our Code Pattern videos
  • With Watson: Want to take your Watson app to the next level? Looking to utilize Watson Brand assets? Join the With Watson program to leverage exclusive brand, marketing, and tech resources to amplify and accelerate your Watson embedded commercial solution.
  • IBM Watson Studio: Master the art of data science with IBM's Watson Studio

License

This code pattern is licensed under the Apache Software License, Version 2. Separate third party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 (DCO) and the Apache Software License, Version 2.

Apache Software License (ASL) FAQ

customer-churn-prediction's People

Contributors: dependabot[bot], dolph, hebanas, imgbot[bot], ljbennett62, rhagarty, scottdangelo, stevemar, stevemart


customer-churn-prediction's Issues

Notebook: import pixiedust earlier for kernel restart

Upon importing pixiedust, the user is often asked to restart the kernel.
If this occurs after many cells have run, those cells must be re-run, since all data and variables are lost on kernel restart.
So, import pixiedust in one of the first cells to avoid this.

Notebook: remove output

We usually publish the notebook without the output, and with "execution_count": null,.
We also save a notebook with the output under an examples/ folder as well.

Using Docker deploy fails

The docker deploy fails.
Should these values be hard-coded in cli-config.yml?

# The IBM version of this configuration
version : "0.0.3"
ibm-generated-id : "9d3b30f6-8c9a-4494-b2ee-35360c900bdd"
ibm-cloud-app-id : "ab0f3aed-f1a9-4ced-a3a4-2413f3f86e3e"

The output from the failure is:

Scotts-MBP-2:frontend scottda$ ibmcloud dev deploy

The hostname for this application will be: frontend
? Press [Return] to accept this, or enter a new value now> sda-customChurn-2-28
FAILED                                                             
Failed to retrieve the application. 
Application with id ab0f3aed-f1a9-4ced-a3a4-2413f3f86e3e was not found in any region.


Scotts-MBP-2:frontend scottda$ grep -R ab0f3aed *
cli-config.yml:ibm-cloud-app-id : "ab0f3aed-f1a9-4ced-a3a4-2413f3f86e3e"

Notebook: Add object storage (or wget)

We need to provide code and Readme instructions for getting and loading the data set.
The best method is to download the data locally from the web link.
Then, use Watson Studio to load the data (see step 3 of this pattern, under the bullet "From within the new project Overview panel, click Add to project on the top right, selecting Data asset", for an example).

Then, we need to show how to import the data into the notebook. Here is an example using Spark data frames.

Deploy to IBM Cloud fails

The deploy to IBM cloud button fails:

Staging...
None of the buildpacks detected a compatible application
Exit status 222
Staging failed: STG: Exited with status 222
Stopping instance c5de5965-ba3c-40d8-a1c6-443f01c739a6
Destroying container
Successfully destroyed container

FAILED
Error restarting application: NoAppDetectedError
			
TIP: Buildpacks are detected when the "cf push" is executed from within the directory that contains the app source code.

Use 'cf buildpacks' to see a list of supported buildpacks.

Use 'cf logs sda-customer-churn-prediction-2-9-28 --recent' for more in depth log information.

Finished: FAILED
