LendingClub

Example of processing Lending Club data with VoltDB and H2O. The goal is to show how machine learning models can be operationalized in VoltDB.

Entities:

The primary entities here are Loans and Funds.

Loan - Each new loan (generated by the client or ingested from Kafka) has more than 100 attributes, including (a) the loan amount, (b) the credit rating of the borrower, and (c) the percentage of the loan amount that has already been funded. Machine learning is used to predict the risk associated with each loan.

Fund - A fund is an amount of money available to invest in loans that fit a risk profile. Each fund records its total amount and how that amount should be divided among the different risk levels.

The goal of the application is to match each loan with the best fund to invest in it, based on the loan's calculated risk and the fund's tolerance for risk.
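
As a rough illustration of the matching idea, here is a minimal Python sketch. It is not the repository's code: the `Fund` class, the `allocation` and `invested` fields, the risk labels, and the rule that the "best" fund is the one with the most remaining budget at the loan's risk level are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Fund:
    name: str
    total: float        # total amount that belongs to the fund
    allocation: dict    # hypothetical: risk level -> fraction of total
    invested: dict = field(default_factory=dict)  # risk level -> amount already invested

    def available(self, risk: str) -> float:
        """Amount still available for loans at this risk level."""
        budget = self.total * self.allocation.get(risk, 0.0)
        return budget - self.invested.get(risk, 0.0)


def match_fund(funds, risk, amount):
    """Pick a fund for a loan, or return None if no fund can cover it.

    Assumption: "best" means the fund with the most remaining budget
    at the loan's risk level.
    """
    candidates = [f for f in funds if f.available(risk) >= amount]
    if not candidates:
        return None
    best = max(candidates, key=lambda f: f.available(risk))
    # Record the investment so later loans see the reduced budget.
    best.invested[risk] = best.invested.get(risk, 0.0) + amount
    return best
```

For example, with two funds of 1000 each, one split 50/50 between `"low"` and `"high"` and one split 90/10, a low-risk loan of 400 goes to the second fund because it has more low-risk budget remaining.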

Application:

The primary performance metric of this application is the number of new loans accepted per second. To provide maximum scalability for ingestion, the insert workload is partitioned on the unique id of the loan. Since the funds are not updated frequently and do not need a lot of memory, the Funds table is replicated across the database. Since the risk factor is what brings the loans and the funds together, the relationship table that maps the funds to the loans that they are invested in is partitioned by the risk factor.
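
The effect of choosing different partitioning columns can be sketched in a few lines of Python. This is only an illustration: VoltDB uses its own internal hashing, and the partition count, the CRC32 hash, and the function names below are assumptions made for the example.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key) -> int:
    """Map a partitioning key to a partition (stand-in for the database's hashing)."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_PARTITIONS


def route_new_loan(loan_id) -> int:
    """Loan inserts are routed by the loan's unique id, spreading them across partitions."""
    return partition_for(loan_id)


def route_allocation(risk) -> int:
    """Fund-to-loan mappings are routed by risk, so one risk level stays on one partition."""
    return partition_for(risk)
```

Because the two routes use different keys, a loan and its fund mapping will generally live on different partitions, which is why a single-partition transaction cannot cover both steps.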

Since the relevant tables are partitioned by different columns, the application divides the processing of each new loan into two transactions. One transaction accepts the new loan, calculates the risk, and inserts it into the NEW_LOAN table. The other transaction matches the loan with the appropriate fund based on the calculated risk. The application uses VoltDB's loopback exporter so that the first transaction triggers the second.

Workflow:

  1. Each new loan is passed to the NewLoan procedure, which calculates the risk and saves the loan to the NEW_LOANS table.
  2. Some of the new loan's data is also written to the LOAN_BY_RISK stream, which is a loopback export back into VoltDB.
  3. The AllocateLoan procedure is called for each new loan written to the LOAN_BY_RISK stream, and a decision is made on the loan.
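
The three steps above can be simulated in plain Python to show the hand-off between the two transactions. This is a sketch under stated assumptions, not the VoltDB procedures themselves: the dictionary and queue stand in for the table and the stream, `score_risk` is a placeholder rule rather than the H2O model, and the field names and the per-risk `budget` are hypothetical.

```python
from collections import deque

new_loans = {}          # stands in for the NEW_LOANS table
loan_by_risk = deque()  # stands in for the LOAN_BY_RISK stream
decisions = {}          # loan id -> "funded" or "rejected"


def score_risk(loan):
    """Placeholder for the ML model (hypothetical rule, not the real scoring)."""
    return "high" if loan["funded_pct"] < 50 else "low"


def new_loan(loan):
    """Steps 1 and 2: score the loan, save it, and emit a row to the stream."""
    risk = score_risk(loan)
    new_loans[loan["id"]] = {**loan, "risk": risk}
    loan_by_risk.append({"id": loan["id"], "amount": loan["amount"], "risk": risk})


def allocate_loan(row, budget):
    """Step 3: decide on one loan using the remaining budget for its risk level."""
    if budget.get(row["risk"], 0) >= row["amount"]:
        budget[row["risk"]] -= row["amount"]
        decisions[row["id"]] = "funded"
    else:
        decisions[row["id"]] = "rejected"


def drain(budget):
    """Play the role of the loopback export: call allocate_loan for each streamed row."""
    while loan_by_risk:
        allocate_loan(loan_by_risk.popleft(), budget)
```

Calling `new_loan(...)` for a few loans and then `drain(budget)` mirrors the flow: the first function never touches fund state, and the second only sees the subset of loan data that was written to the stream.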
