Awesome Data Science Models

The goal of these projects is to provide tutorials (and answers) for the most popular models we use for training.

They all use data-describe, our open source accelerator for exploratory data analysis (EDA).

| Library Item | Title | Description | Audience |
|---|---|---|---|
| 101 | Baseball Pitch Predictor | Kubeflow pipeline for modeling pitch type | Citizen Data Scientists, Advanced Data Scientists |
| 102 | Beatles Like Predictor | Predict who likes the Beatles using only GCP AutoML Tables | Citizen Data Scientists |
| 201 | Lending Club | Bad-loan prediction: predict whether a loan applicant is high risk using XGBoost and hyperparameter tuning | Data Scientists, Data Engineers |
| 301 | Census Income | A logistic regression model used to demonstrate Google Cloud Platform's AI Platform | Data Scientists, Data Engineers |
| 202 | Taxi Cab Prediction | A wide and deep neural net implemented in TensorFlow to predict trip duration for Chicago taxi rides | Data Scientists, Data Engineers |
| 203 | Black Friday | Recommend product categories with multi-class classification in AI Platform, using custom prediction routines and custom hyperparameter tuning | Data Scientists, Data Engineers |
| 302 | Cellular Imaging | Recursion Cellular Image Classification on AI Platform with TF2, TPU, and advanced engineering | Data Scientists, Data Engineers, HCLS |
| 401 | NASA IOT Signal Processing | Process streaming data with signal windowing and live prediction using AI Platform and GCP | MLOps, Advanced Data Scientists |
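Item 401 mentions signal windowing. As a rough illustration of that idea only (this is not the code from that project, and the function name `sliding_windows` and its parameters are hypothetical), a fixed-size sliding window over a sequence of samples could be sketched like this:

```python
def sliding_windows(signal, size, step):
    """Split a sequence into fixed-size windows, advancing `step` samples each time.

    Trailing samples that do not fill a complete window are dropped.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    last_start = len(signal) - size
    return [list(signal[i:i + size]) for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    samples = list(range(10))
    # Windows of 4 samples with 50% overlap (step of 2).
    for window in sliding_windows(samples, size=4, step=2):
        print(window)
```

A step smaller than the window size gives overlapping windows. Whether overlap is appropriate depends on the model that consumes the windows.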

Install Instructions

These examples are meant to run on GCP AI Platform. They may well run elsewhere, but we have not tested them in other environments.

Create an instance for AI Platform notebooks:

  1. Choose TensorFlow Enterprise 2.1 (No GPUs)
    • Make sure you have 4 CPUs and at least 15 GB of memory
    • Click Open JupyterLab
    • Use the Launcher (right-hand side of the screen) to open a Terminal...
  2. Install data-describe and the other dependencies:
    • pip install data-describe[all]
    • pip install xgboost==0.82
    • pip install pandas_gbq
    • pip install google-cloud-bigquery
    • pip install google-cloud-storage
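After running the pip commands above, it can help to confirm what actually got installed. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_versions` is hypothetical, and the distribution names are taken from the pip commands in the list above.

```python
from importlib import metadata

# Distribution names taken from the pip commands in the install steps.
REQUIRED = [
    "data-describe",
    "xgboost",
    "pandas-gbq",
    "google-cloud-bigquery",
    "google-cloud-storage",
]


def installed_versions(packages=REQUIRED):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Run it from the same terminal (or a notebook cell) that you used for the installs, so that it inspects the same Python environment.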

Clone the examples:

git clone https://github.com/data-describe/awesome-data-science-models.git

Sources of Data

Beatles Like Predictor

The original dataset is available from ListenBrainz and is provided through BigQuery here.

Black Friday

The original dataset is on Kaggle, and can be found here.

Census Income

The original dataset is on the UCI Machine Learning Repository, and can be found here.

Chicago Taxi

The original dataset is a BigQuery public dataset, and can be found here. More information on BigQuery public datasets can be found here.
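As a hedged sketch of how a sample of this data might be pulled into a DataFrame with `pandas_gbq` (installed in the steps above): the table name `bigquery-public-data.chicago_taxi_trips.taxi_trips`, the selected columns, and the helper names are assumptions for illustration, not taken from this repository. Check them against the dataset page before relying on them. You also need a GCP project ID with BigQuery access.

```python
def build_taxi_query(limit=1000):
    """Build a SQL query for a small sample of trips (table and columns are assumed)."""
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT trip_start_timestamp, trip_miles, trip_seconds "
        "FROM `bigquery-public-data.chicago_taxi_trips.taxi_trips` "
        "WHERE trip_seconds > 0 AND trip_miles > 0 "
        f"LIMIT {limit}"
    )


def load_taxi_sample(project_id, limit=1000):
    """Run the query and return a pandas DataFrame. Requires GCP credentials."""
    import pandas_gbq  # imported lazily so building the query needs no extra packages

    return pandas_gbq.read_gbq(build_taxi_query(limit), project_id=project_id)


if __name__ == "__main__":
    # Printing the query needs no credentials; load_taxi_sample() does.
    print(build_taxi_query(limit=5))
```

Keeping the query builder separate from the network call makes the SQL easy to inspect before any billing is incurred.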

Lending Club

The original dataset is made public by LendingClub, and can be found at their website or here. The dataset used for this demo is a subset of the original dataset.

Cellular Image

The original data is part of TensorFlow Datasets. We have copied this data into a public bucket for the demo here.

NASA IOT Data

J. Lee, H. Qiu, G. Yu, J. Lin, and Rexnord Technical Services (2007). IMS, University of Cincinnati. "Bearing Data Set", NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA

Contributors

bipinkapri-git, bobbyjacob, brianray, dandawg, dvdjlaw, pdavidsosanofi, ryanrusson, sachinsaxena021988, sheth108, soshel, stevenpais1, sukanyasasmal, truongc2
