datascience_book's Introduction

Programming Probabilistically: An introduction to the world of data science

By

Eric Schles

Hello and welcome to my book! You'll find the following sections:

  1. Descriptive Statistics and Hypothesis testing
  2. Applied Statistical Tests - A/B testing
  3. Regression Introduction
  4. Classification Introduction
  5. Information Theory, Entropy and Tree Models
  6. Neural Network Models
  7. Introduction to Time Series Analysis

Each section covers about four to five chapters' worth of material, broken out into:

  • Basics
  • Mathematical Intuition
  • Implementation
  • Typical API
  • Advanced Use Cases

In addition to the main chapters, I've added a number of engineering-focused chapters that are somewhat supplemental.

Sections to come:

  • Reinforcement Learning
  • Engineering for Data Science
  • Text Processing
  • Image Processing
  • Support Vector Machines
  • Genetic Algorithms
  • Neural Network Optimizers
  • Recommender Systems
  • A/B testing and other related workflows
  • SQL Best Practices
  • Timeseries Forecasting and Analysis
  • Geospatial Analysis
  • Geospatial and Timeseries forecasting
  • Video Processing
  • Building Data Dashboards
  • Working With Search
  • Building An OCR System
  • Advanced Python Usage
  • Active Learning
  • Recurrent Neural Networks
  • Convolutional Neural Networks
  • Capsule Networks
  • Adversarial Machine Learning
  • Open World - in-distribution vs. out-of-distribution
  • Bayesian Machine Learning
  • Graph Based Neural Networks
  • Monitoring
  • Working with Spark
  • Working with Streaming Data
  • Ensembling - scikit-learn ensembling strategies
  • Random Forests
  • Additive Models:
    • Gradient Boosted Trees
    • Splines
    • Generalized Additive Models
    • AdaBoost
  • Explainability Metrics
    • litany of examples
    • showing when and how they can fail
  • Metrics
  • Hyperparameter Tuning
  • Randomness in your models
  • Counterfactual examples
  • Testing in Machine Learning Applications

To Dos

  • fix Decision Tree Implementation
  • add SVM chapter
  • add dimensionality reduction chapter
  • add clustering chapter
  • add RNN chapter
  • add conv net chapter
  • discuss attention
  • create engineering productionization chapter
  • hypothesis test as a ticket within engineering scrum context
  • reproducibility of results

Contributors

ericschles, ajschumacher
