# DRIP Statistical Learning

v2.77 2 May 2017

DRIP Statistical Learning is a collection of Java libraries for Machine Learning and Statistical Evaluation.

DRIP Statistical Learning is composed of the following main libraries:

  • Probabilistic Sequence Measure Concentration Bounds Library
  • Statistical Learning Theory Framework Library
  • Empirical Risk Minimization Library
  • VC and Capacity Measure Library
  • Covering Numbers Library
  • Alternate Statistical Learning Library
  • Problem Space and Algorithms Families Library
  • Parametric Classification Library
  • Non-parametric Classification Library
  • Clustering Library
  • Ensemble Learning Library
  • Multi-linear Sub-space Learning Library
  • Real-Valued Sequence Learning Library
  • Real-Valued Learning Library
  • Sequence Labeling Library
  • Bayesian Library
  • Linear Algebra Support Library

For Installation, Documentation and Samples, and the associated supporting Numerical Libraries, please check out [DRIP](https://github.com/lakshmiDRIP/DRIP).

## Features

### Probabilistic Bounds and Concentration of Measure Sequences

#### Probabilistic Bounds

  • Tail Probability Bounds Estimation
  • Basic Probability Inequalities
  • Cauchy-Schwarz Inequality
  • Association Inequalities
  • Moment, Gaussian, and Exponential Bounds
  • Bounding Sums of Independent Random Variables
  • Non Moment Based Bounding - Hoeffding Bound (see the sketch after this list)
  • Moment Based Bounds
  • Binomial Tails
  • Custom Bounds for Special i.i.d. Sequences
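
As an illustration of the Hoeffding bound listed above, here is a minimal sketch (the class and method names are hypothetical, not DRIP's API) that evaluates the two-sided tail bound P(|S_n/n - E[X]| >= t) <= 2 exp(-2 n t^2 / (b - a)^2) for the sample mean of n i.i.d. draws supported on [a, b]:

```java
// Illustrative sketch only -- hypothetical names, not DRIP's API.
public class HoeffdingBound {
    /**
     * Two-sided Hoeffding bound: P(|S_n/n - E[X]| >= t) <= 2 exp(-2 n t^2 / (b - a)^2)
     * for n i.i.d. draws of a random variable supported on [a, b].
     */
    public static double twoSidedTail(int n, double t, double a, double b) {
        double range = b - a;
        return Math.min(1.0, 2.0 * Math.exp(-2.0 * n * t * t / (range * range)));
    }

    public static void main(String[] args) {
        // Probability that the mean of 1000 draws from [0, 1] deviates from its
        // expectation by 0.05 or more.
        System.out.println(twoSidedTail(1000, 0.05, 0.0, 1.0)); // ~0.0135
    }
}
```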

#### Efron-Stein Bounds

  • Martingale Differences Sum Inequality
  • Efron-Stein Inequality
  • Bounded Differences Inequality
  • Bounded Differences Inequality - Applications
  • Self-Bounding Functions
  • Configuration Functions

#### Entropy Methods

  • Information Theory - Basics
  • Tensorization of the Entropy
  • Logarithmic Sobolev Inequalities
  • Logarithmic Sobolev Inequalities - Applications
  • Exponential Inequalities for Self-Bounding Functions
  • Combinatorial Entropy
  • Variations on the Theme of Self-Bounding Functions

#### Concentration of Measure

  • Equivalent Bounded Differences Inequality
  • Convex Distance Inequality
  • Convex Distance Inequality - Proof
  • Application of the Convex Distance Inequality - Bin Packing

### Statistical Learning Theory - Foundation and Framework

#### Standard SLT Framework

  • Computational Learning Theory
  • Probably Approximately Correct (PAC) Learning
  • PAC Definitions and Terminology
  • SLT Setup
  • Algorithms for Reducing Over-fitting
  • Bayesian Normalized Regularizer Setup

#### Generalization and Consistency

  • Types of Consistency
  • Bias-Variance or Estimation-Approximation Trade-off
  • Bias-Variance Decomposition
  • Bias-Variance Optimization
  • Generalization and Consistency for kNN

### Empirical Risk Minimization - Principles and Techniques

#### Empirical Risk Minimization

  • Overview
  • The Loss Functions and Empirical Risk Minimization Principles (see the sketch after this list)
  • Application of the Central Limit Theorem (CLT) and the Law of Large Numbers (LLN)
  • Inconsistency of Empirical Risk Minimizers
  • Uniform Convergence
  • ERM Complexity
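
To make the ERM principle concrete, the sketch below (hypothetical names, not DRIP's API) computes the empirical risk of a candidate classifier under 0-1 loss, the quantity an empirical risk minimizer drives down over the hypothesis class:

```java
import java.util.function.IntUnaryOperator;

// Illustrative sketch -- hypothetical names, not DRIP's API.
// Empirical risk under 0-1 loss: R_emp(h) = (1/n) * sum_i 1{h(x_i) != y_i}.
public class EmpiricalRisk {
    public static double zeroOneRisk(int[] x, int[] y, IntUnaryOperator h) {
        int mistakes = 0;
        for (int i = 0; i < x.length; ++i)
            if (h.applyAsInt(x[i]) != y[i]) ++mistakes;
        return (double) mistakes / x.length;
    }

    public static void main(String[] args) {
        int[] x = {-3, -1, 2, 4};
        int[] y = {-1, -1, 1, 1};
        // Threshold classifier h(x) = sign(x): an empirical risk minimizer here.
        System.out.println(zeroOneRisk(x, y, v -> v >= 0 ? 1 : -1)); // 0.0
    }
}
```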

#### Symmetrization

  • The Symmetrization Lemma

#### Generalization Bounds

  • The Union Bound
  • Shattering Coefficient
  • Empirical Risk Generalization Bound (see the sketch after this list)
  • Large Margin Bounds
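
The sketch below (hypothetical names, not DRIP's API) evaluates one common textbook form of the shattering-coefficient generalization bound, R(g) <= R_n(g) + 2 sqrt(2 (ln S(F, 2n) + ln(4/delta)) / n), with Sauer's lemma estimate ln S(F, m) <= d ln(e m / d) for a class of VC dimension d; the exact constants vary by derivation:

```java
// Illustrative sketch -- hypothetical names, not DRIP's API.
// One common form: with prob. >= 1 - delta, for all g in F,
//   R(g) <= R_n(g) + 2 * sqrt(2 * (ln S(F, 2n) + ln(4/delta)) / n),
// where Sauer's lemma gives ln S(F, m) <= d * ln(e * m / d) for m >= d.
public class VCGeneralizationBound {
    public static double confidenceTerm(int n, int vcDimension, double delta) {
        double logShatter = vcDimension * Math.log(Math.E * 2.0 * n / vcDimension);
        return 2.0 * Math.sqrt(2.0 * (logShatter + Math.log(4.0 / delta)) / n);
    }

    public static void main(String[] args) {
        // Gap between empirical and true risk for n = 10000 samples,
        // a hypothesis class of VC dimension 10, at 95% confidence.
        System.out.println(confidenceTerm(10000, 10, 0.05)); // ~0.27
    }
}
```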

#### Rademacher Complexity

  • Rademacher-based Uniform Convergence
  • VC Entropy
  • Chaining Technique

#### Local Rademacher Averages

  • Star-Hull and Sub-Root Functions
  • Local Rademacher Averages and Fixed Point
  • Local Rademacher Averages - Consequences

#### Normalized ERM

  • Computing the Normalized Empirical Risk Bounds
  • De-normalized Bounds

#### Noise Conditions

  • SLT Analysis Metrics
  • Types of Noise Conditions
  • Relative Loss Class

### VC Theory and Capacity Measure Analysis

#### VC Theory and VC Dimension

  • Empirical Processes
  • Bounding the Empirical Loss Function
  • VC Dimension - Setup
  • Incorporating the Formal VC Definition
  • VC Dimension Examples
  • VC Dimension vs. Popper's Dimension

#### Sauer Lemma and VC Classifier Framework

  • Working out Sauer Lemma Bounds (see the sketch after this list)
  • Sauer Lemma ERM Bounds
  • VC Index
  • VC Classifier Framework
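
As a worked instance of Sauer's lemma, the sketch below (hypothetical names, not DRIP's API) computes the growth-function bound sum_{i=0}^{d} C(n, i), which is polynomial in n once the VC dimension d is finite:

```java
// Illustrative sketch -- hypothetical names, not DRIP's API.
// Sauer's lemma: a class of VC dimension d realizes at most
// sum_{i=0}^{d} C(n, i) labelings of any n points, itself <= (e*n/d)^d for n >= d.
public class SauerLemma {
    public static double growthBound(int n, int d) {
        double sum = 0.0, binom = 1.0; // binom = C(n, i), starting at C(n, 0) = 1
        for (int i = 0; i <= Math.min(d, n); ++i) {
            sum += binom;
            binom = binom * (n - i) / (i + 1); // C(n, i+1) from C(n, i)
        }
        return sum;
    }

    public static void main(String[] args) {
        // With d = 3, the growth function is polynomial in n rather than 2^n.
        System.out.println(growthBound(10, 3)); // 1 + 10 + 45 + 120 = 176
        System.out.println(Math.pow(2, 10));    // 1024 -- full shattering
    }
}
```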

### Capacity/Complexity Estimation Using Covering Numbers

#### Covering and Entropy Numbers

  • Nomenclature - Normed Spaces
  • Covering, Entropy, and Dyadic Numbers
  • Background and Overview of Basic Results

#### Covering Numbers for Real-Valued Function Classes

  • Functions of Bounded Variation
  • Functions of Bounded Variation - Upper Bound
  • Functions of Bounded Variation - Lower Bound
  • General Function Classes
  • General Function Class Bounds
  • General Function Class Bounds - Lemmas
  • General Function Class - Upper Bounds
  • General Function Class - Lower Bounds

#### Operator Theory Methods for Entropy Numbers

  • Generalization Bounds via Uniform Convergence
  • Basic Uniform Convergence Bounds
  • Loss Function Induced Classes
  • Standard Form of Uniform Convergence

#### Kernel Machines

  • SVM Capacity Control
  • Nonlinear Kernels
  • Generalization Performance of Regularization Networks
  • Covering Number Determination Steps
  • Challenges Presenting Master Generalization Error

#### Entropy Numbers for Kernel Machines

  • Mercer Kernels
  • Equivalent Kernels
  • Mapping Phi into L2
  • Corrigenda to the Mercer Conditions
  • L2 Unit Ball -> Epsilon Mapping Scaling Operator
  • Unit Bounding Operator Entropy Numbers
  • The SVM Operator
  • Maurey's Theorem
  • Bounds for SV Classes
  • Asymptotic Rates of Decay for the Entropy Numbers

#### Discrete Spectra of Convolution Operators

  • Kernels with Compact/Non-compact Support
  • The Kernel Operator Eigenvalues
  • Choosing Nu
  • Extensions to d-dimensions

#### Covering Numbers for Given Decay Rates

  • Asymptotic/Non-asymptotic Decay of Covering Numbers
  • Polynomial Eigenvalue Decay
  • Summation and Integration of Non-decreasing Functions
  • Exponential Polynomial Decay

#### Kernels for High-Dimensional Data

  • Kernel Fourier Transforms
  • Degenerate Kernel Bounds
  • Covering Numbers for Degenerate Systems
  • Bounds for Kernels in R^d
  • Impact of the Fourier Transform Decay on the Entropy Numbers

#### Regularization Networks Entropy Numbers Determination - Practice

  • Custom Applications of the Kernel Machines Entropy Numbers
  • Extensions to the Operator-Theoretic Viewpoint for Covering Numbers

### Alternate Statistical Learning Approaches

#### Minimum Description Length Approach

  • Coding Approaches
  • MDL Analyses

#### Bayesian Methods

  • Bayesian and Frequentist Approaches
  • Bayesian Approaches

#### Knowledge Based Bounds

  • Places to Incorporate Bounds
  • Prior Knowledge into the Function Space

#### Approximation Error and Bayes' Consistency

  • Nested Function Spaces
  • Regularization
  • Achieving Zero Approximation Error
  • Rate of Convergence

#### No Free Lunch Theorem

  • Algorithmic Consistency
  • NFT Formal Statements

### Problem Space and Algorithm Families

#### Generative and Discriminative Models

  • Generative Models
  • Discriminant Models
  • Examples of Discriminant Approaches
  • Differences between Generative and Discriminant Models

#### Supervised Learning

  • Supervised Learning Practice Steps
  • Challenges with Supervised Learning Practice
  • Formulation

#### Unsupervised Learning

#### Machine Learning

  • Calibration vs. Learning

#### Pattern Recognition

  • Supervised vs. Unsupervised Pattern Recognition
  • Probabilistic Pattern Recognition
  • Formulation of Pattern Recognition
  • Pattern Recognition Practice SKU
  • Pattern Recognition Applications

### Parametric Classification Algorithms

#### Statistical Classification

#### Linear Discriminant Analysis

  • Setup and Formulation
  • Fisher's Linear Discriminant
  • Quadratic Discriminant Analysis

#### Logistic Regression

  • Formulation (see the sketch after this list)
  • Goodness of Fit
  • Mathematical Setup
  • Bayesian Logistic Regression
  • Logistic Regression Extensions
  • Model Suitability Tests with Cross Validation
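
A minimal sketch of the logistic regression formulation above (hypothetical names, not DRIP's API): a single-feature model fit by batch gradient descent on the negative log-likelihood:

```java
// Illustrative sketch -- hypothetical names, not DRIP's API.
// Binary logistic regression on one feature, fit by batch gradient descent.
public class LogisticRegressionSketch {
    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    // Returns {intercept, slope} for labels y in {0, 1}.
    public static double[] fit(double[] x, int[] y, double learningRate, int epochs) {
        double b0 = 0.0, b1 = 0.0;
        for (int epoch = 0; epoch < epochs; ++epoch) {
            double g0 = 0.0, g1 = 0.0;
            for (int i = 0; i < x.length; ++i) {
                double err = sigmoid(b0 + b1 * x[i]) - y[i]; // p_i - y_i
                g0 += err;
                g1 += err * x[i];
            }
            b0 -= learningRate * g0 / x.length; // gradient step on the NLL
            b1 -= learningRate * g1 / x.length;
        }
        return new double[] { b0, b1 };
    }

    public static void main(String[] args) {
        double[] x = {-2, -1, -0.5, 0.5, 1, 2};
        int[] y = {0, 0, 0, 1, 1, 1};
        double[] beta = fit(x, y, 0.5, 5000);
        System.out.printf("P(y=1 | x=1.5) = %.3f%n", sigmoid(beta[0] + beta[1] * 1.5));
    }
}
```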

#### Multinomial Logistic Regression

  • Setup and Formulation

### Non-Parametric Classification Algorithms

#### Decision Trees and Decision Lists

#### Variable Bandwidth Kernel Density Estimation

#### k Nearest Neighbors Algorithm

#### Perceptron

#### Support Vector Machines (SVM)

#### Gene Expression Programming (GEP)

### Clustering Algorithms

#### Cluster Analysis

  • Cluster Models
  • Connectivity Based Clustering
  • Centroid Based Clustering
  • Distribution Based Clustering
  • Density Based Clustering
  • Clustering Enhancements
  • Internal Cluster Evaluation
  • External Cluster Evaluation
  • Clustering Axiom

#### Mixture Model

  • Generic Mixture Model Details
  • Specific Mixture Models
  • Mixture Model Samples
  • Identifiability
  • Expectation Maximization
  • Alternatives to Expectation Maximization
  • Mixture Model Extensions

#### Deep Learning

  • Unsupervised Representation Learner
  • Deep Learning using ANN
  • Deep Learning Architectures
  • Challenges with the DNN Approach
  • Deep Belief Networks (DBN)
  • Convolutional Neural Networks (CNN)
  • Deep Learning Evaluation Data Sets
  • Neurological Basis of Deep Learning

#### Hierarchical Clustering

#### k-Means Clustering

  • Mathematical Formulation
  • The Standard Algorithm (see the sketch after this list)
  • k-Means Initialization Schemes
  • k-Means Complexity
  • k-Means Variations
  • k-Means Applications
  • Alternate k-Means Formulations
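
The standard (Lloyd) algorithm referenced above alternates an assignment step and a centroid-update step; a minimal one-dimensional sketch (hypothetical names, not DRIP's API) follows:

```java
import java.util.Arrays;

// Illustrative sketch -- hypothetical names, not DRIP's API.
// One-dimensional Lloyd iteration: assign points to the nearest centroid,
// then move each centroid to the mean of its assigned points.
public class KMeansSketch {
    public static double[] fit(double[] points, double[] initialCentroids, int iterations) {
        double[] centroids = initialCentroids.clone();
        int k = centroids.length;
        for (int iter = 0; iter < iterations; ++iter) {
            double[] sum = new double[k];
            int[] count = new int[k];
            for (double p : points) {            // assignment step
                int best = 0;
                for (int c = 1; c < k; ++c)
                    if (Math.abs(p - centroids[c]) < Math.abs(p - centroids[best])) best = c;
                sum[best] += p;
                count[best]++;
            }
            for (int c = 0; c < k; ++c)          // update step
                if (count[c] > 0) centroids[c] = sum[c] / count[c];
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[] points = {1.0, 1.2, 0.8, 5.0, 5.3, 4.7};
        System.out.println(Arrays.toString(fit(points, new double[] {0.0, 6.0}, 10)));
        // ~[1.0, 5.0]
    }
}
```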

#### Correlation Clustering

#### Kernel Principal Component Analysis (Kernel PCA)

### Ensemble Learning Algorithms

#### Ensemble Learning

  • Overview
  • Theoretical Underpinnings
  • Ensemble Aggregator Types (a majority-vote sketch follows this list)
  • Bayes' Optimal Classifier
  • Bagging and Boosting
  • Bayesian Model Averaging (BMA)
  • Bayesian Model Combination (BMC)
  • Bucket of Models (BOM)
  • Stacking
  • Ensemble Averaging vs. Basis Spline Representation
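
The simplest of the aggregator types above is an unweighted majority vote; here is a minimal sketch (hypothetical names, not DRIP's API) over base classifiers with labels in {-1, +1}:

```java
import java.util.List;
import java.util.function.IntUnaryOperator;

// Illustrative sketch -- hypothetical names, not DRIP's API.
// Unweighted majority vote over base classifiers with labels in {-1, +1}.
public class MajorityVote {
    public static int predict(List<IntUnaryOperator> classifiers, int x) {
        int tally = 0;
        for (IntUnaryOperator h : classifiers) tally += h.applyAsInt(x);
        return tally >= 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        List<IntUnaryOperator> ensemble = List.of(
            v -> v > 0 ? 1 : -1,   // threshold at 0
            v -> v > 2 ? 1 : -1,   // threshold at 2
            v -> v > -2 ? 1 : -1); // threshold at -2
        System.out.println(predict(ensemble, 1)); // 1: two of the three vote +1
    }
}
```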

#### ANN Ensemble Averaging

  • Techniques and Results

#### Boosting

  • Philosophy behind Boosting Algorithms
  • Popular Boosting Algorithms and Drawbacks

#### Bootstrap Averaging

  • Sample Generation
  • Bagging with 1NN - Theoretical Treatment

### Multi-linear Sub-space Learning Algorithms

#### Tensors and Multi-linear Sub-space Algorithms

  • Tensors
  • Multi-linear Sub-space Learning
  • Multi-linear PCA

### Real-Valued Sequence Learning Algorithms

#### Kalman Filtering

  • Continuous Time Kalman Filtering
  • Nonlinear Kalman Filtering
  • Kalman Smoothing
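
The variants above all build on the basic discrete predict/update recursion; here is a minimal scalar sketch (hypothetical names, not DRIP's API) for a random-walk state observed in noise:

```java
// Illustrative sketch -- hypothetical names, not DRIP's API.
// Scalar Kalman filter for a random-walk state x_t = x_{t-1} + w_t observed
// through z_t = x_t + v_t, with process noise variance q and measurement
// noise variance r.
public class ScalarKalman {
    private double x;  // state estimate
    private double p;  // estimate variance
    private final double q, r;

    public ScalarKalman(double x0, double p0, double q, double r) {
        this.x = x0; this.p = p0; this.q = q; this.r = r;
    }

    public double step(double z) {
        p += q;                 // predict: variance grows by the process noise
        double k = p / (p + r); // Kalman gain
        x += k * (z - x);       // update with the measurement residual
        p *= (1.0 - k);
        return x;
    }

    public static void main(String[] args) {
        ScalarKalman kf = new ScalarKalman(0.0, 1.0, 1e-4, 0.04);
        for (double z : new double[] {0.9, 1.1, 0.95, 1.05})
            System.out.printf("%.4f%n", kf.step(z)); // estimates converge toward ~1.0
    }
}
```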

#### Particle Filtering

### Real-Valued Learning Algorithms

#### Regression Analysis

  • Linear Regression (see the OLS sketch after this list)
  • Assumptions Underlying Basic Linear Regression
  • Multi-variate Regression Analysis
  • Multi-variate Predictor/Response Regression
  • OLS on Basis Spline Representation
  • OLS on Basis Spline Representation with Roughness Penalty
  • Linear Regression Estimator Extensions
  • Bayesian Approach to Regression Analysis
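
A minimal sketch of the linear regression entry above (hypothetical names, not DRIP's API): closed-form ordinary least squares for a single regressor, with slope = Cov(x, y) / Var(x):

```java
// Illustrative sketch -- hypothetical names, not DRIP's API.
// Ordinary least squares for a single regressor:
//   slope = Cov(x, y) / Var(x), intercept = mean(y) - slope * mean(x).
public class SimpleOLS {
    public static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; ++i) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; ++i) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
        }
        double slope = sxy / sxx;
        return new double[] { my - slope * mx, slope }; // {intercept, slope}
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2.1, 3.9, 6.2, 7.8};
        double[] beta = fit(x, y);
        System.out.printf("y = %.3f + %.3f x%n", beta[0], beta[1]);
    }
}
```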

#### Component Analysis

  • Independent Component Analysis (ICA) Specification
  • Independent Component Analysis (ICA) Formulation
  • Principal Component Analysis (see the power-iteration sketch after this list)
  • Principal Component Analysis - Constrained Formulation
  • 2D Principal Component Analysis - Constrained Formulation
  • 2D Principal Component Analysis - Lagrange Multiplier Based Constrained Formulation
  • nD Principal Component Analysis - Lagrange Multiplier Based Constrained Formulation
  • Information Theoretic Analysis of PCA
  • Empirical PCA Estimation From Data Set
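
A minimal sketch of principal component extraction (hypothetical names, not DRIP's API): power iteration recovers the leading eigenvector of a 2x2 covariance matrix:

```java
import java.util.Arrays;

// Illustrative sketch -- hypothetical names, not DRIP's API.
// Leading principal component of a 2x2 covariance matrix via power iteration.
public class LeadingPC {
    public static double[] powerIteration(double[][] cov, int iterations) {
        double[] v = {1.0, 0.0};
        for (int iter = 0; iter < iterations; ++iter) {
            double[] w = {                       // w = cov * v
                cov[0][0] * v[0] + cov[0][1] * v[1],
                cov[1][0] * v[0] + cov[1][1] * v[1]
            };
            double norm = Math.hypot(w[0], w[1]);
            v[0] = w[0] / norm;                  // renormalize each step
            v[1] = w[1] / norm;
        }
        return v;
    }

    public static void main(String[] args) {
        // Covariance with the strongest variance along the (1, 1) direction.
        double[][] cov = {{2.0, 1.5}, {1.5, 2.0}};
        System.out.println(Arrays.toString(powerIteration(cov, 50))); // ~[0.707, 0.707]
    }
}
```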

### Sequence Label Learning Algorithms

#### Hidden Markov Models

  • HMM State Transition/Emission Parameter Estimation
  • HMM Based Inference (see the forward-algorithm sketch after this list)
  • Non-Bayesian HMM Model Setup
  • Bayesian Extension to the HMM Model Setup
  • HMM in the Practical World
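
A minimal sketch of HMM-based inference (hypothetical names, not DRIP's API): the forward algorithm computes the likelihood of an observation sequence from the transition matrix A, the emission matrix B, and the initial distribution pi:

```java
// Illustrative sketch -- hypothetical names, not DRIP's API.
// Forward algorithm: P(observations) for an HMM with transition matrix a,
// emission matrix b, and initial state distribution pi.
public class HMMForward {
    public static double likelihood(double[] pi, double[][] a, double[][] b, int[] obs) {
        int states = pi.length;
        double[] alpha = new double[states];
        for (int s = 0; s < states; ++s)          // initialization
            alpha[s] = pi[s] * b[s][obs[0]];
        for (int t = 1; t < obs.length; ++t) {    // recursion
            double[] next = new double[states];
            for (int j = 0; j < states; ++j) {
                double sum = 0.0;
                for (int i = 0; i < states; ++i) sum += alpha[i] * a[i][j];
                next[j] = sum * b[j][obs[t]];
            }
            alpha = next;
        }
        double total = 0.0;                       // termination
        for (double v : alpha) total += v;
        return total;
    }

    public static void main(String[] args) {
        double[] pi = {0.6, 0.4};
        double[][] a = {{0.7, 0.3}, {0.4, 0.6}};  // state transitions
        double[][] b = {{0.9, 0.1}, {0.2, 0.8}};  // emission probabilities
        System.out.println(likelihood(pi, a, b, new int[] {0, 0, 1}));
    }
}
```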

#### Markov Chain Models

  • Markov Property
  • Markov Chains
  • Classification of the Markov Models
  • Markov Chain Monte Carlo (MCMC)
  • MCMC for Multi-dimensional Integrals

#### Markov Random and Conditional Fields

  • MRF/CRF Axiomatic Properties/Definitions
  • Clique Factorization
  • Inference in MRF/CRF

#### Maximum Entropy Markov Models

#### Probabilistic Grammar and Parsing

  • Parsing
  • Parser
  • Context-Free Grammar (CFG)

### Bayesian Analysis

#### Concepts, Formulation, Usage, and Application

  • Applicability
  • Analysis of Bayesian Systems
  • Bayesian Networks
  • Hypothesis Testing
  • Bayesian Updating
  • Maximum Entropy Techniques
  • Priors
  • Predictive Posteriors and Priors
  • Approximate Bayesian Computation
  • Measurement and Parametric Calibration
  • Bayesian Regression Analysis
  • Extensions to Bayesian Regression Analysis
  • Spline Proxying of Bayesian Systems

### Linear Algebra Support

#### Optimizer

  • Constrained Optimization using Lagrangian
  • Least Squares Optimizer
  • Multi-variate Distribution

#### Linear Systems Analysis and Transformation

  • Matrix Transforms
  • System of Linear Equations
  • Orthogonalization
  • Gaussian Elimination (sketched below)
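
A minimal sketch of Gaussian elimination with partial pivoting (hypothetical names, not DRIP's API), solving A x = b in place:

```java
import java.util.Arrays;

// Illustrative sketch -- hypothetical names, not DRIP's API.
// Gaussian elimination with partial pivoting, solving A x = b in place.
public class GaussianElimination {
    public static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; ++col) {
            int pivot = col;                      // partial pivoting: largest |a[row][col]|
            for (int row = col + 1; row < n; ++row)
                if (Math.abs(a[row][col]) > Math.abs(a[pivot][col])) pivot = row;
            double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            for (int row = col + 1; row < n; ++row) {   // eliminate below the pivot
                double factor = a[row][col] / a[col][col];
                for (int k = col; k < n; ++k) a[row][k] -= factor * a[col][k];
                b[row] -= factor * b[col];
            }
        }
        double[] x = new double[n];               // back substitution
        for (int row = n - 1; row >= 0; --row) {
            double sum = b[row];
            for (int k = row + 1; k < n; ++k) sum -= a[row][k] * x[k];
            x[row] = sum / a[row][row];
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] a = {{2, 1}, {1, 3}};
        double[] b = {3, 5};
        System.out.println(Arrays.toString(solve(a, b))); // [0.8, 1.4]
    }
}
```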

## Contact

[email protected]
