
tailhq / dynaml

Stars: 200 · Watchers: 18 · Forks: 51 · Repository size: 220.45 MB

Scala Library/REPL for Machine Learning Research

Home Page: http://tailhq.github.io/DynaML/

License: Apache License 2.0

Languages: Scala 90.12%, Jupyter Notebook 8.38%, Java 1.08%, TeX 0.24%, Shell 0.17%, R 0.01%, Ruby 0.01%

Topics: machine-learning, scala, regression, repl, committee-models, classification, gaussian-processes, scala-library, machine-learning-api, machine-learning-algorithms

dynaml's People

Contributors

ahoy196, amitkumarj441, gitter-badger, mandar2812, scala-steward


dynaml's Issues

Add implementation of Wavelet Neural Networks

Reference: Wavelet Neural Networks - David Veitch

This task is broken into two parts.

  1. Implement the WaveletNetwork class based on the pseudocode outlined in sections 3.2.1.1 and 3.3.1
  2. Implement the WaveNet class based on sections

For both parts, use the Wavelet[I] API to represent Wavelon object instances for multivariate inputs, whose parameters can be calculated using the GradientDescent class. Implement the WaveletGradient and WaveletUpdater classes by extending Gradient and Updater.
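As a rough starting point, here is a sketch of what the wavelon gradient computation could look like, assuming a Mexican-hat mother wavelet, a product-form multivariate wavelon and a Spark-MLlib-style Gradient contract. The Gradient trait below is a stand-in, and the weight layout [translations, dilations, output weight] is purely illustrative; the real signatures live in the optimization package and may differ.

```scala
import breeze.linalg.DenseVector

// Stand-in for the Gradient contract; the actual trait in the optimization package may differ.
trait Gradient {
  def compute(data: DenseVector[Double], label: Double,
              weights: DenseVector[Double]): (DenseVector[Double], Double)
}

// Gradient of a single product-form wavelon with a Mexican-hat mother wavelet,
// under a squared-error loss. Hypothetical weight layout:
// [translations (n), dilations (n), output weight (1)].
class WaveletGradient extends Gradient {
  private def psi(u: Double): Double  = (1.0 - u * u) * math.exp(-u * u / 2.0)
  private def dPsi(u: Double): Double = (u * u * u - 3.0 * u) * math.exp(-u * u / 2.0)

  override def compute(data: DenseVector[Double], label: Double,
                       weights: DenseVector[Double]): (DenseVector[Double], Double) = {
    val n = data.length
    val x = data.toArray
    val w = weights.toArray
    val (t, d, out) = (w.slice(0, n), w.slice(n, 2 * n), w(2 * n))
    val u    = Array.tabulate(n)(i => (x(i) - t(i)) / d(i))
    val psis = u.map(psi)
    val act  = psis.product                 // multivariate wavelon activation
    val err  = out * act - label            // residual of the squared-error loss
    val grad = new Array[Double](w.length)
    for (i <- 0 until n) {
      val others  = psis.indices.filter(_ != i).map(j => psis(j)).product
      val dActDuI = others * dPsi(u(i))
      grad(i)     = err * out * dActDuI * (-1.0 / d(i))  // via d u_i / d t_i
      grad(n + i) = err * out * dActDuI * (-u(i) / d(i)) // via d u_i / d d_i
    }
    grad(2 * n) = err * act                 // gradient w.r.t. the output weight
    (DenseVector(grad), 0.5 * err * err)
  }
}
```

A WaveletUpdater would then apply these gradients in whatever form the Updater contract expects, e.g. a plain step w := w - eta * grad for batch gradient descent.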

Support for HMC

Hamiltonian Monte Carlo methods

  1. Vanilla HMC (a minimal sketch follows this list)
  2. Riemannian Manifold HMC
  3. Lagrangian HMC
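As a starting point for item 1, a minimal sketch of a single vanilla HMC transition (leapfrog integration followed by a Metropolis correction); logProb and gradLogProb stand in for whatever interface the target distribution ends up exposing.

```scala
import breeze.linalg.DenseVector
import scala.util.Random

object VanillaHMC {
  /** One HMC transition: sample a Gaussian momentum, simulate Hamiltonian
    * dynamics with a leapfrog integrator, then accept/reject a la Metropolis. */
  def step(q0: DenseVector[Double],
           logProb: DenseVector[Double] => Double,
           gradLogProb: DenseVector[Double] => DenseVector[Double],
           eps: Double, leapfrogSteps: Int,
           rng: Random = new Random()): DenseVector[Double] = {

    val q  = q0.copy
    val p0 = DenseVector.fill(q0.length)(rng.nextGaussian())  // momentum ~ N(0, I)
    val p  = p0.copy

    // Leapfrog integration of the Hamiltonian dynamics.
    p += gradLogProb(q) * (eps / 2.0)
    for (i <- 1 to leapfrogSteps) {
      q += p * eps
      if (i < leapfrogSteps) p += gradLogProb(q) * eps
    }
    p += gradLogProb(q) * (eps / 2.0)

    // Metropolis correction with H(q, p) = -log pi(q) + |p|^2 / 2.
    val h0 = -logProb(q0) + 0.5 * (p0 dot p0)
    val h1 = -logProb(q)  + 0.5 * (p dot p)
    if (math.log(rng.nextDouble()) < h0 - h1) q else q0
  }
}
```

The Riemannian-manifold and Lagrangian variants would replace the identity mass matrix and the plain leapfrog step with metric-aware counterparts.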

Getting more done in GitHub with ZenHub

Hola! @mandar2812 has created a ZenHub account for the mandar2812 organization. ZenHub is the only project management tool integrated natively in GitHub – created specifically for fast-moving, software-driven teams.


How do I use ZenHub?

To get set up with ZenHub, all you have to do is download the browser extension and log in with your GitHub account. Once you do, you’ll get access to ZenHub’s complete feature-set immediately.

What can ZenHub do?

ZenHub adds a series of enhancements directly inside the GitHub UI:

  • Real-time, customizable task boards for GitHub issues;
  • Multi-Repository burndown charts, estimates, and velocity tracking based on GitHub Milestones;
  • Personal to-do lists and task prioritization;
  • Time-saving shortcuts – like a quick repo switcher, a “Move issue” button, and much more.

Add ZenHub to GitHub

Still curious? See more ZenHub features or read user reviews. This issue was written by your friendly ZenHub bot, posted by request from @mandar2812.

ZenHub Board

Compilation error

Hi,

I am having trouble compiling DynaML from the latest available code in master (commit 9b76fa1).

I am on macOS High Sierra, compiling the code as-is (Scala version 2.11.8, SBT 0.13.8). When I run the compile command, among some warnings, I get two compile errors:

[error] /Users/iht/github/DynaML/dynaml-core/src/main/scala-2.11/io/github/mandar2812/dynaml/utils/package.scala:459: could not find implicit value for parameter ev: spire.algebra.Eq[Domain]
[error]         (ev.gcd(a._1, b._1), ev1.gcd(a._2, b._2))
[error]                ^
[error] /Users/iht/github/DynaML/dynaml-core/src/main/scala-2.11/io/github/mandar2812/dynaml/utils/package.scala:459: could not find implicit value for parameter ev: spire.algebra.Eq[Domain1]
[error]         (ev.gcd(a._1, b._1), ev1.gcd(a._2, b._2))
[error]                                     ^

SBT provides the following info when launched, just in case it is relevant:

platform: macosx-x86_64
Tensorflow-Scala Classifier: darwin-cpu-x86_64

Has anyone seen this compile error before? The project compiles fine on Travis, so I guess it must be something specific to my system. I wiped out the ivy2 and coursier caches and tried again, but the error persists.

Thanks in advance.

Add support for distributions and probability models

Distribution/Random Variable API

Must contain

  1. Top-level traits outlining a distribution with: a probability mass function, a cumulative distribution function, sampling, and inference of parameters from data.
  2. The ability to compose distributions to obtain higher-level distributions, and to compose dependent random variables (a minimal trait sketch follows this list).
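A minimal sketch of what the top-level traits could look like; the names (RandomVariable, ContinuousDistribution) are illustrative rather than the eventual API.

```scala
trait RandomVariable[T] { self =>
  def sample(): T

  /** Compose dependent random variables: draw from this variable, then from a
    * variable whose law depends on the drawn value. */
  def flatMap[U](f: T => RandomVariable[U]): RandomVariable[U] =
    new RandomVariable[U] { def sample(): U = f(self.sample()).sample() }

  def map[U](f: T => U): RandomVariable[U] =
    new RandomVariable[U] { def sample(): U = f(self.sample()) }
}

object RandomVariable {
  def apply[T](draw: => T): RandomVariable[T] =
    new RandomVariable[T] { def sample(): T = draw }
}

/** A distribution adds an explicit mass/density function and CDF on top of sampling;
  * inference of parameters from data would be layered on top of these traits. */
trait ContinuousDistribution[T] extends RandomVariable[T] {
  def pdf(x: T): Double
  def cdf(x: T): Double
}
```

Composition of dependent random variables then falls out of flatMap, e.g. a scale mixture: val x = sigma.flatMap(s => RandomVariable(rng.nextGaussian() * s)), with sigma a RandomVariable[Double] and rng a scala.util.Random.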

Bayesian Software for Scala

Like Stan for Python and R: HMC requires some tuning, so we could use NUTS (Stan's sampler), which is more computationally efficient than optimally tuned static HMC. #59

One Step Filter

@mandar2812 I have recently opened PR #37.
Please have a look into it; I am unable to find the particular folder to which I should push my code (#37).

Travis is unhappy!!

Remove println in the code

When running the GPRegression model, I find that it is particularly verbose in the console. At first I thought it was the log4j configuration being too verbose. However, grepping through the code, I can see lots of println calls.

I suggest replacing these println calls with logger calls at debug or trace level.
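A sketch of the suggested change, assuming log4j 1.x is already on the classpath (the mention of a log4j configuration suggests it is); the class and message below are placeholders, not the actual GPRegression code.

```scala
import org.apache.log4j.Logger

class SomeModel {  // placeholder for e.g. GPRegression
  private val logger = Logger.getLogger(this.getClass)

  def train(): Unit = {
    // Before: println("computing kernel matrix ...")
    logger.debug("computing kernel matrix ...")  // or logger.trace(...) for finer detail
  }
}
```

The verbosity can then be controlled entirely from the logging configuration instead of editing code.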

Write unit tests for common model implementations

  1. Model 'M' should be trained and tested on a pre-defined (possibly synthetic) data set, and the test results should be within some reasonable bounds (an illustrative sketch follows this list).
  2. In the optimization package, solvers should be tested on common synthetic data sets to check that they converge reasonably.
  3. In the data pipes package, write unit tests to verify the composability and behaviour of the data pipes API.
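An illustrative ScalaTest sketch of point 1; the test names and the closed-form fitLeastSquares stand-in are hypothetical, and a real test would exercise the corresponding DynaML model instead.

```scala
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class RegressionModelSpec extends AnyFlatSpec with Matchers {

  "A regression model" should "recover a noiseless linear relationship" in {
    val rng  = new scala.util.Random(42)
    val data = Seq.fill(200) {
      val x = rng.nextDouble() * 10.0
      (x, 3.0 * x + 2.0)                            // synthetic data: y = 3x + 2, no noise
    }
    val (slope, intercept) = fitLeastSquares(data)  // stand-in for the model under test
    slope should be (3.0 +- 1e-6)
    intercept should be (2.0 +- 1e-6)
  }

  // Closed-form simple linear regression, standing in for a DynaML model's training step.
  private def fitLeastSquares(data: Seq[(Double, Double)]): (Double, Double) = {
    val n   = data.size.toDouble
    val sx  = data.map(_._1).sum
    val sy  = data.map(_._2).sum
    val sxy = data.map { case (x, y) => x * y }.sum
    val sxx = data.map { case (x, _) => x * x }.sum
    val slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    (slope, (sy - slope * sx) / n)
  }
}
```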

Implementing Bootstrap Aggregation (Bagging)

Tool for text processing and ML using Apache Spark

A utility for performing Bagging, whose interface takes training data, a sampling proportion, and the number of desired bags, and then repeatedly samples with replacement from the training data.
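A minimal sketch of that interface (names are illustrative, not the final API):

```scala
import scala.util.Random

object Bagging {
  /** Produce `numBags` bootstrap samples, each of size proportion * training.size,
    * drawn with replacement from the training data. */
  def bags[T](training: IndexedSeq[T], proportion: Double, numBags: Int,
              rng: Random = new Random()): Seq[IndexedSeq[T]] = {
    require(proportion > 0.0 && proportion <= 1.0, "proportion must be in (0, 1]")
    val bagSize = math.max(1, (proportion * training.size).toInt)
    Seq.fill(numBags) {
      IndexedSeq.fill(bagSize)(training(rng.nextInt(training.size)))
    }
  }
}
```

Each bag would then be used to train one member of the committee/ensemble.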

Add support for loading native code: Rust, C++

This issue/task is quite large in terms of scope and implications. One can consider several capabilities as objectives which the native interface/connection will need to achieve.

Objectives

  1. Optimising kernel computations, more specifically of classes implementing LocalScalarKernel[I]. This should ideally be done at the level of the evaluate(x: I, y: I) function of the common kernel implementations like RBFKernel, FBMKernel, PolynomialKernel and others (a rough sketch of a native hook follows this list).

  2. Implementing models in native code and connecting them to the DynaML Model[T, Q, R] trait.
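For reference, a hypothetical sketch of objective 1 via the JNI route (assuming Scala 2.12-style bodyless @native declarations); the class, method and library names are placeholders.

```scala
object NativeKernelLib {
  // Idempotent; resolves librbf_native.so / .dylib (or rbf_native.dll) from java.library.path.
  def load(): Unit = System.loadLibrary("rbf_native")
}

// The hot evaluate(x, y) loop is delegated to a compiled Rust/C++ routine.
class NativeRBFKernel(bandwidth: Double) {
  NativeKernelLib.load()

  @native private def evaluateNative(x: Array[Double], y: Array[Double],
                                     bandwidth: Double): Double

  def evaluate(x: Array[Double], y: Array[Double]): Double =
    evaluateNative(x, y, bandwidth)
}
```

The JNA/FFI options would instead bind an interface to the shared library at runtime, avoiding hand-written JNI glue on the native side.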

Languages

In order of preference.

  1. Rust
  2. C++

Compatibility Layers

A number of Java compatibility interfaces can be considered, and with some deliberation one of them can be finalised.

  1. JNI: Java Native Interface
  2. JNA: Java Native Access, with a BridJ backend
  3. FFI: Java Abstracted Foreign Function Layer

Resources

  1. Java-Rust-FFI
  2. Java-Rust-Example

Test and improve Back-propagation convergence.

This pertains to the Backpropagation class used by the FeedForwardNetwork[D] to learn its network weights. A good reference is pages 10-12 of Andrew Ng's notes on autoencoders. What we require is iterative calculation of the derivative checking procedure outlined in those pages. The required changes must be made to the Backpropagation class.
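A sketch of that check: compare the analytic gradient against a centred finite-difference estimate, coordinate by coordinate. Here loss and analyticGradient stand in for the network's cost function and the gradient produced by Backpropagation.

```scala
import breeze.linalg.DenseVector

object GradientCheck {
  /** Maximum relative discrepancy between the analytic gradient and a centred
    * finite-difference estimate; should be tiny (e.g. < 1e-6) if back-propagation
    * computes derivatives correctly. */
  def maxRelativeError(loss: DenseVector[Double] => Double,
                       analyticGradient: DenseVector[Double] => DenseVector[Double],
                       weights: DenseVector[Double],
                       eps: Double = 1e-4): Double = {
    val analytic = analyticGradient(weights)
    val errors = (0 until weights.length).map { i =>
      val wPlus  = weights.copy; wPlus(i)  += eps
      val wMinus = weights.copy; wMinus(i) -= eps
      val numeric = (loss(wPlus) - loss(wMinus)) / (2.0 * eps)
      math.abs(numeric - analytic(i)) /
        math.max(1e-12, math.abs(numeric) + math.abs(analytic(i)))
    }
    errors.max
  }
}
```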

Need to write a test/validation function for the class GaussianLinearModel

Aim

Either change the abstract class API by adding a test method, or make a separate test class which takes a model and a test set and carries out validation on that test set. We should also ideally add an n-fold cross-validation function for a graph model, for when no separate test set is available.

Notes

The nodes in GaussianLinearModel are labeled using the gremlin setProperty() method; maybe we can create a new apply() function (overload it) in the GaussianLinearModel object to take a training set and a test set, create nodes for the test set data points, and add them to the graph object as well. We can have separate labels for each of the training and test points.
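For the second option, a rough sketch of a standalone validation helper in which the model is treated simply as a prediction function; the types and names are placeholders rather than the GaussianLinearModel API.

```scala
object ModelValidation {
  /** Root mean squared error of a prediction function on a held-out test set. */
  def rmse(predict: Seq[Double] => Double,
           testSet: Seq[(Seq[Double], Double)]): Double = {
    val sqErrors = testSet.map { case (features, target) =>
      val err = predict(features) - target
      err * err
    }
    math.sqrt(sqErrors.sum / testSet.size)
  }

  /** n-fold cross validation for when no separate test set is available: split the
    * data into roughly equal folds, train on all but one fold, score the held-out
    * fold, and average the scores. `train` stands in for the model's fit step. */
  def crossValidate(data: Seq[(Seq[Double], Double)], folds: Int,
                    train: Seq[(Seq[Double], Double)] => (Seq[Double] => Double)): Double = {
    val chunks = data.grouped(math.ceil(data.size.toDouble / folds).toInt).toSeq
    val scores = chunks.indices.map { i =>
      val heldOut  = chunks(i)
      val trainSet = chunks.patch(i, Nil, 1).flatten
      rmse(train(trainSet), heldOut)
    }
    scores.sum / scores.size
  }
}
```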

Create Wavelet API

Requirement

  1. Extensible to any mother wavelet
  2. Carries state, i.e. translation and dilation
  3. Can be converted to a discrete computation for Digital Signal Processing (a rough trait sketch follows this list)
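One possible shape for such an API against these three requirements; the names and the normalisation convention are illustrative.

```scala
trait Wavelet {
  def motherWavelet(u: Double): Double   // requirement 1: pluggable mother wavelet
  def translation: Double                // requirement 2: state
  def dilation: Double

  // psi_{a,b}(x) = psi((x - b) / a) / sqrt(a), with a = dilation, b = translation.
  def apply(x: Double): Double =
    motherWavelet((x - translation) / dilation) / math.sqrt(dilation)

  // Requirement 3: discrete computation for DSP, here by sampling on a uniform grid.
  def discretise(samples: Int, lower: Double, upper: Double): Array[Double] = {
    require(samples >= 2, "need at least two samples")
    val step = (upper - lower) / (samples - 1)
    Array.tabulate(samples)(i => apply(lower + i * step))
  }
}

// Example mother wavelet: the Haar wavelet, dilated by 2 and translated by 1.
case class HaarWavelet(translation: Double, dilation: Double) extends Wavelet {
  def motherWavelet(u: Double): Double =
    if (u >= 0 && u < 0.5) 1.0 else if (u >= 0.5 && u < 1.0) -1.0 else 0.0
}
// val w = HaarWavelet(translation = 1.0, dilation = 2.0); val grid = w.discretise(256, -4.0, 4.0)
```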
