JuliaManifolds / ManifoldML.jl
License: Other
Continuing the discussion in #2 (and on Slack). Some thoughts on what the issues might be.
As I understand it, a point on an arbitrary `Manifold` object does not generally know which manifold it belongs to, correct? This is fine as far as working with these points inside your manifold-specific algorithms goes, but it is not ideal for integration with the rest of the ML ecosystem. The problem is roughly analogous to categorical variables. Internally these are usually represented as integers, but algorithms still need to know the total number of possible classes to avoid problems, such as certain classes disappearing on resampling. Passing this information around is not as easy as it first appears. Life is much easier (for a toolbox like MLJ) if we simply assume every point knows all the classes, and that is why we (and other packages) insist on the use of `CategoricalArrays` for representing such data (although ordinary arrays of some "categorical value" type would also have sufficed).
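To make the analogy concrete, here is a minimal plain-Julia sketch. It is not how CategoricalArrays is implemented; `CatValue` and `classes` are hypothetical names made up for illustration. A subsample of bare integer codes can silently lose a class, whereas values that carry their class pool cannot.

```julia
# Hypothetical stand-in for a "categorical value":
# an integer code together with the full pool of classes.
struct CatValue
    code::Int
    pool::Vector{String}
end

# Every value can report all classes, whether or not they occur in the sample at hand.
classes(v::CatValue) = v.pool

pool = ["low", "medium", "high"]
codes = [1, 1, 2, 3, 1]

# Bare integer codes: a subsample only reveals the classes it happens to contain.
subsample = codes[1:3]
seen = sort(unique(subsample))      # code 3 ("high") does not appear here

# Self-describing values: the same subsample still knows about all three classes.
vals = [CatValue(c, pool) for c in codes]
known = classes(first(vals[1:3]))
```

The same reasoning applies to manifold points: if each point carries its manifold, no separate metadata has to be threaded through resampling or other data manipulations.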
In the future, we might have algorithms which deal with mixed data types, one or more of which is a manifold type (think of geophysical applications), and having to keep track of metadata for a subset of variables gets messy.
So my tentative suggestion would be that MLJ users would present input data for a supervised learning algorithm from the ManifoldML package as an abstract vector of "manifold points", where a "manifold point" is a point which combines the manifold to which the point belongs with some internal representation. This could be as simple as a tuple `(M, p)`, for example. We define a new scientific type `ManifoldPoint{M}`, where `M` is the concrete manifold type, and declare `scitype((M, p)) = ManifoldPoint{typeof(M)}`. Then your input type declarations in the implementation of the MLJ interface would look something like:
input_scitype(::ManifoldKNNRegressor) = AbstractVector{<:ManifoldPoint{<:MetricManifold}}
And the rest would be straightforward, I should think.
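A minimal, self-contained sketch of how this could fit together follows. It deliberately avoids the real Manifolds.jl, ScientificTypes and MLJ APIs: `ToyManifold`, `ToySphere`, `point_scitype`, `vector_scitype` and `input_scitype_sketch` are all hypothetical stand-ins, and the sketch assumes every point in a vector lives on the same manifold type.

```julia
# Stand-ins for manifold types; these are NOT the Manifolds.jl types.
abstract type ToyManifold end
struct ToySphere <: ToyManifold
    dim::Int
end

# Hypothetical scientific type, parametrized by the concrete manifold type.
# It is only a label and is never instantiated.
abstract type ManifoldPoint{M<:ToyManifold} end

# Hypothetical analogue of `scitype` for a tuple (M, p).
point_scitype(x::Tuple{M,Any}) where {M<:ToyManifold} = ManifoldPoint{M}

# Scitype of a vector of such tuples: an abstract vector of the joined element scitypes.
function vector_scitype(xs::AbstractVector)
    T = mapreduce(point_scitype, typejoin, xs)
    return AbstractVector{T}
end

# The kind of declaration a model implementation might make for its input.
input_scitype_sketch() = AbstractVector{<:ManifoldPoint{<:ToyManifold}}

S = ToySphere(2)
X = [(S, [1.0, 0.0, 0.0]), (S, [0.0, 1.0, 0.0])]

# Checking the data against the declared input scitype is then a subtype test.
ok = vector_scitype(X) <: input_scitype_sketch()
```

With this arrangement, validating input data reduces to a single subtype check, which is how the declaration above would be consumed.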
Other random thoughts:
Maybe there is some way to "decorate" existing manifolds to enforce the kind of point representation we want. I don't understand this decorating business well enough to say, or to know whether this is really an advantage.
Maybe we want to refine the scitype to include the `number_type` as a type parameter.
@kellertuer @mateuszbaran Your thoughts?
Hello,
following this discussion, the steps followed by PosDefManifold.jl for doing classification in the tangent space are:
Parallel transport of all points to the center of the manifold (the identity for the manifold of positive definite matrices). This involves computing a center of mass of the points on the manifold and a function for parallel transport.
Logarithmic map (projection onto the tangent space).
Vectorization (a special one; for example, for the positive definite manifold a weight of √2 is given to the off-diagonal elements).
I am not familiar with other manifolds. Are all these operations possible with all currently implemented manifolds?
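For the positive definite case, the three steps above can be sketched with nothing but `LinearAlgebra`. This is not the PosDefManifold.jl API; all function names below are hypothetical. The sketch assumes that the center of mass `G` is already known, that "transport to the identity" means the congruence P ↦ G^(-1/2) P G^(-1/2), and that the projection onto the tangent space is the logarithmic map at the identity (the matrix logarithm).

```julia
using LinearAlgebra

# Apply a scalar function to a symmetric matrix through its eigendecomposition.
function spectral_apply(f, P::AbstractMatrix)
    E = eigen(Symmetric(P))
    return E.vectors * Diagonal(f.(E.values)) * E.vectors'
end

# Step 1 (recentering): move the center of mass G to the identity.
# G is assumed given; computing it (e.g. as a Fréchet mean) is not shown.
function recenter(P, G)
    W = spectral_apply(x -> 1 / sqrt(x), G)   # G^(-1/2)
    return W * P * W
end

# Step 2: logarithmic map at the identity, i.e. the matrix logarithm.
log_at_identity(P) = spectral_apply(log, P)

# Step 3: stack the upper triangle, weighting off-diagonal entries by sqrt(2)
# so that the Euclidean norm of the vector equals the Frobenius norm of S.
function vectorize_tangent(S)
    n = size(S, 1)
    return [i == j ? S[i, j] : sqrt(2) * S[i, j] for j in 1:n for i in 1:j]
end

tangent_features(P, G) = vectorize_tangent(log_at_identity(recenter(P, G)))

G = [2.0 1.0; 1.0 2.0]   # toy center of mass
P = [3.0 0.5; 0.5 1.0]   # toy data point
v = tangent_features(P, G)
```

The resulting vectors can be fed to any Euclidean classifier. For another manifold, the equivalent pipeline would need a mean, a transport to a reference point, a logarithmic map, and a norm-preserving coordinate representation of tangent vectors.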
There is a package ManifoldLearning: https://github.com/wildart/ManifoldLearning.jl . Only partially related but we may still want to take a look at what it does.
We could add the MDL information criterion: https://www.pnas.org/content/97/21/11170 . It seems to depend on the integration of scalar functions on a manifold which we don't have yet.
I've noticed that you (@kellertuer) have written "SVM". Do you actually mean providing a set of RKHS kernels like in this paper: https://arxiv.org/pdf/1412.0265.pdf ?
EDIT: I just realized that I forgot to mention that I meant SVM from the https://github.com/JuliaManifolds/ManifoldML.jl/blob/master/ideas.md file.