Comments (8)
Ooo not come across them before. I'll add reading them to my TODO list.
AbstractGPs is more of an interface package. For example, I'm going to refactor Stheno.jl and TemporalGPs.jl to adhere to its interface. The idea being that once you've implemented the interface other stuff should work for you.
from abstractgps.jl.
Also tagging @tagordon, who has been working on extending celerite to multiple dimensions and has written a version in Julia.
@ericagol It would be great to collaborate on this. As @willtebbutt mentioned, AbstractGPs is meant to be a lightweight package for GP-related types and core functionality. At the same time, KernelFunctions will provide a comprehensive set of kernel function implementations for use in GP modelling. In combination with the @TuringLang libraries, the long-term goal of this umbrella is to provide a viable Julia alternative to the popular GPML toolbox in Matlab.
@yebai Thanks for the reply. I don't have time to help with this now, but @tagordon may be interested. The celerite formalism is fairly easy to implement, and even though it has a restriction on the functional form of the kernels, it actually provides a very complete description of stationary 1-D GPs (with a more limited extension to 2D).
@willtebbutt I just watched your JuliaCon 2020 talk, and it seems apparent to me that there is probably some commonality between the SDE approach that you use and our celerite kernels, which are also represented as sums of SDEs. The semi-separable matrix solver seemed to give better performance than a Kalman filter solver for a CARMA process (which is a more complex SDE), although this might be a function of the implementation. I also found it interesting that you achieved better performance with static arrays, and that you managed to implement reverse-mode AD. This short paper might be of interest: https://ui.adsabs.harvard.edu/abs/2018RNAAS...2...31F/abstract

@yebai I think it would be interesting to create a version of celerite which interfaces well with these other packages, and thus becomes more widely usable. I'm happy to advise on this (I'm not much of a programmer, but I understand the algorithm well, and have developed some basic Julia skills).
I agree that it would be great to have a version of your celerite work that implements the AbstractGP API, which should be very doable from what I've understood from a skim of the various papers linked. Fundamentally, there are two things that you need to do:
- have a way to say "use my celerite methods instead of the defaults please", and
- the linear algebra primitives to exploit the structure in the celerite methods.
Probably the best way to go about this is:
1. Implement a small "wrapper" type, maybe you would call it `Celerite <: AbstractGP` (?), that eats an `AbstractGP`.
2. A custom `AbstractMatrix` type, perhaps called `SemiSeparableMatrix`, parametrised in terms of `A`, `U`, and `V` (eqn 33 from your 2017 paper).
3. Make `cov(Celerite(f(x)))` spit out a `SemiSeparableMatrix` -- this should be the only modification needed to the `AbstractGP` API.
4. A custom `B <: AbstractTriangular` type parametrised in terms of `U` and `W` (eqn 35), with `logdet` and `ldiv` defined (I couldn't think of a better name than `B`, but we probably shouldn't call it `B`).
5. Make `cholesky(::SemiSeparableMatrix)` spit out a `Cholesky` that contains a `B`.
If you do that, then everything should just work. I'm probably overlooking something, but I can't see anything I'm overlooking being more than a small detail.
One thing I wasn't quite able to glean from your 2017 paper is why e.g. the OU process's kernel admits this semi-separable structure. Would you be able to point me towards a reference on that? Maybe I just missed where you derive it in the paper...
I agree that it's interesting that there's such a performance difference. I would be willing to bet a decent chunk of it is implementation-related, as it's generally quite hard to produce good Kalman filtering implementations with the tools we have available -- StaticArrays are a real life-saver in the Kalman filtering case because they accelerate the small operations nicely, but they don't do especially well for even a medium-dimensional state. BLAS is fine for really high-dimensional things, but unfortunately loads of interesting problems have a medium-dimensional state space. If I've understood what you're doing correctly, it looks like there's an important set of scalar calculations which compute the Cholesky decomposition, but everything else can be done efficiently using BLAS. So that might explain the difference in general.
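To illustrate the StaticArrays point concretely (a made-up toy using the registered StaticArrays.jl package, not code from TemporalGPs): for a small fixed-size state, a Kalman-style covariance prediction with `SMatrix` is stack-allocated and fully unrolled, whereas the `Matrix` version allocates and dispatches to BLAS on every step.

```julia
using StaticArrays, LinearAlgebra

# Covariance prediction step of a Kalman filter: P ← A P Aᵀ + Q.
predict(A, P, Q) = A * P * A' + Q

# Toy 2-dimensional state (all numbers made up for illustration).
A = @SMatrix [0.9 0.1; 0.0 0.8]
P = @SMatrix [1.0 0.0; 0.0 1.0]
Q = @SMatrix [0.1 0.0; 0.0 0.1]

P_static = predict(A, P, Q)                          # SMatrix in, SMatrix out: no heap allocation
P_dense  = predict(Matrix(A), Matrix(P), Matrix(Q))  # same numbers via allocating BLAS calls
```

For state dimensions beyond roughly 10-20 the unrolling stops paying off, which is the "medium-dimensional" gap described above.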
Skeleton Implementation
I figured a sketch of 1-5 would be helpful. It actually has an extra type because I realised while writing it that I had missed something...
```julia
using AbstractGPs
using LinearAlgebra

"""
    SemiSeparableMatrix{T<:Number} <: AbstractMatrix{T}

A matrix defined by

    A + tril(U Vᵀ) + triu(V Uᵀ)

This matrix doesn't really do a lot other than getting passed around. Possibly you would
want to implement the AbstractArray interface for it so that it can be printed etc., but
equally you might not bother.
"""
struct SemiSeparableMatrix{
    T<:Number, TA<:Diagonal{T}, TU<:AbstractMatrix{T}, TV<:AbstractMatrix{T},
} <: AbstractMatrix{T}
    A::TA
    U::TU
    V::TV
    function SemiSeparableMatrix(
        A::Diagonal{T}, U::AbstractMatrix{T}, V::AbstractMatrix{T},
    ) where {T}
        # some boring inner-constructor stuff (size checks etc.) goes here
        return new{T, typeof(A), typeof(U), typeof(V)}(A, U, V)
    end
end

"""
    Celerite <: AbstractGP

A wrapper for an `AbstractGP` that exploits some stuff...
"""
struct Celerite{Tgp} <: AbstractGP
    gp::Tgp
end

"""
Note that a `FiniteGP` just wraps the original GP `f`, input locations `x`, and noise
variance. You should think of it as the multivariate Normal given by `f` at `x`.
"""
function AbstractGPs.cov(f::FiniteGP{<:Celerite})
    A = ...
    U = ...
    V = ...
    return SemiSeparableMatrix(A, U, V)
end

"""
    B{T} <: AbstractTriangular{T}

An upper-triangular matrix given by

    I + triu(W Uᵀ)

I've assumed here that it's straightforward to use the upper-triangular convention for
the Cholesky rather than the lower, because `AbstractGPs` uses the upper-triangular
convention. If that's not the case here for some reason, we can come up with a work-around.
"""
struct B{
    T<:Number, TU<:AbstractMatrix{T}, TW<:AbstractMatrix{T},
} <: LinearAlgebra.AbstractTriangular{T}
    U::TU
    W::TW
end

# It turns out that LinearAlgebra's Cholesky implementation is a bit tricky to work with
# if you're not working with dense arrays. For this reason, it's probably easier to roll
# your own here.
struct CholeskyOfSeparable{TL<:B}
    L::TL
end

function LinearAlgebra.cholesky(A::SemiSeparableMatrix)
    W = ...
    b = B(A.U, W)
    return CholeskyOfSeparable(b)
end

LinearAlgebra.logdet(C::CholeskyOfSeparable) = 2 * logdet(C.L)

# AbstractGPs generally uses the UpperTriangular convention, rather than the lower. This
# isn't hard to handle though.
function Base.:\(X::Transpose{<:Number, <:B}, Y::AbstractMatrix)
    return ...
end
```
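For context on why `logdet` and a triangular solve are the only primitives the factorisation type needs to supply, here is a generic sketch of the usual multivariate-normal computation (my own illustration, not AbstractGPs internals): the GP log-marginal likelihood touches the Cholesky factor only through those two operations.

```julia
using LinearAlgebra

# log N(y | 0, K) computed via the upper-triangular Cholesky factor:
# K = Uᵀ U, so yᵀ K⁻¹ y = ‖Uᵀ \ y‖² and logdet(K) comes straight from the factor.
function gp_logpdf(K::AbstractMatrix, y::AbstractVector)
    C = cholesky(Symmetric(K))
    α = C.U' \ y   # one triangular solve; a `B`-like type only needs this and `logdet`
    return -(logdet(C) + dot(α, α) + length(y) * log(2π)) / 2
end

K = [1.0 0.5; 0.5 1.0]
y = [0.1, -0.2]
gp_logpdf(K, y)  # matches the dense formula -0.5 (log det K + yᵀ K⁻¹ y + n log 2π)
```

So once the semiseparable `cholesky`, `logdet`, and `\` exist, the O(n) likelihood falls out of the same generic code path.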
@willtebbutt That looks like a great start!
I can answer the "why" question: it is because each celerite kernel component can be written as a sum of complex exponentials. The products of exponentials involve sums of the exponents, and thus if one exponent has a dependence on `e^{-c t_1}` and another on `e^{c t_2}`, then the product will have a term with `e^{-c(t_1 - t_2)}`. Combining these terms with the outer product of the `U` and `V` matrices then allows you to generate the kernel component for any two times `t_1` and `t_2`. Now, if `c` is complex, this affords oscillatory components (although we avoid complex arithmetic). Also, to avoid numerical instability, only the exponential dependence of adjacent times is used when computing the Cholesky decomposition.
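This factorisation is easy to verify numerically for a single real term, i.e. the OU / Rybicki-Press case (my own toy construction with made-up values of `a` and `c`; note that the paper's stable variant rescales the exponential factors, which this naive version does not):

```julia
using LinearAlgebra

# For k(τ) = a e^{-c τ} at sorted times t, take U_i = a e^{-c t_i} and
# V_j = e^{+c t_j}. Then U_i V_j = a e^{-c (t_i - t_j)} reproduces the kernel
# whenever t_i > t_j, so  K = A + tril(U Vᵀ, -1) + triu(V Uᵀ, 1).
a, c = 1.5, 0.7
t = sort(rand(8))

U = a .* exp.(-c .* t)
V = exp.(c .* t)              # can overflow for large t; celerite rescales these factors
A = Diagonal(fill(a, length(t)))

K_semisep = A + tril(U * V', -1) + triu(V * U', 1)
K_dense   = [a * exp(-c * abs(ti - tj)) for ti in t, tj in t]

maximum(abs.(K_semisep - K_dense))  # zero up to floating point
```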
This algorithm was originally a generalization of the Rybicki-Press algorithm, which has one real exponential term (i.e. an OU process), to multiple terms. Ambikasaran generalized the R-P algorithm to multiple real terms here, while in the celerite paper we applied it to complex terms; with some prodding from me, Sivaram then derived the semi-separable approach, which we found to be about a factor of 20 faster than the generalized Rybicki-Press algorithm, and which also allowed Cholesky decomposition, enabling simulation of GPs.
Also, we've recently generalized the algorithm to a second (small) dimension whose kernel structure is an outer product of a vector with itself. This applies when there is a noisy process which has the same shape but a different amplitude of correlated noise, plus different white noise, in each component of the second dimension. The application we have in mind is stellar variability: stars vary in temperature across their surface, which leads (approximately) to the same shape of correlations across different wavelengths, but differing amplitudes. This was implemented by @tagordon in Julia.
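If I've understood the structure correctly, a dense (non-celerite) version of that covariance can be sketched as a Kronecker product plus per-band white noise; all names and numbers below are my own illustration, not the actual implementation:

```julia
using LinearAlgebra

# M bands sharing one time kernel K_t, with per-band amplitudes α (so the
# cross-band structure is the rank-1 outer product α αᵀ) and per-band white
# noise σ²:  K_full = (α αᵀ) ⊗ K_t + diag(white noise).
M, N = 3, 5
α  = [1.0, 0.6, 0.3]          # correlated-noise amplitude per band
σ² = [0.1, 0.2, 0.05]         # white-noise variance per band
t  = range(0, 1; length=N)
K_t = [exp(-abs(ti - tj)) for ti in t, tj in t]   # any 1-D kernel would do here

K_full = kron(α * α', K_t) + Diagonal(repeat(σ²; inner=N))
isposdef(K_full)  # positive definite thanks to the white-noise diagonal
```

The correlated part is rank-1 across bands, which is what makes the celerite-style O(n) machinery extendable to this case.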