Comments (6)
The primary reason is that a KroneckerLazyVariable can represent the Kronecker product of other LazyVariables, which we might not want to evaluate or invert explicitly. This is true, for example, when doing KISS-GP with Kronecker structure on multiple dimensions (https://github.com/cornellius-gp/gpytorch/blob/master/examples/kissgp_kronecker_product_regression.ipynb).
It's true that if KroneckerLazyVariable were over only NonLazyVariables, we could probably avoid CG and Lanczos.
Do you have an example use case where exploiting properties of the Kronecker product, other than for MVMs, would be useful? We just briefly discussed it and are open to the idea; we just can't think of what we'd gain, or how we'd keep the structure of the sub lazy variables intact.
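For context on why MVMs are the operation of interest here: a matrix-vector product with a Kronecker product can be computed from the factors alone, without ever materializing the full matrix. A minimal numpy sketch (the helper name `kron_mvm` is ours for illustration, not GPyTorch's API), using the identity (A ⊗ B) vec(X) = vec(B X Aᵀ) with column-major vec:

```python
import numpy as np

def kron_mvm(A, B, v):
    """Compute (A kron B) @ v without forming the Kronecker product,
    via the identity (A kron B) vec(X) = vec(B @ X @ A.T),
    where vec stacks columns (Fortran order)."""
    n, m = A.shape[0], B.shape[0]
    X = v.reshape(m, n, order="F")   # un-vec: v = vec(X)
    Y = B @ X @ A.T
    return Y.reshape(-1, order="F")  # re-vec the result

# check against the explicit product on a small example
A = np.random.randn(3, 3)
B = np.random.randn(4, 4)
v = np.random.randn(12)
assert np.allclose(kron_mvm(A, B, v), np.kron(A, B) @ v)
```

This costs O(nm(n + m)) per product instead of the O(n²m²) of a dense MVM, which is what makes iterative, MVM-based solvers like CG attractive for these lazy variables.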
from gpytorch.
It may also just generally be hard to justify using inversion algorithms other than CG at this point, even when they are available. CG typically gives fairly good solves in a very small number of iterations, making it a compelling choice even when an exact inverse is possible. For example, even for exact GPs, k iterations of CG require O(kn^2) time, compared to O(n^3) time to compute a Cholesky decomposition.
Even without preconditioning, k can often be very small; it depends more on the conditioning of the matrix and the clustering of its eigenvalues than on the size of the matrix.
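To make the cost argument concrete, here is a plain textbook conjugate-gradients loop (a generic sketch, not GPyTorch's actual mBCG implementation). The matrix enters only through one MVM per iteration, so for a dense n × n matrix, k iterations cost O(kn²):

```python
import numpy as np

def conjugate_gradients(mvm, b, k=20, tol=1e-10):
    """Solve A x = b for symmetric positive-definite A, given only a
    matrix-vector-multiply closure `mvm`. Each iteration performs one
    MVM, which dominates the cost (O(n^2) for a dense A)."""
    x = np.zeros_like(b)
    r = b - mvm(x)          # initial residual
    p = r.copy()            # initial search direction
    rs = r @ r
    for _ in range(k):
        Ap = mvm(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:   # residual small enough: stop early
            break
        p = r + (rs_new / rs) * p   # new conjugate direction
        rs = rs_new
    return x
```

In exact arithmetic CG terminates in at most n iterations, but as noted above, well-clustered eigenvalues typically give a good solve in far fewer.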
I think the primary gain over MVM is speed. I tried a 10-dimensional grid with 10 points in each dimension, and the root decomposition is very slow. Although MVM is fast with Kronecker structure, the memory cost is still large for a multidimensional grid, which slows the computation down considerably. With a Kronecker root decomposition applied, however, this should finish almost instantly.
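For reference, the "Kronecker root decomposition" suggested here follows from the mixed-product property: if each factor satisfies K_i = L_i L_iᵀ, then L = L_1 ⊗ ... ⊗ L_d is a root of K = K_1 ⊗ ... ⊗ K_d. A toy numpy sketch (we materialize the product only to check the claim; a lazy implementation would keep the per-dimension factors):

```python
import numpy as np

def kron_root(factors):
    """Root of K = K_1 kron ... kron K_d for SPD factors K_i.
    By the mixed-product property, L = L_1 kron ... kron L_d with
    K_i = L_i @ L_i.T satisfies K = L @ L.T, so only d small
    Cholesky factorizations are needed."""
    roots = [np.linalg.cholesky(K) for K in factors]
    L = roots[0]
    for R in roots[1:]:       # materialized here only for verification;
        L = np.kron(L, R)     # a lazy variable would keep the factors
    return L
```

So a root of the full (n₁·…·n_d)-sized matrix costs only d small O(n_iᵀ) Cholesky factorizations, rather than any operation on the full Kronecker product.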
Maybe I misunderstood the code, but why is there a problem when lazy_vars are LazyVariables? For KroneckerProductLazyVariable, an InverseLazyVariable or SquareLazyVariable should be able to be implemented so that it is applied to each element. At least CG and Lanczos should be applicable elementwise, right?
Do you have an example of how a new KroneckerLazyVariable would look? It would basically need to define a new inv_quad_log_det.
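As a rough sketch of what such a method could exploit for K = A ⊗ B (a standalone toy function for illustration, not the actual LazyVariable interface): det(A ⊗ B) = det(A)^m · det(B)^n, and (A ⊗ B)⁻¹ vec(X) = vec(B⁻¹ X A⁻ᵀ), so both quantities reduce to operations on the small factors:

```python
import numpy as np

def kron_inv_quad_logdet(A, B, v):
    """Toy inv-quad/log-det for K = A kron B (A is n x n, B is m x m).
    Uses det(A kron B) = det(A)^m * det(B)^n for the log-determinant,
    and (A kron B)^-1 vec(X) = vec(B^-1 @ X @ A^-T) for the inverse
    quadratic form v.T @ K^-1 @ v."""
    n, m = A.shape[0], B.shape[0]
    logdet = m * np.linalg.slogdet(A)[1] + n * np.linalg.slogdet(B)[1]
    X = v.reshape(m, n, order="F")       # un-vec
    Y = np.linalg.solve(B, X)            # B^-1 X
    Z = np.linalg.solve(A, Y.T).T        # (B^-1 X) A^-T
    inv_quad = v @ Z.reshape(-1, order="F")
    return inv_quad, logdet
```

Both terms cost only small solves and log-determinants on the factors, never an operation on the full nm × nm matrix.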
The main cost with Kronecker structure is the exponential dependence on d of the regularly spaced grid; in 10 dimensions at 10 grid points per dimension, that's 10^10 inducing points. At the moment, the two implemented ways around this are the multiplicative grid interpolation kernel which exploits product structure (e.g. in the RBF kernel) and deep kernel learning, although we may implement SGPR as a lazy variable down the road.
It's not immediately obvious to me personally how we can get around the exponential scaling in a Kronecker lazy variable; however, if you believe you have a method, it could be extremely interesting.
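The "product structure (e.g. in the RBF kernel)" mentioned above refers to the fact that a d-dimensional RBF kernel factors into a product of one-dimensional RBF kernels, which is what lets the multiplicative grid interpolation kernel work one dimension at a time. A quick numpy check of that factorization (unit lengthscale for simplicity):

```python
import numpy as np

def rbf_1d(x, z, ls=1.0):
    """1-d RBF kernel matrix between point sets x and z."""
    d = x[:, None] - z[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# A d-dimensional RBF kernel is the elementwise product of the
# per-dimension 1-d RBF kernels, since the squared distance sums
# over dimensions and exp turns the sum into a product.
x = np.random.randn(5, 3)  # 5 points in d = 3 dimensions
K_full = np.exp(-0.5 * ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1))
K_prod = np.prod([rbf_1d(x[:, i], x[:, i]) for i in range(3)], axis=0)
assert np.allclose(K_full, K_prod)
```

This per-dimension decomposition is what avoids touching all 10^d grid points at once.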
Ah, I had missed the multiplicative grid interpolation kernel before. Thank you.