Git Product home page Git Product logo

updatableqrfactorizations.jl's Introduction

UpdatableQRFactorizations.jl

CI codecov

This package contains implementations of efficient representations and updating algorithms for QR factorizations. Notably, the implementations can scale to very high dimensions and moderately large numbers of columns even if memory is not abundant.

Basic Usage

The following example highlights the basic usage of the package.

using UpdatableQRFactorizations
n, m = 32, 8
A = randn(n, m)
F = UpdatableQR(A)

The following snippet shows how to add a column to the factorization.

b = randn(n)
add_column!(F, b)
println(Matrix(F)  [A b])

Lastly, we can also efficiently update the factorization upon removal of a column.

remove_column!(F)
println(Matrix(F)  A)

Documentation

Initialization

The UpdatableGivensQR structure is central to the package and can also be referred to as UpdatableQR, or UQR. It holds storage for Givens rotations and the R matrix of a QR factorization and allows for efficient addition and deletion of columns. See add_column! and remove_column!. There are two primary ways of initializing such a structure. First,

UpdatableQR(A::AbstractMatrix, r::Int = size(A, 1))

computes the QR factorization of A and pre-allocates enough memory for up to r columns total. In cases where n is large, and in particular, if an n by n matrix does not fit in memory, reducing the maximum allowed rank r is necessary to use this structure. Note also that there is a mutating constructor

UpdatableQR!(A::AbstractMatrix, r::Int = size(A, 1)),

which overwrites the input matrix A with the R matrix but allocates extra space for the representation of Q. Second,

UpdatableQR(n::Int, r::Int)

initializes an empty QR factorization of size n by 0 with enough space to add r columns total.

Column Addition

To add columns to the factorization, use

add_column!(F::UpdatableGivensQR, x::AbstractVector, k::Int = size(F, 2) + 1).

Given an existing QR factorization F of a matrix, add_column! computes the factorization of the same matrix after a new column x has been added as the kᵗʰ column. Computational complexity: O(nm) where size(F) = (n, m). This overwrites the existing factorization. The package also supports the addition of multiple columns with a single call with performance benefits:

add_column!(F::UpdatableGivensQR, X::AbstractMatrix, k::AbstractArray{Int} = (size(F, 2) + 1) : (size(F, 2) + size(X, 2)))

By default, both functions append the columns to the factorization. If a different column ordering is desired, the index at which the columns should be added can be passed as the third argument.

Column Removal

To remove columns from a factorization, use

remove_column!(F::UpdatableGivensQR, k::Int = size(F, 2))

Updates the existing QR factorization F of a matrix to the factorization of the same matrix after its kᵗʰ column has been deleted. Computational complexity: O(m²) where size(F) = (n, m).

On Efficiency

First, note that the goal of this package is primarily to allow for the efficient updating of QR factorizations. It is hard to beat stdlib's qr, which links to LAPACK, for a factorization from scratch. For a representative performance on a dual core 13'' MacBook Pro from 2017 for a moderately sized problem, see the following:

n, m = 1_000, 128;
A = randn(n, m);
b = randn(n);

@time F = UpdatableQR(A, 2m);
  0.018699 seconds (10 allocations: 10.241 MiB)
  
@time add_column!(F, b);
  0.000635 seconds (1 allocation: 1.141 KiB)
  
@time qr(A);
  0.006914 seconds (7 allocations: 1.047 MiB)

@time qr!(A);
  0.003037 seconds (5 allocations: 72.188 KiB)

Notably, qr is three times faster than UpdatableQR in constructing a factorization from scratch, but is significantly slower than the addition of a single column, which is not supported by qr and would require a second factorization from scratch.

On Scalability

The implementation contained herein represents Q implicitly by keeping track of the Givens rotations that constitute it, allowing a scaling to very high n and moderate m. This is in contrast to an existing implementation of updatable QR factorizations in GeneralQP.jl, which keeps the n by n Q matrix densely in memory and updates it upon column addition, prohibiting scaling to problems where n is large regardless of the number of columns m.

In fact, on the same laptop as before, we can compute an updatable QR factorization in 1,000,000 dimensions in under a minute:

n, m = 1_000_000, 128;
A = randn(n, m);
@time F = UpdatableQR(A, 2m);
39.086453 seconds (10 allocations: 10.490 GiB, 0.74% gc time)

It is also notable that contructing the QR factorization from scratch with stdlib's qr is again much faster

@time qr(A);
  3.363778 seconds (112.75 k allocations: 983.465 MiB, 1.77% gc time, 1.34% compilation time).
 
@time qr!(A);
  3.058636 seconds (5 allocations: 72.188 KiB)

However, subsequent addition of single columns beats calculating a factorization from scratch, even with the very efficient qr

@time add_column!(F, b);
  0.599013 seconds (1 allocation: 1.141 KiB)

Future work on this package could improve on the performance difference between qr and UpdatableQR for a factorization from scratch.

Limitations

At this point, UpdatableQRFactorizations.jl only supports factorizations of matrices with full column rank. Further, no specialization for sparse matrices has been implemented, which might significantly improve performance in cases where sparsity is present.

updatableqrfactorizations.jl's People

Contributors

sebastianament avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar

Forkers

dkarrasch

updatableqrfactorizations.jl's Issues

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Proposal: `GivensQ <: AbstractQ`

I came across this package via a PkgEval run in JuliaLang/julia#46874. I found that currently GivensQ subtypes Factorization, but is actually meant to represent an orthogonal/unitary matrix. So you may want to consider to subtype LinearAlgebra.AbstractQ. After JuliaLang/julia#46196, you could "benefit" from a lot of generic fallbacks that do exactly what you want it to do. You can check the generic part, the beginning of the new file abstractq.jl in that PR, and the added test there to see how it feels and works.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.