juliafirstorder / proximaloperators.jl Goto Github PK

Proximal operators for nonsmooth optimization in Julia

License: Other

Julia 98.96% TeX 1.04%

optimization proximal-algorithms proximal-operators julia julia-language admm prox nonsmooth-optimization first-order-methods numerical-optimization

proximaloperators.jl's Introduction

ProximalOperators.jl

Proximal operators for nonsmooth optimization in Julia. This package can be used to easily implement proximal algorithms for convex and nonconvex optimization problems such as ADMM, the alternating direction method of multipliers.

See ProximalAlgorithms.jl for generic implementations of algorithms based on the primitives here defined.

See the documentation on how to use the package.

Installation

To install the package, hit ] from the Julia command line to enter the package manager, then

pkg> add ProximalOperators

Usage

With using ProximalOperators the package exports the prox and prox! methods to evaluate the proximal mapping of several functions.

A list of available function constructors is in the documentation.

For example, you can create the L1-norm as follows.

julia> f = NormL1(3.5)
description : weighted L1 norm
type        : Array{Complex} → Real
expression  : x ↦ λ||x||_1
parameters  : λ = 3.5

Functions created this way are, of course, callable.

julia> x = randn(10) # some random point
julia> f(x)
32.40700818735099

prox evaluates the proximal operator associated with a function, given a point and (optionally) a positive stepsize parameter, returning the proximal point y and the value of the function at y:

julia> y, fy = prox(f, x, 0.5) # last argument is 1.0 if absent

prox! evaluates the proximal operator in place, and only returns the function value at the proximal point:

julia> fy = prox!(y, f, x, 0.5) # in-place equivalent to y, fy = prox(f, x, 0.5)

Related packages

References

N. Parikh and S. Boyd (2014), Proximal Algorithms, Foundations and Trends in Optimization, vol. 1, no. 3, pp. 127-239.
S. Boyd, N. Parikh, E. Chu, B. Peleato and J. Eckstein (2011), Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1-122.

Credits

ProximalOperators.jl is developed by Lorenzo Stella and Niccolò Antonello at KU Leuven, ESAT/Stadius, and Mattias Fält at Lunds Universitet, Department of Automatic Control.

proximaloperators.jl's People

Contributors

Stargazers

Watchers

proximaloperators.jl's Issues

Bundle method for non-differentiable optimisation

Bundle method has been widely adopted to compute the subgradients within a lagrangean relaxation framework due to faster convergence compared to the traditional cutting plane algorithms. I'm looking for an implementation of the bundle method in Julia in order to employ them within my lagrangean heuristic. Are there any existing implementations of the bundle method in Julia? If not, would it be possible to request for one?

calculus function fields

Hi Stella. I have a issue regarding the calculus functions. Consider the following:

 f1 = SqrNormL2(1.0)
 f2 =IndFree()
 F = Sum(f1,f2)

Now my issue is that to access the field lambda of f1 I need to ask for F.fs[1].lambda. I would have like for F to be blind to how it was created. I mean I would have liked to be able to say F.lambda instead.
I encountered this issue because I have a scenario where my function may or may not be a sum of other functions. I need both to take gradient of F and to access the fields of one of the functions.

SqrNormL2 expects λ to be positive

The upgrade to v. 0.15.0 breaks my code as SqrNormL2 now expects λ to be positive instead of non-negative. I would like to regularize only some of the parameters in my model, and achieved this by setting some values in the array λ to zero. What is the reason for this change? I would probably now go ahead and define my own proximal operator that works with 0-valued λs.

Nonnegative least squares example

Whipped this together earlier. Not sure on the stopping criterion, but it seems reasonable...

using ProximalOperators
# Define solvers
function NNLS_admm(A, b, x; tol = 1e-9, maxit = 50000, α = 1.0)
    u = zero(x)
    #Constraint dictates that x = z
    z = @view x[:]
    lastx = deepcopy(z)
    f = LeastSquares(A, b)
    g = IndNonnegative()
    for it = 1:maxit
        # perform f-update step
        prox!(x, f, x - z + u, α)
        # perform g-update step
        prox!(z, g, x + u, α)
        # stopping criterion
        if norm(lastx .- x, 2) <= tol
            println(it)
            break
        end
        # dual update
        u .+= x - z
        lastx[:] = z[:]
    end
    return z
end

m, n, k, sig = 200, 50, 100, 1e-3
A = rand(m, n)
x_true = rand(n)
b = A*x_true

x_admm = NNLS_admm(A, b, zeros(n))
println("ADMM")
println("      nnz(x)    = $(norm(x_admm, 0))")
println("      obj value = $(0.5*norm(A*x_admm-b)^2 + lam*norm(x_admm, 1))")

(x_admm .- x_true)

plot(x_admm)
plot!(x_true)

IndicatorConvexCone

You seem to have defined IndicatorConvexCone <: IndicatorConvex but most cones are defined as
immutable IndPSD <: IndicatorConvex end
immutable IndFree <: IndicatorConvex end
Is there a reason for this? I would be happy add a PR if not.
It would also be nice for the package I'm writing (and for slight efficiency) if we had IndNonnegative (and IndNonpositive) as IndicatorConvexCone instead of
IndNonnegative() = IndBox(0.0, +Inf).
That way I would be able to restrict my algorithm to work with any IndicatorConvexCone in this package.

We might also consider adding a IndDualCone{T<:IndicatorConvexCone}. I already have an implementation for this in my code, and it should be easy to include here. It looks something like this:

type DualCone{T<:IndicatorConvex} <: IndicatorConvex
    C::T
end

dual{T<:IndicatorConvex}(C::T) = DualCone{T}(C)
dual(C::DualCone) = C.C

function prox!(cone::DualCone, x, y)
    prox!(cone.C, -x, y)
    for i = 1:length(x)
        y[i] = x[i] + y[i]
    end
end

and optionally we could add something like this

#Wrapper for dual prox to avoid double duals
function proxDual!(C::IndicatorConvex, x, y)
    prox!(C, -x, y)
    @simd for i = 1:length(x)
        y[i] = x[i] + y[i]
    end
end
proxDual!(C::DualCone, x, y) = prox!(C.C, x, y)

I'm happy to create a PR for this too if you are interested.

Locally Low-Rank operator

I would like to use the LLR operator, and am considering making a PR.

Reference:
"Local Low-Rank Matrix Approximation" Joonseok Lee et al.
https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45235.pdf

Computing proximal operator of a constrained convex function

May I request for the feature extension to compute the proximal operator of a constrained convex function? For example, a constrained convex function that shows up in low rank factor analysis is:

where I_P is the indicator function of the convex set:

In my Julia package NExOS.jl I have used ProximalOperators.jl heavily when the functions are easy to compute. For now, whenever I need to deal with a constrained convex function, I am constructing the function object myself and computing the proximal operator by JuMP and a solver supported by it. If there was a subroutine in ProximalOperators.jl, which would construct the constrained function object and compute the proximal operator via calling other solvers, it would be great. Of course, I completely understand if this is outside of the scope of the package.

An immutable type ProximalPoint

This is just a design idea, but I kinda like it.

From the beginning of ProximalOperators, we had prox! and prox returning both the proximal point y and the value of f at y. This has several advantages: essentially, in many cases evaluating the prox of a function entails almost-computing the function value at the resulting point. Such value may be used, for example, in a stopping criterion, or may be of any other use for an algorithm. A notable example for this is the nuclear norm: once you have soft-thresholded the singular values, you better store their sum somewhere: otherwise evaluating f at the resulting point will be another SVD - quite expensive.

Actually, right now prox! only returns the function value, since the proximal point is written to an array provided by the user.

On the other hand, I can see the advantage of having the proximal point as the only returned argument. It's a bit more like it is in paper: you ask for the proximal point, you get the proximal point.

One way I can imagine to obviate this problem is to have an Array-like type, encapsulating the function value along with the proximal point. It should be immutable by law (point & function value shall be together until the last bit of their lives, in the name of provable correctness).

immutable ProximalPoint <: AbstractArray
  f::ProximableFunction
  point::AbstractArray
  value::Real
end

An object of this type shall be created upon prox evaluation. Then the call method would be:

function (f::ProximableFunction)(x::ProximalPoint)
  if object_id(f) == object_id(x.f)
    return x.value
  else
    return f(x.point) # this is already defined for all functions
  end
end

This way one can decide to do:

y = prox(f, x, gamma)
v = f(y) # this doesn't compute anything

Would this complicate things? That is, would this require us to (trivially) redefine several other things for the new ProximalPoint type? Things that, when called on x::ProximalPoint, should be re-called with argument x.point instead. It would be nice if it were possible to declare: whatever acts on AbstractArray acts on ProximalPoint by looking at its point field (unless otherwise specified: thank you, multiple dispatch).

Add prox! method for IndPSD handling matrices

Currently prox! only handles Symmetric or Hermitian objects. However, when using this in algorithms one will probably have a Matrix object, see JuliaFirstOrder/ProximalAlgorithms.jl#36.

Maybe it would make sense to have a method that accepts (Abstract)Matrix objects, and takes care of the wrapping/unwrapping.

Missing functions and calculus rules

This is a list of (some of the) currently missing functions and calculus rules for proximal mappings, with references. Please comment to add more to the list.

Functions

Calculus rules

Composition of even, smooth, convex function and distance from convex set, see Bauschke and Combettes, 2nd ed, Proposition 24.27
Prox of Moreau envelope, see Beck, Theorem 6.63
(#88) Epicomposition, see epicompose.jl
Spectral functions, see Beck, Section 7
- Schatten p-norms
- Ky Fan k-norms

Notes

Some rules require functions to be even, so adding an is_even property may be convenient.

Compatibility with 0.7/1.0

Currently tests fail on nightly. Since feature freeze for Julia 1.0 is about to hit, it will be soon time to start making the package forward-compatible. We can use this space to list criticalities and discussions on that regards.

Update - 25/03/2018

I'm currently going through this. General things found so far:

Package specific issues:

dspev! and dspevV! are broken

Lightweight “base” package with core definitions

Similar to other packages, we could have a very lightweight “base” package containing the core definitions (like prox and prox!, maybe some calculus rules would fit there too), so that other packages like ProximalAlgorithms or SeparableOptimization do not need to take the full dependency on ProximalOperators, but can just depend “on the API”. This would increase decoupling between packages: anyone bringing their own operators could directly use available algorithms.

Some things to figure out:

What should fit into the package? Definitely prox and prox! but other things like function conjugation would be useful I think
What would be a good name? ProximalBase is taken. Maybe ProximalCore?

Strange errors in test Julia 1.1

There are some strange errors in the tests. See Travis line 192. These do not appear in Julia nightly so it's possible it's some Julia issue.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Sums of norms

Hi Lorenzo,

I wanted to follow up on this comment: #124 (comment)

I think these weighted sums of norms can be useful for Group/Fused Lasso type problems. For the Group Lasso, one would need to think about how the group indices are specified and then you can simply execute the prox of the sum on each subvector.

I could try it out if you guys are interested in it.

Fix docs deployment

Documentation stopped being pushed a couple of releases ago. I suspect this could be the culprit.

Apart from this, the docs/travis configuration should probably be updated.

More Indicator sets

We should try to add all the indicator sets from MathProgBase. The current status is

:Free (implemented by IndBox(-Inf, Inf))
:Zero (implemented by IndBox(0, 0))
:NonNeg (implemented by IndBox(0, Inf))
:NonPos (implemented by IndBox(-Inf, 0))
:SOC - IndSOC
:SOCRotated (Should not be tricky)
:SDP - IndPSD (Can we have a compatible version for the vector form? Should we change the name to IndSDP?)
:ExpPrimal
:ExpDual (be be implemented using ExpPrimal)

It is possible that we could get slightly better performance by special versions of the first 4.
For now we could also simple introduce some aliases for the first 4 and worry about special implementations later.

GPU support via CuArrays.jl

Thanks for the nice package. I think it would be better to have GPU support (along with multi-thread evaluation for CPU in #82).

Edited by @lostella on Dec 7th, 2019

Note similar optimizations could be used in the computation of f and gradient, besides prox.

Is ProximableFunction no longer supported?

The following no longer works..
struct f <: ProximableFunction end

Is ProximableFunction replaced by something else?

What's the idiomatic way to express f(x) = f1(x) + f2(x), and get f's gradient?

I'm assuming f1 and f2 both have gradients.

SlicedSeparableSum will compute the function (just provide (f1, f2) as the first arguments, and each slice consists of the whole vector). However, it's unable to compute the gradient (which I'm assuming is reasonable in the most general case of slices, even if all input functions have gradients?).

By looking at some of the examples I also came up with a scheme involving just SeparableSum and VCAT, but that seems fairly ugly. Is there a nicer way to just compute the sum of two functions on the same input?

Whatever the answer is, I am happy to try to add it to the demos or documentation if you think it would be helpful.

Projection on Stiefel manifold

Have you guys ever considered implementing the Euclidean projection on the Stiefel manifold? See for instance, Section IV in https://people.eng.unimelb.edu.au/jmanton/pdf/Manton_Opt_Alg.pdf

Orthogonality constraints are very useful in low rank approximation and many many other applications. It would be nice to see how your (our) algorithms compare, for instance, with manifold optimization methods which are typically used to solve such problems.

Issue with views to cg!

There is an issue on this line
error message below

MethodError: Cannot `convert` an object of type Array{Float64,1} to an object of type SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}
Closest candidates are:
  convert(::Type{T<:AbstractArray}, !Matched::T<:AbstractArray) where T<:AbstractArray at abstractarray.jl:14
  convert(::Type{T<:AbstractArray}, !Matched::Factorization) where T<:AbstractArray at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/LinearAlgebra/src/factorization.jl:46
  convert(::Type{T}, !Matched::T) where T at essentials.jl:154
  ...
IterativeSolvers.CGStateVariables{Float64,SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}}(::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}) at cg.jl:111
cg!(::SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}, ::ProximalOperators.Shift{Array{Float64,2},Float64}, ::Array{Float64,1}) at cg.jl:210
prox!(::SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}, ::ProximalOperators.QuadraticIterative{Float64,Array{Float64,2},Array{Float64,1}}, ::SubArray{Float64,1,Array{Float64,1},Tuple{UnitRange{Int64}},true}, ::Float64) at quadraticIterative.jl:29
prox!(::Array{Float64,1}, ::SlicedSeparableSum{Tuple{Array{ProximalOperators.QuadraticIterative{Float64,Array{Float64,2},Array{Float64,1}},1}},Array{Array{Tuple{UnitRange{Int64}},1},1},1}, ::Array{Float64,1}, ::Float64) at slicedSeparableSum.jl:84
#fit_statespace_admm!#136(::Int64, ::Int64, ::Bool, ::Float64, ::Int64, ::Bool, ::Nothing, ::Float64, ::Int64, ::Float64, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(LTVModels.fit_statespace_admm!), ::LTVModelsBase.SimpleLTVModel{Float64}, ::Array{Float64,2}, ::Array{Float64,2}, ::Float64) at statespace_fit.jl:352
(::getfield(LTVModels, Symbol("#kw##fit_statespace_admm!")))(::NamedTuple{(:extend, :zeroinit, :iters, :D, :tol, :ridge),Tuple{Bool,Bool,Int64,Int64,Float64,Int64}}, ::typeof(LTVModels.fit_statespace_admm!), ::LTVModelsBase.SimpleLTVModel{Float64}, ::Array{Float64,2}, ::Array{Float64,2}, ::Float64) at none:0
#fit_statespace_admm#29(::Symbol, ::Bool, ::Bool, ::Base.Iterators.Pairs{Symbol,Real,NTuple{4,Symbol},NamedTuple{(:iters, :D, :tol, :ridge),Tuple{Int64,Int64,Float64,Int64}}}, ::Function, ::Array{Float64,2}, ::Array{Float64,2}, ::Float64) at statespace_fit.jl:266
(::getfield(LTVModels, Symbol("#kw##fit_statespace_admm")))(::NamedTuple{(:extend, :iters, :D, :zeroinit, :tol, :ridge),Tuple{Bool,Int64,Int64,Bool,Float64,Int64}}, ::typeof(LTVModels.fit_statespace_admm), ::Array{Float64,2}, ::Array{Float64,2}, ::Float64) at none:0
(::getfield(Main, Symbol("##374#375")))(::Float64) at sa.jl:9
iterate at generator.jl:47 [inlined]
_collect(::Array{Float64,1}, ::Base.Generator{Array{Float64,1},getfield(Main, Symbol("##374#375"))}, ::Base.EltypeUnknown, ::Base.HasShape{1}) at array.jl:632
collect_similar(::Array{Float64,1}, ::Base.Generator{Array{Float64,1},getfield(Main, Symbol("##374#375"))}) at array.jl:561
map(::Function, ::Array{Float64,1}) at abstractarray.jl:1987
top-level scope at none:0
include_string(::Module, ::String, ::String) at loading.jl:1005
include_string(::Module, ::String, ::String, ::Int64) at eval.jl:30
(::getfield(Atom, Symbol("##114#119")){String,Int64,String})() at eval.jl:94
withpath(::getfield(Atom, Symbol("##114#119")){String,Int64,String}, ::String) at utils.jl:30
withpath at eval.jl:46 [inlined]
#113 at eval.jl:93 [inlined]
with_logstate(::getfield(Atom, Symbol("##113#118")){String,Int64,String}, ::Base.CoreLogging.LogState) at logging.jl:397
with_logger at logging.jl:493 [inlined]
#112 at eval.jl:92 [inlined]
hideprompt(::getfield(Atom, Symbol("##112#117")){String,Int64,String}) at repl.jl:85
macro expansion at eval.jl:91 [inlined]
macro expansion at dynamic.jl:24 [inlined]
(::getfield(Atom, Symbol("##111#116")))(::Dict{String,Any}) at eval.jl:86
handlemsg(::Dict{String,Any}, ::Dict{String,Any}) at comm.jl:164
(::getfield(Atom, Symbol("##19#21")){Array{Any,1}})() at task.jl:259

Gradients

We seem to be missing most of the implementations of gradient/gradient!. Was this intentional or should I start adding them?
Should we define gradients even for functions that are not differentiable everywhere by returning one of the subgradients?
For example NormL1 could return simply 0 vector at 0.

It would be nice with a short command like
∇(f,x) = gradient(f,x) and
∇!(y,f,x) = gradient!(y,f,x)
any objections to me adding that @lostella ?
One could also use ∂(f,x) to signify that its part of the subdifferential, but since we would only return a set, and not a point, that could be confusing as well.

Completed gradients (in PRs) out of non-Indicator functions that are either: convex (i.e. suggradients exist) or differentiable

precompilation failing in Julia 1.1

Seems like there ProximalOperators is not passing test in Julia 1.1 with quite weird errors:

ERROR: LoadError: LoadError: StackOverflowError:

I did some investigation by setting pre-compilation off by typing

__precompile__(false)

in the main module.

What I got is a strange error in IndFree, specifically in the call:

prox!(y::AbstractArray{T}, f::IndFree, x::AbstractArray{T}, gamma::AbstractArray{T}) where {R, T <: RealOrComplex{R}} = prox!(y, f, x)

and in LeastSquaresDirect in the function:

function solve_step!(y::AbstractVector{D}, f::LeastSquaresDirect{R, RC, M, V, F}, x::AbstractVector{D}, gamma::R) where {R, RC, M, V, F <: SuiteSparse.CHOLMOD.Factor{RC}, D <: RealOrComplex{R}}

I didn't investigate further but leave these notes which are hopefully useful to solve this issue.
Errors seem related to typing/subtyping stuff.

Project on operator graph and IndAffine row rank

Hello, @lostella !
I was looking for the reliable library for proximal operators evalutation on github and I liked this one because of the general architecture and implementation (👍🏻).

However, what I am missing was the projection to an operator graph case {(x,y) | Ax=y}. Since it also could be done via Affine projection - I decided to look through the code and discovered one interesting feature.
The implementation requires full rank matrices (so the QR method for Pseudo-inverse might be applied) which... might be my case.

So, I have the following questions:

was there any reason to do so (I mean deliberately prohibit) or it's just "not implemented yet" for the case m<n. In the least case I would try to participate and add this functionality (not sure it would be possible, but I'll try). Even SVD method might be suitable (since pinv could be stored inside the struct/immutable).
what do you think about adding the graph project case. It could be done through affine projection, but as far as I know it has it's own heuristics for efficient calculations.

Inner constructors

To suppress all warnings on julia 0.6 we need to fix the inner constructors. The commit 9edfe1a follows the recommended way from

type Foo{T}
    x::T
   Foo(x)= new(x)
end

type Foo{T}
    x::T
   Foo{T}(x) where {T} = new(x)
end

which is not compatible with julia 0.5.
There seems to be a way to do it compatible with 0.5:

type Foo{T}
    x::T
    (::Type{Foo{T}}){T}(x::T) = new{T}(x)
end

maybe we should use this instead?
I am not able to understand why the backwards compatible way is not the recommended way.

Multi-threaded operators evaluation

It would be nice to add multi-threading to some of the operators e.g. using @threads, see https://docs.julialang.org/en/v1/base/multi-threading/index.html#Base.Threads.@threads

Moreover, Julia 1.3 will come with composable multi-threading which should allow for more flexibility using @spawn, see https://julialang.org/blog/2019/07/multithreading

Completing Precompose

The Precompose calculus rule needs to be completed. Given f and a linear mapping A, we want to be able to compute the prox! of x -> g(x) = f(Ax).

As of now we are able to handle the case where f is precomposed with a diagonal mapping. In this case A is represented by a:

a is a Real: then the mapping is a multiple of the identity, in which case the rule can be applied regardless of the structure of f;
a is an Array: then a represents the mapping's diagonal, and the rule can be applied provided that f is separable (actually, if all elements of a are equal then this is the same of case 1., but we don't check that now); separability of f is detected by the presence of prox!(f, x::Array, y::Array, gamma::Array), i.e., the possibility of specifying a different stepsize for each coordinate. See for example how the hinge loss is defined in terms of the sum-of-positive.

This structure presents a few drawbacks. I can think of:

Having a call with gamma::Array as fourth argument may create ambiguities: what if one wants to call the three-arguments prox!? Then prox!(f, x::Array, gamma::Array) clashes with prox!(f, x::Array, y::Array, gamma::Real=1.0). Remedy: put the stepsize gamma as third argument, and the target (output) y::Array as fourth. This is backward incompatible but better now than never, and the more I think of it the more I'm convinced it's the right signature for prox!.
The linear mapping shouldn't need to be diagonal at all to apply this rule: it suffices that AA' (where A' is the adjoint mapping) is diagonal, conformably with the separability structure of f (constant diagonal always works, as stated for example in Bauschke, Combettes, Prop. 23.32; any diagonal if f is separable, I don't have a reference for this but I'm sure it's easy to write down). For example, A may be a rotation. How should this case be accessed? Calling precompose(f, A) where A is a Matrix may not do the job, because if f is function defined over some space of matrices, than such an A may very well represent the diagonal mapping X -> A.*X over that space.

The first issue is quite urgent I believe. For the second, there's room for discussion and experiments.

Whatever way is taken, completing the support for precomposition must go through:

Have all separable functions support the call to prox! with gamma::Array.

Return types should agree with inputs?

We currently have a strange contract regarding return types when calling functions, gradient, and prox, which should probably simplified.

versioninfo():

Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

If one does

using ProximalOperators
f = SqrNormL2(Float64(1.0))
x = randn(Float32, 10)

then

julia> typeof(f(x))
Float64

julia> g, v = gradient(f, x); typeof(v)
Float64

julia> p, v = prox(f, x); typeof(v)
Float64

The reason is that the default coefficient in SqrNormL2 is Float64(1.0). Things are potentially more explicit and predictable if function values are forced to be the same as real(eltype(x)) (where real secures the case where eltype(x) <: Complex).

It would be nice to sort this out and make sure tests cover this. And also document this!

Projection on polyhedral set

It would be nice to have it there. I mean for general polyhedral sets
lb<=A*x<=ub
xmin<=x<=xmax

It could either be implemented by calling an external QP solver or solve the dual using e.g., PANOC.
Also there should be an option to warm-start the whole thing.

Package name

I made a PR to JuliaLang/METADATA.jl in order to register the package, and the issue of the package name was raised. You can see the PR here.

Personally I don't dislike Prox.jl, but I understand that it probably speaks for itself only to someone who writes "\prox" or "\mbox{prox}" a lot in their papers. I guess the following are viable options:

Proximal.jl
ProximalOperators.jl

I like option 2. better: it is on the same line as LinearOperators.jl. Also, in option 1., "proximal" what? I understand that as far as packages names are concerned, brevity is discouraged in favor of clarity. What do you think?

Error tagging new release

The REQUIRE file could not be found.
cc: @lostella

De-randomized tests

As noted several times, code coverage bounces up and down due to randomized tests. One way of avoiding this is to hardcode all cases to be tested: time consuming, but allows to cover all borderline cases. A simpler way would be to initialize the RNG seed somewhere. Definitely not a major issue, but also serves as reminder that we should put order to tests at some point.

Set at least the RNG seed
De-randomize tests

nonconvex prox calculus rule

Is there support for computing the prox of the pointwise minimum of proximable functions?
Suppose
f=min{f_1,...,f_N}
Then
prox_{γf}(x)=prox_{γf_i*}(x)
where
i*\in\argmin{f_1^γ(x),...,f_N^γ(x)}
and f_i^γ is the Moreau envelope of f_i.

This is very useful. For instance it can be used to compute the projection onto union of sets. One particular example I'm interested in is C={x| |a'*x|=b}={x| a'*x=b}\cup{x| a'*x=-b}.

Are you planning to include this anytime soon?

possible test failure in upcoming Julia version 1.5

A PkgEval run for a PR which changes the generated numbers for randn! indicates that the tests of this package might fail in Julia 1.5 (and on Julia current master). Apologies if this is a false positive.
cf. https://github.com/JuliaCI/NanosoldierReports/blob/7de24e455342298cbef56826b5827f0d7640d2c1/pkgeval/by_hash/b89e35c_vs_098ef24/logs/ProximalOperators/1.5.0-DEV-71a4a114c2.log

Installation problems on Julia 1.0.1

I have problems installing ProximalOperators in Julia 1.0.1 using Ubuntu 17.10. i first use:

(v1.0) pkg> add ProximalOperators

which works fine. Then

julia> using ProximalOperators

which throws

ERROR: LoadError: OSQP not properly installed. Please run Pkg.build("OSQP")

so I try

using Pkg

and get

julia> Pkg.build("OSQP")
  Building OSQP → `~/.julia/packages/OSQP/bNLh0/deps/build.log`
┌ Error: Error building `OSQP`:
│ [ Info: Attempting to create directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/downloads
│ [ Info: Directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/downloads already exists
│ [ Info: Downloading file https://dl.bintray.com/bstellato/generic/OSQP/0.4.1/osqp-0.4.1.tar.gz
│ [ Info: Done downloading file https://dl.bintray.com/bstellato/generic/OSQP/0.4.1/osqp-0.4.1.tar.gz
│ [ Info: Attempting to create directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/src
│ [ Info: Directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/src already exists
│ [ Info: Attempting to create directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps
│ [ Info: Directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps already exists
│ [ Info: Path /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/src/osqp-0.4.1 already exists
│ [ Info: Attempting to create directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/usr/lib
│ [ Info: Directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/usr/lib already exists
│ [ Info: Attempting to create directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/src/osqp-0.4.1/build
│ [ Info: Directory /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/src/osqp-0.4.1/build already exists
│ [ Info: Changing directory to /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/src/osqp-0.4.1/build
│ ERROR: LoadError: IOError: could not spawn `cmake -G 'Unix Makefiles' ..`: no such file or directory (ENOENT)
│ Stacktrace:
│  [1] _jl_spawn(::String, ::Array{String,1}, ::Cmd, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:367
│  [2] (::getfield(Base, Symbol("##494#495")){Cmd})(::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:509
│  [3] setup_stdio(::getfield(Base, Symbol("##494#495")){Cmd}, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:490
│  [4] #_spawn#493(::Nothing, ::Function, ::Cmd, ::Tuple{RawFD,RawFD,RawFD}) at ./process.jl:508
│  [5] _spawn at ./process.jl:504 [inlined]
│  [6] #run#504(::Bool, ::Function, ::Cmd) at ./process.jl:662
│  [7] run(::Cmd) at ./process.jl:661
│  [8] macro expansion at ./logging.jl:308 [inlined]
│  [9] run(::BinDeps.SynchronousStepCollection) at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/BinDeps.jl:518
│  [10] run(::FileRule) at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/BinDeps.jl:483
│  [11] macro expansion at ./logging.jl:308 [inlined]
│  [12] run(::BinDeps.SynchronousStepCollection) at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/BinDeps.jl:518
│  [13] macro expansion at ./logging.jl:308 [inlined]
│  [14] run(::BinDeps.SynchronousStepCollection) at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/BinDeps.jl:518
│  [15] satisfy!(::BinDeps.LibraryDependency, ::Array{DataType,1}) at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/dependencies.jl:944
│  [16] satisfy!(::BinDeps.LibraryDependency) at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/dependencies.jl:922
│  [17] top-level scope at /home/pawn0002/.julia/packages/BinDeps/ZEval/src/dependencies.jl:977
│  [18] include at ./boot.jl:317 [inlined]
│  [19] include_relative(::Module, ::String) at ./loading.jl:1041
│  [20] include(::Module, ::String) at ./sysimg.jl:29
│  [21] include(::String) at ./client.jl:388
│  [22] top-level scope at none:0
│ in expression starting at /home/pawn0002/.julia/packages/OSQP/bNLh0/deps/build.jl:62
└ @ Pkg.Operations /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Pkg/src/Operations.jl:1069

In-place Cholesky (LDLt) solve

I implemented a wapper to do in-place solves for sparse Cholesky factorizations (i.e A_ldiv_B!) in https://github.com/mfalt/CholmodSolve2.jl (which is now registered in METADATA)
This could be be used in LeastSquaresDirect here: https://github.com/kul-forbes/ProximalOperators.jl/blob/07df4b3b391d5dac71cfde61470e8919f090baec/src/functions/leastSquaresDirect.jl#L103
and probably here: https://github.com/kul-forbes/ProximalOperators.jl/blob/07df4b3b391d5dac71cfde61470e8919f090baec/src/functions/leastSquaresDirect.jl#L107
to avoid allocations completely (well O(1) allocations even on multiple calls).
The same is true for QuadraticDirect. Are there any other places in the code where we use sparse Cholesky?
EDIT: I guess IndGraphSparse

It seems like epicompose.jl, ~~indGraphFat.jl and indGraphSkinny.jl only has dense implementations, did we have a reason for this?~~ could also have a sparse implementation.

Using predicates instead of types

Types in Julia are great. But types in Julia should not be overused. As pointed out by @mfalt, it is rather complicated to encode properties of ProximableFunction objects using Julia types: the simple fact that f is ProximableConvex, for example, is lost when constructing Postcompose(f). What makes more sense, probably, are predicates of the form is_convex(f), is_cone(f) and so on, that inspect object f. See #14 for a related discussion.

Improve pretty printing

The following section of the documentation explains how to have separate behaviors for pretty printing objects and collections of objects. This is an issue in ProximalOperators, since pretty printing is very verbose and a tuple of proximable functions is displayed in a rather unreadable way: this makes me personally prefer not having any custom pretty printing at all, so I believe it could use some improvement.

https://docs.julialang.org/en/v1/manual/types/#man-custom-pretty-printing-1

simplify show implementation (currently too verbose, with too many unmanageable details, like the domain)
make sure a “compact” version is displayed when collections are displayed
implement pretty printing for the types which are missing it

Faster implementations

As we discussed, there are several ways to speed up the implementations of the prox operators. I did some tests that may be useful in rewriting them.
These are the results for the NormL1Prox, for 100 prox operations on a vector of size 1000000.
Standard implementation

function prox(f::NormL1{Float64}, x::RealOrComplexArray, gamma::Float64=1.0)
  y = sign(x).*max(0.0, abs(x)-gamma*f.lambda)
  return y
end

In-place update to avoid big allocations:

function prox1!(f::NormL1{Float64}, x::RealOrComplexArray, gamma::Float64=1.0)
  for i in eachindex(x)
    x[i] = sign(x[i]).*max(0.0, abs(x[i])-gamma*f.lambda)
  end
  return x
end

Avoid bounds-checks and use simd:

function prox2!(f::NormL1{Float64}, x::RealOrComplexArray, gamma::Float64=1.0)
  gf = gamma*f.lambda
  @inbounds @simd for i in eachindex(x)
   x[i] = sign(x[i]).*max(0.0, abs(x[i])-gf)
  end
  return x
end

Avoid function calls and all but one multiplication:

function prox3!(f::NormL1{Float64}, x::RealOrComplexArray, gamma::Float64=1.0)
  gf = gamma*f.lambda
  @inbounds @simd for i in eachindex(x)
    x[i] += (x[i] <= -gf ? gf : (x[i] >= gf ? -gf : -x[i]))
  end
  return x
end

Use ifelse to ensure that the compiler is not worried about taking both branches

function prox4!(f::NormL1{Float64}, x::RealOrComplexArray, gamma::Float64=1.0)
  gf = gamma*f.lambda
  @inbounds @simd for i in eachindex(x)
    x[i] += ifelse(x[i] <= -gf, gf , ifelse(x[i] >= gf , -gf , -x[i]))
  end
  return x
end

Results:
prox: 1.557599 seconds (1.70 k allocations: 3.725 GB, 8.86% gc time)
prox1!: 0.235333 seconds (100 allocations: 1.563 KB) (6.5x speedup)
prox2!: 0.085790 seconds (100 allocations: 1.563 KB) (2.7 x speedup)
prox3!: 0.080576 seconds (100 allocations: 1.563 KB) (1.05x speedup) (total 19x speedup)
prox4!: 0.074251 seconds (100 allocations: 1.563 KB) (1.08x speedup) (total 21x speedup)

Type instability

There is some bad type instability in some of the functions, for example IndAffine

immutable IndAffine{T <: RealOrComplex} <: IndicatorConvex
  A::AbstractArray{T,2}
  b::AbstractArray{T,1}
  R::AbstractArray{T,2}
  ....

since there is no information on the type arrays, any use of these will result in type instability.
With julia 0.6 we are able to do diagonal dispatch

immutable IndAffine{R <: RealOrComplex, M<:AbstractArray{R,1}, V<:AbstractArray{R,1}} <: IndicatorConvex
  A::M
  b::V
  R::M
  ....

Here assuming that A and R will be of the same array type. If we want b to be of the same type we could enforce this in the constructor without any penalty.
I can create a PR for this unless anyone have any issues with it.

TV norm prox

Have you included the prox of the total variation norm, i.e. sum_i |x_i - x_{i-1}|?

I have implemented Condats algorithm in order to compute this (https://lcondat.github.io/publis/Condat-fast_TV-SPL-2013.pdf).
I could share my code.

Cheers!

Precompilation error

I am getting following error message while trying to run using ProximalOperators in Julia 1.1

[ Info: Precompiling ProximalOperators [a725b495-10eb-56fe-b38b-717eba820537]
ERROR: LoadError: LoadError: StackOverflowError:
Stacktrace:
[1] top-level scope at none:0
[2] include at ./boot.jl:326 [inlined]
[3] include_relative(::Module, ::String) at ./loading.jl:1038
[4] include at ./sysimg.jl:29 [inlined]
[5] include(::String) at /nethome/ssharan31/.julia/packages/ProximalOperators/zHDx9/src/ProximalOperators.jl:3
[6] top-level scope at none:0
[7] include at ./boot.jl:326 [inlined]
[8] include_relative(::Module, ::String) at ./loading.jl:1038
[9] include(::Module, ::String) at ./sysimg.jl:29
[10] top-level scope at none:2
[11] eval at ./boot.jl:328 [inlined]
[12] eval(::Expr) at ./client.jl:404
[13] top-level scope at ./none:3
in expression starting at /nethome/ssharan31/.julia/packages/ProximalOperators/zHDx9/src/functions/indFree.jl:33
in expression starting at /nethome/ssharan31/.julia/packages/ProximalOperators/zHDx9/src/ProximalOperators.jl:39
ERROR: Failed to precompile ProximalOperators [a725b495-10eb-56fe-b38b-717eba820537] to /nethome/ssharan31/.julia/compiled/v1.1/ProximalOperators/ez37h.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] compilecache(::Base.PkgId, ::String) at ./loading.jl:1197
[3] _require(::Base.PkgId) at ./loading.jl:960
[4] require(::Base.PkgId) at ./loading.jl:858
[5] require(::Module, ::Symbol) at ./loading.jl:853

Please suggest how can I fix it. Thanks

Replace exponential cone projection?

There's a new, robust implementation of the projection onto the exponential cone detailed in this paper by Henrik Friberg.

This is implemented in MathOptSetDistances.jl. Happy to port over.
https://github.com/matbesancon/MathOptSetDistances.jl/blob/1f3e578398ba91fbbcaf805aaf55f9105d858a40/src/projections.jl#L170

Relevant section of code:

ProximalOperators.jl/src/functions/indExp.jl

Line 79 in 7889da3

 function prox!(y::AbstractVector{R}, f::IndExpPrimal, x::AbstractVector{R}, gamma::R=R(1)) where R <: Real 

Polishing docstrings

Docstrings could be polished by using the @doc macro with raw"..." strings, see https://juliadocs.github.io/Documenter.jl/stable/man/latex/#Escaping-Characters-in-Docstrings-1. This would get rid of the current abundance of backslashes due to string escaping.

Prox of Sum

I have a sum of multiple functions (each is a LeastSquare):

f = Sum(f1, f2, f3)

f can be evaluated but prox(f, x) gives an error. Apparently, prox is not (yet) defined for Sum. Is that correct? If it is, any idea on when it will be added?

Thanks a lot.

Non convex soft thresholding operator

In the demo examples of the Lasso there is the ordinary soft thresholding operator defined as S = sign max{0,|x|-alpha}, which in the ADMM version translates to prox!(z, g, x + u, gam). In eq. (13) of Chatrand and Wohlberg (https://www.researchgate.net/publication/255907067_A_Nonconvex_Admm_Algorithm_for_Group_Sparsity_With_Sparse_Groups) they suggest a more flexible soft thresholding function of the form S = sign max{0,|x|-alpha^(2-p)|x|^(p-1)} ,where p is set to 1(the ordinary Lasso) , 0.5, 0 or - 0.5. My question is if there is some way to implent this soft thresholding using the tools available, or is some new norm function needed?