Git Product home page Git Product logo

diffcp's Introduction

Build Status

diffcp

diffcp is a Python package for computing the derivative of a convex cone program, with respect to its problem data. The derivative is implemented as an abstract linear map, with methods for its forward application and its adjoint.

The implementation is based on the calculations in our paper Differentiating through a cone program.

Installation

diffcp is available on PyPI, as a source distribution. Install it with

pip install diffcp

You will need a C++11-capable compiler to build diffcp.

diffcp requires:

diffcp uses Eigen; Eigen operations can be automatically vectorized by compilers. To enable vectorization, install with

MARCH_NATIVE=1 pip install diffcp

OpenMP can be enabled by passing extra arguments to your compiler. For example, on linux, you can tell gcc to activate the OpenMP extension by specifying the flag "-fopenmp":

OPENMP_FLAG="-fopenmp" pip install diffcp

To enable both vectorization and OpenMP (on linux), use

MARCH_NATIVE=1 OPENMP_FLAG="-fopenmp" pip install diffcp

Cone programs

diffcp differentiates through a primal-dual cone program pair. The primal problem must be expressed as

minimize        c'x
subject to      Ax + s = b
                s in K

where x and s are variables, A, b and c are the user-supplied problem data, and K is a user-defined convex cone. The corresponding dual problem is

minimize        b'y
subject to      A'y + c == 0
                y in K^*

with dual variable y.

Usage

diffcp exposes the function

solve_and_derivative(A, b, c, cone_dict, warm_start=None, solver=None, **kwargs).

This function returns a primal-dual solution x, y, and s, along with functions for evaluating the derivative and its adjoint (transpose). These functions respectively compute right and left multiplication of the derivative of the solution map at A, b, and c by a vector. The solver argument determines which solver to use; the available solvers are solver="SCS", solver="ECOS", and solver="Clarabel". If no solver is specified, diffcp will choose the solver itself. In the case that the problem is not solved, i.e. the solver fails for some reason, we will raise a SolverError Exception.

Arguments

The arguments A, b, and c correspond to the problem data of a cone program.

  • A must be a SciPy sparse CSC matrix.
  • b and c must be NumPy arrays.
  • cone_dict is a dictionary that defines the convex cone K.
  • warm_start is an optional tuple (x, y, s) at which to warm-start. (Note: this is only available for the SCS solver).
  • **kwargs are keyword arguments to forward to the solver (e.g., verbose=False).

These inputs must conform to the SCS convention for problem data. The keys in cone_dict correspond to the cones, with

  • diffcp.ZERO for the zero cone,
  • diffcp.POS for the positive orthant,
  • diffcp.SOC for a product of SOC cones,
  • diffcp.PSD for a product of PSD cones, and
  • diffcp.EXP for a product of exponential cones.

The values in cone_dict denote the sizes of each cone; the values of diffcp.SOC, diffcp.PSD, and diffcp.EXP should be lists. The order of the rows of A must match the ordering of the cones given above. For more details, consult the SCS documentation.

Return value

The function solve_and_derivative returns a tuple

(x, y, s, derivative, adjoint_derivative)
  • x, y, and s are a primal-dual solution.

  • derivative is a function that applies the derivative at (A, b, c) to perturbations dA, db, dc. It has the signature derivative(dA, db, dc) -> dx, dy, ds, where dA is a SciPy sparse CSC matrix with the same sparsity pattern as A, and db and dc are NumPy arrays. dx, dy, and ds are NumPy arrays, approximating the change in the primal-dual solution due to the perturbation.

  • adjoint_derivative is a function that applies the adjoint of the derivative to perturbations dx, dy, ds. It has the signature adjoint_derivative(dx, dy, ds) -> dA, db, dc, where dx, dy, and ds are NumPy arrays.

Example

import numpy as np
from scipy import sparse

import diffcp

cone_dict = {
    diffcp.ZERO: 3,
    diffcp.POS: 3,
    diffcp.SOC: [5]
}

m = 3 + 3 + 5
n = 5

A, b, c = diffcp.utils.random_cone_prog(m, n, cone_dict)
x, y, s, D, DT = diffcp.solve_and_derivative(A, b, c, cone_dict)

# evaluate the derivative
nonzeros = A.nonzero()
data = 1e-4 * np.random.randn(A.size)
dA = sparse.csc_matrix((data, nonzeros), shape=A.shape)
db = 1e-4 * np.random.randn(m)
dc = 1e-4 * np.random.randn(n)
dx, dy, ds = D(dA, db, dc)

# evaluate the adjoint of the derivative
dx = c
dy = np.zeros(m)
ds = np.zeros(m)
dA, db, dc = DT(dx, dy, ds)

For more examples, including the SDP example described in the paper, see the examples directory.

Citing

If you wish to cite diffcp, please use the following BibTex:

@article{diffcp2019,
    author       = {Agrawal, A. and Barratt, S. and Boyd, S. and Busseti, E. and Moursi, W.},
    title        = {Differentiating through a Cone Program},
    journal      = {Journal of Applied and Numerical Optimization},
    year         = {2019},
    volume       = {1},
    number       = {2},
    pages        = {107--115},
}

@misc{diffcp,
    author       = {Agrawal, A. and Barratt, S. and Boyd, S. and Busseti, E. and Moursi, W.},
    title        = {{diffcp}: differentiating through a cone program, version 1.0},
    howpublished = {\url{https://github.com/cvxgrp/diffcp}},
    year         = 2019
}

The following thesis concurrently derived the mathematics behind differentiating cone programs.

@phdthesis{amos2019differentiable,
  author       = {Brandon Amos},
  title        = {{Differentiable Optimization-Based Modeling for Machine Learning}},
  school       = {Carnegie Mellon University},
  year         = 2019,
  month        = May,
}

diffcp's People

Contributors

akshayka avatar angeris avatar axelbreuer avatar bstellato avatar phschiele avatar roberthuisman avatar sbarratt avatar stevediamond avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

diffcp's Issues

Solver scs returned status Unbounded/Inaccurate

I occasionally face the following error when using the package for different matrices A and b. It does not happen permanently, but I have no idea when it happens. It especially happens when I set sigma_value in the following code to less than 1e-4

`

def tf_solver(A: np.ndarray, b: np.ndarray, sigma_value: float):

    A = tf.constant(A)
    b = tf.constant(b)
    x_tf = cp.Variable((A.shape[1]))
    A_tf = cp.Parameter(A.shape)
    b_tf = cp.Parameter((b.shape[0]))

    constraints = [cp.norm(A_tf @ x_tf - b_tf, p=2) <= sigma_value]
    objective = cp.Minimize(cp.norm(x_tf, p=1))
    problem = cp.Problem(objective, constraints)

    assert problem.is_dpp()

    cvxpylayer = cvxlayer(problem, parameters=[A_tf, b_tf], variables=[x_tf])

    with tf.GradientTape() as tape:
        x, = cvxpylayer(A, b)
        summed_solution = tf.math.reduce_sum(x)

    gradA, gradb = tape.gradient(summed_solution, [A, b])
    grad = [gradA, gradb]

    residue = tf.tensordot(A, x, 1) - b

    return x.numpy(), grad, residue.numpy()

`


69 cs_procedure
x, grad, residue = tf_solver(A=precond_bi_orthonormal_matrix, b=precond_func_sampling_vector, sigma_value=sigma_value)

csUtils.py 30 tf_solver
x, = cvxpylayer(A, b)

cvxpylayer.py 109 call
return compute(*parameters)

custom_gradient.py 258 call
return self._d(self._f, a, k)

custom_gradient.py 212 decorated
return _eager_mode_decorator(wrapped, args, kwargs)

custom_gradient.py 409 _eager_mode_decorator
result, grad_fn = f(*args, **kwargs)
cvxpylayer.py 108
lambda *parameters: self._compute(parameters, solver_args))

cvxpylayer.py 160 _compute
A=A, b=b, c=c, cone_dict=self.cones, **solver_args)

cone_program.py 212 solve_and_derivative
A, b, c, cone_dict, warm_start=warm_start, mode=mode, **kwargs)
cone_program.py 262 solve_and_derivative_internal
raise SolverError("Solver scs returned status %s" % status)
diffcp.cone_program.SolverError:
Solver scs returned status Unbounded/Inaccurate

Sign in adjoint derivative calculation

I've been following the paper "Differentiating Through a Cone Program" and the code side-by-side, and I'm having trouble figuring out if there is a sign error in the adjoint derivative code or if I've misunderstood something.

dw = -(x @ dx + y @ dy + s @ ds)
dz = np.concatenate(
[dx, D_proj_dual_cone.rmatvec(dy + ds) - ds, np.array([dw])])
if np.allclose(dz, 0):
r = np.zeros(dz.shape)
elif mode == "dense":
r = _diffcp._solve_adjoint_derivative_dense(M, MT, dz)
else:
r = _diffcp.lsqr(MT, dz).solution
values = pi_z[cols] * r[rows + n] - pi_z[n + rows] * r[cols]
dA = sparse.csc_matrix((values, (rows, cols)), shape=A.shape)
db = pi_z[n:n + m] * r[-1] - pi_z[-1] * r[n:n + m]
dc = pi_z[:n] * r[-1] - pi_z[-1] * r[:n]
return dA, db, dc

It seems like, when compared to the paper, the code solves M.T @ r = dz for r, whereas the paper solves M.T @ g = -dz for g. So r = -g. But then the equations used in the code to compute (dA, db, dc) seem to match those in the paper, when they should all differ by a negative sign.

Similarly, for the forward-mode derivative, you solve M @ dz = dQ @ pi_z for dz, use the same equations as in the paper despite the sign difference, but you multiply (dx, dy, dz) by -1 before returning, so this is fine.

Is this a sign error in the adjoint derivative, or did I get something wrong?

(Windows installation) Failed to build wheel for diffcp (setup.py) ... error

When I try to install diffcp in anaconda prompt, it fails to build wheel for diffcp.

The detailed error, as it outputs, is " TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType". I wonder the path here.

The output error is listed as below:

Building wheels for collected packages: diffcp
  Building wheel for diffcp (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: 'c:\www\anaconda3\envs\tensorflow20\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\LIFI\\AppData\\Local\\Temp\\pip-install-wa864jaq\\diffcp\\setup.py'"'"'; __file__='"'"'C:\\Users\\LIFI\\AppData\\Local\\Temp\\pip-install-wa864jaq\\diffcp\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\LIFI\AppData\Local\Temp\pip-wheel-41hdnwlc' --python-tag cp36
       cwd: C:\Users\LIFI\AppData\Local\Temp\pip-install-wa864jaq\diffcp\
  Complete output (59 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-3.6
  creating build\lib.win-amd64-3.6\diffcp
  copying diffcp\cones.py -> build\lib.win-amd64-3.6\diffcp
  copying diffcp\cone_program.py -> build\lib.win-amd64-3.6\diffcp
  copying diffcp\utils.py -> build\lib.win-amd64-3.6\diffcp
  copying diffcp\__init__.py -> build\lib.win-amd64-3.6\diffcp
  running build_ext
  building '_diffcp' extension
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "C:\Users\LIFI\AppData\Local\Temp\pip-install-wa864jaq\diffcp\setup.py", line 108, in <module>
      "Programming Language :: Python :: 3",
    File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\__init__.py", line 145, in setup
      return distutils.core.setup(**attrs)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\core.py", line 148, in setup
      dist.run_commands()
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 955, in run_commands
      self.run_command(cmd)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 974, in run_command
      cmd_obj.run()
    File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\wheel\bdist_wheel.py", line 192, in run
      self.run_command('build')
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 974, in run_command
      cmd_obj.run()
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build.py", line 135, in run
      self.run_command(cmd_name)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\cmd.py", line 313, in run_command
      self.distribution.run_command(command)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 974, in run_command
      cmd_obj.run()
    File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
      _build_ext.run(self)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 339, in run
      self.build_extensions()
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 448, in build_extensions
      self._build_extensions_serial()
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
      self.build_extension(ext)
    File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\command\build_ext.py", line 205, in build_extension
      _build_ext.build_extension(self, ext)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 533, in build_extension
      depends=ext.depends)
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\_msvccompiler.py", line 345, in compile
      self.initialize()
    File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\_msvccompiler.py", line 238, in initialize
      vc_env = _get_vc_env(plat_spec)
    File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\msvc.py", line 170, in msvc14_get_vc_env
      return EnvironmentInfo(plat_spec, vc_min_ver=14.0).return_env()
    File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\msvc.py", line 1619, in return_env
      if self.vs_ver >= 14 and isfile(self.VCRuntimeRedist):
    File "c:\www\anaconda3\envs\tensorflow20\lib\genericpath.py", line 30, in isfile
      st = os.stat(path)
  TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
  ----------------------------------------
  ERROR: Failed building wheel for diffcp
  Running setup.py clean for diffcp
Failed to build diffcp
Installing collected packages: diffcp
    Running setup.py install for diffcp ... error
    ERROR: Command errored out with exit status 1:
     command: 'c:\www\anaconda3\envs\tensorflow20\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\LIFI\\AppData\\Local\\Temp\\pip-install-wa864jaq\\diffcp\\setup.py'"'"'; __file__='"'"'C:\\Users\\LIFI\\AppData\\Local\\Temp\\pip-install-wa864jaq\\diffcp\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\LIFI\AppData\Local\Temp\pip-record-o3aral_e\install-record.txt' --single-version-externally-managed --compile
         cwd: C:\Users\LIFI\AppData\Local\Temp\pip-install-wa864jaq\diffcp\
    Complete output (61 lines):
    running install
    running build
    running build_py
    creating build
    creating build\lib.win-amd64-3.6
    creating build\lib.win-amd64-3.6\diffcp
    copying diffcp\cones.py -> build\lib.win-amd64-3.6\diffcp
    copying diffcp\cone_program.py -> build\lib.win-amd64-3.6\diffcp
    copying diffcp\utils.py -> build\lib.win-amd64-3.6\diffcp
    copying diffcp\__init__.py -> build\lib.win-amd64-3.6\diffcp
    running build_ext
    building '_diffcp' extension
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\LIFI\AppData\Local\Temp\pip-install-wa864jaq\diffcp\setup.py", line 108, in <module>
        "Programming Language :: Python :: 3",
      File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\__init__.py", line 145, in setup
        return distutils.core.setup(**attrs)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\core.py", line 148, in setup
        dist.run_commands()
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\command\install.py", line 61, in run
        return orig.install.run(self)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\install.py", line 545, in run
        self.run_command('build')
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build.py", line 135, in run
        self.run_command(cmd_name)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
        _build_ext.run(self)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 339, in run
        self.build_extensions()
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 448, in build_extensions
        self._build_extensions_serial()
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
        self.build_extension(ext)
      File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\command\build_ext.py", line 205, in build_extension
        _build_ext.build_extension(self, ext)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\command\build_ext.py", line 533, in build_extension
        depends=ext.depends)
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\_msvccompiler.py", line 345, in compile
        self.initialize()
      File "c:\www\anaconda3\envs\tensorflow20\lib\distutils\_msvccompiler.py", line 238, in initialize
        vc_env = _get_vc_env(plat_spec)
      File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\msvc.py", line 170, in msvc14_get_vc_env
        return EnvironmentInfo(plat_spec, vc_min_ver=14.0).return_env()
      File "c:\www\anaconda3\envs\tensorflow20\lib\site-packages\setuptools\msvc.py", line 1619, in return_env
        if self.vs_ver >= 14 and isfile(self.VCRuntimeRedist):
      File "c:\www\anaconda3\envs\tensorflow20\lib\genericpath.py", line 30, in isfile
        st = os.stat(path)
    TypeError: stat: path should be string, bytes, os.PathLike or integer, not NoneType
    ----------------------------------------
ERROR: Command errored out with exit status 1: 'c:\www\anaconda3\envs\tensorflow20\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\LIFI\\AppData\\Local\\Temp\\pip-install-wa864jaq\\diffcp\\setup.py'"'"'; __file__='"'"'C:\\Users\\LIFI\\AppData\\Local\\Temp\\pip-install-wa864jaq\\diffcp\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record 'C:\Users\LIFI\AppData\Local\Temp\pip-record-o3aral_e\install-record.txt' --single-version-externally-managed --compile Check the logs for full command output.

Support ECOS

It'd be great if diffcp supported the ECOS solver.

ECOS also uses a self-dual homogeneous embedding for cone programs. The details might be slightly different.

Inaccurate gradient estimation for log penalization

Hi,

I have reported an equivalent issue on cvxpylayers cvxgrp/cvxpylayers#135 but since cvxpylayers uses diffcp, I wanted to report this issue here as well.

I have built an even more simple example with inaccurate gradient estimation:

Let's suppose we want to maximize $J(x) = x + \lambda (\log(1-x) + \log(1+x))$ where $\lambda \in R^+ $.
The solution is $x^* = -\lambda + \sqrt{\lambda²+1}$
And $\frac{dx^*}{d\lambda} = -1 + \frac{\lambda}{\sqrt{\lambda²+1}}$
Now, let's calculate the sensitivity of $x^{*}$ with respect to $\lambda$, at $\lambda=1$ and $\delta\lambda=10^{-6}$, with diffcp:

import numpy as np
import cvxpy as cp
import diffcp.cone_program as cone_prog
import diffcp.utils as utils

# variables
x_ = cp.Variable()

# parameters
_lam = cp.Parameter(1, nonneg=True)
_lam.value = np.ones(1)

# objective
objective = cp.Maximize(x_ + _lam*cp.log(1+x_) + _lam*cp.log(1-x_))

problem = cp.Problem(objective)

A, b, c, cone_dims = utils.scs_data_from_cvxpy_problem(problem)

x, y, s, D, DT = cone_prog.solve_and_derivative(A,
                                                b,
                                                c,
                                                cone_dims,
                                                solve_method="ECOS",
                                                mode="dense",
                                                feastol=1e-10,
                                                abstol=1e-10,
                                                reltol=1e-10)

dlam = 1e-6
dA = utils.get_random_like(A, lambda n: np.zeros(n))
db = np.zeros(b.size)
dc = np.zeros(c.size)
dc[1] = dc[1] - dlam  # the minus sign stems from the fact that  c =  [-1., -1., -1.]
dc[2] = dc[2] - dlam  # the minus sign stems from the fact that  c =  [-1., -1., -1.]

dx, dy, ds = D(dA, db, dc)

x_pert, y_pert, s_pert, _, _ = cone_prog.solve_and_derivative(A + dA,
                                                              b + db,
                                                              c + dc,
                                                              cone_dims,
                                                              solve_method="ECOS",
                                                              mode="dense",
                                                              feastol=1e-10,
                                                              abstol=1e-10,
                                                              reltol=1e-10)

print(f"x_pert-x   = {x_pert-x}")
print(f"      dx   = {dx}")
analytical = -1+_lam.value/np.sqrt(_lam.value**2+1)
print(f"analytical = {analytical*dlam}")

and you get

x_pert-x   = [-2.93091918e-07 -2.07319449e-07  5.00378592e-07]
      dx   = [-2.61203874e-07 -1.84699034e-07  4.45902905e-07]
analytical = [-2.92893219e-07]

Where we see that $\frac{dx^*}{d\lambda} \delta\lambda$ has a finite difference approximation (-2.930e-7) which is much closer to the analytical gradient (-2.923e-7) than the Automatic Difference gradient (-2.612e-7), which is very surprising, especially for such a simple (unconstrained) problem.
Do you have an idea why AD performs so poorly on this example ?

Need to install pybind11 before diffcp

In a fresh virtual environment, when I run pip install diffcp
I get the error:

cpp/src/wrapper.cpp:1:10: fatal error: pybind11/eigen.h: No such file or directory
#include <pybind11/eigen.h>
         ^~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
----------------------------------------
ERROR: Failed building wheel for diffcp

Once I pip install pybind11, then pip install diffcp works perfectly fine.

This should just work though, because pybind11 is in setup_requires in setup.py.

(openSUSE Leap 15.1) Failed building wheel for diffcp

I'm trying to install using pip. The installation is failing with the following errors (both when building the wheel and when it tries to install the package subsequently):

  ERROR: Command errored out with exit status 1:
   command: /global/scratch/rex/Python/miniconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-x09rf7cz/diffcp/setup.py'"'"'; __file__='"'"'/tmp/pip-install-x09rf7cz/diffcp/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-c4u82hpf --python-tag cp37
       cwd: /tmp/pip-install-x09rf7cz/diffcp/
  Complete output (33 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/diffcp
  copying diffcp/utils.py -> build/lib.linux-x86_64-3.7/diffcp
  copying diffcp/cones.py -> build/lib.linux-x86_64-3.7/diffcp
  copying diffcp/cone_program.py -> build/lib.linux-x86_64-3.7/diffcp
  copying diffcp/__init__.py -> build/lib.linux-x86_64-3.7/diffcp
  running build_ext
  building '_diffcp' extension
  creating build/temp.linux-x86_64-3.7
  creating build/temp.linux-x86_64-3.7/cpp
  creating build/temp.linux-x86_64-3.7/cpp/src
  gcc -pthread -B /global/scratch/rex/Python/miniconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/global/scratch/rex/Python/miniconda3/include -I/global/scratch/rex/Python/miniconda3/include -Icpp/external/eigen -Icpp/external/eigen/Eigen -Icpp/include -I/global/scratch/rex/Python/miniconda3/include/python3.7m -c cpp/src/wrapper.cpp -o build/temp.linux-x86_64-3.7/cpp/src/wrapper.o -O3 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  gcc -pthread -B /global/scratch/rex/Python/miniconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/global/scratch/rex/Python/miniconda3/include -I/global/scratch/rex/Python/miniconda3/include -Icpp/external/eigen -Icpp/external/eigen/Eigen -Icpp/include -I/global/scratch/rex/Python/miniconda3/include/python3.7m -c cpp/src/lsqr.cpp -o build/temp.linux-x86_64-3.7/cpp/src/lsqr.o -O3 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  gcc -pthread -B /global/scratch/rex/Python/miniconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/global/scratch/rex/Python/miniconda3/include -I/global/scratch/rex/Python/miniconda3/include -Icpp/external/eigen -Icpp/external/eigen/Eigen -Icpp/include -I/global/scratch/rex/Python/miniconda3/include/python3.7m -c cpp/src/linop.cpp -o build/temp.linux-x86_64-3.7/cpp/src/linop.o -O3 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  gcc -pthread -B /global/scratch/rex/Python/miniconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/global/scratch/rex/Python/miniconda3/include -I/global/scratch/rex/Python/miniconda3/include -Icpp/external/eigen -Icpp/external/eigen/Eigen -Icpp/include -I/global/scratch/rex/Python/miniconda3/include/python3.7m -c cpp/src/deriv.cpp -o build/temp.linux-x86_64-3.7/cpp/src/deriv.o -O3 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  gcc -pthread -B /global/scratch/rex/Python/miniconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/global/scratch/rex/Python/miniconda3/include -I/global/scratch/rex/Python/miniconda3/include -Icpp/external/eigen -Icpp/external/eigen/Eigen -Icpp/include -I/global/scratch/rex/Python/miniconda3/include/python3.7m -c cpp/src/cones.cpp -o build/temp.linux-x86_64-3.7/cpp/src/cones.o -O3 -std=c++11
  cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
  g++ -pthread -shared -B /global/scratch/rex/Python/miniconda3/compiler_compat -L/global/scratch/rex/Python/miniconda3/lib -Wl,-rpath=/global/scratch/rex/Python/miniconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/cpp/src/wrapper.o build/temp.linux-x86_64-3.7/cpp/src/lsqr.o build/temp.linux-x86_64-3.7/cpp/src/linop.o build/temp.linux-x86_64-3.7/cpp/src/deriv.o build/temp.linux-x86_64-3.7/cpp/src/cones.o -o build/lib.linux-x86_64-3.7/_diffcp.cpython-37m-x86_64-linux-gnu.so
  /global/scratch/rex/Python/miniconda3/compiler_compat/ld: /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crti.o: unable to initialize decompress status for section .debug_aranges
  /global/scratch/rex/Python/miniconda3/compiler_compat/ld: /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crti.o: unable to initialize decompress status for section .debug_aranges
  /global/scratch/rex/Python/miniconda3/compiler_compat/ld: /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crti.o: unable to initialize decompress status for section .debug_aranges
  /global/scratch/rex/Python/miniconda3/compiler_compat/ld: /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crti.o: unable to initialize decompress status for section .debug_aranges
  /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crti.o: file not recognized: file format not recognized
  collect2: error: ld returned 1 exit status
  error: command 'g++' failed with exit status 1

_solve_adjoint_derivative_dense much slower than np.linalg.solve

diffcp is installed with openmp flags:

MARCH_NATIVE=1 OPENMP_FLAG="-fopenmp" pip install diffcp

It's at least 5 times slower than np.linalg.solve.
Eigen solve should not be much slower than np.linalg.solve.

Report here in case the code performance can be improved.

forward time not decreasing w/ as number of threads increases

When running examples/batch.py on my machine, I get the following output:

001 | 3.4388 +/- 0.00 | 0.4225 +/- 0.00
002 | 3.6921 +/- 0.00 | 0.3009 +/- 0.00
003 | 4.4002 +/- 0.00 | 0.2352 +/- 0.00
004 | 5.3111 +/- 0.00 | 0.1954 +/- 0.00

The leftmost column is the number of threads, the middle column is the time taken by the forward pass and the right is the time taken by the backward pass (using lsqr).

Also add in multi-processing for the batched backward passes

Calling into solve_and_derivative_batch does multiprocessing to solve the problems, but requires the user to use multiprocessing on the derivative/adjoint functions they get back if they also want to batch those computations -- multiprocessing for these should almost always be done if the forward pass also uses multiprocessing. I think there should be a pretty clean way of having an interface to enable this

Performance squeeze: Don't pre-compute anything for the derivative when initially solving

It should be pretty easy to update solve_and_differentiate to not pre-compute anything for the derivative if the user doesn't need this. Otherwise if somebody wants to use this for the derivative sometimes (but not always) then they'll incur some additional performance overhead for the derivative pre-computation even though they don't use it. This also makes the timing results of the forward/backward passes slightly off as time that should be measured in the backward pass is actually present in the forward pass. I think this overhead might even larger for the explicit mode in #2 that calls into cone_lib.dpi_explicit.

I just tried running the following quick example and this part seems to add ~15% overhead

#!/usr/bin/env python3

import numpy as np
from scipy import sparse

import diffcp

nzero = 100
npos = 100
nsoc = 100
m = nzero + npos + nsoc
n = 100

cone_dict = {
    diffcp.ZERO: nzero,
    diffcp.POS: npos,
    diffcp.SOC: [nsoc]
}

A, b, c = diffcp.utils.random_cone_prog(m, n, cone_dict)
x, y, s, D, DT = diffcp.solve_and_derivative(A, b, c, cone_dict)

# evaluate the derivative
nonzeros = A.nonzero()
data = 1e-4 * np.random.randn(A.size)
dA = sparse.csc_matrix((data, nonzeros), shape=A.shape)
db = 1e-4 * np.random.randn(m)
dc = 1e-4 * np.random.randn(n)
dx, dy, ds = D(dA, db, dc)

# evaluate the adjoint of the derivative
dx = c
dy = np.zeros(m)
ds = np.zeros(m)
dA, db, dc = DT(dx, dy, ds)
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    55                                           @profile
    56                                           def solve_and_derivative(A, b, c, cone_dict, warm_start=None, **kwargs):

...

128         1      79706.0  79706.0     84.1      result = scs.solve(data, cone_dict, **kwargs)
129
130                                               # check status
131         1          6.0      6.0      0.0      status = result["info"]["status"]
132         1          4.0      4.0      0.0      if status == "Solved/Innacurate":
133                                                   warnings.warn("Solved/Innacurate.")
134         1          4.0      4.0      0.0      elif status != "Solved":
135                                                   raise SolverError("Solver scs returned status %s" % status)
136
137         1          3.0      3.0      0.0      x = result["x"]
138         1          4.0      4.0      0.0      y = result["y"]
139         1          3.0      3.0      0.0      s = result["s"]
140
141                                               # pre-compute quantities for the derivative
142         1          7.0      7.0      0.0      m, n = A.shape
143         1          4.0      4.0      0.0      N = m + n + 1
144         1         14.0     14.0      0.0      cones = cone_lib.parse_cone_dict(cone_dict)
145         1         21.0     21.0      0.0      z = (x, y - s, np.array([1]))
146         1          4.0      4.0      0.0      u, v, w = z
147         1       1850.0   1850.0      2.0      D_proj_dual_cone = cone_lib.dpi(v, cones, dual=True)
148         1          5.0      5.0      0.0      Q = sparse.bmat([
149         1        271.0    271.0      0.3          [None, A.T, np.expand_dims(c, - 1)],
150         1        299.0    299.0      0.3          [-A, None, np.expand_dims(b, -1)],
151         1       4230.0   4230.0      4.5          [-np.expand_dims(c, -1).T, -np.expand_dims(b, -1).T, None]
152                                               ])
153         1       2878.0   2878.0      3.0      M = splinalg.aslinearoperator(Q - sparse.eye(N)) @ dpi(
154         1       3301.0   3301.0      3.5          z, cones) + splinalg.aslinearoperator(sparse.eye(N))
155         1        445.0    445.0      0.5      pi_z = pi(z, cones)
156         1       1742.0   1742.0      1.8      rows, cols = A.nonzero()

ZeroDivisionError when SCS does not solve the problem

I am getting a ZeroDivisionError for an instance where SCS returns primal infeasibility.

Here is a MWE

import numpy as np
import cvxpy as cp
import diffcp
import pickle
import scs

with open("dump.pkl", "rb") as f:
    data = pickle.load(f)

A, b, c, cone_dict, dx = data['A'], data['b'], data['c'], data['cone_dict'], data['dx']
n = len(c)
m = len(b)


# Acceleration parameter
anderson = 10  # Diverges and returns infeasibility
#  anderson = 0  # Works

# Solve with CVXPY and OSQP
x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(c * x), [A @ x <= b])
result_osqp = prob.solve(verbose=True)
x_osqp = x.value

# Compute derivatives with SCS
x, y, s, D, DT = diffcp.solve_and_derivative(A, b, c, cone_dict, verbose=True,
                                             acceleration_lookback=anderson)
dA, db, dc = DT(dx, np.zeros(m), np.zeros(m))  # ERROR

dump.pkl.zip

@bodono this is related to the Anderson acceleration in SCS. Lookback of 10 for this problem makes SCS diverge returning the wrong status. I had to disable the acceleration to make it work.

@akshayka @sbarratt I think this issue is in general related to what happens if the problem is not solved because of numerical issues or because it is infeasible. Have you thought about what kind of object or information to return in case the problem is infeasible?

Apparently wrong derivatives in simple softmax-as-optimization formulation

I was trying to use cvxpylayers to get derivatives of the problem below (essentially softmax-as-constrained optimization) and ran into a bug.

$$ \begin{aligned} \max_x & \text{ } v x_1 + b x_2 - \alpha \sum_{i=1}^2 x_i \log x_i \\ & \text{ s. t. } \sum_{i=1}^2 x_i = 1 \end{aligned} $$

where $v$ and $b$ are the parameters. The optimal solution to this problem is $x = \text{softmax}(v/\alpha, b/\alpha)$.

I reported the bug at cvxgrp/cvxpylayers#145 as well but it was suggested I post here as well, as it seems the bug may be in diffcp.

Here's my best attempt at a MWE that only uses diffcp. I ran the cvxpylayers code in a debugger and just pulled out the problem parameters and hardcoded them (please let me know if I'm using the library wrong!). The derivative of x1 at point v=0.6, b=0.58 wrt b should be -10.4994 but ends up being 0.01412.

import numpy as np
import diffcp
import scipy

# program is maximize_x v*x1 + b*x2 - smooth_coeff*sum(x_i log x_i) s.t. sum(x) == 1
# we care about derivative of solution wrt b
# the analytic solution is just x = softmax(v/smooth_coeff, b/smooth_coeff)

a = scipy.sparse.csc_matrix(np.array([[ 1.,  1.,  0.,  0.],
        [-1.,  0.,  0.,  0.],
        [ 0., -1.,  0.,  0.],
        [ 0.,  0., -1.,  0.],
        [-1.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0., -1.],
        [ 0., -1.,  0.,  0.],
        [ 0.,  0.,  0.,  0.]]))

bb = np.array([1., 0., 0., 0., 0., 1., 0., 0., 1.])


# second coordinate here corresponds to negative of parameter b
# last two coordinates correspond to smoothing param of 0.01
c = np.array([-0.6 ,  -0.58  , -0.01, -0.01])

cone_dict = {'l': 2, 'q': [], 'ep': 2, 's': [], 'p': [], 'z': 1}

kwargs = {'verbose': False, 'eps_abs': 1e-05, 'eps_rel': 1e-05}

solve_method = 'SCS'

x, y, s, D, DT = diffcp.solve_and_derivative(a, bb, c, cone_dict, **kwargs)

zeros = np.zeros_like(bb)
dx = np.array([1.0,0.0,0.0,0.0])
dA, db, dc = DT(dx, zeros, zeros)

# derivative wrt parameter b_val should be -dc[1]
print('calculated derivative wrt b parameter from problem (= -dc[1]) is', -dc[1])
print('deriv of softmax(v/smooth,b/smooth) wrt b parameter from problem is approx -10.4994')

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.