Git Product home page Git Product logo

gurobi-optimods's People

Contributors

edrothberg avatar etowle avatar maliheha avatar marika-k avatar mattmilten avatar nodet avatar pobonomo avatar rluce avatar ruthmair avatar silkehorn avatar simonbowly avatar torressa avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

gurobi-optimods's Issues

TSP (with additional side constraints)

Why this Mod?

  • TSP is a well-known combinatorial problem and can arise as subproblem of more complicated problems, e.g., from routing and scheduling.
  • Formulating the TSP as MILP is not trivial because of subtour elimination. Many people use Big-M formulations but cut formulations perform usually better (since resulting bounds are better).
  • Optionally, side constraints might be considered, e.g., node time windows, resource restrictions on total tour, precedence relations.
  • The mod will not be state-of-the-art in terms of performance (best TSP solver is Concorde, variants are usually solved via sophisticated branch-and-cut algorithms with many additional sets of valid inequalities). Still, the performance will be sufficient to solve instances with up to a few hundred nodes for the basic TSP, probably much less for variants.

What will the API be?

Input data for the basic TSP is just the complete cost/distance matrix (numpy.ndarray)
tour, objective = solve_tsp(distance_matrix)
Symmetric costs/distances allow a more efficient formulation (reducing symmetry) so a differentiation might make sense.

Variants with additional side constraints might require a time matrix for arcs, time windows for edges, other resource matrices for arcs, node pairs representing precedence relations, etc.
In those cases it makes sense to use a graph representation, e.g., via a pandas dataframe.

Additional context

Fix up boilerplate stuff

  • License: update to Apache2
  • hooks: pre-commit added
  • actions: unit tests, doc build run on main and PRs

Restructure the API docs

Currently there are too many mods on the one page in api.rst. This should have subsections or subpages, and be formatted to remove too much namespacing.

Finalize list of dependencies

After we have a good collection of models, we should think about which dependencies are the default. Should the user have easy ways (via extras) to install specific dependencies for graphs, machine learning, OR models, without getting everything under the sun with the default install?

Add general weighted matching problem

Why this Nup?

Maximum (weighted) matching. This is a distinctly different model (MIP) from the linear network flow of bipartite matching.

Does it fall under an existing category?

Yes, graphs

What will the API be?

scipy sparse graph input, scipy sparse output (max. matching as a subgraph)

Additional context

We want to have fit-for-purpose models where we assume:

  • Users would know graph theory well, so would be aware their graph is bipartite
  • They would not want to see a MIP formulation when they know a fast LP algorithm is more suitable

optimod decorator should allow parameter overrides

Two use-cases to cover here:

  • The decorator should give the mod a solver_params keyword argument so the user can override Gurobi parameters
  • create_env should take a parameter dictionary so the mod can override certain settings (in particular in relation to output)

The priority should be:

  1. user-specified parameters
  2. parameters passed to create_env in the mod
  3. parameters set by the decorator as a result of silent and logfile

Rewrite the readme

The readme needs an overhaul, it is currently pitched at internal contributors. I will borrow the structure from gurobipy-pandas / gurobi-machinelearning

(Elementary) shortest path problem with resource constraints

Why this Mod?

  • Shortest path problem from given source node to target node minimizing total arc and/or node costs, subject to knapsack constraints for the whole path on further resources like distance/time/load on arcs and/or nodes.
  • Costs are allowed to be negative potentially leading to loops along the path (even infinite loops if resource constraints are not restricting). If asking for "elementary" paths, loops are forbidden.
  • Classical NP-hard combinatorial problem, often arising as subproblem of more complicated problems, e.g., in the pricing problem of route-based formulations for vehicle routing problems.
  • State-of-the-art algorithms are labeling algorithms similar to dynamic programming that can be difficult to implement correctly and efficiently. For inefficient implementations, MILP might have comparable performance.

What will the API be?

  • Graph-based input seems more appropriate instead of multiple matrices/arrays that may be inconsistent. Pandas dataframes are a good option.
  • Resource "cost" is mandatory and serves as resource to minimize. All further resources with arbitrary names are optional for nodes and arcs.
  • Node and arc values with the same resource name are accumulated along the path.
node_data = pd.DataFrame([
    {"node": 0, "cost": -3, "time": 4, "distance": 3},
    {"node": 1, "cost": 5, "time": 3, "distance": 7},
    {"node": 2, "cost": 6, "time": 1, "distance": 2},
    {"node": 3, "cost": -4, "time": 5, "distance": 3}
])

arc_data = pd.DataFrame([
    {"source": 0, "target": 1, "cost": 16, "distance": 1},
    {"source": 0, "target": 2, "cost": -13, "distance": 3},
    {"source": 1, "target": 2, "cost": 10, "distance": 2},
    {"source": 2, "target": 1, "cost": -4, "distance": 6},
    {"source": 1, "target": 3, "cost": 12, "distance": 3},
    {"source": 3, "target": 2, "cost": -9, "distance": 2}
])

limits = {
  "time": 10,
  "distance": 18
}

path, objective = solve_espprc(node_data=node_data, arc_data=arc_data, source=0, target=3, limits=limits, elementary=True)

Additional context

Portfolio optimization with discrete constraints

Why this Nup?
Many practical aspects of asset allocation are of discrete nature: Minimum buy-in, maximum number of allowed trades per period, market segmentation, etc.

Does it fall under an existing category?

  • Other: finance

What will the API be?

asset_allocs = nupstup.portfolio_optimization(
    factor_matrix=H  # H.T@H is the variance-covariance matrix
    covariance=Sigma  # Sigma is the variance-covariance matrix
    mu=mu  # estimated first moments of return function
    maxnum_transactions=20  # No more than 20 trades
    maxnum_allocations=50  # No more than 50 allocations at a time (e.g., online problem)
    min_invest_long=0.05  # For long allocations, need at least 5% of total investment
    min_invest_short=0.01  # For short allocations, need at least 1% of total investment
    max_total_short=0.1  # Maximum 10% short selling
    [...]

Here the objective function would be min x.T @ H.T @ H @ x - mu @ x

Note that we focus here on the model

min x.T @ H.T @ H @ x - mu @ x
s.t. something

and another NUP should cover the model

max  mu @ x
s.t. something
    x.T @ H.T @ H @ x <= omega^2

Both Nups can share most code and documentation.

Other random thoughts:

Network Design Optimod

Why this Mod?

I have heard Network Design problems being discussed with a few users now so they seem to be receiving a bit of attention. Essentially you have nodes and possible arcs between the nodes as well as commodities that must flow from one specific node to another in certain quantities. There are costs to build the arcs as well as costs for commodities to flow along them. The task is to design the network to minimize the cost such that all the commodities can get where they need to be in the required quantities. This feels like a nice optimod as it can be expressed and visualised with graphs. Alternatively, it could naturally be described in terms of supply chains.

Does it fall under an existing category?

  • Graphs

What will the API be?

The data that needs to be conveyed is:

  • Network $(N,A)$ of nodes & potential arcs, which can be either directed or undirected.
  • Arcs are labelled with a flow cost $c_{ij}$, fixed cost $f_{ij}$ for building the arc, and capacity $C_{ij}$
  • Commodities $k\in K$ with origin/destination $(o_k,d_k)$ and flow demand $W_k$

The API could look something like

network, cost = gurobi_optimods.network_design(arc_labeled_network, commodities)

Where:

  • The arc_labeled_network is a scipy object representing the network with all possible arcs, and each arc would be labeled with the flow cost, fixed cost, and capacity.
  • commodities is just a list of tuples containing origin, destination, and flow demand.
  • network is just another scipy object representing an unlabelled network of nodes and arcs, and the cost is just a float representing the cost to build the network.
  • The method might need an optional argument that specifies if the network is directed or not.

Additional context

Started looking into Network Design problems as supposedly they lend themselves well to Benders Decomposition but have found that the default Gurobi behavior is very strong so think a basic MIP model will suffice for this.

Add a "view mod code" link in the docs

In #18 we removed implementation code from the docs. We should instead include a link to the source of each mod (on github) from the docs page so users can get a quick view of the implementation.

Interfaces for network models

Meta issue to make sure we have consistent APIs between network-y mods #39 #36

Networkx is very clean for this, as we can have node and edge attributes and all the data for a mod is bundled into one python object. We probably want to avoid networkx/igraph/others as dependencies, but only if we can find just as clean an API for network models using the existing numpy/scipy/pandas dependencies.

For weighted or unweighted graphs we use scipy sparse arrays (e.g. mwis and matching):

# Graph adjacency matrix (upper triangular) as a sparse matrix.
g = nx.cubical_graph()
adjacency_matrix = sp.triu(nx.to_scipy_sparse_array(g))

# Vertex weights
weights = np.array([2**i for i in range(8)])

# Compute maximum weighted independent set.
mwis = maximum_weighted_independent_set(adjacency_matrix, weights)

however this doesn't quite suit networks which need multiple attributes per edge. We could pass multiple arrays, but this is unweildy as the user has to build two scipy objects with duplicated data for the edge pattern. The alignment between sparse indexes and positions in the demand array is not too nice either, as it's restricted to 0 ... n-1 node labels.

cost = ... # sparse array
capacity = ... # sparse array
demand = ... # dense array
min_cost_flow(cost, capacity, demand)

I also wrote a network flow function for use in bipartite matching, which uses dense arrays to pass all the edge source/target indices, costs & capacities. It's meant as a low-level utility for other mods to use though, it would be fast but not a nice API:

def solve_min_cost_flow(
env: gp.Env,
edge_source: np.ndarray,
edge_target: np.ndarray,
capacity: np.ndarray,
cost: np.ndarray,
demand: np.ndarray,
):

So I think pandas is the better alternative here, with input data as below. It's easy to do a direct formulation of a flow problem using gurobipy-pandas as well, see networkflow.md

edge_data = pd.DataFrame([
    {"source": 0, "target": 1, "capacity": 16, "cost": 0},
    {"source": 0, "target": 2, "capacity": 13, "cost": 0},
    {"source": 1, "target": 2, "capacity": 10, "cost": 0},
    {"source": 2, "target": 1, "capacity": 4, "cost": 0},
    {"source": 1, "target": 3, "capacity": 12, "cost": 0},
    {"source": 3, "target": 2, "capacity": 9, "cost": 0},
    {"source": 2, "target": 4, "capacity": 14, "cost": 0},
    {"source": 4, "target": 3, "capacity": 7, "cost": 0},
    {"source": 3, "target": 5, "capacity": 20, "cost": 0},
    {"source": 4, "target": 5, "capacity": 4, "cost": 0},
])

node_data = pd.DataFrame([
    {"node": 0, "demand": -23},
    {"node": 5, "demand": 23}
])

@torressa @stevedwards what do you think (for the final version, ok to continue hacking with networkx for the time being)? It's also fine to use networkx in the docs examples, either to generate networks or produce figures. @maliheha did this for MWIS; her examples generate a networkx graph and convert to scipy using networksx/numpy functions. It's similarly straightforward for pandas, see to_pandas_edgelist which gives the same structure as edge_data above from a graph with edge attributes.

Maximize Sharpe ratio

Why this Nup?

Finance applications are probably plentiful enough to warrant their own nup category. If there is a dedicated Finance category, the maximal Sharpe ratio problem is popular and focused enough to be included.

Note that this is closely related to #9.

Does it fall under an existing category?

Other (Finance)

What will the API be?

asset_allocs = nupstup.max_sharpe_ratio(
    cov_matrix=Q  # variance-covariance matrix
    mu=mu  # estimated first moments of return function
    rf_rate=rf  # risk-free rate
)

where Q and mu are scipy/numpy data structures. We could alternatively require the factor matrix H as an argument instead of the covariance matrix Q (where Q = H.T @ H), though I think Q is more natural considering how the problem statement and model are written.

Additional context

One complication: to be solved as a QP, the model needs to be reformulated in a different variable space. Additional constraints must be added in the variable space of the reformulated model. The mapping between the two models is straightforward, but this does create a barrier for people interested in modifying the underlying model.

Run GH action to black all PR?

There ought to be a way to run black automatically on all incoming PRs, I think it makes sense to do on the Python code parts. Whether or not it makes sense on runnable code snippets intended mostly for the documentation I don't know, it may cause issues. For example one may need to squash some stuff together so that it fits nicely on the doc page. We will have to experiment a bit.

Quadratic geometric problems.

Why this Mod?
There are two quite standard geometric problems which can be solved through quadratic optimization.
The first one is a Minimum Polytope Distance problem, where a user provides 2 sets of n-dimensional points and is interested in the minimal distance between the 2 polytopes defined by the 2 sets.
The second one is a Smallest Enclosing Ball problem, where a user provides a set of n-dimensional points and is interested in the smallest ball enclosing all of the provided points.

Does it fall under an existing category?

  • Other: Geometry

What will the API be?

distance, point1, point2 = gurobi_optimods.minpoldist(points1, points2)

where points1,points2 could be a pandas dataframe or alternatively a list.

center,radius = gurobi_optimods.minencball(points)

where points could be a pandas dataframe or alternatively a list.

Additional context
We could add a function which plots the results nicely for 2D problems.
I am not sure how much those problems are used in real world applications. I can image that this could be used in clustering algorithms and statistics but I am not an expert in those fields.

Stress that all Nups should be stateless w.r.t. gurobipy data structures

The simplicity of a pure data I/O interface like a function, or a class/method call implies that no Env, Model state can be made persistent across multiple invokations. Applications that require such state, say, for dynamic processes, are not in the scope of the NupStup, and should target gurobipy instead. This should be mentioned in the contributing guideline.

Use sklearn.datasets style for example data

Instead of read_feather for datasets, create a datasets submodule.

The returned object from each load_ function is an sklearn Bunch (basically an attrdict).

>>> from sklearn.datasets import load_iris
>>> data = load_iris()
>>> data.target[[10, 25, 50]]
array([0, 0, 1])
>>> list(data.target_names)
['setosa', 'versicolor', 'virginica']

Could avoid the feather dependency this way if needed, use read_csv and hide the conversions in the datasets code.

Uplift MWIS mod to cover the four basic graph parameters

We could have a set of functions in the mod that compute...

  • the chromatic number,
  • the clique number,
  • the clique cover number, and
  • the stability number.

Some of these have interesting weighted variants (stability number/MWIS)

Remove tentative categorization in documentation

The current categorization was done with the goal of sketching how alike mods may be aggregated/organized in the future. However many people interpreted the categories "real" and flamed us for the categories we have chosen. To avoid further inconvenience we should simply roll back to a flat list of mods and revisit categorization at some later point in time.

Set up pre-commit for formatting hooks

Need to decide what to point the formatters at (maybe just the nupstup code itself initially?). I'll copy the configuration over from gurobipy-pandas.

QUBO

Why this Nup?
Because of the quantum hype, there are so many solvers which solve QUBOs heuristically where the input is just a Q matrix. We can consider having this nup with a similar interface.

Does it fall under an existing category?

  • Other (General)

What will the API be?

qubo = nupstup.qubo(coeff_matrix=Q)

Where Q is a scipy sparse graph structure. The output is an array of 0/1s.

Additional context
Downside: This can encourage comparison of Gurobi with quantum or quantum-inspired solvers for problems where we know QUBO is not necessarily the right form.

Some Flow Problems

Why this Nup?

Flow problems are present in many applications.
Particularly, the following share the same formulation (with some small changes):

  • Min cost flow.
  • Shortest path problem.
  • Max flow/min cut

All single source/sink.

Given a digraph $G=(V, E)$, with source $s$ and sink $t$, we can formulate these as follows:

$$ \begin{alignat}{2} \min \quad & \sum_{(i, j) \in E} c_{ij} x_{ij} \\ \mbox{s.t.} \quad & \sum_{j \in \delta^+(i)} x_{ij} - \sum_{j \in \delta^-(i)} x_{ji} = b_i & \forall i \in V' \\ & 0 \leq x_{ij} \le B_{ij} & \forall (i, j) \in E \\ \end{alignat} $$

Where $\delta^+(\cdot)$ ( $\delta^-(\cdot)$ ) denotes the outgoing (incoming) neighours, and

$$ b_i = \begin{cases} D & \text{if } i = s\\ -D & \text{if } i = t \\ 0 & \text{ow} \end{cases} $$

Problem $V'$ $D$ $B_{ij}$ $c_{ij}$
1. $V$ $\geq 0$ $\geq 0$ $\geq 0$
2. $V$ $=1$ $=1$ $\geq 0$
3. $V\setminus\{s,t\}$ n/a $\geq 0$ $=-1$ if $i=s, j\in\delta^+(s)$, $0$ ow

Does it fall under an existing category?

Graphs

What will the API be?

nupstup.min_cost_flow(G, costs, capacities, demand, source, sink)
nupstup.max_flow(G, capacities, source, sink)
nupstup.min_cut(G, capacities, source, sink)
nupstup.shortest_path(G, costs, source, sink)

Additional context

Well-known graph problems, so graph theory terminology is fine.

There are many real-world applications and other graph problem transformations for these, so would be good to have some of these in there as well.

Problem 1 -> Minimum weight bipartite matching
Problem 3 -> Maximum cardinality bipartite matching, closure problem

If we go up another dimension and add $x_{ij}^k$ (also $D^k$) for a commodity $k$ we can model multicommodity flows with this same formulation as well.

Add WLS license to CI

This is needed already for the cardinality-constrained regression doctests (quadratic model with >200 variables), and will also be needed for several OPF mod unittests.

To do:

  • Enable full WLS license for doc tests. Change doctest execution to push/main and pull_request_target/labelled
  • Mark tests to skip if the pip license is detected
  • Add an action to run unittest with WLS. Execute on push/main and pull_request_target/labelled

Cleanup pass over docstrings

  • Some mod docstrings are incomplete, we should do a pass through to check.
  • verbose and logfile need to be documented in decorated mods (done via the decorator)
  • In most cases we haven't linked to class definitions, run make clean html SPHINXOPTS="-W --keep-going -n" to check

Mention in the doc how-to for "Contributing a mod"

Covers what people need to do in order to contribute a nup. This probably won't be a full blown doc chapter, but instead points to the resources we already have creaded (contributing guideline, template nup, etc.). We will have a better idea on how to structure this with our coming implementation kickoff.

Unit commitment-esque model for microgrids

Why this Mod?

To give a gentler alternative to the very complex ACOPF mods, a unit commitment mod would be a useful addition. However, the classic example of a few big coal plants serving some smoke-belching factories is not so nice since (a) it's a bit tired, (b) the explanation involves many dirty words, and (c) it's not so likely that a small country would build their power grid on top of the optimods.

It would be more interesting to demonstrate usability for microgrids or household scale power usage. So we could include:

  • demand profiles,
  • solar generation patterns,
  • battery charge/discharge,
  • load-shedding for HVAC systems, and,
  • bribing your neighbours with the promise of baked goods if they pedal a dynamo for an hour so you can keep hacking.

Design requirements

Verify the mod will meet the following criteria:

  • Self-contained: the user doesn't need to interact directly with gurobipy
    objects when using the function or class exposed by the mod.
  • Stateless: data-in data-out, no follow-up optimization steps, and gurobipy
    objects are disposed as soon as solutions are retrieved.
  • Covers a well-known concept from a non-optimization field. In particular,
    we must be able to explain what the Mod does using domain-specific terminology
    without reference to the mathematical model.
  • Clear purpose / single focus. There may be natural parameters to switch on/off
    various side-constraints, but these should be kept to a minimum.

What will the API be?

Pandas is the natural input format. It may make sense to build this mod as a class to separate static infrastructure from dynamic operation concerns to support scenario modeling, e.g.

grid = MicroGrid(loads, storage, generators)
grid.operation_plan(demand_profile, solar_profile, wind_profile)

We probably don't need network layout information for microgrids, and can focus on meeting demand over time.

Additional context

A quick search reveals plenty of recent literature we could reference:

Many papers refer also to ML for demand forecasting, so this is also a good opportunity to talk about the "what next" after you've applied all your fancy machine learning tricks.

Rethink examples file structure

Current example structure is docs/examples/<nup>/gurobipy.py + docs/examples/<nup>/nupstup.py. We can import these files fine for the purposes of testing, but they can't be executed where they live because they partially override gurobipy and nupstup. Need to rename these across the board (maybe just some creative suffix like _impl?).

Fix gurobipy to v10 or above

We should require at least gurobipy v10. Reasoning:

  • The target audience is new users, so there's isn't much incentive to support old versions
  • There are nice opportunities to use the v10 matrix API in many of the mods; we would be maintaining backward compatibility at the cost of readability

Maximum Weighted Independent Set (MWIS)

Why this Nup?
The maximum weighted independent set (MWIS) problem is a famous graph theory problem. I got to appreciate this problem more where I worked on a molecular similarity application for designing new drugs. The heart of the application needed solving very big MWIS problems.

Does it fall under an existing category?

  • Graphs

What will the API be?

mwis = nupstup.mwis(adjacency_matrix=A, weights=w)

Where A is a scipy sparse graph structure and w is a numpy array of weights. The output is an array of vertices in the independent set.

Additional context
This requires solving a MIP and the LP relaxation solution does not include an optimal integer solution.

Optimal Power Flow (OPF)

Why this Mod?
We want to enable usage of Gurobi in the Power industry. Criteria:

  • Solve model, get solution (no follow up analysis, e.g. IIS)
  • Stateless: data-in data-out, no follow-up optimization steps
    • Model can be kept in memory for the purpose of returning solutions
  • Covers a well-known concept from a non-optimization field
    • In particular, we must be able to explain what the Mod does using domain-specific terminology
  • Clear purpose / single focus
    • There may be natural parameters to switch on/off various side-constraints, but these should be kept to a minimum

What will the API be?
Custom format.

Additional context
Any other details that should be communicated in the Mod writeup to help users/readers understanding?

Add detailed information about the relaxations and terminology used for ACOPF.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.