
monotonicnetworks's Introduction


Lipschitz Monotonic Networks

Implementation of Lipschitz Monotonic Networks, from the ICLR 2023 Submission: https://openreview.net/pdf?id=w2P7fMy_RH

The code here allows one to apply various weight constraints on torch.nn.Linear layers through the kind keyword. Here are the available weight norms:

"one",      # |W|_1 constraint
"inf",      # |W|_inf constraint
"one-inf",  # |W|_1,inf constraint
"two-inf",  # |W|_2,inf constraint

‼️‼️Important‼️‼️

Check that you are using the right kind of Lipschitz constraint.
If you are not sure: use kind="one-inf" in the first layer and kind="inf" in all following layers.
The default (kind="one") works well ONLY when the output is scalar!
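For example, a minimal sketch of a Lipschitz-constrained network with a non-scalar output, following the recommendation above (direct_norm, LipschitzLinear, and GroupSort are described under Usage below):

from torch import nn
import monotonicnetworks as lmn

# "one-inf" on the first layer, "inf" on every following layer,
# since the output here is 3-dimensional (not scalar).
model = nn.Sequential(
    lmn.direct_norm(nn.Linear(4, 16), kind="one-inf"),
    lmn.GroupSort(2),
    lmn.direct_norm(nn.Linear(16, 3), kind="inf"),
)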

Installation

pip: pip install monotonicnetworks

conda: conda install -c conda-forge monotonicnetworks

Note that the package used to be called monotonenorm and was renamed to monotonicnetworks on 2023-07-15. The old package is still available on PyPI and conda-forge, but will not be updated.

Deprecated PyPI package: pip install monotonenorm

Deprecated conda package: conda install -c okitouni monotonenorm

Requirements

Make sure you have the following packages installed:

  • torch (required)
  • matplotlib (optional, for plotting examples)
  • tqdm (optional, to run the examples with a progress bar)

Usage

Here's an example showing two ways to create a Lipschitz-constrained linear layer.

from torch import nn
import monotonicnetworks as lmn

linear_by_norming = lmn.direct_norm(nn.Linear(10, 10), kind="one-inf") # |W|_1,inf constraint
linear_native = lmn.LipschitzLinear(10, 10, kind="one-inf") # |W|_1,inf constraint

The function lmn.direct_norm can apply various weight constraints to torch.nn.Linear layers through the kind keyword and returns a Lipschitz-constrained linear layer. Alternatively, the code in monotonicnetworks/LipschitzMonotonicNetwork.py contains several classes that can be used to create Lipschitz and monotonic layers directly.

  • The LipschitzLinear class is a linear layer with a Lipschitz constraint on its weights.

  • The MonotonicLayer class is a linear layer with a Lipschitz constraint on its weights and monotonicity constraints that can be specified per input dimension or per input-output pair. For instance, suppose we want to model a linear layer with 2 inputs and 3 outputs. We specify the monotonic constraints with respect to the first input as [1, 0, -1]: the first output is monotonically increasing (1), the second has no constraint (0), and the third is monotonically decreasing (-1) with respect to the first input. For the second input, [0, 1, 0] means only the second output has a monotonically increasing constraint. The code for this looks as follows:

import monotonicnetworks as lmn

linear = lmn.MonotonicLayer(2, 3, monotonic_constraints=[[1, 0, -1], [0, 1, 0]])

The accepted 2D tensor shape for monotonic constraints is [input_dim, output_dim].
Using a 1D tensor assumes the same constraints for every output dimension. By default, all outputs are assumed to be monotonically increasing in all inputs.
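For example, a minimal sketch of the 1D shorthand and its explicit 2D equivalent:

import monotonicnetworks as lmn

# 1D constraints are broadcast over the output dimension:
# every output is increasing in the first input and decreasing in the second.
layer_1d = lmn.MonotonicLayer(2, 3, monotonic_constraints=[1, -1])

# Equivalent explicit form with shape [input_dim, output_dim].
layer_2d = lmn.MonotonicLayer(2, 3, monotonic_constraints=[[1, 1, 1], [-1, -1, -1]])

# Default: all outputs monotonically increasing in all inputs.
layer_default = lmn.MonotonicLayer(2, 3)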

  • The MonotonicWrapper class wraps a module that has a known Lipschitz constant. It adds a term to the module's output which enforces the monotonicity constraints given by monotonic_constraints, and returns a module that is monotonic and Lipschitz with constant lipschitz_const. This is the preferred way to create a monotonic network. Example:
from torch import nn
import monotonicnetworks as lmn

lip_nn = nn.Sequential(
    lmn.LipschitzLinear(2, 32, kind="one-inf"),
    lmn.GroupSort(2),
    lmn.LipschitzLinear(32, 2, kind="inf"),
)
monotonic_nn = lmn.MonotonicWrapper(lip_nn, monotonic_constraints=[1,0]) # first input increasing, no monotonicity constraints on second input

Note that one can stack monotonic modules.
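For instance, a sketch of what stacking could look like (the composition of monotonic, Lipschitz modules is again monotonic and Lipschitz); the widths here are illustrative:

from torch import nn
import monotonicnetworks as lmn

block1 = lmn.MonotonicWrapper(
    nn.Sequential(
        lmn.LipschitzLinear(2, 32, kind="one-inf"),
        lmn.GroupSort(2),
        lmn.LipschitzLinear(32, 2, kind="inf"),
    ),
    monotonic_constraints=[1, 0],  # increasing in the first input
)
block2 = lmn.MonotonicWrapper(
    nn.Sequential(
        lmn.LipschitzLinear(2, 32, kind="one-inf"),
        lmn.GroupSort(2),
        lmn.LipschitzLinear(32, 1, kind="inf"),
    ),
    monotonic_constraints=[1, 1],  # increasing in both of its inputs
)
stacked = nn.Sequential(block1, block2)  # still monotonically increasing in the first input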

  • The SigmaNet class is a deprecated class that is equivalent to the MonotonicWrapper class.

  • The RMSNorm class implements the RMSNorm normalization layer. It can help when training a model with many Lipschitz-constrained layers.

Examples

Check out the Examples directory for more details. Specifically, Examples/flower.py shows how to train a Lipschitz monotonic network to regress a complex decision boundary in 2D (see "Lipschitz NNs can describe arbitrarily complex boundaries" below), and Examples/Examples_paper.ipynb contains the code used to make the plots under "Monotonicity" and "Robustness".

Monotonicity

We will make a simple toy regression model to fit the 1D function $$f(x) = \log(x) + \epsilon(x)$$ where $\epsilon(x)$ is a Gaussian noise term whose variance increases linearly in $x$. In this toy model, we assume we have good reason to believe that the function we are trying to fit is monotonic (despite the non-monotonic behavior of the noise). For example, suppose we are building a trigger algorithm to discriminate between signal and background events: rarer events are more likely to be signal, so we should employ a network that is monotonic in some "rareness" feature. Another example could be a hiring classifier where (all else being equal) higher school grades should imply better chances of being hired.
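A sketch of how such toy data could be generated (the exact range and noise scale are illustrative, not those used in the paper):

import torch

torch.manual_seed(0)
x = torch.rand(500, 1) * 5 + 0.5                      # inputs in (0.5, 5.5)
var = 0.05 * x                                        # noise variance grows linearly with x
y = torch.log(x) + torch.randn_like(x) * var.sqrt()   # f(x) = log(x) + eps(x)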

Training a monotonic NN and an unconstrained NN on the purple points and evaluating the networks on a uniform grid gives the following result:

(Figure: monotonic vs. unconstrained network predictions on the toy data)

Robustness

Now we will make a different toy model with one noisy data point. This will show that the Lipschitz continuous network is more robust against outliers than an unconstrained network because its gradient with respect to the input is bounded between -1 and 1. Additionally, it is more robust against adversarial attacks/data corruption for the same reason.
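The gradient bound can be checked directly with autograd; a minimal sketch with a 1-Lipschitz network:

import torch
from torch import nn
import monotonicnetworks as lmn

lip_model = nn.Sequential(
    lmn.LipschitzLinear(1, 32, kind="one-inf"),
    lmn.GroupSort(2),
    lmn.LipschitzLinear(32, 1, kind="inf"),
)

x = torch.linspace(-3.0, 3.0, 200).reshape(-1, 1).requires_grad_(True)
(grad,) = torch.autograd.grad(lip_model(x).sum(), x)
print(grad.abs().max())  # stays <= 1, so a single outlier cannot force an arbitrarily steep fit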

(Figure: robustness against a noisy outlier)

Lipschitz NNs can describe arbitrarily complex boundaries

GroupSort weight-constrained neural networks are universal approximators of Lipschitz continuous functions. Furthermore, they can describe arbitrarily complex decision boundaries in classification problems, provided the proper objective function is used in training. In Examples/flower.py we provide code to regress on an example "complex" decision boundary in 2D.
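A sketch of the kind of GroupSort Lipschitz architecture that can be used for this (widths and depth are illustrative; see Examples/flower.py for the actual setup):

from torch import nn
import monotonicnetworks as lmn

model = nn.Sequential(
    lmn.LipschitzLinear(2, 128, kind="one-inf"),
    lmn.GroupSort(2),
    lmn.LipschitzLinear(128, 128, kind="inf"),
    lmn.GroupSort(2),
    lmn.LipschitzLinear(128, 1, kind="inf"),
)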

Here are the contour lines of the network trained in Examples/flower.py (along with the training points in black).

(Figure: contour lines of the flower decision boundary)

monotonicnetworks's People

Contributors

evensgn, niklasnolte, okitouni, ssrothman


monotonicnetworks's Issues

one-inf possible bug?

in get_normed_weights, should the one-inf option be:

norms = weights.abs().max()

? It seems to me that, according to the paper's definition of one-inf matrix norm, the one-inf norm would only be correct if vectorwise=False. However, in the examples notebook "one-inf" is used when vectorwise=True.
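For reference, a sketch of the two quantities being compared (standard tensor operations, not the package's actual implementation), assuming "one-inf" refers to the l1-to-linf operator norm, which equals the largest absolute entry of the matrix:

import torch

W = torch.randn(4, 3)

# Single matrix-level norm: the largest absolute entry of W.
norm_matrixwise = W.abs().max()

# Row-wise ("vectorwise") variant: the maximum absolute entry of each row.
norm_rowwise = W.abs().max(dim=1).values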

non-negative constraint via relu

Since my problem requires the output to be non-negative, I would like to apply a ReLU to the last layer; however, this seems to break the Lipschitz-1 continuity. Is there a good solution for this? Thank you for your reply.

MonotonicWrapper ignoring monotonic_constraints?

Hi! This may be a stupid question, owing to the fact that I just started using this package and there are still a couple of concepts from the paper I have to make clear in my head. I'm trying to use monotonic networks to model cumulative hazard functions, which need to be monotonically increasing (and equal to 0 when a particular input is 0, though that is not relevant here). However, when experimenting with some dummy data and this package to make sure I understand how it works, I've become a bit confused.

As I understand it, one can define an MLP using the LipschitzLinear layers and GroupSort (instead of activation functions like ReLU). Then you use the MonotonicWrapper to impose monotonic constraints. Below I've included a minimal example of how I think this should be done. I'm defining a model that should be monotonically increasing in the first input and unconstrained in the second. I've also deliberately used a function that is not monotonic in any input, to test that the model can indeed only represent monotonic functions. My expectation was that training this model would fail, but I get a result that is non-monotonic and fits the true labels. I've included a plot of the predictions for different values of the first input with the second input set to 0. I imagine this is not a problem with the package but me misunderstanding how to properly use it. Any help is appreciated!

import torch
import torch.nn as nn
import torch.nn.functional as F
import monotonicnetworks as lmn

class SimpleModel(nn.Module):
    
    def __init__(self):
        super(SimpleModel, self).__init__()

        self.layers = nn.Sequential()
        
        self.layers.add_module("input_layer", lmn.LipschitzLinear(2, 32))
        self.layers.add_module(f"group_sort_1", lmn.GroupSort(2))
        self.layers.add_module("hidden_layer_1", lmn.LipschitzLinear(32, 32))
        self.layers.add_module(f"group_sort_2", lmn.GroupSort(2))
        self.layers.add_module("hidden_layer_2", lmn.LipschitzLinear(32, 1))

        self.monotonic_layers = lmn.MonotonicWrapper(self.layers, monotonic_constraints=[1, 0])
        
    def forward(self, x):
        out = self.monotonic_layers(x)
        return out

def real_function(x):
     return torch.exp(- x[0] ** 2) + x[1] ** 2

inputs = torch.randn(1000, 2)

labels = real_function(inputs.T).unsqueeze(1)

# Add some noise to labels
labels += torch.randn_like(labels) * 0.1

model = SimpleModel()

optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in (range(1000)):
    optimizer.zero_grad()
    output = model(inputs)
    loss = F.mse_loss(output, labels)
    loss.backward()
    optimizer.step()

    if epoch % 100 == 0:
        print(loss.item())

# Plot predictions vs real function for a value of x[1] of 0 and a range of x[0] values
import matplotlib.pyplot as plt

x = torch.randn(1000, 2)
x[:, 1] = 0

# Sort x values
x = x[x[:, 0].argsort()]

y = real_function(x.T).unsqueeze(1)
ypred = model(x)

plt.plot(x[:, 0].detach().numpy(), y.detach().numpy(), label="Real function")
plt.plot(x[:, 0].detach().numpy(), ypred.detach().numpy(), label="Predicted function")
plt.legend()

(Figure: predicted vs. true function for x[:, 1] = 0)

<dummy>

This is a dummy issue to host the figures used in the README (monotonic dependence, robustness against a noisy outlier, flower).

Non-Monotonic Predictions

Hi Niklas and Ouail,

I am running into an issue where during training of these monotonic NNs, the predictions are not coming out monotonic whenever the Lipschitz constant is >1.

Specifically, I am training a model similar to the "RobustModel" in https://github.com/niklasnolte/MonotonicNetworks/blob/main/Examples/Examples_paper.ipynb, except with a Lipschitz constant of 100.

I attached the barebones code (below) and the associated training data (x.pt and gt.pt in data.zip) that reproduces this non-monotonic issue. The code will raise an exception whenever a prediction from the model is non-monotonic (usually around 100-200 epochs into training). Please let me know if I am missing something.

Thank you in advance for your time and help!

Code:

import monotonicnetworks as lmn

import torch
import torch.nn as nn
import torch.optim as optim

from tqdm import tqdm
    
class MonotonicNN(nn.Module):
    def __init__(self):
        super(MonotonicNN, self).__init__()

        lnn = torch.nn.Sequential(
            lmn.LipschitzLinear(1, 512, kind="one", lipschitz_const=100),
            lmn.GroupSort(2),
            lmn.LipschitzLinear(512, 512, kind="one", lipschitz_const=100),
            lmn.GroupSort(2),
            lmn.LipschitzLinear(512, 512, kind="one", lipschitz_const=100),
            lmn.GroupSort(2),
            lmn.LipschitzLinear(512, 1, kind="one", lipschitz_const=100),
        )

        self.nn = lmn.MonotonicWrapper(lnn, lipschitz_const=100)
        self.loss = nn.MSELoss()
        
    def forward(self, x):
        x = self.nn(x)
        return x
    
    def is_monotonic(self, ts):
        for i in range(1,len(ts)):
            if ts[i - 1] > ts[i]:
                print(ts[i - 1], ts[i])
                return False
        return True
            
    def step(self, x, gt):
        y_hat = self.forward(x)
        if not self.is_monotonic(y_hat):
            raise RuntimeError("model prediction is not monotonic")

        loss = self.loss(y_hat, gt)
        return loss

    def training_step(self, x, gt):
        self.train()
        return self.step(x, gt)
    

# Training Loop
device = "cuda"
max_epochs = 100000

x = torch.load("x.pt").to(device)
gt = torch.load("gt.pt").to(device)

monotonic_model = MonotonicNN()
monotonic_model.to(device)
optimizer = optim.Adam(monotonic_model.parameters(), lr=1e-3)

for i in tqdm(range(0, max_epochs)):
    optimizer.zero_grad()

    loss = monotonic_model.training_step(x, gt)

    loss.backward()
    optimizer.step()
