NeuralCompression

About

NeuralCompression is a Python repository dedicated to research of neural networks that compress data. The repository includes tools such as JAX-based entropy coders, image compression models, video compression models, and metrics for image and video evaluation.

NeuralCompression is alpha software. The project is under active development. The API will change as we make releases, potentially breaking backwards compatibility.

Installation

NeuralCompression is a project currently under development. You can install the repository in development mode.

PyPI Installation

First, install PyTorch according to the directions from the PyTorch website. Then, you should be able to run

pip install neuralcompression

to get the latest version from PyPI.

Development Installation

First, clone the repository, navigate to the NeuralCompression root directory, and install the package in development mode by running:

pip install --editable ".[tests]"

If you are not interested in matching the test environment, you can simply run pip install -e . instead.

Repository Structure

We use a 2-tier repository structure. The neuralcompression package contains a core set of tools for doing neural compression research. Code committed to the core package requires stricter linting, high code quality, and rigorous review. The projects folder contains code for reproducing papers and training baselines. Code in this folder is not linted aggressively; we don't enforce type annotations, and it's okay to omit unit tests.

The 2-tier structure enables rapid iteration and reproduction via code in projects that is built on a backbone of high-quality code in neuralcompression.

neuralcompression

  • neuralcompression - base package
    • data - PyTorch data loaders for various data sets
    • distributions - extensions of probability models for compression
    • functional - methods for image warping, information cost, flop counting, etc.
    • layers - building blocks for compression models
    • metrics - torchmetrics classes for assessing model performance
    • models - complete compression models
    • optim - useful optimization utilities

projects

Tutorial Notebooks

This repository also features interactive notebooks detailing different parts of the package, which can be found in the tutorials directory. Existing tutorials are:

  • Walkthrough of the neuralcompression flop counter (view on Colab).
  • Using neuralcompression.metrics and torchmetrics to calculate rate-distortion curves (view on Colab).

Contributions

Please read our CONTRIBUTING guide and our CODE_OF_CONDUCT prior to submitting a pull request.

We test all pull requests. We rely on this for reviews, so please make sure any new code is tested. Tests for neuralcompression go in the tests folder in the root of the repository. Tests for individual projects go in those projects' own tests folder.

We use black for formatting, isort for import sorting, flake8 for linting, and mypy for type checking.

License

NeuralCompression is MIT licensed, as found in the LICENSE file.

Model weights released with NeuralCompression are CC-BY-NC 4.0 licensed, as found in the WEIGHTS_LICENSE file.

Some of the code comes from other repositories and carries other licenses. Please read all code files carefully for details.

Cite

If you use code for a paper reimplementation, please cite the original paper. If you would also like to cite the repository, you can use:

@misc{muckley2021neuralcompression,
    author={Matthew Muckley and Jordan Juravsky and Daniel Severo and Mannat Singh and Quentin Duval and Karen Ullrich},
    title={NeuralCompression},
    howpublished={\url{https://github.com/facebookresearch/NeuralCompression}},
    year={2021}
}

neuralcompression's Issues

Parallelize `pmf_to_quantized_cdf` op

Presently, pmf_to_quantized_cdf is serial and forces the user to make two additional, expensive copies by expecting a std::vector and returning a std::vector. Both issues can be resolved by using libtorch to vectorize the op (a vectorized PyTorch sketch appears under the pmf_to_quantized_cdf op below).

log_survival_function op

Logarithm of the distribution’s survival function evaluated at x.

log_survival_function(x: torch.Tensor, distribution: torch.distributions.Distribution) -> torch.Tensor
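
A minimal sketch of such an op, assuming a recent PyTorch with torch.special.log_ndtr; the Normal special case and the log1p fallback are illustrative choices, not the repository's implementation:

import torch
from torch import Tensor
from torch.distributions import Distribution, Normal


def log_survival_function(x: Tensor, distribution: Distribution) -> Tensor:
    if isinstance(distribution, Normal):
        # log(1 - Phi(z)) == log(Phi(-z)); the reflected form stays accurate
        # deep in the upper tail.
        return torch.special.log_ndtr(-(x - distribution.loc) / distribution.scale)

    # Generic fallback; log1p preserves precision while cdf(x) is small.
    return torch.log1p(-distribution.cdf(x))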

ndtr op

Normal cumulative distribution function (CDF).

ndtr(x: torch.Tensor) -> torch.Tensor

Returns the area under the standard Normal probability density function (PDF), integrated from negative infinity to x.
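
A minimal sketch via the complementary error function (recent PyTorch versions also ship torch.special.ndtr, which could be used directly):

import math

import torch
from torch import Tensor


def ndtr(x: Tensor) -> Tensor:
    # Phi(x) = 0.5 * erfc(-x / sqrt(2))
    return 0.5 * torch.erfc(-x / math.sqrt(2.0))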

`pmf_to_quantized_cdf` op

Transforms a probability mass function (PMF) into a quantized cumulative distribution function (CDF) for entropy coding.

pmf_to_quantized_cdf(pmf: torch.Tensor, precision: float) -> torch.Tensor
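
A rough vectorized sketch, treating precision as an integer bit width; unlike a production implementation, it does not redistribute mass to guarantee that every bin stays non-empty after the final rescaling:

import torch
from torch import Tensor


def pmf_to_quantized_cdf(pmf: Tensor, precision: int = 16) -> Tensor:
    # Scale the PMF to integer frequencies, keeping every symbol representable.
    frequencies = torch.round(pmf * (1 << precision)).clamp_min(1).to(torch.int64)

    cdf = torch.zeros(pmf.shape[-1] + 1, dtype=torch.int64)
    cdf[1:] = torch.cumsum(frequencies, dim=-1)

    # Rescale so the last entry is exactly 2 ** precision, as range coders expect.
    return torch.round(cdf.double() * (1 << precision) / cdf[-1].item()).to(torch.int64)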

quantization_offset op

Computes distribution-dependent quantization offset.

quantization_offset(distribution: torch.distributions.Distribution) -> torch.Tensor

For range coding of continuous random variables, the values need to be quantized first. Typically, it is beneficial for compression performance to align the centers of the quantization bins such that one of them coincides with the mode of the distribution. With offset being the mode of the distribution, for instance, this can be achieved simply by computing:

x_hat = torch.round(x - offset) + offset

This method tries to determine the offset in a best-effort fashion, based on which statistics the Distribution implements. First, a method self._quantization_offset() is tried. If that isn’t defined, it tries in turn: self.mode(), self.quantile(.5), then self.mean(). If none of these are implemented, it falls back on quantizing to integer values (i.e., an offset of zero).

The offset is always in the range [-.5, .5] as it is assumed to be combined with a round quantizer.
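
A best-effort sketch of the fallback chain described above; torch.distributions has no quantile method, so icdf(0.5) stands in for it, and the final line keeps the offset in [-0.5, 0.5] by removing the integer part:

import torch
from torch import Tensor
from torch.distributions import Distribution


def quantization_offset(distribution: Distribution) -> Tensor:
    candidates = (
        lambda: distribution.mode,                     # mode, if defined
        lambda: distribution.icdf(torch.tensor(0.5)),  # median via the inverse CDF
        lambda: distribution.mean,                     # mean as a last resort
    )

    for candidate in candidates:
        try:
            offset = candidate()
            break
        except (AttributeError, NotImplementedError, ValueError):
            continue
    else:
        return torch.tensor(0.0)  # fall back to plain integer quantization

    # Only the fractional part matters when combined with a round quantizer.
    return offset - torch.round(offset)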

`AnalysisTransformation2D` layer

from torch import Tensor
from torch.nn import Conv2d, Module, Sequential

from ._generalized_divisive_normalization import GeneralizedDivisiveNormalization


class AnalysisTransformation2D(Module):
    def __init__(self, m: int, n: int):
        super(AnalysisTransformation2D, self).__init__()

        # ``_modules`` is reserved by ``torch.nn.Module``; use a regular
        # attribute for the sequential stack.
        self.model = Sequential(
            Conv2d(3, m, kernel_size=5, stride=2, padding=2),
            GeneralizedDivisiveNormalization(m),
            Conv2d(m, m, kernel_size=5, stride=2, padding=2),
            GeneralizedDivisiveNormalization(m),
            Conv2d(m, m, kernel_size=5, stride=2, padding=2),
            GeneralizedDivisiveNormalization(m),
            Conv2d(m, n, kernel_size=5, stride=2, padding=2),
        )

    def forward(self, x: Tensor) -> Tensor:
        return self.model(x)

`SynthesisTransformation2D` layer

from torch import Tensor
from torch.nn import ConvTranspose2d, Module, Sequential

from ._generalized_divisive_normalization import GeneralizedDivisiveNormalization


class SynthesisTransformation2D(Module):
    def __init__(self, m: int, n: int):
        super(SynthesisTransformation2D, self).__init__()

        self.model = Sequential(
            ConvTranspose2d(m, n, kernel_size=5, stride=2, padding=2, output_padding=1),
            GeneralizedDivisiveNormalization(n, inverse=True),
            ConvTranspose2d(n, n, kernel_size=5, stride=2, padding=2, output_padding=1),
            GeneralizedDivisiveNormalization(n, inverse=True),
            ConvTranspose2d(n, n, kernel_size=5, stride=2, padding=2, output_padding=1),
            GeneralizedDivisiveNormalization(n, inverse=True),
            ConvTranspose2d(n, 3, kernel_size=5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, x: Tensor) -> Tensor:
        return self.model(x)

Autoencoder abstract base class

Enhancement

An abstract base class for autoencoders that sets up a continuous entropy layer, attaches a loss to train that layer, and exposes an update method that refreshes the layer’s prior cumulative distribution function (CDF).

Motivation

Implementing popular variational autoencoder models such as the scale hyperprior, mean-scale hyperprior, and joint autoregressive and hierarchical prior models.

Alternatives

Replicate the aforementioned features in each model implementation.

References

Variational Image Compression with a Scale Hyperprior
Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, Nick Johnston
https://arxiv.org/abs/1802.01436
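
A rough shape of such a base class, under the assumption of a bottleneck module that exposes loss() and update() methods; the names here are illustrative, not an existing NeuralCompression API:

import abc

from torch import Tensor
from torch.nn import Module


class Autoencoder(Module, abc.ABC):
    def __init__(self, bottleneck: Module):
        super().__init__()

        # A continuous entropy layer shared by all derived models
        # (scale hyperprior, mean-scale hyperprior, joint autoregressive, ...).
        self.bottleneck = bottleneck

    @abc.abstractmethod
    def forward(self, x: Tensor):
        ...

    def bottleneck_loss(self) -> Tensor:
        # Auxiliary loss that trains the bottleneck's prior.
        return self.bottleneck.loss()

    def update(self, force: bool = False) -> bool:
        # Refresh the prior's quantized CDF tables used for entropy coding.
        return self.bottleneck.update(force=force)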

`GeneralizedDivisiveNormalization` layer

"""
Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the
LICENSE file in the root directory of this source tree.
"""

import functools
from typing import Callable, Optional, Union

import torch
import torch.nn.functional
from torch import Tensor
from torch.nn import Module, Parameter

from ._non_negative_parameterization import NonNegativeParameterization


class GeneralizedDivisiveNormalization(Module):
    """Generalized divisive normalization

    Implements an activation function that is a multivariate generalization of
    the following sigmoid-like function:

        y[i] = x[i] / (beta[i] + sum_j(gamma[j, i] * |x[j]|^alpha))^epsilon

    where ``i`` and ``j`` map over channels. This implementation never sums
    across spatial dimensions. It is similar to local response normalization,
    but much more flexible, as ``alpha``, ``beta``, ``gamma``, and ``epsilon``
    are trainable parameters.

    The method was originally described in:

        | “Density Modeling of Images using a Generalized Normalization
            Transformation”
        | Johannes Ballé, Valero Laparra, Eero P. Simoncelli
        | https://arxiv.org/abs/1511.06281

    and expanded in:

        | “End-to-end Optimized Image Compression”
        | Johannes Ballé, Valero Laparra, Eero P. Simoncelli
        | https://arxiv.org/abs/1611.01704

    Args:
        channels: number of channels in the input.
        inverse: compute the generalized divisive normalization response. If
            ``True``, compute the inverse generalized divisive normalization
            response (one step of fixed point iteration to invert the
            generalized divisive normalization; the division is replaced by
            multiplication).
        rectify: If ``True``, apply a ``ReLU`` non-linearity to the inputs
            before calculating the generalized divisive normalization response.
        alpha_parameter: A ``Tensor`` means that the value of ``alpha`` is
            fixed. A ``Callable`` can be used to determine the value of
            ``alpha`` as a function of some other parameter or tensor. This can
            be a ``Parameter``. ``None`` means that when the layer is
            initialized, a ``NonNegativeParameterization`` layer is created to
            train ``alpha`` (with a minimum value of ``1``). The default is a
            fixed value of ``1``.
        beta_parameter: A ``Tensor`` means that the value of ``beta`` is fixed.
            A ``Callable`` can be used to determine the value of ``beta`` as a
            function of some other parameter or tensor. This can be a
            ``Parameter``. ``None`` means that when the layer is initialized, a
            ``NonNegativeParameterization`` layer is created to train ``beta``
            (with a minimum value of ``1e-6``).
        epsilon_parameter: A ``Tensor`` means that the value of ``epsilon`` is
            fixed. A ``Callable`` can be used to determine the value of
            ``epsilon`` as a function of some other parameter or tensor. This
            can be a ``Parameter``. ``None`` means that when the layer is
            initialized, a ``NonNegativeParameterization`` layer is created to
            train ``epsilon`` (with a minimum value of 1e-6). The default is a
            fixed value of ``1``.
        gamma_parameter: A ``Tensor`` means that the value of ``gamma`` is
            fixed. A ``Callable`` can be used to determine the value of
            ``gamma`` as a function of some other parameter or tensor. This can
            be a ``Parameter``. ``None`` means that when the layer is
            initialized, a ``NonNegativeParameterization`` layer is created to
            train ``gamma``.
        alpha_initializer: initializes the ``alpha`` parameter. Only used if
            ``alpha`` is trained. Defaults to ``1``.
        beta_initializer: initializes the ``beta`` parameter. Only used if
            ``beta`` is created when initializing the layer. Defaults to ``1``.
        epsilon_initializer: initializes the ``epsilon`` parameter. Only used
            if ``epsilon`` is trained. Defaults to ``1``.
        gamma_initializer: initializes the ``gamma`` parameter. Only used if
            ``gamma`` is created when initializing the layer. Defaults to the
            identity multiplied by ``0.1``. A good default value for the
            diagonal is somewhere between ``0`` and ``0.5``. If set to ``0``
            and ``beta`` is initialized as ``1``, the layer is effectively
            initialized to the identity operation.
    """

    def __init__(
        self,
        channels: Union[int, Tensor],
        inverse: bool = False,
        rectify: bool = False,
        alpha_parameter: Optional[Union[float, int, Tensor, Parameter]] = None,
        beta_parameter: Optional[Union[float, int, Tensor, Parameter]] = None,
        epsilon_parameter: Optional[Union[float, int, Tensor, Parameter]] = None,
        gamma_parameter: Optional[Union[float, int, Tensor, Parameter]] = None,
        alpha_initializer: Optional[Callable[[Tensor], Tensor]] = None,
        beta_initializer: Optional[Callable[[Tensor], Tensor]] = None,
        epsilon_initializer: Optional[Callable[[Tensor], Tensor]] = None,
        gamma_initializer: Optional[Callable[[Tensor], Tensor]] = None,
    ):
        super(GeneralizedDivisiveNormalization, self).__init__()

        self._inverse = inverse

        self._rectify = rectify

        if alpha_parameter is None:
            if alpha_initializer is None:
                alpha_initializer = torch.ones

            self._reparameterized_alpha = NonNegativeParameterization(
                alpha_initializer(channels),
                minimum=1,
            )

            self._alpha_parameter = Parameter(
                self._reparameterized_alpha.initial_value,
            )
        else:
            self._alpha_parameter = alpha_parameter

        if beta_parameter is None:
            if beta_initializer is None:
                beta_initializer = torch.ones

            self._reparameterized_beta = NonNegativeParameterization(
                beta_initializer(channels),
                minimum=1e-6,
            )

            self._beta_parameter = Parameter(
                self._reparameterized_beta.initial_value,
            )
        else:
            self._beta_parameter = beta_parameter

        if epsilon_parameter is None:
            if epsilon_initializer is None:
                epsilon_initializer = torch.ones

            self._reparameterized_epsilon = NonNegativeParameterization(
                epsilon_initializer(channels),
                minimum=1e-6,
            )

            self._epsilon_parameter = Parameter(
                self._reparameterized_epsilon.initial_value,
            )
        else:
            self._epsilon_parameter = epsilon_parameter

        if gamma_parameter is None:
            if gamma_initializer is None:
                gamma_initializer = functools.partial(lambda x: 0.1 * torch.eye(x))

            self._reparameterized_gamma = NonNegativeParameterization(
                gamma_initializer(channels),
                minimum=0,
            )

            self._gamma_parameter = Parameter(
                self._reparameterized_gamma.initial_value,
            )
        else:
            self._gamma_parameter = gamma_parameter

    def forward(self, x: Tensor) -> Tensor:
        _, channels, _, _ = x.size()

        if self._rectify:
            x = torch.nn.functional.relu(x)

        y = torch.nn.functional.conv2d(
            x ** 2,
            torch.reshape(
                self._reparameterized_gamma(self._gamma_parameter),
                (channels, channels, 1, 1),
            ),
            self._reparameterized_beta(self._beta_parameter),
        )

        if self._inverse:
            return x * torch.sqrt(y)

        return x * torch.rsqrt(y)
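
The unit test accompanying the layer above:
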
"""
Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the
LICENSE file in the root directory of this source tree.
"""

import torch
import torch.testing

from neuralcompression.layers import GeneralizedDivisiveNormalization


class TestGeneralizedDivisiveNormalization:
    def test_backward(self):
        x = torch.rand((1, 32, 16, 16), requires_grad=True)

        normalization = GeneralizedDivisiveNormalization(32)

        normalized = normalization(x)

        normalized.backward(x)

        assert normalized.shape == x.shape

        assert x.grad is not None

        assert x.grad.shape == x.shape

        torch.testing.assert_allclose(
            x / torch.sqrt(1 + 0.1 * (x ** 2)),
            normalized,
        )

        normalization = GeneralizedDivisiveNormalization(
            32,
            inverse=True,
        )

        normalized = normalization(x)

        normalized.backward(x)

        assert normalized.shape == x.shape

        assert x.grad is not None

        assert x.grad.shape == x.shape

        torch.testing.assert_allclose(
            x * torch.sqrt(1 + 0.1 * (x ** 2)),
            normalized,
        )

survival_function op

Survival function of x. Generally defined as 1 - distribution.cdf(x).

survival_function(x: torch.Tensor, distribution: torch.distributions.Distribution) -> torch.Tensor
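
A minimal sketch; the Normal branch (using torch.special.ndtr on the reflected argument) is only there to avoid cancellation in the upper tail and is an implementation choice, not a requirement:

import torch
from torch import Tensor
from torch.distributions import Distribution, Normal


def survival_function(x: Tensor, distribution: Distribution) -> Tensor:
    if isinstance(distribution, Normal):
        return torch.special.ndtr(-(x - distribution.loc) / distribution.scale)

    return 1.0 - distribution.cdf(x)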

`HyperAnalysisTransformation` layer

from torch import Tensor
from torch.nn import Conv2d, Module, ReLU, Sequential


class HyperAnalysisTransformation(Module):
    def __init__(self, m: int, n: int):
        super(HyperAnalysisTransformation, self).__init__()

        self.model = Sequential(
            Conv2d(m, n, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
            Conv2d(n, n, kernel_size=5, stride=2, padding=2),
            ReLU(inplace=True),
            Conv2d(n, n, kernel_size=5, stride=2, padding=2),
        )

    def forward(self, x: Tensor) -> Tensor:
        return self.model(x)

log_cdf op

Logarithm of the distribution’s cumulative distribution function (CDF).

log_cdf(x: torch.Tensor, distribution: torch.distributions.Distribution) -> torch.Tensor
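
A minimal sketch, again special-casing the Normal through the numerically stable torch.special.log_ndtr (an assumption of a recent PyTorch version):

import torch
from torch import Tensor
from torch.distributions import Distribution, Normal


def log_cdf(x: Tensor, distribution: Distribution) -> Tensor:
    if isinstance(distribution, Normal):
        return torch.special.log_ndtr((x - distribution.loc) / distribution.scale)

    # Direct logarithm; loses accuracy deep in the lower tail.
    return torch.log(distribution.cdf(x))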

log_ndtr op

Logarithm of the normal cumulative distribution function (CDF).

log_ndtr(x: torch.Tensor) -> torch.Tensor
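
A sketch that defers to torch.special.log_ndtr when available (PyTorch 1.10+) and otherwise falls back to the naive logarithm, which underflows for very negative inputs:

import math

import torch
from torch import Tensor


def log_ndtr(x: Tensor) -> Tensor:
    if hasattr(torch.special, "log_ndtr"):
        return torch.special.log_ndtr(x)

    return torch.log(0.5 * torch.erfc(-x / math.sqrt(2.0)))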

lower_bound op

An op with semantics similar to torch.maximum, but with additional options for how gradients are treated when x < lower_bound, e.g.

import torch
from torch import Tensor
from torch.autograd import Function


class _LowerBound(Function):
    @staticmethod
    def forward(
        ctx,
        tensor: Tensor,
        lower_bound: float,
        gradient: str = "identity_if_towards",
    ) -> Tensor:
        if gradient not in ("disconnected", "identity", "identity_if_towards"):
            raise ValueError(f"unknown gradient option: {gradient}")

        ctx.gradient = gradient
        ctx.mask = tensor.ge(lower_bound)

        return torch.clamp(tensor, lower_bound)

    @staticmethod
    def backward(ctx, grad_output: Tensor):
        grad = grad_output

        if ctx.gradient == "identity_if_towards":
            # Pass gradients where the input was above the bound or where the
            # update would move the clamped values back toward the bound.
            grad = grad * torch.logical_or(ctx.mask, grad.lt(0.0)).type(grad.dtype)

        if ctx.gradient == "disconnected":
            grad = grad * ctx.mask.type(grad.dtype)

        # One gradient per forward input (tensor, lower_bound, gradient).
        return grad, None, None


def lower_bound(
    tensor: Tensor,
    lower_bound: float,
    gradient: str = "identity_if_towards",
) -> Tensor:
    return _LowerBound.apply(tensor, lower_bound, gradient)
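
For example (illustrative values), the default "identity_if_towards" mode blocks gradients that would push already-clamped values even further below the bound:

x = torch.full((4,), 1e-12, requires_grad=True)

y = lower_bound(x, 1e-9)  # forward pass behaves like torch.clamp(x, min=1e-9)
y.sum().backward()

print(x.grad)  # zeros: x is below the bound and the update would lower it further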

lower_tail op

Approximates lower tail quantile for range coding.

lower_tail(distribution: torch.distributions.Distribution, tail_mass: float) -> float

For range coding of random variables, the distribution tails need special handling, because range coding can only handle alphabets with a finite number of symbols. This method returns a cut-off location for the lower tail, such that approximately tail_mass probability mass is contained in the tails (together). The tails are then handled by using the overflow functionality of the range coder implementation (using a Golomb-like universal code).
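
A sketch under the assumption that the distribution implements icdf; half of tail_mass is assigned to each tail, and distributions without an inverse CDF would need a numerical search on the CDF instead:

import torch
from torch.distributions import Distribution


def lower_tail(distribution: Distribution, tail_mass: float) -> float:
    return distribution.icdf(torch.tensor(tail_mass / 2)).item()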

`HyperSynthesisTransformation` layer

from torch import Tensor
from torch.nn import Conv2d, ConvTranspose2d, Module, ReLU, Sequential


class HyperSynthesisTransformation(Module):
    def __init__(self, m: int, n: int):
        super(HyperSynthesisTransformation, self).__init__()

        self.model = Sequential(
            ConvTranspose2d(n, n, kernel_size=5, stride=2, padding=2, output_padding=1),
            ReLU(inplace=True),
            ConvTranspose2d(n, n, kernel_size=5, stride=2, padding=2, output_padding=1),
            ReLU(inplace=True),
            Conv2d(n, m, kernel_size=3, stride=1, padding=1),
            ReLU(inplace=True),
        )

    def forward(self, x: Tensor) -> Tensor:
        return self.model(x)

upper_tail op

Approximates upper tail quantile for range coding.

upper_tail(distribution: torch.distributions.Distribution, tail_mass: float) -> float

For range coding of random variables, the distribution tails need special handling, because range coding can only handle alphabets with a finite number of symbols. This method returns a cut-off location for the upper tail, such that approximately tail_mass probability mass is contained in the tails (together). The tails are then handled by using the overflow functionality of the range coder implementation (using a Golomb-like universal code).
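
The mirror-image sketch, with the same icdf assumption as lower_tail above:

import torch
from torch.distributions import Distribution


def upper_tail(distribution: Distribution, tail_mass: float) -> float:
    return distribution.icdf(torch.tensor(1.0 - tail_mass / 2)).item()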
