
learned_optimization: Meta-learning optimizers and more with JAX


learned_optimization is a research codebase for training, designing, evaluating, and applying learned optimizers, and for meta-training of dynamical systems more broadly. It implements hand-designed and learned optimizers, tasks to meta-train and meta-test them, and outer-training algorithms such as ES, PES, and truncated backprop through time.
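
For a flavor of the API, here is a minimal sketch based on the tutorial notebooks (the task class and the init/loss/datasets usage below follow the library's Task interface; treat the exact names as illustrative):

import jax
from learned_optimization.tasks import fixed_mlp

# Tasks bundle an inner problem: parameter init, a loss, and data iterators.
key = jax.random.PRNGKey(0)
task = fixed_mlp.FashionMnistRelu32_8()  # a small MLP on Fashion-MNIST
params = task.init(key)                  # initialize inner-problem parameters
batch = next(task.datasets.train)        # fetch one training batch
loss = task.loss(params, key, batch)     # scalar training loss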

To get started see our documentation.

Quick Start Colab Notebooks

Our documentation can also be run as Colab notebooks! We recommend running these notebooks with a free accelerator (TPU or GPU) in Colab (go to Runtime -> Change runtime type).

learned_optimization tutorial sequence

  1. Introduction : Open In Colab
  2. Creating custom tasks: Open In Colab
  3. Truncated Steps: Open In Colab
  4. Gradient estimators: Open In Colab
  5. Meta training: Open In Colab
  6. Custom learned optimizers: Open In Colab

Build a learned optimizer from scratch

Simple, self-contained, learned optimizer example that does not depend on the learned_optimization library: Open In Colab
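
For orientation, here is a minimal, dependency-free sketch of the core idea (illustrative only, not the notebook's exact code): a learned optimizer is a small MLP, with meta-parameters theta, that maps per-parameter features such as the gradient to an update.

import jax
import jax.numpy as jnp

def init_lopt(key, hidden=32):
  # Meta-parameters of the learned optimizer: a tiny 1-in/1-out MLP.
  k1, k2 = jax.random.split(key)
  return {"w1": jax.random.normal(k1, (1, hidden)) * 0.1,
          "b1": jnp.zeros(hidden),
          "w2": jax.random.normal(k2, (hidden, 1)) * 0.01,
          "b2": jnp.zeros(1)}

def lopt_update(theta, params, grads):
  # Apply the learned optimizer: each scalar gradient becomes a length-1
  # feature vector, and the MLP output is that parameter's update.
  def per_tensor(p, g):
    feats = g.reshape(-1, 1)
    h = jax.nn.relu(feats @ theta["w1"] + theta["b1"])
    step = (h @ theta["w2"] + theta["b2"]).reshape(p.shape)
    return p + 0.01 * step  # small output scale keeps early training stable
  return jax.tree_util.tree_map(per_tensor, params, grads)

theta = init_lopt(jax.random.PRNGKey(0))
params = {"w": jnp.ones((3, 2))}
grads = {"w": jnp.full((3, 2), 0.5)}
params = lopt_update(theta, params, grads)

Meta-training then optimizes theta so that inner-problem loss is low after many applications of lopt_update, e.g. with ES or truncated backprop through the unroll.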

Local Installation

We strongly recommend using virtualenv to work with this package.

pip3 install virtualenv
git clone git@github.com:google/learned_optimization.git
cd learned_optimization
python3 -m venv env
source env/bin/activate
pip install -e .

Train a learned optimizer example

To train a learned optimizer on a simple inner-problem, run the following:

python3 -m learned_optimization.examples.simple_lopt_train --train_log_dir=/tmp/logs_folder --alsologtostderr

This will first use tfds to download the data, then start training. After a few minutes you should see metrics printed.

TensorBoard can be pointed at this directory to visualize results. Note that training will run very slowly without an accelerator.
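
For example, assuming TensorBoard is installed:

tensorboard --logdir=/tmp/logs_folder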

Need help? Have a question?

File a GitHub issue! We will do our best to respond promptly.

Publications which use learned_optimization

Wrote a paper or blog post that uses learned_optimization? Add it to the list!

Development / Running tests

We locate test files next to the related source as opposed to in a separate tests/ folder. Each test can be run directly, or with pytest (e.g. python3 -m pytest learned_optimization/outer_trainers/). Pytest can also be used to run all tests with python3 -m pytest, but this will take quite some time.

If something is broken please file an issue and we will take a look!

Citing learned_optimization

To cite this repository:

@inproceedings{metz2022practical,
  title={Practical tradeoffs between memory, compute, and performance in learned optimizers},
  author={Metz, Luke and Freeman, C Daniel and Harrison, James and Maheswaranathan, Niru and Sohl-Dickstein, Jascha},
  booktitle = {Conference on Lifelong Learning Agents (CoLLAs)},
  year = {2022},
  url = {http://github.com/google/learned_optimization},
}

Disclaimer

learned_optimization is not an official Google product.

learned_optimization's People

Contributors

adarob, cdfreeman-google, hawkinsp, ivyzx, jharrison42, lukemetz, oscarcarli, rchen152, sinopalnikov, sohl-dickstein, yashk2810, yilei


learned_optimization's Issues

Wrong implementation of hyper_v2 mix_layers

Hi,
In VeLO (https://arxiv.org/pdf/2211.09760.pdf) Section B.3, it states that mixing is done by F0(x) + max(σ(F1(σ(F2(x)))), axis=0, keep_dims=True).
However, in the implementation of hyper_v2 (https://github.com/google/learned_optimization/blob/main/learned_optimization/research/general_lopt/hyper_v2.py#L330-L335), it essentially uses only one linear layer instead of two, because the input to the second linear layer is x instead of mix_layer (L332).
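
To make the reported discrepancy concrete, here is an illustrative sketch of the two behaviors (hypothetical stand-alone functions, not the actual hyper_v2 code; σ is taken to be a sigmoid for concreteness):

import jax
import jax.numpy as jnp

def mix_paper(x, F0, F1, F2):
  # Section B.3: F0(x) + max(σ(F1(σ(F2(x)))), axis=0, keep_dims=True)
  inner = jax.nn.sigmoid(F2(x))
  return F0(x) + jnp.max(jax.nn.sigmoid(F1(inner)), axis=0, keepdims=True)

def mix_as_reported(x, F0, F1, F2):
  # What the report says hyper_v2 does: F1 sees x rather than σ(F2(x)),
  # so F2's output never reaches the second layer.
  return F0(x) + jnp.max(jax.nn.sigmoid(F1(x)), axis=0, keepdims=True)

F0 = F1 = F2 = lambda x: x  # stand-in "linear layers" for a smoke test
x = jnp.ones((4, 3))
print(mix_paper(x, F0, F1, F2).shape)  # (4, 3)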

colab demo error

The demo colab pt1.introduction comes up with an error in the second code block:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.9.0 requires jedi>=0.10, which is not installed.

Colab link not working

The link in the following section doesn't work:

Build a learned optimizer from scratch

Simple, self-contained, learned optimizer example that does not depend on the learned_optimization library: Open In Colab

The link goes to [https://colab.research.google.com/github/google/learned_optimization/blob/main/docs/notebooks/no_dependency_learned_optimizer.ipynb.ipynb]. There's a trailing .ipynb that should not be there.

Very Large Memory Consumption for Even A Small Dataset

Dataset: fashion_mnist
Dataset Size: 36.42MB (https://www.tensorflow.org/datasets/catalog/fashion_mnist)

Reproduce the Issue:

from learned_optimization.tasks import fixed_mlp
task = fixed_mlp.FashionMnistRelu32_8()

or

from learned_optimization.tasks.datasets import base

batch_size = 128
image_size = (8, 8)
splits = ("train[0:80%]", "train[80%:90%]", "train[90%:]", "test")
stack_channels = 1

dataset = base.preload_tfds_image_classification_datasets(
    "fashion_mnist",
    splits,
    batch_size=batch_size,
    image_size=image_size,
    stack_channels=stack_channels)

Issue Description:
As you can see, the original Fashion-MNIST dataset is very small. However, when I run the code above, memory usage becomes extremely high (10 GB+).

In my case, the issue occurs when the program reaches this line in the function preload_tfds_image_classification_datasets:

  return Datasets(
      *[make_python_iter(split) for split in splits],
      extra_info={"num_classes": num_classes})

Here is the code of make_python_iter:

  def make_python_iter(split: str) -> Iterator[Batch]:
    # load the entire dataset into memory
    dataset = tfds.load(datasetname, split=split, batch_size=-1)
    data = tfds.as_numpy(_image_map_fn(cfg, dataset))

    # use a python iterator as this is faster than TFDS.
    def generator_fn():

      def iter_fn():
        batches = data["image"].shape[0] // batch_size
        idx = onp.arange(data["image"].shape[0])
        while True:
          # every epoch, shuffle indices
          onp.random.shuffle(idx)
          for bi in range(0, batches):
            idxs = idx[bi * batch_size:(bi + 1) * batch_size]

            def index_into(idxs, x):
              return x[idxs]

            yield jax.tree_map(functools.partial(index_into, idxs), data)

      return prefetch_iterator.PrefetchIterator(iter_fn(), prefetch_batches)

    return ThreadSafeIterator(LazyIterator(generator_fn))

Could you please suggest a way to reduce the huge memory usage? Do you have any idea why it requires so much memory, and has anyone else run into this issue?

Thank you very much and looking forward to your comments.
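
One possible workaround, offered only as a sketch (not an official fix): stream batches with tf.data instead of materializing the whole split in host memory, which is what batch_size=-1 in the preloading path does.

import tensorflow_datasets as tfds

def streaming_iter(name="fashion_mnist", split="train", batch_size=128):
  # Stream from disk rather than loading the full split into memory.
  ds = tfds.load(name, split=split)
  ds = ds.repeat().shuffle(10_000).batch(batch_size).prefetch(2)
  return iter(tfds.as_numpy(ds))

batch = next(streaming_iter())  # dict of numpy arrays: "image", "label"

This trades the preloading path's fast in-memory indexing for much lower memory use.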

Error while running image_test.py

Hi!
I was having a look at the test functions of the datasets module to try to create my own task & dataset, and I get an error while running image_test.py
ValueError: Dataset imagenet2012_16 with split train doesn't appear to be preprocessed? Please run dataset creation.

I have checked base.py, but it is not clear to me how to run the dataset creation.
Thanks

PyTorch port?

Any plans to do this? If not, I might be interested in working on it.

Issue with the Demo_for_training_a_model_with_a_learned_optimizer.ipynb

Upon running the notebook as is, I observe an AttributeError: 'tuple' object has no attribute 'dtype' in the 'Training Resnets with VeLO' section, at the 114th line of its 2nd cell, i.e.

state = solver.init_state(params, L2REG, next(test_ds), batch_stats)

The return value of the init_state method of OptaxSolver is an OptaxState:

OptaxState(iter_num=jnp.asarray(0),
           value=jnp.asarray(jnp.inf, value.dtype),
           error=jnp.asarray(jnp.inf, dtype=params_dtype),
           aux=aux,
           internal_state=opt_state)

I cannot tell whether value is a tuple or a scalar, or why this error keeps occurring. Please help me solve it.
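
One hedged guess at the cause (the notebook cannot be reproduced here): if the loss function passed to jaxopt.OptaxSolver returns a (loss, aux) tuple, for example (loss, batch_stats), then value inside init_state is that tuple and value.dtype fails. jaxopt's OptaxSolver accepts has_aux=True to unpack such a return value. A minimal self-contained illustration:

import jax.numpy as jnp
import jaxopt
import optax

def loss_fun(params, x):
  loss = jnp.sum((params * x) ** 2)
  return loss, {"batch_stats": loss}  # (value, aux), like the notebook's loss

solver = jaxopt.OptaxSolver(opt=optax.adam(1e-3), fun=loss_fun,
                            maxiter=10, has_aux=True)
state = solver.init_state(jnp.ones(3), jnp.arange(3.0))  # no dtype error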

Notebook Not Found

The notebooks listed in the README don't seem to exist. I don't know if this is intended, but the links don't work.

