
incense's Introduction

Badges: Travis CI build status · LGTM code quality grade

Incense

Although automated logging of machine learning experiment results is crucial, it does not replace manual interpretation. Incense is a toolbox that facilitates the manual interpretation of experiments logged with sacred. It lets you find and evaluate experiments directly in Jupyter notebooks: you can query the database for experiments by id, name, or any hyperparameter value, and for each experiment found, display its configuration, artifacts, and metrics. Artifacts are rendered according to their type, e.g. a PNG image is displayed as an image, while a CSV file is transformed into a pandas DataFrame. Metrics are by default transformed into pandas Series, which allows for flexible plotting. Together with sacred and incense, Jupyter notebooks offer an excellent environment for interpreting experiments, as they combine code that reproducibly displays an experiment's results with text that contains the interpretation.
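
To give a flavor of the API, here is a minimal sketch; the connection URI, database name, experiment id, metric name, and artifact name are placeholders for your own setup:

from incense import ExperimentLoader

# Connect to the database that sacred's MongoObserver writes to.
loader = ExperimentLoader(
    mongo_uri="mongodb://localhost:27017",  # placeholder URI
    db_name="sacred",                       # placeholder database name
)

exp = loader.find_by_id(1)       # experiments can also be found by name or config value
print(exp.config)                # the experiment's configuration

exp.metrics["loss"].plot()       # metrics are pandas Series
exp.artifacts["confusion_matrix"].render()  # artifacts are rendered according to their type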

Installation

Install the latest release

pip install incense

Or install the latest development version

pip install git+https://github.com/JarnoRFB/incense.git

Documentation

demo.ipynb demonstrates the basic functionality of incense. You can also try it out interactively on Binder.

Contributing

We recommend using the VS Code devcontainer for development. It will automatically install all dependencies and start the necessary services, such as MongoDB and JupyterLab. See .devcontainer/docker-compose.yml for details. If the output of id -u is anything other than 1000 on your system, please add

export UID

to your .bashrc or .zshrc.

Building the container for the first time may take a while. Once inside the container, run

$ pre-commit install
$ python tests/example_experiment/conduct.py

to set up the pre-commit hooks and populate the example database.

Alternatively, you can use conda to set up your local development environment.

$ conda create -n incense-dev python=3.7
$ conda activate incense-dev
# virtualenv is required for the pre-commit environments.
$ conda install virtualenv
# tox-conda is required for using tox with conda.
$ pip install tox-conda
$ pip install -r requirements-dev.txt
$ pre-commit install

incense's People

Contributors

christian-steinmeyer · dependabot[bot] · jarnorfb · vnmabus


incense's Issues

Release in PyPI

The last release is not on PyPI. I suggest using a GitHub Action to upload to PyPI automatically on release.

Decoding numpy arrays requires importing sacred

Sacred creates its own encoder to encode numpy arrays in experiments. Thus, if sacred is not imported before obtaining an experiment with numpy arrays in the info dict, those arrays are not unpickled. I propose to import sacred, at least when unpickle is True.
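
To illustrate the workaround the report implies, here is a minimal sketch; the connection details and the info key are placeholders:

import sacred  # noqa: F401  importing sacred first registers its custom (un)pickling support

from incense import ExperimentLoader

loader = ExperimentLoader(mongo_uri="mongodb://localhost:27017", db_name="sacred")
exp = loader.find_by_id(1)
arr = exp.info["my_array"]  # without the sacred import above, this may not be decoded back to a numpy array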

Slow when loading many experiments over network

Hi,
thanks for maintaining this cool library!

One problem I have encountered is that loading hundreds of experiments, each with several metrics, takes a very long time when done over a network.

E.g. loading 300 runs takes around a minute for me.

I think the problem is that the library sends a separate request to retrieve the metrics of each run.
With 300 runs, loading the runs themselves takes only around 5 seconds (one request), while the remaining 55 seconds (300 requests) are spent loading the metrics.

Would it be possible to load the runs and the metrics jointly with one request?
MongoDB should be able to fill in the metrics by their object IDs automatically I believe.

Thanks,
Ondrej
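
As an illustration of the suggestion above, a single aggregation with $lookup could fetch runs and their metrics in one round trip. This is a sketch against sacred's default collection layout, not incense's actual implementation; the connection details and the filter are placeholders:

import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017")  # placeholder URI
db = client["sacred"]                                      # placeholder database name

# One request: fetch the runs and pull in their metric documents via $lookup.
pipeline = [
    {"$match": {"experiment.name": "my_experiment"}},  # placeholder filter
    {"$lookup": {
        "from": "metrics",          # sacred's metrics collection
        "localField": "_id",        # run id
        "foreignField": "run_id",   # metric documents reference their run
        "as": "metric_entries",
    }},
]
runs_with_metrics = list(db.runs.aggregate(pipeline))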

Deletion of experiments and caching

Hi, first of all thanks for making incense.

It seems like there is some caching issue (maybe related to #16 ?). In a notebook:

exps = loader.find_by_key("status", "FAILED")
print(len(exps))
for e in exps:
    e.delete(confirmed=True)
exps = loader.find_by_key("status", "FAILED")
print(len(exps))

outputs

714
714

Only after restarting the kernel does the query return 0 results.
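
Incense caches loader queries, so a likely workaround is to invalidate that cache before re-querying. A sketch, assuming the installed version exposes a cache_clear method on the loader:

exps = loader.find_by_key("status", "FAILED")
for e in exps:
    e.delete(confirmed=True)

loader.cache_clear()  # assumed helper that invalidates the loader's query cache
exps = loader.find_by_key("status", "FAILED")
print(len(exps))  # should now print 0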

Unify filesystem and mongo experiments and loaders

Experiments and loaders for the filesystem and mongo backends should have a similar interface. In particular, a method such as find_by_ids should also be available for the filesystem loader, even if in that case it is just a loop. This would allow a user to deal with both in the same way, using the optimized implementation for mongo when it is available.
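
A minimal sketch of the fallback loop described above; the free-standing helper is illustrative and not part of incense's API:

def find_by_ids(loader, experiment_ids):
    # Emulates the mongo loader's batched lookup on a loader that only offers find_by_id.
    return [loader.find_by_id(exp_id) for exp_id in experiment_ids]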

add artifact to experiment with Incense

Hello,
Thanks for developing this library. I use Sacred + MongoDB to store movie acquisitions from chemistry experiments, and I use Incense for post-processing of the movies. Would there be an easy way to store the post-processing results as new artifacts? Something like exp.add_artifact("path_to_temp_file")?
Thanks!
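
Until such a method exists, one way to attach new artifacts is to write to the database directly with pymongo and GridFS. The sketch below follows sacred's MongoObserver layout; the run id, file path, artifact name, and connection details are placeholders:

import gridfs
import pymongo

client = pymongo.MongoClient("mongodb://localhost:27017")
db = client["sacred"]
fs = gridfs.GridFS(db)

# Store the post-processed file in GridFS and reference it from the run document.
with open("path_to_temp_file", "rb") as f:
    file_id = fs.put(f, filename="postprocessing_result.png")
db.runs.update_one(
    {"_id": 42},  # placeholder run id
    {"$push": {"artifacts": {"name": "postprocessing_result.png", "file_id": file_id}}},
)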

Add filter method to QuerySet

Great package! Would it be difficult to add a filter method to the QuerySet object? I imagine it would work similarly to exps.project(on=["experiment.name", "config.optimizer", "config.epochs"]), except that it would take a list of key:value pairs and return a filtered list of experiment objects. I often find myself loading all experiments with a particular name and then writing ugly for loops to filter the list further based on some criteria.
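
Until a filter method exists, the same effect can be achieved with a plain comprehension over the query result; the config keys are the examples from the request, and config is assumed to behave like a dict:

exps = loader.find_by_name("my_experiment")
filtered = [
    e for e in exps
    if e.config["optimizer"] == "adam" and e.config["epochs"] >= 50  # placeholder criteria
]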

Multiple reducer in QuerySet.project

Hi, thanks for the library! Very useful for working with sacred experiments.

Is there a way to specify multiple reducers for a projection? Say I want both the min and the argmin of a metric. Right now I can get one of them with query_set.project(on=[{'metrics.loss': np.min}]) (or np.argmin), but not both, because dicts only allow unique keys. Looking at the code, it doesn't seem to be supported yet. Would be nice if it were!
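
A current workaround sketch is to skip project and reduce the metric Series directly; the metric name is an example:

import pandas as pd

rows = []
for exp in exps:
    loss = exp.metrics["loss"]  # a pandas Series indexed by step
    rows.append({"exp_id": exp.id, "min_loss": loss.min(), "argmin_loss": loss.idxmin()})
summary = pd.DataFrame(rows)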

Add interface to load experiments lazily from a query

A common pattern when loading experiments is to pick some data from each experiment (such as the accuracy) and process it (for example, by storing it in a dataframe).

With the current code, doing that requires first loading all experiments into memory and then iterating over them. This is memory-expensive, and my machine did not have enough memory to do it for a (rather large) collection of experiments.

However, a lazy way to iterate over the experiments in the query could be provided, so that only one experiment needs to be created at a time.
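
A sketch of the lazy pattern the issue asks for, emulated with find_by_id so that only one experiment is materialized per iteration; the metric name is an example, and any loader-side caching would still apply:

def iter_final_accuracy(loader, experiment_ids):
    for exp_id in experiment_ids:
        exp = loader.find_by_id(exp_id)
        yield exp_id, exp.metrics["accuracy"].iloc[-1]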
