butterfly's Introduction

NOTE: The Rhoana pipeline is still under development, and should not be considered stable.

Rhoana - Dense Automated Neuron Annotation Pipeline

Prerequisites:
numpy          http://numpy.org
scipy          http://scipy.org
h5py           http://www.h5py.org/
mahotas        http://luispedro.org/software/mahotas
OpenCV         http://opencv.org/
pymaxflow      https://github.com/Rhoana/pymaxflow
fast64counter  https://github.com/Rhoana/fast64counter
CPLEX          http://www.ibm.com/software/integration/optimization/cplex-optimizer/


The Rhoana pipeline operates in the following stages:
Classify Membranes
Segmentation
Block dicing
Window Fusion
Pairwise Matching
Local and Global Remapping

A simple driver program is in Control/driver.py.  It takes as input a
file containing a list of images to process.  These should be aligned
EM sections.
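
A minimal invocation might look like the following (a sketch; it assumes
driver.py takes the list file as its only argument):

    # Sketch: write an image list and hand it to the driver.
    import subprocess

    sections = ["section_%04d.png" % i for i in range(100)]  # aligned EM sections
    with open("images.txt", "w") as f:
        f.write("\n".join(sections))

    subprocess.check_call(["python", "Control/driver.py", "images.txt"])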

ClassifyMembranes/classify_image takes three arguments:
  - image file
  - classifier file (an example is in ClassifyMembranes/GB_classifier.txt)
  - output HDF5
The HDF5 output will contain a single dataset, "probabilities", which
holds the per-pixel membrane probabilities.
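
For example, the output can be inspected with h5py (a quick sketch; the
filename is a placeholder):

    # Sketch: read the per-pixel membrane probabilities written by classify_image.
    import h5py

    with h5py.File("membranes.hdf5", "r") as f:   # placeholder output filename
        probs = f["probabilities"][...]           # 2D array, one probability per pixel
    print(probs.shape, probs.min(), probs.max())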

Segment/segment.py takes two arguments:
  - probabilities HDF5
  - output segmentations HDF5
Output will contain two datasets, "segmentations" and
"probabilities".  The first is of size IxJxN, with I,J the image
dimensions and N the number of generated segmentations (at various
scales and smoothness levels; N = 30 in the current implementation). The
"probabilities" dataset is just copied from the input.

Control/dice_block.py takes a number of arguments:
- imin, jmin, imax, jmax - the IJ coordinates of the block
- output.hdf5
- K segmentation HDF5 files
This will cut out a block roughly as:
     np.stack([seg[imin:imax, jmin:jmax, :] for seg in segs], axis=3)
(and a similar block for the per-pixel probabilities).
It will produce two datasets, "segmentations" and "probabilities".
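
A self-contained sketch of the dicing step (filenames and K are
placeholders; probabilities are handled analogously):

    # Sketch: cut an IxJxNxK block out of K segmentation HDF5 files.
    import h5py
    import numpy as np

    seg_paths = ["seg_%02d.hdf5" % k for k in range(4)]   # K input files
    imin, jmin, imax, jmax = 0, 0, 512, 512

    blocks = []
    for path in seg_paths:
        with h5py.File(path, "r") as f:
            blocks.append(f["segmentations"][imin:imax, jmin:jmax, :])

    with h5py.File("block.hdf5", "w") as out:             # placeholder output
        out["segmentations"] = np.stack(blocks, axis=3)   # shape (I, J, N, K)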

WindowFusion/window_fusion_cpx.py takes two arguments:
- input block.hdf5
- output fusedblock.hdf5
This will run window fusion to reduce the IxJxNxK block to a labeled
IxJxK block.  Two datasets are produced, "labels" and "probabilities".
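
A dimensional sanity check (a sketch; filenames are placeholders):

    # Sketch: confirm fusion reduced the IxJxNxK block to a labeled IxJxK block.
    import h5py

    with h5py.File("block.hdf5", "r") as inp, \
         h5py.File("fusedblock.hdf5", "r") as out:
        I, J, N, K = inp["segmentations"].shape
        assert out["labels"].shape == (I, J, K)   # the N candidates are resolved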

PairwiseMatching/pairwise_match.py takes six arguments:
- two input fused blocks
- the direction in which they overlap (X = 1, Y = 2, Z = 3; this mapping
  may currently be inaccurate)
- the number of pixels they overlap
- two output HDF5 fused blocks
Pairwise matching produces "labels", "probabilities", and "merges"
datasets.  The first block should always be closer to 0,0,0.  The
usual method is to run it first for all X-even blocks matching to
their X+1 neighbor, then all X-odd blocks matching to their X+1
neighbor, then do the same for Y, then Z.  After Pairwise Matching,
overlapping regions should be consistent.  "merges" is Lx2, with each
row indicating that two labels should be merged in the final result.
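
The even/odd schedule described above might be expressed like this (a
sketch; the match() callback is hypothetical and stands in for one run
of pairwise_match.py on a pair of neighboring blocks):

    # Sketch of the usual match order: even blocks, then odd, per axis.
    def schedule(nx, ny, nz, match):
        dims = (nx, ny, nz)
        for direction in (1, 2, 3):               # X, then Y, then Z
            axis = direction - 1
            for parity in (0, 1):                 # even blocks first, then odd
                for x in range(nx):
                    for y in range(ny):
                        for z in range(nz):
                            block = (x, y, z)
                            i = block[axis]
                            if i % 2 == parity and i + 1 < dims[axis]:
                                nbr = list(block)
                                nbr[axis] += 1    # the +1 neighbor; block stays
                                                  # closer to (0, 0, 0)
                                match(block, tuple(nbr), direction)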

(There is a similar program, pairwise_match_region_growing.py, that
uses region growing in the probability maps for overlapping regions.)

Relabeling/concatenate_joins.py takes multiple matched blocks and
extracts their merges, and Relabeling/create_global_map.py
processes the full list of merges to create the final remap function.
Relabeling/remap_block.py takes this global remap and a single block,
and produces the remapped block.
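
Applying the global remap to one block might look like this (a sketch;
it assumes the map is stored as a dense old-label -> new-label lookup
table, and filenames are placeholders):

    # Sketch: remap one block's labels through a global lookup table.
    import h5py

    with h5py.File("global_map.hdf5", "r") as f:   # assumed storage layout
        remap = f["remap"][...]                    # remap[old_label] = new_label
    with h5py.File("fusedblock.hdf5", "r") as f:
        labels = f["labels"][...]
    with h5py.File("remapped.hdf5", "w") as f:
        f["labels"] = remap[labels]                # vectorized relabeling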

Relabeling/extract_label_plane.py takes the following arguments:
- the output hdf5 path
- its IxJ size (same as the original image)
- a Z offset for the plane within the input blocks
- a set of (ibase, jbase, input block HDF5)
Extract Label Plane performs roughly the following action:
  for ibase, jbase, infile in args:
     # take the Z-th label plane from each input block ...
     input_data = infile['labels'][:, :, Z]
     # ... and paste it at its (ibase, jbase) offset in the output plane
     output_labels[ibase:ibase+input_data.shape[0],
                   jbase:jbase+input_data.shape[1]] = input_data

butterfly's People

Contributors

eagonmeng, haehn, jtriley, leekamentsky, quantifiedcode-bot, thejohnhoffer

butterfly's Issues

Automatic URL shortening

It would be lovely (though not high priority) to have something a bit more compact for sharing purposes.

entity_feature

This API call should give certain features for a given segment ID.

  • identify which features we need, and which should be pre-calculated versus calculated on the fly
  • distinguish from the anatomical DB (some features will be in the database) / maybe all features will be in the DB
  • add the API call

import computation stage

We need to compute some image features or characteristics during import of data and then store them in the DB or somewhere.

  1. add a generalized (modular) framework to have certain actions run during import
  • bfly starts, detects new data, computes things if required
  • shall this be executed as a separate thread? do the nipype wheels help (https://github.com/FNNDSC/wheels)? or is this all too complicated, and should we wait until computation is done before starting to serve BFLY?
  2. which actions do we need? (probably later; framework first)

support json files for all tilesources

Currently, any channel can be specified by the path to the actual data, but the hdf5 tilesource also allows the channel to be specified with a path to a json file containing the path to the actual data.

If we accept a json channel for any tilesource, then we will be able to store specific information about each channel in the json file. This would be one solution for specifying whether a particular channel is a mask.
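
A channel json of this kind might look like the following (hypothetical
field names; the actual schema is whatever the tilesource expects):

    # Hypothetical channel json, written from Python for illustration.
    import json

    channel = {
        "path": "/data/em/channel.h5",   # path to the actual data
        "mask": True,                    # per-channel metadata, e.g. a mask flag
    }
    with open("channel.json", "w") as f:
        json.dump(channel, f, indent=2)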

Circle CI builds failing

This comes from the inability of requirements.txt to properly manage dependencies. We'll need to remove this file and have CircleCI manage dependencies with setup.py.

Right now the CircleCI builds are only happening against the thejohnhoffer fork since I can't add the integration to the main repository without admin access.

But I've added the badges to the readme files of the main repository regardless.

rc butterfly installation

  • fix it with existing data
  • symlink everything to one folder on coxfs
  • make sure that if we have new data, we can easily re-trigger auto-detection (to test, maybe completely remove the rc config)

No autoreload in production

We should add something to the rh-config to disable tornado autoreload of the server when source code changes.

api/mask query

Currently we store masks as a channel, but the spec requests a direct API call (api/mask).

The current way of requesting this:
bfly/api/data?experiment=X&dataset=Y&sample=Z&channel=MASKNAME

Idea: keep the current structure and just proxy it through the new api/mask request (we need to identify what the MASKNAME is).
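
Such a proxy might be sketched as follows (hypothetical; parameter names
are taken from the URL above, and MASKNAME resolution is a placeholder):

    # Sketch: build the existing api/data URL that an api/mask request
    # would proxy to.
    from urllib.parse import urlencode

    def mask_url(experiment, dataset, sample, mask_name):
        query = urlencode({"experiment": experiment, "dataset": dataset,
                           "sample": sample, "channel": mask_name})
        return "bfly/api/data?" + query

    print(mask_url("X", "Y", "Z", "MASKNAME"))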
