
flowdec's Introduction


Flowdec

*Note*: This library is no longer actively maintained and requires older versions of Python and TensorFlow to run. If you have point spread functions already, then cucim.skimage.restoration.richardson_lucy is another implementation worth considering. If not, then the utilities in this library for generating them may still be useful.


Flowdec is a library containing TensorFlow (TF) implementations of image and signal deconvolution algorithms. Currently, only Richardson-Lucy Deconvolution has been implemented but others may come in the future.

Flowdec is designed to construct and execute TF graphs in Python, as well as to export frozen graphs that can be used from other languages (e.g. Java).

Here are some of the current features, advantages, and disadvantages of the project:

Highlights

  • Support for Windows, Mac, and Linux - TensorFlow runs on all of these platforms, so Flowdec does too.
  • Client Support for Java, Go, C++, and Python - Using Flowdec graphs from Python and Java has been tested, but they could theoretically be used from any of the TensorFlow API Client Libraries.
  • Point Spread Functions - PSFs can be defined as JSON configuration files and generated dynamically during the deconvolution process using a Fast Gibson-Lanni Approximation Model (which can also produce Born & Wolf kernels as a degenerate case).
  • GPU Acceleration - Executing TensorFlow graphs on GPUs is trivial and happens by default with Flowdec as long as the TensorFlow GPU requirements are met (i.e. CUDA Toolkit installed, Nvidia drivers, etc.).
  • Performance - Other open source and commercial deconvolution libraries offer only partial GPU acceleration, which generally means that just the FFT and iFFT operations run on the GPU while all other operations run on the CPU. For a roughly 1000x1000x11 3D volume with a PSF of the same dimensions, this means execution times look like:
    • CPU-only solutions: 10 minutes
    • Other solutions with FFT/iFFT GPU acceleration: ~40 seconds
    • Flowdec/TensorFlow with full GPU acceleration: ~1 second
  • Signal Dimensions - Flowdec can support 1, 2, or 3 dimensional images/signals.
  • Multi-GPU Usage - This has yet to be tested, but theoretically this is possible since TF can do it (and this Multi-GPU Example is a start).
  • Image Preprocessing - A trickier part of deconvolution implementations is the image padding and cropping needed to use faster FFT implementations. In Flowdec, images are padded by reflection along each axis, either by manually specified amounts or by letting Flowdec automatically round each dimension up to the nearest power of 2 (which enables the faster Cooley-Tukey algorithm instead of the Bluestein algorithm provided by the Nvidia cuFFT library used by TF).
  • Visualizing Iterations - Another difficulty with iterative deconvolution algorithms is determining when they should stop. With Richardson-Lucy this is usually done somewhat subjectively, by visualizing results at different iteration counts, and Flowdec helps with this by accepting observer functions that receive intermediate results of the deconvolution so they can be written out to image sequences or stacks for manual inspection (a minimal sketch follows this list). Future work may include using Tensorboard for this instead, but so far it has been difficult to get image summaries working within TF "while" loops.
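
For example, here is a minimal observer sketch based on the observer_fn usage shown in the issues further down this page (the toy volume, PSF, and output path are placeholders for illustration only):

import numpy as np
from skimage import io
from flowdec import data as fd_data
from flowdec import restoration as fd_restoration

# Toy volume and PSF just to make the sketch self-contained
data = np.random.poisson(5, size=(16, 64, 64)).astype(np.float32)
kernel = np.zeros((16, 64, 64), dtype=np.float32)
kernel[8, 32, 32] = 1.0

# Called with each intermediate estimate and its iteration index
def observer(img, i, *args):
    if i % 5 == 0:
        # Write every 5th intermediate result to disk for manual inspection
        io.imsave('/tmp/decon_iter_{:03d}.tif'.format(i), img.astype(np.float32))

algo = fd_restoration.RichardsonLucyDeconvolver(n_dims=3, observer_fn=observer).initialize()
res = algo.run(fd_data.Acquisition(data=data, kernel=kernel), niter=30).data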

Disadvantages

  • No GUIs - Flowdec is intended for use by those familiar with programming, but future work might include an ImageJ plugin (if there's interest in that). For those looking for something more interactive, imagej-ops is likely your best bet; it currently supports the same PSF generation model used in Flowdec as well as Richardson-Lucy deconvolution. At the moment it doesn't include full GPU acceleration, but that may be on the way as part of imagej-ops-experiments. See this github issue for more details.
  • No Blind Deconvolution - Currently, nothing in this arena has been attempted, but since much recent research on this subject is centered around deep learning, TensorFlow will hopefully make for a good platform for it in the future.

Basic Usage

Here is a basic example demonstrating how Flowdec can be used in a single 3D image deconvolution:

See full example notebook here

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from skimage import exposure
from scipy import ndimage, signal
from flowdec import data as fd_data
from flowdec import restoration as fd_restoration

# Load "Purkinje Neuron" dataset downsampled from 200x1024x1024 to 50x256x256
# See: http://www.cellimagelibrary.org/images/CCDB_2
actual = fd_data.neuron_25pct().data
# actual.shape = (50, 256, 256)

# Create a gaussian kernel that will be used to blur the original acquisition
kernel = np.zeros_like(actual)
for offset in [0, 1]:
    kernel[tuple((np.array(kernel.shape) - offset) // 2)] = 1
kernel = ndimage.gaussian_filter(kernel, sigma=1.)
# kernel.shape = (50, 256, 256)

# Convolve the original image with our fake PSF
data = signal.fftconvolve(actual, kernel, mode='same')
# data.shape = (50, 256, 256)

# Run the deconvolution process and note that deconvolution initialization is best kept separate from 
# execution since the "initialize" operation corresponds to creating a TensorFlow graph, which is a 
# relatively expensive operation and should not be repeated across multiple executions
algo = fd_restoration.RichardsonLucyDeconvolver(data.ndim).initialize()
res = algo.run(fd_data.Acquisition(data=data, kernel=kernel), niter=30).data

fig, axs = plt.subplots(1, 3)
axs = axs.ravel()
fig.set_size_inches(18, 12)
center = tuple([slice(None), slice(10, -10), slice(10, -10)])
titles = ['Original Image', 'Blurred Image', 'Reconstructed Image']
for i, d in enumerate([actual, data, res]):
    img = exposure.adjust_gamma(d[center].max(axis=0), gamma=.2)
    axs[i].imshow(img, cmap='Spectral_r')
    axs[i].set_title(titles[i])
    axs[i].axis('off')

Neuron Example

As a more realistic use case, here is an example showing how a point spread function configuration can be used in a headless deconvolution:

See full deconvolution script here

# Generate a configuration file containing PSF parameters (see flowdec.psf module for more details)
echo '{"na": 0.75, "wavelength": 0.425, "size_z": 32, "size_x": 64, "size_y": 64}' > /tmp/psf.json

# Invoke deconvolution script with the above PSF configuration and an input dataset to deconvolve.
# If flowdec has been installed, you may run the "deconvolution" command instead.
python examples/scripts/deconvolution.py \
--data-path=flowdec/datasets/bars-25pct/data.tif \
--psf-config-path=/tmp/psf.json \
--output-path=/tmp/result.tif \
--n-iter=25 --log-level=DEBUG
> DEBUG:Loaded data with shape (32, 64, 64) and psf with shape (32, 64, 64)
> INFO:Beginning deconvolution of data file "flowdec/datasets/bars-25pct/data.tif"
> INFO:Deconvolution complete (in 7.427 seconds)
> INFO:Result saved to "/tmp/result.tif"
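
For reference, the same PSF can also be generated directly in Python rather than via a JSON file. This is a minimal sketch that assumes the flowdec.psf module exposes a GibsonLanni class accepting the same keyword arguments as the JSON keys above along with a generate() method; check flowdec.psf for the authoritative parameter list:

from flowdec import psf as fd_psf

# Same parameters as the JSON configuration above
kernel = fd_psf.GibsonLanni(na=0.75, wavelength=0.425, size_z=32, size_x=64, size_y=64).generate()
# kernel.shape = (32, 64, 64)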

Examples

Python

Java

  • Multi-GPU Example - Prototype example for how to be able to execute deconvolution against multiple GPUs in parallel (not tested yet -- waiting for the use case to come up though it is very likely possible to do)

Installation

The project can be installed, ideally in a python 3.6 environment (though it should work in 3.5 too), by running:

pip install flowdec[tf_gpu]

The previous command installs flowdec and also ensures that tensorflow is installed with GPU support. To install the non-GPU version of tensorflow instead (e.g. for testing), run:

pip install flowdec[tf]

If neither [tf] nor [tf_gpu] are specified, tensorflow installation is left as an externally managed prerequisite.
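
To check whether the installed TensorFlow build can actually see a GPU, a quick sanity check using the TF 1.x API targeted by this library (newer TensorFlow versions expose this differently) is:

import tensorflow as tf

# Prints True only if a CUDA-capable GPU is visible to this TensorFlow build
print(tf.test.is_gpu_available())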

Alternatively, the project can be installed from source as follows:

git clone https://github.com/hammerlab/flowdec.git
cd flowdec/python
pip install -e .

Docker Instructions

A local docker image can be built by running:

cd flowdec  # Note: not flowdec/docker, just cd flowdec

docker build --no-cache -t flowdec -f docker/Dockerfile .

# If on a system that supports nvidia-docker, the GPU-enabled version can be built instead via:
# nvidia-docker build --no-cache -t flowdec -f docker/Dockerfile.gpu .

The image can then be run using:

# Run in foreground (port mapping is host:container, so change the host port if 8888 is already taken)
docker run -ti -p 8888:8888 flowdec

# Run in background
docker run -td -p 8888:8888 --name flowdec flowdec
docker exec -it flowdec /bin/bash # Connect to the running container

The Flowdec Dockerfile extends the TensorFlow DockerHub images, so usage is similar: running the container in the foreground automatically starts a Jupyter notebook server and prints a link for connecting to it from a browser on the host system.

The previous image is built from the current master branch of github.com/hammerlab/flowdec.git. To build an image using your local copy of the source instead, you can use this command:

docker build --no-cache -t flowdec -f docker/Dockerfile.devel .

You may want to combine this with a bind mount of your local source tree into the running container. This lets you edit the source and have the changes take effect immediately in the running container.

LOCAL_SRC=$(pwd)
DEST_SRC=/repos/flowdec

docker run -ti -p 8888:8888 -v ${LOCAL_SRC}:${DEST_SRC} flowdec

Validation

By and large, the purpose of this project is to attain near equivalence with a subset of the functionality provided by both DeconvolutionLab2 and PSFGenerator via much faster implementations.

To validate that this has been accomplished, there are two notebooks in the python/validation folder demonstrating the following:

  • Deconvolution Validation - This notebook aggregates results from Flowdec and DeconvolutionLab2 applied to several reference datasets and verifies that deconvolved volumes are very nearly identical
  • PSF Generation Validation - This notebook aggregates results from Flowdec and PSFGenerator used to generate PSFs from a variety of different configurations and evaluates their similarity (which is also very high)

Acknowledgements

Thanks to Kyle Douglass for explaining some of the finer aspects of this Python Gibson-Lanni PSF generator, Jizhou Li for helping to better understand that diffraction model, Hadrien Mary for giving great context on the state of open-source deconvolution libraries, and Brian Northan for lending great advice/context on library performance, blind deconvolution and how point spread functions work in general.

Citation

To cite Flowdec, please use this reference:

@article {Czech460980,
	author = {Czech, Eric and Aksoy, Bulent Arman and Aksoy, Pinar and Hammerbacher, Jeff},
	title = {Cytokit: A single-cell analysis toolkit for high dimensional fluorescent microscopy imaging},
	elocation-id = {460980},
	year = {2018},
	doi = {10.1101/460980},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2018/12/14/460980},
	eprint = {https://www.biorxiv.org/content/early/2018/12/14/460980.full.pdf},
	journal = {bioRxiv}
}

References

  • [1] D. Sage, L. Donati, F. Soulez, D. Fortun, G. Schmit, A. Seitz, R. Guiet, C. Vonesch, M. Unser
    DeconvolutionLab2: An Open-Source Software for Deconvolution Microscopy
    Methods - Image Processing for Biologists, 115, 2017.
  • [2] J. Li, F. Xue and T. Blu
    Fast and accurate three-dimensional point spread function computation for fluorescence microscopy
    J. Opt. Soc. Am. A, vol. 34, no. 6, pp. 1029-1034, 2017.
  • [3] Brandner, D. and Withers, G.
    The Cell Image Library, CIL: 10106, 10107, and 10108.
    Available at http://www.cellimagelibrary.org. Accessed December 08, 2010.

flowdec's People

Contributors

chrisroat, eczech, eric-czech, hammer, rs-gh-sa, russellb, volkerh


flowdec's Issues

Impractical GPU memory requirements

While indeed extremely fast, the GPU memory requirement is impractical on my setup: about 8 GB for a 1024x1024x19 image (16-bit) and a tiny 32x32x16 PSF. For images slightly above 1024x1024 (same number of Z slices), I can only run the code on an RTX 3090 (24 GB)!

The problem seems to stem from the FFT CUDA kernel. The error reported is:

tensorflow/stream_executor/cuda/cuda_fft.cc:253] failed to allocate work area.
tensorflow/stream_executor/cuda/cuda_fft.cc:430] Initialize Params: rank: 3 elem_count: 32 input_embed: 32 input_stride: 1 input_distance: 536870912 output_embed: 32 output_stride: 1 output_distance: 536870912 batch_count: 1
tensorflow/stream_executor/cuda/cuda_fft.cc:439] failed to initialize batched cufft plan with customized allocator:

Something is probably not right in the code... does anybody know of a workaround?

Problem with running notebooks in series

Perhaps obvious to others, but I (being a python noob) was struggling to figure out why one notebook was working, but not a second.

I think the answer is: TensorFlow GPU will "map nearly all of the GPU memory of all GPUs". This of course prevents a second notebook from running after the first one is finished, because the first notebook has not released the GPU resources. Once I closed that first notebook, the second one worked.

Am I missing the obvious way to unload tensorflow besides restarting/killing the python kernel?
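
One partial mitigation, sketched below using the session_config option that appears elsewhere in these issues, is to have TensorFlow allocate GPU memory on demand rather than reserving it all up front; note that the memory is still only released when the owning kernel or process exits (the stand-in volume and PSF are placeholders):

import numpy as np
import tensorflow as tf
from flowdec import data as fd_data
from flowdec import restoration as fd_restoration

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed instead of all at once

data = np.random.poisson(5, size=(16, 128, 128)).astype(np.float32)  # stand-in volume
psf = np.ones((3, 9, 9), dtype=np.float32) / (3 * 9 * 9)             # stand-in PSF

algo = fd_restoration.RichardsonLucyDeconvolver(n_dims=3).initialize()
res = algo.run(fd_data.Acquisition(data=data, kernel=psf), niter=25, session_config=config).data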

Release flowdec on pypi

Instead of having install instructions that require cloning the source using git and then installing via pip, flowdec could be released to pypi so the install instructions become just "pip install flowdec".

Roughly, the steps would just be tagging a release in git, then uploading a package of that release to pypi. There are some instructions here: https://packaging.python.org/tutorials/packaging-projects/

Feature request: Voxel sizes for PSF and data

This library really rocks!
Feature request: The voxel sizes for the images to be deconvolved do not always match the voxel sizes of the measured PSF. Can these be added as parameters, with the necessary interpolation done on the GPU?

skimage error when opening 3D tiffs

Hello, first of all thanks for the nice tool.
I am trying to work with flowdec but got a relatively simple error:
from skimage.io import imread
image = imread('test_small.tif')
TiffPage 0: TypeError: read_bytes() missing 3 required positional arguments: 'dtype', 'count', and 'offsetsize'
I do realize that the error is not from flowdec but from the underlying tifffile library; the problem is that to solve the error I need to update tifffile as posted here, but flowdec does not support the latest skimage. I guess it is a version issue, any ideas?
I'm using this:
python 3.6.13
skimage 0.17.2
tifffile 2020.9.3
I have to say that it took a while to get flowdec working since there were many issues with TF and the drivers and versions, etc.
any ideas are welcome,
Alvaro

ResourceExhaustedError

Dear Eric,

I tried to deconvolve two identical stacks. One works and the other results in a ResourceExhaustedError. I've attached the ImageJ headers:

(1) 2048x2048 pixels (does not work)
(2) 204.4x204.4 microns (works)


Comparison with scikit-image / cupyimg

I was starting to benchmark flowdec against this implementation in the cupyimg package, which aims to port ndimage and scikit-image onto the GPU using cupy.

I noticed that the scikit-image implementation (and hence cupyimg's port) suffers from problems with values near 0. I think this is because those values sometimes lead to a divide-by-zero. This explains why all the toy example code injects noise into the demo images!

Is flowdec a bit immune to this problem because of this filtering step or this clipping?

As a concrete example, the astronaut example you use to demonstrate edge ringing will not work directly in scikit-image's implementation without the added Poisson noise. Though, I've found it does reasonably well if you just rescale+shift the image from 0->1 to 0.5->1.

[My reason for trying cupyimg is that I'm already using cupy in my docker image, and adding tensorflow's 1.5GB is making my images even larger. For now, at least, I'm sticking with flowdec because it works on our data, since we hit the divide-by-zeros with the naive implementation.]

GPU memory exhaustion errors

via email from Samantha Esteves:

I’m really interested in getting GPU Richardson-Lucy deconvolution working on my machine (Windows10, Quadro M2200 4GB) and am running through the CElegans example notebook. But I get an out of memory error when I run the deconvolution cell. I’m wondering if this is a true error (GPU requires more than 4GB of memory for this example?) or if there is a configuration error I should be looking into?

How to use shared RAM+VRAM

I managed to use system RAM to complement the VRAM (it also filled my GPU memory usage); however, it is not worth it versus splitting the images using dask to do map_overlaps entirely in GPU/VRAM, as the transfer is much slower.

I am under the impression that it uses the shared memory if needed (when I split the image in 4 with dask as usual, with the same ConfigProto shared settings, I had normal processing speed).

If you really can't do map_overlaps and you don't have enough VRAM, or for reference:

I don't know how the speed compares with plain CPU computing, but when it uses less shared memory it is considerably faster than when it needs a big chunk of RAM (I tested by not splitting my 2048x2044 image (much, much slower), splitting it in 2 (considerably faster, using only about 2 GB of RAM for the processing), or splitting it in 4 (using no RAM and running at full speed)).

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
# choosing 2 forces tensorflow to use RAM together with allow_growth
config.gpu_options.per_process_gpu_memory_fraction = 2
config.gpu_options.allow_growth = True

algo = fd_restoration.RichardsonLucyDeconvolver(n_dims=psfgfp.ndim
                                                , pad_mode='2357'
#                                               , pad_mode='none'
                                                , pad_min=(0, 1, 1)
                                               ).initialize()
tmp = algo.run(fd_data.Acquisition(data=chunk, kernel=psf)
               , niter=20
               , session_config=config
              )

No module named 'tfdecon'

While running CElegans - Multiple Channel Example.ipynb

Also had a quick look at the source, and could not track tfdecon down, but could be blind.

Running out of memory in 4GB Quadro T2000

Hi,
This same code works in my 6GB and 8GB nvidia RTX (personal laptop and microscope computer).

My work laptop has an Nvidia Quadro T2000 that only has 4GB of RAM:

I run into out-of-memory errors (my arrays are 11 Z, 2044 Y and 2048 X) on my T2000.

def observer(img, i, *args):
    #mgs.append(img.max(axis=0))
    if i % 10 == 0:
        print('Observing iteration = {} (dtype = {}, max = {:.3f})'.format(i, img.dtype, img.max()))   
#config = tf.ConfigProto(device_count={'GPU': 1})
#algo = fd_restoration.RichardsonLucyDeconvolver(n_dims=acq.data.ndim, pad_min=[1, 1, 1], session_config=config).initialize()
algo = fd_restoration.RichardsonLucyDeconvolver(n_dims=psfgfp.ndim
                                                , pad_mode='none'
                                                #, pad_mode='none'
                                                #, pad_min=[1,1,1]
                                                #,pad_min=np.ones(psfdapi.ndim)
                                                ,observer_fn=observer
                                               ).initialize()

I have tried without any padding arguments or with [1,1,1] as well.

Then I loop through my ND2 files' channels and call the algo with:

res0 = algo.run(
    fd_data.Acquisition(
        data=frames[0],
        kernel=psfdapi),
    niter=15)

This is the error that jupyter notebook spits in my terminal:

2020-06-27 16:36:02.786780: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-27 16:36:02.786992: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-27 16:36:02.787183: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2341 MB memory) -> physical GPU (device: 0, name: Quadro T2000, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-06-27 16:36:03.444695: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-27 16:36:03.672501: W tensorflow/core/common_runtime/bfc_allocator.cc:245] Allocator (GPU_0_bfc) ran out of memory trying to allocate 351.31MiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-06-27 16:36:03.672528: E tensorflow/stream_executor/cuda/cuda_fft.cc:249] failed to allocate work area.
2020-06-27 16:36:03.672535: E tensorflow/stream_executor/cuda/cuda_fft.cc:426] Initialize Params: rank: 3 elem_count: 11 input_embed: 11 input_stride: 1 input_distance: 46047232 output_embed: 11 output_stride: 1 output_distance: 46047232 batch_count: 1
2020-06-27 16:36:03.672540: F tensorflow/stream_executor/cuda/cuda_fft.cc:435] failed to initialize batched cufft plan with customized allocator: 
[I 16:36:10.592 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel 0a7ff11c-97f8-41f9-b738-8ca7b0f86ab5 restarted

Anything I can do?

Thank you for flowdec!

Lightsheet Processing

Hi Eric,

I managed to establish a pipeline using flowdec to deconvolve lightsheet images, but I am observing that the TensorFlow environment is reloaded for each channel and block. Is there a way to avoid this?

Script:

data = []
total_data = []

algo = fd_restoration.RichardsonLucyDeconvolver(3).initialize()

for n in range(4):
    for k in range(4):
        datao = dapi[:, n*480:(n+1)*480, k*480:(k+1)*480]
        res = algo.run(fd_data.Acquisition(data=datao, kernel=psf_single), niter=70)
        data_deconvolved = res.data
        data.append(data_deconvolved)
    total_data.append(data)
    data = []
combined_image = da.block(total_data)
combined_image.shape

Verbose:

2020-04-14 11:51:31.772567: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-04-14 11:51:31.796488: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2599990000 Hz
2020-04-14 11:51:31.796918: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55806f8d8a70 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-04-14 11:51:31.796938: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-04-14 11:51:31.863707: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:31.864106: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55806fa87b00 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-04-14 11:51:31.864120: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 2070 with Max-Q Design, Compute Capability 7.5
2020-04-14 11:51:31.864332: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:31.864642: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 with Max-Q Design computeCapability: 7.5
coreClock: 1.185GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 357.69GiB/s
2020-04-14 11:51:31.864701: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-14 11:51:31.864727: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-14 11:51:31.864748: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-14 11:51:31.864769: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-14 11:51:31.864790: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-14 11:51:31.864811: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-14 11:51:31.864832: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-14 11:51:31.864879: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:31.865210: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:31.865520: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-14 11:51:31.865559: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-14 11:51:31.866593: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-14 11:51:31.866603: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-14 11:51:31.866607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-14 11:51:31.866686: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:31.867018: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:31.867325: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7010 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-04-14 11:51:32.577218: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-14 11:51:32.813305: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-14 11:51:37.893802: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:37.894299: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 with Max-Q Design computeCapability: 7.5
coreClock: 1.185GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 357.69GiB/s
2020-04-14 11:51:37.894369: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-04-14 11:51:37.894390: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-04-14 11:51:37.894407: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-04-14 11:51:37.894424: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-04-14 11:51:37.894442: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-04-14 11:51:37.894460: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-04-14 11:51:37.894478: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-04-14 11:51:37.894538: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:37.894779: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:37.894979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-04-14 11:51:37.894998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-04-14 11:51:37.895003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] 0
2020-04-14 11:51:37.895007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0: N
2020-04-14 11:51:37.895075: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:37.895318: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:37.895524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7010 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-04-14 11:51:43.133267: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-04-14 11:51:43.133516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce RTX 2070 with Max-Q Design computeCapability: 7.5
coreClock: 1.185GHz coreCount: 36 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 357.69GiB/s

choosing parameters for Lightsheet microscopy data

Hi Eric,

I am planning on using flowdec to apply deconvolution to lightsheet microscopy data. I think you already have the parameter extraction integrated into cytokit via the experiment json file, but is there a way of doing the same thing directly with flowdec for this type of image?

Command Line Deconvolution Troubleshooting

I have a flowdec virtual environment set up on our Cluster and look forward to using it! I am having trouble setting up deconvolution with the command line:

(1) I have multi-channel images similar to the C. elegans example. My file is:
2048 x 2048 x 54 (2 color channels with 27 slices each)
I get:
File "/nfs/nhome/live/naureeng/.conda/envs/flowdec/lib/python3.7/site-packages/flowdec/data.py", line 24, in init
raise ValueError('Number of data and kernel dimensions must be 1, 2, or 3')
ValueError: Number of data and kernel dimensions must be 1, 2, or 3

The PSF file I have was done with a bead and is:
100 x 100 x 100

(2) Can we use the fd_data function to load our own datasets?

Thanks!

even shapes required for `real_domain_fft`

While experimenting with real_domain_fft I noticed that this has different input shape requirements compared to complex_domain_fft. As a result, pad_modes in ['none', '2357'] will throw an error for some input shapes.

A notebook to reproduce this is here:
https://github.com/VolkerH/flowdec/blob/realdomainexperiments/python/examples/notebooks/experimenting%20with%20real_domain%20option.ipynb

I first noticed this here:
VolkerH/Lattice_Lightsheet_Deskew_Deconv#37

Suggested fix:

  • apply desired pad mode
  • if real_domain_fft: check if shape after padding is odd in any dimension and increase shape by one.

It is not obvious whether handling pad_mode == None like this is desirable behaviour. If users don't want padding maybe a warning should be issued that padding happens anyway.

Edit:
While
https://www.tensorflow.org/api_docs/python/tf/signal/rfft3d
doesn't explicitly mention the permitted shape, the fact that fft_length / 2 positive-frequency terms are returned seems to confirm that even dimensions are required.

Longer time for first deconvolution in a process

I am running via dask, where a small batch of images (2-30 or so) is passed to a worker. The worker creates a process for each batch, and uses the gpu_options to avoid reserving all memory up front. In one process, I've noted that the first deconv in a process takes a lot longer than subsequent ones -- several seconds vs a fraction of a second. Is there any trick to cut back on that overhead?

PSF min max and type.

Should the PSF be Float32 with min being 0 and max being 1?

Can I use an acquired PSF straight from a 16-bit image? Or should I rescale it from 0 to 1?

Thanks.

Observer function example

Hi,

first of all thank you for this awesome library. I started experimenting with it a couple of days ago and am getting excellent results.

Now my issue: I'm trying to use the observer function to write intermediate iteration results to disk (I would like to see how the iterations progress) during deconvolution. There is no documentation on this, so a working example would be greatly appreciated. My own attempts were unsuccessful so far (defining some minimalist observer function).

#define an observer function
def observer(*arg):
    print("Observer called")

# Direct deconvolution on skewed llsm dataset
spimrawvol = tifffile.imread("/home/vhil0002/test_vol.tif")
spimskewpsf = tifffile.imread("/home/vhil0002/test_skewed_psf.tif")

algo_skew = tfd_restoration.RichardsonLucyDeconvolver(n_dims=3, start_mode='input', observer_fn=observer).initialize()
aq_skew = fd_data.Acquisition(data=spimrawvol.astype(np.float32), kernel=spimskewpsf.astype(np.float32))
res_skew = algo_skew.run(aq_skew, niter=3)

I can see that it gets called once and then I get an exception (probably when it returns) with a lengthy traceback. I've tried returning integers (32 and 64 bit) from the observer or just returning all the input arguments but get pretty much the same error no matter what.

Observer called

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1333     try:
-> 1334       return fn(*args)
   1335     except errors.OpError as e:

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
   1318       return self._call_tf_sessionrun(
-> 1319           options, feed_dict, fetch_list, target_list, run_metadata)
   1320 

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
   1406         self._session, options, feed_dict, fetch_list, target_list,
-> 1407         run_metadata)
   1408 

InvalidArgumentError: 0-th value returned by pyfunc_6 is int64, but expects int32
	 [[{{node while/observer}} = PyFunc[Tin=[DT_FLOAT, DT_INT32], Tout=[DT_INT32], token="pyfunc_6", _device="/job:localhost/replica:0/task:0/device:CPU:0"](while/Maximum/_85, while/Identity/_87)]]

During handling of the above exception, another exception occurred:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-37-007b7300fa24> in <module>
     12 
     13  #   %%timeit
---> 14 res_skew = algo_skew.run(aq_skew, niter=3)

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/restoration.py in run(self, acquisition, niter, session_config)
    211     def run(self, acquisition, niter, session_config=None):
    212         input_kwargs = dict(niter=niter, pad_mode=self.pad_mode, pad_min=self.pad_min, start_mode=self.start_mode)
--> 213         res = self._run(acquisition, input_kwargs, session_config=session_config)
    214         return DeconvolutionResult(res['result'], info={k: v for k, v in res.items() if k != 'result'})
    215 

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/restoration.py in _run(self, acquisition, input_kwargs, session_config)
     81             data_dict = {self.graph.inputs[k]:v for k, v in acquisition.to_feed_dict().items()}
     82             args_dict = {self.graph.inputs[k]:v for k, v in input_kwargs.items() if v is not None}
---> 83             res = sess.run(self.graph.outputs, feed_dict={**data_dict, **args_dict})
     84             return res
     85 

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
    927     try:
    928       result = self._run(None, fetches, feed_dict, options_ptr,
--> 929                          run_metadata_ptr)
    930       if run_metadata:
    931         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1150     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1151       results = self._do_run(handle, final_targets, final_fetches,
-> 1152                              feed_dict_tensor, options, run_metadata)
   1153     else:
   1154       results = []

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1326     if handle is None:
   1327       return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1328                            run_metadata)
   1329     else:
   1330       return self._do_call(_prun_fn, handle, feeds, fetches)

~/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
   1346           pass
   1347       message = error_interpolation.interpolate(message, self._graph)
-> 1348       raise type(e)(node_def, op, message)
   1349 
   1350   def _extend_graph(self):

InvalidArgumentError: 0-th value returned by pyfunc_6 is int64, but expects int32
	 [[node while/observer (defined at /home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/tf_ops.py:31)  = PyFunc[Tin=[DT_FLOAT, DT_INT32], Tout=[DT_INT32], token="pyfunc_6", _device="/job:localhost/replica:0/task:0/device:CPU:0"](while/Maximum/_85, while/Identity/_87)]]

Caused by op 'while/observer', defined at:
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel/kernelapp.py", line 505, in start
    self.io_loop.start()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/platform/asyncio.py", line 132, in start
    self.asyncio_loop.run_forever()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/asyncio/base_events.py", line 427, in run_forever
    self._run_once()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/asyncio/base_events.py", line 1440, in _run_once
    handle._run()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/asyncio/events.py", line 145, in _run
    self._callback(*self._args)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/ioloop.py", line 758, in _run_callback
    ret = callback()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/stack_context.py", line 300, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/gen.py", line 1233, in inner
    self.run()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/gen.py", line 1147, in run
    yielded = self.gen.send(value)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 357, in process_one
    yield gen.maybe_future(dispatch(*args))
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 267, in dispatch_shell
    yield gen.maybe_future(handler(stream, idents, msg))
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel/kernelbase.py", line 534, in execute_request
    user_expressions, allow_stdin,
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tornado/gen.py", line 326, in wrapper
    yielded = next(result)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel/ipkernel.py", line 294, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/ipykernel/zmqshell.py", line 536, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2819, in run_cell
    raw_cell, store_history, silent, shell_futures)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2845, in _run_cell
    return runner(coro)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/IPython/core/async_helpers.py", line 67, in _pseudo_sync_runner
    coro.send(None)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3020, in run_cell_async
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3185, in run_ast_nodes
    if (yield from self.run_code(code, result)):
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3267, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-37-007b7300fa24>", line 9, in <module>
    algo_skew = tfd_restoration.RichardsonLucyDeconvolver(n_dims=3, start_mode='input', observer_fn=observer).initialize()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/restoration.py", line 62, in initialize
    self.graph = self._get_tf_graph()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/restoration.py", line 58, in _get_tf_graph
    inputs, outputs = self._build_tf_graph()
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/restoration.py", line 308, in _build_tf_graph
    result = tf.while_loop(cond, body, [1, decon], parallel_iterations=1)[1]
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3291, in while_loop
    return_same_structure)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3004, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2939, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/restoration.py", line 304, in body
    decon, i = tf_observer([decon, i], self.observer_fn)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/tf_ops.py", line 31, in tf_observer
    observe_op = tf.py_func(_observe, tensors, tf.int32, stateful=True, name='observer')[0]
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 457, in py_func
    func=func, inp=inp, Tout=Tout, stateful=stateful, eager=False, name=name)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/ops/script_ops.py", line 281, in _internal_py_func
    input=inp, token=token, Tout=Tout, name=name)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/ops/gen_script_ops.py", line 129, in py_func
    "PyFunc", input=input, token=token, Tout=Tout, name=name)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): 0-th value returned by pyfunc_6 is int64, but expects int32
	 [[node while/observer (defined at /home/vhil0002/su62_scratch/volker_conda/llsm/lib/python3.6/site-packages/flowdec/tf_ops.py:31)  = PyFunc[Tin=[DT_FLOAT, DT_INT32], Tout=[DT_INT32], token="pyfunc_6", _device="/job:localhost/replica:0/task:0/device:CPU:0"](while/Maximum/_85, while/Identity/_87)]]

Deconvolution using pixel-based PSF

Dear flowdec team,

thank you for releasing this tool!

I have a rather simple question but I couldn't reach a definitive answer while going through your excellent documentation.

Does the deconvolution process use a pixel-based PSF (regardless of the PSF voxel dimensions)? As in, should the pixel sizes of the PSF be identical to those of the image to deconvolve?
or
Does it resize the given psf so it will match the voxel dimensions of the image to deconvolve ?

From reading the code and documentation, my guess is that it's pixel-based but I'll be happy to have a confirmation.

Thank you again,

Best,

Romain

Flowdec video memory usage

Hi,

Not sure if it's alright to post this here, but looked for an email and couldn't find one.

I'm wondering what the input stack dimension limits are. We have a GTX Titan with 11GB of memory but still need to downsample our stacks quite a bit. Is there a way to deconvolve a stack of dimensions (1440, 1080, 1000) or is that unfeasible with flowdec?

Estimation of required memory

Hello.
I am trying to deconvolve a 101x2160x2560 (DxWxH) image with a 3x9x9 (DxWxH) PSF. I have tf-gpu 1.14.0 installed, and an 11GB card.

import os
import numpy as np
from time import time
import tensorflow as tf
from flowdec import data as fd_data
from flowdec import restoration as fd_restoration
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
from skimage.external import tifffile as tif
from skimage.io import imsave, imread

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

imname = 'new.tif'
actual = imread(imname)
actual = np.asarray(actual, dtype=np.float32)
print(actual.shape)

kname = 'psf1.tif'
kernel = imread(kname)
kernel = np.asarray(kernel,dtype=np.float32)
kernel = np.transpose(kernel,(2,1,0))
print(kernel.shape)

t1 = time()
print('Initializing')
algo = fd_restoration.RichardsonLucyDeconvolver(actual.ndim, pad_mode='none').initialize()
print('Running')
res = algo.run(fd_data.Acquisition(data=actual, kernel=kernel), niter=15).data
res = np.asarray(res,dtype=np.uint16)

t2 = time()
print(t2-t1)
print(res.shape)
imsave('decon_result.tif', res)

I get a segmentation fault. However, with a smaller (101x1000x1000) image, the same PSF works using approximately 8815 MB of GPU memory.

I have tried both with allow_growth = True and allow_growth = False. In both cases, I got segfault.

This is the segfault message with the 101x2160x2560 image.
Can you please tell me how to estimate GPU memory based on image size and PSF size, so that I can efficiently do chunking?

[roys5@mh02112259dt tmp]$ python flowdec_example.py
WARNING:tensorflow:From flowdec_example.py:20: The name tf.ConfigProto is deprecated. Please use tf.compat.v1.ConfigProto instead.

WARNING:tensorflow:From flowdec_example.py:22: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2020-06-07 17:48:13.460340: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-07 17:48:13.468494: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-06-07 17:48:13.685218: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5585f3cc4d80 executing computations on platform CUDA. Devices:
2020-06-07 17:48:13.685291: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-06-07 17:48:13.690803: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2294660000 Hz
2020-06-07 17:48:13.696609: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5585f3f2ed20 executing computations on platform Host. Devices:
2020-06-07 17:48:13.696668: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2020-06-07 17:48:13.698584: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:04:00.0
2020-06-07 17:48:13.699119: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-07 17:48:13.702031: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-07 17:48:13.704691: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-07 17:48:13.705260: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-07 17:48:13.708751: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-07 17:48:13.711392: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-07 17:48:13.718264: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-07 17:48:13.721262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-07 17:48:13.721354: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-07 17:48:13.724138: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-07 17:48:13.724189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0
2020-06-07 17:48:13.724220: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2020-06-07 17:48:13.727270: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10073 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:04:00.0, compute capability: 7.5)
(101, 2160, 2560)
(3, 9, 9)
Initializing
WARNING:tensorflow:From /home/roys5/miniconda3/lib/python3.6/site-packages/flowdec-1.1.0-py3.6.egg/flowdec/fft_utils_tf.py:75: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Running
2020-06-07 17:48:36.875671: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:04:00.0
2020-06-07 17:48:36.875773: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-07 17:48:36.875802: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-07 17:48:36.875870: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-07 17:48:36.875897: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-07 17:48:36.875922: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-07 17:48:36.875947: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-07 17:48:36.875988: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-07 17:48:36.877502: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-07 17:48:36.878919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:04:00.0
2020-06-07 17:48:36.878983: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-06-07 17:48:36.879029: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-06-07 17:48:36.879051: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-06-07 17:48:36.879072: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-06-07 17:48:36.879106: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-06-07 17:48:36.879127: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-06-07 17:48:36.879148: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-06-07 17:48:36.880641: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-06-07 17:48:36.880699: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-07 17:48:36.880713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0
2020-06-07 17:48:36.880723: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N
2020-06-07 17:48:36.882393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10073 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:04:00.0, compute capability: 7.5)
2020-06-07 17:48:38.418516: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
2020-06-07 17:48:49.798603: W tensorflow/core/common_runtime/bfc_allocator.cc:314] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.08GiB (rounded to 2233958400).  Current allocation summary follows.
2020-06-07 17:48:49.798712: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (256):   Total Chunks: 9, Chunks in use: 9. 2.2KiB allocated for chunks. 2.2KiB in use in bin. 44B client-requested in use in bin.
2020-06-07 17:48:49.798796: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (512):   Total Chunks: 1, Chunks in use: 0. 512B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.798927: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1024):  Total Chunks: 2, Chunks in use: 2. 2.2KiB allocated for chunks. 2.2KiB in use in bin. 2.0KiB client-requested in use in bin.
2020-06-07 17:48:49.798970: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2048):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799004: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4096):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799057: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8192):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799090: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16384):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799127: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (32768):         Total Chunks: 1, Chunks in use: 1. 62.5KiB allocated for chunks. 62.5KiB in use in bin. 62.5KiB client-requested in use in bin.
2020-06-07 17:48:49.799163: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (65536):         Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799213: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (131072):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799247: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (262144):        Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799282: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (524288):        Total Chunks: 1, Chunks in use: 0. 956.5KiB allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799314: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1048576):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799366: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2097152):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799399: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4194304):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799431: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8388608):       Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799482: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16777216):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799516: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (33554432):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799549: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (67108864):      Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799581: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (134217728):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2020-06-07 17:48:49.799617: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (268435456):     Total Chunks: 4, Chunks in use: 2. 8.00GiB allocated for chunks. 4.16GiB in use in bin. 4.16GiB client-requested in use in bin.
2020-06-07 17:48:49.799652: I tensorflow/core/common_runtime/bfc_allocator.cc:780] Bin for 2.08GiB was 256.00MiB, Chunk State:
2020-06-07 17:48:49.799694: I tensorflow/core/common_runtime/bfc_allocator.cc:786]   Size: 1.92GiB | Requested Size: 4B | in_use: 0 | bin_num: 20, prev:   Size: 2.08GiB | Requested Size: 2.08GiB | in_use: 1 | bin_num: -1
2020-06-07 17:48:49.799727: I tensorflow/core/common_runtime/bfc_allocator.cc:786]   Size: 1.92GiB | Requested Size: 0B | in_use: 0 | bin_num: 20, prev:   Size: 2.08GiB | Requested Size: 2.08GiB | in_use: 1 | bin_num: -1
2020-06-07 17:48:49.799751: I tensorflow/core/common_runtime/bfc_allocator.cc:793] Next region of size 4294967296
2020-06-07 17:48:49.799781: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fd532000000 next 17 of size 2233958400
2020-06-07 17:48:49.799806: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 0x7fd5b7278000 next 18446744073709551615 of size 2061008896
2020-06-07 17:48:49.799828: I tensorflow/core/common_runtime/bfc_allocator.cc:793] Next region of size 4294967296
2020-06-07 17:48:49.799861: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fd632000000 next 14 of size 2233958400
2020-06-07 17:48:49.799886: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 0x7fd6b7278000 next 18446744073709551615 of size 2061008896
2020-06-07 17:48:49.799908: I tensorflow/core/common_runtime/bfc_allocator.cc:793] Next region of size 1048576
2020-06-07 17:48:49.799931: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59000000 next 1 of size 256
2020-06-07 17:48:49.799953: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59000100 next 2 of size 256
2020-06-07 17:48:49.799976: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59000200 next 3 of size 256
2020-06-07 17:48:49.799998: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59000300 next 4 of size 256
2020-06-07 17:48:49.800038: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59000400 next 5 of size 256
2020-06-07 17:48:49.800063: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59000500 next 6 of size 64000
2020-06-07 17:48:49.800085: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb5900ff00 next 7 of size 256
2020-06-07 17:48:49.800107: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59010000 next 8 of size 256
2020-06-07 17:48:49.800129: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59010100 next 9 of size 256
2020-06-07 17:48:49.800152: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59010200 next 10 of size 256
2020-06-07 17:48:49.800174: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59010300 next 11 of size 1280
2020-06-07 17:48:49.800197: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 0x7fdb59010800 next 15 of size 512
2020-06-07 17:48:49.800220: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0x7fdb59010a00 next 16 of size 1024
2020-06-07 17:48:49.800245: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 0x7fdb59010e00 next 18446744073709551615 of size 979456
2020-06-07 17:48:49.800267: I tensorflow/core/common_runtime/bfc_allocator.cc:809]      Summary of in-use Chunks by size:
2020-06-07 17:48:49.800300: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 9 Chunks of size 256 totalling 2.2KiB
2020-06-07 17:48:49.800326: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 1024 totalling 1.0KiB
2020-06-07 17:48:49.800351: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 1280 totalling 1.2KiB
2020-06-07 17:48:49.800376: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 64000 totalling 62.5KiB
2020-06-07 17:48:49.800401: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 2233958400 totalling 4.16GiB
2020-06-07 17:48:49.800425: I tensorflow/core/common_runtime/bfc_allocator.cc:816] Sum Total of in-use chunks: 4.16GiB
2020-06-07 17:48:49.800454: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 8590983168 memory_limit_: 10562958132 available bytes: 1971974964 curr_region_allocation_bytes_: 8589934592
2020-06-07 17:48:49.800485: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
Limit:                 10562958132
InUse:                  4467985408
MaxInUse:               4467985408
NumAllocs:                      25
MaxAllocSize:           2233958400

2020-06-07 17:48:49.800552: W tensorflow/core/common_runtime/bfc_allocator.cc:319] ***************************______________________***************************_______________________*
2020-06-07 17:48:49.800620: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at constant_op.cc:172 : Resource exhausted: OOM when allocating tensor with shape[101,2160,2560] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
2020-06-07 17:48:59.801010: W tensorflow/core/common_runtime/bfc_allocator.cc:314] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.08GiB (rounded to 2233958400).  Current allocation summary follows.
[Second allocation summary omitted: it is nearly identical to the one above, with the same bin breakdown, chunk layout, and stats (Limit 10562958132, InUse 4467985408, MaxAllocSize 2233958400).]
2020-06-07 17:48:59.802853: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at pad_op.cc:137 : Resource exhausted: OOM when allocating tensor with shape[101,2160,2560] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Segmentation fault (core dumped)

Using newer TensorFlow versions

Is there any active work on this project to upgrade to TF 2.0? I'd be interested in helping out if the changes are likely to be straightforward.
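In the meantime, a common stopgap for 1.x-style graph code (a generic sketch, not a tested change to flowdec itself) is TensorFlow 2's v1 compatibility layer:

```python
# Generic illustration of running 1.x-style graph code on TensorFlow 2 through
# the compatibility layer; this is not a verified change to flowdec itself.
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # restore the graph-mode semantics TF 1.x code expects

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=[None], name='x')
    y = tf.square(x, name='y')

with tf.Session(graph=graph) as sess:
    print(sess.run(y, feed_dict={x: [1.0, 2.0, 3.0]}))  # -> [1. 4. 9.]
```

Whether flowdec depends on any modules removed in TF 2 (which the compat layer cannot restore) would still need to be checked.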

windowing and type conversion possible using `input_prep_fn`, `output_prep_fn`?

Hi,

I'm just trying to get into TensorFlow to be able to modify flowdec to suit my needs.
There are two things I am trying to achieve:

  1. do input and output type conversions on the GPU (currently I do this in NumPy on the CPU, and cProfile shows that the code spends quite a bit of time in astype(np.*)).
  2. integrate a windowing/apodization function to reduce artifacts caused by discontinuities at the boundary.

While looking through the flowdec source code to see where I could add these things, I noticed the input_prep_fn and output_prep_fn stubs, and I am wondering whether I could use these for the above-mentioned purposes.

However, in both cases I somehow need to allocate additional arrays (or "tensors"):

  1. For the input/output dtype conversion I will have to create additional arrays with the input/output dtypes (typically uint16).
  2. For the windowing function I would like to pass in a pre-computed window that I multiply with the array.

I notice that inputs and outputs are passed in as dictionaries.
So can I achieve these objectives by initializing the deconvolution object with some additional key/value pairs in the input/output dictionaries and passing in appropriate input_prep_ and output_prep_ functions, or do I need to modify the actual code in flowdec/restoration.py?

Some guidance with how to approach this would be highly appreciated.
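As a rough illustration of the idea (a sketch only; the exact signatures of `input_prep_fn`/`output_prep_fn`, and whether they receive the raw tensors, are assumptions here rather than anything verified against flowdec/restoration.py), both the cast and the apodization can in principle be expressed as TF ops so they execute on the GPU:

```python
# Hypothetical sketch: GPU-side casting and apodization expressed as TF ops.
# Assumption (not verified): each prep function receives a tf.Tensor and must
# return a tf.Tensor; a real integration would window only the image, not the PSF.
import numpy as np
import tensorflow as tf

def hann_window_3d(shape):
    """Separable Hann window for a (z, y, x) volume, precomputed in NumPy."""
    wz, wy, wx = (np.hanning(n).astype(np.float32) for n in shape)
    return wz[:, None, None] * wy[None, :, None] * wx[None, None, :]

def input_prep_fn(tensor):
    # Cast uint16 data to float32 and apply the window on the device.
    window = tf.constant(hann_window_3d((64, 256, 256)))  # example shape
    return tf.cast(tensor, tf.float32) * window

def output_prep_fn(tensor):
    # Clip and cast the float32 result back to uint16 on the device.
    return tf.cast(tf.clip_by_value(tensor, 0.0, 65535.0), tf.uint16)
```

Whether extra arrays like the window can be threaded through as additional entries in the input/output dictionaries, or whether restoration.py needs changes, is exactly the open question above.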

Batch Input Operation Question

Hi, I tried your code recently and it works well and fast. Thanks!
However, I need to run a simple deconvolution process thousands of times. This means I need something like a batched input and output operation in TensorFlow on the GPU rather than a simple for loop. Could you give me some advice on adding this functionality to your code?
Thanks again! @armish @hammer @eczech @russellb @eric-czech
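For what it's worth, one pattern that helps here (a sketch based on the usage shown in the project's examples; file names, shapes, and iteration count are placeholders) is to build the graph once with `initialize()` and then call `run()` in a loop, so graph construction is not paid per volume, even though execution is still sequential rather than truly batched:

```python
# Sketch: amortize graph construction over many volumes by initializing once.
# Paths, shapes, and niter are hypothetical placeholders.
from skimage import io
from flowdec import data as fd_data
from flowdec import restoration as fd_restoration

psf = io.imread('psf.tif')                                       # hypothetical PSF stack
algo = fd_restoration.RichardsonLucyDeconvolver(3).initialize()  # build the graph once

for path in ['vol_000.tif', 'vol_001.tif']:                      # thousands of files in practice
    img = io.imread(path)
    res = algo.run(fd_data.Acquisition(data=img, kernel=psf), niter=25).data
    io.imsave(path.replace('.tif', '_decon.tif'), res)
```

True batched execution (stacking several volumes into a single graph run) would presumably require changes to the graph construction itself.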

How does padding 'none' work?

Hi,

I have picked up some code from someone who is no longer with us.

He is setting pad_mode to 'none' and pad_min to [16, 16, 16].

Does this not pad at all, or does it pad with zeroes?

Thanks,
Scott
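For reference, the configuration being described presumably corresponds to something like the following (a sketch only; the keyword names are taken from this question, and the comments reflect a reading of the README's padding discussion rather than verified behavior):

```python
# Sketch of the configuration described above; the precise semantics of
# pad_mode='none' are exactly what this question asks, so the comments below
# are assumptions rather than verified behavior.
from flowdec import restoration as fd_restoration

algo = fd_restoration.RichardsonLucyDeconvolver(
    3,                     # 3D volumes
    pad_mode='none',       # presumably skips the automatic round-up to FFT-friendly sizes
    pad_min=[16, 16, 16],  # minimum padding requested per axis
).initialize()
```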

ResourceExhaustedError for z-stack

Dear Eric,
I've been able to do batch processing with your help! I now have z-stacks that are:
512 x 512 x 348

I am running into a memory issue:

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[512,1024,1024] and type complex64 on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
	 [[node while/FFT3D (defined at /nfs/nhome/live/naureeng/.conda/envs/flowdec/lib/python3.7/site-packages/flowdec/restoration.py:289) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[node result (defined at /nfs/nhome/live/naureeng/.conda/envs/flowdec/lib/python3.7/site-packages/flowdec/restoration.py:318) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

However, I'm allocating an entire node (600 GB). Is there a better way to get around this? Dask?
Many thanks for your time!
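One workaround often suggested for OOM on large stacks (a sketch only, untested with this data; the chunk sizes, overlap depth, and iteration count are guesses, and results near chunk seams will differ slightly from a full-volume run) is to tile the volume with `dask.array.map_overlap` and deconvolve each chunk independently:

```python
# Sketch: chunked deconvolution of a large z-stack via dask.array.map_overlap.
# Chunk sizes, overlap depth, and iteration count are illustrative guesses.
import dask.array as da
import numpy as np
from flowdec import data as fd_data
from flowdec import restoration as fd_restoration

algo = fd_restoration.RichardsonLucyDeconvolver(3).initialize()

def decon_chunk(chunk, psf):
    return algo.run(fd_data.Acquisition(data=chunk, kernel=psf), niter=25).data

stack = np.random.rand(348, 512, 512).astype(np.float32)   # placeholder volume
psf = np.random.rand(32, 32, 32).astype(np.float32)        # placeholder PSF

darr = da.from_array(stack, chunks=(128, 256, 256))
result = darr.map_overlap(decon_chunk, depth=16, boundary='reflect',
                          dtype=np.float32, psf=psf)
result = result.compute(scheduler='synchronous')  # one chunk at a time on the GPU
```

Separately, note that the failing tensor has the padded shape [512, 1024, 1024] rather than the raw 348 x 512 x 512 stack, so constraining the automatic round-up padding may also lower peak memory.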
