tissueimageanalytics / tiatoolbox
Computational Pathology Toolbox developed by TIA Centre, University of Warwick.
Home Page: https://warwick.ac.uk/tia
License: Other
This command:
The output is on standard error not standard out.
Using LSF by running:
docker run --rm --name tiatoolbox_gpu_test.sh_bioformatspull2759.ndpi -v $v1 --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES="7" tialab/tiatoolbox_100_py39_cuda:t3 tiatoolbox patch-predictor --img-input /root/workspace/tiatoolbox_sample_wsis/bioformatspull2759.ndpi --mode wsi --on-gpu True --output-path /root/workspace/segment-results_bioformatspull2759.ndpi
The output appears in the LSF [job number].err file and not in [job number].out
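One possible explanation, offered as an assumption rather than a confirmed diagnosis: Python's logging.StreamHandler writes to sys.stderr unless told otherwise, so any tool that reports through the logging module will land in the .err file. A minimal sketch of a logger that writes to standard output instead (the logger name "example" is hypothetical and unrelated to the toolbox):

```python
import logging
import sys


def make_stdout_logger(name: str = "example") -> logging.Logger:
    """Return a logger whose records go to standard output."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate records via the root logger
    if not logger.handlers:
        # StreamHandler defaults to sys.stderr; pass sys.stdout explicitly.
        handler = logging.StreamHandler(stream=sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
        logger.addHandler(handler)
    return logger


logger = make_stdout_logger()
logger.info("prediction finished")
```

Without touching any code, appending 2>&1 to the command inside the job script merges both streams at the shell level.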
Paste the command(s) you ran and the output.
If there was a crash, please include the traceback here.
Just as the title.
I need to use develop ver, but pip install master ver...
I was trying to use WSIReader.open() to read .tif images downloaded from the Camelyon16 dataset, and the code is as follows:
import requests
from tiatoolbox.wsicore.wsireader import WSIReader, get_wsireader
import os
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 150 # for high resolution figure in notebook
sample_wsi_path = '/home/gzr/code/NIC/data/CAMELYON16/training/normal/normal_001.tif'
wsi = WSIReader.open(input_img=sample_wsi_path)
print(type(wsi))
And I'm sure the image path is correct, however, the following problem showed up:
error: OpenCV(4.5.4) /tmp/pip-req-build-kv0l0wqx/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'
Looking through the Makefile, we probably want to tidy it up a bit. For example, there is a lint option with flake8 but flake8 is not in the requirements files. Also, we probably want to add a format action to format the code with black. Lastly, the coverage action is set to use the python coverage module whereas we are currently using pytest-cov (although maybe we should try coverage).
In summary, issues to resolve:
PatchPredictor needs to be rewritten so that its member function predict cannot contradict whether GPU is in use or not.
Describe what you were trying to get done.
I was testing idars.ipynb without GPU
Tell us what happened, what went wrong, and what you expected to happen.
Fatal errors. I expected the jupyter notebook to run to conclusion, but slowly
First I corrected one fatal error, caused by ON_GPU being undefined. (The user had been instructed in a text cell to skip the cell defining ON_GPU if not using a GPU.) Then I ran it again, and there was a different fatal error. The notebook seems to have been tested only with GPU. The code cell causing the error had 3 statements, so I split the cell into three, and realised that the code was not set up to work without GPU.
The lines
tumour_predictor = PatchPredictor(
pretrained_model='resnet18-idars-tumour',
batch_size=64,
num_loader_workers=8)
expect a GPU, but the subsequent lines
tumour_output = tumour_predictor.predict(
imgs=wsi_file_list,
mode='wsi',
return_probabilities=True,
on_gpu=ON_GPU)
do not expect a GPU, because at this point, ON_GPU == False
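A minimal sketch of a notebook cell that always defines ON_GPU, so later cells never hit an undefined name. It assumes the flag should simply mirror CUDA availability; the fallback for a missing PyTorch install is an illustrative choice, not something taken from the notebook:

```python
# Define ON_GPU unconditionally so that later cells can rely on it.
try:
    import torch

    ON_GPU = bool(torch.cuda.is_available())
except ImportError:
    # Assumption: without PyTorch there is nothing to run on a GPU.
    ON_GPU = False

print(f"ON_GPU = {ON_GPU}")
```

A cell like this would not need to be skipped on CPU-only machines, which removes the first fatal error described above. It does not by itself address the DataLoader worker segmentation fault in the traceback.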
Paste the command(s) you ran and the output.
I kept pressing Command Return, which on a Mac runs the next cell.
If there was a crash, please include the traceback here.
RuntimeError Traceback (most recent call last)
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _try_get_data(self, timeout)
1010 try:
-> 1011 data = self._data_queue.get(timeout=timeout)
1012 return (True, data)
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/multiprocessing/queues.py in get(self, block, timeout)
103 timeout = deadline - time.monotonic()
--> 104 if not self._poll(timeout):
105 raise Empty
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/multiprocessing/connection.py in poll(self, timeout)
256 self._check_readable()
--> 257 return self._poll(timeout)
258
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/multiprocessing/connection.py in _poll(self, timeout)
413 def _poll(self, timeout):
--> 414 r = wait([self], timeout)
415 return bool(r)
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/multiprocessing/connection.py in wait(object_list, timeout)
920 while True:
--> 921 ready = selector.select(timeout)
922 if ready:
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/selectors.py in select(self, timeout)
414 try:
--> 415 fd_event_list = self._selector.poll(timeout)
416 except InterruptedError:
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py in handler(signum, frame)
65 # Python can still get and update the process status successfully.
---> 66 _error_if_any_worker_fails()
67 if previous_handler is not None:
RuntimeError: DataLoader worker (pid 91493) is killed by signal: Segmentation fault: 11.
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
/var/folders/f2/pyj3khtr8xj8vr0059_y9lvh0000gp/T/ipykernel_91358/1154328519.py in <module>
3 mode='wsi',
4 return_probabilities=True,
----> 5 on_gpu=ON_GPU)
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/models/engine/patch_predictor.py in predict(self, imgs, masks, labels, mode, return_probabilities, return_labels, on_gpu, ioconfig, patch_input_shape, stride_shape, resolution, units, merge_predictions, save_dir, save_output)
565 return_probabilities=return_probabilities,
566 return_coordinates=return_coordinates,
--> 567 on_gpu=on_gpu,
568 )
569 output_model["label"] = img_label
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/models/engine/patch_predictor.py in _predict_engine(self, dataset, return_probabilities, return_labels, return_coordinates, on_gpu)
318 "labels": [],
319 }
--> 320 for _, batch_data in enumerate(dataloader):
321
322 batch_output_probabilities = self.model.infer_batch(
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
528 if self._sampler_iter is None:
529 self._reset()
--> 530 data = self._next_data()
531 self._num_yielded += 1
532 if self._dataset_kind == _DatasetKind.Iterable and \
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _next_data(self)
1205
1206 assert not self._shutdown and self._tasks_outstanding > 0
-> 1207 idx, data = self._get_data()
1208 self._tasks_outstanding -= 1
1209 if self._dataset_kind == _DatasetKind.Iterable:
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _get_data(self)
1171 else:
1172 while True:
-> 1173 success, data = self._try_get_data()
1174 if success:
1175 return data
~/opt/miniconda3/envs/tiatoolbox/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _try_get_data(self, timeout)
1022 if len(failed_workers) > 0:
1023 pids_str = ', '.join(str(w.pid) for w in failed_workers)
-> 1024 raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
1025 if isinstance(e, queue.Empty):
1026 return (False, None)
RuntimeError: DataLoader worker (pid(s) 91493) exited unexpectedly
ERROR: Unexpected segmentation fault encountered in worker.
ERROR: Unexpected segmentation fault encountered in worker.
ERROR: Unexpected segmentation fault encountered in worker.
ERROR: Unexpected segmentation fault encountered in worker.
We currently have both a setup.cfg and a pyproject.toml file. According to PEP 518 and PEP 621, pyproject.toml is to replace setup.cfg. Therefore, we should look at merging the two. There may be some old tools that do not work with pyproject.toml though, so this will have to be carefully tested before going ahead.
We are currently using a conda file for setting up the development environment but also have a requirements_dev file around which is used for Travis. Do we even need this anymore if moving to conda for travis? (see #6)
We need to tidy up our requirements files and make sure that the pinned versions are compatible etc. I know that we recently had issues with the conda resolve failing. We need to decide if we now want to pin version numbers for reproducibility when setting up the environment. This would also require making sure that it works on Linux, macOS, and Windows with the pinned versions.
In summary, issues to resolve:
setup.py should define pip dependencies.
requirements_dev.txt. If not, make sure CI services e.g. Travis and Read the Docs are set to use conda. See #6.
Thanks for making the package.
For all the *.predict(...) with GPU, would it make sense to allow users to pass in the device id rather than on_gpu=True?
The code here could pass in the device id like 'cuda:1' to model.to(device_id) instead of model.to('cuda'). It can allow parallel slide processing when more than one GPU is available.
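The request above could be sketched as a small helper that turns the existing flag plus an optional device index into the string handed to model.to(...). The function name resolve_device and its device_id parameter are hypothetical and not part of the toolbox:

```python
from typing import Optional


def resolve_device(on_gpu: bool, device_id: Optional[int] = None) -> str:
    """Build a device string suitable for model.to(...)."""
    if not on_gpu:
        return "cpu"
    if device_id is None:
        # Keeps today's behaviour: let PyTorch pick the current CUDA device.
        return "cuda"
    if device_id < 0:
        raise ValueError(f"device_id must be non-negative, got {device_id}")
    return f"cuda:{device_id}"


print(resolve_device(on_gpu=True, device_id=1))
print(resolve_device(on_gpu=False))
```

Keeping on_gpu and adding an optional index would leave existing calls unchanged while allowing one process per GPU.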
PyTest is missing from the conda dev requirements file (https://github.com/TIA-Lab/tiatoolbox/blob/master/requirements.dev.conda.yml), so trying to run the tests will fail. Even worse, if you have pytest installed elsewhere, that installation can be used instead and will confusingly fail to import modules.
TIA Toolbox version: latest
The execution of self._process_predictions() inside the for loop in self._predict_one_wsi() is causing a major performance loss for segmentation prediction, especially on larger images/WSIs.
In every loop iteration, every element of "cum_output" is processed and merged again, instead of reusing the merge result of the previous iteration and adding only the most recent tile.
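A minimal sketch of the incremental alternative, assuming that merging amounts to summing overlapping tile values and averaging by a per-pixel count. That assumption, the add_tile helper and the toy tiles are all illustrative; the toolbox's real merge logic may differ:

```python
from typing import List, Tuple

Tile = Tuple[int, int, List[List[float]]]  # (top, left, values)


def add_tile(canvas: List[List[float]], counts: List[List[int]], tile: Tile) -> None:
    """Accumulate a single tile into the running canvas in place."""
    top, left, values = tile
    for r, row in enumerate(values):
        for c, value in enumerate(row):
            canvas[top + r][left + c] += value
            counts[top + r][left + c] += 1


height, width = 2, 3
canvas = [[0.0] * width for _ in range(height)]
counts = [[0] * width for _ in range(height)]

tiles: List[Tile] = [
    (0, 0, [[1.0, 1.0], [1.0, 1.0]]),
    (0, 1, [[3.0, 3.0], [3.0, 3.0]]),
]

# Each new tile is merged exactly once; earlier tiles are never revisited.
for tile in tiles:
    add_tile(canvas, counts, tile)

# Average where tiles overlap.
averaged = [
    [total / n if n else 0.0 for total, n in zip(total_row, count_row)]
    for total_row, count_row in zip(canvas, counts)
]
print(averaged)
```

With n tiles, this does n tile merges in total, whereas re-merging the whole accumulated list on every iteration does on the order of n squared.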
Hi everyone,
Just started working with Tiatoolbox.
First and foremost I need it for Tissue segmentation that will provide easy preprocessing, patching, mask saving, and reloading from a faster file format (e.g. hdf5).
I am aware of the patch extraction module, and I understand it needs a mask as input. The thing is, that it seems the tissue mask module does not enable easy export of masks, nor for an easy preprocessing pipeline with validity checks on the masks that were received.
Are these functionalities existing and I simply didn't find them, or are they actually missing from the toolbox?
tnx
TIAToolbox fails tests with latest version of scikit-image. Currently it is pinned to 0.18.3 which needs to be changed to scikit-image>=0.19.1
Line 16 in 47b9494
Run tests on travis after updating scikit-image
There are some behaviours of the CLI which are unexpected or I would like to raise for considering changing the default behaviour.
The help string says 'TIA Lab'; this needs to be updated to 'TIA Centre'.
tiatoolbox slide-thumbnail takes the input slide as a -- option, not as a positional argument. I think most people would expect the inputs to be positional and possibly also to be able to pass in multiple inputs e.g. tiatoolbox slide-thumb *.svs to process all svs in the current path. I understand that this may complicate things and that users could alternatively use a simple loop e.g. for f in *.svs; do tiatoolbox slide-thumb --img-input "$f"; done. Perhaps if we do not implement this we could simply make the input and output two positional arguments and give a loop example in the documentation e.g. a loop to make a thumbnails folder as follows:
for f in ./example/*.svs;
do tiatoolbox slide-thumbnail --mode save --img-input "$f" --output-path "./thumbs/$(basename $f).jpg";
done
or with positional and save default:
for f in ./example/*.svs;
do tiatoolbox slide-thumbnail "$f" --output-path "./thumbs/$(basename $f).jpg";
done
With positional input and output and abbreviated thumbnail to thumb, which could make it even easier to use:
for f in ./example/*.svs;
do tiatoolbox slide-thumb "$f" "./thumbs/$(basename $f).jpg";
done
The default for slide-thumb is currently to show the image in a window. This is good but maybe should not be the default, as the CLI will commonly be used where there is no display. I would like to suggest making the default be to output the thumb at the same path but with a .jpg suffix e.g. tiatoolbox slide-thumb *.svs would output a *.jpg for each slide in the input as the default action. This is similar to ImageMagick mogrify for batch conversion of images (mogrify -format jpg *.png to convert all png to jpg).
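The proposed behaviour (positional inputs, several files at once, and a .jpg written next to each slide by default) can be sketched with argparse from the standard library. This is only an illustration of the interface; the parser, the --output-dir option and the output_path helper are hypothetical, and the real CLI may be built with a different framework:

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="slide-thumb")
    # One or more positional inputs, so a shell glob such as *.svs just works.
    parser.add_argument("inputs", nargs="+", type=Path, help="slide files")
    parser.add_argument(
        "--output-dir",
        type=Path,
        default=None,
        help="folder for thumbnails (default: next to each slide)",
    )
    return parser


def output_path(slide: Path, output_dir: Optional[Path]) -> Path:
    """Same file name with a .jpg suffix, optionally in another folder."""
    thumb = slide.with_suffix(".jpg")
    return thumb if output_dir is None else output_dir / thumb.name


args = build_parser().parse_args(["a.svs", "example/b.svs"])
for slide in args.inputs:
    print(slide, "->", output_path(slide, args.output_dir))
```

Because the shell expands *.svs before the program starts, nargs="+" is enough to support batch use without an explicit loop.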
I thought it sensible to make an issue to track the discussion around this. I recently suggested looking into using ONNX runtime for models as it would allow one cross-platform dependency for doing model inference instead of having to worry about both PyTorch and TensorFlow. It also looks as if training an ONNX model is also in beta (via PyTorch or TensorFlow).
If anyone could try converting their model to ONNX and running it that would be good to get some feedback.
Some links for more information
Markdown cells in a jupyter notebook (JN) are processed in two different ways in our system. One is by JN markdown and the other is by sphinx for readthedocs. What looks great and makes sense in one can look ugly and be meaningless in the other. Our JNs all look great in JN format, but the corresponding documentation in our readthedocs can be meaningless and ugly. It is possible to fix some of these completely. For example, one can create a clickable "here" with the construct here, which is rendered correctly in both contexts (though some characters, like "(" in the url component, need to be replaced by the corresponding HTML code). But this compatible construct is not always used in our JNs, with resulting failure in readthedocs.
I do not know complete fixes for all incompatible constructs in our JNs, but there do seem to be kludges for all examples I've looked at, producing acceptable, though not perfect, results in both. At present readthedocs of our JNs is not acceptable, in my opinion. (It may be possible to fix these incompatibilities in a better way by a different configuration of sphinx or nbsphinx, but that is currently beyond my expertise. For example, can one alter docs/conf.py so that JN markdown cells are rendered by JN markdown in readthedocs?)
This comment has been changing frequently as I have discovered more about the problem.
When trying Macenko stain normalization on images that don't have perfect staining in the first place, the normalization process deteriorates the output quality. In particular, it seems that the division by maxC_source is causing it:
maxC_source = np.percentile(source_concentrations, 99, axis=0).reshape((1, 2))
source_concentrations *= self.maxC_target / maxC_source
It seems that the returned maxC_source is not favorable when we don't have enough stain variation in the source image.
Consider the target image:
This will be the normalized source, which is clearly deteriorated:
the code to reproduce this:
from tiatoolbox.utils.misc import imread, imwrite
from tiatoolbox.tools.stainnorm import MacenkoNormalizer
source= imread('D:/source.png')
target = imread('D:/target.png')
normalizer = MacenkoNormalizer()
normalizer.fit(target)
source_normalized = normalizer.transform(source)
A similar problem exists with the Macenko normalizer in staintools:
the code to reproduce this:
import staintools
# Read data
target = staintools.read_image("/root/workspace/target.png")
to_transform = staintools.read_image("/root/workspace/source.png")
# Standardize brightness (optional, can improve the tissue mask calculation)
target = staintools.LuminosityStandardizer.standardize(target)
to_transform = staintools.LuminosityStandardizer.standardize(to_transform)
# Stain normalize
normalizer = staintools.StainNormalizer(method='macenko')
normalizer.fit(target)
transformed = normalizer.transform(to_transform)
However, if you use the Vahadane normalizer, the output looks as desired, like this:
First of all, many thanks to all the contributors for this great & valuable effort preparing this toolbox. I am using the patch predictor example and I had two issues:
(1) When I ran the pre-trained model [[ pretrained_model='resnet18-kather100k' ]] on the validation dataset downloaded from the Warwick tiatoolbox dataset page, I found that the accuracy was low. After debugging, I realized that the provided order of classes is not the same as the one it was trained with (not sure if this issue is only with me). Anyway, I changed it accordingly and got good accuracy, and thought it might be useful to report this issue (the updated order can be found in the next section).
(2) I had an issue while predicting the mask using the same pre-trained network but on the wsi mode.
For issue (1):
That is the previous class order:
label_dict = {'ADI': 0, 'BACK': 1, 'DEB': 2, 'LYM': 3, 'MUC': 4, 'MUS': 5, 'NORM': 6, 'STR': 7, 'TUM': 8}
This is the updated order that was working for me:
label_dict = {'BACK': 0, 'NORM': 1, 'DEB': 2, 'TUM': 3, 'ADI': 4, 'MUC': 5, 'MUS': 6, 'STR': 7, 'LYM': 8}
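A minimal sketch of how the two orderings relate, assuming (as reported above, not independently verified) that the model's output indices follow the updated order. The variable names are illustrative only:

```python
# Order listed previously.
DOCUMENTED = {'ADI': 0, 'BACK': 1, 'DEB': 2, 'LYM': 3, 'MUC': 4,
              'MUS': 5, 'NORM': 6, 'STR': 7, 'TUM': 8}
# Order reported to match the model outputs (assumption taken from the report).
REPORTED = {'BACK': 0, 'NORM': 1, 'DEB': 2, 'TUM': 3, 'ADI': 4,
            'MUC': 5, 'MUS': 6, 'STR': 7, 'LYM': 8}

# Invert the reported mapping: model index -> class name.
index_to_name = {index: name for name, index in REPORTED.items()}

predictions = [0, 3, 8]  # example model outputs
names = [index_to_name[i] for i in predictions]
# Translate to the previously listed index scheme, e.g. to compare with old labels.
remapped = [DOCUMENTED[name] for name in names]

print(names)
print(remapped)
```

Going through class names rather than hard-coding an index permutation keeps the mapping readable and makes a mismatch between the two dictionaries fail loudly with a KeyError.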
For issue (2):
The commands I used were as follows:
predictor = CNNPatchPredictor(pretrained_model='resnet18-kather100k', batch_size=64)
wsi_output = predictor.predict([wsi_file_name],
mode='wsi',
stride_size=(112,112),
return_probabilities=True)
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.2\helpers\pydev\_pydevd_bundle\pydevd_exec2.py", line 3, in Exec
exec(exp, global_vars, local_vars)
File "<string>", line 5, in <module>
File "D:\Hammam\Research\TILs\TIL_project\2_Expermients_including_codes_and_results\code\pre-process\TIA_toolbox\tiatoolbox_example\tiatoolbox\models\classification\patch_predictor.py", line 435, in predict
units=units,
File "D:\Hammam\Research\TILs\TIL_project\2_Expermients_including_codes_and_results\code\pre-process\TIA_toolbox\tiatoolbox_example\tiatoolbox\models\dataset\classification.py", line 310, in __init__
units=units,
File "D:\Hammam\Research\TILs\TIL_project\2_Expermients_including_codes_and_results\code\pre-process\TIA_toolbox\tiatoolbox_example\tiatoolbox\tools\patchextraction.py", line 170, in filter_coordinates
raise ValueError("`mask_reader` should be wsireader.VirtualWSIReader.")
ValueError: `mask_reader` should be wsireader.VirtualWSIReader.
Any help would be much appreciated,
Thanks & best regards,
Hammam
Conversion between MPP (microns per pixel) to objective power.
from tiatoolbox.utils.misc import mpp2common_objective_power, objective_power2mpp
mpp2common_objective_power(0.234)
>> array(40)
objective_power2mpp(20.)
>> array(0.5)
The current implementation of above functions does not account for sensor size and results in erroneous estimates. In my case, I have a slide scanned at 20x (with 3D Histech Panoramic Midi) which results in ~0.234 mpp (from slide's metainfo), half of the estimated values. I guess there is no way of avoiding adding an extra parameter to above functions, to allow different values for the sensor.
See: Sellaro et al, Relationship between magnification and resolution in digital pathology systems, J Pathol Inform 2013, 4:21, DOI: 10.4103/2153-3539.116866
Best,
V.
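The extra parameter suggested in this report could look like the following sketch. It assumes the simple inverse relation power = k / mpp, where k is a scanner-specific constant; the default k = 10 is inferred from the objective_power2mpp(20.) output shown above, and k = 0.234 * 20 from the scanner described in the report. The function names and the mpp_at_1x parameter are hypothetical, and unlike mpp2common_objective_power this sketch does not snap to common magnifications:

```python
def objective_power_from_mpp(mpp: float, mpp_at_1x: float = 10.0) -> float:
    """Estimate objective power from microns per pixel.

    mpp_at_1x is the scanner-dependent product of mpp and objective power.
    """
    if mpp <= 0:
        raise ValueError("mpp must be positive")
    return mpp_at_1x / mpp


def mpp_from_objective_power(power: float, mpp_at_1x: float = 10.0) -> float:
    """Inverse of objective_power_from_mpp for the same scanner constant."""
    if power <= 0:
        raise ValueError("power must be positive")
    return mpp_at_1x / power


# Default constant: 0.5 mpp corresponds to 20x.
print(objective_power_from_mpp(0.5))
# Scanner from the report: 20x at about 0.234 mpp, so the constant is 0.234 * 20.
print(objective_power_from_mpp(0.234, mpp_at_1x=0.234 * 20))
```

A per-scanner constant keeps the existing behaviour as the default while letting callers correct for sensor and optics differences.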
Using CNNPatchPredictor.predict with the merge_predictions argument set to True can produce a ValueError during merging after all patches are predicted. Setting merge_predictions to False does not cause an exception.
Try to run CNNPatchPredictor.predict with the merge_predictions argument set to True. Example code:
from tiatoolbox.wsicore.wsireader import WSIReader
from tiatoolbox.models.controller import patch_predictor
wsi = WSIReader.open("./TCGA-A6-5662-01Z-00-DX1.82569684-1c31-4346-af9b-c296a020f624.svs")
predictor = patch_predictor.CNNPatchPredictor(pretrained_model="resnet18-kather100k")
output = predictor.predict([wsi.input_path], mode="wsi", merge_predictions=True)
Error output:
100%|#########################################| 378/378 [00:56<00:00, 6.65it/s]
Traceback (most recent call last):
File ".../issue.py", line 19, in <module>
File ".../tiatoolbox/models/controller/patch_predictor.py", line 531, in predict
postproc_func=self.model.postproc,
File ".../tiatoolbox/models/controller/patch_predictor.py", line 216, in merge_predictions
canvas_shape = reader.slide_dimensions(resolution=resolution, units=units)
File ".../tiatoolbox/wsicore/wsireader.py", line 509, in slide_dimensions
[0, 0] + list(wsi_shape_at_baseline), resolution, units, precisions
File ".../tiatoolbox/wsicore/wsireader.py", line 545, in _find_read_bounds_params
resolution, units, precision
File ".../tiatoolbox/wsicore/wsireader.py", line 285, in _find_optimal_level_and_downsample
level_scales = self._relative_level_scales(resolution, units)
File ".../tiatoolbox/wsicore/wsireader.py", line 253, in _relative_level_scales
raise ValueError(f"Invalid units `{units}`.")
ValueError: Invalid units `None`.
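Since the error only surfaces after all patches have been predicted, one option is to validate the units before the expensive work starts. A minimal sketch with a hypothetical check_units helper; the list of accepted unit names is an assumption based on the toolbox documentation:

```python
VALID_UNITS = ("mpp", "power", "level", "baseline")  # assumed set of unit names


def check_units(units):
    """Fail fast with a clear message if units is missing or unknown."""
    if units not in VALID_UNITS:
        raise ValueError(
            f"Invalid units `{units}`. Expected one of {VALID_UNITS}."
        )
    return units


try:
    check_units(None)
except ValueError as error:
    print(error)
```

Calling a check like this at the top of predict would turn a failure after a minute of inference into an immediate one.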
Describe what you were trying to get done.
Produce readthedocs for branch docs-idars-algorithm
Tell us what happened, what went wrong, and what you expected to happen.
Unable to tell whether 'make html' had succeeded in building readthedocs or failed.
Unable to find readthedocs, assuming it had succeeded.
Paste the command(s) you ran and the output.
(tiatoolbox-dev) dbae:docs$ git status
On branch doc-idars-algorithm
Your branch is up to date with 'origin/doc-idars-algorithm'.
nothing to commit, working tree clean
(tiatoolbox-dev) dbae:docs$ make html
stdout and stderr files attached
stderr.txt
stdout.txt
Some tests are writing files into the project directory instead of into a temporary directory such as the one provided by pytest's tmp_path/tmpdir fixtures (https://docs.pytest.org/en/6.2.x/tmpdir.html).
Offending lines of code:
$HOME/.tiatoolbox instead of the project root?
I have added anyone listed in the git blame for the above lines as assignees.
Run tests with PyTest:
$ pytest
and check for created files.
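For reference, a minimal sketch of a test that keeps its output inside pytest's built-in tmp_path fixture. The write_summary function stands in for any toolbox code that writes files and is purely hypothetical:

```python
from pathlib import Path


def write_summary(out_dir: Path, text: str) -> Path:
    """Hypothetical function under test: writes a file into out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "summary.txt"
    target.write_text(text)
    return target


def test_write_summary(tmp_path):
    # pytest injects tmp_path, a pathlib.Path to a directory unique to this test.
    target = write_summary(tmp_path / "results", "ok")
    assert target.read_text() == "ok"
    assert target.parent == tmp_path / "results"
```

Nothing is written relative to the current working directory, so running pytest from the project root leaves the checkout clean.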
I forgot to copy some jp2 and svs slide images in my Docker container. When I tried to create their thumbnails, for the images in the jp2 format I got an IndexError, while for the svs images I got an error from OpenSlide that the file is missing. I would expect for the jp2 formats (and all other formats) to get a similar error message, informing me what went wrong and how to fix it.
Output for missing svs image
Traceback (most recent call last):
File "//read_image.py", line 4, in <module>
wsi_reader_v1 = wsireader.get_wsireader(input_img='./test1.svs')
File "/usr/local/lib/python3.9/site-packages/tiatoolbox/wsicore/wsireader.py", line 2417, in get_wsireader
return WSIReader.open(input_img)
File "/usr/local/lib/python3.9/site-packages/tiatoolbox/wsicore/wsireader.py", line 137, in open
return OpenSlideWSIReader(input_img, mpp=mpp, power=power)
File "/usr/local/lib/python3.9/site-packages/tiatoolbox/wsicore/wsireader.py", line 1346, in __init__
self.openslide_wsi = openslide.OpenSlide(filename=str(self.input_path))
File "/usr/local/lib/python3.9/site-packages/openslide/__init__.py", line 160, in __init__
self._osr = lowlevel.open(filename)
File "/usr/local/lib/python3.9/site-packages/openslide/lowlevel.py", line 128, in _check_open
raise OpenSlideUnsupportedFormatError(
openslide.lowlevel.OpenSlideUnsupportedFormatError: Unsupported or missing image file
Output from missing jp2 image
Traceback (most recent call last):
File "//read_image.py", line 13, in <module>
wsi_info = wsi_reader_v1.info.as_dict()
File "/usr/local/lib/python3.9/site-packages/tiatoolbox/wsicore/wsireader.py", line 190, in info
self._m_info = self._info()
File "/usr/local/lib/python3.9/site-packages/tiatoolbox/wsicore/wsireader.py", line 1718, in _info
description = box[3].xml.find("description")
IndexError: list index out of range
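A uniform error for every format could come from checking the path before any format-specific reader is chosen. A minimal sketch; require_existing_file is a hypothetical helper, not an existing toolbox function:

```python
from pathlib import Path


def require_existing_file(input_img) -> Path:
    """Return input_img as a Path, or raise a clear error if it is missing."""
    path = Path(input_img)
    if not path.is_file():
        raise FileNotFoundError(f"Input file does not exist: {path}")
    return path


try:
    require_existing_file("./definitely_missing_slide.jp2")
except FileNotFoundError as error:
    print(error)
```

Running this check first would make missing jp2, svs and other files all fail the same way, with the offending path in the message.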
I was trying to understand why various versions of jupyter notebooks that I had been editing kept on crashing under various circumstances.
Since I was previously able to run this file in Colab without difficulty and without intervening, and since it was probably checked on Colab before being merged, my conclusion is that the problem is due to recent changes in tiatoolbox.
IMPORTANT NOTES: The master version of idars.ipynb is dated 31 January 2022, so the change in the repo causing these problems must be subsequent to that. I think that I observed no problem two weeks ago. Problems with this file have nothing to do with defects in my virtual environment (discussed in Issue #352) since everything relevant was done on Colab.
Following points need to be amended in the documentation
In section "Reading WSI Image Data", the keyword for the WSIReader function is "input_img" and not "input_path" as mentioned in https://tia-toolbox.readthedocs.io/en/latest/usage.html . Also, the documentation should use get_wsireader() instead of wsireader(). An update is required if any of this functionality is being deprecated.
slide_param = wsi.info() causes an error because info is a property; calling it without () works. Its documentation is also missing.
In the slide_thumbnail function, OpenSlideWSIReader is redundant; get_wsireader() is enough.
Explore how we can include line gaps in sphinx docs. There needs to be some gap between the descriptions of the slide_thumbnail() and get_wsireader() functions.
The function save_tiles() raises an unexpected keyword argument error for tile_read_size_w and tile_read_size_h.
In part “Functional” in “Accessing Metadata”, AttributeError: 'WSIMeta' object has no attribute 'file_name'. Replace file_name with file_path
There is a () in the import statement in the example of class tiatoolbox.tools.stainnorm.CustomNormaliser. The same () needs to be removed for the next few stain normalizers. Add an example of stain_matrix (defining it and giving it some value).
Import statement spelling mistake staiextract instead of stainextract
Function name mistake but function performs as given in example of tiatoolbox.utils.misc.mpp2objective_power
Numpy import statement is missing but used as np in example of tiatoolbox.utils.transforms.background_composite
Toolbox throws an exception for the pixman version.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/opt/miniconda3/envs/isyntax2zarr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
self.run()
File "/opt/miniconda3/envs/isyntax2zarr/lib/python3.7/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/opt/miniconda3/envs/isyntax2zarr/lib/python3.7/site-packages/tiatoolbox/utils/env_detection.py", line 283, in _show_warning
version, using = pixman_version()
File "/opt/miniconda3/envs/isyntax2zarr/lib/python3.7/site-packages/tiatoolbox/utils/env_detection.py", line 249, in pixman_version
version = tuple(int(part) for part in matches.group(1).split("."))
AttributeError: 'NoneType' object has no attribute 'split'
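The message shows that matches.group(1) returned None, so .split was called on None. A defensive sketch that returns None when no version can be parsed; the pattern and the parse_version name are hypothetical and do not reproduce the toolbox's own regular expression:

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a dotted run of integers such as 0.40.0.
VERSION_PATTERN = re.compile(r"(\d+(?:\.\d+)+)")


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Return the first dotted version in text as a tuple, or None."""
    match = VERSION_PATTERN.search(text)
    # Guard both cases: no match at all, and a group that did not participate
    # (possible with patterns that contain optional groups).
    if match is None or match.group(1) is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


print(parse_version("pixman 0.40.0"))
print(parse_version("no version here"))
```

The caller can then skip the warning when the result is None instead of raising inside a background thread.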
For better reproduction of the environment and examples, docker support can be added. The docker can run on any environment and the user can get hands-on with the Library easily without wondering about dependencies.
Below are the steps that can be adopted.
I can help if that sounds needed.
The documentation https://tia-toolbox.readthedocs.io/en/stable/usage.html#module-tiatoolbox.tools.patchextraction suggests that the parameter input_mask can be a VirtualWSIReader. However, this doesn't seem to be actually supported.
from tiatoolbox.tools import patchextraction
mask = wsi_reader_v1.tissue_mask(resolution=5, units='level')
fixed_patch_extractor = patchextraction.get_patch_extractor(
input_img='/home/deep/users/ralf/code/notebooks/example2.svs', # input image path, numpy array, or WSI object
input_mask = mask,
method_name="slidingwindow", # also supports "point" and "slidingwindow"
patch_size=(500, 500), # size of the patch to extract around the centroids from centroids_list
stride=(250, 250), # 250 pixels overlap in both axes
resolution=1,
units='level',
within_bound=True
)
This gives the error message
TypeError: expected str, bytes or os.PathLike object, not VirtualWSIReader
thrown by the line
self.mask = wsireader.VirtualWSIReader(
input_mask, info=self.wsi.info, mode="bool"
)
input_mask argument (which would have to be a path to a tiff for this to work), rather than simply checking that input_mask is of type WSIReader, as the docs suggest
input_img could also accept a WSIReader object rather than a path to a file - if I already created the object in memory then there is no point doing it again.
I would like to suggest changing the WSIReader option currently called "power" to "appmag" (or similar). The word "power" was a carry over from the openslide metadata referring to "objective power". However, this is (in most cases) actually referring to the total apparent magnification, not the power of the objective lens alone. This small change would avoid confusion. We could keep "power" functioning as an alias with a deprecation notice at least until version 1.0 for backward compatibility.
###Describe what you were trying to get done.
###Tell us what happened, what went wrong, and what you expected to happen.
I am trying to edit the file https://github.com/TIA-Lab/tiatoolbox/blob/develop/examples/01_example_wsiread.ipynb by loading a .svs dataset from my Drive into Colab, but the "Reading in a WSI" chunk of code is not responding. The directory tmp is not created after I run the "Reading in a WSI" chunk of code. I suspect that the path I am giving to load my dataset from Drive is not working (because for a split second I saw the error "filename: cannot be fetched from the backend"). Can anyone please help me out?
Dataset: The dataset I am trying to load and open/read can be accessed at this link: https://www.dropbox.com/sh/qnbs012c11575jy/AAAX3TlB0nyEvH1SgtPO8Z9Ra/TCGA-A8-A09K-01Z-00-DX1.41B2DF5F-C0E1-43BB-BAA5-2946A9EC4650?dl=0&preview=TCGA-A8-A09K-01Z-00-DX1.41B2DF5F-C0E1-43BB-BAA5-2946A9EC4650.svs&subfolder_nav_tracking=1.
The dataset is a .svs file of 0.97 GB.
Edit: Also if I want to read a dataset file with extension .qpdata (size: 33KB), can I do that with the same "Reading in a WSI" chunk of code? Or we can only read the .svs files in this example.
Note: It's my first time working with WSI data and also discussing an issue on GitHub, so please excuse any novice mistakes. Any kind of help will be appreciated.
from google.colab import drive
drive.mount('/content/drive')
data_dir = './tmp'
sample_file_name = 'sample_wsi.svs'
user_sample_wsi_path = '/content/drive/MyDrive/TCGA-A8-A09K-01Z-00-DX1.41B2DF5F-C0E1-43BB-BAA5-2946A9EC4650.svs'
if user_sample_wsi_path is None:
    sample_wsi_path = '%s/%s' % (data_dir, sample_file_name)
else:
    sample_wsi_path = user_sample_wsi_path
if not os.path.exists(sample_wsi_path):
    # os.mkdir(data_dir)
    # r = requests.get( " " )
    print('path not found!')
    with open(sample_wsi_path, "wb") as f:
        f.write(r.content)
I tried to use the resnet18-kather100k pretrained model for transfer learning (prediction). The issue seems to be the same with other pretrained models too, but this may need a double check. Saad also had the same issue when I asked him to verify.
The error reports missing and unexpected keys when trying to initialise the state dict.
I believe I know what is causing the issue: in either the unweighted model or the resnet18-kather100k model itself, there are two misspelt keys, 'classifer.weight' and 'classifer.bias'; these should both be 'classifier'.
Ran the line:
predictor = PatchPredictor(pretrained_model='resenet18-kather100k', batch_size=32)
RuntimeError: Error(s) in loading state_dict for CNNmodel:
Missing key(s) in state_dict: 'classifer.weight', 'classifer.bias'.
Unexpected key(s) in state_dict: 'classifier.weight', 'classifier.bias'.
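Until the spelling is corrected at the source, a temporary workaround is to rename the checkpoint keys so that they match what the model expects. A minimal sketch on a plain dictionary; rename_prefix is a hypothetical helper, the strings stand in for tensors, and the 'feat_extract.0.weight' key is invented for illustration:

```python
from typing import Any, Dict


def rename_prefix(state_dict: Dict[str, Any], old: str, new: str) -> Dict[str, Any]:
    """Return a copy of state_dict with a leading key prefix replaced."""
    renamed = {}
    for key, value in state_dict.items():
        if key.startswith(old):
            key = new + key[len(old):]
        renamed[key] = value
    return renamed


# Keys as found in the checkpoint, per the "Unexpected key(s)" message.
checkpoint = {
    "feat_extract.0.weight": "w0",
    "classifier.weight": "w1",
    "classifier.bias": "b1",
}
# The model expects the misspelt prefix, per the "Missing key(s)" message.
fixed = rename_prefix(checkpoint, "classifier.", "classifer.")
print(sorted(fixed))
```

The renamed dictionary could then be passed to load_state_dict. Correcting the attribute name in the model definition would be the cleaner long-term fix.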
The link on the PyPI page to the GitHub repo is broken.
Update setup.py to point to the correct locations for homepage and source code.
The get_wsireader method currently does not support .tiff files, errors with FileNotSupported. Can this be added as .tiff files are already supported in the OpenSlideWSIReader method?
Trying to read in a tiff file called wsi_file_name:
wsi_reader_v1 = wsireader.get_wsireader(
input_img=wsi_file_name)
Errors with the following:
FileNotSupported Traceback (most recent call last)
<ipython-input-4-027d95e28b58> in <module>
1 wsi_reader_v1 = wsireader.get_wsireader(
----> 2 input_img=wsi_file_name)
3 print(type(wsi_reader_v1))
~/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox-0.6.0-py3.7.egg/tiatoolbox/wsicore/wsireader.py in get_wsireader(input_img)
1794
1795 else:
-> 1796 raise FileNotSupported("Filetype not supported.")
1797 elif isinstance(input_img, np.ndarray):
1798 wsi = VirtualWSIReader(input_img)
FileNotSupported: Filetype not supported.
While .tiff files can still be read with the OpenSlideWSIReader class, downstream classes such as the patch predictor also fail with this same message because they use get_wsireader.
Notebooks need to run on the following platforms:
Colab without GPU, Colab with GPU, Kaggle without GPU, Kaggle with GPU, jalapeno (following the recipe from John P or from Gozde), and personal machines, for example macOS (with possibly different versions depending on whether the chip is M1 or Intel), Windows, and Linux.
This is probably going to need a proliferation of yaml files, and a new look at README, and possibly at other files, such as CONTRIBUTING.rst, docs/installation.rst, and setup.*
The minimum requirement is that the notebooks should run without error on Colab without GPU. However, this is not the case for some recent PRs I was asked to review. Admittedly, the PRs concerned have been given Draft status by their authors.
However, it seems to me that a can of worms is being covered up, and this urgently needs to be cleaned up.
Using get_wsireader on a TIFF file converted from JP2 raises an error, but using the OpenSlideWSIReader class works. This in turn means that classes like SemanticSegmentor that use get_wsireader also don't work with such files.
Code to reproduce this
wsi = get_wsireader(".../file.tiff")
results in the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/robj/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/wsicore/wsireader.py", line 2417, in get_wsireader
return WSIReader.open(input_img)
File "/home/robj/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/wsicore/wsireader.py", line 134, in open
return VirtualWSIReader(input_img, mpp=mpp, power=power)
File "/home/robj/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/wsicore/wsireader.py", line 1816, in __init__
self.img = utils.misc.imread(self.input_path)
File "/home/robj/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/utils/misc.py", line 160, in imread
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.5.5) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'
but using
wsi = OpenSlideWSIReader(".../file.tiff")
on the same file runs absolutely fine.
TIFFWSIReader gives a different error
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/robj/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/wsicore/wsireader.py", line 2053, in __init__
raise ValueError("Unsupported TIFF WSI format.")
ValueError: Unsupported TIFF WSI format.
When using OpenSlide on Ubuntu 20.x versions, the included libpixman library version (0.38) causes errors when reading any pyramid level other than 0. See issues openslide/openslide-python#114 and openslide/openslide#291 for more. With conda this can easily be fixed by installing another version (our requirements files use the correct version). However, if installing from pip and using apt for packages such as openslide and openjpeg, this will be an issue, as apt will not (without a lot of fussing) allow installing another version.
Add a check on import of OpenSlideReader to check if the system is GNU Linux and pixman version 0.38 is being used. If so, display a warning to the user. The user may have to use conda or install a different pixman version from source.
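A rough sketch of what such a check might look like follows. It is an illustration under stated assumptions, not the toolbox's implementation: the function names are hypothetical, and how the installed pixman version string is discovered is deliberately left out, so the version is simply passed in as an argument.

```python
import platform
import warnings


def pixman_warning_needed(system, pixman_version):
    """Return True for the combination described above: Linux with pixman 0.38.x."""
    if system != "Linux" or pixman_version is None:
        return False
    # Compare only the major and minor components, e.g. "0.38.4" -> ["0", "38"].
    return pixman_version.split(".")[:2] == ["0", "38"]


def warn_if_bad_pixman(pixman_version, system=None):
    """Emit a warning if the known-problematic pixman version is in use.

    pixman_version is assumed to be supplied by the caller; detecting it
    is outside the scope of this sketch.
    """
    system = platform.system() if system is None else system
    if pixman_warning_needed(system, pixman_version):
        warnings.warn(
            "pixman 0.38 detected: reading pyramid levels other than 0 may "
            "fail. Consider installing pixman via conda or from source.",
            stacklevel=2,
        )
        return True
    return False


# A version outside the 0.38 series does not trigger the warning.
print(warn_if_bad_pixman("0.40.0", system="Linux"))
```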
It would be very useful to have a way to read a WSI using DeepZoom or Zoomify tile indexes.
There are two main use cases for this:
get_tile function.
I propose having a get_tile function to get a single tile and a write_tiles (or similar name) function to dump all tiles to disk.
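To make the tile-index idea concrete, here is a minimal sketch of how a DeepZoom-style (level, column, row) index could be mapped to a region in baseline pixels. It assumes the usual DeepZoom convention that the highest level is full resolution and each lower level halves the dimensions, and it ignores tile overlap for simplicity. The function name deepzoom_tile_bounds and its parameters are hypothetical and are not part of any existing API.

```python
import math


def deepzoom_tile_bounds(slide_size, level, col, row, tile_size=256):
    """Return (left, top, right, bottom) in baseline pixels for one tile.

    slide_size is (width, height) at baseline. Overlap between tiles is
    ignored in this sketch.
    """
    width, height = slide_size
    # The top DeepZoom level is the full-resolution image.
    max_level = math.ceil(math.log2(max(width, height)))
    if not 0 <= level <= max_level:
        raise ValueError("level out of range")
    scale = 2 ** (max_level - level)  # baseline pixels per level pixel
    level_w = math.ceil(width / scale)
    level_h = math.ceil(height / scale)
    left, top = col * tile_size, row * tile_size
    if col < 0 or row < 0 or left >= level_w or top >= level_h:
        raise IndexError("tile index outside the image at this level")
    # Edge tiles are clipped to the level dimensions.
    right = min(left + tile_size, level_w)
    bottom = min(top + tile_size, level_h)
    return (
        left * scale,
        top * scale,
        min(right * scale, width),
        min(bottom * scale, height),
    )


print(deepzoom_tile_bounds((1024, 512), level=9, col=1, row=0))
```

A get_tile function could then read these bounds at the matching resolution, and a write_tiles function could loop over every valid index per level.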
It appears that pytest-cov cannot detect code coverage for any function run in a subprocess using the decorator function in multiproc.py. I am not clear if this is a multiprocessing issue or a decorator issue. This needs to be resolved to get anything near accurate code coverage.
There is some information on pytest-cov subprocesses here: https://pytest-cov.readthedocs.io/en/latest/subprocess-support.html.
It is likely due to pathos not being compatible with code coverage tools 😞
During mask-based patch extraction, I realized that the default coordinate selection helper function (default_sel_func) in the filter_coordinates method does not work as expected. The reason is that, for each input coordinate, default_sel_func uses the read_bounds method of the mask's VirtualWSI to extract the corresponding region. However, during the read_bounds call, the resolution and unit arguments are provided and coord_space is set to "resolution":
Now, because the coord input is a coordinate that has already been converted to the 'baseline' resolution, read_bounds will extract regions according to the baseline coords but from the user's selected resolution and unit (which are set during the initialization of the patch extractor), and this results in the wrong patch being extracted from the mask. Note that this is not an issue if the user wants to extract patches at level 0 (baseline resolution); the problem only arises if the user wants to extract patches at higher levels.
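The size of the mismatch can be illustrated with a little arithmetic. The sketch below assumes a single downsample factor between baseline and the requested resolution; the helper names are hypothetical and are not TIAToolbox functions.

```python
def baseline_to_resolution_bounds(bounds, downsample):
    """Convert (left, top, right, bottom) from baseline to resolution space."""
    return tuple(int(round(v / downsample)) for v in bounds)


def resolution_to_baseline_bounds(bounds, downsample):
    """Convert (left, top, right, bottom) from resolution to baseline space."""
    return tuple(int(round(v * downsample)) for v in bounds)


baseline_bounds = (2048, 2048, 4096, 4096)
downsample = 4  # assumed factor between baseline and the requested resolution

# What should be read when coordinates are interpreted in resolution space:
correct = baseline_to_resolution_bounds(baseline_bounds, downsample)

# What is effectively covered if the baseline numbers are passed unchanged
# but interpreted as resolution-space coordinates:
actually_covered = resolution_to_baseline_bounds(baseline_bounds, downsample)

print(correct)
print(actually_covered)
```

With a downsample of 1 the two interpretations agree, which matches the observation that the problem does not appear at baseline resolution.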
read_bounds method's input, except for coord. This way, read_bounds will use coord_space='baseline' as expected.
roi = reader.read_bounds(coord)
read_rect method (similar to patch extraction from the original image in the __getitem__ method), if it's a more efficient way to extract patches at various resolutions:
I was trying to run one of our examples. Because of a coding error (not by me), the run came to a juddering halt with an error message. Afterwards, I noticed (but only because of using "git status") that a file had incorrectly been left behind, and would not be cleaned up for the next run.
To avoid problems, each notebook should begin each run with the same collection of unaltered files. Remember that these example notebooks will be run by a very large variety of users, some with very little experience.
Deleted the offending file and wrote this issue.
The right way to program the clean-up is to create a special directory into which all downloaded or program-created files are placed. This is done correctly in the first notebooks written for TIAToolbox (e.g. wsi-reading.ipynb).
At the beginning of the Jupyter notebook, delete that special directory and its contents. It is easy to leave debris lying around if this is not done. For example, the user may get bored and interrupt the execution, or there could be a very brief power cut or something else that you haven't thought of.
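The clean-up pattern described above might look like the following sketch. The helper name reset_working_dir is hypothetical, and the demonstration runs inside a temporary directory so that it leaves nothing behind; in a notebook the argument would be the notebook's own scratch directory.

```python
import shutil
import tempfile
from pathlib import Path


def reset_working_dir(path):
    """Delete path and everything in it, then recreate it empty."""
    path = Path(path)
    # ignore_errors covers the first run, when the directory does not exist.
    shutil.rmtree(path, ignore_errors=True)
    path.mkdir(parents=True)
    return path


with tempfile.TemporaryDirectory() as base:
    work = reset_working_dir(Path(base) / "tmp")
    # Simulate debris left behind by an interrupted run.
    (work / "leftover.txt").write_text("debris from an interrupted run")
    # The next run starts by resetting the directory again.
    work = reset_working_dir(Path(base) / "tmp")
    remaining = list(work.iterdir())

print(remaining)
```

Because the reset happens at the start rather than the end of a run, it still works when the previous run was interrupted before any clean-up code could execute.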
The existing Jupyter notebooks should be checked and improved where necessary.
It would be great if this was available on conda in addition to pip.
N.A.
Many of the TIA toolbox APIs silently assume complete slide metadata, e.g. mpp and power. However, some slides may have incomplete metadata (I myself work with slides that don't have mpp and power metadata), and that will break many of the APIs, e.g. there are places in the code where some mask is created and the unit "power" is hardcoded. This will fail if metadata is missing, and the user has no way to recover from it.
Since the level coordinate units are always going to be there, it would be good if there was a way to use all APIs with this coordinate system, perhaps even as default.
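One way the requested fallback could behave is sketched below: prefer mpp, then power, and fall back to level units when neither is present. The function name and default targets are hypothetical choices for illustration only; they do not describe how TIAToolbox currently selects units.

```python
def choose_resolution(mpp=None, objective_power=None,
                      target_mpp=0.5, target_power=20, fallback_level=0):
    """Pick a (resolution, units) pair that the available metadata supports."""
    if mpp is not None:
        return target_mpp, "mpp"
    if objective_power is not None:
        return target_power, "power"
    # Pyramid levels exist regardless of mpp/power metadata.
    return fallback_level, "level"


# A slide with no mpp or power metadata falls back to level units.
print(choose_resolution())
```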
A test in test_stainaugment is causing a whole PNG file to be dumped to stdout. Here is some sample output from Travis:
Update: It appears in different places on different runs and is likely not to do with stain augmentation but rather some other thread or process dumping to stdout during tests running.
tests/test_stainaugment.py Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/%{ <-- HERE (.*?)}/ at /usr/bin/run-mailcap line 528.
Error: no "view" rule for type "image/png" passed its test case
(for more information, add "--debug=1" on the command line)
Can't create config directory (/.w3m)!
[raw PNG bytes follow in the log, starting with the PNG signature and including IHDR and IDAT chunks; binary content omitted]
IHDR and IDAT are markers within the PNG file.
I have been thinking about more optimal methods for iterating over a WSI, e.g. when generating tiles, performing patch extraction, or running inference. I wanted to share my thoughts and start a discussion.
Region reads/tiling/patch extraction may not be very performant (particularly for JPEG-2000 images). This is because 'under the hood' OpenSlide/glymur/tifffile will have to decode 1-4 or more tiles (depending on the number of tile borders that the requested region passes over). Read performance can be dramatically improved by caching the tiles read from the underlying file format to avoid decoding the same tile multiple times.
However, the obvious / naive way to do this by iterating over the x and y coordinates in steps of the tile size is still sub-optimal. For example, if the tile cache size is less than the number of tiles across the slide there is little gain from subsequent passes over the same tiles.
Simulation of raster order processing with a tile cache of size 5:
A more optimal ordering is Morton/z order or a Hilbert curve. This may still be sub-optimal in some cases but in general should perform better than a raster scan across the tiles and is fairly simple to implement.
Simulation of Morton/z order processing with a tile cache of size 5:
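For readers who want to experiment with the two orderings, here is a small, self-contained simulation. It is a sketch under simplifying assumptions of my own: every region read is assumed to overlap a 2x2 block of underlying tiles, and the tile cache is modelled as a simple LRU. The function names are hypothetical, and no particular result is claimed; the point is only to show how the orderings can be generated and compared.

```python
from collections import OrderedDict


def morton_key(x, y, bits=16):
    """Interleave the bits of x and y to get a Morton (z-order) index."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def count_tile_decodes(order, cache_size):
    """Count tile decodes (cache misses) for a sequence of region reads."""
    cache = OrderedDict()
    decodes = 0
    for x, y in order:
        # Assumption: each region overlaps a 2x2 block of tiles.
        for tile in ((x, y), (x + 1, y), (x, y + 1), (x + 1, y + 1)):
            if tile in cache:
                cache.move_to_end(tile)  # mark as most recently used
            else:
                decodes += 1
                cache[tile] = True
                if len(cache) > cache_size:
                    cache.popitem(last=False)  # evict least recently used
    return decodes


size = 8
raster = [(x, y) for y in range(size) for x in range(size)]
morton = sorted(raster, key=lambda p: morton_key(*p))

print("raster decodes:", count_tile_decodes(raster, cache_size=5))
print("morton decodes:", count_tile_decodes(morton, cache_size=5))
```

Varying size, cache_size, and the overlap model is a quick way to see in which regimes each ordering helps.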
Hi,
Often you might have labels only in some region of the slide (e.g. all cells labeled in some ROI), and you provide a mask (e.g. built from the ROI polygon) to extract patches only from that region. When you use within_bound=False, patches are extracted beyond the boundary as well (as expected, and fine), but there should also be an option to mask the part of the patch that lies outside the mask. Currently, turning padding on or off is only supported for regions outside the WSI. I would like to have this kind of support for the "out of mask" region inside the slide as well (e.g. parts of border patches extracted with within_bound=False will not have complete labels, and the tissue there should be masked with a constant value).
I see 2 options how to implement this:
If you see the use of this and can suggest which option would be best, I can implement it and open a PR. I have a real-world use case on which I can try it :)
We are currently using a conda yaml file to set up the build environment for Read The Docs but apt install command with pip requirements file on Travis CI. I suggest moving to using conda for both Read The Docs and Travis. This way we have one less thing to maintain. We currently have these places where we define dependencies:
See Using Conda with Travis CI for more.
Edit: Typo fixes
I tested this with both the most recent develop and feature-patchwsi commits. When using the CNNPatchPredictor class to predict in wsi mode, the predict method errors both with the example arguments from example notebook 05 and with the updated arguments for the method that are not included in the notebook.
Running this as per the example in the notebook
wsi_output = predictor.predict([wsi_file_name],
                               mode='wsi',
                               stride_size=(112,112),
                               return_probabilities=True)
results in the error:
AttributeError Traceback (most recent call last)
<ipython-input-13-aeabd4a04e59> in <module>
2 mode='wsi',
3 stride_size=(112,112),
----> 4 return_probabilities=True)
~/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox-0.6.0-py3.7.egg/tiatoolbox/models/classification/patch_predictor.py in predict(self, img_list, mask_list, label_list, mode, return_probabilities, return_labels, on_gpu, patch_size, stride_size, resolution, units, merge_predictions, save_dir)
539 mask_path=img_mask,
540 patch_size=self.iostate.patch_size,
--> 541 stride_size=self.iostate.stride_size,
542 resolution=self.iostate.input_resolutions[0]["resolution"],
543 units=self.iostate.input_resolutions[0]["units"],
AttributeError: '_IOStatePatchPredictor' object has no attribute 'stride_size'
Additionally, after looking through the source code and adding in the new arguments for the predict method:
wsi_output = predictor.predict([wsi_jp2],
                               mode='wsi',
                               patch_size=(244,244),
                               return_probabilities=True,
                               resolution=0.5,
                               units='mpp',
                               merge_predictions=True
                               )
this also errors with the following
AttributeError Traceback (most recent call last)
<ipython-input-31-270288848f10> in <module>
5 resolution=0.5,
6 units='mpp',
----> 7 merge_predictions=True
8 )
~/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/models/classification/patch_predictor.py in predict(self, img_list, mask_list, label_list, mode, return_probabilities, return_labels, on_gpu, patch_size, stride_size, resolution, units, merge_predictions, save_dir)
545 stride_size=self.iostate.stride_size,
546 resolution=self.iostate.input_resolutions[0]['resolution'],
--> 547 units=self.iostate.input_resolutions[0]['units'],
548 )
549 output_model = self._predict_engine(
~/.conda/envs/tiatoolbox/lib/python3.7/site-packages/tiatoolbox/models/dataset/classification.py in __init__(self, img_path, mode, mask_path, preproc_func, patch_size, stride_size, resolution, units, auto_get_mask)
298 mask_reader = self.reader.tissue_mask(resolution=1.25, units="power")
299 # ? will this mess up ?
--> 300 mask_reader.attach_to_reader(self.reader.info)
301
302 if mask_reader is not None:
AttributeError: 'VirtualWSIReader' object has no attribute 'attach_to_reader'
Seems the answer to the comment in the source code is that yes this will mess up :).
Extracting patches with their x and y coordinates
patch_extractor = patchextraction.get_patch_extractor(
input_img=wsi, # input image path, numpy array, or WSI object
method_name="slidingwindow",
patch_size=(256, 256), # size of the patch to extract around the centroids from centroids_list
stride=(256, 256))
MethodNotSupported: Method is not supported