Comments (4)
Hello @smcch,
your pipeline looks really good!
But I would recommend some small improvements in order to obtain a state-of-the-art pipeline:
- REQUIRED: Error Fix:
Augmentation techniques regularly require the image/volume to be in a 0-255 encoding such as grayscale or RGB. This is why AUCMEDI throws this exception if that is not the case for a passed sample. You can disable this assertion with the following lines of code:
data_aug= VolumeAugmentation(...)
data_aug.refine = False
Since you are performing just flipping, rotation, and brightness adjustments, you should be fine.
- HIGHLY RECOMMENDED: Preprocessing
Your volumes have to be the same shape when passed into the neural network model.
Currently, this is ensured by the resize parameter in the DataGenerator. However, this is not the ideal shape (commonly 64x64x64 is used, resulting in drastic resizing and information loss) and it also breaks the voxel spacing again.
It also works without preprocessing, but if you want optimal performance, I would highly recommend it.
This is why center cropping & padding is commonly used as preprocessing (called Subfunctions in AUCMEDI) in the literature.
I would recommend adding a padding to (160x160x80) followed by a center crop to (160x160x80) as Subfunctions.
But please validate that your Region-of-Interest is most likely inside such a crop.
You can adjust the final shape as you wish (I also recommend adjusting it depending on the voxel spacing). The idea here is to find a high-enough resolution that fits at least 6-8 times (batch size) in your GPU VRAM. Be aware that VRAM usage increases after the transfer learning phase, because only part of the architecture is used during transfer learning.
I would also adjust the batch size in the DataGenerator. Micro-batch sizes of 6-8 are fine for 3D volumes.
Otherwise, you could run into an OOM issue after transfer learning (epoch 10).
- RECOMMENDED: Resampling
Currently, you correctly pass the sitk_loader to the DataGenerator. However, by default sITK will resample your scans to 1.0x1.0x1.0 voxel spacing. This will probably result in quite large volumes, which may not fit in your GPU.
In my experiments, I found the voxel spacing (1.58, 1.58, 2.70) ideal for 24GB VRAM GPUs like an NVIDIA TITAN RTX.
- RECOMMENDED: Better Augmentation:
I would recommend using batchgenerators instead of volume augmentation, because it is far more robustly implemented and is used in high-performance studies by the DKFZ (German Cancer Research Center).
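The padding-then-center-crop idea from the preprocessing point can be sketched with plain NumPy. This is only an illustration of the technique, not AUCMEDI's Subfunction implementation; the helper names `pad_to_min_shape` and `center_crop` and the example volume shape are assumptions made up for this sketch.

```python
import numpy as np

def pad_to_min_shape(volume, target_shape, mode="edge"):
    """Pad each axis symmetrically until it is at least as large as target_shape."""
    pad_width = []
    for size, target in zip(volume.shape, target_shape):
        missing = max(target - size, 0)          # voxels missing on this axis
        pad_width.append((missing // 2, missing - missing // 2))
    return np.pad(volume, pad_width, mode=mode)

def center_crop(volume, target_shape):
    """Cut a centered block of target_shape out of a volume that is large enough."""
    if any(s < t for s, t in zip(volume.shape, target_shape)):
        raise ValueError("volume is smaller than the crop shape; pad first")
    slices = tuple(slice((s - t) // 2, (s - t) // 2 + t)
                   for s, t in zip(volume.shape, target_shape))
    return volume[slices]

# Hypothetical volume whose last axis is smaller than the target shape.
volume = np.random.rand(193, 229, 72)
target = (160, 160, 80)

# Padding first guarantees that the crop never sees a volume that is too small.
fixed = center_crop(pad_to_min_shape(volume, target), target)
print(fixed.shape)  # (160, 160, 80)
```

Applying the steps in the opposite order would fail on this example, because the crop would be asked for 80 slices from an axis that only has 72.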
Everything else looks perfect! Good luck with your study!
Feel free to ask or report if you run into any further issues.
Best Regards,
Dominik
from aucmedi.
Thank you so much!
Now I can perform the data augmentation via BatchgeneratorsAugmentation(). But I still get an error about the crop/padding subfunctions. My original NIfTI images have the shape 193 x 229 x 193. I am not sure how to use "resize" or "input_shape", because it appears in NeuralNetwork(), BatchgeneratorsAugmentation(), DataGenerator()...
Here is my code:
from aucmedi import *
from aucmedi.data_processing.io_loader import image_loader
from aucmedi.data_processing.io_loader import sitk_loader
ds = input_interface(interface="csv",
path_imagedir=r"C:\Users\Santiago\PycharmProject\CUDA_CNN\patients",
path_data=r"C:\Users\Santiago\PycharmProject\CUDA_CNN\input.csv",
ohe=False, # OHE short for one-hot encoding
col_sample="ID", col_class="CLASS")
(index_list, class_ohe, nclasses, class_names, image_format) = ds
model = NeuralNetwork(n_labels=nclasses, channels=1, architecture="3D.ResNet50", activation_output="softmax")
from aucmedi.evaluation import *
evaluate_dataset(
samples=index_list,
labels=class_ohe,
out_path=r"*****\CUDA_CNN",
class_names=class_names,
plot_barplot=True
)
from aucmedi.sampling.split import sampling_split
set_train, set_val, set_test = sampling_split(
samples=index_list, # list of sample names
labels=class_ohe, # list of corresponding labels
sampling=[0.5, 0.25, 0.25], # percentage splits
stratified=True, # Stratified sampling based on the labels
seed=100
)
from aucmedi.data_processing import augmentation
data_aug = BatchgeneratorsAugmentation( image_shape= (160,160,80),
mirror=True,
rotate=True,
scale=True,
elastic_transform=False,
gaussian_noise=True,
brightness=True,
contrast=True,
gamma=True,
)
from aucmedi.data_processing.subfunctions import *
sf_crop= Crop(shape=(160, 160, 80), mode= "center")
sf_padding = Padding(mode= "square")
sf_list = [sf_crop, sf_padding]
from aucmedi.data_processing import data_generator
from aucmedi.data_processing.io_loader import sitk_loader
gen_train = data_generator.DataGenerator(
samples=set_train[0],
labels=set_train[1],
path_imagedir=r"******\patients",
image_format=image_format,
data_aug=data_aug,
batch_size= 4,
resize= (160,160,80),
seed=100,
loader= sitk_loader,
resampling= (1.58, 1.58, 2.70),
subfunctions=sf_list
)
gen_val = data_generator.DataGenerator(
samples=set_val[0],
labels=set_val[1],
path_imagedir=r"*****\patients",
image_format=image_format,
data_aug=data_aug,
batch_size= 4,
resize= (160,160,80),
seed=100,
loader=sitk_loader,
resampling=(1.58, 1.58, 2.70),
subfunctions=sf_list
)
history = model.train(
training_generator=gen_train,
validation_generator=gen_val,
epochs=50,
transfer_learning=True
)
gen_test = data_generator.DataGenerator(
samples=set_test[0],
labels=None,
path_imagedir=r"*****patients",
image_format=image_format,
data_aug=None,
batch_size=4,
resize=(160,160,80),
seed=100,
loader=sitk_loader,
resampling=(1.58, 1.58, 2.70),
subfunctions=sf_list
)
predictions = model.predict(
prediction_generator=gen_test
)
from aucmedi import evaluation
from aucmedi.utils.callbacks import *
evaluation.fitting.evaluate_fitting(
train_history=history,
out_path=r"*****\images",
suffix='from_memory',
show=True
)
from aucmedi.evaluation import metrics
scores = metrics.compute_metrics(
preds=predictions,
labels=set_test[1],
n_labels=nclasses
)
from aucmedi.evaluation import performance
performance.evaluate_performance(
preds=predictions,
labels=set_test[1],
out_path=r"*****\images",
class_names=class_names,
show=True,
multi_label=None
)
Here is the error:
Traceback (most recent call last):
File "C:\Users\Santiago\AppData\Roaming\JetBrains\PyCharmCE2021.2\scratches\scratch_11.py", line 91, in <module>
history = model.train(
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\aucmedi\neural_network\model.py", line 317, in train
history_start = self.model.fit(training_generator,
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\aucmedi\data_processing\data_generator.py", line 279, in _get_batches_of_transformed_samples
batch_img = self.preprocess_image(index=i,
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\aucmedi\data_processing\data_generator.py", line 345, in preprocess_image
img = sf.transform(img)
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\aucmedi\data_processing\subfunctions\crop.py", line 86, in transform
image_cropped = self.aug_transform(image=image)["image"]
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\volumentations\core\composition.py", line 60, in __call__
data = tr(force_apply, self.targets, **data)
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\volumentations\core\transforms_interface.py", line 117, in __call__
data[k] = self.apply(v, **params)
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\volumentations\augmentations\transforms.py", line 216, in apply
return F.center_crop(img, self.shape[0], self.shape[1], self.shape[2])
File "C:\Users\Santiago\PycharmProject\CUDA_CNN\venv\lib\site-packages\volumentations\augmentations\functional.py", line 124, in center_crop
raise ValueError
ValueError
I appreciate any help you can provide
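A rough shape calculation suggests why the center crop can fail here. The sketch below assumes the original 193 x 229 x 193 volumes have 1.0 mm isotropic spacing and that the spacing tuple maps onto the axes in the same order as the shape; both are assumptions, and the helper `resampled_shape` is made up for illustration rather than taken from AUCMEDI or SimpleITK.

```python
def resampled_shape(shape, spacing, new_spacing):
    """Approximate voxel count per axis after resampling to new_spacing."""
    return tuple(round(n * s / t) for n, s, t in zip(shape, spacing, new_spacing))

original_shape = (193, 229, 193)      # shape reported in the post
original_spacing = (1.0, 1.0, 1.0)    # assumption: 1 mm isotropic voxels
new_spacing = (1.58, 1.58, 2.70)      # resampling passed to the DataGenerator
crop_shape = (160, 160, 80)           # shape passed to Crop

shape = resampled_shape(original_shape, original_spacing, new_spacing)
print(shape)  # (122, 145, 71)

# Flag every axis on which the resampled volume is smaller than the crop.
too_small = [s < c for s, c in zip(shape, crop_shape)]
print(too_small)  # [True, True, True]
```

Under these assumptions the resampled volume is smaller than the crop on every axis, which is consistent with the bare ValueError raised in center_crop when Crop runs before Padding.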
from aucmedi.
I found a solution, and everything works except "resampling". When I tried to apply it, I got the "center_crop" error. My original images are already resampled to 1x1x1
from aucmedi import *
from aucmedi.data_processing.io_loader import image_loader
from aucmedi.data_processing.io_loader import sitk_loader
ds = input_interface(interface="csv",
path_imagedir=r"******\patients",
path_data=r"******\input.csv",
ohe=False, # OHE short for one-hot encoding
col_sample="ID", col_class="CLASS")
(index_list, class_ohe, nclasses, class_names, image_format) = ds
model = NeuralNetwork(n_labels=nclasses, channels=1, architecture="3D.ResNet50", input_shape= (160,180,160), activation_output="softmax")
from aucmedi.evaluation import *
evaluate_dataset(
samples=index_list,
labels=class_ohe,
out_path=r"*****\CUDA_CNN",
class_names=class_names,
plot_barplot=True
)
from aucmedi.sampling.split import sampling_split
set_train, set_val, set_test = sampling_split(
samples=index_list, # list of sample names
labels=class_ohe, # list of corresponding labels
sampling=[0.5, 0.25, 0.25], # percentage splits
stratified=True, # Stratified sampling based on the labels
seed=100
)
from aucmedi.data_processing import augmentation
data_aug = BatchgeneratorsAugmentation( image_shape= (160,180,160),
mirror=True,
rotate=True,
scale=True,
elastic_transform=False,
gaussian_noise=True,
brightness=True,
contrast=True,
gamma=True,
)
from aucmedi.data_processing.subfunctions import *
sf_padding = Padding(shape= (160,180,160),mode= "square")
sf_crop= Crop(shape=(160, 180, 160), mode= "center")
sf_list = [sf_padding, sf_crop]
from aucmedi.data_processing import data_generator
from aucmedi.data_processing.io_loader import sitk_loader
gen_train = data_generator.DataGenerator(
samples=set_train[0],
labels=set_train[1],
path_imagedir=r"*****\patients",
image_format=image_format,
data_aug=data_aug,
batch_size= 4,
resize= (160,180,160),
seed=100,
loader= sitk_loader,
#resampling= (1.58, 1.58, 2.70),
subfunctions=sf_list
)
gen_val = data_generator.DataGenerator(
samples=set_val[0],
labels=set_val[1],
path_imagedir=r"*****patients",
image_format=image_format,
data_aug=data_aug,
batch_size= 4,
seed=100,
loader=sitk_loader,
resize= (160, 180,160),
#resampling=(1.58, 1.58, 2.70),
subfunctions=sf_list
)
history = model.train(
training_generator=gen_train,
validation_generator=gen_val,
epochs=10,
transfer_learning=True
)
gen_test = data_generator.DataGenerator(
samples=set_test[0],
labels=None,
path_imagedir=r"****\patients",
image_format=image_format,
data_aug=None,
batch_size=4,
resize= (160, 180,160),
seed=100,
loader=sitk_loader,
#resampling=(1.58, 1.58, 2.70),
subfunctions=sf_list
)
predictions = model.predict(
prediction_generator=gen_test
)
from aucmedi import evaluation
from aucmedi.utils.callbacks import *
evaluation.fitting.evaluate_fitting(
train_history=history,
out_path=r"****\images",
suffix='from_memory',
show=True
)
from aucmedi.evaluation import metrics
scores = metrics.compute_metrics(
preds=predictions,
labels=set_test[1],
n_labels=nclasses
)
from aucmedi.evaluation import performance
performance.evaluate_performance(
preds=predictions,
labels=set_test[1],
out_path=r"****\images",
class_names=class_names,
show=True,
multi_label=None
)
from aucmedi.
Hey @smcch,
looking really good!
The issue is that you have to replace mode="square" with mode="edge" in the padding subfunction. Then, it should work! :)
I would also recommend adjusting the resampling to something like this: resampling=(1.58, 2.0, 1.58)
As your shape is 160x180x160, I assume that your CT scans look like 512xSLICESx512, correct?
I would also try out a shape like 180x80x180, since the number of slices is commonly smaller than the x-/y-axes. Maybe you can then free up some VRAM and increase the batch_size a little. Batch_size and input_size are a trade-off: a higher batch_size forces you to reduce input_size, and a higher input_size resolution forces you to lower your batch_size. But you want both to be as high as possible. For CT scans, I would recommend something like 180x80x180 for input_size and then a batch_size as high as possible. Usually, you can get something like 8.
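To get a feeling for the batch_size versus input_size trade-off, the two shapes can be compared by voxel count. This is a minimal sketch that only measures the float32 input tensor; the activations inside the network dominate real VRAM usage and are not modeled here. The helper `input_batch_mib` is a made-up name for illustration.

```python
import math

def input_batch_mib(shape, batch_size, channels=1, bytes_per_value=4):
    """Size of one input batch in MiB, assuming float32 values by default."""
    voxels = math.prod(shape) * channels
    return voxels * batch_size * bytes_per_value / 2**20

current = (160, 180, 160)    # shape used in the posted pipeline
suggested = (180, 80, 180)   # shape suggested above

print(math.prod(current))    # 4608000
print(math.prod(suggested))  # 2592000

# Relative size of the suggested shape compared to the current one.
print(math.prod(suggested) / math.prod(current))  # 0.5625

# Input tensor size only; treat it as a relative indicator, not a VRAM estimate.
print(input_batch_mib(current, batch_size=4))
print(input_batch_mib(suggested, batch_size=8))
```

The suggested shape holds a bit more than half the voxels of the current one, which is the headroom that could go into a larger batch_size.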
Also, which region of interest are you interested in? As you are working with CT, maybe you can add a clipping subfunction for Hounsfield Unit windowing corresponding to your desired ROI.
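Hounsfield Unit windowing itself is just a clip of the intensity range, and it only applies if the data are CT. The NumPy sketch below shows the idea outside of AUCMEDI; the function name `hu_window` and the window bounds are placeholders chosen for illustration, not a recommendation for any particular ROI. The rescaling to [0, 1] is an extra step added here for convenience.

```python
import numpy as np

def hu_window(volume, hu_min, hu_max):
    """Clip intensities to [hu_min, hu_max] and rescale them to [0, 1]."""
    clipped = np.clip(volume, hu_min, hu_max)
    return (clipped - hu_min) / (hu_max - hu_min)

# Hypothetical HU values: air, fat-like, soft-tissue-like, and two bone-like values.
scan = np.array([-1000.0, -50.0, 40.0, 300.0, 1500.0])

# Placeholder window; pick bounds that match the tissue you care about.
windowed = hu_window(scan, hu_min=-160, hu_max=240)
print(windowed)
```

Values outside the window collapse to 0 or 1, so contrast is spent only on the intensity range that contains the ROI.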
But your pipeline looks good!
I would also recommend increasing the number of training epochs to 1000 and adding some callbacks:
# Define callbacks
from tensorflow.keras.callbacks import ModelCheckpoint, CSVLogger, \
                                       ReduceLROnPlateau, EarlyStopping
cb_mc = ModelCheckpoint("model.best_loss.hdf5",
                        monitor="val_loss", verbose=1,
                        save_best_only=True, mode="min")
cb_cl = CSVLogger("logs.training.csv", separator=',', append=True)
cb_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5,
                          verbose=1, mode='min', min_lr=1e-7)
cb_es = EarlyStopping(monitor='val_loss', patience=25, verbose=1)
callback_list = [cb_mc, cb_cl, cb_lr, cb_es]
model.train(..., callbacks=callback_list)
Be aware that this will increase the training time but will commonly result in better performance.
from aucmedi.