
Comments (12)

zxyyxzz avatar zxyyxzz commented on July 17, 2024

Dear FabianIsensee:
1. I have looked at the code of brats2017_dataloader_3D.py, but I saw what appear to be two identical transforms, GammaTransform and GaussianNoiseTransform. Why do you do that?
2. You use the spatial transformations with a probability of 0.2 per sample (no augmentation with a probability of 0.8 per sample) and the other transformations with a probability of 0.15 per sample (no augmentation with a probability of 0.85 per sample). With 6 transformations in total, this means that 1 - (0.8 x 0.85 x 0.85 x 0.85 x 0.85 x 0.85) = 74% of samples will be augmented. Is this too much for these samples?

Hope you can reply :) Thanks!

from batchgenerators.

FabianIsensee avatar FabianIsensee commented on July 17, 2024

Hi,

  1. I have looked at the code of brats2017_dataloader_3D.py, but I saw what appear to be two identical transforms, GammaTransform and GaussianNoiseTransform. Why do you do that?

There is just one GaussianNoiseTransform. The two GammaTransforms are there because one works on inverted images (invert_image=True) and the other does not. They are not the same.

  2. You use the spatial transformations with a probability of 0.2 per sample (no augmentation with a probability of 0.8 per sample) and the other transformations with a probability of 0.15 per sample (no augmentation with a probability of 0.85 per sample). With 6 transformations in total, this means that 1 - (0.8 x 0.85 x 0.85 x 0.85 x 0.85 x 0.85) = 74% of samples will be augmented. Is this too much for these samples?

I do not claim that this is ideal for BraTS; it is just an example of what one could do. Your computation is a little off, though: it should be 0.9 x 0.9 x 0.9 x 0.85 x 0.85 x 0.85 x 0.85 x 0.85 (it's 0.1 for each of the three spatial transforms: elastic deformation, scaling, rotation). That's 32% not augmented and 68% augmented. I think that ratio is fine, but ultimately you need to try it out.
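These numbers can be checked directly; a quick sketch, with the per-transform probabilities taken from the reply above:

```python
# Probability that a sample receives no augmentation at all: the three
# spatial transforms (elastic deformation, scaling, rotation) each skip a
# sample with probability 0.9, the five remaining transforms with 0.85.
p_no_aug = 0.9 ** 3 * 0.85 ** 5
p_aug = 1 - p_no_aug

print(f"not augmented: {p_no_aug:.0%}, augmented: {p_aug:.0%}")
# -> not augmented: 32%, augmented: 68%
```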

Best,
Fabian


zxyyxzz avatar zxyyxzz commented on July 17, 2024

Thanks for your reply :) I will give it a try. Thanks!


zxyyxzz avatar zxyyxzz commented on July 17, 2024

There is a new point of confusion for me. I have run the dataloader of the code with:
1. MultiThreadedAugmenter with num_threads_for_brats_example = 8
2. num_batches_per_epoch = 10, num_validation_batches_per_epoch = 3, num_epochs = 5

But the dataloader is very slow, and reports:
"Running 5 epochs took a total of 218.83 seconds with time per epoch being [71.07379913330078, 28.94435453414917, 37.057417154312134, 33.44325494766235, 48.315285205841064]"
That is 218 / (5 x 10 + 5 x 3) ≈ 3.4 seconds per batch (2 examples).
3. Am I running it in an incorrect way? I think maybe the transformations are slow. If the on-the-fly speed is not fast, why not do the augmentation before the dataloader? That should be faster than augmenting on the fly during training.

4. I also see that the patch_size parameter of dataloader_validation is 128x128x128. Why do you input 128x128x128 patches rather than a full image? Fully convolutional networks can accept inputs of arbitrary size, right?

5. I don't understand the meaning of the metadata. Does the BraTS data need this operation to get the metadata?

Haha, I am not experienced with the BraTS project; maybe my questions are very easy and stupid.
So thanks again, and I hope you can reply :)

Best
Xinyu


FabianIsensee avatar FabianIsensee commented on July 17, 2024

Hi Xinyu,
what CPU are you running this on? I got the following result:

Running 5 epochs took a total of 85.88 seconds with time per epoch being [34.329609632492065, 7.970992565155029, 8.351309537887573, 12.86898946762085, 22.35595417022705]

That's about twice as fast (if not more) than what you posted. My CPU is a fairly old and slow E5-2640 v3 @ 2.60 GHz (8 cores, 16 threads). You should have your data stored on an SSD, not an HDD.

This patch size is what I ran in all my BraTS experiments. Typically you run this with a large 3D network, and even on a new GPU, one forward/backward pass plus parameter update takes longer than the batches take to prepare.

  3. Am I running it in an incorrect way? I think maybe the transformations are slow. If the on-the-fly speed is not fast, why not do the augmentation before the dataloader? That should be faster than augmenting on the fly during training.

In my experiments I never have a CPU bottleneck, so there is no reason for me to do this. Also, running augmentations on the fly is simply the superior way of doing this, as you get an effectively infinite variety of augmentations.

  4. I also see that the patch_size parameter of dataloader_validation is 128x128x128. Why do you input 128x128x128 patches rather than a full image? Fully convolutional networks can accept inputs of arbitrary size, right?

Processing larger patches may temporarily make the memory consumption spike higher; I have not tested that. But that doesn't really matter, to be honest: I always do the validation during training like this, and only at the very end do I run a proper prediction of the validation set with entire images.

Also you would never be able to put an entire CT scan (500x500x500) into GPU memory. The way I do it here is more generic and still gives a good estimate for the validation loss during training (to check for overfitting).

  5. I don't understand the meaning of the metadata. Does the BraTS data need this operation to get the metadata?

This is just so that when we convert a predicted segmentation back to nifti we can use it to restore the original geometry of the data. The geometry information is lost when converting a nifti/mha image to numpy array.
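As a rough illustration of what such metadata carries (a minimal sketch with hypothetical helper names and made-up placeholder values; the actual example code uses the fields provided by the nifti reader):

```python
# Minimal sketch: keep the geometry fields that are lost when a nifti/mha
# image is converted to a plain array, so a predicted segmentation can be
# written back with the original spacing, origin and direction.
# All field values below are made-up placeholders.

def extract_metadata(spacing, origin, direction):
    """Collect the geometry of the source image."""
    return {"spacing": spacing, "origin": origin, "direction": direction}

def attach_geometry(segmentation, metadata):
    """Pair a predicted array with the stored geometry for export."""
    return {"data": segmentation, **metadata}

meta = extract_metadata(spacing=(1.0, 1.0, 1.0),
                        origin=(-90.0, -126.0, -72.0),
                        direction=(1, 0, 0, 0, 1, 0, 0, 0, 1))
restored = attach_geometry([[0, 1], [1, 0]], meta)
# restored carries both the voxel data and the original geometry
```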

Best,
Fabian


zxyyxzz avatar zxyyxzz commented on July 17, 2024

Hi FabianIsensee:
Thanks for your reply :) I know you are one of the best in the BraTS competition, so I admire you very much. I am a graduate student at Xiangtan University, China.
1. So you do a prediction of the validation set with entire images at the end? Thanks for your reply.
2. About the metadata: you have said "in BraTS all cases have already been resampled to 1x1x1mm by the organizers", so this operation is not necessary for BraTS data, right?

Finally, I am not strong in coding, so I would especially like to refer to your code. I have been looking for your code for ten days, but I did not find your 2017 and 2018 code. Could you send me the 2018 code? My email address is [email protected].

Thanks for your reply, and I hope you can help me :)

Best
Xinyu


FabianIsensee avatar FabianIsensee commented on July 17, 2024

Hi Xinyu,
unfortunately I cannot share my BraTS2018 code, as it is not in a state where anybody else could actually use it. I hope you understand.

  2. About the metadata: you have said "in BraTS all cases have already been resampled to 1x1x1mm by the organizers", so this operation is not necessary for BraTS data, right?

BraTS is resampled, but the geometry is more than just the voxel spacing! There is an offset and a rotation involved as well. Those need to be stored.

Best,
Fabian


zxyyxzz avatar zxyyxzz commented on July 17, 2024

Hi FabianIsensee:
Thanks for your reply; it doesn't matter about the code, I understand :)
1. The last confusing question I want to ask: when you test on the test data, 155x240x240 is not divisible by 16, which means the U-Net input size has to be a multiple of 16. So how do you input an entire image? Do you crop the image to a multiple of 16 for testing?
2. Apparently the BraTS18 challenge is finished, so I want to use the whole training dataset for training. Do you think that is OK? That is, is using the whole dataset better than 5-fold cross-validation? Since there is a known validation set, there would be more data for training.

Thanks for your reply :)

Best
Xinyu


FabianIsensee avatar FabianIsensee commented on July 17, 2024

Hi Xinyu,

  1. The last confusing question I want to ask: when you test on the test data, 155x240x240 is not divisible by 16, which means the U-Net input size has to be a multiple of 16. So how do you input an entire image? Do you crop the image to a multiple of 16 for testing?

Pad the image with zeros so that its shape is divisible by 16.
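A minimal sketch of that padding computation (the helper name is made up; the 155x240x240 shape and the factor 16 come from the discussion above):

```python
# Compute how much zero padding brings each axis up to the next multiple
# of 16, the downsampling factor of the U-Net discussed above.

def pad_to_multiple(shape, multiple=16):
    """Return the padded shape and the per-axis pad amounts."""
    pads = [(multiple - s % multiple) % multiple for s in shape]
    padded = [s + p for s, p in zip(shape, pads)]
    return padded, pads

padded, pads = pad_to_multiple((155, 240, 240))
# -> padded == [160, 240, 240], pads == [5, 0, 0]
```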

  2. Apparently the BraTS18 challenge is finished, so I want to use the whole training dataset for training. Do you think that is OK? That is, is using the whole dataset better than 5-fold cross-validation? Since there is a known validation set, there would be more data for training.

The labels for the validation set are not public, so you cannot train on them. With the training data you can do whatever you want; I prefer to run a cross-validation because otherwise I would not get a good performance estimate.

Best,
Fabian


zxyyxzz avatar zxyyxzz commented on July 17, 2024

Hi Fabian:
1. When testing, if you pad the entire image with zeros, it becomes larger than 155x240x240, which wastes computation. Do you think masking out the brain area from the raw image and then padding with zeros would be more efficient? Which method did you use?
2. But you can evaluate the validation set on the online platform, and then repeat the evaluation cycle until you get a good result, right?

Best
Xinyu


FabianIsensee avatar FabianIsensee commented on July 17, 2024

  1. When testing, if you pad the entire image with zeros, it becomes larger than 155x240x240, which wastes computation. Do you think masking out the brain area from the raw image and then padding with zeros would be more efficient? Which method did you use?

I always crop. See the BraTS2017 preprocessing example.
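The idea behind cropping can be sketched as follows (a simplified pure-Python version under assumed conventions; the actual preprocessing example operates on numpy arrays):

```python
# Crop a volume to the bounding box of its nonzero (brain) voxels, so no
# computation is wasted on empty background. vol is nested lists [z][y][x].

def nonzero_bbox_1d(flags):
    """First and last index (inclusive) where flags is truthy."""
    idx = [i for i, f in enumerate(flags) if f]
    return (idx[0], idx[-1]) if idx else (0, -1)

def crop_to_nonzero(vol):
    z_n, y_n, x_n = len(vol), len(vol[0]), len(vol[0][0])
    z0, z1 = nonzero_bbox_1d([any(vol[z][y][x] for y in range(y_n) for x in range(x_n)) for z in range(z_n)])
    y0, y1 = nonzero_bbox_1d([any(vol[z][y][x] for z in range(z_n) for x in range(x_n)) for y in range(y_n)])
    x0, x1 = nonzero_bbox_1d([any(vol[z][y][x] for z in range(z_n) for y in range(y_n)) for x in range(x_n)])
    return [[row[x0:x1 + 1] for row in sl[y0:y1 + 1]] for sl in vol[z0:z1 + 1]]

vol = [[[0] * 4 for _ in range(4)] for _ in range(4)]
vol[1][1][2] = 1
vol[2][2][1] = 1
cropped = crop_to_nonzero(vol)
# -> a 2x2x2 sub-volume spanning indices 1..2 along each axis
```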

  2. But you can evaluate the validation set on the online platform, and then repeat the evaluation cycle until you get a good result, right?

I prefer cross-validation because I don't want to overfit to the validation data but others may do it differently. There is no rule about what you should do - you do what you think is best :-)

Best,
Fabian


zxyyxzz avatar zxyyxzz commented on July 17, 2024

Hi Fabian:
Thanks for your reply!! :) I will try it :)

Best
Xinyu

