
pytorch_tiramisu's Introduction

One Hundred Layers Tiramisu

PyTorch implementation of The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation.

Tiramisu combines DenseNet and U-Net for high-performance semantic segmentation. In this repository, we attempt to replicate the authors' results on the CamVid dataset.

Setup

Requires an Anaconda installation for Python 3.

conda create --name tiramisu python=3.6
source activate tiramisu
conda install pytorch torchvision -c pytorch

The train.ipynb notebook shows a basic train/test workflow.

Dataset

Download

Specs

  • Training: 367 frames
  • Validation: 101 frames
  • TestSet: 233 frames
  • Dimensions: 360x480
  • Classes: 11 (+1 background)

Architecture

Tiramisu adopts the U-Net design, with downsampling, bottleneck, and upsampling paths joined by skip connections. It replaces the convolution and max pooling layers with dense blocks from the DenseNet architecture. Dense blocks contain residual connections like those in ResNet, except that they concatenate, rather than sum, prior feature maps.
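
To make the "concatenate rather than sum" point concrete, here is a minimal sketch of how channel counts evolve inside a dense block. The function name and the example numbers (48 input channels, growth rate 16, 4 layers) are illustrative assumptions, not values read from this repository's code.

```python
def dense_block_channels(in_channels, growth_rate, n_layers):
    """Return (per-layer input channel counts, concatenated output channels).

    Assumes each layer emits `growth_rate` feature maps that are
    concatenated onto everything the block has seen so far.
    """
    layer_inputs = []
    channels = in_channels
    for _ in range(n_layers):
        layer_inputs.append(channels)
        channels += growth_rate  # concatenation grows the channel count
    return layer_inputs, channels


layer_inputs, out_channels = dense_block_channels(48, 16, 4)
print(layer_inputs, out_channels)  # [48, 64, 80, 96] 112
```

With a ResNet-style sum the channel count would stay fixed from layer to layer, whereas here every layer sees a wider input than the one before it.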

Layers

FCDenseNet103

Authors' Results

Authors Results on CamVid


Our Results

FCDenseNet67

We trained for 670 epochs (224x224 crops), followed by 100 epochs of fine-tuning (full-size images). The authors report a "global accuracy" of 90.8 for FC-DenseNet67 on CamVid, compared to our 86.8. If we exclude the 'background' class, our accuracy increases to ~89%. We think the authors did this, but haven't confirmed.

Dataset     Loss   Accuracy
Validation  .209   92.5
Testset     .435   86.8
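
The effect of excluding the 'background' class from accuracy can be sketched in a few lines. This is a plain-Python illustration, not the evaluation code used for the numbers above; the function name, the flat-list inputs, and the choice of index 11 for the background/void label are all assumptions.

```python
def pixel_accuracy(preds, targets, ignore_index=None):
    """Percentage of correctly labelled pixels, optionally skipping one class."""
    correct = 0
    counted = 0
    for pred, target in zip(preds, targets):
        if ignore_index is not None and target == ignore_index:
            continue  # pixel is labelled void/background: leave it out
        counted += 1
        correct += int(pred == target)
    if counted == 0:
        return 0.0
    return 100.0 * correct / counted


# Toy example; class 11 plays the role of the void label (assumption).
preds = [0, 1, 2, 2, 3, 0]
targets = [0, 1, 2, 11, 11, 1]
print(pixel_accuracy(preds, targets))                   # 50.0
print(pixel_accuracy(preds, targets, ignore_index=11))  # 75.0
```

Because predictions on void pixels can never be counted as correct, dropping them from the denominator raises the reported accuracy.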

Our Results on CamVid

FCDenseNet103

We trained for 874 epochs, followed by 50 epochs of fine-tuning.

Dataset     Loss   Accuracy
Validation  .178   92.8
Testset     .441   86.6

Our Results on CamVid

Predictions

Our Results on CamVid

Training

Hyperparameters

  • WeightInitialization = HeUniform
  • Optimizer = RMSProp
  • LR = .001 with exponential decay of 0.995 after each epoch
  • Data Augmentation = Random Crops, Vertical Flips
  • ValidationSet with early stopping based on IoU or MeanAccuracy with patience of 100 (50 during finetuning)
  • WeightDecay = .0001
  • Finetune with full-size images, LR = .0001
  • Dropout = 0.2
  • BatchNorm "we use current batch stats at training, validation, and test time"
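
The learning-rate schedule and the early-stopping rule in the list above can be sketched as follows. This is a framework-free illustration under stated assumptions: the names `learning_rate` and `EarlyStopping` are hypothetical, and the monitored score is assumed to be "higher is better" (such as IoU or mean accuracy).

```python
def learning_rate(epoch, base_lr=1e-3, decay=0.995):
    """Learning rate after `epoch` completed epochs of exponential decay."""
    return base_lr * decay ** epoch


class EarlyStopping:
    """Stop once the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=100):
        self.patience = patience
        self.best = None
        self.bad_epochs = 0

    def step(self, score):
        """Record a validation score (higher is better); return True to stop."""
        if self.best is None or score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


print(learning_rate(0))  # 0.001
```

With a decay of 0.995 per epoch, the rate falls to roughly 0.0006 after 100 epochs, so the schedule is gentle compared with the length of the training runs reported here.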

References and Links

pytorch_tiramisu's People

Contributors

bfortuner, jph00, mattkleinsmith


pytorch_tiramisu's Issues

Performance cannot approach as mention

Hi @bfortuner, I'm interested in your pytorch_tiramisu.

I used train.ipynb as the basis for training the model. However, I cannot reach the performance you report.

My FC-DenseNet67 result converges at 0.8357994871794872.

I would appreciate your help with the following questions:

  1. The original code doesn't have a weighted loss function. The weighted loss function does improve
    IoU, but it hurts pixel accuracy. How did you get those weights?

  2. I saw a past issue mention that train.ipynb is slightly different from the original training
    code. Did you add a lot to it? Could you send me the original code for
    comparison?

Here are my modifications to the code in train.ipynb:

  1. N_EPOCHS -> 1000

  2. To use multiple GPUs, I added this line:

[image]

  3. To implement "exclude the 'background' class", I added these lines. I tried to set 11 classes and
      ignore the loss on the void label:

[image]

  4. I trained it directly, without fine-tuning.

[image]

Please help me figure out these questions.🙏🙏🙏🙏🙇🙇🙇

pre-trained weights

Hello @bfortuner ,

Is it possible to share the weights you obtained from your training? Everyone would be infinitely grateful :)

Thank you

Is there any details for how to train a model?

Hi @bfortuner,
Are there any details on how to train a model?
I tried, but I get the following error:

python train_densenet_pytorch.py
Traceback (most recent call last):
  File "train_densenet_pytorch.py", line 25, in <module>
    import utils.training as train_utils
  File "/pytorch_tiramisu-master/utils/training.py", line 47, in <module>
    def train(epoch, net, trainLoader, optimizer, trainF, sessionName=get_rand_str(5)):
  File "/pytorch_tiramisu-master/utils/training.py", line 21, in get_rand_str
    return ''.join(random.choices(string.ascii_uppercase + string.digits, k=n))
AttributeError: 'module' object has no attribute 'choices'

Hope for your help.
Thanks a ton!
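
For context on the traceback above: `random.choices` was added in Python 3.6, so the call fails on older interpreters. A minimal sketch of an equivalent helper that avoids it is shown below. It reuses the name `get_rand_str` from the traceback, but it is a hedged stand-in rather than the repository's own fix; running the code under Python 3.6 or newer, as in the Setup section, is the other option.

```python
import random
import string


def get_rand_str(n):
    """Return n random uppercase letters/digits without using random.choices."""
    alphabet = string.ascii_uppercase + string.digits
    # random.choice (singular) exists in every Python version
    return ''.join(random.choice(alphabet) for _ in range(n))


print(len(get_rand_str(5)))  # 5
```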

The Loss decrease to 0 quickly when it trains during Epoch2 on my dataset

Hi, I have what may be a silly question.
I use this repo to train on my own dataset, which is a 2-class semantic segmentation dataset.
Everything is fine during epoch 1 (I used Epoch1.pth to predict on images, and the results are OK; the segmentation performs well in some regions).
Things get strange in epoch 2: the loss drops quickly to 0 within about 100 steps.
I'm confused by this. Could it be because my dataset has too few classes? How can I fix it?
Thank you in advance.

Train on the (512,512,3) dataset

First of all, congratulations on the good work. I got a size error with my dataset, whose images are (512,512,3). How do I fix this?
What changes do I have to make?

Need help for test.

Hello!
I'm new to neural networks and PyTorch!
When I run this code on my own dataset, I am confused about test() in train.py:

  1. I use model.load_state_dict(torch.load('weights-100-0.018-0.065.pth')) to load the parameters before calling train_utils.test(model, test_loader, criterion, epoch=1),
    but I get this error:
    KeyError: 'unexpected key "startEpoch" in state_dict'
  2. How do I use latest.th?
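
The KeyError above suggests that the saved file is a checkpoint dictionary holding metadata such as "startEpoch" alongside the weights, rather than a bare state dict. A minimal sketch of unwrapping such a checkpoint follows. The helper name `extract_state_dict` and the key name "state_dict" are assumptions; only "startEpoch" appears in the error message, so check the actual keys of the loaded object first.

```python
def extract_state_dict(checkpoint, key="state_dict"):
    """Return the model weights from a checkpoint that may wrap them in metadata.

    `key` is an assumed name for the entry holding the weights.
    """
    if isinstance(checkpoint, dict) and key in checkpoint:
        return checkpoint[key]
    return checkpoint  # already a bare state dict


# Stand-in for the object returned by torch.load(...) (hypothetical contents).
checkpoint = {"startEpoch": 100, "state_dict": {"conv.weight": [0.1, 0.2]}}
print(sorted(checkpoint.keys()))        # ['startEpoch', 'state_dict']
print(extract_state_dict(checkpoint))   # {'conv.weight': [0.1, 0.2]}

# With PyTorch available, the intended use would be along these lines:
#   model.load_state_dict(extract_state_dict(torch.load(path)))
```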

Thank you for your help!

How do I delete empty classes???

Hello, first of all, thank you very much for your code; I have benefited a lot from it.
Second, I want to know how to modify the code to remove empty classes from the dataset. I made a slight change, but it always reports this error:
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED.
Reverting the change makes it work again, so I suspect this isn't a CUDA/cuDNN issue, and I'm eager to find the cause.
Looking forward to your reply!

The number of parameters

Hi, in the original paper, FC-DenseNet56 has about 1.5 M parameters.
But your code requires about 4000 MB of GPU memory for a forward pass.

I wonder why there is such a huge difference between these.
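
Parameter count and GPU memory use measure different things, which a little arithmetic can illustrate. The sketch below assumes 32-bit floats and uses a single 48-channel feature map at the 360x480 CamVid resolution as an example; both choices are illustrative assumptions, and the sketch does not attempt to account for the 4000 MB figure.

```python
BYTES_PER_FLOAT32 = 4


def parameter_bytes(n_params):
    """Memory taken by the weights alone."""
    return n_params * BYTES_PER_FLOAT32


def activation_bytes(batch, channels, height, width):
    """Memory taken by one intermediate feature-map tensor."""
    return batch * channels * height * width * BYTES_PER_FLOAT32


print(parameter_bytes(1500000) / 1e6)           # 6.0  (MB for 1.5 M weights)
print(activation_bytes(1, 48, 360, 480) / 1e6)  # 33.1776  (MB for one feature map)
```

A single full-resolution feature map already outweighs all the weights, and a forward pass keeps many such tensors alive, so memory use is driven mostly by activations rather than by the parameter count.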

Training always starts with accuracy of 1.000

Thank you so much @bfortuner for the implementation of this model in PyTorch. When I try to train the network on the CamVid dataset, my training always starts with a Train Acc and Val Acc of 1.000. I have set the hyperparameters as listed in the Hyperparameters section.

In other words, the training error and validation error are always 0. I'm wondering if I am missing something?

"Out of memory" on 6GB GTX1060

I tried batch_size=1, but no difference. I haven't attempted to calculate the actual memory requirement, but would you expect this to run on a 6GB GPU?

Can this be used for semantic segmentation?

Although this repository claims to implement a paper about semantic segmentation, the code in this repository only does image classification based on the CIFAR dataset. Can this code actually be used for pixel-level semantic segmentation?

Take portion of tiramisu

Hi;

After training a Tiramisu, I would like to use only the layers up to the bottleneck layer for feature extraction. Is there a good, recommended way to do this?

Trained network

Hi, can you share the weights of the network after training?
They would be useful for running simple inference or plugging the trained network into other systems.
I saw that in the original project there are the trained weights, but I think it's hard to convert them from Theano to PyTorch.

A tiny mistake in tiramisu-pytorch.ipynb

In the update_viz_summary_plot function:

`
def update_viz_summary_plot(self):
    trn_loss = self.loss_history['train'][-1]
    val_loss = self.loss_history['val'][-1]
    trn_err = self.error_history['train'][-1]
    val_err = self.error_history['val'][-1]
    txt = ("""Epoch: %d
        Train - Loss: %.3f Err: %.3f
        Test - Loss: %.3f Err: %.3f""" % (self.epoch,
        trn_loss, trn_err, tst_loss, tst_err))
    window = self.visdom_plots['summary']
    return viz.text(
        txt,
        win=window,
        env=self.name
    )
`
there are no tst_loss and tst_err variables.

So I think they should be replaced by val_loss and val_err respectively, right?

fine tuning

Did you train the whole network (all weights) during fine-tuning, or just some selected layers?

Results?

What kind of results do you get? It would be really cool if you could post some learning curves, or even just the final mean IoU and accuracy on the camvid dataset for reference? :-)

Found a waste line in models/tiramisu.py

Line 60 of tiramisu.py, cur_channels_count += prev_block_channels,
actually does nothing, since cur_channels_count is overwritten both before and after it. Please consider removing it, since it is confusing and misleading.

        #######################
        #   Upsampling path   #
        #######################

        self.transUpBlocks = nn.ModuleList([])
        self.denseBlocksUp = nn.ModuleList([])
        for i in range(len(up_blocks)-1):
            self.transUpBlocks.append(TransitionUp(prev_block_channels, prev_block_channels))
            cur_channels_count = prev_block_channels + skip_connection_channel_counts[i]

            self.denseBlocksUp.append(DenseBlock(
                cur_channels_count, growth_rate, up_blocks[i],
                upsample=True))
            prev_block_channels = growth_rate*up_blocks[i]
            cur_channels_count += prev_block_channels
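
The claim can be checked by replaying only the channel arithmetic of the quoted loop, with and without the final increment. In this sketch the function name and the example numbers are illustrative assumptions, not values read from the repository; whether any code after the loop reads the incremented value is not visible in the quoted snippet, so the check covers the loop alone.

```python
def dense_block_input_channels(prev_block_channels, skip_counts, up_blocks,
                               growth_rate, keep_increment):
    """Replay the loop's channel arithmetic and collect each DenseBlock's input size."""
    inputs = []
    for i in range(len(up_blocks) - 1):
        cur_channels_count = prev_block_channels + skip_counts[i]
        inputs.append(cur_channels_count)  # value handed to DenseBlock
        prev_block_channels = growth_rate * up_blocks[i]
        if keep_increment:
            # The line in question: its result is overwritten at the top of
            # the next iteration before anything reads it.
            cur_channels_count += prev_block_channels
    return inputs


# Illustrative values (assumptions).
skip_counts = [656, 464, 304, 192, 112]
up_blocks = (12, 10, 7, 5, 4)

with_line = dense_block_input_channels(240, skip_counts, up_blocks, 16, True)
without_line = dense_block_input_channels(240, skip_counts, up_blocks, 16, False)
print(with_line)                  # [896, 656, 464, 304]
print(with_line == without_line)  # True
```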

Need help with adapting to different dataset

Hi @bfortuner ,

First of all thanks for the excellent implementation of the FCDenseNets.

I am trying to use your Tiramisu implementation for a different dataset and could really use your help. In particular, I need insight into how this works:

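import numpy as np
import torch

class LabelToLongTensor(object):
    def __call__(self, pic):
        if isinstance(pic, np.ndarray):
            # NumPy array: convert directly to a LongTensor
            label = torch.from_numpy(pic).long()
        else:
            # PIL image: read the raw bytes, then reshape to (H, W, 1)
            label = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
            label = label.view(pic.size[1], pic.size[0], 1)
            # Permute to (1, H, W), squeeze out the singleton axis, cast to long
            label = label.transpose(0, 1).transpose(0, 2).squeeze().contiguous().long()
        return label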

This produces a 1x224x224 label tensor from a label image of size 224x224x3. I have been unable to adapt this to my dataset. I have 7 classes, and each label image is 224x224x3. Should my label tensor be 1x224x224 with each value in 0-6 or in 1-7? If I am correct, nll_loss2d expects the network output to be 7x224x224.
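
On the indexing question: PyTorch's NLL and cross-entropy losses expect integer targets in the range 0 to C-1, so with 7 classes the label values would be 0-6. For colour-coded label images, one possible approach is to map each RGB colour to a class index before building the tensor. The sketch below uses plain Python lists; the palette colours and the function name are hypothetical and would need to match the actual dataset.

```python
# Hypothetical palette: 7 RGB colours mapped to class indices 0..6.
PALETTE = {
    (0, 0, 0): 0,
    (128, 0, 0): 1,
    (0, 128, 0): 2,
    (128, 128, 0): 3,
    (0, 0, 128): 4,
    (128, 0, 128): 5,
    (0, 128, 128): 6,
}


def rgb_label_to_indices(rgb_rows, palette=PALETTE):
    """Convert an HxW grid of (R, G, B) pixels into an HxW grid of class indices."""
    return [[palette[tuple(pixel)] for pixel in row] for row in rgb_rows]


# A 2x2 toy label image.
label_image = [
    [(0, 0, 0), (128, 0, 0)],
    [(0, 128, 128), (0, 0, 0)],
]
print(rgb_label_to_indices(label_image))  # [[0, 1], [6, 0]]
```

The resulting HxW grid of indices can then be converted to a LongTensor, giving one class index per pixel rather than three colour channels.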

DenseNet is the concatenation operation

In your code, you implement DenseNet with the sum operation, which is the operation used in the ResNet paper. DenseNet uses the concatenation operation instead. This is probably why your code has more parameters than FC-DenseNet itself.
