deeplabv3

PyTorch implementation of DeepLabV3, trained on the Cityscapes dataset.

demo video with results

Paperspace:

To train models and to run pretrained models (with small batch sizes), you can use an Ubuntu 16.04 P4000 VM with 250 GB SSD on Paperspace. Below I have listed what I needed to do in order to get started, and some things I found useful.

Save the following script as start_docker_image.sh (the name used in the commands further down):

#!/bin/bash

# DEFAULT VALUES
GPUIDS="0"
NAME="paperspace_GPU"


NV_GPU="$GPUIDS" nvidia-docker run -it --rm \
        -p 5584:5584 \
        --name "$NAME""$GPUIDS" \
        -v /home/paperspace:/root/ \
        pytorch/pytorch:0.4_cuda9_cudnn7 bash
  • Inside the image, /root/ will now be mapped to /home/paperspace (i.e., $ cd -- takes you to the regular home folder).

  • To start the image:

    • $ sudo sh start_docker_image.sh
  • To commit changes to the image:

    • Open a new terminal window.
    • $ sudo docker commit paperspace_GPU0 pytorch/pytorch:0.4_cuda9_cudnn7
  • To stop the image when it’s running:

    • $ sudo docker stop paperspace_GPU0
  • To exit the image without killing running code:

    • Ctrl + P, then Ctrl + Q
  • To get back into a running image:

    • $ sudo docker attach paperspace_GPU0
  • To open more than one terminal window at the same time:

    • $ sudo docker exec -it paperspace_GPU0 bash
  • To install the needed software inside the docker image:

    • $ apt-get update
    • $ apt-get install nano
    • $ apt-get install sudo
    • $ apt-get install wget
    • $ sudo apt install unzip
    • $ sudo apt-get install libopencv-dev
    • $ pip install opencv-python
    • $ python -m pip install matplotlib
    • Commit changes to the image (otherwise, the installed packages will be removed at exit!)
  • Do the following outside of the docker image:

    • $ cd --
    • Download the Cityscapes dataset:
      • $ wget --keep-session-cookies --save-cookies=cookies.txt --post-data 'username=XXXXX&password=YYYYY&submit=Login' https://www.cityscapes-dataset.com/login/ (where you replace XXXXX with your username, and YYYYY with your password)
      • $ unzip gtFine_trainvaltest.zip
      • $ unzip leftImg8bit_trainvaltest.zip
      • $ mkdir deeplabv3/data
      • $ mkdir deeplabv3/data/cityscapes
      • $ mv gtFine deeplabv3/data/cityscapes
      • $ mv leftImg8bit deeplabv3/data/cityscapes
      • $ unzip leftImg8bit_demoVideo.zip
      • $ mv leftImg8bit/demoVideo deeplabv3/data/cityscapes/leftImg8bit
      • $ unzip thn.zip?dl=0
      • $ mv thn deeplabv3/data
      • $ cd deeplabv3
      • Comment out the line print type(obj).__name__ on line 238 in deeplabv3/cityscapesScripts/cityscapesscripts/helpers/annotation.py (this is needed for the cityscapes scripts to be runnable with Python3)
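
After the unzip and mv steps above, the data should sit under deeplabv3/data in a fixed layout. The sketch below only checks that the expected folders exist. The helper name `missing_data_dirs` is hypothetical and not part of the repository, and the folder list is inferred from the mv commands in the steps above.

```python
import os
import tempfile

# Folders implied by the mv commands above, relative to the deeplabv3 checkout.
EXPECTED_DIRS = [
    "data/cityscapes/gtFine",
    "data/cityscapes/leftImg8bit",
    "data/cityscapes/leftImg8bit/demoVideo",
    "data/thn",
]


def missing_data_dirs(repo_root):
    """Return the expected data folders that do not exist under repo_root."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(repo_root, d))]


# Demo on a throwaway directory in which only gtFine has been created.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "data/cityscapes/gtFine"))
    print(missing_data_dirs(root))  # lists the three folders not created here
```

Run from the home folder, `missing_data_dirs("deeplabv3")` should return an empty list once all of the steps above have been completed.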



Pretrained model:




Train model on Cityscapes:

  • SSH into the paperspace server.
  • $ sudo sh start_docker_image.sh
  • $ cd --
  • $ python deeplabv3/utils/preprocess_data.py (ONLY NEED TO DO THIS ONCE!)
  • $ python deeplabv3/train.py



Evaluation

evaluation/eval_on_val.py:

  • SSH into the paperspace server.

  • $ sudo sh start_docker_image.sh

  • $ cd --

  • $ python deeplabv3/utils/preprocess_data.py (ONLY NEED TO DO THIS ONCE!)

  • $ python deeplabv3/evaluation/eval_on_val.py

    • This will run the pretrained model (set on line 31 in eval_on_val.py) on all images in Cityscapes val, compute and print the loss, and save the predicted segmentation images in deeplabv3/training_logs/model_eval_val.

evaluation/eval_on_val_for_metrics.py:

  • SSH into the paperspace server.

  • $ sudo sh start_docker_image.sh

  • $ cd --

  • $ python deeplabv3/utils/preprocess_data.py (ONLY NEED TO DO THIS ONCE!)

  • $ python deeplabv3/evaluation/eval_on_val_for_metrics.py

  • $ cd deeplabv3/cityscapesScripts

  • $ pip install . (ONLY NEED TO DO THIS ONCE!)

  • $ python setup.py build_ext --inplace (ONLY NEED TO DO THIS ONCE!) (this enables cython, which makes the cityscapes evaluation script run A LOT faster)

  • $ export CITYSCAPES_RESULTS="/root/deeplabv3/training_logs/model_eval_val_for_metrics"

  • $ export CITYSCAPES_DATASET="/root/deeplabv3/data/cityscapes"

  • $ python cityscapesscripts/evaluation/evalPixelLevelSemanticLabeling.py

    • This will run the pretrained model (set on line 55 in eval_on_val_for_metrics.py) on all images in Cityscapes val, upsample the predicted segmentation images to the original Cityscapes image size (1024, 2048), and compute and print performance metrics:
classes          IoU      nIoU
--------------------------------
road          : 0.918      nan
sidewalk      : 0.715      nan
building      : 0.837      nan
wall          : 0.413      nan
fence         : 0.397      nan
pole          : 0.404      nan
traffic light : 0.411      nan
traffic sign  : 0.577      nan
vegetation    : 0.857      nan
terrain       : 0.489      nan
sky           : 0.850      nan
person        : 0.637    0.491
rider         : 0.456    0.262
car           : 0.897    0.759
truck         : 0.582    0.277
bus           : 0.616    0.411
train         : 0.310    0.133
motorcycle    : 0.322    0.170
bicycle       : 0.583    0.413
--------------------------------
Score Average : 0.593    0.364
--------------------------------


categories       IoU      nIoU
--------------------------------
flat          : 0.932      nan
construction  : 0.846      nan
object        : 0.478      nan
nature        : 0.869      nan
sky           : 0.850      nan
human         : 0.658    0.521
vehicle       : 0.871    0.744
--------------------------------
Score Average : 0.786    0.632
--------------------------------
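
The IoU column follows the usual intersection-over-union definition, TP / (TP + FP + FN), accumulated per class over all evaluated pixels, and each "Score Average" row is the plain mean of the rows above it. The sketch below illustrates both on toy data. The function name, the flat label lists and the ignore value of 255 are assumptions for illustration; this is not the code of the Cityscapes evaluation script.

```python
def per_class_iou(pred, target, num_classes, ignore=255):
    """Return IoU = TP / (TP + FP + FN) per class, or None if a class never occurs."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore:  # pixels without a valid ground-truth label are skipped
            continue
        if p == t:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p where it is not present
            fn[t] += 1  # missed the true class t
    ious = []
    for k in range(num_classes):
        denom = tp[k] + fp[k] + fn[k]
        ious.append(tp[k] / denom if denom else None)
    return ious


# Toy example: 3 classes, last pixel has no valid label.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 0]
print(per_class_iou(pred, target, num_classes=3))  # IoUs: 1/2, 2/3, 1/1

# The 19 per-class IoU values from the table above, road through bicycle.
class_ious = [
    0.918, 0.715, 0.837, 0.413, 0.397, 0.404, 0.411, 0.577, 0.857, 0.489,
    0.850, 0.637, 0.456, 0.897, 0.582, 0.616, 0.310, 0.322, 0.583,
]
print(round(sum(class_ious) / len(class_ious), 3))  # 0.593, the "Score Average" row
```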



Visualization

visualization/run_on_seq.py:

  • SSH into the paperspace server.

  • $ sudo sh start_docker_image.sh

  • $ cd --

  • $ python deeplabv3/utils/preprocess_data.py (ONLY NEED TO DO THIS ONCE!)

  • $ python deeplabv3/visualization/run_on_seq.py

    • This will run the pretrained model (set on line 33 in run_on_seq.py) on all images in the Cityscapes demo sequences (stuttgart_00, stuttgart_01 and stuttgart_02) and create a visualization video for each sequence, which is saved to deeplabv3/training_logs/model_eval_seq. See the YouTube video at the top of the page.

visualization/run_on_thn_seq.py:

  • SSH into the paperspace server.

  • $ sudo sh start_docker_image.sh

  • $ cd --

  • $ python deeplabv3/utils/preprocess_data.py (ONLY NEED TO DO THIS ONCE!)

  • $ python deeplabv3/visualization/run_on_thn_seq.py

    • This will run the pretrained model (set on line 31 in run_on_thn_seq.py) on all images in the Thn sequence (real-life sequence collected with a standard dash cam) and create a visualization video, which is saved to deeplabv3/training_logs/model_eval_seq_thn. See the YouTube video at the top of the page.



Documentation of remaining code

  • model/resnet.py:

    • Definition of the custom ResNet model (output stride = 8 or 16), which is the backbone of DeepLabV3.
  • model/aspp.py:

    • Definition of the Atrous Spatial Pyramid Pooling (ASPP) module.
  • model/deeplabv3.py:

    • Definition of the complete DeepLabV3 model.
  • utils/preprocess_data.py:

    • Converts all Cityscapes label images from having Id to having trainId pixel values, and saves these to deeplabv3/data/cityscapes/meta/label_imgs. Also computes class weights according to the ENet paper and saves these to deeplabv3/data/cityscapes/meta.
  • utils/utils.py:

    • Contains helper functions which are imported and utilized in multiple files.
  • datasets.py:

    • Contains all utilized dataset definitions.
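
As a rough illustration of what utils/preprocess_data.py is described as doing above, the sketch below remaps raw Cityscapes label Ids to trainIds and then computes ENet-style class weights, w = 1 / ln(c + p), where p is the pixel frequency of a class. The Id-to-trainId table follows the standard Cityscapes label definitions. The function names, the tiny 2x3 label image, the use of 255 as the ignore value, the exclusion of ignored pixels from p, and c = 1.02 (the value suggested in the ENet paper) are assumptions for illustration and may differ from what the repository's script actually does.

```python
import math
from collections import Counter

# Standard Cityscapes Id -> trainId table for the 19 evaluated classes.
ID_TO_TRAIN_ID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}
IGNORE = 255  # assumption: every other Id is collapsed to one ignore value


def ids_to_train_ids(label_img):
    """Remap a 2D list of Cityscapes Ids to trainIds."""
    return [[ID_TO_TRAIN_ID.get(pixel, IGNORE) for pixel in row] for row in label_img]


def enet_class_weights(train_id_imgs, num_classes=19, c=1.02):
    """ENet-style weights: w = 1 / ln(c + p), with p the class pixel frequency."""
    counts = Counter(
        pixel
        for img in train_id_imgs
        for row in img
        for pixel in row
        if pixel != IGNORE
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no labeled pixels found")
    return [1.0 / math.log(c + counts.get(k, 0) / total) for k in range(num_classes)]


# Toy label image: road, road, road / car, person, unlabeled.
label_img = [[7, 7, 7], [26, 24, 0]]
train_ids = ids_to_train_ids(label_img)
weights = enet_class_weights([train_ids])
print(train_ids)  # [[0, 0, 0], [13, 11, 255]]
print([round(w, 2) for w in weights])
```

Frequent classes (road in the toy image) receive small weights and rare or absent classes receive large ones, which is the point of the weighting: it keeps the loss from being dominated by the classes that cover most pixels.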

deeplabv3's Issues

What does the OS stand for in the model name?

Hi

I have a question about the models. In the name you specify which ResNet you are using (how many layers) but also "OS_X". Looking at the code it seems to me it has to do with the number of skip-connection blocks. Is that correct?

Thank you very much!

Better Model?

Hi:

I evaluated your pretrained model on the Cityscapes dataset and found that the performance is not that good. I haven't trained deeplabv3 using your scripts yet, but before doing so, I wonder whether this is the best model trained with your scripts. (I assume you also had to adjust many parameters to reach the current performance.) Thank you very much!

I've recently been trying to reproduce the DeepLabV3 paper's performance, so this question is important to me :)

Data Enhancement

Using a random cropping method, what is the difference between training and testing?

shape doesn't match

When I use ResNet50-152 as the backbone to evaluate on my datasets, the shapes don't match.

The dataloader problem

In the training dataloader, the 2K images are resized to 1024 and then randomly cropped to 255*255. Won't this have a bad effect on the image segmentation results?

about evaluation result

Hi, @fregu856 ,

Thanks for releasing such a useful package using PyTorch. I ran eval_on_val_for_metrics.py as described and obtained the same metrics as yours. I'm concerned about the large gap between your result and the official deeplabv3+ result.
The class IOU of yours is 59.3, while the official deeplabv3+ is reported as 82.1.
Could you list the difference regarding your implementation? Is the provided pre-trained model model_13_2_2_2_epoch_580.pth a very preliminary training result?

THX!

size mismatch for aspp when loading pre-trained model

Hi!

Very nice repo! I'm currently trying to integrate your model into our framework (https://github.com/DIVA-DIA/DeepDIVA, feel free to check it out!). However, when I load the provided weights for deeplabv3 I get the following error:

	size mismatch for aspp.conv_1x1_4.weight: copying a param with shape torch.Size([20, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([8, 256, 1, 1]).
	size mismatch for aspp.conv_1x1_4.bias: copying a param with shape torch.Size([20]) from checkpoint, the shape in current model is torch.Size([8]). (deeplabv3.py:50)

I am using exactly the Resnet (ResNet18_OS8) and the ASPP (no bottleneck) that you are using in your code. Do you know what could be causing this?

Thank you very much already in advance.

Cheers,
Linda

Add License

Hi,

Can you please add a license to your repo? Otherwise people will not be able to use your project.

Thanks

Multi Grid Support

Hi, thank you for this code repository.
Does this implementation support Multi Grid method as discussed in the paper?

Changing No.# of classes

Hi,

For a custom dataset,

  • I changed the number of classes to 7 in the deeplabv3.py file (my data has 8 classes).
  • Eliminated the class_weights file requirement in the train.py file
  • Put my data & labels into the same folder structure

I get the following error in the line loss = loss_fn(outputs, label_imgs):
IndexError: Target 78 is out of bounds.

Can someone point out where I am going wrong?
Thanks
