Deep Segmentation

This repository contains several CNNs for semantic segmentation (U-Net, SegNet, ResNet, FractalNet) implemented with the Keras library. The code was developed assuming the use of depth data (e.g. from a Kinect or Asus Xtion Pro Live).

This project accompanies the paper "Convolutional Networks for Semantic Heads Segmentation using Top-View Depth Data in Crowded Environment", accepted at the International Conference on Pattern Recognition (ICPR), 2018.

If you find this code useful, we encourage you to cite the paper. BibTeX:

@conference{liciotti2018convolutional,
  title = {Convolutional Networks for Semantic Heads Segmentation using Top-View Depth Data in Crowded Environment},
  booktitle = {2018 24th International Conference on Pattern Recognition (ICPR)},
  year = {2018},
  month = {Aug},
  pages = {1384-1389},
  abstract = {Detecting and tracking people is a challenging task in a persistent crowded environment (i.e. retail, airport, station, etc.) for human behaviour analysis or security purposes. This paper introduces an approach to track and detect people in cases of heavy occlusions based on CNNs for semantic segmentation using top-view depth visual data. The purpose is the design of a novel U-Net architecture, U-Net3, that has been modified compared to the previous ones at the end of each layer. In particular, a batch normalization is added after the first ReLU activation function and after each max-pooling and up-sampling function. The approach was applied and tested on a new and publicly available dataset, TVHeads Dataset, consisting of depth images of people recorded from an RGB-D camera installed in top-view configuration. Our variant outperforms baseline architectures while remaining computationally efficient at inference time. Results show high accuracy, demonstrating the effectiveness and suitability of our approach.},
  keywords = {Cameras, Computer architecture, Fractals, Head, Image segmentation, Semantics, Training},
  issn = {1051-4651},
  doi = {10.1109/ICPR.2018.8545397},
  author = {Daniele Liciotti and Marina Paolanti and R. Pietrini and Emanuele Frontoni and Primo Zingaretti}
}

The code has been tested on:

  • Ubuntu 16.04
  • Python 3.5.2
  • Keras 2.2.2
  • TensorFlow 1.7.0

You can test these scripts on the following datasets:

  • YouTubeDemoHeads
  • YouTubeDemoInfant

Data

The provided data is processed by the data.py script. This script simply loads the images and saves them into NumPy binary format files (.npy) for faster loading later.

python data.py
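
For reference, here is a minimal sketch of what such a preprocessing step can look like. The file names, the raw/train location, the 96 x 128 size, and the _mask suffix convention are assumptions based on this README, not necessarily what data.py actually does:

import os
import numpy as np
from skimage.io import imread
from skimage.transform import resize

RAW_DIR = 'raw/train'   # dataset location, per the Run section below (assumption)
ROWS, COLS = 96, 128    # target size assumed from the Models section

def create_train_data():
    # Pair each image with its mask; the '_mask' suffix is a hypothetical convention.
    names = sorted(f for f in os.listdir(RAW_DIR) if '_mask' not in f)
    imgs, masks = [], []
    for name in names:
        base, ext = os.path.splitext(name)
        img = imread(os.path.join(RAW_DIR, name), as_gray=True)
        mask = imread(os.path.join(RAW_DIR, base + '_mask' + ext), as_gray=True)
        imgs.append(resize(img, (ROWS, COLS), preserve_range=True))
        masks.append(resize(mask, (ROWS, COLS), preserve_range=True))
    # Save both arrays in NumPy binary format for fast reloading at training time.
    os.makedirs('npy', exist_ok=True)
    np.save('npy/imgs_train.npy', np.array(imgs, dtype=np.uint8))
    np.save('npy/imgs_mask_train.npy', np.array(masks, dtype=np.uint8))

if __name__ == '__main__':
    create_train_data()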

Models

The provided models are essentially convolutional auto-encoders.

python train_fractal_unet.py
python train_resnet.py
python train_segnet.py
python train_unet.py
python train_unet2.py
python train_unet3_conv.py

These deep neural networks are implemented with the Keras functional API.

The output of each network is a 96 x 128 image representing the mask to be learned. A sigmoid activation function ensures that the mask pixels are in the [0, 1] range.
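
For illustration, here is a minimal encoder-decoder in the Keras functional API with the sigmoid-activated 96 x 128 output described above. This is a simplified sketch of the input/output contract, not the U-Net3 architecture from the paper; the actual architectures live in the train_*.py scripts:

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D

def build_model():
    # Single-channel 96 x 128 depth input
    inputs = Input((96, 128, 1))

    # Encoder: two conv blocks with down-sampling
    c1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    p1 = MaxPooling2D((2, 2))(c1)
    c2 = Conv2D(64, (3, 3), activation='relu', padding='same')(p1)
    p2 = MaxPooling2D((2, 2))(c2)

    # Decoder: mirror the encoder with up-sampling
    u1 = UpSampling2D((2, 2))(p2)
    c3 = Conv2D(64, (3, 3), activation='relu', padding='same')(u1)
    u2 = UpSampling2D((2, 2))(c3)
    c4 = Conv2D(32, (3, 3), activation='relu', padding='same')(u2)

    # Sigmoid keeps every mask pixel in [0, 1]
    outputs = Conv2D(1, (1, 1), activation='sigmoid')(c4)

    model = Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer='adam', loss='binary_crossentropy')
    return model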

Prediction

You can test online prediction on an OpenNI recording (.oni file).

python online_prediction.py --v <oni_video_path>

This requires an OpenNI2 installation (https://github.com/occipital/OpenNI2); then link libOpenNI2.so and the OpenNI2 directory into the script path. Before launching the script, create a folder named predicted_images.
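
As a rough sketch, an .oni recording can be read with the primesense bindings installed in the Python Environment Setup section below. online_prediction.py may structure this differently; the paths here are placeholders and the model-related steps are left as comments:

import numpy as np
from primesense import openni2

openni2.initialize('./OpenNI2')               # directory containing libOpenNI2.so
dev = openni2.Device.open_file(b'video.oni')  # the <oni_video_path> argument
depth_stream = dev.create_depth_stream()
depth_stream.start()

try:
    while True:
        frame = depth_stream.read_frame()
        buf = frame.get_buffer_as_uint16()
        depth = np.frombuffer(buf, dtype=np.uint16).reshape(frame.height, frame.width)
        # ... resize depth to 96 x 128, run model.predict(...),
        # and write the result into predicted_images/ ...
except KeyboardInterrupt:
    pass
finally:
    depth_stream.stop()
    openni2.unload()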

Python Environment Setup

sudo apt-get install python3-pip python3-dev python-virtualenv # for Python 3.n
virtualenv -p python3 deepseg
. deepseg/bin/activate

The preceding command should change your prompt to the following:

(deepseg)$ 

Install TensorFlow in the active virtualenv environment:

pip3 install --upgrade tensorflow-gpu # for Python 3.n and GPU

Install the other libraries:

pip3 install --upgrade keras scikit-learn scikit-image h5py opencv-python primesense

Run

  • Create a folder raw at the same filesystem level as the above Python scripts (see the layout sketch after this list).
  • Download the dataset and extract all the images into the folder raw/train.
  • Run python data.py; a folder npy will be created containing NumPy binary format .npy files with the training and validation datasets.
  • Run the above Python training and testing scripts, for example python train_unet3_conv.py.
  • Log files with the final results (log_conv_8.csv and log_conv_16.csv) will be created.
  • Predicted images for the test data will be stored in the folders preds_16 and preds_8.
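
Putting these steps together, the expected directory layout looks roughly like this (inferred from this README; only the raw folder is created by hand):

raw/
  train/             # extracted dataset images
npy/                 # created by data.py
preds_8/             # predicted test images (8-bit)
preds_16/            # predicted test images (16-bit)
log_conv_8.csv       # training log (8-bit)
log_conv_16.csv      # training log (16-bit)
data.py
train_unet3_conv.py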

Authors

Acknowledgements

  • This work is partially inspired by the work of jocicmarko.

Issues

Linking OpenNI

How do I link the libOpenNI2.so and the OpenNI2 directory?

Other video formats than ONI

Hi Daniele, thanks again for your code, it's been very useful.

I'm wondering how I can use this code with video files in more traditional formats such as .mov, .avi, or .mp4. Is it possible to convert these videos to .oni? What is the origin of an .oni video? Or do you have any other idea that could help me?

Thanks!

Error while training the model

Hi Daniele,

I'm trying to train the model according to the "Run" section but when I run the command python train_unet3_conv.py I get this:

(...)

Fitting model (bit = 8) ...

Train on 0 samples, validate on 0 samples
Epoch 1/200
Traceback (most recent call last):
  File "train_unet3_conv.py", line 232, in <module>
    train_and_predict(8)
  File "train_unet3_conv.py", line 186, in train_and_predict
    callbacks=[csv_logger, model_checkpoint])
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\engine\training.py", line 1039, in fit
    validation_steps=validation_steps)
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\engine\training_arrays.py", line 217, in fit_loop
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\callbacks.py", line 79, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "C:\Program Files\Anaconda3\lib\site-packages\keras\callbacks.py", line 338, in on_epoch_end
    self.progbar.update(self.seen, self.log_values)
AttributeError: 'ProgbarLogger' object has no attribute 'log_values'

Apparently there are no samples to train or validate on, but the npy folder is (I guess) created correctly.

How can I solve this? Is there something I'm missing?

Thanks a lot
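
One quick way to check whether the generated .npy files actually contain samples (the file names below are assumptions, since data.py's exact output names are not quoted in this thread):

import numpy as np

# A leading dimension of 0 in either array reproduces the empty-training error above.
for name in ('npy/imgs_train.npy', 'npy/imgs_mask_train.npy'):
    arr = np.load(name)
    print(name, arr.shape, arr.dtype)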

Possible error in data.py

On data.py, lines 58-59

    images = np.array(images, dtype=np.uint8)
    images16 = np.array(images, dtype=np.uint16)

Both are using the 8-bit images, so the training will not use the 16-bit resolution.

Is it so?
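
If the intent was to keep a genuine 16-bit copy, the second line would presumably need to read from a separately loaded list of 16-bit images, along these lines (hypothetical, since the surrounding code of data.py is not quoted here):

    images = np.array(images, dtype=np.uint8)        # 8-bit copy, as before
    images16 = np.array(images_16, dtype=np.uint16)  # images_16: hypothetical list read from the original 16-bit files

Casting the 8-bit list to np.uint16 only widens the stored values; it cannot recover the lost depth resolution.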
