

SegSort: Segmentation by Discriminative Sorting of Segments

By Jyh-Jing Hwang, Stella X. Yu, Jianbo Shi, Maxwell D. Collins, Tien-Ju Yang, Xiao Zhang, and Liang-Chieh Chen

Almost all existing deep learning approaches for semantic segmentation tackle this task as a pixel-wise classification problem. Yet humans understand a scene not in terms of pixels, but by decomposing it into perceptual groups and structures that are the basic building blocks of recognition. This motivates us to propose an end-to-end pixel-wise metric learning approach that mimics this process. In our approach, the optimal visual representation determines the right segmentation within individual images and associates segments with the same semantic classes across images. The core visual learning problem is therefore to maximize the similarity within segments and minimize the similarity between segments. Given a model trained this way, inference is performed consistently by extracting pixel-wise embeddings and clustering, with the semantic label determined by the majority vote of its nearest neighbors from an annotated set.
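The inference step described above (pixel-wise embeddings, grouping into segments, then a nearest-neighbor majority vote against an annotated set) can be illustrated with a small NumPy sketch. This is not the released implementation: the function names, the use of mean embeddings as segment prototypes, cosine similarity on unit-normalized vectors, and k=3 are all assumptions made for illustration, and the clustering step is taken as given through precomputed segment ids.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Project vectors onto the unit sphere (assumed embedding space)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def segment_prototypes(embeddings, segment_ids):
    """Average the pixel embeddings of each segment into one prototype.

    embeddings: (num_pixels, dim) array of pixel-wise embeddings.
    segment_ids: (num_pixels,) integer array from some clustering step.
    """
    ids = np.unique(segment_ids)
    protos = np.stack([embeddings[segment_ids == i].mean(axis=0) for i in ids])
    return ids, l2_normalize(protos)


def knn_majority_vote(query_protos, bank_protos, bank_labels, k=3):
    """Label each query segment by majority vote over its k nearest
    annotated segments, using cosine similarity (hypothetical choice)."""
    sims = l2_normalize(query_protos) @ l2_normalize(bank_protos).T
    nearest = np.argsort(-sims, axis=1)[:, :k]
    votes = bank_labels[nearest]  # shape: (num_queries, k)
    # Ties are broken toward the smallest label id by argmax.
    return np.array([np.bincount(v).argmax() for v in votes])


# Toy annotated set: five 2-D segment prototypes with class labels.
bank = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99], [-1.0, 0.0]])
bank_labels = np.array([0, 0, 1, 1, 2])

# Toy query image: four pixels already grouped into two segments.
pixels = np.array([[0.9, 0.2], [0.8, 0.1], [0.2, 0.9], [0.1, 0.8]])
seg_ids = np.array([0, 0, 1, 1])

ids, protos = segment_prototypes(pixels, seg_ids)
pred = knn_majority_vote(protos, bank, bank_labels, k=3)
print(dict(zip(ids.tolist(), pred.tolist())))
```

Because each predicted label comes from a handful of retrieved segments, the `nearest` indices can also be inspected directly to see which annotated segments drove a given decision.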

As a result, we present SegSort, a first attempt at using deep learning for unsupervised semantic segmentation, which achieves 76% of the performance of its supervised counterpart. When supervision is available, SegSort shows consistent improvements over conventional approaches based on pixel-wise softmax training. Additionally, our approach produces more precise boundaries and consistent region predictions. The proposed SegSort further produces an interpretable result, as each choice of label can be easily understood from the retrieved nearest segments.

SegSort was published at ICCV 2019; see our paper for more details.

Codebase

This release of SegSort is based on our previously published codebase, AAF (ECCV 2018). It is also easy to integrate the SegSort modules in network/segsort/ with the popular DeepLab codebase.

Prerequisites

  1. Linux
  2. Python 2.7 or Python 3 (>=3.5)
  3. CUDA 8.0 and cuDNN 6

Required Python Packages

  1. tensorflow 1.X
  2. numpy
  3. scipy
  4. tqdm
  5. PIL
  6. opencv
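Before launching training, it can help to confirm that the packages listed above are importable. The following is a minimal sketch, assuming Python 3 and the usual import names for these packages (`PIL` is provided by Pillow, `cv2` by the OpenCV bindings); the helper `find_missing_modules` is hypothetical and not part of this codebase, and it does not verify that the installed TensorFlow is a 1.X release.

```python
import importlib.util

# Import names assumed for the listed packages; "PIL" is provided by Pillow
# and "cv2" by OpenCV's Python bindings.
REQUIRED_MODULES = ["tensorflow", "numpy", "scipy", "tqdm", "PIL", "cv2"]


def find_missing_modules(module_names):
    """Return the top-level module names that cannot be located."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```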

Data Preparation

  • PASCAL VOC 2012
    • The ground truth semantic segmentation masks are reformatted as grayscale images, or you can download them here. Please put them under the VOC2012/ folder.
    • The oversegmentation masks (from contours) can be produced by combining any contour detector with gPb-owt-ucm. We provide the HED-owt-ucm masks here. Please put them under the VOC2012/ folder.
  • Dataset folder structure: VOC2012/
    • JPEGImages/
    • segcls/
    • hed/
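A quick way to confirm the dataset layout above before training is to check for the three sub-folders. This is a small sketch under stated assumptions: the helper `missing_voc_subdirs` is hypothetical, and the `VOC2012` root used in the example is a placeholder for wherever the dataset actually lives.

```python
import os

# Sub-folders named in the dataset folder structure.
EXPECTED_SUBDIRS = ("JPEGImages", "segcls", "hed")


def missing_voc_subdirs(voc_root, expected=EXPECTED_SUBDIRS):
    """Return the expected sub-folders that do not exist under voc_root."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(voc_root, name))]


if __name__ == "__main__":
    root = "VOC2012"  # hypothetical location; adjust to your dataset path
    missing = missing_voc_subdirs(root)
    if missing:
        print("Missing under %s: %s" % (root, ", ".join(missing)))
    else:
        print("Dataset layout looks complete.")
```

The check only looks at folder names; it does not validate that the masks inside segcls/ and hed/ match the images in JPEGImages/.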

ImageNet Pre-Trained Models

Download ResNet101.v1 from Tensorflow-Slim. Please put it under a new directory SegSort/snapshots/imagenet/trained/.

Bashscripts to Get Started

  • SegSort (Single-GPU and fast training)
    source bashscripts/voc12/train_segsort.sh
  • SegSort (Multi-GPUs)
    source bashscripts/voc12/train_segsort_mgpu.sh
  • Unsupervised SegSort (Single-GPU)
    source bashscripts/voc12/train_segsort_unsup.sh
  • Baseline Models: Please refer to our previous codebase AAF.

Citation

If you find this code useful for your research, please consider citing our paper SegSort: Segmentation by Discriminative Sorting of Segments.

@inproceedings{hwang2019segsort,
  title={SegSort: Segmentation by Discriminative Sorting of Segments},
  author={Hwang, Jyh-Jing and Yu, Stella X and Shi, Jianbo and Collins, Maxwell D and Yang, Tien-Ju and Zhang, Xiao and Chen, Liang-Chieh},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={7334--7344},
  year={2019}
}

License

SegSort is released under the MIT License (refer to the LICENSE file for details).
