Git Product home page Git Product logo

cagnetv2-zero-shot-semantic-segmentation's Introduction

CaGNetv2: From Pixel to Patch (accpeted by TNNLS)

Code for "From Pixel to Patch: Synthesize Context-aware Features for Zero-shot Semantic Segmentation".

Created by Zhangxuan Gu, Siyuan Zhou, Li Niu*, Zihan Zhao, Liqing Zhang*.

Paper Link: [arXiv]

Note

This work is an extension of our previous CaGNet [arXiv, github].

Visualization on Pascal-VOC

Visualization on Pascal-VOC

Introduction

Zero-shot learning has been actively studied for image classification task to relieve the burden of annotating image labels. Interestingly, semantic segmentation task requires more labor-intensive pixel-wise annotation, but zero-shot semantic segmentation has only attracted limited research interest. Thus, we focus on zero-shot semantic segmentation, which aims to segment unseen objects with only category-level semantic representations provided for unseen categories. In this paper, we propose a novel Context-aware feature Generation Network (CaGNetv2), which can synthesize context-aware pixel-wise visual features for unseen categories based on category-level semantic representations and pixel-wise contextual information. The synthesized features are used to finetune the classifier to enable segmenting unseen objects. Furthermore, we extend pixel-wise feature generation and finetuning to patch-wise feature generation and finetuning, which additionally considers inter-pixel relationship. Experimental results on Pascal-VOC, Pascal-Context, and COCO-stuff show that our method significantly outperforms the existing zero-shot semantic segmentation methods.

Overview of Our CaGNet

Experimental Results

We compare our CaGNetv2 with SPNet [github, paper] and ZS3Net [github, paper].

β€œST” in the following tables stands for self-training mentioned in ZS3Net.

Our Results on Pascal-VOC dataset

Method hIoU mIoU PA MA S-mIoU U-mIoU U-PA U-MA
SPNet 0.0002 0.5687 0.7685 0.7093 0.7583 0.0001 0.0007 0.0001
SPNet-c 0.2610 0.6315 0.7755 0.7188 0.7800 0.1563 0.2955 0.2387
ZS3Net 0.2874 0.6164 0.7941 0.7349 0.7730 0.1765 0.2147 0.1580
CaGNet(pi) 0.3972 0.6545 0.8068 0.7636 0.7840 0.2659 0.4297 0.3940
CaGNet(pa) 0.4326 0.6623 0.8068 0.7643 0.7814 0.2990 0.5176 0.4710
ZS3Net+ST 0.3328 0.6302 0.8095 0.7382 0.7802 0.2115 0.3407 0.2637
CaGNet(pi)+ST 0.4366 0.6577 0.8164 0.7560 0.7859 0.3031 0.5855 0.5071
CaGNet(pa)+ST 0.4528 0.6657 0.8036 0.7650 0.7813 0.3188 0.5939 0.5417

Our Results on COCO-Stuff dataset

Method hIoU mIoU PA MA S-mIoU U-mIoU U-PA U-MA
SPNet 0.0140 0.3164 0.5132 0.4593 0.3461 0.0070 0.0171 0.0007
SPNet-c 0.1398 0.3278 0.5341 0.4363 0.3518 0.0873 0.2450 0.1614
ZS3Net 0.1495 0.3328 0.5467 0.4837 0.3466 0.0953 0.2275 0.2701
CaGNet(pi) 0.1819 0.3345 0.5658 0.4845 0.3549 0.1223 0.2545 0.2701
CaGNet(pa) 0.1984 0.3327 0.5632 0.4909 0.3468 0.1389 0.2962 0.3132
ZS3Net+ST 0.1620 0.3372 0.5631 0.4862 0.3489 0.1055 0.2488 0.2718
CaGNet(pi)+ST 0.1946 0.3372 0.5676 0.4854 0.3555 0.1340 0.2670 0.2728
CaGNet(pa)+ST 0.2269 0.3456 0.5711 0.4629 0.3617 0.1654 0.3702 0.2567

Our Results on Pascal-Context dataset

Method hIoU mIoU PA MA S-mIoU U-mIoU U-PA U-MA
SPNet 0 0.2938 0.5793 0.4486 0.3357 0 0 0
SPNet-c 0.0718 0.3079 0.5790 0.4488 0.3514 0.0400 0.1673 0.1361
ZS3Net 0.1246 0.3010 0.5710 0.4442 0.3304 0.0768 0.1922 0.1532
CaGNet(pi) 0.2061 0.3347 0.5975 0.4900 0.3610 0.1442 0.3976 0.3248
CaGNet(pa) 0.2135 0.3243 0.5816 0.5082 0.3718 0.1498 0.3981 0.3412
ZS3Net+ST 0.1488 0.3102 0.5842 0.4532 0.3398 0.0953 0.3030 0.1721
CaGNet(pi)+ST 0.2252 0.3352 0.5951 0.4962 0.3644 0.1630 0.4038 0.4214
CaGNet(pa)+ST 0.2478 0.3364 0.5832 0.4964 0.3482 0.1923 0.4075 0.4023

Please note that our reproduced results of SPNet on Pascal-VOC dataset are obtained using their released model and code with careful tuning, but still lower than their reported results.

Code

COMING SOON !

cagnetv2-zero-shot-semantic-segmentation's People

Contributors

siyuan-zhou avatar zhangxgu avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.