
F3Net: Fusion, Feedback and Focus for Salient Object Detection

by Jun Wei, Shuhui Wang, Qingming Huang

Introduction

Most existing salient object detection models have achieved great progress by aggregating multi-level features extracted from convolutional neural networks. However, because different convolutional layers have different receptive fields, there are big differences between the features they generate. Common feature fusion strategies (addition or concatenation) ignore these differences and may lead to suboptimal solutions. In this paper, we propose F3Net to solve this problem. It mainly consists of a cross feature module (CFM) and a cascaded feedback decoder (CFD), trained by minimizing a new pixel position aware loss (PPA). Specifically, CFM aims to selectively aggregate multi-level features. Unlike addition and concatenation, CFM adaptively selects complementary components from the input features before fusion, which effectively avoids introducing too much redundant information that may corrupt the original features. Besides, CFD adopts a multi-stage feedback mechanism, where features close to the supervision are fed back to the outputs of previous layers to supplement them and reduce the differences between features. These refined features go through multiple similar iterations before the final saliency maps are generated. Furthermore, unlike binary cross entropy, the proposed PPA loss does not treat all pixels equally: it synthesizes the local structure information around a pixel to guide the network to focus more on local details. Hard pixels from boundaries or error-prone parts are given more weight to emphasize their importance. F3Net is able to segment salient object regions accurately and provide clear local details. Comprehensive experiments on five benchmark datasets demonstrate that F3Net outperforms state-of-the-art approaches on six evaluation metrics.
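
For concreteness, here is a minimal sketch of such a pixel position aware loss in PyTorch, reconstructed from the snippets quoted in the issues further down (the weighting line and the BCE normalization appear there verbatim); the function name ppa_loss is ours, and the released train.py may differ in details:

    import torch
    import torch.nn.functional as F

    def ppa_loss(pred, mask):
        # Pixels whose 31x31 neighborhood disagrees with their own label
        # (boundaries, error-prone regions) receive up to 6x weight.
        weit = 1 + 5 * torch.abs(
            F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15) - mask)

        # Weighted BCE, normalized by the total weight per image.
        wbce = F.binary_cross_entropy_with_logits(pred, mask, reduction='none')
        wbce = (weit * wbce).sum(dim=(2, 3)) / weit.sum(dim=(2, 3))

        # Weighted IoU; the +1 guards against 0/0 on empty masks.
        pred = torch.sigmoid(pred)
        inter = ((pred * mask) * weit).sum(dim=(2, 3))
        union = ((pred + mask) * weit).sum(dim=(2, 3))
        wiou = 1 - (inter + 1) / (union - inter + 1)

        return (wbce + wiou).mean()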

Prerequisites

Clone repository

git clone git@github.com:weijun88/F3Net.git
cd F3Net/

Download dataset

Download the following datasets and unzip them into the data folder: DUTS-TR (for training) and PASCAL-S, ECSSD, HKU-IS, DUT-OMRON and DUTS-TE (for testing)

Download model

  • If you want to test the performance of F3Net, please download the trained model into the out folder
  • If you want to train your own model, please download the pretrained backbone model into the res folder

Training

    cd src/
    python3 train.py
  • ResNet-50 is used as the backbone of F3Net, and DUTS-TR is used to train the model
  • batch=32, lr=0.05, momen=0.9, decay=5e-4, epoch=32
  • Warm-up and linear decay strategies are used to adjust the learning rate lr (see the sketch after this list)
  • After training, the trained models will be saved in the out folder
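
The exact schedule lives in src/train.py; as a rough illustration of a warm-up plus linear-decay policy (the helper name lr_at and the warm-up fraction are our assumptions, not the repository's code):

    def lr_at(step, total_steps, base_lr=0.05, warmup_frac=0.1):
        # Hypothetical sketch: ramp linearly up to base_lr over the first
        # warmup_frac of training, then decay linearly towards zero.
        warmup_steps = max(1, int(total_steps * warmup_frac))
        if step < warmup_steps:
            return base_lr * (step + 1) / warmup_steps
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return base_lr * (1.0 - progress)

    # inside the training loop, before each optimizer step:
    # for group in optimizer.param_groups:
    #     group['lr'] = lr_at(global_step, total_steps)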

Testing

    cd src/
    python3 test.py
  • After testing, saliency maps for PASCAL-S, ECSSD, HKU-IS, DUT-OMRON and DUTS-TE will be saved in the eval/F3Net/ folder
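
Several issues below ask how to run the released model on a single image of one's own. A minimal sketch follows; the normalization constants are the ones quoted from the dataset code in the issues, while the input size, the channel handling, and the assumption that the network's last output is the final saliency map should all be checked against src/test.py:

    import cv2
    import numpy as np
    import torch

    MEAN = np.array([[[124.55, 118.90, 102.94]]])
    STD  = np.array([[[ 56.77,  55.97,  57.50]]])

    def predict(model, image_path, size=352):
        img = cv2.imread(image_path)        # returns None on a bad path, the
        assert img is not None, image_path  # usual cause of the 'NoneType' error
        h, w = img.shape[:2]
        x = (img.astype(np.float32) - MEAN) / STD
        x = cv2.resize(x, (size, size))
        x = torch.from_numpy(x.transpose(2, 0, 1)).unsqueeze(0).float()
        with torch.no_grad():
            out = model(x)
            out = out[-1] if isinstance(out, (list, tuple)) else out
            pred = torch.sigmoid(out)[0, 0].numpy()
        return (cv2.resize(pred, (w, h)) * 255).astype(np.uint8)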

Saliency maps & Trained model

Evaluation

  • To evaluate the performance of F3Net, please use MATLAB to run main.m
    cd eval
    matlab
    main
  • Quantitative comparisons (figure omitted from this copy)

  • Qualitative comparisons (figure omitted from this copy)
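
For those without a MATLAB license (one issue below asks for a Python alternative), the simplest of the metrics, MAE, is easy to reproduce; this sketch is ours and is not part of the repository's eval code:

    import cv2
    import numpy as np

    def mae(pred_path, gt_path):
        # Mean absolute error between a saliency map and its ground truth,
        # both rescaled to [0, 1].
        pred = cv2.imread(pred_path, cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
        gt   = cv2.imread(gt_path,   cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
        if pred.shape != gt.shape:
            pred = cv2.resize(pred, (gt.shape[1], gt.shape[0]))
        return float(np.abs(pred - gt).mean())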

Citation

  • If you find this work helpful, please cite our paper
@inproceedings{F3Net,
  title     = {F3Net: Fusion, Feedback and Focus for Salient Object Detection},
  author    = {Jun Wei and Shuhui Wang and Qingming Huang},
  booktitle = {AAAI Conference on Artificial Intelligence (AAAI)},
  year      = {2020}
}


F3Net's Issues

I have a question about the model's performance

Good job! I can't get the same result when I load model-32 to generate prediction maps as when I directly download the prediction maps.
The former, evaluated on DUTS:
MaxF: 0.881  MeanF: 0.829  MAE: 0.036
The latter, evaluated on DUTS:
MaxF: 0.891  MeanF: 0.840  MAE: 0.035

How to use model-32

I cloned your repo and downloaded model-32, and I want to use it to output the mask of a photo of my own to compare the effect with other models.
However, I'm having some trouble debugging the code, so I'd like advice on using model-32 directly.
Really, thanks!

How to test the code

Hi, thanks for sharing your great work.

How do I run the code on my own input images? Where do I put my images for testing?

python3 test.py
returns the following error:
TypeError: 'NoneType' object is not subscriptable

How are std and mean computed?

self.mean   = np.array([[[124.55, 118.90, 102.94]]])
self.std    = np.array([[[ 56.77,  55.97,  57.50]]])

Where do these two values come from? I found that multiplying the values PyTorch provides by 255 gives different numbers.
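
If these values were computed on the saliency training set rather than on ImageNet (our surmise, not confirmed by the repository), they would naturally differ from 255 times PyTorch's ImageNet statistics. Here is a sketch of how such per-channel statistics can be computed over raw 0-255 images (our illustration, not the repository's code):

    import cv2
    import numpy as np

    def channel_stats(image_paths):
        # Accumulate per-channel sums of raw 0-255 pixel values.
        s, s2, n = np.zeros(3), np.zeros(3), 0
        for p in image_paths:
            img = cv2.imread(p).astype(np.float64).reshape(-1, 3)
            s  += img.sum(axis=0)
            s2 += (img ** 2).sum(axis=0)
            n  += img.shape[0]
        mean = s / n
        std = np.sqrt(s2 / n - mean ** 2)
        return mean, std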

Question about train.py

Thanks for your code!
I have a question about the following line in train.py:
if epoch>cfg.epoch/3*2: torch.save(net.state_dict(), cfg.savepath+'/model-'+str(epoch+1))

It saves every checkpoint once epoch > cfg.epoch/3*2, but how do I choose the best one for testing?
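
A common way to pick among the saved checkpoints (our suggestion, not code from this repository) is to score each one on a held-out validation split and keep the best, e.g. by MAE; build_model and validation_mae below are hypothetical stand-ins for your own construction and evaluation code:

    import glob
    import torch

    best_path, best_mae = None, float('inf')
    for path in sorted(glob.glob('../out/model-*')):
        model = build_model()              # hypothetical helper
        model.load_state_dict(torch.load(path))
        model.eval()
        score = validation_mae(model)      # hypothetical helper; lower is better
        if score < best_mae:
            best_path, best_mae = path, score
    print(best_path, best_mae)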

Hello, about the CFM module and the decoder

Congratulations on such good results. After reading your paper, I have two questions.
1. In the CFM module, the upper- and lower-level features are convolved and then multiplied element-wise, and the results are further processed and added back to the two branches. Why multiplication? How did you come up with this design, what is its meaning, or does it follow some earlier paper? I'm puzzled by the idea of multiplying feature maps.
2. In your net.py, F3Net() only calls decoder() twice, but in the paper the decoder is invoked n times. Maybe I read it too quickly, but I can't find the for loop that defines n decoder iterations.

Retrain the existing model model-32

Hello sir,
Thank you for your great work.

I want to re-train your existing model-32 with my custom dataset. For that I am replacing
self.load_state_dict(torch.load('../res/resnet50-19c8e357.pth'), strict=False)
with
self.load_state_dict(torch.load('../res/model-32'), strict=False)

But when I do that, the performance of the model is degraded. Please guide me on how to retrain model-32 with my custom dataset.
Thanks.

Implementation of the loss function

Hi, there are two things about the loss function I don't quite understand. In wbce = (weit*wbce).sum(dim=(2,3))/weit.sum(dim=(2,3)), the denominator is the sum of the weights, while in the paper it is the sum of ya. Second, when computing the IoU, both the numerator and the denominator add 1; is this to avoid 0/0? Thanks.

Apex can't be installed

The apex dependency requires specific CUDA versions to be installed (I still haven't managed to install it) and makes the whole F3Net code extremely hard to install and experiment with.

pip install apex fails, and installation also fails when apex is built from the GitHub repository with either:

    python setup.py install --cuda_ext --cpp_ext

or:

    git clone https://github.com/NVIDIA/apex
    cd apex
    pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Small mistake in Fig. 2 of your paper

Hi,
If I may, I think you have a small mistake in your paper:

  • Figure 2, about the CFM module: after the element-wise multiplication, given the direction of the arrow, your colored blocks should be reversed (currently it says you are doing ReLU, BN and conv after the multiplication).

Thanks for sharing your code and work with us!

How to get Fig. 1 in the paper?

Thanks for your code and paper. I'm curious about the visualization of the low-level and high-level features. Could you please tell me how to generate such a figure?

Training results differ on every run

Hello, every time I run your code, the training result is different, which also has a big impact on the final test results.
I have fixed the random seed at the beginning of the code, but it doesn't help at all.
Watching the training loss, it is also different on each rerun. What could be causing this?
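
For what it's worth, fixing the Python/NumPy/PyTorch seeds alone is usually not enough on GPU: cuDNN selects non-deterministic kernels unless told otherwise. A general PyTorch recipe (not specific to this repository):

    import random
    import numpy as np
    import torch

    def set_deterministic(seed=7):
        random.seed(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        # cuDNN autotuning picks algorithms non-deterministically;
        # disable it for reproducible (if slower) runs.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False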

MATLAB code in /eval folder gives different results

Hello,

thanks for your work!

However, when I download your pre-computed maps and use the MATLAB code given in the eval/ folder, the results are different from those in the paper.

I think there are some issues in the MATLAB code.

May I get your help to verify? Or would you mind providing a Python version of the evaluation, especially for E-measure and S-measure?

Thanks!

Loss function

weit = 1+5*torch.abs(F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15)-mask)
Hi, how was kernel_size=31 chosen?

Which dataset was model-32 trained on?

I tested with the pretrained model-32 you provided and got very satisfying results; thank you very much for releasing the source code.
But I'd like to know which dataset model-32 was trained on. I see that train.py uses the DUTS dataset; is that right?
Looking forward to your answer.
