
Polyp-PVT

by Bo Dong, Wenhai Wang, Jinpeng Li, Deng-Ping Fan.

This repo is the official implementation of "Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers".

1. Introduction

Polyp-PVT was initially described in an arXiv paper.

Most polyp segmentation methods use CNNs as their backbone, which leads to two key issues when exchanging information between the encoder and decoder: 1) accounting for the differing contributions of features at different levels; and 2) designing an effective mechanism for fusing these features. Unlike existing CNN-based methods, we adopt a transformer encoder, which learns more powerful and robust representations. In addition, considering the influence of image acquisition and the elusive properties of polyps, we introduce three novel modules: a cascaded fusion module (CFM), a camouflage identification module (CIM), and a similarity aggregation module (SAM). The CFM collects the semantic and location information of polyps from high-level features, while the CIM captures polyp information disguised in low-level features. With the help of the SAM, we extend the pixel features of the polyp area, together with high-level semantic position information, to the entire polyp area, thereby effectively fusing cross-level features. The proposed model, named Polyp-PVT, effectively suppresses noise in the features and significantly improves their expressive capability.
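
To make the data flow concrete, here is a minimal, hypothetical sketch (not the official implementation): it assumes a PVTv2-style encoder that emits four feature levels f1..f4, and it replaces the real CFM, CIM, and SAM with stand-in convolutions purely to show where each module sits in the pipeline.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PolypPVTFlowSketch(nn.Module):
    def __init__(self, chans=(64, 128, 320, 512), mid=32):
        super().__init__()
        # stand-ins only: the real CFM/CIM/SAM are far more elaborate
        self.cim = nn.Conv2d(chans[0], mid, 3, padding=1)        # low-level polyp cues
        self.cfm = nn.Conv2d(sum(chans[1:]), mid, 3, padding=1)  # high-level semantics + location
        self.sam = nn.Conv2d(2 * mid, mid, 3, padding=1)         # cross-level fusion
        self.head = nn.Conv2d(mid, 1, 1)                         # segmentation logits

    def forward(self, f1, f2, f3, f4):
        size = f1.shape[2:]
        high = torch.cat([F.interpolate(f, size=size, mode='bilinear', align_corners=False)
                          for f in (f2, f3, f4)], dim=1)
        high = self.cfm(high)                             # CFM: collect high-level information
        low = self.cim(f1)                                # CIM: polyp cues hidden in low-level features
        fused = self.sam(torch.cat([low, high], dim=1))   # SAM: spread semantics over the polyp area
        return self.head(fused)

feats = [torch.randn(1, c, 88 // 2 ** i, 88 // 2 ** i) for i, c in enumerate((64, 128, 320, 512))]
print(PolypPVTFlowSketch()(*feats).shape)   # torch.Size([1, 1, 88, 88])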

Polyp-PVT achieves strong performance on image-level polyp segmentation (0.808 mean Dice and 0.727 mean IoU on ColonDB) and video polyp segmentation (0.880 mean Dice and 0.802 mean IoU on CVC-300-TV), surpassing previous models by a large margin.

2. Framework Overview

3. Results

3.1 Image-level Polyp Segmentation

3.2 Image-level Polyp Segmentation Comparison Results:

We also provide the results of several baseline methods, which you can download from Google Drive/Baidu Drive [code:qw9i]; the archive includes our results and those of the compared models.

3.3 Video Polyp Segmentation

3.4 Video Polyp Segmentation Comparison Results:

We also provide the results of several baseline methods, which you can download from Google Drive/Baidu Drive [code:rtvt]; the archive includes our results and those of the compared models.

4. Usage:

4.1 Recommended environment:

Python 3.8
PyTorch 1.7.1
torchvision 0.8.2

4.2 Data preparation:

Download the training and testing datasets from Google Drive/Baidu Drive [code:sydz] and move them into ./dataset/.
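
As a quick sanity check after unpacking, the snippet below walks the expected folder layout. The TrainDataset/TestDataset names and the images/ and masks/ sub-folders are assumptions inferred from the test loader shown later on this page; adapt the names to what you actually downloaded.

import os

def check_dataset(root='./dataset'):
    for split in ('TrainDataset', 'TestDataset'):
        split_path = os.path.join(root, split)
        if not os.path.isdir(split_path):
            print('missing:', split_path)
            continue
        for dirpath, _, filenames in os.walk(split_path):
            # each leaf should contain paired images/ and masks/ folders
            if os.path.basename(dirpath) in ('images', 'masks'):
                print('{:60s} {:5d} files'.format(dirpath, len(filenames)))

if __name__ == '__main__':
    check_dataset()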

4.3 Pretrained model:

Download the pretrained model from Google Drive/Baidu Drive [code:w4vk] and put it in the './pretrained_pth' folder for initialization.
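
A hedged example of how the downloaded backbone weights might be loaded before training. The checkpoint file name 'pvt_v2_b2.pth' and the 'model.backbone' attribute are assumptions, so adapt them to the file you downloaded and to the model definition in your checkout.

import torch

def load_backbone_weights(model, ckpt_path='./pretrained_pth/pvt_v2_b2.pth'):
    # strict=False tolerates checkpoint keys (e.g. an ImageNet classification head)
    # that the segmentation model does not use
    state = torch.load(ckpt_path, map_location='cpu')
    missing, unexpected = model.backbone.load_state_dict(state, strict=False)
    print('missing keys: %d | unexpected keys: %d' % (len(missing), len(unexpected)))
    return model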

4.4 Training:

Clone the repository and start training:

git clone https://github.com/DengPingFan/Polyp-PVT.git
cd Polyp-PVT 
bash train.sh

4.5 Testing:

cd Polyp-PVT 
bash test.sh

4.6 Evaluating your trained model:

MATLAB: please refer to the MICCAI 2020 evaluation toolbox (link).

Python: please refer to the ACM MM 2021 evaluation toolbox (link).

Please note that we used the MATLAB version for the evaluation reported in our paper.

4.7 Well-trained model:

You can download the trained model from Google Drive/Baidu Drive [code:9rpy] and put it in the './model_pth' directory.

4.8 Pre-computed maps:

Google Drive/Baidu Drive [code:x3jc]

5. Citation:

@article{dong2023PolypPVT,
  title={Polyp-PVT: Polyp Segmentation with Pyramid Vision Transformers},
  author={Dong, Bo and Wang, Wenhai and Fan, Deng-Ping and Li, Jinpeng and Fu, Huazhu and Shao, Ling},
  journal={CAAI AIR},
  year={2023}
}

6. Acknowledgement

We are very grateful to the excellent works PraNet, EAGRNet, and MSEG, which provided the basis for our framework.

7. FAQ:

If you have suggestions for improving usability, or any other advice, please feel free to contact me directly ([email protected]).

8. License

The source code is free for research and education use only. Any commercial use requires formal permission first.


Polyp-PVT Issues

Selection of the best weights

Hello, I would like to ask how the final best weights were selected. Were they chosen according to the combined test set, or selected separately for each individual test dataset?

Question about the channel selection in SAM

Hello, I noticed this fantastic work and have the following question:
In SAM, this work applies "a Softmax function on the channel dimension of T2 and chooses the second channel as the attention map". I wonder why the second channel is chosen.
I have already read the related paper "Edge-aware graph representation learning and reasoning for face parsing", but I still have no idea. I would appreciate it if you could give me an answer!
Best regards
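
For readers following this thread, a toy illustration of the step in question: if the two channels of T2 are read as (background, foreground) logits, then after a Softmax over the channel dimension the second channel is the foreground probability, which is why it can serve as the attention map. This reading is an interpretation, not a statement from the authors.

import torch

t2 = torch.randn(1, 2, 4, 4)          # B x 2 x H x W logits (toy values)
prob = torch.softmax(t2, dim=1)       # normalize across the two channels
attn = prob[:, 1:2, :, :]             # second channel = foreground probability
print(attn.shape)                     # torch.Size([1, 1, 4, 4])
print(torch.allclose(prob.sum(dim=1), torch.ones(1, 4, 4)))  # True: channels sum to 1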

evaluation

Hello, I see that Train.py only computes the mDice metric. Could you please share the code for the other evaluation metrics used during model training?

import os
import numpy as np
import torch.nn.functional as F
from utils.dataloader import test_dataset  # dataloader import path as in the repo's Train.py (assumption)


def test(model, path, dataset):
    data_path = os.path.join(path, dataset)
    image_root = '{}/images/'.format(data_path)
    gt_root = '{}/masks/'.format(data_path)
    model.eval()
    num1 = len(os.listdir(gt_root))
    test_loader = test_dataset(image_root, gt_root, 352)  # 352 = test image size
    DSC = 0.0
    for i in range(num1):
        image, gt, name = test_loader.load_data()
        gt = np.asarray(gt, np.float32)
        gt /= (gt.max() + 1e-8)
        image = image.cuda()

        res, res1 = model(image)
        # eval Dice: upsample the summed predictions to the ground-truth resolution
        res = F.upsample(res + res1, size=gt.shape, mode='bilinear', align_corners=False)
        res = res.sigmoid().data.cpu().numpy().squeeze()
        res = (res - res.min()) / (res.max() - res.min() + 1e-8)
        input = res
        target = np.array(gt)
        smooth = 1
        input_flat = np.reshape(input, (-1))
        target_flat = np.reshape(target, (-1))
        intersection = (input_flat * target_flat)
        dice = (2 * intersection.sum() + smooth) / (input.sum() + target.sum() + smooth)
        DSC = DSC + float('{:.4f}'.format(dice))

    return DSC / num1
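
For reference, one hedged way to log an additional metric such as mIoU inside the loop above; this helper is illustrative and is not taken from the authors' code (the numbers reported in the paper come from the MATLAB toolbox mentioned in Section 4.6).

import numpy as np

def iou_score(pred, gt, thr=0.5, smooth=1e-8):
    # threshold the normalized prediction and compare with the binarized ground truth
    pred_bin = (pred > thr).astype(np.float32).reshape(-1)
    gt_bin = (gt > thr).astype(np.float32).reshape(-1)
    inter = (pred_bin * gt_bin).sum()
    union = pred_bin.sum() + gt_bin.sum() - inter
    return (inter + smooth) / (union + smooth)

# inside the for-loop of test():  IOU = IOU + iou_score(res, gt)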

PVT V2 implementation

Hi @DengPingFan

Did you check the implementation of PVT v2? I need a classification head in the forward pass so that I can apply some classification loss functions, but you have commented out that line. Could you please tell me how to solve this?

About the pretrained model

Hello, I saw in the paper that you compared against the ResUNet++ network. Could you share the pretrained model for that network? I have tried for a long time without success, and I would like to run a comparison experiment.

Visualization results

In the visualization results, how are the red, green, and yellow regions drawn?

Could you give FPS or FLOPs figures for Polyp-PVT? I tested the backbone and it is quite slow.

import time
import torch
# the import path for pvt_v2_b0 is an assumption; use the module that defines it in this repo
from lib.pvtv2 import pvt_v2_b0

if __name__ == "__main__":
    a = torch.randn(1, 3, 512, 512).cuda()
    backbone = pvt_v2_b0().cuda()
    start = time.time()
    out = backbone(a)
    end = time.time() - start
    print('each image use %5f seconds, and image size is 512' % end)
    print([i.shape for i in out])

each image use 0.374312 seconds, and image size is 512
[torch.Size([1, 32, 128, 128]), torch.Size([1, 64, 64, 64]), torch.Size([1, 160, 32, 32]), torch.Size([1, 256, 16, 16])]
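
Part of the measured latency above is likely warm-up: CUDA kernels run asynchronously, so timing a single cold forward pass mostly measures setup cost rather than throughput. A hedged sketch of a more stable measurement follows (the model is passed in, e.g. the pvt_v2_b0 instance from the snippet above); the thop mention at the end assumes that third-party package is installed.

import time
import torch

def benchmark(model, size=512, runs=50, warmup=10):
    x = torch.randn(1, 3, size, size).cuda()
    model = model.cuda().eval()
    with torch.no_grad():
        for _ in range(warmup):            # warm-up: exclude one-off setup costs
            model(x)
        torch.cuda.synchronize()           # wait for queued kernels before timing
        start = time.time()
        for _ in range(runs):
            model(x)
        torch.cuda.synchronize()
    avg = (time.time() - start) / runs
    print('avg %.5f s/image (%.1f FPS) at %dx%d' % (avg, 1.0 / avg, size, size))

# FLOPs/params can be estimated with a third-party tool such as thop (assumption):
#   from thop import profile
#   macs, params = profile(model, inputs=(torch.randn(1, 3, 352, 352).cuda(),))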

Request for the baseline code used in the paper

Hi, thank you for your excellent work.
The baseline you compare against in the paper comes from "PVTv2: Improved Baselines with Pyramid Vision Transformer", which I have tried to reproduce many times without success, and I could not find this part of the code in the project. Could you provide the PVTv2 baseline code that you used?
Thanks!

A few questions about Train.py

The following questions refer to the latest version of Train.py.

  1. Line 110: (epoch + 1) % 1 == 0
    Any integer modulo 1 is 0. If the intention is to compare the mDice of the model after every training epoch, why use this modulo operation at all? I have looked at this line repeatedly and it seems unnecessary; perhaps I have misunderstood your intention. Please advise!

  2. Line 215: for epoch in range(1, opt.epoch):
    The paper says the model is trained for 100 epochs, but this loop runs from 1 to 99. Does that mean the model is actually trained for only 99 epochs?
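
Both observations can be verified with a couple of lines (illustration only, not the authors' fix):

# Question 1: any integer modulo 1 is 0, so the condition is always true.
print(all((epoch + 1) % 1 == 0 for epoch in range(1, 100)))    # True

# Question 2: range(1, 100) yields 99 values (1..99); using opt.epoch + 1 as the
# upper bound would give the full 100 epochs.
print(len(list(range(1, 100))), len(list(range(1, 100 + 1))))  # 99 100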
