Could you provide the metrics for ResNet?


How does ResNet perform compared with ViT? about vand-april-gan (open issue)

hitlei commented on June 16, 2024

How does ResNet perform compared with ViT?

from vand-april-gan.

Comments (5)

ByChelsea commented on June 16, 2024

> Hi! What's your intuition on why auroc_sp is much lower on image vs pixel? And what could be done to improve?

Hi! In the zero-shot setting, the anomaly score used for classification is the similarity between the class token and the text features. Classification accuracy and segmentation accuracy are therefore only weakly related. Classification accuracy depends mainly on the pre-trained model and the prompts, and designing good prompts by hand is difficult. You could try prompt learning to improve classification performance.
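As a rough illustration of that image-level score, here is a minimal NumPy sketch. It is not the repository's code: the function name `zero_shot_anomaly_score`, the two-row layout of `text_features` (normal first, abnormal second), and the `temperature` value are assumptions made for the example.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def zero_shot_anomaly_score(class_token, text_features, temperature=0.01):
    """Return the softmax probability of the 'abnormal' text prompt.

    class_token:   (D,) image-level feature (hypothetical input).
    text_features: (2, D) rows are [normal, abnormal] text embeddings
                   (this ordering is an assumption of the sketch).
    """
    img = l2_normalize(np.asarray(class_token, dtype=np.float64))
    txt = l2_normalize(np.asarray(text_features, dtype=np.float64))
    logits = txt @ img / temperature   # cosine similarities, sharpened
    logits -= logits.max()             # numerical stability for the softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(probs[1])


if __name__ == "__main__":
    # Toy 2-D features, purely illustrative.
    text = [[1.0, 0.0], [0.0, 1.0]]
    print(zero_shot_anomaly_score([0.1, 0.9], text))
```

Because the score only looks at the class token and the prompts, nothing in it depends on how well the anomaly is localised, which is consistent with the gap between auroc_sp and auroc_px.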

In addition, because the anomaly maps are already quite accurate, you can take the maximum value of each anomaly map as a second anomaly score. Adding the two scores together can further improve performance.

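The fusion suggested above can be sketched as follows. This is a minimal example, not the repository's implementation: `fused_image_score` is a hypothetical helper, and it assumes the class-token score and the anomaly map are on comparable scales (for example, both in [0, 1]).

```python
import numpy as np


def fused_image_score(class_token_score, anomaly_map):
    """Add the class-token score to the maximum of the pixel-level anomaly map.

    class_token_score: scalar image-level score from the class token.
    anomaly_map:       (H, W) array of per-pixel anomaly scores.
    Both are assumed to be on comparable scales; that is an assumption
    of this sketch, not something stated in the thread.
    """
    amap = np.asarray(anomaly_map, dtype=np.float64)
    return float(class_token_score) + float(amap.max())


if __name__ == "__main__":
    # Made-up values, for illustration only.
    toy_map = [[0.1, 0.2], [0.7, 0.4]]
    print(fused_image_score(0.3, toy_map))
```

Since AUROC depends only on how images are ranked, the fused score does not need to be rescaled back into [0, 1] before evaluation.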

ByChelsea commented on June 16, 2024

Hi, I'm sorry for replying late.

In our approach, we focused on exploring the CLIP model based on ViTs without conducting extensive experiments on the model based on ResNets. I can provide you with a reference result obtained by training the RN50x16 model for 5 epochs at a resolution of 384 (the standard resolution for this model).

objects	auroc_px	f1_px	ap_px	aupro	auroc_sp	f1_sp	ap_sp
candle	97.5	16.9	9.9	87.1	94.4	87.5	93.9
capsules	90.4	8.0	3.6	67.9	62.2	77.2	73.1
cashew	82.0	12.8	8.1	86.2	60.7	80.2	79.3
chewinggum	99.2	70.8	75.9	82.7	94.9	94.5	97.9
fryum	92.8	25.0	18.1	79.0	77.5	81.8	87.9
macaroni1	97.3	24.1	13.3	87.3	43.5	66.7	48.0
macaroni2	96.7	9.5	3.1	86.3	52.6	66.9	51.2
pcb1	90.6	7.8	4.5	81.1	71.6	70.8	69.1
pcb2	89.0	14.2	6.3	68.6	61.0	67.4	63.0
pcb3	88.1	9.3	4.6	66.0	65.2	68.1	65.9
pcb4	94.1	19.9	13.4	81.8	94.6	89.9	94.8
pipe_fryum	96.5	35.3	24.9	94.5	92.1	90.7	96.1
mean	92.8	21.1	15.5	80.7	72.5	78.5	76.7

The training command is:

python train.py --dataset mvtec --train_data_path ./data/mvtec \
--save_path ./exps/visa/RN50x16 --config_path ./open_clip/model_configs/RN50x16.json --model RN50x16 \
--features_list 1 2 3 4 --pretrained openai --image_size 384 --batch_size 8 --aug_rate 0.2 --print_freq 1 \
--epoch 5 --save_freq 1 --learning_rate 0.0001

In our experiments, we found that training the linear layers is harder with ResNets than with ViTs, and it may require more suitable hyperparameters and training strategies. The code now supports changing the resolution, so you can continue exploring from there.

I hope this helps. :)


afvca commented on June 16, 2024

Hi! What's your intuition on why auroc_sp is much lower on image vs pixel? And what could be done to improve?


twmht commented on June 16, 2024

@ByChelsea

Do you have any reference paper for prompt learning?


ByChelsea commented on June 16, 2024

> @ByChelsea
>
> Do you have any reference paper for prompt learning?

I'm sorry, I just noticed this question. You can refer to CoOp and AnomalyCLIP.

