Comments (5)

ByChelsea commented on June 16, 2024

Hi! What's your intuition on why auroc_sp is much lower on image vs pixel? And what could be done to improve?

Hi! In the zero-shot setting, the anomaly score used for classification is the similarity between the class token and the text features. Classification accuracy is therefore largely independent of segmentation accuracy: it depends mainly on the pre-trained model and the prompts. Designing good prompts by hand is quite challenging, so you could try prompt learning to improve classification performance.
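As a rough illustration of the scoring described above, here is a minimal sketch of an image-level score computed from a class token and two text features. The function names, the two-prompt ("normal" / "abnormal") setup, and the temperature of 0.01 are assumptions made for this example, not the repository's actual code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_anomaly_score(class_token, text_normal, text_abnormal, temperature=0.01):
    """Return the softmax probability assigned to the 'abnormal' text feature.

    All names and the temperature value are hypothetical; they only
    illustrate similarity-based zero-shot classification.
    """
    image = l2_normalize(class_token)
    logits = []
    for text in (text_normal, text_abnormal):
        text = l2_normalize(text)
        cosine = sum(a * b for a, b in zip(image, text))
        logits.append(cosine / temperature)
    # Numerically stable softmax over the two logits.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    return exps[1] / sum(exps)
```

Because the score depends only on the class token and the text features, changing the prompts changes the classification result without touching the anomaly maps.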

Besides, since the anomaly maps are already quite accurate, you can also take their maximum value as an anomaly score. Adding the two scores together can further improve performance.
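The fusion suggested above could be sketched as follows. This assumes the anomaly map is a 2D grid of per-pixel scores on roughly the same scale as the class-token score (for example, both in [0, 1]); the function name and the plain unweighted sum are illustrative choices, not the repository's implementation.

```python
def fused_image_score(text_score, anomaly_map):
    """Add the class-token score to the largest pixel score in the anomaly map.

    `text_score` is the similarity-based image-level score and `anomaly_map`
    is a list of rows of per-pixel scores (hypothetical input layout).
    """
    map_max = max(max(row) for row in anomaly_map)
    return text_score + map_max


# Example: a weak class-token score is lifted by a strong local response.
score = fused_image_score(0.3, [[0.1, 0.2], [0.9, 0.4]])
```

Since AUROC depends only on the ranking of scores, the sum does not need to be rescaled back to [0, 1] for evaluation.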

from vand-april-gan.

ByChelsea commented on June 16, 2024

Hi, I'm sorry for replying late.

In our approach, we focused on exploring the CLIP model based on ViTs without conducting extensive experiments on the model based on ResNets. I can provide you with a reference result obtained by training the RN50x16 model for 5 epochs at a resolution of 384 (the standard resolution for this model).

objects     auroc_px  f1_px  ap_px  aupro  auroc_sp  f1_sp  ap_sp
candle          97.5   16.9    9.9   87.1      94.4   87.5   93.9
capsules        90.4    8.0    3.6   67.9      62.2   77.2   73.1
cashew          82.0   12.8    8.1   86.2      60.7   80.2   79.3
chewinggum      99.2   70.8   75.9   82.7      94.9   94.5   97.9
fryum           92.8   25.0   18.1   79.0      77.5   81.8   87.9
macaroni1       97.3   24.1   13.3   87.3      43.5   66.7   48.0
macaroni2       96.7    9.5    3.1   86.3      52.6   66.9   51.2
pcb1            90.6    7.8    4.5   81.1      71.6   70.8   69.1
pcb2            89.0   14.2    6.3   68.6      61.0   67.4   63.0
pcb3            88.1    9.3    4.6   66.0      65.2   68.1   65.9
pcb4            94.1   19.9   13.4   81.8      94.6   89.9   94.8
pipe_fryum      96.5   35.3   24.9   94.5      92.1   90.7   96.1
mean            92.8   21.1   15.5   80.7      72.5   78.5   76.7

The training command is:

python train.py --dataset mvtec --train_data_path ./data/mvtec \
--save_path ./exps/visa/RN50x16 --config_path ./open_clip/model_configs/RN50x16.json --model RN50x16 \
--features_list 1 2 3 4 --pretrained openai --image_size 384 --batch_size 8 --aug_rate 0.2 --print_freq 1 \
--epoch 5 --save_freq 1 --learning_rate 0.0001

In our experiments, training the linear layers on top of ResNets proved harder than with ViTs, and it may require better-suited hyperparameters and training strategies. The code now supports changing the resolution, so you can continue your exploration with it.

Hope that my answer can be helpful to you. :)

afvca commented on June 16, 2024

Hi! What's your intuition on why auroc_sp is much lower on image vs pixel? And what could be done to improve?

twmht commented on June 16, 2024

@ByChelsea

Do you have any reference paper for prompt learning?

ByChelsea commented on June 16, 2024

@ByChelsea

Do you have any reference paper for prompt learning?

I'm sorry, I just noticed this question. You can refer to CoOp and AnomalyCLIP.
