
places365's Introduction

Release of Places365-CNNs

We release various convolutional neural networks (CNNs) trained on Places365 to the public. Places365 is the latest subset of the Places2 Database. There are two versions of Places365: Places365-Standard and Places365-Challenge. The train set of Places365-Standard has ~1.8 million images from 365 scene categories, with at most 5,000 images per category. We have trained various baseline CNNs on Places365-Standard and released them below. The train set of Places365-Challenge adds an extra 6.2 million images to all the images of Places365-Standard (~8 million images in total), with at most 40,000 images per category. Places365-Challenge will be used for the Places2 Challenge 2016, to be held in conjunction with the ILSVRC and COCO joint workshop at ECCV 2016.

The Places365-Standard and Places365-Challenge data are released at the Places2 website.

Pre-trained CNN models on Places365-Standard:

  • AlexNet-places365: deploy weights
  • GoogLeNet-places365: deploy weights
  • VGG16-places365: deploy weights
  • VGG16-hybrid1365: deploy weights
  • ResNet152-places365 fine-tuned from ResNet152-ImageNet: deploy weights
  • ResNet152-hybrid1365: deploy weights
  • ResNet152-places365 trained from scratch using Torch: torch model, converted caffemodel: deploy weights. It is the original ResNet with 152 layers. On the validation set, the top-1 error is 45.26% and the top-5 error is 15.02%.
  • ResNet50-places365 trained from scratch using Torch: torch model. It is a preactivation ResNet with 50 layers. The top-1 error is 44.82% and the top-5 error is 14.71%.
  • To use the AlexNet and VGG16 caffemodels in Torch, use the loadcaffe library; you can load a Caffe model with the following commands. Note that the input image scale should be 0-255, which differs from the 0-1 scale used by the ResNet Torch models trained from scratch with fb.resnet.torch.
	require 'loadcaffe'
	model = loadcaffe.load('deploy_alexnet_places365.prototxt', 'alexnet_places365.caffemodel', 'cudnn')
  • PyTorch Places365 models: AlexNet, ResNet18, ResNet50, DenseNet161. The models were trained with Python 2.7 and PyTorch 0.2; see this issue if you run into format errors. You do not need to untar the PyTorch model files; refer to the placesCNN demo code to see how to load a model (a minimal loading sketch also follows the demo output below). Run the basic code to get the scene prediction from PlacesCNN:
    python run_placesCNN_basic.py

    RESULT ON http://places.csail.mit.edu/demo/12.jpg
    0.519 -> patio
    0.394 -> restaurant_patio
    0.018 -> beer_garden
    0.017 -> diner/outdoor
    0.016 -> courtyard

or run unified code to predict scene categories, indoor/outdoor type, scene attributes, and the class activation map together from PlacesCNN:

    python run_placesCNN_unified.py

    RESULT ON http://places.csail.mit.edu/demo/6.jpg
    --TYPE: indoor
    --SCENE CATEGORIES:
    0.690 -> food_court
    0.163 -> cafeteria
    0.033 -> dining_hall
    0.022 -> fastfood_restaurant
    0.016 -> restaurant
    --SCENE ATTRIBUTES:
    no horizon, enclosed area, man-made, socializing, indoor lighting, cloth, congregating, eating, working
    Class activation map is output as cam.jpg
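
A minimal loading sketch for these PyTorch weights (not the official script; the checkpoint filename, its 'state_dict' layout, and the 'module.' prefix stripping are assumptions based on how the demo code loads the models):

    import torch
    from torch.nn import functional as F
    from torchvision import models, transforms as trn
    from PIL import Image

    arch = 'resnet18'
    model_file = '%s_places365.pth.tar' % arch   # hypothetical local filename

    # Build the architecture in the current torchvision and load only the weights;
    # the checkpoints are assumed to come from a DataParallel model, hence the prefix strip.
    model = models.__dict__[arch](num_classes=365)
    checkpoint = torch.load(model_file, map_location=lambda storage, loc: storage)
    state_dict = {k.replace('module.', ''): v for k, v in checkpoint['state_dict'].items()}
    model.load_state_dict(state_dict)
    model.eval()

    # Same preprocessing as the basic demo: resize, center crop, ImageNet normalization.
    centre_crop = trn.Compose([
        trn.Resize(256),
        trn.CenterCrop(224),
        trn.ToTensor(),
        trn.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    img = Image.open('12.jpg').convert('RGB')
    probs = F.softmax(model(centre_crop(img).unsqueeze(0)), dim=1)
    top5_prob, top5_idx = probs.topk(5)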

  • Train PlacesCNN using PyTorch. The training script is here. Download the Places365-Standard easyformat split here. Untar it to a folder, then run the following:
    python train_placesCNN.py -a resnet18 /xxx/yyy/places365standard_easyformat

The category index file is this file. Here we combine the 1.2 million training images of ImageNet with Places365-Standard to train the VGG16-hybrid1365 model; its category index file is this file. The indoor/outdoor labels for the categories are in this file. The scene hierarchy is listed here, with a simple browser here.
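
A small sketch of how the category index and indoor/outdoor files can be read; the exact line formats (e.g. "/a/abbey 0" in categories_places365.txt, and a "category flag" pair per line in the indoor/outdoor file) are assumptions to verify against the downloaded files:

    # Map a predicted class index to a readable scene name.
    classes = []
    with open('categories_places365.txt') as f:
        for line in f:
            path, idx = line.strip().split(' ')
            classes.append(path[3:])      # drop the leading "/x/" prefix, keep e.g. "diner/outdoor"
    assert len(classes) == 365

    # Indoor/outdoor flag per category (assumed format: "<category> <flag>").
    io_flag = {}
    with open('IO_places365.txt') as f:   # hypothetical filename for the indoor/outdoor list
        for line in f:
            name, flag = line.strip().split(' ')
            io_flag[name] = int(flag)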

Performance of the Places365-CNNs

The performance of the baseline CNNs is listed below. ResidualNet's performance will be updated soon. We classify using the class score averaged over 10 crops of each test image. We also fine-tuned ResNet152 on Places365-Standard; with 10-crop averaging it reaches 85.08% top-5 accuracy on the validation set and 85.07% on the test set.

For comparison, we list the performance of the baseline CNNs trained on Places205 below. Although there are 160 more scene categories in Places365 than in Places205, the top-5 accuracy does not drop much.

The performance of the deep features of Places365-CNNs as generic visual features is listed below; ResidualNets' performance will be included soon. The setup of each experiment is the same as in our NIPS'14 paper.

Some qualitative prediction results using the VGG16-Places365: Prediction

Reference

Link: Places2 Database, Places1 Database

Please cite the following IEEE Transactions on Pattern Analysis and Machine Intelligence paper if you use the data or pre-trained CNN models.

 @article{zhou2017places,
   title={Places: A 10 million Image Database for Scene Recognition},
   author={Zhou, Bolei and Lapedriza, Agata and Khosla, Aditya and Oliva, Aude and Torralba, Antonio},
   journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
   year={2017},
   publisher={IEEE}
 }

Acknowledgements and License

Places dataset development has been partly supported by the National Science Foundation CISE directorate (#1016862), the McGovern Institute Neurotechnology Program (MINT), ONR MURI N000141010933, MIT Big Data Initiative at CSAIL, and Google, Xerox, Amazon and NVIDIA. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation and other funding agencies.

The pretrained Places-CNN models can be used under the Creative Commons License (Attribution CC BY). Please give appropriate credit, such as providing a link to our paper or to the Places Project Page. The copyright of all the images belongs to the image owners.


places365's Issues

CV2

Why is cv2 not imported in run_placesCNN_unified.py?

Which channel order: RGB or BGR

Hi,
I want to use your models as pre-trained weights for other tasks. However, I am not sure how you processed your input images, since I see you have:

  1. ResNet152-places365 fine-tuned from ResNet152-ImageNet
  2. ResNet152-places365 trained from scratch using Torch

The author of ResNet152-ImageNet mentioned that he used BGR, which is expected when using Caffe, so I guess the model in 1 should also use BGR? The model in 2 is not clear to me, as it was trained from scratch in Torch.

It would be nice if you could tell me directly. Thanks!

Implementation issue with model ResNet50-places365

Hi,

I downloaded the two ResNet model from you:

  1. ResNet152-places365 trained from scratch using Torch
  2. ResNet50-places365 trained from scratch using Torch

Since I use TensorFlow, I have to parse the models first (I use PyTorch's torch.utils.serialization.load_lua()).
I was confused by the structure of ResNet50, because it applies batch normalization after the element-wise addition of the main and shortcut branches.

As an example, I pasted the representation of the 2nd block of conv2_x, which corresponds to this part of your code:

layer {
	bottom: "res2a_branch1"
	bottom: "res2a_branch2c"
	top: "res2a"
	name: "res2a"
	type: "Eltwise"
}
nn.Sequential {
  [input -> (0) -> (1) -> output]
  (0): torch.legacy.nn.ConcatTable.ConcatTable {
    input
      |`-> (0): nn.Sequential {
      |      [input -> (0) -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> output]
      |      (0): nn.SpatialBatchNormalization  
      |      (1): TorchObject(cudnn.ReLU, {'output': [torch.FloatTensor with no dimension]
      |      , '_type': 'torch.FloatTensor', 'train': True, 'inplace': True, 'gradInput': [torch.FloatTensor with no dimension]
      |      })
      |      (2): TorchObject(cudnn.SpatialConvolution, {'groups': 1, '_type': 'torch.FloatTensor', 'weight': 

... (3) - (5)
      |      (6): nn.SpatialBatchNormalization
      |      (7): TorchObject(cudnn.ReLU, {'output': [torch.FloatTensor with no dimension]
      |      , '_type': 'torch.FloatTensor', 'train': True, 'inplace': True, 'gradInput': [torch.FloatTensor with no dimension]
      |      })
      |      (8): TorchObject(cudnn.SpatialConvolution, {'groups': 1, '_type': 'torch.FloatTensor', 
... 
      |      [torch.FloatTensor of size 256x64x1x1]
      |      , 'train': True, 'kW': 1, 'padW': 0, 'output': [torch.FloatTensor with no dimension]
      |      , 'dW': 1, 'nInputPlane': 64})
      |    }
      |`-> (1): nn.Identity
       +. -> output
  }
  (1): nn.CAddTable
}

But according to your code, (0): nn.SpatialBatchNormalization should come before the addition, right? After checking the ResNet152 model,
I found its implementation more consistent with the code in deploy_resnet152_places365.prototxt, as follows:

nn.Sequential {
  [input -> (0) -> (1) -> (2) -> output]
  (0): torch.legacy.nn.ConcatTable.ConcatTable {
    input
      |`-> (0): nn.Sequential {
      |      [input -> (0) -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> output]
      |      (0): TorchObject(cudnn.SpatialConvolution, {'groups': 1, '_type': 'torch.FloatTensor', 'weight': 

... (1) - (6)

      |      , 'dW': 1, 'nInputPlane': 64})
      |      (7): nn.SpatialBatchNormalization
      |    }
      |`-> (1): nn.Identity
       +. -> output
  }
  (1): nn.CAddTable
  (2): TorchObject(cudnn.ReLU, {'output': [torch.FloatTensor with no dimension]

This confirmation is important for me to reuse your model. Thanks!

Is there a labels file for validating data?

I downloaded the 'Places365 Development kit' and found that it only contains 'categories_places365.txt' and 'categories_hybrid1365.txt' as label files for the training data. Is there a labels file for the validation data?

whole_wideresnet18_places365 in Caffe

Hi guys,

this is excellent work. I really like the unified version that predicts multiple attributes. Do you also have the whole_wideresnet18_places365 model used in the unified version in Caffe? Or can you give me a hint how to convert it? Can I also use a different network for the unified prediction?

Running docker build -t places365_container . gives an error

Preprocessing options

Is it possible to describe what preprocessing was done on the data for use in Caffe please?

Different probability in demo website and run_placesCNN_unified.py for SCENE CATEGORIES

Does anyone know why the demo at http://places2.csail.mit.edu/demo.html and run_placesCNN_unified.py give different results? I tried testing with the same images mentioned in the README.md file, and the results differ a lot.

According to the README file and places demo website, the result is as follows:

RESULT ON http://places.csail.mit.edu/demo/6.jpg
--TYPE: indoor
--SCENE CATEGORIES:
0.690 -> food_court
0.163 -> cafeteria
0.033 -> dining_hall
0.022 -> fastfood_restaurant
0.016 -> restaurant
--SCENE ATTRIBUTES:
no horizon, enclosed area, man-made, socializing, indoor lighting, cloth, congregating, eating, working
Class activation map is output as cam.jpg

But when I run run_placesCNN_unified.py on my server, I get the following result:

RESULT ON http://places.csail.mit.edu/demo/6.jpg
--TYPE OF ENVIRONMENT: indoor
--SCENE CATEGORIES:
0.511 -> food_court
0.085 -> fastfood_restaurant
0.083 -> cafeteria
0.040 -> dining_hall
0.021 -> flea_market/indoor
--SCENE ATTRIBUTES:
no horizon, enclosed area, man-made, socializing, indoor lighting, cloth, congregating, eating, working
Class activation map is saved as cam.jpg

Does anyone know which model the current demo website is using?

Error when untarring the PyTorch model downloaded for DenseNet161

When I download the tar file from the website, the downloaded file cannot be untarred. How can I fix this problem? Thanks!

The models were downloaded from the links below.

PyTorch Places365 models: AlexNet, ResNet18, ResNet50, DenseNet161. The models are trained in Python2.7+PyTorch 0.2, when the models are being loaded in python3, you might encounter UnicodeDecodeError, see this issue. Run basic code to get the scene prediction from PlacesCNN:

Error in running docker build command

When I run

docker build -t places365_container .

I get this error:

---> Running in db1ece13a39a
--2017-01-31 18:04:36--  http://places2.csail.mit.edu/models_places365/alexnet_places365.caffemodel
Resolving places2.csail.mit.edu (places2.csail.mit.edu)... failed: Name or service not known.
wget: unable to resolve host address 'places2.csail.mit.edu'
INFO[0045] The command [/bin/sh -c wget http://places2.csail.mit.edu/models_places365/alexnet_places365.caffemodel] returned a non-zero code: 4 

Any advice?

Thanks!

Which mean file to use?

The repo contains this mean file: places365CNN_mean.binaryproto

But in /docker/run_scene.py this line of code seems to imply using the ILSVRC 2012 mean file.

transformer.set_mean('data', np.load(
    'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1))

Which is the right one to use?

Mean shape invalid

I'm attempting to use the mean file with the vgg16 network and the caffe Classifier class and have encountered a shape issue.
Do I even need the mean file here? According to #3 it appears maybe not.

Example:

import caffe
MODEL_PROTOTXT = 'deploy_vgg16_places365.prototxt'
MODEL_TRAINED = 'vgg16_places365.caffemodel'
MEAN_FN = 'places365CNN_mean.binaryproto'

def load_mean():
    mean_fh = open(MEAN_FN, 'rb')
    blob = caffe.proto.caffe_pb2.BlobProto()
    mean_string = mean_fh.read()
    blob.ParseFromString(mean_string)
    mean_fh.close()
    mean = caffe.io.blobproto_to_array(blob)
    return mean

mean = load_mean()
c = caffe.Classifier(MODEL_PROTOTXT, MODEL_TRAINED) # works
c = caffe.Classifier(MODEL_PROTOTXT, MODEL_TRAINED, mean=mean) # raises ValueError: Mean shape invalid
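
One workaround, assuming the blobproto decodes to a (1, 3, 256, 256) array while the network input is (3, 224, 224): collapse it to a per-channel mean, which the Classifier's transformer accepts:

    mean = load_mean()                       # shape (1, 3, 256, 256)
    channel_mean = mean[0].mean(1).mean(1)   # average over H and W -> shape (3,)
    c = caffe.Classifier(MODEL_PROTOTXT, MODEL_TRAINED,
                         mean=channel_mean,
                         channel_swap=(2, 1, 0),  # Caffe models expect BGR
                         raw_scale=255)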

Error when loading pre-trained model in pytorch 0.4

I'm using PyTorch 0.4 with Python 2.7 and had some problems when loading the pre-trained model. I changed the model download link by deleting '_python36' and downloaded the model for Python 2.7.

The log is below:
Traceback (most recent call last):
  File "run_placesCNN_basic.py", line 77, in <module>
    logit = model.forward(input_img)
  File "/home/public/anaconda2/lib/python2.7/site-packages/torchvision/models/resnet.py", line 140, in forward
    x = self.bn1(x)
  File "/home/public/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/public/anaconda2/lib/python2.7/site-packages/torch/nn/modules/batchnorm.py", line 49, in forward
    self.training or not self.track_running_stats, self.momentum, self.eps)
  File "/home/public/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 532, in __getattr__
    type(self).__name__, name))
AttributeError: 'BatchNorm2d' object has no attribute 'track_running_stats'

How can I solve this problem without downgrading PyTorch?
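
One workaround that avoids downgrading, sketched under the assumption that the checkpoint stores a 'state_dict' saved from a DataParallel model (as the basic demo assumes): build the architecture with your current torchvision and load only the weights, instead of unpickling whole module objects saved by an older PyTorch:

    import torch
    from torchvision import models

    model = models.resnet18(num_classes=365)   # fresh modules, so BatchNorm has track_running_stats
    checkpoint = torch.load('resnet18_places365.pth.tar',
                            map_location=lambda storage, loc: storage)
    state_dict = {k.replace('module.', ''): v for k, v in checkpoint['state_dict'].items()}
    model.load_state_dict(state_dict)
    model.eval()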

TypeError: unsupported operand type(s) for /: 'tuple' and 'int'

Why does running run_placesCNN_unified.py show this error?

Traceback (most recent call last):
  File "/home/xyx/Downloads/places365 new/run_placesCNN_unified.py", line 132, in <module>
    input_img = V(tf(img).unsqueeze(0), volatile=True)
  File "/home/xyx/anaconda2/lib/python2.7/site-packages/torchvision/transforms.py", line 29, in __call__
    img = t(img)
  File "/home/xyx/anaconda2/lib/python2.7/site-packages/torchvision/transforms.py", line 139, in __call__
    ow = int(self.size * w / h)
TypeError: unsupported operand type(s) for /: 'tuple' and 'int'
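
The error suggests an older torchvision whose Scale transform only accepts an integer (the shorter-edge size), not a (h, w) tuple. A hedged fix is either to keep Scale with an int plus a center crop (an approximation of the unified script's squash to 224x224), or to upgrade torchvision and use Resize, which accepts a tuple:

    from torchvision import transforms as trn

    # Older torchvision: Scale takes an int, so crop to the final size afterwards.
    tf = trn.Compose([
        trn.Scale(256),
        trn.CenterCrop(224),
        trn.ToTensor(),
        trn.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])

    # Newer torchvision: trn.Resize((224, 224)) accepts the tuple directly and
    # reproduces the original squashing behavior.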

No gpu used

I run" CUDA_VISIBLE_DEVICES=0 python run_placesCNN_basic.py" on my server .

but I found the program runs on cpu.

How can I use GPU to do test?
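
A hedged sketch of forcing GPU execution (input_img and model are the variable names used in the basic script; on very old PyTorch replace the no_grad context with volatile Variables):

    import torch

    if torch.cuda.is_available():
        model = model.cuda()
        input_img = input_img.cuda()

    with torch.no_grad():
        logit = model(input_img)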

train_val.prototxt ?

Hi, is it possible to release the train_val.prototxt / solver.prototxt that you used to train your networks? I could not find them.

Thanks!

parameters used to train alexnet

Hi, would it be possible for you to post the parameters used for training AlexNet/CaffeNet? I'm trying to fine-tune my own network to reach your accuracy, but so far I have only reached 0.47 on the one-crop validation set.
Thank you!

./train_places_cnn.py: line 23: syntax error near unexpected token `('

I'm trying to run the train places py code provided and after a whole weekend of waiting, I got this error:

./train_places_cnn.py: line 23: syntax error near unexpected token `('
./train_places_cnn.py: line 23: `model_names = sorted(name for name in models.__dict__'

I have not modified train_places_cnn.py.

I was hoping to get a snapshot.caffemodel file generated so I can plug this model into the nvidia GRE. I have had success testing with the Places205 model, but I was not able to locate a Places365 snapshot file on GitHub or a direct route to downloading it.
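
That error pattern usually means the shell, not Python, executed the script (e.g. no Python shebang line), so bash chokes on the parenthesis at line 23. Invoking it through the interpreter, as in the README, avoids it:

    python ./train_places_cnn.py -a resnet18 /xxx/yyy/places365standard_easyformat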

Runtime Error while in pytorch 0.4 with alexnet model

Traceback (most recent call last):                                                                                
File "train.py", line 155, in <module>                                                                          
main()                                                                                                        
File "train.py", line 150, in main                                                                              
trainer.train()                                                                                               
File "C:\myFile\code\image_scene_classification\CH\model\trainer.py", line 314, in train                        
self.train_epoch()                                                                                            
File "C:\myFile\code\image_scene_classification\CH\model\trainer.py", line 228, in train_epoch                  
self.validate()                                                                                               
File "C:\myFile\code\image_scene_classification\CH\model\trainer.py", line 123, in validate                     
output = self.model(inputs)                                                                                   
File "C:\Users\chmtt\Anaconda3\envs\SCENE\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__  
result = self.forward(*input, **kwargs)                                                                       
File "C:\Users\chmtt\Anaconda3\envs\SCENE\lib\site-packages\torchvision\models\alexnet.py", line 43, in forward 
x = self.features(x)                                                                                          
File "C:\Users\chmtt\Anaconda3\envs\SCENE\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__  
result = self.forward(*input, **kwargs)                                                                       
File "C:\Users\chmtt\Anaconda3\envs\SCENE\lib\site-packages\torch\nn\modules\container.py", line 91, in forward 
input = module(input)                                                                                         
File "C:\Users\chmtt\Anaconda3\envs\SCENE\lib\site-packages\torch\nn\modules\module.py", line 491, in __call__  
result = self.forward(*input, **kwargs)                                                                       
File "C:\Users\chmtt\Anaconda3\envs\SCENE\lib\site-packages\torch\nn\modules\conv.py", line 301, in forward     
self.padding, self.dilation, self.groups)                                                                     
RuntimeError: thnn_conv2d_forward is not implemented for type torch.ByteTensor                                    

and my code is:

# this gets the file path
model_path = load_model_from_name('alexnet')

model = torch.load(model_path)

The output is obtained when calling model(inputs).
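
The RuntimeError says the input batch is still a ByteTensor (uint8). A hedged fix is to make sure the dataset pipeline includes transforms.ToTensor() (which yields float tensors in [0, 1]) or to convert explicitly before the forward pass:

    # inputs comes from your data loader; convert uint8 images to float and rescale.
    inputs = inputs.float() / 255.0
    # (The pretrained model also expects ImageNet mean/std normalization.)
    output = self.model(inputs)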

Single-crop validation performance

Would it be possible to release the single-crop validation performance? This would make it easier to verify models that have been converted from caffe [without writing a complicated data loader].

Thanks!

Clarify license

There is currently no indication of the license for the contents of this repository. Creative commons perhaps?

Pretrained Model Output Permuted

It appears the pre-trained pytorch model consistently predicts a label different from what it should. For example, instead of predicting 2, the model predicts 10; instead of predicting 3, it predicts 100.

Here is some code to reproduce this issue.

import argparse
import torch
import torch.backends.cudnn as cudnn
import torch.nn.functional as F
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torchvision.models as models
from torch.autograd import Variable
import numpy as np

parser = argparse.ArgumentParser(description='Demo of bug',
                                 formatter_class=argparse.ArgumentDefaultsHelpFormatter)
# Optimization options
parser.add_argument('--batch_size', '-b', type=int, default=100, help='Batch size.')
parser.add_argument('--test_bs', type=int, default=100)
# Acceleration
parser.add_argument('--ngpu', type=int, default=1, help='0 = CPU.')
parser.add_argument('--prefetch', type=int, default=5, help='Pre-fetching threads.')
args = parser.parse_args()

mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
transform = transforms.Compose(
    [transforms.Scale(256), transforms.CenterCrop(224), transforms.ToTensor(), transforms.Normalize(mean, std)])

test_data = dset.ImageFolder(root="/share/data/lang/users/dan/datasets/places365/val", transform=transform)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=args.test_bs, shuffle=False,
                                          num_workers=args.prefetch, pin_memory=True)

net = models.resnet50(num_classes=365)
checkpoint = torch.load('/share/data/lang/users/dan/.torch/models/resnet50_places365.pth')
state_dict = {str.replace(k,'module.',''): v for k,v in checkpoint['state_dict'].items()}
net.load_state_dict(state_dict)
net.eval()

for p in net.parameters():
    p.volatile = True

if args.ngpu > 1:
    net = torch.nn.DataParallel(net, device_ids=list(range(args.ngpu)))
if args.ngpu > 0:
    net.cuda()

np.random.seed(1)
torch.manual_seed(1)
if args.ngpu > 0:
    torch.cuda.manual_seed(1)

cudnn.benchmark = True  # fire on all cylinders

to_np = lambda x: x.data.cpu().numpy()

for batch_idx, (data, target) in enumerate(test_loader):
    data = Variable(data.cuda(), volatile=True)

    output = net(data)
    smax = to_np(F.softmax(output))
    # batch_idx = target since batch size = 100, the size of each val folder
    print(np.argmax(smax, axis=1), batch_idx)

Classes 0 and 1 are correctly predicted. After that, 2 -> 10, 3 -> 100, 4 -> 101, 5 -> 102, 6 -> 103, ...

[  0   0   0 170 174   0  18 174 293 170   0 170 293   0 293 174   0   0
 174   0   0   0   0 174 174 293   0   0 169 170   0 293   0   0 293 293
 186   0 174   0 174   0   0 207 293   0 293   0 192   0   0   0   0   0
   0   0   0   0   0 174   0   0   0   0   0 140   0   0   0   0 293 293
   0   0   0   0 293   0 174 293 174 174 293   0 293   0   0   0 293 140
   0 293   0 174   0 293 293 293 174   0] 0
[ 38   1   1 346   1   1  58   1   1   1   1  98   1   1  98   1   1   1
   1   1   1   1  27   1   1   1   1   1   1   1   1   1   1   1   1   1
   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
   1   1 336   1   1   1   1   1 336   1   1   1   1   1   1   1  98   1
   1   1   1   1   1  55   1  98   1   1   1   1   1   1   1   1   1   2
 135  55   1   1   1   1 298   1   1  82] 1
[ 10 113  10  10 278  84  84 109 347  10 347 347 113  66 218  10  10  10
  10  10 347 218  10  10  10  10 347  10 292 229  10  10 347  10  78 347
 347  10  10  10  66  10  10  10 347  10 349  12  10  10  10  10  10  10
  10 283  10  66  10  10  10 218  10  81  10  10  66 347 347  33  10  10
 292  10 292  10 347 307 127  10  10 347 347  66  10 347  10  10  10  10
  10  85 273  10 347  10  10  10  10  10] 2
[100 238 100 176 176 100 246 100 100 176  56 100 100 235 176 244 182 100
 298 100 264 176 176 244 100 177 264 100 210 246 100 176 210 246 100 100
 244 246 100 212 100 244 102 100 100  63 244 238 244 244 202 100 100 100
 176 246 211 177 100 100 100 177 176 246 100  54 176 100 100 100 100 100
 100 246 244  33 100 238 100 100  19 100 176  20 176 100 100 246 244 246
 100 100 176 100 100 100 100 244 100 100] 3
[ 38 210 101 177 210 101 101 102 210 280 210  16 101 102  27  38 101 211
 101 101 101  27 101 210 101 211 211 101  92 101 101 101  27 102 211  38
 101 101 101 101 101 211 101 101 101 210  27 241 102 211  38 102 101 235
 101 101 211 211 102 211 235 101 211 101 101 210 101 211  27 211 101  18
 101 211 101 244 101  27 101 211 101 101  27  38 101 101 101 101 210 101
 210 211  27 101 211 235  38 101 101 101] 4
[102 102 102 102 101  38 211 102 102 211 102 121 211  38 102 102 102 102
 102 219 102 211 102 246  27 121 102 102  27 121 102 102 102  38 102 244
 102 101 102 102 244 120 210 102 102 102 102 211 100 102 102 102 235 212
 101 102  20 210 102 102 102 128 102 101 102 102 102 101 102 102 102 102
 102 211 102 102 210 102 210 101 211 102 102 102 102 102 102 102 102 102
 210 121 102 102 102 248 244  38 101 102] 5
[103  13 308 192 216 218 192  66 103 136 206 136 226 103 103 294 307 245
 255 103 125 103 103 192   8 216 308 216 125 103 143 165 103 340 256 103
  76 127 125 256 339 103 103 247 103 103 127 103 103 136 103 103 103 103
 103 143  13 103 103   7 103 171 103 338 307 103 136 103 103 103 131 103
 340 103 318 127 228 103  41 103  13  71 103 103 192 131 103 143  19 103
 331 166 339  43 103   5 221 223 103 169] 6

With these commands I got the validation data suitable for pytorch's ImageFolder.

import os
for i in range(365):
    os.mkdir('./val/' + str(i))
with open('places365_val.txt') as f:
    for line in f:
        line = line.split()
        os.rename('./val/images/' + line[0], "./val/"+str(line[1])+'/'+line[0])
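
The pattern 2 -> 10, 3 -> 100 matches lexicographic ordering: ImageFolder sorts class-folder names as strings, so '10' and '100' come before '2'. Two hedged fixes, assuming the folder layout created above:

    import os

    # Option 1: zero-pad folder names so string order equals numeric order
    # (also zero-pad str(line[1]) in the os.rename target accordingly).
    for i in range(365):
        os.mkdir('./val/' + str(i).zfill(3))

    # Option 2: keep folders as-is and map ImageFolder's target indices back to the
    # numeric folder names before comparing them with the model's predictions.
    idx_to_label = {v: int(k) for k, v in test_data.class_to_idx.items()}
    # numeric_target = idx_to_label[int(target[i])]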

"SCENE ATTRIBUTES" always be same when I try to input more than one pictures with model loaded once

I modify the code in run_placesCNN_unified.py to input more than one pictures and just need to load the model once, I write a function which input is picture_path, and output is the results.But is seems that SCENE ATTRIBUTES always the same, cause the features_blobs is used in load_model(), so it doesn't modify when the picture is varying.Does any body else know how to fix this and don't need to load the model every time?
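
A hedged sketch of one fix: features_blobs in run_placesCNN_unified.py is a module-level list that the registered forward hook appends to, so empty it in place at the start of each per-image call rather than reloading the model (features_blobs, tf, and V below are the names from that script):

    from PIL import Image

    def predict(img_path, model):
        del features_blobs[:]                      # clear hook outputs from the previous image
        img = Image.open(img_path).convert('RGB')
        logit = model.forward(V(tf(img).unsqueeze(0)))
        # scene attributes and the CAM can now be computed from the fresh features_blobs
        return logit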

UnicodeDecodeError

Hi all! Running run_placesCNN_basic.py in this repository I run into this following error:
Traceback (most recent call last):
File "", line 4, in
File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 231, in load
return _load(f, map_location, pickle_module)
File "/usr/local/lib/python3.5/dist-packages/torch/serialization.py", line 379, in _load
result = unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 875: ordinal not in range(128)

Any advice? Thanks in advance!
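
This is the Python 2 vs Python 3 pickle issue mentioned in the README. Two commonly suggested workarounds, both assumptions to verify for your setup: download the '_python36' variants of the weight files, or force latin1 decoding when loading:

    import pickle
    from functools import partial
    import torch

    # Make torch.load unpickle Python 2 checkpoints under Python 3.
    pickle.load = partial(pickle.load, encoding='latin1')
    pickle.Unpickler = partial(pickle.Unpickler, encoding='latin1')
    model = torch.load(model_file, map_location=lambda storage, loc: storage,
                       pickle_module=pickle)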

_pickle.UnpicklingError: invalid load key, '<'

Hi all!
while running run_placesCNN_basic.py, this error comes up
Traceback (most recent call last):
File "run_placesCNN_basic.py", line 31, in
checkpoint = torch.load(model_file, map_location=lambda storage, loc: storage)
File "/home/pytorch_py35/lib/python3.5/site-packages/torch/serialization.py", line 267, in load
return _load(f, map_location, pickle_module)
File "/home/pytorch_py35/lib/python3.5/site-packages/torch/serialization.py", line 412, in _load
magic_number = pickle_module.load(f)
_pickle.UnpicklingError: invalid load key, '<'.
Any advice? Thanks!

Loading DenseNet161 model does not work

I have tried the run_placesCNN_basic.py script with the densenet161 architecture. It works well when I choose a ResNet architecture. I have attached the error log.

Do you have any suggestions why this particular model fails to load?

Thank you

Inquiry about 'wideresnet18' model used in 'run_placesCNN_unified.py'

Hello,

For indoor scene classification, I've run the 'run_placesCNN_unified.py' code with the 'wideresnet18' model you attached.

However, I do not know the difference between 'wideresnet18_places365' and 'resnet18_places365' except for the max-pooling layer before the convolution layer, so I do not know whether it is really a wide residual model.

I would like to know why you used the 'wideresnet18' model in 'run_placesCNN_unified.py' and named it 'wideresnet'.

I would be grateful if you could answer my question.
Thanks.

whole_wideresnet18_places365 in Caffe

Hi, I am sorry for reopening this issue, but I did not get a response.

Hi guys,

this is excellent work. I really like the unified version that predicts multiple attributes. Do you also have the whole_wideresnet18_places365 model used in the unified version in Caffe? Or can you give me a hint how to convert it? Can I also use a different network for the unified prediction?

Torch version of Caffe models

Could you provide the Torch versions of the following models, please? I could not load and convert them. Thanks.

AlexNet-places365
GoogLeNet-places365
VGG16-places365
VGG16-hybrid1365
ResNet152-hybrid1365

could not find places365CNN_mean.binaryproto

Hi,
To be able to use the vgg16_places365.caffemodel, I need the corresponding mean data file (used for pre-processing normalization). Where can I find the places365CNN_mean.binaryproto file?

Regards,
Abhishek

Issue with Python3

Hi,
When I run run_placesCNN_basic.py in Python 3 [with PyTorch 0.2], the following error occurs:
$ python3 run_placesCNN_basic.py

Traceback (most recent call last):
  File "run_placesCNN_basic.py", line 26, in <module>
    model = torch.load(model_weight, map_location=lambda storage, loc: storage) # model trained in GPU could be deployed in CPU machine like this!
  File "/home/karami/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 231, in load
    return _load(f, map_location, pickle_module)
  File "/home/karami/anaconda3/lib/python3.6/site-packages/torch/serialization.py", line 379, in _load
    result = unpickler.load()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 875: ordinal not in range(128)

Would you please help me fix this issue?

Error in googlenet caffe proto

Hello!
I use Python 2.7 and a compiled distribution of Caffe.
I try to run:
net = caffe.Net('places_model/deploy_googlenet_places365.prototxt', caffe.TEST)
But I get error:

[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 7:1: Expected identifier. F0912 12:11:03.998572 19221 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: places_model/deploy_googlenet_places365.prototxt
*** Check failure stack trace: ***

I opened the proto file and did not find any syntax error.
What can I do to fix this? Thank you.

Image preprocessing

Can someone familiar with how the models were trained verify that the preprocessing in run_placesCNN_basic.py and run_placesCNN_unified.py are both correct?

In run_placesCNN_basic.py it's:

centre_crop = trn.Compose([
        trn.Scale(256),
        trn.CenterCrop(224),
        trn.ToTensor(),
        trn.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

That is, rescaling the smaller edge of the image to 256 and then taking the center 224 pix?

In run_placesCNN_unified.py, it's:

    tf = trn.Compose([
        trn.Scale((224,224)),
        trn.ToTensor(),
        trn.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])

That is, squashing the image to 224x224. Presumably the appropriate preprocessing is the one closest to what was used for training.

Thanks

AttributeError: 'module' object has no attribute 'constant_'

Hello, everyone! Running run_placesCNN_unified.py in this repository, I run into the following error:

Traceback (most recent call last):
  File "run_placesCNN_unified.py", line 124, in <module>
    model = load_model()
  File "run_placesCNN_unified.py", line 94, in load_model
    model = wideresnet.resnet18(num_classes=365)
  File "/home/user/zou/code/places365/wideresnet.py", line 164, in resnet18
    model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
  File "/home/user/zou/code/places365/wideresnet.py", line 120, in __init__
    nn.init.constant_(m.weight, 1)
AttributeError: 'module' object has no attribute 'constant_'

Any advice about that? Thanks in advance!
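
nn.init.constant_ (with the trailing underscore) only exists in newer PyTorch releases; wideresnet.py appears to assume one of those. A hedged compatibility tweak for the init loop:

    import torch.nn as nn

    # m is the module being initialized in wideresnet.py's loop.
    if hasattr(nn.init, 'constant_'):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)
    else:
        nn.init.constant(m.weight, 1)
        nn.init.constant(m.bias, 0)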

labels.pkl file for the hybrid model i.e. with 1365 categories

Hi, can you share the labels.pkl file for the hybrid model with 1365 categories? I want to build a Docker container with the ResNet hybrid model in it, but I didn't find a labels.pkl file for it. It would be very helpful if you could share the file.

Pytorch models unarchive failed

I downloaded the PyTorch pretrained models but failed to unarchive the '.tar' files, and I have tried every method I can think of.
It seems that there is something wrong with the '.tar' files.

Indoor/Outdoor label

For places205 a mapping was provided from the canonical label to indoor/outdoor.

I was wondering if that also exists for places365, as I can't seem to find it.

issue with black&white images

Hi all! I am dealing with a very annoying issue and, as I am new to Python, I don't know how to get past it.
In short, I am running places365 over many images, and it seems it cannot handle black-and-white images, since I get the following error:
RuntimeError: Need input of dimension 4 and input.size[1] == 3 but got input to be of shape: [1 x 1 x 224 x 224] at /pytorch/torch/lib/THNN/generic/SpatialConvolutionMM.c:47
Since I would not want to delete all the black-and-white images, could someone suggest a few lines of code that would solve the problem?
Thanks!
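
A hedged fix: force three channels before the transform so single-channel (grayscale) images match the 3-channel input the network expects (img_name, centre_crop, and V are the names used in run_placesCNN_basic.py):

    from PIL import Image

    img = Image.open(img_name).convert('RGB')   # grayscale -> 3 identical channels
    input_img = V(centre_crop(img).unsqueeze(0))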

Torch Models Inputs

Hi, what is the input range for the provided Torch models: 0-1 or 0-255?
