
caffe-with-spearmint's People

Contributors

kashefy, kuz, m0003r

caffe-with-spearmint's Issues

Interface for GPU

If I understand the code correctly, it currently only runs on the CPU. It would be nice (and, for many setups, necessary) for run.py to accept an optional parameter that specifies the GPU id and to run training on that GPU.
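
A minimal sketch of what such an option could look like, assuming run.py assembles the 'caffe train' command line itself; the argument names and surrounding code are illustrative only, not the current implementation:

    # Illustrative sketch only -- not the current run.py. It assumes the wrapper
    # builds the 'caffe train' command line and simply forwards an optional GPU id.
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--experiment', default='experiments/mnist')  # placeholder default
    parser.add_argument('--optimize', default='accuracy')
    parser.add_argument('--gpu', type=int, default=None,
                        help='GPU id to pass to caffe; runs on the CPU when omitted')
    args = parser.parse_args()

    cmd = ['caffe', 'train', '-solver', 'solver.prototxt']  # solver path is a placeholder
    if args.gpu is not None:
        cmd += ['-gpu', str(args.gpu)]  # the stock 'caffe train' binary accepts a -gpu flag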

Fine-tuning

Hi @kuz,

I tried caffe-with-spearmint today and it seems really useful. Thanks for this work!!

I would like to use your code to optimize parameters when fine-tuning a network rather than training from scratch. How can I do this? Basically I just need to do two things:

  1. call the 'caffe train' API, passing the weights of the pre-trained net to start from (see the sketch after this list)
  2. change the name of the layers that are learned from scratch (e.g. fc8 to fc8_newtask)
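
For step 1, this is how it is usually done with the stock Caffe CLI; the paths and the pre-trained model name below are placeholders, not files from this repository:

    # Hedged sketch: standard Caffe fine-tuning invocation, wrapped in Python.
    # Paths and the .caffemodel name are placeholders.
    import subprocess

    subprocess.check_call([
        'caffe', 'train',
        '-solver', 'experiments/mytask/solver.prototxt',    # hypothetical experiment path
        '-weights', 'bvlc_reference_caffenet.caffemodel',    # pre-trained weights to start from
    ])

For step 2, renaming a layer in trainval.prototxt is enough for Caffe to skip copying its weights from the pre-trained model and to re-initialize it from its weight_filler.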

I think that if you could provide this feature, or give some hints about how to do it, many people would appreciate it.

Thanks again!!!!!
Giulia Pasquale

Use HDF5 data instead of LMDB

Hi @kuz,

Is there a way to use HDF5 data instead of LMDB? Cant see any explicit checks for the LMDB, other than the presence of the folders. Would it be enough to just put the HDF5 files in the train_lmdb and val_lmdb folders and mention the type as "HDF5Data" in the net prototxt?
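
For reference, this is what Caffe's HDF5Data layer expects on disk (standard Caffe behaviour, not anything cwsm-specific; whether cwsm rewrites the data layer is exactly what the question above leaves open). File and dataset names below are placeholders:

    # Hedged sketch: the HDF5Data layer reads a plain-text file listing one .h5
    # file per line; the dataset names ('data', 'label') must match the layer's
    # top blobs. The folder name mirrors the train_lmdb convention in the question.
    import os
    import h5py
    import numpy as np

    if not os.path.isdir('train_lmdb'):
        os.makedirs('train_lmdb')

    X = np.random.rand(100, 3, 32, 32).astype(np.float32)   # placeholder images
    y = np.random.randint(0, 10, size=100).astype(np.float32)

    with h5py.File('train_lmdb/train.h5', 'w') as f:
        f.create_dataset('data', data=X)
        f.create_dataset('label', data=y)

    with open('train_lmdb/train_h5_list.txt', 'w') as f:
        f.write('train_lmdb/train.h5\n')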

Thanks!

Using with googlenet

I am trying to use caffe-with-spearmint with GoogLeNet. GoogLeNet contains three accuracy layers (one for each of its classifiers: the two auxiliary ones and the final one), but I'm trying to optimize on loss. I am running this:

python run.py --experiment experiments/protein --optimize loss --optimizewrt best >& run0.txt
but I get the error:

smconfig.parse_in(solver)

  File "/net/kihara/cbelth/caffe-with-spearmint/cwsm/spearmint.py", line 52, in parse_in
    param = json.loads(param)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 27 (char 26)

I attached the trainval.prototxt for my model (changed to .txt). I changed the name of the layer at line 951 to "loss".
trainval.txt
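
For what it's worth, the failing call is json.loads on the text extracted from an OPTIMIZE{...} directive, so the usual culprit is an OPTIMIZE block that is not strict JSON (single quotes, an unquoted key, or a missing comma). A minimal reproduction of the same class of error, purely for illustration:

    import json

    # Valid form used in the cwsm examples: keys and string values double-quoted,
    # fields separated by commas.
    json.loads('{"type": "FLOAT", "min": 0.005, "max": 0.05}')

    # A missing comma (or single quotes) raises the same kind of error as in the
    # traceback above; checking every OPTIMIZE{...} block in trainval.prototxt
    # and solver.prototxt for strict JSON syntax is the first thing to try.
    try:
        json.loads('{"type": "FLOAT" "min": 0.005}')   # comma omitted on purpose
    except ValueError as e:
        print(e)   # e.g. "Expecting , delimiter: ..."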

Limit running time / jobs proposed

To make caffe-with-spearmint a fully automated pipeline of hyper-parameter selection, training and testing, it seems necessary to be able to limit the running time or the number of jobs proposed by spearmint. Is this possible within the current framework?
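
In case there is no built-in option, a coarse wrapper-level workaround is sketched below: run run.py in a child process and terminate it once a wall-clock budget is spent (Spearmint jobs already submitted would still finish on their own). The paths and arguments are taken from the examples in these issues and may need adjusting:

    # Hedged workaround sketch (not a feature of cwsm itself): bound the overall
    # wall-clock time of the optimization from outside.
    import subprocess
    import time

    BUDGET_SECONDS = 6 * 3600   # assumed budget, adjust as needed

    proc = subprocess.Popen(
        ['python', 'run.py', '--experiment', 'experiments/mnist',
         '--optimize', 'accuracy'])
    start = time.time()
    while proc.poll() is None:
        if time.time() - start > BUDGET_SECONDS:
            proc.terminate()   # jobs already handed to spearmint may keep running
            break
        time.sleep(60)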

Output not changing and GPU not being used

I specified GPU mode in my solver.prototxt, but the GPU is not being used. Also, I am getting the same output over and over again, and I'm not sure how to interpret it. I'm only tuning the learning rate and weight decay for now, just to see how it works. Here is my output:

Submitted job 14 with local scheduler (process id: 27649).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 15 with local scheduler (process id: 27691).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 16 with local scheduler (process id: 27724).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 17 with local scheduler (process id: 27766).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 18 with local scheduler (process id: 27799).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 19 with local scheduler (process id: 27841).
Status: 1 pending, 0 complete.

output statements cut off

When I run this optimizer with multiple kernel sizes, all the output suggestions in the terminal are cut off so that they only read "kernel_size_".

Is there a file I can edit or access that will tell me which kernel size is which?
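
One hedged workaround, assuming the generated prototxt files end up in tmp/ as described in the OPTIMIZE_std issue below: read the full values back out of those files instead of the truncated console table.

    # Hedged sketch: recover the kernel sizes from the generated prototxt files.
    # The tmp/ layout is taken from the OPTIMIZE_std issue below; adjust the
    # glob pattern if your experiment differs.
    import glob
    import re

    for path in sorted(glob.glob('tmp/*_trainval.prototxt')):
        with open(path) as f:
            text = f.read()
        sizes = re.findall(r'kernel_size:\s*(\d+)', text)
        print(path, sizes)   # one entry per layer, in file order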

caffe snapshot prefix in caffeout directory

Hi,
I am able to run the experiments/mnist demo here, but when it finished I was confused about the prefix of the Caffe snapshots located in the caffeout directory. For example, what does "2015-06-08-09-08-41" mean?
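
Judging purely by its format (an inference, not confirmed from the cwsm source), the prefix looks like the launch timestamp of the corresponding training job:

    # Inference from the format only: "2015-06-08-09-08-41" matches a
    # year-month-day-hour-minute-second timestamp.
    from datetime import datetime

    prefix = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
    print(prefix)   # e.g. '2015-06-08-09-08-41'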

Intermediate states ignored?

It seems that the current code ignores intermediate states in each run (i.e. for each parameter setting) and only takes the accuracy/loss at the last iteration as the metric for that run. But in practice, due to overfitting and other issues, the model trained to the last iteration is usually not the best one to use. Why not take the best accuracy/loss within each run as its metric, and compare different runs on that?
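
As an illustration of the proposal (not existing cwsm behaviour), the metric for a run could be the best test accuracy reported anywhere in its Caffe log; the regex below assumes the standard Caffe log line format:

    # Hedged sketch: score a run by its best intermediate test accuracy rather
    # than the value at the final iteration. Assumes the standard Caffe log line
    # "Test net output #0: accuracy = 0.97".
    import re

    def best_accuracy(log_path):
        best = None
        pattern = re.compile(r'Test net output #\d+: accuracy = ([0-9.]+)')
        with open(log_path) as f:
            for line in f:
                m = pattern.search(line)
                if m:
                    value = float(m.group(1))
                    if best is None or value > best:
                        best = value
        return best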

Multiple variables are getting mapped to OPTIMIZE_std_1? (instead of _2, _3, ...)

I've successfully run the caffe-with-spearmint MNIST example, and now I'm moving toward using caffe-with-spearmint for my own application. Right now, I'm interested in optimizing the standard deviation (std) of the weight initialization in a Network-in-Network.

Let me walk you through where things are going wrong...

  1. I set up my model/trainval.prototxt with two std fields filled in like this:
    std: OPTIMIZE{"type":"FLOAT", "min": 0.005, "max": 0.05}

  2. In tmp/template_trainval.prototxt, these both get mapped to OPTIMIZE_std_1:

layer {name: "conv1"
     convolution_param {weight_filler{ std: OPTIMIZE_std_1 } }
}
layer {name: "cccp1"
     convolution_param {weight_filler{ std: OPTIMIZE_std_1 } } ##(should be OPTIMIZE_std_2)
}

This is odd, because in the MNIST example, I see that optimizing kernel_size in multiple layers maps to OPTIMIZE_kernel_size_1, OPTIMIZE_kernel_size_2, and OPTIMIZE_kernel_size_3.

  3. As a result of having two fields that are both OPTIMIZE_std_1, tmp/<datetime>_trainval.prototxt gets filled in with an optimized value for conv1, but still contains OPTIMIZE_std_1 for the next convolutional layer:
layer {name: "conv1"
     convolution_param {weight_filler{ std: 0.005 } }
}
layer {name: "cccp1"
     convolution_param {weight_filler{ std: OPTIMIZE_std_1 } }
}
  4. Caffe naturally doesn't understand what to do with std: OPTIMIZE_std_1, so it fails to run training.

Any ideas why caffe-with-spearmint isn't filling in OPTIMIZE_std_2 in the 2nd convolutional layer's std field?

I put samples of all 3 files (models/trainval.prototxt, tmp/template_trainval.prototxt, and tmp/<datetime>_trainval.prototxt) in a gist here:
https://gist.github.com/forresti/138d8bb79457a1146607
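
For comparison, the behaviour the MNIST example suggests is that every occurrence of the same OPTIMIZE'd parameter gets its own numbered placeholder. A hedged sketch of that numbering (not the actual cwsm implementation) is:

    # Hedged sketch (not the cwsm code): give every occurrence of the same
    # OPTIMIZE'd parameter its own suffix, as seen with kernel_size in the
    # MNIST example.
    import re
    from collections import defaultdict

    def number_placeholders(prototxt_text):
        counters = defaultdict(int)
        def repl(match):
            name = match.group(1)
            counters[name] += 1
            return '%s: OPTIMIZE_%s_%d' % (name, name, counters[name])
        # matches lines such as: std: OPTIMIZE{"type":"FLOAT", "min": 0.005, "max": 0.05}
        return re.sub(r'(\w+):\s*OPTIMIZE\{[^}]*\}', repl, prototxt_text)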
