
caffe-with-spearmint's People

Contributors

kashefy, kuz, m0003r

caffe-with-spearmint's Issues

Interface for GPU

If I understand the code correctly, it currently only runs on the CPU. It would be nice (and, for many setups, necessary) for run.py to accept an optional parameter that specifies the GPU id and to run training on that GPU.
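
A minimal sketch of what such an option could look like, assuming run.py assembles the 'caffe train' command line itself; the argument names and surrounding code are illustrative only, not the current implementation:

    # Illustrative sketch only -- not the current run.py. It assumes the wrapper
    # builds the 'caffe train' command line and simply forwards an optional GPU id.
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--experiment', default='experiments/mnist')  # placeholder default
    parser.add_argument('--optimize', default='accuracy')
    parser.add_argument('--gpu', type=int, default=None,
                        help='GPU id to pass to caffe; runs on the CPU when omitted')
    args = parser.parse_args()

    cmd = ['caffe', 'train', '-solver', 'solver.prototxt']  # solver path is a placeholder
    if args.gpu is not None:
        cmd += ['-gpu', str(args.gpu)]  # the stock 'caffe train' binary accepts a -gpu flag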

Fine-tuning

Hi @kuz,

I tried caffe-with-spearmint today and it seems really useful. Thanks for this work!!

I would like to use your code to optimize parameters when fine-tuning a network rather than training from scratch. How can I do this? Basically I just need to do two things:

  1. call the 'caffe train' API, passing the weights of the pre-trained net to start from (see the sketch after this list)
  2. change the name of the layers that are learned from scratch (e.g. fc8 to fc8_newtask)
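
For step 1, this is how it is usually done with the stock Caffe CLI; the paths and the pre-trained model name below are placeholders, not files from this repository:

    # Hedged sketch: standard Caffe fine-tuning invocation, wrapped in Python.
    # Paths and the .caffemodel name are placeholders.
    import subprocess

    subprocess.check_call([
        'caffe', 'train',
        '-solver', 'experiments/mytask/solver.prototxt',    # hypothetical experiment path
        '-weights', 'bvlc_reference_caffenet.caffemodel',    # pre-trained weights to start from
    ])

For step 2, renaming a layer in trainval.prototxt is enough for Caffe to skip copying its weights from the pre-trained model and to re-initialize it from its weight_filler.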

I think that if you could provide this feature, or give some hints about how to do it, many people would appreciate it.

Thanks again!!!!!
Giulia Pasquale

Use HDF5 data instead of LMDB

Hi @kuz,

Is there a way to use HDF5 data instead of LMDB? Cant see any explicit checks for the LMDB, other than the presence of the folders. Would it be enough to just put the HDF5 files in the train_lmdb and val_lmdb folders and mention the type as "HDF5Data" in the net prototxt?
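
For reference, this is what Caffe's HDF5Data layer expects on disk (standard Caffe behaviour, not anything cwsm-specific; whether cwsm rewrites the data layer is exactly what the question above leaves open). File and dataset names below are placeholders:

    # Hedged sketch: the HDF5Data layer reads a plain-text file listing one .h5
    # file per line; the dataset names ('data', 'label') must match the layer's
    # top blobs. The folder name mirrors the train_lmdb convention in the question.
    import os
    import h5py
    import numpy as np

    if not os.path.isdir('train_lmdb'):
        os.makedirs('train_lmdb')

    X = np.random.rand(100, 3, 32, 32).astype(np.float32)   # placeholder images
    y = np.random.randint(0, 10, size=100).astype(np.float32)

    with h5py.File('train_lmdb/train.h5', 'w') as f:
        f.create_dataset('data', data=X)
        f.create_dataset('label', data=y)

    with open('train_lmdb/train_h5_list.txt', 'w') as f:
        f.write('train_lmdb/train.h5\n')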

Thanks!

Using with googlenet

I am trying to use caffe-with-spearmint with GoogLeNet. GoogLeNet contains three accuracy layers (one for each of its classifiers: the two auxiliary ones and the final one), but I'm trying to optimize on loss. I am running this:

python run.py --experiment experiments/protein --optimize loss --optimizewrt best >& run0.txt
but I get the error:

smconfig.parse_in(solver)

  File "/net/kihara/cbelth/caffe-with-spearmint/cwsm/spearmint.py", line 52, in parse_in
    param = json.loads(param)
  File "/usr/lib/python2.7/json/__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python2.7/json/decoder.py", line 380, in raw_decode
    obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 1 column 27 (char 26)

I attached the trainval.prototxt for my model (changed to .txt). I changed the name of the layer at line 951 to "loss".
trainval.txt
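
For what it's worth, the failing call is json.loads on the text extracted from an OPTIMIZE{...} directive, so the usual culprit is an OPTIMIZE block that is not strict JSON (single quotes, an unquoted key, or a missing comma). A minimal reproduction of the same class of error, purely for illustration:

    import json

    # Valid form used in the cwsm examples: keys and string values double-quoted,
    # fields separated by commas.
    json.loads('{"type": "FLOAT", "min": 0.005, "max": 0.05}')

    # A missing comma (or single quotes) raises the same kind of error as in the
    # traceback above; checking every OPTIMIZE{...} block in trainval.prototxt
    # and solver.prototxt for strict JSON syntax is the first thing to try.
    try:
        json.loads('{"type": "FLOAT" "min": 0.005}')   # comma omitted on purpose
    except ValueError as e:
        print(e)   # e.g. "Expecting , delimiter: ..."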

Limit running time / jobs proposed

To make caffe-with-spearmint a fully automated pipeline of hyper-parameter selection, training and testing, it seems necessary to be able to limit the running time or the number of jobs proposed by spearmint. Is this possible within the current framework?
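
In case there is no built-in option, a coarse wrapper-level workaround is sketched below: run run.py in a child process and terminate it once a wall-clock budget is spent (Spearmint jobs already submitted would still finish on their own). The paths and arguments are taken from the examples in these issues and may need adjusting:

    # Hedged workaround sketch (not a feature of cwsm itself): bound the overall
    # wall-clock time of the optimization from outside.
    import subprocess
    import time

    BUDGET_SECONDS = 6 * 3600   # assumed budget, adjust as needed

    proc = subprocess.Popen(
        ['python', 'run.py', '--experiment', 'experiments/mnist',
         '--optimize', 'accuracy'])
    start = time.time()
    while proc.poll() is None:
        if time.time() - start > BUDGET_SECONDS:
            proc.terminate()   # jobs already handed to spearmint may keep running
            break
        time.sleep(60)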

Output not changing and GPU not being used

I specified GPU mode in my solver.prototxt, but the GPU is not being used. Also, I am getting the same output over and over again, and I'm not sure how to interpret it. I'm only tuning the learning rate and weight decay for now, just to see how it works. Here is my output:

Submitted job 14 with local scheduler (process id: 27649).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 15 with local scheduler (process id: 27691).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 16 with local scheduler (process id: 27724).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 17 with local scheduler (process id: 27766).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 18 with local scheduler (process id: 27799).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion: NAME TYPE VALUE
---- ---- -----
weight_decay float 0.000100
base_lr weig float 0.000100
Submitted job 19 with local scheduler (process id: 27841).
Status: 1 pending, 0 complete.

output statements cut off

When I run this optimizer with multiple kernel sizes, all the output suggestions in the terminal are cut off so that they only read "kernel_size_".

Is there a file I can edit or access that will tell me which kernel size is which?
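
One hedged workaround, assuming the generated prototxt files end up in tmp/ as described in the OPTIMIZE_std issue below: read the full values back out of those files instead of the truncated console table.

    # Hedged sketch: recover the kernel sizes from the generated prototxt files.
    # The tmp/ layout is taken from the OPTIMIZE_std issue below; adjust the
    # glob pattern if your experiment differs.
    import glob
    import re

    for path in sorted(glob.glob('tmp/*_trainval.prototxt')):
        with open(path) as f:
            text = f.read()
        sizes = re.findall(r'kernel_size:\s*(\d+)', text)
        print(path, sizes)   # one entry per layer, in file order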

caffe snapshot prefix in caffeout directory

Hi,
I am able to run the experiments/mnist demo here, but when it finished I was confused about the prefix of the Caffe snapshots located in the caffeout directory. For example, what does "2015-06-08-09-08-41" mean?
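
Judging purely by its format (an inference, not confirmed from the cwsm source), the prefix looks like the launch timestamp of the corresponding training job:

    # Inference from the format only: "2015-06-08-09-08-41" matches a
    # year-month-day-hour-minute-second timestamp.
    from datetime import datetime

    prefix = datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
    print(prefix)   # e.g. '2015-06-08-09-08-41'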

Intermediate states ignored?

It seems that the current code ignores intermediate states in each run (i.e. for each parameter setting) and only takes the accuracy/loss at the last iteration as the metric for that run. But in practice, due to overfitting and other issues, the model trained to the last iteration is usually not the best one to use. Why not take the best accuracy/loss within each run as its metric, and compare different runs on that?
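
As an illustration of the proposal (not existing cwsm behaviour), the metric for a run could be the best test accuracy reported anywhere in its Caffe log; the regex below assumes the standard Caffe log line format:

    # Hedged sketch: score a run by its best intermediate test accuracy rather
    # than the value at the final iteration. Assumes the standard Caffe log line
    # "Test net output #0: accuracy = 0.97".
    import re

    def best_accuracy(log_path):
        best = None
        pattern = re.compile(r'Test net output #\d+: accuracy = ([0-9.]+)')
        with open(log_path) as f:
            for line in f:
                m = pattern.search(line)
                if m:
                    value = float(m.group(1))
                    if best is None or value > best:
                        best = value
        return best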

Multiple variables are getting mapped to OPTIMIZE_std_1? (instead of _2, _3, ...)

I've successfully run the caffe-with-spearmint MNIST example, and now I'm moving toward using caffe-with-spearmint for my own application. Right now, I'm interested in optimizing the standard deviation (std) of the weight initialization in a Network-in-Network.

Let me walk you through where things are going wrong...

  1. I set up my model/trainval.prototxt with two std fields filled in like this:
    std: OPTIMIZE{"type":"FLOAT", "min": 0.005, "max": 0.05}

  2. In tmp/template_trainval.prototxt, these both get mapped to OPTIMIZE_std_1:

layer {name: "conv1"
     convolution_param {weight_filler{ std: OPTIMIZE_std_1 } }
}
layer {name: "cccp1"
     convolution_param {weight_filler{ std: OPTIMIZE_std_1 } } ##(should be OPTIMIZE_std_2)
}

This is odd, because in the MNIST example, I see that optimizing kernel_size in multiple layers maps to OPTIMIZE_kernel_size_1, OPTIMIZE_kernel_size_2, and OPTIMIZE_kernel_size_3.

  3. As a result of having two fields that are both OPTIMIZE_std_1, tmp/<datetime>_trainval.prototxt gets filled in with an optimized value for conv1, but still contains OPTIMIZE_std_1 for the next convolutional layer:
layer {name: "conv1"
     convolution_param {weight_filler{ std: 0.005 } }
}
layer {name: "cccp1"
     convolution_param {weight_filler{ std: OPTIMIZE_std_1 } }
}
  4. Caffe naturally doesn't understand what to do with std: OPTIMIZE_std_1, so it fails to run training.

Any ideas why caffe-with-spearmint isn't filling in OPTIMIZE_std_2 in the 2nd convolutional layer's std field?

I put samples of all 3 files (models/trainval.prototxt, tmp/template_trainval.prototxt, and tmp/<datetime>_trainval.prototxt) in a gist here:
https://gist.github.com/forresti/138d8bb79457a1146607
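
For comparison, the behaviour the MNIST example suggests is that every occurrence of the same OPTIMIZE'd parameter gets its own numbered placeholder. A hedged sketch of that numbering (not the actual cwsm implementation) is:

    # Hedged sketch (not the cwsm code): give every occurrence of the same
    # OPTIMIZE'd parameter its own suffix, as seen with kernel_size in the
    # MNIST example.
    import re
    from collections import defaultdict

    def number_placeholders(prototxt_text):
        counters = defaultdict(int)
        def repl(match):
            name = match.group(1)
            counters[name] += 1
            return '%s: OPTIMIZE_%s_%d' % (name, name, counters[name])
        # matches lines such as: std: OPTIMIZE{"type":"FLOAT", "min": 0.005, "max": 0.05}
        return re.sub(r'(\w+):\s*OPTIMIZE\{[^}]*\}', repl, prototxt_text)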
