hips / spearmint

Spearmint Bayesian optimization codebase

License: Other

Python 99.76% Shell 0.24%

spearmint's Issues

Spearmint on the cluster

Dear all
I'm trying to use the Spearmint package on a cluster, and I run into the following problem when running the "simple" example. (I have tested this example successfully on my laptop.)

When I first execute the command "mongod --fork --logpath --dbpath" in the simple example folder, I get the following error:

about to fork child process, waiting until server is ready for connections.
forked process: 108120
all output going to: /home/mehrianmohammad/Spearmint-master/examples/simple/--dbpath
log file [/home/mehrianmohammad/Spearmint-master/examples/simple/--dbpath] exists; copied to temporary file [/home/mehrianmohammad/Spearmint-master/examples/simple/--dbpath.2015-06-02T14-32-33]
ERROR: child process failed, exited with error number 100

Jasper suggested that I remove --fork from the above command, and when I tried that there was no error. So I moved to the spearmint folder and executed the command "python main.py ../examples/simple", but there is an error:

Traceback (most recent call last):
  File "main.py", line 198, in <module>
    from spearmint.utils.database.mongodb import MongoDB
ImportError: No module named spearmint.utils.database.mongodb

So, as Jasper also suggested, I installed the Spearmint package again from the spearmint folder using the command "pip install -e .", but nothing changed.

I should mention that I do NOT see any of these errors when I log into the cluster as the admin user; with the admin user, everything works properly.
As Jasper suggested (thanks for his time), I also installed Anaconda on the cluster, but nothing changed.
So the problem is definitely that I am NOT the admin on the cluster. Does anybody have a suggestion?

Best wishes,
Mohammad

qsub error because of -j option in SGE 8.1.6

I get the following qsub error on SGE 8.1.6

qsub: invalid option argument "-j -N"

To fix this, I had to change line 192 in spearmint/schedulers/SGE.py so that the -j option is followed by a 'y':
return 'qsub -S /bin/bash -e %s -o %s -j y -N %s' % (output_file, output_file, job_name)

I didn't submit a pull request because I'm not sure if this solution is backward compatible with older versions of SGE.
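
The change described above can be sketched as a small helper. The function name build_qsub_command and the sample arguments are hypothetical; only the command string itself comes from the report. In SGE, -j takes y or n to say whether stderr is merged into stdout, which is why a bare -j ends up consuming -N as its argument.

```python
def build_qsub_command(output_file, job_name):
    """Build the qsub invocation with an explicit argument for -j.

    Hypothetical helper for illustration; only the command string
    is taken from the report above.
    """
    # '-j y' merges stderr into stdout. A bare '-j' makes qsub read
    # the following '-N' as the argument of '-j'.
    return 'qsub -S /bin/bash -e %s -o %s -j y -N %s' % (
        output_file, output_file, job_name)


# Made-up sample values, only to show the resulting command line.
cmd = build_qsub_command('/tmp/job-1.out', 'spearmint-job-1')
print(cmd)
```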

any guide/document for start/usage?

KeyError: 'NaN'

Any idea why spearmint fell over with this? I'm running on git commit ac8a37e from Sep 29.

Fitting GP for main task...
Getting suggestion...

Minimum expected objective value under model is 0.12737 (+/- 0.01400), at location:
                NAME          TYPE       VALUE
                ----          ----       -----
                line_width    float      0.050000
                log_line_V    float      -1.000000
                log_combo_V   float      -1.000000
                log_combined  float      -1.000000
                synergy_widt  float      10.000000
                combo_width   float      0.050000

Minimum of observed values is 0.121821, at location:
                NAME          TYPE       VALUE
                ----          ----       -----
                line_width    float      0.050000
                log_line_V    float      -1.000000
                log_combo_V   float      -1.000000
                log_combined  float      -1.000000
                synergy_widt  float      10.000000
                combo_width   float      0.050000

Suggestion:     NAME          TYPE       VALUE
                ----          ----       -----
                line_width    float      0.400000
                log_line_V    float      1.000000
                log_combo_V   float      -1.000000
                log_combined  float      -1.000000
                synergy_widt  float      10.829132
                combo_width   float      0.400000
Submitted job 3 with local scheduler (process id: 52394).
Status: 1 pending, 1 complete.

Fitting GPClassifier for NaN task...
Getting suggestion...

Suggestion:     NAME          TYPE       VALUE
                ----          ----       -----
                line_width    float      0.312500
                log_line_V    float      -0.500000
                log_combo_V   float      -0.500000
                log_combined  float      0.500000
                synergy_widt  float      115.000000
                combo_width   float      0.137500
Submitted job 3 with local scheduler (process id: 52532).
Status: 1 pending, 1 complete.

Fitting GP for main task...
Getting suggestion...
Traceback (most recent call last):
  File "spearmint/main.py", line 494, in <module>
    main()
  File "spearmint/main.py", line 286, in main
    suggested_job = get_suggestion(chooser, resource.tasks, db, expt_dir, options, resource_name)
  File "spearmint/main.py", line 361, in get_suggestion
    suggested_input = chooser.suggest()
  File "/home/john/Dev/AZSanger/etc/Spearmint/spearmint/choosers/default_chooser.py", line 335, in suggest
    current_best, current_best_location = self.best()
  File "/home/john/Dev/AZSanger/etc/Spearmint/spearmint/choosers/default_chooser.py", line 447, in best
    mc = self.probabilistic_constraint(grid)
  File "/home/john/Dev/AZSanger/etc/Spearmint/spearmint/choosers/default_chooser.py", line 519, in probabilistic_constraint
    for c in self.constraints],
KeyError: 'NaN'

Problem installing Spearmint! and testing the branin example!

Dear all
I'm a new user trying to install the Spearmint package for Bayesian optimization, and I have a problem using the software. My problem is with the third step mentioned on the site:

STEP 3: Running spearmint

Start up a MongoDB daemon instance:
mongod --fork --logpath <path/to/logfile> --dbpath <path/to/dbfolder>
Run spearmint: python main.py </path/to/experiment/directory>

I don't know what the logpath and dbpath for mongod should be.
Could you please help me install the package and run the branin example?

Thank you in advance,
Best wishes,
Mohammad
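
The two placeholders in the mongod command are paths you choose yourself: a file for the log and a directory for the database. A minimal sketch that prepares both and assembles the command; the helper name mongod_command and the db / mongod.log names are assumptions made for illustration, not something Spearmint prescribes.

```python
import os
import tempfile


def mongod_command(base_dir):
    """Return a mongod argument list with explicit log and data paths.

    Hypothetical helper: 'db' and 'mongod.log' are arbitrary names.
    """
    db_dir = os.path.join(base_dir, 'db')
    log_file = os.path.join(base_dir, 'mongod.log')
    if not os.path.isdir(db_dir):
        os.makedirs(db_dir)  # mongod expects the data directory to exist
    return ['mongod', '--fork', '--logpath', log_file, '--dbpath', db_dir]


# A throwaway directory is used here; any writable location works.
base = tempfile.mkdtemp()
cmd = mongod_command(base)
print(' '.join(cmd))
# To actually start the daemon: subprocess.check_call(cmd)
```

Note that each option is followed by its own value. Running "mongod --fork --logpath --dbpath" with no values makes mongod treat "--dbpath" as the name of the log file.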

Current implemented features

Hi!

I've read the papers about Spearmint and I have to say there seems to be a lot of good stuff in there.

However, I noticed that part of that stuff is not implemented in the current version. I think that a simple list of things that are not currently implemented would help people who see the code for the first time.

From what I've seen these are the things that are not implemented (mainly from the paper 'Multi-task bayesian optimization'):

Transferring knowledge to a new domain (did not see multi task kernel)
Optimizing average function over multiple tasks ( for fast k-fold cross validation)
Multi-task acquisition function (did not see an entropy based acquisition function)
Other acquisition functions, besides EI

Can you confirm that these features are not implemented?

Also, when I set max-concurrent to >1, is some kind of parallelization strategy performed? Perhaps the one presented in the paper 'Practical Bayesian Optimization of Machine Learning Algorithms'?
"resources" : {
    "my-machine" : {
        "scheduler" : "local",
        "max-concurrent" : 2,
        "max-finished-jobs" : 100
    }
}

Best Regards,
Jorge

Python 3 support

Hi, any chance for Python 3 support?

Conditionally tune a variable

Is there a way of setting a variable to be tunable given the enum state of another variable?

Docker image

I've produced a Docker image for Spearmint, which builds from master on a weekly basis. The image also includes MongoDB, so the entire system can be run in a container. If you wish I can add a note to the bottom of STEP 1: Installation as an alternative installation option.

On a related note, some example documentation on my software may be of interest. I'm currently writing up notes on how to accomplish these setups entirely within Docker.

Installation on Odyssey (Harvard Cluster) ?

Do folks either doing dev or just using spearmint at Harvard already have a stable installation set up? If so, can I soft link it or something? Or maybe there's a module built? I haven't found one using module avail, but I suppose I could request one via RC.

No module named subset

On Mac OS X with anaconda Python 2.7.7 and Mongodb 2.6.4 when I invoke Spearmint I get "ImportError: No module named subset".

Ideas?

Running a matlab file with spearmint

Dear all,
I have a function to optimize that is written in MATLAB. As mentioned in the instructions, I changed the language in the config.json file to 'MATLAB' and replaced branin.py with branin.m, rewriting its lines in MATLAB format as follows:

function y = branin(x1,x2)

    y = (x1.^2)-x2+1);
end


%print 'Result = %f' % result
%time.sleep(np.random.randint(60))
%return result
%Write a function like this called 'main'
function y1 = main(job_id, params)
    print ('Anything printed here will end up in the output directory for job #%d') % job_id)
    print ('params')
   y1 = branin(params['x'], params['y']);
end

When I run this file using main.py in the spearmint directory, I get this error in the output folder:

Job launching after 0.15 seconds in submission.
Booting up Matlab...
Traceback (most recent call last):
  File "/home/mohammad/Desktop/Bayesian/Spearmint-master/spearmint/launcher.py", line 240, in launch
    result = matlab_launcher(job)
  File "/home/mohammad/Desktop/Bayesian/Spearmint-master/spearmint/launcher.py", line 351, in matlab_launcher
    session.run("cd('%s')" % os.path.realpath(job['expt_dir']))
  File "/home/mohammad/anaconda/lib/python2.7/site-packages/pymatlab/matlab.py", line 84, in run
    error_string)))
RuntimeError: Error from Matlab: None
Problem executing the function
Job failed in 0.75 seconds.
(<type 'exceptions.RuntimeError'>, RuntimeError('Error from Matlab: None',), <traceback object at 0x7f779d5ea290>)

I should mention that it initially asked me to install 'pymatlab', which I did, and there is no longer an error about that. I would really appreciate it if anyone could help me run this MATLAB file :)
Best regards,
Mohammad

print sys.exc_info() SyntaxError: invalid syntax

I'm trying to run the noisy example on Arch Linux. Python 3 is the default on Arch, so Python 2 has to be invoked with the python2 command. Here is what I do:

python2 main.py /home/ed/Spearmint/examples/noisy/

I get this SyntaxError:

File "/usr/lib/python2.7/site-packages/spearmint-0.1-py2.7.egg/spearmint/launcher.py", line 274
print sys.exc_info()
        ^
SyntaxError: invalid syntax

I thought that at some point the script was trying to run Python scripts with the python command, which would use Python 3, so print without parentheses wouldn't work. Changing the line to:

print (sys.exc_info())

doesn't fix the problem, so it must be something else.

Problems running example files

Hi,

I was just wondering if anyone has encountered this error message. I'm not entirely sure what is going wrong here, but it seems to be lower-level C++ code that is causing the problem.

I got this message when running the example, by the way. The first 2 points ran without any problems, but the part where it tries to fit a GP over the points seems to be causing the error messages. I tried updating scipy and numpy, but it didn't help.

Thanks :)

Looking for python27.dll
Looking for python27.dll
Looking for python27.dll
In file included from C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\scipy\weave\blitz/blitz/array-impl.h:37:0,
from C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\scipy\weave\blitz/blitz/array.h:26,
from c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:11:
C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\scipy\weave\blitz/blitz/range.h: In member function 'bool blitz::Range::isAscendingContiguous() const':
C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\scipy\weave\blitz/blitz/range.h:120:34: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
return ((first_ < last_) && (stride_ == 1) || (first_ == last_));
^
In file included from C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\numpy\core\include/numpy/ndarraytypes.h:1804:0,
from C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\numpy\core\include/numpy/ndarrayobject.h:17,
from C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\numpy\core\include/numpy/arrayobject.h:4,
from c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:23:
C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\numpy\core\include/numpy/npy_1_7_deprecated_api.h: At global scope:
C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\numpy\core\include/numpy/npy_1_7_deprecated_api.h:13:79: note: #pragma message: C:\SciSoft\WinPython-64bit-2.7.9.4\python-2.7.9.amd64\lib\site-packages\numpy\core\include/numpy/npy_1_7_deprecated_api.h(12) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
"#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION")
^
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp: In function 'PyObject* compiled_func(PyObject*, PyObject*)':
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:745:24: error: ambiguous overload for 'operator<' (operand types are 'int' and 'py::object')
for (int i=0; i<N; i++)
^
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:745:24: note: candidates are:
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:745:24: note: operator<(int, int)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:745:24: note: operator<(int, float)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:745:24: note: operator<(int, double)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:746:26: error: ambiguous overload for 'operator<' (operand types are 'int' and 'py::object')
for (int j=0; j<M; j++)
^
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:746:26: note: candidates are:
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:746:26: note: operator<(int, int)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:746:26: note: operator<(int, float)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:746:26: note: operator<(int, double)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:747:28: error: ambiguous overload for 'operator<' (operand types are 'int' and 'py::object')
for (int d=0; d<D; d++)
^
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:747:28: note: candidates are:
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:747:28: note: operator<(int, int)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:747:28: note: operator<(int, float)
c:\users\usr\appdata\local\temp\usr\python27_compiled\sc_efe69a3fecba151ef9b75ee9eed626cb290.cpp:747:28: note: operator<(int, double)

How to use enum?

How do I feed an enum in the configuration file?

On loading "heavy" external data for the model

Hi,

Spearmint iterates over combinations of hyperparameters (as specified in the config JSON file). On each iteration it calls the user-created function, which outputs a "score" or value; these values are then compared in order to optimize. The scores are usually the result of calling an underlying model, defined inside the function, with the corresponding set of hyperparameters. This makes sense, since it keeps the logic of the model totally separate from Spearmint. However, besides hyperparameters, one of the inputs a model usually needs is data, which as a consequence has to be loaded within each iteration.

Concretely, the data we use to train our models is considerably large and takes a lot of time to load into memory each time. We would like to load the data into memory only once and then just hand it to our model on each iteration, saving precious time. Is this somehow possible? Maybe if, for instance, Spearmint could be called from within a Python script, we could pass the data.

I hope you can guide us a little.
Thanks!
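
The logs elsewhere on this page show each job being submitted with its own process id, so an in-memory object probably cannot be shared between evaluations directly. One workaround that stays within the main(job_id, params) interface is to pay the expensive preprocessing once and cache its result on disk. This is only a sketch under that assumption: expensive_load, load_data, and CACHE_PATH are hypothetical names, and the returned score is a placeholder.

```python
import os
import pickle
import tempfile

# Hypothetical cache location; any path visible to all jobs would do.
CACHE_PATH = os.path.join(tempfile.gettempdir(), 'spearmint_data_cache.pkl')


def expensive_load():
    """Stand-in for the slow loading/preprocessing step (hypothetical)."""
    return {'X': [[0.0, 1.0], [1.0, 0.0]], 'y': [0, 1]}


def load_data(cache_path=CACHE_PATH, loader=expensive_load):
    """Load data from a local cache, building the cache on first use."""
    if os.path.exists(cache_path):
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    data = loader()
    with open(cache_path, 'wb') as f:
        pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
    return data


def main(job_id, params):
    data = load_data()
    # ... train the model on `data` with `params` and return its score ...
    return float(len(data['y']))  # placeholder score
```

This does not keep the data resident in memory between jobs; it only replaces repeated parsing with a single read of an already prepared file.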

Problem running branin.py as an example!

Dear all
I've installed the Spearmint package and MongoDB on Linux. My problem is in the final step, where I want to run main.py.
To test branin.py, I first go to the directory "Desktop/Spearmint-master/examples/distributed/" and execute branin.py.
Then, in the same directory, I execute the command "mongod --fork --logpath --dbpath".
Everything is OK up to this step.
To run main.py, I go back to the directory "Desktop/Spearmint-master/Spearmint", and when I execute the command I see the following error:

Traceback (most recent call last):
  File "main.py", line 494, in <module>
    main()
  File "main.py", line 249, in main
    options, expt_dir = get_options()
  File "main.py", line 216, in get_options
    expt_dir = os.path.realpath(os.path.expanduser(args[0]))
IndexError: list index out of range

I'm wondering where the branin parameters have to be defined in main.py.
Could you please help me solve the error?
Thank you in advance,
Mohammad
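
The traceback shows args[0] being read from an empty list, which is what happens when main.py is started without the experiment directory as an argument. A minimal sketch of the same lookup with a clearer failure message; resolve_expt_dir is a hypothetical helper, not a function in Spearmint.

```python
import os


def resolve_expt_dir(args):
    """Return the absolute experiment directory from positional arguments.

    Hypothetical helper mirroring the failing line in the traceback.
    """
    if not args:
        # An empty argument list is what triggers the IndexError above.
        raise SystemExit('usage: python main.py </path/to/experiment/directory>')
    return os.path.realpath(os.path.expanduser(args[0]))


print(resolve_expt_dir(['../examples/simple']))
```

In other words, the parameters are not defined in main.py at all; the directory holding config.json and branin.py is passed on the command line.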

feature: ipyparallel task manager

I suggest someone code up a task 'scheduler' that uses task info stored in an ipyparallel task db.

Trying to test an example with spearmint!

Dear all
I have 3 parameters in my function, so I changed the config.json file as below:
{
    "language" : "PYTHON",
    "main-file" : "Neo.py",
    "likelihood" : "GAUSSIAN",
    "variables" : {
        "x" : {
            "type" : "FLOAT",
            "size" : 1,
            "min" : 0.0001,
            "max" : 0.01
        },
        "y" : {
            "type" : "FLOAT",
            "size" : 1,
            "min" : 0.4,
            "max" : 0.9
        },
        "z" : {
            "type" : "FLOAT",
            "size" : 1,
            "min" : 1e-5,
            "max" : 5e-5
        }
    }
}

I don't know what to write for "experiment-name".
Could someone please answer my question? I really need to use this package.
Best Regards,
Mohammad
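
Other configs on this page set "experiment-name" to short labels such as "smnt-test" and "swingy-monkey-optimization", so it appears to be a name you choose yourself. A minimal sketch that adds one to the config above; "neo-tuning" is a made-up example value.

```python
import json

# The config from the question, condensed; values are unchanged.
config_text = '''
{
    "language": "PYTHON",
    "main-file": "Neo.py",
    "likelihood": "GAUSSIAN",
    "variables": {
        "x": {"type": "FLOAT", "size": 1, "min": 0.0001, "max": 0.01},
        "y": {"type": "FLOAT", "size": 1, "min": 0.4, "max": 0.9},
        "z": {"type": "FLOAT", "size": 1, "min": 1e-5, "max": 5e-5}
    }
}
'''

config = json.loads(config_text)
# "neo-tuning" is a hypothetical label; pick any name for your experiment.
config.setdefault('experiment-name', 'neo-tuning')
print(json.dumps(config, indent=4, sort_keys=True))
```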

setting variable to a fixed value

  File "~/Spearmint/spearmint/tasks/base_task.py", line 353, in from_unit
    assert(variable['max'] - variable['min'] > 0.0), 'Your specified min (%f) for the variable %s must be less than the max (%f)' % (variable['min'], name, variable['max'])
AssertionError: Your specified min (3.000000) for the variable layerNum must be less than the max (3.000000)

I'd say setting max equal to min is a perfectly valid way to set a variable to a constant. Do you see any complications involved if I want to change the code to make this possible? (I don't promise a pull request, but might consider it.)

What's the current syntax to achieve this within config files?
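
One workaround that does not require changing base_task.py is to leave the constant out of the "variables" block and supply it inside the objective instead. A minimal sketch, assuming the usual main(job_id, params) entry point; objective, learning_rate, and FIXED_PARAMS are hypothetical names, and only layerNum comes from the report.

```python
# Values held constant; they are kept out of config.json entirely.
FIXED_PARAMS = {'layerNum': 3}


def objective(layerNum, learning_rate):
    """Hypothetical objective; replace with the real model evaluation."""
    return (learning_rate - 0.1) ** 2 + 0.01 * layerNum


def main(job_id, params):
    merged = dict(params)        # tuned values proposed by the optimizer
    merged.update(FIXED_PARAMS)  # fixed values always win
    return objective(**merged)


print(main(1, {'learning_rate': 0.1}))
```

The optimizer then never sees layerNum, so the min < max assertion is not triggered.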

ImportError: No module named subset

When I tried to run spearmint I get the following error:

  File "main.py", line 482, in <module>
    main()
  File "main.py", line 254, in main
    chooser_module = importlib.import_module('spearmint.choosers.' + options['chooser'])
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/home/ianastacio/Apps/Spearmint/spearmint/choosers/default_chooser.py", line 197, in <module>
    from ..models.abstract_model import function_over_hypers
  File "/home/ianastacio/Apps/Spearmint/spearmint/models/__init__.py", line 1, in <module>
    from gp import GP
  File "/home/ianastacio/Apps/Spearmint/spearmint/models/gp.py", line 194, in <module>
    from ..kernels import Matern52, Noise, Scale, SumKernel, TransformKernel
  File "/home/ianastacio/Apps/Spearmint/spearmint/kernels/__init__.py", line 6, in <module>
    from subset import Subset
ImportError: No module named subset

NaN crash

After several successful runs, it seems spearmint crashes with the following output:

/Users/name/Desktop/external_libraries/Spearmint/spearmint/tasks/task.py:287: RuntimeWarning: invalid value encountered in subtract
  return y - mean
Fitting GP for main task...
Traceback (most recent call last):
 File "main.py", line 494, in <module>
   main()
 File "main.py", line 286, in main
   suggested_job = get_suggestion(chooser, resource.tasks, db, expt_dir, options, resource_name)
 File "main.py", line 355, in get_suggestion
   hypers = chooser.fit(task_group, hypers, task_options)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/choosers/default_chooser.py", line 309, in fit
   hypers=hypers.get(task_name, None)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/models/gp.py", line 505, in fit
   self._hypers_list = self._collect_samples(self.mcmc_iters)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/models/gp.py", line 378, in _collect_samples
   sampler.sample(self)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/sampling/slice_sampler.py", line 262, in sample
   params_array, current_ll = slice_sample(params_array, self.logprob, model, **self.sampler_options)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/sampling/mcmc.py", line 354, in slice_sample
   new_x, new_llh = direction_slice(direction, init_x)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/sampling/mcmc.py", line 289, in direction_slice
   llh_s = np.log(npr.rand()) + dir_logprob(0.0)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/sampling/mcmc.py", line 272, in dir_logprob
   return logprob(direction*z + init_x, *logprob_args)
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/sampling/slice_sampler.py", line 240, in logprob
   lp += model.log_likelihood()
 File "/Users/name/Desktop/external_libraries/Spearmint/spearmint/models/gp.py", line 538, in log_likelihood
   solve = spla.cho_solve((chol, True), self.observed_values - self.mean.value)
 File "/usr/local/lib/python2.7/site-packages/scipy/linalg/decomp_cholesky.py", line 162, in cho_solve
   b1 = asarray_chkfinite(b)
 File "/usr/local/lib/python2.7/site-packages/numpy/lib/function_base.py", line 613, in asarray_chkfinite
   "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

My config file is as follows, if helpful:

{
   "language": "PYTHON",
   "main-file": "stub_optimizing.py",
   "experiment-name": "swingy-monkey-optimization",
   "likelihood": "GAUSSIAN",
   "variables": {
       "height_bucket_div": {
           "type": "FLOAT",
           "size": 1,
           "min": 1,
           "max": 15
       },
       "width_bucket_div": {
           "type": "FLOAT",
           "size": 1,
           "min": 1,
           "max": 10
       },
       "vel_bucket_len": {
           "type": "FLOAT",
           "size": 1,
           "min": 1,
           "max": 30
       },
       "discount": {
           "type": "FLOAT",
           "size": 1,
           "min": 0,
           "max": 1
       },
       "greed_exp": {
           "type": "FLOAT",
           "size": 1,
           "min": 1,
           "max": 3
       }
   }
}
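
The final ValueError is raised when cho_solve receives non-finite values in self.observed_values - self.mean.value, and the earlier RuntimeWarning points at the same subtraction. One possible cause, which is an assumption and not confirmed by the report, is that the objective returned NaN or inf for some job. A sketch of a guard that makes such a result fail at the source; checked is a hypothetical decorator and the quadratic objective is a placeholder.

```python
import math


def checked(objective):
    """Wrap an objective so that non-finite results fail loudly."""
    def wrapper(job_id, params):
        value = float(objective(job_id, params))
        if math.isnan(value) or math.isinf(value):
            raise ValueError(
                'objective returned %r for job %s' % (value, job_id))
        return value
    return wrapper


@checked
def main(job_id, params):
    # Placeholder objective; the real one lives in stub_optimizing.py.
    return (params['x'] - 1.0) ** 2


print(main(1, {'x': 3.0}))
```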

How is optimization of acquisition function over integer/categorical parameters done?

Do you restrict the search in the acquisition function to feasible (discrete) values only? Since gradient descent is not obviously applicable in this case, I would imagine one could use a combination of gradient descent and some smart enumeration of discrete parameters. If not, do you do some sort of relaxation over the discrete variables? I would greatly appreciate a short message on how specifically you optimize, so that I can understand what to expect in terms of scalability if I have many discrete variables.

Testing for presentation of CONTRIBUTING.rst

Test test

trace.csv removed from experiment output?

Hi,
I'm running Spearmint on OS X with Anaconda. Running branin.py doesn't generate a trace.csv file as described in the readme. Is this a bug, or was something changed from the original Snoek version of Spearmint?

Error when running 'constrained' example

Hi, I get the following error when running the 'constrained' example with PESC, using the latest origin/PESC code. All other examples appear to work fine for me. It appears to do a single fit but then produces errors when trying to identify the next point to search.

I looked into the code and it seems model.options doesn't have a key 'binomial_trials'. Also model._one_minus_epsilon is not defined.

Any help would be appreciated,
Thanks.

##@##:~/Spearmint$ python spearmint/main.py examples/constrained/

Getting suggestion for y_at_least_x, y_at_most_10, branin...


Suggestion:     
                NAME          TYPE       VALUE
                ----          ----       -----
                X             float      6.781006    
                Y             float      11.715088   
Submitted job 1 for tasks(s) y_at_least_x, y_at_most_10, branin with local scheduler (process id: 56519).
Current time: 2015-08-02 17:55:27
Status: 1 pending, 0 complete.
ID(s) of pending job(s) for y_at_least_x, y_at_most_10, branin: 1
Waiting for results...

Fitting GP to 0 data for y_at_least_x task...
Fitting GP to 1 data for NaN task...
Fitting GP to 0 data for y_at_most_10 task...
Fitting GP to 0 data for branin task...
Computing current best...

No feasible solution found (yet).

Maximum total probability of satisfying constraints = 0.05659
  Probability of satisfying       y_at_least_x constraint: 0.33181
  Probability of satisfying                NaN constraint: 0.49298
  Probability of satisfying       y_at_most_10 constraint: 0.34597

At location:    
                NAME          TYPE       VALUE
                ----          ----       -----
                X             float      6.781006    
                Y             float      11.715088   
Traceback (most recent call last):
  File "spearmint/main.py", line 514, in <module>
    main()
  File "spearmint/main.py", line 339, in main
    recommendation = chooser.best()
  File "/Users/##/Spearmint/spearmint/choosers/default_chooser.py", line 820, in best
    val_o, loc_o = self.bestObservedConstrained()
  File "/Users/##/Spearmint/spearmint/choosers/default_chooser.py", line 1242, in bestObservedConstrained
    all_constraints_satisfied = np.all([self.constraintSatisfiedAtObservedInputs(c) for c in self.constraints], axis=0)
  File "/Users/##/Spearmint/spearmint/choosers/default_chooser.py", line 1295, in constraintSatisfiedAtObservedInputs
    sat = values/float(model.options['binomial_trials']) >= model._one_minus_epsilon
KeyError: 'binomial_trials'

Licence

Hey @JasperSnoek,

what's the story behind this licence? Are we allowed to Fork it (or are all 54 Forks illegal)?
Please answer me by email (jan.gleixner (gmail)) if a private answer can be more informative.

best, Jan

parsing experiment dir fails

Using python 2.7.8 (default, Oct 18 2014, 12:50:18) [GCC 4.9.1],

I found that os.path.realpath() does not expand the '~' character if the path is specified as (e.g.) '~/path/to/experiments/'; the result is:

'/home/lee/~/projects/Spearmint/examples'

I suggest that os.path.realpath() should wrap os.path.expanduser(), as in the following example.

os.path.realpath(os.path.expanduser('~/projects/Spearmint/examples/'))
'/home/lee/projects/Spearmint/examples'

I'll create a PR

which MongoDB version

Could you tell me, which version of MongoDB is needed? Is 2.4 enough, or is the latest to be used?

Spearmint on Windows

I was wondering if anybody has some experience in successfully installing spearmint on a windows machine.

Python 2.7
Windows 10
PyCharm

Thank you

Incorrect use of local scheduler? (i.e. bad while loop?)

On a test problem I've set up to get spearmint working I get the following output:

(SPEARMINT_ENV)dscott@rclogin10:/n/moorcroftfs4/dscott/spearmint/spearmint=>python main.py /n/moorcroftfs4/dscott/runfiles/smnt/
Using database at localhost.
Getting suggestion...

Suggestion:     NAME          TYPE       VALUE
                ----          ----       -----
                y             float      3.000000
                x             float      0.000000
Submitted job 1 with local scheduler (process id: 32161).
Status: 1 pending, 0 complete.

Getting suggestion...

Suggestion:     NAME          TYPE       VALUE
                ----          ----       -----
                y             float      3.000000
                x             float      0.000000
Submitted job 2 with local scheduler (process id: 32241).
Status: 1 pending, 0 complete.

...

Getting suggestion...

Suggestion:     NAME          TYPE       VALUE
                ----          ----       -----
                y             float      3.000000
                x             float      0.000000
Submitted job 35 with local scheduler (process id: 2208).
Status: 1 pending, 0 complete.

^CTraceback (most recent call last):
  File "main.py", line 494, in <module>
    main()
  File "main.py", line 309, in main
    time.sleep(options.get('polling-time', 5))
KeyboardInterrupt

The config file looks like this:

{
    "language"        : "PYTHON",
    "experiment-name" : "smnt-test",
    "polling-time"    : 1,
    "resources" : {
        "my-machine" : {
            "scheduler"         : "local",
            "max-concurrent"    : 1,
            "max-finished-jobs" : 3
        }
    },
    "tasks": {
        "job_wrap" : {
            "type"       : "OBJECTIVE",
            "likelihood" : "NOISELESS",
            "main-file"  : "job_wrap",
            "resources"  : ["my-machine"]
        }
    },
    "variables": {
        "x" : {
            "type" : "FLOAT",
            "size" : 1,
            "min"  : 0,
            "max"  : 5
        },
        "y" : {
            "type" : "FLOAT",
            "size" : 1,
            "min"  : 3,
            "max"  : 7
        }
    }
}

Does anyone know what I might do about this?

Spearmint with starcluster

I am trying to use Spearmint on an Amazon cluster, but I am facing a problem that I can't fix.
I am using the example branin.py that I found in the distributed directory.
I changed SLURM to SGE in the configuration file and -j to -j y. I compiled Spearmint and launched the example. I am using a test cluster with two nodes.

qstat -f
queuename qtype resv/used/tot. load_avg arch states

all.q@master BIP 0/0/2 0.01 linux-x64

all.q@node001 BIP 0/0/2 0.01 linux-x64

PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
195 0.55500 branin-dis root Eqw 11/30/2015 20:42:10 1
197 0.55500 branin-dis root Eqw 11/30/2015 20:42:11 1
198 0.55500 branin-dis root Eqw 11/30/2015 20:42:19 1

As you can see, all jobs on node001 terminate in an error.
When I look at one in a little more detail, I see:

qacct -j 198

qname all.q
hostname node001
group root
owner root
project NONE
department defaultdepartment
jobname branin-distributed-example-00000005
jobnumber 198
taskid undefined
account sge
priority 0
qsub_time Mon Nov 30 20:42:19 2015
start_time -/-
end_time -/-
granted_pe NONE
slots 1
failed 26 : opening input/output file
exit_status 0
...
...

The input/output error is due to the fact that Spearmint is trying to read and write on the node001 filesystem. So I tried to change SGE.py, according to the documentation, to:

return 'qsub -S /bin/bash -e master:%s -o master:%s -j y -N %s' % (output_file, output_file, job_name)

where I specify the node on which to read and write.
The code still does not work. Any ideas?

Add danielherandezlobato

@danielhernandezlobato, you need to agree to the Contributors License Agreement to contribute to the codebase (https://github.com/HIPS/Spearmint/blob/master/CONTRIBUTING.rst). Please indicate whether you agree as a comment on this issue.

Hanging after "Using database at localhost."

After I pressed Ctrl-C in the middle of a job, I restarted Spearmint to continue. Unfortunately, Spearmint hangs after "Using database at localhost.". With a little debugging, I noticed that it is not entering the loop "while resource.acceptingJobs(jobs):".

Is this due to MongoDB being hung up on a particular job?
Some insight would be much appreciated :)

Printing breaks with 'enum' type

This print

Spearmint/spearmint/tasks/base_task.py

Line 260 in 92d6b5d

for i in xrange(len(param['values'])):

fails when the variable is of type 'enum' and it gets a string rather than a float.
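
A minimal sketch of one way the printing could tolerate both cases, assuming the loop formats each entry of param['values'] as a float. The helper name format_value and the sample dictionaries are hypothetical and not part of Spearmint:

```python
def format_value(value):
    """Format a parameter value for printing.

    Numbers are shown with six decimals; anything else (for example
    an enum option given as a string) is shown as-is via str().
    """
    # bool is a subclass of int, so exclude it from the numeric branch.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return '%f' % value
    return str(value)


# Hypothetical parameter dictionaries mimicking the printed table.
for param in ({'type': 'float', 'values': [3.0]},
              {'type': 'enum', 'values': ['17a']}):
    for i in range(len(param['values'])):
        print('%-6s %s' % (param['type'], format_value(param['values'][i])))
```

Branching on the value's type rather than on the declared variable type keeps the sketch independent of how the task stores its variable metadata.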

output files

Spearmint is creating the output/0000*.out files but not writing anything to them. It is creating the job files and trace files with content, and it also writes the best-results file.

What might be causing this? I am using the following command to run it:

nohup python main.py --driver=local --method=GPEIOptChooser --method-args=noiseless=1 config.pb --method-args=use_multiprocessing=0 &

MPES and ParEGO?

Hello!

I just landed on this repository by following the link provided on "Predictive Entropy Search for Multi-objective Bayesian Optimization" (here).
The paper, very cool by the way, links directly to this spearmint repository and mentions that both MPES and ParEGO were implemented in it.

I just can't find any reference to these acquisition functions in the code. Am I looking at the wrong branch?

Thank you!

ERROR: child process failed, exited with error number 100

Dear all
I'm trying to test branin.py on the cluster (on my laptop everything works), and I get the following error when I try to start MongoDB with the command "mongod --fork --logpath --dbpath":

ERROR: child process failed, exited with error number 100

I should mention that when I type "mongo" on the cluster, it says:

MongoDB shell version: 2.4.9
connecting to: test
Server has startup warnings:
Mon May 11 13:09:59.173 [initandlisten]
Mon May 11 13:09:59.173 [initandlisten] ** WARNING: You are running on a NUMA machine.
Mon May 11 13:09:59.173 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:
Mon May 11 13:09:59.173 [initandlisten] ** numactl --interleave=all mongod [other options]
Mon May 11 13:09:59.173 [initandlisten]

So I tried starting MongoDB with the command "numactl --interleave=all mongod --fork --logpath --dbpath", but saw no difference in the output.

Thanks,
Mohammad
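
In the command as quoted, --logpath and --dbpath are given without values. One possible cause, offered as an assumption rather than a confirmed diagnosis, is that mongod then treats "--dbpath" as the log file name and falls back to its default data directory. The sketch below only builds a command line with both paths spelled out; the function name mongod_command and the paths 'mongo.log' and 'db' are placeholders, not Spearmint or MongoDB defaults:

```python
import os


def mongod_command(log_file, db_dir, use_numactl=False):
    """Build a mongod command line with explicit log and data paths.

    The data directory must already exist when mongod starts; this
    function does not create it.
    """
    cmd = ['mongod', '--fork',
           '--logpath', os.path.abspath(log_file),
           '--dbpath', os.path.abspath(db_dir)]
    if use_numactl:
        # Prefix suggested by the NUMA startup warning.
        cmd = ['numactl', '--interleave=all'] + cmd
    return cmd


# Print the command instead of running it, so the sketch has no side effects.
print(' '.join(mongod_command('mongo.log', 'db', use_numactl=True)))
```

The returned list can be passed to subprocess.check_call once the data directory has been created.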

Providing an initial guess

Is there any way to provide an initial guess of hyperparameters to Spearmint?

Broken Jobs on Distributed Example Using Starcluster

Hi,
I am trying to get the distributed example working on an AWS cluster that I started with StarCluster.
I'm using the StarCluster Ubuntu 12.04 AMI and have installed numpy, MongoDB, pymongo and Spearmint on both nodes. I am using SGE, which seems to work fine for the simple jobs I've tried on it. The simple Spearmint example runs correctly on both nodes.
I also changed the line of code mentioned by peterjsadowski, from -j to -j y in the qsub call in SGE.py.
I created the log file and db directory and ran
mongod --fork --logpath log --dbpath db
on the master node.

In the distributed example I changed "SLURM" to "SGE".
When I run the example on the master node, all jobs that run on the master node seem to execute fine; however, the jobs that are supposed to run on the other node give an error like:
EXC: < class 'drmaa.errors.InvalidJobException' >
Could not find job for rocess id 335
Broken job 85 detected.

Also, all the output files for the failed jobs are blank, while the rest of the output looks fine.

I tried running the example on the other node instead, and the problem is just reversed, with a similar error on master node jobs and the rest working fine.

Any idea what is going on here?
Thanks

cluster_scheduler does not use hostname as database address

When running the distributed example, an exception is raised for all tasks run on nodes: Exception: Could not establish a connection to MongoDB.

Looking at cluster_scheduler.py, it gets the proper hostname but doesn't actually use it. I'm assuming then that the nodes try to use a db in their own localhost and can't find one.

MixedGP not in the repository

simple_chooser and test_mixed_gp.py require the module MixedGP, which is not in the repository.

Simple Case of 1 Optimization Variable

Hi,

I think things might be broken when you are only trying to optimize over a single variable with dimension 1.

In base_task.py's paramify, data_vector is a float if there is only one optimization variable, rather than a numpy array containing a float, which causes Exception('Input to paramify must be a 1-D array.') to be raised. It can be solved by wrapping it in a numpy array:

if data_vector.ndim != 1:
    data_vector = np.array([data_vector])

error: package directory 'spearmint/visualizations' does not exist

When I try to do pip install -e folder it fails with the error:

error: package directory 'spearmint/visualizations' does not exist

After deleting this dependency from the setup.py the install procedure completed without errors.

int_to_unit and float_to_unit break if max == min

In spearmint/tasks/base_task.py, the functions int_to_unit and float_to_unit assume that vmax > vmin. When running experiments with trivial parameter ranges (vmax == vmin), this breaks, whereas I think the reasonable behavior is to always choose the one value.

Spearmint/spearmint/tasks/base_task.py

Line 372 in 92d6b5d

def int_to_unit(self, v, vmin, vmax):
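
A minimal sketch of the suggested behavior, written as standalone functions rather than the methods in base_task.py. The bodies are an assumption about what such a mapping looks like (a linear rescaling to [0, 1]), and unit_to_float is a hypothetical name for the inverse:

```python
def float_to_unit(v, vmin, vmax):
    """Map v from [vmin, vmax] to [0, 1], tolerating vmax == vmin."""
    if vmax == vmin:
        # Degenerate range: only one value is possible, so any unit
        # coordinate is valid; 0.0 is an arbitrary choice.
        return 0.0
    return (float(v) - vmin) / (vmax - vmin)


def unit_to_float(u, vmin, vmax):
    """Inverse mapping; returns vmin when the range is degenerate."""
    return vmin + u * (vmax - vmin)


print(float_to_unit(2.5, 0, 5))   # ordinary range
print(float_to_unit(3, 3, 3))     # trivial range, no division by zero
print(unit_to_float(0.0, 3, 3))   # maps back to the single allowed value
```

With this guard the inverse needs no special case, because multiplying by a zero-width range always returns vmin.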

simple_chooser is broken

simple_chooser attempts to retrieve 'normalized_data_dict' from a TaskGroup object.
(line 255 in simple_chooser.py)
I presume simple_chooser was working only with the simple_task which is now deprecated.

Spearmint with SQLite for Quick Deployment in Jupyter Notebooks

I'd love to use Spearmint (it is an awesome package!) but often the overhead of setting up a MongoDB server and porting my Jupyter Notebook into a runnable Python script is too much. Is there any way to run Spearmint in a similar fashion to hyperopt? Maybe with SQLite as a backend?

should exit when max-finished-jobs reached

I think this is reasonable behavior to expect from the program. (It didn't exit when I resumed my runs when they reached max-finished-jobs.)

BetaWarp UserWarning: BetaWarp encountered negative values

Spearmint is issuing the following warning 1-20 times between suggestions:
/.../spearmint/transformations/beta_warp.py:206: UserWarning: BetaWarp encountered negative values: [ -1.08420217e-19]
warnings.warn('BetaWarp encountered negative values: %s' % inputs[inputs<0])

My variables are 2 ints in [0,63] and an enum, and the objective function returns a float in [-100000, 0].

Am I doing something wrong?

Python 2.7.8 on Ubuntu, Spearmint master@c97959a (latest)

Config file:
{
    "experiment-name" : "main",
    "resources" : {
        "local" : {
            "scheduler" : "local",
            "max-concurrent" : 5,
            "max-finished-jobs" : 100
        }
    },
    "tasks": {
        "asc" : {
            "type" : "OBJECTIVE",
            "likelihood" : "GAUSSIAN",
            "main-file" : "main.py",
            "language" : "PYTHON",
            "resources" : ["local"]
        }
    },
    "variables" : {
        "a" : {
            "type" : "ENUM",
            "size" : 1,
            "options" : ["17a", "17f", "184", "18a", "18d", "190", "193", "19a", "19b", "19f", "1a2", "1a5", "1a9", "1ac", "1b3", "1b4", "1b8", "1bb", "1be", "1c0", "1c3", "1ca", "1cb", "1cf", "1d2", "1d5", "1d9", "1db", "1de", "1e5", "1e6", "1ea", "1ed", "1f0", "1f3", "1f6", "1fa", "1fd", "200", "202", "205", "208", "20b", "210", "215", "21a", "21f", "221", "225", "229", "22b", "22f", "233", "235", "239", "23f", "243", "249"]
        },
        "b" : {
            "type" : "INT",
            "size" : 1,
            "min" : 0,
            "max" : 63
        },
        "c" : {
            "type" : "INT",
            "size" : 1,
            "min" : 8,
            "max" : 63
        }
    }
}
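
The value reported in the BetaWarp warning, about -1e-19, is small enough to look like floating-point round-off rather than a genuinely out-of-range input; that reading is an assumption and is not confirmed here. A minimal sketch of clamping such values before they reach a transformation that expects inputs in [0, 1], where clip_to_unit and its tolerance are hypothetical and not part of beta_warp.py:

```python
def clip_to_unit(values, tol=1e-9):
    """Clamp values to [0, 1], rejecting anything further out than tol."""
    clipped = []
    for v in values:
        if v < -tol or v > 1.0 + tol:
            # A large excursion is a real error, not round-off.
            raise ValueError('value %r is outside [0, 1]' % v)
        clipped.append(min(1.0, max(0.0, v)))
    return clipped


print(clip_to_unit([-1.08420217e-19, 0.5, 1.0]))
```

Keeping the tolerance explicit means a real scaling bug still raises an error instead of being silently clamped.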

minimize error for joint modeling task

I am trying to use Spearmint to find the optimal hyper-parameters that minimize the error for a joint modeling task. Since there are two tasks, there are two errors, but I am supposed to return a single scalar from the callable function. I don't understand how to make this work. Kindly help.
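
One common way to get a single scalar from two errors is a weighted sum. The sketch below assumes an entry point of the form main(job_id, params), as used in the Spearmint examples; joint_error, evaluate_tasks, the weight, and the toy error formulas are all hypothetical placeholders for the real joint model:

```python
def joint_error(error_a, error_b, weight_a=0.5):
    """Weighted sum of two task errors; weight_a is in [0, 1]."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError('weight_a must be between 0 and 1')
    return weight_a * error_a + (1.0 - weight_a) * error_b


def evaluate_tasks(params):
    """Placeholder for training the joint model; returns two errors."""
    x = params['x'][0]
    return (x - 1.0) ** 2, (x + 1.0) ** 2


def main(job_id, params):
    # Entry point in the style of the Spearmint examples: one scalar out.
    error_a, error_b = evaluate_tasks(params)
    return joint_error(error_a, error_b)


print(main(0, {'x': [0.0]}))
```

The weight encodes how much each task matters. If the two errors are on different scales, they would need normalizing before being summed.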