
pygpgo's Introduction

pyGPGO: Bayesian Optimization for Python


pyGPGO is a simple and modular Python (>3.5) package for Bayesian optimization.

Bayesian optimization is a framework that can be used in situations where:

  • Your objective function may not have a closed form (e.g. it is the result of a simulation).
  • No gradient information is available.
  • Function evaluations may be noisy.
  • Evaluations are expensive (time- or cost-wise).

Installation

Retrieve the latest stable release from PyPI:

pip install pyGPGO

Or, if you're feeling adventurous, retrieve the latest development version from this repo:

pip install git+https://github.com/hawk31/pyGPGO

Check our documentation at http://pygpgo.readthedocs.io/.

Features

  • Different surrogate models: Gaussian Processes, Student-t Processes, Random Forests, Gradient Boosting Machines (a short sketch of swapping these follows this list).
  • Type II maximum-likelihood estimation of covariance function hyperparameters.
  • MCMC sampling for fully Bayesian inference of hyperparameters (via pyMC3).
  • Integrated acquisition functions.
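
Swapping these components only changes the objects passed to GPGO. A minimal sketch, assuming the same objective f and param dictionary defined in the example below, and using the RandomForest surrogate and UCB acquisition that appear elsewhere on this page:

from pyGPGO.surrogates.RandomForest import RandomForest
from pyGPGO.acquisition import Acquisition
from pyGPGO.GPGO import GPGO

# Random Forest surrogate instead of a Gaussian Process, and UCB instead
# of Expected Improvement; f and param are as in the example below.
rf = RandomForest()
acq_ucb = Acquisition(mode='UCB')
gpgo_rf = GPGO(rf, acq_ucb, f, param)
gpgo_rf.run(max_iter=10)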

A small example!

The user only has to define a function to maximize and a dictionary specifying the input space.

import numpy as np
from pyGPGO.covfunc import matern32
from pyGPGO.acquisition import Acquisition
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.GPGO import GPGO


def f(x, y):
    # Franke's function (https://www.mathworks.com/help/curvefit/franke.html)
    one = 0.75 * np.exp(-(9 * x - 2) ** 2 / 4 - (9 * y - 2) ** 2 / 4)
    two = 0.75 * np.exp(-(9 * x + 1) ** 2 / 49 - (9 * y + 1) / 10)
    three = 0.5 * np.exp(-(9 * x - 7) ** 2 / 4 - (9 * y - 3) ** 2 / 4)
    four = 0.25 * np.exp(-(9 * x - 4) ** 2 - (9 * y - 7) ** 2)
    return one + two + three - four

cov = matern32()
gp = GaussianProcess(cov)
acq = Acquisition(mode='ExpectedImprovement')
param = {'x': ('cont', [0, 1]),
         'y': ('cont', [0, 1])}

np.random.seed(1337)
gpgo = GPGO(gp, acq, f, param)
gpgo.run(max_iter=10)

Check the tutorials and examples folders for more ideas on how to use the software.

Citation

If you use pyGPGO in academic work, please cite:

Jiménez, J., & Ginebra, J. (2017). pyGPGO: Bayesian Optimization for Python. The Journal of Open Source Software, 2, 431.

pygpgo's People

Contributors

dfm, josejimenezluna, saizor


pygpgo's Issues

Adding to Conda Forge

Hi!

We are adding your code to DeepChem.
In order to get a single conda installer we have to be able to install pyGPGO via conda.
Do you have any issues with us creating a conda install on conda-forge?

Afterwards, you would be able to install pyGPGO via

conda install -c conda-forge pyGPGO

Range of int parameters not considered as expected

Hello,
I use the package (v0.5.1) to tune hyperparameters in deep learning. I noticed that for parameters defined as integer values, the last value is never considered. I think this is caused by the function _sampleParam in GPGO.py:

def _sampleParam(self):
    d = OrderedDict()
    for index, param in enumerate(self.parameter_key):
        if self.parameter_type[index] == 'int':
            d[param] = np.random.randint(
                self.parameter_range[index][0], self.parameter_range[index][1])
        elif self.parameter_type[index] == 'cont':
            d[param] = np.random.uniform(
                self.parameter_range[index][0], self.parameter_range[index][1])
        else:
            raise ValueError('Unsupported variable type.')
    return d

This is inconvenient in particular when I use an integer variable to select activation functions, since the last one is never evaluated. For now I just put a dummy function as the last element of my vector. I think it should be:
d[param] = np.random.randint(
    self.parameter_range[index][0], self.parameter_range[index][1] + 1)
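
For context (not part of the original issue), NumPy's randint indeed treats the upper bound as exclusive, which is why the +1 above is needed:

import numpy as np

# np.random.randint samples from the half-open interval [low, high),
# so the upper bound itself is never drawn.
print(np.random.randint(2, 5, size=1000).max())      # prints 4; 5 is never returned
print(np.random.randint(2, 5 + 1, size=1000).max())  # 5 is now reachable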

Notebook jupyter

@Hawk31 Wonderful library, but it would be better if the given examples were also provided as Jupyter notebooks. They are more immediate and reproducible than plain .py examples, and GitHub renders them by default.

test_tSP_opt_nograd is flaky when all seed-setting code is removed

Introduction

test_tSP_opt_nograd in tests/test_surrogates.py seems to be flaky when all seed-setting code (e.g. np.random.seed(0) or tf.random.set_seed(0)) is commented out.

In commit 295a7b1, test_tSP_opt_nograd failed ~5% of the time (out of 500 runs) compared to 0% of the time (out of 500 runs) when no seed-setting code is removed.

Motivation

Some tests can be flaky with high failure rates, but are not discovered when the seeds are set. We are trying to stabilize such tests.

Environment

The tests were run using pytest 6.2.4 in a conda environment with Python 3.6.13. The OS used was Ubuntu 16.04.

Possible Solution and Discussion

A possible solution is to change the values used in the assertions. Increasing the upper bound of the assertion on line 69 (i.e. the assertion that checks the value of params['l']) from 0.5 to 0.6 and the upper bound of the assertion on line 70 (i.e. the assertion that checks the value of params['sigmaf']) from 0.6 to 0.7 reduced flakiness to ~3%.

Increasing the upper bounds of the assertions did not increase runtimes significantly.

Several of the failures that occurred seemed to be due to params['l'] having a value of 0.0001, which may indicate a bug. Additionally, params['sigmaf'] had an abnormally high value of ~0.94 on one of the runs, which may also indicate a bug.

Please let us know if these solutions are feasible or if there are any other solutions that should be incorporated. We will be happy to raise a Pull Request to fix the tests and incorporate any feedback that you may have.

Covariance function bounds being overwritten

This line discards user-defined bounds on the covariance function.

In other words, if I do:

covfcn_bounds = {
	"l": [1e-4, 2],
	"sigmaf": [1e-4, 4],
	"sigman": [1e-6, 4]
}
covfcn_bounds = [v for v in covfcn_bounds.values()]
sexp = squaredExponential(bounds=covfcn_bounds)

Those custom bounds will be lost in the first call to _lmlik. The same happens in this line and this line. Other files are also affected (tStudentProcess, GaussianProcessMCMC, and tStudentProcessMCMC).

Easy way to add pre-trained GPs (or custom init parameters)

Hi,

Just want to first of all thank you for the awesome package; it's already being helpful in tuning models (xgboost and random forests).

Being able to add some custom init parameters would be awesome as a way of better incorporating domain knowledge. E.g., when tuning random forests I would try limiting the tree depth to 5, 10, and n_features, while also limiting the mtry parameter to sqrt(n_features), n_features/3, and n_features. Running a quick grid search first and then GPGO to fine-tune would be the ideal workflow.

For anyone else reading this, there is a gist by @Hawk31 showing how to do this manually: https://gist.github.com/hawk31/ed222c4cf6b21cbd7d4b5186f3f132b5

Thanks!

SVM optimization with GPGO: objective function not improving after several iterations, across different configurations of acquisition function and surrogate model

from pyGPGO.covfunc import squaredExponential
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.surrogates.RandomForest import RandomForest
from pyGPGO.GPGO import GPGO
from pyGPGO.acquisition import Acquisition

from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split  # missing from the original snippet

from sklearn.datasets import load_wine
import collections
df = load_wine()

collections.Counter(df['target'])
# `seed` is not defined in the original snippet
xtrain, xtest, ytrain, ytest = train_test_split(df['data'], df['target'], test_size=0.24, random_state=seed)

def accuracy_SVC(C, coef0, gamma):
    clf = svm.SVC(C=C, coef0=coef0, gamma=gamma, kernel='sigmoid')
    scores = cross_val_score(clf, xtrain, ytrain, cv=10)
    return scores.mean()

from pyGPGO.covfunc import matern32
from pyGPGO.covfunc import gammaExponential

covf = gammaExponential()
covf2 = matern32()

GP = GaussianProcess(covf)
RF = RandomForest()

acq = Acquisition(mode='ProbabilityImprovement')
acq3 = Acquisition(mode='UCB')

import numpy as np

# `param` (the search-space dictionary) is not defined in the original snippet
bo = GPGO(GP, acq, accuracy_SVC, param)
bo.run(init_evals=30, max_iter=120)

bo6 = GPGO(RF, acq3, accuracy_SVC, param)
bo6.run(init_evals=30, max_iter=120)

[JOSS]: More prose & clearer examples in online docs

I'm reviewing the JOSS submission at openjournals/joss-reviews#41 and most things are looking good, but I'll open a few issues here.

To be consistent with the JOSS requirements, the docs are going to need some work. At the moment, the documentation has a brief statement of need, a single example, and then API docs. More discussion should be included on the readthedocs page to make it easier to get started. This should include things like one or two tutorials (the examples directory is not sufficient - these examples don't even include any comments!), a page outlining the installation procedure, and a more detailed description of the available options.

Another thing that might be good to include is a comparison to some of the other existing Bayesian optimization packages that are already available in Python: fmfn/BayesianOptimization, GPyOpt, and skopt to name a few.

Add support or documentation for noisy vs. noiseless objective functions

Other Bayesian optimization libraries I have used allow users to specify whether evaluation of the objective function is exact or noisy. These two scenarios should be treated differently in the optimization. I haven't seen anything in pyGPGO's documentation about this distinction. If it is not supported, it would be useful to add support for it, and in the meantime make clear which of the two noise scenarios is supported. If both are supported and the option is just not documented, it would help to add documentation.

Error when running

I get the following error randomly when running with ExpectedImprovement acquisition:
gpgo.run(max_iter=32, resume=True)

Error:

C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\GPGO.py:109: RuntimeWarning: invalid value encountered in sqrt
  new_std = np.sqrt(new_var + 1e-6)
C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\covfunc.py:51: RuntimeWarning: invalid value encountered in less
  return cdist(X, Xstar) < np.finfo(np.float32).eps
Traceback (most recent call last):
  File "D:\Dropbox\Dev\Python\MultiDDM\TeaLab\run_MultiDDM.py", line 243, in <module>
    gpgo.run(max_iter=explore_schedule[sched], resume=True)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\GPGO.py", line 191, in run
    self._optimizeAcq()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\GPGO.py", line 131, in _optimizeAcq
    bounds=self.parameter_range)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\_minimize.py", line 601, in minimize
    callback=callback, **options)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 335, in _minimize_lbfgsb
    f, g = func_and_grad(x)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\lbfgsb.py", line 280, in func_and_grad
    f = fun(x, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\optimize.py", line 300, in function_wrapper
    return function(*(wrapper_args + args))
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\GPGO.py", line 108, in _acqWrapper
    new_mean, new_var = self.GP.predict(xnew, return_std=True)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\surrogates\GaussianProcess.py", line 222, in predict
    v = solve(self.L, kstar.T)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\basic.py", line 140, in solve
    b1 = atleast_1d(_asarray_validated(b, check_finite=check_finite))
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\_lib\_util.py", line 239, in _asarray_validated
    a = toarray(a)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 498, in asarray_chkfinite
    "array must not contain infs or NaNs")

I do not know how to reproduce the behavior since it works fine most of the time.

Support for constrained optimization

Hello there,

first of all, I would like to thank you for the package.

Secondly, I have a request. Many real-world optimization problems of interest have constraints which are unknown a priori. It would be very cool if this package could implement constrained Bayesian optimization as reported in this paper: https://arxiv.org/abs/1403.5607

Essentially, one would define probabilistic constraints of the sort:

[image: probabilistic constraint of the form Pr(c(x) ≥ 0) ≥ 1 − δ]

And place a GP surrogate on each constraint. These are updated in each evaluation of the target function as well.

Best,
Miha
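
For illustration (an editorial sketch, not from this issue and not part of pyGPGO's API), the probability-of-feasibility weight described above could be computed from a GP surrogate fitted to constraint observations. Note that pyGPGO's GaussianProcess.predict(..., return_std=True) returns the posterior mean and variance, as seen in the traceback earlier on this page:

import numpy as np
from scipy.stats import norm

def prob_feasible(constraint_gp, x, delta=0.05):
    # Posterior mean/variance of the constraint c(x) under a fitted GP surrogate.
    mean, var = constraint_gp.predict(np.atleast_2d(x), return_std=True)
    std = np.sqrt(var + 1e-6)
    # Pr(c(x) >= 0); the point is deemed feasible if this exceeds 1 - delta.
    p = 1.0 - norm.cdf(0.0, loc=mean, scale=std)
    return p, p >= 1.0 - delta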

[JOSS]: Test coverage

Another comment for the JOSS review: openjournals/joss-reviews#41

I'm a bit concerned about the test coverage. There are tests that are being run on Travis (this is great!) but, as far as I can tell, only three different options are being tested even though many options seem to be available. I'd love to see the coverage report and it might be worth putting some thought into a broader test suite.

Parameters specified as "int" are returned as floats

Hi,

first - thanks for the nice library!

When I specify a parameter to be an integer, gpgo.run will pass the exploration values as floats, which causes problems with some libraries (e.g. lightgbm). Interestingly, it only does so in the exploration phase, not in the initial phase. Is there a nice way around this?

version: 0.4.0.dev1
system: ubuntu
python: anaconda 3.6

example:
from pyGPGO.covfunc import squaredExponential
from pyGPGO.acquisition import Acquisition
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.GPGO import GPGO
import multiprocessing

def func(**params):
    for k, v in params.items():
        print(" ", k, v)
    return 0.

param_sets = {
    "learning_rate": ("cont", [0.01, 2]),
    "num_leaves": ("int", [2, 100])
}

max_iter = 2; init_evals = 2
sexp = squaredExponential()
gp = GaussianProcess(sexp)
acq = Acquisition("UCB")
gpgo = GPGO(gp, acq, func, param_sets, n_jobs=multiprocessing.cpu_count())

gpgo.run(max_iter, init_evals)

output:
Evaluation Proposed point Current eval. Best eval.
learning_rate 1.5251204907602136
num_leaves 64
learning_rate 0.21636593353219993
num_leaves 30
init [ 1.52512 64. ]. 0.0 0.0
init [ 0.21637 30. ]. 0.0 0.0
learning_rate 0.22181394115246836
num_leaves 7.0
1 [0.22181 7. ]. 0.0 0.0
learning_rate 1.8110339488370346
num_leaves 36.0
2 [ 1.81103 36. ]. 0.0 0.0
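
A possible workaround (an editorial suggestion, not from the issue) is to round and cast integer-typed parameters inside the objective before handing them to the downstream library:

def func(**params):
    # gpgo.run may pass 'int' parameters as floats during the exploration
    # phase, so round and cast them explicitly before use.
    num_leaves = int(round(params["num_leaves"]))
    learning_rate = params["learning_rate"]
    print(" ", "learning_rate", learning_rate)
    print(" ", "num_leaves", num_leaves)
    return 0.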

Missing hyperparameter in matern52

Hi Hawk31,

I have noticed that the norm r in matern52 is not divided by the length-scale l (hyperparameter) everywhere. This results in a kernel that is not always positive definite (when l is less than 1), which in turn causes the Cholesky decomposition to crash. Here is an example where this happens:

from pyGPGO.covfunc import matern52
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
import numpy as np

np.random.seed(0)
X = np.random.uniform((10, 10), size=(10, 2))
y = np.random.uniform(10, size=10)
matern_52 = matern52(l=0.5)
gp = GaussianProcess(matern_52)
gp.fit(X, y)

I have made an update of matern52 and included an extra test that checks whether the eigenvalues are greater than zero for different generated kernels and different data (and thereby that the kernel is positive definite). However, the test does not include the expSine kernel because that kernel usually crashes. I am not even sure that expSine is a positive definite function(?).

Best Regards,
Saizor

GP.fit() error when K not PD

I have an issue with GP.fit() when self.covfunc.K(self.X, self.X) returns a K that is not positive definite.

I suspect that it happens when the same gpgo.best is selected several times in a row, but I am not sure. Is there a way to handle this?

Traceback:

 File "<ipython-input-219-e67b7c4ee454>", line 20, in <module>
    gpgo.updateGP()

  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\GPGO.py", line 150, in updateGP
    self.GP.update(np.atleast_2d(self.best), np.atleast_1d(f_new))

  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\surrogates\GaussianProcess.py", line 241, in update
    self.fit(X, y)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pyGPGO\surrogates\GaussianProcess.py", line 78, in fit
    self.L = cholesky(self.K).T

  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\decomp_cholesky.py", line 91, in cholesky
    check_finite=check_finite)

  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\linalg\decomp_cholesky.py", line 40, in _cholesky
    "definite" % info)

LinAlgError: 30-th leading minor of the array is not positive definite

Parallelization in GPGO.GPGO returns BrokenProcessPool error

Hello,

I am testing a number of Bayesian Optimization packages out for setting up a hyperparameter search, and had quite liked the setup of pyGPGO, but am running into an issue with parallelizing the code. I was using the tutorial from https://github.com/hawk31/pyGPGO/blob/master/tutorials/mlopt.ipynb as a reference.

When modifying gpgo = GPGO(gp, acq, evaluateModel, params) to gpgo = GPGO(gp, acq, evaluateModel, params, n_jobs=4) in the same example, I get a BrokenProcessPool error with both your sample code and my own similar example. The output after modifying the code as described is as follows.

I'm not really sure whether the underlying issue is with joblib.Parallel or with pyGPGO; any help would be greatly appreciated!

For reference I am running on MacOS with an Anaconda distribution of Python 3.6.

Evaluation 	 Proposed point 	  Current eval. 	 Best eval.
init   	 [ 3.28824672 -0.91429671]. 	  0.7398555220462545 	 0.7399868590570241
init   	 [-0.49380448 -0.48245949]. 	  0.7399868590570241 	 0.7399868590570241
init   	 [-3.75686293  1.00979526]. 	  0.7399868590570241 	 0.7399868590570241

exception calling callback for <Future at 0x1c3b6f2198 state=finished raised BrokenProcessPool>
Traceback (most recent call last):
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/_base.py", line 322, in _invoke_callbacks
    callback(self)
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 375, in __call__
    self.parallel.dispatch_next()
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 795, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 823, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 780, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 504, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/reusable_executor.py", line 151, in submit
    fn, *args, **kwargs)
  File "/Users/[user]/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 993, in submit
    raise BrokenProcessPool(self._flags.broken)
joblib.externals.loky.process_executor.BrokenProcessPool: A process in the executor was terminated abruptly, the pool is not usable anymore.

---------------------------------------------------------------------------
BrokenProcessPool                         Traceback (most recent call last)
<ipython-input-28-133aafb98ba8> in <module>()
     24 from pyGPGO.GPGO import GPGO
     25 gpgo = GPGO(gp, acq, evaluateModel, params, n_jobs=4)
---> 26 gpgo.run(max_iter = 20)
     27 
     28 

~/anaconda3/lib/python3.6/site-packages/pyGPGO/GPGO.py in run(self, max_iter, init_evals, resume)
    189             self.logger._printInit(self)
    190         for iteration in range(max_iter):
--> 191             self._optimizeAcq()
    192             self.updateGP()
    193             self.logger._printCurrent(self)

~/anaconda3/lib/python3.6/site-packages/pyGPGO/GPGO.py in _optimizeAcq(self, method, n_start)
    136                                                                  method='L-BFGS-B',
    137                                                                  bounds=self.parameter_range) for start_point in
--> 138                                                start_points_arr)
    139             x_best = np.array([res.x for res in opt])
    140             f_best = np.array([np.atleast_1d(res.fun)[0] for res in opt])

~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in __call__(self, iterable)
    992 
    993             with self._backend.retrieval_context():
--> 994                 self.retrieve()
    995             # Make sure that we get a last message telling us we are done
    996             elapsed_time = time.time() - self._start_time

~/anaconda3/lib/python3.6/site-packages/joblib/parallel.py in retrieve(self)
    895             try:
    896                 if getattr(self._backend, 'supports_timeout', False):
--> 897                     self._output.extend(job.get(timeout=self.timeout))
    898                 else:
    899                     self._output.extend(job.get())

~/anaconda3/lib/python3.6/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
    513         AsyncResults.get from multiprocessing."""
    514         try:
--> 515             return future.result(timeout=timeout)
    516         except LokyTimeoutError:
    517             raise TimeoutError()

~/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/_base.py in result(self, timeout)
    429                 raise CancelledError()
    430             elif self._state == FINISHED:
--> 431                 return self.__get_result()
    432             else:
    433                 raise TimeoutError()

~/anaconda3/lib/python3.6/site-packages/joblib/externals/loky/_base.py in __get_result(self)
    380     def __get_result(self):
    381         if self._exception:
--> 382             raise self._exception
    383         else:
    384             return self._result

BrokenProcessPool: A process in the executor was terminated abruptly while the future was running or pending.

Thank you,
Rob

Passing parameter arrays

I'm not seeing a way in the documentation to easily pass arrays into pyGPGO. The hackiest way I've thought of to handle this is as follows:

import numpy as np
from pyGPGO.covfunc import squaredExponential
from pyGPGO.acquisition import Acquisition
from pyGPGO.surrogates.GaussianProcess import GaussianProcess
from pyGPGO.GPGO import GPGO

param = {f'x_{i}': ('cont', [0, 1]) for i in range(3)}
param.update({f'y_{i}': ('cont', [0, 1]) for i in range(3)})

def f(p):
    v1 = np.array([p[f'x_{i}'] for i in range(3)])
    v2 = np.array([p[f'y_{i}'] for i in range(3)])
    return np.dot(v1,v2)


sexp = squaredExponential()
gp = GaussianProcess(sexp)
acq = Acquisition(mode='ExpectedImprovement')

np.random.seed(23)
gpgo = GPGO(gp, acq, f, param)
gpgo.run(max_iter=20)

However, the following error is returned:

Evaluation   Proposed point       Current eval.      Best eval.
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-49-f58f602dca8c> in <module>()
     26 np.random.seed(23)
     27 gpgo = GPGO(gp, acq, f, param)
---> 28 gpgo.run(max_iter=20)

1 frames
/usr/local/lib/python3.7/dist-packages/pyGPGO/GPGO.py in _firstRun(self, n_eval)
     86             s_param_val = list(s_param.values())
     87             self.X[i] = s_param_val
---> 88             self.y[i] = self.f(**s_param)
     89         self.GP.fit(self.X, self.y)
     90         self.tau = np.max(self.y)

TypeError: f() got an unexpected keyword argument 'x_1'
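
The traceback shows that GPGO calls the objective as self.f(**s_param), i.e. each scalar parameter is passed as a keyword argument. A minimal fix for the snippet above (an editorial sketch, not from the issue) is to accept **kwargs and rebuild the arrays inside the objective:

def f(**p):
    # Each x_i / y_i arrives as a separate keyword argument,
    # so reassemble the two vectors before computing the dot product.
    v1 = np.array([p[f'x_{i}'] for i in range(3)])
    v2 = np.array([p[f'y_{i}'] for i in range(3)])
    return np.dot(v1, v2)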

leading minor of the array is not positive definite

numpy.linalg.linalg.LinAlgError: 9-th leading minor of the array is not positive definite

I used the same code yesterday and it worked well, but the bug appears after I change the parameter boundaries.

Old params:

param = {
    'layer_1_size': ('int', [992, 1056]),
    'layer_2_size': ('int', [992, 1056]),
    'p_dropout': ('cont', [0.30, 0.70]),
    'learning_rate': ('cont', [0.200, 0.400]),
    'weight_decay': ('cont', [0.001, 0.010]),
}

New params:

param = {
    'layer_1_size': ('int', [1023, 1025]),
    'layer_2_size': ('int', [1023, 1025]),
    'p_dropout': ('cont', [0.45, 0.55]),
    'learning_rate': ('cont', [0.250, 0.350]),
    'weight_decay': ('cont', [0.001, 0.003]),
}

Changing DF/nu parameter from default=3.0

Hello hawk31,

I am probably making an obvious mistake, having unsuccessfully tried to change both the "df" and "nu=3.0" parameters in both 'acquisition.py' and 'tStudentProcess.py' files.

Can you quickly explain how I can change this parameter? I would like to see the impact of different DF values on the expected improvement utility function.

Many thanks,
ConorC

theano and cov.py warnings

I ran into these warnings when running the tutorials on Google Colab. Do these signal version incompatibility for some libraries?

Evaluation Proposed point Current eval. Best eval.
/usr/local/lib/python3.7/dist-packages/pymc3/gp/cov.py:97: UserWarning: Only 1 column(s) out of 2 are being used to compute the covariance function. If this is not intended, increase 'input_dim' parameter to the number of columns to use. Ignore otherwise.
UserWarning,
WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
Only 300 samples in chain.
Sequential sampling (2 chains in 1 job)
CompoundStep

Slice: [log_s2_n]
Slice: [log_s2_f]
Slice: [l]

100.00% [1300/1300 00:05<00:00 Sampling chain 0, 0 divergences]

100.00% [1300/1300 00:03<00:00 Sampling chain 1, 0 divergences]
Sampling 2 chains for 1_000 tune and 300 draw iterations (2_000 + 600 draws total) took 10 seconds.
The estimated number of effective samples is smaller than 200 for some parameters.
init [0.5881308 0.89771373]. 0.12055515791243088 0.35562010326493304
init [0.89153073 0.81583748]. 0.06525608955367107 0.35562010326493304
init [0.03588959 0.69175758]. 0.35562010326493304 0.35562010326493304
/usr/local/lib/python3.7/dist-packages/pymc3/gp/cov.py:97: UserWarning: Only 1 column(s) out of 2 are being used to compute the covariance function. If this is not intended, increase 'input_dim' parameter to the number of columns to use. Ignore otherwise.
UserWarning,
Only 300 samples in chain.
Sequential sampling (2 chains in 1 job)
CompoundStep

Slice: [log_s2_n]
Slice: [log_s2_f]
Slice: [l]

100.00% [1300/1300 00:04<00:00 Sampling chain 0, 0 divergences]

100.00% [1300/1300 00:04<00:00 Sampling chain 1, 0 divergences]
Sampling 2 chains for 1_000 tune and 300 draw iterations (2_000 + 600 draws total) took 8 seconds.
1 [0. 0.]. 0.7664205912849231 0.7664205912849231
/usr/local/lib/python3.7/dist-packages/pymc3/gp/cov.py:97: UserWarning: Only 1 column(s) out of 2 are being used to compute the covariance function. If this is not intended, increase 'input_dim' parameter to the number of columns to use. Ignore otherwise.
UserWarning,

predictive entropy search without sampling

Hi there,

Thanks for your package!
We noticed that the predictive entropy search in the file acquisition.py is different from the method introduced in the paper "Predictive Entropy Search for Efficient Global Optimization of Black-box Functions".

It seems the method in the package only calculates the approximation of H[p(y|Dn; x; x*)] and does not sample x* from the estimated posterior distribution and then average the approximated H[p(y|Dn; x; x*)] over those samples.
