lucfra / far-ho
Gradient-based hyperparameter optimization & meta-learning package for TensorFlow
License: MIT License
I have emailed Luca Franceschi about some issues with this library, and he asked me to share them here.
I've been working on an MLP and wanted to optimize the following parameters, but ran into some problems:
Best regards,
Nicolás Zorzano.
This is a nice package for HPO, but it seems to be built only on TensorFlow.
Just wondering: is there any version that supports PyTorch?
I think that would make the package more popular among HPO researchers!
I really don't get https://github.com/lucfra/FAR-HO/blob/master/far_ho/hyper_gradients.py#L351. Why is it needed?
Aren't we already getting the hyper-gradient at https://github.com/lucfra/FAR-HO/blob/master/far_ho/hyper_gradients.py#L349?
Also, please explain whether this step is really needed, as I don't see it explicitly stated in your paper. Thanks.
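(For context: judging from the compute_gradients snippet quoted in a traceback further down this page, the line in question computes hg = maybe_add(tf.reduce_sum(d_E_T), d_oo_d_hyp), i.e. it adds the direct partial derivative of the outer objective with respect to the hyperparameter on top of the chain-rule term. In forward mode the total derivative reads

\frac{dE}{d\lambda} = \sum_{s} \left\langle \frac{\partial E}{\partial s_T}, z_s \right\rangle + \frac{\partial E}{\partial \lambda}, \qquad z_s = \frac{\partial s_T}{\partial \lambda},

so the second term is nonzero whenever lambda appears in E directly and not only through the optimized state s_T. This is a reading of the code, not a confirmed statement of the paper's derivation.)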
When I run this notebook I get the following error:
First I am told that the parameter lambda is not even connected with the model, which does not make any sense.
Then it fails at the assertion at line 113 in file utils.py.
Please help. It would also be really helpful if you could provide an example notebook with ForwardHG for MNIST, both with and without online learning ...
Thanks in advance ...
Habib
//----------------------------------------------------------------------------------------------------------
Hyperparameter <tf.Variable 'lambda_components/0:0' shape=() dtype=float32_ref> is detached from this optimization dynamics.
AssertionError Traceback (most recent call last)
in ()
----> 1 ss, farho, cost, oo = _test(far.ForwardHG)
2
3 tf.global_variables_initializer().run()
4
5 # execution with gradient descent! (looks ok)
in _test(method)
25 optim_oo = tf.train.AdamOptimizer(.01)
26 farho = far.HyperOptimizer(hypergradient=method())
---> 27 farho.minimize(oo, optim_oo, cost, io_optim)
28 return ss, farho, cost, oo
/volume1/scratch/r0605927/backMeUpPlz/lib/python2.7/site-packages/far_ho/hyper_parameters.pyc in minimize(self, outer_objective, outer_objective_optimizer, inner_objective, inner_objective_optimizer, hyper_list, var_list, init_dynamics_dict, global_step, aggregation_fn, process_fn)
132 """
133 optim_dict = self.inner_problem(inner_objective, inner_objective_optimizer, var_list, init_dynamics_dict)
--> 134 self.outer_problem(outer_objective, optim_dict, outer_objective_optimizer, hyper_list, global_step)
135 return self.finalize(aggregation_fn=aggregation_fn, process_fn=process_fn)
136
/volume1/scratch/r0605927/backMeUpPlz/lib/python2.7/site-packages/far_ho/hyper_parameters.pyc in outer_problem(self, outer_objective, optim_dict, outer_objective_optimizer, hyper_list, global_step)
117 :return: itself
118 """
--> 119 hyper_list = self._hypergradient.compute_gradients(outer_objective, optim_dict, hyper_list=hyper_list)
120 self._h_optim_dict[outer_objective_optimizer].update(hyper_list)
121 self._global_step = global_step
/volume1/scratch/r0605927/backMeUpPlz/lib/python2.7/site-packages/far_ho/hyper_gradients.pyc in compute_gradients(self, outer_objective, optimizer_dict, hyper_list)
343 # d_E_T = dot(vectorize_all(d_oo_d_state), vectorize_all(zs))
344 d_E_T = [dot(d_oo_d_s, z) for d_oo_d_s, z in zip(d_oo_d_state, zs)
--> 345 if d_oo_d_s is not None and z is not None]
346 hg = maybe_add(tf.reduce_sum(d_E_T), d_oo_d_hyp) # this is right... the error is not here!
347 # hg = maybe_add(d_E_T, d_oo_d_hyp)
/volume1/scratch/r0605927/backMeUpPlz/lib/python2.7/site-packages/far_ho/utils.pyc in dot(a, b, name)
111 Dot product between vectors a and b with optional name
112 """
--> 113 assert a.shape.ndims == 1, '{} must be a vector'.format(a)
114 assert b.shape.ndims == 1, '{} must be a vector'.format(b)
115 with tf.name_scope(name, 'Dot', [a, b]):
AssertionError: Tensor("Mean_1_1/gradients/mul_5_grad/Reshape:0", shape=(2, 3), dtype=float32) must be a vector
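(The assertion itself is utils.dot refusing a rank-2 operand: the gradient reaching it has shape (2, 3). A minimal sketch of the obvious workaround, flattening matrix-shaped operands to vectors before taking the dot product; this is my assumption about a fix, not the maintainer's:

import tensorflow as tf

a = tf.zeros((2, 3))  # stands in for the offending (2, 3) gradient tensor
b = tf.zeros((2, 3))
# utils.dot asserts ndims == 1, so reshape both operands to vectors first:
flat_dot = tf.reduce_sum(tf.reshape(a, [-1]) * tf.reshape(b, [-1]))

)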
Hi,
Thanks for providing the code. It is well written and very easy to use. I was wondering which versions of Python and TensorFlow you used for testing?
Could you please provide support for Python 2.7.x?
Hi @lucfra,
I ran the same code with CLASSES=20 using your h5 files, but I get an error. Here is the code:
from far_ho.examples.hyper_representation import train, mini_imagenet_model

if __name__ == '__main__':
    CLASSES = 20
    SHOTS = 1
    META_BATCH_SIZE = 4
    from experiment_manager.datasets import load
    mini_imagenet = load.meta_mini_imagenet(std_num_classes=CLASSES,
                                            std_num_examples=(SHOTS*CLASSES, 15*CLASSES), h5=True)
    res = train(mini_imagenet, 'test', mini_imagenet_model, T=1, print_every=1000,
                MBS=META_BATCH_SIZE, n_episodes_testing=150, patience=20)
The error is related to the dataset.
Traceback (most recent call last):
File "run_hyper.py", line 11, in <module>
res = train(mini_imagenet, 'maml', mini_imagenet_model, T=1, print_every=500, MBS=META_BATCH_SIZE, n_episodes_testing=150, patience=20)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/far_ho/examples/hyper_representation.py", line 245, in train
farho.run(T[0], trfd, vfd) # one iteration of optimization of representation variables (hyperparameters)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/savers/records.py", line 72, in _saver_wrapped
self._execute_save(res, *args, **kwargs)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/savers/records.py", line 125, in _execute_save
super()._execute_save(res, *args, **kwargs)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/savers/records.py", line 98, in _execute_save
_res=res)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/savers/save_and_load.py", line 535, in save
rss = _compute_value(pt, save_dict)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/savers/save_and_load.py", line 516, in _compute_value
if callable(pt[1]) else _tf_run_catch_not_initialized(pt, _partial_save_dict)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/savers/save_and_load.py", line 492, in _maybe_call
_out = _method()
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/far_ho/examples/hyper_representation.py", line 157, in <lambda>
'FLAT', lambda: accs_and_errs(metasets),
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/far_ho/examples/hyper_representation.py", line 133, in accs_and_errs
for _d in meta_dataset.generate(n_episodes_testing, batch_size=MBS, rand=0):
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/datasets/structures.py", line 265, in generate
yield self.generate_batch(batch_size, rand=rand, *args, **kwargs)
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/datasets/structures.py", line 271, in generate_batch
return [self.generate_datasets(rand, *args, **kwargs) for _ in range(batch_size)]
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/datasets/structures.py", line 271, in <listcomp>
return [self.generate_datasets(rand, *args, **kwargs) for _ in range(batch_size)]
File "/home/amir/.conda/envs/farho/lib/python3.5/site-packages/experiment_manager/datasets/load.py", line 400, in generate_datasets
random_classes = rand.choice(list(clss.keys()), size=(num_classes,), replace=False)
File "mtrand.pyx", line 1437, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:17481)
ValueError: Cannot take a larger sample than population when 'replace=False'
Could you please let me know how to fix this?
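A plausible cause (an assumption on my part, not confirmed by the maintainer): the standard mini-ImageNet splits contain 64 training, 16 validation and 20 test classes, so drawing CLASSES=20 distinct classes from a 16-class meta-validation split fails in exactly this way:

import numpy as np

rng = np.random.RandomState(0)
val_classes = list(range(16))  # a hypothetical 16-class meta-validation split
rng.choice(val_classes, size=(20,), replace=False)
# raises ValueError: Cannot take a larger sample than population when 'replace=False'

If that is the cause, CLASSES would have to stay at or below the smallest split's class count for evaluation to work.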
Hi,
I wrote the following code to compare the hyper-gradients computed by the ReverseHG and ForwardHG methods in the same file:
### ReverseHG
farho = far.HyperOptimizer()
hypergradient = farho.hypergradient
run = farho.minimize(val_loss, oo_optim, tr_loss, io_optim)
grads_hvars = [hypergradient.hgrads_hvars(hyper_list=hll)
               for opt, hll in farho._h_optim_dict.items()]
run(T, inner_objective_feed_dicts=tr_supplier, outer_objective_feed_dicts=val_supplier, _skip_hyper_ts=True)
grads_hvars_val = ss.run(grads_hvars, _opt_fd(farho._global_step, val_supplier))
print(grads_hvars_val)

### ForwardHG
hypergradient_fwd = far.ForwardHG()
farho_fwd = far.HyperOptimizer(hypergradient=hypergradient_fwd)
run_fwd = farho_fwd.minimize(val_loss, oo_optim, tr_loss, io_optim)
grads_hvars_fwd = [hypergradient_fwd.hgrads_hvars(hyper_list=hll)
                   for opt, hll in farho_fwd._h_optim_dict.items()]
run_fwd(T, inner_objective_feed_dicts=tr_supplier, outer_objective_feed_dicts=val_supplier, _skip_hyper_ts=True)
grads_hvars_fwd_val = ss.run(grads_hvars_fwd, _opt_fd(farho_fwd._global_step, val_supplier))
print(grads_hvars_fwd_val)
They receive identical inputs and compute the hyper-gradient for the same hyper-variable (_skip_hyper_ts=True, so the hyperparameter remains unchanged), but for some reason their outputs are quite different. I noticed that if I run them in separate files (with a fixed random seed), or run the ForwardHG block before the ReverseHG block, their outputs are similar. I cannot see how reverse and forward hyper-gradient computation can affect each other, as they don't share any variables. Could you please explain how these two methods can be run in the same file?
I have also attached the complete Python code for this experiment.
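A possible way to keep the two computations from interacting (my assumption, not a confirmed diagnosis: FAR-HO registers hyperparameters and optimizer dynamics in graph collections, so two HyperOptimizer instances built into the same default graph may see each other's tensors) is to build each method in its own tf.Graph and tf.Session. Here build_problem is a hypothetical user-supplied function returning the losses and optimizers from the script above, and eval_fd the feed dict used for the final evaluation:

import tensorflow as tf
import far_ho as far

def hypergrads_in_fresh_graph(hg_method, build_problem, T, tr_supplier, val_supplier, eval_fd):
    # A fresh graph per method: nothing leaks through default-graph collections.
    graph = tf.Graph()
    with graph.as_default(), tf.Session(graph=graph).as_default() as ss:
        val_loss, oo_optim, tr_loss, io_optim = build_problem()
        farho = far.HyperOptimizer(hypergradient=hg_method())
        run = farho.minimize(val_loss, oo_optim, tr_loss, io_optim)
        grads_hvars = [farho.hypergradient.hgrads_hvars(hyper_list=hll)
                       for _, hll in farho._h_optim_dict.items()]
        tf.global_variables_initializer().run()
        run(T, inner_objective_feed_dicts=tr_supplier,
            outer_objective_feed_dicts=val_supplier, _skip_hyper_ts=True)
        return ss.run(grads_hvars, eval_fd)

rev = hypergrads_in_fresh_graph(far.ReverseHG, build_problem, T, tr_supplier, val_supplier, eval_fd)
fwd = hypergrads_in_fresh_graph(far.ForwardHG, build_problem, T, tr_supplier, val_supplier, eval_fd)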
Hi, I was wondering if you could share code that reproduces the miniImageNet results in your workshop paper. I have tried a couple of different learning rates, and the best one-shot test accuracy I could get was around 43%. I used T=4, as mentioned in the paper.
Thanks,
Haamoon
We are working on a sparse logistic regression task on the 20 Newsgroups data set and want to find the best regularization parameter lambda. When we tried far.AdamOptimizer() as the inner optimizer and ReverseHG() as the hyper-optimizer, lambda went to NaN. We found that:
Best regards,
Xiang Geng
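Regarding the NaN report above: one mitigation worth sketching (my assumption, based only on the process_fn argument visible in the HyperOptimizer.minimize signature quoted in a traceback earlier on this page; check its actual contract in the source) is to clip each hypergradient before the hyper-step:

# Hypothetical usage: verify process_fn's exact contract in the far_ho source first.
farho.minimize(val_loss, oo_optim, tr_loss, io_optim,
               process_fn=lambda g: tf.clip_by_value(g, -100., 100.))

Reverse-mode differentiation through Adam's sqrt/epsilon terms is also a classic source of numerical blow-ups, so comparing against a plain gradient-descent inner optimizer would help isolate whether Adam's dynamics are the culprit.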
Hi @lucfra
While trying miniImageNet with meta batch size > 1, I got out-of-memory errors (tried on GTX 1080 and Titan X GPUs). Below is the code:
from hyper_representation import train, mini_imagenet_model

if __name__ == '__main__':
    CLASSES = 5
    SHOTS = 1
    META_BATCH_SIZE = 2
    from experiment_manager.datasets import load
    mini_imagenet = load.meta_mini_imagenet(std_num_classes=CLASSES,
                                            std_num_examples=(SHOTS*CLASSES, 15*CLASSES),
                                            h5=False, load_all_images=True)
    res = train(mini_imagenet, 'maml', mini_imagenet_model, T=1, print_every=500,
                MBS=META_BATCH_SIZE, n_episodes_testing=150, patience=20)
Could you please let me know how you ran the miniImageNet experiments?
Thanks
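On the out-of-memory report above, a generic TensorFlow-side sketch (an assumption that allocator behavior contributes; it will not help if the replicated per-episode graphs genuinely exceed the card's memory) is to let the session claim GPU memory incrementally instead of all upfront:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand
sess = tf.Session(config=config)

Beyond that, memory for these hypergradient methods typically grows with both T and the meta-batch size MBS, so reducing either is the other obvious lever.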