google / model_search
License: Apache License 2.0
To get started, I just ran the example code. Unfortunately I get a RuntimeError. Did anyone get the same error or has an idea to solve it?
Thanks in advance!
============================CODE OUTPUT========================================
2021-04-15 08:37:31.183727: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-04-15 08:37:31.184042: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "demo.py", line 55, in <module>
experiment_owner="model_search_user")
File "C:\...\model_search\model_search\single_trainer.py", line 65, in try_models
metadata=None)
File "C:\...t\model_search\model_search\phoenix.py", line 240, in __init__
study_owner)
File "C:\...\model_search\model_search\metadata\ml_metadata_db.py", line 105, in __init__
self._store = metadata_store.MetadataStore(self._connection_config)
File "C:\...7\AppData\Local\Programs\Python\Python37\lib\site-packages\ml_metadata\metadata_store\metadata_store.py", line 92, in __init__
config.SerializeToString(), migration_options.SerializeToString())
RuntimeError: Cannot connect sqlite3 database: unable to open database file
EDIT1: Formatting
EDIT2: Changed the venv from Python 3.7 to Python 3.8, but that didn't solve the problem.
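The final RuntimeError is raised while SQLite tries to create its database file. A frequent cause of "unable to open database file" is that the parent directory of the target path does not exist or is not writable. The sketch below uses only the standard-library sqlite3 module (not ml_metadata itself) to test whether a location is usable; the helper name can_open_sqlite is made up for illustration, and the "/tmp/filedb-0" path mirrors the pattern quoted from ml_metadata_db.py elsewhere in this thread.

```python
import os
import sqlite3


def can_open_sqlite(path):
    """Return True if sqlite3 can create or open a database file at `path`."""
    try:
        conn = sqlite3.connect(path)
    except sqlite3.OperationalError:
        # Raised with "unable to open database file" when, for example,
        # the parent directory does not exist or is not writable.
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # Assumption: the database path follows the "/tmp/filedb-<n>" pattern.
    # On Windows this resolves against the current drive.
    candidate = os.path.abspath("/tmp/filedb-0")
    parent = os.path.dirname(candidate)
    print(candidate, "parent directory exists:", os.path.isdir(parent))
```

If the parent directory turns out to be missing, creating it on the drive the script runs from may be enough to get past this error, although that is a workaround rather than a confirmed fix.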
Hello,
Well... I can't find the ml_metadata module or package anywhere.
The weird thing is that I didn't get this error on another PC. (It might have a different Python version and environment.)
Anyway, where is the ml_metadata module? Is this a problem?
Please advise.
=================== Error message ===================================================
(modelsch) PS D:\PROGRAMMING\model_search> python ms.py
2021-02-27 07:05:53.471047: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Traceback (most recent call last):
File "ms.py", line 3, in <module>
from model_search import single_trainer
File "D:\PROGRAMMING\model_search\model_search\single_trainer.py", line 17, in <module>
from model_search import oss_trainer_lib
File "D:\PROGRAMMING\model_search\model_search\oss_trainer_lib.py", line 29, in <module>
from model_search import phoenix
File "D:\PROGRAMMING\model_search\model_search\phoenix.py", line 33, in <module>
from model_search.metadata import ml_metadata_db
File "D:\PROGRAMMING\model_search\model_search\metadata\ml_metadata_db.py", line 24, in <module>
from ml_metadata import metadata_store
ModuleNotFoundError: No module named 'ml_metadata'
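The ModuleNotFoundError means the active environment has no importable ml_metadata package. It is a separate distribution (published on PyPI as ml-metadata), not a file inside the model_search source tree, which would explain why another PC with a different environment works. A quick standard-library check, to be run with the same interpreter that runs ms.py (the helper name is_importable is made up for this sketch):

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the current interpreter can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Shows which interpreter (and therefore which environment) is in use.
    print("interpreter:", sys.executable)
    for name in ("ml_metadata", "tensorflow", "absl"):
        print(name, "importable:", is_importable(name))
```

If ml_metadata is reported as not importable, running pip install ml-metadata inside that environment is the usual remedy; installing from the repository's requirements file should have the same effect, assuming it lists the package.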
NotFoundError: NewRandomAccessFile failed to Create/Open: model_search/model_search/configs/dnn_config.pbtxt : The system cannot find the path specified. ; No such process
It seems like the "model_search" directory appears twice in the path. Any idea how to solve it? Thanks in advance!
Is there any solution to this problem?
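Regarding the dnn_config.pbtxt error: the doubled model_search/model_search is not necessarily wrong, since tracebacks in this thread show a package directory named model_search inside a repository directory of the same name. Relative paths are resolved against the current working directory, so the lookup fails when the script is started from a different folder. The sketch below resolves the path against an explicit base directory instead; the function name resolve_config and its base_dir parameter are assumptions for illustration, not part of model_search.

```python
import os


def resolve_config(base_dir, relative_path):
    """Join `relative_path` onto `base_dir` and fail early if it is missing."""
    full_path = os.path.normpath(os.path.join(base_dir, relative_path))
    if not os.path.isfile(full_path):
        # Include the working directory, since that is the usual culprit.
        raise FileNotFoundError(
            "%s not found; current working directory is %s"
            % (full_path, os.getcwd()))
    return full_path
```

For example, resolve_config(parent_of_clone, "model_search/model_search/configs/dnn_config.pbtxt"), where parent_of_clone is the folder that contains the cloned repository, returns an absolute path. That path could then be passed as the spec argument, assuming SingleTrainer accepts any readable file path there.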
I'm trying to run Model Search, but I get this error:
UnparsedFlagAccessError: Trying to access flag --phoenix_master before flags were parsed.
ERROR:absl:Flags are not parsed. Using default in file mlmd database. Please run main with absl.app.run(main) to fix this. If running in distributed mode, this means that the trainers are not sharing information between one another.
---------------------------------------------------------------------------
UnparsedFlagAccessError Traceback (most recent call last)
<ipython-input-10-93cbf25f70e3> in <module>
19 batch_size=32,
20 experiment_name="root",
---> 21 experiment_owner="root")
Hi there,
It looks like the supported inputs are CSV files or image files. If I have a wordVec sequence, is it possible to feed it in as-is?
I have trouble understanding the inputs to the model. I ran my experiments on a dataset with 2496 columns stored in a CSV file. I provided label_index and record_defaults via a list.
The parameters that I set in single_trainer.SingleTrainer (following the README) were label_index, logits_dimension, record_defaults, filename, and spec.
After the experiments finished, I started looking at the graphs to understand how to use them in a pipeline built on Keras models (I wanted to wrap the selected graph in a Keras Lambda layer and use it for inference).
When looking at the graph I see
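# Imports the snippet relies on (the gpb alias is assumed to be graph_pb2).
import tensorflow.compat.v1 as tf
from google.protobuf import text_format as pbtf
from tensorflow.core.framework import graph_pb2 as gpb

gdef = gpb.GraphDef()
with open('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/1/graph.pbtxt', 'r') as fh:
    graph_str = fh.read()
pbtf.Parse(graph_str, gdef)
tf.import_graph_def(gdef)
# Print the name of every operation in the imported graph.
for op in tf.get_default_graph().get_operations():
    print(str(op.name))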
import/record_defaults_0
import/record_defaults_1
import/record_defaults_2
import/record_defaults_3
import/record_defaults_4
...
up to 2496 as it should be. I also see:
import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims/dim
import/Phoenix/search_generator_0/Input/input_layer/1_1/ExpandDims
import/Phoenix/search_generator_0/Input/input_layer/1_1/Shape
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack_1
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice/stack_2
import/Phoenix/search_generator_0/Input/input_layer/1_1/strided_slice
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape/shape/1
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape/shape
import/Phoenix/search_generator_0/Input/input_layer/1_1/Reshape
for all 2496 inputs.
but I only see:
input_1
input_2
input_3
input_4
...
input_21
21 inputs instead of 2496. Could you please help me to understand this situation?
My final goal is to do something like that:
import tensorflow as tf
#import tensorflow.compat.v1 as tf
#To make tf 2.0 compatible with tf1.0 code, we disable the tf2.0 functionalities
#tf.disable_eager_execution()
debug = False
if debug:
    tf.autograph.set_verbosity(3, True)
else:
    tf.autograph.set_verbosity(0, True)
from tensorflow.core.framework.graph_pb2 import GraphDef
from google.protobuf import text_format as pbtf
import numpy as np

@tf.autograph.experimental.do_not_convert
@tf.function
def my_model(x):
    #tf.get_default_graph()
    #tensor_input = ['inputs_'+str(i+1) for i in range(2496)]
    tensor_input = [str(i+1) for i in range(2496)]
    tensor_input_sample = [x[:,i] for i in range(2496)]
    dict_input = {tensor_input[i]: tensor_input_sample[i] for i in range(len(tensor_input))}
    input_map_ = dict_input
    y, z = tf.graph_util.import_graph_def(
        gd, name='', input_map=input_map_, return_elements=['Phoenix/Trainer/ArgMax:0', 'Phoenix/Trainer/Softmax:0'])
    return [y, z]

x = tf.keras.Input(shape=2496)
print(x)
gd = GraphDef()
print("open tf graph")
with open('/Users/tomasz.p/Desktop/GOOGLE_SEARCH_RESULTS/r524xlarge-2/RESULTS/11/graph.pbtxt', 'rb') as f:
    print("read file")
    graph_str = f.read()
print("parse")
pbtf.Parse(graph_str, gd)
print("import graph")
tf.import_graph_def(gd)
y, z = tf.keras.layers.Lambda(my_model)(x)
model = tf.keras.Model(x, [y, z])
model.summary()
y_out, z_out = model.predict(np.ones((5, 2496), dtype=np.float32))
print(y_out.shape, z_out.shape)
print(y_out, z_out)
but unfortunately I do not understand inputs at this point.
Thank you for any help!
Hi,
I tried a simple classification dataset, "German credit data", to make predictions.
I load the model_search saved_model.pb model:
importedModel = tf.saved_model.load(saved_model_dir)
But when I try to predict with the model or call summary() on it, I get the following error:
AttributeError: 'AutoTrackable' object has no attribute 'summary'
When I print gs_model.signatures, I get "['serving_default']".
How can I run predictions and compute metrics such as f1_score, accuracy, and R2 from the model?
Is it possible to store the model_search model in a TensorFlow v2 compatible format in oss_trainer_lib.py instead of as an "estimator" model?
Please advise.
I am using TensorFlow 2.4.0.
How can I make the model work like a Keras model?
Thanks,
SJRam
Hi,
I cannot connect to database while running the “Getting started” example.
How can I fix this? Thank you.
Here is the full trace:
Traceback (most recent call last):
File "C:\Users\e173196\Anaconda projects\model_search\load_flags.py", line 32, in <module>
trainer.try_models(
File "C:\Users\e173196\Anaconda projects\model_search\model_search\single_trainer.py", line 57, in try_models
phoenix_instance = phoenix.Phoenix(
File "C:\Users\e173196\Anaconda projects\model_search\model_search\phoenix.py", line 239, in __init__
self._metadata = ml_metadata_db.MLMetaData(phoenix_spec, study_name,
File "C:\Users\e173196\Anaconda projects\model_search\model_search\metadata\ml_metadata_db.py", line 100, in __init__
self._store = metadata_store.MetadataStore(self._connection_config)
File "C:\Users\e173196\Anaconda3\envs\Model_search\lib\site-packages\ml_metadata\metadata_store\metadata_store.py", line 91, in __init__
self._metadata_store = metadata_store_serialized.CreateMetadataStore(
RuntimeError: Cannot connect sqlite3 database: unable to open database file
blocks_to_use: "CIFAR_NASA_REDUCTION"
blocks_to_use: "CIFAR_NASA"
These two blocks are not defined in block.py.
Just wondering: after trying a number of models (e.g. 200), is it possible to output a list or file that states each model's performance, location, etc.?
Hi, I want to know how to use it to search for LSTM and GRU models.
As a test, I just ran the supplied code with all of the default parameters for the "Getting Started" example, just to make sure I installed everything correctly and that the code would run smoothly. The test has now finished, and I saw that the model search put all the outputs in a 'tmp\run_example' directory, with each model getting its own folder numbered from 1 - 200. I would now like to evaluate the results of this run, but I have no idea where to even start. I would like to see things like optimal model architecture, architectures evaluated, architecture with highest ratio of accuracy to # parameters, etc., but I am just lost as to how to go about doing that. Would anyone be able to offer any guidance in that area?
I tried loading in a model taking inspiration from the following tutorial on Tensorflow,
but ultimately I got an error after running the code below:
M = tf.keras.models.load_model("C:\\tmp\\run_example\\tuner-1\\200\\saved_model\\1616871647")
M.summary()
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: 'AutoTrackable' object has no attribute 'summary'
Does anyone have any ideas how to get information about the model architectures from these runs?
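As a starting point for exploring the output, the paths quoted in this thread suggest a layout of <root_dir>/tuner-1/<trial number>/, with an optional saved_model/<timestamp> export inside a trial. Assuming that layout (it is inferred from the paths above, not from documentation), this standard-library sketch lists the trial directories in numeric order and reports which ones contain an export. The function name list_trials is hypothetical.

```python
import os


def list_trials(tuner_dir):
    """Return (trial_id, trial_path, export_dirs) tuples sorted by trial_id."""
    trials = []
    for name in os.listdir(tuner_dir):
        path = os.path.join(tuner_dir, name)
        if not (name.isdigit() and os.path.isdir(path)):
            continue  # skip anything that is not a numbered trial folder
        saved_model_dir = os.path.join(path, "saved_model")
        exports = []
        if os.path.isdir(saved_model_dir):
            # Each sub-directory is assumed to be one timestamped export.
            exports = sorted(
                os.path.join(saved_model_dir, d)
                for d in os.listdir(saved_model_dir)
                if os.path.isdir(os.path.join(saved_model_dir, d)))
        trials.append((int(name), path, exports))
    return sorted(trials)
```

Calling list_trials on the tuner-1 directory only locates the artifacts; it does not read metrics or architectures, which would still have to be extracted from the files inside each trial folder.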
I'm using the Colab example https://colab.research.google.com/drive/1k1EaKDCTB2fU9XtIdiiXEDyDOvrU6fmD?usp=sharing
Using my own data (composed of two attributes, both of them classes for regression), I got the following error:
I0313 17:21:15.287168 140503459874688 metadata_store.py:93] MetadataStore with DB connection initialized
I0313 17:21:15.307923 140503459874688 oss_trainer_lib.py:290] creating directory: /tmp/run_example/tuner-1/5
I0313 17:21:15.309459 140503459874688 oss_trainer_lib.py:337] Tuner id: tuner-1
I0313 17:21:15.310938 140503459874688 oss_trainer_lib.py:338] Training with the following hyperparameters:
I0313 17:21:15.312628 140503459874688 oss_trainer_lib.py:339] {'learning_rate': 1.0536239272270075e-05, 'new_block_type': 'FULLY_CONNECTED_RESIDUAL_FORCE_MATCH_SHAPES', 'optimizer': 'adam', 'initial_architecture_0': 'FULLY_CONNECTED_RESIDUAL_CONCAT', 'exponential_decay_rate': 0.8225550864907855, 'exponential_decay_steps': 250, 'gradient_max_norm': 2, 'dropout_rate': 0.20000000596046447, 'initial_architecture': ['FULLY_CONNECTED_RESIDUAL_CONCAT']}
I0313 17:21:15.314630 140503459874688 run_config.py:550] TF_CONFIG environment variable: {'model_dir': '/tmp/run_example/tuner-1/5', 'session_master': ''}
I0313 17:21:15.316400 140503459874688 run_config.py:973] Using model_dir in TF_CONFIG: /tmp/run_example/tuner-1/5
I0313 17:21:15.318898 140503459874688 estimator.py:191] Using config: {'_model_dir': '/tmp/run_example/tuner-1/5', '_tf_random_seed': None, '_save_summary_steps': 2000, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 120, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0313 17:21:15.373794 140503459874688 estimator.py:1169] Calling model_fn.
I0313 17:21:15.389836 140503459874688 controller.py:160] trial id: 5
I0313 17:21:15.391364 140503459874688 controller.py:239] intermix ensemble search mode
I0313 17:21:15.406418 140503459874688 phoenix.py:371] {'prior_generator': GeneratorWithTrials(instance=<model_search.generators.prior_generator.PriorGenerator object at 0x7fc8eed807d0>, relevant_trials=[])}
---------------------------------------------------------------------------
FailedPreconditionError Traceback (most recent call last)
<ipython-input-17-bfaafc6a9348> in <module>()
6 batch_size=32,
7 experiment_name="example",
----> 8 experiment_owner="model_search_user")
11 frames
/content/model_search/model_search/generators/prior_generator.py in _nonadaptive_ensemble(self, features, input_layer_fn, shared_input_tensor, shared_lengths, logits_dimension, relevant_trials, is_training, num_trials_to_consider, width, my_model_dir)
64 if not best_trials:
65 raise tf.errors.FailedPreconditionError(
---> 66 None, None, "No completed trials to perform ensembling.")
67
68 if len(best_trials) < width:
FailedPreconditionError: No completed trials to perform ensembling.
Any clue?
Does model_search support TF 1.x?
I ran the demo as test.py:
import model_search
from model_search import constants
from model_search import single_trainer
from model_search.data import csv_data
trainer = single_trainer.SingleTrainer(
    data=csv_data.Provider(
        label_index=0,
        logits_dimension=2,
        record_defaults=[0, 0, 0, 0],
        filename="model_search/data/testdata/csv_random_data.csv"),
    spec=constants.DEFAULT_DNN)
trainer.try_models(
    number_models=200,
    train_steps=1000,
    eval_steps=100,
    root_dir="/tmp/run_example",
    batch_size=32,
    experiment_name="example",
    experiment_owner="model_search_user")
but there is an error and I can't find the reason. The error message is:
(gpu) E:\src\model_search>python test.py
2021-02-23 16:24:06.170523: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2021-02-23 16:24:06.175639: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "test.py", line 12, in <module>
spec=constants.DEFAULT_DNN)
File "E:\src\model_search\model_search\single_trainer.py", line 33, in __init__
text_format.Parse(f.read(), self._spec)
File "d:\Documents\Application Data\Python\Python37\site-packages\tensorflow\python\lib\io\file_io.py", line 116, in read
self._preread_check()
File "d:\Documents\Application Data\Python\Python37\site-packages\tensorflow\python\lib\io\file_io.py", line 79, in _preread_check
self.__name, 1024 * 512)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 100: invalid continuation byte
I was able to get the code running with the test data supplied with the module. Now, I am testing out the code with a custom dataset. However, the dataset isn't just composed of int32 entries but also has float entries. This is probably causing the following error:
InvalidArgumentError: Field 1 in record is not a valid int32: 1.0000000007063299
Could you help me fix it? Is there a keyword in the csv_data module that accommodates floating-point numbers in the inputs?
The full error trace is below:
`InvalidArgumentError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1374 try:
-> 1375 return fn(*args)
1376 except errors.OpError as e:
20 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
1359 return self._call_tf_sessionrun(options, feed_dict, fetch_list,
-> 1360 target_list, run_metadata)
1361
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
1452 fetch_list, target_list,
-> 1453 run_metadata)
1454
InvalidArgumentError: Field 1 in record is not a valid int32: 1.0000000007063299
[[{{node IteratorGetNext}}]]
During handling of the above exception, another exception occurred:
InvalidArgumentError Traceback (most recent call last)
in ()
31 batch_size=32,
32 experiment_name="example",
---> 33 experiment_owner="model_search_user")
/content/drive/My Drive/oqmd_structures/sony/search/model_search/single_trainer.py in try_models(self, number_models, train_steps, eval_steps, root_dir, batch_size, experiment_name, experiment_owner)
84 train_steps=train_steps,
85 eval_steps=eval_steps,
---> 86 batch_size=batch_size):
87 pass
/content/drive/My Drive/oqmd_structures/sony/search/model_search/oss_trainer_lib.py in run_parameterized_train_and_eval(phoenix_instance, oracle, tuner_id, root_dir, max_trials, data_provider, train_steps, eval_steps, batch_size)
338 train_steps=train_steps,
339 eval_steps=eval_steps,
--> 340 batch_size=batch_size)
341
342 oracle.update_trial(
/content/drive/My Drive/oqmd_structures/sony/search/model_search/oss_trainer_lib.py in run_train_and_eval(hparams, model_dir, phoenix_instance, data_provider, train_steps, eval_steps, batch_size)
240 mode=tf.estimator.ModeKeys.TRAIN,
241 batch_size=batch_size),
--> 242 max_steps=train_steps)
243 tf.compat.v1.reset_default_graph()
244 tf.keras.backend.clear_session()
/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
347
348 saving_listeners = _check_listeners_type(saving_listeners)
--> 349 loss = self._train_model(input_fn, hooks, saving_listeners)
350 logging.info('Loss for final step: %s.', loss)
351 return self
/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
1173 return self._train_model_distributed(input_fn, hooks, saving_listeners)
1174 else:
-> 1175 return self._train_model_default(input_fn, hooks, saving_listeners)
1176
1177 def _train_model_default(self, input_fn, hooks, saving_listeners):
/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py in _train_model_default(self, input_fn, hooks, saving_listeners)
1206 return self._train_with_estimator_spec(estimator_spec, worker_hooks,
1207 hooks, global_step_tensor,
-> 1208 saving_listeners)
1209
1210 def _train_model_distributed(self, input_fn, hooks, saving_listeners):
/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py in _train_with_estimator_spec(self, estimator_spec, worker_hooks, hooks, global_step_tensor, saving_listeners)
1512 any_step_done = False
1513 while not mon_sess.should_stop():
-> 1514 _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
1515 any_step_done = True
1516 if not any_step_done:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
776 feed_dict=feed_dict,
777 options=options,
--> 778 run_metadata=run_metadata)
779
780 def run_step_fn(self, step_fn):
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
1281 feed_dict=feed_dict,
1282 options=options,
-> 1283 run_metadata=run_metadata)
1284 except _PREEMPTION_ERRORS as e:
1285 logging.info(
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py in run(self, *args, **kwargs)
1382 raise six.reraise(*original_exc_info)
1383 else:
-> 1384 raise six.reraise(*original_exc_info)
1385
1386
/usr/local/lib/python3.7/dist-packages/six.py in reraise(tp, value, tb)
701 if value.__traceback__ is not tb:
702 raise value.with_traceback(tb)
--> 703 raise value
704 finally:
705 value = None
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py in run(self, *args, **kwargs)
1367 def run(self, *args, **kwargs):
1368 try:
-> 1369 return self._sess.run(*args, **kwargs)
1370 except _PREEMPTION_ERRORS:
1371 raise
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
1440 feed_dict=feed_dict,
1441 options=options,
-> 1442 run_metadata=run_metadata)
1443
1444 for hook in self._hooks:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/monitored_session.py in run(self, *args, **kwargs)
1198
1199 def run(self, *args, **kwargs):
-> 1200 return self._sess.run(*args, **kwargs)
1201
1202 def run_step_fn(self, step_fn, raw_session, run_with_hooks):
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)
966 try:
967 result = self._run(None, fetches, feed_dict, options_ptr,
--> 968 run_metadata_ptr)
969 if run_metadata:
970 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
1189 if final_fetches or final_targets or (handle and feed_dict_tensor):
1190 results = self._do_run(handle, final_targets, final_fetches,
-> 1191 feed_dict_tensor, options, run_metadata)
1192 else:
1193 results = []
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1367 if handle is None:
1368 return self._do_call(_run_fn, feeds, fetches, targets, options,
-> 1369 run_metadata)
1370 else:
1371 return self._do_call(_prun_fn, handle, feeds, fetches)
/usr/local/lib/python3.7/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)
1392 '\nsession_config.graph_options.rewrite_options.'
1393 'disable_meta_optimizer = True')
-> 1394 raise type(e)(node_def, op, message)
1395
1396 def _extend_graph(self):
InvalidArgumentError: Field 1 in record is not a valid int32: 1.0000000007063299
[[node IteratorGetNext (defined at /usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/util.py:61) ]]
Errors may have originated from an input operation.
Input Source operations connected to node IteratorGetNext:
IteratorV2 (defined at /usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/util.py:59)
Original stack trace for 'IteratorGetNext':
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/usr/local/lib/python3.7/dist-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelapp.py", line 499, in start
self.io_loop.start()
File "/usr/local/lib/python3.7/dist-packages/tornado/platform/asyncio.py", line 132, in start
self.asyncio_loop.run_forever()
File "/usr/lib/python3.7/asyncio/base_events.py", line 541, in run_forever
self._run_once()
File "/usr/lib/python3.7/asyncio/base_events.py", line 1786, in _run_once
handle._run()
File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
self._context.run(self._callback, *self._args)
File "/usr/local/lib/python3.7/dist-packages/tornado/platform/asyncio.py", line 122, in _handle_events
handler_func(fileobj, events)
File "/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py", line 300, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py", line 451, in _handle_events
self._handle_recv()
File "/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python3.7/dist-packages/zmq/eventloop/zmqstream.py", line 434, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tornado/stack_context.py", line 300, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 233, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/ipkernel.py", line 208, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python3.7/dist-packages/ipykernel/zmqshell.py", line 537, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2828, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python3.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "", line 33, in
experiment_owner="model_search_user")
File "search/model_search/single_trainer.py", line 86, in try_models
batch_size=batch_size):
File "search/model_search/oss_trainer_lib.py", line 340, in run_parameterized_train_and_eval
batch_size=batch_size)
File "search/model_search/oss_trainer_lib.py", line 242, in run_train_and_eval
max_steps=train_steps)
File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 349, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1175, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1201, in _train_model_default
self._get_features_and_labels_from_input_fn(input_fn, ModeKeys.TRAIN))
File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/estimator.py", line 1037, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/usr/local/lib/python3.7/dist-packages/tensorflow_estimator/python/estimator/util.py", line 61, in parse_input_fn_result
result = iterator.get_next()
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/data/ops/iterator_ops.py", line 419, in get_next
name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_dataset_ops.py", line 2601, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 750, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 3536, in _create_op_internal
op_def=op_def)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 1990, in __init__
self._traceback = tf_stack.extract_stack()`
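The "not a valid int32" message is consistent with TensorFlow's CSV decoding, where each entry in record_defaults fixes the dtype of its column: an integer default such as 0 declares an integer column, while 0.0 declares a float column. Assuming csv_data.Provider forwards record_defaults to that decoder unchanged (the traceback suggests this but does not prove it), one approach is to derive the defaults from the data itself. The helper below is a hypothetical standard-library sketch, not part of model_search.

```python
import csv


def infer_record_defaults(csv_path, sample_rows=100, has_header=False):
    """Return one default per column: 0 for integer columns, 0.0 for floats."""
    is_int = None
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # discard the header row
        for row_number, row in enumerate(reader):
            if row_number >= sample_rows:
                break
            if is_int is None:
                is_int = [True] * len(row)
            for column, value in enumerate(row):
                try:
                    int(value)
                except ValueError:
                    float(value)  # raises ValueError if the field is not numeric
                    is_int[column] = False
    if is_int is None:
        raise ValueError("no data rows found in %s" % csv_path)
    return [0 if flag else 0.0 for flag in is_int]
```

The result could be passed as record_defaults=infer_record_defaults(path_to_csv). Because only the first sample_rows rows are inspected, a column whose first fractional value appears later would be misclassified; raising sample_rows, or simply writing 0.0 for every feature column by hand, avoids that.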
In model_search/metadata/ml_metadata_db.py, lines 102-103 are:
self._connection_config.sqlite.filename_uri = (
    "/tmp/filedb-%d" % random.randint(0, 1000000))
/tmp/... is of course not a valid path on Windows.
Can you update this so that it chooses a valid path when running on Windows?
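One portable alternative would be to build the filename from the platform's temporary directory rather than a hard-coded /tmp. The function below is only a sketch of that idea using tempfile.gettempdir(); default_sqlite_path is a made-up name, and this is not the project's actual fix.

```python
import os
import random
import tempfile


def default_sqlite_path():
    """Return a randomly named database path inside the OS temp directory."""
    filename = "filedb-%d" % random.randint(0, 1000000)
    # gettempdir() honours TMPDIR/TEMP/TMP, so it also works on Windows.
    return os.path.join(tempfile.gettempdir(), filename)
```

The returned string would take the place of the hard-coded value assigned to self._connection_config.sqlite.filename_uri.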
I tried to use this framework to find the best possible model architecture for a multi-class classification problem. As a result, the respective files were created, e.g. under /tmp/run_example/tuner-1/1/. The files include a checkpoint file, graph.pbtxt, and replay_config.pbtxt, along with .index, .meta and .data-000000-of-00001 files.
Given that the framework does not store the searched model in .h5 or SavedModel format, I wanted to know how I could use the produced files to load or restore the searched model, so that I can use it for prediction / inference.
I would really appreciate any help / suggestions in the right direction !
Cheers.
I encountered this error when I run the example on my jupyter notebook: AttributeError: module 'ml_metadata.metadata_store' has no attribute 'MetadataStore'. Appreciate any advice on how to fix this issue. Thank you.
I get the error when I run the code snippet from the project page:
Traceback (most recent call last):
File "btc.py", line 14, in <module>
trainer.try_models(
File "/home/lasse/Development/projects/btcpred/model_search/model_search/single_trainer.py", line 56, in try_models
phoenix_instance = phoenix.Phoenix(
File "/home/lasse/Development/projects/btcpred/model_search/model_search/phoenix.py", line 239, in __init__
self._metadata = ml_metadata_db.MLMetaData(phoenix_spec, study_name,
File "/home/lasse/Development/projects/btcpred/model_search/model_search/metadata/ml_metadata_db.py", line 84, in __init__
if FLAGS.mlmd_default_sqllite_filename:
File "/home/lasse/Development/projects/btcpred/.env/lib/python3.8/site-packages/absl/flags/_flagvalues.py", line 498, in __getattr__
raise _exceptions.UnparsedFlagAccessError(error_message)
absl.flags._exceptions.UnparsedFlagAccessError: Trying to access flag --mlmd_default_sqllite_filename before flags were parsed.
error:UnparsedFlagAccessError: Trying to access flag --mlmd_default_sqllite_filename before flags were parsed.
How do I train on my own data for image classification with this code?
Suggestion: Would you please add a sample code snippet on scoring a new dataset using the best (or selected) model?
File "C:...\lib\site-packages\ml_metadata\metadata_store\metadata_store.py", line 92, in __init__
self._metadata_store = metadata_store_serialized.CreateMetadataStore(
RuntimeError: Error when executing query: file is not a database query: CREATE TABLE IF NOT EXISTS Type ( id INTEGER PRIMARY KEY AUTOINCREMENT, name VARCHAR(255) NOT NULL, version VARCHAR(255), type_kind TINYINT(1) NOT NULL, description TEXT, input_type TEXT, output_type TEXT );
The current task that I'm trying to run requires regression modeling, and I'm wondering if there is any progress for creating a model for regression analysis? For example, if there is some beta code or something similar that I could use, it would be greatly appreciated.
Is there a detailed method for single-machine, multi-GPU distributed search?
from model_search import oss_trainer_lib
Traceback (most recent call last):
File "", line 1, in <module>
File "E:\src\model_search\model_search\oss_trainer_lib.py", line 28, in <module>
from model_search import hparam as hp
File "E:\src\model_search\model_search\hparam.py", line 26, in <module>
from model_search.proto import hparam_pb2
ImportError: cannot import name 'hparam_pb2' from 'model_search.proto' (unknown location)
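hparam_pb2 is a module that the protobuf compiler generates from hparam.proto; it is not a hand-written source file, so the import fails until the .proto files have been compiled. For Python, protoc --python_out=<dir> <file>.proto writes <file>_pb2.py. Assuming the .proto files live in model_search/proto, as the import path implies, this standard-library sketch lists which generated modules are still missing. The function name missing_pb2_modules is hypothetical.

```python
import os


def missing_pb2_modules(proto_dir):
    """Return names of *_pb2.py files that have a .proto source but no output."""
    missing = []
    for name in sorted(os.listdir(proto_dir)):
        stem, extension = os.path.splitext(name)
        if extension != ".proto":
            continue
        generated = stem + "_pb2.py"  # protoc's naming convention for Python
        if not os.path.isfile(os.path.join(proto_dir, generated)):
            missing.append(generated)
    return missing
```

An empty list means every .proto file already has its generated module, in which case the ImportError probably has a different cause, such as importing a different copy of the package.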
I have just finished a test run, and obtained a new directory containing the results from 200 separate models. For a given model, I noticed there is a "saved_model.pb" file, but I get an error every time I try to load it into Python. All I would like to do is see the model layers, similar to the output of a typical model.summary( ) in Keras:
Is it possible to see this output with the models that have been saved?
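A saved_model.pb file is not loaded on its own: TensorFlow's SavedModel loaders take the directory that contains it. As a first step, here is a minimal sketch that walks a run directory and collects every directory holding a saved_model.pb. The function name find_saved_model_dirs is hypothetical, and the layout of the run directory is not assumed beyond "somewhere below root_dir".

```python
import os


def find_saved_model_dirs(root_dir):
    """Return sorted directories under root_dir containing saved_model.pb."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        if "saved_model.pb" in filenames:
            found.append(dirpath)
    return sorted(found)


# Example (root_dir value taken from the README-style example):
#     for export_dir in find_saved_model_dirs("/tmp/run_example"):
#         print(export_dir)
```

Each directory found can then be passed to tf.saved_model.load in TensorFlow 2, or inspected from the command line with saved_model_cli show --dir <export_dir> --all. An object loaded this way is generally not a Keras model, so model.summary() may not be available; the listed signatures, inputs and outputs are what you can expect to see.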
Thank you for making this work open source and letting us glimpse some of Google's achievements on the NAS platform. After checking the materials and issues, I have successfully run searches on feature data and on binary and multi-class image data.
The current problem: model_search trains on only a single GPU, and the computational efficiency is not satisfactory. Is there a way to modify model_search (via distribution configs?) for synchronous or asynchronous multi-GPU training?
Is there already a way to use the code for multiclass image datasets? The documentation only covers binary image datasets. I tried changing the "label_mode" variable in image_data.py to "categorical" and changing the return value of the "number_of_classes" function to the number of classes, but I still get an error.
Is there anywhere where we can find additional documentation? I've run the Google Model Search and obtained a saved_model.pb, but I'm not sure how to load this back into Python.
Here is some code that I've found for converting the output, but it doesn't seem quite right. So, documentation for how to actually implement this would be appreciated.
Hi - What could be causing this?
UnparsedFlagAccessError Traceback (most recent call last)
in ()
14 batch_size=32,
15 experiment_name="animalfaces",
---> 16 experiment_owner="myname")
3 frames
/usr/local/lib/python3.6/dist-packages/absl/flags/_flagvalues.py in __getattr__(self, name)
496 # get too much noise.
497 logging.error(error_message)
--> 498 raise _exceptions.UnparsedFlagAccessError(error_message)
499
500 def __setattr__(self, name, value):
UnparsedFlagAccessError: Trying to access flag --mlmd_default_sqllite_filename before flags were parsed.
I have trained 200 models using trainer.try_models(), but now I need to know which model performed best and what its accuracy is.
Does anyone know how to get details such as: which evaluation score was used, which model is the best, and how can we use it later for prediction?
Thank you.
There is no hparam_pb2 under proto.
Here is the full trace:
ImportError Traceback (most recent call last)
<ipython-input-8-14ac9b22f6ac> in <module>
1 import model_search
2 from model_search import constants
----> 3 from model_search import single_trainer
4 from model_search.data import csv_data
5
~/Projects/model_search/model_search/single_trainer.py in <module>
15
16 import kerastuner
---> 17 from model_search import oss_trainer_lib
18 from model_search import phoenix
19 from model_search.proto import phoenix_spec_pb2
~/Projects/model_search/model_search/oss_trainer_lib.py in <module>
26 import kerastuner
27
---> 28 from model_search import hparam as hp
29 from model_search import phoenix
30 from model_search import registry
~/Projects/model_search/model_search/hparam.py in <module>
24 import re
25
---> 26 from model_search.proto import hparam_pb2
27 import six
28
ImportError: cannot import name 'hparam_pb2' from 'model_search.proto' (unknown location)
I ran the default settings on my dataset and I am trying to figure out what the architecture of the model looks like (i.e. how many layers, and their sizes) and which hyperparameter values it chose. Where can I find this information?
Can it perform training and evaluation with structured multi-dimensional data (such as time-series data, image data, etc.) that cannot be represented as a CSV?
I am trying to use it for regression tasks but found that it only supports classification. After trying to revise the code to apply it to regression tasks, I gave up. The code is too hard to understand when you keep moving from one unfamiliar class to another. Any ideas about (1) how to revise it for regression tasks, and (2) how to understand the phoenix class?
I'm trying to run the example provided in the README file. After fixing many errors, I have run into this one and have no solution.
The error
Traceback (most recent call last):
File "d:/Tesi_Magistrale/google_model_search/test.py", line 14, in <module>
trainer.try_models(
File "d:\Tesi_Magistrale\google_model_search\model_search\single_trainer.py", line 56, in try_models
phoenix_instance = phoenix.Phoenix(
File "d:\Tesi_Magistrale\google_model_search\model_search\phoenix.py", line 244, in __init__
self._controller = controller.InProcessController(
File "d:\Tesi_Magistrale\google_model_search\model_search\controller.py", line 147, in __init__
self._search_candidate_generator = SearchCandidateGenerator(
File "d:\Tesi_Magistrale\google_model_search\model_search\generators\search_candidate_generator.py", line 57, in __init__
self._search_algorithm = search_algorithms[phoenix_spec.search_type]
KeyError: 0
The code I'm running
import model_search
from model_search import constants
from model_search import single_trainer
from model_search.data import csv_data

trainer = single_trainer.SingleTrainer(
    data=csv_data.Provider(
        label_index=0,
        logits_dimension=2,
        record_defaults=[0, 0, 0, 0],
        filename="model_search/data/testdata/csv_random_data.csv"),
    spec=constants.DEFAULT_DNN)

trainer.try_models(
    number_models=200,
    train_steps=1000,
    eval_steps=100,
    root_dir="/tmp/run_example",
    batch_size=32,
    experiment_name="example",
    experiment_owner="model_search_user")
I have no idea what I need to check to resolve this.
Hello,
Is setup.py missing, or am I missing something?
Thanks,
Vic