ageron / handson-ml
⛔️ DEPRECATED – See https://github.com/ageron/handson-ml3 instead.
License: Apache License 2.0
The issue is already open in the scikit-learn repo, but maybe you can help with it:
scikit-learn/scikit-learn#8588
http://mldata.org/ has been down for more than a week now, and the sklearn code cannot download the MNIST data anymore.
Is there any alternative site to get it?
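One possible workaround (assuming scikit-learn >= 0.20, which replaced fetch_mldata with fetch_openml) is to download MNIST from openml.org instead:

from sklearn.datasets import fetch_openml

# Downloads MNIST from openml.org rather than the defunct mldata.org mirror.
mnist = fetch_openml("mnist_784", version=1)
X, y = mnist["data"], mnist["target"]   # 70000 x 784 pixels, string labels "0".."9"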
Hello Aurélien,
You mentioned a French translation in a previous issue (awesome!). Is there an ETA?
Thank you!
I've been following along with your wonderful book for quite a while, and it's a great treasure for the whole community. But there are a couple of questions I've run into recently:
In the Execution Phase section of Chapter 10, it says: 'Next, at the end of each epoch, the code evaluates the model on the last mini-batch and on the full training set, and it prints out the result. Finally, the model parameters are saved to disk.' However, the code that evaluates the training set is only fed the last batch_size training instances, which confuses me a lot. Why do you evaluate only the last training batch instead of the whole training set (X_train, y_train)?
It also baffles me how to make sure all the batches in one epoch make up the whole training set. I don't know how mnist.train.next_batch() works internally, but could you explain a little bit about fetch_batch():
def fetch_batch(epoch, batch_index, batch_size):
    rnd.seed(epoch * n_batches + batch_index)
    indices = rnd.randint(m, size=batch_size)
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch
What do the batches sampled with these seeds look like? To be more specific, do the batches have overlapping instances? And is it correct to assume that the batches in one epoch cover all the samples?
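A quick numpy illustration of what this sampling does (not from the original issue; np.random.randint samples with replacement):

import numpy as np

rnd = np.random.RandomState(42)
m, batch_size = 20, 8
batch_1 = rnd.randint(m, size=batch_size)   # may contain repeated indices
batch_2 = rnd.randint(m, size=batch_size)
print(batch_1)
print(np.intersect1d(batch_1, batch_2))     # indices shared by the two batches

So batches can overlap, and an epoch built this way is not guaranteed to cover every training instance exactly once.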
I ran the function to fetch the data and it created the datasets/housing directory correctly; however, it seems to have trouble downloading the housing.tgz file. Please help!
---------------------------------------------------------------------------
SSLError Traceback (most recent call last)
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
1317 h.request(req.get_method(), req.selector, req.data, headers,
-> 1318 encode_chunked=req.has_header('Transfer-encoding'))
1319 except OSError as err: # timeout error
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py in request(self, method, url, body, headers, encode_chunked)
1238 """Send a complete request to the server."""
-> 1239 self._send_request(method, url, body, headers, encode_chunked)
1240
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
1284 body = _encode(body, 'body')
-> 1285 self.endheaders(body, encode_chunked=encode_chunked)
1286
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py in endheaders(self, message_body, encode_chunked)
1233 raise CannotSendHeader()
-> 1234 self._send_output(message_body, encode_chunked=encode_chunked)
1235
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py in _send_output(self, message_body, encode_chunked)
1025 del self._buffer[:]
-> 1026 self.send(msg)
1027
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py in send(self, data)
963 if self.auto_open:
--> 964 self.connect()
965 else:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py in connect(self)
1399 self.sock = self._context.wrap_socket(self.sock,
-> 1400 server_hostname=server_hostname)
1401 if not self._context.check_hostname and self._check_hostname:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py in wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
400 server_hostname=server_hostname,
--> 401 _context=self, _session=session)
402
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py in __init__(self, sock, keyfile, certfile, server_side, cert_reqs, ssl_version, ca_certs, do_handshake_on_connect, family, type, proto, fileno, suppress_ragged_eofs, npn_protocols, ciphers, server_hostname, _context, _session)
807 raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
--> 808 self.do_handshake()
809
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py in do_handshake(self, block)
1060 self.settimeout(None)
-> 1061 self._sslobj.do_handshake()
1062 finally:
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/ssl.py in do_handshake(self)
682 """Start the SSL/TLS handshake."""
--> 683 self._sslobj.do_handshake()
684 if self.context.check_hostname:
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)
During handling of the above exception, another exception occurred:
URLError Traceback (most recent call last)
<ipython-input-28-09097de26f5a> in <module>()
----> 1 fetch_housing_data()
<ipython-input-27-fa2a4bf02df6> in fetch_housing_data(housing_url, housing_path)
12 os.makedirs(housing_path)
13 tgz_path = os.path.join(housing_path, "housing.tgz")
---> 14 urllib.request.urlretrieve(housing_url, tgz_path)
15 housing_tgz = tarfile.open(tgz_path)
16 housing_tgz.extractall(path=housing_path)
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in urlretrieve(url, filename, reporthook, data)
246 url_type, path = splittype(url)
247
--> 248 with contextlib.closing(urlopen(url, data)) as fp:
249 headers = fp.info()
250
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
221 else:
222 opener = _opener
--> 223 return opener.open(url, data, timeout)
224
225 def install_opener(opener):
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
524 req = meth(req)
525
--> 526 response = self._open(req, data)
527
528 # post-process response
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in _open(self, req, data)
542 protocol = req.type
543 result = self._call_chain(self.handle_open, protocol, protocol +
--> 544 '_open', req)
545 if result:
546 return result
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
502 for handler in handlers:
503 func = getattr(handler, meth_name)
--> 504 result = func(*args)
505 if result is not None:
506 return result
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in https_open(self, req)
1359 def https_open(self, req):
1360 return self.do_open(http.client.HTTPSConnection, req,
-> 1361 context=self._context, check_hostname=self._check_hostname)
1362
1363 https_request = AbstractHTTPHandler.do_request_
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py in do_open(self, http_class, req, **http_conn_args)
1318 encode_chunked=req.has_header('Transfer-encoding'))
1319 except OSError as err: # timeout error
-> 1320 raise URLError(err)
1321 r = h.getresponse()
1322 except:
URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:749)>
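A commonly suggested workaround (an assumption, not taken from this thread): python.org builds of Python 3.6 on macOS do not use the system root certificates, which triggers CERTIFICATE_VERIFY_FAILED. Either run "Install Certificates.command" from the Python 3.6 application folder, or point urllib at the certifi bundle before downloading (requires the certifi package):

import ssl
import certifi
import urllib.request

# Build an HTTPS opener that verifies against certifi's CA bundle.
context = ssl.create_default_context(cafile=certifi.where())
opener = urllib.request.build_opener(urllib.request.HTTPSHandler(context=context))
urllib.request.install_opener(opener)
# After this, urllib.request.urlretrieve(housing_url, tgz_path) should verify correctly.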
Hi,
I'm running the multi-GPU code and it's throwing this error:
File "/home/USER/Tensorflow_Models/Lstm-Rnn/model.py", line 68, in init
dtype=tf.float32)
File "/usr/lib/python3.4/site-packages/tensorflow/python/ops/rnn.py", line 508, in dynamic_rnn
raise TypeError("cell must be an instance of RNNCell")
TypeError: cell must be an instance of RNNCell
My understanding of the error: the dynamic_rnn module cannot deal with the new wrapper, despite the wrapper being an RNNCell.
I've had a crack at trying to fix it, but to no avail! Thanks!
Francos
Hi @ageron ,
Thanks for the link to the Jupyter notebooks. The link to the GDP dataset from the IMF appears not to work (for the downloads only). Would it be possible to upload the CSV files themselves to the repo?
Thanks
There's a typo in the book on page 69. The code is presented as:
from sklearn.model_selection import cross_val_score
scores = cross_val_score(tree_reg, housing_prepared, housing_labels, scoring="neg_mean_squared_error", cv=10)
rmse_scores = np.sqrt(-scores)  # here is the typo: the variable should have been named tree_rmse_scores
while on the next page, we have:
display_scores(tree_rmse_scores)
The variable tree_rmse_scores does not exist.
On GitHub the code is correct: ch2 code project, In [79].
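For reference, the corrected snippet (matching the notebook) simply names the variable consistently:

from sklearn.model_selection import cross_val_score

scores = cross_val_score(tree_reg, housing_prepared, housing_labels,
                         scoring="neg_mean_squared_error", cv=10)
tree_rmse_scores = np.sqrt(-scores)
display_scores(tree_rmse_scores)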
Hi,
I am using TF 1.1 with GPU on Windows 10. When I try to run the code below from Chapter 10 to use tf.learn, I get an error as described below:
INFO:tensorflow:Create CheckpointSaverHook.
---------------------------------------------------------------------------
InternalError Traceback (most recent call last)
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1038 try:
-> 1039 return fn(*args)
1040 except errors.OpError as e:
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata)
1020 feed_dict, fetch_list, target_list,
-> 1021 status, run_metadata)
1022
C:\ProgramData\Anaconda2\envs\tensorflow\lib\contextlib.py in __exit__(self, type, value, traceback)
65 try:
---> 66 next(self.gen)
67 except StopIteration:
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\framework\errors_impl.py in raise_exception_on_not_ok_status()
465 compat.as_text(pywrap_tensorflow.TF_Message(status)),
--> 466 pywrap_tensorflow.TF_GetCode(status))
467 finally:
InternalError: Blas GEMM launch failed : a.shape=(50, 784), b.shape=(784, 300), m=50, n=300, k=784
[[Node: dnn/hiddenlayer_0/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](dnn/input_from_feature_columns/input_from_feature_columns/concat, dnn/hiddenlayer_0/weights)]]
During handling of the above exception, another exception occurred:
InternalError Traceback (most recent call last)
<ipython-input-3-dc76eaab997d> in <module>()
7 feature_columns=feature_cols, config=config)
8 dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
----> 9 dnn_clf.fit(X_train, y_train, batch_size=50, steps=40000)
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py in fit(self, x, y, batch_size, steps, max_steps, monitors)
1315 steps=steps,
1316 max_steps=max_steps,
-> 1317 monitors=all_monitors)
1318 return self
1319
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py in new_func(*args, **kwargs)
279 _call_location(), decorator_utils.get_qualified_name(func),
280 func.__module__, arg_name, date, instructions)
--> 281 return func(*args, **kwargs)
282 new_func.__doc__ = _add_deprecated_arg_notice_to_docstring(
283 func.__doc__, date, instructions)
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py in fit(self, x, y, input_fn, steps, batch_size, monitors, max_steps)
428 hooks.append(basic_session_run_hooks.StopAtStepHook(steps, max_steps))
429
--> 430 loss = self._train_model(input_fn=input_fn, hooks=hooks)
431 logging.info('Loss for final step: %s.', loss)
432 return self
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py in _train_model(self, input_fn, hooks)
976 loss = None
977 while not mon_sess.should_stop():
--> 978 _, loss = mon_sess.run([model_fn_ops.train_op, model_fn_ops.loss])
979 summary_io.SummaryWriterCache.clear()
980 return loss
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
482 feed_dict=feed_dict,
483 options=options,
--> 484 run_metadata=run_metadata)
485
486 def should_stop(self):
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
818 feed_dict=feed_dict,
819 options=options,
--> 820 run_metadata=run_metadata)
821 except _PREEMPTION_ERRORS as e:
822 logging.info('An error was raised. This may be due to a preemption in '
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, *args, **kwargs)
774
775 def run(self, *args, **kwargs):
--> 776 return self._sess.run(*args, **kwargs)
777
778
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, fetches, feed_dict, options, run_metadata)
928 feed_dict=feed_dict,
929 options=options,
--> 930 run_metadata=run_metadata)
931
932 for hook in self._hooks:
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\training\monitored_session.py in run(self, *args, **kwargs)
774
775 def run(self, *args, **kwargs):
--> 776 return self._sess.run(*args, **kwargs)
777
778
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
776 try:
777 result = self._run(None, fetches, feed_dict, options_ptr,
--> 778 run_metadata_ptr)
779 if run_metadata:
780 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
980 if final_fetches or final_targets:
981 results = self._do_run(handle, final_targets, final_fetches,
--> 982 feed_dict_string, options, run_metadata)
983 else:
984 results = []
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1030 if handle is None:
1031 return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
-> 1032 target_list, options, run_metadata)
1033 else:
1034 return self._do_call(_prun_fn, self._session, handle, feed_dict,
C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
1050 except KeyError:
1051 pass
-> 1052 raise type(e)(node_def, op, message)
1053
1054 def _extend_graph(self):
**InternalError: Blas GEMM launch failed :** a.shape=(50, 784), b.shape=(784, 300), m=50, n=300, k=784
[[Node: dnn/hiddenlayer_0/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](dnn/input_from_feature_columns/input_from_feature_columns/concat, dnn/hiddenlayer_0/weights)]]
Caused by op 'dnn/hiddenlayer_0/MatMul', defined at:
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
app.start()
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel\kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tornado\ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tornado\stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel\kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\ipykernel\zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2827, in run_ast_nodes
if self.run_code(code, result):
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-3-dc76eaab997d>", line 9, in <module>
dnn_clf.fit(X_train, y_train, batch_size=50, steps=40000)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 1317, in fit
monitors=all_monitors)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 430, in fit
loss = self._train_model(input_fn=input_fn, hooks=hooks)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 927, in _train_model
model_fn_ops = self._get_train_ops(features, labels)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 1132, in _get_train_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.TRAIN)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\estimator.py", line 1103, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\learn\python\learn\estimators\dnn.py", line 143, in _dnn_model_fn
scope=hidden_layer_scope)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\framework\python\ops\arg_scope.py", line 181, in func_with_args
return func(*args, **current_args)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\contrib\layers\python\layers\layers.py", line 1433, in fully_connected
outputs = layer.apply(inputs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\layers\base.py", line 320, in apply
return self.__call__(inputs, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\layers\base.py", line 290, in __call__
outputs = self.call(inputs, **kwargs)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\layers\core.py", line 144, in call
outputs = standard_ops.matmul(inputs, self.kernel)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1801, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1263, in _mat_mul
transpose_b=transpose_b, name=name)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 768, in apply_op
op_def=op_def)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 2336, in create_op
original_op=self._default_original_op, op_def=op_def)
File "C:\ProgramData\Anaconda2\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1228, in __init__
self._traceback = _extract_stack()
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(50, 784), b.shape=(784, 300), m=50, n=300, k=784
[[Node: dnn/hiddenlayer_0/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](dnn/input_from_feature_columns/input_from_feature_columns/concat, dnn/hiddenlayer_0/weights)]]
This code segment throws this exception in TF 1.1:
import tensorflow as tf
config = tf.contrib.learn.RunConfig(tf_random_seed=42) # not shown in the config
feature_cols = tf.contrib.learn.infer_real_valued_columns_from_input(X_train)
dnn_clf = tf.contrib.learn.DNNClassifier(hidden_units=[300,100], n_classes=10,
feature_columns=feature_cols, config=config)
dnn_clf = tf.contrib.learn.SKCompat(dnn_clf) # if TensorFlow >= 1.1
dnn_clf.fit(X_train, y_train, batch_size=50, steps=40000)
Any help is appreciated!
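One hedged suggestion, not a confirmed fix: "Blas GEMM launch failed" often means the GPU's memory is already fully allocated (for example by another notebook kernel). For plain tf.Session code, the usual workaround is to allocate GPU memory on demand; how to thread this option through tf.contrib.learn's RunConfig is not shown here:

import tensorflow as tf

gpu_options = tf.GPUOptions(allow_growth=True)      # grab GPU memory as needed
config = tf.ConfigProto(gpu_options=gpu_options)
with tf.Session(config=config) as sess:
    pass  # build and run your graph here

Restarting any other kernels or processes that hold the GPU is also worth trying.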
In handson-ml/15_autoencoders.ipynb, cell 7 (and Chapter 15 in the book), you say an AE with an MSE loss is equivalent to PCA, but you have the following code:
reconstruction_loss = tf.reduce_sum(tf.square(outputs - X)) # MSE
This looks like the SSE to me. Probably this is a slight error, or I am just confused. Just a heads up :) Amazing book, by the way!
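For comparison, a mean rather than a sum would look like this (a sketch assuming the notebook's outputs and X tensors):

import tensorflow as tf

# tf.reduce_sum gives the sum of squared errors (SSE); tf.reduce_mean gives the
# MSE. Both have the same minimizer, since they differ only by a constant factor.
reconstruction_loss = tf.reduce_mean(tf.square(outputs - X))  # MSE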
This line in Chapter 3:
y_train_knn_pred = cross_val_predict(knn_clf, X_train, y_train, cv=3)
is causing my system to hang.
macOS Sierra 10.12.3 (16D32)
Python:
3.6.0 (default, Dec 30 2016, 15:26:50)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]
jupyter-client==4.4.0
jupyter-core==4.3.0
Hi Aurélien,
In this exercise, does it implicitly assume that three different filters are applied to the three input channels (RGB) independently, like depthwise_conv2d()? Previously, only one filter operated on the same patch across the different channels. Is that the case? Are there any rules regarding how to choose between these two ways of applying filters?
Hi @ageron,
In the book, Figure 15-8 says: "you can first train a stacked autoencoder using all the data, then reuse the lower layers to create a neural network for your actual task, and train it using the labeled data".
But in the example code, phase 1 does not use all the data ("output ≈ input") to train the autoencoder.
Thanks!
This is not an issue as such but a request. In chapter 16, we spend a lot of time getting acquainted with reinforcement learning and finally train a decent DQN, but we don't really see it play Pacman.
I think it'd be good to reuse the code already present in the notebook and create a short animation of our DQN playing.
I tried to do it myself, but ran into difficulties while calling action.eval().
Hello, Aurélien.
Thank you a lot for such a great and comprehensive book.
I have one question regarding exercises in chapter 4.
Question: Suppose the features in your training set have very different scales. What algorithms might suffer from this, and how? What can you do about it?
Answer (from Excercise solutions): If the features in your training set have very different scales, the cost function will have the shape of an elongated bowl, so the Gradient Descent algorithms will take a long time to converge. To solve this you should scale the data before training the model. Note that the Normal Equation will work just fine without scaling.
I understand that, in the case of Gradient Descent, given enough time it converges anyway (although it might hit the limit on the number of iterations). However, with regularized models the situation is even worse, because with very different feature scales the model can be less precise. Looking at the documentation, I found that LogisticRegression also uses some regularization by default and can therefore suffer noticeably.
That was my conclusion, but I didn't find it in the exercise solution and thought that I might be missing something.
After training a Linear Regression model and loading some data to try it out on, I received the following error:
some_data_prepared = full_pipeline.transform(some_data)
Traceback (most recent call last):
File "<ipython-input-10-ce624cfb8912>", line 2, in <module>
some_data_prepared = full_pipeline.transform(some_data)
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\pipeline.py", line 763, in transform
for name, trans, weight in self._iter())
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 758, in
__call__
while self.dispatch_one_batch(iterator):
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 608, in
dispatch_one_batch
self._dispatch(tasks)
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 571, in
_dispatch
job = self._backend.apply_async(batch, callback=cb)
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line
109, in apply_async
result = ImmediateResult(func)
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line
326, in __init__
self.results = batch()
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 131, in
__call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 131, in
<listcomp>
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\pipeline.py", line 567, in _transform_one
res = transformer.transform(X)
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\pipeline.py", line 445, in _transform
Xt = transform.transform(Xt)
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\preprocessing\imputation.py", line 311, in
transform
check_is_fitted(self, 'statistics_')
File "C:\Users\bryan\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 690, in
check_is_fitted
raise _NotFittedError(msg % {'name': type(estimator).__name__})
NotFittedError: This Imputer instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
I changed the method to fit_transform and it worked fine.
some_data_prepared = full_pipeline.fit_transform(some_data)
Perhaps the code in the notebook should be updated accordingly?
Update:
My suggestion yields some_data_prepared with a shape of (5, 14), which prevents the predict method from running because the shapes do not align.
Instead, the Imputer instance needs to be fitted separately before running full_pipeline's transform method on some_data.
imputer = Imputer(strategy="median")
imputer.fit(housing_num)
If users went through the chapter entering all the code as instructed, this would not be an issue. I had closed out and come back to it, losing the fitted Imputer instance in the process.
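A sketch of the intended usage, assuming the Chapter 2 variable names: fit the full pipeline once on the training set, then reuse the already-fitted pipeline to transform any new data.

# Fit (and transform) on the training set first...
housing_prepared = full_pipeline.fit_transform(housing)
# ...then only transform new data with the fitted pipeline; shapes now match the model.
some_data_prepared = full_pipeline.transform(some_data)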
I'm having this problem; I'm new to using Jupyter and don't know what info could be relevant for the issue. Please let me know if there's any extra data I can provide:
[W 19:07:07.908 NotebookApp] server_extensions is deprecated, use nbserver_extensions
[W 19:07:08.549 NotebookApp] Error loading server extension jupyter_nbextensions_configurator
On page 41:
$ cd $ML_PATH
$ virtualenv env
Is there a typo here? Should it be virtualenv $ML_PATH? If not, what does env represent?
From the virtualenv usage docs: "Where ENV is a directory to place the new virtual environment."
In Solution 8.4 ("Now try adding Batch Normalization ...") you concluded that with Batch Normalization the model performed even worse. However, I found that there is a little bug in DNNClassifier:
self._graph = tf.Graph()
with self._graph.as_default():
    self._build_graph(n_inputs, n_outputs)
    # ... removed 5 lines ... #

# extra ops for batch normalization
extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)

# Now train the model!
self._session = tf.Session(graph=self._graph)
with self._session.as_default() as sess:
    self._init.run()
Here extra_update_ops is retrieved outside of the graph's context; I believe this is the issue with Batch Normalization. If you move the line extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) into the with self._graph.as_default(): block, it will perform very close to the original DNNClassifier, or even better.
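Sketched, the reported fix just moves the collection lookup inside the graph's context (same names as the snippet above):

self._graph = tf.Graph()
with self._graph.as_default():
    self._build_graph(n_inputs, n_outputs)
    # extra ops for batch normalization, fetched while this graph is the default
    extra_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)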
And I reported this bug as an excuse to tell you again how much I love your book. Thank you very much for your incredible work. I'm enjoying every single chapter of it!
In chapter two, the cell which imports the scatter matrix from pandas uses a deprecated call to do so:
from pandas.tools.plotting import scatter_matrix
attributes = ["median_house_value", "median_income", "total_rooms",
              "housing_median_age"]
scatter_matrix(housing[attributes], figsize=(12, 8))
save_fig("scatter_matrix_plot")
This returns the warning:
/home/martyn/anaconda2/lib/python2.7/site-packages/ipykernel/__main__.py:5: FutureWarning: 'pandas.tools.plotting.scatter_matrix' is deprecated, import 'pandas.plotting.scatter_matrix' instead.
It is remedied by simply removing the "tools." after "from pandas", i.e. importing scatter_matrix from pandas.plotting instead.
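For reference, the replacement import suggested by the warning:

from pandas.plotting import scatter_matrix  # pandas >= 0.20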
Hi @ageron, in Chapter 4, section "Linear regression using batch gradient descent", the following code does not update theta. Am I missing something?
eta = 0.1
n_iterations = 1000
m = 100
theta = np.random.randn(2, 1)

for iteration in range(n_iterations):
    gradients = 2/m * X_b.T.dot(X_b.dot(theta) - y)
    theta = theta - eta * gradients
Hi,
I've been following the book and I really like the content so far. While going through Chapter 8, I tried exercise 9, which asks you to reduce the dimensionality of the MNIST dataset and compare the training times of the original set and the reduced set on a random forest classifier. However, the reduced dataset takes significantly longer to train for me and results in much lower accuracy. It's certainly possible I've done something wrong; https://github.com/ndalton12/Python-Learning/blob/master/ML/Chapter8_Exercises/Chapter8_Exercises.ipynb shows my code, which is based heavily on the examples from the chapter.
Any help or reasoning I'm missing is appreciated.
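A rough sketch of the comparison being described (assuming X_train and y_train hold MNIST as in the chapter); a slowdown on the PCA-reduced data is a plausible outcome for random forests, so it is not necessarily a mistake:

import time
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

pca = PCA(n_components=0.95)                      # keep 95% of the variance
X_train_reduced = pca.fit_transform(X_train)

for name, data in [("original", X_train), ("reduced", X_train_reduced)]:
    clf = RandomForestClassifier(n_estimators=10, random_state=42)
    t0 = time.time()
    clf.fit(data, y_train)
    print(name, "training took", time.time() - t0, "seconds")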
Hi,
I ran the code in [42] and received this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-27-87fea2980420> in <module>()
----> 1 show_graph(tf.get_default_graph)
<ipython-input-26-37b5a5465d6e> in show_graph(graph_def, max_const_size)
18 if hasattr(graph_def, 'as_graph_def'):
19 graph_def = graph_def.as_graph_def()
---> 20 strip_def = strip_consts(graph_def, max_const_size=max_const_size)
21 code = """
22 <script>
<ipython-input-26-37b5a5465d6e> in strip_consts(graph_def, max_const_size)
4 """Strip large constant values from graph_def."""
5 strip_def = tf.GraphDef()
----> 6 for n0 in graph_def.node:
7 n = strip_def.node.add()
8 n.MergeFrom(n0)
AttributeError: 'function' object has no attribute 'node'
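The likely cause, judging from the traceback: tf.get_default_graph was passed as a function object instead of being called, so show_graph received a function rather than a graph. Calling it should fix the AttributeError:

show_graph(tf.get_default_graph())  # note the parentheses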
This code is in cell [22] of the notebook but does not appear on page 85 of the book.
y_train_perfect_predictions = y_train_5
@ageron What is the best way to let you know of items such as this one? I am going through the book in a very methodical, slow manner; hence, I might find things that you might want to know about.
Hi Aurelien,
I've been getting through your (excellent) book, and I noticed that in the manual computation of the gradients in Chapter 9 I was getting gradients of 0 (no learning). The offending line was gradients = 2/m * tf.matmul(tf.transpose(X), error), where 2/m is an integer division resulting in 0 under Python 2.
When I switched the 2 to float notation (2.), the gradients were updated accordingly and everything seemed to work.
Take care.
Michael
Hi,
I have purchased this book from Amazon, but where is the fetch_batch() function used in Chapter 9 defined? I tried but couldn't find it.
In 14_recurrent_neural_networks.ipynb, the cell directly after the "Deep RNN" title didn't work for me.
The error reported was:
ValueError Traceback (most recent call last)
<ipython-input-41-fcd832f48dd9> in <module>()
10 basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
11 multi_layer_cell = tf.contrib.rnn.MultiRNNCell([basic_cell] * n_layers)
---> 12 outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, X, dtype=tf.float32)
13
14 init = tf.global_variables_initializer()
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py in dynamic_rnn(cell, inputs, sequence_length, initial_state, dtype, parallel_iterations, swap_memory, time_major, scope)
551 swap_memory=swap_memory,
552 sequence_length=sequence_length,
--> 553 dtype=dtype)
554
555 # Outputs of _dynamic_rnn_loop are always shaped [time, batch, depth].
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py in _dynamic_rnn_loop(cell, inputs, initial_state, parallel_iterations, swap_memory, sequence_length, dtype)
718 loop_vars=(time, output_ta, state),
719 parallel_iterations=parallel_iterations,
--> 720 swap_memory=swap_memory)
721
722 # Unpack final output if not using output tuples.
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py in while_loop(cond, body, loop_vars, shape_invariants, parallel_iterations, back_prop, swap_memory, name)
2621 context = WhileContext(parallel_iterations, back_prop, swap_memory, name)
2622 ops.add_to_collection(ops.GraphKeys.WHILE_CONTEXT, context)
-> 2623 result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
2624 return result
2625
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py in BuildLoop(self, pred, body, loop_vars, shape_invariants)
2454 self.Enter()
2455 original_body_result, exit_vars = self._BuildLoop(
-> 2456 pred, body, original_loop_vars, loop_vars, shape_invariants)
2457 finally:
2458 self.Exit()
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/control_flow_ops.py in _BuildLoop(self, pred, body, original_loop_vars, loop_vars, shape_invariants)
2404 structure=original_loop_vars,
2405 flat_sequence=vars_for_body_with_tensor_arrays)
-> 2406 body_result = body(*packed_vars_for_body)
2407 if not nest.is_sequence(body_result):
2408 body_result = [body_result]
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py in _time_step(time, output_ta_t, state)
703 skip_conditionals=True)
704 else:
--> 705 (output, new_state) = call_cell()
706
707 # Pack state if using state tuples
/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/rnn.py in <lambda>()
689
690 input_t = nest.pack_sequence_as(structure=inputs, flat_sequence=input_t)
--> 691 call_cell = lambda: cell(input_t, state)
692
693 if sequence_length is not None:
/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
951 state, [0, cur_state_pos], [-1, cell.state_size])
952 cur_state_pos += cell.state_size
--> 953 cur_inp, new_state = cell(cur_inp, cur_state)
954 new_states.append(new_state)
955 new_states = (tuple(new_states) if self._state_is_tuple else
/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
118 def __call__(self, inputs, state, scope=None):
119 """Most basic RNN: output = new_state = act(W * input + U * state + B)."""
--> 120 with _checked_scope(self, scope or "basic_rnn_cell", reuse=self._reuse):
121 output = self._activation(
122 _linear([inputs, state], self._num_units, True))
/usr/lib/python3.5/contextlib.py in __enter__(self)
57 def __enter__(self):
58 try:
---> 59 return next(self.gen)
60 except StopIteration:
61 raise RuntimeError("generator didn't yield") from None
/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in _checked_scope(cell, scope, reuse, **kwargs)
75 "this error will remain until then.)"
76 % (cell, cell_scope.name, scope_name, type(cell).__name__,
---> 77 type(cell).__name__))
78 else:
79 weights_found = False
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.BasicRNNCell object at 0x7f845c714160> with a different variable scope than its first use. First use of cell was with scope 'rnn/multi_rnn_cell/cell_0/basic_rnn_cell', this attempt is with scope 'rnn/multi_rnn_cell/cell_1/basic_rnn_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([BasicRNNCell(...)] * num_layers), change to: MultiRNNCell([BasicRNNCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)
The system used is Ubuntu 16.04 x64, with both Python 2 and Python 3 installed and tensorflow-gpu present for both Pythons.
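The error message itself suggests the fix; sketched with the notebook's variable names, it creates one cell instance per layer instead of reusing a single instance:

layers = [tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)
          for _ in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, X, dtype=tf.float32)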
Hi ageron, first of all, I really find your book interesting and easy to follow for a programmer who wants to know the technology behind machine learning. Thanks a lot!!
Here is a little problem I hit when I ran part of your code under Python 2:
Chapter 4 Gradient Descent:
gradients = 2/m * X_b.T.dot(X_b.dot(theta) - y)
In Python 2, '/' performs integer division rather than returning a float when both the divisor and dividend are integers. Here m = 100, so 2/100 = 0 and the gradients are always 0. We can fix this by changing the line to:
gradients = 2.0/m * X_b.T.dot(X_b.dot(theta) - y)
Thx :)
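An alternative to changing 2 into 2.0 is to make "/" behave like Python 3 division throughout the script (placed at the very top of the file):

from __future__ import division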
Getting "unknown character" renders of some equations in
https://github.com/ageron/handson-ml/blob/master/math_linear_algebra.ipynb
for example, in the Main properties section of Dot Product. Is it just me? I've tried chrome and safari.
The paragraph at the top of page 52 of your book says "The following code creates an income category attribute by dividing the median income by 1.5 ....., and then merging all the categories greater than 5 into category 5"
housing["income_cat"].where(housing["income_cat"] < 5, 5.0, inplace=True)
Is this an error or an incorrect statement? It seems to me that the where clause is trying to do the exact opposite, i.e., the condition should be > 5 instead of < 5. Please explain. Also, let me know if this is not the right place to ask questions; if not, where should I ask?
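A quick illustration of pandas where() semantics, not part of the original issue: Series.where(cond, other) keeps the values where cond is True and replaces the rest with other, so the < 5 condition caps the categories at 5 rather than doing the opposite.

import pandas as pd

s = pd.Series([1.0, 3.0, 6.0, 9.0])
print(s.where(s < 5, 5.0))   # -> 1.0, 3.0, 5.0, 5.0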
Hello
First, let me say I am enjoying your book and am learning quite a bit. I did have trouble getting the MNIST dataset, and as your notebook suggests, you have a fallback implementation if mldata.org is down. I had a problem with your alternate implementation around SSL certificates; however, I was able to get around this using the requests package.
I have put together a Gist with my implementation in case anyone else might also have this problem.
https://gist.github.com/youngsoul/fc69665c5d08e189c57c0db0e93017a6
with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y)  # 10
    print(z)  # 15
In this code block, based on the book, the print(y) and print(z) calls should be print(y_val) and print(z_val).
Book page: 235
Hey, first off I'm really enjoying your book so far, but I was wondering when you will have the solutions up for the coding exercises. I'm really stuck on Chapter 2 Exercise 3 and would love to know the correct way to solve the problem.
"This concludes this introduction to Linear Algeabra"
s/Algeabra/Algebra/
Hi,
does anyone know a workaround?
On page 43/564, Example 1-1 ("Training and running a linear model using Scikit-Learn"):
how do I overcome this error?
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
Traceback (most recent call last):
File ".\example_1.1.py", line 12, in
country_stats = prepare_country_stats(oecd_bli, gdp_per_capita)
NameError: name 'prepare_country_stats' is not defined
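The NameError simply means the helper is not defined in the standalone script; the book assumes it is written by the reader (the notebooks in this repo define it). A rough sketch of what such a helper can look like, not the book's exact definition:

import pandas as pd

def prepare_country_stats(oecd_bli, gdp_per_capita):
    # Keep one row per country from the OECD Better Life Index data...
    oecd_bli = oecd_bli[oecd_bli["INEQUALITY"] == "TOT"]
    oecd_bli = oecd_bli.pivot(index="Country", columns="Indicator", values="Value")
    # ...merge in GDP per capita (the "2015" column name is an assumption about the IMF file)...
    gdp_per_capita = gdp_per_capita.rename(columns={"2015": "GDP per capita"})
    gdp_per_capita = gdp_per_capita.set_index("Country")
    full_stats = pd.merge(oecd_bli, gdp_per_capita, left_index=True, right_index=True)
    # ...and keep the two columns used by Example 1-1.
    return full_stats[["GDP per capita", "Life satisfaction"]]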
Enjoying the book thus far!
However, when running full_pipeline.fit_transform(housing),
I receive an IndexError, which seems to be a product of the transform method of the DataFrameSelector class.
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
DataFrameSelector is defined in cell 51 of the notebook.
In the book you reference other ways to do the task of DataFrameSelector, but I'd like to understand exactly what went wrong here.
Hi ageron!
Great book! I am using the notebooks and trying to understand everything properly, so I forked the repo and will submit some PRs soon... but this one I couldn't solve. In 02_end_to_end_machine_learning_project.ipynb there is a part where you split the data by the hash of each instance's identifier with this function. I am not able to make this function work because it is throwing the following error:
----> 4 return hash(identifier).digest()[-1] < 256 * test_ratio
5
6 def split_train_test_by_id(data, test_ratio, id_column, hash=hashlib.md5):
TypeError: object supporting the buffer API required
Thanks for the book and I will keep reading on the early-release!
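A hedged workaround rather than a confirmed fix: hashlib needs a bytes-like object, but the row index passed in is a plain Python int, hence the TypeError. Converting the identifier to bytes before hashing avoids it (the helper name is assumed from the chapter):

import hashlib

def test_set_check(identifier, test_ratio, hash=hashlib.md5):
    # encode the identifier so hashlib receives bytes rather than an int
    return hash(str(identifier).encode("utf-8")).digest()[-1] < 256 * test_ratio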
def plot_confusion_matrix(matrix):
    """If you prefer color and a colorbar"""
    fig = plt.figure(figsize=(8,8))
    ax = fig.add_subplot(111)
    cax = ax.matshow(conf_mx)
    fig.colorbar(cax)
In this code block, line 5 should be cax = ax.matshow(matrix)
Binder link leads to a page which requires password. Text on the page reads:
Token authentication is enabled. You need to open the notebook server with its first-time login token in the URL, or enable a password in order to gain access. The command:
jupyter notebook list
will show you the URLs of running servers with their tokens, which you can copy and paste into your browser. For example:
Currently running servers:
http://localhost:8888/?token=c8de56fa... :: /Users/you/notebooks
Or you can paste just the token value into the password field on this page.
Cookies are required for authenticated access to notebooks.
hi @ageron ,
a function in 15_autoencoders
def show_reconstructed_digits(X, outputs, model_path = None, n_test_digits = 2):
the 'X' parameter is never used.
Thanks!
Not so much an issue as a suggestion: I noticed in the latter stages of the Chapter 2 end-to-end tutorial that, despite my CPU supporting 8 threads, the more math-intensive parts of the code only ever ran on one core. Given the complexity and size of some datasets, it may be wiser to let the code spread itself across all available cores.
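One hedged note rather than a change to the notebooks: several of the scikit-learn estimators and helpers used in that chapter already expose an n_jobs parameter that spreads the work across cores, for example:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, cross_val_score

forest_reg = RandomForestRegressor(n_jobs=-1)        # use every available core
# grid_search = GridSearchCV(forest_reg, param_grid, cv=5, n_jobs=-1)
# scores = cross_val_score(forest_reg, X, y, scoring="neg_mean_squared_error", cv=10, n_jobs=-1)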
Hi Geron,
Thanks for your great book. I am new to TensorFlow and following your implementation; however, I am encountering an error with the Linear Regression implementation.
I implemented it using the California dataset downloaded in Chapter 2, because downloading it with sklearn's fetch function returns a response error.
After running my implementation, the result is all NaN. Can you help me figure out what could be wrong?
One observation: using np.dot to compute X.T * X, I found that the result contains NaN values; I don't know if this could be a clue to why the result is as it is.
Below is my code:
import numpy as np
import os
import pandas as pd
import tensorflow as tf   # missing from the original snippet

FILENAME = 'housing.csv'
HOUSING_PATH = "datasets/housing"

def load_housing_data(housing_path=HOUSING_PATH, csv_filename=FILENAME):
    """Load the housing CSV file given a path and filename."""
    csv_path = os.path.join(housing_path, csv_filename)
    return pd.read_csv(csv_path)

housing = load_housing_data()
housing_target = housing['median_house_value']
housing_target = pd.Series.as_matrix(housing_target)
housing.drop(['ocean_proximity', 'median_house_value'], axis=1, inplace=True)
housing_matrix = pd.DataFrame.as_matrix(housing)
m, n = housing.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing_matrix]

tf.reset_default_graph()
X = tf.constant(housing_data_plus_bias, dtype=tf.float64, name="X")
y = tf.constant(housing_target.reshape(-1, 1), dtype=tf.float64, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    result = theta.eval()
    print(result)

[[ nan]
 [ nan]
 [ nan]
 [ nan]
 [ nan]
 [ nan]
 [ nan]
 [ nan]
 [ nan]]
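One likely cause (an assumption, but consistent with the NaNs seen in X.T.dot(X)): the Chapter 2 housing.csv has missing values in the total_bedrooms column, and any NaN propagates through the matrix products and the inverse. Filling or dropping them first avoids this:

# Fill the missing total_bedrooms values before building the design matrix.
housing["total_bedrooms"].fillna(housing["total_bedrooms"].median(), inplace=True)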
Hi @ageron,
According to the notes in the book, the ResNet-34 model is to be implemented. However, I cannot find it in 13_convolutional_neural_networks.ipynb. Could you help by uploading it?
Thanks!
I'm working through the Chapter 2 exercises with the latest scikit-learn (0.18.2), and this code was failing in exercise 4:
prepare_select_and_predict_pipeline.fit(housing, housing_labels)
The error was that fit expects 2 arguments but was getting 3. I traced it back to LabelBinarizer's fit method not being compatible with pipelines; I think this is something that must have changed since Chapter 2 was authored. Here is a relevant issue from the scikit-learn project: scikit-learn/scikit-learn#3112.
I managed to fix it by wrapping the LabelBinarizer in a transformer with the needed signature, and it works:
class CustomBinarizer(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None, **fit_params):
        return self
    def transform(self, X):
        return LabelBinarizer().fit(X).transform(X)
Then just replace LabelBinarizer with CustomBinarizer where cat_pipeline is defined.
Note: everything works up until the exercises, because fit is not called with 2 arguments until then.
Hi Aurelien - Great book... I've caught a recurring error that you may be able to help with. In the code you've provided to supplement the book, you frequently use save_fig() after generating a plot. I'd like to use this feature, but it always errors out for me, and I've seen quite a few discussion threads on it as well.
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-38-299395e3626e> in <module>()
1 housing.plot(kind="scatter", x="median_income", y="median_house_value", alpha=0.1)
----> 2 save_fig("income_vs_house_value_scatterplot")
NameError: name 'save_fig' is not defined
Just wondering if we've missed a step somewhere...
As you may be able to surmise, I'm quite new to machine learning, but am thoroughly enjoying this book.
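save_fig() is not part of any library; the notebooks define it near the top, so it is missing when cells are run out of order or copied into a fresh script. A minimal stand-in (not necessarily the notebooks' exact definition):

import os
import matplotlib.pyplot as plt

IMAGES_PATH = "images"   # assumed output directory

def save_fig(fig_id, tight_layout=True):
    os.makedirs(IMAGES_PATH, exist_ok=True)
    path = os.path.join(IMAGES_PATH, fig_id + ".png")
    if tight_layout:
        plt.tight_layout()
    plt.savefig(path, format="png", dpi=300)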
Hi, I'm getting an error message when I try to run this code from Chapter 9, p. 248. Any suggestions for solving this, or is the resource link broken?
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
The final error message is
BadZipFile: File is not a zip file
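A hypothetical workaround, not confirmed in the thread: a partially downloaded cache file often produces "File is not a zip file". Clearing scikit-learn's data cache forces a fresh download (note that this removes every cached dataset, not just this one):

import shutil
from sklearn.datasets import get_data_home, fetch_california_housing

shutil.rmtree(get_data_home(), ignore_errors=True)   # delete the cached downloads
housing = fetch_california_housing()                 # re-download from scratch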
Dear Aurélien,
I noticed from your Twitter that your great book was published 2 days ago.
Before it was published officially, were the code examples in the book also updated to TF 1.0, like the online and digital versions?
Thanks a lot.
There's a note that says:
Don’t forget that you can treat some of the data preparation steps as hyperparameters. For example, the grid search will automatically find out whether or not to add a feature you were not sure about (e.g., using the add_bedrooms_per_room hyperparameter of your CombinedAttributesAdder transformer).
It's not obvious to me how to do this. The given example only shows how to search the parameters of the model, not of the data pipeline.
Looking online, it seems that one way is to add the model to the pipeline and then identify the hyperparameters as stagename__hyperparamname (see the sketch below)?
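A minimal sketch of that approach, assuming the Chapter 2 names (full_pipeline, housing, housing_labels); the exact nested parameter name depends on how the pipeline steps are named:

from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestRegressor

full_pipeline_with_predictor = Pipeline([
    ("preparation", full_pipeline),            # the data-preparation pipeline
    ("forest", RandomForestRegressor()),       # the model, as the final step
])

param_grid = {
    # hypothetical nested name: preparation step -> num pipeline -> attribs_adder
    "preparation__num_pipeline__attribs_adder__add_bedrooms_per_room": [True, False],
    "forest__n_estimators": [10, 30],
}
grid_search = GridSearchCV(full_pipeline_with_predictor, param_grid,
                           cv=5, scoring="neg_mean_squared_error")
grid_search.fit(housing, housing_labels)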
Thanks for your fantastic book!
Not sure if this is a bug or just the way things are. The linear algebra notebook is incredibly slow to render and to scroll through; I suspect it's getting bogged down rendering the equations. I've tried it on two different computers using IE, Edge, and Opera. Edge sometimes just crashes; Opera runs the notebook, but poorly.
Let me know if there's any debug info I should gather.