random-forests / tutorials
License: Apache License 2.0
A program that uses a pandas DataFrame is clearly more useful than one that uses a 2D matrix.
So please translate the code to a pandas DataFrame, or I could do that for you.
Hi Josh Gordon,
Thanks for sharing this; it gave me a much better understanding of decision trees.
I have a suggestion for the function 'is_numeric'.
In your example there is no bool column in the training data, so 'is_numeric' works fine. But if I add a bool column to the dataset, is_numeric(True) returns True.
So I suggest changing the function to the following, to take bool values into account:
def is_numeric(value): return type(value) in (float,int)
thanks~
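The behaviour described above follows from bool being a subclass of int in Python. A minimal sketch comparing the two checks; the isinstance-based version is an assumption about how the original function was written, and both function names are made up for illustration:

```python
def is_numeric_isinstance(value):
    # Assumed form of the original check. bool is a subclass of int,
    # so True and False pass this test.
    return isinstance(value, (int, float))


def is_numeric_strict(value):
    # The check suggested above: an exact type match, so bool is rejected.
    return type(value) in (int, float)


for v in (3, 2.5, True, "Red"):
    print(repr(v), is_numeric_isinstance(v), is_numeric_strict(v))
```

One trade-off: an exact type match also rejects subclasses of float such as numpy.float64, which may matter if the data comes from NumPy or pandas.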
In the latest version of tensorflow (1.3.0) the LinearClassifier object does not have a weights_ member. Instead, the weights have to be retrieved as follows:
weights = classifier.get_variable_value("linear//weight")
Full disclosure: I'm not using the Docker image, but working in my own environment on a Mac (10.12.2) with Python 2.7 (via Homebrew) and TensorFlow 0.12.1.
I'm going through the code for Episode 7, where I get an "IndexError: invalid index to scalar variable" on the line classifier.fit(data, labels, batch_size=100, steps=1000). Here's my code in full (exactly the same as the tutorial):
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
learn = tf.contrib.learn
mnist = learn.datasets.load_dataset('mnist')
data = mnist.train.images
labels = np.asarray(mnist.train.labels, dtype=np.int32)
test_data = mnist.test.images
test_labels = np.asarray(mnist.test.labels, dtype=np.int32)
max_examples = 10000
data = data[:max_examples]
labels = labels[max_examples]
feature_columns = learn.infer_real_valued_columns_from_input(data)
classifier = learn.LinearClassifier(feature_columns=feature_columns, n_classes=10)
classifier.fit(data, labels, batch_size=100, steps=1000)
Here's the complete error:
Traceback (most recent call last):
File "/Users/mbaytas/Dropbox/works-code/ml-studies/google-recipes/ep7-mnist/ep7.py", line 38, in <module>
classifier.fit(data, labels, batch_size=100, steps=1000)
File "/usr/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/linear.py", line 446, in fit
max_steps=max_steps)
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 191, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 355, in fit
max_steps=max_steps)
File "/usr/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 733, in _train_model
max_steps=max_steps)
File "/usr/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/graph_actions.py", line 300, in _monitored_train
_, loss = super_sess.run([train_op, loss_op], feed_fn() if feed_fn else
File "/usr/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/learn_io/data_feeder.py", line 407, in _feed_dict_fn
out[i] = _access(self._y, sample)
File "/usr/local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/learn_io/data_feeder.py", line 208, in _access
return data[iloc]
IndexError: invalid index to scalar variable.
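A likely cause, judging only from the posted code: the line labels = labels[max_examples] picks out a single element instead of slicing the first max_examples labels, so labels becomes a scalar and the feeder's data[iloc] lookup fails. A minimal NumPy-only sketch of the difference, with made-up label values standing in for the MNIST labels:

```python
import numpy as np

max_examples = 10000
# Stand-in for mnist.train.labels; the values here are made up.
labels = np.arange(55000, dtype=np.int32) % 10

single = labels[max_examples]    # indexing: one element, a 0-d scalar
subset = labels[:max_examples]   # slicing: the first max_examples labels

print(np.ndim(single), subset.shape)

try:
    single[0]                    # roughly what a batch feeder would attempt
except IndexError as err:
    print("IndexError:", err)
```

If that is the cause, writing labels = labels[:max_examples] in the script should keep labels the same length as data.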
I am following your seventh episode and trying to print the prediction using TensorFlow, as shown:
print ("Predicted %d, Label: %d" % (classifier.predict(test_data[0]), test_labels[0]))
But I am getting the following error:
print ("Predicted %d, Label: %d" % (classifier.predict(test_data[0]), test_labels[0]))
TypeError: %d format: a number is required, not generator
How can I fix it?
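The error message says that predict returned a generator rather than a number, so the value has to be pulled out of the generator before it is formatted. A minimal sketch in plain Python; the predict function here is a hypothetical stand-in, not the TensorFlow API, and the class id 7 is made up:

```python
def predict(samples):
    # Hypothetical stand-in for an estimator whose predict() yields
    # one class id per input row instead of returning a list.
    for _ in samples:
        yield 7


test_labels = [7]
preds = predict([[0.0] * 784])    # a generator, not a number

first = next(preds)               # pull the first prediction out
print("Predicted %d, Label: %d" % (first, test_labels[0]))
```

With the real classifier the same idea would presumably be to wrap the call in list(...) and take element 0, or to call next(...) on the result.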
Hi,
There is a mistake in the calculation of the number of columns in the IPython notebook.
'''n_features = len(rows[0]) - 1 # number of columns''' gives the number of samples minus 1, not the number of columns.
This should be replaced with:
n_features = rows.shape[1] to get the number of columns.
The code in the example works because the numbers of rows and columns are not too far apart.
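A quick check of both expressions, assuming rows is a list of lists like the fruit data used elsewhere in these threads (the values below are illustrative). For a list of lists, len(rows) counts samples while len(rows[0]) counts the entries in one row; a .shape attribute is only available after converting to a NumPy array.

```python
import numpy as np

rows = [
    ["Green", 3, "Apple"],
    ["Yellow", 3, "Apple"],
    ["Red", 1, "Grape"],
    ["Red", 1, "Grape"],
    ["Yellow", 3, "Lemon"],
]

n_samples = len(rows)             # number of rows
n_columns = len(rows[0])          # entries in the first row, label included
n_features = len(rows[0]) - 1     # columns minus the label column

as_array = np.array(rows, dtype=object)
print(n_samples, n_columns, n_features, as_array.shape)
```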
The code works fine!
However, I am a newbie and wanted to make editing the training data a bit easier, so I created a file training_data.data containing:
Green,3,Apple
Yellow,3,Apple
Red,1,Grape
Red,1,Grape
Yellow,3,Lemon
and then import this with:
import pandas as pd
training_data = pd.read_csv('training_data.data', header=-1)
and now I get the error TypeError: 'int' object is not subscriptable.
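One possible explanation, assuming the tutorial code loops over training_data and indexes each row: iterating over a DataFrame yields its column labels (the integers 0, 1, 2 here) rather than its rows, so row[col] ends up indexing an int. A sketch of converting the frame back to a list of lists; it reads the same five lines from an in-memory string instead of the file and uses header=None, the documented way to say the file has no header row:

```python
import io

import pandas as pd

# In-memory stand-in for training_data.data.
csv_text = (
    "Green,3,Apple\n"
    "Yellow,3,Apple\n"
    "Red,1,Grape\n"
    "Red,1,Grape\n"
    "Yellow,3,Lemon\n"
)

df = pd.read_csv(io.StringIO(csv_text), header=None)

# Iterating a DataFrame yields column labels, not rows.
print(list(df))

# Convert to the list-of-lists shape the tutorial code expects.
training_data = df.values.tolist()
print(training_data[0])
```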
print ("Predicted %d, Label: %d" % (classifier.predict(test_data[0]), test_labels[0]))
When I ran the line above, the error below occurred:
InvalidArgumentError (see above for traceback): tensor_name = linear//weight; shape in shape_and_slice spec [1,10] does not match the shape stored in checkpoint: [784,10]
[[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_save/Const_0_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]
However, classifier.evaluate(test_data[0:1,:], test_labels[0:1]) works:
{'accuracy': 1.0, 'global_step': 1000, 'loss': 0.010729363}
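The shape mismatch in the error ([1,10] versus [784,10]) suggests that the 784-value vector test_data[0] was read as 784 examples with one feature each, while the slice used in the evaluate call keeps a batch dimension. That reading is an assumption based on the error text, but the difference between indexing and slicing can be checked with NumPy alone (the zero-filled array is a stand-in for the MNIST test images):

```python
import numpy as np

# Stand-in for mnist.test.images: 5 flattened 28x28 images.
test_data = np.zeros((5, 784), dtype=np.float32)

one_vector = test_data[0]      # shape (784,): the row axis is dropped
one_batch = test_data[0:1]     # shape (1, 784): a batch holding one image

print(one_vector.shape, one_batch.shape)
```

Under that assumption, passing test_data[0:1] to predict as well would give the model the same input shape that evaluate received.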
Thanks for the tutorial, it is great. I'm looking forward to the random forest video.
I ran the code and got impurity results that differ from the expected ones.
When I removed **2, it worked well.
But in the following sections I got a different best question, so I am not sure.
I think the correct choice is to remove the **2.
Thanks,
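For reference, the textbook Gini impurity is 1 minus the sum of the squared class probabilities, so the square is part of the standard definition; without it the probabilities simply sum to 1 and the result is always 0. A minimal sketch of that definition, not the notebook's exact code (the function name and sample labels are illustrative):

```python
from collections import Counter


def gini(labels):
    # Gini impurity: 1 - sum over classes of p_class ** 2.
    counts = Counter(labels)
    total = float(len(labels))
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


print(gini(["Apple", "Apple"]))     # a pure set
print(gini(["Apple", "Orange"]))    # an even two-class mix
```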
I'm getting this error when trying to upload the Jupyter notebook ep7.ipynb
I simply ran the command "docker run -it -p 8888:8888 tensorflow/tensorflow:0.10.0rc0", and the error appears when I try to upload.
This would be great for tinkering with some of the lower level code examples you have shown in your episodes. Namely the ScrappyKNN example.
I copy-pasted this into https://colab.research.google.com (Google's Jupyter notebook implementation) and added some of my own notes.
https://drive.google.com/file/d/1KFFdyYaU1rbM9G6Vx4rMEL9fIXAx4A84/view?usp=sharing
Hope it helps someone get their hands dirty with decision trees! BIG Thanks to random-forests as well!