justmarkham / scikit-learn-videos
Jupyter notebooks from the scikit-learn video series
Home Page: https://courses.dataschool.io/introduction-to-machine-learning-with-scikit-learn
Running this:
y_test.value_counts()
I get this error:
AttributeError                            Traceback (most recent call last)
      1 # examine the class distribution of the testing set (using a Pandas Series method)
----> 2 y_test.value_counts()
AttributeError: 'numpy.ndarray' object has no attribute 'value_counts'
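This error occurs when y_test is a NumPy array rather than a pandas Series (for example, when the labels passed to train_test_split were an array), because value_counts is a Series method. A minimal sketch of a workaround, wrapping the array in a Series (the array values here are made up for illustration):

```python
import numpy as np
import pandas as pd

# Stand-in for y_test; in the notebook it comes from train_test_split
y_test = np.array([0, 1, 1, 0, 1])

# Wrap the ndarray in a pandas Series to get access to value_counts()
counts = pd.Series(y_test).value_counts()
print(counts)
```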
Hi! I just noticed that all the links to IPython notebooks return a 400 error when clicked.
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data'
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
pima = pd.read_csv(url, header=None, names=col_names)
Temporary solution (N.B.: the comment='#' argument to read_csv is important):
url = 'https://gist.githubusercontent.com/ktisha/c21e73a1bd1700294ef790c56c8aec1f/raw/819b69b5736821ccee93d05b51de0510bea00294/pima-indians-diabetes.csv'
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
pima = pd.read_csv(url, header=None, names=col_names, comment='#')
Hi!
I wanted to start off by saying that your tutorials and videos are really great! so clear and simple!
I've been working on a binary classification problem for my school with scikit-learn, and I have been scratching my head over how it displays the confusion matrix. For instance, I get this output:
[ [30 5]
[2 42] ]
I noticed from the classification report that scikit-learn outputs the negative class first by default. This leads me to understand that the first row is the negative class and the second is the positive class. However, I don't understand how to interpret what each number stands for in terms of TP, FP, TN, and FN.
TN (30) FN (5)
FP (2)  TP (42)
Is this a correct representation of the matrix above?
Thanks a bunch!
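For reference, scikit-learn's confusion_matrix puts true labels on the rows and predicted labels on the columns, so for binary labels [0, 1] the layout is [[TN, FP], [FN, TP]], and ravel() unpacks the four counts. A minimal sketch with made-up labels:

```python
from sklearn.metrics import confusion_matrix

# Toy true/predicted labels, made up for illustration
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)
```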
When running these notebooks on Jupyter 3.2.x or 4.2.x, I get the following error:
Failed to start the kernel
The 'None' kernel is not available. Please pick another suitable kernel instead, or install that kernel.
Note that the kernel runs fine when I create a new notebook, so the problem seems to be an incompatibility between these notebooks and Jupyter.
Here is my local Environment:
The version of the notebook server is 4.2.1 and is running on:
Python 3.5.1 |Anaconda 4.1.0 (64-bit)| (default, Jun 15 2016, 15:32:45)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
Current Kernel Information:
unable to contact kernel
The version of the notebook server is 3.2.0-8b0eef4 and is running on:
Python 2.7.10 |Anaconda 2.3.0 (64-bit)| (default, May 28 2015, 17:02:03)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)]
Can someone let me know what the issue is?
Cheers

When I tried installing with Anaconda and using it to detect cars in a video with a Haar Cascade Classifier, I got this warning:
/Users/mohan/anaconda3/lib/python3.5/site-packages/skvideo/__init__.py:356: UserWarning: avconv/avprobe not found in path:
  warnings.warn("avconv/avprobe not found in path: " + str(path), UserWarning)
Hi, I used KNN for my research but I don't know how to display the accuracy of the results.
This is my code:
# In[38]:
from sklearn import neighbors, metrics
from sklearn.neighbors import NearestNeighbors
import numpy as np
import pandas as pd
import sys
import json
import math
data = pd.read_excel('dataset.xlsx')
data = data.to_numpy()  # as_matrix() is deprecated and removed in newer pandas
# In[40]:
knn = neighbors.KNeighborsClassifier(n_neighbors=5)
# In[41]:
X = data[:, :-3]
Y = data[:, -1:]
Y = np.zeros(len(Y))
for i in range(len(Y)):
    if data[i, 5] == 0:
        Y[i] = 0
    elif data[i, 5] == 1:
        Y[i] = 1
    elif data[i, 5] == 2:
        Y[i] = 2
    elif data[i, 5] == 3:
        Y[i] = 3
    elif data[i, 5] == 4:
        Y[i] = 4
# In[42]:
knn.fit(X, Y)
# In[49]:
result = knn.predict([[220.4, 6.39,1855]])
print(result)
# result = knn.predict(X)
# print(metrics.accuracy_score(Y[2000], result))
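To report accuracy, one common approach is to hold out a test set, fit on the remainder, and score the held-out predictions with accuracy_score. A minimal sketch using synthetic data in place of dataset.xlsx (which isn't available here), so the shapes and class labels are assumptions:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
X = rng.rand(200, 3) * [300, 10, 2000]   # synthetic features, stand-in for the real data
y = rng.randint(0, 5, size=200)          # synthetic labels 0-4

# Hold out 25% of the rows, fit on the rest, and score on the held-out data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
acc = accuracy_score(y_test, knn.predict(X_test))
print(acc)
```

Note that accuracy_score compares two arrays of the same length (true labels vs. predicted labels), which is why the commented-out accuracy_score(Y[2000], result) above would not work.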
Hi.
I have a question about scikit-learn-videos/07_cross_validation.ipynb. The classification accuracy usually has several digits after the decimal, e.g. 0.966666666667. If I multiply this value by the total number of observations, i.e. 25, I get 24.1666666667. What does this mean? That 24.1666666667 observations were classified correctly? Shouldn't it be a whole number, such as 24?
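One way to see why the product need not be a whole number: in k-fold cross-validation the reported score is the mean of k per-fold accuracies, and while each fold's accuracy times that fold's size is an integer count of correct predictions, the mean of those ratios generally is not. A minimal sketch with illustrative fold scores (assumed values, not taken from the notebook):

```python
# Suppose 10-fold cross-validation produced these per-fold accuracies
# (illustrative values; each fold holds 15 observations here)
fold_scores = [15/15, 14/15, 15/15, 15/15, 13/15, 14/15, 14/15, 15/15, 15/15, 15/15]

mean_score = sum(fold_scores) / len(fold_scores)
# Within each fold, accuracy * 15 is a whole number of correct predictions,
# but the mean of the ten ratios need not multiply out to an integer.
print(mean_score)
print(mean_score * 15)  # not necessarily a whole number
```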
Hello!
First of all, thank you so much for this series and all the resources you mentioned with it. I started out with machine learning a few months ago, and after reading and searching online I was still not able to grasp the core of machine learning. Your videos made it really simple and easy to understand, and most of my confusion cleared up! I hope you keep making these videos.
Anyway, can you please make or refer me to a video tutorial or a good resource on Self-Organizing Maps in scikit-learn? It would be a great help!
Thank you again!
The line
print('{:^9} {} {:^25}'.format(iteration, data[0], data[1]))
raises a TypeError. Converting the last argument to a string:
print('{:^9} {} {:^25}'.format(iteration, data[0], str(data[1])))
solves the problem.