Comments (8)
OK, I've decided to try coding this algorithm. I'll need some time because I have other work, but I will finish it as soon as possible.
Nice to meet you, @ddbourgin.
from numpy-ml.
Hi @daidai21 - thanks for your interest!
There actually is a k-means model as part of the KNN module, though I haven't explicitly called it that in the READMEs. Specifically, the `KNN` object takes an argument `classifier`, which converts between k-nearest neighbors regression (`classifier=False`) and k-means classification/clustering (`classifier=True`).
Feel free to propose other models you'd be interested in working on, though!
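
As a rough illustration of how a single `classifier` flag can switch a nearest-neighbor model between regression and classification, here is a minimal standard-library sketch. It is not the numpy-ml `KNN` implementation: `knn_predict` and its parameters are hypothetical names chosen for this example, and it assumes Euclidean distance with unweighted neighbors.

```python
import math
from collections import Counter


def knn_predict(X_train, y_train, x, k=3, classifier=True):
    """Predict the target of one query point x from its k nearest neighbors.

    Hypothetical helper for illustration only; not the numpy-ml KNN API.
    """
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    neighbors = [y_train[i] for i in order[:k]]
    if classifier:
        # Classification: majority vote over the neighbor labels.
        return Counter(neighbors).most_common(1)[0][0]
    # Regression: mean of the neighbor targets.
    return sum(neighbors) / len(neighbors)


# Made-up toy data: three points near the origin, two far away.
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]]
y_train = [0, 0, 0, 1, 1]

label = knn_predict(X_train, y_train, [0.2, 0.2], k=3, classifier=True)
value = knn_predict(X_train, y_train, [4.0, 4.0], k=3, classifier=False)
```

With `classifier=True` the result is always one of the training labels, while `classifier=False` can return any value between the neighbor targets.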
Sorry to bother you, but is there no SVM yet? @ddbourgin
@daidai21 - No need to apologize! An SVM implementation would be awesome -- it's been on my TODO list for ages :)
The crux, I suspect, will be implementing the SMO algorithm properly. If you decide to do it, I wouldn't worry too much about being efficient - for this repo, the focus is more on making everything as clean/clear as possible rather than on being clever.
Also, if you end up referencing other implementations when writing your code, please make sure to cite them in the docstrings and PR. It's important that any code you submit is your own work.
Finally - thanks! Let me know if you have any questions as you go along :)
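
For readers unfamiliar with SMO, the following is a minimal sketch of a simplified SMO-style solver, written for clarity rather than speed. It is not the implementation discussed in this thread. It assumes a linear kernel and labels in {-1, +1}, and it picks the second multiplier by scanning indices in order instead of using Platt's selection heuristics. The names `linear_kernel`, `smo_train`, and `smo_predict` are hypothetical.

```python
def linear_kernel(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def smo_train(X, y, C=1.0, tol=1e-3, max_passes=5):
    """Fit a soft-margin linear SVM; returns (alphas, b). Labels must be -1 or +1."""
    n = len(X)
    K = [[linear_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    alphas, b = [0.0] * n, 0.0

    def error(i):
        # Decision value minus label for training sample i under the current model.
        return sum(alphas[k] * y[k] * K[k][i] for k in range(n)) + b - y[i]

    passes = 0
    while passes < max_passes:  # stop after max_passes sweeps without any update
        changed = False
        for i in range(n):
            E_i = error(i)
            r = y[i] * E_i
            # Skip samples that already satisfy the KKT conditions within tol.
            if not ((r < -tol and alphas[i] < C) or (r > tol and alphas[i] > 0)):
                continue
            for j in range(n):  # assumption: scan partners in index order
                if j == i:
                    continue
                E_j = error(j)
                a_i, a_j = alphas[i], alphas[j]
                # Box bounds that keep sum(alpha * y) unchanged.
                if y[i] != y[j]:
                    low, high = max(0.0, a_j - a_i), min(C, C + a_j - a_i)
                else:
                    low, high = max(0.0, a_i + a_j - C), min(C, a_i + a_j)
                eta = 2 * K[i][j] - K[i][i] - K[j][j]
                if low == high or eta >= 0:
                    continue
                new_j = min(high, max(low, a_j - y[j] * (E_i - E_j) / eta))
                if abs(new_j - a_j) < 1e-5:
                    continue  # step too small to count as progress
                new_i = a_i + y[i] * y[j] * (a_j - new_j)
                b1 = b - E_i - y[i] * (new_i - a_i) * K[i][i] - y[j] * (new_j - a_j) * K[i][j]
                b2 = b - E_j - y[i] * (new_i - a_i) * K[i][j] - y[j] * (new_j - a_j) * K[j][j]
                alphas[i], alphas[j] = new_i, new_j
                if 0 < new_i < C:
                    b = b1
                elif 0 < new_j < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2
                changed = True
                break
        passes = 0 if changed else passes + 1
    return alphas, b


def smo_predict(X, y, alphas, b, x):
    score = sum(a * yi * linear_kernel(xi, x) for a, yi, xi in zip(alphas, y, X)) + b
    return 1 if score >= 0 else -1


# Made-up, linearly separable toy data.
X = [[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]]
y = [1, 1, -1, -1]
alphas, b = smo_train(X, y)
preds = [smo_predict(X, y, alphas, b, x) for x in X]
```

On this toy set only the two points closest to the decision boundary end up with nonzero multipliers. A fuller implementation would typically add kernel choices, an error cache, and better working-set selection.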
Sure, take your time, and let me know if you have any questions!
Hi, David
It took me a while to finish, but not all of the tests pass. My model and sklearn's model reach the same accuracy in about 78% of the trials, and I'm not sure what to do next.
Sometimes my model does better, sometimes sklearn's does.
I think this result is related to the distribution of the randomly generated data, and that my code is OK. What do you think?
This is the test code:
    import random
    import warnings

    import numpy as np
    from sklearn.datasets import make_blobs  # importable from sklearn.datasets directly
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    from SVM import SVM  # load my own model

    warnings.filterwarnings("ignore")


    def test_SVM():
        i = 0
        np.random.seed(12345)
        random.seed(12345)  # the hyperparameters below come from `random`, so seed it too
        while True:
            i += 1  # use a different random_state for each trial
            # generate the dataset
            X, Y = make_blobs(
                n_samples=np.random.randint(2, 100),
                n_features=np.random.randint(2, 100),
                centers=2,
                random_state=i,
            )
            X, X_test, Y, Y_test = train_test_split(X, Y, test_size=0.3, random_state=i)
            if 0 not in Y or 1 not in Y:  # skip splits whose training data has only one class
                continue

            # generate the hyperparameters
            C = random.uniform(0.1, 0.9)
            max_iter = random.randint(50, 500)  # SVC expects an integer here
            kernel = str(np.random.choice(["linear", "rbf"]))
            tol = random.uniform(0.000001, 0.1)

            # fit and predict with both models
            clf1 = SVC(C=C, max_iter=max_iter, kernel=kernel, tol=tol)
            clf1.fit(X, Y)
            pred1 = clf1.predict(X_test)
            clf2 = SVM(C=C, max_iter=max_iter, kernel=kernel, tol=tol)
            clf2.fit(X, Y)
            pred2 = clf2.predict(X_test)

            # judge: compare the two accuracies
            acc1 = accuracy_score(Y_test, pred1)
            acc2 = accuracy_score(Y_test, pred2)
            # assert acc1 == acc2, "ERROR {0} {1}".format(acc1, acc2)
            if acc1 == acc2:
                print("PASSED")
            else:
                print("ERROR", acc1, acc2)


    if __name__ == "__main__":
        test_SVM()
This is the output of the test code:
PASSED
PASSED
PASSED
PASSED
ERROR 0.3333333333333333 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.0
PASSED
ERROR 0.3125 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.7692307692307693
PASSED
PASSED
ERROR 1.0 0.5
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.9655172413793104 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.5 1.0
PASSED
ERROR 1.0 0.5384615384615384
PASSED
ERROR 1.0 0.9090909090909091
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.3333333333333333
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.75
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.9
ERROR 1.0 0.6666666666666666
PASSED
ERROR 0.3333333333333333 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.4444444444444444 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.14285714285714285 1.0
ERROR 1.0 0.8
PASSED
PASSED
ERROR 1.0 0.9583333333333334
PASSED
ERROR 1.0 0.3333333333333333
PASSED
ERROR 1.0 0.9047619047619048
ERROR 0.0 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.2 1.0
PASSED
PASSED
ERROR 1.0 0.6
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.9090909090909091
ERROR 0.0 1.0
ERROR 0.3333333333333333 1.0
ERROR 1.0 0.6
PASSED
PASSED
PASSED
PASSED
ERROR 0.5 1.0
ERROR 1.0 0.8
ERROR 1.0 0.9523809523809523
PASSED
ERROR 0.32 1.0
PASSED
PASSED
ERROR 1.0 0.8333333333333334
ERROR 1.0 0.9259259259259259
ERROR 1.0 0.96
PASSED
PASSED
PASSED
ERROR 1.0 0.9259259259259259
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.5
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.4
PASSED
ERROR 0.4 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.6666666666666666
PASSED
PASSED
ERROR 0.4166666666666667 1.0
ERROR 1.0 0.9166666666666666
PASSED
ERROR 1.0 0.6666666666666666
PASSED
PASSED
ERROR 1.0 0.6
PASSED
PASSED
ERROR 0.3333333333333333 1.0
PASSED
ERROR 0.4 1.0
ERROR 0.8235294117647058 1.0
PASSED
PASSED
PASSED
PASSED
ERROR 0.5555555555555556 1.0
PASSED
PASSED
ERROR 0.0 1.0
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.5
PASSED
PASSED
Hi @daidai21 - thank you for working on this! It's not clear to me why random data generation would result in failed tests, since both models receive the same input data and targets. Perhaps I'm missing something?
Anyway, feel free to submit a PR and we can try to work through the code together to identify what's going on. It's difficult to know right now why certain tests aren't passing, since I don't know what the model code looks like.
Finally, to help track down the cause of the failed tests, I'd recommend directly comparing `pred1` and `pred2` to ensure that individual data points are being categorized in the same way between the two models. This will help you to better identify why some of the tests are failing :)
Thanks again!
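
One way to do the element-wise comparison suggested above is sketched below. The helper name `compare_predictions` is hypothetical, and the two lists are made-up stand-ins for the outputs of `clf1.predict(X_test)` and `clf2.predict(X_test)`.

```python
def compare_predictions(pred1, pred2):
    """Return the indices where two prediction sequences differ, plus the agreement rate."""
    if len(pred1) != len(pred2):
        raise ValueError("prediction sequences must have the same length")
    mismatches = [i for i, (a, b) in enumerate(zip(pred1, pred2)) if a != b]
    agreement = 1.0 - len(mismatches) / len(pred1)
    return mismatches, agreement


# Made-up predictions standing in for the two models' outputs.
pred1 = [0, 1, 1, 0, 1]
pred2 = [0, 1, 0, 0, 0]
mismatches, agreement = compare_predictions(pred1, pred2)
print("disagreeing indices:", mismatches)
print("agreement rate:", agreement)
```

Two models can have identical accuracy while still disagreeing on individual points, so the mismatch indices carry more information than the accuracy check alone. A log row such as `ERROR 1.0 0.0` would show up here as a mismatch at every index.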
Closing this, as the code you are talking about is not your own work.
See #37