Comments (8)

daidai21 commented on May 16, 2024

OK, I've decided to try coding this algorithm. I'll need some time because I have other work, but I will finish it as soon as possible.

Nice to meet you, @ddbourgin.

from numpy-ml.

ddbourgin commented on May 16, 2024

Hi @daidai21 - thanks for your interest!

There actually is a k-means model as part of the KNN module, though I haven't explicitly called it that in the READMEs. Specifically, the KNN object takes an argument classifier, which converts between k-nearest neighbors regression (classifier=False) and k-means classification/clustering (classifier=True).
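To illustrate how a single `classifier` flag can switch a nearest-neighbor model between the two behaviors described above, here is a minimal sketch in plain Python. It is not the numpy-ml `KNN` code; the function name `knn_predict` and its parameters are hypothetical. With `classifier=False` it averages the targets of the k nearest training points (regression), and with `classifier=True` it takes a majority vote over their labels. Note that majority voting over neighbors is k-nearest-neighbors classification, a supervised method, which is distinct from k-means clustering.

```python
import math
from collections import Counter


def knn_predict(X_train, y_train, x, k=3, classifier=True):
    """Predict the target for a single query point `x` (hypothetical helper)."""
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    neighbors = [y_train[i] for i in order[:k]]
    if classifier:
        # Classification: majority vote over the neighbors' labels.
        return Counter(neighbors).most_common(1)[0][0]
    # Regression: mean of the neighbors' targets.
    return sum(neighbors) / len(neighbors)


X_train = [[0.0], [1.0], [2.0], [10.0], [11.0]]
y_train = [0, 0, 1, 1, 1]
label = knn_predict(X_train, y_train, [0.5], k=3, classifier=True)
value = knn_predict(X_train, y_train, [0.5], k=3, classifier=False)
```

Here the three nearest neighbors of `[0.5]` carry the labels 0, 0 and 1, so the vote returns 0 while the regression mode returns their mean, 1/3.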

Feel free to propose other models you'd be interested in working on, though!

daidai21 commented on May 16, 2024

Sorry to bother you, but is there no SVM implementation? @ddbourgin

ddbourgin commented on May 16, 2024

@daidai21 - No need to apologize! An SVM implementation would be awesome -- it's been on my TODO list for ages :)

The crux, I suspect, will be implementing the SMO algorithm properly. If you decide to do it, I wouldn't worry too much about being efficient - for this repo, the focus is more on making everything as clean/clear as possible rather than on being clever.
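For readers unfamiliar with SMO, the following is a minimal sketch of the simplified variant described in Stanford's CS229 course notes: the second multiplier is picked at random and only a linear kernel is used. It is not Platt's full algorithm with its selection heuristics, and it is not code from numpy-ml or from this thread. The names `simplified_smo` and `predict` and the parameters `max_passes` and `max_sweeps` are illustrative assumptions. Labels are assumed to be -1 and +1, and the code is plain Python that favors clarity over speed.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=5, max_sweeps=1000, seed=0):
    """Train a linear soft-margin SVM; labels in y must be -1 or +1."""
    rng = random.Random(seed)
    n = len(X)
    K = [[dot(X[i], X[j]) for j in range(n)] for i in range(n)]  # linear kernel
    alpha = [0.0] * n
    b = 0.0

    def error(i):
        # Decision value minus the true label for training point i.
        return sum(alpha[k] * y[k] * K[k][i] for k in range(n)) + b - y[i]

    passes = sweeps = 0
    while passes < max_passes and sweeps < max_sweeps:
        sweeps += 1
        changed = 0
        for i in range(n):
            E_i = error(i)
            # Only optimize multipliers that violate the KKT conditions.
            violates = (y[i] * E_i < -tol and alpha[i] < C) or (
                y[i] * E_i > tol and alpha[i] > 0
            )
            if not violates:
                continue
            j = rng.choice([k for k in range(n) if k != i])
            E_j = error(j)
            a_i, a_j = alpha[i], alpha[j]
            # Box bounds that keep both multipliers inside [0, C].
            if y[i] != y[j]:
                lo, hi = max(0.0, a_j - a_i), min(C, C + a_j - a_i)
            else:
                lo, hi = max(0.0, a_i + a_j - C), min(C, a_i + a_j)
            eta = 2 * K[i][j] - K[i][i] - K[j][j]
            if lo == hi or eta >= 0:
                continue
            # Analytic update for alpha_j, clipped to [lo, hi].
            new_j = min(hi, max(lo, a_j - y[j] * (E_i - E_j) / eta))
            if abs(new_j - a_j) < 1e-5:
                continue
            new_i = a_i + y[i] * y[j] * (a_j - new_j)
            b1 = b - E_i - y[i] * (new_i - a_i) * K[i][i] - y[j] * (new_j - a_j) * K[i][j]
            b2 = b - E_j - y[i] * (new_i - a_i) * K[i][j] - y[j] * (new_j - a_j) * K[j][j]
            if 0 < new_i < C:
                b = b1
            elif 0 < new_j < C:
                b = b2
            else:
                b = (b1 + b2) / 2
            alpha[i], alpha[j] = new_i, new_j
            changed += 1
        # Stop after max_passes consecutive sweeps without any update.
        passes = passes + 1 if changed == 0 else 0

    w = [sum(alpha[k] * y[k] * X[k][d] for k in range(n)) for d in range(len(X[0]))]
    return w, b


def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1


X = [[2, 2], [3, 3], [2, 3], [-2, -2], [-3, -3], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]
w, b = simplified_smo(X, y)
train_pred = [predict(w, b, x) for x in X]
```

The random choice of the second multiplier keeps the code short but gives no convergence guarantee, which is why the sketch also caps the total number of sweeps.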

Also, if you end up referencing other implementations when writing your code, please make sure to cite them in the docstrings and PR. It's important that any code you submit is your own work.

Finally - thanks! Let me know if you have any questions as you go along :)

ddbourgin commented on May 16, 2024

Sure, take your time, and let me know if you have any questions!

daidai21 commented on May 16, 2024

Hi, David

I took the time to finish it, but not all of the tests pass. My model and sklearn's model reach the same accuracy in about 78% of the trials. I'm not sure what to do now.

Sometimes my model does better, and sometimes sklearn's does.

I think this result is related to the distribution of the randomly generated data. I think my code is OK. What do you think?

This is the test code:

import warnings
warnings.filterwarnings('ignore')
import numpy as np
import random

# load my own model (the SVM module itself is not shown in this thread)
from SVM import SVM

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_blobs  # samples_generator was removed in newer scikit-learn
from sklearn.model_selection import train_test_split


def test_SVM():
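    np.random.seed(12345)
    random.seed(12345)  # the hyperparameters below come from `random`, so seed it too
    for i in range(1, 301):  # bounded so the test terminates (trial count is arbitrary)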
        X, Y = make_blobs(  # generate dataset
            n_samples=np.random.randint(2, 100), 
            n_features=np.random.randint(2, 100),
            centers=2, random_state=i, 
        )
        X, X_test, Y, Y_test = train_test_split(X, Y, test_size=0.3, random_state=i)
        if 0 not in Y or 1 not in Y:  # ignore split error(train/test data only 1 class)
            continue
        # generate param
        C = random.uniform(0.1, 0.9)
        max_iter = random.randint(50, 500)  # SVC expects an integer iteration cap
        kernel = np.random.choice(["linear", "rbf"])
        tol = random.uniform(0.000001, 0.1)
        # fit and predict
        clf1 = SVC(C=C, max_iter=max_iter, kernel=kernel, tol=tol)
        clf1.fit(X, Y)
        pred1 = clf1.predict(X_test)
        clf2 = SVM(C=C, max_iter=max_iter, kernel=kernel, tol=tol)
        clf2.fit(X, Y)
        pred2 = clf2.predict(X_test)
        # judge
        # err_msg = "ERROR {0} {1}".format(accuracy_score(Y_test, pred1), accuracy_score(Y_test, pred2))
        # assert accuracy_score(Y_test, pred1) == accuracy_score(Y_test, pred2), err_msg
        # print("PASSED")
        if accuracy_score(Y_test, pred1) == accuracy_score(Y_test, pred2):
            print("PASSED")
        else:
            print("ERROR", accuracy_score(Y_test, pred1), accuracy_score(Y_test, pred2))


if __name__ == "__main__":
    test_SVM()

This is the output of the test code:

PASSED
PASSED
PASSED
PASSED
ERROR 0.3333333333333333 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.0
PASSED
ERROR 0.3125 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.7692307692307693
PASSED
PASSED
ERROR 1.0 0.5
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.9655172413793104 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.5 1.0
PASSED
ERROR 1.0 0.5384615384615384
PASSED
ERROR 1.0 0.9090909090909091
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.3333333333333333
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.75
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.9
ERROR 1.0 0.6666666666666666
PASSED
ERROR 0.3333333333333333 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.4444444444444444 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.14285714285714285 1.0
ERROR 1.0 0.8
PASSED
PASSED
ERROR 1.0 0.9583333333333334
PASSED
ERROR 1.0 0.3333333333333333
PASSED
ERROR 1.0 0.9047619047619048
ERROR 0.0 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 0.2 1.0
PASSED
PASSED
ERROR 1.0 0.6
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.9090909090909091
ERROR 0.0 1.0
ERROR 0.3333333333333333 1.0
ERROR 1.0 0.6
PASSED
PASSED
PASSED
PASSED
ERROR 0.5 1.0
ERROR 1.0 0.8
ERROR 1.0 0.9523809523809523
PASSED
ERROR 0.32 1.0
PASSED
PASSED
ERROR 1.0 0.8333333333333334
ERROR 1.0 0.9259259259259259
ERROR 1.0 0.96
PASSED
PASSED
PASSED
ERROR 1.0 0.9259259259259259
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.5
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.4
PASSED
ERROR 0.4 1.0
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.6666666666666666
PASSED
PASSED
ERROR 0.4166666666666667 1.0
ERROR 1.0 0.9166666666666666
PASSED
ERROR 1.0 0.6666666666666666
PASSED
PASSED
ERROR 1.0 0.6
PASSED
PASSED
ERROR 0.3333333333333333 1.0
PASSED
ERROR 0.4 1.0
ERROR 0.8235294117647058 1.0
PASSED
PASSED
PASSED
PASSED
ERROR 0.5555555555555556 1.0
PASSED
PASSED
ERROR 0.0 1.0
PASSED
PASSED
PASSED
PASSED
ERROR 1.0 0.5
PASSED
PASSED

ddbourgin commented on May 16, 2024

Hi @daidai21 - thank you for working on this! It's not clear to me why random data generation would result in failed tests, since both models receive the same input data and targets. Perhaps I'm missing something?

Anyway, feel free to submit a PR and we can try to work through the code together to identify what's going on. It's difficult to know right now why certain tests aren't passing, since I don't know what the model code looks like.

Finally, to help track down the cause of the failed tests, I'd recommend directly comparing pred1 and pred2 to ensure that individual data points are being categorized in the same way between the two models. This will help you to better identify why some of the tests are failing :)
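A minimal sketch of the element-wise comparison suggested above, assuming `pred1` and `pred2` are equal-length sequences of class labels. The helper name `compare_predictions` is hypothetical and not part of scikit-learn or numpy-ml. It reports the fraction of test points on which the two models agree and the indices where they differ, which is more informative than comparing two accuracy scores.

```python
def compare_predictions(pred_a, pred_b):
    """Return (agreement fraction, indices where the predictions differ)."""
    if len(pred_a) != len(pred_b) or len(pred_a) == 0:
        raise ValueError("predictions must be non-empty and of equal length")
    mismatches = [i for i, (a, b) in enumerate(zip(pred_a, pred_b)) if a != b]
    agreement = (len(pred_a) - len(mismatches)) / len(pred_a)
    return agreement, mismatches


# Illustrative labels only, not results from the thread.
reference = [0, 1, 1, 0, 1]
candidate = [0, 1, 0, 0, 1]
agreement, mismatches = compare_predictions(reference, candidate)
```

In this toy case the two lists disagree only at index 2. If the agreement for a real trial is close to zero, that might point at a label-encoding difference between the models (for example {-1, 1} versus {0, 1}) rather than at the decision boundary itself, though that is only a guess without seeing the model code.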

Thanks again!

ddbourgin commented on May 16, 2024

Closing this, as the code you are talking about is not your own work.

See #37
