Pidentify

Pidentify is an API designed to compute the variance between input data and training datasets.

It introduces a "p-value" component that computes a p-value for each class, which makes it possible to evaluate how far input data deviates from the training datasets. Our objective is to equip researchers with a robust tool that can improve classifier accuracy and streamline validation procedures.

Only .csv files with numerical data are currently supported. Please make sure the imported dataset is formatted correctly: the class name must be the last field of each row (see iris.data for a reference).

How to use

Command for compiling "main.cpp", "process.cpp", and "fit.cpp": g++ --std=c++11 -I alglib/src -o exe main.cpp process.cpp fit.cpp alglib.a

Execute the compiled file: ./exe

After execution, it prints the best-fit values and residuals for each sigmoid function, followed by the best fit among all functions.

Read data from a csv file (main.cpp)

"main.cpp" reads a csv file into "std::vector dataset" for later use.
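As an illustration of this step, here is a minimal sketch of reading rows shaped like iris.data (numeric fields followed by a class name). The Sample struct and the parse_csv function are hypothetical names chosen for this example; the actual layout of the dataset vector in "main.cpp" may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: numeric features plus the class name
// taken from the last column of the row.
struct Sample {
    std::vector<double> features;
    std::string label;
};

// Parse comma-separated rows of the form "f1,f2,...,fn,class_name".
std::vector<Sample> parse_csv(std::istream& in) {
    std::vector<Sample> dataset;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) continue;                                  // skip blank lines

        std::vector<std::string> fields;
        std::stringstream row(line);
        std::string field;
        while (std::getline(row, field, ',')) fields.push_back(field);
        if (fields.size() < 2) continue;  // need at least one feature and a label

        Sample s;
        s.label = fields.back();
        for (std::size_t i = 0; i + 1 < fields.size(); ++i)
            s.features.push_back(std::stod(fields[i]));  // throws on non-numeric input
        dataset.push_back(s);
    }
    return dataset;
}
```

Passing a std::ifstream opened on the csv file to parse_csv yields one Sample per non-empty row.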

Data processing (process.cpp)

"process.cpp" normalizes all features and computes the nearest neighbor distances using KNN (k = 1 here).

It then sorts all distances in ascending order and removes duplicates.
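The processing steps above can be sketched roughly as follows. This is not the code in "process.cpp": min-max scaling and Euclidean distance are assumptions made for the example (the README does not say which normalization or metric is used), and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Min-max scale each column to [0, 1] (assumed normalization).
// Constant columns are mapped to 0.
void normalize(Matrix& rows) {
    if (rows.empty()) return;
    const std::size_t dims = rows[0].size();
    for (std::size_t j = 0; j < dims; ++j) {
        double lo = rows[0][j], hi = rows[0][j];
        for (const auto& r : rows) {
            lo = std::min(lo, r[j]);
            hi = std::max(hi, r[j]);
        }
        const double range = hi - lo;
        for (auto& r : rows) r[j] = range > 0.0 ? (r[j] - lo) / range : 0.0;
    }
}

// For every point, find the Euclidean distance to its nearest other point
// (k = 1), then return the distances sorted ascending without duplicates.
std::vector<double> nearest_neighbor_distances(const Matrix& rows) {
    std::vector<double> out;
    if (rows.size() < 2) return out;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t j = 0; j < rows.size(); ++j) {
            if (i == j) continue;
            double d2 = 0.0;
            for (std::size_t k = 0; k < rows[i].size(); ++k) {
                const double diff = rows[i][k] - rows[j][k];
                d2 += diff * diff;
            }
            best = std::min(best, d2);
        }
        out.push_back(std::sqrt(best));
    }
    std::sort(out.begin(), out.end());
    out.erase(std::unique(out.begin(), out.end()), out.end());
    return out;
}
```

The brute-force search is quadratic in the number of rows, which is acceptable for small datasets such as iris.data; larger datasets would likely need a spatial index.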

Nonlinear least squares fitting (fit.cpp)

"fit.cpp" is based on the "alglib" library: https://www.alglib.net/interpolation/leastsquares.php#header4

It applies nonlinear least squares fitting to find the best-fit values for the sigmoid functions in each class.

We assume that each sigmoid function has 2 parameters (c and a, as in c(x-a)) that need to be fitted to the ECDF points. In "fit.cpp", real_1d_array c holds the initial values for c and a (c[0] stands for c, c[1] stands for a).
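For readers unfamiliar with ECDF points, the sketch below builds them from a sample of distances using the standard definition: at each distinct value x, the ECDF is the fraction of sample values less than or equal to x. The function name ecdf_points is hypothetical, and how "fit.cpp" actually constructs its points is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Return (x, F(x)) pairs, one per distinct value in the sample, where
// F(x) is the fraction of sample values <= x.
std::vector<std::pair<double, double>> ecdf_points(std::vector<double> sample) {
    std::sort(sample.begin(), sample.end());
    std::vector<std::pair<double, double>> pts;
    const double n = static_cast<double>(sample.size());
    for (std::size_t i = 0; i < sample.size(); ++i) {
        // Emit a point only at the last occurrence of each distinct value.
        if (i + 1 == sample.size() || sample[i + 1] != sample[i])
            pts.emplace_back(sample[i], static_cast<double>(i + 1) / n);
    }
    return pts;
}
```

These (x, y) pairs are the kind of data a two-parameter sigmoid in c(x-a) would then be fitted to.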

Five sigmoid functions are currently supported: the logistic function, hyperbolic tangent function, arctangent function, Gudermannian function, and a simple algebraic function.
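To make the five families concrete, here is one possible way to write them with the argument t = c(x-a). The textbook forms are used, and each is rescaled to the range (0, 1) so it can be compared with ECDF values; that rescaling and the function names are assumptions of this sketch, not taken from "fit.cpp".

```cpp
#include <cassert>
#include <cmath>

const double kPi = std::acos(-1.0);

// Shared argument t = c * (x - a): c controls steepness, a the midpoint.
inline double sigmoid_arg(double c, double a, double x) { return c * (x - a); }

double sigmoid_logistic(double c, double a, double x) {
    return 1.0 / (1.0 + std::exp(-sigmoid_arg(c, a, x)));
}

double sigmoid_tanh(double c, double a, double x) {
    return 0.5 * (std::tanh(sigmoid_arg(c, a, x)) + 1.0);
}

double sigmoid_arctan(double c, double a, double x) {
    return std::atan(sigmoid_arg(c, a, x)) / kPi + 0.5;
}

// Gudermannian function: gd(t) = 2 * atan(tanh(t / 2)), range (-pi/2, pi/2).
double sigmoid_gudermannian(double c, double a, double x) {
    const double t = sigmoid_arg(c, a, x);
    return 2.0 * std::atan(std::tanh(0.5 * t)) / kPi + 0.5;
}

// Algebraic sigmoid: t / sqrt(1 + t^2), range (-1, 1).
double sigmoid_algebraic(double c, double a, double x) {
    const double t = sigmoid_arg(c, a, x);
    return 0.5 * (t / std::sqrt(1.0 + t * t) + 1.0);
}
```

With this scaling every function equals 0.5 at x = a and increases with x when c is positive, so a acts as the midpoint and c as the slope parameter.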

"function name_f" (e.g. logistic_f) is the original function. "function name_fd" (e.g. logistic_fd) is the derivative of the corresponding function with respect to a and c.

rep.terminationtype: the status code returned by the fit

rep.wrmserror: the weighted RMS error (residual with weights)

You can uncomment the code for the fitting procedure to show the whole fitting process for each sigmoid function.
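As a worked example of such a function and derivative pair, the sketch below uses the logistic function f = 1 / (1 + exp(-c(x-a))), for which df/dc = f(1-f)(x-a) and df/da = -c f(1-f). It takes plain doubles rather than real_1d_array, and the names logistic_value and logistic_grad are hypothetical, so it illustrates the math rather than the signatures used in "fit.cpp".

```cpp
#include <cassert>
#include <cmath>

// Logistic sigmoid with slope c and midpoint a.
double logistic_value(double c, double a, double x) {
    return 1.0 / (1.0 + std::exp(-c * (x - a)));
}

// Partial derivatives of the logistic sigmoid with respect to c and a.
void logistic_grad(double c, double a, double x, double& dc, double& da) {
    const double f = logistic_value(c, a, x);
    const double s = f * (1.0 - f);  // derivative of the logistic in its argument
    dc = s * (x - a);                // chain rule: d[c(x-a)]/dc = x - a
    da = -s * c;                     // chain rule: d[c(x-a)]/da = -c
}
```

A quick sanity check for any such pair is to compare the analytic gradient with a central finite difference of the function value at a few points.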

Contributors

yiqing-gu, xiaoxiazhang00
