
captcha-recognition-based-on-dsp-using-machine-learning-techniques's Introduction

###Purpose

This project is for the ELEN 4810E Digital Signal Processing class at Columbia University.

###Environment

  • Operating system: Mac OS X El Capitan
  • Language: Python
  • Lib: PIL, requests, sklearn, numpy

###Program

####scrapyPicture.py

####binary.py

  • This program binarizes pictures.
  • It contains two classes: BinarizingImage converts images to binary (black-and-white) form, and savingImage saves the split letters into the folder /splitPhotos.

#####class BinarizingImage:

######functions:

  • loadPicture(self, _filePath, _threshold): Load the picture at _filePath; _threshold is the binarization threshold.
  • binarizing(self): Binarize the picture using the threshold.
  • depoint(self): Suppress noise using connected-component labeling.
  • equidistanceSegment(self): Split the image into segments of equal width.
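The three processing steps above can be sketched with NumPy as follows. This is a minimal illustration and not the project's actual code: the function names `binarize`, `depoint` and `equidistance_segment`, the `min_size` parameter, the use of 4-connectivity, and the convention that dark pixels are foreground are all assumptions.

```python
from collections import deque

import numpy as np


def binarize(gray, threshold):
    """Return a 0/1 array: 1 where the pixel is darker than the threshold (assumed ink)."""
    return (np.asarray(gray) < threshold).astype(np.uint8)


def depoint(binary, min_size):
    """Erase 4-connected foreground components with fewer than min_size pixels."""
    cleaned = binary.copy()
    seen = np.zeros(binary.shape, dtype=bool)
    rows, cols = binary.shape
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] == 0 or seen[r, c]:
                continue
            # Breadth-first search collects one connected component.
            component = [(r, c)]
            queue = deque(component)
            seen[r, c] = True
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        component.append((ny, nx))
                        queue.append((ny, nx))
            # Components that are too small are treated as noise.
            if len(component) < min_size:
                for y, x in component:
                    cleaned[y, x] = 0
    return cleaned


def equidistance_segment(binary, n_chars):
    """Split the image into n_chars vertical strips of (nearly) equal width."""
    return np.array_split(binary, n_chars, axis=1)


# Tiny synthetic grayscale image: two dark "letters" and one noise pixel.
gray = np.full((6, 8), 255, dtype=np.uint8)
gray[1:5, 0:2] = 0   # first letter
gray[1:5, 4:6] = 30  # second letter
gray[0, 7] = 10      # isolated noise pixel

binary = binarize(gray, threshold=128)
clean = depoint(binary, min_size=3)
pieces = equidistance_segment(clean, n_chars=2)
```

Equal-width splitting only works when every captcha has the same number of evenly spaced characters, which is what this sketch assumes.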

#####class savingImage:

  • Saves pictures as *.bmp; takes a list of Image objects as input.

######functions:

  • saveImage(self, images): Save the images into the folder '/splitPhotos'.

####createTrainingDataset.py

  • This program sorts images into different folders by label.
  • The result must then be checked manually, because the OCR accuracy is only about 80%.

####Extraction.py

  • Writes the training dataset to traindataX.csv and traindatay.csv.

#####class extractFeatures:

######functions:

  • getBinaryPixel(self, _filePath): Get a binary pixel array from a *.bmp file.
  • getFileNames(self, _dirs): Scan all the files in the given directories.
  • writeFile(self): Write the training dataset.
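As a rough sketch of the dataset-writing step, each letter image can be flattened into one row of pixel values, with its label written to a parallel file. The helper `write_dataset` and its `(label, pixels)` input format are hypothetical; loading the *.bmp files is omitted, and the letters are assumed to be already available as 0/1 arrays.

```python
import csv
import os
import tempfile

import numpy as np


def write_dataset(samples, out_dir):
    """Write flattened pixel rows to traindataX.csv and labels to traindatay.csv."""
    x_path = os.path.join(out_dir, "traindataX.csv")
    y_path = os.path.join(out_dir, "traindatay.csv")
    with open(x_path, "w", newline="") as fx, open(y_path, "w", newline="") as fy:
        x_writer = csv.writer(fx)
        y_writer = csv.writer(fy)
        for label, pixels in samples:
            # One row per letter: the 2-D image flattened in row-major order.
            x_writer.writerow(np.asarray(pixels).ravel().tolist())
            y_writer.writerow([label])
    return x_path, y_path


# Two toy 2x2 "letters" standing in for the segmented images.
samples = [
    ("a", np.array([[0, 1], [1, 0]])),
    ("b", np.array([[1, 1], [0, 0]])),
]
out_dir = tempfile.mkdtemp()
x_path, y_path = write_dataset(samples, out_dir)
```

Keeping features and labels in two files with matching row order mirrors the traindataX.csv / traindatay.csv split named above.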

####SVMmodel.py

  • This program trains an SVM model and uses joblib to dump it as SVM_PKL.pkl.

#####class SVMmodel:

######functions:

  • loadData(self, _train_X, _train_y): Load the training dataset from 'traindataX.csv' and 'traindatay.csv'.
  • train(self): Use clf.fit(X, y) to train the model.
  • selfTest(self): Self-test the accuracy of the model.
  • searchBestParameter(self): Use sklearn.grid_search to find the best parameters for the SVM model.
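The training workflow above might look roughly like the sketch below. It uses synthetic data instead of the CSV files, and the RBF kernel, the parameter grid, and the 3-fold cross-validation are illustrative assumptions. Note that in current scikit-learn releases `GridSearchCV` lives in `sklearn.model_selection`; the older `sklearn.grid_search` module was removed in version 0.20.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the CSV data: two classes of 16-pixel binary vectors.
rng = np.random.default_rng(0)
X = np.vstack([
    np.hstack([np.ones((20, 8)), np.zeros((20, 8))]),  # class "a": left half set
    np.hstack([np.zeros((20, 8)), np.ones((20, 8))]),  # class "b": right half set
])
flip = rng.random(X.shape) < 0.05  # flip about 5% of the pixels as noise
X = np.where(flip, 1 - X, X)
y = np.array(["a"] * 20 + ["b"] * 20)

# Grid search over C and gamma (values chosen only for illustration).
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X, y)
clf = search.best_estimator_

# Self-test: accuracy on the training data itself.
train_accuracy = clf.score(X, y)

# Dump the model with joblib and load it back.
model_path = os.path.join(tempfile.mkdtemp(), "SVM_PKL.pkl")
joblib.dump(clf, model_path)
restored = joblib.load(model_path)
```

A self-test on the training data only shows that the model fits what it has seen; a held-out set gives a more honest accuracy estimate.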

####finalPrediction.py

  • This program uses the trained SVM model to verify a new code.

#####class predictionModel:

######functions:

  • loaddata(self, filePath): from traindataX.csv and traindatay.csv, load training dataset.
  • SVMpredict(self): use clf.fit(X,y) to train the model.
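One plausible shape for the prediction step, assuming each segmented letter is flattened and classified independently and the predicted labels are then joined into the code string. The helper `predict_code` is hypothetical, and the toy linear model stands in for the model that would be loaded from SVM_PKL.pkl.

```python
import numpy as np
from sklearn.svm import SVC


def predict_code(clf, segments):
    """Flatten each segmented letter image and join the predicted labels."""
    features = np.array([np.asarray(seg).ravel() for seg in segments])
    return "".join(clf.predict(features))


# Toy model standing in for the trained SVM.
left = np.array([[1, 0], [1, 0]])
right = np.array([[0, 1], [0, 1]])
clf = SVC(kernel="linear").fit([left.ravel(), right.ravel()], ["a", "b"])

code = predict_code(clf, [right, left, left])
```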

####KneighorsModelPrediction.py

  • This program uses the Kmeans method to verify a new verification code.

####classification.py

  • This program uses the trained SVM model to verify the images in the testpictures folder.

###Contributors

  • guangyangpku
