swiftocr's People

Contributors

albinekcom, ashutoshsharma12, bermudalocket, guymoreillon, huntermonk, jillevdw, jrtibbetts, matthiasu, mattstanford, mkonapelsky, mrdrprofk, msztech, neoneye, nmac427, rsaeks, saracevas, terhechte, valeriyvan, waterskier2007, zackrw

swiftocr's Issues

Question about aspectRatio

I noticed that the original aspectRatio is appended to the end of the training data.
The ratio is just one of the 321 input values, so I wonder whether it really helps accuracy.

I tried an image that looks like "W" but whose ratio is completely different from the trained ratio. It was still recognised as "W".

Preprocessing on small images

Hi again,

I have been playing around with some small images of text to check whether I could recognize the text on them. I retrained the NN and tweaked some parameters, but had no luck. I think the preprocessing algorithm might be what leaves the text unreadable.

My original image:
sample5

What SwiftOCR debug says:
captura de pantalla 2016-07-23 a las 16 27 16

Maybe it is the size of the image?

Pod File

Could the project be added to CocoaPods? Thanks

Characters show up that aren't in white list

For instance, I whitelist digits only, but still get some letters in the output string.

Perhaps there was no confidence of any digit?
    [SwiftOCR.SwiftOCRRecognizedBlob(charactersWithConfidence: [("3", 0.500555277), ("7", 0.0274531413)], boundingBox: (6.0, 6.0, 21.0, 29.0)),
     SwiftOCR.SwiftOCRRecognizedBlob(charactersWithConfidence: [], boundingBox: (29.0, 7.0, 10.0, 27.0)),
     SwiftOCR.SwiftOCRRecognizedBlob(charactersWithConfidence: [("6", 0.083129324)], boundingBox: (44.0, 5.0, 23.0, 29.0)),
     SwiftOCR.SwiftOCRRecognizedBlob(charactersWithConfidence: [("2", 0.25452441)], boundingBox: (81.0, 6.0, 20.0, 28.0)),
     SwiftOCR.SwiftOCRRecognizedBlob(charactersWithConfidence: [("3", 0.500555277), ("7", 0.0274531413)], boundingBox: (105.0, 6.0, 21.0, 29.0)),
     SwiftOCR.SwiftOCRRecognizedBlob(charactersWithConfidence: [("5", 0.0765574574), ("9", 0.066627048)], boundingBox: (129.0, 6.0, 20.0, 29.0))]
The string returned was "3IE23" (actual 316235).
Actually, looking at the output here, I guess the issue is that the confidence is lower than 0.1, so it still chose the letters?
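One way to guarantee that only whitelisted characters reach the output string is to filter each blob's candidates after recognition. The following is a minimal sketch of that idea, not SwiftOCR's own code: the `Blob` struct, `bestCharacter` function, `minimumConfidence` parameter and "?" placeholder are all hypothetical names chosen for illustration, and the confidences are rounded from the output above.

```swift
// Hypothetical stand-in for one recognized blob: candidate characters
// paired with the network's confidence for each (character, confidence).
struct Blob {
    let charactersWithConfidence: [(Character, Float)]
}

// Returns the most confident whitelisted candidate, or `placeholder`
// when no candidate is both whitelisted and confident enough.
func bestCharacter(in blob: Blob,
                   whitelist: Set<Character>,
                   minimumConfidence: Float = 0.0,
                   placeholder: Character = "?") -> Character {
    let allowed = blob.charactersWithConfidence.filter {
        whitelist.contains($0.0) && $0.1 >= minimumConfidence
    }
    return allowed.max(by: { $0.1 < $1.1 })?.0 ?? placeholder
}

let digits = Set("0123456789")
let blobs = [
    Blob(charactersWithConfidence: [("3", 0.5005), ("7", 0.0274)]),
    Blob(charactersWithConfidence: []),            // no candidate at all
    Blob(charactersWithConfidence: [("6", 0.0831)]),
]
let text = String(blobs.map { bestCharacter(in: $0, whitelist: digits) })
print(text)
```

A blob with no acceptable candidate becomes a visible placeholder instead of a letter from outside the whitelist, which makes low-confidence positions easy to spot.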

NSCameraUsageDescription missing

The app's Info.plist must contain an NSCameraUsageDescription key with a string value explaining to the user how the app uses this data.

Japanese & Chinese hieroglyphs

Hi.
Thanks for this awesome project and your work. I have a question: are you planning to implement recognition of hieroglyphs (Japanese and Chinese characters)? It would be much appreciated.

Training based on images?

Would it be difficult to mod the training app such that you can give it images for a digit and tell it the correct answer?

Read only numbers!!!

I want to read only numbers. When I change internal var recognizableCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" to internal var recognizableCharacters = "0123456789", it crashes or sometimes still returns letters.
How can I solve this?

thank you

Fragment Shader Exception

On OS X I get an exception immediately after dragging an image into the window:

GPUImage`runSynchronouslyOnVideoProcessingQueue

Failed to compile fragment shader
Program link log: (null)
Fragment shader compile log: ERROR: 0:25: 'vec2' : syntax error: syntax error
Vertex shader compile log: (null)
*** Assertion failure in -[GPUImageSingleComponentGaussianBlurFilter switchToVertexShader:fragmentShader:], /SwiftOCR/framework/SwiftOCR/GPUImage-master/framework/Source/GPUImageGaussianBlurFilter.m:401
An uncaught exception was raised
Filter shader link failed

I didn't manage to track down which vec2 it was (1503 results in 128 files)

Carthage build failed

The following build commands failed:
Check dependencies
(1 failure)
warning: using 'ALWAYS_SEARCH_USER_PATHS = YES' while building targets which define modules ('DEFINES_MODULE = YES') may fail. Please migrate to using 'ALWAYS_SEARCH_USER_PATHS = NO'.
A shell task (/usr/bin/xcrun xcodebuild -project /projects/PangPingfei/SuperDemo/SuperDemo/Carthage/Checkouts/SwiftOCR/framework/SwiftOCR.xcodeproj -scheme SwiftOCR -configuration Release -sdk iphoneos ONLY_ACTIVE_ARCH=NO BITCODE_GENERATION_MODE=bitcode CODE_SIGNING_REQUIRED=NO CODE_SIGN_IDENTITY= CARTHAGE=YES clean build) failed with exit code 65:
** CLEAN FAILED **

The following build commands failed:
Check dependencies
(1 failure)
** BUILD FAILED **

The following build commands failed:
Check dependencies
(1 failure)

Dependency Analysis Error

I downloaded the example project and tried running it in Xcode 8.2. I'm getting the following error.

“Use Legacy Swift Language Version” (SWIFT_VERSION) is required to be configured correctly for targets which use Swift. Use the [Edit > Convert > To Current Swift Syntax…] menu to choose a Swift version or use the Build Settings editor to configure the build setting directly.

I deleted the Derived Data folder, cleaned the project and built it again, but the same error still occurs.

Help on blur

Hi, I was wondering whether this can be used to detect text and then act on it, rather than just returning the string: for example, blurring the text, such as a car's number plate. Can you shed some light on how to achieve this? Many thanks in advance.

Optimize Training parameters

Hi,

I just came across your project and it is awesome! 👍

I have worked with some deep neural networks, and I know it helps to try different parameters (learning rate, momentum, etc.) to get better results and to give the user the best trained network (as measured on the test group). I looked at your training sample and saw that the parameters are fixed (0.7 learning rate, 0.4 momentum, ...) and that there is a single hidden layer of fixed size.

I do not know which algorithm you use to recognize text (BTW, could you provide a source for it so I can read up on it?), but I would like to try a range of parameters to see whether I can improve the trained weights of the neural network. The problem is that I cannot find much documentation for your project. Could you give me some insight into how the algorithm works and why you chose that learning rate, momentum, etc.?

Thanks!
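Trying several parameter combinations, as proposed above, is usually done with a grid search. The following is a minimal sketch of that idea and not part of SwiftOCR: `HyperParameters`, `gridSearch` and the `validationError` closure are hypothetical names, and the closure stands in for "train a network with these settings and measure its error on held-out samples".

```swift
// One candidate setting for the trainer. The fields mirror the values
// mentioned in the issue; the names are illustrative only.
struct HyperParameters {
    let learningRate: Float
    let momentum: Float
    let hiddenLayerSize: Int
}

// Tries every combination and returns the one with the lowest error
// reported by the caller-supplied `validationError` closure.
func gridSearch(learningRates: [Float],
                momenta: [Float],
                hiddenLayerSizes: [Int],
                validationError: (HyperParameters) -> Float) -> HyperParameters? {
    var best: (parameters: HyperParameters, error: Float)?
    for rate in learningRates {
        for momentum in momenta {
            for size in hiddenLayerSizes {
                let candidate = HyperParameters(learningRate: rate,
                                                momentum: momentum,
                                                hiddenLayerSize: size)
                let error = validationError(candidate)
                // Keep the current best unless this candidate is strictly better.
                if let current = best, current.error <= error { continue }
                best = (parameters: candidate, error: error)
            }
        }
    }
    return best?.parameters
}

// Toy stand-in for real training: pretends the error is smallest at a
// learning rate of 0.5 and a momentum of 0.4.
let chosen = gridSearch(learningRates: [0.3, 0.5, 0.7],
                        momenta: [0.2, 0.4],
                        hiddenLayerSizes: [100]) { p in
    abs(p.learningRate - 0.5) + abs(p.momentum - 0.4)
}
```

In real use the closure would be expensive, since each call trains a full network, so the grid is normally kept small.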

Unicode or ASCII chars?

Hi there,

This is more of a question than an issue, as I haven't had a chance to try this out yet - just wondering if your library is able to recognize unicode chars, such as ♠️♦️♥️♣️, or ASCII chars, such as ♠♥♦♣ ? Maybe if I supply any of these as part of the recognizable characters set and/or do training with different images?

recognize().indexToCharacter() - fatal error: Index out of range

The crash happens in its inner function indexToCharacter()

    func indexToCharacter(_ index: Int) -> Character {
        return Array(recognizableCharacters.characters)[index]
    }

The index is 36.

The recognizableCharacters is ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789

I use the latest version of SwiftOCR. commit hash: 204e6850149dfd50e507e66edf265fc58cf38178

I have trained my own OCR-Network that I'm using. This is trained with these characters ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789/-

It doesn't crash if I use the default OCR-Network file that is in the SwiftOCR repository.

I like SwiftOCR and find it better than Tesseract.
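The numbers above are consistent with a size mismatch: a network trained on 38 characters has 38 outputs and can report index 36 or 37, while the 36-character string only has indices 0 through 35. The following sketch shows a bounds-checked lookup that returns nil instead of trapping. It assumes Swift 4 or later (where a String can be passed to Array directly), and the extra `in characters:` parameter is an illustrative addition, not the library's signature.

```swift
let defaultCharacters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
let trainedCharacters = defaultCharacters + "/-"

// Returns nil instead of crashing when the network reports an output
// index that has no matching entry in the character list.
func indexToCharacter(_ index: Int, in characters: String) -> Character? {
    let table = Array(characters)
    guard table.indices.contains(index) else { return nil }
    return table[index]
}

// Index 36 is out of range for the 36-character list but valid for
// the 38-character list the custom network was trained on.
let withDefault = indexToCharacter(36, in: defaultCharacters)
let withTrained = indexToCharacter(36, in: trainedCharacters)
```

The safe lookup only hides the crash. Recognition is only meaningful when the character list used at recognition time is the same list, in the same order, that the network was trained with.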

Training doesn't finish

I am trying to train the network using Monofont. I waited around 10 minutes and I still see FFNN console output. Is that normal? Here is the log from the Xcode console.

2016-05-22 09:40:16 +0000 ----> 190.271
2016-05-22 09:40:18 +0000 ----> 107.849
.....
2016-05-22 09:51:31 +0000 ----> 13.8021
2016-05-22 09:51:34 +0000 ----> 13.8024
2016-05-22 09:51:36 +0000 ----> 13.8026
2016-05-22 09:51:38 +0000 ----> 13.8028
2016-05-22 09:51:41 +0000 ----> 13.803
2016-05-22 09:51:43 +0000 ----> 13.8032

I stopped it in the middle and saved the training data. I replaced OCR-Network with the file it produced, but recognition failed on the following images.
original
Rotated
Removed some noise
Clear
It only recognized the last one, as A5C8U5.
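The log above shows the error settling near 13.80 and then creeping upward, so waiting longer would not have lowered it. A common way to end such a run automatically is early stopping: remember the best error seen so far and stop after a fixed number of epochs without improvement. The following is a minimal sketch; `earlyStoppingEpoch` and `patience` are hypothetical names, not part of the training app.

```swift
// Returns the index of the epoch with the lowest error once `patience`
// consecutive epochs have failed to improve on it, or nil if training
// should keep going.
func earlyStoppingEpoch(errors: [Double], patience: Int) -> Int? {
    var bestIndex = 0
    for (index, error) in errors.enumerated() {
        if error < errors[bestIndex] {
            bestIndex = index
        } else if index - bestIndex >= patience {
            return bestIndex
        }
    }
    return nil
}

// Error values copied from the tail of the log above.
let tail = [13.8021, 13.8024, 13.8026, 13.8028, 13.803, 13.8032]
let stopAt = earlyStoppingEpoch(errors: tail, patience: 3)
```

With a patience of 3, the run above would stop at the fourth logged value and keep the weights from the first one.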

Recognising numbers

Sorry to open another issue so soon (is an email better?). Despite adding numbers to the training list and to the whitelist, I cannot get numbers to be recognised.
img_1951

BTW, we're trying to get this working to capture a MAC address with - separators. Any tips you might have would be appreciated.

how to start an example?

When I try to run an example, this error appears:

dyld: Library not loaded: @rpath/GPUImage.framework/GPUImage
  Referenced from: /private/var/containers/Bundle/Application/97507C02-4AF1-4CE4-9D20-D6ABA8E0B050/SwiftOCR Camera.app/Frameworks/SwiftOCR.framework/SwiftOCR
  Reason: image not found
(lldb) 

Please help me get started with this framework.

El Capitan 10.11.5 (15F34), Xcode Version 7.3.1 (7D1014)

Port to GPUImage2?

Is anyone working on porting this to work with GPUImage2?
Considering that it is 100% Swift AND Linux compatible, that would greatly expand the usefulness of this project.

Adding extra recognized characters to training app errors FFNN

In the training app, adding an extra recognizable character makes the FFNN fail with this error:

InvalidAnswerError("Invalid number of outputs given in answer: 37. Expected: 36")

If I remove a character from the front of the string and then add the new character, the FFNN gets the number of outputs it expects and training succeeds.
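The error message suggests that the network was created with 36 outputs while the answer vector had 37 entries, one per recognizable character. The following sketch shows why the two numbers are tied together when answers are one-hot encoded. It is an illustration under that assumption, and `oneHotAnswer` is a hypothetical helper, not a function from the training app.

```swift
// Builds the one-hot answer vector for `character`. Its length always
// equals the number of recognizable characters, so the network's output
// layer must have exactly that many neurons.
func oneHotAnswer(for character: Character, in characters: String) -> [Float]? {
    let table = Array(characters)
    guard let position = table.firstIndex(of: character) else { return nil }
    var answer = [Float](repeating: 0, count: table.count)
    answer[position] = 1
    return answer
}

let original = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
let extended = original + "-"

// 36 entries for the original list, 37 once a character is added.
let originalCount = oneHotAnswer(for: "A", in: original)?.count
let extendedCount = oneHotAnswer(for: "A", in: extended)?.count
```

Removing one character while adding another keeps the count at 36, which matches the workaround described above. Supporting a longer list would require a network created with a matching number of outputs.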

training ?

Hello

I am trying to understand what to do once the network has been trained. How do I get the trained neural network back into the camera app?

Also, the SwiftOCR Camera app contains SwiftOCRTraining, which is never used. What is its purpose?

Training with different Font

Hey, I'm trying to train the network with a font other than Arial and it doesn't seem to be working. Any help?

This repo is marked as Objective-C

I'm assuming it's because you've forked GPUImage so the vast majority of files in this repo are *.m and *.h. Not sure if you can force Github to recognize it as a Swift repository, but it would help discoverability for people filtering by language.

Getting bad OCR on "Simple" Image.

Can you provide some tips on how to properly train SwiftOCR? Every image I feed it even after training with new fonts, SwiftOCR spits out garbage or an empty array. I feel like I'm not doing something properly.

These are not difficult images either. It can't recognize something as simple as 3 numbers. What am I doing wrong?

Here is an example image. I've trained the OCR-Network with all fonts.

screen shot 2016-08-30 at 9 17 53 pm

It seems to skip letters and Tesseract can identify this 100%. What am I doing wrong?

Configuration(init, or to add a method of settings) SwiftOCR

Thank you, I have started the project. Thanks to your answer #21.

But at the moment I can't replace TesseractOCR with yours, because of the following problems:

  1. How to allow reading symbols "., { } [];-+= %..."?
  2. The symbol "3" very often reads out as the letter "B"

Can't build any of the example projects

When trying to build any of the example projects I get the following error /Users/ferdinand/Library/Developer/Xcode/DerivedData/SwiftOCR_Training-gomzjvpwejvmqhcbyowxjqloqzmd/Build/Intermediates/GPUImage.build/Debug-iphoneos/Documentation.build/Script-BC552B3A1558C6FC001F3FFA.sh: line 5: /usr/local/bin/appledoc: No such file or directory

Any suggestions on how I can fix this? Thanks in advance.

Empty String

I have successfully integrated the project by following this #25

I tried this image
test 2

But unfortunately getting an empty string. What could be the reason?

Issue with simple image

I'm trying the OS X sample app on the attached image. It returns 'F7' as the recognized string. Any idea why this happens?
What are the requirements for good recognition?
img_2457 2

Number patterns

Hi @garnele007!

This library is useful. But can I recognize digits only, when the image has the pattern xxxx xxxx xxxx? Thanks.
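Whatever recognizer is used, a fixed layout such as xxxx xxxx xxxx can be checked after recognition so that malformed results are rejected. The following is a minimal sketch of such a check; `matchesGroupedDigits` is a hypothetical helper and is independent of SwiftOCR.

```swift
// Returns true when `text` consists of exactly `groups` groups of
// `groupLength` ASCII digits separated by single spaces,
// e.g. "1234 5678 9012" with the default arguments.
func matchesGroupedDigits(_ text: String, groups: Int = 3, groupLength: Int = 4) -> Bool {
    let asciiDigits: ClosedRange<Character> = "0"..."9"
    // Keep empty pieces so that doubled spaces are detected as an error.
    let parts = text.split(separator: " ", omittingEmptySubsequences: false)
    guard parts.count == groups else { return false }
    return parts.allSatisfy { part in
        part.count == groupLength && part.allSatisfy { asciiDigits.contains($0) }
    }
}

let accepted = matchesGroupedDigits("1234 5678 9012")
let rejected = matchesGroupedDigits("1234 5678 9O12")   // letter O, not zero
```

A failed check is a cue to retry with another frame or to ask the user to confirm the result.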

Training doesn't appear to work

I'm trying to get the iOS app to recognise some text and for the most part it doesn't seem to work. I tried to train using all the fonts listed and this made no difference. If I understand things correctly, the training should modify the OCR-Network file, but Finder tells me the file has not been modified. Am I doing it wrong?

Empty string while trying to convert an Image

I tried implementing SwiftOCR using a simple image. However, I get an empty string as output.

var myImage:UIImage = UIImage(named: "/Users/sriteja/Desktop/Finance Manager/Finance Manager/bill2.png")!

@IBAction func testButton(_ sender: AnyObject) {
    
    let swiftOCRInstance   = SwiftOCR()
    
    swiftOCRInstance.recognize(myImage) {recognizedString in
        
        print(recognizedString)
    }
}

Use submodules instead of direct frameworks.

If by any chance these included frameworks were updated, then you wouldn't have to worry about manually updating them.

Also it sets the repo language to a greater percentage of Swift than Objective-C.

Training not based on selected font in TrainingApp?

Hi, I may well be wrong, but I've been stepping through the code and it looks like the training images are generated using the font name array in trainingFontNames.

I would have assumed that the images would be generated from the selectedFontNames being passed in, but as far as I can tell, selectedFontNames, while tracked within the view controller, never affects trainWithCharSet.

Am I missing something?

I've changed my file locally:

//    private let trainingFontNames  = ["Arial Narrow", "Arial Narrow Bold"]
    public var trainingFontNames  = ["Arial Narrow", "Arial Narrow Bold"]

and set it like so:

                self.trainingInstance.trainingFontNames = self.selectedFontNames
                self.trainingInstance.trainWithCharSet() { error in

GPUImage module not found

image

I've found similar issues around, but none of the solutions seem to work. I get the error "No such module 'GPUImage'" when building the final product in all of the examples. I can build all the frameworks with no problem. Also, the GPUImage framework is included in both the embedded binaries and the linked frameworks. I'm using Xcode Version 7.3.1 (7D1014), OS X 10.11.3 on a MacBook Air.

Thanks!

Getting empty text or two letter text

It does not work for the majority of the images I tested. Only once did it return two letters, and they were unrelated to the text in the image. The application is for iOS, built with Xcode 8.2 and Swift 3.0.1.
receipt
screen shot 2017-01-10 at 10 37 30 pm

Seven Segment Optical Character Recognition

Hi, thank you for your library.
Can I use it to recognize seven-segment characters?
What do I need to do for that (download a font and train with it)?
I want to recognize the digits on an odometer.

error going up and up

Hi

I trained with 10 common fonts (Courier, Helvetica, ...) and after a while the error kept rising:
2017-01-03 20:51:44 +0000 ----> 277.007 vs .0.2
2017-01-03 20:52:02 +0000 ----> 277.148 vs .0.2
2017-01-03 20:52:21 +0000 ----> 277.286 vs .0.2
2017-01-03 20:52:39 +0000 ----> 277.421 vs .0.2
2017-01-03 20:52:48 +0000 ----> 277.555 vs .0.2
2017-01-03 20:52:57 +0000 ----> 277.685 vs .0.2
2017-01-03 20:53:06 +0000 ----> 277.814 vs .0.2
2017-01-03 20:53:14 +0000 ----> 277.941 vs .0.2
2017-01-03 20:53:23 +0000 ----> 278.065 vs .0.2
2017-01-03 20:53:31 +0000 ----> 278.187 vs .0.2
...
2017-01-03 20:59:44 +0000 ----> 338.858 vs .0.3

How is this possible?
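One common way for a training error to rise instead of fall is a step size that is too large for the problem. Whether that is what happened here cannot be told from the log alone. The following toy sketch shows the effect on the simplest possible case, gradient descent on f(w) = w * w. It is not a model of the training app, and the `errors` function is a hypothetical name.

```swift
// Gradient descent on f(w) = w * w, whose gradient is 2 * w.
// Returns the error f(w) after each step.
func errors(learningRate: Double, steps: Int, start: Double = 1.0) -> [Double] {
    var w = start
    var history: [Double] = []
    for _ in 0..<steps {
        w -= learningRate * 2 * w      // each step multiplies w by (1 - 2 * learningRate)
        history.append(w * w)
    }
    return history
}

// With a rate of 0.1 each step scales w by 0.8, so the error shrinks.
let small = errors(learningRate: 0.1, steps: 5)
// With a rate of 1.1 each step scales w by -1.2, so the error grows.
let large = errors(learningRate: 1.1, steps: 5)
```

If a rising error is caused by the step size, lowering the learning rate is the usual first experiment.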

Low reco count even using example on iOS

I'm testing the sample app that comes with the package and when I run it the only way I can get it to accurately recognise alpha numerics is to type them in GIMP and test off my screen.

I have tried various texts, all single-line, from number plates to single words to plain numbers, and I only get a 100% hit rate when using typed text off my computer screen.
