gojiplus / abbyyr
R Client for the Abbyy Cloud OCR
Home Page: http://soodoku.github.io/abbyyR/
License: Other
In processImage():
querylist <- list(language = language, letterSet = letterSet,
regExp = regExp, textType = textType, oneTextLine = oneTextLine,
oneWordPerTextLine = oneWordPerTextLine, markingType = markingType,
placeholdersCount = placeholdersCount, writingStyle = writingStyle,
description = description, pdfPassword = pdfPassword)
This list omits the "region" parameter documented at https://ocrsdk.com/documentation/api-reference/process-text-field-method/
As a result, OCR cannot be restricted to a region of the image.
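As a rough sketch of what carrying the parameter could look like, the snippet below appends a region entry to a query list. The helper name add_region() is hypothetical and not part of abbyyR, and the "left,top,right,bottom" pixel format is my reading of the ocrsdk.com documentation linked above, so please check it there before relying on it.

```r
# Hypothetical helper (not part of abbyyR): append a region entry to a query list.
# Assumes the API expects "left,top,right,bottom" in pixels, per the linked docs.
add_region <- function(querylist, left, top, right, bottom) {
  coords <- c(left, top, right, bottom)
  stopifnot(is.numeric(coords), length(coords) == 4, right > left, bottom > top)
  querylist$region <- paste(as.integer(coords), collapse = ",")
  querylist
}

# Example values only; the other fields stand in for the full list above.
querylist <- list(language = "English", textType = "normal")
querylist <- add_region(querylist, left = 10, top = 20, right = 300, bottom = 80)
print(querylist$region)
```

The point is only that the region travels in the same list as the other query parameters; how the package would expose it as an argument is up to the maintainer.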
> library('EBImage')
> library('abbyyR')
> lnk <- 'http://www.theage.com.au/ffximage/2005/07/22/id_card1_gallery__502x329,0.jpg'
> pic <- readImage(lnk)
> display(pic)
> download.file(lnk,destfile=paste0(getwd(),'/pic.jpg'))
--2015-11-03 18:23:39-- http://www.theage.com.au/ffximage/2005/07/22/id_card1_gallery__502x329,0.jpg
Resolving www.theage.com.au (www.theage.com.au)... 104.86.110.66, 104.86.110.27
Connecting to www.theage.com.au (www.theage.com.au)|104.86.110.66|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 23154 (23K) [image/jpeg]
Saving to: ‘/home/ryoeng/pic.jpg’
0K .......... .......... .. 100% 19.2M=0.001s
2015-11-03 18:23:39 (19.2 MB/s) - ‘/home/ryoeng/pic.jpg’ saved [23154/23154]
> setapp(c('cloud_ocr', 'PgUbJEeFzlKMjeX/668puVFZ'))
> processPhotoId(file_path=paste0(getwd(),'/pic.jpg'), idType='auto', imageSource='auto')
Error in processPhotoId(file_path = paste0(getwd(), "/pic.jpg"), idType = "auto", :
  client error: (450) Blocked by Windows Parental Controls (Microsoft)
>
> file.remove(paste0(getwd(),'/pic.jpg'))
[1] TRUE
I'm just curious: is there any connection between your package and Microsoft? I thought abbyyR was a web-based application. Is there any solution?
Hello,
Thanks for making available your package to test Abbyy OCR.
I am processing a batch of images with OCR. I was able to submit around 100 images following the example you include in this vignette:
https://cloud.r-project.org/web/packages/abbyyR/vignettes/wiscads.html
But it suddenly stopped and responded with Error: HTTP failure: 450.
From that moment on, I could not process any other image, and I even got the same error with deleteTask(), but not with listTasks(). I switched to a mobile Internet connection and it seemed to work again, but after submitting about 100 images the same problem returned with the same error.
I saw your response in this issue, https://github.com/soodoku/abbyyR/issues/1, and I replaced the CRAN version with the one you have here on GitHub, but the problem persists.
If you know of any workaround or alternative, please let me know.
Hello,
I'm getting the 450 error when working with both the processImage and ocrFile functions. As far as I can tell, it does not depend on my app limitations or on my code repository connections, since I'm able to run all the other functions that do not trigger an immediate file download (such as submitImage, processDocument, etc.).
I also think these errors depend on some recent update to the abbyyR package or to the Abbyy cloud service, since I was able to run those functions up until April/May, and I found the same issue posted on the Abbyy forum in June (no proper answer was given there).
Any idea of what may be going on?
Thanks
Luca
I've got the error below.
Brief description of the problem
library(abbyyR)
setapp(c("tidyverse", "password"))
getAppInfo()
Error: HTTP failure: 403
The names of the downloaded files are the task IDs shown in listTasks(). These names are not meaningful and do not contain the file extension. Is there any way to keep the original file name in the 'description' field, as is done in the sample C# app below?
http://ocrsdk.com/documentation/code-samples/dotNet.zip
The name could then be taken from the description, and the extension appended to each file.
I'm attempting to use the ocrFile() function to read a PDF image of a table. If I use ABBYY's "finereaderonline.com" with my PDFs I get a nice .csv table back.
When I use ocrFile() I get the following error message:
Error in Ops.factor(res$id, listFinishedTasks()$id) : level sets of factors are different
If I run getAppInfo(), the output is:
Name of Application: 1
No. of Pages Remaining: 1
No. of Fields Remaining: 1
Application Credits Expire on: 1
Type: 1
Any ideas about what I could check?
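One thing that may be worth checking: in R, comparing two factors with == fails with exactly this message when their level sets differ, which can happen if the task id columns were read in as factors. The sketch below reproduces the message with made-up ids (not real task ids) and shows that converting to character first avoids it. Whether this is what happens inside ocrFile() is an assumption on my part.

```r
# Made-up ids standing in for res$id and listFinishedTasks()$id.
submitted_id <- factor("task-b")
finished_ids <- factor(c("task-a", "task-b", "task-c"))

# Comparing factors with different level sets raises an error; capture its message.
msg <- tryCatch(
  {
    submitted_id == finished_ids
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Comparing as character vectors works and picks out the matching position.
match_idx <- as.character(submitted_id) == as.character(finished_ids)
print(which(match_idx))
```

If the ids do turn out to be factors, reading them with stringsAsFactors = FALSE or wrapping both sides in as.character() would be the usual ways around it.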
Hi.
When trying to use the ocrFile() function, I enter the syntax below:
ocrFile(file_path="~/#R_Projects/officeR/ImageOnly.pdf", output_dir="~/#R_Projects/officeR/abbyyR/", exportFormat = "pdfa", save_to_file = TRUE)
However, I get the following error: Error in curl_download(finishedlist$resultUrl[res$id == finishedlist$id], : Argument 'url' must be string.
I found a response to a similar issue on Stack Overflow, but running that syntax didn't help either.
Not sure what is going wrong. Any help is greatly appreciated.
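The message says curl_download() wants its url argument to be a string, so the subsetting expression shown in the error probably returned something else, for example a factor or a zero-length vector. A minimal sketch with a made-up task table (the ids and URLs are placeholders, and the column names are taken from the error message) shows one way to check the value before downloading. That this is the actual cause inside ocrFile() is an assumption.

```r
# Made-up stand-in for listFinishedTasks(); strings stored as factors on purpose.
finishedlist <- data.frame(
  id = c("task-a", "task-b"),
  resultUrl = c("https://example.com/a", "https://example.com/b"),
  stringsAsFactors = TRUE
)
task_id <- "task-b"

# Subsetting a factor column yields a factor, not a character string.
raw_url <- finishedlist$resultUrl[as.character(finishedlist$id) == task_id]
print(is.character(raw_url))

# Convert to character and insist on exactly one match before any download call.
url <- as.character(raw_url)
stopifnot(is.character(url), length(url) == 1)
print(url)
```

If the length check fails instead, the task id was not found among the finished tasks, which would point to the task not having completed yet rather than to a type problem.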