
ocropus4-old's People

Contributors

j-rausch, snyk-bot, tmbdev, tmbnv


ocropus4-old's Issues

Update bash syntax in ocropus4 script

The ocropus4 bash script needs two small syntax corrections. Lines 33 and 47 terminate their cases (the data extraction and inference steps, respectively) with ;& instead of ;;. You can reproduce the syntax error with:

# in ocropus4/g1000test
. vocropus --help

Use parameterized arguments instead of `if false` in run script

There are two if false blocks containing additional dependencies under cmd_venv() in the run bash script. I found that installing those dependencies was necessary to get parts of the codebase working. Are these requirements meant to be installed always, or only sometimes? If the latter, it might be easier to add an argument (maybe called sos-tools) that installs those packages only when specified.

Update float type assertion in utils.py

Issue

Floating-point image arrays on my machine have dtype np.float32, not float. This fails the dtype check in the autoinvert function (ocropus/utils.py, line 194). I'm using numpy version 1.20.3 (more package info below), and adding a check for image.dtype == np.float32 to the first case seems to yield the proper results. I'm on a Mac with Python 3.8.8, and am otherwise using the packages as specified in ./run venv and pip install -r requirements.txt.

Suggested solution

I don't have an ideal solution here. The check image.dtype == float may be sufficient on other systems, so we might not want to remove it. if (image.dtype == float) or (image.dtype == np.float32): is clunky but potentially an easy patch. Pinning the package versions for which float is always the expected dtype would be a more robust solution.
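As one more option, here is a minimal sketch of a dtype check that accepts any floating-point width. It is not the code in ocropus/utils.py; the helper name is_float_image is hypothetical, and the sketch assumes that autoinvert only needs to know whether the image is floating-point at all.

```python
import numpy as np


def is_float_image(image: np.ndarray) -> bool:
    """Return True for any floating-point dtype (float16, float32, float64).

    Hypothetical helper, not part of ocropus.
    """
    return bool(np.issubdtype(image.dtype, np.floating))


img32 = np.zeros((2, 2), dtype=np.float32)
img64 = np.zeros((2, 2), dtype=float)  # Python float maps to np.float64
img8 = np.zeros((2, 2), dtype=np.uint8)

# The equality check only matches float64, which is why float32 images fail it.
assert not (img32.dtype == float)
assert img64.dtype == float

# The subtype check accepts both float widths and still rejects integer images.
assert is_float_image(img32)
assert is_float_image(img64)
assert not is_float_image(img8)
```

A check like this would avoid listing each float width by hand, at the cost of also accepting float16 arrays, which may or may not be wanted here.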

Anyway, if I'm using this code wrong (always a possibility) I'm fine for this issue to be closed.

My package versions:

(venv) (base) ➜  g1000test git:(main) ✗ pip freeze 
ansiwrap==0.8.4
anyio==3.1.0
appdirs==1.4.4
appnope==0.1.2
argon2-cffi==20.1.0
async-generator==1.10
attrs==21.2.0
autopep8==1.5.7
Babel==2.9.1
backcall==0.2.0
bash-kernel==0.7.2
beautifulsoup4==4.9.3
black==21.5b1
bleach==3.3.0
braceexpand==0.1.7
bs4==0.0.1
certifi==2020.12.5
cffi==1.14.5
chardet==4.0.0
click==7.1.1
coverage==5.5
cycler==0.10.0
decorator==4.4.2
defusedxml==0.7.1
editdistance==0.5.3
entrypoints==0.3
fasteners==0.16
future==0.18.2
greenlet==1.1.0
humanhash3==0.0.6
idna==2.10
imageio==2.9.0
iniconfig==1.1.1
ipykernel==5.5.5
ipython==7.23.1
ipython-genutils==0.2.0
isort==5.8.0
jedi==0.18.0
Jinja2==3.0.1
json5==0.9.5
jsonschema==3.2.0
jupyter-client==6.1.12
jupyter-core==4.7.1
jupyter-server==1.8.0
jupyterlab==3.0.16
jupyterlab-pygments==0.1.2
jupyterlab-server==2.5.2
jupyterlab-sos==0.8.1
kiwisolver==1.3.1
lxml==4.6.3
MarkupSafe==2.0.1
matplotlib==3.4.2
matplotlib-inline==0.1.2
mistune==0.8.4
msgpack==1.0.2
mypy==0.812
mypy-extensions==0.4.3
nbclassic==0.3.1
nbclient==0.5.3
nbconvert==6.0.7
nbformat==5.1.3
neovim==0.3.1
nest-asyncio==1.5.1
networkx==2.5.1
notebook==6.4.0
numpy==1.20.3
-e git://github.com/NVlabs/ocrodeg.git@21109cb4ea0ff90306658e904a3a7b36c1e4f6b7#egg=ocrodeg
packaging==20.9
pandas==1.2.4
pandocfilters==1.4.3
papermill==2.3.3
parso==0.8.2
pathspec==0.8.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.2.0
pluggy==0.13.1
prometheus-client==0.10.1
prompt-toolkit==3.0.18
psutil==5.8.0
ptyprocess==0.7.0
py==1.10.0
pycodestyle==2.7.0
pycparser==2.20
pydocstyle==6.1.1
pydot==1.4.2
pydotplus==2.0.2
Pygments==2.9.0
pynvim==0.4.3
pyparsing==2.4.7
pyrsistent==0.17.3
pytest==6.2.4
pytest-cov==2.12.0
python-dateutil==2.8.1
pytz==2021.1
PyWavelets==1.1.1
PyYAML==5.4.1
pyzmq==22.0.3
regex==2021.4.4
requests==2.25.1
scikit-image==0.18.1
scipy==1.6.3
Send2Trash==1.5.0
simplejson==3.17.2
six==1.16.0
sniffio==1.2.0
snowballstemmer==2.1.0
sos==0.22.5
sos-notebook==0.22.4
sos-papermill==0.2.1
sos-python==0.18.4
soupsieve==2.2.1
tabulate==0.8.9
-e git://github.com/tmbdev/tarproc.git@b233aee5ce654f970ba301bcfdc24147bc4eebc1#egg=tarproc
tenacity==7.0.0
-e git://github.com/NVlabs/tensorcom.git@52fc7c3a4e71e1ab2b6508be4c133cd1ac8a50cd#egg=tensorcom
terminado==0.10.0
testpath==0.5.0
textwrap3==0.9.2
tifffile==2021.4.8
tk==0.1.0
toml==0.10.2
torch==1.8.1
-e git://github.com/tmbdev/torchmore.git@395d9b34a8d4251863fc83f25b9ac4616195d1bc#egg=torchmore
torchvision==0.9.1
tornado==6.1
tqdm==4.61.0
traitlets==5.0.5
-e git://github.com/vatlab/transient-display-data@52047ace8be7f4e073427b023fc40886b932dfef#egg=transient_display_data
typed-ast==1.4.3
typer==0.3.2
typing-extensions==3.10.0.0
urllib3==1.26.4
wcwidth==0.2.5
-e git://github.com/tmbdev/webdataset.git@315977952b74a87848983518c64c9ad43e66c71f#egg=webdataset
webencodings==0.5.1
websocket-client==1.0.1
xxhash==2.0.2

Make CPU option for training/prediction/model loading?

Right now CUDA is required in a number of places. That is likely fine in practice for training, but it also means that much of the code can't be used out of the box on a machine without CUDA access (such as MacBook laptops). A CPU option would be really helpful for testing and also for inference, since a pretrained model can run on the CPU even if it is pretty slow.
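A rough sketch of what such an option could look like, assuming the models are objects saved with torch.save. The function names resolve_device and load_model and the option values "auto", "cpu" and "cuda" are hypothetical and not part of ocropus; only torch.cuda.is_available() and the map_location argument of torch.load are existing PyTorch calls.

```python
def resolve_device(requested: str = "auto", cuda_available: bool = False) -> str:
    """Return "cuda" or "cpu" for a requested device setting.

    requested is one of "auto", "cpu", "cuda" (hypothetical option values).
    """
    if requested not in ("auto", "cpu", "cuda"):
        raise ValueError(f"unknown device option: {requested!r}")
    if requested == "cuda" and not cuda_available:
        raise RuntimeError("CUDA was requested but is not available")
    if requested == "auto":
        # Prefer the GPU when one is present, otherwise fall back to the CPU.
        return "cuda" if cuda_available else "cpu"
    return requested


def load_model(path: str, requested: str = "auto"):
    """Load an object saved with torch.save onto the resolved device."""
    import torch  # imported lazily so resolve_device works without PyTorch

    device = resolve_device(requested, torch.cuda.is_available())
    # map_location remaps tensors that were saved on a GPU onto the chosen device.
    model = torch.load(path, map_location=device)
    return model, device
```

With something like this, load_model("model.pth", "cpu") would force CPU inference on a laptop, while the default "auto" keeps the current GPU behavior wherever CUDA is available. The file name model.pth is a placeholder.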

lseg

@tmbdev Hi there,

I have noticed that you are testing the segmentation of text lines, and you seem to have chosen these classes:
(x-height) + (textline boundary) + (text region)

I'd be interested to hear about your results:

  • How does it perform compared with connected components, for example when two lines are close to or touching each other?
  • Can it generalize to unseen documents with different fonts and noise?
  • Can it segment vertical and curved lines?

By the way, I have noticed that some labels in gsub-lseg-test.tar might need correction; an example is attached:
sample.zip


Could ocropus4 be a (part of) solution?

opengpt chat led me to this project while I was interrogating it about projects for doing QC on data entered in a GUI. The original need/motivation can be found at con/noisseur#1 and in that repo's README. The overall idea is to:

  • continuously grab screenshots from some screen
  • on detecting a screen of interest, perform OCR on the entered data, validate the data, and display the results on another screen in case of errors

I feel like ocropus has some ideas/tools that could be of use, but I decided to just ask, and possibly get pointers to other projects you might know of in this domain.
