ocropus4-old's Issues
Update bash syntax in ocropus4 script
There are some small syntax corrections necessary for the ocropus4 bash file. Lines 33 and 47 terminate cases with a `;&` instead of a `;;` for the data extraction and inference steps, respectively. You can replicate the syntax error with
# in ocropus4/g1000test
. vocropus --help
Use parameterized arguments instead of `if false` in run script
There are two `if false` blocks containing additional dependencies under `cmd_venv()` in the `run` bash script. I found that installing those dependencies was necessary for me to get parts of the codebase working. Are these requirements meant to be installed at all, or only installed sometimes? If the latter, it might be easier to add some kind of arg (maybe call it `sos-tools`) that will only install those packages if specified.
Update float type assertion in utils.py
Issue
Floating-point image arrays on my machine have dtype `np.float32`, not `float`. This fails the dtype checks in the `autoinvert` function (`ocropus/utils.py`, line 194). I'm using numpy version `1.20.3` (more package info below), and adding a type check of `image.dtype == np.float32` for the first case seems to yield the proper results. I'm using a Mac, Python 3.8.8, and am otherwise using the packages as specified in `./run venv` and `pip install -r requirements.txt`.
Suggested solution
I don't have an ideal solution here. I'm sure that `image.dtype == float` might be sufficient on other systems, so we might not want to remove it. `if (image.dtype == float) or (image.dtype == np.float32):` is clunky but a potentially easy patch. Pinning the package versions for which `float` is always the expected dtype is a more robust solution.
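For reference, a minimal sketch of a dtype check that accepts any floating-point width, so it would not depend on whether arrays arrive as `np.float32` or `np.float64`. The helper name `is_float_image` is hypothetical and not part of ocropus; the actual body of `autoinvert` is not reproduced here.

```python
import numpy as np


def is_float_image(image: np.ndarray) -> bool:
    """Return True for any floating-point dtype (float16, float32, float64, ...).

    Hypothetical helper, not part of ocropus: it stands in for a check
    such as `image.dtype == float`, which only matches float64.
    """
    return np.issubdtype(image.dtype, np.floating)


# The builtin `float` corresponds to float64, so a float32 array
# does not compare equal to it.
img32 = np.zeros((4, 4), dtype=np.float32)
img64 = np.zeros((4, 4), dtype=float)
img8 = np.zeros((4, 4), dtype=np.uint8)

print(img32.dtype == float)   # False
print(img64.dtype == float)   # True
print(is_float_image(img32))  # True
print(is_float_image(img8))   # False
```

`np.issubdtype(dtype, np.floating)` is a standard numpy call, so this would avoid listing each float width by hand.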
Anyway, if I'm using this code wrong (always a possibility) I'm fine for this issue to be closed.
My package versions:
(venv) (base) ➜ g1000test git:(main) ✗ pip freeze
ansiwrap==0.8.4
anyio==3.1.0
appdirs==1.4.4
appnope==0.1.2
argon2-cffi==20.1.0
async-generator==1.10
attrs==21.2.0
autopep8==1.5.7
Babel==2.9.1
backcall==0.2.0
bash-kernel==0.7.2
beautifulsoup4==4.9.3
black==21.5b1
bleach==3.3.0
braceexpand==0.1.7
bs4==0.0.1
certifi==2020.12.5
cffi==1.14.5
chardet==4.0.0
click==7.1.1
coverage==5.5
cycler==0.10.0
decorator==4.4.2
defusedxml==0.7.1
editdistance==0.5.3
entrypoints==0.3
fasteners==0.16
future==0.18.2
greenlet==1.1.0
humanhash3==0.0.6
idna==2.10
imageio==2.9.0
iniconfig==1.1.1
ipykernel==5.5.5
ipython==7.23.1
ipython-genutils==0.2.0
isort==5.8.0
jedi==0.18.0
Jinja2==3.0.1
json5==0.9.5
jsonschema==3.2.0
jupyter-client==6.1.12
jupyter-core==4.7.1
jupyter-server==1.8.0
jupyterlab==3.0.16
jupyterlab-pygments==0.1.2
jupyterlab-server==2.5.2
jupyterlab-sos==0.8.1
kiwisolver==1.3.1
lxml==4.6.3
MarkupSafe==2.0.1
matplotlib==3.4.2
matplotlib-inline==0.1.2
mistune==0.8.4
msgpack==1.0.2
mypy==0.812
mypy-extensions==0.4.3
nbclassic==0.3.1
nbclient==0.5.3
nbconvert==6.0.7
nbformat==5.1.3
neovim==0.3.1
nest-asyncio==1.5.1
networkx==2.5.1
notebook==6.4.0
numpy==1.20.3
-e git://github.com/NVlabs/ocrodeg.git@21109cb4ea0ff90306658e904a3a7b36c1e4f6b7#egg=ocrodeg
packaging==20.9
pandas==1.2.4
pandocfilters==1.4.3
papermill==2.3.3
parso==0.8.2
pathspec==0.8.1
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.2.0
pluggy==0.13.1
prometheus-client==0.10.1
prompt-toolkit==3.0.18
psutil==5.8.0
ptyprocess==0.7.0
py==1.10.0
pycodestyle==2.7.0
pycparser==2.20
pydocstyle==6.1.1
pydot==1.4.2
pydotplus==2.0.2
Pygments==2.9.0
pynvim==0.4.3
pyparsing==2.4.7
pyrsistent==0.17.3
pytest==6.2.4
pytest-cov==2.12.0
python-dateutil==2.8.1
pytz==2021.1
PyWavelets==1.1.1
PyYAML==5.4.1
pyzmq==22.0.3
regex==2021.4.4
requests==2.25.1
scikit-image==0.18.1
scipy==1.6.3
Send2Trash==1.5.0
simplejson==3.17.2
six==1.16.0
sniffio==1.2.0
snowballstemmer==2.1.0
sos==0.22.5
sos-notebook==0.22.4
sos-papermill==0.2.1
sos-python==0.18.4
soupsieve==2.2.1
tabulate==0.8.9
-e git://github.com/tmbdev/tarproc.git@b233aee5ce654f970ba301bcfdc24147bc4eebc1#egg=tarproc
tenacity==7.0.0
-e git://github.com/NVlabs/tensorcom.git@52fc7c3a4e71e1ab2b6508be4c133cd1ac8a50cd#egg=tensorcom
terminado==0.10.0
testpath==0.5.0
textwrap3==0.9.2
tifffile==2021.4.8
tk==0.1.0
toml==0.10.2
torch==1.8.1
-e git://github.com/tmbdev/torchmore.git@395d9b34a8d4251863fc83f25b9ac4616195d1bc#egg=torchmore
torchvision==0.9.1
tornado==6.1
tqdm==4.61.0
traitlets==5.0.5
-e git://github.com/vatlab/transient-display-data@52047ace8be7f4e073427b023fc40886b932dfef#egg=transient_display_data
typed-ast==1.4.3
typer==0.3.2
typing-extensions==3.10.0.0
urllib3==1.26.4
wcwidth==0.2.5
-e git://github.com/tmbdev/webdataset.git@315977952b74a87848983518c64c9ad43e66c71f#egg=webdataset
webencodings==0.5.1
websocket-client==1.0.1
xxhash==2.0.2
Make CPU option for training/prediction/model loading?
Right now CUDA is required in a number of places, which is likely fine in practice for training but also means that much of the code can't be used out of the box on any machine without CUDA access (such as MacBook laptops). A CPU option would be really helpful for testing and also for inference, since a pretrained model can run on the CPU even if it'll be pretty slow.
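A rough sketch of what a CPU fallback could look like in PyTorch, assuming the model weights are saved as a plain `state_dict`. The function names `pick_device` and `load_model` are hypothetical and are not taken from the ocropus4 codebase; only `torch.cuda.is_available()`, `torch.device`, and the `map_location` argument of `torch.load` are standard PyTorch.

```python
import torch
import torch.nn as nn


def pick_device(prefer_cuda: bool = True) -> torch.device:
    """Use CUDA when requested and available, otherwise fall back to CPU."""
    if prefer_cuda and torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")


def load_model(model: nn.Module, path: str, device: torch.device) -> nn.Module:
    """Load a saved state_dict onto `device` (hypothetical helper).

    map_location remaps tensors that were saved from a GPU, so the
    checkpoint can be opened on a machine without CUDA.
    """
    state = torch.load(path, map_location=device)
    model.load_state_dict(state)
    return model.to(device).eval()
```

With something like this, call sites would use `.to(device)` instead of hard-coded `.cuda()` calls, and a command-line flag could decide the value passed as `prefer_cuda`.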
lseg
@tmbdev Hi there,
I have noticed that you are testing the segmentation of text lines, and you seem to have chosen the classes:
(x-height) + (textline boundary) + (text region)
It would be interesting to hear about your results:
- How does it perform against connected components, for example when two lines are close to or touching each other?
- Can it generalize to unseen documents with different fonts and noise?
- Can it segment vertical and curved lines?
By the way, I have noticed that some labels in `gsub-lseg-test.tar` might need correction; an example is attached (sample.zip).
Could ocropus4 be a (part of) solution?
opengpt chat led me to this project while I was trying to interrogate it about projects for doing QC on data entered in a GUI. The original need/motivation can be found at con/noisseur#1 and in that repo's README. The overall idea is:
- continuously grab screenshots from some screen
- when a screen of interest is detected, perform OCR on the entered data, validate the data, and display the results on another screen in case of errors
I feel like ocropus has some ideas/tools which could be of use, but I decided to just ask, and possibly get pointers to other projects you might know in this domain.