ndl-lab / ndlocr_cli

Repository for the NDLOCR application (including source code)

License: Creative Commons Attribution 4.0 International

Python 95.33% Dockerfile 2.44% Shell 0.81% Batchfile 1.43%

ndlocr_cli's Issues

Docker build fails on Windows

I am trying to run the Docker build on Windows 11 Pro 64-bit.
dockerbuild.sh does not run, so I read its contents, downloaded the files it fetches, and placed them manually.
I then ran

docker build -t ocr-cli-py37 -f docker\Dockerfile .

which produced the following error:

 > [ 3/18] RUN rm /etc/apt/sources.list.d/nvidia-ml.list:
#6 0.630 rm: cannot remove '/etc/apt/sources.list.d/nvidia-ml.list': No such file or directory
------
executor failed running [/bin/sh -c rm /etc/apt/sources.list.d/nvidia-ml.list]: exit code: 1

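I understand the tool is tested on Ubuntu, but since this is a Docker build, the error may come from the part that depends on
nvcr.io/nvidia/cuda:11.1.1-cudnn8-devel-ubuntu18.04
I could not judge whether it is safe to delete this step from the Dockerfile, so I am raising it as an issue.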

Chinese support

Hi,

I am a volunteer in the Wikimedia movement, and I am looking for an OCR tool for old Chinese books.

https://commons.m.wikimedia.org/wiki/Commons:Library_back_up_project

Do you support Chinese? If so, which option enables it?

Thanks!

Error: model in submodule not found

After git clone,

cd ndlocr_cli
docker\dockerbuild.bat

did nothing, so following past issues I ran
docker build -t ocr-v2-cli-py37 -f docker\Dockerfile .
After that,
docker run --gpus all -d --rm --name ocr_cli_runner -v /home/user/tmpdir:/root/tmpdir/img -i ocr-v2-cli-py37:latest
started the container without problems (I changed /home/user/tmpdir as appropriate). I then entered the container and tried to run infer, but got the following error.

python main.py infer /root/tmpdir/img/test.png /root/tmpdir/img/output -p 3 -s f
start inference !
input_root : /root/tmpdir/img/test.png
output_root : /root/tmpdir/img/output
config_file : config.yml
[WARNING] Directory /root/tmpdir/img/output already exist.
[WARNING] Directory is changed to /root/tmpdir/img/output_20230829113828.
/usr/local/lib/python3.8/dist-packages/pytorch_lightning/utilities/parsing.py:261: UserWarning: Attribute 'net' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['net'])`.  rank_zero_warn(
Logger config is empty.
Traceback (most recent call last):
  File "main.py", line 142, in <module>
    main()
  File "main.py", line 138, in main
    cmd(obj={})
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "main.py", line 75, in infer
    inferrer = OcrInferrer(infer_cfg)
  File "/root/ocr_cli/cli/core/inference.py", line 60, in __init__
    self.proc_list = self._create_proc_list(cfg)
  File "/root/ocr_cli/cli/core/inference.py", line 508, in _create_proc_list
    proc_list.append(procs.LineAttributeProcess(cfg, 'ex3'))
  File "/root/ocr_cli/cli/procs/line_attribute.py", line 54, in __init__
    self._object_dict = create_object_dict(self._hydra_cfg, title_model_path, author_model_path)
  File "/root/ocr_cli/submodules/text_recognition_lightning/src/tasks/infer_rf_task.py", line 26, in create_object_dict
    trainer_title = joblib.load(pkl_path_title)
  File "/usr/local/lib/python3.8/dist-packages/joblib/numpy_pickle.py", line 650, in load
    with open(filename, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'submodules/text_recognition_lightning/models/rf_title/model.pkl'

git clone did appear to download the submodules, and the build reported no errors, so I cannot work out the cause. Looking inside submodules/text_recognition_lightning, the models folder itself does not seem to exist.

Do you know of a solution?

Environment:
Windows 10.0.19045
NVIDIA driver 516.94
Quadro RTX 5000
CUDA 11.7
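
The traceback above ends in a FileNotFoundError for a path relative to the working directory. A pre-flight check along the following lines could report such files before inference starts. This is only a sketch: `find_missing_files` is a hypothetical helper, not part of ndlocr_cli, and the single path in the list is the one named in the traceback; any other required files are unknown here.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_files(base_dir: str, relative_paths: Iterable[str]) -> List[str]:
    """Return the relative paths that do not exist as files under base_dir."""
    base = Path(base_dir)
    return [rel for rel in relative_paths if not (base / rel).is_file()]


if __name__ == "__main__":
    # Only this path appears in the traceback; extend the list as needed.
    expected = ["submodules/text_recognition_lightning/models/rf_title/model.pkl"]
    # "." assumes the script is run from the directory that contains main.py.
    for rel in find_missing_files(".", expected):
        print(f"missing: {rel}")
```

If the file is reported missing inside the container, the model files were not present when the image was built, which matches the observation that the models folder does not exist.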

The Colab notebook is broken

When I run OCR in Colab, I get the error IndexError: list index out of range. The same error occurs even with the PDF URL that ships in the notebook, so I suspect the problem is on the model side.

Question about configuring input images, plus a request to expand the README

Hello. I am new to Docker and have a basic question about usage.

Following the README, I got as far as
docker exec -i -t --user root ocr_cli_runner bash
but I am stuck on what to do next with my local image.

My understanding is that directories such as input_root have to be created separately inside the container, for example with
mkdir input_root
However, when I try to copy my local image into the directory created that way with
docker cp C:\Users\username\Desktop\ocrtest.png ocr_cli_runner:/ocr_cli/input_root\ocrtest.png
I get
Error: No such container:path: ocr_cli_runner:~\ocr_cli\input_root

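My questions are how to resolve this error, and whether this is the workflow you intend in the first place. In addition, from a beginner's point of view, it took me a while to reach this understanding, so I think it would be kinder to spell out this step, which the README currently omits.
More fundamentally, I think it would be easier to follow if some folder were mounted by default, so that users could copy images into it on the host side with a GUI. Most general users are not used to transferring files between host and container from a terminal. It would be clearer still if a sample image were already placed in input_root.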

I wanted to offer these two points as feedback. Sorry for the trouble, and thank you in advance for your reply.
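
The destination in the docker cp command above mixes `/` and `\`, and the error message shows a backslash-separated container path. Paths inside a Linux container use forward slashes, so one way to avoid the mix-up is to build the destination with POSIX path rules. The following is a minimal sketch: `docker_cp_args` is a hypothetical helper, and `/root/ocr_cli/input_root` is an assumed directory that must already exist in the container.

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import List


def docker_cp_args(host_file: str, container: str, container_dir: str) -> List[str]:
    """Build a `docker cp` argument list with a POSIX-style destination path."""
    # Take only the file name from the Windows-style host path.
    file_name = PureWindowsPath(host_file).name
    # Container paths are Linux paths, so join with forward slashes.
    destination = PurePosixPath(container_dir) / file_name
    return ["docker", "cp", host_file, f"{container}:{destination}"]


if __name__ == "__main__":
    args = docker_cp_args(
        r"C:\Users\username\Desktop\ocrtest.png",
        "ocr_cli_runner",
        "/root/ocr_cli/input_root",  # assumed location; adjust to the real directory
    )
    print(" ".join(args))
```

The resulting list can be passed to `subprocess.run` or joined into a command line. Mounting a host folder with `-v` when starting the container, as the issue suggests, avoids the copy step entirely.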

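OCR results come out with lines in the wrong order

The OCR results come out with the lines in the wrong order.

Looking at the data stored in the txt file, each line is converted to text correctly, but the order of the lines is scrambled.
Do you have any idea what might cause this?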

Index out of range error

Hello,
A few months ago I was running OCR in Colab without any problems, but when I tried again recently I got the error message: IndexError Traceback (most recent call last) IndexError: list index out of range.

Thank you in advance.

XML output of page_deskew

The README lists the following:

│   │   ├── 1_page_deskew
│   │   │   ├── pred_img
│   │   │   └── xml

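However, when I run python main.py infer input_data_dir output_dir -d, the dump directory contains TSV or XML files describing the processing for 0_page_sep and for 2_layer_ext onward, but for 1_page_deskew only the result image appears to be output.

Having something like the deskew rotation angle (?) recorded would make it easier to map positions back to the original image. Could you consider adding this?

PS
I tried both ver. 1 and ver. 2.1, and both behave the same way.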
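
To illustrate why a recorded angle would help with position mapping: if the deskew step rotated the page by a known angle, a coordinate on the deskewed image could be mapped back with the inverse rotation. The sketch below rests on assumptions that the issue does not confirm: the rotation is about the image center, the canvas size is unchanged, the angle is in degrees, and the sign convention matches the one used here. Both function names are hypothetical.

```python
import math
from typing import Tuple


def rotate_point(x: float, y: float, cx: float, cy: float, angle_deg: float) -> Tuple[float, float]:
    """Rotate (x, y) about (cx, cy) by angle_deg using the standard 2D rotation matrix."""
    theta = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    rx = cx + dx * math.cos(theta) - dy * math.sin(theta)
    ry = cy + dx * math.sin(theta) + dy * math.cos(theta)
    return rx, ry


def deskewed_to_original(x: float, y: float, width: float, height: float,
                         deskew_angle_deg: float) -> Tuple[float, float]:
    """Undo an assumed deskew rotation by applying the opposite angle about the image center."""
    return rotate_point(x, y, width / 2.0, height / 2.0, -deskew_angle_deg)
```

Without the angle in the dump output, this inverse mapping cannot be computed from the result image alone, which is the gap the issue describes.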

Ubuntu 20.04: No matching distribution found for numpy==1.22.4

Building the Docker image on an Ubuntu 20.04 machine fails with the error below.
The build does succeed if requirements.txt is changed to numpy==1.21.6.

 => ERROR [ 8/12] RUN set -x     && pip install -r /root/ocr_cli/requirements.txt                                                                                             1.9s
------
 > [ 8/12] RUN set -x     && pip install -r /root/ocr_cli/requirements.txt:
#0 0.401 + pip install -r /root/ocr_cli/requirements.txt
#0 0.879 Collecting click
#0 0.934   Downloading click-8.1.3-py3-none-any.whl (96 kB)
#0 0.960      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.6/96.6 kB 4.8 MB/s eta 0:00:00
#0 1.067 Collecting lmdb==1.2.1
#0 1.076   Downloading lmdb-1.2.1-cp37-cp37m-manylinux2010_x86_64.whl (299 kB)
#0 1.110      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 299.4/299.4 kB 10.4 MB/s eta 0:00:00
#0 1.190 Collecting natsort==7.1.1
#0 1.199   Downloading natsort-7.1.1-py3-none-any.whl (35 kB)
#0 1.291 Collecting nltk==3.6.6
#0 1.303   Downloading nltk-3.6.6-py3-none-any.whl (1.5 MB)
#0 1.399      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 16.4 MB/s eta 0:00:00
#0 1.758 ERROR: Ignored the following versions that require a different python version: 1.22.0 Requires-Python >=3.8; 1.22.1 Requires-Python >=3.8; 1.22.2 Requires-Python >=3.8; 1.22.3 Requires-Python >=3.8; 1.22.4 Requires-Python >=3.8; 1.23.0 Requires-Python >=3.8; 1.23.0rc1 Requires-Python >=3.8; 1.23.0rc2 Requires-Python >=3.8; 1.23.0rc3 Requires-Python >=3.8; 1.23.1 Requires-Python >=3.8; 1.23.2 Requires-Python >=3.8; 1.23.3 Requires-Python >=3.8; 1.23.4 Requires-Python >=3.8; 1.23.5 Requires-Python >=3.8; 1.24.0 Requires-Python >=3.8; 1.24.0rc1 Requires-Python >=3.8; 1.24.0rc2 Requires-Python >=3.8; 1.24.1 Requires-Python >=3.8; 1.24.2 Requires-Python >=3.8
#0 1.758 ERROR: Could not find a version that satisfies the requirement numpy==1.22.4 (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.3, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5, 1.20.0, 1.20.1, 1.20.2, 1.20.3, 1.21.0, 1.21.1, 1.21.2, 1.21.3, 1.21.4, 1.21.5, 1.21.6)
#0 1.759 ERROR: No matching distribution found for numpy==1.22.4
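
The pip log states that numpy 1.22.0 and later require Python >=3.8, and the `cp37` wheel downloaded for lmdb suggests the image runs Python 3.7. A requirements generator could pick the pin from the interpreter version, as in this sketch. `numpy_requirement` is a hypothetical helper, and the two version strings are simply the ones mentioned in this issue.

```python
import sys
from typing import Tuple


def numpy_requirement(python_version: Tuple[int, int]) -> str:
    """Pick a numpy pin that pip can resolve for the given (major, minor) Python version."""
    # Per the pip log, numpy 1.22.0 and later declare Requires-Python >=3.8.
    if python_version >= (3, 8):
        return "numpy==1.22.4"
    # 1.21.6 is the newest version pip offered in the log, and the reporter's build succeeded with it.
    return "numpy==1.21.6"


if __name__ == "__main__":
    print(numpy_requirement(sys.version_info[:2]))
```

An equivalent approach that needs no code is a pair of requirement lines with environment markers on `python_version`, which pip evaluates at install time.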

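Is it possible to specify vertical text?

Hello.
I am trying to use this code to extract the contents of a document in which all text is written vertically.
Is it possible to pass the information that the text is vertical to the layout extraction?

When I ran the code on my document, layout detection produced a mix of horizontal and vertical text regions.

Finally, thank you very much for undertaking this project.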

mismatch num of predicted result and xml line

RuntimeError: CUDA out of memory ... 306.00 MiB reserved in total by PyTorch

Whenever I try to run inference with this Docker image, it always stops with an error like the following.

RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 47.54 GiB total capacity; 304.36 MiB already allocated; 5.62 MiB free; 306.00 MiB reserved in total by PyTorch)

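The amount of memory PyTorch reserves seems far too small, but searching for similar symptoms turns up few comparable cases. Do you know anything about this? The same thing happened on several Ubuntu 20.04 machines with different configurations.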
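
The figures in the error message can be checked arithmetically: total capacity minus free memory minus the memory reserved by PyTorch gives the amount that is in use but not held by this process. The sketch below parses the message format quoted in this issue (other PyTorch versions word the message differently, so the regular expression is an assumption tied to this exact text). The helper names are hypothetical.

```python
import re
from typing import Dict

_UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
_FIELD = re.compile(
    r"(\d+(?:\.\d+)?) (KiB|MiB|GiB) (total capacity|already allocated|free|reserved)"
)


def parse_oom_message(message: str) -> Dict[str, float]:
    """Extract the memory figures (in MiB) from a PyTorch CUDA OOM message."""
    return {
        label: float(value) * _UNIT_TO_MIB[unit]
        for value, unit, label in _FIELD.findall(message)
    }


def memory_held_elsewhere(fields: Dict[str, float]) -> float:
    """MiB that is neither free nor reserved by this PyTorch process."""
    return fields["total capacity"] - fields["free"] - fields["reserved"]


if __name__ == "__main__":
    msg = (
        "RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB "
        "(GPU 0; 47.54 GiB total capacity; 304.36 MiB already allocated; "
        "5.62 MiB free; 306.00 MiB reserved in total by PyTorch)"
    )
    held = memory_held_elsewhere(parse_oom_message(msg))
    print(f"{held / 1024:.1f} GiB is in use outside this PyTorch process")
```

For the message in this issue the result is roughly 47.2 GiB, so one possible reading is that nearly all of the GPU memory was already occupied by something other than this process when the allocation failed. The message alone does not say what held it.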