tmbdev / clstm

A small C++ implementation of LSTM networks, focused on OCR.

License: Apache License 2.0

Python 0.99% C++ 27.36% Shell 0.35% C 0.09% Makefile 0.04% Jupyter Notebook 70.92% Cuda 0.07% Dockerfile 0.19%

clstm's Issues

clstm_wrap.cpp:6375:15: error: no member named 'resize' in 'ocropus::Batch'
(arg1)->resize(arg2,arg3);
~~~~~~ ^
clstm_wrap.cpp:6426:15: error: no member named 'setZero' in 'ocropus::Batch'
(arg1)->setZero(arg2,arg3);

Error building the Python extension

Hi,
Building the python extension (python setup.py build) failed :

clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11749:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
if (arg1) (arg1)->d_inputs = *arg2;
^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11783:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
result = (Sequence *)& ((arg1)->d_inputs);
^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11825:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
if (arg1) (arg1)->d_outputs = *arg2;
^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11859:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
result = (Sequence *)& ((arg1)->d_outputs);
^
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11749:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
if (arg1) (arg1)->d_inputs = *arg2;
^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11783:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
result = (Sequence *)& ((arg1)->d_inputs);
^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11825:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
if (arg1) (arg1)->d_outputs = *arg2;
^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11859:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
result = (Sequence *)& ((arg1)->d_outputs);

How to get top-N results?

Hi @tmbdev , thanks for your reply in #24, it is perfect!

I have another question and have commented in #24.
Here I copy the question for easier retrieval for other people.

Can the predict interface provide top-N results?
That would help to get better results and to find weaknesses in datasets.

Thanks.
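
Whatever the predict interface ends up exposing, per-frame alternatives can be read off the output posteriors once they are available as rows of class probabilities. The following is only a sketch under that assumption: `top_n_per_frame` is a hypothetical helper, not part of clstm, and the input values are made up.

```python
import heapq

def top_n_per_frame(posteriors, n=3):
    """Return, for every frame, the n most probable (class_index, probability) pairs."""
    result = []
    for row in posteriors:
        # enumerate() pairs each probability with its class index
        best = heapq.nlargest(n, enumerate(row), key=lambda pair: pair[1])
        result.append(best)
    return result

# Toy posteriors: 2 frames, 4 classes (hypothetical values).
frames = [
    [0.10, 0.60, 0.25, 0.05],
    [0.70, 0.05, 0.05, 0.20],
]
top2 = top_n_per_frame(frames, n=2)
```

Note that this yields the best classes per frame, not the N best complete transcriptions; ranking whole strings would need a search over the CTC output (for example a beam search), which this sketch does not attempt.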

clstm.i and clstm.h are not in sync

clstm_wrap.cpp:11671:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
   if (arg1) (arg1)->d_inputs = *arg2;
                     ^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11705:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
   result = (Sequence *)& ((arg1)->d_inputs);
                                   ^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11823:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
   if (arg1) (arg1)->d_outputs = *arg2;
                     ^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11857:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
   result = (Sequence *)& ((arg1)->d_outputs);
                                   ^
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11671:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
   if (arg1) (arg1)->d_inputs = *arg2;
                     ^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_inputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11705:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_inputs’
   result = (Sequence *)& ((arg1)->d_inputs);
                                   ^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_set(PyObject*, PyObject*)’:
clstm_wrap.cpp:11823:21: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
   if (arg1) (arg1)->d_outputs = *arg2;
                     ^
clstm_wrap.cpp: In function ‘PyObject* _wrap_INetwork_d_outputs_get(PyObject*, PyObject*)’:
clstm_wrap.cpp:11857:35: error: ‘struct ocropus::INetwork’ has no member named ‘d_outputs’
   result = (Sequence *)& ((arg1)->d_outputs);
                                   ^

Illegal instruction

Hello, I downloaded your code and it's amazing! I know little about it, so I have some questions.
When I run ./test-lstm, it gives an error: test-lstm.cc:80:3: error: use of undeclared identifier 'unlink'; did you mean 'inline'? After googling, I added the header #include <unistd.h>, which solved this problem.
Then I ran ./test-filter.sh, and it gave another error: ./test-filter.sh: line 7: 26632 Illegal instruction: 4 hidden=20 ntrain=1001 neps=0 report_every=200 save_every=1000 lrate=1e-2 save_name=_filter ./clstmfiltertrain _filter.txt
clstmfilter FAILED
What can I do about it?

Undefined symbol error when "import clstm" in python

I compiled clstm and ran "setup.py build". When I then did

python
import clstm

I got "_clstm.so: undefined symbol: _ZN7ocropus14default_deviceE" The platform is Ubuntu 14.04. I am using Anaconda python.

Given the name, I suspect the symbol is related to ocropy, so I tried installing the ocropy Python package, but I still get the same error. Would anyone mind giving me some hints? Thanks in advance for your time, and sorry if this is a trivial issue.

Arthur
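
The symbol in the message is a mangled C++ name, and decoding it shows where it comes from. The sketch below handles only the simplest Itanium-ABI form (a nested name made of length-prefixed parts); `demangle_nested_name` is a hypothetical helper written for illustration, and the standard `c++filt` tool does the same job for arbitrary symbols.

```python
def demangle_nested_name(symbol):
    """Demangle a simple Itanium-ABI nested name such as _ZN3foo3barE.

    Only the plain `_ZN <len><name>... E` form is handled; templates,
    function parameters and substitutions are not supported.
    """
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a nested Itanium-ABI name: %r" % symbol)
    pos = len(prefix)
    parts = []
    while pos < len(symbol) and symbol[pos] != "E":
        start = pos
        # read the decimal length prefix of the next name component
        while pos < len(symbol) and symbol[pos].isdigit():
            pos += 1
        if pos == start:
            raise ValueError("unsupported mangling at offset %d" % pos)
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    return "::".join(parts)

name = demangle_nested_name("_ZN7ocropus14default_deviceE")
```

The result is `ocropus::default_device`, where `ocropus` is the C++ namespace that also appears in the compiler errors on this page (`ocropus::INetwork`, `ocropus::Batch`). That suggests the missing symbol belongs to clstm's own C++ code rather than to the ocropy Python package, although the exact cause of the link failure is not established here.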

trained models like ocropy's en-default.pyrnn?

Could you be kind enough to provide trained models like you do for ocropy/ocropus?

An OCR example for 2D LSTM

It would be very nice to have code or a high-level description of how to use 2DLSTMs in CLSTM for OCR tasks.
Going through the image filter example in test-2d.cc, which uses 1x1 patches (if I understand correctly), it is not obvious how to use 2DLSTMs in, e.g., clstmocrtrain.cc.
Directly replacing BLSTM with 2DLSTM compiles and runs, but it does not converge and uses 48x1 (i.e. 1D) patches.
What one would like to try instead is, e.g., 1x2 or 2x4 patches, as in here or here

Also, I would like to ask what the limitations of the CLSTM implementation are compared to RNNLIB's when handling 2DLSTMs.

Thanks,
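
To make the patch idea concrete, here is a sketch of tiling an image into small non-overlapping blocks, each of which would become one input vector on a 2D grid. It assumes patch sizes are given as rows x columns and uses a hypothetical helper `image_to_patches`; it says nothing about how CLSTM's 2D layers expect their input.

```python
def image_to_patches(image, patch_h, patch_w):
    """Tile an image (list of pixel rows) into non-overlapping patch_h x patch_w
    blocks and return a 2D grid whose cells are the flattened blocks."""
    height, width = len(image), len(image[0])
    if height % patch_h or width % patch_w:
        raise ValueError("image size must be a multiple of the patch size")
    grid = []
    for top in range(0, height, patch_h):
        row = []
        for left in range(0, width, patch_w):
            # flatten one block row by row
            block = [image[top + dy][left + dx]
                     for dy in range(patch_h) for dx in range(patch_w)]
            row.append(block)
        grid.append(row)
    return grid

# Toy 2x4 image split into 1x2 patches (hypothetical pixel values).
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
grid = image_to_patches(image, patch_h=1, patch_w=2)
```

With a 48-row line image, `patch_h=48, patch_w=1` reproduces the 1D case described above (one full pixel column per step), while smaller patches keep a vertical axis for a 2D network to scan.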

test_set_error

At lines 171-173 of clstmocrtrain.cc, errors and count seem to be swapped.

double errors = tse.first;
double count = tse.second;

Or, test_set_error should return an inverted pair.

loading of previously trained models

Hi,
I have an issue with the master branch, where loading trained models into clstmocrtrain is implemented but not working (for me). I have audited the code but cannot find what raises the following error when a load and/or start parameter is given.
got 950 files, 50 tests
.Stacked: 0.0001 0.9 in 0 48 out 0 74
FATAL: missing parameter

Best wishes

What Gradient Descent Method clstm is using?

Which gradient descent method does clstm use? SGD? AdaGrad? NAG? RMSProp? Adam?
I want to speed up learning.
If clstm does not use an adaptive learning rate algorithm, I would also like to know whether this method can be called to change the learning rate dynamically, so that an adaptive learning rate algorithm can be implemented on top of it:

net.setLearningRate(1e-4,0.9)
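
For reference, the two arguments in the call above look like a learning rate and a momentum term (the per-layer log lines elsewhere on this page print pairs such as `0.01 0.9`), though that reading is an assumption. The sketch below shows plain SGD with momentum together with a simple step schedule that changes the learning rate during training; it is a generic illustration, not clstm's actual update rule, and `step_decay` is a hypothetical schedule.

```python
def sgd_momentum_step(weights, grads, velocity, lrate, momentum):
    """One SGD-with-momentum update; returns the new (weights, velocity)."""
    new_velocity = [momentum * v - lrate * g for v, g in zip(velocity, grads)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

def step_decay(initial_lrate, step, decay=0.5, every=1000):
    """Hypothetical schedule: multiply the learning rate by `decay` every `every` steps."""
    return initial_lrate * (decay ** (step // every))

# Minimise f(w) = w^2 (gradient 2w) while the learning rate decays.
weights, velocity = [1.0], [0.0]
for step in range(3000):
    lrate = step_decay(1e-2, step)
    grads = [2.0 * w for w in weights]
    weights, velocity = sgd_momentum_step(weights, grads, velocity, lrate, 0.9)
```

In a training loop built on the Python bindings, the scheduled value would be passed to `net.setLearningRate(lrate, 0.9)` before each update, assuming the method may be called repeatedly between updates.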

What's the meaning of the argument skip?

Hi @tmbdev,

I am trying to figure out how the CTC works in clstm and got stuck.

I'm confused when looking into the following code in ctc.cc.

void forward_algorithm(Mat &lr, Mat &lmatch, double skip) {
    int n = ROWS(lmatch), m = COLS(lmatch);
    lr.resize(n, m);
    Vec v(m), w(m);
    for (int j = 0; j < m; j++) v(j) = skip * j;
    for (int i = 0; i < n; i++) {
        w.segment(1, m-1) = v.segment(0, m-1);
        w(0) = skip * i;
        for (int j = 0; j < m; j++) {
            Float same = log_mul(v(j), lmatch(i, j));
            Float next = log_mul(w(j), lmatch(i, j));
            v(j) = log_add(same, next);
        }
        lr.row(i) = v;
    }
}

I'm wondering what the argument skip means, since it is neither in Graves's original thesis nor in the source code of rnnlib.

Thanks!
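
A line-by-line Python port of the quoted loop makes the role of `skip` easier to trace. Reading only the code above, `skip` seems to act as a log-domain penalty at the boundaries: the initial score of state j is `skip * j`, and entering state 0 at frame i costs `skip * i`, so starting the alignment late (in states or in frames) is penalised rather than forbidden. That interpretation is an inference from the snippet, not a statement from the author. The `log_add` here is a standard stable log-sum-exp and may differ in detail from the one in ctc.cc.

```python
import math

NEG_INF = float("-inf")

def log_add(x, y):
    """log(exp(x) + exp(y)), computed stably."""
    if x == NEG_INF:
        return y
    if y == NEG_INF:
        return x
    hi, lo = max(x, y), min(x, y)
    return hi + math.log1p(math.exp(lo - hi))

def log_mul(x, y):
    """Multiplying probabilities means adding their logs."""
    return x + y

def forward_algorithm(lmatch, skip):
    """Port of the quoted loop; lmatch[i][j] is the log match score of frame i and state j."""
    n, m = len(lmatch), len(lmatch[0])
    v = [skip * j for j in range(m)]        # cost of starting in state j
    lr = []
    for i in range(n):
        w = [skip * i] + v[:m - 1]          # v shifted right by one state
        for j in range(m):
            same = log_mul(v[j], lmatch[i][j])   # stay in state j
            nxt = log_mul(w[j], lmatch[i][j])    # advance from state j-1
            v[j] = log_add(same, nxt)
        lr.append(list(v))
    return lr

# Toy input: log(1) everywhere, 2 frames x 2 states.
lmatch = [[0.0, 0.0], [0.0, 0.0]]
lr = forward_algorithm(lmatch, skip=-5.0)
```

With a very negative `skip` the boundary terms vanish and the recursion reduces to the usual stay-or-advance forward pass.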

clstm.pb.h

is being requested by the compiler but isn't in the source tree ?

test file for ctc.

It would be good to have one. The generated files seem to be only for seq.

errors in test-ocr

Below is the log of the run. I added some printf calls to the program
and found that the error happens at line 60 of clstm_compute.h,
where the code calls the sum function of a Tensor1 variable.

The version of Eigen I use is Eigen3.3-beta1.

I need some help here, thank you.

export seed=0.7733

scons -s -c

rm -f test-lstm.o *.a

scons -j 4 debug=1 clstmocrtrain clstmfiltertrain clstmfilter clstmocr test-lstm
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
g++ --std=c++11 -Wno-unused-result -o clstmocrtrain.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstmocrtrain.cc
g++ --std=c++11 -Wno-unused-result -o clstm.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstm.cc
g++ --std=c++11 -Wno-unused-result -o ctc.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 ctc.cc
protoc(["clstm", "clstm.pb.cc", "clstm.pb.h"], ["clstm.proto"])
g++ --std=c++11 -Wno-unused-result -o clstm_proto.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstm_proto.cc
g++ --std=c++11 -Wno-unused-result -o clstm_prefab.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstm_prefab.cc
g++ --std=c++11 -Wno-unused-result -o extras.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 extras.cc
g++ --std=c++11 -Wno-unused-result -o clstm.pb.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstm.pb.cc
g++ --std=c++11 -Wno-unused-result -o clstm_compute.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstm_compute.cc
g++ --std=c++11 -Wno-unused-result -o clstmfiltertrain.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstmfiltertrain.cc
g++ --std=c++11 -Wno-unused-result -o clstmfilter.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstmfilter.cc
g++ --std=c++11 -Wno-unused-result -o clstmocr.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 clstmocr.cc
g++ --std=c++11 -Wno-unused-result -o test-lstm.o -c -g -g -DCATCH=catch -DNODISPLAY=1 -DTHROW=throw -DTRY=try -I/usr/local/include/eigen3 test-lstm.cc
ar rc libclstm.a clstm.o ctc.o clstm_proto.o clstm_prefab.o extras.o clstm.pb.o clstm_compute.o
ranlib libclstm.a
g++ --std=c++11 -Wno-unused-result -o test-lstm -g test-lstm.o libclstm.a -lpng -lprotobuf
g++ --std=c++11 -Wno-unused-result -o clstmocr -g clstmocr.o libclstm.a -lpng -lprotobuf
g++ --std=c++11 -Wno-unused-result -o clstmfilter -g clstmfilter.o libclstm.a -lpng -lprotobuf
g++ --std=c++11 -Wno-unused-result -o clstmfiltertrain -g clstmfiltertrain.o libclstm.a -lpng -lprotobuf
g++ --std=c++11 -Wno-unused-result -o clstmocrtrain -g clstmocrtrain.o libclstm.a -lpng -lprotobuf
scons: done building targets.

./test-ocr.sh
got 1 files, 0 tests
got 15 classes
.Stacked: 0.01 0.9 in 0 48 out 0 15
.Stacked.Parallel: 0.01 0.9 in 0 48 out 0 200
.Stacked.Parallel.NPLSTM: 0.01 0.9 in 0 48 out 0 100
.Stacked.Parallel.Reversed: 0.01 0.9 in 0 48 out 0 100
.Stacked.Parallel.Reversed.NPLSTM: 0.01 0.9 in 0 48 out 0 100
.Stacked.SoftmaxLayer: 0.01 0.9 in 0 200 out 0 15
loading done!
for loop!
trial = 0
reading sample
training
forward
measure
normalize
set inputs
net forward
encode
mktargets
ctc_align_targets
compute log probability
for loop
Tensor1 out(nc) nc = 15
fmax
fmax end
Dims = 1, dim 0: 15
out data
0.0662834, 0.0671138, 0.0670167, 0.0671662, 0.0665378, 0.0658698, 0.0676015, 0.0660524, 0.0659969, 0.0665682, 0.0666416, 0.0675991, 0.0667013, 0.0661178, 0.0667335,
m.sumI()
clstmocrtrain: /usr/local/include/eigen3/unsupported/Eigen/CXX11/src/Tensor/TensorStorage.h:108: void Eigen::TensorStorage<T, Eigen::DSizes<IndexType, NumIndices_>, Options_>::resize(Eigen::TensorStorage<T, Eigen::DSizes<IndexType, NumIndices_>, Options_>::Index, Eigen::array<IndexType, NumIndices_>&) [with T = float; int Options_ = 0; IndexType = long int; int NumIndices_ = 1; Eigen::TensorStorage<T, Eigen::DSizes<IndexType, NumIndices_>, Options_>::Index = long int; Eigen::array<IndexType, NumIndices_> = std::array<long int, 1ul>]: Assertion `size >= 1' failed.
./test-ocr.sh: line 6: 4659 Aborted (core dumped) ntrain=201 hidden=50 lrate=1e-2 save_name=_ocrtest ./clstmocrtrain _ocrtest.txt
clstmocrtrain FAILED

echo TEST FAILED
TEST FAILED

extras.h

is in the includes, scons doesn't find it.

How to use CNN feature sequence as training data?

@tmbdev I have a question: is clstm suitable for scene text recognition? I extract a feature sequence from raw images (using a CNN). Now I want to use the CNN feature sequence to train the BLSTM+CTC RNN, but I don't know how to create the .h5 file for training from the CNN feature sequence. Can you provide an example?

[Feature request] Add option to clstmtrainocr to load previous trained model.

Question about scons compile

I ran into a problem when trying to compile this code. I ran scons on the command line, but it shows this error.

scons: Reading SConscript files ...
version 1a61d3524c6bc173fcbbbb0e4c340a7d01619645
scons: done reading SConscript files.
scons: Building targets ...
g++ --std=c++11 -Wno-unused-result -o clstm.o -c -g -O3 -finline -g -DHGVERSION=\"1a61d3524c6bc173fcbbbb0e4c340a7d01619645\" -DNODISPLAY=1 -I/usr/local/include/eigen3 clstm.cc
In file included from clstm.cc:1:0:
clstm.h:389:42: error: 'std::map<std::basic_string<char>, ocropus::String>::map' names constructor
scons: *** [clstm.o] Error 1
scons: building terminated because of errors.

But when I comment out line 389 of clstm.h, i.e. the line using std::map<std::string, String>::map;, it shows another problem.

scons: Reading SConscript files ...
version 1a61d3524c6bc173fcbbbb0e4c340a7d01619645
scons: done reading SConscript files.
scons: Building targets ...
g++ --std=c++11 -Wno-unused-result -o clstm_prefab.o -c -g -O3 -finline -g -DHGVERSION=\"1a61d3524c6bc173fcbbbb0e4c340a7d01619645\" -DNODISPLAY=1 -I/usr/local/include/eigen3 clstm_prefab.cc
clstm_prefab.cc: In function 'ocropus::Network ocropus::make_lstm1(const ocropus::Assoc&)':
clstm_prefab.cc:30:19: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc: In function 'ocropus::Network ocropus::make_revlstm1(const ocropus::Assoc&)':
clstm_prefab.cc:44:29: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:46:19: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc: In function 'ocropus::Network ocropus::make_bidi(const ocropus::Assoc&)':
clstm_prefab.cc:62:39: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:63:29: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:65:19: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc: In function 'ocropus::Network ocropus::make_bidi2(const ocropus::Assoc&)':
clstm_prefab.cc:82:39: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:83:29: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:88:39: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:89:29: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
clstm_prefab.cc:91:19: error: invalid initialization of reference of type 'const ocropus::Assoc&' from expression of type '<brace-enclosed initializer list>'
In file included from clstm_prefab.cc:1:0:
clstm.h:398:9: error: in passing argument 4 of 'ocropus::Network ocropus::layer(const string&, int, int, const ocropus::Assoc&, const Networks&)'
scons: *** [clstm_prefab.o] Error 1
scons: building terminated because of errors.

Can anyone help me? Thanks

Code sample for original dataset using clstm python and using opencv2

I struggled a lot to get clstm's Python bindings working on an original dataset with opencv2. I hope this code helps other people trying to use clstm from Python.

import clstm
import numpy as np
import os
from scipy.ndimage import filters
import cv2

def mktarget(transcript,noutput):
    N = len(transcript)
    target = np.zeros((2*N+1,noutput),'f')
    #assert 0 not in transcript
    target[0,0] = 1
    for i,c in enumerate(transcript):
        target[2*i+1,c] = 1
        target[2*i+2,0] = 1
    return target

def decode(pred, codec, threshold = .5):
    #smooth the class-0 (blank) posterior and pick its local minima below threshold
    eps = filters.gaussian_filter(pred[:,0,0],2,mode='nearest')
    loc = (np.roll(eps,-1)>eps) & (np.roll(eps,1)>eps) & (eps<threshold)
    classes = np.argmax(pred,axis=1)[:,0]
    codes = classes[loc]
    #codec holds the characters themselves, so index it directly
    chars = [codec[c] for c in codes]
    return "".join(chars)

if __name__ == "__main__":
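    f = open("words.txt","r")
    lines  = f.read().split("\n")
    f.close()
    #drop the empty string left after the trailing newline
    lines.pop()
    #one output class per distinct character in the word list
    codec = list(set("".join(lines)))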
    ninput = 100
    noutput = len(codec)
    print "noutput : ", noutput
    #define network and learning rate
    net = clstm.make_net_init("bidi","ninput=%d:nhidden=200:noutput=%d"%(ninput,noutput))
    net.setLearningRate(1e-4,0.9)
    iteration = 200000

    #input files data
    img_files = filter(lambda n: n.find(".bin.txt") == -1, os.listdir("dataset/"))
    img_name  = [img_files[i].replace(".png","") for i in range(len(img_files))]

    transcripts = []

    #load transcripts
    for i in range(len(img_name)):
        print "loading file", float(i)/float(len(img_name)) * 100, "percent complete"
        f = open("dataset/"+img_name[i]+".bin.txt","r")
        transcript_text = f.read()
        transcripts.append([codec.index(transcript_text[j]) for j in range(len(transcript_text))])
        f.close()

    #learning
    for i in range(iteration):
        print float(i)/float(iteration) * 100, "% complete"
        index = int(np.random.rand()*len(img_name))
        #set input
        img = cv2.imread("dataset/" + img_name[index]+".png",0)
        #same layout as the input in the tutorial: (width, ninput, 1) float32;
        #assumes the image height equals ninput
        xs = np.ascontiguousarray(img.T, dtype=np.float32).reshape(img.shape[1], ninput, 1)
        net.inputs.aset(xs)
        #forward propagation
        net.forward()
        #prediction
        pred = net.outputs.array()
        target = mktarget(transcripts[index],noutput)
        seq = clstm.Sequence()
        seq.aset(target.reshape(-1,noutput,1))
        #align ctc
        aligned = clstm.Sequence()
        clstm.seq_ctc_align(aligned,net.outputs,seq)
        aligned = aligned.array()
        #delta val
        deltas = aligned - net.outputs.array()
        #input delta of aligned ctc and output of network
        net.d_outputs.aset(deltas)
        #backward propagation
        net.backward()
        #update network
        net.update()

    #save network
    clstm.save_net("sample.clstm",net)

wstring to u32string

The current wstring is not entirely portable, since wchar_t is only 16 bits on Windows. Occurrences of wchar_t should be replaced with char32_t and occurrences of wstring with u32string
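
The portability problem can be seen by counting code units for a character outside the Basic Multilingual Plane. The sketch below uses Python only for illustration (the change requested above concerns the C++ sources); `code_units` is a hypothetical helper.

```python
def code_units(text, encoding, unit_size):
    """Number of fixed-size code units needed to store text in the given encoding."""
    return len(text.encode(encoding)) // unit_size

clef = "\U0001D11E"   # MUSICAL SYMBOL G CLEF, a code point above U+FFFF

utf16_units = code_units(clef, "utf-16-le", 2)   # what a 16-bit wchar_t string stores
utf32_units = code_units(clef, "utf-32-le", 4)   # what a char32_t string stores
```

A single character occupies two 16-bit units (a surrogate pair) but exactly one 32-bit unit, so code that treats one `wchar_t` as one character breaks on Windows for such input, while `char32_t` keeps the one-unit-per-code-point property.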

install error?

ubgpu@ubgpu:/github/clstm$ sudo scons install
scons: Reading SConscript files ...
version daa8f08
scons: done reading SConscript files.
scons: Building targets ...
g++ --std=c++11 -Wno-unused-result -o clstm_proto.o -c -g -O3 -finline -g -DHGVERSION="daa8f08e060d6d2bc415becc32eba36c0b4de524" -DNODISPLAY=1 -DTHROW=throw -I/usr/include/eigen3 clstm_proto.cc
clstm_proto.cc:18:22: fatal error: clstm.pb.h: No such file or directory
#include "clstm.pb.h"
^
compilation terminated.
scons: *** [clstm_proto.o] Error 1
scons: building terminated because of errors.
ubgpu@ubgpu:/github/clstm$

Providing manual alignment

Is there any way to give a gold-standard alignment to CLSTMText?
I'm working on some research evaluating its efficacy on other tasks besides OCR, such as Twitter, where expanding acronyms is a big deal. Right now clstmtext is suffering mightily because it takes text lines directly instead of letting me give it the direct alignments (e.g. two sequences, ["lol", "wut"] --> ["laughing out loud", "what"])

Anything I can do to help it on this front?

import error in clstm python module

I tried to import the clstm Python module, but the following error occurs:

ImportError                               Traceback (most recent call last)
<ipython-input-1-2e5fe7a7c3df> in <module>()
----> 1 import clstm

/home/kendemu/clstm/clstm.py in <module>()
     30                 fp.close()
     31             return _mod
---> 32     _clstm = swig_import_helper()
     33     del swig_import_helper
     34 else:

/home/kendemu/clstm/clstm.py in swig_import_helper()
     22             fp, pathname, description = imp.find_module('_clstm', [dirname(__file__)])
     23         except ImportError:
---> 24             import _clstm
     25             return _clstm
     26         if fp is not None:

ImportError: No module named _clstm

How can I solve this error? I ran scons and sudo scons install, and I saw in the README that the python setup.py build command is broken.

Task for offline handwriting

@tmbdev I have a question: is clstm suitable for offline English handwriting recognition? I am new to this area and have some questions for discussion.

Feature extraction. Deep learning models such as CNNs can extract feature maps from raw images. I don't know what feature extraction method clstm uses. Can I replace the feature input with features from deep learning? For example, by sliding a window over a handwritten text line from left to right to get a feature sequence.
The clstm examples seem to use only a single LSTM layer. Are there other examples that show more complex network structures, such as more layers?
What is your opinion on solving the offline English handwriting problem? Thanks.

It seems that the clstm network maps each character to a label. So is it better to normalize the handwritten words first, so that segmentation is easier and recognition improves?
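
The sliding-window idea mentioned above can be sketched as follows: the text-line image is cut into overlapping vertical strips from left to right, and each strip becomes one time step of the input sequence. This is only an illustration with a hypothetical helper `sliding_windows`; any feature extractor (for instance a CNN) would then map each strip to a feature vector before the sequence is fed to the LSTM.

```python
def sliding_windows(image, width, step):
    """Cut a text-line image (list of pixel rows) into a left-to-right
    sequence of frames; each frame is the flattened window, column by column."""
    height = len(image)
    ncols = len(image[0])
    frames = []
    for left in range(0, ncols - width + 1, step):
        frame = [image[r][c]
                 for c in range(left, left + width)
                 for r in range(height)]
        frames.append(frame)
    return frames

# Toy 2x5 "image" (hypothetical pixel values).
line = [
    [0, 1, 2, 3, 4],
    [5, 6, 7, 8, 9],
]
frames = sliding_windows(line, width=2, step=1)
```

With `width=1` and `step=1` each frame is a single pixel column, so the raw image itself serves as the feature sequence.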

How to use pretrained model in python script

I am using a pretrained clstm language model and loading it.
This is my code:

    def PredictWords(self,image):
        noutput = 3877
        if self.lang == "EN":
            pass
        elif self.lang == "JP":
            net = clstm.load_net("../lang/jp3877.clstm")

        net.inputs.aset(image)
        net.forward()
        prod = net.outputs.array()
        seq = clstm.Sequence()
        seq.aset(target.reshape(-1,noutput,1))
        aligned = clstm.Sequence()
        aligned = aligned.array()
        return clstm.Codec.decode(prod)

But this error occurs:
[libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "clstm.NetworkProto" because it is missing required fields: kind, ninput, noutput

I think I am loading the neural network model incorrectly, or that more specific parameters are needed to load it. How can I solve this problem?

deprecation of genericLSTM

I have seen that genericLSTM has been "softly" deprecated in the latest version. Could you please explain why this choice was made?

run-tests failed

hi,
running ./run-tests gave me this, how can I fix this?

test-lstm.cc:81:3: error: use of undeclared identifier 'unlink'
  unlink("__test0__.clstm");
  ^
test-lstm.cc:114:3: error: use of undeclared identifier 'unlink'
  unlink("__test__.clstm");
  ^
2 errors generated.
scons: *** [test-lstm.o] Error 1
scons: building terminated because of errors.

>>>>>>> echo TEST FAILED
TEST FAILED

Scons not building Protobuf

Running the run-tests script throws the following error:

/usr/include/google/protobuf/generated_message_util.h:84: undefined reference to `google::protobuf::internal::empty_string_'
libclstm.a(clstm.pb.o): In function `clstm::Array::MergeFrom(clstm::Array const&)':
/home/inno/Desktop/clstm/clstm.pb.cc:899: undefined reference to `google::protobuf::internal::ArenaStringPtr::AssignWithDefault(std::string const*, google::protobuf::internal::ArenaStringPtr)'
collect2: error: ld returned 1 exit status
scons: *** [test-lstm] Error 1

Is there a tutorial on how to use CLSTM instead of the python lstm in ocropy

I'm trying to use the C++ version, CLSTM, in ocropy. I tried building the Python parts using setup.py, but it throws errors:

~/OCROpus/clstm$ python2 setup.py build
PY_CORE_CFLAGS -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -I. -IInclude -I../Include -fPIC -DPy_BUILD_CORE
BLDSHARED x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security
LDCXXSHARED c++ -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions
LDFLAGS -Wl,-Bsymbolic-functions -Wl,-z,relro
TESTPYTHON LD_LIBRARY_PATH=/build/buildd/python2.7-2.7.6/build-shared: ./python -Wd -3 -E -tt
CONFIG_ARGS '--enable-shared' '--prefix=/usr' '--enable-ipv6' '--enable-unicode=ucs4' '--with-dbmliborder=bdb:gdbm' '--with-system-expat' '--with-system-ffi' '--with-fpectl' 'CC=x86_64-linux-gnu-gcc' 'CFLAGS=-D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security ' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro'
CONFIGURE_CFLAGS -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security
LDSHARED x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security
CONFIGURE_LDFLAGS -Wl,-Bsymbolic-functions -Wl,-z,relro
OPT -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes
PY_LDFLAGS -Wl,-Bsymbolic-functions -Wl,-z,relro
LINKFORSHARED -Xlinker -export-dynamic -Wl,-O1 -Wl,-Bsymbolic-functions
PY_CFLAGS -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security
CFLAGS -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -D_FORTIFY_SOURCE=2 -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security
running build
running build_py
copying clstm.py -> build/lib.linux-x86_64-2.7
running build_ext
building '_clstm' extension
swigging clstm.i to clstm_wrap.cpp
swig -python -c++ -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/lib/python2.7/dist-packages/numpy/core/include -o clstm_wrap.cpp clstm.i
C compiler: x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -fPIC

compile options: '-I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c'
extra options: '-std=c++11 -Wno-sign-compare -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION=""unknown""'
x86_64-linux-gnu-gcc: ctc.cc
x86_64-linux-gnu-gcc: clstm_wrap.cpp
clstm_wrap.cpp: In function ‘PyObject* wrap_Batch_resize(PyObject, PyObject_)’:
clstm_wrap.cpp:6253:15: error: ‘struct ocropus::Batch’ has no member named ‘resize’
(arg1)->resize(arg2,arg3);
^
clstm_wrap.cpp: In function ‘PyObject* wrap_Batch_setZero(PyObject, PyObject_)’:
clstm_wrap.cpp:6304:15: error: ‘struct ocropus::Batch’ has no member named ‘setZero’
(arg1)->setZero(arg2,arg3);
^
error: Command "x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -fPIC -I/usr/include/eigen3 -I/usr/local/include/eigen3 -I/usr/local/include -I/usr/include/hdf5/serial -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c clstm_wrap.cpp -o build/temp.linux-x86_64-2.7/clstm_wrap.o -std=c++11 -Wno-sign-compare -Dadd_raw=add -DNODISPLAY=1 -DTHROW=throw -DHGVERSION=""unknown""" failed with exit status 1
`

That's the output from the console.

How can I generate clstmseq?

I'm using RHEL 6.5.
I've installed hdf5 and hdf5-devel.
I looked through the SConstruct file and passed the hdf5lib=hdf5_serial option.
I even added some debug code to make sure the option was being accepted.
No matter what I do, the build proceeds without error and installs the regular tools, but none of the HDF5 tools.
Any ideas?

which eigen version?

Hi,
Which eigen version is used? I have errors compiling with 3.2.8
`./tensor.h:39:10: fatal error: 'unsupported/Eigen/CXX11/Tensor' file not found

#include <unsupported/Eigen/CXX11/Tensor>`

After I got the unsupported/Eigen/CXX11 headers from https://github.com/RLovelett/eigen, there were just too many compile errors:

`In file included from clstmfilter.cc:1:
In file included from ./clstm.h:17:
In file included from ./batches.h:6:
In file included from ./tensor.h:39:
In file included from /usr/local/include/eigen3/unsupported/Eigen/CXX11/Tensor:18:
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:292:36: error:
unknown type name 'EIGEN_DEVICE_FUNC'
template<typename A, typename B> EIGEN_DEVICE_FUNC constexpr static in...
^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:292:54: error:
expected member name or ';' after declaration specifiers
template<typename A, typename B> EIGEN_DEVICE_FUNC constexpr static in...
~~~~~~~~~~~~~~~~~ ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:296:36: error:
unknown type name 'EIGEN_DEVICE_FUNC'
template<typename A, typename B> EIGEN_DEVICE_FUNC constexpr static in...
^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:296:54: error:
expected member name or ';' after declaration specifiers
template<typename A, typename B> EIGEN_DEVICE_FUNC constexpr static in...
~~~~~~~~~~~~~~~~~ ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:358:3: error:
unknown type name 'EIGEN_DEVICE_FUNC'
EIGEN_DEVICE_FUNC constexpr static inline auto run(array<T, N> arr, T ...
^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:358:21: error:
expected member name or ';' after declaration specifiers
EIGEN_DEVICE_FUNC constexpr static inline auto run(array<T, N> arr, T ...

/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:367:3: error:
      unknown type name 'EIGEN_DEVICE_FUNC'
  EIGEN_DEVICE_FUNC constexpr static inline T run(const array<T, N>& arr, T)
  ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:367:21: error:
      expected member name or ';' after declaration specifiers
  EIGEN_DEVICE_FUNC constexpr static inline T run(const array<T, N>& arr, T)
  ~~~~~~~~~~~~~~~~~ ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:376:3: error:
      unknown type name 'EIGEN_DEVICE_FUNC'
  EIGEN_DEVICE_FUNC constexpr static inline T run(const array<T, 0>&, T ...
  ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:376:21: error:
      expected member name or ';' after declaration specifiers
  EIGEN_DEVICE_FUNC constexpr static inline T run(const array<T, 0>&, T ...
  ~~~~~~~~~~~~~~~~~ ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:383:1: error:
      unknown type name 'EIGEN_DEVICE_FUNC'
EIGEN_DEVICE_FUNC constexpr inline auto array_reduce(const array<T, N>& ...
^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:383:19: error:
      expected unqualified-id
EIGEN_DEVICE_FUNC constexpr inline auto array_reduce(const array<T, N>& ...
                  ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:482:44: error:
      no template named 'h_repeat'; did you mean 'repeat'?
constexpr array<t, n> repeat(t v) { return h_repeat<n>::run(v, typename ...
                                           ^~~~~~~~
                                           repeat
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:482:23: note:
      'repeat' declared here
constexpr array<t, n> repeat(t v) { return h_repeat<n>::run(v, typename ...
                      ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:482:55: error:
      qualified name refers into a specialization of function template 'repeat'
constexpr array<t, n> repeat(t v) { return h_repeat<n>::run(v, typename ...
                                           ~~~~~~~~~~~^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:482:23: note:
      function template 'repeat' declared here
constexpr array<t, n> repeat(t v) { return h_repeat<n>::run(v, typename ...
                      ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/CXX11Meta.h:482:23: error:
      no return statement in constexpr function
constexpr array<t, n> repeat(t v) { return h_repeat<n>::run(v, typename ...
                      ^
In file included from clstmfilter.cc:1:
In file included from ./clstm.h:17:
In file included from ./batches.h:6:
In file included from ./tensor.h:39:
In file included from /usr/local/include/eigen3/unsupported/Eigen/CXX11/Tensor:19:
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/MaxSizeVector.h:34:3: error:
      unknown type name 'EIGEN_DEVICE_FUNC'
  EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE
  ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/util/MaxSizeVector.h:34:21: error:
      expected member name or ';' after declaration specifiers
  EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE
  ~~~~~~~~~~~~~~~~~ ^
/usr/local/include/eigen3/Eigen/src/Core/util/Macros.h:408:29: note: expanded
      from macro 'EIGEN_STRONG_INLINE'
#define EIGEN_STRONG_INLINE inline
                            ^
In file included from clstmfilter.cc:1:
In file included from ./clstm.h:17:
In file included from ./batches.h:6:
In file included from ./tensor.h:39:
In file included from /usr/local/include/eigen3/unsupported/Eigen/CXX11/Tensor:71:
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h:17:36: error:
      unknown type name 'EIGEN_DEVICE_FUNC'
template<typename T1, typename T2> EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE
                                   ^
/usr/local/include/eigen3/unsupported/Eigen/CXX11/src/Tensor/TensorMeta.h:17:54: error:
      expected unqualified-id
template<typename T1, typename T2> EIGEN_DEVICE_FUNC EIGEN_ALWAYS_INLINE
                                                     ^
/usr/local/include/eigen3/Eigen/src/Core/util/Macros.h:419:29: note: expanded
      from macro 'EIGEN_ALWAYS_INLINE'
#define EIGEN_ALWAYS_INLINE __attribute__((always_inline)) inline
                            ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
scons: *** [clstmfilter.o] Error 1
scons: building terminated because of errors.`

Documentation required

Is there any documentation or tutorial on training models and testing them for OCR?

Many NaN errors with the new Tensor-based version

For many models (especially deep ones with many parameters, e.g. bidi2), I keep getting the following error:

clstm.cc:664: void ocropus::GenericNPLSTM<F, G, H>::backward() [with int F = 1; int G = 2; int H = 2]: Assertion `!anynan(out)' failed.

whereas the old (Mat-based) version works just fine.

proto for the python version

Has anybody written protobuf code for the Python wrapper?

usemat acceleration in current version?

In the previous version, one could speed up computation by building with "usemat=1", which uses matrices instead of tensors. In the current version, the matrix alternative is no longer supported. Does this mean the tensor version is now just as fast? Why was the matrix code removed?

Best

CLSTM for proper nouns like places, names etc.

I am a beginner with LSTMs. As I understand it, an LSTM does not just do OCR; unlike Tesseract, it also tries to predict the next word or character. Suppose I am training on a set of proper nouns such as names and places: should I train on all possible names and places? Given a new place name of, say, two words, clstm may try to predict the next word. If that word is not in the training set, will it fall back to OCR alone rather than predicting? The prediction may be completely wrong if the word is not in the dataset.

I tried this with person names, training on top of the default English model. When I give it a name that is not in the dataset, it fails very badly. Any thoughts would be helpful.

Other language trial report

With a 500-character subset of Japanese kanji, clstm works fine (hidden_nodes = 100, MacBook, gcc48 from Homebrew).
Your kid reads Japanese brilliantly.

I am now trying a 2492-character subset. It looks like it will take several weeks (hidden = 200 this time).
(No, nhidden = 200 seems to be hopeless: it seems to learn one character by forgetting another.)

Now trying 3700 characters (a little bigger than the Tesseract jp dataset) with nhidden = 800 and nhidden = 1200.
Unless my PC breaks, I will see the result next spring.

What's the license of the UW3 dataset?

http://www.tmbdev.net/ocrdata/uw3-lines-book.tgz

How to get confidence values?

Hi @tmbdev, I ran some experiments on offline handwriting recognition with clstm. Given an image, the network produces a single output sequence.

Is it possible to get confidence values for the result? For example, in the image posted in #18, every position of the sequence is colored differently for different predictions.

Are there any interfaces in clstm to get such confidence values?

Thanks.
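
To make the question concrete, here is a minimal sketch of the kind of per-character confidence I have in mind. It assumes the per-frame class posteriors of the network can be obtained as a list of probability vectors and that class 0 is the CTC blank; both are assumptions on my side, and `greedy_decode_with_confidence` is a hypothetical helper, not part of clstm.

```python
def greedy_decode_with_confidence(posteriors, blank=0):
    """Collapse per-frame posteriors into (class, confidence) pairs.

    posteriors: list of frames, each a list of class probabilities.
    blank: index of the CTC blank class (assumed to be 0 here).
    """
    results = []
    prev = blank
    for frame in posteriors:
        # Pick the most probable class for this frame.
        best = max(range(len(frame)), key=frame.__getitem__)
        p = frame[best]
        if best == blank:
            # A blank ends the current character run.
            prev = blank
            continue
        if best == prev:
            # Same character continues: keep the highest probability seen.
            cls, conf = results[-1]
            results[-1] = (cls, max(conf, p))
        else:
            results.append((best, p))
        prev = best
    return results


frames = [
    [0.9, 0.05, 0.05],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.2, 0.2],
    [0.1, 0.1, 0.8],
]
decoded = greedy_decode_with_confidence(frames)
```

Here the confidence of a character is simply the peak posterior within its run of frames; other definitions (for example the mean over the run) would work just as well.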

Using clstm for handwriting recognition on datasets like IAM

Hello, has anyone tried clstm for handwriting recognition? I have just applied it to the IAM dataset (http://www.fki.inf.unibe.ch/databases/iam-handwriting-database), but the performance is not good (33.5% error ratio after 200,000 iterations). I used a config like run-uw3-500 and tried modifying the parameters (e.g. setting hidden to 500 or 300, and lrate to 1e-4 or 1e-3), but the results are almost the same. Can anyone give me some suggestions? Many thanks~
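
For reference, this is a small sketch of how an error ratio like the one above can be computed, assuming it means total character edit distance divided by total ground-truth length. Whether clstm reports exactly this quantity is an assumption; `edit_distance` and `error_rate` are illustrative helpers, not clstm functions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # delete a character from ref
                cur[j - 1] + 1,          # insert a character into ref
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def error_rate(pairs):
    """Total edit distance over total ground-truth length.

    pairs: iterable of (ground_truth, prediction) strings.
    """
    pairs = list(pairs)
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    total = sum(len(ref) for ref, _ in pairs)
    return errors / total if total else 0.0
```

Measuring the test set this way, independently of the training log, makes it easier to tell whether a parameter change actually moved the error.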

Future / Documentation / Roadmap / Stuff of that sort

Hi,

I was working on a GUI for scanning PDF documents, and I was looking
for a way to get accurate OCR under Linux and to sandwich the text into the generated PDFs.

Obviously the state of open source OCR isn't stellar. Nothing I tried was exactly a breeze
to set up and, most importantly, the actual recognition was quite bad, for example with tesseract and
all the other known open source projects.

Then I read about ocropy/ocropus/clstm and was relieved that someone was at last working from scratch on an innovative, modern approach that was being actively developed. Some of the results I saw were promising: in one online tutorial, a guy adapted the model to typewriter fonts and it worked rather well.

So I got ocropy to work on Fedora and tried it with the standard model on a simple plain PDF document I had scanned. The results were just as terrible as with the other open source OCRs, if not worse, although it really is a high resolution scan with normal looking fonts. I felt disappointed to be honest.

From the documentation/README.md and so on I didn't get a lot of information about where the project is going and where it might need some help. The basic concepts are explained well, but I didn't get a good picture of where the problems of the project are and what is done to solve them.
It seems "the model" is just a file that was uploaded sometime ago, but doesn't get actively maintained and improved currently.

It would be interesting to know what the current obstacles are to supporting more fonts, font styles and so on. Is there any plan for what the next milestones should be? Is there any guide on how people can best help the project by fixing bugs, adding features, refactoring code, improving models and so on?

I'm writing this in this project, because I really see it as the same project basically.

To summarize:

It would be nice if you kept a detailed TODO list for clstm and ocropy.
It would be nice to have a roadmap for both projects as a whole, with important milestones defined.
The documentation should explain how people can help with development, especially how models can be created and improved so that they can eventually be included as "official" models. Do you, for example, need sample PDFs from people that currently don't work well in order to improve future models?

I'm sorry if this doesn't fit into "issues". Feel free to remove or move this. I just
wanted to express that what's going on with ocropy/clstm development seems
like a black box from the outside.

Can clstm recognize special characters well?

There are some things the currently trained models for ocropus-rpred will not handle well, largely because they are nearly absent from the current training data. That includes all-caps text, some special symbols (including "?"), typewriter fonts, and subscripts/superscripts. This will be addressed in a future release.

parameter name: 'ntrain' vs. 'maxtrain'

'ntrain' appears in these files:

run-cmu
run-uw3-500
clstmtext.cc
clstmconv.cc
clstmseq.cc
clstmctc.cc

'maxtrain' appears in these files:

clstmfiltertrain.cc
clstmocrtrain.cc

Tom, please use either 'ntrain' or 'maxtrain' for the parameter name in all files.
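
Until the names are unified, a wrapper script could accept both. The sketch below assumes the tools take their parameters from environment variables, as the run-cmu and run-uw3-500 scripts suggest; `get_param` is a hypothetical helper and the default of 1000000 is made up for illustration, neither comes from the clstm sources.

```python
import os


def get_param(names, default, env=None):
    """Return the first parameter found under any of the given names.

    names: candidate variable names in priority order, e.g. ["ntrain", "maxtrain"].
    default: value used when none is set; its type is used for conversion.
    env: mapping to read from (defaults to os.environ).
    """
    env = os.environ if env is None else env
    for name in names:
        if name in env:
            # Convert the string value to the type of the default.
            return type(default)(env[name])
    return default


# 'ntrain' wins when both are set; 'maxtrain' is accepted as an alias.
ntrain = get_param(["ntrain", "maxtrain"], 1000000, env={"maxtrain": "200000"})
```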

clstm.load() fails to parse what clstm.save()ed

To resume an ocrtrain run, I am trying to load a *.clstm file, but when I reload it, a parse error occurs with this message:

[libprotobuf ERROR google/protobuf/message_lite.cc:123] Can't parse message of type "clstm.NetworkProto" because it is missing required fields: kind, ninput, noutput
libc++abi.dylib: terminating

Are these three fields (kind, ninput, noutput) not being saved?

compile

I used to be able to run setup.py fine, but with the latest two versions I get SWIG errors (setting the NPY version doesn't fix it).

warning: "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]

warning "Using deprecated NumPy API, disable it by " \

^
clstm_wrap.cpp:5906:15: error: no member named 'train' in 'ocropus::INetwork'
(arg1)->train(_arg2,_arg3);
~~~~~~ ^
clstm_wrap.cpp:5975:15: error: no member named 'ctrain' in 'ocropus::INetwork'
(arg1)->ctrain(_arg2,_arg3);
~~~~~~ ^
clstm_wrap.cpp:6053:15: error: no member named 'ctrain_accelerated' in 'ocropus::INetwork'
(arg1)->ctrain_accelerated(_arg2,_arg3,arg4);
~~~~~~ ^
clstm_wrap.cpp:6122:15: error: no member named 'ctrain_accelerated' in 'ocropus::INetwork'
(arg1)->ctrain_accelerated(_arg2,_arg3);
~~~~~~ ^
clstm_wrap.cpp:6253:15: error: no member named 'cpred' in 'ocropus::INetwork'
(arg1)->cpred(_arg2,_arg3);
~~~~~~ ^
clstm_wrap.cpp:6373:15: error: no member named 'setInputs' in 'ocropus::INetwork'
(arg1)->setInputs(_arg2);
~~~~~~ ^
clstm_wrap.cpp:6430:15: error: no member named 'setTargets' in 'ocropus::INetwork'
(arg1)->setTargets(_arg2);
~~~~~~ ^
clstm_wrap.cpp:6487:15: error: no member named 'setClasses' in 'ocropus::INetwork'
(arg1)->setClasses(*arg2);
~~~~~~ ^

test-gpu.cc is missing & state of GPU support

The test-gpu.cc file referred to in the SConstruct file is missing from the repository.
Is it enough to pass gpu=1 to scons to enable GPU support?
What is the current state of GPU support?

Thanks,

How to use GPU to speed up in clstm?

When I try to compile with a command like this:
scons -j 4 gpu=1 clstmocrtrain clstmocr
it fails to enable the GPU. What is the right way to use the GPU to speed up training?

clstmocrtrain and clstmfiltertrain tests are crashing

with commit a6a33a8

$ ./run-cmu
+ ntrain=3000000
+ hidden=50
+ test_every=50000
+ testset=misc/cmu-test.txt
+ lrate=3e-4
+ report_every=100
+ save_every=10000
+ save_name=cmu
+ neps=3
+ ./clstmfiltertrain misc/cmu-train.txt misc/cmu-test.txt
got 111503 inputs, 12389 tests
.stacked: 0.0001 0.9 in 0 28 out 0 38
.stacked.parallel: 0.0001 0.9 in 0 28 out 0 200
.stacked.parallel.lstm: 0.0001 0.9 in 0 28 out 0 100
.stacked.parallel.reversed: 0.0001 0.9 in 0 28 out 0 100
.stacked.parallel.reversed.lstm: 0.0001 0.9 in 0 28 out 0 100
.stacked.softmax: 0.0001 0.9 in 0 200 out 0 38
clstmfiltertrain: clstm_compute.cc:136: void ocropus::forward_stack1(ocropus::Batch&, ocropus::Batch&, ocropus::Sequence&, int): Assertion `inp.cols() == out.cols()' failed.
Aborted (core dumped)

$ ./run-uw3-500
+ set -a
+ test -d book
+ find book -name '*.bin.png'
+ sort -r
+ sed 1,50d uw3-all
+ sed 50q uw3-all
+ report_every=10
+ save_every=1000
+ ntrain=200000
+ dewarp=center
+ display_every=10
+ test_every=10000
+ testset=uw3-test.h5
+ hidden=100
+ lrate=1e-4
+ save_name=uw3-500
+ report_time=1
+ ./clstmocrtrain uw3-train uw3-test
*** charsep 
got 450 files, 50 tests
got 83 classes
.stacked: 0.0001 0.9 in 0 48 out 0 83
.stacked.parallel: 0.0001 0.9 in 0 48 out 0 200
.stacked.parallel.lstm: 0.0001 0.9 in 0 48 out 0 100
.stacked.parallel.reversed: 0.0001 0.9 in 0 48 out 0 100
.stacked.parallel.reversed.lstm: 0.0001 0.9 in 0 48 out 0 100
.stacked.softmax: 0.0001 0.9 in 0 200 out 0 83
clstmocrtrain: /usr/include/eigen3/Eigen/src/Core/SelfCwiseBinaryOp.h:136: Eigen::SelfCwiseBinaryOp<BinOp, Lhs, Rhs>& Eigen::SelfCwiseBinaryOp<BinOp, Lhs, Rhs>::lazyAssign(const Eigen::DenseBase<OtherDerived>&) [with RhsDerived = Eigen::Matrix<float, -1, -1>; BinaryOp = Eigen::internal::scalar_sum_op<float>; Lhs = Eigen::Matrix<float, -1, -1>; Rhs = Eigen::Matrix<float, -1, -1>]: Assertion `rows() == rhs.rows() && cols() == rhs.cols()' failed.
./run-uw3-500: line 23:  4971 Aborted                 (core dumped) ./clstmocrtrain uw3-train uw3-test