Comments (7)

courao commented on June 1, 2024

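Take a look at the messages printed just before the error. It is caused by characters that the character set does not cover. Add every character that appears in your data to the character set and it should work.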

from ocr.pytorch.

InferMaster commented on June 1, 2024

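courao wrote: "Take a look at the messages printed just before the error. It is caused by characters that the character set does not cover. Add every character that appears in your data to the character set and it should work."

Personally, I think the problem is at line 130, where the text is converted from str form to list form, but I can't work out how it should be written. Could you point me in the right direction?
[image]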

courao commented on June 1, 2024

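No need for that. Look at the exception caught at line 148: it is raised because characters are missing from the dict.
When you prepare your data, generate a new alphabet and have keys.py load the new dictionary, making sure it contains every character you need.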
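
As a rough illustration of "generate a new alphabet", the sketch below collects every distinct character from the label column of a training list. The function name `build_alphabet`, the sample lines, and the `<fname><sep><label>` line layout are assumptions made for this example, not code taken from the repository.

```python
from typing import Iterable, List


def build_alphabet(label_lines: Iterable[str], sep: str = "\t") -> List[str]:
    """Collect every distinct character that appears in the label column.

    Assumption: each line looks like "<fname><sep><label>".
    """
    chars = set()
    for line in label_lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Split on the first separator only, so the label may contain spaces.
        _fname, _, label = line.partition(sep)
        chars.update(label)
    # Sorting gives a stable, reproducible ordering of the character set.
    return sorted(chars)


# Hypothetical sample lines standing in for a real label file.
sample = ["img_001.jpg\thello", "img_002.jpg\t你好 ok"]
alphabet = build_alphabet(sample)
```

Only the label text contributes characters, so file names do not pollute the alphabet.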

InferMaster commented on June 1, 2024

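First of all, thank you for your answer. I have another question: what is key.py actually for? Also, line 34 of the program appears to convert each character to an int. When I process Tibetan, the conversion simply fails. I am now thinking that the Tibetan characters could be converted if I split them apart, but I don't know which part of key.py the later code would rely on after the split.
[image]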
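
One possible reading of the conversion problem, offered only as a sketch: Python's built-in `ord()` accepts exactly one code point, while a written Tibetan syllable is usually several code points. Iterating over the `str` already performs the split. The sample syllable below is an assumption for illustration and is not taken from the thread.

```python
# "bod" written in Tibetan: letter BA + vowel sign O + letter DA.
syllable = "\u0f56\u0f7c\u0f51"

# Iterating a str yields one code point at a time, and ord() accepts each one.
codes = [ord(ch) for ch in syllable]

# Passing the whole syllable to ord() fails, because it is three code points.
try:
    ord(syllable)
    whole_syllable_ok = True
except TypeError:
    whole_syllable_ok = False
```

If the alphabet is built per code point in the same way, each Tibetan letter and vowel sign becomes its own entry.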

EurekaTesla commented on June 1, 2024

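Many thanks to the author for the solution; I have fixed the error. Let me add one small note. In utils.py, on the line after if ord(ch) not in self.dict.keys(): add print(ch.encode('utf-8')). The missing character is then printed whenever the error occurs. Write each of those characters into alphabet.pkl (it holds a list that stores the character set: Chinese, English, symbols and so on) and the problem goes away. I am training on the TAL (好未来) handwritten English dataset. I don't know why '\xc2' and '\xad' show up in it, but adding them to alphabet.pkl solved the problem. I suspect alphabet.pkl does not contain the Tibetan characters you are using, so you will need to add them.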
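
The check described above can also be run over the labels before training. The sketch below lists characters that are absent from an alphabet and prints their code point, Unicode name, and UTF-8 bytes. The helper names and the sample label are hypothetical. As a side note, the byte pair `b'\xc2\xad'` is the UTF-8 encoding of U+00AD (SOFT HYPHEN), which may be what the printed '\xc2' and '\xad' were.

```python
import unicodedata


def missing_chars(text: str, alphabet) -> list:
    """Return the characters of `text` that are not in `alphabet`, sorted."""
    known = set(alphabet)
    return sorted({ch for ch in text if ch not in known})


def describe(ch: str) -> str:
    """Format one character as code point, Unicode name, and UTF-8 bytes."""
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{ord(ch):04X} {name} {ch.encode('utf-8')!r}"


# Hypothetical alphabet and label; the label hides an invisible soft hyphen.
alphabet = list("abcdefghijklmnopqrstuvwxyz ")
label = "co\u00adoperate"

for ch in missing_chars(label, alphabet):
    print(describe(ch))
```

Printing the Unicode name makes invisible characters such as soft hyphens easier to recognise than raw bytes alone.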

SkrDrag commented on June 1, 2024

Could someone tell me how to write new characters into this alphabet.pkl?

1ymtics commented on June 1, 2024

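The key file itself contains the program that writes alphabet.pkl. Remember to change the separator that splits fname from label in your dataset: the author uses \t, so adjust it to match your own data. Also, if you get the error SyntaxError: Non-UTF-8 code starting with '\xbc' in file, but no encoding declared, add # -*- coding: utf-8 -*- at the very top of the file.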
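
For anyone who prefers to patch an existing file instead of regenerating it, here is a minimal sketch using the standard `pickle` module. It assumes alphabet.pkl holds a plain list of single-character strings, as described earlier in the thread. The function name `add_to_alphabet` and the demo file are hypothetical, so check what your copy of the file actually contains before overwriting it.

```python
import os
import pickle
import tempfile


def add_to_alphabet(pkl_path: str, new_chars: str) -> list:
    """Append characters missing from the pickled list and save it back."""
    with open(pkl_path, "rb") as f:
        alphabet = list(pickle.load(f))
    known = set(alphabet)
    for ch in new_chars:
        if ch not in known:
            # Append at the end so existing entries keep their positions.
            alphabet.append(ch)
            known.add(ch)
    with open(pkl_path, "wb") as f:
        pickle.dump(alphabet, f)
    return alphabet


# Demo on a throwaway file, so the real alphabet.pkl is left untouched.
demo_path = os.path.join(tempfile.mkdtemp(), "alphabet_demo.pkl")
with open(demo_path, "wb") as f:
    pickle.dump(["a", "b"], f)

# "b" is already present; U+0F40 (Tibetan letter KA) and U+00AD are new.
updated = add_to_alphabet(demo_path, "b\u0f40\u00ad")
```

Appending instead of re-sorting is a deliberate choice here: characters already in the list keep their indices.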
