Comments (16)

opconty commented on September 18, 2024

Hello,
The recognition dataset is synthetic: https://github.com/opconty/synthetic_Chinese_OCR_dataset
The detection dataset involves some copyright issues, so I can't share it for now. Sorry.

Best,

from pytorch_ctpn.

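P5best commented on September 18, 2024

Thank you for sharing. My problem right now is that detection results are poor. The CTPN code I use includes a script that splits each detection box into 16-pixel-wide segments and finally converts the result to VOC format. What is your annotation workflow for the detection dataset? Do you also generate txt files and then convert them to VOC format, or is there an annotation tool that can be used directly?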

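opconty commented on September 18, 2024

The annotation format doesn't really matter much. In the end, data processing only needs the coordinates of each text box: four points if the box is tilted, or two points (top-left and bottom-right) if it is an axis-aligned rectangle.
I use XML format. You can refer to how the data is read in keras_std (rectangular boxes) or keras_std_plus_plus (quadrilateral boxes).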
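
To illustrate the point about formats, here is a minimal sketch of reading rectangular boxes from an XML annotation. It assumes a Pascal VOC style layout (`object` / `bndbox` with `xmin`, `ymin`, `xmax`, `ymax`); the actual schema used for this project's detection data is not shown in the thread, and `read_boxes` and `SAMPLE` are hypothetical names.

```python
import xml.etree.ElementTree as ET


def read_boxes(xml_text):
    """Return (xmin, ymin, xmax, ymax) tuples from a VOC-style XML string.

    Assumption: every <object> holds a <bndbox> with all four coordinate tags.
    """
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects without a rectangle
        box = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append(box)
    return boxes


# Hypothetical annotation with two text lines.
SAMPLE = """
<annotation>
  <object><name>text</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>52</ymax></bndbox>
  </object>
  <object><name>text</name>
    <bndbox><xmin>30</xmin><ymin>60</ymin><xmax>200</xmax><ymax>92</ymax></bndbox>
  </object>
</annotation>
"""

boxes = read_boxes(SAMPLE)
```

A quadrilateral format would store four (x, y) pairs per object instead; the reading loop stays the same and only the tag names change.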

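P5best commented on September 18, 2024

Sorry, I haven't read your code closely. Do you mean that your labels contain only the four corner coordinates of each text box, and that splitting the box into 16-pixel-wide segments happens in later processing? I ask because the code I use splits the text boxes in advance and takes the segments as input.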

opconty commented on September 18, 2024

For CTPN, the dataset of course has to be split into segments first...
What I was talking about here is the label format.
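
For readers unfamiliar with the splitting step discussed in this thread, the following is a rough sketch of cutting an axis-aligned text box into strips on a 16-pixel grid. It is not taken from either project's scripts: `split_box` is a hypothetical helper, integer pixel coordinates are assumed, and clipping the first and last strip to the box edges is just one possible convention.

```python
def split_box(xmin, ymin, xmax, ymax, stride=16):
    """Split a rectangle into vertical strips aligned to a `stride`-pixel grid.

    Assumptions: integer coordinates, xmax is the exclusive right edge,
    and edge strips are clipped to the original box.
    """
    strips = []
    # First grid column that overlaps the box.
    start = (xmin // stride) * stride
    for grid_left in range(start, xmax, stride):
        left = max(grid_left, xmin)            # clip to the box's left edge
        right = min(grid_left + stride, xmax)  # clip to the box's right edge
        strips.append((left, ymin, right, ymax))
    return strips


strips = split_box(10, 20, 50, 52)
# strips == [(10, 20, 16, 52), (16, 20, 32, 52), (32, 20, 48, 52), (48, 20, 50, 52)]
```

Whether the labels store the whole box or the pre-split strips, the same coordinates are recoverable, which is why the storage format matters less than the processing that follows.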

P5best commented on September 18, 2024

OK. Thank you for patiently answering a beginner like me.

opconty commented on September 18, 2024

You're welcome.

P5best commented on September 18, 2024

Sorry, one more question: how is the dictionary for the recognition part generated? I mean the txt file with one character per line, chars_addr_names.txt in your project.

opconty commented on September 18, 2024

It is extracted from all the training labels.
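
One way such a dictionary could be built is sketched below: collect every distinct character that appears in the label strings and write one per line. This is an assumption about the general approach, not the author's actual script; `build_charset` and `write_charset` are hypothetical names, whitespace is skipped, and the sort order of the real chars_addr_names.txt is unknown.

```python
def build_charset(labels):
    """Collect the distinct non-whitespace characters used in the labels."""
    chars = set()
    for text in labels:
        chars.update(ch for ch in text if not ch.isspace())
    # Sorted by code point so the output is deterministic.
    return sorted(chars)


def write_charset(chars, path):
    """Write one character per line, UTF-8 encoded."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(chars) + "\n")


# Hypothetical label strings for illustration.
charset = build_charset(["北京市", "北京路 1号"])
```

The line index of each character can then serve as its class id during recognition training, so the file must stay fixed once a model has been trained against it.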

P5best commented on September 18, 2024

Does the recognition dataset you provided come with label files?

opconty commented on September 18, 2024

P5best commented on September 18, 2024

Do you mean train_name.txt? Why is it empty when I extract it?

opconty commented on September 18, 2024

That's strange. It has content when I open it.

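P5best commented on September 18, 2024

Are there 3131903 files in total? The first time I extracted the archive I got an empty txt file. After extracting it again, there was no txt file at all, only images. I'm a bit confused.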

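P5best commented on September 18, 2024

I have re-downloaded and extracted it several times and still can't find the labels, only images. Could you send me a copy if convenient? I sent you an email with the subject "ctpn数据集" ("CTPN dataset").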

opconty commented on September 18, 2024

https://github.com/opconty/synthetic_Chinese_OCR_dataset/blob/master/models/chars_addr_names.txt
