Comments (3)
Again, why do you say results are bad in this case? Mistakes seem to occur only in the footnotes. Here are three things you can try to prevent text lines from merging and to improve results:
- clean the ground truth: it seems your GT contains merged line labels, so you should expect predicted lines to be merged as well. Clean the GT first to prevent this behaviour
- use x-height labels instead of bounding-box labels (check here): predicted lines should be thinner and more accurate
- use a border label to force gaps between lines (check the paper for an explanation and ablation experiments)
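To make the last two points concrete, here is a rough sketch of why thinner line labels plus a border class keep adjacent lines apart in a label mask. It is not docExtractor's actual GT format or code: the class ids, the `rasterize_lines` helper, and the `shrink` and `border` parameters are all made up for illustration, and shrinking a bounding box vertically is only a crude stand-in for a real x-height annotation.

```python
# Hypothetical class ids, not the ones used by docExtractor.
BACKGROUND, TEXT_LINE, BORDER = 0, 1, 2


def rasterize_lines(height, width, boxes, shrink=0.25, border=1):
    """Build a label mask from full-line boxes (x0, y0, x1, y1), end-exclusive.

    Each box is shrunk vertically by `shrink` on both sides (a crude proxy
    for an x-height band), then surrounded by a ring of BORDER pixels.
    Boxes are assumed to lie inside the image and `shrink` to be below 0.5.
    """
    mask = [[BACKGROUND] * width for _ in range(height)]

    # Shrink every box to a thinner band.
    bands = []
    for x0, y0, x1, y1 in boxes:
        margin = int((y1 - y0) * shrink)
        bands.append((x0, y0 + margin, x1, y1 - margin))

    # Paint the text-line bands first.
    for x0, top, x1, bottom in bands:
        for y in range(top, bottom):
            for x in range(x0, x1):
                mask[y][x] = TEXT_LINE

    # Paint the border rings last, so they overwrite any neighbouring band
    # and always leave a gap between two lines.
    for x0, top, x1, bottom in bands:
        for y in range(max(top - border, 0), min(bottom + border, height)):
            for x in range(max(x0 - border, 0), min(x1 + border, width)):
                inside_own_band = top <= y < bottom and x0 <= x < x1
                if not inside_own_band:
                    mask[y][x] = BORDER
    return mask


# Two lines whose bounding boxes touch vertically.
boxes = [(0, 0, 10, 10), (0, 10, 10, 20)]
merged = rasterize_lines(20, 10, boxes, shrink=0.0, border=0)
separated = rasterize_lines(20, 10, boxes)
```

With `shrink=0.0` and `border=0` the two boxes form one solid block of `TEXT_LINE` pixels, which is the kind of label a model can only learn to merge. With the defaults, each line becomes a thinner band wrapped in `BORDER` pixels, with background in between.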
from docextractor.
@seekingdeep Closing the issue; please reopen if necessary.
You are right, this is most likely because I didn't use x-height + border labels in the training annotations.
Instead, I used a bounding box around each entire text line, which caused text lines to merge with one another.