Text Binarization Groundtruth

Groundtruth for Text Binarization Models

Processing

To prepare the images for SAM-fine-tune, I ran this command on all of the images:

for img in *; do name="${img%.*}"; convert "$img" -alpha off -type TrueColor png24:"$name".png; done

This converts them all to RGB PNG files with no transparency.
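The same conversion can also be written as a small script. This is only a minimal sketch, assuming ImageMagick's `convert` is on the PATH; the function names `png_name` and `convert_all` are hypothetical and not part of this repo.

```shell
#!/usr/bin/env bash

# Derive the output name for an input image by stripping its last
# extension and appending .png (hypothetical helper, not from the repo).
png_name() {
  local img="$1"
  printf '%s.png\n' "${img%.*}"
}

# Convert every regular file in the current directory to a 24-bit RGB PNG
# with the alpha channel turned off. Assumes ImageMagick's `convert` exists.
convert_all() {
  local img out
  for img in *; do
    [ -f "$img" ] || continue   # skip directories and other non-files
    out="$(png_name "$img")"
    convert "$img" -alpha off -type TrueColor "png24:$out"
  done
}
```

Running `convert_all` from inside an image directory does the conversion. Note that an input that is already a `.png` is overwritten in place, same as with the one-liner.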

Sources

Repo Name            Link
DIBCO_2009(_PRINT)   DIBCO 2009
DIBCO_2010           DIBCO 2010
DIBCO_2011(_PRINT)   DIBCO 2011
DIBCO_2012           DIBCO 2012
DIBCO_2013           DIBCO 2013
DIBCO_2014           H-DIBCO 2014
DIBCO_2016           H-DIBCO 2016
DIBCO_2017           DIBCO 2017
DIBCO_2018           H-DIBCO 2018
DIBCO_2019           DIBCO 2019
LIVEMEMORY           LiveMemory Dataset (dataset 3)
NABUCO_(1/2)         Nabuco Dataset
PERSIAN              PHIBD 2012
BICKLEY              Bickley Diary Dataset
RENNES               Custom Dataset (CC BY-NC 4.0)
BLEEDTHROUGH         Bleed-Through Database

Unused Sources

Bleed-Through Database

  • Did not use UCD.MSA29.12v, as it was too messy. Cleaned up NLI.MSG18.147 manually.

Palm Leaf Manuscript Dataset

  • The lines are accurate in path but not thickness.

DIVA-HisDB

  • Too inaccurate for use here.

Contributors

martholomew
