CascadeTabNet

CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents
Devashish Prasad, Ayan Gadpal, Kshitij Kapadni, Manish Visave, Kavita Sultanpure
Preprint Link of Paper
Supplementary file
The paper has been accepted at CVPR 2020 Workshop on Text and Documents in the Deep Learning Era

End to End Table Recognition Dataset

We manually annotated some of the images from the ICDAR 19 table competition (cTDaR) dataset. Details about the dataset are given in the paper. Dataset link

1. Introduction

CascadeTabNet is an automatic table recognition method for interpreting tabular data in document images. We present an improved deep learning-based end-to-end approach that solves both table detection and structure recognition with a single Convolutional Neural Network (CNN) model. CascadeTabNet is a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model that detects table regions and, at the same time, recognizes the structural body cells of the detected tables. We evaluate our results on the ICDAR 2013, ICDAR 2019 and TableBank public datasets. We achieved 3rd rank in the ICDAR 2019 post-competition results for table detection, while attaining the best accuracy results on the ICDAR 2013 and TableBank datasets. We also attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.

2. Setup

The models are developed in the PyTorch-based MMDetection framework (version 1.2).

pip install -q mmcv terminaltables
git clone --branch v1.2.0 'https://github.com/open-mmlab/mmdetection.git'
cd "mmdetection"
python setup.py install
python setup.py develop
pip install -r requirements.txt

The code was developed with the following library versions:

PyTorch = 1.4.0
Torchvision = 0.5.0
CUDA = 10.0

pip install torch==1.4.0+cu100 torchvision==0.5.0+cu100 -f https://download.pytorch.org/whl/torch_stable.html
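Because the pinned versions above are old, it can help to confirm what is actually installed before running anything. The following is a minimal sketch, not part of the project: `PINNED`, `matches_pin`, `installed_versions` and `check_environment` are hypothetical names, and the check assumes that a local build suffix such as `+cu100` can be ignored when comparing versions.

```python
from importlib import metadata

# Versions listed in this README; the dictionary itself is illustrative.
PINNED = {"torch": "1.4.0", "torchvision": "0.5.0"}


def matches_pin(installed, pinned):
    """Compare versions while ignoring a local build suffix such as '+cu100'."""
    return installed.split("+", 1)[0] == pinned


def installed_versions(names):
    """Look up installed distribution versions; missing packages map to None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def check_environment(versions):
    """Return {package: (installed, pinned)} for every package that differs from its pin."""
    return {
        name: (versions.get(name), pinned)
        for name, pinned in PINNED.items()
        if not matches_pin(versions.get(name) or "", pinned)
    }
```

Calling `check_environment(installed_versions(PINNED))` returns an empty dictionary when both packages match the pins, and otherwise lists each mismatch together with the expected version.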

If you are using Google Colaboratory (Colab), you need to add

from google.colab.patches import cv2_imshow

and replace every call to cv2.imshow with cv2_imshow. Note that cv2_imshow takes only the image, without a window name.
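The replacement is not a pure rename, because `cv2.imshow` takes a window name and an image while `cv2_imshow` takes only the image. A minimal sketch of automating the rewrite on a script's source text could look like this; `patch_imshow` is a hypothetical helper, and the regular expression assumes the window name is a plain quoted string or a bare identifier.

```python
import re

COLAB_IMPORT = "from google.colab.patches import cv2_imshow"

# Matches the start of cv2.imshow(<window name>, ... where the window name is a
# quoted string or a bare identifier (an assumption that keeps the pattern simple).
_IMSHOW = re.compile(r"""cv2\.imshow\(\s*(?:'[^']*'|"[^"]*"|\w+)\s*,\s*""")


def patch_imshow(source):
    """Rewrite cv2.imshow(name, img) calls as cv2_imshow(img) and add the Colab import."""
    patched, count = _IMSHOW.subn("cv2_imshow(", source)
    # Only prepend the import when something was rewritten and it is not already there.
    if count and COLAB_IMPORT not in patched:
        patched = COLAB_IMPORT + "\n" + patched
    return patched
```

Source text without any `cv2.imshow` call is returned unchanged, so the helper can be applied to every script in a project. Calls with more complex first arguments would need to be edited by hand.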

3. Model Architecture

Model Computation Graph

4. Image Augmentation


Code: dilation transform, smudge transform
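To give a rough idea of what a dilation-style augmentation does, here is a minimal pure-Python sketch. It is not the authors' linked code: it assumes the transform binarizes a grayscale page and thickens its dark strokes, and `dilate_dark_pixels`, `threshold` and `radius` are hypothetical names and parameters chosen for illustration.

```python
def dilate_dark_pixels(image, threshold=128, radius=1):
    """Binarize a grayscale image (rows of 0-255 ints) and thicken its dark pixels."""
    height, width = len(image), len(image[0])
    # Assumed binarization step: anything darker than `threshold` counts as ink.
    binary = [[0 if px < threshold else 255 for px in row] for row in image]
    out = [[255] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # A pixel becomes dark if any neighbour within `radius` is dark.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            if any(binary[j][i] == 0 for j in range(y0, y1) for i in range(x0, x1)):
                out[y][x] = 0
    return out
```

With `radius=1`, a single dark pixel grows into a 3x3 dark block. In practice an image library would do this far faster; the loop form is used here only to keep the example free of dependencies.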

5. Benchmarking

5.1. Table Detection

1. ICDAR 13

2. ICDAR 19 (Track A Modern)

3. TableBank


TableBank Benchmarking : Leaderboard
TableBank Dataset Divisions : TableBank

5.2. Table Structure Recognition

1. ICDAR 19 (Track B2)

6. Model Zoo

Config file for the models:

cascade_mask_rcnn_hrnetv2p_w32_20e.py
Note: The paths in the config file only need to be changed for training.

Checkpoints of the models we have trained:

| Model Name | Checkpoint File |
| --- | --- |
| General Model table detection | Checkpoint |
| ICDAR 13 table detection | Checkpoint |
| ICDAR 19 (Track A Modern) table detection | Checkpoint |
| TableBank Word table detection | Checkpoint |
| TableBank Latex table detection | Checkpoint |
| TableBank Both table detection | Checkpoint |
| ICDAR 19 (Track B2 Modern) table structure recognition | Checkpoint |
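A checkpoint and the config file can be loaded through MMDetection's `init_detector` and `inference_detector` functions. The following is a minimal sketch under stated assumptions: the class order `("bordered", "cell", "borderless")` and the 0.85 score threshold are assumptions that should be checked against the config and the demo code, and `split_detections` and `detect_tables` are hypothetical helper names.

```python
def split_detections(bbox_result, score_thr=0.85,
                     class_names=("bordered", "cell", "borderless")):
    """Group [x1, y1, x2, y2, score] rows by class name, keeping confident ones."""
    grouped = {}
    for name, rows in zip(class_names, bbox_result):
        grouped[name] = [[float(v) for v in row[:4]]
                         for row in rows if row[4] >= score_thr]
    return grouped


def detect_tables(config_path, checkpoint_path, image_path, device="cuda:0"):
    """Run one image through a trained model and return boxes grouped by class."""
    # Imported lazily so split_detections stays usable without MMDetection installed.
    from mmdet.apis import init_detector, inference_detector

    model = init_detector(config_path, checkpoint_path, device=device)
    result = inference_detector(model, image_path)
    # Mask models return (bbox_result, segm_result); only the boxes are used here.
    bbox_result = result[0] if isinstance(result, tuple) else result
    return split_detections(bbox_result)
```

A call such as `detect_tables("cascade_mask_rcnn_hrnetv2p_w32_20e.py", "path/to/checkpoint.pth", "page.png")` would then return a dictionary of box lists; the checkpoint and image paths here are placeholders.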

7. Training

You may refer to this tutorial for training MMDetection models on your custom datasets in Colab.
having useful links and results

Contact

Devashish Prasad : devashishkprasad [at] gmail [dot] com
Ayan Gadpal : ayangadpal2 [at] gmail [dot] com
Kshitij Kapadni : kshitij.kapadni [at] gmail [dot] com
Manish Visave : manishvisave149 [at] gmail [dot] com

License

The code of CascadeTabNet is released under the MIT License. There are no restrictions on academic or commercial use.

Cite as

If you find this work useful for your research, please cite our paper:

@misc{ cascadetabnet2020,
    title={CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents},
    author={Devashish Prasad and Ayan Gadpal and Kshitij Kapadni and Manish Visave and Kavita Sultanpure},
    year={2020},
    eprint={2004.12629},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
