The detectron2 from mertsayar8

detectron2's Introduction

This project focuses on cross-domain Document Object Detection (DOD), the task of decomposing a document page image into structural and logical units such as text blocks, titles, lists, and figures. It discusses recent research on the task, summarizes two recent datasets used in that research (PubLayNet and PRImA Layout Analysis), and presents a method based on Mask R-CNN and feature pyramid networks that uses the training data of the recently released large-scale PubLayNet dataset for the cross-domain DOD task on the PRImA Layout Analysis dataset. Results on PRImA and on the PubLayNet validation set are compared.

This project is forked from https://github.com/hpanwar08/detectron2 and uses that repository's network pre-trained on the PubLayNet dataset.

The PRImA dataset is available at https://www.primaresearch.org/datasets/Layout_Analysis.

To convert the PRImA PAGE XML annotations to COCO JSON format, run the convert_prima_to_coco.py script with the argument --prima_datapath "path_to_your_folder". The PRImA folder must contain XML and Images as subfolders. After the script has run, the folder layout should look like this:

      data/
      └── prima/
          ├── Images/
          ├── XML/
          ├── License.txt
          └── annotations*.json
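
Before running the conversion, it can help to confirm that the dataset folder matches the layout above. The following is a minimal sketch, not part of the repository: the function name check_prima_layout is hypothetical, and it only checks for the two subfolders the conversion script is stated to require.

```python
from pathlib import Path


def check_prima_layout(prima_datapath):
    """Return a list of problems found in a PRImA folder (empty if none)."""
    root = Path(prima_datapath)
    problems = []
    # The README states that Images/ and XML/ must be subfolders.
    for sub in ("Images", "XML"):
        if not (root / sub).is_dir():
            problems.append(f"missing subfolder: {sub}/")
    return problems


if __name__ == "__main__":
    # "data/prima" mirrors the tree shown above; adjust to your own path.
    for problem in check_prima_layout("data/prima"):
        print(problem)
```

An empty list means both subfolders are present; the annotations*.json files only appear after the conversion has run, so they are not checked here.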

This conversion script is adapted from https://github.com/Layout-Parser/layout-model-training.
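
To illustrate what the conversion has to produce for each layout region, here is a rough sketch of turning one region polygon into a COCO-style annotation entry. It is not the code of convert_prima_to_coco.py: the function name polygon_to_coco_annotation is hypothetical, and the polygon is assumed to have already been read from the PAGE XML as a list of (x, y) pixel coordinates. The output fields follow the COCO convention of a bounding box given as [x, y, width, height] and a segmentation given as a flat coordinate list.

```python
def polygon_to_coco_annotation(points, image_id, category_id, annotation_id):
    """Build one COCO-style annotation dict from a region polygon.

    points: list of (x, y) tuples, assumed already parsed from PAGE XML.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    width, height = max(xs) - x_min, max(ys) - y_min

    # Polygon area via the shoelace formula.
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    area = abs(twice_area) / 2.0

    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x_min, y_min, width, height],
        # COCO polygons are flat lists: [x1, y1, x2, y2, ...]
        "segmentation": [[coord for point in points for coord in point]],
        "area": area,
        "iscrowd": 0,
    }


if __name__ == "__main__":
    # A rectangular region used purely as an example.
    region = [(10, 20), (110, 20), (110, 70), (10, 70)]
    print(polygon_to_coco_annotation(region, image_id=1, category_id=1, annotation_id=1))
```

The category ids used here are placeholders; the actual mapping from PRImA region types to category ids is defined by the conversion script.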

To make object predictions and evaluate the datasets, run either the cs555_project.py script or the cs555_projectFinal.ipynb notebook. This project was run on Google Colab.
