detectron2's Introduction


This project focuses on cross-domain Document Object Detection (DOD). DOD is the task of decomposing a document page image into structural and logical units such as text, titles, lists, and figures. In this paper, recent research on this task is discussed, two recent datasets used in that research, PubLayNet and the PRImA Layout Analysis Dataset, are summarized, and a method based on Mask R-CNN and feature pyramid networks is presented. The method uses the training data of the recently released large-scale PubLayNet dataset for the cross-domain DOD task on the PRImA Layout Analysis Dataset. Results on PRImA and on the PubLayNet validation set are compared.

This project is forked from https://github.com/hpanwar08/detectron2 and uses the network pre-trained on the PubLayNet dataset.

The PRImA dataset is available at https://www.primaresearch.org/datasets/Layout_Analysis.

To convert the PRImA PAGE XML annotations to COCO JSON format, run the convert_prima_to_coco.py script with the argument --prima_datapath "path_to_your_folder". Your PRImA folder should contain XML and Images subfolders. After running the script, the folder layout should look like this:

      data/
      └── prima/
          ├── Images/
          ├── XML/
          ├── License.txt
          └── annotations*.json
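
A quick way to confirm that the folder matches the layout above is a small check script. The following is a minimal sketch, not part of this repository: the function name check_prima_layout is hypothetical, and it only assumes the Images/, XML/ and annotations*.json names shown in the tree.

```python
from pathlib import Path


def check_prima_layout(root):
    """Return a list of problems found in a PRImA folder; an empty list means it looks fine."""
    root = Path(root)
    problems = []
    # The conversion script expects these two subfolders.
    for sub in ("Images", "XML"):
        if not (root / sub).is_dir():
            problems.append(f"missing folder: {sub}/")
    # After conversion, at least one annotations*.json file should exist.
    if not list(root.glob("annotations*.json")):
        problems.append("no annotations*.json file found")
    return problems


if __name__ == "__main__":
    # "data/prima" follows the tree above; adjust it to your own path.
    for problem in check_prima_layout("data/prima"):
        print(problem)
```

The script prints nothing when the layout is complete, so it can be run right after the conversion as a sanity check.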

This conversion script is adapted from https://github.com/Layout-Parser/layout-model-training.
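
To illustrate what such a conversion produces, the sketch below turns one region polygon into a COCO-style annotation record. It is an illustration only and is not taken from convert_prima_to_coco.py: the function name polygon_to_coco and its parameters are hypothetical, and it assumes the region outline has already been read from the PAGE XML file as a list of (x, y) vertices.

```python
def polygon_to_coco(points, image_id, category_id, annotation_id):
    """Build a COCO-style annotation dict from a list of (x, y) polygon vertices."""
    points = list(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)

    # Shoelace formula for the polygon area.
    area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area += x1 * y2 - x2 * y1
    area = abs(area) / 2.0

    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO boxes are [x, y, width, height], measured from the top-left corner.
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
        # COCO polygons are flat lists: [x1, y1, x2, y2, ...].
        "segmentation": [[coord for point in points for coord in point]],
        "area": area,
        "iscrowd": 0,
    }


if __name__ == "__main__":
    # A hypothetical rectangular text region on image 1, category 1.
    region = [(10, 20), (110, 20), (110, 70), (10, 70)]
    print(polygon_to_coco(region, image_id=1, category_id=1, annotation_id=1))
```

The category ids used in the real annotation files depend on the conversion script's own label mapping, which this sketch does not reproduce.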

To make object predictions and evaluate the datasets, run the cs555_project.py script or the cs555_projectFinal.ipynb notebook. This project was run on Google Colab.

detectron2's People

Contributors

ppwwyyxx, hpanwar08, mertsayar8, maxfrei750, bryant1410, vkhalidov, lyttonhao, wangg12, superirabbit, raymondcm, sampepose, endernewton, yanicklandry, botcs, alicanb, anshulrai, bxiong1202, bradezard131, brettkoonce, corysaildrone, danmorozoff, dannyfeliz, gkioxari, kharshit, jaesuny, jakobu5, jeremyfix, jsoref, jss367, kdexd
