DSG: Document Structure Generator

Paper

Further information and evaluations can be found in the paper

Requirements

We tested our code on a Linux machine with Python 3.10, Detectron2 0.6, and PyTorch 1.11.
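A quick way to compare a local environment against the tested versions above is sketched below. It assumes the packages are installed under the distribution names `torch` and `detectron2`; those names and the helper functions are illustrative and not part of this repository.

```python
import sys
from importlib import metadata

# Versions reported as tested in this README; other versions may also work.
# The distribution names are an assumption about how the packages were installed.
TESTED = {"torch": "1.11", "detectron2": "0.6"}


def installed_versions(packages):
    """Return {package: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(tested=TESTED):
    """Build one human-readable line per component, starting with Python itself."""
    py = f"{sys.version_info.major}.{sys.version_info.minor}"
    lines = [f"python {py} (tested: 3.10)"]
    for name, version in installed_versions(tested).items():
        status = version if version is not None else "not installed"
        lines.append(f"{name} {status} (tested: {tested[name]})")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A mismatch does not necessarily mean the code will fail; it only indicates that the setup differs from the one that was tested.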

Installation instructions for detectron and pytorch can be found here

To set up the environment with all the required dependencies, we provide further steps here

Datasets and model download

Please use the following link to download model checkpoints and datasets.

Unzip checkpoints.zip and datasets.zip at the root level of this repository and download the images as described in download_ep_images_helper. Move the train/test/val image directories to datasets/eperiodica3/imgs.
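The unzip-and-move step can be scripted roughly as follows. This is a minimal sketch: it assumes the image download produced `train`, `test`, and `val` directories inside a single download directory, and the function names are hypothetical rather than helpers shipped with this repository.

```python
import shutil
import zipfile
from pathlib import Path


def extract_archives(repo_root, archives=("checkpoints.zip", "datasets.zip")):
    """Extract each archive found at the repository root; return the names extracted."""
    repo_root = Path(repo_root)
    extracted = []
    for name in archives:
        archive = repo_root / name
        if archive.exists():
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(repo_root)
            extracted.append(name)
    return extracted


def move_image_splits(download_dir, repo_root, splits=("train", "test", "val")):
    """Move the split directories into datasets/eperiodica3/imgs; return the splits moved."""
    target = Path(repo_root) / "datasets" / "eperiodica3" / "imgs"
    target.mkdir(parents=True, exist_ok=True)
    moved = []
    for split in splits:
        src = Path(download_dir) / split
        dst = target / split
        # Skip splits that are missing or already in place, so nothing gets nested.
        if src.is_dir() and not dst.exists():
            shutil.move(str(src), str(dst))
            moved.append(split)
    return moved
```

Calling `extract_archives(".")` followed by `move_image_splits(<download directory>, ".")` from the repository root should leave the layout described above.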

At the moment, two images are inaccessible to the public due to copyright restrictions. Until they become publicly available, we download similar images from the same magazines for which the original bounding boxes roughly match. "edm.001.2018.073.0201-0" in the training set is scheduled to become publicly available in 2024, and "tbg.002.2020.158.0072-0" in the test set in 2026.

File Naming in our Demo

DSG_E2E_arxivdocs: DSG trained on arXivdocs

DSG_E2E_eperiodica: DSG trained on E-Periodica

Demonstration of our system

Demo of entity prediction with postprocessing:

Note: When running the code for the first time, GloVe word embeddings are downloaded automatically.

First, create an output directory, e.g. at ./demo/EP_outputs.

To run DSG for prediction and use grammar-based postprocessing, run:

python visualizations/demo.py --config-file ./configs/sgg_end2end_EP.yaml --input ./datasets/eperiodica3/imgs/val/* --output ./demo/EP_outputs --raw_output ./demo/EP_outputs --opts MODEL.ROI_SCENEGRAPH_HEAD.PREDICT_USE_VISION True MODEL.WEIGHTS ./checkpoints/DSG_E2E_eperiodica/dsg_e2e_eperiodica_checkpoint.pth TEST.USE_GRAMMAR_POSTPROCESSING True
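The same invocation can be driven from Python, which also takes care of creating the output directory first. The sketch below reuses only the flags and paths from the command above; the wrapper functions `build_demo_command` and `run_demo` are hypothetical and not part of this repository. Note that the `*` in the input path is expanded by the shell in the original command, so the wrapper expands it with `glob` instead.

```python
import glob
import subprocess
import sys
from pathlib import Path


def build_demo_command(
    inputs,
    output_dir,
    config="./configs/sgg_end2end_EP.yaml",
    weights="./checkpoints/DSG_E2E_eperiodica/dsg_e2e_eperiodica_checkpoint.pth",
):
    """Assemble the demo.py argument list with the options shown in the README."""
    return [
        sys.executable, "visualizations/demo.py",
        "--config-file", config,
        "--input", *inputs,
        "--output", str(output_dir),
        "--raw_output", str(output_dir),
        "--opts",
        "MODEL.ROI_SCENEGRAPH_HEAD.PREDICT_USE_VISION", "True",
        "MODEL.WEIGHTS", weights,
        "TEST.USE_GRAMMAR_POSTPROCESSING", "True",
    ]


def run_demo(input_glob="./datasets/eperiodica3/imgs/val/*", output_dir="./demo/EP_outputs"):
    """Create the output directory, expand the input glob, and run the demo."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    inputs = sorted(glob.glob(input_glob))
    if not inputs:
        raise FileNotFoundError(f"no input images match {input_glob}")
    subprocess.run(build_demo_command(inputs, output_dir), check=True)
```

Run `run_demo()` from the repository root so that the relative paths resolve as they do in the shell command.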

Demo of hOCR creation

The hOCR creation demo uses the outputs created by the previous script.

For convenience, we prepared outputs for one sample and a Jupyter notebook that demonstrates our hOCR creation and querying here
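For readers unfamiliar with hOCR: it is an HTML-based format in which layout elements carry a class such as `ocr_page` or `ocr_carea` and a `title` attribute holding a `bbox x0 y0 x1 y1` entry. The sketch below illustrates that general idea only. The input structure (a list of dicts with `bbox` and `text` keys) and the function `to_hocr` are assumptions made for illustration; they do not reflect the actual output format of the demo script or the notebook's implementation.

```python
from html import escape


def to_hocr(entities, page_bbox):
    """Render hypothetical entity dicts as a minimal hOCR-style HTML fragment."""
    x0, y0, x1, y1 = page_bbox
    parts = [f"<div class='ocr_page' title='bbox {x0} {y0} {x1} {y1}'>"]
    for i, ent in enumerate(entities):
        bx0, by0, bx1, by1 = ent["bbox"]
        # Text is HTML-escaped so characters such as '<' stay valid in the markup.
        text = escape(ent.get("text", ""))
        parts.append(
            f"  <div class='ocr_carea' id='block_{i}' "
            f"title='bbox {bx0} {by0} {bx1} {by1}'>{text}</div>"
        )
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    sample = [{"bbox": (10, 20, 300, 60), "text": "A heading"}]
    print(to_hocr(sample, page_bbox=(0, 0, 800, 1200)))
```

Because the result is ordinary HTML, it can be queried with standard HTML tooling, for example by selecting elements by class and parsing the `bbox` entry from the `title` attribute.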

Credits

This repository builds on other open-source implementations, including detectron2 and segmentation-sg.

Contributors

rashitig, j-rausch, inf800
