The docparser from ds3lab

docparser's Issues

When will the code be released

When will the code be released?

How to use the model to predict

If I want to use this model to classify some new pages, where should I put my data?

DocParser Inference

Is there inference code that lets me pass my own input document to your pre-trained model and get the detected tables as output?

Thank you.

[Question] Unexpected results when running on custom PDF files

Hello, thank you for the effort you've put into this repository. I recently tried to run it against my own PDFs (converted to images):

[image]

And this was the result:

[image]

My code to draw:

import cv2
import skimage.io

from docparser import stage1_entity_detector


image_name="outputname-01.png"
sample_img_path = "/home/ubuntu/projects/pdf-expirements/DocParser/images/{}".format(image_name)
output_dir = "/home/ubuntu/projects/pdf-expirements/DocParser/demos/output"

entity_detector = stage1_entity_detector.EntityDetector()
entity_detector.init_model(default_weights="highlevel_wsft")

sample_img = skimage.io.imread(sample_img_path)
results = entity_detector.predict(sample_img)

image = cv2.imread(sample_img_path)

shapes = []
for pred in results["prediction_list"]:
  box = pred["bbox_orig_coords"].tolist()
  label = pred["class_name"]
  score = pred["pred_score"]
  shapes.append({"box": box, "label": label, "score": score})

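# Draw each box; this assumes the order (y1, x1, y2, x2). cv2 needs integer pixel coordinates.
for shape in shapes:
  y1, x1, y2, x2 = [int(round(v)) for v in shape["box"]]
  image = cv2.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 2)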


cv2.imwrite("{}/{}".format(output_dir, image_name), image)

Most of the bounding boxes were drawn in the wrong places. Am I doing the drawing wrong, or does the model not work on files with a structure like this? Thank you for your time.
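One way to rule out a drawing problem is to make the coordinate order explicit and render the same predictions under both conventions. The sketch below is only a debugging aid, not part of DocParser: the helper name `box_to_cv2_corners` and the `order` parameter are hypothetical, and whether `bbox_orig_coords` is stored as (y1, x1, y2, x2) or (x1, y1, x2, y2) is an assumption to verify against the library.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]


def box_to_cv2_corners(box: Sequence[float], order: str = "yxyx") -> Tuple[Point, Point]:
    """Convert a 4-element box into the two (x, y) corner points used for drawing.

    `order` is a hypothetical switch: "yxyx" reads the box as (y1, x1, y2, x2),
    "xyxy" reads it as (x1, y1, x2, y2).
    """
    if len(box) != 4:
        raise ValueError("expected 4 coordinates, got %d" % len(box))
    if order == "yxyx":
        y1, x1, y2, x2 = box
    elif order == "xyxy":
        x1, y1, x2, y2 = box
    else:
        raise ValueError("unknown order: %r" % order)
    # Drawing functions expect plain integers, so round and cast each value.
    # min/max keeps the first point at the top-left even if the corners are swapped.
    top_left = (int(round(min(x1, x2))), int(round(min(y1, y2))))
    bottom_right = (int(round(max(x1, x2))), int(round(max(y1, y2))))
    return top_left, bottom_right


if __name__ == "__main__":
    sample_box = [10.4, 20.6, 110.0, 220.2]  # made-up values for illustration
    print(box_to_cv2_corners(sample_box, "yxyx"))
    print(box_to_cv2_corners(sample_box, "xyxy"))
```

The two returned points can be passed straight to `cv2.rectangle(image, top_left, bottom_right, (255, 0, 0), 2)`. If the boxes only line up with the page content under one of the two orders, the drawing code was the problem; if neither lines up, the cause is more likely the model's predictions on this kind of page.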

How to generate table structures for ICDAR 2019 Modern Images

Hello,
Thanks for your work. I am trying to use your model to generate structures for the tables in the ICDAR 2019 Track B2 Modern images. Could you please point me to the file to use and explain how to generate the structures for the tables in those images?

Thanks

License

Hi,
Thanks for sharing the DocParser!
Is there a license associated with this repo?
Thanks in advance!
